All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/6] drm/i915: Fix HSW parity test
@ 2013-09-18  4:12 Ben Widawsky
  2013-09-18  4:12 ` [PATCH 2/6] drm/i915: Add second slice l3 remapping Ben Widawsky
                   ` (12 more replies)
  0 siblings, 13 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Haswell changed the log registers to be WO, so we can no longer read
them to determine the programming (which sucks, see later note). For
now, simply use the cached value, and hope HW doesn't screw us over.

v2: Simplify the logic to avoid an extra !, remove last, and fix the
buffer offset which broke along the rebase (Ville)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57441
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_sysfs.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index d572435..71f6de2 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -133,6 +133,17 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
+	if (IS_HASWELL(drm_dev)) {
+		if (dev_priv->l3_parity.remap_info)
+			memcpy(buf,
+			       dev_priv->l3_parity.remap_info + (offset/4),
+			       count);
+		else
+			memset(buf, 0, count);
+
+		goto out;
+	}
+
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
 
@@ -141,9 +152,10 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
+out:
 	mutex_unlock(&drm_dev->struct_mutex);
 
-	return i;
+	return count;
 }
 
 static ssize_t
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/6] drm/i915: Add second slice l3 remapping
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  7:36   ` Ville Syrjälä
  2013-09-18  4:12 ` [PATCH 3/6] drm/i915: Make l3 remapping use the ring Ben Widawsky
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Certain HSW SKUs have a second bank of L3. This L3 remapping has a
separate register set, and interrupt from the first "slice". A slice is
simply a term to define some subset of the GPU's l3 cache. This patch
implements both the interrupt handler, and ability to communicate with
userspace about this second slice.

v2:  Remove redundant check about non-existent slice.
Change warning about interrupts of unknown slices to WARN_ON_ONCE
Handle the case where we get 2 slice interrupts concurrently, and switch
the tracking of interrupts to be non-destructive (all Ville)
Don't enable/mask the second slice parity interrupt for ivb/vlv (even
though all docs I can find claim it's rsvd) (Ville + Bryan)
Keep BYT excluded from L3 parity

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  7 ++-
 drivers/gpu/drm/i915/i915_gem.c         | 26 +++++-----
 drivers/gpu/drm/i915/i915_irq.c         | 88 +++++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_reg.h         |  7 +++
 drivers/gpu/drm/i915/i915_sysfs.c       | 34 ++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c |  7 ++-
 include/uapi/drm/i915_drm.h             |  8 +--
 7 files changed, 116 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8b16d47..c6e8df7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -917,9 +917,11 @@ struct i915_ums_state {
 	int mm_suspended;
 };
 
+#define MAX_L3_SLICES 2
 struct intel_l3_parity {
-	u32 *remap_info;
+	u32 *remap_info[MAX_L3_SLICES];
 	struct work_struct error_work;
+	int which_slice;
 };
 
 struct i915_gem_mm {
@@ -1686,6 +1688,7 @@ struct drm_i915_file_private {
 #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
 
 #define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
+#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
 
 #define GT_FREQUENCY_MULTIPLIER 50
 
@@ -1946,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-void i915_gem_l3_remap(struct drm_device *dev);
+void i915_gem_l3_remap(struct drm_device *dev, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d3de6e..66bf75d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4252,16 +4252,15 @@ i915_gem_idle(struct drm_device *dev)
 	return 0;
 }
 
-void i915_gem_l3_remap(struct drm_device *dev)
+void i915_gem_l3_remap(struct drm_device *dev, int slice)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
+	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	u32 misccpctl;
 	int i;
 
-	if (!HAS_L3_GPU_CACHE(dev))
-		return;
-
-	if (!dev_priv->l3_parity.remap_info)
+	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
 		return;
 
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
@@ -4269,17 +4268,17 @@ void i915_gem_l3_remap(struct drm_device *dev)
 	POSTING_READ(GEN7_MISCCPCTL);
 
 	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
-		u32 remap = I915_READ(GEN7_L3LOG_BASE + i);
-		if (remap && remap != dev_priv->l3_parity.remap_info[i/4])
+		u32 remap = I915_READ(reg_base + i);
+		if (remap && remap != remap_info[i/4])
 			DRM_DEBUG("0x%x was already programmed to %x\n",
-				  GEN7_L3LOG_BASE + i, remap);
-		if (remap && !dev_priv->l3_parity.remap_info[i/4])
+				  reg_base + i, remap);
+		if (remap && !remap_info[i/4])
 			DRM_DEBUG_DRIVER("Clearing remapped register\n");
-		I915_WRITE(GEN7_L3LOG_BASE + i, dev_priv->l3_parity.remap_info[i/4]);
+		I915_WRITE(reg_base + i, remap_info[i/4]);
 	}
 
 	/* Make sure all the writes land before disabling dop clock gating */
-	POSTING_READ(GEN7_L3LOG_BASE);
+	POSTING_READ(reg_base);
 
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
@@ -4373,7 +4372,7 @@ int
 i915_gem_init_hw(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	int ret;
+	int ret, i;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -4392,7 +4391,8 @@ i915_gem_init_hw(struct drm_device *dev)
 		I915_WRITE(GEN7_MSG_CTL, temp);
 	}
 
-	i915_gem_l3_remap(dev);
+	for (i = 0; i < NUM_L3_SLICES(dev); i++)
+		i915_gem_l3_remap(dev, i);
 
 	i915_gem_init_swizzling(dev);
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a42f30b..b11ee39 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -888,9 +888,10 @@ static void ivybridge_parity_work(struct work_struct *work)
 	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
 						    l3_parity.error_work);
 	u32 error_status, row, bank, subbank;
-	char *parity_event[5];
+	char *parity_event[6];
 	uint32_t misccpctl;
 	unsigned long flags;
+	uint8_t slice = 0;
 
 	/* We must turn off DOP level clock gating to access the L3 registers.
 	 * In order to prevent a get/put style interface, acquire struct mutex
@@ -898,45 +899,63 @@ static void ivybridge_parity_work(struct work_struct *work)
 	 */
 	mutex_lock(&dev_priv->dev->struct_mutex);
 
+	/* If we've screwed up tracking, just let the interrupt fire again */
+	if (WARN_ON(!dev_priv->l3_parity.which_slice))
+		goto out;
+
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
 	POSTING_READ(GEN7_MISCCPCTL);
 
-	error_status = I915_READ(GEN7_L3CDERRST1);
-	row = GEN7_PARITY_ERROR_ROW(error_status);
-	bank = GEN7_PARITY_ERROR_BANK(error_status);
-	subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
+	while ((slice = ffs(dev_priv->l3_parity.which_slice)) != 0) {
+		u32 reg;
 
-	I915_WRITE(GEN7_L3CDERRST1, GEN7_PARITY_ERROR_VALID |
-				    GEN7_L3CDERRST1_ENABLE);
-	POSTING_READ(GEN7_L3CDERRST1);
+		if (WARN_ON_ONCE(slice >= NUM_L3_SLICES(dev_priv->dev)))
+			break;
 
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+		dev_priv->l3_parity.which_slice &= ~(1<<slice);
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	ilk_enable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+		reg = GEN7_L3CDERRST1 + (slice * 0x200);
 
-	mutex_unlock(&dev_priv->dev->struct_mutex);
+		error_status = I915_READ(reg);
+		row = GEN7_PARITY_ERROR_ROW(error_status);
+		bank = GEN7_PARITY_ERROR_BANK(error_status);
+		subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
+
+		I915_WRITE(reg, GEN7_PARITY_ERROR_VALID | GEN7_L3CDERRST1_ENABLE);
+		POSTING_READ(reg);
+
+		parity_event[0] = I915_L3_PARITY_UEVENT "=1";
+		parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
+		parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
+		parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
+		parity_event[4] = kasprintf(GFP_KERNEL, "SLICE=%d", slice);
+		parity_event[5] = NULL;
+
+		kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
+				   KOBJ_CHANGE, parity_event);
 
-	parity_event[0] = I915_L3_PARITY_UEVENT "=1";
-	parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
-	parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
-	parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
-	parity_event[4] = NULL;
+		DRM_DEBUG("Parity error: Slice = %d, Row = %d, Bank = %d, Sub bank = %d.\n",
+			  slice, row, bank, subbank);
 
-	kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
-			   KOBJ_CHANGE, parity_event);
+		kfree(parity_event[4]);
+		kfree(parity_event[3]);
+		kfree(parity_event[2]);
+		kfree(parity_event[1]);
+	}
 
-	DRM_DEBUG("Parity error: Row = %d, Bank = %d, Sub bank = %d.\n",
-		  row, bank, subbank);
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
-	kfree(parity_event[3]);
-	kfree(parity_event[2]);
-	kfree(parity_event[1]);
+out:
+	WARN_ON(dev_priv->l3_parity.which_slice);
+	spin_lock_irqsave(&dev_priv->irq_lock, flags);
+	ilk_enable_gt_irq(dev_priv, GT_PARITY_ERROR(dev_priv->dev));
+	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+
+	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
-static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
+static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 
@@ -944,9 +963,16 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
 		return;
 
 	spin_lock(&dev_priv->irq_lock);
-	ilk_disable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+	ilk_disable_gt_irq(dev_priv, GT_PARITY_ERROR(dev));
 	spin_unlock(&dev_priv->irq_lock);
 
+	iir &= GT_PARITY_ERROR(dev);
+	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1)
+		dev_priv->l3_parity.which_slice |= 1 << 1;
+
+	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
+		dev_priv->l3_parity.which_slice |= 1 << 0;
+
 	queue_work(dev_priv->wq, &dev_priv->l3_parity.error_work);
 }
 
@@ -981,8 +1007,8 @@ static void snb_gt_irq_handler(struct drm_device *dev,
 		i915_handle_error(dev, false);
 	}
 
-	if (gt_iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
-		ivybridge_parity_error_irq_handler(dev);
+	if (gt_iir & GT_PARITY_ERROR(dev))
+		ivybridge_parity_error_irq_handler(dev, gt_iir);
 }
 
 #define HPD_STORM_DETECT_PERIOD 1000
@@ -2267,8 +2293,8 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	dev_priv->gt_irq_mask = ~0;
 	if (HAS_L3_GPU_CACHE(dev)) {
 		/* L3 parity interrupt is always unmasked. */
-		dev_priv->gt_irq_mask = ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
-		gt_irqs |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
+		gt_irqs |= GT_PARITY_ERROR(dev);
 	}
 
 	gt_irqs |= GT_RENDER_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 384adfb..c94946f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -927,6 +927,7 @@
 #define GT_BLT_USER_INTERRUPT			(1 << 22)
 #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
 #define GT_BSD_USER_INTERRUPT			(1 << 12)
+#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
 #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
@@ -937,6 +938,10 @@
 #define PM_VEBOX_CS_ERROR_INTERRUPT		(1 << 12) /* hsw+ */
 #define PM_VEBOX_USER_INTERRUPT			(1 << 10) /* hsw+ */
 
+#define GT_PARITY_ERROR(dev) \
+	(GT_RENDER_L3_PARITY_ERROR_INTERRUPT | \
+	 IS_HASWELL(dev) ? GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 : 0)
+
 /* These are all the "old" interrupts */
 #define ILK_BSD_USER_INTERRUPT				(1<<5)
 #define I915_PIPE_CONTROL_NOTIFY_INTERRUPT		(1<<18)
@@ -4743,6 +4748,7 @@
 
 /* IVYBRIDGE DPF */
 #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
+#define HSW_L3CDERRST11			0xB208 /* L3CD Error Status register 1 slice 1 */
 #define   GEN7_L3CDERRST1_ROW_MASK	(0x7ff<<14)
 #define   GEN7_PARITY_ERROR_VALID	(1<<13)
 #define   GEN7_L3CDERRST1_BANK_MASK	(3<<11)
@@ -4756,6 +4762,7 @@
 #define   GEN7_L3CDERRST1_ENABLE	(1<<7)
 
 #define GEN7_L3LOG_BASE			0xB070
+#define HSW_L3LOG_BASE_SLICE1		0xB270
 #define GEN7_L3LOG_SIZE			0x80
 
 #define GEN7_HALF_SLICE_CHICKEN1	0xe100 /* IVB GT1 + VLV */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 71f6de2..3a8bf0c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -119,6 +119,7 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
 	uint32_t misccpctl;
+	int slice = (int)(uintptr_t)attr->private;
 	int i, ret;
 
 	count = round_down(count, 4);
@@ -134,9 +135,9 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 		return ret;
 
 	if (IS_HASWELL(drm_dev)) {
-		if (dev_priv->l3_parity.remap_info)
+		if (dev_priv->l3_parity.remap_info[slice])
 			memcpy(buf,
-			       dev_priv->l3_parity.remap_info + (offset/4),
+			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
 			       count);
 		else
 			memset(buf, 0, count);
@@ -168,6 +169,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
 	u32 *temp = NULL; /* Just here to make handling failures easy */
+	int slice = (int)(uintptr_t)attr->private;
 	int ret;
 
 	ret = l3_access_valid(drm_dev, offset);
@@ -178,7 +180,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
-	if (!dev_priv->l3_parity.remap_info) {
+	if (!dev_priv->l3_parity.remap_info[slice]) {
 		temp = kzalloc(GEN7_L3LOG_SIZE, GFP_KERNEL);
 		if (!temp) {
 			mutex_unlock(&drm_dev->struct_mutex);
@@ -198,11 +200,11 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	 * at this point it is left as a TODO.
 	*/
 	if (temp)
-		dev_priv->l3_parity.remap_info = temp;
+		dev_priv->l3_parity.remap_info[slice] = temp;
 
-	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
+	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
 
-	i915_gem_l3_remap(drm_dev);
+	i915_gem_l3_remap(drm_dev, slice);
 
 	mutex_unlock(&drm_dev->struct_mutex);
 
@@ -214,7 +216,17 @@ static struct bin_attribute dpf_attrs = {
 	.size = GEN7_L3LOG_SIZE,
 	.read = i915_l3_read,
 	.write = i915_l3_write,
-	.mmap = NULL
+	.mmap = NULL,
+	.private = (void *)0
+};
+
+static struct bin_attribute dpf_attrs_1 = {
+	.attr = {.name = "l3_parity_slice_1", .mode = (S_IRUSR | S_IWUSR)},
+	.size = GEN7_L3LOG_SIZE,
+	.read = i915_l3_read,
+	.write = i915_l3_write,
+	.mmap = NULL,
+	.private = (void *)1
 };
 
 static ssize_t gt_cur_freq_mhz_show(struct device *kdev,
@@ -525,6 +537,13 @@ void i915_setup_sysfs(struct drm_device *dev)
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
 		if (ret)
 			DRM_ERROR("l3 parity sysfs setup failed\n");
+
+		if (NUM_L3_SLICES(dev) > 1) {
+			ret = device_create_bin_file(&dev->primary->kdev,
+						     &dpf_attrs_1);
+			if (ret)
+				DRM_ERROR("l3 parity slice 1 setup failed\n");
+		}
 	}
 
 	ret = 0;
@@ -548,6 +567,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
 		sysfs_remove_files(&dev->primary->kdev.kobj, vlv_attrs);
 	else
 		sysfs_remove_files(&dev->primary->kdev.kobj, gen6_attrs);
+	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs_1);
 	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs);
 #ifdef CONFIG_PM
 	sysfs_unmerge_group(&dev->primary->kdev.kobj, &rc6_attr_group);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 686e5b2..958b7d8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -570,7 +570,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
 	if (HAS_L3_GPU_CACHE(dev))
-		I915_WRITE_IMR(ring, ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
 	return ret;
 }
@@ -1000,7 +1000,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring,
 				       ~(ring->irq_enable_mask |
-					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
+					 GT_PARITY_ERROR(dev)));
 		else
 			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
 		ilk_enable_gt_irq(dev_priv, ring->irq_enable_mask);
@@ -1020,8 +1020,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
 		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
-			I915_WRITE_IMR(ring,
-				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 		else
 			I915_WRITE_IMR(ring, ~0);
 		ilk_disable_gt_irq(dev_priv, ring->irq_enable_mask);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 55bb572..3a4e97b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -38,10 +38,10 @@
  *
  * I915_L3_PARITY_UEVENT - Generated when the driver receives a parity mismatch
  *	event from the gpu l3 cache. Additional information supplied is ROW,
- *	BANK, SUBBANK of the affected cacheline. Userspace should keep track of
- *	these events and if a specific cache-line seems to have a persistent
- *	error remap it with the l3 remapping tool supplied in intel-gpu-tools.
- *	The value supplied with the event is always 1.
+ *	BANK, SUBBANK, SLICE of the affected cacheline. Userspace should keep
+ *	track of these events and if a specific cache-line seems to have a
+ *	persistent error remap it with the l3 remapping tool supplied in
+ *	intel-gpu-tools.  The value supplied with the event is always 1.
  *
  * I915_ERROR_UEVENT - Generated upon error detection, currently only via
  *	hangcheck. The error detection event is a good indicator of when things
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/6] drm/i915: Make l3 remapping use the ring
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
  2013-09-18  4:12 ` [PATCH 2/6] drm/i915: Add second slice l3 remapping Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-19 18:39   ` Daniel Vetter
  2013-09-18  4:12 ` [PATCH 4/6] drm/i915: Keep a list of all contexts Ben Widawsky
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Using LRI for setting the remapping registers allows us to stream l3
remapping information. This is necessary to handle per context remaps as
we'll see implemented in an upcoming patch.

Using the ring also means we don't need to frob the DOP clock gating
bits.

v2: Add comment about lack of worry for concurrent register access
(Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h   |  2 +-
 drivers/gpu/drm/i915/i915_gem.c   | 38 ++++++++++++++++++--------------------
 drivers/gpu/drm/i915/i915_sysfs.c |  3 ++-
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c6e8df7..0c39805 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-void i915_gem_l3_remap(struct drm_device *dev, int slice);
+int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 66bf75d..2538138 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4252,35 +4252,33 @@ i915_gem_idle(struct drm_device *dev)
 	return 0;
 }
 
-void i915_gem_l3_remap(struct drm_device *dev, int slice)
+int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
 {
+	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
 	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
-	u32 misccpctl;
-	int i;
+	int i, ret;
 
 	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
-		return;
+		return 0;
 
-	misccpctl = I915_READ(GEN7_MISCCPCTL);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-	POSTING_READ(GEN7_MISCCPCTL);
+	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
+	if (ret)
+		return ret;
 
+	/* XXX: We do not worry about the concurrent register cacheline hang
+	 * here because no other code should access these registers other than
+	 * at initialization time. */
 	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
-		u32 remap = I915_READ(reg_base + i);
-		if (remap && remap != remap_info[i/4])
-			DRM_DEBUG("0x%x was already programmed to %x\n",
-				  reg_base + i, remap);
-		if (remap && !remap_info[i/4])
-			DRM_DEBUG_DRIVER("Clearing remapped register\n");
-		I915_WRITE(reg_base + i, remap_info[i/4]);
+		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+		intel_ring_emit(ring, reg_base + i);
+		intel_ring_emit(ring, remap_info[i/4]);
 	}
 
-	/* Make sure all the writes land before disabling dop clock gating */
-	POSTING_READ(reg_base);
+	intel_ring_advance(ring);
 
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+	return ret;
 }
 
 void i915_gem_init_swizzling(struct drm_device *dev)
@@ -4391,15 +4389,15 @@ i915_gem_init_hw(struct drm_device *dev)
 		I915_WRITE(GEN7_MSG_CTL, temp);
 	}
 
-	for (i = 0; i < NUM_L3_SLICES(dev); i++)
-		i915_gem_l3_remap(dev, i);
-
 	i915_gem_init_swizzling(dev);
 
 	ret = i915_gem_init_rings(dev);
 	if (ret)
 		return ret;
 
+	for (i = 0; i < NUM_L3_SLICES(dev); i++)
+		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
+
 	/*
 	 * XXX: There was some w/a described somewhere suggesting loading
 	 * contexts before PPGTT.
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 3a8bf0c..b07bdfb 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -204,7 +204,8 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 
 	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
 
-	i915_gem_l3_remap(drm_dev, slice);
+	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
+		count = 0;
 
 	mutex_unlock(&drm_dev->struct_mutex);
 
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/6] drm/i915: Keep a list of all contexts
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
  2013-09-18  4:12 ` [PATCH 2/6] drm/i915: Add second slice l3 remapping Ben Widawsky
  2013-09-18  4:12 ` [PATCH 3/6] drm/i915: Make l3 remapping use the ring Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 5/6] drm/i915: Do remaps for " Ben Widawsky
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

I have implemented this patch before without creating a separate list
(I'm having trouble finding the links, but the messages ids are:
<1364942743-6041-2-git-send-email-ben@bwidawsk.net>
<1365118914-15753-9-git-send-email-ben@bwidawsk.net>)

However, the code is much simpler to just use a list and it makes the
code from the next patch a lot more pretty.

As you'll see in the next patch, the reason for this is to be able to
specify when a context needs to get L3 remapping. More details there.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 15 +++++++++------
 drivers/gpu/drm/i915/i915_drv.h         |  3 +++
 drivers/gpu/drm/i915/i915_gem.c         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c |  3 +++
 4 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1d77624..ada0950 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1442,6 +1442,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring;
+	struct i915_hw_context *ctx;
 	int ret, i;
 
 	ret = mutex_lock_interruptible(&dev->mode_config.mutex);
@@ -1460,12 +1461,14 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		seq_putc(m, '\n');
 	}
 
-	for_each_ring(ring, dev_priv, i) {
-		if (ring->default_context) {
-			seq_printf(m, "HW default context %s ", ring->name);
-			describe_obj(m, ring->default_context->obj);
-			seq_putc(m, '\n');
-		}
+	list_for_each_entry(ctx, &dev_priv->context_list, link) {
+		seq_puts(m, "HW context ");
+		for_each_ring(ring, dev_priv, i)
+			if (ring->default_context == ctx)
+				seq_printf(m, "(default context %s) ", ring->name);
+
+		describe_obj(m, ctx->obj);
+		seq_putc(m, '\n');
 	}
 
 	mutex_unlock(&dev->mode_config.mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0c39805..1795927 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -605,6 +605,8 @@ struct i915_hw_context {
 	struct intel_ring_buffer *ring;
 	struct drm_i915_gem_object *obj;
 	struct i915_ctx_hang_stats hang_stats;
+
+	struct list_head link;
 };
 
 struct i915_fbc {
@@ -1343,6 +1345,7 @@ typedef struct drm_i915_private {
 
 	bool hw_contexts_disabled;
 	uint32_t hw_context_size;
+	struct list_head context_list;
 
 	u32 fdi_rx_config;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2538138..18d07d7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4569,6 +4569,7 @@ i915_gem_load(struct drm_device *dev)
 	INIT_LIST_HEAD(&dev_priv->vm_list);
 	i915_init_vm(dev_priv, &dev_priv->gtt.base);
 
+	INIT_LIST_HEAD(&dev_priv->context_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 26c3fcc..2bbdce8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -129,6 +129,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 
+	list_del(&ctx->link);
 	drm_gem_object_unreference(&ctx->obj->base);
 	kfree(ctx);
 }
@@ -147,6 +148,7 @@ create_hw_context(struct drm_device *dev,
 
 	kref_init(&ctx->ref);
 	ctx->obj = i915_gem_alloc_object(dev, dev_priv->hw_context_size);
+	INIT_LIST_HEAD(&ctx->link);
 	if (ctx->obj == NULL) {
 		kfree(ctx);
 		DRM_DEBUG_DRIVER("Context object allocated failed\n");
@@ -166,6 +168,7 @@ create_hw_context(struct drm_device *dev,
 	 * assertion in the context switch code.
 	 */
 	ctx->ring = &dev_priv->ring[RCS];
+	list_add_tail(&ctx->link, &dev_priv->context_list);
 
 	/* Default context will never have a file_priv */
 	if (file_priv == NULL)
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/6] drm/i915: Do remaps for all contexts
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 4/6] drm/i915: Keep a list of all contexts Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  7:48   ` Ville Syrjälä
  2013-09-18  4:12 ` [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF Ben Widawsky
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

On both Ivybridge and Haswell, row remapping information is saved and
restored with context. This means, we never actually properly supported
the l3 remapping because our sysfs interface is asynchronous (and not
tied to any context), and the known faulty HW would be reused by the
next context to run.

Not that due to the asynchronous nature of the sysfs entry, there is no
point modifying the registers for the existing context. Instead we set a
flag for all contexts to load the correct remapping information on the
next run. Interested clients can use debugfs to determine whether or not
the row has been remapped.

One could propose at this point that we just do the remapping in the
kernel. I guess since we have to maintain the sysfs interface anyway,
I'm not sure how useful it is, and I do like keeping the policy in
userspace; (it wasn't my original decision to make the
interface the way it is, so I'm not attached).

v2: Force a context switch when we have a remap on the next switch.
(Ville)
Don't let userspace use the interface with disabled contexts.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++++++
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_sysfs.c       | 37 +++++++++++++--------------------
 4 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ada0950..80bed69 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -145,6 +145,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (%s)", obj->ring->name);
 }
 
+static void describe_ctx(struct seq_file *m, struct i915_hw_context *ctx)
+{
+	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
+	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
+	seq_putc(m, ' ');
+}
+
 static int i915_gem_object_list_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -1463,6 +1470,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
 		seq_puts(m, "HW context ");
+		describe_ctx(m, ctx);
 		for_each_ring(ring, dev_priv, i)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1795927..015df52 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -601,6 +601,7 @@ struct i915_hw_context {
 	struct kref ref;
 	int id;
 	bool is_initialized;
+	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
 	struct intel_ring_buffer *ring;
 	struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2bbdce8..7e138cc 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -140,7 +140,7 @@ create_hw_context(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
-	int ret;
+	int ret, i;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (ctx == NULL)
@@ -181,6 +181,8 @@ create_hw_context(struct drm_device *dev,
 
 	ctx->file_priv = file_priv;
 	ctx->id = ret;
+	for (i = 0; i < NUM_L3_SLICES(dev); i++)
+		ctx->remap_slice |= 1 << 1;
 
 	return ctx;
 
@@ -396,11 +398,11 @@ static int do_switch(struct i915_hw_context *to)
 	struct intel_ring_buffer *ring = to->ring;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
-	int ret;
+	int ret, i;
 
 	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
 
-	if (from == to)
+	if (from == to && !to->remap_slice)
 		return 0;
 
 	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
@@ -423,7 +425,7 @@ static int do_switch(struct i915_hw_context *to)
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
-	else if (WARN_ON_ONCE(from == to)) /* not yet expected */
+	else if (from == to)
 		hw_flags |= MI_FORCE_RESTORE;
 
 	ret = mi_set_context(ring, to, hw_flags);
@@ -432,6 +434,18 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
+	for (i = 0; i < MAX_L3_SLICES; i++) {
+		if (!(to->remap_slice & (1<<i)))
+			continue;
+
+		ret = i915_gem_l3_remap(ring, i);
+		if (!ret) {
+			to->remap_slice &= ~(1<<i);
+			/* If it failed, try again next round */
+			DRM_DEBUG_DRIVER("L3 remapping failed\n");
+		}
+	}
+
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
 	 * the next context has already started running. In fact, the below code
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index b07bdfb..deb8787 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -118,9 +118,8 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
-	uint32_t misccpctl;
 	int slice = (int)(uintptr_t)attr->private;
-	int i, ret;
+	int ret;
 
 	count = round_down(count, 4);
 
@@ -134,26 +133,13 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
-	if (IS_HASWELL(drm_dev)) {
-		if (dev_priv->l3_parity.remap_info[slice])
-			memcpy(buf,
-			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
-			       count);
-		else
-			memset(buf, 0, count);
-
-		goto out;
-	}
-
-	misccpctl = I915_READ(GEN7_MISCCPCTL);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-
-	for (i = 0; i < count; i += 4)
-		*((uint32_t *)(&buf[i])) = I915_READ(GEN7_L3LOG_BASE + offset + i);
-
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+	if (dev_priv->l3_parity.remap_info[slice])
+		memcpy(buf,
+		       dev_priv->l3_parity.remap_info[slice] + (offset/4),
+		       count);
+	else
+		memset(buf, 0, count);
 
-out:
 	mutex_unlock(&drm_dev->struct_mutex);
 
 	return count;
@@ -168,6 +154,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
+	struct i915_hw_context *ctx;
 	u32 *temp = NULL; /* Just here to make handling failures easy */
 	int slice = (int)(uintptr_t)attr->private;
 	int ret;
@@ -176,6 +163,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
+	if (dev_priv->hw_contexts_disabled)
+		return -ENXIO;
+
 	ret = i915_mutex_lock_interruptible(drm_dev);
 	if (ret)
 		return ret;
@@ -204,8 +194,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 
 	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
 
-	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
-		count = 0;
+	/* NB: We defer the remapping until we switch to the context */
+	list_for_each_entry(ctx, &dev_priv->context_list, link)
+		ctx->remap_slice |= (1<<slice);
 
 	mutex_unlock(&drm_dev->struct_mutex);
 
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 5/6] drm/i915: Do remaps for " Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  7:50   ` Ville Syrjälä
  2013-09-18  4:12 ` [PATCH 07/14] intel_l3_parity: Fix indentation Ben Widawsky
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

We'd only ever used this define to denote whether or not we have the
dynamic parity feature (DPF) and never to determine whether or not L3
exists. Baytrail is a good example of where L3 exists, and not DPF.

This patch provides clarify in the code for future use cases which might
want to actually query whether or not L3 exists.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 4 ++--
 drivers/gpu/drm/i915/i915_gem.c         | 2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
 drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 015df52..dd2753e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1691,8 +1691,8 @@ struct drm_i915_file_private {
 
 #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
 
-#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
-#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
+#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
+#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))
 
 #define GT_FREQUENCY_MULTIPLIER 50
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 18d07d7..7859f91 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
 	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	int i, ret;
 
-	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
+	if (!HAS_L3_DPF(dev) || !remap_info)
 		return 0;
 
 	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b11ee39..0968c98 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -959,7 +959,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return;
 
 	spin_lock(&dev_priv->irq_lock);
@@ -2291,7 +2291,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	pm_irqs = gt_irqs = 0;
 
 	dev_priv->gt_irq_mask = ~0;
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		/* L3 parity interrupt is always unmasked. */
 		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
 		gt_irqs |= GT_PARITY_ERROR(dev);
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index deb8787..7b4c79c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
 {
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return -EPERM;
 
 	if (offset % 4 != 0)
@@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
 #endif
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
 		if (ret)
 			DRM_ERROR("l3 parity sysfs setup failed\n");
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 958b7d8..b67104a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	if (INTEL_INFO(dev)->gen >= 6)
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
-	if (HAS_L3_GPU_CACHE(dev))
+	if (HAS_L3_DPF(dev))
 		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
 	return ret;
@@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring,
 				       ~(ring->irq_enable_mask |
 					 GT_PARITY_ERROR(dev)));
@@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 		else
 			I915_WRITE_IMR(ring, ~0);
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/14] intel_l3_parity: Fix indentation
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 08/14] intel_l3_parity: Assert all GEN7+ support Ben Widawsky
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/intel_l3_parity.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index ad027ac..970dcd6 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -58,12 +58,12 @@ static void dumpit(void)
 		for (j = 0; j < NUM_SUBBANKS; j++) {
 			struct l3_log_register *reg = &l3log[i][j];
 
-		if (reg->row0_enable)
-			printf("Row %d, Bank %d, Subbank %d is disabled\n",
-			       reg->row0, i, j);
-		if (reg->row1_enable)
-			printf("Row %d, Bank %d, Subbank %d is disabled\n",
-			       reg->row1, i, j);
+			if (reg->row0_enable)
+				printf("Row %d, Bank %d, Subbank %d is disabled\n",
+				       reg->row0, i, j);
+			if (reg->row1_enable)
+				printf("Row %d, Bank %d, Subbank %d is disabled\n",
+				       reg->row1, i, j);
 		}
 	}
 }
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/14] intel_l3_parity: Assert all GEN7+ support
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 07/14] intel_l3_parity: Fix indentation Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 09/14] intel_l3_parity: Use getopt for the l3 parity tool Ben Widawsky
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

v2: Don't assert for Valleyview (Bryan)
Rework code to be a bit more readable.

CC: "Bell, Bryan J" <bryan.j.bell@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/intel_l3_parity.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index 970dcd6..715d095 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -116,15 +116,14 @@ int main(int argc, char *argv[])
 	drm_fd = drm_open_any();
 	devid = intel_get_drm_devid(drm_fd);
 
+	if (intel_gen(devid) < 7 || IS_VALLEYVIEW(devid))
+		exit(EXIT_SUCCESS);
+
 	ret = asprintf(&path, "/sys/class/drm/card%d/l3_parity", device);
 	assert(ret != -1);
 
 	fd = open(path, O_RDWR);
-	if (fd == -1 && IS_IVYBRIDGE(devid)) {
-		perror("Opening sysfs");
-		exit(EXIT_FAILURE);
-	} else if (fd == -1)
-		exit(EXIT_SUCCESS);
+	assert(fd != -1);
 
 	ret = read(fd, l3log, NUM_REGS * sizeof(uint32_t));
 	if (ret == -1) {
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/14] intel_l3_parity: Use getopt for the l3 parity tool
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 08/14] intel_l3_parity: Assert all GEN7+ support Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 10/14] intel_l3_parity: Hardware info argument Ben Widawsky
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Add new command line arguments in addition to supporting the old
features. This patch only introduces one feature, the -e argument to
enable a specific row/bank/subbank. Previously you could only enable
all. Otherwise, it has what you expect (we prefer -r -b -s for
specifying the row/bank/subbank).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tests/sysfs_l3_parity   |  12 ++---
 tools/intel_l3_parity.c | 122 ++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 109 insertions(+), 25 deletions(-)

diff --git a/tests/sysfs_l3_parity b/tests/sysfs_l3_parity
index 6f814a1..a0dfad9 100755
--- a/tests/sysfs_l3_parity
+++ b/tests/sysfs_l3_parity
@@ -8,20 +8,20 @@ fi
 SOURCE_DIR="$( dirname "${BASH_SOURCE[0]}" )"
 . $SOURCE_DIR/drm_lib.sh
 
-$SOURCE_DIR/../tools/intel_l3_parity -c
+$SOURCE_DIR/../tools/intel_l3_parity -r 0 -b 0 -s 0 -e
 
 #Check that we can remap a row
-$SOURCE_DIR/../tools/intel_l3_parity 0,0,0
-disabled=`$SOURCE_DIR/../tools/intel_l3_parity | grep -c 'Row 0, Bank 0, Subbank 0 is disabled'`
+$SOURCE_DIR/../tools/intel_l3_parity -r 0 -b 0 -s 0 -d
+disabled=`$SOURCE_DIR/../tools/intel_l3_parity -l | grep -c 'Row 0, Bank 0, Subbank 0 is disabled'`
 if [ "$disabled" != "1" ] ; then
 	echo "Fail"
 	exit 1
 fi
 
-$SOURCE_DIR/../tools/intel_l3_parity -c
+$SOURCE_DIR/../tools/intel_l3_parity -r 0 -b 0 -s 0 -e
 
 #Check that we can clear remaps
-if [ `$SOURCE_DIR/../tools/intel_l3_parity | wc -c` != "0" ] ; then
-	echo "Fail"
+if [ `$SOURCE_DIR/../tools/intel_l3_parity -l | wc -c` != "0" ] ; then
+	echo "Fail 2"
 	exit 1
 fi
diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index 715d095..507a522 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -33,12 +33,14 @@
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <getopt.h>
 #include "intel_chipset.h"
 #include "intel_gpu_tools.h"
 #include "drmtest.h"
 
 #define NUM_BANKS 4
 #define NUM_SUBBANKS 8
+#define MAX_ROW (1<<12)
 #define NUM_REGS (NUM_BANKS * NUM_SUBBANKS)
 
 struct __attribute__ ((__packed__)) l3_log_register {
@@ -93,17 +95,34 @@ static int disable_rbs(int row, int bank, int sbank)
 	return 0;
 }
 
-static int do_parse(int argc, char *argv[])
+static void enables_rbs(int row, int bank, int sbank)
 {
-	int row, bank, sbank, i, ret;
+	struct l3_log_register *reg = &l3log[bank][sbank];
 
-	for (i = 1; i < argc; i++) {
-		ret = sscanf(argv[i], "%d,%d,%d", &row, &bank, &sbank);
-		if (ret != 3)
-			return i;
-		assert(disable_rbs(row, bank, sbank) == 0);
-	}
-	return 0;
+	if (!reg->row0_enable && !reg->row1_enable)
+		return;
+
+	if (reg->row1_enable && reg->row1 == row)
+		reg->row1_enable = 0;
+	else if (reg->row0_enable && reg->row0 == row)
+		reg->row0_enable = 0;
+}
+
+static void usage(const char *name)
+{
+	printf("usage: %s [OPTIONS] [ACTION]\n"
+		"Operate on the i915 L3 GPU cache (should be run as root)\n\n"
+		" OPTIONS:\n"
+		"  -r, --row=[row]			The row to act upon (default 0)\n"
+		"  -b, --bank=[bank]			The bank to act upon (default 0)\n"
+		"  -s, --subbank=[subbank]		The subbank to act upon (default 0)\n"
+		" ACTIONS (only 1 may be specified at a time):\n"
+		"  -h, --help				Display this help\n"
+		"  -l, --list				List the current L3 logs\n"
+		"  -a, --clear-all			Clear all disabled rows\n"
+		"  -e, --enable				Enable row, bank, subbank (undo -d)\n"
+		"  -d, --disable=<row,bank,subbank>	Disable row, bank, subbank (inline arguments are deprecated. Please use -r, -b, -s instead\n",
+		name);
 }
 
 int main(int argc, char *argv[])
@@ -111,7 +130,9 @@ int main(int argc, char *argv[])
 	const int device = drm_get_card();
 	char *path;
 	unsigned int devid;
+	int row = 0, bank = 0, sbank = 0;
 	int drm_fd, fd, ret;
+	int action = '0';
 
 	drm_fd = drm_open_any();
 	devid = intel_get_drm_devid(drm_fd);
@@ -133,19 +154,82 @@ int main(int argc, char *argv[])
 
 	assert(lseek(fd, 0, SEEK_SET) == 0);
 
-	if (argc == 1) {
-		dumpit();
-		exit(EXIT_SUCCESS);
-	} else if (!strncmp("-c", argv[1], 2)) {
-		memset(l3log, 0, sizeof(l3log));
-	} else {
-		ret = do_parse(argc, argv);
-		if (ret != 0) {
-			fprintf(stderr, "Malformed command line at %s\n", argv[ret]);
-			exit(EXIT_FAILURE);
+	while (1) {
+		int c, option_index = 0;
+		static struct option long_options[] = {
+			{ "help", no_argument, 0, 'h' },
+			{ "list", no_argument, 0, 'l' },
+			{ "clear-all", no_argument, 0, 'a' },
+			{ "enable", no_argument, 0, 'e' },
+			{ "disable", optional_argument, 0, 'd' },
+			{ "row", required_argument, 0, 'r' },
+			{ "bank", required_argument, 0, 'b' },
+			{ "subbank", required_argument, 0, 's' },
+			{0, 0, 0, 0}
+		};
+
+		c = getopt_long(argc, argv, "hr:b:s:aled::", long_options,
+				&option_index);
+		if (c == -1)
+			break;
+
+		switch (c) {
+			case '?':
+			case 'h':
+				usage(argv[0]);
+				exit(EXIT_SUCCESS);
+			case 'r':
+				row = atoi(optarg);
+				if (row >= MAX_ROW)
+					exit(EXIT_FAILURE);
+				break;
+			case 'b':
+				bank = atoi(optarg);
+				if (bank >= NUM_BANKS)
+					exit(EXIT_FAILURE);
+				break;
+			case 's':
+				sbank = atoi(optarg);
+				if (sbank >= NUM_SUBBANKS)
+					exit(EXIT_FAILURE);
+				break;
+			case 'd':
+				if (optarg) {
+					ret = sscanf(optarg, "%d,%d,%d", &row, &bank, &sbank);
+					if (ret != 3)
+						exit(EXIT_FAILURE);
+				}
+			case 'a':
+			case 'l':
+			case 'e':
+				if (action != '0') {
+					fprintf(stderr, "Only one action may be specified\n");
+					exit(EXIT_FAILURE);
+				}
+				action = c;
+				break;
+			default:
+				abort();
 		}
 	}
 
+	switch (action) {
+		case 'l':
+			dumpit();
+			exit(EXIT_SUCCESS);
+		case 'a':
+			memset(l3log, 0, sizeof(l3log));
+			break;
+		case 'e':
+			enables_rbs(row, bank, sbank);
+			break;
+		case 'd':
+			assert(disable_rbs(row, bank, sbank) == 0);
+			break;
+		default:
+			abort();
+	}
+
 	ret = write(fd, l3log, NUM_REGS * sizeof(uint32_t));
 	if (ret == -1) {
 		perror("Writing sysfs");
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/14] intel_l3_parity: Hardware info argument
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (7 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 09/14] intel_l3_parity: Use getopt for the l3 parity tool Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 11/14] intel_l3_parity: slice support Ben Widawsky
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Add a new command line argument to the tool which will spit out various
parameters for the giving hardware. As a result of this, some new
defines are added to help with the various info.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/intel_l3_parity.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index 507a522..28c47eb 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -38,9 +38,19 @@
 #include "intel_gpu_tools.h"
 #include "drmtest.h"
 
-#define NUM_BANKS 4
+static unsigned int devid;
+/* L3 size is always a function of banks. The number of banks cannot be
+ * determined by number of slices however */
+#define MAX_BANKS 4
+#define NUM_BANKS \
+	((devid == PCI_CHIP_IVYBRIDGE_GT1 || devid == PCI_CHIP_IVYBRIDGE_M_GT1) ? 2 : 4)
 #define NUM_SUBBANKS 8
+#define BYTES_PER_BANK (128 << 10)
+/* Each row addresses [up to] 4b. This multiplied by the number of subbanks
+ * will give the L3 size per bank.
+ * TODO: Row size is fixed on IVB, and variable on HSW.*/
 #define MAX_ROW (1<<12)
+#define L3_SIZE ((MAX_ROW * 4) * NUM_SUBBANKS *  NUM_BANKS)
 #define NUM_REGS (NUM_BANKS * NUM_SUBBANKS)
 
 struct __attribute__ ((__packed__)) l3_log_register {
@@ -50,7 +60,7 @@ struct __attribute__ ((__packed__)) l3_log_register {
 	uint32_t row1_enable	: 1;
 	uint32_t rsvd1		: 4;
 	uint32_t row1		: 11;
-} l3log[NUM_BANKS][NUM_SUBBANKS];
+} l3log[MAX_BANKS][NUM_SUBBANKS];
 
 static void dumpit(void)
 {
@@ -118,6 +128,7 @@ static void usage(const char *name)
 		"  -s, --subbank=[subbank]		The subbank to act upon (default 0)\n"
 		" ACTIONS (only 1 may be specified at a time):\n"
 		"  -h, --help				Display this help\n"
+		"  -H, --hw-info				Display the current L3 properties\n"
 		"  -l, --list				List the current L3 logs\n"
 		"  -a, --clear-all			Clear all disabled rows\n"
 		"  -e, --enable				Enable row, bank, subbank (undo -d)\n"
@@ -129,7 +140,6 @@ int main(int argc, char *argv[])
 {
 	const int device = drm_get_card();
 	char *path;
-	unsigned int devid;
 	int row = 0, bank = 0, sbank = 0;
 	int drm_fd, fd, ret;
 	int action = '0';
@@ -162,13 +172,14 @@ int main(int argc, char *argv[])
 			{ "clear-all", no_argument, 0, 'a' },
 			{ "enable", no_argument, 0, 'e' },
 			{ "disable", optional_argument, 0, 'd' },
+			{ "hw-info", no_argument, 0, 'H' },
 			{ "row", required_argument, 0, 'r' },
 			{ "bank", required_argument, 0, 'b' },
 			{ "subbank", required_argument, 0, 's' },
 			{0, 0, 0, 0}
 		};
 
-		c = getopt_long(argc, argv, "hr:b:s:aled::", long_options,
+		c = getopt_long(argc, argv, "hHr:b:s:aled::", long_options,
 				&option_index);
 		if (c == -1)
 			break;
@@ -178,6 +189,11 @@ int main(int argc, char *argv[])
 			case 'h':
 				usage(argv[0]);
 				exit(EXIT_SUCCESS);
+			case 'H':
+				printf("Number of banks: %d\n", NUM_BANKS);
+				printf("Subbanks per bank: %d\n", NUM_SUBBANKS);
+				printf("L3 size: %dK\n", L3_SIZE >> 10);
+				exit(EXIT_SUCCESS);
 			case 'r':
 				row = atoi(optarg);
 				if (row >= MAX_ROW)
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 11/14] intel_l3_parity: slice support
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (8 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 10/14] intel_l3_parity: Hardware info argument Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 12/14] intel_l3_parity: Actually support multiple slices Ben Widawsky
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Haswell GT3 adds a new slice which is kept distinct from the old
register interface. Plumb it into the code, though it's only 1 slice
still.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/intel_l3_parity.c | 106 +++++++++++++++++++++++++++++-------------------
 1 file changed, 65 insertions(+), 41 deletions(-)

diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index 28c47eb..c98eb80 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -52,6 +52,8 @@ static unsigned int devid;
 #define MAX_ROW (1<<12)
 #define L3_SIZE ((MAX_ROW * 4) * NUM_SUBBANKS *  NUM_BANKS)
 #define NUM_REGS (NUM_BANKS * NUM_SUBBANKS)
+#define MAX_SLICES 1
+#define REAL_MAX_SLICES 1
 
 struct __attribute__ ((__packed__)) l3_log_register {
 	uint32_t row0_enable	: 1;
@@ -60,15 +62,21 @@ struct __attribute__ ((__packed__)) l3_log_register {
 	uint32_t row1_enable	: 1;
 	uint32_t rsvd1		: 4;
 	uint32_t row1		: 11;
-} l3log[MAX_BANKS][NUM_SUBBANKS];
+} l3logs[REAL_MAX_SLICES][MAX_BANKS][NUM_SUBBANKS];
 
-static void dumpit(void)
+static int which_slice = -1;
+#define for_each_slice(__i) \
+	for ((__i) = (which_slice == -1) ? 0 : which_slice; \
+			(__i) < ((which_slice == -1) ? MAX_SLICES : (which_slice + 1)); \
+			(__i)++)
+
+static void dumpit(int slice)
 {
 	int i, j;
 
 	for (i = 0; i < NUM_BANKS; i++) {
 		for (j = 0; j < NUM_SUBBANKS; j++) {
-			struct l3_log_register *reg = &l3log[i][j];
+			struct l3_log_register *reg = &l3logs[slice][i][j];
 
 			if (reg->row0_enable)
 				printf("Row %d, Bank %d, Subbank %d is disabled\n",
@@ -80,9 +88,9 @@ static void dumpit(void)
 	}
 }
 
-static int disable_rbs(int row, int bank, int sbank)
+static int disable_rbs(int row, int bank, int sbank, int slice)
 {
-	struct l3_log_register *reg = &l3log[bank][sbank];
+	struct l3_log_register *reg = &l3logs[slice][bank][sbank];
 
 	// can't map more than 2 rows
 	if (reg->row0_enable && reg->row1_enable)
@@ -105,9 +113,9 @@ static int disable_rbs(int row, int bank, int sbank)
 	return 0;
 }
 
-static void enables_rbs(int row, int bank, int sbank)
+static void enables_rbs(int row, int bank, int sbank, int slice)
 {
-	struct l3_log_register *reg = &l3log[bank][sbank];
+	struct l3_log_register *reg = &l3logs[slice][bank][sbank];
 
 	if (!reg->row0_enable && !reg->row1_enable)
 		return;
@@ -126,6 +134,7 @@ static void usage(const char *name)
 		"  -r, --row=[row]			The row to act upon (default 0)\n"
 		"  -b, --bank=[bank]			The bank to act upon (default 0)\n"
 		"  -s, --subbank=[subbank]		The subbank to act upon (default 0)\n"
+		"  -w, --slice=[slice]			Which slice to act on (default: -1 [all])"
 		" ACTIONS (only 1 may be specified at a time):\n"
 		"  -h, --help				Display this help\n"
 		"  -H, --hw-info				Display the current L3 properties\n"
@@ -139,30 +148,30 @@ static void usage(const char *name)
 int main(int argc, char *argv[])
 {
 	const int device = drm_get_card();
-	char *path;
+	char *path[REAL_MAX_SLICES];
 	int row = 0, bank = 0, sbank = 0;
-	int drm_fd, fd, ret;
+	int fd[REAL_MAX_SLICES] = {0}, ret, i;
 	int action = '0';
-
-	drm_fd = drm_open_any();
+	int drm_fd = drm_open_any();
 	devid = intel_get_drm_devid(drm_fd);
 
 	if (intel_gen(devid) < 7 || IS_VALLEYVIEW(devid))
 		exit(EXIT_SUCCESS);
 
-	ret = asprintf(&path, "/sys/class/drm/card%d/l3_parity", device);
+	ret = asprintf(&path[0], "/sys/class/drm/card%d/l3_parity", device);
 	assert(ret != -1);
 
-	fd = open(path, O_RDWR);
-	assert(fd != -1);
-
-	ret = read(fd, l3log, NUM_REGS * sizeof(uint32_t));
-	if (ret == -1) {
-		perror("Reading sysfs");
-		exit(EXIT_FAILURE);
+	for_each_slice(i) {
+		fd[i] = open(path[i], O_RDWR);
+		assert(fd[i]);
+		ret = read(fd[i], l3logs[i], NUM_REGS * sizeof(uint32_t));
+		if (ret == -1) {
+			perror("Reading sysfs");
+			exit(EXIT_FAILURE);
+		}
+		assert(lseek(fd[i], 0, SEEK_SET) == 0);
 	}
 
-	assert(lseek(fd, 0, SEEK_SET) == 0);
 
 	while (1) {
 		int c, option_index = 0;
@@ -176,10 +185,11 @@ int main(int argc, char *argv[])
 			{ "row", required_argument, 0, 'r' },
 			{ "bank", required_argument, 0, 'b' },
 			{ "subbank", required_argument, 0, 's' },
+			{ "slice", required_argument, 0, 'w' },
 			{0, 0, 0, 0}
 		};
 
-		c = getopt_long(argc, argv, "hHr:b:s:aled::", long_options,
+		c = getopt_long(argc, argv, "hHr:b:s:w:aled::", long_options,
 				&option_index);
 		if (c == -1)
 			break;
@@ -190,6 +200,7 @@ int main(int argc, char *argv[])
 				usage(argv[0]);
 				exit(EXIT_SUCCESS);
 			case 'H':
+				printf("Number of slices: %d\n", MAX_SLICES);
 				printf("Number of banks: %d\n", NUM_BANKS);
 				printf("Subbanks per bank: %d\n", NUM_SUBBANKS);
 				printf("L3 size: %dK\n", L3_SIZE >> 10);
@@ -209,6 +220,11 @@ int main(int argc, char *argv[])
 				if (sbank >= NUM_SUBBANKS)
 					exit(EXIT_FAILURE);
 				break;
+			case 'w':
+				which_slice = atoi(optarg);
+				if (which_slice > 1)
+					exit(EXIT_FAILURE);
+				break;
 			case 'd':
 				if (optarg) {
 					ret = sscanf(optarg, "%d,%d,%d", &row, &bank, &sbank);
@@ -229,30 +245,38 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	switch (action) {
-		case 'l':
-			dumpit();
-			exit(EXIT_SUCCESS);
-		case 'a':
-			memset(l3log, 0, sizeof(l3log));
-			break;
-		case 'e':
-			enables_rbs(row, bank, sbank);
-			break;
-		case 'd':
-			assert(disable_rbs(row, bank, sbank) == 0);
-			break;
-		default:
-			abort();
+	/* Per slice operations */
+	for_each_slice(i) {
+		switch (action) {
+			case 'l':
+				dumpit(i);
+				break;
+			case 'a':
+				memset(l3logs[i], 0, NUM_REGS * sizeof(struct l3_log_register));
+				break;
+			case 'e':
+				enables_rbs(row, bank, sbank, i);
+				break;
+			case 'd':
+				assert(disable_rbs(row, bank, sbank, i) == 0);
+				break;
+			default:
+				abort();
+		}
 	}
 
-	ret = write(fd, l3log, NUM_REGS * sizeof(uint32_t));
-	if (ret == -1) {
-		perror("Writing sysfs");
-		exit(EXIT_FAILURE);
+	if (action == 'l')
+		exit(EXIT_SUCCESS);
+
+	for_each_slice(i) {
+		ret = write(fd[i], l3logs[i], NUM_REGS * sizeof(uint32_t));
+		if (ret == -1) {
+			perror("Writing sysfs");
+			exit(EXIT_FAILURE);
+		}
+		close(fd[i]);
 	}
 
-	close(fd);
 
 	exit(EXIT_SUCCESS);
 }
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 12/14] intel_l3_parity: Actually support multiple slices
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (9 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 11/14] intel_l3_parity: slice support Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 13/14] intel_l3_parity: Support error injection Ben Widawsky
  2013-09-18  4:12 ` [PATCH 14/14] intel_l3_parity: Support a daemonic mode Ben Widawsky
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/intel_l3_parity.c | 45 ++++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index c98eb80..ed7034a 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -41,19 +41,28 @@
 static unsigned int devid;
 /* L3 size is always a function of banks. The number of banks cannot be
  * determined by number of slices however */
-#define MAX_BANKS 4
-#define NUM_BANKS \
-	((devid == PCI_CHIP_IVYBRIDGE_GT1 || devid == PCI_CHIP_IVYBRIDGE_M_GT1) ? 2 : 4)
+static inline int num_banks(void) {
+	if (IS_HSW_GT3(devid))
+		return 8; /* 4 per each slice */
+	else if (IS_HSW_GT1(devid) ||
+			devid == PCI_CHIP_IVYBRIDGE_GT1 ||
+			devid == PCI_CHIP_IVYBRIDGE_M_GT1)
+		return 2;
+	else
+		return 4;
+}
 #define NUM_SUBBANKS 8
 #define BYTES_PER_BANK (128 << 10)
 /* Each row addresses [up to] 4b. This multiplied by the number of subbanks
  * will give the L3 size per bank.
  * TODO: Row size is fixed on IVB, and variable on HSW.*/
 #define MAX_ROW (1<<12)
-#define L3_SIZE ((MAX_ROW * 4) * NUM_SUBBANKS *  NUM_BANKS)
-#define NUM_REGS (NUM_BANKS * NUM_SUBBANKS)
-#define MAX_SLICES 1
-#define REAL_MAX_SLICES 1
+#define MAX_BANKS_PER_SLICE 4
+#define NUM_REGS (MAX_BANKS_PER_SLICE * NUM_SUBBANKS)
+#define MAX_SLICES (IS_HSW_GT3(devid) ? 2 : 1)
+#define REAL_MAX_SLICES 2
+/* TODO support SLM config */
+#define L3_SIZE ((MAX_ROW * 4) * NUM_SUBBANKS *  num_banks())
 
 struct __attribute__ ((__packed__)) l3_log_register {
 	uint32_t row0_enable	: 1;
@@ -62,7 +71,7 @@ struct __attribute__ ((__packed__)) l3_log_register {
 	uint32_t row1_enable	: 1;
 	uint32_t rsvd1		: 4;
 	uint32_t row1		: 11;
-} l3logs[REAL_MAX_SLICES][MAX_BANKS][NUM_SUBBANKS];
+} l3logs[REAL_MAX_SLICES][MAX_BANKS_PER_SLICE][NUM_SUBBANKS];
 
 static int which_slice = -1;
 #define for_each_slice(__i) \
@@ -74,16 +83,16 @@ static void dumpit(int slice)
 {
 	int i, j;
 
-	for (i = 0; i < NUM_BANKS; i++) {
+	for (i = 0; i < MAX_BANKS_PER_SLICE; i++) {
 		for (j = 0; j < NUM_SUBBANKS; j++) {
 			struct l3_log_register *reg = &l3logs[slice][i][j];
 
 			if (reg->row0_enable)
-				printf("Row %d, Bank %d, Subbank %d is disabled\n",
-				       reg->row0, i, j);
+				printf("Slice %d, Row %d, Bank %d, Subbank %d is disabled\n",
+				       slice, reg->row0, i, j);
 			if (reg->row1_enable)
-				printf("Row %d, Bank %d, Subbank %d is disabled\n",
-				       reg->row1, i, j);
+				printf("Slice %d, Row %d, Bank %d, Subbank %d is disabled\n",
+				       slice, reg->row1, i, j);
 		}
 	}
 }
@@ -160,6 +169,8 @@ int main(int argc, char *argv[])
 
 	ret = asprintf(&path[0], "/sys/class/drm/card%d/l3_parity", device);
 	assert(ret != -1);
+	ret = asprintf(&path[1], "/sys/class/drm/card%d/l3_parity_slice_1", device);
+	assert(ret != -1);
 
 	for_each_slice(i) {
 		fd[i] = open(path[i], O_RDWR);
@@ -201,9 +212,9 @@ int main(int argc, char *argv[])
 				exit(EXIT_SUCCESS);
 			case 'H':
 				printf("Number of slices: %d\n", MAX_SLICES);
-				printf("Number of banks: %d\n", NUM_BANKS);
+				printf("Number of banks: %d\n", num_banks());
 				printf("Subbanks per bank: %d\n", NUM_SUBBANKS);
-				printf("L3 size: %dK\n", L3_SIZE >> 10);
+				printf("Max L3 size: %dK\n", L3_SIZE >> 10);
 				exit(EXIT_SUCCESS);
 			case 'r':
 				row = atoi(optarg);
@@ -212,7 +223,7 @@ int main(int argc, char *argv[])
 				break;
 			case 'b':
 				bank = atoi(optarg);
-				if (bank >= NUM_BANKS)
+				if (bank >= num_banks() || bank >= MAX_BANKS_PER_SLICE)
 					exit(EXIT_FAILURE);
 				break;
 			case 's':
@@ -222,7 +233,7 @@ int main(int argc, char *argv[])
 				break;
 			case 'w':
 				which_slice = atoi(optarg);
-				if (which_slice > 1)
+				if (which_slice >= MAX_SLICES)
 					exit(EXIT_FAILURE);
 				break;
 			case 'd':
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 13/14] intel_l3_parity: Support error injection
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (10 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 12/14] intel_l3_parity: Actually support multiple slices Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  2013-09-18  4:12 ` [PATCH 14/14] intel_l3_parity: Support a daemonic mode Ben Widawsky
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

Haswell added the ability to inject errors which is extremely useful for
testing. Add two arguments to the tool to inject, and uninject.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tests/sysfs_l3_parity   |  2 +-
 tools/intel_l3_parity.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/tests/sysfs_l3_parity b/tests/sysfs_l3_parity
index a0dfad9..e9d4411 100755
--- a/tests/sysfs_l3_parity
+++ b/tests/sysfs_l3_parity
@@ -21,7 +21,7 @@ fi
 $SOURCE_DIR/../tools/intel_l3_parity -r 0 -b 0 -s 0 -e
 
 #Check that we can clear remaps
-if [ `$SOURCE_DIR/../tools/intel_l3_parity -l | wc -c` != "0" ] ; then
+if [ `$SOURCE_DIR/../tools/intel_l3_parity -l | wc -l` != 1 ] ; then
 	echo "Fail 2"
 	exit 1
 fi
diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index ed7034a..d2ad3c9 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -79,6 +79,20 @@ static int which_slice = -1;
 			(__i) < ((which_slice == -1) ? MAX_SLICES : (which_slice + 1)); \
 			(__i)++)
 
+static void decode_dft(uint32_t dft)
+{
+	if (IS_IVYBRIDGE(devid) || !(dft & 1)) {
+		printf("Error injection disabled\n");
+		return;
+	}
+	printf("Error injection enabled\n");
+	printf("  Hang = %s\n", (dft >> 28) & 0x1 ? "yes" : "no");
+	printf("  Row = %d\n", (dft >> 7) & 0x7ff);
+	printf("  Bank = %d\n", (dft >> 2) & 0x3);
+	printf("  Subbank = %d\n", (dft >> 4) & 0x7);
+	printf("  Slice = %d\n", (dft >> 1) & 0x1);
+}
+
 static void dumpit(int slice)
 {
 	int i, j;
@@ -150,7 +164,9 @@ static void usage(const char *name)
 		"  -l, --list				List the current L3 logs\n"
 		"  -a, --clear-all			Clear all disabled rows\n"
 		"  -e, --enable				Enable row, bank, subbank (undo -d)\n"
-		"  -d, --disable=<row,bank,subbank>	Disable row, bank, subbank (inline arguments are deprecated. Please use -r, -b, -s instead\n",
+		"  -d, --disable=<row,bank,subbank>	Disable row, bank, subbank (inline arguments are deprecated. Please use -r, -b, -s instead\n"
+		"  -i, --inject				[HSW only] Cause hardware to inject a row errors\n"
+		"  -u, --uninject			[HSW only] Turn off hardware error injectection (undo -i)\n",
 		name);
 }
 
@@ -158,6 +174,7 @@ int main(int argc, char *argv[])
 {
 	const int device = drm_get_card();
 	char *path[REAL_MAX_SLICES];
+	uint32_t dft;
 	int row = 0, bank = 0, sbank = 0;
 	int fd[REAL_MAX_SLICES] = {0}, ret, i;
 	int action = '0';
@@ -167,6 +184,8 @@ int main(int argc, char *argv[])
 	if (intel_gen(devid) < 7 || IS_VALLEYVIEW(devid))
 		exit(EXIT_SUCCESS);
 
+	assert(intel_register_access_init(intel_get_pci_device(), 0) == 0);
+
 	ret = asprintf(&path[0], "/sys/class/drm/card%d/l3_parity", device);
 	assert(ret != -1);
 	ret = asprintf(&path[1], "/sys/class/drm/card%d/l3_parity_slice_1", device);
@@ -183,6 +202,7 @@ int main(int argc, char *argv[])
 		assert(lseek(fd[i], 0, SEEK_SET) == 0);
 	}
 
+	dft = intel_register_read(0xb038);
 
 	while (1) {
 		int c, option_index = 0;
@@ -192,6 +212,8 @@ int main(int argc, char *argv[])
 			{ "clear-all", no_argument, 0, 'a' },
 			{ "enable", no_argument, 0, 'e' },
 			{ "disable", optional_argument, 0, 'd' },
+			{ "inject", no_argument, 0, 'i' },
+			{ "uninject", no_argument, 0, 'u' },
 			{ "hw-info", no_argument, 0, 'H' },
 			{ "row", required_argument, 0, 'r' },
 			{ "bank", required_argument, 0, 'b' },
@@ -200,7 +222,7 @@ int main(int argc, char *argv[])
 			{0, 0, 0, 0}
 		};
 
-		c = getopt_long(argc, argv, "hHr:b:s:w:aled::", long_options,
+		c = getopt_long(argc, argv, "hHr:b:s:w:aled::iu", long_options,
 				&option_index);
 		if (c == -1)
 			break;
@@ -215,6 +237,7 @@ int main(int argc, char *argv[])
 				printf("Number of banks: %d\n", num_banks());
 				printf("Subbanks per bank: %d\n", NUM_SUBBANKS);
 				printf("Max L3 size: %dK\n", L3_SIZE >> 10);
+				printf("Has error injection: %s\n", IS_HASWELL(devid) ? "yes" : "no");
 				exit(EXIT_SUCCESS);
 			case 'r':
 				row = atoi(optarg);
@@ -236,6 +259,12 @@ int main(int argc, char *argv[])
 				if (which_slice >= MAX_SLICES)
 					exit(EXIT_FAILURE);
 				break;
+			case 'i':
+			case 'u':
+				if (!IS_HASWELL(devid)) {
+					fprintf(stderr, "Error injection supported on HSW+ only\n");
+					exit(EXIT_FAILURE);
+				}
 			case 'd':
 				if (optarg) {
 					ret = sscanf(optarg, "%d,%d,%d", &row, &bank, &sbank);
@@ -256,6 +285,23 @@ int main(int argc, char *argv[])
 		}
 	}
 
+	if (action == 'i') {
+		if (((dft >> 1) & 1) != which_slice) {
+			fprintf(stderr, "DFT register already has slice %d enabled, and we don't support multiple slices. Try modifying -w; but sometimes the register sticks in the wrong way\n", (dft >> 1) & 1);
+			exit(EXIT_FAILURE);
+		}
+
+		if (which_slice == -1) {
+			fprintf(stderr, "Cannot inject errors to multiple slices (modify -w)\n");
+			exit(EXIT_FAILURE);
+		}
+		if (dft & 1 && ((dft >> 1) && 1) == which_slice)
+			printf("warning: overwriting existing injections. This is very dangerous.\n");
+	}
+
+	if (action == 'l')
+		decode_dft(dft);
+
 	/* Per slice operations */
 	for_each_slice(i) {
 		switch (action) {
@@ -271,11 +317,30 @@ int main(int argc, char *argv[])
 			case 'd':
 				assert(disable_rbs(row, bank, sbank, i) == 0);
 				break;
+			case 'i':
+				if (bank == 3) {
+					fprintf(stderr, "The hardware does not support error inject on bank 3.\n");
+					exit(EXIT_FAILURE);
+				}
+				dft |= row << 7;
+				dft |= sbank << 4;
+				dft |= bank << 2;
+				assert(i < 2);
+				dft |= i << 1; /* slice */
+				dft |= 1 << 0; /* enable */
+				intel_register_write(0xb038, dft);
+				break;
+			case 'u':
+				intel_register_write(0xb038, dft & ~(1<<0));
+				break;
+			case 'L':
+				break;
 			default:
 				abort();
 		}
 	}
 
+	intel_register_access_fini();
 	if (action == 'l')
 		exit(EXIT_SUCCESS);
 
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 14/14] intel_l3_parity: Support a daemonic mode
  2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
                   ` (11 preceding siblings ...)
  2013-09-18  4:12 ` [PATCH 13/14] intel_l3_parity: Support error injection Ben Widawsky
@ 2013-09-18  4:12 ` Ben Widawsky
  12 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18  4:12 UTC (permalink / raw)
  To: Intel GFX; +Cc: Bryan Bell, Ben Widawsky, Ben Widawsky

v2: Add a comment explaining the dangers of directly accessing the DFT
register (Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 tools/Makefile.am              |   6 ++-
 tools/intel_l3_parity.c        |  46 ++++++++++++++++--
 tools/intel_l3_parity.h        |  31 ++++++++++++
 tools/intel_l3_udev_listener.c | 108 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 186 insertions(+), 5 deletions(-)
 create mode 100644 tools/intel_l3_parity.h
 create mode 100644 tools/intel_l3_udev_listener.c

diff --git a/tools/Makefile.am b/tools/Makefile.am
index 47bd5b3..19810cf 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -39,7 +39,7 @@ dist_bin_SCRIPTS = intel_gpu_abrt
 
 AM_CPPFLAGS = -I$(top_srcdir) -I$(top_srcdir)/lib
 AM_CFLAGS = $(DRM_CFLAGS) $(PCIACCESS_CFLAGS) $(CWARNFLAGS) $(CAIRO_CFLAGS)
-LDADD = $(top_builddir)/lib/libintel_tools.la $(DRM_LIBS) $(PCIACCESS_LIBS) $(CAIRO_LIBS)
+LDADD = $(top_builddir)/lib/libintel_tools.la $(DRM_LIBS) $(PCIACCESS_LIBS) $(CAIRO_LIBS) $(LIBUDEV_LIBS)
 
 intel_dump_decode_SOURCES = 	\
 	intel_dump_decode.c
@@ -50,3 +50,7 @@ intel_error_decode_SOURCES =	\
 intel_bios_reader_SOURCES =	\
 	intel_bios_reader.c	\
 	intel_bios.h
+
+intel_l3_parity_SOURCES =	\
+	intel_l3_parity.c	\
+	intel_l3_udev_listener.c
diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
index d2ad3c9..ead8fb5 100644
--- a/tools/intel_l3_parity.c
+++ b/tools/intel_l3_parity.c
@@ -37,6 +37,14 @@
 #include "intel_chipset.h"
 #include "intel_gpu_tools.h"
 #include "drmtest.h"
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+#if HAVE_UDEV
+#include <libudev.h>
+#include <syslog.h>
+#endif
+#include "intel_l3_parity.h"
 
 static unsigned int devid;
 /* L3 size is always a function of banks. The number of banks cannot be
@@ -157,7 +165,8 @@ static void usage(const char *name)
 		"  -r, --row=[row]			The row to act upon (default 0)\n"
 		"  -b, --bank=[bank]			The bank to act upon (default 0)\n"
 		"  -s, --subbank=[subbank]		The subbank to act upon (default 0)\n"
-		"  -w, --slice=[slice]			Which slice to act on (default: -1 [all])"
+		"  -w, --slice=[slice]			Which slice to act on (default: -1 [all])\n"
+		"    , --daemon				Run the listener (-L) as a daemon\n"
 		" ACTIONS (only 1 may be specified at a time):\n"
 		"  -h, --help				Display this help\n"
 		"  -H, --hw-info				Display the current L3 properties\n"
@@ -166,7 +175,8 @@ static void usage(const char *name)
 		"  -e, --enable				Enable row, bank, subbank (undo -d)\n"
 		"  -d, --disable=<row,bank,subbank>	Disable row, bank, subbank (inline arguments are deprecated. Please use -r, -b, -s instead\n"
 		"  -i, --inject				[HSW only] Cause hardware to inject a row errors\n"
-		"  -u, --uninject			[HSW only] Turn off hardware error injectection (undo -i)\n",
+		"  -u, --uninject			[HSW only] Turn off hardware error injectection (undo -i)\n"
+		"  -L, --listen				Listen for uevent errors\n",
 		name);
 }
 
@@ -179,6 +189,7 @@ int main(int argc, char *argv[])
 	int fd[REAL_MAX_SLICES] = {0}, ret, i;
 	int action = '0';
 	int drm_fd = drm_open_any();
+	int daemonize = 0;
 	devid = intel_get_drm_devid(drm_fd);
 
 	if (intel_gen(devid) < 7 || IS_VALLEYVIEW(devid))
@@ -202,11 +213,18 @@ int main(int argc, char *argv[])
 		assert(lseek(fd[i], 0, SEEK_SET) == 0);
 	}
 
+	/* NB: It is potentially unsafe to read this register if the kernel is
+	 * actively using this register range, or we're running multiple
+	 * instances of this tool. Since neither of those cases should occur
+	 * (and the tool should be root only) we can safely ignore this for
+	 * now. Just be aware of this if for some reason a hang is reported
+	 * when using this tool.
+	 */
 	dft = intel_register_read(0xb038);
 
 	while (1) {
 		int c, option_index = 0;
-		static struct option long_options[] = {
+		struct option long_options[] = {
 			{ "help", no_argument, 0, 'h' },
 			{ "list", no_argument, 0, 'l' },
 			{ "clear-all", no_argument, 0, 'a' },
@@ -215,18 +233,23 @@ int main(int argc, char *argv[])
 			{ "inject", no_argument, 0, 'i' },
 			{ "uninject", no_argument, 0, 'u' },
 			{ "hw-info", no_argument, 0, 'H' },
+			{ "listen", no_argument, 0, 'L' },
 			{ "row", required_argument, 0, 'r' },
 			{ "bank", required_argument, 0, 'b' },
 			{ "subbank", required_argument, 0, 's' },
 			{ "slice", required_argument, 0, 'w' },
+			{ "daemon", no_argument, &daemonize, 1 },
 			{0, 0, 0, 0}
 		};
 
-		c = getopt_long(argc, argv, "hHr:b:s:w:aled::iu", long_options,
+		c = getopt_long(argc, argv, "hHr:b:s:w:aled::iuL", long_options,
 				&option_index);
 		if (c == -1)
 			break;
 
+		if (c == 0)
+			continue;
+
 		switch (c) {
 			case '?':
 			case 'h':
@@ -274,6 +297,7 @@ int main(int argc, char *argv[])
 			case 'a':
 			case 'l':
 			case 'e':
+			case 'L':
 				if (action != '0') {
 					fprintf(stderr, "Only one action may be specified\n");
 					exit(EXIT_FAILURE);
@@ -299,6 +323,20 @@ int main(int argc, char *argv[])
 			printf("warning: overwriting existing injections. This is very dangerous.\n");
 	}
 
+	/* Daemon doesn't work like the other commands */
+	if (action == 'L') {
+		struct l3_parity par;
+		struct l3_location loc;
+		if (daemonize) {
+			assert(daemon(0, 0) == 0);
+			openlog(argv[0], LOG_CONS | LOG_PID, LOG_USER);
+		}
+		memset(&par, 0, sizeof(par));
+		assert(l3_uevent_setup(&par) == 0);
+		assert(l3_listen(&par, daemonize == 1, &loc) == 0);
+		exit(EXIT_SUCCESS);
+	}
+
 	if (action == 'l')
 		decode_dft(dft);
 
diff --git a/tools/intel_l3_parity.h b/tools/intel_l3_parity.h
new file mode 100644
index 0000000..65697c4
--- /dev/null
+++ b/tools/intel_l3_parity.h
@@ -0,0 +1,31 @@
+#ifndef INTEL_L3_PARITY_H_
+#define INTEL_L3_PARITY_H_
+
+#include <stdint.h>
+#include <stdbool.h>
+
+struct l3_parity {
+	struct udev *udev;
+	struct udev_monitor *uevent_monitor;
+	int fd;
+	fd_set fdset;
+};
+
+struct l3_location {
+	uint8_t slice;
+	uint16_t row;
+	uint8_t bank;
+	uint8_t subbank;
+};
+
+#if HAVE_UDEV
+int l3_uevent_setup(struct l3_parity *par);
+/* Listens (blocks) for an l3 parity event. Returns the location of the error. */
+int l3_listen(struct l3_parity *par, bool daemon, struct l3_location *loc);
+#define l3_uevent_teardown(par) {}
+#else
+#define l3_uevent_setup(par, daemon, loc) -1
+#define l3_listen(par) -1
+#endif
+
+#endif
diff --git a/tools/intel_l3_udev_listener.c b/tools/intel_l3_udev_listener.c
new file mode 100644
index 0000000..c50820c
--- /dev/null
+++ b/tools/intel_l3_udev_listener.c
@@ -0,0 +1,108 @@
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#if HAVE_UDEV
+#include <libudev.h>
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <assert.h>
+#include <syslog.h>
+#include "i915_drm.h"
+#include "intel_l3_parity.h"
+
+#ifndef I915_L3_PARITY_UEVENT
+#define I915_L3_PARITY_UEVENT "L3_PARITY_ERROR"
+#endif
+
+int l3_uevent_setup(struct l3_parity *par)
+{
+	struct udev *udev;
+	struct udev_monitor *uevent_monitor;
+	fd_set fdset;
+	int fd, ret = -1;
+
+	udev = udev_new();
+	if (!udev) {
+		return -1;
+	}
+
+	uevent_monitor = udev_monitor_new_from_netlink(udev, "udev");
+	if (!uevent_monitor)
+		goto err_out;
+
+	ret = udev_monitor_filter_add_match_subsystem_devtype(uevent_monitor, "drm", "drm_minor");
+	if (ret < 0)
+		goto err_out;
+
+	ret = udev_monitor_enable_receiving(uevent_monitor);
+	if (ret < 0)
+		goto err_out;
+
+	fd = udev_monitor_get_fd(uevent_monitor);
+	FD_ZERO(&fdset);
+	FD_SET(fd, &fdset);
+
+	par->udev = udev;
+	par->fd = fd;
+	par->fdset = fdset;
+	par->uevent_monitor = uevent_monitor;
+	return 0;
+
+err_out:
+	udev_unref(udev);
+	return ret;
+}
+
+int l3_listen(struct l3_parity *par, bool daemon, struct l3_location *loc)
+{
+	struct udev_device *udev_dev;
+	const char *parity_status;
+	char *err_msg;
+	int ret;
+
+again:
+	ret = select(par->fd + 1, &par->fdset, NULL, NULL, NULL);
+	/* Number of bits set is returned, must be >= 1 */
+	if (ret <= 0) {
+		return ret;
+	}
+
+	assert(FD_ISSET(par->fd, &par->fdset));
+
+	udev_dev = udev_monitor_receive_device(par->uevent_monitor);
+	if (!udev_dev)
+		return -1;
+
+	parity_status = udev_device_get_property_value(udev_dev, I915_L3_PARITY_UEVENT);
+	if (strncmp(parity_status, "1", 1))
+		goto again;
+
+	loc->slice = atoi(udev_device_get_property_value(udev_dev, "SLICE"));
+	loc->row = atoi(udev_device_get_property_value(udev_dev, "ROW"));
+	loc->bank = atoi(udev_device_get_property_value(udev_dev, "BANK"));
+	loc->subbank = atoi(udev_device_get_property_value(udev_dev, "SUBBANK"));
+
+	udev_device_unref(udev_dev);
+
+	asprintf(&err_msg, "Parity error detected on: %d,%d,%d,%d. "
+			"Try to run intel_l3_parity -r %d -b %d -s %d -w %d -d",
+			loc->slice, loc->row, loc->bank, loc->subbank,
+			loc->row, loc->bank, loc->subbank, loc->slice);
+	if (daemon) {
+		syslog(LOG_INFO, "%s\n", err_msg);
+		goto again;
+	}
+
+	fprintf(stderr, "%s\n", err_msg);
+
+	free(err_msg);
+
+	return 0;
+}
+#endif
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/6] drm/i915: Add second slice l3 remapping
  2013-09-18  4:12 ` [PATCH 2/6] drm/i915: Add second slice l3 remapping Ben Widawsky
@ 2013-09-18  7:36   ` Ville Syrjälä
  2013-09-18 16:22     ` Ben Widawsky
  2013-09-19 18:13     ` [PATCH] [v3] " Ben Widawsky
  0 siblings, 2 replies; 27+ messages in thread
From: Ville Syrjälä @ 2013-09-18  7:36 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Tue, Sep 17, 2013 at 09:12:43PM -0700, Ben Widawsky wrote:
> Certain HSW SKUs have a second bank of L3. This L3 remapping has a
> separate register set, and interrupt from the first "slice". A slice is
> simply a term to define some subset of the GPU's l3 cache. This patch
> implements both the interrupt handler, and ability to communicate with
> userspace about this second slice.
> 
> v2:  Remove redundant check about non-existent slice.
> Change warning about interrupts of unknown slices to WARN_ON_ONCE
> Handle the case where we get 2 slice interrupts concurrently, and switch
> the tracking of interrupts to be non-destructive (all Ville)
> Don't enable/mask the second slice parity interrupt for ivb/vlv (even
> though all docs I can find claim it's rsvd) (Ville + Bryan)
> Keep BYT excluded from L3 parity
> 
> CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  7 ++-
>  drivers/gpu/drm/i915/i915_gem.c         | 26 +++++-----
>  drivers/gpu/drm/i915/i915_irq.c         | 88 +++++++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_reg.h         |  7 +++
>  drivers/gpu/drm/i915/i915_sysfs.c       | 34 ++++++++++---
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  7 ++-
>  include/uapi/drm/i915_drm.h             |  8 +--
>  7 files changed, 116 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8b16d47..c6e8df7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -917,9 +917,11 @@ struct i915_ums_state {
>  	int mm_suspended;
>  };
>  
> +#define MAX_L3_SLICES 2
>  struct intel_l3_parity {
> -	u32 *remap_info;
> +	u32 *remap_info[MAX_L3_SLICES];
>  	struct work_struct error_work;
> +	int which_slice;
>  };
>  
>  struct i915_gem_mm {
> @@ -1686,6 +1688,7 @@ struct drm_i915_file_private {
>  #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
>  
>  #define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> +#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
>  
>  #define GT_FREQUENCY_MULTIPLIER 50
>  
> @@ -1946,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>  int __must_check i915_gem_init(struct drm_device *dev);
>  int __must_check i915_gem_init_hw(struct drm_device *dev);
> -void i915_gem_l3_remap(struct drm_device *dev);
> +void i915_gem_l3_remap(struct drm_device *dev, int slice);
>  void i915_gem_init_swizzling(struct drm_device *dev);
>  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>  int __must_check i915_gpu_idle(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3d3de6e..66bf75d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4252,16 +4252,15 @@ i915_gem_idle(struct drm_device *dev)
>  	return 0;
>  }
>  
> -void i915_gem_l3_remap(struct drm_device *dev)
> +void i915_gem_l3_remap(struct drm_device *dev, int slice)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> +	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
> +	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
>  	u32 misccpctl;
>  	int i;
>  
> -	if (!HAS_L3_GPU_CACHE(dev))
> -		return;
> -
> -	if (!dev_priv->l3_parity.remap_info)
> +	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
>  		return;
>  
>  	misccpctl = I915_READ(GEN7_MISCCPCTL);
> @@ -4269,17 +4268,17 @@ void i915_gem_l3_remap(struct drm_device *dev)
>  	POSTING_READ(GEN7_MISCCPCTL);
>  
>  	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
> -		u32 remap = I915_READ(GEN7_L3LOG_BASE + i);
> -		if (remap && remap != dev_priv->l3_parity.remap_info[i/4])
> +		u32 remap = I915_READ(reg_base + i);
> +		if (remap && remap != remap_info[i/4])
>  			DRM_DEBUG("0x%x was already programmed to %x\n",
> -				  GEN7_L3LOG_BASE + i, remap);
> -		if (remap && !dev_priv->l3_parity.remap_info[i/4])
> +				  reg_base + i, remap);
> +		if (remap && !remap_info[i/4])
>  			DRM_DEBUG_DRIVER("Clearing remapped register\n");
> -		I915_WRITE(GEN7_L3LOG_BASE + i, dev_priv->l3_parity.remap_info[i/4]);
> +		I915_WRITE(reg_base + i, remap_info[i/4]);
>  	}
>  
>  	/* Make sure all the writes land before disabling dop clock gating */
> -	POSTING_READ(GEN7_L3LOG_BASE);
> +	POSTING_READ(reg_base);
>  
>  	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
>  }
> @@ -4373,7 +4372,7 @@ int
>  i915_gem_init_hw(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	int ret;
> +	int ret, i;
>  
>  	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>  		return -EIO;
> @@ -4392,7 +4391,8 @@ i915_gem_init_hw(struct drm_device *dev)
>  		I915_WRITE(GEN7_MSG_CTL, temp);
>  	}
>  
> -	i915_gem_l3_remap(dev);
> +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> +		i915_gem_l3_remap(dev, i);
>  
>  	i915_gem_init_swizzling(dev);
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index a42f30b..b11ee39 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -888,9 +888,10 @@ static void ivybridge_parity_work(struct work_struct *work)
>  	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
>  						    l3_parity.error_work);
>  	u32 error_status, row, bank, subbank;
> -	char *parity_event[5];
> +	char *parity_event[6];
>  	uint32_t misccpctl;
>  	unsigned long flags;
> +	uint8_t slice = 0;
>  
>  	/* We must turn off DOP level clock gating to access the L3 registers.
>  	 * In order to prevent a get/put style interface, acquire struct mutex
> @@ -898,45 +899,63 @@ static void ivybridge_parity_work(struct work_struct *work)
>  	 */
>  	mutex_lock(&dev_priv->dev->struct_mutex);
>  
> +	/* If we've screwed up tracking, just let the interrupt fire again */
> +	if (WARN_ON(!dev_priv->l3_parity.which_slice))
> +		goto out;
> +
>  	misccpctl = I915_READ(GEN7_MISCCPCTL);
>  	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
>  	POSTING_READ(GEN7_MISCCPCTL);
>  
> -	error_status = I915_READ(GEN7_L3CDERRST1);
> -	row = GEN7_PARITY_ERROR_ROW(error_status);
> -	bank = GEN7_PARITY_ERROR_BANK(error_status);
> -	subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
> +	while ((slice = ffs(dev_priv->l3_parity.which_slice)) != 0) {
> +		u32 reg;

ffs(1) == 1, so we need a slice-- here, don't we?

The rest looks OK to me, so once this one thing is fixed you can slap on a
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
sticker.

>  
> -	I915_WRITE(GEN7_L3CDERRST1, GEN7_PARITY_ERROR_VALID |
> -				    GEN7_L3CDERRST1_ENABLE);
> -	POSTING_READ(GEN7_L3CDERRST1);
> +		if (WARN_ON_ONCE(slice >= NUM_L3_SLICES(dev_priv->dev)))
> +			break;
>  
> -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> +		dev_priv->l3_parity.which_slice &= ~(1<<slice);
>  
> -	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> -	ilk_enable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> -	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> +		reg = GEN7_L3CDERRST1 + (slice * 0x200);
>  
> -	mutex_unlock(&dev_priv->dev->struct_mutex);
> +		error_status = I915_READ(reg);
> +		row = GEN7_PARITY_ERROR_ROW(error_status);
> +		bank = GEN7_PARITY_ERROR_BANK(error_status);
> +		subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
> +
> +		I915_WRITE(reg, GEN7_PARITY_ERROR_VALID | GEN7_L3CDERRST1_ENABLE);
> +		POSTING_READ(reg);
> +
> +		parity_event[0] = I915_L3_PARITY_UEVENT "=1";
> +		parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
> +		parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
> +		parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
> +		parity_event[4] = kasprintf(GFP_KERNEL, "SLICE=%d", slice);
> +		parity_event[5] = NULL;
> +
> +		kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
> +				   KOBJ_CHANGE, parity_event);
>  
> -	parity_event[0] = I915_L3_PARITY_UEVENT "=1";
> -	parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
> -	parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
> -	parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
> -	parity_event[4] = NULL;
> +		DRM_DEBUG("Parity error: Slice = %d, Row = %d, Bank = %d, Sub bank = %d.\n",
> +			  slice, row, bank, subbank);
>  
> -	kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
> -			   KOBJ_CHANGE, parity_event);
> +		kfree(parity_event[4]);
> +		kfree(parity_event[3]);
> +		kfree(parity_event[2]);
> +		kfree(parity_event[1]);
> +	}
>  
> -	DRM_DEBUG("Parity error: Row = %d, Bank = %d, Sub bank = %d.\n",
> -		  row, bank, subbank);
> +	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
>  
> -	kfree(parity_event[3]);
> -	kfree(parity_event[2]);
> -	kfree(parity_event[1]);
> +out:
> +	WARN_ON(dev_priv->l3_parity.which_slice);
> +	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> +	ilk_enable_gt_irq(dev_priv, GT_PARITY_ERROR(dev_priv->dev));
> +	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> +
> +	mutex_unlock(&dev_priv->dev->struct_mutex);
>  }
>  
> -static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
> +static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
>  {
>  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>  
> @@ -944,9 +963,16 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
>  		return;
>  
>  	spin_lock(&dev_priv->irq_lock);
> -	ilk_disable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> +	ilk_disable_gt_irq(dev_priv, GT_PARITY_ERROR(dev));
>  	spin_unlock(&dev_priv->irq_lock);
>  
> +	iir &= GT_PARITY_ERROR(dev);
> +	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1)
> +		dev_priv->l3_parity.which_slice |= 1 << 1;
> +
> +	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
> +		dev_priv->l3_parity.which_slice |= 1 << 0;
> +
>  	queue_work(dev_priv->wq, &dev_priv->l3_parity.error_work);
>  }
>  
> @@ -981,8 +1007,8 @@ static void snb_gt_irq_handler(struct drm_device *dev,
>  		i915_handle_error(dev, false);
>  	}
>  
> -	if (gt_iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
> -		ivybridge_parity_error_irq_handler(dev);
> +	if (gt_iir & GT_PARITY_ERROR(dev))
> +		ivybridge_parity_error_irq_handler(dev, gt_iir);
>  }
>  
>  #define HPD_STORM_DETECT_PERIOD 1000
> @@ -2267,8 +2293,8 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
>  	dev_priv->gt_irq_mask = ~0;
>  	if (HAS_L3_GPU_CACHE(dev)) {
>  		/* L3 parity interrupt is always unmasked. */
> -		dev_priv->gt_irq_mask = ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> -		gt_irqs |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> +		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
> +		gt_irqs |= GT_PARITY_ERROR(dev);
>  	}
>  
>  	gt_irqs |= GT_RENDER_USER_INTERRUPT;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 384adfb..c94946f 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -927,6 +927,7 @@
>  #define GT_BLT_USER_INTERRUPT			(1 << 22)
>  #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
>  #define GT_BSD_USER_INTERRUPT			(1 << 12)
> +#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
>  #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
>  #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
>  #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
> @@ -937,6 +938,10 @@
>  #define PM_VEBOX_CS_ERROR_INTERRUPT		(1 << 12) /* hsw+ */
>  #define PM_VEBOX_USER_INTERRUPT			(1 << 10) /* hsw+ */
>  
> +#define GT_PARITY_ERROR(dev) \
> +	(GT_RENDER_L3_PARITY_ERROR_INTERRUPT | \
> +	 IS_HASWELL(dev) ? GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 : 0)
> +
>  /* These are all the "old" interrupts */
>  #define ILK_BSD_USER_INTERRUPT				(1<<5)
>  #define I915_PIPE_CONTROL_NOTIFY_INTERRUPT		(1<<18)
> @@ -4743,6 +4748,7 @@
>  
>  /* IVYBRIDGE DPF */
>  #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
> +#define HSW_L3CDERRST11			0xB208 /* L3CD Error Status register 1 slice 1 */
>  #define   GEN7_L3CDERRST1_ROW_MASK	(0x7ff<<14)
>  #define   GEN7_PARITY_ERROR_VALID	(1<<13)
>  #define   GEN7_L3CDERRST1_BANK_MASK	(3<<11)
> @@ -4756,6 +4762,7 @@
>  #define   GEN7_L3CDERRST1_ENABLE	(1<<7)
>  
>  #define GEN7_L3LOG_BASE			0xB070
> +#define HSW_L3LOG_BASE_SLICE1		0xB270
>  #define GEN7_L3LOG_SIZE			0x80
>  
>  #define GEN7_HALF_SLICE_CHICKEN1	0xe100 /* IVB GT1 + VLV */
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 71f6de2..3a8bf0c 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -119,6 +119,7 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
>  	struct drm_device *drm_dev = dminor->dev;
>  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
>  	uint32_t misccpctl;
> +	int slice = (int)(uintptr_t)attr->private;
>  	int i, ret;
>  
>  	count = round_down(count, 4);
> @@ -134,9 +135,9 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
>  		return ret;
>  
>  	if (IS_HASWELL(drm_dev)) {
> -		if (dev_priv->l3_parity.remap_info)
> +		if (dev_priv->l3_parity.remap_info[slice])
>  			memcpy(buf,
> -			       dev_priv->l3_parity.remap_info + (offset/4),
> +			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
>  			       count);
>  		else
>  			memset(buf, 0, count);
> @@ -168,6 +169,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	struct drm_device *drm_dev = dminor->dev;
>  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
>  	u32 *temp = NULL; /* Just here to make handling failures easy */
> +	int slice = (int)(uintptr_t)attr->private;
>  	int ret;
>  
>  	ret = l3_access_valid(drm_dev, offset);
> @@ -178,7 +180,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	if (ret)
>  		return ret;
>  
> -	if (!dev_priv->l3_parity.remap_info) {
> +	if (!dev_priv->l3_parity.remap_info[slice]) {
>  		temp = kzalloc(GEN7_L3LOG_SIZE, GFP_KERNEL);
>  		if (!temp) {
>  			mutex_unlock(&drm_dev->struct_mutex);
> @@ -198,11 +200,11 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	 * at this point it is left as a TODO.
>  	*/
>  	if (temp)
> -		dev_priv->l3_parity.remap_info = temp;
> +		dev_priv->l3_parity.remap_info[slice] = temp;
>  
> -	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
> +	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
>  
> -	i915_gem_l3_remap(drm_dev);
> +	i915_gem_l3_remap(drm_dev, slice);
>  
>  	mutex_unlock(&drm_dev->struct_mutex);
>  
> @@ -214,7 +216,17 @@ static struct bin_attribute dpf_attrs = {
>  	.size = GEN7_L3LOG_SIZE,
>  	.read = i915_l3_read,
>  	.write = i915_l3_write,
> -	.mmap = NULL
> +	.mmap = NULL,
> +	.private = (void *)0
> +};
> +
> +static struct bin_attribute dpf_attrs_1 = {
> +	.attr = {.name = "l3_parity_slice_1", .mode = (S_IRUSR | S_IWUSR)},
> +	.size = GEN7_L3LOG_SIZE,
> +	.read = i915_l3_read,
> +	.write = i915_l3_write,
> +	.mmap = NULL,
> +	.private = (void *)1
>  };
>  
>  static ssize_t gt_cur_freq_mhz_show(struct device *kdev,
> @@ -525,6 +537,13 @@ void i915_setup_sysfs(struct drm_device *dev)
>  		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
>  		if (ret)
>  			DRM_ERROR("l3 parity sysfs setup failed\n");
> +
> +		if (NUM_L3_SLICES(dev) > 1) {
> +			ret = device_create_bin_file(&dev->primary->kdev,
> +						     &dpf_attrs_1);
> +			if (ret)
> +				DRM_ERROR("l3 parity slice 1 setup failed\n");
> +		}
>  	}
>  
>  	ret = 0;
> @@ -548,6 +567,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
>  		sysfs_remove_files(&dev->primary->kdev.kobj, vlv_attrs);
>  	else
>  		sysfs_remove_files(&dev->primary->kdev.kobj, gen6_attrs);
> +	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs_1);
>  	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs);
>  #ifdef CONFIG_PM
>  	sysfs_unmerge_group(&dev->primary->kdev.kobj, &rc6_attr_group);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 686e5b2..958b7d8 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -570,7 +570,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>  
>  	if (HAS_L3_GPU_CACHE(dev))
> -		I915_WRITE_IMR(ring, ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> +		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  
>  	return ret;
>  }
> @@ -1000,7 +1000,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
>  		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
>  			I915_WRITE_IMR(ring,
>  				       ~(ring->irq_enable_mask |
> -					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
> +					 GT_PARITY_ERROR(dev)));
>  		else
>  			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
>  		ilk_enable_gt_irq(dev_priv, ring->irq_enable_mask);
> @@ -1020,8 +1020,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
>  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
>  	if (--ring->irq_refcount == 0) {
>  		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> -			I915_WRITE_IMR(ring,
> -				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> +			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  		else
>  			I915_WRITE_IMR(ring, ~0);
>  		ilk_disable_gt_irq(dev_priv, ring->irq_enable_mask);
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 55bb572..3a4e97b 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -38,10 +38,10 @@
>   *
>   * I915_L3_PARITY_UEVENT - Generated when the driver receives a parity mismatch
>   *	event from the gpu l3 cache. Additional information supplied is ROW,
> - *	BANK, SUBBANK of the affected cacheline. Userspace should keep track of
> - *	these events and if a specific cache-line seems to have a persistent
> - *	error remap it with the l3 remapping tool supplied in intel-gpu-tools.
> - *	The value supplied with the event is always 1.
> + *	BANK, SUBBANK, SLICE of the affected cacheline. Userspace should keep
> + *	track of these events and if a specific cache-line seems to have a
> + *	persistent error remap it with the l3 remapping tool supplied in
> + *	intel-gpu-tools.  The value supplied with the event is always 1.
>   *
>   * I915_ERROR_UEVENT - Generated upon error detection, currently only via
>   *	hangcheck. The error detection event is a good indicator of when things
> -- 
> 1.8.4

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/6] drm/i915: Do remaps for all contexts
  2013-09-18  4:12 ` [PATCH 5/6] drm/i915: Do remaps for " Ben Widawsky
@ 2013-09-18  7:48   ` Ville Syrjälä
  2013-09-19  1:14     ` Ben Widawsky
  2013-09-19  2:03     ` [PATCH] [v3] " Ben Widawsky
  0 siblings, 2 replies; 27+ messages in thread
From: Ville Syrjälä @ 2013-09-18  7:48 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Tue, Sep 17, 2013 at 09:12:46PM -0700, Ben Widawsky wrote:
> On both Ivybridge and Haswell, row remapping information is saved and
> restored with context. This means, we never actually properly supported
> the l3 remapping because our sysfs interface is asynchronous (and not
> tied to any context), and the known faulty HW would be reused by the
> next context to run.
> 
> Not that due to the asynchronous nature of the sysfs entry, there is no
> point modifying the registers for the existing context. Instead we set a
> flag for all contexts to load the correct remapping information on the
> next run. Interested clients can use debugfs to determine whether or not
> the row has been remapped.
> 
> One could propose at this point that we just do the remapping in the
> kernel. I guess since we have to maintain the sysfs interface anyway,
> I'm not sure how useful it is, and I do like keeping the policy in
> userspace; (it wasn't my original decision to make the
> interface the way it is, so I'm not attached).
> 
> v2: Force a context switch when we have a remap on the next switch.
> (Ville)
> Don't let userspace use the interface with disabled contexts.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++++++
>  drivers/gpu/drm/i915/i915_drv.h         |  1 +
>  drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++----
>  drivers/gpu/drm/i915/i915_sysfs.c       | 37 +++++++++++++--------------------
>  4 files changed, 41 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index ada0950..80bed69 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -145,6 +145,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  		seq_printf(m, " (%s)", obj->ring->name);
>  }
>  
> +static void describe_ctx(struct seq_file *m, struct i915_hw_context *ctx)
> +{
> +	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
> +	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
> +	seq_putc(m, ' ');
> +}
> +
>  static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  {
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> @@ -1463,6 +1470,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
>  
>  	list_for_each_entry(ctx, &dev_priv->context_list, link) {
>  		seq_puts(m, "HW context ");
> +		describe_ctx(m, ctx);
>  		for_each_ring(ring, dev_priv, i)
>  			if (ring->default_context == ctx)
>  				seq_printf(m, "(default context %s) ", ring->name);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1795927..015df52 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -601,6 +601,7 @@ struct i915_hw_context {
>  	struct kref ref;
>  	int id;
>  	bool is_initialized;
> +	uint8_t remap_slice;
>  	struct drm_i915_file_private *file_priv;
>  	struct intel_ring_buffer *ring;
>  	struct drm_i915_gem_object *obj;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2bbdce8..7e138cc 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -140,7 +140,7 @@ create_hw_context(struct drm_device *dev,
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_hw_context *ctx;
> -	int ret;
> +	int ret, i;
>  
>  	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
>  	if (ctx == NULL)
> @@ -181,6 +181,8 @@ create_hw_context(struct drm_device *dev,
>  
>  	ctx->file_priv = file_priv;
>  	ctx->id = ret;
> +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> +		ctx->remap_slice |= 1 << 1;
                                         ^

Still broken.

>  
>  	return ctx;
>  
> @@ -396,11 +398,11 @@ static int do_switch(struct i915_hw_context *to)
>  	struct intel_ring_buffer *ring = to->ring;
>  	struct i915_hw_context *from = ring->last_context;
>  	u32 hw_flags = 0;
> -	int ret;
> +	int ret, i;
>  
>  	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
>  
> -	if (from == to)
> +	if (from == to && !to->remap_slice)
>  		return 0;
>  
>  	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> @@ -423,7 +425,7 @@ static int do_switch(struct i915_hw_context *to)
>  
>  	if (!to->is_initialized || is_default_context(to))
>  		hw_flags |= MI_RESTORE_INHIBIT;
> -	else if (WARN_ON_ONCE(from == to)) /* not yet expected */
> +	else if (from == to)
>  		hw_flags |= MI_FORCE_RESTORE;

Hmm. Why do we need to restore actually? Could just leave MI_SET_CONTEXT
to be effectively a nop I think.

>  
>  	ret = mi_set_context(ring, to, hw_flags);
> @@ -432,6 +434,18 @@ static int do_switch(struct i915_hw_context *to)
>  		return ret;
>  	}
>  
> +	for (i = 0; i < MAX_L3_SLICES; i++) {
> +		if (!(to->remap_slice & (1<<i)))
> +			continue;
> +
> +		ret = i915_gem_l3_remap(ring, i);
> +		if (!ret) {
> +			to->remap_slice &= ~(1<<i);
> +			/* If it failed, try again next round */
> +			DRM_DEBUG_DRIVER("L3 remapping failed\n");
> +		}

The debug message is on the wrong branch.

> +	}
> +
>  	/* The backing object for the context is done after switching to the
>  	 * *next* context. Therefore we cannot retire the previous context until
>  	 * the next context has already started running. In fact, the below code
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index b07bdfb..deb8787 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -118,9 +118,8 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
>  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
>  	struct drm_device *drm_dev = dminor->dev;
>  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> -	uint32_t misccpctl;
>  	int slice = (int)(uintptr_t)attr->private;
> -	int i, ret;
> +	int ret;
>  
>  	count = round_down(count, 4);
>  
> @@ -134,26 +133,13 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
>  	if (ret)
>  		return ret;
>  
> -	if (IS_HASWELL(drm_dev)) {
> -		if (dev_priv->l3_parity.remap_info[slice])
> -			memcpy(buf,
> -			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> -			       count);
> -		else
> -			memset(buf, 0, count);
> -
> -		goto out;
> -	}
> -
> -	misccpctl = I915_READ(GEN7_MISCCPCTL);
> -	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> -
> -	for (i = 0; i < count; i += 4)
> -		*((uint32_t *)(&buf[i])) = I915_READ(GEN7_L3LOG_BASE + offset + i);
> -
> -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> +	if (dev_priv->l3_parity.remap_info[slice])
> +		memcpy(buf,
> +		       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> +		       count);
> +	else
> +		memset(buf, 0, count);
>  
> -out:
>  	mutex_unlock(&drm_dev->struct_mutex);
>  
>  	return count;
> @@ -168,6 +154,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
>  	struct drm_device *drm_dev = dminor->dev;
>  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> +	struct i915_hw_context *ctx;
>  	u32 *temp = NULL; /* Just here to make handling failures easy */
>  	int slice = (int)(uintptr_t)attr->private;
>  	int ret;
> @@ -176,6 +163,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	if (ret)
>  		return ret;
>  
> +	if (dev_priv->hw_contexts_disabled)
> +		return -ENXIO;
> +
>  	ret = i915_mutex_lock_interruptible(drm_dev);
>  	if (ret)
>  		return ret;
> @@ -204,8 +194,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  
>  	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
>  
> -	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
> -		count = 0;
> +	/* NB: We defer the remapping until we switch to the context */
> +	list_for_each_entry(ctx, &dev_priv->context_list, link)
> +		ctx->remap_slice |= (1<<slice);
>  
>  	mutex_unlock(&drm_dev->struct_mutex);
>  
> -- 
> 1.8.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-18  4:12 ` [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF Ben Widawsky
@ 2013-09-18  7:50   ` Ville Syrjälä
  2013-09-19 17:47     ` [PATCH] [v2] " Ben Widawsky
  2013-09-19 18:01     ` Ben Widawsky
  0 siblings, 2 replies; 27+ messages in thread
From: Ville Syrjälä @ 2013-09-18  7:50 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Tue, Sep 17, 2013 at 09:12:47PM -0700, Ben Widawsky wrote:
> We'd only ever used this define to denote whether or not we have the
> dynamic parity feature (DPF) and never to determine whether or not L3
> exists. Baytrail is a good example of where L3 exists, and not DPF.
> 
> This patch provides clarify in the code for future use cases which might
> want to actually query whether or not L3 exists.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 4 ++--
>  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
>  drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
>  drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
>  5 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 015df52..dd2753e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1691,8 +1691,8 @@ struct drm_i915_file_private {
>  
>  #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
>  
> -#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> -#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
> +#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> +#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))

Maybe add a comment saying /* DPF == dynamic parity feature */

Otherwise:
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>  
>  #define GT_FREQUENCY_MULTIPLIER 50
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 18d07d7..7859f91 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
>  	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
>  	int i, ret;
>  
> -	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
> +	if (!HAS_L3_DPF(dev) || !remap_info)
>  		return 0;
>  
>  	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b11ee39..0968c98 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -959,7 +959,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
>  {
>  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>  
> -	if (!HAS_L3_GPU_CACHE(dev))
> +	if (!HAS_L3_DPF(dev))
>  		return;
>  
>  	spin_lock(&dev_priv->irq_lock);
> @@ -2291,7 +2291,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
>  	pm_irqs = gt_irqs = 0;
>  
>  	dev_priv->gt_irq_mask = ~0;
> -	if (HAS_L3_GPU_CACHE(dev)) {
> +	if (HAS_L3_DPF(dev)) {
>  		/* L3 parity interrupt is always unmasked. */
>  		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
>  		gt_irqs |= GT_PARITY_ERROR(dev);
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index deb8787..7b4c79c 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
>  
>  static int l3_access_valid(struct drm_device *dev, loff_t offset)
>  {
> -	if (!HAS_L3_GPU_CACHE(dev))
> +	if (!HAS_L3_DPF(dev))
>  		return -EPERM;
>  
>  	if (offset % 4 != 0)
> @@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
>  			DRM_ERROR("RC6 residency sysfs setup failed\n");
>  	}
>  #endif
> -	if (HAS_L3_GPU_CACHE(dev)) {
> +	if (HAS_L3_DPF(dev)) {
>  		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
>  		if (ret)
>  			DRM_ERROR("l3 parity sysfs setup failed\n");
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 958b7d8..b67104a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	if (INTEL_INFO(dev)->gen >= 6)
>  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>  
> -	if (HAS_L3_GPU_CACHE(dev))
> +	if (HAS_L3_DPF(dev))
>  		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  
>  	return ret;
> @@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
>  
>  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
>  	if (ring->irq_refcount++ == 0) {
> -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> +		if (HAS_L3_DPF(dev) && ring->id == RCS)
>  			I915_WRITE_IMR(ring,
>  				       ~(ring->irq_enable_mask |
>  					 GT_PARITY_ERROR(dev)));
> @@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
>  
>  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
>  	if (--ring->irq_refcount == 0) {
> -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> +		if (HAS_L3_DPF(dev) && ring->id == RCS)
>  			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  		else
>  			I915_WRITE_IMR(ring, ~0);
> -- 
> 1.8.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/6] drm/i915: Add second slice l3 remapping
  2013-09-18  7:36   ` Ville Syrjälä
@ 2013-09-18 16:22     ` Ben Widawsky
  2013-09-19 18:13     ` [PATCH] [v3] " Ben Widawsky
  1 sibling, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-18 16:22 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Wed, Sep 18, 2013 at 10:36:28AM +0300, Ville Syrjälä wrote:
> On Tue, Sep 17, 2013 at 09:12:43PM -0700, Ben Widawsky wrote:
> > Certain HSW SKUs have a second bank of L3. This L3 remapping has a
> > separate register set, and interrupt from the first "slice". A slice is
> > simply a term to define some subset of the GPU's l3 cache. This patch
> > implements both the interrupt handler, and ability to communicate with
> > userspace about this second slice.
> > 
> > v2:  Remove redundant check about non-existent slice.
> > Change warning about interrupts of unknown slices to WARN_ON_ONCE
> > Handle the case where we get 2 slice interrupts concurrently, and switch
> > the tracking of interrupts to be non-destructive (all Ville)
> > Don't enable/mask the second slice parity interrupt for ivb/vlv (even
> > though all docs I can find claim it's rsvd) (Ville + Bryan)
> > Keep BYT excluded from L3 parity
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         |  7 ++-
> >  drivers/gpu/drm/i915/i915_gem.c         | 26 +++++-----
> >  drivers/gpu/drm/i915/i915_irq.c         | 88 +++++++++++++++++++++------------
> >  drivers/gpu/drm/i915/i915_reg.h         |  7 +++
> >  drivers/gpu/drm/i915/i915_sysfs.c       | 34 ++++++++++---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c |  7 ++-
> >  include/uapi/drm/i915_drm.h             |  8 +--
> >  7 files changed, 116 insertions(+), 61 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 8b16d47..c6e8df7 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -917,9 +917,11 @@ struct i915_ums_state {
> >  	int mm_suspended;
> >  };
> >  
> > +#define MAX_L3_SLICES 2
> >  struct intel_l3_parity {
> > -	u32 *remap_info;
> > +	u32 *remap_info[MAX_L3_SLICES];
> >  	struct work_struct error_work;
> > +	int which_slice;
> >  };
> >  
> >  struct i915_gem_mm {
> > @@ -1686,6 +1688,7 @@ struct drm_i915_file_private {
> >  #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
> >  
> >  #define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> > +#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
> >  
> >  #define GT_FREQUENCY_MULTIPLIER 50
> >  
> > @@ -1946,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
> >  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> >  int __must_check i915_gem_init(struct drm_device *dev);
> >  int __must_check i915_gem_init_hw(struct drm_device *dev);
> > -void i915_gem_l3_remap(struct drm_device *dev);
> > +void i915_gem_l3_remap(struct drm_device *dev, int slice);
> >  void i915_gem_init_swizzling(struct drm_device *dev);
> >  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> >  int __must_check i915_gpu_idle(struct drm_device *dev);
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 3d3de6e..66bf75d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4252,16 +4252,15 @@ i915_gem_idle(struct drm_device *dev)
> >  	return 0;
> >  }
> >  
> > -void i915_gem_l3_remap(struct drm_device *dev)
> > +void i915_gem_l3_remap(struct drm_device *dev, int slice)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
> > +	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
> >  	u32 misccpctl;
> >  	int i;
> >  
> > -	if (!HAS_L3_GPU_CACHE(dev))
> > -		return;
> > -
> > -	if (!dev_priv->l3_parity.remap_info)
> > +	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
> >  		return;
> >  
> >  	misccpctl = I915_READ(GEN7_MISCCPCTL);
> > @@ -4269,17 +4268,17 @@ void i915_gem_l3_remap(struct drm_device *dev)
> >  	POSTING_READ(GEN7_MISCCPCTL);
> >  
> >  	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
> > -		u32 remap = I915_READ(GEN7_L3LOG_BASE + i);
> > -		if (remap && remap != dev_priv->l3_parity.remap_info[i/4])
> > +		u32 remap = I915_READ(reg_base + i);
> > +		if (remap && remap != remap_info[i/4])
> >  			DRM_DEBUG("0x%x was already programmed to %x\n",
> > -				  GEN7_L3LOG_BASE + i, remap);
> > -		if (remap && !dev_priv->l3_parity.remap_info[i/4])
> > +				  reg_base + i, remap);
> > +		if (remap && !remap_info[i/4])
> >  			DRM_DEBUG_DRIVER("Clearing remapped register\n");
> > -		I915_WRITE(GEN7_L3LOG_BASE + i, dev_priv->l3_parity.remap_info[i/4]);
> > +		I915_WRITE(reg_base + i, remap_info[i/4]);
> >  	}
> >  
> >  	/* Make sure all the writes land before disabling dop clock gating */
> > -	POSTING_READ(GEN7_L3LOG_BASE);
> > +	POSTING_READ(reg_base);
> >  
> >  	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> >  }
> > @@ -4373,7 +4372,7 @@ int
> >  i915_gem_init_hw(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	int ret;
> > +	int ret, i;
> >  
> >  	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> >  		return -EIO;
> > @@ -4392,7 +4391,8 @@ i915_gem_init_hw(struct drm_device *dev)
> >  		I915_WRITE(GEN7_MSG_CTL, temp);
> >  	}
> >  
> > -	i915_gem_l3_remap(dev);
> > +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> > +		i915_gem_l3_remap(dev, i);
> >  
> >  	i915_gem_init_swizzling(dev);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index a42f30b..b11ee39 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -888,9 +888,10 @@ static void ivybridge_parity_work(struct work_struct *work)
> >  	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
> >  						    l3_parity.error_work);
> >  	u32 error_status, row, bank, subbank;
> > -	char *parity_event[5];
> > +	char *parity_event[6];
> >  	uint32_t misccpctl;
> >  	unsigned long flags;
> > +	uint8_t slice = 0;
> >  
> >  	/* We must turn off DOP level clock gating to access the L3 registers.
> >  	 * In order to prevent a get/put style interface, acquire struct mutex
> > @@ -898,45 +899,63 @@ static void ivybridge_parity_work(struct work_struct *work)
> >  	 */
> >  	mutex_lock(&dev_priv->dev->struct_mutex);
> >  
> > +	/* If we've screwed up tracking, just let the interrupt fire again */
> > +	if (WARN_ON(!dev_priv->l3_parity.which_slice))
> > +		goto out;
> > +
> >  	misccpctl = I915_READ(GEN7_MISCCPCTL);
> >  	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> >  	POSTING_READ(GEN7_MISCCPCTL);
> >  
> > -	error_status = I915_READ(GEN7_L3CDERRST1);
> > -	row = GEN7_PARITY_ERROR_ROW(error_status);
> > -	bank = GEN7_PARITY_ERROR_BANK(error_status);
> > -	subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
> > +	while ((slice = ffs(dev_priv->l3_parity.which_slice)) != 0) {
> > +		u32 reg;
> 
> ffs(1) == 1, so we need a slice-- here, don't we?
> 
> The rest looks OK to me, so once this one thing is fixed you can slap on a
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> sticker.
> 

That's strange. I noticed this earlier (and don't remember actually
fixing it). I wonder why my tests didn't fail horribly. Scary.

> >  
> > -	I915_WRITE(GEN7_L3CDERRST1, GEN7_PARITY_ERROR_VALID |
> > -				    GEN7_L3CDERRST1_ENABLE);
> > -	POSTING_READ(GEN7_L3CDERRST1);
> > +		if (WARN_ON_ONCE(slice >= NUM_L3_SLICES(dev_priv->dev)))
> > +			break;
> >  
> > -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> > +		dev_priv->l3_parity.which_slice &= ~(1<<slice);
> >  
> > -	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> > -	ilk_enable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> > -	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> > +		reg = GEN7_L3CDERRST1 + (slice * 0x200);
> >  
> > -	mutex_unlock(&dev_priv->dev->struct_mutex);
> > +		error_status = I915_READ(reg);
> > +		row = GEN7_PARITY_ERROR_ROW(error_status);
> > +		bank = GEN7_PARITY_ERROR_BANK(error_status);
> > +		subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
> > +
> > +		I915_WRITE(reg, GEN7_PARITY_ERROR_VALID | GEN7_L3CDERRST1_ENABLE);
> > +		POSTING_READ(reg);
> > +
> > +		parity_event[0] = I915_L3_PARITY_UEVENT "=1";
> > +		parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
> > +		parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
> > +		parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
> > +		parity_event[4] = kasprintf(GFP_KERNEL, "SLICE=%d", slice);
> > +		parity_event[5] = NULL;
> > +
> > +		kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
> > +				   KOBJ_CHANGE, parity_event);
> >  
> > -	parity_event[0] = I915_L3_PARITY_UEVENT "=1";
> > -	parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
> > -	parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
> > -	parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
> > -	parity_event[4] = NULL;
> > +		DRM_DEBUG("Parity error: Slice = %d, Row = %d, Bank = %d, Sub bank = %d.\n",
> > +			  slice, row, bank, subbank);
> >  
> > -	kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
> > -			   KOBJ_CHANGE, parity_event);
> > +		kfree(parity_event[4]);
> > +		kfree(parity_event[3]);
> > +		kfree(parity_event[2]);
> > +		kfree(parity_event[1]);
> > +	}
> >  
> > -	DRM_DEBUG("Parity error: Row = %d, Bank = %d, Sub bank = %d.\n",
> > -		  row, bank, subbank);
> > +	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> >  
> > -	kfree(parity_event[3]);
> > -	kfree(parity_event[2]);
> > -	kfree(parity_event[1]);
> > +out:
> > +	WARN_ON(dev_priv->l3_parity.which_slice);
> > +	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> > +	ilk_enable_gt_irq(dev_priv, GT_PARITY_ERROR(dev_priv->dev));
> > +	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> > +
> > +	mutex_unlock(&dev_priv->dev->struct_mutex);
> >  }
> >  
> > -static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
> > +static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
> >  {
> >  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
> >  
> > @@ -944,9 +963,16 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
> >  		return;
> >  
> >  	spin_lock(&dev_priv->irq_lock);
> > -	ilk_disable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> > +	ilk_disable_gt_irq(dev_priv, GT_PARITY_ERROR(dev));
> >  	spin_unlock(&dev_priv->irq_lock);
> >  
> > +	iir &= GT_PARITY_ERROR(dev);
> > +	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1)
> > +		dev_priv->l3_parity.which_slice |= 1 << 1;
> > +
> > +	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
> > +		dev_priv->l3_parity.which_slice |= 1 << 0;
> > +
> >  	queue_work(dev_priv->wq, &dev_priv->l3_parity.error_work);
> >  }
> >  
> > @@ -981,8 +1007,8 @@ static void snb_gt_irq_handler(struct drm_device *dev,
> >  		i915_handle_error(dev, false);
> >  	}
> >  
> > -	if (gt_iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
> > -		ivybridge_parity_error_irq_handler(dev);
> > +	if (gt_iir & GT_PARITY_ERROR(dev))
> > +		ivybridge_parity_error_irq_handler(dev, gt_iir);
> >  }
> >  
> >  #define HPD_STORM_DETECT_PERIOD 1000
> > @@ -2267,8 +2293,8 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
> >  	dev_priv->gt_irq_mask = ~0;
> >  	if (HAS_L3_GPU_CACHE(dev)) {
> >  		/* L3 parity interrupt is always unmasked. */
> > -		dev_priv->gt_irq_mask = ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> > -		gt_irqs |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> > +		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
> > +		gt_irqs |= GT_PARITY_ERROR(dev);
> >  	}
> >  
> >  	gt_irqs |= GT_RENDER_USER_INTERRUPT;
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 384adfb..c94946f 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -927,6 +927,7 @@
> >  #define GT_BLT_USER_INTERRUPT			(1 << 22)
> >  #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
> >  #define GT_BSD_USER_INTERRUPT			(1 << 12)
> > +#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
> >  #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
> >  #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
> >  #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
> > @@ -937,6 +938,10 @@
> >  #define PM_VEBOX_CS_ERROR_INTERRUPT		(1 << 12) /* hsw+ */
> >  #define PM_VEBOX_USER_INTERRUPT			(1 << 10) /* hsw+ */
> >  
> > +#define GT_PARITY_ERROR(dev) \
> > +	(GT_RENDER_L3_PARITY_ERROR_INTERRUPT | \
> > +	 IS_HASWELL(dev) ? GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 : 0)
> > +
> >  /* These are all the "old" interrupts */
> >  #define ILK_BSD_USER_INTERRUPT				(1<<5)
> >  #define I915_PIPE_CONTROL_NOTIFY_INTERRUPT		(1<<18)
> > @@ -4743,6 +4748,7 @@
> >  
> >  /* IVYBRIDGE DPF */
> >  #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
> > +#define HSW_L3CDERRST11			0xB208 /* L3CD Error Status register 1 slice 1 */
> >  #define   GEN7_L3CDERRST1_ROW_MASK	(0x7ff<<14)
> >  #define   GEN7_PARITY_ERROR_VALID	(1<<13)
> >  #define   GEN7_L3CDERRST1_BANK_MASK	(3<<11)
> > @@ -4756,6 +4762,7 @@
> >  #define   GEN7_L3CDERRST1_ENABLE	(1<<7)
> >  
> >  #define GEN7_L3LOG_BASE			0xB070
> > +#define HSW_L3LOG_BASE_SLICE1		0xB270
> >  #define GEN7_L3LOG_SIZE			0x80
> >  
> >  #define GEN7_HALF_SLICE_CHICKEN1	0xe100 /* IVB GT1 + VLV */
> > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > index 71f6de2..3a8bf0c 100644
> > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > @@ -119,6 +119,7 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> >  	struct drm_device *drm_dev = dminor->dev;
> >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> >  	uint32_t misccpctl;
> > +	int slice = (int)(uintptr_t)attr->private;
> >  	int i, ret;
> >  
> >  	count = round_down(count, 4);
> > @@ -134,9 +135,9 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> >  		return ret;
> >  
> >  	if (IS_HASWELL(drm_dev)) {
> > -		if (dev_priv->l3_parity.remap_info)
> > +		if (dev_priv->l3_parity.remap_info[slice])
> >  			memcpy(buf,
> > -			       dev_priv->l3_parity.remap_info + (offset/4),
> > +			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> >  			       count);
> >  		else
> >  			memset(buf, 0, count);
> > @@ -168,6 +169,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	struct drm_device *drm_dev = dminor->dev;
> >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> >  	u32 *temp = NULL; /* Just here to make handling failures easy */
> > +	int slice = (int)(uintptr_t)attr->private;
> >  	int ret;
> >  
> >  	ret = l3_access_valid(drm_dev, offset);
> > @@ -178,7 +180,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	if (ret)
> >  		return ret;
> >  
> > -	if (!dev_priv->l3_parity.remap_info) {
> > +	if (!dev_priv->l3_parity.remap_info[slice]) {
> >  		temp = kzalloc(GEN7_L3LOG_SIZE, GFP_KERNEL);
> >  		if (!temp) {
> >  			mutex_unlock(&drm_dev->struct_mutex);
> > @@ -198,11 +200,11 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	 * at this point it is left as a TODO.
> >  	*/
> >  	if (temp)
> > -		dev_priv->l3_parity.remap_info = temp;
> > +		dev_priv->l3_parity.remap_info[slice] = temp;
> >  
> > -	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
> > +	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
> >  
> > -	i915_gem_l3_remap(drm_dev);
> > +	i915_gem_l3_remap(drm_dev, slice);
> >  
> >  	mutex_unlock(&drm_dev->struct_mutex);
> >  
> > @@ -214,7 +216,17 @@ static struct bin_attribute dpf_attrs = {
> >  	.size = GEN7_L3LOG_SIZE,
> >  	.read = i915_l3_read,
> >  	.write = i915_l3_write,
> > -	.mmap = NULL
> > +	.mmap = NULL,
> > +	.private = (void *)0
> > +};
> > +
> > +static struct bin_attribute dpf_attrs_1 = {
> > +	.attr = {.name = "l3_parity_slice_1", .mode = (S_IRUSR | S_IWUSR)},
> > +	.size = GEN7_L3LOG_SIZE,
> > +	.read = i915_l3_read,
> > +	.write = i915_l3_write,
> > +	.mmap = NULL,
> > +	.private = (void *)1
> >  };
> >  
> >  static ssize_t gt_cur_freq_mhz_show(struct device *kdev,
> > @@ -525,6 +537,13 @@ void i915_setup_sysfs(struct drm_device *dev)
> >  		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
> >  		if (ret)
> >  			DRM_ERROR("l3 parity sysfs setup failed\n");
> > +
> > +		if (NUM_L3_SLICES(dev) > 1) {
> > +			ret = device_create_bin_file(&dev->primary->kdev,
> > +						     &dpf_attrs_1);
> > +			if (ret)
> > +				DRM_ERROR("l3 parity slice 1 setup failed\n");
> > +		}
> >  	}
> >  
> >  	ret = 0;
> > @@ -548,6 +567,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
> >  		sysfs_remove_files(&dev->primary->kdev.kobj, vlv_attrs);
> >  	else
> >  		sysfs_remove_files(&dev->primary->kdev.kobj, gen6_attrs);
> > +	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs_1);
> >  	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs);
> >  #ifdef CONFIG_PM
> >  	sysfs_unmerge_group(&dev->primary->kdev.kobj, &rc6_attr_group);
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 686e5b2..958b7d8 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -570,7 +570,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
> >  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
> >  
> >  	if (HAS_L3_GPU_CACHE(dev))
> > -		I915_WRITE_IMR(ring, ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> > +		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> >  
> >  	return ret;
> >  }
> > @@ -1000,7 +1000,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
> >  		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> >  			I915_WRITE_IMR(ring,
> >  				       ~(ring->irq_enable_mask |
> > -					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
> > +					 GT_PARITY_ERROR(dev)));
> >  		else
> >  			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
> >  		ilk_enable_gt_irq(dev_priv, ring->irq_enable_mask);
> > @@ -1020,8 +1020,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
> >  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> >  	if (--ring->irq_refcount == 0) {
> >  		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> > -			I915_WRITE_IMR(ring,
> > -				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> > +			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> >  		else
> >  			I915_WRITE_IMR(ring, ~0);
> >  		ilk_disable_gt_irq(dev_priv, ring->irq_enable_mask);
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 55bb572..3a4e97b 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -38,10 +38,10 @@
> >   *
> >   * I915_L3_PARITY_UEVENT - Generated when the driver receives a parity mismatch
> >   *	event from the gpu l3 cache. Additional information supplied is ROW,
> > - *	BANK, SUBBANK of the affected cacheline. Userspace should keep track of
> > - *	these events and if a specific cache-line seems to have a persistent
> > - *	error remap it with the l3 remapping tool supplied in intel-gpu-tools.
> > - *	The value supplied with the event is always 1.
> > + *	BANK, SUBBANK, SLICE of the affected cacheline. Userspace should keep
> > + *	track of these events and if a specific cache-line seems to have a
> > + *	persistent error remap it with the l3 remapping tool supplied in
> > + *	intel-gpu-tools.  The value supplied with the event is always 1.
> >   *
> >   * I915_ERROR_UEVENT - Generated upon error detection, currently only via
> >   *	hangcheck. The error detection event is a good indicator of when things
> > -- 
> > 1.8.4
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/6] drm/i915: Do remaps for all contexts
  2013-09-18  7:48   ` Ville Syrjälä
@ 2013-09-19  1:14     ` Ben Widawsky
  2013-09-19  1:17       ` Ben Widawsky
  2013-09-19  2:03     ` [PATCH] [v3] " Ben Widawsky
  1 sibling, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19  1:14 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Wed, Sep 18, 2013 at 10:48:49AM +0300, Ville Syrjälä wrote:
> On Tue, Sep 17, 2013 at 09:12:46PM -0700, Ben Widawsky wrote:
> > On both Ivybridge and Haswell, row remapping information is saved and
> > restored with context. This means, we never actually properly supported
> > the l3 remapping because our sysfs interface is asynchronous (and not
> > tied to any context), and the known faulty HW would be reused by the
> > next context to run.
> > 
> > Not that due to the asynchronous nature of the sysfs entry, there is no
> > point modifying the registers for the existing context. Instead we set a
> > flag for all contexts to load the correct remapping information on the
> > next run. Interested clients can use debugfs to determine whether or not
> > the row has been remapped.
> > 
> > One could propose at this point that we just do the remapping in the
> > kernel. I guess since we have to maintain the sysfs interface anyway,
> > I'm not sure how useful it is, and I do like keeping the policy in
> > userspace; (it wasn't my original decision to make the
> > interface the way it is, so I'm not attached).
> > 
> > v2: Force a context switch when we have a remap on the next switch.
> > (Ville)
> > Don't let userspace use the interface with disabled contexts.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++++++
> >  drivers/gpu/drm/i915/i915_drv.h         |  1 +
> >  drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++----
> >  drivers/gpu/drm/i915/i915_sysfs.c       | 37 +++++++++++++--------------------
> >  4 files changed, 41 insertions(+), 27 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index ada0950..80bed69 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -145,6 +145,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >  		seq_printf(m, " (%s)", obj->ring->name);
> >  }
> >  
> > +static void describe_ctx(struct seq_file *m, struct i915_hw_context *ctx)
> > +{
> > +	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
> > +	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
> > +	seq_putc(m, ' ');
> > +}
> > +
> >  static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  {
> >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> > @@ -1463,6 +1470,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
> >  
> >  	list_for_each_entry(ctx, &dev_priv->context_list, link) {
> >  		seq_puts(m, "HW context ");
> > +		describe_ctx(m, ctx);
> >  		for_each_ring(ring, dev_priv, i)
> >  			if (ring->default_context == ctx)
> >  				seq_printf(m, "(default context %s) ", ring->name);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 1795927..015df52 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -601,6 +601,7 @@ struct i915_hw_context {
> >  	struct kref ref;
> >  	int id;
> >  	bool is_initialized;
> > +	uint8_t remap_slice;
> >  	struct drm_i915_file_private *file_priv;
> >  	struct intel_ring_buffer *ring;
> >  	struct drm_i915_gem_object *obj;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2bbdce8..7e138cc 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -140,7 +140,7 @@ create_hw_context(struct drm_device *dev,
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct i915_hw_context *ctx;
> > -	int ret;
> > +	int ret, i;
> >  
> >  	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> >  	if (ctx == NULL)
> > @@ -181,6 +181,8 @@ create_hw_context(struct drm_device *dev,
> >  
> >  	ctx->file_priv = file_priv;
> >  	ctx->id = ret;
> > +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> > +		ctx->remap_slice |= 1 << 1;
>                                          ^
> 
> Still broken.

Or maybe I wanted it to always be the second slice!

> 
> >  
> >  	return ctx;
> >  
> > @@ -396,11 +398,11 @@ static int do_switch(struct i915_hw_context *to)
> >  	struct intel_ring_buffer *ring = to->ring;
> >  	struct i915_hw_context *from = ring->last_context;
> >  	u32 hw_flags = 0;
> > -	int ret;
> > +	int ret, i;
> >  
> >  	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
> >  
> > -	if (from == to)
> > +	if (from == to && !to->remap_slice)
> >  		return 0;
> >  
> >  	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > @@ -423,7 +425,7 @@ static int do_switch(struct i915_hw_context *to)
> >  
> >  	if (!to->is_initialized || is_default_context(to))
> >  		hw_flags |= MI_RESTORE_INHIBIT;
> > -	else if (WARN_ON_ONCE(from == to)) /* not yet expected */
> > +	else if (from == to)
> >  		hw_flags |= MI_FORCE_RESTORE;
> 
> Hmm. Why do we need to restore actually? Could just leave MI_SET_CONTEXT
> to be effectively a nop I think.

I don't think so. You need it to restore the register, which I think it
won't do unless you issue a force restore, I am just guessing.

> 
> >  
> >  	ret = mi_set_context(ring, to, hw_flags);
> > @@ -432,6 +434,18 @@ static int do_switch(struct i915_hw_context *to)
> >  		return ret;
> >  	}
> >  
> > +	for (i = 0; i < MAX_L3_SLICES; i++) {
> > +		if (!(to->remap_slice & (1<<i)))
> > +			continue;
> > +
> > +		ret = i915_gem_l3_remap(ring, i);
> > +		if (!ret) {
> > +			to->remap_slice &= ~(1<<i);
> > +			/* If it failed, try again next round */
> > +			DRM_DEBUG_DRIVER("L3 remapping failed\n");
> > +		}
> 
> The debug message is on the wrong branch.
> 

And maybe I wanted the user to THINK it failed when it didn't; so
they'll be pleasantly surprised.

> > +	}
> > +
> >  	/* The backing object for the context is done after switching to the
> >  	 * *next* context. Therefore we cannot retire the previous context until
> >  	 * the next context has already started running. In fact, the below code
> > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > index b07bdfb..deb8787 100644
> > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > @@ -118,9 +118,8 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> >  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
> >  	struct drm_device *drm_dev = dminor->dev;
> >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> > -	uint32_t misccpctl;
> >  	int slice = (int)(uintptr_t)attr->private;
> > -	int i, ret;
> > +	int ret;
> >  
> >  	count = round_down(count, 4);
> >  
> > @@ -134,26 +133,13 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> >  	if (ret)
> >  		return ret;
> >  
> > -	if (IS_HASWELL(drm_dev)) {
> > -		if (dev_priv->l3_parity.remap_info[slice])
> > -			memcpy(buf,
> > -			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> > -			       count);
> > -		else
> > -			memset(buf, 0, count);
> > -
> > -		goto out;
> > -	}
> > -
> > -	misccpctl = I915_READ(GEN7_MISCCPCTL);
> > -	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> > -
> > -	for (i = 0; i < count; i += 4)
> > -		*((uint32_t *)(&buf[i])) = I915_READ(GEN7_L3LOG_BASE + offset + i);
> > -
> > -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> > +	if (dev_priv->l3_parity.remap_info[slice])
> > +		memcpy(buf,
> > +		       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> > +		       count);
> > +	else
> > +		memset(buf, 0, count);
> >  
> > -out:
> >  	mutex_unlock(&drm_dev->struct_mutex);
> >  
> >  	return count;
> > @@ -168,6 +154,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
> >  	struct drm_device *drm_dev = dminor->dev;
> >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> > +	struct i915_hw_context *ctx;
> >  	u32 *temp = NULL; /* Just here to make handling failures easy */
> >  	int slice = (int)(uintptr_t)attr->private;
> >  	int ret;
> > @@ -176,6 +163,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	if (ret)
> >  		return ret;
> >  
> > +	if (dev_priv->hw_contexts_disabled)
> > +		return -ENXIO;
> > +
> >  	ret = i915_mutex_lock_interruptible(drm_dev);
> >  	if (ret)
> >  		return ret;
> > @@ -204,8 +194,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  
> >  	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
> >  
> > -	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
> > -		count = 0;
> > +	/* NB: We defer the remapping until we switch to the context */
> > +	list_for_each_entry(ctx, &dev_priv->context_list, link)
> > +		ctx->remap_slice |= (1<<slice);
> >  
> >  	mutex_unlock(&drm_dev->struct_mutex);
> >  
> > -- 
> > 1.8.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/6] drm/i915: Do remaps for all contexts
  2013-09-19  1:14     ` Ben Widawsky
@ 2013-09-19  1:17       ` Ben Widawsky
  0 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19  1:17 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Bryan Bell

On Wed, Sep 18, 2013 at 06:14:42PM -0700, Ben Widawsky wrote:
> On Wed, Sep 18, 2013 at 10:48:49AM +0300, Ville Syrjälä wrote:
> > On Tue, Sep 17, 2013 at 09:12:46PM -0700, Ben Widawsky wrote:
> > > On both Ivybridge and Haswell, row remapping information is saved and
> > > restored with context. This means, we never actually properly supported
> > > the l3 remapping because our sysfs interface is asynchronous (and not
> > > tied to any context), and the known faulty HW would be reused by the
> > > next context to run.
> > > 
> > > Not that due to the asynchronous nature of the sysfs entry, there is no
> > > point modifying the registers for the existing context. Instead we set a
> > > flag for all contexts to load the correct remapping information on the
> > > next run. Interested clients can use debugfs to determine whether or not
> > > the row has been remapped.
> > > 
> > > One could propose at this point that we just do the remapping in the
> > > kernel. I guess since we have to maintain the sysfs interface anyway,
> > > I'm not sure how useful it is, and I do like keeping the policy in
> > > userspace; (it wasn't my original decision to make the
> > > interface the way it is, so I'm not attached).
> > > 
> > > v2: Force a context switch when we have a remap on the next switch.
> > > (Ville)
> > > Don't let userspace use the interface with disabled contexts.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++++++
> > >  drivers/gpu/drm/i915/i915_drv.h         |  1 +
> > >  drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++----
> > >  drivers/gpu/drm/i915/i915_sysfs.c       | 37 +++++++++++++--------------------
> > >  4 files changed, 41 insertions(+), 27 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index ada0950..80bed69 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -145,6 +145,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> > >  		seq_printf(m, " (%s)", obj->ring->name);
> > >  }
> > >  
> > > +static void describe_ctx(struct seq_file *m, struct i915_hw_context *ctx)
> > > +{
> > > +	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
> > > +	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
> > > +	seq_putc(m, ' ');
> > > +}
> > > +
> > >  static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > >  {
> > >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> > > @@ -1463,6 +1470,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
> > >  
> > >  	list_for_each_entry(ctx, &dev_priv->context_list, link) {
> > >  		seq_puts(m, "HW context ");
> > > +		describe_ctx(m, ctx);
> > >  		for_each_ring(ring, dev_priv, i)
> > >  			if (ring->default_context == ctx)
> > >  				seq_printf(m, "(default context %s) ", ring->name);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 1795927..015df52 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -601,6 +601,7 @@ struct i915_hw_context {
> > >  	struct kref ref;
> > >  	int id;
> > >  	bool is_initialized;
> > > +	uint8_t remap_slice;
> > >  	struct drm_i915_file_private *file_priv;
> > >  	struct intel_ring_buffer *ring;
> > >  	struct drm_i915_gem_object *obj;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index 2bbdce8..7e138cc 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -140,7 +140,7 @@ create_hw_context(struct drm_device *dev,
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct i915_hw_context *ctx;
> > > -	int ret;
> > > +	int ret, i;
> > >  
> > >  	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> > >  	if (ctx == NULL)
> > > @@ -181,6 +181,8 @@ create_hw_context(struct drm_device *dev,
> > >  
> > >  	ctx->file_priv = file_priv;
> > >  	ctx->id = ret;
> > > +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> > > +		ctx->remap_slice |= 1 << 1;
> >                                          ^
> > 
> > Still broken.
> 
> Or maybe I wanted it to always be the second slice!
> 
> > 
> > >  
> > >  	return ctx;
> > >  
> > > @@ -396,11 +398,11 @@ static int do_switch(struct i915_hw_context *to)
> > >  	struct intel_ring_buffer *ring = to->ring;
> > >  	struct i915_hw_context *from = ring->last_context;
> > >  	u32 hw_flags = 0;
> > > -	int ret;
> > > +	int ret, i;
> > >  
> > >  	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
> > >  
> > > -	if (from == to)
> > > +	if (from == to && !to->remap_slice)
> > >  		return 0;
> > >  
> > >  	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > @@ -423,7 +425,7 @@ static int do_switch(struct i915_hw_context *to)
> > >  
> > >  	if (!to->is_initialized || is_default_context(to))
> > >  		hw_flags |= MI_RESTORE_INHIBIT;
> > > -	else if (WARN_ON_ONCE(from == to)) /* not yet expected */
> > > +	else if (from == to)
> > >  		hw_flags |= MI_FORCE_RESTORE;
> > 
> > Hmm. Why do we need to restore actually? Could just leave MI_SET_CONTEXT
> > to be effectively a nop I think.
> 
> I don't think so. You need it to restore the register, which I think it
> won't do unless you issue a force restore, I am just guessing.
> 

I should really think a bit before I type. That isn't right - when I
wrote this, I was still thinking we were using MMIO, which we're not.

In other words, you're right. All 3 fixes coming right up.

> > 
> > >  
> > >  	ret = mi_set_context(ring, to, hw_flags);
> > > @@ -432,6 +434,18 @@ static int do_switch(struct i915_hw_context *to)
> > >  		return ret;
> > >  	}
> > >  
> > > +	for (i = 0; i < MAX_L3_SLICES; i++) {
> > > +		if (!(to->remap_slice & (1<<i)))
> > > +			continue;
> > > +
> > > +		ret = i915_gem_l3_remap(ring, i);
> > > +		if (!ret) {
> > > +			to->remap_slice &= ~(1<<i);
> > > +			/* If it failed, try again next round */
> > > +			DRM_DEBUG_DRIVER("L3 remapping failed\n");
> > > +		}
> > 
> > The debug message is on the wrong branch.
> > 
> 
> And maybe I wanted the user to THINK it failed when it didn't; so
> they'll be pleasantly surprised.
> 
> > > +	}
> > > +
> > >  	/* The backing object for the context is done after switching to the
> > >  	 * *next* context. Therefore we cannot retire the previous context until
> > >  	 * the next context has already started running. In fact, the below code
> > > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > > index b07bdfb..deb8787 100644
> > > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > > @@ -118,9 +118,8 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> > >  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
> > >  	struct drm_device *drm_dev = dminor->dev;
> > >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> > > -	uint32_t misccpctl;
> > >  	int slice = (int)(uintptr_t)attr->private;
> > > -	int i, ret;
> > > +	int ret;
> > >  
> > >  	count = round_down(count, 4);
> > >  
> > > @@ -134,26 +133,13 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > -	if (IS_HASWELL(drm_dev)) {
> > > -		if (dev_priv->l3_parity.remap_info[slice])
> > > -			memcpy(buf,
> > > -			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> > > -			       count);
> > > -		else
> > > -			memset(buf, 0, count);
> > > -
> > > -		goto out;
> > > -	}
> > > -
> > > -	misccpctl = I915_READ(GEN7_MISCCPCTL);
> > > -	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> > > -
> > > -	for (i = 0; i < count; i += 4)
> > > -		*((uint32_t *)(&buf[i])) = I915_READ(GEN7_L3LOG_BASE + offset + i);
> > > -
> > > -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> > > +	if (dev_priv->l3_parity.remap_info[slice])
> > > +		memcpy(buf,
> > > +		       dev_priv->l3_parity.remap_info[slice] + (offset/4),
> > > +		       count);
> > > +	else
> > > +		memset(buf, 0, count);
> > >  
> > > -out:
> > >  	mutex_unlock(&drm_dev->struct_mutex);
> > >  
> > >  	return count;
> > > @@ -168,6 +154,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> > >  	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
> > >  	struct drm_device *drm_dev = dminor->dev;
> > >  	struct drm_i915_private *dev_priv = drm_dev->dev_private;
> > > +	struct i915_hw_context *ctx;
> > >  	u32 *temp = NULL; /* Just here to make handling failures easy */
> > >  	int slice = (int)(uintptr_t)attr->private;
> > >  	int ret;
> > > @@ -176,6 +163,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > +	if (dev_priv->hw_contexts_disabled)
> > > +		return -ENXIO;
> > > +
> > >  	ret = i915_mutex_lock_interruptible(drm_dev);
> > >  	if (ret)
> > >  		return ret;
> > > @@ -204,8 +194,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> > >  
> > >  	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
> > >  
> > > -	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
> > > -		count = 0;
> > > +	/* NB: We defer the remapping until we switch to the context */
> > > +	list_for_each_entry(ctx, &dev_priv->context_list, link)
> > > +		ctx->remap_slice |= (1<<slice);
> > >  
> > >  	mutex_unlock(&drm_dev->struct_mutex);
> > >  
> > > -- 
> > > 1.8.4
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Ville Syrjälä
> > Intel OTC
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] [v3] drm/i915: Do remaps for all contexts
  2013-09-18  7:48   ` Ville Syrjälä
  2013-09-19  1:14     ` Ben Widawsky
@ 2013-09-19  2:03     ` Ben Widawsky
  1 sibling, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19  2:03 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

On both Ivybridge and Haswell, row remapping information is saved and
restored with context. This means, we never actually properly supported
the l3 remapping because our sysfs interface is asynchronous (and not
tied to any context), and the known faulty HW would be reused by the
next context to run.

Not that due to the asynchronous nature of the sysfs entry, there is no
point modifying the registers for the existing context. Instead we set a
flag for all contexts to load the correct remapping information on the
next run. Interested clients can use debugfs to determine whether or not
the row has been remapped.

One could propose at this point that we just do the remapping in the
kernel. I guess since we have to maintain the sysfs interface anyway,
I'm not sure how useful it is, and I do like keeping the policy in
userspace; (it wasn't my original decision to make the
interface the way it is, so I'm not attached).

v2: Force a context switch when we have a remap on the next switch.
(Ville)
Don't let userspace use the interface with disabled contexts.

v3: Don't force a context switch, just let it nop
Improper context slice remap initialization, 1<<1 instead of 1<<i, but I
rewrote it to avoid a second round of confusion.
Error print moved to error path (All Ville)
Added a comment on why the slice remap initialization happens.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++++++
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_sysfs.c       | 37 +++++++++++++--------------------
 4 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ada0950..80bed69 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -145,6 +145,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (%s)", obj->ring->name);
 }
 
+static void describe_ctx(struct seq_file *m, struct i915_hw_context *ctx)
+{
+	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
+	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
+	seq_putc(m, ' ');
+}
+
 static int i915_gem_object_list_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -1463,6 +1470,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
 		seq_puts(m, "HW context ");
+		describe_ctx(m, ctx);
 		for_each_ring(ring, dev_priv, i)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1795927..015df52 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -601,6 +601,7 @@ struct i915_hw_context {
 	struct kref ref;
 	int id;
 	bool is_initialized;
+	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
 	struct intel_ring_buffer *ring;
 	struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2bbdce8..9af3fe7 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -181,6 +181,10 @@ create_hw_context(struct drm_device *dev,
 
 	ctx->file_priv = file_priv;
 	ctx->id = ret;
+	/* NB: Mark all slices as needing a remap so that when the context first
+	 * loads it will restore whatever remap state already exists. If there
+	 * is no remap info, it will be a NOP. */
+	ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
 
 	return ctx;
 
@@ -396,11 +400,11 @@ static int do_switch(struct i915_hw_context *to)
 	struct intel_ring_buffer *ring = to->ring;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
-	int ret;
+	int ret, i;
 
 	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
 
-	if (from == to)
+	if (from == to && !to->remap_slice)
 		return 0;
 
 	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
@@ -423,8 +427,6 @@ static int do_switch(struct i915_hw_context *to)
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
-	else if (WARN_ON_ONCE(from == to)) /* not yet expected */
-		hw_flags |= MI_FORCE_RESTORE;
 
 	ret = mi_set_context(ring, to, hw_flags);
 	if (ret) {
@@ -432,6 +434,18 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
+	for (i = 0; i < MAX_L3_SLICES; i++) {
+		if (!(to->remap_slice & (1<<i)))
+			continue;
+
+		ret = i915_gem_l3_remap(ring, i);
+		/* If it failed, try again next round */
+		if (ret)
+			DRM_DEBUG_DRIVER("L3 remapping failed\n");
+		else
+			to->remap_slice &= ~(1<<i);
+	}
+
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
 	 * the next context has already started running. In fact, the below code
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index b07bdfb..deb8787 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -118,9 +118,8 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
-	uint32_t misccpctl;
 	int slice = (int)(uintptr_t)attr->private;
-	int i, ret;
+	int ret;
 
 	count = round_down(count, 4);
 
@@ -134,26 +133,13 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
-	if (IS_HASWELL(drm_dev)) {
-		if (dev_priv->l3_parity.remap_info[slice])
-			memcpy(buf,
-			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
-			       count);
-		else
-			memset(buf, 0, count);
-
-		goto out;
-	}
-
-	misccpctl = I915_READ(GEN7_MISCCPCTL);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-
-	for (i = 0; i < count; i += 4)
-		*((uint32_t *)(&buf[i])) = I915_READ(GEN7_L3LOG_BASE + offset + i);
-
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+	if (dev_priv->l3_parity.remap_info[slice])
+		memcpy(buf,
+		       dev_priv->l3_parity.remap_info[slice] + (offset/4),
+		       count);
+	else
+		memset(buf, 0, count);
 
-out:
 	mutex_unlock(&drm_dev->struct_mutex);
 
 	return count;
@@ -168,6 +154,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	struct drm_minor *dminor = container_of(dev, struct drm_minor, kdev);
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
+	struct i915_hw_context *ctx;
 	u32 *temp = NULL; /* Just here to make handling failures easy */
 	int slice = (int)(uintptr_t)attr->private;
 	int ret;
@@ -176,6 +163,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
+	if (dev_priv->hw_contexts_disabled)
+		return -ENXIO;
+
 	ret = i915_mutex_lock_interruptible(drm_dev);
 	if (ret)
 		return ret;
@@ -204,8 +194,9 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 
 	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
 
-	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
-		count = 0;
+	/* NB: We defer the remapping until we switch to the context */
+	list_for_each_entry(ctx, &dev_priv->context_list, link)
+		ctx->remap_slice |= (1<<slice);
 
 	mutex_unlock(&drm_dev->struct_mutex);
 
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] [v2] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-18  7:50   ` Ville Syrjälä
@ 2013-09-19 17:47     ` Ben Widawsky
  2013-09-19 18:01     ` Ben Widawsky
  1 sibling, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19 17:47 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

We'd only ever used this define to denote whether or not we have the
dynamic parity feature (DPF) and never to determine whether or not L3
exists. Baytrail is a good example of where L3 exists, and not DPF.

This patch provides clarify in the code for future use cases which might
want to actually query whether or not L3 exists.

v2: Add /* DPF == dynamic parity feature */

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 5 +++--
 drivers/gpu/drm/i915/i915_gem.c         | 2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
 drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
 5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 015df52..09a5c82 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1691,8 +1691,9 @@ struct drm_i915_file_private {
 
 #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
 
-#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
-#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
+/* DPF == dynamic parity feature */
+#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
+#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))
 
 #define GT_FREQUENCY_MULTIPLIER 50
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 18d07d7..7859f91 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
 	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	int i, ret;
 
-	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
+	if (!HAS_L3_DPF(dev) || !remap_info)
 		return 0;
 
 	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1c7f6ab..cfaf434 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -960,7 +960,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return;
 
 	spin_lock(&dev_priv->irq_lock);
@@ -2292,7 +2292,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	pm_irqs = gt_irqs = 0;
 
 	dev_priv->gt_irq_mask = ~0;
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		/* L3 parity interrupt is always unmasked. */
 		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
 		gt_irqs |= GT_PARITY_ERROR(dev);
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index deb8787..7b4c79c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
 {
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return -EPERM;
 
 	if (offset % 4 != 0)
@@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
 #endif
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
 		if (ret)
 			DRM_ERROR("l3 parity sysfs setup failed\n");
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 958b7d8..b67104a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	if (INTEL_INFO(dev)->gen >= 6)
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
-	if (HAS_L3_GPU_CACHE(dev))
+	if (HAS_L3_DPF(dev))
 		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
 	return ret;
@@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring,
 				       ~(ring->irq_enable_mask |
 					 GT_PARITY_ERROR(dev)));
@@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 		else
 			I915_WRITE_IMR(ring, ~0);
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] [v2] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-18  7:50   ` Ville Syrjälä
  2013-09-19 17:47     ` [PATCH] [v2] " Ben Widawsky
@ 2013-09-19 18:01     ` Ben Widawsky
  2013-09-19 18:41       ` Daniel Vetter
  1 sibling, 1 reply; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19 18:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

We'd only ever used this define to denote whether or not we have the
dynamic parity feature (DPF) and never to determine whether or not L3
exists. Baytrail is a good example of where L3 exists, and not DPF.

This patch provides clarify in the code for future use cases which might
want to actually query whether or not L3 exists.

v2: Add /* DPF == dynamic parity feature */

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 5 +++--
 drivers/gpu/drm/i915/i915_gem.c         | 2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
 drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
 5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 015df52..09a5c82 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1691,8 +1691,9 @@ struct drm_i915_file_private {
 
 #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
 
-#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
-#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
+/* DPF == dynamic parity feature */
+#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
+#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))
 
 #define GT_FREQUENCY_MULTIPLIER 50
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 18d07d7..7859f91 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
 	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	int i, ret;
 
-	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
+	if (!HAS_L3_DPF(dev) || !remap_info)
 		return 0;
 
 	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1c7f6ab..cfaf434 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -960,7 +960,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return;
 
 	spin_lock(&dev_priv->irq_lock);
@@ -2292,7 +2292,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	pm_irqs = gt_irqs = 0;
 
 	dev_priv->gt_irq_mask = ~0;
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		/* L3 parity interrupt is always unmasked. */
 		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
 		gt_irqs |= GT_PARITY_ERROR(dev);
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index deb8787..7b4c79c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
 {
-	if (!HAS_L3_GPU_CACHE(dev))
+	if (!HAS_L3_DPF(dev))
 		return -EPERM;
 
 	if (offset % 4 != 0)
@@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
 #endif
-	if (HAS_L3_GPU_CACHE(dev)) {
+	if (HAS_L3_DPF(dev)) {
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
 		if (ret)
 			DRM_ERROR("l3 parity sysfs setup failed\n");
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 958b7d8..b67104a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	if (INTEL_INFO(dev)->gen >= 6)
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
-	if (HAS_L3_GPU_CACHE(dev))
+	if (HAS_L3_DPF(dev))
 		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
 	return ret;
@@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring,
 				       ~(ring->irq_enable_mask |
 					 GT_PARITY_ERROR(dev)));
@@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
+		if (HAS_L3_DPF(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 		else
 			I915_WRITE_IMR(ring, ~0);
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] [v3] drm/i915: Add second slice l3 remapping
  2013-09-18  7:36   ` Ville Syrjälä
  2013-09-18 16:22     ` Ben Widawsky
@ 2013-09-19 18:13     ` Ben Widawsky
  1 sibling, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19 18:13 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Certain HSW SKUs have a second bank of L3. This L3 remapping has a
separate register set, and interrupt from the first "slice". A slice is
simply a term to define some subset of the GPU's l3 cache. This patch
implements both the interrupt handler, and ability to communicate with
userspace about this second slice.

v2:  Remove redundant check about non-existent slice.
Change warning about interrupts of unknown slices to WARN_ON_ONCE
Handle the case where we get 2 slice interrupts concurrently, and switch
the tracking of interrupts to be non-destructive (all Ville)
Don't enable/mask the second slice parity interrupt for ivb/vlv (even
though all docs I can find claim it's rsvd) (Ville + Bryan)
Keep BYT excluded from L3 parity

v3: Fix the slice = ffs to be decremented by one (found by Ville). When
I initially did my testing on the series, I was using 1-based slice
counting, so this code was correct. Not sure why my simpler tests that
I've been running since then didn't pick it up sooner.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  7 ++-
 drivers/gpu/drm/i915/i915_gem.c         | 26 +++++-----
 drivers/gpu/drm/i915/i915_irq.c         | 89 +++++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_reg.h         |  7 +++
 drivers/gpu/drm/i915/i915_sysfs.c       | 34 ++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c |  7 ++-
 include/uapi/drm/i915_drm.h             |  8 +--
 7 files changed, 117 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8b16d47..c6e8df7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -917,9 +917,11 @@ struct i915_ums_state {
 	int mm_suspended;
 };
 
+#define MAX_L3_SLICES 2
 struct intel_l3_parity {
-	u32 *remap_info;
+	u32 *remap_info[MAX_L3_SLICES];
 	struct work_struct error_work;
+	int which_slice;
 };
 
 struct i915_gem_mm {
@@ -1686,6 +1688,7 @@ struct drm_i915_file_private {
 #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
 
 #define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
+#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
 
 #define GT_FREQUENCY_MULTIPLIER 50
 
@@ -1946,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-void i915_gem_l3_remap(struct drm_device *dev);
+void i915_gem_l3_remap(struct drm_device *dev, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d3de6e..66bf75d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4252,16 +4252,15 @@ i915_gem_idle(struct drm_device *dev)
 	return 0;
 }
 
-void i915_gem_l3_remap(struct drm_device *dev)
+void i915_gem_l3_remap(struct drm_device *dev, int slice)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
+	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	u32 misccpctl;
 	int i;
 
-	if (!HAS_L3_GPU_CACHE(dev))
-		return;
-
-	if (!dev_priv->l3_parity.remap_info)
+	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
 		return;
 
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
@@ -4269,17 +4268,17 @@ void i915_gem_l3_remap(struct drm_device *dev)
 	POSTING_READ(GEN7_MISCCPCTL);
 
 	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
-		u32 remap = I915_READ(GEN7_L3LOG_BASE + i);
-		if (remap && remap != dev_priv->l3_parity.remap_info[i/4])
+		u32 remap = I915_READ(reg_base + i);
+		if (remap && remap != remap_info[i/4])
 			DRM_DEBUG("0x%x was already programmed to %x\n",
-				  GEN7_L3LOG_BASE + i, remap);
-		if (remap && !dev_priv->l3_parity.remap_info[i/4])
+				  reg_base + i, remap);
+		if (remap && !remap_info[i/4])
 			DRM_DEBUG_DRIVER("Clearing remapped register\n");
-		I915_WRITE(GEN7_L3LOG_BASE + i, dev_priv->l3_parity.remap_info[i/4]);
+		I915_WRITE(reg_base + i, remap_info[i/4]);
 	}
 
 	/* Make sure all the writes land before disabling dop clock gating */
-	POSTING_READ(GEN7_L3LOG_BASE);
+	POSTING_READ(reg_base);
 
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
@@ -4373,7 +4372,7 @@ int
 i915_gem_init_hw(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	int ret;
+	int ret, i;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -4392,7 +4391,8 @@ i915_gem_init_hw(struct drm_device *dev)
 		I915_WRITE(GEN7_MSG_CTL, temp);
 	}
 
-	i915_gem_l3_remap(dev);
+	for (i = 0; i < NUM_L3_SLICES(dev); i++)
+		i915_gem_l3_remap(dev, i);
 
 	i915_gem_init_swizzling(dev);
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a42f30b..1c7f6ab 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -888,9 +888,10 @@ static void ivybridge_parity_work(struct work_struct *work)
 	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
 						    l3_parity.error_work);
 	u32 error_status, row, bank, subbank;
-	char *parity_event[5];
+	char *parity_event[6];
 	uint32_t misccpctl;
 	unsigned long flags;
+	uint8_t slice = 0;
 
 	/* We must turn off DOP level clock gating to access the L3 registers.
 	 * In order to prevent a get/put style interface, acquire struct mutex
@@ -898,45 +899,64 @@ static void ivybridge_parity_work(struct work_struct *work)
 	 */
 	mutex_lock(&dev_priv->dev->struct_mutex);
 
+	/* If we've screwed up tracking, just let the interrupt fire again */
+	if (WARN_ON(!dev_priv->l3_parity.which_slice))
+		goto out;
+
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
 	POSTING_READ(GEN7_MISCCPCTL);
 
-	error_status = I915_READ(GEN7_L3CDERRST1);
-	row = GEN7_PARITY_ERROR_ROW(error_status);
-	bank = GEN7_PARITY_ERROR_BANK(error_status);
-	subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
+	while ((slice = ffs(dev_priv->l3_parity.which_slice)) != 0) {
+		u32 reg;
 
-	I915_WRITE(GEN7_L3CDERRST1, GEN7_PARITY_ERROR_VALID |
-				    GEN7_L3CDERRST1_ENABLE);
-	POSTING_READ(GEN7_L3CDERRST1);
+		slice--;
+		if (WARN_ON_ONCE(slice >= NUM_L3_SLICES(dev_priv->dev)))
+			break;
 
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+		dev_priv->l3_parity.which_slice &= ~(1<<slice);
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	ilk_enable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+		reg = GEN7_L3CDERRST1 + (slice * 0x200);
 
-	mutex_unlock(&dev_priv->dev->struct_mutex);
+		error_status = I915_READ(reg);
+		row = GEN7_PARITY_ERROR_ROW(error_status);
+		bank = GEN7_PARITY_ERROR_BANK(error_status);
+		subbank = GEN7_PARITY_ERROR_SUBBANK(error_status);
+
+		I915_WRITE(reg, GEN7_PARITY_ERROR_VALID | GEN7_L3CDERRST1_ENABLE);
+		POSTING_READ(reg);
+
+		parity_event[0] = I915_L3_PARITY_UEVENT "=1";
+		parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
+		parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
+		parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
+		parity_event[4] = kasprintf(GFP_KERNEL, "SLICE=%d", slice);
+		parity_event[5] = NULL;
+
+		kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
+				   KOBJ_CHANGE, parity_event);
 
-	parity_event[0] = I915_L3_PARITY_UEVENT "=1";
-	parity_event[1] = kasprintf(GFP_KERNEL, "ROW=%d", row);
-	parity_event[2] = kasprintf(GFP_KERNEL, "BANK=%d", bank);
-	parity_event[3] = kasprintf(GFP_KERNEL, "SUBBANK=%d", subbank);
-	parity_event[4] = NULL;
+		DRM_DEBUG("Parity error: Slice = %d, Row = %d, Bank = %d, Sub bank = %d.\n",
+			  slice, row, bank, subbank);
 
-	kobject_uevent_env(&dev_priv->dev->primary->kdev.kobj,
-			   KOBJ_CHANGE, parity_event);
+		kfree(parity_event[4]);
+		kfree(parity_event[3]);
+		kfree(parity_event[2]);
+		kfree(parity_event[1]);
+	}
 
-	DRM_DEBUG("Parity error: Row = %d, Bank = %d, Sub bank = %d.\n",
-		  row, bank, subbank);
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
-	kfree(parity_event[3]);
-	kfree(parity_event[2]);
-	kfree(parity_event[1]);
+out:
+	WARN_ON(dev_priv->l3_parity.which_slice);
+	spin_lock_irqsave(&dev_priv->irq_lock, flags);
+	ilk_enable_gt_irq(dev_priv, GT_PARITY_ERROR(dev_priv->dev));
+	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+
+	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
-static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
+static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 
@@ -944,9 +964,16 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev)
 		return;
 
 	spin_lock(&dev_priv->irq_lock);
-	ilk_disable_gt_irq(dev_priv, GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+	ilk_disable_gt_irq(dev_priv, GT_PARITY_ERROR(dev));
 	spin_unlock(&dev_priv->irq_lock);
 
+	iir &= GT_PARITY_ERROR(dev);
+	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1)
+		dev_priv->l3_parity.which_slice |= 1 << 1;
+
+	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
+		dev_priv->l3_parity.which_slice |= 1 << 0;
+
 	queue_work(dev_priv->wq, &dev_priv->l3_parity.error_work);
 }
 
@@ -981,8 +1008,8 @@ static void snb_gt_irq_handler(struct drm_device *dev,
 		i915_handle_error(dev, false);
 	}
 
-	if (gt_iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
-		ivybridge_parity_error_irq_handler(dev);
+	if (gt_iir & GT_PARITY_ERROR(dev))
+		ivybridge_parity_error_irq_handler(dev, gt_iir);
 }
 
 #define HPD_STORM_DETECT_PERIOD 1000
@@ -2267,8 +2294,8 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	dev_priv->gt_irq_mask = ~0;
 	if (HAS_L3_GPU_CACHE(dev)) {
 		/* L3 parity interrupt is always unmasked. */
-		dev_priv->gt_irq_mask = ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
-		gt_irqs |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
+		gt_irqs |= GT_PARITY_ERROR(dev);
 	}
 
 	gt_irqs |= GT_RENDER_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 384adfb..c94946f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -927,6 +927,7 @@
 #define GT_BLT_USER_INTERRUPT			(1 << 22)
 #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
 #define GT_BSD_USER_INTERRUPT			(1 << 12)
+#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
 #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
@@ -937,6 +938,10 @@
 #define PM_VEBOX_CS_ERROR_INTERRUPT		(1 << 12) /* hsw+ */
 #define PM_VEBOX_USER_INTERRUPT			(1 << 10) /* hsw+ */
 
+#define GT_PARITY_ERROR(dev) \
+	(GT_RENDER_L3_PARITY_ERROR_INTERRUPT | \
+	 IS_HASWELL(dev) ? GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 : 0)
+
 /* These are all the "old" interrupts */
 #define ILK_BSD_USER_INTERRUPT				(1<<5)
 #define I915_PIPE_CONTROL_NOTIFY_INTERRUPT		(1<<18)
@@ -4743,6 +4748,7 @@
 
 /* IVYBRIDGE DPF */
 #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
+#define HSW_L3CDERRST11			0xB208 /* L3CD Error Status register 1 slice 1 */
 #define   GEN7_L3CDERRST1_ROW_MASK	(0x7ff<<14)
 #define   GEN7_PARITY_ERROR_VALID	(1<<13)
 #define   GEN7_L3CDERRST1_BANK_MASK	(3<<11)
@@ -4756,6 +4762,7 @@
 #define   GEN7_L3CDERRST1_ENABLE	(1<<7)
 
 #define GEN7_L3LOG_BASE			0xB070
+#define HSW_L3LOG_BASE_SLICE1		0xB270
 #define GEN7_L3LOG_SIZE			0x80
 
 #define GEN7_HALF_SLICE_CHICKEN1	0xe100 /* IVB GT1 + VLV */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 71f6de2..3a8bf0c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -119,6 +119,7 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
 	uint32_t misccpctl;
+	int slice = (int)(uintptr_t)attr->private;
 	int i, ret;
 
 	count = round_down(count, 4);
@@ -134,9 +135,9 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 		return ret;
 
 	if (IS_HASWELL(drm_dev)) {
-		if (dev_priv->l3_parity.remap_info)
+		if (dev_priv->l3_parity.remap_info[slice])
 			memcpy(buf,
-			       dev_priv->l3_parity.remap_info + (offset/4),
+			       dev_priv->l3_parity.remap_info[slice] + (offset/4),
 			       count);
 		else
 			memset(buf, 0, count);
@@ -168,6 +169,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	struct drm_device *drm_dev = dminor->dev;
 	struct drm_i915_private *dev_priv = drm_dev->dev_private;
 	u32 *temp = NULL; /* Just here to make handling failures easy */
+	int slice = (int)(uintptr_t)attr->private;
 	int ret;
 
 	ret = l3_access_valid(drm_dev, offset);
@@ -178,7 +180,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
-	if (!dev_priv->l3_parity.remap_info) {
+	if (!dev_priv->l3_parity.remap_info[slice]) {
 		temp = kzalloc(GEN7_L3LOG_SIZE, GFP_KERNEL);
 		if (!temp) {
 			mutex_unlock(&drm_dev->struct_mutex);
@@ -198,11 +200,11 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	 * at this point it is left as a TODO.
 	*/
 	if (temp)
-		dev_priv->l3_parity.remap_info = temp;
+		dev_priv->l3_parity.remap_info[slice] = temp;
 
-	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
+	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
 
-	i915_gem_l3_remap(drm_dev);
+	i915_gem_l3_remap(drm_dev, slice);
 
 	mutex_unlock(&drm_dev->struct_mutex);
 
@@ -214,7 +216,17 @@ static struct bin_attribute dpf_attrs = {
 	.size = GEN7_L3LOG_SIZE,
 	.read = i915_l3_read,
 	.write = i915_l3_write,
-	.mmap = NULL
+	.mmap = NULL,
+	.private = (void *)0
+};
+
+static struct bin_attribute dpf_attrs_1 = {
+	.attr = {.name = "l3_parity_slice_1", .mode = (S_IRUSR | S_IWUSR)},
+	.size = GEN7_L3LOG_SIZE,
+	.read = i915_l3_read,
+	.write = i915_l3_write,
+	.mmap = NULL,
+	.private = (void *)1
 };
 
 static ssize_t gt_cur_freq_mhz_show(struct device *kdev,
@@ -525,6 +537,13 @@ void i915_setup_sysfs(struct drm_device *dev)
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
 		if (ret)
 			DRM_ERROR("l3 parity sysfs setup failed\n");
+
+		if (NUM_L3_SLICES(dev) > 1) {
+			ret = device_create_bin_file(&dev->primary->kdev,
+						     &dpf_attrs_1);
+			if (ret)
+				DRM_ERROR("l3 parity slice 1 setup failed\n");
+		}
 	}
 
 	ret = 0;
@@ -548,6 +567,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
 		sysfs_remove_files(&dev->primary->kdev.kobj, vlv_attrs);
 	else
 		sysfs_remove_files(&dev->primary->kdev.kobj, gen6_attrs);
+	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs_1);
 	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs);
 #ifdef CONFIG_PM
 	sysfs_unmerge_group(&dev->primary->kdev.kobj, &rc6_attr_group);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 686e5b2..958b7d8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -570,7 +570,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
 	if (HAS_L3_GPU_CACHE(dev))
-		I915_WRITE_IMR(ring, ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
 	return ret;
 }
@@ -1000,7 +1000,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
 			I915_WRITE_IMR(ring,
 				       ~(ring->irq_enable_mask |
-					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
+					 GT_PARITY_ERROR(dev)));
 		else
 			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
 		ilk_enable_gt_irq(dev_priv, ring->irq_enable_mask);
@@ -1020,8 +1020,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
 		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
-			I915_WRITE_IMR(ring,
-				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 		else
 			I915_WRITE_IMR(ring, ~0);
 		ilk_disable_gt_irq(dev_priv, ring->irq_enable_mask);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 55bb572..3a4e97b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -38,10 +38,10 @@
  *
  * I915_L3_PARITY_UEVENT - Generated when the driver receives a parity mismatch
  *	event from the gpu l3 cache. Additional information supplied is ROW,
- *	BANK, SUBBANK of the affected cacheline. Userspace should keep track of
- *	these events and if a specific cache-line seems to have a persistent
- *	error remap it with the l3 remapping tool supplied in intel-gpu-tools.
- *	The value supplied with the event is always 1.
+ *	BANK, SUBBANK, SLICE of the affected cacheline. Userspace should keep
+ *	track of these events and if a specific cache-line seems to have a
+ *	persistent error remap it with the l3 remapping tool supplied in
+ *	intel-gpu-tools.  The value supplied with the event is always 1.
  *
  * I915_ERROR_UEVENT - Generated upon error detection, currently only via
  *	hangcheck. The error detection event is a good indicator of when things
-- 
1.8.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/6] drm/i915: Make l3 remapping use the ring
  2013-09-18  4:12 ` [PATCH 3/6] drm/i915: Make l3 remapping use the ring Ben Widawsky
@ 2013-09-19 18:39   ` Daniel Vetter
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2013-09-19 18:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Bryan Bell, Ben Widawsky

On Tue, Sep 17, 2013 at 09:12:44PM -0700, Ben Widawsky wrote:
> Using LRI for setting the remapping registers allows us to stream l3
> remapping information. This is necessary to handle per context remaps as
> we'll see implemented in an upcoming patch.
> 
> Using the ring also means we don't need to frob the DOP clock gating
> bits.
> 
> v2: Add comment about lack of worry for concurrent register access
> (Daniel)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I've done a s/XXX/Note on the comment and converted it into the preferred
layout for multiline comments.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h   |  2 +-
>  drivers/gpu/drm/i915/i915_gem.c   | 38 ++++++++++++++++++--------------------
>  drivers/gpu/drm/i915/i915_sysfs.c |  3 ++-
>  3 files changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c6e8df7..0c39805 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1949,7 +1949,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>  int __must_check i915_gem_init(struct drm_device *dev);
>  int __must_check i915_gem_init_hw(struct drm_device *dev);
> -void i915_gem_l3_remap(struct drm_device *dev, int slice);
> +int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice);
>  void i915_gem_init_swizzling(struct drm_device *dev);
>  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>  int __must_check i915_gpu_idle(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 66bf75d..2538138 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4252,35 +4252,33 @@ i915_gem_idle(struct drm_device *dev)
>  	return 0;
>  }
>  
> -void i915_gem_l3_remap(struct drm_device *dev, int slice)
> +int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
>  {
> +	struct drm_device *dev = ring->dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
>  	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
> -	u32 misccpctl;
> -	int i;
> +	int i, ret;
>  
>  	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
> -		return;
> +		return 0;
>  
> -	misccpctl = I915_READ(GEN7_MISCCPCTL);
> -	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> -	POSTING_READ(GEN7_MISCCPCTL);
> +	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
> +	if (ret)
> +		return ret;
>  
> +	/* XXX: We do not worry about the concurrent register cacheline hang
> +	 * here because no other code should access these registers other than
> +	 * at initialization time. */
>  	for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
> -		u32 remap = I915_READ(reg_base + i);
> -		if (remap && remap != remap_info[i/4])
> -			DRM_DEBUG("0x%x was already programmed to %x\n",
> -				  reg_base + i, remap);
> -		if (remap && !remap_info[i/4])
> -			DRM_DEBUG_DRIVER("Clearing remapped register\n");
> -		I915_WRITE(reg_base + i, remap_info[i/4]);
> +		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +		intel_ring_emit(ring, reg_base + i);
> +		intel_ring_emit(ring, remap_info[i/4]);
>  	}
>  
> -	/* Make sure all the writes land before disabling dop clock gating */
> -	POSTING_READ(reg_base);
> +	intel_ring_advance(ring);
>  
> -	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> +	return ret;
>  }
>  
>  void i915_gem_init_swizzling(struct drm_device *dev)
> @@ -4391,15 +4389,15 @@ i915_gem_init_hw(struct drm_device *dev)
>  		I915_WRITE(GEN7_MSG_CTL, temp);
>  	}
>  
> -	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> -		i915_gem_l3_remap(dev, i);
> -
>  	i915_gem_init_swizzling(dev);
>  
>  	ret = i915_gem_init_rings(dev);
>  	if (ret)
>  		return ret;
>  
> +	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> +		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
> +
>  	/*
>  	 * XXX: There was some w/a described somewhere suggesting loading
>  	 * contexts before PPGTT.
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 3a8bf0c..b07bdfb 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -204,7 +204,8 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  
>  	memcpy(dev_priv->l3_parity.remap_info[slice] + (offset/4), buf, count);
>  
> -	i915_gem_l3_remap(drm_dev, slice);
> +	if (i915_gem_l3_remap(&dev_priv->ring[RCS], slice))
> +		count = 0;
>  
>  	mutex_unlock(&drm_dev->struct_mutex);
>  
> -- 
> 1.8.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] [v2] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-19 18:01     ` Ben Widawsky
@ 2013-09-19 18:41       ` Daniel Vetter
  2013-09-19 19:59         ` Ben Widawsky
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Vetter @ 2013-09-19 18:41 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Thu, Sep 19, 2013 at 11:01:40AM -0700, Ben Widawsky wrote:
> We'd only ever used this define to denote whether or not we have the
> dynamic parity feature (DPF) and never to determine whether or not L3
> exists. Baytrail is a good example of where L3 exists, and not DPF.
> 
> This patch provides clarify in the code for future use cases which might
> want to actually query whether or not L3 exists.
> 
> v2: Add /* DPF == dynamic parity feature */
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Ok, I've slurped all the kernel patches in. Like I've said I'd very much
welcome a little testcase to inject an error and exercise this stuff a
bit, if it's not a horrendous amount of work to get going.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 5 +++--
>  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
>  drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
>  drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
>  5 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 015df52..09a5c82 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1691,8 +1691,9 @@ struct drm_i915_file_private {
>  
>  #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
>  
> -#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> -#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
> +/* DPF == dynamic parity feature */
> +#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> +#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))
>  
>  #define GT_FREQUENCY_MULTIPLIER 50
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 18d07d7..7859f91 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
>  	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
>  	int i, ret;
>  
> -	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
> +	if (!HAS_L3_DPF(dev) || !remap_info)
>  		return 0;
>  
>  	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 1c7f6ab..cfaf434 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -960,7 +960,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
>  {
>  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>  
> -	if (!HAS_L3_GPU_CACHE(dev))
> +	if (!HAS_L3_DPF(dev))
>  		return;
>  
>  	spin_lock(&dev_priv->irq_lock);
> @@ -2292,7 +2292,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
>  	pm_irqs = gt_irqs = 0;
>  
>  	dev_priv->gt_irq_mask = ~0;
> -	if (HAS_L3_GPU_CACHE(dev)) {
> +	if (HAS_L3_DPF(dev)) {
>  		/* L3 parity interrupt is always unmasked. */
>  		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
>  		gt_irqs |= GT_PARITY_ERROR(dev);
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index deb8787..7b4c79c 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
>  
>  static int l3_access_valid(struct drm_device *dev, loff_t offset)
>  {
> -	if (!HAS_L3_GPU_CACHE(dev))
> +	if (!HAS_L3_DPF(dev))
>  		return -EPERM;
>  
>  	if (offset % 4 != 0)
> @@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
>  			DRM_ERROR("RC6 residency sysfs setup failed\n");
>  	}
>  #endif
> -	if (HAS_L3_GPU_CACHE(dev)) {
> +	if (HAS_L3_DPF(dev)) {
>  		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
>  		if (ret)
>  			DRM_ERROR("l3 parity sysfs setup failed\n");
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 958b7d8..b67104a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	if (INTEL_INFO(dev)->gen >= 6)
>  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>  
> -	if (HAS_L3_GPU_CACHE(dev))
> +	if (HAS_L3_DPF(dev))
>  		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  
>  	return ret;
> @@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
>  
>  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
>  	if (ring->irq_refcount++ == 0) {
> -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> +		if (HAS_L3_DPF(dev) && ring->id == RCS)
>  			I915_WRITE_IMR(ring,
>  				       ~(ring->irq_enable_mask |
>  					 GT_PARITY_ERROR(dev)));
> @@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
>  
>  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
>  	if (--ring->irq_refcount == 0) {
> -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> +		if (HAS_L3_DPF(dev) && ring->id == RCS)
>  			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  		else
>  			I915_WRITE_IMR(ring, ~0);
> -- 
> 1.8.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] [v2] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
  2013-09-19 18:41       ` Daniel Vetter
@ 2013-09-19 19:59         ` Ben Widawsky
  0 siblings, 0 replies; 27+ messages in thread
From: Ben Widawsky @ 2013-09-19 19:59 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Ben Widawsky

On Thu, Sep 19, 2013 at 08:41:55PM +0200, Daniel Vetter wrote:
> On Thu, Sep 19, 2013 at 11:01:40AM -0700, Ben Widawsky wrote:
> > We'd only ever used this define to denote whether or not we have the
> > dynamic parity feature (DPF) and never to determine whether or not L3
> > exists. Baytrail is a good example of where L3 exists, and not DPF.
> > 
> > This patch provides clarify in the code for future use cases which might
> > want to actually query whether or not L3 exists.
> > 
> > v2: Add /* DPF == dynamic parity feature */
> > 
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Ok, I've slurped all the kernel patches in. Like I've said I'd very much
> welcome a little testcase to inject an error and exercise this stuff a
> bit, if it's not a horrendous amount of work to get going.
> -Daniel
> 

I agree with you. I'm trying to get help from other internal groups to
write that test case, but so far, no luck. It's on my TODO list behind
about 4 other things right now.

> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         | 5 +++--
> >  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
> >  drivers/gpu/drm/i915/i915_irq.c         | 4 ++--
> >  drivers/gpu/drm/i915/i915_sysfs.c       | 4 ++--
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
> >  5 files changed, 11 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 015df52..09a5c82 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1691,8 +1691,9 @@ struct drm_i915_file_private {
> >  
> >  #define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
> >  
> > -#define HAS_L3_GPU_CACHE(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> > -#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_GPU_CACHE(dev))
> > +/* DPF == dynamic parity feature */
> > +#define HAS_L3_DPF(dev) (IS_IVYBRIDGE(dev) || IS_HASWELL(dev))
> > +#define NUM_L3_SLICES(dev) (IS_HSW_GT3(dev) ? 2 : HAS_L3_DPF(dev))
> >  
> >  #define GT_FREQUENCY_MULTIPLIER 50
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 18d07d7..7859f91 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4260,7 +4260,7 @@ int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
> >  	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
> >  	int i, ret;
> >  
> > -	if (!HAS_L3_GPU_CACHE(dev) || !remap_info)
> > +	if (!HAS_L3_DPF(dev) || !remap_info)
> >  		return 0;
> >  
> >  	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 1c7f6ab..cfaf434 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -960,7 +960,7 @@ static void ivybridge_parity_error_irq_handler(struct drm_device *dev, u32 iir)
> >  {
> >  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
> >  
> > -	if (!HAS_L3_GPU_CACHE(dev))
> > +	if (!HAS_L3_DPF(dev))
> >  		return;
> >  
> >  	spin_lock(&dev_priv->irq_lock);
> > @@ -2292,7 +2292,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
> >  	pm_irqs = gt_irqs = 0;
> >  
> >  	dev_priv->gt_irq_mask = ~0;
> > -	if (HAS_L3_GPU_CACHE(dev)) {
> > +	if (HAS_L3_DPF(dev)) {
> >  		/* L3 parity interrupt is always unmasked. */
> >  		dev_priv->gt_irq_mask = ~GT_PARITY_ERROR(dev);
> >  		gt_irqs |= GT_PARITY_ERROR(dev);
> > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > index deb8787..7b4c79c 100644
> > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > @@ -97,7 +97,7 @@ static struct attribute_group rc6_attr_group = {
> >  
> >  static int l3_access_valid(struct drm_device *dev, loff_t offset)
> >  {
> > -	if (!HAS_L3_GPU_CACHE(dev))
> > +	if (!HAS_L3_DPF(dev))
> >  		return -EPERM;
> >  
> >  	if (offset % 4 != 0)
> > @@ -525,7 +525,7 @@ void i915_setup_sysfs(struct drm_device *dev)
> >  			DRM_ERROR("RC6 residency sysfs setup failed\n");
> >  	}
> >  #endif
> > -	if (HAS_L3_GPU_CACHE(dev)) {
> > +	if (HAS_L3_DPF(dev)) {
> >  		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
> >  		if (ret)
> >  			DRM_ERROR("l3 parity sysfs setup failed\n");
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 958b7d8..b67104a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -569,7 +569,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
> >  	if (INTEL_INFO(dev)->gen >= 6)
> >  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
> >  
> > -	if (HAS_L3_GPU_CACHE(dev))
> > +	if (HAS_L3_DPF(dev))
> >  		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> >  
> >  	return ret;
> > @@ -997,7 +997,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
> >  
> >  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> >  	if (ring->irq_refcount++ == 0) {
> > -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> > +		if (HAS_L3_DPF(dev) && ring->id == RCS)
> >  			I915_WRITE_IMR(ring,
> >  				       ~(ring->irq_enable_mask |
> >  					 GT_PARITY_ERROR(dev)));
> > @@ -1019,7 +1019,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
> >  
> >  	spin_lock_irqsave(&dev_priv->irq_lock, flags);
> >  	if (--ring->irq_refcount == 0) {
> > -		if (HAS_L3_GPU_CACHE(dev) && ring->id == RCS)
> > +		if (HAS_L3_DPF(dev) && ring->id == RCS)
> >  			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> >  		else
> >  			I915_WRITE_IMR(ring, ~0);
> > -- 
> > 1.8.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-09-19 19:59 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-18  4:12 [PATCH 1/6] drm/i915: Fix HSW parity test Ben Widawsky
2013-09-18  4:12 ` [PATCH 2/6] drm/i915: Add second slice l3 remapping Ben Widawsky
2013-09-18  7:36   ` Ville Syrjälä
2013-09-18 16:22     ` Ben Widawsky
2013-09-19 18:13     ` [PATCH] [v3] " Ben Widawsky
2013-09-18  4:12 ` [PATCH 3/6] drm/i915: Make l3 remapping use the ring Ben Widawsky
2013-09-19 18:39   ` Daniel Vetter
2013-09-18  4:12 ` [PATCH 4/6] drm/i915: Keep a list of all contexts Ben Widawsky
2013-09-18  4:12 ` [PATCH 5/6] drm/i915: Do remaps for " Ben Widawsky
2013-09-18  7:48   ` Ville Syrjälä
2013-09-19  1:14     ` Ben Widawsky
2013-09-19  1:17       ` Ben Widawsky
2013-09-19  2:03     ` [PATCH] [v3] " Ben Widawsky
2013-09-18  4:12 ` [PATCH 6/6] drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF Ben Widawsky
2013-09-18  7:50   ` Ville Syrjälä
2013-09-19 17:47     ` [PATCH] [v2] " Ben Widawsky
2013-09-19 18:01     ` Ben Widawsky
2013-09-19 18:41       ` Daniel Vetter
2013-09-19 19:59         ` Ben Widawsky
2013-09-18  4:12 ` [PATCH 07/14] intel_l3_parity: Fix indentation Ben Widawsky
2013-09-18  4:12 ` [PATCH 08/14] intel_l3_parity: Assert all GEN7+ support Ben Widawsky
2013-09-18  4:12 ` [PATCH 09/14] intel_l3_parity: Use getopt for the l3 parity tool Ben Widawsky
2013-09-18  4:12 ` [PATCH 10/14] intel_l3_parity: Hardware info argument Ben Widawsky
2013-09-18  4:12 ` [PATCH 11/14] intel_l3_parity: slice support Ben Widawsky
2013-09-18  4:12 ` [PATCH 12/14] intel_l3_parity: Actually support multiple slices Ben Widawsky
2013-09-18  4:12 ` [PATCH 13/14] intel_l3_parity: Support error injection Ben Widawsky
2013-09-18  4:12 ` [PATCH 14/14] intel_l3_parity: Support a daemonic mode Ben Widawsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.