intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row
@ 2011-03-25  9:36 Chris Wilson
  2011-03-25 12:16 ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2011-03-25  9:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

The kernel has always been lax in enforcing size restrictions upon the
buffers it is handed to by userspace, and userspace has always tried to
overallocate its buffers.

However, userspace made a mistake and failed to follow the undocumented
requirement that gen2 hardware requires its allocations to be rounded up
to the next even tile row. Rampant corruption on the bottom on images on
i8xx devices ensued. Userspace was fixed not to do that again, and now
we teach the kernel not to be so trusting of userspace.

We ensure that an allocated region within the GTT matches the proposed
usage restrictions. For fenced buffers on old hardware, this means
rounding up the allocation to the next power of two size and aligning it
to that size. For unfenced buffers, we need to ensure that the start and
end of the allocation is aligned to an even tile row.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Everyone with an i8xx device
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    2 +
 drivers/gpu/drm/i915/i915_gem.c        |   37 +++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_tiling.c |    5 +++-
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 787c969..359ddce 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1183,6 +1183,8 @@ void i915_gem_free_all_phys_object(struct drm_device *dev);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
 uint32_t
+i915_gem_get_unfenced_gtt_size(struct drm_i915_gem_object *obj);
+uint32_t
 i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj);
 
 /* i915_gem_gtt.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0012061..3e483c5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1427,6 +1427,35 @@ i915_gem_get_gtt_alignment(struct drm_i915_gem_object *obj)
 }
 
 /**
+ * i915_gem_get_unfenced_gtt_size - return required GTT size for an
+ *				    unfenced object
+ * @obj: object to check
+ *
+ * Return the required GTT size for an object, only taking into account
+ * unfenced tiled surface requirements.
+ */
+uint32_t
+i915_gem_get_unfenced_gtt_size(struct drm_i915_gem_object *obj)
+{
+	u32 unfenced_alignment;
+
+	/*
+	 * Current userspace will attempt to overallocate a bo so that it
+	 * can be reused with another surface and so its size is unlikely
+	 * to be an exact number of tile rows - but it promised never to
+	 * access beyond the end of the last complete row.
+	 *
+	 * gen2 has a futher restriction, in that it requires an even number
+	 * of tiles rows. Userspace was not aware of this until recently
+	 * and so violated its promise to always allocate enough pages
+	 * for the hardware. In reply, we now always round up the GTT
+	 * allocation to the next [even] tile row.
+	 */
+	unfenced_alignment = i915_gem_get_unfenced_gtt_alignment(obj);
+	return ALIGN(obj->base.size, unfenced_alignment);
+}
+
+/**
  * i915_gem_get_unfenced_gtt_alignment - return required GTT alignment for an
  *					 unfenced object
  * @obj: object to check
@@ -2744,7 +2773,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_mm_node *free_space;
 	gfp_t gfpmask = __GFP_NORETRY | __GFP_NOWARN;
-	u32 size, fence_size, fence_alignment, unfenced_alignment;
+	u32 size, fence_size, fence_alignment;
+	u32 unfenced_size, unfenced_alignment;
 	bool mappable, fenceable;
 	int ret;
 
@@ -2755,6 +2785,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	fence_size = i915_gem_get_gtt_size(obj);
 	fence_alignment = i915_gem_get_gtt_alignment(obj);
+	unfenced_size = i915_gem_get_unfenced_gtt_size(obj);
 	unfenced_alignment = i915_gem_get_unfenced_gtt_alignment(obj);
 
 	if (alignment == 0)
@@ -2765,12 +2796,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -EINVAL;
 	}
 
-	size = map_and_fenceable ? fence_size : obj->base.size;
+	size = map_and_fenceable ? fence_size : unfenced_size;
 
 	/* If the object is bigger than the entire aperture, reject it early
 	 * before evicting everything in a vain attempt to find space.
 	 */
-	if (obj->base.size >
+	if (size >
 	    (map_and_fenceable ? dev_priv->mm.gtt_mappable_end : dev_priv->mm.gtt_total)) {
 		DRM_ERROR("Attempting to bind an object larger than the aperture\n");
 		return -E2BIG;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 281ad3d..e4d5c58 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -349,7 +349,10 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		if (!obj->map_and_fenceable) {
 			u32 unfenced_alignment =
 				i915_gem_get_unfenced_gtt_alignment(obj);
-			if (obj->gtt_offset & (unfenced_alignment - 1))
+			u32 unfenced_size =
+				i915_gem_get_unfenced_gtt_size(obj);
+			if (obj->gtt_space->size < unfenced_size ||
+			    obj->gtt_offset & (unfenced_alignment - 1))
 				ret = i915_gem_object_unbind(obj);
 		}
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row
  2011-03-25  9:36 [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row Chris Wilson
@ 2011-03-25 12:16 ` Chris Wilson
  2011-03-26  8:52   ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2011-03-25 12:16 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

The kernel has always been lax in enforcing size restrictions upon the
buffers it is handed to by userspace, and userspace has always tried to
overallocate its buffers.

However, userspace made a mistake and failed to follow the undocumented
requirement that gen2 hardware requires its allocations to be rounded up
to the next even tile row. Rampant corruption on the bottom on images on
i8xx devices ensued. Userspace was fixed not to do that again, and now
we teach the kernel not to be so trusting of userspace.

We ensure that an allocated region within the GTT matches the proposed
usage restrictions. For fenced buffers on old hardware, this means
rounding up the allocation to the next power of two size and aligning it
to that size. For unfenced buffers, we need to ensure that the start and
end of the allocation is aligned to an even tile row.

v2: Apply the allocation fixup to all devices

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Everyone with an i8xx device
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    2 +
 drivers/gpu/drm/i915/i915_gem.c        |   66 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_tiling.c |    5 ++-
 3 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 787c969..359ddce 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1183,6 +1183,8 @@ void i915_gem_free_all_phys_object(struct drm_device *dev);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
 uint32_t
+i915_gem_get_unfenced_gtt_size(struct drm_i915_gem_object *obj);
+uint32_t
 i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj);
 
 /* i915_gem_gtt.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0012061..9144ff6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1426,6 +1426,55 @@ i915_gem_get_gtt_alignment(struct drm_i915_gem_object *obj)
 	return i915_gem_get_gtt_size(obj);
 }
 
+static u32 i915_gem_object_tiled_row_size(struct drm_i915_gem_object *obj)
+{
+	struct drm_device *dev = obj->base.dev;
+	int tile_height;
+
+	if (IS_GEN2(dev) ||
+	    (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
+		tile_height = 32;
+	else
+		tile_height = 8;
+
+	return tile_height * obj->stride;
+}
+
+/**
+ * i915_gem_get_unfenced_gtt_size - return required GTT size for an
+ *				    unfenced object
+ * @obj: object to check
+ *
+ * Return the required GTT size for an object, only taking into account
+ * unfenced tiled surface requirements.
+ */
+uint32_t
+i915_gem_get_unfenced_gtt_size(struct drm_i915_gem_object *obj)
+{
+	u32 unfenced_alignment;
+
+	if (obj->tiling_mode == I915_TILING_NONE)
+		return obj->base.size;
+
+	/*
+	 * Current userspace will attempt to overallocate a bo so that it
+	 * can be reused with another surface and so its size is unlikely
+	 * to be an exact number of tile rows - but it promised never to
+	 * access beyond the end of the last complete row.
+	 *
+	 * gen2 has a further restriction, in that it requires an even number
+	 * of tiles rows. Userspace was not aware of this until recently
+	 * and so violated its promise to always allocate enough pages
+	 * for the hardware. In reply, we now always round up the GTT
+	 * allocation to the next [even] tile row.
+	 */
+	unfenced_alignment = i915_gem_object_tiled_row_size(obj);
+	if (IS_GEN2(obj->base.dev))
+		unfenced_alignment *= 2;
+
+	return ALIGN(obj->base.size, unfenced_alignment);
+}
+
 /**
  * i915_gem_get_unfenced_gtt_alignment - return required GTT alignment for an
  *					 unfenced object
@@ -1438,7 +1487,6 @@ uint32_t
 i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
-	int tile_height;
 
 	/*
 	 * Minimum alignment is 4k (GTT page size) for sane hw.
@@ -1452,13 +1500,7 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj)
 	 * edge of an even tile row (where tile rows are counted as if the bo is
 	 * placed in a fenced gtt region).
 	 */
-	if (IS_GEN2(dev) ||
-	    (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
-		tile_height = 32;
-	else
-		tile_height = 8;
-
-	return tile_height * obj->stride * 2;
+	return 2 * i915_gem_object_tiled_row_size(obj);
 }
 
 int
@@ -2744,7 +2786,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_mm_node *free_space;
 	gfp_t gfpmask = __GFP_NORETRY | __GFP_NOWARN;
-	u32 size, fence_size, fence_alignment, unfenced_alignment;
+	u32 size, fence_size, fence_alignment;
+	u32 unfenced_size, unfenced_alignment;
 	bool mappable, fenceable;
 	int ret;
 
@@ -2755,6 +2798,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	fence_size = i915_gem_get_gtt_size(obj);
 	fence_alignment = i915_gem_get_gtt_alignment(obj);
+	unfenced_size = i915_gem_get_unfenced_gtt_size(obj);
 	unfenced_alignment = i915_gem_get_unfenced_gtt_alignment(obj);
 
 	if (alignment == 0)
@@ -2765,12 +2809,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -EINVAL;
 	}
 
-	size = map_and_fenceable ? fence_size : obj->base.size;
+	size = map_and_fenceable ? fence_size : unfenced_size;
 
 	/* If the object is bigger than the entire aperture, reject it early
 	 * before evicting everything in a vain attempt to find space.
 	 */
-	if (obj->base.size >
+	if (size >
 	    (map_and_fenceable ? dev_priv->mm.gtt_mappable_end : dev_priv->mm.gtt_total)) {
 		DRM_ERROR("Attempting to bind an object larger than the aperture\n");
 		return -E2BIG;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 281ad3d..e4d5c58 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -349,7 +349,10 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		if (!obj->map_and_fenceable) {
 			u32 unfenced_alignment =
 				i915_gem_get_unfenced_gtt_alignment(obj);
-			if (obj->gtt_offset & (unfenced_alignment - 1))
+			u32 unfenced_size =
+				i915_gem_get_unfenced_gtt_size(obj);
+			if (obj->gtt_space->size < unfenced_size ||
+			    obj->gtt_offset & (unfenced_alignment - 1))
 				ret = i915_gem_object_unbind(obj);
 		}
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row
  2011-03-25 12:16 ` Chris Wilson
@ 2011-03-26  8:52   ` Chris Wilson
  2011-03-26  9:20     ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2011-03-26  8:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

On Fri, 25 Mar 2011 12:16:25 +0000, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> We ensure that an allocated region within the GTT matches the proposed
> usage restrictions. For fenced buffers on old hardware, this means
> rounding up the allocation to the next power of two size and aligning it
> to that size. For unfenced buffers, we need to ensure that the start and
> end of the allocation is aligned to an even tile row.
> 
> v2: Apply the allocation fixup to all devices

I verified that it has no substantial impact on performance of gen4+
devices using cairo-gl/xlib. (Since cairo likes to exercise surface
creation and reuse a lot.)

However, I'm not sure if this truly prevents the corruption on i8xx with
2.14.0. Can somebody break out an old machine and test?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row
  2011-03-26  8:52   ` Chris Wilson
@ 2011-03-26  9:20     ` Daniel Vetter
  2011-03-26  9:43       ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2011-03-26  9:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Sat, Mar 26, 2011 at 08:52:31AM +0000, Chris Wilson wrote:
> However, I'm not sure if this truly prevents the corruption on i8xx with
> 2.14.0. Can somebody break out an old machine and test?

It won't (assuming I correctly diagnosed the problem): Corruptions happen
because i8xx uses copies of the scratch page in the lower-left corner.
Only extending the size of the allocation prevents that. What this patch
does though is preventing corruptions in the immediately following bo.
Dunno whether that's worth to fix given that people rather quickly
complain about visual corruptions, but seem to somewhat accept crashy i8xx
systems as a fact of life.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row
  2011-03-26  9:20     ` Daniel Vetter
@ 2011-03-26  9:43       ` Chris Wilson
  2011-03-26 14:22         ` [PATCH] drm/i915: fix relaxed tiling on gen2 Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2011-03-26  9:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Sat, 26 Mar 2011 10:20:21 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Mar 26, 2011 at 08:52:31AM +0000, Chris Wilson wrote:
> > However, I'm not sure if this truly prevents the corruption on i8xx with
> > 2.14.0. Can somebody break out an old machine and test?
> 
> It won't (assuming I correctly diagnosed the problem): Corruptions happen
> because i8xx uses copies of the scratch page in the lower-left corner.

Yeah, I couldn't explain it any other way, but I had to give the simple
fix a shot. Oh well. (And I still can't explain how it is *using* the
results of the out-of-bounds sampling unless our assumptions about tiling
on i8xx are fundamentally flawed. Later specs make it clear that sampling
may occur outside of the texture, but the results are automatically
culled.)

> Only extending the size of the allocation prevents that. What this patch
> does though is preventing corruptions in the immediately following bo.
> Dunno whether that's worth to fix given that people rather quickly
> complain about visual corruptions, but seem to somewhat accept crashy i8xx
> systems as a fact of life.

Eventually yes, we shouldn't let userspace easily foul up the hardware or
access information belonging to others. (I know we currently have lots of
shortcomings in just how far we can go here...) If we can provide more
information to the kernel, we can be even smarter about GTT allocations
versus physical page allocations. That's a pipe dream.

So, remaining options:

1) Convince everyone that it really is the *new* userspace code for 2.6.38
that is broken and they should just upgrade the buggy packages.

2) Disable RELAXED_FENCING for gen2 and let them continue to suffer. After
all, nobody has seemed to notice the performance regressions on i8xx...

3) Pad the allocation in set_tiling?

4) Introduce a new ioctl to provide more details about the surface tiling
to the kernel that can sanity check the allocation for the intended usage
and disable RELAXED_FENCING for the old set_tiling ioctl?

5) Something completely different.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] drm/i915: fix relaxed tiling on gen2
  2011-03-26  9:43       ` Chris Wilson
@ 2011-03-26 14:22         ` Daniel Vetter
  2011-03-26 18:23           ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2011-03-26 14:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.

Userspace was broken and assumed 8 rows. Complain with a printk_once
and disallow such tiling requests.

Also experiments and randoms cues from i81x docs suggest that y-tiling
doesn't work with fences (it's already known to be broken on the blitter).
Original idea seems to be to use y-tiling for textures with a manually
tiling upload path. Hence also disallow y-tiling - current userspace
doesn't use it at all, anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
IMO a -next patch. If people keep on complaining that .2.6.38 broke their
machines we can backport the printk_once; return -EINVAL hunk.
-Daniel

 drivers/gpu/drm/i915/i915_gem.c        |    5 +++--
 drivers/gpu/drm/i915/i915_gem_tiling.c |    9 +++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4554b2f..bc96e70 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1452,8 +1452,9 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj)
 	 * edge of an even tile row (where tile rows are counted as if the bo is
 	 * placed in a fenced gtt region).
 	 */
-	if (IS_GEN2(dev) ||
-	    (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
+	if (IS_GEN2(dev))
+		tile_height = 16;
+	else if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 		tile_height = 32;
 	else
 		tile_height = 8;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 281ad3d..693e6e8 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -190,6 +190,9 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode)
 	if (tiling_mode == I915_TILING_NONE)
 		return true;
 
+	if (IS_GEN2(dev) && tiling_mode == I915_TILING_Y)
+		return false;
+
 	if (IS_GEN2(dev) ||
 	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
 		tile_width = 128;
@@ -226,6 +229,12 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode)
 	if (stride < tile_width)
 		return false;
 
+	if (IS_GEN2(dev) && (size / tile_width) % 16 != 0) {
+		printk_once("broken userspace detected, "
+			    "please upgrade libdrm and xf86-video-intel\n");
+		return false;
+	}
+
 	if (stride & (stride - 1))
 		return false;
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: fix relaxed tiling on gen2
  2011-03-26 14:22         ` [PATCH] drm/i915: fix relaxed tiling on gen2 Daniel Vetter
@ 2011-03-26 18:23           ` Chris Wilson
  2011-03-26 19:55             ` [PATCH] drm/i915: fix relaxed tiling on gen2 v2 Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2011-03-26 18:23 UTC (permalink / raw)
  Cc: Daniel Vetter, intel-gfx

On Sat, 26 Mar 2011 15:22:57 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.

You know thats what it says in the gen2 docs as well under FENCE. :(
 
> Userspace was broken and assumed 8 rows. Complain with a printk_once
> and disallow such tiling requests.

However, we know that userspace only requests correctly aligned regions by
accident due to the rounding up to the next bucket size in libdrm.

> Also experiments and randoms cues from i81x docs suggest that y-tiling
> doesn't work with fences (it's already known to be broken on the blitter).
> Original idea seems to be to use y-tiling for textures with a manually
> tiling upload path. Hence also disallow y-tiling - current userspace
> doesn't use it at all, anyway.

gen2 docs say that the FENCE upload via the CPU can handle Y-tiling on
all chipsets but 855GM. Go figure. (To handle Y-tiling uploads on 855GM it
says to BLT from X to Y...)

In which case it is only an error in i915_gem_fault() for 855GM &
Y-tiling.

So whilst I think we've found the right bug, I don't quite think we've got
the right patch yet.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] drm/i915: fix relaxed tiling on gen2 v2
  2011-03-26 18:23           ` Chris Wilson
@ 2011-03-26 19:55             ` Daniel Vetter
  2011-03-26 20:44               ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2011-03-26 19:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.

Userspace was broken and assumed 8 rows. Chris Wilson noted that the
kernel unfortunately can't reliable check that because libdrm rounds
up the size to the next bucket.

He also clarified (by checking internal docs) that only i855GM has
broken y-tiled fences for cpu access (guess what hw I own). Hence
move the check to deny y-tiled access from set_tiling to gem_fault
and restrict it with IS_I85X. According to docs, upload _should_
work to y-tiled textures with the blitter on all gen2 chips.

Checking this in create_mmap_offset does not work due to libdrm
bo reuse.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4554b2f..1483107 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1210,6 +1210,11 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
+	/* i855gm has broken y-tiled fences for cpu access, blitter should work,
+	 * though. */
+	if (IS_I85X(dev) && obj->tiling_mode == I915_TILING_Y)
+		return VM_FAULT_SIGBUS;
+
 	/* Now bind it into the GTT if needed */
 	if (!obj->map_and_fenceable) {
 		ret = i915_gem_object_unbind(obj);
@@ -1452,8 +1457,9 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj)
 	 * edge of an even tile row (where tile rows are counted as if the bo is
 	 * placed in a fenced gtt region).
 	 */
-	if (IS_GEN2(dev) ||
-	    (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
+	if (IS_GEN2(dev))
+		tile_height = 16;
+	else if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 		tile_height = 32;
 	else
 		tile_height = 8;
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: fix relaxed tiling on gen2 v2
  2011-03-26 19:55             ` [PATCH] drm/i915: fix relaxed tiling on gen2 v2 Daniel Vetter
@ 2011-03-26 20:44               ` Chris Wilson
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2011-03-26 20:44 UTC (permalink / raw)
  Cc: Daniel Vetter, intel-gfx

On Sat, 26 Mar 2011 20:55:15 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.

Nice patch, marries the code to the documentation (afaict). Should split
it into its two distinct fixes though. Both -next material, perhaps. And
thankfully explains the 2*8 fixup we needed for userspace, but note it is
also wrong for Y-tiling on gen2.

So, afaics, there is no way with the current interface the kernel can stop
userspace from shooting itself in the foot.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-03-26 20:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-25  9:36 [PATCH] drm/i915: Round-up GTT allocations for unfenced surfaces to the next tile row Chris Wilson
2011-03-25 12:16 ` Chris Wilson
2011-03-26  8:52   ` Chris Wilson
2011-03-26  9:20     ` Daniel Vetter
2011-03-26  9:43       ` Chris Wilson
2011-03-26 14:22         ` [PATCH] drm/i915: fix relaxed tiling on gen2 Daniel Vetter
2011-03-26 18:23           ` Chris Wilson
2011-03-26 19:55             ` [PATCH] drm/i915: fix relaxed tiling on gen2 v2 Daniel Vetter
2011-03-26 20:44               ` Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).