* i915 completely broken with KMS off
@ 2011-01-18 11:57 Herbert Xu
  2011-01-18 17:16 ` Jesse Barnes
  2011-01-18 17:24 ` [PATCH] drm/i915: Correctly test for an initialised ring for dri1 Chris Wilson
  0 siblings, 2 replies; 11+ messages in thread
From: Herbert Xu @ 2011-01-18 11:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Zou Nan hai, Xiang Hai hao, Eric Anholt

Hi:

I just upgraded my kernel on a machine with i915 and it has
started crashing constantly with messages like this:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 10c441067 PUD 1185e5067 PMD 0
Oops: 0010 [#1] PREEMPT SMP
last sysfs file: /sys/class/dmi/id/chassis_asset_tag
CPU 3
Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device processor parport_pc thermal snd thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801 rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod ata_piix libata scsi_mod unix
Jan 18 15:49:29 lithui kernel:
Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product Name
RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
RSP: 0018:ffff8801150d1d40  EFLAGS: 00010202
RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704
RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800
RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444
R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800
R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8
FS:  00007f1038d456e0(0000) GS:ffff880001780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task ffff88011b016e40)
Stack:
ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000
<0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008
<0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8
Call Trace:
[<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915]
[<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915]
[<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm]
[<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915]
[<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120
[<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0
[<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550
[<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90
[<ffffffff810edb19>] sys_ioctl+0x99/0xa0
[<ffffffff810024ab>] system_call_fastpath+0x16/0x1b
Code:  Bad RIP value.
RIP  [<(null)>] (null)
RSP <ffff8801150d1d40>
CR2: 0000000000000000
---[ end trace 20c59dd8ff2fd5b6 ]---

The cause is this changeset:

commit 8187a2b70e34c727a06617441f74f202b6fefaf9
Author: Zou Nan hai <nanhai.zou@intel.com>
Date:   Fri May 21 09:08:55 2010 +0800

    drm/i915: introduce intel_ring_buffer structure (V2)

This changeset completely broke i915 without KMS because the new
intel_ring_buffer structure is never initialised with KMS off,
unless the ioctl I915_GEM_ENTERVT is invoked.

So without invoking that ioctl, the kernel will always crash
whenever the i915 driver dereferences pointers on the render_ring
data structure.
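
A minimal sketch of that failure mode, in plain userspace C rather
than kernel code. The toy_ring structure, its add_request hook and
toy_emit_irq() are hypothetical stand-ins, not the i915 API. A
zero-filled structure has NULL function pointers, so calling through
one jumps to address 0, which would be consistent with the
"RIP: 0010:[<0000000000000000>]" line in the oops. The sketch refuses
the call instead of making it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring with per-ring function hooks. */
struct toy_ring {
	int (*add_request)(struct toy_ring *ring, unsigned int *seqno);
	unsigned int next_seqno;
};

/* One possible hook: hand out increasing sequence numbers. */
static int toy_add_request(struct toy_ring *ring, unsigned int *seqno)
{
	*seqno = ring->next_seqno++;
	return 0;
}

/* Fill in the hooks; this is the step that never runs in the report. */
static void toy_ring_init(struct toy_ring *ring)
{
	ring->add_request = toy_add_request;
	ring->next_seqno = 1;
}

/*
 * Emit a request through the hook. Without the NULL test, an
 * uninitialised ring would turn this into an indirect call to
 * address 0.
 */
static int toy_emit_irq(struct toy_ring *ring, unsigned int *seqno)
{
	if (ring->add_request == NULL)
		return -EINVAL;
	return ring->add_request(ring, seqno);
}
```

With a ring declared as "struct toy_ring ring = {0};", toy_emit_irq()
returns -EINVAL until toy_ring_init() has been called on it.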

Or is this a hint to get people to convert to KMS :)

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: i915 completely broken with KMS off
  2011-01-18 11:57 i915 completely broken with KMS off Herbert Xu
@ 2011-01-18 17:16 ` Jesse Barnes
  2011-01-18 17:24 ` [PATCH] drm/i915: Correctly test for an initialised ring for dri1 Chris Wilson
  1 sibling, 0 replies; 11+ messages in thread
From: Jesse Barnes @ 2011-01-18 17:16 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linux Kernel Mailing List, Zou Nan hai, Xiang Hai hao,
	Eric Anholt, Chris Wilson

On Tue, 18 Jan 2011 22:57:01 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Hi:
> 
> I just upgraded my kernel on a machine with i915 and it has
> started crashing constantly with messages like this:
[snip]
> The cause is this changeset:
> 
> commit 8187a2b70e34c727a06617441f74f202b6fefaf9
> Author: Zou Nan hai <nanhai.zou@intel.com>
> Date:   Fri May 21 09:08:55 2010 +0800
> 
>     drm/i915: introduce intel_ring_buffer structure (V2)
> 
> This changeset completely broke i915 without KMS because the new
> intel_ring_buffer structure is never initialised with KMS off,
> unless the ioctl I915_GEM_ENTERVT is invoked.
> 
> So without invoking that ioctl, the kernel will always crash
> whenever the i915 driver dereferences pointers on the render_ring
> data structure.
> 
> Or is this a hint to get people to convert to KMS :)

Heh, well that would be one option for you. :)

The breakage was unintentional though; I imagine Chris will have an
idea of how to fix it quickly for you.

-- 
Jesse Barnes, Intel Open Source Technology Center


* [PATCH] drm/i915: Correctly test for an initialised ring for dri1
  2011-01-18 11:57 i915 completely broken with KMS off Herbert Xu
  2011-01-18 17:16 ` Jesse Barnes
@ 2011-01-18 17:24 ` Chris Wilson
  2011-01-19  3:24   ` Herbert Xu
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2011-01-18 17:24 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Jesse Barnes, linux-kernel, Chris Wilson

The dri1 code was broken by the introduction of multiple ringbuffers in
that it failed to correctly check for the initialisation of those before
use. The result was OOPSes such as:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 10c441067 PUD 1185e5067 PMD 0
Oops: 0010 [#1] PREEMPT SMP
last sysfs file: /sys/class/dmi/id/chassis_asset_tag
CPU 3
Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit
cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev
usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e
usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm
snd_seq snd_timer snd_seq_device processor parport_pc thermal snd
thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon
rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801
rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse
intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod
ata_piix libata scsi_mod unix
Jan 18 15:49:29 lithui kernel:
Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product
Name
RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
RSP: 0018:ffff8801150d1d40  EFLAGS: 00010202
RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704
RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800
RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444
R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800
R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8
FS:  00007f1038d456e0(0000) GS:ffff880001780000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task
ffff88011b016e40)
Stack:
ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000
<0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008
<0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8
Call Trace:
[<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915]
[<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915]
[<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm]
[<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915]
[<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120
[<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0
[<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550
[<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90
[<ffffffff810edb19>] sys_ioctl+0x99/0xa0
[<ffffffff810024ab>] system_call_fastpath+0x16/0x1b
Code:  Bad RIP value.
RIP  [<(null)>] (null)
RSP <ffff8801150d1d40>
CR2: 0000000000000000

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_dma.c |   20 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_drv.h |   12 ------------
 drivers/gpu/drm/i915/i915_irq.c |    4 ++--
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 76f2df7..dff11ebe 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -583,7 +583,10 @@ static int i915_flush_ioctl(struct drm_device *dev, void *data,
 {
 	int ret;
 
-	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
+	if (LP_RING(dev->dev_private)->obj == NULL)
+		return -EINVAL;
+
+	LOCK_TEST_WITH_RETURN(dev, file_priv);
 
 	mutex_lock(&dev->struct_mutex);
 	ret = i915_quiescent(dev);
@@ -603,6 +606,9 @@ static int i915_batchbuffer(struct drm_device *dev, void *data,
 	int ret;
 	struct drm_clip_rect *cliprects = NULL;
 
+	if (LP_RING(dev->dev_private)->obj == NULL)
+		return -EINVAL;
+
 	if (!dev_priv->allow_batchbuffer) {
 		DRM_ERROR("Batchbuffer ioctl disabled\n");
 		return -EINVAL;
@@ -611,7 +617,7 @@ static int i915_batchbuffer(struct drm_device *dev, void *data,
 	DRM_DEBUG_DRIVER("i915 batchbuffer, start %x used %d cliprects %d\n",
 			batch->start, batch->used, batch->num_cliprects);
 
-	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
+	LOCK_TEST_WITH_RETURN(dev, file_priv);
 
 	if (batch->num_cliprects < 0)
 		return -EINVAL;
@@ -657,10 +663,13 @@ static int i915_cmdbuffer(struct drm_device *dev, void *data,
 	void *batch_data;
 	int ret;
 
+	if (LP_RING(dev->dev_private)->obj == NULL)
+		return -EINVAL;
+
 	DRM_DEBUG_DRIVER("i915 cmdbuffer, buf %p sz %d cliprects %d\n",
 			cmdbuf->buf, cmdbuf->sz, cmdbuf->num_cliprects);
 
-	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
+	LOCK_TEST_WITH_RETURN(dev, file_priv);
 
 	if (cmdbuf->num_cliprects < 0)
 		return -EINVAL;
@@ -716,9 +725,12 @@ static int i915_flip_bufs(struct drm_device *dev, void *data,
 {
 	int ret;
 
+	if (LP_RING(dev->dev_private)->obj == NULL)
+		return -EINVAL;
+
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
-	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
+	LOCK_TEST_WITH_RETURN(dev, file_priv);
 
 	mutex_lock(&dev->struct_mutex);
 	ret = i915_dispatch_flip(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 52ceae5..206e781 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1297,18 +1297,6 @@ extern void intel_display_print_error_state(struct seq_file *m,
 #define ADVANCE_LP_RING() \
 	intel_ring_advance(LP_RING(dev_priv))
 
-/**
- * Lock test for when it's just for synchronization of ring access.
- *
- * In that case, we don't need to do it when GEM is initialized as nobody else
- * has access to the ring.
- */
-#define RING_LOCK_TEST_WITH_RETURN(dev, file) do {			\
-	if (LP_RING(dev->dev_private)->obj == NULL)			\
-		LOCK_TEST_WITH_RETURN(dev, file);			\
-} while (0)
-
-
 #define __i915_read(x, y) \
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	u##x val = read##y(dev_priv->regs + reg); \
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 46d649b..19a58bd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1302,12 +1302,12 @@ int i915_irq_emit(struct drm_device *dev, void *data,
 	drm_i915_irq_emit_t *emit = data;
 	int result;
 
-	if (!dev_priv || !LP_RING(dev_priv)->virtual_start) {
+	if (!dev_priv || !LP_RING(dev_priv)->obj) {
 		DRM_ERROR("called with no initialization\n");
 		return -EINVAL;
 	}
 
-	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
+	LOCK_TEST_WITH_RETURN(dev, file_priv);
 
 	mutex_lock(&dev->struct_mutex);
 	result = i915_emit_irq(dev);
-- 
1.7.2.3



* Re: [PATCH] drm/i915: Correctly test for an initialised ring for dri1
  2011-01-18 17:24 ` [PATCH] drm/i915: Correctly test for an initialised ring for dri1 Chris Wilson
@ 2011-01-19  3:24   ` Herbert Xu
  2011-01-19  9:53     ` [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2011-01-19  3:24 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jesse Barnes, linux-kernel

On Tue, Jan 18, 2011 at 05:24:09PM +0000, Chris Wilson wrote:
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 46d649b..19a58bd 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1302,12 +1302,12 @@ int i915_irq_emit(struct drm_device *dev, void *data,
>  	drm_i915_irq_emit_t *emit = data;
>  	int result;
>  
> -	if (!dev_priv || !LP_RING(dev_priv)->virtual_start) {
> +	if (!dev_priv || !LP_RING(dev_priv)->obj) {
>  		DRM_ERROR("called with no initialization\n");
>  		return -EINVAL;
>  	}
>  
> -	RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
> +	LOCK_TEST_WITH_RETURN(dev, file_priv);

While this would surely fix the crash here, this appears to take
away existing functionality with KMS off.
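
To make the concern concrete, here is a small userspace sketch. It
assumes (this is an assumption about the driver, not something
verified here) that the legacy DRI path maps the ring and sets
virtual_start but leaves obj NULL. The toy_ring type and the helper
names are hypothetical, not driver code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the two fields tested in i915_irq_emit(). */
struct toy_ring {
	void *virtual_start;	/* CPU mapping of the ring */
	void *obj;		/* backing object, assumed NULL without GEM */
};

/* Test used before the patch: is the ring mapped at all? */
static int toy_check_mapped(const struct toy_ring *ring)
{
	return ring->virtual_start ? 0 : -EINVAL;
}

/* Test used after the patch: is there a backing object? */
static int toy_check_obj(const struct toy_ring *ring)
{
	return ring->obj ? 0 : -EINVAL;
}
```

For a ring with virtual_start set and obj NULL, toy_check_mapped()
returns 0 while toy_check_obj() returns -EINVAL, so under the stated
assumption such a client would now be refused rather than served.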

We usually don't require people to upgrade user-space along
with the kernel in order to preserve existing functionality.
Do different rules apply in the DRM area?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths
  2011-01-19  3:24   ` Herbert Xu
@ 2011-01-19  9:53     ` Chris Wilson
  2011-01-19 10:31       ` Herbert Xu
  2011-01-20  7:05       ` Herbert Xu
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Wilson @ 2011-01-19  9:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Jesse Barnes, linux-kernel, Chris Wilson

We weren't setting up the vfunc table when initialising the old DRI
ringbuffer, leading to such OOPSes as:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 10c441067 PUD 1185e5067 PMD 0
Oops: 0010 [#1] PREEMPT SMP
last sysfs file: /sys/class/dmi/id/chassis_asset_tag
CPU 3
Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit
cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev
usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e
usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm
snd_seq snd_timer snd_seq_device processor parport_pc thermal snd
thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon
rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801
rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse
intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod
ata_piix libata scsi_mod unix
Jan 18 15:49:29 lithui kernel:
Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product
Name
RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
RSP: 0018:ffff8801150d1d40  EFLAGS: 00010202
RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704
RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800
RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444
R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800
R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8
FS:  00007f1038d456e0(0000) GS:ffff880001780000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task
ffff88011b016e40)
Stack:
ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000
<0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008
<0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8
Call Trace:
[<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915]
[<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915]
[<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm]
[<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915]
[<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120
[<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0
[<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550
[<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90
[<ffffffff810edb19>] sys_ioctl+0x99/0xa0
[<ffffffff810024ab>] system_call_fastpath+0x16/0x1b
Code:  Bad RIP value.
RIP  [<(null)>] (null)
RSP <ffff8801150d1d40>
CR2: 0000000000000000

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_dma.c         |   25 +++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |   38 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 ++
 3 files changed, 48 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 76f2df7..2a45920 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -153,7 +153,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	int ret;
 
 	master_priv->sarea = drm_getsarea(dev);
 	if (master_priv->sarea) {
@@ -164,33 +164,22 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 	}
 
 	if (init->ring_size != 0) {
-		if (ring->obj != NULL) {
+		if (LP_RING(dev_priv)->obj != NULL) {
 			i915_dma_cleanup(dev);
 			DRM_ERROR("Client tried to initialize ringbuffer in "
 				  "GEM mode\n");
 			return -EINVAL;
 		}
 
-		ring->size = init->ring_size;
-
-		ring->map.offset = init->ring_start;
-		ring->map.size = init->ring_size;
-		ring->map.type = 0;
-		ring->map.flags = 0;
-		ring->map.mtrr = 0;
-
-		drm_core_ioremap_wc(&ring->map, dev);
-
-		if (ring->map.handle == NULL) {
+		ret = intel_render_ring_init_dri(dev,
+						 init->ring_start,
+						 init->ring_size);
+		if (ret) {
 			i915_dma_cleanup(dev);
-			DRM_ERROR("can not ioremap virtual address for"
-				  " ring buffer\n");
-			return -ENOMEM;
+			return ret;
 		}
 	}
 
-	ring->virtual_start = (void __force __iomem *)ring->map.handle;
-
 	dev_priv->cpp = init->cpp;
 	dev_priv->back_offset = init->back_offset;
 	dev_priv->front_offset = init->front_offset;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 03e3370..363f50b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1291,6 +1291,44 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	return intel_init_ring_buffer(dev, ring);
 }
 
+int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+
+	*ring = render_ring;
+	if (INTEL_INFO(dev)->gen >= 6) {
+		ring->add_request = gen6_add_request;
+		ring->irq_get = gen6_render_ring_get_irq;
+		ring->irq_put = gen6_render_ring_put_irq;
+	} else if (IS_GEN5(dev)) {
+		ring->add_request = pc_render_add_request;
+		ring->get_seqno = pc_render_get_seqno;
+	}
+
+	ring->dev = dev;
+	ring->size = size;
+	ring->effective_size = ring->size;
+	if (IS_I830(ring->dev))
+		ring->effective_size -= 128;
+
+	ring->map.offset = start;
+	ring->map.size = size;
+	ring->map.type = 0;
+	ring->map.flags = 0;
+	ring->map.mtrr = 0;
+
+	drm_core_ioremap_wc(&ring->map, dev);
+	if (ring->map.handle == NULL) {
+		DRM_ERROR("can not ioremap virtual address for"
+			  " ring buffer\n");
+		return -ENOMEM;
+	}
+
+	ring->virtual_start = (void __force __iomem *)ring->map.handle;
+	return 0;
+}
+
 int intel_init_bsd_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index d5911aa..525c04e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -187,4 +187,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev);
 u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
 void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
 
+/* DRI warts */
+int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.2.3
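
The approach in intel_render_ring_init_dri() above, copying a template
structure and then overriding hooks per hardware generation, can be
sketched in userspace C as follows. All names here (toy_ring,
toy_template, toy_ring_init_dri, the gen parameter) are hypothetical
and only mirror the shape of the patch.

```c
#include <stddef.h>

struct toy_ring {
	const char *name;
	unsigned int (*get_seqno)(struct toy_ring *ring);
	int (*add_request)(struct toy_ring *ring);
	unsigned int size;
	unsigned int effective_size;
};

/* Default hooks; the return values only identify which one ran. */
static unsigned int toy_get_seqno(struct toy_ring *ring)
{
	(void)ring;
	return 0;
}

static int toy_add_request(struct toy_ring *ring)
{
	(void)ring;
	return 0;
}

static int toy_gen6_add_request(struct toy_ring *ring)
{
	(void)ring;
	return 6;
}

/* Template carrying the default hooks. */
static const struct toy_ring toy_template = {
	.name = "render ring",
	.get_seqno = toy_get_seqno,
	.add_request = toy_add_request,
};

static void toy_ring_init_dri(struct toy_ring *ring, int gen,
			      unsigned int size)
{
	*ring = toy_template;	/* every hook gets a default first */
	if (gen >= 6)
		ring->add_request = toy_gen6_add_request;
	ring->size = size;
	ring->effective_size = size;
}
```

Starting from the template means no hook is left NULL, whichever
generation branch is taken. The cost is that the assignment overwrites
every other field of the destination as well.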



* Re: [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths
  2011-01-19  9:53     ` [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths Chris Wilson
@ 2011-01-19 10:31       ` Herbert Xu
  2011-01-20  7:05       ` Herbert Xu
  1 sibling, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2011-01-19 10:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jesse Barnes, linux-kernel

On Wed, Jan 19, 2011 at 09:53:44AM +0000, Chris Wilson wrote:
> We weren't setting up the vfunc table when initialising the old DRI
> ringbuffer, leading to such OOPSes as:
> 
> [snip]
> 
> Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Thank you Chris!

I'll try this patch tomorrow, but it sure looks like it should do the
trick.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths
  2011-01-19  9:53     ` [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths Chris Wilson
  2011-01-19 10:31       ` Herbert Xu
@ 2011-01-20  7:05       ` Herbert Xu
  2011-01-20  9:56         ` [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt Chris Wilson
  1 sibling, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2011-01-20  7:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jesse Barnes, linux-kernel

On Wed, Jan 19, 2011 at 09:53:44AM +0000, Chris Wilson wrote:
> We weren't setting up the vfunc table when initialising the old DRI
> ringbuffer, leading to such OOPSes as:

OK this fixes the original crash but adds a new one:

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
PGD 13d1067 PUD 13d2067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/class/net/lo/operstate
CPU 2
Modules linked in: snd_pcm_oss snd_mixer_oss vmnet parport_pc parport vmblock vmci vmmon i915 drm_kms_helper drm fb fbdev i2c_algo_bit cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mo]

Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #5 P5KPL-CM/System Product Name
RIP: 0010:[<ffffffffa041ee76>]  [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
RSP: 0000:ffff880001703e40  EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff880117071800 RCX: ffff880118f7c400
RDX: 000000007dffffc0 RSI: ffff880118f7c028 RDI: ffff880117071800
RBP: ffff880001703e70 R08: ffff88000170d460 R09: ffff880001712620
R10: 0000000000000000 R11: 0000000000000001 R12: ffff880118f7c000
R13: ffff880117071800 R14: 0000000000000000 R15: 000000000e41e9d8
FS:  0000000000000000(0000) GS:ffff880001700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: fffffffffffffff0 CR3: 00000000d83df000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 0, threadinfo ffff88011b6b2000, task ffff88011b67d5c0)
Stack:
 7dffffc000012600 ffff880117071800 ffff88011b6ac000 0000000000000102
<0> ffff880001703eb0 ffffffffa041ede0 ffff880001703ef0 ffffffff81046fad
<0> ffff88011b6b3fd8 ffff88011b6b3fd8 ffff88011b6adc20 ffff88011b6ad820
Call Trace:
 <IRQ>
 [<ffffffffa041ede0>] ? i915_hangcheck_elapsed+0x0/0x270 [i915]
 [<ffffffff81046fad>] run_timer_softirq+0x13d/0x260
 [<ffffffff81063657>] ? clockevents_program_event+0x57/0xa0
 [<ffffffff81041c76>] __do_softirq+0xa6/0x130
 [<ffffffff810032cc>] call_softirq+0x1c/0x30
 [<ffffffff81005375>] do_softirq+0x55/0x90
 [<ffffffff8104190d>] irq_exit+0x8d/0xb0
 [<ffffffff8101de8c>] smp_apic_timer_interrupt+0x6c/0xa0
 [<ffffffff81002d93>] apic_timer_interrupt+0x13/0x20
 <EOI>
 [<ffffffff8100b139>] ? mwait_idle+0x79/0x90
 [<ffffffff81001610>] ? enter_idle+0x20/0x30
 [<ffffffff81001689>] cpu_idle+0x69/0xc0
 [<ffffffff812cb19c>] start_secondary+0x183/0x1e7
Code: 8d 84 24 18 01 00 00 49 39 84 24 18 01 00 00 0f 84 cf 00 00 00 49 8b 85 68 03 00 00 49 8d 74 24 28 48 8b 80 20 01 00 00 4c 89 ef <8b> 58 f0 e8 42 5e 00 00 89 de 89 c7 e8 29 5e 00 00 84 c0 0f 85
RIP  [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
 RSP <ffff880001703e40>
CR2: fffffffffffffff0
---[ end trace a327d5ceef537f9e ]---

This is because the request_list heads were initialised by
i915_gem_load but are now zeroed again by the new
intel_render_ring_init_dri function.

I'm going to add the INIT_LIST_HEADs to init_dri now and see if
it survives longer.
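
A userspace sketch of that diagnosis, assuming a minimal doubly
linked list in the style of the kernel's struct list_head. The toy_
names are hypothetical. Copying a template over the whole structure
resets the list head to NULL pointers, so the head has to be
initialised again after the copy. The faulting instruction in the
Code: line, "<8b> 58 f0", reads from -0x10(%rax) with RAX at 0, which
would be consistent with walking such a zeroed list, although the
real structure layout is not shown in this thread.

```c
#include <stddef.h>

struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* An empty list points at itself, as with INIT_LIST_HEAD(). */
static void toy_init_list_head(struct toy_list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int toy_list_empty(const struct toy_list_head *head)
{
	return head->next == head;
}

struct toy_ring {
	int id;
	struct toy_list_head request_list;
};

/* Template with no list set up; its list pointers are NULL. */
static const struct toy_ring toy_template = { .id = 1 };

/* Copy the template, then re-initialise the list the copy wiped. */
static void toy_ring_init_dri(struct toy_ring *ring)
{
	*ring = toy_template;
	toy_init_list_head(&ring->request_list);
}
```

After a bare "ring = toy_template;" the request_list pointers are
NULL even if the list had been set up earlier; after
toy_ring_init_dri() the list is empty and safe to walk.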

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt
  2011-01-20  7:05       ` Herbert Xu
@ 2011-01-20  9:56         ` Chris Wilson
  2011-01-20 10:10           ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2011-01-20  9:56 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Jesse Barnes, linux-kernel, Chris Wilson

Hangcheck is only used by GEM and just OOPSes with incomplete DRI
configuration:

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
PGD 13d1067 PUD 13d2067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/class/net/lo/operstate
CPU 2
Modules linked in: snd_pcm_oss snd_mixer_oss vmnet parport_pc parport
vmblock vmci vmmon i915 drm_kms_helper drm fb fbdev i2c_algo_bit
cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mo]

Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #5 P5KPL-CM/System
Product Name
RIP: 0010:[<ffffffffa041ee76>]  [<ffffffffa041ee76>]
i915_hangcheck_elapsed+0x96/0x270 [i915]
RSP: 0000:ffff880001703e40  EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff880117071800 RCX: ffff880118f7c400
RDX: 000000007dffffc0 RSI: ffff880118f7c028 RDI: ffff880117071800
RBP: ffff880001703e70 R08: ffff88000170d460 R09: ffff880001712620
R10: 0000000000000000 R11: 0000000000000001 R12: ffff880118f7c000
R13: ffff880117071800 R14: 0000000000000000 R15: 000000000e41e9d8
FS:  0000000000000000(0000) GS:ffff880001700000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: fffffffffffffff0 CR3: 00000000d83df000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 0, threadinfo ffff88011b6b2000, task
ffff88011b67d5c0)
Stack:
 7dffffc000012600 ffff880117071800 ffff88011b6ac000 0000000000000102
<0> ffff880001703eb0 ffffffffa041ede0 ffff880001703ef0 ffffffff81046fad
<0> ffff88011b6b3fd8 ffff88011b6b3fd8 ffff88011b6adc20 ffff88011b6ad820
Call Trace:
 <IRQ>
 [<ffffffffa041ede0>] ? i915_hangcheck_elapsed+0x0/0x270 [i915]
 [<ffffffff81046fad>] run_timer_softirq+0x13d/0x260
 [<ffffffff81063657>] ? clockevents_program_event+0x57/0xa0
 [<ffffffff81041c76>] __do_softirq+0xa6/0x130
 [<ffffffff810032cc>] call_softirq+0x1c/0x30
 [<ffffffff81005375>] do_softirq+0x55/0x90
 [<ffffffff8104190d>] irq_exit+0x8d/0xb0
 [<ffffffff8101de8c>] smp_apic_timer_interrupt+0x6c/0xa0
 [<ffffffff81002d93>] apic_timer_interrupt+0x13/0x20
 <EOI>
 [<ffffffff8100b139>] ? mwait_idle+0x79/0x90
 [<ffffffff81001610>] ? enter_idle+0x20/0x30
 [<ffffffff81001689>] cpu_idle+0x69/0xc0
 [<ffffffff812cb19c>] start_secondary+0x183/0x1e7
Code: 8d 84 24 18 01 00 00 49 39 84 24 18 01 00 00 0f 84 cf 00 00 00 49
8b 85 68 03 00 00 49 8d 74 24 28 48 8b 80 20 01 00 00 4c 89 ef <8b> 58
f0 e8 42 5e 00 00 89 de 89 c7 e8 29 5e 00 00 84 c0 0f 85
RIP  [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
 RSP <ffff880001703e40>
CR2: fffffffffffffff0
---[ end trace a327d5ceef537f9e ]---

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_irq.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 46d649b..39ce40d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -348,8 +348,12 @@ static void notify_ring(struct drm_device *dev,
 			struct intel_ring_buffer *ring)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 seqno = ring->get_seqno(ring);
+	u32 seqno;
+
+	if (ring->obj == NULL)
+		return;
 
+	seqno = ring->get_seqno(ring);
 	trace_i915_gem_request_complete(dev, seqno);
 
 	ring->irq_seqno = seqno;
-- 
1.7.2.3



* Re: [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt
  2011-01-20  9:56         ` [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt Chris Wilson
@ 2011-01-20 10:10           ` Herbert Xu
  2011-01-20 11:00             ` [PATCH] drm/i915: Don't " Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2011-01-20 10:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jesse Barnes, linux-kernel

On Thu, Jan 20, 2011 at 09:56:01AM +0000, Chris Wilson wrote:
> Hangcheck is only used by GEM and just OOPSes with incomplete DRI
> configuration:
> 
> BUG: unable to handle kernel paging request at fffffffffffffff0
> IP: [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
> PGD 13d1067 PUD 13d2067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP
> last sysfs file: /sys/class/net/lo/operstate
> CPU 2
> Modules linked in: snd_pcm_oss snd_mixer_oss vmnet parport_pc parport
> vmblock vmci vmmon i915 drm_kms_helper drm fb fbdev i2c_algo_bit
> cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
> nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mo]
> 
> Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #5 P5KPL-CM/System
> Product Name
> RIP: 0010:[<ffffffffa041ee76>]  [<ffffffffa041ee76>]
> i915_hangcheck_elapsed+0x96/0x270 [i915]
> RSP: 0000:ffff880001703e40  EFLAGS: 00010217
> RAX: 0000000000000000 RBX: ffff880117071800 RCX: ffff880118f7c400
> RDX: 000000007dffffc0 RSI: ffff880118f7c028 RDI: ffff880117071800
> RBP: ffff880001703e70 R08: ffff88000170d460 R09: ffff880001712620
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff880118f7c000
> R13: ffff880117071800 R14: 0000000000000000 R15: 000000000e41e9d8
> FS:  0000000000000000(0000) GS:ffff880001700000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: fffffffffffffff0 CR3: 00000000d83df000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kworker/0:1 (pid: 0, threadinfo ffff88011b6b2000, task
> ffff88011b67d5c0)
> Stack:
>  7dffffc000012600 ffff880117071800 ffff88011b6ac000 0000000000000102
> <0> ffff880001703eb0 ffffffffa041ede0 ffff880001703ef0 ffffffff81046fad
> <0> ffff88011b6b3fd8 ffff88011b6b3fd8 ffff88011b6adc20 ffff88011b6ad820
> Call Trace:
>  <IRQ>
>  [<ffffffffa041ede0>] ? i915_hangcheck_elapsed+0x0/0x270 [i915]
>  [<ffffffff81046fad>] run_timer_softirq+0x13d/0x260
>  [<ffffffff81063657>] ? clockevents_program_event+0x57/0xa0
>  [<ffffffff81041c76>] __do_softirq+0xa6/0x130
>  [<ffffffff810032cc>] call_softirq+0x1c/0x30
>  [<ffffffff81005375>] do_softirq+0x55/0x90
>  [<ffffffff8104190d>] irq_exit+0x8d/0xb0
>  [<ffffffff8101de8c>] smp_apic_timer_interrupt+0x6c/0xa0
>  [<ffffffff81002d93>] apic_timer_interrupt+0x13/0x20
>  <EOI>
>  [<ffffffff8100b139>] ? mwait_idle+0x79/0x90
>  [<ffffffff81001610>] ? enter_idle+0x20/0x30
>  [<ffffffff81001689>] cpu_idle+0x69/0xc0
>  [<ffffffff812cb19c>] start_secondary+0x183/0x1e7
> Code: 8d 84 24 18 01 00 00 49 39 84 24 18 01 00 00 0f 84 cf 00 00 00 49
> 8b 85 68 03 00 00 49 8d 74 24 28 48 8b 80 20 01 00 00 4c 89 ef <8b> 58
> f0 e8 42 5e 00 00 89 de 89 c7 e8 29 5e 00 00 84 c0 0f 85
> RIP  [<ffffffffa041ee76>] i915_hangcheck_elapsed+0x96/0x270 [i915]
>  RSP <ffff880001703e40>
> CR2: fffffffffffffff0
> ---[ end trace a327d5ceef537f9e ]---
> 
> Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_irq.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 46d649b..39ce40d 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -348,8 +348,12 @@ static void notify_ring(struct drm_device *dev,
>  			struct intel_ring_buffer *ring)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	u32 seqno = ring->get_seqno(ring);
> +	u32 seqno;
> +
> +	if (ring->obj == NULL)
> +		return;
>  
> +	seqno = ring->get_seqno(ring);
>  	trace_i915_gem_request_complete(dev, seqno);

While the current kernel tree has indeed changed from 2.6.36,
I don't think this is the spot corresponding to my crash.

My spot was in hangcheck_elapsed and as far as I can see it will
crash in the current kernel in pretty much the same way.  In
particular, i915_hangcheck_ring_idle will probably crash on all
three rings.

FWIW after adding the INIT_LIST_HEAD to the init_dri function
my kernel hasn't crashed yet (a couple of hours and counting).
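
To illustrate why the missing INIT_LIST_HEAD matters, here is a minimal
user-space sketch (not kernel code). It assumes the ring's request list was
left as zero-filled memory on the DRI path, which would be consistent with
RAX being 0 and CR2 being a small negative address in the oops. The names
fake_ring, fake_request, last_request_addr and demo, and the struct layout,
are made up for the example and do not match the real driver structures.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same idea as the kernel's INIT_LIST_HEAD(): an empty list points at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same test as the kernel's list_empty(). */
static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical request layout; the real driver struct differs. */
struct fake_request {
	uint32_t seqno;
	unsigned long emitted;
	struct list_head list;
};

/* Hypothetical ring that holds only the request list. */
struct fake_ring {
	struct list_head request_list;
};

/*
 * Address that a list_entry(head->prev, struct fake_request, list) style
 * lookup would compute. Integer arithmetic is used so that the sketch never
 * forms or dereferences a bad pointer.
 */
static uintptr_t last_request_addr(const struct fake_ring *ring)
{
	return (uintptr_t)ring->request_list.prev -
	       offsetof(struct fake_request, list);
}

static void demo(void)
{
	struct fake_ring ring;

	/* No INIT_LIST_HEAD: the head is just zeroed memory. */
	memset(&ring, 0, sizeof(ring));
	printf("zeroed: empty=%d entry=0x%" PRIxPTR "\n",
	       list_empty(&ring.request_list), last_request_addr(&ring));

	/* With the initialisation, the list is recognised as empty. */
	init_list_head(&ring.request_list);
	printf("inited: empty=%d\n", list_empty(&ring.request_list));
}
```

Calling demo() shows that a zeroed head does not look empty (next is NULL,
not the head itself), so a caller that checks list_empty() first would still
go on to compute an entry address just below zero. Once the head is
initialised, the list_empty() check succeeds and that lookup is skipped.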

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt
  2011-01-20 10:10           ` Herbert Xu
@ 2011-01-20 11:00             ` Chris Wilson
  2011-01-20 11:35               ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2011-01-20 11:00 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Jesse Barnes, linux-kernel

On Thu, 20 Jan 2011 21:10:10 +1100, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> While the current kernel tree has indeed changed from 2.6.36,
> I don't think this is the spot corresponding to my crash.
> 
> My spot was in hangcheck_elapsed and as far as I can see it will
> crash in the current kernel in pretty much the same way.  In
> particular, i915_hangcheck_ring_idle will probably crash on all
> three rings.

Except that i915_hangcheck_elapsed should not be called from DRI code
paths: the whole error recovery code only works with GEM (we can't recover
the hardware if it is being poked directly by "userspace").
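
A user-space sketch of that guard, under stated assumptions: fake_ring,
fake_device, fake_get_seqno and fake_notify_ring are hypothetical stand-ins,
and hangcheck_armed is a plain flag standing in for arming the timer. It
assumes, as the patch title suggests, that the code after the guard is what
kicks off hangcheck, and that ring->obj is only set once GEM has initialised
the ring.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fake_ring {
	void *obj;	/* NULL when GEM never set up this ring */
	uint32_t (*get_seqno)(struct fake_ring *ring);
	uint32_t irq_seqno;
};

struct fake_device {
	bool hangcheck_armed;	/* stands in for arming the hangcheck timer */
};

/* Hypothetical seqno reader for a ring that GEM did initialise. */
static uint32_t fake_get_seqno(struct fake_ring *ring)
{
	(void)ring;
	return 42;
}

static void fake_notify_ring(struct fake_device *dev, struct fake_ring *ring)
{
	uint32_t seqno;

	/* Old DRI path: no ring object, so do nothing and arm nothing. */
	if (ring->obj == NULL)
		return;

	seqno = ring->get_seqno(ring);
	ring->irq_seqno = seqno;
	dev->hangcheck_armed = true;
}
```

With obj left NULL the function returns before touching get_seqno, so
neither the seqno read nor the timer arming happens for a DRI-only setup.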
 
> FWIW after adding the INIT_LIST_HEAD to the init_dri function
> my kernel hasn't crashed yet (a couple of hours and counting).

Ah, yes. Need to pull some more code in from init_ringbuffer.

For my sanity can you check that the above patch also prevents the crash?
I'd rather not install a timer, and execute undesirable code paths, if we
don't need one.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt
  2011-01-20 11:00             ` [PATCH] drm/i915: Don't " Chris Wilson
@ 2011-01-20 11:35               ` Herbert Xu
  0 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2011-01-20 11:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jesse Barnes, linux-kernel

On Thu, Jan 20, 2011 at 11:00:25AM +0000, Chris Wilson wrote:
>
> Except that i915_hangcheck_elapsed should not be called from DRI code
> paths: the whole error recovery code only works with GEM (we can't recover
> the hardware if it is being poked directly by "userspace").

Oh I see what you mean.  Yes that should do the trick.

Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


end of thread, other threads:[~2011-01-20 11:35 UTC | newest]

Thread overview: 11+ messages
2011-01-18 11:57 i915 completely broken with KMS off Herbert Xu
2011-01-18 17:16 ` Jesse Barnes
2011-01-18 17:24 ` [PATCH] drm/i915: Correctly test for an initialised ring for dri1 Chris Wilson
2011-01-19  3:24   ` Herbert Xu
2011-01-19  9:53     ` [PATCH] drm/i915: Initialise ring vfuncs for old DRI paths Chris Wilson
2011-01-19 10:31       ` Herbert Xu
2011-01-20  7:05       ` Herbert Xu
2011-01-20  9:56         ` [PATCH] drm/i915: Don't kick-off hangcheck after a DRI interrupt Chris Wilson
2011-01-20 10:10           ` Herbert Xu
2011-01-20 11:00             ` [PATCH] drm/i915: Don't " Chris Wilson
2011-01-20 11:35               ` Herbert Xu
