From: Chris Wilson <chris@chris-wilson.co.uk>
To: Juergen Gross <jgross@suse.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	dri-devel@lists.freedesktop.org,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	airlied@linux.ie, daniel.vetter@intel.com
Subject: Re: [Intel-gfx] GPU hang with kernel 4.10rc3
Date: Wed, 11 Jan 2017 17:08:23 +0000
Message-ID: <20170111170823.GC16278@nuc-i3427.alporthouse.com>
In-Reply-To: <7737c1e1-7523-1eea-07a9-0be04b8078e9@suse.com>

On Wed, Jan 11, 2017 at 05:33:34PM +0100, Juergen Gross wrote:
> With kernel 4.10rc3 running as Xen dom0 I get at each boot:
> 
> [   49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell
> [1431], reason: Hang on render ring, action: reset
> [   49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire
> gfx stack, including userspace.
> [   49.213700] [drm] Please file a _new_ bug report on
> bugs.freedesktop.org against DRI -> DRM/Intel
> [   49.213700] [drm] drm/i915 developers can then reassign to the right
> component if it's not a kernel issue.
> [   49.213700] [drm] The gpu crash dump is required to analyze gpu
> hangs, so please always attach it.
> [   49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [   49.213755] drm/i915: Resetting chip after gpu hang
> [   60.213769] drm/i915: Resetting chip after gpu hang
> [   71.189737] drm/i915: Resetting chip after gpu hang
> [   82.165747] drm/i915: Resetting chip after gpu hang
> [   93.205727] drm/i915: Resetting chip after gpu hang
> 
> The dump is attached.

That's a nasty one. The first couple of pages of the batchbuffer appear
to be overwritten (full of 0xc2c2c2c2, i.e. probably pixel data). That
may be a concurrent write by either the GPU or CPU, or we may have
incorrectly mapped a set of pages. That it doesn't recover suggests
that the corruption occurs frequently, probably on every request/batch.

Is this a new bug? Bisection would be the fastest way to triage it.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
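
The "full of 0xc2c2c2c2" observation above can be checked mechanically.
What follows is only a minimal sketch, not the tooling used in this
thread: it assumes the batchbuffer contents are already available as raw
little-endian bytes (the i915 error state file is a text dump and would
need decoding first, which is not shown), and the function names, the
4096-byte page size, and the 0.9 threshold are all hypothetical choices
made for illustration.

```python
import struct
from collections import Counter

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def dominant_dword(buf: bytes, pages: int = 2):
    """Return (value, fraction) for the most common 32-bit word found in
    the first `pages` pages of `buf`, or (None, 0.0) if `buf` is too short."""
    window = buf[: pages * PAGE_SIZE]
    usable = len(window) - len(window) % 4  # drop any trailing partial word
    if usable == 0:
        return None, 0.0
    words = struct.unpack(f"<{usable // 4}I", window[:usable])
    value, count = Counter(words).most_common(1)[0]
    return value, count / len(words)


def looks_overwritten(buf: bytes, pages: int = 2, threshold: float = 0.9) -> bool:
    """Heuristic: flag the buffer when one repeated word dominates its
    first pages, as a solid fill of pixel data would."""
    value, fraction = dominant_dword(buf, pages)
    return value is not None and fraction >= threshold


if __name__ == "__main__":
    # Synthetic stand-in for a decoded batchbuffer, not real dump data.
    sample = b"\xc2" * (2 * PAGE_SIZE) + bytes(range(256)) * 16
    value, fraction = dominant_dword(sample)
    print(f"dominant word 0x{value:08x}, fraction {fraction:.2f}")
    print("looks overwritten:", looks_overwritten(sample))
```

The heuristic flags any uniform fill, so a zeroed buffer would also be
reported; it says nothing about whether the GPU or the CPU did the write.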


Thread overview: 13+ messages
2017-01-11 16:33 GPU hang with kernel 4.10rc3 Juergen Gross
2017-01-11 16:33 ` Juergen Gross
2017-01-11 17:08 ` Chris Wilson [this message]
2017-01-11 17:08   ` Chris Wilson
2017-01-12  6:03   ` [Intel-gfx] " Juergen Gross
2017-01-12  9:21     ` Chris Wilson
2017-01-12  9:21       ` Chris Wilson
2017-01-13 14:41       ` [Intel-gfx] " Juergen Gross
2017-01-23  9:39         ` Juergen Gross
2017-05-11 21:08           ` Pavel Machek
2017-05-11 21:08             ` Pavel Machek
2017-05-12  4:54             ` [Intel-gfx] " Juergen Gross
2017-05-12  4:54               ` Juergen Gross
