From: Ben Widawsky <ben@bwidawsk.net>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-gfx <intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/i915: kicking rings considered harmful
Date: Tue, 27 Sep 2011 12:38:59 -0700
Message-ID: <20110927123859.5cd58ba8@bwidawsk.net>
In-Reply-To: <20110927180317.GC2785@phenom.ffwll.local>

On Tue, 27 Sep 2011 20:03:17 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Tue, Sep 27, 2011 at 06:31:59PM +0100, Chris Wilson wrote:
> > On Tue, 27 Sep 2011 09:46:14 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > On Tue, 27 Sep 2011 12:03:22 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > > On Mon, Sep 26, 2011 at 10:22:01PM -0700, Ben Widawsky wrote:
> > > > > On Mon, 26 Sep 2011 19:59:50 +0200
> > > > > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c
> > > > > > b/drivers/gpu/drm/i915/i915_irq.c index da5d607..09c11e4 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > > > @@ -1694,7 +1694,7 @@ void i915_hangcheck_elapsed(unsigned long data)
> > > > > >  		if (dev_priv->hangcheck_count++ > 1) {
> > > > > >  			DRM_ERROR("Hangcheck timer elapsed... GPU
> > > > > > hung\n");
> > > > > > -			if (!IS_GEN2(dev)) {
> > > > > > +			if (!IS_GEN2(dev) && i915_try_reset) {
> > > > > >  				/* Is the chip hanging on a
> > > > > > WAIT_FOR_EVENT?
> > > > > >  				 * If so we can simply poke the
> > > > > > RB_WAIT bit
> > > > > >  				 * and break the hang. This should
> > > > > > work on
> > > > >
> > > > > I think you should also be able to accomplish the same thing
> > > > > with enable_hangcheck param. I had the same problem with the
> > > > > debugger :)
> > > >
> > > > I agree. Iirc you have some patches floating in that area to make the
> > > > hangcheck a bit more robust. Can you maybe add this to that series and
> > > > (re-)submit?
> > > >
> > > > Cheers, Daniel
> > >
> > > While 9/10 times daniel > ben, I'm playing my 10% card here and
> > > suggesting that mixing the reset variable and ring kick is not the right
> > > way to go about this.
> >
> > One purpose of the i915.reset parameter is to disable any automatic
> > attempts to recover from a hang condition so that the error state is not
> > misleading. So preventing the kick ring does help in that regard.
> >
> > A second purpose is to prevent i915_reset() from causing havoc and hanging
> > the machine. Daniel is implying that kicking the rings is instrumental in
> > making matters worse. Again using i915.reset to prevent kicking the rings
> > fits in with that purpose.
> >
> > Since I regard kicking rings as a form of reset, I don't see it as a
> > conflation of terms and so a valid use of i915.reset.
>
> Couldn't have said it any better. The bad effects of kicking stuck rings
> is mostly that when we have a sync problem there's a decent chance
> somebody has written garbage into our batchbuffers. Continously trying to
> execute said garbage is just tempting faith in the gpu's error resilience.
> -Daniel

If we do this we lose the possibility to kick rings, but not reset the
GPU (not that I find that terribly useful). If we do this, it does fire a
wq event, but I don't see a problem with that for this case.

I think I would rather do this:

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 012732b..803524e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1698,6 +1698,10 @@ void i915_hangcheck_elapsed(unsigned long data)
 		if (dev_priv->hangcheck_count++ > 1) {
 			DRM_ERROR("Hangcheck timer elapsed... GPU hung\n");
 
+			/* Save off error state before kicking the rings and
+			 * possibly ruining the GPU state.
+			 */
+			i915_handle_error(dev, true);
 			if (!IS_GEN2(dev)) {
 				/* Is the chip hanging on a WAIT_FOR_EVENT?
 				 * If so we can simply poke the RB_WAIT bit
@@ -1717,7 +1721,6 @@ void i915_hangcheck_elapsed(unsigned long data)
 					goto repeat;
 			}
 
-			i915_handle_error(dev, true);
 			return;
 		}
 	} else {
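
For readers without the driver source at hand, the ordering that the diff above proposes can be sketched as a small standalone C model. This is only an illustration under stated assumptions: struct mock_gpu, the mock_* helpers and the enum are hypothetical names invented here, not i915 structures or functions, and the real hangcheck re-arms a timer instead of returning a status. The only point shown is that the error state is captured before any attempt to kick the ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; NOT the real i915 structures. */
struct mock_gpu {
	int hangcheck_count;       /* consecutive checks without progress */
	bool waiting_on_event;     /* ring is stuck on a wait instruction */
	bool error_captured;       /* an error-state snapshot exists */
	bool captured_before_kick; /* snapshot predates any ring kick */
	int kicks;                 /* number of times the ring was kicked */
};

enum mock_result { MOCK_NO_HANG, MOCK_KICKED, MOCK_HUNG };

/* Models saving the error state; only the first snapshot is kept. */
static void mock_capture_error(struct mock_gpu *gpu)
{
	if (!gpu->error_captured) {
		gpu->error_captured = true;
		gpu->captured_before_kick = (gpu->kicks == 0);
	}
}

/* Models poking a stuck ring; returns true if a kick was performed. */
static bool mock_kick_ring(struct mock_gpu *gpu)
{
	if (!gpu->waiting_on_event)
		return false;
	gpu->waiting_on_event = false;
	gpu->kicks++;
	return true;
}

/* Models the proposed ordering: capture first, then try the kick. */
enum mock_result mock_hangcheck_elapsed(struct mock_gpu *gpu)
{
	assert(gpu != NULL);

	if (gpu->hangcheck_count++ > 1) {
		/* Save the error state before the kick can disturb it. */
		mock_capture_error(gpu);
		if (mock_kick_ring(gpu))
			return MOCK_KICKED;
		return MOCK_HUNG;
	}
	return MOCK_NO_HANG;
}
```

In this model, the first two calls only bump the counter, and the third call takes the snapshot and then kicks the ring, so the snapshot reflects the state from before the kick.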