From mboxrd@z Thu Jan 1 00:00:00 1970 From: Chris Wilson Subject: Re: [PATCH 1/2] drm/i915: clear up wedged transitions Date: Wed, 3 Sep 2014 21:26:14 +0100 Message-ID: <20140903202614.GD3171@nuc-i3427.alporthouse.com> References: <1352909648-21514-1-git-send-email-daniel.vetter@ffwll.ch> <1352996243-15590-1-git-send-email-daniel.vetter@ffwll.ch> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from fireflyinternet.com (mail.fireflyinternet.com [87.106.93.118]) by gabe.freedesktop.org (Postfix) with ESMTP id 245876E5DC for ; Wed, 3 Sep 2014 13:26:18 -0700 (PDT) Content-Disposition: inline In-Reply-To: <1352996243-15590-1-git-send-email-daniel.vetter@ffwll.ch> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: Daniel Vetter Cc: Intel Graphics Development List-Id: intel-gfx@lists.freedesktop.org On Thu, Nov 15, 2012 at 05:17:22PM +0100, Daniel Vetter wrote: > We have two important transitions of the wedged state in the current > code: > > - 0 -> 1: This means a hang has been detected, and signals to everyone > that they please get of any locks, so that the reset work item can > do its job. > > - 1 -> 0: The reset handler has completed. > > Now the last transition mixes up two states: "Reset completed and > successful" and "Reset failed". To distinguish these two we do some > tricks with the reset completion, but I simply could not convince > myself that this doesn't race under odd circumstances. > > Hence split this up, and add a new terminal state indicating that the > hw is gone for good. > > Also add explicit #defines for both states, update comments. > > v2: Split out the reset handling bugfix for the throttle ioctl. > > v3: s/tmp/wedged/ sugested by Chris Wilson. Also fixup up a rebase > error which prevented this patch from actually compiling. > > v4: To unify the wedged state with the reset counter, keep the > reset-in-progress state just as a flag. The terminally-wedged state is > now denoted with a big number. > > v5: Add a comment to the reset_counter special values explaining that > WEDGED & RESET_IN_PROGRESS needs to be true for the code to be > correct. > > v6: Fixup logic errors introduced with the wedged+reset_counter > unification. Since WEDGED implies reset-in-progress (in a way we're > terminally stuck in the dead-but-reset-not-completed state), we need > ensure that we check for this everywhere. The specific bug was in > wait_for_error, which would simply have timed out. > > v7: Extract an inline i915_reset_in_progress helper to make the code > more readable. Also annote the reset-in-progress case with an > unlikely, to help the compiler optimize the fastpath. Do the same for > the terminally wedged case with i915_terminally_wedged. > > Signed-Off-by: Daniel Vetter > @@ -1385,7 +1372,7 @@ out: > /* If this -EIO is due to a gpu hang, give the reset code a > * chance to clean up the mess. Otherwise return the proper > * SIGBUS. */ > - if (!atomic_read(&dev_priv->gpu_error.wedged)) > + if (i915_terminally_wedged(&dev_priv->gpu_error)) > return VM_FAULT_SIGBUS; (i915_gem_fault()) This if() is backwards. -Chris -- Chris Wilson, Intel Open Source Technology Centre