Intel-GFX Archive on lore.kernel.org
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: [PATCH] drm/i915: don't return a spurious -EIO from intel_ring_begin
Date: Wed,  4 Jul 2012 22:52:50 +0200
Message-ID: <1341435170-30391-1-git-send-email-daniel.vetter@ffwll.ch>
In-Reply-To: <1341433123-23055-6-git-send-email-daniel.vetter@ffwll.ch>

The issue with this check is that it results in userspace receiving an
-EIO while the gpu reset hasn't completed, resulting in fallback to sw
rendering or worse.

Now there's also a stern comment in intel_ring_wait_seqno saying that
intel_ring_begin should not return -EAGAIN, ever, because some callers
can't handle that. But after an audit of the callsites I don't see any
issues. I guess the last problematic spot disappeared with the removal
of the pipelined fencing code.

So do the right thing and call check_wedge, which should properly
decide whether an -EAGAIN or -EIO is appropriate if wedged is set.
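
For illustration, the decision described above can be sketched as a
small userspace model. This is not the driver code: struct gpu_state,
its fields and model_check_wedge() are hypothetical names, and the rule
that non-interruptible callers always get -EIO is an assumption taken
from the title of patch 2/5 in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's reset bookkeeping. */
struct gpu_state {
	bool wedged;         /* a hang was detected */
	bool reset_finished; /* the reset handler has run to completion */
};

/*
 * Model of the wedge check:
 *   0       - gpu is fine, carry on
 *   -EAGAIN - reset still pending, caller should back off and retry
 *   -EIO    - gpu is dead, or the caller cannot handle a retry
 */
static int model_check_wedge(const struct gpu_state *gpu, bool interruptible)
{
	assert(gpu != NULL);

	if (!gpu->wedged)
		return 0;

	/* Assumption: non-interruptible callers can't handle -EAGAIN. */
	if (!interruptible)
		return -EIO;

	/* Reset ran but the gpu is still wedged: the reset failed. */
	if (gpu->reset_finished)
		return -EIO;

	/* Reset not done yet: ask the caller to retry later. */
	return -EAGAIN;
}
```

The point of the sketch is only that -EIO is reserved for the cases
where retrying cannot help, while a pending reset yields -EAGAIN.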

Note that the early check for a wedged gpu before touching the ring is
rather important (and it took me quite some time of acting like the
densest doofus to figure that out): If we don't do that and the gpu
died for good, not having been resurrected by the reset code, userspace
can merrily fill up the entire ring until it notices that something is
amiss.

Allowing userspace to emit more rendering even though we know that it
will fail can't lead to anything good (and in my experience can lead to
all sorts of havoc, including angering the OOM gods and hard-hanging
the hw for good).
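
A toy model of why the check has to come before the ring is touched,
under heavy simplification: struct model_ring, model_ring_begin() and
model_emit_until_error() are hypothetical, and the "ring full" error
(-EBUSY here) merely stands in for whatever the real wait path would
eventually return on a dead gpu.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring: a dead gpu never consumes anything from it. */
struct model_ring {
	unsigned int size; /* total bytes in the ring */
	unsigned int tail; /* bytes already emitted */
	bool gpu_dead;     /* wedged, and the reset did not revive it */
};

/* Reserve num_dwords in the ring; returns 0 or a negative errno. */
static int model_ring_begin(struct model_ring *ring, unsigned int num_dwords,
			    bool check_first)
{
	unsigned int n = 4 * num_dwords;

	assert(ring != NULL);

	/* The early check: refuse before any space is reserved. */
	if (check_first && ring->gpu_dead)
		return -EIO;

	/* No space left; on a dead gpu nothing will ever free any. */
	if (ring->tail + n > ring->size)
		return -EBUSY;

	ring->tail += n;
	return 0;
}

/* Count the 4-dword packets accepted before the first error. */
static unsigned int model_emit_until_error(struct model_ring *ring,
					   bool check_first)
{
	unsigned int accepted = 0;

	while (model_ring_begin(ring, 4, check_first) == 0)
		accepted++;

	return accepted;
}
```

With the early check a dead gpu accepts no packets at all; without it
the model only reports an error once the whole ring has been filled.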

v2: Fix EAGAIN misspelling, noticed by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index cd35ad4..d42d821 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1117,20 +1117,9 @@ static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
 
 static int intel_ring_wait_seqno(struct intel_ring_buffer *ring, u32 seqno)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	bool was_interruptible;
 	int ret;
 
-	/* XXX As we have not yet audited all the paths to check that
-	 * they are ready for ERESTARTSYS from intel_ring_begin, do not
-	 * allow us to be interruptible by a signal.
-	 */
-	was_interruptible = dev_priv->mm.interruptible;
-	dev_priv->mm.interruptible = false;
-
 	ret = i915_wait_seqno(ring, seqno);
-
-	dev_priv->mm.interruptible = was_interruptible;
 	if (!ret)
 		i915_gem_retire_requests_ring(ring);
 
@@ -1240,12 +1229,13 @@ int intel_wait_ring_buffer(struct intel_ring_buffer *ring, int n)
 int intel_ring_begin(struct intel_ring_buffer *ring,
 		     int num_dwords)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	int n = 4*num_dwords;
 	int ret;
 
-	if (unlikely(atomic_read(&dev_priv->mm.wedged)))
-		return -EIO;
+	ret = i915_gem_check_wedge(dev_priv, dev_priv->mm.interruptible);
+	if (ret)
+		return ret;
 
 	if (unlikely(ring->tail + n > ring->effective_size)) {
 		ret = intel_wrap_ring_buffer(ring);
-- 
1.7.10

Thread overview: 11+ messages
2012-07-04 20:18 [PATCH 0/5] reset rework, 2nd try Daniel Vetter
2012-07-04 20:18 ` [PATCH 1/5] drm/i915: don't trylock in the gpu reset code Daniel Vetter
2012-07-04 20:18 ` [PATCH 2/5] drm/i915: non-interruptible sleeps can't handle -EGAIN Daniel Vetter
2012-07-04 20:54   ` [PATCH] drm/i915: non-interruptible sleeps can't handle -EAGAIN Daniel Vetter
2012-07-04 20:18 ` [PATCH 3/5] drm/i915: don't hang userspace when the gpu reset is stuck Daniel Vetter
2012-07-04 20:18 ` [PATCH 4/5] drm/i915: properly SIGBUS on I/O errors Daniel Vetter
2012-07-04 20:40   ` Daniel Vetter
2012-07-04 20:18 ` [PATCH 5/5] drm/i915: don't return a spurious -EIO from intel_ring_begin Daniel Vetter
2012-07-04 20:52   ` Daniel Vetter [this message]
2012-07-04 20:54 ` [PATCH 0/5] reset rework, 2nd try Chris Wilson
2012-07-05  8:04   ` Daniel Vetter
