All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Fix eviction when the GGTT is idle but full
@ 2017-10-09 14:57 Chris Wilson
  2017-10-09 15:20 ` ✓ Fi.CI.BAT: success for " Patchwork
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Chris Wilson @ 2017-10-09 14:57 UTC (permalink / raw)
  To: intel-gfx

In the full-ppgtt world, we can fill the GGTT full of context objects.
These context objects are currently implicitly tracked by the requests
that pin them i.e. they are only unpinned when the request is completed
and retired, but we do not have the link from the vma to the request
(anymore). In order to unpin those contexts, we have to issue another
request and switch to the kernel context.

The bug during eviction was that we assumed that a full GGTT meant we
would have requests on the GGTT timeline, and so we missed situations
where those requests where merely in flight (and when even they have not
yet been submitted to hw yet). The fix employed here is to change the
already-is-idle test to no look at the execution timeline, but count the
outstanding requests and then check that we have switched to the kernel
context. Erring on the side of overkill here just means that we stall a
little longer than may be strictly required, but we only expect to hit
this patch in extreme corner cases where returning an erroneous error is
worse than the delay.

Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 61 ++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index a5a5b7e6daae..a02b2adb64ea 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -33,21 +33,20 @@
 #include "intel_drv.h"
 #include "i915_trace.h"
 
-static bool ggtt_is_idle(struct drm_i915_private *dev_priv)
+static bool ggtt_is_idle(struct drm_i915_private *i915)
 {
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
+       struct intel_engine_cs *engine;
+       enum intel_engine_id id;
 
-	for_each_engine(engine, dev_priv, id) {
-		struct intel_timeline *tl;
+       if (i915->gt.active_requests)
+	       return false;
 
-		tl = &ggtt->base.timeline.engine[engine->id];
-		if (i915_gem_active_isset(&tl->last_request))
-			return false;
-	}
+       for_each_engine(engine, i915, id) {
+	       if (engine->last_retired_context != i915->kernel_context)
+		       return false;
+       }
 
-	return true;
+       return true;
 }
 
 static int ggtt_flush(struct drm_i915_private *i915)
@@ -157,7 +156,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
 				    min_size, alignment, cache_level,
 				    start, end, mode);
 
-	/* Retire before we search the active list. Although we have
+	/*
+	 * Retire before we search the active list. Although we have
 	 * reasonable accuracy in our retirement lists, we may have
 	 * a stray pin (preventing eviction) that can only be resolved by
 	 * retiring.
@@ -182,7 +182,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
 		BUG_ON(ret);
 	}
 
-	/* Can we unpin some objects such as idle hw contents,
+	/*
+	 * Can we unpin some objects such as idle hw contents,
 	 * or pending flips? But since only the GGTT has global entries
 	 * such as scanouts, rinbuffers and contexts, we can skip the
 	 * purge when inspecting per-process local address spaces.
@@ -190,19 +191,33 @@ i915_gem_evict_something(struct i915_address_space *vm,
 	if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK)
 		return -ENOSPC;
 
+	/*
+	 * Not everything in the GGTT is tracked via VMA using
+	 * i915_vma_move_to_active(), otherwise we could evict as required
+	 * with minimal stalling. Instead we are forced to idle the GPU and
+	 * explicitly retire outstanding requests which will then remove
+	 * the pinning for active objects such as contexts and ring,
+	 * enabling us to evict them on the next iteration.
+	 *
+	 * To ensure that all user contexts are evictable, we perform
+	 * a switch to the perma-pinned kernel context. This all also gives
+	 * us a termination condition, when the last retired context is
+	 * the kernel's there is no more we can evict.
+	 */
 	if (ggtt_is_idle(dev_priv)) {
-		/* If we still have pending pageflip completions, drop
-		 * back to userspace to give our workqueues time to
-		 * acquire our locks and unpin the old scanouts.
-		 */
-		return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
-	}
+		ret = ggtt_flush(dev_priv);
+		if (ret)
+			return ret;
 
-	ret = ggtt_flush(dev_priv);
-	if (ret)
-		return ret;
+		goto search_again;
+	}
 
-	goto search_again;
+	/*
+	 * If we still have pending pageflip completions, drop
+	 * back to userspace to give our workqueues time to
+	 * acquire our locks and unpin the old scanouts.
+	 */
+	return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
 
 found:
 	/* drm_mm doesn't allow any other other operations while
-- 
2.14.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Fix eviction when the GGTT is idle but full
  2017-10-09 14:57 [PATCH] drm/i915: Fix eviction when the GGTT is idle but full Chris Wilson
@ 2017-10-09 15:20 ` Patchwork
  2017-10-09 16:04 ` [PATCH] " Chris Wilson
  2017-10-09 19:59 ` ✗ Fi.CI.IGT: failure for " Patchwork
  2 siblings, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-10-09 15:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Fix eviction when the GGTT is idle but full
URL   : https://patchwork.freedesktop.org/series/31592/
State : success

== Summary ==

Series 31592v1 drm/i915: Fix eviction when the GGTT is idle but full
https://patchwork.freedesktop.org/api/1.0/series/31592/revisions/1/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test gem_exec_suspend:
        Subgroup basic-s3:
                dmesg-warn -> PASS       (fi-cfl-s) fdo#103026
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-n2820) fdo#101705

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:447s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:478s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:393s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:567s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:284s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:519s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:519s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:531s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:526s
fi-cfl-s         total:289  pass:257  dwarn:0   dfail:0   fail:0   skip:32  time:566s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:432s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:594s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:433s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:417s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:463s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:503s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:473s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:494s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:580s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:486s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:586s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:658s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:469s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:650s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:528s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:519s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:581s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:428s

7579653b5bfdee2b6a20aab20c36c516cbe3812f drm-tip: 2017y-10m-09d-12h-55m-47s UTC integration manifest
4267a42f3108 drm/i915: Fix eviction when the GGTT is idle but full

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5954/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915: Fix eviction when the GGTT is idle but full
  2017-10-09 14:57 [PATCH] drm/i915: Fix eviction when the GGTT is idle but full Chris Wilson
  2017-10-09 15:20 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-10-09 16:04 ` Chris Wilson
  2017-10-09 19:59 ` ✗ Fi.CI.IGT: failure for " Patchwork
  2 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-10-09 16:04 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2017-10-09 15:57:28)
> @@ -190,19 +191,33 @@ i915_gem_evict_something(struct i915_address_space *vm,
>         if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK)
>                 return -ENOSPC;
>  
> +       /*
> +        * Not everything in the GGTT is tracked via VMA using
> +        * i915_vma_move_to_active(), otherwise we could evict as required
> +        * with minimal stalling. Instead we are forced to idle the GPU and
> +        * explicitly retire outstanding requests which will then remove
> +        * the pinning for active objects such as contexts and ring,
> +        * enabling us to evict them on the next iteration.
> +        *
> +        * To ensure that all user contexts are evictable, we perform
> +        * a switch to the perma-pinned kernel context. This all also gives
> +        * us a termination condition, when the last retired context is
> +        * the kernel's there is no more we can evict.
> +        */
>         if (ggtt_is_idle(dev_priv)) {

Would help if I inverted the test when swapping over the branches.

I have an idea on how to test this, but it requires us to be able to
fill the GGTT within 10s. Which is going to be hard (under debug
conditions everywhere). Hmm. If fault in a 256k object, mark it
DONTNEED, when it gets evicted the next access is a sigbus. Combine that
with the knowledge that we prefer freespace over eviction, detecting
when we have evicted the mmap will tell us when we are close to a
totally full GGTT -- and then we setup the eviction test using the MRU
contexts.

Crazy. It would seem more sensible to hook a check for context eviction
into kselftests.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* ✗ Fi.CI.IGT: failure for drm/i915: Fix eviction when the GGTT is idle but full
  2017-10-09 14:57 [PATCH] drm/i915: Fix eviction when the GGTT is idle but full Chris Wilson
  2017-10-09 15:20 ` ✓ Fi.CI.BAT: success for " Patchwork
  2017-10-09 16:04 ` [PATCH] " Chris Wilson
@ 2017-10-09 19:59 ` Patchwork
  2 siblings, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-10-09 19:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Fix eviction when the GGTT is idle but full
URL   : https://patchwork.freedesktop.org/series/31592/
State : failure

== Summary ==

Test kms_universal_plane:
        Subgroup universal-plane-pipe-A-functional:
                pass       -> DMESG-WARN (shard-hsw)
Test kms_flip:
        Subgroup vblank-vs-suspend-interruptible:
                pass       -> FAIL       (shard-hsw)
        Subgroup plain-flip-ts-check:
                pass       -> FAIL       (shard-hsw) fdo#100368

fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368

shard-hsw        total:2446 pass:1326 dwarn:7   dfail:0   fail:10  skip:1103 time:10003s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5954/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-10-09 19:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-09 14:57 [PATCH] drm/i915: Fix eviction when the GGTT is idle but full Chris Wilson
2017-10-09 15:20 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-10-09 16:04 ` [PATCH] " Chris Wilson
2017-10-09 19:59 ` ✗ Fi.CI.IGT: failure for " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.