* [Intel-gfx] [PATCH] drm/i915/selftests: add some vma_sync
From: Matthew Auld @ 2020-03-23 11:38 UTC (permalink / raw)
  To: intel-gfx

The subtest shrink_boom was added as a regression test for some missing
refcounting on the paging structures. However, since the binding is
potentially async, setting vm->fault_attr might apply to the purge
vma, and not the intended explode vma. It also looks like it might
be possible to hit some weird shrinker deadlock where the unbinding of
one vma allocates memory by flushing and waiting for its
still-pending-bind operation while holding vm->mutex, which will always
land back in the shrinker since we set vm->fault_attr for the selftest.

References: https://gitlab.freedesktop.org/drm/intel/issues/1493
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index b342bef5e7c9..029406a2a0b3 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -951,6 +951,8 @@ static int shrink_boom(struct i915_address_space *vm,
 		if (err)
 			goto err_purge;
 
+		i915_vma_sync(vma);
+
 		/* Should now be ripe for purging */
 		i915_vma_unpin(vma);
 
@@ -974,6 +976,8 @@ static int shrink_boom(struct i915_address_space *vm,
 		if (err)
 			goto err_explode;
 
+		i915_vma_sync(vma);
+
 		i915_vma_unpin(vma);
 
 		i915_gem_object_put(purge);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
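
A minimal userspace sketch of the ordering problem described in the commit
message above, under stated assumptions: a bind is modelled as a queued
operation whose allocation only happens when it is flushed, and the injected
fault is a plain flag. Every toy_* name is hypothetical and is not an i915
type or function. It only illustrates why waiting for the purge bind before
arming the fault keeps the failure on the explode vma.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are real i915 types or functions. */
struct toy_vm {
	bool fault_armed;	/* models vm->fault_attr being set */
};

struct toy_vma {
	struct toy_vm *vm;
	bool bind_pending;	/* bind queued, allocation not yet done */
	bool bind_failed;	/* allocation ran while the fault was armed */
};

struct toy_result {
	bool purge_failed;
	bool explode_failed;
};

/* Queue an asynchronous bind: the allocation happens later. */
static void toy_pin(struct toy_vm *vm, struct toy_vma *vma)
{
	vma->vm = vm;
	vma->bind_pending = true;
	vma->bind_failed = false;
}

/* Complete a queued bind; it sees whatever fault state is set right now. */
static void toy_sync(struct toy_vma *vma)
{
	assert(vma->vm != NULL);
	if (!vma->bind_pending)
		return;
	vma->bind_pending = false;
	vma->bind_failed = vma->vm->fault_armed;
}

/* Run the purge/explode sequence with or without an early sync. */
struct toy_result toy_shrink_boom(bool sync_before_arming)
{
	struct toy_vm vm = { .fault_armed = false };
	struct toy_vma purge = { 0 }, explode = { 0 };

	toy_pin(&vm, &purge);
	if (sync_before_arming)
		toy_sync(&purge);	/* purge bind completes fault-free */

	vm.fault_armed = true;		/* intended only for explode */
	toy_pin(&vm, &explode);

	/* Whatever is still pending completes after the fault is armed. */
	toy_sync(&purge);
	toy_sync(&explode);

	return (struct toy_result){ purge.bind_failed, explode.bind_failed };
}
```

In this model, toy_shrink_boom(false) lets the fault land on the purge
bind as well, while toy_shrink_boom(true) confines it to the explode bind.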


* Re: [Intel-gfx] [PATCH] drm/i915/selftests: add some vma_sync
From: Chris Wilson @ 2020-03-23 11:42 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2020-03-23 11:38:09)
> The subtest shrink_boom was added as a regression test for some missing
> refcounting on the paging structures. However, since the binding is
> potentially async, setting vm->fault_attr might apply to the purge
> vma, and not the intended explode vma.

Hmm. Sounds a fair point, though let's see if that is not an unintended
bonus.

> It also looks like it might
> be possible to hit some weird shrinker deadlock where the unbinding of
> one vma allocates memory by flushing and waiting for its
> still-pending-bind operation while holding vm->mutex, which will always
> land back in the shrinker since we set vm->fault_attr for the selftest.

However that is a bug we have to handle. And it should be prevented
currently by avoiding shrinking active (still being bound) vma, e.g.
6f24e41022f2 ("drm/i915: Avoid recursing onto active vma from the
shrinker"). So is that a current observation?
-Chris
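
A minimal sketch of the idea behind the guard mentioned above, assuming a toy
model in which a shrinker walks a list of objects and simply skips any whose
bind is still in flight, so reclaim never has to wait on a pending bind. The
toy_obj structure and toy_shrink function are hypothetical illustrations and
do not reproduce the code in commit 6f24e41022f2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a reclaimable object; not an i915 structure. */
struct toy_obj {
	size_t pages;		/* pages that reclaiming this object frees */
	bool bind_active;	/* a bind is still in flight */
	bool reclaimed;		/* already released by the shrinker */
};

/*
 * Reclaim idle objects until at least `target` pages are freed or the
 * list is exhausted. Objects with an active bind are skipped rather than
 * waited on, which is what avoids the recursion in this model.
 * Returns the number of pages freed.
 */
size_t toy_shrink(struct toy_obj *objs, size_t count, size_t target)
{
	size_t freed = 0;

	assert(objs != NULL || count == 0);

	for (size_t i = 0; i < count && freed < target; i++) {
		struct toy_obj *obj = &objs[i];

		if (obj->reclaimed || obj->bind_active)
			continue;	/* never block on a pending bind */

		obj->reclaimed = true;
		freed += obj->pages;
	}

	return freed;
}
```

The trade-off in this model is that the shrinker may free less than it was
asked for when most objects are still being bound, in exchange for never
sleeping on work that itself needs memory.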


* Re: [Intel-gfx] [PATCH] drm/i915/selftests: add some vma_sync
From: Chris Wilson @ 2020-03-23 11:50 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2020-03-23 11:38:09)
> The subtest shrink_boom was added as a regression test for some missing
> refcounting on the paging structures. However, since the binding is
> potentially async, setting vm->fault_attr might apply to the purge
> vma, and not the intended explode vma. It also looks like it might
> be possible to hit some weird shrinker deadlock where the unbinding of
> one vma allocates memory by flushing and waiting for its
> still-pending-bind operation while holding vm->mutex, which will always
> land back in the shrinker since we set vm->fault_attr for the selftest.
> 
> References: https://gitlab.freedesktop.org/drm/intel/issues/1493

Ah, you picked the wrong deadlock -- that's the gtt eviction recursion.
:(
-Chris


* Re: [Intel-gfx] [PATCH] drm/i915/selftests: add some vma_sync
From: Matthew Auld @ 2020-03-23 13:12 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Mon, 23 Mar 2020 at 11:43, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Matthew Auld (2020-03-23 11:38:09)
> > The subtest shrink_boom was added as a regression test for some missing
> > refcounting on the paging structures. However, since the binding is
> > potentially async, setting vm->fault_attr might apply to the purge
> > vma, and not the intended explode vma.
>
> Hmm. Sounds a fair point, though let's see if that is not an unintended
> bonus.
>
> > It also looks like it might
> > be possible to hit some weird shrinker deadlock where the unbinding of
> > one vma allocates memory by flushing and waiting for its
> > still-pending-bind operation while holding vm->mutex, which will always
> > land back in the shrinker since we set vm->fault_attr for the selftest.
>
> However that is a bug we have to handle. And it should be prevented
> currently by avoiding shrinking active (still being bound) vma, e.g.
> 6f24e41022f2 ("drm/i915: Avoid recursing onto active vma from the
> shrinker"). So is that a current observation?

Missed that. Egg-on-face.

> -Chris


* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/selftests: add some vma_sync
From: Patchwork @ 2020-03-23 13:59 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/selftests: add some vma_sync
URL   : https://patchwork.freedesktop.org/series/74969/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8178 -> Patchwork_17052
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17052 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17052, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17052:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@coherency:
    - fi-gdg-551:         [PASS][1] -> [DMESG-FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8178/fi-gdg-551/igt@i915_selftest@live@coherency.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/fi-gdg-551/igt@i915_selftest@live@coherency.html

  
Known issues
------------

  Here are the changes found in Patchwork_17052 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-icl-u2:          [PASS][3] -> [FAIL][4] ([i915#217])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8178/fi-icl-u2/igt@kms_chamelium@common-hpd-after-suspend.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/fi-icl-u2/igt@kms_chamelium@common-hpd-after-suspend.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gem_contexts:
    - fi-cml-s:           [DMESG-FAIL][5] ([i915#877]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8178/fi-cml-s/igt@i915_selftest@live@gem_contexts.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/fi-cml-s/igt@i915_selftest@live@gem_contexts.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][7] ([i915#323]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8178/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#877]: https://gitlab.freedesktop.org/drm/intel/issues/877


Participating hosts (50 -> 42)
------------------------------

  Additional (1): fi-byt-n2820 
  Missing    (9): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-bsw-kefka fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8178 -> Patchwork_17052

  CI-20190529: 20190529
  CI_DRM_8178: 3f4392bf31d28013e860818ba613b598bb227821 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5528: 5dee9128b2aaa77d036163f670f0e0fc15b578ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17052: 8bf02a45845a6710387fc0cab20cb4c79744c8ce @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

8bf02a45845a drm/i915/selftests: add some vma_sync

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17052/index.html

