From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH 2/3] drm/i915: Drop inspection of execbuf flags during evict
Date: Fri, 8 Nov 2019 17:06:20 +0100	[thread overview]
Message-ID: <CAKMK7uHmTPDk=nZTdb=12WbAn1LKX40HYy4FqYhFY6TOAGHY-w@mail.gmail.com> (raw)
In-Reply-To: <157320960375.9461.12119953763105684230@skylake-alporthouse-com>

On Fri, Nov 8, 2019 at 11:40 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-08 10:20:23)
> > On Fri, Nov 8, 2019 at 11:11 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > Quoting Daniel Vetter (2019-11-08 09:54:42)
> > > > On Wed, Nov 6, 2019 at 4:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > > >
> > > > > With the goal of removing the serialisation from around execbuf, we will
> > > > > no longer have the privilege of there being a single execbuf in flight
> > > > > at any time and so will only be able to inspect the user's flags within
> > > > > the carefully controlled execbuf context. i915_gem_evict_for_node() is
> > > > > the only user outside of execbuf that currently peeks at the flag to
> > > > > convert an overlapping softpinned request from ENOSPC to EINVAL. Retract
> > > > > this nicety and only report ENOSPC if the location is in current use,
> > > > > either due to this execbuf or another.
> > > > >
> > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > >
> > > > Same reasons as for patch 3, I don't think we have to do this at all.
> > >
> > > This is already undefined behaviour. That field is protected by
> > > struct_mutex and is being evaluated outside of that lock.
> >
> > If this can be called on objects involved in execbuf, without
> > struct_mutex, then we already have a correctness problem: vma space
> > (which is super tight on old platforms and therefore needs to be
> > well-managed) is lost because concurrent threads thrash it instead
> > of forming an orderly queue. And if that's not the case, and they do
> > form an orderly queue, then there's no problem, since even the
> > as-needed-only orderly queue provided by ww_mutex will be enough
> > locking to keep this working.
>
> It doesn't get called on those objects, those objects may just be
> neighbouring and being inspected for potential eviction candidates. The
> lists themselves are protected by their mutex, it's just the contention
> over the pin_count.

Hm, yeah, in a per-bo locked future world this won't work. But today it
should be covered by either vm->mutex or dev->struct_mutex, so it's not
already broken, is it?

Otoh in the per-bo locked future we only care about conflicts with our
own execbuf, which means we could check whether the object belongs to
our batch (very easy by looking at dma_resv->lock.ctx, ttm does that
in a few places), and only do the check in that case. So we could
retain full uapi semantics here without additional effort (we need to
have these locks anyway, at least in any kind of execbuf slowpath where
the bos aren't all mapped when we start out). So I'm still not
understanding (even with the "it's the other bos" oversight rectified)
why we have to drop this?
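
Rough sketch of what I mean (purely illustrative, the execbuf-side name
for the acquire context is made up, but dma_resv's ww_mutex really does
record the holder's ww_acquire_ctx):

static bool vma_locked_by_our_execbuf(struct i915_vma *vma,
				      struct ww_acquire_ctx *acquire_ctx)
{
	struct dma_resv *resv = vma->obj->base.resv;

	/*
	 * ww_mutex remembers the ww_acquire_ctx of its current holder,
	 * so comparing against our execbuf's context tells us whether
	 * the neighbouring object is part of our own batch.
	 */
	return READ_ONCE(resv->lock.ctx) == acquire_ctx;
}

and only do the ENOSPC -> EINVAL conversion when that returns true.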

> > Aside: Yeah, I think we need to re-add struct_mutex to the gtt fault
> > path; the temporary pinning in there could easily starve execbuf on
> > platforms where batches run in ggtt. Maybe also in some other areas
> > where we lost struct_mutex around temporary vma->pin_count elevations.
>
> That's where we are going next; not with struct_mutex, but with fenced
> access to reservations to replace the temporary (not HW access) pinning.

fenced as in dma_fence or dma_resv_lock?

Also, if we indeed have an issue with lost elevated pin_counts now, I
think we shouldn't ship 5.5 with that, and should reapply the duct tape
until it's fixed for good.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch