From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Daniel Vetter <daniel@ffwll.ch>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx <Intel-gfx@lists.freedesktop.org>
Subject: Re: [RFC v2] drm/i915: Android native sync support
Date: Mon, 26 Jan 2015 08:52:39 +0100	[thread overview]
Message-ID: <20150126075239.GK10113@phenom.ffwll.local> (raw)
In-Reply-To: <20150124160832.GG7762@nuc-i3427.alporthouse.com>

On Sat, Jan 24, 2015 at 04:08:32PM +0000, Chris Wilson wrote:
> On Sat, Jan 24, 2015 at 10:41:46AM +0100, Daniel Vetter wrote:
> > On Fri, Jan 23, 2015 at 6:30 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > On Fri, Jan 23, 2015 at 04:53:48PM +0100, Daniel Vetter wrote:
> > >> Yeah that's kind of the big behaviour difference (at least as I see it)
> > >> between explicit sync and implicit sync:
> > >> - with implicit sync the kernel attaches sync points/requests to buffers
> > >>   and userspace just asks about idle/busyness of buffers. Synchronization
> > >>   between different users is all handled behind userspace's back in the
> > >>   kernel.
> > >>
> > >> - explicit sync attaches sync points to individual bits of work and makes
> > >>   them explicit objects userspace can get at and pass around. Userspace
> > >>   uses these separate things to inquire about when something is
> > >>   done/idle/busy and has its own mapping between explicit sync objects and
> > >>   the different pieces of memory affected by each. Synchronization between
> > >>   different clients is handled explicitly by passing sync objects around
> > >>   each time some rendering is done.
> > >>
> > >> The bigger driver for explicit sync (besides "nvidia likes it sooooo much
> > >> that everyone uses it a lot") seems to be a) shitty gpu drivers without
> > >> proper bo managers (*cough*android*cough*) and b) svm, where there are simply
> > >> no buffer objects any more to attach sync information to.
> > >
> > > Actually, mesa would really like much finer granularity than at batch
> > > boundaries. Having a sync object for a batch boundary itself is very meh
> > > and not a substantive improvement on what is possible today, but being
> > > able to convert the implicit sync into an explicit fence object is
> > > interesting and lends a layer of abstraction that could make it more
> > > versatile. Most importantly, it allows me to defer the overhead of fence
> > > creation until I actually want to sleep on a completion. Also Jesse
> > > originally supported inserting fences inside a batch, which looked
> > > interesting if impractical.
> > 
> > If we want to allow the kernel to stall on fences (in e.g. the scheduler)
> > only the kernel should be allowed to create fences imo. At least
> > current fences assume that they _will_ signal eventually, and for i915
> > fences we have the hangcheck to ensure this is the case. In-batch
> > fences and lazy fence creation (beyond just delaying the fd allocation
> > to avoid too many fds flying around) are therefore a no-go.
> 
> Lazy fence creation (i.e. attaching a fence to a buffer) just means
> creating the fd for an existing request (which is derived from the
> fence). Or if the buffer is read or write idle, then you just create the
> fence as already-signaled. And yes, it is just to avoid the death by a
> thousand file descriptors and especially creating one every batch.

I think the problem will be platforms that want full explicit fencing (like
Android) but allow delayed creation of the fence fd from a GL sync object
(like the Android EGL extension allows).
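
For reference, the delayed variant looks roughly like this on the client
side with EGL_ANDROID_native_fence_sync. Just a sketch, independent of any
particular driver; dpy is assumed to be an initialized EGLDisplay:

  #include <EGL/egl.h>
  #include <EGL/eglext.h>
  #include <GLES2/gl2.h>

  static EGLint get_native_fence_fd(EGLDisplay dpy)
  {
          PFNEGLCREATESYNCKHRPROC create_sync =
                  (PFNEGLCREATESYNCKHRPROC)
                  eglGetProcAddress("eglCreateSyncKHR");
          PFNEGLDUPNATIVEFENCEFDANDROIDPROC dup_fd =
                  (PFNEGLDUPNATIVEFENCEFDANDROIDPROC)
                  eglGetProcAddress("eglDupNativeFenceFDANDROID");

          /* The sync object itself is created up front, cheaply ... */
          EGLSyncKHR sync = create_sync(dpy, EGL_SYNC_NATIVE_FENCE_ANDROID,
                                        NULL);

          /* ... and the fd is only materialized when someone actually
           * wants to pass it around. The fence command must be flushed
           * before the fd is usable. */
          glFlush();
          return dup_fd(dpy, sync); /* EGL_NO_NATIVE_FENCE_FD_ANDROID on error */
  }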

I'm not sure yet how to best expose that, since just creating a fence from
the implicit request attached to the batch might upset the interface
purists with the mix of implicit and explicit fencing ;-) Hence I think for
now we should just do the eager fd creation at execbuf until people scream
(well, maybe not merge this patch until people scream ...).
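
To make the eager variant a bit more concrete, the rough idea is a new
execbuf flag that hands a fence fd back to userspace for every batch.
Sketch only; the flag name and the return slot are made up for
illustration, this is not what the RFC implements verbatim:

  #include <errno.h>
  #include <stdint.h>
  #include <xf86drm.h>
  #include <drm/i915_drm.h>

  /* LOCAL_EXEC_FENCE_OUT is a made-up flag name for this sketch. */
  #define LOCAL_EXEC_FENCE_OUT (1 << 17)

  static int submit_with_fence(int fd,
                               struct drm_i915_gem_execbuffer2 *execbuf)
  {
          execbuf->flags |= LOCAL_EXEC_FENCE_OUT; /* ask for a fence fd back */

          if (drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
                  return -errno;

          /* The kernel would return the fd for the request it just queued,
           * say in the upper half of rsvd2; userspace can poll() it, pass
           * it to another process, or just close() it if nobody cares. */
          return (int)(execbuf->rsvd2 >> 32);
  }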

> > For that kind of fine-grained sync between gpu and cpu workloads the
> > solutions thus far (at least those I've seen) are just busy-looping.
> > Usually those workloads have a few orders of magnitude more sync points
> > than frames we tend to render, so blocking isn't terribly efficient anyway.
> 
> Heck, I think a full-fledged fence fd per batch is still more overhead
> than I want.

One idea that crossed my mind is to expose the 2nd interrupt source to
userspace somehow (we have pipe_control/mi_flush_dw and
mi_user_interrupt). Then we could use that, maybe with some wakeup
filtering, to allow userspace to block a bit more efficiently.

But my gut feel still says that most likely a bit of busy-looping won't
hurt in such cases of very fine-grained synchronization. There should be a
full blocking kernel request nearby. And often the check is only for "busy
or not" anyway, and that can already be done with seqno writes from
batchbuffers to a per-ctx bo that userspace manages.
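
Roughly what that looks like on the userspace side (sketch only, names are
illustrative): the batch ends with an MI_STORE_DWORD_IMM that writes the
batch's seqno into the per-ctx bo, and the busy check is then just a read
from the CPU mapping:

  #include <stdbool.h>
  #include <stdint.h>

  struct ctx_sync {
          volatile uint32_t *map;  /* CPU mapping of the per-ctx bo */
          uint32_t last_emitted;   /* seqno stamped into the last batch */
  };

  static bool seqno_passed(const struct ctx_sync *s, uint32_t seqno)
  {
          return (int32_t)(*s->map - seqno) >= 0; /* wrap-safe comparison */
  }

  static bool ctx_busy(const struct ctx_sync *s)
  {
          return !seqno_passed(s, s->last_emitted);
  }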
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
