From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jason Ekstrand Subject: Re: [PATCH] drm/syncobj: add sync obj wait interface. (v6) Date: Tue, 11 Jul 2017 08:43:23 -0700 Message-ID: References: <20170706010409.9373-1-airlied@gmail.com> <899f9bb3-f7c7-ce86-c22b-bb8d262778bf@vodafone.de> <0c81dd81-e84a-2381-3ce1-e08450929da8@daenzer.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0615829623==" Return-path: In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org Sender: "amd-gfx" To: =?UTF-8?Q?Christian_K=C3=B6nig?= Cc: =?UTF-8?Q?Michel_D=C3=A4nzer?= , amd-gfx mailing list , Maling list - DRI developers List-Id: dri-devel@lists.freedesktop.org --===============0615829623== Content-Type: multipart/alternative; boundary="001a11497fe24fd2aa05540c8f7c" --001a11497fe24fd2aa05540c8f7c Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jul 11, 2017 at 12:17 AM, Christian K=C3=B6nig wrote: > Am 11.07.2017 um 04:36 schrieb Michel D=C3=A4nzer: > >> On 11/07/17 06:09 AM, Jason Ekstrand wrote: >> >>> On Mon, Jul 10, 2017 at 9:15 AM, Christian K=C3=B6nig >>> > wrote: >>> >>> Am 10.07.2017 um 17:52 schrieb Jason Ekstrand: >>> >>>> On Mon, Jul 10, 2017 at 8:45 AM, Christian K=C3=B6nig >>>> > wrote: >>>> >>>> Am 10.07.2017 um 17:28 schrieb Jason Ekstrand: >>>> >>>>> On Wed, Jul 5, 2017 at 6:04 PM, Dave Airlie >>>>> > wrote: >>>>> [SNIP] >>>>> So, reading some CTS tests again, and I think we have a >>>>> problem here. The Vulkan spec allows you to wait on a fence >>>>> that is in the unsignaled state. >>>>> >>>> At least on the closed source driver that would be illegal as >>>> far as I know. >>>> >>>> >>>> Then they are doing workarounds in userspace. There are >>>> definitely CTS tests for this: >>>> >>>> https://github.com/KhronosGroup/VK-GL-CTS/blob/master/ >>>> external/vulkancts/modules/vulkan/synchronization/vktSync >>>> hronizationBasicFenceTests.cpp#L74 >>>> >>> external/vulkancts/modules/vulkan/synchronization/vktSync >>>> hronizationBasicFenceTests.cpp#L74> >>>> >>>> You can't wait on a semaphore before the signal operation is >>>> send down to the kerel. >>>> >>>> >>>> We (Intel) deal with this today by tracking whether or not the >>>> fence has been submitted and using a condition variable in >>>> userspace to sort it all out. >>>> >>> Which sounds exactly like what AMD is doing in it's drivers as wel= l. >>> >>> >>> Which doesn't work cross-process so... >>> >> Surely it can be made to work by providing suitable kernel APIs to >> userspace? >> > > Well, that's exactly what Jason proposed to do, but I'm not very keen of > that. > > If we ever want to share fences across processes (which we do), >>>> then this needs to be sorted in the kernel. >>>> >>> That would clearly get a NAK from my side, even Microsoft forbids >>> wait before signal because you can easily end up in deadlock >>> situations. >>> >>> Please don't NAK things that are required by the API specification and >>> CTS tests. >>> >> There is no requirement for every aspect of the Vulkan API specification >> to be mirrored 1:1 in the kernel <-> userspace API. We have to work out >> what makes sense at each level. >> > > Exactly, if we have a synchronization problem between two processes that > should be solved in userspace. > > E.g. if process A hasn't submitted it's work to the kernel it should flus= h > it's commands before sending a flip event to the compositor. > Ok, I think there is some confusion here on what is being proposed. Here are some things that are *not* being proposed: 1. This does *not* allow a client to block another client's GPU work indefinitely. This is entirely for a CPU wait API to allow for a "wait for submit" as well as a "wait for finish". 2. This is *not* for system compositors that need to be robust against malicious clients. The expected use for the OPAQUE_FD is two very tightly integrated processes which trust each other but need to be able to share synchronization primitives. One of the things they need to be able to do (as per the Vulkan API) with those synchronization primitives is a "wait for submit and finish" operation. I'm happy for the kernel to have separate APIs for "wait for submit" and "wait for finish" if that's more palatable but i don't really see why there is such a strong reaction to the "wait for submit and finish" behavior. Could we do this "in userspace"? Yes, with added kernel API. we would need some way of strapping a second FD onto a syncobj or combining two FDs into one to send across the wire or something like that, then add a shared memory segment, and then pile on a bunch of code to do cross-process condition variables and state tracking. I really don't see how that's a better solution than adding a flag to the kernel API to just do what we want. --001a11497fe24fd2aa05540c8f7c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On T= ue, Jul 11, 2017 at 12:17 AM, Christian K=C3=B6nig <deathsimple@voda= fone.de> wrote:
Am 11.07.2017 um 04:36 schrieb Michel D=C3=A4= nzer:
On 11/07/17 06:09 AM, Jason Ekstrand wrote:
On Mon, Jul 10, 2017 at 9:15 AM, Christian K=C3=B6nig
<deathsimpl= e@vodafone.de <mailto:deathsimple@vodafone.de>> wrote:

=C2=A0 =C2=A0 =C2=A0Am 10.07.2017 um 17:52 schrieb Jason Ekstrand:
=C2=A0 =C2=A0 =C2=A0On Mon, Jul 10, 2017 at 8:45 AM, Christian K=C3=B6nig =C2=A0 =C2=A0 =C2=A0<deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org <mailto:deathsimple@vodafone.de>&g= t; wrote:

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Am 10.07.2017 um 17:28 schrieb Jason Ekst= rand:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0On Wed, Jul 5, 2017 at 6:04 PM, Dave Airl= ie
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0<airlied-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org <mailto:airlied-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>> wrote:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0[SNIP]
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0So, reading some CTS tests again, and I t= hink we have a
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0problem here.=C2=A0 The Vulkan spec allow= s you to wait on a fence
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0that is in the unsignaled state.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0At least on the closed source driver that= would be illegal as
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0far as I know.


=C2=A0 =C2=A0 =C2=A0Then they are doing workarounds in userspace.=C2=A0 The= re are
=C2=A0 =C2=A0 =C2=A0definitely CTS tests for this:

=C2=A0 =C2=A0 =C2=A0https://gi= thub.com/KhronosGroup/VK-GL-CTS/blob/master/external/vulkancts/mo= dules/vulkan/synchronization/vktSynchronizationBasicFenceTests.cpp#L74
=C2=A0 =C2=A0 =C2=A0<https:= //github.com/KhronosGroup/VK-GL-CTS/blob/master/external/vulkanct= s/modules/vulkan/synchronization/vktSynchronizationBasicFenceTest= s.cpp#L74>
=C2=A0 =C2=A0 =C2=A0
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0You can't wait on a semaphore before = the signal operation is
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0send down to the kerel.


=C2=A0 =C2=A0 =C2=A0We (Intel) deal with this today by tracking whether or = not the
=C2=A0 =C2=A0 =C2=A0fence has been submitted and using a condition variable= in
=C2=A0 =C2=A0 =C2=A0userspace to sort it all out.
=C2=A0 =C2=A0 =C2=A0Which sounds exactly like what AMD is doing in it's= drivers as well.


Which doesn't work cross-process so...
Surely it can be made to work by providing suitable kernel APIs to
userspace?

Well, that's exactly what Jason proposed to do, but I'm not very ke= en of that.

=C2=A0 =C2=A0 =C2=A0If we ever want to share fences across processes (which= we do),
=C2=A0 =C2=A0 =C2=A0then this needs to be sorted in the kernel.
=C2=A0 =C2=A0 =C2=A0That would clearly get a NAK from my side, even Microso= ft forbids
=C2=A0 =C2=A0 =C2=A0wait before signal because you can easily end up in dea= dlock situations.

Please don't NAK things that are required by the API specification and<= br> CTS tests.
There is no requirement for every aspect of the Vulkan API specification to be mirrored 1:1 in the kernel <-> userspace API. We have to work o= ut
what makes sense at each level.

Exactly, if we have a synchronization problem between two processes that sh= ould be solved in userspace.

E.g. if process A hasn't submitted it's work to the kernel it shoul= d flush it's commands before sending a flip event to the compositor.

Ok, I think there i= s some confusion here on what is being proposed.=C2=A0 Here are some things= that are *not* being proposed:

=C2= =A01. This does *not* allow a client to block another client's GPU work= indefinitely.=C2=A0 This is entirely for a CPU wait API to allow for a &qu= ot;wait for submit" as well as a "wait for finish".
=C2=A02. This is *not* for system compositors t= hat need to be robust against malicious clients.

The expected use for the OPAQUE_FD is two very tightly integr= ated processes which trust each other but need to be able to share synchron= ization primitives.=C2=A0 One of the things they need to be able to do (as = per the Vulkan API) with those synchronization primitives is a "wait f= or submit and finish" operation.=C2=A0 I'm happy for the kernel to= have separate APIs for "wait for submit" and "wait for fini= sh" if that's more palatable but i don't really see why there = is such a strong reaction to the "wait for submit and finish" beh= avior.

Could we do this "in us= erspace"?=C2=A0 Yes, with added kernel API.=C2=A0 we would need some w= ay of strapping a second FD onto a syncobj or combining two FDs into one to= send across the wire or something like that, then add a shared memory segm= ent, and then pile on a bunch of code to do cross-process condition variabl= es and state tracking.=C2=A0 I really don't see how that's a better= solution than adding a flag to the kernel API to just do what we want.
=
--001a11497fe24fd2aa05540c8f7c-- --===============0615829623== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KYW1kLWdmeCBt YWlsaW5nIGxpc3QKYW1kLWdmeEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5m cmVlZGVza3RvcC5vcmcvbWFpbG1hbi9saXN0aW5mby9hbWQtZ2Z4Cg== --===============0615829623==--