From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Marek Olšák" <maraeo@gmail.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>
Cc: "ML Mesa-dev" <mesa-dev@lists.freedesktop.org>,
	"Michel Dänzer" <michel@daenzer.net>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal
Date: Tue, 4 May 2021 09:01:23 +0200	[thread overview]
Message-ID: <1bd8105b-4a2a-2ad9-0b3c-a81590282f2e@gmail.com> (raw)
In-Reply-To: <CAAxE2A6NCTFsV6oH=AL=S=P1p0xYF0To8T_THpUO2ypdo0dyBw@mail.gmail.com>


Unfortunately, as I pointed out to Daniel as well, this won't work 100% 
reliably either.

See, the signal on the ring buffer needs to be protected from manipulation 
by userspace so that we can guarantee that the hardware has really 
finished executing when the fence fires.

Protecting memory by immediate page table updates is a good first step, 
but unfortunately not sufficient (and we would need to restructure large 
parts of the driver to make this happen).

On older hardware we often had the situation that reliable invalidation 
requires the guarantee that every previous operation has finished 
executing. It's not so much of a problem when the next operation has 
already started, since then we have the opportunity to do things in 
between the last and the next operation; cache invalidation and VM 
switching are examples of this.

In addition to that, it doesn't really buy us anything: writing the ring 
buffer in userspace and then ringing the doorbell in the kernel has the 
same overhead as doing everything in the kernel in the first place.

Christian.

Am 04.05.21 um 05:11 schrieb Marek Olšák:
> Proposal for a new CS ioctl, kernel pseudo code:
>
> lock(&global_lock);
> serial = get_next_serial(dev);
> add_wait_command(ring, serial - 1);
> add_exec_cmdbuf(ring, user_cmdbuf);
> add_signal_command(ring, serial);
> *ring->doorbell = FIRE;
> unlock(&global_lock);
>
> See? Just like userspace submit, but in the kernel without 
> concurrency/preemption. Is this now safe enough for dma_fence?
>
> Marek
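
To make this concrete, here is the pseudo code above spelled out a little 
further, purely as a sketch (the struct and helper names are invented, 
this is not actual driver code):

static int cs_ioctl_submit(struct my_device *dev, void *user_cmdbuf)
{
        u64 serial;

        mutex_lock(&dev->global_lock);
        serial = ++dev->last_serial;               /* get_next_serial() */
        ring_emit_wait(dev->ring, serial - 1);     /* wait for previous job */
        ring_emit_ib(dev->ring, user_cmdbuf);      /* run the user command buffer */
        ring_emit_signal(dev->ring, serial);       /* seqno write-back -> dma_fence */
        writel(RING_DOORBELL_FIRE, dev->doorbell); /* kick the hw scheduler */
        mutex_unlock(&dev->global_lock);

        return 0;
}

The catch is that the ring is still mapped writable into the process (see 
the hardware description further down), so the wait/signal packets are not 
actually protected from being overwritten between the doorbell write and 
execution.
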
>
> On Mon, May 3, 2021 at 4:36 PM Marek Olšák <maraeo@gmail.com> wrote:
>
>     What about direct submit from the kernel where the process still
>     has write access to the GPU ring buffer but doesn't use it? I
>     think that solves your preemption example, but leaves a potential
>     backdoor for a process to overwrite the signal commands, which
>     shouldn't be a problem since we are OK with timeouts.
>
>     Marek
>
>     On Mon, May 3, 2021 at 11:23 AM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
>         On Mon, May 3, 2021 at 10:16 AM Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> wrote:
>         >
>         > On Mon, May 3, 2021 at 5:00 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>         > >
>         > > Sorry for the top-post but there's no good thing to reply
>         to here...
>         > >
>         > > One of the things pointed out to me recently by Daniel
>         Vetter that I
>         > > didn't fully understand before is that dma_buf has a very
>         subtle
>         > > second requirement beyond finite time completion:  Nothing
>         required
>         > > for signaling a dma-fence can allocate memory. Why? 
>         Because the act
>         > > of allocating memory may wait on your dma-fence.  This, as
>         it turns
>         > > out, is a massively more strict requirement than finite time
>         > > completion and, I think, throws out all of the proposals
>         we have so
>         > > far.
>         > >
>         > > Take, for instance, Marek's proposal for userspace
>         involvement with
>         > > dma-fence by asking the kernel for a next serial and the
>         kernel
>         > > trusting userspace to signal it.  That doesn't work at all if
>         > > allocating memory to trigger a dma-fence can blow up. 
>         There's simply
>         > > no way for the kernel to trust userspace to not do
>         ANYTHING which
>         > > might allocate memory.  I don't even think there's a way
>         userspace can
>         > > trust itself there.  It also blows up my plan of moving
>         the fences to
>         > > transition boundaries.
>         > >
>         > > Not sure where that leaves us.
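
To illustrate the point with a made-up example (struct my_job and the 
helpers here are invented): any GFP_KERNEL allocation can enter direct 
reclaim, and reclaim (for example a driver shrinker evicting buffers) can 
end up waiting on the very fence the code is about to signal:

/* Hypothetical worker that signals a job's dma_fence. */
static void signal_job_fence(struct work_struct *work)
{
        struct my_job *job = container_of(work, struct my_job, work);

        /*
         * BAD: this allocation may trigger direct reclaim, and reclaim may
         * call dma_fence_wait() on job->fence, which only gets signaled
         * further down in this very function -> deadlock.
         */
        void *stats = kmalloc(STATS_SIZE, GFP_KERNEL);

        update_job_stats(job, stats);
        dma_fence_signal(job->fence);
        kfree(stats);
}
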
>         >
>         > Honestly the more I look at things I think
>         userspace-signalable fences
>         > with a timeout sound like they are a valid solution for
>         these issues.
>         > Especially since (as has been mentioned countless times in
>         this email
>         > thread) userspace already has a lot of ways to cause
>         timeouts and/or
>         > GPU hangs through GPU work.
>         >
>         > Adding a timeout on the signaling side of a dma_fence would
>         ensure:
>         >
>         > - The dma_fence signals in finite time
>         > -  If the timeout case does not allocate memory then memory
>         allocation
>         > is not a blocker for signaling.
>         >
>         > Of course you lose the full dependency graph and we need to
>         make sure
>         > garbage collection of fences works correctly when we have
>         cycles.
>         > However, the latter sounds very doable and the first sounds
>         like it is
>         > to some extent inevitable.
>         >
>         > I feel like I'm missing some requirement here given that we
>         > immediately went to much more complicated things but can't
>         find it.
>         > Thoughts?
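
For what it's worth, a signaling-side timeout could look roughly like this 
(sketch only; the my_fence structure and names are invented, and real code 
would also have to handle the race with normal completion):

static void fence_timeout_cb(struct timer_list *t)
{
        struct my_fence *f = from_timer(f, t, timeout_timer);

        /* Force completion with an error so waiters never block forever. */
        dma_fence_set_error(&f->base, -ETIMEDOUT);
        dma_fence_signal(&f->base);
}

static void arm_fence_timeout(struct my_fence *f)
{
        /* Armed when the fence is created/published. */
        timer_setup(&f->timeout_timer, fence_timeout_cb, 0);
        mod_timer(&f->timeout_timer, jiffies + msecs_to_jiffies(10000));
}
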
>
>         Timeouts are sufficient to protect the kernel but they make
>         the fences
>         unpredictable and unreliable from a userspace PoV.  One of the big
>         problems we face is that, once we expose a dma_fence to userspace,
>         we've allowed for some pretty crazy potential dependencies that
>         neither userspace nor the kernel can sort out.  Say you have
>         Marek's
>         "next serial, please" proposal and a multi-threaded application.
>         Between the time you ask the kernel for a serial and get a
>         dma_fence
>         and submit the work to signal that serial, your process may get
>         preempted, something else shoved in which allocates memory,
>         and then
>         we end up blocking on that dma_fence.  There's no way
>         userspace can
>         predict and defend itself from that.
>
>         So I think where that leaves us is that there is no safe place to
>         create a dma_fence except for inside the ioctl which submits
>         the work
>         and only after any necessary memory has been allocated. That's a
>         pretty stiff requirement.  We may still be able to interact with
>         userspace a bit more explicitly but I think it throws any
>         notion of
>         userspace direct submit out the window.
>
>         --Jason
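
In other words, the only safe pattern is roughly the following (sketch 
with invented helpers, error unwinding omitted):

static long my_submit_ioctl(struct drm_device *dev, void *data,
                            struct drm_file *file)
{
        struct my_submit_args *args = data;
        struct dma_fence *fence;
        int ret;

        /* Everything that can allocate memory or fail happens first... */
        ret = lookup_and_pin_buffers(file, args);
        if (ret)
                return ret;

        /* ...only then is the dma_fence created and published... */
        fence = create_and_publish_job_fence(file, args);
        if (IS_ERR(fence))
                return PTR_ERR(fence);

        /*
         * ...and nothing after this point may allocate or block
         * indefinitely, so the fence is guaranteed to signal in finite
         * time.
         */
        queue_job_to_ring(dev, args, fence);
        return 0;
}
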
>
>
>         > - Bas
>         > >
>         > > --Jason
>         > >
>         > > On Mon, May 3, 2021 at 9:42 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>         > > >
>         > > > On Sat, May 1, 2021 at 6:27 PM Marek Olšák <maraeo@gmail.com> wrote:
>         > > > >
>         > > > > On Wed, Apr 28, 2021 at 5:07 AM Michel Dänzer <michel@daenzer.net> wrote:
>         > > > >>
>         > > > >> On 2021-04-28 8:59 a.m., Christian König wrote:
>         > > > >> > Hi Dave,
>         > > > >> >
>         > > > >> > Am 27.04.21 um 21:23 schrieb Marek Olšák:
>         > > > >> >> Supporting interop with any device is always
>         possible. It depends on which drivers we need to interoperate
>         with and update them. We've already found the path forward for
>         amdgpu. We just need to find out how many other drivers need
>         to be updated and evaluate the cost/benefit aspect.
>         > > > >> >>
>         > > > >> >> Marek
>         > > > >> >>
>         > > > >> >> On Tue, Apr 27, 2021 at 2:38 PM Dave Airlie <airlied@gmail.com> wrote:
>         > > > >> >>
>         > > > >> >>     On Tue, 27 Apr 2021 at 22:06, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>         > > > >> >>     >
>         > > > >> >>     > Correct, we wouldn't have synchronization
>         between device with and without user queues any more.
>         > > > >> >>     >
>         > > > >> >>     > That could only be a problem for A+I Laptops.
>         > > > >> >>
>         > > > >> >>     Since I think you mentioned you'd only be
>         enabling this on newer
>         > > > >> >>     chipsets, won't it be a problem for A+A where
>         one A is a generation
>         > > > >> >>     behind the other?
>         > > > >> >>
>         > > > >> >
>         > > > >> > Crap, that is a good point as well.
>         > > > >> >
>         > > > >> >>
>         > > > >> >>     I'm not really liking where this is going btw,
>         seems like an ill-
>         > > > >> >>     thought-out concept, if AMD is really going
>         down the road of designing
>         > > > >> >>     hw that is currently Linux incompatible, you
>         are going to have to
>         > > > >> >>     accept a big part of the burden in bringing
>         this support in to more
>         > > > >> >>     than just amd drivers for upcoming generations
>         of gpu.
>         > > > >> >>
>         > > > >> >
>         > > > >> > Well we don't really like that either, but we have
>         no other option as far as I can see.
>         > > > >>
>         > > > >> I don't really understand what "future hw may remove
>         support for kernel queues" means exactly. While the
>         per-context queues can be mapped to userspace directly, they
>         don't *have* to be, do they? I.e. the kernel driver should be
>         able to either intercept userspace access to the queues, or in
>         the worst case do it all itself, and provide the existing
>         synchronization semantics as needed?
>         > > > >>
>         > > > >> Surely there are resource limits for the per-context
>         queues, so the kernel driver needs to do some kind of
>         virtualization / multi-plexing anyway, or we'll get sad user
>         faces when there's no queue available for <current hot game>.
>         > > > >>
>         > > > >> I'm probably missing something though, awaiting
>         enlightenment. :)
>         > > > >
>         > > > >
>         > > > > The hw interface for userspace is that the ring buffer
>         is mapped to the process address space alongside a doorbell
>         aperture (4K page) that isn't real memory, but when the CPU
>         writes into it, it tells the hw scheduler that there are new
>         GPU commands in the ring buffer. Userspace inserts all the
>         wait, draw, and signal commands into the ring buffer and then
>         "rings" the doorbell. It's my understanding that the ring
>         buffer and the doorbell are always mapped in the same GPU
>         address space as the process, which makes it very difficult to
>         emulate the current protected ring buffers in the kernel. The
>         VMID of the ring buffer is also not changeable.
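
Roughly, the userspace side of that interface looks like this 
(illustrative pseudo-C only, not the real packet or doorbell layout):

#include <stdint.h>

struct user_queue {
        uint32_t *ring;               /* command ring, mapped into the process  */
        volatile uint32_t *doorbell;  /* 4K doorbell aperture page, not real RAM */
        uint32_t wptr;
        uint32_t ring_size;           /* in dwords */
};

static void submit(struct user_queue *q, const uint32_t *cmds, uint32_t ndw)
{
        /* Userspace writes wait/draw/signal packets straight into the ring... */
        for (uint32_t i = 0; i < ndw; i++)
                q->ring[q->wptr++ % q->ring_size] = cmds[i];

        /* ...and "rings" the doorbell; the CPU write tells the hw scheduler
         * that new commands are available. No kernel involvement on this
         * path (memory barriers omitted here). */
        *q->doorbell = q->wptr;
}
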
>         > > > >
>         > > >
>         > > > The doorbell does not have to be mapped into the
>         process's GPU virtual
>         > > > address space.  The CPU could write to it directly. 
>         Mapping it into
>         > > > the GPU's virtual address space would allow you to have
>         a device kick
>         > > > off work however rather than the CPU. E.g., the GPU
>         could kick off
>         its own work or multiple devices could kick off work
>         without CPU
>         > > > involvement.
>         > > >
>         > > > Alex
>         > > >
>         > > >
>         > > > > The hw scheduler doesn't do any synchronization and it
>         doesn't see any dependencies. It only chooses which queue to
>         execute, so it's really just a simple queue manager handling
>         the virtualization aspect and not much else.
>         > > > >
>         > > > > Marek
>

