On Tue, Apr 20, 2021 at 8:16 PM Daniel Stone wrote:

> On Tue, 20 Apr 2021 at 19:00, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>
>> On 20.04.21 at 19:44, Daniel Stone wrote:
>>
>> But winsys is something _completely_ different. Yes, you're using the GPU
>> to do things with buffers A, B, and C to produce buffer Z. Yes, you're
>> using vkQueuePresentKHR to schedule that work. Yes, Mutter's composition
>> job might depend on a Chromium composition job which depends on GTA's
>> render job which depends on GTA's compute job which might take a year to
>> complete. Mutter's composition job needs to complete in 'reasonable'
>> (again, FSVO) time, no matter what. The two are compatible.
>>
>> How? Don't lump them together. Isolate them aggressively, and
>> _predictably_ in a way that you can reason about.
>>
>> What clients do in their own process space is their own business. Games
>> can deadlock themselves if they get wait-before-signal wrong. Compute jobs
>> can run for a year. Their problem. Winsys is not that, because you're
>> crossing every isolation boundary possible. Process, user, container, VM -
>> every kind of privilege boundary. Thus far, dma_fence has protected us from
>> the most egregious abuses by guaranteeing bounded-time completion; it also
>> acts as a sequencing primitive, but from the perspective of a winsys person
>> that's of secondary importance, which is probably one of the bigger
>> disconnects between winsys people and GPU driver people.
>>
>>
>> Finally somebody who understands me :)
>>
>> Well the question is then how do we get winsys and your own process space
>> together then?
>>
>
> It's a jarring transition. If you take a very narrow view and say 'it's
> all GPU work in shared buffers so it should all work the same', then
> client<->winsys looks the same as client<->client gbuffer. But this is a
> trap.
>

I think this is where we have a serious gap in understanding what a winsys
or a compositor is.
Like, if you have only a single Wayland server running on a physical
machine, this is easy. But add a VR compositor, an intermediate compositor
(say gamescope), Xwayland, some containers/VMs, some video capture (or,
gasp, a browser that doubles as a compositor) and this story gets seriously
complicated. Who are you protecting from whom? At what point is something
client<->winsys vs. client<->client?

> Just because you can mmap() a file on an NFS server in New Zealand doesn't
> mean that you should have the same expectations of memory access to that
> file as you do to of a pointer from alloca(). Even if the primitives look
> the same, you are crossing significant boundaries, and those do not come
> without a compromise and a penalty.
>
>
>> Anyway, one of the great things about winsys (there are some! trust me)
>> is we don't need to be as hopelessly general as for game engines, nor as
>> hyperoptimised. We place strict demands on our clients, and we literally
>> kill them every single time they get something wrong in a way that's
>> visible to us. Our demands on the GPU are so embarrassingly simple that you
>> can run every modern desktop environment on GPUs which don't have unified
>> shaders. And on certain platforms who don't share tiling formats between
>> texture/render-target/scanout ... and it all still runs fast enough that
>> people don't complain.
>>
>>
>> Ignoring everything below since that is the display pipeline I'm not
>> really interested in. My concern is how to get the buffer from the client
>> to the server without allowing the client to get the server into trouble?
>>
>> My thinking is still to use timeouts to acquire texture locks. E.g. when
>> the compositor needs to access texture it grabs a lock and if that lock
>> isn't available in less than 20ms whoever is holding it is killed hard and
>> the lock given to the compositor.
>>
>> It's perfectly fine if a process has a hung queue, but if it tries to
>> send buffers which should be filled by that queue to the compositor it just
>> gets a corrupted window content.
>>
>
> Kill the client hard. If the compositor has speculatively queued sampling
> against rendering which never completed, let it access garbage. You'll have
> one frame of garbage (outdated content, all black, random pattern; the
> failure mode is equally imperfect, because there is no perfect answer),
> then the compositor will notice the client has disappeared and remove all
> its resources.
>
> It's not possible to completely prevent this situation if the compositor
> wants to speculatively pipeline work, only ameliorate it. From a
> system-global point of view, just expose the situation and let it bubble
> up. Watch the number of fences which failed to retire in time, and destroy
> the context if there are enough of them (maybe 1, maybe 100). Watch the
> number of contexts the file description get forcibly destroyed, and destroy
> the file description if there are enough of them. Watch the number of
> descriptions which get forcibly destroyed, and destroy the process if there
> are enough of them. Watch the number of processes in a cgroup/pidns which
> get forcibly destroyed, and destroy the ... etc. Whether it's the DRM
> driver or an external monitor such as systemd/Flatpak/podman/Docker doing
> that is pretty immaterial, as long as the concept of failure bubbling up
> remains.
>
> (20ms is objectively the wrong answer FWIW, because we're not a hard RTOS.
> But if our biggest point of disagreement is 20 vs. 200 vs. 2000 vs. 20000
> ms, then this thread has been a huge success!)
>
> Cheers,
> Daniel
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
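
To make the "failure bubbling up" idea quoted above a bit more concrete,
here is a rough sketch of the bookkeeping in Python. It is only a toy model
and not anything the DRM core or any existing monitor does today: the
`Scope` class, its `threshold` field and the `report_failure()` method are
all names made up for illustration, and the thresholds are arbitrary (the
quoted text itself says "maybe 1, maybe 100").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scope:
    """Hypothetical level of the hierarchy: context, file description, process, ..."""
    name: str
    threshold: int                    # failures tolerated before this scope is destroyed
    parent: Optional["Scope"] = None
    failures: int = 0
    destroyed: bool = False

    def report_failure(self) -> None:
        """Count one failure: a fence that missed its deadline or a destroyed child."""
        if self.destroyed:
            return                    # already torn down, nothing left to escalate
        self.failures += 1
        if self.failures >= self.threshold:
            self.destroyed = True
            if self.parent is not None:
                # Destroying this scope counts as one failure one level up.
                self.parent.report_failure()


# Made-up hierarchy: two contexts on one file description in one process.
process = Scope("process", threshold=2)
fd = Scope("file description", threshold=2, parent=process)
ctx_a = Scope("context A", threshold=1, parent=fd)
ctx_b = Scope("context B", threshold=3, parent=fd)

ctx_a.report_failure()                # a single late fence destroys context A
for _ in range(3):
    ctx_b.report_failure()            # the third late fence destroys context B

# Both contexts are gone, so the file description hit its threshold of 2 and
# was destroyed; the process has seen one destroyed description and survives.
```

Each level only needs a counter and a pointer to its parent, so whether the
kernel driver or an external monitor keeps that state does not change the
shape of the logic.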