Hi guys,
maybe soften that a bit. Reading from the shared memory of
the user fence is ok for everybody. What we need to take
more care of is the writing side.
So my current thinking is that we allow read-only access, but
writing a new sequence value needs to go through the
scheduler/kernel.
So when the CPU wants to signal a timeline fence it needs to
call an IOCTL. When the GPU wants to signal the timeline
fence it needs to hand that off to the hardware scheduler.
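A rough uapi sketch of the CPU path (all names here are made up
for illustration, this isn't an existing interface):

#include <linux/ioctl.h>
#include <linux/types.h>

struct drm_example_userfence_signal {
        __u32 handle; /* user fence / sync object handle */
        __u32 pad;
        __u64 seq;    /* new sequence value to publish */
};

/* Userspace asks the kernel to do the write, so the kernel always
 * knows who advanced the timeline and to which value. */
#define DRM_IOCTL_EXAMPLE_USERFENCE_SIGNAL \
        _IOW('d', 0x40, struct drm_example_userfence_signal)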
In case of a lockup the kernel can check with the hardware who
did the last write and what value was written.
That, together with an IOCTL to hand out sequence numbers for
implicit sync to applications, should be sufficient for the
kernel to track who is responsible if something bad happens.
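The sequence number handout could then look something like this
(again just a sketch, names hypothetical):

struct drm_example_userfence_alloc_seq {
        __u32 handle; /* user fence the application will signal */
        __u32 pad;
        __u64 seq;    /* out: sequence number the kernel assigned */
};

#define DRM_IOCTL_EXAMPLE_USERFENCE_ALLOC_SEQ \
        _IOWR('d', 0x41, struct drm_example_userfence_alloc_seq)

The kernel remembers (handle, seq, process) for each handout,
which is what makes the blaming below possible.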
In other words, when the hardware says that the shader wrote
stuff like 0xdeadbeef, 0x0 or 0xffffffff into memory, we kill
the process which did that.
If the hardware says that seq - 1 was written fine, but seq is
missing, then the kernel blames whoever was supposed to write
seq.
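A minimal sketch of that check, assuming the hardware can report
the last writer and value for a fence location (all helpers here
are made up):

#include <linux/types.h>

struct example_fence { u64 cur_seq; };
struct example_ctx;

struct example_ctx *hw_query_last_write(struct example_fence *f,
                                        u64 *value);
struct example_ctx *example_lookup_seq_owner(struct example_fence *f,
                                             u64 seq);
void example_kill_ctx(struct example_ctx *ctx);

static void example_check_fence(struct example_fence *f,
                                u64 expected_seq)
{
        u64 value;
        struct example_ctx *writer = hw_query_last_write(f, &value);

        if (writer && value != expected_seq) {
                /* Somebody wrote garbage like 0xdeadbeef or
                 * 0xffffffff: kill the process which did that. */
                example_kill_ctx(writer);
        } else if (f->cur_seq == expected_seq - 1) {
                /* seq - 1 was written fine but seq never showed up:
                 * blame whoever seq was handed out to. */
                example_kill_ctx(example_lookup_seq_owner(f,
                                                expected_seq));
        }
}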
Just piping the write through a privileged instance should be
enough to make sure that we don't run into issues.
Christian.
On 10.06.21 17:59, Marek Olšák wrote:
Hi Daniel,
We just talked about this whole topic internally
and we came to the conclusion that the hardware
needs to understand sync object handles and have
high-level wait and signal operations in the command
stream. Sync objects will be backed by memory, but
they won't be readable or writable by processes
directly. The hardware will log all accesses to sync
objects and will send the log to the kernel
periodically. The kernel will identify malicious
behavior.
Example of a hardware command stream:
...
ImplicitSyncWait(syncObjHandle, sequenceNumber); // the sequence number is assigned by the kernel
Draw();
ImplicitSyncSignalWhenDone(syncObjHandle);
...
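The kernel-side check of the hardware log could then look roughly
like this (the log entry layout is hypothetical, just to
illustrate the idea):

#include <linux/types.h>

struct hw_syncobj_log_entry {
        __u32 ctx_id;         /* context which touched the object */
        __u32 syncobj_handle;
        __u64 seq_written;    /* value signaled, 0 for a wait */
};

static bool entry_is_malicious(const struct hw_syncobj_log_entry *e,
                               __u64 expected_seq)
{
        /* A context signaling anything other than the sequence
         * number the kernel assigned to it is misbehaving. */
        return e->seq_written && e->seq_written != expected_seq;
}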
I'm afraid we have no other choice because of the
TLB invalidation overhead.
Marek
On Wed, Jun 09, 2021 at 03:58:26PM +0200, Christian König wrote:
> On 09.06.21 15:19, Daniel Vetter wrote:
> > [SNIP]
> > > Yeah, we call this the lightweight and the heavyweight TLB
> > > flush.
> > >
> > > The lightweight one can be used when you are sure that you
> > > don't have any of the PTEs currently in flight in the 3D/DMA
> > > engine and you just need to invalidate the TLB.
> > >
> > > The heavyweight one must be used when you need to invalidate
> > > the TLB *AND* make sure that no concurrent operation moves new
> > > stuff into the TLB.
> > >
> > > The problem is that for this use case we have to use the
> > > heavyweight one.
> > Just for my own curiosity: So the lightweight flush is only for
> > in-between CS when you know access is idle? Or does that also
> > not work if userspace has a CS on a DMA engine going at the same
> > time, because the TLBs aren't isolated enough between engines?
>
> More or less correct, yes.
>
> The problem is that a lightweight flush only invalidates the TLB,
> but doesn't take care of entries which have been handed out to the
> different engines.
>
> In other words what can happen is the following:
>
> 1. Shader asks TLB to resolve address X.
> 2. TLB looks into its cache, can't find address X and asks the
> walker to resolve it.
> 3. Walker comes back with the result for address X; TLB puts it
> into its cache and gives it to the shader.
> 4. Shader starts doing some operation using the result for
> address X.
> 5. You send a lightweight TLB invalidate and the TLB throws away
> the cached values for address X.
> 6. Shader happily still uses whatever the TLB gave to it in step 3
> to access address X.
>
> See it like the shader having its own 1-entry L0 TLB cache which
> is not affected by the lightweight flush.
>
> The heavyweight flush on the other hand sends out a broadcast
> signal to everybody and only comes back when we are sure that an
> address is not in use any more.
Ah, makes sense. On Intel the shaders only operate in VA;
everything goes around as explicit async messages to IO blocks. So
we don't have this, and the only difference in TLB flushes is
between a TLB flush in the IB and an MMIO one, which is independent
of anything currently being executed on an engine.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch