* handle exclusive fence similar to shared ones
@ 2021-06-06 10:03 Christian König
  2021-06-06 10:03 ` [PATCH 1/3] dma-buf: fix dma_resv_test_signaled test_all handling Christian König
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Christian König @ 2021-06-06 10:03 UTC (permalink / raw)
  To: daniel, dri-devel

Hi Daniel,

as discussed, here are the patches which change the handling of the exclusive fence.

The main problem seems to have been the dma_resv_test_signaled() function, which ignored the exclusive fence when shared fences were present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.

The second patch then fixes nouveau to also always take the exclusive fence into account.

The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure whether there are other places which rely on the existing behavior, I will hold this one back for a while.

Is that what you had in mind as well?

Regards,
Christian.




* [PATCH 1/3] dma-buf: fix dma_resv_test_signaled test_all handling
  2021-06-06 10:03 handle exclusive fence similar to shared ones Christian König
@ 2021-06-06 10:03 ` Christian König
  2021-06-06 10:03 ` [PATCH 2/3] drm/nouveau: always wait for the exclusive fence Christian König
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2021-06-06 10:03 UTC (permalink / raw)
  To: daniel, dri-devel

As the name implies, if testing all fences is requested we
should indeed test all fences and not skip the exclusive
one just because we see shared ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..c66bfdde9454 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -615,25 +615,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
-	unsigned int seq, shared_count;
+	struct dma_fence *fence;
+	unsigned int seq;
 	int ret;
 
 	rcu_read_lock();
 retry:
 	ret = true;
-	shared_count = 0;
 	seq = read_seqcount_begin(&obj->seq);
 
 	if (test_all) {
 		struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-		unsigned int i;
-
-		if (fobj)
-			shared_count = fobj->shared_count;
+		unsigned int i, shared_count;
 
+		shared_count = fobj ? fobj->shared_count : 0;
 		for (i = 0; i < shared_count; ++i) {
-			struct dma_fence *fence;
-
 			fence = rcu_dereference(fobj->shared[i]);
 			ret = dma_resv_test_signaled_single(fence);
 			if (ret < 0)
@@ -641,24 +637,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 			else if (!ret)
 				break;
 		}
-
-		if (read_seqcount_retry(&obj->seq, seq))
-			goto retry;
 	}
 
-	if (!shared_count) {
-		struct dma_fence *fence_excl = dma_resv_excl_fence(obj);
-
-		if (fence_excl) {
-			ret = dma_resv_test_signaled_single(fence_excl);
-			if (ret < 0)
-				goto retry;
+	fence = dma_resv_excl_fence(obj);
+	if (fence) {
+		ret = dma_resv_test_signaled_single(fence);
+		if (ret < 0)
+			goto retry;
 
-			if (read_seqcount_retry(&obj->seq, seq))
-				goto retry;
-		}
 	}
 
+	if (read_seqcount_retry(&obj->seq, seq))
+		goto retry;
+
 	rcu_read_unlock();
 	return ret;
 }
-- 
2.25.1



* [PATCH 2/3] drm/nouveau: always wait for the exclusive fence
  2021-06-06 10:03 handle exclusive fence similar to shared ones Christian König
  2021-06-06 10:03 ` [PATCH 1/3] dma-buf: fix dma_resv_test_signaled test_all handling Christian König
@ 2021-06-06 10:03 ` Christian König
  2021-06-06 10:03 ` [PATCH 3/3] drm/amdgpu: drop workaround for adding page table clears as shared fence Christian König
  2021-06-07  8:58 ` handle exclusive fence similar to shared ones Daniel Vetter
  3 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2021-06-06 10:03 UTC (permalink / raw)
  To: daniel, dri-devel

As discussed with Daniel, we want to drop the rule that all
shared fences must signal after the exclusive fence.

This means that drivers also need to sync to the
exclusive fence when a shared one is present.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 6b43918035df..05d0b3eb3690 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
 	fobj = dma_resv_shared_list(resv);
 	fence = dma_resv_excl_fence(resv);
 
-	if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
+	if (fence) {
 		struct nouveau_channel *prev = NULL;
 		bool must_wait = true;
 
-- 
2.25.1



* [PATCH 3/3] drm/amdgpu: drop workaround for adding page table clears as shared fence
  2021-06-06 10:03 handle exclusive fence similar to shared ones Christian König
  2021-06-06 10:03 ` [PATCH 1/3] dma-buf: fix dma_resv_test_signaled test_all handling Christian König
  2021-06-06 10:03 ` [PATCH 2/3] drm/nouveau: always wait for the exclusive fence Christian König
@ 2021-06-06 10:03 ` Christian König
  2021-06-07  8:58 ` handle exclusive fence similar to shared ones Daniel Vetter
  3 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2021-06-06 10:03 UTC (permalink / raw)
  To: daniel, dri-devel

We no longer need to add the exclusive fence as a shared fence as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 1c3e3b608332..156c39cd858d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -207,7 +207,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 	INIT_LIST_HEAD(&duplicates);
 
 	tv.bo = &bo->tbo;
-	tv.num_shared = 2;
+	tv.num_shared = 1;
 	list_add(&tv.head, &list);
 
 	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
@@ -226,12 +226,6 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 	if (!amdgpu_vm_ready(vm))
 		goto out_unlock;
 
-	fence = dma_resv_excl_fence(bo->tbo.base.resv);
-	if (fence) {
-		amdgpu_bo_fence(bo, fence, true);
-		fence = NULL;
-	}
-
 	r = amdgpu_vm_clear_freed(adev, vm, &fence);
 	if (r || !fence)
 		goto out_unlock;
-- 
2.25.1



* Re: handle exclusive fence similar to shared ones
  2021-06-06 10:03 handle exclusive fence similar to shared ones Christian König
                   ` (2 preceding siblings ...)
  2021-06-06 10:03 ` [PATCH 3/3] drm/amdgpu: drop workaround for adding page table clears as shared fence Christian König
@ 2021-06-07  8:58 ` Daniel Vetter
  2021-06-07  9:59   ` Christian König
  3 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2021-06-07  8:58 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

Hi Christian,

So unfortunately I got completely distracted with some i915 bugs and fun
last week, so I didn't get around to it.

On Sun, Jun 6, 2021 at 12:03 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hi Daniel,
>
> as discussed here are the patches which change the handle around exclusive fence handling.
>
> The main problem seems to have been the dma_resv_test_signaled() function which ignored the exclusive fence when shared fences where present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.
>
> The second patch then fixes nouveu to also always take the exclusive fence into account.
>
> The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure if there are no other places which relies on the existing behavior I will hold this one back for a while.
>
> Is that what you had in mind as well?

Semantically, I think we want something where we treat the exclusive
fence as an IPC mechanism that the kernel doesn't care much about
(exceptions apply), but more consistently count all access from
any CS as a shared fence. So in a way that's what you've done here, and also
what you've done in the earlier series with setting the read/write
flags on shared fences.

For the actual approach, what I've picked is a bit of what amdgpu does +
what other drivers do with NO_IMPLICIT, but with the bugs fixed
(there's a bunch of them): Essentially we try to always set the shared
fences, and exclusive fences are set additionally on top when the
implicit sync IPC calls for that. And on the dependency side we do
clever logic to only take in the exclusive fence when required.
Currently for amdgpu this means introspecting the fence owner (there are
some nasty tricks needed there, I think, to make this work and not be a
security bug), for others that's done with the NO_IMPLICIT flag (but
again some nasty corners there, which I think a bunch of drivers get
wrong).

There are two reasons I'm leaning more in that direction:
- The annoying thing is that the audit on the dependency side is a lot
trickier since everyone rolls their own dependency handling. If we
don't change (for now at least) the rules around dma_resv then an
oversight in the audit isn't going to be a huge problem.
- Wording becomes inconsistent: An exclusive fence which is also a
shared one is a bit confusing. I think it's better if we stick to the
current rules for dma_resv and change the semantics we want in drivers (I
think that's doable, at maybe some code cost; e.g. Jason's import ioctl
would be simpler with your changed rules, but it's still doable with the
current dma_resv rules). And then when we have that, we figure out
what to change in the dma_resv struct/rules.

Wrt the patches: Good thing is that what you change here and what I've
found thus far is 100% not overlapping, so at least we didn't waste
time auditing the same code :-)

Cheers, Daniel
>
> Regards,
> Christian.
>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: handle exclusive fence similar to shared ones
  2021-06-07  8:58 ` handle exclusive fence similar to shared ones Daniel Vetter
@ 2021-06-07  9:59   ` Christian König
  2021-06-07 15:09     ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2021-06-07  9:59 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

Am 07.06.21 um 10:58 schrieb Daniel Vetter:
> Hi Christian,
>
> So unfortunately I got distracted with some i915 bugs and fun last
> week completely, so didn't get around to it.
>
> On Sun, Jun 6, 2021 at 12:03 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Hi Daniel,
>>
>> as discussed here are the patches which change the handle around exclusive fence handling.
>>
>> The main problem seems to have been the dma_resv_test_signaled() function which ignored the exclusive fence when shared fences where present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.
>>
>> The second patch then fixes nouveu to also always take the exclusive fence into account.
>>
>> The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure if there are no other places which relies on the existing behavior I will hold this one back for a while.
>>
>> Is that what you had in mind as well?
> I think from the semantics something where we treat the exclusive
> fence as an IPC mechanism that the kernel doesn't care much about
> (exceptions apply), and but more consistently count all access from
> any CS as a shared fence. So in a way what you've done here, and also
> what you've done in the earlier series with setting the read/write
> flags on shared fences.

Yeah, I think that this will work for me as well.

> For actual approach what I've picked is a bit of what amdgpu does +
> what other drivers do with NO_IMPLICIT, but with the bugs fixed
> (there's a bunch of them): Essentially we try to always set the shared
> fences, and exclusive fences are set additionally on top when the
> implicit sync IPC calls for that. And on the depdendency side we do
> clever logic to only take in the exclusive fence when required.
> Currently for amdgpu this means introspecting the fence owner (there's
> some nasty tricks there I think to do to make this work and not be a
> security bug), for others that's done with the NO_IMPLICIT flag (but
> again some nasty corners there, which I think a bunch of drivers get
> wrong).

For amdgpu I have been pondering the following idea last week to
make it behave the same as the other drivers:

1. We allow setting the exclusive fence without touching the shared fences.
     As far as I understand it this is also part of your idea above.

2. During command submission amdgpu uses a dma_fence_chain node to chain
together the new CS with the existing exclusive fence.

3. During command synchronization amdgpu takes a look at the exclusive
fence and walks the dma_fence_chain history.
     Submissions from the same process (the owner) are not synced to
(i.e. same behavior as of today), but as soon as we see something which
doesn't fit into the amdgpu CS model we sync to the remaining chain.

That would give us both: keeping the current amdgpu CS behavior (which we
can then extend) and setting the exclusive fence according to the
DMA-buf rules.
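
Just to illustrate point 3, here is a rough sketch of the chain walk I
have in mind. amdgpu_fence_same_owner() is a made-up placeholder for the
owner check, and this assumes the dma_resv lock is held:

static int amdgpu_sync_excl_chain(struct amdgpu_sync *sync,
				  struct dma_resv *resv, void *owner)
{
	struct dma_fence *excl = dma_resv_excl_fence(resv);
	struct dma_fence *iter;
	int r = 0;

	dma_fence_chain_for_each(iter, excl) {
		struct dma_fence_chain *chain = to_dma_fence_chain(iter);
		struct dma_fence *f = chain ? chain->fence : iter;

		/* Same owner: same behavior as today, don't sync. */
		if (amdgpu_fence_same_owner(f, owner))
			continue;

		/* Something foreign: sync to the remaining chain and stop. */
		r = amdgpu_sync_fence(sync, iter);
		break;
	}
	dma_fence_put(iter);

	return r;
}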

> There's two reasons I'm more leaning in that direction:
> - The annoying thing is that the audit on the dependency side is a lot
> trickier since everyone rolls their own dependency handling.

Yes, absolutely agree. That's why I said we need to have use-case-based
functionality here.

In other words, what we need is something like a
dma_resv_for_each_sync_fence(for_write) macro.

E.g. drivers then only do something like:

dma_resv_for_each_sync_fence(resv, for_write, fence)
     driver_specific_syncing_to_fence(fence);

And not every driver calling the underlying functions on its own and
then doing whatever it pleases.
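
Just as an illustration, such a helper could look roughly like this on
top of the current dma_resv API. This is only a sketch, the name and the
callback style are placeholders and not existing kernel API, and the
caller is assumed to hold the dma_resv lock:

static int dma_resv_for_each_sync_fence(struct dma_resv *resv, bool for_write,
					int (*cb)(struct dma_fence *fence,
						  void *data),
					void *data)
{
	struct dma_resv_list *list = dma_resv_shared_list(resv);
	struct dma_fence *fence = dma_resv_excl_fence(resv);
	unsigned int i;
	int r;

	/* The exclusive fence is always a dependency. */
	if (fence) {
		r = cb(fence, data);
		if (r)
			return r;
	}

	/* Shared fences only matter when the new job writes. */
	if (!for_write || !list)
		return 0;

	for (i = 0; i < list->shared_count; ++i) {
		fence = rcu_dereference_protected(list->shared[i],
						  dma_resv_held(resv));
		r = cb(fence, data);
		if (r)
			return r;
	}

	return 0;
}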

> If we don't change (for now at least) the rules around dma_resv then an
> oversight in the audit isn't going to be a huge problem.
> - Wording becomes inconsistent: An exclusive fence which is also a
> shared is a bit confusing. I think it's better if we stick to the
> current rules for dma_resv, change the semantics we want in drivers (I
> think that's doable, at maybe some code cost e.g. Jason's import ioctl
> would be simpler with your changed rules, but still doable with the
> current dma_resv rules). And then when we have that, we figure out
> what to change with the dma_resv struct/rules.

But then at least do the minimal change so that we can get amdgpu in 
line with all other drivers like I outlined above.

We can keep that as a hack in amdgpu if that makes you feel better. 
Chaining the exclusive fence together is roughly 4 times slower than the 
shared approach, but I think that this is negligible compared to all the 
other stuff we do.

Regards,
Christian.

> Wrt the patches: Good thing is that what you change here and what I've
> found thus far is 100% not overlapping, so at least we didn't waste
> time auditing the same code :-)
>
> Cheers, Daniel
>> Regards,
>> Christian.
>>
>>
>



* Re: handle exclusive fence similar to shared ones
  2021-06-07  9:59   ` Christian König
@ 2021-06-07 15:09     ` Daniel Vetter
  2021-06-07 16:25       ` Christian König
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2021-06-07 15:09 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On Mon, Jun 07, 2021 at 11:59:11AM +0200, Christian König wrote:
> Am 07.06.21 um 10:58 schrieb Daniel Vetter:
> > Hi Christian,
> > 
> > So unfortunately I got distracted with some i915 bugs and fun last
> > week completely, so didn't get around to it.
> > 
> > On Sun, Jun 6, 2021 at 12:03 PM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> > > Hi Daniel,
> > > 
> > > as discussed here are the patches which change the handle around exclusive fence handling.
> > > 
> > > The main problem seems to have been the dma_resv_test_signaled() function which ignored the exclusive fence when shared fences where present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.
> > > 
> > > The second patch then fixes nouveu to also always take the exclusive fence into account.
> > > 
> > > The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure if there are no other places which relies on the existing behavior I will hold this one back for a while.
> > > 
> > > Is that what you had in mind as well?
> > I think from the semantics something where we treat the exclusive
> > fence as an IPC mechanism that the kernel doesn't care much about
> > (exceptions apply), and but more consistently count all access from
> > any CS as a shared fence. So in a way what you've done here, and also
> > what you've done in the earlier series with setting the read/write
> > flags on shared fences.
> 
> Yeah, I think that this will work for me as well.
> 
> > For actual approach what I've picked is a bit of what amdgpu does +
> > what other drivers do with NO_IMPLICIT, but with the bugs fixed
> > (there's a bunch of them): Essentially we try to always set the shared
> > fences, and exclusive fences are set additionally on top when the
> > implicit sync IPC calls for that. And on the depdendency side we do
> > clever logic to only take in the exclusive fence when required.
> > Currently for amdgpu this means introspecting the fence owner (there's
> > some nasty tricks there I think to do to make this work and not be a
> > security bug), for others that's done with the NO_IMPLICIT flag (but
> > again some nasty corners there, which I think a bunch of drivers get
> > wrong).
> 
> For amdgpu I have been pondering on the following idea  last week to make it
> behave the same as the other drivers:
> 
> 1. We allow setting the explicit fence without touching the shared fences.
>     As far as I understand it this is also part of your idea above.
> 
> 2. During command submission amdgpu uses a dma_fence_chain node to chain
> together the new CS with the existing explicit sync.
> 
> 3. During command synchronization amdgpu takes a look at the explicit fence
> and walks the dma_fence_chain history.
>     Submissions from the same process (the owner) are not synced to (e.g.
> same behavior as of today), but as soon as we see something which doesn't
> fit into the amdgpu CS model we sync to the remaining chain.
> 
> That would give us both keeping the current amdgpu CS behavior (which we
> then can extend) as well as setting the explicit fence according to the
> DMA-buf rules.

So what I had in mind is:

1. we reserve 3 additional shared slots (so one more than currently)

2. when we pull in dependencies we ignore exclusive fences when they're an
amdgpu/amdkfd one; only when it's an OWNER_UNKNOWN do we take it

3. above obviously breaks buffer moves; to fix that we always add the
ttm_bo->moving fence. If amdgpu were to support an "ignore implicit fencing"
flag like other drivers do with NO_IMPLICIT, then we'd also need to overrule
that for a dynamically shared dma-buf (since for those we don't have a
->moving fence slot). Non-dynamic dma-bufs aren't a problem since they are
guaranteed to be pinned, so they can't move.

4. When we add fences we
- always add the exclusive fence (like in my patch)
- keep the current set of shared fences
- add our own fences also as a shared one (so that amdgpu can ignore the
  exclusive fence for any sync against amdgpu, whether same owner or other
  owner). This is the critical piece to make sure the current uapi for
  amdgpu isn't changed
- add the previous exclusive fence if a) there is one and b) it's not an
  amdgpu/kfd one. This is where we need the additional fence slot

At first glance this throws away foreign exclusive fences, which could
break implicit sync. But by moving foreign exclusive fences to the shared
slot, we can rely on the amdgpu implicit sync logic of only looking at the
owner (and not whether a fence is exclusive or shared), and we get the
right implicit fencing even on subsequent CS.

And for foreign drivers it also all works, because the exclusive fence is
always set, and because amdgpu doesn't ignore foreign fences (even if
they're set as shared we force a sync iirc) there's a dependency chain
that makes sure everything is correct and ordered. Same for dma-buf
poll/sync_file export, that would then work on amdgpu correctly too
because the exclusive slot is set.

The only downside here is that amdgpu always sets the exclusive fence
slot, but that can't be fixed without an uapi revision since the kernel
simply doesn't know that. But amdgpu isn't the only driver, panfrost does
the same so *shrugh*.

So I think this should work but
- it's a hellalot of auditing to make sure I didn't miss anything
- and it's like attempt no 5 or so of me trying to slice this knot without
  breaking anything, or changing the current dma_resv rules.
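
To make step 4 a bit more concrete, here's a hand-wavy sketch of the
fence bookkeeping. This is not actual amdgpu code: amdgpu_is_own_fence()
and dma_resv_set_excl_no_clear() are made-up placeholders, and it assumes
we have a way to set the exclusive slot without dropping the shared list,
plus the extra shared slot reserved in step 1:

static void amdgpu_bo_add_cs_fences(struct amdgpu_bo *bo,
				    struct dma_fence *cs_fence)
{
	struct dma_resv *resv = bo->tbo.base.resv;
	struct dma_fence *old_excl = dma_resv_excl_fence(resv);

	/* Preserve implicit sync against foreign drivers: move a foreign
	 * exclusive fence over to the shared slots before replacing it. */
	if (old_excl && !amdgpu_is_own_fence(old_excl))
		dma_resv_add_shared_fence(resv, old_excl);

	/* Our own fence goes into the shared slots too, so amdgpu's
	 * owner-based sync logic can keep ignoring the exclusive slot. */
	dma_resv_add_shared_fence(resv, cs_fence);

	/* And publish it in the exclusive slot for everybody else, without
	 * clearing the shared fences (hypothetical variant of the current
	 * dma_resv_add_excl_fence()). */
	dma_resv_set_excl_no_clear(resv, cs_fence);
}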

> > There's two reasons I'm more leaning in that direction:
> > - The annoying thing is that the audit on the dependency side is a lot
> > trickier since everyone rolls their own dependency handling.
> 
> Yes, absolutely agree. That's why I said we need to have use case based
> functionality here.
> 
> In other words what we need is something like an
> dma_resv_for_each_sync_fence(for_write) macro.
> 
> E.g. drivers then only do something like:
> 
> dma_resv_for_each_sync_fence(resv, for_write, fence)
>     driver_specific_syncing_to_fence(fence);
> 
> And not every driver calling the underlying functions on it's own and then
> doing whatever it pleases.

Yeah, but amdgpu can't use those, so we're back to square one. amdgpu
currently has zero information from userspace about which CS are writes
and which are not. Other drivers (aside from panfrost) generally have
that, so they can do smarter things here.

Also we could fairly trivially fix this by adding new uapi so that amdgpu
would know this, and just oversyncing on old userspace. But when I proposed
that, you made it pretty clear that this option isn't on the table.

So for now we need to be more clever to get amdgpu aligned. And then when
that's done we (well you guys, maybe using the patches from Jason + a CS
flag to not do implicit sync at all) can add the uapi to make this
smarter.

Then, and only then, do we have the pieces to look into smarter/use-case
dependent dma_resv helpers.

Also, some of these helpers already exist, and are used by the drivers
derived from v3d. But amdgpu can't use them, because you nacked the "just
oversync for current userspace" approach. So you'll have to live with
your own quirks. I don't want to make helpers for that because then other
drivers might come up with the idea to use them :-)
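
For reference, the dependency side in those v3d-derived drivers boils down
to roughly the following. Just a sketch: the function and the
write/no_implicit parameters are made up for illustration, while
drm_gem_fence_array_add_implicit() is the existing helper they call:

static int sketch_add_implicit_deps(struct xarray *deps,
				    struct drm_gem_object **bos,
				    unsigned int bo_count,
				    bool write, bool no_implicit)
{
	unsigned int i;
	int r;

	/* NO_IMPLICIT-style flag: userspace handles synchronization itself. */
	if (no_implicit)
		return 0;

	for (i = 0; i < bo_count; i++) {
		/* Takes the exclusive fence always, and the shared fences
		 * as well when this job writes to the buffer. */
		r = drm_gem_fence_array_add_implicit(deps, bos[i], write);
		if (r)
			return r;
	}

	return 0;
}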

> > If we don't change (for now at least) the rules around dma_resv then an
> > oversight in the audit isn't going to be a huge problem.
> > - Wording becomes inconsistent: An exclusive fence which is also a
> > shared is a bit confusing. I think it's better if we stick to the
> > current rules for dma_resv, change the semantics we want in drivers (I
> > think that's doable, at maybe some code cost e.g. Jason's import ioctl
> > would be simpler with your changed rules, but still doable with the
> > current dma_resv rules). And then when we have that, we figure out
> > what to change with the dma_resv struct/rules.
> 
> But then at least do the minimal change so that we can get amdgpu in line
> with all other drivers like I outlined above.
> 
> We can keep that as a hack in amdgpu if that makes you feel better. Chaining
> the exclusive fence together is roughly 4 times slower than the shared
> approach, but I think that this is negligible compared to all the other
> stuff we do.

Yeah I was pondering the chaining, and for the intentional sync it's
not a problem because it's just 1 winsys buffer we touch like this. So
totally fine in Jason's approach. But not for amdgpu, where the
current uapi means you have to annotate _all_ buffers as written to.

So not great, which is why I've already thrown a few variants of this idea
out as impractical. Hence the current idea I'm toying with above.

Cheers, Daniel


> 
> Regards,
> Christian.
> 
> > Wrt the patches: Good thing is that what you change here and what I've
> > found thus far is 100% not overlapping, so at least we didn't waste
> > time auditing the same code :-)
> > 
> > Cheers, Daniel
> > > Regards,
> > > Christian.
> > > 
> > > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: handle exclusive fence similar to shared ones
  2021-06-07 15:09     ` Daniel Vetter
@ 2021-06-07 16:25       ` Christian König
  2021-06-09 13:42         ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2021-06-07 16:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel



Am 07.06.21 um 17:09 schrieb Daniel Vetter:
> On Mon, Jun 07, 2021 at 11:59:11AM +0200, Christian König wrote:
>> Am 07.06.21 um 10:58 schrieb Daniel Vetter:
>>> Hi Christian,
>>>
>>> So unfortunately I got distracted with some i915 bugs and fun last
>>> week completely, so didn't get around to it.
>>>
>>> On Sun, Jun 6, 2021 at 12:03 PM Christian König
>>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>> Hi Daniel,
>>>>
>>>> as discussed here are the patches which change the handle around exclusive fence handling.
>>>>
>>>> The main problem seems to have been the dma_resv_test_signaled() function which ignored the exclusive fence when shared fences where present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.
>>>>
>>>> The second patch then fixes nouveu to also always take the exclusive fence into account.
>>>>
>>>> The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure if there are no other places which relies on the existing behavior I will hold this one back for a while.
>>>>
>>>> Is that what you had in mind as well?
>>> I think from the semantics something where we treat the exclusive
>>> fence as an IPC mechanism that the kernel doesn't care much about
>>> (exceptions apply), and but more consistently count all access from
>>> any CS as a shared fence. So in a way what you've done here, and also
>>> what you've done in the earlier series with setting the read/write
>>> flags on shared fences.
>> Yeah, I think that this will work for me as well.
>>
>>> For actual approach what I've picked is a bit of what amdgpu does +
>>> what other drivers do with NO_IMPLICIT, but with the bugs fixed
>>> (there's a bunch of them): Essentially we try to always set the shared
>>> fences, and exclusive fences are set additionally on top when the
>>> implicit sync IPC calls for that. And on the depdendency side we do
>>> clever logic to only take in the exclusive fence when required.
>>> Currently for amdgpu this means introspecting the fence owner (there's
>>> some nasty tricks there I think to do to make this work and not be a
>>> security bug), for others that's done with the NO_IMPLICIT flag (but
>>> again some nasty corners there, which I think a bunch of drivers get
>>> wrong).
>> For amdgpu I have been pondering on the following idea  last week to make it
>> behave the same as the other drivers:
>>
>> 1. We allow setting the explicit fence without touching the shared fences.
>>      As far as I understand it this is also part of your idea above.
>>
>> 2. During command submission amdgpu uses a dma_fence_chain node to chain
>> together the new CS with the existing explicit sync.
>>
>> 3. During command synchronization amdgpu takes a look at the explicit fence
>> and walks the dma_fence_chain history.
>>      Submissions from the same process (the owner) are not synced to (e.g.
>> same behavior as of today), but as soon as we see something which doesn't
>> fit into the amdgpu CS model we sync to the remaining chain.
>>
>> That would give us both keeping the current amdgpu CS behavior (which we
>> then can extend) as well as setting the explicit fence according to the
>> DMA-buf rules.
> So what I had in mind is:
>
> 1. we reserve 3 additional shared slots (so one more than currently)
>
> 2. when we pull in depedencies we ignore exclusive fences when they're an
> amdgpu/amdkfd one, only when it's a OWNER_UNKNOWN do we take it
>
> 3. above obviously breaks buffer moves, to fix that we always add the
> ttm_bo->moving fence. If amggpu would support a "ignore implicit fencing"
> flag like other drivers with NO_IMPLICIT, then we'd also need to overrule
> that for a dynamically shared dma-buf (since for those we don't have a
> ->moving fence slot). Non-dynamic dma-buf aren't a problem since they are
> guaranteed to be pinned, so can't move.
>
> 4. When we add fences we
> - always add the exclusive fence (like in my patch)
> - keep the current set of shared fences
> - add our own fences also as a shared one (so that amdpug can ignore the
>    exclusive fence for any sync against amdgpu, whether same owner or other
>    owner). This is the critical piece to make sure the current uapi for
>    amdgpu isn't changed
> - add the previous exclusive fence if a) there is one and b) it's not an
>    amdgpu/kfd one. This is where we need the additional fence slot

That won't work. The problem is that you have only one exclusive slot, 
but multiple submissions which execute out of order and compose the 
buffer object together.

That's why I suggested using the dma_fence_chain to circumvent this.

But if you are ok with amdgpu setting the exclusive fence without changing
the shared ones, then the solution I've outlined should already work as well.

Regards,
Christian.

>
> At first glance this throws away foreign exclusive fences, which could
> break implicit sync. But by moving foreign exclusive fences to the shared
> slot, we can rely on the amdgpu implicit sync logic of only looking at the
> owner (and not whether a fence is exclusive of shared), and we get the
> right implicit fencing even on subsequent CS.
>
> And for foreign drivers it also all works, because the exlusive fence is
> always set, and because amdgpu doesn't ignore foreign fences (even if
> they're set as shared we force a sync iirc) there's a dependency chain
> that makes sure everything is correct and ordered. Same for dma-buf
> poll/sync_file export, that would then work on amdgpu correctly too
> because the exclusive slot is set.
>
> The only downside here is that amdgpu always sets the exclusive fence
> slot, but that can't be fixed without an uapi revision since the kernel
> simply doesn't know that. But amdgpu isn't the only driver, panfrost does
> the same so *shrugh*.
>
> So I think this should work but
> - it's a hellalot of auditing to make sure I didn't miss anything
> - and it's like attempt no 5 or so of me trying to slice this knot without
>    breaking anything, or changing the current dma_resv rules.
>
>>> There's two reasons I'm more leaning in that direction:
>>> - The annoying thing is that the audit on the dependency side is a lot
>>> trickier since everyone rolls their own dependency handling.
>> Yes, absolutely agree. That's why I said we need to have use case based
>> functionality here.
>>
>> In other words what we need is something like an
>> dma_resv_for_each_sync_fence(for_write) macro.
>>
>> E.g. drivers then only do something like:
>>
>> dma_resv_for_each_sync_fence(resv, for_write, fence)
>>      driver_specific_syncing_to_fence(fence);
>>
>> And not every driver calling the underlying functions on it's own and then
>> doing whatever it pleases.
> Yeah, but amdgpu can't use those, so we're back to square one. amdgpu
> currently has zero information from userspace about which CS are writes
> and which are not. Other drivers (aside from panfrost) generally have
> that, so they can do smarter things here.
>
> Also we could fairly trivially fix this by adding new uapi so that amdgpu
> would know this, and just oversyncing on old uerspace. But you made it
> pretty clear when I proposed that that this option isn't on the table.
>
> So for now we need to be more clever to get amdgpu aligned. And then when
> that's done we (well you guys, maybe using the patches from Jason + a CS
> flag to not do implicit sync at all) can add the uapi to make this
> smarter.
>
> Then, and only then, do we have the pieces to look into smarter/use-case
> dependent dma_resv helpers.
>
> Also, some of these helpers already exist, and are used by the drivers
> derived from v3d. But amdgpu can't use them, because the "just oversync
> for current userspace" approach you nacked. So you'll have to live with
> your own quirks. I don't want to make helpers for that because then other
> drivers might come up with the idea to use them :-)
>
>>> If we don't change (for now at least) the rules around dma_resv then an
>>> oversight in the audit isn't going to be a huge problem.
>>> - Wording becomes inconsistent: An exclusive fence which is also a
>>> shared is a bit confusing. I think it's better if we stick to the
>>> current rules for dma_resv, change the semantics we want in drivers (I
>>> think that's doable, at maybe some code cost e.g. Jason's import ioctl
>>> would be simpler with your changed rules, but still doable with the
>>> current dma_resv rules). And then when we have that, we figure out
>>> what to change with the dma_resv struct/rules.
>> But then at least do the minimal change so that we can get amdgpu in line
>> with all other drivers like I outlined above.
>>
>> We can keep that as a hack in amdgpu if that makes you feel better. Chaining
>> the exclusive fence together is roughly 4 times slower than the shared
>> approach, but I think that this is negligible compared to all the other
>> stuff we do.
> Yeah I was pondering on the chaining, and for the intentional sync it's
> not a problem because it's just 1 winsys buffer we touch like this. So
> totally fine in Jason's approach. But not for amdgpu, where with the
> current uapi means you have to annotate _all_ buffers as written to.
>
> So not great, and which is why I've thrown a few variants of this idea out
> already as unpractical. Hence the current idea I'm toying with above.
>
> Cheers, Daniel
>
>
>> Regards,
>> Christian.
>>
>>> Wrt the patches: Good thing is that what you change here and what I've
>>> found thus far is 100% not overlapping, so at least we didn't waste
>>> time auditing the same code :-)
>>>
>>> Cheers, Daniel
>>>> Regards,
>>>> Christian.
>>>>
>>>>



* Re: handle exclusive fence similar to shared ones
  2021-06-07 16:25       ` Christian König
@ 2021-06-09 13:42         ` Daniel Vetter
  2021-06-09 14:07           ` Christian König
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2021-06-09 13:42 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On Mon, Jun 07, 2021 at 06:25:42PM +0200, Christian König wrote:
> 
> 
> Am 07.06.21 um 17:09 schrieb Daniel Vetter:
> > On Mon, Jun 07, 2021 at 11:59:11AM +0200, Christian König wrote:
> > > Am 07.06.21 um 10:58 schrieb Daniel Vetter:
> > > > Hi Christian,
> > > > 
> > > > So unfortunately I got distracted with some i915 bugs and fun last
> > > > week completely, so didn't get around to it.
> > > > 
> > > > On Sun, Jun 6, 2021 at 12:03 PM Christian König
> > > > <ckoenig.leichtzumerken@gmail.com> wrote:
> > > > > Hi Daniel,
> > > > > 
> > > > > as discussed here are the patches which change the handle around exclusive fence handling.
> > > > > 
> > > > > The main problem seems to have been the dma_resv_test_signaled() function which ignored the exclusive fence when shared fences where present. This was already rather inconsistent since dma_fence_wait_timeout() takes the exclusive one into account even if shared ones are present.
> > > > > 
> > > > > The second patch then fixes nouveu to also always take the exclusive fence into account.
> > > > > 
> > > > > The third then removes the workaround in amdgpu around the VM page table clearing handling. Since I'm not sure if there are no other places which relies on the existing behavior I will hold this one back for a while.
> > > > > 
> > > > > Is that what you had in mind as well?
> > > > I think from the semantics something where we treat the exclusive
> > > > fence as an IPC mechanism that the kernel doesn't care much about
> > > > (exceptions apply), and but more consistently count all access from
> > > > any CS as a shared fence. So in a way what you've done here, and also
> > > > what you've done in the earlier series with setting the read/write
> > > > flags on shared fences.
> > > Yeah, I think that this will work for me as well.
> > > 
> > > > For actual approach what I've picked is a bit of what amdgpu does +
> > > > what other drivers do with NO_IMPLICIT, but with the bugs fixed
> > > > (there's a bunch of them): Essentially we try to always set the shared
> > > > fences, and exclusive fences are set additionally on top when the
> > > > implicit sync IPC calls for that. And on the depdendency side we do
> > > > clever logic to only take in the exclusive fence when required.
> > > > Currently for amdgpu this means introspecting the fence owner (there's
> > > > some nasty tricks there I think to do to make this work and not be a
> > > > security bug), for others that's done with the NO_IMPLICIT flag (but
> > > > again some nasty corners there, which I think a bunch of drivers get
> > > > wrong).
> > > For amdgpu I have been pondering on the following idea  last week to make it
> > > behave the same as the other drivers:
> > > 
> > > 1. We allow setting the explicit fence without touching the shared fences.
> > >      As far as I understand it this is also part of your idea above.
> > > 
> > > 2. During command submission amdgpu uses a dma_fence_chain node to chain
> > > together the new CS with the existing explicit sync.
> > > 
> > > 3. During command synchronization amdgpu takes a look at the explicit fence
> > > and walks the dma_fence_chain history.
> > >      Submissions from the same process (the owner) are not synced to (e.g.
> > > same behavior as of today), but as soon as we see something which doesn't
> > > fit into the amdgpu CS model we sync to the remaining chain.
> > > 
> > > That would give us both keeping the current amdgpu CS behavior (which we
> > > then can extend) as well as setting the explicit fence according to the
> > > DMA-buf rules.
> > So what I had in mind is:
> > 
> > 1. we reserve 3 additional shared slots (so one more than currently)
> > 
> > 2. when we pull in depedencies we ignore exclusive fences when they're an
> > amdgpu/amdkfd one, only when it's a OWNER_UNKNOWN do we take it
> > 
> > 3. above obviously breaks buffer moves, to fix that we always add the
> > ttm_bo->moving fence. If amggpu would support a "ignore implicit fencing"
> > flag like other drivers with NO_IMPLICIT, then we'd also need to overrule
> > that for a dynamically shared dma-buf (since for those we don't have a
> > ->moving fence slot). Non-dynamic dma-buf aren't a problem since they are
> > guaranteed to be pinned, so can't move.
> > 
> > 4. When we add fences we
> > - always add the exclusive fence (like in my patch)
> > - keep the current set of shared fences
> > - add our own fences also as a shared one (so that amdpug can ignore the
> >    exclusive fence for any sync against amdgpu, whether same owner or other
> >    owner). This is the critical piece to make sure the current uapi for
> >    amdgpu isn't changed
> > - add the previous exclusive fence if a) there is one and b) it's not an
> >    amdgpu/kfd one. This is where we need the additional fence slot
> 
> That won't work. The problem is that you have only one exclusive slot, but
> multiple submissions which execute out of order and compose the buffer
> object together.
> 
> That's why I suggested to use the dma_fence_chain to circumvent this.
> 
> But if you are ok that amdgpu sets the exclusive fence without changing the
> shared ones than the solution I've outlined should already work as well.

Uh that's indeed nasty. Can you give me the details of the exact use-case
so I can read the userspace code and come up with an idea? I was assuming
that even with parallel processing there's at least one step at the end
that unifies it for the next process.

If we can't detect this somehow then it means we do indeed have to create
a fence_chain for the exclusive slot for everything, which would be nasty.
Or a large-scale redo across all drivers, which is probably even more
nasty.
-Daniel


> 
> Regards,
> Christian.
> 
> > 
> > At first glance this throws away foreign exclusive fences, which could
> > break implicit sync. But by moving foreign exclusive fences to the shared
> > slot, we can rely on the amdgpu implicit sync logic of only looking at the
> > owner (and not whether a fence is exclusive of shared), and we get the
> > right implicit fencing even on subsequent CS.
> > 
> > And for foreign drivers it also all works, because the exlusive fence is
> > always set, and because amdgpu doesn't ignore foreign fences (even if
> > they're set as shared we force a sync iirc) there's a dependency chain
> > that makes sure everything is correct and ordered. Same for dma-buf
> > poll/sync_file export, that would then work on amdgpu correctly too
> > because the exclusive slot is set.
> > 
> > The only downside here is that amdgpu always sets the exclusive fence
> > slot, but that can't be fixed without an uapi revision since the kernel
> > simply doesn't know that. But amdgpu isn't the only driver, panfrost does
> > the same so *shrugh*.
> > 
> > So I think this should work but
> > - it's a hellalot of auditing to make sure I didn't miss anything
> > - and it's like attempt no 5 or so of me trying to slice this knot without
> >    breaking anything, or changing the current dma_resv rules.
> > 
> > > > There's two reasons I'm more leaning in that direction:
> > > > - The annoying thing is that the audit on the dependency side is a lot
> > > > trickier since everyone rolls their own dependency handling.
> > > Yes, absolutely agree. That's why I said we need to have use case based
> > > functionality here.
> > > 
> > > In other words what we need is something like an
> > > dma_resv_for_each_sync_fence(for_write) macro.
> > > 
> > > E.g. drivers then only do something like:
> > > 
> > > dma_resv_for_each_sync_fence(resv, for_write, fence)
> > >      driver_specific_syncing_to_fence(fence);
> > > 
> > > And not every driver calling the underlying functions on it's own and then
> > > doing whatever it pleases.
> > Yeah, but amdgpu can't use those, so we're back to square one. amdgpu
> > currently has zero information from userspace about which CS are writes
> > and which are not. Other drivers (aside from panfrost) generally have
> > that, so they can do smarter things here.
> > 
> > Also we could fairly trivially fix this by adding new uapi so that amdgpu
> > would know this, and just oversyncing on old uerspace. But you made it
> > pretty clear when I proposed that that this option isn't on the table.
> > 
> > So for now we need to be more clever to get amdgpu aligned. And then when
> > that's done we (well you guys, maybe using the patches from Jason + a CS
> > flag to not do implicit sync at all) can add the uapi to make this
> > smarter.
> > 
> > Then, and only then, do we have the pieces to look into smarter/use-case
> > dependent dma_resv helpers.
> > 
> > Also, some of these helpers already exist, and are used by the drivers
> > derived from v3d. But amdgpu can't use them, because the "just oversync
> > for current userspace" approach you nacked. So you'll have to live with
> > your own quirks. I don't want to make helpers for that because then other
> > drivers might come up with the idea to use them :-)
> > 
> > > > If we don't change (for now at least) the rules around dma_resv then an
> > > > oversight in the audit isn't going to be a huge problem.
> > > > - Wording becomes inconsistent: An exclusive fence which is also a
> > > > shared is a bit confusing. I think it's better if we stick to the
> > > > current rules for dma_resv, change the semantics we want in drivers (I
> > > > think that's doable, at maybe some code cost e.g. Jason's import ioctl
> > > > would be simpler with your changed rules, but still doable with the
> > > > current dma_resv rules). And then when we have that, we figure out
> > > > what to change with the dma_resv struct/rules.
> > > But then at least do the minimal change so that we can get amdgpu in line
> > > with all other drivers like I outlined above.
> > > 
> > > We can keep that as a hack in amdgpu if that makes you feel better. Chaining
> > > the exclusive fence together is roughly 4 times slower than the shared
> > > approach, but I think that this is negligible compared to all the other
> > > stuff we do.
> > Yeah I was pondering on the chaining, and for the intentional sync it's
> > not a problem because it's just 1 winsys buffer we touch like this. So
> > totally fine in Jason's approach. But not for amdgpu, where with the
> > current uapi means you have to annotate _all_ buffers as written to.
> > 
> > So not great, and which is why I've thrown a few variants of this idea out
> > already as unpractical. Hence the current idea I'm toying with above.
> > 
> > Cheers, Daniel
> > 
> > 
> > > Regards,
> > > Christian.
> > > 
> > > > Wrt the patches: Good thing is that what you change here and what I've
> > > > found thus far is 100% not overlapping, so at least we didn't waste
> > > > time auditing the same code :-)
> > > > 
> > > > Cheers, Daniel
> > > > > Regards,
> > > > > Christian.
> > > > > 
> > > > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: handle exclusive fence similar to shared ones
  2021-06-09 13:42         ` Daniel Vetter
@ 2021-06-09 14:07           ` Christian König
  2021-06-10 16:41             ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2021-06-09 14:07 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

Am 09.06.21 um 15:42 schrieb Daniel Vetter:
> [SNIP]
>> That won't work. The problem is that you have only one exclusive slot, but
>> multiple submissions which execute out of order and compose the buffer
>> object together.
>>
>> That's why I suggested to use the dma_fence_chain to circumvent this.
>>
>> But if you are ok that amdgpu sets the exclusive fence without changing the
>> shared ones than the solution I've outlined should already work as well.
> Uh that's indeed nasty. Can you give me the details of the exact use-case
> so I can read the userspace code and come up with an idea? I was assuming
> that even with parallel processing there's at least one step at the end
> that unifies it for the next process.

Unfortunately not; with Vulkan that is really in the hands of the
application.

But the example we have in the test cases is using 3D+DMA to compose a 
buffer IIRC.

> If we can't detect this somehow then it means we do indeed have to create
> a fence_chain for the exclusive slot for everything, which would be nasty.

I've already created a prototype of that and it is not that bad. It does 
have some noticeable overhead, but I think that's ok.

> Or a large-scale redo across all drivers, which is probaly even more
> nasty.

Yeah, that is indeed harder to get right.

Christian.

> -Daniel
>
>



* Re: handle exclusive fence similar to shared ones
  2021-06-09 14:07           ` Christian König
@ 2021-06-10 16:41             ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2021-06-10 16:41 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On Wed, Jun 09, 2021 at 04:07:24PM +0200, Christian König wrote:
> Am 09.06.21 um 15:42 schrieb Daniel Vetter:
> > [SNIP]
> > > That won't work. The problem is that you have only one exclusive slot, but
> > > multiple submissions which execute out of order and compose the buffer
> > > object together.
> > > 
> > > That's why I suggested to use the dma_fence_chain to circumvent this.
> > > 
> > > But if you are ok that amdgpu sets the exclusive fence without changing the
> > > shared ones than the solution I've outlined should already work as well.
> > Uh that's indeed nasty. Can you give me the details of the exact use-case
> > so I can read the userspace code and come up with an idea? I was assuming
> > that even with parallel processing there's at least one step at the end
> > that unifies it for the next process.
> 
> Unfortunately not, with Vulkan that is really in the hand of the
> application.

Vulkan explicitly says implicit sync isn't a thing, and you need to
import/export syncobj if you e.g. want to share a buffer with GL.

Ofc because amdgpu always syncs there's a good chance that userspace
running on amdgpu vk doesn't get this right and is breaking the vk spec
here :-/

> But the example we have in the test cases is using 3D+DMA to compose a
> buffer IIRC.

Yeah that's the more interesting one I think. I've heard of some
post-processing steps, but that always needs to wait for 3D to finish. 3D
+ copy engine is a separate thing.

> > If we can't detect this somehow then it means we do indeed have to create
> > a fence_chain for the exclusive slot for everything, which would be nasty.
> 
> I've already created a prototype of that and it is not that bad. It does
> have some noticeable overhead, but I think that's ok.

Yup, seen that, I'll go and review it tomorrow hopefully. It's not great,
but it's definitely a lot better than the force-always-sync approach.

> > Or a large-scale redo across all drivers, which is probaly even more
> > nasty.
> 
> Yeah, that is indeed harder to get right.

Yeah, and there's also a bunch of other confusions in that area.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

