All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:16 Christian König
  2017-09-04 15:50 ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-04 13:16 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

From: Christian König <christian.koenig@amd.com>

The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
acquire a reference it doesn't necessary mean that there is no fence at all.

It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-fence.h | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a5195a7..37f3d67 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
 		struct dma_fence *fence;
 
 		fence = rcu_dereference(*fencep);
-		if (!fence || !dma_fence_get_rcu(fence))
-			return NULL;
-
-		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
-		 * provides a full memory barrier upon success (such as now).
-		 * This is paired with the write barrier from assigning
-		 * to the __rcu protected fence pointer so that if that
-		 * pointer still matches the current fence, we know we
-		 * have successfully acquire a reference to it. If it no
-		 * longer matches, we are holding a reference to some other
-		 * reallocated pointer. This is possible if the allocator
-		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
-		 * fence remains valid for the RCU grace period, but it
-		 * may be reallocated. When using such allocators, we are
-		 * responsible for ensuring the reference we get is to
-		 * the right fence, as below.
-		 */
-		if (fence == rcu_access_pointer(*fencep))
-			return rcu_pointer_handoff(fence);
-
-		dma_fence_put(fence);
+		if (!fence || dma_fence_get_rcu(fence))
+			return fence;
 	} while (1);
 }
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-04 13:16 [PATCH] dma-fence: fix dma_fence_get_rcu_safe Christian König
@ 2017-09-04 15:50 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-09-04 15:50 UTC (permalink / raw)
  To: amd-gfx, Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig

I really wonder what's wrong with my mail client, but it looks like this 
patch never made it at least to dri-devel.

Forwarding manually now,
Christian.

Am 04.09.2017 um 15:16 schrieb Christian König:
> From: Christian König <christian.koenig@amd.com>
>
> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> acquire a reference it doesn't necessary mean that there is no fence at all.
>
> It usually mean that the fence was replaced by a new one and in this situation
> we certainly want to have the new one as result and *NOT* NULL.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   include/linux/dma-fence.h | 23 ++---------------------
>   1 file changed, 2 insertions(+), 21 deletions(-)
>
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index a5195a7..37f3d67 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
>   		struct dma_fence *fence;
>   
>   		fence = rcu_dereference(*fencep);
> -		if (!fence || !dma_fence_get_rcu(fence))
> -			return NULL;
> -
> -		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
> -		 * provides a full memory barrier upon success (such as now).
> -		 * This is paired with the write barrier from assigning
> -		 * to the __rcu protected fence pointer so that if that
> -		 * pointer still matches the current fence, we know we
> -		 * have successfully acquire a reference to it. If it no
> -		 * longer matches, we are holding a reference to some other
> -		 * reallocated pointer. This is possible if the allocator
> -		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
> -		 * fence remains valid for the RCU grace period, but it
> -		 * may be reallocated. When using such allocators, we are
> -		 * responsible for ensuring the reference we get is to
> -		 * the right fence, as below.
> -		 */
> -		if (fence == rcu_access_pointer(*fencep))
> -			return rcu_pointer_handoff(fence);
> -
> -		dma_fence_put(fence);
> +		if (!fence || dma_fence_get_rcu(fence))
> +			return fence;
>   	} while (1);
>   }
>   

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-20 18:20                 ` Daniel Vetter
@ 2017-09-29 12:34                     ` Joonas Lahtinen
  2017-09-29 12:34                     ` Joonas Lahtinen
  1 sibling, 0 replies; 23+ messages in thread
From: Joonas Lahtinen @ 2017-09-29 12:34 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig, Maarten Lankhorst, Gustavo Padovan,
	Tvrtko Ursulin, Chris Wilson

On Wed, 2017-09-20 at 20:20 +0200, Daniel Vetter wrote:
> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
> > Am 11.09.2017 um 12:01 schrieb Chris Wilson:
> > > [SNIP]
> > > > Yeah, but that is illegal with a fence objects.
> > > > 
> > > > When anybody allocates fences this way it breaks at least
> > > > reservation_object_get_fences_rcu(),
> > > > reservation_object_wait_timeout_rcu() and
> > > > reservation_object_test_signaled_single().
> > > 
> > > Many, many months ago I sent patches to fix them all.
> > 
> > Found those after a bit a searching. Yeah, those patches where proposed more
> > than a year ago, but never pushed upstream.
> > 
> > Not sure if we really should go this way. dma_fence objects are shared
> > between drivers and since we can't judge if it's the correct fence based on
> > a criteria in the object (only the read counter which is outside) all
> > drivers need to be correct for this.
> > 
> > I would rather go the way and change dma_fence_release() to wrap
> > fence->ops->release into call_rcu() to keep the whole RCU handling outside
> > of the individual drivers.
> 
> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
> to get some agreement on this between i915 and dma_fence. Adding a pile
> more people.
> 
> Joonas, Tvrtko, I guess we need to fix this one way or the other.

I definitely didn't get the memo or notice this before. Tvrtko/Chris?

Regars, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-29 12:34                     ` Joonas Lahtinen
  0 siblings, 0 replies; 23+ messages in thread
From: Joonas Lahtinen @ 2017-09-29 12:34 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig, Maarten Lankhorst, Gustavo Padovan,
	Tvrtko Ursulin

On Wed, 2017-09-20 at 20:20 +0200, Daniel Vetter wrote:
> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
> > Am 11.09.2017 um 12:01 schrieb Chris Wilson:
> > > [SNIP]
> > > > Yeah, but that is illegal with a fence objects.
> > > > 
> > > > When anybody allocates fences this way it breaks at least
> > > > reservation_object_get_fences_rcu(),
> > > > reservation_object_wait_timeout_rcu() and
> > > > reservation_object_test_signaled_single().
> > > 
> > > Many, many months ago I sent patches to fix them all.
> > 
> > Found those after a bit a searching. Yeah, those patches where proposed more
> > than a year ago, but never pushed upstream.
> > 
> > Not sure if we really should go this way. dma_fence objects are shared
> > between drivers and since we can't judge if it's the correct fence based on
> > a criteria in the object (only the read counter which is outside) all
> > drivers need to be correct for this.
> > 
> > I would rather go the way and change dma_fence_release() to wrap
> > fence->ops->release into call_rcu() to keep the whole RCU handling outside
> > of the individual drivers.
> 
> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
> to get some agreement on this between i915 and dma_fence. Adding a pile
> more people.
> 
> Joonas, Tvrtko, I guess we need to fix this one way or the other.

I definitely didn't get the memo or notice this before. Tvrtko/Chris?

Regars, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-21  7:00                   ` Christian König
@ 2017-09-21  7:29                       ` Maarten Lankhorst
  0 siblings, 0 replies; 23+ messages in thread
From: Maarten Lankhorst @ 2017-09-21  7:29 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig, Gustavo Padovan, Joonas Lahtinen,
	Tvrtko Ursulin

Op 21-09-17 om 09:00 schreef Christian König:
> Am 20.09.2017 um 20:20 schrieb Daniel Vetter:
>> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
>>> Am 11.09.2017 um 12:01 schrieb Chris Wilson:
>>>> [SNIP]
>>>>> Yeah, but that is illegal with a fence objects.
>>>>>
>>>>> When anybody allocates fences this way it breaks at least
>>>>> reservation_object_get_fences_rcu(),
>>>>> reservation_object_wait_timeout_rcu() and
>>>>> reservation_object_test_signaled_single().
>>>> Many, many months ago I sent patches to fix them all.
>>> Found those after a bit a searching. Yeah, those patches where proposed more
>>> than a year ago, but never pushed upstream.
>>>
>>> Not sure if we really should go this way. dma_fence objects are shared
>>> between drivers and since we can't judge if it's the correct fence based on
>>> a criteria in the object (only the read counter which is outside) all
>>> drivers need to be correct for this.
>>>
>>> I would rather go the way and change dma_fence_release() to wrap
>>> fence->ops->release into call_rcu() to keep the whole RCU handling outside
>>> of the individual drivers.
>> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
>> to get some agreement on this between i915 and dma_fence. Adding a pile
>> more people.
>
> For the meantime I've send a v2 of this patch to fix at least the buggy return of NULL when we fail to grab the RCU reference but keeping the extra checking for now.
>
> Can I get an rb on this please so that we fix at least the bug at hand?
>
> Thanks,
> Christian. 
Done.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-21  7:29                       ` Maarten Lankhorst
  0 siblings, 0 replies; 23+ messages in thread
From: Maarten Lankhorst @ 2017-09-21  7:29 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Tvrtko Ursulin, Daniel Vetter, Joonas Lahtinen, dri-devel,
	linaro-mm-sig, linux-media

Op 21-09-17 om 09:00 schreef Christian König:
> Am 20.09.2017 um 20:20 schrieb Daniel Vetter:
>> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
>>> Am 11.09.2017 um 12:01 schrieb Chris Wilson:
>>>> [SNIP]
>>>>> Yeah, but that is illegal with a fence objects.
>>>>>
>>>>> When anybody allocates fences this way it breaks at least
>>>>> reservation_object_get_fences_rcu(),
>>>>> reservation_object_wait_timeout_rcu() and
>>>>> reservation_object_test_signaled_single().
>>>> Many, many months ago I sent patches to fix them all.
>>> Found those after a bit a searching. Yeah, those patches where proposed more
>>> than a year ago, but never pushed upstream.
>>>
>>> Not sure if we really should go this way. dma_fence objects are shared
>>> between drivers and since we can't judge if it's the correct fence based on
>>> a criteria in the object (only the read counter which is outside) all
>>> drivers need to be correct for this.
>>>
>>> I would rather go the way and change dma_fence_release() to wrap
>>> fence->ops->release into call_rcu() to keep the whole RCU handling outside
>>> of the individual drivers.
>> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
>> to get some agreement on this between i915 and dma_fence. Adding a pile
>> more people.
>
> For the meantime I've send a v2 of this patch to fix at least the buggy return of NULL when we fail to grab the RCU reference but keeping the extra checking for now.
>
> Can I get an rb on this please so that we fix at least the bug at hand?
>
> Thanks,
> Christian. 
Done.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-20 18:20                 ` Daniel Vetter
@ 2017-09-21  7:00                   ` Christian König
  2017-09-21  7:29                       ` Maarten Lankhorst
  2017-09-29 12:34                     ` Joonas Lahtinen
  1 sibling, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-21  7:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig, Maarten Lankhorst, Gustavo Padovan,
	Joonas Lahtinen, Tvrtko Ursulin

Am 20.09.2017 um 20:20 schrieb Daniel Vetter:
> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
>> Am 11.09.2017 um 12:01 schrieb Chris Wilson:
>>> [SNIP]
>>>> Yeah, but that is illegal with a fence objects.
>>>>
>>>> When anybody allocates fences this way it breaks at least
>>>> reservation_object_get_fences_rcu(),
>>>> reservation_object_wait_timeout_rcu() and
>>>> reservation_object_test_signaled_single().
>>> Many, many months ago I sent patches to fix them all.
>> Found those after a bit a searching. Yeah, those patches where proposed more
>> than a year ago, but never pushed upstream.
>>
>> Not sure if we really should go this way. dma_fence objects are shared
>> between drivers and since we can't judge if it's the correct fence based on
>> a criteria in the object (only the read counter which is outside) all
>> drivers need to be correct for this.
>>
>> I would rather go the way and change dma_fence_release() to wrap
>> fence->ops->release into call_rcu() to keep the whole RCU handling outside
>> of the individual drivers.
> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
> to get some agreement on this between i915 and dma_fence. Adding a pile
> more people.

For the meantime I've send a v2 of this patch to fix at least the buggy 
return of NULL when we fail to grab the RCU reference but keeping the 
extra checking for now.

Can I get an rb on this please so that we fix at least the bug at hand?

Thanks,
Christian.

>
> Joonas, Tvrtko, I guess we need to fix this one way or the other.
> -Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11 11:06               ` Christian König
@ 2017-09-20 18:20                 ` Daniel Vetter
  2017-09-21  7:00                   ` Christian König
  2017-09-29 12:34                     ` Joonas Lahtinen
  0 siblings, 2 replies; 23+ messages in thread
From: Daniel Vetter @ 2017-09-20 18:20 UTC (permalink / raw)
  To: Christian König
  Cc: Chris Wilson, Daniel Vetter, Sumit Semwal, linux-media,
	dri-devel, linaro-mm-sig, Maarten Lankhorst, Gustavo Padovan,
	Joonas Lahtinen, Tvrtko Ursulin

On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
> Am 11.09.2017 um 12:01 schrieb Chris Wilson:
> > [SNIP]
> > > Yeah, but that is illegal with a fence objects.
> > >
> > > When anybody allocates fences this way it breaks at least
> > > reservation_object_get_fences_rcu(),
> > > reservation_object_wait_timeout_rcu() and
> > > reservation_object_test_signaled_single().
> > Many, many months ago I sent patches to fix them all.
>
> Found those after a bit a searching. Yeah, those patches where proposed more
> than a year ago, but never pushed upstream.
>
> Not sure if we really should go this way. dma_fence objects are shared
> between drivers and since we can't judge if it's the correct fence based on
> a criteria in the object (only the read counter which is outside) all
> drivers need to be correct for this.
>
> I would rather go the way and change dma_fence_release() to wrap
> fence->ops->release into call_rcu() to keep the whole RCU handling outside
> of the individual drivers.

Hm, I entirely dropped the ball on this, I kinda assumed that we managed
to get some agreement on this between i915 and dma_fence. Adding a pile
more people.

Joonas, Tvrtko, I guess we need to fix this one way or the other.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11 10:01             ` Chris Wilson
@ 2017-09-11 11:06               ` Christian König
  2017-09-20 18:20                 ` Daniel Vetter
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-11 11:06 UTC (permalink / raw)
  To: Chris Wilson, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Am 11.09.2017 um 12:01 schrieb Chris Wilson:
> [SNIP]
>> Yeah, but that is illegal with a fence objects.
>>
>> When anybody allocates fences this way it breaks at least
>> reservation_object_get_fences_rcu(),
>> reservation_object_wait_timeout_rcu() and
>> reservation_object_test_signaled_single().
> Many, many months ago I sent patches to fix them all.

Found those after a bit a searching. Yeah, those patches where proposed 
more than a year ago, but never pushed upstream.

Not sure if we really should go this way. dma_fence objects are shared 
between drivers and since we can't judge if it's the correct fence based 
on a criteria in the object (only the read counter which is outside) all 
drivers need to be correct for this.

I would rather go the way and change dma_fence_release() to wrap 
fence->ops->release into call_rcu() to keep the whole RCU handling 
outside of the individual drivers.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11  9:57           ` Christian König
@ 2017-09-11 10:01             ` Chris Wilson
  2017-09-11 11:06               ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2017-09-11 10:01 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-11 10:57:57)
> Am 11.09.2017 um 11:23 schrieb Chris Wilson:
> > Quoting Christian König (2017-09-11 10:06:50)
> >> Am 11.09.2017 um 10:59 schrieb Chris Wilson:
> >>> Quoting Christian König (2017-09-11 09:50:40)
> >>>> Sorry for the delayed response, but your mail somehow ended up in the
> >>>> Spam folder.
> >>>>
> >>>> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> >>>>> Quoting Christian König (2017-09-04 14:27:33)
> >>>>>> From: Christian König <christian.koenig@amd.com>
> >>>>>>
> >>>>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> >>>>>> acquire a reference it doesn't necessary mean that there is no fence at all.
> >>>>>>
> >>>>>> It usually mean that the fence was replaced by a new one and in this situation
> >>>>>> we certainly want to have the new one as result and *NOT* NULL.
> >>>>> Which is not guaranteed by the code you wrote either.
> >>>>>
> >>>>> The point of the comment is that the mb is only inside the successful
> >>>>> kref_atomic_inc_unless_zero, and that only after that mb do you know
> >>>>> whether or not you have the current fence.
> >>>>>
> >>>>> You can argue that you want to replace the
> >>>>>         if (!dma_fence_get_rcu())
> >>>>>                 return NULL
> >>>>> with
> >>>>>         if (!dma_fence_get_rcu()
> >>>>>                 continue;
> >>>>> but it would be incorrect to say that by simply ignoring the
> >>>>> post-condition check that you do have the right fence.
> >>>> You are completely missing the point here.
> >>>>
> >>>> It is irrelevant if you have the current fence or not when you return.
> >>>> You can only guarantee that it is the current fence when you take a look
> >>>> and that is exactly what we want to avoid.
> >>>>
> >>>> So the existing code is complete nonsense. Instead what we need to
> >>>> guarantee is that we return *ANY* fence which we can grab a reference for.
> >>> Not quite. We can grab a reference on a fence that was already freed and
> >>> reused between the rcu_dereference() and dma_fence_get_rcu().
> >> Reusing a memory structure before the RCU grace period is completed is
> >> illegal, otherwise the whole RCU approach won't work.
> > RCU only protects that the pointer remains valid. If you use
> > SLAB_TYPESAFE_BY_RCU, it is possible to reuse the pointer within a grace
> > period. It does happen and that is the point the comment is trying to
> > make.
> 
> Yeah, but that is illegal with a fence objects.
> 
> When anybody allocates fences this way it breaks at least 
> reservation_object_get_fences_rcu(), 
> reservation_object_wait_timeout_rcu() and 
> reservation_object_test_signaled_single().

Many, many months ago I sent patches to fix them all.
-Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11  9:23           ` Chris Wilson
  (?)
@ 2017-09-11  9:57           ` Christian König
  2017-09-11 10:01             ` Chris Wilson
  -1 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-11  9:57 UTC (permalink / raw)
  To: Chris Wilson, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Am 11.09.2017 um 11:23 schrieb Chris Wilson:
> Quoting Christian König (2017-09-11 10:06:50)
>> Am 11.09.2017 um 10:59 schrieb Chris Wilson:
>>> Quoting Christian König (2017-09-11 09:50:40)
>>>> Sorry for the delayed response, but your mail somehow ended up in the
>>>> Spam folder.
>>>>
>>>> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
>>>>> Quoting Christian König (2017-09-04 14:27:33)
>>>>>> From: Christian König <christian.koenig@amd.com>
>>>>>>
>>>>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
>>>>>> acquire a reference it doesn't necessary mean that there is no fence at all.
>>>>>>
>>>>>> It usually mean that the fence was replaced by a new one and in this situation
>>>>>> we certainly want to have the new one as result and *NOT* NULL.
>>>>> Which is not guaranteed by the code you wrote either.
>>>>>
>>>>> The point of the comment is that the mb is only inside the successful
>>>>> kref_atomic_inc_unless_zero, and that only after that mb do you know
>>>>> whether or not you have the current fence.
>>>>>
>>>>> You can argue that you want to replace the
>>>>>         if (!dma_fence_get_rcu())
>>>>>                 return NULL
>>>>> with
>>>>>         if (!dma_fence_get_rcu()
>>>>>                 continue;
>>>>> but it would be incorrect to say that by simply ignoring the
>>>>> post-condition check that you do have the right fence.
>>>> You are completely missing the point here.
>>>>
>>>> It is irrelevant if you have the current fence or not when you return.
>>>> You can only guarantee that it is the current fence when you take a look
>>>> and that is exactly what we want to avoid.
>>>>
>>>> So the existing code is complete nonsense. Instead what we need to
>>>> guarantee is that we return *ANY* fence which we can grab a reference for.
>>> Not quite. We can grab a reference on a fence that was already freed and
>>> reused between the rcu_dereference() and dma_fence_get_rcu().
>> Reusing a memory structure before the RCU grace period is completed is
>> illegal, otherwise the whole RCU approach won't work.
> RCU only protects that the pointer remains valid. If you use
> SLAB_TYPESAFE_BY_RCU, it is possible to reuse the pointer within a grace
> period. It does happen and that is the point the comment is trying to
> make.

Yeah, but that is illegal with a fence objects.

When anybody allocates fences this way it breaks at least 
reservation_object_get_fences_rcu(), 
reservation_object_wait_timeout_rcu() and 
reservation_object_test_signaled_single().

Cause all of them rely on dma_fence_get() to return NULL when the fence 
isn't valid any more to restart the operation.

When dma_fence_get_rcu() returns a reallocated fence the operation 
wouldn't correctly restart and the end result most likely not be correct 
at all.

Using SLAB_TYPESAFE_BY_RCU is only valid if you can ensure that you have 
the right object using a second criteria and that is not the case with 
fences.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11  9:06       ` Christian König
@ 2017-09-11  9:23           ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-11  9:23 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-11 10:06:50)
> Am 11.09.2017 um 10:59 schrieb Chris Wilson:
> > Quoting Christian König (2017-09-11 09:50:40)
> >> Sorry for the delayed response, but your mail somehow ended up in the
> >> Spam folder.
> >>
> >> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> >>> Quoting Christian König (2017-09-04 14:27:33)
> >>>> From: Christian König <christian.koenig@amd.com>
> >>>>
> >>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> >>>> acquire a reference it doesn't necessary mean that there is no fence at all.
> >>>>
> >>>> It usually mean that the fence was replaced by a new one and in this situation
> >>>> we certainly want to have the new one as result and *NOT* NULL.
> >>> Which is not guaranteed by the code you wrote either.
> >>>
> >>> The point of the comment is that the mb is only inside the successful
> >>> kref_atomic_inc_unless_zero, and that only after that mb do you know
> >>> whether or not you have the current fence.
> >>>
> >>> You can argue that you want to replace the
> >>>        if (!dma_fence_get_rcu())
> >>>                return NULL
> >>> with
> >>>        if (!dma_fence_get_rcu()
> >>>                continue;
> >>> but it would be incorrect to say that by simply ignoring the
> >>> post-condition check that you do have the right fence.
> >> You are completely missing the point here.
> >>
> >> It is irrelevant if you have the current fence or not when you return.
> >> You can only guarantee that it is the current fence when you take a look
> >> and that is exactly what we want to avoid.
> >>
> >> So the existing code is complete nonsense. Instead what we need to
> >> guarantee is that we return *ANY* fence which we can grab a reference for.
> > Not quite. We can grab a reference on a fence that was already freed and
> > reused between the rcu_dereference() and dma_fence_get_rcu().
> 
> Reusing a memory structure before the RCU grace period is completed is 
> illegal, otherwise the whole RCU approach won't work.

RCU only protects that the pointer remains valid. If you use
SLAB_TYPESAFE_BY_RCU, it is possible to reuse the pointer within a grace
period. It does happen and that is the point the comment is trying to
make.
-Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-11  9:23           ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-11  9:23 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-11 10:06:50)
> Am 11.09.2017 um 10:59 schrieb Chris Wilson:
> > Quoting Christian König (2017-09-11 09:50:40)
> >> Sorry for the delayed response, but your mail somehow ended up in the
> >> Spam folder.
> >>
> >> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> >>> Quoting Christian König (2017-09-04 14:27:33)
> >>>> From: Christian König <christian.koenig@amd.com>
> >>>>
> >>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> >>>> acquire a reference it doesn't necessary mean that there is no fence at all.
> >>>>
> >>>> It usually mean that the fence was replaced by a new one and in this situation
> >>>> we certainly want to have the new one as result and *NOT* NULL.
> >>> Which is not guaranteed by the code you wrote either.
> >>>
> >>> The point of the comment is that the mb is only inside the successful
> >>> kref_atomic_inc_unless_zero, and that only after that mb do you know
> >>> whether or not you have the current fence.
> >>>
> >>> You can argue that you want to replace the
> >>>        if (!dma_fence_get_rcu())
> >>>                return NULL
> >>> with
> >>>        if (!dma_fence_get_rcu()
> >>>                continue;
> >>> but it would be incorrect to say that by simply ignoring the
> >>> post-condition check that you do have the right fence.
> >> You are completely missing the point here.
> >>
> >> It is irrelevant if you have the current fence or not when you return.
> >> You can only guarantee that it is the current fence when you take a look
> >> and that is exactly what we want to avoid.
> >>
> >> So the existing code is complete nonsense. Instead what we need to
> >> guarantee is that we return *ANY* fence which we can grab a reference for.
> > Not quite. We can grab a reference on a fence that was already freed and
> > reused between the rcu_dereference() and dma_fence_get_rcu().
> 
> Reusing a memory structure before the RCU grace period is completed is 
> illegal, otherwise the whole RCU approach won't work.

RCU only protects that the pointer remains valid. If you use
SLAB_TYPESAFE_BY_RCU, it is possible to reuse the pointer within a grace
period. It does happen and that is the point the comment is trying to
make.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11  8:59       ` Chris Wilson
  (?)
@ 2017-09-11  9:06       ` Christian König
  2017-09-11  9:23           ` Chris Wilson
  -1 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-11  9:06 UTC (permalink / raw)
  To: Chris Wilson, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Am 11.09.2017 um 10:59 schrieb Chris Wilson:
> Quoting Christian König (2017-09-11 09:50:40)
>> Sorry for the delayed response, but your mail somehow ended up in the
>> Spam folder.
>>
>> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
>>> Quoting Christian König (2017-09-04 14:27:33)
>>>> From: Christian König <christian.koenig@amd.com>
>>>>
>>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
>>>> acquire a reference it doesn't necessary mean that there is no fence at all.
>>>>
>>>> It usually mean that the fence was replaced by a new one and in this situation
>>>> we certainly want to have the new one as result and *NOT* NULL.
>>> Which is not guaranteed by the code you wrote either.
>>>
>>> The point of the comment is that the mb is only inside the successful
>>> kref_atomic_inc_unless_zero, and that only after that mb do you know
>>> whether or not you have the current fence.
>>>
>>> You can argue that you want to replace the
>>>        if (!dma_fence_get_rcu())
>>>                return NULL
>>> with
>>>        if (!dma_fence_get_rcu()
>>>                continue;
>>> but it would be incorrect to say that by simply ignoring the
>>> post-condition check that you do have the right fence.
>> You are completely missing the point here.
>>
>> It is irrelevant if you have the current fence or not when you return.
>> You can only guarantee that it is the current fence when you take a look
>> and that is exactly what we want to avoid.
>>
>> So the existing code is complete nonsense. Instead what we need to
>> guarantee is that we return *ANY* fence which we can grab a reference for.
> Not quite. We can grab a reference on a fence that was already freed and
> reused between the rcu_dereference() and dma_fence_get_rcu().

Reusing a memory structure before the RCU grace period is completed is 
illegal, otherwise the whole RCU approach won't work.

So that case can't happen.

Regards,
Christian.

> -Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-11  8:50   ` Christian König
@ 2017-09-11  8:59       ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-11  8:59 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-11 09:50:40)
> Sorry for the delayed response, but your mail somehow ended up in the 
> Spam folder.
> 
> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> > Quoting Christian König (2017-09-04 14:27:33)
> >> From: Christian König <christian.koenig@amd.com>
> >>
> >> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> >> acquire a reference it doesn't necessary mean that there is no fence at all.
> >>
> >> It usually mean that the fence was replaced by a new one and in this situation
> >> we certainly want to have the new one as result and *NOT* NULL.
> > Which is not guaranteed by the code you wrote either.
> >
> > The point of the comment is that the mb is only inside the successful
> > kref_atomic_inc_unless_zero, and that only after that mb do you know
> > whether or not you have the current fence.
> >
> > You can argue that you want to replace the
> >       if (!dma_fence_get_rcu())
> >               return NULL
> > with
> >       if (!dma_fence_get_rcu()
> >               continue;
> > but it would be incorrect to say that by simply ignoring the
> > post-condition check that you do have the right fence.
> 
> You are completely missing the point here.
> 
> It is irrelevant if you have the current fence or not when you return. 
> You can only guarantee that it is the current fence when you take a look 
> and that is exactly what we want to avoid.
> 
> So the existing code is complete nonsense. Instead what we need to 
> guarantee is that we return *ANY* fence which we can grab a reference for.

Not quite. We can grab a reference on a fence that was already freed and
reused between the rcu_dereference() and dma_fence_get_rcu().
-Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-11  8:59       ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-11  8:59 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-11 09:50:40)
> Sorry for the delayed response, but your mail somehow ended up in the 
> Spam folder.
> 
> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> > Quoting Christian König (2017-09-04 14:27:33)
> >> From: Christian König <christian.koenig@amd.com>
> >>
> >> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> >> acquire a reference it doesn't necessary mean that there is no fence at all.
> >>
> >> It usually mean that the fence was replaced by a new one and in this situation
> >> we certainly want to have the new one as result and *NOT* NULL.
> > Which is not guaranteed by the code you wrote either.
> >
> > The point of the comment is that the mb is only inside the successful
> > kref_atomic_inc_unless_zero, and that only after that mb do you know
> > whether or not you have the current fence.
> >
> > You can argue that you want to replace the
> >       if (!dma_fence_get_rcu())
> >               return NULL
> > with
> >       if (!dma_fence_get_rcu()
> >               continue;
> > but it would be incorrect to say that by simply ignoring the
> > post-condition check that you do have the right fence.
> 
> You are completely missing the point here.
> 
> It is irrelevant if you have the current fence or not when you return. 
> You can only guarantee that it is the current fence when you take a look 
> and that is exactly what we want to avoid.
> 
> So the existing code is complete nonsense. Instead what we need to 
> guarantee is that we return *ANY* fence which we can grab a reference for.

Not quite. We can grab a reference on a fence that was already freed and
reused between the rcu_dereference() and dma_fence_get_rcu().
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-04 13:40   ` Chris Wilson
  (?)
@ 2017-09-11  8:50   ` Christian König
  2017-09-11  8:59       ` Chris Wilson
  -1 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-09-11  8:50 UTC (permalink / raw)
  To: Chris Wilson, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Sorry for the delayed response, but your mail somehow ended up in the 
Spam folder.

Am 04.09.2017 um 15:40 schrieb Chris Wilson:
> Quoting Christian König (2017-09-04 14:27:33)
>> From: Christian König <christian.koenig@amd.com>
>>
>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
>> acquire a reference it doesn't necessary mean that there is no fence at all.
>>
>> It usually mean that the fence was replaced by a new one and in this situation
>> we certainly want to have the new one as result and *NOT* NULL.
> Which is not guaranteed by the code you wrote either.
>
> The point of the comment is that the mb is only inside the successful
> kref_atomic_inc_unless_zero, and that only after that mb do you know
> whether or not you have the current fence.
>
> You can argue that you want to replace the
> 	if (!dma_fence_get_rcu())
> 		return NULL
> with
> 	if (!dma_fence_get_rcu()
> 		continue;
> but it would be incorrect to say that by simply ignoring the
> post-condition check that you do have the right fence.

You are completely missing the point here.

It is irrelevant if you have the current fence or not when you return. 
You can only guarantee that it is the current fence when you take a look 
and that is exactly what we want to avoid.

So the existing code is complete nonsense. Instead what we need to 
guarantee is that we return *ANY* fence which we can grab a reference for.

See the usual life of a fence* variable looks like this:
1. assigning pointer to fence A;
2. assigning pointer to fence B;
3. assigning pointer to fence C;
....

When dma_fence_get_rcu_safe() is called between step #1 and step #2 for 
example it is perfectly valid to just return either fence A or fence B.

But it is invalid to return NULL because that suggests that we don't 
need to sync at all.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
  2017-09-04 13:27 ` Christian König
@ 2017-09-04 13:40   ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-04 13:40 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-04 14:27:33)
> From: Christian König <christian.koenig@amd.com>
> 
> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> acquire a reference it doesn't necessary mean that there is no fence at all.
> 
> It usually mean that the fence was replaced by a new one and in this situation
> we certainly want to have the new one as result and *NOT* NULL.

Which is not guaranteed by the code you wrote either.

The point of the comment is that the mb is only inside the successful
kref_atomic_inc_unless_zero, and that only after that mb do you know
whether or not you have the current fence.

You can argue that you want to replace the
	if (!dma_fence_get_rcu())
		return NULL
with
	if (!dma_fence_get_rcu()
		continue;
but it would be incorrect to say that by simply ignoring the
post-condition check that you do have the right fence.
-Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:40   ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2017-09-04 13:40 UTC (permalink / raw)
  To: Christian König, daniel.vetter, sumit.semwal, linux-media,
	dri-devel, linaro-mm-sig

Quoting Christian König (2017-09-04 14:27:33)
> From: Christian König <christian.koenig@amd.com>
> 
> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
> acquire a reference it doesn't necessary mean that there is no fence at all.
> 
> It usually mean that the fence was replaced by a new one and in this situation
> we certainly want to have the new one as result and *NOT* NULL.

Which is not guaranteed by the code you wrote either.

The point of the comment is that the mb is only inside the successful
kref_atomic_inc_unless_zero, and that only after that mb do you know
whether or not you have the current fence.

You can argue that you want to replace the
	if (!dma_fence_get_rcu())
		return NULL
with
	if (!dma_fence_get_rcu()
		continue;
but it would be incorrect to say that by simply ignoring the
post-condition check that you do have the right fence.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:27 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-09-04 13:27 UTC (permalink / raw)
  To: chris, daniel.vetter, sumit.semwal, linux-media, dri-devel,
	linaro-mm-sig

From: Christian König <christian.koenig@amd.com>

The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
acquire a reference it doesn't necessary mean that there is no fence at all.

It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-fence.h | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a5195a7..37f3d67 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
 		struct dma_fence *fence;
 
 		fence = rcu_dereference(*fencep);
-		if (!fence || !dma_fence_get_rcu(fence))
-			return NULL;
-
-		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
-		 * provides a full memory barrier upon success (such as now).
-		 * This is paired with the write barrier from assigning
-		 * to the __rcu protected fence pointer so that if that
-		 * pointer still matches the current fence, we know we
-		 * have successfully acquire a reference to it. If it no
-		 * longer matches, we are holding a reference to some other
-		 * reallocated pointer. This is possible if the allocator
-		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
-		 * fence remains valid for the RCU grace period, but it
-		 * may be reallocated. When using such allocators, we are
-		 * responsible for ensuring the reference we get is to
-		 * the right fence, as below.
-		 */
-		if (fence == rcu_access_pointer(*fencep))
-			return rcu_pointer_handoff(fence);
-
-		dma_fence_put(fence);
+		if (!fence || dma_fence_get_rcu(fence))
+			return fence;
 	} while (1);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:27 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-09-04 13:27 UTC (permalink / raw)
  To: chris, daniel.vetter, sumit.semwal, linux-media, dri-devel,
	linaro-mm-sig

From: Christian König <christian.koenig@amd.com>

The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
acquire a reference it doesn't necessary mean that there is no fence at all.

It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-fence.h | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a5195a7..37f3d67 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
 		struct dma_fence *fence;
 
 		fence = rcu_dereference(*fencep);
-		if (!fence || !dma_fence_get_rcu(fence))
-			return NULL;
-
-		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
-		 * provides a full memory barrier upon success (such as now).
-		 * This is paired with the write barrier from assigning
-		 * to the __rcu protected fence pointer so that if that
-		 * pointer still matches the current fence, we know we
-		 * have successfully acquire a reference to it. If it no
-		 * longer matches, we are holding a reference to some other
-		 * reallocated pointer. This is possible if the allocator
-		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
-		 * fence remains valid for the RCU grace period, but it
-		 * may be reallocated. When using such allocators, we are
-		 * responsible for ensuring the reference we get is to
-		 * the right fence, as below.
-		 */
-		if (fence == rcu_access_pointer(*fencep))
-			return rcu_pointer_handoff(fence);
-
-		dma_fence_put(fence);
+		if (!fence || dma_fence_get_rcu(fence))
+			return fence;
 	} while (1);
 }
 
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:20 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-09-04 13:20 UTC (permalink / raw)
  Cc: chris, daniel.vetter, sumit.semwal, linux-media, dri-devel,
	linaro-mm-sig

From: Christian König <christian.koenig@amd.com>

The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
acquire a reference it doesn't necessary mean that there is no fence at all.

It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-fence.h | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a5195a7..37f3d67 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
 		struct dma_fence *fence;
 
 		fence = rcu_dereference(*fencep);
-		if (!fence || !dma_fence_get_rcu(fence))
-			return NULL;
-
-		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
-		 * provides a full memory barrier upon success (such as now).
-		 * This is paired with the write barrier from assigning
-		 * to the __rcu protected fence pointer so that if that
-		 * pointer still matches the current fence, we know we
-		 * have successfully acquire a reference to it. If it no
-		 * longer matches, we are holding a reference to some other
-		 * reallocated pointer. This is possible if the allocator
-		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
-		 * fence remains valid for the RCU grace period, but it
-		 * may be reallocated. When using such allocators, we are
-		 * responsible for ensuring the reference we get is to
-		 * the right fence, as below.
-		 */
-		if (fence == rcu_access_pointer(*fencep))
-			return rcu_pointer_handoff(fence);
-
-		dma_fence_put(fence);
+		if (!fence || dma_fence_get_rcu(fence))
+			return fence;
 	} while (1);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH] dma-fence: fix dma_fence_get_rcu_safe
@ 2017-09-04 13:20 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-09-04 13:20 UTC (permalink / raw)
  Cc: chris, daniel.vetter, sumit.semwal, linux-media, dri-devel,
	linaro-mm-sig

From: Christian König <christian.koenig@amd.com>

The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
acquire a reference it doesn't necessary mean that there is no fence at all.

It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-fence.h | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a5195a7..37f3d67 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -246,27 +246,8 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
 		struct dma_fence *fence;
 
 		fence = rcu_dereference(*fencep);
-		if (!fence || !dma_fence_get_rcu(fence))
-			return NULL;
-
-		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
-		 * provides a full memory barrier upon success (such as now).
-		 * This is paired with the write barrier from assigning
-		 * to the __rcu protected fence pointer so that if that
-		 * pointer still matches the current fence, we know we
-		 * have successfully acquire a reference to it. If it no
-		 * longer matches, we are holding a reference to some other
-		 * reallocated pointer. This is possible if the allocator
-		 * is using a freelist like SLAB_TYPESAFE_BY_RCU where the
-		 * fence remains valid for the RCU grace period, but it
-		 * may be reallocated. When using such allocators, we are
-		 * responsible for ensuring the reference we get is to
-		 * the right fence, as below.
-		 */
-		if (fence == rcu_access_pointer(*fencep))
-			return rcu_pointer_handoff(fence);
-
-		dma_fence_put(fence);
+		if (!fence || dma_fence_get_rcu(fence))
+			return fence;
 	} while (1);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-09-29 12:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-04 13:16 [PATCH] dma-fence: fix dma_fence_get_rcu_safe Christian König
2017-09-04 15:50 ` Christian König
2017-09-04 13:20 Christian König
2017-09-04 13:20 ` Christian König
2017-09-04 13:27 Christian König
2017-09-04 13:27 ` Christian König
2017-09-04 13:40 ` Chris Wilson
2017-09-04 13:40   ` Chris Wilson
2017-09-11  8:50   ` Christian König
2017-09-11  8:59     ` Chris Wilson
2017-09-11  8:59       ` Chris Wilson
2017-09-11  9:06       ` Christian König
2017-09-11  9:23         ` Chris Wilson
2017-09-11  9:23           ` Chris Wilson
2017-09-11  9:57           ` Christian König
2017-09-11 10:01             ` Chris Wilson
2017-09-11 11:06               ` Christian König
2017-09-20 18:20                 ` Daniel Vetter
2017-09-21  7:00                   ` Christian König
2017-09-21  7:29                     ` Maarten Lankhorst
2017-09-21  7:29                       ` Maarten Lankhorst
2017-09-29 12:34                   ` Joonas Lahtinen
2017-09-29 12:34                     ` Joonas Lahtinen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.