All of lore.kernel.org
 help / color / mirror / Atom feed
* Fixing zero timeout handling for fence functions
@ 2016-10-25 12:25 Christian König
  2016-10-25 12:25 ` [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent Christian König
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Christian König @ 2016-10-25 12:25 UTC (permalink / raw)
  To: sumit.semwal; +Cc: dri-devel

Hi Sumit,

sending this once more with all the patches in once set, cause the last one
turned out to be a bit chaotic because I send from the wrong branch.

The following patch set fixes the handling in the fence and reservation object
wait function when the timeout is zero.

An AMD developer introduced this a while ago to work around some issues in TTM
and our amdgpu driver, but essentially this effort was a bit flawed because
even with a zero timeout enable_signalling() should be called.

Otherwise someone busy waiting for the fence might never be signaled when you
have hardware with faulty interrupts for example.

Please review and/or comment,
Christian.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent
  2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
@ 2016-10-25 12:25 ` Christian König
  2016-10-26  8:49   ` Chris Wilson
  2016-10-25 12:25 ` [PATCH 2/4] dma-buf/fence: revert "don't wait when specified timeout is zero" Christian König
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2016-10-25 12:25 UTC (permalink / raw)
  To: sumit.semwal; +Cc: dri-devel

From: Christian König <christian.koenig@amd.com>

Kernel functions taking a timeout usually return 1 on success even
when they get a zero timeout.

Signen-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/dma-buf/fence.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
index 4d51f9e..fb915ab 100644
--- a/drivers/dma-buf/fence.c
+++ b/drivers/dma-buf/fence.c
@@ -335,18 +335,20 @@ fence_default_wait_cb(struct fence *fence, struct fence_cb *cb)
  * @timeout:	[in]	timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
  *
  * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
- * remaining timeout in jiffies on success.
+ * remaining timeout in jiffies on success. If timeout is zero the value one is
+ * returned if the fence is already signaled for consistency with other
+ * functions taking a jiffies timeout.
  */
 signed long
 fence_default_wait(struct fence *fence, bool intr, signed long timeout)
 {
 	struct default_wait_cb cb;
 	unsigned long flags;
-	signed long ret = timeout;
+	signed long ret = timeout ? timeout : 1;
 	bool was_set;
 
 	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
-		return timeout;
+		return ret;
 
 	spin_lock_irqsave(fence->lock, flags);
 
-- 
2.5.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/4] dma-buf/fence: revert "don't wait when specified timeout is zero"
  2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
  2016-10-25 12:25 ` [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent Christian König
@ 2016-10-25 12:25 ` Christian König
  2016-10-25 12:25 ` [PATCH 3/4] drm/ttm: fix ttm_bo_wait Christian König
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Christian König @ 2016-10-25 12:25 UTC (permalink / raw)
  To: sumit.semwal; +Cc: dri-devel

From: Christian König <christian.koenig@amd.com>

This reverts commit 847b19a39e4c9b5e74c40f0842c48b41664cb43c.

When we don't call the wait function software signaling might never be
activated. This can cause infinite polling loops with unreliable interrupt
driven hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/dma-buf/fence.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
index fb915ab..64478f9 100644
--- a/drivers/dma-buf/fence.c
+++ b/drivers/dma-buf/fence.c
@@ -159,9 +159,6 @@ fence_wait_timeout(struct fence *fence, bool intr, signed long timeout)
 	if (WARN_ON(timeout < 0))
 		return -EINVAL;
 
-	if (timeout == 0)
-		return fence_is_signaled(fence);
-
 	trace_fence_wait_start(fence);
 	ret = fence->ops->wait(fence, intr, timeout);
 	trace_fence_wait_end(fence);
-- 
2.5.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] drm/ttm: fix ttm_bo_wait
  2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
  2016-10-25 12:25 ` [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent Christian König
  2016-10-25 12:25 ` [PATCH 2/4] dma-buf/fence: revert "don't wait when specified timeout is zero" Christian König
@ 2016-10-25 12:25 ` Christian König
  2016-10-25 12:25 ` [PATCH 4/4] reservation: revert "wait only with non-zero timeout specified (v3)" v2 Christian König
  2016-10-26  2:39 ` Fixing zero timeout handling for fence functions Sumit Semwal
  4 siblings, 0 replies; 8+ messages in thread
From: Christian König @ 2016-10-25 12:25 UTC (permalink / raw)
  To: sumit.semwal; +Cc: dri-devel

From: Christian König <christian.koenig@amd.com>

reservation_object_wait_timeout_rcu() should enable signaling even with a zero
timeout, but ttm_bo_wait() can also be called from atomic context and then it
is not a good idea to do this.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 608c585..478d563 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1605,7 +1605,14 @@ EXPORT_SYMBOL(ttm_bo_unmap_virtual);
 int ttm_bo_wait(struct ttm_buffer_object *bo,
 		bool interruptible, bool no_wait)
 {
-	long timeout = no_wait ? 0 : 15 * HZ;
+	long timeout = 15 * HZ;
+
+	if (no_wait) {
+		if (reservation_object_test_signaled_rcu(bo->resv, true))
+			return 0;
+		else
+			return -EBUSY;
+	}
 
 	timeout = reservation_object_wait_timeout_rcu(bo->resv, true,
 						      interruptible, timeout);
-- 
2.5.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] reservation: revert "wait only with non-zero timeout specified (v3)" v2
  2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
                   ` (2 preceding siblings ...)
  2016-10-25 12:25 ` [PATCH 3/4] drm/ttm: fix ttm_bo_wait Christian König
@ 2016-10-25 12:25 ` Christian König
  2016-10-26  2:39 ` Fixing zero timeout handling for fence functions Sumit Semwal
  4 siblings, 0 replies; 8+ messages in thread
From: Christian König @ 2016-10-25 12:25 UTC (permalink / raw)
  To: sumit.semwal; +Cc: dri-devel

From: Christian König <christian.koenig@amd.com>

This reverts commit fb8b7d2b9d80e1e71f379e57355936bd2b024be9.

Otherwise signaling might never be activated on the fences. This can
result in infinite waiting with hardware which has unreliable interrupts.

v2: still return one when the timeout is zero and we don't have any fences.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/dma-buf/reservation.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 9566a62..debb91d 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -379,10 +379,7 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
 {
 	struct fence *fence;
 	unsigned seq, shared_count, i = 0;
-	long ret = timeout;
-
-	if (!timeout)
-		return reservation_object_test_signaled_rcu(obj, wait_all);
+	long ret = timeout ? timeout : 1;
 
 retry:
 	fence = NULL;
-- 
2.5.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Fixing zero timeout handling for fence functions
  2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
                   ` (3 preceding siblings ...)
  2016-10-25 12:25 ` [PATCH 4/4] reservation: revert "wait only with non-zero timeout specified (v3)" v2 Christian König
@ 2016-10-26  2:39 ` Sumit Semwal
  4 siblings, 0 replies; 8+ messages in thread
From: Sumit Semwal @ 2016-10-26  2:39 UTC (permalink / raw)
  To: Christian König; +Cc: DRI mailing list

Hi Christian,


On 25 October 2016 at 17:55, Christian König <deathsimple@vodafone.de> wrote:
> Hi Sumit,
>
> sending this once more with all the patches in once set, cause the last one
> turned out to be a bit chaotic because I send from the wrong branch.
>
> The following patch set fixes the handling in the fence and reservation object
> wait function when the timeout is zero.
>

Thanks for your patches!

Adding a few more reviewers.

Also, usually, I find it easier if the intro patch comes as (patch
0/X), since that lists all the patches of the patch series as well. If
you could do that from next time onwards, that'd be great!

> An AMD developer introduced this a while ago to work around some issues in TTM
> and our amdgpu driver, but essentially this effort was a bit flawed because
> even with a zero timeout enable_signalling() should be called.
>
> Otherwise someone busy waiting for the fence might never be signaled when you
> have hardware with faulty interrupts for example.
>
> Please review and/or comment,
> Christian.
>

Best regards,
Sumit.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent
  2016-10-25 12:25 ` [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent Christian König
@ 2016-10-26  8:49   ` Chris Wilson
  2016-10-26 13:27     ` Christian König
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2016-10-26  8:49 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On Tue, Oct 25, 2016 at 02:25:08PM +0200, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
> 
> Kernel functions taking a timeout usually return 1 on success even
> when they get a zero timeout.

Which? The canonical example of schedule_timeout() doesn't behave like
this.
 
> Signen-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  drivers/dma-buf/fence.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
> index 4d51f9e..fb915ab 100644
> --- a/drivers/dma-buf/fence.c
> +++ b/drivers/dma-buf/fence.c
> @@ -335,18 +335,20 @@ fence_default_wait_cb(struct fence *fence, struct fence_cb *cb)
>   * @timeout:	[in]	timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
>   *
>   * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
> - * remaining timeout in jiffies on success.
> + * remaining timeout in jiffies on success. If timeout is zero the value one is
> + * returned if the fence is already signaled for consistency with other
> + * functions taking a jiffies timeout.
>   */
>  signed long
>  fence_default_wait(struct fence *fence, bool intr, signed long timeout)
>  {
>  	struct default_wait_cb cb;
>  	unsigned long flags;
> -	signed long ret = timeout;
> +	signed long ret = timeout ? timeout : 1;
>  	bool was_set;

This is horrible.

reservation_object_wait_timeout_rcu(), we pass it a timeout of zero to
do a safe probe on all attached fences.

The first fence happens to be signaled, so now we set the timeout to 1
jiffie. Any subsequent incomplete fence will then wait for that jiffie
before reporting 0.

When we report to userspace, we convert the timeout into an error,
usually EBUSY or ETIME. Applying that same convention here makes this
all much cleaner...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent
  2016-10-26  8:49   ` Chris Wilson
@ 2016-10-26 13:27     ` Christian König
  0 siblings, 0 replies; 8+ messages in thread
From: Christian König @ 2016-10-26 13:27 UTC (permalink / raw)
  To: Chris Wilson, sumit.semwal, dri-devel

Am 26.10.2016 um 10:49 schrieb Chris Wilson:
> On Tue, Oct 25, 2016 at 02:25:08PM +0200, Christian König wrote:
>> From: Christian König <christian.koenig@amd.com>
>>
>> Kernel functions taking a timeout usually return 1 on success even
>> when they get a zero timeout.
> Which? The canonical example of schedule_timeout() doesn't behave like
> this.

wait_event_timeout() for example, as well as to the fence_wait_timeout() 
and reservation_object_wait_timeout_rcu() functions build on top of this 
one.

Regards,
Christian.

>   
>> Signen-off-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>> ---
>>   drivers/dma-buf/fence.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
>> index 4d51f9e..fb915ab 100644
>> --- a/drivers/dma-buf/fence.c
>> +++ b/drivers/dma-buf/fence.c
>> @@ -335,18 +335,20 @@ fence_default_wait_cb(struct fence *fence, struct fence_cb *cb)
>>    * @timeout:	[in]	timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
>>    *
>>    * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
>> - * remaining timeout in jiffies on success.
>> + * remaining timeout in jiffies on success. If timeout is zero the value one is
>> + * returned if the fence is already signaled for consistency with other
>> + * functions taking a jiffies timeout.
>>    */
>>   signed long
>>   fence_default_wait(struct fence *fence, bool intr, signed long timeout)
>>   {
>>   	struct default_wait_cb cb;
>>   	unsigned long flags;
>> -	signed long ret = timeout;
>> +	signed long ret = timeout ? timeout : 1;
>>   	bool was_set;
> This is horrible.
>
> reservation_object_wait_timeout_rcu(), we pass it a timeout of zero to
> do a safe probe on all attached fences.
>
> The first fence happens to be signaled, so now we set the timeout to 1
> jiffie. Any subsequent incomplete fence will then wait for that jiffie
> before reporting 0.
>
> When we report to userspace, we convert the timeout into an error,
> usually EBUSY or ETIME. Applying that same convention here makes this
> all much cleaner...
> -Chris
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-10-26 13:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-25 12:25 Fixing zero timeout handling for fence functions Christian König
2016-10-25 12:25 ` [PATCH 1/4] dma-buf/fence: make timeout handling in fence_default_wait consistent Christian König
2016-10-26  8:49   ` Chris Wilson
2016-10-26 13:27     ` Christian König
2016-10-25 12:25 ` [PATCH 2/4] dma-buf/fence: revert "don't wait when specified timeout is zero" Christian König
2016-10-25 12:25 ` [PATCH 3/4] drm/ttm: fix ttm_bo_wait Christian König
2016-10-25 12:25 ` [PATCH 4/4] reservation: revert "wait only with non-zero timeout specified (v3)" v2 Christian König
2016-10-26  2:39 ` Fixing zero timeout handling for fence functions Sumit Semwal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.