linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] dma-buf: To check enable signaling before signaled
@ 2022-09-05 16:34 Arvind Yadav
  2022-09-05 16:34 ` [PATCH v2 1/4] drm/sched: Enable signaling for finished fence Arvind Yadav
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Arvind Yadav @ 2022-09-05 16:34 UTC (permalink / raw)
  To: Christian.Koenig, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel
  Cc: Arvind Yadav

TTM, GEM, DRM or the core DMA-buf framework are needs
to enable software signaling before the fence is signaled.
The core DMA-buf framework software can forget to call
enable_signaling before the fence is signaled. It means
framework code can forget to call dma_fence_enable_sw_signaling()
before calling dma_fence_is_signaled(). To avoid this scenario
on debug kernel, check the DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT bit
status before checking the MA_FENCE_FLAG_SIGNALED_BIT bit status
to confirm that software signaling is enabled.


Arvind Yadav (4):
  [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug
  [PATCH v2 3/4] dma-buf: enable signaling for selftest fence on debug
  [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug

 drivers/dma-buf/dma-fence.c            |  7 ++++
 drivers/dma-buf/st-dma-fence-chain.c   |  8 +++++
 drivers/dma-buf/st-dma-fence-unwrap.c  | 44 ++++++++++++++++++++++++++
 drivers/dma-buf/st-dma-fence.c         | 25 ++++++++++++++-
 drivers/dma-buf/st-dma-resv.c          | 20 ++++++++++++
 drivers/gpu/drm/scheduler/sched_main.c |  2 ++
 include/linux/dma-fence.h              |  5 +++
 7 files changed, 110 insertions(+), 1 deletion(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  2022-09-05 16:34 [PATCH v2 0/4] dma-buf: To check enable signaling before signaled Arvind Yadav
@ 2022-09-05 16:34 ` Arvind Yadav
  2022-09-06  6:34   ` Christian König
  2022-09-05 16:35 ` [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug Arvind Yadav
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Arvind Yadav @ 2022-09-05 16:34 UTC (permalink / raw)
  To: Christian.Koenig, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel
  Cc: Arvind Yadav

Here's enabling software signaling for finished fence.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---

Changes in v1 :
1- Addressing Christian's comment to remove CONFIG_DEBUG_FS check from 
this patch. 
2- The version of this patch is also changed and previously
it was [PATCH 2/4]

---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index e0ab14e0fb6b..fe72de0e2911 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -962,6 +962,8 @@ static int drm_sched_main(void *param)
 			/* Drop for original kref_init of the fence */
 			dma_fence_put(fence);
 
+			dma_fence_enable_sw_signaling(&s_fence->finished);
+
 			r = dma_fence_add_callback(fence, &sched_job->cb,
 						   drm_sched_job_done_cb);
 			if (r == -ENOENT)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug
  2022-09-05 16:34 [PATCH v2 0/4] dma-buf: To check enable signaling before signaled Arvind Yadav
  2022-09-05 16:34 ` [PATCH v2 1/4] drm/sched: Enable signaling for finished fence Arvind Yadav
@ 2022-09-05 16:35 ` Arvind Yadav
  2022-09-06  7:09   ` Christian König
  2022-09-05 16:35 ` [PATCH v2 3/4] dma-buf: enable signaling for selftest " Arvind Yadav
  2022-09-05 16:35 ` [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit " Arvind Yadav
  3 siblings, 1 reply; 18+ messages in thread
From: Arvind Yadav @ 2022-09-05 16:35 UTC (permalink / raw)
  To: Christian.Koenig, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel
  Cc: Arvind Yadav

Here's on debug enabling software signaling for the stub fence
which is always signaled. This fence should enable software
signaling otherwise the AMD GPU scheduler will cause a GPU reset
due to a GPU scheduler cleanup activity timeout.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---

Changes in v1 :
1- Addressing Christian's comment to remove unnecessary callback.
2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
3- The version of this patch is also changed and previously
it was [PATCH 3/4]

---
 drivers/dma-buf/dma-fence.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 066400ed8841..2378b12538c4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+static bool __dma_fence_enable_signaling(struct dma_fence *fence);
+#endif
+
 /*
  * fence context counter: each execution context should have its own
  * fence context, this allows checking if fences belong to the same
@@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
 			       &dma_fence_stub_ops,
 			       &dma_fence_stub_lock,
 			       0, 0);
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+		__dma_fence_enable_signaling(&dma_fence_stub);
+#endif
 		dma_fence_signal_locked(&dma_fence_stub);
 	}
 	spin_unlock(&dma_fence_stub_lock);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] dma-buf: enable signaling for selftest fence on debug
  2022-09-05 16:34 [PATCH v2 0/4] dma-buf: To check enable signaling before signaled Arvind Yadav
  2022-09-05 16:34 ` [PATCH v2 1/4] drm/sched: Enable signaling for finished fence Arvind Yadav
  2022-09-05 16:35 ` [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug Arvind Yadav
@ 2022-09-05 16:35 ` Arvind Yadav
  2022-09-06  7:11   ` Christian König
  2022-09-05 16:35 ` [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit " Arvind Yadav
  3 siblings, 1 reply; 18+ messages in thread
From: Arvind Yadav @ 2022-09-05 16:35 UTC (permalink / raw)
  To: Christian.Koenig, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel
  Cc: Arvind Yadav

Here's on debug enabling software signaling for selftest.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---

Changes in v1 :
1- Addressing Christian's comment to remove unnecessary callback.
2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
3- The version of this patch is also changed and previously
it was [PATCH 4/4]

---
 drivers/dma-buf/st-dma-fence-chain.c  |  8 +++++
 drivers/dma-buf/st-dma-fence-unwrap.c | 44 +++++++++++++++++++++++++++
 drivers/dma-buf/st-dma-fence.c        | 25 ++++++++++++++-
 drivers/dma-buf/st-dma-resv.c         | 20 ++++++++++++
 4 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c
index 8ce1ea59d31b..d3070f8a393c 100644
--- a/drivers/dma-buf/st-dma-fence-chain.c
+++ b/drivers/dma-buf/st-dma-fence-chain.c
@@ -87,6 +87,10 @@ static int sanitycheck(void *arg)
 	if (!chain)
 		err = -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(chain);
+#endif
+
 	dma_fence_signal(f);
 	dma_fence_put(f);
 
@@ -143,6 +147,10 @@ static int fence_chains_init(struct fence_chains *fc, unsigned int count,
 		}
 
 		fc->tail = fc->chains[i];
+
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+		dma_fence_enable_sw_signaling(fc->chains[i]);
+#endif
 	}
 
 	fc->chain_length = i;
diff --git a/drivers/dma-buf/st-dma-fence-unwrap.c b/drivers/dma-buf/st-dma-fence-unwrap.c
index 4105d5ea8dde..b76cdd9ee0c7 100644
--- a/drivers/dma-buf/st-dma-fence-unwrap.c
+++ b/drivers/dma-buf/st-dma-fence-unwrap.c
@@ -102,6 +102,10 @@ static int sanitycheck(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	array = mock_array(1, f);
 	if (!array)
 		return -ENOMEM;
@@ -124,12 +128,20 @@ static int unwrap_array(void *arg)
 	if (!f1)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f1);
+#endif
+
 	f2 = mock_fence();
 	if (!f2) {
 		dma_fence_put(f1);
 		return -ENOMEM;
 	}
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f2);
+#endif
+
 	array = mock_array(2, f1, f2);
 	if (!array)
 		return -ENOMEM;
@@ -164,12 +176,20 @@ static int unwrap_chain(void *arg)
 	if (!f1)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f1);
+#endif
+
 	f2 = mock_fence();
 	if (!f2) {
 		dma_fence_put(f1);
 		return -ENOMEM;
 	}
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f2);
+#endif
+
 	chain = mock_chain(f1, f2);
 	if (!chain)
 		return -ENOMEM;
@@ -204,12 +224,20 @@ static int unwrap_chain_array(void *arg)
 	if (!f1)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f1);
+#endif
+
 	f2 = mock_fence();
 	if (!f2) {
 		dma_fence_put(f1);
 		return -ENOMEM;
 	}
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f2);
+#endif
+
 	array = mock_array(2, f1, f2);
 	if (!array)
 		return -ENOMEM;
@@ -248,12 +276,20 @@ static int unwrap_merge(void *arg)
 	if (!f1)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f1);
+#endif
+
 	f2 = mock_fence();
 	if (!f2) {
 		err = -ENOMEM;
 		goto error_put_f1;
 	}
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f2);
+#endif
+
 	f3 = dma_fence_unwrap_merge(f1, f2);
 	if (!f3) {
 		err = -ENOMEM;
@@ -296,10 +332,18 @@ static int unwrap_merge_complex(void *arg)
 	if (!f1)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f1);
+#endif
+
 	f2 = mock_fence();
 	if (!f2)
 		goto error_put_f1;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f2);
+#endif
+
 	f3 = dma_fence_unwrap_merge(f1, f2);
 	if (!f3)
 		goto error_put_f2;
diff --git a/drivers/dma-buf/st-dma-fence.c b/drivers/dma-buf/st-dma-fence.c
index c8a12d7ad71a..b7880d8374db 100644
--- a/drivers/dma-buf/st-dma-fence.c
+++ b/drivers/dma-buf/st-dma-fence.c
@@ -101,7 +101,9 @@ static int sanitycheck(void *arg)
 	f = mock_fence();
 	if (!f)
 		return -ENOMEM;
-
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	dma_fence_signal(f);
 	dma_fence_put(f);
 
@@ -117,6 +119,9 @@ static int test_signaling(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	if (dma_fence_is_signaled(f)) {
 		pr_err("Fence unexpectedly signaled on creation\n");
 		goto err_free;
@@ -190,6 +195,9 @@ static int test_late_add_callback(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	dma_fence_signal(f);
 
 	if (!dma_fence_add_callback(f, &cb.cb, simple_callback)) {
@@ -282,6 +290,9 @@ static int test_status(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	if (dma_fence_get_status(f)) {
 		pr_err("Fence unexpectedly has signaled status on creation\n");
 		goto err_free;
@@ -308,6 +319,9 @@ static int test_error(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	dma_fence_set_error(f, -EIO);
 
 	if (dma_fence_get_status(f)) {
@@ -337,6 +351,9 @@ static int test_wait(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
 	if (dma_fence_wait_timeout(f, false, 0) != -ETIME) {
 		pr_err("Wait reported complete before being signaled\n");
 		goto err_free;
@@ -379,6 +396,9 @@ static int test_wait_timeout(void *arg)
 	if (!wt.f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(wt.f);
+#endif
 	if (dma_fence_wait_timeout(wt.f, false, 1) != -ETIME) {
 		pr_err("Wait reported complete before being signaled\n");
 		goto err_free;
@@ -458,6 +478,9 @@ static int thread_signal_callback(void *arg)
 			break;
 		}
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+		dma_fence_enable_sw_signaling(f1);
+#endif
 		rcu_assign_pointer(t->fences[t->id], f1);
 		smp_wmb();
 
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index 813779e3c9be..bd7ef58f8b24 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -45,6 +45,10 @@ static int sanitycheck(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	dma_fence_signal(f);
 	dma_fence_put(f);
 
@@ -69,6 +73,10 @@ static int test_signaling(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	dma_resv_init(&resv);
 	r = dma_resv_lock(&resv, NULL);
 	if (r) {
@@ -114,6 +122,10 @@ static int test_for_each(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	dma_resv_init(&resv);
 	r = dma_resv_lock(&resv, NULL);
 	if (r) {
@@ -173,6 +185,10 @@ static int test_for_each_unlocked(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	dma_resv_init(&resv);
 	r = dma_resv_lock(&resv, NULL);
 	if (r) {
@@ -244,6 +260,10 @@ static int test_get_fences(void *arg)
 	if (!f)
 		return -ENOMEM;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	dma_fence_enable_sw_signaling(f);
+#endif
+
 	dma_resv_init(&resv);
 	r = dma_resv_lock(&resv, NULL);
 	if (r) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-05 16:34 [PATCH v2 0/4] dma-buf: To check enable signaling before signaled Arvind Yadav
                   ` (2 preceding siblings ...)
  2022-09-05 16:35 ` [PATCH v2 3/4] dma-buf: enable signaling for selftest " Arvind Yadav
@ 2022-09-05 16:35 ` Arvind Yadav
  2022-09-06  8:39   ` Christian König
  3 siblings, 1 reply; 18+ messages in thread
From: Arvind Yadav @ 2022-09-05 16:35 UTC (permalink / raw)
  To: Christian.Koenig, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel
  Cc: Arvind Yadav

The core DMA-buf framework needs to enable signaling
before the fence is signaled. The core DMA-buf framework
can forget to enable signaling before the fence is signaled.
To avoid this scenario on the debug kernel, check the
DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
the signaling bit status to confirm that enable_signaling
is enabled.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---

Changes in v1 :
1- Addressing Christian's comment to replace
CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
2- As per Christian's comment moving this patch at last so
The version of this patch is also changed and previously
it was [PATCH 1/4]


---
 include/linux/dma-fence.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..ba1ddc14c5d4 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
 static inline bool
 dma_fence_is_signaled(struct dma_fence *fence)
 {
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
+		return false;
+#endif
+
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return true;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  2022-09-05 16:34 ` [PATCH v2 1/4] drm/sched: Enable signaling for finished fence Arvind Yadav
@ 2022-09-06  6:34   ` Christian König
  2022-09-06 19:55     ` Andrey Grodzovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2022-09-06  6:34 UTC (permalink / raw)
  To: Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 05.09.22 um 18:34 schrieb Arvind Yadav:
> Here's enabling software signaling for finished fence.
>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>
> Changes in v1 :
> 1- Addressing Christian's comment to remove CONFIG_DEBUG_FS check from
> this patch.
> 2- The version of this patch is also changed and previously
> it was [PATCH 2/4]
>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index e0ab14e0fb6b..fe72de0e2911 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -962,6 +962,8 @@ static int drm_sched_main(void *param)
>   			/* Drop for original kref_init of the fence */
>   			dma_fence_put(fence);
>   
> +			dma_fence_enable_sw_signaling(&s_fence->finished);

Ok, this makes it a lot clearer. Previously I though that we have some 
bug in dma_fence_add_callback().

This is essentially the wrong place to call this, the finished fence 
should be enabled by the caller and not here.

There is also another problem in dma_fence_enable_sw_signaling(), it 
returns early when the fence is already signaled:

         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
                 return;

Please remove that one first.

Thanks,
Christian.


> +
>   			r = dma_fence_add_callback(fence, &sched_job->cb,
>   						   drm_sched_job_done_cb);
>   			if (r == -ENOENT)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug
  2022-09-05 16:35 ` [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug Arvind Yadav
@ 2022-09-06  7:09   ` Christian König
  2022-09-09 16:32     ` Yadav, Arvind
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2022-09-06  7:09 UTC (permalink / raw)
  To: Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel



Am 05.09.22 um 18:35 schrieb Arvind Yadav:
> Here's on debug enabling software signaling for the stub fence
> which is always signaled. This fence should enable software
> signaling otherwise the AMD GPU scheduler will cause a GPU reset
> due to a GPU scheduler cleanup activity timeout.
>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>
> Changes in v1 :
> 1- Addressing Christian's comment to remove unnecessary callback.
> 2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
> 3- The version of this patch is also changed and previously
> it was [PATCH 3/4]
>
> ---
>   drivers/dma-buf/dma-fence.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 066400ed8841..2378b12538c4 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
>   static DEFINE_SPINLOCK(dma_fence_stub_lock);
>   static struct dma_fence dma_fence_stub;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +static bool __dma_fence_enable_signaling(struct dma_fence *fence);
> +#endif
> +

I would rename the function to something like 
dma_fence_enable_signaling_locked().

And please don't add any #ifdef if it isn't absolutely necessary. This 
makes the code pretty fragile.

>   /*
>    * fence context counter: each execution context should have its own
>    * fence context, this allows checking if fences belong to the same
> @@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
>   			       &dma_fence_stub_ops,
>   			       &dma_fence_stub_lock,
>   			       0, 0);
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +		__dma_fence_enable_signaling(&dma_fence_stub);
> +#endif

Alternatively in this particular case you could just set the bit 
manually here since this is part of the dma_fence code anyway.

Christian.

>   		dma_fence_signal_locked(&dma_fence_stub);
>   	}
>   	spin_unlock(&dma_fence_stub_lock);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] dma-buf: enable signaling for selftest fence on debug
  2022-09-05 16:35 ` [PATCH v2 3/4] dma-buf: enable signaling for selftest " Arvind Yadav
@ 2022-09-06  7:11   ` Christian König
  0 siblings, 0 replies; 18+ messages in thread
From: Christian König @ 2022-09-06  7:11 UTC (permalink / raw)
  To: Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 05.09.22 um 18:35 schrieb Arvind Yadav:
> Here's on debug enabling software signaling for selftest.

Please drop all the #ifdefs, apart from that looks pretty good to me.

Christian.

>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>
> Changes in v1 :
> 1- Addressing Christian's comment to remove unnecessary callback.
> 2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
> 3- The version of this patch is also changed and previously
> it was [PATCH 4/4]
>
> ---
>   drivers/dma-buf/st-dma-fence-chain.c  |  8 +++++
>   drivers/dma-buf/st-dma-fence-unwrap.c | 44 +++++++++++++++++++++++++++
>   drivers/dma-buf/st-dma-fence.c        | 25 ++++++++++++++-
>   drivers/dma-buf/st-dma-resv.c         | 20 ++++++++++++
>   4 files changed, 96 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c
> index 8ce1ea59d31b..d3070f8a393c 100644
> --- a/drivers/dma-buf/st-dma-fence-chain.c
> +++ b/drivers/dma-buf/st-dma-fence-chain.c
> @@ -87,6 +87,10 @@ static int sanitycheck(void *arg)
>   	if (!chain)
>   		err = -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(chain);
> +#endif
> +
>   	dma_fence_signal(f);
>   	dma_fence_put(f);
>   
> @@ -143,6 +147,10 @@ static int fence_chains_init(struct fence_chains *fc, unsigned int count,
>   		}
>   
>   		fc->tail = fc->chains[i];
> +
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +		dma_fence_enable_sw_signaling(fc->chains[i]);
> +#endif
>   	}
>   
>   	fc->chain_length = i;
> diff --git a/drivers/dma-buf/st-dma-fence-unwrap.c b/drivers/dma-buf/st-dma-fence-unwrap.c
> index 4105d5ea8dde..b76cdd9ee0c7 100644
> --- a/drivers/dma-buf/st-dma-fence-unwrap.c
> +++ b/drivers/dma-buf/st-dma-fence-unwrap.c
> @@ -102,6 +102,10 @@ static int sanitycheck(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	array = mock_array(1, f);
>   	if (!array)
>   		return -ENOMEM;
> @@ -124,12 +128,20 @@ static int unwrap_array(void *arg)
>   	if (!f1)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f1);
> +#endif
> +
>   	f2 = mock_fence();
>   	if (!f2) {
>   		dma_fence_put(f1);
>   		return -ENOMEM;
>   	}
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f2);
> +#endif
> +
>   	array = mock_array(2, f1, f2);
>   	if (!array)
>   		return -ENOMEM;
> @@ -164,12 +176,20 @@ static int unwrap_chain(void *arg)
>   	if (!f1)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f1);
> +#endif
> +
>   	f2 = mock_fence();
>   	if (!f2) {
>   		dma_fence_put(f1);
>   		return -ENOMEM;
>   	}
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f2);
> +#endif
> +
>   	chain = mock_chain(f1, f2);
>   	if (!chain)
>   		return -ENOMEM;
> @@ -204,12 +224,20 @@ static int unwrap_chain_array(void *arg)
>   	if (!f1)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f1);
> +#endif
> +
>   	f2 = mock_fence();
>   	if (!f2) {
>   		dma_fence_put(f1);
>   		return -ENOMEM;
>   	}
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f2);
> +#endif
> +
>   	array = mock_array(2, f1, f2);
>   	if (!array)
>   		return -ENOMEM;
> @@ -248,12 +276,20 @@ static int unwrap_merge(void *arg)
>   	if (!f1)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f1);
> +#endif
> +
>   	f2 = mock_fence();
>   	if (!f2) {
>   		err = -ENOMEM;
>   		goto error_put_f1;
>   	}
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f2);
> +#endif
> +
>   	f3 = dma_fence_unwrap_merge(f1, f2);
>   	if (!f3) {
>   		err = -ENOMEM;
> @@ -296,10 +332,18 @@ static int unwrap_merge_complex(void *arg)
>   	if (!f1)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f1);
> +#endif
> +
>   	f2 = mock_fence();
>   	if (!f2)
>   		goto error_put_f1;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f2);
> +#endif
> +
>   	f3 = dma_fence_unwrap_merge(f1, f2);
>   	if (!f3)
>   		goto error_put_f2;
> diff --git a/drivers/dma-buf/st-dma-fence.c b/drivers/dma-buf/st-dma-fence.c
> index c8a12d7ad71a..b7880d8374db 100644
> --- a/drivers/dma-buf/st-dma-fence.c
> +++ b/drivers/dma-buf/st-dma-fence.c
> @@ -101,7 +101,9 @@ static int sanitycheck(void *arg)
>   	f = mock_fence();
>   	if (!f)
>   		return -ENOMEM;
> -
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	dma_fence_signal(f);
>   	dma_fence_put(f);
>   
> @@ -117,6 +119,9 @@ static int test_signaling(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	if (dma_fence_is_signaled(f)) {
>   		pr_err("Fence unexpectedly signaled on creation\n");
>   		goto err_free;
> @@ -190,6 +195,9 @@ static int test_late_add_callback(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	dma_fence_signal(f);
>   
>   	if (!dma_fence_add_callback(f, &cb.cb, simple_callback)) {
> @@ -282,6 +290,9 @@ static int test_status(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	if (dma_fence_get_status(f)) {
>   		pr_err("Fence unexpectedly has signaled status on creation\n");
>   		goto err_free;
> @@ -308,6 +319,9 @@ static int test_error(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	dma_fence_set_error(f, -EIO);
>   
>   	if (dma_fence_get_status(f)) {
> @@ -337,6 +351,9 @@ static int test_wait(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
>   	if (dma_fence_wait_timeout(f, false, 0) != -ETIME) {
>   		pr_err("Wait reported complete before being signaled\n");
>   		goto err_free;
> @@ -379,6 +396,9 @@ static int test_wait_timeout(void *arg)
>   	if (!wt.f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(wt.f);
> +#endif
>   	if (dma_fence_wait_timeout(wt.f, false, 1) != -ETIME) {
>   		pr_err("Wait reported complete before being signaled\n");
>   		goto err_free;
> @@ -458,6 +478,9 @@ static int thread_signal_callback(void *arg)
>   			break;
>   		}
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +		dma_fence_enable_sw_signaling(f1);
> +#endif
>   		rcu_assign_pointer(t->fences[t->id], f1);
>   		smp_wmb();
>   
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> index 813779e3c9be..bd7ef58f8b24 100644
> --- a/drivers/dma-buf/st-dma-resv.c
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -45,6 +45,10 @@ static int sanitycheck(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	dma_fence_signal(f);
>   	dma_fence_put(f);
>   
> @@ -69,6 +73,10 @@ static int test_signaling(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	dma_resv_init(&resv);
>   	r = dma_resv_lock(&resv, NULL);
>   	if (r) {
> @@ -114,6 +122,10 @@ static int test_for_each(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	dma_resv_init(&resv);
>   	r = dma_resv_lock(&resv, NULL);
>   	if (r) {
> @@ -173,6 +185,10 @@ static int test_for_each_unlocked(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	dma_resv_init(&resv);
>   	r = dma_resv_lock(&resv, NULL);
>   	if (r) {
> @@ -244,6 +260,10 @@ static int test_get_fences(void *arg)
>   	if (!f)
>   		return -ENOMEM;
>   
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	dma_fence_enable_sw_signaling(f);
> +#endif
> +
>   	dma_resv_init(&resv);
>   	r = dma_resv_lock(&resv, NULL);
>   	if (r) {


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-05 16:35 ` [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit " Arvind Yadav
@ 2022-09-06  8:39   ` Christian König
  2022-09-06 10:20     ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2022-09-06  8:39 UTC (permalink / raw)
  To: Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 05.09.22 um 18:35 schrieb Arvind Yadav:
> The core DMA-buf framework needs to enable signaling
> before the fence is signaled. The core DMA-buf framework
> can forget to enable signaling before the fence is signaled.

This sentence is a bit confusing. I'm not a native speaker of English 
either, but I suggest something like:

"Fence signaling must be enable to make sure that the 
dma_fence_is_signaled() function ever returns true."

> To avoid this scenario on the debug kernel, check the
> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
> the signaling bit status to confirm that enable_signaling
> is enabled.

This describes the implementation, but we should rather describe the 
background of the change. The implementation should be obvious. 
Something like this maybe:

"
Since drivers and implementations sometimes mess this up enforce correct 
behavior when DEBUG_WW_MUTEX_SLOWPATH is used during debugging.

This should make any implementations bugs resulting in not signaled 
fences much more obvious.
"

>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>

With the improved commit message this patch is Reviewed-by: Christian 
König <christian.koenig@amd.com>

Regards,
Christian.

> ---
>
> Changes in v1 :
> 1- Addressing Christian's comment to replace
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
> 2- As per Christian's comment moving this patch at last so
> The version of this patch is also changed and previously
> it was [PATCH 1/4]
>
>
> ---
>   include/linux/dma-fence.h | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 775cdc0b4f24..ba1ddc14c5d4 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>   static inline bool
>   dma_fence_is_signaled(struct dma_fence *fence)
>   {
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +	if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
> +		return false;
> +#endif
> +
>   	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>   		return true;
>   


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-06  8:39   ` Christian König
@ 2022-09-06 10:20     ` Tvrtko Ursulin
  2022-09-06 10:43       ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2022-09-06 10:20 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, andrey.grodzovsky,
	shashank.sharma, amaranath.somalapuram, Arunpravin.PaneerSelvam,
	sumit.semwal, gustavo, airlied, daniel, linux-media, dri-devel,
	linaro-mm-sig, linux-kernel


On 06/09/2022 09:39, Christian König wrote:
> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>> The core DMA-buf framework needs to enable signaling
>> before the fence is signaled. The core DMA-buf framework
>> can forget to enable signaling before the fence is signaled.
> 
> This sentence is a bit confusing. I'm not a native speaker of English 
> either, but I suggest something like:
> 
> "Fence signaling must be enable to make sure that the 
> dma_fence_is_signaled() function ever returns true."
> 
>> To avoid this scenario on the debug kernel, check the
>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
>> the signaling bit status to confirm that enable_signaling
>> is enabled.
> 
> This describes the implementation, but we should rather describe the 
> background of the change. The implementation should be obvious. 
> Something like this maybe:
> 
> "
> Since drivers and implementations sometimes mess this up enforce correct 
> behavior when DEBUG_WW_MUTEX_SLOWPATH is used during debugging.
> 
> This should make any implementations bugs resulting in not signaled 
> fences much more obvious.
> "

I think I follow the idea but am not sure coupling (well "coupling".. not really, but cross-contaminating in a way) dma-fence.c with a foreign and effectively unrelated concept of a ww mutex is the best way.

Instead, how about a dma-buf specific debug kconfig option?

Condition would then be, according to my understanding of the rules and expectations, along the lines of:

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..147a9df2c9d0 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -428,6 +428,17 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
  static inline bool
  dma_fence_is_signaled(struct dma_fence *fence)
  {
+#ifdef CONFIG_DEBUG_DMAFENCE
+       /*
+        * Implementations not providing the enable_signaling callback are
+        * required to always have signaling enabled or fences are not
+        * guaranteed to ever signal.
+        */
+       if (!fence->ops->enable_signaling &&
+           !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
+               return false;
+#endif
+
         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
                 return true;

Thoughts?

Regards,

Tvrtko

> 
>>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> 
> With the improved commit message this patch is Reviewed-by: Christian 
> König <christian.koenig@amd.com>
> 
> Regards,
> Christian.
> 
>> ---
>>
>> Changes in v1 :
>> 1- Addressing Christian's comment to replace
>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>> 2- As per Christian's comment moving this patch at last so
>> The version of this patch is also changed and previously
>> it was [PATCH 1/4]
>>
>>
>> ---
>>   include/linux/dma-fence.h | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 775cdc0b4f24..ba1ddc14c5d4 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence 
>> *fence)
>>   static inline bool
>>   dma_fence_is_signaled(struct dma_fence *fence)
>>   {
>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>> +    if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>> +        return false;
>> +#endif
>> +
>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>           return true;
> 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-06 10:20     ` Tvrtko Ursulin
@ 2022-09-06 10:43       ` Christian König
  2022-09-06 11:21         ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2022-09-06 10:43 UTC (permalink / raw)
  To: Tvrtko Ursulin, Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 06.09.22 um 12:20 schrieb Tvrtko Ursulin:
>
> On 06/09/2022 09:39, Christian König wrote:
>> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>>> The core DMA-buf framework needs to enable signaling
>>> before the fence is signaled. The core DMA-buf framework
>>> can forget to enable signaling before the fence is signaled.
>>
>> This sentence is a bit confusing. I'm not a native speaker of English 
>> either, but I suggest something like:
>>
>> "Fence signaling must be enable to make sure that the 
>> dma_fence_is_signaled() function ever returns true."
>>
>>> To avoid this scenario on the debug kernel, check the
>>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
>>> the signaling bit status to confirm that enable_signaling
>>> is enabled.
>>
>> This describes the implementation, but we should rather describe the 
>> background of the change. The implementation should be obvious. 
>> Something like this maybe:
>>
>> "
>> Since drivers and implementations sometimes mess this up enforce 
>> correct behavior when DEBUG_WW_MUTEX_SLOWPATH is used during debugging.
>>
>> This should make any implementations bugs resulting in not signaled 
>> fences much more obvious.
>> "
>
> I think I follow the idea but am not sure coupling (well "coupling".. 
> not really, but cross-contaminating in a way) dma-fence.c with a 
> foreign and effectively unrelated concept of a ww mutex is the best way.
>
> Instead, how about a dma-buf specific debug kconfig option?

Yeah, I was thinking about that as well.

The slowpath config option was just at hand because when you want to 
test the slowpath you want to test this here as well.

>
> Condition would then be, according to my understanding of the rules 
> and expectations, along the lines of:
>
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 775cdc0b4f24..147a9df2c9d0 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -428,6 +428,17 @@ dma_fence_is_signaled_locked(struct dma_fence 
> *fence)
>  static inline bool
>  dma_fence_is_signaled(struct dma_fence *fence)
>  {
> +#ifdef CONFIG_DEBUG_DMAFENCE
> +       /*
> +        * Implementations not providing the enable_signaling callback 
> are
> +        * required to always have signaling enabled or fences are not
> +        * guaranteed to ever signal.
> +        */

Well that comment is a bit misleading. The intention of the extra check 
is to find bugs in the frontend and not the backend.

In other words somewhere in the drm_syncobj code we have a 
dma_fence_is_signaled() call without matching 
dma_fence_enable_sw_signaling().

Regards,
Christian.

> + if (!fence->ops->enable_signaling &&
> +           !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
> +               return false;
> +#endif
> +
>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                 return true;
>
> Thoughts?
>
> Regards,
>
> Tvrtko
>
>>
>>>
>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>
>> With the improved commit message this patch is Reviewed-by: Christian 
>> König <christian.koenig@amd.com>
>>
>> Regards,
>> Christian.
>>
>>> ---
>>>
>>> Changes in v1 :
>>> 1- Addressing Christian's comment to replace
>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>>> 2- As per Christian's comment moving this patch at last so
>>> The version of this patch is also changed and previously
>>> it was [PATCH 1/4]
>>>
>>>
>>> ---
>>>   include/linux/dma-fence.h | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index 775cdc0b4f24..ba1ddc14c5d4 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>> *fence)
>>>   static inline bool
>>>   dma_fence_is_signaled(struct dma_fence *fence)
>>>   {
>>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>>> +    if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>> +        return false;
>>> +#endif
>>> +
>>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>           return true;
>>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-06 10:43       ` Christian König
@ 2022-09-06 11:21         ` Tvrtko Ursulin
  2022-09-06 11:35           ` Christian König
  2022-09-06 11:38           ` Tvrtko Ursulin
  0 siblings, 2 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2022-09-06 11:21 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, andrey.grodzovsky,
	shashank.sharma, amaranath.somalapuram, Arunpravin.PaneerSelvam,
	sumit.semwal, gustavo, airlied, daniel, linux-media, dri-devel,
	linaro-mm-sig, linux-kernel


On 06/09/2022 11:43, Christian König wrote:
> Am 06.09.22 um 12:20 schrieb Tvrtko Ursulin:
>>
>> On 06/09/2022 09:39, Christian König wrote:
>>> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>>>> The core DMA-buf framework needs to enable signaling
>>>> before the fence is signaled. The core DMA-buf framework
>>>> can forget to enable signaling before the fence is signaled.
>>>
>>> This sentence is a bit confusing. I'm not a native speaker of English 
>>> either, but I suggest something like:
>>>
>>> "Fence signaling must be enable to make sure that the 
>>> dma_fence_is_signaled() function ever returns true."
>>>
>>>> To avoid this scenario on the debug kernel, check the
>>>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
>>>> the signaling bit status to confirm that enable_signaling
>>>> is enabled.
>>>
>>> This describes the implementation, but we should rather describe the 
>>> background of the change. The implementation should be obvious. 
>>> Something like this maybe:
>>>
>>> "
>>> Since drivers and implementations sometimes mess this up enforce 
>>> correct behavior when DEBUG_WW_MUTEX_SLOWPATH is used during debugging.
>>>
>>> This should make any implementations bugs resulting in not signaled 
>>> fences much more obvious.
>>> "
>>
>> I think I follow the idea but am not sure coupling (well "coupling".. 
>> not really, but cross-contaminating in a way) dma-fence.c with a 
>> foreign and effectively unrelated concept of a ww mutex is the best way.
>>
>> Instead, how about a dma-buf specific debug kconfig option?
> 
> Yeah, I was thinking about that as well.

Cool, lets see about the specifics below and then decide.
  
> The slowpath config option was just at hand because when you want to 
> test the slowpath you want to test this here as well.
> 
>>
>> Condition would then be, according to my understanding of the rules 
>> and expectations, along the lines of:
>>
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 775cdc0b4f24..147a9df2c9d0 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -428,6 +428,17 @@ dma_fence_is_signaled_locked(struct dma_fence 
>> *fence)
>>  static inline bool
>>  dma_fence_is_signaled(struct dma_fence *fence)
>>  {
>> +#ifdef CONFIG_DEBUG_DMAFENCE
>> +       /*
>> +        * Implementations not providing the enable_signaling callback 
>> are
>> +        * required to always have signaling enabled or fences are not
>> +        * guaranteed to ever signal.
>> +        */
> 
> Well that comment is a bit misleading. The intention of the extra check 
> is to find bugs in the frontend and not the backend.

By backend you mean drivers/dma-buf/dma-fence.c and by front end driver specific implementations? Or simply users for dma-fence?

Could be that I got confused.. I was reading these two:


	 * This callback is optional. If this callback is not present, then the
	 * driver must always have signaling enabled.
	 */
	bool (*enable_signaling)(struct dma_fence *fence);

And dma_fence_is_signaled:

  * Returns true if the fence was already signaled, false if not. Since this
  * function doesn't enable signaling, it is not guaranteed to ever return
  * true if dma_fence_add_callback(), dma_fence_wait() or
  * dma_fence_enable_sw_signaling() haven't been called before.

Right, I think I did get confused, apologies. What I was thinking was probably two separate conditions:

  static inline bool
  dma_fence_is_signaled(struct dma_fence *fence)
  {
+#ifdef CONFIG_DEBUG_DMAFENCE
+       if (WARN_ON_ONCE(!fence->ops->enable_signaling &&
+                        !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)))
+               return false;
+
+       if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
+               return false;
+#endif
+
         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
                 return true;

Not sure "is signaled" is the best place for the first one or that it should definitely be added.

Regards,

Tvrtko

> In other words somewhere in the drm_syncobj code we have a 
> dma_fence_is_signaled() call without matching 
> dma_fence_enable_sw_signaling().
> 
> Regards,
> Christian.
> 
>> + if (!fence->ops->enable_signaling &&
>> +           !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>> +               return false;
>> +#endif
>> +
>>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>                 return true;
>>
>> Thoughts?
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>>>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>
>>> With the improved commit message this patch is Reviewed-by: Christian 
>>> König <christian.koenig@amd.com>
>>>
>>> Regards,
>>> Christian.
>>>
>>>> ---
>>>>
>>>> Changes in v1 :
>>>> 1- Addressing Christian's comment to replace
>>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>>>> 2- As per Christian's comment moving this patch at last so
>>>> The version of this patch is also changed and previously
>>>> it was [PATCH 1/4]
>>>>
>>>>
>>>> ---
>>>>   include/linux/dma-fence.h | 5 +++++
>>>>   1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>> index 775cdc0b4f24..ba1ddc14c5d4 100644
>>>> --- a/include/linux/dma-fence.h
>>>> +++ b/include/linux/dma-fence.h
>>>> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>>> *fence)
>>>>   static inline bool
>>>>   dma_fence_is_signaled(struct dma_fence *fence)
>>>>   {
>>>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>>>> +    if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>>> +        return false;
>>>> +#endif
>>>> +
>>>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>>           return true;
>>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-06 11:21         ` Tvrtko Ursulin
@ 2022-09-06 11:35           ` Christian König
  2022-09-06 11:38           ` Tvrtko Ursulin
  1 sibling, 0 replies; 18+ messages in thread
From: Christian König @ 2022-09-06 11:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, Arvind Yadav, andrey.grodzovsky, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 06.09.22 um 13:21 schrieb Tvrtko Ursulin:
>
> On 06/09/2022 11:43, Christian König wrote:
>> Am 06.09.22 um 12:20 schrieb Tvrtko Ursulin:
>>>
>>> On 06/09/2022 09:39, Christian König wrote:
>>>> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>>>>> The core DMA-buf framework needs to enable signaling
>>>>> before the fence is signaled. The core DMA-buf framework
>>>>> can forget to enable signaling before the fence is signaled.
>>>>
>>>> This sentence is a bit confusing. I'm not a native speaker of 
>>>> English either, but I suggest something like:
>>>>
>>>> "Fence signaling must be enable to make sure that the 
>>>> dma_fence_is_signaled() function ever returns true."
>>>>
>>>>> To avoid this scenario on the debug kernel, check the
>>>>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
>>>>> the signaling bit status to confirm that enable_signaling
>>>>> is enabled.
>>>>
>>>> This describes the implementation, but we should rather describe 
>>>> the background of the change. The implementation should be obvious. 
>>>> Something like this maybe:
>>>>
>>>> "
>>>> Since drivers and implementations sometimes mess this up enforce 
>>>> correct behavior when DEBUG_WW_MUTEX_SLOWPATH is used during 
>>>> debugging.
>>>>
>>>> This should make any implementations bugs resulting in not signaled 
>>>> fences much more obvious.
>>>> "
>>>
>>> I think I follow the idea but am not sure coupling (well 
>>> "coupling".. not really, but cross-contaminating in a way) 
>>> dma-fence.c with a foreign and effectively unrelated concept of a ww 
>>> mutex is the best way.
>>>
>>> Instead, how about a dma-buf specific debug kconfig option?
>>
>> Yeah, I was thinking about that as well.
>
> Cool, lets see about the specifics below and then decide.
>
>> The slowpath config option was just at hand because when you want to 
>> test the slowpath you want to test this here as well.
>>
>>>
>>> Condition would then be, according to my understanding of the rules 
>>> and expectations, along the lines of:
>>>
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index 775cdc0b4f24..147a9df2c9d0 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -428,6 +428,17 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>> *fence)
>>>  static inline bool
>>>  dma_fence_is_signaled(struct dma_fence *fence)
>>>  {
>>> +#ifdef CONFIG_DEBUG_DMAFENCE
>>> +       /*
>>> +        * Implementations not providing the enable_signaling 
>>> callback are
>>> +        * required to always have signaling enabled or fences are not
>>> +        * guaranteed to ever signal.
>>> +        */
>>
>> Well that comment is a bit misleading. The intention of the extra 
>> check is to find bugs in the frontend and not the backend.
>
> By backend you mean drivers/dma-buf/dma-fence.c and by front end 
> driver specific implementations? Or simply users for dma-fence?

Backend are the driver specific implementation of the dma_fence_ops.

Middleware is the framework in drivers/dma-buf.

Frontend is the users of the dma_fence interface, e.g. either drivers or 
components (drm_syncobj, drm_scheduler etc...).

>
> Could be that I got confused.. I was reading these two:
>
>
>      * This callback is optional. If this callback is not present, 
> then the
>      * driver must always have signaling enabled.
>      */
>     bool (*enable_signaling)(struct dma_fence *fence);
>
> And dma_fence_is_signaled:
>
>  * Returns true if the fence was already signaled, false if not. Since 
> this
>  * function doesn't enable signaling, it is not guaranteed to ever return
>  * true if dma_fence_add_callback(), dma_fence_wait() or
>  * dma_fence_enable_sw_signaling() haven't been called before.
>
> Right, I think I did get confused, apologies. What I was thinking was 
> probably two separate conditions:
>
>  static inline bool
>  dma_fence_is_signaled(struct dma_fence *fence)
>  {
> +#ifdef CONFIG_DEBUG_DMAFENCE
> +       if (WARN_ON_ONCE(!fence->ops->enable_signaling &&
> + !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)))
> +               return false;
> +
> +       if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
> +               return false;
> +#endif
> +
>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                 return true;
>
> Not sure "is signaled" is the best place for the first one or that it 
> should definitely be added.

I was thinking about adding this WARN_ON() as well, but then decided 
against it.

There are a couple of cases where calling dma_fence_is_signaled() 
without previously calling dma_fence_enable_sw_signaling() is valid 
because it is done just for optimization purposes and we guarantee that 
dma_fence_enable_sw_signaling() is called just a little bit later.

But yes, in general it's the same idea I already had as well.

Regards,
Christian.

>
> Regards,
>
> Tvrtko
>
>> In other words somewhere in the drm_syncobj code we have a 
>> dma_fence_is_signaled() call without matching 
>> dma_fence_enable_sw_signaling().
>>
>> Regards,
>> Christian.
>>
>>> + if (!fence->ops->enable_signaling &&
>>> +           !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>> +               return false;
>>> +#endif
>>> +
>>>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>                 return true;
>>>
>>> Thoughts?
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>
>>>>>
>>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>>
>>>> With the improved commit message this patch is Reviewed-by: 
>>>> Christian König <christian.koenig@amd.com>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> ---
>>>>>
>>>>> Changes in v1 :
>>>>> 1- Addressing Christian's comment to replace
>>>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>>>>> 2- As per Christian's comment moving this patch at last so
>>>>> The version of this patch is also changed and previously
>>>>> it was [PATCH 1/4]
>>>>>
>>>>>
>>>>> ---
>>>>>   include/linux/dma-fence.h | 5 +++++
>>>>>   1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>> index 775cdc0b4f24..ba1ddc14c5d4 100644
>>>>> --- a/include/linux/dma-fence.h
>>>>> +++ b/include/linux/dma-fence.h
>>>>> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>>>> *fence)
>>>>>   static inline bool
>>>>>   dma_fence_is_signaled(struct dma_fence *fence)
>>>>>   {
>>>>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>>>>> +    if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>>>> +        return false;
>>>>> +#endif
>>>>> +
>>>>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>>>           return true;
>>>>
>>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit on debug
  2022-09-06 11:21         ` Tvrtko Ursulin
  2022-09-06 11:35           ` Christian König
@ 2022-09-06 11:38           ` Tvrtko Ursulin
  1 sibling, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2022-09-06 11:38 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, andrey.grodzovsky,
	shashank.sharma, amaranath.somalapuram, Arunpravin.PaneerSelvam,
	sumit.semwal, gustavo, airlied, daniel, linux-media, dri-devel,
	linaro-mm-sig, linux-kernel


On 06/09/2022 12:21, Tvrtko Ursulin wrote:
> 
> On 06/09/2022 11:43, Christian König wrote:
>> Am 06.09.22 um 12:20 schrieb Tvrtko Ursulin:
>>>
>>> On 06/09/2022 09:39, Christian König wrote:
>>>> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>>>>> The core DMA-buf framework needs to enable signaling
>>>>> before the fence is signaled. The core DMA-buf framework
>>>>> can forget to enable signaling before the fence is signaled.
>>>>
>>>> This sentence is a bit confusing. I'm not a native speaker of 
>>>> English either, but I suggest something like:
>>>>
>>>> "Fence signaling must be enable to make sure that the 
>>>> dma_fence_is_signaled() function ever returns true."
>>>>
>>>>> To avoid this scenario on the debug kernel, check the
>>>>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking
>>>>> the signaling bit status to confirm that enable_signaling
>>>>> is enabled.
>>>>
>>>> This describes the implementation, but we should rather describe the 
>>>> background of the change. The implementation should be obvious. 
>>>> Something like this maybe:
>>>>
>>>> "
>>>> Since drivers and implementations sometimes mess this up enforce 
>>>> correct behavior when DEBUG_WW_MUTEX_SLOWPATH is used during debugging.
>>>>
>>>> This should make any implementations bugs resulting in not signaled 
>>>> fences much more obvious.
>>>> "
>>>
>>> I think I follow the idea but am not sure coupling (well "coupling".. 
>>> not really, but cross-contaminating in a way) dma-fence.c with a 
>>> foreign and effectively unrelated concept of a ww mutex is the best way.
>>>
>>> Instead, how about a dma-buf specific debug kconfig option?
>>
>> Yeah, I was thinking about that as well.
> 
> Cool, lets see about the specifics below and then decide.
> 
>> The slowpath config option was just at hand because when you want to 
>> test the slowpath you want to test this here as well.
>>
>>>
>>> Condition would then be, according to my understanding of the rules 
>>> and expectations, along the lines of:
>>>
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index 775cdc0b4f24..147a9df2c9d0 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -428,6 +428,17 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>> *fence)
>>>  static inline bool
>>>  dma_fence_is_signaled(struct dma_fence *fence)
>>>  {
>>> +#ifdef CONFIG_DEBUG_DMAFENCE
>>> +       /*
>>> +        * Implementations not providing the enable_signaling 
>>> callback are
>>> +        * required to always have signaling enabled or fences are not
>>> +        * guaranteed to ever signal.
>>> +        */
>>
>> Well that comment is a bit misleading. The intention of the extra 
>> check is to find bugs in the frontend and not the backend.
> 
> By backend you mean drivers/dma-buf/dma-fence.c and by front end driver 
> specific implementations? Or simply users for dma-fence?
> 
> Could be that I got confused.. I was reading these two:
> 
> 
>       * This callback is optional. If this callback is not present, then 
> the
>       * driver must always have signaling enabled.
>       */
>      bool (*enable_signaling)(struct dma_fence *fence);
> 
> And dma_fence_is_signaled:
> 
>   * Returns true if the fence was already signaled, false if not. Since 
> this
>   * function doesn't enable signaling, it is not guaranteed to ever return
>   * true if dma_fence_add_callback(), dma_fence_wait() or
>   * dma_fence_enable_sw_signaling() haven't been called before.
> 
> Right, I think I did get confused, apologies. What I was thinking was 
> probably two separate conditions:
> 
>   static inline bool
>   dma_fence_is_signaled(struct dma_fence *fence)
>   {
> +#ifdef CONFIG_DEBUG_DMAFENCE
> +       if (WARN_ON_ONCE(!fence->ops->enable_signaling &&
> +                        !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, 
> &fence->flags)))
> +               return false;
> +
> +       if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
> +               return false;
> +#endif
> +

Or simpler OFC:

	if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {
		WARN_ON_ONCE(!fence->ops->enable_signaling);
		return false;
	}

You could also run this core change past i915 CI to see if it catches anything. Just send to our trybot and see what happens? With the debug option enabled of course. Hope it won't be painful.

Regards,

Tvrtko

>          if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                  return true;
> 
> Not sure "is signaled" is the best place for the first one or that it 
> should definitely be added.
> 
> Regards,
> 
> Tvrtko
> 
>> In other words somewhere in the drm_syncobj code we have a 
>> dma_fence_is_signaled() call without matching 
>> dma_fence_enable_sw_signaling().
>>
>> Regards,
>> Christian.
>>
>>> + if (!fence->ops->enable_signaling &&
>>> +           !test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>> +               return false;
>>> +#endif
>>> +
>>>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>                 return true;
>>>
>>> Thoughts?
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>
>>>>>
>>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>>
>>>> With the improved commit message this patch is Reviewed-by: 
>>>> Christian König <christian.koenig@amd.com>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> ---
>>>>>
>>>>> Changes in v1 :
>>>>> 1- Addressing Christian's comment to replace
>>>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>>>>> 2- As per Christian's comment moving this patch at last so
>>>>> The version of this patch is also changed and previously
>>>>> it was [PATCH 1/4]
>>>>>
>>>>>
>>>>> ---
>>>>>   include/linux/dma-fence.h | 5 +++++
>>>>>   1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>> index 775cdc0b4f24..ba1ddc14c5d4 100644
>>>>> --- a/include/linux/dma-fence.h
>>>>> +++ b/include/linux/dma-fence.h
>>>>> @@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence 
>>>>> *fence)
>>>>>   static inline bool
>>>>>   dma_fence_is_signaled(struct dma_fence *fence)
>>>>>   {
>>>>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>>>>> +    if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
>>>>> +        return false;
>>>>> +#endif
>>>>> +
>>>>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>>>           return true;
>>>>
>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  2022-09-06  6:34   ` Christian König
@ 2022-09-06 19:55     ` Andrey Grodzovsky
  2022-09-07  6:37       ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Andrey Grodzovsky @ 2022-09-06 19:55 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel


On 2022-09-06 02:34, Christian König wrote:
> Am 05.09.22 um 18:34 schrieb Arvind Yadav:
>> Here's enabling software signaling for finished fence.
>>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>
>> Changes in v1 :
>> 1- Addressing Christian's comment to remove CONFIG_DEBUG_FS check from
>> this patch.
>> 2- The version of this patch is also changed and previously
>> it was [PATCH 2/4]
>>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index e0ab14e0fb6b..fe72de0e2911 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -962,6 +962,8 @@ static int drm_sched_main(void *param)
>>               /* Drop for original kref_init of the fence */
>>               dma_fence_put(fence);
>>   + dma_fence_enable_sw_signaling(&s_fence->finished);
>
> Ok, this makes it a lot clearer. Previously I though that we have some 
> bug in dma_fence_add_callback().
>
> This is essentially the wrong place to call this, the finished fence 
> should be enabled by the caller and not here.
>
> There is also another problem in dma_fence_enable_sw_signaling(), it 
> returns early when the fence is already signaled:
>
>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                 return;
>
> Please remove that one first.


Why we even need this explicit call if dma_fence_add_callback calls 
__dma_fence_enable_signaling anyway ?

Andrey


>
> Thanks,
> Christian.
>
>
>> +
>>               r = dma_fence_add_callback(fence, &sched_job->cb,
>>                              drm_sched_job_done_cb);
>>               if (r == -ENOENT)
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  2022-09-06 19:55     ` Andrey Grodzovsky
@ 2022-09-07  6:37       ` Christian König
  2022-09-07 16:18         ` Andrey Grodzovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2022-09-07  6:37 UTC (permalink / raw)
  To: Andrey Grodzovsky, Arvind Yadav, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel

Am 06.09.22 um 21:55 schrieb Andrey Grodzovsky:
>
> On 2022-09-06 02:34, Christian König wrote:
>> Am 05.09.22 um 18:34 schrieb Arvind Yadav:
>>> Here's enabling software signaling for finished fence.
>>>
>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>> ---
>>>
>>> Changes in v1 :
>>> 1- Addressing Christian's comment to remove CONFIG_DEBUG_FS check from
>>> this patch.
>>> 2- The version of this patch is also changed and previously
>>> it was [PATCH 2/4]
>>>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index e0ab14e0fb6b..fe72de0e2911 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -962,6 +962,8 @@ static int drm_sched_main(void *param)
>>>               /* Drop for original kref_init of the fence */
>>>               dma_fence_put(fence);
>>>   + dma_fence_enable_sw_signaling(&s_fence->finished);
>>
>> Ok, this makes it a lot clearer. Previously I though that we have 
>> some bug in dma_fence_add_callback().
>>
>> This is essentially the wrong place to call this, the finished fence 
>> should be enabled by the caller and not here.
>>
>> There is also another problem in dma_fence_enable_sw_signaling(), it 
>> returns early when the fence is already signaled:
>>
>>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>                 return;
>>
>> Please remove that one first.
>
>
> Why we even need this explicit call if dma_fence_add_callback calls 
> __dma_fence_enable_signaling anyway ?

Two different fence objects.

The dma_fence_add_callback() is done on the hw fence we get in return of 
submitting the job.

The dma_fence_enable_sw_signaling() here is done on the finished fence 
we use to signal the completion externally.

Key point is the finished fence should be used by the frontend drivers 
which uses the scheduler and not by the scheduler itself.

Christian.

>
> Andrey
>
>
>>
>> Thanks,
>> Christian.
>>
>>
>>> +
>>>               r = dma_fence_add_callback(fence, &sched_job->cb,
>>>                              drm_sched_job_done_cb);
>>>               if (r == -ENOENT)
>>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] drm/sched: Enable signaling for finished fence
  2022-09-07  6:37       ` Christian König
@ 2022-09-07 16:18         ` Andrey Grodzovsky
  0 siblings, 0 replies; 18+ messages in thread
From: Andrey Grodzovsky @ 2022-09-07 16:18 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, shashank.sharma,
	amaranath.somalapuram, Arunpravin.PaneerSelvam, sumit.semwal,
	gustavo, airlied, daniel, linux-media, dri-devel, linaro-mm-sig,
	linux-kernel


On 2022-09-07 02:37, Christian König wrote:
> Am 06.09.22 um 21:55 schrieb Andrey Grodzovsky:
>>
>> On 2022-09-06 02:34, Christian König wrote:
>>> Am 05.09.22 um 18:34 schrieb Arvind Yadav:
>>>> Here's enabling software signaling for finished fence.
>>>>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>> ---
>>>>
>>>> Changes in v1 :
>>>> 1- Addressing Christian's comment to remove CONFIG_DEBUG_FS check from
>>>> this patch.
>>>> 2- The version of this patch is also changed and previously
>>>> it was [PATCH 2/4]
>>>>
>>>> ---
>>>>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>>>>   1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index e0ab14e0fb6b..fe72de0e2911 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -962,6 +962,8 @@ static int drm_sched_main(void *param)
>>>>               /* Drop for original kref_init of the fence */
>>>>               dma_fence_put(fence);
>>>>   + dma_fence_enable_sw_signaling(&s_fence->finished);
>>>
>>> Ok, this makes it a lot clearer. Previously I though that we have 
>>> some bug in dma_fence_add_callback().
>>>
>>> This is essentially the wrong place to call this, the finished fence 
>>> should be enabled by the caller and not here.
>>>
>>> There is also another problem in dma_fence_enable_sw_signaling(), it 
>>> returns early when the fence is already signaled:
>>>
>>>         if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>                 return;
>>>
>>> Please remove that one first.
>>
>>
>> Why we even need this explicit call if dma_fence_add_callback calls 
>> __dma_fence_enable_signaling anyway ?
>
> Two different fence objects.
>
> The dma_fence_add_callback() is done on the hw fence we get in return 
> of submitting the job.
>
> The dma_fence_enable_sw_signaling() here is done on the finished fence 
> we use to signal the completion externally.
>
> Key point is the finished fence should be used by the frontend drivers 
> which uses the scheduler and not by the scheduler itself.
>
> Christian.


Oh, so we need to explicitly call this because dma_fence_add_callback is 
not always used for finished fence right ?

Yea, then it makes sense that the client needs to manage this since each 
one has his own logic what to do with it.

Andrey


>
>>
>> Andrey
>>
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>>
>>>> +
>>>>               r = dma_fence_add_callback(fence, &sched_job->cb,
>>>>                              drm_sched_job_done_cb);
>>>>               if (r == -ENOENT)
>>>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug
  2022-09-06  7:09   ` Christian König
@ 2022-09-09 16:32     ` Yadav, Arvind
  0 siblings, 0 replies; 18+ messages in thread
From: Yadav, Arvind @ 2022-09-09 16:32 UTC (permalink / raw)
  To: Christian König, Arvind Yadav, andrey.grodzovsky,
	shashank.sharma, amaranath.somalapuram, Arunpravin.PaneerSelvam,
	sumit.semwal, gustavo, airlied, daniel, linux-media, dri-devel,
	linaro-mm-sig, linux-kernel


On 9/6/2022 12:39 PM, Christian König wrote:
>
>
> Am 05.09.22 um 18:35 schrieb Arvind Yadav:
>> Here's on debug enabling software signaling for the stub fence
>> which is always signaled. This fence should enable software
>> signaling otherwise the AMD GPU scheduler will cause a GPU reset
>> due to a GPU scheduler cleanup activity timeout.
>>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>
>> Changes in v1 :
>> 1- Addressing Christian's comment to remove unnecessary callback.
>> 2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>> 3- The version of this patch is also changed and previously
>> it was [PATCH 3/4]
>>
>> ---
>>   drivers/dma-buf/dma-fence.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index 066400ed8841..2378b12538c4 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
>>   static DEFINE_SPINLOCK(dma_fence_stub_lock);
>>   static struct dma_fence dma_fence_stub;
>>   +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>> +static bool __dma_fence_enable_signaling(struct dma_fence *fence);
>> +#endif
>> +
>
> I would rename the function to something like 
> dma_fence_enable_signaling_locked().
>
> And please don't add any #ifdef if it isn't absolutely necessary. This 
> makes the code pretty fragile.
>
>>   /*
>>    * fence context counter: each execution context should have its own
>>    * fence context, this allows checking if fences belong to the same
>> @@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
>>                      &dma_fence_stub_ops,
>>                      &dma_fence_stub_lock,
>>                      0, 0);
>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>> +        __dma_fence_enable_signaling(&dma_fence_stub);
>> +#endif
>
> Alternatively in this particular case you could just set the bit 
> manually here since this is part of the dma_fence code anyway.
>
> Christian.
>
As per per review comment. I will set the bit manually.

~arvind

>> dma_fence_signal_locked(&dma_fence_stub);
>>       }
>>       spin_unlock(&dma_fence_stub_lock);
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-09-09 16:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-05 16:34 [PATCH v2 0/4] dma-buf: To check enable signaling before signaled Arvind Yadav
2022-09-05 16:34 ` [PATCH v2 1/4] drm/sched: Enable signaling for finished fence Arvind Yadav
2022-09-06  6:34   ` Christian König
2022-09-06 19:55     ` Andrey Grodzovsky
2022-09-07  6:37       ` Christian König
2022-09-07 16:18         ` Andrey Grodzovsky
2022-09-05 16:35 ` [PATCH v2 2/4] dma-buf: enable signaling for the stub fence on debug Arvind Yadav
2022-09-06  7:09   ` Christian König
2022-09-09 16:32     ` Yadav, Arvind
2022-09-05 16:35 ` [PATCH v2 3/4] dma-buf: enable signaling for selftest " Arvind Yadav
2022-09-06  7:11   ` Christian König
2022-09-05 16:35 ` [PATCH v2 4/4] dma-buf: Check status of enable-signaling bit " Arvind Yadav
2022-09-06  8:39   ` Christian König
2022-09-06 10:20     ` Tvrtko Ursulin
2022-09-06 10:43       ` Christian König
2022-09-06 11:21         ` Tvrtko Ursulin
2022-09-06 11:35           ` Christian König
2022-09-06 11:38           ` Tvrtko Ursulin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).