* [PATCH 0/7] dma-resv fence DAG fixes
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Intel Graphics Development

Hi all,

I think I've finally caught them all while auditing all drivers for how
they handle dma-resv dependencies in their command submission ioctls.

This series is incomplete; it also needs a few things from Christian:
- nouveau fix for waiting for all fences
- various patches for fixing up dma-buf/resv functions to always wait for
  all fences (dma-buf poll, is_signalled, ...)
- I do include the one msm patch from Christian here since there was
  another issue in msm that needed fixing, and to make sure we have the
  complete set for msm

Two main things:
- fix drivers that currently can break the DAG. I opted for the dumbest
  possible way and not for rolling out dma_fence_chain - this can be fixed
  later on if needed.

- allow shared fences to be decoupled from the exclusive slot, which
  mostly means we can't skip waiting for the exclusive fence if there are
  shared fences present; we have to wait for all fences (see the sketch
  below). This is a semantic change compared to what we've had thus far,
  but it really makes a ton of sense given where things are heading.
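
As a rough illustration of that second point, a waiter now has to do
something like this (a minimal sketch, not taken from any driver, assuming
the dma_resv_get_fences() helper and a placeholder struct dma_resv *resv;
error handling trimmed), instead of skipping the exclusive fence as soon
as shared fences are present:

	struct dma_fence *excl, **shared;
	unsigned int i, count;
	int ret;

	ret = dma_resv_get_fences(resv, &excl, &count, &shared);
	if (ret)
		return ret;

	/* wait for all fences: the exclusive one and every shared one */
	if (excl)
		ret = dma_fence_wait(excl, true);
	for (i = 0; !ret && i < count; i++)
		ret = dma_fence_wait(shared[i], true);

	dma_fence_put(excl);
	while (count--)
		dma_fence_put(shared[count]);
	kfree(shared);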

Note that this means the import/export patches from Jason need to be
adjusted to fit as well.

Plus some docs for dma-resv, which have been rather lacking.

Testing and review highly welcome.

Christian König (1):
  drm/msm: always wait for the exclusive fence

Daniel Vetter (6):
  drm/msm: Don't break exclusive fence ordering
  drm/etnaviv: Don't break exclusive fence ordering
  drm/i915: delete exclude argument from i915_sw_fence_await_reservation
  drm/i915: Always wait for the exclusive fence
  drm/i915: Don't break exclusive fence ordering
  dma-resv: Give the docs a do-over

 drivers/dma-buf/dma-resv.c                    |  22 +++-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   8 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   8 +-
 drivers/gpu/drm/i915/i915_sw_fence.c          |  10 +-
 drivers/gpu/drm/i915/i915_sw_fence.h          |   1 -
 drivers/gpu/drm/msm/msm_gem.c                 |  16 ++-
 drivers/gpu/drm/msm/msm_gem_submit.c          |   3 +-
 include/linux/dma-resv.h                      | 104 +++++++++++++++++-
 10 files changed, 142 insertions(+), 36 deletions(-)

-- 
2.32.0


* [PATCH 1/7] drm/msm: Don't break exclusive fence ordering
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Rob Clark, Sean Paul, linux-arm-msm, freedreno

There's only one exclusive slot, and we must not break the ordering.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but
- msm has a synchronous dma_fence_wait for anything from another
  context, so doesn't seem to care much,
- and it probably makes sense to lift this into dma-resv.c code as a
  proper concept, so that drivers don't each have to hack up their own
  solution (a rough sketch of that idea follows below).
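
Purely as an illustration of that last idea, a minimal sketch of what such
a dma-resv.c helper might look like - dma_resv_chain_excl_fence() is a
made-up name, and seqcount handling plus error paths are omitted:

	int dma_resv_chain_excl_fence(struct dma_resv *obj,
				      struct dma_fence *fence)
	{
		struct dma_fence_chain *chain;

		dma_resv_assert_held(obj);

		chain = kmalloc(sizeof(*chain), GFP_KERNEL);
		if (!chain)
			return -ENOMEM;

		/* the chain node inherits obj's reference on the current
		 * exclusive fence and takes a new one on @fence */
		dma_fence_chain_init(chain, dma_resv_excl_fence(obj),
				     dma_fence_get(fence), 1);
		rcu_assign_pointer(obj->fence_excl, &chain->base);
		return 0;
	}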

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index b71da71a3dd8..edd0051d849f 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
 				return ret;
 		}
 
-		if (no_implicit)
+		/* exclusive fences must be ordered */
+		if (no_implicit && !write)
 			continue;
 
 		ret = msm_gem_sync_object(&msm_obj->base, submit->ring->fctx,
-- 
2.32.0


* [PATCH 2/7] drm/msm: always wait for the exclusive fence
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Christian König,
	Christian König, Daniel Vetter, Rob Clark, Sean Paul,
	linux-arm-msm, freedreno

From: Christian König <ckoenig.leichtzumerken@gmail.com>

Drivers also need to sync to the exclusive fence when
a shared one is present.

Signed-off-by: Christian König <christian.koenig@amd.com>
[danvet: Not that hard to compile-test on arm ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 141178754231..d9c4f1deeafb 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -812,17 +812,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
 	struct dma_fence *fence;
 	int i, ret;
 
-	fobj = dma_resv_shared_list(obj->resv);
-	if (!fobj || (fobj->shared_count == 0)) {
-		fence = dma_resv_excl_fence(obj->resv);
-		/* don't need to wait on our own fences, since ring is fifo */
-		if (fence && (fence->context != fctx->context)) {
-			ret = dma_fence_wait(fence, true);
-			if (ret)
-				return ret;
-		}
+	fence = dma_resv_excl_fence(obj->resv);
+	/* don't need to wait on our own fences, since ring is fifo */
+	if (fence && (fence->context != fctx->context)) {
+		ret = dma_fence_wait(fence, true);
+		if (ret)
+			return ret;
 	}
 
+	fobj = dma_resv_shared_list(obj->resv);
 	if (!exclusive || !fobj)
 		return 0;
 
-- 
2.32.0


* [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, etnaviv, Russell King,
	Daniel Vetter

There's only one exclusive slot, and we must not break the ordering.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't each have
to hack up their own solution. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: etnaviv@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 92478a50a580..5c4fed2b7c6a 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 	for (i = 0; i < submit->nr_bos; i++) {
 		struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
 		struct dma_resv *robj = bo->obj->base.resv;
+		bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
 
-		if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
+		if (!(write)) {
 			ret = dma_resv_reserve_shared(robj, 1);
 			if (ret)
 				return ret;
 		}
 
-		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
+		/* exclusive fences must be ordered */
+		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
 			continue;
 
 		ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
-						   bo->flags & ETNA_SUBMIT_BO_WRITE);
+						   write);
 		if (ret)
 			return ret;
 	}
-- 
2.32.0


* [PATCH 4/7] drm/i915: delete exclude argument from i915_sw_fence_await_reservation
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Thomas Hellström, Daniel Vetter, Intel Graphics Development,
	Jason Ekstrand, Daniel Vetter

No longer used; the last user disappeared with

commit d07f0e59b2c762584478920cd2d11fba2980a94a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 28 13:58:44 2016 +0100

    drm/i915: Move GEM activity tracking into a common struct reservation_object

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/display/intel_display.c   | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c    | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c           | 6 +-----
 drivers/gpu/drm/i915/i915_sw_fence.h           | 1 -
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 98e0f4ed7e4a..678c7839034e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11119,7 +11119,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 		 */
 		if (intel_crtc_needs_modeset(crtc_state)) {
 			ret = i915_sw_fence_await_reservation(&state->commit_ready,
-							      old_obj->base.resv, NULL,
+							      old_obj->base.resv,
 							      false, 0,
 							      GFP_KERNEL);
 			if (ret < 0)
@@ -11153,7 +11153,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 		struct dma_fence *fence;
 
 		ret = i915_sw_fence_await_reservation(&state->commit_ready,
-						      obj->base.resv, NULL,
+						      obj->base.resv,
 						      false,
 						      i915_fence_timeout(dev_priv),
 						      GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index daf9284ef1f5..93439d2c7a58 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -106,7 +106,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 		clflush = clflush_work_create(obj);
 	if (clflush) {
 		i915_sw_fence_await_reservation(&clflush->base.chain,
-						obj->base.resv, NULL, true,
+						obj->base.resv, true,
 						i915_fence_timeout(to_i915(obj->base.dev)),
 						I915_FENCE_GFP);
 		dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 68ce366f46cf..47e07179347a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2095,7 +2095,7 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 
 	/* Wait for all writes (and relocs) into the batch to complete */
 	err = i915_sw_fence_await_reservation(&pw->base.chain,
-					      pw->batch->resv, NULL, false,
+					      pw->batch->resv, false,
 					      0, I915_FENCE_GFP);
 	if (err < 0)
 		goto err_commit;
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..91711a46b1c7 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -567,7 +567,6 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct dma_resv *resv,
-				    const struct dma_fence_ops *exclude,
 				    bool write,
 				    unsigned long timeout,
 				    gfp_t gfp)
@@ -587,9 +586,6 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 			return ret;
 
 		for (i = 0; i < count; i++) {
-			if (shared[i]->ops == exclude)
-				continue;
-
 			pending = i915_sw_fence_await_dma_fence(fence,
 								shared[i],
 								timeout,
@@ -609,7 +605,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 		excl = dma_resv_get_excl_unlocked(resv);
 	}
 
-	if (ret >= 0 && excl && excl->ops != exclude) {
+	if (ret >= 0 && excl) {
 		pending = i915_sw_fence_await_dma_fence(fence,
 							excl,
 							timeout,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 30a863353ee6..6572f01668e4 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -86,7 +86,6 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct dma_resv *resv,
-				    const struct dma_fence_ops *exclude,
 				    bool write,
 				    unsigned long timeout,
 				    gfp_t gfp);
-- 
2.32.0


* [PATCH 5/7] drm/i915: Always wait for the exclusive fence
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Thomas Hellström, Daniel Vetter, Intel Graphics Development,
	Jason Ekstrand, Daniel Vetter

We're lifting, or rather clarifying, that the restriction that shared
fences have to be strictly ordered after the exclusive one no longer
applies.

So adjust the code to always also wait for the exclusive fence.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 91711a46b1c7..271d321cea83 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -601,10 +601,10 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 		for (i = 0; i < count; i++)
 			dma_fence_put(shared[i]);
 		kfree(shared);
-	} else {
-		excl = dma_resv_get_excl_unlocked(resv);
 	}
 
+	excl = dma_resv_get_excl_unlocked(resv);
+
 	if (ret >= 0 && excl) {
 		pending = i915_sw_fence_await_dma_fence(fence,
 							excl,
-- 
2.32.0


* [PATCH 6/7] drm/i915: Don't break exclusive fence ordering
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Thomas Hellström, Daniel Vetter, Intel Graphics Development,
	Jason Ekstrand, Daniel Vetter

There's only one exclusive slot, and we must not break the ordering.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't each have
to hack up their own solution. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 47e07179347a..9d717c8842e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1775,6 +1775,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		struct i915_vma *vma = ev->vma;
 		unsigned int flags = ev->flags;
 		struct drm_i915_gem_object *obj = vma->obj;
+		bool async, write;
 
 		assert_vma_held(vma);
 
@@ -1806,7 +1807,10 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 				flags &= ~EXEC_OBJECT_ASYNC;
 		}
 
-		if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) {
+		async = flags & EXEC_OBJECT_ASYNC;
+		write = flags & EXEC_OBJECT_WRITE;
+
+		if (err == 0 && (!async || write)) {
 			err = i915_request_await_object
 				(eb->request, obj, flags & EXEC_OBJECT_WRITE);
 		}
-- 
2.32.0


* [PATCH 7/7] dma-resv: Give the docs a do-over
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Sumit Semwal, Christian König, linux-media, linaro-mm-sig

Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

But also document all the things a bit better: given how central
struct dma_resv is to dynamic buffer management, the docs have been
very inadequate.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway (a sketch of the locking dance is included below).

- Main part of the patch is some actual docs for struct dma_resv.
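
For reference, the -EDEADLK dance the new kerneldoc spells out looks
roughly like this minimal sketch (resvs[] and count are placeholders,
interruptible variants and non-deadlock error handling are left out): on
-EDEADLK all locks held by the context are dropped and the contended
object is then re-acquired with dma_resv_lock_slow() before retrying.

	struct ww_acquire_ctx ctx;
	int contended = -1, i, ret;

	ww_acquire_init(&ctx, &reservation_ww_class);
retry:
	if (contended != -1)
		dma_resv_lock_slow(resvs[contended], &ctx);

	for (i = 0; i < count; i++) {
		if (i == contended)
			continue;

		ret = dma_resv_lock(resvs[i], &ctx);
		if (ret == -EDEADLK) {
			int j;

			/* back off: drop everything already held ... */
			for (j = 0; j < i; j++)
				dma_resv_unlock(resvs[j]);
			if (contended != -1 && contended >= i)
				dma_resv_unlock(resvs[contended]);

			/* ... then sleep on the contended object and retry */
			contended = i;
			goto retry;
		}
	}
	ww_acquire_done(&ctx);
	/* ... use the buffers, then dma_resv_unlock() each and
	 * ww_acquire_fini(&ctx) ... */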

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's really awkward is how chaining exclusive fences
right now means direct dma_resv.fence_excl pointer access with an
rcu_assign_pointer. Not so great either.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  22 ++++++--
 include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
 2 files changed, 116 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..898f8d894bbd 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before calling dma_resv_add_shared_fence(). This is validated when
+ * CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..c77fd54d033f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -62,16 +62,90 @@ struct dma_resv_list {
 
 /**
  * struct dma_resv - a reservation object manages fences for a buffer
- * @lock: update side lock
- * @seq: sequence count for managing RCU read-side synchronization
- * @fence_excl: the exclusive fence, if there is one currently
- * @fence: list of current shared fences
+ *
+ * There are multiple uses for this, with sometimes slightly different rules in
+ * how the fence slots are used.
+ *
+ * One use is to synchronize cross-driver access to a struct dma_buf, either for
+ * dynamic buffer management or just to handle implicit synchronization between
+ * different users of the buffer in userspace. See &dma_buf.resv for a more
+ * in-depth discussion.
+ *
+ * The other major use is to manage access and locking within a driver in a
+ * buffer based memory manager. struct ttm_buffer_object is the canonical
+ * example here, since this is where reservation objects originated from. But use
+ * in drivers is spreading and some drivers also manage struct
+ * drm_gem_object with the same scheme.
  */
 struct dma_resv {
+	/**
+	 * @lock:
+	 *
+	 * Update side lock. Don't use directly, instead use the wrapper
+	 * functions like dma_resv_lock() and dma_resv_unlock().
+	 *
+	 * Drivers which use the reservation object to manage memory dynamically
+	 * also use this lock to protect buffer object state like placement,
+	 * allocation policies or throughout command submission.
+	 */
 	struct ww_mutex lock;
+
+	/**
+	 * @seq:
+	 *
+	 * Sequence count for managing RCU read-side synchronization, allows
+	 * read-only access to @fence_excl and @fence while ensuring we take a
+	 * consistent snapshot.
+	 */
 	seqcount_ww_mutex_t seq;
 
+	/**
+	 * @fence_excl:
+	 *
+	 * The exclusive fence, if there is one currently.
+	 *
+	 * There are two ways to update this fence:
+	 *
+	 * - First by calling dma_resv_add_excl_fence(), which replaces all
+	 *   fences attached to the reservation object. To guarantee that no
+	 *   fences are lost this new fence must signal only after all previous
+	 *   fences, both shared and exclusive, have signalled. In some cases it
+	 *   is convenient to achieve that by attaching a struct dma_fence_array
+	 *   with all the new and old fences.
+	 *
+	 * - Alternatively the fence can be set directly, which leaves the
+	 *   shared fences unchanged. To guarantee that no fences are lost this
+	 *   new fence must signal only after the previous exclusive fence has
+	 *   signalled. Since the shared fences are staying intact, it is not
+	 *   necessary to maintain any ordering against those. If semantically
+	 *   only a new access is added without actually treating the previous
+	 *   one as a dependency the exclusive fences can be strung together
+	 *   using struct dma_fence_chain.
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_fence __rcu *fence_excl;
+
+	/**
+	 * @fence:
+	 *
+	 * List of current shared fences.
+	 *
+	 * There are no ordering constraints of shared fences against the
+	 * exclusive fence slot. If a waiter needs to wait for all access, it
+	 * has to wait for both sets of fences to signal.
+	 *
+	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
+	 * this often needs to be done past the point of no return in command
+	 * submission it cannot fail, and therefore sufficient slots need to be
+	 * reserved by calling dma_resv_reserve_shared().
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_resv_list __rcu *fence;
 };
 
@@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
+ *
+ * See also dma_resv_lock_interruptible() for the interruptible variant.
  */
 static inline int dma_resv_lock(struct dma_resv *obj,
 				struct ww_acquire_ctx *ctx)
@@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
+ * @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
  */
 static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
 					      struct ww_acquire_ctx *ctx)
@@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
  * Acquires the reservation object after a die case. This function
  * will sleep until the lock becomes available. See dma_resv_lock() as
  * well.
+ *
+ * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
  */
 static inline void dma_resv_lock_slow(struct dma_resv *obj,
 				      struct ww_acquire_ctx *ctx)
@@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
  * if they overlap with a writer.
  *
  * Also note that since no context is provided, no deadlock protection is
- * possible.
+ * possible, which is also not needed for a trylock.
  *
  * Returns true if the lock was acquired, false otherwise.
  */
@@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
  *
  * Returns the context used to lock a reservation object or NULL if no context
  * was used or the object is not locked at all.
+ *
+ * WARNING: This interface is pretty horrible, but TTM needs it because it
+ * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
+ * Everyone else just uses it to check whether they're holding a reservation or
+ * not.
  */
 static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
 {
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 7/7] dma-resv: Give the docs a do-over
@ 2021-07-06 10:12   ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, linaro-mm-sig,
	Daniel Vetter, Christian König, linux-media

Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

But also document all the things a bit better, given how central
struct dma_resv to dynamic buffer management the docs have been very
inadequat.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway.

- Main part of the patch is some actual docs for struct dma_resv.

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's real awkard is how chaining exclusive fences
right now means direct dma_resv.exclusive_fence pointer access with an
rcu_assign_pointer. Not so great either.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  22 ++++++--
 include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
 2 files changed, 116 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..898f8d894bbd 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before callind dma_resv_add_shared_fence(). This is validate when
+ * CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..c77fd54d033f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -62,16 +62,90 @@ struct dma_resv_list {
 
 /**
  * struct dma_resv - a reservation object manages fences for a buffer
- * @lock: update side lock
- * @seq: sequence count for managing RCU read-side synchronization
- * @fence_excl: the exclusive fence, if there is one currently
- * @fence: list of current shared fences
+ *
+ * There are multiple uses for this, with sometimes slightly different rules in
+ * how the fence slots are used.
+ *
+ * One use is to synchronize cross-driver access to a struct dma_buf, either for
+ * dynamic buffer management or just to handle implicit synchronization between
+ * different users of the buffer in userspace. See &dma_buf.resv for a more
+ * in-depth discussion.
+ *
+ * The other major use is to manage access and locking within a driver in a
+ * buffer based memory manager. struct ttm_buffer_object is the canonical
+ * example here, since this is were reservation objects originated from. But use
+ * in drivers is spreading and some drivers also manage struct
+ * drm_gem_object with the same scheme.
  */
 struct dma_resv {
+	/**
+	 * @lock:
+	 *
+	 * Update side lock. Don't use directly, instead use the wrapper
+	 * functions like dma_resv_lock() and dma_resv_unlock().
+	 *
+	 * Drivers which use the reservation object to manage memory dynamically
+	 * also use this lock to protect buffer object state like placement,
+	 * allocation policies or throughout command submission.
+	 */
 	struct ww_mutex lock;
+
+	/**
+	 * @seq:
+	 *
+	 * Sequence count for managing RCU read-side synchronization, allows
+	 * read-only access to @fence_excl and @fence while ensuring we take a
+	 * consistent snapshot.
+	 */
 	seqcount_ww_mutex_t seq;
 
+	/**
+	 * @fence_excl:
+	 *
+	 * The exclusive fence, if there is one currently.
+	 *
+	 * There are two was to update this fence:
+	 *
+	 * - First by calling dma_resv_add_excl_fence(), which replaces all
+	 *   fences attached to the reservation object. To guarantee that no
+	 *   fences are lost this new fence must signal only after all previous
+	 *   fences, both shared and exclusive, have signalled. In some cases it
+	 *   is convenient to achieve that by attaching a struct dma_fence_array
+	 *   with all the new and old fences.
+	 *
+	 * - Alternatively the fence can be set directly, which leaves the
+	 *   shared fences unchanged. To guarantee that no fences are lost this
+	 *   new fence must signale only after the previous exclusive fence has
+	 *   singalled. Since the shared fences are staying intact, it is not
+	 *   necessary to maintain any ordering against those. If semantically
+	 *   only a new access is added without actually treating the previous
+	 *   one as a dependency the exclusive fences can be strung together
+	 *   using struct dma_fence_chain.
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_fence __rcu *fence_excl;
+
+	/**
+	 * @fence:
+	 *
+	 * List of current shared fences.
+	 *
+	 * There are no ordering constraints of shared fences against the
+	 * exclusive fence slot. If a waiter needs to wait for all access, it
+	 * has to wait for both sets of fences to signal.
+	 *
+	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
+	 * this often needs to be done past the point of no return in command
+	 * submission it cannot fail, and therefore sufficient slots need to be
+	 * reserved by calling dma_resv_reserve_shared().
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_resv_list __rcu *fence;
 };
 
@@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
+ *
+ * See also dma_resv_lock_interruptible() for the interruptible variant.
  */
 static inline int dma_resv_lock(struct dma_resv *obj,
 				struct ww_acquire_ctx *ctx)
@@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
+ * @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
  */
 static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
 					      struct ww_acquire_ctx *ctx)
@@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
  * Acquires the reservation object after a die case. This function
  * will sleep until the lock becomes available. See dma_resv_lock() as
  * well.
+ *
+ * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
  */
 static inline void dma_resv_lock_slow(struct dma_resv *obj,
 				      struct ww_acquire_ctx *ctx)
@@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
  * if they overlap with a writer.
  *
  * Also note that since no context is provided, no deadlock protection is
- * possible.
+ * possible, which is also not needed for a trylock.
  *
  * Returns true if the lock was acquired, false otherwise.
  */
@@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
  *
  * Returns the context used to lock a reservation object or NULL if no context
  * was used or the object is not locked at all.
+ *
+ * WARNING: This interface is pretty horrible, but TTM needs it because it
+ * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
+ * Everyone else just uses it to check whether they're holding a reservation or
+ * not.
  */
 static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
 {
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread
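
The dma_resv_lock() kernel-doc comments in the hunk above describe the
-EDEADLK backoff protocol without showing it in action. As a purely
illustrative sketch, assuming two hypothetical reservation objects obj_a and
obj_b and ignoring everything except the deadlock handling, the pattern looks
roughly like this:

#include <linux/dma-resv.h>

static int update_two_buffers(struct dma_resv *obj_a, struct dma_resv *obj_b)
{
	struct ww_acquire_ctx ctx;
	struct dma_resv *contended = NULL;
	int ret;

	ww_acquire_init(&ctx, &reservation_ww_class);
retry:
	if (contended) {
		/* After -EDEADLK only the contended lock may be waited on. */
		dma_resv_lock_slow(contended, &ctx);
	}

	if (obj_a != contended) {
		ret = dma_resv_lock(obj_a, &ctx);
		if (ret) {
			if (contended)
				dma_resv_unlock(contended);
			contended = obj_a;
			goto backoff;
		}
	}

	if (obj_b != contended) {
		ret = dma_resv_lock(obj_b, &ctx);
		if (ret) {
			dma_resv_unlock(obj_a);
			contended = obj_b;
			goto backoff;
		}
	}

	ww_acquire_done(&ctx);

	/* ... update placement, add fences, etc. ... */

	dma_resv_unlock(obj_b);
	dma_resv_unlock(obj_a);
	ww_acquire_fini(&ctx);
	return 0;

backoff:
	if (ret == -EDEADLK)
		goto retry;
	ww_acquire_fini(&ctx);
	return ret;
}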

* [Intel-gfx] [PATCH 7/7] dma-resv: Give the docs a do-over
@ 2021-07-06 10:12   ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-06 10:12 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Sumit Semwal,
	linaro-mm-sig, Daniel Vetter, Christian König, linux-media

Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

But also document all the things a bit better, given how central
struct dma_resv is to dynamic buffer management the docs have been
very inadequate.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway.

- Main part of the patch is some actual docs for struct dma_resv.

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's real awkward is how chaining exclusive fences
right now means direct dma_resv.exclusive_fence pointer access with an
rcu_assign_pointer. Not so great either.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  22 ++++++--
 include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
 2 files changed, 116 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..898f8d894bbd 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before calling dma_resv_add_shared_fence(). This is validated when
+ * CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..c77fd54d033f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -62,16 +62,90 @@ struct dma_resv_list {
 
 /**
  * struct dma_resv - a reservation object manages fences for a buffer
- * @lock: update side lock
- * @seq: sequence count for managing RCU read-side synchronization
- * @fence_excl: the exclusive fence, if there is one currently
- * @fence: list of current shared fences
+ *
+ * There are multiple uses for this, with sometimes slightly different rules in
+ * how the fence slots are used.
+ *
+ * One use is to synchronize cross-driver access to a struct dma_buf, either for
+ * dynamic buffer management or just to handle implicit synchronization between
+ * different users of the buffer in userspace. See &dma_buf.resv for a more
+ * in-depth discussion.
+ *
+ * The other major use is to manage access and locking within a driver in a
+ * buffer based memory manager. struct ttm_buffer_object is the canonical
+ * example here, since this is where reservation objects originated from. But use
+ * in drivers is spreading and some drivers also manage struct
+ * drm_gem_object with the same scheme.
  */
 struct dma_resv {
+	/**
+	 * @lock:
+	 *
+	 * Update side lock. Don't use directly, instead use the wrapper
+	 * functions like dma_resv_lock() and dma_resv_unlock().
+	 *
+	 * Drivers which use the reservation object to manage memory dynamically
+	 * also use this lock to protect buffer object state like placement,
+	 * allocation policies or throughout command submission.
+	 */
 	struct ww_mutex lock;
+
+	/**
+	 * @seq:
+	 *
+	 * Sequence count for managing RCU read-side synchronization, allows
+	 * read-only access to @fence_excl and @fence while ensuring we take a
+	 * consistent snapshot.
+	 */
 	seqcount_ww_mutex_t seq;
 
+	/**
+	 * @fence_excl:
+	 *
+	 * The exclusive fence, if there is one currently.
+	 *
+	 * There are two ways to update this fence:
+	 *
+	 * - First by calling dma_resv_add_excl_fence(), which replaces all
+	 *   fences attached to the reservation object. To guarantee that no
+	 *   fences are lost this new fence must signal only after all previous
+	 *   fences, both shared and exclusive, have signalled. In some cases it
+	 *   is convenient to achieve that by attaching a struct dma_fence_array
+	 *   with all the new and old fences.
+	 *
+	 * - Alternatively the fence can be set directly, which leaves the
+	 *   shared fences unchanged. To guarantee that no fences are lost this
+	 *   new fence must signal only after the previous exclusive fence has
+	 *   signalled. Since the shared fences are staying intact, it is not
+	 *   necessary to maintain any ordering against those. If semantically
+	 *   only a new access is added without actually treating the previous
+	 *   one as a dependency the exclusive fences can be strung together
+	 *   using struct dma_fence_chain.
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_fence __rcu *fence_excl;
+
+	/**
+	 * @fence:
+	 *
+	 * List of current shared fences.
+	 *
+	 * There are no ordering constraints of shared fences against the
+	 * exclusive fence slot. If a waiter needs to wait for all access, it
+	 * has to wait for both sets of fences to signal.
+	 *
+	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
+	 * this often needs to be done past the point of no return in command
+	 * submission it cannot fail, and therefore sufficient slots need to be
+	 * reserved by calling dma_resv_reserve_shared().
+	 *
+	 * Note that actual semantics of what an exclusive or shared fence mean
+	 * is defined by the user, for reservation objects shared across drivers
+	 * see &dma_buf.resv.
+	 */
 	struct dma_resv_list __rcu *fence;
 };
 
@@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
+ *
+ * See also dma_resv_lock_interruptible() for the interruptible variant.
  */
 static inline int dma_resv_lock(struct dma_resv *obj,
 				struct ww_acquire_ctx *ctx)
@@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
  * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
  * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
  * object may be locked by itself by passing NULL as @ctx.
+ *
+ * When a die situation is indicated by returning -EDEADLK all locks held by
+ * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
+ * @obj.
+ *
+ * Unlocked by calling dma_resv_unlock().
  */
 static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
 					      struct ww_acquire_ctx *ctx)
@@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
  * Acquires the reservation object after a die case. This function
  * will sleep until the lock becomes available. See dma_resv_lock() as
  * well.
+ *
+ * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
  */
 static inline void dma_resv_lock_slow(struct dma_resv *obj,
 				      struct ww_acquire_ctx *ctx)
@@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
  * if they overlap with a writer.
  *
  * Also note that since no context is provided, no deadlock protection is
- * possible.
+ * possible, which is also not needed for a trylock.
  *
  * Returns true if the lock was acquired, false otherwise.
  */
@@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
  *
  * Returns the context used to lock a reservation object or NULL if no context
  * was used or the object is not locked at all.
+ *
+ * WARNING: This interface is pretty horrible, but TTM needs it because it
+ * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
+ * Everyone else just uses it to check whether they're holding a reservation or
+ * not.
  */
 static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
 {
-- 
2.32.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 44+ messages in thread
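
As a small illustration of the reserve-then-add rule spelled out in the new
@fence documentation above: the sketch below shows the usual command
submission shape. struct my_job, submit_job() and job->done_fence are
hypothetical placeholders, not code from this series.

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

struct my_job {
	struct dma_fence *done_fence;	/* signals when the hardware is done */
};

/* Hypothetical: irrevocably queue the job, cannot fail once called. */
static void submit_job(struct my_job *job)
{
}

static int add_read_access(struct dma_resv *obj, struct my_job *job)
{
	int ret;

	ret = dma_resv_lock_interruptible(obj, NULL);
	if (ret)
		return ret;

	/* Reserve the slot while failing is still allowed ... */
	ret = dma_resv_reserve_shared(obj, 1);
	if (ret)
		goto out_unlock;

	submit_job(job);	/* point of no return */

	/* ... so that publishing the fence afterwards cannot fail. */
	dma_resv_add_shared_fence(obj, job->done_fence);

out_unlock:
	dma_resv_unlock(obj);
	return ret;
}

Since the preallocated slot is only good as long as the object stays locked,
the reserve and the add here happen under one and the same dma_resv_lock()
critical section.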

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for dma-resv fence DAG fixes
  2021-07-06 10:12 ` [Intel-gfx] " Daniel Vetter
                   ` (7 preceding siblings ...)
  (?)
@ 2021-07-06 10:35 ` Patchwork
  -1 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2021-07-06 10:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: dma-resv fence DAG fixes
URL   : https://patchwork.freedesktop.org/series/92236/
State : failure

== Summary ==

Applying: drm/msm: Don't break exclusive fence ordering
Applying: drm/msm: always wait for the exclusive fence
Applying: drm/etnaviv: Don't break exclusive fence ordering
error: sha1 information is lacking or useless (drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 drm/etnaviv: Don't break exclusive fence ordering
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] dma-resv: Give the docs a do-over
  2021-07-06 10:12   ` Daniel Vetter
  (?)
@ 2021-07-06 12:34     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-07-06 12:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development, Sumit Semwal,
	linaro-mm-sig, Daniel Vetter, Christian König, linux-media

On Tue, 6 Jul 2021 at 11:12, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Specifically document the new/clarified rules around how the shared
> fences do not have any ordering requirements against the exclusive
> fence.
>
> But also document all the things a bit better, given how central
> struct dma_resv to dynamic buffer management the docs have been very
> inadequat.
>
> - Lots more links to other pieces of the puzzle. Unfortunately
>   ttm_buffer_object has no docs, so no links :-(
>
> - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
>   like that one, but fixing the ttm call chains is going to be
>   horrible. Plus we want to plug in real slowpath locking when we do
>   that anyway.
>
> - Main part of the patch is some actual docs for struct dma_resv.
>
> Overall I think we still have a lot of bad naming in this area (e.g.
> dma_resv.fence is singular, but contains the multiple shared fences),
> but I think that's more indicative of how the semantics and rules are
> just not great.
>
> Another thing that's real awkard is how chaining exclusive fences
> right now means direct dma_resv.exclusive_fence pointer access with an
> rcu_assign_pointer. Not so great either.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/dma-buf/dma-resv.c |  22 ++++++--
>  include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
>  2 files changed, 116 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..898f8d894bbd 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -48,6 +48,8 @@
>   * write operations) or N shared fences (read operations).  The RCU
>   * mechanism is used to protect read access to fences from locked
>   * write-side updates.
> + *
> + * See struct dma_resv for more details.
>   */
>
>  DEFINE_WD_CLASS(reservation_ww_class);
> @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
>   * @num_fences: number of fences we want to add
>   *
>   * Should be called before dma_resv_add_shared_fence().  Must
> - * be called with obj->lock held.
> + * be called with @obj locked through dma_resv_lock().
> + *
> + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> + * at any time before callind dma_resv_add_shared_fence(). This is validate when

s/callind/calling
s/validate/validated

> + * CONFIG_DEBUG_MUTEXES is enabled.
>   *
>   * RETURNS
>   * Zero for success, or -errno
> @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   * @obj: the reservation object
>   * @fence: the shared fence to add
>   *
> - * Add a fence to a shared slot, obj->lock must be held, and
> + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
>   * dma_resv_reserve_shared() has been called.
> + *
> + * See also &dma_resv.fence for a discussion of the semantics.
>   */
>  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
>   * @obj: the reservation object
>   * @fence: the shared fence to add

"the exclusive fence", or perhaps "the fence to add to the exclusive slot"?

>   *
> - * Add a fence to the exclusive slot.  The obj->lock must be held.
> + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> + * Note that this function replaces all fences attached to @obj, see also
> + * &dma_resv.fence_excl for a discussion of the semantics.
>   */
>  void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>   * fence
>   *
>   * Callers are not required to hold specific locks, but maybe hold
> - * dma_resv_lock() already
> + * dma_resv_lock() already.
> + *
>   * RETURNS
> - * true if all fences signaled, else false
> + *
> + * True if all fences signaled, else false.
>   */
>  bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>  {
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index e1ca2080a1ff..c77fd54d033f 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -62,16 +62,90 @@ struct dma_resv_list {
>
>  /**
>   * struct dma_resv - a reservation object manages fences for a buffer
> - * @lock: update side lock
> - * @seq: sequence count for managing RCU read-side synchronization
> - * @fence_excl: the exclusive fence, if there is one currently
> - * @fence: list of current shared fences
> + *
> + * There are multiple uses for this, with sometimes slightly different rules in
> + * how the fence slots are used.
> + *
> + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> + * dynamic buffer management or just to handle implicit synchronization between
> + * different users of the buffer in userspace. See &dma_buf.resv for a more
> + * in-depth discussion.
> + *
> + * The other major use is to manage access and locking within a driver in a
> + * buffer based memory manager. struct ttm_buffer_object is the canonical
> + * example here, since this is were reservation objects originated from. But use

s/were/where

> + * in drivers is spreading and some drivers also manage struct
> + * drm_gem_object with the same scheme.
>   */
>  struct dma_resv {
> +       /**
> +        * @lock:
> +        *
> +        * Update side lock. Don't use directly, instead use the wrapper
> +        * functions like dma_resv_lock() and dma_resv_unlock().
> +        *
> +        * Drivers which use the reservation object to manage memory dynamically
> +        * also use this lock to protect buffer object state like placement,
> +        * allocation policies or throughout command submission.
> +        */
>         struct ww_mutex lock;
> +
> +       /**
> +        * @seq:
> +        *
> +        * Sequence count for managing RCU read-side synchronization, allows
> +        * read-only access to @fence_excl and @fence while ensuring we take a
> +        * consistent snapshot.
> +        */
>         seqcount_ww_mutex_t seq;
>
> +       /**
> +        * @fence_excl:
> +        *
> +        * The exclusive fence, if there is one currently.
> +        *
> +        * There are two was to update this fence:

s/was/ways

> +        *
> +        * - First by calling dma_resv_add_excl_fence(), which replaces all
> +        *   fences attached to the reservation object. To guarantee that no
> +        *   fences are lost this new fence must signal only after all previous
> +        *   fences, both shared and exclusive, have signalled. In some cases it/

Random slash at the end

> +        *   is convenient to achieve that by attaching a struct dma_fence_array
> +        *   with all the new and old fences.
> +        *
> +        * - Alternatively the fence can be set directly, which leaves the
> +        *   shared fences unchanged. To guarantee that no fences are lost this
> +        *   new fence must signale only after the previous exclusive fence has

s/signale/signal

> +        *   singalled. Since the shared fences are staying intact, it is not

s/singalled/signalled

> +        *   necessary to maintain any ordering against those. If semantically
> +        *   only a new access is added without actually treating the previous
> +        *   one as a dependency the exclusive fences can be strung together
> +        *   using struct dma_fence_chain.
> +        *
> +        * Note that actual semantics of what an exclusive or shared fence mean
> +        * is defined by the user, for reservation objects shared across drivers
> +        * see &dma_buf.resv.
> +        */
>         struct dma_fence __rcu *fence_excl;
> +
> +       /**
> +        * @fence:
> +        *
> +        * List of current shared fences.
> +        *
> +        * There are no ordering constraints of shared fences against the
> +        * exclusive fence slot. If a waiter needs to wait for all access, it
> +        * has to wait for both set of fences to signal.
> +        *
> +        * A new fence is added by calling dma_resv_add_shared_fence(). Since
> +        * this often needs to be done past the point of no return in command
> +        * submission it cannot fail, and therefor sufficient slots need to be

s/therefor/therefore

> +        * reserved by calling dma_resv_reserve_shared().
> +        *
> +        * Note that actual semantics of what an exclusive or shared fence mean
> +        * is defined by the user, for reservation objects shared across drivers
> +        * see &dma_buf.resv.
> +        */
>         struct dma_resv_list __rcu *fence;
>  };
>
> @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
>   * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>   * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>   * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> + *
> + * Unlocked by calling dma_resv_lock().

dma_resv_unlock()

> + *
> + * See also dma_resv_lock_interruptible() for the interruptible variant.
>   */
>  static inline int dma_resv_lock(struct dma_resv *obj,
>                                 struct ww_acquire_ctx *ctx)
> @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
>   * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>   * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>   * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> + * @obj.
> + *
> + * Unlocked by calling dma_resv_lock().

dma_resv_unlock()

fwiw,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


>   */
>  static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>                                               struct ww_acquire_ctx *ctx)
> @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>   * Acquires the reservation object after a die case. This function
>   * will sleep until the lock becomes available. See dma_resv_lock() as
>   * well.
> + *
> + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
>   */
>  static inline void dma_resv_lock_slow(struct dma_resv *obj,
>                                       struct ww_acquire_ctx *ctx)
> @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
>   * if they overlap with a writer.
>   *
>   * Also note that since no context is provided, no deadlock protection is
> - * possible.
> + * possible, which is also not needed for a trylock.
>   *
>   * Returns true if the lock was acquired, false otherwise.
>   */
> @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
>   *
>   * Returns the context used to lock a reservation object or NULL if no context
>   * was used or the object is not locked at all.
> + *
> + * WARNING: This interface is pretty horrible, but TTM needs it because it
> + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> + * Everyone else just uses it to check whether they're holding a reservation or
> + * not.
>   */
>  static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
>  {
> --
> 2.32.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread
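
To make the first bullet of the @fence_excl documentation discussed above a
bit more tangible, here is a rough sketch of bundling fences into a struct
dma_fence_array before installing it as the new exclusive fence. It assumes
the caller has already collected references to all the old fences plus the
new one into a kmalloc'ed array (how that collection happens is left out), so
this is an illustration of the idea rather than code from the series.

#include <linux/dma-fence-array.h>
#include <linux/dma-resv.h>

static int set_combined_excl_fence(struct dma_resv *obj,
				   struct dma_fence **fences,
				   unsigned int count)
{
	struct dma_fence_array *array;

	/* On success the array takes ownership of @fences and the references
	 * in it; on failure ownership stays with the caller. */
	array = dma_fence_array_create(count, fences,
				       dma_fence_context_alloc(1), 1, false);
	if (!array)
		return -ENOMEM;

	dma_resv_lock(obj, NULL);
	/* Replaces all fences; the array only signals once they all have. */
	dma_resv_add_excl_fence(obj, &array->base);
	dma_resv_unlock(obj);

	/* dma_resv holds its own reference now, drop the creation one. */
	dma_fence_put(&array->base);
	return 0;
}

For the second bullet, stringing exclusive fences together with a struct
dma_fence_chain instead of treating the previous one as a dependency, the
commit message already notes that this currently means open-coded
rcu_assign_pointer() on the exclusive slot, so no sketch is attempted here.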

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: Always wait for the exclusive fence
  2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
@ 2021-07-06 12:47     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-07-06 12:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Thomas Hellström, Intel Graphics Development,
	DRI Development, Daniel Vetter

On Tue, 6 Jul 2021 at 11:12, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> We're lifting, or well, clarifying that the restriction that shared
> fences have to be strictly after the exclusive one doesn't apply
> anymore.
>
> So adjust the code to always also wait for the exclusive fence.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/i915_sw_fence.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> index 91711a46b1c7..271d321cea83 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -601,10 +601,10 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
>                 for (i = 0; i < count; i++)
>                         dma_fence_put(shared[i]);
>                 kfree(shared);
> -       } else {
> -               excl = dma_resv_get_excl_unlocked(resv);
>         }
>
> +       excl = dma_resv_get_excl_unlocked(resv);
> +

The dma_resv_get_fences() call looks like it already fishes out the
exclusive fence. Does this not leak the extra ref now?

>         if (ret >= 0 && excl) {
>                 pending = i915_sw_fence_await_dma_fence(fence,
>                                                         excl,
> --
> 2.32.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread
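
For reference on the refcounting point raised above: dma_resv_get_fences()
hands back one reference to the exclusive fence and to each shared fence, so
every pointer obtained that way needs exactly one dma_fence_put(). A rough
sketch of a "wait for all access" helper using the calling convention this
thread refers to (illustrative only, not i915 code):

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>
#include <linux/slab.h>

static int wait_for_all_fences(struct dma_resv *resv)
{
	struct dma_fence *excl, **shared;
	unsigned int i, count;
	int ret;

	/* Snapshots the fences and takes one reference to each of them. */
	ret = dma_resv_get_fences(resv, &excl, &count, &shared);
	if (ret)
		return ret;

	for (i = 0; i < count; i++) {
		if (!ret)
			ret = dma_fence_wait(shared[i], true);
		dma_fence_put(shared[i]);
	}
	kfree(shared);

	if (excl) {
		if (!ret)
			ret = dma_fence_wait(excl, true);
		/* Exactly one put per reference, no matter who fetched it. */
		dma_fence_put(excl);
	}

	return ret;
}

Fetching the exclusive fence a second time with dma_resv_get_excl_unlocked()
would take an additional reference that also needs its own dma_fence_put(),
which is the leak the review above points at.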

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: Always wait for the exclusive fence
  2021-07-06 12:47     ` Matthew Auld
@ 2021-07-06 12:58       ` Daniel Vetter
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-06 12:58 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Thomas Hellström, Intel Graphics Development,
	DRI Development, Daniel Vetter

On Tue, Jul 6, 2021 at 2:47 PM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Tue, 6 Jul 2021 at 11:12, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > We're lifting, or well, clarifying that the restriction that shared
> > fences have to be strictly after the exclusive one doesn't apply
> > anymore.
> >
> > So adjust the code to always also wait for the exclusive fence.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/i915_sw_fence.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> > index 91711a46b1c7..271d321cea83 100644
> > --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> > @@ -601,10 +601,10 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
> >                 for (i = 0; i < count; i++)
> >                         dma_fence_put(shared[i]);
> >                 kfree(shared);
> > -       } else {
> > -               excl = dma_resv_get_excl_unlocked(resv);
> >         }
> >
> > +       excl = dma_resv_get_excl_unlocked(resv);
> > +
>
> The dma_resv_get_fences() call looks like it already fishes out the
> exclusive fence. Does this not leak the extra ref now?

Oh right, I overlooked this: we already pick up the exclusive fence
unconditionally. The control flow here was a bit too clever for my parser.
I'll drop this patch in the next round.
-Daniel

>
> >         if (ret >= 0 && excl) {
> >                 pending = i915_sw_fence_await_dma_fence(fence,
> >                                                         excl,
> > --
> > 2.32.0
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 7/7] dma-resv: Give the docs a do-over
  2021-07-06 10:12   ` Daniel Vetter
  (?)
@ 2021-07-06 23:47     ` Jason Ekstrand
  -1 siblings, 0 replies; 44+ messages in thread
From: Jason Ekstrand @ 2021-07-06 23:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Tue, Jul 6, 2021 at 5:12 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Specifically document the new/clarified rules around how the shared
> fences do not have any ordering requirements against the exclusive
> fence.
>
> But also document all the things a bit better, given how central
> struct dma_resv is to dynamic buffer management, the docs have been very
> inadequate.
>
> - Lots more links to other pieces of the puzzle. Unfortunately
>   ttm_buffer_object has no docs, so no links :-(
>
> - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
>   like that one, but fixing the ttm call chains is going to be
>   horrible. Plus we want to plug in real slowpath locking when we do
>   that anyway.
>
> - Main part of the patch is some actual docs for struct dma_resv.
>
> Overall I think we still have a lot of bad naming in this area (e.g.
> dma_resv.fence is singular, but contains the multiple shared fences),
> but I think that's more indicative of how the semantics and rules are
> just not great.
>
> Another thing that's real awkward is how chaining exclusive fences
> right now means direct dma_resv.exclusive_fence pointer access with an
> rcu_assign_pointer. Not so great either.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/dma-buf/dma-resv.c |  22 ++++++--
>  include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
>  2 files changed, 116 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..898f8d894bbd 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -48,6 +48,8 @@
>   * write operations) or N shared fences (read operations).  The RCU
>   * mechanism is used to protect read access to fences from locked
>   * write-side updates.
> + *
> + * See struct dma_resv for more details.
>   */
>
>  DEFINE_WD_CLASS(reservation_ww_class);
> @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
>   * @num_fences: number of fences we want to add
>   *
>   * Should be called before dma_resv_add_shared_fence().  Must
> - * be called with obj->lock held.
> + * be called with @obj locked through dma_resv_lock().
> + *
> + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> + * at any time before callind dma_resv_add_shared_fence(). This is validate when

validated

> + * CONFIG_DEBUG_MUTEXES is enabled.
>   *
>   * RETURNS
>   * Zero for success, or -errno
> @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   * @obj: the reservation object
>   * @fence: the shared fence to add
>   *
> - * Add a fence to a shared slot, obj->lock must be held, and
> + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
>   * dma_resv_reserve_shared() has been called.
> + *
> + * See also &dma_resv.fence for a discussion of the semantics.
>   */
>  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
>   * @obj: the reservation object
>   * @fence: the shared fence to add
>   *
> - * Add a fence to the exclusive slot.  The obj->lock must be held.
> + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> + * Note that this function replaces all fences attached to @obj, see also
> + * &dma_resv.fence_excl for a discussion of the semantics.
>   */
>  void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>   * fence
>   *
>   * Callers are not required to hold specific locks, but maybe hold
> - * dma_resv_lock() already
> + * dma_resv_lock() already.
> + *
>   * RETURNS
> - * true if all fences signaled, else false
> + *
> + * True if all fences signaled, else false.
>   */
>  bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>  {
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index e1ca2080a1ff..c77fd54d033f 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -62,16 +62,90 @@ struct dma_resv_list {
>
>  /**
>   * struct dma_resv - a reservation object manages fences for a buffer
> - * @lock: update side lock
> - * @seq: sequence count for managing RCU read-side synchronization
> - * @fence_excl: the exclusive fence, if there is one currently
> - * @fence: list of current shared fences
> + *
> + * There are multiple uses for this, with sometimes slightly different rules in
> + * how the fence slots are used.
> + *
> + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> + * dynamic buffer management or just to handle implicit synchronization between
> + * different users of the buffer in userspace. See &dma_buf.resv for a more
> + * in-depth discussion.
> + *
> + * The other major use is to manage access and locking within a driver in a
> + * buffer based memory manager. struct ttm_buffer_object is the canonical
> + * example here, since this is were reservation objects originated from. But use

s/were/where/

> + * in drivers is spreading and some drivers also manage struct
> + * drm_gem_object with the same scheme.
>   */
>  struct dma_resv {
> +       /**
> +        * @lock:
> +        *
> +        * Update side lock. Don't use directly, instead use the wrapper
> +        * functions like dma_resv_lock() and dma_resv_unlock().
> +        *
> +        * Drivers which use the reservation object to manage memory dynamically
> +        * also use this lock to protect buffer object state like placement,
> +        * allocation policies or throughout command submission.
> +        */
>         struct ww_mutex lock;
> +
> +       /**
> +        * @seq:
> +        *
> +        * Sequence count for managing RCU read-side synchronization, allows
> +        * read-only access to @fence_excl and @fence while ensuring we take a
> +        * consistent snapshot.
> +        */
>         seqcount_ww_mutex_t seq;
>
> +       /**
> +        * @fence_excl:
> +        *
> +        * The exclusive fence, if there is one currently.
> +        *
> +        * There are two was to update this fence:
> +        *
> +        * - First by calling dma_resv_add_excl_fence(), which replaces all
> +        *   fences attached to the reservation object. To guarantee that no
> +        *   fences are lost this new fence must signal only after all previous

lost, this

> +        *   fences, both shared and exclusive, have signalled. In some cases it
> +        *   is convenient to achieve that by attaching a struct dma_fence_array
> +        *   with all the new and old fences.
> +        *
> +        * - Alternatively the fence can be set directly, which leaves the
> +        *   shared fences unchanged. To guarantee that no fences are lost this
> +        *   new fence must signale only after the previous exclusive fence has

s/signale/signal/

> +        *   singalled. Since the shared fences are staying intact, it is not
> +        *   necessary to maintain any ordering against those. If semantically
> +        *   only a new access is added without actually treating the previous
> +        *   one as a dependency the exclusive fences can be strung together
> +        *   using struct dma_fence_chain.
> +        *
> +        * Note that actual semantics of what an exclusive or shared fence mean
> +        * is defined by the user, for reservation objects shared across drivers
> +        * see &dma_buf.resv.
> +        */
>         struct dma_fence __rcu *fence_excl;
> +
> +       /**
> +        * @fence:
> +        *
> +        * List of current shared fences.
> +        *
> +        * There are no ordering constraints of shared fences against the
> +        * exclusive fence slot. If a waiter needs to wait for all access, it
> +        * has to wait for both set of fences to signal.

sets

> +        *
> +        * A new fence is added by calling dma_resv_add_shared_fence(). Since
> +        * this often needs to be done past the point of no return in command
> +        * submission it cannot fail, and therefor sufficient slots need to be

therefore

> +        * reserved by calling dma_resv_reserve_shared().
> +        *
> +        * Note that actual semantics of what an exclusive or shared fence mean
> +        * is defined by the user, for reservation objects shared across drivers
> +        * see &dma_buf.resv.
> +        */
>         struct dma_resv_list __rcu *fence;
>  };
>
> @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
>   * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>   * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>   * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> + *
> + * Unlocked by calling dma_resv_lock().

unlock?

> + *
> + * See also dma_resv_lock_interruptible() for the interruptible variant.
>   */
>  static inline int dma_resv_lock(struct dma_resv *obj,
>                                 struct ww_acquire_ctx *ctx)
> @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
>   * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>   * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>   * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> + * @obj.
> + *
> + * Unlocked by calling dma_resv_lock().

unlock?

>   */
>  static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>                                               struct ww_acquire_ctx *ctx)
> @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>   * Acquires the reservation object after a die case. This function
>   * will sleep until the lock becomes available. See dma_resv_lock() as
>   * well.
> + *
> + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
>   */
>  static inline void dma_resv_lock_slow(struct dma_resv *obj,
>                                       struct ww_acquire_ctx *ctx)
> @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
>   * if they overlap with a writer.
>   *
>   * Also note that since no context is provided, no deadlock protection is
> - * possible.
> + * possible, which is also not needed for a trylock.
>   *
>   * Returns true if the lock was acquired, false otherwise.
>   */
> @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
>   *
>   * Returns the context used to lock a reservation object or NULL if no context
>   * was used or the object is not locked at all.
> + *
> + * WARNING: This interface is pretty horrible, but TTM needs it because it
> + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> + * Everyone else just uses it to check whether they're holding a reservation or
> + * not.
>   */
>  static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
>  {
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 44+ messages in thread
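
The -EDEADLK rules that the quoted kerneldoc adds for dma_resv_lock() are
easiest to see in code. Below is a minimal sketch of the backoff dance for two
reservation objects; it follows the usual ww_mutex conventions and is not code
from the patch, and the helper name is made up.

/*
 * Illustration only: acquire two reservation objects, handling the die
 * case as described in the kerneldoc above. With a fresh ww_acquire_ctx
 * the first dma_resv_lock() cannot hit the die case, so its result is
 * ignored here.
 */
static void lock_two_resv(struct dma_resv *a, struct dma_resv *b,
			  struct ww_acquire_ctx *ctx)
{
	ww_acquire_init(ctx, &reservation_ww_class);

	dma_resv_lock(a, ctx);

	while (dma_resv_lock(b, ctx) == -EDEADLK) {
		/* die case: drop every lock held by @ctx, then sleep on
		 * the contended object with the slowpath variant */
		dma_resv_unlock(a);
		dma_resv_lock_slow(b, ctx);

		/* b is held now, only the other lock still needs taking */
		swap(a, b);
	}

	ww_acquire_done(ctx);
}

The interruptible variants follow the same shape with
dma_resv_lock_interruptible() and dma_resv_lock_slow_interruptible(), plus
-EINTR handling.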

* Re: [Linaro-mm-sig] [PATCH 7/7] dma-resv: Give the docs a do-over
  2021-07-06 10:12   ` Daniel Vetter
  (?)
@ 2021-07-07  8:06     ` Christian König
  -1 siblings, 0 replies; 44+ messages in thread
From: Christian König @ 2021-07-07  8:06 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Christian König, linaro-mm-sig,
	Daniel Vetter, linux-media

Am 06.07.21 um 12:12 schrieb Daniel Vetter:
> Specifically document the new/clarified rules around how the shared
> fences do not have any ordering requirements against the exclusive
> fence.
>
> But also document all the things a bit better, given how central
> struct dma_resv is to dynamic buffer management, the docs have been very
> inadequate.
>
> - Lots more links to other pieces of the puzzle. Unfortunately
>    ttm_buffer_object has no docs, so no links :-(
>
> - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
>    like that one, but fixing the ttm call chains is going to be
>    horrible. Plus we want to plug in real slowpath locking when we do
>    that anyway.
>
> - Main part of the patch is some actual docs for struct dma_resv.
>
> Overall I think we still have a lot of bad naming in this area (e.g.
> dma_resv.fence is singular, but contains the multiple shared fences),
> but I think that's more indicative of how the semantics and rules are
> just not great.
>
> Another thing that's real awkward is how chaining exclusive fences
> right now means direct dma_resv.exclusive_fence pointer access with an
> rcu_assign_pointer. Not so great either.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   drivers/dma-buf/dma-resv.c |  22 ++++++--
>   include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
>   2 files changed, 116 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..898f8d894bbd 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -48,6 +48,8 @@
>    * write operations) or N shared fences (read operations).  The RCU
>    * mechanism is used to protect read access to fences from locked
>    * write-side updates.
> + *
> + * See struct dma_resv for more details.
>    */
>   
>   DEFINE_WD_CLASS(reservation_ww_class);
> @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
>    * @num_fences: number of fences we want to add
>    *
>    * Should be called before dma_resv_add_shared_fence().  Must
> - * be called with obj->lock held.
> + * be called with @obj locked through dma_resv_lock().
> + *
> + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> + * at any time before callind dma_resv_add_shared_fence(). This is validate when
> + * CONFIG_DEBUG_MUTEXES is enabled.
>    *
>    * RETURNS
>    * Zero for success, or -errno
> @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>    * @obj: the reservation object
>    * @fence: the shared fence to add
>    *
> - * Add a fence to a shared slot, obj->lock must be held, and
> + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
>    * dma_resv_reserve_shared() has been called.
> + *
> + * See also &dma_resv.fence for a discussion of the semantics.
>    */
>   void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>   {
> @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
>    * @obj: the reservation object
>    * @fence: the shared fence to add
>    *
> - * Add a fence to the exclusive slot.  The obj->lock must be held.
> + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> + * Note that this function replaces all fences attached to @obj, see also
> + * &dma_resv.fence_excl for a discussion of the semantics.
>    */
>   void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
>   {
> @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>    * fence
>    *
>    * Callers are not required to hold specific locks, but maybe hold
> - * dma_resv_lock() already
> + * dma_resv_lock() already.
> + *
>    * RETURNS
> - * true if all fences signaled, else false
> + *
> + * True if all fences signaled, else false.
>    */
>   bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>   {
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index e1ca2080a1ff..c77fd54d033f 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -62,16 +62,90 @@ struct dma_resv_list {
>   
>   /**
>    * struct dma_resv - a reservation object manages fences for a buffer
> - * @lock: update side lock
> - * @seq: sequence count for managing RCU read-side synchronization
> - * @fence_excl: the exclusive fence, if there is one currently
> - * @fence: list of current shared fences
> + *
> + * There are multiple uses for this, with sometimes slightly different rules in
> + * how the fence slots are used.
> + *
> + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> + * dynamic buffer management or just to handle implicit synchronization between
> + * different users of the buffer in userspace. See &dma_buf.resv for a more
> + * in-depth discussion.
> + *
> + * The other major use is to manage access and locking within a driver in a
> + * buffer based memory manager. struct ttm_buffer_object is the canonical
> + * example here, since this is were reservation objects originated from. But use
> + * in drivers is spreading and some drivers also manage struct
> + * drm_gem_object with the same scheme.

I would still state that even more strongly, e.g. by mentioning that you run
into use-after-free and the resulting memory corruption if you don't obey the
rules.

Apart from that, and with the spelling issues pointed out by others fixed, the
patch is Reviewed-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

>    */
>   struct dma_resv {
> +	/**
> +	 * @lock:
> +	 *
> +	 * Update side lock. Don't use directly, instead use the wrapper
> +	 * functions like dma_resv_lock() and dma_resv_unlock().
> +	 *
> +	 * Drivers which use the reservation object to manage memory dynamically
> +	 * also use this lock to protect buffer object state like placement,
> +	 * allocation policies or throughout command submission.
> +	 */
>   	struct ww_mutex lock;
> +
> +	/**
> +	 * @seq:
> +	 *
> +	 * Sequence count for managing RCU read-side synchronization, allows
> +	 * read-only access to @fence_excl and @fence while ensuring we take a
> +	 * consistent snapshot.
> +	 */
>   	seqcount_ww_mutex_t seq;
>   
> +	/**
> +	 * @fence_excl:
> +	 *
> +	 * The exclusive fence, if there is one currently.
> +	 *
> +	 * There are two was to update this fence:
> +	 *
> +	 * - First by calling dma_resv_add_excl_fence(), which replaces all
> +	 *   fences attached to the reservation object. To guarantee that no
> +	 *   fences are lost this new fence must signal only after all previous
> +	 *   fences, both shared and exclusive, have signalled. In some cases it
> +	 *   is convenient to achieve that by attaching a struct dma_fence_array
> +	 *   with all the new and old fences.
> +	 *
> +	 * - Alternatively the fence can be set directly, which leaves the
> +	 *   shared fences unchanged. To guarantee that no fences are lost this
> +	 *   new fence must signale only after the previous exclusive fence has
> +	 *   singalled. Since the shared fences are staying intact, it is not
> +	 *   necessary to maintain any ordering against those. If semantically
> +	 *   only a new access is added without actually treating the previous
> +	 *   one as a dependency the exclusive fences can be strung together
> +	 *   using struct dma_fence_chain.
> +	 *
> +	 * Note that actual semantics of what an exclusive or shared fence mean
> +	 * is defined by the user, for reservation objects shared across drivers
> +	 * see &dma_buf.resv.
> +	 */
>   	struct dma_fence __rcu *fence_excl;
> +
> +	/**
> +	 * @fence:
> +	 *
> +	 * List of current shared fences.
> +	 *
> +	 * There are no ordering constraints of shared fences against the
> +	 * exclusive fence slot. If a waiter needs to wait for all access, it
> +	 * has to wait for both set of fences to signal.
> +	 *
> +	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
> +	 * this often needs to be done past the point of no return in command
> +	 * submission it cannot fail, and therefor sufficient slots need to be
> +	 * reserved by calling dma_resv_reserve_shared().
> +	 *
> +	 * Note that actual semantics of what an exclusive or shared fence mean
> +	 * is defined by the user, for reservation objects shared across drivers
> +	 * see &dma_buf.resv.
> +	 */
>   	struct dma_resv_list __rcu *fence;
>   };
>   
> @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
>    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>    * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> + *
> + * Unlocked by calling dma_resv_lock().
> + *
> + * See also dma_resv_lock_interruptible() for the interruptible variant.
>    */
>   static inline int dma_resv_lock(struct dma_resv *obj,
>   				struct ww_acquire_ctx *ctx)
> @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
>    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>    * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> + * @obj.
> + *
> + * Unlocked by calling dma_resv_lock().
>    */
>   static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>   					      struct ww_acquire_ctx *ctx)
> @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>    * Acquires the reservation object after a die case. This function
>    * will sleep until the lock becomes available. See dma_resv_lock() as
>    * well.
> + *
> + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
>    */
>   static inline void dma_resv_lock_slow(struct dma_resv *obj,
>   				      struct ww_acquire_ctx *ctx)
> @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
>    * if they overlap with a writer.
>    *
>    * Also note that since no context is provided, no deadlock protection is
> - * possible.
> + * possible, which is also not needed for a trylock.
>    *
>    * Returns true if the lock was acquired, false otherwise.
>    */
> @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
>    *
>    * Returns the context used to lock a reservation object or NULL if no context
>    * was used or the object is not locked at all.
> + *
> + * WARNING: This interface is pretty horrible, but TTM needs it because it
> + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> + * Everyone else just uses it to check whether they're holding a reservation or
> + * not.
>    */
>   static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
>   {


^ permalink raw reply	[flat|nested] 44+ messages in thread
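
Christian's point above about use-after-free is essentially about drivers
skipping the lock or the slot reservation. As an illustration of the rules the
new kerneldoc spells out for the shared slots, a minimal sketch with
hypothetical names:

/*
 * Sketch only: reserve a slot while failure can still be unwound, add
 * the fence past the point of no return, and hold the lock throughout.
 */
static int publish_job_fence(struct dma_resv *resv, struct dma_fence *fence)
{
	int ret;

	ret = dma_resv_lock(resv, NULL);
	if (ret)
		return ret;

	ret = dma_resv_reserve_shared(resv, 1);
	if (ret)
		goto out_unlock;	/* failing is still fine here */

	/* ... point of no return: the job is committed to the hardware ... */

	dma_resv_add_shared_fence(resv, fence);	/* must not fail, hence the reservation */

out_unlock:
	dma_resv_unlock(resv);
	return ret;
}

And for the exclusive slot, the first of the two update rules quoted above
(replace everything, but only with a fence that signals after all previously
attached fences) might look roughly like this; gathering the old fences into
fences[] is elided, the helper name is made up, and unwinding of fences[] on
the error path is left out.

/*
 * Sketch only, caller holds dma_resv_lock(resv): bundle the old fences
 * together with the new one so the replacement signals last, then
 * install it as the new exclusive fence.
 */
static int replace_excl_fence(struct dma_resv *resv,
			      struct dma_fence **fences, int num_fences)
{
	struct dma_fence_array *array;

	/* on success, takes ownership of fences[] and its references */
	array = dma_fence_array_create(num_fences, fences,
				       dma_fence_context_alloc(1), 1, false);
	if (!array)
		return -ENOMEM;

	dma_resv_add_excl_fence(resv, &array->base);
	dma_fence_put(&array->base);	/* the resv holds its own reference */
	return 0;
}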

* Re: [Linaro-mm-sig] [PATCH 7/7] dma-resv: Give the docs a do-over
@ 2021-07-07  8:06     ` Christian König
  0 siblings, 0 replies; 44+ messages in thread
From: Christian König @ 2021-07-07  8:06 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: linaro-mm-sig, Daniel Vetter, Intel Graphics Development,
	Christian König, linux-media

Am 06.07.21 um 12:12 schrieb Daniel Vetter:
> Specifically document the new/clarified rules around how the shared
> fences do not have any ordering requirements against the exclusive
> fence.
>
> But also document all the things a bit better, given how central
> struct dma_resv to dynamic buffer management the docs have been very
> inadequat.
>
> - Lots more links to other pieces of the puzzle. Unfortunately
>    ttm_buffer_object has no docs, so no links :-(
>
> - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
>    like that one, but fixing the ttm call chains is going to be
>    horrible. Plus we want to plug in real slowpath locking when we do
>    that anyway.
>
> - Main part of the patch is some actual docs for struct dma_resv.
>
> Overall I think we still have a lot of bad naming in this area (e.g.
> dma_resv.fence is singular, but contains the multiple shared fences),
> but I think that's more indicative of how the semantics and rules are
> just not great.
>
> Another thing that's real awkard is how chaining exclusive fences
> right now means direct dma_resv.exclusive_fence pointer access with an
> rcu_assign_pointer. Not so great either.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   drivers/dma-buf/dma-resv.c |  22 ++++++--
>   include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
>   2 files changed, 116 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..898f8d894bbd 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -48,6 +48,8 @@
>    * write operations) or N shared fences (read operations).  The RCU
>    * mechanism is used to protect read access to fences from locked
>    * write-side updates.
> + *
> + * See struct dma_resv for more details.
>    */
>   
>   DEFINE_WD_CLASS(reservation_ww_class);
> @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
>    * @num_fences: number of fences we want to add
>    *
>    * Should be called before dma_resv_add_shared_fence().  Must
> - * be called with obj->lock held.
> + * be called with @obj locked through dma_resv_lock().
> + *
> + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> + * at any time before callind dma_resv_add_shared_fence(). This is validate when
> + * CONFIG_DEBUG_MUTEXES is enabled.
>    *
>    * RETURNS
>    * Zero for success, or -errno
> @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>    * @obj: the reservation object
>    * @fence: the shared fence to add
>    *
> - * Add a fence to a shared slot, obj->lock must be held, and
> + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
>    * dma_resv_reserve_shared() has been called.
> + *
> + * See also &dma_resv.fence for a discussion of the semantics.
>    */
>   void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>   {
> @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
>    * @obj: the reservation object
>    * @fence: the shared fence to add
>    *
> - * Add a fence to the exclusive slot.  The obj->lock must be held.
> + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> + * Note that this function replaces all fences attached to @obj, see also
> + * &dma_resv.fence_excl for a discussion of the semantics.
>    */
>   void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
>   {
> @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>    * fence
>    *
>    * Callers are not required to hold specific locks, but maybe hold
> - * dma_resv_lock() already
> + * dma_resv_lock() already.
> + *
>    * RETURNS
> - * true if all fences signaled, else false
> + *
> + * True if all fences signaled, else false.
>    */
>   bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>   {
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index e1ca2080a1ff..c77fd54d033f 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -62,16 +62,90 @@ struct dma_resv_list {
>   
>   /**
>    * struct dma_resv - a reservation object manages fences for a buffer
> - * @lock: update side lock
> - * @seq: sequence count for managing RCU read-side synchronization
> - * @fence_excl: the exclusive fence, if there is one currently
> - * @fence: list of current shared fences
> + *
> + * There are multiple uses for this, with sometimes slightly different rules in
> + * how the fence slots are used.
> + *
> + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> + * dynamic buffer management or just to handle implicit synchronization between
> + * different users of the buffer in userspace. See &dma_buf.resv for a more
> + * in-depth discussion.
> + *
> + * The other major use is to manage access and locking within a driver in a
> + * buffer based memory manager. struct ttm_buffer_object is the canonical
> + * example here, since this is where reservation objects originated from. But use
> + * in drivers is spreading and some drivers also manage struct
> + * drm_gem_object with the same scheme.

I would still make that even harder, e.g. mentioning that you run into 
use after free and the resulting memory corruption if you don't obey the 
rules.

Apart from that with the spelling stuff pointed out by others fixed the 
patch is Reviewed-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

>    */
>   struct dma_resv {
> +	/**
> +	 * @lock:
> +	 *
> +	 * Update side lock. Don't use directly, instead use the wrapper
> +	 * functions like dma_resv_lock() and dma_resv_unlock().
> +	 *
> +	 * Drivers which use the reservation object to manage memory dynamically
> +	 * also use this lock to protect buffer object state like placement,
> +	 * allocation policies or throughout command submission.
> +	 */
>   	struct ww_mutex lock;
> +
> +	/**
> +	 * @seq:
> +	 *
> +	 * Sequence count for managing RCU read-side synchronization, allows
> +	 * read-only access to @fence_excl and @fence while ensuring we take a
> +	 * consistent snapshot.
> +	 */
>   	seqcount_ww_mutex_t seq;
>   
> +	/**
> +	 * @fence_excl:
> +	 *
> +	 * The exclusive fence, if there is one currently.
> +	 *
> +	 * There are two ways to update this fence:
> +	 *
> +	 * - First by calling dma_resv_add_excl_fence(), which replaces all
> +	 *   fences attached to the reservation object. To guarantee that no
> +	 *   fences are lost this new fence must signal only after all previous
> +	 *   fences, both shared and exclusive, have signalled. In some cases it
> +	 *   is convenient to achieve that by attaching a struct dma_fence_array
> +	 *   with all the new and old fences.
> +	 *
> +	 * - Alternatively the fence can be set directly, which leaves the
> +	 *   shared fences unchanged. To guarantee that no fences are lost this
> +	 *   new fence must signal only after the previous exclusive fence has
> +	 *   signalled. Since the shared fences are staying intact, it is not
> +	 *   necessary to maintain any ordering against those. If semantically
> +	 *   only a new access is added without actually treating the previous
> +	 *   one as a dependency the exclusive fences can be strung together
> +	 *   using struct dma_fence_chain.
> +	 *
> +	 * Note that actual semantics of what an exclusive or shared fence mean
> +	 * is defined by the user, for reservation objects shared across drivers
> +	 * see &dma_buf.resv.
> +	 */
>   	struct dma_fence __rcu *fence_excl;
> +
> +	/**
> +	 * @fence:
> +	 *
> +	 * List of current shared fences.
> +	 *
> +	 * There are no ordering constraints of shared fences against the
> +	 * exclusive fence slot. If a waiter needs to wait for all access, it
> +	 * has to wait for both sets of fences to signal.
> +	 *
> +	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
> +	 * this often needs to be done past the point of no return in command
> +	 * submission it cannot fail, and therefore sufficient slots need to be
> +	 * reserved by calling dma_resv_reserve_shared().
> +	 *
> +	 * Note that actual semantics of what an exclusive or shared fence mean
> +	 * is defined by the user, for reservation objects shared across drivers
> +	 * see &dma_buf.resv.
> +	 */
>   	struct dma_resv_list __rcu *fence;
>   };
>   
> @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
>    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>    * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> + *
> + * Unlocked by calling dma_resv_unlock().
> + *
> + * See also dma_resv_lock_interruptible() for the interruptible variant.
>    */
>   static inline int dma_resv_lock(struct dma_resv *obj,
>   				struct ww_acquire_ctx *ctx)
> @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
>    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
>    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
>    * object may be locked by itself by passing NULL as @ctx.
> + *
> + * When a die situation is indicated by returning -EDEADLK all locks held by
> + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> + * @obj.
> + *
> + * Unlocked by calling dma_resv_unlock().
>    */
>   static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>   					      struct ww_acquire_ctx *ctx)
> @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
>    * Acquires the reservation object after a die case. This function
>    * will sleep until the lock becomes available. See dma_resv_lock() as
>    * well.
> + *
> + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
>    */
>   static inline void dma_resv_lock_slow(struct dma_resv *obj,
>   				      struct ww_acquire_ctx *ctx)
> @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
>    * if they overlap with a writer.
>    *
>    * Also note that since no context is provided, no deadlock protection is
> - * possible.
> + * possible, which is also not needed for a trylock.
>    *
>    * Returns true if the lock was acquired, false otherwise.
>    */
> @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
>    *
>    * Returns the context used to lock a reservation object or NULL if no context
>    * was used or the object is not locked at all.
> + *
> + * WARNING: This interface is pretty horrible, but TTM needs it because it
> + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> + * Everyone else just uses it to check whether they're holding a reservation or
> + * not.
>    */
>   static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
>   {


^ permalink raw reply	[flat|nested] 44+ messages in thread
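
For readers not steeped in ww_mutex lore, the -EDEADLK backoff that the
dma_resv_lock() kerneldoc above describes looks roughly like this for two
reservation objects. A sketch only, using the non-interruptible variants
(where -EDEADLK is the only possible error) and an invented lock_pair()
helper:

#include <linux/dma-resv.h>

/* Lock two reservation objects, backing off and retrying on contention. */
static void lock_pair(struct dma_resv *a, struct dma_resv *b,
                      struct ww_acquire_ctx *ctx)
{
        struct dma_resv *contended = NULL;

        ww_acquire_init(ctx, &reservation_ww_class);
retry:
        if (contended) {
                /* Sleep on the lock we lost to in the previous round. */
                dma_resv_lock_slow(contended, ctx);
        }

        if (contended != a && dma_resv_lock(a, ctx) == -EDEADLK) {
                if (contended)
                        dma_resv_unlock(contended);
                contended = a;
                goto retry;
        }

        if (contended != b && dma_resv_lock(b, ctx) == -EDEADLK) {
                dma_resv_unlock(a);
                contended = b;
                goto retry;
        }

        ww_acquire_done(ctx);
}

Once done with the buffers, the caller drops both locks with
dma_resv_unlock() and calls ww_acquire_fini() on the context.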

* Re: [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering
  2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
@ 2021-07-07  8:54     ` Lucas Stach
  -1 siblings, 0 replies; 44+ messages in thread
From: Lucas Stach @ 2021-07-07  8:54 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, etnaviv, Russell King

Hi Daniel,

I feel like I'm missing a ton of context here, so some possibly dumb
questions/remarks below.

Am Dienstag, dem 06.07.2021 um 12:12 +0200 schrieb Daniel Vetter:
> There's only one exclusive slot, and we must not break the ordering.
> 
> A better fix would be to us a dma_fence_chain or _array like e.g.
> amdgpu now uses, but it probably makes sense to lift this into
> dma-resv.c code as a proper concept, so that drivers don't have to
> hack up their own solution each on their own. Hence go with the simple
> fix for now.
> 
> Another option is the fence import ioctl from Jason:
> 
> https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/

Sorry, but why is the fence import ioctl an alternative to the fix
proposed in this patch?

> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: etnaviv@lists.freedesktop.org
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 92478a50a580..5c4fed2b7c6a 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
>  	for (i = 0; i < submit->nr_bos; i++) {
>  		struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
>  		struct dma_resv *robj = bo->obj->base.resv;
> +		bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
>  
> -		if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
> +		if (!(write)) {

No parentheses around the write needed.

>  			ret = dma_resv_reserve_shared(robj, 1);
>  			if (ret)
>  				return ret;
>  		}
>  
> -		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
> +		/* exclusive fences must be ordered */

I feel like this comment isn't really helpful. It might tell you all
you need to know if you just paged in all the context around dma_resv
and the dependency graph, but it's not more than noise to me right now.

I guess the comment should answer the question of what the exclusive
fence we are going to add needs to be ordered against, and why it
isn't safe to skip implicit sync in that case.

Regards,
Lucas 
> +		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
>  			continue;
>  
>  		ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
> -						   bo->flags & ETNA_SUBMIT_BO_WRITE);
> +						   write);
>  		if (ret)
>  			return ret;
>  	}



^ permalink raw reply	[flat|nested] 44+ messages in thread
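
As background for the dma_resv_reserve_shared() call quoted above: the
reserve-early/add-late pattern the dma-resv kerneldoc insists on boils down
to the following sketch. Illustration only; attach_read_fence() is an
invented name and error handling is trimmed.

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

/*
 * Reserve a shared slot while we can still fail, submit the job, then add
 * the fence once we are past the point of no return.
 */
static int attach_read_fence(struct dma_resv *resv, struct dma_fence *fence)
{
        int ret;

        dma_resv_assert_held(resv);

        /* May allocate and therefore fail: do it before committing. */
        ret = dma_resv_reserve_shared(resv, 1);
        if (ret)
                return ret;

        /* ... hand the job to the hardware or scheduler here ... */

        /* Cannot fail thanks to the reservation above. */
        dma_resv_add_shared_fence(resv, fence);
        return 0;
}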

* Re: [Linaro-mm-sig] [PATCH 7/7] dma-resv: Give the docs a do-over
  2021-07-07  8:06     ` Christian König
  (?)
@ 2021-07-07  9:13       ` Daniel Vetter
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-07  9:13 UTC (permalink / raw)
  To: Christian König
  Cc: DRI Development, Intel Graphics Development,
	Christian König,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Jul 7, 2021 at 10:06 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 06.07.21 um 12:12 schrieb Daniel Vetter:
> > Specifically document the new/clarified rules around how the shared
> > fences do not have any ordering requirements against the exclusive
> > fence.
> >
> > But also document all the things a bit better, given how central
> > struct dma_resv is to dynamic buffer management, the docs have been very
> > inadequate.
> >
> > - Lots more links to other pieces of the puzzle. Unfortunately
> >    ttm_buffer_object has no docs, so no links :-(
> >
> > - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
> >    like that one, but fixing the ttm call chains is going to be
> >    horrible. Plus we want to plug in real slowpath locking when we do
> >    that anyway.
> >
> > - Main part of the patch is some actual docs for struct dma_resv.
> >
> > Overall I think we still have a lot of bad naming in this area (e.g.
> > dma_resv.fence is singular, but contains the multiple shared fences),
> > but I think that's more indicative of how the semantics and rules are
> > just not great.
> >
> > Another thing that's really awkward is how chaining exclusive fences
> > right now means direct dma_resv.exclusive_fence pointer access with an
> > rcu_assign_pointer. Not so great either.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >   drivers/dma-buf/dma-resv.c |  22 ++++++--
> >   include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
> >   2 files changed, 116 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> > index f26c71747d43..898f8d894bbd 100644
> > --- a/drivers/dma-buf/dma-resv.c
> > +++ b/drivers/dma-buf/dma-resv.c
> > @@ -48,6 +48,8 @@
> >    * write operations) or N shared fences (read operations).  The RCU
> >    * mechanism is used to protect read access to fences from locked
> >    * write-side updates.
> > + *
> > + * See struct dma_resv for more details.
> >    */
> >
> >   DEFINE_WD_CLASS(reservation_ww_class);
> > @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
> >    * @num_fences: number of fences we want to add
> >    *
> >    * Should be called before dma_resv_add_shared_fence().  Must
> > - * be called with obj->lock held.
> > + * be called with @obj locked through dma_resv_lock().
> > + *
> > + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> > + * at any time before calling dma_resv_add_shared_fence(). This is validated when
> > + * CONFIG_DEBUG_MUTEXES is enabled.
> >    *
> >    * RETURNS
> >    * Zero for success, or -errno
> > @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
> >    * @obj: the reservation object
> >    * @fence: the shared fence to add
> >    *
> > - * Add a fence to a shared slot, obj->lock must be held, and
> > + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
> >    * dma_resv_reserve_shared() has been called.
> > + *
> > + * See also &dma_resv.fence for a discussion of the semantics.
> >    */
> >   void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> >   {
> > @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
> >    * @obj: the reservation object
> >    * @fence: the shared fence to add
> >    *
> > - * Add a fence to the exclusive slot.  The obj->lock must be held.
> > + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> > + * Note that this function replaces all fences attached to @obj, see also
> > + * &dma_resv.fence_excl for a discussion of the semantics.
> >    */
> >   void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
> >   {
> > @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
> >    * fence
> >    *
> >    * Callers are not required to hold specific locks, but maybe hold
> > - * dma_resv_lock() already
> > + * dma_resv_lock() already.
> > + *
> >    * RETURNS
> > - * true if all fences signaled, else false
> > + *
> > + * True if all fences signaled, else false.
> >    */
> >   bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
> >   {
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index e1ca2080a1ff..c77fd54d033f 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -62,16 +62,90 @@ struct dma_resv_list {
> >
> >   /**
> >    * struct dma_resv - a reservation object manages fences for a buffer
> > - * @lock: update side lock
> > - * @seq: sequence count for managing RCU read-side synchronization
> > - * @fence_excl: the exclusive fence, if there is one currently
> > - * @fence: list of current shared fences
> > + *
> > + * There are multiple uses for this, with sometimes slightly different rules in
> > + * how the fence slots are used.
> > + *
> > + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> > + * dynamic buffer management or just to handle implicit synchronization between
> > + * different users of the buffer in userspace. See &dma_buf.resv for a more
> > + * in-depth discussion.
> > + *
> > + * The other major use is to manage access and locking within a driver in a
> > + * buffer based memory manager. struct ttm_buffer_object is the canonical
> > + * example here, since this is where reservation objects originated from. But use
> > + * in drivers is spreading and some drivers also manage struct
> > + * drm_gem_object with the same scheme.
>
> I would still make that even harder, e.g. mentioning that you run into
> use after free and the resulting memory corruption if you don't obey the
> rules.

Hm I think that's best documented in dma_buf.resv kerneldoc, pointing
at the rules here and explaining what can go wrong on the other driver's side
if we just overwrite fences and break the DAG. I'll add something
there.
-Daniel

>
> Apart from that with the spelling stuff pointed out by others fixed the
> patch is Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Regards,
> Christian.
>
> >    */
> >   struct dma_resv {
> > +     /**
> > +      * @lock:
> > +      *
> > +      * Update side lock. Don't use directly, instead use the wrapper
> > +      * functions like dma_resv_lock() and dma_resv_unlock().
> > +      *
> > +      * Drivers which use the reservation object to manage memory dynamically
> > +      * also use this lock to protect buffer object state like placement,
> > +      * allocation policies or throughout command submission.
> > +      */
> >       struct ww_mutex lock;
> > +
> > +     /**
> > +      * @seq:
> > +      *
> > +      * Sequence count for managing RCU read-side synchronization, allows
> > +      * read-only access to @fence_excl and @fence while ensuring we take a
> > +      * consistent snapshot.
> > +      */
> >       seqcount_ww_mutex_t seq;
> >
> > +     /**
> > +      * @fence_excl:
> > +      *
> > +      * The exclusive fence, if there is one currently.
> > +      *
> > +      * There are two ways to update this fence:
> > +      *
> > +      * - First by calling dma_resv_add_excl_fence(), which replaces all
> > +      *   fences attached to the reservation object. To guarantee that no
> > +      *   fences are lost this new fence must signal only after all previous
> > +      *   fences, both shared and exclusive, have signalled. In some cases it
> > +      *   is convenient to achieve that by attaching a struct dma_fence_array
> > +      *   with all the new and old fences.
> > +      *
> > +      * - Alternatively the fence can be set directly, which leaves the
> > +      *   shared fences unchanged. To guarantee that no fences are lost this
> > +      *   new fence must signal only after the previous exclusive fence has
> > +      *   signalled. Since the shared fences are staying intact, it is not
> > +      *   necessary to maintain any ordering against those. If semantically
> > +      *   only a new access is added without actually treating the previous
> > +      *   one as a dependency the exclusive fences can be strung together
> > +      *   using struct dma_fence_chain.
> > +      *
> > +      * Note that actual semantics of what an exclusive or shared fence mean
> > +      * is defined by the user, for reservation objects shared across drivers
> > +      * see &dma_buf.resv.
> > +      */
> >       struct dma_fence __rcu *fence_excl;
> > +
> > +     /**
> > +      * @fence:
> > +      *
> > +      * List of current shared fences.
> > +      *
> > +      * There are no ordering constraints of shared fences against the
> > +      * has to wait for both sets of fences to signal.
> > +      * has to wait for both set of fences to signal.
> > +      *
> > +      * A new fence is added by calling dma_resv_add_shared_fence(). Since
> > +      * this often needs to be done past the point of no return in command
> > +      * submission it cannot fail, and therefore sufficient slots need to be
> > +      * reserved by calling dma_resv_reserve_shared().
> > +      *
> > +      * Note that actual semantics of what an exclusive or shared fence mean
> > +      * is defined by the user, for reservation objects shared across drivers
> > +      * see &dma_buf.resv.
> > +      */
> >       struct dma_resv_list __rcu *fence;
> >   };
> >
> > @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
> >    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
> >    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
> >    * object may be locked by itself by passing NULL as @ctx.
> > + *
> > + * When a die situation is indicated by returning -EDEADLK all locks held by
> > + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> > + *
> > + * Unlocked by calling dma_resv_unlock().
> > + *
> > + * See also dma_resv_lock_interruptible() for the interruptible variant.
> >    */
> >   static inline int dma_resv_lock(struct dma_resv *obj,
> >                               struct ww_acquire_ctx *ctx)
> > @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
> >    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
> >    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
> >    * object may be locked by itself by passing NULL as @ctx.
> > + *
> > + * When a die situation is indicated by returning -EDEADLK all locks held by
> > + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> > + * @obj.
> > + *
> > + * Unlocked by calling dma_resv_unlock().
> >    */
> >   static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
> >                                             struct ww_acquire_ctx *ctx)
> > @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
> >    * Acquires the reservation object after a die case. This function
> >    * will sleep until the lock becomes available. See dma_resv_lock() as
> >    * well.
> > + *
> > + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
> >    */
> >   static inline void dma_resv_lock_slow(struct dma_resv *obj,
> >                                     struct ww_acquire_ctx *ctx)
> > @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
> >    * if they overlap with a writer.
> >    *
> >    * Also note that since no context is provided, no deadlock protection is
> > - * possible.
> > + * possible, which is also not needed for a trylock.
> >    *
> >    * Returns true if the lock was acquired, false otherwise.
> >    */
> > @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
> >    *
> >    * Returns the context used to lock a reservation object or NULL if no context
> >    * was used or the object is not locked at all.
> > + *
> > + * WARNING: This interface is pretty horrible, but TTM needs it because it
> > + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> > + * Everyone else just uses it to check whether they're holding a reservation or
> > + * not.
> >    */
> >   static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
> >   {
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 44+ messages in thread
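
To make the "what can go wrong" above concrete: if the exclusive slot is
simply overwritten with a fence that does not cover the previous ones,
another driver that waits only on that slot can treat the buffer as idle too
early and free or reuse the memory while the dropped fences are still
outstanding. A minimal sketch of the dma_fence_array approach from the
@fence_excl kerneldoc quoted above, assuming the caller has already
collected references to the old shared and exclusive fences plus the new
fence into a kmalloc'd array; set_excl_keep_deps() is an invented name.

#include <linux/dma-fence-array.h>
#include <linux/dma-resv.h>

/*
 * Replace all fences on @resv without losing any: wrap the old fences plus
 * the new one in a dma_fence_array, which only signals once every fence in
 * it has signalled. Ownership of @fences (and its references) moves to the
 * array on success.
 */
static int set_excl_keep_deps(struct dma_resv *resv,
                              struct dma_fence **fences, unsigned int count)
{
        struct dma_fence_array *array;

        dma_resv_assert_held(resv);

        array = dma_fence_array_create(count, fences,
                                       dma_fence_context_alloc(1), 1, false);
        if (!array)
                return -ENOMEM;

        dma_resv_add_excl_fence(resv, &array->base);
        /* The reservation object now holds its own reference. */
        dma_fence_put(&array->base);
        return 0;
}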

* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 7/7] dma-resv: Give the docs a do-over
@ 2021-07-07  9:13       ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-07  9:13 UTC (permalink / raw)
  To: Christian König
  Cc: Intel Graphics Development, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Jul 7, 2021 at 10:06 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 06.07.21 um 12:12 schrieb Daniel Vetter:
> > Specifically document the new/clarified rules around how the shared
> > fences do not have any ordering requirements against the exclusive
> > fence.
> >
> > But also document all the things a bit better, given how central
> > struct dma_resv is to dynamic buffer management, the docs have been very
> > inadequate.
> >
> > - Lots more links to other pieces of the puzzle. Unfortunately
> >    ttm_buffer_object has no docs, so no links :-(
> >
> > - Explain/complain a bit about dma_resv_locking_ctx(). I still don't
> >    like that one, but fixing the ttm call chains is going to be
> >    horrible. Plus we want to plug in real slowpath locking when we do
> >    that anyway.
> >
> > - Main part of the patch is some actual docs for struct dma_resv.
> >
> > Overall I think we still have a lot of bad naming in this area (e.g.
> > dma_resv.fence is singular, but contains the multiple shared fences),
> > but I think that's more indicative of how the semantics and rules are
> > just not great.
> >
> > Another thing that's really awkward is how chaining exclusive fences
> > right now means direct dma_resv.exclusive_fence pointer access with an
> > rcu_assign_pointer. Not so great either.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >   drivers/dma-buf/dma-resv.c |  22 ++++++--
> >   include/linux/dma-resv.h   | 104 +++++++++++++++++++++++++++++++++++--
> >   2 files changed, 116 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> > index f26c71747d43..898f8d894bbd 100644
> > --- a/drivers/dma-buf/dma-resv.c
> > +++ b/drivers/dma-buf/dma-resv.c
> > @@ -48,6 +48,8 @@
> >    * write operations) or N shared fences (read operations).  The RCU
> >    * mechanism is used to protect read access to fences from locked
> >    * write-side updates.
> > + *
> > + * See struct dma_resv for more details.
> >    */
> >
> >   DEFINE_WD_CLASS(reservation_ww_class);
> > @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
> >    * @num_fences: number of fences we want to add
> >    *
> >    * Should be called before dma_resv_add_shared_fence().  Must
> > - * be called with obj->lock held.
> > + * be called with @obj locked through dma_resv_lock().
> > + *
> > + * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> > + * at any time before calling dma_resv_add_shared_fence(). This is validated when
> > + * CONFIG_DEBUG_MUTEXES is enabled.
> >    *
> >    * RETURNS
> >    * Zero for success, or -errno
> > @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
> >    * @obj: the reservation object
> >    * @fence: the shared fence to add
> >    *
> > - * Add a fence to a shared slot, obj->lock must be held, and
> > + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
> >    * dma_resv_reserve_shared() has been called.
> > + *
> > + * See also &dma_resv.fence for a discussion of the semantics.
> >    */
> >   void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> >   {
> > @@ -280,7 +288,9 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
> >    * @obj: the reservation object
> >    * @fence: the shared fence to add
> >    *
> > - * Add a fence to the exclusive slot.  The obj->lock must be held.
> > + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> > + * Note that this function replaces all fences attached to @obj, see also
> > + * &dma_resv.fence_excl for a discussion of the semantics.
> >    */
> >   void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
> >   {
> > @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
> >    * fence
> >    *
> >    * Callers are not required to hold specific locks, but maybe hold
> > - * dma_resv_lock() already
> > + * dma_resv_lock() already.
> > + *
> >    * RETURNS
> > - * true if all fences signaled, else false
> > + *
> > + * True if all fences signaled, else false.
> >    */
> >   bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
> >   {
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index e1ca2080a1ff..c77fd54d033f 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -62,16 +62,90 @@ struct dma_resv_list {
> >
> >   /**
> >    * struct dma_resv - a reservation object manages fences for a buffer
> > - * @lock: update side lock
> > - * @seq: sequence count for managing RCU read-side synchronization
> > - * @fence_excl: the exclusive fence, if there is one currently
> > - * @fence: list of current shared fences
> > + *
> > + * There are multiple uses for this, with sometimes slightly different rules in
> > + * how the fence slots are used.
> > + *
> > + * One use is to synchronize cross-driver access to a struct dma_buf, either for
> > + * dynamic buffer management or just to handle implicit synchronization between
> > + * different users of the buffer in userspace. See &dma_buf.resv for a more
> > + * in-depth discussion.
> > + *
> > + * The other major use is to manage access and locking within a driver in a
> > + * buffer based memory manager. struct ttm_buffer_object is the canonical
> > + * example here, since this is where reservation objects originated from. But use
> > + * in drivers is spreading and some drivers also manage struct
> > + * drm_gem_object with the same scheme.
>
> I would still make that even harder, e.g. mentioning that you run into
> use after free and the resulting memory corruption if you don't obey the
> rules.

Hm I think that's best documented in dma_buf.resv kerneldoc, pointing
at the rules here and explaining what can go wrong on the other driver
if we just overwrite fences and break the DAG. I'll add something
there.
-Daniel
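
[Editor's note: to make the "don't drop fences" rule concrete, here is a
minimal sketch of the dma_fence_array route the new @fence_excl kerneldoc
suggests. It is an illustration only, not part of the patch:
collect_fences() is a made-up helper standing in for however a driver
gathers the current exclusive and shared fences plus the new one while
holding the reservation lock, and error unwinding is trimmed.]

static int sketch_replace_excl(struct dma_resv *resv,
			       struct dma_fence *new_fence)
{
	struct dma_fence **fences;
	struct dma_fence_array *array;
	unsigned int count;

	dma_resv_assert_held(resv);

	/* hypothetical helper: old exclusive + shared fences plus new_fence,
	 * each holding a reference, in a kmalloc'ed array */
	fences = collect_fences(resv, new_fence, &count);
	if (!fences)
		return -ENOMEM;

	/* the array takes ownership of the fence references and the array */
	array = dma_fence_array_create(count, fences,
				       dma_fence_context_alloc(1), 1, false);
	if (!array)
		return -ENOMEM; /* real code must also put the fences here */

	/* nothing is lost: the new exclusive fence only signals once all
	 * previously attached fences have signalled */
	dma_resv_add_excl_fence(resv, &array->base);
	dma_fence_put(&array->base); /* the object now holds its own ref */
	return 0;
}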

>
> Apart from that with the spelling stuff pointed out by others fixed the
> patch is Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Regards,
> Christian.
>
> >    */
> >   struct dma_resv {
> > +     /**
> > +      * @lock:
> > +      *
> > +      * Update side lock. Don't use directly, instead use the wrapper
> > +      * functions like dma_resv_lock() and dma_resv_unlock().
> > +      *
> > +      * Drivers which use the reservation object to manage memory dynamically
> > +      * also use this lock to protect buffer object state like placement,
> > +      * allocation policies or throughout command submission.
> > +      */
> >       struct ww_mutex lock;
> > +
> > +     /**
> > +      * @seq:
> > +      *
> > +      * Sequence count for managing RCU read-side synchronization, allows
> > +      * read-only access to @fence_excl and @fence while ensuring we take a
> > +      * consistent snapshot.
> > +      */
> >       seqcount_ww_mutex_t seq;
> >
> > +     /**
> > +      * @fence_excl:
> > +      *
> > +      * The exclusive fence, if there is one currently.
> > +      *
> > +      * There are two ways to update this fence:
> > +      *
> > +      * - First by calling dma_resv_add_excl_fence(), which replaces all
> > +      *   fences attached to the reservation object. To guarantee that no
> > +      *   fences are lost this new fence must signal only after all previous
> > +      *   fences, both shared and exclusive, have signalled. In some cases it
> > +      *   is convenient to achieve that by attaching a struct dma_fence_array
> > +      *   with all the new and old fences.
> > +      *
> > +      * - Alternatively the fence can be set directly, which leaves the
> > +      *   shared fences unchanged. To guarantee that no fences are lost this
> > +      *   new fence must signal only after the previous exclusive fence has
> > +      *   signalled. Since the shared fences are staying intact, it is not
> > +      *   necessary to maintain any ordering against those. If semantically
> > +      *   only a new access is added without actually treating the previous
> > +      *   one as a dependency the exclusive fences can be strung together
> > +      *   using struct dma_fence_chain.
> > +      *
> > +      * Note that the actual semantics of what an exclusive or shared fence means
> > +      * are defined by the user; for reservation objects shared across drivers
> > +      * see &dma_buf.resv.
> > +      */
> >       struct dma_fence __rcu *fence_excl;
> > +
> > +     /**
> > +      * @fence:
> > +      *
> > +      * List of current shared fences.
> > +      *
> > +      * There are no ordering constraints of shared fences against the
> > +      * exclusive fence slot. If a waiter needs to wait for all access, it
> > +      * has to wait for both sets of fences to signal.
> > +      *
> > +      * A new fence is added by calling dma_resv_add_shared_fence(). Since
> > +      * this often needs to be done past the point of no return in command
> > +      * submission it cannot fail, and therefore sufficient slots need to be
> > +      * reserved by calling dma_resv_reserve_shared().
> > +      *
> > +      * Note that the actual semantics of what an exclusive or shared fence means
> > +      * are defined by the user; for reservation objects shared across drivers
> > +      * see &dma_buf.resv.
> > +      */
> >       struct dma_resv_list __rcu *fence;
> >   };
> >
> > @@ -98,6 +172,13 @@ static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
> >    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
> >    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
> >    * object may be locked by itself by passing NULL as @ctx.
> > + *
> > + * When a die situation is indicated by returning -EDEADLK all locks held by
> > + * @ctx must be unlocked and then dma_resv_lock_slow() called on @obj.
> > + *
> > + * Unlocked by calling dma_resv_unlock().
> > + *
> > + * See also dma_resv_lock_interruptible() for the interruptible variant.
> >    */
> >   static inline int dma_resv_lock(struct dma_resv *obj,
> >                               struct ww_acquire_ctx *ctx)
> > @@ -119,6 +200,12 @@ static inline int dma_resv_lock(struct dma_resv *obj,
> >    * undefined order, a #ww_acquire_ctx is passed to unwind if a cycle
> >    * is detected. See ww_mutex_lock() and ww_acquire_init(). A reservation
> >    * object may be locked by itself by passing NULL as @ctx.
> > + *
> > + * When a die situation is indicated by returning -EDEADLK all locks held by
> > + * @ctx must be unlocked and then dma_resv_lock_slow_interruptible() called on
> > + * @obj.
> > + *
> > + * Unlocked by calling dma_resv_unlock().
> >    */
> >   static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
> >                                             struct ww_acquire_ctx *ctx)
> > @@ -134,6 +221,8 @@ static inline int dma_resv_lock_interruptible(struct dma_resv *obj,
> >    * Acquires the reservation object after a die case. This function
> >    * will sleep until the lock becomes available. See dma_resv_lock() as
> >    * well.
> > + *
> > + * See also dma_resv_lock_slow_interruptible() for the interruptible variant.
> >    */
> >   static inline void dma_resv_lock_slow(struct dma_resv *obj,
> >                                     struct ww_acquire_ctx *ctx)
> > @@ -167,7 +256,7 @@ static inline int dma_resv_lock_slow_interruptible(struct dma_resv *obj,
> >    * if they overlap with a writer.
> >    *
> >    * Also note that since no context is provided, no deadlock protection is
> > - * possible.
> > + * possible, which is also not needed for a trylock.
> >    *
> >    * Returns true if the lock was acquired, false otherwise.
> >    */
> > @@ -193,6 +282,11 @@ static inline bool dma_resv_is_locked(struct dma_resv *obj)
> >    *
> >    * Returns the context used to lock a reservation object or NULL if no context
> >    * was used or the object is not locked at all.
> > + *
> > + * WARNING: This interface is pretty horrible, but TTM needs it because it
> > + * doesn't pass the struct ww_acquire_ctx around in some very long callchains.
> > + * Everyone else just uses it to check whether they're holding a reservation or
> > + * not.
> >    */
> >   static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
> >   {
>
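
[Editor's note: for readers without the full dma-resv context, a rough
sketch of the lock / -EDEADLK backoff / slot reservation / fence publishing
flow that the kerneldoc hunks above describe, for two buffers updated
together. Illustrative only and not from the patch; a real implementation
keeps retrying the backoff in a loop instead of giving up after one pass.]

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

static int sketch_publish_read_fence(struct dma_resv *a, struct dma_resv *b,
				     struct dma_fence *fence)
{
	struct ww_acquire_ctx ctx;
	int ret;

	ww_acquire_init(&ctx, &reservation_ww_class);

	ret = dma_resv_lock(a, &ctx);
	if (ret)
		goto out_fini;

	ret = dma_resv_lock(b, &ctx);
	if (ret == -EDEADLK) {
		/*
		 * The backoff dance: drop every lock held in this context,
		 * take the contended one with the slowpath variant, then
		 * retry the rest.  Real code loops here; a second -EDEADLK
		 * is simply treated as failure in this sketch.
		 */
		dma_resv_unlock(a);
		dma_resv_lock_slow(b, &ctx);
		ret = dma_resv_lock(a, &ctx);
		if (ret)
			dma_resv_unlock(b);
	} else if (ret) {
		dma_resv_unlock(a);
	}
	if (ret)
		goto out_fini;

	ww_acquire_done(&ctx);

	/* a shared slot must be reserved and then used without dropping the
	 * lock in between */
	ret = dma_resv_reserve_shared(a, 1);
	if (!ret)
		ret = dma_resv_reserve_shared(b, 1);
	if (!ret) {
		dma_resv_add_shared_fence(a, fence);
		dma_resv_add_shared_fence(b, fence);
	}

	dma_resv_unlock(b);
	dma_resv_unlock(a);
out_fini:
	ww_acquire_fini(&ctx);
	return ret;
}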


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering
  2021-07-07  8:54     ` [Intel-gfx] " Lucas Stach
@ 2021-07-07 11:37       ` Daniel Vetter
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-07 11:37 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Intel Graphics Development, The etnaviv authors, DRI Development,
	Russell King, Daniel Vetter

On Wed, Jul 7, 2021 at 10:54 AM Lucas Stach <l.stach@pengutronix.de> wrote:
>
> Hi Daniel,
>
> I'm feeling like I miss a ton of context here, so some maybe dumb
> questions/remarks below.
>
> On Tuesday, 06.07.2021 at 12:12 +0200, Daniel Vetter wrote:
> > There's only one exclusive slot, and we must not break the ordering.
> >
> > A better fix would be to use a dma_fence_chain or _array like e.g.
> > amdgpu now uses, but it probably makes sense to lift this into
> > dma-resv.c code as a proper concept, so that drivers don't have to
> > hack up their own solution each on their own. Hence go with the simple
> > fix for now.
> >
> > Another option is the fence import ioctl from Jason:
> >
> > https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/
>
> Sorry, but why is the fence import ioctl an alternative to the fix
> proposed in this patch?

It's not an alternative to fixing the issue here, it's an alternative
to trying to fix this here without adding more dependencies. Depends a
bit what exactly you want to do, I just linked this for the bigger
picture.


>
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > Cc: etnaviv@lists.freedesktop.org
> > ---
> >  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > index 92478a50a580..5c4fed2b7c6a 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > @@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
> >       for (i = 0; i < submit->nr_bos; i++) {
> >               struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
> >               struct dma_resv *robj = bo->obj->base.resv;
> > +             bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
> >
> > -             if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
> > +             if (!(write)) {
>
> No parenthesis around the write needed.
>
> >                       ret = dma_resv_reserve_shared(robj, 1);
> >                       if (ret)
> >                               return ret;
> >               }
> >
> > -             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
> > +             /* exclusive fences must be ordered */
>
> I feel like this comment isn't really helpful. It might tell you all
> you need to know if you just paged in all the context around dma_resv
> and the dependency graph, but it's not more than noise to me right now.
>
> I guess the comment should answer the question against what the
> exclusive fence we are going to add needs to be ordered and why it
> isn't safe to skip implicit sync in that case.

The full-length comment is the doc patch, which is last in the series.
Essentially the rule is that you're not allowed to drop fences on the
floor (which you do if you just set a new write fence and not take any
of the existing fences as dependencies), at least for shared buffers.
But since it's easier to be consistent I think it's best if we do this
just across the board.

Like the commit message explains, there's a few ways to fix this:
- just treat it as implicit synced
- chain the fences together - that way you don't sync, but also
there's no fence that's being lost. This is what amdgpu does, and also
what the new import ioctl does.

2nd option is a lot more involved, and since all the dma-resv stuff is
a bit under discussion, I went with the most minimal thing possible.
-Daniel
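
[Editor's note: a very rough sketch of what the second, more involved
option could look like, i.e. chaining the old and the new exclusive fence
so that nothing is dropped while the shared fences stay untouched. This is
not what the patch implements; the direct fence_excl poke with
rcu_assign_pointer() is exactly the wart the cover letter complains about,
and a real version would also have to care about the seqcount, concurrent
readers and error handling.]

#include <linux/dma-fence-chain.h>
#include <linux/dma-resv.h>
#include <linux/slab.h>

static int sketch_chain_excl(struct dma_resv *resv, struct dma_fence *fence,
			     u64 seqno)
{
	struct dma_fence_chain *chain;
	struct dma_fence *old;

	dma_resv_assert_held(resv);

	chain = kmalloc(sizeof(*chain), GFP_KERNEL);
	if (!chain)
		return -ENOMEM;

	/* the object's reference on the old exclusive fence moves into the
	 * chain node, the chain node's initial reference moves to the object */
	old = rcu_dereference_protected(resv->fence_excl, dma_resv_held(resv));
	dma_fence_chain_init(chain, old, dma_fence_get(fence), seqno);
	rcu_assign_pointer(resv->fence_excl, &chain->base);
	return 0;
}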

>
> Regards,
> Lucas
> > +             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
> >                       continue;
> >
> >               ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
> > -                                                bo->flags & ETNA_SUBMIT_BO_WRITE);
> > +                                                write);
> >               if (ret)
> >                       return ret;
> >       }
>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering
  2021-07-07 11:37       ` [Intel-gfx] " Daniel Vetter
@ 2021-07-07 12:31         ` Lucas Stach
  -1 siblings, 0 replies; 44+ messages in thread
From: Lucas Stach @ 2021-07-07 12:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, The etnaviv authors, DRI Development,
	Russell King, Daniel Vetter

On Wednesday, 07.07.2021 at 13:37 +0200, Daniel Vetter wrote:
> On Wed, Jul 7, 2021 at 10:54 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > 
> > Hi Daniel,
> > 
> > I'm feeling like I miss a ton of context here, so some maybe dumb
> > questions/remarks below.
> > 
> > On Tuesday, 06.07.2021 at 12:12 +0200, Daniel Vetter wrote:
> > > There's only one exclusive slot, and we must not break the ordering.
> > > 
> > > A better fix would be to use a dma_fence_chain or _array like e.g.
> > > amdgpu now uses, but it probably makes sense to lift this into
> > > dma-resv.c code as a proper concept, so that drivers don't have to
> > > hack up their own solution each on their own. Hence go with the simple
> > > fix for now.
> > > 
> > > Another option is the fence import ioctl from Jason:
> > > 
> > > https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/
> > 
> > Sorry, but why is the fence import ioctl an alternative to the fix
> > proposed in this patch?
> 
> It's not an alternative to fixing the issue here, it's an alternative
> to trying to fix this here without adding more dependencies. Depends a
> bit what exactly you want to do, I just linked this for the bigger
> picture.
> 
I appreciate the bigger picture, it just makes it harder to follow what
is going on in this exact commit when trying to match up the code
change with the commit message. I would have expected this link to only
be part of the cover letter explaining the series, instead of being
part of the commit message.

> 
> > 
> > > 
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > > Cc: etnaviv@lists.freedesktop.org
> > > ---
> > >  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > index 92478a50a580..5c4fed2b7c6a 100644
> > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > @@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
> > >       for (i = 0; i < submit->nr_bos; i++) {
> > >               struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
> > >               struct dma_resv *robj = bo->obj->base.resv;
> > > +             bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
> > > 
> > > -             if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
> > > +             if (!(write)) {
> > 
> > No parenthesis around the write needed.
> > 
> > >                       ret = dma_resv_reserve_shared(robj, 1);
> > >                       if (ret)
> > >                               return ret;
> > >               }
> > > 
> > > -             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
> > > +             /* exclusive fences must be ordered */
> > 
> > I feel like this comment isn't really helpful. It might tell you all
> > you need to know if you just paged in all the context around dma_resv
> > and the dependency graph, but it's not more than noise to me right now.
> > 
> > I guess the comment should answer the question against what the
> > exclusive fence we are going to add needs to be ordered and why it
> > isn't safe to skip implicit sync in that case.
> 
> The full-length comment is the doc patch, which is last in the series.
> > Essentially the rule is that you're not allowed to drop fences on the
> floor (which you do if you just set a new write fence and not take any
> of the existing fences as dependencies), at least for shared buffers.
> But since it's easier to be consistent I think it's best if we do this
> just across the board.
> 
> Like the commit message explains, there's a few ways to fix this:
> - just treat it as implicit synced
> - chain the fences together - that way you don't sync, but also
> there's no fence that's being lost. This is what amdgpu does, and also
> what the new import ioctl does.
> 
> 2nd option is a lot more involved, and since all the dma-resv stuff is
> a bit under discussion, I went with the most minimal thing possible.

Thanks for the confirmation, that was as much as I figured from the doc
patch as well. So with that in mind I would expect this comment to read
something like this:
"Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them."

Regards,
Lucas
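
[Editor's note: dropped into the hunk quoted above, the suggested wording
would land roughly as follows; purely illustrative, the actual respin may
differ.]

		/*
		 * Adding a new exclusive fence drops all previous fences from
		 * the dma_resv. To avoid violating the signalling order we err
		 * on the side of over-synchronizing by waiting for the
		 * existing fences, even if userspace asked us to ignore them.
		 */
		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
			continue;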

> -Daniel
> 
> > 
> > Regards,
> > Lucas
> > > +             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
> > >                       continue;
> > > 
> > >               ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
> > > -                                                bo->flags & ETNA_SUBMIT_BO_WRITE);
> > > +                                                write);
> > >               if (ret)
> > >                       return ret;
> > >       }
> > 
> > 
> 
> 



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering
  2021-07-07 12:31         ` [Intel-gfx] " Lucas Stach
@ 2021-07-07 12:59           ` Daniel Vetter
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2021-07-07 12:59 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Intel Graphics Development, The etnaviv authors, DRI Development,
	Russell King, Daniel Vetter

On Wed, Jul 7, 2021 at 2:31 PM Lucas Stach <l.stach@pengutronix.de> wrote:
>
> On Wednesday, 07.07.2021 at 13:37 +0200, Daniel Vetter wrote:
> > On Wed, Jul 7, 2021 at 10:54 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > >
> > > Hi Daniel,
> > >
> > > I'm feeling like I miss a ton of context here, so some maybe dumb
> > > questions/remarks below.
> > >
> > > On Tuesday, 06.07.2021 at 12:12 +0200, Daniel Vetter wrote:
> > > > There's only one exclusive slot, and we must not break the ordering.
> > > >
> > > > A better fix would be to use a dma_fence_chain or _array like e.g.
> > > > amdgpu now uses, but it probably makes sense to lift this into
> > > > dma-resv.c code as a proper concept, so that drivers don't have to
> > > > hack up their own solution each on their own. Hence go with the simple
> > > > fix for now.
> > > >
> > > > Another option is the fence import ioctl from Jason:
> > > >
> > > > https://lore.kernel.org/dri-devel/20210610210925.642582-7-jason@jlekstrand.net/
> > >
> > > Sorry, but why is the fence import ioctl an alternative to the fix
> > > proposed in this patch?
> >
> > It's not an alternative to fixing the issue here, it's an alternative
> > to trying to fix this here without adding more dependencies. Depends a
> > bit what exactly you want to do, I just linked this for the bigger
> > picture.
> >
> I appreciate the bigger picture, it just makes it harder to follow what
> is going on in this exact commit when trying to match up the code
> change with the commit message. I would have expected this link to only
> be part of the cover letter explaining the series, instead of being
> part of the commit message.

I wanted to list all the better fix options in the commit message so
that you have the full context.

> > > >
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > > > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > > > Cc: etnaviv@lists.freedesktop.org
> > > > ---
> > > >  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > index 92478a50a580..5c4fed2b7c6a 100644
> > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > @@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
> > > >       for (i = 0; i < submit->nr_bos; i++) {
> > > >               struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
> > > >               struct dma_resv *robj = bo->obj->base.resv;
> > > > +             bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
> > > >
> > > > -             if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
> > > > +             if (!(write)) {
> > >
> > > No parenthesis around the write needed.
> > >
> > > >                       ret = dma_resv_reserve_shared(robj, 1);
> > > >                       if (ret)
> > > >                               return ret;
> > > >               }
> > > >
> > > > -             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
> > > > +             /* exclusive fences must be ordered */
> > >
> > > I feel like this comment isn't really helpful. It might tell you all
> > > you need to know if you just paged in all the context around dma_resv
> > > and the dependency graph, but it's not more than noise to me right now.
> > >
> > > I guess the comment should answer the question against what the
> > > exclusive fence we are going to add needs to be ordered and why it
> > > isn't safe to skip implicit sync in that case.
> >
> > The full-length comment is the doc patch, which is last in the series.
> > Essentially the rule is that you're not allowed to drop fences on the
> > floor (which you do if you just set a new write fence and not take any
> > of the existing fences as dependencies), at least for shared buffers.
> > But since it's easier to be consistent I think it's best if we do this
> > just across the board.
> >
> > Like the commit message explains, there's a few ways to fix this:
> > - just treat it as implicit synced
> > - chain the fences together - that way you don't sync, but also
> > there's no fence that's being lost. This is what amdgpu does, and also
> > what the new import ioctl does.
> >
> > 2nd option is a lot more involved, and since all the dma-resv stuff is
> > a bit under discussion, I went with the most minimal thing possible.
>
> Thanks for the confirmation, that was as much as I figured from the doc
> patch as well. So with that in mind I would expect this comment to read
> something like this:
> "Adding a new exclusive fence drops all previous fences from the
> dma_resv. To avoid violating the signalling order we err on the side of
> over-synchronizing by waiting for the existing fences, even if
> userspace asked us to ignore them."

Thanks for the suggestion, I've applied that to all 3 patches for
msm/etnaviv and i915.

I hope with that added there's enough context in the commit message
that at least the gist is understandable without full context down the
road.
-Daniel

>
> Regards,
> Lucas
>
> > -Daniel
> >
> > >
> > > Regards,
> > > Lucas
> > > > +             if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
> > > >                       continue;
> > > >
> > > >               ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
> > > > -                                                bo->flags & ETNA_SUBMIT_BO_WRITE);
> > > > +                                                write);
> > > >               if (ret)
> > > >                       return ret;
> > > >       }
> > >
> > >
> >
> >
>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-07-07 12:59 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-06 10:12 [PATCH 0/7] dma-resv fence DAG fixes Daniel Vetter
2021-07-06 10:12 ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12 ` [PATCH 1/7] drm/msm: Don't break exclusive fence ordering Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12   ` Daniel Vetter
2021-07-06 10:12 ` [PATCH 2/7] drm/msm: always wait for the exclusive fence Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12   ` Daniel Vetter
2021-07-06 10:12 ` [PATCH 3/7] drm/etnaviv: Don't break exclusive fence ordering Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-07  8:54   ` Lucas Stach
2021-07-07  8:54     ` [Intel-gfx] " Lucas Stach
2021-07-07 11:37     ` Daniel Vetter
2021-07-07 11:37       ` [Intel-gfx] " Daniel Vetter
2021-07-07 12:31       ` Lucas Stach
2021-07-07 12:31         ` [Intel-gfx] " Lucas Stach
2021-07-07 12:59         ` Daniel Vetter
2021-07-07 12:59           ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12 ` [PATCH 4/7] drm/i915: delete exclude argument from i915_sw_fence_await_reservation Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12 ` [PATCH 5/7] drm/i915: Always wait for the exclusive fence Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 12:47   ` Matthew Auld
2021-07-06 12:47     ` Matthew Auld
2021-07-06 12:58     ` Daniel Vetter
2021-07-06 12:58       ` Daniel Vetter
2021-07-06 10:12 ` [PATCH 6/7] drm/i915: Don't break exclusive fence ordering Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12 ` [PATCH 7/7] dma-resv: Give the docs a do-over Daniel Vetter
2021-07-06 10:12   ` [Intel-gfx] " Daniel Vetter
2021-07-06 10:12   ` Daniel Vetter
2021-07-06 12:34   ` [Intel-gfx] " Matthew Auld
2021-07-06 12:34     ` Matthew Auld
2021-07-06 12:34     ` Matthew Auld
2021-07-06 23:47   ` Jason Ekstrand
2021-07-06 23:47     ` [Intel-gfx] " Jason Ekstrand
2021-07-06 23:47     ` Jason Ekstrand
2021-07-07  8:06   ` [Linaro-mm-sig] " Christian König
2021-07-07  8:06     ` [Intel-gfx] " Christian König
2021-07-07  8:06     ` Christian König
2021-07-07  9:13     ` Daniel Vetter
2021-07-07  9:13       ` [Intel-gfx] " Daniel Vetter
2021-07-07  9:13       ` Daniel Vetter
2021-07-06 10:35 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for dma-resv fence DAG fixes Patchwork
