linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Dynamic DMA-buf locking changes
@ 2019-08-29 14:29 Christian König
  2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Christian König @ 2019-08-29 14:29 UTC (permalink / raw)
  To: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Hi everyone,

since upstreaming the full dynamic DMA-buf changes turned out more problematic than previously thought I've reverted back to individual patches and separated out only the locking changes.

So this patch does NOT contain any new callbacks for pinning/unpinning and move notification, but only the locking changes necessary.

As previously discussed when the framework detects that the locking semantics between exporter and importer are different it just falls back to using a cached sgt created during attach time.

While separating the patch set I've most likely stumbled over the problem why this previously raised some lockdep warning with i915, it turned out to be just a might_lock() at the wrong place.

Please review and/or comment,
Christian.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-08-29 14:29 Dynamic DMA-buf locking changes Christian König
@ 2019-08-29 14:29 ` Christian König
  2019-09-03  8:05   ` Daniel Vetter
  2019-10-08  8:55   ` Daniel Vetter
  2019-08-29 14:29 ` [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2 Christian König
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Christian König @ 2019-08-29 14:29 UTC (permalink / raw)
  To: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

This patch is a stripped down version of the locking changes
necessary to support dynamic DMA-buf handling.

For compatibility we cache the DMA-buf mapping as soon as
exporter/importer disagree on the dynamic handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c | 90 ++++++++++++++++++++++++++++++++++++---
 include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
 2 files changed, 133 insertions(+), 8 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 433d91d710e4..65052d52602b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 		return ERR_PTR(-EINVAL);
 	}
 
+	if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
+		    exp_info->ops->dynamic_mapping))
+		return ERR_PTR(-EINVAL);
+
 	if (!try_module_get(exp_info->owner))
 		return ERR_PTR(-ENOENT);
 
@@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
 EXPORT_SYMBOL_GPL(dma_buf_put);
 
 /**
- * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
+ * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
  * calls attach() of dma_buf_ops to allow device-specific attach functionality
- * @dmabuf:	[in]	buffer to attach device to.
- * @dev:	[in]	device to be attached.
+ * @dmabuf:		[in]	buffer to attach device to.
+ * @dev:		[in]	device to be attached.
+ * @dynamic_mapping:	[in]	calling convention for map/unmap
  *
  * Returns struct dma_buf_attachment pointer for this attachment. Attachments
  * must be cleaned up by calling dma_buf_detach().
@@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * accessible to @dev, and cannot be moved to a more suitable place. This is
  * indicated with the error code -EBUSY.
  */
-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-					  struct device *dev)
+struct dma_buf_attachment *
+dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
+		       bool dynamic_mapping)
 {
 	struct dma_buf_attachment *attach;
 	int ret;
@@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 
 	attach->dev = dev;
 	attach->dmabuf = dmabuf;
+	attach->dynamic_mapping = dynamic_mapping;
 
 	mutex_lock(&dmabuf->lock);
 
@@ -685,16 +692,64 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 		if (ret)
 			goto err_attach;
 	}
+	dma_resv_lock(dmabuf->resv, NULL);
 	list_add(&attach->node, &dmabuf->attachments);
+	dma_resv_unlock(dmabuf->resv);
 
 	mutex_unlock(&dmabuf->lock);
 
+	/* When either the importer or the exporter can't handle dynamic
+	 * mappings we cache the mapping here to avoid issues with the
+	 * reservation object lock.
+	 */
+	if (dma_buf_attachment_is_dynamic(attach) !=
+	    dma_buf_is_dynamic(dmabuf)) {
+		struct sg_table *sgt;
+
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			dma_resv_lock(attach->dmabuf->resv, NULL);
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			ret = PTR_ERR(sgt);
+			goto err_unlock;
+		}
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			dma_resv_unlock(attach->dmabuf->resv);
+		attach->sgt = sgt;
+		attach->dir = DMA_BIDIRECTIONAL;
+	}
+
 	return attach;
 
 err_attach:
 	kfree(attach);
 	mutex_unlock(&dmabuf->lock);
 	return ERR_PTR(ret);
+
+err_unlock:
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		dma_resv_unlock(attach->dmabuf->resv);
+
+	dma_buf_detach(dmabuf, attach);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
+
+/**
+ * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
+ * @dmabuf:	[in]	buffer to attach device to.
+ * @dev:	[in]	device to be attached.
+ *
+ * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
+ * mapping.
+ */
+struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
+					  struct device *dev)
+{
+	return dma_buf_dynamic_attach(dmabuf, dev, false);
 }
 EXPORT_SYMBOL_GPL(dma_buf_attach);
 
@@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 	if (WARN_ON(!dmabuf || !attach))
 		return;
 
-	if (attach->sgt)
+	if (attach->sgt) {
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			dma_resv_lock(attach->dmabuf->resv, NULL);
+
 		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
 
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			dma_resv_unlock(attach->dmabuf->resv);
+	}
+
 	mutex_lock(&dmabuf->lock);
+	dma_resv_lock(dmabuf->resv, NULL);
 	list_del(&attach->node);
+	dma_resv_unlock(dmabuf->resv);
 	if (dmabuf->ops->detach)
 		dmabuf->ops->detach(dmabuf, attach);
 
@@ -749,6 +813,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf))
 		return ERR_PTR(-EINVAL);
 
+	if (dma_buf_attachment_is_dynamic(attach))
+		dma_resv_assert_held(attach->dmabuf->resv);
+
 	if (attach->sgt) {
 		/*
 		 * Two mappings with different directions for the same
@@ -761,6 +828,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 		return attach->sgt;
 	}
 
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		dma_resv_assert_held(attach->dmabuf->resv);
+
 	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
 	if (!sg_table)
 		sg_table = ERR_PTR(-ENOMEM);
@@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
 		return;
 
+	if (dma_buf_attachment_is_dynamic(attach))
+		dma_resv_assert_held(attach->dmabuf->resv);
+
 	if (attach->sgt == sg_table)
 		return;
 
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		dma_resv_assert_held(attach->dmabuf->resv);
+
 	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
@@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 		seq_puts(s, "\tAttached Devices:\n");
 		attach_count = 0;
 
+		dma_resv_lock(buf_obj->resv, NULL);
 		list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
 			seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
 			attach_count++;
 		}
+		dma_resv_unlock(buf_obj->resv);
 
 		seq_printf(s, "Total %d devices attached\n\n",
 				attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index ec212cb27fdc..a8f8b2b812fd 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -42,6 +42,17 @@ struct dma_buf_ops {
 	  */
 	bool cache_sgt_mapping;
 
+	/**
+	 * @dynamic_mapping:
+	 *
+	 * If true the framework makes sure that the map/unmap_dma_buf
+	 * callbacks are always called with the dma_resv object locked.
+	 *
+	 * If false the framework makes ure that the map/unmap_dma_buf
+	 * callbacks are always called without the dma_resv object locked.
+	 */
+	bool dynamic_mapping;
+
 	/**
 	 * @attach:
 	 *
@@ -109,6 +120,9 @@ struct dma_buf_ops {
 	 * any other kind of sharing that the exporter might wish to make
 	 * available to buffer-users.
 	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * the dynamic_mapping flag is true.
+	 *
 	 * Returns:
 	 *
 	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
@@ -327,6 +341,8 @@ struct dma_buf {
  * @sgt: cached mapping.
  * @dir: direction of cached mapping.
  * @priv: exporter specific attachment data.
+ * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
+ * dma_resv lock held.
  *
  * This structure holds the attachment information between the dma_buf buffer
  * and its user device(s). The list contains one attachment struct per device
@@ -343,6 +359,7 @@ struct dma_buf_attachment {
 	struct list_head node;
 	struct sg_table *sgt;
 	enum dma_data_direction dir;
+	bool dynamic_mapping;
 	void *priv;
 };
 
@@ -394,10 +411,39 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
 	get_file(dmabuf->file);
 }
 
+/**
+ * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
+ * @dmabuf: the DMA-buf to check
+ *
+ * Returns true if a DMA-buf exporter wants to be called with the dma_resv
+ * locked, false if it doesn't wants to be called with the lock held.
+ */
+static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
+{
+	return dmabuf->ops->dynamic_mapping;
+}
+
+/**
+ * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
+ * mappinsg
+ * @attach: the DMA-buf attachment to check
+ *
+ * Returns true if a DMA-buf importer wants to call the map/unmap functions with
+ * the dma_resv lock held.
+ */
+static inline bool
+dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
+{
+	return attach->dynamic_mapping;
+}
+
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-							struct device *dev);
+					  struct device *dev);
+struct dma_buf_attachment *
+dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
+		       bool dynamic_mapping);
 void dma_buf_detach(struct dma_buf *dmabuf,
-				struct dma_buf_attachment *dmabuf_attach);
+		    struct dma_buf_attachment *attach);
 
 struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
 
@@ -409,6 +455,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 					enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
 				enum dma_data_direction);
+void dma_buf_move_notify(struct dma_buf *dma_buf);
 int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
 			     enum dma_data_direction dir);
 int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2
  2019-08-29 14:29 Dynamic DMA-buf locking changes Christian König
  2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
@ 2019-08-29 14:29 ` Christian König
  2019-10-08  9:25   ` Daniel Vetter
  2019-08-29 14:29 ` [PATCH 3/4] drm/amdgpu: add independent DMA-buf export v7 Christian König
  2019-08-29 14:29 ` [PATCH 4/4] drm/amdgpu: add independent DMA-buf import v8 Christian König
  3 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-08-29 14:29 UTC (permalink / raw)
  To: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

This way we can even pipeline imported BO evictions.

v2: Limit this to only cases when the parent object uses a separate
    reservation object as well. This fixes another OOM problem.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index fe81c565e7ef..2ebe9fe7f6c8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	kref_init(&fbo->base.kref);
 	fbo->base.destroy = &ttm_transfered_destroy;
 	fbo->base.acc_size = 0;
-	fbo->base.base.resv = &fbo->base.base._resv;
+	if (bo->base.resv == &bo->base._resv)
+		fbo->base.base.resv = &fbo->base.base._resv;
+
 	dma_resv_init(fbo->base.base.resv);
 	ret = dma_resv_trylock(fbo->base.base.resv);
 	WARN_ON(!ret);
@@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 		if (ret)
 			return ret;
 
-		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
+		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
 
 		/**
 		 * If we're not moving to fixed memory, the TTM object
@@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 		else
 			bo->ttm = NULL;
 
-		ttm_bo_unreserve(ghost_obj);
+		dma_resv_unlock(&ghost_obj->base._resv);
 		ttm_bo_put(ghost_obj);
 	}
 
@@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
 		if (ret)
 			return ret;
 
-		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
+		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
 
 		/**
 		 * If we're not moving to fixed memory, the TTM object
@@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
 		else
 			bo->ttm = NULL;
 
-		ttm_bo_unreserve(ghost_obj);
+		dma_resv_unlock(&ghost_obj->base._resv);
 		ttm_bo_put(ghost_obj);
 
 	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
@@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	if (ret)
 		return ret;
 
-	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
+	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
 	/* Last resort, wait for the BO to be idle when we are OOM */
 	if (ret)
 		ttm_bo_wait(bo, false, false);
@@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	bo->mem.mem_type = TTM_PL_SYSTEM;
 	bo->ttm = NULL;
 
-	ttm_bo_unreserve(ghost);
+	dma_resv_unlock(&ghost->base._resv);
 	ttm_bo_put(ghost);
 
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] drm/amdgpu: add independent DMA-buf export v7
  2019-08-29 14:29 Dynamic DMA-buf locking changes Christian König
  2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
  2019-08-29 14:29 ` [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2 Christian König
@ 2019-08-29 14:29 ` Christian König
  2019-08-29 14:29 ` [PATCH 4/4] drm/amdgpu: add independent DMA-buf import v8 Christian König
  3 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-08-29 14:29 UTC (permalink / raw)
  To: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Add an DMA-buf export implementation independent of the DRM helpers.

This not only avoids the caching of DMA-buf mappings, but also
allows us to use the new dynamic locking approach.

This is also a prerequisite of unpinned DMA-buf handling.

v2: fix unintended recursion, remove debugging leftovers
v3: split out from unpinned DMA-buf work
v4: rebase on top of new no_sgt_cache flag
v5: fix some warnings by including amdgpu_dma_buf.h
v6: fix locking for non amdgpu exports
v7: rebased on new DMA-buf locking patch

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 172 +++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c     |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |   1 +
 4 files changed, 97 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index bf0f00508987..d3d32b75f346 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -34,26 +34,11 @@
 #include "amdgpu.h"
 #include "amdgpu_display.h"
 #include "amdgpu_gem.h"
+#include "amdgpu_dma_buf.h"
 #include <drm/amdgpu_drm.h>
 #include <linux/dma-buf.h>
 #include <linux/dma-fence-array.h>
 
-/**
- * amdgpu_gem_prime_get_sg_table - &drm_driver.gem_prime_get_sg_table
- * implementation
- * @obj: GEM buffer object (BO)
- *
- * Returns:
- * A scatter/gather table for the pinned pages of the BO's memory.
- */
-struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-	int npages = bo->tbo.num_pages;
-
-	return drm_prime_pages_to_sg(bo->tbo.ttm->pages, npages);
-}
-
 /**
  * amdgpu_gem_prime_vmap - &dma_buf_ops.vmap implementation
  * @obj: GEM BO
@@ -179,92 +164,126 @@ __dma_resv_make_exclusive(struct dma_resv *obj)
 }
 
 /**
- * amdgpu_dma_buf_map_attach - &dma_buf_ops.attach implementation
- * @dma_buf: Shared DMA buffer
+ * amdgpu_dma_buf_attach - &dma_buf_ops.attach implementation
+ *
+ * @dmabuf: DMA-buf where we attach to
+ * @attach: attachment to add
+ *
+ * Add the attachment as user to the exported DMA-buf.
+ */
+static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
+				 struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = dmabuf->priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	int r;
+
+	if (attach->dev->driver == adev->dev->driver)
+		return 0;
+
+	r = amdgpu_bo_reserve(bo, false);
+	if (unlikely(r != 0))
+		return r;
+
+	/*
+	 * We only create shared fences for internal use, but importers
+	 * of the dmabuf rely on exclusive fences for implicitly
+	 * tracking write hazards. As any of the current fences may
+	 * correspond to a write, we need to convert all existing
+	 * fences on the reservation object into a single exclusive
+	 * fence.
+	 */
+	r = __dma_resv_make_exclusive(bo->tbo.base.resv);
+	if (r)
+		return r;
+
+	bo->prime_shared_count++;
+	amdgpu_bo_unreserve(bo);
+	return 0;
+}
+
+/**
+ * amdgpu_dma_buf_detach - &dma_buf_ops.detach implementation
+ *
+ * @dmabuf: DMA-buf where we remove the attachment from
+ * @attach: the attachment to remove
+ *
+ * Called when an attachment is removed from the DMA-buf.
+ */
+static void amdgpu_dma_buf_detach(struct dma_buf *dmabuf,
+				  struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = dmabuf->priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+
+	if (attach->dev->driver != adev->dev->driver && bo->prime_shared_count)
+		bo->prime_shared_count--;
+}
+
+/**
+ * amdgpu_dma_buf_map - &dma_buf_ops.map_dma_buf implementation
  * @attach: DMA-buf attachment
+ * @dir: DMA direction
  *
  * Makes sure that the shared DMA buffer can be accessed by the target device.
  * For now, simply pins it to the GTT domain, where it should be accessible by
  * all DMA devices.
  *
  * Returns:
- * 0 on success or a negative error code on failure.
+ * sg_table filled with the DMA addresses to use or ERR_PRT with negative error
+ * code.
  */
-static int amdgpu_dma_buf_map_attach(struct dma_buf *dma_buf,
-				     struct dma_buf_attachment *attach)
+static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
+					   enum dma_data_direction dir)
 {
+	struct dma_buf *dma_buf = attach->dmabuf;
 	struct drm_gem_object *obj = dma_buf->priv;
 	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	struct sg_table *sgt;
 	long r;
 
-	r = drm_gem_map_attach(dma_buf, attach);
-	if (r)
-		return r;
-
-	r = amdgpu_bo_reserve(bo, false);
-	if (unlikely(r != 0))
-		goto error_detach;
-
-
-	if (attach->dev->driver != adev->dev->driver) {
-		/*
-		 * We only create shared fences for internal use, but importers
-		 * of the dmabuf rely on exclusive fences for implicitly
-		 * tracking write hazards. As any of the current fences may
-		 * correspond to a write, we need to convert all existing
-		 * fences on the reservation object into a single exclusive
-		 * fence.
-		 */
-		r = __dma_resv_make_exclusive(bo->tbo.base.resv);
-		if (r)
-			goto error_unreserve;
-	}
-
-	/* pin buffer into GTT */
 	r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
 	if (r)
-		goto error_unreserve;
+		return ERR_PTR(r);
 
-	if (attach->dev->driver != adev->dev->driver)
-		bo->prime_shared_count++;
+	sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, bo->tbo.num_pages);
+	if (IS_ERR(sgt))
+		return sgt;
 
-error_unreserve:
-	amdgpu_bo_unreserve(bo);
+	if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
+			      DMA_ATTR_SKIP_CPU_SYNC))
+		goto error_free;
 
-error_detach:
-	if (r)
-		drm_gem_map_detach(dma_buf, attach);
-	return r;
+	return sgt;
+
+error_free:
+	sg_free_table(sgt);
+	kfree(sgt);
+	return ERR_PTR(-ENOMEM);
 }
 
 /**
- * amdgpu_dma_buf_map_detach - &dma_buf_ops.detach implementation
- * @dma_buf: Shared DMA buffer
+ * amdgpu_dma_buf_unmap - &dma_buf_ops.unmap_dma_buf implementation
  * @attach: DMA-buf attachment
+ * @sgt: sg_table to unmap
+ * @dir: DMA direction
  *
  * This is called when a shared DMA buffer no longer needs to be accessible by
  * another device. For now, simply unpins the buffer from GTT.
  */
-static void amdgpu_dma_buf_map_detach(struct dma_buf *dma_buf,
-				      struct dma_buf_attachment *attach)
+static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach,
+				 struct sg_table *sgt,
+				 enum dma_data_direction dir)
 {
-	struct drm_gem_object *obj = dma_buf->priv;
+	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-	int ret = 0;
-
-	ret = amdgpu_bo_reserve(bo, true);
-	if (unlikely(ret != 0))
-		goto error;
 
+	dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+	sg_free_table(sgt);
+	kfree(sgt);
 	amdgpu_bo_unpin(bo);
-	if (attach->dev->driver != adev->dev->driver && bo->prime_shared_count)
-		bo->prime_shared_count--;
-	amdgpu_bo_unreserve(bo);
-
-error:
-	drm_gem_map_detach(dma_buf, attach);
 }
 
 /**
@@ -308,10 +327,11 @@ static int amdgpu_dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
 }
 
 const struct dma_buf_ops amdgpu_dmabuf_ops = {
-	.attach = amdgpu_dma_buf_map_attach,
-	.detach = amdgpu_dma_buf_map_detach,
-	.map_dma_buf = drm_gem_map_dma_buf,
-	.unmap_dma_buf = drm_gem_unmap_dma_buf,
+	.dynamic_mapping = true,
+	.attach = amdgpu_dma_buf_attach,
+	.detach = amdgpu_dma_buf_detach,
+	.map_dma_buf = amdgpu_dma_buf_map,
+	.unmap_dma_buf = amdgpu_dma_buf_unmap,
 	.release = drm_gem_dmabuf_release,
 	.begin_cpu_access = amdgpu_dma_buf_begin_cpu_access,
 	.mmap = drm_gem_dmabuf_mmap,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
index 5012e6ab58f1..ce1b3f017451 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
@@ -25,7 +25,6 @@
 
 #include <drm/drm_gem.h>
 
-struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj);
 struct drm_gem_object *
 amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 				 struct dma_buf_attachment *attach,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6b96a5738e57..6b7ad9345b61 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1397,7 +1397,6 @@ static struct drm_driver kms_driver = {
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = amdgpu_gem_prime_export,
 	.gem_prime_import = amdgpu_gem_prime_import,
-	.gem_prime_get_sg_table = amdgpu_gem_prime_get_sg_table,
 	.gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table,
 	.gem_prime_vmap = amdgpu_gem_prime_vmap,
 	.gem_prime_vunmap = amdgpu_gem_prime_vunmap,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2f11ebd95528..156f0098d161 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -31,6 +31,7 @@
  */
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_cache.h>
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] drm/amdgpu: add independent DMA-buf import v8
  2019-08-29 14:29 Dynamic DMA-buf locking changes Christian König
                   ` (2 preceding siblings ...)
  2019-08-29 14:29 ` [PATCH 3/4] drm/amdgpu: add independent DMA-buf export v7 Christian König
@ 2019-08-29 14:29 ` Christian König
  3 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-08-29 14:29 UTC (permalink / raw)
  To: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Instead of relying on the DRM functions just implement our own import
functions. This prepares support for taking care of unpinned DMA-buf.

v2: enable for all exporters, not just amdgpu, fix invalidation
    handling, lock reservation object while setting callback
v3: change to new dma_buf attach interface
v4: split out from unpinned DMA-buf work
v5: rebased and cleanup on new DMA-buf interface
v6: squash with invalidation callback change,
    stop using _(map|unmap)_locked
v7: drop invalidations when the BO is already in system domain
v8: rebase on new DMA-buf patch and drop move notification

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 38 +++++++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |  4 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c     |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c     | 32 ++++++++++++++---
 4 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index d3d32b75f346..693cd01e9373 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -370,31 +370,28 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_gem_object *gobj,
 }
 
 /**
- * amdgpu_gem_prime_import_sg_table - &drm_driver.gem_prime_import_sg_table
- * implementation
+ * amdgpu_dma_buf_create_obj - create BO for DMA-buf import
+ *
  * @dev: DRM device
- * @attach: DMA-buf attachment
- * @sg: Scatter/gather table
+ * @dma_buf: DMA-buf
  *
- * Imports shared DMA buffer memory exported by another device.
+ * Creates an empty SG BO for DMA-buf import.
  *
  * Returns:
  * A new GEM BO of the given DRM device, representing the memory
  * described by the given DMA-buf attachment and scatter/gather table.
  */
-struct drm_gem_object *
-amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
-				 struct dma_buf_attachment *attach,
-				 struct sg_table *sg)
+static struct drm_gem_object *
+amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 {
-	struct dma_resv *resv = attach->dmabuf->resv;
+	struct dma_resv *resv = dma_buf->resv;
 	struct amdgpu_device *adev = dev->dev_private;
 	struct amdgpu_bo *bo;
 	struct amdgpu_bo_param bp;
 	int ret;
 
 	memset(&bp, 0, sizeof(bp));
-	bp.size = attach->dmabuf->size;
+	bp.size = dma_buf->size;
 	bp.byte_align = PAGE_SIZE;
 	bp.domain = AMDGPU_GEM_DOMAIN_CPU;
 	bp.flags = 0;
@@ -405,11 +402,9 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 	if (ret)
 		goto error;
 
-	bo->tbo.sg = sg;
-	bo->tbo.ttm->sg = sg;
 	bo->allowed_domains = AMDGPU_GEM_DOMAIN_GTT;
 	bo->preferred_domains = AMDGPU_GEM_DOMAIN_GTT;
-	if (attach->dmabuf->ops != &amdgpu_dmabuf_ops)
+	if (dma_buf->ops != &amdgpu_dmabuf_ops)
 		bo->prime_shared_count = 1;
 
 	dma_resv_unlock(resv);
@@ -434,6 +429,7 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 					    struct dma_buf *dma_buf)
 {
+	struct dma_buf_attachment *attach;
 	struct drm_gem_object *obj;
 
 	if (dma_buf->ops == &amdgpu_dmabuf_ops) {
@@ -448,5 +444,17 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 		}
 	}
 
-	return drm_gem_prime_import(dev, dma_buf);
+	obj = amdgpu_dma_buf_create_obj(dev, dma_buf);
+	if (IS_ERR(obj))
+		return obj;
+
+	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, true);
+	if (IS_ERR(attach)) {
+		drm_gem_object_put(obj);
+		return ERR_CAST(attach);
+	}
+
+	get_dma_buf(dma_buf);
+	obj->import_attach = attach;
+	return obj;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
index ce1b3f017451..ec447a7b6b28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
@@ -25,10 +25,6 @@
 
 #include <drm/drm_gem.h>
 
-struct drm_gem_object *
-amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
-				 struct dma_buf_attachment *attach,
-				 struct sg_table *sg);
 struct dma_buf *amdgpu_gem_prime_export(struct drm_gem_object *gobj,
 					int flags);
 struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6b7ad9345b61..4c0d5162ca33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1397,7 +1397,6 @@ static struct drm_driver kms_driver = {
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = amdgpu_gem_prime_export,
 	.gem_prime_import = amdgpu_gem_prime_import,
-	.gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table,
 	.gem_prime_vmap = amdgpu_gem_prime_vmap,
 	.gem_prime_vunmap = amdgpu_gem_prime_vunmap,
 	.gem_prime_mmap = amdgpu_gem_prime_mmap,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index fb09314bcfd4..8c923282daf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/swap.h>
 #include <linux/swiotlb.h>
+#include <linux/dma-buf.h>
 
 #include <drm/ttm/ttm_bo_api.h>
 #include <drm/ttm/ttm_bo_driver.h>
@@ -747,6 +748,7 @@ static unsigned long amdgpu_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
  */
 struct amdgpu_ttm_tt {
 	struct ttm_dma_tt	ttm;
+	struct drm_gem_object	*gobj;
 	u64			offset;
 	uint64_t		userptr;
 	struct task_struct	*usertask;
@@ -1226,6 +1228,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 		return NULL;
 	}
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
+	gtt->gobj = &bo->base;
 
 	/* allocate space for the uninitialized page entries */
 	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
@@ -1246,7 +1249,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm,
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
 
 	/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */
 	if (gtt && gtt->userptr) {
@@ -1259,7 +1261,19 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm,
 		return 0;
 	}
 
-	if (slave && ttm->sg) {
+	if (ttm->page_flags & TTM_PAGE_FLAG_SG) {
+		if (!ttm->sg) {
+			struct dma_buf_attachment *attach;
+			struct sg_table *sgt;
+
+			attach = gtt->gobj->import_attach;
+			sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+			if (IS_ERR(sgt))
+				return PTR_ERR(sgt);
+
+			ttm->sg = sgt;
+		}
+
 		drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 						 gtt->ttm.dma_address,
 						 ttm->num_pages);
@@ -1286,9 +1300,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm,
  */
 static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
 {
-	struct amdgpu_device *adev;
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
+	struct amdgpu_device *adev;
 
 	if (gtt && gtt->userptr) {
 		amdgpu_ttm_tt_set_user_pages(ttm, NULL);
@@ -1297,7 +1310,16 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
 		return;
 	}
 
-	if (slave)
+	if (ttm->sg && gtt->gobj->import_attach) {
+		struct dma_buf_attachment *attach;
+
+		attach = gtt->gobj->import_attach;
+		dma_buf_unmap_attachment(attach, ttm->sg, DMA_BIDIRECTIONAL);
+		ttm->sg = NULL;
+		return;
+	}
+
+	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return;
 
 	adev = amdgpu_ttm_adev(ttm->bdev);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
@ 2019-09-03  8:05   ` Daniel Vetter
  2019-09-11 10:53     ` Christian König
  2019-10-08  8:55   ` Daniel Vetter
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-09-03  8:05 UTC (permalink / raw)
  To: Christian König
  Cc: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> This patch is a stripped down version of the locking changes
> necessary to support dynamic DMA-buf handling.
> 
> For compatibility we cache the DMA-buf mapping as soon as
> exporter/importer disagree on the dynamic handling.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-buf.c | 90 ++++++++++++++++++++++++++++++++++++---
>  include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>  2 files changed, 133 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 433d91d710e4..65052d52602b 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>  		return ERR_PTR(-EINVAL);
>  	}
>  
> +	if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> +		    exp_info->ops->dynamic_mapping))
> +		return ERR_PTR(-EINVAL);
> +
>  	if (!try_module_get(exp_info->owner))
>  		return ERR_PTR(-ENOENT);
>  
> @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>  EXPORT_SYMBOL_GPL(dma_buf_put);
>  
>  /**
> - * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
> + * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
>   * calls attach() of dma_buf_ops to allow device-specific attach functionality
> - * @dmabuf:	[in]	buffer to attach device to.
> - * @dev:	[in]	device to be attached.
> + * @dmabuf:		[in]	buffer to attach device to.
> + * @dev:		[in]	device to be attached.
> + * @dynamic_mapping:	[in]	calling convention for map/unmap
>   *
>   * Returns struct dma_buf_attachment pointer for this attachment. Attachments
>   * must be cleaned up by calling dma_buf_detach().
> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>   * accessible to @dev, and cannot be moved to a more suitable place. This is
>   * indicated with the error code -EBUSY.
>   */
> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> -					  struct device *dev)
> +struct dma_buf_attachment *
> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> +		       bool dynamic_mapping)
>  {
>  	struct dma_buf_attachment *attach;
>  	int ret;
> @@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  
>  	attach->dev = dev;
>  	attach->dmabuf = dmabuf;
> +	attach->dynamic_mapping = dynamic_mapping;
>  
>  	mutex_lock(&dmabuf->lock);
>  
> @@ -685,16 +692,64 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  		if (ret)
>  			goto err_attach;
>  	}
> +	dma_resv_lock(dmabuf->resv, NULL);
>  	list_add(&attach->node, &dmabuf->attachments);
> +	dma_resv_unlock(dmabuf->resv);
>  
>  	mutex_unlock(&dmabuf->lock);
>  
> +	/* When either the importer or the exporter can't handle dynamic
> +	 * mappings we cache the mapping here to avoid issues with the
> +	 * reservation object lock.
> +	 */
> +	if (dma_buf_attachment_is_dynamic(attach) !=
> +	    dma_buf_is_dynamic(dmabuf)) {
> +		struct sg_table *sgt;
> +
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_lock(attach->dmabuf->resv, NULL);
> +
> +		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);

Now we're back to enforcing DMA_BIDI, which works nicely around the
locking pain, but apparently upsets the arm-soc folks who want to control
this better. That's why your previous version moved the caching into
map/unmap_sg, which resurrected the locking hydra.

I think we're going a bit in circles here, and I don't have a good idea
either :-/
-Daniel

> +		if (!sgt)
> +			sgt = ERR_PTR(-ENOMEM);
> +		if (IS_ERR(sgt)) {
> +			ret = PTR_ERR(sgt);
> +			goto err_unlock;
> +		}
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_unlock(attach->dmabuf->resv);
> +		attach->sgt = sgt;
> +		attach->dir = DMA_BIDIRECTIONAL;
> +	}
> +
>  	return attach;
>  
>  err_attach:
>  	kfree(attach);
>  	mutex_unlock(&dmabuf->lock);
>  	return ERR_PTR(ret);
> +
> +err_unlock:
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_unlock(attach->dmabuf->resv);
> +
> +	dma_buf_detach(dmabuf, attach);
> +	return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
> +
> +/**
> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
> + * @dmabuf:	[in]	buffer to attach device to.
> + * @dev:	[in]	device to be attached.
> + *
> + * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
> + * mapping.
> + */
> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> +					  struct device *dev)
> +{
> +	return dma_buf_dynamic_attach(dmabuf, dev, false);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_attach);
>  
> @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>  	if (WARN_ON(!dmabuf || !attach))
>  		return;
>  
> -	if (attach->sgt)
> +	if (attach->sgt) {
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_lock(attach->dmabuf->resv, NULL);
> +
>  		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>  
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_unlock(attach->dmabuf->resv);
> +	}
> +
>  	mutex_lock(&dmabuf->lock);
> +	dma_resv_lock(dmabuf->resv, NULL);
>  	list_del(&attach->node);
> +	dma_resv_unlock(dmabuf->resv);
>  	if (dmabuf->ops->detach)
>  		dmabuf->ops->detach(dmabuf, attach);
>  
> @@ -749,6 +813,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  	if (WARN_ON(!attach || !attach->dmabuf))
>  		return ERR_PTR(-EINVAL);
>  
> +	if (dma_buf_attachment_is_dynamic(attach))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	if (attach->sgt) {
>  		/*
>  		 * Two mappings with different directions for the same
> @@ -761,6 +828,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  		return attach->sgt;
>  	}
>  
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>  	if (!sg_table)
>  		sg_table = ERR_PTR(-ENOMEM);
> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>  	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>  		return;
>  
> +	if (dma_buf_attachment_is_dynamic(attach))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	if (attach->sgt == sg_table)
>  		return;
>  
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
>  		seq_puts(s, "\tAttached Devices:\n");
>  		attach_count = 0;
>  
> +		dma_resv_lock(buf_obj->resv, NULL);
>  		list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
>  			seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>  			attach_count++;
>  		}
> +		dma_resv_unlock(buf_obj->resv);
>  
>  		seq_printf(s, "Total %d devices attached\n\n",
>  				attach_count);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index ec212cb27fdc..a8f8b2b812fd 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>  	  */
>  	bool cache_sgt_mapping;
>  
> +	/**
> +	 * @dynamic_mapping:
> +	 *
> +	 * If true the framework makes sure that the map/unmap_dma_buf
> +	 * callbacks are always called with the dma_resv object locked.
> +	 *
> +	 * If false the framework makes ure that the map/unmap_dma_buf
> +	 * callbacks are always called without the dma_resv object locked.
> +	 */
> +	bool dynamic_mapping;
> +
>  	/**
>  	 * @attach:
>  	 *
> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>  	 * any other kind of sharing that the exporter might wish to make
>  	 * available to buffer-users.
>  	 *
> +	 * This is always called with the dmabuf->resv object locked when
> +	 * the dynamic_mapping flag is true.
> +	 *
>  	 * Returns:
>  	 *
>  	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
> @@ -327,6 +341,8 @@ struct dma_buf {
>   * @sgt: cached mapping.
>   * @dir: direction of cached mapping.
>   * @priv: exporter specific attachment data.
> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
> + * dma_resv lock held.
>   *
>   * This structure holds the attachment information between the dma_buf buffer
>   * and its user device(s). The list contains one attachment struct per device
> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>  	struct list_head node;
>  	struct sg_table *sgt;
>  	enum dma_data_direction dir;
> +	bool dynamic_mapping;
>  	void *priv;
>  };
>  
> @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
>  	get_file(dmabuf->file);
>  }
>  
> +/**
> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> + * @dmabuf: the DMA-buf to check
> + *
> + * Returns true if a DMA-buf exporter wants to be called with the dma_resv
> + * locked, false if it doesn't wants to be called with the lock held.
> + */
> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> +{
> +	return dmabuf->ops->dynamic_mapping;
> +}
> +
> +/**
> + * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
> + * mappinsg
> + * @attach: the DMA-buf attachment to check
> + *
> + * Returns true if a DMA-buf importer wants to call the map/unmap functions with
> + * the dma_resv lock held.
> + */
> +static inline bool
> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> +{
> +	return attach->dynamic_mapping;
> +}
> +
>  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> -							struct device *dev);
> +					  struct device *dev);
> +struct dma_buf_attachment *
> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> +		       bool dynamic_mapping);
>  void dma_buf_detach(struct dma_buf *dmabuf,
> -				struct dma_buf_attachment *dmabuf_attach);
> +		    struct dma_buf_attachment *attach);
>  
>  struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>  
> @@ -409,6 +455,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
>  					enum dma_data_direction);
>  void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
>  				enum dma_data_direction);
> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>  int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>  			     enum dma_data_direction dir);
>  int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> -- 
> 2.17.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-03  8:05   ` Daniel Vetter
@ 2019-09-11 10:53     ` Christian König
  2019-09-16 12:23       ` Christian König
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-11 10:53 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 03.09.19 um 10:05 schrieb Daniel Vetter:
> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>> This patch is a stripped down version of the locking changes
>> necessary to support dynamic DMA-buf handling.
>>
>> For compatibility we cache the DMA-buf mapping as soon as
>> exporter/importer disagree on the dynamic handling.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/dma-buf/dma-buf.c | 90 ++++++++++++++++++++++++++++++++++++---
>>   include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>   2 files changed, 133 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index 433d91d710e4..65052d52602b 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>>   		return ERR_PTR(-EINVAL);
>>   	}
>>   
>> +	if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>> +		    exp_info->ops->dynamic_mapping))
>> +		return ERR_PTR(-EINVAL);
>> +
>>   	if (!try_module_get(exp_info->owner))
>>   		return ERR_PTR(-ENOENT);
>>   
>> @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>   EXPORT_SYMBOL_GPL(dma_buf_put);
>>   
>>   /**
>> - * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
>> + * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
>>    * calls attach() of dma_buf_ops to allow device-specific attach functionality
>> - * @dmabuf:	[in]	buffer to attach device to.
>> - * @dev:	[in]	device to be attached.
>> + * @dmabuf:		[in]	buffer to attach device to.
>> + * @dev:		[in]	device to be attached.
>> + * @dynamic_mapping:	[in]	calling convention for map/unmap
>>    *
>>    * Returns struct dma_buf_attachment pointer for this attachment. Attachments
>>    * must be cleaned up by calling dma_buf_detach().
>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>    * accessible to @dev, and cannot be moved to a more suitable place. This is
>>    * indicated with the error code -EBUSY.
>>    */
>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>> -					  struct device *dev)
>> +struct dma_buf_attachment *
>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>> +		       bool dynamic_mapping)
>>   {
>>   	struct dma_buf_attachment *attach;
>>   	int ret;
>> @@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>   
>>   	attach->dev = dev;
>>   	attach->dmabuf = dmabuf;
>> +	attach->dynamic_mapping = dynamic_mapping;
>>   
>>   	mutex_lock(&dmabuf->lock);
>>   
>> @@ -685,16 +692,64 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>   		if (ret)
>>   			goto err_attach;
>>   	}
>> +	dma_resv_lock(dmabuf->resv, NULL);
>>   	list_add(&attach->node, &dmabuf->attachments);
>> +	dma_resv_unlock(dmabuf->resv);
>>   
>>   	mutex_unlock(&dmabuf->lock);
>>   
>> +	/* When either the importer or the exporter can't handle dynamic
>> +	 * mappings we cache the mapping here to avoid issues with the
>> +	 * reservation object lock.
>> +	 */
>> +	if (dma_buf_attachment_is_dynamic(attach) !=
>> +	    dma_buf_is_dynamic(dmabuf)) {
>> +		struct sg_table *sgt;
>> +
>> +		if (dma_buf_is_dynamic(attach->dmabuf))
>> +			dma_resv_lock(attach->dmabuf->resv, NULL);
>> +
>> +		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> Now we're back to enforcing DMA_BIDI, which works nicely around the
> locking pain, but apparently upsets the arm-soc folks who want to control
> this better.

Take another look at dma_buf_map_attachment(), we still try to get the 
caching there for ARM.

What we do here is to bidirectionally map the buffer to avoid the 
locking hydra when importer and exporter disagree on locking.

So the ARM folks can easily avoid that by switching to dynamic locking 
for both.

Regards,
Christian.

> That's why your previous version moved the caching into
> map/unmap_sg, which resurrected the locking hydra.
>
> I think we're going a bit in circles here, and I don't have a good idea
> either :-/
> -Daniel
>
>> +		if (!sgt)
>> +			sgt = ERR_PTR(-ENOMEM);
>> +		if (IS_ERR(sgt)) {
>> +			ret = PTR_ERR(sgt);
>> +			goto err_unlock;
>> +		}
>> +		if (dma_buf_is_dynamic(attach->dmabuf))
>> +			dma_resv_unlock(attach->dmabuf->resv);
>> +		attach->sgt = sgt;
>> +		attach->dir = DMA_BIDIRECTIONAL;
>> +	}
>> +
>>   	return attach;
>>   
>>   err_attach:
>>   	kfree(attach);
>>   	mutex_unlock(&dmabuf->lock);
>>   	return ERR_PTR(ret);
>> +
>> +err_unlock:
>> +	if (dma_buf_is_dynamic(attach->dmabuf))
>> +		dma_resv_unlock(attach->dmabuf->resv);
>> +
>> +	dma_buf_detach(dmabuf, attach);
>> +	return ERR_PTR(ret);
>> +}
>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
>> +
>> +/**
>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
>> + * @dmabuf:	[in]	buffer to attach device to.
>> + * @dev:	[in]	device to be attached.
>> + *
>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
>> + * mapping.
>> + */
>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>> +					  struct device *dev)
>> +{
>> +	return dma_buf_dynamic_attach(dmabuf, dev, false);
>>   }
>>   EXPORT_SYMBOL_GPL(dma_buf_attach);
>>   
>> @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>>   	if (WARN_ON(!dmabuf || !attach))
>>   		return;
>>   
>> -	if (attach->sgt)
>> +	if (attach->sgt) {
>> +		if (dma_buf_is_dynamic(attach->dmabuf))
>> +			dma_resv_lock(attach->dmabuf->resv, NULL);
>> +
>>   		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>>   
>> +		if (dma_buf_is_dynamic(attach->dmabuf))
>> +			dma_resv_unlock(attach->dmabuf->resv);
>> +	}
>> +
>>   	mutex_lock(&dmabuf->lock);
>> +	dma_resv_lock(dmabuf->resv, NULL);
>>   	list_del(&attach->node);
>> +	dma_resv_unlock(dmabuf->resv);
>>   	if (dmabuf->ops->detach)
>>   		dmabuf->ops->detach(dmabuf, attach);
>>   
>> @@ -749,6 +813,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>   	if (WARN_ON(!attach || !attach->dmabuf))
>>   		return ERR_PTR(-EINVAL);
>>   
>> +	if (dma_buf_attachment_is_dynamic(attach))
>> +		dma_resv_assert_held(attach->dmabuf->resv);
>> +
>>   	if (attach->sgt) {
>>   		/*
>>   		 * Two mappings with different directions for the same
>> @@ -761,6 +828,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>   		return attach->sgt;
>>   	}
>>   
>> +	if (dma_buf_is_dynamic(attach->dmabuf))
>> +		dma_resv_assert_held(attach->dmabuf->resv);
>> +
>>   	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>>   	if (!sg_table)
>>   		sg_table = ERR_PTR(-ENOMEM);
>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>>   	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>>   		return;
>>   
>> +	if (dma_buf_attachment_is_dynamic(attach))
>> +		dma_resv_assert_held(attach->dmabuf->resv);
>> +
>>   	if (attach->sgt == sg_table)
>>   		return;
>>   
>> +	if (dma_buf_is_dynamic(attach->dmabuf))
>> +		dma_resv_assert_held(attach->dmabuf->resv);
>> +
>>   	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>>   }
>>   EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
>>   		seq_puts(s, "\tAttached Devices:\n");
>>   		attach_count = 0;
>>   
>> +		dma_resv_lock(buf_obj->resv, NULL);
>>   		list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
>>   			seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>>   			attach_count++;
>>   		}
>> +		dma_resv_unlock(buf_obj->resv);
>>   
>>   		seq_printf(s, "Total %d devices attached\n\n",
>>   				attach_count);
>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>> index ec212cb27fdc..a8f8b2b812fd 100644
>> --- a/include/linux/dma-buf.h
>> +++ b/include/linux/dma-buf.h
>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>>   	  */
>>   	bool cache_sgt_mapping;
>>   
>> +	/**
>> +	 * @dynamic_mapping:
>> +	 *
>> +	 * If true the framework makes sure that the map/unmap_dma_buf
>> +	 * callbacks are always called with the dma_resv object locked.
>> +	 *
>> +	 * If false the framework makes ure that the map/unmap_dma_buf
>> +	 * callbacks are always called without the dma_resv object locked.
>> +	 */
>> +	bool dynamic_mapping;
>> +
>>   	/**
>>   	 * @attach:
>>   	 *
>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>>   	 * any other kind of sharing that the exporter might wish to make
>>   	 * available to buffer-users.
>>   	 *
>> +	 * This is always called with the dmabuf->resv object locked when
>> +	 * the dynamic_mapping flag is true.
>> +	 *
>>   	 * Returns:
>>   	 *
>>   	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
>> @@ -327,6 +341,8 @@ struct dma_buf {
>>    * @sgt: cached mapping.
>>    * @dir: direction of cached mapping.
>>    * @priv: exporter specific attachment data.
>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
>> + * dma_resv lock held.
>>    *
>>    * This structure holds the attachment information between the dma_buf buffer
>>    * and its user device(s). The list contains one attachment struct per device
>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>>   	struct list_head node;
>>   	struct sg_table *sgt;
>>   	enum dma_data_direction dir;
>> +	bool dynamic_mapping;
>>   	void *priv;
>>   };
>>   
>> @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
>>   	get_file(dmabuf->file);
>>   }
>>   
>> +/**
>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
>> + * @dmabuf: the DMA-buf to check
>> + *
>> + * Returns true if a DMA-buf exporter wants to be called with the dma_resv
>> + * locked, false if it doesn't wants to be called with the lock held.
>> + */
>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>> +{
>> +	return dmabuf->ops->dynamic_mapping;
>> +}
>> +
>> +/**
>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
>> + * mappinsg
>> + * @attach: the DMA-buf attachment to check
>> + *
>> + * Returns true if a DMA-buf importer wants to call the map/unmap functions with
>> + * the dma_resv lock held.
>> + */
>> +static inline bool
>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>> +{
>> +	return attach->dynamic_mapping;
>> +}
>> +
>>   struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>> -							struct device *dev);
>> +					  struct device *dev);
>> +struct dma_buf_attachment *
>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>> +		       bool dynamic_mapping);
>>   void dma_buf_detach(struct dma_buf *dmabuf,
>> -				struct dma_buf_attachment *dmabuf_attach);
>> +		    struct dma_buf_attachment *attach);
>>   
>>   struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>>   
>> @@ -409,6 +455,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
>>   					enum dma_data_direction);
>>   void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
>>   				enum dma_data_direction);
>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>>   int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>>   			     enum dma_data_direction dir);
>>   int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>> -- 
>> 2.17.1
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-11 10:53     ` Christian König
@ 2019-09-16 12:23       ` Christian König
  2019-09-17 12:31         ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-16 12:23 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Ping? Any further comment on this or can't we merge at least the locking 
change?

Christian.

Am 11.09.19 um 12:53 schrieb Christian König:
> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>>> This patch is a stripped down version of the locking changes
>>> necessary to support dynamic DMA-buf handling.
>>>
>>> For compatibility we cache the DMA-buf mapping as soon as
>>> exporter/importer disagree on the dynamic handling.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/dma-buf/dma-buf.c | 90 
>>> ++++++++++++++++++++++++++++++++++++---
>>>   include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>>   2 files changed, 133 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 433d91d710e4..65052d52602b 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct 
>>> dma_buf_export_info *exp_info)
>>>           return ERR_PTR(-EINVAL);
>>>       }
>>>   +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>>> +            exp_info->ops->dynamic_mapping))
>>> +        return ERR_PTR(-EINVAL);
>>> +
>>>       if (!try_module_get(exp_info->owner))
>>>           return ERR_PTR(-ENOENT);
>>>   @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>>   EXPORT_SYMBOL_GPL(dma_buf_put);
>>>     /**
>>> - * dma_buf_attach - Add the device to dma_buf's attachments list; 
>>> optionally,
>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's attachments 
>>> list; optionally,
>>>    * calls attach() of dma_buf_ops to allow device-specific attach 
>>> functionality
>>> - * @dmabuf:    [in]    buffer to attach device to.
>>> - * @dev:    [in]    device to be attached.
>>> + * @dmabuf:        [in]    buffer to attach device to.
>>> + * @dev:        [in]    device to be attached.
>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
>>>    *
>>>    * Returns struct dma_buf_attachment pointer for this attachment. 
>>> Attachments
>>>    * must be cleaned up by calling dma_buf_detach().
>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>>    * accessible to @dev, and cannot be moved to a more suitable 
>>> place. This is
>>>    * indicated with the error code -EBUSY.
>>>    */
>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>> -                      struct device *dev)
>>> +struct dma_buf_attachment *
>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>> +               bool dynamic_mapping)
>>>   {
>>>       struct dma_buf_attachment *attach;
>>>       int ret;
>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct 
>>> dma_buf *dmabuf,
>>>         attach->dev = dev;
>>>       attach->dmabuf = dmabuf;
>>> +    attach->dynamic_mapping = dynamic_mapping;
>>>         mutex_lock(&dmabuf->lock);
>>>   @@ -685,16 +692,64 @@ struct dma_buf_attachment 
>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>           if (ret)
>>>               goto err_attach;
>>>       }
>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>       list_add(&attach->node, &dmabuf->attachments);
>>> +    dma_resv_unlock(dmabuf->resv);
>>>         mutex_unlock(&dmabuf->lock);
>>>   +    /* When either the importer or the exporter can't handle dynamic
>>> +     * mappings we cache the mapping here to avoid issues with the
>>> +     * reservation object lock.
>>> +     */
>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>> +        dma_buf_is_dynamic(dmabuf)) {
>>> +        struct sg_table *sgt;
>>> +
>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>> +
>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>> locking pain, but apparently upsets the arm-soc folks who want to 
>> control
>> this better.
>
> Take another look at dma_buf_map_attachment(), we still try to get the 
> caching there for ARM.
>
> What we do here is to bidirectionally map the buffer to avoid the 
> locking hydra when importer and exporter disagree on locking.
>
> So the ARM folks can easily avoid that by switching to dynamic locking 
> for both.
>
> Regards,
> Christian.
>
>> That's why your previous version moved the caching into
>> map/unmap_sg, which resurrected the locking hydra.
>>
>> I think we're going a bit in circles here, and I don't have a good idea
>> either :-/
>> -Daniel
>>
>>> +        if (!sgt)
>>> +            sgt = ERR_PTR(-ENOMEM);
>>> +        if (IS_ERR(sgt)) {
>>> +            ret = PTR_ERR(sgt);
>>> +            goto err_unlock;
>>> +        }
>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>> +        attach->sgt = sgt;
>>> +        attach->dir = DMA_BIDIRECTIONAL;
>>> +    }
>>> +
>>>       return attach;
>>>     err_attach:
>>>       kfree(attach);
>>>       mutex_unlock(&dmabuf->lock);
>>>       return ERR_PTR(ret);
>>> +
>>> +err_unlock:
>>> +    if (dma_buf_is_dynamic(attach->dmabuf))
>>> +        dma_resv_unlock(attach->dmabuf->resv);
>>> +
>>> +    dma_buf_detach(dmabuf, attach);
>>> +    return ERR_PTR(ret);
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
>>> +
>>> +/**
>>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
>>> + * @dmabuf:    [in]    buffer to attach device to.
>>> + * @dev:    [in]    device to be attached.
>>> + *
>>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which still 
>>> use a static
>>> + * mapping.
>>> + */
>>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>> +                      struct device *dev)
>>> +{
>>> +    return dma_buf_dynamic_attach(dmabuf, dev, false);
>>>   }
>>>   EXPORT_SYMBOL_GPL(dma_buf_attach);
>>>   @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf *dmabuf, 
>>> struct dma_buf_attachment *attach)
>>>       if (WARN_ON(!dmabuf || !attach))
>>>           return;
>>>   -    if (attach->sgt)
>>> +    if (attach->sgt) {
>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>> +
>>>           dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>>>   +        if (dma_buf_is_dynamic(attach->dmabuf))
>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>> +    }
>>> +
>>>       mutex_lock(&dmabuf->lock);
>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>       list_del(&attach->node);
>>> +    dma_resv_unlock(dmabuf->resv);
>>>       if (dmabuf->ops->detach)
>>>           dmabuf->ops->detach(dmabuf, attach);
>>>   @@ -749,6 +813,9 @@ struct sg_table *dma_buf_map_attachment(struct 
>>> dma_buf_attachment *attach,
>>>       if (WARN_ON(!attach || !attach->dmabuf))
>>>           return ERR_PTR(-EINVAL);
>>>   +    if (dma_buf_attachment_is_dynamic(attach))
>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>> +
>>>       if (attach->sgt) {
>>>           /*
>>>            * Two mappings with different directions for the same
>>> @@ -761,6 +828,9 @@ struct sg_table *dma_buf_map_attachment(struct 
>>> dma_buf_attachment *attach,
>>>           return attach->sgt;
>>>       }
>>>   +    if (dma_buf_is_dynamic(attach->dmabuf))
>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>> +
>>>       sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>>>       if (!sg_table)
>>>           sg_table = ERR_PTR(-ENOMEM);
>>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct 
>>> dma_buf_attachment *attach,
>>>       if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>>>           return;
>>>   +    if (dma_buf_attachment_is_dynamic(attach))
>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>> +
>>>       if (attach->sgt == sg_table)
>>>           return;
>>>   +    if (dma_buf_is_dynamic(attach->dmabuf))
>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>> +
>>>       attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>>>   }
>>>   EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct 
>>> seq_file *s, void *unused)
>>>           seq_puts(s, "\tAttached Devices:\n");
>>>           attach_count = 0;
>>>   +        dma_resv_lock(buf_obj->resv, NULL);
>>>           list_for_each_entry(attach_obj, &buf_obj->attachments, 
>>> node) {
>>>               seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>>>               attach_count++;
>>>           }
>>> +        dma_resv_unlock(buf_obj->resv);
>>>             seq_printf(s, "Total %d devices attached\n\n",
>>>                   attach_count);
>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>> index ec212cb27fdc..a8f8b2b812fd 100644
>>> --- a/include/linux/dma-buf.h
>>> +++ b/include/linux/dma-buf.h
>>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>>>         */
>>>       bool cache_sgt_mapping;
>>>   +    /**
>>> +     * @dynamic_mapping:
>>> +     *
>>> +     * If true the framework makes sure that the map/unmap_dma_buf
>>> +     * callbacks are always called with the dma_resv object locked.
>>> +     *
>>> +     * If false the framework makes ure that the map/unmap_dma_buf
>>> +     * callbacks are always called without the dma_resv object locked.
>>> +     */
>>> +    bool dynamic_mapping;
>>> +
>>>       /**
>>>        * @attach:
>>>        *
>>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>>>        * any other kind of sharing that the exporter might wish to make
>>>        * available to buffer-users.
>>>        *
>>> +     * This is always called with the dmabuf->resv object locked when
>>> +     * the dynamic_mapping flag is true.
>>> +     *
>>>        * Returns:
>>>        *
>>>        * A &sg_table scatter list of or the backing storage of the 
>>> DMA buffer,
>>> @@ -327,6 +341,8 @@ struct dma_buf {
>>>    * @sgt: cached mapping.
>>>    * @dir: direction of cached mapping.
>>>    * @priv: exporter specific attachment data.
>>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is 
>>> called with the
>>> + * dma_resv lock held.
>>>    *
>>>    * This structure holds the attachment information between the 
>>> dma_buf buffer
>>>    * and its user device(s). The list contains one attachment struct 
>>> per device
>>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>>>       struct list_head node;
>>>       struct sg_table *sgt;
>>>       enum dma_data_direction dir;
>>> +    bool dynamic_mapping;
>>>       void *priv;
>>>   };
>>>   @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct 
>>> dma_buf *dmabuf)
>>>       get_file(dmabuf->file);
>>>   }
>>>   +/**
>>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
>>> + * @dmabuf: the DMA-buf to check
>>> + *
>>> + * Returns true if a DMA-buf exporter wants to be called with the 
>>> dma_resv
>>> + * locked, false if it doesn't wants to be called with the lock held.
>>> + */
>>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>>> +{
>>> +    return dmabuf->ops->dynamic_mapping;
>>> +}
>>> +
>>> +/**
>>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment 
>>> uses dynamic
>>> + * mappinsg
>>> + * @attach: the DMA-buf attachment to check
>>> + *
>>> + * Returns true if a DMA-buf importer wants to call the map/unmap 
>>> functions with
>>> + * the dma_resv lock held.
>>> + */
>>> +static inline bool
>>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>>> +{
>>> +    return attach->dynamic_mapping;
>>> +}
>>> +
>>>   struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>> -                            struct device *dev);
>>> +                      struct device *dev);
>>> +struct dma_buf_attachment *
>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>> +               bool dynamic_mapping);
>>>   void dma_buf_detach(struct dma_buf *dmabuf,
>>> -                struct dma_buf_attachment *dmabuf_attach);
>>> +            struct dma_buf_attachment *attach);
>>>     struct dma_buf *dma_buf_export(const struct dma_buf_export_info 
>>> *exp_info);
>>>   @@ -409,6 +455,7 @@ struct sg_table *dma_buf_map_attachment(struct 
>>> dma_buf_attachment *,
>>>                       enum dma_data_direction);
>>>   void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct 
>>> sg_table *,
>>>                   enum dma_data_direction);
>>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>>>   int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>>>                    enum dma_data_direction dir);
>>>   int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>>> -- 
>>> 2.17.1
>>>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-16 12:23       ` Christian König
@ 2019-09-17 12:31         ` Daniel Vetter
  2019-09-17 12:40           ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-09-17 12:31 UTC (permalink / raw)
  To: christian.koenig
  Cc: Daniel Vetter, dri-devel, sumit.semwal, linaro-mm-sig,
	linux-media, intel-gfx

On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
> Ping? Any further comment on this or can't we merge at least the locking
> change?

I was at plumbers ...
> 
> Christian.
> 
> Am 11.09.19 um 12:53 schrieb Christian König:
> > Am 03.09.19 um 10:05 schrieb Daniel Vetter:
> > > On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> > > > This patch is a stripped down version of the locking changes
> > > > necessary to support dynamic DMA-buf handling.
> > > > 
> > > > For compatibility we cache the DMA-buf mapping as soon as
> > > > exporter/importer disagree on the dynamic handling.
> > > > 
> > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > ---
> > > >   drivers/dma-buf/dma-buf.c | 90
> > > > ++++++++++++++++++++++++++++++++++++---
> > > >   include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
> > > >   2 files changed, 133 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > > index 433d91d710e4..65052d52602b 100644
> > > > --- a/drivers/dma-buf/dma-buf.c
> > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
> > > > dma_buf_export_info *exp_info)
> > > >           return ERR_PTR(-EINVAL);
> > > >       }
> > > >   +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> > > > +            exp_info->ops->dynamic_mapping))
> > > > +        return ERR_PTR(-EINVAL);
> > > > +
> > > >       if (!try_module_get(exp_info->owner))
> > > >           return ERR_PTR(-ENOENT);
> > > >   @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
> > > >   EXPORT_SYMBOL_GPL(dma_buf_put);
> > > >     /**
> > > > - * dma_buf_attach - Add the device to dma_buf's attachments
> > > > list; optionally,
> > > > + * dma_buf_dynamic_attach - Add the device to dma_buf's
> > > > attachments list; optionally,
> > > >    * calls attach() of dma_buf_ops to allow device-specific
> > > > attach functionality
> > > > - * @dmabuf:    [in]    buffer to attach device to.
> > > > - * @dev:    [in]    device to be attached.
> > > > + * @dmabuf:        [in]    buffer to attach device to.
> > > > + * @dev:        [in]    device to be attached.
> > > > + * @dynamic_mapping:    [in]    calling convention for map/unmap
> > > >    *
> > > >    * Returns struct dma_buf_attachment pointer for this
> > > > attachment. Attachments
> > > >    * must be cleaned up by calling dma_buf_detach().
> > > > @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
> > > >    * accessible to @dev, and cannot be moved to a more suitable
> > > > place. This is
> > > >    * indicated with the error code -EBUSY.
> > > >    */
> > > > -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> > > > -                      struct device *dev)
> > > > +struct dma_buf_attachment *
> > > > +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> > > > +               bool dynamic_mapping)
> > > >   {
> > > >       struct dma_buf_attachment *attach;
> > > >       int ret;
> > > > @@ -677,6 +683,7 @@ struct dma_buf_attachment
> > > > *dma_buf_attach(struct dma_buf *dmabuf,
> > > >         attach->dev = dev;
> > > >       attach->dmabuf = dmabuf;
> > > > +    attach->dynamic_mapping = dynamic_mapping;
> > > >         mutex_lock(&dmabuf->lock);
> > > >   @@ -685,16 +692,64 @@ struct dma_buf_attachment
> > > > *dma_buf_attach(struct dma_buf *dmabuf,
> > > >           if (ret)
> > > >               goto err_attach;
> > > >       }
> > > > +    dma_resv_lock(dmabuf->resv, NULL);
> > > >       list_add(&attach->node, &dmabuf->attachments);
> > > > +    dma_resv_unlock(dmabuf->resv);
> > > >         mutex_unlock(&dmabuf->lock);
> > > >   +    /* When either the importer or the exporter can't handle dynamic
> > > > +     * mappings we cache the mapping here to avoid issues with the
> > > > +     * reservation object lock.
> > > > +     */
> > > > +    if (dma_buf_attachment_is_dynamic(attach) !=
> > > > +        dma_buf_is_dynamic(dmabuf)) {
> > > > +        struct sg_table *sgt;
> > > > +
> > > > +        if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +            dma_resv_lock(attach->dmabuf->resv, NULL);
> > > > +
> > > > +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> > > Now we're back to enforcing DMA_BIDI, which works nicely around the
> > > locking pain, but apparently upsets the arm-soc folks who want to
> > > control
> > > this better.
> > 
> > Take another look at dma_buf_map_attachment(), we still try to get the
> > caching there for ARM.
> > 
> > What we do here is to bidirectionally map the buffer to avoid the
> > locking hydra when importer and exporter disagree on locking.
> > 
> > So the ARM folks can easily avoid that by switching to dynamic locking
> > for both.

So you still break the contract between importer and exporter, except not
for anything that's run in intel-gfx-ci so all is good?

The other issue with "we solve this with caching the mapping": Currently
map/unmap flush (at least on arm, at least on cases where it matters). If
you just return the cached sg, then we don't do the flushing anymore,
which might break importers/exporters in exactly the same way as just
giving them the wrong mapping. There's zero differences between a BIDI,
TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
cache flushing.

So here's something that could actually work:
- We cache the mapping.
- We cache a bidirectional mapping.
- We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
  to give current importers/exporters the same behaviour they're used to
  now.

And yes the caching we've lifted might have broken something somewhere
already. But generally you only hear about that long time after because
arm vendors roll forward once every few years. Or something like that.
-Daniel

> > 
> > Regards,
> > Christian.
> > 
> > > That's why your previous version moved the caching into
> > > map/unmap_sg, which resurrected the locking hydra.
> > > 
> > > I think we're going a bit in circles here, and I don't have a good idea
> > > either :-/
> > > -Daniel
> > > 
> > > > +        if (!sgt)
> > > > +            sgt = ERR_PTR(-ENOMEM);
> > > > +        if (IS_ERR(sgt)) {
> > > > +            ret = PTR_ERR(sgt);
> > > > +            goto err_unlock;
> > > > +        }
> > > > +        if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +            dma_resv_unlock(attach->dmabuf->resv);
> > > > +        attach->sgt = sgt;
> > > > +        attach->dir = DMA_BIDIRECTIONAL;
> > > > +    }
> > > > +
> > > >       return attach;
> > > >     err_attach:
> > > >       kfree(attach);
> > > >       mutex_unlock(&dmabuf->lock);
> > > >       return ERR_PTR(ret);
> > > > +
> > > > +err_unlock:
> > > > +    if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +        dma_resv_unlock(attach->dmabuf->resv);
> > > > +
> > > > +    dma_buf_detach(dmabuf, attach);
> > > > +    return ERR_PTR(ret);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
> > > > +
> > > > +/**
> > > > + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
> > > > + * @dmabuf:    [in]    buffer to attach device to.
> > > > + * @dev:    [in]    device to be attached.
> > > > + *
> > > > + * Wrapper to call dma_buf_dynamic_attach() for drivers which
> > > > still use a static
> > > > + * mapping.
> > > > + */
> > > > +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> > > > +                      struct device *dev)
> > > > +{
> > > > +    return dma_buf_dynamic_attach(dmabuf, dev, false);
> > > >   }
> > > >   EXPORT_SYMBOL_GPL(dma_buf_attach);
> > > >   @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf
> > > > *dmabuf, struct dma_buf_attachment *attach)
> > > >       if (WARN_ON(!dmabuf || !attach))
> > > >           return;
> > > >   -    if (attach->sgt)
> > > > +    if (attach->sgt) {
> > > > +        if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +            dma_resv_lock(attach->dmabuf->resv, NULL);
> > > > +
> > > >           dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
> > > >   +        if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +            dma_resv_unlock(attach->dmabuf->resv);
> > > > +    }
> > > > +
> > > >       mutex_lock(&dmabuf->lock);
> > > > +    dma_resv_lock(dmabuf->resv, NULL);
> > > >       list_del(&attach->node);
> > > > +    dma_resv_unlock(dmabuf->resv);
> > > >       if (dmabuf->ops->detach)
> > > >           dmabuf->ops->detach(dmabuf, attach);
> > > >   @@ -749,6 +813,9 @@ struct sg_table
> > > > *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> > > >       if (WARN_ON(!attach || !attach->dmabuf))
> > > >           return ERR_PTR(-EINVAL);
> > > >   +    if (dma_buf_attachment_is_dynamic(attach))
> > > > +        dma_resv_assert_held(attach->dmabuf->resv);
> > > > +
> > > >       if (attach->sgt) {
> > > >           /*
> > > >            * Two mappings with different directions for the same
> > > > @@ -761,6 +828,9 @@ struct sg_table
> > > > *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> > > >           return attach->sgt;
> > > >       }
> > > >   +    if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +        dma_resv_assert_held(attach->dmabuf->resv);
> > > > +
> > > >       sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
> > > >       if (!sg_table)
> > > >           sg_table = ERR_PTR(-ENOMEM);
> > > > @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct
> > > > dma_buf_attachment *attach,
> > > >       if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
> > > >           return;
> > > >   +    if (dma_buf_attachment_is_dynamic(attach))
> > > > +        dma_resv_assert_held(attach->dmabuf->resv);
> > > > +
> > > >       if (attach->sgt == sg_table)
> > > >           return;
> > > >   +    if (dma_buf_is_dynamic(attach->dmabuf))
> > > > +        dma_resv_assert_held(attach->dmabuf->resv);
> > > > +
> > > >       attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
> > > >   }
> > > >   EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> > > > @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct
> > > > seq_file *s, void *unused)
> > > >           seq_puts(s, "\tAttached Devices:\n");
> > > >           attach_count = 0;
> > > >   +        dma_resv_lock(buf_obj->resv, NULL);
> > > >           list_for_each_entry(attach_obj, &buf_obj->attachments,
> > > > node) {
> > > >               seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
> > > >               attach_count++;
> > > >           }
> > > > +        dma_resv_unlock(buf_obj->resv);
> > > >             seq_printf(s, "Total %d devices attached\n\n",
> > > >                   attach_count);
> > > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > > > index ec212cb27fdc..a8f8b2b812fd 100644
> > > > --- a/include/linux/dma-buf.h
> > > > +++ b/include/linux/dma-buf.h
> > > > @@ -42,6 +42,17 @@ struct dma_buf_ops {
> > > >         */
> > > >       bool cache_sgt_mapping;
> > > >   +    /**
> > > > +     * @dynamic_mapping:
> > > > +     *
> > > > +     * If true the framework makes sure that the map/unmap_dma_buf
> > > > +     * callbacks are always called with the dma_resv object locked.
> > > > +     *
> > > > +     * If false the framework makes ure that the map/unmap_dma_buf
> > > > +     * callbacks are always called without the dma_resv object locked.
> > > > +     */
> > > > +    bool dynamic_mapping;
> > > > +
> > > >       /**
> > > >        * @attach:
> > > >        *
> > > > @@ -109,6 +120,9 @@ struct dma_buf_ops {
> > > >        * any other kind of sharing that the exporter might wish to make
> > > >        * available to buffer-users.
> > > >        *
> > > > +     * This is always called with the dmabuf->resv object locked when
> > > > +     * the dynamic_mapping flag is true.
> > > > +     *
> > > >        * Returns:
> > > >        *
> > > >        * A &sg_table scatter list of or the backing storage of
> > > > the DMA buffer,
> > > > @@ -327,6 +341,8 @@ struct dma_buf {
> > > >    * @sgt: cached mapping.
> > > >    * @dir: direction of cached mapping.
> > > >    * @priv: exporter specific attachment data.
> > > > + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is
> > > > called with the
> > > > + * dma_resv lock held.
> > > >    *
> > > >    * This structure holds the attachment information between the
> > > > dma_buf buffer
> > > >    * and its user device(s). The list contains one attachment
> > > > struct per device
> > > > @@ -343,6 +359,7 @@ struct dma_buf_attachment {
> > > >       struct list_head node;
> > > >       struct sg_table *sgt;
> > > >       enum dma_data_direction dir;
> > > > +    bool dynamic_mapping;
> > > >       void *priv;
> > > >   };
> > > >   @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct
> > > > dma_buf *dmabuf)
> > > >       get_file(dmabuf->file);
> > > >   }
> > > >   +/**
> > > > + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> > > > + * @dmabuf: the DMA-buf to check
> > > > + *
> > > > + * Returns true if a DMA-buf exporter wants to be called with
> > > > the dma_resv
> > > > + * locked, false if it doesn't wants to be called with the lock held.
> > > > + */
> > > > +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> > > > +{
> > > > +    return dmabuf->ops->dynamic_mapping;
> > > > +}
> > > > +
> > > > +/**
> > > > + * dma_buf_attachment_is_dynamic - check if a DMA-buf
> > > > attachment uses dynamic
> > > > + * mappinsg
> > > > + * @attach: the DMA-buf attachment to check
> > > > + *
> > > > + * Returns true if a DMA-buf importer wants to call the
> > > > map/unmap functions with
> > > > + * the dma_resv lock held.
> > > > + */
> > > > +static inline bool
> > > > +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> > > > +{
> > > > +    return attach->dynamic_mapping;
> > > > +}
> > > > +
> > > >   struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> > > > -                            struct device *dev);
> > > > +                      struct device *dev);
> > > > +struct dma_buf_attachment *
> > > > +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> > > > +               bool dynamic_mapping);
> > > >   void dma_buf_detach(struct dma_buf *dmabuf,
> > > > -                struct dma_buf_attachment *dmabuf_attach);
> > > > +            struct dma_buf_attachment *attach);
> > > >     struct dma_buf *dma_buf_export(const struct
> > > > dma_buf_export_info *exp_info);
> > > >   @@ -409,6 +455,7 @@ struct sg_table
> > > > *dma_buf_map_attachment(struct dma_buf_attachment *,
> > > >                       enum dma_data_direction);
> > > >   void dma_buf_unmap_attachment(struct dma_buf_attachment *,
> > > > struct sg_table *,
> > > >                   enum dma_data_direction);
> > > > +void dma_buf_move_notify(struct dma_buf *dma_buf);
> > > >   int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> > > >                    enum dma_data_direction dir);
> > > >   int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> > > > -- 
> > > > 2.17.1
> > > > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 12:31         ` Daniel Vetter
@ 2019-09-17 12:40           ` Koenig, Christian
  2019-09-17 13:13             ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-09-17 12:40 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 17.09.19 um 14:31 schrieb Daniel Vetter:
> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
>> Ping? Any further comment on this or can't we merge at least the locking
>> change?
> I was at plumbers ...
>> Christian.
>>
>> Am 11.09.19 um 12:53 schrieb Christian König:
>>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>>>>> This patch is a stripped down version of the locking changes
>>>>> necessary to support dynamic DMA-buf handling.
>>>>>
>>>>> For compatibility we cache the DMA-buf mapping as soon as
>>>>> exporter/importer disagree on the dynamic handling.
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>>    drivers/dma-buf/dma-buf.c | 90
>>>>> ++++++++++++++++++++++++++++++++++++---
>>>>>    include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>>>>    2 files changed, 133 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>> index 433d91d710e4..65052d52602b 100644
>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
>>>>> dma_buf_export_info *exp_info)
>>>>>            return ERR_PTR(-EINVAL);
>>>>>        }
>>>>>    +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>>>>> +            exp_info->ops->dynamic_mapping))
>>>>> +        return ERR_PTR(-EINVAL);
>>>>> +
>>>>>        if (!try_module_get(exp_info->owner))
>>>>>            return ERR_PTR(-ENOENT);
>>>>>    @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>>>>    EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>      /**
>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
>>>>> list; optionally,
>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
>>>>> attachments list; optionally,
>>>>>     * calls attach() of dma_buf_ops to allow device-specific
>>>>> attach functionality
>>>>> - * @dmabuf:    [in]    buffer to attach device to.
>>>>> - * @dev:    [in]    device to be attached.
>>>>> + * @dmabuf:        [in]    buffer to attach device to.
>>>>> + * @dev:        [in]    device to be attached.
>>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
>>>>>     *
>>>>>     * Returns struct dma_buf_attachment pointer for this
>>>>> attachment. Attachments
>>>>>     * must be cleaned up by calling dma_buf_detach().
>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>     * accessible to @dev, and cannot be moved to a more suitable
>>>>> place. This is
>>>>>     * indicated with the error code -EBUSY.
>>>>>     */
>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>> -                      struct device *dev)
>>>>> +struct dma_buf_attachment *
>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>> +               bool dynamic_mapping)
>>>>>    {
>>>>>        struct dma_buf_attachment *attach;
>>>>>        int ret;
>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>          attach->dev = dev;
>>>>>        attach->dmabuf = dmabuf;
>>>>> +    attach->dynamic_mapping = dynamic_mapping;
>>>>>          mutex_lock(&dmabuf->lock);
>>>>>    @@ -685,16 +692,64 @@ struct dma_buf_attachment
>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>            if (ret)
>>>>>                goto err_attach;
>>>>>        }
>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>        list_add(&attach->node, &dmabuf->attachments);
>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>          mutex_unlock(&dmabuf->lock);
>>>>>    +    /* When either the importer or the exporter can't handle dynamic
>>>>> +     * mappings we cache the mapping here to avoid issues with the
>>>>> +     * reservation object lock.
>>>>> +     */
>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>> +        struct sg_table *sgt;
>>>>> +
>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>> +
>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>>>> locking pain, but apparently upsets the arm-soc folks who want to
>>>> control
>>>> this better.
>>> Take another look at dma_buf_map_attachment(), we still try to get the
>>> caching there for ARM.
>>>
>>> What we do here is to bidirectionally map the buffer to avoid the
>>> locking hydra when importer and exporter disagree on locking.
>>>
>>> So the ARM folks can easily avoid that by switching to dynamic locking
>>> for both.
> So you still break the contract between importer and exporter, except not
> for anything that's run in intel-gfx-ci so all is good?

No, the contract between importer and exporter stays exactly the same it 
is currently as long as you don't switch to dynamic dma-buf handling.

There is no functional change for the ARM folks here. The only change 
which takes effect is between i915 and amdgpu and that is perfectly 
covered by intel-gfx-ci.

Regards,
Christian.

>
> The other issue with "we solve this with caching the mapping": Currently
> map/unmap flush (at least on arm, at least on cases where it matters). If
> you just return the cached sg, then we don't do the flushing anymore,
> which might break importers/exporters in exactly the same way as just
> giving them the wrong mapping. There's zero differences between a BIDI,
> TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
> cache flushing.
>
> So here's something that could actually work:
> - We cache the mapping.
> - We cache a bidirectional mapping.
> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
>    to give current importers/exporters the same behaviour they're used to
>    now.
>
> And yes the caching we've lifted might have broken something somewhere
> already. But generally you only hear about that long time after because
> arm vendors roll forward once every few years. Or something like that.
> -Daniel
>
>>> Regards,
>>> Christian.
>>>
>>>> That's why your previous version moved the caching into
>>>> map/unmap_sg, which resurrected the locking hydra.
>>>>
>>>> I think we're going a bit in circles here, and I don't have a good idea
>>>> either :-/
>>>> -Daniel
>>>>
>>>>> +        if (!sgt)
>>>>> +            sgt = ERR_PTR(-ENOMEM);
>>>>> +        if (IS_ERR(sgt)) {
>>>>> +            ret = PTR_ERR(sgt);
>>>>> +            goto err_unlock;
>>>>> +        }
>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>>>> +        attach->sgt = sgt;
>>>>> +        attach->dir = DMA_BIDIRECTIONAL;
>>>>> +    }
>>>>> +
>>>>>        return attach;
>>>>>      err_attach:
>>>>>        kfree(attach);
>>>>>        mutex_unlock(&dmabuf->lock);
>>>>>        return ERR_PTR(ret);
>>>>> +
>>>>> +err_unlock:
>>>>> +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +        dma_resv_unlock(attach->dmabuf->resv);
>>>>> +
>>>>> +    dma_buf_detach(dmabuf, attach);
>>>>> +    return ERR_PTR(ret);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
>>>>> +
>>>>> +/**
>>>>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
>>>>> + * @dmabuf:    [in]    buffer to attach device to.
>>>>> + * @dev:    [in]    device to be attached.
>>>>> + *
>>>>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which
>>>>> still use a static
>>>>> + * mapping.
>>>>> + */
>>>>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>> +                      struct device *dev)
>>>>> +{
>>>>> +    return dma_buf_dynamic_attach(dmabuf, dev, false);
>>>>>    }
>>>>>    EXPORT_SYMBOL_GPL(dma_buf_attach);
>>>>>    @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf
>>>>> *dmabuf, struct dma_buf_attachment *attach)
>>>>>        if (WARN_ON(!dmabuf || !attach))
>>>>>            return;
>>>>>    -    if (attach->sgt)
>>>>> +    if (attach->sgt) {
>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>> +
>>>>>            dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>>>>>    +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>>>> +    }
>>>>> +
>>>>>        mutex_lock(&dmabuf->lock);
>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>        list_del(&attach->node);
>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>        if (dmabuf->ops->detach)
>>>>>            dmabuf->ops->detach(dmabuf, attach);
>>>>>    @@ -749,6 +813,9 @@ struct sg_table
>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>>>>        if (WARN_ON(!attach || !attach->dmabuf))
>>>>>            return ERR_PTR(-EINVAL);
>>>>>    +    if (dma_buf_attachment_is_dynamic(attach))
>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>> +
>>>>>        if (attach->sgt) {
>>>>>            /*
>>>>>             * Two mappings with different directions for the same
>>>>> @@ -761,6 +828,9 @@ struct sg_table
>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>>>>            return attach->sgt;
>>>>>        }
>>>>>    +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>> +
>>>>>        sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>>>>>        if (!sg_table)
>>>>>            sg_table = ERR_PTR(-ENOMEM);
>>>>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct
>>>>> dma_buf_attachment *attach,
>>>>>        if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>>>>>            return;
>>>>>    +    if (dma_buf_attachment_is_dynamic(attach))
>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>> +
>>>>>        if (attach->sgt == sg_table)
>>>>>            return;
>>>>>    +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>> +
>>>>>        attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>>>>>    }
>>>>>    EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>>>>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct
>>>>> seq_file *s, void *unused)
>>>>>            seq_puts(s, "\tAttached Devices:\n");
>>>>>            attach_count = 0;
>>>>>    +        dma_resv_lock(buf_obj->resv, NULL);
>>>>>            list_for_each_entry(attach_obj, &buf_obj->attachments,
>>>>> node) {
>>>>>                seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>>>>>                attach_count++;
>>>>>            }
>>>>> +        dma_resv_unlock(buf_obj->resv);
>>>>>              seq_printf(s, "Total %d devices attached\n\n",
>>>>>                    attach_count);
>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>>>> index ec212cb27fdc..a8f8b2b812fd 100644
>>>>> --- a/include/linux/dma-buf.h
>>>>> +++ b/include/linux/dma-buf.h
>>>>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>>>>>          */
>>>>>        bool cache_sgt_mapping;
>>>>>    +    /**
>>>>> +     * @dynamic_mapping:
>>>>> +     *
>>>>> +     * If true the framework makes sure that the map/unmap_dma_buf
>>>>> +     * callbacks are always called with the dma_resv object locked.
>>>>> +     *
>>>>> +     * If false the framework makes ure that the map/unmap_dma_buf
>>>>> +     * callbacks are always called without the dma_resv object locked.
>>>>> +     */
>>>>> +    bool dynamic_mapping;
>>>>> +
>>>>>        /**
>>>>>         * @attach:
>>>>>         *
>>>>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>>>>>         * any other kind of sharing that the exporter might wish to make
>>>>>         * available to buffer-users.
>>>>>         *
>>>>> +     * This is always called with the dmabuf->resv object locked when
>>>>> +     * the dynamic_mapping flag is true.
>>>>> +     *
>>>>>         * Returns:
>>>>>         *
>>>>>         * A &sg_table scatter list of or the backing storage of
>>>>> the DMA buffer,
>>>>> @@ -327,6 +341,8 @@ struct dma_buf {
>>>>>     * @sgt: cached mapping.
>>>>>     * @dir: direction of cached mapping.
>>>>>     * @priv: exporter specific attachment data.
>>>>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is
>>>>> called with the
>>>>> + * dma_resv lock held.
>>>>>     *
>>>>>     * This structure holds the attachment information between the
>>>>> dma_buf buffer
>>>>>     * and its user device(s). The list contains one attachment
>>>>> struct per device
>>>>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>>>>>        struct list_head node;
>>>>>        struct sg_table *sgt;
>>>>>        enum dma_data_direction dir;
>>>>> +    bool dynamic_mapping;
>>>>>        void *priv;
>>>>>    };
>>>>>    @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct
>>>>> dma_buf *dmabuf)
>>>>>        get_file(dmabuf->file);
>>>>>    }
>>>>>    +/**
>>>>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
>>>>> + * @dmabuf: the DMA-buf to check
>>>>> + *
>>>>> + * Returns true if a DMA-buf exporter wants to be called with
>>>>> the dma_resv
>>>>> + * locked, false if it doesn't wants to be called with the lock held.
>>>>> + */
>>>>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>>>>> +{
>>>>> +    return dmabuf->ops->dynamic_mapping;
>>>>> +}
>>>>> +
>>>>> +/**
>>>>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf
>>>>> attachment uses dynamic
>>>>> + * mappinsg
>>>>> + * @attach: the DMA-buf attachment to check
>>>>> + *
>>>>> + * Returns true if a DMA-buf importer wants to call the
>>>>> map/unmap functions with
>>>>> + * the dma_resv lock held.
>>>>> + */
>>>>> +static inline bool
>>>>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>>>>> +{
>>>>> +    return attach->dynamic_mapping;
>>>>> +}
>>>>> +
>>>>>    struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>> -                            struct device *dev);
>>>>> +                      struct device *dev);
>>>>> +struct dma_buf_attachment *
>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>> +               bool dynamic_mapping);
>>>>>    void dma_buf_detach(struct dma_buf *dmabuf,
>>>>> -                struct dma_buf_attachment *dmabuf_attach);
>>>>> +            struct dma_buf_attachment *attach);
>>>>>      struct dma_buf *dma_buf_export(const struct
>>>>> dma_buf_export_info *exp_info);
>>>>>    @@ -409,6 +455,7 @@ struct sg_table
>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *,
>>>>>                        enum dma_data_direction);
>>>>>    void dma_buf_unmap_attachment(struct dma_buf_attachment *,
>>>>> struct sg_table *,
>>>>>                    enum dma_data_direction);
>>>>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>>>>>    int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>>>>>                     enum dma_data_direction dir);
>>>>>    int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>>>>> -- 
>>>>> 2.17.1
>>>>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 12:40           ` Koenig, Christian
@ 2019-09-17 13:13             ` Daniel Vetter
  2019-09-17 13:24               ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-09-17 13:13 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Daniel Vetter, dri-devel, sumit.semwal, linaro-mm-sig,
	linux-media, intel-gfx

On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
> Am 17.09.19 um 14:31 schrieb Daniel Vetter:
> > On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
> >> Ping? Any further comment on this or can't we merge at least the locking
> >> change?
> > I was at plumbers ...
> >> Christian.
> >>
> >> Am 11.09.19 um 12:53 schrieb Christian König:
> >>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
> >>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> >>>>> This patch is a stripped down version of the locking changes
> >>>>> necessary to support dynamic DMA-buf handling.
> >>>>>
> >>>>> For compatibility we cache the DMA-buf mapping as soon as
> >>>>> exporter/importer disagree on the dynamic handling.
> >>>>>
> >>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>> ---
> >>>>>    drivers/dma-buf/dma-buf.c | 90
> >>>>> ++++++++++++++++++++++++++++++++++++---
> >>>>>    include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
> >>>>>    2 files changed, 133 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>>>> index 433d91d710e4..65052d52602b 100644
> >>>>> --- a/drivers/dma-buf/dma-buf.c
> >>>>> +++ b/drivers/dma-buf/dma-buf.c
> >>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
> >>>>> dma_buf_export_info *exp_info)
> >>>>>            return ERR_PTR(-EINVAL);
> >>>>>        }
> >>>>>    +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> >>>>> +            exp_info->ops->dynamic_mapping))
> >>>>> +        return ERR_PTR(-EINVAL);
> >>>>> +
> >>>>>        if (!try_module_get(exp_info->owner))
> >>>>>            return ERR_PTR(-ENOENT);
> >>>>>    @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
> >>>>>    EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>      /**
> >>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
> >>>>> list; optionally,
> >>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
> >>>>> attachments list; optionally,
> >>>>>     * calls attach() of dma_buf_ops to allow device-specific
> >>>>> attach functionality
> >>>>> - * @dmabuf:    [in]    buffer to attach device to.
> >>>>> - * @dev:    [in]    device to be attached.
> >>>>> + * @dmabuf:        [in]    buffer to attach device to.
> >>>>> + * @dev:        [in]    device to be attached.
> >>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
> >>>>>     *
> >>>>>     * Returns struct dma_buf_attachment pointer for this
> >>>>> attachment. Attachments
> >>>>>     * must be cleaned up by calling dma_buf_detach().
> >>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>     * accessible to @dev, and cannot be moved to a more suitable
> >>>>> place. This is
> >>>>>     * indicated with the error code -EBUSY.
> >>>>>     */
> >>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>> -                      struct device *dev)
> >>>>> +struct dma_buf_attachment *
> >>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >>>>> +               bool dynamic_mapping)
> >>>>>    {
> >>>>>        struct dma_buf_attachment *attach;
> >>>>>        int ret;
> >>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
> >>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>          attach->dev = dev;
> >>>>>        attach->dmabuf = dmabuf;
> >>>>> +    attach->dynamic_mapping = dynamic_mapping;
> >>>>>          mutex_lock(&dmabuf->lock);
> >>>>>    @@ -685,16 +692,64 @@ struct dma_buf_attachment
> >>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>            if (ret)
> >>>>>                goto err_attach;
> >>>>>        }
> >>>>> +    dma_resv_lock(dmabuf->resv, NULL);
> >>>>>        list_add(&attach->node, &dmabuf->attachments);
> >>>>> +    dma_resv_unlock(dmabuf->resv);
> >>>>>          mutex_unlock(&dmabuf->lock);
> >>>>>    +    /* When either the importer or the exporter can't handle dynamic
> >>>>> +     * mappings we cache the mapping here to avoid issues with the
> >>>>> +     * reservation object lock.
> >>>>> +     */
> >>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
> >>>>> +        dma_buf_is_dynamic(dmabuf)) {
> >>>>> +        struct sg_table *sgt;
> >>>>> +
> >>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>> +
> >>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> >>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
> >>>> locking pain, but apparently upsets the arm-soc folks who want to
> >>>> control
> >>>> this better.
> >>> Take another look at dma_buf_map_attachment(), we still try to get the
> >>> caching there for ARM.
> >>>
> >>> What we do here is to bidirectionally map the buffer to avoid the
> >>> locking hydra when importer and exporter disagree on locking.
> >>>
> >>> So the ARM folks can easily avoid that by switching to dynamic locking
> >>> for both.
> > So you still break the contract between importer and exporter, except not
> > for anything that's run in intel-gfx-ci so all is good?
> 
> No, the contract between importer and exporter stays exactly the same it 
> is currently as long as you don't switch to dynamic dma-buf handling.
> 
> There is no functional change for the ARM folks here. The only change 
> which takes effect is between i915 and amdgpu and that is perfectly 
> covered by intel-gfx-ci.

There's people who want to run amdgpu on ARM? Also, x86 doesn't have cache
flushing in the dma-api, so naturally this isn't any issue for us (we
still have cache flushing in actual hw, but that's a different topic). So
"works on x86" isn't really a great way to justify what we do here I
think.
-Daniel

> 
> Regards,
> Christian.
> 
> >
> > The other issue with "we solve this with caching the mapping": Currently
> > map/unmap flush (at least on arm, at least on cases where it matters). If
> > you just return the cached sg, then we don't do the flushing anymore,
> > which might break importers/exporters in exactly the same way as just
> > giving them the wrong mapping. There's zero differences between a BIDI,
> > TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
> > cache flushing.
> >
> > So here's something that could actually work:
> > - We cache the mapping.
> > - We cache a bidirectional mapping.
> > - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
> >    to give current importers/exporters the same behaviour they're used to
> >    now.
> >
> > And yes the caching we've lifted might have broken something somewhere
> > already. But generally you only hear about that long time after because
> > arm vendors roll forward once every few years. Or something like that.
> > -Daniel
> >
> >>> Regards,
> >>> Christian.
> >>>
> >>>> That's why your previous version moved the caching into
> >>>> map/unmap_sg, which resurrected the locking hydra.
> >>>>
> >>>> I think we're going a bit in circles here, and I don't have a good idea
> >>>> either :-/
> >>>> -Daniel
> >>>>
> >>>>> +        if (!sgt)
> >>>>> +            sgt = ERR_PTR(-ENOMEM);
> >>>>> +        if (IS_ERR(sgt)) {
> >>>>> +            ret = PTR_ERR(sgt);
> >>>>> +            goto err_unlock;
> >>>>> +        }
> >>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +            dma_resv_unlock(attach->dmabuf->resv);
> >>>>> +        attach->sgt = sgt;
> >>>>> +        attach->dir = DMA_BIDIRECTIONAL;
> >>>>> +    }
> >>>>> +
> >>>>>        return attach;
> >>>>>      err_attach:
> >>>>>        kfree(attach);
> >>>>>        mutex_unlock(&dmabuf->lock);
> >>>>>        return ERR_PTR(ret);
> >>>>> +
> >>>>> +err_unlock:
> >>>>> +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +        dma_resv_unlock(attach->dmabuf->resv);
> >>>>> +
> >>>>> +    dma_buf_detach(dmabuf, attach);
> >>>>> +    return ERR_PTR(ret);
> >>>>> +}
> >>>>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
> >>>>> +
> >>>>> +/**
> >>>>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
> >>>>> + * @dmabuf:    [in]    buffer to attach device to.
> >>>>> + * @dev:    [in]    device to be attached.
> >>>>> + *
> >>>>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which
> >>>>> still use a static
> >>>>> + * mapping.
> >>>>> + */
> >>>>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>> +                      struct device *dev)
> >>>>> +{
> >>>>> +    return dma_buf_dynamic_attach(dmabuf, dev, false);
> >>>>>    }
> >>>>>    EXPORT_SYMBOL_GPL(dma_buf_attach);
> >>>>>    @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf
> >>>>> *dmabuf, struct dma_buf_attachment *attach)
> >>>>>        if (WARN_ON(!dmabuf || !attach))
> >>>>>            return;
> >>>>>    -    if (attach->sgt)
> >>>>> +    if (attach->sgt) {
> >>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>> +
> >>>>>            dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
> >>>>>    +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +            dma_resv_unlock(attach->dmabuf->resv);
> >>>>> +    }
> >>>>> +
> >>>>>        mutex_lock(&dmabuf->lock);
> >>>>> +    dma_resv_lock(dmabuf->resv, NULL);
> >>>>>        list_del(&attach->node);
> >>>>> +    dma_resv_unlock(dmabuf->resv);
> >>>>>        if (dmabuf->ops->detach)
> >>>>>            dmabuf->ops->detach(dmabuf, attach);
> >>>>>    @@ -749,6 +813,9 @@ struct sg_table
> >>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>>>>        if (WARN_ON(!attach || !attach->dmabuf))
> >>>>>            return ERR_PTR(-EINVAL);
> >>>>>    +    if (dma_buf_attachment_is_dynamic(attach))
> >>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>> +
> >>>>>        if (attach->sgt) {
> >>>>>            /*
> >>>>>             * Two mappings with different directions for the same
> >>>>> @@ -761,6 +828,9 @@ struct sg_table
> >>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>>>>            return attach->sgt;
> >>>>>        }
> >>>>>    +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>> +
> >>>>>        sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
> >>>>>        if (!sg_table)
> >>>>>            sg_table = ERR_PTR(-ENOMEM);
> >>>>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct
> >>>>> dma_buf_attachment *attach,
> >>>>>        if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
> >>>>>            return;
> >>>>>    +    if (dma_buf_attachment_is_dynamic(attach))
> >>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>> +
> >>>>>        if (attach->sgt == sg_table)
> >>>>>            return;
> >>>>>    +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>> +
> >>>>>        attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
> >>>>>    }
> >>>>>    EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> >>>>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct
> >>>>> seq_file *s, void *unused)
> >>>>>            seq_puts(s, "\tAttached Devices:\n");
> >>>>>            attach_count = 0;
> >>>>>    +        dma_resv_lock(buf_obj->resv, NULL);
> >>>>>            list_for_each_entry(attach_obj, &buf_obj->attachments,
> >>>>> node) {
> >>>>>                seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
> >>>>>                attach_count++;
> >>>>>            }
> >>>>> +        dma_resv_unlock(buf_obj->resv);
> >>>>>              seq_printf(s, "Total %d devices attached\n\n",
> >>>>>                    attach_count);
> >>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >>>>> index ec212cb27fdc..a8f8b2b812fd 100644
> >>>>> --- a/include/linux/dma-buf.h
> >>>>> +++ b/include/linux/dma-buf.h
> >>>>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
> >>>>>          */
> >>>>>        bool cache_sgt_mapping;
> >>>>>    +    /**
> >>>>> +     * @dynamic_mapping:
> >>>>> +     *
> >>>>> +     * If true the framework makes sure that the map/unmap_dma_buf
> >>>>> +     * callbacks are always called with the dma_resv object locked.
> >>>>> +     *
> >>>>> +     * If false the framework makes ure that the map/unmap_dma_buf
> >>>>> +     * callbacks are always called without the dma_resv object locked.
> >>>>> +     */
> >>>>> +    bool dynamic_mapping;
> >>>>> +
> >>>>>        /**
> >>>>>         * @attach:
> >>>>>         *
> >>>>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
> >>>>>         * any other kind of sharing that the exporter might wish to make
> >>>>>         * available to buffer-users.
> >>>>>         *
> >>>>> +     * This is always called with the dmabuf->resv object locked when
> >>>>> +     * the dynamic_mapping flag is true.
> >>>>> +     *
> >>>>>         * Returns:
> >>>>>         *
> >>>>>         * A &sg_table scatter list of or the backing storage of
> >>>>> the DMA buffer,
> >>>>> @@ -327,6 +341,8 @@ struct dma_buf {
> >>>>>     * @sgt: cached mapping.
> >>>>>     * @dir: direction of cached mapping.
> >>>>>     * @priv: exporter specific attachment data.
> >>>>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is
> >>>>> called with the
> >>>>> + * dma_resv lock held.
> >>>>>     *
> >>>>>     * This structure holds the attachment information between the
> >>>>> dma_buf buffer
> >>>>>     * and its user device(s). The list contains one attachment
> >>>>> struct per device
> >>>>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
> >>>>>        struct list_head node;
> >>>>>        struct sg_table *sgt;
> >>>>>        enum dma_data_direction dir;
> >>>>> +    bool dynamic_mapping;
> >>>>>        void *priv;
> >>>>>    };
> >>>>>    @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct
> >>>>> dma_buf *dmabuf)
> >>>>>        get_file(dmabuf->file);
> >>>>>    }
> >>>>>    +/**
> >>>>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> >>>>> + * @dmabuf: the DMA-buf to check
> >>>>> + *
> >>>>> + * Returns true if a DMA-buf exporter wants to be called with
> >>>>> the dma_resv
> >>>>> + * locked, false if it doesn't wants to be called with the lock held.
> >>>>> + */
> >>>>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> >>>>> +{
> >>>>> +    return dmabuf->ops->dynamic_mapping;
> >>>>> +}
> >>>>> +
> >>>>> +/**
> >>>>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf
> >>>>> attachment uses dynamic
> >>>>> + * mappinsg
> >>>>> + * @attach: the DMA-buf attachment to check
> >>>>> + *
> >>>>> + * Returns true if a DMA-buf importer wants to call the
> >>>>> map/unmap functions with
> >>>>> + * the dma_resv lock held.
> >>>>> + */
> >>>>> +static inline bool
> >>>>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> >>>>> +{
> >>>>> +    return attach->dynamic_mapping;
> >>>>> +}
> >>>>> +
> >>>>>    struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>> -                            struct device *dev);
> >>>>> +                      struct device *dev);
> >>>>> +struct dma_buf_attachment *
> >>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >>>>> +               bool dynamic_mapping);
> >>>>>    void dma_buf_detach(struct dma_buf *dmabuf,
> >>>>> -                struct dma_buf_attachment *dmabuf_attach);
> >>>>> +            struct dma_buf_attachment *attach);
> >>>>>      struct dma_buf *dma_buf_export(const struct
> >>>>> dma_buf_export_info *exp_info);
> >>>>>    @@ -409,6 +455,7 @@ struct sg_table
> >>>>> *dma_buf_map_attachment(struct dma_buf_attachment *,
> >>>>>                        enum dma_data_direction);
> >>>>>    void dma_buf_unmap_attachment(struct dma_buf_attachment *,
> >>>>> struct sg_table *,
> >>>>>                    enum dma_data_direction);
> >>>>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
> >>>>>    int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> >>>>>                     enum dma_data_direction dir);
> >>>>>    int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> >>>>> -- 
> >>>>> 2.17.1
> >>>>>
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 13:13             ` Daniel Vetter
@ 2019-09-17 13:24               ` Koenig, Christian
  2019-09-17 13:45                 ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-09-17 13:24 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 17.09.19 um 15:13 schrieb Daniel Vetter:
> On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
>> Am 17.09.19 um 14:31 schrieb Daniel Vetter:
>>> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
>>>> Ping? Any further comment on this or can't we merge at least the locking
>>>> change?
>>> I was at plumbers ...
>>>> Christian.
>>>>
>>>> Am 11.09.19 um 12:53 schrieb Christian König:
>>>>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
>>>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>>>>>>> This patch is a stripped down version of the locking changes
>>>>>>> necessary to support dynamic DMA-buf handling.
>>>>>>>
>>>>>>> For compatibility we cache the DMA-buf mapping as soon as
>>>>>>> exporter/importer disagree on the dynamic handling.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>> ---
>>>>>>>     drivers/dma-buf/dma-buf.c | 90
>>>>>>> ++++++++++++++++++++++++++++++++++++---
>>>>>>>     include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>>>>>>     2 files changed, 133 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>>>> index 433d91d710e4..65052d52602b 100644
>>>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
>>>>>>> dma_buf_export_info *exp_info)
>>>>>>>             return ERR_PTR(-EINVAL);
>>>>>>>         }
>>>>>>>     +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>>>>>>> +            exp_info->ops->dynamic_mapping))
>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>> +
>>>>>>>         if (!try_module_get(exp_info->owner))
>>>>>>>             return ERR_PTR(-ENOENT);
>>>>>>>     @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>       /**
>>>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
>>>>>>> list; optionally,
>>>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
>>>>>>> attachments list; optionally,
>>>>>>>      * calls attach() of dma_buf_ops to allow device-specific
>>>>>>> attach functionality
>>>>>>> - * @dmabuf:    [in]    buffer to attach device to.
>>>>>>> - * @dev:    [in]    device to be attached.
>>>>>>> + * @dmabuf:        [in]    buffer to attach device to.
>>>>>>> + * @dev:        [in]    device to be attached.
>>>>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
>>>>>>>      *
>>>>>>>      * Returns struct dma_buf_attachment pointer for this
>>>>>>> attachment. Attachments
>>>>>>>      * must be cleaned up by calling dma_buf_detach().
>>>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>      * accessible to @dev, and cannot be moved to a more suitable
>>>>>>> place. This is
>>>>>>>      * indicated with the error code -EBUSY.
>>>>>>>      */
>>>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>> -                      struct device *dev)
>>>>>>> +struct dma_buf_attachment *
>>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>>>> +               bool dynamic_mapping)
>>>>>>>     {
>>>>>>>         struct dma_buf_attachment *attach;
>>>>>>>         int ret;
>>>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>           attach->dev = dev;
>>>>>>>         attach->dmabuf = dmabuf;
>>>>>>> +    attach->dynamic_mapping = dynamic_mapping;
>>>>>>>           mutex_lock(&dmabuf->lock);
>>>>>>>     @@ -685,16 +692,64 @@ struct dma_buf_attachment
>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>             if (ret)
>>>>>>>                 goto err_attach;
>>>>>>>         }
>>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>>>         list_add(&attach->node, &dmabuf->attachments);
>>>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>>>           mutex_unlock(&dmabuf->lock);
>>>>>>>     +    /* When either the importer or the exporter can't handle dynamic
>>>>>>> +     * mappings we cache the mapping here to avoid issues with the
>>>>>>> +     * reservation object lock.
>>>>>>> +     */
>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>> +        struct sg_table *sgt;
>>>>>>> +
>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>> +
>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>>>>>> locking pain, but apparently upsets the arm-soc folks who want to
>>>>>> control
>>>>>> this better.
>>>>> Take another look at dma_buf_map_attachment(), we still try to get the
>>>>> caching there for ARM.
>>>>>
>>>>> What we do here is to bidirectionally map the buffer to avoid the
>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>
>>>>> So the ARM folks can easily avoid that by switching to dynamic locking
>>>>> for both.
>>> So you still break the contract between importer and exporter, except not
>>> for anything that's run in intel-gfx-ci so all is good?
>> No, the contract between importer and exporter stays exactly the same it
>> is currently as long as you don't switch to dynamic dma-buf handling.
>>
>> There is no functional change for the ARM folks here. The only change
>> which takes effect is between i915 and amdgpu and that is perfectly
>> covered by intel-gfx-ci.
> There's people who want to run amdgpu on ARM?

Sure there are, we even recently fixed some bugs for this.

But as far as I know there is no one currently which is affect by this 
change on ARM with amdgpu.

> Also, x86 doesn't have cache
> flushing in the dma-api, so naturally this isn't any issue for us (we
> still have cache flushing in actual hw, but that's a different topic). So
> "works on x86" isn't really a great way to justify what we do here I
> think.

Well it is the exact same caching we previously had as well, so there is 
absolutely no functional change here except that we now explicitly note 
that amdgpu always needs bidirectional mappings.

I agree that we should get rid of this caching as soon as possible, but 
we should not fix things which where broken before.

On the other hand adding dma_sg_sync_for_cpu/device sounds like 
something we could easily add separately to the caching if you think 
that this will help.

Christian.

> -Daniel
>
>> Regards,
>> Christian.
>>
>>> The other issue with "we solve this with caching the mapping": Currently
>>> map/unmap flush (at least on arm, at least on cases where it matters). If
>>> you just return the cached sg, then we don't do the flushing anymore,
>>> which might break importers/exporters in exactly the same way as just
>>> giving them the wrong mapping. There's zero differences between a BIDI,
>>> TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
>>> cache flushing.
>>>
>>> So here's something that could actually work:
>>> - We cache the mapping.
>>> - We cache a bidirectional mapping.
>>> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
>>>     to give current importers/exporters the same behaviour they're used to
>>>     now.
>>>
>>> And yes the caching we've lifted might have broken something somewhere
>>> already. But generally you only hear about that long time after because
>>> arm vendors roll forward once every few years. Or something like that.
>>> -Daniel
>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> That's why your previous version moved the caching into
>>>>>> map/unmap_sg, which resurrected the locking hydra.
>>>>>>
>>>>>> I think we're going a bit in circles here, and I don't have a good idea
>>>>>> either :-/
>>>>>> -Daniel
>>>>>>
>>>>>>> +        if (!sgt)
>>>>>>> +            sgt = ERR_PTR(-ENOMEM);
>>>>>>> +        if (IS_ERR(sgt)) {
>>>>>>> +            ret = PTR_ERR(sgt);
>>>>>>> +            goto err_unlock;
>>>>>>> +        }
>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>>>>>> +        attach->sgt = sgt;
>>>>>>> +        attach->dir = DMA_BIDIRECTIONAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>>         return attach;
>>>>>>>       err_attach:
>>>>>>>         kfree(attach);
>>>>>>>         mutex_unlock(&dmabuf->lock);
>>>>>>>         return ERR_PTR(ret);
>>>>>>> +
>>>>>>> +err_unlock:
>>>>>>> +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +        dma_resv_unlock(attach->dmabuf->resv);
>>>>>>> +
>>>>>>> +    dma_buf_detach(dmabuf, attach);
>>>>>>> +    return ERR_PTR(ret);
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
>>>>>>> + * @dmabuf:    [in]    buffer to attach device to.
>>>>>>> + * @dev:    [in]    device to be attached.
>>>>>>> + *
>>>>>>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which
>>>>>>> still use a static
>>>>>>> + * mapping.
>>>>>>> + */
>>>>>>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>> +                      struct device *dev)
>>>>>>> +{
>>>>>>> +    return dma_buf_dynamic_attach(dmabuf, dev, false);
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_attach);
>>>>>>>     @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf
>>>>>>> *dmabuf, struct dma_buf_attachment *attach)
>>>>>>>         if (WARN_ON(!dmabuf || !attach))
>>>>>>>             return;
>>>>>>>     -    if (attach->sgt)
>>>>>>> +    if (attach->sgt) {
>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>> +
>>>>>>>             dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>>>>>>>     +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>>>>>> +    }
>>>>>>> +
>>>>>>>         mutex_lock(&dmabuf->lock);
>>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>>>         list_del(&attach->node);
>>>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>>>         if (dmabuf->ops->detach)
>>>>>>>             dmabuf->ops->detach(dmabuf, attach);
>>>>>>>     @@ -749,6 +813,9 @@ struct sg_table
>>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>>>>>>         if (WARN_ON(!attach || !attach->dmabuf))
>>>>>>>             return ERR_PTR(-EINVAL);
>>>>>>>     +    if (dma_buf_attachment_is_dynamic(attach))
>>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>>>> +
>>>>>>>         if (attach->sgt) {
>>>>>>>             /*
>>>>>>>              * Two mappings with different directions for the same
>>>>>>> @@ -761,6 +828,9 @@ struct sg_table
>>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>>>>>>             return attach->sgt;
>>>>>>>         }
>>>>>>>     +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>>>> +
>>>>>>>         sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>>>>>>>         if (!sg_table)
>>>>>>>             sg_table = ERR_PTR(-ENOMEM);
>>>>>>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct
>>>>>>> dma_buf_attachment *attach,
>>>>>>>         if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>>>>>>>             return;
>>>>>>>     +    if (dma_buf_attachment_is_dynamic(attach))
>>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>>>> +
>>>>>>>         if (attach->sgt == sg_table)
>>>>>>>             return;
>>>>>>>     +    if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
>>>>>>> +
>>>>>>>         attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>>>>>>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct
>>>>>>> seq_file *s, void *unused)
>>>>>>>             seq_puts(s, "\tAttached Devices:\n");
>>>>>>>             attach_count = 0;
>>>>>>>     +        dma_resv_lock(buf_obj->resv, NULL);
>>>>>>>             list_for_each_entry(attach_obj, &buf_obj->attachments,
>>>>>>> node) {
>>>>>>>                 seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>>>>>>>                 attach_count++;
>>>>>>>             }
>>>>>>> +        dma_resv_unlock(buf_obj->resv);
>>>>>>>               seq_printf(s, "Total %d devices attached\n\n",
>>>>>>>                     attach_count);
>>>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>>>>>> index ec212cb27fdc..a8f8b2b812fd 100644
>>>>>>> --- a/include/linux/dma-buf.h
>>>>>>> +++ b/include/linux/dma-buf.h
>>>>>>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>>>>>>>           */
>>>>>>>         bool cache_sgt_mapping;
>>>>>>>     +    /**
>>>>>>> +     * @dynamic_mapping:
>>>>>>> +     *
>>>>>>> +     * If true the framework makes sure that the map/unmap_dma_buf
>>>>>>> +     * callbacks are always called with the dma_resv object locked.
>>>>>>> +     *
>>>>>>> +     * If false the framework makes ure that the map/unmap_dma_buf
>>>>>>> +     * callbacks are always called without the dma_resv object locked.
>>>>>>> +     */
>>>>>>> +    bool dynamic_mapping;
>>>>>>> +
>>>>>>>         /**
>>>>>>>          * @attach:
>>>>>>>          *
>>>>>>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>>>>>>>          * any other kind of sharing that the exporter might wish to make
>>>>>>>          * available to buffer-users.
>>>>>>>          *
>>>>>>> +     * This is always called with the dmabuf->resv object locked when
>>>>>>> +     * the dynamic_mapping flag is true.
>>>>>>> +     *
>>>>>>>          * Returns:
>>>>>>>          *
>>>>>>>          * A &sg_table scatter list of or the backing storage of
>>>>>>> the DMA buffer,
>>>>>>> @@ -327,6 +341,8 @@ struct dma_buf {
>>>>>>>      * @sgt: cached mapping.
>>>>>>>      * @dir: direction of cached mapping.
>>>>>>>      * @priv: exporter specific attachment data.
>>>>>>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is
>>>>>>> called with the
>>>>>>> + * dma_resv lock held.
>>>>>>>      *
>>>>>>>      * This structure holds the attachment information between the
>>>>>>> dma_buf buffer
>>>>>>>      * and its user device(s). The list contains one attachment
>>>>>>> struct per device
>>>>>>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>>>>>>>         struct list_head node;
>>>>>>>         struct sg_table *sgt;
>>>>>>>         enum dma_data_direction dir;
>>>>>>> +    bool dynamic_mapping;
>>>>>>>         void *priv;
>>>>>>>     };
>>>>>>>     @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct
>>>>>>> dma_buf *dmabuf)
>>>>>>>         get_file(dmabuf->file);
>>>>>>>     }
>>>>>>>     +/**
>>>>>>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
>>>>>>> + * @dmabuf: the DMA-buf to check
>>>>>>> + *
>>>>>>> + * Returns true if a DMA-buf exporter wants to be called with
>>>>>>> the dma_resv
>>>>>>> + * locked, false if it doesn't wants to be called with the lock held.
>>>>>>> + */
>>>>>>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>>>>>>> +{
>>>>>>> +    return dmabuf->ops->dynamic_mapping;
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf
>>>>>>> attachment uses dynamic
>>>>>>> + * mappinsg
>>>>>>> + * @attach: the DMA-buf attachment to check
>>>>>>> + *
>>>>>>> + * Returns true if a DMA-buf importer wants to call the
>>>>>>> map/unmap functions with
>>>>>>> + * the dma_resv lock held.
>>>>>>> + */
>>>>>>> +static inline bool
>>>>>>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>>>>>>> +{
>>>>>>> +    return attach->dynamic_mapping;
>>>>>>> +}
>>>>>>> +
>>>>>>>     struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>> -                            struct device *dev);
>>>>>>> +                      struct device *dev);
>>>>>>> +struct dma_buf_attachment *
>>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>>>> +               bool dynamic_mapping);
>>>>>>>     void dma_buf_detach(struct dma_buf *dmabuf,
>>>>>>> -                struct dma_buf_attachment *dmabuf_attach);
>>>>>>> +            struct dma_buf_attachment *attach);
>>>>>>>       struct dma_buf *dma_buf_export(const struct
>>>>>>> dma_buf_export_info *exp_info);
>>>>>>>     @@ -409,6 +455,7 @@ struct sg_table
>>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *,
>>>>>>>                         enum dma_data_direction);
>>>>>>>     void dma_buf_unmap_attachment(struct dma_buf_attachment *,
>>>>>>> struct sg_table *,
>>>>>>>                     enum dma_data_direction);
>>>>>>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>>>>>>>     int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>>>>>>>                      enum dma_data_direction dir);
>>>>>>>     int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>>>>>>> -- 
>>>>>>> 2.17.1
>>>>>>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 13:24               ` Koenig, Christian
@ 2019-09-17 13:45                 ` Daniel Vetter
  2019-09-17 14:47                   ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-09-17 13:45 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Daniel Vetter, dri-devel, sumit.semwal, linaro-mm-sig,
	linux-media, intel-gfx

On Tue, Sep 17, 2019 at 01:24:10PM +0000, Koenig, Christian wrote:
> Am 17.09.19 um 15:13 schrieb Daniel Vetter:
> > On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
> >> Am 17.09.19 um 14:31 schrieb Daniel Vetter:
> >>> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
> >>>> Ping? Any further comment on this or can't we merge at least the locking
> >>>> change?
> >>> I was at plumbers ...
> >>>> Christian.
> >>>>
> >>>> Am 11.09.19 um 12:53 schrieb Christian König:
> >>>>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
> >>>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> >>>>>>> This patch is a stripped down version of the locking changes
> >>>>>>> necessary to support dynamic DMA-buf handling.
> >>>>>>>
> >>>>>>> For compatibility we cache the DMA-buf mapping as soon as
> >>>>>>> exporter/importer disagree on the dynamic handling.
> >>>>>>>
> >>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>>>> ---
> >>>>>>>     drivers/dma-buf/dma-buf.c | 90
> >>>>>>> ++++++++++++++++++++++++++++++++++++---
> >>>>>>>     include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
> >>>>>>>     2 files changed, 133 insertions(+), 8 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>>>>>> index 433d91d710e4..65052d52602b 100644
> >>>>>>> --- a/drivers/dma-buf/dma-buf.c
> >>>>>>> +++ b/drivers/dma-buf/dma-buf.c
> >>>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
> >>>>>>> dma_buf_export_info *exp_info)
> >>>>>>>             return ERR_PTR(-EINVAL);
> >>>>>>>         }
> >>>>>>>     +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> >>>>>>> +            exp_info->ops->dynamic_mapping))
> >>>>>>> +        return ERR_PTR(-EINVAL);
> >>>>>>> +
> >>>>>>>         if (!try_module_get(exp_info->owner))
> >>>>>>>             return ERR_PTR(-ENOENT);
> >>>>>>>     @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
> >>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>>>       /**
> >>>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
> >>>>>>> list; optionally,
> >>>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
> >>>>>>> attachments list; optionally,
> >>>>>>>      * calls attach() of dma_buf_ops to allow device-specific
> >>>>>>> attach functionality
> >>>>>>> - * @dmabuf:    [in]    buffer to attach device to.
> >>>>>>> - * @dev:    [in]    device to be attached.
> >>>>>>> + * @dmabuf:        [in]    buffer to attach device to.
> >>>>>>> + * @dev:        [in]    device to be attached.
> >>>>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
> >>>>>>>      *
> >>>>>>>      * Returns struct dma_buf_attachment pointer for this
> >>>>>>> attachment. Attachments
> >>>>>>>      * must be cleaned up by calling dma_buf_detach().
> >>>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>>>      * accessible to @dev, and cannot be moved to a more suitable
> >>>>>>> place. This is
> >>>>>>>      * indicated with the error code -EBUSY.
> >>>>>>>      */
> >>>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>> -                      struct device *dev)
> >>>>>>> +struct dma_buf_attachment *
> >>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >>>>>>> +               bool dynamic_mapping)
> >>>>>>>     {
> >>>>>>>         struct dma_buf_attachment *attach;
> >>>>>>>         int ret;
> >>>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
> >>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>>           attach->dev = dev;
> >>>>>>>         attach->dmabuf = dmabuf;
> >>>>>>> +    attach->dynamic_mapping = dynamic_mapping;
> >>>>>>>           mutex_lock(&dmabuf->lock);
> >>>>>>>     @@ -685,16 +692,64 @@ struct dma_buf_attachment
> >>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>>             if (ret)
> >>>>>>>                 goto err_attach;
> >>>>>>>         }
> >>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
> >>>>>>>         list_add(&attach->node, &dmabuf->attachments);
> >>>>>>> +    dma_resv_unlock(dmabuf->resv);
> >>>>>>>           mutex_unlock(&dmabuf->lock);
> >>>>>>>     +    /* When either the importer or the exporter can't handle dynamic
> >>>>>>> +     * mappings we cache the mapping here to avoid issues with the
> >>>>>>> +     * reservation object lock.
> >>>>>>> +     */
> >>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
> >>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
> >>>>>>> +        struct sg_table *sgt;
> >>>>>>> +
> >>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>>>> +
> >>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> >>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
> >>>>>> locking pain, but apparently upsets the arm-soc folks who want to
> >>>>>> control
> >>>>>> this better.
> >>>>> Take another look at dma_buf_map_attachment(), we still try to get the
> >>>>> caching there for ARM.
> >>>>>
> >>>>> What we do here is to bidirectionally map the buffer to avoid the
> >>>>> locking hydra when importer and exporter disagree on locking.
> >>>>>
> >>>>> So the ARM folks can easily avoid that by switching to dynamic locking
> >>>>> for both.
> >>> So you still break the contract between importer and exporter, except not
> >>> for anything that's run in intel-gfx-ci so all is good?
> >> No, the contract between importer and exporter stays exactly the same it
> >> is currently as long as you don't switch to dynamic dma-buf handling.
> >>
> >> There is no functional change for the ARM folks here. The only change
> >> which takes effect is between i915 and amdgpu and that is perfectly
> >> covered by intel-gfx-ci.
> > There's people who want to run amdgpu on ARM?
> 
> Sure there are, we even recently fixed some bugs for this.
> 
> But as far as I know there is no one currently which is affect by this 
> change on ARM with amdgpu.

But don't you break them with this now?

amdgpu will soon set the dynamic flag on exports, which forces the caching
at create time (to avoid the locking fun), which will then result in a
EBUSY at map_attachment time because we have a cached mapping, but it's
the wrong type.

> > Also, x86 doesn't have cache
> > flushing in the dma-api, so naturally this isn't any issue for us (we
> > still have cache flushing in actual hw, but that's a different topic). So
> > "works on x86" isn't really a great way to justify what we do here I
> > think.
> 
> Well it is the exact same caching we previously had as well, so there is 
> absolutely no functional change here except that we now explicitly note 
> that amdgpu always needs bidirectional mappings.
> 
> I agree that we should get rid of this caching as soon as possible, but 
> we should not fix things which where broken before.
> 
> On the other hand adding dma_sg_sync_for_cpu/device sounds like 
> something we could easily add separately to the caching if you think 
> that this will help.

The current code maybe lacks some cache flushes, but we already require a
fixed direction per attachment. So I guess not a real problem, probably.

But with your patches I think we now fail with EBUSY. Not exactly nice ...
-Daniel

> 
> Christian.
> 
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>> The other issue with "we solve this with caching the mapping": Currently
> >>> map/unmap flush (at least on arm, at least on cases where it matters). If
> >>> you just return the cached sg, then we don't do the flushing anymore,
> >>> which might break importers/exporters in exactly the same way as just
> >>> giving them the wrong mapping. There's zero differences between a BIDI,
> >>> TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
> >>> cache flushing.
> >>>
> >>> So here's something that could actually work:
> >>> - We cache the mapping.
> >>> - We cache a bidirectional mapping.
> >>> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
> >>>     to give current importers/exporters the same behaviour they're used to
> >>>     now.
> >>>
> >>> And yes the caching we've lifted might have broken something somewhere
> >>> already. But generally you only hear about that long time after because
> >>> arm vendors roll forward once every few years. Or something like that.
> >>> -Daniel
> >>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>> That's why your previous version moved the caching into
> >>>>>> map/unmap_sg, which resurrected the locking hydra.
> >>>>>>
> >>>>>> I think we're going a bit in circles here, and I don't have a good idea
> >>>>>> either :-/
> >>>>>> -Daniel
> >>>>>>
> >>>>>>> +        if (!sgt)
> >>>>>>> +            sgt = ERR_PTR(-ENOMEM);
> >>>>>>> +        if (IS_ERR(sgt)) {
> >>>>>>> +            ret = PTR_ERR(sgt);
> >>>>>>> +            goto err_unlock;
> >>>>>>> +        }
> >>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
> >>>>>>> +        attach->sgt = sgt;
> >>>>>>> +        attach->dir = DMA_BIDIRECTIONAL;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>>         return attach;
> >>>>>>>       err_attach:
> >>>>>>>         kfree(attach);
> >>>>>>>         mutex_unlock(&dmabuf->lock);
> >>>>>>>         return ERR_PTR(ret);
> >>>>>>> +
> >>>>>>> +err_unlock:
> >>>>>>> +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +        dma_resv_unlock(attach->dmabuf->resv);
> >>>>>>> +
> >>>>>>> +    dma_buf_detach(dmabuf, attach);
> >>>>>>> +    return ERR_PTR(ret);
> >>>>>>> +}
> >>>>>>> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
> >>>>>>> + * @dmabuf:    [in]    buffer to attach device to.
> >>>>>>> + * @dev:    [in]    device to be attached.
> >>>>>>> + *
> >>>>>>> + * Wrapper to call dma_buf_dynamic_attach() for drivers which
> >>>>>>> still use a static
> >>>>>>> + * mapping.
> >>>>>>> + */
> >>>>>>> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>> +                      struct device *dev)
> >>>>>>> +{
> >>>>>>> +    return dma_buf_dynamic_attach(dmabuf, dev, false);
> >>>>>>>     }
> >>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_attach);
> >>>>>>>     @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf
> >>>>>>> *dmabuf, struct dma_buf_attachment *attach)
> >>>>>>>         if (WARN_ON(!dmabuf || !attach))
> >>>>>>>             return;
> >>>>>>>     -    if (attach->sgt)
> >>>>>>> +    if (attach->sgt) {
> >>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>>>> +
> >>>>>>>             dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
> >>>>>>>     +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +            dma_resv_unlock(attach->dmabuf->resv);
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>>         mutex_lock(&dmabuf->lock);
> >>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
> >>>>>>>         list_del(&attach->node);
> >>>>>>> +    dma_resv_unlock(dmabuf->resv);
> >>>>>>>         if (dmabuf->ops->detach)
> >>>>>>>             dmabuf->ops->detach(dmabuf, attach);
> >>>>>>>     @@ -749,6 +813,9 @@ struct sg_table
> >>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>>>>>>         if (WARN_ON(!attach || !attach->dmabuf))
> >>>>>>>             return ERR_PTR(-EINVAL);
> >>>>>>>     +    if (dma_buf_attachment_is_dynamic(attach))
> >>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>>>> +
> >>>>>>>         if (attach->sgt) {
> >>>>>>>             /*
> >>>>>>>              * Two mappings with different directions for the same
> >>>>>>> @@ -761,6 +828,9 @@ struct sg_table
> >>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>>>>>>             return attach->sgt;
> >>>>>>>         }
> >>>>>>>     +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>>>> +
> >>>>>>>         sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
> >>>>>>>         if (!sg_table)
> >>>>>>>             sg_table = ERR_PTR(-ENOMEM);
> >>>>>>> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct
> >>>>>>> dma_buf_attachment *attach,
> >>>>>>>         if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
> >>>>>>>             return;
> >>>>>>>     +    if (dma_buf_attachment_is_dynamic(attach))
> >>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>>>> +
> >>>>>>>         if (attach->sgt == sg_table)
> >>>>>>>             return;
> >>>>>>>     +    if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>> +        dma_resv_assert_held(attach->dmabuf->resv);
> >>>>>>> +
> >>>>>>>         attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
> >>>>>>>     }
> >>>>>>>     EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> >>>>>>> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct
> >>>>>>> seq_file *s, void *unused)
> >>>>>>>             seq_puts(s, "\tAttached Devices:\n");
> >>>>>>>             attach_count = 0;
> >>>>>>>     +        dma_resv_lock(buf_obj->resv, NULL);
> >>>>>>>             list_for_each_entry(attach_obj, &buf_obj->attachments,
> >>>>>>> node) {
> >>>>>>>                 seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
> >>>>>>>                 attach_count++;
> >>>>>>>             }
> >>>>>>> +        dma_resv_unlock(buf_obj->resv);
> >>>>>>>               seq_printf(s, "Total %d devices attached\n\n",
> >>>>>>>                     attach_count);
> >>>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >>>>>>> index ec212cb27fdc..a8f8b2b812fd 100644
> >>>>>>> --- a/include/linux/dma-buf.h
> >>>>>>> +++ b/include/linux/dma-buf.h
> >>>>>>> @@ -42,6 +42,17 @@ struct dma_buf_ops {
> >>>>>>>           */
> >>>>>>>         bool cache_sgt_mapping;
> >>>>>>>     +    /**
> >>>>>>> +     * @dynamic_mapping:
> >>>>>>> +     *
> >>>>>>> +     * If true the framework makes sure that the map/unmap_dma_buf
> >>>>>>> +     * callbacks are always called with the dma_resv object locked.
> >>>>>>> +     *
> >>>>>>> +     * If false the framework makes ure that the map/unmap_dma_buf
> >>>>>>> +     * callbacks are always called without the dma_resv object locked.
> >>>>>>> +     */
> >>>>>>> +    bool dynamic_mapping;
> >>>>>>> +
> >>>>>>>         /**
> >>>>>>>          * @attach:
> >>>>>>>          *
> >>>>>>> @@ -109,6 +120,9 @@ struct dma_buf_ops {
> >>>>>>>          * any other kind of sharing that the exporter might wish to make
> >>>>>>>          * available to buffer-users.
> >>>>>>>          *
> >>>>>>> +     * This is always called with the dmabuf->resv object locked when
> >>>>>>> +     * the dynamic_mapping flag is true.
> >>>>>>> +     *
> >>>>>>>          * Returns:
> >>>>>>>          *
> >>>>>>>          * A &sg_table scatter list of or the backing storage of
> >>>>>>> the DMA buffer,
> >>>>>>> @@ -327,6 +341,8 @@ struct dma_buf {
> >>>>>>>      * @sgt: cached mapping.
> >>>>>>>      * @dir: direction of cached mapping.
> >>>>>>>      * @priv: exporter specific attachment data.
> >>>>>>> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is
> >>>>>>> called with the
> >>>>>>> + * dma_resv lock held.
> >>>>>>>      *
> >>>>>>>      * This structure holds the attachment information between the
> >>>>>>> dma_buf buffer
> >>>>>>>      * and its user device(s). The list contains one attachment
> >>>>>>> struct per device
> >>>>>>> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
> >>>>>>>         struct list_head node;
> >>>>>>>         struct sg_table *sgt;
> >>>>>>>         enum dma_data_direction dir;
> >>>>>>> +    bool dynamic_mapping;
> >>>>>>>         void *priv;
> >>>>>>>     };
> >>>>>>>     @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct
> >>>>>>> dma_buf *dmabuf)
> >>>>>>>         get_file(dmabuf->file);
> >>>>>>>     }
> >>>>>>>     +/**
> >>>>>>> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> >>>>>>> + * @dmabuf: the DMA-buf to check
> >>>>>>> + *
> >>>>>>> + * Returns true if a DMA-buf exporter wants to be called with
> >>>>>>> the dma_resv
> >>>>>>> + * locked, false if it doesn't wants to be called with the lock held.
> >>>>>>> + */
> >>>>>>> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> >>>>>>> +{
> >>>>>>> +    return dmabuf->ops->dynamic_mapping;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * dma_buf_attachment_is_dynamic - check if a DMA-buf
> >>>>>>> attachment uses dynamic
> >>>>>>> + * mappinsg
> >>>>>>> + * @attach: the DMA-buf attachment to check
> >>>>>>> + *
> >>>>>>> + * Returns true if a DMA-buf importer wants to call the
> >>>>>>> map/unmap functions with
> >>>>>>> + * the dma_resv lock held.
> >>>>>>> + */
> >>>>>>> +static inline bool
> >>>>>>> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> >>>>>>> +{
> >>>>>>> +    return attach->dynamic_mapping;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>     struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>> -                            struct device *dev);
> >>>>>>> +                      struct device *dev);
> >>>>>>> +struct dma_buf_attachment *
> >>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >>>>>>> +               bool dynamic_mapping);
> >>>>>>>     void dma_buf_detach(struct dma_buf *dmabuf,
> >>>>>>> -                struct dma_buf_attachment *dmabuf_attach);
> >>>>>>> +            struct dma_buf_attachment *attach);
> >>>>>>>       struct dma_buf *dma_buf_export(const struct
> >>>>>>> dma_buf_export_info *exp_info);
> >>>>>>>     @@ -409,6 +455,7 @@ struct sg_table
> >>>>>>> *dma_buf_map_attachment(struct dma_buf_attachment *,
> >>>>>>>                         enum dma_data_direction);
> >>>>>>>     void dma_buf_unmap_attachment(struct dma_buf_attachment *,
> >>>>>>> struct sg_table *,
> >>>>>>>                     enum dma_data_direction);
> >>>>>>> +void dma_buf_move_notify(struct dma_buf *dma_buf);
> >>>>>>>     int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> >>>>>>>                      enum dma_data_direction dir);
> >>>>>>>     int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> >>>>>>> -- 
> >>>>>>> 2.17.1
> >>>>>>>
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 13:45                 ` Daniel Vetter
@ 2019-09-17 14:47                   ` Koenig, Christian
  2019-09-17 14:56                     ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-09-17 14:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 17.09.19 um 15:45 schrieb Daniel Vetter:
> On Tue, Sep 17, 2019 at 01:24:10PM +0000, Koenig, Christian wrote:
>> Am 17.09.19 um 15:13 schrieb Daniel Vetter:
>>> On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
>>>> Am 17.09.19 um 14:31 schrieb Daniel Vetter:
>>>>> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
>>>>>> Ping? Any further comment on this or can't we merge at least the locking
>>>>>> change?
>>>>> I was at plumbers ...
>>>>>> Christian.
>>>>>>
>>>>>> Am 11.09.19 um 12:53 schrieb Christian König:
>>>>>>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
>>>>>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>>>>>>>>> This patch is a stripped down version of the locking changes
>>>>>>>>> necessary to support dynamic DMA-buf handling.
>>>>>>>>>
>>>>>>>>> For compatibility we cache the DMA-buf mapping as soon as
>>>>>>>>> exporter/importer disagree on the dynamic handling.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>>> ---
>>>>>>>>>      drivers/dma-buf/dma-buf.c | 90
>>>>>>>>> ++++++++++++++++++++++++++++++++++++---
>>>>>>>>>      include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>>>>>>>>      2 files changed, 133 insertions(+), 8 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>>>>>> index 433d91d710e4..65052d52602b 100644
>>>>>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
>>>>>>>>> dma_buf_export_info *exp_info)
>>>>>>>>>              return ERR_PTR(-EINVAL);
>>>>>>>>>          }
>>>>>>>>>      +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>>>>>>>>> +            exp_info->ops->dynamic_mapping))
>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>> +
>>>>>>>>>          if (!try_module_get(exp_info->owner))
>>>>>>>>>              return ERR_PTR(-ENOENT);
>>>>>>>>>      @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>>>>>>>>      EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>>>        /**
>>>>>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
>>>>>>>>> list; optionally,
>>>>>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
>>>>>>>>> attachments list; optionally,
>>>>>>>>>       * calls attach() of dma_buf_ops to allow device-specific
>>>>>>>>> attach functionality
>>>>>>>>> - * @dmabuf:    [in]    buffer to attach device to.
>>>>>>>>> - * @dev:    [in]    device to be attached.
>>>>>>>>> + * @dmabuf:        [in]    buffer to attach device to.
>>>>>>>>> + * @dev:        [in]    device to be attached.
>>>>>>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
>>>>>>>>>       *
>>>>>>>>>       * Returns struct dma_buf_attachment pointer for this
>>>>>>>>> attachment. Attachments
>>>>>>>>>       * must be cleaned up by calling dma_buf_detach().
>>>>>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>>>       * accessible to @dev, and cannot be moved to a more suitable
>>>>>>>>> place. This is
>>>>>>>>>       * indicated with the error code -EBUSY.
>>>>>>>>>       */
>>>>>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>> -                      struct device *dev)
>>>>>>>>> +struct dma_buf_attachment *
>>>>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>>>>>> +               bool dynamic_mapping)
>>>>>>>>>      {
>>>>>>>>>          struct dma_buf_attachment *attach;
>>>>>>>>>          int ret;
>>>>>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
>>>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>>            attach->dev = dev;
>>>>>>>>>          attach->dmabuf = dmabuf;
>>>>>>>>> +    attach->dynamic_mapping = dynamic_mapping;
>>>>>>>>>            mutex_lock(&dmabuf->lock);
>>>>>>>>>      @@ -685,16 +692,64 @@ struct dma_buf_attachment
>>>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>>              if (ret)
>>>>>>>>>                  goto err_attach;
>>>>>>>>>          }
>>>>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>>>>>          list_add(&attach->node, &dmabuf->attachments);
>>>>>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>>>>>            mutex_unlock(&dmabuf->lock);
>>>>>>>>>      +    /* When either the importer or the exporter can't handle dynamic
>>>>>>>>> +     * mappings we cache the mapping here to avoid issues with the
>>>>>>>>> +     * reservation object lock.
>>>>>>>>> +     */
>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>> +
>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>> +
>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>>>>>>>> locking pain, but apparently upsets the arm-soc folks who want to
>>>>>>>> control
>>>>>>>> this better.
>>>>>>> Take another look at dma_buf_map_attachment(), we still try to get the
>>>>>>> caching there for ARM.
>>>>>>>
>>>>>>> What we do here is to bidirectionally map the buffer to avoid the
>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>
>>>>>>> So the ARM folks can easily avoid that by switching to dynamic locking
>>>>>>> for both.
>>>>> So you still break the contract between importer and exporter, except not
>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>> No, the contract between importer and exporter stays exactly the same it
>>>> is currently as long as you don't switch to dynamic dma-buf handling.
>>>>
>>>> There is no functional change for the ARM folks here. The only change
>>>> which takes effect is between i915 and amdgpu and that is perfectly
>>>> covered by intel-gfx-ci.
>>> There's people who want to run amdgpu on ARM?
>> Sure there are, we even recently fixed some bugs for this.
>>
>> But as far as I know there is no one currently which is affect by this
>> change on ARM with amdgpu.
> But don't you break them with this now?

No, we see the bidirectional attachment as compatible with the other ones.

> amdgpu will soon set the dynamic flag on exports, which forces the caching
> at create time (to avoid the locking fun), which will then result in a
> EBUSY at map_attachment time because we have a cached mapping, but it's
> the wrong type.

See the check in dma_buf_map_attachment():

     if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
         return ERR_PTR(-EBUSY);


Regards,
Christian.

>
>>> Also, x86 doesn't have cache
>>> flushing in the dma-api, so naturally this isn't any issue for us (we
>>> still have cache flushing in actual hw, but that's a different topic). So
>>> "works on x86" isn't really a great way to justify what we do here I
>>> think.
>> Well it is the exact same caching we previously had as well, so there is
>> absolutely no functional change here except that we now explicitly note
>> that amdgpu always needs bidirectional mappings.
>>
>> I agree that we should get rid of this caching as soon as possible, but
>> we should not fix things which where broken before.
>>
>> On the other hand adding dma_sg_sync_for_cpu/device sounds like
>> something we could easily add separately to the caching if you think
>> that this will help.
> The current code maybe lacks some cache flushes, but we already require a
> fixed direction per attachment. So I guess not a real problem, probably.
>
> But with your patches I think we now fail with EBUSY. Not exactly nice ...
> -Daniel
>
>> Christian.
>>
>>> -Daniel
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> The other issue with "we solve this with caching the mapping": Currently
>>>>> map/unmap flush (at least on arm, at least on cases where it matters). If
>>>>> you just return the cached sg, then we don't do the flushing anymore,
>>>>> which might break importers/exporters in exactly the same way as just
>>>>> giving them the wrong mapping. There's zero differences between a BIDI,
>>>>> TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
>>>>> cache flushing.
>>>>>
>>>>> So here's something that could actually work:
>>>>> - We cache the mapping.
>>>>> - We cache a bidirectional mapping.
>>>>> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
>>>>>      to give current importers/exporters the same behaviour they're used to
>>>>>      now.
>>>>>
>>>>> And yes the caching we've lifted might have broken something somewhere
>>>>> already. But generally you only hear about that long time after because
>>>>> arm vendors roll forward once every few years. Or something like that.
>>>>> -Daniel
>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> That's why your previous version moved the caching into
>>>>>>>> map/unmap_sg, which resurrected the locking hydra.
>>>>>>>>
>>>>>>>> I think we're going a bit in circles here, and I don't have a good idea
>>>>>>>> either :-/
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 14:47                   ` Koenig, Christian
@ 2019-09-17 14:56                     ` Daniel Vetter
  2019-09-24  9:51                       ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-09-17 14:56 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Tue, Sep 17, 2019 at 4:47 PM Koenig, Christian
<Christian.Koenig@amd.com> wrote:
>
> Am 17.09.19 um 15:45 schrieb Daniel Vetter:
> > On Tue, Sep 17, 2019 at 01:24:10PM +0000, Koenig, Christian wrote:
> >> Am 17.09.19 um 15:13 schrieb Daniel Vetter:
> >>> On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
> >>>> Am 17.09.19 um 14:31 schrieb Daniel Vetter:
> >>>>> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
> >>>>>> Ping? Any further comment on this or can't we merge at least the locking
> >>>>>> change?
> >>>>> I was at plumbers ...
> >>>>>> Christian.
> >>>>>>
> >>>>>> Am 11.09.19 um 12:53 schrieb Christian König:
> >>>>>>> Am 03.09.19 um 10:05 schrieb Daniel Vetter:
> >>>>>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> >>>>>>>>> This patch is a stripped down version of the locking changes
> >>>>>>>>> necessary to support dynamic DMA-buf handling.
> >>>>>>>>>
> >>>>>>>>> For compatibility we cache the DMA-buf mapping as soon as
> >>>>>>>>> exporter/importer disagree on the dynamic handling.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>>>>>> ---
> >>>>>>>>>      drivers/dma-buf/dma-buf.c | 90
> >>>>>>>>> ++++++++++++++++++++++++++++++++++++---
> >>>>>>>>>      include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
> >>>>>>>>>      2 files changed, 133 insertions(+), 8 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>>>>>>>> index 433d91d710e4..65052d52602b 100644
> >>>>>>>>> --- a/drivers/dma-buf/dma-buf.c
> >>>>>>>>> +++ b/drivers/dma-buf/dma-buf.c
> >>>>>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct
> >>>>>>>>> dma_buf_export_info *exp_info)
> >>>>>>>>>              return ERR_PTR(-EINVAL);
> >>>>>>>>>          }
> >>>>>>>>>      +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> >>>>>>>>> +            exp_info->ops->dynamic_mapping))
> >>>>>>>>> +        return ERR_PTR(-EINVAL);
> >>>>>>>>> +
> >>>>>>>>>          if (!try_module_get(exp_info->owner))
> >>>>>>>>>              return ERR_PTR(-ENOENT);
> >>>>>>>>>      @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
> >>>>>>>>>      EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>>>>>        /**
> >>>>>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments
> >>>>>>>>> list; optionally,
> >>>>>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's
> >>>>>>>>> attachments list; optionally,
> >>>>>>>>>       * calls attach() of dma_buf_ops to allow device-specific
> >>>>>>>>> attach functionality
> >>>>>>>>> - * @dmabuf:    [in]    buffer to attach device to.
> >>>>>>>>> - * @dev:    [in]    device to be attached.
> >>>>>>>>> + * @dmabuf:        [in]    buffer to attach device to.
> >>>>>>>>> + * @dev:        [in]    device to be attached.
> >>>>>>>>> + * @dynamic_mapping:    [in]    calling convention for map/unmap
> >>>>>>>>>       *
> >>>>>>>>>       * Returns struct dma_buf_attachment pointer for this
> >>>>>>>>> attachment. Attachments
> >>>>>>>>>       * must be cleaned up by calling dma_buf_detach().
> >>>>>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
> >>>>>>>>>       * accessible to @dev, and cannot be moved to a more suitable
> >>>>>>>>> place. This is
> >>>>>>>>>       * indicated with the error code -EBUSY.
> >>>>>>>>>       */
> >>>>>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>>>> -                      struct device *dev)
> >>>>>>>>> +struct dma_buf_attachment *
> >>>>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >>>>>>>>> +               bool dynamic_mapping)
> >>>>>>>>>      {
> >>>>>>>>>          struct dma_buf_attachment *attach;
> >>>>>>>>>          int ret;
> >>>>>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment
> >>>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>>>>            attach->dev = dev;
> >>>>>>>>>          attach->dmabuf = dmabuf;
> >>>>>>>>> +    attach->dynamic_mapping = dynamic_mapping;
> >>>>>>>>>            mutex_lock(&dmabuf->lock);
> >>>>>>>>>      @@ -685,16 +692,64 @@ struct dma_buf_attachment
> >>>>>>>>> *dma_buf_attach(struct dma_buf *dmabuf,
> >>>>>>>>>              if (ret)
> >>>>>>>>>                  goto err_attach;
> >>>>>>>>>          }
> >>>>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
> >>>>>>>>>          list_add(&attach->node, &dmabuf->attachments);
> >>>>>>>>> +    dma_resv_unlock(dmabuf->resv);
> >>>>>>>>>            mutex_unlock(&dmabuf->lock);
> >>>>>>>>>      +    /* When either the importer or the exporter can't handle dynamic
> >>>>>>>>> +     * mappings we cache the mapping here to avoid issues with the
> >>>>>>>>> +     * reservation object lock.
> >>>>>>>>> +     */
> >>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
> >>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
> >>>>>>>>> +        struct sg_table *sgt;
> >>>>>>>>> +
> >>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>>>>>> +
> >>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> >>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
> >>>>>>>> locking pain, but apparently upsets the arm-soc folks who want to
> >>>>>>>> control
> >>>>>>>> this better.
> >>>>>>> Take another look at dma_buf_map_attachment(), we still try to get the
> >>>>>>> caching there for ARM.
> >>>>>>>
> >>>>>>> What we do here is to bidirectionally map the buffer to avoid the
> >>>>>>> locking hydra when importer and exporter disagree on locking.
> >>>>>>>
> >>>>>>> So the ARM folks can easily avoid that by switching to dynamic locking
> >>>>>>> for both.
> >>>>> So you still break the contract between importer and exporter, except not
> >>>>> for anything that's run in intel-gfx-ci so all is good?
> >>>> No, the contract between importer and exporter stays exactly the same it
> >>>> is currently as long as you don't switch to dynamic dma-buf handling.
> >>>>
> >>>> There is no functional change for the ARM folks here. The only change
> >>>> which takes effect is between i915 and amdgpu and that is perfectly
> >>>> covered by intel-gfx-ci.
> >>> There's people who want to run amdgpu on ARM?
> >> Sure there are, we even recently fixed some bugs for this.
> >>
> >> But as far as I know there is no one currently which is affect by this
> >> change on ARM with amdgpu.
> > But don't you break them with this now?
>
> No, we see the bidirectional attachment as compatible with the other ones.
>
> > amdgpu will soon set the dynamic flag on exports, which forces the caching
> > at create time (to avoid the locking fun), which will then result in a
> > EBUSY at map_attachment time because we have a cached mapping, but it's
> > the wrong type.
>
> See the check in dma_buf_map_attachment():
>
>      if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
>          return ERR_PTR(-EBUSY);

Hm, I misread this. So yeah should work, +/- the issue that we might
not flush enough. But I guess that can be fixed whenever, it's not
like dma-api semantics are a great fit for us. Maybe a fixme comment
would be useful here ... I'll look at this tomorrow or so because atm
brain is slow, I'm down with the usual post-conference cold it seems
:-/
-Daniel

>
> Regards,
> Christian.
>
> >
> >>> Also, x86 doesn't have cache
> >>> flushing in the dma-api, so naturally this isn't any issue for us (we
> >>> still have cache flushing in actual hw, but that's a different topic). So
> >>> "works on x86" isn't really a great way to justify what we do here I
> >>> think.
> >> Well it is the exact same caching we previously had as well, so there is
> >> absolutely no functional change here except that we now explicitly note
> >> that amdgpu always needs bidirectional mappings.
> >>
> >> I agree that we should get rid of this caching as soon as possible, but
> >> we should not fix things which where broken before.
> >>
> >> On the other hand adding dma_sg_sync_for_cpu/device sounds like
> >> something we could easily add separately to the caching if you think
> >> that this will help.
> > The current code maybe lacks some cache flushes, but we already require a
> > fixed direction per attachment. So I guess not a real problem, probably.
> >
> > But with your patches I think we now fail with EBUSY. Not exactly nice ...
> > -Daniel
> >
> >> Christian.
> >>
> >>> -Daniel
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> The other issue with "we solve this with caching the mapping": Currently
> >>>>> map/unmap flush (at least on arm, at least on cases where it matters). If
> >>>>> you just return the cached sg, then we don't do the flushing anymore,
> >>>>> which might break importers/exporters in exactly the same way as just
> >>>>> giving them the wrong mapping. There's zero differences between a BIDI,
> >>>>> TO_CPU, or TO_DEVICE mapping, the only places where this matters is for
> >>>>> cache flushing.
> >>>>>
> >>>>> So here's something that could actually work:
> >>>>> - We cache the mapping.
> >>>>> - We cache a bidirectional mapping.
> >>>>> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
> >>>>>      to give current importers/exporters the same behaviour they're used to
> >>>>>      now.
> >>>>>
> >>>>> And yes the caching we've lifted might have broken something somewhere
> >>>>> already. But generally you only hear about that long time after because
> >>>>> arm vendors roll forward once every few years. Or something like that.
> >>>>> -Daniel
> >>>>>
> >>>>>>> Regards,
> >>>>>>> Christian.
> >>>>>>>
> >>>>>>>> That's why your previous version moved the caching into
> >>>>>>>> map/unmap_sg, which resurrected the locking hydra.
> >>>>>>>>
> >>>>>>>> I think we're going a bit in circles here, and I don't have a good idea
> >>>>>>>> either :-/
> >>>>>>>> -Daniel
> >>>>>>>>
> >>>>>>>>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-17 14:56                     ` Daniel Vetter
@ 2019-09-24  9:51                       ` Koenig, Christian
  2019-10-02  8:37                         ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-09-24  9:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 17.09.19 um 16:56 schrieb Daniel Vetter:
> [SNIP]
>>>>>>>>>>>       +    /* When either the importer or the exporter can't handle dynamic
>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues with the
>>>>>>>>>>> +     * reservation object lock.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>>>> +
>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>>>> +
>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who want to
>>>>>>>>>> control
>>>>>>>>>> this better.
>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try to get the
>>>>>>>>> caching there for ARM.
>>>>>>>>>
>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid the
>>>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>>>
>>>>>>>>> So the ARM folks can easily avoid that by switching to dynamic locking
>>>>>>>>> for both.
>>>>>>> So you still break the contract between importer and exporter, except not
>>>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>>>> No, the contract between importer and exporter stays exactly the same it
>>>>>> is currently as long as you don't switch to dynamic dma-buf handling.
>>>>>>
>>>>>> There is no functional change for the ARM folks here. The only change
>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
>>>>>> covered by intel-gfx-ci.
>>>>> There's people who want to run amdgpu on ARM?
>>>> Sure there are, we even recently fixed some bugs for this.
>>>>
>>>> But as far as I know there is no one currently which is affect by this
>>>> change on ARM with amdgpu.
>>> But don't you break them with this now?
>> No, we see the bidirectional attachment as compatible with the other ones.
>>
>>> amdgpu will soon set the dynamic flag on exports, which forces the caching
>>> at create time (to avoid the locking fun), which will then result in a
>>> EBUSY at map_attachment time because we have a cached mapping, but it's
>>> the wrong type.
>> See the check in dma_buf_map_attachment():
>>
>>       if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
>>           return ERR_PTR(-EBUSY);
> Hm, I misread this. So yeah should work, +/- the issue that we might
> not flush enough. But I guess that can be fixed whenever, it's not
> like dma-api semantics are a great fit for us. Maybe a fixme comment
> would be useful here ... I'll look at this tomorrow or so because atm
> brain is slow, I'm down with the usual post-conference cold it seems
> :-/

Hope your are feeling better now, adding a comment is of course not a 
problem.

With that fixed can I get an reviewed-by or at least and acked-by?

I want to land at least some parts of those changes now.

Regards,
Christian.

> -Daniel
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-09-24  9:51                       ` Koenig, Christian
@ 2019-10-02  8:37                         ` Koenig, Christian
  2019-10-08  8:55                           ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-10-02  8:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Hi Daniel,

once more a ping on this. Any more comments or can we get it comitted?

Thanks,
Christian.

Am 24.09.19 um 11:50 schrieb Christian König:
> Am 17.09.19 um 16:56 schrieb Daniel Vetter:
>> [SNIP]
>>>>>>>>>>>>       +    /* When either the importer or the exporter 
>>>>>>>>>>>> can't handle dynamic
>>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues 
>>>>>>>>>>>> with the
>>>>>>>>>>>> +     * reservation object lock.
>>>>>>>>>>>> +     */
>>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>>>>> +
>>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>>>>> + dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>>>>> +
>>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, 
>>>>>>>>>>>> DMA_BIDIRECTIONAL);
>>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely 
>>>>>>>>>>> around the
>>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who 
>>>>>>>>>>> want to
>>>>>>>>>>> control
>>>>>>>>>>> this better.
>>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try 
>>>>>>>>>> to get the
>>>>>>>>>> caching there for ARM.
>>>>>>>>>>
>>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid 
>>>>>>>>>> the
>>>>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>>>>
>>>>>>>>>> So the ARM folks can easily avoid that by switching to 
>>>>>>>>>> dynamic locking
>>>>>>>>>> for both.
>>>>>>>> So you still break the contract between importer and exporter, 
>>>>>>>> except not
>>>>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>>>>> No, the contract between importer and exporter stays exactly the 
>>>>>>> same it
>>>>>>> is currently as long as you don't switch to dynamic dma-buf 
>>>>>>> handling.
>>>>>>>
>>>>>>> There is no functional change for the ARM folks here. The only 
>>>>>>> change
>>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
>>>>>>> covered by intel-gfx-ci.
>>>>>> There's people who want to run amdgpu on ARM?
>>>>> Sure there are, we even recently fixed some bugs for this.
>>>>>
>>>>> But as far as I know there is no one currently which is affect by 
>>>>> this
>>>>> change on ARM with amdgpu.
>>>> But don't you break them with this now?
>>> No, we see the bidirectional attachment as compatible with the other 
>>> ones.
>>>
>>>> amdgpu will soon set the dynamic flag on exports, which forces the 
>>>> caching
>>>> at create time (to avoid the locking fun), which will then result in a
>>>> EBUSY at map_attachment time because we have a cached mapping, but 
>>>> it's
>>>> the wrong type.
>>> See the check in dma_buf_map_attachment():
>>>
>>>       if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
>>>           return ERR_PTR(-EBUSY);
>> Hm, I misread this. So yeah should work, +/- the issue that we might
>> not flush enough. But I guess that can be fixed whenever, it's not
>> like dma-api semantics are a great fit for us. Maybe a fixme comment
>> would be useful here ... I'll look at this tomorrow or so because atm
>> brain is slow, I'm down with the usual post-conference cold it seems
>> :-/
>
> Hope your are feeling better now, adding a comment is of course not a 
> problem.
>
> With that fixed can I get an reviewed-by or at least and acked-by?
>
> I want to land at least some parts of those changes now.
>
> Regards,
> Christian.
>
>> -Daniel
>>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
  2019-09-03  8:05   ` Daniel Vetter
@ 2019-10-08  8:55   ` Daniel Vetter
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2019-10-08  8:55 UTC (permalink / raw)
  To: Christian König
  Cc: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
> This patch is a stripped down version of the locking changes
> necessary to support dynamic DMA-buf handling.
> 
> For compatibility we cache the DMA-buf mapping as soon as
> exporter/importer disagree on the dynamic handling.


Needs huge summary of all the discussions and options we've considered
here. Copypasta totally fine with me if you don't feel like typing up the
entire sage anew, but imo really the more the better. Iirc we've had 3-4
dead-ends with different attempts at caching or locking rules.
 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-buf.c | 90 ++++++++++++++++++++++++++++++++++++---
>  include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>  2 files changed, 133 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 433d91d710e4..65052d52602b 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>  		return ERR_PTR(-EINVAL);
>  	}
>  
> +	if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
> +		    exp_info->ops->dynamic_mapping))
> +		return ERR_PTR(-EINVAL);
> +
>  	if (!try_module_get(exp_info->owner))
>  		return ERR_PTR(-ENOENT);
>  
> @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>  EXPORT_SYMBOL_GPL(dma_buf_put);
>  
>  /**
> - * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
> + * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
>   * calls attach() of dma_buf_ops to allow device-specific attach functionality
> - * @dmabuf:	[in]	buffer to attach device to.
> - * @dev:	[in]	device to be attached.
> + * @dmabuf:		[in]	buffer to attach device to.
> + * @dev:		[in]	device to be attached.
> + * @dynamic_mapping:	[in]	calling convention for map/unmap
>   *
>   * Returns struct dma_buf_attachment pointer for this attachment. Attachments
>   * must be cleaned up by calling dma_buf_detach().
> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>   * accessible to @dev, and cannot be moved to a more suitable place. This is
>   * indicated with the error code -EBUSY.
>   */
> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> -					  struct device *dev)
> +struct dma_buf_attachment *
> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> +		       bool dynamic_mapping)
>  {
>  	struct dma_buf_attachment *attach;
>  	int ret;
> @@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  
>  	attach->dev = dev;
>  	attach->dmabuf = dmabuf;
> +	attach->dynamic_mapping = dynamic_mapping;
>  
>  	mutex_lock(&dmabuf->lock);
>  
> @@ -685,16 +692,64 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  		if (ret)
>  			goto err_attach;
>  	}
> +	dma_resv_lock(dmabuf->resv, NULL);
>  	list_add(&attach->node, &dmabuf->attachments);
> +	dma_resv_unlock(dmabuf->resv);

Still inconsistent locking here with dma_resv_lock vs. dma_buf->lock. I'd
completely switch over dmabuf->attachments to dma_resv_lock and remove
dmabuf->lock for that use case (and just keep it for the vmap counter, at
least for now, but that should be moved over too imo in a later patch
perhaps).

Might also be a patch collision, but set_name is not switched over.

Also please update the kerneldoc for ->attachments and ->node to mention
which lock protects them.

>  
>  	mutex_unlock(&dmabuf->lock);
>  
> +	/* When either the importer or the exporter can't handle dynamic
> +	 * mappings we cache the mapping here to avoid issues with the
> +	 * reservation object lock.
> +	 */
> +	if (dma_buf_attachment_is_dynamic(attach) !=
> +	    dma_buf_is_dynamic(dmabuf)) {
> +		struct sg_table *sgt;
> +
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_lock(attach->dmabuf->resv, NULL);
> +
> +		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> +		if (!sgt)
> +			sgt = ERR_PTR(-ENOMEM);
> +		if (IS_ERR(sgt)) {
> +			ret = PTR_ERR(sgt);
> +			goto err_unlock;
> +		}
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_unlock(attach->dmabuf->resv);
> +		attach->sgt = sgt;
> +		attach->dir = DMA_BIDIRECTIONAL;
> +	}
> +
>  	return attach;
>  
>  err_attach:
>  	kfree(attach);
>  	mutex_unlock(&dmabuf->lock);
>  	return ERR_PTR(ret);
> +
> +err_unlock:
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_unlock(attach->dmabuf->resv);
> +
> +	dma_buf_detach(dmabuf, attach);
> +	return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
> +
> +/**
> + * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
> + * @dmabuf:	[in]	buffer to attach device to.
> + * @dev:	[in]	device to be attached.
> + *
> + * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
> + * mapping.
> + */
> +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> +					  struct device *dev)
> +{
> +	return dma_buf_dynamic_attach(dmabuf, dev, false);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_attach);
>  
> @@ -711,11 +766,20 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>  	if (WARN_ON(!dmabuf || !attach))
>  		return;
>  
> -	if (attach->sgt)
> +	if (attach->sgt) {
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_lock(attach->dmabuf->resv, NULL);
> +
>  		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>  
> +		if (dma_buf_is_dynamic(attach->dmabuf))
> +			dma_resv_unlock(attach->dmabuf->resv);
> +	}
> +
>  	mutex_lock(&dmabuf->lock);
> +	dma_resv_lock(dmabuf->resv, NULL);
>  	list_del(&attach->node);
> +	dma_resv_unlock(dmabuf->resv);
>  	if (dmabuf->ops->detach)
>  		dmabuf->ops->detach(dmabuf, attach);
>  
> @@ -749,6 +813,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  	if (WARN_ON(!attach || !attach->dmabuf))
>  		return ERR_PTR(-EINVAL);
>  
> +	if (dma_buf_attachment_is_dynamic(attach))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	if (attach->sgt) {
>  		/*
>  		 * Two mappings with different directions for the same
> @@ -761,6 +828,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  		return attach->sgt;
>  	}
>  
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>  	if (!sg_table)
>  		sg_table = ERR_PTR(-ENOMEM);
> @@ -793,9 +863,15 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>  	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>  		return;
>  
> +	if (dma_buf_attachment_is_dynamic(attach))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	if (attach->sgt == sg_table)
>  		return;
>  
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_resv_assert_held(attach->dmabuf->resv);
> +
>  	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> @@ -1219,10 +1295,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
>  		seq_puts(s, "\tAttached Devices:\n");
>  		attach_count = 0;
>  
> +		dma_resv_lock(buf_obj->resv, NULL);
>  		list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
>  			seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
>  			attach_count++;
>  		}
> +		dma_resv_unlock(buf_obj->resv);
>  
>  		seq_printf(s, "Total %d devices attached\n\n",
>  				attach_count);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index ec212cb27fdc..a8f8b2b812fd 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -42,6 +42,17 @@ struct dma_buf_ops {
>  	  */
>  	bool cache_sgt_mapping;
>  
> +	/**
> +	 * @dynamic_mapping:
> +	 *
> +	 * If true the framework makes sure that the map/unmap_dma_buf
> +	 * callbacks are always called with the dma_resv object locked.
> +	 *
> +	 * If false the framework makes ure that the map/unmap_dma_buf
> +	 * callbacks are always called without the dma_resv object locked.
> +	 */

Please add a note that this is an cache_sgt_mapping are exlusive.

> +	bool dynamic_mapping;
> +
>  	/**
>  	 * @attach:
>  	 *
> @@ -109,6 +120,9 @@ struct dma_buf_ops {
>  	 * any other kind of sharing that the exporter might wish to make
>  	 * available to buffer-users.
>  	 *
> +	 * This is always called with the dmabuf->resv object locked when
> +	 * the dynamic_mapping flag is true.
> +	 *
>  	 * Returns:
>  	 *
>  	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
> @@ -327,6 +341,8 @@ struct dma_buf {
>   * @sgt: cached mapping.
>   * @dir: direction of cached mapping.
>   * @priv: exporter specific attachment data.
> + * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
> + * dma_resv lock held.
>   *
>   * This structure holds the attachment information between the dma_buf buffer
>   * and its user device(s). The list contains one attachment struct per device
> @@ -343,6 +359,7 @@ struct dma_buf_attachment {
>  	struct list_head node;
>  	struct sg_table *sgt;
>  	enum dma_data_direction dir;
> +	bool dynamic_mapping;
>  	void *priv;
>  };
>  
> @@ -394,10 +411,39 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
>  	get_file(dmabuf->file);
>  }
>  
> +/**
> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> + * @dmabuf: the DMA-buf to check
> + *
> + * Returns true if a DMA-buf exporter wants to be called with the dma_resv
> + * locked, false if it doesn't wants to be called with the lock held.
> + */
> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> +{
> +	return dmabuf->ops->dynamic_mapping;
> +}
> +
> +/**
> + * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
> + * mappinsg
> + * @attach: the DMA-buf attachment to check
> + *
> + * Returns true if a DMA-buf importer wants to call the map/unmap functions with
> + * the dma_resv lock held.
> + */
> +static inline bool
> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> +{
> +	return attach->dynamic_mapping;
> +}
> +
>  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> -							struct device *dev);
> +					  struct device *dev);
> +struct dma_buf_attachment *
> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> +		       bool dynamic_mapping);
>  void dma_buf_detach(struct dma_buf *dmabuf,
> -				struct dma_buf_attachment *dmabuf_attach);
> +		    struct dma_buf_attachment *attach);
>  
>  struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>  
> @@ -409,6 +455,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
>  					enum dma_data_direction);
>  void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
>  				enum dma_data_direction);
> +void dma_buf_move_notify(struct dma_buf *dma_buf);
>  int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>  			     enum dma_data_direction dir);
>  int dma_buf_end_cpu_access(struct dma_buf *dma_buf,

Aside from the nits here I think this is ok, count me convinced on
principle.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-10-02  8:37                         ` Koenig, Christian
@ 2019-10-08  8:55                           ` Daniel Vetter
  2019-10-16 13:46                             ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-10-08  8:55 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Daniel Vetter, dri-devel, sumit.semwal, linaro-mm-sig,
	linux-media, intel-gfx

On Wed, Oct 02, 2019 at 08:37:50AM +0000, Koenig, Christian wrote:
> Hi Daniel,
> 
> once more a ping on this. Any more comments or can we get it comitted?

Sorry got a bit smashed past weeks, but should be resurrected now back
from xdc.
-Daniel
> 
> Thanks,
> Christian.
> 
> Am 24.09.19 um 11:50 schrieb Christian König:
> > Am 17.09.19 um 16:56 schrieb Daniel Vetter:
> >> [SNIP]
> >>>>>>>>>>>>       +    /* When either the importer or the exporter 
> >>>>>>>>>>>> can't handle dynamic
> >>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues 
> >>>>>>>>>>>> with the
> >>>>>>>>>>>> +     * reservation object lock.
> >>>>>>>>>>>> +     */
> >>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
> >>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
> >>>>>>>>>>>> +        struct sg_table *sgt;
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>>>>>>> + dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, 
> >>>>>>>>>>>> DMA_BIDIRECTIONAL);
> >>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely 
> >>>>>>>>>>> around the
> >>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who 
> >>>>>>>>>>> want to
> >>>>>>>>>>> control
> >>>>>>>>>>> this better.
> >>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try 
> >>>>>>>>>> to get the
> >>>>>>>>>> caching there for ARM.
> >>>>>>>>>>
> >>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid 
> >>>>>>>>>> the
> >>>>>>>>>> locking hydra when importer and exporter disagree on locking.
> >>>>>>>>>>
> >>>>>>>>>> So the ARM folks can easily avoid that by switching to 
> >>>>>>>>>> dynamic locking
> >>>>>>>>>> for both.
> >>>>>>>> So you still break the contract between importer and exporter, 
> >>>>>>>> except not
> >>>>>>>> for anything that's run in intel-gfx-ci so all is good?
> >>>>>>> No, the contract between importer and exporter stays exactly the 
> >>>>>>> same it
> >>>>>>> is currently as long as you don't switch to dynamic dma-buf 
> >>>>>>> handling.
> >>>>>>>
> >>>>>>> There is no functional change for the ARM folks here. The only 
> >>>>>>> change
> >>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
> >>>>>>> covered by intel-gfx-ci.
> >>>>>> There's people who want to run amdgpu on ARM?
> >>>>> Sure there are, we even recently fixed some bugs for this.
> >>>>>
> >>>>> But as far as I know there is no one currently which is affect by 
> >>>>> this
> >>>>> change on ARM with amdgpu.
> >>>> But don't you break them with this now?
> >>> No, we see the bidirectional attachment as compatible with the other 
> >>> ones.
> >>>
> >>>> amdgpu will soon set the dynamic flag on exports, which forces the 
> >>>> caching
> >>>> at create time (to avoid the locking fun), which will then result in a
> >>>> EBUSY at map_attachment time because we have a cached mapping, but 
> >>>> it's
> >>>> the wrong type.
> >>> See the check in dma_buf_map_attachment():
> >>>
> >>>       if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
> >>>           return ERR_PTR(-EBUSY);
> >> Hm, I misread this. So yeah should work, +/- the issue that we might
> >> not flush enough. But I guess that can be fixed whenever, it's not
> >> like dma-api semantics are a great fit for us. Maybe a fixme comment
> >> would be useful here ... I'll look at this tomorrow or so because atm
> >> brain is slow, I'm down with the usual post-conference cold it seems
> >> :-/
> >
> > Hope your are feeling better now, adding a comment is of course not a 
> > problem.
> >
> > With that fixed can I get an reviewed-by or at least and acked-by?
> >
> > I want to land at least some parts of those changes now.
> >
> > Regards,
> > Christian.
> >
> >> -Daniel
> >>
> >
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2
  2019-08-29 14:29 ` [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2 Christian König
@ 2019-10-08  9:25   ` Daniel Vetter
  2019-10-09 13:10     ` Christian König
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-10-08  9:25 UTC (permalink / raw)
  To: Christian König
  Cc: daniel, dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> This way we can even pipeline imported BO evictions.
> 
> v2: Limit this to only cases when the parent object uses a separate
>     reservation object as well. This fixes another OOM problem.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

Since I read quite a bit of ttm I figured I'll review this too, but I'm
totally lost. And git blame gives me at best commits with one-liner commit
messages, and the docs aren't explaining much at all either (and generally
they didn't get updated at all with all the changes in the past years). 

I have a vague idea of what you're doing here, but not enough to do review
with any confidence. And from other ttm patches from amd it feels a lot
like we have essentially a bus factor of 1 for all things ttm :-/
-Daniel

> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index fe81c565e7ef..2ebe9fe7f6c8 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  	kref_init(&fbo->base.kref);
>  	fbo->base.destroy = &ttm_transfered_destroy;
>  	fbo->base.acc_size = 0;
> -	fbo->base.base.resv = &fbo->base.base._resv;
> +	if (bo->base.resv == &bo->base._resv)
> +		fbo->base.base.resv = &fbo->base.base._resv;
> +
>  	dma_resv_init(fbo->base.base.resv);
>  	ret = dma_resv_trylock(fbo->base.base.resv);
>  	WARN_ON(!ret);
> @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  		if (ret)
>  			return ret;
>  
> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>  
>  		/**
>  		 * If we're not moving to fixed memory, the TTM object
> @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  		else
>  			bo->ttm = NULL;
>  
> -		ttm_bo_unreserve(ghost_obj);
> +		dma_resv_unlock(&ghost_obj->base._resv);
>  		ttm_bo_put(ghost_obj);
>  	}
>  
> @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>  		if (ret)
>  			return ret;
>  
> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>  
>  		/**
>  		 * If we're not moving to fixed memory, the TTM object
> @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>  		else
>  			bo->ttm = NULL;
>  
> -		ttm_bo_unreserve(ghost_obj);
> +		dma_resv_unlock(&ghost_obj->base._resv);
>  		ttm_bo_put(ghost_obj);
>  
>  	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>  	if (ret)
>  		return ret;
>  
> -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>  	/* Last resort, wait for the BO to be idle when we are OOM */
>  	if (ret)
>  		ttm_bo_wait(bo, false, false);
> @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>  	bo->mem.mem_type = TTM_PL_SYSTEM;
>  	bo->ttm = NULL;
>  
> -	ttm_bo_unreserve(ghost);
> +	dma_resv_unlock(&ghost->base._resv);
>  	ttm_bo_put(ghost);
>  
>  	return 0;
> -- 
> 2.17.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2
  2019-10-08  9:25   ` Daniel Vetter
@ 2019-10-09 13:10     ` Christian König
  2019-10-09 14:09       ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-10-09 13:10 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 08.10.19 um 11:25 schrieb Daniel Vetter:
> On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
>> This way we can even pipeline imported BO evictions.
>>
>> v2: Limit this to only cases when the parent object uses a separate
>>      reservation object as well. This fixes another OOM problem.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
> Since I read quite a bit of ttm I figured I'll review this too, but I'm
> totally lost. And git blame gives me at best commits with one-liner commit
> messages, and the docs aren't explaining much at all either (and generally
> they didn't get updated at all with all the changes in the past years).
>
> I have a vague idea of what you're doing here, but not enough to do review
> with any confidence. And from other ttm patches from amd it feels a lot
> like we have essentially a bus factor of 1 for all things ttm :-/

Yeah, that's one of a couple of reasons why I want to get rid of TTM in 
the long term.

Basically this is a bug fix for delay freeing ttm objects. When we hang 
the ttm object on a ghost object to be freed and the ttm object is an 
imported DMA-buf we run into the problem that we want to drop the 
mapping, but have the wrong lock taken (the lock of the ghost and not of 
the parent).

Regards,
Christian.

> -Daniel
>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index fe81c565e7ef..2ebe9fe7f6c8 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>>   	kref_init(&fbo->base.kref);
>>   	fbo->base.destroy = &ttm_transfered_destroy;
>>   	fbo->base.acc_size = 0;
>> -	fbo->base.base.resv = &fbo->base.base._resv;
>> +	if (bo->base.resv == &bo->base._resv)
>> +		fbo->base.base.resv = &fbo->base.base._resv;
>> +
>>   	dma_resv_init(fbo->base.base.resv);
>>   	ret = dma_resv_trylock(fbo->base.base.resv);
>>   	WARN_ON(!ret);
>> @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>   		if (ret)
>>   			return ret;
>>   
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>   
>>   		/**
>>   		 * If we're not moving to fixed memory, the TTM object
>> @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>   		else
>>   			bo->ttm = NULL;
>>   
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);
>>   		ttm_bo_put(ghost_obj);
>>   	}
>>   
>> @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>   		if (ret)
>>   			return ret;
>>   
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>   
>>   		/**
>>   		 * If we're not moving to fixed memory, the TTM object
>> @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>   		else
>>   			bo->ttm = NULL;
>>   
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);
>>   		ttm_bo_put(ghost_obj);
>>   
>>   	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
>> @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>   	if (ret)
>>   		return ret;
>>   
>> -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
>> +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>>   	/* Last resort, wait for the BO to be idle when we are OOM */
>>   	if (ret)
>>   		ttm_bo_wait(bo, false, false);
>> @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>   	bo->mem.mem_type = TTM_PL_SYSTEM;
>>   	bo->ttm = NULL;
>>   
>> -	ttm_bo_unreserve(ghost);
>> +	dma_resv_unlock(&ghost->base._resv);
>>   	ttm_bo_put(ghost);
>>   
>>   	return 0;
>> -- 
>> 2.17.1
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2
  2019-10-09 13:10     ` Christian König
@ 2019-10-09 14:09       ` Daniel Vetter
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2019-10-09 14:09 UTC (permalink / raw)
  To: christian.koenig
  Cc: Daniel Vetter, dri-devel, sumit.semwal, linaro-mm-sig,
	linux-media, intel-gfx

On Wed, Oct 09, 2019 at 03:10:09PM +0200, Christian König wrote:
> Am 08.10.19 um 11:25 schrieb Daniel Vetter:
> > On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> > > This way we can even pipeline imported BO evictions.
> > > 
> > > v2: Limit this to only cases when the parent object uses a separate
> > >      reservation object as well. This fixes another OOM problem.
> > > 
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > Since I read quite a bit of ttm I figured I'll review this too, but I'm
> > totally lost. And git blame gives me at best commits with one-liner commit
> > messages, and the docs aren't explaining much at all either (and generally
> > they didn't get updated at all with all the changes in the past years).
> > 
> > I have a vague idea of what you're doing here, but not enough to do review
> > with any confidence. And from other ttm patches from amd it feels a lot
> > like we have essentially a bus factor of 1 for all things ttm :-/
> 
> Yeah, that's one of a couple of reasons why I want to get rid of TTM in the
> long term.
> 
> Basically this is a bug fix for delay freeing ttm objects. When we hang the
> ttm object on a ghost object to be freed and the ttm object is an imported
> DMA-buf we run into the problem that we want to drop the mapping, but have
> the wrong lock taken (the lock of the ghost and not of the parent).

Got intrigued, did some more digging, I guess the bugfix part is related
to:

commit 841e763b40764a7699ae07f4cb1921af62d6316d
Author: Christian König <christian.koenig@amd.com>
Date:   Thu Jul 20 20:55:06 2017 +0200

    drm/ttm: individualize BO reservation obj when they are freed

and that's why you switch everything over to useing _resv instead of the
pointer. But then I still don't follow the details ...

> 

> Regards,
> Christian.
> 
> > -Daniel
> > 
> > > ---
> > >   drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
> > >   1 file changed, 9 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > index fe81c565e7ef..2ebe9fe7f6c8 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
> > >   	kref_init(&fbo->base.kref);
> > >   	fbo->base.destroy = &ttm_transfered_destroy;
> > >   	fbo->base.acc_size = 0;
> > > -	fbo->base.base.resv = &fbo->base.base._resv;
> > > +	if (bo->base.resv == &bo->base._resv)
> > > +		fbo->base.base.resv = &fbo->base.base._resv;

I got confused a bit at first, until I spotted the

	fbo->base = *bo;

somewhere above. So I think that part makes sense, together with the above
cited patch. I think at least, confidence on this is very low ...

> > > +
> > >   	dma_resv_init(fbo->base.base.resv);
> > >   	ret = dma_resv_trylock(fbo->base.base.resv);

Shouldn't this be switched over to _resv too? Otherwise feels like
unbalanced locking.

> > >   	WARN_ON(!ret);
> > > @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >   		if (ret)
> > >   			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >   		/**
> > >   		 * If we're not moving to fixed memory, the TTM object
> > > @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >   		else
> > >   			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);
> > >   		ttm_bo_put(ghost_obj);
> > >   	}
> > > @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >   		if (ret)
> > >   			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >   		/**
> > >   		 * If we're not moving to fixed memory, the TTM object
> > > @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >   		else
> > >   			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);

I guess dropping the lru part here (aside from switching from ->resv to
->_resv, which is your bugfix I think) doesn't matter since the ghost
object got all cleared up and isn't on any lists anyway? Otoh how does it
work then ...

Not clear to me why this is safe.

> > >   		ttm_bo_put(ghost_obj);
> > >   	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> > > @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >   	if (ret)
> > >   		return ret;
> > > -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> > > +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
> > >   	/* Last resort, wait for the BO to be idle when we are OOM */
> > >   	if (ret)
> > >   		ttm_bo_wait(bo, false, false);
> > > @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >   	bo->mem.mem_type = TTM_PL_SYSTEM;
> > >   	bo->ttm = NULL;
> > > -	ttm_bo_unreserve(ghost);
> > > +	dma_resv_unlock(&ghost->base._resv);
> > >   	ttm_bo_put(ghost);
> > >   	return 0;
> > > -- 
> > > 2.17.1
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-10-08  8:55                           ` Daniel Vetter
@ 2019-10-16 13:46                             ` Koenig, Christian
  2019-10-16 14:23                               ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-10-16 13:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 08.10.19 um 10:55 schrieb Daniel Vetter:
> On Wed, Oct 02, 2019 at 08:37:50AM +0000, Koenig, Christian wrote:
>> Hi Daniel,
>>
>> once more a ping on this. Any more comments or can we get it comitted?
> Sorry got a bit smashed past weeks, but should be resurrected now back
> from xdc.

And any more thoughts on this? I mean we are blocked for month on this 
now :(

Thanks,
Christian.

> -Daniel
>> Thanks,
>> Christian.
>>
>> Am 24.09.19 um 11:50 schrieb Christian König:
>>> Am 17.09.19 um 16:56 schrieb Daniel Vetter:
>>>> [SNIP]
>>>>>>>>>>>>>>        +    /* When either the importer or the exporter
>>>>>>>>>>>>>> can't handle dynamic
>>>>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues
>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>> +     * reservation object lock.
>>>>>>>>>>>>>> +     */
>>>>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>>>>>>> + dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach,
>>>>>>>>>>>>>> DMA_BIDIRECTIONAL);
>>>>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely
>>>>>>>>>>>>> around the
>>>>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who
>>>>>>>>>>>>> want to
>>>>>>>>>>>>> control
>>>>>>>>>>>>> this better.
>>>>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try
>>>>>>>>>>>> to get the
>>>>>>>>>>>> caching there for ARM.
>>>>>>>>>>>>
>>>>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid
>>>>>>>>>>>> the
>>>>>>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>>>>>>
>>>>>>>>>>>> So the ARM folks can easily avoid that by switching to
>>>>>>>>>>>> dynamic locking
>>>>>>>>>>>> for both.
>>>>>>>>>> So you still break the contract between importer and exporter,
>>>>>>>>>> except not
>>>>>>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>>>>>>> No, the contract between importer and exporter stays exactly the
>>>>>>>>> same it
>>>>>>>>> is currently as long as you don't switch to dynamic dma-buf
>>>>>>>>> handling.
>>>>>>>>>
>>>>>>>>> There is no functional change for the ARM folks here. The only
>>>>>>>>> change
>>>>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
>>>>>>>>> covered by intel-gfx-ci.
>>>>>>>> There's people who want to run amdgpu on ARM?
>>>>>>> Sure there are, we even recently fixed some bugs for this.
>>>>>>>
>>>>>>> But as far as I know there is no one currently which is affect by
>>>>>>> this
>>>>>>> change on ARM with amdgpu.
>>>>>> But don't you break them with this now?
>>>>> No, we see the bidirectional attachment as compatible with the other
>>>>> ones.
>>>>>
>>>>>> amdgpu will soon set the dynamic flag on exports, which forces the
>>>>>> caching
>>>>>> at create time (to avoid the locking fun), which will then result in a
>>>>>> EBUSY at map_attachment time because we have a cached mapping, but
>>>>>> it's
>>>>>> the wrong type.
>>>>> See the check in dma_buf_map_attachment():
>>>>>
>>>>>        if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
>>>>>            return ERR_PTR(-EBUSY);
>>>> Hm, I misread this. So yeah should work, +/- the issue that we might
>>>> not flush enough. But I guess that can be fixed whenever, it's not
>>>> like dma-api semantics are a great fit for us. Maybe a fixme comment
>>>> would be useful here ... I'll look at this tomorrow or so because atm
>>>> brain is slow, I'm down with the usual post-conference cold it seems
>>>> :-/
>>> Hope your are feeling better now, adding a comment is of course not a
>>> problem.
>>>
>>> With that fixed can I get an reviewed-by or at least and acked-by?
>>>
>>> I want to land at least some parts of those changes now.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> -Daniel
>>>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-10-16 13:46                             ` Koenig, Christian
@ 2019-10-16 14:23                               ` Daniel Vetter
  2019-10-17  9:04                                 ` Koenig, Christian
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2019-10-16 14:23 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Wed, Oct 16, 2019 at 3:46 PM Koenig, Christian
<Christian.Koenig@amd.com> wrote:
>
> Am 08.10.19 um 10:55 schrieb Daniel Vetter:
> > On Wed, Oct 02, 2019 at 08:37:50AM +0000, Koenig, Christian wrote:
> >> Hi Daniel,
> >>
> >> once more a ping on this. Any more comments or can we get it comitted?
> > Sorry got a bit smashed past weeks, but should be resurrected now back
> > from xdc.
>
> And any more thoughts on this? I mean we are blocked for month on this
> now :(

I replied to both 1 and 2 in this series on 8th Oct. You even replied
to me in the thread on patch 2 ... but not here (I top posted since
this detour here just me being confused).
-Daniel

>
> Thanks,
> Christian.
>
> > -Daniel
> >> Thanks,
> >> Christian.
> >>
> >> Am 24.09.19 um 11:50 schrieb Christian König:
> >>> Am 17.09.19 um 16:56 schrieb Daniel Vetter:
> >>>> [SNIP]
> >>>>>>>>>>>>>>        +    /* When either the importer or the exporter
> >>>>>>>>>>>>>> can't handle dynamic
> >>>>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues
> >>>>>>>>>>>>>> with the
> >>>>>>>>>>>>>> +     * reservation object lock.
> >>>>>>>>>>>>>> +     */
> >>>>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
> >>>>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
> >>>>>>>>>>>>>> +        struct sg_table *sgt;
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
> >>>>>>>>>>>>>> + dma_resv_lock(attach->dmabuf->resv, NULL);
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach,
> >>>>>>>>>>>>>> DMA_BIDIRECTIONAL);
> >>>>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely
> >>>>>>>>>>>>> around the
> >>>>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who
> >>>>>>>>>>>>> want to
> >>>>>>>>>>>>> control
> >>>>>>>>>>>>> this better.
> >>>>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try
> >>>>>>>>>>>> to get the
> >>>>>>>>>>>> caching there for ARM.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid
> >>>>>>>>>>>> the
> >>>>>>>>>>>> locking hydra when importer and exporter disagree on locking.
> >>>>>>>>>>>>
> >>>>>>>>>>>> So the ARM folks can easily avoid that by switching to
> >>>>>>>>>>>> dynamic locking
> >>>>>>>>>>>> for both.
> >>>>>>>>>> So you still break the contract between importer and exporter,
> >>>>>>>>>> except not
> >>>>>>>>>> for anything that's run in intel-gfx-ci so all is good?
> >>>>>>>>> No, the contract between importer and exporter stays exactly the
> >>>>>>>>> same it
> >>>>>>>>> is currently as long as you don't switch to dynamic dma-buf
> >>>>>>>>> handling.
> >>>>>>>>>
> >>>>>>>>> There is no functional change for the ARM folks here. The only
> >>>>>>>>> change
> >>>>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
> >>>>>>>>> covered by intel-gfx-ci.
> >>>>>>>> There's people who want to run amdgpu on ARM?
> >>>>>>> Sure there are, we even recently fixed some bugs for this.
> >>>>>>>
> >>>>>>> But as far as I know there is no one currently which is affect by
> >>>>>>> this
> >>>>>>> change on ARM with amdgpu.
> >>>>>> But don't you break them with this now?
> >>>>> No, we see the bidirectional attachment as compatible with the other
> >>>>> ones.
> >>>>>
> >>>>>> amdgpu will soon set the dynamic flag on exports, which forces the
> >>>>>> caching
> >>>>>> at create time (to avoid the locking fun), which will then result in a
> >>>>>> EBUSY at map_attachment time because we have a cached mapping, but
> >>>>>> it's
> >>>>>> the wrong type.
> >>>>> See the check in dma_buf_map_attachment():
> >>>>>
> >>>>>        if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
> >>>>>            return ERR_PTR(-EBUSY);
> >>>> Hm, I misread this. So yeah should work, +/- the issue that we might
> >>>> not flush enough. But I guess that can be fixed whenever, it's not
> >>>> like dma-api semantics are a great fit for us. Maybe a fixme comment
> >>>> would be useful here ... I'll look at this tomorrow or so because atm
> >>>> brain is slow, I'm down with the usual post-conference cold it seems
> >>>> :-/
> >>> Hope your are feeling better now, adding a comment is of course not a
> >>> problem.
> >>>
> >>> With that fixed can I get an reviewed-by or at least and acked-by?
> >>>
> >>> I want to land at least some parts of those changes now.
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> -Daniel
> >>>>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] dma-buf: change DMA-buf locking convention
  2019-10-16 14:23                               ` Daniel Vetter
@ 2019-10-17  9:04                                 ` Koenig, Christian
  0 siblings, 0 replies; 25+ messages in thread
From: Koenig, Christian @ 2019-10-17  9:04 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 16.10.19 um 16:23 schrieb Daniel Vetter:
> On Wed, Oct 16, 2019 at 3:46 PM Koenig, Christian
> <Christian.Koenig@amd.com> wrote:
>> Am 08.10.19 um 10:55 schrieb Daniel Vetter:
>>> On Wed, Oct 02, 2019 at 08:37:50AM +0000, Koenig, Christian wrote:
>>>> Hi Daniel,
>>>>
>>>> once more a ping on this. Any more comments or can we get it comitted?
>>> Sorry got a bit smashed past weeks, but should be resurrected now back
>>> from xdc.
>> And any more thoughts on this? I mean we are blocked for month on this
>> now :(
> I replied to both 1 and 2 in this series on 8th Oct. You even replied
> to me in the thread on patch 2 ... but not here (I top posted since
> this detour here just me being confused).

Ok, in this case its my fault. I totally missed your reply on 1 and 
thought that the reply on 2 was actually for a different thread.

I'm going to submit the TTM changes separately, cause that is actually a 
bug fix for a completely different issue which just happens to surface 
because we change the locking.

Thanks,
Christian.

> -Daniel
>
>> Thanks,
>> Christian.
>>
>>> -Daniel
>>>> Thanks,
>>>> Christian.
>>>>
>>>> Am 24.09.19 um 11:50 schrieb Christian König:
>>>>> Am 17.09.19 um 16:56 schrieb Daniel Vetter:
>>>>>> [SNIP]
>>>>>>>>>>>>>>>>         +    /* When either the importer or the exporter
>>>>>>>>>>>>>>>> can't handle dynamic
>>>>>>>>>>>>>>>> +     * mappings we cache the mapping here to avoid issues
>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>> +     * reservation object lock.
>>>>>>>>>>>>>>>> +     */
>>>>>>>>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>>>>>>>>> + dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach,
>>>>>>>>>>>>>>>> DMA_BIDIRECTIONAL);
>>>>>>>>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely
>>>>>>>>>>>>>>> around the
>>>>>>>>>>>>>>> locking pain, but apparently upsets the arm-soc folks who
>>>>>>>>>>>>>>> want to
>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>> this better.
>>>>>>>>>>>>>> Take another look at dma_buf_map_attachment(), we still try
>>>>>>>>>>>>>> to get the
>>>>>>>>>>>>>> caching there for ARM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What we do here is to bidirectionally map the buffer to avoid
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So the ARM folks can easily avoid that by switching to
>>>>>>>>>>>>>> dynamic locking
>>>>>>>>>>>>>> for both.
>>>>>>>>>>>> So you still break the contract between importer and exporter,
>>>>>>>>>>>> except not
>>>>>>>>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>>>>>>>>> No, the contract between importer and exporter stays exactly the
>>>>>>>>>>> same it
>>>>>>>>>>> is currently as long as you don't switch to dynamic dma-buf
>>>>>>>>>>> handling.
>>>>>>>>>>>
>>>>>>>>>>> There is no functional change for the ARM folks here. The only
>>>>>>>>>>> change
>>>>>>>>>>> which takes effect is between i915 and amdgpu and that is perfectly
>>>>>>>>>>> covered by intel-gfx-ci.
>>>>>>>>>> There's people who want to run amdgpu on ARM?
>>>>>>>>> Sure there are, we even recently fixed some bugs for this.
>>>>>>>>>
>>>>>>>>> But as far as I know there is no one currently which is affect by
>>>>>>>>> this
>>>>>>>>> change on ARM with amdgpu.
>>>>>>>> But don't you break them with this now?
>>>>>>> No, we see the bidirectional attachment as compatible with the other
>>>>>>> ones.
>>>>>>>
>>>>>>>> amdgpu will soon set the dynamic flag on exports, which forces the
>>>>>>>> caching
>>>>>>>> at create time (to avoid the locking fun), which will then result in a
>>>>>>>> EBUSY at map_attachment time because we have a cached mapping, but
>>>>>>>> it's
>>>>>>>> the wrong type.
>>>>>>> See the check in dma_buf_map_attachment():
>>>>>>>
>>>>>>>         if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
>>>>>>>             return ERR_PTR(-EBUSY);
>>>>>> Hm, I misread this. So yeah should work, +/- the issue that we might
>>>>>> not flush enough. But I guess that can be fixed whenever, it's not
>>>>>> like dma-api semantics are a great fit for us. Maybe a fixme comment
>>>>>> would be useful here ... I'll look at this tomorrow or so because atm
>>>>>> brain is slow, I'm down with the usual post-conference cold it seems
>>>>>> :-/
>>>>> Hope your are feeling better now, adding a comment is of course not a
>>>>> problem.
>>>>>
>>>>> With that fixed can I get an reviewed-by or at least and acked-by?
>>>>>
>>>>> I want to land at least some parts of those changes now.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> -Daniel
>>>>>>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-10-17  9:04 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-29 14:29 Dynamic DMA-buf locking changes Christian König
2019-08-29 14:29 ` [PATCH 1/4] dma-buf: change DMA-buf locking convention Christian König
2019-09-03  8:05   ` Daniel Vetter
2019-09-11 10:53     ` Christian König
2019-09-16 12:23       ` Christian König
2019-09-17 12:31         ` Daniel Vetter
2019-09-17 12:40           ` Koenig, Christian
2019-09-17 13:13             ` Daniel Vetter
2019-09-17 13:24               ` Koenig, Christian
2019-09-17 13:45                 ` Daniel Vetter
2019-09-17 14:47                   ` Koenig, Christian
2019-09-17 14:56                     ` Daniel Vetter
2019-09-24  9:51                       ` Koenig, Christian
2019-10-02  8:37                         ` Koenig, Christian
2019-10-08  8:55                           ` Daniel Vetter
2019-10-16 13:46                             ` Koenig, Christian
2019-10-16 14:23                               ` Daniel Vetter
2019-10-17  9:04                                 ` Koenig, Christian
2019-10-08  8:55   ` Daniel Vetter
2019-08-29 14:29 ` [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2 Christian König
2019-10-08  9:25   ` Daniel Vetter
2019-10-09 13:10     ` Christian König
2019-10-09 14:09       ` Daniel Vetter
2019-08-29 14:29 ` [PATCH 3/4] drm/amdgpu: add independent DMA-buf export v7 Christian König
2019-08-29 14:29 ` [PATCH 4/4] drm/amdgpu: add independent DMA-buf import v8 Christian König

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).