linux-media.vger.kernel.org archive mirror
* RFC: Unpinned DMA-buf handling
@ 2019-10-29 10:40 Christian König
  2019-10-29 10:40 ` [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14 Christian König
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

The basic idea has stayed the same since the last version of these patches. The exporter can provide explicit pin/unpin functions and the importer a move_notify callback. This allows us to avoid pinning buffers while importers have a mapping for them.

In contrast to the last version, the locking changes were separated from this patchset and committed to drm-misc-next.

This allows drivers to implement the new locking semantics without the extra unpinned handling, but of course the changed locking semantics are still a prerequisite for the unpinned handling.

The last time this set was sent out, the discussion ended by questioning whether the move_notify callback was really the right approach for notifying the importers that a buffer is about to change its placement. A possible alternative would be to add a specially crafted fence object instead.

Let's discuss the different approaches once more,
Christian.




* [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
@ 2019-10-29 10:40 ` Christian König
  2019-11-05 10:20   ` Daniel Vetter
  2019-10-29 10:40 ` [PATCH 2/5] drm/ttm: remove the backing store if no placement is given Christian König
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On the exporter side we add optional explicit pinning callbacks, which are
called when the importer doesn't implement dynamic handling or move
notification, or needs the DMA-buf locked in place for its use case.

On the importer side we add an optional move_notify callback. This callback is
used by the exporter to inform the importers that their mappings should be
destroyed as soon as possible.

This allows the exporter to provide the mappings without the need to pin
the backing store.
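
The pin-or-notify rule described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: every model_* name is hypothetical, a plain bool stands in for the dma_resv lock, and a NULL importer_ops is treated as "needs pinning" for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_attachment;

/* Importer-side callbacks, playing the role of dma_buf_attach_ops. */
struct model_attach_ops {
	void (*move_notify)(struct model_attachment *attach);
};

/* Exporter-side buffer state; resv_locked stands in for the dma_resv lock. */
struct model_buf {
	bool resv_locked;
	int pin_count;
};

struct model_attachment {
	struct model_buf *buf;
	const struct model_attach_ops *importer_ops;
	bool mapping_stale;
};

/* An importer that cannot be told about moves needs the buffer pinned. */
static bool model_needs_pin(const struct model_attachment *attach)
{
	return !attach->importer_ops || !attach->importer_ops->move_notify;
}

/* Mapping pins only on behalf of importers without move_notify. */
static void model_map(struct model_attachment *attach)
{
	assert(attach->buf->resv_locked);
	if (model_needs_pin(attach))
		attach->buf->pin_count++;
	attach->mapping_stale = false;
}

/* Unmapping drops the pin taken by model_map(), if there was one. */
static void model_unmap(struct model_attachment *attach)
{
	assert(attach->buf->resv_locked);
	if (model_needs_pin(attach))
		attach->buf->pin_count--;
}

/* The exporter tells every capable importer that its mapping is outdated. */
static void model_move_notify(struct model_attachment **attachments,
			      size_t count)
{
	for (size_t i = 0; i < count; i++) {
		const struct model_attach_ops *ops = attachments[i]->importer_ops;

		if (ops && ops->move_notify)
			ops->move_notify(attachments[i]);
	}
}

/* Example importer callback: only records that a remap is needed. */
static void model_mark_stale(struct model_attachment *attach)
{
	attach->mapping_stale = true;
}

static const struct model_attach_ops model_dynamic_ops = {
	.move_notify = model_mark_stale,
};
```

In this model, mapping through an attachment that uses model_dynamic_ops leaves pin_count untouched, while an attachment with NULL ops raises it until the matching model_unmap().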

v2: don't try to invalidate mappings when the callback is NULL,
    lock the reservation obj while using the attachments,
    add helper to set the callback
v3: move flag for invalidation support into the DMA-buf,
    use new attach_info structure to set the callback
v4: use importer_priv field instead of mangling exporter priv.
v5: drop invalidation_supported flag
v6: squash together with pin/unpin changes
v7: pin/unpin takes an attachment now
v8: nuke dma_buf_attachment_(map|unmap)_locked,
    everything is now handled backward compatible
v9: always cache when export/importer don't agree on dynamic handling
v10: minimal style cleanup
v11: drop automatic re-entry avoidance
v12: rename callback to move_notify
v13: add might_lock in appropriate places
v14: rebase on separated locking change

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c                   | 106 ++++++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |   6 +-
 include/linux/dma-buf.h                     |  78 ++++++++++++--
 3 files changed, 170 insertions(+), 20 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d377b4ca66bf..ce293cee76ed 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -529,6 +529,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 		    exp_info->ops->dynamic_mapping))
 		return ERR_PTR(-EINVAL);
 
+	if (WARN_ON(!exp_info->ops->dynamic_mapping &&
+		    (exp_info->ops->pin || exp_info->ops->unpin)))
+		return ERR_PTR(-EINVAL);
+
 	if (!try_module_get(exp_info->owner))
 		return ERR_PTR(-ENOENT);
 
@@ -653,7 +657,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * calls attach() of dma_buf_ops to allow device-specific attach functionality
  * @dmabuf:		[in]	buffer to attach device to.
  * @dev:		[in]	device to be attached.
- * @dynamic_mapping:	[in]	calling convention for map/unmap
+ * @importer_ops:	[in]	importer operations for the attachment
+ * @importer_priv:	[in]	importer private pointer for the attachment
  *
  * Returns struct dma_buf_attachment pointer for this attachment. Attachments
  * must be cleaned up by calling dma_buf_detach().
@@ -669,7 +674,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  */
 struct dma_buf_attachment *
 dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
-		       bool dynamic_mapping)
+		       const struct dma_buf_attach_ops *importer_ops,
+		       void *importer_priv)
 {
 	struct dma_buf_attachment *attach;
 	int ret;
@@ -683,7 +689,8 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 
 	attach->dev = dev;
 	attach->dmabuf = dmabuf;
-	attach->dynamic_mapping = dynamic_mapping;
+	attach->importer_ops = importer_ops;
+	attach->importer_priv = importer_priv;
 
 	if (dmabuf->ops->attach) {
 		ret = dmabuf->ops->attach(dmabuf, attach);
@@ -702,15 +709,19 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 	    dma_buf_is_dynamic(dmabuf)) {
 		struct sg_table *sgt;
 
-		if (dma_buf_is_dynamic(attach->dmabuf))
+		if (dma_buf_is_dynamic(attach->dmabuf)) {
 			dma_resv_lock(attach->dmabuf->resv, NULL);
+			ret = dma_buf_pin(attach);
+			if (ret)
+				goto err_unlock;
+		}
 
 		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
 		if (!sgt)
 			sgt = ERR_PTR(-ENOMEM);
 		if (IS_ERR(sgt)) {
 			ret = PTR_ERR(sgt);
-			goto err_unlock;
+			goto err_unpin;
 		}
 		if (dma_buf_is_dynamic(attach->dmabuf))
 			dma_resv_unlock(attach->dmabuf->resv);
@@ -724,6 +735,10 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 	kfree(attach);
 	return ERR_PTR(ret);
 
+err_unpin:
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		dma_buf_unpin(attach);
+
 err_unlock:
 	if (dma_buf_is_dynamic(attach->dmabuf))
 		dma_resv_unlock(attach->dmabuf->resv);
@@ -744,7 +759,7 @@ EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 					  struct device *dev)
 {
-	return dma_buf_dynamic_attach(dmabuf, dev, false);
+	return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(dma_buf_attach);
 
@@ -767,8 +782,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 
 		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
 
-		if (dma_buf_is_dynamic(attach->dmabuf))
+		if (dma_buf_is_dynamic(attach->dmabuf)) {
+			dma_buf_unpin(attach);
 			dma_resv_unlock(attach->dmabuf->resv);
+		}
 	}
 
 	dma_resv_lock(dmabuf->resv, NULL);
@@ -781,6 +798,44 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 }
 EXPORT_SYMBOL_GPL(dma_buf_detach);
 
+/**
+ * dma_buf_pin - Lock down the DMA-buf
+ *
+ * @attach:	[in]	attachment which should be pinned
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
+ */
+int dma_buf_pin(struct dma_buf_attachment *attach)
+{
+	struct dma_buf *dmabuf = attach->dmabuf;
+	int ret = 0;
+
+	dma_resv_assert_held(dmabuf->resv);
+
+	if (dmabuf->ops->pin)
+		ret = dmabuf->ops->pin(attach);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dma_buf_pin);
+
+/**
+ * dma_buf_unpin - Remove lock from DMA-buf
+ *
+ * @attach:	[in]	attachment which should be unpinned
+ */
+void dma_buf_unpin(struct dma_buf_attachment *attach)
+{
+	struct dma_buf *dmabuf = attach->dmabuf;
+
+	dma_resv_assert_held(dmabuf->resv);
+
+	if (dmabuf->ops->unpin)
+		dmabuf->ops->unpin(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unpin);
+
 /**
  * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
  * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
@@ -800,6 +855,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 					enum dma_data_direction direction)
 {
 	struct sg_table *sg_table;
+	int r;
 
 	might_sleep();
 
@@ -821,13 +877,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 		return attach->sgt;
 	}
 
-	if (dma_buf_is_dynamic(attach->dmabuf))
+	if (dma_buf_is_dynamic(attach->dmabuf)) {
 		dma_resv_assert_held(attach->dmabuf->resv);
+		if (!attach->importer_ops->move_notify) {
+			r = dma_buf_pin(attach);
+			if (r)
+				return ERR_PTR(r);
+		}
+	}
 
 	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
 	if (!sg_table)
 		sg_table = ERR_PTR(-ENOMEM);
 
+	if (IS_ERR(sg_table) && dma_buf_is_dynamic(attach->dmabuf) &&
+	    !attach->importer_ops->move_notify)
+		dma_buf_unpin(attach);
+
 	if (!IS_ERR(sg_table) && attach->dmabuf->ops->cache_sgt_mapping) {
 		attach->sgt = sg_table;
 		attach->dir = direction;
@@ -866,9 +932,33 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 		dma_resv_assert_held(attach->dmabuf->resv);
 
 	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
+
+	if (dma_buf_is_dynamic(attach->dmabuf) &&
+	    !attach->importer_ops->move_notify)
+		dma_buf_unpin(attach);
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
 
+/**
+ * dma_buf_move_notify - notify attachments that DMA-buf is moving
+ *
+ * @dmabuf:	[in]	buffer which is moving
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_move_notify(struct dma_buf *dmabuf)
+{
+	struct dma_buf_attachment *attach;
+
+	dma_resv_assert_held(dmabuf->resv);
+
+	list_for_each_entry(attach, &dmabuf->attachments, node)
+		if (attach->importer_ops && attach->importer_ops->move_notify)
+			attach->importer_ops->move_notify(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_move_notify);
+
 /**
  * DOC: cpu access
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index d5bcdfefbad6..8e5a68107556 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -415,6 +415,9 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 	return ERR_PTR(ret);
 }
 
+static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
+};
+
 /**
  * amdgpu_gem_prime_import - &drm_driver.gem_prime_import implementation
  * @dev: DRM device
@@ -447,7 +450,8 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 	if (IS_ERR(obj))
 		return obj;
 
-	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, true);
+	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
+					&amdgpu_dma_buf_attach_ops, NULL);
 	if (IS_ERR(attach)) {
 		drm_gem_object_put(obj);
 		return ERR_CAST(attach);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index af73f835c51c..7456bb937635 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -93,14 +93,40 @@ struct dma_buf_ops {
 	 */
 	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
 
+	/**
+	 * @pin:
+	 *
+	 * This is called by dma_buf_pin and lets the exporter know that the
+	 * DMA-buf can't be moved any more.
+	 *
+	 * This is called with the dmabuf->resv object locked.
+	 *
+	 * This callback is optional.
+	 *
+	 * Returns:
+	 *
+	 * 0 on success, negative error code on failure.
+	 */
+	int (*pin)(struct dma_buf_attachment *attach);
+
+	/**
+	 * @unpin:
+	 *
+	 * This is called by dma_buf_unpin and lets the exporter know that the
+	 * DMA-buf can be moved again.
+	 *
+	 * This is called with the dmabuf->resv object locked.
+	 *
+	 * This callback is optional.
+	 */
+	void (*unpin)(struct dma_buf_attachment *attach);
+
 	/**
 	 * @map_dma_buf:
 	 *
 	 * This is called by dma_buf_map_attachment() and is used to map a
 	 * shared &dma_buf into device address space, and it is mandatory. It
-	 * can only be called if @attach has been called successfully. This
-	 * essentially pins the DMA buffer into place, and it cannot be moved
-	 * any more
+	 * can only be called if @attach has been called successfully.
 	 *
 	 * This call may sleep, e.g. when the backing storage first needs to be
 	 * allocated, or moved to a location suitable for all currently attached
@@ -141,9 +167,6 @@ struct dma_buf_ops {
 	 *
 	 * This is called by dma_buf_unmap_attachment() and should unmap and
 	 * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
-	 * It should also unpin the backing storage if this is the last mapping
-	 * of the DMA buffer, it the exporter supports backing storage
-	 * migration.
 	 */
 	void (*unmap_dma_buf)(struct dma_buf_attachment *,
 			      struct sg_table *,
@@ -336,6 +359,34 @@ struct dma_buf {
 	} cb_excl, cb_shared;
 };
 
+/**
+ * struct dma_buf_attach_ops - importer operations for an attachment
+ * @move_notify: [optional] notification that the DMA-buf is moving
+ *
+ * Attachment operations implemented by the importer.
+ */
+struct dma_buf_attach_ops {
+	/**
+	 * @move_notify:
+	 *
+	 * If this callback is provided the framework can avoid pinning the
+	 * backing store while mappings exist.
+	 *
+	 * This callback is called with the lock of the reservation object
+	 * associated with the dma_buf held and the mapping function must be
+	 * called with this lock held as well. This makes sure that no mapping
+	 * is created concurrently with an ongoing move operation.
+	 *
+	 * Mappings stay valid and are not directly affected by this callback.
+	 * But the DMA-buf can now be in a different physical location, so all
+	 * mappings should be destroyed and re-created as soon as possible.
+	 *
+	 * New mappings can be created after this callback returns, and will
+	 * point to the new location of the DMA-buf.
+	 */
+	void (*move_notify)(struct dma_buf_attachment *attach);
+};
+
 /**
  * struct dma_buf_attachment - holds device-buffer attachment data
  * @dmabuf: buffer for this attachment.
@@ -344,8 +395,9 @@ struct dma_buf {
  * @sgt: cached mapping.
  * @dir: direction of cached mapping.
  * @priv: exporter specific attachment data.
- * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
- * dma_resv lock held.
+ * @importer_ops: importer operations for this attachment, if provided
+ * dma_buf_map/unmap_attachment() must be called with the dma_resv lock held.
+ * @importer_priv: importer specific attachment data.
  *
  * This structure holds the attachment information between the dma_buf buffer
  * and its user device(s). The list contains one attachment struct per device
@@ -362,7 +414,8 @@ struct dma_buf_attachment {
 	struct list_head node;
 	struct sg_table *sgt;
 	enum dma_data_direction dir;
-	bool dynamic_mapping;
+	const struct dma_buf_attach_ops *importer_ops;
+	void *importer_priv;
 	void *priv;
 };
 
@@ -438,16 +491,19 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
 static inline bool
 dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
 {
-	return attach->dynamic_mapping;
+	return !!attach->importer_ops;
 }
 
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 					  struct device *dev);
 struct dma_buf_attachment *
 dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
-		       bool dynamic_mapping);
+		       const struct dma_buf_attach_ops *importer_ops,
+		       void *importer_priv);
 void dma_buf_detach(struct dma_buf *dmabuf,
 		    struct dma_buf_attachment *attach);
+int dma_buf_pin(struct dma_buf_attachment *attach);
+void dma_buf_unpin(struct dma_buf_attachment *attach);
 
 struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
 
-- 
2.17.1



* [PATCH 2/5] drm/ttm: remove the backing store if no placement is given
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
  2019-10-29 10:40 ` [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14 Christian König
@ 2019-10-29 10:40 ` Christian König
  2019-10-29 10:40 ` [PATCH 3/5] drm/amdgpu: use allowed_domains for exported DMA-bufs Christian König
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Pipeline removal of the BO's backing store when no placement is given
during validation.
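
The effect of this change can be sketched in user space as follows. This is a simplified model under stated assumptions, not the TTM code: the model_* names are hypothetical, and "gutting" is reduced to moving the object to a system placement with no pages behind it.

```c
#include <stdbool.h>

enum model_mem_type {
	MODEL_PL_SYSTEM,
	MODEL_PL_GTT,
};

/* Counts only; the real placement lists are not modelled. */
struct model_placement {
	unsigned int num_placement;
	unsigned int num_busy_placement;
};

struct model_bo {
	enum model_mem_type mem_type;
	bool has_backing_store;
};

/*
 * An empty placement means "drop the backing store"; anything else
 * is treated here as a request for a populated GTT placement.
 */
static int model_validate(struct model_bo *bo,
			  const struct model_placement *placement)
{
	if (!placement->num_placement && !placement->num_busy_placement) {
		bo->mem_type = MODEL_PL_SYSTEM;
		bo->has_backing_store = false;
		return 0;
	}

	bo->mem_type = MODEL_PL_GTT;
	bo->has_backing_store = true;
	return 0;
}
```

An importer can use the same entry point to throw away an outdated mapping by validating against a zero-initialized placement, which appears to be what the move_notify implementation later in this series relies on.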

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d52fc16266ce..2d1488271d63 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1219,6 +1219,18 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
 	uint32_t new_flags;
 
 	dma_resv_assert_held(bo->base.resv);
+
+	/*
+	 * Remove the backing store if no placement is given.
+	 */
+	if (!placement->num_placement && !placement->num_busy_placement) {
+		ret = ttm_bo_pipeline_gutting(bo);
+		if (ret)
+			return ret;
+
+		return ttm_tt_create(bo, false);
+	}
+
 	/*
 	 * Check whether we need to move buffer.
 	 */
-- 
2.17.1



* [PATCH 3/5] drm/amdgpu: use allowed_domains for exported DMA-bufs
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
  2019-10-29 10:40 ` [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14 Christian König
  2019-10-29 10:40 ` [PATCH 2/5] drm/ttm: remove the backing store if no placement is given Christian König
@ 2019-10-29 10:40 ` Christian König
  2019-10-29 10:40 ` [PATCH 4/5] drm/amdgpu: add amdgpu_dma_buf_pin/unpin Christian König
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Avoid ping-ponging the buffers once we stop pinning DMA-buf exports
by using the allowed domains for exported buffers.
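
The condition this patch adds can be sketched as a standalone predicate. This is a user-space illustration with hypothetical model_* names. It assumes, as the subject line suggests, that the surrounding function falls back to the allowed domains whenever the predicate is false.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two facts the check looks at are modelled. */
struct model_bo {
	bool exported;                /* a DMA-buf exists for this BO */
	unsigned int num_attachments; /* importers currently attached */
};

/*
 * A BO may be moved to its preferred domains only while the move budget
 * is not used up and no importer is attached to its DMA-buf.
 */
static bool model_may_move_to_preferred(uint64_t bytes_moved,
					uint64_t bytes_moved_threshold,
					const struct model_bo *bo)
{
	bool has_importers = bo->exported && bo->num_attachments > 0;

	return bytes_moved < bytes_moved_threshold && !has_importers;
}
```

An exported buffer with at least one attachment therefore stays wherever its allowed domains permit, instead of being pulled back to its preferred domain on each submission and moved away again for the importer.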

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 253158fc378f..0253e0889daf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -28,6 +28,7 @@
 #include <linux/file.h>
 #include <linux/pagemap.h>
 #include <linux/sync_file.h>
+#include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_syncobj.h>
@@ -414,7 +415,9 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	/* Don't move this buffer if we have depleted our allowance
 	 * to move it. Don't move anything if the threshold is zero.
 	 */
-	if (p->bytes_moved < p->bytes_moved_threshold) {
+	if (p->bytes_moved < p->bytes_moved_threshold &&
+	    (!bo->tbo.base.dma_buf ||
+	    list_empty(&bo->tbo.base.dma_buf->attachments))) {
 		if (!amdgpu_gmc_vram_full_visible(&adev->gmc) &&
 		    (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
 			/* And don't move a CPU_ACCESS_REQUIRED BO to limited
-- 
2.17.1



* [PATCH 4/5] drm/amdgpu: add amdgpu_dma_buf_pin/unpin
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
                   ` (2 preceding siblings ...)
  2019-10-29 10:40 ` [PATCH 3/5] drm/amdgpu: use allowed_domains for exported DMA-bufs Christian König
@ 2019-10-29 10:40 ` Christian König
  2019-10-29 10:40 ` [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify Christian König
  2019-11-05 13:46 ` RFC: Unpinned DMA-buf handling Daniel Vetter
  5 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

This implements the exporter side of unpinned DMA-buf handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 55 ++++++++++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  5 ++
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 8e5a68107556..3629cfe53aad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -222,6 +222,37 @@ static void amdgpu_dma_buf_detach(struct dma_buf *dmabuf,
 		bo->prime_shared_count--;
 }
 
+/**
+ * amdgpu_dma_buf_pin - &dma_buf_ops.pin implementation
+ *
+ * @attach: attachment to pin down
+ *
+ * Pin the BO which is backing the DMA-buf so that it can't move any more.
+ */
+static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+
+	/* pin buffer into GTT */
+	return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
+}
+
+/**
+ * amdgpu_dma_buf_unpin - &dma_buf_ops.unpin implementation
+ *
+ * @attach: attachment to unpin
+ *
+ * Unpin a previously pinned BO to make it movable again.
+ */
+static void amdgpu_dma_buf_unpin(struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+
+	amdgpu_bo_unpin(bo);
+}
+
 /**
  * amdgpu_dma_buf_map - &dma_buf_ops.map_dma_buf implementation
  * @attach: DMA-buf attachment
@@ -244,9 +275,21 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
 	struct sg_table *sgt;
 	long r;
 
-	r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
-	if (r)
-		return ERR_PTR(r);
+	if (!bo->pin_count) {
+		/* move buffer into GTT */
+		struct ttm_operation_ctx ctx = { false, false };
+
+		amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
+		r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
+		if (r)
+			return ERR_PTR(r);
+
+	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type) &
+		     AMDGPU_GEM_DOMAIN_GTT)) {
+			return ERR_PTR(-EBUSY);
+	}
+
+
 
 	sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, bo->tbo.num_pages);
 	if (IS_ERR(sgt))
@@ -277,13 +320,9 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach,
 				 struct sg_table *sgt,
 				 enum dma_data_direction dir)
 {
-	struct drm_gem_object *obj = attach->dmabuf->priv;
-	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-
 	dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
 	sg_free_table(sgt);
 	kfree(sgt);
-	amdgpu_bo_unpin(bo);
 }
 
 /**
@@ -330,6 +369,8 @@ const struct dma_buf_ops amdgpu_dmabuf_ops = {
 	.dynamic_mapping = true,
 	.attach = amdgpu_dma_buf_attach,
 	.detach = amdgpu_dma_buf_detach,
+	.pin = amdgpu_dma_buf_pin,
+	.unpin = amdgpu_dma_buf_unpin,
 	.map_dma_buf = amdgpu_dma_buf_map,
 	.unmap_dma_buf = amdgpu_dma_buf_unmap,
 	.release = drm_gem_dmabuf_release,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f0b789a0b49..ac776d2620eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -31,6 +31,7 @@
  */
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_cache.h>
@@ -1209,6 +1210,10 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 
 	amdgpu_bo_kunmap(abo);
 
+	if (abo->tbo.base.dma_buf && !abo->tbo.base.import_attach &&
+	    bo->mem.mem_type != TTM_PL_SYSTEM)
+		dma_buf_move_notify(abo->tbo.base.dma_buf);
+
 	/* remember the eviction */
 	if (evict)
 		atomic64_inc(&adev->num_evictions);
-- 
2.17.1



* [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
                   ` (3 preceding siblings ...)
  2019-10-29 10:40 ` [PATCH 4/5] drm/amdgpu: add amdgpu_dma_buf_pin/unpin Christian König
@ 2019-10-29 10:40 ` Christian König
  2019-11-05 10:52   ` Daniel Vetter
  2019-11-05 13:46 ` RFC: Unpinned DMA-buf handling Daniel Vetter
  5 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2019-10-29 10:40 UTC
  To: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Implement the importer side of unpinned DMA-buf handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 3629cfe53aad..af39553c51ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 	return ERR_PTR(ret);
 }
 
+/**
+ * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
+ *
+ * @attach: the DMA-buf attachment
+ *
+ * Invalidate the DMA-buf attachment, making sure that we re-create the
+ * mapping before the next use.
+ */
+static void
+amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
+{
+	struct ttm_operation_ctx ctx = { false, false };
+	struct drm_gem_object *obj = attach->importer_priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+	struct ttm_placement placement = {};
+	int r;
+
+	if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
+		return;
+
+	r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
+	if (r)
+		DRM_ERROR("Failed to invalidate DMA-buf import (%d)\n", r);
+}
+
 static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
+	.move_notify = amdgpu_dma_buf_move_notify
 };
 
 /**
@@ -492,7 +518,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 		return obj;
 
 	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
-					&amdgpu_dma_buf_attach_ops, NULL);
+					&amdgpu_dma_buf_attach_ops, obj);
 	if (IS_ERR(attach)) {
 		drm_gem_object_put(obj);
 		return ERR_CAST(attach);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ac776d2620eb..cfa46341c9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -861,6 +861,9 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 		return 0;
 	}
 
+	if (bo->tbo.base.import_attach)
+		dma_buf_pin(bo->tbo.base.import_attach);
+
 	bo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
 	/* force to pin into visible video ram */
 	if (!(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS))
@@ -944,6 +947,9 @@ int amdgpu_bo_unpin(struct amdgpu_bo *bo)
 
 	amdgpu_bo_subtract_pin_size(bo);
 
+	if (bo->tbo.base.import_attach)
+		dma_buf_unpin(bo->tbo.base.import_attach);
+
 	for (i = 0; i < bo->placement.num_placement; i++) {
 		bo->placements[i].lpfn = 0;
 		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
-- 
2.17.1



* Re: [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14
  2019-10-29 10:40 ` [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14 Christian König
@ 2019-11-05 10:20   ` Daniel Vetter
  2020-02-18 13:20     ` Christian König
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2019-11-05 10:20 UTC
  To: Christian König
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Tue, Oct 29, 2019 at 11:40:45AM +0100, Christian König wrote:
> On the exporter side we add optional explicit pinning callbacks, which are
> called when the importer doesn't implement dynamic handling or move
> notification, or needs the DMA-buf locked in place for its use case.
> 
> On the importer side we add an optional move_notify callback. This callback is
> used by the exporter to inform the importers that their mappings should be
> destroyed as soon as possible.
> 
> This allows the exporter to provide the mappings without the need to pin
> the backing store.
> 
> v2: don't try to invalidate mappings when the callback is NULL,
>     lock the reservation obj while using the attachments,
>     add helper to set the callback
> v3: move flag for invalidation support into the DMA-buf,
>     use new attach_info structure to set the callback
> v4: use importer_priv field instead of mangling exporter priv.
> v5: drop invalidation_supported flag
> v6: squash together with pin/unpin changes
> v7: pin/unpin takes an attachment now
> v8: nuke dma_buf_attachment_(map|unmap)_locked,
>     everything is now handled backward compatible
> v9: always cache when export/importer don't agree on dynamic handling
> v10: minimal style cleanup
> v11: drop automatic re-entry avoidance
> v12: rename callback to move_notify
> v13: add might_lock in appropriate places
> v14: rebase on separated locking change
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

Bunch of nitpicks/safety check comments here, I'll post the big question
stuff on the cover letter.

> ---
>  drivers/dma-buf/dma-buf.c                   | 106 ++++++++++++++++++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |   6 +-
>  include/linux/dma-buf.h                     |  78 ++++++++++++--
>  3 files changed, 170 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d377b4ca66bf..ce293cee76ed 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -529,6 +529,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>  		    exp_info->ops->dynamic_mapping))
>  		return ERR_PTR(-EINVAL);
>  
> +	if (WARN_ON(!exp_info->ops->dynamic_mapping &&
> +		    (exp_info->ops->pin || exp_info->ops->unpin)))
> +		return ERR_PTR(-EINVAL);

Imo make this stronger, have a dynamic mapping iff there's both a pin and
unpin function. Otherwise this doesn't make a lot of sense to me.
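
One possible reading of that stricter check, sketched in user space: the flag and the two callbacks must all be present or all be absent. The model_* names are hypothetical and the callback signatures are simplified to take a void pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Only the three fields the check looks at are modelled. */
struct model_exporter_ops {
	bool dynamic_mapping;
	int (*pin)(void *attach);
	void (*unpin)(void *attach);
};

/* Valid iff pin and unpin come as a pair and match the dynamic flag. */
static bool model_ops_valid(const struct model_exporter_ops *ops)
{
	bool has_pin = ops->pin != NULL;
	bool has_unpin = ops->unpin != NULL;

	return has_pin == has_unpin && ops->dynamic_mapping == has_pin;
}

/* Stub callbacks so that example ops tables can be filled in. */
static int model_pin_stub(void *attach)
{
	(void)attach;
	return 0;
}

static void model_unpin_stub(void *attach)
{
	(void)attach;
}
```

Compared with the check in the patch as posted, this variant also rejects an exporter that sets dynamic_mapping without providing both callbacks, and one that provides only one of the two.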

> +
>  	if (!try_module_get(exp_info->owner))
>  		return ERR_PTR(-ENOENT);
>  
> @@ -653,7 +657,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>   * calls attach() of dma_buf_ops to allow device-specific attach functionality
>   * @dmabuf:		[in]	buffer to attach device to.
>   * @dev:		[in]	device to be attached.
> - * @dynamic_mapping:	[in]	calling convention for map/unmap
> + * @importer_ops:	[in]	importer operations for the attachment
> + * @importer_priv:	[in]	importer private pointer for the attachment
>   *
>   * Returns struct dma_buf_attachment pointer for this attachment. Attachments
>   * must be cleaned up by calling dma_buf_detach().
> @@ -669,7 +674,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>   */
>  struct dma_buf_attachment *
>  dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> -		       bool dynamic_mapping)
> +		       const struct dma_buf_attach_ops *importer_ops,
> +		       void *importer_priv)
>  {
>  	struct dma_buf_attachment *attach;
>  	int ret;
> @@ -683,7 +689,8 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>  
>  	attach->dev = dev;
>  	attach->dmabuf = dmabuf;
> -	attach->dynamic_mapping = dynamic_mapping;
> +	attach->importer_ops = importer_ops;
> +	attach->importer_priv = importer_priv;
>  
>  	if (dmabuf->ops->attach) {
>  		ret = dmabuf->ops->attach(dmabuf, attach);
> @@ -702,15 +709,19 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>  	    dma_buf_is_dynamic(dmabuf)) {
>  		struct sg_table *sgt;
>  
> -		if (dma_buf_is_dynamic(attach->dmabuf))
> +		if (dma_buf_is_dynamic(attach->dmabuf)) {
>  			dma_resv_lock(attach->dmabuf->resv, NULL);
> +			ret = dma_buf_pin(attach);
> +			if (ret)
> +				goto err_unlock;
> +		}
>  
>  		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>  		if (!sgt)
>  			sgt = ERR_PTR(-ENOMEM);
>  		if (IS_ERR(sgt)) {
>  			ret = PTR_ERR(sgt);
> -			goto err_unlock;
> +			goto err_unpin;
>  		}
>  		if (dma_buf_is_dynamic(attach->dmabuf))
>  			dma_resv_unlock(attach->dmabuf->resv);
> @@ -724,6 +735,10 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>  	kfree(attach);
>  	return ERR_PTR(ret);
>  
> +err_unpin:
> +	if (dma_buf_is_dynamic(attach->dmabuf))
> +		dma_buf_unpin(attach);
> +
>  err_unlock:
>  	if (dma_buf_is_dynamic(attach->dmabuf))
>  		dma_resv_unlock(attach->dmabuf->resv);
> @@ -744,7 +759,7 @@ EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
>  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  					  struct device *dev)
>  {
> -	return dma_buf_dynamic_attach(dmabuf, dev, false);
> +	return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_attach);
>  
> @@ -767,8 +782,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>  
>  		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>  
> -		if (dma_buf_is_dynamic(attach->dmabuf))
> +		if (dma_buf_is_dynamic(attach->dmabuf)) {
> +			dma_buf_unpin(attach);
>  			dma_resv_unlock(attach->dmabuf->resv);
> +		}
>  	}
>  
>  	dma_resv_lock(dmabuf->resv, NULL);
> @@ -781,6 +798,44 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_detach);
>  
> +/**
> + * dma_buf_pin - Lock down the DMA-buf
> + *
> + * @attach:	[in]	attachment which should be pinned
> + *
> + * Returns:
> + * 0 on success, negative error code on failure.
> + */
> +int dma_buf_pin(struct dma_buf_attachment *attach)
> +{
> +	struct dma_buf *dmabuf = attach->dmabuf;
> +	int ret = 0;
> +
> +	dma_resv_assert_held(dmabuf->resv);
> +
> +	if (dmabuf->ops->pin)
> +		ret = dmabuf->ops->pin(attach);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_pin);
> +
> +/**
> + * dma_buf_unpin - Remove lock from DMA-buf
> + *
> + * @attach:	[in]	attachment which should be unpinned
> + */
> +void dma_buf_unpin(struct dma_buf_attachment *attach)
> +{
> +	struct dma_buf *dmabuf = attach->dmabuf;
> +
> +	dma_resv_assert_held(dmabuf->resv);
> +
> +	if (dmabuf->ops->unpin)
> +		dmabuf->ops->unpin(attach);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_unpin);
> +
>  /**
>   * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
>   * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
> @@ -800,6 +855,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  					enum dma_data_direction direction)
>  {
>  	struct sg_table *sg_table;
> +	int r;
>  
>  	might_sleep();
>  
> @@ -821,13 +877,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>  		return attach->sgt;
>  	}
>  
> -	if (dma_buf_is_dynamic(attach->dmabuf))
> +	if (dma_buf_is_dynamic(attach->dmabuf)) {
>  		dma_resv_assert_held(attach->dmabuf->resv);
> +		if (!attach->importer_ops->move_notify) {

Imo just require ->move_notify for importers that give you an ops
structure. It doesn't really make sense to allow dynamic importers that
don't support ->move_notify.
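
A minimal sketch of what such a requirement could look like at attach time, assuming stand-in types (struct importer_ops and check_importer_ops() are hypothetical, and -EINVAL is only a guess at a suitable error code): a NULL ops pointer keeps the old static behaviour, while an ops structure without ->move_notify is refused.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_buf_attach_ops. */
struct attachment;

struct importer_ops {
	void (*move_notify)(struct attachment *attach);
};

/*
 * Validate the importer ops handed to the attach function.
 * NULL means a static importer; an ops structure must carry
 * a move_notify callback to be accepted as dynamic.
 */
static int check_importer_ops(const struct importer_ops *ops)
{
	if (!ops)
		return 0;
	if (!ops->move_notify)
		return -EINVAL;
	return 0;
}

/* Dummy callback, only here so the sketch can be exercised. */
static void demo_move_notify(struct attachment *attach)
{
	(void)attach;
}
```

With this in place the map/unmap paths would no longer need the per-call "!attach->importer_ops->move_notify" checks, since any attachment with importer_ops is known to support the callback.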

> +			r = dma_buf_pin(attach);
> +			if (r)
> +				return ERR_PTR(r);
> +		}
> +	}
>  
>  	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>  	if (!sg_table)
>  		sg_table = ERR_PTR(-ENOMEM);
>  
> +	if (IS_ERR(sg_table) && dma_buf_is_dynamic(attach->dmabuf) &&
> +	    !attach->importer_ops->move_notify)
> +		dma_buf_unpin(attach);
> +
>  	if (!IS_ERR(sg_table) && attach->dmabuf->ops->cache_sgt_mapping) {
>  		attach->sgt = sg_table;
>  		attach->dir = direction;
> @@ -866,9 +932,33 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>  		dma_resv_assert_held(attach->dmabuf->resv);
>  
>  	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
> +
> +	if (dma_buf_is_dynamic(attach->dmabuf) &&
> +	    !attach->importer_ops->move_notify)
> +		dma_buf_unpin(attach);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>  
> +/**
> + * dma_buf_move_notify - notify attachments that DMA-buf is moving
> + *
> + * @dmabuf:	[in]	buffer which is moving
> + *
> + * Informs all attachmenst that they need to destroy and recreated all their
> + * mappings.
> + */
> +void dma_buf_move_notify(struct dma_buf *dmabuf)
> +{
> +	struct dma_buf_attachment *attach;
> +
> +	dma_resv_assert_held(dmabuf->resv);
> +
> +	list_for_each_entry(attach, &dmabuf->attachments, node)
> +		if (attach->importer_ops && attach->importer_ops->move_notify)
> +			attach->importer_ops->move_notify(attach);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_move_notify);
> +
>  /**
>   * DOC: cpu access
>   *
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index d5bcdfefbad6..8e5a68107556 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -415,6 +415,9 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>  	return ERR_PTR(ret);
>  }
>  
> +static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
> +};
> +
>  /**
>   * amdgpu_gem_prime_import - &drm_driver.gem_prime_import implementation
>   * @dev: DRM device
> @@ -447,7 +450,8 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
>  	if (IS_ERR(obj))
>  		return obj;
>  
> -	attach = dma_buf_dynamic_attach(dma_buf, dev->dev, true);
> +	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
> +					&amdgpu_dma_buf_attach_ops, NULL);
>  	if (IS_ERR(attach)) {
>  		drm_gem_object_put(obj);
>  		return ERR_CAST(attach);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index af73f835c51c..7456bb937635 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -93,14 +93,40 @@ struct dma_buf_ops {
>  	 */
>  	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
>  
> +	/**
> +	 * @pin:
> +	 *
> +	 * This is called by dma_buf_pin and lets the exporter know that the
> +	 * DMA-buf can't be moved any more.

I think we should add a warning here that pinning is only ok for limited
use-cases (like scanout or similar), and not as part of general buffer
management.

i915 uses temporary pins through its execbuf management (and everywhere
else), so we have a _lot_ of people in dri-devel with quite different
ideas of what this might be for :-)

> +	 *
> +	 * This is called with the dmabuf->resv object locked.
> +	 *
> +	 * This callback is optional.
> +	 *
> +	 * Returns:
> +	 *
> +	 * 0 on success, negative error code on failure.
> +	 */
> +	int (*pin)(struct dma_buf_attachment *attach);
> +
> +	/**
> +	 * @unpin:
> +	 *
> +	 * This is called by dma_buf_unpin and lets the exporter know that the
> +	 * DMA-buf can be moved again.
> +	 *
> +	 * This is called with the dmabuf->resv object locked.
> +	 *
> +	 * This callback is optional.
> +	 */
> +	void (*unpin)(struct dma_buf_attachment *attach);
> +
>  	/**
>  	 * @map_dma_buf:
>  	 *
>  	 * This is called by dma_buf_map_attachment() and is used to map a
>  	 * shared &dma_buf into device address space, and it is mandatory. It
> -	 * can only be called if @attach has been called successfully. This
> -	 * essentially pins the DMA buffer into place, and it cannot be moved
> -	 * any more
> +	 * can only be called if @attach has been called successfully.
>  	 *
>  	 * This call may sleep, e.g. when the backing storage first needs to be
>  	 * allocated, or moved to a location suitable for all currently attached
> @@ -141,9 +167,6 @@ struct dma_buf_ops {
>  	 *
>  	 * This is called by dma_buf_unmap_attachment() and should unmap and
>  	 * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
> -	 * It should also unpin the backing storage if this is the last mapping
> -	 * of the DMA buffer, it the exporter supports backing storage
> -	 * migration.

This is still valid for non-dynamic exporters. Imo keep but clarify that.

>  	 */
>  	void (*unmap_dma_buf)(struct dma_buf_attachment *,
>  			      struct sg_table *,
> @@ -336,6 +359,34 @@ struct dma_buf {
>  	} cb_excl, cb_shared;
>  };
>  
> +/**
> + * struct dma_buf_attach_ops - importer operations for an attachment
> + * @move_notify: [optional] notification that the DMA-buf is moving
> + *
> + * Attachment operations implemented by the importer.
> + */
> +struct dma_buf_attach_ops {
> +	/**
> +	 * @move_notify
> +	 *
> +	 * If this callback is provided the framework can avoid pinning the
> +	 * backing store while mappings exists.
> +	 *
> +	 * This callback is called with the lock of the reservation object
> +	 * associated with the dma_buf held and the mapping function must be
> +	 * called with this lock held as well. This makes sure that no mapping
> +	 * is created concurrently with an ongoing move operation.
> +	 *
> +	 * Mappings stay valid and are not directly affected by this callback.
> +	 * But the DMA-buf can now be in a different physical location, so all
> +	 * mappings should be destroyed and re-created as soon as possible.
> +	 *
> +	 * New mappings can be created after this callback returns, and will
> +	 * point to the new location of the DMA-buf.
> +	 */
> +	void (*move_notify)(struct dma_buf_attachment *attach);
> +};
> +
>  /**
>   * struct dma_buf_attachment - holds device-buffer attachment data
>   * @dmabuf: buffer for this attachment.
> @@ -344,8 +395,9 @@ struct dma_buf {
>   * @sgt: cached mapping.
>   * @dir: direction of cached mapping.
>   * @priv: exporter specific attachment data.
> - * @dynamic_mapping: true if dma_buf_map/unmap_attachment() is called with the
> - * dma_resv lock held.
> + * @importer_ops: importer operations for this attachment, if provided
> + * dma_buf_map/unmap_attachment() must be called with the dma_resv lock held.
> + * @importer_priv: importer specific attachment data.
>   *
>   * This structure holds the attachment information between the dma_buf buffer
>   * and its user device(s). The list contains one attachment struct per device
> @@ -362,7 +414,8 @@ struct dma_buf_attachment {
>  	struct list_head node;
>  	struct sg_table *sgt;
>  	enum dma_data_direction dir;
> -	bool dynamic_mapping;
> +	const struct dma_buf_attach_ops *importer_ops;
> +	void *importer_priv;
>  	void *priv;
>  };
>  
> @@ -438,16 +491,19 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>  static inline bool
>  dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>  {
> -	return attach->dynamic_mapping;
> +	return !!attach->importer_ops;

Hm why not do the same for exporters, and make them dynamic iff they have
pin/unpin?
-Daniel

>  }
>  
>  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  					  struct device *dev);
>  struct dma_buf_attachment *
>  dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> -		       bool dynamic_mapping);
> +		       const struct dma_buf_attach_ops *importer_ops,
> +		       void *importer_priv);
>  void dma_buf_detach(struct dma_buf *dmabuf,
>  		    struct dma_buf_attachment *attach);
> +int dma_buf_pin(struct dma_buf_attachment *attach);
> +void dma_buf_unpin(struct dma_buf_attachment *attach);
>  
>  struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>  
> -- 
> 2.17.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-10-29 10:40 ` [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify Christian König
@ 2019-11-05 10:52   ` Daniel Vetter
  2019-11-05 13:39     ` Christian König
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2019-11-05 10:52 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Tue, Oct 29, 2019 at 11:40:49AM +0100, Christian König wrote:
> Implement the importer side of unpinned DMA-buf handling.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
>  2 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 3629cfe53aad..af39553c51ad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>  	return ERR_PTR(ret);
>  }
>  
> +/**
> + * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
> + *
> + * @attach: the DMA-buf attachment
> + *
> + * Invalidate the DMA-buf attachment, making sure that the we re-create the
> + * mapping before the next use.
> + */
> +static void
> +amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
> +{
> +	struct ttm_operation_ctx ctx = { false, false };
> +	struct drm_gem_object *obj = attach->importer_priv;
> +	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> +	struct ttm_placement placement = {};
> +	int r;
> +
> +	if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
> +		return;
> +
> +	r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
> +	if (r)
> +		DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);

Where do you update pagetables?

The only thing I've found is in the amdgpu CS code, which is way too late
for this stuff here. Plus TTM doesn't handle virtual memory at all (aside
from the gart tt), so clearly you need to call into amdgpu code somewhere
for this. But I didn't find it, neither in your ->move_notify nor the
->move callback in ttm_bo_driver.

How does this work?
-Daniel

> +}
> +
>  static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
> +	.move_notify = amdgpu_dma_buf_move_notify
>  };
>  
>  /**
> @@ -492,7 +518,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
>  		return obj;
>  
>  	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
> -					&amdgpu_dma_buf_attach_ops, NULL);
> +					&amdgpu_dma_buf_attach_ops, obj);
>  	if (IS_ERR(attach)) {
>  		drm_gem_object_put(obj);
>  		return ERR_CAST(attach);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index ac776d2620eb..cfa46341c9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -861,6 +861,9 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
>  		return 0;
>  	}
>  
> +	if (bo->tbo.base.import_attach)
> +		dma_buf_pin(bo->tbo.base.import_attach);
> +
>  	bo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>  	/* force to pin into visible video ram */
>  	if (!(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS))
> @@ -944,6 +947,9 @@ int amdgpu_bo_unpin(struct amdgpu_bo *bo)
>  
>  	amdgpu_bo_subtract_pin_size(bo);
>  
> +	if (bo->tbo.base.import_attach)
> +		dma_buf_unpin(bo->tbo.base.import_attach);
> +
>  	for (i = 0; i < bo->placement.num_placement; i++) {
>  		bo->placements[i].lpfn = 0;
>  		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
> -- 
> 2.17.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-11-05 10:52   ` Daniel Vetter
@ 2019-11-05 13:39     ` Christian König
  2019-11-05 13:50       ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2019-11-05 13:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 05.11.19 um 11:52 schrieb Daniel Vetter:
> On Tue, Oct 29, 2019 at 11:40:49AM +0100, Christian König wrote:
>> Implement the importer side of unpinned DMA-buf handling.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
>>   2 files changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> index 3629cfe53aad..af39553c51ad 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> @@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>>   	return ERR_PTR(ret);
>>   }
>>   
>> +/**
>> + * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
>> + *
>> + * @attach: the DMA-buf attachment
>> + *
>> + * Invalidate the DMA-buf attachment, making sure that the we re-create the
>> + * mapping before the next use.
>> + */
>> +static void
>> +amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
>> +{
>> +	struct ttm_operation_ctx ctx = { false, false };
>> +	struct drm_gem_object *obj = attach->importer_priv;
>> +	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>> +	struct ttm_placement placement = {};
>> +	int r;
>> +
>> +	if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
>> +		return;
>> +
>> +	r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
>> +	if (r)
>> +		DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
> Where do you update pagetables?
>
> The only thing I've found is in the amdgpu CS code, which is way too late
> for this stuff here. Plus TTM doesn't handle virtual memory at all (aside
> from the gart tt), so clearly you need to call into amdgpu code somewhere
> for this. But I didn't find it, neither in your ->move_notify nor the
> ->move callback in ttm_bo_driver.
>
> How does this work?

Page tables are not updated until the next command submission, e.g. in
amdgpu_cs.c.

This is safe since all previous command submissions are added to the
dma_resv object as fences and the dma_buf can't be moved before those
are signaled.
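
The fence rule can be modelled roughly as follows. This is a sketch with hypothetical stand-in types (struct fence, struct resv, resv_add_fence() and resv_idle() are not the kernel's dma_fence/dma_resv API): each command submission adds a fence, and the buffer only counts as movable once every fence has signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FENCES 8

/* Hypothetical stand-ins for dma_fence and dma_resv; not the kernel types. */
struct fence {
	bool signaled;
};

struct resv {
	const struct fence *fences[MAX_FENCES];
	size_t count;
};

/* Record a command submission that uses the buffer. */
static bool resv_add_fence(struct resv *resv, const struct fence *fence)
{
	if (resv->count >= MAX_FENCES)
		return false;
	resv->fences[resv->count++] = fence;
	return true;
}

/* The buffer may only be moved once every recorded fence has signaled. */
static bool resv_idle(const struct resv *resv)
{
	for (size_t i = 0; i < resv->count; i++)
		if (!resv->fences[i]->signaled)
			return false;
	return true;
}
```

Note that this only covers submissions that actually declared the buffer and therefore added a fence for it.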

Christian.

> -Daniel
>
>> +}
>> +
>>   static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
>> +	.move_notify = amdgpu_dma_buf_move_notify
>>   };
>>   
>>   /**
>> @@ -492,7 +518,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
>>   		return obj;
>>   
>>   	attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
>> -					&amdgpu_dma_buf_attach_ops, NULL);
>> +					&amdgpu_dma_buf_attach_ops, obj);
>>   	if (IS_ERR(attach)) {
>>   		drm_gem_object_put(obj);
>>   		return ERR_CAST(attach);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index ac776d2620eb..cfa46341c9a7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -861,6 +861,9 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
>>   		return 0;
>>   	}
>>   
>> +	if (bo->tbo.base.import_attach)
>> +		dma_buf_pin(bo->tbo.base.import_attach);
>> +
>>   	bo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>   	/* force to pin into visible video ram */
>>   	if (!(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS))
>> @@ -944,6 +947,9 @@ int amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>   
>>   	amdgpu_bo_subtract_pin_size(bo);
>>   
>> +	if (bo->tbo.base.import_attach)
>> +		dma_buf_unpin(bo->tbo.base.import_attach);
>> +
>>   	for (i = 0; i < bo->placement.num_placement; i++) {
>>   		bo->placements[i].lpfn = 0;
>>   		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
>> -- 
>> 2.17.1
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: RFC: Unpinned DMA-buf handling
  2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
                   ` (4 preceding siblings ...)
  2019-10-29 10:40 ` [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify Christian König
@ 2019-11-05 13:46 ` Daniel Vetter
  5 siblings, 0 replies; 16+ messages in thread
From: Daniel Vetter @ 2019-11-05 13:46 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

On Tue, Oct 29, 2019 at 11:40:44AM +0100, Christian König wrote:
> The basic idea stayed the same since the last version of those patches.
> The exporter can provide explicit pin/unpin functions and the importer a
> move_notify callback. This allows us to avoid pinning buffers while
> importers have a mapping for them.
> 
> In difference to the last version the locking changes were separated
> from this patchset and committed to drm-misc-next.
> 
> This allows drivers to implement the new locking semantics without the
> extra unpinned handling, but of course the changed locking semantics is
> still a prerequisite to the unpinned handling.
> 
> The last time this set was send out the discussion ended by questioning
> if the move_notify callback was really the right approach of notifying
> the importers that a buffer is about to change its placement. A possible
> alternative would be to add a special crafted fence object instead.
> 
> Let's discuss on the different approaches once more,

So here's my pile of higher-level thoughts on things still to discuss. I
don't think we need a code-answer for all of these, but at least a rough
idea to make sure we're not walling ourselves into a corner.

- The entire eviction fence stuff amdkfd does. It is kinda a very special
  version of ->move_notify, except it's also passing around an active bit
  for an entire set of buffers in an efficient way. This active bit works
  for amdkfd where we evict the entire context (logically at least, ofc
  all the unevicted buffers and their pagetables stay). I don't think
  it'll work for a more traditional execbuf driver.

  I think we need some way to move lru/active information between drivers
  that works. Including making sure that drivers don't spend all the time
  walking over all the active buffers in their lru first, but also not
  burning down too much cpu time. So either lazy lru updates, or some bulk
  move thing, or something else. Or alternatively we spec out explicitly
  that lru updates will _not_ happen across drivers, and that drivers need
  to lru-bump buffers while evicting when they notice they're still busy
  (so some kind of lazy lru update).

- How will we handle the acquire ctx? Sooner or later I expect that when
  an importer calls into the exporter to validate the buffers we need to
  have something like what you added as a stall point in ttm in

  commit d367bd2a5e2b12cb9135b30df94af8211196e8cf
  Author: Christian König <christian.koenig@amd.com>
  Date:   Wed May 22 09:51:47 2019 +0200

      drm/ttm: fix busy memory to fail other user v10

  Now we can do the same trick you've done of fishing out the acquire ctx
  from the buffer we're trying to get validated, instead of an explicit
  parameter to dma_buf_map_attachment. But the other change is that
  callers need to be able to handle EDEADLK, and that's a huge one. I'm
  leaning towards requiring EDEADLK handling for dynamic importers from
  the get-go, using the fake deadlock injection debug knob to enforce it.
  Explicit argument would be nice, but oh well.

- Related, we need to have an idea for how we should handle the TODO
  comment in ttm_mem_evict_wait_busy across drivers. Other drivers might
  hang onto a lot of buffers they don't really need, simply because they
  evicted them and kept them locked (i915 very likely will do that).

  This is one of the questions I don't think we need to solve right away,
  but good to have a solution in mind. I think a dma_resv->can_evict flag,
  which allows the lru evict code to throw out locked buffers (only locked
  by our own ctx ofc) would solve this. But not 100% sure. Also making it
  can_evict would make it opt-in as an optimization.

- ->move_notify needs to guarantee that all access stops, or we have a
  huge leak between security domains. I think there are three ways to do
  that:

  - Preempt the entire context right away. This is what amdkfd does
    (except with the eviction fence, not the move_notify callback). Then
    when you reschedule the context make sure all the pagetables are up to
    date again.

  - Synchronously punch out the pagetables in ->move_notify, and let
    gpu-side page faulting handle the fallout. Not sure anyone is doing
    that right now, but we at least discussed that as an idea here at
    intel.

  - Add an async pagetable update job, which has the current latest job as
    a dependency, and adds a new fence to the dma_resv object to signal
    when the pagetables are updated. This would all be scheduled from the
    ->move_notify callback, so we would need to make sure this is officially
    allowed.

  If we don't have any of these then some later batch (which didn't
  declare it's going to need the buffer we've evicted) could access
  whatever new buffer has been placed at the same locations through the
  old pagetables.

  I did try to figure out how you solve this in amdgpu right now, but for
  normal CS ioctl the only pagetable update code I could find was in the
  cs ioctl itself. That's too late I think.

- Related to the above, and to the ttm hack of privatizing the
  resv_object to avoid unnecessary stalls: I think a last_access fence on
  the attachment would be really nice. That way a driver could make sure
  it's only blocking on its own stuff and not on stuff another gpu is
  doing. But not sure it makes sense to have that in dma-buf structures,
  drivers can just track this on their own (and ttm would need a big
  overhaul since right now it totally ignores that there might be multiple
  mappings of the same underlying buffer object, that's all left to
  drivers to wrangle).

- Finally, we need to spell out the semantics of when you need to call
  dma_buf_unmap_attachment after a move_notify. I'm kinda leaning towards
  that you first need to unmap the old mapping, before you create a new
  one. But that might be too tough to implement for drivers, and result in
  stalls. So probably we need multiple mappings, and then we need to make
  it clear whether you can do that on the same attachment, or whether you
  need to do something else. Either way this must be really really clear.
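
For the EDEADLK item in the list above, the handling a dynamic importer would need is the usual back-off-and-retry pattern: drop every lock taken so far and start over. A minimal sketch with made-up stand-ins (struct fake_ctx, fake_lock_and_map() and fake_backoff() are hypothetical and only simulate the behaviour, they are not kernel interfaces):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical importer state; nothing here is a real kernel API. */
struct fake_ctx {
	int deadlocks_left;	/* how often mapping will report -EDEADLK */
	int locks_held;		/* locks currently held by this context */
	int backoffs;		/* number of times we had to back off */
};

/* Simulates locking the buffer and mapping the attachment. */
static int fake_lock_and_map(struct fake_ctx *ctx)
{
	ctx->locks_held++;
	if (ctx->deadlocks_left > 0) {
		ctx->deadlocks_left--;
		return -EDEADLK;
	}
	return 0;
}

/* Simulates dropping all locks held under the acquire context. */
static void fake_backoff(struct fake_ctx *ctx)
{
	ctx->locks_held = 0;
	ctx->backoffs++;
}

/* Retry until mapping succeeds or fails with something other than -EDEADLK. */
static int map_with_retry(struct fake_ctx *ctx)
{
	int ret;

	for (;;) {
		ret = fake_lock_and_map(ctx);
		if (ret != -EDEADLK)
			return ret;
		fake_backoff(ctx);
	}
}
```

A deadlock injection knob would play the role of deadlocks_left here, forcing the back-off path to run even when no real contention exists.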

I think the above is all the big questions from the past few discussion
rounds that we still have.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-11-05 13:39     ` Christian König
@ 2019-11-05 13:50       ` Daniel Vetter
  2019-11-05 15:20         ` Koenig, Christian
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2019-11-05 13:50 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, Sumit Semwal,
	moderated list:DMA BUFFER SHARING FRAMEWORK,
	open list:DMA BUFFER SHARING FRAMEWORK, intel-gfx

On Tue, Nov 5, 2019 at 2:39 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 05.11.19 um 11:52 schrieb Daniel Vetter:
> > On Tue, Oct 29, 2019 at 11:40:49AM +0100, Christian König wrote:
> >> Implement the importer side of unpinned DMA-buf handling.
> >>
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
> >>   2 files changed, 33 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >> index 3629cfe53aad..af39553c51ad 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >> @@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
> >>      return ERR_PTR(ret);
> >>   }
> >>
> >> +/**
> >> + * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
> >> + *
> >> + * @attach: the DMA-buf attachment
> >> + *
> >> + * Invalidate the DMA-buf attachment, making sure that the we re-create the
> >> + * mapping before the next use.
> >> + */
> >> +static void
> >> +amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
> >> +{
> >> +    struct ttm_operation_ctx ctx = { false, false };
> >> +    struct drm_gem_object *obj = attach->importer_priv;
> >> +    struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> >> +    struct ttm_placement placement = {};
> >> +    int r;
> >> +
> >> +    if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
> >> +            return;
> >> +
> >> +    r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
> >> +    if (r)
> >> +            DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
> > Where do you update pagetables?
> >
> > The only thing I've found is in the amdgpu CS code, which is way too late
> > for this stuff here. Plus TTM doesn't handle virtual memory at all (aside
> > from the gart tt), so clearly you need to call into amdgpu code somewhere
> > for this. But I didn't find it, neither in your ->move_notify nor the
> > ->move callback in ttm_bo_driver.
> >
> > How does this work?
>
> Page tables are not updated until the next command submission, e.g. in
> amdgpu_cs.c
>
> This is safe since all previous command submissions are added to the
> dma_resv object as fences and the dma_buf can't be moved before those
> are signaled.

Hm, I thought you still allow explicit buffer lists for each cs in
amdgpu? Code looks at least like that, not everything goes through the
context working set stuff.

How do you prevent the security leak if userspace simply lies about
not using a given buffer in a batch, and then abuses that to read
that virtual address range anyway and peek at whatever is there
after an eviction has happened?
-Daniel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-11-05 13:50       ` Daniel Vetter
@ 2019-11-05 15:20         ` Koenig, Christian
  2019-11-05 15:23           ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Koenig, Christian @ 2019-11-05 15:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, Sumit Semwal,
	moderated list:DMA BUFFER SHARING FRAMEWORK,
	open list:DMA BUFFER SHARING FRAMEWORK, intel-gfx

Am 05.11.19 um 14:50 schrieb Daniel Vetter:
> On Tue, Nov 5, 2019 at 2:39 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 05.11.19 um 11:52 schrieb Daniel Vetter:
>>> On Tue, Oct 29, 2019 at 11:40:49AM +0100, Christian König wrote:
>>>> Implement the importer side of unpinned DMA-buf handling.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
>>>>    2 files changed, 33 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>>>> index 3629cfe53aad..af39553c51ad 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>>>> @@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
>>>>       return ERR_PTR(ret);
>>>>    }
>>>>
>>>> +/**
>>>> + * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
>>>> + *
>>>> + * @attach: the DMA-buf attachment
>>>> + *
>>>> + * Invalidate the DMA-buf attachment, making sure that the we re-create the
>>>> + * mapping before the next use.
>>>> + */
>>>> +static void
>>>> +amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
>>>> +{
>>>> +    struct ttm_operation_ctx ctx = { false, false };
>>>> +    struct drm_gem_object *obj = attach->importer_priv;
>>>> +    struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>>>> +    struct ttm_placement placement = {};
>>>> +    int r;
>>>> +
>>>> +    if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
>>>> +            return;
>>>> +
>>>> +    r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
>>>> +    if (r)
>>>> +            DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
>>> Where do you update pagetables?
>>>
>>> The only thing I've found is in the amdgpu CS code, which is way too late
>>> for this stuff here. Plus TTM doesn't handle virtual memory at all (aside
>>> from the gart tt), so clearly you need to call into amdgpu code somewhere
>>> for this. But I didn't find it, neither in your ->move_notify nor the
>>> ->move callback in ttm_bo_driver.
>>>
>>> How does this work?
>> Page tables are not updated until the next command submission, e.g. in
>> amdgpu_cs.c
>>
>> This is safe since all previous command submissions are added to the
>> dma_resv object as fences and the dma_buf can't be moved before those
>> are signaled.
> Hm, I thought you still allow explicit buffer lists for each cs in
> amdgpu? Code looks at least like that, not everything goes through the
> context working set stuff.
>
> How do you prevent the security leak if userspace simply lies about
> not using a given buffer in a batch, and then abuses that to read
> that virtual address range anyway and peek at whatever ends up
> there after an eviction?

Oh, yeah that is a really good point. And no, that isn't handled
correctly at all.

I have wanted to rework that for quite some time now, but always ran
into issues with TTM.

Thanks for pointing that out, so I need to put my TTM rework before this.
Crap, that adds a whole bunch of TODOs to my list.
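
To make the problem concrete, here is a toy user-space model of it. This is
not amdgpu code: every name (struct toy_bo, struct toy_vm, toy_evict(),
toy_submit()) is made up for illustration, the "GPU" is just an array lookup,
and fences are left out entirely. It only assumes what is described above:
the page table entry is refreshed at command submission, and only for
buffers that userspace actually lists.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FAULT (-1)	/* returned when the GPU hits an unmapped address */

struct toy_bo {
	int page;	/* index into VRAM, or TOY_FAULT while evicted */
};

struct toy_vm {
	int pte;	/* VRAM page mapped at the single toy virtual address */
};

/* Evict the BO and hand its old VRAM page to another client. */
static void toy_evict(struct toy_bo *bo, int *vram, int new_owner_data,
		      struct toy_vm *vm, bool clear_pte_now)
{
	assert(bo->page != TOY_FAULT);	/* only resident BOs can be evicted */
	vram[bo->page] = new_owner_data;
	bo->page = TOY_FAULT;
	if (clear_pte_now)
		vm->pte = TOY_FAULT;	/* eager invalidation on move */
}

/*
 * Command submission: the PTE is only refreshed when userspace lists the BO.
 * Returns what a shader reading the toy virtual address would see.
 */
static int toy_submit(struct toy_vm *vm, const struct toy_bo *bo,
		      const int *vram, bool bo_listed)
{
	if (bo_listed)
		vm->pte = bo->page;
	return vm->pte == TOY_FAULT ? TOY_FAULT : vram[vm->pte];
}
```

With clear_pte_now == false, a submission that omits the BO still reads
through the stale entry and sees the new owner's data. With
clear_pte_now == true the same access faults instead. A real driver would
also move a listed BO back in; the toy simply picks up its current location.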

Regards,
Christian.




* Re: [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify
  2019-11-05 15:20         ` Koenig, Christian
@ 2019-11-05 15:23           ` Daniel Vetter
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Vetter @ 2019-11-05 15:23 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: dri-devel, Sumit Semwal,
	moderated list:DMA BUFFER SHARING FRAMEWORK,
	open list:DMA BUFFER SHARING FRAMEWORK, intel-gfx

On Tue, Nov 5, 2019 at 4:20 PM Koenig, Christian
<Christian.Koenig@amd.com> wrote:
>
> Am 05.11.19 um 14:50 schrieb Daniel Vetter:
> > On Tue, Nov 5, 2019 at 2:39 PM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Am 05.11.19 um 11:52 schrieb Daniel Vetter:
> >>> On Tue, Oct 29, 2019 at 11:40:49AM +0100, Christian König wrote:
> >>>> Implement the importer side of unpinned DMA-buf handling.
> >>>>
> >>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 28 ++++++++++++++++++++-
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  6 +++++
> >>>>    2 files changed, 33 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >>>> index 3629cfe53aad..af39553c51ad 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> >>>> @@ -456,7 +456,33 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
> >>>>       return ERR_PTR(ret);
> >>>>    }
> >>>>
> >>>> +/**
> >>>> + * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
> >>>> + *
> >>>> + * @attach: the DMA-buf attachment
> >>>> + *
> >>>> + * Invalidate the DMA-buf attachment, making sure that the we re-create the
> >>>> + * mapping before the next use.
> >>>> + */
> >>>> +static void
> >>>> +amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
> >>>> +{
> >>>> +    struct ttm_operation_ctx ctx = { false, false };
> >>>> +    struct drm_gem_object *obj = attach->importer_priv;
> >>>> +    struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> >>>> +    struct ttm_placement placement = {};
> >>>> +    int r;
> >>>> +
> >>>> +    if (bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
> >>>> +            return;
> >>>> +
> >>>> +    r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
> >>>> +    if (r)
> >>>> +            DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
> >>> Where do you update pagetables?
> >>>
> >>> The only thing I've found is in the amdgpu CS code, which is way too late
> >>> for this stuff here. Plus TTM doesn't handle virtual memory at all (aside
> >>> from the gart tt), so clearly you need to call into amdgpu code somewhere
> >>> for this. But I didn't find it, neither in your ->move_notify nor the
> >>> ->move callback in ttm_bo_driver.
> >>>
> >>> How does this work?
> >> Page tables are not updated until the next command submission, e.g. in
> >> amdgpu_cs.c
> >>
> >> This is safe since all previous command submissions are added to the
> >> dma_resv object as fences and the dma_buf can't be moved before those
> >> are signaled.
> > Hm, I thought you still allow explicit buffer lists for each cs in
> > amdgpu? Code looks at least like that, not everything goes through the
> > context working set stuff.
> >
> > How do you prevent the security leak if userspace simply lies about
> > not using a given buffer in a batch, and then abuses that to read
> > that virtual address range anyway and peek at whatever ends up
> > there after an eviction?
>
> Oh, yeah that is a really good point. And no, that isn't handled
> correctly at all.
>
> I have wanted to rework that for quite some time now, but always ran
> into issues with TTM.
>
> Thanks for pointing that out, so I need to put my TTM rework before this.
> Crap, that adds a whole bunch of TODOs to my list.

Ok, I think that also clears up some confusion we had around
move_notify semantics, where I wanted to add more fences from within
the callback (for the pipelined gpu pagetable clearing) and you didn't
really see the point. I dumped my overall thoughts on all the open
issues in my reply to the cover letter, but yeah, it's probably best if
you wire up the pt clearing in amdgpu first. Then we can more easily
see what we'll need to funnel through dma-buf.

Cheers, Daniel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14
  2019-11-05 10:20   ` Daniel Vetter
@ 2020-02-18 13:20     ` Christian König
  2020-02-18 14:14       ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2020-02-18 13:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, sumit.semwal, linaro-mm-sig, linux-media, intel-gfx

Am 05.11.19 um 11:20 schrieb Daniel Vetter:
> On Tue, Oct 29, 2019 at 11:40:45AM +0100, Christian König wrote:
> [SNIP]
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index d377b4ca66bf..ce293cee76ed 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -529,6 +529,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>>   		    exp_info->ops->dynamic_mapping))
>>   		return ERR_PTR(-EINVAL);
>>   
>> +	if (WARN_ON(!exp_info->ops->dynamic_mapping &&
>> +		    (exp_info->ops->pin || exp_info->ops->unpin)))
>> +		return ERR_PTR(-EINVAL);
> Imo make this stronger, have a dynamic mapping iff there's both a pin and
> unpin function. Otherwise this doesn't make a lot of sense to me.

I want to avoid that for the initial implementation. So far dynamic only 
meant that we have the new locking semantics.

We could make that mandatory after this patch set when amdgpu is 
migrated and has implemented the necessary callbacks.
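
For comparison, the two variants of the export-time check discussed here
could look roughly as follows. This is a stand-alone sketch, not kernel code:
struct sketch_ops and both helper functions are hypothetical stand-ins that
only model the dynamic_mapping, pin and unpin members of struct dma_buf_ops
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant members of struct dma_buf_ops. */
struct sketch_ops {
	bool dynamic_mapping;
	int (*pin)(void *attach);
	void (*unpin)(void *attach);
};

/* Check as in the quoted hunk: pin/unpin only allowed for dynamic exporters. */
static int sketch_check_relaxed(const struct sketch_ops *ops)
{
	if (!ops->dynamic_mapping && (ops->pin || ops->unpin))
		return -EINVAL;
	return 0;
}

/* Stricter variant from the review: dynamic iff both pin and unpin are set. */
static int sketch_check_strict(const struct sketch_ops *ops)
{
	bool has_both = ops->pin && ops->unpin;
	bool has_any = ops->pin || ops->unpin;

	if (has_any != has_both)	/* only one of the two callbacks */
		return -EINVAL;
	if (ops->dynamic_mapping != has_both)
		return -EINVAL;
	return 0;
}

/* Dummy callbacks so the checks can be exercised. */
static int sketch_pin(void *attach) { (void)attach; return 0; }
static void sketch_unpin(void *attach) { (void)attach; }
```

The relaxed check still accepts an exporter that has only adopted the new
locking (dynamic_mapping set, no callbacks), which is the intermediate state
needed while amdgpu is being migrated. The strict check rejects it.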

>> [SNIP]
>> @@ -821,13 +877,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>>   		return attach->sgt;
>>   	}
>>   
>> -	if (dma_buf_is_dynamic(attach->dmabuf))
>> +	if (dma_buf_is_dynamic(attach->dmabuf)) {
>>   		dma_resv_assert_held(attach->dmabuf->resv);
>> +		if (!attach->importer_ops->move_notify) {
> Imo just require ->move_notify for importers that give you an ops
> function. Doesn't really make sense to allow dynamic without support
> ->move_notify.

Same thing here. We could make that mandatory and clean it up after 
migrating amdgpu.
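
The importer-side rule could be sketched the same way. Again this is a
stand-alone model with hypothetical names (struct sketch_attach_ops,
sketch_importer_ops_valid()), assuming only what the hunks show: a dynamic
importer passes an ops structure, and move_notify may currently be NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_attachment;

/* Hypothetical stand-in for struct dma_buf_attach_ops. */
struct sketch_attach_ops {
	void (*move_notify)(struct sketch_attachment *attach);
};

struct sketch_attachment {
	const struct sketch_attach_ops *importer_ops;
};

/* Mirrors dma_buf_attachment_is_dynamic() in the patch: ops given => dynamic. */
static bool sketch_attachment_is_dynamic(const struct sketch_attachment *attach)
{
	return !!attach->importer_ops;
}

/* Rule proposed in review: an importer passing ops must implement move_notify. */
static bool sketch_importer_ops_valid(const struct sketch_attach_ops *ops)
{
	return !ops || ops->move_notify;
}

/* Dummy callback so the rule can be exercised. */
static void sketch_move_notify(struct sketch_attachment *attach) { (void)attach; }
```

Under that rule an attach call would reject an ops structure without
move_notify up front, so the NULL check in dma_buf_map_attachment() would no
longer be needed.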

>> [SNIP]
>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>> index af73f835c51c..7456bb937635 100644
>> --- a/include/linux/dma-buf.h
>> +++ b/include/linux/dma-buf.h
>> @@ -93,14 +93,40 @@ struct dma_buf_ops {
>>   	 */
>>   	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
>>   
>> +	/**
>> +	 * @pin:
>> +	 *
>> +	 * This is called by dma_buf_pin and lets the exporter know that the
>> +	 * DMA-buf can't be moved any more.
> I think we should add a warning here that pinning is only ok for limited
> use-cases (like scanout or similar), and not as part of general buffer
> management.
>
> i915 uses temporary pins through it's execbuf management (and everywhere
> else), so we have a _lot_ of people in dri-devel with quite different
> ideas of what this might be for :-)

Yeah, that is also a good idea for us. Wrote a one liner, but you might 
want to double check the wording.

>> [SNIP]
>> @@ -141,9 +167,6 @@ struct dma_buf_ops {
>>   	 *
>>   	 * This is called by dma_buf_unmap_attachment() and should unmap and
>>   	 * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
>> -	 * It should also unpin the backing storage if this is the last mapping
>> -	 * of the DMA buffer, it the exporter supports backing storage
>> -	 * migration.
> This is still valid for non-dynamic exporters. Imo keep but clarify that.

OK, changed.

>> [SNIP]
>> @@ -438,16 +491,19 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
>>   static inline bool
>>   dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
>>   {
>> -	return attach->dynamic_mapping;
>> +	return !!attach->importer_ops;
> Hm why not do the same for exporters, and make them dynamic iff they have
> pin/unpin?

Same thing as before, to migrate amdgpu to the new interface first and 
then make it mandatory.

I think I will just write a cleanup patch into the series which comes 
after the amdgpu changes.

Thanks,
Christian.


* Re: [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14
  2020-02-18 13:20     ` Christian König
@ 2020-02-18 14:14       ` Daniel Vetter
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Vetter @ 2020-02-18 14:14 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, Sumit Semwal,
	moderated list:DMA BUFFER SHARING FRAMEWORK,
	open list:DMA BUFFER SHARING FRAMEWORK, intel-gfx

On Tue, Feb 18, 2020 at 2:20 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 05.11.19 um 11:20 schrieb Daniel Vetter:
> > On Tue, Oct 29, 2019 at 11:40:45AM +0100, Christian König wrote:
> > [SNIP]
> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >> index d377b4ca66bf..ce293cee76ed 100644
> >> --- a/drivers/dma-buf/dma-buf.c
> >> +++ b/drivers/dma-buf/dma-buf.c
> >> @@ -529,6 +529,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
> >>                  exp_info->ops->dynamic_mapping))
> >>              return ERR_PTR(-EINVAL);
> >>
> >> +    if (WARN_ON(!exp_info->ops->dynamic_mapping &&
> >> +                (exp_info->ops->pin || exp_info->ops->unpin)))
> >> +            return ERR_PTR(-EINVAL);
> > Imo make this stronger, have a dynamic mapping iff there's both a pin and
> > unpin function. Otherwise this doesn't make a lot of sense to me.
>
> I want to avoid that for the initial implementation. So far dynamic only
> meant that we have the new locking semantics.
>
> We could make that mandatory after this patch set when amdgpu is
> migrated and has implemented the necessary callbacks.

Ok, if we go with CONFIG_EXPERIMENTAL_DYN_DMABUF or whatever it's going
to be called, I'm totally ok if we just note this somewhere as a FIXME
(maybe just inline in a code comment next to the main #ifdef in
dma-buf.h). Same for all your other comments below.

Cheers, Daniel

>
> >> [SNIP]
> >> @@ -821,13 +877,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>              return attach->sgt;
> >>      }
> >>
> >> -    if (dma_buf_is_dynamic(attach->dmabuf))
> >> +    if (dma_buf_is_dynamic(attach->dmabuf)) {
> >>              dma_resv_assert_held(attach->dmabuf->resv);
> >> +            if (!attach->importer_ops->move_notify) {
> > Imo just require ->move_notify for importers that give you an ops
> > function. Doesn't really make sense to allow dynamic without support
> > ->move_notify.
>
> Same thing here. We could make that mandatory and clean it up after
> migrating amdgpu.
>
> >> [SNIP]
> >> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >> index af73f835c51c..7456bb937635 100644
> >> --- a/include/linux/dma-buf.h
> >> +++ b/include/linux/dma-buf.h
> >> @@ -93,14 +93,40 @@ struct dma_buf_ops {
> >>       */
> >>      void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
> >>
> >> +    /**
> >> +     * @pin:
> >> +     *
> >> +     * This is called by dma_buf_pin and lets the exporter know that the
> >> +     * DMA-buf can't be moved any more.
> > I think we should add a warning here that pinning is only ok for limited
> > use-cases (like scanout or similar), and not as part of general buffer
> > management.
> >
> > i915 uses temporary pins through it's execbuf management (and everywhere
> > else), so we have a _lot_ of people in dri-devel with quite different
> > ideas of what this might be for :-)
>
> Yeah, that is also a good idea for us. Wrote a one liner, but you might
> want to double check the wording.
>
> >> [SNIP]
> >> @@ -141,9 +167,6 @@ struct dma_buf_ops {
> >>       *
> >>       * This is called by dma_buf_unmap_attachment() and should unmap and
> >>       * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
> >> -     * It should also unpin the backing storage if this is the last mapping
> >> -     * of the DMA buffer, it the exporter supports backing storage
> >> -     * migration.
> > This is still valid for non-dynamic exporters. Imo keep but clarify that.
>
> OK, changed.
>
> >> [SNIP]
> >> @@ -438,16 +491,19 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> >>   static inline bool
> >>   dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> >>   {
> >> -    return attach->dynamic_mapping;
> >> +    return !!attach->importer_ops;
> > Hm why not do the same for exporters, and make them dynamic iff they have
> > pin/unpin?
>
> Same thing as before, to migrate amdgpu to the new interface first and
> then make it mandatory.
>
> I think I will just write a cleanup patch into the series which comes
> after the amdgpu changes.
>
> Thanks,
> Christian.



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* RFC: Unpinned DMA-buf handling
@ 2020-02-17 15:45 Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2020-02-17 15:45 UTC (permalink / raw)
  To: dri-devel, linaro-mm-sig, linux-media, intel-gfx, daniel

Hi everyone,

this is hopefully the last iteration of these patches.

For now I've addressed the issue of unmapping imported BOs from the amdgpu page tables immediately by locking the page tables in place.

For HMM handling we are getting the ability to invalidate BOs without locking the VM anyway, so this last TODO will probably go away rather soon.

Please comment,
Christian.




Thread overview: 16+ messages
2019-10-29 10:40 RFC: Unpinned DMA-buf handling Christian König
2019-10-29 10:40 ` [PATCH 1/5] dma-buf: add dynamic DMA-buf handling v14 Christian König
2019-11-05 10:20   ` Daniel Vetter
2020-02-18 13:20     ` Christian König
2020-02-18 14:14       ` Daniel Vetter
2019-10-29 10:40 ` [PATCH 2/5] drm/ttm: remove the backing store if no placement is given Christian König
2019-10-29 10:40 ` [PATCH 3/5] drm/amdgpu: use allowed_domains for exported DMA-bufs Christian König
2019-10-29 10:40 ` [PATCH 4/5] drm/amdgpu: add amdgpu_dma_buf_pin/unpin Christian König
2019-10-29 10:40 ` [PATCH 5/5] drm/amdgpu: implement amdgpu_gem_prime_move_notify Christian König
2019-11-05 10:52   ` Daniel Vetter
2019-11-05 13:39     ` Christian König
2019-11-05 13:50       ` Daniel Vetter
2019-11-05 15:20         ` Koenig, Christian
2019-11-05 15:23           ` Daniel Vetter
2019-11-05 13:46 ` RFC: Unpinned DMA-buf handling Daniel Vetter
2020-02-17 15:45 Christian König
