* [PATCH 0/3] [RFC] kernel cpu access support for dma_buf
@ 2012-03-01 15:35 Daniel Vetter
  2012-03-01 15:35 ` [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls Daniel Vetter
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-03-01 15:35 UTC (permalink / raw)
  To: linaro-mm-sig, LKML, DRI Development, linux-media; +Cc: Daniel Vetter

Hi all,

This series implements an interface to enable cpu access from the kernel
context to dma_buf objects. The main design goal of this interface proposal
is to support buffer objects that reside in highmem.

Comments, flames, ideas and questions highly welcome. Although I might be a bit
slow in responding - I'm at conferences and on vacation for the next two weeks.

Cheers, Daniel

Daniel Vetter (3):
  dma-buf: don't hold the mutex around map/unmap calls
  dma-buf: add support for kernel cpu access
  dma_buf: Add documentation for the new cpu access support

 Documentation/dma-buf-sharing.txt |  102 +++++++++++++++++++++++++++++-
 drivers/base/dma-buf.c            |  124 +++++++++++++++++++++++++++++++++++-
 include/linux/dma-buf.h           |   64 ++++++++++++++++++-
 3 files changed, 282 insertions(+), 8 deletions(-)

-- 
1.7.7.5



* [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls
  2012-03-01 15:35 [PATCH 0/3] [RFC] kernel cpu access support for dma_buf Daniel Vetter
@ 2012-03-01 15:35 ` Daniel Vetter
  2012-03-02 21:26   ` Rob Clark
  2012-03-01 15:36 ` [PATCH 2/3] dma-buf: add support for kernel cpu access Daniel Vetter
  2012-03-01 15:36 ` [PATCH 3/3] dma_buf: Add documentation for the new cpu access support Daniel Vetter
  2 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2012-03-01 15:35 UTC (permalink / raw)
  To: linaro-mm-sig, LKML, DRI Development, linux-media; +Cc: Daniel Vetter

The mutex protects the attachment list and hence needs to be held
around the callback to the exporter's (optional) attach/detach
functions.

Holding the mutex around the map/unmap calls doesn't protect any
dma_buf state. Exporters need to properly protect any of their own
state anyway (to protect against calls from their own interfaces).
So this only makes the locking messier (and lockdep easier to anger).

Therefore let's just drop this.
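
To illustrate (a sketch only; the my_* names are hypothetical, not code from
this patch): with this change an exporter's map callback simply guards its
own state with its own lock, so the core mutex adds nothing here:

	static struct sg_table *my_map_dma_buf(struct dma_buf_attachment *attach,
					       enum dma_data_direction dir)
	{
		struct my_buffer *buf = attach->dmabuf->priv;
		struct sg_table *sgt;

		/* the exporter serializes against its own interfaces itself */
		mutex_lock(&buf->lock);
		sgt = my_pin_pages_and_build_sgt(buf, dir);
		mutex_unlock(&buf->lock);

		return sgt;
	}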

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/base/dma-buf.c  |    4 ----
 include/linux/dma-buf.h |    2 +-
 2 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index e38ad24..1b11192 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -258,10 +258,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf || !attach->dmabuf->ops))
 		return ERR_PTR(-EINVAL);
 
-	mutex_lock(&attach->dmabuf->lock);
 	if (attach->dmabuf->ops->map_dma_buf)
 		sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
-	mutex_unlock(&attach->dmabuf->lock);
 
 	return sg_table;
 }
@@ -282,10 +280,8 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 			    || !attach->dmabuf->ops))
 		return;
 
-	mutex_lock(&attach->dmabuf->lock);
 	if (attach->dmabuf->ops->unmap_dma_buf)
 		attach->dmabuf->ops->unmap_dma_buf(attach, sg_table);
-	mutex_unlock(&attach->dmabuf->lock);
 
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index f8ac076..f7ad2ca 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -86,7 +86,7 @@ struct dma_buf {
 	struct file *file;
 	struct list_head attachments;
 	const struct dma_buf_ops *ops;
-	/* mutex to serialize list manipulation and other ops */
+	/* mutex to serialize list manipulation and attach/detach */
 	struct mutex lock;
 	void *priv;
 };
-- 
1.7.7.5



* [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-01 15:35 [PATCH 0/3] [RFC] kernel cpu access support for dma_buf Daniel Vetter
  2012-03-01 15:35 ` [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls Daniel Vetter
@ 2012-03-01 15:36 ` Daniel Vetter
  2012-03-02 22:24   ` Rob Clark
  2012-03-02 22:38   ` [Linaro-mm-sig] " Chris Wilson
  2012-03-01 15:36 ` [PATCH 3/3] dma_buf: Add documentation for the new cpu access support Daniel Vetter
  2 siblings, 2 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-03-01 15:36 UTC (permalink / raw)
  To: linaro-mm-sig, LKML, DRI Development, linux-media; +Cc: Daniel Vetter

A big difference from other contenders in the field (like ion) is
that this also supports highmem, so we have to split up the cpu
access from the kernel side into a prepare and a kmap step.

Prepare is allowed to fail and should do everything required so that
the kmap calls can succeed (like swapin/backing storage allocation,
flushing, ...).

More in-depth explanations will follow in the follow-up documentation
patch.

Changes in v2:

- Clear up begin_cpu_access confusion noticed by Sumit Semwal.
- Don't automatically fallback from the _atomic variants to the
  non-atomic variants. The _atomic callbacks are not allowed to
  sleep, so we want exporters to make this decision explicit. The
  function signatures are explicit, so simpler exporters can still
  use the same function for both.
- Make the unmap functions optional. Simpler exporters with permanent
  mappings don't need to do anything at unmap time.
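
To illustrate the intended importer-side flow, a minimal sketch (error paths
trimmed; dmabuf, dst and a page-aligned total_len are assumed to come from
the caller):

	int ret;
	unsigned long i;
	void *vaddr;

	ret = dma_buf_begin_cpu_access(dmabuf, 0, total_len, DMA_FROM_DEVICE);
	if (ret)
		return ret;

	/* access the object in PAGE_SIZE chunks */
	for (i = 0; i < total_len / PAGE_SIZE; i++) {
		vaddr = dma_buf_kmap(dmabuf, i);
		memcpy(dst + i * PAGE_SIZE, vaddr, PAGE_SIZE);
		dma_buf_kunmap(dmabuf, i, vaddr);
	}

	dma_buf_end_cpu_access(dmabuf, 0, total_len, DMA_FROM_DEVICE);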

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/base/dma-buf.c  |  120 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h |   62 +++++++++++++++++++++++++
 2 files changed, 182 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 1b11192..bf54b89 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -285,3 +285,123 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+
+
+/**
+ * dma_buf_begin_cpu_access - Must be called before accessing a dma_buf from the
+ * cpu in the kernel context. Calls begin_cpu_access to allow exporter-specific
+ * preparations. Coherency is only guaranteed in the specified range for the
+ * specified access direction.
+ * @dmabuf:	[in]	buffer to prepare cpu access for.
+ * @start:	[in]	start of range for cpu access.
+ * @len:	[in]	length of range for cpu access.
+ * @direction:	[in]	direction of cpu access.
+ *
+ * Can return negative error values, returns 0 on success.
+ */
+int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+			     enum dma_data_direction direction)
+{
+	int ret = 0;
+
+	if (WARN_ON(!dmabuf || !dmabuf->ops))
+		return -EINVAL;
+
+	if (dmabuf->ops->begin_cpu_access)
+		ret = dmabuf->ops->begin_cpu_access(dmabuf, start, len, direction);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
+
+/**
+ * dma_buf_end_cpu_access - Must be called after accessing a dma_buf from the
+ * cpu in the kernel context. Calls end_cpu_access to allow exporter-specific
+ * actions. Coherency is only guaranteed in the specified range for the
+ * specified access direction.
+ * @dmabuf:	[in]	buffer to complete cpu access for.
+ * @start:	[in]	start of range for cpu access.
+ * @len:	[in]	length of range for cpu access.
+ * @direction:	[in]	direction of cpu access.
+ *
+ * This call must always succeed.
+ */
+void dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+			    enum dma_data_direction direction)
+{
+	WARN_ON(!dmabuf || !dmabuf->ops);
+
+	if (dmabuf->ops->end_cpu_access)
+		dmabuf->ops->end_cpu_access(dmabuf, start, len, direction);
+}
+EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
+
+/**
+ * dma_buf_kmap_atomic - Map a page of the buffer object into kernel address
+ * space. The same restrictions as for kmap_atomic and friends apply.
+ * @dmabuf:	[in]	buffer to map page from.
+ * @page_num:	[in]	page in PAGE_SIZE units to map.
+ *
+ * This call must always succeed, any necessary preparations that might fail
+ * need to be done in begin_cpu_access.
+ */
+void *dma_buf_kmap_atomic(struct dma_buf *dmabuf, unsigned long page_num)
+{
+	WARN_ON(!dmabuf || !dmabuf->ops);
+
+	return dmabuf->ops->kmap_atomic(dmabuf, page_num);
+}
+EXPORT_SYMBOL_GPL(dma_buf_kmap_atomic);
+
+/**
+ * dma_buf_kunmap_atomic - Unmap a page obtained by dma_buf_kmap_atomic.
+ * @dmabuf:	[in]	buffer to unmap page from.
+ * @page_num:	[in]	page in PAGE_SIZE units to unmap.
+ * @vaddr:	[in]	kernel space pointer obtained from dma_buf_kmap_atomic.
+ *
+ * This call must always succeed.
+ */
+void dma_buf_kunmap_atomic(struct dma_buf *dmabuf, unsigned long page_num,
+			   void *vaddr)
+{
+	WARN_ON(!dmabuf || !dmabuf->ops);
+
+	if (dmabuf->ops->kunmap_atomic)
+		dmabuf->ops->kunmap_atomic(dmabuf, page_num, vaddr);
+}
+EXPORT_SYMBOL_GPL(dma_buf_kunmap_atomic);
+
+/**
+ * dma_buf_kmap - Map a page of the buffer object into kernel address space. The
+ * same restrictions as for kmap and friends apply.
+ * @dmabuf:	[in]	buffer to map page from.
+ * @page_num:	[in]	page in PAGE_SIZE units to map.
+ *
+ * This call must always succeed, any necessary preparations that might fail
+ * need to be done in begin_cpu_access.
+ */
+void *dma_buf_kmap(struct dma_buf *dmabuf, unsigned long page_num)
+{
+	WARN_ON(!dmabuf || !dmabuf->ops);
+
+	return dmabuf->ops->kmap(dmabuf, page_num);
+}
+EXPORT_SYMBOL_GPL(dma_buf_kmap);
+
+/**
+ * dma_buf_kunmap - Unmap a page obtained by dma_buf_kmap.
+ * @dmabuf:	[in]	buffer to unmap page from.
+ * @page_num:	[in]	page in PAGE_SIZE units to unmap.
+ * @vaddr:	[in]	kernel space pointer obtained from dma_buf_kmap.
+ *
+ * This call must always succeed.
+ */
+void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num,
+		    void *vaddr)
+{
+	WARN_ON(!dmabuf || !dmabuf->ops);
+
+	if (dmabuf->ops->kunmap)
+		dmabuf->ops->kunmap(dmabuf, page_num, vaddr);
+}
+EXPORT_SYMBOL_GPL(dma_buf_kunmap);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index f7ad2ca..72eb922 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -49,6 +49,17 @@ struct dma_buf_attachment;
  * @unmap_dma_buf: decreases usecount of buffer, might deallocate scatter
  *		   pages.
  * @release: release this buffer; to be called after the last dma_buf_put.
+ * @begin_cpu_access: [optional] called before cpu access to invalidate cpu
+ * 		      caches and allocate the backing storage (if not yet
+ * 		      done) or pin the object into memory.
+ * @end_cpu_access: [optional] called after cpu access to flush caches.
+ * @kmap_atomic: maps a page from the buffer into kernel address space;
+ * 		 users must not block between this call and the subsequent
+ * 		 unmap call. This callback must not sleep.
+ * @kunmap_atomic: [optional] unmaps an atomically mapped page from the buffer.
+ * 		   This callback must not sleep.
+ * @kmap: maps a page from the buffer into kernel address space.
+ * @kunmap: [optional] unmaps a page from the buffer.
  */
 struct dma_buf_ops {
 	int (*attach)(struct dma_buf *, struct device *,
@@ -71,6 +82,14 @@ struct dma_buf_ops {
 	/* after final dma_buf_put() */
 	void (*release)(struct dma_buf *);
 
+	int (*begin_cpu_access)(struct dma_buf *, size_t, size_t,
+				enum dma_data_direction);
+	void (*end_cpu_access)(struct dma_buf *, size_t, size_t,
+			       enum dma_data_direction);
+	void *(*kmap_atomic)(struct dma_buf *, unsigned long);
+	void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *);
+	void *(*kmap)(struct dma_buf *, unsigned long);
+	void (*kunmap)(struct dma_buf *, unsigned long, void *);
 };
 
 /**
@@ -123,6 +142,15 @@ void dma_buf_put(struct dma_buf *dmabuf);
 struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 					enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *);
+
+int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+			     enum dma_data_direction dir);
+void dma_buf_end_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+			    enum dma_data_direction dir);
+void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
+void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
+void *dma_buf_kmap(struct dma_buf *, unsigned long);
+void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
 #else
 
 static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
@@ -171,6 +199,40 @@ static inline void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 	return;
 }
 
+static inline int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+					   size_t start, size_t len,
+					   enum dma_data_direction dir)
+{
+	return -ENODEV;
+}
+
+static inline void dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+					  size_t start, size_t len,
+					  enum dma_data_direction dir)
+{
+}
+
+static inline void *dma_buf_kmap_atomic(struct dma_buf *dmabuf,
+					unsigned long page_num)
+{
+	return NULL;
+}
+
+static inline void dma_buf_kunmap_atomic(struct dma_buf *dmabuf,
+					 unsigned long page_num, void *vaddr)
+{
+}
+
+static inline void *dma_buf_kmap(struct dma_buf *dmabuf,
+				 unsigned long page_num)
+{
+	return NULL;
+}
+
+static inline void dma_buf_kunmap(struct dma_buf *dmabuf,
+				  unsigned long page_num, void *vaddr)
+{
+}
 #endif /* CONFIG_DMA_SHARED_BUFFER */
 
 #endif /* __DMA_BUF_H__ */
-- 
1.7.7.5
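
For illustration, a sketch of how a simple exporter whose backing pages are
permanently allocated could wire up the new callbacks (the my_* names and
the pages[] array are hypothetical):

	static void *my_kmap_atomic(struct dma_buf *dmabuf, unsigned long page_num)
	{
		struct my_buf *buf = dmabuf->priv;

		/* must not sleep - kmap_atomic is safe here */
		return kmap_atomic(buf->pages[page_num]);
	}

	static void my_kunmap_atomic(struct dma_buf *dmabuf, unsigned long page_num,
				     void *vaddr)
	{
		kunmap_atomic(vaddr);
	}

	static void *my_kmap(struct dma_buf *dmabuf, unsigned long page_num)
	{
		struct my_buf *buf = dmabuf->priv;

		return kmap(buf->pages[page_num]);
	}

	static void my_kunmap(struct dma_buf *dmabuf, unsigned long page_num,
			      void *vaddr)
	{
		struct my_buf *buf = dmabuf->priv;

		kunmap(buf->pages[page_num]);
	}

	static const struct dma_buf_ops my_dma_buf_ops = {
		/* ... attach, detach, map_dma_buf, unmap_dma_buf, release ... */
		.kmap_atomic	= my_kmap_atomic,
		.kunmap_atomic	= my_kunmap_atomic,
		.kmap		= my_kmap,
		.kunmap		= my_kunmap,
	};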



* [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
  2012-03-01 15:35 [PATCH 0/3] [RFC] kernel cpu access support for dma_buf Daniel Vetter
  2012-03-01 15:35 ` [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls Daniel Vetter
  2012-03-01 15:36 ` [PATCH 2/3] dma-buf: add support for kernel cpu access Daniel Vetter
@ 2012-03-01 15:36 ` Daniel Vetter
  2012-03-02 22:34   ` Rob Clark
  2012-03-03  0:23   ` Sakari Ailus
  2 siblings, 2 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-03-01 15:36 UTC (permalink / raw)
  To: linaro-mm-sig, LKML, DRI Development, linux-media; +Cc: Daniel Vetter

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 Documentation/dma-buf-sharing.txt |  102 +++++++++++++++++++++++++++++++++++-
 1 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 225f96d..f12542b 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -32,8 +32,12 @@ The buffer-user
 *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
 For this first version, A buffer shared using the dma_buf sharing API:
 - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
-   this framework.
-- may be used *ONLY* by importers that do not need CPU access to the buffer.
+  this framework.
+- with this new iteration of the dma-buf api, cpu access from the kernel has
+  been enabled; see below for the details.
+
+dma-buf operations for device dma only
+--------------------------------------
 
 The dma_buf buffer sharing API usage contains the following steps:
 
@@ -219,7 +223,99 @@ NOTES:
    If the exporter chooses not to allow an attach() operation once a
    map_dma_buf() API has been called, it simply returns an error.
 
-Miscellaneous notes:
+Kernel cpu access to a dma-buf buffer object
+--------------------------------------------
+
+The motivations to allow cpu access from the kernel to a dma-buf object from
+the importer's side are:
+- fallback operations, e.g. if the device is connected to a usb bus and the
+  kernel needs to shuffle the data around before sending it away.
+- full transparency for existing users on the importer side, i.e. userspace
+  should not notice the difference between a normal object from that subsystem
+  and an imported one backed by a dma-buf. This is really important for drm
+  opengl drivers that expect to still use all the existing upload/download
+  paths.
+
+Access to a dma_buf from the kernel context involves three steps:
+
+1. Prepare access, which invalidates any necessary caches and makes the object
+   available for cpu access.
+2. Access the object page-by-page with the dma_buf map apis.
+3. Finish access, which will flush any necessary cpu caches and free reserved
+   resources.
+
+1. Prepare access
+
+   Before an importer can access a dma_buf object with the cpu from the kernel
+   context, it needs to notify the exporter of the access that is about to
+   happen.
+
+   Interface:
+      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+				   size_t start, size_t len,
+				   enum dma_data_direction direction)
+
+   This allows the exporter to ensure that the memory is actually available for
+   cpu access - the exporter might need to allocate or swap-in and pin the
+   backing storage. The exporter also needs to ensure that cpu access is
+   coherent for the given range and access direction. The range and access
+   direction can be used by the exporter to optimize the cache flushing, i.e.
+   access outside of the range or with a different direction (read instead of
+   write) might return stale or even bogus data (e.g. when the exporter needs to
+   copy the data to temporary storage).
+
+   This step might fail, e.g. in oom conditions.
+
+2. Accessing the buffer
+
+   To support dma_buf objects residing in highmem, cpu access is page-based,
+   using an api similar to kmap. Accessing a dma_buf is done in aligned chunks
+   of PAGE_SIZE. Before accessing a chunk it needs to be mapped, which returns
+   a pointer in kernel virtual address space. Afterwards the chunk needs to be
+   unmapped again. There is no limit on how often a given chunk can be mapped
+   and unmapped, i.e. the importer does not need to call begin_cpu_access again
+   before mapping the same chunk again.
+
+   Interfaces:
+      void *dma_buf_kmap(struct dma_buf *, unsigned long);
+      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
+
+   There are also atomic variants of these interfaces. Like for kmap they
+   facilitate non-blocking fast-paths. Neither the importer nor the exporter (in
+   the callback) is allowed to block when using these.
+
+   Interfaces:
+      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
+      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
+
+   For importers all the restrictions of using kmap apply, like the limited
+   supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
+   atomic dma_buf kmaps at the same time (in any given process context).
+
+   dma_buf kmap calls outside of the range specified in begin_cpu_access are
+   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
+   the partial chunks at the beginning and end but may return stale or bogus
+   data outside of the range (in these partial chunks).
+
+   Note that these calls need to always succeed. The exporter needs to complete
+   any preparations that might fail in begin_cpu_access.
+
+3. Finish access
+
+   When the importer is done accessing the range specified in begin_cpu_access,
+   it needs to announce this to the exporter (to facilitate cache flushing and
+   unpinning of any pinned resources). The result of any dma_buf kmap calls
+   after end_cpu_access is undefined.
+
+   Interface:
+      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
+				  size_t start, size_t len,
+				  enum dma_data_direction dir);
+
+
+Miscellaneous notes
+-------------------
+
 - Any exporters or users of the dma-buf buffer sharing framework must have
   a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
 
-- 
1.7.7.5
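
For illustration, a sketch of the exporter-side half of the scheme described
above (my_buf and the sync/pin helpers are hypothetical; a real exporter
would use the cache maintenance apis matching its backing storage):

	static int my_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
				       size_t len, enum dma_data_direction dir)
	{
		struct my_buf *buf = dmabuf->priv;
		int ret;

		/* make sure the backing storage is allocated and pinned */
		ret = my_buf_pin_pages(buf);
		if (ret)
			return ret;

		/* make the range coherent for the given cpu access */
		my_buf_sync_range_for_cpu(buf, start, len, dir);

		return 0;
	}

	static void my_end_cpu_access(struct dma_buf *dmabuf, size_t start,
				      size_t len, enum dma_data_direction dir)
	{
		struct my_buf *buf = dmabuf->priv;

		/* flush cpu caches for the range and drop the pin */
		my_buf_sync_range_for_device(buf, start, len, dir);
		my_buf_unpin_pages(buf);
	}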



* Re: [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls
  2012-03-01 15:35 ` [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls Daniel Vetter
@ 2012-03-02 21:26   ` Rob Clark
  0 siblings, 0 replies; 15+ messages in thread
From: Rob Clark @ 2012-03-02 21:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

On Thu, Mar 1, 2012 at 9:35 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> The mutex protects the attachment list and hence needs to be held
> around the callbakc to the exporters (optional) attach/detach
> functions.
>
> Holding the mutex around the map/unmap calls doesn't protect any
> dma_buf state. Exporters need to properly protect any of their own
> state anyway (to protect against calls from their own interfaces).
> So this only makes the locking messier (and lockdep easier to anger).
>
> Therefore let's just drop this.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Reviewed-by: Rob Clark <rob.clark@linaro.org>

> ---
>  drivers/base/dma-buf.c  |    4 ----
>  include/linux/dma-buf.h |    2 +-
>  2 files changed, 1 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
> index e38ad24..1b11192 100644
> --- a/drivers/base/dma-buf.c
> +++ b/drivers/base/dma-buf.c
> @@ -258,10 +258,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>        if (WARN_ON(!attach || !attach->dmabuf || !attach->dmabuf->ops))
>                return ERR_PTR(-EINVAL);
>
> -       mutex_lock(&attach->dmabuf->lock);
>        if (attach->dmabuf->ops->map_dma_buf)
>                sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
> -       mutex_unlock(&attach->dmabuf->lock);
>
>        return sg_table;
>  }
> @@ -282,10 +280,8 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>                            || !attach->dmabuf->ops))
>                return;
>
> -       mutex_lock(&attach->dmabuf->lock);
>        if (attach->dmabuf->ops->unmap_dma_buf)
>                attach->dmabuf->ops->unmap_dma_buf(attach, sg_table);
> -       mutex_unlock(&attach->dmabuf->lock);
>
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index f8ac076..f7ad2ca 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -86,7 +86,7 @@ struct dma_buf {
>        struct file *file;
>        struct list_head attachments;
>        const struct dma_buf_ops *ops;
> -       /* mutex to serialize list manipulation and other ops */
> +       /* mutex to serialize list manipulation and attach/detach */
>        struct mutex lock;
>        void *priv;
>  };
> --
> 1.7.7.5
>


* Re: [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-01 15:36 ` [PATCH 2/3] dma-buf: add support for kernel cpu access Daniel Vetter
@ 2012-03-02 22:24   ` Rob Clark
  2012-03-05 18:57     ` Daniel Vetter
  2012-03-02 22:38   ` [Linaro-mm-sig] " Chris Wilson
  1 sibling, 1 reply; 15+ messages in thread
From: Rob Clark @ 2012-03-02 22:24 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

On Thu, Mar 1, 2012 at 9:36 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Big differences to other contenders in the field (like ion) is
> that this also supports highmem, so we have to split up the cpu
> access from the kernel side into a prepare and a kmap step.
>
> Prepare is allowed to fail and should do everything required so that
> the kmap calls can succeed (like swapin/backing storage allocation,
> flushing, ...).
>
> More in-depth explanations will follow in the follow-up documentation
> patch.
>
> Changes in v2:
>
> - Clear up begin_cpu_access confusion noticed by Sumit Semwal.
> - Don't automatically fallback from the _atomic variants to the
>  non-atomic variants. The _atomic callbacks are not allowed to
>  sleep, so we want exporters to make this decision explicit. The
>  function signatures are explicit, so simpler exporters can still
>  use the same function for both.
> - Make the unmap functions optional. Simpler exporters with permanent
>  mappings don't need to do anything at unmap time.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/base/dma-buf.c  |  120 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-buf.h |   60 +++++++++++++++++++++++
>  2 files changed, 180 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
> index 1b11192..bf54b89 100644
> --- a/drivers/base/dma-buf.c
> +++ b/drivers/base/dma-buf.c
> @@ -285,3 +285,123 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> +
> +
> +/**
> + * dma_buf_begin_cpu_access - Must be called before accessing a dma_buf from the
> + * cpu in the kernel context. Calls begin_cpu_access to allow exporter-specific
> + * preparations. Coherency is only guaranteed in the specified range for the
> + * specified access direction.
> + * @dma_buf:   [in]    buffer to prepare cpu access for.
> + * @start:     [in]    start of range for cpu access.
> + * @len:       [in]    length of range for cpu access.
> + * @direction: [in]    length of range for cpu access.
> + *
> + * Can return negative error values, returns 0 on success.
> + */
> +int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
> +                            enum dma_data_direction direction)
> +{
> +       int ret = 0;
> +
> +       if (WARN_ON(!dmabuf || !dmabuf->ops))
> +               return EINVAL;
> +
> +       if (dmabuf->ops->begin_cpu_access)
> +               ret = dmabuf->ops->begin_cpu_access(dmabuf, start, len, direction);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
> +
> +/**
> + * dma_buf_end_cpu_access - Must be called after accessing a dma_buf from the
> + * cpu in the kernel context. Calls end_cpu_access to allow exporter-specific
> + * actions. Coherency is only guaranteed in the specified range for the
> + * specified access direction.
> + * @dma_buf:   [in]    buffer to complete cpu access for.
> + * @start:     [in]    start of range for cpu access.
> + * @len:       [in]    length of range for cpu access.
> + * @direction: [in]    length of range for cpu access.
> + *
> + * This call must always succeed.
> + */
> +void dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
> +                           enum dma_data_direction direction)
> +{
> +       WARN_ON(!dmabuf || !dmabuf->ops);
> +
> +       if (dmabuf->ops->end_cpu_access)
> +               dmabuf->ops->end_cpu_access(dmabuf, start, len, direction);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
> +
> +/**
> + * dma_buf_kmap_atomic - Map a page of the buffer object into kernel address
> + * space. The same restrictions as for kmap_atomic and friends apply.
> + * @dma_buf:   [in]    buffer to map page from.
> + * @page_num:  [in]    page in PAGE_SIZE units to map.
> + *
> + * This call must always succeed, any necessary preparations that might fail
> + * need to be done in begin_cpu_access.
> + */
> +void *dma_buf_kmap_atomic(struct dma_buf *dmabuf, unsigned long page_num)
> +{
> +       WARN_ON(!dmabuf || !dmabuf->ops);
> +
> +       return dmabuf->ops->kmap_atomic(dmabuf, page_num);


Perhaps we should check somewhere for required dmabuf ops fxns (like
kmap_atomic here), rather than just calling unconditionally what might
be a null ptr.  At least put it in the WARN_ON(), but it might be
nicer to catch a missing required fxn at export time, rather than
waiting for an importer to try and call it.  That way it's less likely
that newly added required functions go unnoticed.

(same comment applies below for the non-atomic variant.. and possibly
some other existing dmabuf ops)
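
For illustration, such an export-time check could look roughly like this in
dma_buf_export() (a sketch only; the exact required-ops list is an
assumption):

	if (WARN_ON(!ops || !ops->map_dma_buf || !ops->unmap_dma_buf
		    || !ops->release || !ops->kmap || !ops->kmap_atomic))
		return ERR_PTR(-EINVAL);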

BR,
-R

> +}
> +EXPORT_SYMBOL_GPL(dma_buf_kmap_atomic);
> +
> +/**
> + * dma_buf_kunmap_atomic - Unmap a page obtained by dma_buf_kmap_atomic.
> + * @dma_buf:   [in]    buffer to unmap page from.
> + * @page_num:  [in]    page in PAGE_SIZE units to unmap.
> + * @vaddr:     [in]    kernel space pointer obtained from dma_buf_kmap_atomic.
> + *
> + * This call must always succeed.
> + */
> +void dma_buf_kunmap_atomic(struct dma_buf *dmabuf, unsigned long page_num,
> +                          void *vaddr)
> +{
> +       WARN_ON(!dmabuf || !dmabuf->ops);
> +
> +       if (dmabuf->ops->kunmap_atomic)
> +               dmabuf->ops->kunmap_atomic(dmabuf, page_num, vaddr);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_kunmap_atomic);
> +
> +/**
> + * dma_buf_kmap - Map a page of the buffer object into kernel address space. The
> + * same restrictions as for kmap and friends apply.
> + * @dma_buf:   [in]    buffer to map page from.
> + * @page_num:  [in]    page in PAGE_SIZE units to map.
> + *
> + * This call must always succeed, any necessary preparations that might fail
> + * need to be done in begin_cpu_access.
> + */
> +void *dma_buf_kmap(struct dma_buf *dmabuf, unsigned long page_num)
> +{
> +       WARN_ON(!dmabuf || !dmabuf->ops);
> +
> +       return dmabuf->ops->kmap(dmabuf, page_num);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_kmap);
> +
> +/**
> + * dma_buf_kunmap - Unmap a page obtained by dma_buf_kmap.
> + * @dma_buf:   [in]    buffer to unmap page from.
> + * @page_num:  [in]    page in PAGE_SIZE units to unmap.
> + * @vaddr:     [in]    kernel space pointer obtained from dma_buf_kmap.
> + *
> + * This call must always succeed.
> + */
> +void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num,
> +                   void *vaddr)
> +{
> +       WARN_ON(!dmabuf || !dmabuf->ops);
> +
> +       if (dmabuf->ops->kunmap)
> +               dmabuf->ops->kunmap(dmabuf, page_num, vaddr);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_kunmap);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index f7ad2ca..72eb922 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -49,6 +49,17 @@ struct dma_buf_attachment;
>  * @unmap_dma_buf: decreases usecount of buffer, might deallocate scatter
>  *                pages.
>  * @release: release this buffer; to be called after the last dma_buf_put.
> + * @begin_cpu_access: [optional] called before cpu access to invalidate cpu
> + *                   caches and allocate backing storage (if not yet done)
> + *                   respectively pin the objet into memory.
> + * @end_cpu_access: [optional] called after cpu access to flush cashes.
> + * @kmap_atomic: maps a page from the buffer into kernel address
> + *              space, users may not block until the subsequent unmap call.
> + *              This callback must not sleep.
> + * @kunmap_atomic: [optional] unmaps a atomically mapped page from the buffer.
> + *                This Callback must not sleep.
> + * @kmap: maps a page from the buffer into kernel address space.
> + * @kunmap: [optional] unmaps a page from the buffer.
>  */
>  struct dma_buf_ops {
>        int (*attach)(struct dma_buf *, struct device *,
> @@ -71,6 +82,14 @@ struct dma_buf_ops {
>        /* after final dma_buf_put() */
>        void (*release)(struct dma_buf *);
>
> +       int (*begin_cpu_access)(struct dma_buf *, size_t, size_t,
> +                               enum dma_data_direction);
> +       void (*end_cpu_access)(struct dma_buf *, size_t, size_t,
> +                              enum dma_data_direction);
> +       void *(*kmap_atomic)(struct dma_buf *, unsigned long);
> +       void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *);
> +       void *(*kmap)(struct dma_buf *, unsigned long);
> +       void (*kunmap)(struct dma_buf *, unsigned long, void *);
>  };
>
>  /**
> @@ -123,6 +142,15 @@ void dma_buf_put(struct dma_buf *dmabuf);
>  struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
>                                        enum dma_data_direction);
>  void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *);
> +
> +int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
> +                            enum dma_data_direction dir);
> +void dma_buf_end_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
> +                           enum dma_data_direction dir);
> +void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
> +void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
> +void *dma_buf_kmap(struct dma_buf *, unsigned long);
> +void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
>  #else
>
>  static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> @@ -171,6 +199,38 @@ static inline void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>        return;
>  }
>
> +static inline int dma_buf_begin_cpu_access(struct dma_buf *,
> +                                          size_t, size_t,
> +                                          enum dma_data_direction)
> +{
> +       return -ENODEV;
> +}
> +
> +static inline void dma_buf_end_cpu_access(struct dma_buf *,
> +                                         size_t, size_t,
> +                                         enum dma_data_direction)
> +{
> +}
> +
> +static inline void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long)
> +{
> +       return NULL;
> +}
> +
> +static inline void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long,
> +                                        void *)
> +{
> +}
> +
> +static inline void *dma_buf_kmap(struct dma_buf *, unsigned long)
> +{
> +       return NULL;
> +}
> +
> +static inline void dma_buf_kunmap(struct dma_buf *, unsigned long,
> +                                 void *)
> +{
> +}
>  #endif /* CONFIG_DMA_SHARED_BUFFER */
>
>  #endif /* __DMA_BUF_H__ */
> --
> 1.7.7.5
>


* Re: [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
  2012-03-01 15:36 ` [PATCH 3/3] dma_buf: Add documentation for the new cpu access support Daniel Vetter
@ 2012-03-02 22:34   ` Rob Clark
  2012-03-03  0:23   ` Sakari Ailus
  1 sibling, 0 replies; 15+ messages in thread
From: Rob Clark @ 2012-03-02 22:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

minor comments from the typo-police..

On Thu, Mar 1, 2012 at 9:36 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  Documentation/dma-buf-sharing.txt |  102 +++++++++++++++++++++++++++++++++++-
>  1 files changed, 99 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> index 225f96d..f12542b 100644
> --- a/Documentation/dma-buf-sharing.txt
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -32,8 +32,12 @@ The buffer-user
>  *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
>  For this first version, A buffer shared using the dma_buf sharing API:
>  - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
> -   this framework.
> -- may be used *ONLY* by importers that do not need CPU access to the buffer.
> +  this framework.
> +- with this new iteration of the dma-buf api cpu access from the kernel has been
> +  enable, see below for the details.
> +
> +dma-buf operations for device dma only
> +--------------------------------------
>
>  The dma_buf buffer sharing API usage contains the following steps:
>
> @@ -219,7 +223,99 @@ NOTES:
>    If the exporter chooses not to allow an attach() operation once a
>    map_dma_buf() API has been called, it simply returns an error.
>
> -Miscellaneous notes:
> +Kernel cpu access to a dma-buf buffer object
> +--------------------------------------------
> +
> +The motivation to allow cpu access from the kernel to a dma-buf object from the
> +importers side are:
> +- fallback operations, e.g. if the devices is connected to a usb bus and the
> +  kernel needs to shuffle the data around first before sending it away.
> +- full transperancy for existing users on the importer side, i.e. userspace

s/transperancy/transparency/

> +  should not notice the difference between a normal object from that subsystem
> +  and an imported one backed by a dma-buf. This is really important for drm
> +  opengl drivers that expect to still use all the existing upload/download
> +  paths.
> +
> +Access to a dma_buf from the kernel context involves three steps:
> +
> +1. Prepare access, which invalidate any necessary caches and make the object
> +   available for cpu access.
> +2. Access the object page-by-page with the dma_buf map apis
> +3. Finish access, which will flush any necessary cpu caches and free reserved
> +   resources.
> +
> +1. Prepare acces
> +

s/acces/access/

> +   Before an importer can acces a dma_buf object with the cpu from the kernel

s/acces/access/

> +   context, it needs to notice the exporter of the access that is about to

s/notice/notify/ (I assume?)

> +   happen.
> +
> +   Interface:
> +      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> +                                  size_t start, size_t len,
> +                                  enum dma_data_direction direction)
> +
> +   This allows the exporter to ensure that the memory is actually available for
> +   cpu access - the exporter might need to allocate or swap-in and pin the
> +   backing storage. The exporter also needs to ensure that cpu access is
> +   coherent for the given range and access direction. The range and access
> +   direction can be used by the exporter to optimize the cache flushing, i.e.
> +   access outside of the range or with a different direction (read instead of
> +   write) might return stale or even bogus data (e.g. when the exporter needs to
> +   copy the data to temporaray storage).

s/temporaray/temporary/

> +
> +   This step might fail, e.g. in oom conditions.
> +
> +2. Accessing the buffer
> +
> +   To support dma_buf objects residing in highmem cpu access is page-based using
> +   an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
> +   PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns
> +   a pointer in kernel virtual address space. Afterwards the chunk needs to be
> +   unmapped again. There is no limit on how often a given chunk can be mapped
> +   and unmmapped, i.e. the importer does not need to call begin_cpu_access again

s/unmmapped/unmapped/

> +   before mapping the same chunk again.
> +
> +   Interfaces:
> +      void *dma_buf_kmap(struct dma_buf *, unsigned long);
> +      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
> +
> +   There are also atomic variants of these interfaces. Like for kmap they
> +   facilitate non-blocking fast-paths. Neither the importer nor the exporter (in
> +   the callback) is allowed to block when using these.
> +
> +   Interfaces:
> +      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
> +      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
> +
> +   For importers all the restrictions of using kmap apply, like the limited
> +   supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
> +   atomic dma_buf kmaps at the same time (in any given process context).
> +
> +   dma_buf kmap calls outside of the range specified in begin_cpu_access are
> +   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
> +   the partial chunks at the beginning and end but may return stale or bogus
> +   data outside of the range (in these partial chunks).
> +
> +   Note that these calls need to always succeed. The exporter needs to complete
> +   any preparations that might fail in begin_cpu_access.
> +
> +3. Finish access
> +
> +   When the importer is done accessing the range specified in begin_cpu_acces,

s/begin_cpu_acces/begin_cpu_access/


BR,
-R


> +   it needs to announce this to the exporter (to facilitate cache flushing and
> +   unpinning of any pinned resources). The result of of any dma_buf kmap calls
> +   after end_cpu_access is undefined.
> +
> +   Interface:
> +      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> +                                 size_t start, size_t len,
> +                                 enum dma_data_direction dir);
> +
> +
> +Miscellaneous notes
> +-------------------
> +
>  - Any exporters or users of the dma-buf buffer sharing framework must have
>   a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
>
> --
> 1.7.7.5
>


* Re: [Linaro-mm-sig] [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-01 15:36 ` [PATCH 2/3] dma-buf: add support for kernel cpu access Daniel Vetter
  2012-03-02 22:24   ` Rob Clark
@ 2012-03-02 22:38   ` Chris Wilson
  2012-03-02 22:53     ` Rob Clark
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2012-03-02 22:38 UTC (permalink / raw)
  To: Daniel Vetter, linaro-mm-sig, LKML, DRI Development, linux-media

On Thu,  1 Mar 2012 16:36:00 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Big differences to other contenders in the field (like ion) is
> that this also supports highmem, so we have to split up the cpu
> access from the kernel side into a prepare and a kmap step.
> 
> Prepare is allowed to fail and should do everything required so that
> the kmap calls can succeed (like swapin/backing storage allocation,
> flushing, ...).
> 
> More in-depth explanations will follow in the follow-up documentation
> patch.
> 
> Changes in v2:
> 
> - Clear up begin_cpu_access confusion noticed by Sumit Semwal.
> - Don't automatically fallback from the _atomic variants to the
>   non-atomic variants. The _atomic callbacks are not allowed to
>   sleep, so we want exporters to make this decision explicit. The
>   function signatures are explicit, so simpler exporters can still
>   use the same function for both.
> - Make the unmap functions optional. Simpler exporters with permanent
>   mappings don't need to do anything at unmap time.

Are we going to have to have a dma_buf->ops->begin_async_access(&me, dir)
variant for coherency control of rendering with an imported dma_buf?
There is also no concurrency control here between multiple importers
doing simultaneous begin_cpu_access(). I presume that is going to be a
common function across all exporters so the midlayer might offer a
semaphore as a library function and then the
dma_buf->ops->begin_cpu_access() becomes mandatory as at a minimum it
has to point to the default implementation.
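
For illustration, that default might look roughly like this (a sketch only;
the cpu_access_sem field on struct dma_buf is hypothetical):

	int dma_buf_generic_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
					     size_t len,
					     enum dma_data_direction dir)
	{
		/* serialize simultaneous cpu access by multiple importers;
		 * the matching default end_cpu_access would call up() */
		return down_interruptible(&dmabuf->cpu_access_sem);
	}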
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [Linaro-mm-sig] [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-02 22:38   ` [Linaro-mm-sig] " Chris Wilson
@ 2012-03-02 22:53     ` Rob Clark
  2012-03-05 21:31       ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Rob Clark @ 2012-03-02 22:53 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Daniel Vetter, linaro-mm-sig, LKML, DRI Development, linux-media

On Fri, Mar 2, 2012 at 4:38 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu,  1 Mar 2012 16:36:00 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>> Big differences to other contenders in the field (like ion) is
>> that this also supports highmem, so we have to split up the cpu
>> access from the kernel side into a prepare and a kmap step.
>>
>> Prepare is allowed to fail and should do everything required so that
>> the kmap calls can succeed (like swapin/backing storage allocation,
>> flushing, ...).
>>
>> More in-depth explanations will follow in the follow-up documentation
>> patch.
>>
>> Changes in v2:
>>
>> - Clear up begin_cpu_access confusion noticed by Sumit Semwal.
>> - Don't automatically fallback from the _atomic variants to the
>>   non-atomic variants. The _atomic callbacks are not allowed to
>>   sleep, so we want exporters to make this decision explicit. The
>>   function signatures are explicit, so simpler exporters can still
>>   use the same function for both.
>> - Make the unmap functions optional. Simpler exporters with permanent
>>   mappings don't need to do anything at unmap time.
>
> Are we going to have to have a dma_buf->ops->begin_async_access(&me, dir)
> variant for coherency control of rendering with an imported dma_buf?
> There is also no concurrency control here between multiple importers
> doing simultaneous begin_cpu_access(). I presume that is going to be a
> common function across all exporters so the midlayer might offer a
> semaphore as a library function and then the
> dma_buf->ops->begin_cpu_access() becomes mandatory as at a minimum it
> has to point to the default implementation.

Initially the expectation was that userspace would not pass a buffer
to multiple subsystems for writing (or that if it did, it would get
the undefined results that one could expect)..  so dealing w/
synchronization was punted.

I expect, though, that one of the next steps is some sort of
sync-object mechanism to supplement dmabuf

BR,
-R

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
  2012-03-01 15:36 ` [PATCH 3/3] dma_buf: Add documentation for the new cpu access support Daniel Vetter
  2012-03-02 22:34   ` Rob Clark
@ 2012-03-03  0:23   ` Sakari Ailus
  2012-03-05 18:48     ` [Linaro-mm-sig] " Clark, Rob
  2012-03-05 21:39     ` Daniel Vetter
  1 sibling, 2 replies; 15+ messages in thread
From: Sakari Ailus @ 2012-03-03  0:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

Hi Daniel,

Thanks for the patch.

On Thu, Mar 01, 2012 at 04:36:01PM +0100, Daniel Vetter wrote:
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  Documentation/dma-buf-sharing.txt |  102 +++++++++++++++++++++++++++++++++++-
>  1 files changed, 99 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> index 225f96d..f12542b 100644
> --- a/Documentation/dma-buf-sharing.txt
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -32,8 +32,12 @@ The buffer-user
>  *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
>  For this first version, A buffer shared using the dma_buf sharing API:
>  - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
> -   this framework.
> -- may be used *ONLY* by importers that do not need CPU access to the buffer.
> +  this framework.
> +- with this new iteration of the dma-buf api cpu access from the kernel has been
> +  enable, see below for the details.
> +
> +dma-buf operations for device dma only
> +--------------------------------------
>  
>  The dma_buf buffer sharing API usage contains the following steps:
>  
> @@ -219,7 +223,99 @@ NOTES:
>     If the exporter chooses not to allow an attach() operation once a
>     map_dma_buf() API has been called, it simply returns an error.
>  
> -Miscellaneous notes:
> +Kernel cpu access to a dma-buf buffer object
> +--------------------------------------------
> +
> +The motivation to allow cpu access from the kernel to a dma-buf object from the
> +importers side are:
> +- fallback operations, e.g. if the devices is connected to a usb bus and the
> +  kernel needs to shuffle the data around first before sending it away.
> +- full transperancy for existing users on the importer side, i.e. userspace
> +  should not notice the difference between a normal object from that subsystem
> +  and an imported one backed by a dma-buf. This is really important for drm
> +  opengl drivers that expect to still use all the existing upload/download
> +  paths.
> +
> +Access to a dma_buf from the kernel context involves three steps:
> +
> +1. Prepare access, which invalidate any necessary caches and make the object
> +   available for cpu access.
> +2. Access the object page-by-page with the dma_buf map apis
> +3. Finish access, which will flush any necessary cpu caches and free reserved
> +   resources.

Where should it be decided which operations are done to the buffer
when it is passed to user space and back to kernel space?

How about splitting these operations into those done the first time the
buffer is passed to user space (mapping to kernel address space, for
example) and those required every time the buffer is passed from kernel
to user and back (cache flushing)?

I'm asking since any unnecessary time-consuming operations, especially as
heavy as mapping the buffer, should be avoidable in subsystems dealing
with streaming video, cameras etc., i.e. non-GPU users.

> +1. Prepare acces
> +
> +   Before an importer can acces a dma_buf object with the cpu from the kernel
> +   context, it needs to notice the exporter of the access that is about to
> +   happen.
> +
> +   Interface:
> +      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> +				   size_t start, size_t len,
> +				   enum dma_data_direction direction)
> +
> +   This allows the exporter to ensure that the memory is actually available for
> +   cpu access - the exporter might need to allocate or swap-in and pin the
> +   backing storage. The exporter also needs to ensure that cpu access is
> +   coherent for the given range and access direction. The range and access
> +   direction can be used by the exporter to optimize the cache flushing, i.e.
> +   access outside of the range or with a different direction (read instead of
> +   write) might return stale or even bogus data (e.g. when the exporter needs to
> +   copy the data to temporaray storage).
> +
> +   This step might fail, e.g. in oom conditions.
> +
> +2. Accessing the buffer
> +
> +   To support dma_buf objects residing in highmem cpu access is page-based using
> +   an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
> +   PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns
> +   a pointer in kernel virtual address space. Afterwards the chunk needs to be
> +   unmapped again. There is no limit on how often a given chunk can be mapped
> +   and unmmapped, i.e. the importer does not need to call begin_cpu_access again
> +   before mapping the same chunk again.
> +
> +   Interfaces:
> +      void *dma_buf_kmap(struct dma_buf *, unsigned long);
> +      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
> +
> +   There are also atomic variants of these interfaces. Like for kmap they
> +   facilitate non-blocking fast-paths. Neither the importer nor the exporter (in
> +   the callback) is allowed to block when using these.
> +
> +   Interfaces:
> +      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
> +      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
> +
> +   For importers all the restrictions of using kmap apply, like the limited
> +   supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
> +   atomic dma_buf kmaps at the same time (in any given process context).
> +
> +   dma_buf kmap calls outside of the range specified in begin_cpu_access are
> +   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
> +   the partial chunks at the beginning and end but may return stale or bogus
> +   data outside of the range (in these partial chunks).
> +
> +   Note that these calls need to always succeed. The exporter needs to complete
> +   any preparations that might fail in begin_cpu_access.
> +
> +3. Finish access
> +
> +   When the importer is done accessing the range specified in begin_cpu_acces,
> +   it needs to announce this to the exporter (to facilitate cache flushing and
> +   unpinning of any pinned resources). The result of of any dma_buf kmap calls
> +   after end_cpu_access is undefined.
> +
> +   Interface:
> +      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> +				  size_t start, size_t len,
> +				  enum dma_data_direction dir);
> +
> +
> +Miscellaneous notes
> +-------------------
> +
>  - Any exporters or users of the dma-buf buffer sharing framework must have
>    a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.

Kind regards,

-- 
Sakari Ailus
e-mail: sakari.ailus@iki.fi	jabber/XMPP/Gmail: sailus@retiisi.org.uk


* Re: [Linaro-mm-sig] [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
  2012-03-03  0:23   ` Sakari Ailus
@ 2012-03-05 18:48     ` Clark, Rob
  2012-03-05 21:39     ` Daniel Vetter
  1 sibling, 0 replies; 15+ messages in thread
From: Clark, Rob @ 2012-03-05 18:48 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: Daniel Vetter, linaro-mm-sig, LKML, DRI Development, linux-media

On Fri, Mar 2, 2012 at 6:23 PM, Sakari Ailus <sakari.ailus@iki.fi> wrote:
> Hi Daniel,
>
> Thanks for the patch.
>
> On Thu, Mar 01, 2012 at 04:36:01PM +0100, Daniel Vetter wrote:
>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> ---
>>  Documentation/dma-buf-sharing.txt |  102 +++++++++++++++++++++++++++++++++++-
>>  1 files changed, 99 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>> index 225f96d..f12542b 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -32,8 +32,12 @@ The buffer-user
>>  *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
>>  For this first version, A buffer shared using the dma_buf sharing API:
>>  - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
>> -   this framework.
>> -- may be used *ONLY* by importers that do not need CPU access to the buffer.
>> +  this framework.
>> +- with this new iteration of the dma-buf api cpu access from the kernel has been
>> +  enable, see below for the details.
>> +
>> +dma-buf operations for device dma only
>> +--------------------------------------
>>
>>  The dma_buf buffer sharing API usage contains the following steps:
>>
>> @@ -219,7 +223,99 @@ NOTES:
>>     If the exporter chooses not to allow an attach() operation once a
>>     map_dma_buf() API has been called, it simply returns an error.
>>
>> -Miscellaneous notes:
>> +Kernel cpu access to a dma-buf buffer object
>> +--------------------------------------------
>> +
>> +The motivation to allow cpu access from the kernel to a dma-buf object from the
>> +importers side are:
>> +- fallback operations, e.g. if the devices is connected to a usb bus and the
>> +  kernel needs to shuffle the data around first before sending it away.
>> +- full transperancy for existing users on the importer side, i.e. userspace
>> +  should not notice the difference between a normal object from that subsystem
>> +  and an imported one backed by a dma-buf. This is really important for drm
>> +  opengl drivers that expect to still use all the existing upload/download
>> +  paths.
>> +
>> +Access to a dma_buf from the kernel context involves three steps:
>> +
>> +1. Prepare access, which invalidate any necessary caches and make the object
>> +   available for cpu access.
>> +2. Access the object page-by-page with the dma_buf map apis
>> +3. Finish access, which will flush any necessary cpu caches and free reserved
>> +   resources.
>
> Where it should be decided which operations are being done to the buffer
> when it is passed to user space and back to kernel space?
>
> How about spliting these operations to those done on the first time the
> buffer is passed to the user space (mapping to kernel address space, for
> example) and those required every time buffer is passed from kernel to user
> and back (cache flusing)?
>
> I'm asking since any unnecessary time-consuming operations, especially as
> heavy as mapping the buffer, should be avoidable in subsystems dealing
> with streaming video, cameras etc., i.e. non-GPU users.


Well, this is really something for the buffer exporter to deal with..
since there is no way for an importer to create a userspace mmap'ing
of the buffer.  A lot of these expensive operations go away if you
don't even create a userspace virtual mapping in the first place ;-)

BR,
-R

>
>> +1. Prepare access
>> +
>> +   Before an importer can access a dma_buf object with the cpu from the kernel
>> +   context, it needs to notify the exporter of the access that is about to
>> +   happen.
>> +
>> +   Interface:
>> +      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>> +                                size_t start, size_t len,
>> +                                enum dma_data_direction direction)
>> +
>> +   This allows the exporter to ensure that the memory is actually available for
>> +   cpu access - the exporter might need to allocate or swap-in and pin the
>> +   backing storage. The exporter also needs to ensure that cpu access is
>> +   coherent for the given range and access direction. The range and access
>> +   direction can be used by the exporter to optimize the cache flushing, i.e.
>> +   access outside of the range or with a different direction (read instead of
>> +   write) might return stale or even bogus data (e.g. when the exporter needs to
>> +   copy the data to temporary storage).
>> +
>> +   This step might fail, e.g. in oom conditions.
>> +
>> +2. Accessing the buffer
>> +
>> +   To support dma_buf objects residing in highmem, cpu access is page-based using
>> +   an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
>> +   PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns
>> +   a pointer in kernel virtual address space. Afterwards the chunk needs to be
>> +   unmapped again. There is no limit on how often a given chunk can be mapped
>> +   and unmapped, i.e. the importer does not need to call begin_cpu_access again
>> +   before mapping the same chunk again.
>> +
>> +   Interfaces:
>> +      void *dma_buf_kmap(struct dma_buf *, unsigned long);
>> +      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
>> +
>> +   There are also atomic variants of these interfaces. Like kmap_atomic, they
>> +   facilitate non-blocking fast-paths. Neither the importer nor the exporter
>> +   (in the callback) is allowed to block when using these.
>> +
>> +   Interfaces:
>> +      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
>> +      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
>> +
>> +   For importers, all the restrictions of using kmap apply, like the limited
>> +   supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
>> +   atomic dma_buf kmaps at the same time (in any given process context).
>> +
>> +   dma_buf kmap calls outside of the range specified in begin_cpu_access are
>> +   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
>> +   the partial chunks at the beginning and end but may return stale or bogus
>> +   data outside of the range (in these partial chunks).
>> +
>> +   Note that these calls need to always succeed. The exporter needs to complete
>> +   any preparations that might fail in begin_cpu_access.
>> +
>> +3. Finish access
>> +
>> +   When the importer is done accessing the range specified in begin_cpu_access,
>> +   it needs to announce this to the exporter (to facilitate cache flushing and
>> +   unpinning of any pinned resources). The result of any dma_buf kmap calls
>> +   after end_cpu_access is undefined.
>> +
>> +   Interface:
>> +      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>> +                               size_t start, size_t len,
>> +                               enum dma_data_direction dir);
>> +
>> +
>> +Miscellaneous notes
>> +-------------------
>> +
>>  - Any exporters or users of the dma-buf buffer sharing framework must have
>>    a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
>
> Kind regards,
>
> --
> Sakari Ailus
> e-mail: sakari.ailus@iki.fi     jabber/XMPP/Gmail: sailus@retiisi.org.uk
>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-02 22:24   ` Rob Clark
@ 2012-03-05 18:57     ` Daniel Vetter
  2012-03-06 10:33       ` Semwal, Sumit
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2012-03-05 18:57 UTC (permalink / raw)
  To: Rob Clark; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

On Fri, Mar 2, 2012 at 23:24, Rob Clark <robdclark@gmail.com> wrote:
> Perhaps we should check somewhere for required dmabuf ops fxns (like
> kmap_atomic here), rather than just calling unconditionally what might
> be a null ptr.  At least put it in the WARN_ON(), but it might be
> nicer to catch a missing required fxn at export time, rather than
> waiting for an importer to try and call it.  Less likely that way for
> newly added required functions to go unnoticed.
>
> (same comment applies below for the non-atomic variant.. and possibly
> some other existing dmabuf ops)

Agreed, I'll rework the patch to do that when rebasing onto Sumit's latest tree.
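
Roughly something along these lines in dma_buf_export(), I'd guess (a
sketch only; which ops end up mandatory is still to be settled):

   /* Sketch: reject exporters missing required ops at export time
    * (hypothetical list, to be adjusted in the reworked patch). */
   if (WARN_ON(!ops || !ops->map_dma_buf || !ops->unmap_dma_buf ||
               !ops->release || !ops->kmap || !ops->kmap_atomic))
           return ERR_PTR(-EINVAL);

With that in place importers can rely on the required ops being
non-NULL and the runtime WARN_ON()s shrink accordingly.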
-Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-02 22:53     ` Rob Clark
@ 2012-03-05 21:31       ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-03-05 21:31 UTC (permalink / raw)
  To: Rob Clark; +Cc: Chris Wilson, linaro-mm-sig, LKML, DRI Development, linux-media

On Fri, Mar 2, 2012 at 23:53, Rob Clark <rob.clark@linaro.org> wrote:
> Initially the expectation was that userspace would not pass a buffer
> to multiple subsystems for writing (or that if it did, it would get
> the undefined results that one could expect)..  so dealing w/
> synchronization was punted.

Imo synchronization should not be part of the dma_buf core, i.e.
userspace needs to ensure that access is synchronized.
begin/end_cpu_access are the coherency brackets (like map/unmap for
device dma). And if userspace asks for a gun and some bullets, the
kernel should just deliver. Even in drm/i915 gem land we don't (and
simply can't) make any promises about concurrent reads/writes/ioctls.

> I expect, though, that one of the next steps is some sort of
> sync-object mechanism to supplement dmabuf

Imo the only reason to add sync objects as explicit things is to make
device-to-device sync more efficient by using hw semaphores and
signalling lines. Or maybe a quick irq handler in the kernel that
kicks off the next device. I don't think we should design these to make
userspace simpler.

Cheers, Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
  2012-03-03  0:23   ` Sakari Ailus
  2012-03-05 18:48     ` [Linaro-mm-sig] " Clark, Rob
@ 2012-03-05 21:39     ` Daniel Vetter
  1 sibling, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-03-05 21:39 UTC (permalink / raw)
  To: Sakari Ailus; +Cc: linaro-mm-sig, LKML, DRI Development, linux-media

On Sat, Mar 3, 2012 at 01:23, Sakari Ailus <sakari.ailus@iki.fi> wrote:
> Where should it be decided which operations are being done to the buffer
> when it is passed to user space and back to kernel space?
>
> How about splitting these operations into those done the first time the
> buffer is passed to the user space (mapping to kernel address space, for
> example) and those required every time the buffer is passed from kernel to
> user and back (cache flushing)?
>
> I'm asking since any unnecessary time-consuming operations, especially as
> heavy as mapping the buffer, should be avoidable in subsystems dealing
> with streaming video, cameras etc., i.e. non-GPU users.

I'm a bit confused about your comments because this interface
extension doesn't support userspace mmap. So userspace isn't even part
of the picture. Adding mmap support is something entirely different
imo, and I have no idea yet how we should handle cache coherency for
that.
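
To illustrate the kernel-only flow: an importer copying a buffer's
contents out would bracket its access roughly like this (a sketch only,
assuming a PAGE_SIZE-aligned range and skipping error unwinding):

   static int copy_buf_to_kernel(struct dma_buf *buf, void *dst, size_t len)
   {
           size_t off;
           int ret;

           /* step 1: let the exporter prepare the backing storage */
           ret = dma_buf_begin_cpu_access(buf, 0, len, DMA_FROM_DEVICE);
           if (ret)
                   return ret;

           /* step 2: access the object in PAGE_SIZE chunks */
           for (off = 0; off < len; off += PAGE_SIZE) {
                   void *vaddr = dma_buf_kmap(buf, off >> PAGE_SHIFT);

                   memcpy(dst + off, vaddr,
                          min_t(size_t, PAGE_SIZE, len - off));
                   dma_buf_kunmap(buf, off >> PAGE_SHIFT, vaddr);
           }

           /* step 3: flush caches, unpin resources */
           dma_buf_end_cpu_access(buf, 0, len, DMA_FROM_DEVICE);
           return 0;
   }

No userspace virtual mapping shows up anywhere in that flow.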

Yours, Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] dma-buf: add support for kernel cpu access
  2012-03-05 18:57     ` Daniel Vetter
@ 2012-03-06 10:33       ` Semwal, Sumit
  0 siblings, 0 replies; 15+ messages in thread
From: Semwal, Sumit @ 2012-03-06 10:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, linaro-mm-sig, LKML, DRI Development, linux-media

Hi Daniel,
On Tue, Mar 6, 2012 at 12:27 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Fri, Mar 2, 2012 at 23:24, Rob Clark <robdclark@gmail.com> wrote:
>> Perhaps we should check somewhere for required dmabuf ops fxns (like
>> kmap_atomic here), rather than just calling unconditionally what might
>> be a null ptr.  At least put it in the WARN_ON(), but it might be
>> nicer to catch a missing required fxn at export time, rather than
>> waiting for an importer to try and call it.  Less likely that way for
>> newly added required functions to go unnoticed.
>>
>> (same comment applies below for the non-atomic variant.. and possibly
>> some other existing dmabuf ops)
>
> Agreed, I'll rework the patch to do that when rebasing onto Sumit's latest tree.
In addition, you'd not need to check for !dmabuf->ops since the export
should already catch it.
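
With begin_cpu_access made a required op at export (an assumption, see
above), the entry point could then shrink to something like this sketch:

   int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
                                size_t len, enum dma_data_direction dir)
   {
           if (WARN_ON(!dmabuf))
                   return -EINVAL;

           /* ops and required members validated at export time */
           return dmabuf->ops->begin_cpu_access(dmabuf, start, len, dir);
   }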

As I sent in the other mail a while back, could you please rebase on
for-next at git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git

Best regards,
~Sumit.
> -Daniel
> --

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-03-06 10:33 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-01 15:35 [PATCH 0/3] [RFC] kernel cpu access support for dma_buf Daniel Vetter
2012-03-01 15:35 ` [PATCH 1/3] dma-buf: don't hold the mutex around map/unmap calls Daniel Vetter
2012-03-02 21:26   ` Rob Clark
2012-03-01 15:36 ` [PATCH 2/3] dma-buf: add support for kernel cpu access Daniel Vetter
2012-03-02 22:24   ` Rob Clark
2012-03-05 18:57     ` Daniel Vetter
2012-03-06 10:33       ` Semwal, Sumit
2012-03-02 22:38   ` [Linaro-mm-sig] " Chris Wilson
2012-03-02 22:53     ` Rob Clark
2012-03-05 21:31       ` Daniel Vetter
2012-03-01 15:36 ` [PATCH 3/3] dma_buf: Add documentation for the new cpu access support Daniel Vetter
2012-03-02 22:34   ` Rob Clark
2012-03-03  0:23   ` Sakari Ailus
2012-03-05 18:48     ` [Linaro-mm-sig] " Clark, Rob
2012-03-05 21:39     ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).