linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] media: venus: use contig vb2 ops
@ 2020-12-14 12:57 Alexandre Courbot
  2020-12-15  9:02 ` Tomasz Figa
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Alexandre Courbot @ 2020-12-14 12:57 UTC (permalink / raw)
  To: Stanimir Varbanov, Tomasz Figa, Fritz Koenig
  Cc: linux-media, linux-kernel, linux-arm-msm, Alexandre Courbot

This driver uses the SG vb2 ops, but effectively only ever accesses the
first entry of the SG table, indicating that it expects a flat layout.
Switch it to use the contiguous ops to make sure this expected invariant
is always enforced. Since the device is supposed to be behind an IOMMU
this should have little to none practical consequences beyond making the
driver not rely on a particular behavior of the SG implementation.

Reported-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
Hi everyone,

It probably doesn't hurt to fix this issue before some actual issue happens.
I have tested this patch on Chrome OS and playback was just as fine as with
the SG ops.

 drivers/media/platform/Kconfig              | 2 +-
 drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
 drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
 drivers/media/platform/qcom/venus/venc.c    | 6 +++---
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index 35a18d388f3f..d9d7954111f2 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
 	depends on INTERCONNECT || !INTERCONNECT
 	select QCOM_MDT_LOADER if ARCH_QCOM
 	select QCOM_SCM if ARCH_QCOM
-	select VIDEOBUF2_DMA_SG
+	select VIDEOBUF2_DMA_CONTIG
 	select V4L2_MEM2MEM_DEV
 	help
 	  This is a V4L2 driver for Qualcomm Venus video accelerator
diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
index 50439eb1ffea..859d260f002b 100644
--- a/drivers/media/platform/qcom/venus/helpers.c
+++ b/drivers/media/platform/qcom/venus/helpers.c
@@ -7,7 +7,7 @@
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/kernel.h>
-#include <media/videobuf2-dma-sg.h>
+#include <media/videobuf2-dma-contig.h>
 #include <media/v4l2-mem2mem.h>
 #include <asm/div64.h>
 
@@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
 	struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
 	struct venus_buffer *buf = to_venus_buffer(vbuf);
-	struct sg_table *sgt;
-
-	sgt = vb2_dma_sg_plane_desc(vb, 0);
-	if (!sgt)
-		return -EFAULT;
 
 	buf->size = vb2_plane_size(vb, 0);
-	buf->dma_addr = sg_dma_address(sgt->sgl);
+	buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
 
 	if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
 		list_add_tail(&buf->reg_list, &inst->registeredbufs);
diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
index 8488411204c3..3fb277c81aca 100644
--- a/drivers/media/platform/qcom/venus/vdec.c
+++ b/drivers/media/platform/qcom/venus/vdec.c
@@ -13,7 +13,7 @@
 #include <media/v4l2-event.h>
 #include <media/v4l2-ctrls.h>
 #include <media/v4l2-mem2mem.h>
-#include <media/videobuf2-dma-sg.h>
+#include <media/videobuf2-dma-contig.h>
 
 #include "hfi_venus_io.h"
 #include "hfi_parser.h"
@@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
 	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
 	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	src_vq->ops = &vdec_vb2_ops;
-	src_vq->mem_ops = &vb2_dma_sg_memops;
+	src_vq->mem_ops = &vb2_dma_contig_memops;
 	src_vq->drv_priv = inst;
 	src_vq->buf_struct_size = sizeof(struct venus_buffer);
 	src_vq->allow_zero_bytesused = 1;
@@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
 	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	dst_vq->ops = &vdec_vb2_ops;
-	dst_vq->mem_ops = &vb2_dma_sg_memops;
+	dst_vq->mem_ops = &vb2_dma_contig_memops;
 	dst_vq->drv_priv = inst;
 	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
 	dst_vq->allow_zero_bytesused = 1;
diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
index 1c61602c5de1..a09550cd1dba 100644
--- a/drivers/media/platform/qcom/venus/venc.c
+++ b/drivers/media/platform/qcom/venus/venc.c
@@ -10,7 +10,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/slab.h>
 #include <media/v4l2-mem2mem.h>
-#include <media/videobuf2-dma-sg.h>
+#include <media/videobuf2-dma-contig.h>
 #include <media/v4l2-ioctl.h>
 #include <media/v4l2-event.h>
 #include <media/v4l2-ctrls.h>
@@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
 	src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
 	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	src_vq->ops = &venc_vb2_ops;
-	src_vq->mem_ops = &vb2_dma_sg_memops;
+	src_vq->mem_ops = &vb2_dma_contig_memops;
 	src_vq->drv_priv = inst;
 	src_vq->buf_struct_size = sizeof(struct venus_buffer);
 	src_vq->allow_zero_bytesused = 1;
@@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
 	dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	dst_vq->ops = &venc_vb2_ops;
-	dst_vq->mem_ops = &vb2_dma_sg_memops;
+	dst_vq->mem_ops = &vb2_dma_contig_memops;
 	dst_vq->drv_priv = inst;
 	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
 	dst_vq->allow_zero_bytesused = 1;
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-14 12:57 [PATCH] media: venus: use contig vb2 ops Alexandre Courbot
@ 2020-12-15  9:02 ` Tomasz Figa
  2020-12-15 11:16 ` Stanimir Varbanov
  2021-03-01 19:59 ` patchwork-bot+linux-arm-msm
  2 siblings, 0 replies; 13+ messages in thread
From: Tomasz Figa @ 2020-12-15  9:02 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Stanimir Varbanov, Fritz Koenig, Linux Media Mailing List,
	Linux Kernel Mailing List, linux-arm-msm

On Mon, Dec 14, 2020 at 9:57 PM Alexandre Courbot <acourbot@chromium.org> wrote:
>
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant
> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to none practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
>
> Reported-by: Tomasz Figa <tfiga@chromium.org>
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
> Hi everyone,
>
> It probably doesn't hurt to fix this issue before some actual issue happens.
> I have tested this patch on Chrome OS and playback was just as fine as with
> the SG ops.
>
>  drivers/media/platform/Kconfig              | 2 +-
>  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
>  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
>  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
>  4 files changed, 9 insertions(+), 14 deletions(-)
>

Reviewed-by: Tomasz Figa <tfiga@chromium.org>

Best regards,
Tomasz

> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index 35a18d388f3f..d9d7954111f2 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>         depends on INTERCONNECT || !INTERCONNECT
>         select QCOM_MDT_LOADER if ARCH_QCOM
>         select QCOM_SCM if ARCH_QCOM
> -       select VIDEOBUF2_DMA_SG
> +       select VIDEOBUF2_DMA_CONTIG
>         select V4L2_MEM2MEM_DEV
>         help
>           This is a V4L2 driver for Qualcomm Venus video accelerator
> diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
> index 50439eb1ffea..859d260f002b 100644
> --- a/drivers/media/platform/qcom/venus/helpers.c
> +++ b/drivers/media/platform/qcom/venus/helpers.c
> @@ -7,7 +7,7 @@
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
>  #include <linux/kernel.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>  #include <media/v4l2-mem2mem.h>
>  #include <asm/div64.h>
>
> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>         struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>         struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>         struct venus_buffer *buf = to_venus_buffer(vbuf);
> -       struct sg_table *sgt;
> -
> -       sgt = vb2_dma_sg_plane_desc(vb, 0);
> -       if (!sgt)
> -               return -EFAULT;
>
>         buf->size = vb2_plane_size(vb, 0);
> -       buf->dma_addr = sg_dma_address(sgt->sgl);
> +       buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>
>         if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>                 list_add_tail(&buf->reg_list, &inst->registeredbufs);
> diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
> index 8488411204c3..3fb277c81aca 100644
> --- a/drivers/media/platform/qcom/venus/vdec.c
> +++ b/drivers/media/platform/qcom/venus/vdec.c
> @@ -13,7 +13,7 @@
>  #include <media/v4l2-event.h>
>  #include <media/v4l2-ctrls.h>
>  #include <media/v4l2-mem2mem.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>
>  #include "hfi_venus_io.h"
>  #include "hfi_parser.h"
> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>         src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>         src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>         src_vq->ops = &vdec_vb2_ops;
> -       src_vq->mem_ops = &vb2_dma_sg_memops;
> +       src_vq->mem_ops = &vb2_dma_contig_memops;
>         src_vq->drv_priv = inst;
>         src_vq->buf_struct_size = sizeof(struct venus_buffer);
>         src_vq->allow_zero_bytesused = 1;
> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>         dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>         dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>         dst_vq->ops = &vdec_vb2_ops;
> -       dst_vq->mem_ops = &vb2_dma_sg_memops;
> +       dst_vq->mem_ops = &vb2_dma_contig_memops;
>         dst_vq->drv_priv = inst;
>         dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>         dst_vq->allow_zero_bytesused = 1;
> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> index 1c61602c5de1..a09550cd1dba 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -10,7 +10,7 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/slab.h>
>  #include <media/v4l2-mem2mem.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>  #include <media/v4l2-ioctl.h>
>  #include <media/v4l2-event.h>
>  #include <media/v4l2-ctrls.h>
> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>         src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>         src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>         src_vq->ops = &venc_vb2_ops;
> -       src_vq->mem_ops = &vb2_dma_sg_memops;
> +       src_vq->mem_ops = &vb2_dma_contig_memops;
>         src_vq->drv_priv = inst;
>         src_vq->buf_struct_size = sizeof(struct venus_buffer);
>         src_vq->allow_zero_bytesused = 1;
> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>         dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>         dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>         dst_vq->ops = &venc_vb2_ops;
> -       dst_vq->mem_ops = &vb2_dma_sg_memops;
> +       dst_vq->mem_ops = &vb2_dma_contig_memops;
>         dst_vq->drv_priv = inst;
>         dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>         dst_vq->allow_zero_bytesused = 1;
> --
> 2.29.2.684.gfbc64c5ab5-goog
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-14 12:57 [PATCH] media: venus: use contig vb2 ops Alexandre Courbot
  2020-12-15  9:02 ` Tomasz Figa
@ 2020-12-15 11:16 ` Stanimir Varbanov
  2020-12-15 11:34   ` Robin Murphy
  2020-12-15 11:47   ` Tomasz Figa
  2021-03-01 19:59 ` patchwork-bot+linux-arm-msm
  2 siblings, 2 replies; 13+ messages in thread
From: Stanimir Varbanov @ 2020-12-15 11:16 UTC (permalink / raw)
  To: Alexandre Courbot, Stanimir Varbanov, Tomasz Figa, Fritz Koenig
  Cc: linux-media, linux-kernel, linux-arm-msm, Robin Murphy

Hi,

Cc: Robin

On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant

Under what circumstances the sg table will has nents > 1? I came down to
[1] but not sure I got it right.

I'm afraid that for systems with low amount of system memory and when
the memory become fragmented, the driver will not work. That's why I
started with sg allocator.

[1]
https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782

> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to none practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
> 
> Reported-by: Tomasz Figa <tfiga@chromium.org>
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
> Hi everyone,
> 
> It probably doesn't hurt to fix this issue before some actual issue happens.
> I have tested this patch on Chrome OS and playback was just as fine as with
> the SG ops.
> 
>  drivers/media/platform/Kconfig              | 2 +-
>  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
>  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
>  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
>  4 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index 35a18d388f3f..d9d7954111f2 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>  	depends on INTERCONNECT || !INTERCONNECT
>  	select QCOM_MDT_LOADER if ARCH_QCOM
>  	select QCOM_SCM if ARCH_QCOM
> -	select VIDEOBUF2_DMA_SG
> +	select VIDEOBUF2_DMA_CONTIG
>  	select V4L2_MEM2MEM_DEV
>  	help
>  	  This is a V4L2 driver for Qualcomm Venus video accelerator
> diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
> index 50439eb1ffea..859d260f002b 100644
> --- a/drivers/media/platform/qcom/venus/helpers.c
> +++ b/drivers/media/platform/qcom/venus/helpers.c
> @@ -7,7 +7,7 @@
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
>  #include <linux/kernel.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>  #include <media/v4l2-mem2mem.h>
>  #include <asm/div64.h>
>  
> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>  	struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>  	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>  	struct venus_buffer *buf = to_venus_buffer(vbuf);
> -	struct sg_table *sgt;
> -
> -	sgt = vb2_dma_sg_plane_desc(vb, 0);
> -	if (!sgt)
> -		return -EFAULT;
>  
>  	buf->size = vb2_plane_size(vb, 0);
> -	buf->dma_addr = sg_dma_address(sgt->sgl);

Can we do it:

	if (WARN_ON(sgt->nents > 1))
		return -EFAULT;

I understand that logically using dma-sg when the flat layout is
expected by the hardware is wrong, but I haven't seen issues until now.

> +	buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>  
>  	if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>  		list_add_tail(&buf->reg_list, &inst->registeredbufs);
> diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
> index 8488411204c3..3fb277c81aca 100644
> --- a/drivers/media/platform/qcom/venus/vdec.c
> +++ b/drivers/media/platform/qcom/venus/vdec.c
> @@ -13,7 +13,7 @@
>  #include <media/v4l2-event.h>
>  #include <media/v4l2-ctrls.h>
>  #include <media/v4l2-mem2mem.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>  
>  #include "hfi_venus_io.h"
>  #include "hfi_parser.h"
> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>  	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>  	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>  	src_vq->ops = &vdec_vb2_ops;
> -	src_vq->mem_ops = &vb2_dma_sg_memops;
> +	src_vq->mem_ops = &vb2_dma_contig_memops;
>  	src_vq->drv_priv = inst;
>  	src_vq->buf_struct_size = sizeof(struct venus_buffer);
>  	src_vq->allow_zero_bytesused = 1;
> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>  	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>  	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>  	dst_vq->ops = &vdec_vb2_ops;
> -	dst_vq->mem_ops = &vb2_dma_sg_memops;
> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
>  	dst_vq->drv_priv = inst;
>  	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>  	dst_vq->allow_zero_bytesused = 1;
> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> index 1c61602c5de1..a09550cd1dba 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -10,7 +10,7 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/slab.h>
>  #include <media/v4l2-mem2mem.h>
> -#include <media/videobuf2-dma-sg.h>
> +#include <media/videobuf2-dma-contig.h>
>  #include <media/v4l2-ioctl.h>
>  #include <media/v4l2-event.h>
>  #include <media/v4l2-ctrls.h>
> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>  	src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>  	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>  	src_vq->ops = &venc_vb2_ops;
> -	src_vq->mem_ops = &vb2_dma_sg_memops;
> +	src_vq->mem_ops = &vb2_dma_contig_memops;
>  	src_vq->drv_priv = inst;
>  	src_vq->buf_struct_size = sizeof(struct venus_buffer);
>  	src_vq->allow_zero_bytesused = 1;
> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>  	dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>  	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>  	dst_vq->ops = &venc_vb2_ops;
> -	dst_vq->mem_ops = &vb2_dma_sg_memops;
> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
>  	dst_vq->drv_priv = inst;
>  	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>  	dst_vq->allow_zero_bytesused = 1;
> 

-- 
regards,
Stan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 11:16 ` Stanimir Varbanov
@ 2020-12-15 11:34   ` Robin Murphy
  2020-12-15 11:47   ` Tomasz Figa
  1 sibling, 0 replies; 13+ messages in thread
From: Robin Murphy @ 2020-12-15 11:34 UTC (permalink / raw)
  To: Stanimir Varbanov, Alexandre Courbot, Tomasz Figa, Fritz Koenig
  Cc: linux-media, linux-kernel, linux-arm-msm

On 2020-12-15 11:16, Stanimir Varbanov wrote:
> Hi,
> 
> Cc: Robin
> 
> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>> first entry of the SG table, indicating that it expects a flat layout.
>> Switch it to use the contiguous ops to make sure this expected invariant
> 
> Under what circumstances the sg table will has nents > 1? I came down to
> [1] but not sure I got it right.
> 
> I'm afraid that for systems with low amount of system memory and when
> the memory become fragmented, the driver will not work. That's why I
> started with sg allocator.

Despite what videobuf-dma-contig seems to assume, dma_alloc_coherent() 
makes no guarantee of providing physically contiguous memory. What it 
provides is *DMA contiguous* memory, i.e. from the point of view of the 
device. When an IOMMU is present and managed by the DMA API, such 
buffers may be assembled out of physically-scattered pages (particularly 
under memory pressure/fragmentation).

Robin.

> 
> [1]
> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> 
>> is always enforced. Since the device is supposed to be behind an IOMMU
>> this should have little to none practical consequences beyond making the
>> driver not rely on a particular behavior of the SG implementation.
>>
>> Reported-by: Tomasz Figa <tfiga@chromium.org>
>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>> ---
>> Hi everyone,
>>
>> It probably doesn't hurt to fix this issue before some actual issue happens.
>> I have tested this patch on Chrome OS and playback was just as fine as with
>> the SG ops.
>>
>>   drivers/media/platform/Kconfig              | 2 +-
>>   drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
>>   drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
>>   drivers/media/platform/qcom/venus/venc.c    | 6 +++---
>>   4 files changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
>> index 35a18d388f3f..d9d7954111f2 100644
>> --- a/drivers/media/platform/Kconfig
>> +++ b/drivers/media/platform/Kconfig
>> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>>   	depends on INTERCONNECT || !INTERCONNECT
>>   	select QCOM_MDT_LOADER if ARCH_QCOM
>>   	select QCOM_SCM if ARCH_QCOM
>> -	select VIDEOBUF2_DMA_SG
>> +	select VIDEOBUF2_DMA_CONTIG
>>   	select V4L2_MEM2MEM_DEV
>>   	help
>>   	  This is a V4L2 driver for Qualcomm Venus video accelerator
>> diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
>> index 50439eb1ffea..859d260f002b 100644
>> --- a/drivers/media/platform/qcom/venus/helpers.c
>> +++ b/drivers/media/platform/qcom/venus/helpers.c
>> @@ -7,7 +7,7 @@
>>   #include <linux/mutex.h>
>>   #include <linux/slab.h>
>>   #include <linux/kernel.h>
>> -#include <media/videobuf2-dma-sg.h>
>> +#include <media/videobuf2-dma-contig.h>
>>   #include <media/v4l2-mem2mem.h>
>>   #include <asm/div64.h>
>>   
>> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>>   	struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>>   	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>>   	struct venus_buffer *buf = to_venus_buffer(vbuf);
>> -	struct sg_table *sgt;
>> -
>> -	sgt = vb2_dma_sg_plane_desc(vb, 0);
>> -	if (!sgt)
>> -		return -EFAULT;
>>   
>>   	buf->size = vb2_plane_size(vb, 0);
>> -	buf->dma_addr = sg_dma_address(sgt->sgl);
> 
> Can we do it:
> 
> 	if (WARN_ON(sgt->nents > 1))
> 		return -EFAULT;
> 
> I understand that logically using dma-sg when the flat layout is
> expected by the hardware is wrong, but I haven't seen issues until now.
> 
>> +	buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>   
>>   	if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>>   		list_add_tail(&buf->reg_list, &inst->registeredbufs);
>> diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
>> index 8488411204c3..3fb277c81aca 100644
>> --- a/drivers/media/platform/qcom/venus/vdec.c
>> +++ b/drivers/media/platform/qcom/venus/vdec.c
>> @@ -13,7 +13,7 @@
>>   #include <media/v4l2-event.h>
>>   #include <media/v4l2-ctrls.h>
>>   #include <media/v4l2-mem2mem.h>
>> -#include <media/videobuf2-dma-sg.h>
>> +#include <media/videobuf2-dma-contig.h>
>>   
>>   #include "hfi_venus_io.h"
>>   #include "hfi_parser.h"
>> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>   	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>   	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>   	src_vq->ops = &vdec_vb2_ops;
>> -	src_vq->mem_ops = &vb2_dma_sg_memops;
>> +	src_vq->mem_ops = &vb2_dma_contig_memops;
>>   	src_vq->drv_priv = inst;
>>   	src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>   	src_vq->allow_zero_bytesused = 1;
>> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>   	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>   	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>   	dst_vq->ops = &vdec_vb2_ops;
>> -	dst_vq->mem_ops = &vb2_dma_sg_memops;
>> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
>>   	dst_vq->drv_priv = inst;
>>   	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>   	dst_vq->allow_zero_bytesused = 1;
>> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
>> index 1c61602c5de1..a09550cd1dba 100644
>> --- a/drivers/media/platform/qcom/venus/venc.c
>> +++ b/drivers/media/platform/qcom/venus/venc.c
>> @@ -10,7 +10,7 @@
>>   #include <linux/pm_runtime.h>
>>   #include <linux/slab.h>
>>   #include <media/v4l2-mem2mem.h>
>> -#include <media/videobuf2-dma-sg.h>
>> +#include <media/videobuf2-dma-contig.h>
>>   #include <media/v4l2-ioctl.h>
>>   #include <media/v4l2-event.h>
>>   #include <media/v4l2-ctrls.h>
>> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>   	src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>   	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>   	src_vq->ops = &venc_vb2_ops;
>> -	src_vq->mem_ops = &vb2_dma_sg_memops;
>> +	src_vq->mem_ops = &vb2_dma_contig_memops;
>>   	src_vq->drv_priv = inst;
>>   	src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>   	src_vq->allow_zero_bytesused = 1;
>> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>   	dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>   	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>   	dst_vq->ops = &venc_vb2_ops;
>> -	dst_vq->mem_ops = &vb2_dma_sg_memops;
>> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
>>   	dst_vq->drv_priv = inst;
>>   	dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>   	dst_vq->allow_zero_bytesused = 1;
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 11:16 ` Stanimir Varbanov
  2020-12-15 11:34   ` Robin Murphy
@ 2020-12-15 11:47   ` Tomasz Figa
  2020-12-15 13:32     ` Robin Murphy
  2020-12-15 13:54     ` Stanimir Varbanov
  1 sibling, 2 replies; 13+ messages in thread
From: Tomasz Figa @ 2020-12-15 11:47 UTC (permalink / raw)
  To: Stanimir Varbanov
  Cc: Alexandre Courbot, Fritz Koenig, Linux Media Mailing List,
	Linux Kernel Mailing List, linux-arm-msm, Robin Murphy

On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
<stanimir.varbanov@linaro.org> wrote:
>
> Hi,
>
> Cc: Robin
>
> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > first entry of the SG table, indicating that it expects a flat layout.
> > Switch it to use the contiguous ops to make sure this expected invariant
>
> Under what circumstances the sg table will has nents > 1? I came down to
> [1] but not sure I got it right.
>
> I'm afraid that for systems with low amount of system memory and when
> the memory become fragmented, the driver will not work. That's why I
> started with sg allocator.

It is exactly the opposite. The vb2-dma-contig allocator is "contig"
in terms of the DMA (aka IOVA) address space. In other words, it
guarantees that having one DMA address and length fully describes the
buffer. This seems to be the requirement of the hardware/firmware
handled by the venus driver. If the device is behind an IOMMU, which
is the case for the SoCs in question, the underlying DMA ops will
actually allocate a discontiguous set of pages, so it has nothing to
do to system memory amount or fragmentation. If for some reason the
IOMMU can't be used, there is no way around, the memory needs to be
contiguous because of the hardware/firmware/driver expectation.

On the other hand, the vb2-dma-sg allocator doesn't have any
continuity guarantees for the DMA, or any other, address space. The
current code works fine, because it calls dma_map_sg() on the whole
set of pages and that ends up mapping it contiguously in the IOVA
space, but that's just an implementation detail, not an API guarantee.

Best regards,
Tomasz

>
> [1]
> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>
> > is always enforced. Since the device is supposed to be behind an IOMMU
> > this should have little to none practical consequences beyond making the
> > driver not rely on a particular behavior of the SG implementation.
> >
> > Reported-by: Tomasz Figa <tfiga@chromium.org>
> > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > ---
> > Hi everyone,
> >
> > It probably doesn't hurt to fix this issue before some actual issue happens.
> > I have tested this patch on Chrome OS and playback was just as fine as with
> > the SG ops.
> >
> >  drivers/media/platform/Kconfig              | 2 +-
> >  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
> >  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
> >  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
> >  4 files changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index 35a18d388f3f..d9d7954111f2 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> >       depends on INTERCONNECT || !INTERCONNECT
> >       select QCOM_MDT_LOADER if ARCH_QCOM
> >       select QCOM_SCM if ARCH_QCOM
> > -     select VIDEOBUF2_DMA_SG
> > +     select VIDEOBUF2_DMA_CONTIG
> >       select V4L2_MEM2MEM_DEV
> >       help
> >         This is a V4L2 driver for Qualcomm Venus video accelerator
> > diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
> > index 50439eb1ffea..859d260f002b 100644
> > --- a/drivers/media/platform/qcom/venus/helpers.c
> > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > @@ -7,7 +7,7 @@
> >  #include <linux/mutex.h>
> >  #include <linux/slab.h>
> >  #include <linux/kernel.h>
> > -#include <media/videobuf2-dma-sg.h>
> > +#include <media/videobuf2-dma-contig.h>
> >  #include <media/v4l2-mem2mem.h>
> >  #include <asm/div64.h>
> >
> > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
> >       struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> >       struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> >       struct venus_buffer *buf = to_venus_buffer(vbuf);
> > -     struct sg_table *sgt;
> > -
> > -     sgt = vb2_dma_sg_plane_desc(vb, 0);
> > -     if (!sgt)
> > -             return -EFAULT;
> >
> >       buf->size = vb2_plane_size(vb, 0);
> > -     buf->dma_addr = sg_dma_address(sgt->sgl);
>
> Can we do it:
>
>         if (WARN_ON(sgt->nents > 1))
>                 return -EFAULT;
>
> I understand that logically using dma-sg when the flat layout is
> expected by the hardware is wrong, but I haven't seen issues until now.
>
> > +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> >
> >       if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> >               list_add_tail(&buf->reg_list, &inst->registeredbufs);
> > diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
> > index 8488411204c3..3fb277c81aca 100644
> > --- a/drivers/media/platform/qcom/venus/vdec.c
> > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > @@ -13,7 +13,7 @@
> >  #include <media/v4l2-event.h>
> >  #include <media/v4l2-ctrls.h>
> >  #include <media/v4l2-mem2mem.h>
> > -#include <media/videobuf2-dma-sg.h>
> > +#include <media/videobuf2-dma-contig.h>
> >
> >  #include "hfi_venus_io.h"
> >  #include "hfi_parser.h"
> > @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
> >       src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> >       src_vq->ops = &vdec_vb2_ops;
> > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> >       src_vq->drv_priv = inst;
> >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> >       src_vq->allow_zero_bytesused = 1;
> > @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
> >       dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> >       dst_vq->ops = &vdec_vb2_ops;
> > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> >       dst_vq->drv_priv = inst;
> >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> >       dst_vq->allow_zero_bytesused = 1;
> > diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
> > index 1c61602c5de1..a09550cd1dba 100644
> > --- a/drivers/media/platform/qcom/venus/venc.c
> > +++ b/drivers/media/platform/qcom/venus/venc.c
> > @@ -10,7 +10,7 @@
> >  #include <linux/pm_runtime.h>
> >  #include <linux/slab.h>
> >  #include <media/v4l2-mem2mem.h>
> > -#include <media/videobuf2-dma-sg.h>
> > +#include <media/videobuf2-dma-contig.h>
> >  #include <media/v4l2-ioctl.h>
> >  #include <media/v4l2-event.h>
> >  #include <media/v4l2-ctrls.h>
> > @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
> >       src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> >       src_vq->ops = &venc_vb2_ops;
> > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> >       src_vq->drv_priv = inst;
> >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> >       src_vq->allow_zero_bytesused = 1;
> > @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
> >       dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> >       dst_vq->ops = &venc_vb2_ops;
> > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> >       dst_vq->drv_priv = inst;
> >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> >       dst_vq->allow_zero_bytesused = 1;
> >
>
> --
> regards,
> Stan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 11:47   ` Tomasz Figa
@ 2020-12-15 13:32     ` Robin Murphy
  2020-12-15 13:54     ` Stanimir Varbanov
  1 sibling, 0 replies; 13+ messages in thread
From: Robin Murphy @ 2020-12-15 13:32 UTC (permalink / raw)
  To: Tomasz Figa, Stanimir Varbanov
  Cc: Alexandre Courbot, Fritz Koenig, Linux Media Mailing List,
	Linux Kernel Mailing List, linux-arm-msm

On 2020-12-15 11:47, Tomasz Figa wrote:
> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> <stanimir.varbanov@linaro.org> wrote:
>>
>> Hi,
>>
>> Cc: Robin
>>
>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>>> first entry of the SG table, indicating that it expects a flat layout.
>>> Switch it to use the contiguous ops to make sure this expected invariant
>>
>> Under what circumstances the sg table will has nents > 1? I came down to
>> [1] but not sure I got it right.
>>
>> I'm afraid that for systems with low amount of system memory and when
>> the memory become fragmented, the driver will not work. That's why I
>> started with sg allocator.
> 
> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> in terms of the DMA (aka IOVA) address space. In other words, it
> guarantees that having one DMA address and length fully describes the
> buffer. This seems to be the requirement of the hardware/firmware
> handled by the venus driver. If the device is behind an IOMMU, which
> is the case for the SoCs in question, the underlying DMA ops will
> actually allocate a discontiguous set of pages, so it has nothing to
> do to system memory amount or fragmentation. If for some reason the
> IOMMU can't be used, there is no way around, the memory needs to be
> contiguous because of the hardware/firmware/driver expectation.
> 
> On the other hand, the vb2-dma-sg allocator doesn't have any
> continuity guarantees for the DMA, or any other, address space.

Yes, intuitively one would assume that the sg code was for devices with 
native scatter-gather capability that can deal with an actual set of 
buffer descriptors, rather than just a single pointer (which is the 
original purpose of scatterlists, after all). I've always been slightly 
puzzled why the two seem to be quite so similar.

> The
> current code works fine, because it calls dma_map_sg() on the whole
> set of pages and that ends up mapping it contiguously in the IOVA
> space, but that's just an implementation detail, not an API guarantee.

Oh, the fun we've had over that implementation detail! :P

Robin.

> Best regards,
> Tomasz
> 
>>
>> [1]
>> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>>
>>> is always enforced. Since the device is supposed to be behind an IOMMU
>>> this should have little to none practical consequences beyond making the
>>> driver not rely on a particular behavior of the SG implementation.
>>>
>>> Reported-by: Tomasz Figa <tfiga@chromium.org>
>>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>>> ---
>>> Hi everyone,
>>>
>>> It probably doesn't hurt to fix this issue before some actual issue happens.
>>> I have tested this patch on Chrome OS and playback was just as fine as with
>>> the SG ops.
>>>
>>>   drivers/media/platform/Kconfig              | 2 +-
>>>   drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
>>>   drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
>>>   drivers/media/platform/qcom/venus/venc.c    | 6 +++---
>>>   4 files changed, 9 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
>>> index 35a18d388f3f..d9d7954111f2 100644
>>> --- a/drivers/media/platform/Kconfig
>>> +++ b/drivers/media/platform/Kconfig
>>> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>>>        depends on INTERCONNECT || !INTERCONNECT
>>>        select QCOM_MDT_LOADER if ARCH_QCOM
>>>        select QCOM_SCM if ARCH_QCOM
>>> -     select VIDEOBUF2_DMA_SG
>>> +     select VIDEOBUF2_DMA_CONTIG
>>>        select V4L2_MEM2MEM_DEV
>>>        help
>>>          This is a V4L2 driver for Qualcomm Venus video accelerator
>>> diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
>>> index 50439eb1ffea..859d260f002b 100644
>>> --- a/drivers/media/platform/qcom/venus/helpers.c
>>> +++ b/drivers/media/platform/qcom/venus/helpers.c
>>> @@ -7,7 +7,7 @@
>>>   #include <linux/mutex.h>
>>>   #include <linux/slab.h>
>>>   #include <linux/kernel.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>   #include <media/v4l2-mem2mem.h>
>>>   #include <asm/div64.h>
>>>
>>> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>>>        struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>>>        struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>>>        struct venus_buffer *buf = to_venus_buffer(vbuf);
>>> -     struct sg_table *sgt;
>>> -
>>> -     sgt = vb2_dma_sg_plane_desc(vb, 0);
>>> -     if (!sgt)
>>> -             return -EFAULT;
>>>
>>>        buf->size = vb2_plane_size(vb, 0);
>>> -     buf->dma_addr = sg_dma_address(sgt->sgl);
>>
>> Can we do it:
>>
>>          if (WARN_ON(sgt->nents > 1))
>>                  return -EFAULT;
>>
>> I understand that logically using dma-sg when the flat layout is
>> expected by the hardware is wrong, but I haven't seen issues until now.
>>
>>> +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>
>>>        if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>>>                list_add_tail(&buf->reg_list, &inst->registeredbufs);
>>> diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
>>> index 8488411204c3..3fb277c81aca 100644
>>> --- a/drivers/media/platform/qcom/venus/vdec.c
>>> +++ b/drivers/media/platform/qcom/venus/vdec.c
>>> @@ -13,7 +13,7 @@
>>>   #include <media/v4l2-event.h>
>>>   #include <media/v4l2-ctrls.h>
>>>   #include <media/v4l2-mem2mem.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>
>>>   #include "hfi_venus_io.h"
>>>   #include "hfi_parser.h"
>>> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>        src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>>        src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>        src_vq->ops = &vdec_vb2_ops;
>>> -     src_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     src_vq->mem_ops = &vb2_dma_contig_memops;
>>>        src_vq->drv_priv = inst;
>>>        src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>        src_vq->allow_zero_bytesused = 1;
>>> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>        dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>>        dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>        dst_vq->ops = &vdec_vb2_ops;
>>> -     dst_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     dst_vq->mem_ops = &vb2_dma_contig_memops;
>>>        dst_vq->drv_priv = inst;
>>>        dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>        dst_vq->allow_zero_bytesused = 1;
>>> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
>>> index 1c61602c5de1..a09550cd1dba 100644
>>> --- a/drivers/media/platform/qcom/venus/venc.c
>>> +++ b/drivers/media/platform/qcom/venus/venc.c
>>> @@ -10,7 +10,7 @@
>>>   #include <linux/pm_runtime.h>
>>>   #include <linux/slab.h>
>>>   #include <media/v4l2-mem2mem.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>   #include <media/v4l2-ioctl.h>
>>>   #include <media/v4l2-event.h>
>>>   #include <media/v4l2-ctrls.h>
>>> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>        src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>>        src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>        src_vq->ops = &venc_vb2_ops;
>>> -     src_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     src_vq->mem_ops = &vb2_dma_contig_memops;
>>>        src_vq->drv_priv = inst;
>>>        src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>        src_vq->allow_zero_bytesused = 1;
>>> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>        dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>>        dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>        dst_vq->ops = &venc_vb2_ops;
>>> -     dst_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     dst_vq->mem_ops = &vb2_dma_contig_memops;
>>>        dst_vq->drv_priv = inst;
>>>        dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>        dst_vq->allow_zero_bytesused = 1;
>>>
>>
>> --
>> regards,
>> Stan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 11:47   ` Tomasz Figa
  2020-12-15 13:32     ` Robin Murphy
@ 2020-12-15 13:54     ` Stanimir Varbanov
  2020-12-15 19:21       ` Nicolas Dufresne
  1 sibling, 1 reply; 13+ messages in thread
From: Stanimir Varbanov @ 2020-12-15 13:54 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Alexandre Courbot, Fritz Koenig, Linux Media Mailing List,
	Linux Kernel Mailing List, linux-arm-msm, Robin Murphy

Hi Tomasz,

On 12/15/20 1:47 PM, Tomasz Figa wrote:
> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> <stanimir.varbanov@linaro.org> wrote:
>>
>> Hi,
>>
>> Cc: Robin
>>
>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>>> first entry of the SG table, indicating that it expects a flat layout.
>>> Switch it to use the contiguous ops to make sure this expected invariant
>>
>> Under what circumstances the sg table will has nents > 1? I came down to
>> [1] but not sure I got it right.
>>
>> I'm afraid that for systems with low amount of system memory and when
>> the memory become fragmented, the driver will not work. That's why I
>> started with sg allocator.
> 
> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> in terms of the DMA (aka IOVA) address space. In other words, it
> guarantees that having one DMA address and length fully describes the

Ahh, I missed that part. Looks like I misunderstood videobu2 contig
allocator.

> buffer. This seems to be the requirement of the hardware/firmware
> handled by the venus driver. If the device is behind an IOMMU, which
> is the case for the SoCs in question, the underlying DMA ops will
> actually allocate a discontiguous set of pages, so it has nothing to
> do to system memory amount or fragmentation. If for some reason the
> IOMMU can't be used, there is no way around, the memory needs to be
> contiguous because of the hardware/firmware/driver expectation.
> 
> On the other hand, the vb2-dma-sg allocator doesn't have any
> continuity guarantees for the DMA, or any other, address space. The
> current code works fine, because it calls dma_map_sg() on the whole
> set of pages and that ends up mapping it contiguously in the IOVA
> space, but that's just an implementation detail, not an API guarantee.

It was good to know. Thanks for the explanation.

> 
> Best regards,
> Tomasz
> 
>>
>> [1]
>> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>>
>>> is always enforced. Since the device is supposed to be behind an IOMMU
>>> this should have little to none practical consequences beyond making the
>>> driver not rely on a particular behavior of the SG implementation.
>>>
>>> Reported-by: Tomasz Figa <tfiga@chromium.org>
>>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>>> ---
>>> Hi everyone,
>>>
>>> It probably doesn't hurt to fix this issue before some actual issue happens.
>>> I have tested this patch on Chrome OS and playback was just as fine as with
>>> the SG ops.
>>>
>>>  drivers/media/platform/Kconfig              | 2 +-
>>>  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
>>>  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
>>>  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
>>>  4 files changed, 9 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
>>> index 35a18d388f3f..d9d7954111f2 100644
>>> --- a/drivers/media/platform/Kconfig
>>> +++ b/drivers/media/platform/Kconfig
>>> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>>>       depends on INTERCONNECT || !INTERCONNECT
>>>       select QCOM_MDT_LOADER if ARCH_QCOM
>>>       select QCOM_SCM if ARCH_QCOM
>>> -     select VIDEOBUF2_DMA_SG
>>> +     select VIDEOBUF2_DMA_CONTIG
>>>       select V4L2_MEM2MEM_DEV
>>>       help
>>>         This is a V4L2 driver for Qualcomm Venus video accelerator
>>> diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
>>> index 50439eb1ffea..859d260f002b 100644
>>> --- a/drivers/media/platform/qcom/venus/helpers.c
>>> +++ b/drivers/media/platform/qcom/venus/helpers.c
>>> @@ -7,7 +7,7 @@
>>>  #include <linux/mutex.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/kernel.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>  #include <media/v4l2-mem2mem.h>
>>>  #include <asm/div64.h>
>>>
>>> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>>>       struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>>>       struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>>>       struct venus_buffer *buf = to_venus_buffer(vbuf);
>>> -     struct sg_table *sgt;
>>> -
>>> -     sgt = vb2_dma_sg_plane_desc(vb, 0);
>>> -     if (!sgt)
>>> -             return -EFAULT;
>>>
>>>       buf->size = vb2_plane_size(vb, 0);
>>> -     buf->dma_addr = sg_dma_address(sgt->sgl);
>>
>> Can we do it:
>>
>>         if (WARN_ON(sgt->nents > 1))
>>                 return -EFAULT;
>>
>> I understand that logically using dma-sg when the flat layout is
>> expected by the hardware is wrong, but I haven't seen issues until now.
>>
>>> +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>
>>>       if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>>>               list_add_tail(&buf->reg_list, &inst->registeredbufs);
>>> diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c
>>> index 8488411204c3..3fb277c81aca 100644
>>> --- a/drivers/media/platform/qcom/venus/vdec.c
>>> +++ b/drivers/media/platform/qcom/venus/vdec.c
>>> @@ -13,7 +13,7 @@
>>>  #include <media/v4l2-event.h>
>>>  #include <media/v4l2-ctrls.h>
>>>  #include <media/v4l2-mem2mem.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>
>>>  #include "hfi_venus_io.h"
>>>  #include "hfi_parser.h"
>>> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>       src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>>       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>       src_vq->ops = &vdec_vb2_ops;
>>> -     src_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     src_vq->mem_ops = &vb2_dma_contig_memops;
>>>       src_vq->drv_priv = inst;
>>>       src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>       src_vq->allow_zero_bytesused = 1;
>>> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>       dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>>>       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>       dst_vq->ops = &vdec_vb2_ops;
>>> -     dst_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     dst_vq->mem_ops = &vb2_dma_contig_memops;
>>>       dst_vq->drv_priv = inst;
>>>       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>       dst_vq->allow_zero_bytesused = 1;
>>> diff --git a/drivers/media/platform/qcom/venus/venc.c b/drivers/media/platform/qcom/venus/venc.c
>>> index 1c61602c5de1..a09550cd1dba 100644
>>> --- a/drivers/media/platform/qcom/venus/venc.c
>>> +++ b/drivers/media/platform/qcom/venus/venc.c
>>> @@ -10,7 +10,7 @@
>>>  #include <linux/pm_runtime.h>
>>>  #include <linux/slab.h>
>>>  #include <media/v4l2-mem2mem.h>
>>> -#include <media/videobuf2-dma-sg.h>
>>> +#include <media/videobuf2-dma-contig.h>
>>>  #include <media/v4l2-ioctl.h>
>>>  #include <media/v4l2-event.h>
>>>  #include <media/v4l2-ctrls.h>
>>> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>       src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>>       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>       src_vq->ops = &venc_vb2_ops;
>>> -     src_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     src_vq->mem_ops = &vb2_dma_contig_memops;
>>>       src_vq->drv_priv = inst;
>>>       src_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>       src_vq->allow_zero_bytesused = 1;
>>> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>>>       dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
>>>       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>>>       dst_vq->ops = &venc_vb2_ops;
>>> -     dst_vq->mem_ops = &vb2_dma_sg_memops;
>>> +     dst_vq->mem_ops = &vb2_dma_contig_memops;
>>>       dst_vq->drv_priv = inst;
>>>       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>>>       dst_vq->allow_zero_bytesused = 1;
>>>
>>
>> --
>> regards,
>> Stan

-- 
regards,
Stan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 13:54     ` Stanimir Varbanov
@ 2020-12-15 19:21       ` Nicolas Dufresne
  2020-12-16  3:15         ` Tomasz Figa
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Dufresne @ 2020-12-15 19:21 UTC (permalink / raw)
  To: Stanimir Varbanov, Tomasz Figa
  Cc: Alexandre Courbot, Fritz Koenig, Linux Media Mailing List,
	Linux Kernel Mailing List, linux-arm-msm, Robin Murphy

Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> Hi Tomasz,
> 
> On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > <stanimir.varbanov@linaro.org> wrote:
> > > 
> > > Hi,
> > > 
> > > Cc: Robin
> > > 
> > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > Switch it to use the contiguous ops to make sure this expected invariant
> > > 
> > > Under what circumstances the sg table will has nents > 1? I came down to
> > > [1] but not sure I got it right.
> > > 
> > > I'm afraid that for systems with low amount of system memory and when
> > > the memory become fragmented, the driver will not work. That's why I
> > > started with sg allocator.
> > 
> > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > in terms of the DMA (aka IOVA) address space. In other words, it
> > guarantees that having one DMA address and length fully describes the
> 
> Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> allocator.

I'm learning everyday too, but I'm surprised I don't see a call to
vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
a patch when overlooking this thread) ?

The reason I'm asking, doc says it should be called by driver supporting IOMMU,
which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
all covered and we are good, otherwise perhaps a downstream patch didn't make it
?

/**
 * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
 * @dev:        device for configuring DMA parameters
 * @size:       size of DMA max segment size to set
 *
 * To allow mapping the scatter-list into a single chunk in the DMA
 * address space, the device is required to have the DMA max segment
 * size parameter set to a value larger than the buffer size. Otherwise,
 * the DMA-mapping subsystem will split the mapping into max segment
 * size chunks. This function sets the DMA max segment size
 * parameter to let DMA-mapping map a buffer as a single chunk in DMA
 * address space.
 * This code assumes that the DMA-mapping subsystem will merge all
 * scatterlist segments if this is really possible (for example when
 * an IOMMU is available and enabled).
 * Ideally, this parameter should be set by the generic bus code, but it
 * is left with the default 64KiB value due to historical litmiations in
 * other subsystems (like limited USB host drivers) and there no good
 * place to set it to the proper value.
 * This function should be called from the drivers, which are known to
 * operate on platforms with IOMMU and provide access to shared buffers
 * (either USERPTR or DMABUF). This should be done before initializing
 * videobuf2 queue.
 */

regards,
Nicolas

> 
> > buffer. This seems to be the requirement of the hardware/firmware
> > handled by the venus driver. If the device is behind an IOMMU, which
> > is the case for the SoCs in question, the underlying DMA ops will
> > actually allocate a discontiguous set of pages, so it has nothing to
> > do to system memory amount or fragmentation. If for some reason the
> > IOMMU can't be used, there is no way around, the memory needs to be
> > contiguous because of the hardware/firmware/driver expectation.
> > 
> > On the other hand, the vb2-dma-sg allocator doesn't have any
> > continuity guarantees for the DMA, or any other, address space. The
> > current code works fine, because it calls dma_map_sg() on the whole
> > set of pages and that ends up mapping it contiguously in the IOVA
> > space, but that's just an implementation detail, not an API guarantee.
> 
> It was good to know. Thanks for the explanation.
> 
> > 
> > Best regards,
> > Tomasz
> > 
> > > 
> > > [1]
> > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > 
> > > > is always enforced. Since the device is supposed to be behind an IOMMU
> > > > this should have little to none practical consequences beyond making the
> > > > driver not rely on a particular behavior of the SG implementation.
> > > > 
> > > > Reported-by: Tomasz Figa <tfiga@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > > Hi everyone,
> > > > 
> > > > It probably doesn't hurt to fix this issue before some actual issue
> > > > happens.
> > > > I have tested this patch on Chrome OS and playback was just as fine as
> > > > with
> > > > the SG ops.
> > > > 
> > > >  drivers/media/platform/Kconfig              | 2 +-
> > > >  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
> > > >  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
> > > >  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
> > > >  4 files changed, 9 insertions(+), 14 deletions(-)
> > > > 
> > > > diff --git a/drivers/media/platform/Kconfig
> > > > b/drivers/media/platform/Kconfig
> > > > index 35a18d388f3f..d9d7954111f2 100644
> > > > --- a/drivers/media/platform/Kconfig
> > > > +++ b/drivers/media/platform/Kconfig
> > > > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> > > >       depends on INTERCONNECT || !INTERCONNECT
> > > >       select QCOM_MDT_LOADER if ARCH_QCOM
> > > >       select QCOM_SCM if ARCH_QCOM
> > > > -     select VIDEOBUF2_DMA_SG
> > > > +     select VIDEOBUF2_DMA_CONTIG
> > > >       select V4L2_MEM2MEM_DEV
> > > >       help
> > > >         This is a V4L2 driver for Qualcomm Venus video accelerator
> > > > diff --git a/drivers/media/platform/qcom/venus/helpers.c
> > > > b/drivers/media/platform/qcom/venus/helpers.c
> > > > index 50439eb1ffea..859d260f002b 100644
> > > > --- a/drivers/media/platform/qcom/venus/helpers.c
> > > > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > > > @@ -7,7 +7,7 @@
> > > >  #include <linux/mutex.h>
> > > >  #include <linux/slab.h>
> > > >  #include <linux/kernel.h>
> > > > -#include <media/videobuf2-dma-sg.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > >  #include <media/v4l2-mem2mem.h>
> > > >  #include <asm/div64.h>
> > > > 
> > > > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer
> > > > *vb)
> > > >       struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> > > >       struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> > > >       struct venus_buffer *buf = to_venus_buffer(vbuf);
> > > > -     struct sg_table *sgt;
> > > > -
> > > > -     sgt = vb2_dma_sg_plane_desc(vb, 0);
> > > > -     if (!sgt)
> > > > -             return -EFAULT;
> > > > 
> > > >       buf->size = vb2_plane_size(vb, 0);
> > > > -     buf->dma_addr = sg_dma_address(sgt->sgl);
> > > 
> > > Can we do it:
> > > 
> > >         if (WARN_ON(sgt->nents > 1))
> > >                 return -EFAULT;
> > > 
> > > I understand that logically using dma-sg when the flat layout is
> > > expected by the hardware is wrong, but I haven't seen issues until now.
> > > 
> > > > +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > 
> > > >       if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> > > >               list_add_tail(&buf->reg_list, &inst->registeredbufs);
> > > > diff --git a/drivers/media/platform/qcom/venus/vdec.c
> > > > b/drivers/media/platform/qcom/venus/vdec.c
> > > > index 8488411204c3..3fb277c81aca 100644
> > > > --- a/drivers/media/platform/qcom/venus/vdec.c
> > > > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > > > @@ -13,7 +13,7 @@
> > > >  #include <media/v4l2-event.h>
> > > >  #include <media/v4l2-ctrls.h>
> > > >  #include <media/v4l2-mem2mem.h>
> > > > -#include <media/videobuf2-dma-sg.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > 
> > > >  #include "hfi_venus_io.h"
> > > >  #include "hfi_parser.h"
> > > > @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct
> > > > vb2_queue *src_vq,
> > > >       src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > >       src_vq->ops = &vdec_vb2_ops;
> > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > >       src_vq->drv_priv = inst;
> > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > >       src_vq->allow_zero_bytesused = 1;
> > > > @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct
> > > > vb2_queue *src_vq,
> > > >       dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > >       dst_vq->ops = &vdec_vb2_ops;
> > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > >       dst_vq->drv_priv = inst;
> > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > >       dst_vq->allow_zero_bytesused = 1;
> > > > diff --git a/drivers/media/platform/qcom/venus/venc.c
> > > > b/drivers/media/platform/qcom/venus/venc.c
> > > > index 1c61602c5de1..a09550cd1dba 100644
> > > > --- a/drivers/media/platform/qcom/venus/venc.c
> > > > +++ b/drivers/media/platform/qcom/venus/venc.c
> > > > @@ -10,7 +10,7 @@
> > > >  #include <linux/pm_runtime.h>
> > > >  #include <linux/slab.h>
> > > >  #include <media/v4l2-mem2mem.h>
> > > > -#include <media/videobuf2-dma-sg.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > >  #include <media/v4l2-ioctl.h>
> > > >  #include <media/v4l2-event.h>
> > > >  #include <media/v4l2-ctrls.h>
> > > > @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct
> > > > vb2_queue *src_vq,
> > > >       src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > >       src_vq->ops = &venc_vb2_ops;
> > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > >       src_vq->drv_priv = inst;
> > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > >       src_vq->allow_zero_bytesused = 1;
> > > > @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct
> > > > vb2_queue *src_vq,
> > > >       dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > >       dst_vq->ops = &venc_vb2_ops;
> > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > >       dst_vq->drv_priv = inst;
> > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > >       dst_vq->allow_zero_bytesused = 1;
> > > > 
> > > 
> > > --
> > > regards,
> > > Stan
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-15 19:21       ` Nicolas Dufresne
@ 2020-12-16  3:15         ` Tomasz Figa
  2021-03-01  9:23           ` Tomasz Figa
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Figa @ 2020-12-16  3:15 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Stanimir Varbanov, Alexandre Courbot, Fritz Koenig,
	Linux Media Mailing List, Linux Kernel Mailing List,
	linux-arm-msm, Robin Murphy

On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > Hi Tomasz,
> >
> > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > > <stanimir.varbanov@linaro.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > Cc: Robin
> > > >
> > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > > Switch it to use the contiguous ops to make sure this expected invariant
> > > >
> > > > Under what circumstances the sg table will has nents > 1? I came down to
> > > > [1] but not sure I got it right.
> > > >
> > > > I'm afraid that for systems with low amount of system memory and when
> > > > the memory become fragmented, the driver will not work. That's why I
> > > > started with sg allocator.
> > >
> > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > guarantees that having one DMA address and length fully describes the
> >
> > Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> > allocator.
>
> I'm learning everyday too, but I'm surprised I don't see a call to
> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
> a patch when overlooking this thread) ?
>
> The reason I'm asking, doc says it should be called by driver supporting IOMMU,
> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
> all covered and we are good, otherwise perhaps a downstream patch didn't make it
> ?
>
> /**
>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
>  * @dev:        device for configuring DMA parameters
>  * @size:       size of DMA max segment size to set
>  *
>  * To allow mapping the scatter-list into a single chunk in the DMA
>  * address space, the device is required to have the DMA max segment
>  * size parameter set to a value larger than the buffer size. Otherwise,
>  * the DMA-mapping subsystem will split the mapping into max segment
>  * size chunks. This function sets the DMA max segment size
>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
>  * address space.
>  * This code assumes that the DMA-mapping subsystem will merge all
>  * scatterlist segments if this is really possible (for example when
>  * an IOMMU is available and enabled).
>  * Ideally, this parameter should be set by the generic bus code, but it
>  * is left with the default 64KiB value due to historical litmiations in
>  * other subsystems (like limited USB host drivers) and there no good
>  * place to set it to the proper value.
>  * This function should be called from the drivers, which are known to
>  * operate on platforms with IOMMU and provide access to shared buffers
>  * (either USERPTR or DMABUF). This should be done before initializing
>  * videobuf2 queue.
>  */

It does call dma_set_max_seg_size() directly:
https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230

Actually, why do we even need a vb2 helper for this?

>
> regards,
> Nicolas
>
> >
> > > buffer. This seems to be the requirement of the hardware/firmware
> > > handled by the venus driver. If the device is behind an IOMMU, which
> > > is the case for the SoCs in question, the underlying DMA ops will
> > > actually allocate a discontiguous set of pages, so it has nothing to
> > > do to system memory amount or fragmentation. If for some reason the
> > > IOMMU can't be used, there is no way around, the memory needs to be
> > > contiguous because of the hardware/firmware/driver expectation.
> > >
> > > On the other hand, the vb2-dma-sg allocator doesn't have any
> > > continuity guarantees for the DMA, or any other, address space. The
> > > current code works fine, because it calls dma_map_sg() on the whole
> > > set of pages and that ends up mapping it contiguously in the IOVA
> > > space, but that's just an implementation detail, not an API guarantee.
> >
> > It was good to know. Thanks for the explanation.
> >
> > >
> > > Best regards,
> > > Tomasz
> > >
> > > >
> > > > [1]
> > > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > >
> > > > > is always enforced. Since the device is supposed to be behind an IOMMU
> > > > > this should have little to none practical consequences beyond making the
> > > > > driver not rely on a particular behavior of the SG implementation.
> > > > >
> > > > > Reported-by: Tomasz Figa <tfiga@chromium.org>
> > > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > ---
> > > > > Hi everyone,
> > > > >
> > > > > It probably doesn't hurt to fix this issue before some actual issue
> > > > > happens.
> > > > > I have tested this patch on Chrome OS and playback was just as fine as
> > > > > with
> > > > > the SG ops.
> > > > >
> > > > >  drivers/media/platform/Kconfig              | 2 +-
> > > > >  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
> > > > >  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
> > > > >  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
> > > > >  4 files changed, 9 insertions(+), 14 deletions(-)
> > > > >
> > > > > diff --git a/drivers/media/platform/Kconfig
> > > > > b/drivers/media/platform/Kconfig
> > > > > index 35a18d388f3f..d9d7954111f2 100644
> > > > > --- a/drivers/media/platform/Kconfig
> > > > > +++ b/drivers/media/platform/Kconfig
> > > > > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> > > > >       depends on INTERCONNECT || !INTERCONNECT
> > > > >       select QCOM_MDT_LOADER if ARCH_QCOM
> > > > >       select QCOM_SCM if ARCH_QCOM
> > > > > -     select VIDEOBUF2_DMA_SG
> > > > > +     select VIDEOBUF2_DMA_CONTIG
> > > > >       select V4L2_MEM2MEM_DEV
> > > > >       help
> > > > >         This is a V4L2 driver for Qualcomm Venus video accelerator
> > > > > diff --git a/drivers/media/platform/qcom/venus/helpers.c
> > > > > b/drivers/media/platform/qcom/venus/helpers.c
> > > > > index 50439eb1ffea..859d260f002b 100644
> > > > > --- a/drivers/media/platform/qcom/venus/helpers.c
> > > > > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > > > > @@ -7,7 +7,7 @@
> > > > >  #include <linux/mutex.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <linux/kernel.h>
> > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > +#include <media/videobuf2-dma-contig.h>
> > > > >  #include <media/v4l2-mem2mem.h>
> > > > >  #include <asm/div64.h>
> > > > >
> > > > > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer
> > > > > *vb)
> > > > >       struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> > > > >       struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> > > > >       struct venus_buffer *buf = to_venus_buffer(vbuf);
> > > > > -     struct sg_table *sgt;
> > > > > -
> > > > > -     sgt = vb2_dma_sg_plane_desc(vb, 0);
> > > > > -     if (!sgt)
> > > > > -             return -EFAULT;
> > > > >
> > > > >       buf->size = vb2_plane_size(vb, 0);
> > > > > -     buf->dma_addr = sg_dma_address(sgt->sgl);
> > > >
> > > > Can we do it:
> > > >
> > > >         if (WARN_ON(sgt->nents > 1))
> > > >                 return -EFAULT;
> > > >
> > > > I understand that logically using dma-sg when the flat layout is
> > > > expected by the hardware is wrong, but I haven't seen issues until now.
> > > >
> > > > > +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > >
> > > > >       if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> > > > >               list_add_tail(&buf->reg_list, &inst->registeredbufs);
> > > > > diff --git a/drivers/media/platform/qcom/venus/vdec.c
> > > > > b/drivers/media/platform/qcom/venus/vdec.c
> > > > > index 8488411204c3..3fb277c81aca 100644
> > > > > --- a/drivers/media/platform/qcom/venus/vdec.c
> > > > > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > > > > @@ -13,7 +13,7 @@
> > > > >  #include <media/v4l2-event.h>
> > > > >  #include <media/v4l2-ctrls.h>
> > > > >  #include <media/v4l2-mem2mem.h>
> > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > +#include <media/videobuf2-dma-contig.h>
> > > > >
> > > > >  #include "hfi_venus_io.h"
> > > > >  #include "hfi_parser.h"
> > > > > @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > vb2_queue *src_vq,
> > > > >       src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > >       src_vq->ops = &vdec_vb2_ops;
> > > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > > >       src_vq->drv_priv = inst;
> > > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > >       src_vq->allow_zero_bytesused = 1;
> > > > > @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > vb2_queue *src_vq,
> > > > >       dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > >       dst_vq->ops = &vdec_vb2_ops;
> > > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > > >       dst_vq->drv_priv = inst;
> > > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > >       dst_vq->allow_zero_bytesused = 1;
> > > > > diff --git a/drivers/media/platform/qcom/venus/venc.c
> > > > > b/drivers/media/platform/qcom/venus/venc.c
> > > > > index 1c61602c5de1..a09550cd1dba 100644
> > > > > --- a/drivers/media/platform/qcom/venus/venc.c
> > > > > +++ b/drivers/media/platform/qcom/venus/venc.c
> > > > > @@ -10,7 +10,7 @@
> > > > >  #include <linux/pm_runtime.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <media/v4l2-mem2mem.h>
> > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > +#include <media/videobuf2-dma-contig.h>
> > > > >  #include <media/v4l2-ioctl.h>
> > > > >  #include <media/v4l2-event.h>
> > > > >  #include <media/v4l2-ctrls.h>
> > > > > @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > vb2_queue *src_vq,
> > > > >       src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > >       src_vq->ops = &venc_vb2_ops;
> > > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > > >       src_vq->drv_priv = inst;
> > > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > >       src_vq->allow_zero_bytesused = 1;
> > > > > @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > vb2_queue *src_vq,
> > > > >       dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > >       dst_vq->ops = &venc_vb2_ops;
> > > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > > >       dst_vq->drv_priv = inst;
> > > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > >       dst_vq->allow_zero_bytesused = 1;
> > > > >
> > > >
> > > > --
> > > > regards,
> > > > Stan
> >
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-16  3:15         ` Tomasz Figa
@ 2021-03-01  9:23           ` Tomasz Figa
  2021-03-01 10:22             ` Stanimir Varbanov
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Figa @ 2021-03-01  9:23 UTC (permalink / raw)
  To: Alexandre Courbot, Stanimir Varbanov
  Cc: Fritz Koenig, Linux Media Mailing List, Nicolas Dufresne,
	Linux Kernel Mailing List, linux-arm-msm, Robin Murphy

Hi Alex, Stanimir,

On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >
> > Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > > Hi Tomasz,
> > >
> > > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > > > <stanimir.varbanov@linaro.org> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Cc: Robin
> > > > >
> > > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > > > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > > > Switch it to use the contiguous ops to make sure this expected invariant
> > > > >
> > > > > Under what circumstances the sg table will has nents > 1? I came down to
> > > > > [1] but not sure I got it right.
> > > > >
> > > > > I'm afraid that for systems with low amount of system memory and when
> > > > > the memory become fragmented, the driver will not work. That's why I
> > > > > started with sg allocator.
> > > >
> > > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > > guarantees that having one DMA address and length fully describes the
> > >
> > > Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> > > allocator.
> >
> > I'm learning everyday too, but I'm surprised I don't see a call to
> > vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
> > a patch when overlooking this thread) ?
> >
> > The reason I'm asking, doc says it should be called by driver supporting IOMMU,
> > which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
> > mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
> > all covered and we are good, otherwise perhaps a downstream patch didn't make it
> > ?
> >
> > /**
> >  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >  * @dev:        device for configuring DMA parameters
> >  * @size:       size of DMA max segment size to set
> >  *
> >  * To allow mapping the scatter-list into a single chunk in the DMA
> >  * address space, the device is required to have the DMA max segment
> >  * size parameter set to a value larger than the buffer size. Otherwise,
> >  * the DMA-mapping subsystem will split the mapping into max segment
> >  * size chunks. This function sets the DMA max segment size
> >  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >  * address space.
> >  * This code assumes that the DMA-mapping subsystem will merge all
> >  * scatterlist segments if this is really possible (for example when
> >  * an IOMMU is available and enabled).
> >  * Ideally, this parameter should be set by the generic bus code, but it
> >  * is left with the default 64KiB value due to historical litmiations in
> >  * other subsystems (like limited USB host drivers) and there no good
> >  * place to set it to the proper value.
> >  * This function should be called from the drivers, which are known to
> >  * operate on platforms with IOMMU and provide access to shared buffers
> >  * (either USERPTR or DMABUF). This should be done before initializing
> >  * videobuf2 queue.
> >  */
>
> It does call dma_set_max_seg_size() directly:
> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
>
> Actually, why do we even need a vb2 helper for this?
>

What's the plan for this patch?

Best regards,
Tomasz

> >
> > regards,
> > Nicolas
> >
> > >
> > > > buffer. This seems to be the requirement of the hardware/firmware
> > > > handled by the venus driver. If the device is behind an IOMMU, which
> > > > is the case for the SoCs in question, the underlying DMA ops will
> > > > actually allocate a discontiguous set of pages, so it has nothing to
> > > > do to system memory amount or fragmentation. If for some reason the
> > > > IOMMU can't be used, there is no way around, the memory needs to be
> > > > contiguous because of the hardware/firmware/driver expectation.
> > > >
> > > > On the other hand, the vb2-dma-sg allocator doesn't have any
> > > > continuity guarantees for the DMA, or any other, address space. The
> > > > current code works fine, because it calls dma_map_sg() on the whole
> > > > set of pages and that ends up mapping it contiguously in the IOVA
> > > > space, but that's just an implementation detail, not an API guarantee.
> > >
> > > It was good to know. Thanks for the explanation.
> > >
> > > >
> > > > Best regards,
> > > > Tomasz
> > > >
> > > > >
> > > > > [1]
> > > > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > > >
> > > > > > is always enforced. Since the device is supposed to be behind an IOMMU
> > > > > > this should have little to none practical consequences beyond making the
> > > > > > driver not rely on a particular behavior of the SG implementation.
> > > > > >
> > > > > > Reported-by: Tomasz Figa <tfiga@chromium.org>
> > > > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > > ---
> > > > > > Hi everyone,
> > > > > >
> > > > > > It probably doesn't hurt to fix this issue before some actual issue
> > > > > > happens.
> > > > > > I have tested this patch on Chrome OS and playback was just as fine as
> > > > > > with
> > > > > > the SG ops.
> > > > > >
> > > > > >  drivers/media/platform/Kconfig              | 2 +-
> > > > > >  drivers/media/platform/qcom/venus/helpers.c | 9 ++-------
> > > > > >  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
> > > > > >  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
> > > > > >  4 files changed, 9 insertions(+), 14 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/media/platform/Kconfig
> > > > > > b/drivers/media/platform/Kconfig
> > > > > > index 35a18d388f3f..d9d7954111f2 100644
> > > > > > --- a/drivers/media/platform/Kconfig
> > > > > > +++ b/drivers/media/platform/Kconfig
> > > > > > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> > > > > >       depends on INTERCONNECT || !INTERCONNECT
> > > > > >       select QCOM_MDT_LOADER if ARCH_QCOM
> > > > > >       select QCOM_SCM if ARCH_QCOM
> > > > > > -     select VIDEOBUF2_DMA_SG
> > > > > > +     select VIDEOBUF2_DMA_CONTIG
> > > > > >       select V4L2_MEM2MEM_DEV
> > > > > >       help
> > > > > >         This is a V4L2 driver for Qualcomm Venus video accelerator
> > > > > > diff --git a/drivers/media/platform/qcom/venus/helpers.c
> > > > > > b/drivers/media/platform/qcom/venus/helpers.c
> > > > > > index 50439eb1ffea..859d260f002b 100644
> > > > > > --- a/drivers/media/platform/qcom/venus/helpers.c
> > > > > > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > > > > > @@ -7,7 +7,7 @@
> > > > > >  #include <linux/mutex.h>
> > > > > >  #include <linux/slab.h>
> > > > > >  #include <linux/kernel.h>
> > > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > > +#include <media/videobuf2-dma-contig.h>
> > > > > >  #include <media/v4l2-mem2mem.h>
> > > > > >  #include <asm/div64.h>
> > > > > >
> > > > > > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer
> > > > > > *vb)
> > > > > >       struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> > > > > >       struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> > > > > >       struct venus_buffer *buf = to_venus_buffer(vbuf);
> > > > > > -     struct sg_table *sgt;
> > > > > > -
> > > > > > -     sgt = vb2_dma_sg_plane_desc(vb, 0);
> > > > > > -     if (!sgt)
> > > > > > -             return -EFAULT;
> > > > > >
> > > > > >       buf->size = vb2_plane_size(vb, 0);
> > > > > > -     buf->dma_addr = sg_dma_address(sgt->sgl);
> > > > >
> > > > > Can we do it:
> > > > >
> > > > >         if (WARN_ON(sgt->nents > 1))
> > > > >                 return -EFAULT;
> > > > >
> > > > > I understand that logically using dma-sg when the flat layout is
> > > > > expected by the hardware is wrong, but I haven't seen issues until now.
> > > > >
> > > > > > +     buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > > >
> > > > > >       if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> > > > > >               list_add_tail(&buf->reg_list, &inst->registeredbufs);
> > > > > > diff --git a/drivers/media/platform/qcom/venus/vdec.c
> > > > > > b/drivers/media/platform/qcom/venus/vdec.c
> > > > > > index 8488411204c3..3fb277c81aca 100644
> > > > > > --- a/drivers/media/platform/qcom/venus/vdec.c
> > > > > > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > > > > > @@ -13,7 +13,7 @@
> > > > > >  #include <media/v4l2-event.h>
> > > > > >  #include <media/v4l2-ctrls.h>
> > > > > >  #include <media/v4l2-mem2mem.h>
> > > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > > +#include <media/videobuf2-dma-contig.h>
> > > > > >
> > > > > >  #include "hfi_venus_io.h"
> > > > > >  #include "hfi_parser.h"
> > > > > > @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > > vb2_queue *src_vq,
> > > > > >       src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > > >       src_vq->ops = &vdec_vb2_ops;
> > > > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > > > >       src_vq->drv_priv = inst;
> > > > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > > >       src_vq->allow_zero_bytesused = 1;
> > > > > > @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > > vb2_queue *src_vq,
> > > > > >       dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> > > > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > > >       dst_vq->ops = &vdec_vb2_ops;
> > > > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > > > >       dst_vq->drv_priv = inst;
> > > > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > > >       dst_vq->allow_zero_bytesused = 1;
> > > > > > diff --git a/drivers/media/platform/qcom/venus/venc.c
> > > > > > b/drivers/media/platform/qcom/venus/venc.c
> > > > > > index 1c61602c5de1..a09550cd1dba 100644
> > > > > > --- a/drivers/media/platform/qcom/venus/venc.c
> > > > > > +++ b/drivers/media/platform/qcom/venus/venc.c
> > > > > > @@ -10,7 +10,7 @@
> > > > > >  #include <linux/pm_runtime.h>
> > > > > >  #include <linux/slab.h>
> > > > > >  #include <media/v4l2-mem2mem.h>
> > > > > > -#include <media/videobuf2-dma-sg.h>
> > > > > > +#include <media/videobuf2-dma-contig.h>
> > > > > >  #include <media/v4l2-ioctl.h>
> > > > > >  #include <media/v4l2-event.h>
> > > > > >  #include <media/v4l2-ctrls.h>
> > > > > > @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > > vb2_queue *src_vq,
> > > > > >       src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > > > >       src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > > >       src_vq->ops = &venc_vb2_ops;
> > > > > > -     src_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > > +     src_vq->mem_ops = &vb2_dma_contig_memops;
> > > > > >       src_vq->drv_priv = inst;
> > > > > >       src_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > > >       src_vq->allow_zero_bytesused = 1;
> > > > > > @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct
> > > > > > vb2_queue *src_vq,
> > > > > >       dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> > > > > >       dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > > > > >       dst_vq->ops = &venc_vb2_ops;
> > > > > > -     dst_vq->mem_ops = &vb2_dma_sg_memops;
> > > > > > +     dst_vq->mem_ops = &vb2_dma_contig_memops;
> > > > > >       dst_vq->drv_priv = inst;
> > > > > >       dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> > > > > >       dst_vq->allow_zero_bytesused = 1;
> > > > > >
> > > > >
> > > > > --
> > > > > regards,
> > > > > Stan
> > >
> >
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2021-03-01  9:23           ` Tomasz Figa
@ 2021-03-01 10:22             ` Stanimir Varbanov
  2021-03-01 10:26               ` Tomasz Figa
  0 siblings, 1 reply; 13+ messages in thread
From: Stanimir Varbanov @ 2021-03-01 10:22 UTC (permalink / raw)
  To: Tomasz Figa, Alexandre Courbot
  Cc: Fritz Koenig, Linux Media Mailing List, Nicolas Dufresne,
	Linux Kernel Mailing List, linux-arm-msm, Robin Murphy



On 3/1/21 11:23 AM, Tomasz Figa wrote:
> Hi Alex, Stanimir,
> 
> On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa <tfiga@chromium.org> wrote:
>>
>> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>>
>>> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
>>>> Hi Tomasz,
>>>>
>>>> On 12/15/20 1:47 PM, Tomasz Figa wrote:
>>>>> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
>>>>> <stanimir.varbanov@linaro.org> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Cc: Robin
>>>>>>
>>>>>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>>>>>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>>>>>>> first entry of the SG table, indicating that it expects a flat layout.
>>>>>>> Switch it to use the contiguous ops to make sure this expected invariant
>>>>>>
>>>>>> Under what circumstances the sg table will has nents > 1? I came down to
>>>>>> [1] but not sure I got it right.
>>>>>>
>>>>>> I'm afraid that for systems with low amount of system memory and when
>>>>>> the memory become fragmented, the driver will not work. That's why I
>>>>>> started with sg allocator.
>>>>>
>>>>> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
>>>>> in terms of the DMA (aka IOVA) address space. In other words, it
>>>>> guarantees that having one DMA address and length fully describes the
>>>>
>>>> Ahh, I missed that part. Looks like I misunderstood videobu2 contig
>>>> allocator.
>>>
>>> I'm learning everyday too, but I'm surprised I don't see a call to
>>> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
>>> a patch when overlooking this thread) ?
>>>
>>> The reason I'm asking, doc says it should be called by driver supporting IOMMU,
>>> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
>>> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
>>> all covered and we are good, otherwise perhaps a downstream patch didn't make it
>>> ?
>>>
>>> /**
>>>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
>>>  * @dev:        device for configuring DMA parameters
>>>  * @size:       size of DMA max segment size to set
>>>  *
>>>  * To allow mapping the scatter-list into a single chunk in the DMA
>>>  * address space, the device is required to have the DMA max segment
>>>  * size parameter set to a value larger than the buffer size. Otherwise,
>>>  * the DMA-mapping subsystem will split the mapping into max segment
>>>  * size chunks. This function sets the DMA max segment size
>>>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
>>>  * address space.
>>>  * This code assumes that the DMA-mapping subsystem will merge all
>>>  * scatterlist segments if this is really possible (for example when
>>>  * an IOMMU is available and enabled).
>>>  * Ideally, this parameter should be set by the generic bus code, but it
>>>  * is left with the default 64KiB value due to historical litmiations in
>>>  * other subsystems (like limited USB host drivers) and there no good
>>>  * place to set it to the proper value.
>>>  * This function should be called from the drivers, which are known to
>>>  * operate on platforms with IOMMU and provide access to shared buffers
>>>  * (either USERPTR or DMABUF). This should be done before initializing
>>>  * videobuf2 queue.
>>>  */
>>
>> It does call dma_set_max_seg_size() directly:
>> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
>>
>> Actually, why do we even need a vb2 helper for this?
>>
> 
> What's the plan for this patch?

It will be part of v5.12.

-- 
regards,
Stan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2021-03-01 10:22             ` Stanimir Varbanov
@ 2021-03-01 10:26               ` Tomasz Figa
  0 siblings, 0 replies; 13+ messages in thread
From: Tomasz Figa @ 2021-03-01 10:26 UTC (permalink / raw)
  To: Stanimir Varbanov
  Cc: Alexandre Courbot, Fritz Koenig, Linux Media Mailing List,
	Nicolas Dufresne, Linux Kernel Mailing List, linux-arm-msm,
	Robin Murphy

On Mon, Mar 1, 2021 at 7:22 PM Stanimir Varbanov
<stanimir.varbanov@linaro.org> wrote:
>
>
>
> On 3/1/21 11:23 AM, Tomasz Figa wrote:
> > Hi Alex, Stanimir,
> >
> > On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa <tfiga@chromium.org> wrote:
> >>
> >> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >>>
> >>> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> >>>> Hi Tomasz,
> >>>>
> >>>> On 12/15/20 1:47 PM, Tomasz Figa wrote:
> >>>>> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> >>>>> <stanimir.varbanov@linaro.org> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Cc: Robin
> >>>>>>
> >>>>>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> >>>>>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
> >>>>>>> first entry of the SG table, indicating that it expects a flat layout.
> >>>>>>> Switch it to use the contiguous ops to make sure this expected invariant
> >>>>>>
> >>>>>> Under what circumstances the sg table will has nents > 1? I came down to
> >>>>>> [1] but not sure I got it right.
> >>>>>>
> >>>>>> I'm afraid that for systems with low amount of system memory and when
> >>>>>> the memory become fragmented, the driver will not work. That's why I
> >>>>>> started with sg allocator.
> >>>>>
> >>>>> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> >>>>> in terms of the DMA (aka IOVA) address space. In other words, it
> >>>>> guarantees that having one DMA address and length fully describes the
> >>>>
> >>>> Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> >>>> allocator.
> >>>
> >>> I'm learning everyday too, but I'm surprised I don't see a call to
> >>> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
> >>> a patch when overlooking this thread) ?
> >>>
> >>> The reason I'm asking, doc says it should be called by driver supporting IOMMU,
> >>> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
> >>> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
> >>> all covered and we are good, otherwise perhaps a downstream patch didn't make it
> >>> ?
> >>>
> >>> /**
> >>>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >>>  * @dev:        device for configuring DMA parameters
> >>>  * @size:       size of DMA max segment size to set
> >>>  *
> >>>  * To allow mapping the scatter-list into a single chunk in the DMA
> >>>  * address space, the device is required to have the DMA max segment
> >>>  * size parameter set to a value larger than the buffer size. Otherwise,
> >>>  * the DMA-mapping subsystem will split the mapping into max segment
> >>>  * size chunks. This function sets the DMA max segment size
> >>>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >>>  * address space.
> >>>  * This code assumes that the DMA-mapping subsystem will merge all
> >>>  * scatterlist segments if this is really possible (for example when
> >>>  * an IOMMU is available and enabled).
> >>>  * Ideally, this parameter should be set by the generic bus code, but it
> >>>  * is left with the default 64KiB value due to historical litmiations in
> >>>  * other subsystems (like limited USB host drivers) and there no good
> >>>  * place to set it to the proper value.
> >>>  * This function should be called from the drivers, which are known to
> >>>  * operate on platforms with IOMMU and provide access to shared buffers
> >>>  * (either USERPTR or DMABUF). This should be done before initializing
> >>>  * videobuf2 queue.
> >>>  */
> >>
> >> It does call dma_set_max_seg_size() directly:
> >> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
> >>
> >> Actually, why do we even need a vb2 helper for this?
> >>
> >
> > What's the plan for this patch?
>
> It will be part of v5.12.

Great, thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] media: venus: use contig vb2 ops
  2020-12-14 12:57 [PATCH] media: venus: use contig vb2 ops Alexandre Courbot
  2020-12-15  9:02 ` Tomasz Figa
  2020-12-15 11:16 ` Stanimir Varbanov
@ 2021-03-01 19:59 ` patchwork-bot+linux-arm-msm
  2 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+linux-arm-msm @ 2021-03-01 19:59 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: linux-arm-msm

Hello:

This patch was applied to qcom/linux.git (refs/heads/for-next):

On Mon, 14 Dec 2020 21:57:03 +0900 you wrote:
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant
> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to none practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
> 
> [...]

Here is the summary with links:
  - media: venus: use contig vb2 ops
    https://git.kernel.org/qcom/c/cc82fd691a3a

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-03-01 20:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14 12:57 [PATCH] media: venus: use contig vb2 ops Alexandre Courbot
2020-12-15  9:02 ` Tomasz Figa
2020-12-15 11:16 ` Stanimir Varbanov
2020-12-15 11:34   ` Robin Murphy
2020-12-15 11:47   ` Tomasz Figa
2020-12-15 13:32     ` Robin Murphy
2020-12-15 13:54     ` Stanimir Varbanov
2020-12-15 19:21       ` Nicolas Dufresne
2020-12-16  3:15         ` Tomasz Figa
2021-03-01  9:23           ` Tomasz Figa
2021-03-01 10:22             ` Stanimir Varbanov
2021-03-01 10:26               ` Tomasz Figa
2021-03-01 19:59 ` patchwork-bot+linux-arm-msm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).