linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings
       [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
@ 2019-07-08 11:04 ` frederic.chen
  2019-07-24 16:49   ` Rob Herring
  2019-07-08 11:04 ` [RFC PATCH V2 2/6] dts: arm64: mt8183: Add DIP nodes frederic.chen
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: frederic.chen @ 2019-07-08 11:04 UTC (permalink / raw)
  To: hans.verkuil, laurent.pinchart+renesas, tfiga, matthias.bgg, mchehab
  Cc: shik, devicetree, Sean.Cheng, Rynn.Wu, Allan.Yang,
	srv_heupstream, holmes.chiou, suleiman, Jerry-ch.Chen, jungo.lin,
	sj.huang, yuzhao, linux-mediatek, zwisler, christie.yu,
	frederic.chen, linux-arm-kernel, linux-media

From: Frederic Chen <frederic.chen@mediatek.com>

This patch adds DT binding documentation for the Digital Image
Processing (DIP) unit of the camera ISP system on Mediatek SoCs.

It depends on the following SCP and MDP3 patches:

1. dt-bindings: Add a binding for Mediatek SCP
   https://patchwork.kernel.org/patch/11027247/
2. dt-binding: mt8183: Add Mediatek MDP3 dt-bindings
   https://patchwork.kernel.org/patch/10945603/

Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
---
 .../bindings/media/mediatek,mt8183-dip.txt    | 40 +++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8183-dip.txt

diff --git a/Documentation/devicetree/bindings/media/mediatek,mt8183-dip.txt b/Documentation/devicetree/bindings/media/mediatek,mt8183-dip.txt
new file mode 100644
index 000000000000..3a0435513089
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/mediatek,mt8183-dip.txt
@@ -0,0 +1,40 @@
+* Mediatek Digital Image Processor (DIP)
+
+Digital Image Processor (DIP) unit in the Mediatek ISP system is responsible
+for image content adjustment according to the tuning parameters. DIP can
+process the image from a memory buffer and output the processed image to
+multiple output buffers. Furthermore, it supports demosaicing and noise
+reduction on the images.
+
+Required properties:
+- compatible: "mediatek,mt8183-dip"
+- reg: Physical base address and length of the function block register space
+- interrupts: interrupt number to the cpu
+- iommus: should point to the respective IOMMU block with master port as
+  argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+  for details.
+- mediatek,mdp3: should point to the respective MDP block. The DIP hardware
+  connects to MDP, so the processed image can carry the effects of both
+  blocks.
+- mediatek,larb: must contain the local arbiters of the current SoC, see
+  Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
+  for details.
+- mediatek,scp: should point to the SCP node, since the SCP coprocessor is
+  used to control the DIP hardware
+- clocks: must contain the local arbiter 5 (LARB5) and DIP clocks
+- clock-names: must contain "larb5" and "dip"
+
+Example:
+	dip: dip@15022000 {
+		compatible = "mediatek,mt8183-dip";
+		mediatek,larb = <&larb5>;
+		mediatek,mdp3 = <&mdp_rdma0>;
+		mediatek,scp = <&scp>;
+		iommus = <&iommu M4U_PORT_CAM_IMGI>;
+		reg = <0 0x15022000 0 0x6000>;
+		interrupts = <GIC_SPI 268 IRQ_TYPE_LEVEL_LOW>;
+		clocks = <&imgsys CLK_IMG_LARB5>,
+			 <&imgsys CLK_IMG_DIP>;
+		clock-names = "larb5",
+			      "dip";
+		};
-- 
2.18.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH V2 2/6] dts: arm64: mt8183: Add DIP nodes
       [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
  2019-07-08 11:04 ` [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings frederic.chen
@ 2019-07-08 11:04 ` frederic.chen
  2019-07-08 11:04 ` [RFC PATCH V2 3/6] media: platform: Add Mediatek DIP driver KConfig frederic.chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: frederic.chen @ 2019-07-08 11:04 UTC (permalink / raw)
  To: hans.verkuil, laurent.pinchart+renesas, tfiga, matthias.bgg, mchehab
  Cc: shik, devicetree, Sean.Cheng, Rynn.Wu, Allan.Yang,
	srv_heupstream, holmes.chiou, suleiman, Jerry-ch.Chen, jungo.lin,
	sj.huang, yuzhao, linux-mediatek, zwisler, christie.yu,
	frederic.chen, linux-arm-kernel, linux-media

From: Frederic Chen <frederic.chen@mediatek.com>

This patch adds the device node for the Digital Image Processing
(DIP) unit. DIP is embedded in Mediatek SoCs and works with the
co-processor to adjust image content according to tuning input
data. It also provides image format conversion, resizing, and
rotation features.

Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 0f2646c9eb65..d7b0fb8230f0 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -513,6 +513,22 @@
 			#clock-cells = <1>;
 		};
 
+		dip: dip@15022000 {
+			compatible = "mediatek,mt8183-dip";
+			mediatek,larb = <&larb5>;
+			mediatek,mdp3 = <&mdp_rdma0>;
+			mediatek,scp = <&scp>;
+			iommus = <&iommu M4U_PORT_CAM_IMGI>;
+			reg = <0 0x15022000 0 0x6000>;
+			interrupts = <GIC_SPI 268 IRQ_TYPE_LEVEL_LOW>;
+			clocks =
+					<&imgsys CLK_IMG_LARB5>,
+					<&imgsys CLK_IMG_DIP>;
+			clock-names =
+					"larb5",
+					"dip";
+		};
+
 		vdecsys: syscon@16000000 {
 			compatible = "mediatek,mt8183-vdecsys", "syscon";
 			reg = <0 0x16000000 0 0x1000>;
-- 
2.18.0




* [RFC PATCH V2 3/6] media: platform: Add Mediatek DIP driver KConfig
       [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
  2019-07-08 11:04 ` [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings frederic.chen
  2019-07-08 11:04 ` [RFC PATCH V2 2/6] dts: arm64: mt8183: Add DIP nodes frederic.chen
@ 2019-07-08 11:04 ` frederic.chen
  2019-07-08 11:04 ` [RFC PATCH V2 5/6] remoteproc/mediatek: add SCP's shared dma pool support for mt8183 frederic.chen
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: frederic.chen @ 2019-07-08 11:04 UTC (permalink / raw)
  To: hans.verkuil, laurent.pinchart+renesas, tfiga, matthias.bgg, mchehab
  Cc: shik, devicetree, Sean.Cheng, Rynn.Wu, Allan.Yang,
	srv_heupstream, holmes.chiou, suleiman, Jerry-ch.Chen, jungo.lin,
	sj.huang, yuzhao, linux-mediatek, zwisler, christie.yu,
	frederic.chen, linux-arm-kernel, linux-media

From: Frederic Chen <frederic.chen@mediatek.com>

This patch adds the Kconfig entry for the Mediatek Digital Image
Processing (DIP) driver. DIP is embedded in Mediatek SoCs and
provides image format conversion, resizing, and rotation functions.

Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
---
 drivers/media/platform/Kconfig         |  2 ++
 drivers/media/platform/mtk-isp/Kconfig | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)
 create mode 100644 drivers/media/platform/mtk-isp/Kconfig

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index 0c725d4dcf80..b8501e1b134f 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -32,6 +32,8 @@ source "drivers/media/platform/davinci/Kconfig"
 
 source "drivers/media/platform/omap/Kconfig"
 
+source "drivers/media/platform/mtk-isp/Kconfig"
+
 config VIDEO_ASPEED
 	tristate "Aspeed AST2400 and AST2500 Video Engine driver"
 	depends on VIDEO_V4L2
diff --git a/drivers/media/platform/mtk-isp/Kconfig b/drivers/media/platform/mtk-isp/Kconfig
new file mode 100644
index 000000000000..24a592022dd5
--- /dev/null
+++ b/drivers/media/platform/mtk-isp/Kconfig
@@ -0,0 +1,21 @@
+config VIDEO_MEDIATEK_ISP_DIP
+	bool "Mediatek Digital Image Processing function"
+	select DMA_SHARED_BUFFER
+	select VIDEO_V4L2_SUBDEV_API
+	select VIDEOBUF2_DMA_CONTIG
+	select VIDEOBUF2_CORE
+	select VIDEOBUF2_V4L2
+	select VIDEOBUF2_MEMOPS
+	select VIDEOBUF2_VMALLOC
+	select MEDIA_CONTROLLER
+
+	default n
+	help
+	    Support the basic Digital Image Processing (DIP) driver.
+
+	    The DIP driver provides image format conversion, resizing,
+	    and rotation functions through low-power hardware.
+	    DIP also supports multiple outputs: it can generate two or
+	    more output images with different effects from a single
+	    input image at the same time.
+
-- 
2.18.0




* [RFC PATCH V2 5/6] remoteproc/mediatek: add SCP's shared dma pool support for mt8183
       [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
                   ` (2 preceding siblings ...)
  2019-07-08 11:04 ` [RFC PATCH V2 3/6] media: platform: Add Mediatek DIP driver KConfig frederic.chen
@ 2019-07-08 11:04 ` frederic.chen
  2019-07-08 11:05 ` [RFC PATCH V2 6/6] media: platform: mtk-mdp3: Add struct tuning_addr to identify user tuning data frederic.chen
       [not found] ` <20190708110500.7242-5-frederic.chen@mediatek.com>
  5 siblings, 0 replies; 9+ messages in thread
From: frederic.chen @ 2019-07-08 11:04 UTC (permalink / raw)
  To: hans.verkuil, laurent.pinchart+renesas, tfiga, matthias.bgg, mchehab
  Cc: shik, devicetree, Sean.Cheng, Rynn.Wu, Allan.Yang,
	srv_heupstream, holmes.chiou, suleiman, Jerry-ch.Chen, jungo.lin,
	sj.huang, yuzhao, linux-mediatek, zwisler, christie.yu,
	frederic.chen, linux-arm-kernel, linux-media

From: Frederic Chen <frederic.chen@mediatek.com>

This patch uses of_reserved_mem_device_init_by_idx() to hook the
SCP device into the DMA mapping API, providing a shared DMA pool
of SCP DMA buffers for SCP clients such as the DIP and ISP Pass 1
drivers.

Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
---
 drivers/remoteproc/mtk_scp.c | 54 ++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index 4c093dec52b9..0cffe4b63dba 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -4,12 +4,14 @@
 
 #include <asm/barrier.h>
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
 #include <linux/of_platform.h>
+#include <linux/of_reserved_mem.h>
 #include <linux/platform_data/mtk_scp.h>
 #include <linux/platform_device.h>
 #include <linux/remoteproc.h>
@@ -487,29 +489,29 @@ EXPORT_SYMBOL_GPL(scp_get_reserve_mem_size);
 
 static int scp_map_memory_region(struct mtk_scp *scp)
 {
-	struct device_node *node;
-	struct resource r;
-	int ret;
+	int ret, id;
 
-	node = of_parse_phandle(scp->dev->of_node, "memory-region", 0);
-	if (!node) {
-		dev_err(scp->dev, "no memory-region specified\n");
-		return -EINVAL;
+	ret = of_reserved_mem_device_init_by_idx(scp->dev, scp->dev->of_node,
+						 0);
+	if (ret) {
+		dev_err(scp->dev,
+			"%s:of_reserved_mem_device_init_by_idx(0) failed:(%d)",
+			__func__, ret);
+		return -ENOMEM;
 	}
 
-	ret = of_address_to_resource(node, 0, &r);
-	if (ret)
-		return ret;
+	/* Pre-allocate the working buffers */
+	scp->dram_size = MAX_CODE_SIZE;
+	for (id = 0; id < SCP_NUMS_MEM_ID; id++)
+		scp->dram_size += scp_reserve_mblock[id].size;
 
-	scp->phys_addr = r.start;
-	scp->dram_size = resource_size(&r);
-	scp->cpu_addr =
-		devm_ioremap_wc(scp->dev, scp->phys_addr, scp->dram_size);
+	scp->cpu_addr = dma_alloc_coherent(scp->dev, scp->dram_size,
+					   &scp->phys_addr, GFP_KERNEL);
 
 	if (!scp->cpu_addr) {
-		dev_err(scp->dev, "unable to map memory region: %pa+%zx\n",
-			&r.start, scp->dram_size);
-		return -EBUSY;
+		dev_err(scp->dev, "unable to pre-allocate memory for SCP: %zx\n",
+			scp->dram_size);
+		return -ENOMEM;
 	}
 
 #if SCP_RESERVED_MEM
@@ -519,6 +521,13 @@ static int scp_map_memory_region(struct mtk_scp *scp)
 	return 0;
 }
 
+static void scp_unmap_memory_region(struct mtk_scp *scp)
+{
+	dma_free_coherent(scp->dev, scp->dram_size, scp->cpu_addr,
+			  scp->phys_addr);
+	of_reserved_mem_device_release(scp->dev);
+}
+
 static struct mtk_rpmsg_info mtk_scp_rpmsg_info = {
 	.send_ipi = scp_ipi_send,
 	.register_ipi = scp_ipi_register,
@@ -594,20 +603,20 @@ static int scp_probe(struct platform_device *pdev)
 	if (IS_ERR(scp->clk)) {
 		dev_err(dev, "Failed to get clock\n");
 		ret = PTR_ERR(scp->clk);
-		goto free_rproc;
+		goto release_dev_mem;
 	}
 
 	ret = clk_prepare_enable(scp->clk);
 	if (ret) {
 		dev_err(dev, "failed to enable clocks\n");
-		goto free_rproc;
+		goto release_dev_mem;
 	}
 
 	ret = scp_ipi_init(scp);
 	clk_disable_unprepare(scp->clk);
 	if (ret) {
 		dev_err(dev, "Failed to init ipi\n");
-		goto free_rproc;
+		goto release_dev_mem;
 	}
 
 	/* register SCP initialization IPI */
@@ -617,7 +626,7 @@ static int scp_probe(struct platform_device *pdev)
 			       scp);
 	if (ret) {
 		dev_err(dev, "Failed to register IPI_SCP_INIT\n");
-		goto free_rproc;
+		goto release_dev_mem;
 	}
 
 	mutex_init(&scp->lock);
@@ -645,6 +654,8 @@ static int scp_probe(struct platform_device *pdev)
 remove_subdev:
 	scp_remove_rpmsg_subdev(scp);
 	mutex_destroy(&scp->lock);
+release_dev_mem:
+	scp_unmap_memory_region(scp);
 free_rproc:
 	rproc_free(rproc);
 
@@ -658,6 +669,7 @@ static int scp_remove(struct platform_device *pdev)
 	scp_remove_rpmsg_subdev(scp);
 	rproc_del(scp->rproc);
 	rproc_free(scp->rproc);
+	scp_unmap_memory_region(scp);
 
 	return 0;
 }
-- 
2.18.0




* [RFC PATCH V2 6/6] media: platform: mtk-mdp3: Add struct tuning_addr to identify user tuning data
       [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
                   ` (3 preceding siblings ...)
  2019-07-08 11:04 ` [RFC PATCH V2 5/6] remoteproc/mediatek: add SCP's shared dma pool support for mt8183 frederic.chen
@ 2019-07-08 11:05 ` frederic.chen
       [not found] ` <20190708110500.7242-5-frederic.chen@mediatek.com>
  5 siblings, 0 replies; 9+ messages in thread
From: frederic.chen @ 2019-07-08 11:05 UTC (permalink / raw)
  To: hans.verkuil, laurent.pinchart+renesas, tfiga, matthias.bgg, mchehab
  Cc: shik, devicetree, Sean.Cheng, Rynn.Wu, Allan.Yang,
	srv_heupstream, holmes.chiou, suleiman, Jerry-ch.Chen, jungo.lin,
	sj.huang, yuzhao, linux-mediatek, zwisler, christie.yu,
	frederic.chen, linux-arm-kernel, linux-media

From: Frederic Chen <frederic.chen@mediatek.com>

This patch adds a struct tuning_addr containing a "present" field
so that the driver can tell the firmware whether there is user
tuning data or not.

Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
---
 drivers/media/platform/mtk-mdp3/mtk-img-ipi.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/mtk-mdp3/mtk-img-ipi.h b/drivers/media/platform/mtk-mdp3/mtk-img-ipi.h
index 9fabe7e8b71d..0944fe911d97 100644
--- a/drivers/media/platform/mtk-mdp3/mtk-img-ipi.h
+++ b/drivers/media/platform/mtk-mdp3/mtk-img-ipi.h
@@ -38,6 +38,12 @@ struct img_addr {
 	u32	iova;	/* Used by IOMMU HW access */
 } __attribute__ ((__packed__));
 
+struct tuning_addr {
+	u32	present;
+	u32	pa;	/* Used by CM4 access */
+	u32	iova;	/* Used by IOMMU HW access */
+} __attribute__ ((__packed__));
+
 struct img_sw_addr {
 	u64	va;	/* Used by APMCU access */
 	u32	pa;	/* Used by CM4 access */
@@ -105,7 +111,7 @@ struct img_ipi_frameparam {
 	u64		drv_data;
 	struct img_input	inputs[IMG_MAX_HW_INPUTS];
 	struct img_output	outputs[IMG_MAX_HW_OUTPUTS];
-	struct img_addr		tuning_data;
+	struct tuning_addr	tuning_data;
 	struct img_addr		subfrm_data;
 	struct img_sw_addr	config_data;
 	struct img_sw_addr  self_data;
-- 
2.18.0
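Since this structure is shared by value between the kernel and the SCP firmware, the packed layout matters. The expected layout can be sketched as follows (a sketch only; the little-endian assumption and sample values are illustrative, not from the patch):

```python
import struct

# struct tuning_addr { u32 present; u32 pa; u32 iova; } __packed:
# three little-endian u32 fields with no padding, 12 bytes total.
TUNING_ADDR_FMT = "<III"
assert struct.calcsize(TUNING_ADDR_FMT) == 12

# Pack a sample record the way the CM4 firmware would read it.
present, pa, iova = 1, 0x50000000, 0x40000000
raw = struct.pack(TUNING_ADDR_FMT, present, pa, iova)
assert struct.unpack(TUNING_ADDR_FMT, raw) == (present, pa, iova)
```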




* Re: [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings
  2019-07-08 11:04 ` [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings frederic.chen
@ 2019-07-24 16:49   ` Rob Herring
  0 siblings, 0 replies; 9+ messages in thread
From: Rob Herring @ 2019-07-24 16:49 UTC (permalink / raw)
  To: frederic.chen
  Cc: laurent.pinchart+renesas, Rynn.Wu, Allan.Yang, suleiman,
	Jerry-ch.Chen, jungo.lin, hans.verkuil, frederic.chen,
	linux-media, devicetree, holmes.chiou, shik, yuzhao,
	linux-mediatek, matthias.bgg, mchehab, linux-arm-kernel,
	Sean.Cheng, srv_heupstream, sj.huang, tfiga, christie.yu,
	zwisler

On Mon, 8 Jul 2019 19:04:55 +0800, <frederic.chen@mediatek.com> wrote:
> From: Frederic Chen <frederic.chen@mediatek.com>
> 
> This patch adds DT binding documentation for the Digital Image
> Processing (DIP) unit of camera ISP system on Mediatek's SoCs.
> 
> It depends on the SCP and MDP 3 patch as following:
> 
> 1. dt-bindings: Add a binding for Mediatek SCP
>    https://patchwork.kernel.org/patch/11027247/
> 2. dt-binding: mt8183: Add Mediatek MDP3 dt-bindings
>    https://patchwork.kernel.org/patch/10945603/
> 
> Signed-off-by: Frederic Chen <frederic.chen@mediatek.com>
> ---
>  .../bindings/media/mediatek,mt8183-dip.txt    | 40 +++++++++++++++++++
>  1 file changed, 40 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8183-dip.txt
> 

Reviewed-by: Rob Herring <robh@kernel.org>



* Re: [RFC PATCH V2 4/6] platform: mtk-isp: Add Mediatek DIP driver
       [not found]     ` <1566875772.10064.99.camel@mtksdccf07>
@ 2019-08-30  7:14       ` Tomasz Figa
  2019-09-06  9:16         ` Frederic Chen
  0 siblings, 1 reply; 9+ messages in thread
From: Tomasz Figa @ 2019-08-30  7:14 UTC (permalink / raw)
  To: Frederic Chen
  Cc: Kurt.Yang, Shih-Fang.Chuang, laurent.pinchart+renesas,
	Rynn Wu (吳育恩),
	Allan Yang (楊智鈞),
	suleiman, Jerry-ch Chen (陳敬憲),
	Jungo Lin (林明俊),
	hans.verkuil, Hung-Wen Hsieh (謝鴻文),
	linux-media, devicetree, Holmes Chiou (邱挺),
	shik, yuzhao, linux-mediatek, matthias.bgg, mchehab,
	linux-arm-kernel, Sean Cheng (鄭昇弘),
	srv_heupstream, Sj Huang (黃信璋),
	Christie Yu (游雅惠),
	zwisler, Alan-YL.Hu

Hi Frederic,

On Tue, Aug 27, 2019 at 12:16 PM Frederic Chen
<frederic.chen@mediatek.com> wrote:
>
> Dear Tomasz,
>
> I appreciate your comment. I will collaborate more closely with Jungo
> to solve the common issues in DIP and Pass 1(CAM) drivers.
>

Thank you!

Also thanks for replying to all the comments, it's very helpful.
Please check my replies inline. I've snipped out the parts that I
don't have further comments on.

>
> On Wed, 2019-07-31 at 15:10 +0800, Tomasz Figa wrote:
> > Hi Frederic,
> >
> > On Mon, Jul 08, 2019 at 07:04:58PM +0800, frederic.chen@mediatek.com wrote:

[snip]

> >
> > > +                   dev_buf->vbb.vb2_buf.timestamp =
> > > +                           in_buf->vbb.vb2_buf.timestamp;
> > > +
> > > +           vb2_buffer_done(&dev_buf->vbb.vb2_buf, vbf_state);
> > > +
> > > +           node = mtk_dip_vbq_to_node(dev_buf->vbb.vb2_buf.vb2_queue);
> > > +           spin_lock(&node->buf_list_lock);
> > > +           list_del(&dev_buf->list);
> > > +           spin_unlock(&node->buf_list_lock);
> > > +
> > > +           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > +                   "%s:%s: return buf, idx(%d), state(%d)\n",
> > > +                   pipe->desc->name, node->desc->name,
> > > +                   dev_buf->vbb.vb2_buf.index, vbf_state);
> > > +   }
> >
> > This looks almost the same as what's being done inside
> > mtk_dip_hw_streamoff(). Could we just call this function from the loop
> > there?
>
> I would like to call the function from mtk_dip_hw_streamoff(). The only
> difference is that mtk_dip_pipe_job_finish() also removes the buffer from the
> node's internal list.
>

Would anything wrong happen if we also remove the buffer from the
node's internal list in mtk_dip_hw_streamoff()?

Actually, do we need that internal node list? If we have a list of
requests and each request stores its buffer, wouldn't that be enough?

[snip]

> >
> > > +
> > > +   width = ALIGN(width, 4);
> > > +   stride = DIV_ROUND_UP(width * 3, 2);
> > > +   stride = DIV_ROUND_UP(stride * pixel_byte, 8);
> > > +
> > > +   if (pix_fmt == V4L2_PIX_FMT_MTISP_F10)
> > > +           stride =  ALIGN(stride, 4);
> > > +   else
> > > +           stride =  ALIGN(stride, 8);
> >
> > Could you explain all the calculations done above?
>
> If the buffer comes from mtk-cam-p1, the stride setting must be the same
> as p1. I would like to re-implement the codes following p1's design in
> v4 patch as following:
>
> static u32
> mtk_dip_pass1_cal_pack_stride(u32 width, u32 pix_fmt) {
>         unsigned int bpl;
>         unsigned int pixel_bits =
>                 get_pixel_byte_by_fmt(mp->pixelformat);
>
>         /* Bayer encoding format, P1 HW needs 2 bytes alignment */
>         bpl = ALIGN(DIV_ROUND_UP(width * pixel_bits, 8), 2);
>
>         /* The setting also needs 4 bytes alignment for DIP HW */
>         return ALIGN(bpl, 4);;

If we need 4 bytes alignment, wouldn't the bytes per line value end up
different from P1 anyway? Perhaps we can just remove the ALIGN(..., 2)
above and keep this one? It should be the userspace responsibility
anyway to choose a format compatible with both consumer and producer.

By the way, double semicolon in the line above. :)

> }
>
>
> static __u32
> mtk_dip_pass1_cal_main_stride(u32 width, u32 pix_fmt)
> {
>         unsigned int bpl, ppl;
>         unsigned int pixel_bits =
>                 get_pixel_byte_by_fmt(mp->pixelformat);
>
>         /*
>          * The FULL-G encoding format
>          * 1 G component per pixel
>          * 1 R component per 4 pixel
>          * 1 B component per 4 pixel
>          * Total 4G/1R/1B in 4 pixel (pixel per line:ppl)
>          */
>         ppl = DIV_ROUND_UP(width * 6, 4);
>         bpl = DIV_ROUND_UP(ppl * pixel_bits, 8);
>
>         /* 4 bytes alignment for 10 bit & others are 8 bytes */
>         if (pixel_bits == 10)
>                 bpl = ALIGN(bpl, 4);
>         else
>                 bpl = ALIGN(bpl, 8);
>         }

Spurious }.

>
>         return bpl;
> }
>

Looks good to me, thanks!
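For reference, the FULL-G stride arithmetic quoted above works out as sketched below (a sketch only; the widths and bit depths are illustrative, not values from the driver):

```python
def div_round_up(n, d):
    return (n + d - 1) // d

def full_g_stride(width, pixel_bits):
    """Sketch of the mtk_dip_pass1_cal_main_stride() logic quoted above."""
    # FULL-G: 4 G + 1 R + 1 B components per 4 pixels -> 6 components / 4 px.
    ppl = div_round_up(width * 6, 4)
    bpl = div_round_up(ppl * pixel_bits, 8)
    # 4-byte alignment for 10-bit data, 8-byte alignment otherwise.
    align = 4 if pixel_bits == 10 else 8
    return div_round_up(bpl, align) * align

# 640 pixels, 10-bit: ppl = 960, bpl = 1200, already 4-byte aligned.
assert full_g_stride(640, 10) == 1200
assert full_g_stride(640, 8) == 960
```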

>
> >
> > > +
> > > +   return stride;
> > > +}
> > > +
> > > +static int is_stride_need_to_align(u32 format, u32 *need_aligned_fmts,
> > > +                              int length)
> > > +{
> > > +   int i;
> > > +
> > > +   for (i = 0; i < length; i++) {
> > > +           if (format == need_aligned_fmts[i])
> > > +                   return true;
> > > +   }
> > > +
> > > +   return false;
> > > +}
> > > +
> > > +/* Stride that is accepted by MDP HW */
> > > +static u32 dip_mdp_fmt_get_stride(struct v4l2_pix_format_mplane *mfmt,
> > > +                             const struct mtk_dip_dev_format *fmt,
> > > +                             unsigned int plane)
> > > +{
> > > +   enum mdp_color c = fmt->mdp_color;
> > > +   u32 bytesperline = (mfmt->width * fmt->row_depth[plane]) / 8;
> > > +   u32 stride = (bytesperline * MDP_COLOR_BITS_PER_PIXEL(c))
> > > +           / fmt->row_depth[0];
> > > +
> > > +   if (plane == 0)
> > > +           return stride;
> > > +
> > > +   if (plane < MDP_COLOR_GET_PLANE_COUNT(c)) {
> > > +           if (MDP_COLOR_IS_BLOCK_MODE(c))
> > > +                   stride = stride / 2;
> > > +           return stride;
> > > +   }
> > > +
> > > +   return 0;
> > > +}
> > > +
> > > +/* Stride that is accepted by MDP HW of format with contiguous planes */
> > > +static u32
> > > +dip_mdp_fmt_get_stride_contig(const struct mtk_dip_dev_format *fmt,
> > > +                         u32 pix_stride,
> > > +                         unsigned int plane)
> > > +{
> > > +   enum mdp_color c = fmt->mdp_color;
> > > +   u32 stride = pix_stride;
> > > +
> > > +   if (plane == 0)
> > > +           return stride;
> > > +
> > > +   if (plane < MDP_COLOR_GET_PLANE_COUNT(c)) {
> > > +           stride = stride >> MDP_COLOR_GET_H_SUBSAMPLE(c);
> > > +           if (MDP_COLOR_IS_UV_COPLANE(c) && !MDP_COLOR_IS_BLOCK_MODE(c))
> > > +                   stride = stride * 2;
> > > +           return stride;
> > > +   }
> > > +
> > > +   return 0;
> > > +}
> > > +
> > > +/* Plane size that is accepted by MDP HW */
> > > +static u32
> > > +dip_mdp_fmt_get_plane_size(const struct mtk_dip_dev_format *fmt,
> > > +                      u32 stride, u32 height,
> > > +                      unsigned int plane)
> > > +{
> > > +   enum mdp_color c = fmt->mdp_color;
> > > +   u32 bytesperline;
> > > +
> > > +   bytesperline = (stride * fmt->row_depth[0])
> > > +           / MDP_COLOR_BITS_PER_PIXEL(c);
> >
> > Hmm, stride and bytesperline should be exactly the same thing. Could you
> > explain what's happening here?
>
> The stride here is specific to the MDP hardware (which uses the same MDP
> stride setting for NV12 and NV12M):
>
>         bytesperline = width * row_depth / 8
>         MDP stride   = width * MDP_COLOR_BITS_PER_PIXEL /8
>
> Therefore,
>
>         bytesperline = MDP stride * row_depth / MDP_COLOR_BITS_PER_PIXEL
>         MDP stride   = bytesperline * MDP_COLOR_BITS_PER_PIXEL/ row_depth
>

I'm sorry I'm still confused. Is there an intermediate buffer between
DIP and MDP that has stride of |MDP stride| and then MDP writes to the
final buffer that has the stride of |bytesperline|?

[snip]

> >
> > > +           u32 sizeimage;
> > > +
> > > +           if (bpl < min_bpl)
> > > +                   bpl = min_bpl;
> > > +
> > > +           sizeimage = (bpl * mfmt_to_fill->height * dev_fmt->depth[i])
> > > +                   / dev_fmt->row_depth[i];
> >
> > Shouldn't this be bpl * fmt->height?
>
> Row_depth is the bits per pixel of the plane.
> Depth means the total bits per pixel of the image format.
>
> For example,
> Image: 640 * 480
> YV12: row_depth = 8, depth = 12

YV12 has 3 planes of 8 bits per pixel. Not sure where does this 12 come from.

> Bytes per line = width * row_depth / 8 = 640 * 8/ 8 = 640
> Image size = Bytes per line * height * (depth/ row_depth)
>            = 640 * 480 * 1.5
>
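The 4:2:0 arithmetic quoted above can be checked numerically (a sketch only; the depth/row_depth values follow the driver's convention of bits per pixel discussed in this thread):

```python
def sizeimage(width, height, row_depth, depth):
    # bytesperline covers one line of the first plane.
    bpl = width * row_depth // 8
    # depth / row_depth scales the single-plane size up to all planes.
    return bpl * height * depth // row_depth

# YV12 (4:2:0): 8 bits/pixel luma plane, 12 bits/pixel total across planes.
assert sizeimage(640, 480, 8, 12) == 640 * 480 * 3 // 2  # 460800 bytes
```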

I think we might be having some terminology issue here. "row" is
normally the same as "line", which consists of |width| pixels +
padding, which is |bytesperline| bytes in total.

Perhaps you want to store a bits_per_pixel[] and vertical and
horizontal subsampling arrays for all planes of the formats in the
format descriptor.

By the way, have you considered using the V4L2 format helpers [1]?

[1] https://elixir.bootlin.com/linux/v5.3-rc6/source/drivers/media/v4l2-core/v4l2-common.c#L561

> >
> > > +           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > +                   "Non-contiguous-mp-buf(%s),total-plane-size(%d),dma_port(%d)\n",
> > > +                   buf_name,
> > > +                   total_plane_size,
> > > +                   b->usage);
> > > +           return 0;
> > > +   }
> > > +
> > > +   for (; i < MDP_COLOR_GET_PLANE_COUNT(b->format.colorformat); ++i) {
> >
> > I don't see these MDP_ macros defined anywhere. Please don't use macros
> > defined from other drivers. We can embed this information in the
> > mtk_dip_dev_format struct. One would normally call it num_cplanes (color
> > planes).
> >
> > However, I would just make this driver support M formats only and forget
> > about formats with only memory planes count != color planes count.
>
> Since the non-M formats are still used by 8183's user lib now, may I add
> num_cplanes and support both formats?
>

Okay, let's keep them for now.

[snip]

> > > +
> > > +   /* Tuning buffer */
> > > +   dev_buf_tuning =
> > > +           pipe_job_info->buf_map[MTK_DIP_VIDEO_NODE_ID_TUNING_OUT];
> > > +   if (dev_buf_tuning) {
> > > +           dev_dbg(&pdev->dev,
> > > +                   "Tuning buf queued: scp_daddr(%pad),isp_daddr(%pad)\n",
> > > +                   &dev_buf_tuning->scp_daddr[0],
> > > +                   &dev_buf_tuning->isp_daddr[0]);
> > > +           dip_param->tuning_data.pa =
> > > +                   (uint32_t)dev_buf_tuning->scp_daddr[0];
> > > +           dip_param->tuning_data.present = true;
> > > +           dip_param->tuning_data.iova =
> > > +                   (uint32_t)dev_buf_tuning->isp_daddr[0];
> >
> > Why are these casts needed?
>
> This structure is shared between CPU and scp and the pa and iova is
> defined as 32bits fields.
>
> struct tuning_addr {
>         u32     present;
>         u32     pa;     /* Used by CM4 access */
>         u32     iova;   /* Used by IOMMU HW access */
> } __attribute__ ((__packed__));
>

I see, thanks.

[snip]

> >
> > > +#define DIP_COMPOSING_MAX_NUM              3
> > > +#define DIP_MAX_ERR_COUNT          188U
> > > +#define DIP_FLUSHING_WQ_TIMEOUT            (16U * DIP_MAX_ERR_COUNT)
> >
> > Any rationale behind these particular numbers? Please add comments explaining
> > them.
>
> I would like to adjust the timeout value to 1000 / 30 *
> (DIP_COMPOSING_MAX_NUM) * 3.
>
> This waits 3 times the expected frame time (@30 fps) in the worst case
> (max number of jobs in the SCP).
>

With high system load it could still be possible to hit this. Since it
should normally be impossible to hit this timeout on a system working
correctly, how about just making this something really long like 1
second?
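For reference, the formula Frederic proposes evaluates to roughly 300 ms (a sketch only; the integer frame-time math is an assumption, not code from the patch):

```python
DIP_COMPOSING_MAX_NUM = 3
FRAME_TIME_MS = 1000 // 30  # expected frame time at 30 fps, ~33 ms

# Wait 3x the expected frame time for each job that may be queued in the SCP.
timeout_ms = FRAME_TIME_MS * DIP_COMPOSING_MAX_NUM * 3
assert timeout_ms == 297  # well under the 1 second Tomasz suggests
```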

[snip]

> >
> > > +
> > > +   dev_dbg(&dip_dev->pdev->dev,
> > > +           "%s: wakeup frame_no(%d),num(%d)\n",
> > > +           __func__, req->img_fparam.frameparam.frame_no,
> > > +           atomic_read(&dip_hw->num_composing));
> > > +
> > > +   buf = req->working_buf;
> > > +   mtk_dip_wbuf_to_ipi_img_addr(&req->img_fparam.frameparam.subfrm_data,
> > > +                                &buf->buffer);
> > > +   memset(buf->buffer.vaddr, 0, DIP_SUB_FRM_SZ);
> > > +   mtk_dip_wbuf_to_ipi_img_sw_addr(&req->img_fparam.frameparam.config_data,
> > > +                                   &buf->config_data);
> > > +   memset(buf->config_data.vaddr, 0, DIP_COMP_SZ);
> > > +
> > > +   if (!req->img_fparam.frameparam.tuning_data.present) {
> > > +           /*
> > > +            * When user enqueued without tuning buffer,
> > > +            * it would use driver internal buffer.
> > > +            */
> > > +           dev_dbg(&dip_dev->pdev->dev,
> > > +                   "%s: frame_no(%d) has no tuning_data\n",
> > > +                   __func__, req->img_fparam.frameparam.frame_no);
> > > +
> > > +           mtk_dip_wbuf_to_ipi_tuning_addr
> > > +                           (&req->img_fparam.frameparam.tuning_data,
> > > +                            &buf->tuning_buf);
> > > +           memset(buf->tuning_buf.vaddr, 0, DIP_TUNING_SZ);
> > > +   }
> > > +
> > > +   req->img_fparam.frameparam.state = FRAME_STATE_COMPOSING;
> > > +   mtk_dip_wbuf_to_ipi_img_sw_addr(&req->img_fparam.frameparam.self_data,
> > > +                                   &buf->frameparam);
> > > +   memcpy(buf->frameparam.vaddr, &req->img_fparam.frameparam,
> > > +          sizeof(req->img_fparam.frameparam));
> >
> > Is img_fparam really used at this stage? I can only see ipi_param passed to
> > the IPI.
>
> The content of img_fparam is passed to SCP.
>
> The dip frame information is saved in req->img_fparam.frameparam (in
> mtk_dip_pipe_ipi_params_config()).
>
> The content of req->img_fparam.frameparam is copied to
> buf->frameparam.vaddr.
>
> Since we set ipi_param.frm_param.pa to the buf->frameparam.scp_daddr in
> mtk_dip_wbuf_to_ipi_img_sw_addr(), the content of img_fparam is passed to
> SCP through ipi_param.

Okay, I see. Thanks,

[snip]

> >
> > > +   } else {
> > > +           for (i = 0; i < *num_planes; i++) {
> > > +                   if (sizes[i] < fmt->fmt.pix_mp.plane_fmt[i].sizeimage) {
> > > +                           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > +                                   "%s:%s:%s: invalid buf: %u < %u\n",
> > > +                                   __func__, pipe->desc->name,
> > > +                                   node->desc->name, sizes[0], size);
> > > +                                   return -EINVAL;
> > > +                   }
> > > +                   sizes[i] = fmt->fmt.pix_mp.plane_fmt[i].sizeimage;
> >
> > For VIDIOC_CREATEBUFS we also need to handle the case when *num_planes > 0
> > and then we need to honor the values already present in sizes[]. (Note that
> > the code above overrides *num_planes to 1, so we lose the information. The
> > code needs to be restructured to handle that.)
>
> We override *num_planes only when *num_planes is 0. Is the modification I
> need to make to keep the original values in sizes[]?

Yes.

>
> if (!*num_planes) {
>         *num_planes = 1;
>         sizes[0] = fmt->fmt.pix_mp.plane_fmt[0].sizeimage;
> }
>

[snip]

> >
> > > +   dev_dbg(&pipe->dip_dev->pdev->dev,
> > > +           "%s:%s:%s cnt_nodes_not_streaming(%d), is_pipe_streamon(%d)\n",
> > > +           __func__, pipe->desc->name, node->desc->name, count,
> > > +           is_pipe_streamon);
> > > +
> > > +   if (count && is_pipe_streamon) {
> >
> > For v4l2_subdev_call() shouldn't this be !count && is_pipe_streamon?
>
> Do you mean that pipe->subdev's stop stream should be called after all
> video devices are streamed off, not at the first video device's stream off?
>

Partially. See the comment below. We should stop the subdev when the
first video node stops streaming, but stop the media pipeline only when
all the nodes have stopped.

> >
> > > +           ret = v4l2_subdev_call(&pipe->subdev, video, s_stream, 0);
> > > +           if (ret)
> > > +                   dev_err(&pipe->dip_dev->pdev->dev,
> > > +                           "%s:%s: sub dev s_stream(0) failed(%d)\n",
> > > +                           pipe->desc->name, node->desc->name, ret);
> > > +           media_pipeline_stop(&node->vdev.entity);
> >
> > We should do this one when the last node stops streaming to solve the
> > enable state locking issue as described in my comment to _start_streaming().
>
> I will do this when the last node stops streaming.
>

Ack.

> >
> > > +   }
> > > +
> > > +   mtk_dip_return_all_buffers(pipe, node, VB2_BUF_STATE_ERROR);
> > > +}
> > > +
> > > +static int mtk_dip_videoc_querycap(struct file *file, void *fh,
> > > +                              struct v4l2_capability *cap)
> > > +{
> > > +   struct mtk_dip_pipe *pipe = video_drvdata(file);
> > > +
> > > +   strlcpy(cap->driver, pipe->desc->name,
> > > +           sizeof(cap->driver));
> > > +   strlcpy(cap->card, pipe->desc->name,
> > > +           sizeof(cap->card));
> > > +   snprintf(cap->bus_info, sizeof(cap->bus_info),
> > > +            "platform:%s", dev_name(pipe->dip_dev->mdev.dev));
> > > +
> > > +   return 0;
> > > +}
> > > +
> > > +static int mtk_dip_videoc_try_fmt(struct file *file, void *fh,
> > > +                             struct v4l2_format *f)
> >
> > I don't see this function returning any error codes. Please make it void.
>
> mtk_dip_videoc_try_fmt() is used as vidioc_try_fmt_vid_out_mplane op.
> Using void seems to make it incompatible with
> vidioc_try_fmt_vid_out_mplane.
>
> .vidioc_try_fmt_vid_out_mplane = mtk_dip_videoc_try_fmt,
>
> int (*vidioc_try_fmt_vid_out_mplane)(struct file *file, void *fh,
>                                      struct v4l2_format *f);
>

Oops, sorry, I'm not sure why I suggested that.

[snip]

> > > +   /* Initialize subdev */
> > > +   v4l2_subdev_init(&pipe->subdev, &mtk_dip_subdev_ops);
> > > +
> > > +   pipe->subdev.entity.function =
> > > +           MEDIA_ENT_F_PROC_VIDEO_PIXEL_FORMATTER;
> > > +   pipe->subdev.entity.ops = &mtk_dip_media_ops;
> > > +   pipe->subdev.flags =
> > > +           V4L2_SUBDEV_FL_HAS_DEVNODE | V4L2_SUBDEV_FL_HAS_EVENTS;
> > > +   pipe->subdev.ctrl_handler = NULL;
> > > +   pipe->subdev.internal_ops = &mtk_dip_subdev_internal_ops;
> > > +
> > > +   for (i = 0; i < pipe->num_nodes; i++)
> > > +           pipe->subdev_pads[i].flags =
> > > +                   V4L2_TYPE_IS_OUTPUT(pipe->nodes[i].desc->buf_type) ?
> > > +                   MEDIA_PAD_FL_SINK : MEDIA_PAD_FL_SOURCE;
> >
> > Isn't this the other way around?
>
> I checked the documentation of MEDIA_PAD_FL_SINK and MEDIA_PAD_FL_SOURCE.
> It seems that the code matches the description.
>
> RAW Output video device: MEDIA_PAD_FL_SOURCE --> DIP sub dev:
> MEDIA_PAD_FL_SINK
> DIP sub dev: MEDIA_PAD_FL_SOURCE --> MDP Capture video device:
> MEDIA_PAD_FL_SINK
>
> MEDIA_PAD_FL_SINK: Input pad, relative to the entity. Input pads sink
> data and are targets of links.
> MEDIA_PAD_FL_SOURCE: Output pad, relative to the entity. Output pads
> source data and are origins of links.
>

Ah, that's right, sorry for the noise.

[snip]

> > > +   {
> > > +           .format = V4L2_PIX_FMT_NV12M,
> > > +           .mdp_color      = MDP_COLOR_NV12,
> > > +           .colorspace = V4L2_COLORSPACE_BT2020,
> > > +           .depth          = { 8, 4 },
> > > +           .row_depth      = { 8, 8 },
> >
> > What is depth and what is row_depth? They both seem to not match NV12, which
> > should have 16 bits per sample in the CbCr plane.
>
> Row_depth is the number of bits per pixel.

The bits per pixel of the CbCr plane is 16 for NV12.

> Depth means the bits per pixel of the image format.
>
> For example,
> Image: 640 * 480
> YV12: row_depth = 8, depth = 12
> Bytes per line = width * row_depth / 8 = 640 * 8/ 8 = 640
> Image size = Bytes per line * height * (depth/ row_depth)
>            = 640 * 480 * 1.5
>
> Image: 640 * 480
> Bytes per line(Y) = width * row_depth/ 8 = 640
> Bytes per line(CbCr) = width * row_depth/ 8 = 640
>
> Image size(Y) = Bytes per line * height * (depth/ row_depth)
>            = 640 * 480 * 8/8 = 640 * 480 * 1
>
> Image size(CbCr) = Bytes per line * height * (depth/ row_depth)
>            = 640 * 480 * 4/8 = 640 * 480 * 0.5
>

I'd suggest either using the V4L2 format helpers, as suggested in
another comment above with a link OR adopting the typical convention
of having the .depth mean the pixel value size, i.e. 16-bit for CbCr
plane of NV12 and then use subsampling factors to calculate the plane
bytesperline and sizeimage.

[snip]

> > > +static const struct mtk_dip_video_device_desc
> > > +reprocess_output_queues_setting[MTK_DIP_VIDEO_NODE_ID_OUT_TOTAL_NUM] = {
> > > +   {
> > > +           .id = MTK_DIP_VIDEO_NODE_ID_RAW_OUT,
> > > +           .name = "Raw Input",
> > > +           .cap = V4L2_CAP_VIDEO_OUTPUT_MPLANE | V4L2_CAP_STREAMING,
> > > +           .buf_type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
> > > +           .smem_alloc = 0,
> > > +           .flags = MEDIA_LNK_FL_ENABLED,
> > > +           .fmts = in_fmts,
> > > +           .num_fmts = ARRAY_SIZE(in_fmts),
> > > +           .default_fmt_idx = 5,
> > > +           .default_width = MTK_DIP_OUTPUT_MAX_WIDTH,
> > > +           .default_height = MTK_DIP_OUTPUT_MAX_HEIGHT,
> > > +           .dma_port = 0,
> > > +           .frmsizeenum = &in_frmsizeenum,
> > > +           .ops = &mtk_dip_v4l2_video_out_ioctl_ops,
> > > +           .description = "Source image to be process",
> > > +
> > > +   },
> > > +   {
> > > +           .id = MTK_DIP_VIDEO_NODE_ID_TUNING_OUT,
> > > +           .name = "Tuning",
> > > +           .cap = V4L2_CAP_META_OUTPUT | V4L2_CAP_STREAMING,
> > > +           .buf_type = V4L2_BUF_TYPE_META_OUTPUT,
> > > +           .smem_alloc = 1,
> > > +           .flags = 0,
> > > +           .fmts = fw_param_fmts,
> > > +           .num_fmts = 1,
> > > +           .default_fmt_idx = 0,
> > > +           .dma_port = 0,
> > > +           .frmsizeenum = &in_frmsizeenum,
> > > +           .ops = &mtk_dip_v4l2_meta_out_ioctl_ops,
> > > +           .description = "Tuning data for image enhancement",
> > > +   },
> > > +};
> >
> > The entries here seem to be almost the same to output_queues_setting[], the
> > only difference seems to be default_fmt_idx and description.
> >
> > What's the difference between the capture and reprocess pipes?
>
> The reprocess pipe is completely the same as the capture one, except for
> the default format of the input.
>

Does the default format really matter here? The userspace should set
its own desired format anyway. Then we could just unify the settings
of both pipes.

[snip]

> >
> > > +
> > > +   return num;
> > > +}
> > > +
> > > +static int __maybe_unused mtk_dip_suspend(struct device *dev)
> > > +{
> > > +   struct mtk_dip_dev *dip_dev = dev_get_drvdata(dev);
> > > +   int ret, num;
> > > +
> > > +   if (pm_runtime_suspended(dev)) {
> > > +           dev_dbg(dev, "%s: pm_runtime_suspended is true, no action\n",
> > > +                   __func__);
> > > +           return 0;
> > > +   }
> > > +
> > > +   ret = wait_event_timeout(dip_dev->dip_hw.flushing_hw_wq,
> > > +                            !(num = mtk_dip_get_scp_job_num(dip_dev)),
> > > +                            msecs_to_jiffies(200));
> >
> > Is 200 milliseconds a reasonable timeout here, i.e. for any potential use
> > case it would always take less than 200 ms to wait for all the tasks running
> > in the ISP?
>
> I would like to adjust the timeout value to 1000 / 30 *
> DIP_COMPOSING_MAX_NUM * 3.
>
> That waits 3 times the expected frame time (@30 fps) in the worst case
> (max number of jobs in the SCP).
>

As I suggested in another comment above, the worst case for the
hardware might still be better than the scheduling latency we could
see on a heavily loaded system. While that wouldn't really apply here,
because nothing else is happening in parallel while the system is
suspending, we could just stick to some really long timeout anyway,
e.g. 1 second. We shouldn't hit it anyway - it's just a safety guard
to prevent the system from hanging if something goes really wrong.

Best regards,
Tomasz

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH V2 4/6] platform: mtk-isp: Add Mediatek DIP driver
  2019-08-30  7:14       ` [RFC PATCH V2 4/6] platform: mtk-isp: Add Mediatek DIP driver Tomasz Figa
@ 2019-09-06  9:16         ` Frederic Chen
  2019-09-10  3:45           ` Tomasz Figa
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Chen @ 2019-09-06  9:16 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: yuzhao, linux-mediatek, zwisler, matthias.bgg, mchehab, linux-arm-kernel

Hi Tomasz,

Thank you for your comments.


On Fri, 2019-08-30 at 16:14 +0900, Tomasz Figa wrote:
> Hi Frederic,
> 
> On Tue, Aug 27, 2019 at 12:16 PM Frederic Chen
> <frederic.chen@mediatek.com> wrote:
> >
> > Dear Tomasz,
> >
> > I appreciate your comment. I will collaborate more closely with Jungo
> > to solve the common issues in DIP and Pass 1(CAM) drivers.
> >
> 
> Thank you!
> 
> Also thanks for replying to all the comments, it's very helpful.
> Please check my replies inline. I've snipped out the parts that I
> don't have further comments on.
> 
> >
> > On Wed, 2019-07-31 at 15:10 +0800, Tomasz Figa wrote:
> > > Hi Frederic,
> > >
> > > On Mon, Jul 08, 2019 at 07:04:58PM +0800, frederic.chen@mediatek.com wrote:
> 
> [snip]
> 
> > >
> > > > +                   dev_buf->vbb.vb2_buf.timestamp =
> > > > +                           in_buf->vbb.vb2_buf.timestamp;
> > > > +
> > > > +           vb2_buffer_done(&dev_buf->vbb.vb2_buf, vbf_state);
> > > > +
> > > > +           node = mtk_dip_vbq_to_node(dev_buf->vbb.vb2_buf.vb2_queue);
> > > > +           spin_lock(&node->buf_list_lock);
> > > > +           list_del(&dev_buf->list);
> > > > +           spin_unlock(&node->buf_list_lock);
> > > > +
> > > > +           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > > +                   "%s:%s: return buf, idx(%d), state(%d)\n",
> > > > +                   pipe->desc->name, node->desc->name,
> > > > +                   dev_buf->vbb.vb2_buf.index, vbf_state);
> > > > +   }
> > >
> > > This looks almost the same as what's being done inside
> > > mtk_dip_hw_streamoff(). Could we just call this function from the loop
> > > there?
> >
> > I would like to call the function from mtk_dip_hw_streamoff(). The only
> > difference is mtk_dip_pipe_job_finish() also remove the buffer from the
> > node's internal list.
> >
> 
> Would anything wrong happen if we also remove the buffer from the
> node's internal list in mtk_dip_hw_streamoff()?
> 
> Actually, do we need that internal node list? If we have a list of
> requests and each request stores its buffer, wouldn't that be enough?
> 

We use the buffer list in the following cases:
1. media_pipeline_start() failed when streaming on video device
2. Video device stream off

If some video device is streamed on, but the entire pipe has not
started streaming (for example, MDP 0 is streamed on, but the RAW input
has not been), we use the list to return the buffers.

Should we handle these cases? Or do we expect that the user will request
buffers again to ensure all buffers are removed from the video device in
this error case?


> [snip]
> 
> > >
> > > > +
> > > > +   width = ALIGN(width, 4);
> > > > +   stride = DIV_ROUND_UP(width * 3, 2);
> > > > +   stride = DIV_ROUND_UP(stride * pixel_byte, 8);
> > > > +
> > > > +   if (pix_fmt == V4L2_PIX_FMT_MTISP_F10)
> > > > +           stride =  ALIGN(stride, 4);
> > > > +   else
> > > > +           stride =  ALIGN(stride, 8);
> > >
> > > Could you explain all the calculations done above?
> >
> > If the buffer comes from mtk-cam-p1, the stride setting must be the same
> > as P1. I would like to re-implement the code following P1's design in
> > the v4 patch as follows:
> >
> > static u32
> > mtk_dip_pass1_cal_pack_stride(u32 width, u32 pix_fmt) {
> >         unsigned int bpl;
> >         unsigned int pixel_bits =
> >                 get_pixel_byte_by_fmt(mp->pixelformat);
> >
> >         /* Bayer encoding format, P1 HW needs 2 bytes alignment */
> >         bpl = ALIGN(DIV_ROUND_UP(width * pixel_bits, 8), 2);
> >
> >         /* The setting also needs 4 bytes alignment for DIP HW */
> >         return ALIGN(bpl, 4);;
> 
> If we need 4 bytes alignment, wouldn't the bytes per line value end up
> different from P1 anyway? Perhaps we can just remove the ALIGN(..., 2)
> above and keep this one? It should be the userspace responsibility
> anyway to choose a format compatible with both consumer and producer.
> 
> By the way, double semicolon in the line above. :)
> 

I would like to remove the ALIGN(..., 2).

> > }
> >
> >
> > static __u32
> > mtk_dip_pass1_cal_main_stride(u32 width, u32 pix_fmt)
> > {
> >         unsigned int bpl, ppl;
> >         unsigned int pixel_bits =
> >                 get_pixel_byte_by_fmt(mp->pixelformat);
> >
> >         /*
> >          * The FULL-G encoding format
> >          * 1 G component per pixel
> >          * 1 R component per 4 pixel
> >          * 1 B component per 4 pixel
> >          * Total 4G/1R/1B in 4 pixel (pixel per line:ppl)
> >          */
> >         ppl = DIV_ROUND_UP(width * 6, 4);
> >         bpl = DIV_ROUND_UP(ppl * pixel_bits, 8);
> >
> >         /* 4 bytes alignment for 10 bit & others are 8 bytes */
> >         if (pixel_bits == 10)
> >                 bpl = ALIGN(bpl, 4);
> >         else
> >                 bpl = ALIGN(bpl, 8);
> >         }
> 
> Spurious }.
> 
> >
> >         return bpl;
> > }
> >
> 
> Looks good to me, thanks!
> 
> >
> > >
> > > > +
> > > > +   return stride;
> > > > +}
> > > > +
> > > > +static int is_stride_need_to_align(u32 format, u32 *need_aligned_fmts,
> > > > +                              int length)
> > > > +{
> > > > +   int i;
> > > > +
> > > > +   for (i = 0; i < length; i++) {
> > > > +           if (format == need_aligned_fmts[i])
> > > > +                   return true;
> > > > +   }
> > > > +
> > > > +   return false;
> > > > +}
> > > > +
> > > > +/* Stride that is accepted by MDP HW */
> > > > +static u32 dip_mdp_fmt_get_stride(struct v4l2_pix_format_mplane *mfmt,
> > > > +                             const struct mtk_dip_dev_format *fmt,
> > > > +                             unsigned int plane)
> > > > +{
> > > > +   enum mdp_color c = fmt->mdp_color;
> > > > +   u32 bytesperline = (mfmt->width * fmt->row_depth[plane]) / 8;
> > > > +   u32 stride = (bytesperline * MDP_COLOR_BITS_PER_PIXEL(c))
> > > > +           / fmt->row_depth[0];
> > > > +
> > > > +   if (plane == 0)
> > > > +           return stride;
> > > > +
> > > > +   if (plane < MDP_COLOR_GET_PLANE_COUNT(c)) {
> > > > +           if (MDP_COLOR_IS_BLOCK_MODE(c))
> > > > +                   stride = stride / 2;
> > > > +           return stride;
> > > > +   }
> > > > +
> > > > +   return 0;
> > > > +}
> > > > +
> > > > +/* Stride that is accepted by MDP HW of format with contiguous planes */
> > > > +static u32
> > > > +dip_mdp_fmt_get_stride_contig(const struct mtk_dip_dev_format *fmt,
> > > > +                         u32 pix_stride,
> > > > +                         unsigned int plane)
> > > > +{
> > > > +   enum mdp_color c = fmt->mdp_color;
> > > > +   u32 stride = pix_stride;
> > > > +
> > > > +   if (plane == 0)
> > > > +           return stride;
> > > > +
> > > > +   if (plane < MDP_COLOR_GET_PLANE_COUNT(c)) {
> > > > +           stride = stride >> MDP_COLOR_GET_H_SUBSAMPLE(c);
> > > > +           if (MDP_COLOR_IS_UV_COPLANE(c) && !MDP_COLOR_IS_BLOCK_MODE(c))
> > > > +                   stride = stride * 2;
> > > > +           return stride;
> > > > +   }
> > > > +
> > > > +   return 0;
> > > > +}
> > > > +
> > > > +/* Plane size that is accepted by MDP HW */
> > > > +static u32
> > > > +dip_mdp_fmt_get_plane_size(const struct mtk_dip_dev_format *fmt,
> > > > +                      u32 stride, u32 height,
> > > > +                      unsigned int plane)
> > > > +{
> > > > +   enum mdp_color c = fmt->mdp_color;
> > > > +   u32 bytesperline;
> > > > +
> > > > +   bytesperline = (stride * fmt->row_depth[0])
> > > > +           / MDP_COLOR_BITS_PER_PIXEL(c);
> > >
> > > Hmm, stride and bytesperline should be exactly the same thing. Could you
> > > explain what's happening here?
> >
> > The stride here is specific for MDP hardware (which uses the same MDP
> > stride setting for NV12 and NV12M):
> >
> >         bytesperline = width * row_depth / 8
> >         MDP stride   = width * MDP_COLOR_BITS_PER_PIXEL /8
> >
> > Therefore,
> >
> >         bytesperline = MDP stride * row_depth / MDP_COLOR_BITS_PER_PIXEL
> >         MDP stride   = bytesperline * MDP_COLOR_BITS_PER_PIXEL/ row_depth
> >
> 
> I'm sorry I'm still confused. Is there an intermediate buffer between
> DIP and MDP that has stride of |MDP stride| and then MDP writes to the
> final buffer that has the stride of |bytesperline|?
> 

No, there is no intermediate buffer between DIP and MDP with a stride
of |MDP stride|. DIP connects to MDP at the hardware level, so MDP
writes the buffer with |MDP stride|.

As far as I know, V4L2's bytesperline means bytes per line of the first
plane(*), but the MDP hardware needs separate Y, U, and V strides (which
differs from V4L2). Therefore we calculate the |MDP stride| here.

*:
https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/pixfmt-v4l2.html#c.v4l2_pix_format
"When the image format is planar the bytesperline value applies to the
first plane and is divided by the same factor as the width field for the
other planes."

> [snip]
> 
> > >
> > > > +           u32 sizeimage;
> > > > +
> > > > +           if (bpl < min_bpl)
> > > > +                   bpl = min_bpl;
> > > > +
> > > > +           sizeimage = (bpl * mfmt_to_fill->height * dev_fmt->depth[i])
> > > > +                   / dev_fmt->row_depth[i];
> > >
> > > Shouldn't this be bpl * fmt->height?
> >
> > Row_depth is the bits of the pixel.
> > Depth means the bytes per pixel of the image format.
> >
> > For example,
> > Image: 640 * 480
> > YV12: row_depth = 8, depth = 12
> 
> YV12 has 3 planes of 8 bits per pixel. Not sure where does this 12 come from.
> 

Let me elaborate more on the depth value of 12.

depth: bits per pixel

For 420,

y = w * h
u = (w/2) * (h/2)
v = (w/2) * (h/2)

Therefore,

y = 8
u = 8/2/2 = 2
v = 8/2/2 = 2

depth (y + u + v) = 8 + 2 + 2 = 12


> > Bytes per line = width * row_depth / 8 = 640 * 8/ 8 = 640
> > Image size = Bytes per line * height * (depth/ row_depth)
> >            = 640 * 480 * 1.5
> >
> 
> I think we might be having some terminology issue here. "row" is
> normally the same as "line", which consists of |width| pixels +
> padding, which is |bytesperline| bytes in total.
> 
> Perhaps you want to store a bits_per_pixel[] and vertical and
> horizontal subsampling arrays for all planes of the formats in the
> format descriptor.
> 
> By the way, have you considered using the V4L2 format helpers [1]?
> 
> [1] https://elixir.bootlin.com/linux/v5.3-rc6/source/drivers/media/v4l2-core/v4l2-common.c#L561

Would it be possible to keep row_depth and depth? They are already used
in the MDP driver:

https://elixir.bootlin.com/linux/v5.3-rc6/source/drivers/media/platform/mtk-mdp/mtk_mdp_m2m.c


> 
> > >
> > > > +           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > > +                   "Non-contiguous-mp-buf(%s),total-plane-size(%d),dma_port(%d)\n",
> > > > +                   buf_name,
> > > > +                   total_plane_size,
> > > > +                   b->usage);
> > > > +           return 0;
> > > > +   }
> > > > +
> > > > +   for (; i < MDP_COLOR_GET_PLANE_COUNT(b->format.colorformat); ++i) {
> > >
> > > I don't see these MDP_ macros defined anywhere. Please don't use macros
> > > defined from other drivers. We can embed this information in the
> > > mtk_dip_dev_format struct. One would normally call it num_cplanes (color
> > > planes).
> > >
> > > However, I would just make this driver support M formats only and forget
> > > about formats with only memory planes count != color planes count.
> >
> > Since the non-M formats are still used by 8183's user lib now, may I add
> > num_cplanes and support the both formats?
> >
> 
> Okay, let's keep them for now.
> 
> [snip]
> 
> > > > +
> > > > +   /* Tuning buffer */
> > > > +   dev_buf_tuning =
> > > > +           pipe_job_info->buf_map[MTK_DIP_VIDEO_NODE_ID_TUNING_OUT];
> > > > +   if (dev_buf_tuning) {
> > > > +           dev_dbg(&pdev->dev,
> > > > +                   "Tuning buf queued: scp_daddr(%pad),isp_daddr(%pad)\n",
> > > > +                   &dev_buf_tuning->scp_daddr[0],
> > > > +                   &dev_buf_tuning->isp_daddr[0]);
> > > > +           dip_param->tuning_data.pa =
> > > > +                   (uint32_t)dev_buf_tuning->scp_daddr[0];
> > > > +           dip_param->tuning_data.present = true;
> > > > +           dip_param->tuning_data.iova =
> > > > +                   (uint32_t)dev_buf_tuning->isp_daddr[0];
> > >
> > > Why are these casts needed?
> >
> > This structure is shared between CPU and scp and the pa and iova is
> > defined as 32bits fields.
> >
> > struct tuning_addr {
> >         u32     present;
> >         u32     pa;     /* Used by CM4 access */
> >         u32     iova;   /* Used by IOMMU HW access */
> > } __attribute__ ((__packed__));
> >
> 
> I see, thanks.
> 
> [snip]
> 
> > >
> > > > +#define DIP_COMPOSING_MAX_NUM              3
> > > > +#define DIP_MAX_ERR_COUNT          188U
> > > > +#define DIP_FLUSHING_WQ_TIMEOUT            (16U * DIP_MAX_ERR_COUNT)
> > >
> > > Any rationale behind this particular numbers? Please add comments explaining
> > > them.
> >
> > I would like to adjust the timeout value to 1000 / 30 *
> > DIP_COMPOSING_MAX_NUM * 3.
> >
> > That waits 3 times the expected frame time (@30 fps) in the worst case
> > (max number of jobs in the SCP).
> >
> 
> With high system load it could still be possible to hit this. Since it
> should normally be impossible to hit this timeout on a system working
> correctly, how about just making this something really long, like 1
> second?
> 

I got it. I would like to use 1 second instead.


> [snip]
> 
> > >
> > > > +
> > > > +   dev_dbg(&dip_dev->pdev->dev,
> > > > +           "%s: wakeup frame_no(%d),num(%d)\n",
> > > > +           __func__, req->img_fparam.frameparam.frame_no,
> > > > +           atomic_read(&dip_hw->num_composing));
> > > > +
> > > > +   buf = req->working_buf;
> > > > +   mtk_dip_wbuf_to_ipi_img_addr(&req->img_fparam.frameparam.subfrm_data,
> > > > +                                &buf->buffer);
> > > > +   memset(buf->buffer.vaddr, 0, DIP_SUB_FRM_SZ);
> > > > +   mtk_dip_wbuf_to_ipi_img_sw_addr(&req->img_fparam.frameparam.config_data,
> > > > +                                   &buf->config_data);
> > > > +   memset(buf->config_data.vaddr, 0, DIP_COMP_SZ);
> > > > +
> > > > +   if (!req->img_fparam.frameparam.tuning_data.present) {
> > > > +           /*
> > > > +            * When user enqueued without tuning buffer,
> > > > +            * it would use driver internal buffer.
> > > > +            */
> > > > +           dev_dbg(&dip_dev->pdev->dev,
> > > > +                   "%s: frame_no(%d) has no tuning_data\n",
> > > > +                   __func__, req->img_fparam.frameparam.frame_no);
> > > > +
> > > > +           mtk_dip_wbuf_to_ipi_tuning_addr
> > > > +                           (&req->img_fparam.frameparam.tuning_data,
> > > > +                            &buf->tuning_buf);
> > > > +           memset(buf->tuning_buf.vaddr, 0, DIP_TUNING_SZ);
> > > > +   }
> > > > +
> > > > +   req->img_fparam.frameparam.state = FRAME_STATE_COMPOSING;
> > > > +   mtk_dip_wbuf_to_ipi_img_sw_addr(&req->img_fparam.frameparam.self_data,
> > > > +                                   &buf->frameparam);
> > > > +   memcpy(buf->frameparam.vaddr, &req->img_fparam.frameparam,
> > > > +          sizeof(req->img_fparam.frameparam));
> > >
> > > Is img_fparam really used at this stage? I can only see ipi_param passed to
> > > the IPI.
> >
> > The content of img_fparam is passed to SCP.
> >
> > The dip frame information is saved in req->img_fparam.frameparam (in
> > mtk_dip_pipe_ipi_params_config()).
> >
> > The content of req->img_fparam.frameparam is copied to
> > buf->frameparam.vaddr.
> >
> > Since we set ipi_param.frm_param.pa to the buf->frameparam.scp_daddr in
> > mtk_dip_wbuf_to_ipi_img_sw_addr(), the content of img_fparam is passed to
> > SCP through ipi_param.
> 
> Okay, I see. Thanks,
> 
> [snip]
> 
> > >
> > > > +   } else {
> > > > +           for (i = 0; i < *num_planes; i++) {
> > > > +                   if (sizes[i] < fmt->fmt.pix_mp.plane_fmt[i].sizeimage) {
> > > > +                           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > > +                                   "%s:%s:%s: invalid buf: %u < %u\n",
> > > > +                                   __func__, pipe->desc->name,
> > > > +                                   node->desc->name, sizes[0], size);
> > > > +                                   return -EINVAL;
> > > > +                   }
> > > > +                   sizes[i] = fmt->fmt.pix_mp.plane_fmt[i].sizeimage;
> > >
> > > For VIDIOC_CREATEBUFS we also need to handle the case when *num_planes > 0
> > > and then we need to honor the values already present in sizes[]. (Note that
> > > the code above overrides *num_planes to 1, so we lose the information. The
> > > code needs to be restructured to handle that.)
> >
> > We override *num_planes only when *num_planes is 0. Is the modification I
> > need to make to keep the original values in sizes[]?
> 
> Yes.
> 
> >
> > if (!*num_planes) {
> >         *num_planes = 1;
> >         sizes[0] = fmt->fmt.pix_mp.plane_fmt[0].sizeimage;
> > }
> >
> 
> [snip]
> 
> > >
> > > > +   dev_dbg(&pipe->dip_dev->pdev->dev,
> > > > +           "%s:%s:%s cnt_nodes_not_streaming(%d), is_pipe_streamon(%d)\n",
> > > > +           __func__, pipe->desc->name, node->desc->name, count,
> > > > +           is_pipe_streamon);
> > > > +
> > > > +   if (count && is_pipe_streamon) {
> > >
> > > For v4l2_subdev_call() shouldn't this be !count && is_pipe_streamon?
> >
> > Do you mean that pipe->subdev's stop stream should be called after all
> > video devices are streamed off, not at the first video device's stream off?
> >
> 
> Partially. See the comment below. We should stop the subdev when the
> first video node stops streaming, but stop the media pipeline only when
> all the nodes have stopped.

I got it.

> 
> > >
> > > > +           ret = v4l2_subdev_call(&pipe->subdev, video, s_stream, 0);
> > > > +           if (ret)
> > > > +                   dev_err(&pipe->dip_dev->pdev->dev,
> > > > +                           "%s:%s: sub dev s_stream(0) failed(%d)\n",
> > > > +                           pipe->desc->name, node->desc->name, ret);
> > > > +           media_pipeline_stop(&node->vdev.entity);
> > >
> > > We should do this one when the last node stops streaming to solve the
> > > enable state locking issue as described in my comment to _start_streaming().
> >
> > I will do this when the last node stops streaming.
> >
> 
> Ack.
> 
> > >
> > > > +   }
> > > > +
> > > > +   mtk_dip_return_all_buffers(pipe, node, VB2_BUF_STATE_ERROR);
> > > > +}
> > > > +
> > > > +static int mtk_dip_videoc_querycap(struct file *file, void *fh,
> > > > +                              struct v4l2_capability *cap)
> > > > +{
> > > > +   struct mtk_dip_pipe *pipe = video_drvdata(file);
> > > > +
> > > > +   strlcpy(cap->driver, pipe->desc->name,
> > > > +           sizeof(cap->driver));
> > > > +   strlcpy(cap->card, pipe->desc->name,
> > > > +           sizeof(cap->card));
> > > > +   snprintf(cap->bus_info, sizeof(cap->bus_info),
> > > > +            "platform:%s", dev_name(pipe->dip_dev->mdev.dev));
> > > > +
> > > > +   return 0;
> > > > +}
> > > > +
> > > > +static int mtk_dip_videoc_try_fmt(struct file *file, void *fh,
> > > > +                             struct v4l2_format *f)
> > >
> > > I don't see this function returning any error codes. Please make it void.
> >
> > mtk_dip_videoc_try_fmt() is used as vidioc_try_fmt_vid_out_mplane op.
> > Using void seems to make it incompatible with
> > vidioc_try_fmt_vid_out_mplane.
> >
> > .vidioc_try_fmt_vid_out_mplane = mtk_dip_videoc_try_fmt,
> >
> > int (*vidioc_try_fmt_vid_out_mplane)(struct file *file, void *fh,
> >                                      struct v4l2_format *f);
> >
> 
> Oops, sorry, I'm not sure why I suggested that.
> 
> [snip]
> 
> > > > +   /* Initialize subdev */
> > > > +   v4l2_subdev_init(&pipe->subdev, &mtk_dip_subdev_ops);
> > > > +
> > > > +   pipe->subdev.entity.function =
> > > > +           MEDIA_ENT_F_PROC_VIDEO_PIXEL_FORMATTER;
> > > > +   pipe->subdev.entity.ops = &mtk_dip_media_ops;
> > > > +   pipe->subdev.flags =
> > > > +           V4L2_SUBDEV_FL_HAS_DEVNODE | V4L2_SUBDEV_FL_HAS_EVENTS;
> > > > +   pipe->subdev.ctrl_handler = NULL;
> > > > +   pipe->subdev.internal_ops = &mtk_dip_subdev_internal_ops;
> > > > +
> > > > +   for (i = 0; i < pipe->num_nodes; i++)
> > > > +           pipe->subdev_pads[i].flags =
> > > > +                   V4L2_TYPE_IS_OUTPUT(pipe->nodes[i].desc->buf_type) ?
> > > > +                   MEDIA_PAD_FL_SINK : MEDIA_PAD_FL_SOURCE;
> > >
> > > Isn't this the other way around?
> >
> > I checked the documentation of MEDIA_PAD_FL_SINK and MEDIA_PAD_FL_SOURCE.
> > It seems that the code matches the description.
> >
> > RAW Output video device: MEDIA_PAD_FL_SOURCE --> DIP sub dev:
> > MEDIA_PAD_FL_SINK
> > DIP sub dev: MEDIA_PAD_FL_SOURCE --> MDP Capture video device:
> > MEDIA_PAD_FL_SINK
> >
> > MEDIA_PAD_FL_SINK: Input pad, relative to the entity. Input pads sink
> > data and are targets of links.
> > MEDIA_PAD_FL_SOURCE: Output pad, relative to the entity. Output pads
> > source data and are origins of links.
> >
> 
> Ah, that's right, sorry for the noise.
> 
> [snip]
> 
> > > > +   {
> > > > +           .format = V4L2_PIX_FMT_NV12M,
> > > > +           .mdp_color      = MDP_COLOR_NV12,
> > > > +           .colorspace = V4L2_COLORSPACE_BT2020,
> > > > +           .depth          = { 8, 4 },
> > > > +           .row_depth      = { 8, 8 },
> > >
> > > What is depth and what is row_depth? They both seem to not match NV12, which
> > > should have 16 bits per sample in the CbCr plane.
> >
> > Row_depth is the bits per pixel within a line.
> 
> Bits per pixel of the CbCr plane is 16 for NV12.
> 
> > Depth means the total bits per pixel of the image format.
> >
> > For example,
> > Image: 640 * 480
> > YV12: row_depth = 8, depth = 12
> > Bytes per line = width * row_depth / 8 = 640 * 8 / 8 = 640
> > Image size = Bytes per line * height * (depth / row_depth)
> >            = 640 * 480 * 1.5
> >
> > Image: 640 * 480
> > Bytes per line (Y) = width * row_depth / 8 = 640
> > Bytes per line (CbCr) = width * row_depth / 8 = 640
> >
> > Image size (Y) = Bytes per line * height * (depth / row_depth)
> >            = 640 * 480 * 8 / 8 = 640 * 480 * 1
> >
> > Image size (CbCr) = Bytes per line * height * (depth / row_depth)
> >            = 640 * 480 * 4 / 8 = 640 * 480 * 0.5
> >
> 
> I'd suggest either using the V4L2 format helpers, as suggested in
> another comment above with a link, OR adopting the typical convention
> of having .depth mean the pixel value size, i.e. 16 bits for the CbCr
> plane of NV12, and then using subsampling factors to calculate the
> per-plane bytesperline and sizeimage.
> 
> [snip]
> 
> > > > +static const struct mtk_dip_video_device_desc
> > > > +reprocess_output_queues_setting[MTK_DIP_VIDEO_NODE_ID_OUT_TOTAL_NUM] = {
> > > > +   {
> > > > +           .id = MTK_DIP_VIDEO_NODE_ID_RAW_OUT,
> > > > +           .name = "Raw Input",
> > > > +           .cap = V4L2_CAP_VIDEO_OUTPUT_MPLANE | V4L2_CAP_STREAMING,
> > > > +           .buf_type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
> > > > +           .smem_alloc = 0,
> > > > +           .flags = MEDIA_LNK_FL_ENABLED,
> > > > +           .fmts = in_fmts,
> > > > +           .num_fmts = ARRAY_SIZE(in_fmts),
> > > > +           .default_fmt_idx = 5,
> > > > +           .default_width = MTK_DIP_OUTPUT_MAX_WIDTH,
> > > > +           .default_height = MTK_DIP_OUTPUT_MAX_HEIGHT,
> > > > +           .dma_port = 0,
> > > > +           .frmsizeenum = &in_frmsizeenum,
> > > > +           .ops = &mtk_dip_v4l2_video_out_ioctl_ops,
> > > > +           .description = "Source image to be processed",
> > > > +
> > > > +   },
> > > > +   {
> > > > +           .id = MTK_DIP_VIDEO_NODE_ID_TUNING_OUT,
> > > > +           .name = "Tuning",
> > > > +           .cap = V4L2_CAP_META_OUTPUT | V4L2_CAP_STREAMING,
> > > > +           .buf_type = V4L2_BUF_TYPE_META_OUTPUT,
> > > > +           .smem_alloc = 1,
> > > > +           .flags = 0,
> > > > +           .fmts = fw_param_fmts,
> > > > +           .num_fmts = 1,
> > > > +           .default_fmt_idx = 0,
> > > > +           .dma_port = 0,
> > > > +           .frmsizeenum = &in_frmsizeenum,
> > > > +           .ops = &mtk_dip_v4l2_meta_out_ioctl_ops,
> > > > +           .description = "Tuning data for image enhancement",
> > > > +   },
> > > > +};
> > >
> > > The entries here seem to be almost the same to output_queues_setting[], the
> > > only difference seems to be default_fmt_idx and description.
> > >
> > > What's the difference between the capture and reprocess pipes?
> >
> > The reprocess pipe is completely the same as the capture one except
> > for the default format of the input.
> >
> 
> Does the default format really matter here? The userspace should set
> its own desired format anyway. Then we could just unify the settings
> of both pipes.
> 

I would like to remove the reprocess_output_queues_setting.

> [snip]
> 
> > >
> > > > +
> > > > +   return num;
> > > > +}
> > > > +
> > > > +static int __maybe_unused mtk_dip_suspend(struct device *dev)
> > > > +{
> > > > +   struct mtk_dip_dev *dip_dev = dev_get_drvdata(dev);
> > > > +   int ret, num;
> > > > +
> > > > +   if (pm_runtime_suspended(dev)) {
> > > > +           dev_dbg(dev, "%s: pm_runtime_suspended is true, no action\n",
> > > > +                   __func__);
> > > > +           return 0;
> > > > +   }
> > > > +
> > > > +   ret = wait_event_timeout(dip_dev->dip_hw.flushing_hw_wq,
> > > > +                            !(num = mtk_dip_get_scp_job_num(dip_dev)),
> > > > +                            msecs_to_jiffies(200));
> > >
> > > Is 200 milliseconds a reasonable timeout here, i.e. for any potential use
> > > case it would always take less than 200 ms to wait for all the tasks running
> > > in the ISP?
> >
> > I would like to adjust the timeout value to 1000 / 30 *
> > DIP_COMPOSING_MAX_NUM * 3.
> >
> > That waits 3 times the expected frame time (@30 fps) in the worst
> > case (max number of jobs in the SCP).
> >
> 
> As I suggested in another comment above, the worst case for the
> hardware might still be better than the scheduling latency we could
> get on a heavily loaded system. While that wouldn't really apply here,
> because nothing else is happening in parallel when the system is
> suspending, we could just stick to some really long timeout anyway,
> e.g. 1 second. We shouldn't hit it anyway - it's just a safety guard
> to prevent the system from hanging if something goes really bad.
> 

I got it. I will use 1 second as the timeout setting.

> Best regards,
> Tomasz


Sincerely,

Frederic Chen



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [RFC PATCH V2 4/6] platform: mtk-isp: Add Mediatek DIP driver
  2019-09-06  9:16         ` Frederic Chen
@ 2019-09-10  3:45           ` Tomasz Figa
  0 siblings, 0 replies; 9+ messages in thread
From: Tomasz Figa @ 2019-09-10  3:45 UTC (permalink / raw)
  To: Frederic Chen
  Cc: yuzhao, linux-mediatek, zwisler, matthias.bgg, mchehab, linux-arm-kernel

On Fri, Sep 6, 2019 at 6:17 PM Frederic Chen <frederic.chen@mediatek.com> wrote:
>
> Hi Tomasz,
>
> Thank you for your comments.
>
>
> On Fri, 2019-08-30 at 16:14 +0900, Tomasz Figa wrote:
> > Hi Frederic,
> >
> > On Tue, Aug 27, 2019 at 12:16 PM Frederic Chen
> > <frederic.chen@mediatek.com> wrote:
> > >
> > > Dear Tomasz,
> > >
> > > I appreciate your comment. I will collaborate more closely with Jungo
> > > to solve the common issues in DIP and Pass 1(CAM) drivers.
> > >
> >
> > Thank you!
> >
> > Also thanks for replying to all the comments, it's very helpful.
> > Please check my replies inline. I've snipped out the parts that I
> > don't have further comments on.
> >
> > >
> > > On Wed, 2019-07-31 at 15:10 +0800, Tomasz Figa wrote:
> > > > Hi Frederic,
> > > >
> > > > On Mon, Jul 08, 2019 at 07:04:58PM +0800, frederic.chen@mediatek.com wrote:
> >
> > [snip]
> >
> > > >
> > > > > +                   dev_buf->vbb.vb2_buf.timestamp =
> > > > > +                           in_buf->vbb.vb2_buf.timestamp;
> > > > > +
> > > > > +           vb2_buffer_done(&dev_buf->vbb.vb2_buf, vbf_state);
> > > > > +
> > > > > +           node = mtk_dip_vbq_to_node(dev_buf->vbb.vb2_buf.vb2_queue);
> > > > > +           spin_lock(&node->buf_list_lock);
> > > > > +           list_del(&dev_buf->list);
> > > > > +           spin_unlock(&node->buf_list_lock);
> > > > > +
> > > > > +           dev_dbg(&pipe->dip_dev->pdev->dev,
> > > > > +                   "%s:%s: return buf, idx(%d), state(%d)\n",
> > > > > +                   pipe->desc->name, node->desc->name,
> > > > > +                   dev_buf->vbb.vb2_buf.index, vbf_state);
> > > > > +   }
> > > >
> > > > This looks almost the same as what's being done inside
> > > > mtk_dip_hw_streamoff(). Could we just call this function from the loop
> > > > there?
> > >
> > > I would like to call the function from mtk_dip_hw_streamoff(). The only
> difference is that mtk_dip_pipe_job_finish() also removes the buffer
> from the node's internal list.
> > >
> >
> Would anything go wrong if we also removed the buffer from the
> node's internal list in mtk_dip_hw_streamoff()?
> >
> > Actually, do we need that internal node list? If we have a list of
> > requests and each request stores its buffer, wouldn't that be enough?
> >
>
> We use the buffer list in the following cases:
> 1. media_pipeline_start() fails when streaming on a video device
> 2. Video device stream off
>
> If some video device is streamed on, but the entire pipeline has not
> started streaming (for example, MDP 0 is streamed on but the RAW input
> has not been), we use the list to return the buffers.
>
> Should we handle these cases, or do we expect the user to request
> buffers again to ensure all buffers are removed from the video device
> in this error case?
>

However, if we only support the Request API, there wouldn't be any
buffers outside of any request. For a request that isn't queued, the
buffers are not passed to the driver, so it doesn't need to do
anything. The only thing left to handle is when there are some
requests queued and those requests already include the buffer request
objects inside, so we can handle them without a separate internal
buffer list.

> > > > > +/* Plane size that is accepted by MDP HW */
> > > > > +static u32
> > > > > +dip_mdp_fmt_get_plane_size(const struct mtk_dip_dev_format *fmt,
> > > > > +                      u32 stride, u32 height,
> > > > > +                      unsigned int plane)
> > > > > +{
> > > > > +   enum mdp_color c = fmt->mdp_color;
> > > > > +   u32 bytesperline;
> > > > > +
> > > > > +   bytesperline = (stride * fmt->row_depth[0])
> > > > > +           / MDP_COLOR_BITS_PER_PIXEL(c);
> > > >
> > > > Hmm, stride and bytesperline should be exactly the same thing. Could you
> > > > explain what's happening here?
> > >
> > > The stride here is specific to the MDP hardware (which uses the same MDP
> > > stride setting for NV12 and NV12M):
> > >
> > >         bytesperline = width * row_depth / 8
> > >         MDP stride   = width * MDP_COLOR_BITS_PER_PIXEL /8
> > >
> > > Therefore,
> > >
> > >         bytesperline = MDP stride * row_depth / MDP_COLOR_BITS_PER_PIXEL
> > >         MDP stride   = bytesperline * MDP_COLOR_BITS_PER_PIXEL/ row_depth
> > >
> >
> > I'm sorry I'm still confused. Is there an intermediate buffer between
> > DIP and MDP that has stride of |MDP stride| and then MDP writes to the
> > final buffer that has the stride of |bytesperline|?
> >
>
> No, there is no intermediate buffer between DIP and MDP with a stride
> of |MDP stride|. DIP connects to MDP at the hardware level, so MDP
> writes the buffer with |MDP stride|.
>
> As far as I know, V4L2's bytesperline means the bytes per line of the
> first plane (*), but the MDP hardware needs separate Y, U, and V
> strides (different from V4L2). Therefore we calculate the |MDP stride|
> here.
>
> *:
> https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/pixfmt-v4l2.html#c.v4l2_pix_format
> "When the image format is planar the bytesperline value applies to the
> first plane and is divided by the same factor as the width field for the
> other planes."

However, we're using v4l2_pix_format_mplane, not v4l2_pix_format. If the
pixelformat is an M variant (e.g. V4L2_PIX_FMT_NV12M),
v4l2_pix_format_mplane::plane_fmt[] contains bytesperline and sizeimage
for each plane separately. That's the reason I'm strongly suggesting
abandoning the non-M formats and supporting only the M ones.

>
> > [snip]
> >
> > > >
> > > > > +           u32 sizeimage;
> > > > > +
> > > > > +           if (bpl < min_bpl)
> > > > > +                   bpl = min_bpl;
> > > > > +
> > > > > +           sizeimage = (bpl * mfmt_to_fill->height * dev_fmt->depth[i])
> > > > > +                   / dev_fmt->row_depth[i];
> > > >
> > > > Shouldn't this be bpl * fmt->height?
> > >
> > > Row_depth is the bits per pixel within a line.
> > > Depth means the total bits per pixel of the image format.
> > >
> > > For example,
> > > Image: 640 * 480
> > > YV12: row_depth = 8, depth = 12
> >
> > YV12 has 3 planes of 8 bits per pixel. Not sure where this 12 comes from.
> >
>
> Let me elaborate more on the depth of 12.
>
> depth: total bits per pixel
>
> For 4:2:0 subsampling, the plane areas are:
>
> y = w * h
> u = (w/2) * (h/2)
> v = (w/2) * (h/2)
>
> Therefore, the bit contributions per pixel are:
>
> y = 8
> u = 8 / 2 / 2 = 2
> v = 8 / 2 / 2 = 2
>
> depth (y + u + v) = 8 + 2 + 2 = 12
>

Yes, that's what I understood, but it is by no means associated with
any physical depth. It's more like a "total_bits_per_pixel". That's
however a very error-prone way to express pixel formats, as it can't
handle, for example, the case where plane stride alignments are not
proportional.

Please see the v4l2_format_info struct for the recommended way of
describing pixel formats. Actually, the struct itself could be reused
in this driver, even if we don't end up using the helpers for some
reason.

>
> > > Bytes per line = width * row_depth / 8 = 640 * 8 / 8 = 640
> > > Image size = Bytes per line * height * (depth / row_depth)
> > >            = 640 * 480 * 1.5
> > >
> >
> > I think we might be having some terminology issue here. "row" is
> > normally the same as "line", which consists of |width| pixels +
> > padding, which is |bytesperline| bytes in total.
> >
> > Perhaps you want to store a bits_per_pixel[] and vertical and
> > horizontal subsampling arrays for all planes of the formats in the
> > format descriptor.
> >
> > By the way, have you considered using the V4L2 format helpers [1]?
> >
> > [1] https://elixir.bootlin.com/linux/v5.3-rc6/source/drivers/media/v4l2-core/v4l2-common.c#L561
>
> Would it be possible to keep row_depth and depth? They are already
> used in the MDP driver.
>
> https://elixir.bootlin.com/linux/v5.3-rc6/source/drivers/media/platform/mtk-mdp/mtk_mdp_m2m.c
>

I'd suggest cleaning up the MDP driver instead. :)
(Could be done as a follow up later, of course.)

Best regards,
Tomasz



end of thread, other threads:[~2019-09-10  3:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190708110500.7242-1-frederic.chen@mediatek.com>
2019-07-08 11:04 ` [RFC PATCH V2 1/6] dt-bindings: mt8183: Added DIP dt-bindings frederic.chen
2019-07-24 16:49   ` Rob Herring
2019-07-08 11:04 ` [RFC PATCH V2 2/6] dts: arm64: mt8183: Add DIP nodes frederic.chen
2019-07-08 11:04 ` [RFC PATCH V2 3/6] media: platform: Add Mediatek DIP driver KConfig frederic.chen
2019-07-08 11:04 ` [RFC PATCH V2 5/6] remoteproc/mediatek: add SCP's shared dma pool support for mt8183 frederic.chen
2019-07-08 11:05 ` [RFC PATCH V2 6/6] media: platform: mtk-mdp3: Add struct tuning_addr to identify user tuning data frederic.chen
     [not found] ` <20190708110500.7242-5-frederic.chen@mediatek.com>
     [not found]   ` <20190731071014.GA43159@chromium.org>
     [not found]     ` <1566875772.10064.99.camel@mtksdccf07>
2019-08-30  7:14       ` [RFC PATCH V2 4/6] platform: mtk-isp: Add Mediatek DIP driver Tomasz Figa
2019-09-06  9:16         ` Frederic Chen
2019-09-10  3:45           ` Tomasz Figa
