* [PATCH v9 00/13] CSI2RX support on J721E and AM62
@ 2023-08-11 10:47 Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 01/13] media: dt-bindings: Make sure items in data-lanes are unique Jai Luthra
                   ` (13 more replies)
  0 siblings, 14 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

Hi,

This series adds support for CSI2 capture on J721E. It includes some
fixes to the Cadence CSI2RX driver, and adds the TI CSI2RX wrapper driver.

This is v9 of the series. v8 is available at:
https://lore.kernel.org/r/20230731-upstream_csi-v8-0-fb7d3661c2c9@ti.com

Testing logs: https://gist.github.com/jailuthra/eaeb3af3c65b67e1bc0d5db28180131d

The J721E CSI2RX driver can also be extended to support multi-stream
capture, filtering different CSI Virtual Channels (VC) or Data Types
(DT) to different DMA channels. A WIP series based on v7 is available
for reference at https://github.com/jailuthra/linux/commits/csi_multi_wip

I will rebase the multi-stream patches on the current series (v9) and
post them as RFC in the coming weeks.

Signed-off-by: Jai Luthra <j-luthra@ti.com>
---

Changelog from v8
=================

Range-diff: https://0x0.st/H_xh.diff

Dropped the following patches:
[v8 01/16] media: subdev: Export get_format helper for link validation
	- Using subdev's get_fmt directly instead
[v8 04/16] media: cadence: Add support for TI SoCs
	- Don't add a compatible if we are not using it in the driver
[v8 14/16] media: cadence: csi2rx: Support RAW8 and RAW10 formats
	- Squashed into a previous patch [v8 07/16]

For [05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops:
- Squash the patch adding RAW8 and RAW10 formats within this one
- Single line struct entries in formats[] array
- Skip specifying redundant format.which entry in init_cfg()

For [06/13] media: cadence: csi2rx: Configure DPHY using link freq:
- Don't specify stream while calling .get_fmt()

For [07/13] media: cadence: csi2rx: Soft reset the streams before starting capture:
- Simplify reset sequence, minimizing delays

For [08/13] media: cadence: csi2rx: Set the STOP bit when stopping a stream:
- Better log message to avoid confusion between cadence streams and v4l2
  streams

For [13/13] media: ti: Add CSI2RX support for J721E:
- Allocate drain buffer at start of stream instead of doing it in the
  middle, and document why it is needed in comments
- Call subdev's get_fmt directly for link_validation()
- Cleanup height/width clamping and rounding code, document it in comments
- Return and check errors from setup_shim()
- s/subdev/source for cadence csi2rx's v4l2_subdev
- s/ti_csi2rx_init_subdev/ti_csi2rx_notifier_register
- Change copyright year/author list

---
Jai Luthra (1):
      media: dt-bindings: cadence-csi2rx: Add TI compatible string

Pratyush Yadav (12):
      media: dt-bindings: Make sure items in data-lanes are unique
      media: cadence: csi2rx: Unregister v4l2 async notifier
      media: cadence: csi2rx: Cleanup media entity properly
      media: cadence: csi2rx: Add get_fmt and set_fmt pad ops
      media: cadence: csi2rx: Configure DPHY using link freq
      media: cadence: csi2rx: Soft reset the streams before starting capture
      media: cadence: csi2rx: Set the STOP bit when stopping a stream
      media: cadence: csi2rx: Fix stream data configuration
      media: cadence: csi2rx: Populate subdev devnode
      media: cadence: csi2rx: Add link validation
      media: dt-bindings: Add TI J721E CSI2RX
      media: ti: Add CSI2RX support for J721E

 .../devicetree/bindings/media/cdns,csi2rx.yaml     |    1 +
 .../bindings/media/ti,j721e-csi2rx-shim.yaml       |  100 ++
 .../bindings/media/video-interfaces.yaml           |    1 +
 MAINTAINERS                                        |    7 +
 drivers/media/platform/cadence/cdns-csi2rx.c       |  181 ++-
 drivers/media/platform/ti/Kconfig                  |   12 +
 drivers/media/platform/ti/Makefile                 |    1 +
 drivers/media/platform/ti/j721e-csi2rx/Makefile    |    2 +
 .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c  | 1150 ++++++++++++++++++++
 9 files changed, 1448 insertions(+), 7 deletions(-)
---
base-commit: 21ef7b1e17d039053edaeaf41142423810572741
change-id: 20230727-upstream_csi-acbeabe038d8

Best regards,
-- 
Jai Luthra <j-luthra@ti.com>


* [PATCH v9 01/13] media: dt-bindings: Make sure items in data-lanes are unique
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string Jai Luthra
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

The data-lanes property maps the logical lane numbers to the physical
lane numbers. The position of an entry is the logical lane number and
its value is the physical lane number. Since one physical lane can only
map to one logical lane, no number in the list should repeat. Add the
uniqueItems constraint on the property to enforce this.
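
For illustration only, the rule that uniqueItems enforces at schema
validation time can be sketched in plain C. This is not part of the
patch; data_lanes_unique() is a hypothetical helper name, and the array
follows the convention described above (index = logical lane, value =
physical lane).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if no physical lane number appears more than once.
 * lanes[i] is the physical lane assigned to logical lane i.
 */
static bool data_lanes_unique(const uint32_t *lanes, size_t count)
{
	for (size_t i = 0; i < count; i++)
		for (size_t j = i + 1; j < count; j++)
			if (lanes[i] == lanes[j])
				return false;

	return true;
}
```

With this sketch, a list such as <1 2 3 4> passes, while <1 2 2 4> is
rejected because physical lane 2 would be mapped to two logical lanes.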

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 Documentation/devicetree/bindings/media/video-interfaces.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/media/video-interfaces.yaml b/Documentation/devicetree/bindings/media/video-interfaces.yaml
index a211d49dc2ac..26e3e7d7c67b 100644
--- a/Documentation/devicetree/bindings/media/video-interfaces.yaml
+++ b/Documentation/devicetree/bindings/media/video-interfaces.yaml
@@ -160,6 +160,7 @@ properties:
     $ref: /schemas/types.yaml#/definitions/uint32-array
     minItems: 1
     maxItems: 8
+    uniqueItems: true
     items:
       # Assume up to 9 physical lane indices
       maximum: 8

-- 
2.41.0


* [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 01/13] media: dt-bindings: Make sure items in data-lanes are unique Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-25  3:44   ` Laurent Pinchart
  2023-08-11 10:47 ` [PATCH v9 03/13] media: cadence: csi2rx: Unregister v4l2 async notifier Jai Luthra
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

Add a SoC-specific compatible string for TI's integration of this IP in
the J7 and AM62 lines of SoCs.

Reviewed-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 Documentation/devicetree/bindings/media/cdns,csi2rx.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml b/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
index 30a335b10762..2008a47c0580 100644
--- a/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
+++ b/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
@@ -18,6 +18,7 @@ properties:
     items:
       - enum:
           - starfive,jh7110-csi2rx
+          - ti,j721e-csi2rx
       - const: cdns,csi2rx
 
   reg:

-- 
2.41.0


* [PATCH v9 03/13] media: cadence: csi2rx: Unregister v4l2 async notifier
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 01/13] media: dt-bindings: Make sure items in data-lanes are unique Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 04/13] media: cadence: csi2rx: Cleanup media entity properly Jai Luthra
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

The notifier is added to the global notifier list when registered. When
the module is removed, the struct csi2rx_priv in which the notifier is
embedded is destroyed. As a result, the notifier list holds a reference
to a notifier that no longer exists. This causes invalid memory accesses
when the list is iterated over. The same happens when the probe fails.
Unregister and clean up the notifier to avoid this.
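
The failure mode can be modelled in userspace with a small sketch. This
is not the V4L2 async code: struct notifier, notifier_list and the
helper names below are hypothetical stand-ins for a global list that
holds nodes embedded in a driver's private structure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal doubly linked list node, standing in for a notifier. */
struct notifier {
	struct notifier *prev, *next;
};

/* Global list head; an empty list points at itself. */
static struct notifier notifier_list = { &notifier_list, &notifier_list };

struct priv {
	int id;
	struct notifier notifier;	/* embedded, freed together with priv */
};

static void notifier_register(struct notifier *n)
{
	n->next = notifier_list.next;
	n->prev = &notifier_list;
	notifier_list.next->prev = n;
	notifier_list.next = n;
}

static void notifier_unregister(struct notifier *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Walk the global list; this would touch freed memory if a node dangled. */
static size_t notifier_count(void)
{
	size_t count = 0;

	for (struct notifier *n = notifier_list.next; n != &notifier_list;
	     n = n->next)
		count++;

	return count;
}

/* Unlink the embedded node before freeing the memory that contains it. */
static void priv_destroy(struct priv *p)
{
	notifier_unregister(&p->notifier);
	free(p);
}
```

If priv_destroy() skipped the unregister step, the next call to
notifier_count() would dereference freed memory, which is the userspace
analogue of the invalid accesses described above.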

Fixes: 1fc3b37f34f6 ("media: v4l: cadence: Add Cadence MIPI-CSI2 RX driver")

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 drivers/media/platform/cadence/cdns-csi2rx.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 0d879d71d818..9231ee7e9b3a 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -479,8 +479,10 @@ static int csi2rx_parse_dt(struct csi2rx_priv *csi2rx)
 	asd = v4l2_async_nf_add_fwnode_remote(&csi2rx->notifier, fwh,
 					      struct v4l2_async_connection);
 	of_node_put(ep);
-	if (IS_ERR(asd))
+	if (IS_ERR(asd)) {
+		v4l2_async_nf_cleanup(&csi2rx->notifier);
 		return PTR_ERR(asd);
+	}
 
 	csi2rx->notifier.ops = &csi2rx_notifier_ops;
 
@@ -543,6 +545,7 @@ static int csi2rx_probe(struct platform_device *pdev)
 	return 0;
 
 err_cleanup:
+	v4l2_async_nf_unregister(&csi2rx->notifier);
 	v4l2_async_nf_cleanup(&csi2rx->notifier);
 err_free_priv:
 	kfree(csi2rx);
@@ -553,6 +556,8 @@ static void csi2rx_remove(struct platform_device *pdev)
 {
 	struct csi2rx_priv *csi2rx = platform_get_drvdata(pdev);
 
+	v4l2_async_nf_unregister(&csi2rx->notifier);
+	v4l2_async_nf_cleanup(&csi2rx->notifier);
 	v4l2_async_unregister_subdev(&csi2rx->subdev);
 	kfree(csi2rx);
 }

-- 
2.41.0


* [PATCH v9 04/13] media: cadence: csi2rx: Cleanup media entity properly
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (2 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 03/13] media: cadence: csi2rx: Unregister v4l2 async notifier Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops Jai Luthra
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

Call media_entity_cleanup() in the probe error path and in remove() to
make sure the media entity is cleaned up properly.

Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 drivers/media/platform/cadence/cdns-csi2rx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 9231ee7e9b3a..9de3240e261c 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -547,6 +547,7 @@ static int csi2rx_probe(struct platform_device *pdev)
 err_cleanup:
 	v4l2_async_nf_unregister(&csi2rx->notifier);
 	v4l2_async_nf_cleanup(&csi2rx->notifier);
+	media_entity_cleanup(&csi2rx->subdev.entity);
 err_free_priv:
 	kfree(csi2rx);
 	return ret;
@@ -559,6 +560,7 @@ static void csi2rx_remove(struct platform_device *pdev)
 	v4l2_async_nf_unregister(&csi2rx->notifier);
 	v4l2_async_nf_cleanup(&csi2rx->notifier);
 	v4l2_async_unregister_subdev(&csi2rx->subdev);
+	media_entity_cleanup(&csi2rx->subdev.entity);
 	kfree(csi2rx);
 }
 

-- 
2.41.0


* [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (3 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 04/13] media: cadence: csi2rx: Cleanup media entity properly Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-15 12:05   ` Tomi Valkeinen
  2023-08-25  3:48   ` Laurent Pinchart
  2023-08-11 10:47 ` [PATCH v9 06/13] media: cadence: csi2rx: Configure DPHY using link freq Jai Luthra
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

The format is needed to calculate the link speed for the external DPHY
configuration. It is not right to query the format from the source
subdev. Add get_fmt and set_fmt pad operations so that the format can be
configured and the correct bpp can be selected.

Initialize and use the v4l2 subdev active state to keep track of the
active formats. Also propagate the new format from the sink pad to all
the source pads.
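
As a rough userspace model of the sink-to-source propagation (not the
driver code): the pad layout, the CODE_* values and set_fmt() below are
hypothetical, and the real implementation works on v4l2_subdev_state
rather than a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pad layout: one sink followed by four source pads. */
enum { PAD_SINK, PAD_SOURCE0, PAD_SOURCE1, PAD_SOURCE2, PAD_SOURCE3, PAD_MAX };

/* Hypothetical format codes, not the real MEDIA_BUS_FMT_* values. */
enum { CODE_YUYV = 1, CODE_UYVY = 2, CODE_RAW8 = 3 };

struct fmt {
	uint32_t width;
	uint32_t height;
	uint32_t code;
};

struct state {
	struct fmt pads[PAD_MAX];
};

static const uint32_t supported[] = { CODE_YUYV, CODE_UYVY, CODE_RAW8 };

static int code_supported(uint32_t code)
{
	for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (supported[i] == code)
			return 1;

	return 0;
}

/*
 * Source pads are read-only: the request is overwritten with the pad's
 * current format. A sink request is sanitized (unsupported codes fall
 * back to the first table entry), stored, and copied to every source.
 */
static int set_fmt(struct state *st, unsigned int pad, struct fmt *req)
{
	if (pad >= PAD_MAX)
		return -1;

	if (pad != PAD_SINK) {
		*req = st->pads[pad];
		return 0;
	}

	if (!code_supported(req->code))
		req->code = supported[0];

	for (unsigned int i = PAD_SINK; i < PAD_MAX; i++)
		st->pads[i] = *req;

	return 0;
}
```

Because there is no transcoding, every pad ends up with the same format
after a successful call on the sink pad, and a call on a source pad
only reports what the sink already decided.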

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Co-authored-by: Jai Luthra <j-luthra@ti.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
Changes from v8:
    - Squash the patch adding RAW8 and RAW10 formats within this one
    - Single line struct entries in formats[] array
    - Skip specifying redundant format.which entry in init_cfg()

 drivers/media/platform/cadence/cdns-csi2rx.c | 101 ++++++++++++++++++++++++++-
 1 file changed, 100 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 9de3240e261c..047e74ee2443 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -61,6 +61,11 @@ enum csi2rx_pads {
 	CSI2RX_PAD_MAX,
 };
 
+struct csi2rx_fmt {
+	u32				code;
+	u8				bpp;
+};
+
 struct csi2rx_priv {
 	struct device			*dev;
 	unsigned int			count;
@@ -95,6 +100,32 @@ struct csi2rx_priv {
 	int				source_pad;
 };
 
+static const struct csi2rx_fmt formats[] = {
+	{ .code	= MEDIA_BUS_FMT_YUYV8_1X16, .bpp = 16, },
+	{ .code	= MEDIA_BUS_FMT_UYVY8_1X16, .bpp = 16, },
+	{ .code	= MEDIA_BUS_FMT_YVYU8_1X16, .bpp = 16, },
+	{ .code	= MEDIA_BUS_FMT_VYUY8_1X16, .bpp = 16, },
+	{ .code	= MEDIA_BUS_FMT_SBGGR8_1X8, .bpp = 8, },
+	{ .code	= MEDIA_BUS_FMT_SGBRG8_1X8, .bpp = 8, },
+	{ .code	= MEDIA_BUS_FMT_SGRBG8_1X8, .bpp = 8, },
+	{ .code	= MEDIA_BUS_FMT_SRGGB8_1X8, .bpp = 8, },
+	{ .code	= MEDIA_BUS_FMT_SBGGR10_1X10, .bpp = 10, },
+	{ .code	= MEDIA_BUS_FMT_SGBRG10_1X10, .bpp = 10, },
+	{ .code	= MEDIA_BUS_FMT_SGRBG10_1X10, .bpp = 10, },
+	{ .code	= MEDIA_BUS_FMT_SRGGB10_1X10, .bpp = 10, },
+};
+
+static const struct csi2rx_fmt *csi2rx_get_fmt_by_code(u32 code)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(formats); i++)
+		if (formats[i].code == code)
+			return &formats[i];
+
+	return NULL;
+}
+
 static inline
 struct csi2rx_priv *v4l2_subdev_to_csi2rx(struct v4l2_subdev *subdev)
 {
@@ -303,12 +334,73 @@ static int csi2rx_s_stream(struct v4l2_subdev *subdev, int enable)
 	return ret;
 }
 
+static int csi2rx_set_fmt(struct v4l2_subdev *subdev,
+			  struct v4l2_subdev_state *state,
+			  struct v4l2_subdev_format *format)
+{
+	struct v4l2_mbus_framefmt *fmt;
+	unsigned int i;
+
+	/* No transcoding, source and sink formats must match. */
+	if (format->pad != CSI2RX_PAD_SINK)
+		return v4l2_subdev_get_fmt(subdev, state, format);
+
+	if (!csi2rx_get_fmt_by_code(format->format.code))
+		format->format.code = formats[0].code;
+
+	format->format.field = V4L2_FIELD_NONE;
+
+	/* Set sink format */
+	fmt = v4l2_subdev_get_pad_format(subdev, state, format->pad);
+	if (!fmt)
+		return -EINVAL;
+
+	*fmt = format->format;
+
+	/* Propagate to source formats */
+	for (i = CSI2RX_PAD_SOURCE_STREAM0; i < CSI2RX_PAD_MAX; i++) {
+		fmt = v4l2_subdev_get_pad_format(subdev, state, i);
+		if (!fmt)
+			return -EINVAL;
+		*fmt = format->format;
+	}
+
+	return 0;
+}
+
+static int csi2rx_init_cfg(struct v4l2_subdev *subdev,
+			   struct v4l2_subdev_state *state)
+{
+	struct v4l2_subdev_format format = {
+		.pad = CSI2RX_PAD_SINK,
+		.format = {
+			.width = 640,
+			.height = 480,
+			.code = MEDIA_BUS_FMT_UYVY8_1X16,
+			.field = V4L2_FIELD_NONE,
+			.colorspace = V4L2_COLORSPACE_SRGB,
+			.ycbcr_enc = V4L2_YCBCR_ENC_601,
+			.quantization = V4L2_QUANTIZATION_LIM_RANGE,
+			.xfer_func = V4L2_XFER_FUNC_SRGB,
+		},
+	};
+
+	return csi2rx_set_fmt(subdev, state, &format);
+}
+
+static const struct v4l2_subdev_pad_ops csi2rx_pad_ops = {
+	.get_fmt	= v4l2_subdev_get_fmt,
+	.set_fmt	= csi2rx_set_fmt,
+	.init_cfg	= csi2rx_init_cfg,
+};
+
 static const struct v4l2_subdev_video_ops csi2rx_video_ops = {
 	.s_stream	= csi2rx_s_stream,
 };
 
 static const struct v4l2_subdev_ops csi2rx_subdev_ops = {
 	.video		= &csi2rx_video_ops,
+	.pad		= &csi2rx_pad_ops,
 };
 
 static int csi2rx_async_bound(struct v4l2_async_notifier *notifier,
@@ -532,9 +624,13 @@ static int csi2rx_probe(struct platform_device *pdev)
 	if (ret)
 		goto err_cleanup;
 
+	ret = v4l2_subdev_init_finalize(&csi2rx->subdev);
+	if (ret)
+		goto err_cleanup;
+
 	ret = v4l2_async_register_subdev(&csi2rx->subdev);
 	if (ret < 0)
-		goto err_cleanup;
+		goto err_free_state;
 
 	dev_info(&pdev->dev,
 		 "Probed CSI2RX with %u/%u lanes, %u streams, %s D-PHY\n",
@@ -544,6 +640,8 @@ static int csi2rx_probe(struct platform_device *pdev)
 
 	return 0;
 
+err_free_state:
+	v4l2_subdev_cleanup(&csi2rx->subdev);
 err_cleanup:
 	v4l2_async_nf_unregister(&csi2rx->notifier);
 	v4l2_async_nf_cleanup(&csi2rx->notifier);
@@ -560,6 +658,7 @@ static void csi2rx_remove(struct platform_device *pdev)
 	v4l2_async_nf_unregister(&csi2rx->notifier);
 	v4l2_async_nf_cleanup(&csi2rx->notifier);
 	v4l2_async_unregister_subdev(&csi2rx->subdev);
+	v4l2_subdev_cleanup(&csi2rx->subdev);
 	media_entity_cleanup(&csi2rx->subdev.entity);
 	kfree(csi2rx);
 }

-- 
2.41.0


* [PATCH v9 06/13] media: cadence: csi2rx: Configure DPHY using link freq
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (4 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture Jai Luthra
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

Some platforms like TI's J721E can have the CSI2RX paired with an
external DPHY. Use the generic PHY framework to configure the DPHY with
the correct link frequency.
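
For reference, the arithmetic behind the bpp and 2 * num_lanes
arguments can be sketched as below. This assumes the source only
exposes a pixel rate, in which case (as I understand the helper)
v4l2_get_link_freq() derives the frequency as pixel_rate * mul / div.
link_freq_from_pixel_rate() is a hypothetical name, and -1 stands in
for a negative errno.

```c
#include <stdint.h>

/*
 * D-PHY transfers two bits per lane per clock cycle (DDR), so:
 *
 *   link_freq = pixel_rate * bpp / (2 * lanes)
 *
 * Returns the link frequency in Hz, or -1 on invalid input.
 */
static int64_t link_freq_from_pixel_rate(int64_t pixel_rate,
					 unsigned int bpp,
					 unsigned int lanes)
{
	if (pixel_rate <= 0 || !bpp || !lanes)
		return -1;

	return pixel_rate * bpp / (2 * lanes);
}
```

For example, a 74.25 MHz pixel rate at 16 bpp over 2 lanes works out to
a 297 MHz link frequency.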

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Co-authored-by: Jai Luthra <j-luthra@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
Changes from v8:
    - Don't specify stream while calling .get_fmt()

 drivers/media/platform/cadence/cdns-csi2rx.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 047e74ee2443..933edec89520 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -145,8 +145,32 @@ static void csi2rx_reset(struct csi2rx_priv *csi2rx)
 static int csi2rx_configure_ext_dphy(struct csi2rx_priv *csi2rx)
 {
 	union phy_configure_opts opts = { };
+	struct phy_configure_opts_mipi_dphy *cfg = &opts.mipi_dphy;
+	struct v4l2_subdev_format sd_fmt = {
+		.which	= V4L2_SUBDEV_FORMAT_ACTIVE,
+		.pad	= CSI2RX_PAD_SINK,
+	};
+	const struct csi2rx_fmt *fmt;
+	s64 link_freq;
 	int ret;
 
+	ret = v4l2_subdev_call_state_active(&csi2rx->subdev, pad, get_fmt,
+					    &sd_fmt);
+	if (ret < 0)
+		return ret;
+
+	fmt = csi2rx_get_fmt_by_code(sd_fmt.format.code);
+
+	link_freq = v4l2_get_link_freq(csi2rx->source_subdev->ctrl_handler,
+				       fmt->bpp, 2 * csi2rx->num_lanes);
+	if (link_freq < 0)
+		return link_freq;
+
+	ret = phy_mipi_dphy_get_default_config_for_hsclk(link_freq,
+							 csi2rx->num_lanes, cfg);
+	if (ret)
+		return ret;
+
 	ret = phy_power_on(csi2rx->dphy);
 	if (ret)
 		return ret;

-- 
2.41.0


* [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (5 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 06/13] media: cadence: csi2rx: Configure DPHY using link freq Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-15 12:10   ` Tomi Valkeinen
  2023-08-11 10:47 ` [PATCH v9 08/13] media: cadence: csi2rx: Set the STOP bit when stopping a stream Jai Luthra
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

This resets the stream state machines and FIFOs, giving them a clean
slate. On J721E if the streams are not reset before starting the
capture, the captured frame gets wrapped around vertically on every run
after the first.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
Changes from v8:
    - Simplify reset sequence, minimizing delays

 drivers/media/platform/cadence/cdns-csi2rx.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 933edec89520..b57e0c3b1944 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -40,6 +40,7 @@
 #define CSI2RX_STREAM_BASE(n)		(((n) + 1) * 0x100)
 
 #define CSI2RX_STREAM_CTRL_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x000)
+#define CSI2RX_STREAM_CTRL_SOFT_RST			BIT(4)
 #define CSI2RX_STREAM_CTRL_START			BIT(0)
 
 #define CSI2RX_STREAM_DATA_CFG_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x008)
@@ -134,12 +135,23 @@ struct csi2rx_priv *v4l2_subdev_to_csi2rx(struct v4l2_subdev *subdev)
 
 static void csi2rx_reset(struct csi2rx_priv *csi2rx)
 {
+	unsigned int i;
+
+	/* Reset module */
 	writel(CSI2RX_SOFT_RESET_PROTOCOL | CSI2RX_SOFT_RESET_FRONT,
 	       csi2rx->base + CSI2RX_SOFT_RESET_REG);
+	/* Reset individual streams. */
+	for (i = 0; i < csi2rx->max_streams; i++) {
+		writel(CSI2RX_STREAM_CTRL_SOFT_RST,
+		       csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
+	}
 
-	udelay(10);
+	usleep_range(10, 20);
 
+	/* Clear resets */
 	writel(0, csi2rx->base + CSI2RX_SOFT_RESET_REG);
+	for (i = 0; i < csi2rx->max_streams; i++)
+		writel(0, csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
 }
 
 static int csi2rx_configure_ext_dphy(struct csi2rx_priv *csi2rx)

-- 
2.41.0


* [PATCH v9 08/13] media: cadence: csi2rx: Set the STOP bit when stopping a stream
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (6 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 09/13] media: cadence: csi2rx: Fix stream data configuration Jai Luthra
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

The stream stop procedure says that the STOP bit should be set when the
stream is to be stopped, and then the ready bit in stream status
register polled to make sure the STOP operation is finished.
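
The poll-with-timeout pattern can be sketched in userspace as follows.
This is only a model: struct fake_stream and fake_read_status() are
hypothetical test doubles for the status register, and the loop counts
reads instead of sleeping. The exit condition mirrors the one used in
the diff below, which waits for the RDY bit to read back as clear.

```c
#include <stdint.h>

#define STREAM_STATUS_RDY	(UINT32_C(1) << 31)

/* Test double for a status register: RDY stays set for busy_reads reads. */
struct fake_stream {
	unsigned int busy_reads;
};

static uint32_t fake_read_status(struct fake_stream *s)
{
	if (s->busy_reads) {
		s->busy_reads--;
		return STREAM_STATUS_RDY;
	}

	return 0;
}

/*
 * Poll until RDY reads back as clear, giving up after max_polls reads.
 * Returns 0 once the stream is idle, -1 on timeout.
 */
static int wait_for_stop(struct fake_stream *s, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++)
		if (!(fake_read_status(s) & STREAM_STATUS_RDY))
			return 0;

	return -1;
}
```

In the driver the same shape is provided by readl_relaxed_poll_timeout(),
with a 10 us delay between reads and a 10 ms overall timeout. A timeout
is only reported with a warning so that the remaining teardown still
runs.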

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
Changes from v8:
    - Better log message to avoid confusion between cadence streams and v4l2 
    streams

 drivers/media/platform/cadence/cdns-csi2rx.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index b57e0c3b1944..f8205c3a28c0 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -8,6 +8,7 @@
 #include <linux/clk.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/iopoll.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_graph.h>
@@ -41,8 +42,12 @@
 
 #define CSI2RX_STREAM_CTRL_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x000)
 #define CSI2RX_STREAM_CTRL_SOFT_RST			BIT(4)
+#define CSI2RX_STREAM_CTRL_STOP				BIT(1)
 #define CSI2RX_STREAM_CTRL_START			BIT(0)
 
+#define CSI2RX_STREAM_STATUS_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x004)
+#define CSI2RX_STREAM_STATUS_RDY			BIT(31)
+
 #define CSI2RX_STREAM_DATA_CFG_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x008)
 #define CSI2RX_STREAM_DATA_CFG_EN_VC_SELECT		BIT(31)
 #define CSI2RX_STREAM_DATA_CFG_VC_SELECT(n)		BIT((n) + 16)
@@ -310,13 +315,25 @@ static int csi2rx_start(struct csi2rx_priv *csi2rx)
 static void csi2rx_stop(struct csi2rx_priv *csi2rx)
 {
 	unsigned int i;
+	u32 val;
+	int ret;
 
 	clk_prepare_enable(csi2rx->p_clk);
 	reset_control_assert(csi2rx->sys_rst);
 	clk_disable_unprepare(csi2rx->sys_clk);
 
 	for (i = 0; i < csi2rx->max_streams; i++) {
-		writel(0, csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
+		writel(CSI2RX_STREAM_CTRL_STOP,
+		       csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
+
+		ret = readl_relaxed_poll_timeout(csi2rx->base +
+						 CSI2RX_STREAM_STATUS_REG(i),
+						 val,
+						 !(val & CSI2RX_STREAM_STATUS_RDY),
+						 10, 10000);
+		if (ret)
+			dev_warn(csi2rx->dev,
+				 "Failed to stop streaming on pad%u\n", i);
 
 		reset_control_assert(csi2rx->pixel_rst[i]);
 		clk_disable_unprepare(csi2rx->pixel_clk[i]);

-- 
2.41.0


* [PATCH v9 09/13] media: cadence: csi2rx: Fix stream data configuration
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (7 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 08/13] media: cadence: csi2rx: Set the STOP bit when stopping a stream Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 10/13] media: cadence: csi2rx: Populate subdev devnode Jai Luthra
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

Firstly, there is no VC_EN bit present in the STREAM_DATA_CFG register.
Bit 31 is part of the VC_SELECT field. Remove it completely.

Secondly, it makes little sense to enable the i-th virtual channel for
the i-th stream. Sure, there might be a use case that demands it. But
there might also be a use case that demands all streams to use the 0th
virtual channel. Prefer this case over the former because it is less
arbitrary and also makes it very clear what the limitations of the
current driver are instead of giving a false impression that multiple
virtual channels are supported.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
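A minimal userspace sketch of the register value change described in the
commit message, for illustration only. It assumes the VC_SELECT field
occupies bits 31:16 with one bit per virtual channel, which is inferred
from the BIT((n) + 16) macro and the statement that bit 31 belongs to the
same field. The helper names data_cfg_old() and data_cfg_new() are
hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(UINT32_C(1) << (n))

/* Mirrors CSI2RX_STREAM_DATA_CFG_VC_SELECT(n) from the driver. */
static uint32_t vc_select(unsigned int vc)
{
	/* Assumption: one bit per virtual channel in bits 31:16. */
	assert(vc < 16);
	return BIT(vc + 16);
}

/* Before the patch: bit 31 was treated as an enable, VC i for stream i. */
static uint32_t data_cfg_old(unsigned int stream)
{
	return BIT(31) | vc_select(stream);
}

/* After the patch: every stream selects virtual channel 0 only. */
static uint32_t data_cfg_new(unsigned int stream)
{
	(void)stream;	/* The stream index no longer affects the value. */
	return vc_select(0);
}
```

Under these assumptions the old value for stream 1 would also have selected
the virtual channel mapped to bit 31, whereas the new value is 0x00010000
for every stream.
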
 drivers/media/platform/cadence/cdns-csi2rx.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index f8205c3a28c0..46effbbe580d 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -49,7 +49,6 @@
 #define CSI2RX_STREAM_STATUS_RDY			BIT(31)
 
 #define CSI2RX_STREAM_DATA_CFG_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x008)
-#define CSI2RX_STREAM_DATA_CFG_EN_VC_SELECT		BIT(31)
 #define CSI2RX_STREAM_DATA_CFG_VC_SELECT(n)		BIT((n) + 16)
 
 #define CSI2RX_STREAM_CFG_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x00c)
@@ -271,8 +270,11 @@ static int csi2rx_start(struct csi2rx_priv *csi2rx)
 		writel(CSI2RX_STREAM_CFG_FIFO_MODE_LARGE_BUF,
 		       csi2rx->base + CSI2RX_STREAM_CFG_REG(i));
 
-		writel(CSI2RX_STREAM_DATA_CFG_EN_VC_SELECT |
-		       CSI2RX_STREAM_DATA_CFG_VC_SELECT(i),
+		/*
+		 * Enable one virtual channel. When multiple virtual channels
+		 * are supported this will have to be changed.
+		 */
+		writel(CSI2RX_STREAM_DATA_CFG_VC_SELECT(0),
 		       csi2rx->base + CSI2RX_STREAM_DATA_CFG_REG(i));
 
 		writel(CSI2RX_STREAM_CTRL_START,

-- 
2.41.0

* [PATCH v9 10/13] media: cadence: csi2rx: Populate subdev devnode
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (8 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 09/13] media: cadence: csi2rx: Fix stream data configuration Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 11/13] media: cadence: csi2rx: Add link validation Jai Luthra
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

The devnode can be used by media-ctl and other userspace tools to
perform configurations on the subdev. Without it, media-ctl returns
ENOENT when setting format on the sensor subdev.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 drivers/media/platform/cadence/cdns-csi2rx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 46effbbe580d..0947b112a573 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -673,6 +673,7 @@ static int csi2rx_probe(struct platform_device *pdev)
 	csi2rx->pads[CSI2RX_PAD_SINK].flags = MEDIA_PAD_FL_SINK;
 	for (i = CSI2RX_PAD_SOURCE_STREAM0; i < CSI2RX_PAD_MAX; i++)
 		csi2rx->pads[i].flags = MEDIA_PAD_FL_SOURCE;
+	csi2rx->subdev.flags |= V4L2_SUBDEV_FL_HAS_DEVNODE;
 
 	ret = media_entity_pads_init(&csi2rx->subdev.entity, CSI2RX_PAD_MAX,
 				     csi2rx->pads);

-- 
2.41.0

* [PATCH v9 11/13] media: cadence: csi2rx: Add link validation
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (9 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 10/13] media: cadence: csi2rx: Populate subdev devnode Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 10:47 ` [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX Jai Luthra
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

Add media link validation to make sure incorrectly configured pipelines
are caught.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
 drivers/media/platform/cadence/cdns-csi2rx.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
index 0947b112a573..5eeeb398cdb5 100644
--- a/drivers/media/platform/cadence/cdns-csi2rx.c
+++ b/drivers/media/platform/cadence/cdns-csi2rx.c
@@ -458,6 +458,10 @@ static const struct v4l2_subdev_ops csi2rx_subdev_ops = {
 	.pad		= &csi2rx_pad_ops,
 };
 
+static const struct media_entity_operations csi2rx_media_ops = {
+	.link_validate = v4l2_subdev_link_validate,
+};
+
 static int csi2rx_async_bound(struct v4l2_async_notifier *notifier,
 			      struct v4l2_subdev *s_subdev,
 			      struct v4l2_async_connection *asd)
@@ -674,6 +678,7 @@ static int csi2rx_probe(struct platform_device *pdev)
 	for (i = CSI2RX_PAD_SOURCE_STREAM0; i < CSI2RX_PAD_MAX; i++)
 		csi2rx->pads[i].flags = MEDIA_PAD_FL_SOURCE;
 	csi2rx->subdev.flags |= V4L2_SUBDEV_FL_HAS_DEVNODE;
+	csi2rx->subdev.entity.ops = &csi2rx_media_ops;
 
 	ret = media_entity_pads_init(&csi2rx->subdev.entity, CSI2RX_PAD_MAX,
 				     csi2rx->pads);

-- 
2.41.0

* [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (10 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 11/13] media: cadence: csi2rx: Add link validation Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-11 14:00   ` Rob Herring
  2023-08-11 10:47 ` [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E Jai Luthra
  2023-08-24 15:18 ` [PATCH v9 00/13] CSI2RX support on J721E and AM62 Julien Massot
  13 siblings, 1 reply; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
capture over a CSI-2 bus. The TI CSI2RX platform driver glues all the
parts together.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
NOTE:

This patch depends on
9536cc949235 ("media: dt-bindings: cadence-csi2rx: Convert to DT schema") 
which is part of linux-next.

 .../bindings/media/ti,j721e-csi2rx-shim.yaml       | 100 +++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml b/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml
new file mode 100644
index 000000000000..f762fdc05e4d
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml
@@ -0,0 +1,100 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/media/ti,j721e-csi2rx-shim.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: TI J721E CSI2RX Shim
+
+description: |
+  The TI J721E CSI2RX Shim is a wrapper around Cadence CSI2RX bridge that
+  enables sending captured frames to memory over PSI-L DMA. In the J721E
+  Technical Reference Manual (SPRUIL1B) it is referred to as "SHIM" under the
+  CSI_RX_IF section.
+
+maintainers:
+  - Jai Luthra <j-luthra@ti.com>
+
+properties:
+  compatible:
+    const: ti,j721e-csi2rx-shim
+
+  dmas:
+    maxItems: 1
+
+  dma-names:
+    items:
+      - const: rx0
+
+  reg:
+    maxItems: 1
+
+  power-domains:
+    maxItems: 1
+
+  ranges: true
+
+  "#address-cells": true
+
+  "#size-cells": true
+
+patternProperties:
+  "^csi-bridge@":
+    type: object
+    description: CSI2 bridge node.
+    $ref: cdns,csi2rx.yaml#
+
+required:
+  - compatible
+  - reg
+  - dmas
+  - dma-names
+  - power-domains
+  - ranges
+  - "#address-cells"
+  - "#size-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/soc/ti,sci_pm_domain.h>
+
+    ti_csi2rx0: ticsi2rx@4500000 {
+        compatible = "ti,j721e-csi2rx-shim";
+        dmas = <&main_udmap 0x4940>;
+        dma-names = "rx0";
+        reg = <0x4500000 0x1000>;
+        power-domains = <&k3_pds 26 TI_SCI_PD_EXCLUSIVE>;
+        #address-cells = <1>;
+        #size-cells = <1>;
+        ranges;
+
+        cdns_csi2rx: csi-bridge@4504000 {
+            compatible = "ti,j721e-csi2rx", "cdns,csi2rx";
+            reg = <0x4504000 0x1000>;
+            clocks = <&k3_clks 26 2>, <&k3_clks 26 0>, <&k3_clks 26 2>,
+              <&k3_clks 26 2>, <&k3_clks 26 3>, <&k3_clks 26 3>;
+            clock-names = "sys_clk", "p_clk", "pixel_if0_clk",
+              "pixel_if1_clk", "pixel_if2_clk", "pixel_if3_clk";
+            phys = <&dphy0>;
+            phy-names = "dphy";
+
+            ports {
+                #address-cells = <1>;
+                #size-cells = <0>;
+
+                csi2_0: port@0 {
+
+                    reg = <0>;
+
+                    csi2rx0_in_sensor: endpoint {
+                        remote-endpoint = <&csi2_cam0>;
+                        bus-type = <4>; /* CSI2 DPHY. */
+                        clock-lanes = <0>;
+                        data-lanes = <1 2>;
+                    };
+                };
+            };
+        };
+    };

-- 
2.41.0

* [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (11 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX Jai Luthra
@ 2023-08-11 10:47 ` Jai Luthra
  2023-08-15 13:00   ` Tomi Valkeinen
  2023-08-29 16:44   ` Laurent Pinchart
  2023-08-24 15:18 ` [PATCH v9 00/13] CSI2RX support on J721E and AM62 Julien Massot
  13 siblings, 2 replies; 30+ messages in thread
From: Jai Luthra @ 2023-08-11 10:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, j-luthra, a-bhatia1,
	Martyn Welch, Julien Massot

From: Pratyush Yadav <p.yadav@ti.com>

TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
capture over a CSI-2 bus.

The Cadence CSI2RX IP acts as a bridge between the TI specific parts and
the CSI-2 protocol parts. TI then has a wrapper on top of this bridge
called the SHIM layer. It takes in data from stream 0, repacks it, and
sends it to memory over PSI-L DMA.

This driver acts as the "front end" to V4L2 client applications. It
implements the required ioctls and buffer operations, passes the
necessary calls on to the bridge, programs the SHIM layer, and performs
DMA via the dmaengine API to finally return the data to a buffer
supplied by the application.

Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Co-authored-by: Vaishnav Achath <vaishnav.a@ti.com>
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Tested-by: Vaishnav Achath <vaishnav.a@ti.com>
Co-authored-by: Jai Luthra <j-luthra@ti.com>
Signed-off-by: Jai Luthra <j-luthra@ti.com>
---
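A rough userspace sketch of how the SHIM_DMACNTX value is composed when the
SHIM layer is programmed, for illustration only. The bit positions are taken
from the register macros in this patch; the helper names shim_reverse() and
shim_dmacntx() are hypothetical. The CSI-2 data type values used in the
usage note (0x1e for YUV422 8-bit, 0x2b for RAW10) come from the MIPI CSI-2
specification.

```c
#include <assert.h>
#include <stdint.h>

#define SHIM_DMACNTX_EN			(UINT32_C(1) << 31)
#define SHIM_DMACNTX_YUV422_SHIFT	26		/* GENMASK(27, 26) */
#define SHIM_DMACNTX_SIZE_SHIFT		20		/* GENMASK(21, 20) */
#define SHIM_DMACNTX_FMT_MASK		UINT32_C(0x3f)	/* GENMASK(5, 0) */

enum shim_yuv422 { SHIM_UYVY = 0, SHIM_VYUY = 1, SHIM_YUYV = 2, SHIM_YVYU = 3 };

/*
 * The driver programs the "reverse" ordering of the one requested in
 * memory: UYVY <-> YVYU and VYUY <-> YUYV.
 */
static enum shim_yuv422 shim_reverse(enum shim_yuv422 mem_order)
{
	static const enum shim_yuv422 reverse[] = {
		[SHIM_UYVY] = SHIM_YVYU,
		[SHIM_VYUY] = SHIM_YUYV,
		[SHIM_YUYV] = SHIM_VYUY,
		[SHIM_YVYU] = SHIM_UYVY,
	};

	assert(mem_order <= SHIM_YVYU);
	return reverse[mem_order];
}

/* size is one of SHIM_DMACNTX_SIZE_8/16/32, i.e. 0, 1 or 2. */
static uint32_t shim_dmacntx(uint32_t csi_dt, int is_yuv422,
			     enum shim_yuv422 mem_order, uint32_t size)
{
	uint32_t reg = SHIM_DMACNTX_EN | (csi_dt & SHIM_DMACNTX_FMT_MASK);

	/* The YUV422 field is left at zero for non-YUV formats. */
	if (is_yuv422)
		reg |= (uint32_t)shim_reverse(mem_order) << SHIM_DMACNTX_YUV422_SHIFT;

	reg |= (size & 0x3) << SHIM_DMACNTX_SIZE_SHIFT;

	return reg;
}
```

With these definitions, shim_dmacntx(0x1e, 1, SHIM_YUYV, 0) describes a YUYV
capture and shim_dmacntx(0x2b, 0, SHIM_UYVY, 1) describes a RAW10 capture
unpacked to 16 bits per pixel.
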
Changes since v8:
    - Allocate drain buffer at start of stream instead of doing it in the
      middle, and document why it is needed in comments
    - Call subdev's get_fmt directly for link_validation()
    - Cleanup height/width clamping and rounding code, document it in comments
    - Return and check errors from setup_shim()
    - s/subdev/source for cadence csi2rx's v4l2_subdev
    - s/ti_csi2rx_init_subdev/ti_csi2rx_notifier_register
    - Change copyright year/author list
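
The width clamping and rounding mentioned above can be sketched in plain
userspace C as follows. This is an illustration under stated assumptions,
not the driver code: adjust_width() and bytes_per_line() are hypothetical
names, and bpp is assumed to be already rounded up to a multiple of 8, as
the driver does with ALIGN() before using it.

```c
#include <assert.h>
#include <stdint.h>

#define PSIL_WORD_SIZE_BYTES	16u
#define MAX_WIDTH_BYTES		16384u	/* SZ_16K */

static unsigned int clamp_uint(unsigned int v, unsigned int lo, unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Clamp the width, then round it down to a whole number of PSI-L words. */
static unsigned int adjust_width(unsigned int width, unsigned int bpp)
{
	unsigned int pixels_in_word, max_width;

	assert(bpp >= 8 && bpp % 8 == 0);

	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
	max_width = MAX_WIDTH_BYTES * 8 / bpp;

	width = clamp_uint(width, pixels_in_word, max_width);

	return width - width % pixels_in_word;
}

static unsigned int bytes_per_line(unsigned int width, unsigned int bpp)
{
	return width * (bpp / 8);
}
```

For a 16 bpp format one PSI-L word holds 8 pixels, so a requested width of
1925 would be rounded down to 1920, giving 3840 bytes per line.
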

 MAINTAINERS                                        |    7 +
 drivers/media/platform/ti/Kconfig                  |   12 +
 drivers/media/platform/ti/Makefile                 |    1 +
 drivers/media/platform/ti/j721e-csi2rx/Makefile    |    2 +
 .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c  | 1150 ++++++++++++++++++++
 5 files changed, 1172 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 02a3192195af..959147d6d936 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21455,6 +21455,13 @@ F:	Documentation/devicetree/bindings/media/i2c/ti,ds90*
 F:	drivers/media/i2c/ds90*
 F:	include/media/i2c/ds90*
 
+TI J721E CSI2RX DRIVER
+M:	Jai Luthra <j-luthra@ti.com>
+L:	linux-media@vger.kernel.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml
+F:	drivers/media/platform/ti/j721e-csi2rx/
+
 TI KEYSTONE MULTICORE NAVIGATOR DRIVERS
 M:	Nishanth Menon <nm@ti.com>
 M:	Santosh Shilimkar <ssantosh@kernel.org>
diff --git a/drivers/media/platform/ti/Kconfig b/drivers/media/platform/ti/Kconfig
index e1ab56c3be1f..42c908f6e1ae 100644
--- a/drivers/media/platform/ti/Kconfig
+++ b/drivers/media/platform/ti/Kconfig
@@ -63,6 +63,18 @@ config VIDEO_TI_VPE_DEBUG
 	help
 	  Enable debug messages on VPE driver.
 
+config VIDEO_TI_J721E_CSI2RX
+	tristate "TI J721E CSI2RX wrapper layer driver"
+	depends on VIDEO_DEV && VIDEO_V4L2_SUBDEV_API
+	depends on MEDIA_SUPPORT && MEDIA_CONTROLLER
+	depends on PHY_CADENCE_DPHY_RX && VIDEO_CADENCE_CSI2RX
+	depends on ARCH_K3 || COMPILE_TEST
+	select VIDEOBUF2_DMA_CONTIG
+	select V4L2_FWNODE
+	help
+	  Support for TI CSI2RX wrapper layer. This just enables the wrapper driver.
+	  The Cadence CSI2RX bridge driver needs to be enabled separately.
+
 source "drivers/media/platform/ti/am437x/Kconfig"
 source "drivers/media/platform/ti/davinci/Kconfig"
 source "drivers/media/platform/ti/omap/Kconfig"
diff --git a/drivers/media/platform/ti/Makefile b/drivers/media/platform/ti/Makefile
index 98c5fe5c40d6..8a2f74c9380e 100644
--- a/drivers/media/platform/ti/Makefile
+++ b/drivers/media/platform/ti/Makefile
@@ -3,5 +3,6 @@ obj-y += am437x/
 obj-y += cal/
 obj-y += vpe/
 obj-y += davinci/
+obj-y += j721e-csi2rx/
 obj-y += omap/
 obj-y += omap3isp/
diff --git a/drivers/media/platform/ti/j721e-csi2rx/Makefile b/drivers/media/platform/ti/j721e-csi2rx/Makefile
new file mode 100644
index 000000000000..377afc1d6280
--- /dev/null
+++ b/drivers/media/platform/ti/j721e-csi2rx/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_VIDEO_TI_J721E_CSI2RX) += j721e-csi2rx.o
diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
new file mode 100644
index 000000000000..301d947f6098
--- /dev/null
+++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
@@ -0,0 +1,1150 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * TI CSI2RX Shim Wrapper Driver
+ *
+ * Copyright (C) 2023 Texas Instruments Incorporated - https://www.ti.com/
+ *
+ * Author: Pratyush Yadav <p.yadav@ti.com>
+ * Author: Jai Luthra <j-luthra@ti.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/dmaengine.h>
+#include <linux/module.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+
+#include <media/mipi-csi2.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-ioctl.h>
+#include <media/v4l2-mc.h>
+#include <media/videobuf2-dma-contig.h>
+
+#define TI_CSI2RX_MODULE_NAME		"j721e-csi2rx"
+
+#define SHIM_CNTL			0x10
+#define SHIM_CNTL_PIX_RST		BIT(0)
+
+#define SHIM_DMACNTX			0x20
+#define SHIM_DMACNTX_EN			BIT(31)
+#define SHIM_DMACNTX_YUV422		GENMASK(27, 26)
+#define SHIM_DMACNTX_SIZE		GENMASK(21, 20)
+#define SHIM_DMACNTX_FMT		GENMASK(5, 0)
+#define SHIM_DMACNTX_UYVY		0
+#define SHIM_DMACNTX_VYUY		1
+#define SHIM_DMACNTX_YUYV		2
+#define SHIM_DMACNTX_YVYU		3
+#define SHIM_DMACNTX_SIZE_8		0
+#define SHIM_DMACNTX_SIZE_16		1
+#define SHIM_DMACNTX_SIZE_32		2
+
+#define SHIM_PSI_CFG0			0x24
+#define SHIM_PSI_CFG0_SRC_TAG		GENMASK(15, 0)
+#define SHIM_PSI_CFG0_DST_TAG		GENMASK(31, 16)
+
+#define PSIL_WORD_SIZE_BYTES		16
+/*
+ * There are no hard limits on the width or height. The DMA engine can handle
+ * all sizes. The max width and height are arbitrary numbers for this driver.
+ * Use 16K * 16K as the arbitrary limit. It is large enough that it is unlikely
+ * the limit will be hit in practice.
+ */
+#define MAX_WIDTH_BYTES			SZ_16K
+#define MAX_HEIGHT_LINES		SZ_16K
+
+#define DRAIN_TIMEOUT_MS		50
+
+struct ti_csi2rx_fmt {
+	u32				fourcc;	/* Four character code. */
+	u32				code;	/* Mbus code. */
+	u32				csi_dt;	/* CSI Data type. */
+	u8				bpp;	/* Bits per pixel. */
+	u8				size;	/* Data size shift when unpacking. */
+};
+
+struct ti_csi2rx_buffer {
+	/* Common v4l2 buffer. Must be first. */
+	struct vb2_v4l2_buffer		vb;
+	struct list_head		list;
+	struct ti_csi2rx_dev		*csi;
+};
+
+enum ti_csi2rx_dma_state {
+	TI_CSI2RX_DMA_STOPPED,	/* Streaming not started yet. */
+	TI_CSI2RX_DMA_IDLE,	/* Streaming but no pending DMA operation. */
+	TI_CSI2RX_DMA_ACTIVE,	/* Streaming and pending DMA operation. */
+};
+
+struct ti_csi2rx_dma {
+	/* Protects all fields in this struct. */
+	spinlock_t			lock;
+	struct dma_chan			*chan;
+	/* Buffers queued to the driver, waiting to be processed by DMA. */
+	struct list_head		queue;
+	enum ti_csi2rx_dma_state	state;
+	/*
+	 * Queue of buffers submitted to DMA engine.
+	 */
+	struct list_head		submitted;
+	/* Buffer to drain stale data from PSI-L endpoint */
+	struct {
+		void			*vaddr;
+		dma_addr_t		paddr;
+		size_t			len;
+	} drain;
+};
+
+struct ti_csi2rx_dev {
+	struct device			*dev;
+	void __iomem			*shim;
+	struct v4l2_device		v4l2_dev;
+	struct video_device		vdev;
+	struct media_device		mdev;
+	struct media_pipeline		pipe;
+	struct media_pad		pad;
+	struct v4l2_async_notifier	notifier;
+	struct v4l2_subdev		*source;
+	struct vb2_queue		vidq;
+	struct mutex			mutex; /* To serialize ioctls. */
+	struct v4l2_format		v_fmt;
+	struct ti_csi2rx_dma		dma;
+	u32				sequence;
+};
+
+static const struct ti_csi2rx_fmt formats[] = {
+	{
+		.fourcc			= V4L2_PIX_FMT_YUYV,
+		.code			= MEDIA_BUS_FMT_YUYV8_1X16,
+		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_UYVY,
+		.code			= MEDIA_BUS_FMT_UYVY8_1X16,
+		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_YVYU,
+		.code			= MEDIA_BUS_FMT_YVYU8_1X16,
+		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_VYUY,
+		.code			= MEDIA_BUS_FMT_VYUY8_1X16,
+		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SBGGR8,
+		.code			= MEDIA_BUS_FMT_SBGGR8_1X8,
+		.csi_dt			= MIPI_CSI2_DT_RAW8,
+		.bpp			= 8,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SGBRG8,
+		.code			= MEDIA_BUS_FMT_SGBRG8_1X8,
+		.csi_dt			= MIPI_CSI2_DT_RAW8,
+		.bpp			= 8,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SGRBG8,
+		.code			= MEDIA_BUS_FMT_SGRBG8_1X8,
+		.csi_dt			= MIPI_CSI2_DT_RAW8,
+		.bpp			= 8,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SRGGB8,
+		.code			= MEDIA_BUS_FMT_SRGGB8_1X8,
+		.csi_dt			= MIPI_CSI2_DT_RAW8,
+		.bpp			= 8,
+		.size			= SHIM_DMACNTX_SIZE_8,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SBGGR10,
+		.code			= MEDIA_BUS_FMT_SBGGR10_1X10,
+		.csi_dt			= MIPI_CSI2_DT_RAW10,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_16,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SGBRG10,
+		.code			= MEDIA_BUS_FMT_SGBRG10_1X10,
+		.csi_dt			= MIPI_CSI2_DT_RAW10,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_16,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SGRBG10,
+		.code			= MEDIA_BUS_FMT_SGRBG10_1X10,
+		.csi_dt			= MIPI_CSI2_DT_RAW10,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_16,
+	}, {
+		.fourcc			= V4L2_PIX_FMT_SRGGB10,
+		.code			= MEDIA_BUS_FMT_SRGGB10_1X10,
+		.csi_dt			= MIPI_CSI2_DT_RAW10,
+		.bpp			= 16,
+		.size			= SHIM_DMACNTX_SIZE_16,
+	},
+
+	/* More formats can be supported but they are not listed for now. */
+};
+
+static const unsigned int num_formats = ARRAY_SIZE(formats);
+
+/* Forward declaration needed by ti_csi2rx_dma_callback. */
+static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
+			       struct ti_csi2rx_buffer *buf);
+
+static const struct ti_csi2rx_fmt *find_format_by_pix(u32 pixelformat)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_formats; i++) {
+		if (formats[i].fourcc == pixelformat)
+			return &formats[i];
+	}
+
+	return NULL;
+}
+
+static const struct ti_csi2rx_fmt *find_format_by_code(u32 code)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_formats; i++) {
+		if (formats[i].code == code)
+			return &formats[i];
+	}
+
+	return NULL;
+}
+
+static void ti_csi2rx_fill_fmt(const struct ti_csi2rx_fmt *csi_fmt,
+			       struct v4l2_format *v4l2_fmt)
+{
+	struct v4l2_pix_format *pix = &v4l2_fmt->fmt.pix;
+	unsigned int pixels_in_word;
+	u8 bpp = ALIGN(csi_fmt->bpp, 8);
+
+	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
+
+	/* Clamp width and height to sensible maximums (16K x 16K) */
+	pix->width = clamp_t(unsigned int, pix->width,
+			     pixels_in_word,
+			     MAX_WIDTH_BYTES * 8 / bpp);
+	pix->height = clamp_t(unsigned int, pix->height, 1, MAX_HEIGHT_LINES);
+
+	/* Width should be a multiple of transfer word-size */
+	pix->width = rounddown(pix->width, pixels_in_word);
+
+	v4l2_fmt->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+	pix->pixelformat = csi_fmt->fourcc;
+	pix->colorspace = V4L2_COLORSPACE_SRGB;
+	pix->bytesperline = pix->width * (bpp / 8);
+	pix->sizeimage = pix->bytesperline * pix->height;
+}
+
+static int ti_csi2rx_querycap(struct file *file, void *priv,
+			      struct v4l2_capability *cap)
+{
+	strscpy(cap->driver, TI_CSI2RX_MODULE_NAME, sizeof(cap->driver));
+	strscpy(cap->card, TI_CSI2RX_MODULE_NAME, sizeof(cap->card));
+
+	return 0;
+}
+
+static int ti_csi2rx_enum_fmt_vid_cap(struct file *file, void *priv,
+				      struct v4l2_fmtdesc *f)
+{
+	const struct ti_csi2rx_fmt *fmt = NULL;
+
+	if (f->mbus_code) {
+		/* 1-to-1 mapping between bus formats and pixel formats */
+		if (f->index > 0)
+			return -EINVAL;
+
+		fmt = find_format_by_code(f->mbus_code);
+	} else {
+		if (f->index >= num_formats)
+			return -EINVAL;
+
+		fmt = &formats[f->index];
+	}
+
+	if (!fmt)
+		return -EINVAL;
+
+	f->pixelformat = fmt->fourcc;
+	memset(f->reserved, 0, sizeof(f->reserved));
+	f->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+
+	return 0;
+}
+
+static int ti_csi2rx_g_fmt_vid_cap(struct file *file, void *priv,
+				   struct v4l2_format *f)
+{
+	struct ti_csi2rx_dev *csi = video_drvdata(file);
+
+	*f = csi->v_fmt;
+
+	return 0;
+}
+
+static int ti_csi2rx_try_fmt_vid_cap(struct file *file, void *priv,
+				     struct v4l2_format *f)
+{
+	const struct ti_csi2rx_fmt *fmt;
+
+	/*
+	 * Default to the first format if the requested pixel format code isn't
+	 * supported.
+	 */
+	fmt = find_format_by_pix(f->fmt.pix.pixelformat);
+	if (!fmt)
+		fmt = &formats[0];
+
+	/* Interlaced formats are not supported. */
+	f->fmt.pix.field = V4L2_FIELD_NONE;
+
+	ti_csi2rx_fill_fmt(fmt, f);
+
+	return 0;
+}
+
+static int ti_csi2rx_s_fmt_vid_cap(struct file *file, void *priv,
+				   struct v4l2_format *f)
+{
+	struct ti_csi2rx_dev *csi = video_drvdata(file);
+	struct vb2_queue *q = &csi->vidq;
+	int ret;
+
+	if (vb2_is_busy(q))
+		return -EBUSY;
+
+	ret = ti_csi2rx_try_fmt_vid_cap(file, priv, f);
+	if (ret < 0)
+		return ret;
+
+	csi->v_fmt = *f;
+
+	return 0;
+}
+
+static int ti_csi2rx_enum_framesizes(struct file *file, void *fh,
+				     struct v4l2_frmsizeenum *fsize)
+{
+	const struct ti_csi2rx_fmt *fmt;
+	unsigned int pixels_in_word;
+	u8 bpp;
+
+	fmt = find_format_by_pix(fsize->pixel_format);
+	if (!fmt || fsize->index != 0)
+		return -EINVAL;
+
+	bpp = ALIGN(fmt->bpp, 8);
+
+	/*
+	 * Number of pixels in one PSI-L word. The transfer happens in multiples
+	 * of PSI-L word sizes.
+	 */
+	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
+
+	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
+	fsize->stepwise.min_width = pixels_in_word;
+	fsize->stepwise.max_width = rounddown(MAX_WIDTH_BYTES * 8 / bpp,
+					      pixels_in_word);
+	fsize->stepwise.step_width = pixels_in_word;
+	fsize->stepwise.min_height = 1;
+	fsize->stepwise.max_height = MAX_HEIGHT_LINES;
+	fsize->stepwise.step_height = 1;
+
+	return 0;
+}
+
+static const struct v4l2_ioctl_ops csi_ioctl_ops = {
+	.vidioc_querycap      = ti_csi2rx_querycap,
+	.vidioc_enum_fmt_vid_cap = ti_csi2rx_enum_fmt_vid_cap,
+	.vidioc_try_fmt_vid_cap = ti_csi2rx_try_fmt_vid_cap,
+	.vidioc_g_fmt_vid_cap = ti_csi2rx_g_fmt_vid_cap,
+	.vidioc_s_fmt_vid_cap = ti_csi2rx_s_fmt_vid_cap,
+	.vidioc_enum_framesizes = ti_csi2rx_enum_framesizes,
+	.vidioc_reqbufs       = vb2_ioctl_reqbufs,
+	.vidioc_create_bufs   = vb2_ioctl_create_bufs,
+	.vidioc_prepare_buf   = vb2_ioctl_prepare_buf,
+	.vidioc_querybuf      = vb2_ioctl_querybuf,
+	.vidioc_qbuf          = vb2_ioctl_qbuf,
+	.vidioc_dqbuf         = vb2_ioctl_dqbuf,
+	.vidioc_expbuf        = vb2_ioctl_expbuf,
+	.vidioc_streamon      = vb2_ioctl_streamon,
+	.vidioc_streamoff     = vb2_ioctl_streamoff,
+};
+
+static const struct v4l2_file_operations csi_fops = {
+	.owner = THIS_MODULE,
+	.open = v4l2_fh_open,
+	.release = vb2_fop_release,
+	.read = vb2_fop_read,
+	.poll = vb2_fop_poll,
+	.unlocked_ioctl = video_ioctl2,
+	.mmap = vb2_fop_mmap,
+};
+
+static int csi_async_notifier_bound(struct v4l2_async_notifier *notifier,
+				    struct v4l2_subdev *subdev,
+				    struct v4l2_async_connection *asc)
+{
+	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
+
+	csi->source = subdev;
+
+	return 0;
+}
+
+static int csi_async_notifier_complete(struct v4l2_async_notifier *notifier)
+{
+	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
+	struct video_device *vdev = &csi->vdev;
+	int ret;
+
+	ret = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
+	if (ret)
+		return ret;
+
+	ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad,
+					      MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
+
+	if (ret) {
+		video_unregister_device(vdev);
+		return ret;
+	}
+
+	return v4l2_device_register_subdev_nodes(&csi->v4l2_dev);
+}
+
+static const struct v4l2_async_notifier_operations csi_async_notifier_ops = {
+	.bound = csi_async_notifier_bound,
+	.complete = csi_async_notifier_complete,
+};
+
+static int ti_csi2rx_notifier_register(struct ti_csi2rx_dev *csi)
+{
+	struct fwnode_handle *fwnode;
+	struct v4l2_async_connection *asc;
+	struct device_node *node;
+	int ret;
+
+	node = of_get_child_by_name(csi->dev->of_node, "csi-bridge");
+	if (!node)
+		return -EINVAL;
+
+	fwnode = of_fwnode_handle(node);
+	if (!fwnode) {
+		of_node_put(node);
+		return -EINVAL;
+	}
+
+	v4l2_async_nf_init(&csi->notifier, &csi->v4l2_dev);
+	csi->notifier.ops = &csi_async_notifier_ops;
+
+	asc = v4l2_async_nf_add_fwnode(&csi->notifier, fwnode,
+				       struct v4l2_async_connection);
+	of_node_put(node);
+	if (IS_ERR(asc)) {
+		v4l2_async_nf_cleanup(&csi->notifier);
+		return PTR_ERR(asc);
+	}
+
+	ret = v4l2_async_nf_register(&csi->notifier);
+	if (ret) {
+		v4l2_async_nf_cleanup(&csi->notifier);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int ti_csi2rx_setup_shim(struct ti_csi2rx_dev *csi)
+{
+	const struct ti_csi2rx_fmt *fmt;
+	unsigned int reg;
+
+	fmt = find_format_by_pix(csi->v_fmt.fmt.pix.pixelformat);
+	if (!fmt) {
+		dev_err(csi->dev, "Pixelformat 0x%x is not supported\n",
+			csi->v_fmt.fmt.pix.pixelformat);
+		return -EINVAL;
+	}
+
+	/* De-assert the pixel interface reset. */
+	reg = SHIM_CNTL_PIX_RST;
+	writel(reg, csi->shim + SHIM_CNTL);
+
+	reg = SHIM_DMACNTX_EN;
+	reg |= FIELD_PREP(SHIM_DMACNTX_FMT, fmt->csi_dt);
+
+	/*
+	 * Using the values from the documentation gives incorrect ordering for
+	 * the luma and chroma components. In practice, the "reverse" format
+	 * gives the correct image. So for example, if the image is in UYVY, the
+	 * reverse would be YVYU.
+	 */
+	switch (fmt->fourcc) {
+	case V4L2_PIX_FMT_UYVY:
+		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
+					SHIM_DMACNTX_YVYU);
+		break;
+	case V4L2_PIX_FMT_VYUY:
+		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
+					SHIM_DMACNTX_YUYV);
+		break;
+	case V4L2_PIX_FMT_YUYV:
+		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
+					SHIM_DMACNTX_VYUY);
+		break;
+	case V4L2_PIX_FMT_YVYU:
+		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
+					SHIM_DMACNTX_UYVY);
+		break;
+	default:
+		/* Ignore if not YUV 4:2:2 */
+		break;
+	}
+
+	reg |= FIELD_PREP(SHIM_DMACNTX_SIZE, fmt->size);
+
+	writel(reg, csi->shim + SHIM_DMACNTX);
+
+	reg = FIELD_PREP(SHIM_PSI_CFG0_SRC_TAG, 0) |
+	      FIELD_PREP(SHIM_PSI_CFG0_DST_TAG, 0);
+	writel(reg, csi->shim + SHIM_PSI_CFG0);
+
+	return 0;
+}
+
+static void ti_csi2rx_drain_callback(void *param)
+{
+	struct completion *drain_complete = param;
+
+	complete(drain_complete);
+}
+
+/** Drain the stale data left at the PSI-L endpoint.
+ *
+ * This might happen if no buffers are queued in time but source is still
+ * streaming. Or rarely it may happen while stopping the stream. To prevent
+ * that stale data from corrupting subsequent transactions, it is required to
+ * issue DMA requests to drain it out.
+ */
+static int ti_csi2rx_drain_dma(struct ti_csi2rx_dev *csi)
+{
+	struct dma_async_tx_descriptor *desc;
+	struct completion drain_complete;
+	dma_cookie_t cookie;
+	int ret;
+
+	init_completion(&drain_complete);
+
+	desc = dmaengine_prep_slave_single(csi->dma.chan, csi->dma.drain.paddr,
+					   csi->dma.drain.len, DMA_DEV_TO_MEM,
+					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (!desc) {
+		ret = -EIO;
+		goto out;
+	}
+
+	desc->callback = ti_csi2rx_drain_callback;
+	desc->callback_param = &drain_complete;
+
+	cookie = dmaengine_submit(desc);
+	ret = dma_submit_error(cookie);
+	if (ret)
+		goto out;
+
+	dma_async_issue_pending(csi->dma.chan);
+
+	if (!wait_for_completion_timeout(&drain_complete,
+					 msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
+		dmaengine_terminate_sync(csi->dma.chan);
+		ret = -ETIMEDOUT;
+		goto out;
+	}
+out:
+	return ret;
+}
+
+static void ti_csi2rx_dma_callback(void *param)
+{
+	struct ti_csi2rx_buffer *buf = param;
+	struct ti_csi2rx_dev *csi = buf->csi;
+	struct ti_csi2rx_dma *dma = &csi->dma;
+	unsigned long flags;
+
+	/*
+	 * TODO: Derive the sequence number from the CSI2RX frame number
+	 * hardware monitor registers.
+	 */
+	buf->vb.vb2_buf.timestamp = ktime_get_ns();
+	buf->vb.sequence = csi->sequence++;
+
+	spin_lock_irqsave(&dma->lock, flags);
+
+	WARN_ON(!list_is_first(&buf->list, &dma->submitted));
+	vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_DONE);
+	list_del(&buf->list);
+
+	/* If there are more buffers to process then start their transfer. */
+	while (!list_empty(&dma->queue)) {
+		buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
+
+		if (ti_csi2rx_start_dma(csi, buf)) {
+			dev_err(csi->dev, "Failed to queue the next buffer for DMA\n");
+			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
+		} else {
+			list_move_tail(&buf->list, &dma->submitted);
+		}
+	}
+
+	if (list_empty(&dma->submitted))
+		dma->state = TI_CSI2RX_DMA_IDLE;
+
+	spin_unlock_irqrestore(&dma->lock, flags);
+}
+
+static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
+			       struct ti_csi2rx_buffer *buf)
+{
+	unsigned long addr;
+	struct dma_async_tx_descriptor *desc;
+	size_t len = csi->v_fmt.fmt.pix.sizeimage;
+	dma_cookie_t cookie;
+	int ret = 0;
+
+	addr = vb2_dma_contig_plane_dma_addr(&buf->vb.vb2_buf, 0);
+	desc = dmaengine_prep_slave_single(csi->dma.chan, addr, len,
+					   DMA_DEV_TO_MEM,
+					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (!desc)
+		return -EIO;
+
+	desc->callback = ti_csi2rx_dma_callback;
+	desc->callback_param = buf;
+
+	cookie = dmaengine_submit(desc);
+	ret = dma_submit_error(cookie);
+	if (ret)
+		return ret;
+
+	dma_async_issue_pending(csi->dma.chan);
+
+	return 0;
+}
+
+static void ti_csi2rx_cleanup_buffers(struct ti_csi2rx_dev *csi,
+				      enum vb2_buffer_state buf_state)
+{
+	struct ti_csi2rx_dma *dma = &csi->dma;
+	struct ti_csi2rx_buffer *buf, *tmp;
+	enum ti_csi2rx_dma_state state;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&dma->lock, flags);
+	state = csi->dma.state;
+	dma->state = TI_CSI2RX_DMA_STOPPED;
+	spin_unlock_irqrestore(&dma->lock, flags);
+
+	if (state != TI_CSI2RX_DMA_STOPPED) {
+		/*
+		 * Normal DMA termination sometimes does not clean up pending
+		 * data on the endpoint.
+		 */
+		ret = ti_csi2rx_drain_dma(csi);
+		if (ret)
+			dev_dbg(csi->dev,
+				"Failed to drain DMA. Next frame might be bogus\n");
+	}
+	ret = dmaengine_terminate_sync(csi->dma.chan);
+	if (ret)
+		dev_err(csi->dev, "Failed to stop DMA: %d\n", ret);
+
+	dma_free_coherent(csi->dev, dma->drain.len,
+			  dma->drain.vaddr, dma->drain.paddr);
+	dma->drain.vaddr = NULL;
+
+	spin_lock_irqsave(&dma->lock, flags);
+	list_for_each_entry_safe(buf, tmp, &csi->dma.queue, list) {
+		list_del(&buf->list);
+		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
+	}
+	list_for_each_entry_safe(buf, tmp, &csi->dma.submitted, list) {
+		list_del(&buf->list);
+		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
+	}
+	spin_unlock_irqrestore(&dma->lock, flags);
+}
+
+static int ti_csi2rx_queue_setup(struct vb2_queue *q, unsigned int *nbuffers,
+				 unsigned int *nplanes, unsigned int sizes[],
+				 struct device *alloc_devs[])
+{
+	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(q);
+	unsigned int size = csi->v_fmt.fmt.pix.sizeimage;
+
+	if (*nplanes) {
+		if (sizes[0] < size)
+			return -EINVAL;
+		size = sizes[0];
+	}
+
+	*nplanes = 1;
+	sizes[0] = size;
+
+	return 0;
+}
+
+static int ti_csi2rx_buffer_prepare(struct vb2_buffer *vb)
+{
+	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
+	unsigned long size = csi->v_fmt.fmt.pix.sizeimage;
+
+	if (vb2_plane_size(vb, 0) < size) {
+		dev_err(csi->dev, "Data will not fit into plane\n");
+		return -EINVAL;
+	}
+
+	vb2_set_plane_payload(vb, 0, size);
+	return 0;
+}
+
+static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
+{
+	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
+	struct ti_csi2rx_buffer *buf;
+	struct ti_csi2rx_dma *dma = &csi->dma;
+	bool restart_dma = false;
+	unsigned long flags = 0;
+	int ret;
+
+	buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
+	buf->csi = csi;
+
+	spin_lock_irqsave(&dma->lock, flags);
+	/*
+	 * Usually the DMA callback takes care of queueing the pending buffers.
+	 * But if DMA has stalled due to lack of buffers, restart it now.
+	 */
+	if (dma->state == TI_CSI2RX_DMA_IDLE) {
+		/*
+		 * Do not restart DMA with the lock held because
+		 * ti_csi2rx_drain_dma() might block for completion.
+		 * There won't be a race on queueing DMA anyway since the
+		 * callback is not being fired.
+		 */
+		restart_dma = true;
+		dma->state = TI_CSI2RX_DMA_ACTIVE;
+	} else {
+		list_add_tail(&buf->list, &dma->queue);
+	}
+	spin_unlock_irqrestore(&dma->lock, flags);
+
+	if (restart_dma) {
+		/*
+		 * Once frames start dropping, some data gets stuck in the DMA
+		 * pipeline somewhere. So the first DMA transfer after frame
+		 * drops gives a partial frame. This is obviously not useful to
+		 * the application and will only confuse it. Issue a DMA
+		 * transaction to drain it out.
+		 */
+		ret = ti_csi2rx_drain_dma(csi);
+		if (ret)
+			dev_warn(csi->dev,
+				 "Failed to drain DMA. Next frame might be bogus\n");
+
+		ret = ti_csi2rx_start_dma(csi, buf);
+		if (ret) {
+			dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
+			spin_lock_irqsave(&dma->lock, flags);
+			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
+			dma->state = TI_CSI2RX_DMA_IDLE;
+			spin_unlock_irqrestore(&dma->lock, flags);
+		} else {
+			spin_lock_irqsave(&dma->lock, flags);
+			list_add_tail(&buf->list, &dma->submitted);
+			spin_unlock_irqrestore(&dma->lock, flags);
+		}
+	}
+}
+
+static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
+{
+	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
+	struct ti_csi2rx_dma *dma = &csi->dma;
+	struct ti_csi2rx_buffer *buf;
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&dma->lock, flags);
+	if (list_empty(&dma->queue))
+		ret = -EIO;
+	spin_unlock_irqrestore(&dma->lock, flags);
+	if (ret)
+		return ret;
+
+	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
+	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
+					      &dma->drain.paddr, GFP_KERNEL);
+	if (!dma->drain.vaddr)
+		return -ENOMEM;
+
+	ret = video_device_pipeline_start(&csi->vdev, &csi->pipe);
+	if (ret)
+		goto err;
+
+	ret = ti_csi2rx_setup_shim(csi);
+	if (ret)
+		goto err_pipeline;
+
+	csi->sequence = 0;
+
+	spin_lock_irqsave(&dma->lock, flags);
+	buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
+
+	ret = ti_csi2rx_start_dma(csi, buf);
+	if (ret) {
+		dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
+		spin_unlock_irqrestore(&dma->lock, flags);
+		goto err_pipeline;
+	}
+
+	list_move_tail(&buf->list, &dma->submitted);
+	dma->state = TI_CSI2RX_DMA_ACTIVE;
+	spin_unlock_irqrestore(&dma->lock, flags);
+
+	ret = v4l2_subdev_call(csi->source, video, s_stream, 1);
+	if (ret)
+		goto err_dma;
+
+	return 0;
+
+err_dma:
+	dmaengine_terminate_sync(csi->dma.chan);
+	writel(0, csi->shim + SHIM_DMACNTX);
+err_pipeline:
+	video_device_pipeline_stop(&csi->vdev);
+err:
+	ti_csi2rx_cleanup_buffers(csi, VB2_BUF_STATE_QUEUED);
+	return ret;
+}
+
+static void ti_csi2rx_stop_streaming(struct vb2_queue *vq)
+{
+	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
+	int ret;
+
+	video_device_pipeline_stop(&csi->vdev);
+
+	writel(0, csi->shim + SHIM_CNTL);
+	writel(0, csi->shim + SHIM_DMACNTX);
+
+	ret = v4l2_subdev_call(csi->source, video, s_stream, 0);
+	if (ret)
+		dev_err(csi->dev, "Failed to stop subdev stream\n");
+
+	ti_csi2rx_cleanup_buffers(csi, VB2_BUF_STATE_ERROR);
+}
+
+static const struct vb2_ops csi_vb2_qops = {
+	.queue_setup = ti_csi2rx_queue_setup,
+	.buf_prepare = ti_csi2rx_buffer_prepare,
+	.buf_queue = ti_csi2rx_buffer_queue,
+	.start_streaming = ti_csi2rx_start_streaming,
+	.stop_streaming = ti_csi2rx_stop_streaming,
+	.wait_prepare = vb2_ops_wait_prepare,
+	.wait_finish = vb2_ops_wait_finish,
+};
+
+static int ti_csi2rx_init_vb2q(struct ti_csi2rx_dev *csi)
+{
+	struct vb2_queue *q = &csi->vidq;
+	int ret;
+
+	q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+	q->io_modes = VB2_MMAP | VB2_DMABUF;
+	q->drv_priv = csi;
+	q->buf_struct_size = sizeof(struct ti_csi2rx_buffer);
+	q->ops = &csi_vb2_qops;
+	q->mem_ops = &vb2_dma_contig_memops;
+	q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
+	q->dev = dmaengine_get_dma_device(csi->dma.chan);
+	q->lock = &csi->mutex;
+	q->min_buffers_needed = 1;
+
+	ret = vb2_queue_init(q);
+	if (ret)
+		return ret;
+
+	csi->vdev.queue = q;
+
+	return 0;
+}
+
+static int ti_csi2rx_link_validate(struct media_link *link)
+{
+	struct media_entity *entity = link->sink->entity;
+	struct video_device *vdev = media_entity_to_video_device(entity);
+	struct ti_csi2rx_dev *csi = container_of(vdev, struct ti_csi2rx_dev, vdev);
+	struct v4l2_pix_format *csi_fmt = &csi->v_fmt.fmt.pix;
+	struct v4l2_subdev_format source_fmt = {
+		.which	= V4L2_SUBDEV_FORMAT_ACTIVE,
+		.pad	= link->source->index,
+	};
+	const struct ti_csi2rx_fmt *ti_fmt;
+	int ret;
+
+	ret = v4l2_subdev_call_state_active(csi->source, pad,
+					    get_fmt, &source_fmt);
+	if (ret)
+		return ret;
+
+	if (source_fmt.format.width != csi_fmt->width) {
+		dev_dbg(csi->dev, "Width does not match (source %u, sink %u)\n",
+			source_fmt.format.width, csi_fmt->width);
+		return -EPIPE;
+	}
+
+	if (source_fmt.format.height != csi_fmt->height) {
+		dev_dbg(csi->dev, "Height does not match (source %u, sink %u)\n",
+			source_fmt.format.height, csi_fmt->height);
+		return -EPIPE;
+	}
+
+	if (source_fmt.format.field != csi_fmt->field &&
+	    csi_fmt->field != V4L2_FIELD_NONE) {
+		dev_dbg(csi->dev, "Field does not match (source %u, sink %u)\n",
+			source_fmt.format.field, csi_fmt->field);
+		return -EPIPE;
+	}
+
+	ti_fmt = find_format_by_code(source_fmt.format.code);
+	if (!ti_fmt) {
+		dev_dbg(csi->dev, "Media bus format 0x%x not supported\n",
+			source_fmt.format.code);
+		return -EPIPE;
+	}
+
+	if (ti_fmt->fourcc != csi_fmt->pixelformat) {
+		dev_dbg(csi->dev,
+			"Cannot transform source fmt 0x%x to sink fmt 0x%x\n",
+			ti_fmt->fourcc, csi_fmt->pixelformat);
+		return -EPIPE;
+	}
+
+	return 0;
+}
+
+static const struct media_entity_operations ti_csi2rx_video_entity_ops = {
+	.link_validate = ti_csi2rx_link_validate,
+};
+
+static int ti_csi2rx_init_dma(struct ti_csi2rx_dev *csi)
+{
+	struct dma_slave_config cfg = {
+		.src_addr_width = DMA_SLAVE_BUSWIDTH_16_BYTES,
+	};
+	int ret;
+
+	INIT_LIST_HEAD(&csi->dma.queue);
+	INIT_LIST_HEAD(&csi->dma.submitted);
+	spin_lock_init(&csi->dma.lock);
+
+	csi->dma.state = TI_CSI2RX_DMA_STOPPED;
+
+	csi->dma.chan = dma_request_chan(csi->dev, "rx0");
+	if (IS_ERR(csi->dma.chan))
+		return PTR_ERR(csi->dma.chan);
+
+	ret = dmaengine_slave_config(csi->dma.chan, &cfg);
+	if (ret) {
+		dma_release_channel(csi->dma.chan);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int ti_csi2rx_v4l2_init(struct ti_csi2rx_dev *csi)
+{
+	struct media_device *mdev = &csi->mdev;
+	struct video_device *vdev = &csi->vdev;
+	const struct ti_csi2rx_fmt *fmt;
+	struct v4l2_pix_format *pix_fmt = &csi->v_fmt.fmt.pix;
+	int ret;
+
+	fmt = find_format_by_pix(V4L2_PIX_FMT_UYVY);
+	if (!fmt)
+		return -EINVAL;
+
+	pix_fmt->width = 640;
+	pix_fmt->height = 480;
+	pix_fmt->field = V4L2_FIELD_NONE;
+
+	ti_csi2rx_fill_fmt(fmt, &csi->v_fmt);
+
+	mdev->dev = csi->dev;
+	mdev->hw_revision = 1;
+	strscpy(mdev->model, "TI-CSI2RX", sizeof(mdev->model));
+
+	media_device_init(mdev);
+
+	strscpy(vdev->name, TI_CSI2RX_MODULE_NAME, sizeof(vdev->name));
+	vdev->v4l2_dev = &csi->v4l2_dev;
+	vdev->vfl_dir = VFL_DIR_RX;
+	vdev->fops = &csi_fops;
+	vdev->ioctl_ops = &csi_ioctl_ops;
+	vdev->release = video_device_release_empty;
+	vdev->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING |
+			    V4L2_CAP_IO_MC;
+	vdev->lock = &csi->mutex;
+	video_set_drvdata(vdev, csi);
+
+	csi->pad.flags = MEDIA_PAD_FL_SINK;
+	vdev->entity.ops = &ti_csi2rx_video_entity_ops;
+	ret = media_entity_pads_init(&csi->vdev.entity, 1, &csi->pad);
+	if (ret)
+		return ret;
+
+	csi->v4l2_dev.mdev = mdev;
+
+	ret = v4l2_device_register(csi->dev, &csi->v4l2_dev);
+	if (ret)
+		return ret;
+
+	ret = media_device_register(mdev);
+	if (ret) {
+		v4l2_device_unregister(&csi->v4l2_dev);
+		media_device_cleanup(mdev);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void ti_csi2rx_cleanup_dma(struct ti_csi2rx_dev *csi)
+{
+	dma_release_channel(csi->dma.chan);
+}
+
+static void ti_csi2rx_cleanup_v4l2(struct ti_csi2rx_dev *csi)
+{
+	media_device_unregister(&csi->mdev);
+	v4l2_device_unregister(&csi->v4l2_dev);
+	media_device_cleanup(&csi->mdev);
+}
+
+static void ti_csi2rx_cleanup_subdev(struct ti_csi2rx_dev *csi)
+{
+	v4l2_async_nf_unregister(&csi->notifier);
+	v4l2_async_nf_cleanup(&csi->notifier);
+}
+
+static void ti_csi2rx_cleanup_vb2q(struct ti_csi2rx_dev *csi)
+{
+	vb2_queue_release(&csi->vidq);
+}
+
+static int ti_csi2rx_probe(struct platform_device *pdev)
+{
+	struct ti_csi2rx_dev *csi;
+	struct resource *res;
+	int ret;
+
+	csi = devm_kzalloc(&pdev->dev, sizeof(*csi), GFP_KERNEL);
+	if (!csi)
+		return -ENOMEM;
+
+	csi->dev = &pdev->dev;
+	platform_set_drvdata(pdev, csi);
+
+	mutex_init(&csi->mutex);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	csi->shim = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(csi->shim)) {
+		ret = PTR_ERR(csi->shim);
+		goto err_mutex;
+	}
+
+	ret = ti_csi2rx_init_dma(csi);
+	if (ret)
+		goto err_mutex;
+
+	ret = ti_csi2rx_v4l2_init(csi);
+	if (ret)
+		goto err_dma;
+
+	ret = ti_csi2rx_init_vb2q(csi);
+	if (ret)
+		goto err_v4l2;
+
+	ret = ti_csi2rx_notifier_register(csi);
+	if (ret)
+		goto err_vb2q;
+
+	ret = of_platform_populate(csi->dev->of_node, NULL, NULL, csi->dev);
+	if (ret) {
+		dev_err(csi->dev, "Failed to create children: %d\n", ret);
+		goto err_subdev;
+	}
+
+	return 0;
+
+err_subdev:
+	ti_csi2rx_cleanup_subdev(csi);
+err_vb2q:
+	ti_csi2rx_cleanup_vb2q(csi);
+err_v4l2:
+	ti_csi2rx_cleanup_v4l2(csi);
+err_dma:
+	ti_csi2rx_cleanup_dma(csi);
+err_mutex:
+	mutex_destroy(&csi->mutex);
+	return ret;
+}
+
+static int ti_csi2rx_remove(struct platform_device *pdev)
+{
+	struct ti_csi2rx_dev *csi = platform_get_drvdata(pdev);
+
+	video_unregister_device(&csi->vdev);
+
+	ti_csi2rx_cleanup_vb2q(csi);
+	ti_csi2rx_cleanup_subdev(csi);
+	ti_csi2rx_cleanup_v4l2(csi);
+	ti_csi2rx_cleanup_dma(csi);
+
+	mutex_destroy(&csi->mutex);
+
+	return 0;
+}
+
+static const struct of_device_id ti_csi2rx_of_match[] = {
+	{ .compatible = "ti,j721e-csi2rx-shim", },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, ti_csi2rx_of_match);
+
+static struct platform_driver ti_csi2rx_pdrv = {
+	.probe = ti_csi2rx_probe,
+	.remove = ti_csi2rx_remove,
+	.driver = {
+		.name = TI_CSI2RX_MODULE_NAME,
+		.of_match_table = ti_csi2rx_of_match,
+	},
+};
+
+module_platform_driver(ti_csi2rx_pdrv);
+
+MODULE_DESCRIPTION("TI J721E CSI2 RX Driver");
+MODULE_AUTHOR("Pratyush Yadav <p.yadav@ti.com>");
+MODULE_LICENSE("GPL");

-- 
2.41.0
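
As a reading aid for the buffer handling in the patch above, here is a
minimal user-space sketch of how buf_queue(), start_streaming() and the
DMA completion callback move buffers between the pending queue and the
submitted list. It is a simplified model under stated assumptions, not
driver code: struct dma_model and the model_*() helpers are hypothetical
names, locking and error paths are left out, and buffers are reduced to
counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the driver's DMA states; locking is omitted. */
enum dma_state { DMA_STOPPED, DMA_IDLE, DMA_ACTIVE };

struct dma_model {
	enum dma_state state;
	unsigned int queued;	/* buffers waiting in the pending queue */
	unsigned int submitted;	/* buffers handed to the DMA engine */
	unsigned int drains;	/* drain transfers issued so far */
};

/* Models buf_queue(): drain and restart only if DMA had gone idle. */
static void model_buf_queue(struct dma_model *m)
{
	if (m->state == DMA_IDLE) {
		m->drains++;	/* flush stale data before restarting */
		m->submitted++;
		m->state = DMA_ACTIVE;
	} else {
		m->queued++;
	}
}

/* Models start_streaming(): only the first pending buffer is submitted. */
static bool model_start_streaming(struct dma_model *m)
{
	if (m->queued == 0)
		return false;	/* the driver returns -EIO in this case */
	m->queued--;
	m->submitted++;
	m->state = DMA_ACTIVE;
	return true;
}

/* Models the DMA callback: complete one buffer, submit all pending ones. */
static void model_dma_done(struct dma_model *m)
{
	assert(m->submitted > 0);
	m->submitted--;
	m->submitted += m->queued;
	m->queued = 0;
	if (m->submitted == 0)
		m->state = DMA_IDLE;
}
```

Queueing two buffers, starting the stream and completing both transfers
leaves the model in DMA_IDLE. The next model_buf_queue() call then counts
one drain before resubmitting, which mirrors the restart_dma path in
ti_csi2rx_buffer_queue().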


* Re: [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX
  2023-08-11 10:47 ` [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX Jai Luthra
@ 2023-08-11 14:00   ` Rob Herring
  2023-08-11 14:54     ` Rob Herring
  0 siblings, 1 reply; 30+ messages in thread
From: Rob Herring @ 2023-08-11 14:00 UTC (permalink / raw)
  To: Jai Luthra
  Cc: linux-arm-kernel, Maxime Ripard, Krzysztof Kozlowski,
	Mauro Carvalho Chehab, nm, Martyn Welch, Sakari Ailus,
	Tomi Valkeinen, linux-kernel, Vaishnav Achath,
	Vignesh Raghavendra, devicetree, Rob Herring, a-bhatia1,
	Laurent Pinchart, Julien Massot, Mauro Carvalho Chehab,
	Conor Dooley, niklas.soderlund+renesas, linux-media,
	Benoit Parrot, devarsht


On Fri, 11 Aug 2023 16:17:34 +0530, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
> capture over a CSI-2 bus. The TI CSI2RX platform driver glues all the
> parts together.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> NOTE:
> 
> This patch depends on
> 9536cc949235 ("media: dt-bindings: cadence-csi2rx: Convert to DT schema")
> which is part of linux-next.
> 
>  .../bindings/media/ti,j721e-csi2rx-shim.yaml       | 100 +++++++++++++++++++++
>  1 file changed, 100 insertions(+)
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml:
Error in referenced schema matching $id: http://devicetree.org/schemas/media/cdns,csi2rx.yaml
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: ticsi2rx@4500000: csi-bridge@4504000: False schema does not allow {'compatible': ['ti,j721e-csi2rx', 'cdns,csi2rx'], 'reg': [[72368128, 4096]], 'clocks': [[4294967295, 26, 2], [4294967295, 26, 0], [4294967295, 26, 2], [4294967295, 26, 2], [4294967295, 26, 3], [4294967295, 26, 3]], 'clock-names': ['sys_clk', 'p_clk', 'pixel_if0_clk', 'pixel_if1_clk', 'pixel_if2_clk', 'pixel_if3_clk'], 'phys': [[4294967295]], 'phy-names': ['dphy'], 'ports': {'#address-cells': [[1]], '#size-cells': [[0]], 'port@0': {'reg': [[0]], 'endpoint': {'remote-endpoint': [[4294967295]], 'bus-type': [[4]], 'clock-lanes': [[0]], 'data-lanes': [[1, 2]]}}}}
	from schema $id: http://devicetree.org/schemas/media/ti,j721e-csi2rx-shim.yaml#
Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: /example-0/ticsi2rx@4500000/csi-bridge@4504000: failed to match any schema with compatible: ['ti,j721e-csi2rx', 'cdns,csi2rx']
Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: /example-0/ticsi2rx@4500000/csi-bridge@4504000: failed to match any schema with compatible: ['ti,j721e-csi2rx', 'cdns,csi2rx']

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20230811-upstream_csi-v9-12-8943f7a68a81@ti.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.



* Re: [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX
  2023-08-11 14:00   ` Rob Herring
@ 2023-08-11 14:54     ` Rob Herring
  0 siblings, 0 replies; 30+ messages in thread
From: Rob Herring @ 2023-08-11 14:54 UTC (permalink / raw)
  To: Jai Luthra
  Cc: linux-arm-kernel, Maxime Ripard, Krzysztof Kozlowski,
	Mauro Carvalho Chehab, nm, Martyn Welch, Sakari Ailus,
	Tomi Valkeinen, linux-kernel, Vaishnav Achath,
	Vignesh Raghavendra, devicetree, a-bhatia1, Laurent Pinchart,
	Julien Massot, Mauro Carvalho Chehab, Conor Dooley,
	niklas.soderlund+renesas, linux-media, Benoit Parrot, devarsht

On Fri, Aug 11, 2023 at 08:00:55AM -0600, Rob Herring wrote:
> 
> On Fri, 11 Aug 2023 16:17:34 +0530, Jai Luthra wrote:
> > From: Pratyush Yadav <p.yadav@ti.com>
> > 
> > TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
> > capture over a CSI-2 bus. The TI CSI2RX platform driver glues all the
> > parts together.
> > 
> > Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Reviewed-by: Rob Herring <robh@kernel.org>
> > Signed-off-by: Jai Luthra <j-luthra@ti.com>
> > ---
> > NOTE:
> > 
> > This patch depends on
> > 9536cc949235 ("media: dt-bindings: cadence-csi2rx: Convert to DT schema")
> > which is part of linux-next.
> > 
> >  .../bindings/media/ti,j721e-csi2rx-shim.yaml       | 100 +++++++++++++++++++++
> >  1 file changed, 100 insertions(+)
> > 
> 
> My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> on your patch (DT_CHECKER_FLAGS is new in v5.13):
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> /builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml:
> Error in referenced schema matching $id: http://devicetree.org/schemas/media/cdns,csi2rx.yaml
> /builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: ticsi2rx@4500000: csi-bridge@4504000: False schema does not allow {'compatible': ['ti,j721e-csi2rx', 'cdns,csi2rx'], 'reg': [[72368128, 4096]], 'clocks': [[4294967295, 26, 2], [4294967295, 26, 0], [4294967295, 26, 2], [4294967295, 26, 2], [4294967295, 26, 3], [4294967295, 26, 3]], 'clock-names': ['sys_clk', 'p_clk', 'pixel_if0_clk', 'pixel_if1_clk', 'pixel_if2_clk', 'pixel_if3_clk'], 'phys': [[4294967295]], 'phy-names': ['dphy'], 'ports': {'#address-cells': [[1]], '#size-cells': [[0]], 'port@0': {'reg': [[0]], 'endpoint': {'remote-endpoint': [[4294967295]], 'bus-type': [[4]], 'clock-lanes': [[0]], 'data-lanes': [[1, 2]]}}}}
> 	from schema $id: http://devicetree.org/schemas/media/ti,j721e-csi2rx-shim.yaml#
> Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: /example-0/ticsi2rx@4500000/csi-bridge@4504000: failed to match any schema with compatible: ['ti,j721e-csi2rx', 'cdns,csi2rx']
> Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.example.dtb: /example-0/ticsi2rx@4500000/csi-bridge@4504000: failed to match any schema with compatible: ['ti,j721e-csi2rx', 'cdns,csi2rx']

As noted, this can be ignored.

Rob


* Re: [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops
  2023-08-11 10:47 ` [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops Jai Luthra
@ 2023-08-15 12:05   ` Tomi Valkeinen
  2023-08-25  3:48   ` Laurent Pinchart
  1 sibling, 0 replies; 30+ messages in thread
From: Tomi Valkeinen @ 2023-08-15 12:05 UTC (permalink / raw)
  To: Jai Luthra, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Sakari Ailus
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On 11/08/2023 13:47, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> The format is needed to calculate the link speed for the external DPHY
> configuration. It is not right to query the format from the source
> subdev. Add get_fmt and set_fmt pad operations so that the format can be
> configured and correct bpp be selected.
> 
> Initialize and use the v4l2 subdev active state to keep track of the
> active formats. Also propagate the new format from the sink pad to all
> the source pads.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Co-authored-by: Jai Luthra <j-luthra@ti.com>
> Reviewed-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> Changes from v8:
>      - Squash the patch adding RAW8 and RAW10 formats within this one
>      - Single line struct entries in formats[] array
>      - Skip specifying redundant format.which entry in init_cfg()
> 

Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>

  Tomi
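
For readers following the quoted commit message: the bits-per-pixel value
of the selected format matters because it scales the link rate that the
external DPHY has to be configured for. A minimal sketch, assuming the
common CSI-2 D-PHY relation link_freq = pixel_rate * bpp / (2 * lanes);
csi2_link_freq_hz() and the numbers in the note below are hypothetical and
do not come from cdns-csi2rx.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the driver: estimate the CSI-2 D-PHY
 * link frequency in Hz. D-PHY transfers data on both clock edges, so each
 * lane carries two bits per clock cycle.
 */
static uint64_t csi2_link_freq_hz(uint64_t pixel_rate_hz, unsigned int bpp,
				  unsigned int lanes)
{
	assert(lanes > 0);
	return pixel_rate_hz * bpp / (2ULL * lanes);
}
```

With a 74.25 MHz pixel rate on two data lanes, a 16 bpp format gives
297 MHz while a 10 bpp RAW10 format gives 185.625 MHz, so the result
depends directly on which format is configured on the pad.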




* Re: [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture
  2023-08-11 10:47 ` [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture Jai Luthra
@ 2023-08-15 12:10   ` Tomi Valkeinen
  0 siblings, 0 replies; 30+ messages in thread
From: Tomi Valkeinen @ 2023-08-15 12:10 UTC (permalink / raw)
  To: Jai Luthra, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Sakari Ailus
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On 11/08/2023 13:47, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> This resets the stream state machines and FIFOs, giving them a clean
> slate. On J721E if the streams are not reset before starting the
> capture, the captured frame gets wrapped around vertically on every run
> after the first.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Reviewed-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> Changes from v8:
>      - Simplify reset sequence, minimizing delays
> 
>   drivers/media/platform/cadence/cdns-csi2rx.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
> index 933edec89520..b57e0c3b1944 100644
> --- a/drivers/media/platform/cadence/cdns-csi2rx.c
> +++ b/drivers/media/platform/cadence/cdns-csi2rx.c
> @@ -40,6 +40,7 @@
>   #define CSI2RX_STREAM_BASE(n)		(((n) + 1) * 0x100)
>   
>   #define CSI2RX_STREAM_CTRL_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x000)
> +#define CSI2RX_STREAM_CTRL_SOFT_RST			BIT(4)
>   #define CSI2RX_STREAM_CTRL_START			BIT(0)
>   
>   #define CSI2RX_STREAM_DATA_CFG_REG(n)		(CSI2RX_STREAM_BASE(n) + 0x008)
> @@ -134,12 +135,23 @@ struct csi2rx_priv *v4l2_subdev_to_csi2rx(struct v4l2_subdev *subdev)
>   
>   static void csi2rx_reset(struct csi2rx_priv *csi2rx)
>   {
> +	unsigned int i;
> +
> +	/* Reset module */
>   	writel(CSI2RX_SOFT_RESET_PROTOCOL | CSI2RX_SOFT_RESET_FRONT,
>   	       csi2rx->base + CSI2RX_SOFT_RESET_REG);
> +	/* Reset individual streams. */
> +	for (i = 0; i < csi2rx->max_streams; i++) {
> +		writel(CSI2RX_STREAM_CTRL_SOFT_RST,
> +		       csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
> +	}
>   
> -	udelay(10);
> +	usleep_range(10, 20);
>   
> +	/* Clear resets */
>   	writel(0, csi2rx->base + CSI2RX_SOFT_RESET_REG);
> +	for (i = 0; i < csi2rx->max_streams; i++)
> +		writel(0, csi2rx->base + CSI2RX_STREAM_CTRL_REG(i));
>   }
>   
>   static int csi2rx_configure_ext_dphy(struct csi2rx_priv *csi2rx)
> 

Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>

  Tomi



* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-11 10:47 ` [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E Jai Luthra
@ 2023-08-15 13:00   ` Tomi Valkeinen
  2023-08-18 10:25     ` Jai Luthra
  2023-08-29 16:44   ` Laurent Pinchart
  1 sibling, 1 reply; 30+ messages in thread
From: Tomi Valkeinen @ 2023-08-15 13:00 UTC (permalink / raw)
  To: Jai Luthra, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Sakari Ailus
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On 11/08/2023 13:47, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
> capture over a CSI-2 bus.
> 
> The Cadence CSI2RX IP acts as a bridge between the TI specific parts and
> the CSI-2 protocol parts. TI then has a wrapper on top of this bridge
> called the SHIM layer. It takes in data from stream 0, repacks it, and
> sends it to memory over PSI-L DMA.
> 
> This driver acts as the "front end" to V4L2 client applications. It
> implements the required ioctls and buffer operations, passes the
> necessary calls on to the bridge, programs the SHIM layer, and performs
> DMA via the dmaengine API to finally return the data to a buffer
> supplied by the application.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Co-authored-by: Vaishnav Achath <vaishnav.a@ti.com>
> Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
> Tested-by: Vaishnav Achath <vaishnav.a@ti.com>
> Co-authored-by: Jai Luthra <j-luthra@ti.com>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> Changes since v8:
>      - Allocate drain buffer at start of stream instead of doing it in the
>        middle, and document why it is needed in comments
>      - Call subdev's get_fmt directly for link_validation()
>      - Cleanup height/width clamping and rounding code, document it in comments
>      - Return and check errors from setup_shim()
>      - s/subdev/source for cadence csi2rx's v4l2_subdev
>      - s/ti_csi2rx_init_subdev/ti_csi2rx_notifier_register
>      - Change copyright year/author list
> 
>   MAINTAINERS                                        |    7 +
>   drivers/media/platform/ti/Kconfig                  |   12 +
>   drivers/media/platform/ti/Makefile                 |    1 +
>   drivers/media/platform/ti/j721e-csi2rx/Makefile    |    2 +
>   .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c  | 1150 ++++++++++++++++++++
>   5 files changed, 1172 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 02a3192195af..959147d6d936 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -21455,6 +21455,13 @@ F:	Documentation/devicetree/bindings/media/i2c/ti,ds90*
>   F:	drivers/media/i2c/ds90*
>   F:	include/media/i2c/ds90*
>   
> +TI J721E CSI2RX DRIVER
> +M:	Jai Luthra <j-luthra@ti.com>
> +L:	linux-media@vger.kernel.org
> +S:	Maintained
> +F:	Documentation/devicetree/bindings/media/ti,j721e-csi2rx-shim.yaml
> +F:	drivers/media/platform/ti/j721e-csi2rx/
> +
>   TI KEYSTONE MULTICORE NAVIGATOR DRIVERS
>   M:	Nishanth Menon <nm@ti.com>
>   M:	Santosh Shilimkar <ssantosh@kernel.org>
> diff --git a/drivers/media/platform/ti/Kconfig b/drivers/media/platform/ti/Kconfig
> index e1ab56c3be1f..42c908f6e1ae 100644
> --- a/drivers/media/platform/ti/Kconfig
> +++ b/drivers/media/platform/ti/Kconfig
> @@ -63,6 +63,18 @@ config VIDEO_TI_VPE_DEBUG
>   	help
>   	  Enable debug messages on VPE driver.
>   
> +config VIDEO_TI_J721E_CSI2RX
> +	tristate "TI J721E CSI2RX wrapper layer driver"
> +	depends on VIDEO_DEV && VIDEO_V4L2_SUBDEV_API
> +	depends on MEDIA_SUPPORT && MEDIA_CONTROLLER
> +	depends on PHY_CADENCE_DPHY_RX && VIDEO_CADENCE_CSI2RX
> +	depends on ARCH_K3 || COMPILE_TEST
> +	select VIDEOBUF2_DMA_CONTIG
> +	select V4L2_FWNODE
> +	help
> +	  Support for TI CSI2RX wrapper layer. This just enables the wrapper driver.
> +	  The Cadence CSI2RX bridge driver needs to be enabled separately.
> +
>   source "drivers/media/platform/ti/am437x/Kconfig"
>   source "drivers/media/platform/ti/davinci/Kconfig"
>   source "drivers/media/platform/ti/omap/Kconfig"
> diff --git a/drivers/media/platform/ti/Makefile b/drivers/media/platform/ti/Makefile
> index 98c5fe5c40d6..8a2f74c9380e 100644
> --- a/drivers/media/platform/ti/Makefile
> +++ b/drivers/media/platform/ti/Makefile
> @@ -3,5 +3,6 @@ obj-y += am437x/
>   obj-y += cal/
>   obj-y += vpe/
>   obj-y += davinci/
> +obj-y += j721e-csi2rx/
>   obj-y += omap/
>   obj-y += omap3isp/
> diff --git a/drivers/media/platform/ti/j721e-csi2rx/Makefile b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> new file mode 100644
> index 000000000000..377afc1d6280
> --- /dev/null
> +++ b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_VIDEO_TI_J721E_CSI2RX) += j721e-csi2rx.o
> diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> new file mode 100644
> index 000000000000..301d947f6098
> --- /dev/null
> +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> @@ -0,0 +1,1150 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * TI CSI2RX Shim Wrapper Driver
> + *
> + * Copyright (C) 2023 Texas Instruments Incorporated - https://www.ti.com/
> + *
> + * Author: Pratyush Yadav <p.yadav@ti.com>
> + * Author: Jai Luthra <j-luthra@ti.com>
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/dmaengine.h>
> +#include <linux/module.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +
> +#include <media/mipi-csi2.h>
> +#include <media/v4l2-device.h>
> +#include <media/v4l2-ioctl.h>
> +#include <media/v4l2-mc.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#define TI_CSI2RX_MODULE_NAME		"j721e-csi2rx"
> +
> +#define SHIM_CNTL			0x10
> +#define SHIM_CNTL_PIX_RST		BIT(0)
> +
> +#define SHIM_DMACNTX			0x20
> +#define SHIM_DMACNTX_EN			BIT(31)
> +#define SHIM_DMACNTX_YUV422		GENMASK(27, 26)
> +#define SHIM_DMACNTX_SIZE		GENMASK(21, 20)
> +#define SHIM_DMACNTX_FMT		GENMASK(5, 0)
> +#define SHIM_DMACNTX_UYVY		0
> +#define SHIM_DMACNTX_VYUY		1
> +#define SHIM_DMACNTX_YUYV		2
> +#define SHIM_DMACNTX_YVYU		3
> +#define SHIM_DMACNTX_SIZE_8		0
> +#define SHIM_DMACNTX_SIZE_16		1
> +#define SHIM_DMACNTX_SIZE_32		2
> +
> +#define SHIM_PSI_CFG0			0x24
> +#define SHIM_PSI_CFG0_SRC_TAG		GENMASK(15, 0)
> +#define SHIM_PSI_CFG0_DST_TAG		GENMASK(31, 16)
> +
> +#define PSIL_WORD_SIZE_BYTES		16
> +/*
> + * There are no hard limits on the width or height. The DMA engine can handle
> + * all sizes. The max width and height are arbitrary numbers for this driver.
> + * Use 16K * 16K as the arbitrary limit. It is large enough that it is unlikely
> + * the limit will be hit in practice.
> + */
> +#define MAX_WIDTH_BYTES			SZ_16K
> +#define MAX_HEIGHT_LINES		SZ_16K
> +
> +#define DRAIN_TIMEOUT_MS		50
> +
> +struct ti_csi2rx_fmt {
> +	u32				fourcc;	/* Four character code. */
> +	u32				code;	/* Mbus code. */
> +	u32				csi_dt;	/* CSI Data type. */
> +	u8				bpp;	/* Bits per pixel. */
> +	u8				size;	/* Data size shift when unpacking. */
> +};
> +
> +struct ti_csi2rx_buffer {
> +	/* Common v4l2 buffer. Must be first. */
> +	struct vb2_v4l2_buffer		vb;
> +	struct list_head		list;
> +	struct ti_csi2rx_dev		*csi;
> +};
> +
> +enum ti_csi2rx_dma_state {
> +	TI_CSI2RX_DMA_STOPPED,	/* Streaming not started yet. */
> +	TI_CSI2RX_DMA_IDLE,	/* Streaming but no pending DMA operation. */
> +	TI_CSI2RX_DMA_ACTIVE,	/* Streaming and pending DMA operation. */
> +};
> +
> +struct ti_csi2rx_dma {
> +	/* Protects all fields in this struct. */
> +	spinlock_t			lock;
> +	struct dma_chan			*chan;
> +	/* Buffers queued to the driver, waiting to be processed by DMA. */
> +	struct list_head		queue;
> +	enum ti_csi2rx_dma_state	state;
> +	/*
> +	 * Queue of buffers submitted to DMA engine.
> +	 */
> +	struct list_head		submitted;
> +	/* Buffer to drain stale data from PSI-L endpoint */
> +	struct {
> +		void			*vaddr;
> +		dma_addr_t		paddr;
> +		size_t			len;
> +	} drain;
> +};
> +
> +struct ti_csi2rx_dev {
> +	struct device			*dev;
> +	void __iomem			*shim;
> +	struct v4l2_device		v4l2_dev;
> +	struct video_device		vdev;
> +	struct media_device		mdev;
> +	struct media_pipeline		pipe;
> +	struct media_pad		pad;
> +	struct v4l2_async_notifier	notifier;
> +	struct v4l2_subdev		*source;
> +	struct vb2_queue		vidq;
> +	struct mutex			mutex; /* To serialize ioctls. */
> +	struct v4l2_format		v_fmt;
> +	struct ti_csi2rx_dma		dma;
> +	u32				sequence;
> +};
> +
> +static const struct ti_csi2rx_fmt formats[] = {
> +	{
> +		.fourcc			= V4L2_PIX_FMT_YUYV,
> +		.code			= MEDIA_BUS_FMT_YUYV8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_UYVY,
> +		.code			= MEDIA_BUS_FMT_UYVY8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_YVYU,
> +		.code			= MEDIA_BUS_FMT_YVYU8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_VYUY,
> +		.code			= MEDIA_BUS_FMT_VYUY8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SBGGR8,
> +		.code			= MEDIA_BUS_FMT_SBGGR8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGBRG8,
> +		.code			= MEDIA_BUS_FMT_SGBRG8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGRBG8,
> +		.code			= MEDIA_BUS_FMT_SGRBG8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SRGGB8,
> +		.code			= MEDIA_BUS_FMT_SRGGB8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SBGGR10,
> +		.code			= MEDIA_BUS_FMT_SBGGR10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGBRG10,
> +		.code			= MEDIA_BUS_FMT_SGBRG10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGRBG10,
> +		.code			= MEDIA_BUS_FMT_SGRBG10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SRGGB10,
> +		.code			= MEDIA_BUS_FMT_SRGGB10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	},
> +
> +	/* More formats can be supported but they are not listed for now. */
> +};
> +
> +static const unsigned int num_formats = ARRAY_SIZE(formats);
> +
> +/* Forward declaration needed by ti_csi2rx_dma_callback. */
> +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> +			       struct ti_csi2rx_buffer *buf);
> +
> +static const struct ti_csi2rx_fmt *find_format_by_pix(u32 pixelformat)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_formats; i++) {
> +		if (formats[i].fourcc == pixelformat)
> +			return &formats[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static const struct ti_csi2rx_fmt *find_format_by_code(u32 code)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_formats; i++) {
> +		if (formats[i].code == code)
> +			return &formats[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static void ti_csi2rx_fill_fmt(const struct ti_csi2rx_fmt *csi_fmt,
> +			       struct v4l2_format *v4l2_fmt)
> +{
> +	struct v4l2_pix_format *pix = &v4l2_fmt->fmt.pix;
> +	unsigned int pixels_in_word;
> +	u8 bpp = ALIGN(csi_fmt->bpp, 8);
> +
> +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> +
> +	/* Clamp width and height to sensible maximums (16K x 16K) */
> +	pix->width = clamp_t(unsigned int, pix->width,
> +			     pixels_in_word,
> +			     MAX_WIDTH_BYTES * 8 / bpp);
> +	pix->height = clamp_t(unsigned int, pix->height, 1, MAX_HEIGHT_LINES);
> +
> +	/* Width should be a multiple of transfer word-size */
> +	pix->width = rounddown(pix->width, pixels_in_word);
> +
> +	v4l2_fmt->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +	pix->pixelformat = csi_fmt->fourcc;
> +	pix->colorspace = V4L2_COLORSPACE_SRGB;
> +	pix->bytesperline = pix->width * (bpp / 8);
> +	pix->sizeimage = pix->bytesperline * pix->height;
> +}
> +
> +static int ti_csi2rx_querycap(struct file *file, void *priv,
> +			      struct v4l2_capability *cap)
> +{
> +	strscpy(cap->driver, TI_CSI2RX_MODULE_NAME, sizeof(cap->driver));
> +	strscpy(cap->card, TI_CSI2RX_MODULE_NAME, sizeof(cap->card));
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_enum_fmt_vid_cap(struct file *file, void *priv,
> +				      struct v4l2_fmtdesc *f)
> +{
> +	const struct ti_csi2rx_fmt *fmt = NULL;
> +
> +	if (f->mbus_code) {
> +		/* 1-to-1 mapping between bus formats and pixel formats */
> +		if (f->index > 0)
> +			return -EINVAL;
> +
> +		fmt = find_format_by_code(f->mbus_code);
> +	} else {
> +		if (f->index >= num_formats)
> +			return -EINVAL;
> +
> +		fmt = &formats[f->index];
> +	}
> +
> +	if (!fmt)
> +		return -EINVAL;
> +
> +	f->pixelformat = fmt->fourcc;
> +	memset(f->reserved, 0, sizeof(f->reserved));
> +	f->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_g_fmt_vid_cap(struct file *file, void *prov,
> +				   struct v4l2_format *f)
> +{
> +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> +
> +	*f = csi->v_fmt;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_try_fmt_vid_cap(struct file *file, void *priv,
> +				     struct v4l2_format *f)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +
> +	/*
> +	 * Default to the first format if the requested pixel format code isn't
> +	 * supported.
> +	 */
> +	fmt = find_format_by_pix(f->fmt.pix.pixelformat);
> +	if (!fmt)
> +		fmt = &formats[0];
> +
> +	/* Interlaced formats are not supported. */
> +	f->fmt.pix.field = V4L2_FIELD_NONE;
> +
> +	ti_csi2rx_fill_fmt(fmt, f);
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_s_fmt_vid_cap(struct file *file, void *priv,
> +				   struct v4l2_format *f)
> +{
> +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> +	struct vb2_queue *q = &csi->vidq;
> +	int ret;
> +
> +	if (vb2_is_busy(q))
> +		return -EBUSY;
> +
> +	ret = ti_csi2rx_try_fmt_vid_cap(file, priv, f);
> +	if (ret < 0)
> +		return ret;
> +
> +	csi->v_fmt = *f;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_enum_framesizes(struct file *file, void *fh,
> +				     struct v4l2_frmsizeenum *fsize)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +	unsigned int pixels_in_word;
> +	u8 bpp;
> +
> +	fmt = find_format_by_pix(fsize->pixel_format);
> +	if (!fmt || fsize->index != 0)
> +		return -EINVAL;
> +
> +	bpp = ALIGN(fmt->bpp, 8);
> +
> +	/*
> +	 * Number of pixels in one PSI-L word. The transfer happens in multiples
> +	 * of PSI-L word sizes.
> +	 */
> +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> +
> +	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
> +	fsize->stepwise.min_width = pixels_in_word;
> +	fsize->stepwise.max_width = rounddown(MAX_WIDTH_BYTES * 8 / bpp,
> +					      pixels_in_word);
> +	fsize->stepwise.step_width = pixels_in_word;
> +	fsize->stepwise.min_height = 1;
> +	fsize->stepwise.max_height = MAX_HEIGHT_LINES;
> +	fsize->stepwise.step_height = 1;
> +
> +	return 0;
> +}
> +
> +static const struct v4l2_ioctl_ops csi_ioctl_ops = {
> +	.vidioc_querycap      = ti_csi2rx_querycap,
> +	.vidioc_enum_fmt_vid_cap = ti_csi2rx_enum_fmt_vid_cap,
> +	.vidioc_try_fmt_vid_cap = ti_csi2rx_try_fmt_vid_cap,
> +	.vidioc_g_fmt_vid_cap = ti_csi2rx_g_fmt_vid_cap,
> +	.vidioc_s_fmt_vid_cap = ti_csi2rx_s_fmt_vid_cap,
> +	.vidioc_enum_framesizes = ti_csi2rx_enum_framesizes,
> +	.vidioc_reqbufs       = vb2_ioctl_reqbufs,
> +	.vidioc_create_bufs   = vb2_ioctl_create_bufs,
> +	.vidioc_prepare_buf   = vb2_ioctl_prepare_buf,
> +	.vidioc_querybuf      = vb2_ioctl_querybuf,
> +	.vidioc_qbuf          = vb2_ioctl_qbuf,
> +	.vidioc_dqbuf         = vb2_ioctl_dqbuf,
> +	.vidioc_expbuf        = vb2_ioctl_expbuf,
> +	.vidioc_streamon      = vb2_ioctl_streamon,
> +	.vidioc_streamoff     = vb2_ioctl_streamoff,
> +};
> +
> +static const struct v4l2_file_operations csi_fops = {
> +	.owner = THIS_MODULE,
> +	.open = v4l2_fh_open,
> +	.release = vb2_fop_release,
> +	.read = vb2_fop_read,
> +	.poll = vb2_fop_poll,
> +	.unlocked_ioctl = video_ioctl2,
> +	.mmap = vb2_fop_mmap,
> +};
> +
> +static int csi_async_notifier_bound(struct v4l2_async_notifier *notifier,
> +				    struct v4l2_subdev *subdev,
> +				    struct v4l2_async_connection *asc)
> +{
> +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> +
> +	csi->source = subdev;
> +
> +	return 0;
> +}
> +
> +static int csi_async_notifier_complete(struct v4l2_async_notifier *notifier)
> +{
> +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> +	struct video_device *vdev = &csi->vdev;
> +	int ret;
> +
> +	ret = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
> +	if (ret)
> +		return ret;
> +
> +	ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad,
> +					      MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
> +
> +	if (ret) {
> +		video_unregister_device(vdev);
> +		return ret;
> +	}
> +
> +	return v4l2_device_register_subdev_nodes(&csi->v4l2_dev);
> +}
> +
> +static const struct v4l2_async_notifier_operations csi_async_notifier_ops = {
> +	.bound = csi_async_notifier_bound,
> +	.complete = csi_async_notifier_complete,
> +};
> +
> +static int ti_csi2rx_notifier_register(struct ti_csi2rx_dev *csi)
> +{
> +	struct fwnode_handle *fwnode;
> +	struct v4l2_async_connection *asc;
> +	struct device_node *node;
> +	int ret;
> +
> +	node = of_get_child_by_name(csi->dev->of_node, "csi-bridge");
> +	if (!node)
> +		return -EINVAL;
> +
> +	fwnode = of_fwnode_handle(node);
> +	if (!fwnode) {
> +		of_node_put(node);
> +		return -EINVAL;
> +	}
> +
> +	v4l2_async_nf_init(&csi->notifier, &csi->v4l2_dev);
> +	csi->notifier.ops = &csi_async_notifier_ops;
> +
> +	asc = v4l2_async_nf_add_fwnode(&csi->notifier, fwnode,
> +				       struct v4l2_async_connection);
> +	of_node_put(node);
> +	if (IS_ERR(asc)) {
> +		v4l2_async_nf_cleanup(&csi->notifier);
> +		return PTR_ERR(asc);
> +	}
> +
> +	ret = v4l2_async_nf_register(&csi->notifier);
> +	if (ret) {
> +		v4l2_async_nf_cleanup(&csi->notifier);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_setup_shim(struct ti_csi2rx_dev *csi)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +	unsigned int reg;
> +
> +	fmt = find_format_by_pix(csi->v_fmt.fmt.pix.pixelformat);
> +	if (!fmt) {
> +		dev_err(csi->dev, "Pixelformat 0x%x is not supported\n",
> +			csi->v_fmt.fmt.pix.pixelformat);
> +		return -EINVAL;
> +	}
> +
> +	/* De-assert the pixel interface reset. */
> +	reg = SHIM_CNTL_PIX_RST;
> +	writel(reg, csi->shim + SHIM_CNTL);
> +
> +	reg = SHIM_DMACNTX_EN;
> +	reg |= FIELD_PREP(SHIM_DMACNTX_FMT, fmt->csi_dt);
> +
> +	/*
> +	 * Using the values from the documentation gives incorrect ordering for
> +	 * the luma and chroma components. In practice, the "reverse" format
> +	 * gives the correct image. So for example, if the image is in UYVY, the
> +	 * reverse would be YVYU.
> +	 */
> +	switch (fmt->fourcc) {
> +	case V4L2_PIX_FMT_UYVY:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_YVYU);
> +		break;
> +	case V4L2_PIX_FMT_VYUY:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_YUYV);
> +		break;
> +	case V4L2_PIX_FMT_YUYV:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_VYUY);
> +		break;
> +	case V4L2_PIX_FMT_YVYU:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_UYVY);
> +		break;
> +	default:
> +		/* Ignore if not YUV 4:2:2 */
> +		break;
> +	}
> +
> +	reg |= FIELD_PREP(SHIM_DMACNTX_SIZE, fmt->size);
> +
> +	writel(reg, csi->shim + SHIM_DMACNTX);
> +
> +	reg = FIELD_PREP(SHIM_PSI_CFG0_SRC_TAG, 0) |
> +	      FIELD_PREP(SHIM_PSI_CFG0_DST_TAG, 0);
> +	writel(reg, csi->shim + SHIM_PSI_CFG0);
> +
> +	return 0;
> +}
> +
> +static void ti_csi2rx_drain_callback(void *param)
> +{
> +	struct completion *drain_complete = param;
> +
> +	complete(drain_complete);
> +}
> +
> +/** Drain the stale data left at the PSI-L endpoint.
> + *
> + * This might happen if no buffers are queued in time but source is still
> + * streaming. Or rarely it may happen while stopping the stream. To prevent

I understand the first one, but when does this happen when stopping the 
stream?

> + * that stale data corrupting the subsequent transactions, it is required to
> + * issue DMA requests to drain it out.
> + */
> +static int ti_csi2rx_drain_dma(struct ti_csi2rx_dev *csi)
> +{
> +	struct dma_async_tx_descriptor *desc;
> +	struct completion drain_complete;
> +	dma_cookie_t cookie;
> +	int ret;
> +
> +	init_completion(&drain_complete);
> +
> +	desc = dmaengine_prep_slave_single(csi->dma.chan, csi->dma.drain.paddr,
> +					   csi->dma.drain.len, DMA_DEV_TO_MEM,
> +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +	if (!desc) {
> +		ret = -EIO;
> +		goto out;
> +	}
> +
> +	desc->callback = ti_csi2rx_drain_callback;
> +	desc->callback_param = &drain_complete;
> +
> +	cookie = dmaengine_submit(desc);
> +	ret = dma_submit_error(cookie);
> +	if (ret)
> +		goto out;
> +
> +	dma_async_issue_pending(csi->dma.chan);
> +
> +	if (!wait_for_completion_timeout(&drain_complete,
> +					 msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
> +		dmaengine_terminate_sync(csi->dma.chan);
> +		ret = -ETIMEDOUT;
> +		goto out;
> +	}
> +out:
> +	return ret;
> +}
> +
> +static void ti_csi2rx_dma_callback(void *param)
> +{
> +	struct ti_csi2rx_buffer *buf = param;
> +	struct ti_csi2rx_dev *csi = buf->csi;
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	unsigned long flags;
> +
> +	/*
> +	 * TODO: Derive the sequence number from the CSI2RX frame number
> +	 * hardware monitor registers.
> +	 */
> +	buf->vb.vb2_buf.timestamp = ktime_get_ns();
> +	buf->vb.sequence = csi->sequence++;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +
> +	WARN_ON(!list_is_first(&buf->list, &dma->submitted));
> +	vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_DONE);
> +	list_del(&buf->list);
> +
> +	/* If there are more buffers to process then start their transfer. */
> +	while (!list_empty(&dma->queue)) {
> +		buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
> +
> +		if (ti_csi2rx_start_dma(csi, buf)) {
> +			dev_err(csi->dev, "Failed to queue the next buffer for DMA\n");
> +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +		} else {
> +			list_move_tail(&buf->list, &dma->submitted);
> +		}
> +	}
> +
> +	if (list_empty(&dma->submitted))
> +		dma->state = TI_CSI2RX_DMA_IDLE;
> +
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +}
> +
> +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> +			       struct ti_csi2rx_buffer *buf)
> +{
> +	unsigned long addr;
> +	struct dma_async_tx_descriptor *desc;
> +	size_t len = csi->v_fmt.fmt.pix.sizeimage;
> +	dma_cookie_t cookie;
> +	int ret = 0;
> +
> +	addr = vb2_dma_contig_plane_dma_addr(&buf->vb.vb2_buf, 0);
> +	desc = dmaengine_prep_slave_single(csi->dma.chan, addr, len,
> +					   DMA_DEV_TO_MEM,
> +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +	if (!desc)
> +		return -EIO;
> +
> +	desc->callback = ti_csi2rx_dma_callback;
> +	desc->callback_param = buf;
> +
> +	cookie = dmaengine_submit(desc);
> +	ret = dma_submit_error(cookie);
> +	if (ret)
> +		return ret;
> +
> +	dma_async_issue_pending(csi->dma.chan);
> +
> +	return 0;
> +}
> +
> +static void ti_csi2rx_cleanup_buffers(struct ti_csi2rx_dev *csi,
> +				      enum vb2_buffer_state buf_state)
> +{
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	struct ti_csi2rx_buffer *buf, *tmp;
> +	enum ti_csi2rx_dma_state state;
> +	unsigned long flags;
> +	int ret;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	state = csi->dma.state;
> +	dma->state = TI_CSI2RX_DMA_STOPPED;
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +
> +	if (state != TI_CSI2RX_DMA_STOPPED) {
> +		/*
> +		 * Normal DMA termination sometimes does not clean up pending
> +		 * data on the endpoint.

When is "sometimes"? It's good to be more exact.

> +		 */
> +		ret = ti_csi2rx_drain_dma(csi);
> +		if (ret)
> +			dev_dbg(csi->dev,
> +				"Failed to drain DMA. Next frame might be bogus\n");
> +	}
> +	ret = dmaengine_terminate_sync(csi->dma.chan);
> +	if (ret)
> +		dev_err(csi->dev, "Failed to stop DMA: %d\n", ret);
> +
> +	dma_free_coherent(csi->dev, dma->drain.len,
> +			  dma->drain.vaddr, dma->drain.paddr);
> +	dma->drain.vaddr = NULL;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	list_for_each_entry_safe(buf, tmp, &csi->dma.queue, list) {
> +		list_del(&buf->list);
> +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> +	}
> +	list_for_each_entry_safe(buf, tmp, &csi->dma.submitted, list) {
> +		list_del(&buf->list);
> +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> +	}
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +}
> +
> +static int ti_csi2rx_queue_setup(struct vb2_queue *q, unsigned int *nbuffers,
> +				 unsigned int *nplanes, unsigned int sizes[],
> +				 struct device *alloc_devs[])
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(q);
> +	unsigned int size = csi->v_fmt.fmt.pix.sizeimage;
> +
> +	if (*nplanes) {
> +		if (sizes[0] < size)
> +			return -EINVAL;
> +		size = sizes[0];
> +	}
> +
> +	*nplanes = 1;
> +	sizes[0] = size;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_buffer_prepare(struct vb2_buffer *vb)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> +	unsigned long size = csi->v_fmt.fmt.pix.sizeimage;
> +
> +	if (vb2_plane_size(vb, 0) < size) {
> +		dev_err(csi->dev, "Data will not fit into plane\n");
> +		return -EINVAL;
> +	}
> +
> +	vb2_set_plane_payload(vb, 0, size);
> +	return 0;
> +}
> +
> +static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> +	struct ti_csi2rx_buffer *buf;
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	bool restart_dma = false;
> +	unsigned long flags = 0;
> +	int ret;
> +
> +	buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
> +	buf->csi = csi;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	/*
> +	 * Usually the DMA callback takes care of queueing the pending buffers.
> +	 * But if DMA has stalled due to lack of buffers, restart it now.
> +	 */
> +	if (dma->state == TI_CSI2RX_DMA_IDLE) {
> +		/*
> +		 * Do not restart DMA with the lock held because
> +		 * ti_csi2rx_drain_dma() might block for completion.
> +		 * There won't be a race on queueing DMA anyway since the
> +		 * callback is not being fired.
> +		 */
> +		restart_dma = true;
> +		dma->state = TI_CSI2RX_DMA_ACTIVE;
> +	} else {
> +		list_add_tail(&buf->list, &dma->queue);
> +	}
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +
> +	if (restart_dma) {
> +		/*
> +		 * Once frames start dropping, some data gets stuck in the DMA
> +		 * pipeline somewhere. So the first DMA transfer after frame
> +		 * drops gives a partial frame. This is obviously not useful to
> +		 * the application and will only confuse it. Issue a DMA
> +		 * transaction to drain that up.
> +		 */
> +		ret = ti_csi2rx_drain_dma(csi);
> +		if (ret)
> +			dev_warn(csi->dev,
> +				 "Failed to drain DMA. Next frame might be bogus\n");
> +
> +		ret = ti_csi2rx_start_dma(csi, buf);
> +		if (ret) {
> +			dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
> +			spin_lock_irqsave(&dma->lock, flags);
> +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +			dma->state = TI_CSI2RX_DMA_IDLE;
> +			spin_unlock_irqrestore(&dma->lock, flags);
> +		} else {
> +			spin_lock_irqsave(&dma->lock, flags);
> +			list_add_tail(&buf->list, &dma->submitted);
> +			spin_unlock_irqrestore(&dma->lock, flags);
> +		}
> +	}
> +}
> +
> +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	struct ti_csi2rx_buffer *buf;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	if (list_empty(&dma->queue))
> +		ret = -EIO;
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +	if (ret)
> +		return ret;
> +
> +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> +					      &dma->drain.paddr, GFP_KERNEL);
> +	if (!dma->drain.vaddr)
> +		return -ENOMEM;

This is still allocating a large buffer every time streaming is started 
(and with streams support, a separate buffer for each stream?).

Did you check if the TI DMA can do writes to a constant address? That 
would be the best option, as then the whole buffer allocation problem 
goes away.

Alternatively, can you flush the buffers with multiple one-line 
transfers? The flushing shouldn't be performance critical, so even if 
that's slower than a normal full-frame DMA, it shouldn't matter much. 
And if that can be done, a single probe-time line-buffer allocation 
should do the trick.
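
To make the one-line idea concrete, here is a rough user-space sketch, 
not driver code. Everything in it is hypothetical and for illustration 
only: struct fake_endpoint and fake_dma_read() stand in for the PSI-L 
endpoint and a single DMA transfer, and drain_by_lines() is not a 
function from this patch. It also assumes the amount of stale data is 
known, which the real hardware may not tell us; a real implementation 
would more likely stop on a timed-out or short transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the PSI-L endpoint holding stale bytes. */
struct fake_endpoint {
	size_t stale_bytes;
};

/* Same arithmetic as ti_csi2rx_fill_fmt(): bpp is a multiple of 8. */
static size_t bytes_per_line(size_t width, size_t bpp)
{
	return width * (bpp / 8);
}

static size_t size_image(size_t width, size_t height, size_t bpp)
{
	return bytes_per_line(width, bpp) * height;
}

/*
 * Hypothetical stand-in for one DMA transfer: moves at most len bytes
 * out of the endpoint into dst and returns how many bytes were moved.
 */
static size_t fake_dma_read(struct fake_endpoint *ep, unsigned char *dst,
			    size_t len)
{
	size_t n = ep->stale_bytes < len ? ep->stale_bytes : len;

	memset(dst, 0, n);
	ep->stale_bytes -= n;
	return n;
}

/*
 * Drain the endpoint with repeated line-sized transfers that all land
 * in the same buffer. Returns the number of transfers issued, or -1 if
 * max_transfers was reached with data still pending.
 */
static int drain_by_lines(struct fake_endpoint *ep, unsigned char *line_buf,
			  size_t line_len, int max_transfers)
{
	int transfers = 0;

	assert(line_len > 0);

	while (ep->stale_bytes > 0) {
		if (transfers == max_transfers)
			return -1;
		fake_dma_read(ep, line_buf, line_len);
		transfers++;
	}

	return transfers;
}
```

For a 1920x1080 16 bpp format the current approach allocates a drain 
buffer of sizeimage bytes (1920 * 2 * 1080 = 4147200), while the sketch 
only needs one line (3840 bytes) at the cost of up to 1080 transfers 
per drain.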

Other than this drain buffer topic, I think this looks fine. So, I'm 
going to give Rb, but I do encourage you to look more into optimizing 
this drain buffer.

Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>

  Tomi



* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-15 13:00   ` Tomi Valkeinen
@ 2023-08-18 10:25     ` Jai Luthra
  2023-08-29 15:55       ` Laurent Pinchart
  0 siblings, 1 reply; 30+ messages in thread
From: Jai Luthra @ 2023-08-18 10:25 UTC (permalink / raw)
  To: Tomi Valkeinen, Vignesh Raghavendra
  Cc: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, linux-media, linux-kernel,
	devicetree, linux-arm-kernel, Laurent Pinchart,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, nm, devarsht, a-bhatia1,
	Martyn Welch, Julien Massot

[-- Attachment #1: Type: text/plain, Size: 11917 bytes --]

Hi Tomi,

Thanks for the review.

On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> On 11/08/2023 13:47, Jai Luthra wrote:
> > From: Pratyush Yadav <p.yadav@ti.com>
> > 

...

> > +
> > +static void ti_csi2rx_drain_callback(void *param)
> > +{
> > +	struct completion *drain_complete = param;
> > +
> > +	complete(drain_complete);
> > +}
> > +
> > +/** Drain the stale data left at the PSI-L endpoint.
> > + *
> > + * This might happen if no buffers are queued in time but source is still
> > + * streaming. Or rarely it may happen while stopping the stream. To prevent
> 
> I understand the first one, but when does this happen when stopping the
> stream?
> 

When multi-stream support is enabled, the module-level pixel reset 
cannot be done when stopping a single stream, in which case some 
in-flight data is left at the PSI-L endpoint despite enforcing the 
DMACNTX reset beforehand. The same was true up to v7 of this series as 
well, due to the module-level pixel reset being done in the wrong order 
(before stopping the stream on the source).

I am not sure whether this can still happen for single-stream use cases 
now (since v8).

I will fix this and other comments when I post subsequent patches for 
multi-stream.

> > + * that stale data corrupting the subsequent transactions, it is required to
> > + * issue DMA requests to drain it out.
> > + */
> > +static int ti_csi2rx_drain_dma(struct ti_csi2rx_dev *csi)
> > +{
> > +	struct dma_async_tx_descriptor *desc;
> > +	struct completion drain_complete;
> > +	dma_cookie_t cookie;
> > +	int ret;
> > +
> > +	init_completion(&drain_complete);
> > +
> > +	desc = dmaengine_prep_slave_single(csi->dma.chan, csi->dma.drain.paddr,
> > +					   csi->dma.drain.len, DMA_DEV_TO_MEM,
> > +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > +	if (!desc) {
> > +		ret = -EIO;
> > +		goto out;
> > +	}
> > +
> > +	desc->callback = ti_csi2rx_drain_callback;
> > +	desc->callback_param = &drain_complete;
> > +
> > +	cookie = dmaengine_submit(desc);
> > +	ret = dma_submit_error(cookie);
> > +	if (ret)
> > +		goto out;
> > +
> > +	dma_async_issue_pending(csi->dma.chan);
> > +
> > +	if (!wait_for_completion_timeout(&drain_complete,
> > +					 msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
> > +		dmaengine_terminate_sync(csi->dma.chan);
> > +		ret = -ETIMEDOUT;
> > +		goto out;
> > +	}
> > +out:
> > +	return ret;
> > +}
> > +
> > +static void ti_csi2rx_dma_callback(void *param)
> > +{
> > +	struct ti_csi2rx_buffer *buf = param;
> > +	struct ti_csi2rx_dev *csi = buf->csi;
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	unsigned long flags;
> > +
> > +	/*
> > +	 * TODO: Derive the sequence number from the CSI2RX frame number
> > +	 * hardware monitor registers.
> > +	 */
> > +	buf->vb.vb2_buf.timestamp = ktime_get_ns();
> > +	buf->vb.sequence = csi->sequence++;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +
> > +	WARN_ON(!list_is_first(&buf->list, &dma->submitted));
> > +	vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_DONE);
> > +	list_del(&buf->list);
> > +
> > +	/* If there are more buffers to process then start their transfer. */
> > +	while (!list_empty(&dma->queue)) {
> > +		buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
> > +
> > +		if (ti_csi2rx_start_dma(csi, buf)) {
> > +			dev_err(csi->dev, "Failed to queue the next buffer for DMA\n");
> > +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> > +		} else {
> > +			list_move_tail(&buf->list, &dma->submitted);
> > +		}
> > +	}
> > +
> > +	if (list_empty(&dma->submitted))
> > +		dma->state = TI_CSI2RX_DMA_IDLE;
> > +
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +}
> > +
> > +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> > +			       struct ti_csi2rx_buffer *buf)
> > +{
> > +	unsigned long addr;
> > +	struct dma_async_tx_descriptor *desc;
> > +	size_t len = csi->v_fmt.fmt.pix.sizeimage;
> > +	dma_cookie_t cookie;
> > +	int ret = 0;
> > +
> > +	addr = vb2_dma_contig_plane_dma_addr(&buf->vb.vb2_buf, 0);
> > +	desc = dmaengine_prep_slave_single(csi->dma.chan, addr, len,
> > +					   DMA_DEV_TO_MEM,
> > +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > +	if (!desc)
> > +		return -EIO;
> > +
> > +	desc->callback = ti_csi2rx_dma_callback;
> > +	desc->callback_param = buf;
> > +
> > +	cookie = dmaengine_submit(desc);
> > +	ret = dma_submit_error(cookie);
> > +	if (ret)
> > +		return ret;
> > +
> > +	dma_async_issue_pending(csi->dma.chan);
> > +
> > +	return 0;
> > +}
> > +
> > +static void ti_csi2rx_cleanup_buffers(struct ti_csi2rx_dev *csi,
> > +				      enum vb2_buffer_state buf_state)
> > +{
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	struct ti_csi2rx_buffer *buf, *tmp;
> > +	enum ti_csi2rx_dma_state state;
> > +	unsigned long flags;
> > +	int ret;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	state = csi->dma.state;
> > +	dma->state = TI_CSI2RX_DMA_STOPPED;
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +
> > +	if (state != TI_CSI2RX_DMA_STOPPED) {
> > +		/*
> > +		 * Normal DMA termination sometimes does not clean up pending
> > +		 * data on the endpoint.
> 
> When is "sometimes"? It's good to be more exact.
> 
> > +		 */
> > +		ret = ti_csi2rx_drain_dma(csi);
> > +		if (ret)
> > +			dev_dbg(csi->dev,
> > +				"Failed to drain DMA. Next frame might be bogus\n");
> > +	}
> > +	ret = dmaengine_terminate_sync(csi->dma.chan);
> > +	if (ret)
> > +		dev_err(csi->dev, "Failed to stop DMA: %d\n", ret);
> > +
> > +	dma_free_coherent(csi->dev, dma->drain.len,
> > +			  dma->drain.vaddr, dma->drain.paddr);
> > +	dma->drain.vaddr = NULL;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	list_for_each_entry_safe(buf, tmp, &csi->dma.queue, list) {
> > +		list_del(&buf->list);
> > +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> > +	}
> > +	list_for_each_entry_safe(buf, tmp, &csi->dma.submitted, list) {
> > +		list_del(&buf->list);
> > +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> > +	}
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +}
> > +
> > +static int ti_csi2rx_queue_setup(struct vb2_queue *q, unsigned int *nbuffers,
> > +				 unsigned int *nplanes, unsigned int sizes[],
> > +				 struct device *alloc_devs[])
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(q);
> > +	unsigned int size = csi->v_fmt.fmt.pix.sizeimage;
> > +
> > +	if (*nplanes) {
> > +		if (sizes[0] < size)
> > +			return -EINVAL;
> > +		size = sizes[0];
> > +	}
> > +
> > +	*nplanes = 1;
> > +	sizes[0] = size;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_buffer_prepare(struct vb2_buffer *vb)
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> > +	unsigned long size = csi->v_fmt.fmt.pix.sizeimage;
> > +
> > +	if (vb2_plane_size(vb, 0) < size) {
> > +		dev_err(csi->dev, "Data will not fit into plane\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	vb2_set_plane_payload(vb, 0, size);
> > +	return 0;
> > +}
> > +
> > +static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> > +	struct ti_csi2rx_buffer *buf;
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	bool restart_dma = false;
> > +	unsigned long flags = 0;
> > +	int ret;
> > +
> > +	buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
> > +	buf->csi = csi;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	/*
> > +	 * Usually the DMA callback takes care of queueing the pending buffers.
> > +	 * But if DMA has stalled due to lack of buffers, restart it now.
> > +	 */
> > +	if (dma->state == TI_CSI2RX_DMA_IDLE) {
> > +		/*
> > +		 * Do not restart DMA with the lock held because
> > +		 * ti_csi2rx_drain_dma() might block for completion.
> > +		 * There won't be a race on queueing DMA anyway since the
> > +		 * callback is not being fired.
> > +		 */
> > +		restart_dma = true;
> > +		dma->state = TI_CSI2RX_DMA_ACTIVE;
> > +	} else {
> > +		list_add_tail(&buf->list, &dma->queue);
> > +	}
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +
> > +	if (restart_dma) {
> > +		/*
> > +		 * Once frames start dropping, some data gets stuck in the DMA
> > +		 * pipeline somewhere. So the first DMA transfer after frame
> > +		 * drops gives a partial frame. This is obviously not useful to
> > +		 * the application and will only confuse it. Issue a DMA
> > +		 * transaction to drain that up.
> > +		 */
> > +		ret = ti_csi2rx_drain_dma(csi);
> > +		if (ret)
> > +			dev_warn(csi->dev,
> > +				 "Failed to drain DMA. Next frame might be bogus\n");
> > +
> > +		ret = ti_csi2rx_start_dma(csi, buf);
> > +		if (ret) {
> > +			dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
> > +			spin_lock_irqsave(&dma->lock, flags);
> > +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> > +			dma->state = TI_CSI2RX_DMA_IDLE;
> > +			spin_unlock_irqrestore(&dma->lock, flags);
> > +		} else {
> > +			spin_lock_irqsave(&dma->lock, flags);
> > +			list_add_tail(&buf->list, &dma->submitted);
> > +			spin_unlock_irqrestore(&dma->lock, flags);
> > +		}
> > +	}
> > +}
> > +
> > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	struct ti_csi2rx_buffer *buf;
> > +	unsigned long flags;
> > +	int ret = 0;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	if (list_empty(&dma->queue))
> > +		ret = -EIO;
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +	if (ret)
> > +		return ret;
> > +
> > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > +					      &dma->drain.paddr, GFP_KERNEL);
> > +	if (!dma->drain.vaddr)
> > +		return -ENOMEM;
> 
> This is still allocating a large buffer every time streaming is started (and
> with streams support, a separate buffer for each stream?).
> 
> Did you check if the TI DMA can do writes to a constant address? That would
> be the best option, as then the whole buffer allocation problem goes away.
> 

I checked with Vignesh: the hardware can flush out all the data without
an allocated buffer, but I couldn't find a way to request that through
the current dmaengine framework APIs. I will look into it further, as it
will be important for multi-stream support.

> Alternatively, can you flush the buffers with multiple one line transfers?
> The flushing shouldn't be performance critical, so even if that's slower
> than a normal full-frame DMA, it shouldn't matter much. And if that can be
> done, a single probe time line-buffer allocation should do the trick.

Queueing many DMA transactions (hundreds or even thousands) adds
considerable overhead, which might not be acceptable in the scenarios
where we have to drain mid-stream. I will have to run some experiments
to see if it is worth it.

One optimization we can certainly make is to re-use a single drain
buffer for all the streams. At stream-on time we would need to
re-allocate the buffer to fit the largest frame size across the
different streams.
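
The sizing rule for a shared drain buffer can be sketched in plain C. This is
only a user-space model under stated assumptions, not code from the series:
struct stream_fmt, shared_drain_len() and drain_needs_realloc() are
hypothetical names, and the per-stream frame size is taken to be
bytesperline * height, as sizeimage is computed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream description; not a structure from the driver. */
struct stream_fmt {
	uint32_t bytesperline;
	uint32_t height;
};

/* Size of one full frame for a stream (bytesperline * height). */
static size_t stream_sizeimage(const struct stream_fmt *fmt)
{
	return (size_t)fmt->bytesperline * fmt->height;
}

/*
 * Length a single drain buffer needs so that it can absorb a full frame
 * from any of the given streams: the largest frame size among them.
 */
static size_t shared_drain_len(const struct stream_fmt *streams, size_t count)
{
	size_t max_len = 0;
	size_t i;

	assert(streams || count == 0);

	for (i = 0; i < count; i++) {
		size_t len = stream_sizeimage(&streams[i]);

		if (len > max_len)
			max_len = len;
	}

	return max_len;
}

/* True if a buffer of cur_len bytes is too small and must be re-allocated. */
static int drain_needs_realloc(size_t cur_len, size_t needed_len)
{
	return needed_len > cur_len;
}
```

With this rule the buffer only grows when a stream with a larger frame size
is started, so repeated stream-on calls with unchanged formats would not
need a new allocation.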

My guess is that the endpoint does not buffer a full frame's worth of
data. I will also check whether we can put a feasible upper bound on
that size.

> 
> Other than this drain buffer topic, I think this looks fine. So, I'm going
> to give Rb, but I do encourage you to look more into optimizing this drain
> buffer.

Thank you!

> 
> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> 
>  Tomi
> 

-- 
Thanks,
Jai

GPG Fingerprint: 4DE0 D818 E5D5 75E8 D45A AFC5 43DE 91F9 249A 7145

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9 00/13] CSI2RX support on J721E and AM62
  2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
                   ` (12 preceding siblings ...)
  2023-08-11 10:47 ` [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E Jai Luthra
@ 2023-08-24 15:18 ` Julien Massot
  13 siblings, 0 replies; 30+ messages in thread
From: Julien Massot @ 2023-08-24 15:18 UTC (permalink / raw)
  To: Jai Luthra, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Sakari Ailus, Tomi Valkeinen
  Cc: linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Laurent Pinchart, Mauro Carvalho Chehab, Maxime Ripard,
	niklas.soderlund+renesas, Benoit Parrot, Vaishnav Achath,
	Vignesh Raghavendra, nm, devarsht, a-bhatia1, Martyn Welch

Hi Jai,

On 8/11/23 12:47, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> Hi,
> 
> This series adds support for CSI2 capture on J721E. It includes some
> fixes to the Cadence CSI2RX driver, and adds the TI CSI2RX wrapper driver.
> 
> This is the v9 of the below v8 series,
> https://lore.kernel.org/r/20230731-upstream_csi-v8-0-fb7d3661c2c9@ti.com
> 
> Testing logs: https://gist.github.com/jailuthra/eaeb3af3c65b67e1bc0d5db28180131d
> 
> J721E CSI2RX driver can also be extended to support multi-stream
> capture, filtering different CSI Virtual Channels (VC) or Data Types
> (DT) to different DMA channels. A WIP series based on v7 is available
> for reference at https://github.com/jailuthra/linux/commits/csi_multi_wip
> 
> I will rebase the multi-stream patches on the current series (v9) and
> post them as RFC in the coming weeks.
> 
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---

Thanks for your patches, I can confirm that the previous issue 
(repeating frames) that I saw on the v7 version is gone.

Tested-by: Julien Massot <julien.massot@collabora.com>

Regards,
Julien

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string
  2023-08-11 10:47 ` [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string Jai Luthra
@ 2023-08-25  3:44   ` Laurent Pinchart
  0 siblings, 0 replies; 30+ messages in thread
From: Laurent Pinchart @ 2023-08-25  3:44 UTC (permalink / raw)
  To: Jai Luthra
  Cc: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen, linux-media,
	linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, Vignesh Raghavendra, nm,
	devarsht, a-bhatia1, Martyn Welch, Julien Massot

Hi Jai,

Thank you for the patch.

On Fri, Aug 11, 2023 at 04:17:24PM +0530, Jai Luthra wrote:
> Add a SoC-specific compatible string for TI's integration of this IP in
> J7 and AM62 line of SoCs.
> 
> Reviewed-by: Maxime Ripard <mripard@kernel.org>
> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> ---
>  Documentation/devicetree/bindings/media/cdns,csi2rx.yaml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml b/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
> index 30a335b10762..2008a47c0580 100644
> --- a/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
> +++ b/Documentation/devicetree/bindings/media/cdns,csi2rx.yaml
> @@ -18,6 +18,7 @@ properties:
>      items:
>        - enum:
>            - starfive,jh7110-csi2rx
> +          - ti,j721e-csi2rx
>        - const: cdns,csi2rx
>  
>    reg:
> 

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops
  2023-08-11 10:47 ` [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops Jai Luthra
  2023-08-15 12:05   ` Tomi Valkeinen
@ 2023-08-25  3:48   ` Laurent Pinchart
  1 sibling, 0 replies; 30+ messages in thread
From: Laurent Pinchart @ 2023-08-25  3:48 UTC (permalink / raw)
  To: Jai Luthra
  Cc: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen, linux-media,
	linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, Vignesh Raghavendra, nm,
	devarsht, a-bhatia1, Martyn Welch, Julien Massot

Hi Jai,

Thank you for the patch.

On Fri, Aug 11, 2023 at 04:17:27PM +0530, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> The format is needed to calculate the link speed for the external DPHY
> configuration. It is not right to query the format from the source
> subdev. Add get_fmt and set_fmt pad operations so that the format can be
> configured and correct bpp be selected.
> 
> Initialize and use the v4l2 subdev active state to keep track of the
> active formats. Also propagate the new format from the sink pad to all
> the source pads.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Co-authored-by: Jai Luthra <j-luthra@ti.com>
> Reviewed-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> Changes from v8:
>     - Squash the patch adding RAW8 and RAW10 formats within this one
>     - Single line struct entries in formats[] array
>     - Skip specifying redundant format.which entry in init_cfg()
> 
>  drivers/media/platform/cadence/cdns-csi2rx.c | 101 ++++++++++++++++++++++++++-
>  1 file changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/cadence/cdns-csi2rx.c b/drivers/media/platform/cadence/cdns-csi2rx.c
> index 9de3240e261c..047e74ee2443 100644
> --- a/drivers/media/platform/cadence/cdns-csi2rx.c
> +++ b/drivers/media/platform/cadence/cdns-csi2rx.c
> @@ -61,6 +61,11 @@ enum csi2rx_pads {
>  	CSI2RX_PAD_MAX,
>  };
>  
> +struct csi2rx_fmt {
> +	u32				code;
> +	u8				bpp;
> +};
> +
>  struct csi2rx_priv {
>  	struct device			*dev;
>  	unsigned int			count;
> @@ -95,6 +100,32 @@ struct csi2rx_priv {
>  	int				source_pad;
>  };
>  
> +static const struct csi2rx_fmt formats[] = {
> +	{ .code	= MEDIA_BUS_FMT_YUYV8_1X16, .bpp = 16, },
> +	{ .code	= MEDIA_BUS_FMT_UYVY8_1X16, .bpp = 16, },
> +	{ .code	= MEDIA_BUS_FMT_YVYU8_1X16, .bpp = 16, },
> +	{ .code	= MEDIA_BUS_FMT_VYUY8_1X16, .bpp = 16, },
> +	{ .code	= MEDIA_BUS_FMT_SBGGR8_1X8, .bpp = 8, },
> +	{ .code	= MEDIA_BUS_FMT_SGBRG8_1X8, .bpp = 8, },
> +	{ .code	= MEDIA_BUS_FMT_SGRBG8_1X8, .bpp = 8, },
> +	{ .code	= MEDIA_BUS_FMT_SRGGB8_1X8, .bpp = 8, },
> +	{ .code	= MEDIA_BUS_FMT_SBGGR10_1X10, .bpp = 10, },
> +	{ .code	= MEDIA_BUS_FMT_SGBRG10_1X10, .bpp = 10, },
> +	{ .code	= MEDIA_BUS_FMT_SGRBG10_1X10, .bpp = 10, },
> +	{ .code	= MEDIA_BUS_FMT_SRGGB10_1X10, .bpp = 10, },
> +};
> +
> +static const struct csi2rx_fmt *csi2rx_get_fmt_by_code(u32 code)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(formats); i++)
> +		if (formats[i].code == code)
> +			return &formats[i];
> +
> +	return NULL;
> +}
> +
>  static inline
>  struct csi2rx_priv *v4l2_subdev_to_csi2rx(struct v4l2_subdev *subdev)
>  {
> @@ -303,12 +334,73 @@ static int csi2rx_s_stream(struct v4l2_subdev *subdev, int enable)
>  	return ret;
>  }
>  
> +static int csi2rx_set_fmt(struct v4l2_subdev *subdev,
> +			  struct v4l2_subdev_state *state,
> +			  struct v4l2_subdev_format *format)
> +{
> +	struct v4l2_mbus_framefmt *fmt;
> +	unsigned int i;
> +
> +	/* No transcoding, source and sink formats must match. */
> +	if (format->pad != CSI2RX_PAD_SINK)
> +		return v4l2_subdev_get_fmt(subdev, state, format);
> +
> +	if (!csi2rx_get_fmt_by_code(format->format.code))
> +		format->format.code = formats[0].code;
> +
> +	format->format.field = V4L2_FIELD_NONE;
> +
> +	/* Set sink format */
> +	fmt = v4l2_subdev_get_pad_format(subdev, state, format->pad);
> +	if (!fmt)
> +		return -EINVAL;

You can drop this check: as format->pad is CSI2RX_PAD_SINK, the call is
guaranteed to succeed.

> +
> +	*fmt = format->format;
> +
> +	/* Propagate to source formats */
> +	for (i = CSI2RX_PAD_SOURCE_STREAM0; i < CSI2RX_PAD_MAX; i++) {
> +		fmt = v4l2_subdev_get_pad_format(subdev, state, i);
> +		if (!fmt)
> +			return -EINVAL;

Same here.

With these minor issues addressed,

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> +		*fmt = format->format;
> +	}
> +
> +	return 0;
> +}
> +
> +static int csi2rx_init_cfg(struct v4l2_subdev *subdev,
> +			   struct v4l2_subdev_state *state)
> +{
> +	struct v4l2_subdev_format format = {
> +		.pad = CSI2RX_PAD_SINK,
> +		.format = {
> +			.width = 640,
> +			.height = 480,
> +			.code = MEDIA_BUS_FMT_UYVY8_1X16,
> +			.field = V4L2_FIELD_NONE,
> +			.colorspace = V4L2_COLORSPACE_SRGB,
> +			.ycbcr_enc = V4L2_YCBCR_ENC_601,
> +			.quantization = V4L2_QUANTIZATION_LIM_RANGE,
> +			.xfer_func = V4L2_XFER_FUNC_SRGB,
> +		},
> +	};
> +
> +	return csi2rx_set_fmt(subdev, state, &format);
> +}
> +
> +static const struct v4l2_subdev_pad_ops csi2rx_pad_ops = {
> +	.get_fmt	= v4l2_subdev_get_fmt,
> +	.set_fmt	= csi2rx_set_fmt,
> +	.init_cfg	= csi2rx_init_cfg,
> +};
> +
>  static const struct v4l2_subdev_video_ops csi2rx_video_ops = {
>  	.s_stream	= csi2rx_s_stream,
>  };
>  
>  static const struct v4l2_subdev_ops csi2rx_subdev_ops = {
>  	.video		= &csi2rx_video_ops,
> +	.pad		= &csi2rx_pad_ops,
>  };
>  
>  static int csi2rx_async_bound(struct v4l2_async_notifier *notifier,
> @@ -532,9 +624,13 @@ static int csi2rx_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err_cleanup;
>  
> +	ret = v4l2_subdev_init_finalize(&csi2rx->subdev);
> +	if (ret)
> +		goto err_cleanup;
> +
>  	ret = v4l2_async_register_subdev(&csi2rx->subdev);
>  	if (ret < 0)
> -		goto err_cleanup;
> +		goto err_free_state;
>  
>  	dev_info(&pdev->dev,
>  		 "Probed CSI2RX with %u/%u lanes, %u streams, %s D-PHY\n",
> @@ -544,6 +640,8 @@ static int csi2rx_probe(struct platform_device *pdev)
>  
>  	return 0;
>  
> +err_free_state:
> +	v4l2_subdev_cleanup(&csi2rx->subdev);
>  err_cleanup:
>  	v4l2_async_nf_unregister(&csi2rx->notifier);
>  	v4l2_async_nf_cleanup(&csi2rx->notifier);
> @@ -560,6 +658,7 @@ static void csi2rx_remove(struct platform_device *pdev)
>  	v4l2_async_nf_unregister(&csi2rx->notifier);
>  	v4l2_async_nf_cleanup(&csi2rx->notifier);
>  	v4l2_async_unregister_subdev(&csi2rx->subdev);
> +	v4l2_subdev_cleanup(&csi2rx->subdev);
>  	media_entity_cleanup(&csi2rx->subdev.entity);
>  	kfree(csi2rx);
>  }
> 

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-18 10:25     ` Jai Luthra
@ 2023-08-29 15:55       ` Laurent Pinchart
  2023-10-04 13:51         ` Vinod Koul
  0 siblings, 1 reply; 30+ messages in thread
From: Laurent Pinchart @ 2023-08-29 15:55 UTC (permalink / raw)
  To: Jai Luthra
  Cc: Tomi Valkeinen, Vignesh Raghavendra, Mauro Carvalho Chehab,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Sakari Ailus,
	linux-media, linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, nm, devarsht, a-bhatia1,
	Martyn Welch, Julien Massot, Vinod Koul

Hi Jai,

(CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
question below)

On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > On 11/08/2023 13:47, Jai Luthra wrote:
> > > From: Pratyush Yadav <p.yadav@ti.com>

[snip]

> > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > +{
> > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > +	struct ti_csi2rx_buffer *buf;
> > > +	unsigned long flags;
> > > +	int ret = 0;
> > > +
> > > +	spin_lock_irqsave(&dma->lock, flags);
> > > +	if (list_empty(&dma->queue))
> > > +		ret = -EIO;
> > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > +	if (!dma->drain.vaddr)
> > > +		return -ENOMEM;
> > 
> > This is still allocating a large buffer every time streaming is started (and
> > with streams support, a separate buffer for each stream?).
> > 
> > Did you check if the TI DMA can do writes to a constant address? That would
> > be the best option, as then the whole buffer allocation problem goes away.
> 
> I checked with Vignesh: the hardware can flush out all the data without
> an allocated buffer, but I couldn't find a way to request that through
> the current dmaengine framework APIs. I will look into it further, as it
> will be important for multi-stream support.

That would be the best option. It's not immediately apparent to me
whether the DMA engine API supports such a use case.
dmaengine_prep_interleaved_dma() gives you finer-grained control over
the source and destination increments, but I haven't seen a way to
instruct the DMA engine to direct writes to /dev/null (so to speak).
Vinod, is this something that is supported, or could be supported?

> > Alternatively, can you flush the buffers with multiple one line transfers?
> > The flushing shouldn't be performance critical, so even if that's slower
> > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > done, a single probe time line-buffer allocation should do the trick.
> 
> Queueing many DMA transactions (hundreds or even thousands) adds
> considerable overhead, which might not be acceptable in the scenarios
> where we have to drain mid-stream. I will have to run some experiments
> to see if it is worth it.
> 
> One optimization we can certainly make is to re-use a single drain
> buffer for all the streams. At stream-on time we would need to
> re-allocate the buffer to fit the largest frame size across the
> different streams.

If you implement .device_prep_interleaved_dma() in the DMA engine driver,
you could write to a single line buffer, assuming the hardware supports
that in a generic way.
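
To put rough numbers on the line-buffer idea, here is a small user-space
model in plain C. It is a sketch under stated assumptions, not dmaengine
code: memcpy() stands in for the DMA write, drain_chunk_count() and
drain_into_line_buffer() are hypothetical helpers, and whether the TI DMA
can express this as one interleaved transfer rather than many separate
ones is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of line-sized chunks needed to consume frame_len bytes. */
static size_t drain_chunk_count(size_t frame_len, size_t line_len)
{
	assert(line_len > 0);
	return (frame_len + line_len - 1) / line_len;
}

/*
 * Model of draining into a constant destination: every chunk of stale data
 * lands in the same line buffer, so only line_len bytes of memory are ever
 * needed. Returns the number of chunks written.
 */
static size_t drain_into_line_buffer(const uint8_t *stale, size_t frame_len,
				     uint8_t *line_buf, size_t line_len)
{
	size_t done = 0;
	size_t chunks = 0;

	assert(line_len > 0);

	while (done < frame_len) {
		size_t n = frame_len - done;

		if (n > line_len)
			n = line_len;

		/* Destination stays constant; only the source advances. */
		memcpy(line_buf, stale + done, n);
		done += n;
		chunks++;
	}

	return chunks;
}
```

For a 1280x720 frame at 2 bytes per pixel, drain_chunk_count() gives 720
chunks. That is the per-transaction overhead Jai is concerned about if each
chunk has to be queued as a separate DMA transaction.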

> My guess is that the endpoint does not buffer a full frame's worth of
> data. I will also check whether we can put a feasible upper bound on
> that size.
> 
> > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > to give Rb, but I do encourage you to look more into optimizing this drain
> > buffer.
> 
> Thank you!
> 
> > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-11 10:47 ` [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E Jai Luthra
  2023-08-15 13:00   ` Tomi Valkeinen
@ 2023-08-29 16:44   ` Laurent Pinchart
  2023-10-05  8:34     ` Jai Luthra
  1 sibling, 1 reply; 30+ messages in thread
From: Laurent Pinchart @ 2023-08-29 16:44 UTC (permalink / raw)
  To: Jai Luthra
  Cc: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen, linux-media,
	linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, Vignesh Raghavendra, nm,
	devarsht, a-bhatia1, Martyn Welch, Julien Massot

Hi Jai,

Thank you for the patch.

On Fri, Aug 11, 2023 at 04:17:35PM +0530, Jai Luthra wrote:
> From: Pratyush Yadav <p.yadav@ti.com>
> 
> TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
> capture over a CSI-2 bus.
> 
> The Cadence CSI2RX IP acts as a bridge between the TI specific parts and
> the CSI-2 protocol parts. TI then has a wrapper on top of this bridge
> called the SHIM layer. It takes in data from stream 0, repacks it, and
> sends it to memory over PSI-L DMA.
> 
> This driver acts as the "front end" to V4L2 client applications. It
> implements the required ioctls and buffer operations, passes the
> necessary calls on to the bridge, programs the SHIM layer, and performs
> DMA via the dmaengine API to finally return the data to a buffer
> supplied by the application.
> 
> Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> Co-authored-by: Vaishnav Achath <vaishnav.a@ti.com>
> Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
> Tested-by: Vaishnav Achath <vaishnav.a@ti.com>
> Co-authored-by: Jai Luthra <j-luthra@ti.com>
> Signed-off-by: Jai Luthra <j-luthra@ti.com>
> ---
> Changes since v8:
>     - Allocate drain buffer at start of stream instead of doing it in the
>       middle, and document why it is needed in comments
>     - Call subdev's get_fmt directly for link_validation()
>     - Cleanup height/width clamping and rounding code, document it in comments
>     - Return and check errors from setup_shim()
>     - s/subdev/source for cadence csi2rx's v4l2_subdev
>     - s/ti_csi2rx_init_subdev/ti_csi2rx_notifier_register
>     - Change copyright year/author list
> 
>  MAINTAINERS                                        |    7 +
>  drivers/media/platform/ti/Kconfig                  |   12 +
>  drivers/media/platform/ti/Makefile                 |    1 +
>  drivers/media/platform/ti/j721e-csi2rx/Makefile    |    2 +
>  .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c  | 1150 ++++++++++++++++++++
>  5 files changed, 1172 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 02a3192195af..959147d6d936 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -21455,6 +21455,13 @@ F:	Documentation/devicetree/bindings/media/i2c/ti,ds90*
>  F:	drivers/media/i2c/ds90*
>  F:	include/media/i2c/ds90*
>  
> +TI J721E CSI2RX DRIVER
> +M:	Jai Luthra <j-luthra@ti.com>
> +L:	linux-media@vger.kernel.org
> +S:	Maintained
> +F:	Documentation/devicetree/bindings/media/ti,j721e-csi2rx.yaml
> +F:	drivers/media/platform/ti/j721e-csi2rx/
> +
>  TI KEYSTONE MULTICORE NAVIGATOR DRIVERS
>  M:	Nishanth Menon <nm@ti.com>
>  M:	Santosh Shilimkar <ssantosh@kernel.org>
> diff --git a/drivers/media/platform/ti/Kconfig b/drivers/media/platform/ti/Kconfig
> index e1ab56c3be1f..42c908f6e1ae 100644
> --- a/drivers/media/platform/ti/Kconfig
> +++ b/drivers/media/platform/ti/Kconfig
> @@ -63,6 +63,18 @@ config VIDEO_TI_VPE_DEBUG
>  	help
>  	  Enable debug messages on VPE driver.
>  
> +config VIDEO_TI_J721E_CSI2RX
> +	tristate "TI J721E CSI2RX wrapper layer driver"
> +	depends on VIDEO_DEV && VIDEO_V4L2_SUBDEV_API
> +	depends on MEDIA_SUPPORT && MEDIA_CONTROLLER
> +	depends on PHY_CADENCE_DPHY_RX && VIDEO_CADENCE_CSI2RX

Is there a compile-time dependency on these, or just a runtime one? If
it's just at runtime, it would be nice to either drop the dependency
here, or add a (...) || COMPILE_TEST.

> +	depends on ARCH_K3 || COMPILE_TEST
> +	select VIDEOBUF2_DMA_CONTIG
> +	select V4L2_FWNODE
> +	help
> +	  Support for TI CSI2RX wrapper layer. This just enables the wrapper driver.
> +	  The Cadence CSI2RX bridge driver needs to be enabled separately.
> +
>  source "drivers/media/platform/ti/am437x/Kconfig"
>  source "drivers/media/platform/ti/davinci/Kconfig"
>  source "drivers/media/platform/ti/omap/Kconfig"
> diff --git a/drivers/media/platform/ti/Makefile b/drivers/media/platform/ti/Makefile
> index 98c5fe5c40d6..8a2f74c9380e 100644
> --- a/drivers/media/platform/ti/Makefile
> +++ b/drivers/media/platform/ti/Makefile
> @@ -3,5 +3,6 @@ obj-y += am437x/
>  obj-y += cal/
>  obj-y += vpe/
>  obj-y += davinci/
> +obj-y += j721e-csi2rx/
>  obj-y += omap/
>  obj-y += omap3isp/
> diff --git a/drivers/media/platform/ti/j721e-csi2rx/Makefile b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> new file mode 100644
> index 000000000000..377afc1d6280
> --- /dev/null
> +++ b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_VIDEO_TI_J721E_CSI2RX) += j721e-csi2rx.o
> diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> new file mode 100644
> index 000000000000..301d947f6098
> --- /dev/null
> +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> @@ -0,0 +1,1150 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * TI CSI2RX Shim Wrapper Driver
> + *
> + * Copyright (C) 2023 Texas Instruments Incorporated - https://www.ti.com/
> + *
> + * Author: Pratyush Yadav <p.yadav@ti.com>
> + * Author: Jai Luthra <j-luthra@ti.com>
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/dmaengine.h>
> +#include <linux/module.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +
> +#include <media/mipi-csi2.h>
> +#include <media/v4l2-device.h>
> +#include <media/v4l2-ioctl.h>
> +#include <media/v4l2-mc.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#define TI_CSI2RX_MODULE_NAME		"j721e-csi2rx"
> +
> +#define SHIM_CNTL			0x10
> +#define SHIM_CNTL_PIX_RST		BIT(0)
> +
> +#define SHIM_DMACNTX			0x20
> +#define SHIM_DMACNTX_EN			BIT(31)
> +#define SHIM_DMACNTX_YUV422		GENMASK(27, 26)
> +#define SHIM_DMACNTX_SIZE		GENMASK(21, 20)
> +#define SHIM_DMACNTX_FMT		GENMASK(5, 0)
> +#define SHIM_DMACNTX_UYVY		0
> +#define SHIM_DMACNTX_VYUY		1
> +#define SHIM_DMACNTX_YUYV		2
> +#define SHIM_DMACNTX_YVYU		3
> +#define SHIM_DMACNTX_SIZE_8		0
> +#define SHIM_DMACNTX_SIZE_16		1
> +#define SHIM_DMACNTX_SIZE_32		2
> +
> +#define SHIM_PSI_CFG0			0x24
> +#define SHIM_PSI_CFG0_SRC_TAG		GENMASK(15, 0)
> +#define SHIM_PSI_CFG0_DST_TAG		GENMASK(31, 16)
> +
> +#define PSIL_WORD_SIZE_BYTES		16
> +/*
> + * There are no hard limits on the width or height. The DMA engine can handle
> + * all sizes. The max width and height are arbitrary numbers for this driver.
> + * Use 16K * 16K as the arbitrary limit. It is large enough that it is unlikely
> + * the limit will be hit in practice.
> + */
> +#define MAX_WIDTH_BYTES			SZ_16K
> +#define MAX_HEIGHT_LINES		SZ_16K
> +
> +#define DRAIN_TIMEOUT_MS		50
> +
> +struct ti_csi2rx_fmt {
> +	u32				fourcc;	/* Four character code. */
> +	u32				code;	/* Mbus code. */
> +	u32				csi_dt;	/* CSI Data type. */
> +	u8				bpp;	/* Bits per pixel. */
> +	u8				size;	/* Data size shift when unpacking. */
> +};
> +
> +struct ti_csi2rx_buffer {
> +	/* Common v4l2 buffer. Must be first. */
> +	struct vb2_v4l2_buffer		vb;
> +	struct list_head		list;
> +	struct ti_csi2rx_dev		*csi;
> +};
> +
> +enum ti_csi2rx_dma_state {
> +	TI_CSI2RX_DMA_STOPPED,	/* Streaming not started yet. */
> +	TI_CSI2RX_DMA_IDLE,	/* Streaming but no pending DMA operation. */
> +	TI_CSI2RX_DMA_ACTIVE,	/* Streaming and pending DMA operation. */
> +};
> +
> +struct ti_csi2rx_dma {
> +	/* Protects all fields in this struct. */
> +	spinlock_t			lock;
> +	struct dma_chan			*chan;
> +	/* Buffers queued to the driver, waiting to be processed by DMA. */
> +	struct list_head		queue;
> +	enum ti_csi2rx_dma_state	state;
> +	/*
> +	 * Queue of buffers submitted to DMA engine.
> +	 */
> +	struct list_head		submitted;
> +	/* Buffer to drain stale data from PSI-L endpoint */
> +	struct {
> +		void			*vaddr;
> +		dma_addr_t		paddr;
> +		size_t			len;
> +	} drain;
> +};
> +
> +struct ti_csi2rx_dev {
> +	struct device			*dev;
> +	void __iomem			*shim;
> +	struct v4l2_device		v4l2_dev;
> +	struct video_device		vdev;
> +	struct media_device		mdev;
> +	struct media_pipeline		pipe;
> +	struct media_pad		pad;
> +	struct v4l2_async_notifier	notifier;
> +	struct v4l2_subdev		*source;
> +	struct vb2_queue		vidq;
> +	struct mutex			mutex; /* To serialize ioctls. */
> +	struct v4l2_format		v_fmt;
> +	struct ti_csi2rx_dma		dma;
> +	u32				sequence;
> +};
> +
> +static const struct ti_csi2rx_fmt formats[] = {

It would be nice to prefix local symbols to avoid namespace clashes,
even if they're static. ti_csi2rx_formats could be a good name. Same
below where applicable, and possibly above for some macro names.

> +	{
> +		.fourcc			= V4L2_PIX_FMT_YUYV,
> +		.code			= MEDIA_BUS_FMT_YUYV8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_UYVY,
> +		.code			= MEDIA_BUS_FMT_UYVY8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_YVYU,
> +		.code			= MEDIA_BUS_FMT_YVYU8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_VYUY,
> +		.code			= MEDIA_BUS_FMT_VYUY8_1X16,
> +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SBGGR8,
> +		.code			= MEDIA_BUS_FMT_SBGGR8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGBRG8,
> +		.code			= MEDIA_BUS_FMT_SGBRG8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGRBG8,
> +		.code			= MEDIA_BUS_FMT_SGRBG8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SRGGB8,
> +		.code			= MEDIA_BUS_FMT_SRGGB8_1X8,
> +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> +		.bpp			= 8,
> +		.size			= SHIM_DMACNTX_SIZE_8,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SBGGR10,
> +		.code			= MEDIA_BUS_FMT_SBGGR10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGBRG10,
> +		.code			= MEDIA_BUS_FMT_SGBRG10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SGRBG10,
> +		.code			= MEDIA_BUS_FMT_SGRBG10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	}, {
> +		.fourcc			= V4L2_PIX_FMT_SRGGB10,
> +		.code			= MEDIA_BUS_FMT_SRGGB10_1X10,
> +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> +		.bpp			= 16,
> +		.size			= SHIM_DMACNTX_SIZE_16,
> +	},
> +
> +	/* More formats can be supported but they are not listed for now. */
> +};
> +
> +static const unsigned int num_formats = ARRAY_SIZE(formats);

I would use ARRAY_SIZE(formats) below and drop num_formats, as I don't
think it improves readability, but I don't insist.

> +
> +/* Forward declaration needed by ti_csi2rx_dma_callback. */
> +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> +			       struct ti_csi2rx_buffer *buf);
> +
> +static const struct ti_csi2rx_fmt *find_format_by_pix(u32 pixelformat)

Maybe "_by_fourcc"? That's nitpicking though.

> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_formats; i++) {
> +		if (formats[i].fourcc == pixelformat)
> +			return &formats[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static const struct ti_csi2rx_fmt *find_format_by_code(u32 code)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_formats; i++) {
> +		if (formats[i].code == code)
> +			return &formats[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static void ti_csi2rx_fill_fmt(const struct ti_csi2rx_fmt *csi_fmt,
> +			       struct v4l2_format *v4l2_fmt)
> +{
> +	struct v4l2_pix_format *pix = &v4l2_fmt->fmt.pix;
> +	unsigned int pixels_in_word;
> +	u8 bpp = ALIGN(csi_fmt->bpp, 8);

All bpp values are multiples of 8, is ALIGN() needed?

> +
> +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> +
> +	/* Clamp width and height to sensible maximums (16K x 16K) */
> +	pix->width = clamp_t(unsigned int, pix->width,
> +			     pixels_in_word,
> +			     MAX_WIDTH_BYTES * 8 / bpp);
> +	pix->height = clamp_t(unsigned int, pix->height, 1, MAX_HEIGHT_LINES);
> +
> +	/* Width should be a multiple of transfer word-size */
> +	pix->width = rounddown(pix->width, pixels_in_word);
> +
> +	v4l2_fmt->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +	pix->pixelformat = csi_fmt->fourcc;
> +	pix->colorspace = V4L2_COLORSPACE_SRGB;

You should fill the other colorspace-related fields (ycbcr_enc,
quantization and xfer_func).

> +	pix->bytesperline = pix->width * (bpp / 8);
> +	pix->sizeimage = pix->bytesperline * pix->height;
> +}
> +
> +static int ti_csi2rx_querycap(struct file *file, void *priv,
> +			      struct v4l2_capability *cap)
> +{
> +	strscpy(cap->driver, TI_CSI2RX_MODULE_NAME, sizeof(cap->driver));
> +	strscpy(cap->card, TI_CSI2RX_MODULE_NAME, sizeof(cap->card));
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_enum_fmt_vid_cap(struct file *file, void *priv,
> +				      struct v4l2_fmtdesc *f)
> +{
> +	const struct ti_csi2rx_fmt *fmt = NULL;
> +
> +	if (f->mbus_code) {
> +		/* 1-to-1 mapping between bus formats and pixel formats */
> +		if (f->index > 0)
> +			return -EINVAL;
> +
> +		fmt = find_format_by_code(f->mbus_code);
> +	} else {
> +		if (f->index >= num_formats)
> +			return -EINVAL;
> +
> +		fmt = &formats[f->index];
> +	}
> +
> +	if (!fmt)
> +		return -EINVAL;
> +
> +	f->pixelformat = fmt->fourcc;
> +	memset(f->reserved, 0, sizeof(f->reserved));
> +	f->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_g_fmt_vid_cap(struct file *file, void *prov,
> +				   struct v4l2_format *f)
> +{
> +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> +
> +	*f = csi->v_fmt;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_try_fmt_vid_cap(struct file *file, void *priv,
> +				     struct v4l2_format *f)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +
> +	/*
> +	 * Default to the first format if the requested pixel format code isn't
> +	 * supported.
> +	 */
> +	fmt = find_format_by_pix(f->fmt.pix.pixelformat);
> +	if (!fmt)
> +		fmt = &formats[0];
> +
> +	/* Interlaced formats are not supported. */
> +	f->fmt.pix.field = V4L2_FIELD_NONE;
> +
> +	ti_csi2rx_fill_fmt(fmt, f);
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_s_fmt_vid_cap(struct file *file, void *priv,
> +				   struct v4l2_format *f)
> +{
> +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> +	struct vb2_queue *q = &csi->vidq;
> +	int ret;
> +
> +	if (vb2_is_busy(q))
> +		return -EBUSY;
> +
> +	ret = ti_csi2rx_try_fmt_vid_cap(file, priv, f);
> +	if (ret < 0)
> +		return ret;
> +
> +	csi->v_fmt = *f;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_enum_framesizes(struct file *file, void *fh,
> +				     struct v4l2_frmsizeenum *fsize)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +	unsigned int pixels_in_word;
> +	u8 bpp;
> +
> +	fmt = find_format_by_pix(fsize->pixel_format);
> +	if (!fmt || fsize->index != 0)
> +		return -EINVAL;
> +
> +	bpp = ALIGN(fmt->bpp, 8);
> +
> +	/*
> +	 * Number of pixels in one PSI-L word. The transfer happens in multiples
> +	 * of PSI-L word sizes.
> +	 */
> +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> +
> +	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
> +	fsize->stepwise.min_width = pixels_in_word;
> +	fsize->stepwise.max_width = rounddown(MAX_WIDTH_BYTES * 8 / bpp,
> +					      pixels_in_word);
> +	fsize->stepwise.step_width = pixels_in_word;
> +	fsize->stepwise.min_height = 1;
> +	fsize->stepwise.max_height = MAX_HEIGHT_LINES;
> +	fsize->stepwise.step_height = 1;
> +
> +	return 0;
> +}
> +
> +static const struct v4l2_ioctl_ops csi_ioctl_ops = {
> +	.vidioc_querycap      = ti_csi2rx_querycap,
> +	.vidioc_enum_fmt_vid_cap = ti_csi2rx_enum_fmt_vid_cap,
> +	.vidioc_try_fmt_vid_cap = ti_csi2rx_try_fmt_vid_cap,
> +	.vidioc_g_fmt_vid_cap = ti_csi2rx_g_fmt_vid_cap,
> +	.vidioc_s_fmt_vid_cap = ti_csi2rx_s_fmt_vid_cap,
> +	.vidioc_enum_framesizes = ti_csi2rx_enum_framesizes,
> +	.vidioc_reqbufs       = vb2_ioctl_reqbufs,
> +	.vidioc_create_bufs   = vb2_ioctl_create_bufs,
> +	.vidioc_prepare_buf   = vb2_ioctl_prepare_buf,
> +	.vidioc_querybuf      = vb2_ioctl_querybuf,
> +	.vidioc_qbuf          = vb2_ioctl_qbuf,
> +	.vidioc_dqbuf         = vb2_ioctl_dqbuf,
> +	.vidioc_expbuf        = vb2_ioctl_expbuf,
> +	.vidioc_streamon      = vb2_ioctl_streamon,
> +	.vidioc_streamoff     = vb2_ioctl_streamoff,
> +};
> +
> +static const struct v4l2_file_operations csi_fops = {
> +	.owner = THIS_MODULE,
> +	.open = v4l2_fh_open,
> +	.release = vb2_fop_release,
> +	.read = vb2_fop_read,
> +	.poll = vb2_fop_poll,
> +	.unlocked_ioctl = video_ioctl2,
> +	.mmap = vb2_fop_mmap,
> +};
> +
> +static int csi_async_notifier_bound(struct v4l2_async_notifier *notifier,
> +				    struct v4l2_subdev *subdev,
> +				    struct v4l2_async_connection *asc)
> +{
> +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> +
> +	csi->source = subdev;
> +
> +	return 0;
> +}
> +
> +static int csi_async_notifier_complete(struct v4l2_async_notifier *notifier)
> +{
> +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> +	struct video_device *vdev = &csi->vdev;
> +	int ret;
> +
> +	ret = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
> +	if (ret)
> +		return ret;
> +
> +	ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad,
> +					      MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
> +
> +	if (ret) {
> +		video_unregister_device(vdev);
> +		return ret;
> +	}
> +
> +	return v4l2_device_register_subdev_nodes(&csi->v4l2_dev);

You should call video_unregister_device() if this fails.

I'm tempted, however, to register the video device at probe time, not in
this function.
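
If registration stays in this function, the unwind could look roughly like the sketch below. The four helpers are hypothetical userspace stand-ins for video_register_device(), video_unregister_device(), v4l2_create_fwnode_links_to_pad() and v4l2_device_register_subdev_nodes(), so that the control flow can be exercised outside the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the V4L2 calls; return values are injectable. */
static bool vdev_registered;
static int links_ret, nodes_ret;

static int register_vdev(void) { vdev_registered = true; return 0; }
static void unregister_vdev(void) { vdev_registered = false; }
static int create_links(void) { return links_ret; }
static int register_subdev_nodes(void) { return nodes_ret; }

/* Every failure after registration unwinds through the same label. */
static int notifier_complete(void)
{
	int ret;

	ret = register_vdev();
	if (ret)
		return ret;

	ret = create_links();
	if (ret)
		goto err_unregister;

	ret = register_subdev_nodes();
	if (ret)
		goto err_unregister;

	return 0;

err_unregister:
	unregister_vdev();
	return ret;
}
```

A single error label means a later step added to the function gets the same cleanup for free.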

> +}
> +
> +static const struct v4l2_async_notifier_operations csi_async_notifier_ops = {
> +	.bound = csi_async_notifier_bound,
> +	.complete = csi_async_notifier_complete,
> +};
> +
> +static int ti_csi2rx_notifier_register(struct ti_csi2rx_dev *csi)
> +{
> +	struct fwnode_handle *fwnode;
> +	struct v4l2_async_connection *asc;
> +	struct device_node *node;
> +	int ret;
> +
> +	node = of_get_child_by_name(csi->dev->of_node, "csi-bridge");
> +	if (!node)
> +		return -EINVAL;
> +
> +	fwnode = of_fwnode_handle(node);
> +	if (!fwnode) {
> +		of_node_put(node);
> +		return -EINVAL;
> +	}
> +
> +	v4l2_async_nf_init(&csi->notifier, &csi->v4l2_dev);
> +	csi->notifier.ops = &csi_async_notifier_ops;
> +
> +	asc = v4l2_async_nf_add_fwnode(&csi->notifier, fwnode,
> +				       struct v4l2_async_connection);
> +	of_node_put(node);
> +	if (IS_ERR(asc)) {
> +		v4l2_async_nf_cleanup(&csi->notifier);
> +		return PTR_ERR(asc);
> +	}
> +
> +	ret = v4l2_async_nf_register(&csi->notifier);
> +	if (ret) {
> +		v4l2_async_nf_cleanup(&csi->notifier);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_setup_shim(struct ti_csi2rx_dev *csi)
> +{
> +	const struct ti_csi2rx_fmt *fmt;
> +	unsigned int reg;
> +
> +	fmt = find_format_by_pix(csi->v_fmt.fmt.pix.pixelformat);
> +	if (!fmt) {
> +		dev_err(csi->dev, "Pixelformat 0x%x is not supported\n",

Use %p4cc to print a fourcc. You need to pass the pixel format by
address to dev_err() then, not by value.

Can this happen though, given that the set format handler should never
allow setting a format not supported by the driver ? I think I'd drop
the error check. The function can then become a void function.
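
For reference, the readable part of what %p4cc prints is just the four bytes of the code, low byte first. The userspace sketch below shows that unpacking; the helper name is hypothetical, and it ignores the big-endian flag bit and the non-printable-character handling that the kernel formatter also takes care of.

```c
#include <stdint.h>

/*
 * Unpack a fourcc into its four characters, low byte first.
 * Simplified: no handling of the big-endian flag or of
 * non-printable bytes.
 */
static void fourcc_to_str(uint32_t fourcc, char out[5])
{
	int i;

	for (i = 0; i < 4; i++)
		out[i] = (char)((fourcc >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

In the driver itself none of this is needed: the message would take "%p4cc" with &csi->v_fmt.fmt.pix.pixelformat as the argument.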

> +			csi->v_fmt.fmt.pix.pixelformat);
> +		return -EINVAL;
> +	}
> +
> +	/* De-assert the pixel interface reset. */
> +	reg = SHIM_CNTL_PIX_RST;
> +	writel(reg, csi->shim + SHIM_CNTL);
> +
> +	reg = SHIM_DMACNTX_EN;
> +	reg |= FIELD_PREP(SHIM_DMACNTX_FMT, fmt->csi_dt);
> +
> +	/*
> +	 * Using the values from the documentation gives incorrect ordering for
> +	 * the luma and chroma components. In practice, the "reverse" format
> +	 * gives the correct image. So for example, if the image is in UYVY, the
> +	 * reverse would be YVYU.
> +	 */
> +	switch (fmt->fourcc) {
> +	case V4L2_PIX_FMT_UYVY:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_YVYU);
> +		break;
> +	case V4L2_PIX_FMT_VYUY:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_YUYV);
> +		break;
> +	case V4L2_PIX_FMT_YUYV:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_VYUY);
> +		break;
> +	case V4L2_PIX_FMT_YVYU:
> +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> +					SHIM_DMACNTX_UYVY);
> +		break;
> +	default:
> +		/* Ignore if not YUV 4:2:2 */
> +		break;
> +	}
> +
> +	reg |= FIELD_PREP(SHIM_DMACNTX_SIZE, fmt->size);
> +
> +	writel(reg, csi->shim + SHIM_DMACNTX);
> +
> +	reg = FIELD_PREP(SHIM_PSI_CFG0_SRC_TAG, 0) |
> +	      FIELD_PREP(SHIM_PSI_CFG0_DST_TAG, 0);
> +	writel(reg, csi->shim + SHIM_PSI_CFG0);
> +
> +	return 0;
> +}
> +
> +static void ti_csi2rx_drain_callback(void *param)
> +{
> +	struct completion *drain_complete = param;
> +
> +	complete(drain_complete);
> +}
> +
> +/** Drain the stale data left at the PSI-L endpoint.

This isn't kerneldoc, so use a plain comment opener:

/*
 * Drain the stale data left at the PSI-L endpoint.

> + *
> + * This might happen if no buffers are queued in time but source is still
> + * streaming. Or rarely it may happen while stopping the stream. To prevent
> + * that stale data corrupting the subsequent transactions, it is required to
> + * issue DMA requests to drain it out.
> + */
> +static int ti_csi2rx_drain_dma(struct ti_csi2rx_dev *csi)
> +{
> +	struct dma_async_tx_descriptor *desc;
> +	struct completion drain_complete;
> +	dma_cookie_t cookie;
> +	int ret;
> +
> +	init_completion(&drain_complete);
> +
> +	desc = dmaengine_prep_slave_single(csi->dma.chan, csi->dma.drain.paddr,
> +					   csi->dma.drain.len, DMA_DEV_TO_MEM,
> +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +	if (!desc) {
> +		ret = -EIO;
> +		goto out;
> +	}
> +
> +	desc->callback = ti_csi2rx_drain_callback;
> +	desc->callback_param = &drain_complete;
> +
> +	cookie = dmaengine_submit(desc);
> +	ret = dma_submit_error(cookie);
> +	if (ret)
> +		goto out;
> +
> +	dma_async_issue_pending(csi->dma.chan);
> +
> +	if (!wait_for_completion_timeout(&drain_complete,
> +					 msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
> +		dmaengine_terminate_sync(csi->dma.chan);
> +		ret = -ETIMEDOUT;
> +		goto out;
> +	}
> +out:
> +	return ret;
> +}
> +
> +static void ti_csi2rx_dma_callback(void *param)
> +{
> +	struct ti_csi2rx_buffer *buf = param;
> +	struct ti_csi2rx_dev *csi = buf->csi;
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	unsigned long flags;
> +
> +	/*
> +	 * TODO: Derive the sequence number from the CSI2RX frame number
> +	 * hardware monitor registers.
> +	 */
> +	buf->vb.vb2_buf.timestamp = ktime_get_ns();
> +	buf->vb.sequence = csi->sequence++;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +
> +	WARN_ON(!list_is_first(&buf->list, &dma->submitted));
> +	vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_DONE);
> +	list_del(&buf->list);
> +
> +	/* If there are more buffers to process then start their transfer. */
> +	while (!list_empty(&dma->queue)) {
> +		buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
> +
> +		if (ti_csi2rx_start_dma(csi, buf)) {
> +			dev_err(csi->dev, "Failed to queue the next buffer for DMA\n");
> +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +		} else {
> +			list_move_tail(&buf->list, &dma->submitted);
> +		}
> +	}
> +
> +	if (list_empty(&dma->submitted))
> +		dma->state = TI_CSI2RX_DMA_IDLE;
> +
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +}
> +
> +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> +			       struct ti_csi2rx_buffer *buf)
> +{
> +	unsigned long addr;
> +	struct dma_async_tx_descriptor *desc;
> +	size_t len = csi->v_fmt.fmt.pix.sizeimage;
> +	dma_cookie_t cookie;
> +	int ret = 0;
> +
> +	addr = vb2_dma_contig_plane_dma_addr(&buf->vb.vb2_buf, 0);
> +	desc = dmaengine_prep_slave_single(csi->dma.chan, addr, len,
> +					   DMA_DEV_TO_MEM,
> +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +	if (!desc)
> +		return -EIO;
> +
> +	desc->callback = ti_csi2rx_dma_callback;
> +	desc->callback_param = buf;
> +
> +	cookie = dmaengine_submit(desc);
> +	ret = dma_submit_error(cookie);
> +	if (ret)
> +		return ret;
> +
> +	dma_async_issue_pending(csi->dma.chan);
> +
> +	return 0;
> +}
> +
> +static void ti_csi2rx_cleanup_buffers(struct ti_csi2rx_dev *csi,
> +				      enum vb2_buffer_state buf_state)
> +{
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	struct ti_csi2rx_buffer *buf, *tmp;
> +	enum ti_csi2rx_dma_state state;
> +	unsigned long flags;
> +	int ret;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	state = csi->dma.state;
> +	dma->state = TI_CSI2RX_DMA_STOPPED;
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +
> +	if (state != TI_CSI2RX_DMA_STOPPED) {
> +		/*
> +		 * Normal DMA termination sometimes does not clean up pending
> +		 * data on the endpoint.
> +		 */
> +		ret = ti_csi2rx_drain_dma(csi);
> +		if (ret)
> +			dev_dbg(csi->dev,
> +				"Failed to drain DMA. Next frame might be bogus\n");

A dev_warn() may be more appropriate, this seems quite important.

> +	}

A blank line would be nice here.

> +	ret = dmaengine_terminate_sync(csi->dma.chan);
> +	if (ret)
> +		dev_err(csi->dev, "Failed to stop DMA: %d\n", ret);

When called from ti_csi2rx_start_streaming() there's already a
dmaengine_terminate_sync(), and there's also a call to the same function
in ti_csi2rx_drain_dma() called above. Could we avoid calling the
function multiple times ? I think stopping the DMA engine should be
moved to a separate function, as it doesn't fit with the
ti_csi2rx_cleanup_buffers() name.

> +
> +	dma_free_coherent(csi->dev, dma->drain.len,
> +			  dma->drain.vaddr, dma->drain.paddr);
> +	dma->drain.vaddr = NULL;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	list_for_each_entry_safe(buf, tmp, &csi->dma.queue, list) {
> +		list_del(&buf->list);
> +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> +	}
> +	list_for_each_entry_safe(buf, tmp, &csi->dma.submitted, list) {
> +		list_del(&buf->list);
> +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> +	}
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +}
> +
> +static int ti_csi2rx_queue_setup(struct vb2_queue *q, unsigned int *nbuffers,
> +				 unsigned int *nplanes, unsigned int sizes[],
> +				 struct device *alloc_devs[])
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(q);
> +	unsigned int size = csi->v_fmt.fmt.pix.sizeimage;
> +
> +	if (*nplanes) {
> +		if (sizes[0] < size)
> +			return -EINVAL;
> +		size = sizes[0];
> +	}
> +
> +	*nplanes = 1;
> +	sizes[0] = size;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_buffer_prepare(struct vb2_buffer *vb)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> +	unsigned long size = csi->v_fmt.fmt.pix.sizeimage;
> +
> +	if (vb2_plane_size(vb, 0) < size) {
> +		dev_err(csi->dev, "Data will not fit into plane\n");
> +		return -EINVAL;
> +	}
> +
> +	vb2_set_plane_payload(vb, 0, size);
> +	return 0;
> +}
> +
> +static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> +	struct ti_csi2rx_buffer *buf;
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	bool restart_dma = false;
> +	unsigned long flags = 0;
> +	int ret;
> +
> +	buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
> +	buf->csi = csi;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	/*
> +	 * Usually the DMA callback takes care of queueing the pending buffers.
> +	 * But if DMA has stalled due to lack of buffers, restart it now.
> +	 */
> +	if (dma->state == TI_CSI2RX_DMA_IDLE) {
> +		/*
> +		 * Do not restart DMA with the lock held because
> +		 * ti_csi2rx_drain_dma() might block for completion.
> +		 * There won't be a race on queueing DMA anyway since the
> +		 * callback is not being fired.
> +		 */
> +		restart_dma = true;
> +		dma->state = TI_CSI2RX_DMA_ACTIVE;
> +	} else {
> +		list_add_tail(&buf->list, &dma->queue);
> +	}
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +
> +	if (restart_dma) {
> +		/*
> +		 * Once frames start dropping, some data gets stuck in the DMA
> +		 * pipeline somewhere. So the first DMA transfer after frame
> +		 * drops gives a partial frame. This is obviously not useful to
> +		 * the application and will only confuse it. Issue a DMA
> +		 * transaction to drain that up.
> +		 */

Another option would be to return the frame to userspace with the error
flag set. That would give an earlier indication to applications that
something went wrong. Up to you.

> +		ret = ti_csi2rx_drain_dma(csi);
> +		if (ret)
> +			dev_warn(csi->dev,
> +				 "Failed to drain DMA. Next frame might be bogus\n");
> +
> +		ret = ti_csi2rx_start_dma(csi, buf);
> +		if (ret) {
> +			dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
> +			spin_lock_irqsave(&dma->lock, flags);
> +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +			dma->state = TI_CSI2RX_DMA_IDLE;
> +			spin_unlock_irqrestore(&dma->lock, flags);
> +		} else {
> +			spin_lock_irqsave(&dma->lock, flags);
> +			list_add_tail(&buf->list, &dma->submitted);
> +			spin_unlock_irqrestore(&dma->lock, flags);
> +		}
> +	}
> +}
> +
> +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> +	struct ti_csi2rx_dma *dma = &csi->dma;
> +	struct ti_csi2rx_buffer *buf;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	if (list_empty(&dma->queue))
> +		ret = -EIO;
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +	if (ret)
> +		return ret;
> +
> +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> +					      &dma->drain.paddr, GFP_KERNEL);
> +	if (!dma->drain.vaddr)
> +		return -ENOMEM;
> +
> +	ret = video_device_pipeline_start(&csi->vdev, &csi->pipe);
> +	if (ret)
> +		goto err;
> +
> +	ret = ti_csi2rx_setup_shim(csi);
> +	if (ret)
> +		goto err;
> +
> +	csi->sequence = 0;
> +
> +	spin_lock_irqsave(&dma->lock, flags);
> +	buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
> +
> +	ret = ti_csi2rx_start_dma(csi, buf);
> +	if (ret) {
> +		dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
> +		spin_unlock_irqrestore(&dma->lock, flags);
> +		goto err_pipeline;
> +	}
> +
> +	list_move_tail(&buf->list, &dma->submitted);
> +	dma->state = TI_CSI2RX_DMA_ACTIVE;
> +	spin_unlock_irqrestore(&dma->lock, flags);
> +
> +	ret = v4l2_subdev_call(csi->source, video, s_stream, 1);
> +	if (ret)
> +		goto err_dma;
> +
> +	return 0;
> +
> +err_dma:
> +	dmaengine_terminate_sync(csi->dma.chan);
> +	writel(0, csi->shim + SHIM_DMACNTX);
> +err_pipeline:
> +	video_device_pipeline_stop(&csi->vdev);
> +err:
> +	ti_csi2rx_cleanup_buffers(csi, VB2_BUF_STATE_QUEUED);
> +	return ret;
> +}
> +
> +static void ti_csi2rx_stop_streaming(struct vb2_queue *vq)
> +{
> +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> +	int ret;
> +
> +	video_device_pipeline_stop(&csi->vdev);
> +
> +	writel(0, csi->shim + SHIM_CNTL);
> +	writel(0, csi->shim + SHIM_DMACNTX);
> +
> +	ret = v4l2_subdev_call(csi->source, video, s_stream, 0);
> +	if (ret)
> +		dev_err(csi->dev, "Failed to stop subdev stream\n");
> +
> +	ti_csi2rx_cleanup_buffers(csi, VB2_BUF_STATE_ERROR);
> +}
> +
> +static const struct vb2_ops csi_vb2_qops = {
> +	.queue_setup = ti_csi2rx_queue_setup,
> +	.buf_prepare = ti_csi2rx_buffer_prepare,
> +	.buf_queue = ti_csi2rx_buffer_queue,
> +	.start_streaming = ti_csi2rx_start_streaming,
> +	.stop_streaming = ti_csi2rx_stop_streaming,
> +	.wait_prepare = vb2_ops_wait_prepare,
> +	.wait_finish = vb2_ops_wait_finish,
> +};
> +
> +static int ti_csi2rx_init_vb2q(struct ti_csi2rx_dev *csi)
> +{
> +	struct vb2_queue *q = &csi->vidq;
> +	int ret;
> +
> +	q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +	q->io_modes = VB2_MMAP | VB2_DMABUF;
> +	q->drv_priv = csi;
> +	q->buf_struct_size = sizeof(struct ti_csi2rx_buffer);
> +	q->ops = &csi_vb2_qops;
> +	q->mem_ops = &vb2_dma_contig_memops;
> +	q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
> +	q->dev = dmaengine_get_dma_device(csi->dma.chan);
> +	q->lock = &csi->mutex;
> +	q->min_buffers_needed = 1;
> +
> +	ret = vb2_queue_init(q);
> +	if (ret)
> +		return ret;
> +
> +	csi->vdev.queue = q;
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_link_validate(struct media_link *link)
> +{
> +	struct media_entity *entity = link->sink->entity;
> +	struct video_device *vdev = media_entity_to_video_device(entity);
> +	struct ti_csi2rx_dev *csi = container_of(vdev, struct ti_csi2rx_dev, vdev);
> +	struct v4l2_pix_format *csi_fmt = &csi->v_fmt.fmt.pix;
> +	struct v4l2_subdev_format source_fmt = {
> +		.which	= V4L2_SUBDEV_FORMAT_ACTIVE,
> +		.pad	= link->source->index,
> +	};
> +	const struct ti_csi2rx_fmt *ti_fmt;
> +	int ret;
> +
> +	ret = v4l2_subdev_call_state_active(csi->source, pad,
> +					    get_fmt, &source_fmt);
> +	if (ret)
> +		return ret;
> +
> +	if (source_fmt.format.width != csi_fmt->width) {
> +		dev_dbg(csi->dev, "Width does not match (source %u, sink %u)\n",
> +			source_fmt.format.width, csi_fmt->width);
> +		return -EPIPE;
> +	}
> +
> +	if (source_fmt.format.height != csi_fmt->height) {
> +		dev_dbg(csi->dev, "Height does not match (source %u, sink %u)\n",
> +			source_fmt.format.height, csi_fmt->height);
> +		return -EPIPE;
> +	}
> +
> +	if (source_fmt.format.field != csi_fmt->field &&
> +	    csi_fmt->field != V4L2_FIELD_NONE) {
> +		dev_dbg(csi->dev, "Field does not match (source %u, sink %u)\n",
> +			source_fmt.format.field, csi_fmt->field);
> +		return -EPIPE;
> +	}
> +
> +	ti_fmt = find_format_by_code(source_fmt.format.code);
> +	if (!ti_fmt) {
> +		dev_dbg(csi->dev, "Media bus format 0x%x not supported\n",
> +			source_fmt.format.code);
> +		return -EPIPE;
> +	}
> +
> +	if (ti_fmt->fourcc != csi_fmt->pixelformat) {
> +		dev_dbg(csi->dev,
> +			"Cannot transform source fmt 0x%x to sink fmt 0x%x\n",
> +			ti_fmt->fourcc, csi_fmt->pixelformat);
> +		return -EPIPE;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct media_entity_operations ti_csi2rx_video_entity_ops = {
> +	.link_validate = ti_csi2rx_link_validate,
> +};
> +
> +static int ti_csi2rx_init_dma(struct ti_csi2rx_dev *csi)
> +{
> +	struct dma_slave_config cfg = {
> +		.src_addr_width = DMA_SLAVE_BUSWIDTH_16_BYTES,
> +	};
> +	int ret;
> +
> +	INIT_LIST_HEAD(&csi->dma.queue);
> +	INIT_LIST_HEAD(&csi->dma.submitted);
> +	spin_lock_init(&csi->dma.lock);
> +
> +	csi->dma.state = TI_CSI2RX_DMA_STOPPED;
> +
> +	csi->dma.chan = dma_request_chan(csi->dev, "rx0");
> +	if (IS_ERR(csi->dma.chan))
> +		return PTR_ERR(csi->dma.chan);
> +
> +	ret = dmaengine_slave_config(csi->dma.chan, &cfg);
> +	if (ret) {
> +		dma_release_channel(csi->dma.chan);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ti_csi2rx_v4l2_init(struct ti_csi2rx_dev *csi)
> +{
> +	struct media_device *mdev = &csi->mdev;
> +	struct video_device *vdev = &csi->vdev;
> +	const struct ti_csi2rx_fmt *fmt;
> +	struct v4l2_pix_format *pix_fmt = &csi->v_fmt.fmt.pix;
> +	int ret;
> +
> +	fmt = find_format_by_pix(V4L2_PIX_FMT_UYVY);
> +	if (!fmt)
> +		return -EINVAL;
> +
> +	pix_fmt->width = 640;
> +	pix_fmt->height = 480;
> +	pix_fmt->field = V4L2_FIELD_NONE;
> +
> +	ti_csi2rx_fill_fmt(fmt, &csi->v_fmt);
> +
> +	mdev->dev = csi->dev;
> +	mdev->hw_revision = 1;
> +	strscpy(mdev->model, "TI-CSI2RX", sizeof(mdev->model));
> +
> +	media_device_init(mdev);
> +
> +	strscpy(vdev->name, TI_CSI2RX_MODULE_NAME, sizeof(vdev->name));
> +	vdev->v4l2_dev = &csi->v4l2_dev;
> +	vdev->vfl_dir = VFL_DIR_RX;
> +	vdev->fops = &csi_fops;
> +	vdev->ioctl_ops = &csi_ioctl_ops;
> +	vdev->release = video_device_release_empty;
> +	vdev->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING |
> +			    V4L2_CAP_IO_MC;
> +	vdev->lock = &csi->mutex;
> +	video_set_drvdata(vdev, csi);
> +
> +	csi->pad.flags = MEDIA_PAD_FL_SINK;
> +	vdev->entity.ops = &ti_csi2rx_video_entity_ops;
> +	ret = media_entity_pads_init(&csi->vdev.entity, 1, &csi->pad);
> +	if (ret)
> +		return ret;
> +
> +	csi->v4l2_dev.mdev = mdev;
> +
> +	ret = v4l2_device_register(csi->dev, &csi->v4l2_dev);
> +	if (ret)
> +		return ret;
> +
> +	ret = media_device_register(mdev);
> +	if (ret) {
> +		v4l2_device_unregister(&csi->v4l2_dev);
> +		media_device_cleanup(mdev);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static void ti_csi2rx_cleanup_dma(struct ti_csi2rx_dev *csi)
> +{
> +	dma_release_channel(csi->dma.chan);
> +}
> +
> +static void ti_csi2rx_cleanup_v4l2(struct ti_csi2rx_dev *csi)
> +{
> +	media_device_unregister(&csi->mdev);
> +	v4l2_device_unregister(&csi->v4l2_dev);
> +	media_device_cleanup(&csi->mdev);
> +}
> +
> +static void ti_csi2rx_cleanup_subdev(struct ti_csi2rx_dev *csi)
> +{
> +	v4l2_async_nf_unregister(&csi->notifier);
> +	v4l2_async_nf_cleanup(&csi->notifier);
> +}
> +
> +static void ti_csi2rx_cleanup_vb2q(struct ti_csi2rx_dev *csi)
> +{
> +	vb2_queue_release(&csi->vidq);
> +}
> +
> +static int ti_csi2rx_probe(struct platform_device *pdev)
> +{
> +	struct ti_csi2rx_dev *csi;
> +	struct resource *res;
> +	int ret;
> +
> +	csi = devm_kzalloc(&pdev->dev, sizeof(*csi), GFP_KERNEL);
> +	if (!csi)
> +		return -ENOMEM;
> +
> +	csi->dev = &pdev->dev;
> +	platform_set_drvdata(pdev, csi);
> +
> +	mutex_init(&csi->mutex);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	csi->shim = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(csi->shim)) {
> +		ret = PTR_ERR(csi->shim);
> +		goto err_mutex;
> +	}
> +
> +	ret = ti_csi2rx_init_dma(csi);
> +	if (ret)
> +		goto err_mutex;
> +
> +	ret = ti_csi2rx_v4l2_init(csi);
> +	if (ret)
> +		goto err_dma;
> +
> +	ret = ti_csi2rx_init_vb2q(csi);
> +	if (ret)
> +		goto err_v4l2;
> +
> +	ret = ti_csi2rx_notifier_register(csi);
> +	if (ret)
> +		goto err_vb2q;
> +
> +	ret = of_platform_populate(csi->dev->of_node, NULL, NULL, csi->dev);
> +	if (ret) {
> +		dev_err(csi->dev, "Failed to create children: %d\n", ret);
> +		goto err_subdev;
> +	}
> +
> +	return 0;
> +
> +err_subdev:
> +	ti_csi2rx_cleanup_subdev(csi);
> +err_vb2q:
> +	ti_csi2rx_cleanup_vb2q(csi);
> +err_v4l2:
> +	ti_csi2rx_cleanup_v4l2(csi);
> +err_dma:
> +	ti_csi2rx_cleanup_dma(csi);
> +err_mutex:
> +	mutex_destroy(&csi->mutex);
> +	return ret;
> +}
> +
> +static int ti_csi2rx_remove(struct platform_device *pdev)
> +{
> +	struct ti_csi2rx_dev *csi = platform_get_drvdata(pdev);
> +
> +	video_unregister_device(&csi->vdev);
> +
> +	ti_csi2rx_cleanup_vb2q(csi);
> +	ti_csi2rx_cleanup_subdev(csi);
> +	ti_csi2rx_cleanup_v4l2(csi);
> +	ti_csi2rx_cleanup_dma(csi);
> +
> +	mutex_destroy(&csi->mutex);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id ti_csi2rx_of_match[] = {
> +	{ .compatible = "ti,j721e-csi2rx-shim", },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(of, ti_csi2rx_of_match);
> +
> +static struct platform_driver ti_csi2rx_pdrv = {
> +	.probe = ti_csi2rx_probe,
> +	.remove = ti_csi2rx_remove,
> +	.driver = {
> +		.name = TI_CSI2RX_MODULE_NAME,
> +		.of_match_table = ti_csi2rx_of_match,
> +	},
> +};
> +
> +module_platform_driver(ti_csi2rx_pdrv);
> +
> +MODULE_DESCRIPTION("TI J721E CSI2 RX Driver");
> +MODULE_AUTHOR("Pratyush Yadav <p.yadav@ti.com>");
> +MODULE_LICENSE("GPL");

-- 
Regards,

Laurent Pinchart


* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-29 15:55       ` Laurent Pinchart
@ 2023-10-04 13:51         ` Vinod Koul
  2023-10-04 20:03           ` Laurent Pinchart
  0 siblings, 1 reply; 30+ messages in thread
From: Vinod Koul @ 2023-10-04 13:51 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Jai Luthra, Tomi Valkeinen, Vignesh Raghavendra,
	Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, linux-media, linux-kernel,
	devicetree, linux-arm-kernel, Mauro Carvalho Chehab,
	Maxime Ripard, niklas.soderlund+renesas, Benoit Parrot,
	Vaishnav Achath, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On 29-08-23, 18:55, Laurent Pinchart wrote:
> Hi Jai,
> 
> (CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
> question below)

Sorry this got lost

> 
> On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> > On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > > On 11/08/2023 13:47, Jai Luthra wrote:
> > > > From: Pratyush Yadav <p.yadav@ti.com>
> 
> [snip]
> 
> > > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > > +{
> > > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > > +	struct ti_csi2rx_buffer *buf;
> > > > +	unsigned long flags;
> > > > +	int ret = 0;
> > > > +
> > > > +	spin_lock_irqsave(&dma->lock, flags);
> > > > +	if (list_empty(&dma->queue))
> > > > +		ret = -EIO;
> > > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > > +	if (!dma->drain.vaddr)
> > > > +		return -ENOMEM;
> > > 
> > > This is still allocating a large buffer every time streaming is started (and
> > > with streams support, a separate buffer for each stream?).
> > > 
> > > Did you check if the TI DMA can do writes to a constant address? That would
> > > be the best option, as then the whole buffer allocation problem goes away.
> > 
> > I checked with Vignesh, the hardware can support a scenario where we 
> > flush out all the data without allocating a buffer, but I couldn't find 
> > a way to signal that via the current dmaengine framework APIs. Will look 
> > into it further as it will be important for multi-stream support.
> 
> That would be the best option. It's not immediately apparent to me if
> the DMA engine API supports such a use case.
> dmaengine_prep_interleaved_dma() gives you finer grain control on the
> source and destination increments, but I haven't seen a way to instruct
> the DMA engine to direct writes to /dev/null (so to speak). Vinod, is
> this something that is supported, or could be supported ?

Writing to a dummy buffer could have the same behaviour, no?
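
If the dummy buffer is smaller than a frame (a single line, say), the cost is the number of transfers needed to drain one frame's worth of data. A rough sketch of that arithmetic, with hypothetical helper names and assuming bpp is already a multiple of 8:

```c
#include <stddef.h>

/* Bytes in one line, assuming bpp is a multiple of 8. */
static size_t line_bytes(unsigned int width, unsigned int bpp)
{
	return (size_t)width * (bpp / 8);
}

/* Number of buffer-sized DMA transfers needed to drain frame_bytes. */
static size_t drain_transfers(size_t frame_bytes, size_t buf_bytes)
{
	return (frame_bytes + buf_bytes - 1) / buf_bytes;
}
```

For an assumed 1920x1080 frame at 16 bpp, a line is 3840 bytes and a full-frame drain through a one-line buffer takes 1080 transfers, against a single transfer with a frame-sized buffer.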

> 
> > > Alternatively, can you flush the buffers with multiple one line transfers?
> > > The flushing shouldn't be performance critical, so even if that's slower
> > > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > > done, a single probe time line-buffer allocation should do the trick.
> > 
> > There will be considerable overhead if we queue many DMA transactions 
> > (in the order of 1000s or even 100s), which might not be okay for the 
> > scenarios where we have to drain mid-stream. Will have to run some 
> > experiments to see if that is worth it.
> > 
> > But one optimization we can for sure do is re-use a single drain buffer 
> > for all the streams. We will need to ensure to re-allocate the buffer 
> > for the "largest" framesize supported across the different streams at 
> > stream-on time.
> 
> If you implement .device_prep_interleaved_dma() in the DMA engine driver
> you could write to a single line buffer, assuming that the hardware would
> support so in a generic way.
> 
> > My guess is the endpoint is not buffering a full-frame's worth of data, 
> > I will also check if we can upper bound that size to something feasible.
> > 
> > > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > > to give Rb, but I do encourage you to look more into optimizing this drain
> > > buffer.
> > 
> > Thank you!
> > 
> > > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
~Vinod


* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-10-04 13:51         ` Vinod Koul
@ 2023-10-04 20:03           ` Laurent Pinchart
  2023-10-05  4:10             ` Vinod Koul
  2023-10-06 10:26             ` Jai Luthra
  0 siblings, 2 replies; 30+ messages in thread
From: Laurent Pinchart @ 2023-10-04 20:03 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jai Luthra, Tomi Valkeinen, Vignesh Raghavendra,
	Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, linux-media, linux-kernel,
	devicetree, linux-arm-kernel, Mauro Carvalho Chehab,
	Maxime Ripard, niklas.soderlund+renesas, Benoit Parrot,
	Vaishnav Achath, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On Wed, Oct 04, 2023 at 07:21:00PM +0530, Vinod Koul wrote:
> On 29-08-23, 18:55, Laurent Pinchart wrote:
> > Hi Jai,
> > 
> > (CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
> > question below)
> 
> Sorry this got lost

No worries.

> > On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> > > On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > > > On 11/08/2023 13:47, Jai Luthra wrote:
> > > > > From: Pratyush Yadav <p.yadav@ti.com>
> > 
> > [snip]
> > 
> > > > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > > > +{
> > > > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > > > +	struct ti_csi2rx_buffer *buf;
> > > > > +	unsigned long flags;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	spin_lock_irqsave(&dma->lock, flags);
> > > > > +	if (list_empty(&dma->queue))
> > > > > +		ret = -EIO;
> > > > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > > > +	if (ret)
> > > > > +		return ret;
> > > > > +
> > > > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > > > +	if (!dma->drain.vaddr)
> > > > > +		return -ENOMEM;
> > > > 
> > > > This is still allocating a large buffer every time streaming is started (and
> > > > with streams support, a separate buffer for each stream?).
> > > > 
> > > > Did you check if the TI DMA can do writes to a constant address? That would
> > > > be the best option, as then the whole buffer allocation problem goes away.
> > > 
> > > I checked with Vignesh, the hardware can support a scenario where we 
> > > flush out all the data without allocating a buffer, but I couldn't find 
> > > a way to signal that via the current dmaengine framework APIs. Will look 
> > > into it further as it will be important for multi-stream support.
> > 
> > That would be the best option. It's not immediately apparent to me if
> > the DMA engine API supports such a use case.
> > dmaengine_prep_interleaved_dma() gives you finer grain control on the
> > source and destination increments, but I haven't seen a way to instruct
> > the DMA engine to direct writes to /dev/null (so to speak). Vinod, is
> > this something that is supported, or could be supported ?
> 
> Write to a dummy buffer could have the same behaviour, no?

Yes, but if the DMA engine can write to /dev/null, that avoids
allocating a dummy buffer, which is nicer. For video use cases, dummy
buffers are often large.
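
To put rough numbers on "large": a minimal userspace sketch, assuming a
hypothetical 1920x1080 frame at 16 bits per pixel (these dimensions are
not from this thread). It compares a full-frame drain buffer, sized like
sizeimage in the patch, with a single-line buffer. The helper names are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one line: width in pixels times whole bytes per pixel. */
size_t line_bytes(unsigned int width, unsigned int bpp)
{
	assert(bpp != 0 && bpp % 8 == 0);
	return (size_t)width * (bpp / 8);
}

/* Bytes in a full frame, i.e. bytesperline * height. */
size_t frame_bytes(unsigned int width, unsigned int height, unsigned int bpp)
{
	return line_bytes(width, bpp) * height;
}
```

With those assumed dimensions, frame_bytes(1920, 1080, 16) is 4147200
bytes (about 4 MiB) while line_bytes(1920, 16) is 3840 bytes, which is
the gap a "/dev/null" style transfer or a line buffer would close.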

> > > > Alternatively, can you flush the buffers with multiple one line transfers?
> > > > The flushing shouldn't be performance critical, so even if that's slower
> > > > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > > > done, a single probe time line-buffer allocation should do the trick.
> > > 
> > > There will be considerable overhead if we queue many DMA transactions 
> > > (in the order of 1000s or even 100s), which might not be okay for the 
> > > scenarios where we have to drain mid-stream. Will have to run some 
> > > experiments to see if that is worth it.
> > > 
> > > But one optimization we can for sure do is re-use a single drain buffer 
> > > for all the streams. We will need to ensure to re-allocate the buffer 
> > > for the "largest" framesize supported across the different streams at 
> > > stream-on time.
> > 
> > If you implement .device_prep_interleaved_dma() in the DMA engine driver
> > you could write to a single line buffer, assuming that the hardware would
> > support so in a generic way.
> > 
> > > My guess is the endpoint is not buffering a full-frame's worth of data, 
> > > I will also check if we can upper bound that size to something feasible.
> > > 
> > > > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > > > to give Rb, but I do encourage you to look more into optimizing this drain
> > > > buffer.
> > > 
> > > Thank you!
> > > 
> > > > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>

-- 
Regards,

Laurent Pinchart

* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-10-04 20:03           ` Laurent Pinchart
@ 2023-10-05  4:10             ` Vinod Koul
  2023-10-06 10:26             ` Jai Luthra
  1 sibling, 0 replies; 30+ messages in thread
From: Vinod Koul @ 2023-10-05  4:10 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Jai Luthra, Tomi Valkeinen, Vignesh Raghavendra,
	Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, linux-media, linux-kernel,
	devicetree, linux-arm-kernel, Mauro Carvalho Chehab,
	Maxime Ripard, niklas.soderlund+renesas, Benoit Parrot,
	Vaishnav Achath, nm, devarsht, a-bhatia1, Martyn Welch,
	Julien Massot

On 04-10-23, 23:03, Laurent Pinchart wrote:
> On Wed, Oct 04, 2023 at 07:21:00PM +0530, Vinod Koul wrote:
> > On 29-08-23, 18:55, Laurent Pinchart wrote:
> > > Hi Jai,
> > > 
> > > (CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
> > > question below)
> > 
> > Sorry this got lost
> 
> No worries.
> 
> > > On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> > > > On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > > > > On 11/08/2023 13:47, Jai Luthra wrote:
> > > > > > From: Pratyush Yadav <p.yadav@ti.com>
> > > 
> > > [snip]
> > > 
> > > > > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > > > > +{
> > > > > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > > > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > > > > +	struct ti_csi2rx_buffer *buf;
> > > > > > +	unsigned long flags;
> > > > > > +	int ret = 0;
> > > > > > +
> > > > > > +	spin_lock_irqsave(&dma->lock, flags);
> > > > > > +	if (list_empty(&dma->queue))
> > > > > > +		ret = -EIO;
> > > > > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > > > > +	if (ret)
> > > > > > +		return ret;
> > > > > > +
> > > > > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > > > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > > > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > > > > +	if (!dma->drain.vaddr)
> > > > > > +		return -ENOMEM;
> > > > > 
> > > > > This is still allocating a large buffer every time streaming is started (and
> > > > > with streams support, a separate buffer for each stream?).
> > > > > 
> > > > > Did you check if the TI DMA can do writes to a constant address? That would
> > > > > be the best option, as then the whole buffer allocation problem goes away.
> > > > 
> > > > I checked with Vignesh, the hardware can support a scenario where we 
> > > > flush out all the data without allocating a buffer, but I couldn't find 
> > > > a way to signal that via the current dmaengine framework APIs. Will look 
> > > > into it further as it will be important for multi-stream support.
> > > 
> > > That would be the best option. It's not immediately apparent to me if
> > > the DMA engine API supports such a use case.
> > > dmaengine_prep_interleaved_dma() gives you finer grain control on the
> > > source and destination increments, but I haven't seen a way to instruct
> > > the DMA engine to direct writes to /dev/null (so to speak). Vinod, is
> > > this something that is supported, or could be supported ?
> > 
> > Write to a dummy buffer could have the same behaviour, no?
> 
> Yes, but if the DMA engine can write to /dev/null, that avoids
> allocating a dummy buffer, which is nicer. For video use cases, dummy
> buffers are often large.

Hmm, maybe I haven't fully understood this. Would you mind explaining in
detail what such an interleaved transfer would look like, so that we can
model it or change the APIs to support it?

-- 
~Vinod

* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-08-29 16:44   ` Laurent Pinchart
@ 2023-10-05  8:34     ` Jai Luthra
  0 siblings, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-10-05  8:34 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Sakari Ailus, Tomi Valkeinen, linux-media,
	linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, Vignesh Raghavendra, nm,
	devarsht, a-bhatia1, Martyn Welch, Julien Massot

Hi Laurent,

Thanks for the review, and apologies for the late reply.

On Aug 29, 2023 at 19:44:57 +0300, Laurent Pinchart wrote:
> Hi Jai,
> 
> Thank you for the patch.
> 
> On Fri, Aug 11, 2023 at 04:17:35PM +0530, Jai Luthra wrote:
> > From: Pratyush Yadav <p.yadav@ti.com>
> > 
> > TI's J721E uses the Cadence CSI2RX and DPHY peripherals to facilitate
> > capture over a CSI-2 bus.
> > 
> > The Cadence CSI2RX IP acts as a bridge between the TI specific parts and
> > the CSI-2 protocol parts. TI then has a wrapper on top of this bridge
> > called the SHIM layer. It takes in data from stream 0, repacks it, and
> > sends it to memory over PSI-L DMA.
> > 
> > This driver acts as the "front end" to V4L2 client applications. It
> > implements the required ioctls and buffer operations, passes the
> > necessary calls on to the bridge, programs the SHIM layer, and performs
> > DMA via the dmaengine API to finally return the data to a buffer
> > supplied by the application.
> > 
> > Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
> > Co-authored-by: Vaishnav Achath <vaishnav.a@ti.com>
> > Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
> > Tested-by: Vaishnav Achath <vaishnav.a@ti.com>
> > Co-authored-by: Jai Luthra <j-luthra@ti.com>
> > Signed-off-by: Jai Luthra <j-luthra@ti.com>
> > ---
> > Changes since v8:
> >     - Allocate drain buffer at start of stream instead of doing it in the
> >       middle, and document why it is needed in comments
> >     - Call subdev's get_fmt directly for link_validation()
> >     - Cleanup height/width clamping and rounding code, document it in comments
> >     - Return and check errors from setup_shim()
> >     - s/subdev/source for cadence csi2rx's v4l2_subdev
> >     - s/ti_csi2rx_init_subdev/ti_csi2rx_notifier_register
> >     - Change copyright year/author list
> > 
> >  MAINTAINERS                                        |    7 +
> >  drivers/media/platform/ti/Kconfig                  |   12 +
> >  drivers/media/platform/ti/Makefile                 |    1 +
> >  drivers/media/platform/ti/j721e-csi2rx/Makefile    |    2 +
> >  .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c  | 1150 ++++++++++++++++++++
> >  5 files changed, 1172 insertions(+)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 02a3192195af..959147d6d936 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -21455,6 +21455,13 @@ F:	Documentation/devicetree/bindings/media/i2c/ti,ds90*
> >  F:	drivers/media/i2c/ds90*
> >  F:	include/media/i2c/ds90*
> >  
> > +TI J721E CSI2RX DRIVER
> > +M:	Jai Luthra <j-luthra@ti.com>
> > +L:	linux-media@vger.kernel.org
> > +S:	Maintained
> > +F:	Documentation/devicetree/bindings/media/ti,j721e-csi2rx.yaml
> > +F:	drivers/media/platform/ti/j721e-csi2rx/
> > +
> >  TI KEYSTONE MULTICORE NAVIGATOR DRIVERS
> >  M:	Nishanth Menon <nm@ti.com>
> >  M:	Santosh Shilimkar <ssantosh@kernel.org>
> > diff --git a/drivers/media/platform/ti/Kconfig b/drivers/media/platform/ti/Kconfig
> > index e1ab56c3be1f..42c908f6e1ae 100644
> > --- a/drivers/media/platform/ti/Kconfig
> > +++ b/drivers/media/platform/ti/Kconfig
> > @@ -63,6 +63,18 @@ config VIDEO_TI_VPE_DEBUG
> >  	help
> >  	  Enable debug messages on VPE driver.
> >  
> > +config VIDEO_TI_J721E_CSI2RX
> > +	tristate "TI J721E CSI2RX wrapper layer driver"
> > +	depends on VIDEO_DEV && VIDEO_V4L2_SUBDEV_API
> > +	depends on MEDIA_SUPPORT && MEDIA_CONTROLLER
> > +	depends on PHY_CADENCE_DPHY_RX && VIDEO_CADENCE_CSI2RX
> 
> Is there a compile-time dependency on these, or just runtime ? If it's
> just at runtime, it would be nice to either drop the dependency here, or
> add a (...) || COMPILE_TEST
> 

There isn't a compile-time dependency as such, but yes this IP only 
works with Cadence CSI+DPHY so I will switch it to (...) || 
COMPILE_TEST.

> > +	depends on ARCH_K3 || COMPILE_TEST
> > +	select VIDEOBUF2_DMA_CONTIG
> > +	select V4L2_FWNODE
> > +	help
> > +	  Support for TI CSI2RX wrapper layer. This just enables the wrapper driver.
> > +	  The Cadence CSI2RX bridge driver needs to be enabled separately.
> > +
> >  source "drivers/media/platform/ti/am437x/Kconfig"
> >  source "drivers/media/platform/ti/davinci/Kconfig"
> >  source "drivers/media/platform/ti/omap/Kconfig"
> > diff --git a/drivers/media/platform/ti/Makefile b/drivers/media/platform/ti/Makefile
> > index 98c5fe5c40d6..8a2f74c9380e 100644
> > --- a/drivers/media/platform/ti/Makefile
> > +++ b/drivers/media/platform/ti/Makefile
> > @@ -3,5 +3,6 @@ obj-y += am437x/
> >  obj-y += cal/
> >  obj-y += vpe/
> >  obj-y += davinci/
> > +obj-y += j721e-csi2rx/
> >  obj-y += omap/
> >  obj-y += omap3isp/
> > diff --git a/drivers/media/platform/ti/j721e-csi2rx/Makefile b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> > new file mode 100644
> > index 000000000000..377afc1d6280
> > --- /dev/null
> > +++ b/drivers/media/platform/ti/j721e-csi2rx/Makefile
> > @@ -0,0 +1,2 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +obj-$(CONFIG_VIDEO_TI_J721E_CSI2RX) += j721e-csi2rx.o
> > diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> > new file mode 100644
> > index 000000000000..301d947f6098
> > --- /dev/null
> > +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> > @@ -0,0 +1,1150 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * TI CSI2RX Shim Wrapper Driver
> > + *
> > + * Copyright (C) 2023 Texas Instruments Incorporated - https://www.ti.com/
> > + *
> > + * Author: Pratyush Yadav <p.yadav@ti.com>
> > + * Author: Jai Luthra <j-luthra@ti.com>
> > + */
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/dmaengine.h>
> > +#include <linux/module.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/platform_device.h>
> > +
> > +#include <media/mipi-csi2.h>
> > +#include <media/v4l2-device.h>
> > +#include <media/v4l2-ioctl.h>
> > +#include <media/v4l2-mc.h>
> > +#include <media/videobuf2-dma-contig.h>
> > +
> > +#define TI_CSI2RX_MODULE_NAME		"j721e-csi2rx"
> > +
> > +#define SHIM_CNTL			0x10
> > +#define SHIM_CNTL_PIX_RST		BIT(0)
> > +
> > +#define SHIM_DMACNTX			0x20
> > +#define SHIM_DMACNTX_EN			BIT(31)
> > +#define SHIM_DMACNTX_YUV422		GENMASK(27, 26)
> > +#define SHIM_DMACNTX_SIZE		GENMASK(21, 20)
> > +#define SHIM_DMACNTX_FMT		GENMASK(5, 0)
> > +#define SHIM_DMACNTX_UYVY		0
> > +#define SHIM_DMACNTX_VYUY		1
> > +#define SHIM_DMACNTX_YUYV		2
> > +#define SHIM_DMACNTX_YVYU		3
> > +#define SHIM_DMACNTX_SIZE_8		0
> > +#define SHIM_DMACNTX_SIZE_16		1
> > +#define SHIM_DMACNTX_SIZE_32		2
> > +
> > +#define SHIM_PSI_CFG0			0x24
> > +#define SHIM_PSI_CFG0_SRC_TAG		GENMASK(15, 0)
> > +#define SHIM_PSI_CFG0_DST_TAG		GENMASK(31, 16)
> > +
> > +#define PSIL_WORD_SIZE_BYTES		16
> > +/*
> > + * There are no hard limits on the width or height. The DMA engine can handle
> > + * all sizes. The max width and height are arbitrary numbers for this driver.
> > + * Use 16K * 16K as the arbitrary limit. It is large enough that it is unlikely
> > + * the limit will be hit in practice.
> > + */
> > +#define MAX_WIDTH_BYTES			SZ_16K
> > +#define MAX_HEIGHT_LINES		SZ_16K
> > +
> > +#define DRAIN_TIMEOUT_MS		50
> > +
> > +struct ti_csi2rx_fmt {
> > +	u32				fourcc;	/* Four character code. */
> > +	u32				code;	/* Mbus code. */
> > +	u32				csi_dt;	/* CSI Data type. */
> > +	u8				bpp;	/* Bits per pixel. */
> > +	u8				size;	/* Data size shift when unpacking. */
> > +};
> > +
> > +struct ti_csi2rx_buffer {
> > +	/* Common v4l2 buffer. Must be first. */
> > +	struct vb2_v4l2_buffer		vb;
> > +	struct list_head		list;
> > +	struct ti_csi2rx_dev		*csi;
> > +};
> > +
> > +enum ti_csi2rx_dma_state {
> > +	TI_CSI2RX_DMA_STOPPED,	/* Streaming not started yet. */
> > +	TI_CSI2RX_DMA_IDLE,	/* Streaming but no pending DMA operation. */
> > +	TI_CSI2RX_DMA_ACTIVE,	/* Streaming and pending DMA operation. */
> > +};
> > +
> > +struct ti_csi2rx_dma {
> > +	/* Protects all fields in this struct. */
> > +	spinlock_t			lock;
> > +	struct dma_chan			*chan;
> > +	/* Buffers queued to the driver, waiting to be processed by DMA. */
> > +	struct list_head		queue;
> > +	enum ti_csi2rx_dma_state	state;
> > +	/*
> > +	 * Queue of buffers submitted to DMA engine.
> > +	 */
> > +	struct list_head		submitted;
> > +	/* Buffer to drain stale data from PSI-L endpoint */
> > +	struct {
> > +		void			*vaddr;
> > +		dma_addr_t		paddr;
> > +		size_t			len;
> > +	} drain;
> > +};
> > +
> > +struct ti_csi2rx_dev {
> > +	struct device			*dev;
> > +	void __iomem			*shim;
> > +	struct v4l2_device		v4l2_dev;
> > +	struct video_device		vdev;
> > +	struct media_device		mdev;
> > +	struct media_pipeline		pipe;
> > +	struct media_pad		pad;
> > +	struct v4l2_async_notifier	notifier;
> > +	struct v4l2_subdev		*source;
> > +	struct vb2_queue		vidq;
> > +	struct mutex			mutex; /* To serialize ioctls. */
> > +	struct v4l2_format		v_fmt;
> > +	struct ti_csi2rx_dma		dma;
> > +	u32				sequence;
> > +};
> > +
> > +static const struct ti_csi2rx_fmt formats[] = {
> 
> It would be nice to prefix local symbols to avoid namespace clashes,
> even if they're static. ti_csi2rx_formats could be a good name. Same
> below where applicable, and possibly above for some macro names.
> 

Will fix.

> > +	{
> > +		.fourcc			= V4L2_PIX_FMT_YUYV,
> > +		.code			= MEDIA_BUS_FMT_YUYV8_1X16,
> > +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_UYVY,
> > +		.code			= MEDIA_BUS_FMT_UYVY8_1X16,
> > +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_YVYU,
> > +		.code			= MEDIA_BUS_FMT_YVYU8_1X16,
> > +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_VYUY,
> > +		.code			= MEDIA_BUS_FMT_VYUY8_1X16,
> > +		.csi_dt			= MIPI_CSI2_DT_YUV422_8B,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SBGGR8,
> > +		.code			= MEDIA_BUS_FMT_SBGGR8_1X8,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> > +		.bpp			= 8,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SGBRG8,
> > +		.code			= MEDIA_BUS_FMT_SGBRG8_1X8,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> > +		.bpp			= 8,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SGRBG8,
> > +		.code			= MEDIA_BUS_FMT_SGRBG8_1X8,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> > +		.bpp			= 8,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SRGGB8,
> > +		.code			= MEDIA_BUS_FMT_SRGGB8_1X8,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW8,
> > +		.bpp			= 8,
> > +		.size			= SHIM_DMACNTX_SIZE_8,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SBGGR10,
> > +		.code			= MEDIA_BUS_FMT_SBGGR10_1X10,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_16,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SGBRG10,
> > +		.code			= MEDIA_BUS_FMT_SGBRG10_1X10,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_16,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SGRBG10,
> > +		.code			= MEDIA_BUS_FMT_SGRBG10_1X10,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_16,
> > +	}, {
> > +		.fourcc			= V4L2_PIX_FMT_SRGGB10,
> > +		.code			= MEDIA_BUS_FMT_SRGGB10_1X10,
> > +		.csi_dt			= MIPI_CSI2_DT_RAW10,
> > +		.bpp			= 16,
> > +		.size			= SHIM_DMACNTX_SIZE_16,
> > +	},
> > +
> > +	/* More formats can be supported but they are not listed for now. */
> > +};
> > +
> > +static const unsigned int num_formats = ARRAY_SIZE(formats);
> 
> I would use ARRAY_SIZE(formats) below and drop num_formats, as I don't
> think it improves readability, but I don't insist.
> 

Will fix.

> > +
> > +/* Forward declaration needed by ti_csi2rx_dma_callback. */
> > +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> > +			       struct ti_csi2rx_buffer *buf);
> > +
> > +static const struct ti_csi2rx_fmt *find_format_by_pix(u32 pixelformat)
> 
> Maybe "_by_fourcc" ? That's nitpicking though.
> 

Will fix.

> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < num_formats; i++) {
> > +		if (formats[i].fourcc == pixelformat)
> > +			return &formats[i];
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static const struct ti_csi2rx_fmt *find_format_by_code(u32 code)
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < num_formats; i++) {
> > +		if (formats[i].code == code)
> > +			return &formats[i];
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static void ti_csi2rx_fill_fmt(const struct ti_csi2rx_fmt *csi_fmt,
> > +			       struct v4l2_format *v4l2_fmt)
> > +{
> > +	struct v4l2_pix_format *pix = &v4l2_fmt->fmt.pix;
> > +	unsigned int pixels_in_word;
> > +	u8 bpp = ALIGN(csi_fmt->bpp, 8);
> 
> All bpp values are multiple of 8, is ALIGN() needed ?
> 

Hmm, you're right, it's not really needed. I think I kept it from
Pratyush's series just to be safe. Will remove it in v10.
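
A minimal userspace sketch of the arithmetic in question, assuming only
the constants quoted in the patch (16-byte PSI-L word, 16K max width in
bytes). align_up() stands in for the kernel's ALIGN(); the function
names are made up for illustration and are not part of the driver.

```c
#include <assert.h>

#define PSIL_WORD_SIZE_BYTES	16u
#define MAX_WIDTH_BYTES		(16u * 1024u)

/* Round x up to a multiple of a (a power of two), like the kernel's ALIGN(). */
unsigned int align_up(unsigned int x, unsigned int a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Number of pixels that fit in one PSI-L word. */
unsigned int pixels_in_word(unsigned int bpp)
{
	return PSIL_WORD_SIZE_BYTES * 8 / bpp;
}

/* Clamp the width to the supported range, then round down to whole words. */
unsigned int adjust_width(unsigned int width, unsigned int bpp)
{
	unsigned int step = pixels_in_word(bpp);
	unsigned int max = MAX_WIDTH_BYTES * 8 / bpp;

	if (width < step)
		width = step;
	if (width > max)
		width = max;
	return width - (width % step);
}
```

For the bpp values in the format table (8 and 16), align_up(bpp, 8)
returns bpp unchanged, which is why dropping ALIGN() changes nothing.
As an example of the rounding, adjust_width(1923, 16) gives 1920.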

> > +
> > +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> > +
> > +	/* Clamp width and height to sensible maximums (16K x 16K) */
> > +	pix->width = clamp_t(unsigned int, pix->width,
> > +			     pixels_in_word,
> > +			     MAX_WIDTH_BYTES * 8 / bpp);
> > +	pix->height = clamp_t(unsigned int, pix->height, 1, MAX_HEIGHT_LINES);
> > +
> > +	/* Width should be a multiple of transfer word-size */
> > +	pix->width = rounddown(pix->width, pixels_in_word);
> > +
> > +	v4l2_fmt->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> > +	pix->pixelformat = csi_fmt->fourcc;
> > +	pix->colorspace = V4L2_COLORSPACE_SRGB;
> 
> You should fill the other colorspace-related fields.
> 

Good catch. I think it would be better to use the colorspace and related
fields from the mbus format of the source subdev, as this IP doesn't
change them. Will fix in v10.

> > +	pix->bytesperline = pix->width * (bpp / 8);
> > +	pix->sizeimage = pix->bytesperline * pix->height;
> > +}
> > +
> > +static int ti_csi2rx_querycap(struct file *file, void *priv,
> > +			      struct v4l2_capability *cap)
> > +{
> > +	strscpy(cap->driver, TI_CSI2RX_MODULE_NAME, sizeof(cap->driver));
> > +	strscpy(cap->card, TI_CSI2RX_MODULE_NAME, sizeof(cap->card));
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_enum_fmt_vid_cap(struct file *file, void *priv,
> > +				      struct v4l2_fmtdesc *f)
> > +{
> > +	const struct ti_csi2rx_fmt *fmt = NULL;
> > +
> > +	if (f->mbus_code) {
> > +		/* 1-to-1 mapping between bus formats and pixel formats */
> > +		if (f->index > 0)
> > +			return -EINVAL;
> > +
> > +		fmt = find_format_by_code(f->mbus_code);
> > +	} else {
> > +		if (f->index >= num_formats)
> > +			return -EINVAL;
> > +
> > +		fmt = &formats[f->index];
> > +	}
> > +
> > +	if (!fmt)
> > +		return -EINVAL;
> > +
> > +	f->pixelformat = fmt->fourcc;
> > +	memset(f->reserved, 0, sizeof(f->reserved));
> > +	f->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_g_fmt_vid_cap(struct file *file, void *prov,
> > +				   struct v4l2_format *f)
> > +{
> > +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> > +
> > +	*f = csi->v_fmt;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_try_fmt_vid_cap(struct file *file, void *priv,
> > +				     struct v4l2_format *f)
> > +{
> > +	const struct ti_csi2rx_fmt *fmt;
> > +
> > +	/*
> > +	 * Default to the first format if the requested pixel format code isn't
> > +	 * supported.
> > +	 */
> > +	fmt = find_format_by_pix(f->fmt.pix.pixelformat);
> > +	if (!fmt)
> > +		fmt = &formats[0];
> > +
> > +	/* Interlaced formats are not supported. */
> > +	f->fmt.pix.field = V4L2_FIELD_NONE;
> > +
> > +	ti_csi2rx_fill_fmt(fmt, f);
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_s_fmt_vid_cap(struct file *file, void *priv,
> > +				   struct v4l2_format *f)
> > +{
> > +	struct ti_csi2rx_dev *csi = video_drvdata(file);
> > +	struct vb2_queue *q = &csi->vidq;
> > +	int ret;
> > +
> > +	if (vb2_is_busy(q))
> > +		return -EBUSY;
> > +
> > +	ret = ti_csi2rx_try_fmt_vid_cap(file, priv, f);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	csi->v_fmt = *f;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_enum_framesizes(struct file *file, void *fh,
> > +				     struct v4l2_frmsizeenum *fsize)
> > +{
> > +	const struct ti_csi2rx_fmt *fmt;
> > +	unsigned int pixels_in_word;
> > +	u8 bpp;
> > +
> > +	fmt = find_format_by_pix(fsize->pixel_format);
> > +	if (!fmt || fsize->index != 0)
> > +		return -EINVAL;
> > +
> > +	bpp = ALIGN(fmt->bpp, 8);
> > +
> > +	/*
> > +	 * Number of pixels in one PSI-L word. The transfer happens in multiples
> > +	 * of PSI-L word sizes.
> > +	 */
> > +	pixels_in_word = PSIL_WORD_SIZE_BYTES * 8 / bpp;
> > +
> > +	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
> > +	fsize->stepwise.min_width = pixels_in_word;
> > +	fsize->stepwise.max_width = rounddown(MAX_WIDTH_BYTES * 8 / bpp,
> > +					      pixels_in_word);
> > +	fsize->stepwise.step_width = pixels_in_word;
> > +	fsize->stepwise.min_height = 1;
> > +	fsize->stepwise.max_height = MAX_HEIGHT_LINES;
> > +	fsize->stepwise.step_height = 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct v4l2_ioctl_ops csi_ioctl_ops = {
> > +	.vidioc_querycap      = ti_csi2rx_querycap,
> > +	.vidioc_enum_fmt_vid_cap = ti_csi2rx_enum_fmt_vid_cap,
> > +	.vidioc_try_fmt_vid_cap = ti_csi2rx_try_fmt_vid_cap,
> > +	.vidioc_g_fmt_vid_cap = ti_csi2rx_g_fmt_vid_cap,
> > +	.vidioc_s_fmt_vid_cap = ti_csi2rx_s_fmt_vid_cap,
> > +	.vidioc_enum_framesizes = ti_csi2rx_enum_framesizes,
> > +	.vidioc_reqbufs       = vb2_ioctl_reqbufs,
> > +	.vidioc_create_bufs   = vb2_ioctl_create_bufs,
> > +	.vidioc_prepare_buf   = vb2_ioctl_prepare_buf,
> > +	.vidioc_querybuf      = vb2_ioctl_querybuf,
> > +	.vidioc_qbuf          = vb2_ioctl_qbuf,
> > +	.vidioc_dqbuf         = vb2_ioctl_dqbuf,
> > +	.vidioc_expbuf        = vb2_ioctl_expbuf,
> > +	.vidioc_streamon      = vb2_ioctl_streamon,
> > +	.vidioc_streamoff     = vb2_ioctl_streamoff,
> > +};
> > +
> > +static const struct v4l2_file_operations csi_fops = {
> > +	.owner = THIS_MODULE,
> > +	.open = v4l2_fh_open,
> > +	.release = vb2_fop_release,
> > +	.read = vb2_fop_read,
> > +	.poll = vb2_fop_poll,
> > +	.unlocked_ioctl = video_ioctl2,
> > +	.mmap = vb2_fop_mmap,
> > +};
> > +
> > +static int csi_async_notifier_bound(struct v4l2_async_notifier *notifier,
> > +				    struct v4l2_subdev *subdev,
> > +				    struct v4l2_async_connection *asc)
> > +{
> > +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> > +
> > +	csi->source = subdev;
> > +
> > +	return 0;
> > +}
> > +
> > +static int csi_async_notifier_complete(struct v4l2_async_notifier *notifier)
> > +{
> > +	struct ti_csi2rx_dev *csi = dev_get_drvdata(notifier->v4l2_dev->dev);
> > +	struct video_device *vdev = &csi->vdev;
> > +	int ret;
> > +
> > +	ret = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad,
> > +					      MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
> > +
> > +	if (ret) {
> > +		video_unregister_device(vdev);
> > +		return ret;
> > +	}
> > +
> > +	return v4l2_device_register_subdev_nodes(&csi->v4l2_dev);
> 
> You should call video_unregister_device() if this fails.

Will fix.

> 
> I'm tempted, however, to register the video device at probe time, not in
> this function.
>

My guess is this was done here to prevent creating a video node if the
sensor (source subdev) failed to probe. With multistream support we will
have multiple video nodes (up to 16), where it seems cleaner to wait for
the source. Let me know if you had some other reason in mind.

> > +}
> > +
> > +static const struct v4l2_async_notifier_operations csi_async_notifier_ops = {
> > +	.bound = csi_async_notifier_bound,
> > +	.complete = csi_async_notifier_complete,
> > +};
> > +
> > +static int ti_csi2rx_notifier_register(struct ti_csi2rx_dev *csi)
> > +{
> > +	struct fwnode_handle *fwnode;
> > +	struct v4l2_async_connection *asc;
> > +	struct device_node *node;
> > +	int ret;
> > +
> > +	node = of_get_child_by_name(csi->dev->of_node, "csi-bridge");
> > +	if (!node)
> > +		return -EINVAL;
> > +
> > +	fwnode = of_fwnode_handle(node);
> > +	if (!fwnode) {
> > +		of_node_put(node);
> > +		return -EINVAL;
> > +	}
> > +
> > +	v4l2_async_nf_init(&csi->notifier, &csi->v4l2_dev);
> > +	csi->notifier.ops = &csi_async_notifier_ops;
> > +
> > +	asc = v4l2_async_nf_add_fwnode(&csi->notifier, fwnode,
> > +				       struct v4l2_async_connection);
> > +	of_node_put(node);
> > +	if (IS_ERR(asc)) {
> > +		v4l2_async_nf_cleanup(&csi->notifier);
> > +		return PTR_ERR(asc);
> > +	}
> > +
> > +	ret = v4l2_async_nf_register(&csi->notifier);
> > +	if (ret) {
> > +		v4l2_async_nf_cleanup(&csi->notifier);
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_setup_shim(struct ti_csi2rx_dev *csi)
> > +{
> > +	const struct ti_csi2rx_fmt *fmt;
> > +	unsigned int reg;
> > +
> > +	fmt = find_format_by_pix(csi->v_fmt.fmt.pix.pixelformat);
> > +	if (!fmt) {
> > +		dev_err(csi->dev, "Pixelformat 0x%x is not supported\n",
> 
> Use %p4cc to print a fourcc. You need to pass the pixel format by
> address to dev_err() then, not by value.
> 
> Can this happen though, given that the set format handler should never
> allow setting a format not supported by the driver ? I think I'd drop
> the error check. The function can then become a void function.
> 

Makes sense, will drop this.
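
For readers unfamiliar with fourcc values, a minimal userspace sketch of
how the 32-bit pixelformat maps to four characters. This only mirrors
the byte packing of V4L2's v4l2_fourcc() macro; it is not the kernel's
%p4cc implementation, and it ignores the big-endian flag in bit 31. The
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>	/* strcmp() when checking the result */

/* Pack four characters the way v4l2_fourcc() does: first char in the low byte. */
uint32_t make_fourcc(char a, char b, char c, char d)
{
	return (uint32_t)(unsigned char)a |
	       ((uint32_t)(unsigned char)b << 8) |
	       ((uint32_t)(unsigned char)c << 16) |
	       ((uint32_t)(unsigned char)d << 24);
}

/* Unpack a fourcc into a NUL-terminated string; out must hold 5 bytes. */
void fourcc_to_str(uint32_t fourcc, char out[5])
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 4; i++)
		out[i] = (char)((fourcc >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

make_fourcc('Y', 'U', 'Y', 'V') is 0x56595559, and fourcc_to_str() on
that value yields "YUYV", which is far easier to read in a log than the
hex value the quoted "0x%x" format would print.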

> > +			csi->v_fmt.fmt.pix.pixelformat);
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* De-assert the pixel interface reset. */
> > +	reg = SHIM_CNTL_PIX_RST;
> > +	writel(reg, csi->shim + SHIM_CNTL);
> > +
> > +	reg = SHIM_DMACNTX_EN;
> > +	reg |= FIELD_PREP(SHIM_DMACNTX_FMT, fmt->csi_dt);
> > +
> > +	/*
> > +	 * Using the values from the documentation gives incorrect ordering for
> > +	 * the luma and chroma components. In practice, the "reverse" format
> > +	 * gives the correct image. So for example, if the image is in UYVY, the
> > +	 * reverse would be YVYU.
> > +	 */
> > +	switch (fmt->fourcc) {
> > +	case V4L2_PIX_FMT_UYVY:
> > +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> > +					SHIM_DMACNTX_YVYU);
> > +		break;
> > +	case V4L2_PIX_FMT_VYUY:
> > +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> > +					SHIM_DMACNTX_YUYV);
> > +		break;
> > +	case V4L2_PIX_FMT_YUYV:
> > +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> > +					SHIM_DMACNTX_VYUY);
> > +		break;
> > +	case V4L2_PIX_FMT_YVYU:
> > +		reg |= FIELD_PREP(SHIM_DMACNTX_YUV422,
> > +					SHIM_DMACNTX_UYVY);
> > +		break;
> > +	default:
> > +		/* Ignore if not YUV 4:2:2 */
> > +		break;
> > +	}
> > +
> > +	reg |= FIELD_PREP(SHIM_DMACNTX_SIZE, fmt->size);
> > +
> > +	writel(reg, csi->shim + SHIM_DMACNTX);
> > +
> > +	reg = FIELD_PREP(SHIM_PSI_CFG0_SRC_TAG, 0) |
> > +	      FIELD_PREP(SHIM_PSI_CFG0_DST_TAG, 0);
> > +	writel(reg, csi->shim + SHIM_PSI_CFG0);
> > +
> > +	return 0;
> > +}
> > +
> > +static void ti_csi2rx_drain_callback(void *param)
> > +{
> > +	struct completion *drain_complete = param;
> > +
> > +	complete(drain_complete);
> > +}
> > +
> > +/** Drain the stale data left at the PSI-L endpoint.
> 
> This isn't kerneldoc, so
> 
> /*
>  * Drain the stale data left at the PSI-L endpoint.
> 

Will fix.

> > + *
> > + * This might happen if no buffers are queued in time but source is still
> > + * streaming. Or rarely it may happen while stopping the stream. To prevent
> > + * that stale data corrupting the subsequent transactions, it is required to
> > + * issue DMA requests to drain it out.
> > + */
> > +static int ti_csi2rx_drain_dma(struct ti_csi2rx_dev *csi)
> > +{
> > +	struct dma_async_tx_descriptor *desc;
> > +	struct completion drain_complete;
> > +	dma_cookie_t cookie;
> > +	int ret;
> > +
> > +	init_completion(&drain_complete);
> > +
> > +	desc = dmaengine_prep_slave_single(csi->dma.chan, csi->dma.drain.paddr,
> > +					   csi->dma.drain.len, DMA_DEV_TO_MEM,
> > +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > +	if (!desc) {
> > +		ret = -EIO;
> > +		goto out;
> > +	}
> > +
> > +	desc->callback = ti_csi2rx_drain_callback;
> > +	desc->callback_param = &drain_complete;
> > +
> > +	cookie = dmaengine_submit(desc);
> > +	ret = dma_submit_error(cookie);
> > +	if (ret)
> > +		goto out;
> > +
> > +	dma_async_issue_pending(csi->dma.chan);
> > +
> > +	if (!wait_for_completion_timeout(&drain_complete,
> > +					 msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
> > +		dmaengine_terminate_sync(csi->dma.chan);
> > +		ret = -ETIMEDOUT;
> > +		goto out;
> > +	}
> > +out:
> > +	return ret;
> > +}
> > +
> > +static void ti_csi2rx_dma_callback(void *param)
> > +{
> > +	struct ti_csi2rx_buffer *buf = param;
> > +	struct ti_csi2rx_dev *csi = buf->csi;
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	unsigned long flags;
> > +
> > +	/*
> > +	 * TODO: Derive the sequence number from the CSI2RX frame number
> > +	 * hardware monitor registers.
> > +	 */
> > +	buf->vb.vb2_buf.timestamp = ktime_get_ns();
> > +	buf->vb.sequence = csi->sequence++;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +
> > +	WARN_ON(!list_is_first(&buf->list, &dma->submitted));
> > +	vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_DONE);
> > +	list_del(&buf->list);
> > +
> > +	/* If there are more buffers to process then start their transfer. */
> > +	while (!list_empty(&dma->queue)) {
> > +		buf = list_entry(dma->queue.next, struct ti_csi2rx_buffer, list);
> > +
> > +		if (ti_csi2rx_start_dma(csi, buf)) {
> > +			dev_err(csi->dev, "Failed to queue the next buffer for DMA\n");
> > +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> > +		} else {
> > +			list_move_tail(&buf->list, &dma->submitted);
> > +		}
> > +	}
> > +
> > +	if (list_empty(&dma->submitted))
> > +		dma->state = TI_CSI2RX_DMA_IDLE;
> > +
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +}
> > +
> > +static int ti_csi2rx_start_dma(struct ti_csi2rx_dev *csi,
> > +			       struct ti_csi2rx_buffer *buf)
> > +{
> > +	unsigned long addr;
> > +	struct dma_async_tx_descriptor *desc;
> > +	size_t len = csi->v_fmt.fmt.pix.sizeimage;
> > +	dma_cookie_t cookie;
> > +	int ret = 0;
> > +
> > +	addr = vb2_dma_contig_plane_dma_addr(&buf->vb.vb2_buf, 0);
> > +	desc = dmaengine_prep_slave_single(csi->dma.chan, addr, len,
> > +					   DMA_DEV_TO_MEM,
> > +					   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > +	if (!desc)
> > +		return -EIO;
> > +
> > +	desc->callback = ti_csi2rx_dma_callback;
> > +	desc->callback_param = buf;
> > +
> > +	cookie = dmaengine_submit(desc);
> > +	ret = dma_submit_error(cookie);
> > +	if (ret)
> > +		return ret;
> > +
> > +	dma_async_issue_pending(csi->dma.chan);
> > +
> > +	return 0;
> > +}
> > +
> > +static void ti_csi2rx_cleanup_buffers(struct ti_csi2rx_dev *csi,
> > +				      enum vb2_buffer_state buf_state)
> > +{
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	struct ti_csi2rx_buffer *buf, *tmp;
> > +	enum ti_csi2rx_dma_state state;
> > +	unsigned long flags;
> > +	int ret;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	state = csi->dma.state;
> > +	dma->state = TI_CSI2RX_DMA_STOPPED;
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +
> > +	if (state != TI_CSI2RX_DMA_STOPPED) {
> > +		/*
> > +		 * Normal DMA termination sometimes does not clean up pending
> > +		 * data on the endpoint.
> > +		 */
> > +		ret = ti_csi2rx_drain_dma(csi);
> > +		if (ret)
> > +			dev_dbg(csi->dev,
> > +				"Failed to drain DMA. Next frame might be bogus\n");
> 
> A dev_warn() may be more appropriate, this seems quite important.
> 

I think this was intentional. The calls to the DMA engine for draining
always time out, even when the drain is successful, because the amount
of data drained is non-deterministic (and we use the frame size to be
safe, which is much more than what the FIFOs would be storing). Using
dev_warn() here leads to spurious dmesg logs at the end of every stream
close.

Maybe we should use dev_dbg() for the timeout case, and dev_err() for any
other error returned by the DMA engine. Will fix that in v10.

Hopefully this will be simplified once we have an API in place to signal 
drain to "/dev/null".
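To illustrate the split proposed above, here is a minimal userspace sketch
of the classification, not the driver code itself. The enum and the
function name are hypothetical; in the driver the two non-zero cases
would presumably map to dev_dbg() and dev_err() respectively.

```c
#include <errno.h>

/* Hypothetical severity levels standing in for dev_dbg()/dev_err(). */
enum drain_log_level {
	DRAIN_LOG_NONE,	/* drain completed, nothing to report */
	DRAIN_LOG_DBG,	/* expected timeout, debug-level message only */
	DRAIN_LOG_ERR,	/* any other failure from the DMA engine */
};

/*
 * Map the return value of a drain attempt to a log severity.
 * Assumption: the drain helper returns 0 on success, -ETIMEDOUT when
 * the wait for completion expires, and another negative errno otherwise.
 */
static enum drain_log_level drain_log_level_for(int ret)
{
	if (!ret)
		return DRAIN_LOG_NONE;
	if (ret == -ETIMEDOUT)
		return DRAIN_LOG_DBG;
	return DRAIN_LOG_ERR;
}
```

With this split, the routine timeout at stream close stays out of dmesg
by default, while a failure such as -EIO is still reported as an error.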

> > +	}
> 
> A blank line would be nice here.
> 

Will fix.

> > +	ret = dmaengine_terminate_sync(csi->dma.chan);
> > +	if (ret)
> > +		dev_err(csi->dev, "Failed to stop DMA: %d\n", ret);
> 
> When called from ti_csi2rx_start_streaming() there's already a
> dmaengine_terminate_sync(), and there's also a call to the same function
> in ti_csi2rx_drain_dma() called above. Could we avoid calling the
> function multiple times ? I think stopping the DMA engine should be
> moved to a separate function, as it doesn't fit with the
> ti_csi2rx_cleanup_buffers() name.
> 

Oops missed that, will fix.

> > +
> > +	dma_free_coherent(csi->dev, dma->drain.len,
> > +			  dma->drain.vaddr, dma->drain.paddr);
> > +	dma->drain.vaddr = NULL;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	list_for_each_entry_safe(buf, tmp, &csi->dma.queue, list) {
> > +		list_del(&buf->list);
> > +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> > +	}
> > +	list_for_each_entry_safe(buf, tmp, &csi->dma.submitted, list) {
> > +		list_del(&buf->list);
> > +		vb2_buffer_done(&buf->vb.vb2_buf, buf_state);
> > +	}
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +}
> > +
> > +static int ti_csi2rx_queue_setup(struct vb2_queue *q, unsigned int *nbuffers,
> > +				 unsigned int *nplanes, unsigned int sizes[],
> > +				 struct device *alloc_devs[])
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(q);
> > +	unsigned int size = csi->v_fmt.fmt.pix.sizeimage;
> > +
> > +	if (*nplanes) {
> > +		if (sizes[0] < size)
> > +			return -EINVAL;
> > +		size = sizes[0];
> > +	}
> > +
> > +	*nplanes = 1;
> > +	sizes[0] = size;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ti_csi2rx_buffer_prepare(struct vb2_buffer *vb)
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> > +	unsigned long size = csi->v_fmt.fmt.pix.sizeimage;
> > +
> > +	if (vb2_plane_size(vb, 0) < size) {
> > +		dev_err(csi->dev, "Data will not fit into plane\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	vb2_set_plane_payload(vb, 0, size);
> > +	return 0;
> > +}
> > +
> > +static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
> > +{
> > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vb->vb2_queue);
> > +	struct ti_csi2rx_buffer *buf;
> > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > +	bool restart_dma = false;
> > +	unsigned long flags = 0;
> > +	int ret;
> > +
> > +	buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
> > +	buf->csi = csi;
> > +
> > +	spin_lock_irqsave(&dma->lock, flags);
> > +	/*
> > +	 * Usually the DMA callback takes care of queueing the pending buffers.
> > +	 * But if DMA has stalled due to lack of buffers, restart it now.
> > +	 */
> > +	if (dma->state == TI_CSI2RX_DMA_IDLE) {
> > +		/*
> > +		 * Do not restart DMA with the lock held because
> > +		 * ti_csi2rx_drain_dma() might block for completion.
> > +		 * There won't be a race on queueing DMA anyway since the
> > +		 * callback is not being fired.
> > +		 */
> > +		restart_dma = true;
> > +		dma->state = TI_CSI2RX_DMA_ACTIVE;
> > +	} else {
> > +		list_add_tail(&buf->list, &dma->queue);
> > +	}
> > +	spin_unlock_irqrestore(&dma->lock, flags);
> > +
> > +	if (restart_dma) {
> > +		/*
> > +		 * Once frames start dropping, some data gets stuck in the DMA
> > +		 * pipeline somewhere. So the first DMA transfer after frame
> > +		 * drops gives a partial frame. This is obviously not useful to
> > +		 * the application and will only confuse it. Issue a DMA
> > +		 * transaction to drain that up.
> > +		 */
> 
> Another option would be to return the frame to userspace with the error
> flag set. That would give an earlier indication to applications that
> something went wrong. Up to you.
> 

Oh I see, that does sound cleaner to me - I will try that out with 
different applications. If it works, would it be okay if I handle it as 
a separate series? The mechanism in this series is already well-tested, 
even if a bit ugly.

> > +		ret = ti_csi2rx_drain_dma(csi);
> > +		if (ret)
> > +			dev_warn(csi->dev,
> > +				 "Failed to drain DMA. Next frame might be bogus\n");
> > +
> > +		ret = ti_csi2rx_start_dma(csi, buf);
> > +		if (ret) {
> > +			dev_err(csi->dev, "Failed to start DMA: %d\n", ret);
> > +			spin_lock_irqsave(&dma->lock, flags);
> > +			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> > +			dma->state = TI_CSI2RX_DMA_IDLE;
> > +			spin_unlock_irqrestore(&dma->lock, flags);
> > +		} else {
> > +			spin_lock_irqsave(&dma->lock, flags);
> > +			list_add_tail(&buf->list, &dma->submitted);
> > +			spin_unlock_irqrestore(&dma->lock, flags);
> > +		}
> > +	}

[snip]

> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Thanks,
Jai

GPG Fingerprint: 4DE0 D818 E5D5 75E8 D45A AFC5 43DE 91F9 249A 7145

* Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E
  2023-10-04 20:03           ` Laurent Pinchart
  2023-10-05  4:10             ` Vinod Koul
@ 2023-10-06 10:26             ` Jai Luthra
  1 sibling, 0 replies; 30+ messages in thread
From: Jai Luthra @ 2023-10-06 10:26 UTC (permalink / raw)
  To: Laurent Pinchart, Vinod Koul, Vignesh Raghavendra
  Cc: Tomi Valkeinen, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Sakari Ailus, linux-media,
	linux-kernel, devicetree, linux-arm-kernel,
	Mauro Carvalho Chehab, Maxime Ripard, niklas.soderlund+renesas,
	Benoit Parrot, Vaishnav Achath, nm, devarsht, a-bhatia1,
	Martyn Welch, Julien Massot

Hi Laurent, Vignesh, Vinod,

I have some good news: there is an upper bound (~32KiB) on the amount of
data stored in the FIFOs, so we don't need to allocate a buffer of the
full frame size.

On Oct 04, 2023 at 23:03:12 +0300, Laurent Pinchart wrote:
> On Wed, Oct 04, 2023 at 07:21:00PM +0530, Vinod Koul wrote:
> > On 29-08-23, 18:55, Laurent Pinchart wrote:
> > > Hi Jai,
> > > 
> > > (CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
> > > question below)
> > 
> > Sorry this got lost
> 
> No worries.
> 
> > > On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> > > > On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > > > > On 11/08/2023 13:47, Jai Luthra wrote:
> > > > > > From: Pratyush Yadav <p.yadav@ti.com>
> > > 
> > > [snip]
> > > 
> > > > > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > > > > +{
> > > > > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > > > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > > > > +	struct ti_csi2rx_buffer *buf;
> > > > > > +	unsigned long flags;
> > > > > > +	int ret = 0;
> > > > > > +
> > > > > > +	spin_lock_irqsave(&dma->lock, flags);
> > > > > > +	if (list_empty(&dma->queue))
> > > > > > +		ret = -EIO;
> > > > > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > > > > +	if (ret)
> > > > > > +		return ret;
> > > > > > +
> > > > > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > > > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > > > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > > > > +	if (!dma->drain.vaddr)
> > > > > > +		return -ENOMEM;
> > > > > 
> > > > > This is still allocating a large buffer every time streaming is started (and
> > > > > with streams support, a separate buffer for each stream?).
> > > > > 
> > > > > Did you check if the TI DMA can do writes to a constant address? That would
> > > > > be the best option, as then the whole buffer allocation problem goes away.
> > > > 
> > > > I checked with Vignesh, the hardware can support a scenario where we 
> > > > flush out all the data without allocating a buffer, but I couldn't find 
> > > > a way to signal that via the current dmaengine framework APIs. Will look 
> > > > into it further as it will be important for multi-stream support.
> > > 
> > > That would be the best option. It's not immediately apparent to me if
> > > the DMA engine API supports such a use case.
> > > dmaengine_prep_interleaved_dma() gives you finer grain control on the
> > > source and destination increments, but I haven't seen a way to instruct
> > > the DMA engine to direct writes to /dev/null (so to speak). Vinod, is
> > > this something that is supported, or could be supported ?
> > 
> > Write to a dummy buffer could have the same behaviour, no?
> 
> Yes, but if the DMA engine can write to /dev/null, that avoids
> allocating a dummy buffer, which is nicer. For video use cases, dummy
> buffers are often large.
> 
> > > > > Alternatively, can you flush the buffers with multiple one line transfers?
> > > > > The flushing shouldn't be performance critical, so even if that's slower
> > > > > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > > > > done, a single probe time line-buffer allocation should do the trick.
> > > > 
> > > > There will be considerable overhead if we queue many DMA transactions 
> > > > (in the order of 1000s or even 100s), which might not be okay for the 
> > > > scenarios where we have to drain mid-stream. Will have to run some 
> > > > experiments to see if that is worth it.
> > > > 
> > > > But one optimization we can for sure do is re-use a single drain buffer 
> > > > for all the streams. We will need to ensure to re-allocate the buffer 
> > > > for the "largest" framesize supported across the different streams at 
> > > > stream-on time.
> > > 
> > > If you implement .device_prep_interleaved_dma() in the DMA engine driver
> > > you could write to a single line buffer, assuming that the hardware would
> > > support so in a generic way.
> > > 
> > > > My guess is the endpoint is not buffering a full-frame's worth of data, 
> > > > I will also check if we can upper bound that size to something feasible.

According to the spec the endpoint buffers a maximum of 2048 samples of
128 bits (16 bytes) each, which comes out to 32KiB.

I ran some experiments after disabling the drain and looking at the
subsequent corrupt frames with stale data, and the stale data was always
a small multiple (< 20x) of 128-bit samples.
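As a quick check of the numbers above, the arithmetic can be sketched in
plain C. The macro and function names are made up for illustration and
are not taken from the driver.

```c
#include <stddef.h>

/* Hypothetical names; the values are the ones quoted in this mail. */
#define FIFO_MAX_SAMPLES	2048u	/* upper bound per the spec */
#define FIFO_SAMPLE_BITS	128u	/* width of one sample */
#define OBSERVED_MAX_SAMPLES	20u	/* stale data seen was below this */

/* Convert a number of 128-bit samples to bytes. */
static size_t fifo_bytes(size_t samples)
{
	return samples * (FIFO_SAMPLE_BITS / 8u);
}
```

fifo_bytes(FIFO_MAX_SAMPLES) gives 32768 bytes (32KiB), and
fifo_bytes(OBSERVED_MAX_SAMPLES) gives 320 bytes, so the stale data seen
in the experiments stays well under the worst case from the spec.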

Given we have an upper bound, I think a practical solution for now is to 
allocate a single re-usable 32KiB buffer at probe time (will send v10 
with this fix).

It would still be ideal if we could do this without *any* buffers at
all.

> > > > 
> > > > > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > > > > to give Rb, but I do encourage you to look more into optimizing this drain
> > > > > buffer.
> > > > 
> > > > Thank you!
> > > > 
> > > > > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Thanks,
Jai

GPG Fingerprint: 4DE0 D818 E5D5 75E8 D45A AFC5 43DE 91F9 249A 7145

end of thread, other threads:[~2023-10-06 10:27 UTC | newest]

Thread overview: 30+ messages
2023-08-11 10:47 [PATCH v9 00/13] CSI2RX support on J721E and AM62 Jai Luthra
2023-08-11 10:47 ` [PATCH v9 01/13] media: dt-bindings: Make sure items in data-lanes are unique Jai Luthra
2023-08-11 10:47 ` [PATCH v9 02/13] media: dt-bindings: cadence-csi2rx: Add TI compatible string Jai Luthra
2023-08-25  3:44   ` Laurent Pinchart
2023-08-11 10:47 ` [PATCH v9 03/13] media: cadence: csi2rx: Unregister v4l2 async notifier Jai Luthra
2023-08-11 10:47 ` [PATCH v9 04/13] media: cadence: csi2rx: Cleanup media entity properly Jai Luthra
2023-08-11 10:47 ` [PATCH v9 05/13] media: cadence: csi2rx: Add get_fmt and set_fmt pad ops Jai Luthra
2023-08-15 12:05   ` Tomi Valkeinen
2023-08-25  3:48   ` Laurent Pinchart
2023-08-11 10:47 ` [PATCH v9 06/13] media: cadence: csi2rx: Configure DPHY using link freq Jai Luthra
2023-08-11 10:47 ` [PATCH v9 07/13] media: cadence: csi2rx: Soft reset the streams before starting capture Jai Luthra
2023-08-15 12:10   ` Tomi Valkeinen
2023-08-11 10:47 ` [PATCH v9 08/13] media: cadence: csi2rx: Set the STOP bit when stopping a stream Jai Luthra
2023-08-11 10:47 ` [PATCH v9 09/13] media: cadence: csi2rx: Fix stream data configuration Jai Luthra
2023-08-11 10:47 ` [PATCH v9 10/13] media: cadence: csi2rx: Populate subdev devnode Jai Luthra
2023-08-11 10:47 ` [PATCH v9 11/13] media: cadence: csi2rx: Add link validation Jai Luthra
2023-08-11 10:47 ` [PATCH v9 12/13] media: dt-bindings: Add TI J721E CSI2RX Jai Luthra
2023-08-11 14:00   ` Rob Herring
2023-08-11 14:54     ` Rob Herring
2023-08-11 10:47 ` [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E Jai Luthra
2023-08-15 13:00   ` Tomi Valkeinen
2023-08-18 10:25     ` Jai Luthra
2023-08-29 15:55       ` Laurent Pinchart
2023-10-04 13:51         ` Vinod Koul
2023-10-04 20:03           ` Laurent Pinchart
2023-10-05  4:10             ` Vinod Koul
2023-10-06 10:26             ` Jai Luthra
2023-08-29 16:44   ` Laurent Pinchart
2023-10-05  8:34     ` Jai Luthra
2023-08-24 15:18 ` [PATCH v9 00/13] CSI2RX support on J721E and AM62 Julien Massot