All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/9] i.MX8MP DW100 dewarper driver
@ 2022-05-03  9:39 Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY Xavier Roumegue
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

This patchset depends on the series "consolidated i.MX8MP HSIO/MEDIA/HDMI
blk-ctrl series"[1] and "i.MX8MP GPC and blk-ctrl"[2] which provide the power
driver infrastructure and associated dt-bindings.

Previous series can now be marked as superseded.
v1:
https://patchwork.linuxtv.org/project/linux-media/list/?series=7443
v2:
https://patchwork.linuxtv.org/project/linux-media/list/?series=7472
v3:
https://patchwork.linuxtv.org/project/linux-media/list/?series=7473
v4:
https://patchwork.linuxtv.org/project/linux-media/list/?series=7633

Patches [1-3/9] add support for dynamic array control imported from the
"Move HEVC stateless controls out of staging"[3] currently under review.

Patches [4-7/9] document the driver interfaces, and export its v4l2 custom
control to uapi headers.

Patches [8/9] adds the v4l2 m2m driver.
Patches [9/9] adds the driver to MAINTAINERS.

The patchset baseline is v5.18-rc4.

The Vivante DW100 Dewarp Engine, found on i.MX8MP SoC, provides high-performance
dewarp processing for the correction of the distortion that is introduced in
images produced by fisheye and wide angle lenses. The engine can be used for
accelerating scaling, cropping and pixel format conversion operations
independently of the dewarping feature.

A script example [4] has been published to generate the dewarping blob from
outgoing openCV 3d calibration process parameters. This scrict can generate
identity map with h/v flip, dewarping and stereo rectification mappings.

The driver has been tested with:
- v4l2-ctl (from master with [6]) for testing pixel format conversion, scaling
  and crop features using builtin driver identity map.
- OpenCV stereo application using dedicated dw100 openCV module [5] to implement
  stereo rectification stage.
- GStreamer v4l2convert (patched to support dewarping blob map injection)
- v4l2-compliance (test report added after changelog)

[1] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=629586
[2] https://lore.kernel.org/all/20220330104620.3600159-1-l.stach@pengutronix.de/
[3] https://patchwork.linuxtv.org/project/linux-media/list/?series=7418
[4] https://github.com/NXPmicro/vtec-cv/tree/main/bin
[5] https://github.com/NXPmicro/vtec-opencv
[6] https://patchwork.linuxtv.org/project/linux-media/list/?series=7595

---
Changelog:
v5:
- Add enum_frame_size support
- Fix checkpatch.pl --strict issues
- Reword documentation and add ascii-art figures.
- Strip v4l2 array controls size to 1 in driver state structure
- Use strscpy instead of strncpy
- Use min/max macro from minmax.h
- Use kernel DIV_ROUND helpers.
- s/0x[A-F]/0x[a-f]/

v4:
- Move dw100 driver to NXP platform media drivers folder
- Change dt compatible string to "nxp,imx8mp-dw100"
- Rename V4L2_CID_DW100_MAPPING to V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP
- Add more verbose description of V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP
- Handle v4l2 controls as an array in driver.

v3:
- Fix incorrect i.MX8MP Reference Manual link in documentation

v2:
- Fix yaml dt-bindings errors
- Drop assigned-clocks properties from dt-bindings example
- Add dw100 driver documentation
- Rework V4L2 LUT assignment with v4l2 dynamic array control
- Rename V4L2_CID_DW100_LUT to V4L2_CID_DW100_MAPPING
- Export V4L2_CID_DW100_MAPPING to kernel headers

Hans Verkuil (3):
  videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  v4l2-ctrls: add support for dynamically allocated arrays.
  vivid: add dynamic array test control

Xavier Roumegue (6):
  media: Documentation: dw100: Add user documentation for the DW100
    driver
  media: v4l: uapi: Add user control base for DW100 controls
  media: uapi: Add a control for DW100 driver
  media: dt-bindings: media: Add i.MX8MP DW100 binding
  media: dw100: Add i.MX8MP dw100 dewarper driver
  media: MAINTAINERS: add entry for i.MX8MP DW100 v4l2 mem2mem driver

 .../devicetree/bindings/media/nxp,dw100.yaml  |   69 +
 .../userspace-api/media/drivers/dw100.rst     |  103 +
 .../userspace-api/media/drivers/index.rst     |    3 +-
 .../media/v4l/vidioc-queryctrl.rst            |    8 +
 MAINTAINERS                                   |    9 +
 drivers/media/platform/nxp/Kconfig            |    1 +
 drivers/media/platform/nxp/Makefile           |    1 +
 drivers/media/platform/nxp/dw100/Kconfig      |   16 +
 drivers/media/platform/nxp/dw100/Makefile     |    3 +
 drivers/media/platform/nxp/dw100/dw100.c      | 1782 +++++++++++++++++
 drivers/media/platform/nxp/dw100/dw100_regs.h |  118 ++
 .../media/test-drivers/vivid/vivid-ctrls.c    |   15 +
 drivers/media/v4l2-core/v4l2-ctrls-api.c      |  103 +-
 drivers/media/v4l2-core/v4l2-ctrls-core.c     |  182 +-
 drivers/media/v4l2-core/v4l2-ctrls-priv.h     |    3 +-
 drivers/media/v4l2-core/v4l2-ctrls-request.c  |   13 +-
 include/media/v4l2-ctrls.h                    |   42 +-
 include/uapi/linux/dw100.h                    |   14 +
 include/uapi/linux/v4l2-controls.h            |    6 +
 include/uapi/linux/videodev2.h                |    1 +
 20 files changed, 2420 insertions(+), 72 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/media/nxp,dw100.yaml
 create mode 100644 Documentation/userspace-api/media/drivers/dw100.rst
 create mode 100644 drivers/media/platform/nxp/dw100/Kconfig
 create mode 100644 drivers/media/platform/nxp/dw100/Makefile
 create mode 100644 drivers/media/platform/nxp/dw100/dw100.c
 create mode 100644 drivers/media/platform/nxp/dw100/dw100_regs.h
 create mode 100644 include/uapi/linux/dw100.h

==========
Compliance
==========
# v4l2-compliance -d 1
v4l2-compliance 1.23.0-4923, 64 bits, 64-bit time_t
v4l2-compliance SHA: 163144712a46 2022-04-25 05:31:44

Compliance test for dw100 device /dev/video1:

Driver Info:
	Driver name      : dw100
	Card type        : DW100 dewarper
	Bus info         : platform:dw100
	Driver version   : 5.18.0
	Capabilities     : 0x84208000
		Video Memory-to-Memory
		Streaming
		Extended Pix Format
		Device Capabilities
	Device Caps      : 0x04208000
		Video Memory-to-Memory
		Streaming
		Extended Pix Format
Media Driver Info:
	Driver name      : dw100
	Model            : dw100
	Serial           :
	Bus info         : platform:dw100
	Media version    : 5.18.0
	Hardware revision: 0x00000000 (0)
	Driver version   : 5.18.0
Interface Info:
	ID               : 0x0300000c
	Type             : V4L Video
Entity Info:
	ID               : 0x00000001 (1)
	Name             : dw100-source
	Function         : V4L2 I/O
	Pad 0x01000002   : 0: Source
	  Link 0x02000008: to remote pad 0x1000004 of entity 'dw100-proc' (Video Scaler): Data, Enabled, Immutable

Required ioctls:
	test MC information (see 'Media Driver Info' above): OK
	test VIDIOC_QUERYCAP: OK
	test invalid ioctls: OK

Allow for multiple opens:
	test second /dev/video1 open: OK
	test VIDIOC_QUERYCAP: OK
	test VIDIOC_G/S_PRIORITY: OK
	test for unlimited opens: OK

Debug ioctls:
	test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
	test VIDIOC_LOG_STATUS: OK (Not Supported)

Input ioctls:
	test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
	test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
	test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
	test VIDIOC_ENUMAUDIO: OK (Not Supported)
	test VIDIOC_G/S/ENUMINPUT: OK (Not Supported)
	test VIDIOC_G/S_AUDIO: OK (Not Supported)
	Inputs: 0 Audio Inputs: 0 Tuners: 0

Output ioctls:
	test VIDIOC_G/S_MODULATOR: OK (Not Supported)
	test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
	test VIDIOC_ENUMAUDOUT: OK (Not Supported)
	test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
	test VIDIOC_G/S_AUDOUT: OK (Not Supported)
	Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
	test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
	test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
	test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
	test VIDIOC_G/S_EDID: OK (Not Supported)

Control ioctls:
	test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK
	test VIDIOC_QUERYCTRL: OK
	test VIDIOC_G/S_CTRL: OK
	test VIDIOC_G/S/TRY_EXT_CTRLS: OK
	test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK
	test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
	Standard Controls: 0 Private Controls: 0
	Standard Compound Controls: 0 Private Compound Controls: 1

Format ioctls:
	test VIDIOC_ENUM_FMT/FRAMESIZES/FRAMEINTERVALS: OK
	test VIDIOC_G/S_PARM: OK (Not Supported)
	test VIDIOC_G_FBUF: OK (Not Supported)
	test VIDIOC_G_FMT: OK
	test VIDIOC_TRY_FMT: OK
	test VIDIOC_S_FMT: OK
	test VIDIOC_G_SLICED_VBI_CAP: OK (Not Supported)
	test Cropping: OK
	test Composing: OK
	test Scaling: OK

Codec ioctls:
	test VIDIOC_(TRY_)ENCODER_CMD: OK (Not Supported)
	test VIDIOC_G_ENC_INDEX: OK (Not Supported)
	test VIDIOC_(TRY_)DECODER_CMD: OK (Not Supported)

Buffer ioctls:
	test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
	test VIDIOC_EXPBUF: OK
	test Requests: OK (Not Supported)

Total for dw100 device /dev/video1: 46, Succeeded: 46, Failed: 0, Warnings: 0

================
Media controller
================
# media-ctl -p
Media controller API version 5.18.0

Media device information
------------------------
driver          dw100
model           dw100
serial
bus info        platform:dw100
hw revision     0x0
driver version  5.18.0

Device topology
- entity 1: dw100-source (1 pad, 1 link)
            type Node subtype V4L flags 0
            device node name /dev/video1
	pad0: Source
		-> "dw100-proc":0 [ENABLED,IMMUTABLE]

- entity 3: dw100-proc (2 pads, 2 links)
            type Node subtype Unknown flags 0
	pad0: Sink
		<- "dw100-source":0 [ENABLED,IMMUTABLE]
	pad1: Source
		-> "dw100-sink":0 [ENABLED,IMMUTABLE]

- entity 6: dw100-sink (1 pad, 1 link)
            type Node subtype V4L flags 0
            device node name /dev/video1
	pad0: Sink
		<- "dw100-proc":1 [ENABLED,IMMUTABLE]

--
2.35.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-06-14 20:46   ` Laurent Pinchart
  2022-05-03  9:39 ` [PATCH v5 2/9] v4l2-ctrls: add support for dynamically allocated arrays Xavier Roumegue
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: linux-media, devicetree

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Add a new flag that indicates that this control is a dynamically sized
array. Also document this flag.

Currently dynamically sized arrays are limited to one dimensional arrays,
but that might change in the future if there is a need for it.

The initial use-case of dynamic arrays are stateless codecs. A frame
can be divided in many slices, so you want to provide an array containing
slice information for each slice. Typically the number of slices is small,
but the standard allow for hundreds or thousands of slices. Dynamic arrays
are a good solution since sizing the array for the worst case would waste
substantial amounts of memory.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
---
 .../userspace-api/media/v4l/vidioc-queryctrl.rst          | 8 ++++++++
 include/uapi/linux/videodev2.h                            | 1 +
 2 files changed, 9 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
index 88f630252d98..a20dfa2a933b 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
@@ -625,6 +625,14 @@ See also the examples in :ref:`control`.
 	``V4L2_CTRL_FLAG_GRABBED`` flag when buffers are allocated or
 	streaming is in progress since most drivers do not support changing
 	the format in that case.
+    * - ``V4L2_CTRL_FLAG_DYNAMIC_ARRAY``
+      - 0x0800
+      - This control is a dynamically sized 1-dimensional array. It
+        behaves the same as a regular array, except that the number
+	of elements as reported by the ``elems`` field is between 1 and
+	``dims[0]``. So setting the control with a differently sized
+	array will change the ``elems`` field when the control is
+	queried afterwards.
 
 Return Value
 ============
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 3768a0a80830..8df13defde75 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1886,6 +1886,7 @@ struct v4l2_querymenu {
 #define V4L2_CTRL_FLAG_HAS_PAYLOAD	0x0100
 #define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE	0x0200
 #define V4L2_CTRL_FLAG_MODIFY_LAYOUT	0x0400
+#define V4L2_CTRL_FLAG_DYNAMIC_ARRAY	0x0800
 
 /*  Query flags, to be ORed with the control ID */
 #define V4L2_CTRL_FLAG_NEXT_CTRL	0x80000000
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 2/9] v4l2-ctrls: add support for dynamically allocated arrays.
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 3/9] vivid: add dynamic array test control Xavier Roumegue
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: linux-media, devicetree

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Implement support for dynamically allocated arrays.

Most of the changes concern keeping track of the number of elements
of the array and the number of elements allocated for the array and
reallocating memory if needed.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
---
 drivers/media/v4l2-core/v4l2-ctrls-api.c     | 103 ++++++++---
 drivers/media/v4l2-core/v4l2-ctrls-core.c    | 182 +++++++++++++++----
 drivers/media/v4l2-core/v4l2-ctrls-priv.h    |   3 +-
 drivers/media/v4l2-core/v4l2-ctrls-request.c |  13 +-
 include/media/v4l2-ctrls.h                   |  42 ++++-
 5 files changed, 272 insertions(+), 71 deletions(-)

diff --git a/drivers/media/v4l2-core/v4l2-ctrls-api.c b/drivers/media/v4l2-core/v4l2-ctrls-api.c
index db9baa0bd05f..50d012ba3c02 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-api.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-api.c
@@ -97,29 +97,47 @@ static int def_to_user(struct v4l2_ext_control *c, struct v4l2_ctrl *ctrl)
 	return ptr_to_user(c, ctrl, ctrl->p_new);
 }
 
-/* Helper function: copy the caller-provider value to the given control value */
-static int user_to_ptr(struct v4l2_ext_control *c,
-		       struct v4l2_ctrl *ctrl,
-		       union v4l2_ctrl_ptr ptr)
+/* Helper function: copy the caller-provider value as the new control value */
+static int user_to_new(struct v4l2_ext_control *c, struct v4l2_ctrl *ctrl)
 {
 	int ret;
 	u32 size;
 
-	ctrl->is_new = 1;
+	ctrl->is_new = 0;
+	if (ctrl->is_dyn_array &&
+	    c->size > ctrl->p_dyn_alloc_elems * ctrl->elem_size) {
+		void *old = ctrl->p_dyn;
+		void *tmp = kvzalloc(2 * c->size, GFP_KERNEL);
+
+		if (!tmp)
+			return -ENOMEM;
+		memcpy(tmp, ctrl->p_new.p, ctrl->elems * ctrl->elem_size);
+		memcpy(tmp + c->size, ctrl->p_cur.p, ctrl->elems * ctrl->elem_size);
+		ctrl->p_new.p = tmp;
+		ctrl->p_cur.p = tmp + c->size;
+		ctrl->p_dyn = tmp;
+		ctrl->p_dyn_alloc_elems = c->size / ctrl->elem_size;
+		kvfree(old);
+	}
+
 	if (ctrl->is_ptr && !ctrl->is_string) {
+		unsigned int elems = c->size / ctrl->elem_size;
 		unsigned int idx;
 
-		ret = copy_from_user(ptr.p, c->ptr, c->size) ? -EFAULT : 0;
-		if (ret || !ctrl->is_array)
-			return ret;
-		for (idx = c->size / ctrl->elem_size; idx < ctrl->elems; idx++)
-			ctrl->type_ops->init(ctrl, idx, ptr);
+		if (copy_from_user(ctrl->p_new.p, c->ptr, c->size))
+			return -EFAULT;
+		ctrl->is_new = 1;
+		if (ctrl->is_dyn_array)
+			ctrl->new_elems = elems;
+		else if (ctrl->is_array)
+			for (idx = elems; idx < ctrl->elems; idx++)
+				ctrl->type_ops->init(ctrl, idx, ctrl->p_new);
 		return 0;
 	}
 
 	switch (ctrl->type) {
 	case V4L2_CTRL_TYPE_INTEGER64:
-		*ptr.p_s64 = c->value64;
+		*ctrl->p_new.p_s64 = c->value64;
 		break;
 	case V4L2_CTRL_TYPE_STRING:
 		size = c->size;
@@ -127,32 +145,27 @@ static int user_to_ptr(struct v4l2_ext_control *c,
 			return -ERANGE;
 		if (size > ctrl->maximum + 1)
 			size = ctrl->maximum + 1;
-		ret = copy_from_user(ptr.p_char, c->string, size) ? -EFAULT : 0;
+		ret = copy_from_user(ctrl->p_new.p_char, c->string, size) ? -EFAULT : 0;
 		if (!ret) {
-			char last = ptr.p_char[size - 1];
+			char last = ctrl->p_new.p_char[size - 1];
 
-			ptr.p_char[size - 1] = 0;
+			ctrl->p_new.p_char[size - 1] = 0;
 			/*
 			 * If the string was longer than ctrl->maximum,
 			 * then return an error.
 			 */
-			if (strlen(ptr.p_char) == ctrl->maximum && last)
+			if (strlen(ctrl->p_new.p_char) == ctrl->maximum && last)
 				return -ERANGE;
 		}
 		return ret;
 	default:
-		*ptr.p_s32 = c->value;
+		*ctrl->p_new.p_s32 = c->value;
 		break;
 	}
+	ctrl->is_new = 1;
 	return 0;
 }
 
-/* Helper function: copy the caller-provider value as the new control value */
-static int user_to_new(struct v4l2_ext_control *c, struct v4l2_ctrl *ctrl)
-{
-	return user_to_ptr(c, ctrl, ctrl->p_new);
-}
-
 /*
  * VIDIOC_G/TRY/S_EXT_CTRLS implementation
  */
@@ -254,7 +267,31 @@ static int prepare_ext_ctrls(struct v4l2_ctrl_handler *hdl,
 			have_clusters = true;
 		if (ctrl->cluster[0] != ctrl)
 			ref = find_ref_lock(hdl, ctrl->cluster[0]->id);
-		if (ctrl->is_ptr && !ctrl->is_string) {
+		if (ctrl->is_dyn_array) {
+			unsigned int max_size = ctrl->dims[0] * ctrl->elem_size;
+			unsigned int tot_size = ctrl->elem_size;
+
+			if (cs->which == V4L2_CTRL_WHICH_REQUEST_VAL)
+				tot_size *= ref->p_req_elems;
+			else
+				tot_size *= ctrl->elems;
+
+			c->size = ctrl->elem_size * (c->size / ctrl->elem_size);
+			if (get) {
+				if (c->size < tot_size) {
+					c->size = tot_size;
+					return -ENOSPC;
+				}
+				c->size = tot_size;
+			} else {
+				if (c->size > max_size) {
+					c->size = max_size;
+					return -ENOSPC;
+				}
+				if (!c->size)
+					return -EFAULT;
+			}
+		} else if (ctrl->is_ptr && !ctrl->is_string) {
 			unsigned int tot_size = ctrl->elems * ctrl->elem_size;
 
 			if (c->size < tot_size) {
@@ -346,7 +383,7 @@ static int class_check(struct v4l2_ctrl_handler *hdl, u32 which)
  *
  * Note that v4l2_g_ext_ctrls_common() with 'which' set to
  * V4L2_CTRL_WHICH_REQUEST_VAL is only called if the request was
- * completed, and in that case valid_p_req is true for all controls.
+ * completed, and in that case p_req_valid is true for all controls.
  */
 int v4l2_g_ext_ctrls_common(struct v4l2_ctrl_handler *hdl,
 			    struct v4l2_ext_controls *cs,
@@ -430,7 +467,9 @@ int v4l2_g_ext_ctrls_common(struct v4l2_ctrl_handler *hdl,
 
 			if (is_default)
 				ret = def_to_user(cs->controls + idx, ref->ctrl);
-			else if (is_request && ref->valid_p_req)
+			else if (is_request && ref->p_req_dyn_enomem)
+				ret = -ENOMEM;
+			else if (is_request && ref->p_req_valid)
 				ret = req_to_user(cs->controls + idx, ref);
 			else if (is_volatile)
 				ret = new_to_user(cs->controls + idx, ref->ctrl);
@@ -457,6 +496,17 @@ int v4l2_g_ext_ctrls(struct v4l2_ctrl_handler *hdl, struct video_device *vdev,
 }
 EXPORT_SYMBOL(v4l2_g_ext_ctrls);
 
+/* Validate a new control */
+static int validate_new(const struct v4l2_ctrl *ctrl, union v4l2_ctrl_ptr p_new)
+{
+	unsigned int idx;
+	int err = 0;
+
+	for (idx = 0; !err && idx < ctrl->new_elems; idx++)
+		err = ctrl->type_ops->validate(ctrl, idx, p_new);
+	return err;
+}
+
 /* Validate controls. */
 static int validate_ctrls(struct v4l2_ext_controls *cs,
 			  struct v4l2_ctrl_helper *helpers,
@@ -872,6 +922,9 @@ int __v4l2_ctrl_s_ctrl_compound(struct v4l2_ctrl *ctrl,
 	/* It's a driver bug if this happens. */
 	if (WARN_ON(ctrl->type != type))
 		return -EINVAL;
+	/* Setting dynamic arrays is not (yet?) supported. */
+	if (WARN_ON(ctrl->is_dyn_array))
+		return -EINVAL;
 	memcpy(ctrl->p_new.p, p, ctrl->elems * ctrl->elem_size);
 	return set_ctrl(NULL, ctrl, 0);
 }
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
index 8968cec8454e..cb709c74e01e 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
@@ -991,11 +991,12 @@ EXPORT_SYMBOL(v4l2_ctrl_notify);
 
 /* Copy the one value to another. */
 static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
-		       union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
+		       union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to,
+		       unsigned int elems)
 {
 	if (ctrl == NULL)
 		return;
-	memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
+	memcpy(to.p, from.p_const, elems * ctrl->elem_size);
 }
 
 /* Copy the new value to the current value. */
@@ -1008,8 +1009,11 @@ void new_to_cur(struct v4l2_fh *fh, struct v4l2_ctrl *ctrl, u32 ch_flags)
 
 	/* has_changed is set by cluster_changed */
 	changed = ctrl->has_changed;
-	if (changed)
-		ptr_to_ptr(ctrl, ctrl->p_new, ctrl->p_cur);
+	if (changed) {
+		if (ctrl->is_dyn_array)
+			ctrl->elems = ctrl->new_elems;
+		ptr_to_ptr(ctrl, ctrl->p_new, ctrl->p_cur, ctrl->elems);
+	}
 
 	if (ch_flags & V4L2_EVENT_CTRL_CH_FLAGS) {
 		/* Note: CH_FLAGS is only set for auto clusters. */
@@ -1039,36 +1043,122 @@ void cur_to_new(struct v4l2_ctrl *ctrl)
 {
 	if (ctrl == NULL)
 		return;
-	ptr_to_ptr(ctrl, ctrl->p_cur, ctrl->p_new);
+	if (ctrl->is_dyn_array)
+		ctrl->new_elems = ctrl->elems;
+	ptr_to_ptr(ctrl, ctrl->p_cur, ctrl->p_new, ctrl->new_elems);
+}
+
+static bool req_alloc_dyn_array(struct v4l2_ctrl_ref *ref, u32 elems)
+{
+	void *tmp;
+
+	if (elems < ref->p_req_dyn_alloc_elems)
+		return true;
+
+	tmp = kvmalloc(elems * ref->ctrl->elem_size, GFP_KERNEL);
+
+	if (!tmp) {
+		ref->p_req_dyn_enomem = true;
+		return false;
+	}
+	ref->p_req_dyn_enomem = false;
+	kvfree(ref->p_req.p);
+	ref->p_req.p = tmp;
+	ref->p_req_dyn_alloc_elems = elems;
+	return true;
 }
 
 /* Copy the new value to the request value */
 void new_to_req(struct v4l2_ctrl_ref *ref)
 {
+	struct v4l2_ctrl *ctrl;
+
 	if (!ref)
 		return;
-	ptr_to_ptr(ref->ctrl, ref->ctrl->p_new, ref->p_req);
-	ref->valid_p_req = true;
+
+	ctrl = ref->ctrl;
+	if (ctrl->is_dyn_array && !req_alloc_dyn_array(ref, ctrl->new_elems))
+		return;
+
+	ref->p_req_elems = ctrl->new_elems;
+	ptr_to_ptr(ctrl, ctrl->p_new, ref->p_req, ref->p_req_elems);
+	ref->p_req_valid = true;
 }
 
 /* Copy the current value to the request value */
 void cur_to_req(struct v4l2_ctrl_ref *ref)
 {
+	struct v4l2_ctrl *ctrl;
+
 	if (!ref)
 		return;
-	ptr_to_ptr(ref->ctrl, ref->ctrl->p_cur, ref->p_req);
-	ref->valid_p_req = true;
+
+	ctrl = ref->ctrl;
+	if (ctrl->is_dyn_array && !req_alloc_dyn_array(ref, ctrl->elems))
+		return;
+
+	ref->p_req_elems = ctrl->elems;
+	ptr_to_ptr(ctrl, ctrl->p_cur, ref->p_req, ctrl->elems);
+	ref->p_req_valid = true;
 }
 
 /* Copy the request value to the new value */
-void req_to_new(struct v4l2_ctrl_ref *ref)
+int req_to_new(struct v4l2_ctrl_ref *ref)
 {
+	struct v4l2_ctrl *ctrl;
+
 	if (!ref)
-		return;
-	if (ref->valid_p_req)
-		ptr_to_ptr(ref->ctrl, ref->p_req, ref->ctrl->p_new);
-	else
-		ptr_to_ptr(ref->ctrl, ref->ctrl->p_cur, ref->ctrl->p_new);
+		return 0;
+
+	ctrl = ref->ctrl;
+
+	/*
+	 * This control was never set in the request, so just use the current
+	 * value.
+	 */
+	if (!ref->p_req_valid) {
+		if (ctrl->is_dyn_array)
+			ctrl->new_elems = ctrl->elems;
+		ptr_to_ptr(ctrl, ctrl->p_cur, ctrl->p_new, ctrl->new_elems);
+		return 0;
+	}
+
+	/* Not a dynamic array, so just copy the request value */
+	if (!ctrl->is_dyn_array) {
+		ptr_to_ptr(ctrl, ref->p_req, ctrl->p_new, ctrl->new_elems);
+		return 0;
+	}
+
+	/* Sanity check, should never happen */
+	if (WARN_ON(!ref->p_req_dyn_alloc_elems))
+		return -ENOMEM;
+
+	/*
+	 * Check if the number of elements in the request is more than the
+	 * elements in ctrl->p_dyn. If so, attempt to realloc ctrl->p_dyn.
+	 * Note that p_dyn is allocated with twice the number of elements
+	 * in the dynamic array since it has to store both the current and
+	 * new value of such a control.
+	 */
+	if (ref->p_req_elems > ctrl->p_dyn_alloc_elems) {
+		unsigned int sz = ref->p_req_elems * ctrl->elem_size;
+		void *old = ctrl->p_dyn;
+		void *tmp = kvzalloc(2 * sz, GFP_KERNEL);
+
+		if (!tmp)
+			return -ENOMEM;
+		memcpy(tmp, ctrl->p_new.p, ctrl->elems * ctrl->elem_size);
+		memcpy(tmp + sz, ctrl->p_cur.p, ctrl->elems * ctrl->elem_size);
+		ctrl->p_new.p = tmp;
+		ctrl->p_cur.p = tmp + sz;
+		ctrl->p_dyn = tmp;
+		ctrl->p_dyn_alloc_elems = ref->p_req_elems;
+		kvfree(old);
+	}
+
+	ctrl->new_elems = ref->p_req_elems;
+	ptr_to_ptr(ctrl, ref->p_req, ctrl->p_new, ctrl->new_elems);
+	return 0;
 }
 
 /* Control range checking */
@@ -1110,17 +1200,6 @@ int check_range(enum v4l2_ctrl_type type,
 	}
 }
 
-/* Validate a new control */
-int validate_new(const struct v4l2_ctrl *ctrl, union v4l2_ctrl_ptr p_new)
-{
-	unsigned idx;
-	int err = 0;
-
-	for (idx = 0; !err && idx < ctrl->elems; idx++)
-		err = ctrl->type_ops->validate(ctrl, idx, p_new);
-	return err;
-}
-
 /* Set the handler's error code if it wasn't set earlier already */
 static inline int handler_set_err(struct v4l2_ctrl_handler *hdl, int err)
 {
@@ -1165,6 +1244,8 @@ void v4l2_ctrl_handler_free(struct v4l2_ctrl_handler *hdl)
 	/* Free all nodes */
 	list_for_each_entry_safe(ref, next_ref, &hdl->ctrl_refs, node) {
 		list_del(&ref->node);
+		if (ref->p_req_dyn_alloc_elems)
+			kvfree(ref->p_req.p);
 		kfree(ref);
 	}
 	/* Free all controls owned by the handler */
@@ -1172,6 +1253,7 @@ void v4l2_ctrl_handler_free(struct v4l2_ctrl_handler *hdl)
 		list_del(&ctrl->node);
 		list_for_each_entry_safe(sev, next_sev, &ctrl->ev_subs, node)
 			list_del(&sev->node);
+		kvfree(ctrl->p_dyn);
 		kvfree(ctrl);
 	}
 	kvfree(hdl->buckets);
@@ -1287,7 +1369,7 @@ int handler_new_ref(struct v4l2_ctrl_handler *hdl,
 	if (hdl->error)
 		return hdl->error;
 
-	if (allocate_req)
+	if (allocate_req && !ctrl->is_dyn_array)
 		size_extra_req = ctrl->elems * ctrl->elem_size;
 	new_ref = kzalloc(sizeof(*new_ref) + size_extra_req, GFP_KERNEL);
 	if (!new_ref)
@@ -1461,7 +1543,6 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 			elem_size = sizeof(s32);
 		break;
 	}
-	tot_ctrl_size = elem_size * elems;
 
 	/* Sanity checks */
 	if (id == 0 || name == NULL || !elem_size ||
@@ -1482,17 +1563,33 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 		handler_set_err(hdl, -EINVAL);
 		return NULL;
 	}
+	if (flags & V4L2_CTRL_FLAG_DYNAMIC_ARRAY) {
+		/*
+		 * For now only support this for one-dimensional arrays only.
+		 *
+		 * This can be relaxed in the future, but this will
+		 * require more effort.
+		 */
+		if (nr_of_dims != 1) {
+			handler_set_err(hdl, -EINVAL);
+			return NULL;
+		}
+		/* Start with just 1 element */
+		elems = 1;
+	}
 
+	tot_ctrl_size = elem_size * elems;
 	sz_extra = 0;
 	if (type == V4L2_CTRL_TYPE_BUTTON)
 		flags |= V4L2_CTRL_FLAG_WRITE_ONLY |
 			V4L2_CTRL_FLAG_EXECUTE_ON_WRITE;
 	else if (type == V4L2_CTRL_TYPE_CTRL_CLASS)
 		flags |= V4L2_CTRL_FLAG_READ_ONLY;
-	else if (type == V4L2_CTRL_TYPE_INTEGER64 ||
-		 type == V4L2_CTRL_TYPE_STRING ||
-		 type >= V4L2_CTRL_COMPOUND_TYPES ||
-		 is_array)
+	else if (!(flags & V4L2_CTRL_FLAG_DYNAMIC_ARRAY) &&
+		 (type == V4L2_CTRL_TYPE_INTEGER64 ||
+		  type == V4L2_CTRL_TYPE_STRING ||
+		  type >= V4L2_CTRL_COMPOUND_TYPES ||
+		  is_array))
 		sz_extra += 2 * tot_ctrl_size;
 
 	if (type >= V4L2_CTRL_COMPOUND_TYPES && p_def.p_const)
@@ -1521,7 +1618,9 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	ctrl->is_ptr = is_array || type >= V4L2_CTRL_COMPOUND_TYPES || ctrl->is_string;
 	ctrl->is_int = !ctrl->is_ptr && type != V4L2_CTRL_TYPE_INTEGER64;
 	ctrl->is_array = is_array;
+	ctrl->is_dyn_array = !!(flags & V4L2_CTRL_FLAG_DYNAMIC_ARRAY);
 	ctrl->elems = elems;
+	ctrl->new_elems = elems;
 	ctrl->nr_of_dims = nr_of_dims;
 	if (nr_of_dims)
 		memcpy(ctrl->dims, dims, nr_of_dims * sizeof(dims[0]));
@@ -1534,6 +1633,16 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	ctrl->cur.val = ctrl->val = def;
 	data = &ctrl[1];
 
+	if (ctrl->is_dyn_array) {
+		ctrl->p_dyn_alloc_elems = elems;
+		ctrl->p_dyn = kvzalloc(2 * elems * elem_size, GFP_KERNEL);
+		if (!ctrl->p_dyn) {
+			kvfree(ctrl);
+			return NULL;
+		}
+		data = ctrl->p_dyn;
+	}
+
 	if (!ctrl->is_int) {
 		ctrl->p_new.p = data;
 		ctrl->p_cur.p = data + tot_ctrl_size;
@@ -1543,7 +1652,10 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	}
 
 	if (type >= V4L2_CTRL_COMPOUND_TYPES && p_def.p_const) {
-		ctrl->p_def.p = ctrl->p_cur.p + tot_ctrl_size;
+		if (ctrl->is_dyn_array)
+			ctrl->p_def.p = &ctrl[1];
+		else
+			ctrl->p_def.p = ctrl->p_cur.p + tot_ctrl_size;
 		memcpy(ctrl->p_def.p, p_def.p_const, elem_size);
 	}
 
@@ -1553,6 +1665,7 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	}
 
 	if (handler_new_ref(hdl, ctrl, NULL, false, false)) {
+		kvfree(ctrl->p_dyn);
 		kvfree(ctrl);
 		return NULL;
 	}
@@ -1890,6 +2003,9 @@ static int cluster_changed(struct v4l2_ctrl *master)
 			continue;
 		}
 
+		if (ctrl->elems != ctrl->new_elems)
+			ctrl_changed = true;
+
 		for (idx = 0; !ctrl_changed && idx < ctrl->elems; idx++)
 			ctrl_changed = !ctrl->type_ops->equal(ctrl, idx,
 				ctrl->p_cur, ctrl->p_new);
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-priv.h b/drivers/media/v4l2-core/v4l2-ctrls-priv.h
index d4bf2c716f97..aba6176fab6c 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-priv.h
+++ b/drivers/media/v4l2-core/v4l2-ctrls-priv.h
@@ -57,10 +57,9 @@ void cur_to_new(struct v4l2_ctrl *ctrl);
 void cur_to_req(struct v4l2_ctrl_ref *ref);
 void new_to_cur(struct v4l2_fh *fh, struct v4l2_ctrl *ctrl, u32 ch_flags);
 void new_to_req(struct v4l2_ctrl_ref *ref);
-void req_to_new(struct v4l2_ctrl_ref *ref);
+int req_to_new(struct v4l2_ctrl_ref *ref);
 void send_initial_event(struct v4l2_fh *fh, struct v4l2_ctrl *ctrl);
 void send_event(struct v4l2_fh *fh, struct v4l2_ctrl *ctrl, u32 changes);
-int validate_new(const struct v4l2_ctrl *ctrl, union v4l2_ctrl_ptr p_new);
 int handler_new_ref(struct v4l2_ctrl_handler *hdl,
 		    struct v4l2_ctrl *ctrl,
 		    struct v4l2_ctrl_ref **ctrl_ref,
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-request.c b/drivers/media/v4l2-core/v4l2-ctrls-request.c
index 7d098f287fd9..c637049d7a2b 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-request.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-request.c
@@ -143,7 +143,7 @@ v4l2_ctrl_request_hdl_ctrl_find(struct v4l2_ctrl_handler *hdl, u32 id)
 {
 	struct v4l2_ctrl_ref *ref = find_ref_lock(hdl, id);
 
-	return (ref && ref->valid_p_req) ? ref->ctrl : NULL;
+	return (ref && ref->p_req_valid) ? ref->ctrl : NULL;
 }
 EXPORT_SYMBOL_GPL(v4l2_ctrl_request_hdl_ctrl_find);
 
@@ -373,7 +373,7 @@ void v4l2_ctrl_request_complete(struct media_request *req,
 			v4l2_ctrl_unlock(master);
 			continue;
 		}
-		if (ref->valid_p_req)
+		if (ref->p_req_valid)
 			continue;
 
 		/* Copy the current control value into the request */
@@ -442,7 +442,7 @@ int v4l2_ctrl_request_setup(struct media_request *req,
 				struct v4l2_ctrl_ref *r =
 					find_ref(hdl, master->cluster[i]->id);
 
-				if (r->valid_p_req) {
+				if (r->p_req_valid) {
 					have_new_data = true;
 					break;
 				}
@@ -458,7 +458,11 @@ int v4l2_ctrl_request_setup(struct media_request *req,
 				struct v4l2_ctrl_ref *r =
 					find_ref(hdl, master->cluster[i]->id);
 
-				req_to_new(r);
+				ret = req_to_new(r);
+				if (ret) {
+					v4l2_ctrl_unlock(master);
+					goto error;
+				}
 				master->cluster[i]->is_new = 1;
 				r->req_done = true;
 			}
@@ -490,6 +494,7 @@ int v4l2_ctrl_request_setup(struct media_request *req,
 			break;
 	}
 
+error:
 	media_request_object_put(obj);
 	return ret;
 }
diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h
index b3ce438f1329..f4105de8a8d2 100644
--- a/include/media/v4l2-ctrls.h
+++ b/include/media/v4l2-ctrls.h
@@ -185,6 +185,8 @@ typedef void (*v4l2_ctrl_notify_fnc)(struct v4l2_ctrl *ctrl, void *priv);
  *		and/or has type %V4L2_CTRL_TYPE_STRING. In other words, &struct
  *		v4l2_ext_control uses field p to point to the data.
  * @is_array: If set, then this control contains an N-dimensional array.
+ * @is_dyn_array: If set, then this control contains a dynamically sized 1-dimensional array.
+ *		If this is set, then @is_array is also set.
  * @has_volatiles: If set, then one or more members of the cluster are volatile.
  *		Drivers should never touch this flag.
  * @call_notify: If set, then call the handler's notify function whenever the
@@ -205,6 +207,9 @@ typedef void (*v4l2_ctrl_notify_fnc)(struct v4l2_ctrl *ctrl, void *priv);
  * @step:	The control's step value for non-menu controls.
  * @elems:	The number of elements in the N-dimensional array.
  * @elem_size:	The size in bytes of the control.
+ * @new_elems:	The number of elements in p_new. This is the same as @elems,
+ *		except for dynamic arrays. In that case it is in the range of
+ *		1 to @p_dyn_alloc_elems.
  * @dims:	The size of each dimension.
  * @nr_of_dims:The number of dimensions in @dims.
  * @menu_skip_mask: The control's skip mask for menu controls. This makes it
@@ -223,15 +228,21 @@ typedef void (*v4l2_ctrl_notify_fnc)(struct v4l2_ctrl *ctrl, void *priv);
  *		:math:`ceil(\frac{maximum - minimum}{step}) + 1`.
  *		Used only if the @type is %V4L2_CTRL_TYPE_INTEGER_MENU.
  * @flags:	The control's flags.
- * @cur:	Structure to store the current value.
- * @cur.val:	The control's current value, if the @type is represented via
- *		a u32 integer (see &enum v4l2_ctrl_type).
- * @val:	The control's new s32 value.
  * @priv:	The control's private pointer. For use by the driver. It is
  *		untouched by the control framework. Note that this pointer is
  *		not freed when the control is deleted. Should this be needed
  *		then a new internal bitfield can be added to tell the framework
  *		to free this pointer.
+ * @p_dyn:	Pointer to the dynamically allocated array. Only valid if
+ *		@is_dyn_array is true.
+ * @p_dyn_alloc_elems: The number of elements in the dynamically allocated
+ *		array for both the cur and new values. So @p_dyn is actually
+ *		sized for 2 * @p_dyn_alloc_elems * @elem_size. Only valid if
+ *		@is_dyn_array is true.
+ * @cur:	Structure to store the current value.
+ * @cur.val:	The control's current value, if the @type is represented via
+ *		a u32 integer (see &enum v4l2_ctrl_type).
+ * @val:	The control's new s32 value.
  * @p_def:	The control's default value represented via a union which
  *		provides a standard way of accessing control types
  *		through a pointer (for compound controls only).
@@ -260,6 +271,7 @@ struct v4l2_ctrl {
 	unsigned int is_string:1;
 	unsigned int is_ptr:1;
 	unsigned int is_array:1;
+	unsigned int is_dyn_array:1;
 	unsigned int has_volatiles:1;
 	unsigned int call_notify:1;
 	unsigned int manual_mode_value:8;
@@ -272,6 +284,7 @@ struct v4l2_ctrl {
 	s64 minimum, maximum, default_value;
 	u32 elems;
 	u32 elem_size;
+	u32 new_elems;
 	u32 dims[V4L2_CTRL_MAX_DIMS];
 	u32 nr_of_dims;
 	union {
@@ -284,6 +297,8 @@ struct v4l2_ctrl {
 	};
 	unsigned long flags;
 	void *priv;
+	void *p_dyn;
+	u32 p_dyn_alloc_elems;
 	s32 val;
 	struct {
 		s32 val;
@@ -309,12 +324,22 @@ struct v4l2_ctrl {
  *		the control has been applied. This prevents applying controls
  *		from a cluster with multiple controls twice (when the first
  *		control of a cluster is applied, they all are).
- * @valid_p_req: If set, then p_req contains the control value for the request.
+ * @p_req_valid: If set, then p_req contains the control value for the request.
+ * @p_req_dyn_enomem: If set, then p_req is invalid since allocating space for
+ *		a dynamic array failed. Attempting to read this value shall
+ *		result in ENOMEM. Only valid if ctrl->is_dyn_array is true.
+ * @p_req_dyn_alloc_elems: The number of elements allocated for the dynamic
+ *		array. Only valid if @p_req_valid and ctrl->is_dyn_array are
+ *		true.
+ * @p_req_elems: The number of elements in @p_req. This is the same as
+ *		ctrl->elems, except for dynamic arrays. In that case it is in
+ *		the range of 1 to @p_req_dyn_alloc_elems. Only valid if
+ *		@p_req_valid is true.
  * @p_req:	If the control handler containing this control reference
  *		is bound to a media request, then this points to the
  *		value of the control that must be applied when the request
  *		is executed, or to the value of the control at the time
- *		that the request was completed. If @valid_p_req is false,
+ *		that the request was completed. If @p_req_valid is false,
  *		then this control was never set for this request and the
  *		control will not be updated when this request is applied.
  *
@@ -329,7 +354,10 @@ struct v4l2_ctrl_ref {
 	struct v4l2_ctrl_helper *helper;
 	bool from_other_dev;
 	bool req_done;
-	bool valid_p_req;
+	bool p_req_valid;
+	bool p_req_dyn_enomem;
+	u32 p_req_dyn_alloc_elems;
+	u32 p_req_elems;
 	union v4l2_ctrl_ptr p_req;
 };
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 3/9] vivid: add dynamic array test control
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 2/9] v4l2-ctrls: add support for dynamically allocated arrays Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-06-14 21:00   ` Laurent Pinchart
  2022-05-03  9:39 ` [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver Xavier Roumegue
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: linux-media, devicetree

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Add a dynamic array test control to help test support for this
feature.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
---
 drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
index e7516dc1227b..7267892dc18a 100644
--- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
+++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
@@ -34,6 +34,7 @@
 #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
 #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
 #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
+#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
 
 #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
 #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
@@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
 	.dims = { 1 },
 };
 
+static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
+	.ops = &vivid_user_gen_ctrl_ops,
+	.id = VIVID_CID_U32_DYN_ARRAY,
+	.name = "U32 Dynamic Array",
+	.type = V4L2_CTRL_TYPE_U32,
+	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
+	.def = 50,
+	.min = 10,
+	.max = 90,
+	.step = 1,
+	.dims = { 100 },
+};
+
 static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
 	.ops = &vivid_user_gen_ctrl_ops,
 	.id = VIVID_CID_U16_MATRIX,
@@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
 	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
 	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
 	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
+	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
 	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
 	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (2 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 3/9] vivid: add dynamic array test control Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-06-14 21:13   ` Laurent Pinchart
  2022-05-03  9:39 ` [PATCH v5 5/9] media: v4l: uapi: Add user control base for DW100 controls Xavier Roumegue
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

Add user documentation for the DW100 driver.

while at it, replace spaces with tab on drivers list.

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
---
 .../userspace-api/media/drivers/dw100.rst     | 90 +++++++++++++++++++
 .../userspace-api/media/drivers/index.rst     |  3 +-
 2 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/userspace-api/media/drivers/dw100.rst

diff --git a/Documentation/userspace-api/media/drivers/dw100.rst b/Documentation/userspace-api/media/drivers/dw100.rst
new file mode 100644
index 000000000000..1e606459cf47
--- /dev/null
+++ b/Documentation/userspace-api/media/drivers/dw100.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+DW100 dewarp driver
+===================
+
+The Vivante DW100 Dewarp Processor IP core found on i.MX8MP SoC applies a
+programmable geometrical transformation on input image to correct distortion
+introduced by lenses.
+
+The transformation function is exposed by the hardware as a grid map with 16x16
+pixel macroblocks indexed using X, Y vertex coordinates.
+::
+
+                          Image width
+           <--------------------------------------->
+
+      ^    .-------.-------.-------.-------.-------.
+      |    | 16x16 |       |       |       |       |
+   I  |    | pixel |       |       |       |       |
+   m  |    | block |       |       |       |       |
+   a  |    .-------.-------.-------.-------.-------.
+   g  |    |       |       |       |       |       |
+   e  |    |       |       |       |       |       |
+      |    |       |       |       |       |       |
+   h  |    .-------.-------.-------.-------.-------.
+   e  |    |       |       |       |       |       |
+   i  |    |       |       |       |       |       |
+   g  |    |       |       |       |       |       |
+   h  |    .-------.-------.-------.-------.-------.
+   t  |    |       |       |       |       |       |
+      |    |       |       |       |       |       |
+      |    |       |       |       |       |       |
+      v    '-------'-------'-------'-------'-------'
+
+            Grid of Image Blocks for Dewarping Map
+
+
+Each x, y coordinate register uses 16 bits to record the coordinate address in
+an unsigned 12.4 fixed point format (UQ12.4).
+::
+
+    .----------------------.--------..----------------------.--------.
+    |         31~20        | 19~16  ||         15~4         |  3~0   |
+    |       (integer)      | (frac) ||       (integer)      | (frac) |
+    '----------------------'--------''----------------------'--------'
+    <-------------------------------><------------------------------->
+                Y coordinate                     X coordinate
+
+                           Remap Register Layout
+
+The dewarping remap contains two interpolations: coordinate interpolation and
+pixel interpolation as shown in the figure below.
+::
+
+                           P1                 P2
+                        (x1, y1)           (x2, y2)
+                           .-----------------.
+                           |                 |
+                           |      P          |
+                           |   (x, y)        |
+                           |      *          |
+                           |                 |
+                           |                 |
+                           '-----------------'
+                           P3                P4
+                        (x3, y3)          (x4, y4)
+
+                          Remap with Coordinate
+                          and Pixel Interpolation
+
+
+The dewarping map is set from applications using the
+V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP control. The control contains
+an array of u32 values storing (x, y) destination coordinates for each
+vertex of the grid. The x coordinate is stored in the 16 LSBs and the y
+coordinate in the 16 MSBs.
+
+The number of elements in the array must match the image size:
+
+.. code-block:: C
+
+    elems = (DIV_ROUND_UP(width, 16) + 1) * (DIV_ROUND_UP(height, 16) + 1);
+
+If the control doesn't contain the correct number of elements, the driver uses
+an identity map.
+
+More details on the DW100 hardware operations can be found in
+*chapter 13.15 DeWarp* of IMX8MP_ reference manual.
+
+.. _IMX8MP: https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
diff --git a/Documentation/userspace-api/media/drivers/index.rst b/Documentation/userspace-api/media/drivers/index.rst
index 12e3c512d718..0c720ca1a27d 100644
--- a/Documentation/userspace-api/media/drivers/index.rst
+++ b/Documentation/userspace-api/media/drivers/index.rst
@@ -33,7 +33,8 @@ For more details see the file COPYING in the source distribution of Linux.
 
 	ccs
 	cx2341x-uapi
-        hantro
+	dw100
+	hantro
 	imx-uapi
 	max2175
 	meye-uapi
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 5/9] media: v4l: uapi: Add user control base for DW100 controls
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (3 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 6/9] media: uapi: Add a control for DW100 driver Xavier Roumegue
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

Add a control base for DW100 driver controls, and reserve 16 controls.

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 include/uapi/linux/v4l2-controls.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
index bb40129446d4..c78618acc9c6 100644
--- a/include/uapi/linux/v4l2-controls.h
+++ b/include/uapi/linux/v4l2-controls.h
@@ -225,6 +225,12 @@ enum v4l2_colorfx {
  */
 #define V4L2_CID_USER_ISL7998X_BASE		(V4L2_CID_USER_BASE + 0x1180)
 
+/*
+ * The base for DW100 driver controls.
+ * We reserve 16 controls for this driver.
+ */
+#define V4L2_CID_USER_DW100_BASE		(V4L2_CID_USER_BASE + 0x1190)
+
 /* MPEG-class control IDs */
 /* The MPEG controls are applicable to all codec controls
  * and the 'MPEG' part of the define is historical */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 6/9] media: uapi: Add a control for DW100 driver
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (4 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 5/9] media: v4l: uapi: Add user control base for DW100 controls Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-06-14 21:07   ` Laurent Pinchart
  2022-05-03  9:39 ` [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding Xavier Roumegue
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

The DW100 driver gets the dewarping mapping as a binary blob from the
userspace application through a custom control.
The blob format is hardware specific so create a dedicated control for
this purpose.

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
---
 .../userspace-api/media/drivers/dw100.rst          | 13 +++++++++++++
 include/uapi/linux/dw100.h                         | 14 ++++++++++++++
 2 files changed, 27 insertions(+)
 create mode 100644 include/uapi/linux/dw100.h

diff --git a/Documentation/userspace-api/media/drivers/dw100.rst b/Documentation/userspace-api/media/drivers/dw100.rst
index 1e606459cf47..1ea727698ea1 100644
--- a/Documentation/userspace-api/media/drivers/dw100.rst
+++ b/Documentation/userspace-api/media/drivers/dw100.rst
@@ -87,4 +87,17 @@ an identity map.
 More details on the DW100 hardware operations can be found in
 *chapter 13.15 DeWarp* of IMX8MP_ reference manual.
 
+The Vivante DW100 m2m driver implements the following driver-specific control:
+
+``V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP (__u32 array)``
+    Specifies to DW100 driver its dewarping map (aka LUT) blob as described in
+    *chapter 13.15.2.3 Dewarping Remap* of IMX8MP_ reference manual as an U32
+    dynamic array. The image is divided into many small 16x16 blocks. If the
+    width/height of the image is not divisible by 16, the size of the
+    rightmost/bottommost block is the remainder. The dewarping map only saves
+    the vertex coordinates of the block. The dewarping grid map is comprised of
+    vertex coordinates for x and y. Each x, y coordinate register uses 16 bits
+    (UQ12.4) to record the coordinate address, with the Y coordinate in the
+    upper bits and X in the lower bits.
+
 .. _IMX8MP: https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
diff --git a/include/uapi/linux/dw100.h b/include/uapi/linux/dw100.h
new file mode 100644
index 000000000000..3356496edd6b
--- /dev/null
+++ b/include/uapi/linux/dw100.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/* Copyright 2022 NXP */
+
+#ifndef __UAPI_DW100_H__
+#define __UAPI_DW100_H__
+
+#include <linux/v4l2-controls.h>
+
+/*
+ * Check Documentation/userspace-api/media/drivers/dw100.rst for control details.
+ */
+#define V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP (V4L2_CID_USER_DW100_BASE + 1)
+
+#endif
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (5 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 6/9] media: uapi: Add a control for DW100 driver Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-05-03 11:59   ` Rob Herring
  2022-05-03  9:39 ` [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver Xavier Roumegue
  2022-05-03  9:39 ` [PATCH v5 9/9] media: MAINTAINERS: add entry for i.MX8MP DW100 v4l2 mem2mem driver Xavier Roumegue
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

Add DT binding documentation for the Vivante DW100 dewarper engine found
on NXP i.MX8MP SoC

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 .../devicetree/bindings/media/nxp,dw100.yaml  | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/nxp,dw100.yaml

diff --git a/Documentation/devicetree/bindings/media/nxp,dw100.yaml b/Documentation/devicetree/bindings/media/nxp,dw100.yaml
new file mode 100644
index 000000000000..21910ff0e1c3
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/nxp,dw100.yaml
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/media/nxp,dw100.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP i.MX8MP DW100 Dewarper core
+
+maintainers:
+  - Xavier Roumegue <xavier.roumegue@oss.nxp.com>
+
+description: |-
+  The Dewarp Engine provides high-performance dewarp processing for the
+  correction of the distortion that is introduced in images produced by fisheye
+  and wide angle lenses. It is implemented with a line/tile-cache based
+  architecture. With configurable address mapping look up tables and per tile
+  processing, it successfully generates a corrected output image.
+  The engine can be used to perform scaling, cropping and pixel format
+  conversion.
+
+properties:
+  compatible:
+    enum:
+      - nxp,imx8mp-dw100
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    items:
+      - description: The AXI clock
+      - description: The AHB clock
+
+  clock-names:
+    items:
+      - const: axi
+      - const: ahb
+
+  power-domains:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+  - power-domains
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/clock/imx8mp-clock.h>
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/power/imx8mp-power.h>
+
+    dewarp: dwe@32e30000 {
+            compatible = "nxp,imx8mp-dw100";
+            reg = <0x32e30000 0x10000>;
+            interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>;
+            clocks = <&clk IMX8MP_CLK_MEDIA_AXI_ROOT>,
+                     <&clk IMX8MP_CLK_MEDIA_APB_ROOT>;
+            clock-names = "axi", "ahb";
+            power-domains = <&media_blk_ctrl IMX8MP_MEDIABLK_PD_DWE>;
+    };
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (6 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  2022-06-14 22:54   ` Laurent Pinchart
  2022-05-03  9:39 ` [PATCH v5 9/9] media: MAINTAINERS: add entry for i.MX8MP DW100 v4l2 mem2mem driver Xavier Roumegue
  8 siblings, 1 reply; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

Add a V4L2 mem-to-mem driver for the Vivante DW100 Dewarp Processor IP
core found on i.MX8MP SoC.

The processor core applies a programmable geometrical transformation on
input image to correct distorsion introduced by lenses.
The transformation function is exposed as a grid map with 16x16 pixel
macroblocks indexed using X, Y vertex coordinates.

The dewarping map can be set from application through dedicated a v4l2
control. If not set or invalid, the driver computes an identity map
prior to start the processing engine.

The driver supports scaling, cropping and pixel format conversion.

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
---
 drivers/media/platform/nxp/Kconfig            |    1 +
 drivers/media/platform/nxp/Makefile           |    1 +
 drivers/media/platform/nxp/dw100/Kconfig      |   16 +
 drivers/media/platform/nxp/dw100/Makefile     |    3 +
 drivers/media/platform/nxp/dw100/dw100.c      | 1782 +++++++++++++++++
 drivers/media/platform/nxp/dw100/dw100_regs.h |  118 ++
 6 files changed, 1921 insertions(+)
 create mode 100644 drivers/media/platform/nxp/dw100/Kconfig
 create mode 100644 drivers/media/platform/nxp/dw100/Makefile
 create mode 100644 drivers/media/platform/nxp/dw100/dw100.c
 create mode 100644 drivers/media/platform/nxp/dw100/dw100_regs.h

diff --git a/drivers/media/platform/nxp/Kconfig b/drivers/media/platform/nxp/Kconfig
index a0810df751dc..e5085e5e585b 100644
--- a/drivers/media/platform/nxp/Kconfig
+++ b/drivers/media/platform/nxp/Kconfig
@@ -52,4 +52,5 @@ config VIDEO_MX2_EMMAPRP
 	    memory to memory. Operations include resizing and format
 	    conversion.
 
+source "drivers/media/platform/nxp/dw100/Kconfig"
 source "drivers/media/platform/nxp/imx-jpeg/Kconfig"
diff --git a/drivers/media/platform/nxp/Makefile b/drivers/media/platform/nxp/Makefile
index efc38c6578ce..22ba28ac6d63 100644
--- a/drivers/media/platform/nxp/Makefile
+++ b/drivers/media/platform/nxp/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
+obj-y += dw100/
 obj-y += imx-jpeg/
 
 obj-$(CONFIG_VIDEO_IMX_MIPI_CSIS) += imx-mipi-csis.o
diff --git a/drivers/media/platform/nxp/dw100/Kconfig b/drivers/media/platform/nxp/dw100/Kconfig
new file mode 100644
index 000000000000..86ccac6fd120
--- /dev/null
+++ b/drivers/media/platform/nxp/dw100/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config VIDEO_DW100
+	tristate "NXP i.MX DW100 dewarper"
+	depends on V4L_MEM2MEM_DRIVERS
+	depends on VIDEO_DEV
+	depends on ARCH_MXC || COMPILE_TEST
+	select VIDEOBUF2_DMA_CONTIG
+	select V4L2_MEM2MEM_DEV
+	help
+	  DW100 is a memory-to-memory engine performing geometrical
+	  transformation on source image through a programmable dewarping map.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called dw100.
+
diff --git a/drivers/media/platform/nxp/dw100/Makefile b/drivers/media/platform/nxp/dw100/Makefile
new file mode 100644
index 000000000000..49db80589e9a
--- /dev/null
+++ b/drivers/media/platform/nxp/dw100/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0+
+
+obj-$(CONFIG_VIDEO_DW100) += dw100.o
diff --git a/drivers/media/platform/nxp/dw100/dw100.c b/drivers/media/platform/nxp/dw100/dw100.c
new file mode 100644
index 000000000000..869ca12e0bcd
--- /dev/null
+++ b/drivers/media/platform/nxp/dw100/dw100.c
@@ -0,0 +1,1782 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * DW100 Hardware dewarper
+ *
+ * Copyright 2022 NXP
+ * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
+ *
+ */
+
+#include <linux/clk.h>
+#include <linux/debugfs.h>
+#include <linux/interrupt.h>
+#include <linux/minmax.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-event.h>
+#include <media/v4l2-ioctl.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include <uapi/linux/dw100.h>
+
+#include "dw100_regs.h"
+
+#define DRV_NAME "dw100"
+
+#define MIN_W 176
+#define MIN_H 144
+#define MAX_W 4096
+#define MAX_H 3072
+#define ALIGN_W 3
+#define ALIGN_H 3
+
+#define DW100_BLOCK_SIZE 16
+
+#define DW100_MIN_LUT_NELEMS		(((MIN_W / DW100_BLOCK_SIZE) + 1) \
+					 * ((MIN_H / DW100_BLOCK_SIZE) + 1))
+
+#define DW100_MAX_LUT_NELEMS		(((MAX_W / DW100_BLOCK_SIZE) + 1) \
+					 * ((MAX_H / DW100_BLOCK_SIZE) + 1))
+
+/*
+ * 16 controls have been reserved for this driver for future extension, but
+ * let's limit the related driver allocation to the effective number of controls
+ * in use.
+ */
+#define DW100_MAX_CTRLS 1
+#define DW100_CTRL_DEWARPING_MAP 0
+
+static unsigned int debug;
+module_param(debug, uint, 0644);
+MODULE_PARM_DESC(debug, "Activate debug info");
+
+#define dprintk(lvl, dev, fmt, arg...) \
+	v4l2_dbg(lvl, debug, &(dev)->v4l2_dev, "%s: " fmt, __func__, ## arg)
+
+enum {
+	V4L2_M2M_SRC = 0,
+	V4L2_M2M_DST = 1,
+};
+
+enum {
+	V4L2_M2M_CAPTURE = 1,
+	V4L2_M2M_OUTPUT = 2,
+};
+
+static struct dw100_fmt {
+	u32 fourcc;
+	int depth;
+	u32 types;
+	u32 reg_format;
+	bool reg_swap_uv;
+} formats[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_NV16,
+		.depth = 16,
+		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
+		.reg_swap_uv = false,
+	}, {
+		.fourcc = V4L2_PIX_FMT_NV61,
+		.depth = 16,
+		.types = V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
+		.reg_swap_uv = true,
+	}, {
+		.fourcc = V4L2_PIX_FMT_YUYV,
+		.depth = 16,
+		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
+		.reg_swap_uv = false,
+	}, {
+		.fourcc = V4L2_PIX_FMT_UYVY,
+		.depth = 16,
+		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
+		.reg_swap_uv = true,
+	}, {
+		.fourcc = V4L2_PIX_FMT_NV12,
+		.depth = 12,
+		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
+		.reg_swap_uv = false,
+	}, {
+		.fourcc = V4L2_PIX_FMT_NV21,
+		.depth = 12,
+		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
+		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
+		.reg_swap_uv = true,
+	},
+};
+
+static inline int to_dw100_fmt_type(enum v4l2_buf_type type)
+{
+	if (V4L2_TYPE_IS_OUTPUT(type))
+		return V4L2_M2M_OUTPUT;
+	else
+		return V4L2_M2M_CAPTURE;
+}
+
+#define NUM_FORMATS ARRAY_SIZE(formats)
+
+static struct dw100_fmt *find_pixel_format(u32 pixel_format, int fmt_type)
+{
+	struct dw100_fmt *fmt;
+	unsigned int k;
+
+	for (k = 0; k < NUM_FORMATS; k++) {
+		fmt = &formats[k];
+		if (fmt->fourcc == pixel_format && fmt->types & fmt_type)
+			return fmt;
+	}
+
+	return NULL;
+}
+
+static struct dw100_fmt *find_format(struct v4l2_format *f)
+{
+	return find_pixel_format(f->fmt.pix.pixelformat,
+			      to_dw100_fmt_type(f->type));
+}
+
+static inline u32 dw100_bytesperline(struct dw100_fmt *fmt, u32 width)
+{
+	switch (fmt->reg_format) {
+	case DW100_DEWARP_CTRL_FORMAT_YUV422_SP:
+	case DW100_DEWARP_CTRL_FORMAT_YUV420_SP:
+		return width;
+	case DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED:
+	default:
+		return (fmt->depth * width) >> 3;
+	}
+}
+
+static inline u32 dw100_sizeimage(struct dw100_fmt *fmt, u32 width, u32 height)
+{
+	return (fmt->depth * width * height) >> 3;
+}
+
+struct dw100_device {
+	struct platform_device *pdev;
+	struct v4l2_m2m_dev	*m2m_dev;
+	struct v4l2_device	v4l2_dev;
+	struct video_device	vfd;
+#ifdef CONFIG_MEDIA_CONTROLLER
+	struct media_device	mdev;
+#endif
+	struct mutex		vfd_mutex;
+	spinlock_t		irqlock;
+	void __iomem		*mmio;
+	struct clk_bulk_data	*clks;
+	int			num_clks;
+	struct dentry		*debugfs_root;
+};
+
+static int dw100_dump_regs(struct dw100_device *dw_dev)
+{
+#define __DECLARE_REG(x) { #x, x }
+	int i;
+	struct reg_desc {
+		const char * const name;
+		unsigned int addr;
+	} dw100_regs[] = {
+		__DECLARE_REG(DW100_DEWARP_ID),
+		__DECLARE_REG(DW100_DEWARP_CTRL),
+		__DECLARE_REG(DW100_MAP_LUT_ADDR),
+		__DECLARE_REG(DW100_MAP_LUT_SIZE),
+		__DECLARE_REG(DW100_MAP_LUT_ADDR2),
+		__DECLARE_REG(DW100_MAP_LUT_SIZE2),
+		__DECLARE_REG(DW100_SRC_IMG_Y_BASE),
+		__DECLARE_REG(DW100_SRC_IMG_UV_BASE),
+		__DECLARE_REG(DW100_SRC_IMG_SIZE),
+		__DECLARE_REG(DW100_SRC_IMG_STRIDE),
+		__DECLARE_REG(DW100_DST_IMG_Y_BASE),
+		__DECLARE_REG(DW100_DST_IMG_UV_BASE),
+		__DECLARE_REG(DW100_DST_IMG_SIZE),
+		__DECLARE_REG(DW100_DST_IMG_STRIDE),
+		__DECLARE_REG(DW100_DST_IMG_Y_SIZE1),
+		__DECLARE_REG(DW100_DST_IMG_UV_SIZE1),
+		__DECLARE_REG(DW100_SRC_IMG_Y_BASE2),
+		__DECLARE_REG(DW100_SRC_IMG_UV_BASE2),
+		__DECLARE_REG(DW100_SRC_IMG_SIZE2),
+		__DECLARE_REG(DW100_SRC_IMG_STRIDE2),
+		__DECLARE_REG(DW100_DST_IMG_Y_BASE2),
+		__DECLARE_REG(DW100_DST_IMG_UV_BASE2),
+		__DECLARE_REG(DW100_DST_IMG_SIZE2),
+		__DECLARE_REG(DW100_DST_IMG_STRIDE2),
+		__DECLARE_REG(DW100_DST_IMG_Y_SIZE2),
+		__DECLARE_REG(DW100_DST_IMG_UV_SIZE2),
+		__DECLARE_REG(DW100_SWAP_CONTROL),
+		__DECLARE_REG(DW100_VERTICAL_SPLIT_LINE),
+		__DECLARE_REG(DW100_HORIZON_SPLIT_LINE),
+		__DECLARE_REG(DW100_SCALE_FACTOR),
+		__DECLARE_REG(DW100_ROI_START),
+		__DECLARE_REG(DW100_BOUNDARY_PIXEL),
+		__DECLARE_REG(DW100_INTERRUPT_STATUS),
+		__DECLARE_REG(DW100_BUS_CTRL),
+		__DECLARE_REG(DW100_BUS_CTRL1),
+		__DECLARE_REG(DW100_BUS_TIME_OUT_CYCLE),
+	};
+
+	for (i = 0; i < ARRAY_SIZE(dw100_regs); i++) {
+		dev_info(&dw_dev->pdev->dev, "%s: %#x\n",
+			 dw100_regs[i].name,
+			 readl(dw_dev->mmio + dw100_regs[i].addr));
+	}
+
+	return 0;
+}
+
+struct dw100_q_data {
+	unsigned int		width;
+	unsigned int		height;
+	unsigned int		bytesperline;
+	unsigned int		sizeimage;
+	unsigned int		sequence;
+	struct dw100_fmt	*fmt;
+	struct v4l2_rect	crop;
+};
+
+struct dw100_ctx {
+	struct v4l2_fh			fh;
+	struct dw100_device		*dw_dev;
+	struct v4l2_ctrl_handler	hdl;
+	struct v4l2_ctrl		*ctrls[DW100_MAX_CTRLS];
+
+	/* Look Up Table for pixel remapping */
+	unsigned int			*map;
+	dma_addr_t			map_dma;
+	size_t				map_size;
+	unsigned int			map_width;
+	unsigned int			map_height;
+
+	/* Related colorspace properties propagated from input to output */
+	enum v4l2_colorspace	colorspace;
+	enum v4l2_xfer_func	xfer_func;
+	enum v4l2_ycbcr_encoding ycbcr_enc;
+	enum v4l2_quantization	quant;
+
+	/* Source and destination queue data */
+	struct dw100_q_data   q_data[2];
+};
+
+static inline struct dw100_ctx *file2ctx(struct file *file)
+{
+	return container_of(file->private_data, struct dw100_ctx, fh);
+}
+
+static struct dw100_q_data *get_q_data(struct dw100_ctx *ctx,
+				       enum v4l2_buf_type type)
+{
+	if (type == V4L2_BUF_TYPE_VIDEO_OUTPUT)
+		return &ctx->q_data[V4L2_M2M_SRC];
+	else
+		return &ctx->q_data[V4L2_M2M_DST];
+}
+
+static u32 dw100_get_n_vertices_from_length(u32 length)
+{
+	return DIV_ROUND_UP(length, DW100_BLOCK_SIZE) + 1;
+}
+
+static u16 dw_map_convert_to_UQ12_4(u32 a)
+{
+	return (u16)((a & 0xfff) << 4);
+}
+
+static u32 dw_map_format_coordinates(u16 xq, u16 yq)
+{
+	return (u32)((yq << 16) | xq);
+}
+
+static u32 *dw100_get_user_map(struct dw100_ctx *ctx)
+{
+	struct v4l2_ctrl *ctrl = ctx->ctrls[DW100_CTRL_DEWARPING_MAP];
+
+	size_t user_map_size;
+
+	if (!ctrl) {
+		v4l2_err(&ctx->dw_dev->v4l2_dev, "Mapping control not found !");
+		return NULL;
+	}
+
+	if (ctrl->elems < DW100_MIN_LUT_NELEMS ||
+	    ctrl->elems > DW100_MAX_LUT_NELEMS)
+		return NULL;
+
+	user_map_size = ctrl->elems * ctrl->elem_size;
+
+	if ((ctrl->elems * ctrl->elem_size) == ctx->map_size)
+		return ctrl->p_cur.p_u32;
+
+	v4l2_warn(&ctx->dw_dev->v4l2_dev,
+		  "Skipping invalid user map (%zu != %zu)\n",
+		  user_map_size,
+		  ctx->map_size);
+
+	return NULL;
+}
+
+/*
+ * Create an identity map if not set by the application
+ *
+ * A 16 pixels cell size grid is mapped on the destination image.
+ * The last cells width/height might be lesser than 16 if the destination image
+ * width/height is not divisible by 16. This dewarping grid map specifies the
+ * source image pixel location (x, y) on each grid intersection point.
+ * Bilinear interpolation is used to compute inner cell points locations.
+ *
+ * The coordinates are saved in UQ12.4 fixed point format.
+ *
+ */
+static int dw100_create_mapping(struct dw100_ctx *ctx)
+{
+	u32 sw, sh, dw, dh, mw, mh, i, j;
+	u16 qx, qy, qdx, qdy, qsh, qsw;
+	bool is_user_map = false;
+	u32 *user_map;
+
+	sw = ctx->q_data[V4L2_M2M_SRC].width;
+	dw = ctx->q_data[V4L2_M2M_DST].width;
+	sh = ctx->q_data[V4L2_M2M_SRC].height;
+	dh = ctx->q_data[V4L2_M2M_DST].height;
+
+	mw = dw100_get_n_vertices_from_length(dw);
+	mh = dw100_get_n_vertices_from_length(dh);
+
+	qdx = dw_map_convert_to_UQ12_4(sw) / (mw - 1);
+	qdy = dw_map_convert_to_UQ12_4(sh) / (mh - 1);
+	qsh = dw_map_convert_to_UQ12_4(sh);
+	qsw = dw_map_convert_to_UQ12_4(sw);
+
+	if (ctx->map)
+		dma_free_coherent(&ctx->dw_dev->pdev->dev,
+				  ctx->map_size,
+				  ctx->map,
+				  ctx->map_dma);
+
+	ctx->map_width = mw;
+	ctx->map_height = mh;
+	ctx->map_size = mh * mw * sizeof(u32);
+
+	ctx->map = dma_alloc_coherent(&ctx->dw_dev->pdev->dev,
+				      ctx->map_size,
+				      &ctx->map_dma,
+				      GFP_KERNEL);
+
+	if (!ctx->map)
+		return -ENOMEM;
+
+	user_map = dw100_get_user_map(ctx);
+	if (user_map) {
+		is_user_map = true;
+		memcpy(ctx->map, user_map, ctx->map_size);
+		goto out;
+	}
+
+	for (i = 0, qy = 0, qx = 0; i < mh; i++, qy += qdy, qx = 0) {
+		if (qy > qsh)
+			qy = qsh;
+		for (j = 0; j < mw; j++, qx += qdx) {
+			if (qx > qsw)
+				qx = qsw;
+			ctx->map[i * mw + j] = dw_map_format_coordinates(qx, qy);
+		}
+	}
+out:
+	dprintk(1, ctx->dw_dev,
+		"%dx%d %s mapping created (d:%pa-c:%p) for stream %dx%d->%dx%d\n",
+			mw, mh, is_user_map ? "user" : "identity",
+			&ctx->map_dma, ctx->map, sw, sh, dw, dh);
+
+	return 0;
+}
+
+static const struct v4l2_ctrl_config controls[] = {
+	[DW100_CTRL_DEWARPING_MAP] = {
+		.id = V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP,
+		.name = "Look-Up Table",
+		.type = V4L2_CTRL_TYPE_U32,
+		.min = 0x00000000,
+		.max = 0xffffffff,
+		.step = 1,
+		.def = 0,
+		.dims = { DW100_MAX_LUT_NELEMS },
+		.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
+	},
+};
+
+#define NUM_CTRLS ARRAY_SIZE(controls)
+
+static int dw100_queue_setup(struct vb2_queue *vq,
+			     unsigned int *nbuffers, unsigned int *nplanes,
+			     unsigned int sizes[], struct device *alloc_devs[])
+{
+	struct dw100_ctx *ctx = vb2_get_drv_priv(vq);
+	struct dw100_q_data *q_data;
+	unsigned int size, count = *nbuffers;
+
+	q_data = get_q_data(ctx, vq->type);
+
+	size = q_data->sizeimage;
+
+	*nbuffers = count;
+
+	if (*nplanes)
+		return sizes[0] < size ? -EINVAL : 0;
+
+	*nplanes = 1;
+	sizes[0] = size;
+
+	dprintk(1, ctx->dw_dev, "Queue %p: get %d buffer(s) of size %d each.\n",
+		vq, count, size);
+
+	return 0;
+}
+
+static int dw100_buf_prepare(struct vb2_buffer *vb)
+{
+	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
+	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+	struct dw100_device *dw_dev = ctx->dw_dev;
+	struct dw100_q_data *q_data;
+
+	dprintk(1, dw_dev, "Queue %p: Preparing buffer %p, type: %d\n",
+		vb->vb2_queue, vb, vb->vb2_queue->type);
+
+	q_data = get_q_data(ctx, vb->vb2_queue->type);
+	if (V4L2_TYPE_IS_OUTPUT(vb->vb2_queue->type)) {
+		if (vbuf->field == V4L2_FIELD_ANY)
+			vbuf->field = V4L2_FIELD_NONE;
+		if (vbuf->field != V4L2_FIELD_NONE) {
+			v4l2_err(&dw_dev->v4l2_dev, "%x field isn't supported\n",
+				 vbuf->field);
+			return -EINVAL;
+		}
+	}
+
+	if (vb2_plane_size(vb, 0) < q_data->sizeimage) {
+		v4l2_err(&dw_dev->v4l2_dev,
+			 "%s data will not fit into plane (%lu < %lu)\n",
+			 __func__, vb2_plane_size(vb, 0),
+			 (long)q_data->sizeimage);
+		return -EINVAL;
+	}
+
+	vb2_set_plane_payload(vb, 0, q_data->sizeimage);
+
+	return 0;
+}
+
+static void dw100_buf_queue(struct vb2_buffer *vb)
+{
+	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
+	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+
+	dprintk(2, ctx->dw_dev, "Queue %p: Queuing buffer %p.\n",
+		vb->vb2_queue, vb);
+	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+}
+
+static int dw100_start_streaming(struct vb2_queue *q, unsigned int count)
+{
+	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
+	struct dw100_q_data *q_data = get_q_data(ctx, q->type);
+
+	dprintk(1, ctx->dw_dev, "Queue %p: Start Streaming.\n", q);
+
+	q_data->sequence = 0;
+
+	return pm_runtime_resume_and_get(&ctx->dw_dev->pdev->dev);
+}
+
+static void dw100_stop_streaming(struct vb2_queue *q)
+{
+	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
+	struct vb2_v4l2_buffer *vbuf;
+	unsigned long flags;
+
+	dprintk(1, ctx->dw_dev, "Queue %p: Stop Streaming.\n", q);
+	for (;;) {
+		if (V4L2_TYPE_IS_OUTPUT(q->type))
+			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+		else
+			vbuf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+		if (!vbuf)
+			break;
+		spin_lock_irqsave(&ctx->dw_dev->irqlock, flags);
+		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
+		spin_unlock_irqrestore(&ctx->dw_dev->irqlock, flags);
+	}
+
+	pm_runtime_put_sync(&ctx->dw_dev->pdev->dev);
+
+	if (ctx->map) {
+		dma_free_coherent(&ctx->dw_dev->pdev->dev,
+				  ctx->map_size,
+				  ctx->map,
+				  ctx->map_dma);
+		ctx->map = NULL;
+	}
+}
+
+static const struct vb2_ops dw100_qops = {
+	.queue_setup	 = dw100_queue_setup,
+	.buf_prepare	 = dw100_buf_prepare,
+	.buf_queue	 = dw100_buf_queue,
+	.start_streaming = dw100_start_streaming,
+	.stop_streaming  = dw100_stop_streaming,
+	.wait_prepare	 = vb2_ops_wait_prepare,
+	.wait_finish	 = vb2_ops_wait_finish,
+};
+
+static int dw100_m2m_queue_init(void *priv, struct vb2_queue *src_vq,
+				struct vb2_queue *dst_vq)
+{
+	struct dw100_ctx *ctx = priv;
+	int ret;
+
+	src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
+	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+	src_vq->drv_priv = ctx;
+	src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
+	src_vq->ops = &dw100_qops;
+	src_vq->mem_ops = &vb2_dma_contig_memops;
+	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
+	src_vq->lock = &ctx->dw_dev->vfd_mutex;
+	src_vq->dev = ctx->dw_dev->v4l2_dev.dev;
+
+	ret = vb2_queue_init(src_vq);
+	if (ret)
+		return ret;
+
+	dst_vq->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+	dst_vq->drv_priv = ctx;
+	dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
+	dst_vq->ops = &dw100_qops;
+	dst_vq->mem_ops = &vb2_dma_contig_memops;
+	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
+	dst_vq->lock = &ctx->dw_dev->vfd_mutex;
+	dst_vq->dev = ctx->dw_dev->v4l2_dev.dev;
+
+	return vb2_queue_init(dst_vq);
+}
+
+static int dw100_open(struct file *file)
+{
+	struct dw100_device *dw_dev = video_drvdata(file);
+	struct dw100_ctx *ctx;
+	struct v4l2_ctrl_handler *hdl;
+	int ret = 0, i;
+
+	if (mutex_lock_interruptible(&dw_dev->vfd_mutex))
+		return -ERESTARTSYS;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx) {
+		ret = -ENOMEM;
+		goto open_unlock;
+	}
+
+	v4l2_fh_init(&ctx->fh, video_devdata(file));
+	file->private_data = &ctx->fh;
+	ctx->dw_dev = dw_dev;
+
+	hdl = &ctx->hdl;
+	v4l2_ctrl_handler_init(hdl, NUM_CTRLS);
+	for (i = 0; i < NUM_CTRLS; i++) {
+		ctx->ctrls[i] = v4l2_ctrl_new_custom(hdl, &controls[i], NULL);
+		if (hdl->error) {
+			v4l2_err(&ctx->dw_dev->v4l2_dev,
+				 "Adding control (%d) failed\n", i);
+			ret = hdl->error;
+			v4l2_ctrl_handler_free(hdl);
+			v4l2_fh_exit(&ctx->fh);
+			kfree(ctx);
+			goto open_unlock;
+		}
+	}
+	ctx->fh.ctrl_handler = hdl;
+
+	ctx->q_data[V4L2_M2M_SRC].fmt = &formats[0];
+	ctx->q_data[V4L2_M2M_SRC].width = 640;
+	ctx->q_data[V4L2_M2M_SRC].height = 480;
+	ctx->q_data[V4L2_M2M_SRC].bytesperline =
+		dw100_bytesperline(&formats[0], 640);
+	ctx->q_data[V4L2_M2M_SRC].sizeimage =
+		dw100_sizeimage(&formats[0], 640, 480);
+
+	ctx->q_data[V4L2_M2M_SRC].crop.top = 0;
+	ctx->q_data[V4L2_M2M_SRC].crop.left = 0;
+	ctx->q_data[V4L2_M2M_SRC].crop.width = 640;
+	ctx->q_data[V4L2_M2M_SRC].crop.height = 480;
+
+	ctx->q_data[V4L2_M2M_DST] = ctx->q_data[V4L2_M2M_SRC];
+
+	ctx->colorspace = V4L2_COLORSPACE_REC709;
+	ctx->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(ctx->colorspace);
+	ctx->ycbcr_enc = V4L2_MAP_YCBCR_ENC_DEFAULT(ctx->colorspace);
+	ctx->quant = V4L2_MAP_QUANTIZATION_DEFAULT(false,
+						   ctx->colorspace,
+						   ctx->ycbcr_enc);
+
+	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dw_dev->m2m_dev,
+					    ctx, &dw100_m2m_queue_init);
+
+	if (IS_ERR(ctx->fh.m2m_ctx)) {
+		ret = PTR_ERR(ctx->fh.m2m_ctx);
+
+		v4l2_ctrl_handler_free(hdl);
+		v4l2_fh_exit(&ctx->fh);
+		kfree(ctx);
+		goto open_unlock;
+	}
+
+	v4l2_fh_add(&ctx->fh);
+
+	dprintk(1, dw_dev, "M2M instance created: %p", ctx->fh.m2m_ctx);
+
+open_unlock:
+	mutex_unlock(&dw_dev->vfd_mutex);
+	return ret;
+}
+
+static int dw100_release(struct file *file)
+{
+	struct dw100_device *dw_dev = video_drvdata(file);
+	struct dw100_ctx *ctx = file2ctx(file);
+
+	dprintk(1, dw_dev, "Releasing M2M instance: %p", ctx->fh.m2m_ctx);
+
+	v4l2_fh_del(&ctx->fh);
+	v4l2_fh_exit(&ctx->fh);
+	v4l2_ctrl_handler_free(&ctx->hdl);
+
+	mutex_lock(&dw_dev->vfd_mutex);
+	v4l2_m2m_ctx_release(ctx->fh.m2m_ctx);
+	mutex_unlock(&dw_dev->vfd_mutex);
+
+	if (ctx->map)
+		dma_free_coherent(&ctx->dw_dev->pdev->dev,
+				  ctx->map_size,
+				  ctx->map,
+				  ctx->map_dma);
+
+	kfree(ctx);
+
+	return 0;
+}
+
+static const struct v4l2_file_operations dw100_fops = {
+	.owner		= THIS_MODULE,
+	.open		= dw100_open,
+	.release	= dw100_release,
+	.poll		= v4l2_m2m_fop_poll,
+	.unlocked_ioctl	= video_ioctl2,
+	.mmap		= v4l2_m2m_fop_mmap,
+};
+
+static int dw100_querycap(struct file *file, void *priv,
+			  struct v4l2_capability *cap)
+{
+	strscpy(cap->driver, DRV_NAME, sizeof(cap->driver));
+	strscpy(cap->card, "DW100 dewarper", sizeof(cap->card));
+	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s", DRV_NAME);
+
+	return 0;
+}
+
+static int dw100_enum_fmt(struct v4l2_fmtdesc *f)
+{
+	int i, num = 0;
+
+	for (i = 0; i < NUM_FORMATS; i++) {
+		if (formats[i].types & to_dw100_fmt_type(f->type)) {
+			if (num == f->index) {
+				f->pixelformat = formats[i].fourcc;
+				return 0;
+			}
+			++num;
+		}
+	}
+
+	return -EINVAL;
+}
+
+static int dw100_enum_fmt_vid_cap(struct file *file, void *priv,
+				  struct v4l2_fmtdesc *f)
+{
+	return dw100_enum_fmt(f);
+}
+
+static int dw100_enum_fmt_vid_out(struct file *file, void *priv,
+				  struct v4l2_fmtdesc *f)
+{
+	return dw100_enum_fmt(f);
+}
+
+static int dw100_enum_framesizes(struct file *file, void *priv,
+				 struct v4l2_frmsizeenum *fsize)
+{
+	struct dw100_fmt *fmt;
+
+	if (fsize->index)
+		return -EINVAL;
+
+	fmt = find_pixel_format(fsize->pixel_format,
+				V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE);
+	if (!fmt)
+		return -EINVAL;
+
+	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
+	fsize->stepwise.min_width = MIN_W;
+	fsize->stepwise.min_height = MIN_H;
+	fsize->stepwise.max_width = MAX_W;
+	fsize->stepwise.max_height = MAX_H;
+	fsize->stepwise.step_width = (1UL << ALIGN_W);
+	fsize->stepwise.step_height = (1UL << ALIGN_H);
+
+	return 0;
+}
+
+static int dw100_g_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
+{
+	struct vb2_queue *vq;
+	struct dw100_q_data *q_data;
+
+	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
+	if (!vq)
+		return -EINVAL;
+
+	q_data = get_q_data(ctx, f->type);
+
+	f->fmt.pix.width	= q_data->width;
+	f->fmt.pix.height	= q_data->height;
+	f->fmt.pix.field	= V4L2_FIELD_NONE;
+	f->fmt.pix.pixelformat	= q_data->fmt->fourcc;
+	f->fmt.pix.bytesperline	= q_data->bytesperline;
+	f->fmt.pix.sizeimage	= q_data->sizeimage;
+	f->fmt.pix.colorspace	= ctx->colorspace;
+	f->fmt.pix.xfer_func	= ctx->xfer_func;
+	f->fmt.pix.ycbcr_enc	= ctx->ycbcr_enc;
+	f->fmt.pix.quantization	= ctx->quant;
+
+	return 0;
+}
+
+static int dw100_g_fmt_vid_out(struct file *file, void *priv,
+			       struct v4l2_format *f)
+{
+	return dw100_g_fmt(file2ctx(file), f);
+}
+
+static int dw100_g_fmt_vid_cap(struct file *file, void *priv,
+			       struct v4l2_format *f)
+{
+	return dw100_g_fmt(file2ctx(file), f);
+}
+
+static int dw100_try_fmt(struct v4l2_format *f, struct dw100_fmt *fmt)
+{
+	v4l_bound_align_image(&f->fmt.pix.width, MIN_W, MAX_W, ALIGN_W,
+			      &f->fmt.pix.height, MIN_H, MAX_H, ALIGN_H, 0);
+
+	f->fmt.pix.bytesperline = dw100_bytesperline(fmt, f->fmt.pix.width);
+	f->fmt.pix.sizeimage = dw100_sizeimage(fmt, f->fmt.pix.width,
+					       f->fmt.pix.height);
+	f->fmt.pix.field = V4L2_FIELD_NONE;
+
+	return 0;
+}
+
+static int dw100_s_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
+{
+	struct dw100_q_data *q_data;
+	struct vb2_queue *vq;
+
+	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
+	if (!vq)
+		return -EINVAL;
+
+	q_data = get_q_data(ctx, f->type);
+	if (!q_data)
+		return -EINVAL;
+
+	if (vb2_is_busy(vq)) {
+		v4l2_err(&ctx->dw_dev->v4l2_dev, "%s queue busy\n", __func__);
+		return -EBUSY;
+	}
+
+	q_data->fmt		= find_format(f);
+	q_data->width		= f->fmt.pix.width;
+	q_data->height		= f->fmt.pix.height;
+	q_data->bytesperline	= f->fmt.pix.bytesperline;
+	q_data->sizeimage	= f->fmt.pix.sizeimage;
+
+	q_data->crop.top = 0;
+	q_data->crop.left = 0;
+	q_data->crop.width = f->fmt.pix.width;
+	q_data->crop.height = f->fmt.pix.height;
+
+	dprintk(1, ctx->dw_dev,
+		"Setting format for type %d, wxh: %dx%d, fmt: %d\n",
+		f->type, q_data->width, q_data->height, q_data->fmt->fourcc);
+
+	return 0;
+}
+
+static int dw100_try_fmt_vid_cap(struct file *file, void *priv,
+				 struct v4l2_format *f)
+{
+	struct dw100_fmt *fmt;
+	struct dw100_ctx *ctx = file2ctx(file);
+
+	if (f->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
+		return -EINVAL;
+
+	fmt = find_format(f);
+	if (!fmt) {
+		f->fmt.pix.pixelformat = formats[0].fourcc;
+		fmt = find_format(f);
+	}
+
+	f->fmt.pix.colorspace = ctx->colorspace;
+	f->fmt.pix.xfer_func = ctx->xfer_func;
+	f->fmt.pix.ycbcr_enc = ctx->ycbcr_enc;
+	f->fmt.pix.quantization = ctx->quant;
+
+	return dw100_try_fmt(f, fmt);
+}
+
+static int dw100_s_fmt_vid_cap(struct file *file, void *priv,
+			       struct v4l2_format *f)
+{
+	struct dw100_ctx *ctx = file2ctx(file);
+	int ret;
+
+	ret = dw100_try_fmt_vid_cap(file, priv, f);
+	if (ret)
+		return ret;
+
+	ret = dw100_s_fmt(ctx, f);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int dw100_try_fmt_vid_out(struct file *file, void *priv,
+				 struct v4l2_format *f)
+{
+	struct dw100_fmt *fmt;
+
+	if (f->type != V4L2_BUF_TYPE_VIDEO_OUTPUT)
+		return -EINVAL;
+
+	fmt = find_format(f);
+	if (!fmt) {
+		f->fmt.pix.pixelformat = formats[0].fourcc;
+		fmt = find_format(f);
+	}
+
+	if (!f->fmt.pix.colorspace)
+		f->fmt.pix.colorspace = V4L2_COLORSPACE_REC709;
+
+	return dw100_try_fmt(f, fmt);
+}
+
+static int dw100_s_fmt_vid_out(struct file *file, void *priv,
+			       struct v4l2_format *f)
+{
+	struct dw100_ctx *ctx = file2ctx(file);
+	int ret;
+
+	ret = dw100_try_fmt_vid_out(file, priv, f);
+	if (ret)
+		return ret;
+
+	ret = dw100_s_fmt(ctx, f);
+	if (ret)
+		return ret;
+
+	ctx->colorspace = f->fmt.pix.colorspace;
+	ctx->xfer_func = f->fmt.pix.xfer_func;
+	ctx->ycbcr_enc = f->fmt.pix.ycbcr_enc;
+	ctx->quant = f->fmt.pix.quantization;
+
+	return 0;
+}
+
+static int dw100_g_selection(struct file *file, void *fh,
+			     struct v4l2_selection *sel)
+{
+	struct dw100_ctx *ctx = file2ctx(file);
+	struct dw100_q_data *src_q_data, *dst_q_data;
+
+	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
+	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
+		return -EINVAL;
+
+	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
+	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
+
+	switch (sel->target) {
+	case V4L2_SEL_TGT_CROP_DEFAULT:
+	case V4L2_SEL_TGT_CROP_BOUNDS:
+		sel->r.top = 0;
+		sel->r.left = 0;
+		sel->r.width = src_q_data->width;
+		sel->r.height = src_q_data->height;
+		break;
+	case V4L2_SEL_TGT_CROP:
+		sel->r.top = src_q_data->crop.top;
+		sel->r.left = src_q_data->crop.left;
+		sel->r.width = src_q_data->crop.width;
+		sel->r.height = src_q_data->crop.height;
+		break;
+	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
+	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
+	case V4L2_SEL_TGT_COMPOSE:
+		sel->r.top = 0;
+		sel->r.left = 0;
+		sel->r.width = dst_q_data->width;
+		sel->r.height = dst_q_data->height;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	dprintk(1, ctx->dw_dev,
+		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
+		sel->type, sel->target,
+		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
+
+	return 0;
+}
+
+static	int dw100_s_selection(struct file *file, void *fh,
+			      struct v4l2_selection *sel)
+{
+	struct dw100_ctx *ctx = file2ctx(file);
+	struct dw100_q_data *src_q_data, *dst_q_data;
+	u32 qscalex, qscaley, qscale;
+	int x, y, w, h;
+
+	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
+	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
+		return -EINVAL;
+
+	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
+	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
+
+	dprintk(1, ctx->dw_dev,
+		">>> Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
+		sel->type, sel->target,
+		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
+
+	if (sel->r.top < 0 || sel->r.left < 0)
+		return -EINVAL;
+
+	switch (sel->target) {
+	case V4L2_SEL_TGT_CROP:
+		/* UQ16.16 for float operations */
+		if ((sel->r.left + sel->r.width > src_q_data->width) ||
+		    (sel->r.top + sel->r.height > src_q_data->height))
+			return -EINVAL;
+		qscalex = (sel->r.width << 16) / src_q_data->width;
+		qscaley = (sel->r.height << 16) / src_q_data->height;
+		y = sel->r.top;
+		x = sel->r.left;
+		if (qscalex == qscaley) {
+			qscale = qscalex;
+		} else {
+			switch (sel->flags) {
+			case 0:
+				qscale = (qscalex + qscaley) / 2;
+				break;
+			case V4L2_SEL_FLAG_GE:
+				qscale = max(qscaley, qscalex);
+				break;
+			case V4L2_SEL_FLAG_LE:
+				qscale = min(qscaley, qscalex);
+				break;
+			case V4L2_SEL_FLAG_LE | V4L2_SEL_FLAG_GE:
+				return -ERANGE;
+			default:
+				return -EINVAL;
+			}
+		}
+
+		w = (u32)((((u64)src_q_data->width << 16) * qscale) >> 32);
+		h = (u32)((((u64)src_q_data->height << 16) * qscale) >> 32);
+		x = x + (sel->r.width  - w) / 2;
+		y = y + (sel->r.height  - h) / 2;
+		x = min(src_q_data->width - w, (unsigned int)max(0, x));
+		y = min(src_q_data->height - h, (unsigned int)max(0, y));
+
+		sel->r.top = y;
+		sel->r.left = x;
+		sel->r.width = w;
+		sel->r.height = h;
+
+		src_q_data->crop.top = sel->r.top;
+		src_q_data->crop.left = sel->r.left;
+		src_q_data->crop.width = sel->r.width;
+		src_q_data->crop.height = sel->r.height;
+		break;
+	case V4L2_SEL_TGT_COMPOSE:
+		if ((sel->r.left + sel->r.width > dst_q_data->width) ||
+		    (sel->r.top + sel->r.height > dst_q_data->height))
+			return -EINVAL;
+		sel->r.top = 0;
+		sel->r.left = 0;
+		sel->r.width = dst_q_data->width;
+		sel->r.height = dst_q_data->height;
+		break;
+	case V4L2_SEL_TGT_CROP_DEFAULT:
+	case V4L2_SEL_TGT_CROP_BOUNDS:
+	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
+	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
+	default:
+		return -EINVAL;
+	}
+
+	dprintk(1, ctx->dw_dev,
+		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
+		sel->type, sel->target,
+		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
+
+	return 0;
+}
+
+static const struct v4l2_ioctl_ops dw100_ioctl_ops = {
+	.vidioc_querycap	= dw100_querycap,
+
+	.vidioc_enum_fmt_vid_cap = dw100_enum_fmt_vid_cap,
+	.vidioc_enum_framesizes = dw100_enum_framesizes,
+	.vidioc_g_fmt_vid_cap	= dw100_g_fmt_vid_cap,
+	.vidioc_try_fmt_vid_cap	= dw100_try_fmt_vid_cap,
+	.vidioc_s_fmt_vid_cap	= dw100_s_fmt_vid_cap,
+
+	.vidioc_enum_fmt_vid_out = dw100_enum_fmt_vid_out,
+	.vidioc_g_fmt_vid_out	= dw100_g_fmt_vid_out,
+	.vidioc_try_fmt_vid_out	= dw100_try_fmt_vid_out,
+	.vidioc_s_fmt_vid_out	= dw100_s_fmt_vid_out,
+
+	.vidioc_g_selection	= dw100_g_selection,
+	.vidioc_s_selection	= dw100_s_selection,
+	.vidioc_reqbufs		= v4l2_m2m_ioctl_reqbufs,
+	.vidioc_querybuf	= v4l2_m2m_ioctl_querybuf,
+	.vidioc_qbuf		= v4l2_m2m_ioctl_qbuf,
+	.vidioc_dqbuf		= v4l2_m2m_ioctl_dqbuf,
+	.vidioc_prepare_buf	= v4l2_m2m_ioctl_prepare_buf,
+	.vidioc_create_bufs	= v4l2_m2m_ioctl_create_bufs,
+	.vidioc_expbuf		= v4l2_m2m_ioctl_expbuf,
+
+	.vidioc_streamon	= v4l2_m2m_ioctl_streamon,
+	.vidioc_streamoff	= v4l2_m2m_ioctl_streamoff,
+
+	.vidioc_subscribe_event = v4l2_ctrl_subscribe_event,
+	.vidioc_unsubscribe_event = v4l2_event_unsubscribe,
+};
+
+static void dw100_job_finish(struct dw100_device *dw_dev, bool with_error)
+{
+	struct dw100_ctx *curr_ctx;
+	struct vb2_v4l2_buffer *src_vb, *dst_vb;
+	unsigned long flags;
+	enum vb2_buffer_state buf_state;
+
+	curr_ctx = v4l2_m2m_get_curr_priv(dw_dev->m2m_dev);
+
+	if (!curr_ctx) {
+		v4l2_err(&dw_dev->v4l2_dev,
+			 "Instance released before the end of transaction\n");
+		return;
+	}
+
+	src_vb = v4l2_m2m_src_buf_remove(curr_ctx->fh.m2m_ctx);
+	dst_vb = v4l2_m2m_dst_buf_remove(curr_ctx->fh.m2m_ctx);
+
+	if (likely(!with_error))
+		buf_state = VB2_BUF_STATE_DONE;
+	else
+		buf_state = VB2_BUF_STATE_ERROR;
+
+	spin_lock_irqsave(&dw_dev->irqlock, flags);
+	v4l2_m2m_buf_done(src_vb, buf_state);
+	v4l2_m2m_buf_done(dst_vb, buf_state);
+	spin_unlock_irqrestore(&dw_dev->irqlock, flags);
+
+	dprintk(2, dw_dev, "Finishing transaction with%s error(s)\n",
+		with_error ? "" : "out");
+
+	v4l2_m2m_job_finish(dw_dev->m2m_dev, curr_ctx->fh.m2m_ctx);
+}
+
+static void dw100_hw_reset(struct dw100_device *dw_dev)
+{
+	u32 val;
+
+	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
+	val |= DW100_DEWARP_CTRL_ENABLE;
+	val |= DW100_DEWARP_CTRL_SOFT_RESET;
+	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
+	val &= ~DW100_DEWARP_CTRL_SOFT_RESET;
+	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
+}
+
+static void _dw100_hw_set_master_bus_enable(struct dw100_device *dw_dev,
+					    unsigned int enable)
+{
+	u32 val;
+	void __iomem *addr = dw_dev->mmio + DW100_BUS_CTRL;
+
+	dprintk(3, dw_dev, "%sable master bus\n", enable ? "En" : "Dis");
+
+	val = readl(addr);
+
+	if (enable)
+		val |= DW100_BUS_CTRL_AXI_MASTER_ENABLE;
+	else
+		val &= ~DW100_BUS_CTRL_AXI_MASTER_ENABLE;
+
+	writel(val, addr);
+}
+
+static void dw100_hw_master_bus_enable(struct dw100_device *dw_dev)
+{
+	_dw100_hw_set_master_bus_enable(dw_dev, 1);
+}
+
+static void dw100_hw_master_bus_disable(struct dw100_device *dw_dev)
+{
+	_dw100_hw_set_master_bus_enable(dw_dev, 0);
+}
+
+static void dw100_hw_dewarp_start(struct dw100_device *dw_dev)
+{
+	u32 val;
+	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
+
+	val = readl(addr);
+
+	dprintk(3, dw_dev, "Starting Hardware CTRL:%x\n", val);
+	writel(val | DW100_DEWARP_CTRL_START, addr);
+	writel(val, addr);
+}
+
+static void dw100_hw_init_ctrl(struct dw100_device *dw_dev)
+{
+	u32 val;
+	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
+	/*
+	 * Input format YUV422_SP
+	 * Output format YUV422_SP
+	 * No hardware handshake (SW)
+	 * No automatic double src buffering (Single)
+	 * No automatic double dst buffering (Single)
+	 * No Black Line
+	 * Prefetch image pixel traversal
+	 */
+
+	val = DW100_DEWARP_CTRL_ENABLE
+		/* Valid only for auto prefetch mode*/
+		| DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(32);
+
+	/*
+	 * Calculation mode required to support any scaling factor,
+	 * but x4 slower than traversal mode.
+	 *
+	 * DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL
+	 * DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION
+	 * DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO
+	 *
+	 * TODO: Find heuristics requiring calculation mode
+	 *
+	 */
+	val |= DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION;
+
+	writel(val, addr);
+}
+
+static void dw100_hw_set_pixel_boundary(struct dw100_device *dw_dev)
+{
+	u32 val;
+	void __iomem *addr = dw_dev->mmio + DW100_BOUNDARY_PIXEL;
+
+	val = DW100_BOUNDARY_PIXEL_V(128)
+		| DW100_BOUNDARY_PIXEL_U(128)
+		| DW100_BOUNDARY_PIXEL_Y(0);
+
+	writel(val, addr);
+}
+
+static void dw100_hw_set_scale(struct dw100_device *dw_dev, u8 scale)
+{
+	void __iomem *addr = dw_dev->mmio + DW100_SCALE_FACTOR;
+
+	dprintk(1, dw_dev, "Setting scale factor to %d\n", scale);
+
+	writel(scale, addr);
+}
+
+static void dw100_hw_set_roi(struct dw100_device *dw_dev, u32 x, u32 y)
+{
+	u32 val;
+	void __iomem *addr = dw_dev->mmio + DW100_ROI_START;
+
+	dprintk(1, dw_dev, "Setting ROI region to %d.%d\n", x, y);
+
+	val = DW100_ROI_START_X(x) | DW100_ROI_START_Y(y);
+
+	writel(val, addr);
+}
+
+static void dw100_hw_set_src_crop(struct dw100_device *dw_dev,
+				  struct dw100_q_data *src_q_data,
+				  struct dw100_q_data *dst_q_data)
+{
+	struct v4l2_rect *rect = &src_q_data->crop;
+	u32 src_scale, qscale, left_scale, top_scale;
+
+	/* HW Scale is UQ1.7 encoded */
+	src_scale = (rect->width << 7) / src_q_data->width;
+	dw100_hw_set_scale(dw_dev, src_scale);
+
+	qscale = (dst_q_data->width << 7)  / src_q_data->width;
+
+	left_scale = (((rect->left << 7) * qscale) >> 14);
+	top_scale = (((rect->top << 7) * qscale) >> 14);
+
+	dw100_hw_set_roi(dw_dev, left_scale, top_scale);
+}
+
+static void dw100_hw_set_source(struct dw100_device *dw_dev,
+				struct dw100_q_data *q_data,
+				dma_addr_t addr)
+{
+	u32 width, height, stride, fourcc, val;
+	struct dw100_fmt *fmt = q_data->fmt;
+
+	width =  q_data->width;
+	height = q_data->height;
+	stride = q_data->bytesperline;
+	fourcc = q_data->fmt->fourcc;
+
+	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
+		width, height, stride, fourcc, &addr);
+
+	/* Pixel Format */
+	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
+
+	val &= ~DW100_DEWARP_CTRL_INPUT_FORMAT_MASK;
+	val |= DW100_DEWARP_CTRL_INPUT_FORMAT(fmt->reg_format);
+
+	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
+
+	/* Swap */
+	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
+
+	val &= ~DW100_SWAP_CONTROL_SRC_MASK;
+	/*
+	 * Data swapping is performed only on Y plane for source image.
+	 */
+	if (fmt->reg_swap_uv &&
+	    fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
+		val |= DW100_SWAP_CONTROL_SRC(DW100_SWAP_CONTROL_Y
+					      (DW100_SWAP_CONTROL_BYTE));
+
+	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
+
+	/* Image resolution */
+	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
+	       dw_dev->mmio + DW100_SRC_IMG_SIZE);
+
+	writel(stride, dw_dev->mmio + DW100_SRC_IMG_STRIDE);
+
+	/* Buffers */
+	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_SRC_IMG_Y_BASE);
+	writel(DW100_IMG_UV_BASE((addr + (stride * height))),
+	       dw_dev->mmio + DW100_SRC_IMG_UV_BASE);
+}
+
+static void dw100_hw_set_destination(struct dw100_device *dw_dev,
+				     struct dw100_q_data *q_data,
+				     struct dw100_fmt *ifmt,
+				     dma_addr_t addr)
+{
+	u32 width, height, stride, fourcc, val;
+	struct dw100_fmt *fmt = q_data->fmt;
+
+	width =  q_data->width;
+	height = q_data->height;
+	stride = q_data->bytesperline;
+	fourcc = q_data->fmt->fourcc;
+
+	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
+		width, height, stride, fourcc, &addr);
+
+	/* Pixel Format */
+	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
+
+	val &= ~DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK;
+	val |= DW100_DEWARP_CTRL_OUTPUT_FORMAT(fmt->reg_format);
+
+	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
+
+	/* Swap */
+	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
+
+	val &= ~DW100_SWAP_CONTROL_DST_MASK;
+
+	/*
+	 * Avoid to swap twice
+	 */
+	if (fmt->reg_swap_uv ^
+	    (ifmt->reg_swap_uv && ifmt->reg_format !=
+	     DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)) {
+		if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
+			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_Y
+						      (DW100_SWAP_CONTROL_BYTE));
+		else
+			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_UV
+						      (DW100_SWAP_CONTROL_BYTE));
+	}
+
+	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
+
+	/* Image resolution */
+	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
+	       dw_dev->mmio + DW100_DST_IMG_SIZE);
+	writel(stride, dw_dev->mmio + DW100_DST_IMG_STRIDE);
+
+	val = ALIGN(stride * height, 16);
+	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_DST_IMG_Y_BASE);
+	writel(DW100_IMG_UV_BASE(addr + val),
+	       dw_dev->mmio + DW100_DST_IMG_UV_BASE);
+	writel(DW100_DST_IMG_Y_SIZE(val), dw_dev->mmio + DW100_DST_IMG_Y_SIZE1);
+
+	if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV420_SP)
+		val /= 2;
+
+	writel(DW100_DST_IMG_UV_SIZE(val),
+	       dw_dev->mmio + DW100_DST_IMG_UV_SIZE1);
+}
+
+static void dw100_hw_set_mapping(struct dw100_device *dw_dev, dma_addr_t addr,
+				 u32 width, u32 height)
+{
+	dprintk(1, dw_dev, "Set HW mapping registers for %dx%d addr:%pa", width,
+		height, &addr);
+
+	writel(DW100_MAP_LUT_ADDR_ADDR(addr), dw_dev->mmio + DW100_MAP_LUT_ADDR);
+	writel(DW100_MAP_LUT_SIZE_WIDTH(width)
+		| DW100_MAP_LUT_SIZE_HEIGHT(height),
+			dw_dev->mmio + DW100_MAP_LUT_SIZE);
+}
+
+static void dw100_hw_clear_irq(struct dw100_device *dw_dev, unsigned int irq)
+{
+	writel(DW100_INTERRUPT_STATUS_INT_CLEAR(irq),
+	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
+}
+
+static void dw100_hw_enable_irq(struct dw100_device *dw_dev)
+{
+	writel(DW100_INTERRUPT_STATUS_INT_ENABLE_MASK,
+	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
+}
+
+static void dw100_hw_disable_irq(struct dw100_device *dw_dev)
+{
+	writel(0, dw_dev->mmio + DW100_INTERRUPT_STATUS);
+}
+
+static bool dw100_hw_is_frame_running(struct dw100_device *dw_dev)
+{
+	bool is_running = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS)
+				& DW100_INTERRUPT_STATUS_FRAME_BUSY;
+
+	return is_running;
+}
+
+static u32 dw_hw_get_pending_irqs(struct dw100_device *dw_dev)
+{
+	u32 val;
+
+	val = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS);
+
+	return DW100_INTERRUPT_STATUS_INT_STATUS(val);
+}
+
+static irqreturn_t dw100_irq_handler(int irq, void *dev_id)
+{
+	struct dw100_device *dw_dev = dev_id;
+	u32 pending_irqs, err_irqs, frame_done_irq;
+	bool with_error = true;
+
+	pending_irqs = dw_hw_get_pending_irqs(dw_dev);
+	frame_done_irq = pending_irqs & DW100_INTERRUPT_STATUS_INT_FRAME_DONE;
+	err_irqs = DW100_INTERRUPT_STATUS_INT_ERR_STATUS(pending_irqs);
+
+	if (frame_done_irq) {
+		dprintk(3, dw_dev, "Frame done interrupt\n");
+		with_error = false;
+		err_irqs &= ~DW100_INTERRUPT_STATUS_INT_ERR_STATUS
+			(DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE);
+	}
+
+	if (err_irqs)
+		v4l2_err(&dw_dev->v4l2_dev, "Interrupt error: %#x\n", err_irqs);
+
+	dw100_hw_disable_irq(dw_dev);
+	dw100_hw_master_bus_disable(dw_dev);
+	dw100_hw_clear_irq(dw_dev, pending_irqs |
+			   DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT);
+
+	dw100_job_finish(dw_dev, with_error);
+
+	return IRQ_HANDLED;
+}
+
+static void dw100_start(struct dw100_ctx *ctx, struct vb2_v4l2_buffer *in_vb,
+			struct vb2_v4l2_buffer *out_vb)
+{
+	struct dw100_device *dw_dev = ctx->dw_dev;
+	dma_addr_t p_in, p_out;
+
+	p_in = vb2_dma_contig_plane_dma_addr(&in_vb->vb2_buf, 0);
+	p_out = vb2_dma_contig_plane_dma_addr(&out_vb->vb2_buf, 0);
+	if (!p_in || !p_out) {
+		v4l2_err(&dw_dev->v4l2_dev,
+			 "Acquiring DMA addresses of buffers failed\n");
+		return;
+	}
+
+	out_vb->sequence =
+		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE)->sequence++;
+	in_vb->sequence =
+		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT)->sequence++;
+
+	dprintk(1, ctx->dw_dev,
+		"Starting queues %p->%p buffers d:%pa->d:%pa, sequence %d->%d\n",
+		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT),
+		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE),
+		&p_in, &p_out, in_vb->sequence, out_vb->sequence);
+
+	out_vb->vb2_buf.timestamp = in_vb->vb2_buf.timestamp;
+	out_vb->field = in_vb->field;
+	if (in_vb->flags & V4L2_BUF_FLAG_TIMECODE)
+		out_vb->timecode = in_vb->timecode;
+	out_vb->flags = in_vb->flags &
+		(V4L2_BUF_FLAG_TIMECODE |
+		 V4L2_BUF_FLAG_KEYFRAME |
+		 V4L2_BUF_FLAG_PFRAME |
+		 V4L2_BUF_FLAG_BFRAME |
+		 V4L2_BUF_FLAG_TIMESTAMP_MASK |
+		 V4L2_BUF_FLAG_TSTAMP_SRC_MASK);
+
+	/* Now, let's deal with hardware ... */
+	dw100_hw_master_bus_disable(dw_dev);
+	if (!ctx->map)
+		dw100_create_mapping(ctx);
+
+	dw100_hw_init_ctrl(dw_dev);
+	dw100_hw_set_pixel_boundary(dw_dev);
+	dw100_hw_set_src_crop(dw_dev, &ctx->q_data[V4L2_M2M_SRC],
+			      &ctx->q_data[V4L2_M2M_DST]);
+	dw100_hw_set_source(dw_dev, &ctx->q_data[V4L2_M2M_SRC], p_in);
+	dw100_hw_set_destination(dw_dev, &ctx->q_data[V4L2_M2M_DST],
+				 ctx->q_data[V4L2_M2M_SRC].fmt, p_out);
+	dw100_hw_set_mapping(dw_dev, ctx->map_dma,
+			     ctx->map_width, ctx->map_height);
+	dw100_hw_enable_irq(dw_dev);
+	dw100_hw_dewarp_start(dw_dev);
+
+	/* Enable Bus */
+	dw100_hw_master_bus_enable(dw_dev);
+}
+
+static void dw100_device_run(void *priv)
+{
+	struct dw100_ctx *ctx = priv;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+
+	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+
+	dw100_start(ctx, src_buf, dst_buf);
+}
+
+static int dw100_job_ready(void *priv)
+{
+	struct dw100_ctx *ctx = priv;
+
+	if (v4l2_m2m_num_src_bufs_ready(ctx->fh.m2m_ctx) < 1 ||
+	    v4l2_m2m_num_dst_bufs_ready(ctx->fh.m2m_ctx) < 1) {
+		dprintk(1, ctx->dw_dev, "Not enough buffers available\n");
+		return 0;
+	}
+
+	if (dw100_hw_is_frame_running(ctx->dw_dev)) {
+		dprintk(1, ctx->dw_dev, "HW still running a frame\n");
+		return 0;
+	}
+
+	return 1;
+}
+
+static const struct v4l2_m2m_ops dw100_m2m_ops = {
+	.device_run	= dw100_device_run,
+	.job_ready	= dw100_job_ready,
+};
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+static const struct media_device_ops dw100_media_ops = {
+	.req_validate = vb2_request_validate,
+	.req_queue = v4l2_m2m_request_queue,
+};
+#endif
+
+static struct video_device *dw100_init_video_device(struct dw100_device *dw_dev)
+{
+	struct video_device *vfd = &dw_dev->vfd;
+
+	vfd->vfl_dir = VFL_DIR_M2M;
+	vfd->fops = &dw100_fops;
+	vfd->device_caps = V4L2_CAP_VIDEO_M2M | V4L2_CAP_STREAMING;
+	vfd->ioctl_ops = &dw100_ioctl_ops;
+	vfd->minor = -1;
+	vfd->release = video_device_release_empty;
+	vfd->v4l2_dev = &dw_dev->v4l2_dev;
+	vfd->lock = &dw_dev->vfd_mutex;
+
+	strscpy(vfd->name, DRV_NAME, sizeof(vfd->name));
+	mutex_init(vfd->lock);
+	video_set_drvdata(vfd, dw_dev);
+
+	return vfd;
+}
+
+static int dw100_dump_regs_show(struct seq_file *m, void *private)
+{
+	struct dw100_device *dw_dev = m->private;
+	int ret;
+
+	ret = pm_runtime_resume_and_get(&dw_dev->pdev->dev);
+	if (ret < 0)
+		return ret;
+
+	ret = dw100_dump_regs(dw_dev);
+
+	pm_runtime_put_sync(&dw_dev->pdev->dev);
+
+	return ret;
+}
+DEFINE_SHOW_ATTRIBUTE(dw100_dump_regs);
+
+static void dw100_debugfs_init(struct dw100_device *dw_dev)
+{
+	dw_dev->debugfs_root =
+		debugfs_create_dir(dev_name(&dw_dev->pdev->dev), NULL);
+
+	debugfs_create_file("dump_regs", 0600, dw_dev->debugfs_root, dw_dev,
+			    &dw100_dump_regs_fops);
+}
+
+static void dw100_debugfs_exit(struct dw100_device *dw_dev)
+{
+	debugfs_remove_recursive(dw_dev->debugfs_root);
+}
+
+static int dw100_probe(struct platform_device *pdev)
+{
+	struct dw100_device *dw_dev;
+	struct video_device *vfd;
+	struct resource *res;
+	int ret = 0;
+	int irq;
+
+	dw_dev = devm_kzalloc(&pdev->dev, sizeof(*dw_dev), GFP_KERNEL);
+	if (!dw_dev)
+		return -ENOMEM;
+	dw_dev->pdev = pdev;
+
+	ret = devm_clk_bulk_get_all(&pdev->dev, &dw_dev->clks);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "Unable to get clocks: %d\n", ret);
+		return ret;
+	}
+	dw_dev->num_clks = ret;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	dw_dev->mmio = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(dw_dev->mmio))
+		return PTR_ERR(dw_dev->mmio);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0)
+		return irq;
+
+	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL, dw100_irq_handler,
+					IRQF_ONESHOT, dev_name(&pdev->dev),
+					dw_dev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
+		return ret;
+	}
+
+	platform_set_drvdata(pdev, dw_dev);
+
+	pm_runtime_enable(&pdev->dev);
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "Unable to enable clks: %d\n", ret);
+		goto err_clk;
+	}
+
+	dw100_hw_reset(dw_dev);
+
+	pm_runtime_put_sync(&pdev->dev);
+
+	spin_lock_init(&dw_dev->irqlock);
+
+	ret = v4l2_device_register(&pdev->dev, &dw_dev->v4l2_dev);
+	if (ret)
+		goto err_clk;
+
+	vfd = dw100_init_video_device(dw_dev);
+
+	dw_dev->m2m_dev = v4l2_m2m_init(&dw100_m2m_ops);
+	if (IS_ERR(dw_dev->m2m_dev)) {
+		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem device\n");
+		ret = PTR_ERR(dw_dev->m2m_dev);
+		goto err_v4l2;
+	}
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+	dw_dev->mdev.dev = &pdev->dev;
+	strscpy(dw_dev->mdev.model, "dw100", sizeof(dw_dev->mdev.model));
+	strscpy(dw_dev->mdev.bus_info, "platform:dw100",
+		sizeof(dw_dev->mdev.bus_info));
+	media_device_init(&dw_dev->mdev);
+	dw_dev->mdev.ops = &dw100_media_ops;
+	dw_dev->v4l2_dev.mdev = &dw_dev->mdev;
+#endif
+
+	ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
+	if (ret) {
+		v4l2_err(&dw_dev->v4l2_dev, "Failed to register video device\n");
+		goto err_m2m;
+	}
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+	ret = v4l2_m2m_register_media_controller(dw_dev->m2m_dev, vfd,
+						 MEDIA_ENT_F_PROC_VIDEO_SCALER);
+	if (ret) {
+		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem media controller\n");
+		goto error_v4l2;
+	}
+
+	ret = media_device_register(&dw_dev->mdev);
+	if (ret) {
+		v4l2_err(&dw_dev->v4l2_dev, "Failed to register mem2mem media device\n");
+		goto error_m2m_mc;
+	}
+#endif
+
+	dw100_debugfs_init(dw_dev);
+
+	v4l2_info(&dw_dev->v4l2_dev,
+		  "dw100 v4l2 m2m registered as /dev/video%d\n", vfd->num);
+
+	return 0;
+#ifdef CONFIG_MEDIA_CONTROLLER
+error_m2m_mc:
+	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
+error_v4l2:
+	video_unregister_device(vfd);
+#endif
+err_m2m:
+	v4l2_m2m_release(dw_dev->m2m_dev);
+err_v4l2:
+	v4l2_device_unregister(&dw_dev->v4l2_dev);
+err_clk:
+	pm_runtime_disable(&pdev->dev);
+
+	return ret;
+}
+
+static int dw100_remove(struct platform_device *pdev)
+{
+	struct dw100_device *dw_dev = platform_get_drvdata(pdev);
+
+	dw100_debugfs_exit(dw_dev);
+
+	pm_runtime_disable(&pdev->dev);
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+	media_device_unregister(&dw_dev->mdev);
+	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
+	media_device_cleanup(&dw_dev->mdev);
+#endif
+
+	video_unregister_device(&dw_dev->vfd);
+	v4l2_m2m_release(dw_dev->m2m_dev);
+	v4l2_device_unregister(&dw_dev->v4l2_dev);
+
+	return 0;
+}
+
+static int __maybe_unused dw100_runtime_suspend(struct device *dev)
+{
+	struct dw100_device *dw_dev = dev_get_drvdata(dev);
+
+	clk_bulk_disable_unprepare(dw_dev->num_clks, dw_dev->clks);
+
+	return 0;
+}
+
+static int __maybe_unused dw100_runtime_resume(struct device *dev)
+{
+	struct dw100_device *dw_dev = dev_get_drvdata(dev);
+
+	return clk_bulk_prepare_enable(dw_dev->num_clks, dw_dev->clks);
+}
+
+static const struct dev_pm_ops dw100_pm = {
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+	SET_RUNTIME_PM_OPS(dw100_runtime_suspend,
+			   dw100_runtime_resume, NULL)
+};
+
+static const struct of_device_id dw100_dt_ids[] = {
+	{ .compatible = "nxp,imx8mp-dw100", .data = NULL },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, dw100_dt_ids);
+
+static struct platform_driver dw100_driver = {
+	.probe		= dw100_probe,
+	.remove		= dw100_remove,
+	.driver		= {
+		.name	= DRV_NAME,
+		.pm = &dw100_pm,
+		.of_match_table = dw100_dt_ids,
+	},
+};
+
+module_platform_driver(dw100_driver);
+
+MODULE_DESCRIPTION("DW100 Hardware dewarper");
+MODULE_AUTHOR("Xavier Roumegue <Xavier.Roumegue@oss.nxp.com>");
+MODULE_LICENSE("GPL");
diff --git a/drivers/media/platform/nxp/dw100/dw100_regs.h b/drivers/media/platform/nxp/dw100/dw100_regs.h
new file mode 100644
index 000000000000..c540f6c638db
--- /dev/null
+++ b/drivers/media/platform/nxp/dw100/dw100_regs.h
@@ -0,0 +1,118 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * DW100 Hardware dewarper
+ *
+ * Copyright 2022 NXP
+ * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
+ *
+ */
+
+#ifndef _DW100_REGS_H_
+#define _DW100_REGS_H_
+
+/* AHB register offset */
+#define DW100_DEWARP_ID			0x00
+#define DW100_DEWARP_CTRL		0x04
+#define DW100_DEWARP_CTRL_ENABLE			BIT(0)
+#define DW100_DEWARP_CTRL_START				BIT(1)
+#define DW100_DEWARP_CTRL_SOFT_RESET			BIT(2)
+#define DW100_DEWARP_CTRL_FORMAT_YUV422_SP		(0UL)
+#define DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED		(1UL)
+#define DW100_DEWARP_CTRL_FORMAT_YUV420_SP		(2UL)
+#define DW100_DEWARP_CTRL_INPUT_FORMAT_MASK		GENMASK(5, 4)
+#define DW100_DEWARP_CTRL_INPUT_FORMAT(x)		((x) << 4)
+#define DW100_DEWARP_CTRL_OUTPUT_FORMAT(x)		((x) << 6)
+#define DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK		GENMASK(7, 6)
+#define DW100_DEWARP_CTRL_SRC_AUTO_SHADOW		BIT(8)
+#define DW100_DEWARP_CTRL_HW_HANDSHAKE			BIT(9)
+#define DW100_DEWARP_CTRL_DST_AUTO_SHADOW		BIT(10)
+#define DW100_DEWARP_CTRL_SPLIT_LINE			BIT(11)
+#define DW100_DEWARP_CTRL_PREFETCH_MODE_MASK		GENMASK(17, 16)
+#define DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL	(0UL << 16)
+#define DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION	(1UL << 16)
+#define DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO		(2UL << 16)
+#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD_MASK	GENMASK(24, 18)
+#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(x)		((x) << 18)
+
+#define DW100_MAP_LUT_ADDR		0x08
+#define DW100_MAP_LUT_ADDR_ADDR(addr)	(((addr) >> 4) & GENMASK(29, 0))
+#define DW100_MAP_LUT_SIZE		0x0C
+#define DW100_MAP_LUT_SIZE_WIDTH(w)	(((w) & GENMASK(10, 0)) << 0)
+#define DW100_MAP_LUT_SIZE_HEIGHT(h)	(((h) & GENMASK(10, 0)) << 16)
+#define DW100_SRC_IMG_Y_BASE		0x10
+#define DW100_IMG_Y_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
+#define DW100_SRC_IMG_UV_BASE		0x14
+#define DW100_IMG_UV_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
+#define DW100_SRC_IMG_SIZE		0x18
+#define DW100_IMG_SIZE_WIDTH(w)		(((w) & GENMASK(12, 0)) << 0)
+#define DW100_IMG_SIZE_HEIGHT(h)	(((h) & GENMASK(12, 0)) << 16)
+
+#define DW100_SRC_IMG_STRIDE		0x1C
+#define DW100_MAP_LUT_ADDR2		0x20
+#define DW100_MAP_LUT_SIZE2		0x24
+#define DW100_SRC_IMG_Y_BASE2		0x28
+#define DW100_SRC_IMG_UV_BASE2		0x2C
+#define DW100_SRC_IMG_SIZE2		0x30
+#define DW100_SRC_IMG_STRIDE2		0x34
+#define DW100_DST_IMG_Y_BASE		0x38
+#define DW100_DST_IMG_UV_BASE		0x3C
+#define DW100_DST_IMG_SIZE		0x40
+#define DW100_DST_IMG_STRIDE		0x44
+#define DW100_DST_IMG_Y_BASE2		0x48
+#define DW100_DST_IMG_UV_BASE2		0x4C
+#define DW100_DST_IMG_SIZE2		0x50
+#define DW100_DST_IMG_STRIDE2		0x54
+#define DW100_SWAP_CONTROL		0x58
+#define DW100_SWAP_CONTROL_BYTE		BIT(0)
+#define DW100_SWAP_CONTROL_SHORT	BIT(1)
+#define DW100_SWAP_CONTROL_WORD		BIT(2)
+#define DW100_SWAP_CONTROL_LONG		BIT(3)
+#define DW100_SWAP_CONTROL_Y(x)		(((x) & GENMASK(3, 0)) << 0)
+#define DW100_SWAP_CONTROL_UV(x)	(((x) & GENMASK(3, 0)) << 4)
+#define DW100_SWAP_CONTROL_SRC(x)	(((x) & GENMASK(7, 0)) << 0)
+#define DW100_SWAP_CONTROL_DST(x)	(((x) & GENMASK(7, 0)) << 8)
+#define DW100_SWAP_CONTROL_SRC2(x)	(((x) & GENMASK(7, 0)) << 16)
+#define DW100_SWAP_CONTROL_DST2(x)	(((x) & GENMASK(7, 0)) << 24)
+#define DW100_SWAP_CONTROL_SRC_MASK	GENMASK(7, 0)
+#define DW100_SWAP_CONTROL_DST_MASK	GENMASK(15, 8)
+#define DW100_SWAP_CONTROL_SRC2_MASK	GENMASK(23, 16)
+#define DW100_SWAP_CONTROL_DST2_MASK	GENMASK(31, 24)
+#define DW100_VERTICAL_SPLIT_LINE	0x5C
+#define DW100_HORIZON_SPLIT_LINE	0x60
+#define DW100_SCALE_FACTOR		0x64
+#define DW100_ROI_START			0x68
+#define DW100_ROI_START_X(x)		(((x) & GENMASK(12, 0)) << 0)
+#define DW100_ROI_START_Y(y)		(((y) & GENMASK(12, 0)) << 16)
+#define DW100_BOUNDARY_PIXEL		0x6C
+#define DW100_BOUNDARY_PIXEL_V(v)	(((v) & GENMASK(7, 0)) << 0)
+#define DW100_BOUNDARY_PIXEL_U(u)	(((u) & GENMASK(7, 0)) << 8)
+#define DW100_BOUNDARY_PIXEL_Y(y)	(((y) & GENMASK(7, 0)) << 16)
+
+#define DW100_INTERRUPT_STATUS		0x70
+#define DW100_INTERRUPT_STATUS_INT_FRAME_DONE		BIT(0)
+#define DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT		BIT(1)
+#define DW100_INTERRUPT_STATUS_INT_ERR_AXI_RESP		BIT(2)
+#define DW100_INTERRUPT_STATUS_INT_ERR_X		BIT(3)
+#define DW100_INTERRUPT_STATUS_INT_ERR_MB_FETCH		BIT(4)
+#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME2		BIT(5)
+#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME3		BIT(6)
+#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE	BIT(7)
+#define DW100_INTERRUPT_STATUS_INT_ERR_STATUS(x)	(((x) >> 1) & 0x7f)
+#define DW100_INTERRUPT_STATUS_INT_STATUS(x)		((x) & 0xff)
+
+#define DW100_INTERRUPT_STATUS_INT_ENABLE_MASK		GENMASK(15, 8)
+#define DW100_INTERRUPT_STATUS_INT_ENABLE(x)		(((x) & GENMASK(7, 0)) << 8)
+#define DW100_INTERRUPT_STATUS_FRAME_BUSY		BIT(16)
+#define DW100_INTERRUPT_STATUS_INT_CLEAR(x)		(((x) & GENMASK(7, 0)) << 24)
+#define DW100_BUS_CTRL			0x74
+#define DW100_BUS_CTRL_AXI_MASTER_ENABLE	BIT(31)
+#define DW100_BUS_CTRL1			0x78
+#define DW100_BUS_TIME_OUT_CYCLE	0x7C
+#define DW100_DST_IMG_Y_SIZE1		0x80
+#define DW100_DST_IMG_Y_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
+#define DW100_DST_IMG_UV_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
+#define DW100_DST_IMG_UV_SIZE1		0x84
+#define DW100_DST_IMG_Y_SIZE2		0x88
+#define DW100_DST_IMG_UV_SIZE2		0x8C
+
+#endif /* _DW100_REGS_H_ */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 9/9] media: MAINTAINERS: add entry for i.MX8MP DW100 v4l2 mem2mem driver
  2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
                   ` (7 preceding siblings ...)
  2022-05-03  9:39 ` [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver Xavier Roumegue
@ 2022-05-03  9:39 ` Xavier Roumegue
  8 siblings, 0 replies; 27+ messages in thread
From: Xavier Roumegue @ 2022-05-03  9:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, stanimir.varbanov, laurent.pinchart,
	tomi.valkeinen, robh+dt, nicolas, alexander.stein
  Cc: Xavier Roumegue, linux-media, devicetree

Add myself as maintainer of the dw100 driver which offers hardware
accelerated dewarping operations through a v4l2 mem2mem interface.

Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e8c2f611766..6dd29c3cc281 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14187,6 +14187,15 @@ S:	Supported
 F:	Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
 F:	drivers/nfc/nxp-nci
 
+NXP i.MX 8MP DW100 V4L2 DRIVER
+M:	Xavier Roumegue <xavier.roumegue@oss.nxp.com>
+L:	linux-media@vger.kernel.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/media/nxp,dw100.yaml
+F:	Documentation/userspace-api/media/drivers/dw100.rst
+F:	drivers/media/platform/nxp/dw100/
+F:	include/uapi/linux/dw100.h
+
 NXP i.MX 8QXP/8QM JPEG V4L2 DRIVER
 M:	Mirela Rabulea <mirela.rabulea@nxp.com>
 R:	NXP Linux Team <linux-imx@nxp.com>
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding
  2022-05-03  9:39 ` [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding Xavier Roumegue
@ 2022-05-03 11:59   ` Rob Herring
  0 siblings, 0 replies; 27+ messages in thread
From: Rob Herring @ 2022-05-03 11:59 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: hverkuil-cisco, nicolas, tomi.valkeinen, alexander.stein,
	mchehab, devicetree, stanimir.varbanov, linux-media,
	laurent.pinchart, robh+dt

On Tue, 03 May 2022 11:39:23 +0200, Xavier Roumegue wrote:
> Add DT binding documentation for the Vivante DW100 dewarper engine found
> on NXP i.MX8MP SoC
> 
> Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>  .../devicetree/bindings/media/nxp,dw100.yaml  | 69 +++++++++++++++++++
>  1 file changed, 69 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/nxp,dw100.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
Error: Documentation/devicetree/bindings/media/nxp,dw100.example.dts:34.50-51 syntax error
FATAL ERROR: Unable to parse input tree
make[1]: *** [scripts/Makefile.lib:364: Documentation/devicetree/bindings/media/nxp,dw100.example.dtb] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1401: dt_binding_check] Error 2

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  2022-05-03  9:39 ` [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY Xavier Roumegue
@ 2022-06-14 20:46   ` Laurent Pinchart
  2022-06-15  7:37     ` Hans Verkuil
  0 siblings, 1 reply; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-14 20:46 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Xavier and Hans,

Thank you for the patch.

On Tue, May 03, 2022 at 11:39:17AM +0200, Xavier Roumegue wrote:
> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> 
> Add a new flag that indicates that this control is a dynamically sized
> array. Also document this flag.
> 
> Currently dynamically sized arrays are limited to one dimensional arrays,
> but that might change in the future if there is a need for it.
> 
> The initial use-case of dynamic arrays are stateless codecs. A frame
> can be divided in many slices, so you want to provide an array containing
> slice information for each slice. Typically the number of slices is small,
> but the standard allow for hundreds or thousands of slices. Dynamic arrays
> are a good solution since sizing the array for the worst case would waste
> substantial amounts of memory.
> 
> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> ---
>  .../userspace-api/media/v4l/vidioc-queryctrl.rst          | 8 ++++++++
>  include/uapi/linux/videodev2.h                            | 1 +
>  2 files changed, 9 insertions(+)
> 
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> index 88f630252d98..a20dfa2a933b 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> @@ -625,6 +625,14 @@ See also the examples in :ref:`control`.
>  	``V4L2_CTRL_FLAG_GRABBED`` flag when buffers are allocated or
>  	streaming is in progress since most drivers do not support changing
>  	the format in that case.
> +    * - ``V4L2_CTRL_FLAG_DYNAMIC_ARRAY``
> +      - 0x0800
> +      - This control is a dynamically sized 1-dimensional array. It
> +        behaves the same as a regular array, except that the number
> +	of elements as reported by the ``elems`` field is between 1 and
> +	``dims[0]``. So setting the control with a differently sized
> +	array will change the ``elems`` field when the control is
> +	queried afterwards.

Wrong indentation.

Can the dimension be changed by the application only, or by the driver
too ? In the latter case, is an event generated ?

Considering this in the context of this series, the driver needs to
change the dimension, as the use case is to size the control based on
the image size. Do we want to document here that the driver will reset
the control to a default value when the dimension changes, or is that
something that should be control-specific ?

>  
>  Return Value
>  ============
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 3768a0a80830..8df13defde75 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -1886,6 +1886,7 @@ struct v4l2_querymenu {
>  #define V4L2_CTRL_FLAG_HAS_PAYLOAD	0x0100
>  #define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE	0x0200
>  #define V4L2_CTRL_FLAG_MODIFY_LAYOUT	0x0400
> +#define V4L2_CTRL_FLAG_DYNAMIC_ARRAY	0x0800
>  
>  /*  Query flags, to be ORed with the control ID */
>  #define V4L2_CTRL_FLAG_NEXT_CTRL	0x80000000

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-05-03  9:39 ` [PATCH v5 3/9] vivid: add dynamic array test control Xavier Roumegue
@ 2022-06-14 21:00   ` Laurent Pinchart
  2022-06-15  7:39     ` Hans Verkuil
  2022-06-15  9:14     ` Hans Verkuil
  0 siblings, 2 replies; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-14 21:00 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Xavier and Hans,

Thank you for the patch.

On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> 
> Add a dynamic array test control to help test support for this
> feature.
> 
> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> ---
>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> index e7516dc1227b..7267892dc18a 100644
> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> @@ -34,6 +34,7 @@
>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>  
>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>  	.dims = { 1 },
>  };
>  
> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
> +	.ops = &vivid_user_gen_ctrl_ops,
> +	.id = VIVID_CID_U32_DYN_ARRAY,
> +	.name = "U32 Dynamic Array",
> +	.type = V4L2_CTRL_TYPE_U32,
> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
> +	.def = 50,
> +	.min = 10,
> +	.max = 90,
> +	.step = 1,
> +	.dims = { 100 },
> +};

To meaningfully test this, don't we need the vivid driver to change the
dimension ? Or is it meant to only test changes made by the application
?

> +
>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>  	.ops = &vivid_user_gen_ctrl_ops,
>  	.id = VIVID_CID_U16_MATRIX,
> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>  

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 6/9] media: uapi: Add a control for DW100 driver
  2022-05-03  9:39 ` [PATCH v5 6/9] media: uapi: Add a control for DW100 driver Xavier Roumegue
@ 2022-06-14 21:07   ` Laurent Pinchart
  0 siblings, 0 replies; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-14 21:07 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Xavier,

Thank you for the patch.

On Tue, May 03, 2022 at 11:39:22AM +0200, Xavier Roumegue wrote:
> The DW100 driver gets the dewarping mapping as a binary blob from the
> userspace application through a custom control.
> The blob format is hardware specific so create a dedicated control for
> this purpose.
> 
> Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> ---
>  .../userspace-api/media/drivers/dw100.rst          | 13 +++++++++++++
>  include/uapi/linux/dw100.h                         | 14 ++++++++++++++
>  2 files changed, 27 insertions(+)
>  create mode 100644 include/uapi/linux/dw100.h
> 
> diff --git a/Documentation/userspace-api/media/drivers/dw100.rst b/Documentation/userspace-api/media/drivers/dw100.rst
> index 1e606459cf47..1ea727698ea1 100644
> --- a/Documentation/userspace-api/media/drivers/dw100.rst
> +++ b/Documentation/userspace-api/media/drivers/dw100.rst
> @@ -87,4 +87,17 @@ an identity map.
>  More details on the DW100 hardware operations can be found in
>  *chapter 13.15 DeWarp* of IMX8MP_ reference manual.
>  
> +The Vivante DW100 m2m driver implements the following driver-specific control:
> +
> +``V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP (__u32 array)``
> +    Specifies to DW100 driver its dewarping map (aka LUT) blob as described in
> +    *chapter 13.15.2.3 Dewarping Remap* of IMX8MP_ reference manual as an U32
> +    dynamic array. The image is divided into many small 16x16 blocks. If the
> +    width/height of the image is not divisible by 16, the size of the
> +    rightmost/bottommost block is the remainder. The dewarping map only saves
> +    the vertex coordinates of the block. The dewarping grid map is comprised of
> +    vertex coordinates for x and y. Each x, y coordinate register uses 16 bits
> +    (UQ12.4) to record the coordinate address, with the Y coordinate in the
> +    upper bits and X in the lower bits.
> +
>  .. _IMX8MP: https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
> diff --git a/include/uapi/linux/dw100.h b/include/uapi/linux/dw100.h
> new file mode 100644
> index 000000000000..3356496edd6b
> --- /dev/null
> +++ b/include/uapi/linux/dw100.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/* Copyright 2022 NXP */
> +
> +#ifndef __UAPI_DW100_H__
> +#define __UAPI_DW100_H__
> +
> +#include <linux/v4l2-controls.h>
> +
> +/*
> + * Check Documentation/userspace-api/media/drivers/dw100.rst for control details.
> + */
> +#define V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP (V4L2_CID_USER_DW100_BASE + 1)
> +
> +#endif

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver
  2022-05-03  9:39 ` [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver Xavier Roumegue
@ 2022-06-14 21:13   ` Laurent Pinchart
  0 siblings, 0 replies; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-14 21:13 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Xavier,

Thank you for the patch.

On Tue, May 03, 2022 at 11:39:20AM +0200, Xavier Roumegue wrote:
> Add user documentation for the DW100 driver.
> 
> while at it, replace spaces with tab on drivers list.
> 
> Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
> ---
>  .../userspace-api/media/drivers/dw100.rst     | 90 +++++++++++++++++++
>  .../userspace-api/media/drivers/index.rst     |  3 +-
>  2 files changed, 92 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/userspace-api/media/drivers/dw100.rst
> 
> diff --git a/Documentation/userspace-api/media/drivers/dw100.rst b/Documentation/userspace-api/media/drivers/dw100.rst
> new file mode 100644
> index 000000000000..1e606459cf47
> --- /dev/null
> +++ b/Documentation/userspace-api/media/drivers/dw100.rst
> @@ -0,0 +1,90 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +DW100 dewarp driver
> +===================
> +
> +The Vivante DW100 Dewarp Processor IP core found on i.MX8MP SoC applies a
> +programmable geometrical transformation on input image to correct distortion

s/on input image/on the image image/ (or "on input images")

> +introduced by lenses.
> +
> +The transformation function is exposed by the hardware as a grid map with 16x16
> +pixel macroblocks indexed using X, Y vertex coordinates.
> +::
> +
> +                          Image width
> +           <--------------------------------------->
> +
> +      ^    .-------.-------.-------.-------.-------.
> +      |    | 16x16 |       |       |       |       |
> +   I  |    | pixel |       |       |       |       |
> +   m  |    | block |       |       |       |       |
> +   a  |    .-------.-------.-------.-------.-------.
> +   g  |    |       |       |       |       |       |
> +   e  |    |       |       |       |       |       |
> +      |    |       |       |       |       |       |
> +   h  |    .-------.-------.-------.-------.-------.
> +   e  |    |       |       |       |       |       |
> +   i  |    |       |       |       |       |       |
> +   g  |    |       |       |       |       |       |
> +   h  |    .-------.-------.-------.-------.-------.
> +   t  |    |       |       |       |       |       |
> +      |    |       |       |       |       |       |
> +      |    |       |       |       |       |       |
> +      v    '-------'-------'-------'-------'-------'
> +
> +            Grid of Image Blocks for Dewarping Map
> +
> +
> +Each x, y coordinate register uses 16 bits to record the coordinate address in
> +an unsigned 12.4 fixed point format (UQ12.4).
> +::
> +
> +    .----------------------.--------..----------------------.--------.
> +    |         31~20        | 19~16  ||         15~4         |  3~0   |
> +    |       (integer)      | (frac) ||       (integer)      | (frac) |
> +    '----------------------'--------''----------------------'--------'
> +    <-------------------------------><------------------------------->
> +                Y coordinate                     X coordinate
> +
> +                           Remap Register Layout
> +
> +The dewarping remap contains two interpolations: coordinate interpolation and
> +pixel interpolation as shown in the figure below.
> +::
> +
> +                           P1                 P2
> +                        (x1, y1)           (x2, y2)
> +                           .-----------------.
> +                           |                 |
> +                           |      P          |
> +                           |   (x, y)        |
> +                           |      *          |
> +                           |                 |
> +                           |                 |
> +                           '-----------------'
> +                           P3                P4
> +                        (x3, y3)          (x4, y4)
> +
> +                          Remap with Coordinate
> +                          and Pixel Interpolation

I don't understand from this figure (and the text above it) how the
interpolation works, or how it's relevant to the control definition. You
could either expand the documentation, or drop this part.

With this fixed,

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> +The dewarping map is set from applications using the
> +V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP control. The control contains
> +an array of u32 values storing (x, y) destination coordinates for each
> +vertex of the grid. The x coordinate is stored in the 16 LSBs and the y
> +coordinate in the 16 MSBs.
> +
> +The number of elements in the array must match the image size:
> +
> +.. code-block:: C
> +
> +    elems = (DIV_ROUND_UP(width, 16) + 1) * (DIV_ROUND_UP(height, 16) + 1);
> +
> +If the control doesn't contain the correct number of elements, the driver uses
> +an identity map.
> +
> +More details on the DW100 hardware operations can be found in
> +*chapter 13.15 DeWarp* of IMX8MP_ reference manual.
> +
> +.. _IMX8MP: https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
> diff --git a/Documentation/userspace-api/media/drivers/index.rst b/Documentation/userspace-api/media/drivers/index.rst
> index 12e3c512d718..0c720ca1a27d 100644
> --- a/Documentation/userspace-api/media/drivers/index.rst
> +++ b/Documentation/userspace-api/media/drivers/index.rst
> @@ -33,7 +33,8 @@ For more details see the file COPYING in the source distribution of Linux.
>  
>  	ccs
>  	cx2341x-uapi
> -        hantro
> +	dw100
> +	hantro
>  	imx-uapi
>  	max2175
>  	meye-uapi

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver
  2022-05-03  9:39 ` [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver Xavier Roumegue
@ 2022-06-14 22:54   ` Laurent Pinchart
  2022-06-23  7:06     ` Xavier Roumegue (OSS)
  0 siblings, 1 reply; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-14 22:54 UTC (permalink / raw)
  To: Xavier Roumegue
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Xavier,

Thank you for the patch.

On Tue, May 03, 2022 at 11:39:24AM +0200, Xavier Roumegue wrote:
> Add a V4L2 mem-to-mem driver for the Vivante DW100 Dewarp Processor IP
> core found on i.MX8MP SoC.
> 
> The processor core applies a programmable geometrical transformation on
> input image to correct distorsion introduced by lenses.

s/input image/input images/

> The transformation function is exposed as a grid map with 16x16 pixel
> macroblocks indexed using X, Y vertex coordinates.
> 
> The dewarping map can be set from application through dedicated a v4l2

s/dedicated a v4l2/a dedicated V4L2/

> control. If not set or invalid, the driver computes an identity map
> prior to start the processing engine.

s/start/starting/

> 
> The driver supports scaling, cropping and pixel format conversion.
> 
> Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
> ---
>  drivers/media/platform/nxp/Kconfig            |    1 +
>  drivers/media/platform/nxp/Makefile           |    1 +
>  drivers/media/platform/nxp/dw100/Kconfig      |   16 +
>  drivers/media/platform/nxp/dw100/Makefile     |    3 +
>  drivers/media/platform/nxp/dw100/dw100.c      | 1782 +++++++++++++++++
>  drivers/media/platform/nxp/dw100/dw100_regs.h |  118 ++
>  6 files changed, 1921 insertions(+)
>  create mode 100644 drivers/media/platform/nxp/dw100/Kconfig
>  create mode 100644 drivers/media/platform/nxp/dw100/Makefile
>  create mode 100644 drivers/media/platform/nxp/dw100/dw100.c
>  create mode 100644 drivers/media/platform/nxp/dw100/dw100_regs.h
> 
> diff --git a/drivers/media/platform/nxp/Kconfig b/drivers/media/platform/nxp/Kconfig
> index a0810df751dc..e5085e5e585b 100644
> --- a/drivers/media/platform/nxp/Kconfig
> +++ b/drivers/media/platform/nxp/Kconfig
> @@ -52,4 +52,5 @@ config VIDEO_MX2_EMMAPRP
>  	    memory to memory. Operations include resizing and format
>  	    conversion.
>  
> +source "drivers/media/platform/nxp/dw100/Kconfig"
>  source "drivers/media/platform/nxp/imx-jpeg/Kconfig"
> diff --git a/drivers/media/platform/nxp/Makefile b/drivers/media/platform/nxp/Makefile
> index efc38c6578ce..22ba28ac6d63 100644
> --- a/drivers/media/platform/nxp/Makefile
> +++ b/drivers/media/platform/nxp/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  
> +obj-y += dw100/
>  obj-y += imx-jpeg/
>  
>  obj-$(CONFIG_VIDEO_IMX_MIPI_CSIS) += imx-mipi-csis.o
> diff --git a/drivers/media/platform/nxp/dw100/Kconfig b/drivers/media/platform/nxp/dw100/Kconfig
> new file mode 100644
> index 000000000000..86ccac6fd120
> --- /dev/null
> +++ b/drivers/media/platform/nxp/dw100/Kconfig
> @@ -0,0 +1,16 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config VIDEO_DW100
> +	tristate "NXP i.MX DW100 dewarper"
> +	depends on V4L_MEM2MEM_DRIVERS
> +	depends on VIDEO_DEV
> +	depends on ARCH_MXC || COMPILE_TEST
> +	select VIDEOBUF2_DMA_CONTIG
> +	select V4L2_MEM2MEM_DEV
> +	help
> +	  DW100 is a memory-to-memory engine performing geometrical
> +	  transformation on source image through a programmable dewarping map.

s/source image/source images/

> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called dw100.
> +
> diff --git a/drivers/media/platform/nxp/dw100/Makefile b/drivers/media/platform/nxp/dw100/Makefile
> new file mode 100644
> index 000000000000..49db80589e9a
> --- /dev/null
> +++ b/drivers/media/platform/nxp/dw100/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +obj-$(CONFIG_VIDEO_DW100) += dw100.o
> diff --git a/drivers/media/platform/nxp/dw100/dw100.c b/drivers/media/platform/nxp/dw100/dw100.c
> new file mode 100644
> index 000000000000..869ca12e0bcd
> --- /dev/null
> +++ b/drivers/media/platform/nxp/dw100/dw100.c
> @@ -0,0 +1,1782 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * DW100 Hardware dewarper
> + *
> + * Copyright 2022 NXP
> + * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
> + *
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/debugfs.h>
> +#include <linux/interrupt.h>
> +#include <linux/minmax.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +
> +#include <media/v4l2-ctrls.h>
> +#include <media/v4l2-device.h>
> +#include <media/v4l2-event.h>
> +#include <media/v4l2-ioctl.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include <uapi/linux/dw100.h>
> +
> +#include "dw100_regs.h"
> +
> +#define DRV_NAME "dw100"
> +
> +#define MIN_W 176
> +#define MIN_H 144
> +#define MAX_W 4096
> +#define MAX_H 3072
> +#define ALIGN_W 3
> +#define ALIGN_H 3

Please add a DW100_ prefix to these macros to avoid possible future
namespace clashes.

> +
> +#define DW100_BLOCK_SIZE 16
> +
> +#define DW100_MIN_LUT_NELEMS		(((MIN_W / DW100_BLOCK_SIZE) + 1) \
> +					 * ((MIN_H / DW100_BLOCK_SIZE) + 1))
> +
> +#define DW100_MAX_LUT_NELEMS		(((MAX_W / DW100_BLOCK_SIZE) + 1) \
> +					 * ((MAX_H / DW100_BLOCK_SIZE) + 1))
> +
> +/*
> + * 16 controls have been reserved for this driver for future extension, but
> + * let's limit the related driver allocation to the effective number of controls
> + * in use.
> + */
> +#define DW100_MAX_CTRLS 1
> +#define DW100_CTRL_DEWARPING_MAP 0

Aligning all macros could make this more readable.

> +
> +static unsigned int debug;
> +module_param(debug, uint, 0644);
> +MODULE_PARM_DESC(debug, "Activate debug info");
> +
> +#define dprintk(lvl, dev, fmt, arg...) \
> +	v4l2_dbg(lvl, debug, &(dev)->v4l2_dev, "%s: " fmt, __func__, ## arg)

I'd just use dev_dbg() and drop the debug module parameter. Up to you.

> +
> +enum {
> +	V4L2_M2M_SRC = 0,
> +	V4L2_M2M_DST = 1,
> +};
> +
> +enum {
> +	V4L2_M2M_CAPTURE = 1,

BIT(0)

> +	V4L2_M2M_OUTPUT = 2,

and BIT(1) to emphesize they're bitflags.

> +};

Those names are also too generic, a DW100_ prefix would fix that. I'd
name the first two DW100_QUEUE_SRC and DW100_QUEUE_DST, and the last two
DW100_FMT_CAPTURE and DW100_FMT_OUTPUT.

> +
> +static struct dw100_fmt {

static const

> +	u32 fourcc;
> +	int depth;
> +	u32 types;
> +	u32 reg_format;
> +	bool reg_swap_uv;
> +} formats[] = {
> +	{
> +		.fourcc = V4L2_PIX_FMT_NV16,
> +		.depth = 16,
> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
> +		.reg_swap_uv = false,
> +	}, {
> +		.fourcc = V4L2_PIX_FMT_NV61,
> +		.depth = 16,
> +		.types = V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
> +		.reg_swap_uv = true,
> +	}, {
> +		.fourcc = V4L2_PIX_FMT_YUYV,
> +		.depth = 16,
> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
> +		.reg_swap_uv = false,
> +	}, {
> +		.fourcc = V4L2_PIX_FMT_UYVY,
> +		.depth = 16,
> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
> +		.reg_swap_uv = true,
> +	}, {
> +		.fourcc = V4L2_PIX_FMT_NV12,
> +		.depth = 12,
> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
> +		.reg_swap_uv = false,
> +	}, {
> +		.fourcc = V4L2_PIX_FMT_NV21,
> +		.depth = 12,
> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
> +		.reg_swap_uv = true,
> +	},
> +};
> +
> +static inline int to_dw100_fmt_type(enum v4l2_buf_type type)
> +{
> +	if (V4L2_TYPE_IS_OUTPUT(type))
> +		return V4L2_M2M_OUTPUT;
> +	else
> +		return V4L2_M2M_CAPTURE;
> +}
> +
> +#define NUM_FORMATS ARRAY_SIZE(formats)

I'd drop the macro and use ARRAY_SIZE(formats) directly in the two
locations where NUM_FORMATS is used. Same for NUM_CTRLS.

> +
> +static struct dw100_fmt *find_pixel_format(u32 pixel_format, int fmt_type)

static const struct

dw100_ prefix for the function name. Same below where applicable.

> +{
> +	struct dw100_fmt *fmt;

const

(and where applicable as indicated by the compiler in the rest of the
code)

It's often a good practice to move declaration of variables to the scope
they're used in.

> +	unsigned int k;

Any reason not to use the customary "i" variable ?

> +
> +	for (k = 0; k < NUM_FORMATS; k++) {
> +		fmt = &formats[k];
> +		if (fmt->fourcc == pixel_format && fmt->types & fmt_type)
> +			return fmt;
> +	}
> +
> +	return NULL;
> +}
> +
> +static struct dw100_fmt *find_format(struct v4l2_format *f)
> +{
> +	return find_pixel_format(f->fmt.pix.pixelformat,
> +			      to_dw100_fmt_type(f->type));
> +}
> +
> +static inline u32 dw100_bytesperline(struct dw100_fmt *fmt, u32 width)

You can drop the inline and let the compiler decide.

> +{
> +	switch (fmt->reg_format) {
> +	case DW100_DEWARP_CTRL_FORMAT_YUV422_SP:
> +	case DW100_DEWARP_CTRL_FORMAT_YUV420_SP:
> +		return width;
> +	case DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED:
> +	default:
> +		return (fmt->depth * width) >> 3;
> +	}
> +}
> +
> +static inline u32 dw100_sizeimage(struct dw100_fmt *fmt, u32 width, u32 height)
> +{
> +	return (fmt->depth * width * height) >> 3;
> +}
> +
> +struct dw100_device {
> +	struct platform_device *pdev;
> +	struct v4l2_m2m_dev	*m2m_dev;
> +	struct v4l2_device	v4l2_dev;
> +	struct video_device	vfd;
> +#ifdef CONFIG_MEDIA_CONTROLLER
> +	struct media_device	mdev;
> +#endif
> +	struct mutex		vfd_mutex;
> +	spinlock_t		irqlock;
> +	void __iomem		*mmio;
> +	struct clk_bulk_data	*clks;
> +	int			num_clks;
> +	struct dentry		*debugfs_root;
> +};

Could you please move all structure definitions before the functions ?

> +
> +static int dw100_dump_regs(struct dw100_device *dw_dev)
> +{
> +#define __DECLARE_REG(x) { #x, x }
> +	int i;
> +	struct reg_desc {
> +		const char * const name;
> +		unsigned int addr;
> +	} dw100_regs[] = {
> +		__DECLARE_REG(DW100_DEWARP_ID),
> +		__DECLARE_REG(DW100_DEWARP_CTRL),
> +		__DECLARE_REG(DW100_MAP_LUT_ADDR),
> +		__DECLARE_REG(DW100_MAP_LUT_SIZE),
> +		__DECLARE_REG(DW100_MAP_LUT_ADDR2),
> +		__DECLARE_REG(DW100_MAP_LUT_SIZE2),
> +		__DECLARE_REG(DW100_SRC_IMG_Y_BASE),
> +		__DECLARE_REG(DW100_SRC_IMG_UV_BASE),
> +		__DECLARE_REG(DW100_SRC_IMG_SIZE),
> +		__DECLARE_REG(DW100_SRC_IMG_STRIDE),
> +		__DECLARE_REG(DW100_DST_IMG_Y_BASE),
> +		__DECLARE_REG(DW100_DST_IMG_UV_BASE),
> +		__DECLARE_REG(DW100_DST_IMG_SIZE),
> +		__DECLARE_REG(DW100_DST_IMG_STRIDE),
> +		__DECLARE_REG(DW100_DST_IMG_Y_SIZE1),
> +		__DECLARE_REG(DW100_DST_IMG_UV_SIZE1),
> +		__DECLARE_REG(DW100_SRC_IMG_Y_BASE2),
> +		__DECLARE_REG(DW100_SRC_IMG_UV_BASE2),
> +		__DECLARE_REG(DW100_SRC_IMG_SIZE2),
> +		__DECLARE_REG(DW100_SRC_IMG_STRIDE2),
> +		__DECLARE_REG(DW100_DST_IMG_Y_BASE2),
> +		__DECLARE_REG(DW100_DST_IMG_UV_BASE2),
> +		__DECLARE_REG(DW100_DST_IMG_SIZE2),
> +		__DECLARE_REG(DW100_DST_IMG_STRIDE2),
> +		__DECLARE_REG(DW100_DST_IMG_Y_SIZE2),
> +		__DECLARE_REG(DW100_DST_IMG_UV_SIZE2),
> +		__DECLARE_REG(DW100_SWAP_CONTROL),
> +		__DECLARE_REG(DW100_VERTICAL_SPLIT_LINE),
> +		__DECLARE_REG(DW100_HORIZON_SPLIT_LINE),
> +		__DECLARE_REG(DW100_SCALE_FACTOR),
> +		__DECLARE_REG(DW100_ROI_START),
> +		__DECLARE_REG(DW100_BOUNDARY_PIXEL),
> +		__DECLARE_REG(DW100_INTERRUPT_STATUS),
> +		__DECLARE_REG(DW100_BUS_CTRL),
> +		__DECLARE_REG(DW100_BUS_CTRL1),
> +		__DECLARE_REG(DW100_BUS_TIME_OUT_CYCLE),
> +	};
> +
> +	for (i = 0; i < ARRAY_SIZE(dw100_regs); i++) {
> +		dev_info(&dw_dev->pdev->dev, "%s: %#x\n",
> +			 dw100_regs[i].name,
> +			 readl(dw_dev->mmio + dw100_regs[i].addr));

readl()/writel() wrappers would be nice, something along the lines of

static inline u32 dw100_read(struct dw100_device *dw_dev, u32 reg)
{
	return readl(dw_dev->mmio + reg);
}

static inline void dw100_write(struct dw100_device *dw_dev, u32 reg, u32 val)
{
	write(val, dw_dev->mmio + reg);
}

> +	}

As this is read through debugfs, I recommend using the seq_file API to
dump the registers to the debugfs file instead of the kernel log.

> +
> +	return 0;
> +}
> +
> +struct dw100_q_data {
> +	unsigned int		width;
> +	unsigned int		height;
> +	unsigned int		bytesperline;
> +	unsigned int		sizeimage;
> +	unsigned int		sequence;
> +	struct dw100_fmt	*fmt;
> +	struct v4l2_rect	crop;
> +};
> +
> +struct dw100_ctx {
> +	struct v4l2_fh			fh;
> +	struct dw100_device		*dw_dev;
> +	struct v4l2_ctrl_handler	hdl;
> +	struct v4l2_ctrl		*ctrls[DW100_MAX_CTRLS];
> +
> +	/* Look Up Table for pixel remapping */
> +	unsigned int			*map;
> +	dma_addr_t			map_dma;
> +	size_t				map_size;
> +	unsigned int			map_width;
> +	unsigned int			map_height;
> +
> +	/* Related colorspace properties propagated from input to output */
> +	enum v4l2_colorspace	colorspace;
> +	enum v4l2_xfer_func	xfer_func;
> +	enum v4l2_ycbcr_encoding ycbcr_enc;
> +	enum v4l2_quantization	quant;
> +
> +	/* Source and destination queue data */
> +	struct dw100_q_data   q_data[2];
> +};
> +
> +static inline struct dw100_ctx *file2ctx(struct file *file)
> +{
> +	return container_of(file->private_data, struct dw100_ctx, fh);
> +}
> +
> +static struct dw100_q_data *get_q_data(struct dw100_ctx *ctx,
> +				       enum v4l2_buf_type type)
> +{
> +	if (type == V4L2_BUF_TYPE_VIDEO_OUTPUT)
> +		return &ctx->q_data[V4L2_M2M_SRC];
> +	else
> +		return &ctx->q_data[V4L2_M2M_DST];
> +}
> +
> +static u32 dw100_get_n_vertices_from_length(u32 length)
> +{
> +	return DIV_ROUND_UP(length, DW100_BLOCK_SIZE) + 1;
> +}
> +
> +static u16 dw_map_convert_to_UQ12_4(u32 a)

s/dw_/dw100_/ (and where applicable below)
s/UQ/uq/

> +{
> +	return (u16)((a & 0xfff) << 4);
> +}
> +
> +static u32 dw_map_format_coordinates(u16 xq, u16 yq)
> +{
> +	return (u32)((yq << 16) | xq);
> +}
> +
> +static u32 *dw100_get_user_map(struct dw100_ctx *ctx)
> +{
> +	struct v4l2_ctrl *ctrl = ctx->ctrls[DW100_CTRL_DEWARPING_MAP];
> +

No need for a blank line.

> +	size_t user_map_size;
> +
> +	if (!ctrl) {
> +		v4l2_err(&ctx->dw_dev->v4l2_dev, "Mapping control not found !");
> +		return NULL;
> +	}

I don't think this can ever happen.

> +
> +	if (ctrl->elems < DW100_MIN_LUT_NELEMS ||
> +	    ctrl->elems > DW100_MAX_LUT_NELEMS)
> +		return NULL;
> +
> +	user_map_size = ctrl->elems * ctrl->elem_size;
> +
> +	if ((ctrl->elems * ctrl->elem_size) == ctx->map_size)

	if (user_map_size == ctx->map_size)

> +		return ctrl->p_cur.p_u32;
> +
> +	v4l2_warn(&ctx->dw_dev->v4l2_dev,
> +		  "Skipping invalid user map (%zu != %zu)\n",
> +		  user_map_size,
> +		  ctx->map_size);

The last two lines hold in a single line.

This should be a dev_dbg(), to avoid flooding the kernel log with
userspace-triggered messages.

I'd also write

	if (user_map_size != ctx->map_size) {
		dev_dbg(...);
		return NULL;
	}

	return ctrl->p_cur.p_u32;

to match the pattern of the previous error check.

> +
> +	return NULL;
> +}
> +
> +/*
> + * Create an identity map if not set by the application

This comment is a bit misleading, I initially thought the function's
purpose was to create an identity mapping. You can write

  * Create the dewarp map from the V4L2 control. If the control hasn't been set
  * by the application, generate an identity mapping.

> + *
> + * A 16 pixels cell size grid is mapped on the destination image.
> + * The last cells width/height might be lesser than 16 if the destination image
> + * width/height is not divisible by 16. This dewarping grid map specifies the
> + * source image pixel location (x, y) on each grid intersection point.
> + * Bilinear interpolation is used to compute inner cell points locations.
> + *
> + * The coordinates are saved in UQ12.4 fixed point format.
> + *

Extra blank line.

> + */
> +static int dw100_create_mapping(struct dw100_ctx *ctx)
> +{
> +	u32 sw, sh, dw, dh, mw, mh, i, j;
> +	u16 qx, qy, qdx, qdy, qsh, qsw;
> +	bool is_user_map = false;
> +	u32 *user_map;
> +
> +	sw = ctx->q_data[V4L2_M2M_SRC].width;
> +	dw = ctx->q_data[V4L2_M2M_DST].width;
> +	sh = ctx->q_data[V4L2_M2M_SRC].height;
> +	dh = ctx->q_data[V4L2_M2M_DST].height;
> +
> +	mw = dw100_get_n_vertices_from_length(dw);
> +	mh = dw100_get_n_vertices_from_length(dh);
> +
> +	qdx = dw_map_convert_to_UQ12_4(sw) / (mw - 1);
> +	qdy = dw_map_convert_to_UQ12_4(sh) / (mh - 1);

These two lines can be moved just before the for loop as you don't need
those variables when the user has supplied a mapping. You can then write
them

	qdx = qsw / (mw - 1);
	qdy = qsh / (mh - 1);

> +	qsh = dw_map_convert_to_UQ12_4(sh);
> +	qsw = dw_map_convert_to_UQ12_4(sw);

Swap the last two lines.

> +
> +	if (ctx->map)
> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
> +				  ctx->map_size,
> +				  ctx->map,
> +				  ctx->map_dma);

This holds on two liens.

> +
> +	ctx->map_width = mw;
> +	ctx->map_height = mh;
> +	ctx->map_size = mh * mw * sizeof(u32);
> +
> +	ctx->map = dma_alloc_coherent(&ctx->dw_dev->pdev->dev,
> +				      ctx->map_size,
> +				      &ctx->map_dma,
> +				      GFP_KERNEL);

Same here.

> +
> +	if (!ctx->map)
> +		return -ENOMEM;
> +
> +	user_map = dw100_get_user_map(ctx);
> +	if (user_map) {
> +		is_user_map = true;
> +		memcpy(ctx->map, user_map, ctx->map_size);
> +		goto out;
> +	}
> +
> +	for (i = 0, qy = 0, qx = 0; i < mh; i++, qy += qdy, qx = 0) {
> +		if (qy > qsh)
> +			qy = qsh;

Can this happen ?

> +		for (j = 0; j < mw; j++, qx += qdx) {

		for (j = 0, qx = 0; j < mw; j++, qx += qdx) {

and drop qw handling from the outer loop.

> +			if (qx > qsw)
> +				qx = qsw;
> +			ctx->map[i * mw + j] = dw_map_format_coordinates(qx, qy);

This could be made a bit more efficient by declaring a

	u32 *map;

variable, setting it to

	map = ctx->map;

just before the loop, and writing

		*map++ = dw_map_format_coordinates(qx, qy);

here.

> +		}
> +	}
> +out:
> +	dprintk(1, ctx->dw_dev,
> +		"%dx%d %s mapping created (d:%pa-c:%p) for stream %dx%d->%dx%d\n",

%u instead of %d for unsigned integers.

> +			mw, mh, is_user_map ? "user" : "identity",
> +			&ctx->map_dma, ctx->map, sw, sh, dw, dh);
> +
> +	return 0;
> +}
> +
> +static const struct v4l2_ctrl_config controls[] = {
> +	[DW100_CTRL_DEWARPING_MAP] = {
> +		.id = V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP,
> +		.name = "Look-Up Table",
> +		.type = V4L2_CTRL_TYPE_U32,
> +		.min = 0x00000000,
> +		.max = 0xffffffff,
> +		.step = 1,
> +		.def = 0,
> +		.dims = { DW100_MAX_LUT_NELEMS },
> +		.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
> +	},
> +};
> +
> +#define NUM_CTRLS ARRAY_SIZE(controls)
> +
> +static int dw100_queue_setup(struct vb2_queue *vq,
> +			     unsigned int *nbuffers, unsigned int *nplanes,
> +			     unsigned int sizes[], struct device *alloc_devs[])
> +{
> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vq);
> +	struct dw100_q_data *q_data;
> +	unsigned int size, count = *nbuffers;
> +
> +	q_data = get_q_data(ctx, vq->type);
> +
> +	size = q_data->sizeimage;
> +
> +	*nbuffers = count;
> +
> +	if (*nplanes)
> +		return sizes[0] < size ? -EINVAL : 0;
> +
> +	*nplanes = 1;
> +	sizes[0] = size;
> +
> +	dprintk(1, ctx->dw_dev, "Queue %p: get %d buffer(s) of size %d each.\n",
> +		vq, count, size);

I think I'd drop this message, or have you found it useful to debug
application issues ?

> +
> +	return 0;
> +}
> +
> +static int dw100_buf_prepare(struct vb2_buffer *vb)
> +{
> +	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> +	struct dw100_device *dw_dev = ctx->dw_dev;
> +	struct dw100_q_data *q_data;
> +
> +	dprintk(1, dw_dev, "Queue %p: Preparing buffer %p, type: %d\n",
> +		vb->vb2_queue, vb, vb->vb2_queue->type);

Same here.

> +
> +	q_data = get_q_data(ctx, vb->vb2_queue->type);
> +	if (V4L2_TYPE_IS_OUTPUT(vb->vb2_queue->type)) {
> +		if (vbuf->field == V4L2_FIELD_ANY)

This can't happen,
https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/field-order.html#c.v4l2_field
states "struct v4l2_buffer field can never be V4L2_FIELD_ANY".

> +			vbuf->field = V4L2_FIELD_NONE;
> +		if (vbuf->field != V4L2_FIELD_NONE) {
> +			v4l2_err(&dw_dev->v4l2_dev, "%x field isn't supported\n",
> +				 vbuf->field);

dev_dbg()

> +			return -EINVAL;
> +		}
> +	}
> +
> +	if (vb2_plane_size(vb, 0) < q_data->sizeimage) {
> +		v4l2_err(&dw_dev->v4l2_dev,
> +			 "%s data will not fit into plane (%lu < %lu)\n",
> +			 __func__, vb2_plane_size(vb, 0),
> +			 (long)q_data->sizeimage);

dev_dbg()

> +		return -EINVAL;
> +	}
> +
> +	vb2_set_plane_payload(vb, 0, q_data->sizeimage);
> +
> +	return 0;
> +}
> +
> +static void dw100_buf_queue(struct vb2_buffer *vb)
> +{
> +	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> +
> +	dprintk(2, ctx->dw_dev, "Queue %p: Queuing buffer %p.\n",
> +		vb->vb2_queue, vb);
> +	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
> +}
> +
> +static int dw100_start_streaming(struct vb2_queue *q, unsigned int count)
> +{
> +	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
> +	struct dw100_q_data *q_data = get_q_data(ctx, q->type);
> +
> +	dprintk(1, ctx->dw_dev, "Queue %p: Start Streaming.\n", q);
> +
> +	q_data->sequence = 0;
> +
> +	return pm_runtime_resume_and_get(&ctx->dw_dev->pdev->dev);
> +}
> +
> +static void dw100_stop_streaming(struct vb2_queue *q)
> +{
> +	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
> +	struct vb2_v4l2_buffer *vbuf;
> +	unsigned long flags;
> +
> +	dprintk(1, ctx->dw_dev, "Queue %p: Stop Streaming.\n", q);
> +	for (;;) {
> +		if (V4L2_TYPE_IS_OUTPUT(q->type))
> +			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> +		else
> +			vbuf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> +		if (!vbuf)
> +			break;
> +		spin_lock_irqsave(&ctx->dw_dev->irqlock, flags);

I don't think you need the lock here, as this can't race with the IRQ,
can it ?

> +		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
> +		spin_unlock_irqrestore(&ctx->dw_dev->irqlock, flags);
> +	}
> +
> +	pm_runtime_put_sync(&ctx->dw_dev->pdev->dev);
> +
> +	if (ctx->map) {
> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
> +				  ctx->map_size,
> +				  ctx->map,
> +				  ctx->map_dma);
> +		ctx->map = NULL;

There's a mismatch between start and stop, could we create the mapping
in dw100_start_streaming() instead of dw100_start() ?

> +	}
> +}
> +
> +static const struct vb2_ops dw100_qops = {
> +	.queue_setup	 = dw100_queue_setup,
> +	.buf_prepare	 = dw100_buf_prepare,
> +	.buf_queue	 = dw100_buf_queue,
> +	.start_streaming = dw100_start_streaming,
> +	.stop_streaming  = dw100_stop_streaming,
> +	.wait_prepare	 = vb2_ops_wait_prepare,
> +	.wait_finish	 = vb2_ops_wait_finish,
> +};
> +
> +static int dw100_m2m_queue_init(void *priv, struct vb2_queue *src_vq,
> +				struct vb2_queue *dst_vq)
> +{
> +	struct dw100_ctx *ctx = priv;
> +	int ret;
> +
> +	src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
> +	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> +	src_vq->drv_priv = ctx;
> +	src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
> +	src_vq->ops = &dw100_qops;
> +	src_vq->mem_ops = &vb2_dma_contig_memops;
> +	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> +	src_vq->lock = &ctx->dw_dev->vfd_mutex;

This will serialize queue operations across all contexts. Wouldn't it be
better to have a per-context lock ?

> +	src_vq->dev = ctx->dw_dev->v4l2_dev.dev;
> +
> +	ret = vb2_queue_init(src_vq);
> +	if (ret)
> +		return ret;
> +
> +	dst_vq->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
> +	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> +	dst_vq->drv_priv = ctx;
> +	dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
> +	dst_vq->ops = &dw100_qops;
> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
> +	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> +	dst_vq->lock = &ctx->dw_dev->vfd_mutex;
> +	dst_vq->dev = ctx->dw_dev->v4l2_dev.dev;
> +
> +	return vb2_queue_init(dst_vq);
> +}
> +
> +static int dw100_open(struct file *file)
> +{
> +	struct dw100_device *dw_dev = video_drvdata(file);
> +	struct dw100_ctx *ctx;
> +	struct v4l2_ctrl_handler *hdl;
> +	int ret = 0, i;
> +
> +	if (mutex_lock_interruptible(&dw_dev->vfd_mutex))
> +		return -ERESTARTSYS;
> +
> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx) {
> +		ret = -ENOMEM;
> +		goto open_unlock;
> +	}

This can be moved before the mutex_lock call (with a return -ENOMEM
instead of the goto).

Actually I don't think you need the lock at all in this function.

> +
> +	v4l2_fh_init(&ctx->fh, video_devdata(file));
> +	file->private_data = &ctx->fh;
> +	ctx->dw_dev = dw_dev;
> +
> +	hdl = &ctx->hdl;
> +	v4l2_ctrl_handler_init(hdl, NUM_CTRLS);
> +	for (i = 0; i < NUM_CTRLS; i++) {
> +		ctx->ctrls[i] = v4l2_ctrl_new_custom(hdl, &controls[i], NULL);
> +		if (hdl->error) {
> +			v4l2_err(&ctx->dw_dev->v4l2_dev,
> +				 "Adding control (%d) failed\n", i);
> +			ret = hdl->error;
> +			v4l2_ctrl_handler_free(hdl);
> +			v4l2_fh_exit(&ctx->fh);
> +			kfree(ctx);
> +			goto open_unlock;
> +		}
> +	}
> +	ctx->fh.ctrl_handler = hdl;
> +
> +	ctx->q_data[V4L2_M2M_SRC].fmt = &formats[0];
> +	ctx->q_data[V4L2_M2M_SRC].width = 640;
> +	ctx->q_data[V4L2_M2M_SRC].height = 480;
> +	ctx->q_data[V4L2_M2M_SRC].bytesperline =
> +		dw100_bytesperline(&formats[0], 640);
> +	ctx->q_data[V4L2_M2M_SRC].sizeimage =
> +		dw100_sizeimage(&formats[0], 640, 480);
> +
> +	ctx->q_data[V4L2_M2M_SRC].crop.top = 0;
> +	ctx->q_data[V4L2_M2M_SRC].crop.left = 0;
> +	ctx->q_data[V4L2_M2M_SRC].crop.width = 640;
> +	ctx->q_data[V4L2_M2M_SRC].crop.height = 480;
> +
> +	ctx->q_data[V4L2_M2M_DST] = ctx->q_data[V4L2_M2M_SRC];
> +
> +	ctx->colorspace = V4L2_COLORSPACE_REC709;
> +	ctx->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(ctx->colorspace);
> +	ctx->ycbcr_enc = V4L2_MAP_YCBCR_ENC_DEFAULT(ctx->colorspace);
> +	ctx->quant = V4L2_MAP_QUANTIZATION_DEFAULT(false,
> +						   ctx->colorspace,
> +						   ctx->ycbcr_enc);
> +
> +	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dw_dev->m2m_dev,
> +					    ctx, &dw100_m2m_queue_init);
> +
> +	if (IS_ERR(ctx->fh.m2m_ctx)) {
> +		ret = PTR_ERR(ctx->fh.m2m_ctx);
> +
> +		v4l2_ctrl_handler_free(hdl);
> +		v4l2_fh_exit(&ctx->fh);
> +		kfree(ctx);
> +		goto open_unlock;
> +	}
> +
> +	v4l2_fh_add(&ctx->fh);
> +
> +	dprintk(1, dw_dev, "M2M instance created: %p", ctx->fh.m2m_ctx);

I'd drop this message too.

> +
> +open_unlock:
> +	mutex_unlock(&dw_dev->vfd_mutex);
> +	return ret;
> +}
> +
> +static int dw100_release(struct file *file)
> +{
> +	struct dw100_device *dw_dev = video_drvdata(file);
> +	struct dw100_ctx *ctx = file2ctx(file);
> +
> +	dprintk(1, dw_dev, "Releasing M2M instance: %p", ctx->fh.m2m_ctx);
> +
> +	v4l2_fh_del(&ctx->fh);
> +	v4l2_fh_exit(&ctx->fh);
> +	v4l2_ctrl_handler_free(&ctx->hdl);
> +
> +	mutex_lock(&dw_dev->vfd_mutex);
> +	v4l2_m2m_ctx_release(ctx->fh.m2m_ctx);
> +	mutex_unlock(&dw_dev->vfd_mutex);
> +
> +	if (ctx->map)
> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
> +				  ctx->map_size,
> +				  ctx->map,
> +				  ctx->map_dma);

Can this happen, doesn't the M2M framework guarantee that streaming is
stopped, which then guarantees that the .stop_streaming() handler has
freed the mapping ?

> +
> +	kfree(ctx);
> +
> +	return 0;
> +}
> +
> +static const struct v4l2_file_operations dw100_fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= dw100_open,
> +	.release	= dw100_release,
> +	.poll		= v4l2_m2m_fop_poll,
> +	.unlocked_ioctl	= video_ioctl2,
> +	.mmap		= v4l2_m2m_fop_mmap,
> +};
> +
> +static int dw100_querycap(struct file *file, void *priv,
> +			  struct v4l2_capability *cap)
> +{
> +	strscpy(cap->driver, DRV_NAME, sizeof(cap->driver));
> +	strscpy(cap->card, "DW100 dewarper", sizeof(cap->card));
> +	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s", DRV_NAME);

Not needed anymore, the core handles it in v4l_querycap() (and the value
here is incorrect, it should use the device name, not hardcode the
driver name).

> +
> +	return 0;
> +}
> +
> +static int dw100_enum_fmt(struct v4l2_fmtdesc *f)
> +{
> +	int i, num = 0;
> +
> +	for (i = 0; i < NUM_FORMATS; i++) {
> +		if (formats[i].types & to_dw100_fmt_type(f->type)) {
> +			if (num == f->index) {
> +				f->pixelformat = formats[i].fourcc;
> +				return 0;
> +			}
> +			++num;
> +		}
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static int dw100_enum_fmt_vid_cap(struct file *file, void *priv,
> +				  struct v4l2_fmtdesc *f)
> +{
> +	return dw100_enum_fmt(f);
> +}
> +
> +static int dw100_enum_fmt_vid_out(struct file *file, void *priv,
> +				  struct v4l2_fmtdesc *f)
> +{
> +	return dw100_enum_fmt(f);
> +}

You can merge those three functions into a single one.

> +
> +static int dw100_enum_framesizes(struct file *file, void *priv,
> +				 struct v4l2_frmsizeenum *fsize)
> +{
> +	struct dw100_fmt *fmt;
> +
> +	if (fsize->index)
> +		return -EINVAL;
> +
> +	fmt = find_pixel_format(fsize->pixel_format,
> +				V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE);

I think you should restrict the format type based on fsize->type.

> +	if (!fmt)
> +		return -EINVAL;
> +
> +	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
> +	fsize->stepwise.min_width = MIN_W;
> +	fsize->stepwise.min_height = MIN_H;
> +	fsize->stepwise.max_width = MAX_W;
> +	fsize->stepwise.max_height = MAX_H;
> +	fsize->stepwise.step_width = (1UL << ALIGN_W);
> +	fsize->stepwise.step_height = (1UL << ALIGN_H);

No need for the parentheses.

> +
> +	return 0;
> +}
> +
> +static int dw100_g_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
> +{
> +	struct vb2_queue *vq;
> +	struct dw100_q_data *q_data;
> +
> +	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
> +	if (!vq)
> +		return -EINVAL;
> +
> +	q_data = get_q_data(ctx, f->type);
> +
> +	f->fmt.pix.width	= q_data->width;
> +	f->fmt.pix.height	= q_data->height;
> +	f->fmt.pix.field	= V4L2_FIELD_NONE;
> +	f->fmt.pix.pixelformat	= q_data->fmt->fourcc;
> +	f->fmt.pix.bytesperline	= q_data->bytesperline;
> +	f->fmt.pix.sizeimage	= q_data->sizeimage;
> +	f->fmt.pix.colorspace	= ctx->colorspace;
> +	f->fmt.pix.xfer_func	= ctx->xfer_func;
> +	f->fmt.pix.ycbcr_enc	= ctx->ycbcr_enc;
> +	f->fmt.pix.quantization	= ctx->quant;
> +
> +	return 0;
> +}
> +
> +static int dw100_g_fmt_vid_out(struct file *file, void *priv,
> +			       struct v4l2_format *f)
> +{
> +	return dw100_g_fmt(file2ctx(file), f);
> +}
> +
> +static int dw100_g_fmt_vid_cap(struct file *file, void *priv,
> +			       struct v4l2_format *f)
> +{
> +	return dw100_g_fmt(file2ctx(file), f);
> +}

You can also merge those three functions into a single one.

> +
> +static int dw100_try_fmt(struct v4l2_format *f, struct dw100_fmt *fmt)
> +{
> +	v4l_bound_align_image(&f->fmt.pix.width, MIN_W, MAX_W, ALIGN_W,
> +			      &f->fmt.pix.height, MIN_H, MAX_H, ALIGN_H, 0);
> +
> +	f->fmt.pix.bytesperline = dw100_bytesperline(fmt, f->fmt.pix.width);
> +	f->fmt.pix.sizeimage = dw100_sizeimage(fmt, f->fmt.pix.width,
> +					       f->fmt.pix.height);

No need to support a custom line stride ?

> +	f->fmt.pix.field = V4L2_FIELD_NONE;
> +
> +	return 0;
> +}
> +
> +static int dw100_s_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
> +{
> +	struct dw100_q_data *q_data;
> +	struct vb2_queue *vq;
> +
> +	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
> +	if (!vq)
> +		return -EINVAL;
> +
> +	q_data = get_q_data(ctx, f->type);
> +	if (!q_data)
> +		return -EINVAL;
> +
> +	if (vb2_is_busy(vq)) {
> +		v4l2_err(&ctx->dw_dev->v4l2_dev, "%s queue busy\n", __func__);
> +		return -EBUSY;
> +	}
> +
> +	q_data->fmt		= find_format(f);
> +	q_data->width		= f->fmt.pix.width;
> +	q_data->height		= f->fmt.pix.height;
> +	q_data->bytesperline	= f->fmt.pix.bytesperline;
> +	q_data->sizeimage	= f->fmt.pix.sizeimage;

It could be easier to store an instance of v4l2_pix_format in
dw100_q_data.

> +
> +	q_data->crop.top = 0;
> +	q_data->crop.left = 0;
> +	q_data->crop.width = f->fmt.pix.width;
> +	q_data->crop.height = f->fmt.pix.height;
> +
> +	dprintk(1, ctx->dw_dev,
> +		"Setting format for type %d, wxh: %dx%d, fmt: %d\n",
> +		f->type, q_data->width, q_data->height, q_data->fmt->fourcc);
> +
> +	return 0;
> +}
> +
> +static int dw100_try_fmt_vid_cap(struct file *file, void *priv,
> +				 struct v4l2_format *f)
> +{
> +	struct dw100_fmt *fmt;
> +	struct dw100_ctx *ctx = file2ctx(file);
> +
> +	if (f->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
> +		return -EINVAL;
> +
> +	fmt = find_format(f);
> +	if (!fmt) {
> +		f->fmt.pix.pixelformat = formats[0].fourcc;
> +		fmt = find_format(f);
> +	}
> +
> +	f->fmt.pix.colorspace = ctx->colorspace;
> +	f->fmt.pix.xfer_func = ctx->xfer_func;
> +	f->fmt.pix.ycbcr_enc = ctx->ycbcr_enc;
> +	f->fmt.pix.quantization = ctx->quant;
> +
> +	return dw100_try_fmt(f, fmt);
> +}
> +
> +static int dw100_s_fmt_vid_cap(struct file *file, void *priv,
> +			       struct v4l2_format *f)
> +{
> +	struct dw100_ctx *ctx = file2ctx(file);
> +	int ret;
> +
> +	ret = dw100_try_fmt_vid_cap(file, priv, f);
> +	if (ret)
> +		return ret;
> +
> +	ret = dw100_s_fmt(ctx, f);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int dw100_try_fmt_vid_out(struct file *file, void *priv,
> +				 struct v4l2_format *f)
> +{
> +	struct dw100_fmt *fmt;
> +
> +	if (f->type != V4L2_BUF_TYPE_VIDEO_OUTPUT)
> +		return -EINVAL;
> +
> +	fmt = find_format(f);
> +	if (!fmt) {
> +		f->fmt.pix.pixelformat = formats[0].fourcc;
> +		fmt = find_format(f);
> +	}
> +
> +	if (!f->fmt.pix.colorspace)
> +		f->fmt.pix.colorspace = V4L2_COLORSPACE_REC709;
> +
> +	return dw100_try_fmt(f, fmt);
> +}
> +
> +static int dw100_s_fmt_vid_out(struct file *file, void *priv,
> +			       struct v4l2_format *f)
> +{
> +	struct dw100_ctx *ctx = file2ctx(file);
> +	int ret;
> +
> +	ret = dw100_try_fmt_vid_out(file, priv, f);
> +	if (ret)
> +		return ret;
> +
> +	ret = dw100_s_fmt(ctx, f);
> +	if (ret)
> +		return ret;
> +
> +	ctx->colorspace = f->fmt.pix.colorspace;
> +	ctx->xfer_func = f->fmt.pix.xfer_func;
> +	ctx->ycbcr_enc = f->fmt.pix.ycbcr_enc;
> +	ctx->quant = f->fmt.pix.quantization;
> +
> +	return 0;
> +}
> +
> +static int dw100_g_selection(struct file *file, void *fh,
> +			     struct v4l2_selection *sel)
> +{
> +	struct dw100_ctx *ctx = file2ctx(file);
> +	struct dw100_q_data *src_q_data, *dst_q_data;
> +
> +	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
> +	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
> +		return -EINVAL;
> +
> +	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
> +	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
> +
> +	switch (sel->target) {
> +	case V4L2_SEL_TGT_CROP_DEFAULT:
> +	case V4L2_SEL_TGT_CROP_BOUNDS:
> +		sel->r.top = 0;
> +		sel->r.left = 0;
> +		sel->r.width = src_q_data->width;
> +		sel->r.height = src_q_data->height;
> +		break;
> +	case V4L2_SEL_TGT_CROP:
> +		sel->r.top = src_q_data->crop.top;
> +		sel->r.left = src_q_data->crop.left;
> +		sel->r.width = src_q_data->crop.width;
> +		sel->r.height = src_q_data->crop.height;
> +		break;
> +	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
> +	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
> +	case V4L2_SEL_TGT_COMPOSE:
> +		sel->r.top = 0;
> +		sel->r.left = 0;
> +		sel->r.width = dst_q_data->width;
> +		sel->r.height = dst_q_data->height;
> +		break;

As far as I understand the driver doesn't support composition, maybe you
could just return -EINVAL ?

> +	default:
> +		return -EINVAL;
> +	}
> +
> +	dprintk(1, ctx->dw_dev,
> +		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
> +		sel->type, sel->target,
> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
> +
> +	return 0;
> +}
> +
> +static	int dw100_s_selection(struct file *file, void *fh,

Replace tab with space.

> +			      struct v4l2_selection *sel)
> +{
> +	struct dw100_ctx *ctx = file2ctx(file);
> +	struct dw100_q_data *src_q_data, *dst_q_data;
> +	u32 qscalex, qscaley, qscale;
> +	int x, y, w, h;
> +
> +	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
> +	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
> +		return -EINVAL;
> +
> +	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
> +	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
> +
> +	dprintk(1, ctx->dw_dev,
> +		">>> Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
> +		sel->type, sel->target,
> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
> +
> +	if (sel->r.top < 0 || sel->r.left < 0)
> +		return -EINVAL;

Clamp the top and left values to the supported range, .s_selection()
must adjust unsupported values, not return an error.

> +
> +	switch (sel->target) {
> +	case V4L2_SEL_TGT_CROP:
> +		/* UQ16.16 for float operations */
> +		if ((sel->r.left + sel->r.width > src_q_data->width) ||
> +		    (sel->r.top + sel->r.height > src_q_data->height))
> +			return -EINVAL;

Same.

> +		qscalex = (sel->r.width << 16) / src_q_data->width;
> +		qscaley = (sel->r.height << 16) / src_q_data->height;
> +		y = sel->r.top;
> +		x = sel->r.left;
> +		if (qscalex == qscaley) {
> +			qscale = qscalex;
> +		} else {
> +			switch (sel->flags) {
> +			case 0:
> +				qscale = (qscalex + qscaley) / 2;
> +				break;
> +			case V4L2_SEL_FLAG_GE:
> +				qscale = max(qscaley, qscalex);
> +				break;
> +			case V4L2_SEL_FLAG_LE:
> +				qscale = min(qscaley, qscalex);
> +				break;
> +			case V4L2_SEL_FLAG_LE | V4L2_SEL_FLAG_GE:
> +				return -ERANGE;
> +			default:
> +				return -EINVAL;
> +			}
> +		}
> +
> +		w = (u32)((((u64)src_q_data->width << 16) * qscale) >> 32);
> +		h = (u32)((((u64)src_q_data->height << 16) * qscale) >> 32);
> +		x = x + (sel->r.width  - w) / 2;
> +		y = y + (sel->r.height  - h) / 2;
> +		x = min(src_q_data->width - w, (unsigned int)max(0, x));
> +		y = min(src_q_data->height - h, (unsigned int)max(0, y));
> +
> +		sel->r.top = y;
> +		sel->r.left = x;
> +		sel->r.width = w;
> +		sel->r.height = h;
> +
> +		src_q_data->crop.top = sel->r.top;
> +		src_q_data->crop.left = sel->r.left;
> +		src_q_data->crop.width = sel->r.width;
> +		src_q_data->crop.height = sel->r.height;
> +		break;
> +	case V4L2_SEL_TGT_COMPOSE:
> +		if ((sel->r.left + sel->r.width > dst_q_data->width) ||
> +		    (sel->r.top + sel->r.height > dst_q_data->height))
> +			return -EINVAL;
> +		sel->r.top = 0;
> +		sel->r.left = 0;
> +		sel->r.width = dst_q_data->width;
> +		sel->r.height = dst_q_data->height;
> +		break;
> +	case V4L2_SEL_TGT_CROP_DEFAULT:
> +	case V4L2_SEL_TGT_CROP_BOUNDS:
> +	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
> +	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	dprintk(1, ctx->dw_dev,
> +		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
> +		sel->type, sel->target,
> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
> +
> +	return 0;
> +}
> +
> +static const struct v4l2_ioctl_ops dw100_ioctl_ops = {
> +	.vidioc_querycap	= dw100_querycap,
> +
> +	.vidioc_enum_fmt_vid_cap = dw100_enum_fmt_vid_cap,
> +	.vidioc_enum_framesizes = dw100_enum_framesizes,
> +	.vidioc_g_fmt_vid_cap	= dw100_g_fmt_vid_cap,
> +	.vidioc_try_fmt_vid_cap	= dw100_try_fmt_vid_cap,
> +	.vidioc_s_fmt_vid_cap	= dw100_s_fmt_vid_cap,
> +
> +	.vidioc_enum_fmt_vid_out = dw100_enum_fmt_vid_out,
> +	.vidioc_g_fmt_vid_out	= dw100_g_fmt_vid_out,
> +	.vidioc_try_fmt_vid_out	= dw100_try_fmt_vid_out,
> +	.vidioc_s_fmt_vid_out	= dw100_s_fmt_vid_out,
> +
> +	.vidioc_g_selection	= dw100_g_selection,
> +	.vidioc_s_selection	= dw100_s_selection,
> +	.vidioc_reqbufs		= v4l2_m2m_ioctl_reqbufs,
> +	.vidioc_querybuf	= v4l2_m2m_ioctl_querybuf,
> +	.vidioc_qbuf		= v4l2_m2m_ioctl_qbuf,
> +	.vidioc_dqbuf		= v4l2_m2m_ioctl_dqbuf,
> +	.vidioc_prepare_buf	= v4l2_m2m_ioctl_prepare_buf,
> +	.vidioc_create_bufs	= v4l2_m2m_ioctl_create_bufs,
> +	.vidioc_expbuf		= v4l2_m2m_ioctl_expbuf,
> +
> +	.vidioc_streamon	= v4l2_m2m_ioctl_streamon,
> +	.vidioc_streamoff	= v4l2_m2m_ioctl_streamoff,
> +
> +	.vidioc_subscribe_event = v4l2_ctrl_subscribe_event,
> +	.vidioc_unsubscribe_event = v4l2_event_unsubscribe,
> +};
> +
> +static void dw100_job_finish(struct dw100_device *dw_dev, bool with_error)
> +{
> +	struct dw100_ctx *curr_ctx;
> +	struct vb2_v4l2_buffer *src_vb, *dst_vb;
> +	unsigned long flags;
> +	enum vb2_buffer_state buf_state;
> +
> +	curr_ctx = v4l2_m2m_get_curr_priv(dw_dev->m2m_dev);
> +
> +	if (!curr_ctx) {
> +		v4l2_err(&dw_dev->v4l2_dev,
> +			 "Instance released before the end of transaction\n");
> +		return;
> +	}
> +
> +	src_vb = v4l2_m2m_src_buf_remove(curr_ctx->fh.m2m_ctx);
> +	dst_vb = v4l2_m2m_dst_buf_remove(curr_ctx->fh.m2m_ctx);
> +
> +	if (likely(!with_error))
> +		buf_state = VB2_BUF_STATE_DONE;
> +	else
> +		buf_state = VB2_BUF_STATE_ERROR;
> +
> +	spin_lock_irqsave(&dw_dev->irqlock, flags);
> +	v4l2_m2m_buf_done(src_vb, buf_state);
> +	v4l2_m2m_buf_done(dst_vb, buf_state);
> +	spin_unlock_irqrestore(&dw_dev->irqlock, flags);

And the lock could be droppped from here if it's dropped from
dw100_stop_streaming().

> +
> +	dprintk(2, dw_dev, "Finishing transaction with%s error(s)\n",
> +		with_error ? "" : "out");
> +
> +	v4l2_m2m_job_finish(dw_dev->m2m_dev, curr_ctx->fh.m2m_ctx);
> +}
> +
> +static void dw100_hw_reset(struct dw100_device *dw_dev)
> +{
> +	u32 val;
> +
> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
> +	val |= DW100_DEWARP_CTRL_ENABLE;
> +	val |= DW100_DEWARP_CTRL_SOFT_RESET;
> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
> +	val &= ~DW100_DEWARP_CTRL_SOFT_RESET;
> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
> +}
> +
> +static void _dw100_hw_set_master_bus_enable(struct dw100_device *dw_dev,
> +					    unsigned int enable)
> +{
> +	u32 val;
> +	void __iomem *addr = dw_dev->mmio + DW100_BUS_CTRL;
> +
> +	dprintk(3, dw_dev, "%sable master bus\n", enable ? "En" : "Dis");
> +
> +	val = readl(addr);
> +
> +	if (enable)
> +		val |= DW100_BUS_CTRL_AXI_MASTER_ENABLE;
> +	else
> +		val &= ~DW100_BUS_CTRL_AXI_MASTER_ENABLE;
> +
> +	writel(val, addr);
> +}
> +
> +static void dw100_hw_master_bus_enable(struct dw100_device *dw_dev)
> +{
> +	_dw100_hw_set_master_bus_enable(dw_dev, 1);
> +}
> +
> +static void dw100_hw_master_bus_disable(struct dw100_device *dw_dev)
> +{
> +	_dw100_hw_set_master_bus_enable(dw_dev, 0);
> +}
> +
> +static void dw100_hw_dewarp_start(struct dw100_device *dw_dev)
> +{
> +	u32 val;
> +	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
> +
> +	val = readl(addr);
> +
> +	dprintk(3, dw_dev, "Starting Hardware CTRL:%x\n", val);
> +	writel(val | DW100_DEWARP_CTRL_START, addr);
> +	writel(val, addr);
> +}
> +
> +static void dw100_hw_init_ctrl(struct dw100_device *dw_dev)
> +{
> +	u32 val;
> +	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
> +	/*
> +	 * Input format YUV422_SP
> +	 * Output format YUV422_SP
> +	 * No hardware handshake (SW)
> +	 * No automatic double src buffering (Single)
> +	 * No automatic double dst buffering (Single)
> +	 * No Black Line
> +	 * Prefetch image pixel traversal
> +	 */
> +
> +	val = DW100_DEWARP_CTRL_ENABLE
> +		/* Valid only for auto prefetch mode*/
> +		| DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(32);

Align | below the =.

> +
> +	/*
> +	 * Calculation mode required to support any scaling factor,
> +	 * but x4 slower than traversal mode.
> +	 *
> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL
> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION
> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO
> +	 *
> +	 * TODO: Find heuristics requiring calculation mode
> +	 *

Extra blank line.

> +	 */
> +	val |= DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION;
> +
> +	writel(val, addr);
> +}
> +
> +static void dw100_hw_set_pixel_boundary(struct dw100_device *dw_dev)
> +{
> +	u32 val;
> +	void __iomem *addr = dw_dev->mmio + DW100_BOUNDARY_PIXEL;
> +
> +	val = DW100_BOUNDARY_PIXEL_V(128)
> +		| DW100_BOUNDARY_PIXEL_U(128)
> +		| DW100_BOUNDARY_PIXEL_Y(0);
> +
> +	writel(val, addr);
> +}
> +
> +static void dw100_hw_set_scale(struct dw100_device *dw_dev, u8 scale)
> +{
> +	void __iomem *addr = dw_dev->mmio + DW100_SCALE_FACTOR;
> +
> +	dprintk(1, dw_dev, "Setting scale factor to %d\n", scale);
> +
> +	writel(scale, addr);
> +}
> +
> +static void dw100_hw_set_roi(struct dw100_device *dw_dev, u32 x, u32 y)
> +{
> +	u32 val;
> +	void __iomem *addr = dw_dev->mmio + DW100_ROI_START;
> +
> +	dprintk(1, dw_dev, "Setting ROI region to %d.%d\n", x, y);
> +
> +	val = DW100_ROI_START_X(x) | DW100_ROI_START_Y(y);
> +
> +	writel(val, addr);
> +}
> +
> +static void dw100_hw_set_src_crop(struct dw100_device *dw_dev,
> +				  struct dw100_q_data *src_q_data,
> +				  struct dw100_q_data *dst_q_data)

const

> +{
> +	struct v4l2_rect *rect = &src_q_data->crop;
> +	u32 src_scale, qscale, left_scale, top_scale;
> +
> +	/* HW Scale is UQ1.7 encoded */
> +	src_scale = (rect->width << 7) / src_q_data->width;
> +	dw100_hw_set_scale(dw_dev, src_scale);
> +
> +	qscale = (dst_q_data->width << 7)  / src_q_data->width;
> +
> +	left_scale = (((rect->left << 7) * qscale) >> 14);
> +	top_scale = (((rect->top << 7) * qscale) >> 14);

No need for the outer parentheses.

> +
> +	dw100_hw_set_roi(dw_dev, left_scale, top_scale);
> +}
> +
> +static void dw100_hw_set_source(struct dw100_device *dw_dev,
> +				struct dw100_q_data *q_data,
> +				dma_addr_t addr)
> +{
> +	u32 width, height, stride, fourcc, val;
> +	struct dw100_fmt *fmt = q_data->fmt;
> +
> +	width =  q_data->width;
> +	height = q_data->height;
> +	stride = q_data->bytesperline;
> +	fourcc = q_data->fmt->fourcc;
> +
> +	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
> +		width, height, stride, fourcc, &addr);
> +
> +	/* Pixel Format */
> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
> +
> +	val &= ~DW100_DEWARP_CTRL_INPUT_FORMAT_MASK;
> +	val |= DW100_DEWARP_CTRL_INPUT_FORMAT(fmt->reg_format);
> +
> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
> +
> +	/* Swap */
> +	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
> +
> +	val &= ~DW100_SWAP_CONTROL_SRC_MASK;
> +	/*
> +	 * Data swapping is performed only on Y plane for source image.
> +	 */
> +	if (fmt->reg_swap_uv &&
> +	    fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
> +		val |= DW100_SWAP_CONTROL_SRC(DW100_SWAP_CONTROL_Y
> +					      (DW100_SWAP_CONTROL_BYTE));
> +
> +	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
> +
> +	/* Image resolution */
> +	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
> +	       dw_dev->mmio + DW100_SRC_IMG_SIZE);
> +
> +	writel(stride, dw_dev->mmio + DW100_SRC_IMG_STRIDE);
> +
> +	/* Buffers */
> +	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_SRC_IMG_Y_BASE);
> +	writel(DW100_IMG_UV_BASE((addr + (stride * height))),
> +	       dw_dev->mmio + DW100_SRC_IMG_UV_BASE);

As the hardware supports non-contiguous planes, we should use the mplane
API already, even if we support the single planar formats only for now.

> +}
> +
> +static void dw100_hw_set_destination(struct dw100_device *dw_dev,
> +				     struct dw100_q_data *q_data,
> +				     struct dw100_fmt *ifmt,
> +				     dma_addr_t addr)
> +{
> +	u32 width, height, stride, fourcc, val;
> +	struct dw100_fmt *fmt = q_data->fmt;
> +
> +	width =  q_data->width;
> +	height = q_data->height;
> +	stride = q_data->bytesperline;
> +	fourcc = q_data->fmt->fourcc;
> +
> +	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
> +		width, height, stride, fourcc, &addr);
> +
> +	/* Pixel Format */
> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
> +
> +	val &= ~DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK;
> +	val |= DW100_DEWARP_CTRL_OUTPUT_FORMAT(fmt->reg_format);
> +
> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
> +
> +	/* Swap */
> +	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
> +
> +	val &= ~DW100_SWAP_CONTROL_DST_MASK;
> +
> +	/*
> +	 * Avoid to swap twice
> +	 */
> +	if (fmt->reg_swap_uv ^
> +	    (ifmt->reg_swap_uv && ifmt->reg_format !=
> +	     DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)) {
> +		if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
> +			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_Y
> +						      (DW100_SWAP_CONTROL_BYTE));
> +		else
> +			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_UV
> +						      (DW100_SWAP_CONTROL_BYTE));
> +	}
> +
> +	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
> +
> +	/* Image resolution */
> +	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
> +	       dw_dev->mmio + DW100_DST_IMG_SIZE);
> +	writel(stride, dw_dev->mmio + DW100_DST_IMG_STRIDE);
> +
> +	val = ALIGN(stride * height, 16);
> +	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_DST_IMG_Y_BASE);
> +	writel(DW100_IMG_UV_BASE(addr + val),
> +	       dw_dev->mmio + DW100_DST_IMG_UV_BASE);
> +	writel(DW100_DST_IMG_Y_SIZE(val), dw_dev->mmio + DW100_DST_IMG_Y_SIZE1);
> +
> +	if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV420_SP)
> +		val /= 2;
> +
> +	writel(DW100_DST_IMG_UV_SIZE(val),
> +	       dw_dev->mmio + DW100_DST_IMG_UV_SIZE1);
> +}
> +
> +static void dw100_hw_set_mapping(struct dw100_device *dw_dev, dma_addr_t addr,
> +				 u32 width, u32 height)
> +{
> +	dprintk(1, dw_dev, "Set HW mapping registers for %dx%d addr:%pa", width,
> +		height, &addr);
> +
> +	writel(DW100_MAP_LUT_ADDR_ADDR(addr), dw_dev->mmio + DW100_MAP_LUT_ADDR);
> +	writel(DW100_MAP_LUT_SIZE_WIDTH(width)
> +		| DW100_MAP_LUT_SIZE_HEIGHT(height),
> +			dw_dev->mmio + DW100_MAP_LUT_SIZE);
> +}
> +
> +static void dw100_hw_clear_irq(struct dw100_device *dw_dev, unsigned int irq)
> +{
> +	writel(DW100_INTERRUPT_STATUS_INT_CLEAR(irq),
> +	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
> +}
> +
> +static void dw100_hw_enable_irq(struct dw100_device *dw_dev)
> +{
> +	writel(DW100_INTERRUPT_STATUS_INT_ENABLE_MASK,
> +	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
> +}
> +
> +static void dw100_hw_disable_irq(struct dw100_device *dw_dev)
> +{
> +	writel(0, dw_dev->mmio + DW100_INTERRUPT_STATUS);
> +}
> +
> +static bool dw100_hw_is_frame_running(struct dw100_device *dw_dev)
> +{
> +	bool is_running = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS)
> +				& DW100_INTERRUPT_STATUS_FRAME_BUSY;
> +
> +	return is_running;
> +}
> +
> +static u32 dw_hw_get_pending_irqs(struct dw100_device *dw_dev)
> +{
> +	u32 val;
> +
> +	val = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS);
> +
> +	return DW100_INTERRUPT_STATUS_INT_STATUS(val);
> +}
> +
> +static irqreturn_t dw100_irq_handler(int irq, void *dev_id)
> +{
> +	struct dw100_device *dw_dev = dev_id;
> +	u32 pending_irqs, err_irqs, frame_done_irq;
> +	bool with_error = true;
> +
> +	pending_irqs = dw_hw_get_pending_irqs(dw_dev);
> +	frame_done_irq = pending_irqs & DW100_INTERRUPT_STATUS_INT_FRAME_DONE;
> +	err_irqs = DW100_INTERRUPT_STATUS_INT_ERR_STATUS(pending_irqs);
> +
> +	if (frame_done_irq) {
> +		dprintk(3, dw_dev, "Frame done interrupt\n");
> +		with_error = false;
> +		err_irqs &= ~DW100_INTERRUPT_STATUS_INT_ERR_STATUS
> +			(DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE);
> +	}
> +
> +	if (err_irqs)
> +		v4l2_err(&dw_dev->v4l2_dev, "Interrupt error: %#x\n", err_irqs);
> +
> +	dw100_hw_disable_irq(dw_dev);
> +	dw100_hw_master_bus_disable(dw_dev);
> +	dw100_hw_clear_irq(dw_dev, pending_irqs |
> +			   DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT);
> +
> +	dw100_job_finish(dw_dev, with_error);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static void dw100_start(struct dw100_ctx *ctx, struct vb2_v4l2_buffer *in_vb,
> +			struct vb2_v4l2_buffer *out_vb)
> +{
> +	struct dw100_device *dw_dev = ctx->dw_dev;
> +	dma_addr_t p_in, p_out;
> +
> +	p_in = vb2_dma_contig_plane_dma_addr(&in_vb->vb2_buf, 0);
> +	p_out = vb2_dma_contig_plane_dma_addr(&out_vb->vb2_buf, 0);
> +	if (!p_in || !p_out) {
> +		v4l2_err(&dw_dev->v4l2_dev,
> +			 "Acquiring DMA addresses of buffers failed\n");
> +		return;
> +	}
> +
> +	out_vb->sequence =
> +		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE)->sequence++;
> +	in_vb->sequence =
> +		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT)->sequence++;
> +
> +	dprintk(1, ctx->dw_dev,
> +		"Starting queues %p->%p buffers d:%pa->d:%pa, sequence %d->%d\n",
> +		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT),
> +		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE),
> +		&p_in, &p_out, in_vb->sequence, out_vb->sequence);
> +
> +	out_vb->vb2_buf.timestamp = in_vb->vb2_buf.timestamp;
> +	out_vb->field = in_vb->field;
> +	if (in_vb->flags & V4L2_BUF_FLAG_TIMECODE)
> +		out_vb->timecode = in_vb->timecode;
> +	out_vb->flags = in_vb->flags &
> +		(V4L2_BUF_FLAG_TIMECODE |
> +		 V4L2_BUF_FLAG_KEYFRAME |
> +		 V4L2_BUF_FLAG_PFRAME |
> +		 V4L2_BUF_FLAG_BFRAME |
> +		 V4L2_BUF_FLAG_TIMESTAMP_MASK |
> +		 V4L2_BUF_FLAG_TSTAMP_SRC_MASK);

v4l2_m2m_buf_copy_metadata() will handle this for you. I think you can
call it here, or in dw100_job_finish().

> +
> +	/* Now, let's deal with hardware ... */
> +	dw100_hw_master_bus_disable(dw_dev);
> +	if (!ctx->map)
> +		dw100_create_mapping(ctx);
> +
> +	dw100_hw_init_ctrl(dw_dev);
> +	dw100_hw_set_pixel_boundary(dw_dev);
> +	dw100_hw_set_src_crop(dw_dev, &ctx->q_data[V4L2_M2M_SRC],
> +			      &ctx->q_data[V4L2_M2M_DST]);
> +	dw100_hw_set_source(dw_dev, &ctx->q_data[V4L2_M2M_SRC], p_in);
> +	dw100_hw_set_destination(dw_dev, &ctx->q_data[V4L2_M2M_DST],
> +				 ctx->q_data[V4L2_M2M_SRC].fmt, p_out);
> +	dw100_hw_set_mapping(dw_dev, ctx->map_dma,
> +			     ctx->map_width, ctx->map_height);
> +	dw100_hw_enable_irq(dw_dev);
> +	dw100_hw_dewarp_start(dw_dev);
> +
> +	/* Enable Bus */
> +	dw100_hw_master_bus_enable(dw_dev);
> +}
> +
> +static void dw100_device_run(void *priv)
> +{
> +	struct dw100_ctx *ctx = priv;
> +	struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +
> +	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> +	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> +
> +	dw100_start(ctx, src_buf, dst_buf);
> +}
> +
> +static int dw100_job_ready(void *priv)
> +{
> +	struct dw100_ctx *ctx = priv;
> +
> +	if (v4l2_m2m_num_src_bufs_ready(ctx->fh.m2m_ctx) < 1 ||
> +	    v4l2_m2m_num_dst_bufs_ready(ctx->fh.m2m_ctx) < 1) {
> +		dprintk(1, ctx->dw_dev, "Not enough buffers available\n");
> +		return 0;
> +	}
> +
> +	if (dw100_hw_is_frame_running(ctx->dw_dev)) {
> +		dprintk(1, ctx->dw_dev, "HW still running a frame\n");
> +		return 0;
> +	}
> +
> +	return 1;
> +}

As far as I understand you could drop this function, as the default
implementation when not provided is to assume that one source buffer and
one destination buffer are all that is required.

> +
> +static const struct v4l2_m2m_ops dw100_m2m_ops = {
> +	.device_run	= dw100_device_run,
> +	.job_ready	= dw100_job_ready,
> +};
> +
> +#ifdef CONFIG_MEDIA_CONTROLLER
> +static const struct media_device_ops dw100_media_ops = {
> +	.req_validate = vb2_request_validate,
> +	.req_queue = v4l2_m2m_request_queue,
> +};

Do we need the request API ? I think you can drop this.

> +#endif
> +
> +static struct video_device *dw100_init_video_device(struct dw100_device *dw_dev)
> +{
> +	struct video_device *vfd = &dw_dev->vfd;
> +
> +	vfd->vfl_dir = VFL_DIR_M2M;
> +	vfd->fops = &dw100_fops;
> +	vfd->device_caps = V4L2_CAP_VIDEO_M2M | V4L2_CAP_STREAMING;
> +	vfd->ioctl_ops = &dw100_ioctl_ops;
> +	vfd->minor = -1;
> +	vfd->release = video_device_release_empty;
> +	vfd->v4l2_dev = &dw_dev->v4l2_dev;
> +	vfd->lock = &dw_dev->vfd_mutex;
> +
> +	strscpy(vfd->name, DRV_NAME, sizeof(vfd->name));
> +	mutex_init(vfd->lock);

Missing corresponding mutex_destroy().

> +	video_set_drvdata(vfd, dw_dev);
> +
> +	return vfd;
> +}
> +
> +static int dw100_dump_regs_show(struct seq_file *m, void *private)
> +{
> +	struct dw100_device *dw_dev = m->private;
> +	int ret;
> +
> +	ret = pm_runtime_resume_and_get(&dw_dev->pdev->dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = dw100_dump_regs(dw_dev);

I'd inline the function here as there's a single caller.

> +
> +	pm_runtime_put_sync(&dw_dev->pdev->dev);
> +
> +	return ret;
> +}
> +DEFINE_SHOW_ATTRIBUTE(dw100_dump_regs);
> +
> +static void dw100_debugfs_init(struct dw100_device *dw_dev)
> +{
> +	dw_dev->debugfs_root =
> +		debugfs_create_dir(dev_name(&dw_dev->pdev->dev), NULL);
> +
> +	debugfs_create_file("dump_regs", 0600, dw_dev->debugfs_root, dw_dev,
> +			    &dw100_dump_regs_fops);
> +}
> +
> +static void dw100_debugfs_exit(struct dw100_device *dw_dev)
> +{
> +	debugfs_remove_recursive(dw_dev->debugfs_root);
> +}
> +
> +static int dw100_probe(struct platform_device *pdev)
> +{
> +	struct dw100_device *dw_dev;
> +	struct video_device *vfd;
> +	struct resource *res;
> +	int ret = 0;

No need to initialize ret to 0.

> +	int irq;
> +
> +	dw_dev = devm_kzalloc(&pdev->dev, sizeof(*dw_dev), GFP_KERNEL);
> +	if (!dw_dev)
> +		return -ENOMEM;
> +	dw_dev->pdev = pdev;
> +
> +	ret = devm_clk_bulk_get_all(&pdev->dev, &dw_dev->clks);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev, "Unable to get clocks: %d\n", ret);
> +		return ret;
> +	}
> +	dw_dev->num_clks = ret;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	dw_dev->mmio = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(dw_dev->mmio))
> +		return PTR_ERR(dw_dev->mmio);
> +
> +	irq = platform_get_irq(pdev, 0);
> +	if (irq < 0)
> +		return irq;
> +
> +	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL, dw100_irq_handler,

Why do we need a threaded IRQ handler ?

> +					IRQF_ONESHOT, dev_name(&pdev->dev),
> +					dw_dev);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
> +		return ret;
> +	}

I'm always a bit worried when requesting an IRQ before we ensure the
device won't generate an interrupt.

> +
> +	platform_set_drvdata(pdev, dw_dev);
> +
> +	pm_runtime_enable(&pdev->dev);
> +	ret = pm_runtime_resume_and_get(&pdev->dev);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev, "Unable to enable clks: %d\n", ret);
> +		goto err_clk;
> +	}
> +
> +	dw100_hw_reset(dw_dev);
> +
> +	pm_runtime_put_sync(&pdev->dev);

I wonder if the device should be reset every time runtime PM resumes.
The dw100_hw_reset() call could move to a runtime PM resume handler, and
this could then be dropped.

> +
> +	spin_lock_init(&dw_dev->irqlock);
> +
> +	ret = v4l2_device_register(&pdev->dev, &dw_dev->v4l2_dev);
> +	if (ret)
> +		goto err_clk;
> +
> +	vfd = dw100_init_video_device(dw_dev);
> +
> +	dw_dev->m2m_dev = v4l2_m2m_init(&dw100_m2m_ops);
> +	if (IS_ERR(dw_dev->m2m_dev)) {
> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem device\n");
> +		ret = PTR_ERR(dw_dev->m2m_dev);
> +		goto err_v4l2;
> +	}
> +
> +#ifdef CONFIG_MEDIA_CONTROLLER

Let's make this unconditional as the MC API is needed by libcamera to
enumerate media devices.

> +	dw_dev->mdev.dev = &pdev->dev;
> +	strscpy(dw_dev->mdev.model, "dw100", sizeof(dw_dev->mdev.model));
> +	strscpy(dw_dev->mdev.bus_info, "platform:dw100",
> +		sizeof(dw_dev->mdev.bus_info));

No needed anymore either, it's handled by the core.

> +	media_device_init(&dw_dev->mdev);
> +	dw_dev->mdev.ops = &dw100_media_ops;
> +	dw_dev->v4l2_dev.mdev = &dw_dev->mdev;
> +#endif
> +
> +	ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
> +	if (ret) {
> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to register video device\n");
> +		goto err_m2m;
> +	}
> +
> +#ifdef CONFIG_MEDIA_CONTROLLER
> +	ret = v4l2_m2m_register_media_controller(dw_dev->m2m_dev, vfd,
> +						 MEDIA_ENT_F_PROC_VIDEO_SCALER);
> +	if (ret) {
> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem media controller\n");

I'd use dev_err() (and dev_info(), ...) through the driver.

> +		goto error_v4l2;
> +	}
> +
> +	ret = media_device_register(&dw_dev->mdev);
> +	if (ret) {
> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to register mem2mem media device\n");
> +		goto error_m2m_mc;
> +	}
> +#endif
> +
> +	dw100_debugfs_init(dw_dev);
> +
> +	v4l2_info(&dw_dev->v4l2_dev,
> +		  "dw100 v4l2 m2m registered as /dev/video%d\n", vfd->num);
> +
> +	return 0;

A blank line would be nice.

> +#ifdef CONFIG_MEDIA_CONTROLLER
> +error_m2m_mc:
> +	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
> +error_v4l2:
> +	video_unregister_device(vfd);
> +#endif
> +err_m2m:
> +	v4l2_m2m_release(dw_dev->m2m_dev);
> +err_v4l2:
> +	v4l2_device_unregister(&dw_dev->v4l2_dev);
> +err_clk:
> +	pm_runtime_disable(&pdev->dev);
> +
> +	return ret;
> +}
> +
> +static int dw100_remove(struct platform_device *pdev)
> +{
> +	struct dw100_device *dw_dev = platform_get_drvdata(pdev);
> +
> +	dw100_debugfs_exit(dw_dev);
> +
> +	pm_runtime_disable(&pdev->dev);
> +
> +#ifdef CONFIG_MEDIA_CONTROLLER
> +	media_device_unregister(&dw_dev->mdev);
> +	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
> +	media_device_cleanup(&dw_dev->mdev);
> +#endif
> +
> +	video_unregister_device(&dw_dev->vfd);
> +	v4l2_m2m_release(dw_dev->m2m_dev);
> +	v4l2_device_unregister(&dw_dev->v4l2_dev);
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused dw100_runtime_suspend(struct device *dev)
> +{
> +	struct dw100_device *dw_dev = dev_get_drvdata(dev);
> +
> +	clk_bulk_disable_unprepare(dw_dev->num_clks, dw_dev->clks);
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused dw100_runtime_resume(struct device *dev)
> +{
> +	struct dw100_device *dw_dev = dev_get_drvdata(dev);
> +
> +	return clk_bulk_prepare_enable(dw_dev->num_clks, dw_dev->clks);
> +}
> +
> +static const struct dev_pm_ops dw100_pm = {
> +	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> +				pm_runtime_force_resume)
> +	SET_RUNTIME_PM_OPS(dw100_runtime_suspend,
> +			   dw100_runtime_resume, NULL)
> +};
> +
> +static const struct of_device_id dw100_dt_ids[] = {
> +	{ .compatible = "nxp,imx8mp-dw100", .data = NULL },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(of, dw100_dt_ids);
> +
> +static struct platform_driver dw100_driver = {
> +	.probe		= dw100_probe,
> +	.remove		= dw100_remove,
> +	.driver		= {
> +		.name	= DRV_NAME,
> +		.pm = &dw100_pm,
> +		.of_match_table = dw100_dt_ids,
> +	},
> +};
> +
> +module_platform_driver(dw100_driver);
> +
> +MODULE_DESCRIPTION("DW100 Hardware dewarper");
> +MODULE_AUTHOR("Xavier Roumegue <Xavier.Roumegue@oss.nxp.com>");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/media/platform/nxp/dw100/dw100_regs.h b/drivers/media/platform/nxp/dw100/dw100_regs.h
> new file mode 100644
> index 000000000000..c540f6c638db
> --- /dev/null
> +++ b/drivers/media/platform/nxp/dw100/dw100_regs.h
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +/*
> + * DW100 Hardware dewarper
> + *
> + * Copyright 2022 NXP
> + * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
> + *

Extra blank line.

> + */
> +
> +#ifndef _DW100_REGS_H_
> +#define _DW100_REGS_H_
> +
> +/* AHB register offset */
> +#define DW100_DEWARP_ID			0x00
> +#define DW100_DEWARP_CTRL		0x04
> +#define DW100_DEWARP_CTRL_ENABLE			BIT(0)
> +#define DW100_DEWARP_CTRL_START				BIT(1)
> +#define DW100_DEWARP_CTRL_SOFT_RESET			BIT(2)
> +#define DW100_DEWARP_CTRL_FORMAT_YUV422_SP		(0UL)

No need for parentheses. Same below where applicable.

> +#define DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED		(1UL)
> +#define DW100_DEWARP_CTRL_FORMAT_YUV420_SP		(2UL)
> +#define DW100_DEWARP_CTRL_INPUT_FORMAT_MASK		GENMASK(5, 4)
> +#define DW100_DEWARP_CTRL_INPUT_FORMAT(x)		((x) << 4)
> +#define DW100_DEWARP_CTRL_OUTPUT_FORMAT(x)		((x) << 6)
> +#define DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK		GENMASK(7, 6)
> +#define DW100_DEWARP_CTRL_SRC_AUTO_SHADOW		BIT(8)
> +#define DW100_DEWARP_CTRL_HW_HANDSHAKE			BIT(9)
> +#define DW100_DEWARP_CTRL_DST_AUTO_SHADOW		BIT(10)
> +#define DW100_DEWARP_CTRL_SPLIT_LINE			BIT(11)
> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_MASK		GENMASK(17, 16)
> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL	(0UL << 16)
> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION	(1UL << 16)
> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO		(2UL << 16)
> +#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD_MASK	GENMASK(24, 18)
> +#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(x)		((x) << 18)
> +
> +#define DW100_MAP_LUT_ADDR		0x08
> +#define DW100_MAP_LUT_ADDR_ADDR(addr)	(((addr) >> 4) & GENMASK(29, 0))
> +#define DW100_MAP_LUT_SIZE		0x0C

I usually ask for lower-case hex constants as that's what is often done
in V4L2, but maybe it's too much nit-picking :-)

> +#define DW100_MAP_LUT_SIZE_WIDTH(w)	(((w) & GENMASK(10, 0)) << 0)
> +#define DW100_MAP_LUT_SIZE_HEIGHT(h)	(((h) & GENMASK(10, 0)) << 16)
> +#define DW100_SRC_IMG_Y_BASE		0x10
> +#define DW100_IMG_Y_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
> +#define DW100_SRC_IMG_UV_BASE		0x14
> +#define DW100_IMG_UV_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
> +#define DW100_SRC_IMG_SIZE		0x18
> +#define DW100_IMG_SIZE_WIDTH(w)		(((w) & GENMASK(12, 0)) << 0)
> +#define DW100_IMG_SIZE_HEIGHT(h)	(((h) & GENMASK(12, 0)) << 16)
> +
> +#define DW100_SRC_IMG_STRIDE		0x1C
> +#define DW100_MAP_LUT_ADDR2		0x20
> +#define DW100_MAP_LUT_SIZE2		0x24
> +#define DW100_SRC_IMG_Y_BASE2		0x28
> +#define DW100_SRC_IMG_UV_BASE2		0x2C
> +#define DW100_SRC_IMG_SIZE2		0x30
> +#define DW100_SRC_IMG_STRIDE2		0x34
> +#define DW100_DST_IMG_Y_BASE		0x38
> +#define DW100_DST_IMG_UV_BASE		0x3C
> +#define DW100_DST_IMG_SIZE		0x40
> +#define DW100_DST_IMG_STRIDE		0x44
> +#define DW100_DST_IMG_Y_BASE2		0x48
> +#define DW100_DST_IMG_UV_BASE2		0x4C
> +#define DW100_DST_IMG_SIZE2		0x50
> +#define DW100_DST_IMG_STRIDE2		0x54
> +#define DW100_SWAP_CONTROL		0x58
> +#define DW100_SWAP_CONTROL_BYTE		BIT(0)
> +#define DW100_SWAP_CONTROL_SHORT	BIT(1)
> +#define DW100_SWAP_CONTROL_WORD		BIT(2)
> +#define DW100_SWAP_CONTROL_LONG		BIT(3)
> +#define DW100_SWAP_CONTROL_Y(x)		(((x) & GENMASK(3, 0)) << 0)
> +#define DW100_SWAP_CONTROL_UV(x)	(((x) & GENMASK(3, 0)) << 4)
> +#define DW100_SWAP_CONTROL_SRC(x)	(((x) & GENMASK(7, 0)) << 0)
> +#define DW100_SWAP_CONTROL_DST(x)	(((x) & GENMASK(7, 0)) << 8)
> +#define DW100_SWAP_CONTROL_SRC2(x)	(((x) & GENMASK(7, 0)) << 16)
> +#define DW100_SWAP_CONTROL_DST2(x)	(((x) & GENMASK(7, 0)) << 24)
> +#define DW100_SWAP_CONTROL_SRC_MASK	GENMASK(7, 0)
> +#define DW100_SWAP_CONTROL_DST_MASK	GENMASK(15, 8)
> +#define DW100_SWAP_CONTROL_SRC2_MASK	GENMASK(23, 16)
> +#define DW100_SWAP_CONTROL_DST2_MASK	GENMASK(31, 24)
> +#define DW100_VERTICAL_SPLIT_LINE	0x5C
> +#define DW100_HORIZON_SPLIT_LINE	0x60
> +#define DW100_SCALE_FACTOR		0x64
> +#define DW100_ROI_START			0x68
> +#define DW100_ROI_START_X(x)		(((x) & GENMASK(12, 0)) << 0)
> +#define DW100_ROI_START_Y(y)		(((y) & GENMASK(12, 0)) << 16)
> +#define DW100_BOUNDARY_PIXEL		0x6C
> +#define DW100_BOUNDARY_PIXEL_V(v)	(((v) & GENMASK(7, 0)) << 0)
> +#define DW100_BOUNDARY_PIXEL_U(u)	(((u) & GENMASK(7, 0)) << 8)
> +#define DW100_BOUNDARY_PIXEL_Y(y)	(((y) & GENMASK(7, 0)) << 16)
> +
> +#define DW100_INTERRUPT_STATUS		0x70
> +#define DW100_INTERRUPT_STATUS_INT_FRAME_DONE		BIT(0)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT		BIT(1)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_AXI_RESP		BIT(2)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_X		BIT(3)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_MB_FETCH		BIT(4)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME2		BIT(5)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME3		BIT(6)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE	BIT(7)
> +#define DW100_INTERRUPT_STATUS_INT_ERR_STATUS(x)	(((x) >> 1) & 0x7f)
> +#define DW100_INTERRUPT_STATUS_INT_STATUS(x)		((x) & 0xff)
> +
> +#define DW100_INTERRUPT_STATUS_INT_ENABLE_MASK		GENMASK(15, 8)
> +#define DW100_INTERRUPT_STATUS_INT_ENABLE(x)		(((x) & GENMASK(7, 0)) << 8)
> +#define DW100_INTERRUPT_STATUS_FRAME_BUSY		BIT(16)
> +#define DW100_INTERRUPT_STATUS_INT_CLEAR(x)		(((x) & GENMASK(7, 0)) << 24)
> +#define DW100_BUS_CTRL			0x74
> +#define DW100_BUS_CTRL_AXI_MASTER_ENABLE	BIT(31)
> +#define DW100_BUS_CTRL1			0x78
> +#define DW100_BUS_TIME_OUT_CYCLE	0x7C
> +#define DW100_DST_IMG_Y_SIZE1		0x80
> +#define DW100_DST_IMG_Y_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
> +#define DW100_DST_IMG_UV_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
> +#define DW100_DST_IMG_UV_SIZE1		0x84
> +#define DW100_DST_IMG_Y_SIZE2		0x88
> +#define DW100_DST_IMG_UV_SIZE2		0x8C
> +
> +#endif /* _DW100_REGS_H_ */

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  2022-06-14 20:46   ` Laurent Pinchart
@ 2022-06-15  7:37     ` Hans Verkuil
  2022-06-15  8:39       ` Laurent Pinchart
  0 siblings, 1 reply; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15  7:37 UTC (permalink / raw)
  To: Laurent Pinchart, Xavier Roumegue
  Cc: mchehab, stanimir.varbanov, tomi.valkeinen, robh+dt, nicolas,
	alexander.stein, linux-media, devicetree



On 6/14/22 22:46, Laurent Pinchart wrote:
> Hi Xavier and Hans,
> 
> Thank you for the patch.
> 
> On Tue, May 03, 2022 at 11:39:17AM +0200, Xavier Roumegue wrote:
>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>
>> Add a new flag that indicates that this control is a dynamically sized
>> array. Also document this flag.
>>
>> Currently dynamically sized arrays are limited to one dimensional arrays,
>> but that might change in the future if there is a need for it.
>>
>> The initial use-case of dynamic arrays are stateless codecs. A frame
>> can be divided in many slices, so you want to provide an array containing
>> slice information for each slice. Typically the number of slices is small,
>> but the standard allow for hundreds or thousands of slices. Dynamic arrays
>> are a good solution since sizing the array for the worst case would waste
>> substantial amounts of memory.
>>
>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>> ---
>>  .../userspace-api/media/v4l/vidioc-queryctrl.rst          | 8 ++++++++
>>  include/uapi/linux/videodev2.h                            | 1 +
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>> index 88f630252d98..a20dfa2a933b 100644
>> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>> @@ -625,6 +625,14 @@ See also the examples in :ref:`control`.
>>  	``V4L2_CTRL_FLAG_GRABBED`` flag when buffers are allocated or
>>  	streaming is in progress since most drivers do not support changing
>>  	the format in that case.
>> +    * - ``V4L2_CTRL_FLAG_DYNAMIC_ARRAY``
>> +      - 0x0800
>> +      - This control is a dynamically sized 1-dimensional array. It
>> +        behaves the same as a regular array, except that the number
>> +	of elements as reported by the ``elems`` field is between 1 and
>> +	``dims[0]``. So setting the control with a differently sized
>> +	array will change the ``elems`` field when the control is
>> +	queried afterwards.
> 
> Wrong indentation.

No, it's just that the tab in front makes it look a bit weird.

> 
> Can the dimension be changed by the application only, or by the driver
> too ? In the latter case, is an event generated ?

Interesting question. 'dims[0]' is the maximum number of elements allowed
by the driver for this dynamic array. Applications can set the control to
any number of elements up to that maximum. For the current use case I expect
that this maximum reflects what the hardware supports, and is not really
related to the video resolution. Currently the code expects the maximum
number of elements (dims[0]) to remain constant.

Technically a driver can increase dims[0] as needed. Should that be needed,
then we probably need an event to signal that.

> 
> Considering this in the context of this series, the driver needs to
> change the dimension, as the use case is to size the control based on
> the image size. Do we want to document here that the driver will reset
> the control to a default value when the dimension changes, or is that
> something that should be control-specific ?

As mentioned above, in the current context the maximum number of allowed
elements in a dynamic array is fixed, so this issue does not occur.

Regards,

	Hans

> 
>>  
>>  Return Value
>>  ============
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 3768a0a80830..8df13defde75 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -1886,6 +1886,7 @@ struct v4l2_querymenu {
>>  #define V4L2_CTRL_FLAG_HAS_PAYLOAD	0x0100
>>  #define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE	0x0200
>>  #define V4L2_CTRL_FLAG_MODIFY_LAYOUT	0x0400
>> +#define V4L2_CTRL_FLAG_DYNAMIC_ARRAY	0x0800
>>  
>>  /*  Query flags, to be ORed with the control ID */
>>  #define V4L2_CTRL_FLAG_NEXT_CTRL	0x80000000
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-14 21:00   ` Laurent Pinchart
@ 2022-06-15  7:39     ` Hans Verkuil
  2022-06-15  7:46       ` Hans Verkuil
  2022-06-15  9:14     ` Hans Verkuil
  1 sibling, 1 reply; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15  7:39 UTC (permalink / raw)
  To: Laurent Pinchart, Xavier Roumegue
  Cc: mchehab, stanimir.varbanov, tomi.valkeinen, robh+dt, nicolas,
	alexander.stein, linux-media, devicetree



On 6/14/22 23:00, Laurent Pinchart wrote:
> Hi Xavier and Hans,
> 
> Thank you for the patch.
> 
> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>
>> Add a dynamic array test control to help test support for this
>> feature.
>>
>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>> ---
>>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> index e7516dc1227b..7267892dc18a 100644
>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> @@ -34,6 +34,7 @@
>>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>>  
>>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>>  	.dims = { 1 },
>>  };
>>  
>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
>> +	.ops = &vivid_user_gen_ctrl_ops,
>> +	.id = VIVID_CID_U32_DYN_ARRAY,
>> +	.name = "U32 Dynamic Array",
>> +	.type = V4L2_CTRL_TYPE_U32,
>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>> +	.def = 50,
>> +	.min = 10,
>> +	.max = 90,
>> +	.step = 1,
>> +	.dims = { 100 },
>> +};
> 
> To meaningfully test this, don't we need the vivid driver to change the
> dimension ? Or is it meant to only test changes made by the application
> ?

dims[0] is the maximum number of elements allowed, it is not the actual
size of the control. The application sets the control to a size between
1 and dims[0] (100 in this case) elements.

When the control is created it will have just 1 element containing the
default control value.

Regards,

	Hans

> 
>> +
>>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>>  	.ops = &vivid_user_gen_ctrl_ops,
>>  	.id = VIVID_CID_U16_MATRIX,
>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>>  
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-15  7:39     ` Hans Verkuil
@ 2022-06-15  7:46       ` Hans Verkuil
  0 siblings, 0 replies; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15  7:46 UTC (permalink / raw)
  To: Laurent Pinchart, Xavier Roumegue
  Cc: mchehab, stanimir.varbanov, tomi.valkeinen, robh+dt, nicolas,
	alexander.stein, linux-media, devicetree



On 6/15/22 09:39, Hans Verkuil wrote:
> 
> 
> On 6/14/22 23:00, Laurent Pinchart wrote:
>> Hi Xavier and Hans,
>>
>> Thank you for the patch.
>>
>> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
>>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>
>>> Add a dynamic array test control to help test support for this
>>> feature.
>>>
>>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>> ---
>>>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>>>  1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> index e7516dc1227b..7267892dc18a 100644
>>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> @@ -34,6 +34,7 @@
>>>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>>>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>>>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
>>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>>>  
>>>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>>>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
>>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>>>  	.dims = { 1 },
>>>  };
>>>  
>>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
>>> +	.ops = &vivid_user_gen_ctrl_ops,
>>> +	.id = VIVID_CID_U32_DYN_ARRAY,
>>> +	.name = "U32 Dynamic Array",
>>> +	.type = V4L2_CTRL_TYPE_U32,
>>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>>> +	.def = 50,
>>> +	.min = 10,
>>> +	.max = 90,
>>> +	.step = 1,
>>> +	.dims = { 100 },
>>> +};
>>
>> To meaningfully test this, don't we need the vivid driver to change the
>> dimension ? Or is it meant to only test changes made by the application
>> ?
> 
> dims[0] is the maximum number of elements allowed, it is not the actual
> size of the control. The application sets the control to a size between
> 1 and dims[0] (100 in this case) elements.
> 
> When the control is created it will have just 1 element containing the
> default control value.

There is actually one thing that I would like to improve: allow dynamic arrays
to be empty (0 elements). Such a control should probably be empty in the
beginning as well.

I'll take a stab at this.

Regards,

	Hans

> 
> Regards,
> 
> 	Hans
> 
>>
>>> +
>>>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>>>  	.ops = &vivid_user_gen_ctrl_ops,
>>>  	.id = VIVID_CID_U16_MATRIX,
>>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>>>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
>>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>>>  
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  2022-06-15  7:37     ` Hans Verkuil
@ 2022-06-15  8:39       ` Laurent Pinchart
  2022-06-15  8:45         ` Hans Verkuil
  0 siblings, 1 reply; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-15  8:39 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Xavier Roumegue, mchehab, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Hans,

On Wed, Jun 15, 2022 at 09:37:17AM +0200, Hans Verkuil wrote:
> On 6/14/22 22:46, Laurent Pinchart wrote:
> > On Tue, May 03, 2022 at 11:39:17AM +0200, Xavier Roumegue wrote:
> >> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >>
> >> Add a new flag that indicates that this control is a dynamically sized
> >> array. Also document this flag.
> >>
> >> Currently dynamically sized arrays are limited to one dimensional arrays,
> >> but that might change in the future if there is a need for it.
> >>
> >> The initial use-case of dynamic arrays are stateless codecs. A frame
> >> can be divided in many slices, so you want to provide an array containing
> >> slice information for each slice. Typically the number of slices is small,
> >> but the standard allow for hundreds or thousands of slices. Dynamic arrays
> >> are a good solution since sizing the array for the worst case would waste
> >> substantial amounts of memory.
> >>
> >> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >> ---
> >>  .../userspace-api/media/v4l/vidioc-queryctrl.rst          | 8 ++++++++
> >>  include/uapi/linux/videodev2.h                            | 1 +
> >>  2 files changed, 9 insertions(+)
> >>
> >> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >> index 88f630252d98..a20dfa2a933b 100644
> >> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >> @@ -625,6 +625,14 @@ See also the examples in :ref:`control`.
> >>  	``V4L2_CTRL_FLAG_GRABBED`` flag when buffers are allocated or
> >>  	streaming is in progress since most drivers do not support changing
> >>  	the format in that case.
> >> +    * - ``V4L2_CTRL_FLAG_DYNAMIC_ARRAY``
> >> +      - 0x0800
> >> +      - This control is a dynamically sized 1-dimensional array. It
> >> +        behaves the same as a regular array, except that the number
> >> +	of elements as reported by the ``elems`` field is between 1 and
> >> +	``dims[0]``. So setting the control with a differently sized
> >> +	array will change the ``elems`` field when the control is
> >> +	queried afterwards.
> > 
> > Wrong indentation.
> 
> No, it's just that the tab in front makes it look a bit weird.

I thought .rst files had to use spaces for indentation, but it seems I
was wrong.

> > Can the dimension be changed by the application only, or by the driver
> > too ? In the latter case, is an event generated ?
> 
> Interesting question. 'dims[0]' is the maximum number of elements allowed
> by the driver for this dynamic array. Applications can set the control to
> any number of elements up to that maximum. For the current use case I expect
> that this maximum reflects what the hardware supports, and is not really
> related to the video resolution. Currently the code expects the maximum
> number of elements (dims[0]) to remain constant.

The maximum should be fixed, I agree. The video resolution however
matters, as it dictates (in the DW100 case) the number of actual
elements in the control array. When changing the format, the control
value thus needs to be updated accordingly. There are multiple
strategies there. One of them would be for the driver to automatically
update the control value, most probably to a default (as there's no way
the driver could derive a new meaningful value from the old one).
Another one, which is implemented in this series, is to keep the control
as-is, but ignore its value at stream on time if it hasn't been updated
by userspace. There could be other options.

Do we want to mandate a particular strategy here, or leave it to
individual controls ?

> Technically a driver can increase dims[0] as needed. Should that be needed,
> then we probably need an event to signal that.
> 
> > Considering this in the context of this series, the driver needs to
> > change the dimension, as the use case is to size the control based on
> > the image size. Do we want to document here that the driver will reset
> > the control to a default value when the dimension changes, or is that
> > something that should be control-specific ?
> 
> As mentioned above, in the current context the maximum number of allowed
> elements in a dynamic array is fixed, so this issue does not occur.
> 
> >>  
> >>  Return Value
> >>  ============
> >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >> index 3768a0a80830..8df13defde75 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -1886,6 +1886,7 @@ struct v4l2_querymenu {
> >>  #define V4L2_CTRL_FLAG_HAS_PAYLOAD	0x0100
> >>  #define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE	0x0200
> >>  #define V4L2_CTRL_FLAG_MODIFY_LAYOUT	0x0400
> >> +#define V4L2_CTRL_FLAG_DYNAMIC_ARRAY	0x0800
> >>  
> >>  /*  Query flags, to be ORed with the control ID */
> >>  #define V4L2_CTRL_FLAG_NEXT_CTRL	0x80000000

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY
  2022-06-15  8:39       ` Laurent Pinchart
@ 2022-06-15  8:45         ` Hans Verkuil
  0 siblings, 0 replies; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15  8:45 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Xavier Roumegue, mchehab, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree



On 6/15/22 10:39, Laurent Pinchart wrote:
> Hi Hans,
> 
> On Wed, Jun 15, 2022 at 09:37:17AM +0200, Hans Verkuil wrote:
>> On 6/14/22 22:46, Laurent Pinchart wrote:
>>> On Tue, May 03, 2022 at 11:39:17AM +0200, Xavier Roumegue wrote:
>>>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>>
>>>> Add a new flag that indicates that this control is a dynamically sized
>>>> array. Also document this flag.
>>>>
>>>> Currently dynamically sized arrays are limited to one dimensional arrays,
>>>> but that might change in the future if there is a need for it.
>>>>
>>>> The initial use-case of dynamic arrays are stateless codecs. A frame
>>>> can be divided in many slices, so you want to provide an array containing
>>>> slice information for each slice. Typically the number of slices is small,
>>>> but the standard allow for hundreds or thousands of slices. Dynamic arrays
>>>> are a good solution since sizing the array for the worst case would waste
>>>> substantial amounts of memory.
>>>>
>>>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>> ---
>>>>  .../userspace-api/media/v4l/vidioc-queryctrl.rst          | 8 ++++++++
>>>>  include/uapi/linux/videodev2.h                            | 1 +
>>>>  2 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>> index 88f630252d98..a20dfa2a933b 100644
>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>> @@ -625,6 +625,14 @@ See also the examples in :ref:`control`.
>>>>  	``V4L2_CTRL_FLAG_GRABBED`` flag when buffers are allocated or
>>>>  	streaming is in progress since most drivers do not support changing
>>>>  	the format in that case.
>>>> +    * - ``V4L2_CTRL_FLAG_DYNAMIC_ARRAY``
>>>> +      - 0x0800
>>>> +      - This control is a dynamically sized 1-dimensional array. It
>>>> +        behaves the same as a regular array, except that the number
>>>> +	of elements as reported by the ``elems`` field is between 1 and
>>>> +	``dims[0]``. So setting the control with a differently sized
>>>> +	array will change the ``elems`` field when the control is
>>>> +	queried afterwards.
>>>
>>> Wrong indentation.
>>
>> No, it's just that the tab in front makes it look a bit weird.
> 
> I thought .rst files had to use spaces for indentation, but it seems I
> was wrong.
> 
>>> Can the dimension be changed by the application only, or by the driver
>>> too ? In the latter case, is an event generated ?
>>
>> Interesting question. 'dims[0]' is the maximum number of elements allowed
>> by the driver for this dynamic array. Applications can set the control to
>> any number of elements up to that maximum. For the current use case I expect
>> that this maximum reflects what the hardware supports, and is not really
>> related to the video resolution. Currently the code expects the maximum
>> number of elements (dims[0]) to remain constant.
> 
> The maximum should be fixed, I agree. The video resolution however
> matters, as it dictates (in the DW100 case) the number of actual

Ah, this patch is part of the dw100 series, I thought it was the same patch
that was part of the HEVC series. Sorry, I missed that.

I'll have to review this again in that context. I'll reply later, once I've
done that.

Regards,

	Hans

> elements in the control array. When changing the format, the control
> value thus needs to be updated accordingly. There are multiple
> strategies there. One of them would be for the driver to automatically
> update the control value, most probably to a default (as there's no way
> the driver could derive a new meaningful value from the old one).
> Another one, which is implemented in this series, is to keep the control
> as-is, but ignore its value at stream on time if it hasn't been updated
> by userspace. There could be other options.
> 
> Do we want to mandate a particular strategy here, or leave it to
> individual controls ?
> 
>> Technically a driver can increase dims[0] as needed. Should that be needed,
>> then we probably need an event to signal that.
>>
>>> Considering this in the context of this series, the driver needs to
>>> change the dimension, as the use case is to size the control based on
>>> the image size. Do we want to document here that the driver will reset
>>> the control to a default value when the dimension changes, or is that
>>> something that should be control-specific ?
>>
>> As mentioned above, in the current context the maximum number of allowed
>> elements in a dynamic array is fixed, so this issue does not occur.
>>
>>>>  
>>>>  Return Value
>>>>  ============
>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>> index 3768a0a80830..8df13defde75 100644
>>>> --- a/include/uapi/linux/videodev2.h
>>>> +++ b/include/uapi/linux/videodev2.h
>>>> @@ -1886,6 +1886,7 @@ struct v4l2_querymenu {
>>>>  #define V4L2_CTRL_FLAG_HAS_PAYLOAD	0x0100
>>>>  #define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE	0x0200
>>>>  #define V4L2_CTRL_FLAG_MODIFY_LAYOUT	0x0400
>>>> +#define V4L2_CTRL_FLAG_DYNAMIC_ARRAY	0x0800
>>>>  
>>>>  /*  Query flags, to be ORed with the control ID */
>>>>  #define V4L2_CTRL_FLAG_NEXT_CTRL	0x80000000
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-14 21:00   ` Laurent Pinchart
  2022-06-15  7:39     ` Hans Verkuil
@ 2022-06-15  9:14     ` Hans Verkuil
  2022-06-15 11:53       ` Laurent Pinchart
  2022-06-16  6:08       ` Xavier Roumegue (OSS)
  1 sibling, 2 replies; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15  9:14 UTC (permalink / raw)
  To: Laurent Pinchart, Xavier Roumegue
  Cc: mchehab, stanimir.varbanov, tomi.valkeinen, robh+dt, nicolas,
	alexander.stein, linux-media, devicetree

Hi Laurent, Xavier,

Ignore what I wrote before, I read it with the HEVC patch series in mind, not the dw100
series.

So let me try again :-)

On 6/14/22 23:00, Laurent Pinchart wrote:
> Hi Xavier and Hans,
> 
> Thank you for the patch.
> 
> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>
>> Add a dynamic array test control to help test support for this
>> feature.
>>
>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>> ---
>>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> index e7516dc1227b..7267892dc18a 100644
>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>> @@ -34,6 +34,7 @@
>>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>>  
>>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>>  	.dims = { 1 },
>>  };
>>  
>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
>> +	.ops = &vivid_user_gen_ctrl_ops,
>> +	.id = VIVID_CID_U32_DYN_ARRAY,
>> +	.name = "U32 Dynamic Array",
>> +	.type = V4L2_CTRL_TYPE_U32,
>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>> +	.def = 50,
>> +	.min = 10,
>> +	.max = 90,
>> +	.step = 1,
>> +	.dims = { 100 },
>> +};
> 
> To meaningfully test this, don't we need the vivid driver to change the
> dimension ? Or is it meant to only test changes made by the application
> ?

As I understand it the dw100 driver needs a 2 dimensional array control.
The size is fixed for each resolution, but if the resolution changes, then
this control changes size as well, and it makes sense that when that happens
it is also reset to default values.

So this isn't a dynamic array at all. It is a standard 2 dimensional array.

What is missing in the control framework is a function similar to
v4l2_ctrl_modify_range() that can resize an array.

v4l2_ctrl_modify_dimensions() would be a good name.

I can make something for that if you both agree with this proposal.

Regards,

	Hans

> 
>> +
>>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>>  	.ops = &vivid_user_gen_ctrl_ops,
>>  	.id = VIVID_CID_U16_MATRIX,
>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>>  
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-15  9:14     ` Hans Verkuil
@ 2022-06-15 11:53       ` Laurent Pinchart
  2022-06-15 12:27         ` Hans Verkuil
  2022-06-16  6:08       ` Xavier Roumegue (OSS)
  1 sibling, 1 reply; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-15 11:53 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Xavier Roumegue, mchehab, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Hans,

On Wed, Jun 15, 2022 at 11:14:43AM +0200, Hans Verkuil wrote:
> Hi Laurent, Xavier,
> 
> Ignore what I wrote before, I read it with the HEVC patch series in mind, not the dw100
> series.
> 
> So let me try again :-)
> 
> On 6/14/22 23:00, Laurent Pinchart wrote:
> > Hi Xavier and Hans,
> > 
> > Thank you for the patch.
> > 
> > On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
> >> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >>
> >> Add a dynamic array test control to help test support for this
> >> feature.
> >>
> >> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >> ---
> >>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
> >>  1 file changed, 15 insertions(+)
> >>
> >> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >> index e7516dc1227b..7267892dc18a 100644
> >> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >> @@ -34,6 +34,7 @@
> >>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
> >>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
> >>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
> >> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
> >>  
> >>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
> >>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
> >> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
> >>  	.dims = { 1 },
> >>  };
> >>  
> >> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
> >> +	.ops = &vivid_user_gen_ctrl_ops,
> >> +	.id = VIVID_CID_U32_DYN_ARRAY,
> >> +	.name = "U32 Dynamic Array",
> >> +	.type = V4L2_CTRL_TYPE_U32,
> >> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
> >> +	.def = 50,
> >> +	.min = 10,
> >> +	.max = 90,
> >> +	.step = 1,
> >> +	.dims = { 100 },
> >> +};
> > 
> > To meaningfully test this, don't we need the vivid driver to change the
> > dimension ? Or is it meant to only test changes made by the application
> > ?
> 
> As I understand it the dw100 driver needs a 2 dimensional array control.
> The size is fixed for each resolution, but if the resolution changes, then
> this control changes size as well, and it makes sense that when that happens
> it is also reset to default values.
> 
> So this isn't a dynamic array at all. It is a standard 2 dimensional array.
> 
> What is missing in the control framework is a function similar to
> v4l2_ctrl_modify_range() that can resize an array.
> 
> v4l2_ctrl_modify_dimensions() would be a good name.
> 
> I can make something for that if you both agree with this proposal.

From a userspace point of view, we only need to be able to set the
control after setting the format. There's no need for control change
events (but I don't mind if they're there of course, even if I think
they won't be very usable in practice).

From an API point of view, I'd like a clear and documented behaviour for
what happens to the control value when the format is changed. It can be
a global behaviour, or a control-specific behaviour, I don't mind much.

> >> +
> >>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
> >>  	.ops = &vivid_user_gen_ctrl_ops,
> >>  	.id = VIVID_CID_U16_MATRIX,
> >> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
> >>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
> >>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
> >>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
> >> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
> >>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
> >>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
> >>  

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-15 11:53       ` Laurent Pinchart
@ 2022-06-15 12:27         ` Hans Verkuil
  2022-06-15 12:39           ` Laurent Pinchart
  0 siblings, 1 reply; 27+ messages in thread
From: Hans Verkuil @ 2022-06-15 12:27 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Xavier Roumegue, mchehab, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree



On 6/15/22 13:53, Laurent Pinchart wrote:
> Hi Hans,
> 
> On Wed, Jun 15, 2022 at 11:14:43AM +0200, Hans Verkuil wrote:
>> Hi Laurent, Xavier,
>>
>> Ignore what I wrote before, I read it with the HEVC patch series in mind, not the dw100
>> series.
>>
>> So let me try again :-)
>>
>> On 6/14/22 23:00, Laurent Pinchart wrote:
>>> Hi Xavier and Hans,
>>>
>>> Thank you for the patch.
>>>
>>> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
>>>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>>
>>>> Add a dynamic array test control to help test support for this
>>>> feature.
>>>>
>>>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>> ---
>>>>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>>>>  1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>>> index e7516dc1227b..7267892dc18a 100644
>>>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>>> @@ -34,6 +34,7 @@
>>>>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>>>>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>>>>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
>>>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>>>>  
>>>>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>>>>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
>>>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>>>>  	.dims = { 1 },
>>>>  };
>>>>  
>>>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
>>>> +	.ops = &vivid_user_gen_ctrl_ops,
>>>> +	.id = VIVID_CID_U32_DYN_ARRAY,
>>>> +	.name = "U32 Dynamic Array",
>>>> +	.type = V4L2_CTRL_TYPE_U32,
>>>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>>>> +	.def = 50,
>>>> +	.min = 10,
>>>> +	.max = 90,
>>>> +	.step = 1,
>>>> +	.dims = { 100 },
>>>> +};
>>>
>>> To meaningfully test this, don't we need the vivid driver to change the
>>> dimension ? Or is it meant to only test changes made by the application
>>> ?
>>
>> As I understand it the dw100 driver needs a 2 dimensional array control.
>> The size is fixed for each resolution, but if the resolution changes, then
>> this control changes size as well, and it makes sense that when that happens
>> it is also reset to default values.
>>
>> So this isn't a dynamic array at all. It is a standard 2 dimensional array.
>>
>> What is missing in the control framework is a function similar to
>> v4l2_ctrl_modify_range() that can resize an array.
>>
>> v4l2_ctrl_modify_dimensions() would be a good name.
>>
>> I can make something for that if you both agree with this proposal.
> 
> From a userspace point of view, we only need to be able to set the
> control after setting the format. There's no need for control change
> events (but I don't mind if they're there of course, even if I think
> they won't be very usable in practice).
> 
> From an API point of view, I'd like a clear and documented behaviour for
> what happens to the control value when the format is changed. It can be
> a global behaviour, or a control-specific behaviour, I don't mind much.

I'd say it is control specific. In this case you would reset the contents
to default values (presumably values that don't do any warping in this
case).

Regards,

	Hans

> 
>>>> +
>>>>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>>>>  	.ops = &vivid_user_gen_ctrl_ops,
>>>>  	.id = VIVID_CID_U16_MATRIX,
>>>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>>>>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
>>>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>>>>  
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-15 12:27         ` Hans Verkuil
@ 2022-06-15 12:39           ` Laurent Pinchart
  0 siblings, 0 replies; 27+ messages in thread
From: Laurent Pinchart @ 2022-06-15 12:39 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Xavier Roumegue, mchehab, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Hans,

On Wed, Jun 15, 2022 at 02:27:57PM +0200, Hans Verkuil wrote:
> On 6/15/22 13:53, Laurent Pinchart wrote:
> > On Wed, Jun 15, 2022 at 11:14:43AM +0200, Hans Verkuil wrote:
> >> Hi Laurent, Xavier,
> >>
> >> Ignore what I wrote before, I read it with the HEVC patch series in mind, not the dw100
> >> series.
> >>
> >> So let me try again :-)
> >>
> >> On 6/14/22 23:00, Laurent Pinchart wrote:
> >>> Hi Xavier and Hans,
> >>>
> >>> Thank you for the patch.
> >>>
> >>> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
> >>>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >>>>
> >>>> Add a dynamic array test control to help test support for this
> >>>> feature.
> >>>>
> >>>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >>>> ---
> >>>>  drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
> >>>>  1 file changed, 15 insertions(+)
> >>>>
> >>>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >>>> index e7516dc1227b..7267892dc18a 100644
> >>>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >>>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
> >>>> @@ -34,6 +34,7 @@
> >>>>  #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
> >>>>  #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
> >>>>  #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
> >>>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
> >>>>  
> >>>>  #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
> >>>>  #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
> >>>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
> >>>>  	.dims = { 1 },
> >>>>  };
> >>>>  
> >>>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
> >>>> +	.ops = &vivid_user_gen_ctrl_ops,
> >>>> +	.id = VIVID_CID_U32_DYN_ARRAY,
> >>>> +	.name = "U32 Dynamic Array",
> >>>> +	.type = V4L2_CTRL_TYPE_U32,
> >>>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
> >>>> +	.def = 50,
> >>>> +	.min = 10,
> >>>> +	.max = 90,
> >>>> +	.step = 1,
> >>>> +	.dims = { 100 },
> >>>> +};
> >>>
> >>> To meaningfully test this, don't we need the vivid driver to change the
> >>> dimension ? Or is it meant to only test changes made by the application
> >>> ?
> >>
> >> As I understand it the dw100 driver needs a 2 dimensional array control.
> >> The size is fixed for each resolution, but if the resolution changes, then
> >> this control changes size as well, and it makes sense that when that happens
> >> it is also reset to default values.
> >>
> >> So this isn't a dynamic array at all. It is a standard 2 dimensional array.
> >>
> >> What is missing in the control framework is a function similar to
> >> v4l2_ctrl_modify_range() that can resize an array.
> >>
> >> v4l2_ctrl_modify_dimensions() would be a good name.
> >>
> >> I can make something for that if you both agree with this proposal.
> > 
> > From a userspace point of view, we only need to be able to set the
> > control after setting the format. There's no need for control change
> > events (but I don't mind if they're there of course, even if I think
> > they won't be very usable in practice).
> > 
> > From an API point of view, I'd like a clear and documented behaviour for
> > what happens to the control value when the format is changed. It can be
> > a global behaviour, or a control-specific behaviour, I don't mind much.
> 
> I'd say it is control specific. In this case you would reset the contents
> to default values (presumably values that don't do any warping in this
> case).

Works for me. Let's then note in the core API that any control whose
dimensions can change need to document how its value is affected.

> >>>> +
> >>>>  static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
> >>>>  	.ops = &vivid_user_gen_ctrl_ops,
> >>>>  	.id = VIVID_CID_U16_MATRIX,
> >>>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
> >>>>  	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
> >>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
> >>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
> >>>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
> >>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
> >>>>  	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
> >>>>  

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 3/9] vivid: add dynamic array test control
  2022-06-15  9:14     ` Hans Verkuil
  2022-06-15 11:53       ` Laurent Pinchart
@ 2022-06-16  6:08       ` Xavier Roumegue (OSS)
  1 sibling, 0 replies; 27+ messages in thread
From: Xavier Roumegue (OSS) @ 2022-06-16  6:08 UTC (permalink / raw)
  To: Hans Verkuil, Laurent Pinchart
  Cc: mchehab, stanimir.varbanov, tomi.valkeinen, robh+dt, nicolas,
	alexander.stein, linux-media, devicetree

Hi Hans,

On 6/15/22 11:14, Hans Verkuil wrote:
> Hi Laurent, Xavier,
> 
> Ignore what I wrote before, I read it with the HEVC patch series in mind, not the dw100
> series.
> 
> So let me try again :-)
> 
> On 6/14/22 23:00, Laurent Pinchart wrote:
>> Hi Xavier and Hans,
>>
>> Thank you for the patch.
>>
>> On Tue, May 03, 2022 at 11:39:19AM +0200, Xavier Roumegue wrote:
>>> From: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>
>>> Add a dynamic array test control to help test support for this
>>> feature.
>>>
>>> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>> ---
>>>   drivers/media/test-drivers/vivid/vivid-ctrls.c | 15 +++++++++++++++
>>>   1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> index e7516dc1227b..7267892dc18a 100644
>>> --- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> +++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
>>> @@ -34,6 +34,7 @@
>>>   #define VIVID_CID_U8_4D_ARRAY		(VIVID_CID_CUSTOM_BASE + 10)
>>>   #define VIVID_CID_AREA			(VIVID_CID_CUSTOM_BASE + 11)
>>>   #define VIVID_CID_RO_INTEGER		(VIVID_CID_CUSTOM_BASE + 12)
>>> +#define VIVID_CID_U32_DYN_ARRAY		(VIVID_CID_CUSTOM_BASE + 13)
>>>   
>>>   #define VIVID_CID_VIVID_BASE		(0x00f00000 | 0xf000)
>>>   #define VIVID_CID_VIVID_CLASS		(0x00f00000 | 1)
>>> @@ -189,6 +190,19 @@ static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
>>>   	.dims = { 1 },
>>>   };
>>>   
>>> +static const struct v4l2_ctrl_config vivid_ctrl_u32_dyn_array = {
>>> +	.ops = &vivid_user_gen_ctrl_ops,
>>> +	.id = VIVID_CID_U32_DYN_ARRAY,
>>> +	.name = "U32 Dynamic Array",
>>> +	.type = V4L2_CTRL_TYPE_U32,
>>> +	.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>>> +	.def = 50,
>>> +	.min = 10,
>>> +	.max = 90,
>>> +	.step = 1,
>>> +	.dims = { 100 },
>>> +};
>>
>> To meaningfully test this, don't we need the vivid driver to change the
>> dimension ? Or is it meant to only test changes made by the application
>> ?
> 
> As I understand it the dw100 driver needs a 2 dimensional array control.
Considering the semantic of the array control, indeed, this is 2 
dimensional array control. Nevertheless, the driver does not need to 
access individual array items, as the hardware is the only consumer.
Hence, the driver rather considers the array control as a binary blob 
with varying size.
 From this perspective, the dynamic array control was a good candidate.

> The size is fixed for each resolution, but if the resolution changes, then
> this control changes size as well, and it makes sense that when that happens
> it is also reset to default values.
> 
> So this isn't a dynamic array at all. It is a standard 2 dimensional array.
> 
> What is missing in the control framework is a function similar to
> v4l2_ctrl_modify_range() that can resize an array.
> 
> v4l2_ctrl_modify_dimensions() would be a goo
> I can make something for that if you both agree with this proposal.
I am ok with that. Thanks !
Regards,
  Xavier
> 
> Regards,
> 
> 	Hans
> 
>>
>>> +
>>>   static const struct v4l2_ctrl_config vivid_ctrl_u16_matrix = {
>>>   	.ops = &vivid_user_gen_ctrl_ops,
>>>   	.id = VIVID_CID_U16_MATRIX,
>>> @@ -1612,6 +1626,7 @@ int vivid_create_controls(struct vivid_dev *dev, bool show_ccs_cap,
>>>   	dev->ro_int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_ro_int32, NULL);
>>>   	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_area, NULL);
>>>   	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_array, NULL);
>>> +	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u32_dyn_array, NULL);
>>>   	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u16_matrix, NULL);
>>>   	v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_u8_4d_array, NULL);
>>>   
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver
  2022-06-14 22:54   ` Laurent Pinchart
@ 2022-06-23  7:06     ` Xavier Roumegue (OSS)
  0 siblings, 0 replies; 27+ messages in thread
From: Xavier Roumegue (OSS) @ 2022-06-23  7:06 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: mchehab, hverkuil-cisco, stanimir.varbanov, tomi.valkeinen,
	robh+dt, nicolas, alexander.stein, linux-media, devicetree

Hi Laurent,


On 6/15/22 00:54, Laurent Pinchart wrote:
> Hi Xavier,
> 
> Thank you for the patch.
> 
> On Tue, May 03, 2022 at 11:39:24AM +0200, Xavier Roumegue wrote:
>> Add a V4L2 mem-to-mem driver for the Vivante DW100 Dewarp Processor IP
>> core found on i.MX8MP SoC.
>>
>> The processor core applies a programmable geometrical transformation on
>> input image to correct distorsion introduced by lenses.
> 
> s/input image/input images/
> 
>> The transformation function is exposed as a grid map with 16x16 pixel
>> macroblocks indexed using X, Y vertex coordinates.
>>
>> The dewarping map can be set from application through dedicated a v4l2
> 
> s/dedicated a v4l2/a dedicated V4L2/
> 
>> control. If not set or invalid, the driver computes an identity map
>> prior to start the processing engine.
> 
> s/start/starting/
> 
>>
>> The driver supports scaling, cropping and pixel format conversion.
>>
>> Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
>> ---
>>   drivers/media/platform/nxp/Kconfig            |    1 +
>>   drivers/media/platform/nxp/Makefile           |    1 +
>>   drivers/media/platform/nxp/dw100/Kconfig      |   16 +
>>   drivers/media/platform/nxp/dw100/Makefile     |    3 +
>>   drivers/media/platform/nxp/dw100/dw100.c      | 1782 +++++++++++++++++
>>   drivers/media/platform/nxp/dw100/dw100_regs.h |  118 ++
>>   6 files changed, 1921 insertions(+)
>>   create mode 100644 drivers/media/platform/nxp/dw100/Kconfig
>>   create mode 100644 drivers/media/platform/nxp/dw100/Makefile
>>   create mode 100644 drivers/media/platform/nxp/dw100/dw100.c
>>   create mode 100644 drivers/media/platform/nxp/dw100/dw100_regs.h
>>
>> diff --git a/drivers/media/platform/nxp/Kconfig b/drivers/media/platform/nxp/Kconfig
>> index a0810df751dc..e5085e5e585b 100644
>> --- a/drivers/media/platform/nxp/Kconfig
>> +++ b/drivers/media/platform/nxp/Kconfig
>> @@ -52,4 +52,5 @@ config VIDEO_MX2_EMMAPRP
>>   	    memory to memory. Operations include resizing and format
>>   	    conversion.
>>   
>> +source "drivers/media/platform/nxp/dw100/Kconfig"
>>   source "drivers/media/platform/nxp/imx-jpeg/Kconfig"
>> diff --git a/drivers/media/platform/nxp/Makefile b/drivers/media/platform/nxp/Makefile
>> index efc38c6578ce..22ba28ac6d63 100644
>> --- a/drivers/media/platform/nxp/Makefile
>> +++ b/drivers/media/platform/nxp/Makefile
>> @@ -1,5 +1,6 @@
>>   # SPDX-License-Identifier: GPL-2.0-only
>>   
>> +obj-y += dw100/
>>   obj-y += imx-jpeg/
>>   
>>   obj-$(CONFIG_VIDEO_IMX_MIPI_CSIS) += imx-mipi-csis.o
>> diff --git a/drivers/media/platform/nxp/dw100/Kconfig b/drivers/media/platform/nxp/dw100/Kconfig
>> new file mode 100644
>> index 000000000000..86ccac6fd120
>> --- /dev/null
>> +++ b/drivers/media/platform/nxp/dw100/Kconfig
>> @@ -0,0 +1,16 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config VIDEO_DW100
>> +	tristate "NXP i.MX DW100 dewarper"
>> +	depends on V4L_MEM2MEM_DRIVERS
>> +	depends on VIDEO_DEV
>> +	depends on ARCH_MXC || COMPILE_TEST
>> +	select VIDEOBUF2_DMA_CONTIG
>> +	select V4L2_MEM2MEM_DEV
>> +	help
>> +	  DW100 is a memory-to-memory engine performing geometrical
>> +	  transformation on source image through a programmable dewarping map.
> 
> s/source image/source images/
> 
>> +
>> +	  To compile this driver as a module, choose M here: the module
>> +	  will be called dw100.
>> +
>> diff --git a/drivers/media/platform/nxp/dw100/Makefile b/drivers/media/platform/nxp/dw100/Makefile
>> new file mode 100644
>> index 000000000000..49db80589e9a
>> --- /dev/null
>> +++ b/drivers/media/platform/nxp/dw100/Makefile
>> @@ -0,0 +1,3 @@
>> +# SPDX-License-Identifier: GPL-2.0+
>> +
>> +obj-$(CONFIG_VIDEO_DW100) += dw100.o
>> diff --git a/drivers/media/platform/nxp/dw100/dw100.c b/drivers/media/platform/nxp/dw100/dw100.c
>> new file mode 100644
>> index 000000000000..869ca12e0bcd
>> --- /dev/null
>> +++ b/drivers/media/platform/nxp/dw100/dw100.c
>> @@ -0,0 +1,1782 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * DW100 Hardware dewarper
>> + *
>> + * Copyright 2022 NXP
>> + * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
>> + *
>> + */
>> +
>> +#include <linux/clk.h>
>> +#include <linux/debugfs.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/minmax.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm_runtime.h>
>> +
>> +#include <media/v4l2-ctrls.h>
>> +#include <media/v4l2-device.h>
>> +#include <media/v4l2-event.h>
>> +#include <media/v4l2-ioctl.h>
>> +#include <media/v4l2-mem2mem.h>
>> +#include <media/videobuf2-dma-contig.h>
>> +
>> +#include <uapi/linux/dw100.h>
>> +
>> +#include "dw100_regs.h"
>> +
>> +#define DRV_NAME "dw100"
>> +
>> +#define MIN_W 176
>> +#define MIN_H 144
>> +#define MAX_W 4096
>> +#define MAX_H 3072
>> +#define ALIGN_W 3
>> +#define ALIGN_H 3
> 
> Please add a DW100_ prefix to these macros to avoid possible future
> namespace clashes.
> 
>> +
>> +#define DW100_BLOCK_SIZE 16
>> +
>> +#define DW100_MIN_LUT_NELEMS		(((MIN_W / DW100_BLOCK_SIZE) + 1) \
>> +					 * ((MIN_H / DW100_BLOCK_SIZE) + 1))
>> +
>> +#define DW100_MAX_LUT_NELEMS		(((MAX_W / DW100_BLOCK_SIZE) + 1) \
>> +					 * ((MAX_H / DW100_BLOCK_SIZE) + 1))
>> +
>> +/*
>> + * 16 controls have been reserved for this driver for future extension, but
>> + * let's limit the related driver allocation to the effective number of controls
>> + * in use.
>> + */
>> +#define DW100_MAX_CTRLS 1
>> +#define DW100_CTRL_DEWARPING_MAP 0
> 
> Aligning all macros could make this more readable.
> 
>> +
>> +static unsigned int debug;
>> +module_param(debug, uint, 0644);
>> +MODULE_PARM_DESC(debug, "Activate debug info");
>> +
>> +#define dprintk(lvl, dev, fmt, arg...) \
>> +	v4l2_dbg(lvl, debug, &(dev)->v4l2_dev, "%s: " fmt, __func__, ## arg)
> 
> I'd just use dev_dbg() and drop the debug module parameter. Up to you.
> 
>> +
>> +enum {
>> +	V4L2_M2M_SRC = 0,
>> +	V4L2_M2M_DST = 1,
>> +};
>> +
>> +enum {
>> +	V4L2_M2M_CAPTURE = 1,
> 
> BIT(0)
> 
>> +	V4L2_M2M_OUTPUT = 2,
> 
> and BIT(1) to emphesize they're bitflags.
> 
>> +};
> 
> Those names are also too generic, a DW100_ prefix would fix that. I'd
> name the first two DW100_QUEUE_SRC and DW100_QUEUE_DST, and the last two
> DW100_FMT_CAPTURE and DW100_FMT_OUTPUT.
> 
>> +
>> +static struct dw100_fmt {
> 
> static const
> 
>> +	u32 fourcc;
>> +	int depth;
>> +	u32 types;
>> +	u32 reg_format;
>> +	bool reg_swap_uv;
>> +} formats[] = {
>> +	{
>> +		.fourcc = V4L2_PIX_FMT_NV16,
>> +		.depth = 16,
>> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
>> +		.reg_swap_uv = false,
>> +	}, {
>> +		.fourcc = V4L2_PIX_FMT_NV61,
>> +		.depth = 16,
>> +		.types = V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_SP,
>> +		.reg_swap_uv = true,
>> +	}, {
>> +		.fourcc = V4L2_PIX_FMT_YUYV,
>> +		.depth = 16,
>> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
>> +		.reg_swap_uv = false,
>> +	}, {
>> +		.fourcc = V4L2_PIX_FMT_UYVY,
>> +		.depth = 16,
>> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED,
>> +		.reg_swap_uv = true,
>> +	}, {
>> +		.fourcc = V4L2_PIX_FMT_NV12,
>> +		.depth = 12,
>> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
>> +		.reg_swap_uv = false,
>> +	}, {
>> +		.fourcc = V4L2_PIX_FMT_NV21,
>> +		.depth = 12,
>> +		.types = V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE,
>> +		.reg_format = DW100_DEWARP_CTRL_FORMAT_YUV420_SP,
>> +		.reg_swap_uv = true,
>> +	},
>> +};
>> +
>> +static inline int to_dw100_fmt_type(enum v4l2_buf_type type)
>> +{
>> +	if (V4L2_TYPE_IS_OUTPUT(type))
>> +		return V4L2_M2M_OUTPUT;
>> +	else
>> +		return V4L2_M2M_CAPTURE;
>> +}
>> +
>> +#define NUM_FORMATS ARRAY_SIZE(formats)
> 
> I'd drop the macro and use ARRAY_SIZE(formats) directly in the two
> locations where NUM_FORMATS is used. Same for NUM_CTRLS.
> 
>> +
>> +static struct dw100_fmt *find_pixel_format(u32 pixel_format, int fmt_type)
> 
> static const struct
> 
> dw100_ prefix for the function name. Same below where applicable.
> 
>> +{
>> +	struct dw100_fmt *fmt;
> 
> const
> 
> (and where applicable as indicated by the compiler in the rest of the
> code)
> 
> It's often a good practice to move declaration of variables to the scope
> they're used in.
> 
>> +	unsigned int k;
> 
> Any reason not to use the customary "i" variable ?
> 
>> +
>> +	for (k = 0; k < NUM_FORMATS; k++) {
>> +		fmt = &formats[k];
>> +		if (fmt->fourcc == pixel_format && fmt->types & fmt_type)
>> +			return fmt;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static struct dw100_fmt *find_format(struct v4l2_format *f)
>> +{
>> +	return find_pixel_format(f->fmt.pix.pixelformat,
>> +			      to_dw100_fmt_type(f->type));
>> +}
>> +
>> +static inline u32 dw100_bytesperline(struct dw100_fmt *fmt, u32 width)
> 
> You can drop the inline and let the compiler decide.
> 
>> +{
>> +	switch (fmt->reg_format) {
>> +	case DW100_DEWARP_CTRL_FORMAT_YUV422_SP:
>> +	case DW100_DEWARP_CTRL_FORMAT_YUV420_SP:
>> +		return width;
>> +	case DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED:
>> +	default:
>> +		return (fmt->depth * width) >> 3;
>> +	}
>> +}
>> +
>> +static inline u32 dw100_sizeimage(struct dw100_fmt *fmt, u32 width, u32 height)
>> +{
>> +	return (fmt->depth * width * height) >> 3;
>> +}
>> +
>> +struct dw100_device {
>> +	struct platform_device *pdev;
>> +	struct v4l2_m2m_dev	*m2m_dev;
>> +	struct v4l2_device	v4l2_dev;
>> +	struct video_device	vfd;
>> +#ifdef CONFIG_MEDIA_CONTROLLER
>> +	struct media_device	mdev;
>> +#endif
>> +	struct mutex		vfd_mutex;
>> +	spinlock_t		irqlock;
>> +	void __iomem		*mmio;
>> +	struct clk_bulk_data	*clks;
>> +	int			num_clks;
>> +	struct dentry		*debugfs_root;
>> +};
> 
> Could you please move all structure definitions before the functions ?
> 
>> +
>> +static int dw100_dump_regs(struct dw100_device *dw_dev)
>> +{
>> +#define __DECLARE_REG(x) { #x, x }
>> +	int i;
>> +	struct reg_desc {
>> +		const char * const name;
>> +		unsigned int addr;
>> +	} dw100_regs[] = {
>> +		__DECLARE_REG(DW100_DEWARP_ID),
>> +		__DECLARE_REG(DW100_DEWARP_CTRL),
>> +		__DECLARE_REG(DW100_MAP_LUT_ADDR),
>> +		__DECLARE_REG(DW100_MAP_LUT_SIZE),
>> +		__DECLARE_REG(DW100_MAP_LUT_ADDR2),
>> +		__DECLARE_REG(DW100_MAP_LUT_SIZE2),
>> +		__DECLARE_REG(DW100_SRC_IMG_Y_BASE),
>> +		__DECLARE_REG(DW100_SRC_IMG_UV_BASE),
>> +		__DECLARE_REG(DW100_SRC_IMG_SIZE),
>> +		__DECLARE_REG(DW100_SRC_IMG_STRIDE),
>> +		__DECLARE_REG(DW100_DST_IMG_Y_BASE),
>> +		__DECLARE_REG(DW100_DST_IMG_UV_BASE),
>> +		__DECLARE_REG(DW100_DST_IMG_SIZE),
>> +		__DECLARE_REG(DW100_DST_IMG_STRIDE),
>> +		__DECLARE_REG(DW100_DST_IMG_Y_SIZE1),
>> +		__DECLARE_REG(DW100_DST_IMG_UV_SIZE1),
>> +		__DECLARE_REG(DW100_SRC_IMG_Y_BASE2),
>> +		__DECLARE_REG(DW100_SRC_IMG_UV_BASE2),
>> +		__DECLARE_REG(DW100_SRC_IMG_SIZE2),
>> +		__DECLARE_REG(DW100_SRC_IMG_STRIDE2),
>> +		__DECLARE_REG(DW100_DST_IMG_Y_BASE2),
>> +		__DECLARE_REG(DW100_DST_IMG_UV_BASE2),
>> +		__DECLARE_REG(DW100_DST_IMG_SIZE2),
>> +		__DECLARE_REG(DW100_DST_IMG_STRIDE2),
>> +		__DECLARE_REG(DW100_DST_IMG_Y_SIZE2),
>> +		__DECLARE_REG(DW100_DST_IMG_UV_SIZE2),
>> +		__DECLARE_REG(DW100_SWAP_CONTROL),
>> +		__DECLARE_REG(DW100_VERTICAL_SPLIT_LINE),
>> +		__DECLARE_REG(DW100_HORIZON_SPLIT_LINE),
>> +		__DECLARE_REG(DW100_SCALE_FACTOR),
>> +		__DECLARE_REG(DW100_ROI_START),
>> +		__DECLARE_REG(DW100_BOUNDARY_PIXEL),
>> +		__DECLARE_REG(DW100_INTERRUPT_STATUS),
>> +		__DECLARE_REG(DW100_BUS_CTRL),
>> +		__DECLARE_REG(DW100_BUS_CTRL1),
>> +		__DECLARE_REG(DW100_BUS_TIME_OUT_CYCLE),
>> +	};
>> +
>> +	for (i = 0; i < ARRAY_SIZE(dw100_regs); i++) {
>> +		dev_info(&dw_dev->pdev->dev, "%s: %#x\n",
>> +			 dw100_regs[i].name,
>> +			 readl(dw_dev->mmio + dw100_regs[i].addr));
> 
> readl()/writel() wrappers would be nice, something along the lines of
> 
> static inline u32 dw100_read(struct dw100_device *dw_dev, u32 reg)
> {
> 	return readl(dw_dev->mmio + reg);
> }
> 
> static inline void dw100_write(struct dw100_device *dw_dev, u32 reg, u32 val)
> {
> 	write(val, dw_dev->mmio + reg);
> }
> 
>> +	}
> 
> As this is read through debugfs, I recommend using the seq_file API to
> dump the registers to the debugfs file instead of the kernel log.
> 
>> +
>> +	return 0;
>> +}
>> +
>> +struct dw100_q_data {
>> +	unsigned int		width;
>> +	unsigned int		height;
>> +	unsigned int		bytesperline;
>> +	unsigned int		sizeimage;
>> +	unsigned int		sequence;
>> +	struct dw100_fmt	*fmt;
>> +	struct v4l2_rect	crop;
>> +};
>> +
>> +struct dw100_ctx {
>> +	struct v4l2_fh			fh;
>> +	struct dw100_device		*dw_dev;
>> +	struct v4l2_ctrl_handler	hdl;
>> +	struct v4l2_ctrl		*ctrls[DW100_MAX_CTRLS];
>> +
>> +	/* Look Up Table for pixel remapping */
>> +	unsigned int			*map;
>> +	dma_addr_t			map_dma;
>> +	size_t				map_size;
>> +	unsigned int			map_width;
>> +	unsigned int			map_height;
>> +
>> +	/* Related colorspace properties propagated from input to output */
>> +	enum v4l2_colorspace	colorspace;
>> +	enum v4l2_xfer_func	xfer_func;
>> +	enum v4l2_ycbcr_encoding ycbcr_enc;
>> +	enum v4l2_quantization	quant;
>> +
>> +	/* Source and destination queue data */
>> +	struct dw100_q_data   q_data[2];
>> +};
>> +
>> +static inline struct dw100_ctx *file2ctx(struct file *file)
>> +{
>> +	return container_of(file->private_data, struct dw100_ctx, fh);
>> +}
>> +
>> +static struct dw100_q_data *get_q_data(struct dw100_ctx *ctx,
>> +				       enum v4l2_buf_type type)
>> +{
>> +	if (type == V4L2_BUF_TYPE_VIDEO_OUTPUT)
>> +		return &ctx->q_data[V4L2_M2M_SRC];
>> +	else
>> +		return &ctx->q_data[V4L2_M2M_DST];
>> +}
>> +
>> +static u32 dw100_get_n_vertices_from_length(u32 length)
>> +{
>> +	return DIV_ROUND_UP(length, DW100_BLOCK_SIZE) + 1;
>> +}
>> +
>> +static u16 dw_map_convert_to_UQ12_4(u32 a)
> 
> s/dw_/dw100_/ (and where applicable below)
> s/UQ/uq/
> 
>> +{
>> +	return (u16)((a & 0xfff) << 4);
>> +}
>> +
>> +static u32 dw_map_format_coordinates(u16 xq, u16 yq)
>> +{
>> +	return (u32)((yq << 16) | xq);
>> +}
>> +
>> +static u32 *dw100_get_user_map(struct dw100_ctx *ctx)
>> +{
>> +	struct v4l2_ctrl *ctrl = ctx->ctrls[DW100_CTRL_DEWARPING_MAP];
>> +
> 
> No need for a blank line.
> 
>> +	size_t user_map_size;
>> +
>> +	if (!ctrl) {
>> +		v4l2_err(&ctx->dw_dev->v4l2_dev, "Mapping control not found !");
>> +		return NULL;
>> +	}
> 
> I don't think this can ever happen.
> 
>> +
>> +	if (ctrl->elems < DW100_MIN_LUT_NELEMS ||
>> +	    ctrl->elems > DW100_MAX_LUT_NELEMS)
>> +		return NULL;
>> +
>> +	user_map_size = ctrl->elems * ctrl->elem_size;
>> +
>> +	if ((ctrl->elems * ctrl->elem_size) == ctx->map_size)
> 
> 	if (user_map_size == ctx->map_size)
> 
>> +		return ctrl->p_cur.p_u32;
>> +
>> +	v4l2_warn(&ctx->dw_dev->v4l2_dev,
>> +		  "Skipping invalid user map (%zu != %zu)\n",
>> +		  user_map_size,
>> +		  ctx->map_size);
> 
> The last two lines hold in a single line.
> 
> This should be a dev_dbg(), to avoid flooding the kernel log with
> userspace-triggered messages.
> 
> I'd also write
> 
> 	if (user_map_size != ctx->map_size) {
> 		dev_dbg(...);
> 		return NULL;
> 	}
> 
> 	return ctrl->p_cur.p_u32;
> 
> to match the pattern of the previous error check.
> 
>> +
>> +	return NULL;
>> +}
>> +
>> +/*
>> + * Create an identity map if not set by the application
> 
> This comment is a bit misleading, I initially thought the function's
> purpose was to create an identity mapping. You can write
> 
>    * Create the dewarp map from the V4L2 control. If the control hasn't been set
>    * by the application, generate an identity mapping.
> 
>> + *
>> + * A 16 pixels cell size grid is mapped on the destination image.
>> + * The last cells width/height might be lesser than 16 if the destination image
>> + * width/height is not divisible by 16. This dewarping grid map specifies the
>> + * source image pixel location (x, y) on each grid intersection point.
>> + * Bilinear interpolation is used to compute inner cell points locations.
>> + *
>> + * The coordinates are saved in UQ12.4 fixed point format.
>> + *
> 
> Extra blank line.
> 
>> + */
>> +static int dw100_create_mapping(struct dw100_ctx *ctx)
>> +{
>> +	u32 sw, sh, dw, dh, mw, mh, i, j;
>> +	u16 qx, qy, qdx, qdy, qsh, qsw;
>> +	bool is_user_map = false;
>> +	u32 *user_map;
>> +
>> +	sw = ctx->q_data[V4L2_M2M_SRC].width;
>> +	dw = ctx->q_data[V4L2_M2M_DST].width;
>> +	sh = ctx->q_data[V4L2_M2M_SRC].height;
>> +	dh = ctx->q_data[V4L2_M2M_DST].height;
>> +
>> +	mw = dw100_get_n_vertices_from_length(dw);
>> +	mh = dw100_get_n_vertices_from_length(dh);
>> +
>> +	qdx = dw_map_convert_to_UQ12_4(sw) / (mw - 1);
>> +	qdy = dw_map_convert_to_UQ12_4(sh) / (mh - 1);
> 
> These two lines can be moved just before the for loop as you don't need
> those variables when the user has supplied a mapping. You can then write
> them
> 
> 	qdx = qsw / (mw - 1);
> 	qdy = qsh / (mh - 1);
> 
>> +	qsh = dw_map_convert_to_UQ12_4(sh);
>> +	qsw = dw_map_convert_to_UQ12_4(sw);
> 
> Swap the last two lines.
> 
>> +
>> +	if (ctx->map)
>> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
>> +				  ctx->map_size,
>> +				  ctx->map,
>> +				  ctx->map_dma);
> 
> This holds on two liens.
> 
>> +
>> +	ctx->map_width = mw;
>> +	ctx->map_height = mh;
>> +	ctx->map_size = mh * mw * sizeof(u32);
>> +
>> +	ctx->map = dma_alloc_coherent(&ctx->dw_dev->pdev->dev,
>> +				      ctx->map_size,
>> +				      &ctx->map_dma,
>> +				      GFP_KERNEL);
> 
> Same here.
> 
>> +
>> +	if (!ctx->map)
>> +		return -ENOMEM;
>> +
>> +	user_map = dw100_get_user_map(ctx);
>> +	if (user_map) {
>> +		is_user_map = true;
>> +		memcpy(ctx->map, user_map, ctx->map_size);
>> +		goto out;
>> +	}
>> +
>> +	for (i = 0, qy = 0, qx = 0; i < mh; i++, qy += qdy, qx = 0) {
>> +		if (qy > qsh)
>> +			qy = qsh;
> 
> Can this happen ?
I think so. The frame resolution is aligned on 8 pixels whereas the 
mapping block resolution is 16 pixels. Hence clamping is required.
> 
>> +		for (j = 0; j < mw; j++, qx += qdx) {
> 
> 		for (j = 0, qx = 0; j < mw; j++, qx += qdx) {
> 
> and drop qw handling from the outer loop.
> 
>> +			if (qx > qsw)
>> +				qx = qsw;
>> +			ctx->map[i * mw + j] = dw_map_format_coordinates(qx, qy);
> 
> This could be made a bit more efficient by declaring a
> 
> 	u32 *map;
> 
> variable, setting it to
> 
> 	map = ctx->map;
> 
> just before the loop, and writing
> 
> 		*map++ = dw_map_format_coordinates(qx, qy);
> 
> here.
> 
>> +		}
>> +	}
>> +out:
>> +	dprintk(1, ctx->dw_dev,
>> +		"%dx%d %s mapping created (d:%pa-c:%p) for stream %dx%d->%dx%d\n",
> 
> %u instead of %d for unsigned integers.
> 
>> +			mw, mh, is_user_map ? "user" : "identity",
>> +			&ctx->map_dma, ctx->map, sw, sh, dw, dh);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct v4l2_ctrl_config controls[] = {
>> +	[DW100_CTRL_DEWARPING_MAP] = {
>> +		.id = V4L2_CID_DW100_DEWARPING_16x16_VERTEX_MAP,
>> +		.name = "Look-Up Table",
>> +		.type = V4L2_CTRL_TYPE_U32,
>> +		.min = 0x00000000,
>> +		.max = 0xffffffff,
>> +		.step = 1,
>> +		.def = 0,
>> +		.dims = { DW100_MAX_LUT_NELEMS },
>> +		.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
>> +	},
>> +};
>> +
>> +#define NUM_CTRLS ARRAY_SIZE(controls)
>> +
>> +static int dw100_queue_setup(struct vb2_queue *vq,
>> +			     unsigned int *nbuffers, unsigned int *nplanes,
>> +			     unsigned int sizes[], struct device *alloc_devs[])
>> +{
>> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vq);
>> +	struct dw100_q_data *q_data;
>> +	unsigned int size, count = *nbuffers;
>> +
>> +	q_data = get_q_data(ctx, vq->type);
>> +
>> +	size = q_data->sizeimage;
>> +
>> +	*nbuffers = count;
>> +
>> +	if (*nplanes)
>> +		return sizes[0] < size ? -EINVAL : 0;
>> +
>> +	*nplanes = 1;
>> +	sizes[0] = size;
>> +
>> +	dprintk(1, ctx->dw_dev, "Queue %p: get %d buffer(s) of size %d each.\n",
>> +		vq, count, size);
> 
> I think I'd drop this message, or have you found it useful to debug
> application issues ?
> 
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_buf_prepare(struct vb2_buffer *vb)
>> +{
>> +	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct dw100_device *dw_dev = ctx->dw_dev;
>> +	struct dw100_q_data *q_data;
>> +
>> +	dprintk(1, dw_dev, "Queue %p: Preparing buffer %p, type: %d\n",
>> +		vb->vb2_queue, vb, vb->vb2_queue->type);
> 
> Same here.
> 
>> +
>> +	q_data = get_q_data(ctx, vb->vb2_queue->type);
>> +	if (V4L2_TYPE_IS_OUTPUT(vb->vb2_queue->type)) {
>> +		if (vbuf->field == V4L2_FIELD_ANY)
> 
> This can't happen,
> https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/field-order.html#c.v4l2_field
> states "struct v4l2_buffer field can never be V4L2_FIELD_ANY".
> 
>> +			vbuf->field = V4L2_FIELD_NONE;
>> +		if (vbuf->field != V4L2_FIELD_NONE) {
>> +			v4l2_err(&dw_dev->v4l2_dev, "%x field isn't supported\n",
>> +				 vbuf->field);
> 
> dev_dbg()
> 
>> +			return -EINVAL;
>> +		}
>> +	}
>> +
>> +	if (vb2_plane_size(vb, 0) < q_data->sizeimage) {
>> +		v4l2_err(&dw_dev->v4l2_dev,
>> +			 "%s data will not fit into plane (%lu < %lu)\n",
>> +			 __func__, vb2_plane_size(vb, 0),
>> +			 (long)q_data->sizeimage);
> 
> dev_dbg()
> 
>> +		return -EINVAL;
>> +	}
>> +
>> +	vb2_set_plane_payload(vb, 0, q_data->sizeimage);
>> +
>> +	return 0;
>> +}
>> +
>> +static void dw100_buf_queue(struct vb2_buffer *vb)
>> +{
>> +	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>> +	struct dw100_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
>> +
>> +	dprintk(2, ctx->dw_dev, "Queue %p: Queuing buffer %p.\n",
>> +		vb->vb2_queue, vb);
>> +	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
>> +}
>> +
>> +static int dw100_start_streaming(struct vb2_queue *q, unsigned int count)
>> +{
>> +	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
>> +	struct dw100_q_data *q_data = get_q_data(ctx, q->type);
>> +
>> +	dprintk(1, ctx->dw_dev, "Queue %p: Start Streaming.\n", q);
>> +
>> +	q_data->sequence = 0;
>> +
>> +	return pm_runtime_resume_and_get(&ctx->dw_dev->pdev->dev);
>> +}
>> +
>> +static void dw100_stop_streaming(struct vb2_queue *q)
>> +{
>> +	struct dw100_ctx *ctx = vb2_get_drv_priv(q);
>> +	struct vb2_v4l2_buffer *vbuf;
>> +	unsigned long flags;
>> +
>> +	dprintk(1, ctx->dw_dev, "Queue %p: Stop Streaming.\n", q);
>> +	for (;;) {
>> +		if (V4L2_TYPE_IS_OUTPUT(q->type))
>> +			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
>> +		else
>> +			vbuf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
>> +		if (!vbuf)
>> +			break;
>> +		spin_lock_irqsave(&ctx->dw_dev->irqlock, flags);
> 
> I don't think you need the lock here, as this can't race with the IRQ,
> can it ?
> 
>> +		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
>> +		spin_unlock_irqrestore(&ctx->dw_dev->irqlock, flags);
>> +	}
>> +
>> +	pm_runtime_put_sync(&ctx->dw_dev->pdev->dev);
>> +
>> +	if (ctx->map) {
>> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
>> +				  ctx->map_size,
>> +				  ctx->map,
>> +				  ctx->map_dma);
>> +		ctx->map = NULL;
> 
> There's a mismatch between start and stop, could we create the mapping
> in dw100_start_streaming() instead of dw100_start() ?
> 
>> +	}
>> +}
>> +
>> +static const struct vb2_ops dw100_qops = {
>> +	.queue_setup	 = dw100_queue_setup,
>> +	.buf_prepare	 = dw100_buf_prepare,
>> +	.buf_queue	 = dw100_buf_queue,
>> +	.start_streaming = dw100_start_streaming,
>> +	.stop_streaming  = dw100_stop_streaming,
>> +	.wait_prepare	 = vb2_ops_wait_prepare,
>> +	.wait_finish	 = vb2_ops_wait_finish,
>> +};
>> +
>> +static int dw100_m2m_queue_init(void *priv, struct vb2_queue *src_vq,
>> +				struct vb2_queue *dst_vq)
>> +{
>> +	struct dw100_ctx *ctx = priv;
>> +	int ret;
>> +
>> +	src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
>> +	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>> +	src_vq->drv_priv = ctx;
>> +	src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
>> +	src_vq->ops = &dw100_qops;
>> +	src_vq->mem_ops = &vb2_dma_contig_memops;
>> +	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>> +	src_vq->lock = &ctx->dw_dev->vfd_mutex;
> 
> This will serialize queue operations across all contexts. Wouldn't it be
> better to have a per-context lock ?
> 
>> +	src_vq->dev = ctx->dw_dev->v4l2_dev.dev;
>> +
>> +	ret = vb2_queue_init(src_vq);
>> +	if (ret)
>> +		return ret;
>> +
>> +	dst_vq->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
>> +	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>> +	dst_vq->drv_priv = ctx;
>> +	dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
>> +	dst_vq->ops = &dw100_qops;
>> +	dst_vq->mem_ops = &vb2_dma_contig_memops;
>> +	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>> +	dst_vq->lock = &ctx->dw_dev->vfd_mutex;
>> +	dst_vq->dev = ctx->dw_dev->v4l2_dev.dev;
>> +
>> +	return vb2_queue_init(dst_vq);
>> +}
>> +
>> +static int dw100_open(struct file *file)
>> +{
>> +	struct dw100_device *dw_dev = video_drvdata(file);
>> +	struct dw100_ctx *ctx;
>> +	struct v4l2_ctrl_handler *hdl;
>> +	int ret = 0, i;
>> +
>> +	if (mutex_lock_interruptible(&dw_dev->vfd_mutex))
>> +		return -ERESTARTSYS;
>> +
>> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
>> +	if (!ctx) {
>> +		ret = -ENOMEM;
>> +		goto open_unlock;
>> +	}
> 
> This can be moved before the mutex_lock call (with a return -ENOMEM
> instead of the goto).
> 
> Actually I don't think you need the lock at all in this function.
> 
>> +
>> +	v4l2_fh_init(&ctx->fh, video_devdata(file));
>> +	file->private_data = &ctx->fh;
>> +	ctx->dw_dev = dw_dev;
>> +
>> +	hdl = &ctx->hdl;
>> +	v4l2_ctrl_handler_init(hdl, NUM_CTRLS);
>> +	for (i = 0; i < NUM_CTRLS; i++) {
>> +		ctx->ctrls[i] = v4l2_ctrl_new_custom(hdl, &controls[i], NULL);
>> +		if (hdl->error) {
>> +			v4l2_err(&ctx->dw_dev->v4l2_dev,
>> +				 "Adding control (%d) failed\n", i);
>> +			ret = hdl->error;
>> +			v4l2_ctrl_handler_free(hdl);
>> +			v4l2_fh_exit(&ctx->fh);
>> +			kfree(ctx);
>> +			goto open_unlock;
>> +		}
>> +	}
>> +	ctx->fh.ctrl_handler = hdl;
>> +
>> +	ctx->q_data[V4L2_M2M_SRC].fmt = &formats[0];
>> +	ctx->q_data[V4L2_M2M_SRC].width = 640;
>> +	ctx->q_data[V4L2_M2M_SRC].height = 480;
>> +	ctx->q_data[V4L2_M2M_SRC].bytesperline =
>> +		dw100_bytesperline(&formats[0], 640);
>> +	ctx->q_data[V4L2_M2M_SRC].sizeimage =
>> +		dw100_sizeimage(&formats[0], 640, 480);
>> +
>> +	ctx->q_data[V4L2_M2M_SRC].crop.top = 0;
>> +	ctx->q_data[V4L2_M2M_SRC].crop.left = 0;
>> +	ctx->q_data[V4L2_M2M_SRC].crop.width = 640;
>> +	ctx->q_data[V4L2_M2M_SRC].crop.height = 480;
>> +
>> +	ctx->q_data[V4L2_M2M_DST] = ctx->q_data[V4L2_M2M_SRC];
>> +
>> +	ctx->colorspace = V4L2_COLORSPACE_REC709;
>> +	ctx->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(ctx->colorspace);
>> +	ctx->ycbcr_enc = V4L2_MAP_YCBCR_ENC_DEFAULT(ctx->colorspace);
>> +	ctx->quant = V4L2_MAP_QUANTIZATION_DEFAULT(false,
>> +						   ctx->colorspace,
>> +						   ctx->ycbcr_enc);
>> +
>> +	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dw_dev->m2m_dev,
>> +					    ctx, &dw100_m2m_queue_init);
>> +
>> +	if (IS_ERR(ctx->fh.m2m_ctx)) {
>> +		ret = PTR_ERR(ctx->fh.m2m_ctx);
>> +
>> +		v4l2_ctrl_handler_free(hdl);
>> +		v4l2_fh_exit(&ctx->fh);
>> +		kfree(ctx);
>> +		goto open_unlock;
>> +	}
>> +
>> +	v4l2_fh_add(&ctx->fh);
>> +
>> +	dprintk(1, dw_dev, "M2M instance created: %p", ctx->fh.m2m_ctx);
> 
> I'd drop this message too.
> 
>> +
>> +open_unlock:
>> +	mutex_unlock(&dw_dev->vfd_mutex);
>> +	return ret;
>> +}
>> +
>> +static int dw100_release(struct file *file)
>> +{
>> +	struct dw100_device *dw_dev = video_drvdata(file);
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +
>> +	dprintk(1, dw_dev, "Releasing M2M instance: %p", ctx->fh.m2m_ctx);
>> +
>> +	v4l2_fh_del(&ctx->fh);
>> +	v4l2_fh_exit(&ctx->fh);
>> +	v4l2_ctrl_handler_free(&ctx->hdl);
>> +
>> +	mutex_lock(&dw_dev->vfd_mutex);
>> +	v4l2_m2m_ctx_release(ctx->fh.m2m_ctx);
>> +	mutex_unlock(&dw_dev->vfd_mutex);
>> +
>> +	if (ctx->map)
>> +		dma_free_coherent(&ctx->dw_dev->pdev->dev,
>> +				  ctx->map_size,
>> +				  ctx->map,
>> +				  ctx->map_dma);
> 
> Can this happen, doesn't the M2M framework guarantee that streaming is
> stopped, which then guarantees that the .stop_streaming() handler has
> freed the mapping ?
> 
>> +
>> +	kfree(ctx);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct v4l2_file_operations dw100_fops = {
>> +	.owner		= THIS_MODULE,
>> +	.open		= dw100_open,
>> +	.release	= dw100_release,
>> +	.poll		= v4l2_m2m_fop_poll,
>> +	.unlocked_ioctl	= video_ioctl2,
>> +	.mmap		= v4l2_m2m_fop_mmap,
>> +};
>> +
>> +static int dw100_querycap(struct file *file, void *priv,
>> +			  struct v4l2_capability *cap)
>> +{
>> +	strscpy(cap->driver, DRV_NAME, sizeof(cap->driver));
>> +	strscpy(cap->card, "DW100 dewarper", sizeof(cap->card));
>> +	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s", DRV_NAME);
> 
> Not needed anymore, the core handles it in v4l_querycap() (and the value
> here is incorrect, it should use the device name, not hardcode the
> driver name).
> 
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_enum_fmt(struct v4l2_fmtdesc *f)
>> +{
>> +	int i, num = 0;
>> +
>> +	for (i = 0; i < NUM_FORMATS; i++) {
>> +		if (formats[i].types & to_dw100_fmt_type(f->type)) {
>> +			if (num == f->index) {
>> +				f->pixelformat = formats[i].fourcc;
>> +				return 0;
>> +			}
>> +			++num;
>> +		}
>> +	}
>> +
>> +	return -EINVAL;
>> +}
>> +
>> +static int dw100_enum_fmt_vid_cap(struct file *file, void *priv,
>> +				  struct v4l2_fmtdesc *f)
>> +{
>> +	return dw100_enum_fmt(f);
>> +}
>> +
>> +static int dw100_enum_fmt_vid_out(struct file *file, void *priv,
>> +				  struct v4l2_fmtdesc *f)
>> +{
>> +	return dw100_enum_fmt(f);
>> +}
> 
> You can merge those three functions into a single one.
> 
>> +
>> +static int dw100_enum_framesizes(struct file *file, void *priv,
>> +				 struct v4l2_frmsizeenum *fsize)
>> +{
>> +	struct dw100_fmt *fmt;
>> +
>> +	if (fsize->index)
>> +		return -EINVAL;
>> +
>> +	fmt = find_pixel_format(fsize->pixel_format,
>> +				V4L2_M2M_OUTPUT | V4L2_M2M_CAPTURE);
> 
> I think you should restrict the format type based on fsize->type.
> 
>> +	if (!fmt)
>> +		return -EINVAL;
>> +
>> +	fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
>> +	fsize->stepwise.min_width = MIN_W;
>> +	fsize->stepwise.min_height = MIN_H;
>> +	fsize->stepwise.max_width = MAX_W;
>> +	fsize->stepwise.max_height = MAX_H;
>> +	fsize->stepwise.step_width = (1UL << ALIGN_W);
>> +	fsize->stepwise.step_height = (1UL << ALIGN_H);
> 
> No need for the parentheses.
> 
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_g_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
>> +{
>> +	struct vb2_queue *vq;
>> +	struct dw100_q_data *q_data;
>> +
>> +	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
>> +	if (!vq)
>> +		return -EINVAL;
>> +
>> +	q_data = get_q_data(ctx, f->type);
>> +
>> +	f->fmt.pix.width	= q_data->width;
>> +	f->fmt.pix.height	= q_data->height;
>> +	f->fmt.pix.field	= V4L2_FIELD_NONE;
>> +	f->fmt.pix.pixelformat	= q_data->fmt->fourcc;
>> +	f->fmt.pix.bytesperline	= q_data->bytesperline;
>> +	f->fmt.pix.sizeimage	= q_data->sizeimage;
>> +	f->fmt.pix.colorspace	= ctx->colorspace;
>> +	f->fmt.pix.xfer_func	= ctx->xfer_func;
>> +	f->fmt.pix.ycbcr_enc	= ctx->ycbcr_enc;
>> +	f->fmt.pix.quantization	= ctx->quant;
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_g_fmt_vid_out(struct file *file, void *priv,
>> +			       struct v4l2_format *f)
>> +{
>> +	return dw100_g_fmt(file2ctx(file), f);
>> +}
>> +
>> +static int dw100_g_fmt_vid_cap(struct file *file, void *priv,
>> +			       struct v4l2_format *f)
>> +{
>> +	return dw100_g_fmt(file2ctx(file), f);
>> +}
> 
> You can also merge those three functions into a single one.
> 
>> +
>> +static int dw100_try_fmt(struct v4l2_format *f, struct dw100_fmt *fmt)
>> +{
>> +	v4l_bound_align_image(&f->fmt.pix.width, MIN_W, MAX_W, ALIGN_W,
>> +			      &f->fmt.pix.height, MIN_H, MAX_H, ALIGN_H, 0);
>> +
>> +	f->fmt.pix.bytesperline = dw100_bytesperline(fmt, f->fmt.pix.width);
>> +	f->fmt.pix.sizeimage = dw100_sizeimage(fmt, f->fmt.pix.width,
>> +					       f->fmt.pix.height);
> 
> No need to support a custom line stride ?
> 
>> +	f->fmt.pix.field = V4L2_FIELD_NONE;
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_s_fmt(struct dw100_ctx *ctx, struct v4l2_format *f)
>> +{
>> +	struct dw100_q_data *q_data;
>> +	struct vb2_queue *vq;
>> +
>> +	vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type);
>> +	if (!vq)
>> +		return -EINVAL;
>> +
>> +	q_data = get_q_data(ctx, f->type);
>> +	if (!q_data)
>> +		return -EINVAL;
>> +
>> +	if (vb2_is_busy(vq)) {
>> +		v4l2_err(&ctx->dw_dev->v4l2_dev, "%s queue busy\n", __func__);
>> +		return -EBUSY;
>> +	}
>> +
>> +	q_data->fmt		= find_format(f);
>> +	q_data->width		= f->fmt.pix.width;
>> +	q_data->height		= f->fmt.pix.height;
>> +	q_data->bytesperline	= f->fmt.pix.bytesperline;
>> +	q_data->sizeimage	= f->fmt.pix.sizeimage;
> 
> It could be easier to store an instance of v4l2_pix_format in
> dw100_q_data.
> 
>> +
>> +	q_data->crop.top = 0;
>> +	q_data->crop.left = 0;
>> +	q_data->crop.width = f->fmt.pix.width;
>> +	q_data->crop.height = f->fmt.pix.height;
>> +
>> +	dprintk(1, ctx->dw_dev,
>> +		"Setting format for type %d, wxh: %dx%d, fmt: %d\n",
>> +		f->type, q_data->width, q_data->height, q_data->fmt->fourcc);
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_try_fmt_vid_cap(struct file *file, void *priv,
>> +				 struct v4l2_format *f)
>> +{
>> +	struct dw100_fmt *fmt;
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +
>> +	if (f->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
>> +		return -EINVAL;
>> +
>> +	fmt = find_format(f);
>> +	if (!fmt) {
>> +		f->fmt.pix.pixelformat = formats[0].fourcc;
>> +		fmt = find_format(f);
>> +	}
>> +
>> +	f->fmt.pix.colorspace = ctx->colorspace;
>> +	f->fmt.pix.xfer_func = ctx->xfer_func;
>> +	f->fmt.pix.ycbcr_enc = ctx->ycbcr_enc;
>> +	f->fmt.pix.quantization = ctx->quant;
>> +
>> +	return dw100_try_fmt(f, fmt);
>> +}
>> +
>> +static int dw100_s_fmt_vid_cap(struct file *file, void *priv,
>> +			       struct v4l2_format *f)
>> +{
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +	int ret;
>> +
>> +	ret = dw100_try_fmt_vid_cap(file, priv, f);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = dw100_s_fmt(ctx, f);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_try_fmt_vid_out(struct file *file, void *priv,
>> +				 struct v4l2_format *f)
>> +{
>> +	struct dw100_fmt *fmt;
>> +
>> +	if (f->type != V4L2_BUF_TYPE_VIDEO_OUTPUT)
>> +		return -EINVAL;
>> +
>> +	fmt = find_format(f);
>> +	if (!fmt) {
>> +		f->fmt.pix.pixelformat = formats[0].fourcc;
>> +		fmt = find_format(f);
>> +	}
>> +
>> +	if (!f->fmt.pix.colorspace)
>> +		f->fmt.pix.colorspace = V4L2_COLORSPACE_REC709;
>> +
>> +	return dw100_try_fmt(f, fmt);
>> +}
>> +
>> +static int dw100_s_fmt_vid_out(struct file *file, void *priv,
>> +			       struct v4l2_format *f)
>> +{
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +	int ret;
>> +
>> +	ret = dw100_try_fmt_vid_out(file, priv, f);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = dw100_s_fmt(ctx, f);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ctx->colorspace = f->fmt.pix.colorspace;
>> +	ctx->xfer_func = f->fmt.pix.xfer_func;
>> +	ctx->ycbcr_enc = f->fmt.pix.ycbcr_enc;
>> +	ctx->quant = f->fmt.pix.quantization;
>> +
>> +	return 0;
>> +}
>> +
>> +static int dw100_g_selection(struct file *file, void *fh,
>> +			     struct v4l2_selection *sel)
>> +{
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +	struct dw100_q_data *src_q_data, *dst_q_data;
>> +
>> +	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
>> +	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
>> +		return -EINVAL;
>> +
>> +	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
>> +	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
>> +
>> +	switch (sel->target) {
>> +	case V4L2_SEL_TGT_CROP_DEFAULT:
>> +	case V4L2_SEL_TGT_CROP_BOUNDS:
>> +		sel->r.top = 0;
>> +		sel->r.left = 0;
>> +		sel->r.width = src_q_data->width;
>> +		sel->r.height = src_q_data->height;
>> +		break;
>> +	case V4L2_SEL_TGT_CROP:
>> +		sel->r.top = src_q_data->crop.top;
>> +		sel->r.left = src_q_data->crop.left;
>> +		sel->r.width = src_q_data->crop.width;
>> +		sel->r.height = src_q_data->crop.height;
>> +		break;
>> +	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
>> +	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
>> +	case V4L2_SEL_TGT_COMPOSE:
>> +		sel->r.top = 0;
>> +		sel->r.left = 0;
>> +		sel->r.width = dst_q_data->width;
>> +		sel->r.height = dst_q_data->height;
>> +		break;
> 
> As far as I understand the driver doesn't support composition, maybe you
> could just return -EINVAL ?
> 
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	dprintk(1, ctx->dw_dev,
>> +		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
>> +		sel->type, sel->target,
>> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
>> +
>> +	return 0;
>> +}
>> +
>> +static	int dw100_s_selection(struct file *file, void *fh,
> 
> Replace tab with space.
> 
>> +			      struct v4l2_selection *sel)
>> +{
>> +	struct dw100_ctx *ctx = file2ctx(file);
>> +	struct dw100_q_data *src_q_data, *dst_q_data;
>> +	u32 qscalex, qscaley, qscale;
>> +	int x, y, w, h;
>> +
>> +	if (sel->type != V4L2_BUF_TYPE_VIDEO_OUTPUT &&
>> +	    sel->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
>> +		return -EINVAL;
>> +
>> +	src_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
>> +	dst_q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
>> +
>> +	dprintk(1, ctx->dw_dev,
>> +		">>> Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
>> +		sel->type, sel->target,
>> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
>> +
>> +	if (sel->r.top < 0 || sel->r.left < 0)
>> +		return -EINVAL;
> 
> Clamp the top and left values to the supported range, .s_selection()
> must adjust unsupported values, not return an error.
> 
>> +
>> +	switch (sel->target) {
>> +	case V4L2_SEL_TGT_CROP:
>> +		/* UQ16.16 for float operations */
>> +		if ((sel->r.left + sel->r.width > src_q_data->width) ||
>> +		    (sel->r.top + sel->r.height > src_q_data->height))
>> +			return -EINVAL;
> 
> Same.
> 
>> +		qscalex = (sel->r.width << 16) / src_q_data->width;
>> +		qscaley = (sel->r.height << 16) / src_q_data->height;
>> +		y = sel->r.top;
>> +		x = sel->r.left;
>> +		if (qscalex == qscaley) {
>> +			qscale = qscalex;
>> +		} else {
>> +			switch (sel->flags) {
>> +			case 0:
>> +				qscale = (qscalex + qscaley) / 2;
>> +				break;
>> +			case V4L2_SEL_FLAG_GE:
>> +				qscale = max(qscaley, qscalex);
>> +				break;
>> +			case V4L2_SEL_FLAG_LE:
>> +				qscale = min(qscaley, qscalex);
>> +				break;
>> +			case V4L2_SEL_FLAG_LE | V4L2_SEL_FLAG_GE:
>> +				return -ERANGE;
>> +			default:
>> +				return -EINVAL;
>> +			}
>> +		}
>> +
>> +		w = (u32)((((u64)src_q_data->width << 16) * qscale) >> 32);
>> +		h = (u32)((((u64)src_q_data->height << 16) * qscale) >> 32);
>> +		x = x + (sel->r.width  - w) / 2;
>> +		y = y + (sel->r.height  - h) / 2;
>> +		x = min(src_q_data->width - w, (unsigned int)max(0, x));
>> +		y = min(src_q_data->height - h, (unsigned int)max(0, y));
>> +
>> +		sel->r.top = y;
>> +		sel->r.left = x;
>> +		sel->r.width = w;
>> +		sel->r.height = h;
>> +
>> +		src_q_data->crop.top = sel->r.top;
>> +		src_q_data->crop.left = sel->r.left;
>> +		src_q_data->crop.width = sel->r.width;
>> +		src_q_data->crop.height = sel->r.height;
>> +		break;
>> +	case V4L2_SEL_TGT_COMPOSE:
>> +		if ((sel->r.left + sel->r.width > dst_q_data->width) ||
>> +		    (sel->r.top + sel->r.height > dst_q_data->height))
>> +			return -EINVAL;
>> +		sel->r.top = 0;
>> +		sel->r.left = 0;
>> +		sel->r.width = dst_q_data->width;
>> +		sel->r.height = dst_q_data->height;
>> +		break;
>> +	case V4L2_SEL_TGT_CROP_DEFAULT:
>> +	case V4L2_SEL_TGT_CROP_BOUNDS:
>> +	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
>> +	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	dprintk(1, ctx->dw_dev,
>> +		"<<< Buffer Type: %d Target: %d Rect: %dx%d@%d.%d\n",
>> +		sel->type, sel->target,
>> +		sel->r.width, sel->r.height, sel->r.left, sel->r.top);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct v4l2_ioctl_ops dw100_ioctl_ops = {
>> +	.vidioc_querycap	= dw100_querycap,
>> +
>> +	.vidioc_enum_fmt_vid_cap = dw100_enum_fmt_vid_cap,
>> +	.vidioc_enum_framesizes = dw100_enum_framesizes,
>> +	.vidioc_g_fmt_vid_cap	= dw100_g_fmt_vid_cap,
>> +	.vidioc_try_fmt_vid_cap	= dw100_try_fmt_vid_cap,
>> +	.vidioc_s_fmt_vid_cap	= dw100_s_fmt_vid_cap,
>> +
>> +	.vidioc_enum_fmt_vid_out = dw100_enum_fmt_vid_out,
>> +	.vidioc_g_fmt_vid_out	= dw100_g_fmt_vid_out,
>> +	.vidioc_try_fmt_vid_out	= dw100_try_fmt_vid_out,
>> +	.vidioc_s_fmt_vid_out	= dw100_s_fmt_vid_out,
>> +
>> +	.vidioc_g_selection	= dw100_g_selection,
>> +	.vidioc_s_selection	= dw100_s_selection,
>> +	.vidioc_reqbufs		= v4l2_m2m_ioctl_reqbufs,
>> +	.vidioc_querybuf	= v4l2_m2m_ioctl_querybuf,
>> +	.vidioc_qbuf		= v4l2_m2m_ioctl_qbuf,
>> +	.vidioc_dqbuf		= v4l2_m2m_ioctl_dqbuf,
>> +	.vidioc_prepare_buf	= v4l2_m2m_ioctl_prepare_buf,
>> +	.vidioc_create_bufs	= v4l2_m2m_ioctl_create_bufs,
>> +	.vidioc_expbuf		= v4l2_m2m_ioctl_expbuf,
>> +
>> +	.vidioc_streamon	= v4l2_m2m_ioctl_streamon,
>> +	.vidioc_streamoff	= v4l2_m2m_ioctl_streamoff,
>> +
>> +	.vidioc_subscribe_event = v4l2_ctrl_subscribe_event,
>> +	.vidioc_unsubscribe_event = v4l2_event_unsubscribe,
>> +};
>> +
>> +static void dw100_job_finish(struct dw100_device *dw_dev, bool with_error)
>> +{
>> +	struct dw100_ctx *curr_ctx;
>> +	struct vb2_v4l2_buffer *src_vb, *dst_vb;
>> +	unsigned long flags;
>> +	enum vb2_buffer_state buf_state;
>> +
>> +	curr_ctx = v4l2_m2m_get_curr_priv(dw_dev->m2m_dev);
>> +
>> +	if (!curr_ctx) {
>> +		v4l2_err(&dw_dev->v4l2_dev,
>> +			 "Instance released before the end of transaction\n");
>> +		return;
>> +	}
>> +
>> +	src_vb = v4l2_m2m_src_buf_remove(curr_ctx->fh.m2m_ctx);
>> +	dst_vb = v4l2_m2m_dst_buf_remove(curr_ctx->fh.m2m_ctx);
>> +
>> +	if (likely(!with_error))
>> +		buf_state = VB2_BUF_STATE_DONE;
>> +	else
>> +		buf_state = VB2_BUF_STATE_ERROR;
>> +
>> +	spin_lock_irqsave(&dw_dev->irqlock, flags);
>> +	v4l2_m2m_buf_done(src_vb, buf_state);
>> +	v4l2_m2m_buf_done(dst_vb, buf_state);
>> +	spin_unlock_irqrestore(&dw_dev->irqlock, flags);
> 
> And the lock could be droppped from here if it's dropped from
> dw100_stop_streaming().
> 
>> +
>> +	dprintk(2, dw_dev, "Finishing transaction with%s error(s)\n",
>> +		with_error ? "" : "out");
>> +
>> +	v4l2_m2m_job_finish(dw_dev->m2m_dev, curr_ctx->fh.m2m_ctx);
>> +}
>> +
>> +static void dw100_hw_reset(struct dw100_device *dw_dev)
>> +{
>> +	u32 val;
>> +
>> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
>> +	val |= DW100_DEWARP_CTRL_ENABLE;
>> +	val |= DW100_DEWARP_CTRL_SOFT_RESET;
>> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
>> +	val &= ~DW100_DEWARP_CTRL_SOFT_RESET;
>> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
>> +}
>> +
>> +static void _dw100_hw_set_master_bus_enable(struct dw100_device *dw_dev,
>> +					    unsigned int enable)
>> +{
>> +	u32 val;
>> +	void __iomem *addr = dw_dev->mmio + DW100_BUS_CTRL;
>> +
>> +	dprintk(3, dw_dev, "%sable master bus\n", enable ? "En" : "Dis");
>> +
>> +	val = readl(addr);
>> +
>> +	if (enable)
>> +		val |= DW100_BUS_CTRL_AXI_MASTER_ENABLE;
>> +	else
>> +		val &= ~DW100_BUS_CTRL_AXI_MASTER_ENABLE;
>> +
>> +	writel(val, addr);
>> +}
>> +
>> +static void dw100_hw_master_bus_enable(struct dw100_device *dw_dev)
>> +{
>> +	_dw100_hw_set_master_bus_enable(dw_dev, 1);
>> +}
>> +
>> +static void dw100_hw_master_bus_disable(struct dw100_device *dw_dev)
>> +{
>> +	_dw100_hw_set_master_bus_enable(dw_dev, 0);
>> +}
>> +
>> +static void dw100_hw_dewarp_start(struct dw100_device *dw_dev)
>> +{
>> +	u32 val;
>> +	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
>> +
>> +	val = readl(addr);
>> +
>> +	dprintk(3, dw_dev, "Starting Hardware CTRL:%x\n", val);
>> +	writel(val | DW100_DEWARP_CTRL_START, addr);
>> +	writel(val, addr);
>> +}
>> +
>> +static void dw100_hw_init_ctrl(struct dw100_device *dw_dev)
>> +{
>> +	u32 val;
>> +	void __iomem *addr = dw_dev->mmio + DW100_DEWARP_CTRL;
>> +	/*
>> +	 * Input format YUV422_SP
>> +	 * Output format YUV422_SP
>> +	 * No hardware handshake (SW)
>> +	 * No automatic double src buffering (Single)
>> +	 * No automatic double dst buffering (Single)
>> +	 * No Black Line
>> +	 * Prefetch image pixel traversal
>> +	 */
>> +
>> +	val = DW100_DEWARP_CTRL_ENABLE
>> +		/* Valid only for auto prefetch mode*/
>> +		| DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(32);
> 
> Align | below the =.
> 
>> +
>> +	/*
>> +	 * Calculation mode required to support any scaling factor,
>> +	 * but x4 slower than traversal mode.
>> +	 *
>> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL
>> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION
>> +	 * DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO
>> +	 *
>> +	 * TODO: Find heuristics requiring calculation mode
>> +	 *
> 
> Extra blank line.
> 
>> +	 */
>> +	val |= DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION;
>> +
>> +	writel(val, addr);
>> +}
>> +
>> +static void dw100_hw_set_pixel_boundary(struct dw100_device *dw_dev)
>> +{
>> +	u32 val;
>> +	void __iomem *addr = dw_dev->mmio + DW100_BOUNDARY_PIXEL;
>> +
>> +	val = DW100_BOUNDARY_PIXEL_V(128)
>> +		| DW100_BOUNDARY_PIXEL_U(128)
>> +		| DW100_BOUNDARY_PIXEL_Y(0);
>> +
>> +	writel(val, addr);
>> +}
>> +
>> +static void dw100_hw_set_scale(struct dw100_device *dw_dev, u8 scale)
>> +{
>> +	void __iomem *addr = dw_dev->mmio + DW100_SCALE_FACTOR;
>> +
>> +	dprintk(1, dw_dev, "Setting scale factor to %d\n", scale);
>> +
>> +	writel(scale, addr);
>> +}
>> +
>> +static void dw100_hw_set_roi(struct dw100_device *dw_dev, u32 x, u32 y)
>> +{
>> +	u32 val;
>> +	void __iomem *addr = dw_dev->mmio + DW100_ROI_START;
>> +
>> +	dprintk(1, dw_dev, "Setting ROI region to %d.%d\n", x, y);
>> +
>> +	val = DW100_ROI_START_X(x) | DW100_ROI_START_Y(y);
>> +
>> +	writel(val, addr);
>> +}
>> +
>> +static void dw100_hw_set_src_crop(struct dw100_device *dw_dev,
>> +				  struct dw100_q_data *src_q_data,
>> +				  struct dw100_q_data *dst_q_data)
> 
> const
> 
>> +{
>> +	struct v4l2_rect *rect = &src_q_data->crop;
>> +	u32 src_scale, qscale, left_scale, top_scale;
>> +
>> +	/* HW Scale is UQ1.7 encoded */
>> +	src_scale = (rect->width << 7) / src_q_data->width;
>> +	dw100_hw_set_scale(dw_dev, src_scale);
>> +
>> +	qscale = (dst_q_data->width << 7)  / src_q_data->width;
>> +
>> +	left_scale = (((rect->left << 7) * qscale) >> 14);
>> +	top_scale = (((rect->top << 7) * qscale) >> 14);
> 
> No need for the outer parentheses.
> 
>> +
>> +	dw100_hw_set_roi(dw_dev, left_scale, top_scale);
>> +}
>> +
>> +static void dw100_hw_set_source(struct dw100_device *dw_dev,
>> +				struct dw100_q_data *q_data,
>> +				dma_addr_t addr)
>> +{
>> +	u32 width, height, stride, fourcc, val;
>> +	struct dw100_fmt *fmt = q_data->fmt;
>> +
>> +	width =  q_data->width;
>> +	height = q_data->height;
>> +	stride = q_data->bytesperline;
>> +	fourcc = q_data->fmt->fourcc;
>> +
>> +	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
>> +		width, height, stride, fourcc, &addr);
>> +
>> +	/* Pixel Format */
>> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
>> +
>> +	val &= ~DW100_DEWARP_CTRL_INPUT_FORMAT_MASK;
>> +	val |= DW100_DEWARP_CTRL_INPUT_FORMAT(fmt->reg_format);
>> +
>> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
>> +
>> +	/* Swap */
>> +	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
>> +
>> +	val &= ~DW100_SWAP_CONTROL_SRC_MASK;
>> +	/*
>> +	 * Data swapping is performed only on Y plane for source image.
>> +	 */
>> +	if (fmt->reg_swap_uv &&
>> +	    fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
>> +		val |= DW100_SWAP_CONTROL_SRC(DW100_SWAP_CONTROL_Y
>> +					      (DW100_SWAP_CONTROL_BYTE));
>> +
>> +	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
>> +
>> +	/* Image resolution */
>> +	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
>> +	       dw_dev->mmio + DW100_SRC_IMG_SIZE);
>> +
>> +	writel(stride, dw_dev->mmio + DW100_SRC_IMG_STRIDE);
>> +
>> +	/* Buffers */
>> +	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_SRC_IMG_Y_BASE);
>> +	writel(DW100_IMG_UV_BASE((addr + (stride * height))),
>> +	       dw_dev->mmio + DW100_SRC_IMG_UV_BASE);
> 
> As the hardware supports non-contiguous planes, we should use the mplane
> API already, even if we support the single planar formats only for now.
> 
>> +}
>> +
>> +static void dw100_hw_set_destination(struct dw100_device *dw_dev,
>> +				     struct dw100_q_data *q_data,
>> +				     struct dw100_fmt *ifmt,
>> +				     dma_addr_t addr)
>> +{
>> +	u32 width, height, stride, fourcc, val;
>> +	struct dw100_fmt *fmt = q_data->fmt;
>> +
>> +	width =  q_data->width;
>> +	height = q_data->height;
>> +	stride = q_data->bytesperline;
>> +	fourcc = q_data->fmt->fourcc;
>> +
>> +	dprintk(3, dw_dev, "Set HW source registers for %dx%d - stride %d, pixfmt: %x, dma:%pa\n",
>> +		width, height, stride, fourcc, &addr);
>> +
>> +	/* Pixel Format */
>> +	val = readl(dw_dev->mmio + DW100_DEWARP_CTRL);
>> +
>> +	val &= ~DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK;
>> +	val |= DW100_DEWARP_CTRL_OUTPUT_FORMAT(fmt->reg_format);
>> +
>> +	writel(val, dw_dev->mmio + DW100_DEWARP_CTRL);
>> +
>> +	/* Swap */
>> +	val = readl(dw_dev->mmio + DW100_SWAP_CONTROL);
>> +
>> +	val &= ~DW100_SWAP_CONTROL_DST_MASK;
>> +
>> +	/*
>> +	 * Avoid to swap twice
>> +	 */
>> +	if (fmt->reg_swap_uv ^
>> +	    (ifmt->reg_swap_uv && ifmt->reg_format !=
>> +	     DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)) {
>> +		if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED)
>> +			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_Y
>> +						      (DW100_SWAP_CONTROL_BYTE));
>> +		else
>> +			val |= DW100_SWAP_CONTROL_DST(DW100_SWAP_CONTROL_UV
>> +						      (DW100_SWAP_CONTROL_BYTE));
>> +	}
>> +
>> +	writel(val, dw_dev->mmio + DW100_SWAP_CONTROL);
>> +
>> +	/* Image resolution */
>> +	writel(DW100_IMG_SIZE_WIDTH(width) | DW100_IMG_SIZE_HEIGHT(height),
>> +	       dw_dev->mmio + DW100_DST_IMG_SIZE);
>> +	writel(stride, dw_dev->mmio + DW100_DST_IMG_STRIDE);
>> +
>> +	val = ALIGN(stride * height, 16);
>> +	writel(DW100_IMG_Y_BASE(addr), dw_dev->mmio + DW100_DST_IMG_Y_BASE);
>> +	writel(DW100_IMG_UV_BASE(addr + val),
>> +	       dw_dev->mmio + DW100_DST_IMG_UV_BASE);
>> +	writel(DW100_DST_IMG_Y_SIZE(val), dw_dev->mmio + DW100_DST_IMG_Y_SIZE1);
>> +
>> +	if (fmt->reg_format == DW100_DEWARP_CTRL_FORMAT_YUV420_SP)
>> +		val /= 2;
>> +
>> +	writel(DW100_DST_IMG_UV_SIZE(val),
>> +	       dw_dev->mmio + DW100_DST_IMG_UV_SIZE1);
>> +}
>> +
>> +static void dw100_hw_set_mapping(struct dw100_device *dw_dev, dma_addr_t addr,
>> +				 u32 width, u32 height)
>> +{
>> +	dprintk(1, dw_dev, "Set HW mapping registers for %dx%d addr:%pa", width,
>> +		height, &addr);
>> +
>> +	writel(DW100_MAP_LUT_ADDR_ADDR(addr), dw_dev->mmio + DW100_MAP_LUT_ADDR);
>> +	writel(DW100_MAP_LUT_SIZE_WIDTH(width)
>> +		| DW100_MAP_LUT_SIZE_HEIGHT(height),
>> +			dw_dev->mmio + DW100_MAP_LUT_SIZE);
>> +}
>> +
>> +static void dw100_hw_clear_irq(struct dw100_device *dw_dev, unsigned int irq)
>> +{
>> +	writel(DW100_INTERRUPT_STATUS_INT_CLEAR(irq),
>> +	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
>> +}
>> +
>> +static void dw100_hw_enable_irq(struct dw100_device *dw_dev)
>> +{
>> +	writel(DW100_INTERRUPT_STATUS_INT_ENABLE_MASK,
>> +	       dw_dev->mmio + DW100_INTERRUPT_STATUS);
>> +}
>> +
>> +static void dw100_hw_disable_irq(struct dw100_device *dw_dev)
>> +{
>> +	writel(0, dw_dev->mmio + DW100_INTERRUPT_STATUS);
>> +}
>> +
>> +static bool dw100_hw_is_frame_running(struct dw100_device *dw_dev)
>> +{
>> +	bool is_running = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS)
>> +				& DW100_INTERRUPT_STATUS_FRAME_BUSY;
>> +
>> +	return is_running;
>> +}
>> +
>> +static u32 dw_hw_get_pending_irqs(struct dw100_device *dw_dev)
>> +{
>> +	u32 val;
>> +
>> +	val = readl(dw_dev->mmio + DW100_INTERRUPT_STATUS);
>> +
>> +	return DW100_INTERRUPT_STATUS_INT_STATUS(val);
>> +}
>> +
>> +static irqreturn_t dw100_irq_handler(int irq, void *dev_id)
>> +{
>> +	struct dw100_device *dw_dev = dev_id;
>> +	u32 pending_irqs, err_irqs, frame_done_irq;
>> +	bool with_error = true;
>> +
>> +	pending_irqs = dw_hw_get_pending_irqs(dw_dev);
>> +	frame_done_irq = pending_irqs & DW100_INTERRUPT_STATUS_INT_FRAME_DONE;
>> +	err_irqs = DW100_INTERRUPT_STATUS_INT_ERR_STATUS(pending_irqs);
>> +
>> +	if (frame_done_irq) {
>> +		dprintk(3, dw_dev, "Frame done interrupt\n");
>> +		with_error = false;
>> +		err_irqs &= ~DW100_INTERRUPT_STATUS_INT_ERR_STATUS
>> +			(DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE);
>> +	}
>> +
>> +	if (err_irqs)
>> +		v4l2_err(&dw_dev->v4l2_dev, "Interrupt error: %#x\n", err_irqs);
>> +
>> +	dw100_hw_disable_irq(dw_dev);
>> +	dw100_hw_master_bus_disable(dw_dev);
>> +	dw100_hw_clear_irq(dw_dev, pending_irqs |
>> +			   DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT);
>> +
>> +	dw100_job_finish(dw_dev, with_error);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static void dw100_start(struct dw100_ctx *ctx, struct vb2_v4l2_buffer *in_vb,
>> +			struct vb2_v4l2_buffer *out_vb)
>> +{
>> +	struct dw100_device *dw_dev = ctx->dw_dev;
>> +	dma_addr_t p_in, p_out;
>> +
>> +	p_in = vb2_dma_contig_plane_dma_addr(&in_vb->vb2_buf, 0);
>> +	p_out = vb2_dma_contig_plane_dma_addr(&out_vb->vb2_buf, 0);
>> +	if (!p_in || !p_out) {
>> +		v4l2_err(&dw_dev->v4l2_dev,
>> +			 "Acquiring DMA addresses of buffers failed\n");
>> +		return;
>> +	}
>> +
>> +	out_vb->sequence =
>> +		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE)->sequence++;
>> +	in_vb->sequence =
>> +		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT)->sequence++;
>> +
>> +	dprintk(1, ctx->dw_dev,
>> +		"Starting queues %p->%p buffers d:%pa->d:%pa, sequence %d->%d\n",
>> +		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT),
>> +		v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE),
>> +		&p_in, &p_out, in_vb->sequence, out_vb->sequence);
>> +
>> +	out_vb->vb2_buf.timestamp = in_vb->vb2_buf.timestamp;
>> +	out_vb->field = in_vb->field;
>> +	if (in_vb->flags & V4L2_BUF_FLAG_TIMECODE)
>> +		out_vb->timecode = in_vb->timecode;
>> +	out_vb->flags = in_vb->flags &
>> +		(V4L2_BUF_FLAG_TIMECODE |
>> +		 V4L2_BUF_FLAG_KEYFRAME |
>> +		 V4L2_BUF_FLAG_PFRAME |
>> +		 V4L2_BUF_FLAG_BFRAME |
>> +		 V4L2_BUF_FLAG_TIMESTAMP_MASK |
>> +		 V4L2_BUF_FLAG_TSTAMP_SRC_MASK);
> 
> v4l2_m2m_buf_copy_metadata() will handle this for you. I think you can
> call it here, or in dw100_job_finish().
> 
>> +
>> +	/* Now, let's deal with hardware ... */
>> +	dw100_hw_master_bus_disable(dw_dev);
>> +	if (!ctx->map)
>> +		dw100_create_mapping(ctx);
>> +
>> +	dw100_hw_init_ctrl(dw_dev);
>> +	dw100_hw_set_pixel_boundary(dw_dev);
>> +	dw100_hw_set_src_crop(dw_dev, &ctx->q_data[V4L2_M2M_SRC],
>> +			      &ctx->q_data[V4L2_M2M_DST]);
>> +	dw100_hw_set_source(dw_dev, &ctx->q_data[V4L2_M2M_SRC], p_in);
>> +	dw100_hw_set_destination(dw_dev, &ctx->q_data[V4L2_M2M_DST],
>> +				 ctx->q_data[V4L2_M2M_SRC].fmt, p_out);
>> +	dw100_hw_set_mapping(dw_dev, ctx->map_dma,
>> +			     ctx->map_width, ctx->map_height);
>> +	dw100_hw_enable_irq(dw_dev);
>> +	dw100_hw_dewarp_start(dw_dev);
>> +
>> +	/* Enable Bus */
>> +	dw100_hw_master_bus_enable(dw_dev);
>> +}
>> +
>> +static void dw100_device_run(void *priv)
>> +{
>> +	struct dw100_ctx *ctx = priv;
>> +	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +
>> +	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
>> +	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>> +
>> +	dw100_start(ctx, src_buf, dst_buf);
>> +}
>> +
>> +static int dw100_job_ready(void *priv)
>> +{
>> +	struct dw100_ctx *ctx = priv;
>> +
>> +	if (v4l2_m2m_num_src_bufs_ready(ctx->fh.m2m_ctx) < 1 ||
>> +	    v4l2_m2m_num_dst_bufs_ready(ctx->fh.m2m_ctx) < 1) {
>> +		dprintk(1, ctx->dw_dev, "Not enough buffers available\n");
>> +		return 0;
>> +	}
>> +
>> +	if (dw100_hw_is_frame_running(ctx->dw_dev)) {
>> +		dprintk(1, ctx->dw_dev, "HW still running a frame\n");
>> +		return 0;
>> +	}
>> +
>> +	return 1;
>> +}
> 
> As far as I understand you could drop this function, as the default
> implementation when not provided is to assume that one source buffer and
> one destination buffer are all that is required.
> 
>> +
>> +static const struct v4l2_m2m_ops dw100_m2m_ops = {
>> +	.device_run	= dw100_device_run,
>> +	.job_ready	= dw100_job_ready,
>> +};
>> +
>> +#ifdef CONFIG_MEDIA_CONTROLLER
>> +static const struct media_device_ops dw100_media_ops = {
>> +	.req_validate = vb2_request_validate,
>> +	.req_queue = v4l2_m2m_request_queue,
>> +};
> 
> Do we need the request API ? I think you can drop this.
> 
>> +#endif
>> +
>> +static struct video_device *dw100_init_video_device(struct dw100_device *dw_dev)
>> +{
>> +	struct video_device *vfd = &dw_dev->vfd;
>> +
>> +	vfd->vfl_dir = VFL_DIR_M2M;
>> +	vfd->fops = &dw100_fops;
>> +	vfd->device_caps = V4L2_CAP_VIDEO_M2M | V4L2_CAP_STREAMING;
>> +	vfd->ioctl_ops = &dw100_ioctl_ops;
>> +	vfd->minor = -1;
>> +	vfd->release = video_device_release_empty;
>> +	vfd->v4l2_dev = &dw_dev->v4l2_dev;
>> +	vfd->lock = &dw_dev->vfd_mutex;
>> +
>> +	strscpy(vfd->name, DRV_NAME, sizeof(vfd->name));
>> +	mutex_init(vfd->lock);
> 
> Missing corresponding mutex_destroy().
> 
>> +	video_set_drvdata(vfd, dw_dev);
>> +
>> +	return vfd;
>> +}
>> +
>> +static int dw100_dump_regs_show(struct seq_file *m, void *private)
>> +{
>> +	struct dw100_device *dw_dev = m->private;
>> +	int ret;
>> +
>> +	ret = pm_runtime_resume_and_get(&dw_dev->pdev->dev);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	ret = dw100_dump_regs(dw_dev);
> 
> I'd inline the function here as there's a single caller.
> 
>> +
>> +	pm_runtime_put_sync(&dw_dev->pdev->dev);
>> +
>> +	return ret;
>> +}
>> +DEFINE_SHOW_ATTRIBUTE(dw100_dump_regs);
>> +
>> +static void dw100_debugfs_init(struct dw100_device *dw_dev)
>> +{
>> +	dw_dev->debugfs_root =
>> +		debugfs_create_dir(dev_name(&dw_dev->pdev->dev), NULL);
>> +
>> +	debugfs_create_file("dump_regs", 0600, dw_dev->debugfs_root, dw_dev,
>> +			    &dw100_dump_regs_fops);
>> +}
>> +
>> +static void dw100_debugfs_exit(struct dw100_device *dw_dev)
>> +{
>> +	debugfs_remove_recursive(dw_dev->debugfs_root);
>> +}
>> +
>> +static int dw100_probe(struct platform_device *pdev)
>> +{
>> +	struct dw100_device *dw_dev;
>> +	struct video_device *vfd;
>> +	struct resource *res;
>> +	int ret = 0;
> 
> No need to initialize ret to 0.
> 
>> +	int irq;
>> +
>> +	dw_dev = devm_kzalloc(&pdev->dev, sizeof(*dw_dev), GFP_KERNEL);
>> +	if (!dw_dev)
>> +		return -ENOMEM;
>> +	dw_dev->pdev = pdev;
>> +
>> +	ret = devm_clk_bulk_get_all(&pdev->dev, &dw_dev->clks);
>> +	if (ret < 0) {
>> +		dev_err(&pdev->dev, "Unable to get clocks: %d\n", ret);
>> +		return ret;
>> +	}
>> +	dw_dev->num_clks = ret;
>> +
>> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +	dw_dev->mmio = devm_ioremap_resource(&pdev->dev, res);
>> +	if (IS_ERR(dw_dev->mmio))
>> +		return PTR_ERR(dw_dev->mmio);
>> +
>> +	irq = platform_get_irq(pdev, 0);
>> +	if (irq < 0)
>> +		return irq;
>> +
>> +	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL, dw100_irq_handler,
> 
> Why do we need a threaded IRQ handler ?
> 
>> +					IRQF_ONESHOT, dev_name(&pdev->dev),
>> +					dw_dev);
>> +	if (ret < 0) {
>> +		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
>> +		return ret;
>> +	}
> 
> I'm always a bit worried when requesting an IRQ before we ensure the
> device won't generate an interrupt.
> 
>> +
>> +	platform_set_drvdata(pdev, dw_dev);
>> +
>> +	pm_runtime_enable(&pdev->dev);
>> +	ret = pm_runtime_resume_and_get(&pdev->dev);
>> +	if (ret < 0) {
>> +		dev_err(&pdev->dev, "Unable to enable clks: %d\n", ret);
>> +		goto err_clk;
>> +	}
>> +
>> +	dw100_hw_reset(dw_dev);
>> +
>> +	pm_runtime_put_sync(&pdev->dev);
> 
> I wonder if the device should be reset every time runtime PM resumes.
> The dw100_hw_reset() call could move to a runtime PM resume handler, and
> this could then be dropped.
> 
>> +
>> +	spin_lock_init(&dw_dev->irqlock);
>> +
>> +	ret = v4l2_device_register(&pdev->dev, &dw_dev->v4l2_dev);
>> +	if (ret)
>> +		goto err_clk;
>> +
>> +	vfd = dw100_init_video_device(dw_dev);
>> +
>> +	dw_dev->m2m_dev = v4l2_m2m_init(&dw100_m2m_ops);
>> +	if (IS_ERR(dw_dev->m2m_dev)) {
>> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem device\n");
>> +		ret = PTR_ERR(dw_dev->m2m_dev);
>> +		goto err_v4l2;
>> +	}
>> +
>> +#ifdef CONFIG_MEDIA_CONTROLLER
> 
> Let's make this unconditional as the MC API is needed by libcamera to
> enumerate media devices.
> 
>> +	dw_dev->mdev.dev = &pdev->dev;
>> +	strscpy(dw_dev->mdev.model, "dw100", sizeof(dw_dev->mdev.model));
>> +	strscpy(dw_dev->mdev.bus_info, "platform:dw100",
>> +		sizeof(dw_dev->mdev.bus_info));
> 
> No needed anymore either, it's handled by the core.
> 
>> +	media_device_init(&dw_dev->mdev);
>> +	dw_dev->mdev.ops = &dw100_media_ops;
>> +	dw_dev->v4l2_dev.mdev = &dw_dev->mdev;
>> +#endif
>> +
>> +	ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
>> +	if (ret) {
>> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to register video device\n");
>> +		goto err_m2m;
>> +	}
>> +
>> +#ifdef CONFIG_MEDIA_CONTROLLER
>> +	ret = v4l2_m2m_register_media_controller(dw_dev->m2m_dev, vfd,
>> +						 MEDIA_ENT_F_PROC_VIDEO_SCALER);
>> +	if (ret) {
>> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to init mem2mem media controller\n");
> 
> I'd use dev_err() (and dev_info(), ...) through the driver.
> 
>> +		goto error_v4l2;
>> +	}
>> +
>> +	ret = media_device_register(&dw_dev->mdev);
>> +	if (ret) {
>> +		v4l2_err(&dw_dev->v4l2_dev, "Failed to register mem2mem media device\n");
>> +		goto error_m2m_mc;
>> +	}
>> +#endif
>> +
>> +	dw100_debugfs_init(dw_dev);
>> +
>> +	v4l2_info(&dw_dev->v4l2_dev,
>> +		  "dw100 v4l2 m2m registered as /dev/video%d\n", vfd->num);
>> +
>> +	return 0;
> 
> A blank line would be nice.
> 
>> +#ifdef CONFIG_MEDIA_CONTROLLER
>> +error_m2m_mc:
>> +	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
>> +error_v4l2:
>> +	video_unregister_device(vfd);
>> +#endif
>> +err_m2m:
>> +	v4l2_m2m_release(dw_dev->m2m_dev);
>> +err_v4l2:
>> +	v4l2_device_unregister(&dw_dev->v4l2_dev);
>> +err_clk:
>> +	pm_runtime_disable(&pdev->dev);
>> +
>> +	return ret;
>> +}
>> +
>> +static int dw100_remove(struct platform_device *pdev)
>> +{
>> +	struct dw100_device *dw_dev = platform_get_drvdata(pdev);
>> +
>> +	dw100_debugfs_exit(dw_dev);
>> +
>> +	pm_runtime_disable(&pdev->dev);
>> +
>> +#ifdef CONFIG_MEDIA_CONTROLLER
>> +	media_device_unregister(&dw_dev->mdev);
>> +	v4l2_m2m_unregister_media_controller(dw_dev->m2m_dev);
>> +	media_device_cleanup(&dw_dev->mdev);
>> +#endif
>> +
>> +	video_unregister_device(&dw_dev->vfd);
>> +	v4l2_m2m_release(dw_dev->m2m_dev);
>> +	v4l2_device_unregister(&dw_dev->v4l2_dev);
>> +
>> +	return 0;
>> +}
>> +
>> +static int __maybe_unused dw100_runtime_suspend(struct device *dev)
>> +{
>> +	struct dw100_device *dw_dev = dev_get_drvdata(dev);
>> +
>> +	clk_bulk_disable_unprepare(dw_dev->num_clks, dw_dev->clks);
>> +
>> +	return 0;
>> +}
>> +
>> +static int __maybe_unused dw100_runtime_resume(struct device *dev)
>> +{
>> +	struct dw100_device *dw_dev = dev_get_drvdata(dev);
>> +
>> +	return clk_bulk_prepare_enable(dw_dev->num_clks, dw_dev->clks);
>> +}
>> +
>> +static const struct dev_pm_ops dw100_pm = {
>> +	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>> +				pm_runtime_force_resume)
>> +	SET_RUNTIME_PM_OPS(dw100_runtime_suspend,
>> +			   dw100_runtime_resume, NULL)
>> +};
>> +
>> +static const struct of_device_id dw100_dt_ids[] = {
>> +	{ .compatible = "nxp,imx8mp-dw100", .data = NULL },
>> +	{ },
>> +};
>> +MODULE_DEVICE_TABLE(of, dw100_dt_ids);
>> +
>> +static struct platform_driver dw100_driver = {
>> +	.probe		= dw100_probe,
>> +	.remove		= dw100_remove,
>> +	.driver		= {
>> +		.name	= DRV_NAME,
>> +		.pm = &dw100_pm,
>> +		.of_match_table = dw100_dt_ids,
>> +	},
>> +};
>> +
>> +module_platform_driver(dw100_driver);
>> +
>> +MODULE_DESCRIPTION("DW100 Hardware dewarper");
>> +MODULE_AUTHOR("Xavier Roumegue <Xavier.Roumegue@oss.nxp.com>");
>> +MODULE_LICENSE("GPL");
>> diff --git a/drivers/media/platform/nxp/dw100/dw100_regs.h b/drivers/media/platform/nxp/dw100/dw100_regs.h
>> new file mode 100644
>> index 000000000000..c540f6c638db
>> --- /dev/null
>> +++ b/drivers/media/platform/nxp/dw100/dw100_regs.h
>> @@ -0,0 +1,118 @@
>> +/* SPDX-License-Identifier: GPL-2.0+ */
>> +/*
>> + * DW100 Hardware dewarper
>> + *
>> + * Copyright 2022 NXP
>> + * Author: Xavier Roumegue (xavier.roumegue@oss.nxp.com)
>> + *
> 
> Extra blank line.
> 
>> + */
>> +
>> +#ifndef _DW100_REGS_H_
>> +#define _DW100_REGS_H_
>> +
>> +/* AHB register offset */
>> +#define DW100_DEWARP_ID			0x00
>> +#define DW100_DEWARP_CTRL		0x04
>> +#define DW100_DEWARP_CTRL_ENABLE			BIT(0)
>> +#define DW100_DEWARP_CTRL_START				BIT(1)
>> +#define DW100_DEWARP_CTRL_SOFT_RESET			BIT(2)
>> +#define DW100_DEWARP_CTRL_FORMAT_YUV422_SP		(0UL)
> 
> No need for parentheses. Same below where applicable.
> 
>> +#define DW100_DEWARP_CTRL_FORMAT_YUV422_PACKED		(1UL)
>> +#define DW100_DEWARP_CTRL_FORMAT_YUV420_SP		(2UL)
>> +#define DW100_DEWARP_CTRL_INPUT_FORMAT_MASK		GENMASK(5, 4)
>> +#define DW100_DEWARP_CTRL_INPUT_FORMAT(x)		((x) << 4)
>> +#define DW100_DEWARP_CTRL_OUTPUT_FORMAT(x)		((x) << 6)
>> +#define DW100_DEWARP_CTRL_OUTPUT_FORMAT_MASK		GENMASK(7, 6)
>> +#define DW100_DEWARP_CTRL_SRC_AUTO_SHADOW		BIT(8)
>> +#define DW100_DEWARP_CTRL_HW_HANDSHAKE			BIT(9)
>> +#define DW100_DEWARP_CTRL_DST_AUTO_SHADOW		BIT(10)
>> +#define DW100_DEWARP_CTRL_SPLIT_LINE			BIT(11)
>> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_MASK		GENMASK(17, 16)
>> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_TRAVERSAL	(0UL << 16)
>> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_CALCULATION	(1UL << 16)
>> +#define DW100_DEWARP_CTRL_PREFETCH_MODE_AUTO		(2UL << 16)
>> +#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD_MASK	GENMASK(24, 18)
>> +#define DW100_DEWARP_CTRL_PREFETCH_THRESHOLD(x)		((x) << 18)
>> +
>> +#define DW100_MAP_LUT_ADDR		0x08
>> +#define DW100_MAP_LUT_ADDR_ADDR(addr)	(((addr) >> 4) & GENMASK(29, 0))
>> +#define DW100_MAP_LUT_SIZE		0x0C
> 
> I usually ask for lower-case hex constants as that's what is often done
> in V4L2, but maybe it's too much nit-picking :-)
> 
>> +#define DW100_MAP_LUT_SIZE_WIDTH(w)	(((w) & GENMASK(10, 0)) << 0)
>> +#define DW100_MAP_LUT_SIZE_HEIGHT(h)	(((h) & GENMASK(10, 0)) << 16)
>> +#define DW100_SRC_IMG_Y_BASE		0x10
>> +#define DW100_IMG_Y_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
>> +#define DW100_SRC_IMG_UV_BASE		0x14
>> +#define DW100_IMG_UV_BASE(base)		(((base) >> 4) & GENMASK(29, 0))
>> +#define DW100_SRC_IMG_SIZE		0x18
>> +#define DW100_IMG_SIZE_WIDTH(w)		(((w) & GENMASK(12, 0)) << 0)
>> +#define DW100_IMG_SIZE_HEIGHT(h)	(((h) & GENMASK(12, 0)) << 16)
>> +
>> +#define DW100_SRC_IMG_STRIDE		0x1C
>> +#define DW100_MAP_LUT_ADDR2		0x20
>> +#define DW100_MAP_LUT_SIZE2		0x24
>> +#define DW100_SRC_IMG_Y_BASE2		0x28
>> +#define DW100_SRC_IMG_UV_BASE2		0x2C
>> +#define DW100_SRC_IMG_SIZE2		0x30
>> +#define DW100_SRC_IMG_STRIDE2		0x34
>> +#define DW100_DST_IMG_Y_BASE		0x38
>> +#define DW100_DST_IMG_UV_BASE		0x3C
>> +#define DW100_DST_IMG_SIZE		0x40
>> +#define DW100_DST_IMG_STRIDE		0x44
>> +#define DW100_DST_IMG_Y_BASE2		0x48
>> +#define DW100_DST_IMG_UV_BASE2		0x4C
>> +#define DW100_DST_IMG_SIZE2		0x50
>> +#define DW100_DST_IMG_STRIDE2		0x54
>> +#define DW100_SWAP_CONTROL		0x58
>> +#define DW100_SWAP_CONTROL_BYTE		BIT(0)
>> +#define DW100_SWAP_CONTROL_SHORT	BIT(1)
>> +#define DW100_SWAP_CONTROL_WORD		BIT(2)
>> +#define DW100_SWAP_CONTROL_LONG		BIT(3)
>> +#define DW100_SWAP_CONTROL_Y(x)		(((x) & GENMASK(3, 0)) << 0)
>> +#define DW100_SWAP_CONTROL_UV(x)	(((x) & GENMASK(3, 0)) << 4)
>> +#define DW100_SWAP_CONTROL_SRC(x)	(((x) & GENMASK(7, 0)) << 0)
>> +#define DW100_SWAP_CONTROL_DST(x)	(((x) & GENMASK(7, 0)) << 8)
>> +#define DW100_SWAP_CONTROL_SRC2(x)	(((x) & GENMASK(7, 0)) << 16)
>> +#define DW100_SWAP_CONTROL_DST2(x)	(((x) & GENMASK(7, 0)) << 24)
>> +#define DW100_SWAP_CONTROL_SRC_MASK	GENMASK(7, 0)
>> +#define DW100_SWAP_CONTROL_DST_MASK	GENMASK(15, 8)
>> +#define DW100_SWAP_CONTROL_SRC2_MASK	GENMASK(23, 16)
>> +#define DW100_SWAP_CONTROL_DST2_MASK	GENMASK(31, 24)
>> +#define DW100_VERTICAL_SPLIT_LINE	0x5C
>> +#define DW100_HORIZON_SPLIT_LINE	0x60
>> +#define DW100_SCALE_FACTOR		0x64
>> +#define DW100_ROI_START			0x68
>> +#define DW100_ROI_START_X(x)		(((x) & GENMASK(12, 0)) << 0)
>> +#define DW100_ROI_START_Y(y)		(((y) & GENMASK(12, 0)) << 16)
>> +#define DW100_BOUNDARY_PIXEL		0x6C
>> +#define DW100_BOUNDARY_PIXEL_V(v)	(((v) & GENMASK(7, 0)) << 0)
>> +#define DW100_BOUNDARY_PIXEL_U(u)	(((u) & GENMASK(7, 0)) << 8)
>> +#define DW100_BOUNDARY_PIXEL_Y(y)	(((y) & GENMASK(7, 0)) << 16)
>> +
>> +#define DW100_INTERRUPT_STATUS		0x70
>> +#define DW100_INTERRUPT_STATUS_INT_FRAME_DONE		BIT(0)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_TIME_OUT		BIT(1)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_AXI_RESP		BIT(2)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_X		BIT(3)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_MB_FETCH		BIT(4)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME2		BIT(5)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME3		BIT(6)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_FRAME_DONE	BIT(7)
>> +#define DW100_INTERRUPT_STATUS_INT_ERR_STATUS(x)	(((x) >> 1) & 0x7f)
>> +#define DW100_INTERRUPT_STATUS_INT_STATUS(x)		((x) & 0xff)
>> +
>> +#define DW100_INTERRUPT_STATUS_INT_ENABLE_MASK		GENMASK(15, 8)
>> +#define DW100_INTERRUPT_STATUS_INT_ENABLE(x)		(((x) & GENMASK(7, 0)) << 8)
>> +#define DW100_INTERRUPT_STATUS_FRAME_BUSY		BIT(16)
>> +#define DW100_INTERRUPT_STATUS_INT_CLEAR(x)		(((x) & GENMASK(7, 0)) << 24)
>> +#define DW100_BUS_CTRL			0x74
>> +#define DW100_BUS_CTRL_AXI_MASTER_ENABLE	BIT(31)
>> +#define DW100_BUS_CTRL1			0x78
>> +#define DW100_BUS_TIME_OUT_CYCLE	0x7C
>> +#define DW100_DST_IMG_Y_SIZE1		0x80
>> +#define DW100_DST_IMG_Y_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
>> +#define DW100_DST_IMG_UV_SIZE(sz)	(((sz) >> 4) & GENMASK(29, 0))
>> +#define DW100_DST_IMG_UV_SIZE1		0x84
>> +#define DW100_DST_IMG_Y_SIZE2		0x88
>> +#define DW100_DST_IMG_UV_SIZE2		0x8C
>> +
>> +#endif /* _DW100_REGS_H_ */
> 

Regards,
  Xavier

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-06-23  7:07 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-03  9:39 [PATCH v5 0/9] i.MX8MP DW100 dewarper driver Xavier Roumegue
2022-05-03  9:39 ` [PATCH v5 1/9] videodev2.h: add V4L2_CTRL_FLAG_DYNAMIC_ARRAY Xavier Roumegue
2022-06-14 20:46   ` Laurent Pinchart
2022-06-15  7:37     ` Hans Verkuil
2022-06-15  8:39       ` Laurent Pinchart
2022-06-15  8:45         ` Hans Verkuil
2022-05-03  9:39 ` [PATCH v5 2/9] v4l2-ctrls: add support for dynamically allocated arrays Xavier Roumegue
2022-05-03  9:39 ` [PATCH v5 3/9] vivid: add dynamic array test control Xavier Roumegue
2022-06-14 21:00   ` Laurent Pinchart
2022-06-15  7:39     ` Hans Verkuil
2022-06-15  7:46       ` Hans Verkuil
2022-06-15  9:14     ` Hans Verkuil
2022-06-15 11:53       ` Laurent Pinchart
2022-06-15 12:27         ` Hans Verkuil
2022-06-15 12:39           ` Laurent Pinchart
2022-06-16  6:08       ` Xavier Roumegue (OSS)
2022-05-03  9:39 ` [PATCH v5 4/9] media: Documentation: dw100: Add user documentation for the DW100 driver Xavier Roumegue
2022-06-14 21:13   ` Laurent Pinchart
2022-05-03  9:39 ` [PATCH v5 5/9] media: v4l: uapi: Add user control base for DW100 controls Xavier Roumegue
2022-05-03  9:39 ` [PATCH v5 6/9] media: uapi: Add a control for DW100 driver Xavier Roumegue
2022-06-14 21:07   ` Laurent Pinchart
2022-05-03  9:39 ` [PATCH v5 7/9] media: dt-bindings: media: Add i.MX8MP DW100 binding Xavier Roumegue
2022-05-03 11:59   ` Rob Herring
2022-05-03  9:39 ` [PATCH v5 8/9] media: dw100: Add i.MX8MP dw100 dewarper driver Xavier Roumegue
2022-06-14 22:54   ` Laurent Pinchart
2022-06-23  7:06     ` Xavier Roumegue (OSS)
2022-05-03  9:39 ` [PATCH v5 9/9] media: MAINTAINERS: add entry for i.MX8MP DW100 v4l2 mem2mem driver Xavier Roumegue

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.