All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver
@ 2015-08-21  0:51 Bryan Wu
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

NVIDIA Tegra SoC includes a Video Input controller, which can talk
with external camera sensors.

This patch set is still under development, since it's based on some
out of tree Tegra patches. And media controller part still needs some
rework after upstream finalize the MC redesign work.

Currently it's tested with Tegra X1 built-in test pattern generator.

Bryan Wu (2):
  [media] v4l: tegra: Add NVIDIA Tegra VI driver
  ARM64: add tegra-vi support in T210 device-tree

 arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts |    8 +
 arch/arm64/boot/dts/nvidia/tegra210.dtsi          |   13 +
 drivers/media/platform/Kconfig                    |    1 +
 drivers/media/platform/Makefile                   |    2 +
 drivers/media/platform/tegra/Kconfig              |    9 +
 drivers/media/platform/tegra/Makefile             |    3 +
 drivers/media/platform/tegra/tegra-channel.c      | 1074 +++++++++++++++++++++
 drivers/media/platform/tegra/tegra-core.c         |  295 ++++++
 drivers/media/platform/tegra/tegra-core.h         |  134 +++
 drivers/media/platform/tegra/tegra-vi.c           |  585 +++++++++++
 drivers/media/platform/tegra/tegra-vi.h           |  224 +++++
 include/dt-bindings/media/tegra-vi.h              |   35 +
 12 files changed, 2383 insertions(+)
 create mode 100644 drivers/media/platform/tegra/Kconfig
 create mode 100644 drivers/media/platform/tegra/Makefile
 create mode 100644 drivers/media/platform/tegra/tegra-channel.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.h
 create mode 100644 drivers/media/platform/tegra/tegra-vi.c
 create mode 100644 drivers/media/platform/tegra/tegra-vi.h
 create mode 100644 include/dt-bindings/media/tegra-vi.h

-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
@ 2015-08-21  0:51 ` Bryan Wu
  2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
controller which can support up to 6 MIPI CSI camera sensors.

This patch adds a V4L2 media controller and capture driver to support
Tegra VI hardware. It's verified with Tegra built-in test pattern
generator.

Signed-off-by: Bryan Wu <pengw@nvidia.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/platform/Kconfig               |    1 +
 drivers/media/platform/Makefile              |    2 +
 drivers/media/platform/tegra/Kconfig         |    9 +
 drivers/media/platform/tegra/Makefile        |    3 +
 drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
 drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
 drivers/media/platform/tegra/tegra-core.h    |  134 ++++
 drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
 drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
 include/dt-bindings/media/tegra-vi.h         |   35 +
 10 files changed, 2362 insertions(+)
 create mode 100644 drivers/media/platform/tegra/Kconfig
 create mode 100644 drivers/media/platform/tegra/Makefile
 create mode 100644 drivers/media/platform/tegra/tegra-channel.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.h
 create mode 100644 drivers/media/platform/tegra/tegra-vi.c
 create mode 100644 drivers/media/platform/tegra/tegra-vi.h
 create mode 100644 include/dt-bindings/media/tegra-vi.h

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index f6bed19..553867f 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -119,6 +119,7 @@ source "drivers/media/platform/exynos4-is/Kconfig"
 source "drivers/media/platform/s5p-tv/Kconfig"
 source "drivers/media/platform/am437x/Kconfig"
 source "drivers/media/platform/xilinx/Kconfig"
+source "drivers/media/platform/tegra/Kconfig"
 
 endif # V4L_PLATFORM_DRIVERS
 
diff --git a/drivers/media/platform/Makefile b/drivers/media/platform/Makefile
index 114f9ab..426e0e4 100644
--- a/drivers/media/platform/Makefile
+++ b/drivers/media/platform/Makefile
@@ -52,4 +52,6 @@ obj-$(CONFIG_VIDEO_AM437X_VPFE)		+= am437x/
 
 obj-$(CONFIG_VIDEO_XILINX)		+= xilinx/
 
+obj-$(CONFIG_VIDEO_TEGRA)		+= tegra/
+
 ccflags-y += -I$(srctree)/drivers/media/i2c
diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
new file mode 100644
index 0000000..a69d1b2
--- /dev/null
+++ b/drivers/media/platform/tegra/Kconfig
@@ -0,0 +1,9 @@
+config VIDEO_TEGRA
+	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"
+	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF
+	select VIDEOBUF2_DMA_CONTIG
+	---help---
+	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.
+
+	  TO compile this driver as a module, choose M here: the module will be
+	  called tegra-video.
diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
new file mode 100644
index 0000000..c8eff0b
--- /dev/null
+++ b/drivers/media/platform/tegra/Makefile
@@ -0,0 +1,3 @@
+tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o
+
+obj-$(CONFIG_VIDEO_TEGRA) += tegra-video.o
diff --git a/drivers/media/platform/tegra/tegra-channel.c b/drivers/media/platform/tegra/tegra-channel.c
new file mode 100644
index 0000000..b0063d2
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-channel.c
@@ -0,0 +1,1074 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitmap.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/host1x.h>
+#include <linux/lcm.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-dev.h>
+#include <media/v4l2-fh.h>
+#include <media/v4l2-ioctl.h>
+#include <media/videobuf2-core.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include <soc/tegra/pmc.h>
+
+#include "tegra-vi.h"
+
+#define TEGRA_VI_SYNCPT_WAIT_TIMEOUT			200
+
+/* VI registers */
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
+#define		SP_PP_LINE_START			4
+#define		SP_PP_FRAME_START			5
+#define		SP_MW_REQ_DONE				6
+#define		SP_MW_ACK_DONE				7
+
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT_CNTRL               0x004
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT_ERROR               0x008
+#define TEGRA_VI_CFG_CTXSW                              0x020
+#define TEGRA_VI_CFG_INTSTATUS                          0x024
+#define TEGRA_VI_CFG_PWM_CONTROL                        0x038
+#define TEGRA_VI_CFG_PWM_HIGH_PULSE                     0x03c
+#define TEGRA_VI_CFG_PWM_LOW_PULSE                      0x040
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_A                 0x044
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_B                 0x048
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_C                 0x04c
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_D                 0x050
+#define TEGRA_VI_CFG_VGP1                               0x064
+#define TEGRA_VI_CFG_VGP2                               0x068
+#define TEGRA_VI_CFG_VGP3                               0x06c
+#define TEGRA_VI_CFG_VGP4                               0x070
+#define TEGRA_VI_CFG_VGP5                               0x074
+#define TEGRA_VI_CFG_VGP6                               0x078
+#define TEGRA_VI_CFG_INTERRUPT_MASK                     0x08c
+#define TEGRA_VI_CFG_INTERRUPT_TYPE_SELECT              0x090
+#define TEGRA_VI_CFG_INTERRUPT_POLARITY_SELECT          0x094
+#define TEGRA_VI_CFG_INTERRUPT_STATUS                   0x098
+#define TEGRA_VI_CFG_VGP_SYNCPT_CONFIG                  0x0ac
+#define TEGRA_VI_CFG_VI_SW_RESET                        0x0b4
+#define TEGRA_VI_CFG_CG_CTRL                            0x0b8
+#define TEGRA_VI_CFG_VI_MCCIF_FIFOCTRL                  0x0e4
+#define TEGRA_VI_CFG_TIMEOUT_WCOAL_VI                   0x0e8
+#define TEGRA_VI_CFG_DVFS                               0x0f0
+#define TEGRA_VI_CFG_RESERVE                            0x0f4
+#define TEGRA_VI_CFG_RESERVE_1                          0x0f8
+
+/* CSI registers */
+#define TEGRA_VI_CSI_0_BASE                             0x100
+#define TEGRA_VI_CSI_1_BASE                             0x200
+#define TEGRA_VI_CSI_2_BASE                             0x300
+#define TEGRA_VI_CSI_3_BASE                             0x400
+#define TEGRA_VI_CSI_4_BASE                             0x500
+#define TEGRA_VI_CSI_5_BASE                             0x600
+
+#define TEGRA_VI_CSI_SW_RESET                           0x000
+#define TEGRA_VI_CSI_SINGLE_SHOT                        0x004
+#define TEGRA_VI_CSI_SINGLE_SHOT_STATE_UPDATE           0x008
+#define TEGRA_VI_CSI_IMAGE_DEF                          0x00c
+#define TEGRA_VI_CSI_RGB2Y_CTRL                         0x010
+#define TEGRA_VI_CSI_MEM_TILING                         0x014
+#define TEGRA_VI_CSI_IMAGE_SIZE                         0x018
+#define TEGRA_VI_CSI_IMAGE_SIZE_WC                      0x01c
+#define TEGRA_VI_CSI_IMAGE_DT                           0x020
+#define TEGRA_VI_CSI_SURFACE0_OFFSET_MSB                0x024
+#define TEGRA_VI_CSI_SURFACE0_OFFSET_LSB                0x028
+#define TEGRA_VI_CSI_SURFACE1_OFFSET_MSB                0x02c
+#define TEGRA_VI_CSI_SURFACE1_OFFSET_LSB                0x030
+#define TEGRA_VI_CSI_SURFACE2_OFFSET_MSB                0x034
+#define TEGRA_VI_CSI_SURFACE2_OFFSET_LSB                0x038
+#define TEGRA_VI_CSI_SURFACE0_BF_OFFSET_MSB             0x03c
+#define TEGRA_VI_CSI_SURFACE0_BF_OFFSET_LSB             0x040
+#define TEGRA_VI_CSI_SURFACE1_BF_OFFSET_MSB             0x044
+#define TEGRA_VI_CSI_SURFACE1_BF_OFFSET_LSB             0x048
+#define TEGRA_VI_CSI_SURFACE2_BF_OFFSET_MSB             0x04c
+#define TEGRA_VI_CSI_SURFACE2_BF_OFFSET_LSB             0x050
+#define TEGRA_VI_CSI_SURFACE0_STRIDE                    0x054
+#define TEGRA_VI_CSI_SURFACE1_STRIDE                    0x058
+#define TEGRA_VI_CSI_SURFACE2_STRIDE                    0x05c
+#define TEGRA_VI_CSI_SURFACE_HEIGHT0                    0x060
+#define TEGRA_VI_CSI_ISPINTF_CONFIG                     0x064
+#define TEGRA_VI_CSI_ERROR_STATUS                       0x084
+#define TEGRA_VI_CSI_ERROR_INT_MASK                     0x088
+#define TEGRA_VI_CSI_WD_CTRL                            0x08c
+#define TEGRA_VI_CSI_WD_PERIOD                          0x090
+
+#define TEGRA_CSI_CSI_CAP_CIL                           0x808
+#define TEGRA_CSI_CSI_CAP_CSI                           0x818
+#define TEGRA_CSI_CSI_CAP_PP                            0x828
+
+/* CSI Pixel Parser registers */
+#define TEGRA_CSI_PIXEL_PARSER_0_BASE			0x0838
+#define TEGRA_CSI_PIXEL_PARSER_1_BASE			0x086c
+#define TEGRA_CSI_PIXEL_PARSER_2_BASE			0x1038
+#define TEGRA_CSI_PIXEL_PARSER_3_BASE			0x106c
+#define TEGRA_CSI_PIXEL_PARSER_4_BASE			0x1838
+#define TEGRA_CSI_PIXEL_PARSER_5_BASE			0x186c
+
+
+/* CSI Pixel Parser registers */
+#define TEGRA_CSI_INPUT_STREAM_CONTROL                  0x000
+#define TEGRA_CSI_PIXEL_STREAM_CONTROL0                 0x004
+#define TEGRA_CSI_PIXEL_STREAM_CONTROL1                 0x008
+#define TEGRA_CSI_PIXEL_STREAM_GAP                      0x00c
+#define TEGRA_CSI_PIXEL_STREAM_PP_COMMAND               0x010
+#define TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME           0x014
+#define TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK           0x018
+#define TEGRA_CSI_PIXEL_PARSER_STATUS                   0x01c
+#define TEGRA_CSI_CSI_SW_SENSOR_RESET                   0x020
+
+/* CSI PHY registers */
+#define TEGRA_CSI_CIL_PHY_0_BASE			0x0908
+#define TEGRA_CSI_CIL_PHY_1_BASE			0x1108
+#define TEGRA_CSI_CIL_PHY_2_BASE			0x1908
+#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
+
+/* CSI CIL registers */
+#define TEGRA_CSI_CIL_0_BASE				0x092c
+#define TEGRA_CSI_CIL_1_BASE				0x0960
+#define TEGRA_CSI_CIL_2_BASE				0x112c
+#define TEGRA_CSI_CIL_3_BASE				0x1160
+#define TEGRA_CSI_CIL_4_BASE				0x192c
+#define TEGRA_CSI_CIL_5_BASE				0x1960
+
+#define TEGRA_CSI_CIL_PAD_CONFIG0                       0x000
+#define TEGRA_CSI_CIL_PAD_CONFIG1                       0x004
+#define TEGRA_CSI_CIL_PHY_CONTROL                       0x008
+#define TEGRA_CSI_CIL_INTERRUPT_MASK                    0x00c
+#define TEGRA_CSI_CIL_STATUS                            0x010
+#define TEGRA_CSI_CILX_STATUS                           0x014
+#define TEGRA_CSI_CIL_ESCAPE_MODE_COMMAND               0x018
+#define TEGRA_CSI_CIL_ESCAPE_MODE_DATA                  0x01c
+#define TEGRA_CSI_CIL_SW_SENSOR_RESET                   0x020
+
+/* CSI Pattern Generator registers */
+#define TEGRA_CSI_PATTERN_GENERATOR_0_BASE		0x09c4
+#define TEGRA_CSI_PATTERN_GENERATOR_1_BASE		0x09f8
+#define TEGRA_CSI_PATTERN_GENERATOR_2_BASE		0x11c4
+#define TEGRA_CSI_PATTERN_GENERATOR_3_BASE		0x11f8
+#define TEGRA_CSI_PATTERN_GENERATOR_4_BASE		0x19c4
+#define TEGRA_CSI_PATTERN_GENERATOR_5_BASE		0x19f8
+
+#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
+#define TEGRA_CSI_PG_BLANK				0x004
+#define TEGRA_CSI_PG_PHASE				0x008
+#define TEGRA_CSI_PG_RED_FREQ				0x00c
+#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
+#define TEGRA_CSI_PG_GREEN_FREQ				0x014
+#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
+#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
+#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
+#define TEGRA_CSI_PG_AOHDR				0x024
+
+#define TEGRA_CSI_DPCM_CTRL_A				0xad0
+#define TEGRA_CSI_DPCM_CTRL_B				0xad4
+#define TEGRA_CSI_STALL_COUNTER				0xae8
+#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
+#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
+#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
+#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
+#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
+#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
+#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
+
+/* Channel registers */
+static void tegra_channel_write(struct tegra_channel *chan, u32 addr, u32 val)
+{
+	if (chan->bypass)
+		return;
+
+	writel(val, chan->iomem + addr);
+}
+
+static u32 tegra_channel_read(struct tegra_channel *chan, u32 addr)
+{
+	return readl(chan->iomem + addr);
+}
+
+/* CSI registers */
+static void csi_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.csi + addr, val);
+}
+
+static u32 csi_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.csi + addr);
+}
+
+/* CSI pixel parser registers */
+static void pp_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.pp + addr, val);
+}
+
+static u32 pp_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.pp + addr);
+}
+
+/* CIL registers */
+static void cil_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.cil + addr, val);
+}
+
+static u32 cil_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.cil + addr);
+}
+
+/* CIL PHY registers */
+static void phy_write(struct tegra_channel *chan, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.phy, val);
+}
+
+static u32 phy_read(struct tegra_channel *chan)
+{
+	return tegra_channel_read(chan, chan->regs.phy);
+}
+
+/* Test pattern generator registers */
+static void tpg_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.tpg + addr, val);
+}
+
+/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
+static u32 sp_bit(struct tegra_channel *chan, u32 sp)
+{
+	return (sp + chan->port * 4) << 8;
+}
+
+/* Calculate register base */
+static u32 regs_base(u32 regs_base, int port)
+{
+	return regs_base + (port / 2 * 0x800) + (port & 1) * 0x34;
+}
+
+/* CSI channel IO Rail IDs */
+int tegra_io_rail_csi_ids[] = {
+	TEGRA_IO_RAIL_CSIA,
+	TEGRA_IO_RAIL_CSIB,
+	TEGRA_IO_RAIL_CSIC,
+	TEGRA_IO_RAIL_CSID,
+	TEGRA_IO_RAIL_CSIE,
+	TEGRA_IO_RAIL_CSIF,
+};
+
+void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan)
+{
+	int ret, index;
+	struct v4l2_subdev *subdev = chan->remote_entity->subdev;
+	struct v4l2_subdev_mbus_code_enum code = {
+		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
+	};
+
+
+	bitmap_zero(chan->fmts_bitmap, MAX_FORMAT_NUM);
+
+	while (1) {
+		ret = v4l2_subdev_call(subdev, pad, enum_mbus_code,
+				       NULL, &code);
+		if (ret < 0)
+			/* no more formats */
+			return;
+
+		index = tegra_core_get_idx_by_code(code.code);
+		if (index >= 0)
+			bitmap_set(chan->fmts_bitmap, index, 1);
+
+		code.index++;
+	}
+
+	return;
+}
+
+/* -----------------------------------------------------------------------------
+ * Tegra channel frame setup and capture operations
+ */
+
+static int tegra_channel_capture_setup(struct tegra_channel *chan)
+{
+	int lanes = 2;
+	int port = chan->port;
+	u32 height = chan->format.height;
+	u32 width = chan->format.width;
+	u32 format = chan->fmtinfo->img_fmt;
+	u32 data_type = chan->fmtinfo->img_dt;
+	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
+	struct chan_regs_config *regs = &chan->regs;
+
+	/* CIL PHY register setup */
+	if (port & 0x1) {
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
+	} else {
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
+	}
+
+	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
+	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
+	if (lanes == 4) {
+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
+		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
+		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
+	}
+
+	/* CSI pixel parser registers setup */
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
+		 0x280301f0 | (port & 0x1));
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
+	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
+		 0x3f0000 | (lanes - 1));
+
+	/* CIL PHY register setup */
+	if (lanes == 4)
+		phy_write(chan, 0x0101);
+	else {
+		u32 val = phy_read(chan);
+		if (port & 0x1)
+			val = (val & ~0x100) | 0x100;
+		else
+			val = (val & ~0x1) | 0x1;
+		phy_write(chan, val);
+	}
+
+	/* Test Pattern Generator setup */
+	if (chan->vi->pg_mode) {
+		tpg_write(chan, TEGRA_CSI_PATTERN_GENERATOR_CTRL,
+				((chan->vi->pg_mode - 1) << 2) | 0x1);
+		tpg_write(chan, TEGRA_CSI_PG_PHASE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_RED_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_RED_FREQ_RATE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_GREEN_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_GREEN_FREQ_RATE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_BLUE_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_BLUE_FREQ_RATE, 0x0);
+		phy_write(chan, 0x0202);
+	}
+
+	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_DEF,
+		  ((chan->vi->pg_mode ? 1 : 0) << 24) | (format << 16) | 0x1);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_DT, data_type);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_SIZE_WC, word_count);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_SIZE,
+		  (height << 16) | width);
+
+	/* Start pixel parser in single shot mode at beginning */
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf005);
+
+	return 0;
+}
+
+static void tegra_channel_capture_error(struct tegra_channel *chan, int err)
+{
+	u32 val;
+
+#ifdef DEBUG
+	val = tegra_channel_read(chan, TEGRA_CSI_DEBUG_COUNTER_0);
+	dev_err(&chan->video.dev, "TEGRA_CSI_DEBUG_COUNTER_0 0x%08x\n", val);
+#endif
+	val = cil_read(chan, TEGRA_CSI_CIL_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CIL_STATUS 0x%08x\n", val);
+	val = cil_read(chan, TEGRA_CSI_CILX_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CILX_STATUS 0x%08x\n", val);
+	val = pp_read(chan, TEGRA_CSI_PIXEL_PARSER_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_PIXEL_PARSER_STATUS 0x%08x\n",
+		val);
+	val = csi_read(chan, TEGRA_VI_CSI_ERROR_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_VI_CSI_ERROR_STATUS 0x%08x\n", val);
+}
+
+static int tegra_channel_capture_frame(struct tegra_channel *chan)
+{
+	struct tegra_channel_buffer *buf = chan->active;
+	struct vb2_buffer *vb = &buf->buf;
+	int err = 0;
+	u32 thresh, value, frame_start;
+	int bytes_per_line = chan->format.bytesperline;
+
+	if (!vb2_start_streaming_called(&chan->queue) || !buf)
+		return -EINVAL;
+
+	if (chan->bypass)
+		goto bypass_done;
+
+	/* Program buffer address */
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
+		  0x0);
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
+		  buf->addr);
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
+		  bytes_per_line);
+
+	/* Program syncpoint */
+	frame_start = sp_bit(chan, SP_PP_FRAME_START);
+	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
+			    frame_start | host1x_syncpt_id(chan->sp));
+
+	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
+
+	/* Use syncpoint to wake up */
+	thresh = host1x_syncpt_incr_max(chan->sp, 1);
+
+	mutex_unlock(&chan->lock);
+	err = host1x_syncpt_wait(chan->sp, thresh,
+			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
+	mutex_lock(&chan->lock);
+
+	if (err) {
+		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
+		tegra_channel_capture_error(chan, err);
+	}
+
+bypass_done:
+	/* Captured one frame */
+	spin_lock_irq(&chan->queued_lock);
+	vb->v4l2_buf.sequence = chan->sequence++;
+	vb->v4l2_buf.field = V4L2_FIELD_NONE;
+	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
+	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
+	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
+	spin_unlock_irq(&chan->queued_lock);
+
+	return err;
+}
+
+static void tegra_channel_work(struct work_struct *work)
+{
+	struct tegra_channel *chan =
+		container_of(work, struct tegra_channel, work);
+
+	while (1) {
+		spin_lock_irq(&chan->queued_lock);
+		if (list_empty(&chan->capture)) {
+			chan->active = NULL;
+			spin_unlock_irq(&chan->queued_lock);
+			return;
+		}
+		chan->active = list_entry(chan->capture.next,
+				struct tegra_channel_buffer, queue);
+		list_del_init(&chan->active->queue);
+		spin_unlock_irq(&chan->queued_lock);
+
+		mutex_lock(&chan->lock);
+		tegra_channel_capture_frame(chan);
+		mutex_unlock(&chan->lock);
+	}
+}
+
+/* -----------------------------------------------------------------------------
+ * videobuf2 queue operations
+ */
+
+static int
+tegra_channel_queue_setup(struct vb2_queue *vq, const struct v4l2_format *fmt,
+		     unsigned int *nbuffers, unsigned int *nplanes,
+		     unsigned int sizes[], void *alloc_ctxs[])
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+
+	/* Make sure the image size is large enough. */
+	if (fmt && fmt->fmt.pix.sizeimage < chan->format.sizeimage)
+		return -EINVAL;
+
+	*nplanes = 1;
+
+	sizes[0] = fmt ? fmt->fmt.pix.sizeimage : chan->format.sizeimage;
+	alloc_ctxs[0] = chan->alloc_ctx;
+
+	return 0;
+}
+
+static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
+
+	buf->chan = chan;
+	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
+
+	return 0;
+}
+
+static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
+
+	/* Put buffer into the  capture queue */
+	spin_lock_irq(&chan->queued_lock);
+	list_add_tail(&buf->queue, &chan->capture);
+	spin_unlock_irq(&chan->queued_lock);
+
+	/* Start work queue to capture data to buffer */
+	if (vb2_start_streaming_called(&chan->queue))
+		schedule_work(&chan->work);
+}
+
+static int tegra_channel_set_stream(struct tegra_channel *chan, bool on)
+{
+	struct media_entity *entity;
+	struct media_pad *pad;
+	struct v4l2_subdev *subdev;
+	int ret = 0;
+
+	entity = &chan->video.entity;
+
+	while (1) {
+		if (entity->num_pads > 1 && (chan->port & 0x1))
+			pad = &entity->pads[2];
+		else
+			pad = &entity->pads[0];
+
+		if (!(pad->flags & MEDIA_PAD_FL_SINK))
+			break;
+
+		pad = media_entity_remote_pad(pad);
+		if (pad == NULL ||
+		    media_entity_type(pad->entity) != MEDIA_ENT_T_V4L2_SUBDEV)
+			break;
+
+		entity = pad->entity;
+		subdev = media_entity_to_v4l2_subdev(entity);
+		ret = v4l2_subdev_call(subdev, video, s_stream, on);
+		if (on && ret < 0 && ret != -ENOIOCTLCMD)
+			return ret;
+	}
+	return ret;
+}
+
+static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+	struct media_pipeline *pipe = chan->video.entity.pipe;
+	struct tegra_channel_buffer *buf, *nbuf;
+	int ret = 0;
+
+	if (!chan->vi->pg_mode && !chan->remote_entity) {
+		dev_err(&chan->video.dev,
+			"is not in TPG mode and has not sensor connected!\n");
+		ret = -EINVAL;
+		goto vb2_queued;
+	}
+
+	mutex_lock(&chan->lock);
+
+	/* Start CIL clock */
+	clk_set_rate(chan->cil_clk, 102000000);
+	clk_prepare_enable(chan->cil_clk);
+
+	/* Disable DPD */
+	ret = tegra_io_rail_power_on(chan->io_id);
+	if (ret < 0) {
+		dev_err(&chan->video.dev,
+			"failed to power on CSI rail: %d\n", ret);
+		goto error_power_on;
+	}
+
+	/* Clean up status */
+	cil_write(chan, TEGRA_CSI_CIL_STATUS, 0xFFFFFFFF);
+	cil_write(chan, TEGRA_CSI_CILX_STATUS, 0xFFFFFFFF);
+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_STATUS, 0xFFFFFFFF);
+	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
+
+	ret = media_entity_pipeline_start(&chan->video.entity, pipe);
+	if (ret < 0)
+		goto error_pipeline_start;
+
+	/* Start the pipeline. */
+	ret = tegra_channel_set_stream(chan, true);
+	if (ret < 0)
+		goto error_set_stream;
+
+	/* Note: Program VI registers after TPG, sensors and CSI streaming */
+	ret = tegra_channel_capture_setup(chan);
+	if (ret < 0)
+		goto error_capture_setup;
+
+	chan->sequence = 0;
+	mutex_unlock(&chan->lock);
+
+	/* Start work queue to capture data to buffer */
+	schedule_work(&chan->work);
+
+	return 0;
+
+error_capture_setup:
+	tegra_channel_set_stream(chan, false);
+error_set_stream:
+	media_entity_pipeline_stop(&chan->video.entity);
+error_pipeline_start:
+	tegra_io_rail_power_off(chan->io_id);
+error_power_on:
+	clk_disable_unprepare(chan->cil_clk);
+	mutex_unlock(&chan->lock);
+vb2_queued:
+	/* Return all queued buffers back to vb2 */
+	spin_lock_irq(&chan->queued_lock);
+	vq->start_streaming_called = 0;
+	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
+		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_QUEUED);
+		list_del(&buf->queue);
+	}
+	spin_unlock_irq(&chan->queued_lock);
+	return ret;
+}
+
+static void tegra_channel_stop_streaming(struct vb2_queue *vq)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+	struct tegra_channel_buffer *buf, *nbuf;
+	u32 thresh, value, mw_ack_done;
+	int err;
+
+	mutex_lock(&chan->lock);
+
+	if (!chan->bypass) {
+		/* Program syncpoint */
+		mw_ack_done = sp_bit(chan, SP_MW_ACK_DONE);
+		tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
+				mw_ack_done | host1x_syncpt_id(chan->sp));
+
+		/* Use syncpoint to wake up */
+		thresh = host1x_syncpt_incr_max(chan->sp, 1);
+		err = host1x_syncpt_wait(chan->sp, thresh,
+				TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
+		if (err)
+			dev_err(&chan->video.dev, "MW_ACK_DONE syncpoint time out!\n");
+	}
+
+	media_entity_pipeline_stop(&chan->video.entity);
+
+	tegra_channel_set_stream(chan, false);
+
+	tegra_io_rail_power_off(chan->io_id);
+	clk_disable_unprepare(chan->cil_clk);
+
+	mutex_unlock(&chan->lock);
+
+	/* Give back all queued buffers to videobuf2. */
+	spin_lock_irq(&chan->queued_lock);
+	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
+		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_ERROR);
+		list_del(&buf->queue);
+	}
+	spin_unlock_irq(&chan->queued_lock);
+	cancel_work_sync(&chan->work);
+}
+
+static struct vb2_ops tegra_channel_queue_qops = {
+	.queue_setup = tegra_channel_queue_setup,
+	.buf_prepare = tegra_channel_buffer_prepare,
+	.buf_queue = tegra_channel_buffer_queue,
+	.wait_prepare = vb2_ops_wait_prepare,
+	.wait_finish = vb2_ops_wait_finish,
+	.start_streaming = tegra_channel_start_streaming,
+	.stop_streaming = tegra_channel_stop_streaming,
+};
+
+/* -----------------------------------------------------------------------------
+ * V4L2 ioctls
+ */
+
+static int
+tegra_channel_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING;
+	cap->capabilities = cap->device_caps | V4L2_CAP_DEVICE_CAPS;
+
+	strlcpy(cap->driver, "tegra-vi", sizeof(cap->driver));
+	strlcpy(cap->card, chan->video.name, sizeof(cap->card));
+	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s:%u",
+		 chan->vi->dev->of_node->name, chan->port);
+
+	return 0;
+}
+
+static int
+tegra_channel_enum_format(struct file *file, void *fh, struct v4l2_fmtdesc *f)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+	int index, i;
+	unsigned long *fmts_bitmap = NULL;
+
+	if (chan->vi->pg_mode)
+		fmts_bitmap = chan->vi->tpg_fmts_bitmap;
+	else if (chan->remote_entity)
+		fmts_bitmap = chan->fmts_bitmap;
+
+	if (!fmts_bitmap ||
+	    f->index > bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM) - 1)
+		return -EINVAL;
+
+	index = -1;
+	for (i = 0; i < f->index + 1; i++)
+		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index + 1);
+
+	f->pixelformat = tegra_video_formats[index].fourcc;
+
+	return 0;
+}
+
+static int
+tegra_channel_get_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	format->fmt.pix = chan->format;
+
+	return 0;
+}
+
+static void
+__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
+		      const struct tegra_video_format **fmtinfo)
+{
+	const struct tegra_video_format *info;
+	unsigned int min_width;
+	unsigned int max_width;
+	unsigned int min_bpl;
+	unsigned int max_bpl;
+	unsigned int width;
+	unsigned int align;
+	unsigned int bpl;
+
+	/* Retrieve format information and select the default format if the
+	 * requested format isn't supported.
+	 */
+	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
+	if (!info)
+		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
+
+	pix->pixelformat = info->fourcc;
+	pix->field = V4L2_FIELD_NONE;
+
+	/* The transfer alignment requirements are expressed in bytes. Compute
+	 * the minimum and maximum values, clamp the requested width and convert
+	 * it back to pixels.
+	 */
+	align = lcm(chan->align, info->bpp);
+	min_width = roundup(TEGRA_MIN_WIDTH, align);
+	max_width = rounddown(TEGRA_MAX_WIDTH, align);
+	width = rounddown(pix->width * info->bpp, align);
+
+	pix->width = clamp(width, min_width, max_width) / info->bpp;
+	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
+			    TEGRA_MAX_HEIGHT);
+
+	/* Clamp the requested bytes per line value. If the maximum bytes per
+	 * line value is zero, the module doesn't support user configurable line
+	 * sizes. Override the requested value with the minimum in that case.
+	 */
+	min_bpl = pix->width * info->bpp;
+	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
+	bpl = rounddown(pix->bytesperline, chan->align);
+
+	pix->bytesperline = clamp(bpl, min_bpl, max_bpl);
+	pix->sizeimage = pix->bytesperline * pix->height;
+
+	if (fmtinfo)
+		*fmtinfo = info;
+}
+
+static int
+tegra_channel_try_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	__tegra_channel_try_format(chan, &format->fmt.pix, NULL);
+	return 0;
+}
+
+static int
+tegra_channel_set_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+	const struct tegra_video_format *info;
+
+	__tegra_channel_try_format(chan, &format->fmt.pix, &info);
+
+	if (vb2_is_busy(&chan->queue))
+		return -EBUSY;
+
+	chan->format = format->fmt.pix;
+	chan->fmtinfo = info;
+
+	return 0;
+}
+
+static const struct v4l2_ioctl_ops tegra_channel_ioctl_ops = {
+	.vidioc_querycap		= tegra_channel_querycap,
+	.vidioc_enum_fmt_vid_cap	= tegra_channel_enum_format,
+	.vidioc_g_fmt_vid_cap		= tegra_channel_get_format,
+	.vidioc_s_fmt_vid_cap		= tegra_channel_set_format,
+	.vidioc_try_fmt_vid_cap		= tegra_channel_try_format,
+	.vidioc_reqbufs			= vb2_ioctl_reqbufs,
+	.vidioc_querybuf		= vb2_ioctl_querybuf,
+	.vidioc_qbuf			= vb2_ioctl_qbuf,
+	.vidioc_dqbuf			= vb2_ioctl_dqbuf,
+	.vidioc_create_bufs		= vb2_ioctl_create_bufs,
+	.vidioc_expbuf			= vb2_ioctl_expbuf,
+	.vidioc_streamon		= vb2_ioctl_streamon,
+	.vidioc_streamoff		= vb2_ioctl_streamoff,
+};
+
+/* -----------------------------------------------------------------------------
+ * V4L2 file operations
+ */
+
+static int tegra_channel_v4l2_open(struct file *file)
+{
+	struct tegra_channel *chan = video_drvdata(file);
+	struct tegra_vi_device *vi = chan->vi;
+	int ret = 0;
+
+	mutex_lock(&vi->lock);
+	ret = v4l2_fh_open(file);
+	if (ret)
+		goto unlock;
+
+	/* The first open then turn on power*/
+	if (!vi->power_on_refcnt) {
+		tegra_vi_power_on(chan->vi);
+
+		usleep_range(5, 100);
+		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
+		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
+		usleep_range(10, 15);
+	}
+	vi->power_on_refcnt++;
+
+unlock:
+	mutex_unlock(&vi->lock);
+	return ret;
+}
+
+static int tegra_channel_v4l2_release(struct file *file)
+{
+	struct tegra_channel *chan = video_drvdata(file);
+	struct tegra_vi_device *vi = chan->vi;
+	int ret = 0;
+
+	mutex_lock(&vi->lock);
+	vi->power_on_refcnt--;
+	/* The last release then turn off power */
+	if (!vi->power_on_refcnt)
+		tegra_vi_power_off(chan->vi);
+	ret = _vb2_fop_release(file, NULL);
+	mutex_unlock(&vi->lock);
+
+	return ret;
+}
+
+static const struct v4l2_file_operations tegra_channel_fops = {
+	.owner		= THIS_MODULE,
+	.unlocked_ioctl	= video_ioctl2,
+	.open		= tegra_channel_v4l2_open,
+	.release	= tegra_channel_v4l2_release,
+	.read		= vb2_fop_read,
+	.poll		= vb2_fop_poll,
+	.mmap		= vb2_fop_mmap,
+};
+
+int tegra_channel_init(struct tegra_vi_device *vi,
+		       struct tegra_channel *chan,
+		       u32 port)
+{
+	int ret;
+
+	chan->vi = vi;
+	chan->port = port;
+	chan->iomem = vi->iomem;
+
+	/* Init channel register base */
+	chan->regs.csi = TEGRA_VI_CSI_0_BASE + port * 0x100;
+	chan->regs.pp = regs_base(TEGRA_CSI_PIXEL_PARSER_0_BASE, port);
+	chan->regs.cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
+	chan->regs.phy = TEGRA_CSI_CIL_PHY_0_BASE + port / 2 * 0x800;
+	chan->regs.tpg = regs_base(TEGRA_CSI_PATTERN_GENERATOR_0_BASE, port);
+
+	/* Init CIL clock */
+	switch (chan->port) {
+	case 0:
+	case 1:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilab");
+		break;
+	case 2:
+	case 3:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilcd");
+		break;
+	case 4:
+	case 5:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cile");
+		break;
+	default:
+		dev_err(chan->vi->dev, "wrong port nubmer %d\n", port);
+	}
+	if (IS_ERR(chan->cil_clk)) {
+		dev_err(chan->vi->dev, "Failed to get CIL clock\n");
+		return -EINVAL;
+	}
+
+	/* VI Channel is 64 bytes alignment */
+	chan->align = 64;
+	chan->surface = 0;
+	chan->io_id = tegra_io_rail_csi_ids[chan->port];
+	mutex_init(&chan->lock);
+	mutex_init(&chan->video_lock);
+	INIT_LIST_HEAD(&chan->capture);
+	spin_lock_init(&chan->queued_lock);
+	INIT_WORK(&chan->work, tegra_channel_work);
+
+	/* Init video format */
+	chan->fmtinfo = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
+	chan->format.pixelformat = chan->fmtinfo->fourcc;
+	chan->format.colorspace = V4L2_COLORSPACE_SRGB;
+	chan->format.field = V4L2_FIELD_NONE;
+	chan->format.width = TEGRA_DEF_WIDTH;
+	chan->format.height = TEGRA_DEF_HEIGHT;
+	chan->format.bytesperline = chan->format.width * chan->fmtinfo->bpp;
+	chan->format.sizeimage = chan->format.bytesperline *
+				    chan->format.height;
+
+	/* Initialize the media entity... */
+	chan->pad.flags = MEDIA_PAD_FL_SINK;
+
+	ret = media_entity_init(&chan->video.entity, 1, &chan->pad, 0);
+	if (ret < 0)
+		return ret;
+
+	/* ... and the video node... */
+	chan->video.fops = &tegra_channel_fops;
+	chan->video.v4l2_dev = &vi->v4l2_dev;
+	chan->video.queue = &chan->queue;
+	snprintf(chan->video.name, sizeof(chan->video.name), "%s %s %u",
+		 vi->dev->of_node->name, "output", port);
+	chan->video.vfl_type = VFL_TYPE_GRABBER;
+	chan->video.vfl_dir = VFL_DIR_RX;
+	chan->video.release = video_device_release_empty;
+	chan->video.ioctl_ops = &tegra_channel_ioctl_ops;
+	chan->video.lock = &chan->video_lock;
+
+	video_set_drvdata(&chan->video, chan);
+
+	/* Init host1x interface */
+	INIT_LIST_HEAD(&chan->client.list);
+	chan->client.dev = chan->vi->dev;
+
+	ret = host1x_client_register(&chan->client);
+	if (ret < 0) {
+		dev_err(chan->vi->dev, "failed to register host1x client: %d\n",
+			ret);
+		ret = -ENODEV;
+		goto host1x_register_error;
+	}
+
+	chan->sp = host1x_syncpt_request(chan->client.dev,
+					 HOST1X_SYNCPT_HAS_BASE);
+	if (!chan->sp) {
+		dev_err(chan->vi->dev, "failed to request host1x syncpoint\n");
+		ret = -ENOMEM;
+		goto host1x_sp_error;
+	}
+
+	/* ... and the buffers queue... */
+	chan->alloc_ctx = vb2_dma_contig_init_ctx(&chan->video.dev);
+	if (IS_ERR(chan->alloc_ctx)) {
+		dev_err(chan->vi->dev, "failed to init vb2 buffer\n");
+		ret = -ENOMEM;
+		goto vb2_init_error;
+	}
+
+	chan->queue.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+	chan->queue.io_modes = VB2_MMAP | VB2_DMABUF | VB2_READ;
+	chan->queue.lock = &chan->video_lock;
+	chan->queue.drv_priv = chan;
+	chan->queue.buf_struct_size = sizeof(struct tegra_channel_buffer);
+	chan->queue.ops = &tegra_channel_queue_qops;
+	chan->queue.mem_ops = &vb2_dma_contig_memops;
+	chan->queue.timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC
+				   | V4L2_BUF_FLAG_TSTAMP_SRC_EOF;
+	ret = vb2_queue_init(&chan->queue);
+	if (ret < 0) {
+		dev_err(chan->vi->dev, "failed to initialize VB2 queue\n");
+		goto vb2_queue_error;
+	}
+
+	ret = video_register_device(&chan->video, VFL_TYPE_GRABBER, -1);
+	if (ret < 0) {
+		dev_err(&chan->video.dev, "failed to register video device\n");
+		goto video_register_error;
+	}
+
+	return 0;
+
+video_register_error:
+	vb2_queue_release(&chan->queue);
+vb2_queue_error:
+	vb2_dma_contig_cleanup_ctx(chan->alloc_ctx);
+vb2_init_error:
+	host1x_syncpt_free(chan->sp);
+host1x_sp_error:
+	host1x_client_unregister(&chan->client);
+host1x_register_error:
+	media_entity_cleanup(&chan->video.entity);
+	return ret;
+}
+
+int tegra_channel_cleanup(struct tegra_channel *chan)
+{
+	video_unregister_device(&chan->video);
+
+	vb2_queue_release(&chan->queue);
+	vb2_dma_contig_cleanup_ctx(chan->alloc_ctx);
+
+	host1x_syncpt_free(chan->sp);
+	host1x_client_unregister(&chan->client);
+
+	media_entity_cleanup(&chan->video.entity);
+
+	return 0;
+}
diff --git a/drivers/media/platform/tegra/tegra-core.c b/drivers/media/platform/tegra/tegra-core.c
new file mode 100644
index 0000000..244b9b8
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-core.c
@@ -0,0 +1,295 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver Core Helpers
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#include "tegra-core.h"
+
+const struct tegra_video_format tegra_video_formats[] = {
+	/* RAW 6: TODO */
+
+	/* RAW 7: TODO */
+
+	/* RAW 8 */
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SRGGB8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SRGGB8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SGRBG8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SGRBG8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SGBRG8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SGBRG8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SBGGR8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SBGGR8,
+	},
+
+	/* RAW 10 */
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SRGGB10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SRGGB10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SGRBG10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SGRBG10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SGBRG10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SGBRG10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SBGGR10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SBGGR10,
+	},
+
+	/* RAW 12 */
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SRGGB12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SRGGB12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SGRBG12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SGRBG12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SGBRG12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SGBRG12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SBGGR12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SBGGR12,
+	},
+
+	/* RGB888 */
+	{
+		TEGRA_VF_RGB888,
+		24,
+		MEDIA_BUS_FMT_RGB888_1X32_PADHI,
+		4,
+		TEGRA_IMAGE_FORMAT_T_A8B8G8R8,
+		TEGRA_IMAGE_DT_RGB888,
+		V4L2_PIX_FMT_RGB32,
+	},
+};
+
+/* -----------------------------------------------------------------------------
+ * Helper functions
+ */
+
+int tegra_core_get_formats_array_size(void)
+{
+	return ARRAY_SIZE(tegra_video_formats);
+}
+
+/**
+ * tegra_core_get_word_count - Calculate word count
+ * @frame_width: number of pixels in one frame
+ * @fmt: Tegra Video format struct which has BPP information
+ *
+ * Return: word count number
+ */
+u32 tegra_core_get_word_count(u32 frame_width,
+			      const struct tegra_video_format *fmt)
+{
+	return frame_width * fmt->width / 8;
+}
+
+/**
+ * tegra_core_get_idx_by_code - Retrieve index for a media bus code
+ * @code: the format media bus code
+ *
+ * Return: a index to the format information structure corresponding to the
+ * given V4L2 media bus format @code, or -1 if no corresponding format can
+ * be found.
+ */
+int tegra_core_get_idx_by_code(unsigned int code)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->code == code)
+			return i;
+	}
+
+	return -1;
+}
+
+
+/**
+ * tegra_core_get_format_by_code - Retrieve format information for a media
+ * 				   bus code
+ * @code: the format media bus code
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * given V4L2 media bus format @code, or NULL if no corresponding format can
+ * be found.
+ */
+const struct tegra_video_format *
+tegra_core_get_format_by_code(unsigned int code)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->code == code)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_get_format_by_fourcc - Retrieve format information for a 4CC
+ * @fourcc: the format 4CC
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * given V4L2 format @fourcc, or NULL if no corresponding format can be
+ * found.
+ */
+const struct tegra_video_format *tegra_core_get_format_by_fourcc(u32 fourcc)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->fourcc == fourcc)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_of_get_format - Parse a device tree node and return format
+ * 			      information
+ * @node: the device tree node
+ *
+ * Read the nvidia,video-format property from the device tree @node passed as
+ * an argument and return the corresponding format information.
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * format name and width, or NULL if no corresponding format can be found.
+ */
+const struct tegra_video_format *
+tegra_core_of_get_format(struct device_node *node)
+{
+	u32 vf_code;
+	int i, ret;
+	const struct tegra_video_format *format;
+
+	ret = of_property_read_u32(node, "nvidia,video-format", &vf_code);
+	if (ret < 0)
+		vf_code = TEGRA_VF_DEF;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->vf_code == vf_code)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_bytes_per_line - Calculate bytes per line in one frame
+ * @width: frame width
+ * @fmt: Tegra Video format
+ *
+ * Simply calcualte the bytes_per_line and if it's not 64 bytes aligned it
+ * will be padded to 64 boundary.
+ */
+u32 tegra_core_bytes_per_line(u32 width,
+			      const struct tegra_video_format *fmt)
+{
+	u32 bytes_per_line = width * fmt->bpp;
+
+	if (bytes_per_line % 64)
+		bytes_per_line = bytes_per_line +
+				 (64 - (bytes_per_line % 64));
+
+	return bytes_per_line;
+}
diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
new file mode 100644
index 0000000..7d1026b
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-core.h
@@ -0,0 +1,134 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver Core Helpers
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __TEGRA_CORE_H__
+#define __TEGRA_CORE_H__
+
+#include <dt-bindings/media/tegra-vi.h>
+
+#include <media/v4l2-subdev.h>
+
+/* Minimum and maximum width and height common to Tegra video input device. */
+#define TEGRA_MIN_WIDTH		32U
+#define TEGRA_MAX_WIDTH		7680U
+#define TEGRA_MIN_HEIGHT	32U
+#define TEGRA_MAX_HEIGHT	7680U
+
+/* UHD 4K resolution as default resolution for all Tegra video input device. */
+#define TEGRA_DEF_WIDTH		3840
+#define TEGRA_DEF_HEIGHT	2160
+
+#define TEGRA_VF_DEF		TEGRA_VF_RGB888
+#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
+
+/* These go into the TEGRA_VI_CSI_n_IMAGE_DEF registers bits 23:16 */
+#define TEGRA_IMAGE_FORMAT_T_L8                         16
+#define TEGRA_IMAGE_FORMAT_T_R16_I                      32
+#define TEGRA_IMAGE_FORMAT_T_B5G6R5                     33
+#define TEGRA_IMAGE_FORMAT_T_R5G6B5                     34
+#define TEGRA_IMAGE_FORMAT_T_A1B5G5R5                   35
+#define TEGRA_IMAGE_FORMAT_T_A1R5G5B5                   36
+#define TEGRA_IMAGE_FORMAT_T_B5G5R5A1                   37
+#define TEGRA_IMAGE_FORMAT_T_R5G5B5A1                   38
+#define TEGRA_IMAGE_FORMAT_T_A4B4G4R4                   39
+#define TEGRA_IMAGE_FORMAT_T_A4R4G4B4                   40
+#define TEGRA_IMAGE_FORMAT_T_B4G4R4A4                   41
+#define TEGRA_IMAGE_FORMAT_T_R4G4B4A4                   42
+#define TEGRA_IMAGE_FORMAT_T_A8B8G8R8                   64
+#define TEGRA_IMAGE_FORMAT_T_A8R8G8B8                   65
+#define TEGRA_IMAGE_FORMAT_T_B8G8R8A8                   66
+#define TEGRA_IMAGE_FORMAT_T_R8G8B8A8                   67
+#define TEGRA_IMAGE_FORMAT_T_A2B10G10R10                68
+#define TEGRA_IMAGE_FORMAT_T_A2R10G10B10                69
+#define TEGRA_IMAGE_FORMAT_T_B10G10R10A2                70
+#define TEGRA_IMAGE_FORMAT_T_R10G10B10A2                71
+#define TEGRA_IMAGE_FORMAT_T_A8Y8U8V8                   193
+#define TEGRA_IMAGE_FORMAT_T_V8U8Y8A8                   194
+#define TEGRA_IMAGE_FORMAT_T_A2Y10U10V10                197
+#define TEGRA_IMAGE_FORMAT_T_V10U10Y10A2                198
+#define TEGRA_IMAGE_FORMAT_T_Y8_U8__Y8_V8               200
+#define TEGRA_IMAGE_FORMAT_T_Y8_V8__Y8_U8               201
+#define TEGRA_IMAGE_FORMAT_T_U8_Y8__V8_Y8               202
+#define TEGRA_IMAGE_FORMAT_T_T_V8_Y8__U8_Y8             203
+#define TEGRA_IMAGE_FORMAT_T_T_Y8__U8__V8_N444          224
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N444              225
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N444              226
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N422            227
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N422              228
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N422              229
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N420            230
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N420              231
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N420              232
+#define TEGRA_IMAGE_FORMAT_T_X2Lc10Lb10La10             233
+#define TEGRA_IMAGE_FORMAT_T_A2R6R6R6R6R6               234
+
+/* These go into the TEGRA_VI_CSI_n_CSI_IMAGE_DT registers bits 7:0 */
+#define TEGRA_IMAGE_DT_YUV420_8                         24
+#define TEGRA_IMAGE_DT_YUV420_10                        25
+#define TEGRA_IMAGE_DT_YUV420CSPS_8                     28
+#define TEGRA_IMAGE_DT_YUV420CSPS_10                    29
+#define TEGRA_IMAGE_DT_YUV422_8                         30
+#define TEGRA_IMAGE_DT_YUV422_10                        31
+#define TEGRA_IMAGE_DT_RGB444                           32
+#define TEGRA_IMAGE_DT_RGB555                           33
+#define TEGRA_IMAGE_DT_RGB565                           34
+#define TEGRA_IMAGE_DT_RGB666                           35
+#define TEGRA_IMAGE_DT_RGB888                           36
+#define TEGRA_IMAGE_DT_RAW6                             40
+#define TEGRA_IMAGE_DT_RAW7                             41
+#define TEGRA_IMAGE_DT_RAW8                             42
+#define TEGRA_IMAGE_DT_RAW10                            43
+#define TEGRA_IMAGE_DT_RAW12                            44
+#define TEGRA_IMAGE_DT_RAW14                            45
+
+/**
+ * struct tegra_video_format - Tegra video format description
+ * @vf_code: video format code
+ * @width: format width in bits per component
+ * @code: media bus format code
+ * @bpp: bytes per pixel (when stored in memory)
+ * @img_fmt: image format
+ * @img_dt: image data type
+ * @fourcc: V4L2 pixel format FCC identifier
+ * @description: format description, suitable for userspace
+ */
+struct tegra_video_format {
+	u32 vf_code;
+	u32 width;
+	u32 code;
+	u32 bpp;
+	u32 img_fmt;
+	u32 img_dt;
+	u32 fourcc;
+};
+
+extern const struct tegra_video_format tegra_video_formats[];
+
+int tegra_core_get_formats_array_size(void);
+
+u32 tegra_core_get_word_count(u32 frame_width,
+			      const struct tegra_video_format *fmt);
+int tegra_core_get_idx_by_code(unsigned int code);
+const struct tegra_video_format *tegra_core_get_format_by_code(unsigned int
+							       code);
+const struct tegra_video_format *tegra_core_get_format_by_fourcc(u32 fourcc);
+const struct tegra_video_format *tegra_core_of_get_format(struct device_node
+							  *node);
+u32 tegra_core_bytes_per_line(u32 width,
+				     const struct tegra_video_format *fmt);
+int tegra_core_enum_mbus_code(struct v4l2_subdev *subdev,
+			struct v4l2_subdev_pad_config *cfg,
+			struct v4l2_subdev_mbus_code_enum *code);
+int tegra_core_enum_frame_size(struct v4l2_subdev *subdev,
+			 struct v4l2_subdev_pad_config *cfg,
+			 struct v4l2_subdev_frame_size_enum *fse);
+#endif
diff --git a/drivers/media/platform/tegra/tegra-vi.c b/drivers/media/platform/tegra/tegra-vi.c
new file mode 100644
index 0000000..65ba412
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-vi.c
@@ -0,0 +1,585 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/clk.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_graph.h>
+#include <linux/platform_device.h>
+#include <linux/regulator/consumer.h>
+#include <linux/reset.h>
+#include <linux/slab.h>
+
+#include <media/media-device.h>
+#include <media/v4l2-async.h>
+#include <media/v4l2-common.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-of.h>
+
+#include <soc/tegra/pmc.h>
+
+#include "tegra-vi.h"
+
+/* In TPG mode, VI only support 2 formats */
+static void vi_tpg_fmts_bitmap_init(struct tegra_vi_device *vi)
+{
+	int index;
+
+	bitmap_zero(vi->tpg_fmts_bitmap, MAX_FORMAT_NUM);
+
+	index = tegra_core_get_idx_by_code(MEDIA_BUS_FMT_SRGGB10_1X10);
+	bitmap_set(vi->tpg_fmts_bitmap, index, 1);
+
+	index = tegra_core_get_idx_by_code(MEDIA_BUS_FMT_RGB888_1X32_PADHI);
+	bitmap_set(vi->tpg_fmts_bitmap, index, 1);
+}
+
+/*
+ * Control Config
+ */
+
+static const char *const vi_pattern_strings[] = {
+	"Disabled",
+	"Black/White Direct Mode",
+	"Color Patch Mode",
+};
+
+static int vi_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+	struct tegra_vi_device *vi = container_of(ctrl->handler,
+						  struct tegra_vi_device,
+						  ctrl_handler);
+	switch (ctrl->id) {
+	case V4L2_CID_TEST_PATTERN:
+		vi->pg_mode = ctrl->val;
+		break;
+	}
+
+	return 0;
+}
+
+static const struct v4l2_ctrl_ops vi_ctrl_ops = {
+	.s_ctrl	= vi_s_ctrl,
+};
+
+/* -----------------------------------------------------------------------------
+ * Media Controller and V4L2
+ */
+
+static void tegra_vi_v4l2_cleanup(struct tegra_vi_device *vi)
+{
+	v4l2_ctrl_handler_free(&vi->ctrl_handler);
+	v4l2_device_unregister(&vi->v4l2_dev);
+	media_device_unregister(&vi->media_dev);
+}
+
+static int tegra_vi_v4l2_init(struct tegra_vi_device *vi)
+{
+	int ret;
+
+	vi->media_dev.dev = vi->dev;
+	strlcpy(vi->media_dev.model, "NVIDIA Tegra Video Input Device",
+		sizeof(vi->media_dev.model));
+	vi->media_dev.hw_revision = 0;
+
+	ret = media_device_register(&vi->media_dev);
+	if (ret < 0) {
+		dev_err(vi->dev, "media device registration failed (%d)\n",
+			ret);
+		return ret;
+	}
+
+	vi->v4l2_dev.mdev = &vi->media_dev;
+	ret = v4l2_device_register(vi->dev, &vi->v4l2_dev);
+	if (ret < 0) {
+		dev_err(vi->dev, "V4L2 device registration failed (%d)\n",
+			ret);
+		goto register_error;
+	}
+
+	v4l2_ctrl_handler_init(&vi->ctrl_handler, 1);
+	vi->pattern = v4l2_ctrl_new_std_menu_items(&vi->ctrl_handler,
+					&vi_ctrl_ops, V4L2_CID_TEST_PATTERN,
+					ARRAY_SIZE(vi_pattern_strings) - 1,
+					0, 0, vi_pattern_strings);
+
+	if (vi->ctrl_handler.error) {
+		dev_err(vi->dev, "failed to add controls\n");
+		ret = vi->ctrl_handler.error;
+		goto ctrl_error;
+	}
+	vi->v4l2_dev.ctrl_handler = &vi->ctrl_handler;
+
+	ret = v4l2_ctrl_handler_setup(&vi->ctrl_handler);
+	if (ret < 0) {
+		dev_err(vi->dev, "failed to set controls\n");
+		goto ctrl_error;
+	}
+	return 0;
+
+
+ctrl_error:
+	v4l2_ctrl_handler_free(&vi->ctrl_handler);
+	v4l2_device_unregister(&vi->v4l2_dev);
+register_error:
+	media_device_unregister(&vi->media_dev);
+	return ret;
+}
+
+/* -----------------------------------------------------------------------------
+ * Platform Device Driver
+ */
+
+int tegra_vi_power_on(struct tegra_vi_device *vi)
+{
+	int ret;
+
+	ret = regulator_enable(vi->vi_reg);
+	if (ret)
+		return ret;
+
+	ret = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_VENC,
+						vi->vi_clk, vi->vi_rst);
+	if (ret) {
+		regulator_disable(vi->vi_reg);
+		return ret;
+	}
+
+	clk_prepare_enable(vi->csi_clk);
+
+	clk_set_rate(vi->parent_clk, 408000000);
+	clk_set_rate(vi->vi_clk, 408000000);
+	clk_set_rate(vi->csi_clk, 408000000);
+
+	return 0;
+}
+
+void tegra_vi_power_off(struct tegra_vi_device *vi)
+{
+	clk_disable_unprepare(vi->csi_clk);
+	tegra_powergate_power_off(TEGRA_POWERGATE_VENC);
+	regulator_disable(vi->vi_reg);
+}
+
+static int tegra_vi_channels_init(struct tegra_vi_device *vi)
+{
+	int i, ret;
+	struct tegra_channel *chan;
+
+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
+		chan = &vi->chans[i];
+
+		ret = tegra_channel_init(vi, chan, i);
+		if (ret < 0) {
+			dev_err(vi->dev, "channel %d init failed\n", i);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int tegra_vi_channels_cleanup(struct tegra_vi_device *vi)
+{
+	int i, ret;
+	struct tegra_channel *chan;
+
+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
+		chan = &vi->chans[i];
+
+		ret = tegra_channel_cleanup(chan);
+		if (ret < 0) {
+			dev_err(vi->dev, "channel %d cleanup failed\n", i);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+/* -----------------------------------------------------------------------------
+ * Graph Management
+ */
+
+static struct tegra_vi_graph_entity *
+tegra_vi_graph_find_entity(struct tegra_vi_device *vi,
+		       const struct device_node *node)
+{
+	struct tegra_vi_graph_entity *entity;
+
+	list_for_each_entry(entity, &vi->entities, list) {
+		if (entity->node == node)
+			return entity;
+	}
+
+	return NULL;
+}
+
+static int tegra_vi_graph_build_links(struct tegra_vi_device *vi)
+{
+	u32 link_flags = MEDIA_LNK_FL_ENABLED;
+	struct device_node *node = vi->dev->of_node;
+	struct media_entity *source;
+	struct media_entity *sink;
+	struct media_pad *source_pad;
+	struct media_pad *sink_pad;
+	struct tegra_vi_graph_entity *ent;
+	struct v4l2_of_link link;
+	struct device_node *ep = NULL;
+	struct device_node *next;
+	struct tegra_channel *chan;
+	int ret = 0;
+
+
+	dev_dbg(vi->dev, "creating links for channels\n");
+
+	while (1) {
+		/* Get the next endpoint and parse its link. */
+		next = of_graph_get_next_endpoint(node, ep);
+		if (next == NULL)
+			break;
+
+		of_node_put(ep);
+		ep = next;
+
+		dev_dbg(vi->dev, "processing endpoint %s\n", ep->full_name);
+
+		ret = v4l2_of_parse_link(ep, &link);
+		if (ret < 0) {
+			dev_err(vi->dev, "failed to parse link for %s\n",
+				ep->full_name);
+			continue;
+		}
+
+		if (link.local_port > MAX_CHAN_NUM) {
+			dev_err(vi->dev, "wrong channel number for port %u\n",
+				link.local_port);
+			v4l2_of_put_link(&link);
+			ret = -EINVAL;
+			break;
+		}
+
+		chan = &vi->chans[link.local_port];
+
+		dev_dbg(vi->dev, "creating link for channel %s\n",
+			chan->video.name);
+
+		/* Find the remote entity. */
+		ent = tegra_vi_graph_find_entity(vi, link.remote_node);
+		if (ent == NULL) {
+			dev_err(vi->dev, "no entity found for %s\n",
+				link.remote_node->full_name);
+			v4l2_of_put_link(&link);
+			ret = -ENODEV;
+			break;
+		}
+
+		if (link.remote_port >= ent->entity->num_pads) {
+			dev_err(vi->dev, "invalid port number %u on %s\n",
+				link.remote_port, link.remote_node->full_name);
+			v4l2_of_put_link(&link);
+			ret = -EINVAL;
+			break;
+		}
+
+		source = ent->entity;
+		source_pad = &source->pads[link.remote_port];
+		sink = &chan->video.entity;
+		sink_pad = &chan->pad;
+		chan->remote_entity = ent;
+
+		v4l2_of_put_link(&link);
+
+		/* Create the media link. */
+		dev_dbg(vi->dev, "creating %s:%u -> %s:%u link\n",
+			source->name, source_pad->index,
+			sink->name, sink_pad->index);
+
+		ret = media_entity_create_link(source, source_pad->index,
+					       sink, sink_pad->index,
+					       link_flags);
+		if (ret < 0) {
+			dev_err(vi->dev,
+				"failed to create %s:%u -> %s:%u link\n",
+				source->name, source_pad->index,
+				sink->name, sink_pad->index);
+			break;
+		}
+
+		tegra_channel_fmts_bitmap_init(chan);
+	}
+
+	of_node_put(ep);
+	return ret;
+}
+
+static int tegra_vi_graph_notify_complete(struct v4l2_async_notifier *notifier)
+{
+	struct tegra_vi_device *vi =
+		container_of(notifier, struct tegra_vi_device, notifier);
+	int ret;
+
+	dev_dbg(vi->dev, "notify complete, all subdevs registered\n");
+
+	/* Create links for every entity. */
+	ret = tegra_vi_graph_build_links(vi);
+	if (ret < 0)
+		return ret;
+
+	ret = v4l2_device_register_subdev_nodes(&vi->v4l2_dev);
+	if (ret < 0)
+		dev_err(vi->dev, "failed to register subdev nodes\n");
+
+	return ret;
+}
+
+static int tegra_vi_graph_notify_bound(struct v4l2_async_notifier *notifier,
+				   struct v4l2_subdev *subdev,
+				   struct v4l2_async_subdev *asd)
+{
+	struct tegra_vi_device *vi =
+		container_of(notifier, struct tegra_vi_device, notifier);
+	struct tegra_vi_graph_entity *entity;
+
+	/* Locate the entity corresponding to the bound subdev and store the
+	 * subdev pointer.
+	 */
+	list_for_each_entry(entity, &vi->entities, list) {
+		if (entity->node != subdev->dev->of_node)
+			continue;
+
+		if (entity->subdev) {
+			dev_err(vi->dev, "duplicate subdev for node %s\n",
+				entity->node->full_name);
+			return -EINVAL;
+		}
+
+		dev_dbg(vi->dev, "subdev %s bound\n", subdev->name);
+		entity->entity = &subdev->entity;
+		entity->subdev = subdev;
+		return 0;
+	}
+
+	dev_err(vi->dev, "no entity for subdev %s\n", subdev->name);
+	return -EINVAL;
+}
+
+
+static void tegra_vi_graph_cleanup(struct tegra_vi_device *vi)
+{
+	struct tegra_vi_graph_entity *entityp;
+	struct tegra_vi_graph_entity *entity;
+
+	v4l2_async_notifier_unregister(&vi->notifier);
+
+	list_for_each_entry_safe(entity, entityp, &vi->entities, list) {
+		of_node_put(entity->node);
+		list_del(&entity->list);
+	}
+}
+
+static int tegra_vi_graph_init(struct tegra_vi_device *vi)
+{
+	struct device_node *node = vi->dev->of_node;
+	struct device_node *ep = NULL;
+	struct device_node *next; 
+	struct device_node *remote = NULL;
+	struct tegra_vi_graph_entity *entity;
+	struct v4l2_async_subdev **subdevs = NULL;
+	unsigned int num_subdevs;
+	int ret = 0, i;
+
+	/* Parse all the remote entities and put them into the list */
+	while (1) {
+		next = of_graph_get_next_endpoint(node, ep);
+		if (!next)
+			break;
+
+		of_node_put(ep);
+		ep = next;
+
+		remote = of_graph_get_remote_port_parent(ep);
+		if (!remote) {
+			ret = -EINVAL;
+			break;
+		}
+
+		entity = devm_kzalloc(vi->dev, sizeof(*entity), GFP_KERNEL);
+		if (entity == NULL) {
+			of_node_put(remote);
+			ret = -ENOMEM;
+			break;
+		}
+
+		entity->node = remote;
+		entity->asd.match_type = V4L2_ASYNC_MATCH_OF;
+		entity->asd.match.of.node = remote;
+		list_add_tail(&entity->list, &vi->entities);
+		vi->num_subdevs++;
+	}
+	of_node_put(ep);
+
+	if (!vi->num_subdevs) {
+		dev_warn(vi->dev, "no subdev found in graph\n");
+		goto done;
+	}
+
+	/* Register the subdevices notifier. */
+	num_subdevs = vi->num_subdevs;
+	subdevs = devm_kzalloc(vi->dev, sizeof(*subdevs) * num_subdevs,
+			       GFP_KERNEL);
+	if (subdevs == NULL) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	i = 0;
+	list_for_each_entry(entity, &vi->entities, list)
+		subdevs[i++] = &entity->asd;
+
+	vi->notifier.subdevs = subdevs;
+	vi->notifier.num_subdevs = num_subdevs;
+	vi->notifier.bound = tegra_vi_graph_notify_bound;
+	vi->notifier.complete = tegra_vi_graph_notify_complete;
+
+	ret = v4l2_async_notifier_register(&vi->v4l2_dev, &vi->notifier);
+	if (ret < 0) {
+		dev_err(vi->dev, "notifier registration failed\n");
+		goto done;
+	}
+
+	return 0;
+
+done:
+	if (ret < 0)
+		tegra_vi_graph_cleanup(vi);
+
+	return ret;
+}
+
+static int tegra_vi_probe(struct platform_device *pdev)
+{
+	struct resource *res;
+	struct tegra_vi_device *vi;
+	int ret = 0;
+
+	vi = devm_kzalloc(&pdev->dev, sizeof(*vi), GFP_KERNEL);
+	if (!vi)
+		return -ENOMEM;
+
+	vi->dev = &pdev->dev;
+	INIT_LIST_HEAD(&vi->entities);
+	mutex_init(&vi->lock);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	vi->iomem = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(vi->iomem))
+		return PTR_ERR(vi->iomem);
+
+	vi->vi_rst = devm_reset_control_get(&pdev->dev, "vi");
+	if (IS_ERR(vi->vi_rst)) {
+		dev_err(&pdev->dev, "Failed to get vi reset\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->vi_clk = devm_clk_get(&pdev->dev, "vi");
+	if (IS_ERR(vi->vi_clk)) {
+		dev_err(&pdev->dev, "Failed to get vi clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->parent_clk = devm_clk_get(&pdev->dev, "parent");
+	if (IS_ERR(vi->parent_clk)) {
+		dev_err(&pdev->dev, "Failed to get VI parent clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	ret = clk_set_parent(vi->vi_clk, vi->parent_clk);
+	if (ret < 0)
+		return ret;
+
+	vi->csi_clk = devm_clk_get(&pdev->dev, "csi");
+	if (IS_ERR(vi->csi_clk)) {
+		dev_err(&pdev->dev, "Failed to get csi clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->vi_reg = devm_regulator_get(&pdev->dev, "avdd-dsi-csi");
+	if (IS_ERR(vi->vi_reg)) {
+		dev_err(&pdev->dev, "Failed to get avdd-dsi-csi regulators\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi_tpg_fmts_bitmap_init(vi);
+
+	ret = tegra_vi_v4l2_init(vi);
+	if (ret < 0)
+		return ret;
+
+	/* Check whether VI is in test pattern generator (TPG) mode */
+	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
+			     &vi->pg_mode);
+
+	/* Init Tegra VI channels */
+	ret = tegra_vi_channels_init(vi);
+	if (ret < 0)
+		goto channels_error;
+
+	/* Setup media links between VI and external sensor subdev. */
+	ret = tegra_vi_graph_init(vi);
+	if (ret < 0)
+		goto graph_error;
+
+	platform_set_drvdata(pdev, vi);
+
+	dev_info(vi->dev, "device registered\n");
+
+	return 0;
+
+graph_error:
+	tegra_vi_channels_cleanup(vi);
+channels_error:
+	tegra_vi_v4l2_cleanup(vi);
+	return ret;
+}
+
+static int tegra_vi_remove(struct platform_device *pdev)
+{
+	struct tegra_vi_device *vi = platform_get_drvdata(pdev);
+
+	tegra_vi_graph_cleanup(vi);
+	tegra_vi_channels_cleanup(vi);
+	tegra_vi_v4l2_cleanup(vi);
+
+	return 0;
+}
+
+static const struct of_device_id tegra_vi_of_id_table[] = {
+	{ .compatible = "nvidia,tegra210-vi" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, tegra_vi_of_id_table);
+
+static struct platform_driver tegra_vi_driver = {
+	.driver = {
+		.name = "tegra-vi",
+		.of_match_table = tegra_vi_of_id_table,
+	},
+	.probe = tegra_vi_probe,
+	.remove = tegra_vi_remove,
+};
+
+module_platform_driver(tegra_vi_driver);
+
+MODULE_AUTHOR("Bryan Wu <pengw@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra Video Input Device Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/media/platform/tegra/tegra-vi.h b/drivers/media/platform/tegra/tegra-vi.h
new file mode 100644
index 0000000..d30a6ec
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-vi.h
@@ -0,0 +1,224 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __TEGRA_VI_H__
+#define __TEGRA_VI_H__
+
+#include <linux/host1x.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/spinlock.h>
+#include <linux/videodev2.h>
+
+#include <media/media-device.h>
+#include <media/media-entity.h>
+#include <media/v4l2-async.h>
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-dev.h>
+#include <media/videobuf2-core.h>
+
+#include "tegra-core.h"
+
+#define MAX_CHAN_NUM	6
+#define MAX_FORMAT_NUM	64
+
+/**
+ * struct tegra_channel_buffer - video channel buffer
+ * @buf: vb2 buffer base object
+ * @queue: buffer list entry in the channel queued buffers list
+ * @chan: channel that uses the buffer
+ * @addr: Tegra IOVA buffer address for VI output
+ */
+struct tegra_channel_buffer {
+	struct vb2_buffer buf;
+	struct list_head queue;
+	struct tegra_channel *chan;
+
+	dma_addr_t addr;
+};
+
+#define to_tegra_channel_buffer(vb) \
+	container_of(vb, struct tegra_channel_buffer, buf)
+
+
+struct chan_regs_config {
+	u32 csi;
+	u32 pp;
+	u32 cil;
+	u32 phy;
+	u32 tpg;
+};
+
+/**
+ * struct tegra_vi_graph_entity - Entity in the video graph
+ * @list: list entry in a graph entities list
+ * @node: the entity's DT node
+ * @entity: media entity, from the corresponding V4L2 subdev
+ * @asd: subdev asynchronous registration information
+ * @subdev: V4L2 subdev
+ */
+struct tegra_vi_graph_entity {
+	struct list_head list;
+	struct device_node *node;
+	struct media_entity *entity;
+
+	struct v4l2_async_subdev asd;
+	struct v4l2_subdev *subdev;
+};
+
+/**
+ * struct tegra_channel - Tegra video channel
+ * @list: list entry in a composite device dmas list
+ * @video: V4L2 video device associated with the video channel
+ * @video_lock:
+ * @pad: media pad for the video device entity
+ * @pipe: pipeline belonging to the channel
+ *
+ * @vi: composite device DT node port number for the channel
+ *
+ * @client: host1x client struct of Tegra DRM
+ * @sp: host1x syncpoint pointer
+ *
+ * @work: kernel workqueue structure of this video channel
+ * @lock: protects the @format, @fmtinfo, @queue and @work fields
+ *
+ * @format: active V4L2 pixel format
+ * @fmtinfo: format information corresponding to the active @format
+ *
+ * @queue: vb2 buffers queue
+ * @alloc_ctx: allocation context for the vb2 @queue
+ * @sequence: V4L2 buffers sequence number
+ *
+ * @capture: list of queued buffers for capture
+ * @active: active buffer for capture
+ * @queued_lock: protects the buf_queued list
+ *
+ * @iomem: root register base
+ * @regs: CSI/CIL/PHY register bases
+ * @cil_clk: clock for CIL
+ * @align: channel buffer alignment, default is 64
+ * @port: CSI port of this video channel
+ * @surface: output memory surface number
+ * @io_id: Tegra IO rail ID of this video channel
+ * @bypass: a flag to bypass register write
+ *
+ * @fmts_bitmap: a bitmap for formats supported
+ *
+ * @remote_entity: remote media entity for external sensor
+ */
+struct tegra_channel {
+	struct list_head list;
+	struct video_device video;
+	struct mutex video_lock;
+	struct media_pad pad;
+	struct media_pipeline pipe;
+
+	struct tegra_vi_device *vi;
+
+	struct host1x_client client;
+	struct host1x_syncpt *sp;
+
+	struct work_struct work;
+	struct mutex lock;
+
+	struct v4l2_pix_format format;
+	const struct tegra_video_format *fmtinfo;
+
+	struct vb2_queue queue;
+	void *alloc_ctx;
+	u32 sequence;
+
+	struct list_head capture;
+	struct tegra_channel_buffer *active;
+	spinlock_t queued_lock;
+
+	void __iomem *iomem;
+	struct chan_regs_config regs;
+	struct clk *cil_clk;
+	int align;
+	u32 port;
+	u32 surface;
+	int io_id;
+	int bypass;
+
+	DECLARE_BITMAP(fmts_bitmap, MAX_FORMAT_NUM);
+
+	struct tegra_vi_graph_entity *remote_entity;
+};
+
+#define to_tegra_channel(vdev) \
+	container_of(vdev, struct tegra_channel, video)
+
+/**
+ * struct tegra_vi_device - NVIDIA Tegra Video Input device structure
+ * @v4l2_dev: V4L2 device
+ * @media_dev: media device
+ * @dev: device struct
+ *
+ * @iomem: register base
+ * @vi_clk: main clock for VI block
+ * @parent_clk: parent clock of VI clock
+ * @csi_clk: clock for CSI
+ * @vi_rst: reset controler
+ * @vi_reg: regulator for VI hardware, normally it avdd_dsi_csi
+ *
+ * @lock: mutex lock to protect power on/off operations
+ * @power_on_refcnt: reference count for power on/off operations
+ *
+ * @notifier: V4L2 asynchronous subdevs notifier
+ * @entities: entities in the graph as a list of tegra_vi_graph_entity
+ * @num_subdevs: number of subdevs in the pipeline
+ *
+ * @channels: list of channels at the pipeline output and input
+ *
+ * @ctrl_handler: V4L2 control handler
+ * @pattern: test pattern generator V4L2 control
+ * @pg_mode: test pattern generator mode (disabled/direct/patch)
+ * @tpg_fmts_bitmap: a bitmap for formats in test pattern generator mode
+ */
+struct tegra_vi_device {
+	struct v4l2_device v4l2_dev;
+	struct media_device media_dev;
+	struct device *dev;
+
+	void __iomem *iomem;
+	struct clk *vi_clk;
+	struct clk *parent_clk;
+	struct clk *csi_clk;
+	struct reset_control *vi_rst;
+	struct regulator *vi_reg;
+
+	struct mutex lock;
+	int power_on_refcnt;
+
+	struct v4l2_async_notifier notifier;
+	struct list_head entities;
+	unsigned int num_subdevs;
+
+	struct tegra_channel chans[MAX_CHAN_NUM];
+
+	struct v4l2_ctrl_handler ctrl_handler;
+	struct v4l2_ctrl *pattern;
+	int pg_mode;
+	DECLARE_BITMAP(tpg_fmts_bitmap, MAX_FORMAT_NUM);
+};
+
+int tegra_vi_power_on(struct tegra_vi_device *vi);
+void tegra_vi_power_off(struct tegra_vi_device *vi);
+
+int tegra_channel_init(struct tegra_vi_device *vi,
+		       struct tegra_channel *chan, u32 port);
+int tegra_channel_cleanup(struct tegra_channel *chan);
+void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan);
+
+#endif /* __TEGRA_VI_H__ */
diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
new file mode 100644
index 0000000..5fdea5b
--- /dev/null
+++ b/include/dt-bindings/media/tegra-vi.h
@@ -0,0 +1,35 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
+#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
+
+/*
+ * Supported CSI to VI Data Formats
+ */
+#define TEGRA_VF_RAW6		0
+#define TEGRA_VF_RAW7		1
+#define TEGRA_VF_RAW8		2
+#define TEGRA_VF_RAW10		3
+#define TEGRA_VF_RAW12		4
+#define TEGRA_VF_RAW14		5
+#define TEGRA_VF_EMBEDDED8	6
+#define TEGRA_VF_RGB565		7
+#define TEGRA_VF_RGB555		8
+#define TEGRA_VF_RGB888		9
+#define TEGRA_VF_RGB444		10
+#define TEGRA_VF_RGB666		11
+#define TEGRA_VF_YUV422		12
+#define TEGRA_VF_YUV420		13
+#define TEGRA_VF_YUV420_CSPS	14
+
+#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree
  2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
@ 2015-08-21  0:51 ` Bryan Wu
  2015-08-21  0:51 ` [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

Following device tree support for Tegra VI now:
 - "vi" node which might have 6 ports/endpoints
 - in TPG mode, "vi" node don't need to define any ports/endpoints
 - ports/endpoints defines the link between VI and external sensors.

Signed-off-by: Bryan Wu <pengw@nvidia.com>
---
 arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts |  8 ++++++++
 arch/arm64/boot/dts/nvidia/tegra210.dtsi          | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts b/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
index d4ee460..534ada52 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
@@ -7,6 +7,14 @@
 	model = "NVIDIA Tegra210 P2571 reference board (E.1)";
 	compatible = "nvidia,p2571-e01", "nvidia,tegra210";
 
+	host1x@0,50000000 {
+		vi@0,54080000 {
+			status = "okay";
+
+			avdd-dsi-csi-supply = <&vdd_dsi_csi>;
+		};
+	};
+
 	pinmux: pinmux@0,700008d4 {
 		pinctrl-names = "boot";
 		pinctrl-0 = <&state_boot>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 1168bcd..78bfaad 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -112,6 +112,19 @@
 			reg = <0x0 0x54080000 0x0 0x00040000>;
 			interrupts = <GIC_SPI 69 IRQ_TYPE_LEVEL_HIGH>;
 			status = "disabled";
+			clocks = <&tegra_car TEGRA210_CLK_VI>,
+				 <&tegra_car TEGRA210_CLK_CSI>,
+				 <&tegra_car TEGRA210_CLK_PLL_C>,
+				 <&tegra_car TEGRA210_CLK_CILAB>,
+				 <&tegra_car TEGRA210_CLK_CILCD>,
+				 <&tegra_car TEGRA210_CLK_CILE>;
+			clock-names = "vi", "csi", "parent", "cilab", "cilcd", "cile";
+			resets = <&tegra_car 20>;
+			reset-names = "vi";
+
+			power-domains = <&pmc TEGRA_POWERGATE_VENC>;
+
+			iommus = <&mc TEGRA_SWGROUP_VI>;
 		};
 
 		tsec@0,54100000 {
-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver
  2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
  2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu
@ 2015-08-21  0:51 ` Bryan Wu
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
  2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu
  4 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

NVIDIA Tegra SoC includes a Video Input controller, which can talk
with external camera sensors.

This patch set is still under development, since it's based on some
out of tree Tegra patches. And media controller part still needs some
rework after upstream finalize the MC redesign work.

Currently it's tested with Tegra X1 built-in test pattern generator.

Bryan Wu (2):
  [media] v4l: tegra: Add NVIDIA Tegra VI driver
  ARM64: add tegra-vi support in T210 device-tree

 arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts |    8 +
 arch/arm64/boot/dts/nvidia/tegra210.dtsi          |   13 +
 drivers/media/platform/Kconfig                    |    1 +
 drivers/media/platform/Makefile                   |    2 +
 drivers/media/platform/tegra/Kconfig              |    9 +
 drivers/media/platform/tegra/Makefile             |    3 +
 drivers/media/platform/tegra/tegra-channel.c      | 1074 +++++++++++++++++++++
 drivers/media/platform/tegra/tegra-core.c         |  295 ++++++
 drivers/media/platform/tegra/tegra-core.h         |  134 +++
 drivers/media/platform/tegra/tegra-vi.c           |  585 +++++++++++
 drivers/media/platform/tegra/tegra-vi.h           |  224 +++++
 include/dt-bindings/media/tegra-vi.h              |   35 +
 12 files changed, 2383 insertions(+)
 create mode 100644 drivers/media/platform/tegra/Kconfig
 create mode 100644 drivers/media/platform/tegra/Makefile
 create mode 100644 drivers/media/platform/tegra/tegra-channel.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.h
 create mode 100644 drivers/media/platform/tegra/tegra-vi.c
 create mode 100644 drivers/media/platform/tegra/tegra-vi.h
 create mode 100644 include/dt-bindings/media/tegra-vi.h

-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
                   ` (2 preceding siblings ...)
  2015-08-21  0:51 ` [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
@ 2015-08-21  0:51 ` Bryan Wu
  2015-08-21  9:28   ` Hans Verkuil
  2015-08-21 13:03   ` Thierry Reding
  2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu
  4 siblings, 2 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
controller which can support up to 6 MIPI CSI camera sensors.

This patch adds a V4L2 media controller and capture driver to support
Tegra VI hardware. It's verified with Tegra built-in test pattern
generator.

Signed-off-by: Bryan Wu <pengw@nvidia.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/platform/Kconfig               |    1 +
 drivers/media/platform/Makefile              |    2 +
 drivers/media/platform/tegra/Kconfig         |    9 +
 drivers/media/platform/tegra/Makefile        |    3 +
 drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
 drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
 drivers/media/platform/tegra/tegra-core.h    |  134 ++++
 drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
 drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
 include/dt-bindings/media/tegra-vi.h         |   35 +
 10 files changed, 2362 insertions(+)
 create mode 100644 drivers/media/platform/tegra/Kconfig
 create mode 100644 drivers/media/platform/tegra/Makefile
 create mode 100644 drivers/media/platform/tegra/tegra-channel.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.c
 create mode 100644 drivers/media/platform/tegra/tegra-core.h
 create mode 100644 drivers/media/platform/tegra/tegra-vi.c
 create mode 100644 drivers/media/platform/tegra/tegra-vi.h
 create mode 100644 include/dt-bindings/media/tegra-vi.h

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index f6bed19..553867f 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -119,6 +119,7 @@ source "drivers/media/platform/exynos4-is/Kconfig"
 source "drivers/media/platform/s5p-tv/Kconfig"
 source "drivers/media/platform/am437x/Kconfig"
 source "drivers/media/platform/xilinx/Kconfig"
+source "drivers/media/platform/tegra/Kconfig"
 
 endif # V4L_PLATFORM_DRIVERS
 
diff --git a/drivers/media/platform/Makefile b/drivers/media/platform/Makefile
index 114f9ab..426e0e4 100644
--- a/drivers/media/platform/Makefile
+++ b/drivers/media/platform/Makefile
@@ -52,4 +52,6 @@ obj-$(CONFIG_VIDEO_AM437X_VPFE)		+= am437x/
 
 obj-$(CONFIG_VIDEO_XILINX)		+= xilinx/
 
+obj-$(CONFIG_VIDEO_TEGRA)		+= tegra/
+
 ccflags-y += -I$(srctree)/drivers/media/i2c
diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
new file mode 100644
index 0000000..a69d1b2
--- /dev/null
+++ b/drivers/media/platform/tegra/Kconfig
@@ -0,0 +1,9 @@
+config VIDEO_TEGRA
+	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"
+	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF
+	select VIDEOBUF2_DMA_CONTIG
+	---help---
+	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.
+
+	  TO compile this driver as a module, choose M here: the module will be
+	  called tegra-video.
diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
new file mode 100644
index 0000000..c8eff0b
--- /dev/null
+++ b/drivers/media/platform/tegra/Makefile
@@ -0,0 +1,3 @@
+tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o
+
+obj-$(CONFIG_VIDEO_TEGRA) += tegra-video.o
diff --git a/drivers/media/platform/tegra/tegra-channel.c b/drivers/media/platform/tegra/tegra-channel.c
new file mode 100644
index 0000000..b0063d2
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-channel.c
@@ -0,0 +1,1074 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitmap.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/host1x.h>
+#include <linux/lcm.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-dev.h>
+#include <media/v4l2-fh.h>
+#include <media/v4l2-ioctl.h>
+#include <media/videobuf2-core.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include <soc/tegra/pmc.h>
+
+#include "tegra-vi.h"
+
+#define TEGRA_VI_SYNCPT_WAIT_TIMEOUT			200
+
+/* VI registers */
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
+#define		SP_PP_LINE_START			4
+#define		SP_PP_FRAME_START			5
+#define		SP_MW_REQ_DONE				6
+#define		SP_MW_ACK_DONE				7
+
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT_CNTRL               0x004
+#define TEGRA_VI_CFG_VI_INCR_SYNCPT_ERROR               0x008
+#define TEGRA_VI_CFG_CTXSW                              0x020
+#define TEGRA_VI_CFG_INTSTATUS                          0x024
+#define TEGRA_VI_CFG_PWM_CONTROL                        0x038
+#define TEGRA_VI_CFG_PWM_HIGH_PULSE                     0x03c
+#define TEGRA_VI_CFG_PWM_LOW_PULSE                      0x040
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_A                 0x044
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_B                 0x048
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_C                 0x04c
+#define TEGRA_VI_CFG_PWM_SELECT_PULSE_D                 0x050
+#define TEGRA_VI_CFG_VGP1                               0x064
+#define TEGRA_VI_CFG_VGP2                               0x068
+#define TEGRA_VI_CFG_VGP3                               0x06c
+#define TEGRA_VI_CFG_VGP4                               0x070
+#define TEGRA_VI_CFG_VGP5                               0x074
+#define TEGRA_VI_CFG_VGP6                               0x078
+#define TEGRA_VI_CFG_INTERRUPT_MASK                     0x08c
+#define TEGRA_VI_CFG_INTERRUPT_TYPE_SELECT              0x090
+#define TEGRA_VI_CFG_INTERRUPT_POLARITY_SELECT          0x094
+#define TEGRA_VI_CFG_INTERRUPT_STATUS                   0x098
+#define TEGRA_VI_CFG_VGP_SYNCPT_CONFIG                  0x0ac
+#define TEGRA_VI_CFG_VI_SW_RESET                        0x0b4
+#define TEGRA_VI_CFG_CG_CTRL                            0x0b8
+#define TEGRA_VI_CFG_VI_MCCIF_FIFOCTRL                  0x0e4
+#define TEGRA_VI_CFG_TIMEOUT_WCOAL_VI                   0x0e8
+#define TEGRA_VI_CFG_DVFS                               0x0f0
+#define TEGRA_VI_CFG_RESERVE                            0x0f4
+#define TEGRA_VI_CFG_RESERVE_1                          0x0f8
+
+/* CSI registers */
+#define TEGRA_VI_CSI_0_BASE                             0x100
+#define TEGRA_VI_CSI_1_BASE                             0x200
+#define TEGRA_VI_CSI_2_BASE                             0x300
+#define TEGRA_VI_CSI_3_BASE                             0x400
+#define TEGRA_VI_CSI_4_BASE                             0x500
+#define TEGRA_VI_CSI_5_BASE                             0x600
+
+#define TEGRA_VI_CSI_SW_RESET                           0x000
+#define TEGRA_VI_CSI_SINGLE_SHOT                        0x004
+#define TEGRA_VI_CSI_SINGLE_SHOT_STATE_UPDATE           0x008
+#define TEGRA_VI_CSI_IMAGE_DEF                          0x00c
+#define TEGRA_VI_CSI_RGB2Y_CTRL                         0x010
+#define TEGRA_VI_CSI_MEM_TILING                         0x014
+#define TEGRA_VI_CSI_IMAGE_SIZE                         0x018
+#define TEGRA_VI_CSI_IMAGE_SIZE_WC                      0x01c
+#define TEGRA_VI_CSI_IMAGE_DT                           0x020
+#define TEGRA_VI_CSI_SURFACE0_OFFSET_MSB                0x024
+#define TEGRA_VI_CSI_SURFACE0_OFFSET_LSB                0x028
+#define TEGRA_VI_CSI_SURFACE1_OFFSET_MSB                0x02c
+#define TEGRA_VI_CSI_SURFACE1_OFFSET_LSB                0x030
+#define TEGRA_VI_CSI_SURFACE2_OFFSET_MSB                0x034
+#define TEGRA_VI_CSI_SURFACE2_OFFSET_LSB                0x038
+#define TEGRA_VI_CSI_SURFACE0_BF_OFFSET_MSB             0x03c
+#define TEGRA_VI_CSI_SURFACE0_BF_OFFSET_LSB             0x040
+#define TEGRA_VI_CSI_SURFACE1_BF_OFFSET_MSB             0x044
+#define TEGRA_VI_CSI_SURFACE1_BF_OFFSET_LSB             0x048
+#define TEGRA_VI_CSI_SURFACE2_BF_OFFSET_MSB             0x04c
+#define TEGRA_VI_CSI_SURFACE2_BF_OFFSET_LSB             0x050
+#define TEGRA_VI_CSI_SURFACE0_STRIDE                    0x054
+#define TEGRA_VI_CSI_SURFACE1_STRIDE                    0x058
+#define TEGRA_VI_CSI_SURFACE2_STRIDE                    0x05c
+#define TEGRA_VI_CSI_SURFACE_HEIGHT0                    0x060
+#define TEGRA_VI_CSI_ISPINTF_CONFIG                     0x064
+#define TEGRA_VI_CSI_ERROR_STATUS                       0x084
+#define TEGRA_VI_CSI_ERROR_INT_MASK                     0x088
+#define TEGRA_VI_CSI_WD_CTRL                            0x08c
+#define TEGRA_VI_CSI_WD_PERIOD                          0x090
+
+#define TEGRA_CSI_CSI_CAP_CIL                           0x808
+#define TEGRA_CSI_CSI_CAP_CSI                           0x818
+#define TEGRA_CSI_CSI_CAP_PP                            0x828
+
+/* CSI Pixel Parser registers */
+#define TEGRA_CSI_PIXEL_PARSER_0_BASE			0x0838
+#define TEGRA_CSI_PIXEL_PARSER_1_BASE			0x086c
+#define TEGRA_CSI_PIXEL_PARSER_2_BASE			0x1038
+#define TEGRA_CSI_PIXEL_PARSER_3_BASE			0x106c
+#define TEGRA_CSI_PIXEL_PARSER_4_BASE			0x1838
+#define TEGRA_CSI_PIXEL_PARSER_5_BASE			0x186c
+
+
+/* CSI Pixel Parser registers */
+#define TEGRA_CSI_INPUT_STREAM_CONTROL                  0x000
+#define TEGRA_CSI_PIXEL_STREAM_CONTROL0                 0x004
+#define TEGRA_CSI_PIXEL_STREAM_CONTROL1                 0x008
+#define TEGRA_CSI_PIXEL_STREAM_GAP                      0x00c
+#define TEGRA_CSI_PIXEL_STREAM_PP_COMMAND               0x010
+#define TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME           0x014
+#define TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK           0x018
+#define TEGRA_CSI_PIXEL_PARSER_STATUS                   0x01c
+#define TEGRA_CSI_CSI_SW_SENSOR_RESET                   0x020
+
+/* CSI PHY registers */
+#define TEGRA_CSI_CIL_PHY_0_BASE			0x0908
+#define TEGRA_CSI_CIL_PHY_1_BASE			0x1108
+#define TEGRA_CSI_CIL_PHY_2_BASE			0x1908
+#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
+
+/* CSI CIL registers */
+#define TEGRA_CSI_CIL_0_BASE				0x092c
+#define TEGRA_CSI_CIL_1_BASE				0x0960
+#define TEGRA_CSI_CIL_2_BASE				0x112c
+#define TEGRA_CSI_CIL_3_BASE				0x1160
+#define TEGRA_CSI_CIL_4_BASE				0x192c
+#define TEGRA_CSI_CIL_5_BASE				0x1960
+
+#define TEGRA_CSI_CIL_PAD_CONFIG0                       0x000
+#define TEGRA_CSI_CIL_PAD_CONFIG1                       0x004
+#define TEGRA_CSI_CIL_PHY_CONTROL                       0x008
+#define TEGRA_CSI_CIL_INTERRUPT_MASK                    0x00c
+#define TEGRA_CSI_CIL_STATUS                            0x010
+#define TEGRA_CSI_CILX_STATUS                           0x014
+#define TEGRA_CSI_CIL_ESCAPE_MODE_COMMAND               0x018
+#define TEGRA_CSI_CIL_ESCAPE_MODE_DATA                  0x01c
+#define TEGRA_CSI_CIL_SW_SENSOR_RESET                   0x020
+
+/* CSI Pattern Generator registers */
+#define TEGRA_CSI_PATTERN_GENERATOR_0_BASE		0x09c4
+#define TEGRA_CSI_PATTERN_GENERATOR_1_BASE		0x09f8
+#define TEGRA_CSI_PATTERN_GENERATOR_2_BASE		0x11c4
+#define TEGRA_CSI_PATTERN_GENERATOR_3_BASE		0x11f8
+#define TEGRA_CSI_PATTERN_GENERATOR_4_BASE		0x19c4
+#define TEGRA_CSI_PATTERN_GENERATOR_5_BASE		0x19f8
+
+#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
+#define TEGRA_CSI_PG_BLANK				0x004
+#define TEGRA_CSI_PG_PHASE				0x008
+#define TEGRA_CSI_PG_RED_FREQ				0x00c
+#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
+#define TEGRA_CSI_PG_GREEN_FREQ				0x014
+#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
+#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
+#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
+#define TEGRA_CSI_PG_AOHDR				0x024
+
+#define TEGRA_CSI_DPCM_CTRL_A				0xad0
+#define TEGRA_CSI_DPCM_CTRL_B				0xad4
+#define TEGRA_CSI_STALL_COUNTER				0xae8
+#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
+#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
+#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
+#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
+#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
+#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
+#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
+
+/* Channel registers */
+static void tegra_channel_write(struct tegra_channel *chan, u32 addr, u32 val)
+{
+	if (chan->bypass)
+		return;
+
+	writel(val, chan->iomem + addr);
+}
+
+static u32 tegra_channel_read(struct tegra_channel *chan, u32 addr)
+{
+	return readl(chan->iomem + addr);
+}
+
+/* CSI registers */
+static void csi_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.csi + addr, val);
+}
+
+static u32 csi_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.csi + addr);
+}
+
+/* CSI pixel parser registers */
+static void pp_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.pp + addr, val);
+}
+
+static u32 pp_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.pp + addr);
+}
+
+/* CIL registers */
+static void cil_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.cil + addr, val);
+}
+
+static u32 cil_read(struct tegra_channel *chan, u32 addr)
+{
+	return tegra_channel_read(chan, chan->regs.cil + addr);
+}
+
+/* CIL PHY registers */
+static void phy_write(struct tegra_channel *chan, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.phy, val);
+}
+
+static u32 phy_read(struct tegra_channel *chan)
+{
+	return tegra_channel_read(chan, chan->regs.phy);
+}
+
+/* Test pattern generator registers */
+static void tpg_write(struct tegra_channel *chan,
+				    u32 addr, u32 val)
+{
+	tegra_channel_write(chan, chan->regs.tpg + addr, val);
+}
+
+/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
+static u32 sp_bit(struct tegra_channel *chan, u32 sp)
+{
+	return (sp + chan->port * 4) << 8;
+}
+
+/* Calculate register base */
+static u32 regs_base(u32 regs_base, int port)
+{
+	return regs_base + (port / 2 * 0x800) + (port & 1) * 0x34;
+}
+
+/* CSI channel IO Rail IDs */
+int tegra_io_rail_csi_ids[] = {
+	TEGRA_IO_RAIL_CSIA,
+	TEGRA_IO_RAIL_CSIB,
+	TEGRA_IO_RAIL_CSIC,
+	TEGRA_IO_RAIL_CSID,
+	TEGRA_IO_RAIL_CSIE,
+	TEGRA_IO_RAIL_CSIF,
+};
+
+void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan)
+{
+	int ret, index;
+	struct v4l2_subdev *subdev = chan->remote_entity->subdev;
+	struct v4l2_subdev_mbus_code_enum code = {
+		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
+	};
+
+
+	bitmap_zero(chan->fmts_bitmap, MAX_FORMAT_NUM);
+
+	while (1) {
+		ret = v4l2_subdev_call(subdev, pad, enum_mbus_code,
+				       NULL, &code);
+		if (ret < 0)
+			/* no more formats */
+			return;
+
+		index = tegra_core_get_idx_by_code(code.code);
+		if (index >= 0)
+			bitmap_set(chan->fmts_bitmap, index, 1);
+
+		code.index++;
+	}
+
+	return;
+}
+
+/* -----------------------------------------------------------------------------
+ * Tegra channel frame setup and capture operations
+ */
+
+static int tegra_channel_capture_setup(struct tegra_channel *chan)
+{
+	int lanes = 2;
+	int port = chan->port;
+	u32 height = chan->format.height;
+	u32 width = chan->format.width;
+	u32 format = chan->fmtinfo->img_fmt;
+	u32 data_type = chan->fmtinfo->img_dt;
+	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
+	struct chan_regs_config *regs = &chan->regs;
+
+	/* CIL PHY register setup */
+	if (port & 0x1) {
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
+	} else {
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
+	}
+
+	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
+	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
+	if (lanes == 4) {
+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
+		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
+		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
+	}
+
+	/* CSI pixel parser registers setup */
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
+		 0x280301f0 | (port & 0x1));
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
+	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
+		 0x3f0000 | (lanes - 1));
+
+	/* CIL PHY register setup */
+	if (lanes == 4)
+		phy_write(chan, 0x0101);
+	else {
+		u32 val = phy_read(chan);
+		if (port & 0x1)
+			val = (val & ~0x100) | 0x100;
+		else
+			val = (val & ~0x1) | 0x1;
+		phy_write(chan, val);
+	}
+
+	/* Test Pattern Generator setup */
+	if (chan->vi->pg_mode) {
+		tpg_write(chan, TEGRA_CSI_PATTERN_GENERATOR_CTRL,
+				((chan->vi->pg_mode - 1) << 2) | 0x1);
+		tpg_write(chan, TEGRA_CSI_PG_PHASE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_RED_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_RED_FREQ_RATE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_GREEN_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_GREEN_FREQ_RATE, 0x0);
+		tpg_write(chan, TEGRA_CSI_PG_BLUE_FREQ, 0x100010);
+		tpg_write(chan, TEGRA_CSI_PG_BLUE_FREQ_RATE, 0x0);
+		phy_write(chan, 0x0202);
+	}
+
+	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_DEF,
+		  ((chan->vi->pg_mode ? 1 : 0) << 24) | (format << 16) | 0x1);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_DT, data_type);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_SIZE_WC, word_count);
+	csi_write(chan, TEGRA_VI_CSI_IMAGE_SIZE,
+		  (height << 16) | width);
+
+	/* Start pixel parser in single shot mode at beginning */
+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf005);
+
+	return 0;
+}
+
+static void tegra_channel_capture_error(struct tegra_channel *chan, int err)
+{
+	u32 val;
+
+#ifdef DEBUG
+	val = tegra_channel_read(chan, TEGRA_CSI_DEBUG_COUNTER_0);
+	dev_err(&chan->video.dev, "TEGRA_CSI_DEBUG_COUNTER_0 0x%08x\n", val);
+#endif
+	val = cil_read(chan, TEGRA_CSI_CIL_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CIL_STATUS 0x%08x\n", val);
+	val = cil_read(chan, TEGRA_CSI_CILX_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CILX_STATUS 0x%08x\n", val);
+	val = pp_read(chan, TEGRA_CSI_PIXEL_PARSER_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_CSI_PIXEL_PARSER_STATUS 0x%08x\n",
+		val);
+	val = csi_read(chan, TEGRA_VI_CSI_ERROR_STATUS);
+	dev_err(&chan->video.dev, "TEGRA_VI_CSI_ERROR_STATUS 0x%08x\n", val);
+}
+
+static int tegra_channel_capture_frame(struct tegra_channel *chan)
+{
+	struct tegra_channel_buffer *buf = chan->active;
+	struct vb2_buffer *vb = &buf->buf;
+	int err = 0;
+	u32 thresh, value, frame_start;
+	int bytes_per_line = chan->format.bytesperline;
+
+	if (!vb2_start_streaming_called(&chan->queue) || !buf)
+		return -EINVAL;
+
+	if (chan->bypass)
+		goto bypass_done;
+
+	/* Program buffer address */
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
+		  0x0);
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
+		  buf->addr);
+	csi_write(chan,
+		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
+		  bytes_per_line);
+
+	/* Program syncpoint */
+	frame_start = sp_bit(chan, SP_PP_FRAME_START);
+	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
+			    frame_start | host1x_syncpt_id(chan->sp));
+
+	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
+
+	/* Use syncpoint to wake up */
+	thresh = host1x_syncpt_incr_max(chan->sp, 1);
+
+	mutex_unlock(&chan->lock);
+	err = host1x_syncpt_wait(chan->sp, thresh,
+			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
+	mutex_lock(&chan->lock);
+
+	if (err) {
+		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
+		tegra_channel_capture_error(chan, err);
+	}
+
+bypass_done:
+	/* Captured one frame */
+	spin_lock_irq(&chan->queued_lock);
+	vb->v4l2_buf.sequence = chan->sequence++;
+	vb->v4l2_buf.field = V4L2_FIELD_NONE;
+	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
+	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
+	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
+	spin_unlock_irq(&chan->queued_lock);
+
+	return err;
+}
+
+static void tegra_channel_work(struct work_struct *work)
+{
+	struct tegra_channel *chan =
+		container_of(work, struct tegra_channel, work);
+
+	while (1) {
+		spin_lock_irq(&chan->queued_lock);
+		if (list_empty(&chan->capture)) {
+			chan->active = NULL;
+			spin_unlock_irq(&chan->queued_lock);
+			return;
+		}
+		chan->active = list_entry(chan->capture.next,
+				struct tegra_channel_buffer, queue);
+		list_del_init(&chan->active->queue);
+		spin_unlock_irq(&chan->queued_lock);
+
+		mutex_lock(&chan->lock);
+		tegra_channel_capture_frame(chan);
+		mutex_unlock(&chan->lock);
+	}
+}
+
+/* -----------------------------------------------------------------------------
+ * videobuf2 queue operations
+ */
+
+static int
+tegra_channel_queue_setup(struct vb2_queue *vq, const struct v4l2_format *fmt,
+		     unsigned int *nbuffers, unsigned int *nplanes,
+		     unsigned int sizes[], void *alloc_ctxs[])
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+
+	/* Make sure the image size is large enough. */
+	if (fmt && fmt->fmt.pix.sizeimage < chan->format.sizeimage)
+		return -EINVAL;
+
+	*nplanes = 1;
+
+	sizes[0] = fmt ? fmt->fmt.pix.sizeimage : chan->format.sizeimage;
+	alloc_ctxs[0] = chan->alloc_ctx;
+
+	return 0;
+}
+
+static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
+
+	buf->chan = chan;
+	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
+
+	return 0;
+}
+
+static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
+
+	/* Put buffer into the  capture queue */
+	spin_lock_irq(&chan->queued_lock);
+	list_add_tail(&buf->queue, &chan->capture);
+	spin_unlock_irq(&chan->queued_lock);
+
+	/* Start work queue to capture data to buffer */
+	if (vb2_start_streaming_called(&chan->queue))
+		schedule_work(&chan->work);
+}
+
+static int tegra_channel_set_stream(struct tegra_channel *chan, bool on)
+{
+	struct media_entity *entity;
+	struct media_pad *pad;
+	struct v4l2_subdev *subdev;
+	int ret = 0;
+
+	entity = &chan->video.entity;
+
+	while (1) {
+		if (entity->num_pads > 1 && (chan->port & 0x1))
+			pad = &entity->pads[2];
+		else
+			pad = &entity->pads[0];
+
+		if (!(pad->flags & MEDIA_PAD_FL_SINK))
+			break;
+
+		pad = media_entity_remote_pad(pad);
+		if (pad == NULL ||
+		    media_entity_type(pad->entity) != MEDIA_ENT_T_V4L2_SUBDEV)
+			break;
+
+		entity = pad->entity;
+		subdev = media_entity_to_v4l2_subdev(entity);
+		ret = v4l2_subdev_call(subdev, video, s_stream, on);
+		if (on && ret < 0 && ret != -ENOIOCTLCMD)
+			return ret;
+	}
+	return ret;
+}
+
+static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+	struct media_pipeline *pipe = chan->video.entity.pipe;
+	struct tegra_channel_buffer *buf, *nbuf;
+	int ret = 0;
+
+	if (!chan->vi->pg_mode && !chan->remote_entity) {
+		dev_err(&chan->video.dev,
+			"is not in TPG mode and has not sensor connected!\n");
+		ret = -EINVAL;
+		goto vb2_queued;
+	}
+
+	mutex_lock(&chan->lock);
+
+	/* Start CIL clock */
+	clk_set_rate(chan->cil_clk, 102000000);
+	clk_prepare_enable(chan->cil_clk);
+
+	/* Disable DPD */
+	ret = tegra_io_rail_power_on(chan->io_id);
+	if (ret < 0) {
+		dev_err(&chan->video.dev,
+			"failed to power on CSI rail: %d\n", ret);
+		goto error_power_on;
+	}
+
+	/* Clean up status */
+	cil_write(chan, TEGRA_CSI_CIL_STATUS, 0xFFFFFFFF);
+	cil_write(chan, TEGRA_CSI_CILX_STATUS, 0xFFFFFFFF);
+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_STATUS, 0xFFFFFFFF);
+	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
+
+	ret = media_entity_pipeline_start(&chan->video.entity, pipe);
+	if (ret < 0)
+		goto error_pipeline_start;
+
+	/* Start the pipeline. */
+	ret = tegra_channel_set_stream(chan, true);
+	if (ret < 0)
+		goto error_set_stream;
+
+	/* Note: Program VI registers after TPG, sensors and CSI streaming */
+	ret = tegra_channel_capture_setup(chan);
+	if (ret < 0)
+		goto error_capture_setup;
+
+	chan->sequence = 0;
+	mutex_unlock(&chan->lock);
+
+	/* Start work queue to capture data to buffer */
+	schedule_work(&chan->work);
+
+	return 0;
+
+error_capture_setup:
+	tegra_channel_set_stream(chan, false);
+error_set_stream:
+	media_entity_pipeline_stop(&chan->video.entity);
+error_pipeline_start:
+	tegra_io_rail_power_off(chan->io_id);
+error_power_on:
+	clk_disable_unprepare(chan->cil_clk);
+	mutex_unlock(&chan->lock);
+vb2_queued:
+	/* Return all queued buffers back to vb2 */
+	spin_lock_irq(&chan->queued_lock);
+	vq->start_streaming_called = 0;
+	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
+		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_QUEUED);
+		list_del(&buf->queue);
+	}
+	spin_unlock_irq(&chan->queued_lock);
+	return ret;
+}
+
+static void tegra_channel_stop_streaming(struct vb2_queue *vq)
+{
+	struct tegra_channel *chan = vb2_get_drv_priv(vq);
+	struct tegra_channel_buffer *buf, *nbuf;
+	u32 thresh, value, mw_ack_done;
+	int err;
+
+	mutex_lock(&chan->lock);
+
+	if (!chan->bypass) {
+		/* Program syncpoint */
+		mw_ack_done = sp_bit(chan, SP_MW_ACK_DONE);
+		tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
+				mw_ack_done | host1x_syncpt_id(chan->sp));
+
+		/* Use syncpoint to wake up */
+		thresh = host1x_syncpt_incr_max(chan->sp, 1);
+		err = host1x_syncpt_wait(chan->sp, thresh,
+				TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
+		if (err)
+			dev_err(&chan->video.dev, "MW_ACK_DONE syncpoint time out!\n");
+	}
+
+	media_entity_pipeline_stop(&chan->video.entity);
+
+	tegra_channel_set_stream(chan, false);
+
+	tegra_io_rail_power_off(chan->io_id);
+	clk_disable_unprepare(chan->cil_clk);
+
+	mutex_unlock(&chan->lock);
+
+	/* Give back all queued buffers to videobuf2. */
+	spin_lock_irq(&chan->queued_lock);
+	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
+		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_ERROR);
+		list_del(&buf->queue);
+	}
+	spin_unlock_irq(&chan->queued_lock);
+	cancel_work_sync(&chan->work);
+}
+
+static struct vb2_ops tegra_channel_queue_qops = {
+	.queue_setup = tegra_channel_queue_setup,
+	.buf_prepare = tegra_channel_buffer_prepare,
+	.buf_queue = tegra_channel_buffer_queue,
+	.wait_prepare = vb2_ops_wait_prepare,
+	.wait_finish = vb2_ops_wait_finish,
+	.start_streaming = tegra_channel_start_streaming,
+	.stop_streaming = tegra_channel_stop_streaming,
+};
+
+/* -----------------------------------------------------------------------------
+ * V4L2 ioctls
+ */
+
+static int
+tegra_channel_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING;
+	cap->capabilities = cap->device_caps | V4L2_CAP_DEVICE_CAPS;
+
+	strlcpy(cap->driver, "tegra-vi", sizeof(cap->driver));
+	strlcpy(cap->card, chan->video.name, sizeof(cap->card));
+	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s:%u",
+		 chan->vi->dev->of_node->name, chan->port);
+
+	return 0;
+}
+
+static int
+tegra_channel_enum_format(struct file *file, void *fh, struct v4l2_fmtdesc *f)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+	int index, i;
+	unsigned long *fmts_bitmap = NULL;
+
+	if (chan->vi->pg_mode)
+		fmts_bitmap = chan->vi->tpg_fmts_bitmap;
+	else if (chan->remote_entity)
+		fmts_bitmap = chan->fmts_bitmap;
+
+	if (!fmts_bitmap ||
+	    f->index > bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM) - 1)
+		return -EINVAL;
+
+	index = -1;
+	for (i = 0; i < f->index + 1; i++)
+		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index + 1);
+
+	f->pixelformat = tegra_video_formats[index].fourcc;
+
+	return 0;
+}
+
+static int
+tegra_channel_get_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	format->fmt.pix = chan->format;
+
+	return 0;
+}
+
+static void
+__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
+		      const struct tegra_video_format **fmtinfo)
+{
+	const struct tegra_video_format *info;
+	unsigned int min_width;
+	unsigned int max_width;
+	unsigned int min_bpl;
+	unsigned int max_bpl;
+	unsigned int width;
+	unsigned int align;
+	unsigned int bpl;
+
+	/* Retrieve format information and select the default format if the
+	 * requested format isn't supported.
+	 */
+	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
+	if (!info)
+		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
+
+	pix->pixelformat = info->fourcc;
+	pix->field = V4L2_FIELD_NONE;
+
+	/* The transfer alignment requirements are expressed in bytes. Compute
+	 * the minimum and maximum values, clamp the requested width and convert
+	 * it back to pixels.
+	 */
+	align = lcm(chan->align, info->bpp);
+	min_width = roundup(TEGRA_MIN_WIDTH, align);
+	max_width = rounddown(TEGRA_MAX_WIDTH, align);
+	width = rounddown(pix->width * info->bpp, align);
+
+	pix->width = clamp(width, min_width, max_width) / info->bpp;
+	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
+			    TEGRA_MAX_HEIGHT);
+
+	/* Clamp the requested bytes per line value. If the maximum bytes per
+	 * line value is zero, the module doesn't support user configurable line
+	 * sizes. Override the requested value with the minimum in that case.
+	 */
+	min_bpl = pix->width * info->bpp;
+	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
+	bpl = rounddown(pix->bytesperline, chan->align);
+
+	pix->bytesperline = clamp(bpl, min_bpl, max_bpl);
+	pix->sizeimage = pix->bytesperline * pix->height;
+
+	if (fmtinfo)
+		*fmtinfo = info;
+}
+
+static int
+tegra_channel_try_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+
+	__tegra_channel_try_format(chan, &format->fmt.pix, NULL);
+	return 0;
+}
+
+static int
+tegra_channel_set_format(struct file *file, void *fh, struct v4l2_format *format)
+{
+	struct v4l2_fh *vfh = file->private_data;
+	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
+	const struct tegra_video_format *info;
+
+	__tegra_channel_try_format(chan, &format->fmt.pix, &info);
+
+	if (vb2_is_busy(&chan->queue))
+		return -EBUSY;
+
+	chan->format = format->fmt.pix;
+	chan->fmtinfo = info;
+
+	return 0;
+}
+
+static const struct v4l2_ioctl_ops tegra_channel_ioctl_ops = {
+	.vidioc_querycap		= tegra_channel_querycap,
+	.vidioc_enum_fmt_vid_cap	= tegra_channel_enum_format,
+	.vidioc_g_fmt_vid_cap		= tegra_channel_get_format,
+	.vidioc_s_fmt_vid_cap		= tegra_channel_set_format,
+	.vidioc_try_fmt_vid_cap		= tegra_channel_try_format,
+	.vidioc_reqbufs			= vb2_ioctl_reqbufs,
+	.vidioc_querybuf		= vb2_ioctl_querybuf,
+	.vidioc_qbuf			= vb2_ioctl_qbuf,
+	.vidioc_dqbuf			= vb2_ioctl_dqbuf,
+	.vidioc_create_bufs		= vb2_ioctl_create_bufs,
+	.vidioc_expbuf			= vb2_ioctl_expbuf,
+	.vidioc_streamon		= vb2_ioctl_streamon,
+	.vidioc_streamoff		= vb2_ioctl_streamoff,
+};
+
+/* -----------------------------------------------------------------------------
+ * V4L2 file operations
+ */
+
+static int tegra_channel_v4l2_open(struct file *file)
+{
+	struct tegra_channel *chan = video_drvdata(file);
+	struct tegra_vi_device *vi = chan->vi;
+	int ret = 0;
+
+	mutex_lock(&vi->lock);
+	ret = v4l2_fh_open(file);
+	if (ret)
+		goto unlock;
+
+	/* The first open then turn on power*/
+	if (!vi->power_on_refcnt) {
+		tegra_vi_power_on(chan->vi);
+
+		usleep_range(5, 100);
+		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
+		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
+		usleep_range(10, 15);
+	}
+	vi->power_on_refcnt++;
+
+unlock:
+	mutex_unlock(&vi->lock);
+	return ret;
+}
+
+static int tegra_channel_v4l2_release(struct file *file)
+{
+	struct tegra_channel *chan = video_drvdata(file);
+	struct tegra_vi_device *vi = chan->vi;
+	int ret = 0;
+
+	mutex_lock(&vi->lock);
+	vi->power_on_refcnt--;
+	/* The last release then turn off power */
+	if (!vi->power_on_refcnt)
+		tegra_vi_power_off(chan->vi);
+	ret = _vb2_fop_release(file, NULL);
+	mutex_unlock(&vi->lock);
+
+	return ret;
+}
+
+static const struct v4l2_file_operations tegra_channel_fops = {
+	.owner		= THIS_MODULE,
+	.unlocked_ioctl	= video_ioctl2,
+	.open		= tegra_channel_v4l2_open,
+	.release	= tegra_channel_v4l2_release,
+	.read		= vb2_fop_read,
+	.poll		= vb2_fop_poll,
+	.mmap		= vb2_fop_mmap,
+};
+
+int tegra_channel_init(struct tegra_vi_device *vi,
+		       struct tegra_channel *chan,
+		       u32 port)
+{
+	int ret;
+
+	chan->vi = vi;
+	chan->port = port;
+	chan->iomem = vi->iomem;
+
+	/* Init channel register base */
+	chan->regs.csi = TEGRA_VI_CSI_0_BASE + port * 0x100;
+	chan->regs.pp = regs_base(TEGRA_CSI_PIXEL_PARSER_0_BASE, port);
+	chan->regs.cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
+	chan->regs.phy = TEGRA_CSI_CIL_PHY_0_BASE + port / 2 * 0x800;
+	chan->regs.tpg = regs_base(TEGRA_CSI_PATTERN_GENERATOR_0_BASE, port);
+
+	/* Init CIL clock */
+	switch (chan->port) {
+	case 0:
+	case 1:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilab");
+		break;
+	case 2:
+	case 3:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilcd");
+		break;
+	case 4:
+	case 5:
+		chan->cil_clk = devm_clk_get(chan->vi->dev, "cile");
+		break;
+	default:
+		dev_err(chan->vi->dev, "wrong port nubmer %d\n", port);
+	}
+	if (IS_ERR(chan->cil_clk)) {
+		dev_err(chan->vi->dev, "Failed to get CIL clock\n");
+		return -EINVAL;
+	}
+
+	/* VI Channel is 64 bytes alignment */
+	chan->align = 64;
+	chan->surface = 0;
+	chan->io_id = tegra_io_rail_csi_ids[chan->port];
+	mutex_init(&chan->lock);
+	mutex_init(&chan->video_lock);
+	INIT_LIST_HEAD(&chan->capture);
+	spin_lock_init(&chan->queued_lock);
+	INIT_WORK(&chan->work, tegra_channel_work);
+
+	/* Init video format */
+	chan->fmtinfo = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
+	chan->format.pixelformat = chan->fmtinfo->fourcc;
+	chan->format.colorspace = V4L2_COLORSPACE_SRGB;
+	chan->format.field = V4L2_FIELD_NONE;
+	chan->format.width = TEGRA_DEF_WIDTH;
+	chan->format.height = TEGRA_DEF_HEIGHT;
+	chan->format.bytesperline = chan->format.width * chan->fmtinfo->bpp;
+	chan->format.sizeimage = chan->format.bytesperline *
+				    chan->format.height;
+
+	/* Initialize the media entity... */
+	chan->pad.flags = MEDIA_PAD_FL_SINK;
+
+	ret = media_entity_init(&chan->video.entity, 1, &chan->pad, 0);
+	if (ret < 0)
+		return ret;
+
+	/* ... and the video node... */
+	chan->video.fops = &tegra_channel_fops;
+	chan->video.v4l2_dev = &vi->v4l2_dev;
+	chan->video.queue = &chan->queue;
+	snprintf(chan->video.name, sizeof(chan->video.name), "%s %s %u",
+		 vi->dev->of_node->name, "output", port);
+	chan->video.vfl_type = VFL_TYPE_GRABBER;
+	chan->video.vfl_dir = VFL_DIR_RX;
+	chan->video.release = video_device_release_empty;
+	chan->video.ioctl_ops = &tegra_channel_ioctl_ops;
+	chan->video.lock = &chan->video_lock;
+
+	video_set_drvdata(&chan->video, chan);
+
+	/* Init host1x interface */
+	INIT_LIST_HEAD(&chan->client.list);
+	chan->client.dev = chan->vi->dev;
+
+	ret = host1x_client_register(&chan->client);
+	if (ret < 0) {
+		dev_err(chan->vi->dev, "failed to register host1x client: %d\n",
+			ret);
+		ret = -ENODEV;
+		goto host1x_register_error;
+	}
+
+	chan->sp = host1x_syncpt_request(chan->client.dev,
+					 HOST1X_SYNCPT_HAS_BASE);
+	if (!chan->sp) {
+		dev_err(chan->vi->dev, "failed to request host1x syncpoint\n");
+		ret = -ENOMEM;
+		goto host1x_sp_error;
+	}
+
+	/* ... and the buffers queue... */
+	chan->alloc_ctx = vb2_dma_contig_init_ctx(&chan->video.dev);
+	if (IS_ERR(chan->alloc_ctx)) {
+		dev_err(chan->vi->dev, "failed to init vb2 buffer\n");
+		ret = -ENOMEM;
+		goto vb2_init_error;
+	}
+
+	chan->queue.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+	chan->queue.io_modes = VB2_MMAP | VB2_DMABUF | VB2_READ;
+	chan->queue.lock = &chan->video_lock;
+	chan->queue.drv_priv = chan;
+	chan->queue.buf_struct_size = sizeof(struct tegra_channel_buffer);
+	chan->queue.ops = &tegra_channel_queue_qops;
+	chan->queue.mem_ops = &vb2_dma_contig_memops;
+	chan->queue.timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC
+				   | V4L2_BUF_FLAG_TSTAMP_SRC_EOF;
+	ret = vb2_queue_init(&chan->queue);
+	if (ret < 0) {
+		dev_err(chan->vi->dev, "failed to initialize VB2 queue\n");
+		goto vb2_queue_error;
+	}
+
+	ret = video_register_device(&chan->video, VFL_TYPE_GRABBER, -1);
+	if (ret < 0) {
+		dev_err(&chan->video.dev, "failed to register video device\n");
+		goto video_register_error;
+	}
+
+	return 0;
+
+video_register_error:
+	vb2_queue_release(&chan->queue);
+vb2_queue_error:
+	vb2_dma_contig_cleanup_ctx(chan->alloc_ctx);
+vb2_init_error:
+	host1x_syncpt_free(chan->sp);
+host1x_sp_error:
+	host1x_client_unregister(&chan->client);
+host1x_register_error:
+	media_entity_cleanup(&chan->video.entity);
+	return ret;
+}
+
+int tegra_channel_cleanup(struct tegra_channel *chan)
+{
+	video_unregister_device(&chan->video);
+
+	vb2_queue_release(&chan->queue);
+	vb2_dma_contig_cleanup_ctx(chan->alloc_ctx);
+
+	host1x_syncpt_free(chan->sp);
+	host1x_client_unregister(&chan->client);
+
+	media_entity_cleanup(&chan->video.entity);
+
+	return 0;
+}
diff --git a/drivers/media/platform/tegra/tegra-core.c b/drivers/media/platform/tegra/tegra-core.c
new file mode 100644
index 0000000..244b9b8
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-core.c
@@ -0,0 +1,295 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver Core Helpers
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#include "tegra-core.h"
+
+const struct tegra_video_format tegra_video_formats[] = {
+	/* RAW 6: TODO */
+
+	/* RAW 7: TODO */
+
+	/* RAW 8 */
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SRGGB8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SRGGB8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SGRBG8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SGRBG8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SGBRG8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SGBRG8,
+	},
+	{
+		TEGRA_VF_RAW8,
+		8,
+		MEDIA_BUS_FMT_SBGGR8_1X8,
+		1,
+		TEGRA_IMAGE_FORMAT_T_L8,
+		TEGRA_IMAGE_DT_RAW8,
+		V4L2_PIX_FMT_SBGGR8,
+	},
+
+	/* RAW 10 */
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SRGGB10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SRGGB10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SGRBG10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SGRBG10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SGBRG10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SGBRG10,
+	},
+	{
+		TEGRA_VF_RAW10,
+		10,
+		MEDIA_BUS_FMT_SBGGR10_1X10,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW10,
+		V4L2_PIX_FMT_SBGGR10,
+	},
+
+	/* RAW 12 */
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SRGGB12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SRGGB12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SGRBG12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SGRBG12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SGBRG12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SGBRG12,
+	},
+	{
+		TEGRA_VF_RAW12,
+		12,
+		MEDIA_BUS_FMT_SBGGR12_1X12,
+		2,
+		TEGRA_IMAGE_FORMAT_T_R16_I,
+		TEGRA_IMAGE_DT_RAW12,
+		V4L2_PIX_FMT_SBGGR12,
+	},
+
+	/* RGB888 */
+	{
+		TEGRA_VF_RGB888,
+		24,
+		MEDIA_BUS_FMT_RGB888_1X32_PADHI,
+		4,
+		TEGRA_IMAGE_FORMAT_T_A8B8G8R8,
+		TEGRA_IMAGE_DT_RGB888,
+		V4L2_PIX_FMT_RGB32,
+	},
+};
+
+/* -----------------------------------------------------------------------------
+ * Helper functions
+ */
+
+int tegra_core_get_formats_array_size(void)
+{
+	return ARRAY_SIZE(tegra_video_formats);
+}
+
+/**
+ * tegra_core_get_word_count - Calculate word count
+ * @frame_width: number of pixels in one frame
+ * @fmt: Tegra Video format struct which has BPP information
+ *
+ * Return: word count number
+ */
+u32 tegra_core_get_word_count(u32 frame_width,
+			      const struct tegra_video_format *fmt)
+{
+	return frame_width * fmt->width / 8;
+}
+
+/**
+ * tegra_core_get_idx_by_code - Retrieve index for a media bus code
+ * @code: the format media bus code
+ *
+ * Return: a index to the format information structure corresponding to the
+ * given V4L2 media bus format @code, or -1 if no corresponding format can
+ * be found.
+ */
+int tegra_core_get_idx_by_code(unsigned int code)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->code == code)
+			return i;
+	}
+
+	return -1;
+}
+
+
+/**
+ * tegra_core_get_format_by_code - Retrieve format information for a media
+ * 				   bus code
+ * @code: the format media bus code
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * given V4L2 media bus format @code, or NULL if no corresponding format can
+ * be found.
+ */
+const struct tegra_video_format *
+tegra_core_get_format_by_code(unsigned int code)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->code == code)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_get_format_by_fourcc - Retrieve format information for a 4CC
+ * @fourcc: the format 4CC
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * given V4L2 format @fourcc, or NULL if no corresponding format can be
+ * found.
+ */
+const struct tegra_video_format *tegra_core_get_format_by_fourcc(u32 fourcc)
+{
+	unsigned int i;
+	const struct tegra_video_format *format;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->fourcc == fourcc)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_of_get_format - Parse a device tree node and return format
+ * 			      information
+ * @node: the device tree node
+ *
+ * Read the nvidia,video-format property from the device tree @node passed as
+ * an argument and return the corresponding format information.
+ *
+ * Return: a pointer to the format information structure corresponding to the
+ * format name and width, or NULL if no corresponding format can be found.
+ */
+const struct tegra_video_format *
+tegra_core_of_get_format(struct device_node *node)
+{
+	u32 vf_code;
+	int i, ret;
+	const struct tegra_video_format *format;
+
+	ret = of_property_read_u32(node, "nvidia,video-format", &vf_code);
+	if (ret < 0)
+		vf_code = TEGRA_VF_DEF;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
+		format = &tegra_video_formats[i];
+
+		if (format->vf_code == vf_code)
+			return format;
+	}
+
+	return NULL;
+}
+
+/**
+ * tegra_core_bytes_per_line - Calculate bytes per line in one frame
+ * @width: frame width
+ * @fmt: Tegra Video format
+ *
+ * Simply calcualte the bytes_per_line and if it's not 64 bytes aligned it
+ * will be padded to 64 boundary.
+ */
+u32 tegra_core_bytes_per_line(u32 width,
+			      const struct tegra_video_format *fmt)
+{
+	u32 bytes_per_line = width * fmt->bpp;
+
+	if (bytes_per_line % 64)
+		bytes_per_line = bytes_per_line +
+				 (64 - (bytes_per_line % 64));
+
+	return bytes_per_line;
+}
diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
new file mode 100644
index 0000000..7d1026b
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-core.h
@@ -0,0 +1,134 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver Core Helpers
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __TEGRA_CORE_H__
+#define __TEGRA_CORE_H__
+
+#include <dt-bindings/media/tegra-vi.h>
+
+#include <media/v4l2-subdev.h>
+
+/* Minimum and maximum width and height common to Tegra video input device. */
+#define TEGRA_MIN_WIDTH		32U
+#define TEGRA_MAX_WIDTH		7680U
+#define TEGRA_MIN_HEIGHT	32U
+#define TEGRA_MAX_HEIGHT	7680U
+
+/* UHD 4K resolution as default resolution for all Tegra video input device. */
+#define TEGRA_DEF_WIDTH		3840
+#define TEGRA_DEF_HEIGHT	2160
+
+#define TEGRA_VF_DEF		TEGRA_VF_RGB888
+#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
+
+/* These go into the TEGRA_VI_CSI_n_IMAGE_DEF registers bits 23:16 */
+#define TEGRA_IMAGE_FORMAT_T_L8                         16
+#define TEGRA_IMAGE_FORMAT_T_R16_I                      32
+#define TEGRA_IMAGE_FORMAT_T_B5G6R5                     33
+#define TEGRA_IMAGE_FORMAT_T_R5G6B5                     34
+#define TEGRA_IMAGE_FORMAT_T_A1B5G5R5                   35
+#define TEGRA_IMAGE_FORMAT_T_A1R5G5B5                   36
+#define TEGRA_IMAGE_FORMAT_T_B5G5R5A1                   37
+#define TEGRA_IMAGE_FORMAT_T_R5G5B5A1                   38
+#define TEGRA_IMAGE_FORMAT_T_A4B4G4R4                   39
+#define TEGRA_IMAGE_FORMAT_T_A4R4G4B4                   40
+#define TEGRA_IMAGE_FORMAT_T_B4G4R4A4                   41
+#define TEGRA_IMAGE_FORMAT_T_R4G4B4A4                   42
+#define TEGRA_IMAGE_FORMAT_T_A8B8G8R8                   64
+#define TEGRA_IMAGE_FORMAT_T_A8R8G8B8                   65
+#define TEGRA_IMAGE_FORMAT_T_B8G8R8A8                   66
+#define TEGRA_IMAGE_FORMAT_T_R8G8B8A8                   67
+#define TEGRA_IMAGE_FORMAT_T_A2B10G10R10                68
+#define TEGRA_IMAGE_FORMAT_T_A2R10G10B10                69
+#define TEGRA_IMAGE_FORMAT_T_B10G10R10A2                70
+#define TEGRA_IMAGE_FORMAT_T_R10G10B10A2                71
+#define TEGRA_IMAGE_FORMAT_T_A8Y8U8V8                   193
+#define TEGRA_IMAGE_FORMAT_T_V8U8Y8A8                   194
+#define TEGRA_IMAGE_FORMAT_T_A2Y10U10V10                197
+#define TEGRA_IMAGE_FORMAT_T_V10U10Y10A2                198
+#define TEGRA_IMAGE_FORMAT_T_Y8_U8__Y8_V8               200
+#define TEGRA_IMAGE_FORMAT_T_Y8_V8__Y8_U8               201
+#define TEGRA_IMAGE_FORMAT_T_U8_Y8__V8_Y8               202
+#define TEGRA_IMAGE_FORMAT_T_T_V8_Y8__U8_Y8             203
+#define TEGRA_IMAGE_FORMAT_T_T_Y8__U8__V8_N444          224
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N444              225
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N444              226
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N422            227
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N422              228
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N422              229
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N420            230
+#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N420              231
+#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N420              232
+#define TEGRA_IMAGE_FORMAT_T_X2Lc10Lb10La10             233
+#define TEGRA_IMAGE_FORMAT_T_A2R6R6R6R6R6               234
+
+/* These go into the TEGRA_VI_CSI_n_CSI_IMAGE_DT registers bits 7:0 */
+#define TEGRA_IMAGE_DT_YUV420_8                         24
+#define TEGRA_IMAGE_DT_YUV420_10                        25
+#define TEGRA_IMAGE_DT_YUV420CSPS_8                     28
+#define TEGRA_IMAGE_DT_YUV420CSPS_10                    29
+#define TEGRA_IMAGE_DT_YUV422_8                         30
+#define TEGRA_IMAGE_DT_YUV422_10                        31
+#define TEGRA_IMAGE_DT_RGB444                           32
+#define TEGRA_IMAGE_DT_RGB555                           33
+#define TEGRA_IMAGE_DT_RGB565                           34
+#define TEGRA_IMAGE_DT_RGB666                           35
+#define TEGRA_IMAGE_DT_RGB888                           36
+#define TEGRA_IMAGE_DT_RAW6                             40
+#define TEGRA_IMAGE_DT_RAW7                             41
+#define TEGRA_IMAGE_DT_RAW8                             42
+#define TEGRA_IMAGE_DT_RAW10                            43
+#define TEGRA_IMAGE_DT_RAW12                            44
+#define TEGRA_IMAGE_DT_RAW14                            45
+
+/**
+ * struct tegra_video_format - Tegra video format description
+ * @vf_code: video format code
+ * @width: format width in bits per component
+ * @code: media bus format code
+ * @bpp: bytes per pixel (when stored in memory)
+ * @img_fmt: image format
+ * @img_dt: image data type
+ * @fourcc: V4L2 pixel format FCC identifier
+ * @description: format description, suitable for userspace
+ */
+struct tegra_video_format {
+	u32 vf_code;
+	u32 width;
+	u32 code;
+	u32 bpp;
+	u32 img_fmt;
+	u32 img_dt;
+	u32 fourcc;
+};
+
+extern const struct tegra_video_format tegra_video_formats[];
+
+int tegra_core_get_formats_array_size(void);
+
+u32 tegra_core_get_word_count(u32 frame_width,
+			      const struct tegra_video_format *fmt);
+int tegra_core_get_idx_by_code(unsigned int code);
+const struct tegra_video_format *tegra_core_get_format_by_code(unsigned int
+							       code);
+const struct tegra_video_format *tegra_core_get_format_by_fourcc(u32 fourcc);
+const struct tegra_video_format *tegra_core_of_get_format(struct device_node
+							  *node);
+u32 tegra_core_bytes_per_line(u32 width,
+				     const struct tegra_video_format *fmt);
+int tegra_core_enum_mbus_code(struct v4l2_subdev *subdev,
+			struct v4l2_subdev_pad_config *cfg,
+			struct v4l2_subdev_mbus_code_enum *code);
+int tegra_core_enum_frame_size(struct v4l2_subdev *subdev,
+			 struct v4l2_subdev_pad_config *cfg,
+			 struct v4l2_subdev_frame_size_enum *fse);
+#endif
diff --git a/drivers/media/platform/tegra/tegra-vi.c b/drivers/media/platform/tegra/tegra-vi.c
new file mode 100644
index 0000000..65ba412
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-vi.c
@@ -0,0 +1,585 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/clk.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_graph.h>
+#include <linux/platform_device.h>
+#include <linux/regulator/consumer.h>
+#include <linux/reset.h>
+#include <linux/slab.h>
+
+#include <media/media-device.h>
+#include <media/v4l2-async.h>
+#include <media/v4l2-common.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-of.h>
+
+#include <soc/tegra/pmc.h>
+
+#include "tegra-vi.h"
+
+/* In TPG mode, VI only support 2 formats */
+static void vi_tpg_fmts_bitmap_init(struct tegra_vi_device *vi)
+{
+	int index;
+
+	bitmap_zero(vi->tpg_fmts_bitmap, MAX_FORMAT_NUM);
+
+	index = tegra_core_get_idx_by_code(MEDIA_BUS_FMT_SRGGB10_1X10);
+	bitmap_set(vi->tpg_fmts_bitmap, index, 1);
+
+	index = tegra_core_get_idx_by_code(MEDIA_BUS_FMT_RGB888_1X32_PADHI);
+	bitmap_set(vi->tpg_fmts_bitmap, index, 1);
+}
+
+/*
+ * Control Config
+ */
+
+static const char *const vi_pattern_strings[] = {
+	"Disabled",
+	"Black/White Direct Mode",
+	"Color Patch Mode",
+};
+
+static int vi_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+	struct tegra_vi_device *vi = container_of(ctrl->handler,
+						  struct tegra_vi_device,
+						  ctrl_handler);
+	switch (ctrl->id) {
+	case V4L2_CID_TEST_PATTERN:
+		vi->pg_mode = ctrl->val;
+		break;
+	}
+
+	return 0;
+}
+
+static const struct v4l2_ctrl_ops vi_ctrl_ops = {
+	.s_ctrl	= vi_s_ctrl,
+};
+
+/* -----------------------------------------------------------------------------
+ * Media Controller and V4L2
+ */
+
+static void tegra_vi_v4l2_cleanup(struct tegra_vi_device *vi)
+{
+	v4l2_ctrl_handler_free(&vi->ctrl_handler);
+	v4l2_device_unregister(&vi->v4l2_dev);
+	media_device_unregister(&vi->media_dev);
+}
+
+static int tegra_vi_v4l2_init(struct tegra_vi_device *vi)
+{
+	int ret;
+
+	vi->media_dev.dev = vi->dev;
+	strlcpy(vi->media_dev.model, "NVIDIA Tegra Video Input Device",
+		sizeof(vi->media_dev.model));
+	vi->media_dev.hw_revision = 0;
+
+	ret = media_device_register(&vi->media_dev);
+	if (ret < 0) {
+		dev_err(vi->dev, "media device registration failed (%d)\n",
+			ret);
+		return ret;
+	}
+
+	vi->v4l2_dev.mdev = &vi->media_dev;
+	ret = v4l2_device_register(vi->dev, &vi->v4l2_dev);
+	if (ret < 0) {
+		dev_err(vi->dev, "V4L2 device registration failed (%d)\n",
+			ret);
+		goto register_error;
+	}
+
+	v4l2_ctrl_handler_init(&vi->ctrl_handler, 1);
+	vi->pattern = v4l2_ctrl_new_std_menu_items(&vi->ctrl_handler,
+					&vi_ctrl_ops, V4L2_CID_TEST_PATTERN,
+					ARRAY_SIZE(vi_pattern_strings) - 1,
+					0, 0, vi_pattern_strings);
+
+	if (vi->ctrl_handler.error) {
+		dev_err(vi->dev, "failed to add controls\n");
+		ret = vi->ctrl_handler.error;
+		goto ctrl_error;
+	}
+	vi->v4l2_dev.ctrl_handler = &vi->ctrl_handler;
+
+	ret = v4l2_ctrl_handler_setup(&vi->ctrl_handler);
+	if (ret < 0) {
+		dev_err(vi->dev, "failed to set controls\n");
+		goto ctrl_error;
+	}
+	return 0;
+
+
+ctrl_error:
+	v4l2_ctrl_handler_free(&vi->ctrl_handler);
+	v4l2_device_unregister(&vi->v4l2_dev);
+register_error:
+	media_device_unregister(&vi->media_dev);
+	return ret;
+}
+
+/* -----------------------------------------------------------------------------
+ * Platform Device Driver
+ */
+
+int tegra_vi_power_on(struct tegra_vi_device *vi)
+{
+	int ret;
+
+	ret = regulator_enable(vi->vi_reg);
+	if (ret)
+		return ret;
+
+	ret = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_VENC,
+						vi->vi_clk, vi->vi_rst);
+	if (ret) {
+		regulator_disable(vi->vi_reg);
+		return ret;
+	}
+
+	clk_prepare_enable(vi->csi_clk);
+
+	clk_set_rate(vi->parent_clk, 408000000);
+	clk_set_rate(vi->vi_clk, 408000000);
+	clk_set_rate(vi->csi_clk, 408000000);
+
+	return 0;
+}
+
+void tegra_vi_power_off(struct tegra_vi_device *vi)
+{
+	clk_disable_unprepare(vi->csi_clk);
+	tegra_powergate_power_off(TEGRA_POWERGATE_VENC);
+	regulator_disable(vi->vi_reg);
+}
+
+static int tegra_vi_channels_init(struct tegra_vi_device *vi)
+{
+	int i, ret;
+	struct tegra_channel *chan;
+
+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
+		chan = &vi->chans[i];
+
+		ret = tegra_channel_init(vi, chan, i);
+		if (ret < 0) {
+			dev_err(vi->dev, "channel %d init failed\n", i);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int tegra_vi_channels_cleanup(struct tegra_vi_device *vi)
+{
+	int i, ret;
+	struct tegra_channel *chan;
+
+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
+		chan = &vi->chans[i];
+
+		ret = tegra_channel_cleanup(chan);
+		if (ret < 0) {
+			dev_err(vi->dev, "channel %d cleanup failed\n", i);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+/* -----------------------------------------------------------------------------
+ * Graph Management
+ */
+
+static struct tegra_vi_graph_entity *
+tegra_vi_graph_find_entity(struct tegra_vi_device *vi,
+		       const struct device_node *node)
+{
+	struct tegra_vi_graph_entity *entity;
+
+	list_for_each_entry(entity, &vi->entities, list) {
+		if (entity->node == node)
+			return entity;
+	}
+
+	return NULL;
+}
+
+static int tegra_vi_graph_build_links(struct tegra_vi_device *vi)
+{
+	u32 link_flags = MEDIA_LNK_FL_ENABLED;
+	struct device_node *node = vi->dev->of_node;
+	struct media_entity *source;
+	struct media_entity *sink;
+	struct media_pad *source_pad;
+	struct media_pad *sink_pad;
+	struct tegra_vi_graph_entity *ent;
+	struct v4l2_of_link link;
+	struct device_node *ep = NULL;
+	struct device_node *next;
+	struct tegra_channel *chan;
+	int ret = 0;
+
+
+	dev_dbg(vi->dev, "creating links for channels\n");
+
+	while (1) {
+		/* Get the next endpoint and parse its link. */
+		next = of_graph_get_next_endpoint(node, ep);
+		if (next == NULL)
+			break;
+
+		of_node_put(ep);
+		ep = next;
+
+		dev_dbg(vi->dev, "processing endpoint %s\n", ep->full_name);
+
+		ret = v4l2_of_parse_link(ep, &link);
+		if (ret < 0) {
+			dev_err(vi->dev, "failed to parse link for %s\n",
+				ep->full_name);
+			continue;
+		}
+
+		if (link.local_port > MAX_CHAN_NUM) {
+			dev_err(vi->dev, "wrong channel number for port %u\n",
+				link.local_port);
+			v4l2_of_put_link(&link);
+			ret = -EINVAL;
+			break;
+		}
+
+		chan = &vi->chans[link.local_port];
+
+		dev_dbg(vi->dev, "creating link for channel %s\n",
+			chan->video.name);
+
+		/* Find the remote entity. */
+		ent = tegra_vi_graph_find_entity(vi, link.remote_node);
+		if (ent == NULL) {
+			dev_err(vi->dev, "no entity found for %s\n",
+				link.remote_node->full_name);
+			v4l2_of_put_link(&link);
+			ret = -ENODEV;
+			break;
+		}
+
+		if (link.remote_port >= ent->entity->num_pads) {
+			dev_err(vi->dev, "invalid port number %u on %s\n",
+				link.remote_port, link.remote_node->full_name);
+			v4l2_of_put_link(&link);
+			ret = -EINVAL;
+			break;
+		}
+
+		source = ent->entity;
+		source_pad = &source->pads[link.remote_port];
+		sink = &chan->video.entity;
+		sink_pad = &chan->pad;
+		chan->remote_entity = ent;
+
+		v4l2_of_put_link(&link);
+
+		/* Create the media link. */
+		dev_dbg(vi->dev, "creating %s:%u -> %s:%u link\n",
+			source->name, source_pad->index,
+			sink->name, sink_pad->index);
+
+		ret = media_entity_create_link(source, source_pad->index,
+					       sink, sink_pad->index,
+					       link_flags);
+		if (ret < 0) {
+			dev_err(vi->dev,
+				"failed to create %s:%u -> %s:%u link\n",
+				source->name, source_pad->index,
+				sink->name, sink_pad->index);
+			break;
+		}
+
+		tegra_channel_fmts_bitmap_init(chan);
+	}
+
+	of_node_put(ep);
+	return ret;
+}
+
+static int tegra_vi_graph_notify_complete(struct v4l2_async_notifier *notifier)
+{
+	struct tegra_vi_device *vi =
+		container_of(notifier, struct tegra_vi_device, notifier);
+	int ret;
+
+	dev_dbg(vi->dev, "notify complete, all subdevs registered\n");
+
+	/* Create links for every entity. */
+	ret = tegra_vi_graph_build_links(vi);
+	if (ret < 0)
+		return ret;
+
+	ret = v4l2_device_register_subdev_nodes(&vi->v4l2_dev);
+	if (ret < 0)
+		dev_err(vi->dev, "failed to register subdev nodes\n");
+
+	return ret;
+}
+
+static int tegra_vi_graph_notify_bound(struct v4l2_async_notifier *notifier,
+				   struct v4l2_subdev *subdev,
+				   struct v4l2_async_subdev *asd)
+{
+	struct tegra_vi_device *vi =
+		container_of(notifier, struct tegra_vi_device, notifier);
+	struct tegra_vi_graph_entity *entity;
+
+	/* Locate the entity corresponding to the bound subdev and store the
+	 * subdev pointer.
+	 */
+	list_for_each_entry(entity, &vi->entities, list) {
+		if (entity->node != subdev->dev->of_node)
+			continue;
+
+		if (entity->subdev) {
+			dev_err(vi->dev, "duplicate subdev for node %s\n",
+				entity->node->full_name);
+			return -EINVAL;
+		}
+
+		dev_dbg(vi->dev, "subdev %s bound\n", subdev->name);
+		entity->entity = &subdev->entity;
+		entity->subdev = subdev;
+		return 0;
+	}
+
+	dev_err(vi->dev, "no entity for subdev %s\n", subdev->name);
+	return -EINVAL;
+}
+
+
+static void tegra_vi_graph_cleanup(struct tegra_vi_device *vi)
+{
+	struct tegra_vi_graph_entity *entityp;
+	struct tegra_vi_graph_entity *entity;
+
+	v4l2_async_notifier_unregister(&vi->notifier);
+
+	list_for_each_entry_safe(entity, entityp, &vi->entities, list) {
+		of_node_put(entity->node);
+		list_del(&entity->list);
+	}
+}
+
+static int tegra_vi_graph_init(struct tegra_vi_device *vi)
+{
+	struct device_node *node = vi->dev->of_node;
+	struct device_node *ep = NULL;
+	struct device_node *next; 
+	struct device_node *remote = NULL;
+	struct tegra_vi_graph_entity *entity;
+	struct v4l2_async_subdev **subdevs = NULL;
+	unsigned int num_subdevs;
+	int ret = 0, i;
+
+	/* Parse all the remote entities and put them into the list */
+	while (1) {
+		next = of_graph_get_next_endpoint(node, ep);
+		if (!next)
+			break;
+
+		of_node_put(ep);
+		ep = next;
+
+		remote = of_graph_get_remote_port_parent(ep);
+		if (!remote) {
+			ret = -EINVAL;
+			break;
+		}
+
+		entity = devm_kzalloc(vi->dev, sizeof(*entity), GFP_KERNEL);
+		if (entity == NULL) {
+			of_node_put(remote);
+			ret = -ENOMEM;
+			break;
+		}
+
+		entity->node = remote;
+		entity->asd.match_type = V4L2_ASYNC_MATCH_OF;
+		entity->asd.match.of.node = remote;
+		list_add_tail(&entity->list, &vi->entities);
+		vi->num_subdevs++;
+	}
+	of_node_put(ep);
+
+	if (!vi->num_subdevs) {
+		dev_warn(vi->dev, "no subdev found in graph\n");
+		goto done;
+	}
+
+	/* Register the subdevices notifier. */
+	num_subdevs = vi->num_subdevs;
+	subdevs = devm_kzalloc(vi->dev, sizeof(*subdevs) * num_subdevs,
+			       GFP_KERNEL);
+	if (subdevs == NULL) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	i = 0;
+	list_for_each_entry(entity, &vi->entities, list)
+		subdevs[i++] = &entity->asd;
+
+	vi->notifier.subdevs = subdevs;
+	vi->notifier.num_subdevs = num_subdevs;
+	vi->notifier.bound = tegra_vi_graph_notify_bound;
+	vi->notifier.complete = tegra_vi_graph_notify_complete;
+
+	ret = v4l2_async_notifier_register(&vi->v4l2_dev, &vi->notifier);
+	if (ret < 0) {
+		dev_err(vi->dev, "notifier registration failed\n");
+		goto done;
+	}
+
+	return 0;
+
+done:
+	if (ret < 0)
+		tegra_vi_graph_cleanup(vi);
+
+	return ret;
+}
+
+static int tegra_vi_probe(struct platform_device *pdev)
+{
+	struct resource *res;
+	struct tegra_vi_device *vi;
+	int ret = 0;
+
+	vi = devm_kzalloc(&pdev->dev, sizeof(*vi), GFP_KERNEL);
+	if (!vi)
+		return -ENOMEM;
+
+	vi->dev = &pdev->dev;
+	INIT_LIST_HEAD(&vi->entities);
+	mutex_init(&vi->lock);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	vi->iomem = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(vi->iomem))
+		return PTR_ERR(vi->iomem);
+
+	vi->vi_rst = devm_reset_control_get(&pdev->dev, "vi");
+	if (IS_ERR(vi->vi_rst)) {
+		dev_err(&pdev->dev, "Failed to get vi reset\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->vi_clk = devm_clk_get(&pdev->dev, "vi");
+	if (IS_ERR(vi->vi_clk)) {
+		dev_err(&pdev->dev, "Failed to get vi clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->parent_clk = devm_clk_get(&pdev->dev, "parent");
+	if (IS_ERR(vi->parent_clk)) {
+		dev_err(&pdev->dev, "Failed to get VI parent clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	ret = clk_set_parent(vi->vi_clk, vi->parent_clk);
+	if (ret < 0)
+		return ret;
+
+	vi->csi_clk = devm_clk_get(&pdev->dev, "csi");
+	if (IS_ERR(vi->csi_clk)) {
+		dev_err(&pdev->dev, "Failed to get csi clock\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi->vi_reg = devm_regulator_get(&pdev->dev, "avdd-dsi-csi");
+	if (IS_ERR(vi->vi_reg)) {
+		dev_err(&pdev->dev, "Failed to get avdd-dsi-csi regulators\n");
+		return -EPROBE_DEFER;
+	}
+
+	vi_tpg_fmts_bitmap_init(vi);
+
+	ret = tegra_vi_v4l2_init(vi);
+	if (ret < 0)
+		return ret;
+
+	/* Check whether VI is in test pattern generator (TPG) mode */
+	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
+			     &vi->pg_mode);
+
+	/* Init Tegra VI channels */
+	ret = tegra_vi_channels_init(vi);
+	if (ret < 0)
+		goto channels_error;
+
+	/* Setup media links between VI and external sensor subdev. */
+	ret = tegra_vi_graph_init(vi);
+	if (ret < 0)
+		goto graph_error;
+
+	platform_set_drvdata(pdev, vi);
+
+	dev_info(vi->dev, "device registered\n");
+
+	return 0;
+
+graph_error:
+	tegra_vi_channels_cleanup(vi);
+channels_error:
+	tegra_vi_v4l2_cleanup(vi);
+	return ret;
+}
+
+static int tegra_vi_remove(struct platform_device *pdev)
+{
+	struct tegra_vi_device *vi = platform_get_drvdata(pdev);
+
+	tegra_vi_graph_cleanup(vi);
+	tegra_vi_channels_cleanup(vi);
+	tegra_vi_v4l2_cleanup(vi);
+
+	return 0;
+}
+
+static const struct of_device_id tegra_vi_of_id_table[] = {
+	{ .compatible = "nvidia,tegra210-vi" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, tegra_vi_of_id_table);
+
+static struct platform_driver tegra_vi_driver = {
+	.driver = {
+		.name = "tegra-vi",
+		.of_match_table = tegra_vi_of_id_table,
+	},
+	.probe = tegra_vi_probe,
+	.remove = tegra_vi_remove,
+};
+
+module_platform_driver(tegra_vi_driver);
+
+MODULE_AUTHOR("Bryan Wu <pengw@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra Video Input Device Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/media/platform/tegra/tegra-vi.h b/drivers/media/platform/tegra/tegra-vi.h
new file mode 100644
index 0000000..d30a6ec
--- /dev/null
+++ b/drivers/media/platform/tegra/tegra-vi.h
@@ -0,0 +1,224 @@
+/*
+ * NVIDIA Tegra Video Input Device
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __TEGRA_VI_H__
+#define __TEGRA_VI_H__
+
+#include <linux/host1x.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/spinlock.h>
+#include <linux/videodev2.h>
+
+#include <media/media-device.h>
+#include <media/media-entity.h>
+#include <media/v4l2-async.h>
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-dev.h>
+#include <media/videobuf2-core.h>
+
+#include "tegra-core.h"
+
+#define MAX_CHAN_NUM	6
+#define MAX_FORMAT_NUM	64
+
+/**
+ * struct tegra_channel_buffer - video channel buffer
+ * @buf: vb2 buffer base object
+ * @queue: buffer list entry in the channel queued buffers list
+ * @chan: channel that uses the buffer
+ * @addr: Tegra IOVA buffer address for VI output
+ */
+struct tegra_channel_buffer {
+	struct vb2_buffer buf;
+	struct list_head queue;
+	struct tegra_channel *chan;
+
+	dma_addr_t addr;
+};
+
+#define to_tegra_channel_buffer(vb) \
+	container_of(vb, struct tegra_channel_buffer, buf)
+
+
+struct chan_regs_config {
+	u32 csi;
+	u32 pp;
+	u32 cil;
+	u32 phy;
+	u32 tpg;
+};
+
+/**
+ * struct tegra_vi_graph_entity - Entity in the video graph
+ * @list: list entry in a graph entities list
+ * @node: the entity's DT node
+ * @entity: media entity, from the corresponding V4L2 subdev
+ * @asd: subdev asynchronous registration information
+ * @subdev: V4L2 subdev
+ */
+struct tegra_vi_graph_entity {
+	struct list_head list;
+	struct device_node *node;
+	struct media_entity *entity;
+
+	struct v4l2_async_subdev asd;
+	struct v4l2_subdev *subdev;
+};
+
+/**
+ * struct tegra_channel - Tegra video channel
+ * @list: list entry in a composite device dmas list
+ * @video: V4L2 video device associated with the video channel
+ * @video_lock:
+ * @pad: media pad for the video device entity
+ * @pipe: pipeline belonging to the channel
+ *
+ * @vi: composite device DT node port number for the channel
+ *
+ * @client: host1x client struct of Tegra DRM
+ * @sp: host1x syncpoint pointer
+ *
+ * @work: kernel workqueue structure of this video channel
+ * @lock: protects the @format, @fmtinfo, @queue and @work fields
+ *
+ * @format: active V4L2 pixel format
+ * @fmtinfo: format information corresponding to the active @format
+ *
+ * @queue: vb2 buffers queue
+ * @alloc_ctx: allocation context for the vb2 @queue
+ * @sequence: V4L2 buffers sequence number
+ *
+ * @capture: list of queued buffers for capture
+ * @active: active buffer for capture
+ * @queued_lock: protects the buf_queued list
+ *
+ * @iomem: root register base
+ * @regs: CSI/CIL/PHY register bases
+ * @cil_clk: clock for CIL
+ * @align: channel buffer alignment, default is 64
+ * @port: CSI port of this video channel
+ * @surface: output memory surface number
+ * @io_id: Tegra IO rail ID of this video channel
+ * @bypass: a flag to bypass register write
+ *
+ * @fmts_bitmap: a bitmap for formats supported
+ *
+ * @remote_entity: remote media entity for external sensor
+ */
+struct tegra_channel {
+	struct list_head list;
+	struct video_device video;
+	struct mutex video_lock;
+	struct media_pad pad;
+	struct media_pipeline pipe;
+
+	struct tegra_vi_device *vi;
+
+	struct host1x_client client;
+	struct host1x_syncpt *sp;
+
+	struct work_struct work;
+	struct mutex lock;
+
+	struct v4l2_pix_format format;
+	const struct tegra_video_format *fmtinfo;
+
+	struct vb2_queue queue;
+	void *alloc_ctx;
+	u32 sequence;
+
+	struct list_head capture;
+	struct tegra_channel_buffer *active;
+	spinlock_t queued_lock;
+
+	void __iomem *iomem;
+	struct chan_regs_config regs;
+	struct clk *cil_clk;
+	int align;
+	u32 port;
+	u32 surface;
+	int io_id;
+	int bypass;
+
+	DECLARE_BITMAP(fmts_bitmap, MAX_FORMAT_NUM);
+
+	struct tegra_vi_graph_entity *remote_entity;
+};
+
+#define to_tegra_channel(vdev) \
+	container_of(vdev, struct tegra_channel, video)
+
+/**
+ * struct tegra_vi_device - NVIDIA Tegra Video Input device structure
+ * @v4l2_dev: V4L2 device
+ * @media_dev: media device
+ * @dev: device struct
+ *
+ * @iomem: register base
+ * @vi_clk: main clock for VI block
+ * @parent_clk: parent clock of VI clock
+ * @csi_clk: clock for CSI
+ * @vi_rst: reset controler
+ * @vi_reg: regulator for VI hardware, normally it avdd_dsi_csi
+ *
+ * @lock: mutex lock to protect power on/off operations
+ * @power_on_refcnt: reference count for power on/off operations
+ *
+ * @notifier: V4L2 asynchronous subdevs notifier
+ * @entities: entities in the graph as a list of tegra_vi_graph_entity
+ * @num_subdevs: number of subdevs in the pipeline
+ *
+ * @channels: list of channels at the pipeline output and input
+ *
+ * @ctrl_handler: V4L2 control handler
+ * @pattern: test pattern generator V4L2 control
+ * @pg_mode: test pattern generator mode (disabled/direct/patch)
+ * @tpg_fmts_bitmap: a bitmap for formats in test pattern generator mode
+ */
+struct tegra_vi_device {
+	struct v4l2_device v4l2_dev;
+	struct media_device media_dev;
+	struct device *dev;
+
+	void __iomem *iomem;
+	struct clk *vi_clk;
+	struct clk *parent_clk;
+	struct clk *csi_clk;
+	struct reset_control *vi_rst;
+	struct regulator *vi_reg;
+
+	struct mutex lock;
+	int power_on_refcnt;
+
+	struct v4l2_async_notifier notifier;
+	struct list_head entities;
+	unsigned int num_subdevs;
+
+	struct tegra_channel chans[MAX_CHAN_NUM];
+
+	struct v4l2_ctrl_handler ctrl_handler;
+	struct v4l2_ctrl *pattern;
+	int pg_mode;
+	DECLARE_BITMAP(tpg_fmts_bitmap, MAX_FORMAT_NUM);
+};
+
+int tegra_vi_power_on(struct tegra_vi_device *vi);
+void tegra_vi_power_off(struct tegra_vi_device *vi);
+
+int tegra_channel_init(struct tegra_vi_device *vi,
+		       struct tegra_channel *chan, u32 port);
+int tegra_channel_cleanup(struct tegra_channel *chan);
+void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan);
+
+#endif /* __TEGRA_VI_H__ */
diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
new file mode 100644
index 0000000..5fdea5b
--- /dev/null
+++ b/include/dt-bindings/media/tegra-vi.h
@@ -0,0 +1,35 @@
+/*
+ * NVIDIA Tegra Video Input Device Driver
+ *
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Bryan Wu <pengw@nvidia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
+#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
+
+/*
+ * Supported CSI to VI Data Formats
+ */
+#define TEGRA_VF_RAW6		0
+#define TEGRA_VF_RAW7		1
+#define TEGRA_VF_RAW8		2
+#define TEGRA_VF_RAW10		3
+#define TEGRA_VF_RAW12		4
+#define TEGRA_VF_RAW14		5
+#define TEGRA_VF_EMBEDDED8	6
+#define TEGRA_VF_RGB565		7
+#define TEGRA_VF_RGB555		8
+#define TEGRA_VF_RGB888		9
+#define TEGRA_VF_RGB444		10
+#define TEGRA_VF_RGB666		11
+#define TEGRA_VF_YUV422		12
+#define TEGRA_VF_YUV420		13
+#define TEGRA_VF_YUV420_CSPS	14
+
+#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree
  2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
                   ` (3 preceding siblings ...)
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
@ 2015-08-21  0:51 ` Bryan Wu
  4 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-21  0:51 UTC (permalink / raw)
  To: hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

Following device tree support for Tegra VI now:
 - "vi" node which might have 6 ports/endpoints
 - in TPG mode, "vi" node don't need to define any ports/endpoints
 - ports/endpoints defines the link between VI and external sensors.

Signed-off-by: Bryan Wu <pengw@nvidia.com>
---
 arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts |  8 ++++++++
 arch/arm64/boot/dts/nvidia/tegra210.dtsi          | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts b/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
index d4ee460..534ada52 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra210-p2571-e01.dts
@@ -7,6 +7,14 @@
 	model = "NVIDIA Tegra210 P2571 reference board (E.1)";
 	compatible = "nvidia,p2571-e01", "nvidia,tegra210";
 
+	host1x@0,50000000 {
+		vi@0,54080000 {
+			status = "okay";
+
+			avdd-dsi-csi-supply = <&vdd_dsi_csi>;
+		};
+	};
+
 	pinmux: pinmux@0,700008d4 {
 		pinctrl-names = "boot";
 		pinctrl-0 = <&state_boot>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 1168bcd..78bfaad 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -112,6 +112,19 @@
 			reg = <0x0 0x54080000 0x0 0x00040000>;
 			interrupts = <GIC_SPI 69 IRQ_TYPE_LEVEL_HIGH>;
 			status = "disabled";
+			clocks = <&tegra_car TEGRA210_CLK_VI>,
+				 <&tegra_car TEGRA210_CLK_CSI>,
+				 <&tegra_car TEGRA210_CLK_PLL_C>,
+				 <&tegra_car TEGRA210_CLK_CILAB>,
+				 <&tegra_car TEGRA210_CLK_CILCD>,
+				 <&tegra_car TEGRA210_CLK_CILE>;
+			clock-names = "vi", "csi", "parent", "cilab", "cilcd", "cile";
+			resets = <&tegra_car 20>;
+			reset-names = "vi";
+
+			power-domains = <&pmc TEGRA_POWERGATE_VENC>;
+
+			iommus = <&mc TEGRA_SWGROUP_VI>;
 		};
 
 		tsec@0,54100000 {
-- 
2.1.4


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
@ 2015-08-21  9:28   ` Hans Verkuil
  2015-08-25  0:43     ` Bryan Wu
  2015-08-21 13:03   ` Thierry Reding
  1 sibling, 1 reply; 25+ messages in thread
From: Hans Verkuil @ 2015-08-21  9:28 UTC (permalink / raw)
  To: Bryan Wu, hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer

Hi Bryan,

Thanks for contributing this driver, very much appreciated.

I do have some comments below, basically about the same things we discussed
privately before.

On 08/21/2015 02:51 AM, Bryan Wu wrote:
> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
> controller which can support up to 6 MIPI CSI camera sensors.
> 
> This patch adds a V4L2 media controller and capture driver to support
> Tegra VI hardware. It's verified with Tegra built-in test pattern
> generator.
> 
> Signed-off-by: Bryan Wu <pengw@nvidia.com>
> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/platform/Kconfig               |    1 +
>  drivers/media/platform/Makefile              |    2 +
>  drivers/media/platform/tegra/Kconfig         |    9 +
>  drivers/media/platform/tegra/Makefile        |    3 +
>  drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>  drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>  drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>  drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>  drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>  include/dt-bindings/media/tegra-vi.h         |   35 +
>  10 files changed, 2362 insertions(+)
>  create mode 100644 drivers/media/platform/tegra/Kconfig
>  create mode 100644 drivers/media/platform/tegra/Makefile
>  create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>  create mode 100644 drivers/media/platform/tegra/tegra-core.c
>  create mode 100644 drivers/media/platform/tegra/tegra-core.h
>  create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>  create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>  create mode 100644 include/dt-bindings/media/tegra-vi.h
> 

<snip>

> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
> +{
> +	struct tegra_channel_buffer *buf = chan->active;
> +	struct vb2_buffer *vb = &buf->buf;
> +	int err = 0;
> +	u32 thresh, value, frame_start;
> +	int bytes_per_line = chan->format.bytesperline;
> +
> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
> +		return -EINVAL;
> +
> +	if (chan->bypass)
> +		goto bypass_done;
> +
> +	/* Program buffer address */
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
> +		  0x0);
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
> +		  buf->addr);
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
> +		  bytes_per_line);
> +
> +	/* Program syncpoint */
> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
> +			    frame_start | host1x_syncpt_id(chan->sp));
> +
> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
> +
> +	/* Use syncpoint to wake up */
> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
> +
> +	mutex_unlock(&chan->lock);
> +	err = host1x_syncpt_wait(chan->sp, thresh,
> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
> +	mutex_lock(&chan->lock);
> +
> +	if (err) {
> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
> +		tegra_channel_capture_error(chan, err);
> +	}
> +
> +bypass_done:
> +	/* Captured one frame */
> +	spin_lock_irq(&chan->queued_lock);
> +	vb->v4l2_buf.sequence = chan->sequence++;
> +	vb->v4l2_buf.field = V4L2_FIELD_NONE;
> +	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
> +	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
> +	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> +	spin_unlock_irq(&chan->queued_lock);
> +
> +	return err;
> +}
> +
> +static void tegra_channel_work(struct work_struct *work)
> +{
> +	struct tegra_channel *chan =
> +		container_of(work, struct tegra_channel, work);
> +
> +	while (1) {
> +		spin_lock_irq(&chan->queued_lock);
> +		if (list_empty(&chan->capture)) {
> +			chan->active = NULL;
> +			spin_unlock_irq(&chan->queued_lock);
> +			return;
> +		}
> +		chan->active = list_entry(chan->capture.next,
> +				struct tegra_channel_buffer, queue);
> +		list_del_init(&chan->active->queue);
> +		spin_unlock_irq(&chan->queued_lock);
> +
> +		mutex_lock(&chan->lock);
> +		tegra_channel_capture_frame(chan);
> +		mutex_unlock(&chan->lock);
> +	}
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * videobuf2 queue operations
> + */
> +
> +static int
> +tegra_channel_queue_setup(struct vb2_queue *vq, const struct v4l2_format *fmt,
> +		     unsigned int *nbuffers, unsigned int *nplanes,
> +		     unsigned int sizes[], void *alloc_ctxs[])
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
> +
> +	/* Make sure the image size is large enough. */
> +	if (fmt && fmt->fmt.pix.sizeimage < chan->format.sizeimage)
> +		return -EINVAL;
> +
> +	*nplanes = 1;
> +
> +	sizes[0] = fmt ? fmt->fmt.pix.sizeimage : chan->format.sizeimage;
> +	alloc_ctxs[0] = chan->alloc_ctx;
> +
> +	return 0;
> +}
> +
> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> +
> +	buf->chan = chan;
> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> +
> +	return 0;
> +}
> +
> +static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> +
> +	/* Put buffer into the  capture queue */
> +	spin_lock_irq(&chan->queued_lock);
> +	list_add_tail(&buf->queue, &chan->capture);
> +	spin_unlock_irq(&chan->queued_lock);
> +
> +	/* Start work queue to capture data to buffer */
> +	if (vb2_start_streaming_called(&chan->queue))
> +		schedule_work(&chan->work);
> +}
> +
> +static int tegra_channel_set_stream(struct tegra_channel *chan, bool on)
> +{
> +	struct media_entity *entity;
> +	struct media_pad *pad;
> +	struct v4l2_subdev *subdev;
> +	int ret = 0;
> +
> +	entity = &chan->video.entity;
> +
> +	while (1) {
> +		if (entity->num_pads > 1 && (chan->port & 0x1))
> +			pad = &entity->pads[2];
> +		else
> +			pad = &entity->pads[0];
> +
> +		if (!(pad->flags & MEDIA_PAD_FL_SINK))
> +			break;
> +
> +		pad = media_entity_remote_pad(pad);
> +		if (pad == NULL ||
> +		    media_entity_type(pad->entity) != MEDIA_ENT_T_V4L2_SUBDEV)
> +			break;
> +
> +		entity = pad->entity;
> +		subdev = media_entity_to_v4l2_subdev(entity);
> +		ret = v4l2_subdev_call(subdev, video, s_stream, on);
> +		if (on && ret < 0 && ret != -ENOIOCTLCMD)
> +			return ret;
> +	}
> +	return ret;
> +}
> +
> +static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
> +	struct media_pipeline *pipe = chan->video.entity.pipe;
> +	struct tegra_channel_buffer *buf, *nbuf;
> +	int ret = 0;
> +
> +	if (!chan->vi->pg_mode && !chan->remote_entity) {
> +		dev_err(&chan->video.dev,
> +			"is not in TPG mode and has not sensor connected!\n");
> +		ret = -EINVAL;
> +		goto vb2_queued;
> +	}
> +
> +	mutex_lock(&chan->lock);
> +
> +	/* Start CIL clock */
> +	clk_set_rate(chan->cil_clk, 102000000);
> +	clk_prepare_enable(chan->cil_clk);
> +
> +	/* Disable DPD */
> +	ret = tegra_io_rail_power_on(chan->io_id);
> +	if (ret < 0) {
> +		dev_err(&chan->video.dev,
> +			"failed to power on CSI rail: %d\n", ret);
> +		goto error_power_on;
> +	}
> +
> +	/* Clean up status */
> +	cil_write(chan, TEGRA_CSI_CIL_STATUS, 0xFFFFFFFF);
> +	cil_write(chan, TEGRA_CSI_CILX_STATUS, 0xFFFFFFFF);
> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_STATUS, 0xFFFFFFFF);
> +	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
> +
> +	ret = media_entity_pipeline_start(&chan->video.entity, pipe);
> +	if (ret < 0)
> +		goto error_pipeline_start;
> +
> +	/* Start the pipeline. */
> +	ret = tegra_channel_set_stream(chan, true);
> +	if (ret < 0)
> +		goto error_set_stream;
> +
> +	/* Note: Program VI registers after TPG, sensors and CSI streaming */
> +	ret = tegra_channel_capture_setup(chan);
> +	if (ret < 0)
> +		goto error_capture_setup;
> +
> +	chan->sequence = 0;
> +	mutex_unlock(&chan->lock);
> +
> +	/* Start work queue to capture data to buffer */
> +	schedule_work(&chan->work);
> +
> +	return 0;
> +
> +error_capture_setup:
> +	tegra_channel_set_stream(chan, false);
> +error_set_stream:
> +	media_entity_pipeline_stop(&chan->video.entity);
> +error_pipeline_start:
> +	tegra_io_rail_power_off(chan->io_id);
> +error_power_on:
> +	clk_disable_unprepare(chan->cil_clk);
> +	mutex_unlock(&chan->lock);
> +vb2_queued:
> +	/* Return all queued buffers back to vb2 */
> +	spin_lock_irq(&chan->queued_lock);
> +	vq->start_streaming_called = 0;
> +	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
> +		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_QUEUED);
> +		list_del(&buf->queue);
> +	}
> +	spin_unlock_irq(&chan->queued_lock);
> +	return ret;
> +}

OK, so this whole sequence for running the DMA remains very confusing.

First of all, this needs more documentation, especially about the fact that this
uses shadow registers.

Secondly, at the very least you need to create per-channel workqueues instead of
using the global workqueue (schedule_work schedules the work on the global queue).

But I would replace the whole workqueue handling with per-channel kthreads instead:
where you call schedule_work above in start_streaming you start the thread. The
thread keeps going while there are buffers queued, and if no buffers are available
it will wait until it is woken up again. In buffer_queue you can wake up the thread
after queueing the buffer.

In stop streaming you stop the thread.

Doing it this way allows you to remove the 'vq->start_streaming_called = 0' line
above: the fact that you need it there is an indication that there is something
wrong with the design. The real problem is that buffer_queue does too much: buffer_queue
should just queue up the buffer for the DMA engine but it should never (re)start the
DMA engine. Starting and stopping should be handled in start/stop_streaming.

This keeps the design clean. I've seen other drivers that do similar things to what
is done here, and that always created a mess.

In addition, the way it works now in this driver is that the worker function is
called on start_streaming AND for every buffer_queue, so it looks like you can get
multiple worker functions running at the same time. It's all pretty weird.

Keeping all the DMA handling in a single thread makes the control mechanism much
cleaner.

<snip>

> +static void
> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
> +		      const struct tegra_video_format **fmtinfo)
> +{
> +	const struct tegra_video_format *info;
> +	unsigned int min_width;
> +	unsigned int max_width;
> +	unsigned int min_bpl;
> +	unsigned int max_bpl;
> +	unsigned int width;
> +	unsigned int align;
> +	unsigned int bpl;
> +
> +	/* Retrieve format information and select the default format if the
> +	 * requested format isn't supported.
> +	 */
> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
> +	if (!info)
> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> +
> +	pix->pixelformat = info->fourcc;
> +	pix->field = V4L2_FIELD_NONE;
> +
> +	/* The transfer alignment requirements are expressed in bytes. Compute
> +	 * the minimum and maximum values, clamp the requested width and convert
> +	 * it back to pixels.
> +	 */
> +	align = lcm(chan->align, info->bpp);
> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
> +	width = rounddown(pix->width * info->bpp, align);
> +
> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
> +			    TEGRA_MAX_HEIGHT);
> +
> +	/* Clamp the requested bytes per line value. If the maximum bytes per
> +	 * line value is zero, the module doesn't support user configurable line
> +	 * sizes. Override the requested value with the minimum in that case.
> +	 */
> +	min_bpl = pix->width * info->bpp;
> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
> +	bpl = rounddown(pix->bytesperline, chan->align);
> +
> +	pix->bytesperline = clamp(bpl, min_bpl, max_bpl);
> +	pix->sizeimage = pix->bytesperline * pix->height;

The colorspace is still not set: using the test pattern generator as a source
I would select SRGB for Bayer and RGB pixelformats and REC709 for YUV pixelformats.

> +
> +	if (fmtinfo)
> +		*fmtinfo = info;
> +}

<snip>

> +static int tegra_channel_v4l2_open(struct file *file)
> +{
> +	struct tegra_channel *chan = video_drvdata(file);
> +	struct tegra_vi_device *vi = chan->vi;
> +	int ret = 0;
> +
> +	mutex_lock(&vi->lock);
> +	ret = v4l2_fh_open(file);
> +	if (ret)
> +		goto unlock;
> +
> +	/* The first open then turn on power*/
> +	if (!vi->power_on_refcnt) {

Instead of using your own counter you can also call:

	if (v4l2_fh_is_singular_file(file)) {

> +		tegra_vi_power_on(chan->vi);
> +
> +		usleep_range(5, 100);
> +		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
> +		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
> +		usleep_range(10, 15);
> +	}
> +	vi->power_on_refcnt++;
> +
> +unlock:
> +	mutex_unlock(&vi->lock);
> +	return ret;
> +}
> +
> +static int tegra_channel_v4l2_release(struct file *file)
> +{
> +	struct tegra_channel *chan = video_drvdata(file);
> +	struct tegra_vi_device *vi = chan->vi;
> +	int ret = 0;
> +
> +	mutex_lock(&vi->lock);
> +	vi->power_on_refcnt--;
> +	/* The last release then turn off power */
> +	if (!vi->power_on_refcnt)

And here do the same:

	if (v4l2_fh_is_singular_file(file)) {

> +		tegra_vi_power_off(chan->vi);
> +	ret = _vb2_fop_release(file, NULL);

Is this the correct order? What if you are streaming and while streaming
close the filehandle? Will the fact that the power is turned off before
stop_streaming is called (_vb2_fop_release will call that) cause a problem?

> +	mutex_unlock(&vi->lock);
> +
> +	return ret;
> +}

Regards,

	Hans


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
  2015-08-21  9:28   ` Hans Verkuil
@ 2015-08-21 13:03   ` Thierry Reding
  2015-08-21 13:25     ` Hans Verkuil
       [not found]     ` <20150821130339.GB22118-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
  1 sibling, 2 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-21 13:03 UTC (permalink / raw)
  To: Bryan Wu
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw, gfitzer

[-- Attachment #1: Type: text/plain, Size: 55890 bytes --]

On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
> controller which can support up to 6 MIPI CSI camera sensors.
> 
> This patch adds a V4L2 media controller and capture driver to support
> Tegra VI hardware. It's verified with Tegra built-in test pattern
> generator.

Hi Bryan,

I've been looking forward to seeing this posted. I don't know the VI
hardware in very much detail, nor am I an expert on the media framework,
so I will primarily comment on architectural or SoC-specific things.

By the way, please always Cc linux-tegra@vger.kernel.org on all patches
relating to Tegra. That way people not explicitly Cc'ed but still
interested in Tegra will see this code, even if they aren't subscribed
to the linux-media mailing list.

> Signed-off-by: Bryan Wu <pengw@nvidia.com>
> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/platform/Kconfig               |    1 +
>  drivers/media/platform/Makefile              |    2 +
>  drivers/media/platform/tegra/Kconfig         |    9 +
>  drivers/media/platform/tegra/Makefile        |    3 +
>  drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>  drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>  drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>  drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>  drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>  include/dt-bindings/media/tegra-vi.h         |   35 +
>  10 files changed, 2362 insertions(+)
>  create mode 100644 drivers/media/platform/tegra/Kconfig
>  create mode 100644 drivers/media/platform/tegra/Makefile
>  create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>  create mode 100644 drivers/media/platform/tegra/tegra-core.c
>  create mode 100644 drivers/media/platform/tegra/tegra-core.h
>  create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>  create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>  create mode 100644 include/dt-bindings/media/tegra-vi.h

I can't spot a device tree binding document for this, but we'll need one
to properly review this driver.

> diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
> new file mode 100644
> index 0000000..a69d1b2
> --- /dev/null
> +++ b/drivers/media/platform/tegra/Kconfig
> @@ -0,0 +1,9 @@
> +config VIDEO_TEGRA
> +	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"

I don't think the (EXPERIMENTAL) is warranted. Either the driver works
or it doesn't. And I assume you already tested that it works, even if
only using the TPG.

> +	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF

This seems to be missing a couple of dependencies. For example I would
expect at least TEGRA_HOST1X to be listed here to make sure people can't
select this when the host1x API isn't available. I would also expect
some sort of architecture dependency because it really makes no sense to
build this if Tegra isn't supported.

If you are concerned about compile coverage you can make that explicit
using a COMPILE_TEST alternative such as:

	depends on ARCH_TEGRA || (ARM && COMPILE_TEST)

Note that the ARM dependency in there makes sure that HAVE_IOMEM is
selected, so this could also be:

	depends on ARCH_TEGRA || (HAVE_IOMEM && COMPILE_TEST)

though that'd still leave open the possibility of build breakage because
of some missing support.

If you add the dependency on TEGRA_HOST1X that I mentioned above you
shouldn't need any architecture dependency because TEGRA_HOST1X implies
those already.

> +	select VIDEOBUF2_DMA_CONTIG
> +	---help---
> +	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.

I'd reword this slightly as:

	  Driver for the Video Input (VI) controller found on NVIDIA Tegra
	  SoCs.

> +
> +	  TO compile this driver as a module, choose M here: the module will be

s/TO/To/.

> +	  called tegra-video.

> diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
> new file mode 100644
> index 0000000..c8eff0b
> --- /dev/null
> +++ b/drivers/media/platform/tegra/Makefile
> @@ -0,0 +1,3 @@
> +tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o

I'd personally leave out the redundant tegra- prefix here, because the
files are in a tegra/ subdirectory already.

> +obj-$(CONFIG_VIDEO_TEGRA) += tegra-video.o
> diff --git a/drivers/media/platform/tegra/tegra-channel.c b/drivers/media/platform/tegra/tegra-channel.c
> new file mode 100644
> index 0000000..b0063d2
> --- /dev/null
> +++ b/drivers/media/platform/tegra/tegra-channel.c
> @@ -0,0 +1,1074 @@
> +/*
> + * NVIDIA Tegra Video Input Device
> + *
> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * Author: Bryan Wu <pengw@nvidia.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/bitmap.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/host1x.h>
> +#include <linux/lcm.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/slab.h>
> +
> +#include <media/v4l2-ctrls.h>
> +#include <media/v4l2-dev.h>
> +#include <media/v4l2-fh.h>
> +#include <media/v4l2-ioctl.h>
> +#include <media/videobuf2-core.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include <soc/tegra/pmc.h>
> +
> +#include "tegra-vi.h"
> +
> +#define TEGRA_VI_SYNCPT_WAIT_TIMEOUT			200
> +
> +/* VI registers */
> +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
> +#define		SP_PP_LINE_START			4
> +#define		SP_PP_FRAME_START			5
> +#define		SP_MW_REQ_DONE				6
> +#define		SP_MW_ACK_DONE				7

Indentation is weird here. There also seems to be a mix of spaces and
tabs in the register definitions below. I find that these end up hard to
read, so it'd be good to make these consistent.

> +/* CSI registers */
> +#define TEGRA_VI_CSI_0_BASE                             0x100
> +#define TEGRA_VI_CSI_1_BASE                             0x200
> +#define TEGRA_VI_CSI_2_BASE                             0x300
> +#define TEGRA_VI_CSI_3_BASE                             0x400
> +#define TEGRA_VI_CSI_4_BASE                             0x500
> +#define TEGRA_VI_CSI_5_BASE                             0x600

You seem to be computing these offsets later on based on the CSI 0 base
and an offset multiplied by the instance number. Perhaps define this as

	#define TEGRA_VI_CSI_BASE(x)	(0x100 + (x) * 0x100)

to avoid the unused defines as well as the computation later on?

> +/* CSI Pixel Parser registers */
> +#define TEGRA_CSI_PIXEL_PARSER_0_BASE			0x0838
> +#define TEGRA_CSI_PIXEL_PARSER_1_BASE			0x086c
> +#define TEGRA_CSI_PIXEL_PARSER_2_BASE			0x1038
> +#define TEGRA_CSI_PIXEL_PARSER_3_BASE			0x106c
> +#define TEGRA_CSI_PIXEL_PARSER_4_BASE			0x1838
> +#define TEGRA_CSI_PIXEL_PARSER_5_BASE			0x186c

Same comment as for TEGRA_VI_CSI_*_BASE above. Only the first of these
is used.

> +
> +
> +/* CSI Pixel Parser registers */
> +#define TEGRA_CSI_INPUT_STREAM_CONTROL                  0x000
> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL0                 0x004
> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL1                 0x008
> +#define TEGRA_CSI_PIXEL_STREAM_GAP                      0x00c
> +#define TEGRA_CSI_PIXEL_STREAM_PP_COMMAND               0x010
> +#define TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME           0x014
> +#define TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK           0x018
> +#define TEGRA_CSI_PIXEL_PARSER_STATUS                   0x01c
> +#define TEGRA_CSI_CSI_SW_SENSOR_RESET                   0x020
> +
> +/* CSI PHY registers */
> +#define TEGRA_CSI_CIL_PHY_0_BASE			0x0908
> +#define TEGRA_CSI_CIL_PHY_1_BASE			0x1108
> +#define TEGRA_CSI_CIL_PHY_2_BASE			0x1908

Same as for the other base offsets.

> +#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908

This doesn't seem to be used at all.

> +/* CSI CIL registers */
> +#define TEGRA_CSI_CIL_0_BASE				0x092c
> +#define TEGRA_CSI_CIL_1_BASE				0x0960
> +#define TEGRA_CSI_CIL_2_BASE				0x112c
> +#define TEGRA_CSI_CIL_3_BASE				0x1160
> +#define TEGRA_CSI_CIL_4_BASE				0x192c
> +#define TEGRA_CSI_CIL_5_BASE				0x1960

Again, unused base defines, so might be better to go with a
parameterized definition.

> +#define TEGRA_CSI_CIL_PAD_CONFIG0                       0x000
> +#define TEGRA_CSI_CIL_PAD_CONFIG1                       0x004
> +#define TEGRA_CSI_CIL_PHY_CONTROL                       0x008
> +#define TEGRA_CSI_CIL_INTERRUPT_MASK                    0x00c
> +#define TEGRA_CSI_CIL_STATUS                            0x010
> +#define TEGRA_CSI_CILX_STATUS                           0x014
> +#define TEGRA_CSI_CIL_ESCAPE_MODE_COMMAND               0x018
> +#define TEGRA_CSI_CIL_ESCAPE_MODE_DATA                  0x01c
> +#define TEGRA_CSI_CIL_SW_SENSOR_RESET                   0x020
> +
> +/* CSI Pattern Generator registers */
> +#define TEGRA_CSI_PATTERN_GENERATOR_0_BASE		0x09c4
> +#define TEGRA_CSI_PATTERN_GENERATOR_1_BASE		0x09f8
> +#define TEGRA_CSI_PATTERN_GENERATOR_2_BASE		0x11c4
> +#define TEGRA_CSI_PATTERN_GENERATOR_3_BASE		0x11f8
> +#define TEGRA_CSI_PATTERN_GENERATOR_4_BASE		0x19c4
> +#define TEGRA_CSI_PATTERN_GENERATOR_5_BASE		0x19f8

More unused base defines.

> +#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
> +#define TEGRA_CSI_PG_BLANK				0x004
> +#define TEGRA_CSI_PG_PHASE				0x008
> +#define TEGRA_CSI_PG_RED_FREQ				0x00c
> +#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
> +#define TEGRA_CSI_PG_GREEN_FREQ				0x014
> +#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
> +#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
> +#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
> +#define TEGRA_CSI_PG_AOHDR				0x024
> +
> +#define TEGRA_CSI_DPCM_CTRL_A				0xad0
> +#define TEGRA_CSI_DPCM_CTRL_B				0xad4
> +#define TEGRA_CSI_STALL_COUNTER				0xae8
> +#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
> +#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
> +#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
> +#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
> +#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
> +#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
> +#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04

Some of these are unused. I guess there's an argument to be made to
include all register definitions rather than just the used ones, if for
nothing else than completeness. I'll defer to Hans's judgement on this.

> +/* Channel registers */
> +static void tegra_channel_write(struct tegra_channel *chan, u32 addr, u32 val)

I prefer unsigned int offset instead of u32 addr. That makes in more
obvious that this is actually an offset from some I/O memory base
address. Also using a sized type for the offset is a bit exaggerated
because it doesn't need to be of any specific size.

The same comment applies to the other accessors below.

> +{
> +	if (chan->bypass)
> +		return;

I don't see this being set anywhere. Is it dead code? Also the only
description I see is that it's used to bypass register writes, but I
don't see an explanation why that's necessary.

> +/* CIL PHY registers */
> +static void phy_write(struct tegra_channel *chan, u32 val)
> +{
> +	tegra_channel_write(chan, chan->regs.phy, val);
> +}
> +
> +static u32 phy_read(struct tegra_channel *chan)
> +{
> +	return tegra_channel_read(chan, chan->regs.phy);
> +}

Are these missing an offset parameter? Or do these subblocks only have a
single register? Even if that's the case, I think it'd be more
consistent to have the same signature as the other accessors.

> +/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
> +static u32 sp_bit(struct tegra_channel *chan, u32 sp)
> +{
> +	return (sp + chan->port * 4) << 8;
> +}

Technically this returns a mask, not a bit, so sp_mask() would be more
appropriate.

> +/* Calculate register base */
> +static u32 regs_base(u32 regs_base, int port)
> +{
> +	return regs_base + (port / 2 * 0x800) + (port & 1) * 0x34;
> +}
> +
> +/* CSI channel IO Rail IDs */
> +int tegra_io_rail_csi_ids[] = {

This can be static const as far as I can tell.

> +	TEGRA_IO_RAIL_CSIA,
> +	TEGRA_IO_RAIL_CSIB,
> +	TEGRA_IO_RAIL_CSIC,
> +	TEGRA_IO_RAIL_CSID,
> +	TEGRA_IO_RAIL_CSIE,
> +	TEGRA_IO_RAIL_CSIF,
> +};
> +
> +void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan)
> +{
> +	int ret, index;
> +	struct v4l2_subdev *subdev = chan->remote_entity->subdev;
> +	struct v4l2_subdev_mbus_code_enum code = {
> +		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
> +	};
> +
> +

Spurious blank line.

> +static int tegra_channel_capture_setup(struct tegra_channel *chan)
> +{
> +	int lanes = 2;

unsigned int? And why is it hardcoded to 2? There are checks below for
lanes == 4, which will effectively never happen. So at the very least I
think this should have a TODO comment of some sort. Preferably can it
not be determined at runtime what number of lanes we need?

> +	int port = chan->port;

unsigned int?

> +	u32 height = chan->format.height;
> +	u32 width = chan->format.width;
> +	u32 format = chan->fmtinfo->img_fmt;
> +	u32 data_type = chan->fmtinfo->img_dt;
> +	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
> +	struct chan_regs_config *regs = &chan->regs;
> +
> +	/* CIL PHY register setup */
> +	if (port & 0x1) {
> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> +	} else {
> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
> +	}

This seems to address registers not actually part of this channel. Why?

Also you use magic numbers here and in the remainder of the driver. We
should be able to do better. I presume all of this is documented in the
TRM, so we should be able to easily substitute symbolic names.

> +	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> +	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> +	if (lanes == 4) {
> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> +		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> +		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
> +	}

And this seems to access registers from another port by temporarily
rewriting the CIL base offset. That seems a little hackish to me. I
don't know the hardware intimately enough to know exactly what this
is supposed to accomplish, perhaps you can clarify? Also perhaps we
can come up with some architectural overview of the VI hardware, or
does such an overview exist in the TRM?

I see there is, perhaps add a comment somewhere, in the commit
description or the file header giving a reference to where the
architectural overview can be found?

> +	/* CSI pixel parser registers setup */
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
> +		 0x280301f0 | (port & 0x1));
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
> +	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
> +		 0x3f0000 | (lanes - 1));
> +
> +	/* CIL PHY register setup */
> +	if (lanes == 4)
> +		phy_write(chan, 0x0101);
> +	else {
> +		u32 val = phy_read(chan);
> +		if (port & 0x1)
> +			val = (val & ~0x100) | 0x100;
> +		else
> +			val = (val & ~0x1) | 0x1;
> +		phy_write(chan, val);
> +	}

The & ~ isn't quite doing what I suspect it should be doing. My
assumption is that you want to set this register to 0x01 if the first
port is to be used and 0x100 if the second port is to be used (or 0x101
if both ports are to be used). In that case I think you'll want
something like this:

	value = phy_read(chan);

	if (port & 1)
		value = (value & ~0x0001) | 0x0100;
	else
		value = (value & ~0x0100) | 0x0001;

	phy_write(chan, value);

> +static void tegra_channel_capture_error(struct tegra_channel *chan, int err)
> +{
> +	u32 val;
> +
> +#ifdef DEBUG
> +	val = tegra_channel_read(chan, TEGRA_CSI_DEBUG_COUNTER_0);
> +	dev_err(&chan->video.dev, "TEGRA_CSI_DEBUG_COUNTER_0 0x%08x\n", val);
> +#endif
> +	val = cil_read(chan, TEGRA_CSI_CIL_STATUS);
> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CIL_STATUS 0x%08x\n", val);
> +	val = cil_read(chan, TEGRA_CSI_CILX_STATUS);
> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CILX_STATUS 0x%08x\n", val);
> +	val = pp_read(chan, TEGRA_CSI_PIXEL_PARSER_STATUS);
> +	dev_err(&chan->video.dev, "TEGRA_CSI_PIXEL_PARSER_STATUS 0x%08x\n",
> +		val);
> +	val = csi_read(chan, TEGRA_VI_CSI_ERROR_STATUS);
> +	dev_err(&chan->video.dev, "TEGRA_VI_CSI_ERROR_STATUS 0x%08x\n", val);
> +}

The err parameter is never used, so it should be dropped.

> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
> +{
> +	struct tegra_channel_buffer *buf = chan->active;
> +	struct vb2_buffer *vb = &buf->buf;
> +	int err = 0;
> +	u32 thresh, value, frame_start;
> +	int bytes_per_line = chan->format.bytesperline;
> +
> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
> +		return -EINVAL;
> +
> +	if (chan->bypass)
> +		goto bypass_done;
> +
> +	/* Program buffer address */
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
> +		  0x0);
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
> +		  buf->addr);
> +	csi_write(chan,
> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
> +		  bytes_per_line);
> +
> +	/* Program syncpoint */
> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
> +			    frame_start | host1x_syncpt_id(chan->sp));
> +
> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
> +
> +	/* Use syncpoint to wake up */
> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
> +
> +	mutex_unlock(&chan->lock);
> +	err = host1x_syncpt_wait(chan->sp, thresh,
> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
> +	mutex_lock(&chan->lock);

What's the point of taking the lock in the first place if you drop it
here, even if temporarily? This is a per-channel lock, and it protects
the channel against concurrent captures. So if you drop the lock here,
don't you run risk of having two captures run concurrently? And by the
time you get to the error handling or buffer completion below you can't
be sure you're actually dealing with the same buffer that you started
with.

> +
> +	if (err) {
> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
> +		tegra_channel_capture_error(chan, err);
> +	}

Is timeout really the only kind of error that can happen here?

> +
> +bypass_done:
> +	/* Captured one frame */
> +	spin_lock_irq(&chan->queued_lock);
> +	vb->v4l2_buf.sequence = chan->sequence++;
> +	vb->v4l2_buf.field = V4L2_FIELD_NONE;
> +	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
> +	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
> +	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> +	spin_unlock_irq(&chan->queued_lock);

Do we really need to set all the buffer fields on error? Isn't it enough
to simply mark the state as "error"?

> +
> +	return err;
> +}
> +
> +static void tegra_channel_work(struct work_struct *work)
> +{
> +	struct tegra_channel *chan =
> +		container_of(work, struct tegra_channel, work);
> +
> +	while (1) {
> +		spin_lock_irq(&chan->queued_lock);
> +		if (list_empty(&chan->capture)) {
> +			chan->active = NULL;
> +			spin_unlock_irq(&chan->queued_lock);
> +			return;
> +		}
> +		chan->active = list_entry(chan->capture.next,
> +				struct tegra_channel_buffer, queue);
> +		list_del_init(&chan->active->queue);
> +		spin_unlock_irq(&chan->queued_lock);
> +
> +		mutex_lock(&chan->lock);
> +		tegra_channel_capture_frame(chan);
> +		mutex_unlock(&chan->lock);
> +	}
> +}

Should this have some mechanism to break out of the loop, for example if
somebody requested capturing to stop?

> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> +
> +	buf->chan = chan;
> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> +
> +	return 0;
> +}

This seems to use contiguous DMA, which I guess presumes CMA support?
We're dealing with very large buffers here. Your default frame size
would yield buffers of roughly 32 MiB each, and you probably need a
couple of those to ensure smooth playback. That's quite a bit of
memory to reserve for CMA.

Have you ever tried to make this work with the IOMMU API so that we can
allocate arbitrary buffers and linearize them for the hardware through
the SMMU?

> +static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> +
> +	/* Put buffer into the  capture queue */
> +	spin_lock_irq(&chan->queued_lock);
> +	list_add_tail(&buf->queue, &chan->capture);
> +	spin_unlock_irq(&chan->queued_lock);
> +
> +	/* Start work queue to capture data to buffer */
> +	if (vb2_start_streaming_called(&chan->queue))
> +		schedule_work(&chan->work);
> +}

I'm beginning to wonder if a workqueue is the best implementation here.
Couldn't we get notification on syncpoint increments and have a handler
setup capture of new frames?

> +static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
> +{
> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
> +	struct media_pipeline *pipe = chan->video.entity.pipe;
> +	struct tegra_channel_buffer *buf, *nbuf;
> +	int ret = 0;
> +
> +	if (!chan->vi->pg_mode && !chan->remote_entity) {
> +		dev_err(&chan->video.dev,
> +			"is not in TPG mode and has not sensor connected!\n");
> +		ret = -EINVAL;
> +		goto vb2_queued;
> +	}
> +
> +	mutex_lock(&chan->lock);
> +
> +	/* Start CIL clock */
> +	clk_set_rate(chan->cil_clk, 102000000);
> +	clk_prepare_enable(chan->cil_clk);

You need to check these for errors.

> +static struct vb2_ops tegra_channel_queue_qops = {
> +	.queue_setup = tegra_channel_queue_setup,
> +	.buf_prepare = tegra_channel_buffer_prepare,
> +	.buf_queue = tegra_channel_buffer_queue,
> +	.wait_prepare = vb2_ops_wait_prepare,
> +	.wait_finish = vb2_ops_wait_finish,
> +	.start_streaming = tegra_channel_start_streaming,
> +	.stop_streaming = tegra_channel_stop_streaming,
> +};

I think this needs to be static const.

> +static int
> +tegra_channel_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
> +{
> +	struct v4l2_fh *vfh = file->private_data;
> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
> +
> +	cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING;
> +	cap->capabilities = cap->device_caps | V4L2_CAP_DEVICE_CAPS;
> +
> +	strlcpy(cap->driver, "tegra-vi", sizeof(cap->driver));

Perhaps "tegra-video" to be consistent with the module name?

> +	strlcpy(cap->card, chan->video.name, sizeof(cap->card));
> +	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s:%u",
> +		 chan->vi->dev->of_node->name, chan->port);

Should this not rather use dev_name(chan->vi->dev) to ensure it works
fine if ever we have multiple instances of the VI controller?

> +static int
> +tegra_channel_enum_format(struct file *file, void *fh, struct v4l2_fmtdesc *f)
> +{
> +	struct v4l2_fh *vfh = file->private_data;
> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
> +	int index, i;

These can probably be unsigned int.

> +	unsigned long *fmts_bitmap = NULL;
> +
> +	if (chan->vi->pg_mode)
> +		fmts_bitmap = chan->vi->tpg_fmts_bitmap;
> +	else if (chan->remote_entity)
> +		fmts_bitmap = chan->fmts_bitmap;
> +
> +	if (!fmts_bitmap ||
> +	    f->index > bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM) - 1)
> +		return -EINVAL;
> +
> +	index = -1;

This won't work with unsigned int, of course (actually, it would, but
it'd be ugly), but I think you could work around that by doing the more
natural:

> +	for (i = 0; i < f->index + 1; i++)
> +		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index + 1);

	index = 0;

	for (i = 0; i < f->index + 1; i++, index++)
		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index);

> +static void
> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
> +		      const struct tegra_video_format **fmtinfo)
> +{
> +	const struct tegra_video_format *info;
> +	unsigned int min_width;
> +	unsigned int max_width;
> +	unsigned int min_bpl;
> +	unsigned int max_bpl;
> +	unsigned int width;
> +	unsigned int align;
> +	unsigned int bpl;
> +
> +	/* Retrieve format information and select the default format if the
> +	 * requested format isn't supported.
> +	 */
> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
> +	if (!info)
> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);

Should this not be an error? As far as I can tell this is silently
substituting the default format for the requested one if the requested
one isn't supported. Isn't the whole point of this to find out if some
format is supported?

> +
> +	pix->pixelformat = info->fourcc;
> +	pix->field = V4L2_FIELD_NONE;
> +
> +	/* The transfer alignment requirements are expressed in bytes. Compute
> +	 * the minimum and maximum values, clamp the requested width and convert
> +	 * it back to pixels.
> +	 */
> +	align = lcm(chan->align, info->bpp);
> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
> +	width = rounddown(pix->width * info->bpp, align);

Shouldn't these be roundup()?

> +
> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
> +			    TEGRA_MAX_HEIGHT);

The above fits nicely on one line and doesn't need to be wrapped.

> +
> +	/* Clamp the requested bytes per line value. If the maximum bytes per
> +	 * line value is zero, the module doesn't support user configurable line
> +	 * sizes. Override the requested value with the minimum in that case.
> +	 */
> +	min_bpl = pix->width * info->bpp;
> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
> +	bpl = rounddown(pix->bytesperline, chan->align);

Again, I think these should be roundup().

> +static int tegra_channel_v4l2_open(struct file *file)
> +{
> +	struct tegra_channel *chan = video_drvdata(file);
> +	struct tegra_vi_device *vi = chan->vi;
> +	int ret = 0;
> +
> +	mutex_lock(&vi->lock);
> +	ret = v4l2_fh_open(file);
> +	if (ret)
> +		goto unlock;
> +
> +	/* The first open then turn on power*/
> +	if (!vi->power_on_refcnt) {
> +		tegra_vi_power_on(chan->vi);

Perhaps propagate error codes here?

> +
> +		usleep_range(5, 100);
> +		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
> +		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
> +		usleep_range(10, 15);
> +	}
> +	vi->power_on_refcnt++;

Also, I wonder if powering up at ->open() time isn't a little early. I
could very well imagine an application opening up a device and then not
use it for a long time. Or keep it open even while nothing is being
captures. But that's primarily an optimization matter, so this is fine
with me.

> +int tegra_channel_init(struct tegra_vi_device *vi,
> +		       struct tegra_channel *chan,
> +		       u32 port)

The above fits on 2 lines, no need to make it three. Also port should
probably be unsigned int because the size isn't important.

> +{
> +	int ret;
> +
> +	chan->vi = vi;
> +	chan->port = port;
> +	chan->iomem = vi->iomem;
> +
> +	/* Init channel register base */
> +	chan->regs.csi = TEGRA_VI_CSI_0_BASE + port * 0x100;
> +	chan->regs.pp = regs_base(TEGRA_CSI_PIXEL_PARSER_0_BASE, port);
> +	chan->regs.cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
> +	chan->regs.phy = TEGRA_CSI_CIL_PHY_0_BASE + port / 2 * 0x800;
> +	chan->regs.tpg = regs_base(TEGRA_CSI_PATTERN_GENERATOR_0_BASE, port);

Like I said, I think it'd be clearer to have the defines parameterized.
That would also make this more consistent, rather than have one set of
values that are computed here and for others the regs_base() helper is
invoked. Also, I think it'd be better to have the regs structures take
void __iomem * directly, so that the offset addition doesn't have to be
performed at every register access.

> +
> +	/* Init CIL clock */
> +	switch (chan->port) {
> +	case 0:
> +	case 1:
> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilab");
> +		break;
> +	case 2:
> +	case 3:
> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilcd");
> +		break;
> +	case 4:
> +	case 5:
> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cile");
> +		break;
> +	default:
> +		dev_err(chan->vi->dev, "wrong port nubmer %d\n", port);

Nit: you should use %u for unsigned integers.

> +	}
> +	if (IS_ERR(chan->cil_clk)) {
> +		dev_err(chan->vi->dev, "Failed to get CIL clock\n");

Perhaps mention which clock couldn't be received.

> +		return -EINVAL;

And propagate the error code rather than returning a hardcoded one.

> +	}
> +
> +	/* VI Channel is 64 bytes alignment */
> +	chan->align = 64;

Does this need parameterization for other SoC generations?

> +	chan->surface = 0;

I can't find this being set to anything other than 0. What is its use?

> +	chan->io_id = tegra_io_rail_csi_ids[chan->port];
> +	mutex_init(&chan->lock);
> +	mutex_init(&chan->video_lock);
> +	INIT_LIST_HEAD(&chan->capture);
> +	spin_lock_init(&chan->queued_lock);
> +	INIT_WORK(&chan->work, tegra_channel_work);
> +
> +	/* Init video format */
> +	chan->fmtinfo = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> +	chan->format.pixelformat = chan->fmtinfo->fourcc;
> +	chan->format.colorspace = V4L2_COLORSPACE_SRGB;
> +	chan->format.field = V4L2_FIELD_NONE;
> +	chan->format.width = TEGRA_DEF_WIDTH;
> +	chan->format.height = TEGRA_DEF_HEIGHT;
> +	chan->format.bytesperline = chan->format.width * chan->fmtinfo->bpp;
> +	chan->format.sizeimage = chan->format.bytesperline *
> +				    chan->format.height;
> +
> +	/* Initialize the media entity... */
> +	chan->pad.flags = MEDIA_PAD_FL_SINK;
> +
> +	ret = media_entity_init(&chan->video.entity, 1, &chan->pad, 0);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* ... and the video node... */
> +	chan->video.fops = &tegra_channel_fops;
> +	chan->video.v4l2_dev = &vi->v4l2_dev;
> +	chan->video.queue = &chan->queue;
> +	snprintf(chan->video.name, sizeof(chan->video.name), "%s %s %u",
> +		 vi->dev->of_node->name, "output", port);

dev_name()?

> diff --git a/drivers/media/platform/tegra/tegra-core.c b/drivers/media/platform/tegra/tegra-core.c
[...]
> +const struct tegra_video_format tegra_video_formats[] = {

Does this need to be exposed? I see there are accessors for this below,
so exposing the structure itself doesn't seem necessary.

> +int tegra_core_get_formats_array_size(void)
> +{
> +	return ARRAY_SIZE(tegra_video_formats);
> +}
> +
> +/**
> + * tegra_core_get_word_count - Calculate word count
> + * @frame_width: number of pixels in one frame
> + * @fmt: Tegra Video format struct which has BPP information
> + *
> + * Return: word count number
> + */
> +u32 tegra_core_get_word_count(u32 frame_width,
> +			      const struct tegra_video_format *fmt)
> +{
> +	return frame_width * fmt->width / 8;
> +}

This is confusing. If frame_width is the number of pixels in one frame,
then it should probably me called frame_size or so. frame_width to me
implies number of pixels per line, not per frame.

> +/**
> + * tegra_core_get_idx_by_code - Retrieve index for a media bus code
> + * @code: the format media bus code
> + *
> + * Return: a index to the format information structure corresponding to the
> + * given V4L2 media bus format @code, or -1 if no corresponding format can
> + * be found.
> + */
> +int tegra_core_get_idx_by_code(unsigned int code)
> +{
> +	unsigned int i;
> +	const struct tegra_video_format *format;
> +
> +	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
> +		format = &tegra_video_formats[i];
> +
> +		if (format->code == code)

You only use the format value once, so the temporary variable doesn't
buy you anything.

> +			return i;
> +	}
> +
> +	return -1;
> +}
> +
> +

Gratuitous blank line.

> +/**
> + * tegra_core_of_get_format - Parse a device tree node and return format
> + * 			      information

Why is this necessary? Why would you ever need to encode a pixel format
in DT?

> +/**
> + * tegra_core_bytes_per_line - Calculate bytes per line in one frame
> + * @width: frame width
> + * @fmt: Tegra Video format
> + *
> + * Simply calcualte the bytes_per_line and if it's not 64 bytes aligned it
> + * will be padded to 64 boundary.
> + */
> +u32 tegra_core_bytes_per_line(u32 width,
> +			      const struct tegra_video_format *fmt)
> +{
> +	u32 bytes_per_line = width * fmt->bpp;
> +
> +	if (bytes_per_line % 64)
> +		bytes_per_line = bytes_per_line +
> +				 (64 - (bytes_per_line % 64));
> +
> +	return bytes_per_line;
> +}

Perhaps this should use the channel->align field for alignment rather
than hardcode 64? Since there's no channel being passed into this, maybe
passing the alignment as a parameter would work?

Also can't the above be replaced by:

	return roundup(width * fmt->bpp, align);

?

> diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
> new file mode 100644
> index 0000000..7d1026b
> --- /dev/null
> +++ b/drivers/media/platform/tegra/tegra-core.h
> @@ -0,0 +1,134 @@
> +/*
> + * NVIDIA Tegra Video Input Device Driver Core Helpers
> + *
> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * Author: Bryan Wu <pengw@nvidia.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __TEGRA_CORE_H__
> +#define __TEGRA_CORE_H__
> +
> +#include <dt-bindings/media/tegra-vi.h>
> +
> +#include <media/v4l2-subdev.h>
> +
> +/* Minimum and maximum width and height common to Tegra video input device. */
> +#define TEGRA_MIN_WIDTH		32U
> +#define TEGRA_MAX_WIDTH		7680U
> +#define TEGRA_MIN_HEIGHT	32U
> +#define TEGRA_MAX_HEIGHT	7680U

Is this dependent on SoC generation? If we wanted to support Tegra K1,
would the same values apply or do they need to be parameterized?

On that note, could you outline what would be necessary to make this
work on Tegra K1? What are the differences between the VI hardware on
Tegra X1 vs. Tegra K1?

> +
> +/* UHD 4K resolution as default resolution for all Tegra video input device. */
> +#define TEGRA_DEF_WIDTH		3840
> +#define TEGRA_DEF_HEIGHT	2160

Is this a sensible default? It seems rather large to me.

> +
> +#define TEGRA_VF_DEF		TEGRA_VF_RGB888
> +#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32

Should we not have only one of these and convert to the other via some
table?

> +/* These go into the TEGRA_VI_CSI_n_IMAGE_DEF registers bits 23:16 */
> +#define TEGRA_IMAGE_FORMAT_T_L8                         16
> +#define TEGRA_IMAGE_FORMAT_T_R16_I                      32
> +#define TEGRA_IMAGE_FORMAT_T_B5G6R5                     33
> +#define TEGRA_IMAGE_FORMAT_T_R5G6B5                     34
> +#define TEGRA_IMAGE_FORMAT_T_A1B5G5R5                   35
> +#define TEGRA_IMAGE_FORMAT_T_A1R5G5B5                   36
> +#define TEGRA_IMAGE_FORMAT_T_B5G5R5A1                   37
> +#define TEGRA_IMAGE_FORMAT_T_R5G5B5A1                   38
> +#define TEGRA_IMAGE_FORMAT_T_A4B4G4R4                   39
> +#define TEGRA_IMAGE_FORMAT_T_A4R4G4B4                   40
> +#define TEGRA_IMAGE_FORMAT_T_B4G4R4A4                   41
> +#define TEGRA_IMAGE_FORMAT_T_R4G4B4A4                   42
> +#define TEGRA_IMAGE_FORMAT_T_A8B8G8R8                   64
> +#define TEGRA_IMAGE_FORMAT_T_A8R8G8B8                   65
> +#define TEGRA_IMAGE_FORMAT_T_B8G8R8A8                   66
> +#define TEGRA_IMAGE_FORMAT_T_R8G8B8A8                   67
> +#define TEGRA_IMAGE_FORMAT_T_A2B10G10R10                68
> +#define TEGRA_IMAGE_FORMAT_T_A2R10G10B10                69
> +#define TEGRA_IMAGE_FORMAT_T_B10G10R10A2                70
> +#define TEGRA_IMAGE_FORMAT_T_R10G10B10A2                71
> +#define TEGRA_IMAGE_FORMAT_T_A8Y8U8V8                   193
> +#define TEGRA_IMAGE_FORMAT_T_V8U8Y8A8                   194
> +#define TEGRA_IMAGE_FORMAT_T_A2Y10U10V10                197
> +#define TEGRA_IMAGE_FORMAT_T_V10U10Y10A2                198
> +#define TEGRA_IMAGE_FORMAT_T_Y8_U8__Y8_V8               200
> +#define TEGRA_IMAGE_FORMAT_T_Y8_V8__Y8_U8               201
> +#define TEGRA_IMAGE_FORMAT_T_U8_Y8__V8_Y8               202
> +#define TEGRA_IMAGE_FORMAT_T_T_V8_Y8__U8_Y8             203
> +#define TEGRA_IMAGE_FORMAT_T_T_Y8__U8__V8_N444          224
> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N444              225
> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N444              226
> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N422            227
> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N422              228
> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N422              229
> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N420            230
> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N420              231
> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N420              232
> +#define TEGRA_IMAGE_FORMAT_T_X2Lc10Lb10La10             233
> +#define TEGRA_IMAGE_FORMAT_T_A2R6R6R6R6R6               234
> +
> +/* These go into the TEGRA_VI_CSI_n_CSI_IMAGE_DT registers bits 7:0 */
> +#define TEGRA_IMAGE_DT_YUV420_8                         24
> +#define TEGRA_IMAGE_DT_YUV420_10                        25
> +#define TEGRA_IMAGE_DT_YUV420CSPS_8                     28
> +#define TEGRA_IMAGE_DT_YUV420CSPS_10                    29
> +#define TEGRA_IMAGE_DT_YUV422_8                         30
> +#define TEGRA_IMAGE_DT_YUV422_10                        31
> +#define TEGRA_IMAGE_DT_RGB444                           32
> +#define TEGRA_IMAGE_DT_RGB555                           33
> +#define TEGRA_IMAGE_DT_RGB565                           34
> +#define TEGRA_IMAGE_DT_RGB666                           35
> +#define TEGRA_IMAGE_DT_RGB888                           36
> +#define TEGRA_IMAGE_DT_RAW6                             40
> +#define TEGRA_IMAGE_DT_RAW7                             41
> +#define TEGRA_IMAGE_DT_RAW8                             42
> +#define TEGRA_IMAGE_DT_RAW10                            43
> +#define TEGRA_IMAGE_DT_RAW12                            44
> +#define TEGRA_IMAGE_DT_RAW14                            45

It might be helpful to describe what these registers actually do. There
seems to be overlap between both lists, but I don't quite see how they
relate to one another, or what their purpose is.

> +/**
> + * struct tegra_video_format - Tegra video format description
> + * @vf_code: video format code
> + * @width: format width in bits per component
> + * @code: media bus format code
> + * @bpp: bytes per pixel (when stored in memory)
> + * @img_fmt: image format
> + * @img_dt: image data type
> + * @fourcc: V4L2 pixel format FCC identifier
> + * @description: format description, suitable for userspace
> + */
> +struct tegra_video_format {
> +	u32 vf_code;
> +	u32 width;
> +	u32 code;
> +	u32 bpp;

I think the above four can all be unsigned int. A sized type is not
necessary here.

> +	u32 img_fmt;
> +	u32 img_dt;

Perhaps these could be enums?

> +	u32 fourcc;
> +};
> +
> +extern const struct tegra_video_format tegra_video_formats[];

It looks like you have accessors for this. Do you even need to expose
it?

> diff --git a/drivers/media/platform/tegra/tegra-vi.c b/drivers/media/platform/tegra/tegra-vi.c
[...]
> +static void tegra_vi_v4l2_cleanup(struct tegra_vi_device *vi)
> +{
> +	v4l2_ctrl_handler_free(&vi->ctrl_handler);
> +	v4l2_device_unregister(&vi->v4l2_dev);
> +	media_device_unregister(&vi->media_dev);
> +}
> +
> +static int tegra_vi_v4l2_init(struct tegra_vi_device *vi)
> +{
> +	int ret;
> +
> +	vi->media_dev.dev = vi->dev;
> +	strlcpy(vi->media_dev.model, "NVIDIA Tegra Video Input Device",
> +		sizeof(vi->media_dev.model));
> +	vi->media_dev.hw_revision = 0;

Actually, I think for Tegra X1 the hardware revision would be 3, since
VI3 is what it's usually referred to. Tegra K1 has VI2, so this should
be parameterized (at least when Tegra K1 support is added).

> +int tegra_vi_power_on(struct tegra_vi_device *vi)
> +{
> +	int ret;
> +
> +	ret = regulator_enable(vi->vi_reg);
> +	if (ret)
> +		return ret;
> +
> +	ret = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_VENC,
> +						vi->vi_clk, vi->vi_rst);
> +	if (ret) {
> +		regulator_disable(vi->vi_reg);
> +		return ret;
> +	}
> +
> +	clk_prepare_enable(vi->csi_clk);
> +
> +	clk_set_rate(vi->parent_clk, 408000000);

Do we really need to set the parent? Isn't that going to be set
automatically since vi_clk is the child of parent_clk?

> +	clk_set_rate(vi->vi_clk, 408000000);
> +	clk_set_rate(vi->csi_clk, 408000000);

Also all of these clock functions can fail, so you should check for
errors.

> +
> +	return 0;
> +}
> +
> +void tegra_vi_power_off(struct tegra_vi_device *vi)
> +{
> +	clk_disable_unprepare(vi->csi_clk);
> +	tegra_powergate_power_off(TEGRA_POWERGATE_VENC);

tegra_powergate_power_off() doesn't do anything with the clock or the
reset, so you'll want to manually assert reset here and then disable and
unprepare the clock. And I think both need to happen before the power
partition is turned off.

> +	regulator_disable(vi->vi_reg);
> +}
> +
> +static int tegra_vi_channels_init(struct tegra_vi_device *vi)
> +{
> +	int i, ret;

i can be unsigned.

> +	struct tegra_channel *chan;
> +
> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
> +		chan = &vi->chans[i];
> +
> +		ret = tegra_channel_init(vi, chan, i);

Again, chan is only used once, so directly passing &vi->chans[i] to
tegra_channel_init() would be more concise.

> +static int tegra_vi_channels_cleanup(struct tegra_vi_device *vi)
> +{
> +	int i, ret;
> +	struct tegra_channel *chan;
> +
> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
> +		chan = &vi->chans[i];
> +
> +		ret = tegra_channel_cleanup(chan);
> +		if (ret < 0) {
> +			dev_err(vi->dev, "channel %d cleanup failed\n", i);
> +			return ret;
> +		}
> +	}
> +	return 0;
> +}

Same comments as for tegra_vi_channels_init().

> +
> +/* -----------------------------------------------------------------------------
> + * Graph Management
> + */

The way devices are hooked up using the graph needs to be documented in
a device tree binding.

> +static int tegra_vi_graph_notify_complete(struct v4l2_async_notifier *notifier)
> +{
> +	struct tegra_vi_device *vi =
> +		container_of(notifier, struct tegra_vi_device, notifier);
> +	int ret;
> +
> +	dev_dbg(vi->dev, "notify complete, all subdevs registered\n");
> +
> +	/* Create links for every entity. */
> +	ret = tegra_vi_graph_build_links(vi);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = v4l2_device_register_subdev_nodes(&vi->v4l2_dev);
> +	if (ret < 0)
> +		dev_err(vi->dev, "failed to register subdev nodes\n");
> +
> +	return ret;
> +}

Why the need for this notifier mechanism, doesn't deferred probe work
here?

> +static int tegra_vi_graph_notify_bound(struct v4l2_async_notifier *notifier,
> +				   struct v4l2_subdev *subdev,
> +				   struct v4l2_async_subdev *asd)
> +{
[...]
> +}
> +
> +

Gratuitous blank line.

> +static int tegra_vi_graph_init(struct tegra_vi_device *vi)
> +{
> +	struct device_node *node = vi->dev->of_node;
> +	struct device_node *ep = NULL;
> +	struct device_node *next; 
> +	struct device_node *remote = NULL;
> +	struct tegra_vi_graph_entity *entity;
> +	struct v4l2_async_subdev **subdevs = NULL;
> +	unsigned int num_subdevs;

This variable is being used uninitialized.

> +static int tegra_vi_probe(struct platform_device *pdev)
> +{
> +	struct resource *res;
> +	struct tegra_vi_device *vi;
> +	int ret = 0;
> +
> +	vi = devm_kzalloc(&pdev->dev, sizeof(*vi), GFP_KERNEL);
> +	if (!vi)
> +		return -ENOMEM;
> +
> +	vi->dev = &pdev->dev;
> +	INIT_LIST_HEAD(&vi->entities);
> +	mutex_init(&vi->lock);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	vi->iomem = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(vi->iomem))
> +		return PTR_ERR(vi->iomem);
> +
> +	vi->vi_rst = devm_reset_control_get(&pdev->dev, "vi");
> +	if (IS_ERR(vi->vi_rst)) {
> +		dev_err(&pdev->dev, "Failed to get vi reset\n");
> +		return -EPROBE_DEFER;
> +	}

There could be other reasons for failure, so you should really propagate
the error code that devm_reset_control_get() provides:

		return PTR_ERR(vi->vi_rst);

> +	vi->vi_clk = devm_clk_get(&pdev->dev, "vi");
> +	if (IS_ERR(vi->vi_clk)) {
> +		dev_err(&pdev->dev, "Failed to get vi clock\n");
> +		return -EPROBE_DEFER;
> +	}

Same here...

> +	vi->parent_clk = devm_clk_get(&pdev->dev, "parent");
> +	if (IS_ERR(vi->parent_clk)) {
> +		dev_err(&pdev->dev, "Failed to get VI parent clock\n");
> +		return -EPROBE_DEFER;
> +	}

... here...

> +	ret = clk_set_parent(vi->vi_clk, vi->parent_clk);
> +	if (ret < 0)
> +		return ret;
> +
> +	vi->csi_clk = devm_clk_get(&pdev->dev, "csi");
> +	if (IS_ERR(vi->csi_clk)) {
> +		dev_err(&pdev->dev, "Failed to get csi clock\n");
> +		return -EPROBE_DEFER;
> +	}

... here...

> +	vi->vi_reg = devm_regulator_get(&pdev->dev, "avdd-dsi-csi");
> +	if (IS_ERR(vi->vi_reg)) {
> +		dev_err(&pdev->dev, "Failed to get avdd-dsi-csi regulators\n");
> +		return -EPROBE_DEFER;
> +	}

and here.

> +	vi_tpg_fmts_bitmap_init(vi);
> +
> +	ret = tegra_vi_v4l2_init(vi);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Check whether VI is in test pattern generator (TPG) mode */
> +	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
> +			     &vi->pg_mode);

This doesn't sound right. Wouldn't this mean that you can either use the
device in TPG mode or sensor mode only? With no means of switching at
runtime? But then I see that there's an IOCTL to set this mode, so why
even bother having this in DT in the first place?

> +	/* Init Tegra VI channels */
> +	ret = tegra_vi_channels_init(vi);
> +	if (ret < 0)
> +		goto channels_error;
> +
> +	/* Setup media links between VI and external sensor subdev. */
> +	ret = tegra_vi_graph_init(vi);
> +	if (ret < 0)
> +		goto graph_error;
> +
> +	platform_set_drvdata(pdev, vi);
> +
> +	dev_info(vi->dev, "device registered\n");

Can we get rid of this, please? There's no use in spamming the kernel
log with brag. Let people know when things have failed. Success is the
expected outcome of ->probe().

> +static struct platform_driver tegra_vi_driver = {
> +	.driver = {
> +		.name = "tegra-vi",
> +		.of_match_table = tegra_vi_of_id_table,
> +	},
> +	.probe = tegra_vi_probe,
> +	.remove = tegra_vi_remove,
> +};
> +
> +module_platform_driver(tegra_vi_driver);

There's usually no blank line between the above.

> diff --git a/drivers/media/platform/tegra/tegra-vi.h b/drivers/media/platform/tegra/tegra-vi.h
> new file mode 100644
> index 0000000..d30a6ec
> --- /dev/null
> +++ b/drivers/media/platform/tegra/tegra-vi.h
> @@ -0,0 +1,224 @@
> +/*
> + * NVIDIA Tegra Video Input Device
> + *
> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * Author: Bryan Wu <pengw@nvidia.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __TEGRA_VI_H__
> +#define __TEGRA_VI_H__
> +
> +#include <linux/host1x.h>
> +#include <linux/list.h>
> +#include <linux/mutex.h>
> +#include <linux/spinlock.h>
> +#include <linux/videodev2.h>
> +
> +#include <media/media-device.h>
> +#include <media/media-entity.h>
> +#include <media/v4l2-async.h>
> +#include <media/v4l2-ctrls.h>
> +#include <media/v4l2-device.h>
> +#include <media/v4l2-dev.h>
> +#include <media/videobuf2-core.h>
> +
> +#include "tegra-core.h"
> +
> +#define MAX_CHAN_NUM	6
> +#define MAX_FORMAT_NUM	64

Perhaps these need to be runtime parameters to support multiple SoC
generations? Tegra K1 seems to have only 2 channels instead of 6.

> +
> +/**
> + * struct tegra_channel_buffer - video channel buffer
> + * @buf: vb2 buffer base object
> + * @queue: buffer list entry in the channel queued buffers list
> + * @chan: channel that uses the buffer
> + * @addr: Tegra IOVA buffer address for VI output
> + */
> +struct tegra_channel_buffer {
> +	struct vb2_buffer buf;
> +	struct list_head queue;
> +	struct tegra_channel *chan;
> +
> +	dma_addr_t addr;
> +};
> +
> +#define to_tegra_channel_buffer(vb) \
> +	container_of(vb, struct tegra_channel_buffer, buf)

I usually prefer static inline functions over macros for this type of
upcasting. But perhaps Hans prefers this, so I'll defer to his judgement
here.

> +struct chan_regs_config {
> +	u32 csi;
> +	u32 pp;
> +	u32 cil;
> +	u32 phy;
> +	u32 tpg;
> +};

Have you considered making these void __iomem * so that you can avoid
the addition of the offset whenever you access a register?

> +/**
> + * struct tegra_channel - Tegra video channel
> + * @list: list entry in a composite device dmas list
> + * @video: V4L2 video device associated with the video channel
> + * @video_lock:
> + * @pad: media pad for the video device entity
> + * @pipe: pipeline belonging to the channel
> + *
> + * @vi: composite device DT node port number for the channel
> + *
> + * @client: host1x client struct of Tegra DRM

host1x client is separate from Tegra DRM.

> + * @sp: host1x syncpoint pointer
> + *
> + * @work: kernel workqueue structure of this video channel
> + * @lock: protects the @format, @fmtinfo, @queue and @work fields
> + *
> + * @format: active V4L2 pixel format
> + * @fmtinfo: format information corresponding to the active @format
> + *
> + * @queue: vb2 buffers queue
> + * @alloc_ctx: allocation context for the vb2 @queue
> + * @sequence: V4L2 buffers sequence number
> + *
> + * @capture: list of queued buffers for capture
> + * @active: active buffer for capture
> + * @queued_lock: protects the buf_queued list
> + *
> + * @iomem: root register base
> + * @regs: CSI/CIL/PHY register bases
> + * @cil_clk: clock for CIL
> + * @align: channel buffer alignment, default is 64
> + * @port: CSI port of this video channel
> + * @surface: output memory surface number
> + * @io_id: Tegra IO rail ID of this video channel
> + * @bypass: a flag to bypass register write
> + *
> + * @fmts_bitmap: a bitmap for formats supported
> + *
> + * @remote_entity: remote media entity for external sensor
> + */
> +struct tegra_channel {
> +	struct list_head list;
> +	struct video_device video;
> +	struct mutex video_lock;
> +	struct media_pad pad;
> +	struct media_pipeline pipe;
> +
> +	struct tegra_vi_device *vi;
> +
> +	struct host1x_client client;
> +	struct host1x_syncpt *sp;
> +
> +	struct work_struct work;
> +	struct mutex lock;
> +
> +	struct v4l2_pix_format format;
> +	const struct tegra_video_format *fmtinfo;
> +
> +	struct vb2_queue queue;
> +	void *alloc_ctx;
> +	u32 sequence;
> +
> +	struct list_head capture;
> +	struct tegra_channel_buffer *active;
> +	spinlock_t queued_lock;
> +
> +	void __iomem *iomem;
> +	struct chan_regs_config regs;
> +	struct clk *cil_clk;
> +	int align;
> +	u32 port;

Those can both be unsigned int.

> +	u32 surface;

This seems to be fixed to 0, do we need it?

> +	int io_id;
> +	int bypass;

bool?

> +/**
> + * struct tegra_vi_device - NVIDIA Tegra Video Input device structure
> + * @v4l2_dev: V4L2 device
> + * @media_dev: media device
> + * @dev: device struct
> + *
> + * @iomem: register base
> + * @vi_clk: main clock for VI block
> + * @parent_clk: parent clock of VI clock
> + * @csi_clk: clock for CSI
> + * @vi_rst: reset controler
> + * @vi_reg: regulator for VI hardware, normally it avdd_dsi_csi
> + *
> + * @lock: mutex lock to protect power on/off operations
> + * @power_on_refcnt: reference count for power on/off operations
> + *
> + * @notifier: V4L2 asynchronous subdevs notifier
> + * @entities: entities in the graph as a list of tegra_vi_graph_entity
> + * @num_subdevs: number of subdevs in the pipeline
> + *
> + * @channels: list of channels at the pipeline output and input
> + *
> + * @ctrl_handler: V4L2 control handler
> + * @pattern: test pattern generator V4L2 control
> + * @pg_mode: test pattern generator mode (disabled/direct/patch)
> + * @tpg_fmts_bitmap: a bitmap for formats in test pattern generator mode
> + */
> +struct tegra_vi_device {
> +	struct v4l2_device v4l2_dev;
> +	struct media_device media_dev;
> +	struct device *dev;
> +
> +	void __iomem *iomem;
> +	struct clk *vi_clk;
> +	struct clk *parent_clk;
> +	struct clk *csi_clk;
> +	struct reset_control *vi_rst;
> +	struct regulator *vi_reg;
> +
> +	struct mutex lock;
> +	int power_on_refcnt;

unsigned int, or perhaps even atomic_t, in which case you might be able
to remove the locks from ->open()/->release().

> +
> +	struct v4l2_async_notifier notifier;
> +	struct list_head entities;
> +	unsigned int num_subdevs;
> +
> +	struct tegra_channel chans[MAX_CHAN_NUM];
> +
> +	struct v4l2_ctrl_handler ctrl_handler;
> +	struct v4l2_ctrl *pattern;
> +	int pg_mode;

Perhaps this should be an enum?

> diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
[...]
> +#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> +#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> +
> +/*
> + * Supported CSI to VI Data Formats
> + */
> +#define TEGRA_VF_RAW6		0
> +#define TEGRA_VF_RAW7		1
> +#define TEGRA_VF_RAW8		2
> +#define TEGRA_VF_RAW10		3
> +#define TEGRA_VF_RAW12		4
> +#define TEGRA_VF_RAW14		5
> +#define TEGRA_VF_EMBEDDED8	6
> +#define TEGRA_VF_RGB565		7
> +#define TEGRA_VF_RGB555		8
> +#define TEGRA_VF_RGB888		9
> +#define TEGRA_VF_RGB444		10
> +#define TEGRA_VF_RGB666		11
> +#define TEGRA_VF_YUV422		12
> +#define TEGRA_VF_YUV420		13
> +#define TEGRA_VF_YUV420_CSPS	14
> +
> +#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */

What do we need these for? These seem to me to be internal formats
supported by the hardware, but the existence of this file implies that
you plan on using them in the DT. What's the use-case?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21 13:03   ` Thierry Reding
@ 2015-08-21 13:25     ` Hans Verkuil
       [not found]     ` <20150821130339.GB22118-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
  1 sibling, 0 replies; 25+ messages in thread
From: Hans Verkuil @ 2015-08-21 13:25 UTC (permalink / raw)
  To: Thierry Reding, Bryan Wu
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw, gfitzer

On 08/21/2015 03:03 PM, Thierry Reding wrote:
> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
>> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
>> controller which can support up to 6 MIPI CSI camera sensors.
>>
>> This patch adds a V4L2 media controller and capture driver to support
>> Tegra VI hardware. It's verified with Tegra built-in test pattern
>> generator.
> 
> Hi Bryan,
> 
> I've been looking forward to seeing this posted. I don't know the VI
> hardware in very much detail, nor am I an expert on the media framework,
> so I will primarily comment on architectural or SoC-specific things.
> 
> By the way, please always Cc linux-tegra@vger.kernel.org on all patches
> relating to Tegra. That way people not explicitly Cc'ed but still
> interested in Tegra will see this code, even if they aren't subscribed
> to the linux-media mailing list.
> 
>> Signed-off-by: Bryan Wu <pengw@nvidia.com>
>> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
>> ---
>>  drivers/media/platform/Kconfig               |    1 +
>>  drivers/media/platform/Makefile              |    2 +
>>  drivers/media/platform/tegra/Kconfig         |    9 +
>>  drivers/media/platform/tegra/Makefile        |    3 +
>>  drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>>  drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>>  drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>>  drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>>  drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>>  include/dt-bindings/media/tegra-vi.h         |   35 +
>>  10 files changed, 2362 insertions(+)
>>  create mode 100644 drivers/media/platform/tegra/Kconfig
>>  create mode 100644 drivers/media/platform/tegra/Makefile
>>  create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>>  create mode 100644 drivers/media/platform/tegra/tegra-core.c
>>  create mode 100644 drivers/media/platform/tegra/tegra-core.h
>>  create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>>  create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>>  create mode 100644 include/dt-bindings/media/tegra-vi.h
> 
> I can't spot a device tree binding document for this, but we'll need one
> to properly review this driver.
> 
>> diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
>> new file mode 100644
>> index 0000000..a69d1b2
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Kconfig
>> @@ -0,0 +1,9 @@
>> +config VIDEO_TEGRA
>> +	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"
> 
> I don't think the (EXPERIMENTAL) is warranted. Either the driver works
> or it doesn't. And I assume you already tested that it works, even if
> only using the TPG.
> 
>> +	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF
> 
> This seems to be missing a couple of dependencies. For example I would
> expect at least TEGRA_HOST1X to be listed here to make sure people can't
> select this when the host1x API isn't available. I would also expect
> some sort of architecture dependency because it really makes no sense to
> build this if Tegra isn't supported.
> 
> If you are concerned about compile coverage you can make that explicit
> using a COMPILE_TEST alternative such as:
> 
> 	depends on ARCH_TEGRA || (ARM && COMPILE_TEST)
> 
> Note that the ARM dependency in there makes sure that HAVE_IOMEM is
> selected, so this could also be:
> 
> 	depends on ARCH_TEGRA || (HAVE_IOMEM && COMPILE_TEST)
> 
> though that'd still leave open the possibility of build breakage because
> of some missing support.
> 
> If you add the dependency on TEGRA_HOST1X that I mentioned above you
> shouldn't need any architecture dependency because TEGRA_HOST1X implies
> those already.
> 
>> +	select VIDEOBUF2_DMA_CONTIG
>> +	---help---
>> +	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.
> 
> I'd reword this slightly as:
> 
> 	  Driver for the Video Input (VI) controller found on NVIDIA Tegra
> 	  SoCs.
> 
>> +
>> +	  TO compile this driver as a module, choose M here: the module will be
> 
> s/TO/To/.
> 
>> +	  called tegra-video.
> 
>> diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
>> new file mode 100644
>> index 0000000..c8eff0b
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Makefile
>> @@ -0,0 +1,3 @@
>> +tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o
> 
> I'd personally leave out the redundant tegra- prefix here, because the
> files are in a tegra/ subdirectory already.

This is actually consistent with the other media drivers, so please keep the
prefix. One can debate whether the prefix makes sense or not, but changing
that would be a subsystem-wide change.

<snip>

>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>> +{
>> +	struct tegra_channel_buffer *buf = chan->active;
>> +	struct vb2_buffer *vb = &buf->buf;
>> +	int err = 0;
>> +	u32 thresh, value, frame_start;
>> +	int bytes_per_line = chan->format.bytesperline;
>> +
>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>> +		return -EINVAL;
>> +
>> +	if (chan->bypass)
>> +		goto bypass_done;
>> +
>> +	/* Program buffer address */
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>> +		  0x0);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>> +		  buf->addr);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>> +		  bytes_per_line);
>> +
>> +	/* Program syncpoint */
>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>> +			    frame_start | host1x_syncpt_id(chan->sp));
>> +
>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>> +
>> +	/* Use syncpoint to wake up */
>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>> +
>> +	mutex_unlock(&chan->lock);
>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>> +	mutex_lock(&chan->lock);
> 
> What's the point of taking the lock in the first place if you drop it
> here, even if temporarily? This is a per-channel lock, and it protects
> the channel against concurrent captures. So if you drop the lock here,
> don't you run risk of having two captures run concurrently? And by the
> time you get to the error handling or buffer completion below you can't
> be sure you're actually dealing with the same buffer that you started
> with.

My understanding from Bryan is that syncpt_wait is a blocking wait that
can take a long time (it's waiting for a new frame). Keeping the lock
across such a wait will prevent others ioctls that need that same lock
from being called during that time, which is perfectly legal and desirable.

BTW, you can't start two captures simultaneously for the same channel,
the vb2 framework protects against that.

But as you mentioned elsewhere as well, I think this part using workqueues
should be redesigned. It's not a good fit. Either fully interrupt driven (if
possible) or using a per-channel kernel thread.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21 13:03   ` Thierry Reding
@ 2015-08-25  0:26         ` Bryan Wu
       [not found]     ` <20150821130339.GB22118-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
  1 sibling, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25  0:26 UTC (permalink / raw)
  To: Thierry Reding
  Cc: hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On 08/21/2015 06:03 AM, Thierry Reding wrote:
> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
>> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
>> controller which can support up to 6 MIPI CSI camera sensors.
>>
>> This patch adds a V4L2 media controller and capture driver to support
>> Tegra VI hardware. It's verified with Tegra built-in test pattern
>> generator.
> Hi Bryan,
>
> I've been looking forward to seeing this posted. I don't know the VI
> hardware in very much detail, nor am I an expert on the media framework,
> so I will primarily comment on architectural or SoC-specific things.
>
> By the way, please always Cc linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org on all patches
> relating to Tegra. That way people not explicitly Cc'ed but still
> interested in Tegra will see this code, even if they aren't subscribed
> to the linux-media mailing list.
Oops. let me add linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org in Cc this time.

>> Signed-off-by: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> Reviewed-by: Hans Verkuil <hans.verkuil-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
>> ---
>>   drivers/media/platform/Kconfig               |    1 +
>>   drivers/media/platform/Makefile              |    2 +
>>   drivers/media/platform/tegra/Kconfig         |    9 +
>>   drivers/media/platform/tegra/Makefile        |    3 +
>>   drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>>   drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>>   drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>>   drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>>   drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>>   include/dt-bindings/media/tegra-vi.h         |   35 +
>>   10 files changed, 2362 insertions(+)
>>   create mode 100644 drivers/media/platform/tegra/Kconfig
>>   create mode 100644 drivers/media/platform/tegra/Makefile
>>   create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.h
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>>   create mode 100644 include/dt-bindings/media/tegra-vi.h
> I can't spot a device tree binding document for this, but we'll need one
> to properly review this driver.
Sure, I will add binding document for this.

>> diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
>> new file mode 100644
>> index 0000000..a69d1b2
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Kconfig
>> @@ -0,0 +1,9 @@
>> +config VIDEO_TEGRA
>> +	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"
> I don't think the (EXPERIMENTAL) is warranted. Either the driver works
> or it doesn't. And I assume you already tested that it works, even if
> only using the TPG.

OK, I will remove EXPERIMENTAL.

>> +	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF
> This seems to be missing a couple of dependencies. For example I would
> expect at least TEGRA_HOST1X to be listed here to make sure people can't
> select this when the host1x API isn't available. I would also expect
> some sort of architecture dependency because it really makes no sense to
> build this if Tegra isn't supported.
>
> If you are concerned about compile coverage you can make that explicit
> using a COMPILE_TEST alternative such as:
>
> 	depends on ARCH_TEGRA || (ARM && COMPILE_TEST)
>
> Note that the ARM dependency in there makes sure that HAVE_IOMEM is
> selected, so this could also be:
>
> 	depends on ARCH_TEGRA || (HAVE_IOMEM && COMPILE_TEST)
>
> though that'd still leave open the possibility of build breakage because
> of some missing support.
>
> If you add the dependency on TEGRA_HOST1X that I mentioned above you
> shouldn't need any architecture dependency because TEGRA_HOST1X implies
> those already.
Let me add 'depends on TEGRA_HOST1X' which depends on ARCH_TEGRA. Then I 
don't think I need more Tegra architecture specific rules here, because 
like pmc.c covers IO rails, powergate and reset-controller.

>> +	select VIDEOBUF2_DMA_CONTIG
>> +	---help---
>> +	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.
> I'd reword this slightly as:
>
> 	  Driver for the Video Input (VI) controller found on NVIDIA Tegra
> 	  SoCs.

Fixed.

>> +
>> +	  TO compile this driver as a module, choose M here: the module will be
> s/TO/To/.

Fixed.

>
>> +	  called tegra-video.
>> diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
>> new file mode 100644
>> index 0000000..c8eff0b
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Makefile
>> @@ -0,0 +1,3 @@
>> +tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o
> I'd personally leave out the redundant tegra- prefix here, because the
> files are in a tegra/ subdirectory already.
Right, some subsystem might don't like those prefix. But I just follow 
the rules in media subsystem here.

>> +obj-$(CONFIG_VIDEO_TEGRA) += tegra-video.o
>> diff --git a/drivers/media/platform/tegra/tegra-channel.c b/drivers/media/platform/tegra/tegra-channel.c
>> new file mode 100644
>> index 0000000..b0063d2
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-channel.c
>> @@ -0,0 +1,1074 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/bitmap.h>
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/host1x.h>
>> +#include <linux/lcm.h>
>> +#include <linux/list.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/slab.h>
>> +
>> +#include <media/v4l2-ctrls.h>
>> +#include <media/v4l2-dev.h>
>> +#include <media/v4l2-fh.h>
>> +#include <media/v4l2-ioctl.h>
>> +#include <media/videobuf2-core.h>
>> +#include <media/videobuf2-dma-contig.h>
>> +
>> +#include <soc/tegra/pmc.h>
>> +
>> +#include "tegra-vi.h"
>> +
>> +#define TEGRA_VI_SYNCPT_WAIT_TIMEOUT			200
>> +
>> +/* VI registers */
>> +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
>> +#define		SP_PP_LINE_START			4
>> +#define		SP_PP_FRAME_START			5
>> +#define		SP_MW_REQ_DONE				6
>> +#define		SP_MW_ACK_DONE				7
> Indentation is weird here. There also seems to be a mix of spaces and
> tabs in the register definitions below. I find that these end up hard to
> read, so it'd be good to make these consistent.
I will fix the indentation here. Since SP_XXX is a definition of some 
register bits, I put some indentation here to make it different from 
other register definitions.

I will replace spaces with tabs in other register definitions.

>> +/* CSI registers */
>> +#define TEGRA_VI_CSI_0_BASE                             0x100
>> +#define TEGRA_VI_CSI_1_BASE                             0x200
>> +#define TEGRA_VI_CSI_2_BASE                             0x300
>> +#define TEGRA_VI_CSI_3_BASE                             0x400
>> +#define TEGRA_VI_CSI_4_BASE                             0x500
>> +#define TEGRA_VI_CSI_5_BASE                             0x600
> You seem to be computing these offsets later on based on the CSI 0 base
> and an offset multiplied by the instance number. Perhaps define this as
>
> 	#define TEGRA_VI_CSI_BASE(x)	(0x100 + (x) * 0x100)
>
> to avoid the unused defines as well as the computation later on?

Good point. I will fix this.



>> +/* CSI Pixel Parser registers */
>> +#define TEGRA_CSI_PIXEL_PARSER_0_BASE			0x0838
>> +#define TEGRA_CSI_PIXEL_PARSER_1_BASE			0x086c
>> +#define TEGRA_CSI_PIXEL_PARSER_2_BASE			0x1038
>> +#define TEGRA_CSI_PIXEL_PARSER_3_BASE			0x106c
>> +#define TEGRA_CSI_PIXEL_PARSER_4_BASE			0x1838
>> +#define TEGRA_CSI_PIXEL_PARSER_5_BASE			0x186c
> Same comment as for TEGRA_VI_CSI_*_BASE above. Only the first of these
> is used.
Fixed.

>> +
>> +
>> +/* CSI Pixel Parser registers */
>> +#define TEGRA_CSI_INPUT_STREAM_CONTROL                  0x000
>> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL0                 0x004
>> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL1                 0x008
>> +#define TEGRA_CSI_PIXEL_STREAM_GAP                      0x00c
>> +#define TEGRA_CSI_PIXEL_STREAM_PP_COMMAND               0x010
>> +#define TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME           0x014
>> +#define TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK           0x018
>> +#define TEGRA_CSI_PIXEL_PARSER_STATUS                   0x01c
>> +#define TEGRA_CSI_CSI_SW_SENSOR_RESET                   0x020
>> +
>> +/* CSI PHY registers */
>> +#define TEGRA_CSI_CIL_PHY_0_BASE			0x0908
>> +#define TEGRA_CSI_CIL_PHY_1_BASE			0x1108
>> +#define TEGRA_CSI_CIL_PHY_2_BASE			0x1908
> Same as for the other base offsets.
Fixed

>> +#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
> This doesn't seem to be used at all.

Actually this PHY register just has this one only, I need define it as 
0x0 offset here. Let's keep this since in future we might have more PHY 
registers.

>
>> +/* CSI CIL registers */
>> +#define TEGRA_CSI_CIL_0_BASE				0x092c
>> +#define TEGRA_CSI_CIL_1_BASE				0x0960
>> +#define TEGRA_CSI_CIL_2_BASE				0x112c
>> +#define TEGRA_CSI_CIL_3_BASE				0x1160
>> +#define TEGRA_CSI_CIL_4_BASE				0x192c
>> +#define TEGRA_CSI_CIL_5_BASE				0x1960
> Again, unused base defines, so might be better to go with a
> parameterized definition.

Fixed

>> +#define TEGRA_CSI_CIL_PAD_CONFIG0                       0x000
>> +#define TEGRA_CSI_CIL_PAD_CONFIG1                       0x004
>> +#define TEGRA_CSI_CIL_PHY_CONTROL                       0x008
>> +#define TEGRA_CSI_CIL_INTERRUPT_MASK                    0x00c
>> +#define TEGRA_CSI_CIL_STATUS                            0x010
>> +#define TEGRA_CSI_CILX_STATUS                           0x014
>> +#define TEGRA_CSI_CIL_ESCAPE_MODE_COMMAND               0x018
>> +#define TEGRA_CSI_CIL_ESCAPE_MODE_DATA                  0x01c
>> +#define TEGRA_CSI_CIL_SW_SENSOR_RESET                   0x020
>> +
>> +/* CSI Pattern Generator registers */
>> +#define TEGRA_CSI_PATTERN_GENERATOR_0_BASE		0x09c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_1_BASE		0x09f8
>> +#define TEGRA_CSI_PATTERN_GENERATOR_2_BASE		0x11c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_3_BASE		0x11f8
>> +#define TEGRA_CSI_PATTERN_GENERATOR_4_BASE		0x19c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_5_BASE		0x19f8
> More unused base defines.
Fixed.

>
>> +#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
>> +#define TEGRA_CSI_PG_BLANK				0x004
>> +#define TEGRA_CSI_PG_PHASE				0x008
>> +#define TEGRA_CSI_PG_RED_FREQ				0x00c
>> +#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
>> +#define TEGRA_CSI_PG_GREEN_FREQ				0x014
>> +#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
>> +#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
>> +#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
>> +#define TEGRA_CSI_PG_AOHDR				0x024
>> +
>> +#define TEGRA_CSI_DPCM_CTRL_A				0xad0
>> +#define TEGRA_CSI_DPCM_CTRL_B				0xad4
>> +#define TEGRA_CSI_STALL_COUNTER				0xae8
>> +#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
>> +#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
>> +#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
>> +#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
>> +#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
>> +#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
>> +#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
> Some of these are unused. I guess there's an argument to be made to
> include all register definitions rather than just the used ones, if for
> nothing else than completeness. I'll defer to Hans's judgement on this.

These are VI/CSI global registers shared by all the channels. Some of 
them are used in this driver, I suggest we keep them here.

>> +/* Channel registers */
>> +static void tegra_channel_write(struct tegra_channel *chan, u32 addr, u32 val)
> I prefer unsigned int offset instead of u32 addr. That makes in more
> obvious that this is actually an offset from some I/O memory base
> address. Also using a sized type for the offset is a bit exaggerated
> because it doesn't need to be of any specific size.
>
> The same comment applies to the other accessors below.

OK , I will use unsigned int.

>> +{
>> +	if (chan->bypass)
>> +		return;
> I don't see this being set anywhere. Is it dead code? Also the only
> description I see is that it's used to bypass register writes, but I
> don't see an explanation why that's necessary.

We are unifying our downstream VI driver with V4L2 VI driver. And this 
upstream work is the first step to help that.

We are also backporting this driver back to our internal 3.10 kernel 
which is using nvhost channel to submit register operations from 
userspace to host1x and VI hardware. Then in this case, our driver needs 
to bypass all the register operations otherwise we got conflicts between 
these 2 paths.

That's why I put bypass mode here. And bypass mode can be set in device 
tree or from v4l2-ctrls.

>> +/* CIL PHY registers */
>> +static void phy_write(struct tegra_channel *chan, u32 val)
>> +{
>> +	tegra_channel_write(chan, chan->regs.phy, val);
>> +}
>> +
>> +static u32 phy_read(struct tegra_channel *chan)
>> +{
>> +	return tegra_channel_read(chan, chan->regs.phy);
>> +}
> Are these missing an offset parameter? Or do these subblocks only have a
> single register? Even if that's the case, I think it'd be more
> consistent to have the same signature as the other accessors.
OK, I will fix this.


>> +/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
>> +static u32 sp_bit(struct tegra_channel *chan, u32 sp)
>> +{
>> +	return (sp + chan->port * 4) << 8;
>> +}
> Technically this returns a mask, not a bit, so sp_mask() would be more
> appropriate.
Actually it returns the syncpoint value for each port not a mask. 
Probably sp_bits() is better.

>> +/* Calculate register base */
>> +static u32 regs_base(u32 regs_base, int port)
>> +{
>> +	return regs_base + (port / 2 * 0x800) + (port & 1) * 0x34;
>> +}
>> +
>> +/* CSI channel IO Rail IDs */
>> +int tegra_io_rail_csi_ids[] = {
> This can be static const as far as I can tell.

OK, I fixed this.

>> +	TEGRA_IO_RAIL_CSIA,
>> +	TEGRA_IO_RAIL_CSIB,
>> +	TEGRA_IO_RAIL_CSIC,
>> +	TEGRA_IO_RAIL_CSID,
>> +	TEGRA_IO_RAIL_CSIE,
>> +	TEGRA_IO_RAIL_CSIF,
>> +};
>> +
>> +void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan)
>> +{
>> +	int ret, index;
>> +	struct v4l2_subdev *subdev = chan->remote_entity->subdev;
>> +	struct v4l2_subdev_mbus_code_enum code = {
>> +		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
>> +	};
>> +
>> +
> Spurious blank line.

Removed
>
>> +static int tegra_channel_capture_setup(struct tegra_channel *chan)
>> +{
>> +	int lanes = 2;
> unsigned int? And why is it hardcoded to 2? There are checks below for
> lanes == 4, which will effectively never happen. So at the very least I
> think this should have a TODO comment of some sort. Preferably can it
> not be determined at runtime what number of lanes we need?
Sure, I forget to fix this. lanes should get from DT and for TPG mode I 
will choose lanes as 4 by default.

>> +	int port = chan->port;
> unsigned int?

fixed.

>
>> +	u32 height = chan->format.height;
>> +	u32 width = chan->format.width;
>> +	u32 format = chan->fmtinfo->img_fmt;
>> +	u32 data_type = chan->fmtinfo->img_dt;
>> +	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
>> +	struct chan_regs_config *regs = &chan->regs;
>> +
>> +	/* CIL PHY register setup */
>> +	if (port & 0x1) {
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>> +	} else {
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
>> +	}
> This seems to address registers not actually part of this channel. Why?
It's little bit hackish, but it's really have no choice. CIL PHY is 
shared by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. 
So we have 3 groups.


> Also you use magic numbers here and in the remainder of the driver. We
> should be able to do better. I presume all of this is documented in the
> TRM, so we should be able to easily substitute symbolic names.
I also got those magic numbers from internal source. Some of them are 
not in the TRM. And people just use that settings. I will try to convert 
them to some meaningful bit names. Please let me do it after I finished 
the whole work as an incremental patch.


>
>> +	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>> +	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>> +	if (lanes == 4) {
>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>> +		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>> +		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>> +	}
> And this seems to access registers from another port by temporarily
> rewriting the CIL base offset. That seems a little hackish to me. I
> don't know the hardware intimately enough to know exactly what this
> is supposed to accomplish, perhaps you can clarify? Also perhaps we
> can come up with some architectural overview of the VI hardware, or
> does such an overview exist in the TRM?

CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data 
lanes, then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF 
channels can not be used in this case.

That's why we need to access the CSIB/D/F registers in 4 data lanes use 
case.

> I see there is, perhaps add a comment somewhere, in the commit
> description or the file header giving a reference to where the
> architectural overview can be found?

It can be found in Tegra X1 TRM like this:
"The CSI unit provides for connection of up to six cameras in the system 
and is organized as three identical instances of two
MIPI support blocks, each with a separate 4-lane interface that can be 
configured as a single camera with 4 lanes or as a dual
camera with 2 lanes available for each camera."

What about I put this information in the code as a comment?
>> +	/* CSI pixel parser registers setup */
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
>> +		 0x280301f0 | (port & 0x1));
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
>> +	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
>> +		 0x3f0000 | (lanes - 1));
>> +
>> +	/* CIL PHY register setup */
>> +	if (lanes == 4)
>> +		phy_write(chan, 0x0101);
>> +	else {
>> +		u32 val = phy_read(chan);
>> +		if (port & 0x1)
>> +			val = (val & ~0x100) | 0x100;
>> +		else
>> +			val = (val & ~0x1) | 0x1;
>> +		phy_write(chan, val);
>> +	}
> The & ~ isn't quite doing what I suspect it should be doing. My
> assumption is that you want to set this register to 0x01 if the first
> port is to be used and 0x100 if the second port is to be used (or 0x101
> if both ports are to be used). In that case I think you'll want
> something like this:
>
> 	value = phy_read(chan);
>
> 	if (port & 1)
> 		value = (value & ~0x0001) | 0x0100;
> 	else
> 		value = (value & ~0x0100) | 0x0001;
>
> 	phy_write(chan, value);

I don't think your code is correct. The algorithm is to read out the 
share PHY register value and clear the port related bit and set that 
bit. Then it won't touch the setting of the other port. It means when we 
setup a channel it should not change the other channel which sharing PHY 
register with the current one.

In your case, you cleared the other port's bit and set the current port 
bit. When we write the value back to the PHY register, current port will 
be enabled but the other port will be disabled.

For example, like CSIA is running, the value of PHY register is 0x0001.
Then when we try to enable CSIB, we should write 0x0101 to the PHY 
register but not 0x0100.

>> +static void tegra_channel_capture_error(struct tegra_channel *chan, int err)
>> +{
>> +	u32 val;
>> +
>> +#ifdef DEBUG
>> +	val = tegra_channel_read(chan, TEGRA_CSI_DEBUG_COUNTER_0);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_DEBUG_COUNTER_0 0x%08x\n", val);
>> +#endif
>> +	val = cil_read(chan, TEGRA_CSI_CIL_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CIL_STATUS 0x%08x\n", val);
>> +	val = cil_read(chan, TEGRA_CSI_CILX_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CILX_STATUS 0x%08x\n", val);
>> +	val = pp_read(chan, TEGRA_CSI_PIXEL_PARSER_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_PIXEL_PARSER_STATUS 0x%08x\n",
>> +		val);
>> +	val = csi_read(chan, TEGRA_VI_CSI_ERROR_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_VI_CSI_ERROR_STATUS 0x%08x\n", val);
>> +}
> The err parameter is never used, so it should be dropped.
OK, I removed it.

>
>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>> +{
>> +	struct tegra_channel_buffer *buf = chan->active;
>> +	struct vb2_buffer *vb = &buf->buf;
>> +	int err = 0;
>> +	u32 thresh, value, frame_start;
>> +	int bytes_per_line = chan->format.bytesperline;
>> +
>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>> +		return -EINVAL;
>> +
>> +	if (chan->bypass)
>> +		goto bypass_done;
>> +
>> +	/* Program buffer address */
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>> +		  0x0);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>> +		  buf->addr);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>> +		  bytes_per_line);
>> +
>> +	/* Program syncpoint */
>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>> +			    frame_start | host1x_syncpt_id(chan->sp));
>> +
>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>> +
>> +	/* Use syncpoint to wake up */
>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>> +
>> +	mutex_unlock(&chan->lock);
>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>> +	mutex_lock(&chan->lock);
> What's the point of taking the lock in the first place if you drop it
> here, even if temporarily? This is a per-channel lock, and it protects
> the channel against concurrent captures. So if you drop the lock here,
> don't you run risk of having two captures run concurrently? And by the
> time you get to the error handling or buffer completion below you can't
> be sure you're actually dealing with the same buffer that you started
> with.

After some discussion with Hans, I changed to this. Since there won't be 
a second capture start which is prevented by v4l2-core, it won't cause 
the buffer issue.

Waiting for host1x syncpoint take time, so dropping lock can let other 
non-capture ioctls and operations happen.
>> +
>> +	if (err) {
>> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
>> +		tegra_channel_capture_error(chan, err);
>> +	}
> Is timeout really the only kind of error that can happen here?
>
I actually don't know other errors. Any other errors I need take of here?

>> +
>> +bypass_done:
>> +	/* Captured one frame */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	vb->v4l2_buf.sequence = chan->sequence++;
>> +	vb->v4l2_buf.field = V4L2_FIELD_NONE;
>> +	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
>> +	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
>> +	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
>> +	spin_unlock_irq(&chan->queued_lock);
> Do we really need to set all the buffer fields on error? Isn't it enough
> to simply mark the state as "error"?

I believe vb2_buffer_done() needs some fields to set. The code here is 
not very heavy but support both DONE and ERROR mode.

>> +
>> +	return err;
>> +}
>> +
>> +static void tegra_channel_work(struct work_struct *work)
>> +{
>> +	struct tegra_channel *chan =
>> +		container_of(work, struct tegra_channel, work);
>> +
>> +	while (1) {
>> +		spin_lock_irq(&chan->queued_lock);
>> +		if (list_empty(&chan->capture)) {
>> +			chan->active = NULL;
>> +			spin_unlock_irq(&chan->queued_lock);
>> +			return;
>> +		}
>> +		chan->active = list_entry(chan->capture.next,
>> +				struct tegra_channel_buffer, queue);
>> +		list_del_init(&chan->active->queue);
>> +		spin_unlock_irq(&chan->queued_lock);
>> +
>> +		mutex_lock(&chan->lock);
>> +		tegra_channel_capture_frame(chan);
>> +		mutex_unlock(&chan->lock);
>> +	}
>> +}
> Should this have some mechanism to break out of the loop, for example if
> somebody requested capturing to stop?
I will move to a kthread solution as Hans pointed out.

>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	buf->chan = chan;
>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>> +
>> +	return 0;
>> +}
> This seems to use contiguous DMA, which I guess presumes CMA support?
> We're dealing with very large buffers here. Your default frame size
> would yield buffers of roughly 32 MiB each, and you probably need a
> couple of those to ensure smooth playback. That's quite a bit of
> memory to reserve for CMA.
In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.

For CMA we need increase the default memory size.

> Have you ever tried to make this work with the IOMMU API so that we can
> allocate arbitrary buffers and linearize them for the hardware through
> the SMMU?
I tested this code in downstream kernel with SMMU. Do we fully support 
SMMU in upstream version? I didn't check that.

>> +static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	/* Put buffer into the  capture queue */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	list_add_tail(&buf->queue, &chan->capture);
>> +	spin_unlock_irq(&chan->queued_lock);
>> +
>> +	/* Start work queue to capture data to buffer */
>> +	if (vb2_start_streaming_called(&chan->queue))
>> +		schedule_work(&chan->work);
>> +}
> I'm beginning to wonder if a workqueue is the best implementation here.
> Couldn't we get notification on syncpoint increments and have a handler
> setup capture of new frames?

I will move to more flexible solution kthread then.

>> +static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
>> +	struct media_pipeline *pipe = chan->video.entity.pipe;
>> +	struct tegra_channel_buffer *buf, *nbuf;
>> +	int ret = 0;
>> +
>> +	if (!chan->vi->pg_mode && !chan->remote_entity) {
>> +		dev_err(&chan->video.dev,
>> +			"is not in TPG mode and has not sensor connected!\n");
>> +		ret = -EINVAL;
>> +		goto vb2_queued;
>> +	}
>> +
>> +	mutex_lock(&chan->lock);
>> +
>> +	/* Start CIL clock */
>> +	clk_set_rate(chan->cil_clk, 102000000);
>> +	clk_prepare_enable(chan->cil_clk);
> You need to check these for errors.
Fixed
>
>> +static struct vb2_ops tegra_channel_queue_qops = {
>> +	.queue_setup = tegra_channel_queue_setup,
>> +	.buf_prepare = tegra_channel_buffer_prepare,
>> +	.buf_queue = tegra_channel_buffer_queue,
>> +	.wait_prepare = vb2_ops_wait_prepare,
>> +	.wait_finish = vb2_ops_wait_finish,
>> +	.start_streaming = tegra_channel_start_streaming,
>> +	.stop_streaming = tegra_channel_stop_streaming,
>> +};
> I think this needs to be static const.
Fixed
>
>> +static int
>> +tegra_channel_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
>> +{
>> +	struct v4l2_fh *vfh = file->private_data;
>> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
>> +
>> +	cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING;
>> +	cap->capabilities = cap->device_caps | V4L2_CAP_DEVICE_CAPS;
>> +
>> +	strlcpy(cap->driver, "tegra-vi", sizeof(cap->driver));
> Perhaps "tegra-video" to be consistent with the module name?
OK, fixed.


>> +	strlcpy(cap->card, chan->video.name, sizeof(cap->card));
>> +	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s:%u",
>> +		 chan->vi->dev->of_node->name, chan->port);
> Should this not rather use dev_name(chan->vi->dev) to ensure it works
> fine if ever we have multiple instances of the VI controller?
>

Fixed.

>> +static int
>> +tegra_channel_enum_format(struct file *file, void *fh, struct v4l2_fmtdesc *f)
>> +{
>> +	struct v4l2_fh *vfh = file->private_data;
>> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
>> +	int index, i;
> These can probably be unsigned int.
>
>> +	unsigned long *fmts_bitmap = NULL;
>> +
>> +	if (chan->vi->pg_mode)
>> +		fmts_bitmap = chan->vi->tpg_fmts_bitmap;
>> +	else if (chan->remote_entity)
>> +		fmts_bitmap = chan->fmts_bitmap;
>> +
>> +	if (!fmts_bitmap ||
>> +	    f->index > bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM) - 1)
>> +		return -EINVAL;
>> +
>> +	index = -1;
> This won't work with unsigned int, of course (actually, it would, but
> it'd be ugly), but I think you could work around that by doing the more
> natural:
>
>> +	for (i = 0; i < f->index + 1; i++)
>> +		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index + 1);
> 	index = 0;
>
> 	for (i = 0; i < f->index + 1; i++, index++)
> 		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index);

Sure, fixed all of them

>
>> +static void
>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>> +		      const struct tegra_video_format **fmtinfo)
>> +{
>> +	const struct tegra_video_format *info;
>> +	unsigned int min_width;
>> +	unsigned int max_width;
>> +	unsigned int min_bpl;
>> +	unsigned int max_bpl;
>> +	unsigned int width;
>> +	unsigned int align;
>> +	unsigned int bpl;
>> +
>> +	/* Retrieve format information and select the default format if the
>> +	 * requested format isn't supported.
>> +	 */
>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>> +	if (!info)
>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> Should this not be an error? As far as I can tell this is silently
> substituting the default format for the requested one if the requested
> one isn't supported. Isn't the whole point of this to find out if some
> format is supported?
>
I think it should return some error and escape following code. I will 
fix that.


>> +
>> +	pix->pixelformat = info->fourcc;
>> +	pix->field = V4L2_FIELD_NONE;
>> +
>> +	/* The transfer alignment requirements are expressed in bytes. Compute
>> +	 * the minimum and maximum values, clamp the requested width and convert
>> +	 * it back to pixels.
>> +	 */
>> +	align = lcm(chan->align, info->bpp);
>> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
>> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
>> +	width = rounddown(pix->width * info->bpp, align);
> Shouldn't these be roundup()?
Why? I don't understand but rounddown looks good to me

>> +
>> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
>> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
>> +			    TEGRA_MAX_HEIGHT);
> The above fits nicely on one line and doesn't need to be wrapped.
Fixed
>
>> +
>> +	/* Clamp the requested bytes per line value. If the maximum bytes per
>> +	 * line value is zero, the module doesn't support user configurable line
>> +	 * sizes. Override the requested value with the minimum in that case.
>> +	 */
>> +	min_bpl = pix->width * info->bpp;
>> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
>> +	bpl = rounddown(pix->bytesperline, chan->align);
> Again, I think these should be roundup().

Why? I don't understand but rounddown looks good to me
>
>> +static int tegra_channel_v4l2_open(struct file *file)
>> +{
>> +	struct tegra_channel *chan = video_drvdata(file);
>> +	struct tegra_vi_device *vi = chan->vi;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&vi->lock);
>> +	ret = v4l2_fh_open(file);
>> +	if (ret)
>> +		goto unlock;
>> +
>> +	/* The first open then turn on power*/
>> +	if (!vi->power_on_refcnt) {
>> +		tegra_vi_power_on(chan->vi);
> Perhaps propagate error codes here?
>
>> +
>> +		usleep_range(5, 100);
>> +		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
>> +		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
>> +		usleep_range(10, 15);
>> +	}
>> +	vi->power_on_refcnt++;
> Also, I wonder if powering up at ->open() time isn't a little early. I
> could very well imagine an application opening up a device and then not
> use it for a long time. Or keep it open even while nothing is being
> captures. But that's primarily an optimization matter, so this is fine
> with me.
>

I think I can move this whole open/release things to start_streaming() 
point. And use v4l2 default open/release function.

>> +int tegra_channel_init(struct tegra_vi_device *vi,
>> +		       struct tegra_channel *chan,
>> +		       u32 port)
> The above fits on 2 lines, no need to make it three. Also port should
> probably be unsigned int because the size isn't important.

Fixed

>> +{
>> +	int ret;
>> +
>> +	chan->vi = vi;
>> +	chan->port = port;
>> +	chan->iomem = vi->iomem;
>> +
>> +	/* Init channel register base */
>> +	chan->regs.csi = TEGRA_VI_CSI_0_BASE + port * 0x100;
>> +	chan->regs.pp = regs_base(TEGRA_CSI_PIXEL_PARSER_0_BASE, port);
>> +	chan->regs.cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>> +	chan->regs.phy = TEGRA_CSI_CIL_PHY_0_BASE + port / 2 * 0x800;
>> +	chan->regs.tpg = regs_base(TEGRA_CSI_PATTERN_GENERATOR_0_BASE, port);
> Like I said, I think it'd be clearer to have the defines parameterized.
> That would also make this more consistent, rather than have one set of
> values that are computed here and for others the regs_base() helper is
> invoked. Also, I think it'd be better to have the regs structures take
> void __iomem * directly, so that the offset addition doesn't have to be
> performed at every register access.

OK, I see. I will fix this.


>> +
>> +	/* Init CIL clock */
>> +	switch (chan->port) {
>> +	case 0:
>> +	case 1:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilab");
>> +		break;
>> +	case 2:
>> +	case 3:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilcd");
>> +		break;
>> +	case 4:
>> +	case 5:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cile");
>> +		break;
>> +	default:
>> +		dev_err(chan->vi->dev, "wrong port nubmer %d\n", port);
> Nit: you should use %u for unsigned integers.

Fixed.

>> +	}
>> +	if (IS_ERR(chan->cil_clk)) {
>> +		dev_err(chan->vi->dev, "Failed to get CIL clock\n");
> Perhaps mention which clock couldn't be received.

Fixed

>
>> +		return -EINVAL;
> And propagate the error code rather than returning a hardcoded one.
Fixed.

>
>> +	}
>> +
>> +	/* VI Channel is 64 bytes alignment */
>> +	chan->align = 64;
> Does this need parameterization for other SoC generations?

So far it's 64 bytes and I don't see any change about this in the future 
generations.

>
>> +	chan->surface = 0;
> I can't find this being set to anything other than 0. What is its use?

Each channel actually has 3 memory output surfaces. But I don't find any 
use case to use the surface 1 and surface 2. So I just added this 
parameter for future usage.

chan->surface is used in tegra_channel_capture_frame()

>
>> +	chan->io_id = tegra_io_rail_csi_ids[chan->port];
>> +	mutex_init(&chan->lock);
>> +	mutex_init(&chan->video_lock);
>> +	INIT_LIST_HEAD(&chan->capture);
>> +	spin_lock_init(&chan->queued_lock);
>> +	INIT_WORK(&chan->work, tegra_channel_work);
>> +
>> +	/* Init video format */
>> +	chan->fmtinfo = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>> +	chan->format.pixelformat = chan->fmtinfo->fourcc;
>> +	chan->format.colorspace = V4L2_COLORSPACE_SRGB;
>> +	chan->format.field = V4L2_FIELD_NONE;
>> +	chan->format.width = TEGRA_DEF_WIDTH;
>> +	chan->format.height = TEGRA_DEF_HEIGHT;
>> +	chan->format.bytesperline = chan->format.width * chan->fmtinfo->bpp;
>> +	chan->format.sizeimage = chan->format.bytesperline *
>> +				    chan->format.height;
>> +
>> +	/* Initialize the media entity... */
>> +	chan->pad.flags = MEDIA_PAD_FL_SINK;
>> +
>> +	ret = media_entity_init(&chan->video.entity, 1, &chan->pad, 0);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* ... and the video node... */
>> +	chan->video.fops = &tegra_channel_fops;
>> +	chan->video.v4l2_dev = &vi->v4l2_dev;
>> +	chan->video.queue = &chan->queue;
>> +	snprintf(chan->video.name, sizeof(chan->video.name), "%s %s %u",
>> +		 vi->dev->of_node->name, "output", port);
> dev_name()?

Fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-core.c b/drivers/media/platform/tegra/tegra-core.c
> [...]
>> +const struct tegra_video_format tegra_video_formats[] = {
> Does this need to be exposed? I see there are accessors for this below,
> so exposing the structure itself doesn't seem necessary.

OK, I will fix this.

>> +int tegra_core_get_formats_array_size(void)
>> +{
>> +	return ARRAY_SIZE(tegra_video_formats);
>> +}
>> +
>> +/**
>> + * tegra_core_get_word_count - Calculate word count
>> + * @frame_width: number of pixels in one frame
>> + * @fmt: Tegra Video format struct which has BPP information
>> + *
>> + * Return: word count number
>> + */
>> +u32 tegra_core_get_word_count(u32 frame_width,
>> +			      const struct tegra_video_format *fmt)
>> +{
>> +	return frame_width * fmt->width / 8;
>> +}
> This is confusing. If frame_width is the number of pixels in one frame,
> then it should probably me called frame_size or so. frame_width to me
> implies number of pixels per line, not per frame.

Actually the comment is wrong. I will fix that.


>> +/**
>> + * tegra_core_get_idx_by_code - Retrieve index for a media bus code
>> + * @code: the format media bus code
>> + *
>> + * Return: a index to the format information structure corresponding to the
>> + * given V4L2 media bus format @code, or -1 if no corresponding format can
>> + * be found.
>> + */
>> +int tegra_core_get_idx_by_code(unsigned int code)
>> +{
>> +	unsigned int i;
>> +	const struct tegra_video_format *format;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
>> +		format = &tegra_video_formats[i];
>> +
>> +		if (format->code == code)
> You only use the format value once, so the temporary variable doesn't
> buy you anything.
>
OK, I will remove 'format'.


>> +			return i;
>> +	}
>> +
>> +	return -1;
>> +}
>> +
>> +
> Gratuitous blank line.

Fixed.

>
>> +/**
>> + * tegra_core_of_get_format - Parse a device tree node and return format
>> + * 			      information
> Why is this necessary? Why would you ever need to encode a pixel format
> in DT?

This is dead code. I will remove them.

>> +/**
>> + * tegra_core_bytes_per_line - Calculate bytes per line in one frame
>> + * @width: frame width
>> + * @fmt: Tegra Video format
>> + *
>> + * Simply calcualte the bytes_per_line and if it's not 64 bytes aligned it
>> + * will be padded to 64 boundary.
>> + */
>> +u32 tegra_core_bytes_per_line(u32 width,
>> +			      const struct tegra_video_format *fmt)
>> +{
>> +	u32 bytes_per_line = width * fmt->bpp;
>> +
>> +	if (bytes_per_line % 64)
>> +		bytes_per_line = bytes_per_line +
>> +				 (64 - (bytes_per_line % 64));
>> +
>> +	return bytes_per_line;
>> +}
> Perhaps this should use the channel->align field for alignment rather
> than hardcode 64? Since there's no channel being passed into this, maybe
> passing the alignment as a parameter would work?
>
> Also can't the above be replaced by:
>
> 	return roundup(width * fmt->bpp, align);
>
> ?

Great, I will fix that.

>
>> diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
>> new file mode 100644
>> index 0000000..7d1026b
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-core.h
>> @@ -0,0 +1,134 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device Driver Core Helpers
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __TEGRA_CORE_H__
>> +#define __TEGRA_CORE_H__
>> +
>> +#include <dt-bindings/media/tegra-vi.h>
>> +
>> +#include <media/v4l2-subdev.h>
>> +
>> +/* Minimum and maximum width and height common to Tegra video input device. */
>> +#define TEGRA_MIN_WIDTH		32U
>> +#define TEGRA_MAX_WIDTH		7680U
>> +#define TEGRA_MIN_HEIGHT	32U
>> +#define TEGRA_MAX_HEIGHT	7680U
> Is this dependent on SoC generation? If we wanted to support Tegra K1,
> would the same values apply or do they need to be parameterized?
I actually don't get any information about this max/min resolution. Here 
I just put some values for the format calculation.

> On that note, could you outline what would be necessary to make this
> work on Tegra K1? What are the differences between the VI hardware on
> Tegra X1 vs. Tegra K1?
>
Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6 
channels, Tegra K1 has 2 channels.


>> +
>> +/* UHD 4K resolution as default resolution for all Tegra video input device. */
>> +#define TEGRA_DEF_WIDTH		3840
>> +#define TEGRA_DEF_HEIGHT	2160
> Is this a sensible default? It seems rather large to me.
Actually I use this for TPG which is the default setting of VI. And it 
can be override from user space IOCTL.

>> +
>> +#define TEGRA_VF_DEF		TEGRA_VF_RGB888
>> +#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
> Should we not have only one of these and convert to the other via some
> table?

This is also TPG default mode

>> +/* These go into the TEGRA_VI_CSI_n_IMAGE_DEF registers bits 23:16 */
>> +#define TEGRA_IMAGE_FORMAT_T_L8                         16
>> +#define TEGRA_IMAGE_FORMAT_T_R16_I                      32
>> +#define TEGRA_IMAGE_FORMAT_T_B5G6R5                     33
>> +#define TEGRA_IMAGE_FORMAT_T_R5G6B5                     34
>> +#define TEGRA_IMAGE_FORMAT_T_A1B5G5R5                   35
>> +#define TEGRA_IMAGE_FORMAT_T_A1R5G5B5                   36
>> +#define TEGRA_IMAGE_FORMAT_T_B5G5R5A1                   37
>> +#define TEGRA_IMAGE_FORMAT_T_R5G5B5A1                   38
>> +#define TEGRA_IMAGE_FORMAT_T_A4B4G4R4                   39
>> +#define TEGRA_IMAGE_FORMAT_T_A4R4G4B4                   40
>> +#define TEGRA_IMAGE_FORMAT_T_B4G4R4A4                   41
>> +#define TEGRA_IMAGE_FORMAT_T_R4G4B4A4                   42
>> +#define TEGRA_IMAGE_FORMAT_T_A8B8G8R8                   64
>> +#define TEGRA_IMAGE_FORMAT_T_A8R8G8B8                   65
>> +#define TEGRA_IMAGE_FORMAT_T_B8G8R8A8                   66
>> +#define TEGRA_IMAGE_FORMAT_T_R8G8B8A8                   67
>> +#define TEGRA_IMAGE_FORMAT_T_A2B10G10R10                68
>> +#define TEGRA_IMAGE_FORMAT_T_A2R10G10B10                69
>> +#define TEGRA_IMAGE_FORMAT_T_B10G10R10A2                70
>> +#define TEGRA_IMAGE_FORMAT_T_R10G10B10A2                71
>> +#define TEGRA_IMAGE_FORMAT_T_A8Y8U8V8                   193
>> +#define TEGRA_IMAGE_FORMAT_T_V8U8Y8A8                   194
>> +#define TEGRA_IMAGE_FORMAT_T_A2Y10U10V10                197
>> +#define TEGRA_IMAGE_FORMAT_T_V10U10Y10A2                198
>> +#define TEGRA_IMAGE_FORMAT_T_Y8_U8__Y8_V8               200
>> +#define TEGRA_IMAGE_FORMAT_T_Y8_V8__Y8_U8               201
>> +#define TEGRA_IMAGE_FORMAT_T_U8_Y8__V8_Y8               202
>> +#define TEGRA_IMAGE_FORMAT_T_T_V8_Y8__U8_Y8             203
>> +#define TEGRA_IMAGE_FORMAT_T_T_Y8__U8__V8_N444          224
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N444              225
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N444              226
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N422            227
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N422              228
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N422              229
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N420            230
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N420              231
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N420              232
>> +#define TEGRA_IMAGE_FORMAT_T_X2Lc10Lb10La10             233
>> +#define TEGRA_IMAGE_FORMAT_T_A2R6R6R6R6R6               234
>> +
>> +/* These go into the TEGRA_VI_CSI_n_CSI_IMAGE_DT registers bits 7:0 */
>> +#define TEGRA_IMAGE_DT_YUV420_8                         24
>> +#define TEGRA_IMAGE_DT_YUV420_10                        25
>> +#define TEGRA_IMAGE_DT_YUV420CSPS_8                     28
>> +#define TEGRA_IMAGE_DT_YUV420CSPS_10                    29
>> +#define TEGRA_IMAGE_DT_YUV422_8                         30
>> +#define TEGRA_IMAGE_DT_YUV422_10                        31
>> +#define TEGRA_IMAGE_DT_RGB444                           32
>> +#define TEGRA_IMAGE_DT_RGB555                           33
>> +#define TEGRA_IMAGE_DT_RGB565                           34
>> +#define TEGRA_IMAGE_DT_RGB666                           35
>> +#define TEGRA_IMAGE_DT_RGB888                           36
>> +#define TEGRA_IMAGE_DT_RAW6                             40
>> +#define TEGRA_IMAGE_DT_RAW7                             41
>> +#define TEGRA_IMAGE_DT_RAW8                             42
>> +#define TEGRA_IMAGE_DT_RAW10                            43
>> +#define TEGRA_IMAGE_DT_RAW12                            44
>> +#define TEGRA_IMAGE_DT_RAW14                            45
> It might be helpful to describe what these registers actually do. There
> seems to be overlap between both lists, but I don't quite see how they
> relate to one another, or what their purpose is.
These tables are from our TRM. The first table is "Pixel memory format 
for the VI channel". The second one is "VI channel input data type".

Let me put some comments there.


>> +/**
>> + * struct tegra_video_format - Tegra video format description
>> + * @vf_code: video format code
>> + * @width: format width in bits per component
>> + * @code: media bus format code
>> + * @bpp: bytes per pixel (when stored in memory)
>> + * @img_fmt: image format
>> + * @img_dt: image data type
>> + * @fourcc: V4L2 pixel format FCC identifier
>> + * @description: format description, suitable for userspace
>> + */
>> +struct tegra_video_format {
>> +	u32 vf_code;
>> +	u32 width;
>> +	u32 code;
>> +	u32 bpp;
> I think the above four can all be unsigned int. A sized type is not
> necessary here.
OK, I will fix this.


>> +	u32 img_fmt;
>> +	u32 img_dt;
> Perhaps these could be enums?

OK, I will use enums.
>
>> +	u32 fourcc;
>> +};
>> +
>> +extern const struct tegra_video_format tegra_video_formats[];
> It looks like you have accessors for this. Do you even need to expose
> it?

Fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-vi.c b/drivers/media/platform/tegra/tegra-vi.c
> [...]
>> +static void tegra_vi_v4l2_cleanup(struct tegra_vi_device *vi)
>> +{
>> +	v4l2_ctrl_handler_free(&vi->ctrl_handler);
>> +	v4l2_device_unregister(&vi->v4l2_dev);
>> +	media_device_unregister(&vi->media_dev);
>> +}
>> +
>> +static int tegra_vi_v4l2_init(struct tegra_vi_device *vi)
>> +{
>> +	int ret;
>> +
>> +	vi->media_dev.dev = vi->dev;
>> +	strlcpy(vi->media_dev.model, "NVIDIA Tegra Video Input Device",
>> +		sizeof(vi->media_dev.model));
>> +	vi->media_dev.hw_revision = 0;
> Actually, I think for Tegra X1 the hardware revision would be 3, since
> VI3 is what it's usually referred to. Tegra K1 has VI2, so this should
> be parameterized (at least when Tegra K1 support is added).
OK, I will choose 3 for Tegra X1 since we mentioned that in TRM like VI3.

>> +int tegra_vi_power_on(struct tegra_vi_device *vi)
>> +{
>> +	int ret;
>> +
>> +	ret = regulator_enable(vi->vi_reg);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_VENC,
>> +						vi->vi_clk, vi->vi_rst);
>> +	if (ret) {
>> +		regulator_disable(vi->vi_reg);
>> +		return ret;
>> +	}
>> +
>> +	clk_prepare_enable(vi->csi_clk);
>> +
>> +	clk_set_rate(vi->parent_clk, 408000000);
> Do we really need to set the parent? Isn't that going to be set
> automatically since vi_clk is the child of parent_clk?
Sure, I will remove this.


>> +	clk_set_rate(vi->vi_clk, 408000000);
>> +	clk_set_rate(vi->csi_clk, 408000000);
> Also all of these clock functions can fail, so you should check for
> errors.
>

Fixed.

>> +
>> +	return 0;
>> +}
>> +
>> +void tegra_vi_power_off(struct tegra_vi_device *vi)
>> +{
>> +	clk_disable_unprepare(vi->csi_clk);
>> +	tegra_powergate_power_off(TEGRA_POWERGATE_VENC);
> tegra_powergate_power_off() doesn't do anything with the clock or the
> reset, so you'll want to manually assert reset here and then disable and
> unprepare the clock. And I think both need to happen before the power
> partition is turned off.

Got it, I will fix this.


>> +	regulator_disable(vi->vi_reg);
>> +}
>> +
>> +static int tegra_vi_channels_init(struct tegra_vi_device *vi)
>> +{
>> +	int i, ret;
> i can be unsigned.

Fixed
>
>> +	struct tegra_channel *chan;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>> +		chan = &vi->chans[i];
>> +
>> +		ret = tegra_channel_init(vi, chan, i);
> Again, chan is only used once, so directly passing &vi->chans[i] to
> tegra_channel_init() would be more concise.
OK, I will remove 'chan' parameter from the list. And just pass i as the 
port number.

>
>> +static int tegra_vi_channels_cleanup(struct tegra_vi_device *vi)
>> +{
>> +	int i, ret;
>> +	struct tegra_channel *chan;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>> +		chan = &vi->chans[i];
>> +
>> +		ret = tegra_channel_cleanup(chan);
>> +		if (ret < 0) {
>> +			dev_err(vi->dev, "channel %d cleanup failed\n", i);
>> +			return ret;
>> +		}
>> +	}
>> +	return 0;
>> +}
> Same comments as for tegra_vi_channels_init().

Fixed.

>> +
>> +/* -----------------------------------------------------------------------------
>> + * Graph Management
>> + */
> The way devices are hooked up using the graph needs to be documented in
> a device tree binding.

Sure. This is actually the default video-interface binding. I will 
provide a document about device tree binding.

>> +static int tegra_vi_graph_notify_complete(struct v4l2_async_notifier *notifier)
>> +{
>> +	struct tegra_vi_device *vi =
>> +		container_of(notifier, struct tegra_vi_device, notifier);
>> +	int ret;
>> +
>> +	dev_dbg(vi->dev, "notify complete, all subdevs registered\n");
>> +
>> +	/* Create links for every entity. */
>> +	ret = tegra_vi_graph_build_links(vi);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	ret = v4l2_device_register_subdev_nodes(&vi->v4l2_dev);
>> +	if (ret < 0)
>> +		dev_err(vi->dev, "failed to register subdev nodes\n");
>> +
>> +	return ret;
>> +}
> Why the need for this notifier mechanism, doesn't deferred probe work
> here?

I will revisit this after media controller and graph probing change in 
upstream mentioned by Hans.

>> +static int tegra_vi_graph_notify_bound(struct v4l2_async_notifier *notifier,
>> +				   struct v4l2_subdev *subdev,
>> +				   struct v4l2_async_subdev *asd)
>> +{
> [...]
>> +}
>> +
>> +
> Gratuitous blank line.
Fixed.

>
>> +static int tegra_vi_graph_init(struct tegra_vi_device *vi)
>> +{
>> +	struct device_node *node = vi->dev->of_node;
>> +	struct device_node *ep = NULL;
>> +	struct device_node *next;
>> +	struct device_node *remote = NULL;
>> +	struct tegra_vi_graph_entity *entity;
>> +	struct v4l2_async_subdev **subdevs = NULL;
>> +	unsigned int num_subdevs;
> This variable is being used uninitialized.
>

Fixed.

>> +static int tegra_vi_probe(struct platform_device *pdev)
>> +{
>> +	struct resource *res;
>> +	struct tegra_vi_device *vi;
>> +	int ret = 0;
>> +
>> +	vi = devm_kzalloc(&pdev->dev, sizeof(*vi), GFP_KERNEL);
>> +	if (!vi)
>> +		return -ENOMEM;
>> +
>> +	vi->dev = &pdev->dev;
>> +	INIT_LIST_HEAD(&vi->entities);
>> +	mutex_init(&vi->lock);
>> +
>> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +	vi->iomem = devm_ioremap_resource(&pdev->dev, res);
>> +	if (IS_ERR(vi->iomem))
>> +		return PTR_ERR(vi->iomem);
>> +
>> +	vi->vi_rst = devm_reset_control_get(&pdev->dev, "vi");
>> +	if (IS_ERR(vi->vi_rst)) {
>> +		dev_err(&pdev->dev, "Failed to get vi reset\n");
>> +		return -EPROBE_DEFER;
>> +	}
> There could be other reasons for failure, so you should really propagate
> the error code that devm_reset_control_get() provides:
>
> 		return PTR_ERR(vi->vi_rst);
OK, I will fix this.

>
>> +	vi->vi_clk = devm_clk_get(&pdev->dev, "vi");
>> +	if (IS_ERR(vi->vi_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get vi clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> Same here...
OK, I will fix this.
>
>> +	vi->parent_clk = devm_clk_get(&pdev->dev, "parent");
>> +	if (IS_ERR(vi->parent_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get VI parent clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> ... here...
Fixed
>
>> +	ret = clk_set_parent(vi->vi_clk, vi->parent_clk);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	vi->csi_clk = devm_clk_get(&pdev->dev, "csi");
>> +	if (IS_ERR(vi->csi_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get csi clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> ... here...
Fixed
>
>> +	vi->vi_reg = devm_regulator_get(&pdev->dev, "avdd-dsi-csi");
>> +	if (IS_ERR(vi->vi_reg)) {
>> +		dev_err(&pdev->dev, "Failed to get avdd-dsi-csi regulators\n");
>> +		return -EPROBE_DEFER;
>> +	}
> and here.
>
>> +	vi_tpg_fmts_bitmap_init(vi);
>> +
>> +	ret = tegra_vi_v4l2_init(vi);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* Check whether VI is in test pattern generator (TPG) mode */
>> +	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
>> +			     &vi->pg_mode);
> This doesn't sound right. Wouldn't this mean that you can either use the
> device in TPG mode or sensor mode only? With no means of switching at
> runtime? But then I see that there's an IOCTL to set this mode, so why
> even bother having this in DT in the first place?
DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls 
(IOCTL) is another way to do that.

We can remove this DT stuff but just use runtime v4l2-ctrls.
>> +	/* Init Tegra VI channels */
>> +	ret = tegra_vi_channels_init(vi);
>> +	if (ret < 0)
>> +		goto channels_error;
>> +
>> +	/* Setup media links between VI and external sensor subdev. */
>> +	ret = tegra_vi_graph_init(vi);
>> +	if (ret < 0)
>> +		goto graph_error;
>> +
>> +	platform_set_drvdata(pdev, vi);
>> +
>> +	dev_info(vi->dev, "device registered\n");
> Can we get rid of this, please? There's no use in spamming the kernel
> log with brag. Let people know when things have failed. Success is the
> expected outcome of ->probe().
Removed.

>> +static struct platform_driver tegra_vi_driver = {
>> +	.driver = {
>> +		.name = "tegra-vi",
>> +		.of_match_table = tegra_vi_of_id_table,
>> +	},
>> +	.probe = tegra_vi_probe,
>> +	.remove = tegra_vi_remove,
>> +};
>> +
>> +module_platform_driver(tegra_vi_driver);
> There's usually no blank line between the above.
OK, fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-vi.h b/drivers/media/platform/tegra/tegra-vi.h
>> new file mode 100644
>> index 0000000..d30a6ec
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-vi.h
>> @@ -0,0 +1,224 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __TEGRA_VI_H__
>> +#define __TEGRA_VI_H__
>> +
>> +#include <linux/host1x.h>
>> +#include <linux/list.h>
>> +#include <linux/mutex.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/videodev2.h>
>> +
>> +#include <media/media-device.h>
>> +#include <media/media-entity.h>
>> +#include <media/v4l2-async.h>
>> +#include <media/v4l2-ctrls.h>
>> +#include <media/v4l2-device.h>
>> +#include <media/v4l2-dev.h>
>> +#include <media/videobuf2-core.h>
>> +
>> +#include "tegra-core.h"
>> +
>> +#define MAX_CHAN_NUM	6
>> +#define MAX_FORMAT_NUM	64
> Perhaps these need to be runtime parameters to support multiple SoC
> generations? Tegra K1 seems to have only 2 channels instead of 6.
>
>> +
>> +/**
>> + * struct tegra_channel_buffer - video channel buffer
>> + * @buf: vb2 buffer base object
>> + * @queue: buffer list entry in the channel queued buffers list
>> + * @chan: channel that uses the buffer
>> + * @addr: Tegra IOVA buffer address for VI output
>> + */
>> +struct tegra_channel_buffer {
>> +	struct vb2_buffer buf;
>> +	struct list_head queue;
>> +	struct tegra_channel *chan;
>> +
>> +	dma_addr_t addr;
>> +};
>> +
>> +#define to_tegra_channel_buffer(vb) \
>> +	container_of(vb, struct tegra_channel_buffer, buf)
> I usually prefer static inline functions over macros for this type of
> upcasting. But perhaps Hans prefers this, so I'll defer to his judgement
> here.
>
>> +struct chan_regs_config {
>> +	u32 csi;
>> +	u32 pp;
>> +	u32 cil;
>> +	u32 phy;
>> +	u32 tpg;
>> +};
> Have you considered making these void __iomem * so that you can avoid
> the addition of the offset whenever you access a register?
>
>> +/**
>> + * struct tegra_channel - Tegra video channel
>> + * @list: list entry in a composite device dmas list
>> + * @video: V4L2 video device associated with the video channel
>> + * @video_lock:
>> + * @pad: media pad for the video device entity
>> + * @pipe: pipeline belonging to the channel
>> + *
>> + * @vi: composite device DT node port number for the channel
>> + *
>> + * @client: host1x client struct of Tegra DRM
> host1x client is separate from Tegra DRM.
Fixed.
>> + * @sp: host1x syncpoint pointer
>> + *
>> + * @work: kernel workqueue structure of this video channel
>> + * @lock: protects the @format, @fmtinfo, @queue and @work fields
>> + *
>> + * @format: active V4L2 pixel format
>> + * @fmtinfo: format information corresponding to the active @format
>> + *
>> + * @queue: vb2 buffers queue
>> + * @alloc_ctx: allocation context for the vb2 @queue
>> + * @sequence: V4L2 buffers sequence number
>> + *
>> + * @capture: list of queued buffers for capture
>> + * @active: active buffer for capture
>> + * @queued_lock: protects the buf_queued list
>> + *
>> + * @iomem: root register base
>> + * @regs: CSI/CIL/PHY register bases
>> + * @cil_clk: clock for CIL
>> + * @align: channel buffer alignment, default is 64
>> + * @port: CSI port of this video channel
>> + * @surface: output memory surface number
>> + * @io_id: Tegra IO rail ID of this video channel
>> + * @bypass: a flag to bypass register write
>> + *
>> + * @fmts_bitmap: a bitmap for formats supported
>> + *
>> + * @remote_entity: remote media entity for external sensor
>> + */
>> +struct tegra_channel {
>> +	struct list_head list;
>> +	struct video_device video;
>> +	struct mutex video_lock;
>> +	struct media_pad pad;
>> +	struct media_pipeline pipe;
>> +
>> +	struct tegra_vi_device *vi;
>> +
>> +	struct host1x_client client;
>> +	struct host1x_syncpt *sp;
>> +
>> +	struct work_struct work;
>> +	struct mutex lock;
>> +
>> +	struct v4l2_pix_format format;
>> +	const struct tegra_video_format *fmtinfo;
>> +
>> +	struct vb2_queue queue;
>> +	void *alloc_ctx;
>> +	u32 sequence;
>> +
>> +	struct list_head capture;
>> +	struct tegra_channel_buffer *active;
>> +	spinlock_t queued_lock;
>> +
>> +	void __iomem *iomem;
>> +	struct chan_regs_config regs;
>> +	struct clk *cil_clk;
>> +	int align;
>> +	u32 port;
> Those can both be unsigned int.
Fixed.
>
>> +	u32 surface;
> This seems to be fixed to 0, do we need it?

Let's keep it for future usage.

>
>> +	int io_id;
>> +	int bypass;
> bool?

Fixed.
>
>> +/**
>> + * struct tegra_vi_device - NVIDIA Tegra Video Input device structure
>> + * @v4l2_dev: V4L2 device
>> + * @media_dev: media device
>> + * @dev: device struct
>> + *
>> + * @iomem: register base
>> + * @vi_clk: main clock for VI block
>> + * @parent_clk: parent clock of VI clock
>> + * @csi_clk: clock for CSI
>> + * @vi_rst: reset controler
>> + * @vi_reg: regulator for VI hardware, normally it avdd_dsi_csi
>> + *
>> + * @lock: mutex lock to protect power on/off operations
>> + * @power_on_refcnt: reference count for power on/off operations
>> + *
>> + * @notifier: V4L2 asynchronous subdevs notifier
>> + * @entities: entities in the graph as a list of tegra_vi_graph_entity
>> + * @num_subdevs: number of subdevs in the pipeline
>> + *
>> + * @channels: list of channels at the pipeline output and input
>> + *
>> + * @ctrl_handler: V4L2 control handler
>> + * @pattern: test pattern generator V4L2 control
>> + * @pg_mode: test pattern generator mode (disabled/direct/patch)
>> + * @tpg_fmts_bitmap: a bitmap for formats in test pattern generator mode
>> + */
>> +struct tegra_vi_device {
>> +	struct v4l2_device v4l2_dev;
>> +	struct media_device media_dev;
>> +	struct device *dev;
>> +
>> +	void __iomem *iomem;
>> +	struct clk *vi_clk;
>> +	struct clk *parent_clk;
>> +	struct clk *csi_clk;
>> +	struct reset_control *vi_rst;
>> +	struct regulator *vi_reg;
>> +
>> +	struct mutex lock;
>> +	int power_on_refcnt;
> unsigned int, or perhaps even atomic_t, in which case you might be able
> to remove the locks from ->open()/->release().
I will rework the open/release()

>> +
>> +	struct v4l2_async_notifier notifier;
>> +	struct list_head entities;
>> +	unsigned int num_subdevs;
>> +
>> +	struct tegra_channel chans[MAX_CHAN_NUM];
>> +
>> +	struct v4l2_ctrl_handler ctrl_handler;
>> +	struct v4l2_ctrl *pattern;
>> +	int pg_mode;
> Perhaps this should be an enum?
Sure, fixed.

>> diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
> [...]
>> +#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>> +#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>> +
>> +/*
>> + * Supported CSI to VI Data Formats
>> + */
>> +#define TEGRA_VF_RAW6		0
>> +#define TEGRA_VF_RAW7		1
>> +#define TEGRA_VF_RAW8		2
>> +#define TEGRA_VF_RAW10		3
>> +#define TEGRA_VF_RAW12		4
>> +#define TEGRA_VF_RAW14		5
>> +#define TEGRA_VF_EMBEDDED8	6
>> +#define TEGRA_VF_RGB565		7
>> +#define TEGRA_VF_RGB555		8
>> +#define TEGRA_VF_RGB888		9
>> +#define TEGRA_VF_RGB444		10
>> +#define TEGRA_VF_RGB666		11
>> +#define TEGRA_VF_YUV422		12
>> +#define TEGRA_VF_YUV420		13
>> +#define TEGRA_VF_YUV420_CSPS	14
>> +
>> +#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
> What do we need these for? These seem to me to be internal formats
> supported by the hardware, but the existence of this file implies that
> you plan on using them in the DT. What's the use-case?
>
>

The original plan is to put nvidia;video-format in device tree and this 
is the data formats for that. Now we don't need nvidia;video-format in 
device tree. Then I let me move it into our tegra-core.c, because our 
tegra_video_formats table needs this.

Thierry,

Thanks a lot for this beautiful review. I almost fixed them and will 
provide a new patch soon.

-Bryan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25  0:26         ` Bryan Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25  0:26 UTC (permalink / raw)
  To: Thierry Reding
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

On 08/21/2015 06:03 AM, Thierry Reding wrote:
> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
>> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
>> controller which can support up to 6 MIPI CSI camera sensors.
>>
>> This patch adds a V4L2 media controller and capture driver to support
>> Tegra VI hardware. It's verified with Tegra built-in test pattern
>> generator.
> Hi Bryan,
>
> I've been looking forward to seeing this posted. I don't know the VI
> hardware in very much detail, nor am I an expert on the media framework,
> so I will primarily comment on architectural or SoC-specific things.
>
> By the way, please always Cc linux-tegra@vger.kernel.org on all patches
> relating to Tegra. That way people not explicitly Cc'ed but still
> interested in Tegra will see this code, even if they aren't subscribed
> to the linux-media mailing list.
Oops. let me add linux-tegra@vger.kernel.org in Cc this time.

>> Signed-off-by: Bryan Wu <pengw@nvidia.com>
>> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
>> ---
>>   drivers/media/platform/Kconfig               |    1 +
>>   drivers/media/platform/Makefile              |    2 +
>>   drivers/media/platform/tegra/Kconfig         |    9 +
>>   drivers/media/platform/tegra/Makefile        |    3 +
>>   drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>>   drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>>   drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>>   drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>>   drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>>   include/dt-bindings/media/tegra-vi.h         |   35 +
>>   10 files changed, 2362 insertions(+)
>>   create mode 100644 drivers/media/platform/tegra/Kconfig
>>   create mode 100644 drivers/media/platform/tegra/Makefile
>>   create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.h
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>>   create mode 100644 include/dt-bindings/media/tegra-vi.h
> I can't spot a device tree binding document for this, but we'll need one
> to properly review this driver.
Sure, I will add binding document for this.

>> diff --git a/drivers/media/platform/tegra/Kconfig b/drivers/media/platform/tegra/Kconfig
>> new file mode 100644
>> index 0000000..a69d1b2
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Kconfig
>> @@ -0,0 +1,9 @@
>> +config VIDEO_TEGRA
>> +	tristate "NVIDIA Tegra Video Input Driver (EXPERIMENTAL)"
> I don't think the (EXPERIMENTAL) is warranted. Either the driver works
> or it doesn't. And I assume you already tested that it works, even if
> only using the TPG.

OK, I will remove EXPERIMENTAL.

>> +	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API && OF
> This seems to be missing a couple of dependencies. For example I would
> expect at least TEGRA_HOST1X to be listed here to make sure people can't
> select this when the host1x API isn't available. I would also expect
> some sort of architecture dependency because it really makes no sense to
> build this if Tegra isn't supported.
>
> If you are concerned about compile coverage you can make that explicit
> using a COMPILE_TEST alternative such as:
>
> 	depends on ARCH_TEGRA || (ARM && COMPILE_TEST)
>
> Note that the ARM dependency in there makes sure that HAVE_IOMEM is
> selected, so this could also be:
>
> 	depends on ARCH_TEGRA || (HAVE_IOMEM && COMPILE_TEST)
>
> though that'd still leave open the possibility of build breakage because
> of some missing support.
>
> If you add the dependency on TEGRA_HOST1X that I mentioned above you
> shouldn't need any architecture dependency because TEGRA_HOST1X implies
> those already.
Let me add 'depends on TEGRA_HOST1X' which depends on ARCH_TEGRA. Then I 
don't think I need more Tegra architecture specific rules here, because 
like pmc.c covers IO rails, powergate and reset-controller.

>> +	select VIDEOBUF2_DMA_CONTIG
>> +	---help---
>> +	  Driver for Video Input (VI) device controller in NVIDIA Tegra SoC.
> I'd reword this slightly as:
>
> 	  Driver for the Video Input (VI) controller found on NVIDIA Tegra
> 	  SoCs.

Fixed.

>> +
>> +	  TO compile this driver as a module, choose M here: the module will be
> s/TO/To/.

Fixed.

>
>> +	  called tegra-video.
>> diff --git a/drivers/media/platform/tegra/Makefile b/drivers/media/platform/tegra/Makefile
>> new file mode 100644
>> index 0000000..c8eff0b
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/Makefile
>> @@ -0,0 +1,3 @@
>> +tegra-video-objs += tegra-core.o tegra-vi.o tegra-channel.o
> I'd personally leave out the redundant tegra- prefix here, because the
> files are in a tegra/ subdirectory already.
Right, some subsystem might don't like those prefix. But I just follow 
the rules in media subsystem here.

>> +obj-$(CONFIG_VIDEO_TEGRA) += tegra-video.o
>> diff --git a/drivers/media/platform/tegra/tegra-channel.c b/drivers/media/platform/tegra/tegra-channel.c
>> new file mode 100644
>> index 0000000..b0063d2
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-channel.c
>> @@ -0,0 +1,1074 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw@nvidia.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/bitmap.h>
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/host1x.h>
>> +#include <linux/lcm.h>
>> +#include <linux/list.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/slab.h>
>> +
>> +#include <media/v4l2-ctrls.h>
>> +#include <media/v4l2-dev.h>
>> +#include <media/v4l2-fh.h>
>> +#include <media/v4l2-ioctl.h>
>> +#include <media/videobuf2-core.h>
>> +#include <media/videobuf2-dma-contig.h>
>> +
>> +#include <soc/tegra/pmc.h>
>> +
>> +#include "tegra-vi.h"
>> +
>> +#define TEGRA_VI_SYNCPT_WAIT_TIMEOUT			200
>> +
>> +/* VI registers */
>> +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
>> +#define		SP_PP_LINE_START			4
>> +#define		SP_PP_FRAME_START			5
>> +#define		SP_MW_REQ_DONE				6
>> +#define		SP_MW_ACK_DONE				7
> Indentation is weird here. There also seems to be a mix of spaces and
> tabs in the register definitions below. I find that these end up hard to
> read, so it'd be good to make these consistent.
I will fix the indentation here. Since SP_XXX is a definition of some 
register bits, I put some indentation here to make it different from 
other register definitions.

I will replace spaces with tabs in other register definitions.

>> +/* CSI registers */
>> +#define TEGRA_VI_CSI_0_BASE                             0x100
>> +#define TEGRA_VI_CSI_1_BASE                             0x200
>> +#define TEGRA_VI_CSI_2_BASE                             0x300
>> +#define TEGRA_VI_CSI_3_BASE                             0x400
>> +#define TEGRA_VI_CSI_4_BASE                             0x500
>> +#define TEGRA_VI_CSI_5_BASE                             0x600
> You seem to be computing these offsets later on based on the CSI 0 base
> and an offset multiplied by the instance number. Perhaps define this as
>
> 	#define TEGRA_VI_CSI_BASE(x)	(0x100 + (x) * 0x100)
>
> to avoid the unused defines as well as the computation later on?

Good point. I will fix this.



>> +/* CSI Pixel Parser registers */
>> +#define TEGRA_CSI_PIXEL_PARSER_0_BASE			0x0838
>> +#define TEGRA_CSI_PIXEL_PARSER_1_BASE			0x086c
>> +#define TEGRA_CSI_PIXEL_PARSER_2_BASE			0x1038
>> +#define TEGRA_CSI_PIXEL_PARSER_3_BASE			0x106c
>> +#define TEGRA_CSI_PIXEL_PARSER_4_BASE			0x1838
>> +#define TEGRA_CSI_PIXEL_PARSER_5_BASE			0x186c
> Same comment as for TEGRA_VI_CSI_*_BASE above. Only the first of these
> is used.
Fixed.

>> +
>> +
>> +/* CSI Pixel Parser registers */
>> +#define TEGRA_CSI_INPUT_STREAM_CONTROL                  0x000
>> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL0                 0x004
>> +#define TEGRA_CSI_PIXEL_STREAM_CONTROL1                 0x008
>> +#define TEGRA_CSI_PIXEL_STREAM_GAP                      0x00c
>> +#define TEGRA_CSI_PIXEL_STREAM_PP_COMMAND               0x010
>> +#define TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME           0x014
>> +#define TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK           0x018
>> +#define TEGRA_CSI_PIXEL_PARSER_STATUS                   0x01c
>> +#define TEGRA_CSI_CSI_SW_SENSOR_RESET                   0x020
>> +
>> +/* CSI PHY registers */
>> +#define TEGRA_CSI_CIL_PHY_0_BASE			0x0908
>> +#define TEGRA_CSI_CIL_PHY_1_BASE			0x1108
>> +#define TEGRA_CSI_CIL_PHY_2_BASE			0x1908
> Same as for the other base offsets.
Fixed

>> +#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
> This doesn't seem to be used at all.

Actually this PHY register just has this one only, I need define it as 
0x0 offset here. Let's keep this since in future we might have more PHY 
registers.

>
>> +/* CSI CIL registers */
>> +#define TEGRA_CSI_CIL_0_BASE				0x092c
>> +#define TEGRA_CSI_CIL_1_BASE				0x0960
>> +#define TEGRA_CSI_CIL_2_BASE				0x112c
>> +#define TEGRA_CSI_CIL_3_BASE				0x1160
>> +#define TEGRA_CSI_CIL_4_BASE				0x192c
>> +#define TEGRA_CSI_CIL_5_BASE				0x1960
> Again, unused base defines, so might be better to go with a
> parameterized definition.

Fixed

>> +#define TEGRA_CSI_CIL_PAD_CONFIG0                       0x000
>> +#define TEGRA_CSI_CIL_PAD_CONFIG1                       0x004
>> +#define TEGRA_CSI_CIL_PHY_CONTROL                       0x008
>> +#define TEGRA_CSI_CIL_INTERRUPT_MASK                    0x00c
>> +#define TEGRA_CSI_CIL_STATUS                            0x010
>> +#define TEGRA_CSI_CILX_STATUS                           0x014
>> +#define TEGRA_CSI_CIL_ESCAPE_MODE_COMMAND               0x018
>> +#define TEGRA_CSI_CIL_ESCAPE_MODE_DATA                  0x01c
>> +#define TEGRA_CSI_CIL_SW_SENSOR_RESET                   0x020
>> +
>> +/* CSI Pattern Generator registers */
>> +#define TEGRA_CSI_PATTERN_GENERATOR_0_BASE		0x09c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_1_BASE		0x09f8
>> +#define TEGRA_CSI_PATTERN_GENERATOR_2_BASE		0x11c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_3_BASE		0x11f8
>> +#define TEGRA_CSI_PATTERN_GENERATOR_4_BASE		0x19c4
>> +#define TEGRA_CSI_PATTERN_GENERATOR_5_BASE		0x19f8
> More unused base defines.
Fixed.

>
>> +#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
>> +#define TEGRA_CSI_PG_BLANK				0x004
>> +#define TEGRA_CSI_PG_PHASE				0x008
>> +#define TEGRA_CSI_PG_RED_FREQ				0x00c
>> +#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
>> +#define TEGRA_CSI_PG_GREEN_FREQ				0x014
>> +#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
>> +#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
>> +#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
>> +#define TEGRA_CSI_PG_AOHDR				0x024
>> +
>> +#define TEGRA_CSI_DPCM_CTRL_A				0xad0
>> +#define TEGRA_CSI_DPCM_CTRL_B				0xad4
>> +#define TEGRA_CSI_STALL_COUNTER				0xae8
>> +#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
>> +#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
>> +#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
>> +#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
>> +#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
>> +#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
>> +#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
> Some of these are unused. I guess there's an argument to be made to
> include all register definitions rather than just the used ones, if for
> nothing else than completeness. I'll defer to Hans's judgement on this.

These are VI/CSI global registers shared by all the channels. Some of 
them are used in this driver, I suggest we keep them here.

>> +/* Channel registers */
>> +static void tegra_channel_write(struct tegra_channel *chan, u32 addr, u32 val)
> I prefer unsigned int offset instead of u32 addr. That makes in more
> obvious that this is actually an offset from some I/O memory base
> address. Also using a sized type for the offset is a bit exaggerated
> because it doesn't need to be of any specific size.
>
> The same comment applies to the other accessors below.

OK , I will use unsigned int.

>> +{
>> +	if (chan->bypass)
>> +		return;
> I don't see this being set anywhere. Is it dead code? Also the only
> description I see is that it's used to bypass register writes, but I
> don't see an explanation why that's necessary.

We are unifying our downstream VI driver with V4L2 VI driver. And this 
upstream work is the first step to help that.

We are also backporting this driver back to our internal 3.10 kernel 
which is using nvhost channel to submit register operations from 
userspace to host1x and VI hardware. Then in this case, our driver needs 
to bypass all the register operations otherwise we got conflicts between 
these 2 paths.

That's why I put bypass mode here. And bypass mode can be set in device 
tree or from v4l2-ctrls.

>> +/* CIL PHY registers */
>> +static void phy_write(struct tegra_channel *chan, u32 val)
>> +{
>> +	tegra_channel_write(chan, chan->regs.phy, val);
>> +}
>> +
>> +static u32 phy_read(struct tegra_channel *chan)
>> +{
>> +	return tegra_channel_read(chan, chan->regs.phy);
>> +}
> Are these missing an offset parameter? Or do these subblocks only have a
> single register? Even if that's the case, I think it'd be more
> consistent to have the same signature as the other accessors.
OK, I will fix this.


>> +/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
>> +static u32 sp_bit(struct tegra_channel *chan, u32 sp)
>> +{
>> +	return (sp + chan->port * 4) << 8;
>> +}
> Technically this returns a mask, not a bit, so sp_mask() would be more
> appropriate.
Actually it returns the syncpoint value for each port not a mask. 
Probably sp_bits() is better.

>> +/* Calculate register base */
>> +static u32 regs_base(u32 regs_base, int port)
>> +{
>> +	return regs_base + (port / 2 * 0x800) + (port & 1) * 0x34;
>> +}
>> +
>> +/* CSI channel IO Rail IDs */
>> +int tegra_io_rail_csi_ids[] = {
> This can be static const as far as I can tell.

OK, I fixed this.

>> +	TEGRA_IO_RAIL_CSIA,
>> +	TEGRA_IO_RAIL_CSIB,
>> +	TEGRA_IO_RAIL_CSIC,
>> +	TEGRA_IO_RAIL_CSID,
>> +	TEGRA_IO_RAIL_CSIE,
>> +	TEGRA_IO_RAIL_CSIF,
>> +};
>> +
>> +void tegra_channel_fmts_bitmap_init(struct tegra_channel *chan)
>> +{
>> +	int ret, index;
>> +	struct v4l2_subdev *subdev = chan->remote_entity->subdev;
>> +	struct v4l2_subdev_mbus_code_enum code = {
>> +		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
>> +	};
>> +
>> +
> Spurious blank line.

Removed
>
>> +static int tegra_channel_capture_setup(struct tegra_channel *chan)
>> +{
>> +	int lanes = 2;
> unsigned int? And why is it hardcoded to 2? There are checks below for
> lanes == 4, which will effectively never happen. So at the very least I
> think this should have a TODO comment of some sort. Preferably can it
> not be determined at runtime what number of lanes we need?
Sure, I forget to fix this. lanes should get from DT and for TPG mode I 
will choose lanes as 4 by default.

>> +	int port = chan->port;
> unsigned int?

fixed.

>
>> +	u32 height = chan->format.height;
>> +	u32 width = chan->format.width;
>> +	u32 format = chan->fmtinfo->img_fmt;
>> +	u32 data_type = chan->fmtinfo->img_dt;
>> +	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
>> +	struct chan_regs_config *regs = &chan->regs;
>> +
>> +	/* CIL PHY register setup */
>> +	if (port & 0x1) {
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>> +	} else {
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
>> +	}
> This seems to address registers not actually part of this channel. Why?
It's little bit hackish, but it's really have no choice. CIL PHY is 
shared by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. 
So we have 3 groups.


> Also you use magic numbers here and in the remainder of the driver. We
> should be able to do better. I presume all of this is documented in the
> TRM, so we should be able to easily substitute symbolic names.
I also got those magic numbers from internal source. Some of them are 
not in the TRM. And people just use that settings. I will try to convert 
them to some meaningful bit names. Please let me do it after I finished 
the whole work as an incremental patch.


>
>> +	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>> +	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>> +	if (lanes == 4) {
>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>> +		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>> +		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>> +	}
> And this seems to access registers from another port by temporarily
> rewriting the CIL base offset. That seems a little hackish to me. I
> don't know the hardware intimately enough to know exactly what this
> is supposed to accomplish, perhaps you can clarify? Also perhaps we
> can come up with some architectural overview of the VI hardware, or
> does such an overview exist in the TRM?

CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data 
lanes, then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF 
channels can not be used in this case.

That's why we need to access the CSIB/D/F registers in 4 data lanes use 
case.

> I see there is, perhaps add a comment somewhere, in the commit
> description or the file header giving a reference to where the
> architectural overview can be found?

It can be found in Tegra X1 TRM like this:
"The CSI unit provides for connection of up to six cameras in the system 
and is organized as three identical instances of two
MIPI support blocks, each with a separate 4-lane interface that can be 
configured as a single camera with 4 lanes or as a dual
camera with 2 lanes available for each camera."

What about I put this information in the code as a comment?
>> +	/* CSI pixel parser registers setup */
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
>> +		 0x280301f0 | (port & 0x1));
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
>> +	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
>> +		 0x3f0000 | (lanes - 1));
>> +
>> +	/* CIL PHY register setup */
>> +	if (lanes == 4)
>> +		phy_write(chan, 0x0101);
>> +	else {
>> +		u32 val = phy_read(chan);
>> +		if (port & 0x1)
>> +			val = (val & ~0x100) | 0x100;
>> +		else
>> +			val = (val & ~0x1) | 0x1;
>> +		phy_write(chan, val);
>> +	}
> The & ~ isn't quite doing what I suspect it should be doing. My
> assumption is that you want to set this register to 0x01 if the first
> port is to be used and 0x100 if the second port is to be used (or 0x101
> if both ports are to be used). In that case I think you'll want
> something like this:
>
> 	value = phy_read(chan);
>
> 	if (port & 1)
> 		value = (value & ~0x0001) | 0x0100;
> 	else
> 		value = (value & ~0x0100) | 0x0001;
>
> 	phy_write(chan, value);

I don't think your code is correct. The algorithm is to read out the 
share PHY register value and clear the port related bit and set that 
bit. Then it won't touch the setting of the other port. It means when we 
setup a channel it should not change the other channel which sharing PHY 
register with the current one.

In your case, you cleared the other port's bit and set the current port 
bit. When we write the value back to the PHY register, current port will 
be enabled but the other port will be disabled.

For example, like CSIA is running, the value of PHY register is 0x0001.
Then when we try to enable CSIB, we should write 0x0101 to the PHY 
register but not 0x0100.

>> +static void tegra_channel_capture_error(struct tegra_channel *chan, int err)
>> +{
>> +	u32 val;
>> +
>> +#ifdef DEBUG
>> +	val = tegra_channel_read(chan, TEGRA_CSI_DEBUG_COUNTER_0);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_DEBUG_COUNTER_0 0x%08x\n", val);
>> +#endif
>> +	val = cil_read(chan, TEGRA_CSI_CIL_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CIL_STATUS 0x%08x\n", val);
>> +	val = cil_read(chan, TEGRA_CSI_CILX_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_CSI_CILX_STATUS 0x%08x\n", val);
>> +	val = pp_read(chan, TEGRA_CSI_PIXEL_PARSER_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_CSI_PIXEL_PARSER_STATUS 0x%08x\n",
>> +		val);
>> +	val = csi_read(chan, TEGRA_VI_CSI_ERROR_STATUS);
>> +	dev_err(&chan->video.dev, "TEGRA_VI_CSI_ERROR_STATUS 0x%08x\n", val);
>> +}
> The err parameter is never used, so it should be dropped.
OK, I removed it.

>
>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>> +{
>> +	struct tegra_channel_buffer *buf = chan->active;
>> +	struct vb2_buffer *vb = &buf->buf;
>> +	int err = 0;
>> +	u32 thresh, value, frame_start;
>> +	int bytes_per_line = chan->format.bytesperline;
>> +
>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>> +		return -EINVAL;
>> +
>> +	if (chan->bypass)
>> +		goto bypass_done;
>> +
>> +	/* Program buffer address */
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>> +		  0x0);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>> +		  buf->addr);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>> +		  bytes_per_line);
>> +
>> +	/* Program syncpoint */
>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>> +			    frame_start | host1x_syncpt_id(chan->sp));
>> +
>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>> +
>> +	/* Use syncpoint to wake up */
>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>> +
>> +	mutex_unlock(&chan->lock);
>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>> +	mutex_lock(&chan->lock);
> What's the point of taking the lock in the first place if you drop it
> here, even if temporarily? This is a per-channel lock, and it protects
> the channel against concurrent captures. So if you drop the lock here,
> don't you run risk of having two captures run concurrently? And by the
> time you get to the error handling or buffer completion below you can't
> be sure you're actually dealing with the same buffer that you started
> with.

After some discussion with Hans, I changed to this. Since there won't be 
a second capture start which is prevented by v4l2-core, it won't cause 
the buffer issue.

Waiting for host1x syncpoint take time, so dropping lock can let other 
non-capture ioctls and operations happen.
>> +
>> +	if (err) {
>> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
>> +		tegra_channel_capture_error(chan, err);
>> +	}
> Is timeout really the only kind of error that can happen here?
>
I actually don't know other errors. Any other errors I need take of here?

>> +
>> +bypass_done:
>> +	/* Captured one frame */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	vb->v4l2_buf.sequence = chan->sequence++;
>> +	vb->v4l2_buf.field = V4L2_FIELD_NONE;
>> +	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
>> +	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
>> +	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
>> +	spin_unlock_irq(&chan->queued_lock);
> Do we really need to set all the buffer fields on error? Isn't it enough
> to simply mark the state as "error"?

I believe vb2_buffer_done() needs some fields to set. The code here is 
not very heavy but support both DONE and ERROR mode.

>> +
>> +	return err;
>> +}
>> +
>> +static void tegra_channel_work(struct work_struct *work)
>> +{
>> +	struct tegra_channel *chan =
>> +		container_of(work, struct tegra_channel, work);
>> +
>> +	while (1) {
>> +		spin_lock_irq(&chan->queued_lock);
>> +		if (list_empty(&chan->capture)) {
>> +			chan->active = NULL;
>> +			spin_unlock_irq(&chan->queued_lock);
>> +			return;
>> +		}
>> +		chan->active = list_entry(chan->capture.next,
>> +				struct tegra_channel_buffer, queue);
>> +		list_del_init(&chan->active->queue);
>> +		spin_unlock_irq(&chan->queued_lock);
>> +
>> +		mutex_lock(&chan->lock);
>> +		tegra_channel_capture_frame(chan);
>> +		mutex_unlock(&chan->lock);
>> +	}
>> +}
> Should this have some mechanism to break out of the loop, for example if
> somebody requested capturing to stop?
I will move to a kthread solution as Hans pointed out.

>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	buf->chan = chan;
>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>> +
>> +	return 0;
>> +}
> This seems to use contiguous DMA, which I guess presumes CMA support?
> We're dealing with very large buffers here. Your default frame size
> would yield buffers of roughly 32 MiB each, and you probably need a
> couple of those to ensure smooth playback. That's quite a bit of
> memory to reserve for CMA.
In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.

For CMA we need increase the default memory size.

> Have you ever tried to make this work with the IOMMU API so that we can
> allocate arbitrary buffers and linearize them for the hardware through
> the SMMU?
I tested this code in downstream kernel with SMMU. Do we fully support 
SMMU in upstream version? I didn't check that.

>> +static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	/* Put buffer into the  capture queue */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	list_add_tail(&buf->queue, &chan->capture);
>> +	spin_unlock_irq(&chan->queued_lock);
>> +
>> +	/* Start work queue to capture data to buffer */
>> +	if (vb2_start_streaming_called(&chan->queue))
>> +		schedule_work(&chan->work);
>> +}
> I'm beginning to wonder if a workqueue is the best implementation here.
> Couldn't we get notification on syncpoint increments and have a handler
> setup capture of new frames?

I will move to more flexible solution kthread then.

>> +static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
>> +	struct media_pipeline *pipe = chan->video.entity.pipe;
>> +	struct tegra_channel_buffer *buf, *nbuf;
>> +	int ret = 0;
>> +
>> +	if (!chan->vi->pg_mode && !chan->remote_entity) {
>> +		dev_err(&chan->video.dev,
>> +			"is not in TPG mode and has not sensor connected!\n");
>> +		ret = -EINVAL;
>> +		goto vb2_queued;
>> +	}
>> +
>> +	mutex_lock(&chan->lock);
>> +
>> +	/* Start CIL clock */
>> +	clk_set_rate(chan->cil_clk, 102000000);
>> +	clk_prepare_enable(chan->cil_clk);
> You need to check these for errors.
Fixed
>
>> +static struct vb2_ops tegra_channel_queue_qops = {
>> +	.queue_setup = tegra_channel_queue_setup,
>> +	.buf_prepare = tegra_channel_buffer_prepare,
>> +	.buf_queue = tegra_channel_buffer_queue,
>> +	.wait_prepare = vb2_ops_wait_prepare,
>> +	.wait_finish = vb2_ops_wait_finish,
>> +	.start_streaming = tegra_channel_start_streaming,
>> +	.stop_streaming = tegra_channel_stop_streaming,
>> +};
> I think this needs to be static const.
Fixed
>
>> +static int
>> +tegra_channel_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
>> +{
>> +	struct v4l2_fh *vfh = file->private_data;
>> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
>> +
>> +	cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING;
>> +	cap->capabilities = cap->device_caps | V4L2_CAP_DEVICE_CAPS;
>> +
>> +	strlcpy(cap->driver, "tegra-vi", sizeof(cap->driver));
> Perhaps "tegra-video" to be consistent with the module name?
OK, fixed.


>> +	strlcpy(cap->card, chan->video.name, sizeof(cap->card));
>> +	snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s:%u",
>> +		 chan->vi->dev->of_node->name, chan->port);
> Should this not rather use dev_name(chan->vi->dev) to ensure it works
> fine if ever we have multiple instances of the VI controller?
>

Fixed.

>> +static int
>> +tegra_channel_enum_format(struct file *file, void *fh, struct v4l2_fmtdesc *f)
>> +{
>> +	struct v4l2_fh *vfh = file->private_data;
>> +	struct tegra_channel *chan = to_tegra_channel(vfh->vdev);
>> +	int index, i;
> These can probably be unsigned int.
>
>> +	unsigned long *fmts_bitmap = NULL;
>> +
>> +	if (chan->vi->pg_mode)
>> +		fmts_bitmap = chan->vi->tpg_fmts_bitmap;
>> +	else if (chan->remote_entity)
>> +		fmts_bitmap = chan->fmts_bitmap;
>> +
>> +	if (!fmts_bitmap ||
>> +	    f->index > bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM) - 1)
>> +		return -EINVAL;
>> +
>> +	index = -1;
> This won't work with unsigned int, of course (actually, it would, but
> it'd be ugly), but I think you could work around that by doing the more
> natural:
>
>> +	for (i = 0; i < f->index + 1; i++)
>> +		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index + 1);
> 	index = 0;
>
> 	for (i = 0; i < f->index + 1; i++, index++)
> 		index = find_next_bit(fmts_bitmap, MAX_FORMAT_NUM, index);

Sure, fixed all of them

>
>> +static void
>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>> +		      const struct tegra_video_format **fmtinfo)
>> +{
>> +	const struct tegra_video_format *info;
>> +	unsigned int min_width;
>> +	unsigned int max_width;
>> +	unsigned int min_bpl;
>> +	unsigned int max_bpl;
>> +	unsigned int width;
>> +	unsigned int align;
>> +	unsigned int bpl;
>> +
>> +	/* Retrieve format information and select the default format if the
>> +	 * requested format isn't supported.
>> +	 */
>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>> +	if (!info)
>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> Should this not be an error? As far as I can tell this is silently
> substituting the default format for the requested one if the requested
> one isn't supported. Isn't the whole point of this to find out if some
> format is supported?
>
I think it should return some error and escape following code. I will 
fix that.


>> +
>> +	pix->pixelformat = info->fourcc;
>> +	pix->field = V4L2_FIELD_NONE;
>> +
>> +	/* The transfer alignment requirements are expressed in bytes. Compute
>> +	 * the minimum and maximum values, clamp the requested width and convert
>> +	 * it back to pixels.
>> +	 */
>> +	align = lcm(chan->align, info->bpp);
>> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
>> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
>> +	width = rounddown(pix->width * info->bpp, align);
> Shouldn't these be roundup()?
Why? I don't understand but rounddown looks good to me

>> +
>> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
>> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
>> +			    TEGRA_MAX_HEIGHT);
> The above fits nicely on one line and doesn't need to be wrapped.
Fixed
>
>> +
>> +	/* Clamp the requested bytes per line value. If the maximum bytes per
>> +	 * line value is zero, the module doesn't support user configurable line
>> +	 * sizes. Override the requested value with the minimum in that case.
>> +	 */
>> +	min_bpl = pix->width * info->bpp;
>> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
>> +	bpl = rounddown(pix->bytesperline, chan->align);
> Again, I think these should be roundup().

Why? I don't understand but rounddown looks good to me
>
>> +static int tegra_channel_v4l2_open(struct file *file)
>> +{
>> +	struct tegra_channel *chan = video_drvdata(file);
>> +	struct tegra_vi_device *vi = chan->vi;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&vi->lock);
>> +	ret = v4l2_fh_open(file);
>> +	if (ret)
>> +		goto unlock;
>> +
>> +	/* The first open then turn on power*/
>> +	if (!vi->power_on_refcnt) {
>> +		tegra_vi_power_on(chan->vi);
> Perhaps propagate error codes here?
>
>> +
>> +		usleep_range(5, 100);
>> +		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
>> +		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
>> +		usleep_range(10, 15);
>> +	}
>> +	vi->power_on_refcnt++;
> Also, I wonder if powering up at ->open() time isn't a little early. I
> could very well imagine an application opening up a device and then not
> use it for a long time. Or keep it open even while nothing is being
> captures. But that's primarily an optimization matter, so this is fine
> with me.
>

I think I can move this whole open/release things to start_streaming() 
point. And use v4l2 default open/release function.

>> +int tegra_channel_init(struct tegra_vi_device *vi,
>> +		       struct tegra_channel *chan,
>> +		       u32 port)
> The above fits on 2 lines, no need to make it three. Also port should
> probably be unsigned int because the size isn't important.

Fixed

>> +{
>> +	int ret;
>> +
>> +	chan->vi = vi;
>> +	chan->port = port;
>> +	chan->iomem = vi->iomem;
>> +
>> +	/* Init channel register base */
>> +	chan->regs.csi = TEGRA_VI_CSI_0_BASE + port * 0x100;
>> +	chan->regs.pp = regs_base(TEGRA_CSI_PIXEL_PARSER_0_BASE, port);
>> +	chan->regs.cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>> +	chan->regs.phy = TEGRA_CSI_CIL_PHY_0_BASE + port / 2 * 0x800;
>> +	chan->regs.tpg = regs_base(TEGRA_CSI_PATTERN_GENERATOR_0_BASE, port);
> Like I said, I think it'd be clearer to have the defines parameterized.
> That would also make this more consistent, rather than have one set of
> values that are computed here and for others the regs_base() helper is
> invoked. Also, I think it'd be better to have the regs structures take
> void __iomem * directly, so that the offset addition doesn't have to be
> performed at every register access.

OK, I see. I will fix this.


>> +
>> +	/* Init CIL clock */
>> +	switch (chan->port) {
>> +	case 0:
>> +	case 1:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilab");
>> +		break;
>> +	case 2:
>> +	case 3:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cilcd");
>> +		break;
>> +	case 4:
>> +	case 5:
>> +		chan->cil_clk = devm_clk_get(chan->vi->dev, "cile");
>> +		break;
>> +	default:
>> +		dev_err(chan->vi->dev, "wrong port nubmer %d\n", port);
> Nit: you should use %u for unsigned integers.

Fixed.

>> +	}
>> +	if (IS_ERR(chan->cil_clk)) {
>> +		dev_err(chan->vi->dev, "Failed to get CIL clock\n");
> Perhaps mention which clock couldn't be received.

Fixed

>
>> +		return -EINVAL;
> And propagate the error code rather than returning a hardcoded one.
Fixed.

>
>> +	}
>> +
>> +	/* VI Channel is 64 bytes alignment */
>> +	chan->align = 64;
> Does this need parameterization for other SoC generations?

So far it's 64 bytes and I don't see any change about this in the future 
generations.

>
>> +	chan->surface = 0;
> I can't find this being set to anything other than 0. What is its use?

Each channel actually has 3 memory output surfaces. But I don't find any 
use case to use the surface 1 and surface 2. So I just added this 
parameter for future usage.

chan->surface is used in tegra_channel_capture_frame()

>
>> +	chan->io_id = tegra_io_rail_csi_ids[chan->port];
>> +	mutex_init(&chan->lock);
>> +	mutex_init(&chan->video_lock);
>> +	INIT_LIST_HEAD(&chan->capture);
>> +	spin_lock_init(&chan->queued_lock);
>> +	INIT_WORK(&chan->work, tegra_channel_work);
>> +
>> +	/* Init video format */
>> +	chan->fmtinfo = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>> +	chan->format.pixelformat = chan->fmtinfo->fourcc;
>> +	chan->format.colorspace = V4L2_COLORSPACE_SRGB;
>> +	chan->format.field = V4L2_FIELD_NONE;
>> +	chan->format.width = TEGRA_DEF_WIDTH;
>> +	chan->format.height = TEGRA_DEF_HEIGHT;
>> +	chan->format.bytesperline = chan->format.width * chan->fmtinfo->bpp;
>> +	chan->format.sizeimage = chan->format.bytesperline *
>> +				    chan->format.height;
>> +
>> +	/* Initialize the media entity... */
>> +	chan->pad.flags = MEDIA_PAD_FL_SINK;
>> +
>> +	ret = media_entity_init(&chan->video.entity, 1, &chan->pad, 0);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* ... and the video node... */
>> +	chan->video.fops = &tegra_channel_fops;
>> +	chan->video.v4l2_dev = &vi->v4l2_dev;
>> +	chan->video.queue = &chan->queue;
>> +	snprintf(chan->video.name, sizeof(chan->video.name), "%s %s %u",
>> +		 vi->dev->of_node->name, "output", port);
> dev_name()?

Fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-core.c b/drivers/media/platform/tegra/tegra-core.c
> [...]
>> +const struct tegra_video_format tegra_video_formats[] = {
> Does this need to be exposed? I see there are accessors for this below,
> so exposing the structure itself doesn't seem necessary.

OK, I will fix this.

>> +int tegra_core_get_formats_array_size(void)
>> +{
>> +	return ARRAY_SIZE(tegra_video_formats);
>> +}
>> +
>> +/**
>> + * tegra_core_get_word_count - Calculate word count
>> + * @frame_width: number of pixels in one frame
>> + * @fmt: Tegra Video format struct which has BPP information
>> + *
>> + * Return: word count number
>> + */
>> +u32 tegra_core_get_word_count(u32 frame_width,
>> +			      const struct tegra_video_format *fmt)
>> +{
>> +	return frame_width * fmt->width / 8;
>> +}
> This is confusing. If frame_width is the number of pixels in one frame,
> then it should probably me called frame_size or so. frame_width to me
> implies number of pixels per line, not per frame.

Actually the comment is wrong. I will fix that.


>> +/**
>> + * tegra_core_get_idx_by_code - Retrieve index for a media bus code
>> + * @code: the format media bus code
>> + *
>> + * Return: a index to the format information structure corresponding to the
>> + * given V4L2 media bus format @code, or -1 if no corresponding format can
>> + * be found.
>> + */
>> +int tegra_core_get_idx_by_code(unsigned int code)
>> +{
>> +	unsigned int i;
>> +	const struct tegra_video_format *format;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(tegra_video_formats); ++i) {
>> +		format = &tegra_video_formats[i];
>> +
>> +		if (format->code == code)
> You only use the format value once, so the temporary variable doesn't
> buy you anything.
>
OK, I will remove 'format'.


>> +			return i;
>> +	}
>> +
>> +	return -1;
>> +}
>> +
>> +
> Gratuitous blank line.

Fixed.

>
>> +/**
>> + * tegra_core_of_get_format - Parse a device tree node and return format
>> + * 			      information
> Why is this necessary? Why would you ever need to encode a pixel format
> in DT?

This is dead code. I will remove them.

>> +/**
>> + * tegra_core_bytes_per_line - Calculate bytes per line in one frame
>> + * @width: frame width
>> + * @fmt: Tegra Video format
>> + *
>> + * Simply calcualte the bytes_per_line and if it's not 64 bytes aligned it
>> + * will be padded to 64 boundary.
>> + */
>> +u32 tegra_core_bytes_per_line(u32 width,
>> +			      const struct tegra_video_format *fmt)
>> +{
>> +	u32 bytes_per_line = width * fmt->bpp;
>> +
>> +	if (bytes_per_line % 64)
>> +		bytes_per_line = bytes_per_line +
>> +				 (64 - (bytes_per_line % 64));
>> +
>> +	return bytes_per_line;
>> +}
> Perhaps this should use the channel->align field for alignment rather
> than hardcode 64? Since there's no channel being passed into this, maybe
> passing the alignment as a parameter would work?
>
> Also can't the above be replaced by:
>
> 	return roundup(width * fmt->bpp, align);
>
> ?

Great, I will fix that.

>
>> diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
>> new file mode 100644
>> index 0000000..7d1026b
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-core.h
>> @@ -0,0 +1,134 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device Driver Core Helpers
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw@nvidia.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __TEGRA_CORE_H__
>> +#define __TEGRA_CORE_H__
>> +
>> +#include <dt-bindings/media/tegra-vi.h>
>> +
>> +#include <media/v4l2-subdev.h>
>> +
>> +/* Minimum and maximum width and height common to Tegra video input device. */
>> +#define TEGRA_MIN_WIDTH		32U
>> +#define TEGRA_MAX_WIDTH		7680U
>> +#define TEGRA_MIN_HEIGHT	32U
>> +#define TEGRA_MAX_HEIGHT	7680U
> Is this dependent on SoC generation? If we wanted to support Tegra K1,
> would the same values apply or do they need to be parameterized?
I actually don't get any information about this max/min resolution. Here 
I just put some values for the format calculation.

> On that note, could you outline what would be necessary to make this
> work on Tegra K1? What are the differences between the VI hardware on
> Tegra X1 vs. Tegra K1?
>
Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6 
channels, Tegra K1 has 2 channels.


>> +
>> +/* UHD 4K resolution as default resolution for all Tegra video input device. */
>> +#define TEGRA_DEF_WIDTH		3840
>> +#define TEGRA_DEF_HEIGHT	2160
> Is this a sensible default? It seems rather large to me.
Actually I use this for TPG which is the default setting of VI. And it 
can be override from user space IOCTL.

>> +
>> +#define TEGRA_VF_DEF		TEGRA_VF_RGB888
>> +#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
> Should we not have only one of these and convert to the other via some
> table?

This is also TPG default mode

>> +/* These go into the TEGRA_VI_CSI_n_IMAGE_DEF registers bits 23:16 */
>> +#define TEGRA_IMAGE_FORMAT_T_L8                         16
>> +#define TEGRA_IMAGE_FORMAT_T_R16_I                      32
>> +#define TEGRA_IMAGE_FORMAT_T_B5G6R5                     33
>> +#define TEGRA_IMAGE_FORMAT_T_R5G6B5                     34
>> +#define TEGRA_IMAGE_FORMAT_T_A1B5G5R5                   35
>> +#define TEGRA_IMAGE_FORMAT_T_A1R5G5B5                   36
>> +#define TEGRA_IMAGE_FORMAT_T_B5G5R5A1                   37
>> +#define TEGRA_IMAGE_FORMAT_T_R5G5B5A1                   38
>> +#define TEGRA_IMAGE_FORMAT_T_A4B4G4R4                   39
>> +#define TEGRA_IMAGE_FORMAT_T_A4R4G4B4                   40
>> +#define TEGRA_IMAGE_FORMAT_T_B4G4R4A4                   41
>> +#define TEGRA_IMAGE_FORMAT_T_R4G4B4A4                   42
>> +#define TEGRA_IMAGE_FORMAT_T_A8B8G8R8                   64
>> +#define TEGRA_IMAGE_FORMAT_T_A8R8G8B8                   65
>> +#define TEGRA_IMAGE_FORMAT_T_B8G8R8A8                   66
>> +#define TEGRA_IMAGE_FORMAT_T_R8G8B8A8                   67
>> +#define TEGRA_IMAGE_FORMAT_T_A2B10G10R10                68
>> +#define TEGRA_IMAGE_FORMAT_T_A2R10G10B10                69
>> +#define TEGRA_IMAGE_FORMAT_T_B10G10R10A2                70
>> +#define TEGRA_IMAGE_FORMAT_T_R10G10B10A2                71
>> +#define TEGRA_IMAGE_FORMAT_T_A8Y8U8V8                   193
>> +#define TEGRA_IMAGE_FORMAT_T_V8U8Y8A8                   194
>> +#define TEGRA_IMAGE_FORMAT_T_A2Y10U10V10                197
>> +#define TEGRA_IMAGE_FORMAT_T_V10U10Y10A2                198
>> +#define TEGRA_IMAGE_FORMAT_T_Y8_U8__Y8_V8               200
>> +#define TEGRA_IMAGE_FORMAT_T_Y8_V8__Y8_U8               201
>> +#define TEGRA_IMAGE_FORMAT_T_U8_Y8__V8_Y8               202
>> +#define TEGRA_IMAGE_FORMAT_T_T_V8_Y8__U8_Y8             203
>> +#define TEGRA_IMAGE_FORMAT_T_T_Y8__U8__V8_N444          224
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N444              225
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N444              226
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N422            227
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N422              228
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N422              229
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8__V8_N420            230
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__U8V8_N420              231
>> +#define TEGRA_IMAGE_FORMAT_T_Y8__V8U8_N420              232
>> +#define TEGRA_IMAGE_FORMAT_T_X2Lc10Lb10La10             233
>> +#define TEGRA_IMAGE_FORMAT_T_A2R6R6R6R6R6               234
>> +
>> +/* These go into the TEGRA_VI_CSI_n_CSI_IMAGE_DT registers bits 7:0 */
>> +#define TEGRA_IMAGE_DT_YUV420_8                         24
>> +#define TEGRA_IMAGE_DT_YUV420_10                        25
>> +#define TEGRA_IMAGE_DT_YUV420CSPS_8                     28
>> +#define TEGRA_IMAGE_DT_YUV420CSPS_10                    29
>> +#define TEGRA_IMAGE_DT_YUV422_8                         30
>> +#define TEGRA_IMAGE_DT_YUV422_10                        31
>> +#define TEGRA_IMAGE_DT_RGB444                           32
>> +#define TEGRA_IMAGE_DT_RGB555                           33
>> +#define TEGRA_IMAGE_DT_RGB565                           34
>> +#define TEGRA_IMAGE_DT_RGB666                           35
>> +#define TEGRA_IMAGE_DT_RGB888                           36
>> +#define TEGRA_IMAGE_DT_RAW6                             40
>> +#define TEGRA_IMAGE_DT_RAW7                             41
>> +#define TEGRA_IMAGE_DT_RAW8                             42
>> +#define TEGRA_IMAGE_DT_RAW10                            43
>> +#define TEGRA_IMAGE_DT_RAW12                            44
>> +#define TEGRA_IMAGE_DT_RAW14                            45
> It might be helpful to describe what these registers actually do. There
> seems to be overlap between both lists, but I don't quite see how they
> relate to one another, or what their purpose is.
These tables are from our TRM. The first table is "Pixel memory format 
for the VI channel". The second one is "VI channel input data type".

Let me put some comments there.


>> +/**
>> + * struct tegra_video_format - Tegra video format description
>> + * @vf_code: video format code
>> + * @width: format width in bits per component
>> + * @code: media bus format code
>> + * @bpp: bytes per pixel (when stored in memory)
>> + * @img_fmt: image format
>> + * @img_dt: image data type
>> + * @fourcc: V4L2 pixel format FCC identifier
>> + * @description: format description, suitable for userspace
>> + */
>> +struct tegra_video_format {
>> +	u32 vf_code;
>> +	u32 width;
>> +	u32 code;
>> +	u32 bpp;
> I think the above four can all be unsigned int. A sized type is not
> necessary here.
OK, I will fix this.


>> +	u32 img_fmt;
>> +	u32 img_dt;
> Perhaps these could be enums?

OK, I will use enums.
>
>> +	u32 fourcc;
>> +};
>> +
>> +extern const struct tegra_video_format tegra_video_formats[];
> It looks like you have accessors for this. Do you even need to expose
> it?

Fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-vi.c b/drivers/media/platform/tegra/tegra-vi.c
> [...]
>> +static void tegra_vi_v4l2_cleanup(struct tegra_vi_device *vi)
>> +{
>> +	v4l2_ctrl_handler_free(&vi->ctrl_handler);
>> +	v4l2_device_unregister(&vi->v4l2_dev);
>> +	media_device_unregister(&vi->media_dev);
>> +}
>> +
>> +static int tegra_vi_v4l2_init(struct tegra_vi_device *vi)
>> +{
>> +	int ret;
>> +
>> +	vi->media_dev.dev = vi->dev;
>> +	strlcpy(vi->media_dev.model, "NVIDIA Tegra Video Input Device",
>> +		sizeof(vi->media_dev.model));
>> +	vi->media_dev.hw_revision = 0;
> Actually, I think for Tegra X1 the hardware revision would be 3, since
> VI3 is what it's usually referred to. Tegra K1 has VI2, so this should
> be parameterized (at least when Tegra K1 support is added).
OK, I will choose 3 for Tegra X1 since we mentioned that in TRM like VI3.

>> +int tegra_vi_power_on(struct tegra_vi_device *vi)
>> +{
>> +	int ret;
>> +
>> +	ret = regulator_enable(vi->vi_reg);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_VENC,
>> +						vi->vi_clk, vi->vi_rst);
>> +	if (ret) {
>> +		regulator_disable(vi->vi_reg);
>> +		return ret;
>> +	}
>> +
>> +	clk_prepare_enable(vi->csi_clk);
>> +
>> +	clk_set_rate(vi->parent_clk, 408000000);
> Do we really need to set the parent? Isn't that going to be set
> automatically since vi_clk is the child of parent_clk?
Sure, I will remove this.


>> +	clk_set_rate(vi->vi_clk, 408000000);
>> +	clk_set_rate(vi->csi_clk, 408000000);
> Also all of these clock functions can fail, so you should check for
> errors.
>

Fixed.

>> +
>> +	return 0;
>> +}
>> +
>> +void tegra_vi_power_off(struct tegra_vi_device *vi)
>> +{
>> +	clk_disable_unprepare(vi->csi_clk);
>> +	tegra_powergate_power_off(TEGRA_POWERGATE_VENC);
> tegra_powergate_power_off() doesn't do anything with the clock or the
> reset, so you'll want to manually assert reset here and then disable and
> unprepare the clock. And I think both need to happen before the power
> partition is turned off.

Got it, I will fix this.


>> +	regulator_disable(vi->vi_reg);
>> +}
>> +
>> +static int tegra_vi_channels_init(struct tegra_vi_device *vi)
>> +{
>> +	int i, ret;
> i can be unsigned.

Fixed
>
>> +	struct tegra_channel *chan;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>> +		chan = &vi->chans[i];
>> +
>> +		ret = tegra_channel_init(vi, chan, i);
> Again, chan is only used once, so directly passing &vi->chans[i] to
> tegra_channel_init() would be more concise.
OK, I will remove 'chan' parameter from the list. And just pass i as the 
port number.

>
>> +static int tegra_vi_channels_cleanup(struct tegra_vi_device *vi)
>> +{
>> +	int i, ret;
>> +	struct tegra_channel *chan;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>> +		chan = &vi->chans[i];
>> +
>> +		ret = tegra_channel_cleanup(chan);
>> +		if (ret < 0) {
>> +			dev_err(vi->dev, "channel %d cleanup failed\n", i);
>> +			return ret;
>> +		}
>> +	}
>> +	return 0;
>> +}
> Same comments as for tegra_vi_channels_init().

Fixed.

>> +
>> +/* -----------------------------------------------------------------------------
>> + * Graph Management
>> + */
> The way devices are hooked up using the graph needs to be documented in
> a device tree binding.

Sure. This is actually the default video-interface binding. I will 
provide a document about device tree binding.

>> +static int tegra_vi_graph_notify_complete(struct v4l2_async_notifier *notifier)
>> +{
>> +	struct tegra_vi_device *vi =
>> +		container_of(notifier, struct tegra_vi_device, notifier);
>> +	int ret;
>> +
>> +	dev_dbg(vi->dev, "notify complete, all subdevs registered\n");
>> +
>> +	/* Create links for every entity. */
>> +	ret = tegra_vi_graph_build_links(vi);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	ret = v4l2_device_register_subdev_nodes(&vi->v4l2_dev);
>> +	if (ret < 0)
>> +		dev_err(vi->dev, "failed to register subdev nodes\n");
>> +
>> +	return ret;
>> +}
> Why the need for this notifier mechanism, doesn't deferred probe work
> here?

I will revisit this after media controller and graph probing change in 
upstream mentioned by Hans.

>> +static int tegra_vi_graph_notify_bound(struct v4l2_async_notifier *notifier,
>> +				   struct v4l2_subdev *subdev,
>> +				   struct v4l2_async_subdev *asd)
>> +{
> [...]
>> +}
>> +
>> +
> Gratuitous blank line.
Fixed.

>
>> +static int tegra_vi_graph_init(struct tegra_vi_device *vi)
>> +{
>> +	struct device_node *node = vi->dev->of_node;
>> +	struct device_node *ep = NULL;
>> +	struct device_node *next;
>> +	struct device_node *remote = NULL;
>> +	struct tegra_vi_graph_entity *entity;
>> +	struct v4l2_async_subdev **subdevs = NULL;
>> +	unsigned int num_subdevs;
> This variable is being used uninitialized.
>

Fixed.

>> +static int tegra_vi_probe(struct platform_device *pdev)
>> +{
>> +	struct resource *res;
>> +	struct tegra_vi_device *vi;
>> +	int ret = 0;
>> +
>> +	vi = devm_kzalloc(&pdev->dev, sizeof(*vi), GFP_KERNEL);
>> +	if (!vi)
>> +		return -ENOMEM;
>> +
>> +	vi->dev = &pdev->dev;
>> +	INIT_LIST_HEAD(&vi->entities);
>> +	mutex_init(&vi->lock);
>> +
>> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +	vi->iomem = devm_ioremap_resource(&pdev->dev, res);
>> +	if (IS_ERR(vi->iomem))
>> +		return PTR_ERR(vi->iomem);
>> +
>> +	vi->vi_rst = devm_reset_control_get(&pdev->dev, "vi");
>> +	if (IS_ERR(vi->vi_rst)) {
>> +		dev_err(&pdev->dev, "Failed to get vi reset\n");
>> +		return -EPROBE_DEFER;
>> +	}
> There could be other reasons for failure, so you should really propagate
> the error code that devm_reset_control_get() provides:
>
> 		return PTR_ERR(vi->vi_rst);
OK, I will fix this.

>
>> +	vi->vi_clk = devm_clk_get(&pdev->dev, "vi");
>> +	if (IS_ERR(vi->vi_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get vi clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> Same here...
OK, I will fix this.
>
>> +	vi->parent_clk = devm_clk_get(&pdev->dev, "parent");
>> +	if (IS_ERR(vi->parent_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get VI parent clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> ... here...
Fixed
>
>> +	ret = clk_set_parent(vi->vi_clk, vi->parent_clk);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	vi->csi_clk = devm_clk_get(&pdev->dev, "csi");
>> +	if (IS_ERR(vi->csi_clk)) {
>> +		dev_err(&pdev->dev, "Failed to get csi clock\n");
>> +		return -EPROBE_DEFER;
>> +	}
> ... here...
Fixed
>
>> +	vi->vi_reg = devm_regulator_get(&pdev->dev, "avdd-dsi-csi");
>> +	if (IS_ERR(vi->vi_reg)) {
>> +		dev_err(&pdev->dev, "Failed to get avdd-dsi-csi regulators\n");
>> +		return -EPROBE_DEFER;
>> +	}
> and here.
>
>> +	vi_tpg_fmts_bitmap_init(vi);
>> +
>> +	ret = tegra_vi_v4l2_init(vi);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* Check whether VI is in test pattern generator (TPG) mode */
>> +	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
>> +			     &vi->pg_mode);
> This doesn't sound right. Wouldn't this mean that you can either use the
> device in TPG mode or sensor mode only? With no means of switching at
> runtime? But then I see that there's an IOCTL to set this mode, so why
> even bother having this in DT in the first place?
DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls 
(IOCTL) is another way to do that.

We can remove this DT stuff but just use runtime v4l2-ctrls.
>> +	/* Init Tegra VI channels */
>> +	ret = tegra_vi_channels_init(vi);
>> +	if (ret < 0)
>> +		goto channels_error;
>> +
>> +	/* Setup media links between VI and external sensor subdev. */
>> +	ret = tegra_vi_graph_init(vi);
>> +	if (ret < 0)
>> +		goto graph_error;
>> +
>> +	platform_set_drvdata(pdev, vi);
>> +
>> +	dev_info(vi->dev, "device registered\n");
> Can we get rid of this, please? There's no use in spamming the kernel
> log with brag. Let people know when things have failed. Success is the
> expected outcome of ->probe().
Removed.

>> +static struct platform_driver tegra_vi_driver = {
>> +	.driver = {
>> +		.name = "tegra-vi",
>> +		.of_match_table = tegra_vi_of_id_table,
>> +	},
>> +	.probe = tegra_vi_probe,
>> +	.remove = tegra_vi_remove,
>> +};
>> +
>> +module_platform_driver(tegra_vi_driver);
> There's usually no blank line between the above.
OK, fixed.

>> diff --git a/drivers/media/platform/tegra/tegra-vi.h b/drivers/media/platform/tegra/tegra-vi.h
>> new file mode 100644
>> index 0000000..d30a6ec
>> --- /dev/null
>> +++ b/drivers/media/platform/tegra/tegra-vi.h
>> @@ -0,0 +1,224 @@
>> +/*
>> + * NVIDIA Tegra Video Input Device
>> + *
>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * Author: Bryan Wu <pengw@nvidia.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __TEGRA_VI_H__
>> +#define __TEGRA_VI_H__
>> +
>> +#include <linux/host1x.h>
>> +#include <linux/list.h>
>> +#include <linux/mutex.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/videodev2.h>
>> +
>> +#include <media/media-device.h>
>> +#include <media/media-entity.h>
>> +#include <media/v4l2-async.h>
>> +#include <media/v4l2-ctrls.h>
>> +#include <media/v4l2-device.h>
>> +#include <media/v4l2-dev.h>
>> +#include <media/videobuf2-core.h>
>> +
>> +#include "tegra-core.h"
>> +
>> +#define MAX_CHAN_NUM	6
>> +#define MAX_FORMAT_NUM	64
> Perhaps these need to be runtime parameters to support multiple SoC
> generations? Tegra K1 seems to have only 2 channels instead of 6.
>
>> +
>> +/**
>> + * struct tegra_channel_buffer - video channel buffer
>> + * @buf: vb2 buffer base object
>> + * @queue: buffer list entry in the channel queued buffers list
>> + * @chan: channel that uses the buffer
>> + * @addr: Tegra IOVA buffer address for VI output
>> + */
>> +struct tegra_channel_buffer {
>> +	struct vb2_buffer buf;
>> +	struct list_head queue;
>> +	struct tegra_channel *chan;
>> +
>> +	dma_addr_t addr;
>> +};
>> +
>> +#define to_tegra_channel_buffer(vb) \
>> +	container_of(vb, struct tegra_channel_buffer, buf)
> I usually prefer static inline functions over macros for this type of
> upcasting. But perhaps Hans prefers this, so I'll defer to his judgement
> here.
>
>> +struct chan_regs_config {
>> +	u32 csi;
>> +	u32 pp;
>> +	u32 cil;
>> +	u32 phy;
>> +	u32 tpg;
>> +};
> Have you considered making these void __iomem * so that you can avoid
> the addition of the offset whenever you access a register?
>
>> +/**
>> + * struct tegra_channel - Tegra video channel
>> + * @list: list entry in a composite device dmas list
>> + * @video: V4L2 video device associated with the video channel
>> + * @video_lock:
>> + * @pad: media pad for the video device entity
>> + * @pipe: pipeline belonging to the channel
>> + *
>> + * @vi: composite device DT node port number for the channel
>> + *
>> + * @client: host1x client struct of Tegra DRM
> host1x client is separate from Tegra DRM.
Fixed.
>> + * @sp: host1x syncpoint pointer
>> + *
>> + * @work: kernel workqueue structure of this video channel
>> + * @lock: protects the @format, @fmtinfo, @queue and @work fields
>> + *
>> + * @format: active V4L2 pixel format
>> + * @fmtinfo: format information corresponding to the active @format
>> + *
>> + * @queue: vb2 buffers queue
>> + * @alloc_ctx: allocation context for the vb2 @queue
>> + * @sequence: V4L2 buffers sequence number
>> + *
>> + * @capture: list of queued buffers for capture
>> + * @active: active buffer for capture
>> + * @queued_lock: protects the buf_queued list
>> + *
>> + * @iomem: root register base
>> + * @regs: CSI/CIL/PHY register bases
>> + * @cil_clk: clock for CIL
>> + * @align: channel buffer alignment, default is 64
>> + * @port: CSI port of this video channel
>> + * @surface: output memory surface number
>> + * @io_id: Tegra IO rail ID of this video channel
>> + * @bypass: a flag to bypass register write
>> + *
>> + * @fmts_bitmap: a bitmap for formats supported
>> + *
>> + * @remote_entity: remote media entity for external sensor
>> + */
>> +struct tegra_channel {
>> +	struct list_head list;
>> +	struct video_device video;
>> +	struct mutex video_lock;
>> +	struct media_pad pad;
>> +	struct media_pipeline pipe;
>> +
>> +	struct tegra_vi_device *vi;
>> +
>> +	struct host1x_client client;
>> +	struct host1x_syncpt *sp;
>> +
>> +	struct work_struct work;
>> +	struct mutex lock;
>> +
>> +	struct v4l2_pix_format format;
>> +	const struct tegra_video_format *fmtinfo;
>> +
>> +	struct vb2_queue queue;
>> +	void *alloc_ctx;
>> +	u32 sequence;
>> +
>> +	struct list_head capture;
>> +	struct tegra_channel_buffer *active;
>> +	spinlock_t queued_lock;
>> +
>> +	void __iomem *iomem;
>> +	struct chan_regs_config regs;
>> +	struct clk *cil_clk;
>> +	int align;
>> +	u32 port;
> Those can both be unsigned int.
Fixed.
>
>> +	u32 surface;
> This seems to be fixed to 0, do we need it?

Let's keep it for future usage.

>
>> +	int io_id;
>> +	int bypass;
> bool?

Fixed.
>
>> +/**
>> + * struct tegra_vi_device - NVIDIA Tegra Video Input device structure
>> + * @v4l2_dev: V4L2 device
>> + * @media_dev: media device
>> + * @dev: device struct
>> + *
>> + * @iomem: register base
>> + * @vi_clk: main clock for VI block
>> + * @parent_clk: parent clock of VI clock
>> + * @csi_clk: clock for CSI
>> + * @vi_rst: reset controler
>> + * @vi_reg: regulator for VI hardware, normally it avdd_dsi_csi
>> + *
>> + * @lock: mutex lock to protect power on/off operations
>> + * @power_on_refcnt: reference count for power on/off operations
>> + *
>> + * @notifier: V4L2 asynchronous subdevs notifier
>> + * @entities: entities in the graph as a list of tegra_vi_graph_entity
>> + * @num_subdevs: number of subdevs in the pipeline
>> + *
>> + * @channels: list of channels at the pipeline output and input
>> + *
>> + * @ctrl_handler: V4L2 control handler
>> + * @pattern: test pattern generator V4L2 control
>> + * @pg_mode: test pattern generator mode (disabled/direct/patch)
>> + * @tpg_fmts_bitmap: a bitmap for formats in test pattern generator mode
>> + */
>> +struct tegra_vi_device {
>> +	struct v4l2_device v4l2_dev;
>> +	struct media_device media_dev;
>> +	struct device *dev;
>> +
>> +	void __iomem *iomem;
>> +	struct clk *vi_clk;
>> +	struct clk *parent_clk;
>> +	struct clk *csi_clk;
>> +	struct reset_control *vi_rst;
>> +	struct regulator *vi_reg;
>> +
>> +	struct mutex lock;
>> +	int power_on_refcnt;
> unsigned int, or perhaps even atomic_t, in which case you might be able
> to remove the locks from ->open()/->release().
I will rework the open/release()

>> +
>> +	struct v4l2_async_notifier notifier;
>> +	struct list_head entities;
>> +	unsigned int num_subdevs;
>> +
>> +	struct tegra_channel chans[MAX_CHAN_NUM];
>> +
>> +	struct v4l2_ctrl_handler ctrl_handler;
>> +	struct v4l2_ctrl *pattern;
>> +	int pg_mode;
> Perhaps this should be an enum?
Sure, fixed.

>> diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
> [...]
>> +#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>> +#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>> +
>> +/*
>> + * Supported CSI to VI Data Formats
>> + */
>> +#define TEGRA_VF_RAW6		0
>> +#define TEGRA_VF_RAW7		1
>> +#define TEGRA_VF_RAW8		2
>> +#define TEGRA_VF_RAW10		3
>> +#define TEGRA_VF_RAW12		4
>> +#define TEGRA_VF_RAW14		5
>> +#define TEGRA_VF_EMBEDDED8	6
>> +#define TEGRA_VF_RGB565		7
>> +#define TEGRA_VF_RGB555		8
>> +#define TEGRA_VF_RGB888		9
>> +#define TEGRA_VF_RGB444		10
>> +#define TEGRA_VF_RGB666		11
>> +#define TEGRA_VF_YUV422		12
>> +#define TEGRA_VF_YUV420		13
>> +#define TEGRA_VF_YUV420_CSPS	14
>> +
>> +#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
> What do we need these for? These seem to me to be internal formats
> supported by the hardware, but the existence of this file implies that
> you plan on using them in the DT. What's the use-case?
>
>

The original plan is to put nvidia;video-format in device tree and this 
is the data formats for that. Now we don't need nvidia;video-format in 
device tree. Then I let me move it into our tegra-core.c, because our 
tegra_video_formats table needs this.

Thierry,

Thanks a lot for this beautiful review. I almost fixed them and will 
provide a new patch soon.

-Bryan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-21  9:28   ` Hans Verkuil
@ 2015-08-25  0:43     ` Bryan Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25  0:43 UTC (permalink / raw)
  To: Hans Verkuil, hansverk, linux-media
  Cc: ebrower, jbang, swarren, treding, wenjiaz, davidw, gfitzer, linux-media

On 08/21/2015 02:28 AM, Hans Verkuil wrote:
> Hi Bryan,
>
> Thanks for contributing this driver, very much appreciated.
>
> I do have some comments below, basically about the same things we discussed
> privately before.
>
> On 08/21/2015 02:51 AM, Bryan Wu wrote:
>> NVIDIA Tegra processor contains a powerful Video Input (VI) hardware
>> controller which can support up to 6 MIPI CSI camera sensors.
>>
>> This patch adds a V4L2 media controller and capture driver to support
>> Tegra VI hardware. It's verified with Tegra built-in test pattern
>> generator.
>>
>> Signed-off-by: Bryan Wu <pengw@nvidia.com>
>> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
>> ---
>>   drivers/media/platform/Kconfig               |    1 +
>>   drivers/media/platform/Makefile              |    2 +
>>   drivers/media/platform/tegra/Kconfig         |    9 +
>>   drivers/media/platform/tegra/Makefile        |    3 +
>>   drivers/media/platform/tegra/tegra-channel.c | 1074 ++++++++++++++++++++++++++
>>   drivers/media/platform/tegra/tegra-core.c    |  295 +++++++
>>   drivers/media/platform/tegra/tegra-core.h    |  134 ++++
>>   drivers/media/platform/tegra/tegra-vi.c      |  585 ++++++++++++++
>>   drivers/media/platform/tegra/tegra-vi.h      |  224 ++++++
>>   include/dt-bindings/media/tegra-vi.h         |   35 +
>>   10 files changed, 2362 insertions(+)
>>   create mode 100644 drivers/media/platform/tegra/Kconfig
>>   create mode 100644 drivers/media/platform/tegra/Makefile
>>   create mode 100644 drivers/media/platform/tegra/tegra-channel.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-core.h
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.c
>>   create mode 100644 drivers/media/platform/tegra/tegra-vi.h
>>   create mode 100644 include/dt-bindings/media/tegra-vi.h
>>
> <snip>
>
>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>> +{
>> +	struct tegra_channel_buffer *buf = chan->active;
>> +	struct vb2_buffer *vb = &buf->buf;
>> +	int err = 0;
>> +	u32 thresh, value, frame_start;
>> +	int bytes_per_line = chan->format.bytesperline;
>> +
>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>> +		return -EINVAL;
>> +
>> +	if (chan->bypass)
>> +		goto bypass_done;
>> +
>> +	/* Program buffer address */
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>> +		  0x0);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>> +		  buf->addr);
>> +	csi_write(chan,
>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>> +		  bytes_per_line);
>> +
>> +	/* Program syncpoint */
>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>> +			    frame_start | host1x_syncpt_id(chan->sp));
>> +
>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>> +
>> +	/* Use syncpoint to wake up */
>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>> +
>> +	mutex_unlock(&chan->lock);
>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>> +	mutex_lock(&chan->lock);
>> +
>> +	if (err) {
>> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
>> +		tegra_channel_capture_error(chan, err);
>> +	}
>> +
>> +bypass_done:
>> +	/* Captured one frame */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	vb->v4l2_buf.sequence = chan->sequence++;
>> +	vb->v4l2_buf.field = V4L2_FIELD_NONE;
>> +	v4l2_get_timestamp(&vb->v4l2_buf.timestamp);
>> +	vb2_set_plane_payload(vb, 0, chan->format.sizeimage);
>> +	vb2_buffer_done(vb, err < 0 ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
>> +	spin_unlock_irq(&chan->queued_lock);
>> +
>> +	return err;
>> +}
>> +
>> +static void tegra_channel_work(struct work_struct *work)
>> +{
>> +	struct tegra_channel *chan =
>> +		container_of(work, struct tegra_channel, work);
>> +
>> +	while (1) {
>> +		spin_lock_irq(&chan->queued_lock);
>> +		if (list_empty(&chan->capture)) {
>> +			chan->active = NULL;
>> +			spin_unlock_irq(&chan->queued_lock);
>> +			return;
>> +		}
>> +		chan->active = list_entry(chan->capture.next,
>> +				struct tegra_channel_buffer, queue);
>> +		list_del_init(&chan->active->queue);
>> +		spin_unlock_irq(&chan->queued_lock);
>> +
>> +		mutex_lock(&chan->lock);
>> +		tegra_channel_capture_frame(chan);
>> +		mutex_unlock(&chan->lock);
>> +	}
>> +}
>> +
>> +/* -----------------------------------------------------------------------------
>> + * videobuf2 queue operations
>> + */
>> +
>> +static int
>> +tegra_channel_queue_setup(struct vb2_queue *vq, const struct v4l2_format *fmt,
>> +		     unsigned int *nbuffers, unsigned int *nplanes,
>> +		     unsigned int sizes[], void *alloc_ctxs[])
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
>> +
>> +	/* Make sure the image size is large enough. */
>> +	if (fmt && fmt->fmt.pix.sizeimage < chan->format.sizeimage)
>> +		return -EINVAL;
>> +
>> +	*nplanes = 1;
>> +
>> +	sizes[0] = fmt ? fmt->fmt.pix.sizeimage : chan->format.sizeimage;
>> +	alloc_ctxs[0] = chan->alloc_ctx;
>> +
>> +	return 0;
>> +}
>> +
>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	buf->chan = chan;
>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>> +
>> +	return 0;
>> +}
>> +
>> +static void tegra_channel_buffer_queue(struct vb2_buffer *vb)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>> +
>> +	/* Put buffer into the  capture queue */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	list_add_tail(&buf->queue, &chan->capture);
>> +	spin_unlock_irq(&chan->queued_lock);
>> +
>> +	/* Start work queue to capture data to buffer */
>> +	if (vb2_start_streaming_called(&chan->queue))
>> +		schedule_work(&chan->work);
>> +}
>> +
>> +static int tegra_channel_set_stream(struct tegra_channel *chan, bool on)
>> +{
>> +	struct media_entity *entity;
>> +	struct media_pad *pad;
>> +	struct v4l2_subdev *subdev;
>> +	int ret = 0;
>> +
>> +	entity = &chan->video.entity;
>> +
>> +	while (1) {
>> +		if (entity->num_pads > 1 && (chan->port & 0x1))
>> +			pad = &entity->pads[2];
>> +		else
>> +			pad = &entity->pads[0];
>> +
>> +		if (!(pad->flags & MEDIA_PAD_FL_SINK))
>> +			break;
>> +
>> +		pad = media_entity_remote_pad(pad);
>> +		if (pad == NULL ||
>> +		    media_entity_type(pad->entity) != MEDIA_ENT_T_V4L2_SUBDEV)
>> +			break;
>> +
>> +		entity = pad->entity;
>> +		subdev = media_entity_to_v4l2_subdev(entity);
>> +		ret = v4l2_subdev_call(subdev, video, s_stream, on);
>> +		if (on && ret < 0 && ret != -ENOIOCTLCMD)
>> +			return ret;
>> +	}
>> +	return ret;
>> +}
>> +
>> +static int tegra_channel_start_streaming(struct vb2_queue *vq, u32 count)
>> +{
>> +	struct tegra_channel *chan = vb2_get_drv_priv(vq);
>> +	struct media_pipeline *pipe = chan->video.entity.pipe;
>> +	struct tegra_channel_buffer *buf, *nbuf;
>> +	int ret = 0;
>> +
>> +	if (!chan->vi->pg_mode && !chan->remote_entity) {
>> +		dev_err(&chan->video.dev,
>> +			"is not in TPG mode and has not sensor connected!\n");
>> +		ret = -EINVAL;
>> +		goto vb2_queued;
>> +	}
>> +
>> +	mutex_lock(&chan->lock);
>> +
>> +	/* Start CIL clock */
>> +	clk_set_rate(chan->cil_clk, 102000000);
>> +	clk_prepare_enable(chan->cil_clk);
>> +
>> +	/* Disable DPD */
>> +	ret = tegra_io_rail_power_on(chan->io_id);
>> +	if (ret < 0) {
>> +		dev_err(&chan->video.dev,
>> +			"failed to power on CSI rail: %d\n", ret);
>> +		goto error_power_on;
>> +	}
>> +
>> +	/* Clean up status */
>> +	cil_write(chan, TEGRA_CSI_CIL_STATUS, 0xFFFFFFFF);
>> +	cil_write(chan, TEGRA_CSI_CILX_STATUS, 0xFFFFFFFF);
>> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_STATUS, 0xFFFFFFFF);
>> +	csi_write(chan, TEGRA_VI_CSI_ERROR_STATUS, 0xFFFFFFFF);
>> +
>> +	ret = media_entity_pipeline_start(&chan->video.entity, pipe);
>> +	if (ret < 0)
>> +		goto error_pipeline_start;
>> +
>> +	/* Start the pipeline. */
>> +	ret = tegra_channel_set_stream(chan, true);
>> +	if (ret < 0)
>> +		goto error_set_stream;
>> +
>> +	/* Note: Program VI registers after TPG, sensors and CSI streaming */
>> +	ret = tegra_channel_capture_setup(chan);
>> +	if (ret < 0)
>> +		goto error_capture_setup;
>> +
>> +	chan->sequence = 0;
>> +	mutex_unlock(&chan->lock);
>> +
>> +	/* Start work queue to capture data to buffer */
>> +	schedule_work(&chan->work);
>> +
>> +	return 0;
>> +
>> +error_capture_setup:
>> +	tegra_channel_set_stream(chan, false);
>> +error_set_stream:
>> +	media_entity_pipeline_stop(&chan->video.entity);
>> +error_pipeline_start:
>> +	tegra_io_rail_power_off(chan->io_id);
>> +error_power_on:
>> +	clk_disable_unprepare(chan->cil_clk);
>> +	mutex_unlock(&chan->lock);
>> +vb2_queued:
>> +	/* Return all queued buffers back to vb2 */
>> +	spin_lock_irq(&chan->queued_lock);
>> +	vq->start_streaming_called = 0;
>> +	list_for_each_entry_safe(buf, nbuf, &chan->capture, queue) {
>> +		vb2_buffer_done(&buf->buf, VB2_BUF_STATE_QUEUED);
>> +		list_del(&buf->queue);
>> +	}
>> +	spin_unlock_irq(&chan->queued_lock);
>> +	return ret;
>> +}
> OK, so this whole sequence for running the DMA remains very confusing.
>
> First of all, this needs more documentation, especially about the fact that this
> uses shadow registers.

Sure, I will put some comments.

> Secondly, at the very least you need to create per-channel workqueues instead of
> using the global workqueue (schedule_work schedules the work on the global queue).
>
> But I would replace the whole workqueue handling with per-channel kthreads instead:
> where you call schedule_work above in start_streaming you start the thread. The
> thread keeps going while there are buffers queued, and if no buffers are available
> it will wait until it is woken up again. In buffer_queue you can wake up the thread
> after queueing the buffer.
>
> In stop streaming you stop the thread.
>
> Doing it this way allows you to remove the 'vq->start_streaming_called = 0' line
> above: the fact that you need it there is an indication that there is something
> wrong with the design. The real problem is that buffer_queue does too much: buffer_queue
> should just queue up the buffer for the DMA engine but it should never (re)start the
> DMA engine. Starting and stopping should be handled in start/stop_streaming.
>
> This keeps the design clean. I've seen other drivers that do similar things to what
> is done here, and that always created a mess.
>
> In addition, the way it works now in this driver is that the worker function is
> called on start_streaming AND for every buffer_queue, so it looks like you can get
> multiple worker functions running at the same time. It's all pretty weird.
>
> Keeping all the DMA handling in a single thread makes the control mechanism much
> cleaner.

I agree and I will move to kthread.


> <snip>
>> +static void
>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>> +		      const struct tegra_video_format **fmtinfo)
>> +{
>> +	const struct tegra_video_format *info;
>> +	unsigned int min_width;
>> +	unsigned int max_width;
>> +	unsigned int min_bpl;
>> +	unsigned int max_bpl;
>> +	unsigned int width;
>> +	unsigned int align;
>> +	unsigned int bpl;
>> +
>> +	/* Retrieve format information and select the default format if the
>> +	 * requested format isn't supported.
>> +	 */
>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>> +	if (!info)
>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>> +
>> +	pix->pixelformat = info->fourcc;
>> +	pix->field = V4L2_FIELD_NONE;
>> +
>> +	/* The transfer alignment requirements are expressed in bytes. Compute
>> +	 * the minimum and maximum values, clamp the requested width and convert
>> +	 * it back to pixels.
>> +	 */
>> +	align = lcm(chan->align, info->bpp);
>> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
>> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
>> +	width = rounddown(pix->width * info->bpp, align);
>> +
>> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
>> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
>> +			    TEGRA_MAX_HEIGHT);
>> +
>> +	/* Clamp the requested bytes per line value. If the maximum bytes per
>> +	 * line value is zero, the module doesn't support user configurable line
>> +	 * sizes. Override the requested value with the minimum in that case.
>> +	 */
>> +	min_bpl = pix->width * info->bpp;
>> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
>> +	bpl = rounddown(pix->bytesperline, chan->align);
>> +
>> +	pix->bytesperline = clamp(bpl, min_bpl, max_bpl);
>> +	pix->sizeimage = pix->bytesperline * pix->height;
> The colorspace is still not set: using the test pattern generator as a source
> I would select SRGB for Bayer and RGB pixelformats and REC709 for YUV pixelformats.

OK, I fixed it.

>> +
>> +	if (fmtinfo)
>> +		*fmtinfo = info;
>> +}
> <snip>
>
>> +static int tegra_channel_v4l2_open(struct file *file)
>> +{
>> +	struct tegra_channel *chan = video_drvdata(file);
>> +	struct tegra_vi_device *vi = chan->vi;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&vi->lock);
>> +	ret = v4l2_fh_open(file);
>> +	if (ret)
>> +		goto unlock;
>> +
>> +	/* The first open then turn on power*/
>> +	if (!vi->power_on_refcnt) {
> Instead of using your own counter you can also call:
>
> 	if (v4l2_fh_is_singular_file(file)) {
>

I will rework on this open/release().

>> +		tegra_vi_power_on(chan->vi);
>> +
>> +		usleep_range(5, 100);
>> +		tegra_channel_write(chan, TEGRA_VI_CFG_CG_CTRL, 1);
>> +		tegra_channel_write(chan, TEGRA_CSI_CLKEN_OVERRIDE, 0);
>> +		usleep_range(10, 15);
>> +	}
>> +	vi->power_on_refcnt++;
>> +
>> +unlock:
>> +	mutex_unlock(&vi->lock);
>> +	return ret;
>> +}
>> +
>> +static int tegra_channel_v4l2_release(struct file *file)
>> +{
>> +	struct tegra_channel *chan = video_drvdata(file);
>> +	struct tegra_vi_device *vi = chan->vi;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&vi->lock);
>> +	vi->power_on_refcnt--;
>> +	/* The last release then turn off power */
>> +	if (!vi->power_on_refcnt)
> And here do the same:
>
> 	if (v4l2_fh_is_singular_file(file)) {
>
>> +		tegra_vi_power_off(chan->vi);
>> +	ret = _vb2_fop_release(file, NULL);
> Is this the correct order? What if you are streaming and while streaming
> close the filehandle? Will the fact that the power is turned off before
> stop_streaming is called (_vb2_fop_release will call that) cause a problem?
>
>> +	mutex_unlock(&vi->lock);
>> +
>> +	return ret;
>> +}
> Regards,
>
> 	Hans
>


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25  0:26         ` Bryan Wu
@ 2015-08-25  6:30             ` Hans Verkuil
  -1 siblings, 0 replies; 25+ messages in thread
From: Hans Verkuil @ 2015-08-25  6:30 UTC (permalink / raw)
  To: Bryan Wu, Thierry Reding
  Cc: hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

A quick follow-up to Thierry's excellent review:

On 08/25/2015 02:26 AM, Bryan Wu wrote:
> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:

<snip>

>>> +static void
>>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>>> +		      const struct tegra_video_format **fmtinfo)
>>> +{
>>> +	const struct tegra_video_format *info;
>>> +	unsigned int min_width;
>>> +	unsigned int max_width;
>>> +	unsigned int min_bpl;
>>> +	unsigned int max_bpl;
>>> +	unsigned int width;
>>> +	unsigned int align;
>>> +	unsigned int bpl;
>>> +
>>> +	/* Retrieve format information and select the default format if the
>>> +	 * requested format isn't supported.
>>> +	 */
>>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>>> +	if (!info)
>>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>> Should this not be an error? As far as I can tell this is silently
>> substituting the default format for the requested one if the requested
>> one isn't supported. Isn't the whole point of this to find out if some
>> format is supported?
>>
> I think it should return some error and escape following code. I will 
> fix that.

Actually, this code is according to the V4L2 spec: if the given format is
not supported, then VIDIOC_TRY_FMT should replace it with a valid default
format.

The reality is a bit more complex: in many drivers this was never reviewed
correctly and we ended up with some drivers that return an error for this
case and some drivers that follow the spec. Historically TV capture drivers
return an error, webcam drivers don't. Most unfortunate.

Since this driver is much more likely to be used with sensors I would
follow the spec here and substitute an invalid format with a default
format.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25  6:30             ` Hans Verkuil
  0 siblings, 0 replies; 25+ messages in thread
From: Hans Verkuil @ 2015-08-25  6:30 UTC (permalink / raw)
  To: Bryan Wu, Thierry Reding
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

A quick follow-up to Thierry's excellent review:

On 08/25/2015 02:26 AM, Bryan Wu wrote:
> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:

<snip>

>>> +static void
>>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>>> +		      const struct tegra_video_format **fmtinfo)
>>> +{
>>> +	const struct tegra_video_format *info;
>>> +	unsigned int min_width;
>>> +	unsigned int max_width;
>>> +	unsigned int min_bpl;
>>> +	unsigned int max_bpl;
>>> +	unsigned int width;
>>> +	unsigned int align;
>>> +	unsigned int bpl;
>>> +
>>> +	/* Retrieve format information and select the default format if the
>>> +	 * requested format isn't supported.
>>> +	 */
>>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>>> +	if (!info)
>>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>> Should this not be an error? As far as I can tell this is silently
>> substituting the default format for the requested one if the requested
>> one isn't supported. Isn't the whole point of this to find out if some
>> format is supported?
>>
> I think it should return some error and escape following code. I will 
> fix that.

Actually, this code is according to the V4L2 spec: if the given format is
not supported, then VIDIOC_TRY_FMT should replace it with a valid default
format.

The reality is a bit more complex: in many drivers this was never reviewed
correctly and we ended up with some drivers that return an error for this
case and some drivers that follow the spec. Historically TV capture drivers
return an error, webcam drivers don't. Most unfortunate.

Since this driver is much more likely to be used with sensors I would
follow the spec here and substitute an invalid format with a default
format.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25  6:30             ` Hans Verkuil
@ 2015-08-25 11:25                 ` Thierry Reding
  -1 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 11:25 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Bryan Wu, hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 2057 bytes --]

On Tue, Aug 25, 2015 at 08:30:41AM +0200, Hans Verkuil wrote:
> A quick follow-up to Thierry's excellent review:
> 
> On 08/25/2015 02:26 AM, Bryan Wu wrote:
> > On 08/21/2015 06:03 AM, Thierry Reding wrote:
> >> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> 
> <snip>
> 
> >>> +static void
> >>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
> >>> +		      const struct tegra_video_format **fmtinfo)
> >>> +{
> >>> +	const struct tegra_video_format *info;
> >>> +	unsigned int min_width;
> >>> +	unsigned int max_width;
> >>> +	unsigned int min_bpl;
> >>> +	unsigned int max_bpl;
> >>> +	unsigned int width;
> >>> +	unsigned int align;
> >>> +	unsigned int bpl;
> >>> +
> >>> +	/* Retrieve format information and select the default format if the
> >>> +	 * requested format isn't supported.
> >>> +	 */
> >>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
> >>> +	if (!info)
> >>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> >> Should this not be an error? As far as I can tell this is silently
> >> substituting the default format for the requested one if the requested
> >> one isn't supported. Isn't the whole point of this to find out if some
> >> format is supported?
> >>
> > I think it should return some error and escape following code. I will 
> > fix that.
> 
> Actually, this code is according to the V4L2 spec: if the given format is
> not supported, then VIDIOC_TRY_FMT should replace it with a valid default
> format.
> 
> The reality is a bit more complex: in many drivers this was never reviewed
> correctly and we ended up with some drivers that return an error for this
> case and some drivers that follow the spec. Historically TV capture drivers
> return an error, webcam drivers don't. Most unfortunate.
> 
> Since this driver is much more likely to be used with sensors I would
> follow the spec here and substitute an invalid format with a default
> format.

Okay, sounds good to me.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25 11:25                 ` Thierry Reding
  0 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 11:25 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Bryan Wu, hansverk, linux-media, ebrower, jbang, swarren,
	wenjiaz, davidw, gfitzer, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 2057 bytes --]

On Tue, Aug 25, 2015 at 08:30:41AM +0200, Hans Verkuil wrote:
> A quick follow-up to Thierry's excellent review:
> 
> On 08/25/2015 02:26 AM, Bryan Wu wrote:
> > On 08/21/2015 06:03 AM, Thierry Reding wrote:
> >> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> 
> <snip>
> 
> >>> +static void
> >>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
> >>> +		      const struct tegra_video_format **fmtinfo)
> >>> +{
> >>> +	const struct tegra_video_format *info;
> >>> +	unsigned int min_width;
> >>> +	unsigned int max_width;
> >>> +	unsigned int min_bpl;
> >>> +	unsigned int max_bpl;
> >>> +	unsigned int width;
> >>> +	unsigned int align;
> >>> +	unsigned int bpl;
> >>> +
> >>> +	/* Retrieve format information and select the default format if the
> >>> +	 * requested format isn't supported.
> >>> +	 */
> >>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
> >>> +	if (!info)
> >>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
> >> Should this not be an error? As far as I can tell this is silently
> >> substituting the default format for the requested one if the requested
> >> one isn't supported. Isn't the whole point of this to find out if some
> >> format is supported?
> >>
> > I think it should return some error and escape following code. I will 
> > fix that.
> 
> Actually, this code is according to the V4L2 spec: if the given format is
> not supported, then VIDIOC_TRY_FMT should replace it with a valid default
> format.
> 
> The reality is a bit more complex: in many drivers this was never reviewed
> correctly and we ended up with some drivers that return an error for this
> case and some drivers that follow the spec. Historically TV capture drivers
> return an error, webcam drivers don't. Most unfortunate.
> 
> Since this driver is much more likely to be used with sensors I would
> follow the spec here and substitute an invalid format with a default
> format.

Okay, sounds good to me.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25  0:26         ` Bryan Wu
@ 2015-08-25 13:44             ` Thierry Reding
  -1 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 13:44 UTC (permalink / raw)
  To: Bryan Wu
  Cc: hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 28381 bytes --]

On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
> On 08/21/2015 06:03 AM, Thierry Reding wrote:
> >On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
[...]
> >>+#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
> >This doesn't seem to be used at all.
> 
> Actually this PHY register just has this one only, I need define it as 0x0
> offset here. Let's keep this since in future we might have more PHY
> registers.

Yes, I had been wondering about the PHY registers. If we make this a
register with offset 0, as I understand it will become used because of
the phy_{readl,writel}() rework.

> >>+#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
> >>+#define TEGRA_CSI_PG_BLANK				0x004
> >>+#define TEGRA_CSI_PG_PHASE				0x008
> >>+#define TEGRA_CSI_PG_RED_FREQ				0x00c
> >>+#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
> >>+#define TEGRA_CSI_PG_GREEN_FREQ				0x014
> >>+#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
> >>+#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
> >>+#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
> >>+#define TEGRA_CSI_PG_AOHDR				0x024
> >>+
> >>+#define TEGRA_CSI_DPCM_CTRL_A				0xad0
> >>+#define TEGRA_CSI_DPCM_CTRL_B				0xad4
> >>+#define TEGRA_CSI_STALL_COUNTER				0xae8
> >>+#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
> >>+#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
> >>+#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
> >>+#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
> >>+#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
> >>+#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
> >>+#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
> >Some of these are unused. I guess there's an argument to be made to
> >include all register definitions rather than just the used ones, if for
> >nothing else than completeness. I'll defer to Hans's judgement on this.
> 
> These are VI/CSI global registers shared by all the channels. Some of them
> are used in this driver, I suggest we keep them here.

Fine with me.

> >>+{
> >>+	if (chan->bypass)
> >>+		return;
> >I don't see this being set anywhere. Is it dead code? Also the only
> >description I see is that it's used to bypass register writes, but I
> >don't see an explanation why that's necessary.
> 
> We are unifying our downstream VI driver with V4L2 VI driver. And this
> upstream work is the first step to help that.
> 
> We are also backporting this driver back to our internal 3.10 kernel which
> is using nvhost channel to submit register operations from userspace to
> host1x and VI hardware. Then in this case, our driver needs to bypass all
> the register operations otherwise we got conflicts between these 2 paths.
> 
> That's why I put bypass mode here. And bypass mode can be set in device tree
> or from v4l2-ctrls.

I don't think it's fair to burden upstream with code that will only ever
be used downstream. Let's split these changes into a separate patch that
can be carried downstream.

> >>+/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
> >>+static u32 sp_bit(struct tegra_channel *chan, u32 sp)
> >>+{
> >>+	return (sp + chan->port * 4) << 8;
> >>+}
> >Technically this returns a mask, not a bit, so sp_mask() would be more
> >appropriate.
> Actually it returns the syncpoint value for each port not a mask. Probably
> sp_bits() is better.

Looking at the TRM, the field that this generates a value for is called
VI_COND (in the VI_CFG_VI_INCR_SYNCPT register), so perhaps this should
really be a macro and named something like:

	#define VI_CFG_VI_INCR_SYNCPT_COND(x) (((x) & 0xff) << 8)

As for the arithmetic, that doesn't seem to match up. Quoting from your
original patch:

> > > +/* VI registers */
> > > +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
> > > +#define		SP_PP_LINE_START			4
> > > +#define		SP_PP_FRAME_START			5
> > > +#define		SP_MW_REQ_DONE				6
> > > +#define		SP_MW_ACK_DONE				7

This doesn't seem to match the TRM, which has the following values:

	 0 = IMMEDIATE
	 1 = OP_DONE
	 2 = RD_DONE
	 3 = REG_WR_SAFE
	 4 = VI_MWA_REQ_DONE
	 5 = VI_MWB_REQ_DONE
	 6 = VI_MWA_ACK_DONE
	 7 = VI_MWB_ACK_DONE
	 8 = VI_ISPA_DONE
	 9 = VI_CSI_PPA_FRAME_START
	10 = VI_CSI_PPB_FRAME_START
	11 = VI_CSI_PPA_LINE_START
	12 = VI_CSI_PPB_LINE_START
	13 = VI_VGP0_RCVD
	14 = VI_VGP1_RCVD
	15 = VI_ISPB_DONE

Comparing with the internal register manuals it looks like the TRM is
actually wrong. Can you file an internal bug to rectify this and Cc me
on it, please?

Irrespective, since this is generating content for a register field it
would seem more consistent to define it as a parameterized macro, like
so:

	#define VI_CSI_PP_LINE_START(port)	(4 + (port) * 4)
	#define VI_CSI_PP_FRAME_START(port)	(5 + (port) * 4)
	#define VI_CSI_MWA_REQ_DONE(port)	(6 + (port) * 4)
	#define VI_CSI_MWA_ACK_DONE(port)	(7 + (port) * 4)

and then use them together with the above macro:

	value = VI_CFG_VI_INCR_SYNCPT_COND(VI_CSI_PP_FRAME_START(port)) |
		host1x_syncpt_id(syncpt);
	writel(value, ...);

> >>+static int tegra_channel_capture_setup(struct tegra_channel *chan)
> >>+{
> >>+	int lanes = 2;
> >unsigned int? And why is it hardcoded to 2? There are checks below for
> >lanes == 4, which will effectively never happen. So at the very least I
> >think this should have a TODO comment of some sort. Preferably can it
> >not be determined at runtime what number of lanes we need?
> Sure, I forget to fix this. lanes should get from DT and for TPG mode I will
> choose lanes as 4 by default.

Can the number of lanes required not be determined at runtime? I suspect
it would be a property of whatever camera is attached. Then again, this
is perhaps clarified by the DT binding, so I'll wait and see how that
looks.

> >>+	u32 height = chan->format.height;
> >>+	u32 width = chan->format.width;
> >>+	u32 format = chan->fmtinfo->img_fmt;
> >>+	u32 data_type = chan->fmtinfo->img_dt;
> >>+	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
> >>+	struct chan_regs_config *regs = &chan->regs;
> >>+
> >>+	/* CIL PHY register setup */
> >>+	if (port & 0x1) {
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> >>+	} else {
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
> >>+	}
> >This seems to address registers not actually part of this channel. Why?
> It's little bit hackish, but it's really have no choice. CIL PHY is shared
> by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. So we have
> 3 groups.

I'm wondering if we can't add some object as abstraction to make this
more straightforward to follow. I find this driver generally hard to
understand because of all the (seemingly) random register accesses.

> >Also you use magic numbers here and in the remainder of the driver. We
> >should be able to do better. I presume all of this is documented in the
> >TRM, so we should be able to easily substitute symbolic names.
> I also got those magic numbers from internal source. Some of them are not in
> the TRM. And people just use that settings. I will try to convert them to
> some meaningful bit names. Please let me do it after I finished the whole
> work as an incremental patch.

Sorry, that's not going to work. One of our prerequisite for merging
code into the upstream kernel has always been to have the registers
documented in the TRM. Magic numbers are not an option.

> >>+	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> >>+	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> >>+	if (lanes == 4) {
> >>+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> >>+		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> >>+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
> >>+	}
> >And this seems to access registers from another port by temporarily
> >rewriting the CIL base offset. That seems a little hackish to me. I
> >don't know the hardware intimately enough to know exactly what this
> >is supposed to accomplish, perhaps you can clarify? Also perhaps we
> >can come up with some architectural overview of the VI hardware, or
> >does such an overview exist in the TRM?
> 
> CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data lanes,
> then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF channels can
> not be used in this case.
> 
> That's why we need to access the CSIB/D/F registers in 4 data lanes use
> case.

I find the nomenclature very difficult. So each channel has two ports,
and each port uses up two lanes of a 4-lane PHY. Can't we structure
things in a way so that we expose ports as a low-level object and then
each channel can use either one or two ports? That way we can create at
runtime a dynamic number of channels (parsed from DT?) and assign ports
to them.

Perhaps most of that information will already be available in DT. For
example if we have a 4-lane camera connected to CSI1, then ports C and D
could be connected (I suppose that's possible with an OF graph?) and the
driver would simply have to allocate both C and D ports to some channel
object representing that camera. Similarly we could have one 2-lane
camera connected to CSI and another 2-lane camera connected to CSI2 and
assign ports A or B and E or F, respectively, to channels representing
these camera links.

> >I see there is, perhaps add a comment somewhere, in the commit
> >description or the file header giving a reference to where the
> >architectural overview can be found?
> 
> It can be found in Tegra X1 TRM like this:
> "The CSI unit provides for connection of up to six cameras in the system and
> is organized as three identical instances of two
> MIPI support blocks, each with a separate 4-lane interface that can be
> configured as a single camera with 4 lanes or as a dual
> camera with 2 lanes available for each camera."
> 
> What about I put this information in the code as a comment?

Having this as comments is obviously going to help understand the code,
but the code will still be difficult to follow. I think it would be far
easier to understand if this was structured in a top-down approach
rather than bottom-up.

> >>+	/* CSI pixel parser registers setup */
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
> >>+		 0x280301f0 | (port & 0x1));
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
> >>+	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
> >>+		 0x3f0000 | (lanes - 1));
> >>+
> >>+	/* CIL PHY register setup */
> >>+	if (lanes == 4)
> >>+		phy_write(chan, 0x0101);
> >>+	else {
> >>+		u32 val = phy_read(chan);
> >>+		if (port & 0x1)
> >>+			val = (val & ~0x100) | 0x100;
> >>+		else
> >>+			val = (val & ~0x1) | 0x1;
> >>+		phy_write(chan, val);
> >>+	}
> >The & ~ isn't quite doing what I suspect it should be doing. My
> >assumption is that you want to set this register to 0x01 if the first
> >port is to be used and 0x100 if the second port is to be used (or 0x101
> >if both ports are to be used). In that case I think you'll want
> >something like this:
> >
> >	value = phy_read(chan);
> >
> >	if (port & 1)
> >		value = (value & ~0x0001) | 0x0100;
> >	else
> >		value = (value & ~0x0100) | 0x0001;
> >
> >	phy_write(chan, value);
> 
> I don't think your code is correct. The algorithm is to read out the share
> PHY register value and clear the port related bit and set that bit. Then it
> won't touch the setting of the other port. It means when we setup a channel
> it should not change the other channel which sharing PHY register with the
> current one.
> 
> In your case, you cleared the other port's bit and set the current port bit.
> When we write the value back to the PHY register, current port will be
> enabled but the other port will be disabled.
> 
> For example, like CSIA is running, the value of PHY register is 0x0001.
> Then when we try to enable CSIB, we should write 0x0101 to the PHY register
> but not 0x0100.

I see. In that case I propose you simply do:

	if (port & 1)
		value |= 0x0100;
	else
		value |= 0x0001;

Clearing the bit only to set it immediately again is just a waste of CPU
resources. Likely the compiler will optimize this away, but might as
well make it easy on the compiler.

One problem with the above code, though, is that I don't see these bits
ever being cleared in the PHY. Shouldn't there be code to disable a
given port when it isn't used? Presumably that would reduce power
consumption?

> >>+static int tegra_channel_capture_frame(struct tegra_channel *chan)
> >>+{
> >>+	struct tegra_channel_buffer *buf = chan->active;
> >>+	struct vb2_buffer *vb = &buf->buf;
> >>+	int err = 0;
> >>+	u32 thresh, value, frame_start;
> >>+	int bytes_per_line = chan->format.bytesperline;
> >>+
> >>+	if (!vb2_start_streaming_called(&chan->queue) || !buf)
> >>+		return -EINVAL;
> >>+
> >>+	if (chan->bypass)
> >>+		goto bypass_done;
> >>+
> >>+	/* Program buffer address */
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
> >>+		  0x0);
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
> >>+		  buf->addr);
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
> >>+		  bytes_per_line);
> >>+
> >>+	/* Program syncpoint */
> >>+	frame_start = sp_bit(chan, SP_PP_FRAME_START);
> >>+	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
> >>+			    frame_start | host1x_syncpt_id(chan->sp));
> >>+
> >>+	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
> >>+
> >>+	/* Use syncpoint to wake up */
> >>+	thresh = host1x_syncpt_incr_max(chan->sp, 1);
> >>+
> >>+	mutex_unlock(&chan->lock);
> >>+	err = host1x_syncpt_wait(chan->sp, thresh,
> >>+			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
> >>+	mutex_lock(&chan->lock);
> >What's the point of taking the lock in the first place if you drop it
> >here, even if temporarily? This is a per-channel lock, and it protects
> >the channel against concurrent captures. So if you drop the lock here,
> >don't you run risk of having two captures run concurrently? And by the
> >time you get to the error handling or buffer completion below you can't
> >be sure you're actually dealing with the same buffer that you started
> >with.
> 
> After some discussion with Hans, I changed to this. Since there won't be a
> second capture start which is prevented by v4l2-core, it won't cause the
> buffer issue.
> 
> Waiting for host1x syncpoint take time, so dropping lock can let other
> non-capture ioctls and operations happen.

If the core already prevents multiple captures for a single channel, do
we even need the lock in the first place?

> >>+	if (err) {
> >>+		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
> >>+		tegra_channel_capture_error(chan, err);
> >>+	}
> >Is timeout really the only kind of error that can happen here?
> >
> I actually don't know other errors. Any other errors I need take of here?

Then I suggest you play it safe and simply report what exact error was
returned:

		dev_err(&chan->video.dev, "failed to wait for syncpoint: %d\n",
			err);

> >>+static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
> >>+{
> >>+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> >>+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> >>+
> >>+	buf->chan = chan;
> >>+	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> >>+
> >>+	return 0;
> >>+}
> >This seems to use contiguous DMA, which I guess presumes CMA support?
> >We're dealing with very large buffers here. Your default frame size
> >would yield buffers of roughly 32 MiB each, and you probably need a
> >couple of those to ensure smooth playback. That's quite a bit of
> >memory to reserve for CMA.
> In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.

There is no way to use the DMA API with SMMU upstream. You need to set
up your IOMMU domain yourself and attach the VI device to it manually.
That means you'll also need to manage your IOVA space manually to make
use of this. I know it's an unfortunate situation and there's work
underway to improve it, but we're not quite there yet.

> For CMA we need increase the default memory size.

I'd rather not rely on CMA at all, especially since we do have a way
around it.

> >Have you ever tried to make this work with the IOMMU API so that we can
> >allocate arbitrary buffers and linearize them for the hardware through
> >the SMMU?
> I tested this code in downstream kernel with SMMU. Do we fully support SMMU
> in upstream version? I didn't check that.

*sigh* We can't merge code upstream which hasn't been tested upstream.
Let's make sure we get into place whatever we need to actually run this
on an upstream kernel. That typically means you need to apply your work
on top of some recent linux-next and run it on an upstream-supported
board.

I realize that this is rather difficult to do for Tegra X1 because the
support for it hasn't been completely merged yet. One possibility is to
apply this on top of my staging/work branch[0] and run it on the P2371
or P2571 boards that are supported there. Alternatively since this is
hardware which is available (in similar form) on Tegra K1 you could try
to make it work on something like the Jetson TK1. Getting it to support
Tegra X1 will then be (hopefully) a simple matter of adding parameters
for the new generation.

Not testing this on an upstream kernel means that it is likely not going
to work because we're missing some bits, such as in the clock driver or
other, that are essential to make this work and as a result we'd be
carrying broken code in the upstream kernel. That's not acceptable.

[0]: https://github.com/thierryreding/linux/commits/staging/work

> >>+	pix->pixelformat = info->fourcc;
> >>+	pix->field = V4L2_FIELD_NONE;
> >>+
> >>+	/* The transfer alignment requirements are expressed in bytes. Compute
> >>+	 * the minimum and maximum values, clamp the requested width and convert
> >>+	 * it back to pixels.
> >>+	 */
> >>+	align = lcm(chan->align, info->bpp);
> >>+	min_width = roundup(TEGRA_MIN_WIDTH, align);
> >>+	max_width = rounddown(TEGRA_MAX_WIDTH, align);
> >>+	width = rounddown(pix->width * info->bpp, align);
> >Shouldn't these be roundup()?
> Why? I don't understand but rounddown looks good to me

For the maximum and minimum this is probably not an issue because they
likely are multiples of the alignment (I hope they are, otherwise they
would be broken; which would indicate that computing min_width and
max_width here is actually redundant, or should be replaced by some
sort of WARN() or even BUG().

That said, for the width you'll want to round up, otherwise you will be
potentially truncating the amount of data you receive. Consider for
example the case where you wanted to capture a 2x2 image at 32-bit RGB.
With your above calculation you'll end up with:

	align = lcm(64, 4) = 64;
	width = rounddown(2 * 4 = 8, 64) = 0;

That's really not what you want. I realize that this particular case
will be cancelled out by the clamp() calculation below, but the same
error would apply to larger resolution images. You'll always be missing
up to 63 bytes if you round down that way.

> >>+	pix->width = clamp(width, min_width, max_width) / info->bpp;
> >>+	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
> >>+			    TEGRA_MAX_HEIGHT);
> >The above fits nicely on one line and doesn't need to be wrapped.
> Fixed
> >
> >>+
> >>+	/* Clamp the requested bytes per line value. If the maximum bytes per
> >>+	 * line value is zero, the module doesn't support user configurable line
> >>+	 * sizes. Override the requested value with the minimum in that case.
> >>+	 */
> >>+	min_bpl = pix->width * info->bpp;
> >>+	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
> >>+	bpl = rounddown(pix->bytesperline, chan->align);
> >Again, I think these should be roundup().
> 
> Why? I don't understand but rounddown looks good to me

Same applies here. Alignment is a restriction regarding the *minimum*
size, rounding up is therefore what you really need.

> >>+	/* VI Channel is 64 bytes alignment */
> >>+	chan->align = 64;
> >Does this need parameterization for other SoC generations?
> 
> So far it's 64 bytes and I don't see any change about this in the future
> generations.

I don't see this documented in the TRM. Can you file a bug to get this
added? We have tables for this kind of restrictions for other devices,
such as display controller. We'll need that in the TRM for VI as well.

> >>+	chan->surface = 0;
> >I can't find this being set to anything other than 0. What is its use?
> 
> Each channel actually has 3 memory output surfaces. But I don't find any use
> case to use the surface 1 and surface 2. So I just added this parameter for
> future usage.
> 
> chan->surface is used in tegra_channel_capture_frame()

I don't understand why it needs to be stored in the channel. We could
simply hard-code it to 0 in tegra_channel_capture_frame(). Perhaps along
with a TODO comment or similar that this might need to be paramaterized?

The TRM isn't any help in explaining why three surfaces are available.
Would you happen to know what surfaces 1 and 2 can be used for?

> >>diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
> >>new file mode 100644
> >>index 0000000..7d1026b
> >>--- /dev/null
> >>+++ b/drivers/media/platform/tegra/tegra-core.h
> >>@@ -0,0 +1,134 @@
> >>+/*
> >>+ * NVIDIA Tegra Video Input Device Driver Core Helpers
> >>+ *
> >>+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
> >>+ *
> >>+ * Author: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> >>+ *
> >>+ * This program is free software; you can redistribute it and/or modify
> >>+ * it under the terms of the GNU General Public License version 2 as
> >>+ * published by the Free Software Foundation.
> >>+ */
> >>+
> >>+#ifndef __TEGRA_CORE_H__
> >>+#define __TEGRA_CORE_H__
> >>+
> >>+#include <dt-bindings/media/tegra-vi.h>
> >>+
> >>+#include <media/v4l2-subdev.h>
> >>+
> >>+/* Minimum and maximum width and height common to Tegra video input device. */
> >>+#define TEGRA_MIN_WIDTH		32U
> >>+#define TEGRA_MAX_WIDTH		7680U
> >>+#define TEGRA_MIN_HEIGHT	32U
> >>+#define TEGRA_MAX_HEIGHT	7680U
> >Is this dependent on SoC generation? If we wanted to support Tegra K1,
> >would the same values apply or do they need to be parameterized?
> I actually don't get any information about this max/min resolution. Here I
> just put some values for the format calculation.

Can you request that this be added to the TRM (via that internal bug
report I mentioned), please? According to the register definitions the
width and height fields to be programmed are 16-bit, but I doubt that we
can realistically capture frames of 65535x65535 pixels.

> >On that note, could you outline what would be necessary to make this
> >work on Tegra K1? What are the differences between the VI hardware on
> >Tegra X1 vs. Tegra K1?
> >
> Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6
> channels, Tegra K1 has 2 channels.

Okay, so it should be relatively easy to make this work on Tegra K1 as
well. I'll see if I can find some time to play with that. What would be
the easiest way to check that this works? I suppose I could write a
small program to capture images from the V4L2 node(s) that this exposes
and displays them in a DRM/KMS overlay via DMA-BUF. But perhaps there
are premade tools to achieve this? Preferably with not too many
dependencies.

> >>+/* UHD 4K resolution as default resolution for all Tegra video input device. */
> >>+#define TEGRA_DEF_WIDTH		3840
> >>+#define TEGRA_DEF_HEIGHT	2160
> >Is this a sensible default? It seems rather large to me.
> Actually I use this for TPG which is the default setting of VI. And it can
> be override from user space IOCTL.

I understand, but UHD is rather big, so not sure if it makes a good
default. Perhaps 1920x1080 would be a more realistic default. But I
don't feel very strong about this.

> >>+
> >>+#define TEGRA_VF_DEF		TEGRA_VF_RGB888
> >>+#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
> >Should we not have only one of these and convert to the other via some
> >table?
> 
> This is also TPG default mode

I understand, but the fourcc version can be converted to the Tegra
internal format with a function, right? So it seems weird that we'd have
to hard-code both here, which also means that they need to be manually
kept in sync.

> >>+	struct tegra_channel *chan;
> >>+
> >>+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
> >>+		chan = &vi->chans[i];
> >>+
> >>+		ret = tegra_channel_init(vi, chan, i);
> >Again, chan is only used once, so directly passing &vi->chans[i] to
> >tegra_channel_init() would be more concise.
> OK, I will remove 'chan' parameter from the list. And just pass i as the
> port number.

I didn't express myself very clearly. What I was suggesting was to
remove the chan temporary variable and pass in &vi->chans[i] directly.
Passing in both &vi->chans[i] and i looks okay to me, that way you don't
have to look up i via other means. Provided that you still need it, of
course.

> >>+	vi_tpg_fmts_bitmap_init(vi);
> >>+
> >>+	ret = tegra_vi_v4l2_init(vi);
> >>+	if (ret < 0)
> >>+		return ret;
> >>+
> >>+	/* Check whether VI is in test pattern generator (TPG) mode */
> >>+	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
> >>+			     &vi->pg_mode);
> >This doesn't sound right. Wouldn't this mean that you can either use the
> >device in TPG mode or sensor mode only? With no means of switching at
> >runtime? But then I see that there's an IOCTL to set this mode, so why
> >even bother having this in DT in the first place?
> DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls
> (IOCTL) is another way to do that.
> 
> We can remove this DT stuff but just use runtime v4l2-ctrls.

Yes, let's do that then. It's a policy decision and therefore doesn't
belong in DT.

> >>diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
> >[...]
> >>+#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> >>+#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> >>+
> >>+/*
> >>+ * Supported CSI to VI Data Formats
> >>+ */
> >>+#define TEGRA_VF_RAW6		0
> >>+#define TEGRA_VF_RAW7		1
> >>+#define TEGRA_VF_RAW8		2
> >>+#define TEGRA_VF_RAW10		3
> >>+#define TEGRA_VF_RAW12		4
> >>+#define TEGRA_VF_RAW14		5
> >>+#define TEGRA_VF_EMBEDDED8	6
> >>+#define TEGRA_VF_RGB565		7
> >>+#define TEGRA_VF_RGB555		8
> >>+#define TEGRA_VF_RGB888		9
> >>+#define TEGRA_VF_RGB444		10
> >>+#define TEGRA_VF_RGB666		11
> >>+#define TEGRA_VF_YUV422		12
> >>+#define TEGRA_VF_YUV420		13
> >>+#define TEGRA_VF_YUV420_CSPS	14
> >>+
> >>+#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
> >What do we need these for? These seem to me to be internal formats
> >supported by the hardware, but the existence of this file implies that
> >you plan on using them in the DT. What's the use-case?
> >
> >
> 
> The original plan is to put nvidia;video-format in device tree and this is
> the data formats for that. Now we don't need nvidia;video-format in device
> tree. Then I let me move it into our tegra-core.c, because our
> tegra_video_formats table needs this.

If we don't need it now, why will we ever need it? Shouldn't this be
something that's configurable and depending on what camera is attached
or what format the user has requested?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25 13:44             ` Thierry Reding
  0 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 13:44 UTC (permalink / raw)
  To: Bryan Wu
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 28352 bytes --]

On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
> On 08/21/2015 06:03 AM, Thierry Reding wrote:
> >On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
[...]
> >>+#define TEGRA_CSI_PHY_CIL_COMMAND			0x0908
> >This doesn't seem to be used at all.
> 
> Actually this PHY register just has this one only, I need define it as 0x0
> offset here. Let's keep this since in future we might have more PHY
> registers.

Yes, I had been wondering about the PHY registers. If we make this a
register with offset 0, as I understand it will become used because of
the phy_{readl,writel}() rework.

> >>+#define TEGRA_CSI_PATTERN_GENERATOR_CTRL		0x000
> >>+#define TEGRA_CSI_PG_BLANK				0x004
> >>+#define TEGRA_CSI_PG_PHASE				0x008
> >>+#define TEGRA_CSI_PG_RED_FREQ				0x00c
> >>+#define TEGRA_CSI_PG_RED_FREQ_RATE			0x010
> >>+#define TEGRA_CSI_PG_GREEN_FREQ				0x014
> >>+#define TEGRA_CSI_PG_GREEN_FREQ_RATE			0x018
> >>+#define TEGRA_CSI_PG_BLUE_FREQ				0x01c
> >>+#define TEGRA_CSI_PG_BLUE_FREQ_RATE			0x020
> >>+#define TEGRA_CSI_PG_AOHDR				0x024
> >>+
> >>+#define TEGRA_CSI_DPCM_CTRL_A				0xad0
> >>+#define TEGRA_CSI_DPCM_CTRL_B				0xad4
> >>+#define TEGRA_CSI_STALL_COUNTER				0xae8
> >>+#define TEGRA_CSI_CSI_READONLY_STATUS			0xaec
> >>+#define TEGRA_CSI_CSI_SW_STATUS_RESET			0xaf0
> >>+#define TEGRA_CSI_CLKEN_OVERRIDE			0xaf4
> >>+#define TEGRA_CSI_DEBUG_CONTROL				0xaf8
> >>+#define TEGRA_CSI_DEBUG_COUNTER_0			0xafc
> >>+#define TEGRA_CSI_DEBUG_COUNTER_1			0xb00
> >>+#define TEGRA_CSI_DEBUG_COUNTER_2			0xb04
> >Some of these are unused. I guess there's an argument to be made to
> >include all register definitions rather than just the used ones, if for
> >nothing else than completeness. I'll defer to Hans's judgement on this.
> 
> These are VI/CSI global registers shared by all the channels. Some of them
> are used in this driver, I suggest we keep them here.

Fine with me.

> >>+{
> >>+	if (chan->bypass)
> >>+		return;
> >I don't see this being set anywhere. Is it dead code? Also the only
> >description I see is that it's used to bypass register writes, but I
> >don't see an explanation why that's necessary.
> 
> We are unifying our downstream VI driver with V4L2 VI driver. And this
> upstream work is the first step to help that.
> 
> We are also backporting this driver back to our internal 3.10 kernel which
> is using nvhost channel to submit register operations from userspace to
> host1x and VI hardware. Then in this case, our driver needs to bypass all
> the register operations otherwise we got conflicts between these 2 paths.
> 
> That's why I put bypass mode here. And bypass mode can be set in device tree
> or from v4l2-ctrls.

I don't think it's fair to burden upstream with code that will only ever
be used downstream. Let's split these changes into a separate patch that
can be carried downstream.

> >>+/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
> >>+static u32 sp_bit(struct tegra_channel *chan, u32 sp)
> >>+{
> >>+	return (sp + chan->port * 4) << 8;
> >>+}
> >Technically this returns a mask, not a bit, so sp_mask() would be more
> >appropriate.
> Actually it returns the syncpoint value for each port not a mask. Probably
> sp_bits() is better.

Looking at the TRM, the field that this generates a value for is called
VI_COND (in the VI_CFG_VI_INCR_SYNCPT register), so perhaps this should
really be a macro and named something like:

	#define VI_CFG_VI_INCR_SYNCPT_COND(x) (((x) & 0xff) << 8)

As for the arithmetic, that doesn't seem to match up. Quoting from your
original patch:

> > > +/* VI registers */
> > > +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
> > > +#define		SP_PP_LINE_START			4
> > > +#define		SP_PP_FRAME_START			5
> > > +#define		SP_MW_REQ_DONE				6
> > > +#define		SP_MW_ACK_DONE				7

This doesn't seem to match the TRM, which has the following values:

	 0 = IMMEDIATE
	 1 = OP_DONE
	 2 = RD_DONE
	 3 = REG_WR_SAFE
	 4 = VI_MWA_REQ_DONE
	 5 = VI_MWB_REQ_DONE
	 6 = VI_MWA_ACK_DONE
	 7 = VI_MWB_ACK_DONE
	 8 = VI_ISPA_DONE
	 9 = VI_CSI_PPA_FRAME_START
	10 = VI_CSI_PPB_FRAME_START
	11 = VI_CSI_PPA_LINE_START
	12 = VI_CSI_PPB_LINE_START
	13 = VI_VGP0_RCVD
	14 = VI_VGP1_RCVD
	15 = VI_ISPB_DONE

Comparing with the internal register manuals it looks like the TRM is
actually wrong. Can you file an internal bug to rectify this and Cc me
on it, please?

Irrespective, since this is generating content for a register field it
would seem more consistent to define it as a parameterized macro, like
so:

	#define VI_CSI_PP_LINE_START(port)	(4 + (port) * 4)
	#define VI_CSI_PP_FRAME_START(port)	(5 + (port) * 4)
	#define VI_CSI_MWA_REQ_DONE(port)	(6 + (port) * 4)
	#define VI_CSI_MWA_ACK_DONE(port)	(7 + (port) * 4)

and then use them together with the above macro:

	value = VI_CFG_VI_INCR_SYNCPT_COND(VI_CSI_PP_FRAME_START(port)) |
		host1x_syncpt_id(syncpt);
	writel(value, ...);

> >>+static int tegra_channel_capture_setup(struct tegra_channel *chan)
> >>+{
> >>+	int lanes = 2;
> >unsigned int? And why is it hardcoded to 2? There are checks below for
> >lanes == 4, which will effectively never happen. So at the very least I
> >think this should have a TODO comment of some sort. Preferably can it
> >not be determined at runtime what number of lanes we need?
> Sure, I forget to fix this. lanes should get from DT and for TPG mode I will
> choose lanes as 4 by default.

Can the number of lanes required not be determined at runtime? I suspect
it would be a property of whatever camera is attached. Then again, this
is perhaps clarified by the DT binding, so I'll wait and see how that
looks.

> >>+	u32 height = chan->format.height;
> >>+	u32 width = chan->format.width;
> >>+	u32 format = chan->fmtinfo->img_fmt;
> >>+	u32 data_type = chan->fmtinfo->img_dt;
> >>+	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
> >>+	struct chan_regs_config *regs = &chan->regs;
> >>+
> >>+	/* CIL PHY register setup */
> >>+	if (port & 0x1) {
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> >>+	} else {
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
> >>+	}
> >This seems to address registers not actually part of this channel. Why?
> It's little bit hackish, but it's really have no choice. CIL PHY is shared
> by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. So we have
> 3 groups.

I'm wondering if we can't add some object as abstraction to make this
more straightforward to follow. I find this driver generally hard to
understand because of all the (seemingly) random register accesses.

> >Also you use magic numbers here and in the remainder of the driver. We
> >should be able to do better. I presume all of this is documented in the
> >TRM, so we should be able to easily substitute symbolic names.
> I also got those magic numbers from internal source. Some of them are not in
> the TRM. And people just use that settings. I will try to convert them to
> some meaningful bit names. Please let me do it after I finished the whole
> work as an incremental patch.

Sorry, that's not going to work. One of our prerequisite for merging
code into the upstream kernel has always been to have the registers
documented in the TRM. Magic numbers are not an option.

> >>+	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> >>+	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> >>+	if (lanes == 4) {
> >>+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
> >>+		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
> >>+		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
> >>+		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
> >>+	}
> >And this seems to access registers from another port by temporarily
> >rewriting the CIL base offset. That seems a little hackish to me. I
> >don't know the hardware intimately enough to know exactly what this
> >is supposed to accomplish, perhaps you can clarify? Also perhaps we
> >can come up with some architectural overview of the VI hardware, or
> >does such an overview exist in the TRM?
> 
> CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data lanes,
> then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF channels can
> not be used in this case.
> 
> That's why we need to access the CSIB/D/F registers in 4 data lanes use
> case.

I find the nomenclature very difficult. So each channel has two ports,
and each port uses up two lanes of a 4-lane PHY. Can't we structure
things in a way so that we expose ports as a low-level object and then
each channel can use either one or two ports? That way we can create at
runtime a dynamic number of channels (parsed from DT?) and assign ports
to them.

Perhaps most of that information will already be available in DT. For
example if we have a 4-lane camera connected to CSI1, then ports C and D
could be connected (I suppose that's possible with an OF graph?) and the
driver would simply have to allocate both C and D ports to some channel
object representing that camera. Similarly we could have one 2-lane
camera connected to CSI and another 2-lane camera connected to CSI2 and
assign ports A or B and E or F, respectively, to channels representing
these camera links.

> >I see there is, perhaps add a comment somewhere, in the commit
> >description or the file header giving a reference to where the
> >architectural overview can be found?
> 
> It can be found in Tegra X1 TRM like this:
> "The CSI unit provides for connection of up to six cameras in the system and
> is organized as three identical instances of two
> MIPI support blocks, each with a separate 4-lane interface that can be
> configured as a single camera with 4 lanes or as a dual
> camera with 2 lanes available for each camera."
> 
> What about I put this information in the code as a comment?

Having this as comments is obviously going to help understand the code,
but the code will still be difficult to follow. I think it would be far
easier to understand if this was structured in a top-down approach
rather than bottom-up.

> >>+	/* CSI pixel parser registers setup */
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
> >>+		 0x280301f0 | (port & 0x1));
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
> >>+	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
> >>+	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
> >>+		 0x3f0000 | (lanes - 1));
> >>+
> >>+	/* CIL PHY register setup */
> >>+	if (lanes == 4)
> >>+		phy_write(chan, 0x0101);
> >>+	else {
> >>+		u32 val = phy_read(chan);
> >>+		if (port & 0x1)
> >>+			val = (val & ~0x100) | 0x100;
> >>+		else
> >>+			val = (val & ~0x1) | 0x1;
> >>+		phy_write(chan, val);
> >>+	}
> >The & ~ isn't quite doing what I suspect it should be doing. My
> >assumption is that you want to set this register to 0x01 if the first
> >port is to be used and 0x100 if the second port is to be used (or 0x101
> >if both ports are to be used). In that case I think you'll want
> >something like this:
> >
> >	value = phy_read(chan);
> >
> >	if (port & 1)
> >		value = (value & ~0x0001) | 0x0100;
> >	else
> >		value = (value & ~0x0100) | 0x0001;
> >
> >	phy_write(chan, value);
> 
> I don't think your code is correct. The algorithm is to read out the share
> PHY register value and clear the port related bit and set that bit. Then it
> won't touch the setting of the other port. It means when we setup a channel
> it should not change the other channel which sharing PHY register with the
> current one.
> 
> In your case, you cleared the other port's bit and set the current port bit.
> When we write the value back to the PHY register, current port will be
> enabled but the other port will be disabled.
> 
> For example, like CSIA is running, the value of PHY register is 0x0001.
> Then when we try to enable CSIB, we should write 0x0101 to the PHY register
> but not 0x0100.

I see. In that case I propose you simply do:

	if (port & 1)
		value |= 0x0100;
	else
		value |= 0x0001;

Clearing the bit only to set it immediately again is just a waste of CPU
resources. Likely the compiler will optimize this away, but might as
well make it easy on the compiler.

One problem with the above code, though, is that I don't see these bits
ever being cleared in the PHY. Shouldn't there be code to disable a
given port when it isn't used? Presumably that would reduce power
consumption?

> >>+static int tegra_channel_capture_frame(struct tegra_channel *chan)
> >>+{
> >>+	struct tegra_channel_buffer *buf = chan->active;
> >>+	struct vb2_buffer *vb = &buf->buf;
> >>+	int err = 0;
> >>+	u32 thresh, value, frame_start;
> >>+	int bytes_per_line = chan->format.bytesperline;
> >>+
> >>+	if (!vb2_start_streaming_called(&chan->queue) || !buf)
> >>+		return -EINVAL;
> >>+
> >>+	if (chan->bypass)
> >>+		goto bypass_done;
> >>+
> >>+	/* Program buffer address */
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
> >>+		  0x0);
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
> >>+		  buf->addr);
> >>+	csi_write(chan,
> >>+		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
> >>+		  bytes_per_line);
> >>+
> >>+	/* Program syncpoint */
> >>+	frame_start = sp_bit(chan, SP_PP_FRAME_START);
> >>+	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
> >>+			    frame_start | host1x_syncpt_id(chan->sp));
> >>+
> >>+	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
> >>+
> >>+	/* Use syncpoint to wake up */
> >>+	thresh = host1x_syncpt_incr_max(chan->sp, 1);
> >>+
> >>+	mutex_unlock(&chan->lock);
> >>+	err = host1x_syncpt_wait(chan->sp, thresh,
> >>+			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
> >>+	mutex_lock(&chan->lock);
> >What's the point of taking the lock in the first place if you drop it
> >here, even if temporarily? This is a per-channel lock, and it protects
> >the channel against concurrent captures. So if you drop the lock here,
> >don't you run risk of having two captures run concurrently? And by the
> >time you get to the error handling or buffer completion below you can't
> >be sure you're actually dealing with the same buffer that you started
> >with.
> 
> After some discussion with Hans, I changed to this. Since there won't be a
> second capture start which is prevented by v4l2-core, it won't cause the
> buffer issue.
> 
> Waiting for host1x syncpoint take time, so dropping lock can let other
> non-capture ioctls and operations happen.

If the core already prevents multiple captures for a single channel, do
we even need the lock in the first place?

> >>+	if (err) {
> >>+		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
> >>+		tegra_channel_capture_error(chan, err);
> >>+	}
> >Is timeout really the only kind of error that can happen here?
> >
> I actually don't know other errors. Any other errors I need take of here?

Then I suggest you play it safe and simply report what exact error was
returned:

		dev_err(&chan->video.dev, "failed to wait for syncpoint: %d\n",
			err);

> >>+static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
> >>+{
> >>+	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
> >>+	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
> >>+
> >>+	buf->chan = chan;
> >>+	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> >>+
> >>+	return 0;
> >>+}
> >This seems to use contiguous DMA, which I guess presumes CMA support?
> >We're dealing with very large buffers here. Your default frame size
> >would yield buffers of roughly 32 MiB each, and you probably need a
> >couple of those to ensure smooth playback. That's quite a bit of
> >memory to reserve for CMA.
> In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.

There is no way to use the DMA API with SMMU upstream. You need to set
up your IOMMU domain yourself and attach the VI device to it manually.
That means you'll also need to manage your IOVA space manually to make
use of this. I know it's an unfortunate situation and there's work
underway to improve it, but we're not quite there yet.

> For CMA we need increase the default memory size.

I'd rather not rely on CMA at all, especially since we do have a way
around it.

> >Have you ever tried to make this work with the IOMMU API so that we can
> >allocate arbitrary buffers and linearize them for the hardware through
> >the SMMU?
> I tested this code in downstream kernel with SMMU. Do we fully support SMMU
> in upstream version? I didn't check that.

*sigh* We can't merge code upstream which hasn't been tested upstream.
Let's make sure we get into place whatever we need to actually run this
on an upstream kernel. That typically means you need to apply your work
on top of some recent linux-next and run it on an upstream-supported
board.

I realize that this is rather difficult to do for Tegra X1 because the
support for it hasn't been completely merged yet. One possibility is to
apply this on top of my staging/work branch[0] and run it on the P2371
or P2571 boards that are supported there. Alternatively since this is
hardware which is available (in similar form) on Tegra K1 you could try
to make it work on something like the Jetson TK1. Getting it to support
Tegra X1 will then be (hopefully) a simple matter of adding parameters
for the new generation.

Not testing this on an upstream kernel means that it is likely not going
to work because we're missing some bits, such as in the clock driver or
other, that are essential to make this work and as a result we'd be
carrying broken code in the upstream kernel. That's not acceptable.

[0]: https://github.com/thierryreding/linux/commits/staging/work

> >>+	pix->pixelformat = info->fourcc;
> >>+	pix->field = V4L2_FIELD_NONE;
> >>+
> >>+	/* The transfer alignment requirements are expressed in bytes. Compute
> >>+	 * the minimum and maximum values, clamp the requested width and convert
> >>+	 * it back to pixels.
> >>+	 */
> >>+	align = lcm(chan->align, info->bpp);
> >>+	min_width = roundup(TEGRA_MIN_WIDTH, align);
> >>+	max_width = rounddown(TEGRA_MAX_WIDTH, align);
> >>+	width = rounddown(pix->width * info->bpp, align);
> >Shouldn't these be roundup()?
> Why? I don't understand but rounddown looks good to me

For the maximum and minimum this is probably not an issue because they
likely are multiples of the alignment (I hope they are, otherwise they
would be broken; which would indicate that computing min_width and
max_width here is actually redundant, or should be replaced by some
sort of WARN() or even BUG().

That said, for the width you'll want to round up, otherwise you will be
potentially truncating the amount of data you receive. Consider for
example the case where you wanted to capture a 2x2 image at 32-bit RGB.
With your above calculation you'll end up with:

	align = lcm(64, 4) = 64;
	width = rounddown(2 * 4 = 8, 64) = 0;

That's really not what you want. I realize that this particular case
will be cancelled out by the clamp() calculation below, but the same
error would apply to larger resolution images. You'll always be missing
up to 63 bytes if you round down that way.

> >>+	pix->width = clamp(width, min_width, max_width) / info->bpp;
> >>+	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
> >>+			    TEGRA_MAX_HEIGHT);
> >The above fits nicely on one line and doesn't need to be wrapped.
> Fixed
> >
> >>+
> >>+	/* Clamp the requested bytes per line value. If the maximum bytes per
> >>+	 * line value is zero, the module doesn't support user configurable line
> >>+	 * sizes. Override the requested value with the minimum in that case.
> >>+	 */
> >>+	min_bpl = pix->width * info->bpp;
> >>+	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
> >>+	bpl = rounddown(pix->bytesperline, chan->align);
> >Again, I think these should be roundup().
> 
> Why? I don't understand but rounddown looks good to me

Same applies here. Alignment is a restriction regarding the *minimum*
size, rounding up is therefore what you really need.

> >>+	/* VI Channel is 64 bytes alignment */
> >>+	chan->align = 64;
> >Does this need parameterization for other SoC generations?
> 
> So far it's 64 bytes and I don't see any change about this in the future
> generations.

I don't see this documented in the TRM. Can you file a bug to get this
added? We have tables for this kind of restrictions for other devices,
such as display controller. We'll need that in the TRM for VI as well.

> >>+	chan->surface = 0;
> >I can't find this being set to anything other than 0. What is its use?
> 
> Each channel actually has 3 memory output surfaces. But I don't find any use
> case to use the surface 1 and surface 2. So I just added this parameter for
> future usage.
> 
> chan->surface is used in tegra_channel_capture_frame()

I don't understand why it needs to be stored in the channel. We could
simply hard-code it to 0 in tegra_channel_capture_frame(). Perhaps along
with a TODO comment or similar that this might need to be paramaterized?

The TRM isn't any help in explaining why three surfaces are available.
Would you happen to know what surfaces 1 and 2 can be used for?

> >>diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
> >>new file mode 100644
> >>index 0000000..7d1026b
> >>--- /dev/null
> >>+++ b/drivers/media/platform/tegra/tegra-core.h
> >>@@ -0,0 +1,134 @@
> >>+/*
> >>+ * NVIDIA Tegra Video Input Device Driver Core Helpers
> >>+ *
> >>+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
> >>+ *
> >>+ * Author: Bryan Wu <pengw@nvidia.com>
> >>+ *
> >>+ * This program is free software; you can redistribute it and/or modify
> >>+ * it under the terms of the GNU General Public License version 2 as
> >>+ * published by the Free Software Foundation.
> >>+ */
> >>+
> >>+#ifndef __TEGRA_CORE_H__
> >>+#define __TEGRA_CORE_H__
> >>+
> >>+#include <dt-bindings/media/tegra-vi.h>
> >>+
> >>+#include <media/v4l2-subdev.h>
> >>+
> >>+/* Minimum and maximum width and height common to Tegra video input device. */
> >>+#define TEGRA_MIN_WIDTH		32U
> >>+#define TEGRA_MAX_WIDTH		7680U
> >>+#define TEGRA_MIN_HEIGHT	32U
> >>+#define TEGRA_MAX_HEIGHT	7680U
> >Is this dependent on SoC generation? If we wanted to support Tegra K1,
> >would the same values apply or do they need to be parameterized?
> I actually don't get any information about this max/min resolution. Here I
> just put some values for the format calculation.

Can you request that this be added to the TRM (via that internal bug
report I mentioned), please? According to the register definitions the
width and height fields to be programmed are 16-bit, but I doubt that we
can realistically capture frames of 65535x65535 pixels.

> >On that note, could you outline what would be necessary to make this
> >work on Tegra K1? What are the differences between the VI hardware on
> >Tegra X1 vs. Tegra K1?
> >
> Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6
> channels, Tegra K1 has 2 channels.

Okay, so it should be relatively easy to make this work on Tegra K1 as
well. I'll see if I can find some time to play with that. What would be
the easiest way to check that this works? I suppose I could write a
small program to capture images from the V4L2 node(s) that this exposes
and displays them in a DRM/KMS overlay via DMA-BUF. But perhaps there
are premade tools to achieve this? Preferably with not too many
dependencies.

> >>+/* UHD 4K resolution as default resolution for all Tegra video input device. */
> >>+#define TEGRA_DEF_WIDTH		3840
> >>+#define TEGRA_DEF_HEIGHT	2160
> >Is this a sensible default? It seems rather large to me.
> Actually I use this for TPG which is the default setting of VI. And it can
> be override from user space IOCTL.

I understand, but UHD is rather big, so not sure if it makes a good
default. Perhaps 1920x1080 would be a more realistic default. But I
don't feel very strong about this.

> >>+
> >>+#define TEGRA_VF_DEF		TEGRA_VF_RGB888
> >>+#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
> >Should we not have only one of these and convert to the other via some
> >table?
> 
> This is also TPG default mode

I understand, but the fourcc version can be converted to the Tegra
internal format with a function, right? So it seems weird that we'd have
to hard-code both here, which also means that they need to be manually
kept in sync.

> >>+	struct tegra_channel *chan;
> >>+
> >>+	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
> >>+		chan = &vi->chans[i];
> >>+
> >>+		ret = tegra_channel_init(vi, chan, i);
> >Again, chan is only used once, so directly passing &vi->chans[i] to
> >tegra_channel_init() would be more concise.
> OK, I will remove 'chan' parameter from the list. And just pass i as the
> port number.

I didn't express myself very clearly. What I was suggesting was to
remove the chan temporary variable and pass in &vi->chans[i] directly.
Passing in both &vi->chans[i] and i looks okay to me, that way you don't
have to look up i via other means. Provided that you still need it, of
course.

> >>+	vi_tpg_fmts_bitmap_init(vi);
> >>+
> >>+	ret = tegra_vi_v4l2_init(vi);
> >>+	if (ret < 0)
> >>+		return ret;
> >>+
> >>+	/* Check whether VI is in test pattern generator (TPG) mode */
> >>+	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
> >>+			     &vi->pg_mode);
> >This doesn't sound right. Wouldn't this mean that you can either use the
> >device in TPG mode or sensor mode only? With no means of switching at
> >runtime? But then I see that there's an IOCTL to set this mode, so why
> >even bother having this in DT in the first place?
> DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls
> (IOCTL) is another way to do that.
> 
> We can remove this DT stuff but just use runtime v4l2-ctrls.

Yes, let's do that then. It's a policy decision and therefore doesn't
belong in DT.

> >>diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
> >[...]
> >>+#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> >>+#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
> >>+
> >>+/*
> >>+ * Supported CSI to VI Data Formats
> >>+ */
> >>+#define TEGRA_VF_RAW6		0
> >>+#define TEGRA_VF_RAW7		1
> >>+#define TEGRA_VF_RAW8		2
> >>+#define TEGRA_VF_RAW10		3
> >>+#define TEGRA_VF_RAW12		4
> >>+#define TEGRA_VF_RAW14		5
> >>+#define TEGRA_VF_EMBEDDED8	6
> >>+#define TEGRA_VF_RGB565		7
> >>+#define TEGRA_VF_RGB555		8
> >>+#define TEGRA_VF_RGB888		9
> >>+#define TEGRA_VF_RGB444		10
> >>+#define TEGRA_VF_RGB666		11
> >>+#define TEGRA_VF_YUV422		12
> >>+#define TEGRA_VF_YUV420		13
> >>+#define TEGRA_VF_YUV420_CSPS	14
> >>+
> >>+#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
> >What do we need these for? These seem to me to be internal formats
> >supported by the hardware, but the existence of this file implies that
> >you plan on using them in the DT. What's the use-case?
> >
> >
> 
> The original plan is to put nvidia;video-format in device tree and this is
> the data formats for that. Now we don't need nvidia;video-format in device
> tree. Then I let me move it into our tegra-core.c, because our
> tegra_video_formats table needs this.

If we don't need it now, why will we ever need it? Shouldn't this be
something that's configurable and depending on what camera is attached
or what format the user has requested?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25 13:44             ` Thierry Reding
  (?)
@ 2015-08-25 14:15             ` Hans Verkuil
       [not found]               ` <55DC789E.8060300-qWit8jRvyhVmR6Xm/wNWPw@public.gmane.org>
  -1 siblings, 1 reply; 25+ messages in thread
From: Hans Verkuil @ 2015-08-25 14:15 UTC (permalink / raw)
  To: Thierry Reding, Bryan Wu
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

On 08/25/15 15:44, Thierry Reding wrote:
> On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
>> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
>>>> +{
>>>> +	if (chan->bypass)
>>>> +		return;
>>> I don't see this being set anywhere. Is it dead code? Also the only
>>> description I see is that it's used to bypass register writes, but I
>>> don't see an explanation why that's necessary.
>>
>> We are unifying our downstream VI driver with V4L2 VI driver. And this
>> upstream work is the first step to help that.
>>
>> We are also backporting this driver back to our internal 3.10 kernel which
>> is using nvhost channel to submit register operations from userspace to
>> host1x and VI hardware. Then in this case, our driver needs to bypass all
>> the register operations otherwise we got conflicts between these 2 paths.
>>
>> That's why I put bypass mode here. And bypass mode can be set in device tree
>> or from v4l2-ctrls.
> 
> I don't think it's fair to burden upstream with code that will only ever
> be used downstream. Let's split these changes into a separate patch that
> can be carried downstream.

I think that's a good idea.

For the record: I haven't really reviewed the bypass mode. I had the impression
that it needed more work anyway (or am I wrong?).

>>>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>>>> +{
>>>> +	struct tegra_channel_buffer *buf = chan->active;
>>>> +	struct vb2_buffer *vb = &buf->buf;
>>>> +	int err = 0;
>>>> +	u32 thresh, value, frame_start;
>>>> +	int bytes_per_line = chan->format.bytesperline;
>>>> +
>>>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>>>> +		return -EINVAL;
>>>> +
>>>> +	if (chan->bypass)
>>>> +		goto bypass_done;
>>>> +
>>>> +	/* Program buffer address */
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>>>> +		  0x0);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>>>> +		  buf->addr);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>>>> +		  bytes_per_line);
>>>> +
>>>> +	/* Program syncpoint */
>>>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>>>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>>>> +			    frame_start | host1x_syncpt_id(chan->sp));
>>>> +
>>>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>>>> +
>>>> +	/* Use syncpoint to wake up */
>>>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>>>> +
>>>> +	mutex_unlock(&chan->lock);
>>>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>>>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>>>> +	mutex_lock(&chan->lock);
>>> What's the point of taking the lock in the first place if you drop it
>>> here, even if temporarily? This is a per-channel lock, and it protects
>>> the channel against concurrent captures. So if you drop the lock here,
>>> don't you run risk of having two captures run concurrently? And by the
>>> time you get to the error handling or buffer completion below you can't
>>> be sure you're actually dealing with the same buffer that you started
>>> with.
>>
>> After some discussion with Hans, I changed to this. Since there won't be a
>> second capture start which is prevented by v4l2-core, it won't cause the
>> buffer issue.
>>
>> Waiting for host1x syncpoint take time, so dropping lock can let other
>> non-capture ioctls and operations happen.
> 
> If the core already prevents multiple captures for a single channel, do
> we even need the lock in the first place?

While this is running another process might call the driver which then
changes some of these registers. So typically locking is needed. Since this
is going to be rewritten to a kthread I'm postponing reviewing the locking
until I see the new version. I expect this to make much more sense then.

>>>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>>>> +{
>>>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>>>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>>>> +
>>>> +	buf->chan = chan;
>>>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>> +
>>>> +	return 0;
>>>> +}
>>> This seems to use contiguous DMA, which I guess presumes CMA support?
>>> We're dealing with very large buffers here. Your default frame size
>>> would yield buffers of roughly 32 MiB each, and you probably need a
>>> couple of those to ensure smooth playback. That's quite a bit of
>>> memory to reserve for CMA.
>> In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.
> 
> There is no way to use the DMA API with SMMU upstream. You need to set
> up your IOMMU domain yourself and attach the VI device to it manually.
> That means you'll also need to manage your IOVA space manually to make
> use of this. I know it's an unfortunate situation and there's work
> underway to improve it, but we're not quite there yet.
> 
>> For CMA we need increase the default memory size.
> 
> I'd rather not rely on CMA at all, especially since we do have a way
> around it.

For the record, I have no problem with it if we start out with contiguous
DMA now and enhance it later. I get the impression that getting the IOMMU
to work is non-trivial, and I don't think it should block merging of this
driver.

This is all internal to the driver, so changing it later will not affect
userspace.

>>> Have you ever tried to make this work with the IOMMU API so that we can
>>> allocate arbitrary buffers and linearize them for the hardware through
>>> the SMMU?
>> I tested this code in downstream kernel with SMMU. Do we fully support SMMU
>> in upstream version? I didn't check that.
> 
> *sigh* We can't merge code upstream which hasn't been tested upstream.
> Let's make sure we get into place whatever we need to actually run this
> on an upstream kernel. That typically means you need to apply your work
> on top of some recent linux-next and run it on an upstream-supported
> board.
> 
> I realize that this is rather difficult to do for Tegra X1 because the
> support for it hasn't been completely merged yet. One possibility is to
> apply this on top of my staging/work branch[0] and run it on the P2371
> or P2571 boards that are supported there. Alternatively since this is
> hardware which is available (in similar form) on Tegra K1 you could try
> to make it work on something like the Jetson TK1. Getting it to support
> Tegra X1 will then be (hopefully) a simple matter of adding parameters
> for the new generation.
> 
> Not testing this on an upstream kernel means that it is likely not going
> to work because we're missing some bits, such as in the clock driver or
> other, that are essential to make this work and as a result we'd be
> carrying broken code in the upstream kernel. That's not acceptable.
> 
> [0]: https://github.com/thierryreding/linux/commits/staging/work
> 

Regards,

	Hans

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25 14:15             ` Hans Verkuil
@ 2015-08-25 14:24                   ` Thierry Reding
  0 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 14:24 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Bryan Wu, hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

On Tue, Aug 25, 2015 at 04:15:58PM +0200, Hans Verkuil wrote:
> On 08/25/15 15:44, Thierry Reding wrote:
> > On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
[...]
> > > For CMA we need increase the default memory size.
> > 
> > I'd rather not rely on CMA at all, especially since we do have a way
> > around it.
> 
> For the record, I have no problem with it if we start out with contiguous
> DMA now and enhance it later. I get the impression that getting the IOMMU
> to work is non-trivial, and I don't think it should block merging of this
> driver.
> 
> This is all internal to the driver, so changing it later will not affect
> userspace.

Fair enough.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25 14:24                   ` Thierry Reding
  0 siblings, 0 replies; 25+ messages in thread
From: Thierry Reding @ 2015-08-25 14:24 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Bryan Wu, hansverk, linux-media, ebrower, jbang, swarren,
	wenjiaz, davidw, gfitzer, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

On Tue, Aug 25, 2015 at 04:15:58PM +0200, Hans Verkuil wrote:
> On 08/25/15 15:44, Thierry Reding wrote:
> > On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
[...]
> > > For CMA we need increase the default memory size.
> > 
> > I'd rather not rely on CMA at all, especially since we do have a way
> > around it.
> 
> For the record, I have no problem with it if we start out with contiguous
> DMA now and enhance it later. I get the impression that getting the IOMMU
> to work is non-trivial, and I don't think it should block merging of this
> driver.
> 
> This is all internal to the driver, so changing it later will not affect
> userspace.

Fair enough.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25  6:30             ` Hans Verkuil
@ 2015-08-25 17:04                 ` Bryan Wu
  -1 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25 17:04 UTC (permalink / raw)
  To: Hans Verkuil, Thierry Reding
  Cc: hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On 08/24/2015 11:30 PM, Hans Verkuil wrote:
> A quick follow-up to Thierry's excellent review:
>
> On 08/25/2015 02:26 AM, Bryan Wu wrote:
>> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> <snip>
>
>>>> +static void
>>>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>>>> +		      const struct tegra_video_format **fmtinfo)
>>>> +{
>>>> +	const struct tegra_video_format *info;
>>>> +	unsigned int min_width;
>>>> +	unsigned int max_width;
>>>> +	unsigned int min_bpl;
>>>> +	unsigned int max_bpl;
>>>> +	unsigned int width;
>>>> +	unsigned int align;
>>>> +	unsigned int bpl;
>>>> +
>>>> +	/* Retrieve format information and select the default format if the
>>>> +	 * requested format isn't supported.
>>>> +	 */
>>>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>>>> +	if (!info)
>>>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>>> Should this not be an error? As far as I can tell this is silently
>>> substituting the default format for the requested one if the requested
>>> one isn't supported. Isn't the whole point of this to find out if some
>>> format is supported?
>>>
>> I think it should return some error and escape following code. I will
>> fix that.
> Actually, this code is according to the V4L2 spec: if the given format is
> not supported, then VIDIOC_TRY_FMT should replace it with a valid default
> format.
>
> The reality is a bit more complex: in many drivers this was never reviewed
> correctly and we ended up with some drivers that return an error for this
> case and some drivers that follow the spec. Historically TV capture drivers
> return an error, webcam drivers don't. Most unfortunate.
>
> Since this driver is much more likely to be used with sensors I would
> follow the spec here and substitute an invalid format with a default
> format.
>
>

Thanks for letting me know this. It's actually quite confusing since I 
looked at several drivers, some of them return error some of them use 
default format.

-Bryan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25 17:04                 ` Bryan Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25 17:04 UTC (permalink / raw)
  To: Hans Verkuil, Thierry Reding
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

On 08/24/2015 11:30 PM, Hans Verkuil wrote:
> A quick follow-up to Thierry's excellent review:
>
> On 08/25/2015 02:26 AM, Bryan Wu wrote:
>> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> <snip>
>
>>>> +static void
>>>> +__tegra_channel_try_format(struct tegra_channel *chan, struct v4l2_pix_format *pix,
>>>> +		      const struct tegra_video_format **fmtinfo)
>>>> +{
>>>> +	const struct tegra_video_format *info;
>>>> +	unsigned int min_width;
>>>> +	unsigned int max_width;
>>>> +	unsigned int min_bpl;
>>>> +	unsigned int max_bpl;
>>>> +	unsigned int width;
>>>> +	unsigned int align;
>>>> +	unsigned int bpl;
>>>> +
>>>> +	/* Retrieve format information and select the default format if the
>>>> +	 * requested format isn't supported.
>>>> +	 */
>>>> +	info = tegra_core_get_format_by_fourcc(pix->pixelformat);
>>>> +	if (!info)
>>>> +		info = tegra_core_get_format_by_fourcc(TEGRA_VF_DEF_FOURCC);
>>> Should this not be an error? As far as I can tell this is silently
>>> substituting the default format for the requested one if the requested
>>> one isn't supported. Isn't the whole point of this to find out if some
>>> format is supported?
>>>
>> I think it should return some error and escape following code. I will
>> fix that.
> Actually, this code is according to the V4L2 spec: if the given format is
> not supported, then VIDIOC_TRY_FMT should replace it with a valid default
> format.
>
> The reality is a bit more complex: in many drivers this was never reviewed
> correctly and we ended up with some drivers that return an error for this
> case and some drivers that follow the spec. Historically TV capture drivers
> return an error, webcam drivers don't. Most unfortunate.
>
> Since this driver is much more likely to be used with sensors I would
> follow the spec here and substitute an invalid format with a default
> format.
>
>

Thanks for letting me know this. It's actually quite confusing since I 
looked at several drivers, some of them return error some of them use 
default format.

-Bryan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
  2015-08-25 13:44             ` Thierry Reding
@ 2015-08-25 21:42                 ` Bryan Wu
  -1 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25 21:42 UTC (permalink / raw)
  To: Thierry Reding
  Cc: hansverk-FYB4Gu1CFyUAvxtiuMwx3w,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	ebrower-DDmLM1+adcrQT0dZR+AlfA, jbang-DDmLM1+adcrQT0dZR+AlfA,
	swarren-DDmLM1+adcrQT0dZR+AlfA, wenjiaz-DDmLM1+adcrQT0dZR+AlfA,
	davidw-DDmLM1+adcrQT0dZR+AlfA, gfitzer-DDmLM1+adcrQT0dZR+AlfA,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On 08/25/2015 06:44 AM, Thierry Reding wrote:
> On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
>> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> [...]
>>>> +{
>>>> +	if (chan->bypass)
>>>> +		return;
>>> I don't see this being set anywhere. Is it dead code? Also the only
>>> description I see is that it's used to bypass register writes, but I
>>> don't see an explanation why that's necessary.
>> We are unifying our downstream VI driver with V4L2 VI driver. And this
>> upstream work is the first step to help that.
>>
>> We are also backporting this driver back to our internal 3.10 kernel which
>> is using nvhost channel to submit register operations from userspace to
>> host1x and VI hardware. Then in this case, our driver needs to bypass all
>> the register operations otherwise we got conflicts between these 2 paths.
>>
>> That's why I put bypass mode here. And bypass mode can be set in device tree
>> or from v4l2-ctrls.
> I don't think it's fair to burden upstream with code that will only ever
> be used downstream. Let's split these changes into a separate patch that
> can be carried downstream.

OK, I will split out a patch for downstream.


>>>> +/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
>>>> +static u32 sp_bit(struct tegra_channel *chan, u32 sp)
>>>> +{
>>>> +	return (sp + chan->port * 4) << 8;
>>>> +}
>>> Technically this returns a mask, not a bit, so sp_mask() would be more
>>> appropriate.
>> Actually it returns the syncpoint value for each port not a mask. Probably
>> sp_bits() is better.
> Looking at the TRM, the field that this generates a value for is called
> VI_COND (in the VI_CFG_VI_INCR_SYNCPT register), so perhaps this should
> really be a macro and named something like:
>
> 	#define VI_CFG_VI_INCR_SYNCPT_COND(x) (((x) & 0xff) << 8)
>
> As for the arithmetic, that doesn't seem to match up. Quoting from your
> original patch:
>
>>>> +/* VI registers */
>>>> +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
>>>> +#define		SP_PP_LINE_START			4
>>>> +#define		SP_PP_FRAME_START			5
>>>> +#define		SP_MW_REQ_DONE				6
>>>> +#define		SP_MW_ACK_DONE				7
> This doesn't seem to match the TRM, which has the following values:
>
> 	 0 = IMMEDIATE
> 	 1 = OP_DONE
> 	 2 = RD_DONE
> 	 3 = REG_WR_SAFE
> 	 4 = VI_MWA_REQ_DONE
> 	 5 = VI_MWB_REQ_DONE
> 	 6 = VI_MWA_ACK_DONE
> 	 7 = VI_MWB_ACK_DONE
> 	 8 = VI_ISPA_DONE
> 	 9 = VI_CSI_PPA_FRAME_START
> 	10 = VI_CSI_PPB_FRAME_START
> 	11 = VI_CSI_PPA_LINE_START
> 	12 = VI_CSI_PPB_LINE_START
> 	13 = VI_VGP0_RCVD
> 	14 = VI_VGP1_RCVD
> 	15 = VI_ISPB_DONE
>
> Comparing with the internal register manuals it looks like the TRM is
> actually wrong. Can you file an internal bug to rectify this and Cc me
> on it, please?

Oh, oops. I will file a bug for that. This list is actually old one in 
Tegra K1 not for Tegra X1.

> Irrespective, since this is generating content for a register field it
> would seem more consistent to define it as a parameterized macro, like
> so:
>
> 	#define VI_CSI_PP_LINE_START(port)	(4 + (port) * 4)
> 	#define VI_CSI_PP_FRAME_START(port)	(5 + (port) * 4)
> 	#define VI_CSI_MWA_REQ_DONE(port)	(6 + (port) * 4)
> 	#define VI_CSI_MWA_ACK_DONE(port)	(7 + (port) * 4)
>
> and then use them together with the above macro:
>
> 	value = VI_CFG_VI_INCR_SYNCPT_COND(VI_CSI_PP_FRAME_START(port)) |
> 		host1x_syncpt_id(syncpt);
> 	writel(value, ...);

Looks good to me. I will fix this.


>>>> +static int tegra_channel_capture_setup(struct tegra_channel *chan)
>>>> +{
>>>> +	int lanes = 2;
>>> unsigned int? And why is it hardcoded to 2? There are checks below for
>>> lanes == 4, which will effectively never happen. So at the very least I
>>> think this should have a TODO comment of some sort. Preferably can it
>>> not be determined at runtime what number of lanes we need?
>> Sure, I forget to fix this. lanes should get from DT and for TPG mode I will
>> choose lanes as 4 by default.
> Can the number of lanes required not be determined at runtime? I suspect
> it would be a property of whatever camera is attached. Then again, this
> is perhaps clarified by the DT binding, so I'll wait and see how that
> looks.

Sure, normally lanes number is defined as "bus-width" in the DT node 
when a real sensor connects.
But since TPG is a virtual sensor which doesn't have any lanes 
requirement. Let's choose 4 for TPG.

>>>> +	u32 height = chan->format.height;
>>>> +	u32 width = chan->format.width;
>>>> +	u32 format = chan->fmtinfo->img_fmt;
>>>> +	u32 data_type = chan->fmtinfo->img_dt;
>>>> +	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
>>>> +	struct chan_regs_config *regs = &chan->regs;
>>>> +
>>>> +	/* CIL PHY register setup */
>>>> +	if (port & 0x1) {
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>>>> +	} else {
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
>>>> +	}
>>> This seems to address registers not actually part of this channel. Why?
>> It's little bit hackish, but it's really have no choice. CIL PHY is shared
>> by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. So we have
>> 3 groups.
> I'm wondering if we can't add some object as abstraction to make this
> more straightforward to follow. I find this driver generally hard to
> understand because of all the (seemingly) random register accesses.

Actually my original design has a separated subdev driver named 
tegra-csi, which handles CSI specific operations and 
tegra-vi/tegra-channel will handle VI operations.

The real problem is we actually just have 1 VI/CSI hardware controller, 
so the register space is kind of mixed up. Some of them are for CSI, 
some of them are for VI.

Although it's still doable, I just feels like little bit hackish.

>>> Also you use magic numbers here and in the remainder of the driver. We
>>> should be able to do better. I presume all of this is documented in the
>>> TRM, so we should be able to easily substitute symbolic names.
>> I also got those magic numbers from internal source. Some of them are not in
>> the TRM. And people just use that settings. I will try to convert them to
>> some meaningful bit names. Please let me do it after I finished the whole
>> work as an incremental patch.
> Sorry, that's not going to work. One of our prerequisite for merging
> code into the upstream kernel has always been to have the registers
> documented in the TRM. Magic numbers are not an option.

OK, I will rework on these register bits.

>>>> +	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>>>> +	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>>>> +	if (lanes == 4) {
>>>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>>>> +		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>>>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>>>> +	}
>>> And this seems to access registers from another port by temporarily
>>> rewriting the CIL base offset. That seems a little hackish to me. I
>>> don't know the hardware intimately enough to know exactly what this
>>> is supposed to accomplish, perhaps you can clarify? Also perhaps we
>>> can come up with some architectural overview of the VI hardware, or
>>> does such an overview exist in the TRM?
>> CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data lanes,
>> then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF channels can
>> not be used in this case.
>>
>> That's why we need to access the CSIB/D/F registers in 4 data lanes use
>> case.
> I find the nomenclature very difficult. So each channel has two ports,
> and each port uses up two lanes of a 4-lane PHY. Can't we structure
> things in a way so that we expose ports as a low-level object and then
> each channel can use either one or two ports? That way we can create at
> runtime a dynamic number of channels (parsed from DT?) and assign ports
> to them.
>
> Perhaps most of that information will already be available in DT. For
> example if we have a 4-lane camera connected to CSI1, then ports C and D
> could be connected (I suppose that's possible with an OF graph?) and the
> driver would simply have to allocate both C and D ports to some channel
> object representing that camera. Similarly we could have one 2-lane
> camera connected to CSI and another 2-lane camera connected to CSI2 and
> assign ports A or B and E or F, respectively, to channels representing
> these camera links.
>

Yes. this can be handled in DT and use CSI subdev architecture.

Let me go to that direction and post a new patch for review.


>>> I see there is, perhaps add a comment somewhere, in the commit
>>> description or the file header giving a reference to where the
>>> architectural overview can be found?
>> It can be found in Tegra X1 TRM like this:
>> "The CSI unit provides for connection of up to six cameras in the system and
>> is organized as three identical instances of two
>> MIPI support blocks, each with a separate 4-lane interface that can be
>> configured as a single camera with 4 lanes or as a dual
>> camera with 2 lanes available for each camera."
>>
>> What about I put this information in the code as a comment?
> Having this as comments is obviously going to help understand the code,
> but the code will still be difficult to follow. I think it would be far
> easier to understand if this was structured in a top-down approach
> rather than bottom-up.
>
>>>> +	/* CSI pixel parser registers setup */
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
>>>> +		 0x280301f0 | (port & 0x1));
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
>>>> +	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
>>>> +		 0x3f0000 | (lanes - 1));
>>>> +
>>>> +	/* CIL PHY register setup */
>>>> +	if (lanes == 4)
>>>> +		phy_write(chan, 0x0101);
>>>> +	else {
>>>> +		u32 val = phy_read(chan);
>>>> +		if (port & 0x1)
>>>> +			val = (val & ~0x100) | 0x100;
>>>> +		else
>>>> +			val = (val & ~0x1) | 0x1;
>>>> +		phy_write(chan, val);
>>>> +	}
>>> The & ~ isn't quite doing what I suspect it should be doing. My
>>> assumption is that you want to set this register to 0x01 if the first
>>> port is to be used and 0x100 if the second port is to be used (or 0x101
>>> if both ports are to be used). In that case I think you'll want
>>> something like this:
>>>
>>> 	value = phy_read(chan);
>>>
>>> 	if (port & 1)
>>> 		value = (value & ~0x0001) | 0x0100;
>>> 	else
>>> 		value = (value & ~0x0100) | 0x0001;
>>>
>>> 	phy_write(chan, value);
>> I don't think your code is correct. The algorithm is to read out the share
>> PHY register value and clear the port related bit and set that bit. Then it
>> won't touch the setting of the other port. It means when we setup a channel
>> it should not change the other channel which sharing PHY register with the
>> current one.
>>
>> In your case, you cleared the other port's bit and set the current port bit.
>> When we write the value back to the PHY register, current port will be
>> enabled but the other port will be disabled.
>>
>> For example, like CSIA is running, the value of PHY register is 0x0001.
>> Then when we try to enable CSIB, we should write 0x0101 to the PHY register
>> but not 0x0100.
> I see. In that case I propose you simply do:
>
> 	if (port & 1)
> 		value |= 0x0100;
> 	else
> 		value |= 0x0001;
>
> Clearing the bit only to set it immediately again is just a waste of CPU
> resources. Likely the compiler will optimize this away, but might as
> well make it easy on the compiler.
Right, clearing is not necessary here.

     val |= (port & 1) ? 0x0100 : 0x0001;

Looks more simple.
> One problem with the above code, though, is that I don't see these bits
> ever being cleared in the PHY. Shouldn't there be code to disable a
> given port when it isn't used? Presumably that would reduce power
> consumption?
We normally stop clock and all the power when stop_streaming. It is not 
necessary to clear that in the PHY.


>>>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>>>> +{
>>>> +	struct tegra_channel_buffer *buf = chan->active;
>>>> +	struct vb2_buffer *vb = &buf->buf;
>>>> +	int err = 0;
>>>> +	u32 thresh, value, frame_start;
>>>> +	int bytes_per_line = chan->format.bytesperline;
>>>> +
>>>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>>>> +		return -EINVAL;
>>>> +
>>>> +	if (chan->bypass)
>>>> +		goto bypass_done;
>>>> +
>>>> +	/* Program buffer address */
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>>>> +		  0x0);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>>>> +		  buf->addr);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>>>> +		  bytes_per_line);
>>>> +
>>>> +	/* Program syncpoint */
>>>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>>>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>>>> +			    frame_start | host1x_syncpt_id(chan->sp));
>>>> +
>>>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>>>> +
>>>> +	/* Use syncpoint to wake up */
>>>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>>>> +
>>>> +	mutex_unlock(&chan->lock);
>>>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>>>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>>>> +	mutex_lock(&chan->lock);
>>> What's the point of taking the lock in the first place if you drop it
>>> here, even if temporarily? This is a per-channel lock, and it protects
>>> the channel against concurrent captures. So if you drop the lock here,
>>> don't you run risk of having two captures run concurrently? And by the
>>> time you get to the error handling or buffer completion below you can't
>>> be sure you're actually dealing with the same buffer that you started
>>> with.
>> After some discussion with Hans, I changed to this. Since there won't be a
>> second capture start which is prevented by v4l2-core, it won't cause the
>> buffer issue.
>>
>> Waiting for host1x syncpoint take time, so dropping lock can let other
>> non-capture ioctls and operations happen.
> If the core already prevents multiple captures for a single channel, do
> we even need the lock in the first place?

Let me go for kthread.

>>>> +	if (err) {
>>>> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
>>>> +		tegra_channel_capture_error(chan, err);
>>>> +	}
>>> Is timeout really the only kind of error that can happen here?
>>>
>> I actually don't know other errors. Any other errors I need take of here?
> Then I suggest you play it safe and simply report what exact error was
> returned:
>
> 		dev_err(&chan->video.dev, "failed to wait for syncpoint: %d\n",
> 			err);

OK, fixed.

>>>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>>>> +{
>>>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>>>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>>>> +
>>>> +	buf->chan = chan;
>>>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>> +
>>>> +	return 0;
>>>> +}
>>> This seems to use contiguous DMA, which I guess presumes CMA support?
>>> We're dealing with very large buffers here. Your default frame size
>>> would yield buffers of roughly 32 MiB each, and you probably need a
>>> couple of those to ensure smooth playback. That's quite a bit of
>>> memory to reserve for CMA.
>> In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.
> There is no way to use the DMA API with SMMU upstream. You need to set
> up your IOMMU domain yourself and attach the VI device to it manually.
> That means you'll also need to manage your IOVA space manually to make
> use of this. I know it's an unfortunate situation and there's work
> underway to improve it, but we're not quite there yet.
>
>> For CMA we need increase the default memory size.
> I'd rather not rely on CMA at all, especially since we do have a way
> around it.
>
>>> Have you ever tried to make this work with the IOMMU API so that we can
>>> allocate arbitrary buffers and linearize them for the hardware through
>>> the SMMU?
>> I tested this code in downstream kernel with SMMU. Do we fully support SMMU
>> in upstream version? I didn't check that.
> *sigh* We can't merge code upstream which hasn't been tested upstream.
> Let's make sure we get into place whatever we need to actually run this
> on an upstream kernel. That typically means you need to apply your work
> on top of some recent linux-next and run it on an upstream-supported
> board.
>
> I realize that this is rather difficult to do for Tegra X1 because the
> support for it hasn't been completely merged yet. One possibility is to
> apply this on top of my staging/work branch[0] and run it on the P2371
> or P2571 boards that are supported there. Alternatively since this is
> hardware which is available (in similar form) on Tegra K1 you could try
> to make it work on something like the Jetson TK1. Getting it to support
> Tegra X1 will then be (hopefully) a simple matter of adding parameters
> for the new generation.
>
> Not testing this on an upstream kernel means that it is likely not going
> to work because we're missing some bits, such as in the clock driver or
> other, that are essential to make this work and as a result we'd be
> carrying broken code in the upstream kernel. That's not acceptable.
>
> [0]: https://github.com/thierryreding/linux/commits/staging/work
>
Oh, maybe my description is not very clear here. I did test this patch 
in upstream kernel which is exactly based on your Tegra kernel branch 
staging/work.

And it works fine with test pattern generator now. I just don't know 
whether the upstream kernel is using CMA or IOMMU. But from your answer, 
we don't have IOMMU in upstream but we do have in downstream kernel. I 
think my driver is using CMA, I will double check that.

The work I mentioned in downstream is quite similar with this patch, 
because both downstream and upstream V4L2 driver use the same videobuf2 
API. videobuf2 API use dma-mapping API then.
But in downstream dma-mapping by default support SMMU/IOMMU stuff. Then 
I assume it's available in upstream.




>>>> +	pix->pixelformat = info->fourcc;
>>>> +	pix->field = V4L2_FIELD_NONE;
>>>> +
>>>> +	/* The transfer alignment requirements are expressed in bytes. Compute
>>>> +	 * the minimum and maximum values, clamp the requested width and convert
>>>> +	 * it back to pixels.
>>>> +	 */
>>>> +	align = lcm(chan->align, info->bpp);
>>>> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
>>>> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
>>>> +	width = rounddown(pix->width * info->bpp, align);
>>> Shouldn't these be roundup()?
>> Why? I don't understand but rounddown looks good to me
> For the maximum and minimum this is probably not an issue because they
> likely are multiples of the alignment (I hope they are, otherwise they
> would be broken; which would indicate that computing min_width and
> max_width here is actually redundant, or should be replaced by some
> sort of WARN() or even BUG().
>
> That said, for the width you'll want to round up, otherwise you will be
> potentially truncating the amount of data you receive. Consider for
> example the case where you wanted to capture a 2x2 image at 32-bit RGB.
> With your above calculation you'll end up with:
>
> 	align = lcm(64, 4) = 64;
> 	width = rounddown(2 * 4 = 8, 64) = 0;

Width should go for roundup(). I assume you asked for roundup() for all 
these calculation.

>
> That's really not what you want. I realize that this particular case
> will be cancelled out by the clamp() calculation below, but the same
> error would apply to larger resolution images. You'll always be missing
> up to 63 bytes if you round down that way.
>
>>>> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
>>>> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
>>>> +			    TEGRA_MAX_HEIGHT);
>>> The above fits nicely on one line and doesn't need to be wrapped.
>> Fixed
>>>> +
>>>> +	/* Clamp the requested bytes per line value. If the maximum bytes per
>>>> +	 * line value is zero, the module doesn't support user configurable line
>>>> +	 * sizes. Override the requested value with the minimum in that case.
>>>> +	 */
>>>> +	min_bpl = pix->width * info->bpp;
>>>> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
>>>> +	bpl = rounddown(pix->bytesperline, chan->align);
>>> Again, I think these should be roundup().
>> Why? I don't understand but rounddown looks good to me
> Same applies here. Alignment is a restriction regarding the *minimum*
> size, rounding up is therefore what you really need.
>
>>>> +	/* VI Channel is 64 bytes alignment */
>>>> +	chan->align = 64;
>>> Does this need parameterization for other SoC generations?
>> So far it's 64 bytes and I don't see any change about this in the future
>> generations.
> I don't see this documented in the TRM. Can you file a bug to get this
> added? We have tables for this kind of restrictions for other devices,
> such as display controller. We'll need that in the TRM for VI as well.

OK, I will do it.

>>>> +	chan->surface = 0;
>>> I can't find this being set to anything other than 0. What is its use?
>> Each channel actually has 3 memory output surfaces. But I don't find any use
>> case to use the surface 1 and surface 2. So I just added this parameter for
>> future usage.
>>
>> chan->surface is used in tegra_channel_capture_frame()
> I don't understand why it needs to be stored in the channel. We could
> simply hard-code it to 0 in tegra_channel_capture_frame(). Perhaps along
> with a TODO comment or similar that this might need to be paramaterized?
OK, Let me remove the surface parameters here. If we find we need that I 
will add it back in the future.


> The TRM isn't any help in explaining why three surfaces are available.
> Would you happen to know what surfaces 1 and 2 can be used for?

That's true, I don't see any explanation in TRM but just some registers.

>>>> diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
>>>> new file mode 100644
>>>> index 0000000..7d1026b
>>>> --- /dev/null
>>>> +++ b/drivers/media/platform/tegra/tegra-core.h
>>>> @@ -0,0 +1,134 @@
>>>> +/*
>>>> + * NVIDIA Tegra Video Input Device Driver Core Helpers
>>>> + *
>>>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>>>> + *
>>>> + * Author: Bryan Wu <pengw-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License version 2 as
>>>> + * published by the Free Software Foundation.
>>>> + */
>>>> +
>>>> +#ifndef __TEGRA_CORE_H__
>>>> +#define __TEGRA_CORE_H__
>>>> +
>>>> +#include <dt-bindings/media/tegra-vi.h>
>>>> +
>>>> +#include <media/v4l2-subdev.h>
>>>> +
>>>> +/* Minimum and maximum width and height common to Tegra video input device. */
>>>> +#define TEGRA_MIN_WIDTH		32U
>>>> +#define TEGRA_MAX_WIDTH		7680U
>>>> +#define TEGRA_MIN_HEIGHT	32U
>>>> +#define TEGRA_MAX_HEIGHT	7680U
>>> Is this dependent on SoC generation? If we wanted to support Tegra K1,
>>> would the same values apply or do they need to be parameterized?
>> I actually don't get any information about this max/min resolution. Here I
>> just put some values for the format calculation.
> Can you request that this be added to the TRM (via that internal bug
> report I mentioned), please? According to the register definitions the
> width and height fields to be programmed are 16-bit, but I doubt that we
> can realistically capture frames of 65535x65535 pixels.
OK, I will do it.


>>> On that note, could you outline what would be necessary to make this
>>> work on Tegra K1? What are the differences between the VI hardware on
>>> Tegra X1 vs. Tegra K1?
>>>
>> Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6
>> channels, Tegra K1 has 2 channels.
> Okay, so it should be relatively easy to make this work on Tegra K1 as
> well. I'll see if I can find some time to play with that. What would be
> the easiest way to check that this works? I suppose I could write a
> small program to capture images from the V4L2 node(s) that this exposes
> and displays them in a DRM/KMS overlay via DMA-BUF. But perhaps there
> are premade tools to achieve this? Preferably with not too many
> dependencies.
Yeah, it's not very difficult to add support for Tegra K1, basically 
just some registers are different.

For the test case, I'm using open source tool yavta to 
capture/v4l2-ctrls/enum-ctrls.

http://git.ideasonboard.org/yavta.git

For the media controller, I'm using media-ctl of v4l-utils

http://git.linuxtv.org/v4l-utils.git


>>>> +/* UHD 4K resolution as default resolution for all Tegra video input device. */
>>>> +#define TEGRA_DEF_WIDTH		3840
>>>> +#define TEGRA_DEF_HEIGHT	2160
>>> Is this a sensible default? It seems rather large to me.
>> Actually I use this for TPG which is the default setting of VI. And it can
>> be override from user space IOCTL.
> I understand, but UHD is rather big, so not sure if it makes a good
> default. Perhaps 1920x1080 would be a more realistic default. But I
> don't feel very strong about this.
1080p is good for me. I will change to that. It's just for test pattern 
generator.
For real sensor, I think we can easily support 23Mega pixel sensor.
>>>> +
>>>> +#define TEGRA_VF_DEF		TEGRA_VF_RGB888
>>>> +#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
>>> Should we not have only one of these and convert to the other via some
>>> table?
>> This is also TPG default mode
> I understand, but the fourcc version can be converted to the Tegra
> internal format with a function, right? So it seems weird that we'd have
> to hard-code both here, which also means that they need to be manually
> kept in sync.
Sure, I will remove FOURCC one and just use

#define TEGRA_VF_DEF V4L2_PIX_FMT_RGB32


>>>> +	struct tegra_channel *chan;
>>>> +
>>>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>>>> +		chan = &vi->chans[i];
>>>> +
>>>> +		ret = tegra_channel_init(vi, chan, i);
>>> Again, chan is only used once, so directly passing &vi->chans[i] to
>>> tegra_channel_init() would be more concise.
>> OK, I will remove 'chan' parameter from the list. And just pass i as the
>> port number.
> I didn't express myself very clearly. What I was suggesting was to
> remove the chan temporary variable and pass in &vi->chans[i] directly.
> Passing in both &vi->chans[i] and i looks okay to me, that way you don't
> have to look up i via other means. Provided that you still need it, of
> course.
I understood that, but just found remove 'chan' parameter here is simpler.

----
for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
     ret = tegra_channel_init(vi, i);
----


>>>> +	vi_tpg_fmts_bitmap_init(vi);
>>>> +
>>>> +	ret = tegra_vi_v4l2_init(vi);
>>>> +	if (ret < 0)
>>>> +		return ret;
>>>> +
>>>> +	/* Check whether VI is in test pattern generator (TPG) mode */
>>>> +	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
>>>> +			     &vi->pg_mode);
>>> This doesn't sound right. Wouldn't this mean that you can either use the
>>> device in TPG mode or sensor mode only? With no means of switching at
>>> runtime? But then I see that there's an IOCTL to set this mode, so why
>>> even bother having this in DT in the first place?
>> DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls
>> (IOCTL) is another way to do that.
>>
>> We can remove this DT stuff but just use runtime v4l2-ctrls.
> Yes, let's do that then. It's a policy decision and therefore doesn't
> belong in DT.

OK, removed.

>>>> diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
>>> [...]
>>>> +#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>>>> +#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>>>> +
>>>> +/*
>>>> + * Supported CSI to VI Data Formats
>>>> + */
>>>> +#define TEGRA_VF_RAW6		0
>>>> +#define TEGRA_VF_RAW7		1
>>>> +#define TEGRA_VF_RAW8		2
>>>> +#define TEGRA_VF_RAW10		3
>>>> +#define TEGRA_VF_RAW12		4
>>>> +#define TEGRA_VF_RAW14		5
>>>> +#define TEGRA_VF_EMBEDDED8	6
>>>> +#define TEGRA_VF_RGB565		7
>>>> +#define TEGRA_VF_RGB555		8
>>>> +#define TEGRA_VF_RGB888		9
>>>> +#define TEGRA_VF_RGB444		10
>>>> +#define TEGRA_VF_RGB666		11
>>>> +#define TEGRA_VF_YUV422		12
>>>> +#define TEGRA_VF_YUV420		13
>>>> +#define TEGRA_VF_YUV420_CSPS	14
>>>> +
>>>> +#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
>>> What do we need these for? These seem to me to be internal formats
>>> supported by the hardware, but the existence of this file implies that
>>> you plan on using them in the DT. What's the use-case?
>>>
>>>
>> The original plan is to put nvidia;video-format in device tree and this is
>> the data formats for that. Now we don't need nvidia;video-format in device
>> tree. Then I let me move it into our tegra-core.c, because our
>> tegra_video_formats table needs this.
> If we don't need it now, why will we ever need it? Shouldn't this be
> something that's configurable and depending on what camera is attached
> or what format the user has requested?
>
> Thierry
In my first version, I put nvidia;video-format into the VI device tree 
node, CSI device tree node and TPG DT node. Now this should be removed 
and this header file I will convert to a enum for internal 
tegra_video_formats.

Thanks,
-Bryan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver
@ 2015-08-25 21:42                 ` Bryan Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Bryan Wu @ 2015-08-25 21:42 UTC (permalink / raw)
  To: Thierry Reding
  Cc: hansverk, linux-media, ebrower, jbang, swarren, wenjiaz, davidw,
	gfitzer, linux-tegra

On 08/25/2015 06:44 AM, Thierry Reding wrote:
> On Mon, Aug 24, 2015 at 05:26:20PM -0700, Bryan Wu wrote:
>> On 08/21/2015 06:03 AM, Thierry Reding wrote:
>>> On Thu, Aug 20, 2015 at 05:51:39PM -0700, Bryan Wu wrote:
> [...]
>>>> +{
>>>> +	if (chan->bypass)
>>>> +		return;
>>> I don't see this being set anywhere. Is it dead code? Also the only
>>> description I see is that it's used to bypass register writes, but I
>>> don't see an explanation why that's necessary.
>> We are unifying our downstream VI driver with V4L2 VI driver. And this
>> upstream work is the first step to help that.
>>
>> We are also backporting this driver back to our internal 3.10 kernel which
>> is using nvhost channel to submit register operations from userspace to
>> host1x and VI hardware. Then in this case, our driver needs to bypass all
>> the register operations otherwise we got conflicts between these 2 paths.
>>
>> That's why I put bypass mode here. And bypass mode can be set in device tree
>> or from v4l2-ctrls.
> I don't think it's fair to burden upstream with code that will only ever
> be used downstream. Let's split these changes into a separate patch that
> can be carried downstream.

OK, I will split out a patch for downstream.


>>>> +/* Syncpoint bits of TEGRA_VI_CFG_VI_INCR_SYNCPT */
>>>> +static u32 sp_bit(struct tegra_channel *chan, u32 sp)
>>>> +{
>>>> +	return (sp + chan->port * 4) << 8;
>>>> +}
>>> Technically this returns a mask, not a bit, so sp_mask() would be more
>>> appropriate.
>> Actually it returns the syncpoint value for each port not a mask. Probably
>> sp_bits() is better.
> Looking at the TRM, the field that this generates a value for is called
> VI_COND (in the VI_CFG_VI_INCR_SYNCPT register), so perhaps this should
> really be a macro and named something like:
>
> 	#define VI_CFG_VI_INCR_SYNCPT_COND(x) (((x) & 0xff) << 8)
>
> As for the arithmetic, that doesn't seem to match up. Quoting from your
> original patch:
>
>>>> +/* VI registers */
>>>> +#define TEGRA_VI_CFG_VI_INCR_SYNCPT                     0x000
>>>> +#define		SP_PP_LINE_START			4
>>>> +#define		SP_PP_FRAME_START			5
>>>> +#define		SP_MW_REQ_DONE				6
>>>> +#define		SP_MW_ACK_DONE				7
> This doesn't seem to match the TRM, which has the following values:
>
> 	 0 = IMMEDIATE
> 	 1 = OP_DONE
> 	 2 = RD_DONE
> 	 3 = REG_WR_SAFE
> 	 4 = VI_MWA_REQ_DONE
> 	 5 = VI_MWB_REQ_DONE
> 	 6 = VI_MWA_ACK_DONE
> 	 7 = VI_MWB_ACK_DONE
> 	 8 = VI_ISPA_DONE
> 	 9 = VI_CSI_PPA_FRAME_START
> 	10 = VI_CSI_PPB_FRAME_START
> 	11 = VI_CSI_PPA_LINE_START
> 	12 = VI_CSI_PPB_LINE_START
> 	13 = VI_VGP0_RCVD
> 	14 = VI_VGP1_RCVD
> 	15 = VI_ISPB_DONE
>
> Comparing with the internal register manuals it looks like the TRM is
> actually wrong. Can you file an internal bug to rectify this and Cc me
> on it, please?

Oh, oops. I will file a bug for that. This list is actually old one in 
Tegra K1 not for Tegra X1.

> Irrespective, since this is generating content for a register field it
> would seem more consistent to define it as a parameterized macro, like
> so:
>
> 	#define VI_CSI_PP_LINE_START(port)	(4 + (port) * 4)
> 	#define VI_CSI_PP_FRAME_START(port)	(5 + (port) * 4)
> 	#define VI_CSI_MWA_REQ_DONE(port)	(6 + (port) * 4)
> 	#define VI_CSI_MWA_ACK_DONE(port)	(7 + (port) * 4)
>
> and then use them together with the above macro:
>
> 	value = VI_CFG_VI_INCR_SYNCPT_COND(VI_CSI_PP_FRAME_START(port)) |
> 		host1x_syncpt_id(syncpt);
> 	writel(value, ...);

Looks good to me. I will fix this.


>>>> +static int tegra_channel_capture_setup(struct tegra_channel *chan)
>>>> +{
>>>> +	int lanes = 2;
>>> unsigned int? And why is it hardcoded to 2? There are checks below for
>>> lanes == 4, which will effectively never happen. So at the very least I
>>> think this should have a TODO comment of some sort. Preferably can it
>>> not be determined at runtime what number of lanes we need?
>> Sure, I forget to fix this. lanes should get from DT and for TPG mode I will
>> choose lanes as 4 by default.
> Can the number of lanes required not be determined at runtime? I suspect
> it would be a property of whatever camera is attached. Then again, this
> is perhaps clarified by the DT binding, so I'll wait and see how that
> looks.

Sure, normally lanes number is defined as "bus-width" in the DT node 
when a real sensor connects.
But since TPG is a virtual sensor which doesn't have any lanes 
requirement. Let's choose 4 for TPG.

>>>> +	u32 height = chan->format.height;
>>>> +	u32 width = chan->format.width;
>>>> +	u32 format = chan->fmtinfo->img_fmt;
>>>> +	u32 data_type = chan->fmtinfo->img_dt;
>>>> +	u32 word_count = tegra_core_get_word_count(width, chan->fmtinfo);
>>>> +	struct chan_regs_config *regs = &chan->regs;
>>>> +
>>>> +	/* CIL PHY register setup */
>>>> +	if (port & 0x1) {
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 - 0x34, 0x0);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>>>> +	} else {
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x10000);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0 + 0x34, 0x0);
>>>> +	}
>>> This seems to address registers not actually part of this channel. Why?
>> It's little bit hackish, but it's really have no choice. CIL PHY is shared
>> by 2 channels. like CSIA and CSIB, CSIC and CSID, CSIE and CSIF. So we have
>> 3 groups.
> I'm wondering if we can't add some object as abstraction to make this
> more straightforward to follow. I find this driver generally hard to
> understand because of all the (seemingly) random register accesses.

Actually my original design has a separated subdev driver named 
tegra-csi, which handles CSI specific operations and 
tegra-vi/tegra-channel will handle VI operations.

The real problem is we actually just have 1 VI/CSI hardware controller, 
so the register space is kind of mixed up. Some of them are for CSI, 
some of them are for VI.

Although it's still doable, I just feels like little bit hackish.

>>> Also you use magic numbers here and in the remainder of the driver. We
>>> should be able to do better. I presume all of this is documented in the
>>> TRM, so we should be able to easily substitute symbolic names.
>> I also got those magic numbers from internal source. Some of them are not in
>> the TRM. And people just use that settings. I will try to convert them to
>> some meaningful bit names. Please let me do it after I finished the whole
>> work as an incremental patch.
> Sorry, that's not going to work. One of our prerequisite for merging
> code into the upstream kernel has always been to have the registers
> documented in the TRM. Magic numbers are not an option.

OK, I will rework on these register bits.

>>>> +	cil_write(chan, TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>>>> +	cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>>>> +	if (lanes == 4) {
>>>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port + 1);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PAD_CONFIG0, 0x0);
>>>> +		cil_write(chan,	TEGRA_CSI_CIL_INTERRUPT_MASK, 0x0);
>>>> +		cil_write(chan, TEGRA_CSI_CIL_PHY_CONTROL, 0xA);
>>>> +		regs->cil = regs_base(TEGRA_CSI_CIL_0_BASE, port);
>>>> +	}
>>> And this seems to access registers from another port by temporarily
>>> rewriting the CIL base offset. That seems a little hackish to me. I
>>> don't know the hardware intimately enough to know exactly what this
>>> is supposed to accomplish, perhaps you can clarify? Also perhaps we
>>> can come up with some architectural overview of the VI hardware, or
>>> does such an overview exist in the TRM?
>> CSI have 6 channels but just 3 PHYs. If a channel want to use 4 data lanes,
>> then it has to be CSIA, CSIC and CSIE. And CSIB, CSID and CSIF channels can
>> not be used in this case.
>>
>> That's why we need to access the CSIB/D/F registers in 4 data lanes use
>> case.
> I find the nomenclature very difficult. So each channel has two ports,
> and each port uses up two lanes of a 4-lane PHY. Can't we structure
> things in a way so that we expose ports as a low-level object and then
> each channel can use either one or two ports? That way we can create at
> runtime a dynamic number of channels (parsed from DT?) and assign ports
> to them.
>
> Perhaps most of that information will already be available in DT. For
> example if we have a 4-lane camera connected to CSI1, then ports C and D
> could be connected (I suppose that's possible with an OF graph?) and the
> driver would simply have to allocate both C and D ports to some channel
> object representing that camera. Similarly we could have one 2-lane
> camera connected to CSI and another 2-lane camera connected to CSI2 and
> assign ports A or B and E or F, respectively, to channels representing
> these camera links.
>

Yes. this can be handled in DT and use CSI subdev architecture.

Let me go to that direction and post a new patch for review.


>>> I see there is, perhaps add a comment somewhere, in the commit
>>> description or the file header giving a reference to where the
>>> architectural overview can be found?
>> It can be found in Tegra X1 TRM like this:
>> "The CSI unit provides for connection of up to six cameras in the system and
>> is organized as three identical instances of two
>> MIPI support blocks, each with a separate 4-lane interface that can be
>> configured as a single camera with 4 lanes or as a dual
>> camera with 2 lanes available for each camera."
>>
>> What about I put this information in the code as a comment?
> Having this as comments is obviously going to help understand the code,
> but the code will still be difficult to follow. I think it would be far
> easier to understand if this was structured in a top-down approach
> rather than bottom-up.
>
>>>> +	/* CSI pixel parser registers setup */
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_PARSER_INTERRUPT_MASK, 0x0);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL0,
>>>> +		 0x280301f0 | (port & 0x1));
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_PP_COMMAND, 0xf007);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_CONTROL1, 0x11);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_GAP, 0x140000);
>>>> +	pp_write(chan, TEGRA_CSI_PIXEL_STREAM_EXPECTED_FRAME, 0x0);
>>>> +	pp_write(chan, TEGRA_CSI_INPUT_STREAM_CONTROL,
>>>> +		 0x3f0000 | (lanes - 1));
>>>> +
>>>> +	/* CIL PHY register setup */
>>>> +	if (lanes == 4)
>>>> +		phy_write(chan, 0x0101);
>>>> +	else {
>>>> +		u32 val = phy_read(chan);
>>>> +		if (port & 0x1)
>>>> +			val = (val & ~0x100) | 0x100;
>>>> +		else
>>>> +			val = (val & ~0x1) | 0x1;
>>>> +		phy_write(chan, val);
>>>> +	}
>>> The & ~ isn't quite doing what I suspect it should be doing. My
>>> assumption is that you want to set this register to 0x01 if the first
>>> port is to be used and 0x100 if the second port is to be used (or 0x101
>>> if both ports are to be used). In that case I think you'll want
>>> something like this:
>>>
>>> 	value = phy_read(chan);
>>>
>>> 	if (port & 1)
>>> 		value = (value & ~0x0001) | 0x0100;
>>> 	else
>>> 		value = (value & ~0x0100) | 0x0001;
>>>
>>> 	phy_write(chan, value);
>> I don't think your code is correct. The algorithm is to read out the share
>> PHY register value and clear the port related bit and set that bit. Then it
>> won't touch the setting of the other port. It means when we setup a channel
>> it should not change the other channel which sharing PHY register with the
>> current one.
>>
>> In your case, you cleared the other port's bit and set the current port bit.
>> When we write the value back to the PHY register, current port will be
>> enabled but the other port will be disabled.
>>
>> For example, like CSIA is running, the value of PHY register is 0x0001.
>> Then when we try to enable CSIB, we should write 0x0101 to the PHY register
>> but not 0x0100.
> I see. In that case I propose you simply do:
>
> 	if (port & 1)
> 		value |= 0x0100;
> 	else
> 		value |= 0x0001;
>
> Clearing the bit only to set it immediately again is just a waste of CPU
> resources. Likely the compiler will optimize this away, but might as
> well make it easy on the compiler.
Right, clearing is not necessary here.

     val |= (port & 1) ? 0x0100 : 0x0001;

Looks more simple.
> One problem with the above code, though, is that I don't see these bits
> ever being cleared in the PHY. Shouldn't there be code to disable a
> given port when it isn't used? Presumably that would reduce power
> consumption?
We normally stop clock and all the power when stop_streaming. It is not 
necessary to clear that in the PHY.


>>>> +static int tegra_channel_capture_frame(struct tegra_channel *chan)
>>>> +{
>>>> +	struct tegra_channel_buffer *buf = chan->active;
>>>> +	struct vb2_buffer *vb = &buf->buf;
>>>> +	int err = 0;
>>>> +	u32 thresh, value, frame_start;
>>>> +	int bytes_per_line = chan->format.bytesperline;
>>>> +
>>>> +	if (!vb2_start_streaming_called(&chan->queue) || !buf)
>>>> +		return -EINVAL;
>>>> +
>>>> +	if (chan->bypass)
>>>> +		goto bypass_done;
>>>> +
>>>> +	/* Program buffer address */
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_MSB + chan->surface * 8,
>>>> +		  0x0);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_OFFSET_LSB + chan->surface * 8,
>>>> +		  buf->addr);
>>>> +	csi_write(chan,
>>>> +		  TEGRA_VI_CSI_SURFACE0_STRIDE + chan->surface * 4,
>>>> +		  bytes_per_line);
>>>> +
>>>> +	/* Program syncpoint */
>>>> +	frame_start = sp_bit(chan, SP_PP_FRAME_START);
>>>> +	tegra_channel_write(chan, TEGRA_VI_CFG_VI_INCR_SYNCPT,
>>>> +			    frame_start | host1x_syncpt_id(chan->sp));
>>>> +
>>>> +	csi_write(chan, TEGRA_VI_CSI_SINGLE_SHOT, 0x1);
>>>> +
>>>> +	/* Use syncpoint to wake up */
>>>> +	thresh = host1x_syncpt_incr_max(chan->sp, 1);
>>>> +
>>>> +	mutex_unlock(&chan->lock);
>>>> +	err = host1x_syncpt_wait(chan->sp, thresh,
>>>> +			         TEGRA_VI_SYNCPT_WAIT_TIMEOUT, &value);
>>>> +	mutex_lock(&chan->lock);
>>> What's the point of taking the lock in the first place if you drop it
>>> here, even if temporarily? This is a per-channel lock, and it protects
>>> the channel against concurrent captures. So if you drop the lock here,
>>> don't you run risk of having two captures run concurrently? And by the
>>> time you get to the error handling or buffer completion below you can't
>>> be sure you're actually dealing with the same buffer that you started
>>> with.
>> After some discussion with Hans, I changed to this. Since there won't be a
>> second capture start which is prevented by v4l2-core, it won't cause the
>> buffer issue.
>>
>> Waiting for host1x syncpoint take time, so dropping lock can let other
>> non-capture ioctls and operations happen.
> If the core already prevents multiple captures for a single channel, do
> we even need the lock in the first place?

Let me go for kthread.

>>>> +	if (err) {
>>>> +		dev_err(&chan->video.dev, "frame start syncpt timeout!\n");
>>>> +		tegra_channel_capture_error(chan, err);
>>>> +	}
>>> Is timeout really the only kind of error that can happen here?
>>>
>> I actually don't know other errors. Any other errors I need take of here?
> Then I suggest you play it safe and simply report what exact error was
> returned:
>
> 		dev_err(&chan->video.dev, "failed to wait for syncpoint: %d\n",
> 			err);

OK, fixed.

>>>> +static int tegra_channel_buffer_prepare(struct vb2_buffer *vb)
>>>> +{
>>>> +	struct tegra_channel *chan = vb2_get_drv_priv(vb->vb2_queue);
>>>> +	struct tegra_channel_buffer *buf = to_tegra_channel_buffer(vb);
>>>> +
>>>> +	buf->chan = chan;
>>>> +	buf->addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>> +
>>>> +	return 0;
>>>> +}
>>> This seems to use contiguous DMA, which I guess presumes CMA support?
>>> We're dealing with very large buffers here. Your default frame size
>>> would yield buffers of roughly 32 MiB each, and you probably need a
>>> couple of those to ensure smooth playback. That's quite a bit of
>>> memory to reserve for CMA.
>> In vb2 core driver, it's using dma-mapping API which might be CMA or SMMU.
> There is no way to use the DMA API with SMMU upstream. You need to set
> up your IOMMU domain yourself and attach the VI device to it manually.
> That means you'll also need to manage your IOVA space manually to make
> use of this. I know it's an unfortunate situation and there's work
> underway to improve it, but we're not quite there yet.
>
>> For CMA we need increase the default memory size.
> I'd rather not rely on CMA at all, especially since we do have a way
> around it.
>
>>> Have you ever tried to make this work with the IOMMU API so that we can
>>> allocate arbitrary buffers and linearize them for the hardware through
>>> the SMMU?
>> I tested this code in downstream kernel with SMMU. Do we fully support SMMU
>> in upstream version? I didn't check that.
> *sigh* We can't merge code upstream which hasn't been tested upstream.
> Let's make sure we get into place whatever we need to actually run this
> on an upstream kernel. That typically means you need to apply your work
> on top of some recent linux-next and run it on an upstream-supported
> board.
>
> I realize that this is rather difficult to do for Tegra X1 because the
> support for it hasn't been completely merged yet. One possibility is to
> apply this on top of my staging/work branch[0] and run it on the P2371
> or P2571 boards that are supported there. Alternatively since this is
> hardware which is available (in similar form) on Tegra K1 you could try
> to make it work on something like the Jetson TK1. Getting it to support
> Tegra X1 will then be (hopefully) a simple matter of adding parameters
> for the new generation.
>
> Not testing this on an upstream kernel means that it is likely not going
> to work because we're missing some bits, such as in the clock driver or
> other, that are essential to make this work and as a result we'd be
> carrying broken code in the upstream kernel. That's not acceptable.
>
> [0]: https://github.com/thierryreding/linux/commits/staging/work
>
Oh, maybe my description is not very clear here. I did test this patch 
in upstream kernel which is exactly based on your Tegra kernel branch 
staging/work.

And it works fine with test pattern generator now. I just don't know 
whether the upstream kernel is using CMA or IOMMU. But from your answer, 
we don't have IOMMU in upstream but we do have in downstream kernel. I 
think my driver is using CMA, I will double check that.

The work I mentioned in downstream is quite similar with this patch, 
because both downstream and upstream V4L2 driver use the same videobuf2 
API. videobuf2 API use dma-mapping API then.
But in downstream dma-mapping by default support SMMU/IOMMU stuff. Then 
I assume it's available in upstream.




>>>> +	pix->pixelformat = info->fourcc;
>>>> +	pix->field = V4L2_FIELD_NONE;
>>>> +
>>>> +	/* The transfer alignment requirements are expressed in bytes. Compute
>>>> +	 * the minimum and maximum values, clamp the requested width and convert
>>>> +	 * it back to pixels.
>>>> +	 */
>>>> +	align = lcm(chan->align, info->bpp);
>>>> +	min_width = roundup(TEGRA_MIN_WIDTH, align);
>>>> +	max_width = rounddown(TEGRA_MAX_WIDTH, align);
>>>> +	width = rounddown(pix->width * info->bpp, align);
>>> Shouldn't these be roundup()?
>> Why? I don't understand but rounddown looks good to me
> For the maximum and minimum this is probably not an issue because they
> likely are multiples of the alignment (I hope they are, otherwise they
> would be broken; which would indicate that computing min_width and
> max_width here is actually redundant, or should be replaced by some
> sort of WARN() or even BUG().
>
> That said, for the width you'll want to round up, otherwise you will be
> potentially truncating the amount of data you receive. Consider for
> example the case where you wanted to capture a 2x2 image at 32-bit RGB.
> With your above calculation you'll end up with:
>
> 	align = lcm(64, 4) = 64;
> 	width = rounddown(2 * 4 = 8, 64) = 0;

Width should go for roundup(). I assume you asked for roundup() for all 
these calculation.

>
> That's really not what you want. I realize that this particular case
> will be cancelled out by the clamp() calculation below, but the same
> error would apply to larger resolution images. You'll always be missing
> up to 63 bytes if you round down that way.
>
>>>> +	pix->width = clamp(width, min_width, max_width) / info->bpp;
>>>> +	pix->height = clamp(pix->height, TEGRA_MIN_HEIGHT,
>>>> +			    TEGRA_MAX_HEIGHT);
>>> The above fits nicely on one line and doesn't need to be wrapped.
>> Fixed
>>>> +
>>>> +	/* Clamp the requested bytes per line value. If the maximum bytes per
>>>> +	 * line value is zero, the module doesn't support user configurable line
>>>> +	 * sizes. Override the requested value with the minimum in that case.
>>>> +	 */
>>>> +	min_bpl = pix->width * info->bpp;
>>>> +	max_bpl = rounddown(TEGRA_MAX_WIDTH, chan->align);
>>>> +	bpl = rounddown(pix->bytesperline, chan->align);
>>> Again, I think these should be roundup().
>> Why? I don't understand but rounddown looks good to me
> Same applies here. Alignment is a restriction regarding the *minimum*
> size, rounding up is therefore what you really need.
>
>>>> +	/* VI Channel is 64 bytes alignment */
>>>> +	chan->align = 64;
>>> Does this need parameterization for other SoC generations?
>> So far it's 64 bytes and I don't see any change about this in the future
>> generations.
> I don't see this documented in the TRM. Can you file a bug to get this
> added? We have tables for this kind of restrictions for other devices,
> such as display controller. We'll need that in the TRM for VI as well.

OK, I will do it.

>>>> +	chan->surface = 0;
>>> I can't find this being set to anything other than 0. What is its use?
>> Each channel actually has 3 memory output surfaces. But I don't find any use
>> case to use the surface 1 and surface 2. So I just added this parameter for
>> future usage.
>>
>> chan->surface is used in tegra_channel_capture_frame()
> I don't understand why it needs to be stored in the channel. We could
> simply hard-code it to 0 in tegra_channel_capture_frame(). Perhaps along
> with a TODO comment or similar that this might need to be paramaterized?
OK, Let me remove the surface parameters here. If we find we need that I 
will add it back in the future.


> The TRM isn't any help in explaining why three surfaces are available.
> Would you happen to know what surfaces 1 and 2 can be used for?

That's true, I don't see any explanation in TRM but just some registers.

>>>> diff --git a/drivers/media/platform/tegra/tegra-core.h b/drivers/media/platform/tegra/tegra-core.h
>>>> new file mode 100644
>>>> index 0000000..7d1026b
>>>> --- /dev/null
>>>> +++ b/drivers/media/platform/tegra/tegra-core.h
>>>> @@ -0,0 +1,134 @@
>>>> +/*
>>>> + * NVIDIA Tegra Video Input Device Driver Core Helpers
>>>> + *
>>>> + * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
>>>> + *
>>>> + * Author: Bryan Wu <pengw@nvidia.com>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License version 2 as
>>>> + * published by the Free Software Foundation.
>>>> + */
>>>> +
>>>> +#ifndef __TEGRA_CORE_H__
>>>> +#define __TEGRA_CORE_H__
>>>> +
>>>> +#include <dt-bindings/media/tegra-vi.h>
>>>> +
>>>> +#include <media/v4l2-subdev.h>
>>>> +
>>>> +/* Minimum and maximum width and height common to Tegra video input device. */
>>>> +#define TEGRA_MIN_WIDTH		32U
>>>> +#define TEGRA_MAX_WIDTH		7680U
>>>> +#define TEGRA_MIN_HEIGHT	32U
>>>> +#define TEGRA_MAX_HEIGHT	7680U
>>> Is this dependent on SoC generation? If we wanted to support Tegra K1,
>>> would the same values apply or do they need to be parameterized?
>> I actually don't get any information about this max/min resolution. Here I
>> just put some values for the format calculation.
> Can you request that this be added to the TRM (via that internal bug
> report I mentioned), please? According to the register definitions the
> width and height fields to be programmed are 16-bit, but I doubt that we
> can realistically capture frames of 65535x65535 pixels.
OK, I will do it.


>>> On that note, could you outline what would be necessary to make this
>>> work on Tegra K1? What are the differences between the VI hardware on
>>> Tegra X1 vs. Tegra K1?
>>>
>> Tegra X1 and Tegra K1 have similar channel architecture. Tegra X1 has 6
>> channels, Tegra K1 has 2 channels.
> Okay, so it should be relatively easy to make this work on Tegra K1 as
> well. I'll see if I can find some time to play with that. What would be
> the easiest way to check that this works? I suppose I could write a
> small program to capture images from the V4L2 node(s) that this exposes
> and displays them in a DRM/KMS overlay via DMA-BUF. But perhaps there
> are premade tools to achieve this? Preferably with not too many
> dependencies.
Yeah, it's not very difficult to add support for Tegra K1, basically 
just some registers are different.

For the test case, I'm using open source tool yavta to 
capture/v4l2-ctrls/enum-ctrls.

http://git.ideasonboard.org/yavta.git

For the media controller, I'm using media-ctl of v4l-utils

http://git.linuxtv.org/v4l-utils.git


>>>> +/* UHD 4K resolution as default resolution for all Tegra video input device. */
>>>> +#define TEGRA_DEF_WIDTH		3840
>>>> +#define TEGRA_DEF_HEIGHT	2160
>>> Is this a sensible default? It seems rather large to me.
>> Actually I use this for TPG which is the default setting of VI. And it can
>> be override from user space IOCTL.
> I understand, but UHD is rather big, so not sure if it makes a good
> default. Perhaps 1920x1080 would be a more realistic default. But I
> don't feel very strong about this.
1080p is good for me. I will change to that. It's just for test pattern 
generator.
For real sensor, I think we can easily support 23Mega pixel sensor.
>>>> +
>>>> +#define TEGRA_VF_DEF		TEGRA_VF_RGB888
>>>> +#define TEGRA_VF_DEF_FOURCC	V4L2_PIX_FMT_RGB32
>>> Should we not have only one of these and convert to the other via some
>>> table?
>> This is also TPG default mode
> I understand, but the fourcc version can be converted to the Tegra
> internal format with a function, right? So it seems weird that we'd have
> to hard-code both here, which also means that they need to be manually
> kept in sync.
Sure, I will remove FOURCC one and just use

#define TEGRA_VF_DEF V4L2_PIX_FMT_RGB32


>>>> +	struct tegra_channel *chan;
>>>> +
>>>> +	for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
>>>> +		chan = &vi->chans[i];
>>>> +
>>>> +		ret = tegra_channel_init(vi, chan, i);
>>> Again, chan is only used once, so directly passing &vi->chans[i] to
>>> tegra_channel_init() would be more concise.
>> OK, I will remove 'chan' parameter from the list. And just pass i as the
>> port number.
> I didn't express myself very clearly. What I was suggesting was to
> remove the chan temporary variable and pass in &vi->chans[i] directly.
> Passing in both &vi->chans[i] and i looks okay to me, that way you don't
> have to look up i via other means. Provided that you still need it, of
> course.
I understood that, but just found remove 'chan' parameter here is simpler.

----
for (i = 0; i < ARRAY_SIZE(vi->chans); i++) {
     ret = tegra_channel_init(vi, i);
----


>>>> +	vi_tpg_fmts_bitmap_init(vi);
>>>> +
>>>> +	ret = tegra_vi_v4l2_init(vi);
>>>> +	if (ret < 0)
>>>> +		return ret;
>>>> +
>>>> +	/* Check whether VI is in test pattern generator (TPG) mode */
>>>> +	of_property_read_u32(vi->dev->of_node, "nvidia,pg_mode",
>>>> +			     &vi->pg_mode);
>>> This doesn't sound right. Wouldn't this mean that you can either use the
>>> device in TPG mode or sensor mode only? With no means of switching at
>>> runtime? But then I see that there's an IOCTL to set this mode, so why
>>> even bother having this in DT in the first place?
>> DT can provide a default way to set the whole VI as TPG. And v4l2-ctrls
>> (IOCTL) is another way to do that.
>>
>> We can remove this DT stuff but just use runtime v4l2-ctrls.
> Yes, let's do that then. It's a policy decision and therefore doesn't
> belong in DT.

OK, removed.

>>>> diff --git a/include/dt-bindings/media/tegra-vi.h b/include/dt-bindings/media/tegra-vi.h
>>> [...]
>>>> +#ifndef __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>>>> +#define __DT_BINDINGS_MEDIA_TEGRA_VI_H__
>>>> +
>>>> +/*
>>>> + * Supported CSI to VI Data Formats
>>>> + */
>>>> +#define TEGRA_VF_RAW6		0
>>>> +#define TEGRA_VF_RAW7		1
>>>> +#define TEGRA_VF_RAW8		2
>>>> +#define TEGRA_VF_RAW10		3
>>>> +#define TEGRA_VF_RAW12		4
>>>> +#define TEGRA_VF_RAW14		5
>>>> +#define TEGRA_VF_EMBEDDED8	6
>>>> +#define TEGRA_VF_RGB565		7
>>>> +#define TEGRA_VF_RGB555		8
>>>> +#define TEGRA_VF_RGB888		9
>>>> +#define TEGRA_VF_RGB444		10
>>>> +#define TEGRA_VF_RGB666		11
>>>> +#define TEGRA_VF_YUV422		12
>>>> +#define TEGRA_VF_YUV420		13
>>>> +#define TEGRA_VF_YUV420_CSPS	14
>>>> +
>>>> +#endif /* __DT_BINDINGS_MEDIA_TEGRA_VI_H__ */
>>> What do we need these for? These seem to me to be internal formats
>>> supported by the hardware, but the existence of this file implies that
>>> you plan on using them in the DT. What's the use-case?
>>>
>>>
>> The original plan is to put nvidia;video-format in device tree and this is
>> the data formats for that. Now we don't need nvidia;video-format in device
>> tree. Then I let me move it into our tegra-core.c, because our
>> tegra_video_formats table needs this.
> If we don't need it now, why will we ever need it? Shouldn't this be
> something that's configurable and depending on what camera is attached
> or what format the user has requested?
>
> Thierry
In my first version, I put nvidia;video-format into the VI device tree 
node, CSI device tree node and TPG DT node. Now this should be removed 
and this header file I will convert to a enum for internal 
tegra_video_formats.

Thanks,
-Bryan

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-08-25 21:43 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-21  0:51 [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu
2015-08-21  0:51 ` [PATCH RFC 0/2] NVIDIA Tegra VI V4L2 driver Bryan Wu
2015-08-21  0:51 ` [PATCH 1/2] [media] v4l: tegra: Add NVIDIA Tegra VI driver Bryan Wu
2015-08-21  9:28   ` Hans Verkuil
2015-08-25  0:43     ` Bryan Wu
2015-08-21 13:03   ` Thierry Reding
2015-08-21 13:25     ` Hans Verkuil
     [not found]     ` <20150821130339.GB22118-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2015-08-25  0:26       ` Bryan Wu
2015-08-25  0:26         ` Bryan Wu
     [not found]         ` <55DBB62C.4020606-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-08-25  6:30           ` Hans Verkuil
2015-08-25  6:30             ` Hans Verkuil
     [not found]             ` <55DC0B91.2000204-qWit8jRvyhVmR6Xm/wNWPw@public.gmane.org>
2015-08-25 11:25               ` Thierry Reding
2015-08-25 11:25                 ` Thierry Reding
2015-08-25 17:04               ` Bryan Wu
2015-08-25 17:04                 ` Bryan Wu
2015-08-25 13:44           ` Thierry Reding
2015-08-25 13:44             ` Thierry Reding
2015-08-25 14:15             ` Hans Verkuil
     [not found]               ` <55DC789E.8060300-qWit8jRvyhVmR6Xm/wNWPw@public.gmane.org>
2015-08-25 14:24                 ` Thierry Reding
2015-08-25 14:24                   ` Thierry Reding
     [not found]             ` <20150825134444.GH14034-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2015-08-25 21:42               ` Bryan Wu
2015-08-25 21:42                 ` Bryan Wu
2015-08-21  0:51 ` [PATCH 2/2] ARM64: add tegra-vi support in T210 device-tree Bryan Wu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.