* [PATCH 00/14] staging: media: tegra-vde: Add Tegra124 support
@ 2018-08-13 14:50 ` Thierry Reding
  0 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Hi,

this set of patches performs a bit of cleanup and extends support to the
VDE implementation found on Tegra114 and Tegra124. This requires adding
handling for a clock and a reset for the BSEV block that is separate
from the main VDE block. The new VDE revision also supports reference
picture marking, which requires that the BSEV writes out some related
data to a memory location. Since the supported tiling layouts have been
changed in Tegra124, which supports only block-linear and no pitch-
linear layouts, a new way is added to request a specific layout for the
decoded frames. Both of the above changes require breaking the ABI to
accommodate the new data in the custom IOCTL.

Finally, this set also adds support for dealing with an IOMMU, which
makes it more convenient to deal with imported buffers since they no
longer need to be physically contiguous.

Userspace changes for the updated ABI are available here:

	https://cgit.freedesktop.org/~tagr/libvdpau-tegra/commit/

Mauro, I'm sending the device tree changes as part of the series for
completeness, but I expect to pick those up into the Tegra tree once
this has been reviewed and you've applied the driver changes.

Thanks,
Thierry

Thierry Reding (14):
  staging: media: tegra-vde: Support BSEV clock and reset
  staging: media: tegra-vde: Support reference picture marking
  staging: media: tegra-vde: Prepare for interlacing support
  staging: media: tegra-vde: Use DRM/KMS framebuffer modifiers
  staging: media: tegra-vde: Properly mark invalid entries
  staging: media: tegra-vde: Print out invalid FD
  staging: media: tegra-vde: Add some clarifying comments
  staging: media: tegra-vde: Track struct device *
  staging: media: tegra-vde: Add IOMMU support
  staging: media: tegra-vde: Keep VDE in reset when unused
  ARM: tegra: Enable VDE on Tegra124
  ARM: tegra: Add BSEV clock and reset for VDE on Tegra20
  ARM: tegra: Add BSEV clock and reset for VDE on Tegra30
  ARM: tegra: Enable SMMU for VDE on Tegra124

 arch/arm/boot/dts/tegra124.dtsi             |  42 ++
 arch/arm/boot/dts/tegra20.dtsi              |  10 +-
 arch/arm/boot/dts/tegra30.dtsi              |  10 +-
 drivers/staging/media/tegra-vde/tegra-vde.c | 528 +++++++++++++++++---
 drivers/staging/media/tegra-vde/uapi.h      |   6 +-
 5 files changed, 511 insertions(+), 85 deletions(-)

-- 
2.17.0

* [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

The BSEV clock has a separate gate bit and cannot be assumed to be
always enabled. Add explicit handling for the BSEV clock and reset.

This fixes an issue on Tegra124, where the BSEV clock is not enabled
by default: accessing the BSEV registers hangs the CPU unless the BSEV
clock is enabled and the reset is deasserted.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 35 +++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 6f06061a40d9..9d8f833744db 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -74,9 +74,11 @@ struct tegra_vde {
 	struct miscdevice miscdev;
 	struct reset_control *rst;
 	struct reset_control *rst_mc;
+	struct reset_control *rst_bsev;
 	struct gen_pool *iram_pool;
 	struct completion decode_completion;
 	struct clk *clk;
+	struct clk *clk_bsev;
 	dma_addr_t iram_lists_addr;
 	u32 *iram;
 };
@@ -979,6 +981,11 @@ static int tegra_vde_runtime_suspend(struct device *dev)
 		return err;
 	}
 
+	reset_control_assert(vde->rst_bsev);
+
+	usleep_range(2000, 4000);
+
+	clk_disable_unprepare(vde->clk_bsev);
 	clk_disable_unprepare(vde->clk);
 
 	return 0;
@@ -996,6 +1003,16 @@ static int tegra_vde_runtime_resume(struct device *dev)
 		return err;
 	}
 
+	err = clk_prepare_enable(vde->clk_bsev);
+	if (err < 0)
+		return err;
+
+	err = reset_control_deassert(vde->rst_bsev);
+	if (err < 0)
+		return err;
+
+	usleep_range(2000, 4000);
+
 	return 0;
 }
 
@@ -1084,14 +1101,21 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	if (IS_ERR(vde->frameid))
 		return PTR_ERR(vde->frameid);
 
-	vde->clk = devm_clk_get(dev, NULL);
+	vde->clk = devm_clk_get(dev, "vde");
 	if (IS_ERR(vde->clk)) {
 		err = PTR_ERR(vde->clk);
 		dev_err(dev, "Could not get VDE clk %d\n", err);
 		return err;
 	}
 
-	vde->rst = devm_reset_control_get(dev, NULL);
+	vde->clk_bsev = devm_clk_get(dev, "bsev");
+	if (IS_ERR(vde->clk_bsev)) {
+		err = PTR_ERR(vde->clk_bsev);
+		dev_err(dev, "failed to get BSEV clock: %d\n", err);
+		return err;
+	}
+
+	vde->rst = devm_reset_control_get(dev, "vde");
 	if (IS_ERR(vde->rst)) {
 		err = PTR_ERR(vde->rst);
 		dev_err(dev, "Could not get VDE reset %d\n", err);
@@ -1105,6 +1129,13 @@ static int tegra_vde_probe(struct platform_device *pdev)
 		return err;
 	}
 
+	vde->rst_bsev = devm_reset_control_get(dev, "bsev");
+	if (IS_ERR(vde->rst_bsev)) {
+		err = PTR_ERR(vde->rst_bsev);
+		dev_err(dev, "failed to get BSEV reset: %d\n", err);
+		return err;
+	}
+
 	irq = platform_get_irq_byname(pdev, "sync-token");
 	if (irq < 0)
 		return irq;
-- 
2.17.0

* [PATCH 02/14] staging: media: tegra-vde: Support reference picture marking
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Tegra114 and Tegra124 support reference picture marking, which will
cause BSEV to write picture marking data to SDRAM. Make sure there is
a valid destination address for that data to avoid error messages from
the memory controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 54 ++++++++++++++++++++-
 drivers/staging/media/tegra-vde/uapi.h      |  3 ++
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 9d8f833744db..3027b11b11ae 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -60,7 +60,12 @@ struct video_frame {
 	u32 flags;
 };
 
+struct tegra_vde_soc {
+	bool supports_ref_pic_marking;
+};
+
 struct tegra_vde {
+	const struct tegra_vde_soc *soc;
 	void __iomem *sxe;
 	void __iomem *bsev;
 	void __iomem *mbe;
@@ -330,6 +335,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 				      struct video_frame *dpb_frames,
 				      dma_addr_t bitstream_data_addr,
 				      size_t bitstream_data_size,
+				      dma_addr_t secure_addr,
 				      unsigned int macroblocks_nb)
 {
 	struct device *dev = vde->miscdev.parent;
@@ -454,6 +460,9 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 
 	VDE_WR(bitstream_data_addr, vde->sxe + 0x6C);
 
+	if (vde->soc->supports_ref_pic_marking)
+		VDE_WR(secure_addr, vde->sxe + 0x7c);
+
 	value = 0x10000005;
 	value |= ctx->pic_width_in_mbs << 11;
 	value |= ctx->pic_height_in_mbs << 3;
@@ -772,12 +781,15 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	struct tegra_vde_h264_frame __user *frames_user;
 	struct video_frame *dpb_frames;
 	struct dma_buf_attachment *bitstream_data_dmabuf_attachment;
-	struct sg_table *bitstream_sgt;
+	struct dma_buf_attachment *secure_attachment = NULL;
+	struct sg_table *bitstream_sgt, *secure_sgt;
 	enum dma_data_direction dma_dir;
 	dma_addr_t bitstream_data_addr;
+	dma_addr_t secure_addr;
 	dma_addr_t bsev_ptr;
 	size_t lsize, csize;
 	size_t bitstream_data_size;
+	size_t secure_size;
 	unsigned int macroblocks_nb;
 	unsigned int read_bytes;
 	unsigned int cstride;
@@ -803,6 +815,18 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	if (ret)
 		return ret;
 
+	if (vde->soc->supports_ref_pic_marking) {
+		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
+					      ctx.secure_offset, 0, SZ_256,
+					      &secure_attachment,
+					      &secure_addr,
+					      &secure_sgt,
+					      &secure_size,
+					      DMA_TO_DEVICE);
+		if (ret)
+			goto release_bitstream_dmabuf;
+	}
+
 	dpb_frames = kcalloc(ctx.dpb_frames_nb, sizeof(*dpb_frames),
 			     GFP_KERNEL);
 	if (!dpb_frames) {
@@ -876,6 +900,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	ret = tegra_vde_setup_hw_context(vde, &ctx, dpb_frames,
 					 bitstream_data_addr,
 					 bitstream_data_size,
+					 secure_addr,
 					 macroblocks_nb);
 	if (ret)
 		goto put_runtime_pm;
@@ -929,6 +954,10 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	kfree(dpb_frames);
 
 release_bitstream_dmabuf:
+	if (secure_attachment)
+		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
+						DMA_TO_DEVICE);
+
 	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
 					bitstream_sgt, DMA_TO_DEVICE);
 
@@ -1029,6 +1058,8 @@ static int tegra_vde_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, vde);
 
+	vde->soc = of_device_get_match_data(&pdev->dev);
+
 	regs = platform_get_resource_byname(pdev, IORESOURCE_MEM, "sxe");
 	if (!regs)
 		return -ENODEV;
@@ -1258,8 +1289,27 @@ static const struct dev_pm_ops tegra_vde_pm_ops = {
 				tegra_vde_pm_resume)
 };
 
+static const struct tegra_vde_soc tegra20_vde_soc = {
+	.supports_ref_pic_marking = false,
+};
+
+static const struct tegra_vde_soc tegra30_vde_soc = {
+	.supports_ref_pic_marking = false,
+};
+
+static const struct tegra_vde_soc tegra114_vde_soc = {
+	.supports_ref_pic_marking = true,
+};
+
+static const struct tegra_vde_soc tegra124_vde_soc = {
+	.supports_ref_pic_marking = true,
+};
+
 static const struct of_device_id tegra_vde_of_match[] = {
-	{ .compatible = "nvidia,tegra20-vde", },
+	{ .compatible = "nvidia,tegra124-vde", .data = &tegra124_vde_soc },
+	{ .compatible = "nvidia,tegra114-vde", .data = &tegra114_vde_soc },
+	{ .compatible = "nvidia,tegra30-vde", .data = &tegra30_vde_soc },
+	{ .compatible = "nvidia,tegra20-vde", .data = &tegra20_vde_soc },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, tegra_vde_of_match);
diff --git a/drivers/staging/media/tegra-vde/uapi.h b/drivers/staging/media/tegra-vde/uapi.h
index a50c7bcae057..58bfd56de55e 100644
--- a/drivers/staging/media/tegra-vde/uapi.h
+++ b/drivers/staging/media/tegra-vde/uapi.h
@@ -35,6 +35,9 @@ struct tegra_vde_h264_decoder_ctx {
 	__s32 bitstream_data_fd;
 	__u32 bitstream_data_offset;
 
+	__s32 secure_fd;
+	__u32 secure_offset;
+
 	__u64 dpb_frames_ptr;
 	__u8  dpb_frames_nb;
 	__u8  dpb_ref_frames_with_earlier_poc_nb;
-- 
2.17.0

* [PATCH 03/14] staging: media: tegra-vde: Prepare for interlacing support
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

The number of frames doubles when decoding interlaced content and the
structures describing the frames double in size. Take that into account
to prepare for interlacing support.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 73 ++++++++++++++++-----
 1 file changed, 58 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 3027b11b11ae..1a40f6dff7c8 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -61,7 +61,9 @@ struct video_frame {
 };
 
 struct tegra_vde_soc {
+	unsigned int num_ref_pics;
 	bool supports_ref_pic_marking;
+	bool supports_interlacing;
 };
 
 struct tegra_vde {
@@ -205,8 +207,12 @@ static void tegra_vde_setup_frameid(struct tegra_vde *vde,
 	u32 cr_addr = frame ? frame->cr_addr : 0x6CDEAD00;
 	u32 value1 = frame ? ((mbs_width << 16) | mbs_height) : 0;
 	u32 value2 = frame ? ((((mbs_width + 1) >> 1) << 6) | 1) : 0;
+	u32 value = y_addr >> 8;
 
-	VDE_WR(y_addr  >> 8, vde->frameid + 0x000 + frameid * 4);
+	if (vde->soc->supports_interlacing)
+		value |= BIT(31);
+
+	VDE_WR(value,        vde->frameid + 0x000 + frameid * 4);
 	VDE_WR(cb_addr >> 8, vde->frameid + 0x100 + frameid * 4);
 	VDE_WR(cr_addr >> 8, vde->frameid + 0x180 + frameid * 4);
 	VDE_WR(value1,       vde->frameid + 0x080 + frameid * 4);
@@ -229,20 +235,23 @@ static void tegra_setup_frameidx(struct tegra_vde *vde,
 }
 
 static void tegra_vde_setup_iram_entry(struct tegra_vde *vde,
+				       unsigned int num_ref_pics,
 				       unsigned int table,
 				       unsigned int row,
 				       u32 value1, u32 value2)
 {
+	unsigned int entries = num_ref_pics * 2;
 	u32 *iram_tables = vde->iram;
 
 	dev_dbg(vde->miscdev.parent, "IRAM table %u: row %u: 0x%08X 0x%08X\n",
 		table, row, value1, value2);
 
-	iram_tables[0x20 * table + row * 2] = value1;
-	iram_tables[0x20 * table + row * 2 + 1] = value2;
+	iram_tables[entries * table + row * 2] = value1;
+	iram_tables[entries * table + row * 2 + 1] = value2;
 }
 
 static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
+					unsigned int num_ref_pics,
 					struct video_frame *dpb_frames,
 					unsigned int ref_frames_nb,
 					unsigned int with_earlier_poc_nb)
@@ -251,13 +260,17 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 	u32 value, aux_addr;
 	int with_later_poc_nb;
 	unsigned int i, k;
+	size_t size;
+
+	size = num_ref_pics * 4 * 8;
+	memset(vde->iram, 0, size);
 
 	dev_dbg(vde->miscdev.parent, "DPB: Frame 0: frame_num = %d\n",
 		dpb_frames[0].frame_num);
 
 	dev_dbg(vde->miscdev.parent, "REF L0:\n");
 
-	for (i = 0; i < 16; i++) {
+	for (i = 0; i < num_ref_pics; i++) {
 		if (i < ref_frames_nb) {
 			frame = &dpb_frames[i + 1];
 
@@ -277,10 +290,14 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 			value = 0;
 		}
 
-		tegra_vde_setup_iram_entry(vde, 0, i, value, aux_addr);
-		tegra_vde_setup_iram_entry(vde, 1, i, value, aux_addr);
-		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
-		tegra_vde_setup_iram_entry(vde, 3, i, value, aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 0, i, value,
+					   aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 1, i, value,
+					   aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
+					   aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 3, i, value,
+					   aux_addr);
 	}
 
 	if (!(dpb_frames[0].flags & FLAG_B_FRAME))
@@ -309,7 +326,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 			"\tFrame %d: frame_num = %d\n",
 			k + 1, frame->frame_num);
 
-		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
+					   aux_addr);
 	}
 
 	for (k = 0; i < ref_frames_nb; i++, k++) {
@@ -326,7 +344,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 			"\tFrame %d: frame_num = %d\n",
 			k + 1, frame->frame_num);
 
-		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
+		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
+					   aux_addr);
 	}
 }
 
@@ -339,9 +358,20 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 				      unsigned int macroblocks_nb)
 {
 	struct device *dev = vde->miscdev.parent;
+	unsigned int num_ref_pics = 16;
+	/* XXX extend ABI to provide this */
+	bool interlaced = false;
+	size_t size;
 	u32 value;
 	int err;
 
+	if (vde->soc->supports_interlacing) {
+		if (interlaced)
+			num_ref_pics = vde->soc->num_ref_pics;
+		else
+			num_ref_pics = 16;
+	}
+
 	tegra_vde_set_bits(vde, 0x000A, vde->sxe + 0xF0);
 	tegra_vde_set_bits(vde, 0x000B, vde->bsev + CMDQUE_CONTROL);
 	tegra_vde_set_bits(vde, 0x8002, vde->mbe + 0x50);
@@ -369,12 +399,12 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	VDE_WR(0x00000000, vde->bsev + 0x98);
 	VDE_WR(0x00000060, vde->bsev + 0x9C);
 
-	memset(vde->iram + 128, 0, macroblocks_nb / 2);
+	memset(vde->iram + 1024, 0, macroblocks_nb / 2);
 
 	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
 			     ctx->pic_width_in_mbs, ctx->pic_height_in_mbs);
 
-	tegra_vde_setup_iram_tables(vde, dpb_frames,
+	tegra_vde_setup_iram_tables(vde, num_ref_pics, dpb_frames,
 				    ctx->dpb_frames_nb - 1,
 				    ctx->dpb_ref_frames_with_earlier_poc_nb);
 
@@ -396,22 +426,27 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	if (err)
 		return err;
 
-	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x800003FC, false);
+	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
+	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
 		return err;
 
 	value = 0x01500000;
-	value |= ((vde->iram_lists_addr + 512) >> 2) & 0xFFFF;
+	value |= ((vde->iram_lists_addr + 1024) >> 2) & 0xffff;
 
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, true);
 	if (err)
 		return err;
 
+	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);
 	if (err)
 		return err;
 
-	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x80000080, false);
+	size = num_ref_pics * 4 * 8;
+
+	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
+	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
 		return err;
 
@@ -1290,19 +1325,27 @@ static const struct dev_pm_ops tegra_vde_pm_ops = {
 };
 
 static const struct tegra_vde_soc tegra20_vde_soc = {
+	.num_ref_pics = 16,
 	.supports_ref_pic_marking = false,
+	.supports_interlacing = false,
 };
 
 static const struct tegra_vde_soc tegra30_vde_soc = {
+	.num_ref_pics = 32,
 	.supports_ref_pic_marking = false,
+	.supports_interlacing = false,
 };
 
 static const struct tegra_vde_soc tegra114_vde_soc = {
+	.num_ref_pics = 32,
 	.supports_ref_pic_marking = true,
+	.supports_interlacing = false,
 };
 
 static const struct tegra_vde_soc tegra124_vde_soc = {
+	.num_ref_pics = 32,
 	.supports_ref_pic_marking = true,
+	.supports_interlacing = true,
 };
 
 static const struct of_device_id tegra_vde_of_match[] = {
-- 
2.17.0

* [PATCH 04/14] staging: media: tegra-vde: Use DRM/KMS framebuffer modifiers
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

VDE on Tegra20 through Tegra114 supports reading and writing frames in
16x16 tiled layout. Similarly, the various block-linear layouts that
are supported by the GPU on Tegra124 can also be read from and written
to by the Tegra124 VDE.

Enable userspace to specify the desired layout using the existing DRM
framebuffer modifiers.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 112 +++++++++++++++++---
 drivers/staging/media/tegra-vde/uapi.h      |   3 +-
 2 files changed, 100 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 1a40f6dff7c8..275884e745df 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -24,6 +24,8 @@
 
 #include <soc/tegra/pmc.h>
 
+#include <drm/drm_fourcc.h>
+
 #include "uapi.h"
 
 #define ICMDQUE_WR		0x00
@@ -58,12 +60,14 @@ struct video_frame {
 	dma_addr_t aux_addr;
 	u32 frame_num;
 	u32 flags;
+	u64 modifier;
 };
 
 struct tegra_vde_soc {
 	unsigned int num_ref_pics;
 	bool supports_ref_pic_marking;
 	bool supports_interlacing;
+	bool supports_block_linear;
 };
 
 struct tegra_vde {
@@ -202,6 +206,7 @@ static void tegra_vde_setup_frameid(struct tegra_vde *vde,
 				    unsigned int frameid,
 				    u32 mbs_width, u32 mbs_height)
 {
+	u64 modifier = frame ? frame->modifier : DRM_FORMAT_MOD_LINEAR;
 	u32 y_addr  = frame ? frame->y_addr  : 0x6CDEAD00;
 	u32 cb_addr = frame ? frame->cb_addr : 0x6CDEAD00;
 	u32 cr_addr = frame ? frame->cr_addr : 0x6CDEAD00;
@@ -209,8 +214,12 @@ static void tegra_vde_setup_frameid(struct tegra_vde *vde,
 	u32 value2 = frame ? ((((mbs_width + 1) >> 1) << 6) | 1) : 0;
 	u32 value = y_addr >> 8;
 
-	if (vde->soc->supports_interlacing)
+	if (!vde->soc->supports_interlacing) {
+		if (modifier == DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED)
+			value |= BIT(31);
+	} else {
 		value |= BIT(31);
+	}
 
 	VDE_WR(value,        vde->frameid + 0x000 + frameid * 4);
 	VDE_WR(cb_addr >> 8, vde->frameid + 0x100 + frameid * 4);
@@ -349,6 +358,37 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 	}
 }
 
+static int tegra_vde_get_block_height(u64 modifier, unsigned int *block_height)
+{
+	switch (modifier) {
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB:
+		*block_height = 0;
+		return 0;
+
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB:
+		*block_height = 1;
+		return 0;
+
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB:
+		*block_height = 2;
+		return 0;
+
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB:
+		*block_height = 3;
+		return 0;
+
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB:
+		*block_height = 4;
+		return 0;
+
+	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB:
+		*block_height = 5;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 				      struct tegra_vde_h264_decoder_ctx *ctx,
 				      struct video_frame *dpb_frames,
@@ -383,7 +423,21 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	tegra_vde_set_bits(vde, 0x0005, vde->vdma + 0x04);
 
 	VDE_WR(0x00000000, vde->vdma + 0x1C);
-	VDE_WR(0x00000000, vde->vdma + 0x00);
+
+	value = 0x00000000;
+
+	if (vde->soc->supports_block_linear) {
+		unsigned int block_height;
+
+		err = tegra_vde_get_block_height(dpb_frames[0].modifier,
+						 &block_height);
+		if (err < 0)
+			return err;
+
+		value |= block_height << 10;
+	}
+
+	VDE_WR(value, vde->vdma + 0x00);
 	VDE_WR(0x00000007, vde->vdma + 0x04);
 	VDE_WR(0x00000007, vde->frameid + 0x200);
 	VDE_WR(0x00000005, vde->tfe + 0x04);
@@ -730,11 +784,37 @@ static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
 static int tegra_vde_validate_frame(struct device *dev,
 				    struct tegra_vde_h264_frame *frame)
 {
+	struct tegra_vde *vde = dev_get_drvdata(dev);
+
 	if (frame->frame_num > 0x7FFFFF) {
 		dev_err(dev, "Bad frame_num %u\n", frame->frame_num);
 		return -EINVAL;
 	}
 
+	if (vde->soc->supports_block_linear) {
+		switch (frame->modifier) {
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB:
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB:
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB:
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB:
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB:
+		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB:
+			break;
+
+		default:
+			return -EINVAL;
+		}
+	} else {
+		switch (frame->modifier) {
+		case DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED:
+		case DRM_FORMAT_MOD_LINEAR:
+			break;
+
+		default:
+			return -EINVAL;
+		}
+	}
+
 	return 0;
 }
 
@@ -812,7 +892,6 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 {
 	struct device *dev = vde->miscdev.parent;
 	struct tegra_vde_h264_decoder_ctx ctx;
-	struct tegra_vde_h264_frame frames[17];
 	struct tegra_vde_h264_frame __user *frames_user;
 	struct video_frame *dpb_frames;
 	struct dma_buf_attachment *bitstream_data_dmabuf_attachment;
@@ -872,28 +951,30 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	macroblocks_nb = ctx.pic_width_in_mbs * ctx.pic_height_in_mbs;
 	frames_user = u64_to_user_ptr(ctx.dpb_frames_ptr);
 
-	if (copy_from_user(frames, frames_user,
-			   ctx.dpb_frames_nb * sizeof(*frames))) {
-		ret = -EFAULT;
-		goto free_dpb_frames;
-	}
-
 	cstride = ALIGN(ctx.pic_width_in_mbs * 8, 16);
 	csize = cstride * ctx.pic_height_in_mbs * 8;
 	lsize = macroblocks_nb * 256;
 
 	for (i = 0; i < ctx.dpb_frames_nb; i++) {
-		ret = tegra_vde_validate_frame(dev, &frames[i]);
+		struct tegra_vde_h264_frame frame;
+
+		if (copy_from_user(&frame, &frames_user[i], sizeof(frame))) {
+			ret = -EFAULT;
+			goto release_dpb_frames;
+		}
+
+		ret = tegra_vde_validate_frame(dev, &frame);
 		if (ret)
 			goto release_dpb_frames;
 
-		dpb_frames[i].flags = frames[i].flags;
-		dpb_frames[i].frame_num = frames[i].frame_num;
+		dpb_frames[i].flags = frame.flags;
+		dpb_frames[i].frame_num = frame.frame_num;
+		dpb_frames[i].modifier = frame.modifier;
 
 		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
 
 		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
-							&frames[i], dma_dir,
+							&frame, dma_dir,
 							ctx.baseline_profile,
 							lsize, csize);
 		if (ret)
@@ -985,7 +1066,6 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 						ctx.baseline_profile);
 	}
 
-free_dpb_frames:
 	kfree(dpb_frames);
 
 release_bitstream_dmabuf:
@@ -1328,24 +1408,28 @@ static const struct tegra_vde_soc tegra20_vde_soc = {
 	.num_ref_pics = 16,
 	.supports_ref_pic_marking = false,
 	.supports_interlacing = false,
+	.supports_block_linear = false,
 };
 
 static const struct tegra_vde_soc tegra30_vde_soc = {
 	.num_ref_pics = 32,
 	.supports_ref_pic_marking = false,
 	.supports_interlacing = false,
+	.supports_block_linear = false,
 };
 
 static const struct tegra_vde_soc tegra114_vde_soc = {
 	.num_ref_pics = 32,
 	.supports_ref_pic_marking = true,
 	.supports_interlacing = false,
+	.supports_block_linear = false,
 };
 
 static const struct tegra_vde_soc tegra124_vde_soc = {
 	.num_ref_pics = 32,
 	.supports_ref_pic_marking = true,
 	.supports_interlacing = true,
+	.supports_block_linear = true,
 };
 
 static const struct of_device_id tegra_vde_of_match[] = {
diff --git a/drivers/staging/media/tegra-vde/uapi.h b/drivers/staging/media/tegra-vde/uapi.h
index 58bfd56de55e..6cd730dda61c 100644
--- a/drivers/staging/media/tegra-vde/uapi.h
+++ b/drivers/staging/media/tegra-vde/uapi.h
@@ -27,8 +27,9 @@ struct tegra_vde_h264_frame {
 	__u32 aux_offset;
 	__u32 frame_num;
 	__u32 flags;
+	__u64 modifier;
 
-	__u32 reserved;
+	__u32 reserved[4];
 } __attribute__((packed));
 
 struct tegra_vde_h264_decoder_ctx {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread
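
The switch in tegra_vde_get_block_height() in PATCH 04/14 maps the ONE_GOB through THIRTYTWO_GOB modifiers to the values 0 through 5, which is the base-2 logarithm of the number of GOBs per block. A minimal standalone sketch of that relationship follows. It is an illustration only: gobs_to_block_height() and vdma_layout_value() are hypothetical helpers that take a plain GOB count instead of a DRM modifier, so that the example does not depend on <drm/drm_fourcc.h>.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns log2 of the GOB count for the six
 * supported block sizes (1, 2, 4, 8, 16, 32), or -1 for anything
 * else. The driver returns -EINVAL for unsupported modifiers.
 */
static int gobs_to_block_height(unsigned int gobs)
{
	int height = 0;

	/* Only powers of two between 1 and 32 are accepted. */
	if (gobs == 0 || gobs > 32 || (gobs & (gobs - 1)) != 0)
		return -1;

	while (gobs > 1) {
		gobs >>= 1;
		height++;
	}

	return height;
}

/* Value the patch writes to vdma + 0x00: the block height at bit 10. */
static uint32_t vdma_layout_value(unsigned int block_height)
{
	return (uint32_t)block_height << 10;
}
```

The explicit switch in the patch has the advantage of rejecting any modifier that is not one of the six listed ones, without having to decode the modifier bits.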

* [PATCH 05/14] staging: media: tegra-vde: Properly mark invalid entries
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Entries in the reference picture list are marked as invalid by setting
the frame ID to 0x3f.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 275884e745df..0ce30c7ccb75 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -296,7 +296,7 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 				(frame->flags & FLAG_B_FRAME));
 		} else {
 			aux_addr = 0x6ADEAD00;
-			value = 0;
+			value = 0x3f;
 		}
 
 		tegra_vde_setup_iram_entry(vde, num_ref_pics, 0, i, value,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 06/14] staging: media: tegra-vde: Print out invalid FD
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Include the invalid file descriptor in the error message to help
diagnose why importing the buffer failed.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 0ce30c7ccb75..0adc603fa437 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -643,7 +643,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
 
 	dmabuf = dma_buf_get(fd);
 	if (IS_ERR(dmabuf)) {
-		dev_err(dev, "Invalid dmabuf FD\n");
+		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
 		return PTR_ERR(dmabuf);
 	}
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 07/14] staging: media: tegra-vde: Add some clarifying comments
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Add some comments specifying which tables are being set up in IRAM.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 0adc603fa437..41cf86dc5dbd 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -271,6 +271,7 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 	unsigned int i, k;
 	size_t size;
 
+	/* clear H264RefPicList */
 	size = num_ref_pics * 4 * 8;
 	memset(vde->iram, 0, size);
 
@@ -453,6 +454,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	VDE_WR(0x00000000, vde->bsev + 0x98);
 	VDE_WR(0x00000060, vde->bsev + 0x9C);
 
+	/* clear H264MB2SliceGroupMap, assuming no FMO */
 	memset(vde->iram + 1024, 0, macroblocks_nb / 2);
 
 	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
@@ -480,6 +482,8 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	if (err)
 		return err;
 
+	/* upload H264MB2SliceGroupMap */
+	/* XXX don't hardcode map size? */
 	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
@@ -492,6 +496,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	if (err)
 		return err;
 
+	/* clear H264MBInfo XXX don't hardcode size */
 	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);
 	if (err)
@@ -499,6 +504,16 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 
 	size = num_ref_pics * 4 * 8;
 
+	/* clear H264RefPicList */
+	/*
+	value = (0x21 << 26) | (((size >> 2) & 0x1fff) << 12) | 0xE34;
+
+	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
+	if (err)
+		return err;
+	*/
+
+	/* upload H264RefPicList */
 	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
@@ -584,7 +599,11 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 
 	tegra_vde_mbe_set_0xa_reg(vde, 0, 0x000009FC);
 	tegra_vde_mbe_set_0xa_reg(vde, 2, 0x61DEAD00);
+#if 0
+	tegra_vde_mbe_set_0xa_reg(vde, 4, dpb_frames[0].aux_addr); /* 0x62DEAD00 */
+#else
 	tegra_vde_mbe_set_0xa_reg(vde, 4, 0x62DEAD00);
+#endif
 	tegra_vde_mbe_set_0xa_reg(vde, 6, 0x63DEAD00);
 	tegra_vde_mbe_set_0xa_reg(vde, 8, dpb_frames[0].aux_addr);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread
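
The "clear H264MBInfo" hunk in PATCH 07/14 computes a command word into value but still pushes the literal 0x840F054C. The two agree, as the following standalone sketch shows. The helper name bsev_clear_cmd() and the parameter names count and offset are assumptions made for illustration; the patch only gives the shifts and masks, not the meaning of the fields.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the expression in the patch:
 * 0x21 in the top six bits, a 13-bit field at bit 12 and a
 * 12-bit field in the low bits.
 */
static uint32_t bsev_clear_cmd(uint32_t count, uint32_t offset)
{
	return (UINT32_C(0x21) << 26) |
	       ((count & 0x1fff) << 12) |
	       (offset & 0xfff);
}
```

bsev_clear_cmd(240, 0x54c) evaluates to 0x840F054C, so pushing value instead of the literal would not change behaviour. The commented-out H264RefPicList clear uses the same encoding with (size >> 2) and 0xE34; for size = 512 that gives 0x84080E34.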

* [PATCH 07/14] staging: media: tegra-vde: Add some clarifying comments
@ 2018-08-13 14:50   ` Thierry Reding
  0 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: Greg Kroah-Hartman, Dmitry Osipenko, Jonathan Hunter,
	linux-media, linux-tegra, devel

From: Thierry Reding <treding@nvidia.com>

Add some comments specifying which tables are being set up in IRAM.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 0adc603fa437..41cf86dc5dbd 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -271,6 +271,7 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
 	unsigned int i, k;
 	size_t size;
 
+	/* clear H256RefPicList */
 	size = num_ref_pics * 4 * 8;
 	memset(vde->iram, 0, size);
 
@@ -453,6 +454,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	VDE_WR(0x00000000, vde->bsev + 0x98);
 	VDE_WR(0x00000060, vde->bsev + 0x9C);
 
+	/* clear H264MB2SliceGroupMap, assuming no FMO */
 	memset(vde->iram + 1024, 0, macroblocks_nb / 2);
 
 	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
@@ -480,6 +482,8 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	if (err)
 		return err;
 
+	/* upload H264MB2SliceGroupMap */
+	/* XXX don't hardcode map size? */
 	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
@@ -492,6 +496,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 	if (err)
 		return err;
 
+	/* clear H264MBInfo XXX don't hardcode size */
 	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);
 	if (err)
@@ -499,6 +504,16 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 
 	size = num_ref_pics * 4 * 8;
 
+	/* clear H264RefPicList */
+	/*
+	value = (0x21 << 26) | (((size >> 2) & 0x1fff) << 12) | 0xE34;
+
+	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
+	if (err)
+		return err;
+	*/
+
+	/* upload H264RefPicList */
 	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
 	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
 	if (err)
@@ -584,7 +599,11 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
 
 	tegra_vde_mbe_set_0xa_reg(vde, 0, 0x000009FC);
 	tegra_vde_mbe_set_0xa_reg(vde, 2, 0x61DEAD00);
+#if 0
+	tegra_vde_mbe_set_0xa_reg(vde, 4, dpb_frames[0].aux_addr); /* 0x62DEAD00 */
+#else
 	tegra_vde_mbe_set_0xa_reg(vde, 4, 0x62DEAD00);
+#endif
 	tegra_vde_mbe_set_0xa_reg(vde, 6, 0x63DEAD00);
 	tegra_vde_mbe_set_0xa_reg(vde, 8, dpb_frames[0].aux_addr);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 08/14] staging: media: tegra-vde: Track struct device *
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

The pointer to the struct device is frequently used, so store it in
struct tegra_vde. Also, pass around a pointer to a struct tegra_vde
instead of struct device in some cases to prepare for subsequent
patches referencing additional data from that structure.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
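Note: the pattern described in the commit message can be sketched in
plain userspace C as follows. This is only an illustration under stated
assumptions: struct device here is a stand-in for the kernel type,
struct vde_ctx stands in for struct tegra_vde, and vde_check_fd() is a
hypothetical helper that does not exist in the driver.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the kernel's struct device; illustrative only. */
struct device {
	const char *name;
};

/* Stand-in for struct tegra_vde with only the fields this sketch needs. */
struct vde_ctx {
	struct device *dev;	/* stored once, at probe time */
	unsigned int errors;	/* example of additional per-driver state */
};

/*
 * Hypothetical helper. Because it takes the driver context rather than
 * a bare struct device, it can log through vde->dev and also reach any
 * other driver state without a signature change later on.
 */
static int vde_check_fd(struct vde_ctx *vde, int fd)
{
	assert(vde && vde->dev);

	if (fd < 0) {
		fprintf(stderr, "%s: Invalid dmabuf FD: %d\n",
			vde->dev->name, fd);
		vde->errors++;
		return -1;
	}

	return 0;
}
```

The design choice is the usual one: once helpers take the context
pointer, later patches can add fields to the structure and use them
without touching every caller again.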
 drivers/staging/media/tegra-vde/tegra-vde.c | 63 ++++++++++++---------
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 41cf86dc5dbd..2496a03fd158 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -71,6 +71,7 @@ struct tegra_vde_soc {
 };
 
 struct tegra_vde {
+	struct device *dev;
 	const struct tegra_vde_soc *soc;
 	void __iomem *sxe;
 	void __iomem *bsev;
@@ -644,7 +645,7 @@ static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
 	dma_buf_put(dmabuf);
 }
 
-static int tegra_vde_attach_dmabuf(struct device *dev,
+static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
 				   int fd,
 				   unsigned long offset,
 				   size_t min_size,
@@ -662,38 +663,40 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
 
 	dmabuf = dma_buf_get(fd);
 	if (IS_ERR(dmabuf)) {
-		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
+		dev_err(vde->dev, "Invalid dmabuf FD: %d\n", fd);
 		return PTR_ERR(dmabuf);
 	}
 
 	if (dmabuf->size & (align_size - 1)) {
-		dev_err(dev, "Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
+		dev_err(vde->dev,
+			"Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
 			dmabuf->size, align_size);
 		return -EINVAL;
 	}
 
 	if ((u64)offset + min_size > dmabuf->size) {
-		dev_err(dev, "Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
+		dev_err(vde->dev,
+			"Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
 			dmabuf->size, offset, min_size);
 		return -EINVAL;
 	}
 
-	attachment = dma_buf_attach(dmabuf, dev);
+	attachment = dma_buf_attach(dmabuf, vde->dev);
 	if (IS_ERR(attachment)) {
-		dev_err(dev, "Failed to attach dmabuf\n");
+		dev_err(vde->dev, "Failed to attach dmabuf\n");
 		err = PTR_ERR(attachment);
 		goto err_put;
 	}
 
 	sgt = dma_buf_map_attachment(attachment, dma_dir);
 	if (IS_ERR(sgt)) {
-		dev_err(dev, "Failed to get dmabufs sg_table\n");
+		dev_err(vde->dev, "Failed to get dmabufs sg_table\n");
 		err = PTR_ERR(sgt);
 		goto err_detach;
 	}
 
 	if (sgt->nents != 1) {
-		dev_err(dev, "Sparse DMA region is unsupported\n");
+		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
 		err = -EINVAL;
 		goto err_unmap;
 	}
@@ -717,7 +720,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
 	return err;
 }
 
-static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
+static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 					     struct video_frame *frame,
 					     struct tegra_vde_h264_frame *src,
 					     enum dma_data_direction dma_dir,
@@ -726,7 +729,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
 {
 	int err;
 
-	err = tegra_vde_attach_dmabuf(dev, src->y_fd,
+	err = tegra_vde_attach_dmabuf(vde, src->y_fd,
 				      src->y_offset, lsize, SZ_256,
 				      &frame->y_dmabuf_attachment,
 				      &frame->y_addr,
@@ -735,7 +738,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
 	if (err)
 		return err;
 
-	err = tegra_vde_attach_dmabuf(dev, src->cb_fd,
+	err = tegra_vde_attach_dmabuf(vde, src->cb_fd,
 				      src->cb_offset, csize, SZ_256,
 				      &frame->cb_dmabuf_attachment,
 				      &frame->cb_addr,
@@ -744,7 +747,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
 	if (err)
 		goto err_release_y;
 
-	err = tegra_vde_attach_dmabuf(dev, src->cr_fd,
+	err = tegra_vde_attach_dmabuf(vde, src->cr_fd,
 				      src->cr_offset, csize, SZ_256,
 				      &frame->cr_dmabuf_attachment,
 				      &frame->cr_addr,
@@ -758,7 +761,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
 		return 0;
 	}
 
-	err = tegra_vde_attach_dmabuf(dev, src->aux_fd,
+	err = tegra_vde_attach_dmabuf(vde, src->aux_fd,
 				      src->aux_offset, csize, SZ_256,
 				      &frame->aux_dmabuf_attachment,
 				      &frame->aux_addr,
@@ -770,33 +773,35 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
 	return 0;
 
 err_release_cr:
-	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
 					frame->cr_sgt, dma_dir);
 err_release_cb:
-	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
 					frame->cb_sgt, dma_dir);
 err_release_y:
-	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
 					frame->y_sgt, dma_dir);
 
 	return err;
 }
 
-static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
+static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
+					    struct video_frame *frame,
 					    enum dma_data_direction dma_dir,
 					    bool baseline_profile)
 {
 	if (!baseline_profile)
-		tegra_vde_detach_and_put_dmabuf(frame->aux_dmabuf_attachment,
+		tegra_vde_detach_and_put_dmabuf(vde,
+						frame->aux_dmabuf_attachment,
 						frame->aux_sgt, dma_dir);
 
-	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
 					frame->cr_sgt, dma_dir);
 
-	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
 					frame->cb_sgt, dma_dir);
 
-	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
 					frame->y_sgt, dma_dir);
 }
 
@@ -937,7 +942,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	if (ret)
 		return ret;
 
-	ret = tegra_vde_attach_dmabuf(dev, ctx.bitstream_data_fd,
+	ret = tegra_vde_attach_dmabuf(vde, ctx.bitstream_data_fd,
 				      ctx.bitstream_data_offset,
 				      SZ_16K, SZ_16K,
 				      &bitstream_data_dmabuf_attachment,
@@ -949,7 +954,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 		return ret;
 
 	if (vde->soc->supports_ref_pic_marking) {
-		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
+		ret = tegra_vde_attach_dmabuf(vde, ctx.secure_fd,
 					      ctx.secure_offset, 0, SZ_256,
 					      &secure_attachment,
 					      &secure_addr,
@@ -992,7 +997,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 
 		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
 
-		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
+		ret = tegra_vde_attach_dmabufs_to_frame(vde, &dpb_frames[i],
 							&frame, dma_dir,
 							ctx.baseline_profile,
 							lsize, csize);
@@ -1081,7 +1086,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	while (i--) {
 		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
 
-		tegra_vde_release_frame_dmabufs(&dpb_frames[i], dma_dir,
+		tegra_vde_release_frame_dmabufs(vde, &dpb_frames[i], dma_dir,
 						ctx.baseline_profile);
 	}
 
@@ -1089,10 +1094,12 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 
 release_bitstream_dmabuf:
 	if (secure_attachment)
-		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
+		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
+						secure_sgt,
 						DMA_TO_DEVICE);
 
-	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
+	tegra_vde_detach_and_put_dmabuf(vde,
+					bitstream_data_dmabuf_attachment,
 					bitstream_sgt, DMA_TO_DEVICE);
 
 	return ret;
@@ -1190,6 +1197,8 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	if (!vde)
 		return -ENOMEM;
 
+	vde->dev = &pdev->dev;
+
 	platform_set_drvdata(pdev, vde);
 
 	vde->soc = of_device_get_match_data(&pdev->dev);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Implement support for using an IOMMU to map physically discontiguous
buffers into contiguous I/O virtual mappings that the VDE can use. This
allows importing arbitrary DMA-BUFs for use by the VDE.

While at it, make sure that the device is detached from any DMA/IOMMU
mapping that it might have automatically been attached to at boot. If
using the IOMMU API explicitly, detaching from any existing mapping is
required to avoid double mapping of buffers.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
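Note: the idea in the commit message, mapping physically discontiguous
chunks back to back into one contiguous I/O virtual range, can be
modelled in userspace C as below. This is a toy model, not the kernel
IOMMU API: struct chunk, struct toy_iommu, toy_map_sg() and
toy_translate() are all hypothetical names, and the single-level table
is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12u
#define TOY_PAGE_SIZE	(1u << TOY_PAGE_SHIFT)
#define TOY_TABLE_PAGES	16u

/* One physically contiguous chunk, similar in spirit to a scatterlist entry. */
struct chunk {
	uint32_t phys;	/* page-aligned physical address */
	uint32_t len;	/* length in bytes, a multiple of the page size */
};

/* Toy single-level page table, indexed by I/O virtual page number. */
struct toy_iommu {
	uint32_t pte[TOY_TABLE_PAGES];
};

/*
 * Map all chunks back to back starting at iova. Returns the number of
 * bytes mapped, or 0 if the range does not fit into the toy table (in
 * which case entries written so far are left behind, unlike a real
 * implementation, which would unwind).
 */
static size_t toy_map_sg(struct toy_iommu *mmu, uint32_t iova,
			 const struct chunk *sg, size_t nents)
{
	size_t mapped = 0;

	for (size_t i = 0; i < nents; i++) {
		assert(sg[i].phys % TOY_PAGE_SIZE == 0);

		for (uint32_t off = 0; off < sg[i].len; off += TOY_PAGE_SIZE) {
			uint32_t pfn = (uint32_t)((iova + mapped) >> TOY_PAGE_SHIFT);

			if (pfn >= TOY_TABLE_PAGES)
				return 0;

			mmu->pte[pfn] = sg[i].phys + off;
			mapped += TOY_PAGE_SIZE;
		}
	}

	return mapped;
}

/* Translate a device-visible address into a physical address. */
static uint32_t toy_translate(const struct toy_iommu *mmu, uint32_t iova)
{
	return mmu->pte[iova >> TOY_PAGE_SHIFT] + (iova & (TOY_PAGE_SIZE - 1));
}
```

The device only ever sees the contiguous range starting at iova, which
is why the single-entry restriction on the sg_table can be dropped when
a domain is present.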
 drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
 1 file changed, 153 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 2496a03fd158..3bc0bfcfe34e 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -13,7 +13,9 @@
 #include <linux/dma-buf.h>
 #include <linux/genalloc.h>
 #include <linux/interrupt.h>
+#include <linux/iommu.h>
 #include <linux/iopoll.h>
+#include <linux/iova.h>
 #include <linux/miscdevice.h>
 #include <linux/module.h>
 #include <linux/of_device.h>
@@ -22,6 +24,10 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#endif
+
 #include <soc/tegra/pmc.h>
 
 #include <drm/drm_fourcc.h>
@@ -61,6 +67,11 @@ struct video_frame {
 	u32 frame_num;
 	u32 flags;
 	u64 modifier;
+
+	struct iova *y_iova;
+	struct iova *cb_iova;
+	struct iova *cr_iova;
+	struct iova *aux_iova;
 };
 
 struct tegra_vde_soc {
@@ -93,6 +104,12 @@ struct tegra_vde {
 	struct clk *clk_bsev;
 	dma_addr_t iram_lists_addr;
 	u32 *iram;
+
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+	struct iova_domain iova;
+	unsigned long limit;
+	unsigned int shift;
 };
 
 static void tegra_vde_set_bits(struct tegra_vde *vde,
@@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
 	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
 }
 
-static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
+static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
+					    struct dma_buf_attachment *a,
 					    struct sg_table *sgt,
+					    struct iova *iova,
 					    enum dma_data_direction dma_dir)
 {
 	struct dma_buf *dmabuf = a->dmabuf;
 
+	if (vde->domain) {
+		unsigned long size = iova_size(iova) << vde->shift;
+		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
+
+		iommu_unmap(vde->domain, addr, size);
+		__free_iova(&vde->iova, iova);
+	}
+
 	dma_buf_unmap_attachment(a, sgt, dma_dir);
 	dma_buf_detach(dmabuf, a);
 	dma_buf_put(dmabuf);
@@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
 				   size_t min_size,
 				   size_t align_size,
 				   struct dma_buf_attachment **a,
-				   dma_addr_t *addr,
+				   dma_addr_t *addrp,
 				   struct sg_table **s,
-				   size_t *size,
+				   struct iova **iovap,
+				   size_t *sizep,
 				   enum dma_data_direction dma_dir)
 {
 	struct dma_buf_attachment *attachment;
 	struct dma_buf *dmabuf;
 	struct sg_table *sgt;
+	size_t size;
 	int err;
 
 	dmabuf = dma_buf_get(fd);
@@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
 		goto err_detach;
 	}
 
-	if (sgt->nents != 1) {
+	if (sgt->nents > 1 && !vde->domain) {
 		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
 		err = -EINVAL;
 		goto err_unmap;
 	}
 
-	*addr = sg_dma_address(sgt->sgl) + offset;
+	if (vde->domain) {
+		int prot = IOMMU_READ | IOMMU_WRITE;
+		struct iova *iova;
+		dma_addr_t addr;
+
+		size = (dmabuf->size - offset) >> vde->shift;
+
+		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
+		if (!iova) {
+			err = -ENOMEM;
+			goto err_unmap;
+		}
+
+		addr = iova_dma_addr(&vde->iova, iova);
+
+		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
+				    prot);
+		if (!size) {
+			__free_iova(&vde->iova, iova);
+			err = -ENXIO;
+			goto err_unmap;
+		}
+
+		*addrp = addr;
+		*iovap = iova;
+	} else {
+		*addrp = sg_dma_address(sgt->sgl) + offset;
+		size = dmabuf->size - offset;
+	}
+
 	*a = attachment;
 	*s = sgt;
 
-	if (size)
-		*size = dmabuf->size - offset;
+	if (sizep)
+		*sizep = size;
 
 	return 0;
 
@@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->y_dmabuf_attachment,
 				      &frame->y_addr,
 				      &frame->y_sgt,
+				      &frame->y_iova,
 				      NULL, dma_dir);
 	if (err)
 		return err;
@@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->cb_dmabuf_attachment,
 				      &frame->cb_addr,
 				      &frame->cb_sgt,
+				      &frame->cb_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_y;
@@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->cr_dmabuf_attachment,
 				      &frame->cr_addr,
 				      &frame->cr_sgt,
+				      &frame->cr_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_cb;
@@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->aux_dmabuf_attachment,
 				      &frame->aux_addr,
 				      &frame->aux_sgt,
+				      &frame->aux_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_cr;
@@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 
 err_release_cr:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
-					frame->cr_sgt, dma_dir);
+					frame->cr_sgt, frame->cr_iova,
+					dma_dir);
 err_release_cb:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
-					frame->cb_sgt, dma_dir);
+					frame->cb_sgt, frame->cb_iova,
+					dma_dir);
 err_release_y:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
-					frame->y_sgt, dma_dir);
+					frame->y_sgt, frame->y_iova,
+					dma_dir);
 
 	return err;
 }
@@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
 	if (!baseline_profile)
 		tegra_vde_detach_and_put_dmabuf(vde,
 						frame->aux_dmabuf_attachment,
-						frame->aux_sgt, dma_dir);
+						frame->aux_sgt,
+						frame->aux_iova, dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
-					frame->cr_sgt, dma_dir);
+					frame->cr_sgt, frame->cr_iova,
+					dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
-					frame->cb_sgt, dma_dir);
+					frame->cb_sgt, frame->cb_iova,
+					dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
-					frame->y_sgt, dma_dir);
+					frame->y_sgt, frame->y_iova,
+					dma_dir);
 }
 
 static int tegra_vde_validate_frame(struct device *dev,
@@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	struct sg_table *bitstream_sgt, *secure_sgt;
 	enum dma_data_direction dma_dir;
 	dma_addr_t bitstream_data_addr;
+	struct iova *bitstream_iova;
+	struct iova *secure_iova;
 	dma_addr_t secure_addr;
 	dma_addr_t bsev_ptr;
 	size_t lsize, csize;
@@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 				      &bitstream_data_dmabuf_attachment,
 				      &bitstream_data_addr,
 				      &bitstream_sgt,
+				      &bitstream_iova,
 				      &bitstream_data_size,
 				      DMA_TO_DEVICE);
 	if (ret)
@@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 					      &secure_attachment,
 					      &secure_addr,
 					      &secure_sgt,
+					      &secure_iova,
 					      &secure_size,
 					      DMA_TO_DEVICE);
 		if (ret)
@@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 release_bitstream_dmabuf:
 	if (secure_attachment)
 		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
-						secure_sgt,
+						secure_sgt, secure_iova,
 						DMA_TO_DEVICE);
 
 	tegra_vde_detach_and_put_dmabuf(vde,
 					bitstream_data_dmabuf_attachment,
-					bitstream_sgt, DMA_TO_DEVICE);
+					bitstream_sgt, bitstream_iova,
+					DMA_TO_DEVICE);
 
 	return ret;
 }
@@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	struct tegra_vde *vde;
 	int irq, err;
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+	if (dev->archdata.mapping) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+
+		arm_iommu_detach_device(dev);
+		arm_iommu_release_mapping(mapping);
+	}
+#endif
+
 	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
 	if (!vde)
 		return -ENOMEM;
@@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 
+	vde->group = iommu_group_get(dev);
+	if (vde->group) {
+		unsigned long order;
+
+		vde->domain = iommu_domain_alloc(&platform_bus_type);
+		if (!vde->domain) {
+			iommu_group_put(vde->group);
+			vde->group = NULL;
+		} else {
+			err = iova_cache_get();
+			if (err < 0)
+				goto free_domain;
+
+			order = __ffs(vde->domain->pgsize_bitmap);
+
+			init_iova_domain(&vde->iova, 1UL << order, 0);
+			vde->shift = iova_shift(&vde->iova);
+			vde->limit = 1 << (32 - vde->shift);
+
+			/*
+			 * VDE doesn't seem to like accessing the last page of
+			 * its 32-bit address space.
+			 */
+			vde->limit -= 1;
+
+			err = iommu_attach_group(vde->domain, vde->group);
+			if (err < 0)
+				goto put_cache;
+		}
+	}
+
 	mutex_init(&vde->lock);
 	init_completion(&vde->decode_completion);
 
@@ -1346,7 +1460,7 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	err = misc_register(&vde->miscdev);
 	if (err) {
 		dev_err(dev, "Failed to register misc device: %d\n", err);
-		goto err_gen_free;
+		goto detach;
 	}
 
 	pm_runtime_enable(dev);
@@ -1364,7 +1478,21 @@ static int tegra_vde_probe(struct platform_device *pdev)
 err_misc_unreg:
 	misc_deregister(&vde->miscdev);
 
-err_gen_free:
+detach:
+	if (vde->domain)
+		iommu_detach_group(vde->domain, vde->group);
+
+put_cache:
+	if (vde->domain)
+		iova_cache_put();
+
+free_domain:
+	if (vde->domain)
+		iommu_domain_free(vde->domain);
+
+	if (vde->group)
+		iommu_group_put(vde->group);
+
 	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
 		      gen_pool_size(vde->iram_pool));
 
@@ -1388,6 +1516,13 @@ static int tegra_vde_remove(struct platform_device *pdev)
 
 	misc_deregister(&vde->miscdev);
 
+	if (vde->domain) {
+		iommu_detach_group(vde->domain, vde->group);
+		iova_cache_put();
+		iommu_domain_free(vde->domain);
+		iommu_group_put(vde->group);
+	}
+
 	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
 		      gen_pool_size(vde->iram_pool));
 
-- 
2.17.0
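
Note on the limit computation in the probe hunk above: the arithmetic
can be checked in userspace C. The sketch mirrors the expressions from
the diff (limit = 1 << (32 - shift), then limit -= 1, then limit - 1 is
passed to alloc_iova()). The helper names vde_limit_pfn() and
last_byte() are hypothetical, and treating the allocator limit as the
highest allocatable page frame, inclusive, is an assumption of this
sketch rather than something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Page frame number passed to the allocator, following the diff:
 * limit = (1 << (32 - shift)) - 1, and the call site passes limit - 1.
 */
static uint32_t vde_limit_pfn(unsigned int shift)
{
	assert(shift > 0 && shift < 32);

	uint32_t limit = (UINT32_C(1) << (32 - shift)) - 1;

	return limit - 1;
}

/* Last byte address covered if limit_pfn is the highest usable page. */
static uint32_t last_byte(uint32_t limit_pfn, unsigned int shift)
{
	return (limit_pfn << shift) | ((UINT32_C(1) << shift) - 1);
}
```

For 4 KiB pages (shift of 12) this gives a limit of 0xFFFFE, so under
the assumption above the highest address the VDE would be handed is
0xFFFFEFFF and the final page of the 32-bit space stays unused.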

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
@ 2018-08-13 14:50   ` Thierry Reding
  0 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: Greg Kroah-Hartman, Dmitry Osipenko, Jonathan Hunter,
	linux-media, linux-tegra, devel

From: Thierry Reding <treding@nvidia.com>

Implement support for using an IOMMU to map physically discontiguous
buffers into contiguous I/O virtual mappings that the VDE can use. This
allows importing arbitrary DMA-BUFs for use by the VDE.

While at it, make sure that the device is detached from any DMA/IOMMU
mapping that it might have automatically been attached to at boot. If
using the IOMMU API explicitly, detaching from any existing mapping is
required to avoid double mapping of buffers.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
 1 file changed, 153 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 2496a03fd158..3bc0bfcfe34e 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -13,7 +13,9 @@
 #include <linux/dma-buf.h>
 #include <linux/genalloc.h>
 #include <linux/interrupt.h>
+#include <linux/iommu.h>
 #include <linux/iopoll.h>
+#include <linux/iova.h>
 #include <linux/miscdevice.h>
 #include <linux/module.h>
 #include <linux/of_device.h>
@@ -22,6 +24,10 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#endif
+
 #include <soc/tegra/pmc.h>
 
 #include <drm/drm_fourcc.h>
@@ -61,6 +67,11 @@ struct video_frame {
 	u32 frame_num;
 	u32 flags;
 	u64 modifier;
+
+	struct iova *y_iova;
+	struct iova *cb_iova;
+	struct iova *cr_iova;
+	struct iova *aux_iova;
 };
 
 struct tegra_vde_soc {
@@ -93,6 +104,12 @@ struct tegra_vde {
 	struct clk *clk_bsev;
 	dma_addr_t iram_lists_addr;
 	u32 *iram;
+
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+	struct iova_domain iova;
+	unsigned long limit;
+	unsigned int shift;
 };
 
 static void tegra_vde_set_bits(struct tegra_vde *vde,
@@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
 	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
 }
 
-static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
+static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
+					    struct dma_buf_attachment *a,
 					    struct sg_table *sgt,
+					    struct iova *iova,
 					    enum dma_data_direction dma_dir)
 {
 	struct dma_buf *dmabuf = a->dmabuf;
 
+	if (vde->domain) {
+		unsigned long size = iova_size(iova) << vde->shift;
+		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
+
+		iommu_unmap(vde->domain, addr, size);
+		__free_iova(&vde->iova, iova);
+	}
+
 	dma_buf_unmap_attachment(a, sgt, dma_dir);
 	dma_buf_detach(dmabuf, a);
 	dma_buf_put(dmabuf);
@@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
 				   size_t min_size,
 				   size_t align_size,
 				   struct dma_buf_attachment **a,
-				   dma_addr_t *addr,
+				   dma_addr_t *addrp,
 				   struct sg_table **s,
-				   size_t *size,
+				   struct iova **iovap,
+				   size_t *sizep,
 				   enum dma_data_direction dma_dir)
 {
 	struct dma_buf_attachment *attachment;
 	struct dma_buf *dmabuf;
 	struct sg_table *sgt;
+	size_t size;
 	int err;
 
 	dmabuf = dma_buf_get(fd);
@@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
 		goto err_detach;
 	}
 
-	if (sgt->nents != 1) {
+	if (sgt->nents > 1 && !vde->domain) {
 		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
 		err = -EINVAL;
 		goto err_unmap;
 	}
 
-	*addr = sg_dma_address(sgt->sgl) + offset;
+	if (vde->domain) {
+		int prot = IOMMU_READ | IOMMU_WRITE;
+		struct iova *iova;
+		dma_addr_t addr;
+
+		size = (dmabuf->size - offset) >> vde->shift;
+
+		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
+		if (!iova) {
+			err = -ENOMEM;
+			goto err_unmap;
+		}
+
+		addr = iova_dma_addr(&vde->iova, iova);
+
+		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
+				    prot);
+		if (!size) {
+			__free_iova(&vde->iova, iova);
+			err = -ENXIO;
+			goto err_unmap;
+		}
+
+		*addrp = addr;
+		*iovap = iova;
+	} else {
+		*addrp = sg_dma_address(sgt->sgl) + offset;
+		size = dmabuf->size - offset;
+	}
+
 	*a = attachment;
 	*s = sgt;
 
-	if (size)
-		*size = dmabuf->size - offset;
+	if (sizep)
+		*sizep = size;
 
 	return 0;
 
@@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->y_dmabuf_attachment,
 				      &frame->y_addr,
 				      &frame->y_sgt,
+				      &frame->y_iova,
 				      NULL, dma_dir);
 	if (err)
 		return err;
@@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->cb_dmabuf_attachment,
 				      &frame->cb_addr,
 				      &frame->cb_sgt,
+				      &frame->cb_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_y;
@@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->cr_dmabuf_attachment,
 				      &frame->cr_addr,
 				      &frame->cr_sgt,
+				      &frame->cr_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_cb;
@@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 				      &frame->aux_dmabuf_attachment,
 				      &frame->aux_addr,
 				      &frame->aux_sgt,
+				      &frame->aux_iova,
 				      NULL, dma_dir);
 	if (err)
 		goto err_release_cr;
@@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
 
 err_release_cr:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
-					frame->cr_sgt, dma_dir);
+					frame->cr_sgt, frame->cr_iova,
+					dma_dir);
 err_release_cb:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
-					frame->cb_sgt, dma_dir);
+					frame->cb_sgt, frame->cb_iova,
+					dma_dir);
 err_release_y:
 	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
-					frame->y_sgt, dma_dir);
+					frame->y_sgt, frame->y_iova,
+					dma_dir);
 
 	return err;
 }
@@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
 	if (!baseline_profile)
 		tegra_vde_detach_and_put_dmabuf(vde,
 						frame->aux_dmabuf_attachment,
-						frame->aux_sgt, dma_dir);
+						frame->aux_sgt,
+						frame->aux_iova, dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
-					frame->cr_sgt, dma_dir);
+					frame->cr_sgt, frame->cr_iova,
+					dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
-					frame->cb_sgt, dma_dir);
+					frame->cb_sgt, frame->cb_iova,
+					dma_dir);
 
 	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
-					frame->y_sgt, dma_dir);
+					frame->y_sgt, frame->y_iova,
+					dma_dir);
 }
 
 static int tegra_vde_validate_frame(struct device *dev,
@@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 	struct sg_table *bitstream_sgt, *secure_sgt;
 	enum dma_data_direction dma_dir;
 	dma_addr_t bitstream_data_addr;
+	struct iova *bitstream_iova;
+	struct iova *secure_iova;
 	dma_addr_t secure_addr;
 	dma_addr_t bsev_ptr;
 	size_t lsize, csize;
@@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 				      &bitstream_data_dmabuf_attachment,
 				      &bitstream_data_addr,
 				      &bitstream_sgt,
+				      &bitstream_iova,
 				      &bitstream_data_size,
 				      DMA_TO_DEVICE);
 	if (ret)
@@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 					      &secure_attachment,
 					      &secure_addr,
 					      &secure_sgt,
+					      &secure_iova,
 					      &secure_size,
 					      DMA_TO_DEVICE);
 		if (ret)
@@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 release_bitstream_dmabuf:
 	if (secure_attachment)
 		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
-						secure_sgt,
+						secure_sgt, secure_iova,
 						DMA_TO_DEVICE);
 
 	tegra_vde_detach_and_put_dmabuf(vde,
 					bitstream_data_dmabuf_attachment,
-					bitstream_sgt, DMA_TO_DEVICE);
+					bitstream_sgt, bitstream_iova,
+					DMA_TO_DEVICE);
 
 	return ret;
 }
@@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	struct tegra_vde *vde;
 	int irq, err;
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+	if (dev->archdata.mapping) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+
+		arm_iommu_detach_device(dev);
+		arm_iommu_release_mapping(mapping);
+	}
+#endif
+
 	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
 	if (!vde)
 		return -ENOMEM;
@@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 
+	vde->group = iommu_group_get(dev);
+	if (vde->group) {
+		unsigned long order;
+
+		vde->domain = iommu_domain_alloc(&platform_bus_type);
+		if (!vde->domain) {
+			iommu_group_put(vde->group);
+			vde->group = NULL;
+		} else {
+			err = iova_cache_get();
+			if (err < 0)
+				goto free_domain;
+
+			order = __ffs(vde->domain->pgsize_bitmap);
+
+			init_iova_domain(&vde->iova, 1UL << order, 0);
+			vde->shift = iova_shift(&vde->iova);
+			vde->limit = 1 << (32 - vde->shift);
+
+			/*
+			 * VDE doesn't seem to like accessing the last page of
+			 * its 32-bit address space.
+			 */
+			vde->limit -= 1;
+
+			err = iommu_attach_group(vde->domain, vde->group);
+			if (err < 0)
+				goto put_cache;
+		}
+	}
+
 	mutex_init(&vde->lock);
 	init_completion(&vde->decode_completion);
 
@@ -1346,7 +1460,7 @@ static int tegra_vde_probe(struct platform_device *pdev)
 	err = misc_register(&vde->miscdev);
 	if (err) {
 		dev_err(dev, "Failed to register misc device: %d\n", err);
-		goto err_gen_free;
+		goto detach;
 	}
 
 	pm_runtime_enable(dev);
@@ -1364,7 +1478,21 @@ static int tegra_vde_probe(struct platform_device *pdev)
 err_misc_unreg:
 	misc_deregister(&vde->miscdev);
 
-err_gen_free:
+detach:
+	if (vde->domain)
+		iommu_detach_group(vde->domain, vde->group);
+
+put_cache:
+	if (vde->domain)
+		iova_cache_put();
+
+free_domain:
+	if (vde->domain)
+		iommu_domain_free(vde->domain);
+
+	if (vde->group)
+		iommu_group_put(vde->group);
+
 	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
 		      gen_pool_size(vde->iram_pool));
 
@@ -1388,6 +1516,13 @@ static int tegra_vde_remove(struct platform_device *pdev)
 
 	misc_deregister(&vde->miscdev);
 
+	if (vde->domain) {
+		iommu_detach_group(vde->domain, vde->group);
+		iova_cache_put();
+		iommu_domain_free(vde->domain);
+		iommu_group_put(vde->group);
+	}
+
 	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
 		      gen_pool_size(vde->iram_pool));
 
-- 
2.17.0
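
The IOVA bookkeeping in the patch above is easier to follow with concrete numbers. What follows is a minimal userspace sketch, not driver code: the helper names are made up for illustration, and the 4 KiB page size used in the usage note is an assumption about the IOMMU's smallest supported page. It mirrors three steps from the patch: deriving the shift from the page-size bitmap, computing the page-frame limit that leaves out the last page of the 32-bit address space, and converting a byte size to a page count with a right shift (which rounds down).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of the lowest set bit, which is what __ffs() yields for a
 * non-zero bitmap. Returns 0 for an empty bitmap instead of looping.
 */
static unsigned int lowest_set_bit(unsigned long bitmap)
{
	unsigned int bit = 0;

	if (!bitmap)
		return 0;

	while (!(bitmap & 1UL)) {
		bitmap >>= 1;
		bit++;
	}

	return bit;
}

/*
 * Number of page frames in a 32-bit address space, minus one so that
 * the last page is never handed out (hypothetical helper).
 */
static unsigned long vde_limit_pages(unsigned int shift)
{
	unsigned long limit = 1UL << (32 - shift);

	return limit - 1;
}

/* Byte size to page count, rounding down as a plain right shift does. */
static unsigned long bytes_to_pages(uint64_t bytes, unsigned int shift)
{
	return (unsigned long)(bytes >> shift);
}

/* I/O virtual address of a page frame number. */
static uint64_t pfn_to_addr(unsigned long pfn, unsigned int shift)
{
	return (uint64_t)pfn << shift;
}
```

With a 4 KiB granule the shift is 12, the limit is 0xfffff pages, and the highest frame passed to the allocator (limit - 1) is 0xffffe, so the page at 0xfffff000 stays unused.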

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 10/14] staging: media: tegra-vde: Keep VDE in reset when unused
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

There is no point in keeping the VDE module out of reset when it is not
in use. Reset it on runtime suspend.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/staging/media/tegra-vde/tegra-vde.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index 3bc0bfcfe34e..4b3c6ab3c77e 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -1226,6 +1226,7 @@ static int tegra_vde_runtime_suspend(struct device *dev)
 	}
 
 	reset_control_assert(vde->rst_bsev);
+	reset_control_assert(vde->rst);
 
 	usleep_range(2000, 4000);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 11/14] ARM: tegra: Enable VDE on Tegra124
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra124.dtsi | 40 +++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index b113e47b2b2a..8fdca4723205 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -83,6 +83,19 @@
 		};
 	};
 
+	iram@40000000 {
+		compatible = "mmio-sram";
+		reg = <0x0 0x40000000 0x0 0x40000>;
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0 0x0 0x40000000 0x40000>;
+
+		vde_pool: pool@400 {
+			reg = <0x400 0x3fc00>;
+			pool;
+		};
+	};
+
 	host1x@50000000 {
 		compatible = "nvidia,tegra124-host1x", "simple-bus";
 		reg = <0x0 0x50000000 0x0 0x00034000>;
@@ -283,6 +296,33 @@
 		*/
 	};
 
+	vde@60030000 {
+		compatible = "nvidia,tegra124-vde", "nvidia,tegra30-vde",
+			     "nvidia,tegra20-vde";
+		reg = <0x0 0x60030000 0x0 0x1000   /* Syntax Engine */
+		       0x0 0x60031000 0x0 0x1000   /* Video Bitstream Engine */
+		       0x0 0x60032000 0x0 0x0100   /* Macroblock Engine */
+		       0x0 0x60032200 0x0 0x0100   /* Post-processing Engine */
+		       0x0 0x60032400 0x0 0x0100   /* Motion Compensation Engine */
+		       0x0 0x60032600 0x0 0x0100   /* Transform Engine */
+		       0x0 0x60032800 0x0 0x0100   /* Pixel prediction block */
+		       0x0 0x60032a00 0x0 0x0100   /* Video DMA */
+		       0x0 0x60033800 0x0 0x0400>; /* Video frame controls */
+		reg-names = "sxe", "bsev", "mbe", "ppe", "mce",
+			    "tfe", "ppb", "vdma", "frameid";
+		iram = <&vde_pool>; /* IRAM region */
+		interrupts = <GIC_SPI  9 IRQ_TYPE_LEVEL_HIGH>, /* Sync token interrupt */
+			     <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, /* BSE-V interrupt */
+			     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>; /* SXE interrupt */
+		interrupt-names = "sync-token", "bsev", "sxe";
+		clocks = <&tegra_car TEGRA124_CLK_VDE>,
+			 <&tegra_car TEGRA124_CLK_BSEV>;
+		clock-names = "vde", "bsev";
+		resets = <&tegra_car 61>,
+			 <&tegra_car 63>;
+		reset-names = "vde", "bsev";
+	};
+
 	apbdma: dma@60020000 {
 		compatible = "nvidia,tegra124-apbdma", "nvidia,tegra148-apbdma";
 		reg = <0x0 0x60020000 0x0 0x1400>;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 12/14] ARM: tegra: Add BSEV clock and reset for VDE on Tegra20
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra20.dtsi | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 15b73bd377f0..abb5738a0705 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -287,9 +287,13 @@
 			     <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, /* BSE-V interrupt */
 			     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>; /* SXE interrupt */
 		interrupt-names = "sync-token", "bsev", "sxe";
-		clocks = <&tegra_car TEGRA20_CLK_VDE>;
-		reset-names = "vde", "mc";
-		resets = <&tegra_car 61>, <&mc TEGRA20_MC_RESET_VDE>;
+		clocks = <&tegra_car TEGRA20_CLK_VDE>,
+			 <&tegra_car TEGRA20_CLK_BSEV>;
+		clock-names = "vde", "bsev";
+		resets = <&tegra_car 61>,
+			 <&tegra_car 63>,
+			 <&mc TEGRA20_MC_RESET_VDE>;
+		reset-names = "vde", "bsev", "mc";
 	};
 
 	apbmisc@70000800 {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 13/14] ARM: tegra: Add BSEV clock and reset for VDE on Tegra30
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra30.dtsi | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
index a6781f653310..492917d61bab 100644
--- a/arch/arm/boot/dts/tegra30.dtsi
+++ b/arch/arm/boot/dts/tegra30.dtsi
@@ -408,9 +408,13 @@
 			     <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, /* BSE-V interrupt */
 			     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>; /* SXE interrupt */
 		interrupt-names = "sync-token", "bsev", "sxe";
-		clocks = <&tegra_car TEGRA30_CLK_VDE>;
-		reset-names = "vde", "mc";
-		resets = <&tegra_car 61>, <&mc TEGRA30_MC_RESET_VDE>;
+		clocks = <&tegra_car TEGRA30_CLK_VDE>,
+			 <&tegra_car TEGRA30_CLK_BSEV>;
+		clock-names = "vde", "bsev";
+		resets = <&tegra_car 61>,
+			 <&tegra_car 63>,
+			 <&mc TEGRA30_MC_RESET_VDE>;
+		reset-names = "vde", "bsev", "mc";
 	};
 
 	apbmisc@70000800 {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 14/14] ARM: tegra: Enable SMMU for VDE on Tegra124
  2018-08-13 14:50 ` Thierry Reding
@ 2018-08-13 14:50   ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-13 14:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

From: Thierry Reding <treding@nvidia.com>

The video decode engine can use the SMMU to access buffers that are not
physically contiguous in memory. This allows better memory usage for
video decoding, since fragmentation may cause contiguous allocations
to fail.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra124.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 8fdca4723205..0713e0ed5fef 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -321,6 +321,8 @@
 		resets = <&tegra_car 61>,
 			 <&tegra_car 63>;
 		reset-names = "vde", "bsev";
+
+		iommus = <&mc TEGRA_SWGROUP_VDE>;
 	};
 
 	apbdma: dma@60020000 {
-- 
2.17.0
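
To illustrate what "not physically contiguous" means for the decoder, here is a rough userspace model of an IOMMU laying scattered physical segments back to back in one I/O virtual range. This is only a conceptual sketch under the assumption that each segment is already page-aligned; the types and the function are invented for this example and are not the kernel's scatterlist or IOMMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A physically contiguous chunk of a buffer. */
struct segment {
	uint64_t phys;
	uint64_t len;
};

/* One resulting translation: I/O virtual address to physical address. */
struct mapping {
	uint64_t iova;
	uint64_t phys;
	uint64_t len;
};

/*
 * Place "count" segments one after another starting at "iova_base".
 * Fills "out" with one entry per segment and returns the number of
 * bytes covered, which is the size the device sees as one region.
 */
static uint64_t map_segments(uint64_t iova_base, const struct segment *segs,
			     size_t count, struct mapping *out)
{
	uint64_t mapped = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		out[i].iova = iova_base + mapped;
		out[i].phys = segs[i].phys;
		out[i].len = segs[i].len;
		mapped += segs[i].len;
	}

	return mapped;
}
```

The device only ever sees the range starting at iova_base, so three segments scattered across memory look like a single 0x6000-byte buffer in the usage below.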

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-13 15:09     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-13 15:09 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:14 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The BSEV clock has a separate gate bit and can not be assumed to be
> always enabled. Add explicit handling for the BSEV clock and reset.
> 
> This fixes an issue on Tegra124 where the BSEV clock is not enabled
> by default and therefore accessing the BSEV registers will hang the
> CPU if the BSEV clock is not enabled and the reset not deasserted.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---

Are you sure that BSEV clock is really needed for T20/30? I've tried already 
to disable the clock explicitly and everything kept working, though I'll try 
again.

The device-tree changes should be reflected in the binding documentation.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset
  2018-08-13 15:09     ` Dmitry Osipenko
@ 2018-08-14 14:21       ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-08-14 14:21 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media


[-- Attachment #1.1: Type: text/plain, Size: 1181 bytes --]

On Mon, Aug 13, 2018 at 06:09:46PM +0300, Dmitry Osipenko wrote:
> On Monday, 13 August 2018 17:50:14 MSK Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > The BSEV clock has a separate gate bit and can not be assumed to be
> > always enabled. Add explicit handling for the BSEV clock and reset.
> > 
> > This fixes an issue on Tegra124 where the BSEV clock is not enabled
> > by default and therefore accessing the BSEV registers will hang the
> > CPU if the BSEV clock is not enabled and the reset not deasserted.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> 
> Are you sure that BSEV clock is really needed for T20/30? I've tried already 
> to disable the clock explicitly and everything kept working, though I'll try 
> again.

I think you're right that these aren't strictly required for VDE to work
on Tegra20 and Tegra30. However, the BSEV clock and reset do exist on
those platforms, so I didn't see a reason why they shouldn't be handled
uniformly across all generations.

> The device-tree changes should be reflected in the binding documentation.

Indeed, I forgot to update that.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset
  2018-08-14 14:21       ` Thierry Reding
@ 2018-08-14 15:05         ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-14 15:05 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Tuesday, 14 August 2018 17:21:24 MSK Thierry Reding wrote:
> On Mon, Aug 13, 2018 at 06:09:46PM +0300, Dmitry Osipenko wrote:
> > On Monday, 13 August 2018 17:50:14 MSK Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > > 
> > > The BSEV clock has a separate gate bit and can not be assumed to be
> > > always enabled. Add explicit handling for the BSEV clock and reset.
> > > 
> > > This fixes an issue on Tegra124 where the BSEV clock is not enabled
> > > by default and therefore accessing the BSEV registers will hang the
> > > CPU if the BSEV clock is not enabled and the reset not deasserted.
> > > 
> > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > ---
> > 
> > Are you sure that BSEV clock is really needed for T20/30? I've tried
> > already to disable the clock explicitly and everything kept working,
> > though I'll try again.
> 
> I think you're right that these aren't strictly required for VDE to work
> on Tegra20 and Tegra30. However, the BSEV clock and reset do exist on
> those platforms, so I didn't see a reason why they shouldn't be handled
> uniformly across all generations.

It's a bit messy to have an unused clock enabled.

I guess BSEV clock on T20/30 only enables the AES engine. If the decryption 
engine is integrated with the video decoder, then the clock and reset should 
be requested by the driver, but BSEV should be kept disabled if it's not used.

If the BSEV clock isn't powering anything related to VDE on T20/30, then let's 
make the BSEV clock and reset control optional. For the clock we could check 
whether err == -ENOENT and continue; later we may switch to 
devm_clk_get_optional() from the upcoming series [0]. For the reset there is 
devm_reset_control_get_optional().
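
A minimal userspace sketch of the "-ENOENT means optional" pattern described
above. It does not use the kernel clk API: fake_clk, fake_clk_get() and
get_optional_clk() are hypothetical stand-ins, and a real driver would look at
PTR_ERR() of the pointer returned by devm_clk_get() rather than an int return
value.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a clock handle; not the kernel's struct clk. */
struct fake_clk {
	const char *name;
};

/*
 * Hypothetical lookup: succeeds only if "name" is listed in "table",
 * mimicking a clock that is (or is not) described in the device tree.
 */
static int fake_clk_get(struct fake_clk *table, size_t count,
			const char *name, struct fake_clk **out)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(table[i].name, name) == 0) {
			*out = &table[i];
			return 0;
		}
	}

	return -ENOENT;
}

/* Treat a missing clock as "not present" instead of as a failure. */
static int get_optional_clk(struct fake_clk *table, size_t count,
			    const char *name, struct fake_clk **out)
{
	int err = fake_clk_get(table, count, name, out);

	if (err == -ENOENT) {
		*out = NULL;	/* optional clock absent: continue without it */
		return 0;
	}

	return err;
}
```

With a table that only lists "vde", asking for "bsev" returns 0 with a NULL
handle, so the caller can skip enabling it; any other error would still be
propagated.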

Please try to verify by all means that we can omit BSEV on T20/30. If you are 
not sure, then let's make them optional as we can always make them required 
later.

P.S. I'll test and review all the patches over the next few days.

[0] https://lkml.org/lkml/2018/7/18/460

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset
  2018-08-14 15:05         ` Dmitry Osipenko
@ 2018-08-14 15:16           ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-14 15:16 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Tuesday, 14 August 2018 18:05:51 MSK Dmitry Osipenko wrote:
> On Tuesday, 14 August 2018 17:21:24 MSK Thierry Reding wrote:
> > On Mon, Aug 13, 2018 at 06:09:46PM +0300, Dmitry Osipenko wrote:
> > > On Monday, 13 August 2018 17:50:14 MSK Thierry Reding wrote:
> > > > From: Thierry Reding <treding@nvidia.com>
> > > > 
> > > > The BSEV clock has a separate gate bit and can not be assumed to be
> > > > always enabled. Add explicit handling for the BSEV clock and reset.
> > > > 
> > > > This fixes an issue on Tegra124 where the BSEV clock is not enabled
> > > > by default and therefore accessing the BSEV registers will hang the
> > > > CPU if the BSEV clock is not enabled and the reset not deasserted.
> > > > 
> > > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > ---
> > > 
> > > Are you sure that BSEV clock is really needed for T20/30? I've tried
> > > already to disable the clock explicitly and everything kept working,
> > > though I'll try again.
> > 
> > I think you're right that these aren't strictly required for VDE to work
> > on Tegra20 and Tegra30. However, the BSEV clock and reset do exist on
> > those platforms, so I didn't see a reason why they shouldn't be handled
> > uniformly across all generations.
> 
> It's a bit messy to have an unused clock enabled.
> 
> I guess BSEV clock on T20/30 only enables the AES engine. If the decryption
> engine is integrated with the video decoder, then the clock and reset should
> be requested by the driver, but BSEV should be kept disabled if it's not
> used.

Though even if encryption is not directly integrated with video decoding, it 
still makes sense to define the clock and reset in the DT without the VDE 
driver using them, since the HW register space is shared. If somebody would 
like to implement an AES driver, it could be made a sub-device of VDE.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 11/14] ARM: tegra: Enable VDE on Tegra124
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:45     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:24 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  arch/arm/boot/dts/tegra124.dtsi | 40 +++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/tegra124.dtsi
> b/arch/arm/boot/dts/tegra124.dtsi index b113e47b2b2a..8fdca4723205 100644
> --- a/arch/arm/boot/dts/tegra124.dtsi
> +++ b/arch/arm/boot/dts/tegra124.dtsi
> @@ -83,6 +83,19 @@
>  		};
>  	};
> 
> +	iram@40000000 {
> +		compatible = "mmio-sram";
> +		reg = <0x0 0x40000000 0x0 0x40000>;
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0 0x0 0x40000000 0x40000>;
> +
> +		vde_pool: pool@400 {
> +			reg = <0x400 0x3fc00>;
> +			pool;
> +		};
> +	};
> +
>  	host1x@50000000 {
>  		compatible = "nvidia,tegra124-host1x", "simple-bus";
>  		reg = <0x0 0x50000000 0x0 0x00034000>;
> @@ -283,6 +296,33 @@
>  		*/
>  	};
> 
> +	vde@60030000 {
> +		compatible = "nvidia,tegra124-vde", "nvidia,tegra30-vde",
> +			     "nvidia,tegra20-vde";
> +		reg = <0x0 0x60030000 0x0 0x1000   /* Syntax Engine */
> +		       0x0 0x60031000 0x0 0x1000   /* Video Bitstream Engine */
> +		       0x0 0x60032000 0x0 0x0100   /* Macroblock Engine */
> +		       0x0 0x60032200 0x0 0x0100   /* Post-processing Engine */
> +		       0x0 0x60032400 0x0 0x0100   /* Motion Compensation Engine */
> +		       0x0 0x60032600 0x0 0x0100   /* Transform Engine */
> +		       0x0 0x60032800 0x0 0x0100   /* Pixel prediction block */
> +		       0x0 0x60032a00 0x0 0x0100   /* Video DMA */
> +		       0x0 0x60033800 0x0 0x0400>; /* Video frame controls */
> +		reg-names = "sxe", "bsev", "mbe", "ppe", "mce",
> +			    "tfe", "ppb", "vdma", "frameid";
> +		iram = <&vde_pool>; /* IRAM region */
> +		interrupts = <GIC_SPI  9 IRQ_TYPE_LEVEL_HIGH>, /* Sync token interrupt */
> +			     <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, /* BSE-V interrupt */
> +			     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>; /* SXE interrupt */
> +		interrupt-names = "sync-token", "bsev", "sxe";
> +		clocks = <&tegra_car TEGRA124_CLK_VDE>,
> +			 <&tegra_car TEGRA124_CLK_BSEV>;
> +		clock-names = "vde", "bsev";
> +		resets = <&tegra_car 61>,
> +			 <&tegra_car 63>;
> +		reset-names = "vde", "bsev";

Is the memory client reset missing here?

> +	};
> +
>  	apbdma: dma@60020000 {
>  		compatible = "nvidia,tegra124-apbdma", "nvidia,tegra148-apbdma";
>  		reg = <0x0 0x60020000 0x0 0x1400>;

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 05/14] staging: media: tegra-vde: Properly mark invalid entries
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:45     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:18 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Entries in the reference picture list are marked as invalid by setting
> the frame ID to 0x3f.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 275884e745df..0ce30c7ccb75 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -296,7 +296,7 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde
> *vde, (frame->flags & FLAG_B_FRAME));
>  		} else {
>  			aux_addr = 0x6ADEAD00;
> -			value = 0;
> +			value = 0x3f;
>  		}
> 
>  		tegra_vde_setup_iram_entry(vde, num_ref_pics, 0, i, value,

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 06/14] staging: media: tegra-vde: Print out invalid FD
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:45     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:19 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Include the invalid file descriptor when reporting an error message to
> help diagnosing why importing the buffer failed.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 0ce30c7ccb75..0adc603fa437 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -643,7 +643,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
> 
>  	dmabuf = dma_buf_get(fd);
>  	if (IS_ERR(dmabuf)) {
> -		dev_err(dev, "Invalid dmabuf FD\n");
> +		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
>  		return PTR_ERR(dmabuf);
>  	}

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 14/14] ARM: tegra: Enable SMMU for VDE on Tegra124
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:45     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:27 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The video decode engine can use the SMMU to use buffers that are not
> physically contiguous in memory. This allows better memory usage for
> video decoding, since fragmentation may cause contiguous allocations
> to fail.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  arch/arm/boot/dts/tegra124.dtsi | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/tegra124.dtsi
> b/arch/arm/boot/dts/tegra124.dtsi index 8fdca4723205..0713e0ed5fef 100644
> --- a/arch/arm/boot/dts/tegra124.dtsi
> +++ b/arch/arm/boot/dts/tegra124.dtsi
> @@ -321,6 +321,8 @@
>  		resets = <&tegra_car 61>,
>  			 <&tegra_car 63>;
>  		reset-names = "vde", "bsev";
> +
> +		iommus = <&mc TEGRA_SWGROUP_VDE>;
>  	};
> 
>  	apbdma: dma@60020000 {

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

The same should be applied to Tegra30.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/14] staging: media: tegra-vde: Prepare for interlacing support
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:48     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:48 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:16 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The number of frames doubles when decoding interlaced content and the
> structures describing the frames double in size. Take that into account
> to prepare for interlacing support.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 73 ++++++++++++++++-----
>  1 file changed, 58 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 3027b11b11ae..1a40f6dff7c8 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -61,7 +61,9 @@ struct video_frame {
>  };
> 
>  struct tegra_vde_soc {
> +	unsigned int num_ref_pics;
>  	bool supports_ref_pic_marking;
> +	bool supports_interlacing;
>  };
> 
>  struct tegra_vde {
> @@ -205,8 +207,12 @@ static void tegra_vde_setup_frameid(struct tegra_vde
> *vde, u32 cr_addr = frame ? frame->cr_addr : 0x6CDEAD00;
>  	u32 value1 = frame ? ((mbs_width << 16) | mbs_height) : 0;
>  	u32 value2 = frame ? ((((mbs_width + 1) >> 1) << 6) | 1) : 0;
> +	u32 value = y_addr >> 8;

Let's name it value0 for consistency.

> 
> -	VDE_WR(y_addr  >> 8, vde->frameid + 0x000 + frameid * 4);
> +	if (vde->soc->supports_interlacing)
> +		value |= BIT(31);
> +
> +	VDE_WR(value,        vde->frameid + 0x000 + frameid * 4);
>  	VDE_WR(cb_addr >> 8, vde->frameid + 0x100 + frameid * 4);
>  	VDE_WR(cr_addr >> 8, vde->frameid + 0x180 + frameid * 4);
>  	VDE_WR(value1,       vde->frameid + 0x080 + frameid * 4);
> @@ -229,20 +235,23 @@ static void tegra_setup_frameidx(struct tegra_vde
> *vde, }
> 
>  static void tegra_vde_setup_iram_entry(struct tegra_vde *vde,
> +				       unsigned int num_ref_pics,
>  				       unsigned int table,
>  				       unsigned int row,
>  				       u32 value1, u32 value2)
>  {
> +	unsigned int entries = num_ref_pics * 2;
>  	u32 *iram_tables = vde->iram;
> 
>  	dev_dbg(vde->miscdev.parent, "IRAM table %u: row %u: 0x%08X 0x%08X\n",
>  		table, row, value1, value2);
> 
> -	iram_tables[0x20 * table + row * 2] = value1;
> -	iram_tables[0x20 * table + row * 2 + 1] = value2;
> +	iram_tables[entries * table + row * 2] = value1;
> +	iram_tables[entries * table + row * 2 + 1] = value2;
>  }
> 
>  static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
> +					unsigned int num_ref_pics,
>  					struct video_frame *dpb_frames,
>  					unsigned int ref_frames_nb,
>  					unsigned int with_earlier_poc_nb)
> @@ -251,13 +260,17 @@ static void tegra_vde_setup_iram_tables(struct
> tegra_vde *vde, u32 value, aux_addr;
>  	int with_later_poc_nb;
>  	unsigned int i, k;
> +	size_t size;
> +
> +	size = num_ref_pics * 4 * 8;
> +	memset(vde->iram, 0, size);

Is this memset() really needed, or is it there just because you're 
uncomfortable leaving something uninitialized?
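
For reference, a small userspace sketch of the table layout as it reads from
the patch: four tables, num_ref_pics rows each, two 32-bit words per row, which
gives the num_ref_pics * 4 * 8 bytes passed to memset(). The helper names are
made up for illustration, and the sketch assumes the setup loop writes every
row of all four tables, in which case each word gets overwritten anyway.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TABLES	4u	/* tables 0..3, as written by the setup loop */
#define WORDS_PER_ROW	2u	/* value1 and value2 */

/* Word index of the first value of a row, mirroring the patched indexing. */
static size_t iram_entry_index(unsigned int num_ref_pics,
			       unsigned int table, unsigned int row)
{
	unsigned int entries = num_ref_pics * WORDS_PER_ROW;

	return (size_t)entries * table + (size_t)row * WORDS_PER_ROW;
}

/* Size in bytes of all tables, i.e. num_ref_pics * 4 * 8. */
static size_t iram_tables_size(unsigned int num_ref_pics)
{
	return (size_t)num_ref_pics * NUM_TABLES * WORDS_PER_ROW *
	       sizeof(uint32_t);
}

/* Count the distinct words touched when every row of every table is set. */
static size_t iram_words_written(unsigned int num_ref_pics)
{
	unsigned char seen[NUM_TABLES * WORDS_PER_ROW * 32] = { 0 };
	size_t count = 0;
	unsigned int table, row, word;

	if (num_ref_pics > 32)
		return 0;	/* sketch only handles up to 32 reference pictures */

	for (table = 0; table < NUM_TABLES; table++) {
		for (row = 0; row < num_ref_pics; row++) {
			for (word = 0; word < WORDS_PER_ROW; word++) {
				size_t idx = iram_entry_index(num_ref_pics,
							      table, row) + word;

				if (!seen[idx]) {
					seen[idx] = 1;
					count++;
				}
			}
		}
	}

	return count;
}
```

For 16 reference pictures the second table starts at word 0x20, matching the
constant that the patch replaces, and the words written cover exactly the
region that the memset() clears.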

> 
>  	dev_dbg(vde->miscdev.parent, "DPB: Frame 0: frame_num = %d\n",
>  		dpb_frames[0].frame_num);
> 
>  	dev_dbg(vde->miscdev.parent, "REF L0:\n");
> 
> -	for (i = 0; i < 16; i++) {
> +	for (i = 0; i < num_ref_pics; i++) {
>  		if (i < ref_frames_nb) {
>  			frame = &dpb_frames[i + 1];
> 
> @@ -277,10 +290,14 @@ static void tegra_vde_setup_iram_tables(struct
> tegra_vde *vde, value = 0;
>  		}
> 
> -		tegra_vde_setup_iram_entry(vde, 0, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 1, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 3, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 0, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 1, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 3, i, value,
> +					   aux_addr);
>  	}
> 
>  	if (!(dpb_frames[0].flags & FLAG_B_FRAME))
> @@ -309,7 +326,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde
> *vde, "\tFrame %d: frame_num = %d\n",
>  			k + 1, frame->frame_num);
> 
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
>  	}
> 
>  	for (k = 0; i < ref_frames_nb; i++, k++) {
> @@ -326,7 +344,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde
> *vde, "\tFrame %d: frame_num = %d\n",
>  			k + 1, frame->frame_num);
> 
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
>  	}
>  }
> 
> @@ -339,9 +358,20 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, unsigned int macroblocks_nb)
>  {
>  	struct device *dev = vde->miscdev.parent;
> +	unsigned int num_ref_pics = 16;
> +	/* XXX extend ABI to provide this */
> +	bool interlaced = false;
> +	size_t size;
>  	u32 value;
>  	int err;
> 
> +	if (vde->soc->supports_interlacing) {
> +		if (interlaced)
> +			num_ref_pics = vde->soc->num_ref_pics;
> +		else
> +			num_ref_pics = 16;
> +	}
> +
>  	tegra_vde_set_bits(vde, 0x000A, vde->sxe + 0xF0);
>  	tegra_vde_set_bits(vde, 0x000B, vde->bsev + CMDQUE_CONTROL);
>  	tegra_vde_set_bits(vde, 0x8002, vde->mbe + 0x50);
> @@ -369,12 +399,12 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, VDE_WR(0x00000000, vde->bsev + 0x98);
>  	VDE_WR(0x00000060, vde->bsev + 0x9C);
> 
> -	memset(vde->iram + 128, 0, macroblocks_nb / 2);
> +	memset(vde->iram + 1024, 0, macroblocks_nb / 2);

This is wrong and breaks everything: vde->iram is of type (u32 *), hence 
+1024 is equal to a byte offset of 4096, while further down in the code that 
offset is set to vde->iram_lists_addr + 1024. So that should be:

	memset(vde->iram + 256, 0, macroblocks_nb / 2);

Probably we should put that hardcoded offset into a slice_group_map_offset 
variable for clarity, like this:

	unsigned int slice_group_map_offset = 1024;
..
	memset(vde->iram + slice_group_map_offset / sizeof(u32), 0,
		     macroblocks_nb / 2);
..
	/*
	 * The reference picture list lies at the beginning of the IRAM
	 * allocation; the slice group map is placed after the list.
	 */
	value |= ((vde->iram_lists_addr + slice_group_map_offset) >> 2) & 0xffff;
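
To illustrate the pointer-arithmetic issue, here is a minimal userspace sketch.
clear_at_byte_offset() and byte_distance() are hypothetical helpers, and a plain
array stands in for the IRAM mapping. Adding N to a u32 pointer advances it by
N * 4 bytes, so a byte offset has to be divided by sizeof(u32) first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Distance in bytes between two pointers into the same u32 array. */
static size_t byte_distance(const uint32_t *base, const uint32_t *p)
{
	return (size_t)((const unsigned char *)p - (const unsigned char *)base);
}

/* Zero "len" bytes starting "byte_offset" bytes into a u32 array. */
static void clear_at_byte_offset(uint32_t *base, size_t byte_offset, size_t len)
{
	memset(base + byte_offset / sizeof(uint32_t), 0, len);
}
```

Here byte_distance(iram, iram + 1024) is 4096, while iram + 256 is the element
that sits 1024 bytes into the buffer.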

> 
>  	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
>  			     ctx->pic_width_in_mbs, ctx->pic_height_in_mbs);
> 
> -	tegra_vde_setup_iram_tables(vde, dpb_frames,
> +	tegra_vde_setup_iram_tables(vde, num_ref_pics, dpb_frames,
>  				    ctx->dpb_frames_nb - 1,
>  				    ctx->dpb_ref_frames_with_earlier_poc_nb);
> 
> @@ -396,22 +426,27 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, if (err)
>  		return err;
> 
> -	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x800003FC, false);
> +	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
> +	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
>  		return err;
> 
>  	value = 0x01500000;
> -	value |= ((vde->iram_lists_addr + 512) >> 2) & 0xFFFF;
> +	value |= ((vde->iram_lists_addr + 1024) >> 2) & 0xffff;
> 
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, true);
>  	if (err)
>  		return err;
> 
> +	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);

The value computed above is never used; this should be:

	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
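
As a cross-check of the arithmetic, a small userspace sketch that rebuilds the
command words from their fields. The field positions are only inferred from the
shifts and masks in the patch, and the helper names are hypothetical; nothing
here is taken from hardware documentation.

```c
#include <stdint.h>

/* Opcode in bits 31:26, as implied by the "<< 26" in the patch. */
static uint32_t bsev_cmd_opcode(uint32_t opcode)
{
	return (opcode & 0x3f) << 26;
}

/* Word of the form (0x21 << 26) | ((a & 0x1fff) << 12) | (b & 0xfff). */
static uint32_t bsev_cmd_0x21(uint32_t a, uint32_t b)
{
	return bsev_cmd_opcode(0x21) | ((a & 0x1fff) << 12) | (b & 0xfff);
}

/* Word of the form (0x20 << 26) | (flag << 25) | ((size >> 2) & 0x1fff). */
static uint32_t bsev_cmd_0x20(uint32_t flag, uint32_t size)
{
	return bsev_cmd_opcode(0x20) | ((flag & 1) << 25) |
	       ((size >> 2) & 0x1fff);
}
```

bsev_cmd_0x21(240, 0x54c) yields the hardcoded 0x840F054C, and
bsev_cmd_0x20(0, 16 * 4 * 8) yields the previous constant 0x80000080, so
passing the computed value does not change behaviour for 16 reference pictures.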

>  	if (err)
>  		return err;
> 
> -	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x80000080, false);
> +	size = num_ref_pics * 4 * 8;
> +
> +	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
> +	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
>  		return err;
> 
> @@ -1290,19 +1325,27 @@ static const struct dev_pm_ops tegra_vde_pm_ops = {
>  };
> 
>  static const struct tegra_vde_soc tegra20_vde_soc = {
> +	.num_ref_pics = 16,
>  	.supports_ref_pic_marking = false,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra30_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = false,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra114_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra124_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
> +	.supports_interlacing = true,
>  };
> 
>  static const struct of_device_id tegra_vde_of_match[] = {

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/14] staging: media: tegra-vde: Prepare for interlacing support
@ 2018-08-18 12:48     ` Dmitry Osipenko
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:48 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mauro Carvalho Chehab, Greg Kroah-Hartman, Jonathan Hunter,
	linux-media, linux-tegra, devel

On Monday, 13 August 2018 17:50:16 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The number of frames doubles when decoding interlaced content and the
> structures describing the frames double in size. Take that into account
> to prepare for interlacing support.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 73 ++++++++++++++++-----
>  1 file changed, 58 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 3027b11b11ae..1a40f6dff7c8 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -61,7 +61,9 @@ struct video_frame {
>  };
> 
>  struct tegra_vde_soc {
> +	unsigned int num_ref_pics;
>  	bool supports_ref_pic_marking;
> +	bool supports_interlacing;
>  };
> 
>  struct tegra_vde {
> @@ -205,8 +207,12 @@ static void tegra_vde_setup_frameid(struct tegra_vde
> *vde, u32 cr_addr = frame ? frame->cr_addr : 0x6CDEAD00;
>  	u32 value1 = frame ? ((mbs_width << 16) | mbs_height) : 0;
>  	u32 value2 = frame ? ((((mbs_width + 1) >> 1) << 6) | 1) : 0;
> +	u32 value = y_addr >> 8;

Let's name it value0 for consistency.

> 
> -	VDE_WR(y_addr  >> 8, vde->frameid + 0x000 + frameid * 4);
> +	if (vde->soc->supports_interlacing)
> +		value |= BIT(31);
> +
> +	VDE_WR(value,        vde->frameid + 0x000 + frameid * 4);
>  	VDE_WR(cb_addr >> 8, vde->frameid + 0x100 + frameid * 4);
>  	VDE_WR(cr_addr >> 8, vde->frameid + 0x180 + frameid * 4);
>  	VDE_WR(value1,       vde->frameid + 0x080 + frameid * 4);
> @@ -229,20 +235,23 @@ static void tegra_setup_frameidx(struct tegra_vde
> *vde, }
> 
>  static void tegra_vde_setup_iram_entry(struct tegra_vde *vde,
> +				       unsigned int num_ref_pics,
>  				       unsigned int table,
>  				       unsigned int row,
>  				       u32 value1, u32 value2)
>  {
> +	unsigned int entries = num_ref_pics * 2;
>  	u32 *iram_tables = vde->iram;
> 
>  	dev_dbg(vde->miscdev.parent, "IRAM table %u: row %u: 0x%08X 0x%08X\n",
>  		table, row, value1, value2);
> 
> -	iram_tables[0x20 * table + row * 2] = value1;
> -	iram_tables[0x20 * table + row * 2 + 1] = value2;
> +	iram_tables[entries * table + row * 2] = value1;
> +	iram_tables[entries * table + row * 2 + 1] = value2;
>  }
> 
>  static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
> +					unsigned int num_ref_pics,
>  					struct video_frame *dpb_frames,
>  					unsigned int ref_frames_nb,
>  					unsigned int with_earlier_poc_nb)
> @@ -251,13 +260,17 @@ static void tegra_vde_setup_iram_tables(struct
> tegra_vde *vde, u32 value, aux_addr;
>  	int with_later_poc_nb;
>  	unsigned int i, k;
> +	size_t size;
> +
> +	size = num_ref_pics * 4 * 8;
> +	memset(vde->iram, 0, size);

Is this memset() really needed, or is it just because you feel uncomfortable
leaving something uninitialized?
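
One way to look at this question is to check which words of the table area
the first loop already writes. The following is only a sketch in plain
userspace C, assuming (as in the quoted hunk) that the loop runs over all
num_ref_pics rows and writes tables 0..3 for each row. The array, the poison
value and the function names are made up for illustration and are not part
of the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REF_PICS 32u
#define POISON 0xdeadbeefu

/* Stand-in for the IRAM table area; sized for up to 32 reference pictures. */
static uint32_t iram_tables[MAX_REF_PICS * 8];

/* Same indexing as the patched tegra_vde_setup_iram_entry(). */
static void setup_iram_entry(unsigned int num_ref_pics, unsigned int table,
			     unsigned int row, uint32_t value1, uint32_t value2)
{
	unsigned int entries = num_ref_pics * 2;

	iram_tables[entries * table + row * 2] = value1;
	iram_tables[entries * table + row * 2 + 1] = value2;
}

/*
 * Poison the first num_ref_pics * 4 * 8 bytes, run an "all rows, tables
 * 0..3" loop, and count the 32-bit words that were never written.
 * num_ref_pics must not exceed MAX_REF_PICS.
 */
static size_t unwritten_words(unsigned int num_ref_pics)
{
	size_t words = (size_t)num_ref_pics * 4 * 8 / sizeof(uint32_t);
	size_t i, unwritten = 0;
	unsigned int row, table;

	for (i = 0; i < words; i++)
		iram_tables[i] = POISON;

	for (row = 0; row < num_ref_pics; row++)
		for (table = 0; table < 4; table++)
			setup_iram_entry(num_ref_pics, table, row, 0, 0);

	for (i = 0; i < words; i++)
		if (iram_tables[i] == POISON)
			unwritten++;

	return unwritten;
}
```

Under that assumption no word is left unwritten for num_ref_pics of 16 or 32,
which would make the memset() redundant for the table area. It would still
matter if some code path skipped rows or tables.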

> 
>  	dev_dbg(vde->miscdev.parent, "DPB: Frame 0: frame_num = %d\n",
>  		dpb_frames[0].frame_num);
> 
>  	dev_dbg(vde->miscdev.parent, "REF L0:\n");
> 
> -	for (i = 0; i < 16; i++) {
> +	for (i = 0; i < num_ref_pics; i++) {
>  		if (i < ref_frames_nb) {
>  			frame = &dpb_frames[i + 1];
> 
> @@ -277,10 +290,14 @@ static void tegra_vde_setup_iram_tables(struct
> tegra_vde *vde, value = 0;
>  		}
> 
> -		tegra_vde_setup_iram_entry(vde, 0, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 1, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> -		tegra_vde_setup_iram_entry(vde, 3, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 0, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 1, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 3, i, value,
> +					   aux_addr);
>  	}
> 
>  	if (!(dpb_frames[0].flags & FLAG_B_FRAME))
> @@ -309,7 +326,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde
> *vde, "\tFrame %d: frame_num = %d\n",
>  			k + 1, frame->frame_num);
> 
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
>  	}
> 
>  	for (k = 0; i < ref_frames_nb; i++, k++) {
> @@ -326,7 +344,8 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde
> *vde, "\tFrame %d: frame_num = %d\n",
>  			k + 1, frame->frame_num);
> 
> -		tegra_vde_setup_iram_entry(vde, 2, i, value, aux_addr);
> +		tegra_vde_setup_iram_entry(vde, num_ref_pics, 2, i, value,
> +					   aux_addr);
>  	}
>  }
> 
> @@ -339,9 +358,20 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, unsigned int macroblocks_nb)
>  {
>  	struct device *dev = vde->miscdev.parent;
> +	unsigned int num_ref_pics = 16;
> +	/* XXX extend ABI to provide this */
> +	bool interlaced = false;
> +	size_t size;
>  	u32 value;
>  	int err;
> 
> +	if (vde->soc->supports_interlacing) {
> +		if (interlaced)
> +			num_ref_pics = vde->soc->num_ref_pics;
> +		else
> +			num_ref_pics = 16;
> +	}
> +
>  	tegra_vde_set_bits(vde, 0x000A, vde->sxe + 0xF0);
>  	tegra_vde_set_bits(vde, 0x000B, vde->bsev + CMDQUE_CONTROL);
>  	tegra_vde_set_bits(vde, 0x8002, vde->mbe + 0x50);
> @@ -369,12 +399,12 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, VDE_WR(0x00000000, vde->bsev + 0x98);
>  	VDE_WR(0x00000060, vde->bsev + 0x9C);
> 
> -	memset(vde->iram + 128, 0, macroblocks_nb / 2);
> +	memset(vde->iram + 1024, 0, macroblocks_nb / 2);

This is wrong and breaks everything: vde->iram is a (u32 *), so +1024 is a
byte offset of 4096, while further down the code sets that offset to
vde->iram_lists_addr + 1024. So that should be:

	memset(vde->iram + 256, 0, macroblocks_nb / 2);

Probably we should put that hardcoded offset into a slice_group_map_offset 
variable for clarity, like this:

	unsigned int slice_group_map_offset = 1024;
..
	memset(vde->iram + slice_group_map_offset / sizeof(u32), 0,
		     macroblocks_nb / 2);
..
	/*
	 * The reference picture list lies at the beginning of the IRAM
	 * allocation; the slice group map is placed after the list.
	 */
	value |= ((vde->iram_lists_addr + slice_group_map_offset) >> 2) & 0xffff;
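
The pointer arithmetic behind this can be checked in plain userspace C. This
is only a sketch: fake_iram stands in for vde->iram, and both helper names
are hypothetical rather than taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the u32 *iram mapping; only its element type matters here. */
static uint32_t fake_iram[1024];

/* Byte distance covered when a u32 pointer is advanced by n elements. */
static size_t u32_advance_bytes(const uint32_t *base, size_t n)
{
	return (size_t)((const unsigned char *)(base + n) -
			(const unsigned char *)base);
}

/* Element index that corresponds to a byte offset into a u32 array. */
static size_t u32_index_for_byte_offset(size_t byte_offset)
{
	return byte_offset / sizeof(uint32_t);
}
```

Advancing by 1024 elements moves 4096 bytes, while a byte offset of 1024
corresponds to element 256, which is why dividing the byte offset by
sizeof(u32) keeps the memset() and the address pushed to the hardware in
agreement.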

> 
>  	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
>  			     ctx->pic_width_in_mbs, ctx->pic_height_in_mbs);
> 
> -	tegra_vde_setup_iram_tables(vde, dpb_frames,
> +	tegra_vde_setup_iram_tables(vde, num_ref_pics, dpb_frames,
>  				    ctx->dpb_frames_nb - 1,
>  				    ctx->dpb_ref_frames_with_earlier_poc_nb);
> 
> @@ -396,22 +426,27 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, if (err)
>  		return err;
> 
> -	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x800003FC, false);
> +	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
> +	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
>  		return err;
> 
>  	value = 0x01500000;
> -	value |= ((vde->iram_lists_addr + 512) >> 2) & 0xFFFF;
> +	value |= ((vde->iram_lists_addr + 1024) >> 2) & 0xffff;
> 
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, true);
>  	if (err)
>  		return err;
> 
> +	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);

	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
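
The arithmetic can be checked outside the driver. A minimal sketch in plain
C, using only the shifts and masks from the quoted patch; the helper names
are made up for illustration, and the meaning of the individual fields is
not documented in this thread.

```c
#include <stdint.h>

/*
 * Builds the word the patch computes as
 * (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff).
 * The cast keeps the shift in unsigned arithmetic.
 */
static uint32_t bsev_word_0x21(uint32_t a, uint32_t b)
{
	return ((uint32_t)0x21 << 26) | ((a & 0x1fff) << 12) | (b & 0xfff);
}

/* Builds (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff). */
static uint32_t bsev_word_0x20(uint32_t size_bytes)
{
	return ((uint32_t)0x20 << 26) | ((uint32_t)0x0 << 25) |
	       ((size_bytes >> 2) & 0x1fff);
}
```

bsev_word_0x21(240, 0x54c) evaluates to 0x840F054C, the constant still
passed in the quoted line, so pushing value instead should not change what
reaches the command queue. Likewise bsev_word_0x20(16 * 4 * 8) gives
0x80000080, the constant the patch removes.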

>  	if (err)
>  		return err;
> 
> -	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x80000080, false);
> +	size = num_ref_pics * 4 * 8;
> +
> +	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
> +	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
>  		return err;
> 
> @@ -1290,19 +1325,27 @@ static const struct dev_pm_ops tegra_vde_pm_ops = {
>  };
> 
>  static const struct tegra_vde_soc tegra20_vde_soc = {
> +	.num_ref_pics = 16,
>  	.supports_ref_pic_marking = false,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra30_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = false,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra114_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
> +	.supports_interlacing = false,
>  };
> 
>  static const struct tegra_vde_soc tegra124_vde_soc = {
> +	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
> +	.supports_interlacing = true,
>  };
> 
>  static const struct of_device_id tegra_vde_of_match[] = {

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 02/14] staging: media: tegra-vde: Support reference picture marking
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:48     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:48 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:15 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Tegra114 and Tegra124 support reference picture marking, which will
> cause BSEV to write picture marking data to SDRAM. Make sure there is
> a valid destination address for that data to avoid error messages from
> the memory controller.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 54 ++++++++++++++++++++-
>  drivers/staging/media/tegra-vde/uapi.h      |  3 ++
>  2 files changed, 55 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 9d8f833744db..3027b11b11ae 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -60,7 +60,12 @@ struct video_frame {
>  	u32 flags;
>  };
> 
> +struct tegra_vde_soc {
> +	bool supports_ref_pic_marking;
> +};
> +
>  struct tegra_vde {
> +	const struct tegra_vde_soc *soc;
>  	void __iomem *sxe;
>  	void __iomem *bsev;
>  	void __iomem *mbe;
> @@ -330,6 +335,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde, struct video_frame *dpb_frames,
>  				      dma_addr_t bitstream_data_addr,
>  				      size_t bitstream_data_size,
> +				      dma_addr_t secure_addr,
>  				      unsigned int macroblocks_nb)
>  {
>  	struct device *dev = vde->miscdev.parent;
> @@ -454,6 +460,9 @@ static int tegra_vde_setup_hw_context(struct tegra_vde
> *vde,
> 
>  	VDE_WR(bitstream_data_addr, vde->sxe + 0x6C);
> 
> +	if (vde->soc->supports_ref_pic_marking)
> +		VDE_WR(secure_addr, vde->sxe + 0x7c);
> +
>  	value = 0x10000005;
>  	value |= ctx->pic_width_in_mbs << 11;
>  	value |= ctx->pic_height_in_mbs << 3;
> @@ -772,12 +781,15 @@ static int tegra_vde_ioctl_decode_h264(struct
> tegra_vde *vde, struct tegra_vde_h264_frame __user *frames_user;
>  	struct video_frame *dpb_frames;
>  	struct dma_buf_attachment *bitstream_data_dmabuf_attachment;
> -	struct sg_table *bitstream_sgt;
> +	struct dma_buf_attachment *secure_attachment = NULL;
> +	struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
>  	size_t bitstream_data_size;
> +	size_t secure_size;

secure_size is unused; you could drop it and pass NULL below.

>  	unsigned int macroblocks_nb;
>  	unsigned int read_bytes;
>  	unsigned int cstride;
> @@ -803,6 +815,18 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, if (ret)
>  		return ret;
> 
> +	if (vde->soc->supports_ref_pic_marking) {
> +		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
> +					      ctx.secure_offset, 0, SZ_256,

What is the minimum buffer size? Since the buffer comes from userspace, you
must specify it so that the buffer size is validated correctly.
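
To illustrate why a min_size of 0 gives no protection, here is a sketch in
plain userspace C that mirrors the shape of the size check in
tegra_vde_attach_dmabuf(), "(u64)offset + min_size > dmabuf->size", as
quoted elsewhere in this thread. The function name is hypothetical, and the
size the hardware actually needs is not stated here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when [offset, offset + min_size) fits inside a buffer of
 * buf_size bytes. The 32-bit inputs are widened before the addition so the
 * sum cannot wrap.
 */
static bool dmabuf_range_ok(uint64_t buf_size, uint32_t offset,
			    uint32_t min_size)
{
	return (uint64_t)offset + min_size <= buf_size;
}
```

With min_size == 0 an offset equal to the buffer size is accepted, leaving
no room at all for the hardware to write into. Any non-zero minimum rejects
that case.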

> +					      &secure_attachment,
> +					      &secure_addr,
> +					      &secure_sgt,
> +					      &secure_size,
> +					      DMA_TO_DEVICE);
> +		if (ret)
> +			goto release_bitstream_dmabuf;
> +	}
> +
>  	dpb_frames = kcalloc(ctx.dpb_frames_nb, sizeof(*dpb_frames),
>  			     GFP_KERNEL);
>  	if (!dpb_frames) {
> @@ -876,6 +900,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, ret = tegra_vde_setup_hw_context(vde, &ctx, dpb_frames,
>  					 bitstream_data_addr,
>  					 bitstream_data_size,
> +					 secure_addr,
>  					 macroblocks_nb);
>  	if (ret)
>  		goto put_runtime_pm;
> @@ -929,6 +954,10 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, kfree(dpb_frames);
> 
>  release_bitstream_dmabuf:

release_secure_dmabuf:

> +	if (secure_attachment)
> +		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
> +						DMA_TO_DEVICE);
> +
>  	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
>  					bitstream_sgt, DMA_TO_DEVICE);
> 
> @@ -1029,6 +1058,8 @@ static int tegra_vde_probe(struct platform_device
> *pdev)
> 
>  	platform_set_drvdata(pdev, vde);
> 
> +	vde->soc = of_device_get_match_data(&pdev->dev);
> +
>  	regs = platform_get_resource_byname(pdev, IORESOURCE_MEM, "sxe");
>  	if (!regs)
>  		return -ENODEV;
> @@ -1258,8 +1289,27 @@ static const struct dev_pm_ops tegra_vde_pm_ops = {
>  				tegra_vde_pm_resume)
>  };
> 
> +static const struct tegra_vde_soc tegra20_vde_soc = {
> +	.supports_ref_pic_marking = false,
> +};
> +
> +static const struct tegra_vde_soc tegra30_vde_soc = {
> +	.supports_ref_pic_marking = false,
> +};
> +
> +static const struct tegra_vde_soc tegra114_vde_soc = {
> +	.supports_ref_pic_marking = true,
> +};
> +
> +static const struct tegra_vde_soc tegra124_vde_soc = {
> +	.supports_ref_pic_marking = true,
> +};
> +
>  static const struct of_device_id tegra_vde_of_match[] = {
> -	{ .compatible = "nvidia,tegra20-vde", },
> +	{ .compatible = "nvidia,tegra124-vde", .data = &tegra124_vde_soc },
> +	{ .compatible = "nvidia,tegra114-vde", .data = &tegra114_vde_soc },
> +	{ .compatible = "nvidia,tegra30-vde", .data = &tegra30_vde_soc },
> +	{ .compatible = "nvidia,tegra20-vde", .data = &tegra20_vde_soc },
>  	{ },
>  };
>  MODULE_DEVICE_TABLE(of, tegra_vde_of_match);
> diff --git a/drivers/staging/media/tegra-vde/uapi.h
> b/drivers/staging/media/tegra-vde/uapi.h index a50c7bcae057..58bfd56de55e
> 100644
> --- a/drivers/staging/media/tegra-vde/uapi.h
> +++ b/drivers/staging/media/tegra-vde/uapi.h
> @@ -35,6 +35,9 @@ struct tegra_vde_h264_decoder_ctx {
>  	__s32 bitstream_data_fd;
>  	__u32 bitstream_data_offset;
> 
> +	__s32 secure_fd;
> +	__u32 secure_offset;
> +

If the sole purpose of this buffer is to hold data that the VDE produces
during decoding, and userspace has no use for that data, why should this
buffer be exposed to userspace at all rather than being made internal to
the VDE driver?

>  	__u64 dpb_frames_ptr;
>  	__u8  dpb_frames_nb;
>  	__u8  dpb_ref_frames_with_earlier_poc_nb;

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 08/14] staging: media: tegra-vde: Track struct device *
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:49     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:49 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:21 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The pointer to the struct device is frequently used, so store it in
> struct tegra_vde. Also, pass around a pointer to a struct tegra_vde
> instead of struct device in some cases to prepare for subsequent
> patches referencing additional data from that structure.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 63 ++++++++++++---------
>  1 file changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 41cf86dc5dbd..2496a03fd158 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -71,6 +71,7 @@ struct tegra_vde_soc {
>  };
> 
>  struct tegra_vde {
> +	struct device *dev;
>  	const struct tegra_vde_soc *soc;
>  	void __iomem *sxe;
>  	void __iomem *bsev;
> @@ -644,7 +645,7 @@ static void tegra_vde_detach_and_put_dmabuf(struct
> dma_buf_attachment *a, dma_buf_put(dmabuf);
>  }
> 
> -static int tegra_vde_attach_dmabuf(struct device *dev,
> +static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   int fd,
>  				   unsigned long offset,
>  				   size_t min_size,
> @@ -662,38 +663,40 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
> 
>  	dmabuf = dma_buf_get(fd);
>  	if (IS_ERR(dmabuf)) {
> -		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
> +		dev_err(vde->dev, "Invalid dmabuf FD: %d\n", fd);
>  		return PTR_ERR(dmabuf);
>  	}
> 
>  	if (dmabuf->size & (align_size - 1)) {
> -		dev_err(dev, "Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
> +		dev_err(vde->dev,
> +			"Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
>  			dmabuf->size, align_size);
>  		return -EINVAL;
>  	}
> 
>  	if ((u64)offset + min_size > dmabuf->size) {
> -		dev_err(dev, "Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
> +		dev_err(vde->dev,
> +			"Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
>  			dmabuf->size, offset, min_size);
>  		return -EINVAL;
>  	}
> 
> -	attachment = dma_buf_attach(dmabuf, dev);
> +	attachment = dma_buf_attach(dmabuf, vde->dev);
>  	if (IS_ERR(attachment)) {
> -		dev_err(dev, "Failed to attach dmabuf\n");
> +		dev_err(vde->dev, "Failed to attach dmabuf\n");
>  		err = PTR_ERR(attachment);
>  		goto err_put;
>  	}
> 
>  	sgt = dma_buf_map_attachment(attachment, dma_dir);
>  	if (IS_ERR(sgt)) {
> -		dev_err(dev, "Failed to get dmabufs sg_table\n");
> +		dev_err(vde->dev, "Failed to get dmabufs sg_table\n");
>  		err = PTR_ERR(sgt);
>  		goto err_detach;
>  	}
> 
>  	if (sgt->nents != 1) {
> -		dev_err(dev, "Sparse DMA region is unsupported\n");
> +		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
> @@ -717,7 +720,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
>  	return err;
>  }
> 
> -static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
> +static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  					     struct video_frame *frame,
>  					     struct tegra_vde_h264_frame *src,
>  					     enum dma_data_direction dma_dir,
> @@ -726,7 +729,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> device *dev, {
>  	int err;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->y_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->y_fd,
>  				      src->y_offset, lsize, SZ_256,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
> @@ -735,7 +738,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> device *dev, if (err)
>  		return err;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->cb_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cb_fd,
>  				      src->cb_offset, csize, SZ_256,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
> @@ -744,7 +747,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> device *dev, if (err)
>  		goto err_release_y;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->cr_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cr_fd,
>  				      src->cr_offset, csize, SZ_256,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
> @@ -758,7 +761,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> device *dev, return 0;
>  	}
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->aux_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->aux_fd,
>  				      src->aux_offset, csize, SZ_256,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
> @@ -770,33 +773,35 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> device *dev, return 0;
> 
>  err_release_cr:
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
>  err_release_cb:
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
>  err_release_y:
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
> 
>  	return err;
>  }
> 
> -static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
> +static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
> +					    struct video_frame *frame,
>  					    enum dma_data_direction dma_dir,
>  					    bool baseline_profile)
>  {
>  	if (!baseline_profile)
> -		tegra_vde_detach_and_put_dmabuf(frame->aux_dmabuf_attachment,
> +		tegra_vde_detach_and_put_dmabuf(vde,
> +						frame->aux_dmabuf_attachment,
>  						frame->aux_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
>  }
> 
> @@ -937,7 +942,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, if (ret)
>  		return ret;
> 
> -	ret = tegra_vde_attach_dmabuf(dev, ctx.bitstream_data_fd,
> +	ret = tegra_vde_attach_dmabuf(vde, ctx.bitstream_data_fd,
>  				      ctx.bitstream_data_offset,
>  				      SZ_16K, SZ_16K,
>  				      &bitstream_data_dmabuf_attachment,
> @@ -949,7 +954,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, return ret;
> 
>  	if (vde->soc->supports_ref_pic_marking) {
> -		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
> +		ret = tegra_vde_attach_dmabuf(vde, ctx.secure_fd,
>  					      ctx.secure_offset, 0, SZ_256,
>  					      &secure_attachment,
>  					      &secure_addr,
> @@ -992,7 +997,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde,
> 
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> 
> -		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
> +		ret = tegra_vde_attach_dmabufs_to_frame(vde, &dpb_frames[i],
>  							&frame, dma_dir,
>  							ctx.baseline_profile,
>  							lsize, csize);
> @@ -1081,7 +1086,7 @@ static int tegra_vde_ioctl_decode_h264(struct
> tegra_vde *vde, while (i--) {
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> 
> -		tegra_vde_release_frame_dmabufs(&dpb_frames[i], dma_dir,
> +		tegra_vde_release_frame_dmabufs(vde, &dpb_frames[i], dma_dir,
>  						ctx.baseline_profile);
>  	}
> 
> @@ -1089,10 +1094,12 @@ static int tegra_vde_ioctl_decode_h264(struct
> tegra_vde *vde,
> 
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
> -		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
> +		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> +						secure_sgt,
>  						DMA_TO_DEVICE);
> 
> -	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde,
> +					bitstream_data_dmabuf_attachment,
>  					bitstream_sgt, DMA_TO_DEVICE);
> 
>  	return ret;
> @@ -1190,6 +1197,8 @@ static int tegra_vde_probe(struct platform_device
> *pdev) if (!vde)
>  		return -ENOMEM;
> 
> +	vde->dev = &pdev->dev;
> +
>  	platform_set_drvdata(pdev, vde);
> 
>  	vde->soc = of_device_get_match_data(&pdev->dev);

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 08/14] staging: media: tegra-vde: Track struct device *
@ 2018-08-18 12:49     ` Dmitry Osipenko
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:49 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mauro Carvalho Chehab, Greg Kroah-Hartman, Jonathan Hunter,
	linux-media, linux-tegra, devel

On Monday, 13 August 2018 17:50:21 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The pointer to the struct device is frequently used, so store it in
> struct tegra_vde. Also, pass around a pointer to a struct tegra_vde
> instead of struct device in some cases to prepare for subsequent
> patches referencing additional data from that structure.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 63 ++++++++++++---------
>  1 file changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 41cf86dc5dbd..2496a03fd158 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -71,6 +71,7 @@ struct tegra_vde_soc {
>  };
> 
>  struct tegra_vde {
> +	struct device *dev;
>  	const struct tegra_vde_soc *soc;
>  	void __iomem *sxe;
>  	void __iomem *bsev;
> @@ -644,7 +645,7 @@ static void tegra_vde_detach_and_put_dmabuf(struct
> dma_buf_attachment *a, dma_buf_put(dmabuf);
>  }
> 
> -static int tegra_vde_attach_dmabuf(struct device *dev,
> +static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   int fd,
>  				   unsigned long offset,
>  				   size_t min_size,
> @@ -662,38 +663,40 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
> 
>  	dmabuf = dma_buf_get(fd);
>  	if (IS_ERR(dmabuf)) {
> -		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
> +		dev_err(vde->dev, "Invalid dmabuf FD: %d\n", fd);
>  		return PTR_ERR(dmabuf);
>  	}
> 
>  	if (dmabuf->size & (align_size - 1)) {
> -		dev_err(dev, "Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
> +		dev_err(vde->dev,
> +			"Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
>  			dmabuf->size, align_size);
>  		return -EINVAL;
>  	}
> 
>  	if ((u64)offset + min_size > dmabuf->size) {
> -		dev_err(dev, "Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
> +		dev_err(vde->dev,
> +			"Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
>  			dmabuf->size, offset, min_size);
>  		return -EINVAL;
>  	}
> 
> -	attachment = dma_buf_attach(dmabuf, dev);
> +	attachment = dma_buf_attach(dmabuf, vde->dev);
>  	if (IS_ERR(attachment)) {
> -		dev_err(dev, "Failed to attach dmabuf\n");
> +		dev_err(vde->dev, "Failed to attach dmabuf\n");
>  		err = PTR_ERR(attachment);
>  		goto err_put;
>  	}
> 
>  	sgt = dma_buf_map_attachment(attachment, dma_dir);
>  	if (IS_ERR(sgt)) {
> -		dev_err(dev, "Failed to get dmabufs sg_table\n");
> +		dev_err(vde->dev, "Failed to get dmabufs sg_table\n");
>  		err = PTR_ERR(sgt);
>  		goto err_detach;
>  	}
> 
>  	if (sgt->nents != 1) {
> -		dev_err(dev, "Sparse DMA region is unsupported\n");
> +		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
> @@ -717,7 +720,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
>  	return err;
>  }
> 
> -static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
> +static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  					     struct video_frame *frame,
>  					     struct tegra_vde_h264_frame *src,
>  					     enum dma_data_direction dma_dir,
> @@ -726,7 +729,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  {
>  	int err;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->y_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->y_fd,
>  				      src->y_offset, lsize, SZ_256,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
> @@ -735,7 +738,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	if (err)
>  		return err;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->cb_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cb_fd,
>  				      src->cb_offset, csize, SZ_256,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
> @@ -744,7 +747,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	if (err)
>  		goto err_release_y;
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->cr_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cr_fd,
>  				      src->cr_offset, csize, SZ_256,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
> @@ -758,7 +761,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  		return 0;
>  	}
> 
> -	err = tegra_vde_attach_dmabuf(dev, src->aux_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->aux_fd,
>  				      src->aux_offset, csize, SZ_256,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
> @@ -770,33 +773,35 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	return 0;
> 
>  err_release_cr:
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
>  err_release_cb:
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
>  err_release_y:
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
> 
>  	return err;
>  }
> 
> -static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
> +static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
> +					    struct video_frame *frame,
>  					    enum dma_data_direction dma_dir,
>  					    bool baseline_profile)
>  {
>  	if (!baseline_profile)
> -		tegra_vde_detach_and_put_dmabuf(frame->aux_dmabuf_attachment,
> +		tegra_vde_detach_and_put_dmabuf(vde,
> +						frame->aux_dmabuf_attachment,
>  						frame->aux_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
> 
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
>  }
> 
> @@ -937,7 +942,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	if (ret)
>  		return ret;
> 
> -	ret = tegra_vde_attach_dmabuf(dev, ctx.bitstream_data_fd,
> +	ret = tegra_vde_attach_dmabuf(vde, ctx.bitstream_data_fd,
>  				      ctx.bitstream_data_offset,
>  				      SZ_16K, SZ_16K,
>  				      &bitstream_data_dmabuf_attachment,
> @@ -949,7 +954,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  		return ret;
> 
>  	if (vde->soc->supports_ref_pic_marking) {
> -		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
> +		ret = tegra_vde_attach_dmabuf(vde, ctx.secure_fd,
>  					      ctx.secure_offset, 0, SZ_256,
>  					      &secure_attachment,
>  					      &secure_addr,
> @@ -992,7 +997,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
> 
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> 
> -		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
> +		ret = tegra_vde_attach_dmabufs_to_frame(vde, &dpb_frames[i],
>  							&frame, dma_dir,
>  							ctx.baseline_profile,
>  							lsize, csize);
> @@ -1081,7 +1086,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	while (i--) {
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> 
> -		tegra_vde_release_frame_dmabufs(&dpb_frames[i], dma_dir,
> +		tegra_vde_release_frame_dmabufs(vde, &dpb_frames[i], dma_dir,
>  						ctx.baseline_profile);
>  	}
> 
> @@ -1089,10 +1094,12 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
> 
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
> -		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
> +		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> +						secure_sgt,
>  						DMA_TO_DEVICE);
> 
> -	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde,
> +					bitstream_data_dmabuf_attachment,
>  					bitstream_sgt, DMA_TO_DEVICE);
> 
>  	return ret;
> @@ -1190,6 +1197,8 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	if (!vde)
>  		return -ENOMEM;
> 
> +	vde->dev = &pdev->dev;
> +
>  	platform_set_drvdata(pdev, vde);
> 
>  	vde->soc = of_device_get_match_data(&pdev->dev);

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

* Re: [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:50     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:22 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Implement support for using an IOMMU to map physically discontiguous
> buffers into contiguous I/O virtual mappings that the VDE can use. This
> allows importing arbitrary DMA-BUFs for use by the VDE.
> 
> While at it, make sure that the device is detached from any DMA/IOMMU
> mapping that it might have automatically been attached to at boot. If
> using the IOMMU API explicitly, detaching from any existing mapping is
> required to avoid double mapping of buffers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
>  1 file changed, 153 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 2496a03fd158..3bc0bfcfe34e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -13,7 +13,9 @@
>  #include <linux/dma-buf.h>
>  #include <linux/genalloc.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/iopoll.h>
> +#include <linux/iova.h>
>  #include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/of_device.h>
> @@ -22,6 +24,10 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
> 
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
>  #include <soc/tegra/pmc.h>
> 
>  #include <drm/drm_fourcc.h>
> @@ -61,6 +67,11 @@ struct video_frame {
>  	u32 frame_num;
>  	u32 flags;
>  	u64 modifier;
> +
> +	struct iova *y_iova;
> +	struct iova *cb_iova;
> +	struct iova *cr_iova;
> +	struct iova *aux_iova;
>  };
> 
>  struct tegra_vde_soc {
> @@ -93,6 +104,12 @@ struct tegra_vde {
>  	struct clk *clk_bsev;
>  	dma_addr_t iram_lists_addr;
>  	u32 *iram;
> +
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +	struct iova_domain iova;
> +	unsigned long limit;
> +	unsigned int shift;
>  };
> 
>  static void tegra_vde_set_bits(struct tegra_vde *vde,
> @@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
>  	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
>  }
> 
> -static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
> +static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
> +					    struct dma_buf_attachment *a,
>  					    struct sg_table *sgt,
> +					    struct iova *iova,
>  					    enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf *dmabuf = a->dmabuf;
> 
> +	if (vde->domain) {
> +		unsigned long size = iova_size(iova) << vde->shift;

Let's make it "size = iova_align(&vde->iova, dmabuf->size)" for better 
readability.

> +		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
> +
> +		iommu_unmap(vde->domain, addr, size);
> +		__free_iova(&vde->iova, iova);
> +	}
> +
>  	dma_buf_unmap_attachment(a, sgt, dma_dir);
>  	dma_buf_detach(dmabuf, a);
>  	dma_buf_put(dmabuf);
> @@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   size_t min_size,
>  				   size_t align_size,
>  				   struct dma_buf_attachment **a,
> -				   dma_addr_t *addr,
> +				   dma_addr_t *addrp,
>  				   struct sg_table **s,
> -				   size_t *size,
> +				   struct iova **iovap,
> +				   size_t *sizep,
>  				   enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf_attachment *attachment;
>  	struct dma_buf *dmabuf;
>  	struct sg_table *sgt;
> +	size_t size;
>  	int err;
> 
>  	dmabuf = dma_buf_get(fd);
> @@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  		goto err_detach;
>  	}
> 
> -	if (sgt->nents != 1) {
> +	if (sgt->nents > 1 && !vde->domain) {
>  		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
> 
> -	*addr = sg_dma_address(sgt->sgl) + offset;
> +	if (vde->domain) {
> +		int prot = IOMMU_READ | IOMMU_WRITE;
> +		struct iova *iova;
> +		dma_addr_t addr;
> +
> +		size = (dmabuf->size - offset) >> vde->shift;

The offset should not be subtracted here, and the dmabuf size should be rounded
up to the IOVA granule. Also, let's not carry the shift within the vde structure,
as it isn't really worth it.


	shift = iova_shift(&vde->iova);
	size = iova_align(&vde->iova, dmabuf->size) >> shift;
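
To illustrate the rounding, here is a userspace sketch only, not driver code.
granule_align(), granule_pages() and pages_as_posted() are hypothetical names
standing in for what iova_align() and the shift do in the kernel; the granule
is assumed to be a power of two.

```c
#include <stddef.h>

/* Round size up to the next multiple of the granule (1 << shift). */
static size_t granule_align(size_t size, unsigned int shift)
{
	size_t mask = ((size_t)1 << shift) - 1;

	return (size + mask) & ~mask;
}

/* Number of granule-sized pages needed to cover the whole buffer. */
static size_t granule_pages(size_t size, unsigned int shift)
{
	return granule_align(size, shift) >> shift;
}

/* The computation in the posted hunk: truncates and depends on the offset. */
static size_t pages_as_posted(size_t size, size_t offset, unsigned int shift)
{
	return (size - offset) >> shift;
}
```

With a 4 KiB granule (shift = 12), granule_pages(0x2100, 12) gives 3 pages,
while pages_as_posted(0x2100, 0x100, 12) gives only 2, which would not cover
the whole buffer that is handed to iommu_map_sg().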

> +
> +		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
> +		if (!iova) {
> +			err = -ENOMEM;
> +			goto err_unmap;
> +		}
> +
> +		addr = iova_dma_addr(&vde->iova, iova);
> +
> +		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
> +				    prot);
> +		if (!size) {
> +			__free_iova(&vde->iova, iova);
> +			err = -ENXIO;
> +			goto err_unmap;
> +		}
> +
> +		*addrp = addr;

Returned address shall point at the beginning of the buffer + offset.

	*addrp = addr + offset;

The returned size shall represent what is left of the buffer after subtracting
the offset; it is used, for example, as the bitstream_data address counter limit.

	size = dmabuf->size - offset;
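
The two remarks above can be put together in a small userspace sketch. It is
an illustration under assumptions, not the driver's code: struct mapped_view
and view_from_offset() are hypothetical, and iova_base stands for the start of
the I/O virtual mapping of the whole buffer.

```c
#include <stddef.h>
#include <stdint.h>

struct mapped_view {
	uint64_t addr;	/* device-visible address of the first usable byte */
	size_t size;	/* bytes left from that address to the end of the buffer */
};

/*
 * The mapping always covers the whole buffer; the offset only moves the
 * address handed to the hardware and shrinks the usable size.
 * Returns 0 on success, -1 if the offset lies beyond the buffer.
 */
static int view_from_offset(uint64_t iova_base, size_t buf_size,
			    size_t offset, struct mapped_view *view)
{
	if (offset > buf_size)
		return -1;

	view->addr = iova_base + offset;
	view->size = buf_size - offset;

	return 0;
}
```

For a 0x4000-byte buffer mapped at 0x80000000 with an offset of 0x100, this
yields addr = 0x80000100 and size = 0x3f00.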

> +		*iovap = iova;
> +	} else {
> +		*addrp = sg_dma_address(sgt->sgl) + offset;
> +		size = dmabuf->size - offset;
> +	}
> +
>  	*a = attachment;
>  	*s = sgt;
> 
> -	if (size)
> -		*size = dmabuf->size - offset;
> +	if (sizep)
> +		*sizep = size;
> 
>  	return 0;
> 
> @@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
>  				      &frame->y_sgt,
> +				      &frame->y_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		return err;
> @@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
>  				      &frame->cb_sgt,
> +				      &frame->cb_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_y;
> @@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
>  				      &frame->cr_sgt,
> +				      &frame->cr_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cb;
> @@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
>  				      &frame->aux_sgt,
> +				      &frame->aux_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cr;
> @@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
> 
>  err_release_cr:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  err_release_cb:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  err_release_y:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
> 
>  	return err;
>  }
> @@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
>  	if (!baseline_profile)
>  		tegra_vde_detach_and_put_dmabuf(vde,
>  						frame->aux_dmabuf_attachment,
> -						frame->aux_sgt, dma_dir);
> +						frame->aux_sgt,
> +						frame->aux_iova, dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  }
> 
>  static int tegra_vde_validate_frame(struct device *dev,
> @@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	struct iova *bitstream_iova;
> +	struct iova *secure_iova;
>  	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
> @@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  				      &bitstream_data_dmabuf_attachment,
>  				      &bitstream_data_addr,
>  				      &bitstream_sgt,
> +				      &bitstream_iova,
>  				      &bitstream_data_size,
>  				      DMA_TO_DEVICE);
>  	if (ret)
> @@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  					      &secure_attachment,
>  					      &secure_addr,
>  					      &secure_sgt,
> +					      &secure_iova,
>  					      &secure_size,
>  					      DMA_TO_DEVICE);
>  		if (ret)
> @@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
>  		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> -						secure_sgt,
> +						secure_sgt, secure_iova,
>  						DMA_TO_DEVICE);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde,
>  					bitstream_data_dmabuf_attachment,
> -					bitstream_sgt, DMA_TO_DEVICE);
> +					bitstream_sgt, bitstream_iova,
> +					DMA_TO_DEVICE);
> 
>  	return ret;
>  }
> @@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	struct tegra_vde *vde;
>  	int irq, err;
> 
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +	if (dev->archdata.mapping) {
> +		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> +		arm_iommu_detach_device(dev);
> +		arm_iommu_release_mapping(mapping);
> +	}
> +#endif
> +
>  	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
>  	if (!vde)
>  		return -ENOMEM;
> @@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	}
> 
> +	vde->group = iommu_group_get(dev);
> +	if (vde->group) {
> +		unsigned long order;
> +
> +		vde->domain = iommu_domain_alloc(&platform_bus_type);
> +		if (!vde->domain) {
> +			iommu_group_put(vde->group);
> +			vde->group = NULL;
> +		} else {
> +			err = iova_cache_get();
> +			if (err < 0)
> +				goto free_domain;
> +
> +			order = __ffs(vde->domain->pgsize_bitmap);
> +
> +			init_iova_domain(&vde->iova, 1UL << order, 0);

The minimum address of IOVA allocations shall be determined by the domain's
aperture start address.


> +			vde->shift = iova_shift(&vde->iova);
> +			vde->limit = 1 << (32 - vde->shift);

The IOVA limit shall be determined by the domain's aperture size.

	struct iommu_domain_geometry *geometry;

	order = __ffs(vde->domain->pgsize_bitmap);
	geometry = &vde->domain->geometry;

	init_iova_domain(&vde->iova, 1UL << order,
				 geometry->aperture_start >> order);
	vde->iova_end = geometry->aperture_end;

Hence let's replace 'limit' with 'iova_end'.

> +
> +			/*
> +			 * VDE doesn't seem to like accessing the last page of
> +			 * its 32-bit address space.
> +			 */
> +			vde->limit -= 1;

That's probably because some VDE HW address counter wraps around and the HW
can't cope with that, due to a HW bug / optimization.

Since this only affects the end of the address space, let's check whether that
adjustment is needed:

	if (vde->iova_end == 0xffffffff)
		vde->iova_end -= 1UL << order;
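
As a worked example of the geometry-based setup suggested above, here is a
userspace sketch. The struct and function names are hypothetical and only
model the arithmetic; lowest_set_bit() stands in for the kernel's __ffs() and
must be given a non-zero bitmap.

```c
#include <stdint.h>

struct aperture {
	uint64_t start;	/* first valid I/O virtual address */
	uint64_t end;	/* last valid I/O virtual address (inclusive) */
};

struct iova_range {
	unsigned int order;	/* log2 of the smallest supported page size */
	uint64_t start_pfn;	/* first page frame usable for allocations */
	uint64_t end;		/* last address usable for allocations */
};

/* Index of the lowest set bit; the caller must pass a non-zero bitmap. */
static unsigned int lowest_set_bit(unsigned long bitmap)
{
	unsigned int bit = 0;

	while (!(bitmap & 1UL)) {
		bitmap >>= 1;
		bit++;
	}

	return bit;
}

static struct iova_range iova_range_from_aperture(unsigned long pgsize_bitmap,
						  struct aperture ap)
{
	struct iova_range r;

	r.order = lowest_set_bit(pgsize_bitmap);
	r.start_pfn = ap.start >> r.order;
	r.end = ap.end;

	/* Keep the last page of a full 32-bit address space unused. */
	if (r.end == 0xffffffffULL)
		r.end -= (uint64_t)1 << r.order;

	return r;
}
```

With 4 KiB pages and an aperture of 0x0..0xffffffff this gives order = 12,
start_pfn = 0 and end = 0xffffefff. An aperture that ends below 4 GiB is left
untouched.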

> +
> +			err = iommu_attach_group(vde->domain, vde->group);
> +			if (err < 0)

	goto put_iova;

> +				goto put_cache;
> +		}
> +	}
> +
>  	mutex_init(&vde->lock);
>  	init_completion(&vde->decode_completion);
> 
> @@ -1346,7 +1460,7 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	err = misc_register(&vde->miscdev);
>  	if (err) {
>  		dev_err(dev, "Failed to register misc device: %d\n", err);
> -		goto err_gen_free;
> +		goto detach;
>  	}
> 
>  	pm_runtime_enable(dev);
> @@ -1364,7 +1478,21 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  err_misc_unreg:
>  	misc_deregister(&vde->miscdev);
> 
> -err_gen_free:

put_iova:
	put_iova_domain(&vde->iova);
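
For reference, the reverse-order unwinding can be modelled in userspace as
below. This is a sketch under assumptions, not the driver: the step names and
the log are hypothetical, and the labels are ordered so that a failed attach
jumps past the detach step.

```c
#include <stddef.h>
#include <string.h>

/* Steps of a hypothetical setup sequence, in acquisition order. */
enum step { STEP_NONE, STEP_CACHE, STEP_ATTACH, STEP_REGISTER };

/* Append one teardown action to the log (the log must be large enough). */
static void undo(char *log, const char *what)
{
	strcat(log, what);
	strcat(log, ";");
}

/*
 * Simulate the setup; fail_at selects the step that fails.
 * Returns 0 on success, or -1 with the teardown actions recorded in log.
 */
static int setup(enum step fail_at, char *log)
{
	log[0] = '\0';

	if (fail_at == STEP_CACHE)	/* models iova_cache_get() failing */
		goto free_domain;

	/* the IOVA domain would be initialized here */

	if (fail_at == STEP_ATTACH)	/* models iommu_attach_group() failing */
		goto put_iova;

	if (fail_at == STEP_REGISTER)	/* models misc_register() failing */
		goto detach;

	return 0;

detach:
	undo(log, "detach");
put_iova:
	undo(log, "put_iova");
	undo(log, "put_cache");
free_domain:
	undo(log, "free_domain");

	return -1;
}
```

A failure at STEP_ATTACH records "put_iova;put_cache;free_domain;", whereas a
failure at STEP_REGISTER additionally starts with "detach;".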

> +detach:
> +	if (vde->domain)
> +		iommu_detach_group(vde->domain, vde->group);
> +
> +put_cache:
> +	if (vde->domain)
> +		iova_cache_put();
> +
> +free_domain:
> +	if (vde->domain)
> +		iommu_domain_free(vde->domain);
> +
> +	if (vde->group)
> +		iommu_group_put(vde->group);
> +
>  	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
>  		      gen_pool_size(vde->iram_pool));
> 
> @@ -1388,6 +1516,13 @@ static int tegra_vde_remove(struct platform_device *pdev)
> 
>  	misc_deregister(&vde->miscdev);
> 
> +	if (vde->domain) {
> +		iommu_detach_group(vde->domain, vde->group);

	put_iova_domain(&vde->iova);

> +		iova_cache_put();
> +		iommu_domain_free(vde->domain);
> +		iommu_group_put(vde->group);
> +	}
> +
>  	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
>  		      gen_pool_size(vde->iram_pool));

* Re: [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
@ 2018-08-18 12:50     ` Dmitry Osipenko
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mauro Carvalho Chehab, Greg Kroah-Hartman, Jonathan Hunter,
	linux-media, linux-tegra, devel

On Monday, 13 August 2018 17:50:22 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Implement support for using an IOMMU to map physically discontiguous
> buffers into contiguous I/O virtual mappings that the VDE can use. This
> allows importing arbitrary DMA-BUFs for use by the VDE.
> 
> While at it, make sure that the device is detached from any DMA/IOMMU
> mapping that it might have automatically been attached to at boot. If
> using the IOMMU API explicitly, detaching from any existing mapping is
> required to avoid double mapping of buffers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
>  1 file changed, 153 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c
> b/drivers/staging/media/tegra-vde/tegra-vde.c index
> 2496a03fd158..3bc0bfcfe34e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -13,7 +13,9 @@
>  #include <linux/dma-buf.h>
>  #include <linux/genalloc.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/iopoll.h>
> +#include <linux/iova.h>
>  #include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/of_device.h>
> @@ -22,6 +24,10 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
> 
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
>  #include <soc/tegra/pmc.h>
> 
>  #include <drm/drm_fourcc.h>
> @@ -61,6 +67,11 @@ struct video_frame {
>  	u32 frame_num;
>  	u32 flags;
>  	u64 modifier;
> +
> +	struct iova *y_iova;
> +	struct iova *cb_iova;
> +	struct iova *cr_iova;
> +	struct iova *aux_iova;
>  };
> 
>  struct tegra_vde_soc {
> @@ -93,6 +104,12 @@ struct tegra_vde {
>  	struct clk *clk_bsev;
>  	dma_addr_t iram_lists_addr;
>  	u32 *iram;
> +
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +	struct iova_domain iova;
> +	unsigned long limit;
> +	unsigned int shift;
>  };
> 
>  static void tegra_vde_set_bits(struct tegra_vde *vde,
> @@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde
> *vde, VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
>  }
> 
> -static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
> +static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
> +					    struct dma_buf_attachment *a,
>  					    struct sg_table *sgt,
> +					    struct iova *iova,
>  					    enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf *dmabuf = a->dmabuf;
> 
> +	if (vde->domain) {
> +		unsigned long size = iova_size(iova) << vde->shift;

Let's make it "size = iova_align(&vde->iova, dmabuf->size)" for better 
readability.

> +		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
> +
> +		iommu_unmap(vde->domain, addr, size);
> +		__free_iova(&vde->iova, iova);
> +	}
> +
>  	dma_buf_unmap_attachment(a, sgt, dma_dir);
>  	dma_buf_detach(dmabuf, a);
>  	dma_buf_put(dmabuf);
> @@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde
> *vde, size_t min_size,
>  				   size_t align_size,
>  				   struct dma_buf_attachment **a,
> -				   dma_addr_t *addr,
> +				   dma_addr_t *addrp,
>  				   struct sg_table **s,
> -				   size_t *size,
> +				   struct iova **iovap,
> +				   size_t *sizep,
>  				   enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf_attachment *attachment;
>  	struct dma_buf *dmabuf;
>  	struct sg_table *sgt;
> +	size_t size;
>  	int err;
> 
>  	dmabuf = dma_buf_get(fd);
> @@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde
> *vde, goto err_detach;
>  	}
> 
> -	if (sgt->nents != 1) {
> +	if (sgt->nents > 1 && !vde->domain) {
>  		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
> 
> -	*addr = sg_dma_address(sgt->sgl) + offset;
> +	if (vde->domain) {
> +		int prot = IOMMU_READ | IOMMU_WRITE;
> +		struct iova *iova;
> +		dma_addr_t addr;
> +
> +		size = (dmabuf->size - offset) >> vde->shift;

Offset shall not be subtracted and dmabuf size shall be rounded to IOVA 
granule. Also, let's not carry shift within the vde structure as it doesn't 
really worth it.


	shift = iova_shift(&vde->iova);
	size = iova_align(&vde->iova, dmabuf->size) >> shift;

> +
> +		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
> +		if (!iova) {
> +			err = -ENOMEM;
> +			goto err_unmap;
> +		}
> +
> +		addr = iova_dma_addr(&vde->iova, iova);
> +
> +		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
> +				    prot);
> +		if (!size) {
> +			__free_iova(&vde->iova, iova);
> +			err = -ENXIO;
> +			goto err_unmap;
> +		}
> +
> +		*addrp = addr;

Returned address shall point at the beginning of the buffer + offset.

	*addrp = addr + offset;

Returned size shall represent the leftover size after the offset subtraction, 
like for example it is used for the bitstream_data address counter limit.

	size = dmabuf->size - offset;

> +		*iovap = iova;
> +	} else {
> +		*addrp = sg_dma_address(sgt->sgl) + offset;
> +		size = dmabuf->size - offset;
> +	}
> +
>  	*a = attachment;
>  	*s = sgt;
> 
> -	if (size)
> -		*size = dmabuf->size - offset;
> +	if (sizep)
> +		*sizep = size;
> 
>  	return 0;
> 
> @@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> tegra_vde *vde, &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
>  				      &frame->y_sgt,
> +				      &frame->y_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		return err;
> @@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> tegra_vde *vde, &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
>  				      &frame->cb_sgt,
> +				      &frame->cb_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_y;
> @@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> tegra_vde *vde, &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
>  				      &frame->cr_sgt,
> +				      &frame->cr_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cb;
> @@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> tegra_vde *vde, &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
>  				      &frame->aux_sgt,
> +				      &frame->aux_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cr;
> @@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct
> tegra_vde *vde,
> 
>  err_release_cr:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  err_release_cb:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  err_release_y:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
> 
>  	return err;
>  }
> @@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct
> tegra_vde *vde, if (!baseline_profile)
>  		tegra_vde_detach_and_put_dmabuf(vde,
>  						frame->aux_dmabuf_attachment,
> -						frame->aux_sgt, dma_dir);
> +						frame->aux_sgt,
> +						frame->aux_iova, dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  }
> 
>  static int tegra_vde_validate_frame(struct device *dev,
> @@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	struct iova *bitstream_iova;
> +	struct iova *secure_iova;
>  	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
> @@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, &bitstream_data_dmabuf_attachment,
>  				      &bitstream_data_addr,
>  				      &bitstream_sgt,
> +				      &bitstream_iova,
>  				      &bitstream_data_size,
>  				      DMA_TO_DEVICE);
>  	if (ret)
> @@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, &secure_attachment,
>  					      &secure_addr,
>  					      &secure_sgt,
> +					      &secure_iova,
>  					      &secure_size,
>  					      DMA_TO_DEVICE);
>  		if (ret)
> @@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct
> tegra_vde *vde, release_bitstream_dmabuf:
>  	if (secure_attachment)
>  		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> -						secure_sgt,
> +						secure_sgt, secure_iova,
>  						DMA_TO_DEVICE);
> 
>  	tegra_vde_detach_and_put_dmabuf(vde,
>  					bitstream_data_dmabuf_attachment,
> -					bitstream_sgt, DMA_TO_DEVICE);
> +					bitstream_sgt, bitstream_iova,
> +					DMA_TO_DEVICE);
> 
>  	return ret;
>  }
> @@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device
> *pdev) struct tegra_vde *vde;
>  	int irq, err;
> 
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +	if (dev->archdata.mapping) {
> +		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> +		arm_iommu_detach_device(dev);
> +		arm_iommu_release_mapping(mapping);
> +	}
> +#endif
> +
>  	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
>  	if (!vde)
>  		return -ENOMEM;
> @@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device
> *pdev) return -ENOMEM;
>  	}
> 
> +	vde->group = iommu_group_get(dev);
> +	if (vde->group) {
> +		unsigned long order;
> +
> +		vde->domain = iommu_domain_alloc(&platform_bus_type);
> +		if (!vde->domain) {
> +			iommu_group_put(vde->group);
> +			vde->group = NULL;
> +		} else {
> +			err = iova_cache_get();
> +			if (err < 0)
> +				goto free_domain;
> +
> +			order = __ffs(vde->domain->pgsize_bitmap);
> +
> +			init_iova_domain(&vde->iova, 1UL << order, 0);

The minimum address of IOVA allocations shall be determined by the domains 
aperture start address.


> +			vde->shift = iova_shift(&vde->iova);
> +			vde->limit = 1 << (32 - vde->shift);

IOVA limit shall be determined by the domains aperture size. 

	struct iommu_domain_geometry *geometry;

	order = __ffs(vde->domain->pgsize_bitmap);
	geometry = &vde->domain->geometry;

	init_iova_domain(&vde->iova, 1UL << order,
				 geometry->aperture_start >> order);
	vde->iova_end = geometry->aperture_end;

Hence let's replace 'limit' with 'iova_end'.

> +
> +			/*
> +			 * VDE doesn't seem to like accessing the last page of
> +			 * its 32-bit address space.
> +			 */
> +			vde->limit -= 1;

That's probably because some VDE HW address counter is getting wrapped around 
and it can't cope with that due to a HW bug / optimization.

Since this only affects the end of AS, let's check if that adjustment is 
needed:

	if (vde->iova_end == 0xffffffff)
		vde->iova_end -= 1UL << order;
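
To illustrate the two suggestions above together, here is a minimal userspace
sketch of the arithmetic, not driver code. The struct and function names
(aperture, iova_range, iova_range_from_aperture) are hypothetical and only
mirror the aperture_start / aperture_end fields used in the snippets above. It
assumes a 32-bit address space and that 'order' is the page shift derived from
pgsize_bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the aperture fields of the IOMMU domain geometry. */
struct aperture {
	uint32_t start;	/* first valid I/O virtual address */
	uint32_t end;	/* last valid I/O virtual address (inclusive) */
};

/* Hypothetical result: page-frame bounds usable for IOVA allocations. */
struct iova_range {
	unsigned long start_pfn;
	unsigned long limit_pfn;
};

static struct iova_range iova_range_from_aperture(struct aperture ap,
						  unsigned int order)
{
	struct iova_range r;
	uint32_t end = ap.end;

	/*
	 * Only give up the last page when the aperture really reaches the
	 * top of the 32-bit space; smaller apertures are left untouched.
	 */
	if (end == 0xffffffffu)
		end -= (uint32_t)1 << order;

	r.start_pfn = ap.start >> order;
	r.limit_pfn = end >> order;

	return r;
}
```

With 4 KiB pages (order = 12) and an aperture of 0x00000000..0xffffffff this
yields PFNs 0..0xffffe, so the last page is never handed out, while an aperture
ending below 4 GiB keeps its full range.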

> +
> +			err = iommu_attach_group(vde->domain, vde->group);
> +			if (err < 0)

	goto put_iova;

> +				goto put_cache;
> +		}
> +	}
> +
>  	mutex_init(&vde->lock);
>  	init_completion(&vde->decode_completion);
> 
> @@ -1346,7 +1460,7 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	err = misc_register(&vde->miscdev);
>  	if (err) {
>  		dev_err(dev, "Failed to register misc device: %d\n", err);
> -		goto err_gen_free;
> +		goto detach;
>  	}
> 
>  	pm_runtime_enable(dev);
> @@ -1364,7 +1478,21 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  err_misc_unreg:
>  	misc_deregister(&vde->miscdev);
> 
> -err_gen_free:

put_iova:
	put_iova_domain(&vde->iova);

> +detach:
> +	if (vde->domain)
> +		iommu_detach_group(vde->domain, vde->group);
> +
> +put_cache:
> +	if (vde->domain)
> +		iova_cache_put();
> +
> +free_domain:
> +	if (vde->domain)
> +		iommu_domain_free(vde->domain);
> +
> +	if (vde->group)
> +		iommu_group_put(vde->group);
> +
>  	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
>  		      gen_pool_size(vde->iram_pool));
> 
> @@ -1388,6 +1516,13 @@ static int tegra_vde_remove(struct platform_device *pdev)
> 
>  	misc_deregister(&vde->miscdev);
> 
> +	if (vde->domain) {
> +		iommu_detach_group(vde->domain, vde->group);

	put_iova_domain(&vde->iova);

> +		iova_cache_put();
> +		iommu_domain_free(vde->domain);
> +		iommu_group_put(vde->group);
> +	}
> +
>  	gen_pool_free(vde->iram_pool, (unsigned long)vde->iram,
>  		      gen_pool_size(vde->iram_pool));

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/14] staging: media: tegra-vde: Add some clarifying comments
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:50     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:20 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Add some comments specifying what tables are being set up in VRAM.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 0adc603fa437..41cf86dc5dbd 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -271,6 +271,7 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
>  	unsigned int i, k;
>  	size_t size;
> 
> +	/* clear H256RefPicList */
>  	size = num_ref_pics * 4 * 8;
>  	memset(vde->iram, 0, size);

H256? Is it a typo?

> 
> @@ -453,6 +454,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
>  	VDE_WR(0x00000000, vde->bsev + 0x98);
>  	VDE_WR(0x00000060, vde->bsev + 0x9C);
> 
> +	/* clear H264MB2SliceGroupMap, assuming no FMO */
>  	memset(vde->iram + 1024, 0, macroblocks_nb / 2);
> 
>  	tegra_setup_frameidx(vde, dpb_frames, ctx->dpb_frames_nb,
> @@ -480,6 +482,8 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
>  	if (err)
>  		return err;
> 
> +	/* upload H264MB2SliceGroupMap */
> +	/* XXX don't hardcode map size? */
>  	value = (0x20 << 26) | (0 << 25) | ((4096 >> 2) & 0x1fff);
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
> @@ -492,6 +496,7 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
>  	if (err)
>  		return err;
> 
> +	/* clear H264MBInfo XXX don't hardcode size */
>  	value = (0x21 << 26) | ((240 & 0x1fff) << 12) | (0x54c & 0xfff);
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, 0x840F054C, false);
>  	if (err)
> @@ -499,6 +504,16 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
> 
>  	size = num_ref_pics * 4 * 8;
> 
> +	/* clear H264RefPicList */

#if 0

> +	value = (0x21 << 26) | (((size >> 2) & 0x1fff) << 12) | 0xE34;
> +
> +	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
> +	if (err)
> +		return err;

#endif

Is it supposed to do the same as "clear H256RefPicList -> memset(vde->iram, 0, 
size)" above?
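
For reference when comparing the hunks, the 0x21 command words in this patch
all appear to be built the same way. Below is a small sketch that assumes the
field positions read off the expressions in the patch. The function and
parameter names (bsev_cmd_0x21, count, offset) are made up for illustration,
and what the fields mean to the hardware is a guess, not something taken from
documentation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a command word the way the patch does for the 0x21 commands:
 *   bits 31:26  opcode (0x21)
 *   bits 24:12  13-bit count field (assumed to be a length in words)
 *   bits 11:0   12-bit offset field (assumed to select the target table)
 */
static inline uint32_t bsev_cmd_0x21(uint32_t count, uint32_t offset)
{
	return ((uint32_t)0x21 << 26) |
	       ((count & 0x1fffu) << 12) |
	       (offset & 0xfffu);
}
```

With count = 240 and offset = 0x54c this gives 0x840F054C, which is the literal
pushed in the "clear H264MBInfo" hunk, so the computed 'value' there and the
hardcoded constant are the same word. With size = 16 * 4 * 8, the proposed
"clear H264RefPicList" command would be bsev_cmd_0x21(size >> 2, 0xE34) =
0x84080E34.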

> +
> +	/* upload H264RefPicList */
>  	value = (0x20 << 26) | (0x0 << 25) | ((size >> 2) & 0x1fff);
>  	err = tegra_vde_push_to_bsev_icmdqueue(vde, value, false);
>  	if (err)
> @@ -584,7 +599,11 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
> 
>  	tegra_vde_mbe_set_0xa_reg(vde, 0, 0x000009FC);
>  	tegra_vde_mbe_set_0xa_reg(vde, 2, 0x61DEAD00);
> +#if 0
> +	tegra_vde_mbe_set_0xa_reg(vde, 4, dpb_frames[0].aux_addr); /* 0x62DEAD00 */
> +#else
>  	tegra_vde_mbe_set_0xa_reg(vde, 4, 0x62DEAD00);
> +#endif

This doesn't really clarify much, let's drop this chunk for now.

>  	tegra_vde_mbe_set_0xa_reg(vde, 6, 0x63DEAD00);
>  	tegra_vde_mbe_set_0xa_reg(vde, 8, dpb_frames[0].aux_addr);

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 10/14] staging: media: tegra-vde: Keep VDE in reset when unused
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:50     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:23 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> There is no point in keeping the VDE module out of reset when it is not
> in use. Reset it on runtime suspend.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 3bc0bfcfe34e..4b3c6ab3c77e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -1226,6 +1226,7 @@ static int tegra_vde_runtime_suspend(struct device *dev)
>  	}
> 
>  	reset_control_assert(vde->rst_bsev);
> +	reset_control_assert(vde->rst);
> 
>  	usleep_range(2000, 4000);

There is also no point in resetting the VDE while it is powered off, so why do
we do that?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 04/14] staging: media: tegra-vde: Use DRM/KMS framebuffer modifiers
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 12:53     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 12:53 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Mauro Carvalho Chehab, linux-media

On Monday, 13 August 2018 17:50:17 MSK Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> VDE on Tegra20 through Tegra114 supports reading and writing frames in
> 16x16 tiled layout. Similarily, the various block-linear layouts that
> are supported by the GPU on Tegra124 can also be read from and written
> to by the Tegra124 VDE.
> 
> Enable userspace to specify the desired layout using the existing DRM
> framebuffer modifiers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 112 +++++++++++++++++---
>  drivers/staging/media/tegra-vde/uapi.h      |   3 +-
>  2 files changed, 100 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 1a40f6dff7c8..275884e745df 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -24,6 +24,8 @@
> 
>  #include <soc/tegra/pmc.h>
> 
> +#include <drm/drm_fourcc.h>
> +
>  #include "uapi.h"
> 
>  #define ICMDQUE_WR		0x00
> @@ -58,12 +60,14 @@ struct video_frame {
>  	dma_addr_t aux_addr;
>  	u32 frame_num;
>  	u32 flags;
> +	u64 modifier;
>  };
> 
>  struct tegra_vde_soc {
>  	unsigned int num_ref_pics;
>  	bool supports_ref_pic_marking;
>  	bool supports_interlacing;
> +	bool supports_block_linear;
>  };
> 
>  struct tegra_vde {
> @@ -202,6 +206,7 @@ static void tegra_vde_setup_frameid(struct tegra_vde *vde,
>  				    unsigned int frameid,
>  				    u32 mbs_width, u32 mbs_height)
>  {
> +	u64 modifier = frame ? frame->modifier : DRM_FORMAT_MOD_LINEAR;
>  	u32 y_addr  = frame ? frame->y_addr  : 0x6CDEAD00;
>  	u32 cb_addr = frame ? frame->cb_addr : 0x6CDEAD00;
>  	u32 cr_addr = frame ? frame->cr_addr : 0x6CDEAD00;
> @@ -209,8 +214,12 @@ static void tegra_vde_setup_frameid(struct tegra_vde *vde,
>  	u32 value2 = frame ? ((((mbs_width + 1) >> 1) << 6) | 1) : 0;
>  	u32 value = y_addr >> 8;
> 
> -	if (vde->soc->supports_interlacing)
> +	if (!vde->soc->supports_interlacing) {
> +		if (modifier == DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED)
> +			value |= BIT(31);
> +	} else {
>  		value |= BIT(31);
> +	}
> 
>  	VDE_WR(value,        vde->frameid + 0x000 + frameid * 4);
>  	VDE_WR(cb_addr >> 8, vde->frameid + 0x100 + frameid * 4);
> @@ -349,6 +358,37 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
>  	}
>  }
> 
> +static int tegra_vde_get_block_height(u64 modifier, unsigned int *block_height)
> +{
> +	switch (modifier) {
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB:
> +		*block_height = 0;
> +		return 0;
> +
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB:
> +		*block_height = 1;
> +		return 0;
> +
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB:
> +		*block_height = 2;
> +		return 0;
> +
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB:
> +		*block_height = 3;
> +		return 0;
> +
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB:
> +		*block_height = 4;
> +		return 0;
> +
> +	case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB:
> +		*block_height = 5;
> +		return 0;
> +	}
> +
> +	return -EINVAL;
> +}
> +
>  static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
>  				      struct tegra_vde_h264_decoder_ctx *ctx,
>  				      struct video_frame *dpb_frames,
> @@ -383,7 +423,21 @@ static int tegra_vde_setup_hw_context(struct tegra_vde *vde,
>  	tegra_vde_set_bits(vde, 0x0005, vde->vdma + 0x04);
> 
>  	VDE_WR(0x00000000, vde->vdma + 0x1C);
> -	VDE_WR(0x00000000, vde->vdma + 0x00);
> +
> +	value = 0x00000000;
> +
> +	if (vde->soc->supports_block_linear) {
> +		unsigned int block_height;
> +
> +		err = tegra_vde_get_block_height(dpb_frames[0].modifier,
> +						 &block_height);
> +		if (err < 0)
> +			return err;
> +
> +		value |= block_height << 10;
> +	}
> +
> +	VDE_WR(value, vde->vdma + 0x00);
>  	VDE_WR(0x00000007, vde->vdma + 0x04);
>  	VDE_WR(0x00000007, vde->frameid + 0x200);
>  	VDE_WR(0x00000005, vde->tfe + 0x04);
> @@ -730,11 +784,37 @@ static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
>  static int tegra_vde_validate_frame(struct device *dev,
>  				    struct tegra_vde_h264_frame *frame)
>  {
> +	struct tegra_vde *vde = dev_get_drvdata(dev);
> +
>  	if (frame->frame_num > 0x7FFFFF) {
>  		dev_err(dev, "Bad frame_num %u\n", frame->frame_num);
>  		return -EINVAL;
>  	}
> 
> +	if (vde->soc->supports_block_linear) {
> +		switch (frame->modifier) {
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB:
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB:
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB:
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB:
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB:
> +		case DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB:
> +			break;
> +
> +		default:
> +			return -EINVAL;
> +		}
> +	} else {
> +		switch (frame->modifier) {
> +		case DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED:
> +		case DRM_FORMAT_MOD_LINEAR:
> +			break;
> +
> +		default:
> +			return -EINVAL;
> +		}
> +	}
> +
>  	return 0;
>  }
> 
> @@ -812,7 +892,6 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  {
>  	struct device *dev = vde->miscdev.parent;
>  	struct tegra_vde_h264_decoder_ctx ctx;
> -	struct tegra_vde_h264_frame frames[17];
>  	struct tegra_vde_h264_frame __user *frames_user;
>  	struct video_frame *dpb_frames;
>  	struct dma_buf_attachment *bitstream_data_dmabuf_attachment;
> @@ -872,28 +951,30 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	macroblocks_nb = ctx.pic_width_in_mbs * ctx.pic_height_in_mbs;
>  	frames_user = u64_to_user_ptr(ctx.dpb_frames_ptr);
> 
> -	if (copy_from_user(frames, frames_user,
> -			   ctx.dpb_frames_nb * sizeof(*frames))) {
> -		ret = -EFAULT;
> -		goto free_dpb_frames;
> -	}
> -
>  	cstride = ALIGN(ctx.pic_width_in_mbs * 8, 16);
>  	csize = cstride * ctx.pic_height_in_mbs * 8;
>  	lsize = macroblocks_nb * 256;
> 
>  	for (i = 0; i < ctx.dpb_frames_nb; i++) {
> -		ret = tegra_vde_validate_frame(dev, &frames[i]);
> +		struct tegra_vde_h264_frame frame;
> +
> +		if (copy_from_user(&frame, &frames_user[i], sizeof(frame))) {
> +			ret = -EFAULT;
> +			goto release_dpb_frames;
> +		}

This change is unrelated to the modifiers, it should be a standalone patch.

Why do we need to change this at all? Do you think it is more optimal to make
the kernel go back and forth copying the frames one by one rather than copying
them all at once?

> +
> +		ret = tegra_vde_validate_frame(dev, &frame);
>  		if (ret)
>  			goto release_dpb_frames;
> 
> -		dpb_frames[i].flags = frames[i].flags;
> -		dpb_frames[i].frame_num = frames[i].frame_num;
> +		dpb_frames[i].flags = frame.flags;
> +		dpb_frames[i].frame_num = frame.frame_num;
> +		dpb_frames[i].modifier = frame.modifier;
> 
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> 
>  		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
> -							&frames[i], dma_dir,
> +							&frame, dma_dir,
>  							ctx.baseline_profile,
>  							lsize, csize);
>  		if (ret)
> @@ -985,7 +1066,6 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde
> *vde, ctx.baseline_profile);
>  	}
> 
> -free_dpb_frames:
>  	kfree(dpb_frames);
> 
>  release_bitstream_dmabuf:
> @@ -1328,24 +1408,28 @@ static const struct tegra_vde_soc tegra20_vde_soc = {
>  	.num_ref_pics = 16,
>  	.supports_ref_pic_marking = false,
>  	.supports_interlacing = false,
> +	.supports_block_linear = false,
>  };
> 
>  static const struct tegra_vde_soc tegra30_vde_soc = {
>  	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = false,
>  	.supports_interlacing = false,
> +	.supports_block_linear = false,
>  };
> 
>  static const struct tegra_vde_soc tegra114_vde_soc = {
>  	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
>  	.supports_interlacing = false,
> +	.supports_block_linear = false,
>  };
> 
>  static const struct tegra_vde_soc tegra124_vde_soc = {
>  	.num_ref_pics = 32,
>  	.supports_ref_pic_marking = true,
>  	.supports_interlacing = true,
> +	.supports_block_linear = true,
>  };
> 
>  static const struct of_device_id tegra_vde_of_match[] = {
> diff --git a/drivers/staging/media/tegra-vde/uapi.h b/drivers/staging/media/tegra-vde/uapi.h
> index 58bfd56de55e..6cd730dda61c 100644
> --- a/drivers/staging/media/tegra-vde/uapi.h
> +++ b/drivers/staging/media/tegra-vde/uapi.h
> @@ -27,8 +27,9 @@ struct tegra_vde_h264_frame {
>  	__u32 aux_offset;
>  	__u32 frame_num;
>  	__u32 flags;
> +	__u64 modifier;
> 
> -	__u32 reserved;
> +	__u32 reserved[4];
>  } __attribute__((packed));
> 
>  struct tegra_vde_h264_decoder_ctx {

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 13:07     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 13:07 UTC (permalink / raw)
  To: Thierry Reding, Mauro Carvalho Chehab
  Cc: linux-tegra, Greg Kroah-Hartman, linux-media, devel, Jonathan Hunter

On 13.08.2018 17:50, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Implement support for using an IOMMU to map physically discontiguous
> buffers into contiguous I/O virtual mappings that the VDE can use. This
> allows importing arbitrary DMA-BUFs for use by the VDE.
> 
> While at it, make sure that the device is detached from any DMA/IOMMU
> mapping that it might have automatically been attached to at boot. If
> using the IOMMU API explicitly, detaching from any existing mapping is
> required to avoid double mapping of buffers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
>  1 file changed, 153 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 2496a03fd158..3bc0bfcfe34e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -13,7 +13,9 @@
>  #include <linux/dma-buf.h>
>  #include <linux/genalloc.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/iopoll.h>
> +#include <linux/iova.h>
>  #include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/of_device.h>
> @@ -22,6 +24,10 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
>  #include <soc/tegra/pmc.h>
>  
>  #include <drm/drm_fourcc.h>
> @@ -61,6 +67,11 @@ struct video_frame {
>  	u32 frame_num;
>  	u32 flags;
>  	u64 modifier;
> +
> +	struct iova *y_iova;
> +	struct iova *cb_iova;
> +	struct iova *cr_iova;
> +	struct iova *aux_iova;
>  };
>  
>  struct tegra_vde_soc {
> @@ -93,6 +104,12 @@ struct tegra_vde {
>  	struct clk *clk_bsev;
>  	dma_addr_t iram_lists_addr;
>  	u32 *iram;
> +
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +	struct iova_domain iova;
> +	unsigned long limit;
> +	unsigned int shift;
>  };
>  
>  static void tegra_vde_set_bits(struct tegra_vde *vde,
> @@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
>  	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
>  }
>  
> -static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
> +static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
> +					    struct dma_buf_attachment *a,
>  					    struct sg_table *sgt,
> +					    struct iova *iova,
>  					    enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf *dmabuf = a->dmabuf;
>  
> +	if (vde->domain) {
> +		unsigned long size = iova_size(iova) << vde->shift;
> +		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
> +
> +		iommu_unmap(vde->domain, addr, size);
> +		__free_iova(&vde->iova, iova);
> +	}
> +
>  	dma_buf_unmap_attachment(a, sgt, dma_dir);
>  	dma_buf_detach(dmabuf, a);
>  	dma_buf_put(dmabuf);
> @@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   size_t min_size,
>  				   size_t align_size,
>  				   struct dma_buf_attachment **a,
> -				   dma_addr_t *addr,
> +				   dma_addr_t *addrp,
>  				   struct sg_table **s,
> -				   size_t *size,
> +				   struct iova **iovap,
> +				   size_t *sizep,
>  				   enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf_attachment *attachment;
>  	struct dma_buf *dmabuf;
>  	struct sg_table *sgt;
> +	size_t size;
>  	int err;
>  
>  	dmabuf = dma_buf_get(fd);
> @@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  		goto err_detach;
>  	}
>  
> -	if (sgt->nents != 1) {
> +	if (sgt->nents > 1 && !vde->domain) {
>  		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
>  
> -	*addr = sg_dma_address(sgt->sgl) + offset;
> +	if (vde->domain) {
> +		int prot = IOMMU_READ | IOMMU_WRITE;
> +		struct iova *iova;
> +		dma_addr_t addr;
> +
> +		size = (dmabuf->size - offset) >> vde->shift;
> +
> +		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
> +		if (!iova) {
> +			err = -ENOMEM;
> +			goto err_unmap;
> +		}
> +
> +		addr = iova_dma_addr(&vde->iova, iova);
> +
> +		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
> +				    prot);
> +		if (!size) {
> +			__free_iova(&vde->iova, iova);
> +			err = -ENXIO;
> +			goto err_unmap;
> +		}
> +
> +		*addrp = addr;
> +		*iovap = iova;
> +	} else {
> +		*addrp = sg_dma_address(sgt->sgl) + offset;
> +		size = dmabuf->size - offset;
> +	}
> +
>  	*a = attachment;
>  	*s = sgt;
>  
> -	if (size)
> -		*size = dmabuf->size - offset;
> +	if (sizep)
> +		*sizep = size;
>  
>  	return 0;
>  
> @@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
>  				      &frame->y_sgt,
> +				      &frame->y_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		return err;
> @@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
>  				      &frame->cb_sgt,
> +				      &frame->cb_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_y;
> @@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
>  				      &frame->cr_sgt,
> +				      &frame->cr_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cb;
> @@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
>  				      &frame->aux_sgt,
> +				      &frame->aux_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cr;
> @@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  
>  err_release_cr:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  err_release_cb:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  err_release_y:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  
>  	return err;
>  }
> @@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
>  	if (!baseline_profile)
>  		tegra_vde_detach_and_put_dmabuf(vde,
>  						frame->aux_dmabuf_attachment,
> -						frame->aux_sgt, dma_dir);
> +						frame->aux_sgt,
> +						frame->aux_iova, dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  }
>  
>  static int tegra_vde_validate_frame(struct device *dev,
> @@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	struct iova *bitstream_iova;
> +	struct iova *secure_iova;
>  	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
> @@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  				      &bitstream_data_dmabuf_attachment,
>  				      &bitstream_data_addr,
>  				      &bitstream_sgt,
> +				      &bitstream_iova,
>  				      &bitstream_data_size,
>  				      DMA_TO_DEVICE);
>  	if (ret)
> @@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  					      &secure_attachment,
>  					      &secure_addr,
>  					      &secure_sgt,
> +					      &secure_iova,
>  					      &secure_size,
>  					      DMA_TO_DEVICE);
>  		if (ret)
> @@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
>  		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> -						secure_sgt,
> +						secure_sgt, secure_iova,
>  						DMA_TO_DEVICE);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde,
>  					bitstream_data_dmabuf_attachment,
> -					bitstream_sgt, DMA_TO_DEVICE);
> +					bitstream_sgt, bitstream_iova,
> +					DMA_TO_DEVICE);
>  
>  	return ret;
>  }
> @@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	struct tegra_vde *vde;
>  	int irq, err;
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +	if (dev->archdata.mapping) {
> +		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> +		arm_iommu_detach_device(dev);
> +		arm_iommu_release_mapping(mapping);
> +	}
> +#endif
> +
>  	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
>  	if (!vde)
>  		return -ENOMEM;
> @@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	}
>  
> +	vde->group = iommu_group_get(dev);
> +	if (vde->group) {
> +		unsigned long order;
> +
> +		vde->domain = iommu_domain_alloc(&platform_bus_type);
> +		if (!vde->domain) {
> +			iommu_group_put(vde->group);
> +			vde->group = NULL;
> +		} else {
> +			err = iova_cache_get();
> +			if (err < 0)
> +				goto free_domain;
> +
> +			order = __ffs(vde->domain->pgsize_bitmap);
> +
> +			init_iova_domain(&vde->iova, 1UL << order, 0);
> +			vde->shift = iova_shift(&vde->iova);
> +			vde->limit = 1 << (32 - vde->shift);
> +
> +			/*
> +			 * VDE doesn't seem to like accessing the last page of
> +			 * its 32-bit address space.
> +			 */
> +			vde->limit -= 1;
> +
> +			err = iommu_attach_group(vde->domain, vde->group);
> +			if (err < 0)
> +				goto put_cache;
> +		}
> +	}
> +

Let's factor out the IOMMU setup into tegra_vde_init/release_iommu().
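
As a rough illustration of that suggestion, here is a userspace model of an
init/release pair with goto-based unwinding. It is only a sketch: struct
mock_vde, mock_step() and the mock_vde_*_iommu() names are hypothetical
stand-ins, not kernel APIs, and the case where domain allocation fails and
the driver carries on without an IOMMU is left out for brevity.

```c
/*
 * Userspace model only: nothing here is a real kernel API. Each bool
 * stands in for one resource acquired by the quoted probe hunk.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_vde {
	bool has_group;	/* models iommu_group_get() returning non-NULL */
	bool domain;	/* models an allocated vde->domain */
	bool cache;	/* models a held iova_cache_get() reference */
	bool attached;	/* models a successful iommu_attach_group() */
	int fail_step;	/* test knob: 1 = cache, 2 = attach, 0 = none */
};

/* Pretend to run one setup step; fails only if the test knob says so. */
static int mock_step(const struct mock_vde *vde, int step)
{
	return vde->fail_step == step ? -1 : 0;
}

/* Returns 0 on success (including "no IOMMU present"), non-zero on error. */
int mock_vde_init_iommu(struct mock_vde *vde)
{
	int err;

	assert(vde != NULL);

	if (!vde->has_group)
		return 0;	/* no IOMMU: keep requiring contiguous buffers */

	vde->domain = true;

	err = mock_step(vde, 1);	/* take the IOVA cache reference */
	if (err)
		goto free_domain;
	vde->cache = true;

	err = mock_step(vde, 2);	/* attach the group to the domain */
	if (err)
		goto put_cache;
	vde->attached = true;

	return 0;

put_cache:
	vde->cache = false;
free_domain:
	vde->domain = false;
	return err;
}

/* Undo mock_vde_init_iommu() in reverse order; a no-op if init did nothing. */
void mock_vde_release_iommu(struct mock_vde *vde)
{
	assert(vde != NULL);

	if (!vde->domain)
		return;

	vde->attached = false;
	vde->cache = false;
	vde->domain = false;
}
```

With the setup isolated like this, probe would only need a single call plus
one error label, and remove could call the release helper unconditionally.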

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 13:29     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 13:29 UTC (permalink / raw)
  To: Thierry Reding, Mauro Carvalho Chehab
  Cc: linux-tegra, Greg Kroah-Hartman, linux-media, devel, Jonathan Hunter

On 13.08.2018 17:50, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Implement support for using an IOMMU to map physically discontiguous
> buffers into contiguous I/O virtual mappings that the VDE can use. This
> allows importing arbitrary DMA-BUFs for use by the VDE.
> 
> While at it, make sure that the device is detached from any DMA/IOMMU
> mapping that it might have automatically been attached to at boot. If
> using the IOMMU API explicitly, detaching from any existing mapping is
> required to avoid double mapping of buffers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
>  1 file changed, 153 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 2496a03fd158..3bc0bfcfe34e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -13,7 +13,9 @@
>  #include <linux/dma-buf.h>
>  #include <linux/genalloc.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/iopoll.h>
> +#include <linux/iova.h>
>  #include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/of_device.h>
> @@ -22,6 +24,10 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
>  #include <soc/tegra/pmc.h>
>  
>  #include <drm/drm_fourcc.h>
> @@ -61,6 +67,11 @@ struct video_frame {
>  	u32 frame_num;
>  	u32 flags;
>  	u64 modifier;
> +
> +	struct iova *y_iova;
> +	struct iova *cb_iova;
> +	struct iova *cr_iova;
> +	struct iova *aux_iova;
>  };
>  
>  struct tegra_vde_soc {
> @@ -93,6 +104,12 @@ struct tegra_vde {
>  	struct clk *clk_bsev;
>  	dma_addr_t iram_lists_addr;
>  	u32 *iram;
> +
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +	struct iova_domain iova;
> +	unsigned long limit;
> +	unsigned int shift;
>  };
>  
>  static void tegra_vde_set_bits(struct tegra_vde *vde,
> @@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
>  	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
>  }
>  
> -static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
> +static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
> +					    struct dma_buf_attachment *a,
>  					    struct sg_table *sgt,
> +					    struct iova *iova,
>  					    enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf *dmabuf = a->dmabuf;
>  
> +	if (vde->domain) {
> +		unsigned long size = iova_size(iova) << vde->shift;
> +		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
> +
> +		iommu_unmap(vde->domain, addr, size);
> +		__free_iova(&vde->iova, iova);
> +	}
> +
>  	dma_buf_unmap_attachment(a, sgt, dma_dir);
>  	dma_buf_detach(dmabuf, a);
>  	dma_buf_put(dmabuf);
> @@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   size_t min_size,
>  				   size_t align_size,
>  				   struct dma_buf_attachment **a,
> -				   dma_addr_t *addr,
> +				   dma_addr_t *addrp,
>  				   struct sg_table **s,
> -				   size_t *size,
> +				   struct iova **iovap,
> +				   size_t *sizep,
>  				   enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf_attachment *attachment;
>  	struct dma_buf *dmabuf;
>  	struct sg_table *sgt;
> +	size_t size;
>  	int err;
>  
>  	dmabuf = dma_buf_get(fd);
> @@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  		goto err_detach;
>  	}
>  
> -	if (sgt->nents != 1) {
> +	if (sgt->nents > 1 && !vde->domain) {
>  		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
>  
> -	*addr = sg_dma_address(sgt->sgl) + offset;
> +	if (vde->domain) {
> +		int prot = IOMMU_READ | IOMMU_WRITE;
> +		struct iova *iova;
> +		dma_addr_t addr;
> +
> +		size = (dmabuf->size - offset) >> vde->shift;
> +
> +		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
> +		if (!iova) {
> +			err = -ENOMEM;
> +			goto err_unmap;
> +		}
> +
> +		addr = iova_dma_addr(&vde->iova, iova);
> +
> +		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
> +				    prot);
> +		if (!size) {
> +			__free_iova(&vde->iova, iova);
> +			err = -ENXIO;
> +			goto err_unmap;
> +		}
> +
> +		*addrp = addr;
> +		*iovap = iova;
> +	} else {
> +		*addrp = sg_dma_address(sgt->sgl) + offset;
> +		size = dmabuf->size - offset;
> +	}
> +
>  	*a = attachment;
>  	*s = sgt;
>  
> -	if (size)
> -		*size = dmabuf->size - offset;
> +	if (sizep)
> +		*sizep = size;
>  
>  	return 0;
>  
> @@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
>  				      &frame->y_sgt,
> +				      &frame->y_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		return err;
> @@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
>  				      &frame->cb_sgt,
> +				      &frame->cb_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_y;
> @@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
>  				      &frame->cr_sgt,
> +				      &frame->cr_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cb;
> @@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
>  				      &frame->aux_sgt,
> +				      &frame->aux_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cr;
> @@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  
>  err_release_cr:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  err_release_cb:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  err_release_y:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  
>  	return err;
>  }
> @@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
>  	if (!baseline_profile)
>  		tegra_vde_detach_and_put_dmabuf(vde,
>  						frame->aux_dmabuf_attachment,
> -						frame->aux_sgt, dma_dir);
> +						frame->aux_sgt,
> +						frame->aux_iova, dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  }
>  
>  static int tegra_vde_validate_frame(struct device *dev,
> @@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	struct iova *bitstream_iova;
> +	struct iova *secure_iova;
>  	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
> @@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  				      &bitstream_data_dmabuf_attachment,
>  				      &bitstream_data_addr,
>  				      &bitstream_sgt,
> +				      &bitstream_iova,
>  				      &bitstream_data_size,
>  				      DMA_TO_DEVICE);
>  	if (ret)
> @@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  					      &secure_attachment,
>  					      &secure_addr,
>  					      &secure_sgt,
> +					      &secure_iova,
>  					      &secure_size,
>  					      DMA_TO_DEVICE);
>  		if (ret)
> @@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
>  		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> -						secure_sgt,
> +						secure_sgt, secure_iova,
>  						DMA_TO_DEVICE);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde,
>  					bitstream_data_dmabuf_attachment,
> -					bitstream_sgt, DMA_TO_DEVICE);
> +					bitstream_sgt, bitstream_iova,
> +					DMA_TO_DEVICE);
>  
>  	return ret;
>  }
> @@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	struct tegra_vde *vde;
>  	int irq, err;
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +	if (dev->archdata.mapping) {
> +		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> +		arm_iommu_detach_device(dev);
> +		arm_iommu_release_mapping(mapping);
> +	}
> +#endif
> +
>  	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
>  	if (!vde)
>  		return -ENOMEM;
> @@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	}
>  
> +	vde->group = iommu_group_get(dev);
> +	if (vde->group) {
> +		unsigned long order;
> +
> +		vde->domain = iommu_domain_alloc(&platform_bus_type);
> +		if (!vde->domain) {
> +			iommu_group_put(vde->group);
> +			vde->group = NULL;
> +		} else {
> +			err = iova_cache_get();
> +			if (err < 0)

iova_cache_get() only returns 0 on success, so let's check against 0 here,
as the rest of the code does, for consistency.

> +				goto free_domain;
> +
> +			order = __ffs(vde->domain->pgsize_bitmap);
> +
> +			init_iova_domain(&vde->iova, 1UL << order, 0);
> +			vde->shift = iova_shift(&vde->iova);
> +			vde->limit = 1 << (32 - vde->shift);
> +
> +			/*
> +			 * VDE doesn't seem to like accessing the last page of
> +			 * its 32-bit address space.
> +			 */
> +			vde->limit -= 1;
> +
> +			err = iommu_attach_group(vde->domain, vde->group);
> +			if (err < 0)

Same as above.

> +				goto put_cache;
> +		}
> +	}
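
For reference, the page-frame arithmetic in the quoted hunk can be checked in
userspace. The sketch below assumes a smallest IOMMU page size of 4 KiB
(lowest set bit of pgsize_bitmap is bit 12), which is an assumption and not
something the patch states; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the lowest set bit: a portable stand-in for the kernel's __ffs(). */
unsigned int lowest_set_bit(unsigned long bitmap)
{
	unsigned int bit = 0;

	assert(bitmap != 0);
	while (!(bitmap & 1UL)) {
		bitmap >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Page-frame limit as computed in the hunk: the number of pages in a
 * 32-bit address space, minus one to keep clear of the last page.
 */
unsigned long vde_limit_pfn(unsigned int shift)
{
	assert(shift > 0 && shift < 32);
	return (1UL << (32 - shift)) - 1;
}

/* Buffer size in whole pages; the right shift drops any partial page. */
size_t bytes_to_pages(size_t bytes, unsigned int shift)
{
	return bytes >> shift;
}
```

Under that assumption the shift is 12 and vde->limit comes out as 1048575
page frames. The hunk then passes vde->limit - 1 to alloc_iova(), and it
sizes the allocation with a plain right shift, so a trailing partial page
does not count towards the page total.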

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support
@ 2018-08-18 13:29     ` Dmitry Osipenko
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 13:29 UTC (permalink / raw)
  To: Thierry Reding, Mauro Carvalho Chehab
  Cc: Greg Kroah-Hartman, Jonathan Hunter, linux-media, linux-tegra, devel

On 13.08.2018 17:50, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Implement support for using an IOMMU to map physically discontiguous
> buffers into contiguous I/O virtual mappings that the VDE can use. This
> allows importing arbitrary DMA-BUFs for use by the VDE.
> 
> While at it, make sure that the device is detached from any DMA/IOMMU
> mapping that it might have automatically been attached to at boot. If
> using the IOMMU API explicitly, detaching from any existing mapping is
> required to avoid double mapping of buffers.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 171 +++++++++++++++++---
>  1 file changed, 153 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 2496a03fd158..3bc0bfcfe34e 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -13,7 +13,9 @@
>  #include <linux/dma-buf.h>
>  #include <linux/genalloc.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/iopoll.h>
> +#include <linux/iova.h>
>  #include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/of_device.h>
> @@ -22,6 +24,10 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
>  #include <soc/tegra/pmc.h>
>  
>  #include <drm/drm_fourcc.h>
> @@ -61,6 +67,11 @@ struct video_frame {
>  	u32 frame_num;
>  	u32 flags;
>  	u64 modifier;
> +
> +	struct iova *y_iova;
> +	struct iova *cb_iova;
> +	struct iova *cr_iova;
> +	struct iova *aux_iova;
>  };
>  
>  struct tegra_vde_soc {
> @@ -93,6 +104,12 @@ struct tegra_vde {
>  	struct clk *clk_bsev;
>  	dma_addr_t iram_lists_addr;
>  	u32 *iram;
> +
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +	struct iova_domain iova;
> +	unsigned long limit;
> +	unsigned int shift;
>  };
>  
>  static void tegra_vde_set_bits(struct tegra_vde *vde,
> @@ -634,12 +651,22 @@ static void tegra_vde_decode_frame(struct tegra_vde *vde,
>  	VDE_WR(0x20000000 | (macroblocks_nb - 1), vde->sxe + 0x00);
>  }
>  
> -static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
> +static void tegra_vde_detach_and_put_dmabuf(struct tegra_vde *vde,
> +					    struct dma_buf_attachment *a,
>  					    struct sg_table *sgt,
> +					    struct iova *iova,
>  					    enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf *dmabuf = a->dmabuf;
>  
> +	if (vde->domain) {
> +		unsigned long size = iova_size(iova) << vde->shift;
> +		dma_addr_t addr = iova_dma_addr(&vde->iova, iova);
> +
> +		iommu_unmap(vde->domain, addr, size);
> +		__free_iova(&vde->iova, iova);
> +	}
> +
>  	dma_buf_unmap_attachment(a, sgt, dma_dir);
>  	dma_buf_detach(dmabuf, a);
>  	dma_buf_put(dmabuf);
> @@ -651,14 +678,16 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   size_t min_size,
>  				   size_t align_size,
>  				   struct dma_buf_attachment **a,
> -				   dma_addr_t *addr,
> +				   dma_addr_t *addrp,
>  				   struct sg_table **s,
> -				   size_t *size,
> +				   struct iova **iovap,
> +				   size_t *sizep,
>  				   enum dma_data_direction dma_dir)
>  {
>  	struct dma_buf_attachment *attachment;
>  	struct dma_buf *dmabuf;
>  	struct sg_table *sgt;
> +	size_t size;
>  	int err;
>  
>  	dmabuf = dma_buf_get(fd);
> @@ -695,18 +724,47 @@ static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  		goto err_detach;
>  	}
>  
> -	if (sgt->nents != 1) {
> +	if (sgt->nents > 1 && !vde->domain) {
>  		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
>  
> -	*addr = sg_dma_address(sgt->sgl) + offset;
> +	if (vde->domain) {
> +		int prot = IOMMU_READ | IOMMU_WRITE;
> +		struct iova *iova;
> +		dma_addr_t addr;
> +
> +		size = (dmabuf->size - offset) >> vde->shift;
> +
> +		iova = alloc_iova(&vde->iova, size, vde->limit - 1, true);
> +		if (!iova) {
> +			err = -ENOMEM;
> +			goto err_unmap;
> +		}
> +
> +		addr = iova_dma_addr(&vde->iova, iova);
> +
> +		size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
> +				    prot);
> +		if (!size) {
> +			__free_iova(&vde->iova, iova);
> +			err = -ENXIO;
> +			goto err_unmap;
> +		}
> +
> +		*addrp = addr;
> +		*iovap = iova;
> +	} else {
> +		*addrp = sg_dma_address(sgt->sgl) + offset;
> +		size = dmabuf->size - offset;
> +	}
> +
>  	*a = attachment;
>  	*s = sgt;
>  
> -	if (size)
> -		*size = dmabuf->size - offset;
> +	if (sizep)
> +		*sizep = size;
>  
>  	return 0;
>  
> @@ -734,6 +792,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
>  				      &frame->y_sgt,
> +				      &frame->y_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		return err;
> @@ -743,6 +802,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
>  				      &frame->cb_sgt,
> +				      &frame->cb_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_y;
> @@ -752,6 +812,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
>  				      &frame->cr_sgt,
> +				      &frame->cr_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cb;
> @@ -766,6 +827,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
>  				      &frame->aux_sgt,
> +				      &frame->aux_iova,
>  				      NULL, dma_dir);
>  	if (err)
>  		goto err_release_cr;
> @@ -774,13 +836,16 @@ static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  
>  err_release_cr:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  err_release_cb:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  err_release_y:
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  
>  	return err;
>  }
> @@ -793,16 +858,20 @@ static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
>  	if (!baseline_profile)
>  		tegra_vde_detach_and_put_dmabuf(vde,
>  						frame->aux_dmabuf_attachment,
> -						frame->aux_sgt, dma_dir);
> +						frame->aux_sgt,
> +						frame->aux_iova, dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
> -					frame->cr_sgt, dma_dir);
> +					frame->cr_sgt, frame->cr_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
> -					frame->cb_sgt, dma_dir);
> +					frame->cb_sgt, frame->cb_iova,
> +					dma_dir);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
> -					frame->y_sgt, dma_dir);
> +					frame->y_sgt, frame->y_iova,
> +					dma_dir);
>  }
>  
>  static int tegra_vde_validate_frame(struct device *dev,
> @@ -923,6 +992,8 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	struct sg_table *bitstream_sgt, *secure_sgt;
>  	enum dma_data_direction dma_dir;
>  	dma_addr_t bitstream_data_addr;
> +	struct iova *bitstream_iova;
> +	struct iova *secure_iova;
>  	dma_addr_t secure_addr;
>  	dma_addr_t bsev_ptr;
>  	size_t lsize, csize;
> @@ -948,6 +1019,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  				      &bitstream_data_dmabuf_attachment,
>  				      &bitstream_data_addr,
>  				      &bitstream_sgt,
> +				      &bitstream_iova,
>  				      &bitstream_data_size,
>  				      DMA_TO_DEVICE);
>  	if (ret)
> @@ -959,6 +1031,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  					      &secure_attachment,
>  					      &secure_addr,
>  					      &secure_sgt,
> +					      &secure_iova,
>  					      &secure_size,
>  					      DMA_TO_DEVICE);
>  		if (ret)
> @@ -1095,12 +1168,13 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
>  		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> -						secure_sgt,
> +						secure_sgt, secure_iova,
>  						DMA_TO_DEVICE);
>  
>  	tegra_vde_detach_and_put_dmabuf(vde,
>  					bitstream_data_dmabuf_attachment,
> -					bitstream_sgt, DMA_TO_DEVICE);
> +					bitstream_sgt, bitstream_iova,
> +					DMA_TO_DEVICE);
>  
>  	return ret;
>  }
> @@ -1193,6 +1267,15 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	struct tegra_vde *vde;
>  	int irq, err;
>  
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +	if (dev->archdata.mapping) {
> +		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> +		arm_iommu_detach_device(dev);
> +		arm_iommu_release_mapping(mapping);
> +	}
> +#endif
> +
>  	vde = devm_kzalloc(dev, sizeof(*vde), GFP_KERNEL);
>  	if (!vde)
>  		return -ENOMEM;
> @@ -1335,6 +1418,37 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	}
>  
> +	vde->group = iommu_group_get(dev);
> +	if (vde->group) {
> +		unsigned long order;
> +
> +		vde->domain = iommu_domain_alloc(&platform_bus_type);
> +		if (!vde->domain) {
> +			iommu_group_put(vde->group);
> +			vde->group = NULL;
> +		} else {
> +			err = iova_cache_get();
> +			if (err < 0)

iova_cache_get() returns only 0 on success, so let's check for non-zero
("if (err)") like the rest of the code does, for consistency.
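
To illustrate the convention, here is a minimal userspace sketch (not the
driver code). fake_iova_cache_get() is a hypothetical stand-in for
iova_cache_get() and is assumed to follow the usual "0 on success, negative
errno on failure" contract. Under that contract "if (err)" and
"if (err < 0)" reject exactly the same values, so the shorter form loses
nothing:

```c
#include <errno.h>

/* Hypothetical stand-in: 0 on success, negative errno on failure. */
static int fake_iova_cache_get(int simulate_failure)
{
	return simulate_failure ? -ENOMEM : 0;
}

/* Style used elsewhere in the driver: any non-zero value is an error. */
static int setup_iova_cache(int simulate_failure)
{
	int err;

	err = fake_iova_cache_get(simulate_failure);
	if (err)
		return err;

	return 0;
}
```

Calling setup_iova_cache(0) takes the success path and returns 0, while
setup_iova_cache(1) propagates -ENOMEM unchanged.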

> +				goto free_domain;
> +
> +			order = __ffs(vde->domain->pgsize_bitmap);
> +
> +			init_iova_domain(&vde->iova, 1UL << order, 0);
> +			vde->shift = iova_shift(&vde->iova);
> +			vde->limit = 1 << (32 - vde->shift);
> +
> +			/*
> +			 * VDE doesn't seem to like accessing the last page of
> +			 * its 32-bit address space.
> +			 */
> +			vde->limit -= 1;
> +
> +			err = iommu_attach_group(vde->domain, vde->group);
> +			if (err < 0)

Same as above.

> +				goto put_cache;
> +		}
> +	}
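
For readers following the limit arithmetic in the hunk above, here is a
userspace sketch of the same computation. It assumes a 4 KiB smallest page
size (a pgsize_bitmap with bit 12 as its lowest set bit) and assumes the
limit handed to the IOVA allocator is an inclusive page frame number.
lowest_set_bit() stands in for the kernel's __ffs(), and all function names
here are hypothetical:

```c
#include <stdint.h>

/* Index of the lowest set bit; bitmap must be non-zero, as with __ffs(). */
static unsigned int lowest_set_bit(unsigned long bitmap)
{
	unsigned int bit = 0;

	while (!(bitmap & 1UL)) {
		bitmap >>= 1;
		bit++;
	}

	return bit;
}

/* Page frames in the 32-bit space, minus the last page (vde->limit). */
static unsigned long vde_iova_limit(unsigned long pgsize_bitmap)
{
	unsigned int shift = lowest_set_bit(pgsize_bitmap);
	unsigned long limit = 1UL << (32 - shift);

	return limit - 1;
}

/* Highest page frame number given to the allocator (vde->limit - 1). */
static unsigned long vde_iova_last_pfn(unsigned long pgsize_bitmap)
{
	return vde_iova_limit(pgsize_bitmap) - 1;
}

/* Last byte address that can be handed out under the assumptions above. */
static uint64_t vde_iova_last_addr(unsigned long pgsize_bitmap)
{
	unsigned int shift = lowest_set_bit(pgsize_bitmap);

	return (((uint64_t)vde_iova_last_pfn(pgsize_bitmap) + 1) << shift) - 1;
}
```

With 4 KiB pages this gives a limit of 0xFFFFF page frames and a highest
allocatable address of 0xFFFFEFFF, so the page starting at 0xFFFFF000 is
never handed to the VDE.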

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 08/14] staging: media: tegra-vde: Track struct device *
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-18 15:39     ` Dmitry Osipenko
  -1 siblings, 0 replies; 72+ messages in thread
From: Dmitry Osipenko @ 2018-08-18 15:39 UTC (permalink / raw)
  To: Thierry Reding, Mauro Carvalho Chehab
  Cc: linux-tegra, Greg Kroah-Hartman, linux-media, devel, Jonathan Hunter

On 13.08.2018 17:50, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The pointer to the struct device is frequently used, so store it in
> struct tegra_vde. Also, pass around a pointer to a struct tegra_vde
> instead of struct device in some cases to prepare for subsequent
> patches referencing additional data from that structure.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/staging/media/tegra-vde/tegra-vde.c | 63 ++++++++++++---------
>  1 file changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
> index 41cf86dc5dbd..2496a03fd158 100644
> --- a/drivers/staging/media/tegra-vde/tegra-vde.c
> +++ b/drivers/staging/media/tegra-vde/tegra-vde.c
> @@ -71,6 +71,7 @@ struct tegra_vde_soc {
>  };
>  
>  struct tegra_vde {
> +	struct device *dev;
>  	const struct tegra_vde_soc *soc;
>  	void __iomem *sxe;
>  	void __iomem *bsev;
> @@ -644,7 +645,7 @@ static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a,
>  	dma_buf_put(dmabuf);
>  }
>  
> -static int tegra_vde_attach_dmabuf(struct device *dev,
> +static int tegra_vde_attach_dmabuf(struct tegra_vde *vde,
>  				   int fd,
>  				   unsigned long offset,
>  				   size_t min_size,
> @@ -662,38 +663,40 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
>  
>  	dmabuf = dma_buf_get(fd);
>  	if (IS_ERR(dmabuf)) {
> -		dev_err(dev, "Invalid dmabuf FD: %d\n", fd);
> +		dev_err(vde->dev, "Invalid dmabuf FD: %d\n", fd);
>  		return PTR_ERR(dmabuf);
>  	}
>  
>  	if (dmabuf->size & (align_size - 1)) {
> -		dev_err(dev, "Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
> +		dev_err(vde->dev,
> +			"Unaligned dmabuf 0x%zX, should be aligned to 0x%zX\n",
>  			dmabuf->size, align_size);
>  		return -EINVAL;
>  	}
>  
>  	if ((u64)offset + min_size > dmabuf->size) {
> -		dev_err(dev, "Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
> +		dev_err(vde->dev,
> +			"Too small dmabuf size %zu @0x%lX, should be at least %zu\n",
>  			dmabuf->size, offset, min_size);
>  		return -EINVAL;
>  	}
>  
> -	attachment = dma_buf_attach(dmabuf, dev);
> +	attachment = dma_buf_attach(dmabuf, vde->dev);
>  	if (IS_ERR(attachment)) {
> -		dev_err(dev, "Failed to attach dmabuf\n");
> +		dev_err(vde->dev, "Failed to attach dmabuf\n");
>  		err = PTR_ERR(attachment);
>  		goto err_put;
>  	}
>  
>  	sgt = dma_buf_map_attachment(attachment, dma_dir);
>  	if (IS_ERR(sgt)) {
> -		dev_err(dev, "Failed to get dmabufs sg_table\n");
> +		dev_err(vde->dev, "Failed to get dmabufs sg_table\n");
>  		err = PTR_ERR(sgt);
>  		goto err_detach;
>  	}
>  
>  	if (sgt->nents != 1) {
> -		dev_err(dev, "Sparse DMA region is unsupported\n");
> +		dev_err(vde->dev, "Sparse DMA region is unsupported\n");
>  		err = -EINVAL;
>  		goto err_unmap;
>  	}
> @@ -717,7 +720,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
>  	return err;
>  }
>  
> -static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
> +static int tegra_vde_attach_dmabufs_to_frame(struct tegra_vde *vde,
>  					     struct video_frame *frame,
>  					     struct tegra_vde_h264_frame *src,
>  					     enum dma_data_direction dma_dir,
> @@ -726,7 +729,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  {
>  	int err;
>  
> -	err = tegra_vde_attach_dmabuf(dev, src->y_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->y_fd,
>  				      src->y_offset, lsize, SZ_256,
>  				      &frame->y_dmabuf_attachment,
>  				      &frame->y_addr,
> @@ -735,7 +738,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	if (err)
>  		return err;
>  
> -	err = tegra_vde_attach_dmabuf(dev, src->cb_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cb_fd,
>  				      src->cb_offset, csize, SZ_256,
>  				      &frame->cb_dmabuf_attachment,
>  				      &frame->cb_addr,
> @@ -744,7 +747,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	if (err)
>  		goto err_release_y;
>  
> -	err = tegra_vde_attach_dmabuf(dev, src->cr_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->cr_fd,
>  				      src->cr_offset, csize, SZ_256,
>  				      &frame->cr_dmabuf_attachment,
>  				      &frame->cr_addr,
> @@ -758,7 +761,7 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  		return 0;
>  	}
>  
> -	err = tegra_vde_attach_dmabuf(dev, src->aux_fd,
> +	err = tegra_vde_attach_dmabuf(vde, src->aux_fd,
>  				      src->aux_offset, csize, SZ_256,
>  				      &frame->aux_dmabuf_attachment,
>  				      &frame->aux_addr,
> @@ -770,33 +773,35 @@ static int tegra_vde_attach_dmabufs_to_frame(struct device *dev,
>  	return 0;
>  
>  err_release_cr:
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
>  err_release_cb:
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
>  err_release_y:
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
>  
>  	return err;
>  }
>  
> -static void tegra_vde_release_frame_dmabufs(struct video_frame *frame,
> +static void tegra_vde_release_frame_dmabufs(struct tegra_vde *vde,
> +					    struct video_frame *frame,
>  					    enum dma_data_direction dma_dir,
>  					    bool baseline_profile)
>  {
>  	if (!baseline_profile)
> -		tegra_vde_detach_and_put_dmabuf(frame->aux_dmabuf_attachment,
> +		tegra_vde_detach_and_put_dmabuf(vde,
> +						frame->aux_dmabuf_attachment,
>  						frame->aux_sgt, dma_dir);
>  
> -	tegra_vde_detach_and_put_dmabuf(frame->cr_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
>  					frame->cr_sgt, dma_dir);
>  
> -	tegra_vde_detach_and_put_dmabuf(frame->cb_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->cb_dmabuf_attachment,
>  					frame->cb_sgt, dma_dir);
>  
> -	tegra_vde_detach_and_put_dmabuf(frame->y_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde, frame->y_dmabuf_attachment,
>  					frame->y_sgt, dma_dir);
>  }
>  
> @@ -937,7 +942,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	if (ret)
>  		return ret;
>  
> -	ret = tegra_vde_attach_dmabuf(dev, ctx.bitstream_data_fd,
> +	ret = tegra_vde_attach_dmabuf(vde, ctx.bitstream_data_fd,
>  				      ctx.bitstream_data_offset,
>  				      SZ_16K, SZ_16K,
>  				      &bitstream_data_dmabuf_attachment,
> @@ -949,7 +954,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  		return ret;
>  
>  	if (vde->soc->supports_ref_pic_marking) {
> -		ret = tegra_vde_attach_dmabuf(dev, ctx.secure_fd,
> +		ret = tegra_vde_attach_dmabuf(vde, ctx.secure_fd,
>  					      ctx.secure_offset, 0, SZ_256,
>  					      &secure_attachment,
>  					      &secure_addr,
> @@ -992,7 +997,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
>  
> -		ret = tegra_vde_attach_dmabufs_to_frame(dev, &dpb_frames[i],
> +		ret = tegra_vde_attach_dmabufs_to_frame(vde, &dpb_frames[i],
>  							&frame, dma_dir,
>  							ctx.baseline_profile,
>  							lsize, csize);
> @@ -1081,7 +1086,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  	while (i--) {
>  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
>  
> -		tegra_vde_release_frame_dmabufs(&dpb_frames[i], dma_dir,
> +		tegra_vde_release_frame_dmabufs(vde, &dpb_frames[i], dma_dir,
>  						ctx.baseline_profile);
>  	}
>  
> @@ -1089,10 +1094,12 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  
>  release_bitstream_dmabuf:
>  	if (secure_attachment)
> -		tegra_vde_detach_and_put_dmabuf(secure_attachment, secure_sgt,
> +		tegra_vde_detach_and_put_dmabuf(vde, secure_attachment,
> +						secure_sgt,
>  						DMA_TO_DEVICE);
>  
> -	tegra_vde_detach_and_put_dmabuf(bitstream_data_dmabuf_attachment,
> +	tegra_vde_detach_and_put_dmabuf(vde,
> +					bitstream_data_dmabuf_attachment,
>  					bitstream_sgt, DMA_TO_DEVICE);
>  
>  	return ret;
> @@ -1190,6 +1197,8 @@ static int tegra_vde_probe(struct platform_device *pdev)
>  	if (!vde)
>  		return -ENOMEM;
>  
> +	vde->dev = &pdev->dev;
> +
>  	platform_set_drvdata(pdev, vde);
>  
>  	vde->soc = of_device_get_match_data(&pdev->dev);
> 

This patch fails to compile.

drivers/staging/media/tegra-vde/tegra-vde.c: In function ‘tegra_vde_attach_dmabufs_to_frame’:
drivers/staging/media/tegra-vde/tegra-vde.c:776:34: error: passing argument 1 of ‘tegra_vde_detach_and_put_dmabuf’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  tegra_vde_detach_and_put_dmabuf(vde, frame->cr_dmabuf_attachment,
                                  ^~~
drivers/staging/media/tegra-vde/tegra-vde.c:637:13: note: expected ‘struct dma_buf_attachment *’ but argument is of type ‘struct tegra_vde *’
 static void tegra_vde_detach_and_put_dmabuf(struct dma_buf_attachment *a
...

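The callers now pass vde as the first argument, but
tegra_vde_detach_and_put_dmabuf() only gains a struct tegra_vde * parameter
in a later patch of this series, so this patch does not build on its own.
You need to rebase this patch properly.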
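
As a rough sketch of the shape that keeps every commit building, the callee
prototype and its callers change together in one patch. The types and
function names below are hypothetical stubs, not the kernel structures:

```c
/* Hypothetical stubs standing in for the driver and dma-buf structures. */
struct vde_ctx {
	int released;
};

struct attachment {
	int attached;
};

/*
 * The prototype gains the vde parameter in the same patch that updates
 * the callers, so each commit in the series compiles by itself.
 */
static void detach_and_put(struct vde_ctx *vde, struct attachment *a)
{
	a->attached = 0;
	vde->released++;
}

/* Caller already using the new calling convention. */
static int release_frame(struct vde_ctx *vde,
			 struct attachment *y,
			 struct attachment *cb)
{
	detach_and_put(vde, cb);
	detach_and_put(vde, y);

	return vde->released;
}
```

If only release_frame() were converted and detach_and_put() kept its old
single-argument prototype, the compiler would report the same kind of
incompatible pointer type error as shown above.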

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/14] staging: media: tegra-vde: Prepare for interlacing support
  2018-08-13 14:50   ` Thierry Reding
@ 2018-08-30  8:56     ` Dan Carpenter
  -1 siblings, 0 replies; 72+ messages in thread
From: Dan Carpenter @ 2018-08-30  8:56 UTC (permalink / raw)
  To: Thierry Reding
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, Mauro Carvalho Chehab, linux-media

On Mon, Aug 13, 2018 at 04:50:16PM +0200, Thierry Reding wrote:
>  static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
> +					unsigned int num_ref_pics,
>  					struct video_frame *dpb_frames,
>  					unsigned int ref_frames_nb,
>  					unsigned int with_earlier_poc_nb)
> @@ -251,13 +260,17 @@ static void tegra_vde_setup_iram_tables(struct tegra_vde *vde,
>  	u32 value, aux_addr;
>  	int with_later_poc_nb;
>  	unsigned int i, k;
> +	size_t size;
> +
> +	size = num_ref_pics * 4 * 8;
> +	memset(vde->iram, 0, size);

I can't get behind the magical size calculation...  :(
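
One way to make the calculation self-documenting is to name the factors. In
the sketch below the macro names and their meanings are assumptions, since
the patch does not say what 4 and 8 stand for: 4 is taken to be the number
of IRAM tables and 8 the size in bytes of one table entry (two 32-bit
words).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed meanings; the patch only has the bare literals 4 and 8. */
#define VDE_IRAM_NUM_TABLES	4
#define VDE_IRAM_ENTRY_SIZE	(2 * sizeof(uint32_t))	/* 8 bytes */

/* Same value as num_ref_pics * 4 * 8, with the factors named. */
static size_t vde_iram_tables_size(unsigned int num_ref_pics)
{
	return (size_t)num_ref_pics * VDE_IRAM_NUM_TABLES *
	       VDE_IRAM_ENTRY_SIZE;
}
```

With such a helper the memset() call would read
memset(vde->iram, 0, vde_iram_tables_size(num_ref_pics)), and a comment on
the macros could explain the table layout once.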

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/14] staging: media: tegra-vdea: Add Tegra124 support
  2018-08-13 14:50 ` Thierry Reding
@ 2018-09-03 12:18   ` Hans Verkuil
  -1 siblings, 0 replies; 72+ messages in thread
From: Hans Verkuil @ 2018-09-03 12:18 UTC (permalink / raw)
  To: Thierry Reding, Mauro Carvalho Chehab
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, linux-media

Hi Thierry, Dmitry,

Dmitry found some issues, so I'll wait for a v2.

Anyway, this driver is in staging with this TODO:

- Implement V4L2 API once it gains support for stateless decoders.

I just wanted to mention that the Request API is expected to be merged
for 4.20. A topic branch is here:

https://git.linuxtv.org/media_tree.git/log/?h=request_api

This patch series is expected to be added to the topic branch once
everyone agrees:

https://www.spinics.net/lists/linux-media/msg139713.html

The first Allwinner driver that will be using this API is here:

https://lwn.net/Articles/763589/

It's expected to be merged for 4.20 as well.

Preliminary H264 work for the Allwinner driver is here:

https://lkml.org/lkml/2018/6/13/399

But this needs more work.

HEVC support, on the other hand, is almost ready:

https://lkml.org/lkml/2018/8/28/229

I hope these links give a good overview of the current status.

Regards,

	Hans

On 08/13/2018 04:50 PM, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Hi,
> 
> this set of patches performs a bit of cleanup and extends support to the
> VDE implementation found on Tegra114 and Tegra124. This requires adding
> handling for a clock and a reset for the BSEV block that is separate
> from the main VDE block. The new VDE revision also supports reference
> picture marking, which requires that the BSEV writes out some related
> data to a memory location. Since the supported tiling layouts have been
> changed in Tegra124, which supports only block-linear and no pitch-
> linear layouts, a new way is added to request a specific layout for the
> decoded frames. Both of the above changes require breaking the ABI to
> accommodate the new data in the custom IOCTL.
> 
> Finally this set also adds support for dealing with an IOMMU, which
> makes it more convenient to deal with imported buffers since they no
> longer need to be physically contiguous.
> 
> Userspace changes for the updated ABI are available here:
> 
> 	https://cgit.freedesktop.org/~tagr/libvdpau-tegra/commit/
> 
> Mauro, I'm sending the device tree changes as part of the series for
> completeness, but I expect to pick those up into the Tegra tree once
> this has been reviewed and you've applied the driver changes.
> 
> Thanks,
> Thierry
> 
> Thierry Reding (14):
>   staging: media: tegra-vde: Support BSEV clock and reset
>   staging: media: tegra-vde: Support reference picture marking
>   staging: media: tegra-vde: Prepare for interlacing support
>   staging: media: tegra-vde: Use DRM/KMS framebuffer modifiers
>   staging: media: tegra-vde: Properly mark invalid entries
>   staging: media: tegra-vde: Print out invalid FD
>   staging: media: tegra-vde: Add some clarifying comments
>   staging: media: tegra-vde: Track struct device *
>   staging: media: tegra-vde: Add IOMMU support
>   staging: media: tegra-vde: Keep VDE in reset when unused
>   ARM: tegra: Enable VDE on Tegra124
>   ARM: tegra: Add BSEV clock and reset for VDE on Tegra20
>   ARM: tegra: Add BSEV clock and reset for VDE on Tegra30
>   ARM: tegra: Enable SMMU for VDE on Tegra124
> 
>  arch/arm/boot/dts/tegra124.dtsi             |  42 ++
>  arch/arm/boot/dts/tegra20.dtsi              |  10 +-
>  arch/arm/boot/dts/tegra30.dtsi              |  10 +-
>  drivers/staging/media/tegra-vde/tegra-vde.c | 528 +++++++++++++++++---
>  drivers/staging/media/tegra-vde/uapi.h      |   6 +-
>  5 files changed, 511 insertions(+), 85 deletions(-)
> 

* Re: [PATCH 00/14] staging: media: tegra-vdea: Add Tegra124 support
  2018-09-03 12:18   ` Hans Verkuil
@ 2018-09-03 13:12     ` Thierry Reding
  -1 siblings, 0 replies; 72+ messages in thread
From: Thierry Reding @ 2018-09-03 13:12 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: devel, Greg Kroah-Hartman, Jonathan Hunter, linux-tegra,
	Dmitry Osipenko, Mauro Carvalho Chehab, linux-media


On Mon, Sep 03, 2018 at 02:18:15PM +0200, Hans Verkuil wrote:
> Hi Thierry, Dmitry,
> 
> Dmitry found some issues, so I'll wait for a v2.
> 
> Anyway, this driver is in staging with this TODO:
> 
> - Implement V4L2 API once it gains support for stateless decoders.
> 
> I just wanted to mention that the Request API is expected to be merged
> for 4.20. A topic branch is here:
> 
> https://git.linuxtv.org/media_tree.git/log/?h=request_api
> 
> This patch series is expected to be added to the topic branch once
> everyone agrees:
> 
> https://www.spinics.net/lists/linux-media/msg139713.html
> 
> The first Allwinner driver that will be using this API is here:
> 
> https://lwn.net/Articles/763589/
> 
> It's expected to be merged for 4.20 as well.
> 
> Preliminary H264 work for the Allwinner driver is here:
> 
> https://lkml.org/lkml/2018/6/13/399
> 
> But this needs more work.
> 
> HEVC support, on the other hand, is almost ready:
> 
> https://lkml.org/lkml/2018/8/28/229
> 
> I hope these links give a good overview of the current status.

Thanks for those links. I was aware of the ongoing efforts and was
eagerly waiting for the various pieces to settle a bit. I will hopefully
get around to porting the tegra-vde driver to this new infrastructure in
the next couple of weeks.

Thierry

Thread overview: 72+ messages
2018-08-13 14:50 [PATCH 00/14] staging: media: tegra-vdea: Add Tegra124 support Thierry Reding
2018-08-13 14:50 ` Thierry Reding
2018-08-13 14:50 ` [PATCH 01/14] staging: media: tegra-vde: Support BSEV clock and reset Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-13 15:09   ` Dmitry Osipenko
2018-08-13 15:09     ` Dmitry Osipenko
2018-08-14 14:21     ` Thierry Reding
2018-08-14 14:21       ` Thierry Reding
2018-08-14 15:05       ` Dmitry Osipenko
2018-08-14 15:05         ` Dmitry Osipenko
2018-08-14 15:16         ` Dmitry Osipenko
2018-08-14 15:16           ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 02/14] staging: media: tegra-vde: Support reference picture marking Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:48   ` Dmitry Osipenko
2018-08-18 12:48     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 03/14] staging: media: tegra-vde: Prepare for interlacing support Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:48   ` Dmitry Osipenko
2018-08-18 12:48     ` Dmitry Osipenko
2018-08-30  8:56   ` Dan Carpenter
2018-08-30  8:56     ` Dan Carpenter
2018-08-13 14:50 ` [PATCH 04/14] staging: media: tegra-vde: Use DRM/KMS framebuffer modifiers Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:53   ` Dmitry Osipenko
2018-08-18 12:53     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 05/14] staging: media: tegra-vde: Properly mark invalid entries Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:45   ` Dmitry Osipenko
2018-08-18 12:45     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 06/14] staging: media: tegra-vde: Print out invalid FD Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:45   ` Dmitry Osipenko
2018-08-18 12:45     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 07/14] staging: media: tegra-vde: Add some clarifying comments Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:50   ` Dmitry Osipenko
2018-08-18 12:50     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 08/14] staging: media: tegra-vde: Track struct device * Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:49   ` Dmitry Osipenko
2018-08-18 12:49     ` Dmitry Osipenko
2018-08-18 15:39   ` Dmitry Osipenko
2018-08-18 15:39     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 09/14] staging: media: tegra-vde: Add IOMMU support Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:50   ` Dmitry Osipenko
2018-08-18 12:50     ` Dmitry Osipenko
2018-08-18 13:07   ` Dmitry Osipenko
2018-08-18 13:07     ` Dmitry Osipenko
2018-08-18 13:29   ` Dmitry Osipenko
2018-08-18 13:29     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 10/14] staging: media: tegra-vde: Keep VDE in reset when unused Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:50   ` Dmitry Osipenko
2018-08-18 12:50     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 11/14] ARM: tegra: Enable VDE on Tegra124 Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:45   ` Dmitry Osipenko
2018-08-18 12:45     ` Dmitry Osipenko
2018-08-13 14:50 ` [PATCH 12/14] ARM: tegra: Add BSEV clock and reset for VDE on Tegra20 Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-13 14:50 ` [PATCH 13/14] ARM: tegra: Add BSEV clock and reset for VDE on Tegra30 Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-13 14:50 ` [PATCH 14/14] ARM: tegra: Enable SMMU for VDE on Tegra124 Thierry Reding
2018-08-13 14:50   ` Thierry Reding
2018-08-18 12:45   ` Dmitry Osipenko
2018-08-18 12:45     ` Dmitry Osipenko
2018-09-03 12:18 ` [PATCH 00/14] staging: media: tegra-vdea: Add Tegra124 support Hans Verkuil
2018-09-03 12:18   ` Hans Verkuil
2018-09-03 13:12   ` Thierry Reding
2018-09-03 13:12     ` Thierry Reding
