All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/9] vsp1: TLB optimisation and DL caching
@ 2018-02-28 20:52 Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 1/9] v4l: vsp1: Reword uses of 'fragment' as 'body' Kieran Bingham
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Each display list currently allocates an area of DMA memory to store register
settings for the VSP1 to process. Each of these allocations adds pressure to
the IPMMU TLB entries.

We can reduce the pressure by pre-allocating larger areas and dividing the area
across multiple bodies represented as a pool.

With this reconfiguration of bodies, we can adapt the configuration code to
separate out constant hardware configuration and cache it for re-use.

The patches provided in this series can be found at:
  git://git.kernel.org/pub/scm/linux/kernel/git/kbingham/rcar.git  tags/vsp1/tlb-optimise/v6

Changelog:
----------

v6:
 - Rebased on to linux-media/master (v4.16-rc1)
 - Removed DRM/UIF (DISCOM/ColorKey) updates

v5:
 - Rebased on to renesas-drivers-2018-01-09-v4.15-rc7 to fix conflicts
   with DRM and UIF updates on VSP1 driver

v4:
 - Rebased to v4.14
 * v4l: vsp1: Use reference counting for bodies
   - Fix up reference handling comments

 * v4l: vsp1: Provide a body pool
   - Provide comment explaining extra allocation on body pool
     highlighting area for optimisation later.

 * v4l: vsp1: Refactor display list configure operations
   - Fix up comment to describe yuv_mode caching rather than format

 * vsp1: Adapt entities to configure into a body
   - Rename vsp1_dl_list_get_body() to vsp1_dl_list_get_body0()

 * v4l: vsp1: Move video configuration to a cached dlb
   - Adjust pipe configured flag to be reset on resume rather than suspend
   - rename dl_child, dl_next

Testing:
--------
The VSP unit tests have been run on this patch set with the following results:

--- Test loop 1 ---
- vsp-unit-test-0000.sh
Test Conditions:
  Platform          Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+
  Kernel release    4.16.0-rc1-arm64-renesas-00166-ge0ad4e839fff
  convert           /usr/bin/convert
  compare           /usr/bin/compare
  killall           /usr/bin/killall
  raw2rgbpnm        /usr/bin/raw2rgbpnm
  stress            /usr/bin/stress
  yavta             /usr/bin/yavta
- vsp-unit-test-0001.sh
Testing WPF packing in RGB332: pass
Testing WPF packing in ARGB555: pass
Testing WPF packing in XRGB555: pass
Testing WPF packing in RGB565: pass
Testing WPF packing in BGR24: pass
Testing WPF packing in RGB24: pass
Testing WPF packing in ABGR32: pass
Testing WPF packing in ARGB32: pass
Testing WPF packing in XBGR32: pass
Testing WPF packing in XRGB32: pass
- vsp-unit-test-0002.sh
Testing WPF packing in NV12M: pass
Testing WPF packing in NV16M: pass
Testing WPF packing in NV21M: pass
Testing WPF packing in NV61M: pass
Testing WPF packing in UYVY: pass
Testing WPF packing in VYUY: skip
Testing WPF packing in YUV420M: pass
Testing WPF packing in YUV422M: pass
Testing WPF packing in YUV444M: pass
Testing WPF packing in YVU420M: pass
Testing WPF packing in YVU422M: pass
Testing WPF packing in YVU444M: pass
Testing WPF packing in YUYV: pass
Testing WPF packing in YVYU: pass
- vsp-unit-test-0003.sh
Testing scaling from 640x640 to 640x480 in RGB24: pass
Testing scaling from 1024x768 to 640x480 in RGB24: pass
Testing scaling from 640x480 to 1024x768 in RGB24: pass
Testing scaling from 640x640 to 640x480 in YUV444M: pass
Testing scaling from 1024x768 to 640x480 in YUV444M: pass
Testing scaling from 640x480 to 1024x768 in YUV444M: pass
- vsp-unit-test-0004.sh
Testing histogram in RGB24: pass
Testing histogram in YUV444M: pass
- vsp-unit-test-0005.sh
Testing RPF.0: pass
Testing RPF.1: pass
Testing RPF.2: pass
Testing RPF.3: pass
Testing RPF.4: pass
- vsp-unit-test-0006.sh
Testing invalid pipeline with no RPF: pass
Testing invalid pipeline with no WPF: pass
- vsp-unit-test-0007.sh
Testing BRU in RGB24 with 1 inputs: pass
Testing BRU in RGB24 with 2 inputs: pass
Testing BRU in RGB24 with 3 inputs: pass
Testing BRU in RGB24 with 4 inputs: pass
Testing BRU in RGB24 with 5 inputs: pass
Testing BRU in YUV444M with 1 inputs: pass
Testing BRU in YUV444M with 2 inputs: pass
Testing BRU in YUV444M with 3 inputs: pass
Testing BRU in YUV444M with 4 inputs: pass
Testing BRU in YUV444M with 5 inputs: pass
- vsp-unit-test-0008.sh
Test requires unavailable feature set `bru rpf.0 uds wpf.0': skipped
- vsp-unit-test-0009.sh
Test requires unavailable feature set `rpf.0 wpf.0 wpf.1': skipped
- vsp-unit-test-0010.sh
Testing CLU in RGB24 with zero configuration: pass
Testing CLU in RGB24 with identity configuration: pass
Testing CLU in RGB24 with wave configuration: pass
Testing CLU in YUV444M with zero configuration: pass
Testing CLU in YUV444M with identity configuration: pass
Testing CLU in YUV444M with wave configuration: pass
Testing LUT in RGB24 with zero configuration: pass
Testing LUT in RGB24 with identity configuration: pass
Testing LUT in RGB24 with gamma configuration: pass
Testing LUT in YUV444M with zero configuration: pass
Testing LUT in YUV444M with identity configuration: pass
Testing LUT in YUV444M with gamma configuration: pass
- vsp-unit-test-0011.sh
Testing  hflip=0 vflip=0 rotate=0: pass
Testing  hflip=1 vflip=0 rotate=0: pass
Testing  hflip=0 vflip=1 rotate=0: pass
Testing  hflip=1 vflip=1 rotate=0: pass
Testing  hflip=0 vflip=0 rotate=90: pass
Testing  hflip=1 vflip=0 rotate=90: pass
Testing  hflip=0 vflip=1 rotate=90: pass
Testing  hflip=1 vflip=1 rotate=90: pass
- vsp-unit-test-0012.sh
Testing hflip: pass
Testing vflip: pass
- vsp-unit-test-0013.sh
Testing RPF unpacking in RGB332: pass
Testing RPF unpacking in ARGB555: pass
Testing RPF unpacking in XRGB555: pass
Testing RPF unpacking in RGB565: pass
Testing RPF unpacking in BGR24: pass
Testing RPF unpacking in RGB24: pass
Testing RPF unpacking in ABGR32: pass
Testing RPF unpacking in ARGB32: pass
Testing RPF unpacking in XBGR32: pass
Testing RPF unpacking in XRGB32: pass
- vsp-unit-test-0014.sh
Testing RPF unpacking in NV12M: pass
Testing RPF unpacking in NV16M: pass
Testing RPF unpacking in NV21M: pass
Testing RPF unpacking in NV61M: pass
Testing RPF unpacking in UYVY: pass
Testing RPF unpacking in VYUY: skip
Testing RPF unpacking in YUV420M: pass
Testing RPF unpacking in YUV422M: pass
Testing RPF unpacking in YUV444M: pass
Testing RPF unpacking in YVU420M: pass
Testing RPF unpacking in YVU422M: pass
Testing RPF unpacking in YVU444M: pass
Testing RPF unpacking in YUYV: pass
Testing RPF unpacking in YVYU: pass
- vsp-unit-test-0015.sh
Testing SRU scaling from 1024x768 to 1024x768 in RGB24: pass
Testing SRU scaling from 1024x768 to 2048x1536 in RGB24: pass
Testing SRU scaling from 1024x768 to 1024x768 in YUV444M: pass
Testing SRU scaling from 1024x768 to 2048x1536 in YUV444M: pass
- vsp-unit-test-0016.sh
Testing  hflip=0 vflip=0 rotate=0 640x480 -> 640x480: pass
Testing  hflip=0 vflip=0 rotate=0 640x480 -> 1024x768: pass
Testing  hflip=0 vflip=0 rotate=0 1024x768 -> 640x480: pass
Testing  hflip=1 vflip=0 rotate=0 640x480 -> 640x480: pass
Testing  hflip=1 vflip=0 rotate=0 640x480 -> 1024x768: pass
Testing  hflip=1 vflip=0 rotate=0 1024x768 -> 640x480: pass
Testing  hflip=0 vflip=1 rotate=0 640x480 -> 640x480: pass
Testing  hflip=0 vflip=1 rotate=0 640x480 -> 1024x768: pass
Testing  hflip=0 vflip=1 rotate=0 1024x768 -> 640x480: pass
Testing  hflip=1 vflip=1 rotate=0 640x480 -> 640x480: pass
Testing  hflip=1 vflip=1 rotate=0 640x480 -> 1024x768: pass
Testing  hflip=1 vflip=1 rotate=0 1024x768 -> 640x480: pass
Testing  hflip=0 vflip=0 rotate=90 640x480 -> 640x480: pass
Testing  hflip=0 vflip=0 rotate=90 640x480 -> 1024x768: pass
Testing  hflip=0 vflip=0 rotate=90 1024x768 -> 640x480: pass
Testing  hflip=1 vflip=0 rotate=90 640x480 -> 640x480: pass
Testing  hflip=1 vflip=0 rotate=90 640x480 -> 1024x768: pass
Testing  hflip=1 vflip=0 rotate=90 1024x768 -> 640x480: pass
Testing  hflip=0 vflip=1 rotate=90 640x480 -> 640x480: pass
Testing  hflip=0 vflip=1 rotate=90 640x480 -> 1024x768: pass
Testing  hflip=0 vflip=1 rotate=90 1024x768 -> 640x480: pass
Testing  hflip=1 vflip=1 rotate=90 640x480 -> 640x480: pass
Testing  hflip=1 vflip=1 rotate=90 640x480 -> 1024x768: pass
Testing  hflip=1 vflip=1 rotate=90 1024x768 -> 640x480: pass
- vsp-unit-test-0017.sh
- vsp-unit-test-0018.sh
Testing RPF crop from (0,0)/512x384: pass
Testing RPF crop from (32,32)/512x384: pass
Testing RPF crop from (32,64)/512x384: pass
Testing RPF crop from (64,32)/512x384: pass
- vsp-unit-test-0019.sh
- vsp-unit-test-0020.sh
- vsp-unit-test-0021.sh
Testing WPF packing in RGB332 during stress testing: pass
Testing WPF packing in ARGB555 during stress testing: pass
Testing WPF packing in XRGB555 during stress testing: pass
Testing WPF packing in RGB565 during stress testing: pass
Testing WPF packing in BGR24 during stress testing: pass
Testing WPF packing in RGB24 during stress testing: pass
Testing WPF packing in ABGR32 during stress testing: pass
Testing WPF packing in ARGB32 during stress testing: pass
Testing WPF packing in XBGR32 during stress testing: pass
Testing WPF packing in XRGB32 during stress testing: pass
./vsp-unit-test-0021.sh: line 34: 13658 Killed                  stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M
- vsp-unit-test-0022.sh
Testing long duration pipelines under stress: pass
./vsp-unit-test-0022.sh: line 38: 15624 Killed                  stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M
- vsp-unit-test-0023.sh
Testing histogram HGT with hue areas 0,255,255,255,255,255,255,255,255,255,255,255: pass
Testing histogram HGT with hue areas 0,40,40,80,80,120,120,160,160,200,200,255: pass
Testing histogram HGT with hue areas 220,40,40,80,80,120,120,160,160,200,200,220: pass
Testing histogram HGT with hue areas 0,10,50,60,100,110,150,160,200,210,250,255: pass
Testing histogram HGT with hue areas 10,20,50,60,100,110,150,160,200,210,230,240: pass
Testing histogram HGT with hue areas 240,20,60,80,100,120,140,160,180,200,210,220: pass
- vsp-unit-test-0024.sh
Test requires unavailable feature set `rpf.0 rpf.1 brs wpf.0': skipped
158 tests: 142 passed, 0 failed, 3 skipped

Kieran Bingham (9):
  v4l: vsp1: Reword uses of 'fragment' as 'body'
  v4l: vsp1: Protect bodies against overflow
  v4l: vsp1: Provide a body pool
  v4l: vsp1: Convert display lists to use new body pool
  v4l: vsp1: Use reference counting for bodies
  v4l: vsp1: Refactor display list configure operations
  v4l: vsp1: Adapt entities to configure into a body
  v4l: vsp1: Move video configuration to a cached dlb
  v4l: vsp1: Reduce display list body size

 drivers/media/platform/vsp1/vsp1_bru.c    |  32 +--
 drivers/media/platform/vsp1/vsp1_clu.c    |  94 +++---
 drivers/media/platform/vsp1/vsp1_clu.h    |   1 +-
 drivers/media/platform/vsp1/vsp1_dl.c     | 393 +++++++++++++----------
 drivers/media/platform/vsp1/vsp1_dl.h     |  20 +-
 drivers/media/platform/vsp1/vsp1_drm.c    |  37 +--
 drivers/media/platform/vsp1/vsp1_entity.c |  23 +-
 drivers/media/platform/vsp1/vsp1_entity.h |  31 +--
 drivers/media/platform/vsp1/vsp1_hgo.c    |  26 +--
 drivers/media/platform/vsp1/vsp1_hgt.c    |  28 +--
 drivers/media/platform/vsp1/vsp1_hsit.c   |  20 +-
 drivers/media/platform/vsp1/vsp1_lif.c    |  25 +-
 drivers/media/platform/vsp1/vsp1_lut.c    |  71 ++--
 drivers/media/platform/vsp1/vsp1_lut.h    |   1 +-
 drivers/media/platform/vsp1/vsp1_pipe.c   |  11 +-
 drivers/media/platform/vsp1/vsp1_pipe.h   |   7 +-
 drivers/media/platform/vsp1/vsp1_rpf.c    | 179 +++++-----
 drivers/media/platform/vsp1/vsp1_sru.c    |  24 +-
 drivers/media/platform/vsp1/vsp1_uds.c    |  73 ++--
 drivers/media/platform/vsp1/vsp1_uds.h    |   2 +-
 drivers/media/platform/vsp1/vsp1_video.c  |  82 ++---
 drivers/media/platform/vsp1/vsp1_video.h  |   2 +-
 drivers/media/platform/vsp1/vsp1_wpf.c    | 325 +++++++++----------
 23 files changed, 826 insertions(+), 681 deletions(-)

base-commit: e3e389f931a14ddf43089c7db92fc5d74edf93a4
-- 
git-series 0.9.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v6 1/9] v4l: vsp1: Reword uses of 'fragment' as 'body'
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 2/9] v4l: vsp1: Protect bodies against overflow Kieran Bingham
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Throughout the codebase, the term 'fragment' is used to represent a
display list body. This term duplicates the 'body' which is already in
use.

The datasheet references these objects as a body, therefore replace all
mentions of a fragment with a body, along with the corresponding
pluralised terms.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
---
 drivers/media/platform/vsp1/vsp1_clu.c |  10 +-
 drivers/media/platform/vsp1/vsp1_dl.c  | 107 ++++++++++++--------------
 drivers/media/platform/vsp1/vsp1_dl.h  |  14 +--
 drivers/media/platform/vsp1/vsp1_lut.c |   8 +-
 4 files changed, 69 insertions(+), 70 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_clu.c b/drivers/media/platform/vsp1/vsp1_clu.c
index f2fb26e5ab4e..9621afa3658c 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.c
+++ b/drivers/media/platform/vsp1/vsp1_clu.c
@@ -47,19 +47,19 @@ static int clu_set_table(struct vsp1_clu *clu, struct v4l2_ctrl *ctrl)
 	struct vsp1_dl_body *dlb;
 	unsigned int i;
 
-	dlb = vsp1_dl_fragment_alloc(clu->entity.vsp1, 1 + 17 * 17 * 17);
+	dlb = vsp1_dl_body_alloc(clu->entity.vsp1, 1 + 17 * 17 * 17);
 	if (!dlb)
 		return -ENOMEM;
 
-	vsp1_dl_fragment_write(dlb, VI6_CLU_ADDR, 0);
+	vsp1_dl_body_write(dlb, VI6_CLU_ADDR, 0);
 	for (i = 0; i < 17 * 17 * 17; ++i)
-		vsp1_dl_fragment_write(dlb, VI6_CLU_DATA, ctrl->p_new.p_u32[i]);
+		vsp1_dl_body_write(dlb, VI6_CLU_DATA, ctrl->p_new.p_u32[i]);
 
 	spin_lock_irq(&clu->lock);
 	swap(clu->clu, dlb);
 	spin_unlock_irq(&clu->lock);
 
-	vsp1_dl_fragment_free(dlb);
+	vsp1_dl_body_free(dlb);
 	return 0;
 }
 
@@ -256,7 +256,7 @@ static void clu_configure(struct vsp1_entity *entity,
 		spin_unlock_irqrestore(&clu->lock, flags);
 
 		if (dlb)
-			vsp1_dl_list_add_fragment(dl, dlb);
+			vsp1_dl_list_add_body(dl, dlb);
 		break;
 	}
 }
diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index 0b86ed01e85d..0c1bd17f9281 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -69,7 +69,7 @@ struct vsp1_dl_body {
  * @header: display list header, NULL for headerless lists
  * @dma: DMA address for the header
  * @body0: first display list body
- * @fragments: list of extra display list bodies
+ * @bodies: list of extra display list bodies
  * @has_chain: if true, indicates that there's a partition chain
  * @chain: entry in the display list partition chain
  */
@@ -81,7 +81,7 @@ struct vsp1_dl_list {
 	dma_addr_t dma;
 
 	struct vsp1_dl_body body0;
-	struct list_head fragments;
+	struct list_head bodies;
 
 	bool has_chain;
 	struct list_head chain;
@@ -98,13 +98,13 @@ enum vsp1_dl_mode {
  * @mode: display list operation mode (header or headerless)
  * @singleshot: execute the display list in single-shot mode
  * @vsp1: the VSP1 device
- * @lock: protects the free, active, queued, pending and gc_fragments lists
+ * @lock: protects the free, active, queued, pending and gc_bodies lists
  * @free: array of all free display lists
  * @active: list currently being processed (loaded) by hardware
  * @queued: list queued to the hardware (written to the DL registers)
  * @pending: list waiting to be queued to the hardware
- * @gc_work: fragments garbage collector work struct
- * @gc_fragments: array of display list fragments waiting to be freed
+ * @gc_work: bodies garbage collector work struct
+ * @gc_bodies: array of display list bodies waiting to be freed
  */
 struct vsp1_dl_manager {
 	unsigned int index;
@@ -119,7 +119,7 @@ struct vsp1_dl_manager {
 	struct vsp1_dl_list *pending;
 
 	struct work_struct gc_work;
-	struct list_head gc_fragments;
+	struct list_head gc_bodies;
 };
 
 /* -----------------------------------------------------------------------------
@@ -157,17 +157,16 @@ static void vsp1_dl_body_cleanup(struct vsp1_dl_body *dlb)
 }
 
 /**
- * vsp1_dl_fragment_alloc - Allocate a display list fragment
+ * vsp1_dl_body_alloc - Allocate a display list body
  * @vsp1: The VSP1 device
- * @num_entries: The maximum number of entries that the fragment can contain
+ * @num_entries: The maximum number of entries that the body can contain
  *
- * Allocate a display list fragment with enough memory to contain the requested
+ * Allocate a display list body with enough memory to contain the requested
  * number of entries.
  *
- * Return a pointer to a fragment on success or NULL if memory can't be
- * allocated.
+ * Return a pointer to a body on success or NULL if memory can't be allocated.
  */
-struct vsp1_dl_body *vsp1_dl_fragment_alloc(struct vsp1_device *vsp1,
+struct vsp1_dl_body *vsp1_dl_body_alloc(struct vsp1_device *vsp1,
 					    unsigned int num_entries)
 {
 	struct vsp1_dl_body *dlb;
@@ -187,20 +186,20 @@ struct vsp1_dl_body *vsp1_dl_fragment_alloc(struct vsp1_device *vsp1,
 }
 
 /**
- * vsp1_dl_fragment_free - Free a display list fragment
- * @dlb: The fragment
+ * vsp1_dl_body_free - Free a display list body
+ * @dlb: The body
  *
- * Free the given display list fragment and the associated DMA memory.
+ * Free the given display list body and the associated DMA memory.
  *
- * Fragments must only be freed explicitly if they are not added to a display
+ * Bodies must only be freed explicitly if they are not added to a display
  * list, as the display list will take ownership of them and free them
- * otherwise. Manual free typically happens at cleanup time for fragments that
+ * otherwise. Manual free typically happens at cleanup time for bodies that
  * have been allocated but not used.
  *
  * Passing a NULL pointer to this function is safe, in that case no operation
  * will be performed.
  */
-void vsp1_dl_fragment_free(struct vsp1_dl_body *dlb)
+void vsp1_dl_body_free(struct vsp1_dl_body *dlb)
 {
 	if (!dlb)
 		return;
@@ -210,16 +209,16 @@ void vsp1_dl_fragment_free(struct vsp1_dl_body *dlb)
 }
 
 /**
- * vsp1_dl_fragment_write - Write a register to a display list fragment
- * @dlb: The fragment
+ * vsp1_dl_body_write - Write a register to a display list body
+ * @dlb: The body
  * @reg: The register address
  * @data: The register value
  *
- * Write the given register and value to the display list fragment. The maximum
- * number of entries that can be written in a fragment is specified when the
- * fragment is allocated by vsp1_dl_fragment_alloc().
+ * Write the given register and value to the display list body. The maximum
+ * number of entries that can be written in a body is specified when the body is
+ * allocated by vsp1_dl_body_alloc().
  */
-void vsp1_dl_fragment_write(struct vsp1_dl_body *dlb, u32 reg, u32 data)
+void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
 	dlb->entries[dlb->num_entries].addr = reg;
 	dlb->entries[dlb->num_entries].data = data;
@@ -240,7 +239,7 @@ static struct vsp1_dl_list *vsp1_dl_list_alloc(struct vsp1_dl_manager *dlm)
 	if (!dl)
 		return NULL;
 
-	INIT_LIST_HEAD(&dl->fragments);
+	INIT_LIST_HEAD(&dl->bodies);
 	dl->dlm = dlm;
 
 	/*
@@ -277,7 +276,7 @@ static struct vsp1_dl_list *vsp1_dl_list_alloc(struct vsp1_dl_manager *dlm)
 static void vsp1_dl_list_free(struct vsp1_dl_list *dl)
 {
 	vsp1_dl_body_cleanup(&dl->body0);
-	list_splice_init(&dl->fragments, &dl->dlm->gc_fragments);
+	list_splice_init(&dl->bodies, &dl->dlm->gc_bodies);
 	kfree(dl);
 }
 
@@ -332,13 +331,13 @@ static void __vsp1_dl_list_put(struct vsp1_dl_list *dl)
 	dl->has_chain = false;
 
 	/*
-	 * We can't free fragments here as DMA memory can only be freed in
-	 * interruptible context. Move all fragments to the display list
-	 * manager's list of fragments to be freed, they will be
-	 * garbage-collected by the work queue.
+	 * We can't free bodies here as DMA memory can only be freed in
+	 * interruptible context. Move all bodies to the display list manager's
+	 * list of bodies to be freed, they will be garbage-collected by the
+	 * work queue.
 	 */
-	if (!list_empty(&dl->fragments)) {
-		list_splice_init(&dl->fragments, &dl->dlm->gc_fragments);
+	if (!list_empty(&dl->bodies)) {
+		list_splice_init(&dl->bodies, &dl->dlm->gc_bodies);
 		schedule_work(&dl->dlm->gc_work);
 	}
 
@@ -379,33 +378,33 @@ void vsp1_dl_list_put(struct vsp1_dl_list *dl)
  */
 void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data)
 {
-	vsp1_dl_fragment_write(&dl->body0, reg, data);
+	vsp1_dl_body_write(&dl->body0, reg, data);
 }
 
 /**
- * vsp1_dl_list_add_fragment - Add a fragment to the display list
+ * vsp1_dl_list_add_body - Add a body to the display list
  * @dl: The display list
- * @dlb: The fragment
+ * @dlb: The body
  *
- * Add a display list body as a fragment to a display list. Registers contained
- * in fragments are processed after registers contained in the main display
- * list, in the order in which fragments are added.
+ * Add a display list body as a body to a display list. Registers contained
+ * in bodies are processed after registers contained in the main display list,
+ * in the order in which bodies are added.
  *
- * Adding a fragment to a display list passes ownership of the fragment to the
- * list. The caller must not touch the fragment after this call, and must not
- * free it explicitly with vsp1_dl_fragment_free().
+ * Adding a body to a display list passes ownership of the body to the list. The
+ * caller must not touch the body after this call, and must not free it
+ * explicitly with vsp1_dl_body_free().
  *
- * Fragments are only usable for display lists in header mode. Attempt to
- * add a fragment to a header-less display list will return an error.
+ * Additional bodies are only usable for display lists in header mode.
+ * Attempting to add a body to a header-less display list will return an error.
  */
-int vsp1_dl_list_add_fragment(struct vsp1_dl_list *dl,
+int vsp1_dl_list_add_body(struct vsp1_dl_list *dl,
 			      struct vsp1_dl_body *dlb)
 {
 	/* Multi-body lists are only available in header mode. */
 	if (dl->dlm->mode != VSP1_DL_MODE_HEADER)
 		return -EINVAL;
 
-	list_add_tail(&dlb->list, &dl->fragments);
+	list_add_tail(&dlb->list, &dl->bodies);
 	return 0;
 }
 
@@ -454,7 +453,7 @@ static void vsp1_dl_list_fill_header(struct vsp1_dl_list *dl, bool is_last)
 	hdr->num_bytes = dl->body0.num_entries
 		       * sizeof(*dl->header->lists);
 
-	list_for_each_entry(dlb, &dl->fragments, list) {
+	list_for_each_entry(dlb, &dl->bodies, list) {
 		num_lists++;
 		hdr++;
 
@@ -711,25 +710,25 @@ void vsp1_dlm_reset(struct vsp1_dl_manager *dlm)
 }
 
 /*
- * Free all fragments awaiting to be garbage-collected.
+ * Free all bodies awaiting to be garbage-collected.
  *
  * This function must be called without the display list manager lock held.
  */
-static void vsp1_dlm_fragments_free(struct vsp1_dl_manager *dlm)
+static void vsp1_dlm_bodies_free(struct vsp1_dl_manager *dlm)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&dlm->lock, flags);
 
-	while (!list_empty(&dlm->gc_fragments)) {
+	while (!list_empty(&dlm->gc_bodies)) {
 		struct vsp1_dl_body *dlb;
 
-		dlb = list_first_entry(&dlm->gc_fragments, struct vsp1_dl_body,
+		dlb = list_first_entry(&dlm->gc_bodies, struct vsp1_dl_body,
 				       list);
 		list_del(&dlb->list);
 
 		spin_unlock_irqrestore(&dlm->lock, flags);
-		vsp1_dl_fragment_free(dlb);
+		vsp1_dl_body_free(dlb);
 		spin_lock_irqsave(&dlm->lock, flags);
 	}
 
@@ -741,7 +740,7 @@ static void vsp1_dlm_garbage_collect(struct work_struct *work)
 	struct vsp1_dl_manager *dlm =
 		container_of(work, struct vsp1_dl_manager, gc_work);
 
-	vsp1_dlm_fragments_free(dlm);
+	vsp1_dlm_bodies_free(dlm);
 }
 
 struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
@@ -763,7 +762,7 @@ struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
 
 	spin_lock_init(&dlm->lock);
 	INIT_LIST_HEAD(&dlm->free);
-	INIT_LIST_HEAD(&dlm->gc_fragments);
+	INIT_LIST_HEAD(&dlm->gc_bodies);
 	INIT_WORK(&dlm->gc_work, vsp1_dlm_garbage_collect);
 
 	for (i = 0; i < prealloc; ++i) {
@@ -793,5 +792,5 @@ void vsp1_dlm_destroy(struct vsp1_dl_manager *dlm)
 		vsp1_dl_list_free(dl);
 	}
 
-	vsp1_dlm_fragments_free(dlm);
+	vsp1_dlm_bodies_free(dlm);
 }
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index ee3508172f0a..d4f7695c4ed3 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -16,7 +16,7 @@
 #include <linux/types.h>
 
 struct vsp1_device;
-struct vsp1_dl_fragment;
+struct vsp1_dl_body;
 struct vsp1_dl_list;
 struct vsp1_dl_manager;
 
@@ -34,12 +34,12 @@ void vsp1_dl_list_put(struct vsp1_dl_list *dl);
 void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data);
 void vsp1_dl_list_commit(struct vsp1_dl_list *dl);
 
-struct vsp1_dl_body *vsp1_dl_fragment_alloc(struct vsp1_device *vsp1,
-					    unsigned int num_entries);
-void vsp1_dl_fragment_free(struct vsp1_dl_body *dlb);
-void vsp1_dl_fragment_write(struct vsp1_dl_body *dlb, u32 reg, u32 data);
-int vsp1_dl_list_add_fragment(struct vsp1_dl_list *dl,
-			      struct vsp1_dl_body *dlb);
+struct vsp1_dl_body *vsp1_dl_body_alloc(struct vsp1_device *vsp1,
+					unsigned int num_entries);
+void vsp1_dl_body_free(struct vsp1_dl_body *dlb);
+void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data);
+int vsp1_dl_list_add_body(struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb);
 int vsp1_dl_list_add_chain(struct vsp1_dl_list *head, struct vsp1_dl_list *dl);
 
 #endif /* __VSP1_DL_H__ */
diff --git a/drivers/media/platform/vsp1/vsp1_lut.c b/drivers/media/platform/vsp1/vsp1_lut.c
index c67cc60db0db..aa2b40327529 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.c
+++ b/drivers/media/platform/vsp1/vsp1_lut.c
@@ -44,19 +44,19 @@ static int lut_set_table(struct vsp1_lut *lut, struct v4l2_ctrl *ctrl)
 	struct vsp1_dl_body *dlb;
 	unsigned int i;
 
-	dlb = vsp1_dl_fragment_alloc(lut->entity.vsp1, 256);
+	dlb = vsp1_dl_body_alloc(lut->entity.vsp1, 256);
 	if (!dlb)
 		return -ENOMEM;
 
 	for (i = 0; i < 256; ++i)
-		vsp1_dl_fragment_write(dlb, VI6_LUT_TABLE + 4 * i,
+		vsp1_dl_body_write(dlb, VI6_LUT_TABLE + 4 * i,
 				       ctrl->p_new.p_u32[i]);
 
 	spin_lock_irq(&lut->lock);
 	swap(lut->lut, dlb);
 	spin_unlock_irq(&lut->lock);
 
-	vsp1_dl_fragment_free(dlb);
+	vsp1_dl_body_free(dlb);
 	return 0;
 }
 
@@ -212,7 +212,7 @@ static void lut_configure(struct vsp1_entity *entity,
 		spin_unlock_irqrestore(&lut->lock, flags);
 
 		if (dlb)
-			vsp1_dl_list_add_fragment(dl, dlb);
+			vsp1_dl_list_add_body(dl, dlb);
 		break;
 	}
 }
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 2/9] v4l: vsp1: Protect bodies against overflow
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 1/9] v4l: vsp1: Reword uses of 'fragment' as 'body' Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 3/9] v4l: vsp1: Provide a body pool Kieran Bingham
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

The body write function relies on the code never asking it to write more
than the entries available in the list.

Currently with each list body containing 256 entries, this is fine, but
we can reduce this number greatly saving memory. In preparation of this
add a level of protection to catch any buffer overflows.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

---

v3:
 - adapt for new 'body' terminology
 - simplify WARN_ON macro usage

 drivers/media/platform/vsp1/vsp1_dl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index 0c1bd17f9281..59fe80bf6e9d 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -50,6 +50,7 @@ struct vsp1_dl_entry {
  * @dma: DMA address of the entries
  * @size: size of the DMA memory in bytes
  * @num_entries: number of stored entries
+ * @max_entries: number of entries available
  */
 struct vsp1_dl_body {
 	struct list_head list;
@@ -60,6 +61,7 @@ struct vsp1_dl_body {
 	size_t size;
 
 	unsigned int num_entries;
+	unsigned int max_entries;
 };
 
 /**
@@ -139,6 +141,7 @@ static int vsp1_dl_body_init(struct vsp1_device *vsp1,
 
 	dlb->vsp1 = vsp1;
 	dlb->size = size;
+	dlb->max_entries = num_entries;
 
 	dlb->entries = dma_alloc_wc(vsp1->bus_master, dlb->size, &dlb->dma,
 				    GFP_KERNEL);
@@ -220,6 +223,10 @@ void vsp1_dl_body_free(struct vsp1_dl_body *dlb)
  */
 void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
+	if (WARN_ONCE(dlb->num_entries >= dlb->max_entries,
+		      "DLB size exceeded (max %u)", dlb->max_entries))
+		return;
+
 	dlb->entries[dlb->num_entries].addr = reg;
 	dlb->entries[dlb->num_entries].data = data;
 	dlb->num_entries++;
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 3/9] v4l: vsp1: Provide a body pool
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 1/9] v4l: vsp1: Reword uses of 'fragment' as 'body' Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 2/9] v4l: vsp1: Protect bodies against overflow Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 4/9] v4l: vsp1: Convert display lists to use new " Kieran Bingham
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Each display list allocates a body to store register values in a dma
accessible buffer from a dma_alloc_wc() allocation. Each of these
results in an entry in the TLB, and a large number of display list
allocations adds pressure to this resource.

Reduce TLB pressure on the IPMMUs by allocating multiple display list
bodies in a single allocation, and providing these to the display list
through a 'body pool'. A pool can be allocated by the display list
manager or entities which require their own body allocations.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>

---
v4:
 - Provide comment explaining extra allocation on body pool
   highlighting area for optimisation later.

v3:
 - s/fragment/body/, s/fragments/bodies/
 - qty -> num_bodies
 - indentation fix
 - s/vsp1_dl_body_pool_{alloc,free}/vsp1_dl_body_pool_{create,destroy}/'
 - Add kerneldoc to non-static functions

v2:
 - assign dlb->dma correctly

 drivers/media/platform/vsp1/vsp1_dl.c | 163 +++++++++++++++++++++++++++-
 drivers/media/platform/vsp1/vsp1_dl.h |   8 +-
 2 files changed, 171 insertions(+)

diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index 59fe80bf6e9d..87bc4acf8c9e 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -45,6 +45,8 @@ struct vsp1_dl_entry {
 /**
  * struct vsp1_dl_body - Display list body
  * @list: entry in the display list list of bodies
+ * @free: entry in the pool free body list
+ * @pool: pool to which this body belongs
  * @vsp1: the VSP1 device
  * @entries: array of entries
  * @dma: DMA address of the entries
@@ -54,6 +56,9 @@ struct vsp1_dl_entry {
  */
 struct vsp1_dl_body {
 	struct list_head list;
+	struct list_head free;
+
+	struct vsp1_dl_body_pool *pool;
 	struct vsp1_device *vsp1;
 
 	struct vsp1_dl_entry *entries;
@@ -65,6 +70,30 @@ struct vsp1_dl_body {
 };
 
 /**
+ * struct vsp1_dl_body_pool - display list body pool
+ * @dma: DMA address of the entries
+ * @size: size of the full DMA memory pool in bytes
+ * @mem: CPU memory pointer for the pool
+ * @bodies: Array of DLB structures for the pool
+ * @free: List of free DLB entries
+ * @lock: Protects the pool and free list
+ * @vsp1: the VSP1 device
+ */
+struct vsp1_dl_body_pool {
+	/* DMA allocation */
+	dma_addr_t dma;
+	size_t size;
+	void *mem;
+
+	/* Body management */
+	struct vsp1_dl_body *bodies;
+	struct list_head free;
+	spinlock_t lock;
+
+	struct vsp1_device *vsp1;
+};
+
+/**
  * struct vsp1_dl_list - Display list
  * @list: entry in the display list manager lists
  * @dlm: the display list manager
@@ -105,6 +134,7 @@ enum vsp1_dl_mode {
  * @active: list currently being processed (loaded) by hardware
  * @queued: list queued to the hardware (written to the DL registers)
  * @pending: list waiting to be queued to the hardware
+ * @pool: body pool for the display list bodies
  * @gc_work: bodies garbage collector work struct
  * @gc_bodies: array of display list bodies waiting to be freed
  */
@@ -120,6 +150,8 @@ struct vsp1_dl_manager {
 	struct vsp1_dl_list *queued;
 	struct vsp1_dl_list *pending;
 
+	struct vsp1_dl_body_pool *pool;
+
 	struct work_struct gc_work;
 	struct list_head gc_bodies;
 };
@@ -128,6 +160,137 @@ struct vsp1_dl_manager {
  * Display List Body Management
  */
 
+/**
+ * vsp1_dl_body_pool_create - Create a pool of bodies from a single allocation
+ * @vsp1: The VSP1 device
+ * @num_bodies: The quantity of bodies to allocate
+ * @num_entries: The maximum number of entries that the body can contain
+ * @extra_size: Extra allocation provided for the bodies
+ *
+ * Allocate a pool of display list bodies each with enough memory to contain the
+ * requested number of entries.
+ *
+ * Return a pointer to a pool on success or NULL if memory can't be allocated.
+ */
+struct vsp1_dl_body_pool *
+vsp1_dl_body_pool_create(struct vsp1_device *vsp1, unsigned int num_bodies,
+			 unsigned int num_entries, size_t extra_size)
+{
+	struct vsp1_dl_body_pool *pool;
+	size_t dlb_size;
+	unsigned int i;
+
+	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
+	if (!pool)
+		return NULL;
+
+	pool->vsp1 = vsp1;
+
+	/*
+	 * Todo: 'extra_size' is only used by vsp1_dlm_create(), to allocate
+	 * extra memory for the display list header. We need only one header per
+	 * display list, not per display list body, thus this allocation is
+	 * extraneous and should be reworked in the future.
+	 */
+	dlb_size = num_entries * sizeof(struct vsp1_dl_entry) + extra_size;
+	pool->size = dlb_size * num_bodies;
+
+	pool->bodies = kcalloc(num_bodies, sizeof(*pool->bodies), GFP_KERNEL);
+	if (!pool->bodies) {
+		kfree(pool);
+		return NULL;
+	}
+
+	pool->mem = dma_alloc_wc(vsp1->bus_master, pool->size, &pool->dma,
+				 GFP_KERNEL);
+	if (!pool->mem) {
+		kfree(pool->bodies);
+		kfree(pool);
+		return NULL;
+	}
+
+	spin_lock_init(&pool->lock);
+	INIT_LIST_HEAD(&pool->free);
+
+	for (i = 0; i < num_bodies; ++i) {
+		struct vsp1_dl_body *dlb = &pool->bodies[i];
+
+		dlb->pool = pool;
+		dlb->max_entries = num_entries;
+
+		dlb->dma = pool->dma + i * dlb_size;
+		dlb->entries = pool->mem + i * dlb_size;
+
+		list_add_tail(&dlb->free, &pool->free);
+	}
+
+	return pool;
+}
+
+/**
+ * vsp1_dl_body_pool_destroy - Release a body pool
+ * @pool: The body pool
+ *
+ * Release all components of a pool allocation.
+ */
+void vsp1_dl_body_pool_destroy(struct vsp1_dl_body_pool *pool)
+{
+	if (!pool)
+		return;
+
+	if (pool->mem)
+		dma_free_wc(pool->vsp1->bus_master, pool->size, pool->mem,
+			    pool->dma);
+
+	kfree(pool->bodies);
+	kfree(pool);
+}
+
+/**
+ * vsp1_dl_body_get - Obtain a body from a pool
+ * @pool: The body pool
+ *
+ * Obtain a body from the pool allocation without blocking.
+ *
+ * Returns a display list body or NULL if there are none available.
+ */
+struct vsp1_dl_body *vsp1_dl_body_get(struct vsp1_dl_body_pool *pool)
+{
+	struct vsp1_dl_body *dlb = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pool->lock, flags);
+
+	if (!list_empty(&pool->free)) {
+		dlb = list_first_entry(&pool->free, struct vsp1_dl_body, free);
+		list_del(&dlb->free);
+	}
+
+	spin_unlock_irqrestore(&pool->lock, flags);
+
+	return dlb;
+}
+
+/**
+ * vsp1_dl_body_put - Return a body back to its pool
+ * @dlb: The display list body
+ *
+ * Return a body back to the pool, and reset the num_entries to clear the list.
+ */
+void vsp1_dl_body_put(struct vsp1_dl_body *dlb)
+{
+	unsigned long flags;
+
+	if (!dlb)
+		return;
+
+	dlb->num_entries = 0;
+
+	spin_lock_irqsave(&dlb->pool->lock, flags);
+	list_add_tail(&dlb->free, &dlb->pool->free);
+	spin_unlock_irqrestore(&dlb->pool->lock, flags);
+}
+
 /*
  * Initialize a display list body object and allocate DMA memory for the body
  * data. The display list body object is expected to have been initialized to
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index d4f7695c4ed3..785b88472375 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -17,6 +17,7 @@
 
 struct vsp1_device;
 struct vsp1_dl_body;
+struct vsp1_dl_body_pool;
 struct vsp1_dl_list;
 struct vsp1_dl_manager;
 
@@ -34,6 +35,13 @@ void vsp1_dl_list_put(struct vsp1_dl_list *dl);
 void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data);
 void vsp1_dl_list_commit(struct vsp1_dl_list *dl);
 
+struct vsp1_dl_body_pool *
+vsp1_dl_body_pool_create(struct vsp1_device *vsp1, unsigned int num_bodies,
+			 unsigned int num_entries, size_t extra_size);
+void vsp1_dl_body_pool_destroy(struct vsp1_dl_body_pool *pool);
+struct vsp1_dl_body *vsp1_dl_body_get(struct vsp1_dl_body_pool *pool);
+void vsp1_dl_body_put(struct vsp1_dl_body *dlb);
+
 struct vsp1_dl_body *vsp1_dl_body_alloc(struct vsp1_device *vsp1,
 					unsigned int num_entries);
 void vsp1_dl_body_free(struct vsp1_dl_body *dlb);
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 4/9] v4l: vsp1: Convert display lists to use new body pool
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (2 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 3/9] v4l: vsp1: Provide a body pool Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 5/9] v4l: vsp1: Use reference counting for bodies Kieran Bingham
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Adapt the dl->body0 object to use an object from the body pool. This
greatly reduces the pressure on the TLB for IPMMU use cases, as all of
the lists use a single allocation for the main body.

The CLU and LUT objects pre-allocate a pool containing three bodies,
allowing a userspace update before the hardware has committed a previous
set of tables.

Bodies are no longer 'freed' in interrupt context, but instead released
back to their respective pools. This allows us to remove the garbage
collector in the DLM.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>

---
v3:
 - 's/fragment/body', 's/fragments/bodies/'
 - CLU/LUT now allocate 3 bodies
 - vsp1_dl_list_fragments_free -> vsp1_dl_list_bodies_put

v2:
 - Use dl->body0->max_entries to determine header offset, instead of the
   global constant VSP1_DL_NUM_ENTRIES which is incorrect.
 - squash updates for LUT, CLU, and fragment cleanup into single patch.
   (Not fully bisectable when separated)

 drivers/media/platform/vsp1/vsp1_clu.c |  27 ++-
 drivers/media/platform/vsp1/vsp1_clu.h |   1 +-
 drivers/media/platform/vsp1/vsp1_dl.c  | 223 ++++++--------------------
 drivers/media/platform/vsp1/vsp1_dl.h  |   3 +-
 drivers/media/platform/vsp1/vsp1_lut.c |  27 ++-
 drivers/media/platform/vsp1/vsp1_lut.h |   1 +-
 6 files changed, 101 insertions(+), 181 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_clu.c b/drivers/media/platform/vsp1/vsp1_clu.c
index 9621afa3658c..2018144470c5 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.c
+++ b/drivers/media/platform/vsp1/vsp1_clu.c
@@ -23,6 +23,8 @@
 #define CLU_MIN_SIZE				4U
 #define CLU_MAX_SIZE				8190U
 
+#define CLU_SIZE				(17 * 17 * 17)
+
 /* -----------------------------------------------------------------------------
  * Device Access
  */
@@ -47,19 +49,19 @@ static int clu_set_table(struct vsp1_clu *clu, struct v4l2_ctrl *ctrl)
 	struct vsp1_dl_body *dlb;
 	unsigned int i;
 
-	dlb = vsp1_dl_body_alloc(clu->entity.vsp1, 1 + 17 * 17 * 17);
+	dlb = vsp1_dl_body_get(clu->pool);
 	if (!dlb)
 		return -ENOMEM;
 
 	vsp1_dl_body_write(dlb, VI6_CLU_ADDR, 0);
-	for (i = 0; i < 17 * 17 * 17; ++i)
+	for (i = 0; i < CLU_SIZE; ++i)
 		vsp1_dl_body_write(dlb, VI6_CLU_DATA, ctrl->p_new.p_u32[i]);
 
 	spin_lock_irq(&clu->lock);
 	swap(clu->clu, dlb);
 	spin_unlock_irq(&clu->lock);
 
-	vsp1_dl_body_free(dlb);
+	vsp1_dl_body_put(dlb);
 	return 0;
 }
 
@@ -261,8 +263,16 @@ static void clu_configure(struct vsp1_entity *entity,
 	}
 }
 
+static void clu_destroy(struct vsp1_entity *entity)
+{
+	struct vsp1_clu *clu = to_clu(&entity->subdev);
+
+	vsp1_dl_body_pool_destroy(clu->pool);
+}
+
 static const struct vsp1_entity_operations clu_entity_ops = {
 	.configure = clu_configure,
+	.destroy = clu_destroy,
 };
 
 /* -----------------------------------------------------------------------------
@@ -288,6 +298,17 @@ struct vsp1_clu *vsp1_clu_create(struct vsp1_device *vsp1)
 	if (ret < 0)
 		return ERR_PTR(ret);
 
+	/*
+	 * Pre-allocate a body pool, with 3 bodies allowing a userspace update
+	 * before the hardware has committed a previous set of tables, handling
+	 * both the queued and pending dl entries. One extra entry is added to
+	 * the CLU_SIZE to allow for the VI6_CLU_ADDR header.
+	 */
+	clu->pool = vsp1_dl_body_pool_create(clu->entity.vsp1, 3, CLU_SIZE + 1,
+					     0);
+	if (!clu->pool)
+		return ERR_PTR(-ENOMEM);
+
 	/* Initialize the control handler. */
 	v4l2_ctrl_handler_init(&clu->ctrls, 2);
 	v4l2_ctrl_new_custom(&clu->ctrls, &clu_table_control, NULL);
diff --git a/drivers/media/platform/vsp1/vsp1_clu.h b/drivers/media/platform/vsp1/vsp1_clu.h
index 036e0a2f1a42..fa3fe856725b 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.h
+++ b/drivers/media/platform/vsp1/vsp1_clu.h
@@ -36,6 +36,7 @@ struct vsp1_clu {
 	spinlock_t lock;
 	unsigned int mode;
 	struct vsp1_dl_body *clu;
+	struct vsp1_dl_body_pool *pool;
 };
 
 static inline struct vsp1_clu *to_clu(struct v4l2_subdev *subdev)
diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index 87bc4acf8c9e..a069c8456666 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -111,7 +111,7 @@ struct vsp1_dl_list {
 	struct vsp1_dl_header *header;
 	dma_addr_t dma;
 
-	struct vsp1_dl_body body0;
+	struct vsp1_dl_body *body0;
 	struct list_head bodies;
 
 	bool has_chain;
@@ -135,8 +135,6 @@ enum vsp1_dl_mode {
  * @queued: list queued to the hardware (written to the DL registers)
  * @pending: list waiting to be queued to the hardware
  * @pool: body pool for the display list bodies
- * @gc_work: bodies garbage collector work struct
- * @gc_bodies: array of display list bodies waiting to be freed
  */
 struct vsp1_dl_manager {
 	unsigned int index;
@@ -151,9 +149,6 @@ struct vsp1_dl_manager {
 	struct vsp1_dl_list *pending;
 
 	struct vsp1_dl_body_pool *pool;
-
-	struct work_struct gc_work;
-	struct list_head gc_bodies;
 };
 
 /* -----------------------------------------------------------------------------
@@ -291,89 +286,6 @@ void vsp1_dl_body_put(struct vsp1_dl_body *dlb)
 	spin_unlock_irqrestore(&dlb->pool->lock, flags);
 }
 
-/*
- * Initialize a display list body object and allocate DMA memory for the body
- * data. The display list body object is expected to have been initialized to
- * 0 when allocated.
- */
-static int vsp1_dl_body_init(struct vsp1_device *vsp1,
-			     struct vsp1_dl_body *dlb, unsigned int num_entries,
-			     size_t extra_size)
-{
-	size_t size = num_entries * sizeof(*dlb->entries) + extra_size;
-
-	dlb->vsp1 = vsp1;
-	dlb->size = size;
-	dlb->max_entries = num_entries;
-
-	dlb->entries = dma_alloc_wc(vsp1->bus_master, dlb->size, &dlb->dma,
-				    GFP_KERNEL);
-	if (!dlb->entries)
-		return -ENOMEM;
-
-	return 0;
-}
-
-/*
- * Cleanup a display list body and free allocated DMA memory allocated.
- */
-static void vsp1_dl_body_cleanup(struct vsp1_dl_body *dlb)
-{
-	dma_free_wc(dlb->vsp1->bus_master, dlb->size, dlb->entries, dlb->dma);
-}
-
-/**
- * vsp1_dl_body_alloc - Allocate a display list body
- * @vsp1: The VSP1 device
- * @num_entries: The maximum number of entries that the body can contain
- *
- * Allocate a display list body with enough memory to contain the requested
- * number of entries.
- *
- * Return a pointer to a body on success or NULL if memory can't be allocated.
- */
-struct vsp1_dl_body *vsp1_dl_body_alloc(struct vsp1_device *vsp1,
-					    unsigned int num_entries)
-{
-	struct vsp1_dl_body *dlb;
-	int ret;
-
-	dlb = kzalloc(sizeof(*dlb), GFP_KERNEL);
-	if (!dlb)
-		return NULL;
-
-	ret = vsp1_dl_body_init(vsp1, dlb, num_entries, 0);
-	if (ret < 0) {
-		kfree(dlb);
-		return NULL;
-	}
-
-	return dlb;
-}
-
-/**
- * vsp1_dl_body_free - Free a display list body
- * @dlb: The body
- *
- * Free the given display list body and the associated DMA memory.
- *
- * Bodies must only be freed explicitly if they are not added to a display
- * list, as the display list will take ownership of them and free them
- * otherwise. Manual free typically happens at cleanup time for bodies that
- * have been allocated but not used.
- *
- * Passing a NULL pointer to this function is safe, in that case no operation
- * will be performed.
- */
-void vsp1_dl_body_free(struct vsp1_dl_body *dlb)
-{
-	if (!dlb)
-		return;
-
-	vsp1_dl_body_cleanup(dlb);
-	kfree(dlb);
-}
-
 /**
  * vsp1_dl_body_write - Write a register to a display list body
  * @dlb: The body
@@ -399,11 +311,10 @@ void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data)
  * Display List Transaction Management
  */
 
-static struct vsp1_dl_list *vsp1_dl_list_alloc(struct vsp1_dl_manager *dlm)
+static struct vsp1_dl_list *vsp1_dl_list_alloc(struct vsp1_dl_manager *dlm,
+					       struct vsp1_dl_body_pool *pool)
 {
 	struct vsp1_dl_list *dl;
-	size_t header_size;
-	int ret;
 
 	dl = kzalloc(sizeof(*dl), GFP_KERNEL);
 	if (!dl)
@@ -412,41 +323,39 @@ static struct vsp1_dl_list *vsp1_dl_list_alloc(struct vsp1_dl_manager *dlm)
 	INIT_LIST_HEAD(&dl->bodies);
 	dl->dlm = dlm;
 
-	/*
-	 * Initialize the display list body and allocate DMA memory for the body
-	 * and the optional header. Both are allocated together to avoid memory
-	 * fragmentation, with the header located right after the body in
-	 * memory.
-	 */
-	header_size = dlm->mode == VSP1_DL_MODE_HEADER
-		    ? ALIGN(sizeof(struct vsp1_dl_header), 8)
-		    : 0;
-
-	ret = vsp1_dl_body_init(dlm->vsp1, &dl->body0, VSP1_DL_NUM_ENTRIES,
-				header_size);
-	if (ret < 0) {
-		kfree(dl);
+	/* Retrieve a body from our DLM body pool */
+	dl->body0 = vsp1_dl_body_get(pool);
+	if (!dl->body0)
 		return NULL;
-	}
-
 	if (dlm->mode == VSP1_DL_MODE_HEADER) {
-		size_t header_offset = VSP1_DL_NUM_ENTRIES
-				     * sizeof(*dl->body0.entries);
+		size_t header_offset = dl->body0->max_entries
+				     * sizeof(*dl->body0->entries);
 
-		dl->header = ((void *)dl->body0.entries) + header_offset;
-		dl->dma = dl->body0.dma + header_offset;
+		dl->header = ((void *)dl->body0->entries) + header_offset;
+		dl->dma = dl->body0->dma + header_offset;
 
 		memset(dl->header, 0, sizeof(*dl->header));
-		dl->header->lists[0].addr = dl->body0.dma;
+		dl->header->lists[0].addr = dl->body0->dma;
 	}
 
 	return dl;
 }
 
+static void vsp1_dl_list_bodies_put(struct vsp1_dl_list *dl)
+{
+	struct vsp1_dl_body *dlb, *tmp;
+
+	list_for_each_entry_safe(dlb, tmp, &dl->bodies, list) {
+		list_del(&dlb->list);
+		vsp1_dl_body_put(dlb);
+	}
+}
+
 static void vsp1_dl_list_free(struct vsp1_dl_list *dl)
 {
-	vsp1_dl_body_cleanup(&dl->body0);
-	list_splice_init(&dl->bodies, &dl->dlm->gc_bodies);
+	vsp1_dl_body_put(dl->body0);
+	vsp1_dl_list_bodies_put(dl);
+
 	kfree(dl);
 }
 
@@ -500,18 +409,13 @@ static void __vsp1_dl_list_put(struct vsp1_dl_list *dl)
 
 	dl->has_chain = false;
 
+	vsp1_dl_list_bodies_put(dl);
+
 	/*
-	 * We can't free bodies here as DMA memory can only be freed in
-	 * interruptible context. Move all bodies to the display list manager's
-	 * list of bodies to be freed, they will be garbage-collected by the
-	 * work queue.
+	 * body0 is reused as as an optimisation as presently every display list
+	 * has at least one body, thus we reinitialise the entries list
 	 */
-	if (!list_empty(&dl->bodies)) {
-		list_splice_init(&dl->bodies, &dl->dlm->gc_bodies);
-		schedule_work(&dl->dlm->gc_work);
-	}
-
-	dl->body0.num_entries = 0;
+	dl->body0->num_entries = 0;
 
 	list_add_tail(&dl->list, &dl->dlm->free);
 }
@@ -548,7 +452,7 @@ void vsp1_dl_list_put(struct vsp1_dl_list *dl)
  */
 void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data)
 {
-	vsp1_dl_body_write(&dl->body0, reg, data);
+	vsp1_dl_body_write(dl->body0, reg, data);
 }
 
 /**
@@ -561,8 +465,7 @@ void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data)
  * in the order in which bodies are added.
  *
  * Adding a body to a display list passes ownership of the body to the list. The
- * caller must not touch the body after this call, and must not free it
- * explicitly with vsp1_dl_body_free().
+ * caller must not touch the body after this call.
  *
  * Additional bodies are only usable for display lists in header mode.
  * Attempting to add a body to a header-less display list will return an error.
@@ -620,7 +523,7 @@ static void vsp1_dl_list_fill_header(struct vsp1_dl_list *dl, bool is_last)
 	 * list was allocated.
 	 */
 
-	hdr->num_bytes = dl->body0.num_entries
+	hdr->num_bytes = dl->body0->num_entries
 		       * sizeof(*dl->header->lists);
 
 	list_for_each_entry(dlb, &dl->bodies, list) {
@@ -694,9 +597,9 @@ static void vsp1_dl_list_hw_enqueue(struct vsp1_dl_list *dl)
 		 * bit will be cleared by the hardware when the display list
 		 * processing starts.
 		 */
-		vsp1_write(vsp1, VI6_DL_HDR_ADDR(0), dl->body0.dma);
+		vsp1_write(vsp1, VI6_DL_HDR_ADDR(0), dl->body0->dma);
 		vsp1_write(vsp1, VI6_DL_BODY_SIZE, VI6_DL_BODY_SIZE_UPD |
-			   (dl->body0.num_entries * sizeof(*dl->header->lists)));
+			   (dl->body0->num_entries * sizeof(*dl->header->lists)));
 	} else {
 		/*
 		 * In header mode, program the display list header address. If
@@ -879,45 +782,12 @@ void vsp1_dlm_reset(struct vsp1_dl_manager *dlm)
 	dlm->pending = NULL;
 }
 
-/*
- * Free all bodies awaiting to be garbage-collected.
- *
- * This function must be called without the display list manager lock held.
- */
-static void vsp1_dlm_bodies_free(struct vsp1_dl_manager *dlm)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&dlm->lock, flags);
-
-	while (!list_empty(&dlm->gc_bodies)) {
-		struct vsp1_dl_body *dlb;
-
-		dlb = list_first_entry(&dlm->gc_bodies, struct vsp1_dl_body,
-				       list);
-		list_del(&dlb->list);
-
-		spin_unlock_irqrestore(&dlm->lock, flags);
-		vsp1_dl_body_free(dlb);
-		spin_lock_irqsave(&dlm->lock, flags);
-	}
-
-	spin_unlock_irqrestore(&dlm->lock, flags);
-}
-
-static void vsp1_dlm_garbage_collect(struct work_struct *work)
-{
-	struct vsp1_dl_manager *dlm =
-		container_of(work, struct vsp1_dl_manager, gc_work);
-
-	vsp1_dlm_bodies_free(dlm);
-}
-
 struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
 					unsigned int index,
 					unsigned int prealloc)
 {
 	struct vsp1_dl_manager *dlm;
+	size_t header_size;
 	unsigned int i;
 
 	dlm = devm_kzalloc(vsp1->dev, sizeof(*dlm), GFP_KERNEL);
@@ -932,13 +802,26 @@ struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
 
 	spin_lock_init(&dlm->lock);
 	INIT_LIST_HEAD(&dlm->free);
-	INIT_LIST_HEAD(&dlm->gc_bodies);
-	INIT_WORK(&dlm->gc_work, vsp1_dlm_garbage_collect);
+
+	/*
+	 * Initialize the display list body and allocate DMA memory for the body
+	 * and the optional header. Both are allocated together to avoid memory
+	 * fragmentation, with the header located right after the body in
+	 * memory.
+	 */
+	header_size = dlm->mode == VSP1_DL_MODE_HEADER
+		    ? ALIGN(sizeof(struct vsp1_dl_header), 8)
+		    : 0;
+
+	dlm->pool = vsp1_dl_body_pool_create(vsp1, prealloc,
+					     VSP1_DL_NUM_ENTRIES, header_size);
+	if (!dlm->pool)
+		return NULL;
 
 	for (i = 0; i < prealloc; ++i) {
 		struct vsp1_dl_list *dl;
 
-		dl = vsp1_dl_list_alloc(dlm);
+		dl = vsp1_dl_list_alloc(dlm, dlm->pool);
 		if (!dl)
 			return NULL;
 
@@ -955,12 +838,10 @@ void vsp1_dlm_destroy(struct vsp1_dl_manager *dlm)
 	if (!dlm)
 		return;
 
-	cancel_work_sync(&dlm->gc_work);
-
 	list_for_each_entry_safe(dl, next, &dlm->free, list) {
 		list_del(&dl->list);
 		vsp1_dl_list_free(dl);
 	}
 
-	vsp1_dlm_bodies_free(dlm);
+	vsp1_dl_body_pool_destroy(dlm->pool);
 }
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index 785b88472375..f71a76332477 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -42,9 +42,6 @@ void vsp1_dl_body_pool_destroy(struct vsp1_dl_body_pool *pool);
 struct vsp1_dl_body *vsp1_dl_body_get(struct vsp1_dl_body_pool *pool);
 void vsp1_dl_body_put(struct vsp1_dl_body *dlb);
 
-struct vsp1_dl_body *vsp1_dl_body_alloc(struct vsp1_device *vsp1,
-					unsigned int num_entries);
-void vsp1_dl_body_free(struct vsp1_dl_body *dlb);
 void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data);
 int vsp1_dl_list_add_body(struct vsp1_dl_list *dl,
 			  struct vsp1_dl_body *dlb);
diff --git a/drivers/media/platform/vsp1/vsp1_lut.c b/drivers/media/platform/vsp1/vsp1_lut.c
index aa2b40327529..262cb72139d6 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.c
+++ b/drivers/media/platform/vsp1/vsp1_lut.c
@@ -23,6 +23,8 @@
 #define LUT_MIN_SIZE				4U
 #define LUT_MAX_SIZE				8190U
 
+#define LUT_SIZE				256
+
 /* -----------------------------------------------------------------------------
  * Device Access
  */
@@ -44,11 +46,11 @@ static int lut_set_table(struct vsp1_lut *lut, struct v4l2_ctrl *ctrl)
 	struct vsp1_dl_body *dlb;
 	unsigned int i;
 
-	dlb = vsp1_dl_body_alloc(lut->entity.vsp1, 256);
+	dlb = vsp1_dl_body_get(lut->pool);
 	if (!dlb)
 		return -ENOMEM;
 
-	for (i = 0; i < 256; ++i)
+	for (i = 0; i < LUT_SIZE; ++i)
 		vsp1_dl_body_write(dlb, VI6_LUT_TABLE + 4 * i,
 				       ctrl->p_new.p_u32[i]);
 
@@ -56,7 +58,7 @@ static int lut_set_table(struct vsp1_lut *lut, struct v4l2_ctrl *ctrl)
 	swap(lut->lut, dlb);
 	spin_unlock_irq(&lut->lock);
 
-	vsp1_dl_body_free(dlb);
+	vsp1_dl_body_put(dlb);
 	return 0;
 }
 
@@ -87,7 +89,7 @@ static const struct v4l2_ctrl_config lut_table_control = {
 	.max = 0x00ffffff,
 	.step = 1,
 	.def = 0,
-	.dims = { 256},
+	.dims = { LUT_SIZE },
 };
 
 /* -----------------------------------------------------------------------------
@@ -217,8 +219,16 @@ static void lut_configure(struct vsp1_entity *entity,
 	}
 }
 
+static void lut_destroy(struct vsp1_entity *entity)
+{
+	struct vsp1_lut *lut = to_lut(&entity->subdev);
+
+	vsp1_dl_body_pool_destroy(lut->pool);
+}
+
 static const struct vsp1_entity_operations lut_entity_ops = {
 	.configure = lut_configure,
+	.destroy = lut_destroy,
 };
 
 /* -----------------------------------------------------------------------------
@@ -244,6 +254,15 @@ struct vsp1_lut *vsp1_lut_create(struct vsp1_device *vsp1)
 	if (ret < 0)
 		return ERR_PTR(ret);
 
+	/*
+	 * Pre-allocate a body pool, with 3 bodies allowing a userspace update
+	 * before the hardware has committed a previous set of tables, handling
+	 * both the queued and pending dl entries.
+	 */
+	lut->pool = vsp1_dl_body_pool_create(vsp1, 3, LUT_SIZE, 0);
+	if (!lut->pool)
+		return ERR_PTR(-ENOMEM);
+
 	/* Initialize the control handler. */
 	v4l2_ctrl_handler_init(&lut->ctrls, 1);
 	v4l2_ctrl_new_custom(&lut->ctrls, &lut_table_control, NULL);
diff --git a/drivers/media/platform/vsp1/vsp1_lut.h b/drivers/media/platform/vsp1/vsp1_lut.h
index f8c4e8f0a79d..499ed0070bd2 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.h
+++ b/drivers/media/platform/vsp1/vsp1_lut.h
@@ -33,6 +33,7 @@ struct vsp1_lut {
 
 	spinlock_t lock;
 	struct vsp1_dl_body *lut;
+	struct vsp1_dl_body_pool *pool;
 };
 
 static inline struct vsp1_lut *to_lut(struct v4l2_subdev *subdev)
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 5/9] v4l: vsp1: Use reference counting for bodies
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (3 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 4/9] v4l: vsp1: Convert display lists to use new " Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 6/9] v4l: vsp1: Refactor display list configure operations Kieran Bingham
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Extend the display list body with a reference count, allowing bodies to
be kept as long as a reference is maintained. This provides the ability
to keep a cached copy of bodies which will not change, so that they can
be re-applied to multiple display lists.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>

---
This could be squashed into the body update code, but it's not a
straightforward squash as the refcounts will affect both:
  v4l: vsp1: Provide a body pool
and
  v4l: vsp1: Convert display lists to use new body pool
therefore, I have kept this separate to prevent breaking bisectability
of the vsp-tests.

v3:
 - 's/fragment/body/'

v4:
 - Fix up reference handling comments.

 drivers/media/platform/vsp1/vsp1_clu.c |  7 ++++++-
 drivers/media/platform/vsp1/vsp1_dl.c  | 15 ++++++++++++++-
 drivers/media/platform/vsp1/vsp1_lut.c |  7 ++++++-
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_clu.c b/drivers/media/platform/vsp1/vsp1_clu.c
index 2018144470c5..b2a39a6ef7e4 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.c
+++ b/drivers/media/platform/vsp1/vsp1_clu.c
@@ -257,8 +257,13 @@ static void clu_configure(struct vsp1_entity *entity,
 		clu->clu = NULL;
 		spin_unlock_irqrestore(&clu->lock, flags);
 
-		if (dlb)
+		if (dlb) {
 			vsp1_dl_list_add_body(dl, dlb);
+
+			/* release our local reference */
+			vsp1_dl_body_put(dlb);
+		}
+
 		break;
 	}
 }
diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index a069c8456666..0f87e0bb21c1 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -14,6 +14,7 @@
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
 #include <linux/gfp.h>
+#include <linux/refcount.h>
 #include <linux/slab.h>
 #include <linux/workqueue.h>
 
@@ -58,6 +59,8 @@ struct vsp1_dl_body {
 	struct list_head list;
 	struct list_head free;
 
+	refcount_t refcnt;
+
 	struct vsp1_dl_body_pool *pool;
 	struct vsp1_device *vsp1;
 
@@ -259,6 +262,7 @@ struct vsp1_dl_body *vsp1_dl_body_get(struct vsp1_dl_body_pool *pool)
 	if (!list_empty(&pool->free)) {
 		dlb = list_first_entry(&pool->free, struct vsp1_dl_body, free);
 		list_del(&dlb->free);
+		refcount_set(&dlb->refcnt, 1);
 	}
 
 	spin_unlock_irqrestore(&pool->lock, flags);
@@ -279,6 +283,9 @@ void vsp1_dl_body_put(struct vsp1_dl_body *dlb)
 	if (!dlb)
 		return;
 
+	if (!refcount_dec_and_test(&dlb->refcnt))
+		return;
+
 	dlb->num_entries = 0;
 
 	spin_lock_irqsave(&dlb->pool->lock, flags);
@@ -465,7 +472,11 @@ void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data)
  * in the order in which bodies are added.
  *
  * Adding a body to a display list passes ownership of the body to the list. The
- * caller must not touch the body after this call.
+ * caller retains its reference to the fragment when adding it to the display
+ * list, but is not allowed to add new entries to the body.
+ *
+ * The reference must be explicitly released by a call to vsp1_dl_body_put()
+ * when the body isn't needed anymore.
  *
  * Additional bodies are only usable for display lists in header mode.
  * Attempting to add a body to a header-less display list will return an error.
@@ -477,6 +488,8 @@ int vsp1_dl_list_add_body(struct vsp1_dl_list *dl,
 	if (dl->dlm->mode != VSP1_DL_MODE_HEADER)
 		return -EINVAL;
 
+	refcount_inc(&dlb->refcnt);
+
 	list_add_tail(&dlb->list, &dl->bodies);
 	return 0;
 }
diff --git a/drivers/media/platform/vsp1/vsp1_lut.c b/drivers/media/platform/vsp1/vsp1_lut.c
index 262cb72139d6..77cf7137a0f2 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.c
+++ b/drivers/media/platform/vsp1/vsp1_lut.c
@@ -213,8 +213,13 @@ static void lut_configure(struct vsp1_entity *entity,
 		lut->lut = NULL;
 		spin_unlock_irqrestore(&lut->lock, flags);
 
-		if (dlb)
+		if (dlb) {
 			vsp1_dl_list_add_body(dl, dlb);
+
+			/* release our local reference */
+			vsp1_dl_body_put(dlb);
+		}
+
 		break;
 	}
 }
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 6/9] v4l: vsp1: Refactor display list configure operations
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (4 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 5/9] v4l: vsp1: Use reference counting for bodies Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 7/9] v4l: vsp1: Adapt entities to configure into a body Kieran Bingham
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

The entities provide a single .configure operation which configures the
object into the target display list, based on the vsp1_entity_params
selection.

This restricts us to a single function prototype for both static
configuration (the pre-stream INIT stage) and the dynamic runtime stages
for both each frame - and each partition therein.

Split the configure function into two parts, '.prepare()' and
'.configure()', merging both the VSP1_ENTITY_PARAMS_RUNTIME and
VSP1_ENTITY_PARAMS_PARTITION stages into a single call through the
.configure(). The configuration for individual partitions is handled by
passing the partition number to the configure call, and processing any
runtime stage actions on the first partition only.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
---
 drivers/media/platform/vsp1/vsp1_bru.c    |  12 +-
 drivers/media/platform/vsp1/vsp1_clu.c    |  42 +--
 drivers/media/platform/vsp1/vsp1_dl.h     |   1 +-
 drivers/media/platform/vsp1/vsp1_drm.c    |  21 +--
 drivers/media/platform/vsp1/vsp1_entity.c |  15 +-
 drivers/media/platform/vsp1/vsp1_entity.h |  27 +--
 drivers/media/platform/vsp1/vsp1_hgo.c    |  12 +-
 drivers/media/platform/vsp1/vsp1_hgt.c    |  12 +-
 drivers/media/platform/vsp1/vsp1_hsit.c   |  12 +-
 drivers/media/platform/vsp1/vsp1_lif.c    |  12 +-
 drivers/media/platform/vsp1/vsp1_lut.c    |  24 +-
 drivers/media/platform/vsp1/vsp1_rpf.c    | 162 ++++++-------
 drivers/media/platform/vsp1/vsp1_sru.c    |  12 +-
 drivers/media/platform/vsp1/vsp1_uds.c    |  55 ++--
 drivers/media/platform/vsp1/vsp1_video.c  |  24 +--
 drivers/media/platform/vsp1/vsp1_wpf.c    | 297 ++++++++++++-----------
 16 files changed, 360 insertions(+), 380 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_bru.c b/drivers/media/platform/vsp1/vsp1_bru.c
index e8fd2ae3b3eb..b9ff96f76b3e 100644
--- a/drivers/media/platform/vsp1/vsp1_bru.c
+++ b/drivers/media/platform/vsp1/vsp1_bru.c
@@ -285,19 +285,15 @@ static const struct v4l2_subdev_ops bru_ops = {
  * VSP1 Entity Operations
  */
 
-static void bru_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void bru_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_bru *bru = to_bru(&entity->subdev);
 	struct v4l2_mbus_framefmt *format;
 	unsigned int flags;
 	unsigned int i;
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	format = vsp1_entity_get_pad_format(&bru->entity, bru->entity.config,
 					    bru->entity.source_pad);
 
@@ -404,7 +400,7 @@ static void bru_configure(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations bru_entity_ops = {
-	.configure = bru_configure,
+	.prepare = bru_prepare,
 };
 
 /* -----------------------------------------------------------------------------
diff --git a/drivers/media/platform/vsp1/vsp1_clu.c b/drivers/media/platform/vsp1/vsp1_clu.c
index b2a39a6ef7e4..be4d7e493746 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.c
+++ b/drivers/media/platform/vsp1/vsp1_clu.c
@@ -213,37 +213,36 @@ static const struct v4l2_subdev_ops clu_ops = {
 /* -----------------------------------------------------------------------------
  * VSP1 Entity Operations
  */
+static void clu_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
+{
+	struct vsp1_clu *clu = to_clu(&entity->subdev);
+
+	/*
+	 * The yuv_mode can't be changed during streaming. Cache it internally
+	 * for future runtime configuration calls.
+	 */
+	struct v4l2_mbus_framefmt *format;
+
+	format = vsp1_entity_get_pad_format(&clu->entity,
+					    clu->entity.config,
+					    CLU_PAD_SINK);
+	clu->yuv_mode = format->code == MEDIA_BUS_FMT_AYUV8_1X32;
+}
 
 static void clu_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+			  unsigned int partition)
 {
 	struct vsp1_clu *clu = to_clu(&entity->subdev);
 	struct vsp1_dl_body *dlb;
 	unsigned long flags;
 	u32 ctrl = VI6_CLU_CTRL_AAI | VI6_CLU_CTRL_MVS | VI6_CLU_CTRL_EN;
 
-	switch (params) {
-	case VSP1_ENTITY_PARAMS_INIT: {
-		/*
-		 * The format can't be changed during streaming, only verify it
-		 * at setup time and store the information internally for future
-		 * runtime configuration calls.
-		 */
-		struct v4l2_mbus_framefmt *format;
-
-		format = vsp1_entity_get_pad_format(&clu->entity,
-						    clu->entity.config,
-						    CLU_PAD_SINK);
-		clu->yuv_mode = format->code == MEDIA_BUS_FMT_AYUV8_1X32;
-		break;
-	}
 
-	case VSP1_ENTITY_PARAMS_PARTITION:
-		break;
-
-	case VSP1_ENTITY_PARAMS_RUNTIME:
+	if (partition == 0) {
 		/* 2D mode can only be used with the YCbCr pixel encoding. */
 		if (clu->mode == V4L2_CID_VSP1_CLU_MODE_2D && clu->yuv_mode)
 			ctrl |= VI6_CLU_CTRL_AX1I_2D | VI6_CLU_CTRL_AX2I_2D
@@ -263,8 +262,6 @@ static void clu_configure(struct vsp1_entity *entity,
 			/* release our local reference */
 			vsp1_dl_body_put(dlb);
 		}
-
-		break;
 	}
 }
 
@@ -276,6 +273,7 @@ static void clu_destroy(struct vsp1_entity *entity)
 }
 
 static const struct vsp1_entity_operations clu_entity_ops = {
+	.prepare = clu_prepare,
 	.configure = clu_configure,
 	.destroy = clu_destroy,
 };
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index f71a76332477..4f67b2b955b9 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -41,7 +41,6 @@ vsp1_dl_body_pool_create(struct vsp1_device *vsp1, unsigned int num_bodies,
 void vsp1_dl_body_pool_destroy(struct vsp1_dl_body_pool *pool);
 struct vsp1_dl_body *vsp1_dl_body_get(struct vsp1_dl_body_pool *pool);
 void vsp1_dl_body_put(struct vsp1_dl_body *dlb);
-
 void vsp1_dl_body_write(struct vsp1_dl_body *dlb, u32 reg, u32 data);
 int vsp1_dl_list_add_body(struct vsp1_dl_list *dl,
 			  struct vsp1_dl_body *dlb);
diff --git a/drivers/media/platform/vsp1/vsp1_drm.c b/drivers/media/platform/vsp1/vsp1_drm.c
index b8fee1834253..ef9a64a616ef 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -259,14 +259,8 @@ int vsp1_du_setup_lif(struct device *dev, unsigned int pipe_index,
 	list_for_each_entry_safe(entity, next, &pipe->entities, list_pipe) {
 		vsp1_entity_route_setup(entity, pipe, dl);
 
-		if (entity->ops->configure) {
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_INIT);
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_RUNTIME);
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_PARTITION);
-		}
+		vsp1_entity_prepare(entity, pipe, dl);
+		vsp1_entity_configure(entity, pipe, dl, 0);
 	}
 
 	vsp1_dl_list_commit(dl);
@@ -588,15 +582,8 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index)
 		}
 
 		vsp1_entity_route_setup(entity, pipe, dl);
-
-		if (entity->ops->configure) {
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_INIT);
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_RUNTIME);
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_PARTITION);
-		}
+		vsp1_entity_prepare(entity, pipe, dl);
+		vsp1_entity_configure(entity, pipe, dl, 0);
 	}
 
 	vsp1_dl_list_commit(dl);
diff --git a/drivers/media/platform/vsp1/vsp1_entity.c b/drivers/media/platform/vsp1/vsp1_entity.c
index 54de15095709..76f240f005af 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.c
+++ b/drivers/media/platform/vsp1/vsp1_entity.c
@@ -73,6 +73,21 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 	vsp1_dl_list_write(dl, source->route->reg, route);
 }
 
+void vsp1_entity_prepare(struct vsp1_entity *entity, struct vsp1_pipeline *pipe,
+			 struct vsp1_dl_list *dl)
+{
+	if (entity->ops->prepare)
+		entity->ops->prepare(entity, pipe, dl);
+}
+
+void vsp1_entity_configure(struct vsp1_entity *entity,
+			   struct vsp1_pipeline *pipe, struct vsp1_dl_list *dl,
+			   unsigned int partition)
+{
+	if (entity->ops->configure)
+		entity->ops->configure(entity, pipe, dl, partition);
+}
+
 /* -----------------------------------------------------------------------------
  * V4L2 Subdevice Operations
  */
diff --git a/drivers/media/platform/vsp1/vsp1_entity.h b/drivers/media/platform/vsp1/vsp1_entity.h
index 408602ebeb97..2f33e343ccc6 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.h
+++ b/drivers/media/platform/vsp1/vsp1_entity.h
@@ -40,18 +40,6 @@ enum vsp1_entity_type {
 	VSP1_ENTITY_WPF,
 };
 
-/**
- * enum vsp1_entity_params - Entity configuration parameters class
- * @VSP1_ENTITY_PARAMS_INIT - Initial parameters
- * @VSP1_ENTITY_PARAMS_PARTITION - Per-image partition parameters
- * @VSP1_ENTITY_PARAMS_RUNTIME - Runtime-configurable parameters
- */
-enum vsp1_entity_params {
-	VSP1_ENTITY_PARAMS_INIT,
-	VSP1_ENTITY_PARAMS_PARTITION,
-	VSP1_ENTITY_PARAMS_RUNTIME,
-};
-
 #define VSP1_ENTITY_MAX_INPUTS		5	/* For the BRU */
 
 /*
@@ -80,8 +68,10 @@ struct vsp1_route {
 /**
  * struct vsp1_entity_operations - Entity operations
  * @destroy:	Destroy the entity.
- * @configure:	Setup the hardware based on the entity state (pipeline, formats,
- *		selection rectangles, ...)
+ * @prepare:	Setup the initial hardware parameters for the stream (pipeline,
+ *		formats)
+ * @configure:	Configure the runtime parameters for each partition (rectangles,
+ *		buffer addresses, ...)
  * @max_width:	Return the max supported width of data that the entity can
  *		process in a single operation.
  * @partition:	Process the partition construction based on this entity's
@@ -89,8 +79,10 @@ struct vsp1_route {
  */
 struct vsp1_entity_operations {
 	void (*destroy)(struct vsp1_entity *);
+	void (*prepare)(struct vsp1_entity *, struct vsp1_pipeline *,
+			struct vsp1_dl_list *);
 	void (*configure)(struct vsp1_entity *, struct vsp1_pipeline *,
-			  struct vsp1_dl_list *, enum vsp1_entity_params);
+			  struct vsp1_dl_list *, unsigned int partition);
 	unsigned int (*max_width)(struct vsp1_entity *, struct vsp1_pipeline *);
 	void (*partition)(struct vsp1_entity *, struct vsp1_pipeline *,
 			  struct vsp1_partition *, unsigned int,
@@ -156,6 +148,11 @@ int vsp1_entity_init_cfg(struct v4l2_subdev *subdev,
 void vsp1_entity_route_setup(struct vsp1_entity *entity,
 			     struct vsp1_pipeline *pipe,
 			     struct vsp1_dl_list *dl);
+void vsp1_entity_prepare(struct vsp1_entity *entity, struct vsp1_pipeline *pipe,
+			 struct vsp1_dl_list *dl);
+void vsp1_entity_configure(struct vsp1_entity *entity,
+			   struct vsp1_pipeline *pipe, struct vsp1_dl_list *dl,
+			   unsigned int partition);
 
 struct media_pad *vsp1_entity_remote_pad(struct media_pad *pad);
 
diff --git a/drivers/media/platform/vsp1/vsp1_hgo.c b/drivers/media/platform/vsp1/vsp1_hgo.c
index 50309c053b78..5705ba67dbc8 100644
--- a/drivers/media/platform/vsp1/vsp1_hgo.c
+++ b/drivers/media/platform/vsp1/vsp1_hgo.c
@@ -133,10 +133,9 @@ static const struct v4l2_ctrl_config hgo_num_bins_control = {
  * VSP1 Entity Operations
  */
 
-static void hgo_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void hgo_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_hgo *hgo = to_hgo(&entity->subdev);
 	struct v4l2_rect *compose;
@@ -144,9 +143,6 @@ static void hgo_configure(struct vsp1_entity *entity,
 	unsigned int hratio;
 	unsigned int vratio;
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	crop = vsp1_entity_get_pad_selection(entity, entity->config,
 					     HISTO_PAD_SINK, V4L2_SEL_TGT_CROP);
 	compose = vsp1_entity_get_pad_selection(entity, entity->config,
@@ -178,7 +174,7 @@ static void hgo_configure(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations hgo_entity_ops = {
-	.configure = hgo_configure,
+	.prepare = hgo_prepare,
 	.destroy = vsp1_histogram_destroy,
 };
 
diff --git a/drivers/media/platform/vsp1/vsp1_hgt.c b/drivers/media/platform/vsp1/vsp1_hgt.c
index b5ce305e3e6f..bdd1247e090f 100644
--- a/drivers/media/platform/vsp1/vsp1_hgt.c
+++ b/drivers/media/platform/vsp1/vsp1_hgt.c
@@ -129,10 +129,9 @@ static const struct v4l2_ctrl_config hgt_hue_areas = {
  * VSP1 Entity Operations
  */
 
-static void hgt_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void hgt_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_hgt *hgt = to_hgt(&entity->subdev);
 	struct v4l2_rect *compose;
@@ -143,9 +142,6 @@ static void hgt_configure(struct vsp1_entity *entity,
 	u8 upper;
 	unsigned int i;
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	crop = vsp1_entity_get_pad_selection(entity, entity->config,
 					     HISTO_PAD_SINK, V4L2_SEL_TGT_CROP);
 	compose = vsp1_entity_get_pad_selection(entity, entity->config,
@@ -179,7 +175,7 @@ static void hgt_configure(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations hgt_entity_ops = {
-	.configure = hgt_configure,
+	.prepare = hgt_prepare,
 	.destroy = vsp1_histogram_destroy,
 };
 
diff --git a/drivers/media/platform/vsp1/vsp1_hsit.c b/drivers/media/platform/vsp1/vsp1_hsit.c
index 764d405345ee..cf96ce2c6da9 100644
--- a/drivers/media/platform/vsp1/vsp1_hsit.c
+++ b/drivers/media/platform/vsp1/vsp1_hsit.c
@@ -131,16 +131,12 @@ static const struct v4l2_subdev_ops hsit_ops = {
  * VSP1 Entity Operations
  */
 
-static void hsit_configure(struct vsp1_entity *entity,
-			   struct vsp1_pipeline *pipe,
-			   struct vsp1_dl_list *dl,
-			   enum vsp1_entity_params params)
+static void hsit_prepare(struct vsp1_entity *entity,
+			 struct vsp1_pipeline *pipe,
+			 struct vsp1_dl_list *dl)
 {
 	struct vsp1_hsit *hsit = to_hsit(&entity->subdev);
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	if (hsit->inverse)
 		vsp1_hsit_write(hsit, dl, VI6_HSI_CTRL, VI6_HSI_CTRL_EN);
 	else
@@ -148,7 +144,7 @@ static void hsit_configure(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations hsit_entity_ops = {
-	.configure = hsit_configure,
+	.prepare = hsit_prepare,
 };
 
 /* -----------------------------------------------------------------------------
diff --git a/drivers/media/platform/vsp1/vsp1_lif.c b/drivers/media/platform/vsp1/vsp1_lif.c
index 704920753998..7c6391090844 100644
--- a/drivers/media/platform/vsp1/vsp1_lif.c
+++ b/drivers/media/platform/vsp1/vsp1_lif.c
@@ -128,10 +128,9 @@ static const struct v4l2_subdev_ops lif_ops = {
  * VSP1 Entity Operations
  */
 
-static void lif_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void lif_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	const struct v4l2_mbus_framefmt *format;
 	struct vsp1_lif *lif = to_lif(&entity->subdev);
@@ -139,9 +138,6 @@ static void lif_configure(struct vsp1_entity *entity,
 	unsigned int obth = 400;
 	unsigned int lbth = 200;
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	format = vsp1_entity_get_pad_format(&lif->entity, lif->entity.config,
 					    LIF_PAD_SOURCE);
 
@@ -170,7 +166,7 @@ static void lif_configure(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations lif_entity_ops = {
-	.configure = lif_configure,
+	.prepare = lif_prepare,
 };
 
 /* -----------------------------------------------------------------------------
diff --git a/drivers/media/platform/vsp1/vsp1_lut.c b/drivers/media/platform/vsp1/vsp1_lut.c
index 77cf7137a0f2..33bfaa1df994 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.c
+++ b/drivers/media/platform/vsp1/vsp1_lut.c
@@ -190,24 +190,25 @@ static const struct v4l2_subdev_ops lut_ops = {
  * VSP1 Entity Operations
  */
 
+static void lut_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
+{
+	struct vsp1_lut *lut = to_lut(&entity->subdev);
+
+	vsp1_lut_write(lut, dl, VI6_LUT_CTRL, VI6_LUT_CTRL_EN);
+}
+
 static void lut_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+			  unsigned int partition)
 {
 	struct vsp1_lut *lut = to_lut(&entity->subdev);
 	struct vsp1_dl_body *dlb;
 	unsigned long flags;
 
-	switch (params) {
-	case VSP1_ENTITY_PARAMS_INIT:
-		vsp1_lut_write(lut, dl, VI6_LUT_CTRL, VI6_LUT_CTRL_EN);
-		break;
-
-	case VSP1_ENTITY_PARAMS_PARTITION:
-		break;
-
-	case VSP1_ENTITY_PARAMS_RUNTIME:
+	if (partition == 0) {
 		spin_lock_irqsave(&lut->lock, flags);
 		dlb = lut->lut;
 		lut->lut = NULL;
@@ -219,8 +220,6 @@ static void lut_configure(struct vsp1_entity *entity,
 			/* release our local reference */
 			vsp1_dl_body_put(dlb);
 		}
-
-		break;
 	}
 }
 
@@ -232,6 +231,7 @@ static void lut_destroy(struct vsp1_entity *entity)
 }
 
 static const struct vsp1_entity_operations lut_entity_ops = {
+	.prepare = lut_prepare,
 	.configure = lut_configure,
 	.destroy = lut_destroy,
 };
diff --git a/drivers/media/platform/vsp1/vsp1_rpf.c b/drivers/media/platform/vsp1/vsp1_rpf.c
index fe0633da5a5f..87a47997a086 100644
--- a/drivers/media/platform/vsp1/vsp1_rpf.c
+++ b/drivers/media/platform/vsp1/vsp1_rpf.c
@@ -46,10 +46,9 @@ static const struct v4l2_subdev_ops rpf_ops = {
  * VSP1 Entity Operations
  */
 
-static void rpf_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void rpf_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
 	const struct vsp1_format_info *fmtinfo = rpf->fmtinfo;
@@ -61,80 +60,6 @@ static void rpf_configure(struct vsp1_entity *entity,
 	u32 pstride;
 	u32 infmt;
 
-	if (params == VSP1_ENTITY_PARAMS_RUNTIME) {
-		vsp1_rpf_write(rpf, dl, VI6_RPF_VRTCOL_SET,
-			       rpf->alpha << VI6_RPF_VRTCOL_SET_LAYA_SHIFT);
-		vsp1_rpf_write(rpf, dl, VI6_RPF_MULT_ALPHA, rpf->mult_alpha |
-			       (rpf->alpha << VI6_RPF_MULT_ALPHA_RATIO_SHIFT));
-
-		vsp1_pipeline_propagate_alpha(pipe, dl, rpf->alpha);
-		return;
-	}
-
-	if (params == VSP1_ENTITY_PARAMS_PARTITION) {
-		struct vsp1_device *vsp1 = rpf->entity.vsp1;
-		struct vsp1_rwpf_memory mem = rpf->mem;
-		struct v4l2_rect crop;
-
-		/*
-		 * Source size and crop offsets.
-		 *
-		 * The crop offsets correspond to the location of the crop
-		 * rectangle top left corner in the plane buffer. Only two
-		 * offsets are needed, as planes 2 and 3 always have identical
-		 * strides.
-		 */
-		crop = *vsp1_rwpf_get_crop(rpf, rpf->entity.config);
-
-		/*
-		 * Partition Algorithm Control
-		 *
-		 * The partition algorithm can split this frame into multiple
-		 * slices. We must scale our partition window based on the pipe
-		 * configuration to match the destination partition window.
-		 * To achieve this, we adjust our crop to provide a 'sub-crop'
-		 * matching the expected partition window. Only 'left' and
-		 * 'width' need to be adjusted.
-		 */
-		if (pipe->partitions > 1) {
-			crop.width = pipe->partition->rpf.width;
-			crop.left += pipe->partition->rpf.left;
-		}
-
-		vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_BSIZE,
-			       (crop.width << VI6_RPF_SRC_BSIZE_BHSIZE_SHIFT) |
-			       (crop.height << VI6_RPF_SRC_BSIZE_BVSIZE_SHIFT));
-		vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_ESIZE,
-			       (crop.width << VI6_RPF_SRC_ESIZE_EHSIZE_SHIFT) |
-			       (crop.height << VI6_RPF_SRC_ESIZE_EVSIZE_SHIFT));
-
-		mem.addr[0] += crop.top * format->plane_fmt[0].bytesperline
-			     + crop.left * fmtinfo->bpp[0] / 8;
-
-		if (format->num_planes > 1) {
-			unsigned int offset;
-
-			offset = crop.top * format->plane_fmt[1].bytesperline
-			       + crop.left / fmtinfo->hsub
-			       * fmtinfo->bpp[1] / 8;
-			mem.addr[1] += offset;
-			mem.addr[2] += offset;
-		}
-
-		/*
-		 * On Gen3 hardware the SPUVS bit has no effect on 3-planar
-		 * formats. Swap the U and V planes manually in that case.
-		 */
-		if (vsp1->info->gen == 3 && format->num_planes == 3 &&
-		    fmtinfo->swap_uv)
-			swap(mem.addr[1], mem.addr[2]);
-
-		vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_Y, mem.addr[0]);
-		vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C0, mem.addr[1]);
-		vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C1, mem.addr[2]);
-		return;
-	}
-
 	/* Stride */
 	pstride = format->plane_fmt[0].bytesperline
 		<< VI6_RPF_SRCM_PSTRIDE_Y_SHIFT;
@@ -247,6 +172,86 @@ static void rpf_configure(struct vsp1_entity *entity,
 
 }
 
+static void rpf_configure(struct vsp1_entity *entity,
+			  struct vsp1_pipeline *pipe,
+			  struct vsp1_dl_list *dl,
+			  unsigned int partition)
+{
+	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
+	struct vsp1_rwpf_memory mem = rpf->mem;
+	struct vsp1_device *vsp1 = rpf->entity.vsp1;
+	const struct vsp1_format_info *fmtinfo = rpf->fmtinfo;
+	const struct v4l2_pix_format_mplane *format = &rpf->format;
+	struct v4l2_rect crop;
+
+	if (partition == 0) {
+		vsp1_rpf_write(rpf, dl, VI6_RPF_VRTCOL_SET,
+			       rpf->alpha << VI6_RPF_VRTCOL_SET_LAYA_SHIFT);
+		vsp1_rpf_write(rpf, dl, VI6_RPF_MULT_ALPHA, rpf->mult_alpha |
+			       (rpf->alpha << VI6_RPF_MULT_ALPHA_RATIO_SHIFT));
+
+		vsp1_pipeline_propagate_alpha(pipe, dl, rpf->alpha);
+	}
+
+
+	/*
+	 * Source size and crop offsets.
+	 *
+	 * The crop offsets correspond to the location of the crop
+	 * rectangle top left corner in the plane buffer. Only two
+	 * offsets are needed, as planes 2 and 3 always have identical
+	 * strides.
+	 */
+	crop = *vsp1_rwpf_get_crop(rpf, rpf->entity.config);
+
+	/*
+	 * Partition Algorithm Control
+	 *
+	 * The partition algorithm can split this frame into multiple
+	 * slices. We must scale our partition window based on the pipe
+	 * configuration to match the destination partition window.
+	 * To achieve this, we adjust our crop to provide a 'sub-crop'
+	 * matching the expected partition window. Only 'left' and
+	 * 'width' need to be adjusted.
+	 */
+	if (pipe->partitions > 1) {
+		crop.width = pipe->partition->rpf.width;
+		crop.left += pipe->partition->rpf.left;
+	}
+
+	vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_BSIZE,
+		       (crop.width << VI6_RPF_SRC_BSIZE_BHSIZE_SHIFT) |
+		       (crop.height << VI6_RPF_SRC_BSIZE_BVSIZE_SHIFT));
+	vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_ESIZE,
+		       (crop.width << VI6_RPF_SRC_ESIZE_EHSIZE_SHIFT) |
+		       (crop.height << VI6_RPF_SRC_ESIZE_EVSIZE_SHIFT));
+
+	mem.addr[0] += crop.top * format->plane_fmt[0].bytesperline
+		     + crop.left * fmtinfo->bpp[0] / 8;
+
+	if (format->num_planes > 1) {
+		unsigned int offset;
+
+		offset = crop.top * format->plane_fmt[1].bytesperline
+		       + crop.left / fmtinfo->hsub
+		       * fmtinfo->bpp[1] / 8;
+		mem.addr[1] += offset;
+		mem.addr[2] += offset;
+	}
+
+	/*
+	 * On Gen3 hardware the SPUVS bit has no effect on 3-planar
+	 * formats. Swap the U and V planes manually in that case.
+	 */
+	if (vsp1->info->gen == 3 && format->num_planes == 3 &&
+	    fmtinfo->swap_uv)
+		swap(mem.addr[1], mem.addr[2]);
+
+	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_Y, mem.addr[0]);
+	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C0, mem.addr[1]);
+	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C1, mem.addr[2]);
+}
+
 static void rpf_partition(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_partition *partition,
@@ -257,6 +262,7 @@ static void rpf_partition(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations rpf_entity_ops = {
+	.prepare = rpf_prepare,
 	.configure = rpf_configure,
 	.partition = rpf_partition,
 };
diff --git a/drivers/media/platform/vsp1/vsp1_sru.c b/drivers/media/platform/vsp1/vsp1_sru.c
index 51e5691187c3..0a24bc59bc2f 100644
--- a/drivers/media/platform/vsp1/vsp1_sru.c
+++ b/drivers/media/platform/vsp1/vsp1_sru.c
@@ -271,10 +271,9 @@ static const struct v4l2_subdev_ops sru_ops = {
  * VSP1 Entity Operations
  */
 
-static void sru_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void sru_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	const struct vsp1_sru_param *param;
 	struct vsp1_sru *sru = to_sru(&entity->subdev);
@@ -282,9 +281,6 @@ static void sru_configure(struct vsp1_entity *entity,
 	struct v4l2_mbus_framefmt *output;
 	u32 ctrl0;
 
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	input = vsp1_entity_get_pad_format(&sru->entity, sru->entity.config,
 					   SRU_PAD_SINK);
 	output = vsp1_entity_get_pad_format(&sru->entity, sru->entity.config,
@@ -351,7 +347,7 @@ static void sru_partition(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations sru_entity_ops = {
-	.configure = sru_configure,
+	.prepare = sru_prepare,
 	.max_width = sru_max_width,
 	.partition = sru_partition,
 };
diff --git a/drivers/media/platform/vsp1/vsp1_uds.c b/drivers/media/platform/vsp1/vsp1_uds.c
index 72f72a9d2152..84be962a33b1 100644
--- a/drivers/media/platform/vsp1/vsp1_uds.c
+++ b/drivers/media/platform/vsp1/vsp1_uds.c
@@ -259,10 +259,9 @@ static const struct v4l2_subdev_ops uds_ops = {
  * VSP1 Entity Operations
  */
 
-static void uds_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void uds_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_uds *uds = to_uds(&entity->subdev);
 	const struct v4l2_mbus_framefmt *output;
@@ -276,27 +275,6 @@ static void uds_configure(struct vsp1_entity *entity,
 	output = vsp1_entity_get_pad_format(&uds->entity, uds->entity.config,
 					    UDS_PAD_SOURCE);
 
-	if (params == VSP1_ENTITY_PARAMS_PARTITION) {
-		struct vsp1_partition *partition = pipe->partition;
-
-		/* Input size clipping */
-		vsp1_uds_write(uds, dl, VI6_UDS_HSZCLIP, VI6_UDS_HSZCLIP_HCEN |
-			       (0 << VI6_UDS_HSZCLIP_HCL_OFST_SHIFT) |
-			       (partition->uds_sink.width
-					<< VI6_UDS_HSZCLIP_HCL_SIZE_SHIFT));
-
-		/* Output size clipping */
-		vsp1_uds_write(uds, dl, VI6_UDS_CLIP_SIZE,
-			       (partition->uds_source.width
-					<< VI6_UDS_CLIP_SIZE_HSIZE_SHIFT) |
-			       (output->height
-					<< VI6_UDS_CLIP_SIZE_VSIZE_SHIFT));
-		return;
-	}
-
-	if (params != VSP1_ENTITY_PARAMS_INIT)
-		return;
-
 	hscale = uds_compute_ratio(input->width, output->width);
 	vscale = uds_compute_ratio(input->height, output->height);
 
@@ -328,6 +306,32 @@ static void uds_configure(struct vsp1_entity *entity,
 		       (vscale << VI6_UDS_SCALE_VFRAC_SHIFT));
 }
 
+static void uds_configure(struct vsp1_entity *entity,
+			  struct vsp1_pipeline *pipe,
+			  struct vsp1_dl_list *dl,
+			  unsigned int pindex)
+{
+	struct vsp1_uds *uds = to_uds(&entity->subdev);
+	struct vsp1_partition *partition = pipe->partition;
+	const struct v4l2_mbus_framefmt *output;
+
+	output = vsp1_entity_get_pad_format(&uds->entity, uds->entity.config,
+					    UDS_PAD_SOURCE);
+
+	/* Input size clipping */
+	vsp1_uds_write(uds, dl, VI6_UDS_HSZCLIP, VI6_UDS_HSZCLIP_HCEN |
+		       (0 << VI6_UDS_HSZCLIP_HCL_OFST_SHIFT) |
+		       (partition->uds_sink.width
+				<< VI6_UDS_HSZCLIP_HCL_SIZE_SHIFT));
+
+	/* Output size clipping */
+	vsp1_uds_write(uds, dl, VI6_UDS_CLIP_SIZE,
+		       (partition->uds_source.width
+				<< VI6_UDS_CLIP_SIZE_HSIZE_SHIFT) |
+		       (output->height
+				<< VI6_UDS_CLIP_SIZE_VSIZE_SHIFT));
+}
+
 static unsigned int uds_max_width(struct vsp1_entity *entity,
 				  struct vsp1_pipeline *pipe)
 {
@@ -384,6 +388,7 @@ static void uds_partition(struct vsp1_entity *entity,
 }
 
 static const struct vsp1_entity_operations uds_entity_ops = {
+	.prepare = uds_prepare,
 	.configure = uds_configure,
 	.max_width = uds_max_width,
 	.partition = uds_partition,
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index c2d3b8f0f487..bd5403f24dda 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -386,33 +386,18 @@ static void vsp1_video_pipeline_run_partition(struct vsp1_pipeline *pipe,
 
 	pipe->partition = &pipe->part_table[partition];
 
-	list_for_each_entry(entity, &pipe->entities, list_pipe) {
-		if (entity->ops->configure)
-			entity->ops->configure(entity, pipe, dl,
-					       VSP1_ENTITY_PARAMS_PARTITION);
-	}
+	list_for_each_entry(entity, &pipe->entities, list_pipe)
+		vsp1_entity_configure(entity, pipe, dl, partition);
 }
 
 static void vsp1_video_pipeline_run(struct vsp1_pipeline *pipe)
 {
 	struct vsp1_device *vsp1 = pipe->output->entity.vsp1;
-	struct vsp1_entity *entity;
 	unsigned int partition;
 
 	if (!pipe->dl)
 		pipe->dl = vsp1_dl_list_get(pipe->output->dlm);
 
-	/*
-	 * Start with the runtime parameters as the configure operation can
-	 * compute/cache information needed when configuring partitions. This
-	 * is the case with flipping in the WPF.
-	 */
-	list_for_each_entry(entity, &pipe->entities, list_pipe) {
-		if (entity->ops->configure)
-			entity->ops->configure(entity, pipe, pipe->dl,
-					       VSP1_ENTITY_PARAMS_RUNTIME);
-	}
-
 	/* Run the first partition */
 	vsp1_video_pipeline_run_partition(pipe, pipe->dl, 0);
 
@@ -840,10 +825,7 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 
 	list_for_each_entry(entity, &pipe->entities, list_pipe) {
 		vsp1_entity_route_setup(entity, pipe, pipe->dl);
-
-		if (entity->ops->configure)
-			entity->ops->configure(entity, pipe, pipe->dl,
-					       VSP1_ENTITY_PARAMS_INIT);
+		vsp1_entity_prepare(entity, pipe, pipe->dl);
 	}
 
 	return 0;
diff --git a/drivers/media/platform/vsp1/vsp1_wpf.c b/drivers/media/platform/vsp1/vsp1_wpf.c
index f7f3b4b2c2de..d6dd7e783d27 100644
--- a/drivers/media/platform/vsp1/vsp1_wpf.c
+++ b/drivers/media/platform/vsp1/vsp1_wpf.c
@@ -236,10 +236,9 @@ static void vsp1_wpf_destroy(struct vsp1_entity *entity)
 	vsp1_dlm_destroy(wpf->dlm);
 }
 
-static void wpf_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  enum vsp1_entity_params params)
+static void wpf_prepare(struct vsp1_entity *entity,
+			struct vsp1_pipeline *pipe,
+			struct vsp1_dl_list *dl)
 {
 	struct vsp1_rwpf *wpf = to_rwpf(&entity->subdev);
 	struct vsp1_device *vsp1 = wpf->entity.vsp1;
@@ -249,149 +248,12 @@ static void wpf_configure(struct vsp1_entity *entity,
 	u32 outfmt = 0;
 	u32 srcrpf = 0;
 
-	if (params == VSP1_ENTITY_PARAMS_RUNTIME) {
-		const unsigned int mask = BIT(WPF_CTRL_VFLIP)
-					| BIT(WPF_CTRL_HFLIP);
-		unsigned long flags;
-
-		spin_lock_irqsave(&wpf->flip.lock, flags);
-		wpf->flip.active = (wpf->flip.active & ~mask)
-				 | (wpf->flip.pending & mask);
-		spin_unlock_irqrestore(&wpf->flip.lock, flags);
-
-		outfmt = (wpf->alpha << VI6_WPF_OUTFMT_PDV_SHIFT) | wpf->outfmt;
-
-		if (wpf->flip.active & BIT(WPF_CTRL_VFLIP))
-			outfmt |= VI6_WPF_OUTFMT_FLP;
-		if (wpf->flip.active & BIT(WPF_CTRL_HFLIP))
-			outfmt |= VI6_WPF_OUTFMT_HFLP;
-
-		vsp1_wpf_write(wpf, dl, VI6_WPF_OUTFMT, outfmt);
-		return;
-	}
-
 	sink_format = vsp1_entity_get_pad_format(&wpf->entity,
 						 wpf->entity.config,
 						 RWPF_PAD_SINK);
 	source_format = vsp1_entity_get_pad_format(&wpf->entity,
 						   wpf->entity.config,
 						   RWPF_PAD_SOURCE);
-
-	if (params == VSP1_ENTITY_PARAMS_PARTITION) {
-		const struct v4l2_pix_format_mplane *format = &wpf->format;
-		const struct vsp1_format_info *fmtinfo = wpf->fmtinfo;
-		struct vsp1_rwpf_memory mem = wpf->mem;
-		unsigned int flip = wpf->flip.active;
-		unsigned int width = sink_format->width;
-		unsigned int height = sink_format->height;
-		unsigned int offset;
-
-		/*
-		 * Cropping. The partition algorithm can split the image into
-		 * multiple slices.
-		 */
-		if (pipe->partitions > 1)
-			width = pipe->partition->wpf.width;
-
-		vsp1_wpf_write(wpf, dl, VI6_WPF_HSZCLIP, VI6_WPF_SZCLIP_EN |
-			       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
-			       (width << VI6_WPF_SZCLIP_SIZE_SHIFT));
-		vsp1_wpf_write(wpf, dl, VI6_WPF_VSZCLIP, VI6_WPF_SZCLIP_EN |
-			       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
-			       (height << VI6_WPF_SZCLIP_SIZE_SHIFT));
-
-		if (pipe->lif)
-			return;
-
-		/*
-		 * Update the memory offsets based on flipping configuration.
-		 * The destination addresses point to the locations where the
-		 * VSP starts writing to memory, which can be any corner of the
-		 * image depending on the combination of flipping and rotation.
-		 */
-
-		/*
-		 * First take the partition left coordinate into account.
-		 * Compute the offset to order the partitions correctly on the
-		 * output based on whether flipping is enabled. Consider
-		 * horizontal flipping when rotation is disabled but vertical
-		 * flipping when rotation is enabled, as rotating the image
-		 * switches the horizontal and vertical directions. The offset
-		 * is applied horizontally or vertically accordingly.
-		 */
-		if (flip & BIT(WPF_CTRL_HFLIP) && !wpf->flip.rotate)
-			offset = format->width - pipe->partition->wpf.left
-				- pipe->partition->wpf.width;
-		else if (flip & BIT(WPF_CTRL_VFLIP) && wpf->flip.rotate)
-			offset = format->height - pipe->partition->wpf.left
-				- pipe->partition->wpf.width;
-		else
-			offset = pipe->partition->wpf.left;
-
-		for (i = 0; i < format->num_planes; ++i) {
-			unsigned int hsub = i > 0 ? fmtinfo->hsub : 1;
-			unsigned int vsub = i > 0 ? fmtinfo->vsub : 1;
-
-			if (wpf->flip.rotate)
-				mem.addr[i] += offset / vsub
-					     * format->plane_fmt[i].bytesperline;
-			else
-				mem.addr[i] += offset / hsub
-					     * fmtinfo->bpp[i] / 8;
-		}
-
-		if (flip & BIT(WPF_CTRL_VFLIP)) {
-			/*
-			 * When rotating the output (after rotation) image
-			 * height is equal to the partition width (before
-			 * rotation). Otherwise it is equal to the output
-			 * image height.
-			 */
-			if (wpf->flip.rotate)
-				height = pipe->partition->wpf.width;
-			else
-				height = format->height;
-
-			mem.addr[0] += (height - 1)
-				     * format->plane_fmt[0].bytesperline;
-
-			if (format->num_planes > 1) {
-				offset = (height / fmtinfo->vsub - 1)
-				       * format->plane_fmt[1].bytesperline;
-				mem.addr[1] += offset;
-				mem.addr[2] += offset;
-			}
-		}
-
-		if (wpf->flip.rotate && !(flip & BIT(WPF_CTRL_HFLIP))) {
-			unsigned int hoffset = max(0, (int)format->width - 16);
-
-			/*
-			 * Compute the output coordinate. The partition
-			 * horizontal (left) offset becomes a vertical offset.
-			 */
-			for (i = 0; i < format->num_planes; ++i) {
-				unsigned int hsub = i > 0 ? fmtinfo->hsub : 1;
-
-				mem.addr[i] += hoffset / hsub
-					     * fmtinfo->bpp[i] / 8;
-			}
-		}
-
-		/*
-		 * On Gen3 hardware the SPUVS bit has no effect on 3-planar
-		 * formats. Swap the U and V planes manually in that case.
-		 */
-		if (vsp1->info->gen == 3 && format->num_planes == 3 &&
-		    fmtinfo->swap_uv)
-			swap(mem.addr[1], mem.addr[2]);
-
-		vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_Y, mem.addr[0]);
-		vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C0, mem.addr[1]);
-		vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C1, mem.addr[2]);
-		return;
-	}
-
 	/* Format */
 	if (!pipe->lif) {
 		const struct v4l2_pix_format_mplane *format = &wpf->format;
@@ -465,6 +327,158 @@ static void wpf_configure(struct vsp1_entity *entity,
 			   VI6_WFP_IRQ_ENB_DFEE);
 }
 
+static void wpf_configure(struct vsp1_entity *entity,
+			  struct vsp1_pipeline *pipe,
+			  struct vsp1_dl_list *dl,
+			  unsigned int partition)
+{
+	struct vsp1_rwpf *wpf = to_rwpf(&entity->subdev);
+	struct vsp1_device *vsp1 = wpf->entity.vsp1;
+	struct vsp1_rwpf_memory mem = wpf->mem;
+	const struct v4l2_mbus_framefmt *sink_format;
+	const struct v4l2_pix_format_mplane *format = &wpf->format;
+	const struct vsp1_format_info *fmtinfo = wpf->fmtinfo;
+	unsigned int flip;
+	unsigned int i;
+	unsigned int width;
+	unsigned int height;
+	unsigned int offset;
+	u32 outfmt = 0;
+
+	/* Handle the per frame constants */
+	if (partition == 0) {
+		const unsigned int mask = BIT(WPF_CTRL_VFLIP)
+					| BIT(WPF_CTRL_HFLIP);
+		unsigned long flags;
+
+		spin_lock_irqsave(&wpf->flip.lock, flags);
+		wpf->flip.active = (wpf->flip.active & ~mask)
+				 | (wpf->flip.pending & mask);
+		spin_unlock_irqrestore(&wpf->flip.lock, flags);
+
+		outfmt = (wpf->alpha << VI6_WPF_OUTFMT_PDV_SHIFT) | wpf->outfmt;
+
+		if (wpf->flip.active & BIT(WPF_CTRL_VFLIP))
+			outfmt |= VI6_WPF_OUTFMT_FLP;
+		if (wpf->flip.active & BIT(WPF_CTRL_HFLIP))
+			outfmt |= VI6_WPF_OUTFMT_HFLP;
+
+		vsp1_wpf_write(wpf, dl, VI6_WPF_OUTFMT, outfmt);
+	}
+
+	sink_format = vsp1_entity_get_pad_format(&wpf->entity,
+						 wpf->entity.config,
+						 RWPF_PAD_SINK);
+	width = sink_format->width;
+	height = sink_format->height;
+
+	/*
+	 * Cropping. The partition algorithm can split the image into
+	 * multiple slices.
+	 */
+	if (pipe->partitions > 1)
+		width = pipe->partition->wpf.width;
+
+	vsp1_wpf_write(wpf, dl, VI6_WPF_HSZCLIP, VI6_WPF_SZCLIP_EN |
+		       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
+		       (width << VI6_WPF_SZCLIP_SIZE_SHIFT));
+	vsp1_wpf_write(wpf, dl, VI6_WPF_VSZCLIP, VI6_WPF_SZCLIP_EN |
+		       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
+		       (height << VI6_WPF_SZCLIP_SIZE_SHIFT));
+
+	if (pipe->lif)
+		return;
+
+	/*
+	 * Update the memory offsets based on flipping configuration.
+	 * The destination addresses point to the locations where the
+	 * VSP starts writing to memory, which can be any corner of the
+	 * image depending on the combination of flipping and rotation.
+	 */
+
+	/*
+	 * First take the partition left coordinate into account.
+	 * Compute the offset to order the partitions correctly on the
+	 * output based on whether flipping is enabled. Consider
+	 * horizontal flipping when rotation is disabled but vertical
+	 * flipping when rotation is enabled, as rotating the image
+	 * switches the horizontal and vertical directions. The offset
+	 * is applied horizontally or vertically accordingly.
+	 */
+	flip = wpf->flip.active;
+
+	if (flip & BIT(WPF_CTRL_HFLIP) && !wpf->flip.rotate)
+		offset = format->width - pipe->partition->wpf.left
+			- pipe->partition->wpf.width;
+	else if (flip & BIT(WPF_CTRL_VFLIP) && wpf->flip.rotate)
+		offset = format->height - pipe->partition->wpf.left
+			- pipe->partition->wpf.width;
+	else
+		offset = pipe->partition->wpf.left;
+
+	for (i = 0; i < format->num_planes; ++i) {
+		unsigned int hsub = i > 0 ? fmtinfo->hsub : 1;
+		unsigned int vsub = i > 0 ? fmtinfo->vsub : 1;
+
+		if (wpf->flip.rotate)
+			mem.addr[i] += offset / vsub
+				     * format->plane_fmt[i].bytesperline;
+		else
+			mem.addr[i] += offset / hsub
+				     * fmtinfo->bpp[i] / 8;
+	}
+
+	if (flip & BIT(WPF_CTRL_VFLIP)) {
+		/*
+		 * When rotating the output (after rotation) image
+		 * height is equal to the partition width (before
+		 * rotation). Otherwise it is equal to the output
+		 * image height.
+		 */
+		if (wpf->flip.rotate)
+			height = pipe->partition->wpf.width;
+		else
+			height = format->height;
+
+		mem.addr[0] += (height - 1)
+			     * format->plane_fmt[0].bytesperline;
+
+		if (format->num_planes > 1) {
+			offset = (height / fmtinfo->vsub - 1)
+			       * format->plane_fmt[1].bytesperline;
+			mem.addr[1] += offset;
+			mem.addr[2] += offset;
+		}
+	}
+
+	if (wpf->flip.rotate && !(flip & BIT(WPF_CTRL_HFLIP))) {
+		unsigned int hoffset = max(0, (int)format->width - 16);
+
+		/*
+		 * Compute the output coordinate. The partition
+		 * horizontal (left) offset becomes a vertical offset.
+		 */
+		for (i = 0; i < format->num_planes; ++i) {
+			unsigned int hsub = i > 0 ? fmtinfo->hsub : 1;
+
+			mem.addr[i] += hoffset / hsub
+				     * fmtinfo->bpp[i] / 8;
+		}
+	}
+
+	/*
+	 * On Gen3 hardware the SPUVS bit has no effect on 3-planar
+	 * formats. Swap the U and V planes manually in that case.
+	 */
+	if (vsp1->info->gen == 3 && format->num_planes == 3 &&
+	    fmtinfo->swap_uv)
+		swap(mem.addr[1], mem.addr[2]);
+
+	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_Y, mem.addr[0]);
+	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C0, mem.addr[1]);
+	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C1, mem.addr[2]);
+}
+
 static unsigned int wpf_max_width(struct vsp1_entity *entity,
 				  struct vsp1_pipeline *pipe)
 {
@@ -484,6 +498,7 @@ static void wpf_partition(struct vsp1_entity *entity,
 
 static const struct vsp1_entity_operations wpf_entity_ops = {
 	.destroy = vsp1_wpf_destroy,
+	.prepare = wpf_prepare,
 	.configure = wpf_configure,
 	.max_width = wpf_max_width,
 	.partition = wpf_partition,
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 7/9] v4l: vsp1: Adapt entities to configure into a body
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (5 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 6/9] v4l: vsp1: Refactor display list configure operations Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 8/9] v4l: vsp1: Move video configuration to a cached dlb Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 9/9] v4l: vsp1: Reduce display list body size Kieran Bingham
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

Currently the entities store their configurations into a display list.
Adapt this such that the code can be configured into a body directly,
allowing greater flexibility and control of the content.

All users of vsp1_dl_list_write() are removed in this process, thus it
too is removed.

A helper, vsp1_dl_list_get_body0() is provided to access the internal body0
from the display list.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>

---

v4:
 - Rename vsp1_dl_list_get_body() to vsp1_dl_list_get_body0()
   The similarities between vsp1_dl_list_get_body and
   vsp1_dl_list_body_get() were too close

 - body0 could be removed later when the default body is no longer
   needed.

v5:
 - Support DRM/UIF changes

v6:
 - Remove DRM/UIF changes

 drivers/media/platform/vsp1/vsp1_bru.c    | 22 ++++++------
 drivers/media/platform/vsp1/vsp1_clu.c    | 22 ++++++------
 drivers/media/platform/vsp1/vsp1_dl.c     | 12 ++-----
 drivers/media/platform/vsp1/vsp1_dl.h     |  2 +-
 drivers/media/platform/vsp1/vsp1_drm.c    | 22 +++++++-----
 drivers/media/platform/vsp1/vsp1_entity.c | 16 ++++-----
 drivers/media/platform/vsp1/vsp1_entity.h | 12 ++++---
 drivers/media/platform/vsp1/vsp1_hgo.c    | 16 ++++-----
 drivers/media/platform/vsp1/vsp1_hgt.c    | 18 +++++-----
 drivers/media/platform/vsp1/vsp1_hsit.c   | 10 +++---
 drivers/media/platform/vsp1/vsp1_lif.c    | 15 ++++----
 drivers/media/platform/vsp1/vsp1_lut.c    | 21 ++++++------
 drivers/media/platform/vsp1/vsp1_pipe.c   |  4 +-
 drivers/media/platform/vsp1/vsp1_pipe.h   |  3 +-
 drivers/media/platform/vsp1/vsp1_rpf.c    | 43 +++++++++++-------------
 drivers/media/platform/vsp1/vsp1_sru.c    | 14 ++++----
 drivers/media/platform/vsp1/vsp1_uds.c    | 24 +++++++------
 drivers/media/platform/vsp1/vsp1_uds.h    |  2 +-
 drivers/media/platform/vsp1/vsp1_video.c  | 11 ++++--
 drivers/media/platform/vsp1/vsp1_wpf.c    | 42 ++++++++++++-----------
 20 files changed, 174 insertions(+), 157 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_bru.c b/drivers/media/platform/vsp1/vsp1_bru.c
index b9ff96f76b3e..60d449d7b135 100644
--- a/drivers/media/platform/vsp1/vsp1_bru.c
+++ b/drivers/media/platform/vsp1/vsp1_bru.c
@@ -30,10 +30,10 @@
  * Device Access
  */
 
-static inline void vsp1_bru_write(struct vsp1_bru *bru, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_bru_write(struct vsp1_bru *bru,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, bru->base + reg, data);
+	vsp1_dl_body_write(dlb, bru->base + reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -287,7 +287,7 @@ static const struct v4l2_subdev_ops bru_ops = {
 
 static void bru_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_bru *bru = to_bru(&entity->subdev);
 	struct v4l2_mbus_framefmt *format;
@@ -309,7 +309,7 @@ static void bru_prepare(struct vsp1_entity *entity,
 	 * format at the pipeline output is premultiplied.
 	 */
 	flags = pipe->output ? pipe->output->format.flags : 0;
-	vsp1_bru_write(bru, dl, VI6_BRU_INCTRL,
+	vsp1_bru_write(bru, dlb, VI6_BRU_INCTRL,
 		       flags & V4L2_PIX_FMT_FLAG_PREMUL_ALPHA ?
 		       0 : VI6_BRU_INCTRL_NRM);
 
@@ -317,12 +317,12 @@ static void bru_prepare(struct vsp1_entity *entity,
 	 * Set the background position to cover the whole output image and
 	 * configure its color.
 	 */
-	vsp1_bru_write(bru, dl, VI6_BRU_VIRRPF_SIZE,
+	vsp1_bru_write(bru, dlb, VI6_BRU_VIRRPF_SIZE,
 		       (format->width << VI6_BRU_VIRRPF_SIZE_HSIZE_SHIFT) |
 		       (format->height << VI6_BRU_VIRRPF_SIZE_VSIZE_SHIFT));
-	vsp1_bru_write(bru, dl, VI6_BRU_VIRRPF_LOC, 0);
+	vsp1_bru_write(bru, dlb, VI6_BRU_VIRRPF_LOC, 0);
 
-	vsp1_bru_write(bru, dl, VI6_BRU_VIRRPF_COL, bru->bgcolor |
+	vsp1_bru_write(bru, dlb, VI6_BRU_VIRRPF_COL, bru->bgcolor |
 		       (0xff << VI6_BRU_VIRRPF_COL_A_SHIFT));
 
 	/*
@@ -332,7 +332,7 @@ static void bru_prepare(struct vsp1_entity *entity,
 	 * unit.
 	 */
 	if (entity->type == VSP1_ENTITY_BRU)
-		vsp1_bru_write(bru, dl, VI6_BRU_ROP,
+		vsp1_bru_write(bru, dlb, VI6_BRU_ROP,
 			       VI6_BRU_ROP_DSTSEL_BRUIN(1) |
 			       VI6_BRU_ROP_CROP(VI6_ROP_NOP) |
 			       VI6_BRU_ROP_AROP(VI6_ROP_NOP));
@@ -374,7 +374,7 @@ static void bru_prepare(struct vsp1_entity *entity,
 		if (!(entity->type == VSP1_ENTITY_BRU && i == 1))
 			ctrl |= VI6_BRU_CTRL_SRCSEL_BRUIN(i);
 
-		vsp1_bru_write(bru, dl, VI6_BRU_CTRL(i), ctrl);
+		vsp1_bru_write(bru, dlb, VI6_BRU_CTRL(i), ctrl);
 
 		/*
 		 * Harcode the blending formula to
@@ -389,7 +389,7 @@ static void bru_prepare(struct vsp1_entity *entity,
 		 *
 		 * otherwise.
 		 */
-		vsp1_bru_write(bru, dl, VI6_BRU_BLD(i),
+		vsp1_bru_write(bru, dlb, VI6_BRU_BLD(i),
 			       VI6_BRU_BLD_CCMDX_255_SRC_A |
 			       (premultiplied ? VI6_BRU_BLD_CCMDY_COEFY :
 						VI6_BRU_BLD_CCMDY_SRC_A) |
diff --git a/drivers/media/platform/vsp1/vsp1_clu.c b/drivers/media/platform/vsp1/vsp1_clu.c
index be4d7e493746..742fe4dd7032 100644
--- a/drivers/media/platform/vsp1/vsp1_clu.c
+++ b/drivers/media/platform/vsp1/vsp1_clu.c
@@ -29,10 +29,10 @@
  * Device Access
  */
 
-static inline void vsp1_clu_write(struct vsp1_clu *clu, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_clu_write(struct vsp1_clu *clu,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -215,7 +215,7 @@ static const struct v4l2_subdev_ops clu_ops = {
  */
 static void clu_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_clu *clu = to_clu(&entity->subdev);
 
@@ -234,14 +234,14 @@ static void clu_prepare(struct vsp1_entity *entity,
 static void clu_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb,
 			  unsigned int partition)
 {
 	struct vsp1_clu *clu = to_clu(&entity->subdev);
-	struct vsp1_dl_body *dlb;
+	struct vsp1_dl_body *clu_dlb;
 	unsigned long flags;
 	u32 ctrl = VI6_CLU_CTRL_AAI | VI6_CLU_CTRL_MVS | VI6_CLU_CTRL_EN;
 
-
 	if (partition == 0) {
 		/* 2D mode can only be used with the YCbCr pixel encoding. */
 		if (clu->mode == V4L2_CID_VSP1_CLU_MODE_2D && clu->yuv_mode)
@@ -249,18 +249,18 @@ static void clu_configure(struct vsp1_entity *entity,
 			     |  VI6_CLU_CTRL_OS0_2D | VI6_CLU_CTRL_OS1_2D
 			     |  VI6_CLU_CTRL_OS2_2D | VI6_CLU_CTRL_M2D;
 
-		vsp1_clu_write(clu, dl, VI6_CLU_CTRL, ctrl);
+		vsp1_clu_write(clu, dlb, VI6_CLU_CTRL, ctrl);
 
 		spin_lock_irqsave(&clu->lock, flags);
-		dlb = clu->clu;
+		clu_dlb = clu->clu;
 		clu->clu = NULL;
 		spin_unlock_irqrestore(&clu->lock, flags);
 
-		if (dlb) {
-			vsp1_dl_list_add_body(dl, dlb);
+		if (clu_dlb) {
+			vsp1_dl_list_add_body(dl, clu_dlb);
 
 			/* release our local reference */
-			vsp1_dl_body_put(dlb);
+			vsp1_dl_body_put(clu_dlb);
 		}
 	}
 }
diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index 0f87e0bb21c1..a762e840d147 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -449,17 +449,15 @@ void vsp1_dl_list_put(struct vsp1_dl_list *dl)
 }
 
 /**
- * vsp1_dl_list_write - Write a register to the display list
+ * vsp1_dl_list_get_body0 - Obtain the default body for the display list
  * @dl: The display list
- * @reg: The register address
- * @data: The register value
  *
- * Write the given register and value to the display list. Up to 256 registers
- * can be written per display list.
+ * Obtain a pointer to the internal display list body allowing this to be passed
+ * directly to configure operations.
  */
-void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data)
+struct vsp1_dl_body *vsp1_dl_list_get_body0(struct vsp1_dl_list *dl)
 {
-	vsp1_dl_body_write(dl->body0, reg, data);
+	return dl->body0;
 }
 
 /**
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index 4f67b2b955b9..577fe4b122dd 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -32,7 +32,7 @@ bool vsp1_dlm_irq_frame_end(struct vsp1_dl_manager *dlm);
 
 struct vsp1_dl_list *vsp1_dl_list_get(struct vsp1_dl_manager *dlm);
 void vsp1_dl_list_put(struct vsp1_dl_list *dl);
-void vsp1_dl_list_write(struct vsp1_dl_list *dl, u32 reg, u32 data);
+struct vsp1_dl_body *vsp1_dl_list_get_body0(struct vsp1_dl_list *dl);
 void vsp1_dl_list_commit(struct vsp1_dl_list *dl);
 
 struct vsp1_dl_body_pool *
diff --git a/drivers/media/platform/vsp1/vsp1_drm.c b/drivers/media/platform/vsp1/vsp1_drm.c
index ef9a64a616ef..7c766fe48a07 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -88,6 +88,7 @@ int vsp1_du_setup_lif(struct device *dev, unsigned int pipe_index,
 	struct vsp1_entity *entity;
 	struct vsp1_entity *next;
 	struct vsp1_dl_list *dl;
+	struct vsp1_dl_body *dlb;
 	struct v4l2_subdev_format format;
 	unsigned long flags;
 	unsigned int i;
@@ -255,12 +256,13 @@ int vsp1_du_setup_lif(struct device *dev, unsigned int pipe_index,
 
 	/* Configure all entities in the pipeline. */
 	dl = vsp1_dl_list_get(pipe->output->dlm);
+	dlb = vsp1_dl_list_get_body0(dl);
 
 	list_for_each_entry_safe(entity, next, &pipe->entities, list_pipe) {
-		vsp1_entity_route_setup(entity, pipe, dl);
+		vsp1_entity_route_setup(entity, pipe, dlb);
 
-		vsp1_entity_prepare(entity, pipe, dl);
-		vsp1_entity_configure(entity, pipe, dl, 0);
+		vsp1_entity_prepare(entity, pipe, dlb);
+		vsp1_entity_configure(entity, pipe, dl, dlb, 0);
 	}
 
 	vsp1_dl_list_commit(dl);
@@ -506,12 +508,16 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index)
 	struct vsp1_entity *entity;
 	struct vsp1_entity *next;
 	struct vsp1_dl_list *dl;
+	struct vsp1_dl_body *dlb;
 	unsigned int i;
 	int ret;
 
 	/* Prepare the display list. */
 	dl = vsp1_dl_list_get(pipe->output->dlm);
 
+	/* Retrieve the default DLB from the list */
+	dlb = vsp1_dl_list_get_body0(dl);
+
 	/* Count the number of enabled inputs and sort them by Z-order. */
 	pipe->num_inputs = 0;
 
@@ -573,17 +579,17 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index)
 		/* Disconnect unused RPFs from the pipeline. */
 		if (entity->type == VSP1_ENTITY_RPF &&
 		    !pipe->inputs[entity->index]) {
-			vsp1_dl_list_write(dl, entity->route->reg,
-					   VI6_DPR_NODE_UNUSED);
+			vsp1_dl_body_write(dlb, entity->route->reg,
+					       VI6_DPR_NODE_UNUSED);
 
 			list_del_init(&entity->list_pipe);
 
 			continue;
 		}
 
-		vsp1_entity_route_setup(entity, pipe, dl);
-		vsp1_entity_prepare(entity, pipe, dl);
-		vsp1_entity_configure(entity, pipe, dl, 0);
+		vsp1_entity_route_setup(entity, pipe, dlb);
+		vsp1_entity_prepare(entity, pipe, dlb);
+		vsp1_entity_configure(entity, pipe, dl, dlb, 0);
 	}
 
 	vsp1_dl_list_commit(dl);
diff --git a/drivers/media/platform/vsp1/vsp1_entity.c b/drivers/media/platform/vsp1/vsp1_entity.c
index 76f240f005af..c9383ff3b069 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.c
+++ b/drivers/media/platform/vsp1/vsp1_entity.c
@@ -26,7 +26,7 @@
 
 void vsp1_entity_route_setup(struct vsp1_entity *entity,
 			     struct vsp1_pipeline *pipe,
-			     struct vsp1_dl_list *dl)
+			     struct vsp1_dl_body *dlb)
 {
 	struct vsp1_entity *source;
 	u32 route;
@@ -42,7 +42,7 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 		smppt = (pipe->output->entity.index << VI6_DPR_SMPPT_TGW_SHIFT)
 		      | (source->route->output << VI6_DPR_SMPPT_PT_SHIFT);
 
-		vsp1_dl_list_write(dl, VI6_DPR_HGO_SMPPT, smppt);
+		vsp1_dl_body_write(dlb, VI6_DPR_HGO_SMPPT, smppt);
 		return;
 	} else if (entity->type == VSP1_ENTITY_HGT) {
 		u32 smppt;
@@ -55,7 +55,7 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 		smppt = (pipe->output->entity.index << VI6_DPR_SMPPT_TGW_SHIFT)
 		      | (source->route->output << VI6_DPR_SMPPT_PT_SHIFT);
 
-		vsp1_dl_list_write(dl, VI6_DPR_HGT_SMPPT, smppt);
+		vsp1_dl_body_write(dlb, VI6_DPR_HGT_SMPPT, smppt);
 		return;
 	}
 
@@ -70,22 +70,22 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 	 */
 	if (source->type == VSP1_ENTITY_BRS)
 		route |= VI6_DPR_ROUTE_BRSSEL;
-	vsp1_dl_list_write(dl, source->route->reg, route);
+	vsp1_dl_body_write(dlb, source->route->reg, route);
 }
 
 void vsp1_entity_prepare(struct vsp1_entity *entity, struct vsp1_pipeline *pipe,
-			 struct vsp1_dl_list *dl)
+			 struct vsp1_dl_body *dlb)
 {
 	if (entity->ops->prepare)
-		entity->ops->prepare(entity, pipe, dl);
+		entity->ops->prepare(entity, pipe, dlb);
 }
 
 void vsp1_entity_configure(struct vsp1_entity *entity,
 			   struct vsp1_pipeline *pipe, struct vsp1_dl_list *dl,
-			   unsigned int partition)
+			   struct vsp1_dl_body *dlb, unsigned int partition)
 {
 	if (entity->ops->configure)
-		entity->ops->configure(entity, pipe, dl, partition);
+		entity->ops->configure(entity, pipe, dl, dlb, partition);
 }
 
 /* -----------------------------------------------------------------------------
diff --git a/drivers/media/platform/vsp1/vsp1_entity.h b/drivers/media/platform/vsp1/vsp1_entity.h
index 2f33e343ccc6..4eb8afd7e402 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.h
+++ b/drivers/media/platform/vsp1/vsp1_entity.h
@@ -19,6 +19,7 @@
 #include <media/v4l2-subdev.h>
 
 struct vsp1_device;
+struct vsp1_dl_body;
 struct vsp1_dl_list;
 struct vsp1_pipeline;
 struct vsp1_partition;
@@ -80,9 +81,10 @@ struct vsp1_route {
 struct vsp1_entity_operations {
 	void (*destroy)(struct vsp1_entity *);
 	void (*prepare)(struct vsp1_entity *, struct vsp1_pipeline *,
-			struct vsp1_dl_list *);
+			struct vsp1_dl_body *);
 	void (*configure)(struct vsp1_entity *, struct vsp1_pipeline *,
-			  struct vsp1_dl_list *, unsigned int partition);
+			  struct vsp1_dl_list *, struct vsp1_dl_body *,
+			  unsigned int partition);
 	unsigned int (*max_width)(struct vsp1_entity *, struct vsp1_pipeline *);
 	void (*partition)(struct vsp1_entity *, struct vsp1_pipeline *,
 			  struct vsp1_partition *, unsigned int,
@@ -147,12 +149,12 @@ int vsp1_entity_init_cfg(struct v4l2_subdev *subdev,
 
 void vsp1_entity_route_setup(struct vsp1_entity *entity,
 			     struct vsp1_pipeline *pipe,
-			     struct vsp1_dl_list *dl);
+			     struct vsp1_dl_body *dlb);
 void vsp1_entity_prepare(struct vsp1_entity *entity, struct vsp1_pipeline *pipe,
-			 struct vsp1_dl_list *dl);
+			 struct vsp1_dl_body *dlb);
 void vsp1_entity_configure(struct vsp1_entity *entity,
 			   struct vsp1_pipeline *pipe, struct vsp1_dl_list *dl,
-			   unsigned int partition);
+			   struct vsp1_dl_body *dlb, unsigned int partition);
 
 struct media_pad *vsp1_entity_remote_pad(struct media_pad *pad);
 
diff --git a/drivers/media/platform/vsp1/vsp1_hgo.c b/drivers/media/platform/vsp1/vsp1_hgo.c
index 5705ba67dbc8..6d07e0e2a60c 100644
--- a/drivers/media/platform/vsp1/vsp1_hgo.c
+++ b/drivers/media/platform/vsp1/vsp1_hgo.c
@@ -32,10 +32,10 @@ static inline u32 vsp1_hgo_read(struct vsp1_hgo *hgo, u32 reg)
 	return vsp1_read(hgo->histo.entity.vsp1, reg);
 }
 
-static inline void vsp1_hgo_write(struct vsp1_hgo *hgo, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_hgo_write(struct vsp1_hgo *hgo,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -135,7 +135,7 @@ static const struct v4l2_ctrl_config hgo_num_bins_control = {
 
 static void hgo_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_hgo *hgo = to_hgo(&entity->subdev);
 	struct v4l2_rect *compose;
@@ -149,12 +149,12 @@ static void hgo_prepare(struct vsp1_entity *entity,
 						HISTO_PAD_SINK,
 						V4L2_SEL_TGT_COMPOSE);
 
-	vsp1_hgo_write(hgo, dl, VI6_HGO_REGRST, VI6_HGO_REGRST_RCLEA);
+	vsp1_hgo_write(hgo, dlb, VI6_HGO_REGRST, VI6_HGO_REGRST_RCLEA);
 
-	vsp1_hgo_write(hgo, dl, VI6_HGO_OFFSET,
+	vsp1_hgo_write(hgo, dlb, VI6_HGO_OFFSET,
 		       (crop->left << VI6_HGO_OFFSET_HOFFSET_SHIFT) |
 		       (crop->top << VI6_HGO_OFFSET_VOFFSET_SHIFT));
-	vsp1_hgo_write(hgo, dl, VI6_HGO_SIZE,
+	vsp1_hgo_write(hgo, dlb, VI6_HGO_SIZE,
 		       (crop->width << VI6_HGO_SIZE_HSIZE_SHIFT) |
 		       (crop->height << VI6_HGO_SIZE_VSIZE_SHIFT));
 
@@ -166,7 +166,7 @@ static void hgo_prepare(struct vsp1_entity *entity,
 
 	hratio = crop->width * 2 / compose->width / 3;
 	vratio = crop->height * 2 / compose->height / 3;
-	vsp1_hgo_write(hgo, dl, VI6_HGO_MODE,
+	vsp1_hgo_write(hgo, dlb, VI6_HGO_MODE,
 		       (hgo->num_bins == 256 ? VI6_HGO_MODE_STEP : 0) |
 		       (hgo->max_rgb ? VI6_HGO_MODE_MAXRGB : 0) |
 		       (hratio << VI6_HGO_MODE_HRATIO_SHIFT) |
diff --git a/drivers/media/platform/vsp1/vsp1_hgt.c b/drivers/media/platform/vsp1/vsp1_hgt.c
index bdd1247e090f..b0614b1e033c 100644
--- a/drivers/media/platform/vsp1/vsp1_hgt.c
+++ b/drivers/media/platform/vsp1/vsp1_hgt.c
@@ -32,10 +32,10 @@ static inline u32 vsp1_hgt_read(struct vsp1_hgt *hgt, u32 reg)
 	return vsp1_read(hgt->histo.entity.vsp1, reg);
 }
 
-static inline void vsp1_hgt_write(struct vsp1_hgt *hgt, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_hgt_write(struct vsp1_hgt *hgt,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -131,7 +131,7 @@ static const struct v4l2_ctrl_config hgt_hue_areas = {
 
 static void hgt_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_hgt *hgt = to_hgt(&entity->subdev);
 	struct v4l2_rect *compose;
@@ -148,12 +148,12 @@ static void hgt_prepare(struct vsp1_entity *entity,
 						HISTO_PAD_SINK,
 						V4L2_SEL_TGT_COMPOSE);
 
-	vsp1_hgt_write(hgt, dl, VI6_HGT_REGRST, VI6_HGT_REGRST_RCLEA);
+	vsp1_hgt_write(hgt, dlb, VI6_HGT_REGRST, VI6_HGT_REGRST_RCLEA);
 
-	vsp1_hgt_write(hgt, dl, VI6_HGT_OFFSET,
+	vsp1_hgt_write(hgt, dlb, VI6_HGT_OFFSET,
 		       (crop->left << VI6_HGT_OFFSET_HOFFSET_SHIFT) |
 		       (crop->top << VI6_HGT_OFFSET_VOFFSET_SHIFT));
-	vsp1_hgt_write(hgt, dl, VI6_HGT_SIZE,
+	vsp1_hgt_write(hgt, dlb, VI6_HGT_SIZE,
 		       (crop->width << VI6_HGT_SIZE_HSIZE_SHIFT) |
 		       (crop->height << VI6_HGT_SIZE_VSIZE_SHIFT));
 
@@ -161,7 +161,7 @@ static void hgt_prepare(struct vsp1_entity *entity,
 	for (i = 0; i < HGT_NUM_HUE_AREAS; ++i) {
 		lower = hgt->hue_areas[i*2 + 0];
 		upper = hgt->hue_areas[i*2 + 1];
-		vsp1_hgt_write(hgt, dl, VI6_HGT_HUE_AREA(i),
+		vsp1_hgt_write(hgt, dlb, VI6_HGT_HUE_AREA(i),
 			       (lower << VI6_HGT_HUE_AREA_LOWER_SHIFT) |
 			       (upper << VI6_HGT_HUE_AREA_UPPER_SHIFT));
 	}
@@ -169,7 +169,7 @@ static void hgt_prepare(struct vsp1_entity *entity,
 
 	hratio = crop->width * 2 / compose->width / 3;
 	vratio = crop->height * 2 / compose->height / 3;
-	vsp1_hgt_write(hgt, dl, VI6_HGT_MODE,
+	vsp1_hgt_write(hgt, dlb, VI6_HGT_MODE,
 		       (hratio << VI6_HGT_MODE_HRATIO_SHIFT) |
 		       (vratio << VI6_HGT_MODE_VRATIO_SHIFT));
 }
diff --git a/drivers/media/platform/vsp1/vsp1_hsit.c b/drivers/media/platform/vsp1/vsp1_hsit.c
index cf96ce2c6da9..d5b454db4f96 100644
--- a/drivers/media/platform/vsp1/vsp1_hsit.c
+++ b/drivers/media/platform/vsp1/vsp1_hsit.c
@@ -28,9 +28,9 @@
  */
 
 static inline void vsp1_hsit_write(struct vsp1_hsit *hsit,
-				   struct vsp1_dl_list *dl, u32 reg, u32 data)
+				   struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -133,14 +133,14 @@ static const struct v4l2_subdev_ops hsit_ops = {
 
 static void hsit_prepare(struct vsp1_entity *entity,
 			 struct vsp1_pipeline *pipe,
-			 struct vsp1_dl_list *dl)
+			 struct vsp1_dl_body *dlb)
 {
 	struct vsp1_hsit *hsit = to_hsit(&entity->subdev);
 
 	if (hsit->inverse)
-		vsp1_hsit_write(hsit, dl, VI6_HSI_CTRL, VI6_HSI_CTRL_EN);
+		vsp1_hsit_write(hsit, dlb, VI6_HSI_CTRL, VI6_HSI_CTRL_EN);
 	else
-		vsp1_hsit_write(hsit, dl, VI6_HST_CTRL, VI6_HST_CTRL_EN);
+		vsp1_hsit_write(hsit, dlb, VI6_HST_CTRL, VI6_HST_CTRL_EN);
 }
 
 static const struct vsp1_entity_operations hsit_entity_ops = {
diff --git a/drivers/media/platform/vsp1/vsp1_lif.c b/drivers/media/platform/vsp1/vsp1_lif.c
index 7c6391090844..201922bafb6f 100644
--- a/drivers/media/platform/vsp1/vsp1_lif.c
+++ b/drivers/media/platform/vsp1/vsp1_lif.c
@@ -27,10 +27,11 @@
  * Device Access
  */
 
-static inline void vsp1_lif_write(struct vsp1_lif *lif, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_lif_write(struct vsp1_lif *lif,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg + lif->entity.index * VI6_LIF_OFFSET, data);
+	vsp1_dl_body_write(dlb, reg + lif->entity.index * VI6_LIF_OFFSET,
+			       data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -130,7 +131,7 @@ static const struct v4l2_subdev_ops lif_ops = {
 
 static void lif_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	const struct v4l2_mbus_framefmt *format;
 	struct vsp1_lif *lif = to_lif(&entity->subdev);
@@ -143,11 +144,11 @@ static void lif_prepare(struct vsp1_entity *entity,
 
 	obth = min(obth, (format->width + 1) / 2 * format->height - 4);
 
-	vsp1_lif_write(lif, dl, VI6_LIF_CSBTH,
+	vsp1_lif_write(lif, dlb, VI6_LIF_CSBTH,
 			(hbth << VI6_LIF_CSBTH_HBTH_SHIFT) |
 			(lbth << VI6_LIF_CSBTH_LBTH_SHIFT));
 
-	vsp1_lif_write(lif, dl, VI6_LIF_CTRL,
+	vsp1_lif_write(lif, dlb, VI6_LIF_CTRL,
 			(obth << VI6_LIF_CTRL_OBTH_SHIFT) |
 			(format->code == 0 ? VI6_LIF_CTRL_CFMT : 0) |
 			VI6_LIF_CTRL_REQSEL | VI6_LIF_CTRL_LIF_EN);
@@ -160,7 +161,7 @@ static void lif_prepare(struct vsp1_entity *entity,
 	 */
 	if ((entity->vsp1->version & VI6_IP_VERSION_MASK) ==
 	    (VI6_IP_VERSION_MODEL_VSPD_V3 | VI6_IP_VERSION_SOC_V3M))
-		vsp1_lif_write(lif, dl, VI6_LIF_LBA,
+		vsp1_lif_write(lif, dlb, VI6_LIF_LBA,
 			       VI6_LIF_LBA_LBA0 |
 			       (1536 << VI6_LIF_LBA_LBA1_SHIFT));
 }
diff --git a/drivers/media/platform/vsp1/vsp1_lut.c b/drivers/media/platform/vsp1/vsp1_lut.c
index 33bfaa1df994..58922ebe7ff8 100644
--- a/drivers/media/platform/vsp1/vsp1_lut.c
+++ b/drivers/media/platform/vsp1/vsp1_lut.c
@@ -29,10 +29,10 @@
  * Device Access
  */
 
-static inline void vsp1_lut_write(struct vsp1_lut *lut, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_lut_write(struct vsp1_lut *lut,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -192,33 +192,34 @@ static const struct v4l2_subdev_ops lut_ops = {
 
 static void lut_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_lut *lut = to_lut(&entity->subdev);
 
-	vsp1_lut_write(lut, dl, VI6_LUT_CTRL, VI6_LUT_CTRL_EN);
+	vsp1_lut_write(lut, dlb, VI6_LUT_CTRL, VI6_LUT_CTRL_EN);
 }
 
 static void lut_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb,
 			  unsigned int partition)
 {
 	struct vsp1_lut *lut = to_lut(&entity->subdev);
-	struct vsp1_dl_body *dlb;
+	struct vsp1_dl_body *lut_dlb;
 	unsigned long flags;
 
 	if (partition == 0) {
 		spin_lock_irqsave(&lut->lock, flags);
-		dlb = lut->lut;
+		lut_dlb = lut->lut;
 		lut->lut = NULL;
 		spin_unlock_irqrestore(&lut->lock, flags);
 
-		if (dlb) {
-			vsp1_dl_list_add_body(dl, dlb);
+		if (lut_dlb) {
+			vsp1_dl_list_add_body(dl, lut_dlb);
 
 			/* release our local reference */
-			vsp1_dl_body_put(dlb);
+			vsp1_dl_body_put(lut_dlb);
 		}
 	}
 }
diff --git a/drivers/media/platform/vsp1/vsp1_pipe.c b/drivers/media/platform/vsp1/vsp1_pipe.c
index 44944ac86d9b..5012643583b6 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.c
+++ b/drivers/media/platform/vsp1/vsp1_pipe.c
@@ -367,7 +367,7 @@ void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe)
  * from the input RPF alpha.
  */
 void vsp1_pipeline_propagate_alpha(struct vsp1_pipeline *pipe,
-				   struct vsp1_dl_list *dl, unsigned int alpha)
+				   struct vsp1_dl_body *dlb, unsigned int alpha)
 {
 	if (!pipe->uds)
 		return;
@@ -380,7 +380,7 @@ void vsp1_pipeline_propagate_alpha(struct vsp1_pipeline *pipe,
 	    pipe->uds_input->type == VSP1_ENTITY_BRS)
 		alpha = 255;
 
-	vsp1_uds_set_alpha(pipe->uds, dl, alpha);
+	vsp1_uds_set_alpha(pipe->uds, dlb, alpha);
 }
 
 /*
diff --git a/drivers/media/platform/vsp1/vsp1_pipe.h b/drivers/media/platform/vsp1/vsp1_pipe.h
index dfff9b5685fe..90d29492b9b9 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.h
+++ b/drivers/media/platform/vsp1/vsp1_pipe.h
@@ -161,7 +161,8 @@ bool vsp1_pipeline_ready(struct vsp1_pipeline *pipe);
 void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe);
 
 void vsp1_pipeline_propagate_alpha(struct vsp1_pipeline *pipe,
-				   struct vsp1_dl_list *dl, unsigned int alpha);
+				   struct vsp1_dl_body *dlb,
+				   unsigned int alpha);
 
 void vsp1_pipeline_propagate_partition(struct vsp1_pipeline *pipe,
 				       struct vsp1_partition *partition,
diff --git a/drivers/media/platform/vsp1/vsp1_rpf.c b/drivers/media/platform/vsp1/vsp1_rpf.c
index 87a47997a086..79fd86f1b9da 100644
--- a/drivers/media/platform/vsp1/vsp1_rpf.c
+++ b/drivers/media/platform/vsp1/vsp1_rpf.c
@@ -29,9 +29,10 @@
  */
 
 static inline void vsp1_rpf_write(struct vsp1_rwpf *rpf,
-				  struct vsp1_dl_list *dl, u32 reg, u32 data)
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg + rpf->entity.index * VI6_RPF_OFFSET, data);
+	vsp1_dl_body_write(dlb, reg + rpf->entity.index * VI6_RPF_OFFSET,
+			       data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -48,7 +49,7 @@ static const struct v4l2_subdev_ops rpf_ops = {
 
 static void rpf_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
 	const struct vsp1_format_info *fmtinfo = rpf->fmtinfo;
@@ -67,7 +68,7 @@ static void rpf_prepare(struct vsp1_entity *entity,
 		pstride |= format->plane_fmt[1].bytesperline
 			<< VI6_RPF_SRCM_PSTRIDE_C_SHIFT;
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_PSTRIDE, pstride);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRCM_PSTRIDE, pstride);
 
 	/* Format */
 	sink_format = vsp1_entity_get_pad_format(&rpf->entity,
@@ -88,8 +89,8 @@ static void rpf_prepare(struct vsp1_entity *entity,
 	if (sink_format->code != source_format->code)
 		infmt |= VI6_RPF_INFMT_CSC;
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_INFMT, infmt);
-	vsp1_rpf_write(rpf, dl, VI6_RPF_DSWAP, fmtinfo->swap);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_INFMT, infmt);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_DSWAP, fmtinfo->swap);
 
 	/* Output location */
 	if (pipe->bru) {
@@ -103,7 +104,7 @@ static void rpf_prepare(struct vsp1_entity *entity,
 		top = compose->top;
 	}
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_LOC,
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_LOC,
 		       (left << VI6_RPF_LOC_HCOORD_SHIFT) |
 		       (top << VI6_RPF_LOC_VCOORD_SHIFT));
 
@@ -130,7 +131,7 @@ static void rpf_prepare(struct vsp1_entity *entity,
 	 *
 	 * In all cases, disable color keying.
 	 */
-	vsp1_rpf_write(rpf, dl, VI6_RPF_ALPH_SEL, VI6_RPF_ALPH_SEL_AEXT_EXT |
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_ALPH_SEL, VI6_RPF_ALPH_SEL_AEXT_EXT |
 		       (fmtinfo->alpha ? VI6_RPF_ALPH_SEL_ASEL_PACKED
 				       : VI6_RPF_ALPH_SEL_ASEL_FIXED));
 
@@ -167,15 +168,14 @@ static void rpf_prepare(struct vsp1_entity *entity,
 		rpf->mult_alpha = mult;
 	}
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_MSK_CTRL, 0);
-	vsp1_rpf_write(rpf, dl, VI6_RPF_CKEY_CTRL, 0);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_MSK_CTRL, 0);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_CKEY_CTRL, 0);
 
 }
 
 static void rpf_configure(struct vsp1_entity *entity,
-			  struct vsp1_pipeline *pipe,
-			  struct vsp1_dl_list *dl,
-			  unsigned int partition)
+			  struct vsp1_pipeline *pipe, struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb, unsigned int partition)
 {
 	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
 	struct vsp1_rwpf_memory mem = rpf->mem;
@@ -185,15 +185,14 @@ static void rpf_configure(struct vsp1_entity *entity,
 	struct v4l2_rect crop;
 
 	if (partition == 0) {
-		vsp1_rpf_write(rpf, dl, VI6_RPF_VRTCOL_SET,
+		vsp1_rpf_write(rpf, dlb, VI6_RPF_VRTCOL_SET,
 			       rpf->alpha << VI6_RPF_VRTCOL_SET_LAYA_SHIFT);
-		vsp1_rpf_write(rpf, dl, VI6_RPF_MULT_ALPHA, rpf->mult_alpha |
+		vsp1_rpf_write(rpf, dlb, VI6_RPF_MULT_ALPHA, rpf->mult_alpha |
 			       (rpf->alpha << VI6_RPF_MULT_ALPHA_RATIO_SHIFT));
 
-		vsp1_pipeline_propagate_alpha(pipe, dl, rpf->alpha);
+		vsp1_pipeline_propagate_alpha(pipe, dlb, rpf->alpha);
 	}
 
-
 	/*
 	 * Source size and crop offsets.
 	 *
@@ -219,10 +218,10 @@ static void rpf_configure(struct vsp1_entity *entity,
 		crop.left += pipe->partition->rpf.left;
 	}
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_BSIZE,
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRC_BSIZE,
 		       (crop.width << VI6_RPF_SRC_BSIZE_BHSIZE_SHIFT) |
 		       (crop.height << VI6_RPF_SRC_BSIZE_BVSIZE_SHIFT));
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRC_ESIZE,
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRC_ESIZE,
 		       (crop.width << VI6_RPF_SRC_ESIZE_EHSIZE_SHIFT) |
 		       (crop.height << VI6_RPF_SRC_ESIZE_EVSIZE_SHIFT));
 
@@ -247,9 +246,9 @@ static void rpf_configure(struct vsp1_entity *entity,
 	    fmtinfo->swap_uv)
 		swap(mem.addr[1], mem.addr[2]);
 
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_Y, mem.addr[0]);
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C0, mem.addr[1]);
-	vsp1_rpf_write(rpf, dl, VI6_RPF_SRCM_ADDR_C1, mem.addr[2]);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRCM_ADDR_Y, mem.addr[0]);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRCM_ADDR_C0, mem.addr[1]);
+	vsp1_rpf_write(rpf, dlb, VI6_RPF_SRCM_ADDR_C1, mem.addr[2]);
 }
 
 static void rpf_partition(struct vsp1_entity *entity,
diff --git a/drivers/media/platform/vsp1/vsp1_sru.c b/drivers/media/platform/vsp1/vsp1_sru.c
index 0a24bc59bc2f..1f938b6d6d45 100644
--- a/drivers/media/platform/vsp1/vsp1_sru.c
+++ b/drivers/media/platform/vsp1/vsp1_sru.c
@@ -28,10 +28,10 @@
  * Device Access
  */
 
-static inline void vsp1_sru_write(struct vsp1_sru *sru, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_sru_write(struct vsp1_sru *sru,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_body_write(dlb, reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -273,7 +273,7 @@ static const struct v4l2_subdev_ops sru_ops = {
 
 static void sru_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	const struct vsp1_sru_param *param;
 	struct vsp1_sru *sru = to_sru(&entity->subdev);
@@ -299,9 +299,9 @@ static void sru_prepare(struct vsp1_entity *entity,
 
 	ctrl0 |= param->ctrl0;
 
-	vsp1_sru_write(sru, dl, VI6_SRU_CTRL0, ctrl0);
-	vsp1_sru_write(sru, dl, VI6_SRU_CTRL1, VI6_SRU_CTRL1_PARAM5);
-	vsp1_sru_write(sru, dl, VI6_SRU_CTRL2, param->ctrl2);
+	vsp1_sru_write(sru, dlb, VI6_SRU_CTRL0, ctrl0);
+	vsp1_sru_write(sru, dlb, VI6_SRU_CTRL1, VI6_SRU_CTRL1_PARAM5);
+	vsp1_sru_write(sru, dlb, VI6_SRU_CTRL2, param->ctrl2);
 }
 
 static unsigned int sru_max_width(struct vsp1_entity *entity,
diff --git a/drivers/media/platform/vsp1/vsp1_uds.c b/drivers/media/platform/vsp1/vsp1_uds.c
index 84be962a33b1..7afcb2ae30a5 100644
--- a/drivers/media/platform/vsp1/vsp1_uds.c
+++ b/drivers/media/platform/vsp1/vsp1_uds.c
@@ -31,22 +31,23 @@
  * Device Access
  */
 
-static inline void vsp1_uds_write(struct vsp1_uds *uds, struct vsp1_dl_list *dl,
-				  u32 reg, u32 data)
+static inline void vsp1_uds_write(struct vsp1_uds *uds,
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg + uds->entity.index * VI6_UDS_OFFSET, data);
+	vsp1_dl_body_write(dlb, reg + uds->entity.index * VI6_UDS_OFFSET,
+			       data);
 }
 
 /* -----------------------------------------------------------------------------
  * Scaling Computation
  */
 
-void vsp1_uds_set_alpha(struct vsp1_entity *entity, struct vsp1_dl_list *dl,
+void vsp1_uds_set_alpha(struct vsp1_entity *entity, struct vsp1_dl_body *dlb,
 			unsigned int alpha)
 {
 	struct vsp1_uds *uds = to_uds(&entity->subdev);
 
-	vsp1_uds_write(uds, dl, VI6_UDS_ALPVAL,
+	vsp1_uds_write(uds, dlb, VI6_UDS_ALPVAL,
 		       alpha << VI6_UDS_ALPVAL_VAL0_SHIFT);
 }
 
@@ -261,7 +262,7 @@ static const struct v4l2_subdev_ops uds_ops = {
 
 static void uds_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_uds *uds = to_uds(&entity->subdev);
 	const struct v4l2_mbus_framefmt *output;
@@ -290,18 +291,18 @@ static void uds_prepare(struct vsp1_entity *entity,
 	else
 		multitap = true;
 
-	vsp1_uds_write(uds, dl, VI6_UDS_CTRL,
+	vsp1_uds_write(uds, dlb, VI6_UDS_CTRL,
 		       (uds->scale_alpha ? VI6_UDS_CTRL_AON : 0) |
 		       (multitap ? VI6_UDS_CTRL_BC : 0));
 
-	vsp1_uds_write(uds, dl, VI6_UDS_PASS_BWIDTH,
+	vsp1_uds_write(uds, dlb, VI6_UDS_PASS_BWIDTH,
 		       (uds_passband_width(hscale)
 				<< VI6_UDS_PASS_BWIDTH_H_SHIFT) |
 		       (uds_passband_width(vscale)
 				<< VI6_UDS_PASS_BWIDTH_V_SHIFT));
 
 	/* Set the scaling ratios. */
-	vsp1_uds_write(uds, dl, VI6_UDS_SCALE,
+	vsp1_uds_write(uds, dlb, VI6_UDS_SCALE,
 		       (hscale << VI6_UDS_SCALE_HFRAC_SHIFT) |
 		       (vscale << VI6_UDS_SCALE_VFRAC_SHIFT));
 }
@@ -309,6 +310,7 @@ static void uds_prepare(struct vsp1_entity *entity,
 static void uds_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb,
 			  unsigned int pindex)
 {
 	struct vsp1_uds *uds = to_uds(&entity->subdev);
@@ -319,13 +321,13 @@ static void uds_configure(struct vsp1_entity *entity,
 					    UDS_PAD_SOURCE);
 
 	/* Input size clipping */
-	vsp1_uds_write(uds, dl, VI6_UDS_HSZCLIP, VI6_UDS_HSZCLIP_HCEN |
+	vsp1_uds_write(uds, dlb, VI6_UDS_HSZCLIP, VI6_UDS_HSZCLIP_HCEN |
 		       (0 << VI6_UDS_HSZCLIP_HCL_OFST_SHIFT) |
 		       (partition->uds_sink.width
 				<< VI6_UDS_HSZCLIP_HCL_SIZE_SHIFT));
 
 	/* Output size clipping */
-	vsp1_uds_write(uds, dl, VI6_UDS_CLIP_SIZE,
+	vsp1_uds_write(uds, dlb, VI6_UDS_CLIP_SIZE,
 		       (partition->uds_source.width
 				<< VI6_UDS_CLIP_SIZE_HSIZE_SHIFT) |
 		       (output->height
diff --git a/drivers/media/platform/vsp1/vsp1_uds.h b/drivers/media/platform/vsp1/vsp1_uds.h
index 7bf3cdcffc65..d99997f3b28d 100644
--- a/drivers/media/platform/vsp1/vsp1_uds.h
+++ b/drivers/media/platform/vsp1/vsp1_uds.h
@@ -35,7 +35,7 @@ static inline struct vsp1_uds *to_uds(struct v4l2_subdev *subdev)
 
 struct vsp1_uds *vsp1_uds_create(struct vsp1_device *vsp1, unsigned int index);
 
-void vsp1_uds_set_alpha(struct vsp1_entity *uds, struct vsp1_dl_list *dl,
+void vsp1_uds_set_alpha(struct vsp1_entity *uds, struct vsp1_dl_body *dlb,
 			unsigned int alpha);
 
 #endif /* __VSP1_UDS_H__ */
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index bd5403f24dda..9c9dcb7daecf 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -383,11 +383,12 @@ static void vsp1_video_pipeline_run_partition(struct vsp1_pipeline *pipe,
 					      unsigned int partition)
 {
 	struct vsp1_entity *entity;
+	struct vsp1_dl_body *dlb = vsp1_dl_list_get_body0(dl);
 
 	pipe->partition = &pipe->part_table[partition];
 
 	list_for_each_entry(entity, &pipe->entities, list_pipe)
-		vsp1_entity_configure(entity, pipe, dl, partition);
+		vsp1_entity_configure(entity, pipe, dl, dlb, partition);
 }
 
 static void vsp1_video_pipeline_run(struct vsp1_pipeline *pipe)
@@ -790,6 +791,7 @@ static void vsp1_video_buffer_queue(struct vb2_buffer *vb)
 static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 {
 	struct vsp1_entity *entity;
+	struct vsp1_dl_body *dlb;
 	int ret;
 
 	/* Determine this pipelines sizes for image partitioning support. */
@@ -802,6 +804,9 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 	if (!pipe->dl)
 		return -ENOMEM;
 
+	/* Retrieve the default DLB from the list */
+	dlb = vsp1_dl_list_get_body0(pipe->dl);
+
 	if (pipe->uds) {
 		struct vsp1_uds *uds = to_uds(&pipe->uds->subdev);
 
@@ -824,8 +829,8 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 	}
 
 	list_for_each_entry(entity, &pipe->entities, list_pipe) {
-		vsp1_entity_route_setup(entity, pipe, pipe->dl);
-		vsp1_entity_prepare(entity, pipe, pipe->dl);
+		vsp1_entity_route_setup(entity, pipe, dlb);
+		vsp1_entity_prepare(entity, pipe, dlb);
 	}
 
 	return 0;
diff --git a/drivers/media/platform/vsp1/vsp1_wpf.c b/drivers/media/platform/vsp1/vsp1_wpf.c
index d6dd7e783d27..2b5c006ecb54 100644
--- a/drivers/media/platform/vsp1/vsp1_wpf.c
+++ b/drivers/media/platform/vsp1/vsp1_wpf.c
@@ -31,9 +31,10 @@
  */
 
 static inline void vsp1_wpf_write(struct vsp1_rwpf *wpf,
-				  struct vsp1_dl_list *dl, u32 reg, u32 data)
+				  struct vsp1_dl_body *dlb, u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg + wpf->entity.index * VI6_WPF_OFFSET, data);
+	vsp1_dl_body_write(dlb, reg + wpf->entity.index * VI6_WPF_OFFSET,
+			       data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -238,7 +239,7 @@ static void vsp1_wpf_destroy(struct vsp1_entity *entity)
 
 static void wpf_prepare(struct vsp1_entity *entity,
 			struct vsp1_pipeline *pipe,
-			struct vsp1_dl_list *dl)
+			struct vsp1_dl_body *dlb)
 {
 	struct vsp1_rwpf *wpf = to_rwpf(&entity->subdev);
 	struct vsp1_device *vsp1 = wpf->entity.vsp1;
@@ -272,17 +273,17 @@ static void wpf_prepare(struct vsp1_entity *entity,
 			outfmt |= VI6_WPF_OUTFMT_SPUVS;
 
 		/* Destination stride and byte swapping. */
-		vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_STRIDE_Y,
+		vsp1_wpf_write(wpf, dlb, VI6_WPF_DSTM_STRIDE_Y,
 			       format->plane_fmt[0].bytesperline);
 		if (format->num_planes > 1)
-			vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_STRIDE_C,
+			vsp1_wpf_write(wpf, dlb, VI6_WPF_DSTM_STRIDE_C,
 				       format->plane_fmt[1].bytesperline);
 
-		vsp1_wpf_write(wpf, dl, VI6_WPF_DSWAP, fmtinfo->swap);
+		vsp1_wpf_write(wpf, dlb, VI6_WPF_DSWAP, fmtinfo->swap);
 
 		if (vsp1->info->features & VSP1_HAS_WPF_HFLIP &&
 		    wpf->entity.index == 0)
-			vsp1_wpf_write(wpf, dl, VI6_WPF_ROT_CTRL,
+			vsp1_wpf_write(wpf, dlb, VI6_WPF_ROT_CTRL,
 				       VI6_WPF_ROT_CTRL_LN16 |
 				       (256 << VI6_WPF_ROT_CTRL_LMEM_WD_SHIFT));
 	}
@@ -292,10 +293,10 @@ static void wpf_prepare(struct vsp1_entity *entity,
 
 	wpf->outfmt = outfmt;
 
-	vsp1_dl_list_write(dl, VI6_DPR_WPF_FPORCH(wpf->entity.index),
-			   VI6_DPR_WPF_FPORCH_FP_WPFN);
+	vsp1_dl_body_write(dlb, VI6_DPR_WPF_FPORCH(wpf->entity.index),
+			       VI6_DPR_WPF_FPORCH_FP_WPFN);
 
-	vsp1_dl_list_write(dl, VI6_WPF_WRBCK_CTRL, 0);
+	vsp1_dl_body_write(dlb, VI6_WPF_WRBCK_CTRL, 0);
 
 	/*
 	 * Sources. If the pipeline has a single input and BRU is not used,
@@ -319,17 +320,18 @@ static void wpf_prepare(struct vsp1_entity *entity,
 			? VI6_WPF_SRCRPF_VIRACT_MST
 			: VI6_WPF_SRCRPF_VIRACT2_MST;
 
-	vsp1_wpf_write(wpf, dl, VI6_WPF_SRCRPF, srcrpf);
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_SRCRPF, srcrpf);
 
 	/* Enable interrupts */
-	vsp1_dl_list_write(dl, VI6_WPF_IRQ_STA(wpf->entity.index), 0);
-	vsp1_dl_list_write(dl, VI6_WPF_IRQ_ENB(wpf->entity.index),
-			   VI6_WFP_IRQ_ENB_DFEE);
+	vsp1_dl_body_write(dlb, VI6_WPF_IRQ_STA(wpf->entity.index), 0);
+	vsp1_dl_body_write(dlb, VI6_WPF_IRQ_ENB(wpf->entity.index),
+			       VI6_WFP_IRQ_ENB_DFEE);
 }
 
 static void wpf_configure(struct vsp1_entity *entity,
 			  struct vsp1_pipeline *pipe,
 			  struct vsp1_dl_list *dl,
+			  struct vsp1_dl_body *dlb,
 			  unsigned int partition)
 {
 	struct vsp1_rwpf *wpf = to_rwpf(&entity->subdev);
@@ -363,7 +365,7 @@ static void wpf_configure(struct vsp1_entity *entity,
 		if (wpf->flip.active & BIT(WPF_CTRL_HFLIP))
 			outfmt |= VI6_WPF_OUTFMT_HFLP;
 
-		vsp1_wpf_write(wpf, dl, VI6_WPF_OUTFMT, outfmt);
+		vsp1_wpf_write(wpf, dlb, VI6_WPF_OUTFMT, outfmt);
 	}
 
 	sink_format = vsp1_entity_get_pad_format(&wpf->entity,
@@ -379,10 +381,10 @@ static void wpf_configure(struct vsp1_entity *entity,
 	if (pipe->partitions > 1)
 		width = pipe->partition->wpf.width;
 
-	vsp1_wpf_write(wpf, dl, VI6_WPF_HSZCLIP, VI6_WPF_SZCLIP_EN |
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_HSZCLIP, VI6_WPF_SZCLIP_EN |
 		       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
 		       (width << VI6_WPF_SZCLIP_SIZE_SHIFT));
-	vsp1_wpf_write(wpf, dl, VI6_WPF_VSZCLIP, VI6_WPF_SZCLIP_EN |
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_VSZCLIP, VI6_WPF_SZCLIP_EN |
 		       (0 << VI6_WPF_SZCLIP_OFST_SHIFT) |
 		       (height << VI6_WPF_SZCLIP_SIZE_SHIFT));
 
@@ -474,9 +476,9 @@ static void wpf_configure(struct vsp1_entity *entity,
 	    fmtinfo->swap_uv)
 		swap(mem.addr[1], mem.addr[2]);
 
-	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_Y, mem.addr[0]);
-	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C0, mem.addr[1]);
-	vsp1_wpf_write(wpf, dl, VI6_WPF_DSTM_ADDR_C1, mem.addr[2]);
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_DSTM_ADDR_Y, mem.addr[0]);
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_DSTM_ADDR_C0, mem.addr[1]);
+	vsp1_wpf_write(wpf, dlb, VI6_WPF_DSTM_ADDR_C1, mem.addr[2]);
 }
 
 static unsigned int wpf_max_width(struct vsp1_entity *entity,
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 8/9] v4l: vsp1: Move video configuration to a cached dlb
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (6 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 7/9] v4l: vsp1: Adapt entities to configure into a body Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-02-28 20:52 ` [PATCH v6 9/9] v4l: vsp1: Reduce display list body size Kieran Bingham
  8 siblings, 0 replies; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

We are now able to configure a pipeline directly into a local display
list body. Take advantage of this fact, and create a cacheable body to
store the configuration of the pipeline in the video object.

vsp1_video_pipeline_run() is now the last user of the pipe->dl object.
Convert this function to use the cached video->config body and obtain a
local display list reference.

Attach the video->config body to the display list when needed before
committing to hardware.

The pipe object is marked as un-configured when resuming from a suspend.
This ensures that when the hardware is reset - our cached configuration
will be re-attached to the next committed DL.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
---

v3:
 - 's/fragment/body/', 's/fragments/bodies/'
 - video dlb cache allocation increased from 2 to 3 dlbs

Our video DL usage now looks like the below output:

dl->body0 contains our disposable runtime configuration. Max 41.
dl_child->body0 is our partition specific configuration. Max 12.
dl->bodies shows our constant configuration and LUTs.

  These two are LUT/CLU:
     * dl->bodies[x]->num_entries 256 / max 256
     * dl->bodies[x]->num_entries 4914 / max 4914

Which shows that our 'constant' configuration cache is currently
utilised to a maximum of 64 entries.

trace-cmd report | \
    grep max | sed 's/.*vsp1_dl_list_commit://g' | sort | uniq;

  dl->body0->num_entries 13 / max 128
  dl->body0->num_entries 14 / max 128
  dl->body0->num_entries 16 / max 128
  dl->body0->num_entries 20 / max 128
  dl->body0->num_entries 27 / max 128
  dl->body0->num_entries 34 / max 128
  dl->body0->num_entries 41 / max 128
  dl_child->body0->num_entries 10 / max 128
  dl_child->body0->num_entries 12 / max 128
  dl->bodies[x]->num_entries 15 / max 128
  dl->bodies[x]->num_entries 16 / max 128
  dl->bodies[x]->num_entries 17 / max 128
  dl->bodies[x]->num_entries 18 / max 128
  dl->bodies[x]->num_entries 20 / max 128
  dl->bodies[x]->num_entries 21 / max 128
  dl->bodies[x]->num_entries 256 / max 256
  dl->bodies[x]->num_entries 31 / max 128
  dl->bodies[x]->num_entries 32 / max 128
  dl->bodies[x]->num_entries 39 / max 128
  dl->bodies[x]->num_entries 40 / max 128
  dl->bodies[x]->num_entries 47 / max 128
  dl->bodies[x]->num_entries 48 / max 128
  dl->bodies[x]->num_entries 4914 / max 4914
  dl->bodies[x]->num_entries 55 / max 128
  dl->bodies[x]->num_entries 56 / max 128
  dl->bodies[x]->num_entries 63 / max 128
  dl->bodies[x]->num_entries 64 / max 128

v4:
 - Adjust pipe configured flag to be reset on resume rather than suspend
 - rename dl_child, dl_next

 drivers/media/platform/vsp1/vsp1_pipe.c  |  7 +++-
 drivers/media/platform/vsp1/vsp1_pipe.h  |  4 +-
 drivers/media/platform/vsp1/vsp1_video.c | 67 ++++++++++++++++---------
 drivers/media/platform/vsp1/vsp1_video.h |  2 +-
 4 files changed, 54 insertions(+), 26 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_pipe.c b/drivers/media/platform/vsp1/vsp1_pipe.c
index 5012643583b6..fa445b1a2e38 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.c
+++ b/drivers/media/platform/vsp1/vsp1_pipe.c
@@ -249,6 +249,7 @@ void vsp1_pipeline_run(struct vsp1_pipeline *pipe)
 		vsp1_write(vsp1, VI6_CMD(pipe->output->entity.index),
 			   VI6_CMD_STRCMD);
 		pipe->state = VSP1_PIPELINE_RUNNING;
+		pipe->configured = true;
 	}
 
 	pipe->buffers_ready = 0;
@@ -470,6 +471,12 @@ void vsp1_pipelines_resume(struct vsp1_device *vsp1)
 			continue;
 
 		spin_lock_irqsave(&pipe->irqlock, flags);
+		/*
+		 * The hardware may have been reset during a suspend and will
+		 * need a full reconfiguration
+		 */
+		pipe->configured = false;
+
 		if (vsp1_pipeline_ready(pipe))
 			vsp1_pipeline_run(pipe);
 		spin_unlock_irqrestore(&pipe->irqlock, flags);
diff --git a/drivers/media/platform/vsp1/vsp1_pipe.h b/drivers/media/platform/vsp1/vsp1_pipe.h
index 90d29492b9b9..e7ad6211b4d0 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.h
+++ b/drivers/media/platform/vsp1/vsp1_pipe.h
@@ -90,6 +90,7 @@ struct vsp1_partition {
  * @irqlock: protects the pipeline state
  * @state: current state
  * @wq: wait queue to wait for state change completion
+ * @configured: flag determining if the hardware has run since reset
  * @frame_end: frame end interrupt handler
  * @lock: protects the pipeline use count and stream count
  * @kref: pipeline reference count
@@ -117,6 +118,7 @@ struct vsp1_pipeline {
 	spinlock_t irqlock;
 	enum vsp1_pipeline_state state;
 	wait_queue_head_t wq;
+	bool configured;
 
 	void (*frame_end)(struct vsp1_pipeline *pipe, bool completed);
 
@@ -143,8 +145,6 @@ struct vsp1_pipeline {
 	 */
 	struct list_head entities;
 
-	struct vsp1_dl_list *dl;
-
 	unsigned int partitions;
 	struct vsp1_partition *partition;
 	struct vsp1_partition *part_table;
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index 9c9dcb7daecf..ed2618c94498 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -394,37 +394,43 @@ static void vsp1_video_pipeline_run_partition(struct vsp1_pipeline *pipe,
 static void vsp1_video_pipeline_run(struct vsp1_pipeline *pipe)
 {
 	struct vsp1_device *vsp1 = pipe->output->entity.vsp1;
+	struct vsp1_video *video = pipe->output->video;
 	unsigned int partition;
+	struct vsp1_dl_list *dl;
+
+	dl = vsp1_dl_list_get(pipe->output->dlm);
 
-	if (!pipe->dl)
-		pipe->dl = vsp1_dl_list_get(pipe->output->dlm);
+	/* Attach our pipe configuration to fully initialise the hardware */
+	if (!pipe->configured) {
+		vsp1_dl_list_add_body(dl, video->pipe_config);
+		pipe->configured = true;
+	}
 
 	/* Run the first partition */
-	vsp1_video_pipeline_run_partition(pipe, pipe->dl, 0);
+	vsp1_video_pipeline_run_partition(pipe, dl, 0);
 
 	/* Process consecutive partitions as necessary */
 	for (partition = 1; partition < pipe->partitions; ++partition) {
-		struct vsp1_dl_list *dl;
+		struct vsp1_dl_list *dl_next;
 
-		dl = vsp1_dl_list_get(pipe->output->dlm);
+		dl_next = vsp1_dl_list_get(pipe->output->dlm);
 
 		/*
 		 * An incomplete chain will still function, but output only
 		 * the partitions that had a dl available. The frame end
 		 * interrupt will be marked on the last dl in the chain.
 		 */
-		if (!dl) {
+		if (!dl_next) {
 			dev_err(vsp1->dev, "Failed to obtain a dl list. Frame will be incomplete\n");
 			break;
 		}
 
-		vsp1_video_pipeline_run_partition(pipe, dl, partition);
-		vsp1_dl_list_add_chain(pipe->dl, dl);
+		vsp1_video_pipeline_run_partition(pipe, dl_next, partition);
+		vsp1_dl_list_add_chain(dl, dl_next);
 	}
 
 	/* Complete, and commit the head display list. */
-	vsp1_dl_list_commit(pipe->dl);
-	pipe->dl = NULL;
+	vsp1_dl_list_commit(dl);
 
 	vsp1_pipeline_run(pipe);
 }
@@ -790,8 +796,8 @@ static void vsp1_video_buffer_queue(struct vb2_buffer *vb)
 
 static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 {
+	struct vsp1_video *video = pipe->output->video;
 	struct vsp1_entity *entity;
-	struct vsp1_dl_body *dlb;
 	int ret;
 
 	/* Determine this pipelines sizes for image partitioning support. */
@@ -799,14 +805,6 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 	if (ret < 0)
 		return ret;
 
-	/* Prepare the display list. */
-	pipe->dl = vsp1_dl_list_get(pipe->output->dlm);
-	if (!pipe->dl)
-		return -ENOMEM;
-
-	/* Retrieve the default DLB from the list */
-	dlb = vsp1_dl_list_get_body0(pipe->dl);
-
 	if (pipe->uds) {
 		struct vsp1_uds *uds = to_uds(&pipe->uds->subdev);
 
@@ -828,11 +826,20 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 		}
 	}
 
+	/* Obtain a clean body from our pool */
+	video->pipe_config = vsp1_dl_body_get(video->dlbs);
+	if (!video->pipe_config)
+		return -ENOMEM;
+
+	/* Configure the entities into our cached pipe configuration */
 	list_for_each_entry(entity, &pipe->entities, list_pipe) {
-		vsp1_entity_route_setup(entity, pipe, dlb);
-		vsp1_entity_prepare(entity, pipe, dlb);
+		vsp1_entity_route_setup(entity, pipe, video->pipe_config);
+		vsp1_entity_prepare(entity, pipe, video->pipe_config);
 	}
 
+	/* Ensure that our cached configuration is updated in the next DL */
+	pipe->configured = false;
+
 	return 0;
 }
 
@@ -842,6 +849,9 @@ static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
 	struct vsp1_vb2_buffer *buffer;
 	unsigned long flags;
 
+	/* Release any cached configuration */
+	vsp1_dl_body_put(video->pipe_config);
+
 	/* Remove all buffers from the IRQ queue. */
 	spin_lock_irqsave(&video->irqlock, flags);
 	list_for_each_entry(buffer, &video->irqqueue, queue)
@@ -918,9 +928,6 @@ static void vsp1_video_stop_streaming(struct vb2_queue *vq)
 		ret = vsp1_pipeline_stop(pipe);
 		if (ret == -ETIMEDOUT)
 			dev_err(video->vsp1->dev, "pipeline stop timeout\n");
-
-		vsp1_dl_list_put(pipe->dl);
-		pipe->dl = NULL;
 	}
 	mutex_unlock(&pipe->lock);
 
@@ -1240,6 +1247,16 @@ struct vsp1_video *vsp1_video_create(struct vsp1_device *vsp1,
 		goto error;
 	}
 
+	/*
+	 * Utilise a body pool to cache the constant configuration of the
+	 * pipeline object.
+	 */
+	video->dlbs = vsp1_dl_body_pool_create(vsp1, 3, 128, 0);
+	if (!video->dlbs) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
 	return video;
 
 error:
@@ -1249,6 +1266,8 @@ struct vsp1_video *vsp1_video_create(struct vsp1_device *vsp1,
 
 void vsp1_video_cleanup(struct vsp1_video *video)
 {
+	vsp1_dl_body_pool_destroy(video->dlbs);
+
 	if (video_is_registered(&video->video))
 		video_unregister_device(&video->video);
 
diff --git a/drivers/media/platform/vsp1/vsp1_video.h b/drivers/media/platform/vsp1/vsp1_video.h
index 50ea7f02205f..e84f8ee902c1 100644
--- a/drivers/media/platform/vsp1/vsp1_video.h
+++ b/drivers/media/platform/vsp1/vsp1_video.h
@@ -43,6 +43,8 @@ struct vsp1_video {
 
 	struct mutex lock;
 
+	struct vsp1_dl_body_pool *dlbs;
+	struct vsp1_dl_body *pipe_config;
 	unsigned int pipe_index;
 
 	struct vb2_queue queue;
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 9/9] v4l: vsp1: Reduce display list body size
  2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
                   ` (7 preceding siblings ...)
  2018-02-28 20:52 ` [PATCH v6 8/9] v4l: vsp1: Move video configuration to a cached dlb Kieran Bingham
@ 2018-02-28 20:52 ` Kieran Bingham
  2018-05-20  7:04   ` Laurent Pinchart
  8 siblings, 1 reply; 11+ messages in thread
From: Kieran Bingham @ 2018-02-28 20:52 UTC (permalink / raw)
  To: Laurent Pinchart, linux-media, linux-renesas-soc; +Cc: Kieran Bingham

The display list originally allocated a body of 256 entries to store all
of the register lists required for each frame.

This has now been separated into fragments for constant stream setup, and
runtime updates.

Empirical testing shows that the body0 now uses a maximum of 41
registers for each frame, for both DRM and Video API pipelines thus a
rounded 64 entries provides a suitable allocation.

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
---
 drivers/media/platform/vsp1/vsp1_dl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index a762e840d147..6b5743a431a2 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -21,7 +21,7 @@
 #include "vsp1.h"
 #include "vsp1_dl.h"
 
-#define VSP1_DL_NUM_ENTRIES		256
+#define VSP1_DL_NUM_ENTRIES		64
 
 #define VSP1_DLH_INT_ENABLE		(1 << 1)
 #define VSP1_DLH_AUTO_START		(1 << 0)
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 9/9] v4l: vsp1: Reduce display list body size
  2018-02-28 20:52 ` [PATCH v6 9/9] v4l: vsp1: Reduce display list body size Kieran Bingham
@ 2018-05-20  7:04   ` Laurent Pinchart
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Pinchart @ 2018-05-20  7:04 UTC (permalink / raw)
  To: Kieran Bingham; +Cc: linux-media, linux-renesas-soc

Hi Kieran,

On Wednesday, 28 February 2018 22:52:43 EEST Kieran Bingham wrote:
> The display list originally allocated a body of 256 entries to store all
> of the register lists required for each frame.
> 
> This has now been separated into fragments for constant stream setup, and
> runtime updates.
> 
> Empirical testing shows that the body0 now uses a maximum of 41
> registers for each frame, for both DRM and Video API pipelines thus a
> rounded 64 entries provides a suitable allocation.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>

Just wondering, why have you dropped this from new versions of the patch 
series ?

> ---
>  drivers/media/platform/vsp1/vsp1_dl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/vsp1/vsp1_dl.c
> b/drivers/media/platform/vsp1/vsp1_dl.c index a762e840d147..6b5743a431a2
> 100644
> --- a/drivers/media/platform/vsp1/vsp1_dl.c
> +++ b/drivers/media/platform/vsp1/vsp1_dl.c
> @@ -21,7 +21,7 @@
>  #include "vsp1.h"
>  #include "vsp1_dl.h"
> 
> -#define VSP1_DL_NUM_ENTRIES		256
> +#define VSP1_DL_NUM_ENTRIES		64
> 
>  #define VSP1_DLH_INT_ENABLE		(1 << 1)
>  #define VSP1_DLH_AUTO_START		(1 << 0)


-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-05-20  7:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-28 20:52 [PATCH v6 0/9] vsp1: TLB optimisation and DL caching Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 1/9] v4l: vsp1: Reword uses of 'fragment' as 'body' Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 2/9] v4l: vsp1: Protect bodies against overflow Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 3/9] v4l: vsp1: Provide a body pool Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 4/9] v4l: vsp1: Convert display lists to use new " Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 5/9] v4l: vsp1: Use reference counting for bodies Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 6/9] v4l: vsp1: Refactor display list configure operations Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 7/9] v4l: vsp1: Adapt entities to configure into a body Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 8/9] v4l: vsp1: Move video configuration to a cached dlb Kieran Bingham
2018-02-28 20:52 ` [PATCH v6 9/9] v4l: vsp1: Reduce display list body size Kieran Bingham
2018-05-20  7:04   ` Laurent Pinchart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.