* [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading
@ 2024-03-13 17:44 Louis Chauvet
  2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
                   ` (15 more replies)
  0 siblings, 16 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

This patchset is the second version of [1]. It is almost a complete 
rewrite to use a line-by-line algorithm for the composition.

During the development of this series, Pekka and Arthur found an issue in
drm core. The YUV part of this series depends on the fix [9]. I'll let
Arthur extract it and submit a new independent patch.

It can be divided into five parts:
- PATCH 1 to 4: no functional change is intended, only some formatting and
  documentation (PATCH 2 is taken from [2])
- PATCH 5 to 8: some preparation work not directly related to the
  line-by-line algorithm
- PATCH 10: main patch for this series, it reintroduces the
  line-by-line algorithm
- PATCH 11 to 15: taken from Arthur's series [2], sometimes adapted
  to use the line-by-line algorithm
- PATCH 16: introduces support for DRM_FORMAT_R1/2/4/8

PATCH 10 aims to restore the line-by-line pixel reading algorithm. It
was introduced in 8ba1648567e2 ("drm: vkms: Refactor the plane composer to
accept new formats") but removed in 322d716a3e8a ("drm/vkms: isolate pixel
conversion functionality") in an over-simplification effort.
At that time, nobody noticed the performance impact of this commit. After
the first iteration of my series, people noticed a performance impact, and
it was indeed the case. Pekka suggested reimplementing the line-by-line
algorithm.

Experiments on my side showed a great improvement with the line-by-line
algorithm, and the performance is the same as with the original
line-by-line algorithm. I focused my effort on making the code work for all
rotations and translations. The use of helpers from drm_rect_* avoids
reimplementing existing logic.
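
As a rough illustration of where the difference comes from (a userspace
sketch, not the code from this series; every name below is hypothetical),
a pixel-by-pixel composer pays one indirect call per pixel, while a
line-by-line composer pays one per line and lets the compiler inline the
per-pixel conversion:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's internal 16-bit-per-channel pixel. */
struct argb_u16 {
	uint16_t a, r, g, b;
};

/* Pixel-by-pixel interface: the composer makes one indirect call per pixel. */
typedef void (*read_pixel_fn)(const uint8_t *src, struct argb_u16 *out);

/* Line-by-line interface: the composer makes one indirect call per line. */
typedef void (*read_line_fn)(const uint8_t *src, size_t count,
			     struct argb_u16 *out);

/* Little-endian XRGB8888: bytes are B, G, R, X. v * 257 maps 0xff to 0xffff. */
static void xrgb8888_read_pixel(const uint8_t *src, struct argb_u16 *out)
{
	out->a = 0xffff;
	out->r = (uint16_t)(src[2] * 257);
	out->g = (uint16_t)(src[1] * 257);
	out->b = (uint16_t)(src[0] * 257);
}

static void xrgb8888_read_line(const uint8_t *src, size_t count,
			       struct argb_u16 *out)
{
	const struct argb_u16 *end = out + count;

	/* Direct call: the compiler is free to inline the conversion. */
	for (; out < end; out++, src += 4)
		xrgb8888_read_pixel(src, out);
}

static void compose_pixel_by_pixel(read_pixel_fn read, const uint8_t *src,
				   size_t count, struct argb_u16 *out)
{
	/* This sketch hard-codes 4 bytes per pixel. */
	for (size_t x = 0; x < count; x++)
		read(src + 4 * x, &out[x]);
}

static void compose_line_by_line(read_line_fn read, const uint8_t *src,
				 size_t count, struct argb_u16 *out)
{
	read(src, count, out);
}
```

Both paths produce the same pixels; only the number of indirect calls in
the hot path changes.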

The only "complex" part remaining is the clipping of the coordinates to
avoid reading/writing outside of src/dst. I therefore added a lot of
comments to help anyone who wants to add features later (framebuffer
resizing, for example).
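
To give an idea of what this clipping amounts to, here is a minimal
userspace sketch. It only mirrors the idea behind drm_rect_intersect();
the struct and function names are hypothetical and this is not the code
used in the series. The x2/y2 bounds are treated as exclusive:

```c
#include <stdbool.h>

/* Hypothetical rectangle: x1/y1 inclusive, x2/y2 exclusive. */
struct rect {
	int x1, y1, x2, y2;
};

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Clip @r so that it lies inside @bounds.
 * Returns true if something is still visible after clipping.
 */
static bool rect_clip(struct rect *r, const struct rect *bounds)
{
	r->x1 = max_int(r->x1, bounds->x1);
	r->y1 = max_int(r->y1, bounds->y1);
	r->x2 = min_int(r->x2, bounds->x2);
	r->y2 = min_int(r->y2, bounds->y2);

	return r->x1 < r->x2 && r->y1 < r->y2;
}
```

A plane partly outside the CRTC is reduced to its visible part, and a plane
completely outside is reported as not visible, so the composer never
touches memory outside of the buffers.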

The YUV part is not mandatory for this series, but as my first effort was
to help the integration of YUV, I decided to rebase Arthur's series on
mine to help. I took [3], [4], [5] and [6] and adapted them to use the
line-by-line reading. They were also updated to use 32.32 fixed-point
values for YUV conversion instead of 8.8 fixed-point values.
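
For readers not familiar with fixed-point math, here is a small userspace
sketch of a 32.32 computation. It is not the conversion code from the
series, which relies on the helpers from drm_fixed.h; the macro and
function names are hypothetical, and the example only computes the red
channel using the BT.601 full-range coefficient 1.402:

```c
#include <stdint.h>

#define FP_SHIFT 32
#define FP_ONE   ((int64_t)1 << FP_SHIFT)
#define FP_HALF  ((int64_t)1 << (FP_SHIFT - 1))

/* Build a 32.32 constant from a ratio using integer math only (no floats). */
#define FP_FROM_RATIO(num, den) (((int64_t)(num) * FP_ONE) / (den))

/* BT.601 Cr-to-R coefficient (1.402) as a 32.32 value. */
static const int64_t cr_to_r_fp = FP_FROM_RATIO(1402, 1000);

/* Round a 32.32 value to the nearest integer and clamp it to [0, 255]. */
static uint8_t fp_to_u8_clamped(int64_t v)
{
	if (v <= 0)
		return 0;

	v = (v + FP_HALF) >> FP_SHIFT;

	return v > 255 ? 255 : (uint8_t)v;
}

/* R = Y + 1.402 * (Cr - 128), computed entirely in 32.32 fixed point. */
static uint8_t red_from_y_cr(uint8_t y, uint8_t cr)
{
	int64_t y_fp = (int64_t)y * FP_ONE;
	int64_t r_fp = y_fp + cr_to_r_fp * ((int64_t)cr - 128);

	return fp_to_u8_clamped(r_fp);
}
```

With 32 fractional bits the coefficient is stored with far less rounding
error than with 8 fractional bits, and the products above still fit in a
signed 64-bit integer because the samples are 8-bit values.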

The last patch of this series introduces DRM_FORMAT_R1/2/4/8 to show how
PATCH 7/16 can be used to manage packed pixel formats.
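
As a hedged illustration of what "packed" means here, the sketch below
extracts one pixel from a line of R1/R2/R4/R8 data. It assumes the
leftmost pixel is stored in the most significant bits of each byte, as
described in drm_fourcc.h; the function names are hypothetical and this is
not the accessor code from PATCH 7/16:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the raw value of pixel @x in a line of packed R-only pixels.
 * @bpp is the number of bits per pixel: 1, 2, 4 or 8.
 */
static uint8_t packed_r_get(const uint8_t *line, unsigned int bpp, size_t x)
{
	unsigned int per_byte = 8 / bpp;          /* pixels in one block (byte) */
	size_t byte = x / per_byte;               /* block containing the pixel */
	unsigned int idx = x % per_byte;          /* position inside the block */
	unsigned int shift = 8 - bpp * (idx + 1); /* leftmost pixel in the MSBs */

	return (line[byte] >> shift) & ((1u << bpp) - 1);
}

/* Expand a @bpp-bit value to the full 16-bit range (maximum maps to 0xffff). */
static uint16_t packed_r_to_u16(uint8_t value, unsigned int bpp)
{
	uint32_t max = (1u << bpp) - 1;

	return (uint16_t)((uint32_t)value * 0xffff / max);
}
```

The pair (byte, idx) is the kind of information a pixel accessor has to
return for such formats: the address of the block, plus the position of
the pixel inside that block.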

To properly test the rotation algorithm, I had to implement a new IGT
test [8]. This helped find one issue in the YUV rotation algorithm.

My series was mainly tested with:
- kms_plane (for color conversions)
- kms_rotation_crc (for a subset of rotation and formats)
- kms_rotation (to test all rotation and formats combinations) [8]
- kms_cursor_crc (for translations)
The benchmark used to measure the improvement was done with:
- kms_fb_stress

[1]: https://lore.kernel.org/all/20240201-yuv-v1-0-3ca376f27632@bootlin.com
[2]: https://lore.kernel.org/all/20240110-vkms-yuv-v2-0-952fcaa5a193@riseup.net/
[3]: https://lore.kernel.org/all/20240110-vkms-yuv-v2-3-952fcaa5a193@riseup.net/
[4]: https://lore.kernel.org/all/20240110-vkms-yuv-v2-5-952fcaa5a193@riseup.net/
[5]: https://lore.kernel.org/all/20240110-vkms-yuv-v2-6-952fcaa5a193@riseup.net/
[6]: https://lore.kernel.org/all/20240110-vkms-yuv-v2-7-952fcaa5a193@riseup.net/
[8]: https://lore.kernel.org/r/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com
[9]: https://lore.kernel.org/dri-devel/20240306-louis-vkms-conv-v1-1-5bfe7d129fdd@riseup.net/

To: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
To: Melissa Wen <melissa.srw@gmail.com>
To: Maíra Canal <mairacanal@riseup.net>
To: Haneen Mohammed <hamohammed.sa@gmail.com>
To: Daniel Vetter <daniel@ffwll.ch>
To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Maxime Ripard <mripard@kernel.org>
To: Thomas Zimmermann <tzimmermann@suse.de>
To: David Airlie <airlied@gmail.com>
To: arthurgrillo@riseup.net
To: Jonathan Corbet <corbet@lwn.net>
To: pekka.paalanen@haloniitty.fi
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: jeremie.dautheribes@bootlin.com
Cc: miquel.raynal@bootlin.com
Cc: thomas.petazzoni@bootlin.com
Cc: seanpaul@google.com
Cc: marcheu@google.com
Cc: nicolejadeyee@google.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>

Note: after my changes, these tests seem to pass, so [7] may need
updating (I did not check, it was maybe already the case):
- kms_cursor_legacy@flip-vs-cursor-atomic
- kms_pipe_crc_basic@nonblocking-crc
- kms_pipe_crc_basic@nonblocking-crc-frame-sequence
- kms_writeback@writeback-pixel-formats
- kms_writeback@writeback-invalid-parameters
- kms_flip@flip-vs-absolute-wf_vblank-interruptible
And these tests pass; I did not investigate why the runners fail:
- kms_flip@flip-vs-expired-vblank-interruptible
- kms_flip@flip-vs-expired-vblank
- kms_flip@plain-flip-fb-recreate
- kms_flip@plain-flip-fb-recreate-interruptible
- kms_flip@plain-flip-ts-check-interruptible
- kms_cursor_legacy@cursorA-vs-flipA-toggle
- kms_pipe_crc_basic@nonblocking-crc
- kms_prop_blob@invalid-get-prop
- kms_flip@flip-vs-absolute-wf_vblank-interruptible
- kms_invalid_mode@zero-hdisplay
- kms_invalid_mode@bad-vtotal
- kms_cursor_crc.* (everything is SUCCEED or SKIP, except for 
  rapid_movement)

[7]: https://lore.kernel.org/all/20240201065346.801038-1-vignesh.raman@collabora.com/

Changes in v5:
- All patches: fix some formatting issues
- PATCH 4/16: Use the correct formatter for 4cc code
- PATCH 7/16: Update the pixel accessors to also return the pixel position 
  inside a block.
- PATCH 8/16: Fix a temporary bug
- PATCH 9/16: Update the get_step_1x1 to get_step_next_block and update 
  the documentation
- PATCH 10/16: Update to use the new pixel accessors
- PATCH 11/16: Update to use the new pixel accessors
- PATCH 11/16: Fix a bug in the subsampling offset for inverted reading 
  (right to left/bottom to top). Found by [8].
- PATCH 11/16: Apply Arthur's modifications (comments, algorithm 
  clarification)
- PATCH 11/16: Use the correct formatter for 4cc code
- PATCH 11/16: Update to use the new get_step_next_block
- PATCH 14/16: Apply Arthur's modification (comments, compilation issue)
- PATCH 15/16: Add Arthur's patch to explain the kunit tests
- PATCH 16/16: Introduce DRM_FORMAT_R* support.
- Link to v4: https://lore.kernel.org/r/20240304-yuv-v4-0-76beac8e9793@bootlin.com
Changes in v4:
- PATCH 3/14: Update comments for get_pixel_* functions
- PATCH 4/14: Add WARN when trying to get unsupported pixel_* functions
- PATCH 5/14: Create dummy pixel reader/writer to avoid NULL 
  function pointers and kernel OOPS
- PATCH 6/14: Added the usage of const pointers when needed
- PATCH 7/14: Extraction of pixel accessors modification
- PATCH 8/14: Extraction of the blending function modification
- PATCH 9/14: Extraction of the pixel_read_direction enum
- PATCH 10/14: Update direction_for_rotation documentation
- PATCH 10/14: Rename conversion functions to be explicit
- PATCH 10/14: Replace while(count) by while(out_pixel<end) in read_line
  callbacks. It avoids a new variable+addition in the composition hot path.
- PATCH 11/14: Rename conversion functions to be explicit
- PATCH 11/14: Update the documentation for get_subsampling_offset
- PATCH 11/14: Add the matrix_conversion structure to remove a test from 
  the hot path.
- PATCH 11/14: Update matrix values to use 32.32 fixed floats for
  conversion
- PATCH 12/14: Update commit message
- PATCH 14/14: Change kunit expected value
- Link to v3: https://lore.kernel.org/r/20240226-yuv-v3-0-ff662f0994db@bootlin.com
Changes in v3:
- Correction of remaining git-rebase artefacts
- Added Pekka in copy of this patch
- Link to v2: https://lore.kernel.org/r/20240223-yuv-v2-0-aa6be2827bb7@bootlin.com
Changes in v2:
- Rebased the series on top of drm-misc/drm-misc-next
- Extract the typedef for pixel_read/pixel_write
- Introduce the line-by-line algorithm per pixel format
- Add some documentation for existing and new code
- Port the series [1] to use line-by-line algorithm
- Link to v1: https://lore.kernel.org/r/20240201-yuv-v1-0-3ca376f27632@bootlin.com

---
Arthur Grillo (6):
      drm/vkms: Use drm_frame directly
      drm/vkms: Add YUV support
      drm/vkms: Add range and encoding properties to the plane
      drm/vkms: Drop YUV formats TODO
      drm/vkms: Create KUnit tests for YUV conversions
      drm/vkms: Add how to run the Kunit tests

Louis Chauvet (10):
      drm/vkms: Code formatting
      drm/vkms: write/update the documentation for pixel conversion and pixel write functions
      drm/vkms: Add typedef and documentation for pixel_read and pixel_write  functions
      drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL  pointers
      drm/vkms: Use const for input pointers in pixel_read an pixel_write functions
      drm/vkms: Update pixels accessor to support packed and multi-plane formats.
      drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
      drm/vkms: Introduce pixel_read_direction enum
      drm/vkms: Re-introduce line-per-line composition algorithm
      drm/vkms: Add support for DRM_FORMAT_R*

 Documentation/gpu/vkms.rst                    |  14 +-
 drivers/gpu/drm/vkms/Kconfig                  |  15 +
 drivers/gpu/drm/vkms/Makefile                 |   1 +
 drivers/gpu/drm/vkms/tests/.kunitconfig       |   4 +
 drivers/gpu/drm/vkms/tests/Makefile           |   3 +
 drivers/gpu/drm/vkms/tests/vkms_format_test.c | 230 ++++++
 drivers/gpu/drm/vkms/vkms_composer.c          | 244 +++++--
 drivers/gpu/drm/vkms/vkms_crtc.c              |   6 +-
 drivers/gpu/drm/vkms/vkms_drv.c               |   3 +-
 drivers/gpu/drm/vkms/vkms_drv.h               |  85 ++-
 drivers/gpu/drm/vkms/vkms_formats.c           | 983 ++++++++++++++++++++++----
 drivers/gpu/drm/vkms/vkms_formats.h           |  12 +-
 drivers/gpu/drm/vkms/vkms_plane.c             |  50 +-
 drivers/gpu/drm/vkms/vkms_writeback.c         |   5 -
 14 files changed, 1440 insertions(+), 215 deletions(-)
---
base-commit: ae4928daaf4d7b2012c97c9109f608fcf6c60df3
change-id: 20240201-yuv-1337d90d9576

Best regards,
-- 
Louis Chauvet <louis.chauvet@bootlin.com>



* [PATCH v5 01/16] drm/vkms: Code formatting
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
@ 2024-03-13 17:44 ` Louis Chauvet
  2024-03-25 12:03   ` Pekka Paalanen
  2024-03-25 13:13   ` Maíra Canal
  2024-03-13 17:44 ` [PATCH v5 02/16] drm/vkms: Use drm_frame directly Louis Chauvet
                   ` (14 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

A few no-op changes to remove double spaces and fix wrong alignments.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_composer.c | 10 +++++-----
 drivers/gpu/drm/vkms/vkms_crtc.c     |  6 ++----
 drivers/gpu/drm/vkms/vkms_drv.c      |  3 +--
 drivers/gpu/drm/vkms/vkms_plane.c    |  8 ++++----
 4 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index e7441b227b3c..c6d9b4a65809 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -96,7 +96,7 @@ static u16 lerp_u16(u16 a, u16 b, s64 t)
 	s64 a_fp = drm_int2fixp(a);
 	s64 b_fp = drm_int2fixp(b);
 
-	s64 delta = drm_fixp_mul(b_fp - a_fp,  t);
+	s64 delta = drm_fixp_mul(b_fp - a_fp, t);
 
 	return drm_fixp2int(a_fp + delta);
 }
@@ -302,8 +302,8 @@ static int compose_active_planes(struct vkms_writeback_job *active_wb,
 void vkms_composer_worker(struct work_struct *work)
 {
 	struct vkms_crtc_state *crtc_state = container_of(work,
-						struct vkms_crtc_state,
-						composer_work);
+							  struct vkms_crtc_state,
+							  composer_work);
 	struct drm_crtc *crtc = crtc_state->base.crtc;
 	struct vkms_writeback_job *active_wb = crtc_state->active_writeback;
 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
@@ -328,7 +328,7 @@ void vkms_composer_worker(struct work_struct *work)
 		crtc_state->gamma_lut.base = (struct drm_color_lut *)crtc->state->gamma_lut->data;
 		crtc_state->gamma_lut.lut_length =
 			crtc->state->gamma_lut->length / sizeof(struct drm_color_lut);
-		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length  - 1);
+		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length - 1);
 		crtc_state->gamma_lut.channel_value2index_ratio = drm_fixp_div(max_lut_index_fp,
 									       u16_max_fp);
 
@@ -367,7 +367,7 @@ void vkms_composer_worker(struct work_struct *work)
 		drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
 }
 
-static const char * const pipe_crc_sources[] = {"auto"};
+static const char *const pipe_crc_sources[] = { "auto" };
 
 const char *const *vkms_get_crc_sources(struct drm_crtc *crtc,
 					size_t *count)
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index 61e500b8c9da..7586ae2e1dd3 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -191,8 +191,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
 		return ret;
 
 	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
-		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
-								  plane);
+		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
 		WARN_ON(!plane_state);
 
 		if (!plane_state->visible)
@@ -208,8 +207,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
 
 	i = 0;
 	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
-		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
-								  plane);
+		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
 
 		if (!plane_state->visible)
 			continue;
diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index dd0af086e7fa..83e6c9b9ff46 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -81,8 +81,7 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
 	drm_atomic_helper_wait_for_flip_done(dev, old_state);
 
 	for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
-		struct vkms_crtc_state *vkms_state =
-			to_vkms_crtc_state(old_crtc_state);
+		struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(old_crtc_state);
 
 		flush_work(&vkms_state->composer_work);
 	}
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index e5c625ab8e3e..5a8d295e65f2 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -117,10 +117,10 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
 	drm_framebuffer_get(frame_info->fb);
 	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
-						     DRM_MODE_ROTATE_90 |
-						     DRM_MODE_ROTATE_270 |
-						     DRM_MODE_REFLECT_X |
-						     DRM_MODE_REFLECT_Y);
+									  DRM_MODE_ROTATE_90 |
+									  DRM_MODE_ROTATE_270 |
+									  DRM_MODE_REFLECT_X |
+									  DRM_MODE_REFLECT_Y);
 
 	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
 			drm_rect_height(&frame_info->rotated), frame_info->rotation);

-- 
2.43.0



* [PATCH v5 02/16] drm/vkms: Use drm_frame directly
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
  2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
@ 2024-03-13 17:44 ` Louis Chauvet
  2024-03-25 12:04   ` Pekka Paalanen
  2024-03-25 13:20   ` Maíra Canal
  2024-03-13 17:44 ` [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions Louis Chauvet
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

Remove intermediary variables and access the variables directly from
drm_frame. These changes should be a no-op.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_drv.h       |  3 ---
 drivers/gpu/drm/vkms/vkms_formats.c   | 12 +++++++-----
 drivers/gpu/drm/vkms/vkms_plane.c     |  3 ---
 drivers/gpu/drm/vkms/vkms_writeback.c |  5 -----
 4 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 8f5710debb1e..b4b357447292 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -31,9 +31,6 @@ struct vkms_frame_info {
 	struct drm_rect rotated;
 	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
 	unsigned int rotation;
-	unsigned int offset;
-	unsigned int pitch;
-	unsigned int cpp;
 };
 
 struct pixel_argb_u16 {
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 36046b12f296..172830a3936a 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -11,8 +11,10 @@
 
 static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
 {
-	return frame_info->offset + (y * frame_info->pitch)
-				  + (x * frame_info->cpp);
+	struct drm_framebuffer *fb = frame_info->fb;
+
+	return fb->offsets[0] + (y * fb->pitches[0])
+			      + (x * fb->format->cpp[0]);
 }
 
 /*
@@ -131,12 +133,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
 	u8 *src_pixels = get_packed_src_addr(frame_info, y);
 	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
 
-	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->cpp) {
+	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
 		int x_pos = get_x_position(frame_info, limit, x);
 
 		if (drm_rotation_90_or_270(frame_info->rotation))
 			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
-				+ frame_info->cpp * y;
+				+ frame_info->fb->format->cpp[0] * y;
 
 		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
 	}
@@ -223,7 +225,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
 	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
 	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
 
-	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->cpp)
+	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->fb->format->cpp[0])
 		wb->pixel_write(dst_pixels, &in_pixels[x]);
 }
 
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 5a8d295e65f2..21b5adfb44aa 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -125,9 +125,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
 			drm_rect_height(&frame_info->rotated), frame_info->rotation);
 
-	frame_info->offset = fb->offsets[0];
-	frame_info->pitch = fb->pitches[0];
-	frame_info->cpp = fb->format->cpp[0];
 	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
 }
 
diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c b/drivers/gpu/drm/vkms/vkms_writeback.c
index bc724cbd5e3a..c8582df1f739 100644
--- a/drivers/gpu/drm/vkms/vkms_writeback.c
+++ b/drivers/gpu/drm/vkms/vkms_writeback.c
@@ -149,11 +149,6 @@ static void vkms_wb_atomic_commit(struct drm_connector *conn,
 	crtc_state->active_writeback = active_wb;
 	crtc_state->wb_pending = true;
 	spin_unlock_irq(&output->composer_lock);
-
-	wb_frame_info->offset = fb->offsets[0];
-	wb_frame_info->pitch = fb->pitches[0];
-	wb_frame_info->cpp = fb->format->cpp[0];
-
 	drm_writeback_queue_job(wb_conn, connector_state);
 	active_wb->pixel_write = get_pixel_write_function(wb_format);
 	drm_rect_init(&wb_frame_info->src, 0, 0, crtc_width, crtc_height);

-- 
2.43.0



* [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
  2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
  2024-03-13 17:44 ` [PATCH v5 02/16] drm/vkms: Use drm_frame directly Louis Chauvet
@ 2024-03-13 17:44 ` Louis Chauvet
  2024-03-13 19:02   ` Randy Dunlap
  2024-03-25 13:32   ` Maíra Canal
  2024-03-13 17:44 ` [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions Louis Chauvet
                   ` (12 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

Add some documentation on pixel conversion functions.
Update outdated comments for pixel_write functions.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_composer.c |  7 ++++
 drivers/gpu/drm/vkms/vkms_drv.h      | 13 ++++++++
 drivers/gpu/drm/vkms/vkms_formats.c  | 62 ++++++++++++++++++++++++++++++------
 3 files changed, 73 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index c6d9b4a65809..da0651a94c9b 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -189,6 +189,13 @@ static void blend(struct vkms_writeback_job *wb,
 
 	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
 
+	/*
+	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
+	 * complexity to avoid poor blending performance.
+	 *
+	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
+	 * buffer.
+	 */
 	for (size_t y = 0; y < crtc_y_limit; y++) {
 		fill_background(&background_color, output_buffer);
 
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index b4b357447292..18086423a3a7 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -25,6 +25,17 @@
 
 #define VKMS_LUT_SIZE 256
 
+/**
+ * struct vkms_frame_info - structure to store the state of a frame
+ *
+ * @fb: backing drm framebuffer
+ * @src: source rectangle of this frame in the source framebuffer
+ * @dst: destination rectangle in the crtc buffer
+ * @map: see drm_shadow_plane_state@data
+ * @rotation: rotation applied to the source.
+ *
+ * @src and @dst should have the same size modulo the rotation.
+ */
 struct vkms_frame_info {
 	struct drm_framebuffer *fb;
 	struct drm_rect src, dst;
@@ -52,6 +63,8 @@ struct vkms_writeback_job {
  * vkms_plane_state - Driver specific plane state
  * @base: base plane state
  * @frame_info: data required for composing computation
+ * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
+ * ensure that this pointer is valid
  */
 struct vkms_plane_state {
 	struct drm_shadow_plane_state base;
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 172830a3936a..6e3dc8682ff9 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -9,6 +9,18 @@
 
 #include "vkms_formats.h"
 
+/**
+ * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
+ *
+ * @frame_info: Buffer metadata
+ * @x: The x coordinate of the wanted pixel in the buffer
+ * @y: The y coordinate of the wanted pixel in the buffer
+ *
+ * The caller must ensure that the framebuffer associated with this request uses a pixel format
+ * where block_h == block_w == 1.
+ * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
+ * outside of the buffer.
+ */
 static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
 {
 	struct drm_framebuffer *fb = frame_info->fb;
@@ -17,18 +29,22 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
 			      + (x * fb->format->cpp[0]);
 }
 
-/*
- * packed_pixels_addr - Get the pointer to pixel of a given pair of coordinates
+/**
+ * packed_pixels_addr() - Get the pointer to the block containing the pixel at the given
+ * coordinates
  *
  * @frame_info: Buffer metadata
- * @x: The x(width) coordinate of the 2D buffer
- * @y: The y(Heigth) coordinate of the 2D buffer
+ * @x: The x(width) coordinate inside the plane
+ * @y: The y(height) coordinate inside the plane
  *
  * Takes the information stored in the frame_info, a pair of coordinates, and
  * returns the address of the first color channel.
  * This function assumes the channels are packed together, i.e. a color channel
  * comes immediately after another in the memory. And therefore, this function
  * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
+ *
+ * The caller must ensure that the framebuffer associated with this request uses a pixel format
+ * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
  */
 static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
 				int x, int y)
@@ -53,6 +69,13 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
 	return x;
 }
 
+/*
+ * The following  functions take pixel data from the buffer and convert them to the format
+ * ARGB16161616 in out_pixel.
+ *
+ * They are used in the `vkms_compose_row` function to handle multiple formats.
+ */
+
 static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
 {
 	/*
@@ -145,12 +168,11 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
 }
 
 /*
- * The following  functions take an line of argb_u16 pixels from the
- * src_buffer, convert them to a specific format, and store them in the
- * destination.
+ * The following functions take one argb_u16 pixel and convert it to a specific format. The
+ * result is stored in @dst_pixels.
  *
- * They are used in the `compose_active_planes` to convert and store a line
- * from the src_buffer to the writeback buffer.
+ * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
+ * the writeback buffer.
  */
 static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
 {
@@ -216,6 +238,14 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
 	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
 }
 
+/**
+ * Generic loop for all supported writeback format. It is executed just after the blending to
+ * write a line in the writeback buffer.
+ *
+ * @wb: Job where to insert the final image
+ * @src_buffer: Line to write
+ * @y: Row to write in the writeback buffer
+ */
 void vkms_writeback_row(struct vkms_writeback_job *wb,
 			const struct line_buffer *src_buffer, int y)
 {
@@ -229,6 +259,13 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
 		wb->pixel_write(dst_pixels, &in_pixels[x]);
 }
 
+/**
+ * Retrieve the correct read_pixel function for a specific format.
+ * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
+ * pointer is valid before using it in a vkms_plane_state.
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ */
 void *get_pixel_conversion_function(u32 format)
 {
 	switch (format) {
@@ -247,6 +284,13 @@ void *get_pixel_conversion_function(u32 format)
 	}
 }
 
+/**
+ * Retrieve the correct write_pixel function for a specific format.
+ * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
+ * pointer is valid before using it in a vkms_writeback_job.
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ */
 void *get_pixel_write_function(u32 format)
 {
 	switch (format) {

-- 
2.43.0



* [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (2 preceding siblings ...)
  2024-03-13 17:44 ` [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions Louis Chauvet
@ 2024-03-13 17:44 ` Louis Chauvet
  2024-03-25 12:04   ` Pekka Paalanen
  2024-03-25 13:56   ` Maíra Canal
  2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
                   ` (11 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

Introduce two typedefs: pixel_read_t and pixel_write_t. They allow the
compiler to check that the passed functions take the correct arguments.
Such typedefs will help ensure consistency across the code base if
these prototypes are updated.

Rename the input/output variables in a consistent way between read_line and
write_line.

A WARN has been added in get_pixel_*_function to alert when an unsupported
pixel format is requested. As those formats are checked before the
atomic_update callbacks, this should never happen.

Add documentation for those typedefs.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
 drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
 drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
 drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
 4 files changed, 95 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 18086423a3a7..4bfc62d26f08 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -53,12 +53,31 @@ struct line_buffer {
 	struct pixel_argb_u16 *pixels;
 };
 
+/**
+ * typedef pixel_write_t - These functions are used to read a pixel from a
+ * `struct pixel_argb_u16*`, convert it to a specific format and write it to the @out_pixel
+ * buffer.
+ *
+ * @out_pixel: destination address to write the pixel
+ * @in_pixel: pixel to write
+ */
+typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
+
 struct vkms_writeback_job {
 	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
 	struct vkms_frame_info wb_frame_info;
-	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
+	pixel_write_t pixel_write;
 };
 
+/**
+ * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
+ * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
+ *
+ * @in_pixel: Pointer to the pixel to read
+ * @out_pixel: Pointer to write the converted pixel
+ */
+typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
+
 /**
  * vkms_plane_state - Driver specific plane state
  * @base: base plane state
@@ -69,7 +88,7 @@ struct vkms_writeback_job {
 struct vkms_plane_state {
 	struct drm_shadow_plane_state base;
 	struct vkms_frame_info *frame_info;
-	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
+	pixel_read_t pixel_read;
 };
 
 struct vkms_plane {
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 6e3dc8682ff9..55a4365d21a4 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
  * They are used in the `vkms_compose_row` function to handle multiple formats.
  */
 
-static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	/*
 	 * The 257 is the "conversion ratio". This number is obtained by the
@@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
 	 * the best color value in a pixel format with more possibilities.
 	 * A similar idea applies to others RGB color conversions.
 	 */
-	out_pixel->a = (u16)src_pixels[3] * 257;
-	out_pixel->r = (u16)src_pixels[2] * 257;
-	out_pixel->g = (u16)src_pixels[1] * 257;
-	out_pixel->b = (u16)src_pixels[0] * 257;
+	out_pixel->a = (u16)in_pixel[3] * 257;
+	out_pixel->r = (u16)in_pixel[2] * 257;
+	out_pixel->g = (u16)in_pixel[1] * 257;
+	out_pixel->b = (u16)in_pixel[0] * 257;
 }
 
-static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	out_pixel->a = (u16)0xffff;
-	out_pixel->r = (u16)src_pixels[2] * 257;
-	out_pixel->g = (u16)src_pixels[1] * 257;
-	out_pixel->b = (u16)src_pixels[0] * 257;
+	out_pixel->r = (u16)in_pixel[2] * 257;
+	out_pixel->g = (u16)in_pixel[1] * 257;
+	out_pixel->b = (u16)in_pixel[0] * 257;
 }
 
-static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
-	u16 *pixels = (u16 *)src_pixels;
+	u16 *pixel = (u16 *)in_pixel;
 
-	out_pixel->a = le16_to_cpu(pixels[3]);
-	out_pixel->r = le16_to_cpu(pixels[2]);
-	out_pixel->g = le16_to_cpu(pixels[1]);
-	out_pixel->b = le16_to_cpu(pixels[0]);
+	out_pixel->a = le16_to_cpu(pixel[3]);
+	out_pixel->r = le16_to_cpu(pixel[2]);
+	out_pixel->g = le16_to_cpu(pixel[1]);
+	out_pixel->b = le16_to_cpu(pixel[0]);
 }
 
-static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
-	u16 *pixels = (u16 *)src_pixels;
+	u16 *pixel = (u16 *)in_pixel;
 
 	out_pixel->a = (u16)0xffff;
-	out_pixel->r = le16_to_cpu(pixels[2]);
-	out_pixel->g = le16_to_cpu(pixels[1]);
-	out_pixel->b = le16_to_cpu(pixels[0]);
+	out_pixel->r = le16_to_cpu(pixel[2]);
+	out_pixel->g = le16_to_cpu(pixel[1]);
+	out_pixel->b = le16_to_cpu(pixel[0]);
 }
 
-static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
-	u16 *pixels = (u16 *)src_pixels;
+	u16 *pixel = (u16 *)in_pixel;
 
 	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
 	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
 
-	u16 rgb_565 = le16_to_cpu(*pixels);
+	u16 rgb_565 = le16_to_cpu(*pixel);
 	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
 	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
 	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
@@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
 
 /*
  * The following functions take one argb_u16 pixel and convert it to a specific format. The
- * result is stored in @dst_pixels.
+ * result is stored in @out_pixel.
  *
  * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
  * the writeback buffer.
  */
-static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 {
 	/*
 	 * This sequence below is important because the format's byte order is
@@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
 	 * | Addr + 2 | = Red channel
 	 * | Addr + 3 | = Alpha channel
 	 */
-	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
-	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
-	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
-	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
+	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
+	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
+	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
+	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
 }
 
-static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 {
-	dst_pixels[3] = 0xff;
-	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
-	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
-	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
+	out_pixel[3] = 0xff;
+	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
+	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
+	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
 }
 
-static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 {
-	u16 *pixels = (u16 *)dst_pixels;
+	u16 *pixel = (u16 *)out_pixel;
 
-	pixels[3] = cpu_to_le16(in_pixel->a);
-	pixels[2] = cpu_to_le16(in_pixel->r);
-	pixels[1] = cpu_to_le16(in_pixel->g);
-	pixels[0] = cpu_to_le16(in_pixel->b);
+	pixel[3] = cpu_to_le16(in_pixel->a);
+	pixel[2] = cpu_to_le16(in_pixel->r);
+	pixel[1] = cpu_to_le16(in_pixel->g);
+	pixel[0] = cpu_to_le16(in_pixel->b);
 }
 
-static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 {
-	u16 *pixels = (u16 *)dst_pixels;
+	u16 *pixel = (u16 *)out_pixel;
 
-	pixels[3] = 0xffff;
-	pixels[2] = cpu_to_le16(in_pixel->r);
-	pixels[1] = cpu_to_le16(in_pixel->g);
-	pixels[0] = cpu_to_le16(in_pixel->b);
+	pixel[3] = 0xffff;
+	pixel[2] = cpu_to_le16(in_pixel->r);
+	pixel[1] = cpu_to_le16(in_pixel->g);
+	pixel[0] = cpu_to_le16(in_pixel->b);
 }
 
-static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 {
-	u16 *pixels = (u16 *)dst_pixels;
+	u16 *pixel = (u16 *)out_pixel;
 
 	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
 	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
@@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
 	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
 	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
 
-	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
+	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
 }
 
 /**
@@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
  *
  * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
  */
-void *get_pixel_conversion_function(u32 format)
+pixel_read_t get_pixel_read_function(u32 format)
 {
 	switch (format) {
 	case DRM_FORMAT_ARGB8888:
@@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
 	case DRM_FORMAT_RGB565:
 		return &RGB565_to_argb_u16;
 	default:
-		return NULL;
+		/*
+		 * This is a bug in vkms_plane_atomic_check. All the supported
+		 * format must:
+		 * - Be listed in vkms_formats in vkms_plane.c
+		 * - Have a pixel_read callback defined here
+		 */
+		WARN(true,
+		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
+		     &format);
+		return (pixel_read_t)NULL;
 	}
 }
 
@@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
  *
  * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
  */
-void *get_pixel_write_function(u32 format)
+pixel_write_t get_pixel_write_function(u32 format)
 {
 	switch (format) {
 	case DRM_FORMAT_ARGB8888:
@@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
 	case DRM_FORMAT_RGB565:
 		return &argb_u16_to_RGB565;
 	default:
-		return NULL;
+		/*
+		 * This is a bug in vkms_writeback_atomic_check. All the supported
+		 * format must:
+		 * - Be listed in vkms_wb_formats in vkms_writeback.c
+		 * - Have a pixel_write callback defined here
+		 */
+		WARN(true,
+		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
+		     &format);
+		return (pixel_write_t)NULL;
 	}
 }
diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
index cf59c2ed8e9a..3ecea4563254 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.h
+++ b/drivers/gpu/drm/vkms/vkms_formats.h
@@ -5,8 +5,8 @@
 
 #include "vkms_drv.h"
 
-void *get_pixel_conversion_function(u32 format);
+pixel_read_t get_pixel_read_function(u32 format);
 
-void *get_pixel_write_function(u32 format);
+pixel_write_t get_pixel_write_function(u32 format);
 
 #endif /* _VKMS_FORMATS_H_ */
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 21b5adfb44aa..10e9b23dab28 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
 			drm_rect_height(&frame_info->rotated), frame_info->rotation);
 
-	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
+	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
 }
 
 static int vkms_plane_atomic_check(struct drm_plane *plane,

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (3 preceding siblings ...)
  2024-03-13 17:44 ` [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions Louis Chauvet
@ 2024-03-13 17:44 ` Louis Chauvet
  2024-03-13 19:08   ` Randy Dunlap
                     ` (2 more replies)
  2024-03-13 17:45 ` [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read and pixel_write functions Louis Chauvet
                   ` (10 subsequent siblings)
  15 siblings, 3 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:44 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

Introduce two dummy callbacks: a pixel_read callback that always reads
black and a pixel_write callback that does nothing. They are returned
instead of NULL, which avoids a kernel oops if the callback is called.

If those callbacks are used, it means that there is a mismatch between
the formats announced by atomic_check and the formats really supported by
atomic_update.
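
To make the idea concrete, here is a small userspace sketch of the
"dummy callback instead of NULL" pattern. It is not the kernel code:
`enum example_format` and `example_get_pixel_read()` are hypothetical, and
fprintf() stands in for the kernel's WARN().

```c
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct pixel_argb_u16. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

typedef void (*pixel_read_t)(uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel);

static void XRGB8888_to_argb_u16(uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	out_pixel->a = 0xffff;
	out_pixel->r = (uint16_t)(in_pixel[2] * 257);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257);
}

/* Fallback: ignore the input and always produce opaque black. */
static void black_to_argb_u16(uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	(void)in_pixel;
	out_pixel->a = 0xffff;
	out_pixel->r = 0;
	out_pixel->g = 0;
	out_pixel->b = 0;
}

/* Hypothetical format identifiers, not the real DRM fourcc values. */
enum example_format { EXAMPLE_XRGB8888, EXAMPLE_UNSUPPORTED };

/* Hypothetical lookup helper: never returns NULL, so callers can always call it. */
static pixel_read_t example_get_pixel_read(enum example_format format)
{
	switch (format) {
	case EXAMPLE_XRGB8888:
		return &XRGB8888_to_argb_u16;
	default:
		/* The kernel version uses WARN() here; this is only a stand-in. */
		fprintf(stderr, "unsupported format %d, reading black\n", (int)format);
		return &black_to_argb_u16;
	}
}
```

The trade-off is that a bug in the format check becomes a visible warning
and a black plane rather than a crash.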

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 55a4365d21a4..b57d85b8b935 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -136,6 +136,21 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
 }
 
+/**
+ * black_to_argb_u16() - pixel_read callback which always reads black
+ *
+ * This callback is used when an invalid format is requested for plane reading.
+ * It is used to avoid null pointer to be used as a function. In theory, this function should
+ * never be called, except if you found a bug in the driver/DRM core.
+ */
+static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+{
+	out_pixel->a = (u16)0xFFFF;
+	out_pixel->r = 0;
+	out_pixel->g = 0;
+	out_pixel->b = 0;
+}
+
 /**
  * vkms_compose_row - compose a single row of a plane
  * @stage_buffer: output line with the composed pixels
@@ -238,6 +253,16 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
 }
 
+/**
+ * argb_u16_to_nothing() - pixel_write callback with no effect
+ *
+ * This callback is used when an invalid format is requested for writeback.
+ * It is used to avoid null pointer to be used as a function. In theory, this should never
+ * happen, except if there is a bug in the driver
+ */
+static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+{}
+
 /**
  * Generic loop for all supported writeback format. It is executed just after the blending to
  * write a line in the writeback buffer.
@@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
 
 /**
  * Retrieve the correct read_pixel function for a specific format.
- * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
- * pointer is valid before using it in a vkms_plane_state.
+ * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
+ * function is returned.
  *
  * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
  */
@@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
 		 * format must:
 		 * - Be listed in vkms_formats in vkms_plane.c
 		 * - Have a pixel_read callback defined here
+		 *
+		 * To avoid kernel crash, a dummy "always read black" function is used. It means
+		 * that during the composition, this plane will always be black.
 		 */
 		WARN(true,
 		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
 		     &format);
-		return (pixel_read_t)NULL;
+		return &black_to_argb_u16;
 	}
 }
 
 /**
  * Retrieve the correct write_pixel function for a specific format.
- * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
- * pointer is valid before using it in a vkms_writeback_job.
+ * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
+ * function is returned.
  *
  * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
  */
@@ -319,10 +347,13 @@ pixel_write_t get_pixel_write_function(u32 format)
 		 * format must:
 		 * - Be listed in vkms_wb_formats in vkms_writeback.c
 		 * - Have a pixel_write callback defined here
+		 *
+		 * To avoid kernel crash, a dummy "don't do anything" function is used. It means
+		 * that the resulting writeback buffer is not composed and can contain any values.
 		 */
 		WARN(true,
 		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
 		     &format);
-		return (pixel_write_t)NULL;
+		return &argb_u16_to_nothing;
 	}
 }

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read and pixel_write functions
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (4 preceding siblings ...)
  2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 12:05   ` Pekka Paalanen
  2024-03-25 14:00   ` Maíra Canal
  2024-03-13 17:45 ` [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats Louis Chauvet
                   ` (9 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

As the pixel_read and pixel_write functions should never modify the input
buffer, mark those pointers const.
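
A short userspace sketch of what the const qualifier buys, assuming a
write callback shaped like pixel_write_t. `example_u16_to_u8()` is a
hypothetical helper that plays the role of DIV_ROUND_CLOSEST() for
unsigned values; it is not part of the driver.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct pixel_argb_u16. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Same shape as pixel_write_t after this patch: the input pixel is read-only. */
typedef void (*pixel_write_t)(uint8_t *out_pixel, const struct pixel_argb_u16 *in_pixel);

/* Hypothetical helper: round-to-nearest division by 257 for unsigned input. */
static uint8_t example_u16_to_u8(uint16_t v)
{
	return (uint8_t)((v + 257u / 2) / 257u);
}

static void argb_u16_to_XRGB8888(uint8_t *out_pixel, const struct pixel_argb_u16 *in_pixel)
{
	/* Writing "in_pixel->r = 0;" here would be rejected by the compiler. */
	out_pixel[3] = 0xff;
	out_pixel[2] = example_u16_to_u8(in_pixel->r);
	out_pixel[1] = example_u16_to_u8(in_pixel->g);
	out_pixel[0] = example_u16_to_u8(in_pixel->b);
}
```

Callers can then pass pointers to const data without a cast, and an
accidental write to the source pixel fails at build time.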

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_drv.h     |  4 ++--
 drivers/gpu/drm/vkms/vkms_formats.c | 24 ++++++++++++------------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 4bfc62d26f08..3ead8b39af4a 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -61,7 +61,7 @@ struct line_buffer {
  * @out_pixel: destination address to write the pixel
  * @in_pixel: pixel to write
  */
-typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
+typedef void (*pixel_write_t)(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel);
 
 struct vkms_writeback_job {
 	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
@@ -76,7 +76,7 @@ struct vkms_writeback_job {
  * @in_pixel: Pointer to the pixel to read
  * @out_pixel: Pointer to write the converted pixel
  */
-typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
+typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
 
 /**
  * vkms_plane_state - Driver specific plane state
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index b57d85b8b935..b2f8dfc26c35 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
  * They are used in the `vkms_compose_row` function to handle multiple formats.
  */
 
-static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	/*
 	 * The 257 is the "conversion ratio". This number is obtained by the
@@ -90,7 +90,7 @@ static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 	out_pixel->b = (u16)in_pixel[0] * 257;
 }
 
-static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	out_pixel->a = (u16)0xffff;
 	out_pixel->r = (u16)in_pixel[2] * 257;
@@ -98,7 +98,7 @@ static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 	out_pixel->b = (u16)in_pixel[0] * 257;
 }
 
-static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	u16 *pixel = (u16 *)in_pixel;
 
@@ -108,7 +108,7 @@ static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
 	out_pixel->b = le16_to_cpu(pixel[0]);
 }
 
-static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	u16 *pixel = (u16 *)in_pixel;
 
@@ -118,7 +118,7 @@ static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
 	out_pixel->b = le16_to_cpu(pixel[0]);
 }
 
-static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	u16 *pixel = (u16 *)in_pixel;
 
@@ -143,7 +143,7 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
  * It is used to avoid null pointer to be used as a function. In theory, this function should
  * never be called, except if you found a bug in the driver/DRM core.
  */
-static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
 {
 	out_pixel->a = (u16)0xFFFF;
 	out_pixel->r = 0;
@@ -189,7 +189,7 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
  * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
  * the writeback buffer.
  */
-static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_ARGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {
 	/*
 	 * This sequence below is important because the format's byte order is
@@ -207,7 +207,7 @@ static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
 }
 
-static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_XRGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {
 	out_pixel[3] = 0xff;
 	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
@@ -215,7 +215,7 @@ static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
 	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
 }
 
-static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_ARGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {
 	u16 *pixel = (u16 *)out_pixel;
 
@@ -225,7 +225,7 @@ static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
 	pixel[0] = cpu_to_le16(in_pixel->b);
 }
 
-static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_XRGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {
 	u16 *pixel = (u16 *)out_pixel;
 
@@ -235,7 +235,7 @@ static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
 	pixel[0] = cpu_to_le16(in_pixel->b);
 }
 
-static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_RGB565(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {
 	u16 *pixel = (u16 *)out_pixel;
 
@@ -260,7 +260,7 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
  * It is used to avoid null pointer to be used as a function. In theory, this should never
  * happen, except if there is a bug in the driver
  */
-static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
+static void argb_u16_to_nothing(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
 {}
 
 /**

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats.
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (5 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read and pixel_write functions Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 12:40   ` Pekka Paalanen
  2024-03-13 17:45 ` [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend Louis Chauvet
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

Introduce the usage of block_h/block_w to compute the offset and the
pointer of a pixel. The previous implementation was specialized for
planes with block_h == block_w == 1. Generalize it to avoid confusion and
to allow easier implementation of tiled formats. It also removes the usage
of the deprecated format field `cpp`.

Introduce the plane_index parameter to get an offset/pointer on a
different plane.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_formats.c | 76 +++++++++++++++++++++++++------------
 1 file changed, 52 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index b2f8dfc26c35..649d75d05b1f 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -10,23 +10,43 @@
 #include "vkms_formats.h"
 
 /**
- * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
+ * packed_pixels_offset() - Get the offset of the block containing the pixel at coordinates x/y
  *
  * @frame_info: Buffer metadata
  * @x: The x coordinate of the wanted pixel in the buffer
  * @y: The y coordinate of the wanted pixel in the buffer
+ * @plane_index: The index of the plane to use
+ * @offset: The returned offset inside the buffer of the block
+ * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
  *
- * The caller must ensure that the framebuffer associated with this request uses a pixel format
- * where block_h == block_w == 1.
- * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
- * outside of the buffer.
+ * As some pixel formats store multiple pixels in a block (DRM_FORMAT_R* for example), some
+ * pixels are not individually addressable. This function returns 3 values: the offset of the
+ * whole block, and the x and y coordinates of the requested pixel inside this block.
+ * For example, if the format is DRM_FORMAT_R1 and the requested coordinate is 13,5, the offset
+ * will point to the byte 5*pitches + 13/8 (second byte of the 5th line), and the rem_x/rem_y
+ * coordinates will be (13 % 8, 5 % 1) = (5, 0)
+ *
+ * With this function, the caller just has to extract the correct pixel from the block.
  */
-static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
+static void packed_pixels_offset(const struct vkms_frame_info *frame_info, int x, int y,
+				 int plane_index, int *offset, int *rem_x, int *rem_y)
 {
 	struct drm_framebuffer *fb = frame_info->fb;
+	const struct drm_format_info *format = frame_info->fb->format;
+	/* Directly using x and y to multiply pitches and format->cpp is not sufficient because
+	 * in some formats a block can represent multiple pixels.
+	 *
+	 * Dividing x and y by the block size allows to extract the correct offset of the block
+	 * containing the pixel.
+	 */
 
-	return fb->offsets[0] + (y * fb->pitches[0])
-			      + (x * fb->format->cpp[0]);
+	int block_x = x / drm_format_info_block_width(format, plane_index);
+	int block_y = y / drm_format_info_block_height(format, plane_index);
+	*rem_x = x % drm_format_info_block_width(format, plane_index);
+	*rem_y = y % drm_format_info_block_height(format, plane_index);
+	*offset = fb->offsets[plane_index] +
+		  block_y * fb->pitches[plane_index] +
+		  block_x * format->char_per_block[plane_index];
 }
 
 /**
@@ -36,30 +56,35 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
  * @frame_info: Buffer metadata
  * @x: The x(width) coordinate inside the plane
  * @y: The y(height) coordinate inside the plane
+ * @plane_index: The index of the plane
+ * @addr: The returned pointer
+ * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
  *
- * Takes the information stored in the frame_info, a pair of coordinates, and
- * returns the address of the first color channel.
- * This function assumes the channels are packed together, i.e. a color channel
- * comes immediately after another in the memory. And therefore, this function
- * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
+ * Takes the information stored in the frame_info, a pair of coordinates, and returns the address
+ * of the block containing this pixel and the pixel position inside this block.
  *
- * The caller must ensure that the framebuffer associated with this request uses a pixel format
- * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
+ * See packed_pixels_offset() for details about rem_x/rem_y behavior.
  */
-static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
-				int x, int y)
+static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
+			       int x, int y, int plane_index, u8 **addr, int *rem_x,
+			       int *rem_y)
 {
-	size_t offset = pixel_offset(frame_info, x, y);
+	int offset;
 
-	return (u8 *)frame_info->map[0].vaddr + offset;
+	packed_pixels_offset(frame_info, x, y, plane_index, &offset, rem_x, rem_y);
+	*addr = (u8 *)frame_info->map[0].vaddr + offset;
 }
 
-static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y)
+static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
+				 int plane_index)
 {
 	int x_src = frame_info->src.x1 >> 16;
 	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
+	u8 *addr;
+	int rem_x, rem_y;
 
-	return packed_pixels_addr(frame_info, x_src, y_src);
+	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
+	return addr;
 }
 
 static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
@@ -168,14 +193,14 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
 {
 	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
 	struct vkms_frame_info *frame_info = plane->frame_info;
-	u8 *src_pixels = get_packed_src_addr(frame_info, y);
+	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
 	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
 
 	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
 		int x_pos = get_x_position(frame_info, limit, x);
 
 		if (drm_rotation_90_or_270(frame_info->rotation))
-			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
+			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
 				+ frame_info->fb->format->cpp[0] * y;
 
 		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
@@ -276,7 +301,10 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
 {
 	struct vkms_frame_info *frame_info = &wb->wb_frame_info;
 	int x_dst = frame_info->dst.x1;
-	u8 *dst_pixels = packed_pixels_addr(frame_info, x_dst, y);
+	u8 *dst_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(frame_info, x_dst, y, 0, &dst_pixels, &rem_x, &rem_y);
 	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
 	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
 

-- 
2.43.0



* [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (6 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 12:41   ` Pekka Paalanen
  2024-03-13 17:45 ` [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum Louis Chauvet
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

pre_mul_alpha_blend() is dedicated to blending. To avoid mixing different
concepts (coordinate calculation and color management), move the x_limit
and x_dst computation out of this helper.
This also improves maintainability by grouping the coordinate computations
in one place: the loop in `blend`.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
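Reader note: the interface change can be illustrated with a small userspace
sketch. It is not the driver code. `struct sketch_pixel`,
`sketch_blend_channel()` and `sketch_blend_range()` are hypothetical names,
and the channel formula is an assumption based on the premultiplied-alpha
equation described in the kernel-doc (opaque background, round to nearest).
As in this patch, the input line is read from index 0 and the output line is
written from x_start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pixel_argb_u16. */
struct sketch_pixel {
	uint16_t a, r, g, b;
};

/*
 * Premultiplied-alpha blend of one channel over an opaque background:
 * out = src + dst * (1 - alpha), all values scaled to 0..0xffff.
 * Assumption: rounding to nearest; the real helper may round differently.
 */
static uint16_t sketch_blend_channel(uint16_t src, uint16_t dst, uint16_t alpha)
{
	uint64_t sum = (uint64_t)src * 0xffff + (uint64_t)dst * (0xffffu - alpha);
	uint64_t out = (sum + 0x7fff) / 0xffff;

	return out > 0xffff ? 0xffff : (uint16_t)out;
}

/*
 * Blend pixel_count pixels of in[] onto out[x_start..]. The caller computes
 * the range; this helper only does the color math.
 */
static void sketch_blend_range(const struct sketch_pixel *in, struct sketch_pixel *out,
			       size_t out_len, size_t x_start, size_t pixel_count)
{
	/* The range must fit inside the output line. */
	assert(x_start <= out_len && pixel_count <= out_len - x_start);

	for (size_t i = 0; i < pixel_count; i++) {
		struct sketch_pixel *o = &out[x_start + i];

		o->r = sketch_blend_channel(in[i].r, o->r, in[i].a);
		o->g = sketch_blend_channel(in[i].g, o->g, in[i].a);
		o->b = sketch_blend_channel(in[i].b, o->b, in[i].a);
		o->a = 0xffff;
	}
}
```

Pixels outside [x_start, x_start + pixel_count) are never touched, which is
the point of passing the range in from the caller.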
 drivers/gpu/drm/vkms/vkms_composer.c | 40 +++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index da0651a94c9b..9254086f23ff 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -24,34 +24,30 @@ static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
 
 /**
  * pre_mul_alpha_blend - alpha blending equation
- * @frame_info: Source framebuffer's metadata
  * @stage_buffer: The line with the pixels from src_plane
  * @output_buffer: A line buffer that receives all the blends output
+ * @x_start: The start offset to avoid useless copy
+ * @pixel_count: The number of pixels to blend
  *
- * Using the information from the `frame_info`, this blends only the
- * necessary pixels from the `stage_buffer` to the `output_buffer`
- * using premultiplied blend formula.
+ * Using @x_start and @pixel_count, only the required pixels are blended instead of the whole
+ * line each time.
  *
  * The current DRM assumption is that pixel color values have been already
  * pre-multiplied with the alpha channel values. See more
  * drm_plane_create_blend_mode_property(). Also, this formula assumes a
  * completely opaque background.
  */
-static void pre_mul_alpha_blend(struct vkms_frame_info *frame_info,
-				struct line_buffer *stage_buffer,
-				struct line_buffer *output_buffer)
+static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
+				struct line_buffer *output_buffer, int x_start, int pixel_count)
 {
-	int x_dst = frame_info->dst.x1;
-	struct pixel_argb_u16 *out = output_buffer->pixels + x_dst;
-	struct pixel_argb_u16 *in = stage_buffer->pixels;
-	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
-			    stage_buffer->n_pixels);
-
-	for (int x = 0; x < x_limit; x++) {
-		out[x].a = (u16)0xffff;
-		out[x].r = pre_mul_blend_channel(in[x].r, out[x].r, in[x].a);
-		out[x].g = pre_mul_blend_channel(in[x].g, out[x].g, in[x].a);
-		out[x].b = pre_mul_blend_channel(in[x].b, out[x].b, in[x].a);
+	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
+	const struct pixel_argb_u16 *in = stage_buffer->pixels;
+
+	for (int i = 0; i < pixel_count; i++) {
+		out[i].a = (u16)0xffff;
+		out[i].r = pre_mul_blend_channel(in[i].r, out[i].r, in[i].a);
+		out[i].g = pre_mul_blend_channel(in[i].g, out[i].g, in[i].a);
+		out[i].b = pre_mul_blend_channel(in[i].b, out[i].b, in[i].a);
 	}
 }
 
@@ -183,7 +179,7 @@ static void blend(struct vkms_writeback_job *wb,
 {
 	struct vkms_plane_state **plane = crtc_state->active_planes;
 	u32 n_active_planes = crtc_state->num_active_planes;
-	int y_pos;
+	int y_pos, x_dst, x_limit;
 
 	const struct pixel_argb_u16 background_color = { .a = 0xffff };
 
@@ -201,14 +197,16 @@ static void blend(struct vkms_writeback_job *wb,
 
 		/* The active planes are composed associatively in z-order. */
 		for (size_t i = 0; i < n_active_planes; i++) {
+			x_dst = plane[i]->frame_info->dst.x1;
+			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
+					stage_buffer->n_pixels);
 			y_pos = get_y_pos(plane[i]->frame_info, y);
 
 			if (!check_limit(plane[i]->frame_info, y_pos))
 				continue;
 
 			vkms_compose_row(stage_buffer, plane[i], y_pos);
-			pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
-					    output_buffer);
+			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
 		}
 
 		apply_lut(crtc_state, output_buffer);

-- 
2.43.0



* [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (7 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 13:11   ` Pekka Paalanen
  2024-03-25 14:07   ` Maíra Canal
  2024-03-13 17:45 ` [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm Louis Chauvet
                   ` (6 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

The pixel_read_direction enum is useful to describe the reading direction
in a plane. It avoids using the rotation property of DRM, which is not
practical for determining the direction of reading.
This patch also introduces two helpers, one to compute the
pixel_read_direction from the DRM rotation property, and one to compute
the step, in bytes, between two successive pixels in a specific direction.

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
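Reader note: a compact userspace sketch of the two helpers, for illustration
only. The `SK_*` macros are local copies assumed to mirror the
DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* bits, and `sketch_direction()` and
`sketch_step()` are hypothetical names. The mapping is written with an XOR
instead of nested branches; it is intended to give the same result as the
patch when exactly one rotation bit is set.

```c
#include <assert.h>

/* Assumed to mirror the DRM uAPI rotation/reflection bits. */
#define SK_ROTATE_0	(1u << 0)
#define SK_ROTATE_90	(1u << 1)
#define SK_ROTATE_180	(1u << 2)
#define SK_ROTATE_270	(1u << 3)
#define SK_REFLECT_X	(1u << 4)
#define SK_REFLECT_Y	(1u << 5)

enum sketch_dir {
	SK_BOTTOM_TO_TOP,
	SK_TOP_TO_BOTTOM,
	SK_RIGHT_TO_LEFT,
	SK_LEFT_TO_RIGHT
};

/* Source reading direction that yields a left-to-right line on the CRTC. */
static enum sketch_dir sketch_direction(unsigned int rotation)
{
	int flip;

	if (rotation & (SK_ROTATE_0 | SK_ROTATE_180)) {
		/* Horizontal reading: ROTATE_180 and REFLECT_X each flip once. */
		flip = !!(rotation & SK_ROTATE_180) ^ !!(rotation & SK_REFLECT_X);
		return flip ? SK_RIGHT_TO_LEFT : SK_LEFT_TO_RIGHT;
	}
	if (rotation & (SK_ROTATE_90 | SK_ROTATE_270)) {
		/* Vertical reading: ROTATE_270 and REFLECT_Y each flip once. */
		flip = !!(rotation & SK_ROTATE_270) ^ !!(rotation & SK_REFLECT_Y);
		return flip ? SK_BOTTOM_TO_TOP : SK_TOP_TO_BOTTOM;
	}
	return SK_LEFT_TO_RIGHT;
}

/* Byte offset between two consecutive blocks in the given direction. */
static int sketch_step(enum sketch_dir dir, int bytes_per_block, int pitch)
{
	assert(bytes_per_block > 0 && pitch > 0);

	switch (dir) {
	case SK_LEFT_TO_RIGHT:
		return bytes_per_block;
	case SK_RIGHT_TO_LEFT:
		return -bytes_per_block;
	case SK_TOP_TO_BOTTOM:
		return pitch;
	case SK_BOTTOM_TO_TOP:
		return -pitch;
	}
	return 0;
}
```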
 drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
 drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index 9254086f23ff..989bcf59f375 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
 	}
 }
 
+/**
+ * direction_for_rotation() - Get the correct reading direction for a given rotation
+ *
+ * This function will use the @rotation setting of a source plane to compute the reading
+ * direction in this plane which corresponds to a "left to right writing" in the CRTC.
+ * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
+ * to be written from left to right on the CRTC.
+ *
+ * @rotation: Rotation to analyze. It corresponds to the field @frame_info.rotation.
+ */
+static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
+{
+	if (rotation & DRM_MODE_ROTATE_0) {
+		if (rotation & DRM_MODE_REFLECT_X)
+			return READ_RIGHT_TO_LEFT;
+		else
+			return READ_LEFT_TO_RIGHT;
+	} else if (rotation & DRM_MODE_ROTATE_90) {
+		if (rotation & DRM_MODE_REFLECT_Y)
+			return READ_BOTTOM_TO_TOP;
+		else
+			return READ_TOP_TO_BOTTOM;
+	} else if (rotation & DRM_MODE_ROTATE_180) {
+		if (rotation & DRM_MODE_REFLECT_X)
+			return READ_LEFT_TO_RIGHT;
+		else
+			return READ_RIGHT_TO_LEFT;
+	} else if (rotation & DRM_MODE_ROTATE_270) {
+		if (rotation & DRM_MODE_REFLECT_Y)
+			return READ_TOP_TO_BOTTOM;
+		else
+			return READ_BOTTOM_TO_TOP;
+	}
+	return READ_LEFT_TO_RIGHT;
+}
+
 /**
  * blend - blend the pixels from all planes and compute crc
  * @wb: The writeback frame buffer metadata
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 3ead8b39af4a..985e7a92b7bc 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -69,6 +69,17 @@ struct vkms_writeback_job {
 	pixel_write_t pixel_write;
 };
 
+/**
+ * enum pixel_read_direction - Enum used internally by VKMS to represent a reading direction in a
+ * plane.
+ */
+enum pixel_read_direction {
+	READ_BOTTOM_TO_TOP,
+	READ_TOP_TO_BOTTOM,
+	READ_RIGHT_TO_LEFT,
+	READ_LEFT_TO_RIGHT
+};
+
 /**
  * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
  * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 649d75d05b1f..743b6fd06db5 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
 	*addr = (u8 *)frame_info->map[0].vaddr + offset;
 }
 
+/**
+ * get_step_next_block() - Common helper to compute the correct step value between each pixel block
+ * to read in a certain direction.
+ *
+ * As the returned offset is the number of bytes between two consecutive blocks in a direction,
+ * the caller may have to read multiple pixels before using the next one (for example, to read
+ * from left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be
+ * used only every 8 pixels).
+ *
+ * @fb: Framebuffer to iterate on
+ * @direction: Direction of the reading
+ * @plane_index: Plane to get the step from
+ */
+static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
+			       int plane_index)
+{
+	switch (direction) {
+	case READ_LEFT_TO_RIGHT:
+		return fb->format->char_per_block[plane_index];
+	case READ_RIGHT_TO_LEFT:
+		return -fb->format->char_per_block[plane_index];
+	case READ_TOP_TO_BOTTOM:
+		return (int)fb->pitches[plane_index];
+	case READ_BOTTOM_TO_TOP:
+		return -(int)fb->pitches[plane_index];
+	}
+
+	return 0;
+}
+
 static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
 				 int plane_index)
 {

-- 
2.43.0



* [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (8 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 14:15   ` Maíra Canal
  2024-03-25 15:43   ` Pekka Paalanen
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
                   ` (5 subsequent siblings)
  15 siblings, 2 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

Re-introduce a line-by-line composition algorithm for each pixel format.
This improves performance by no longer requiring an indirection per pixel
read. This patch also focuses on the readability of the code.

Line-by-line composition was introduced by [1] but rewritten back to
pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
on performance, and it was merged.

This patch is almost a revert of [2], but in addition efforts have been
made to increase readability and maintainability of the rotation handling.
The blend function is now divided in two parts:
- Transformation of coordinates from the output referential to the source
referential
- Line conversion and blending

Most of the complexity of the rotation management is avoided by using
drm_rect_* helpers. The remaining complexity is around the clipping, to
avoid reading/writing outside source/destination buffers.

The pixel conversion is now done line-by-line, so the pixel_read_t callback
was replaced with the pixel_read_line_t callback. This way the indirection
is only required once per line and per plane, instead of once per pixel and
per plane.

The pixel_read_line_t callbacks are very similar for most pixel formats, but
this duplication is required to avoid a performance impact. Some helpers for
color conversion were introduced to avoid code repetition:
- argb_u16_from_*: perform color conversion. They should be inlined by the
  compiler, and they are used to avoid repetition between multiple variants
  of the same format (argb/xrgb and maybe, in the future, bgr formats).

This new algorithm was tested with:
- kms_plane (for color conversions)
- kms_rotation_crc (for rotations of planes)
- kms_cursor_crc (for translations of planes)
- kms_rotation (for all rotations and formats combinations) [3]
The performance gain was measured with:
- kms_fb_stress

[1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
     new formats")
     https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
[2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
     functionality")
     https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
[3]:

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
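Reader note: the "one indirection per line" idea can be sketched in plain C
as follows. This is a simplified illustration, not the driver code:
`struct sketch_pixel`, `sketch_from_u8888()` and
`sketch_xrgb8888_read_line()` are hypothetical names, and the source is a
raw byte buffer instead of a struct vkms_plane_state. The step is assumed to
be +4/-4 bytes for horizontal reading and +pitch/-pitch for vertical reading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pixel_argb_u16. */
struct sketch_pixel {
	uint16_t a, r, g, b;
};

/* Expand 8-bit channels to 16 bits: 0xff * 257 == 0xffff. */
static struct sketch_pixel sketch_from_u8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
	struct sketch_pixel p = {
		.a = (uint16_t)(a * 257),
		.r = (uint16_t)(r * 257),
		.g = (uint16_t)(g * 257),
		.b = (uint16_t)(b * 257),
	};

	return p;
}

/*
 * Convert count XRGB8888 pixels (bytes stored as B, G, R, X) starting at
 * src and moving by step bytes between pixels. The format-specific loop is
 * entered once per line; no function pointer is called per pixel.
 */
static void sketch_xrgb8888_read_line(const uint8_t *src, ptrdiff_t step, size_t count,
				      struct sketch_pixel out[])
{
	assert(src && out);

	for (size_t i = 0; i < count; i++) {
		const uint8_t *px = src + (ptrdiff_t)i * step;

		out[i] = sketch_from_u8888(255, px[2], px[1], px[0]);
	}
}
```

A caller would pick such a function once per plane (for example through a
function pointer stored in the plane state) and then call it once per line.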
 drivers/gpu/drm/vkms/vkms_composer.c | 167 +++++++++++++++++++------
 drivers/gpu/drm/vkms/vkms_drv.h      |  27 ++--
 drivers/gpu/drm/vkms/vkms_formats.c  | 236 ++++++++++++++++++++++-------------
 drivers/gpu/drm/vkms/vkms_formats.h  |   2 +-
 drivers/gpu/drm/vkms/vkms_plane.c    |   5 +-
 5 files changed, 292 insertions(+), 145 deletions(-)
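Reader note: the clipping step described in the commit message (clamp the
number of pixels to read, then move the start to the far end for reversed
directions) can be sketched in one dimension. `sketch_clip_line()` is a
hypothetical helper written for this note; the patch performs the same kind
of arithmetic inline in blend(), once for x and once for y.

```c
#include <assert.h>

/*
 * Clip the 1-D range [start, start + count) to [0, limit).
 *
 * Returns the number of elements left to read (0 if nothing is left) and
 * stores in *first the coordinate of the first element to read: the low end
 * for forward reading, the high end for reversed reading.
 */
static int sketch_clip_line(int start, int count, int limit, int reversed, int *first)
{
	assert(first && limit >= 0);

	/* Drop the part that lies before the buffer. */
	if (start < 0) {
		count += start;
		start = 0;
	}
	/* Drop the part that lies after the buffer. */
	if (start + count > limit)
		count = limit - start;

	if (count <= 0)
		return 0;

	/* Reversed reading starts from the last element of the range. */
	*first = reversed ? start + count - 1 : start;
	return count;
}
```

For vertical reading the limit is the framebuffer height, for horizontal
reading it is the framebuffer width.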

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index 989bcf59f375..5d78c33dbf41 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -41,7 +41,7 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
 				struct line_buffer *output_buffer, int x_start, int pixel_count)
 {
 	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
-	const struct pixel_argb_u16 *in = stage_buffer->pixels;
+	const struct pixel_argb_u16 *in = &stage_buffer->pixels[x_start];
 
 	for (int i = 0; i < pixel_count; i++) {
 		out[i].a = (u16)0xffff;
@@ -51,33 +51,6 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
 	}
 }
 
-static int get_y_pos(struct vkms_frame_info *frame_info, int y)
-{
-	if (frame_info->rotation & DRM_MODE_REFLECT_Y)
-		return drm_rect_height(&frame_info->rotated) - y - 1;
-
-	switch (frame_info->rotation & DRM_MODE_ROTATE_MASK) {
-	case DRM_MODE_ROTATE_90:
-		return frame_info->rotated.x2 - y - 1;
-	case DRM_MODE_ROTATE_270:
-		return y + frame_info->rotated.x1;
-	default:
-		return y;
-	}
-}
-
-static bool check_limit(struct vkms_frame_info *frame_info, int pos)
-{
-	if (drm_rotation_90_or_270(frame_info->rotation)) {
-		if (pos >= 0 && pos < drm_rect_width(&frame_info->rotated))
-			return true;
-	} else {
-		if (pos >= frame_info->rotated.y1 && pos < frame_info->rotated.y2)
-			return true;
-	}
-
-	return false;
-}
 
 static void fill_background(const struct pixel_argb_u16 *background_color,
 			    struct line_buffer *output_buffer)
@@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
 {
 	struct vkms_plane_state **plane = crtc_state->active_planes;
 	u32 n_active_planes = crtc_state->num_active_planes;
-	int y_pos, x_dst, x_limit;
 
 	const struct pixel_argb_u16 background_color = { .a = 0xffff };
 
-	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
+	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
+	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
 
 	/*
 	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
 	 * complexity to avoid poor blending performance.
 	 *
-	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
-	 * buffer.
+	 * The pixel_read_line callback is used to read a line, using an efficient
+	 * algorithm for a specific format, into the staging buffer.
 	 */
 	for (size_t y = 0; y < crtc_y_limit; y++) {
 		fill_background(&background_color, output_buffer);
 
 		/* The active planes are composed associatively in z-order. */
 		for (size_t i = 0; i < n_active_planes; i++) {
-			x_dst = plane[i]->frame_info->dst.x1;
-			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
-					stage_buffer->n_pixels);
-			y_pos = get_y_pos(plane[i]->frame_info, y);
+			struct vkms_plane_state *current_plane = plane[i];
 
-			if (!check_limit(plane[i]->frame_info, y_pos))
+			/* Avoid rendering useless lines */
+			if (y < current_plane->frame_info->dst.y1 ||
+			    y >= current_plane->frame_info->dst.y2)
 				continue;
 
-			vkms_compose_row(stage_buffer, plane[i], y_pos);
-			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
+			/*
+			 * dst_line is the line to copy. The initial coordinates are inside the
+			 * destination framebuffer, and then drm_rect_* helpers are used to
+			 * compute the correct position into the source framebuffer.
+			 */
+			struct drm_rect dst_line = DRM_RECT_INIT(
+				current_plane->frame_info->dst.x1, y,
+				drm_rect_width(&current_plane->frame_info->dst), 1);
+			struct drm_rect tmp_src;
+
+			drm_rect_fp_to_int(&tmp_src, &current_plane->frame_info->src);
+
+			/*
+			 * [1]: Clamping dst_line to the crtc_x_limit to avoid writing outside of
+			 * the destination buffer
+			 */
+			dst_line.x1 = max_t(int, dst_line.x1, 0);
+			dst_line.x2 = min_t(int, dst_line.x2, crtc_x_limit);
+			/* The destination is completely outside of the crtc. */
+			if (dst_line.x2 <= dst_line.x1)
+				continue;
+
+			struct drm_rect src_line = dst_line;
+
+			/*
+			 * Transform the x/y coordinates from the crtc into coordinates for
+			 * the src buffer.
+			 *
+			 * - Cancel the offset of the dst buffer.
+			 * - Invert the rotation. This assumes that
+			 *   dst = drm_rect_rotate(src, rotation) (dst and src have the
+			 *   same size, but can be rotated).
+			 * - Apply the offset of the source rectangle to the coordinate.
+			 */
+			drm_rect_translate(&src_line, -current_plane->frame_info->dst.x1,
+					   -current_plane->frame_info->dst.y1);
+			drm_rect_rotate_inv(&src_line,
+					    drm_rect_width(&tmp_src),
+					    drm_rect_height(&tmp_src),
+					    current_plane->frame_info->rotation);
+			drm_rect_translate(&src_line, tmp_src.x1, tmp_src.y1);
+
+			/* Get the correct reading direction in the source buffer. */
+
+			enum pixel_read_direction direction =
+				direction_for_rotation(current_plane->frame_info->rotation);
+
+			int x_start = src_line.x1;
+			int y_start = src_line.y1;
+			int pixel_count;
+			/* [2]: Compute and clamp the number of pixels to read */
+			if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
+				/*
+				 * In horizontal reading, the src_line width is the number of
+				 * pixels to read
+				 */
+				pixel_count = drm_rect_width(&src_line);
+				if (x_start < 0) {
+					pixel_count += x_start;
+					x_start = 0;
+				}
+				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
+					pixel_count =
+						(int)current_plane->frame_info->fb->width - x_start;
+				}
+			} else {
+				/*
+				 * In vertical reading, the src_line height is the number of
+				 * pixels to read
+				 */
+				pixel_count = drm_rect_height(&src_line);
+				if (y_start < 0) {
+					pixel_count += y_start;
+					y_start = 0;
+				}
+				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
+					pixel_count =
+						(int)current_plane->frame_info->fb->height - y_start;
+				}
+			}
+
+			if (pixel_count <= 0) {
+				/* Nothing to read, so avoid multiple function calls for nothing */
+				continue;
+			}
+
+			/*
+			 * Modify the starting point to take into account the rotation
+			 *
+			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
+			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
+			 * corner.
+			 */
+			if (direction == READ_RIGHT_TO_LEFT) {
+				// x_start is now the right point
+				x_start += pixel_count - 1;
+			} else if (direction == READ_BOTTOM_TO_TOP) {
+				// y_start is now the bottom point
+				y_start += pixel_count - 1;
+			}
+
+			/*
+			 * Perform the conversion and the blending
+			 *
+			 * Here we know that the read line (x_start, y_start, pixel_count) is
+			 * inside the source buffer [2] and we don't write outside the stage
+			 * buffer [1]
+			 */
+			current_plane->pixel_read_line(
+				current_plane, x_start, y_start, direction, pixel_count,
+				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
+
+			pre_mul_alpha_blend(stage_buffer, output_buffer,
+					    current_plane->frame_info->dst.x1,
+					    pixel_count);
 		}
 
 		apply_lut(crtc_state, output_buffer);
@@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
 		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
 
 		if (wb)
-			vkms_writeback_row(wb, output_buffer, y_pos);
+			vkms_writeback_row(wb, output_buffer, y);
 	}
 }
 
@@ -261,7 +346,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
 	u32 n_active_planes = crtc_state->num_active_planes;
 
 	for (size_t i = 0; i < n_active_planes; i++)
-		if (!planes[i]->pixel_read)
+		if (!planes[i]->pixel_read_line)
 			return -1;
 
 	if (active_wb && !active_wb->pixel_write)
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 985e7a92b7bc..23e1d247468d 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -39,7 +39,6 @@
 struct vkms_frame_info {
 	struct drm_framebuffer *fb;
 	struct drm_rect src, dst;
-	struct drm_rect rotated;
 	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
 	unsigned int rotation;
 };
@@ -80,26 +79,37 @@ enum pixel_read_direction {
 	READ_LEFT_TO_RIGHT
 };
 
+struct vkms_plane_state;
+
 /**
- * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
+ * typedef pixel_read_line_t - These functions are used to read a pixel line in the source frame,
  * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
  *
- * @in_pixel: Pointer to the pixel to read
- * @out_pixel: Pointer to write the converted pixel
+ * @plane: Plane used as source for the pixel value
+ * @x_start: X (width) coordinate of the first pixel to copy. The caller must ensure that
+ * x_start is non-negative and smaller than @plane->frame_info->fb->width.
+ * @y_start: Y (height) coordinate of the first pixel to copy. The caller must ensure that
+ * y_start is non-negative and smaller than @plane->frame_info->fb->height.
+ * @direction: Direction to use for the copy, starting at @x_start/@y_start
+ * @count: Number of pixels to copy
+ * @out_pixel: Pointer where to write the pixel values. They will be written from @out_pixel[0]
+ * to @out_pixel[@count - 1]. The caller must ensure that out_pixel holds at least @count pixels.
  */
-typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
+typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_start,
+				  int y_start, enum pixel_read_direction direction, int count,
+				  struct pixel_argb_u16 out_pixel[]);
 
 /**
  * vkms_plane_state - Driver specific plane state
  * @base: base plane state
  * @frame_info: data required for composing computation
- * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
- * ensure that this pointer is valid
+ * @pixel_read_line: function to read a pixel line in this plane. The creator of a vkms_plane_state
+ * must ensure that this pointer is valid
  */
 struct vkms_plane_state {
 	struct drm_shadow_plane_state base;
 	struct vkms_frame_info *frame_info;
-	pixel_read_t pixel_read;
+	pixel_read_line_t pixel_read_line;
 };
 
 struct vkms_plane {
@@ -204,7 +214,6 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
 /* Composer Support */
 void vkms_composer_worker(struct work_struct *work);
 void vkms_set_composer(struct vkms_output *out, bool enabled);
-void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
 void vkms_writeback_row(struct vkms_writeback_job *wb, const struct line_buffer *src_buffer, int y);
 
 /* Writeback */
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 743b6fd06db5..1449a0e6c706 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -105,77 +105,45 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
 	return 0;
 }
 
-static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
-				 int plane_index)
-{
-	int x_src = frame_info->src.x1 >> 16;
-	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
-	u8 *addr;
-	int rem_x, rem_y;
-
-	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
-	return addr;
-}
-
-static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
-{
-	if (frame_info->rotation & (DRM_MODE_REFLECT_X | DRM_MODE_ROTATE_270))
-		return limit - x - 1;
-	return x;
-}
-
 /*
- * The following  functions take pixel data from the buffer and convert them to the format
+ * The following functions take pixel data (a, r, g, b, pixel, ...) and convert them to the format
  * ARGB16161616 in out_pixel.
  *
- * They are used in the `vkms_compose_row` function to handle multiple formats.
+ * They are used in the `read_line` functions to avoid duplicating work across pixel formats.
  */
 
-static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static struct pixel_argb_u16 argb_u16_from_u8888(int a, int r, int g, int b)
 {
+	struct pixel_argb_u16 out_pixel;
 	/*
 	 * The 257 is the "conversion ratio". This number is obtained by the
 	 * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
 	 * the best color value in a pixel format with more possibilities.
 	 * A similar idea applies to others RGB color conversions.
 	 */
-	out_pixel->a = (u16)in_pixel[3] * 257;
-	out_pixel->r = (u16)in_pixel[2] * 257;
-	out_pixel->g = (u16)in_pixel[1] * 257;
-	out_pixel->b = (u16)in_pixel[0] * 257;
-}
+	out_pixel.a = (u16)a * 257;
+	out_pixel.r = (u16)r * 257;
+	out_pixel.g = (u16)g * 257;
+	out_pixel.b = (u16)b * 257;
 
-static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
-{
-	out_pixel->a = (u16)0xffff;
-	out_pixel->r = (u16)in_pixel[2] * 257;
-	out_pixel->g = (u16)in_pixel[1] * 257;
-	out_pixel->b = (u16)in_pixel[0] * 257;
+	return out_pixel;
 }
 
-static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static struct pixel_argb_u16 argb_u16_from_u16161616(int a, int r, int g, int b)
 {
-	u16 *pixel = (u16 *)in_pixel;
+	struct pixel_argb_u16 out_pixel;
 
-	out_pixel->a = le16_to_cpu(pixel[3]);
-	out_pixel->r = le16_to_cpu(pixel[2]);
-	out_pixel->g = le16_to_cpu(pixel[1]);
-	out_pixel->b = le16_to_cpu(pixel[0]);
-}
+	out_pixel.a = le16_to_cpu(a);
+	out_pixel.r = le16_to_cpu(r);
+	out_pixel.g = le16_to_cpu(g);
+	out_pixel.b = le16_to_cpu(b);
 
-static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
-{
-	u16 *pixel = (u16 *)in_pixel;
-
-	out_pixel->a = (u16)0xffff;
-	out_pixel->r = le16_to_cpu(pixel[2]);
-	out_pixel->g = le16_to_cpu(pixel[1]);
-	out_pixel->b = le16_to_cpu(pixel[0]);
+	return out_pixel;
 }
 
-static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
 {
-	u16 *pixel = (u16 *)in_pixel;
+	struct pixel_argb_u16 out_pixel;
 
 	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
 	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
@@ -185,12 +153,26 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
 	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
 	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
 
-	out_pixel->a = (u16)0xffff;
-	out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
-	out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
-	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
+	out_pixel.a = (u16)0xffff;
+	out_pixel.r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
+	out_pixel.g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
+	out_pixel.b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
+
+	return out_pixel;
 }
 
+/*
+ * The following functions are the read_line functions for each pixel format supported by VKMS.
+ *
+ * They read a line starting at the point @x_start,@y_start following the @direction. The result
+ * is stored in @out_pixel in the ARGB16161616 format.
+ *
+ * These functions are very similar, but this duplication is required for performance reasons.
+ * Past experiments showed that a generic loop significantly reduces performance [1].
+ *
+ * [1]: https://lore.kernel.org/dri-devel/d258c8dc-78e9-4509-9037-a98f7f33b3a3@riseup.net/
+ */
+
 /**
  * black_to_argb_u16() - pixel_read callback which always read black
  *
@@ -198,42 +180,116 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
  * It is used to avoid null pointer to be used as a function. In theory, this function should
  * never be called, except if you found a bug in the driver/DRM core.
  */
-static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
+static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
+			      int y_start, enum pixel_read_direction direction, int count,
+			      struct pixel_argb_u16 out_pixel[])
 {
-	out_pixel->a = (u16)0xFFFF;
-	out_pixel->r = 0;
-	out_pixel->g = 0;
-	out_pixel->b = 0;
+	struct pixel_argb_u16 *end = out_pixel + count;
+
+	while (out_pixel < end) {
+		*out_pixel = argb_u16_from_u8888(255, 0, 0, 0);
+		out_pixel += 1;
+	}
 }
 
-/**
- * vkms_compose_row - compose a single row of a plane
- * @stage_buffer: output line with the composed pixels
- * @plane: state of the plane that is being composed
- * @y: y coordinate of the row
- *
- * This function composes a single row of a plane. It gets the source pixels
- * through the y coordinate (see get_packed_src_addr()) and goes linearly
- * through the source pixel, reading the pixels and converting it to
- * ARGB16161616 (see the pixel_read() callback). For rotate-90 and rotate-270,
- * the source pixels are not traversed linearly. The source pixels are queried
- * on each iteration in order to traverse the pixels vertically.
- */
-void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y)
+static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
+			       enum pixel_read_direction direction, int count,
+			       struct pixel_argb_u16 out_pixel[])
 {
-	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
-	struct vkms_frame_info *frame_info = plane->frame_info;
-	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
-	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+
+	while (out_pixel < end) {
+		u8 *px = (u8 *)src_pixels;
+		*out_pixel = argb_u16_from_u8888(px[3], px[2], px[1], px[0]);
+		out_pixel += 1;
+		src_pixels += step;
+	}
+}
+
+static void XRGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
+			       enum pixel_read_direction direction, int count,
+			       struct pixel_argb_u16 out_pixel[])
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+
+	while (out_pixel < end) {
+		u8 *px = (u8 *)src_pixels;
+		*out_pixel = argb_u16_from_u8888(255, px[2], px[1], px[0]);
+		out_pixel += 1;
+		src_pixels += step;
+	}
+}
+
+static void ARGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
+				   int y_start, enum pixel_read_direction direction, int count,
+				   struct pixel_argb_u16 out_pixel[])
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+
+	while (out_pixel < end) {
+		u16 *px = (u16 *)src_pixels;
+		*out_pixel = argb_u16_from_u16161616(px[3], px[2], px[1], px[0]);
+		out_pixel += 1;
+		src_pixels += step;
+	}
+}
+
+static void XRGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
+				   int y_start, enum pixel_read_direction direction, int count,
+				   struct pixel_argb_u16 out_pixel[])
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+
+	while (out_pixel < end) {
+		u16 *px = (u16 *)src_pixels;
+		*out_pixel = argb_u16_from_u16161616(0xFFFF, px[2], px[1], px[0]);
+		out_pixel += 1;
+		src_pixels += step;
+	}
+}
+
+static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
+			     int y_start, enum pixel_read_direction direction, int count,
+			     struct pixel_argb_u16 out_pixel[])
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
 
-	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
-		int x_pos = get_x_position(frame_info, limit, x);
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
 
-		if (drm_rotation_90_or_270(frame_info->rotation))
-			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
-				+ frame_info->fb->format->cpp[0] * y;
+	while (out_pixel < end) {
+		u16 *px = (u16 *)src_pixels;
 
-		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
+		*out_pixel = argb_u16_from_RGB565(px);
+		out_pixel += 1;
+		src_pixels += step;
 	}
 }
 
@@ -343,25 +399,25 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
 }
 
 /**
- * Retrieve the correct read_pixel function for a specific format.
+ * Retrieve the correct read_line function for a specific format.
  * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
  * function is returned.
  *
  * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
  */
-pixel_read_t get_pixel_read_function(u32 format)
+pixel_read_line_t get_pixel_read_line_function(u32 format)
 {
 	switch (format) {
 	case DRM_FORMAT_ARGB8888:
-		return &ARGB8888_to_argb_u16;
+		return &ARGB8888_read_line;
 	case DRM_FORMAT_XRGB8888:
-		return &XRGB8888_to_argb_u16;
+		return &XRGB8888_read_line;
 	case DRM_FORMAT_ARGB16161616:
-		return &ARGB16161616_to_argb_u16;
+		return &ARGB16161616_read_line;
 	case DRM_FORMAT_XRGB16161616:
-		return &XRGB16161616_to_argb_u16;
+		return &XRGB16161616_read_line;
 	case DRM_FORMAT_RGB565:
-		return &RGB565_to_argb_u16;
+		return &RGB565_read_line;
 	default:
 		/*
 		 * This is a bug in vkms_plane_atomic_check. All the supported
diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
index 3ecea4563254..8d2bef95ff79 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.h
+++ b/drivers/gpu/drm/vkms/vkms_formats.h
@@ -5,7 +5,7 @@
 
 #include "vkms_drv.h"
 
-pixel_read_t get_pixel_read_function(u32 format);
+pixel_read_line_t get_pixel_read_line_function(u32 format);
 
 pixel_write_t get_pixel_write_function(u32 format);
 
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 10e9b23dab28..8875bed76410 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -112,7 +112,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 	frame_info = vkms_plane_state->frame_info;
 	memcpy(&frame_info->src, &new_state->src, sizeof(struct drm_rect));
 	memcpy(&frame_info->dst, &new_state->dst, sizeof(struct drm_rect));
-	memcpy(&frame_info->rotated, &new_state->dst, sizeof(struct drm_rect));
 	frame_info->fb = fb;
 	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
 	drm_framebuffer_get(frame_info->fb);
@@ -122,10 +121,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 									  DRM_MODE_REFLECT_X |
 									  DRM_MODE_REFLECT_Y);
 
-	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
-			drm_rect_height(&frame_info->rotated), frame_info->rotation);
 
-	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
+	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
 }
 
 static int vkms_plane_atomic_check(struct drm_plane *plane,

-- 
2.43.0



* [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (9 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-13 19:20   ` Randy Dunlap
                     ` (3 more replies)
  2024-03-13 17:45 ` [PATCH v5 12/16] drm/vkms: Add range and encoding properties to the plane Louis Chauvet
                   ` (4 subsequent siblings)
  15 siblings, 4 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

Add support for the YUV formats below:

- NV12/NV16/NV24
- NV21/NV61/NV42
- YUV420/YUV422/YUV444
- YVU420/YVU422/YVU444

The conversion from YUV to RGB is done with fixed-point arithmetic, using
32.32 fixed-point numbers and the drm_fixed helpers.
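
As an illustration of the arithmetic described above, here is a minimal
user-space sketch of a single-pixel conversion with the BT.601 full-range
matrix from this patch. The helper names (fixp_round_to_int, clamp_u8,
yuv_to_rgb_bt601_full) are hypothetical stand-ins, not the drm_fixed API.
Because the pixel operands are plain integers, the sketch multiplies the
32.32 coefficients by them directly instead of going through a
fixed-point multiply helper.

```c
#include <stdint.h>

/* 32.32 fixed point: the low 32 bits hold the fractional part. */
#define FIXP_ONE ((int64_t)1 << 32)

struct rgb8 {
	uint8_t r, g, b;
};

/* BT.601 full-range coefficients, copied from yuv_bt601_full in this patch. */
static const int64_t bt601_full[3][3] = {
	{ 4294967296, 0,           6021544149 },
	{ 4294967296, -1478054095, -3067191994 },
	{ 4294967296, 7610682049,  0 },
};

/* Hypothetical helper: round a 32.32 value to the nearest integer (floor of v + 0.5). */
static int64_t fixp_round_to_int(int64_t v)
{
	int64_t biased = v + FIXP_ONE / 2;
	int64_t q = biased / FIXP_ONE;

	/* C division truncates toward zero; adjust to get a floor. */
	if (biased % FIXP_ONE < 0)
		q -= 1;
	return q;
}

/* Hypothetical helper: saturate to the 8-bit range. */
static uint8_t clamp_u8(int64_t v)
{
	if (v < 0)
		return 0;
	if (v > 255)
		return 255;
	return (uint8_t)v;
}

/* Convert one full-range BT.601 pixel; y_offset is 0 and chroma is centred on 128. */
static struct rgb8 yuv_to_rgb_bt601_full(uint8_t y, uint8_t cb, uint8_t cr)
{
	int64_t in[3] = { y, (int64_t)cb - 128, (int64_t)cr - 128 };
	int64_t ch[3];
	struct rgb8 out;

	for (int i = 0; i < 3; i++) {
		/* 32.32 coefficient times integer input gives a 32.32 result. */
		ch[i] = bt601_full[i][0] * in[0] +
			bt601_full[i][1] * in[1] +
			bt601_full[i][2] * in[2];
	}

	out.r = clamp_u8(fixp_round_to_int(ch[0]));
	out.g = clamp_u8(fixp_round_to_int(ch[1]));
	out.b = clamp_u8(fixp_round_to_int(ch[2]));
	return out;
}
```

For example, yuv_to_rgb_bt601_full(128, 128, 128) gives a mid gray with
r = g = b = 128, since both chroma terms are zero.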

To do the conversion, a specific matrix must be used for each color range
(DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
the `conversion_matrix` struct, along with the specific y_offset needed.
The matrix is queried only once, in `vkms_plane_atomic_update`, and
stored in the `vkms_plane_state`. The conversion matrices for each
encoding and range were obtained by rounding the values of the original
conversion matrices multiplied by 2^32. This is done to avoid the use of
floating point operations.
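
The rounding step can be sketched in user space as follows. This is only
an illustration of how a real-valued coefficient becomes a 32.32
constant; the function name coeff_to_fixp is hypothetical, and floating
point is used here precisely because this runs outside the kernel. The
coefficients 1.402 and 1.772 are the BT.601 full-range Cr-to-R and
Cb-to-B factors.

```c
#include <stdint.h>

/* 2^32 as a double: the scale factor of a 32.32 fixed-point number. */
#define FIXP_SCALE 4294967296.0

/*
 * Hypothetical helper: scale a real coefficient by 2^32 and round to the
 * nearest integer, giving the constant to store in the matrix.
 */
static int64_t coeff_to_fixp(double coeff)
{
	double scaled = coeff * FIXP_SCALE;

	/* Round half away from zero without depending on libm. */
	return (int64_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}
```

With this helper, coeff_to_fixp(1.402) and coeff_to_fixp(1.772) reproduce
the 6021544149 and 7610682049 entries of yuv_bt601_full, and
coeff_to_fixp(1.0) gives 4294967296.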

The same reading function is used for YUV and YVU formats. As the only
difference between those two categories of formats is the order of the
chroma fields, a simple swap in the conversion matrix columns allows using
the same function.
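
The column swap can be checked with a small user-space sketch. The
function names (dot3, swap_chroma_columns) are hypothetical and only
serve the illustration; the matrix values are copied from yuv_bt601_full
in this patch. Swapping the two chroma columns and feeding the chroma
samples in the opposite order yields the same row products.

```c
#include <stdint.h>

/* BT.601 full-range matrix for Y/Cb/Cr input order, from this patch. */
static const int64_t yuv_bt601_full[3][3] = {
	{ 4294967296, 0,           6021544149 },
	{ 4294967296, -1478054095, -3067191994 },
	{ 4294967296, 7610682049,  0 },
};

/* Hypothetical helper: product of one matrix row with an input vector. */
static int64_t dot3(const int64_t row[3], int64_t a, int64_t b, int64_t c)
{
	return row[0] * a + row[1] * b + row[2] * c;
}

/*
 * Hypothetical helper: build the matrix for Y/Cr/Cb input order by
 * swapping the two chroma columns (the second and third ones).
 */
static void swap_chroma_columns(const int64_t in[3][3], int64_t out[3][3])
{
	for (int i = 0; i < 3; i++) {
		out[i][0] = in[i][0];
		out[i][1] = in[i][2];
		out[i][2] = in[i][1];
	}
}
```

Applied to yuv_bt601_full, swap_chroma_columns() produces the values
listed for yvu_bt601_full, which is why a single read_line callback can
serve both families of formats.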

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
[Louis Chauvet:
- Adapted Arthur's work
- Implemented the read_line_t callbacks for yuv
- add struct conversion_matrix
- remove struct pixel_yuv_u8
- update the commit message
- Merge the modifications from Arthur]
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
 drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_formats.h |   4 +
 drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
 4 files changed, 473 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 23e1d247468d..f3116084de5a 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
 				  int y_start, enum pixel_read_direction direction, int count,
 				  struct pixel_argb_u16 out_pixel[]);
 
+/**
+ * CONVERSION_MATRIX_FLOAT_DEPTH - Number of fractional bits in conversion matrix values
+ */
+#define CONVERSION_MATRIX_FLOAT_DEPTH 32
+
+/**
+ * struct conversion_matrix - Matrix to use for a specific encoding and range
+ *
+ * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
+ * used to compute rgb values from yuv values:
+ *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
+ *   OR for yvu formats:
+ *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
+ *  The values of the matrix are fixed-point numbers, 32.CONVERSION_MATRIX_FLOAT_DEPTH
+ * @y_offset: Offset to apply on the y value.
+ */
+struct conversion_matrix {
+	s64 matrix[3][3];
+	s64 y_offset;
+};
+
 /**
  * vkms_plane_state - Driver specific plane state
  * @base: base plane state
@@ -110,6 +131,7 @@ struct vkms_plane_state {
 	struct drm_shadow_plane_state base;
 	struct vkms_frame_info *frame_info;
 	pixel_read_line_t pixel_read_line;
+	struct conversion_matrix *conversion_matrix;
 };
 
 struct vkms_plane {
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 1449a0e6c706..edbf4b321b91 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
 	return 0;
 }
 
+/**
+ * get_subsampling() - Get the subsampling divisor value on a specific direction
+ */
+static int get_subsampling(const struct drm_format_info *format,
+			   enum pixel_read_direction direction)
+{
+	switch (direction) {
+	case READ_BOTTOM_TO_TOP:
+	case READ_TOP_TO_BOTTOM:
+		return format->vsub;
+	case READ_RIGHT_TO_LEFT:
+	case READ_LEFT_TO_RIGHT:
+		return format->hsub;
+	}
+	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
+	return 1;
+}
+
+/**
+ * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
+ * x_start and y_start values
+ */
+static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
+{
+	switch (direction) {
+	case READ_BOTTOM_TO_TOP:
+		return -y_start - 1;
+	case READ_TOP_TO_BOTTOM:
+		return y_start;
+	case READ_RIGHT_TO_LEFT:
+		return -x_start - 1;
+	case READ_LEFT_TO_RIGHT:
+		return x_start;
+	}
+	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
+	return 0;
+}
+
 /*
  * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
  * ARGB16161616 in out_pixel.
@@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
 	return out_pixel;
 }
 
+static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
+						  struct conversion_matrix *matrix)
+{
+	u8 r, g, b;
+	s64 fp_y, fp_cb, fp_cr;
+	s64 fp_r, fp_g, fp_b;
+
+	fp_y = y - matrix->y_offset;
+	fp_cb = cb - 128;
+	fp_cr = cr - 128;
+
+	fp_y = drm_int2fixp(fp_y);
+	fp_cb = drm_int2fixp(fp_cb);
+	fp_cr = drm_int2fixp(fp_cr);
+
+	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
+	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
+	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
+	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
+	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
+	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
+	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
+	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
+	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
+
+	fp_r = drm_fixp2int_round(fp_r);
+	fp_g = drm_fixp2int_round(fp_g);
+	fp_b = drm_fixp2int_round(fp_b);
+
+	r = clamp(fp_r, 0, 0xff);
+	g = clamp(fp_g, 0, 0xff);
+	b = clamp(fp_b, 0, 0xff);
+
+	return argb_u16_from_u8888(255, r, g, b);
+}
+
 /*
  * The following functions are read_line function for each pixel format supported by VKMS.
  *
@@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
 	}
 }
 
+/*
+ * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
+ * (column inversion)
+ */
+static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
+				      int y_start, enum pixel_read_direction direction, int count,
+				      struct pixel_argb_u16 out_pixel[])
+{
+	int rem_x, rem_y;
+	u8 *y_plane;
+	u8 *uv_plane;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
+	packed_pixels_addr(plane->frame_info,
+			   x_start / plane->frame_info->fb->format->hsub,
+			   y_start / plane->frame_info->fb->format->vsub,
+			   1, &uv_plane, &rem_x, &rem_y);
+	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
+	int step_uv = get_step_next_block(plane->frame_info->fb, direction, 1);
+	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
+	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
+	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
+
+	for (int i = 0; i < count; i++) {
+		*out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
+						  conversion_matrix);
+		out_pixel += 1;
+		y_plane += step_y;
+		if ((i + subsampling_offset + 1) % subsampling == 0)
+			uv_plane += step_uv;
+	}
+}
+
+/*
+ * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
+ * (column inversion)
+ */
+static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
+				 int y_start, enum pixel_read_direction direction, int count,
+				 struct pixel_argb_u16 out_pixel[])
+{
+	int rem_x, rem_y;
+	u8 *y_plane;
+	u8 *u_plane;
+	u8 *v_plane;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
+	packed_pixels_addr(plane->frame_info,
+			   x_start / plane->frame_info->fb->format->hsub,
+			   y_start / plane->frame_info->fb->format->vsub,
+			   1, &u_plane, &rem_x, &rem_y);
+	packed_pixels_addr(plane->frame_info,
+			   x_start / plane->frame_info->fb->format->hsub,
+			   y_start / plane->frame_info->fb->format->vsub,
+			   2, &v_plane, &rem_x, &rem_y);
+	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
+	int step_u = get_step_next_block(plane->frame_info->fb, direction, 1);
+	int step_v = get_step_next_block(plane->frame_info->fb, direction, 2);
+	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
+	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
+	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
+
+	for (int i = 0; i < count; i++) {
+		*out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
+		out_pixel += 1;
+		y_plane += step_y;
+		if ((i + subsampling_offset + 1) % subsampling == 0) {
+			u_plane += step_u;
+			v_plane += step_v;
+		}
+	}
+}
+
 /*
  * The following functions take one argb_u16 pixel and convert it to a specific format. The
  * result is stored in @out_pixel.
@@ -418,6 +565,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
 		return &XRGB16161616_read_line;
 	case DRM_FORMAT_RGB565:
 		return &RGB565_read_line;
+	case DRM_FORMAT_NV12:
+	case DRM_FORMAT_NV16:
+	case DRM_FORMAT_NV24:
+	case DRM_FORMAT_NV21:
+	case DRM_FORMAT_NV61:
+	case DRM_FORMAT_NV42:
+		return &semi_planar_yuv_read_line;
+	case DRM_FORMAT_YUV420:
+	case DRM_FORMAT_YUV422:
+	case DRM_FORMAT_YUV444:
+	case DRM_FORMAT_YVU420:
+	case DRM_FORMAT_YVU422:
+	case DRM_FORMAT_YVU444:
+		return &planar_yuv_read_line;
 	default:
 		/*
 		 * This is a bug in vkms_plane_atomic_check. All the supported
@@ -435,6 +596,276 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
 	}
 }
 
+/**
+ * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
+ * given encoding and range.
+ *
+ * For RGB formats, or if the combination is not supported (a warning is then emitted), it returns
+ * a simple diagonal matrix, which acts as a "no-op".
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
+ * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
+ */
+struct conversion_matrix *
+get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
+				  enum drm_color_range range)
+{
+	static struct conversion_matrix no_operation = {
+		.matrix = {
+			{ 4294967296, 0,          0, },
+			{ 0,          4294967296, 0, },
+			{ 0,          0,          4294967296, },
+		},
+		.y_offset = 0,
+	};
+
+	/*
+	 * Those matrices were generated using the colour python framework
+	 *
+	 * Below are the function calls used to generate each matrix, go to
+	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
+	 * for more info:
+	 *
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
+	 *                                  is_legal = False,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt601_full = {
+		.matrix = {
+			{ 4294967296, 0,           6021544149 },
+			{ 4294967296, -1478054095, -3067191994 },
+			{ 4294967296, 7610682049,  0 },
+		},
+		.y_offset = 0,
+	};
+
+	/*
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
+	 *                                  is_legal = True,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt601_limited = {
+		.matrix = {
+			{ 5020601039, 0,           6881764740 },
+			{ 5020601039, -1689204679, -3505362278 },
+			{ 5020601039, 8697922339,  0 },
+		},
+		.y_offset = 16,
+	};
+
+	/*
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
+	 *                                  is_legal = False,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt709_full = {
+		.matrix = {
+			{ 4294967296, 0,          6763714498 },
+			{ 4294967296, -804551626, -2010578443 },
+			{ 4294967296, 7969741314, 0 },
+		},
+		.y_offset = 0,
+	};
+
+	/*
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
+	 *                                  is_legal = True,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt709_limited = {
+		.matrix = {
+			{ 5020601039, 0,          7729959424 },
+			{ 5020601039, -919487572, -2297803934 },
+			{ 5020601039, 9108275786, 0 },
+		},
+		.y_offset = 16,
+	};
+
+	/*
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
+	 *                                  is_legal = False,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt2020_full = {
+		.matrix = {
+			{ 4294967296, 0,          6333358775 },
+			{ 4294967296, -706750298, -2453942994 },
+			{ 4294967296, 8080551471, 0 },
+		},
+		.y_offset = 0,
+	};
+
+	/*
+	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
+	 *                                  is_legal = True,
+	 *                                  bits = 8) * 2**32).astype(int)
+	 */
+	static struct conversion_matrix yuv_bt2020_limited = {
+		.matrix = {
+			{ 5020601039, 0,          7238124312 },
+			{ 5020601039, -807714626, -2804506279 },
+			{ 5020601039, 9234915964, 0 },
+		},
+		.y_offset = 16,
+	};
+
+	/*
+	 * The next matrices are just the previous ones, but with the second and
+	 * third columns swapped
+	 */
+	static struct conversion_matrix yvu_bt601_full = {
+		.matrix = {
+			{ 4294967296, 6021544149,  0 },
+			{ 4294967296, -3067191994, -1478054095 },
+			{ 4294967296, 0,           7610682049 },
+		},
+		.y_offset = 0,
+	};
+	static struct conversion_matrix yvu_bt601_limited = {
+		.matrix = {
+			{ 5020601039, 6881764740,  0 },
+			{ 5020601039, -3505362278, -1689204679 },
+			{ 5020601039, 0,           8697922339 },
+		},
+		.y_offset = 16,
+	};
+	static struct conversion_matrix yvu_bt709_full = {
+		.matrix = {
+			{ 4294967296, 6763714498,  0 },
+			{ 4294967296, -2010578443, -804551626 },
+			{ 4294967296, 0,           7969741314 },
+		},
+		.y_offset = 0,
+	};
+	static struct conversion_matrix yvu_bt709_limited = {
+		.matrix = {
+			{ 5020601039, 7729959424,  0 },
+			{ 5020601039, -2297803934, -919487572 },
+			{ 5020601039, 0,           9108275786 },
+		},
+		.y_offset = 16,
+	};
+	static struct conversion_matrix yvu_bt2020_full = {
+		.matrix = {
+			{ 4294967296, 6333358775,  0 },
+			{ 4294967296, -2453942994, -706750298 },
+			{ 4294967296, 0,           8080551471 },
+		},
+		.y_offset = 0,
+	};
+	static struct conversion_matrix yvu_bt2020_limited = {
+		.matrix = {
+			{ 5020601039, 7238124312,  0 },
+			{ 5020601039, -2804506279, -807714626 },
+			{ 5020601039, 0,           9234915964 },
+		},
+		.y_offset = 16,
+	};
+
+	/* Breaking in this switch means that the color format+encoding+range is not supported */
+	switch (format) {
+	case DRM_FORMAT_NV12:
+	case DRM_FORMAT_NV16:
+	case DRM_FORMAT_NV24:
+	case DRM_FORMAT_YUV420:
+	case DRM_FORMAT_YUV422:
+	case DRM_FORMAT_YUV444:
+		switch (encoding) {
+		case DRM_COLOR_YCBCR_BT601:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yuv_bt601_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yuv_bt601_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_YCBCR_BT709:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yuv_bt709_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yuv_bt709_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_YCBCR_BT2020:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yuv_bt2020_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yuv_bt2020_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_ENCODING_MAX:
+			break;
+		}
+		break;
+	case DRM_FORMAT_YVU420:
+	case DRM_FORMAT_YVU422:
+	case DRM_FORMAT_YVU444:
+	case DRM_FORMAT_NV21:
+	case DRM_FORMAT_NV61:
+	case DRM_FORMAT_NV42:
+		switch (encoding) {
+		case DRM_COLOR_YCBCR_BT601:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yvu_bt601_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yvu_bt601_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_YCBCR_BT709:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yvu_bt709_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yvu_bt709_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_YCBCR_BT2020:
+			switch (range) {
+			case DRM_COLOR_YCBCR_LIMITED_RANGE:
+				return &yvu_bt2020_limited;
+			case DRM_COLOR_YCBCR_FULL_RANGE:
+				return &yvu_bt2020_full;
+			case DRM_COLOR_RANGE_MAX:
+				break;
+			}
+			break;
+		case DRM_COLOR_ENCODING_MAX:
+			break;
+		}
+		break;
+	case DRM_FORMAT_ARGB8888:
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_ARGB16161616:
+	case DRM_FORMAT_XRGB16161616:
+	case DRM_FORMAT_RGB565:
+		/*
+		 * Those formats are supported, but they don't need a conversion matrix. Return
+		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
+		 * somewhere.
+		 */
+		return &no_operation;
+	default:
+		break;
+	}
+	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
+	     encoding, range, &format);
+	return &no_operation;
+}
+
 /**
  * Retrieve the correct write_pixel function for a specific format.
  * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
index 8d2bef95ff79..e1d324764b17 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.h
+++ b/drivers/gpu/drm/vkms/vkms_formats.h
@@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
 
 pixel_write_t get_pixel_write_function(u32 format);
 
+struct conversion_matrix *
+get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
+				  enum drm_color_range range);
+
 #endif /* _VKMS_FORMATS_H_ */
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 8875bed76410..987dd2b686a8 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
 	DRM_FORMAT_XRGB8888,
 	DRM_FORMAT_XRGB16161616,
 	DRM_FORMAT_ARGB16161616,
-	DRM_FORMAT_RGB565
+	DRM_FORMAT_RGB565,
+	DRM_FORMAT_NV12,
+	DRM_FORMAT_NV16,
+	DRM_FORMAT_NV24,
+	DRM_FORMAT_NV21,
+	DRM_FORMAT_NV61,
+	DRM_FORMAT_NV42,
+	DRM_FORMAT_YUV420,
+	DRM_FORMAT_YUV422,
+	DRM_FORMAT_YUV444,
+	DRM_FORMAT_YVU420,
+	DRM_FORMAT_YVU422,
+	DRM_FORMAT_YVU444
 };
 
 static struct drm_plane_state *
@@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
 	drm_framebuffer_get(frame_info->fb);
 	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
 									  DRM_MODE_ROTATE_90 |
+									  DRM_MODE_ROTATE_180 |
 									  DRM_MODE_ROTATE_270 |
 									  DRM_MODE_REFLECT_X |
 									  DRM_MODE_REFLECT_Y);
 
 
 	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
+	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
+		(fmt, new_state->color_encoding, new_state->color_range);
 }
 
 static int vkms_plane_atomic_check(struct drm_plane *plane,

-- 
2.43.0



* [PATCH v5 12/16] drm/vkms: Add range and encoding properties to the plane
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (10 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-13 17:45 ` [PATCH v5 13/16] drm/vkms: Drop YUV formats TODO Louis Chauvet
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

Now that the driver internally handles these quantization ranges and YUV
encoding matrices, expose the UAPI for setting them.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
[Louis Chauvet: retained only relevant parts, updated the commit message]
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_plane.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 987dd2b686a8..e21cc92cf497 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -224,5 +224,14 @@ struct vkms_plane *vkms_plane_init(struct vkms_device *vkmsdev,
 	drm_plane_create_rotation_property(&plane->base, DRM_MODE_ROTATE_0,
 					   DRM_MODE_ROTATE_MASK | DRM_MODE_REFLECT_MASK);
 
+	drm_plane_create_color_properties(&plane->base,
+					  BIT(DRM_COLOR_YCBCR_BT601) |
+					  BIT(DRM_COLOR_YCBCR_BT709) |
+					  BIT(DRM_COLOR_YCBCR_BT2020),
+					  BIT(DRM_COLOR_YCBCR_LIMITED_RANGE) |
+					  BIT(DRM_COLOR_YCBCR_FULL_RANGE),
+					  DRM_COLOR_YCBCR_BT601,
+					  DRM_COLOR_YCBCR_FULL_RANGE);
+
 	return plane;
 }

-- 
2.43.0



* [PATCH v5 13/16] drm/vkms: Drop YUV formats TODO
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (11 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 12/16] drm/vkms: Add range and encoding properties to the plane Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-13 17:45 ` [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions Louis Chauvet
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

VKMS has support for YUV formats now. Remove the task from the TODO
list.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 Documentation/gpu/vkms.rst | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/gpu/vkms.rst b/Documentation/gpu/vkms.rst
index ba04ac7c2167..13b866c3617c 100644
--- a/Documentation/gpu/vkms.rst
+++ b/Documentation/gpu/vkms.rst
@@ -122,8 +122,7 @@ There's lots of plane features we could add support for:
 
 - Scaling.
 
-- Additional buffer formats, especially YUV formats for video like NV12.
-  Low/high bpp RGB formats would also be interesting.
+- Additional buffer formats. Low/high bpp RGB formats would be interesting.
 
 - Async updates (currently only possible on cursor plane using the legacy
   cursor api).

-- 
2.43.0



* [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (12 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 13/16] drm/vkms: Drop YUV formats TODO Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-25 14:34   ` Maíra Canal
  2024-03-13 17:45 ` [PATCH v5 15/16] drm/vkms: Add how to run the Kunit tests Louis Chauvet
  2024-03-13 17:45 ` [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R* Louis Chauvet
  15 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

Create KUnit tests to test the conversion between YUV and RGB. Test each
conversion and range combination with some common colors.

The code used to compute the expected results can be found in the comments.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
[Louis Chauvet:
- fix minor formatting issues (whitespace, double line)
- change expected alpha from 0x0000 to 0xffff
- adapt to the new get_conversion_matrix usage
- apply the changes from Arthur
- move struct pixel_yuv_u8 to the test itself]
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/Kconfig                  |  15 ++
 drivers/gpu/drm/vkms/Makefile                 |   1 +
 drivers/gpu/drm/vkms/tests/.kunitconfig       |   4 +
 drivers/gpu/drm/vkms/tests/Makefile           |   3 +
 drivers/gpu/drm/vkms/tests/vkms_format_test.c | 230 ++++++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_formats.c           |   7 +-
 drivers/gpu/drm/vkms/vkms_formats.h           |   4 +
 7 files changed, 262 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
index b9ecdebecb0b..9b0e1940c14f 100644
--- a/drivers/gpu/drm/vkms/Kconfig
+++ b/drivers/gpu/drm/vkms/Kconfig
@@ -13,3 +13,18 @@ config DRM_VKMS
 	  a VKMS.
 
 	  If M is selected the module will be called vkms.
+
+config DRM_VKMS_KUNIT_TESTS
+	tristate "Tests for VKMS" if !KUNIT_ALL_TESTS
+	depends on DRM_VKMS && KUNIT
+	default KUNIT_ALL_TESTS
+	help
+	  This builds unit tests for VKMS. This option is not useful for
+	  distributions or general kernels, but only for kernel
+	  developers working on VKMS.
+
+	  For more information on KUnit and unit tests in general,
+	  please refer to the KUnit documentation in
+	  Documentation/dev-tools/kunit/.
+
+	  If in doubt, say "N".
\ No newline at end of file
diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
index 1b28a6a32948..8d3e46dde635 100644
--- a/drivers/gpu/drm/vkms/Makefile
+++ b/drivers/gpu/drm/vkms/Makefile
@@ -9,3 +9,4 @@ vkms-y := \
 	vkms_writeback.o
 
 obj-$(CONFIG_DRM_VKMS) += vkms.o
+obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += tests/
diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig b/drivers/gpu/drm/vkms/tests/.kunitconfig
new file mode 100644
index 000000000000..70e378228cbd
--- /dev/null
+++ b/drivers/gpu/drm/vkms/tests/.kunitconfig
@@ -0,0 +1,4 @@
+CONFIG_KUNIT=y
+CONFIG_DRM=y
+CONFIG_DRM_VKMS=y
+CONFIG_DRM_VKMS_KUNIT_TESTS=y
diff --git a/drivers/gpu/drm/vkms/tests/Makefile b/drivers/gpu/drm/vkms/tests/Makefile
new file mode 100644
index 000000000000..2d1df668569e
--- /dev/null
+++ b/drivers/gpu/drm/vkms/tests/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += vkms_format_test.o
diff --git a/drivers/gpu/drm/vkms/tests/vkms_format_test.c b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
new file mode 100644
index 000000000000..0954d606e44a
--- /dev/null
+++ b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
@@ -0,0 +1,230 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <kunit/test.h>
+
+#include <drm/drm_fixed.h>
+#include <drm/drm_fourcc.h>
+#include <drm/drm_print.h>
+
+#include "../../drm_crtc_internal.h"
+
+#include "../vkms_drv.h"
+#include "../vkms_formats.h"
+
+#define TEST_BUFF_SIZE 50
+
+struct pixel_yuv_u8 {
+	u8 y, u, v;
+};
+
+struct yuv_u8_to_argb_u16_case {
+	enum drm_color_encoding encoding;
+	enum drm_color_range range;
+	size_t n_colors;
+	struct format_pair {
+		char *name;
+		struct pixel_yuv_u8 yuv;
+		struct pixel_argb_u16 argb;
+	} colors[TEST_BUFF_SIZE];
+};
+
+/*
+ * The YUV color representations were acquired via the colour python framework.
+ * Below are the function calls used for generating each case.
+ *
+ * For more information, go to the docs:
+ * https://colour.readthedocs.io/en/master/generated/colour.RGB_to_YCbCr.html
+ */
+static struct yuv_u8_to_argb_u16_case yuv_u8_to_argb_u16_cases[] = {
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = False,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT601,
+		.range = DRM_COLOR_YCBCR_FULL_RANGE,
+		.n_colors = 6,
+		.colors = {
+			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x4c, 0x55, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0x96, 0x2c, 0x15 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x1d, 0xff, 0x6b }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = True,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT601,
+		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
+		.n_colors = 6,
+		.colors = {
+			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x51, 0x5a, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0x91, 0x36, 0x22 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x29, 0xf0, 0x6e }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = False,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT709,
+		.range = DRM_COLOR_YCBCR_FULL_RANGE,
+		.n_colors = 4,
+		.colors = {
+			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x36, 0x63, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0xb6, 0x1e, 0x0c }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x12, 0xff, 0x74 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = True,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT709,
+		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
+		.n_colors = 4,
+		.colors = {
+			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x3f, 0x66, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0xad, 0x2a, 0x1a }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x20, 0xf0, 0x76 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = False,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT2020,
+		.range = DRM_COLOR_YCBCR_FULL_RANGE,
+		.n_colors = 4,
+		.colors = {
+			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x43, 0x5c, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0xad, 0x24, 0x0b }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x0f, 0xff, 0x76 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+	/*
+	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
+	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
+	 *                     in_bits = 16,
+	 *                     in_legal = False,
+	 *                     in_int = True,
+	 *                     out_bits = 8,
+	 *                     out_legal = True,
+	 *                     out_int = True)
+	 */
+	{
+		.encoding = DRM_COLOR_YCBCR_BT2020,
+		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
+		.n_colors = 4,
+		.colors = {
+			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
+			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
+			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
+			{ "red",   { 0x4a, 0x61, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
+			{ "green", { 0xa4, 0x2f, 0x19 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
+			{ "blue",  { 0x1d, 0xf0, 0x77 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
+		},
+	},
+};
+
+static void vkms_format_test_yuv_u8_to_argb_u16(struct kunit *test)
+{
+	const struct yuv_u8_to_argb_u16_case *param = test->param_value;
+	struct pixel_argb_u16 argb;
+
+	for (size_t i = 0; i < param->n_colors; i++) {
+		const struct format_pair *color = &param->colors[i];
+
+		struct conversion_matrix *matrix = get_conversion_matrix_to_argb_u16
+			(DRM_FORMAT_NV12, param->encoding, param->range);
+
+		argb = argb_u16_from_yuv888(color->yuv.y, color->yuv.u, color->yuv.v, matrix);
+
+		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.a, color->argb.a), 257,
+				    "On the A channel of the color %s expected 0x%04x, got 0x%04x",
+				    color->name, color->argb.a, argb.a);
+		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.r, color->argb.r), 257,
+				    "On the R channel of the color %s expected 0x%04x, got 0x%04x",
+				    color->name, color->argb.r, argb.r);
+		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.g, color->argb.g), 257,
+				    "On the G channel of the color %s expected 0x%04x, got 0x%04x",
+				    color->name, color->argb.g, argb.g);
+		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.b, color->argb.b), 257,
+				    "On the B channel of the color %s expected 0x%04x, got 0x%04x",
+				    color->name, color->argb.b, argb.b);
+	}
+}
+
+static void vkms_format_test_yuv_u8_to_argb_u16_case_desc(struct yuv_u8_to_argb_u16_case *t,
+							  char *desc)
+{
+	snprintf(desc, KUNIT_PARAM_DESC_SIZE, "%s - %s",
+		 drm_get_color_encoding_name(t->encoding), drm_get_color_range_name(t->range));
+}
+
+KUNIT_ARRAY_PARAM(yuv_u8_to_argb_u16, yuv_u8_to_argb_u16_cases,
+		  vkms_format_test_yuv_u8_to_argb_u16_case_desc
+);
+
+static struct kunit_case vkms_format_test_cases[] = {
+	KUNIT_CASE_PARAM(vkms_format_test_yuv_u8_to_argb_u16, yuv_u8_to_argb_u16_gen_params),
+	{}
+};
+
+static struct kunit_suite vkms_format_test_suite = {
+	.name = "vkms-format",
+	.test_cases = vkms_format_test_cases,
+};
+
+kunit_test_suite(vkms_format_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index edbf4b321b91..863fc91d6d48 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -7,6 +7,8 @@
 #include <drm/drm_rect.h>
 #include <drm/drm_fixed.h>
 
+#include <kunit/visibility.h>
+
 #include "vkms_formats.h"
 
 /**
@@ -199,8 +201,8 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
 	return out_pixel;
 }
 
-static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
-						  struct conversion_matrix *matrix)
+VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
+							    struct conversion_matrix *matrix)
 {
 	u8 r, g, b;
 	s64 fp_y, fp_cb, fp_cr;
@@ -234,6 +236,7 @@ static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
 
 	return argb_u16_from_u8888(255, r, g, b);
 }
+EXPORT_SYMBOL_IF_KUNIT(argb_u16_from_yuv888);
 
 /*
  * The following functions are read_line function for each pixel format supported by VKMS.
diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
index e1d324764b17..21e66a0cac16 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.h
+++ b/drivers/gpu/drm/vkms/vkms_formats.h
@@ -13,4 +13,8 @@ struct conversion_matrix *
 get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
 				  enum drm_color_range range);
 
+#if IS_ENABLED(CONFIG_KUNIT)
+struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr, struct conversion_matrix *matrix);
+#endif
+
 #endif /* _VKMS_FORMATS_H_ */

-- 
2.43.0
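
Note on the tolerance used by the test above: 257 is one 8-bit step once a
channel is widened to 16 bits, since 0xff * 257 == 0xffff. The following is a
minimal userspace sketch of that idea, not the driver code. It assumes
textbook floating-point BT.601 full-range coefficients instead of the
fixed-point matrix used by VKMS, and the names argb16, expand_u8_to_u16,
bt601_full_to_rgb16 and channel_close are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's struct pixel_argb_u16. */
struct argb16 {
	uint16_t a, r, g, b;
};

/* Widen an 8-bit channel to 16 bits: 0x00 -> 0x0000, 0xff -> 0xffff. */
static uint16_t expand_u8_to_u16(uint8_t v)
{
	return (uint16_t)(v * 257);
}

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t clamp_round_u8(double v)
{
	if (v < 0.0)
		return 0;
	if (v > 255.0)
		return 255;
	return (uint8_t)(v + 0.5);
}

/* Full-range BT.601 YCbCr -> RGB with floating-point coefficients (assumption). */
static struct argb16 bt601_full_to_rgb16(uint8_t y, uint8_t cb, uint8_t cr)
{
	double fy = y, fcb = cb - 128.0, fcr = cr - 128.0;
	struct argb16 out;

	out.a = 0xffff;
	out.r = expand_u8_to_u16(clamp_round_u8(fy + 1.402 * fcr));
	out.g = expand_u8_to_u16(clamp_round_u8(fy - 0.344136 * fcb - 0.714136 * fcr));
	out.b = expand_u8_to_u16(clamp_round_u8(fy + 1.772 * fcb));
	return out;
}

/* Same comparison as the test: at most one 8-bit step of error per channel. */
static int channel_close(uint16_t got, uint16_t expected)
{
	return abs((int)got - (int)expected) <= 257;
}
```

With the "red" entry of the first test case (0x4c, 0x55, 0xff), the red
channel of this sketch comes out one 8-bit step below 0xffff, which is the
kind of rounding error the tolerance is meant to absorb.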



* [PATCH v5 15/16] drm/vkms: Add how to run the Kunit tests
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (13 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-13 17:45 ` [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R* Louis Chauvet
  15 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

From: Arthur Grillo <arthurgrillo@riseup.net>

Now that we have KUnit tests, add instructions on how to run them.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 Documentation/gpu/vkms.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/gpu/vkms.rst b/Documentation/gpu/vkms.rst
index 13b866c3617c..5ef5ef2e6a21 100644
--- a/Documentation/gpu/vkms.rst
+++ b/Documentation/gpu/vkms.rst
@@ -89,6 +89,17 @@ You can also run subtests if you do not want to run the entire test::
   sudo ./build/tests/kms_flip --run-subtest basic-plain-flip --device "sys:/sys/devices/platform/vkms"
   sudo IGT_DEVICE="sys:/sys/devices/platform/vkms" ./build/tests/kms_flip --run-subtest basic-plain-flip
 
+Testing With KUnit
+==================
+
+KUnit (Kernel unit testing framework) provides a common framework for unit tests
+within the Linux kernel.
+More information in ../dev-tools/kunit/index.rst .
+
+To run the VKMS KUnit tests::
+
+  tools/testing/kunit/kunit.py run --kunitconfig=drivers/gpu/drm/vkms/tests
+
 TODO
 ====
 

-- 
2.43.0



* [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R*
  2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
                   ` (14 preceding siblings ...)
  2024-03-13 17:45 ` [PATCH v5 15/16] drm/vkms: Add how to run the Kunit tests Louis Chauvet
@ 2024-03-13 17:45 ` Louis Chauvet
  2024-03-28 14:00   ` Pekka Paalanen
  15 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-13 17:45 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee,
	Louis Chauvet

This adds support for:
- R1/R2/R4/R8

R1 format was tested with [1] and [2].

[1]: https://lore.kernel.org/r/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com
[2]: https://lore.kernel.org/igt-dev/20240306-b4-kms_tests-v1-0-8fe451efd2ac@bootlin.com/

Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_formats.c | 100 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_plane.c   |   6 ++-
 2 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 863fc91d6d48..cbb2ec09564a 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -201,6 +201,11 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
 	return out_pixel;
 }
 
+static struct pixel_argb_u16 argb_u16_from_gray8(u8 gray)
+{
+	return argb_u16_from_u8888(255, gray, gray, gray);
+}
+
 VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
 							    struct conversion_matrix *matrix)
 {
@@ -269,6 +274,89 @@ static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
 	}
 }
 
+static void Rx_read_line(const struct vkms_plane_state *plane, int x_start,
+			 int y_start, enum pixel_read_direction direction, int count,
+			 struct pixel_argb_u16 out_pixel[], u8 bit_per_pixel, u8 lum_per_level)
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+	int bit_offset = (int)rem_x * bit_per_pixel;
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+	int mask = (0x1 << bit_per_pixel) - 1;
+
+	if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
+		int restart_bit_offset = 0;
+		int step_bit_offset = bit_per_pixel;
+
+		if (direction == READ_RIGHT_TO_LEFT) {
+			restart_bit_offset = 8 - bit_per_pixel;
+			step_bit_offset = -bit_per_pixel;
+		}
+
+		while (out_pixel < end) {
+			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;
+
+			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
+
+			bit_offset += step_bit_offset;
+			if (bit_offset < 0 || 8 <= bit_offset) {
+				bit_offset = restart_bit_offset;
+				src_pixels += step;
+			}
+			out_pixel += 1;
+		}
+	} else if (direction == READ_TOP_TO_BOTTOM || direction == READ_BOTTOM_TO_TOP) {
+		while (out_pixel < end) {
+			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;
+			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
+			src_pixels += step;
+			out_pixel += 1;
+		}
+	}
+}
+
+static void R1_read_line(const struct vkms_plane_state *plane, int x_start,
+			 int y_start, enum pixel_read_direction direction, int count,
+			 struct pixel_argb_u16 out_pixel[])
+{
+	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 1, 0xFF);
+}
+
+static void R2_read_line(const struct vkms_plane_state *plane, int x_start,
+			 int y_start, enum pixel_read_direction direction, int count,
+			 struct pixel_argb_u16 out_pixel[])
+{
+	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 2, 0x55);
+}
+
+static void R4_read_line(const struct vkms_plane_state *plane, int x_start,
+			 int y_start, enum pixel_read_direction direction, int count,
+			 struct pixel_argb_u16 out_pixel[])
+{
+	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 4, 0x11);
+}
+
+static void R8_read_line(const struct vkms_plane_state *plane, int x_start,
+			 int y_start, enum pixel_read_direction direction, int count,
+			 struct pixel_argb_u16 out_pixel[])
+{
+	struct pixel_argb_u16 *end = out_pixel + count;
+	u8 *src_pixels;
+	int rem_x, rem_y;
+	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
+
+	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
+
+	while (out_pixel < end) {
+		*out_pixel = argb_u16_from_gray8(*src_pixels);
+		src_pixels += step;
+		out_pixel += 1;
+	}
+}
+
 static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
 			       enum pixel_read_direction direction, int count,
 			       struct pixel_argb_u16 out_pixel[])
@@ -582,6 +670,14 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
 	case DRM_FORMAT_YVU422:
 	case DRM_FORMAT_YVU444:
 		return &planar_yuv_read_line;
+	case DRM_FORMAT_R1:
+		return &R1_read_line;
+	case DRM_FORMAT_R2:
+		return &R2_read_line;
+	case DRM_FORMAT_R4:
+		return &R4_read_line;
+	case DRM_FORMAT_R8:
+		return &R8_read_line;
 	default:
 		/*
 		 * This is a bug in vkms_plane_atomic_check. All the supported
@@ -855,6 +951,10 @@ get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
 	case DRM_FORMAT_ARGB16161616:
 	case DRM_FORMAT_XRGB16161616:
 	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_R1:
+	case DRM_FORMAT_R2:
+	case DRM_FORMAT_R4:
+	case DRM_FORMAT_R8:
 		/*
 		 * Those formats are supported, but they don't need a conversion matrix. Return
 		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index e21cc92cf497..dc9d62acf350 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -29,7 +29,11 @@ static const u32 vkms_formats[] = {
 	DRM_FORMAT_YUV444,
 	DRM_FORMAT_YVU420,
 	DRM_FORMAT_YVU422,
-	DRM_FORMAT_YVU444
+	DRM_FORMAT_YVU444,
+	DRM_FORMAT_R1,
+	DRM_FORMAT_R2,
+	DRM_FORMAT_R4,
+	DRM_FORMAT_R8
 };
 
 static struct drm_plane_state *

-- 
2.43.0
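
As a rough illustration of the sub-byte unpacking done by Rx_read_line() in
the patch above: each source byte holds 8 / bit_per_pixel pixels, a mask
isolates one pixel, and multiplying by lum_per_level (0xFF, 0x55 or 0x11)
stretches the value to the full 8-bit range. The helper below,
rx_unpack_byte, is a hypothetical userspace sketch. It mirrors the patch's
loop in reading from the least significant bits first, does not check whether
that matches the bit order documented in drm_fourcc.h, and leaves out the
stride and read-direction handling.

```c
#include <stdint.h>

/*
 * Unpack every pixel of one source byte into 8-bit gray values, starting
 * from the least significant bits as the loop in the patch does.
 * bits_per_pixel must be 1, 2 or 4. Returns the number of pixels written.
 */
static int rx_unpack_byte(uint8_t src, unsigned int bits_per_pixel, uint8_t out[8])
{
	unsigned int mask = (1u << bits_per_pixel) - 1;
	/* 0xff for R1, 0x55 for R2, 0x11 for R4: maps the highest level to 0xff. */
	unsigned int lum_per_level = 0xffu / mask;
	int n = 0;

	for (unsigned int bit_offset = 0; bit_offset < 8; bit_offset += bits_per_pixel) {
		unsigned int val = (src >> bit_offset) & mask;

		out[n++] = (uint8_t)(val * lum_per_level);
	}
	return n;
}
```

For an R2 byte of 0xb4 (binary 10 11 01 00) this yields four gray levels,
taken from the low bit pair upwards: 0x00, 0x55, 0xff, 0xaa.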



* Re: [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions
  2024-03-13 17:44 ` [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions Louis Chauvet
@ 2024-03-13 19:02   ` Randy Dunlap
  2024-03-25 13:32   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Randy Dunlap @ 2024-03-13 19:02 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Maíra Canal,
	Haneen Mohammed, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Hi,

On 3/13/24 10:44, Louis Chauvet wrote:
> Add some documentation on pixel conversion functions.
> Update of outdated comments for pixel_write functions.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_composer.c |  7 ++++
>  drivers/gpu/drm/vkms/vkms_drv.h      | 13 ++++++++
>  drivers/gpu/drm/vkms/vkms_formats.c  | 62 ++++++++++++++++++++++++++++++------
>  3 files changed, 73 insertions(+), 9 deletions(-)
> 

> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 172830a3936a..6e3dc8682ff9 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c


> @@ -216,6 +238,14 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>  	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
>  }
>  
> +/**

This comment is not in kernel-doc format, so either use "/*" to begin the comment
or add the function name in the first comment line, like:


+ * vkms_writeback_row - Generic loop for all supported writeback format. It is executed just after the blending to

> + * Generic loop for all supported writeback format. It is executed just after the blending to
> + * write a line in the writeback buffer.
> + *
> + * @wb: Job where to insert the final image
> + * @src_buffer: Line to write
> + * @y: Row to write in the writeback buffer
> + */
>  void vkms_writeback_row(struct vkms_writeback_job *wb,
>  			const struct line_buffer *src_buffer, int y)
>  {
> @@ -229,6 +259,13 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  		wb->pixel_write(dst_pixels, &in_pixels[x]);
>  }
>  
> +/**

Needs function name or don't use "/**" to begin the comment.

> + * Retrieve the correct read_pixel function for a specific format.
> + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> + * pointer is valid before using it in a vkms_plane_state.
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + */
>  void *get_pixel_conversion_function(u32 format)
>  {
>  	switch (format) {
> @@ -247,6 +284,13 @@ void *get_pixel_conversion_function(u32 format)
>  	}
>  }
>  
> +/**

Same here.

> + * Retrieve the correct write_pixel function for a specific format.
> + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> + * pointer is valid before using it in a vkms_writeback_job.
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + */
>  void *get_pixel_write_function(u32 format)
>  {
>  	switch (format) {
> 

thanks.
-- 
#Randy
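
For reference, a kernel-doc comment names the documented symbol on its first
line, lists the parameters next, and only then gives the longer description.
A minimal sketch of that layout follows; clamp_row is a hypothetical helper
used purely for illustration and is not part of VKMS.

```c
/**
 * clamp_row() - Clamp a row index to the height of a buffer
 * @y: Row index requested by the caller
 * @height: Number of rows in the buffer, must be greater than zero
 *
 * The double-star opener tells the kernel-doc tooling to parse this comment,
 * so the first line must carry the function name. A comment that is not
 * meant for kernel-doc should use the plain single-star opener instead.
 *
 * Return: @y limited to the range [0, @height - 1].
 */
static int clamp_row(int y, int height)
{
	if (y < 0)
		return 0;
	if (y >= height)
		return height - 1;
	return y;
}
```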


* Re: [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
@ 2024-03-13 19:08   ` Randy Dunlap
  2024-03-25 12:05   ` Pekka Paalanen
  2024-03-25 13:59   ` Maíra Canal
  2 siblings, 0 replies; 75+ messages in thread
From: Randy Dunlap @ 2024-03-13 19:08 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Maíra Canal,
	Haneen Mohammed, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee



On 3/13/24 10:44, Louis Chauvet wrote:
> Introduce two callbacks which does nothing. They are used in replacement
> of NULL and it avoid kernel OOPS if this NULL is called.
> 
> If those callback are used, it means that there is a mismatch between
> what formats are announced by atomic_check and what is realy supported by
> atomic_update.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 55a4365d21a4..b57d85b8b935 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c


> @@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  
>  /**

Please make this comment conform to kernel-doc format or don't use "/**" to
begin the comment.

>   * Retrieve the correct read_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_plane_state.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> + * function is returned.
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> @@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
>  		 * format must:
>  		 * - Be listed in vkms_formats in vkms_plane.c
>  		 * - Have a pixel_read callback defined here
> +		 *
> +		 * To avoid kernel crash, a dummy "always read black" function is used. It means
> +		 * that during the composition, this plane will always be black.
>  		 */
>  		WARN(true,
>  		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
>  		     &format);
> -		return (pixel_read_t)NULL;
> +		return &black_to_argb_u16;
>  	}
>  }
>  
>  /**

Same here.

>   * Retrieve the correct write_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_writeback_job.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> + * function is returned.
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */

thanks.
-- 
#Randy
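
The fallback described in the quoted patch can be pictured with a small
userspace sketch: the lookup never returns NULL, so a caller that skips the
check reads black instead of calling through a null function pointer. All
names here (read_fn_t, get_read_fn, read_gray8, read_black and the FMT_*
constants) are hypothetical and simplified; the quoted patch additionally
emits a WARN in the default case.

```c
#include <stdint.h>

/* Hypothetical reader signature: one source byte in, one 16-bit level out. */
typedef uint16_t (*read_fn_t)(const uint8_t *src);

/* Hypothetical format identifiers, unrelated to the DRM fourcc codes. */
enum { FMT_GRAY8 = 1, FMT_UNKNOWN = 99 };

static uint16_t read_gray8(const uint8_t *src)
{
	return (uint16_t)(*src * 257);
}

/* Dummy reader: ignores the source and always yields black. */
static uint16_t read_black(const uint8_t *src)
{
	(void)src;
	return 0;
}

static read_fn_t get_read_fn(int format)
{
	switch (format) {
	case FMT_GRAY8:
		return read_gray8;
	default:
		/* Unsupported format: hand back a safe dummy instead of NULL. */
		return read_black;
	}
}
```

The trade-off is that an unsupported format no longer crashes but shows up as
a black plane, which is why a loud warning at lookup time still matters.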


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
@ 2024-03-13 19:20   ` Randy Dunlap
  2024-03-14 14:41     ` Louis Chauvet
  2024-03-25 14:26   ` Maíra Canal
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 75+ messages in thread
From: Randy Dunlap @ 2024-03-13 19:20 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Maíra Canal,
	Haneen Mohammed, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Hi,

On 3/13/24 10:45, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 

> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>  4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

This should be

+ * define CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

to conform to kernel-doc format.

> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +

> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c

> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> + * diagonal matrix, which act as a "no-op".
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {
> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrixies were generated using the colour python framework

	         matrices

> +	 *
> +	 * Below are the function calls used to generate eac matrix, go to

	                                                 each

> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
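
To make the scaling in the comment above concrete: the coefficients are
multiplied by 2**32, so 4294967296 on the diagonal of no_operation stands for
1.0. Below is a rough userspace sketch of how such a matrix could be applied.
The struct, the function names and the rounding choice are assumptions made
for illustration, not the code from the patch.

```c
#include <stdint.h>

/* Assumed: coefficients carry 32 fractional bits (value * 2**32). */
#define EXAMPLE_FRAC_BITS 32

/* Hypothetical stand-in for a 3x3 fixed-point conversion matrix. */
struct example_matrix {
	int64_t m[3][3];
};

/* Drop the fractional bits, rounding to nearest (half away from zero). */
static int64_t example_round_fixed(int64_t v)
{
	const int64_t one = (int64_t)1 << EXAMPLE_FRAC_BITS;
	const int64_t half = one / 2;

	return v >= 0 ? (v + half) / one : -((-v + half) / one);
}

/* Multiply a 3-channel pixel by the matrix and convert back to integers. */
static void example_apply_matrix(const struct example_matrix *mat,
				 const int64_t in[3], int64_t out[3])
{
	for (int row = 0; row < 3; row++) {
		int64_t acc = 0;

		for (int col = 0; col < 3; col++)
			acc += mat->m[row][col] * in[col];

		out[row] = example_round_fixed(acc);
	}
}
```

With 16-bit channel values the accumulator stays well inside the s64 range,
which is one reason a 2**32 scale is workable here.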

> +
>  /**

Please convert this comment to kernel-doc format or just use "/*" to begin
the comment.

>   * Retrieve the correct write_pixel function for a specific format.
>   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"

> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c


thanks.
-- 
#Randy

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 19:20   ` Randy Dunlap
@ 2024-03-14 14:41     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-14 14:41 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen, dri-devel, linux-kernel, jeremie.dautheribes,
	miquel.raynal, thomas.petazzoni, seanpaul, marcheu,
	nicolejadeyee

Le 13/03/24 - 12:20, Randy Dunlap a écrit :
> Hi,
> 
> On 3/13/24 10:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> 
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >  4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> This should be
> 
> + * define CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> to conform to kernel-doc format.
> 
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +

Hi Randy,

Thanks for your feedback.

I missed it while squashing Arthur's work, but this constant is no longer 
needed; it will be removed in v6.

I will also fix the other kernel-doc formatting issues (PATCH v5 03/16 and 
PATCH v5 05/16) in v6.

Kind regards,
Louis Chauvet

> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> 
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +
> > +	/*
> > +	 * Those matrixies were generated using the colour python framework
> 
> 	         matrices
> 
> > +	 *
> > +	 * Below are the function calls used to generate eac matrix, go to
> 
> 	                                                 each
> 
> > +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> > +	 * for more info:
> > +	 *
> > +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                                  is_legal = False,
> > +	 *                                  bits = 8) * 2**32).astype(int)
> > +	 */
> 
> > +
> >  /**
> 
> Please convert this comment to kernel-doc format or just use "/*" to begin
> the comment.
> 
> >   * Retrieve the correct write_pixel function for a specific format.
> >   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> 
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> 
> 
> thanks.
> -- 
> #Randy

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 01/16] drm/vkms: Code formatting
  2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
@ 2024-03-25 12:03   ` Pekka Paalanen
  2024-03-25 13:13   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:03 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

[-- Attachment #1: Type: text/plain, Size: 5212 bytes --]

On Wed, 13 Mar 2024 18:44:55 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> A few no-op changes to remove double spaces and fix wrong alignments.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com>

Thanks,
pq


> ---
>  drivers/gpu/drm/vkms/vkms_composer.c | 10 +++++-----
>  drivers/gpu/drm/vkms/vkms_crtc.c     |  6 ++----
>  drivers/gpu/drm/vkms/vkms_drv.c      |  3 +--
>  drivers/gpu/drm/vkms/vkms_plane.c    |  8 ++++----
>  4 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index e7441b227b3c..c6d9b4a65809 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -96,7 +96,7 @@ static u16 lerp_u16(u16 a, u16 b, s64 t)
>  	s64 a_fp = drm_int2fixp(a);
>  	s64 b_fp = drm_int2fixp(b);
>  
> -	s64 delta = drm_fixp_mul(b_fp - a_fp,  t);
> +	s64 delta = drm_fixp_mul(b_fp - a_fp, t);
>  
>  	return drm_fixp2int(a_fp + delta);
>  }
> @@ -302,8 +302,8 @@ static int compose_active_planes(struct vkms_writeback_job *active_wb,
>  void vkms_composer_worker(struct work_struct *work)
>  {
>  	struct vkms_crtc_state *crtc_state = container_of(work,
> -						struct vkms_crtc_state,
> -						composer_work);
> +							  struct vkms_crtc_state,
> +							  composer_work);
>  	struct drm_crtc *crtc = crtc_state->base.crtc;
>  	struct vkms_writeback_job *active_wb = crtc_state->active_writeback;
>  	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> @@ -328,7 +328,7 @@ void vkms_composer_worker(struct work_struct *work)
>  		crtc_state->gamma_lut.base = (struct drm_color_lut *)crtc->state->gamma_lut->data;
>  		crtc_state->gamma_lut.lut_length =
>  			crtc->state->gamma_lut->length / sizeof(struct drm_color_lut);
> -		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length  - 1);
> +		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length - 1);
>  		crtc_state->gamma_lut.channel_value2index_ratio = drm_fixp_div(max_lut_index_fp,
>  									       u16_max_fp);
>  
> @@ -367,7 +367,7 @@ void vkms_composer_worker(struct work_struct *work)
>  		drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
>  }
>  
> -static const char * const pipe_crc_sources[] = {"auto"};
> +static const char *const pipe_crc_sources[] = { "auto" };
>  
>  const char *const *vkms_get_crc_sources(struct drm_crtc *crtc,
>  					size_t *count)
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 61e500b8c9da..7586ae2e1dd3 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -191,8 +191,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
>  		return ret;
>  
>  	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
> -		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
> -								  plane);
> +		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
>  		WARN_ON(!plane_state);
>  
>  		if (!plane_state->visible)
> @@ -208,8 +207,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
>  
>  	i = 0;
>  	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
> -		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
> -								  plane);
> +		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
>  
>  		if (!plane_state->visible)
>  			continue;
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> index dd0af086e7fa..83e6c9b9ff46 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.c
> +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> @@ -81,8 +81,7 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
>  	drm_atomic_helper_wait_for_flip_done(dev, old_state);
>  
>  	for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> -		struct vkms_crtc_state *vkms_state =
> -			to_vkms_crtc_state(old_crtc_state);
> +		struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(old_crtc_state);
>  
>  		flush_work(&vkms_state->composer_work);
>  	}
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index e5c625ab8e3e..5a8d295e65f2 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -117,10 +117,10 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
>  	drm_framebuffer_get(frame_info->fb);
>  	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> -						     DRM_MODE_ROTATE_90 |
> -						     DRM_MODE_ROTATE_270 |
> -						     DRM_MODE_REFLECT_X |
> -						     DRM_MODE_REFLECT_Y);
> +									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_270 |
> +									  DRM_MODE_REFLECT_X |
> +									  DRM_MODE_REFLECT_Y);
>  
>  	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>  			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> 



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 02/16] drm/vkms: Use drm_frame directly
  2024-03-13 17:44 ` [PATCH v5 02/16] drm/vkms: Use drm_frame directly Louis Chauvet
@ 2024-03-25 12:04   ` Pekka Paalanen
  2024-03-25 13:20   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:04 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

[-- Attachment #1: Type: text/plain, Size: 4486 bytes --]

On Wed, 13 Mar 2024 18:44:56 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Remove intermediary variables and access the variables directly from
> drm_frame. These changes should be a no-op.
> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---

Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>

Thanks,
pq



>  drivers/gpu/drm/vkms/vkms_drv.h       |  3 ---
>  drivers/gpu/drm/vkms/vkms_formats.c   | 12 +++++++-----
>  drivers/gpu/drm/vkms/vkms_plane.c     |  3 ---
>  drivers/gpu/drm/vkms/vkms_writeback.c |  5 -----
>  4 files changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 8f5710debb1e..b4b357447292 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -31,9 +31,6 @@ struct vkms_frame_info {
>  	struct drm_rect rotated;
>  	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
>  	unsigned int rotation;
> -	unsigned int offset;
> -	unsigned int pitch;
> -	unsigned int cpp;
>  };
>  
>  struct pixel_argb_u16 {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 36046b12f296..172830a3936a 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -11,8 +11,10 @@
>  
>  static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
>  {
> -	return frame_info->offset + (y * frame_info->pitch)
> -				  + (x * frame_info->cpp);
> +	struct drm_framebuffer *fb = frame_info->fb;
> +
> +	return fb->offsets[0] + (y * fb->pitches[0])
> +			      + (x * fb->format->cpp[0]);
>  }
>  
>  /*
> @@ -131,12 +133,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>  	u8 *src_pixels = get_packed_src_addr(frame_info, y);
>  	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
>  
> -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->cpp) {
> +	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
>  		int x_pos = get_x_position(frame_info, limit, x);
>  
>  		if (drm_rotation_90_or_270(frame_info->rotation))
>  			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> -				+ frame_info->cpp * y;
> +				+ frame_info->fb->format->cpp[0] * y;
>  
>  		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
>  	}
> @@ -223,7 +225,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
>  	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
>  
> -	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->cpp)
> +	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->fb->format->cpp[0])
>  		wb->pixel_write(dst_pixels, &in_pixels[x]);
>  }
>  
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 5a8d295e65f2..21b5adfb44aa 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -125,9 +125,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>  			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>  
> -	frame_info->offset = fb->offsets[0];
> -	frame_info->pitch = fb->pitches[0];
> -	frame_info->cpp = fb->format->cpp[0];
>  	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
>  }
>  
> diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c b/drivers/gpu/drm/vkms/vkms_writeback.c
> index bc724cbd5e3a..c8582df1f739 100644
> --- a/drivers/gpu/drm/vkms/vkms_writeback.c
> +++ b/drivers/gpu/drm/vkms/vkms_writeback.c
> @@ -149,11 +149,6 @@ static void vkms_wb_atomic_commit(struct drm_connector *conn,
>  	crtc_state->active_writeback = active_wb;
>  	crtc_state->wb_pending = true;
>  	spin_unlock_irq(&output->composer_lock);
> -
> -	wb_frame_info->offset = fb->offsets[0];
> -	wb_frame_info->pitch = fb->pitches[0];
> -	wb_frame_info->cpp = fb->format->cpp[0];
> -
>  	drm_writeback_queue_job(wb_conn, connector_state);
>  	active_wb->pixel_write = get_pixel_write_function(wb_format);
>  	drm_rect_init(&wb_frame_info->src, 0, 0, crtc_width, crtc_height);
> 
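
As a side note, the address computation that pixel_offset() performs can be
sketched in userspace as below. The struct example_fb is a hypothetical
stand-in for the three framebuffer fields involved (offsets[0], pitches[0]
and format->cpp[0]); it is not a DRM type.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few framebuffer fields used here. */
struct example_fb {
	unsigned int offset; /* byte offset of the first plane */
	unsigned int pitch;  /* bytes per row, padding included */
	unsigned int cpp;    /* bytes per pixel */
};

/* Byte offset of pixel (x, y) from the start of the buffer mapping. */
static size_t example_pixel_offset(const struct example_fb *fb, int x, int y)
{
	return fb->offset + ((size_t)y * fb->pitch) + ((size_t)x * fb->cpp);
}
```

Reading the values from a single place, as the patch does, removes the risk
of the cached copies going stale.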



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-13 17:44 ` [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions Louis Chauvet
@ 2024-03-25 12:04   ` Pekka Paalanen
  2024-03-26 15:56     ` Louis Chauvet
  2024-03-25 13:56   ` Maíra Canal
  1 sibling, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:04 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

[-- Attachment #1: Type: text/plain, Size: 13489 bytes --]

On Wed, 13 Mar 2024 18:44:58 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Introduce two typedefs: pixel_read_t and pixel_write_t. They allow the
> compiler to check that the passed functions take the correct arguments.
> Such typedefs will help ensure consistency across the code base if
> these prototypes are updated.
> 
> Rename input/output variable in a consistent way between read_line and
> write_line.
> 
> A warn has been added in get_pixel_*_function to alert when an unsupported
> pixel format is requested. As those formats are checked before
> atomic_update callbacks, it should never append.

s/append/happen/


Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com>

Thanks,
pq
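
To illustrate the point about compiler checking, here is a small userspace
sketch. Every example_* name is hypothetical and the types only mimic the
ones in the patch. Because the getter returns a typed function pointer rather
than void *, returning a callback with the wrong prototype gets diagnosed at
build time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pixel_argb_u16. */
struct example_pixel_u16 {
	uint16_t a, r, g, b;
};

/* Typed callback: the compiler checks every function assigned to it. */
typedef void (*example_pixel_read_t)(const uint8_t *in_pixel,
				     struct example_pixel_u16 *out_pixel);

static void example_xrgb8888_read(const uint8_t *in_pixel,
				  struct example_pixel_u16 *out_pixel)
{
	out_pixel->a = 0xffff;
	out_pixel->r = (uint16_t)(in_pixel[2] * 257);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257);
}

enum example_format {
	EXAMPLE_FORMAT_XRGB8888,
	EXAMPLE_FORMAT_UNKNOWN,
};

/* Returning a mismatched function here is flagged by the compiler. */
static example_pixel_read_t example_get_read_function(enum example_format format)
{
	switch (format) {
	case EXAMPLE_FORMAT_XRGB8888:
		return example_xrgb8888_read;
	default:
		return NULL;
	}
}
```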

> 
> Add documentation for those typedefs.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
>  drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
>  drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
>  drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
>  4 files changed, 95 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 18086423a3a7..4bfc62d26f08 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -53,12 +53,31 @@ struct line_buffer {
>  	struct pixel_argb_u16 *pixels;
>  };
>  
> +/**
> + * typedef pixel_write_t - These functions are used to read a pixel from a
> + * `struct pixel_argb_u16*`, convert it to a specific format and write it to the @out_pixel
> + * buffer.
> + *
> + * @out_pixel: destination address to write the pixel
> + * @in_pixel: pixel to write
> + */
> +typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> +
>  struct vkms_writeback_job {
>  	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
>  	struct vkms_frame_info wb_frame_info;
> -	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
> +	pixel_write_t pixel_write;
>  };
>  
> +/**
> + * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> + * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> + *
> + * @in_pixel: Pointer to the pixel to read
> + * @out_pixel: Pointer to write the converted pixel
> + */
> +typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +
>  /**
>   * vkms_plane_state - Driver specific plane state
>   * @base: base plane state
> @@ -69,7 +88,7 @@ struct vkms_writeback_job {
>  struct vkms_plane_state {
>  	struct drm_shadow_plane_state base;
>  	struct vkms_frame_info *frame_info;
> -	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
> +	pixel_read_t pixel_read;
>  };
>  
>  struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 6e3dc8682ff9..55a4365d21a4 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>   * They are used in the `vkms_compose_row` function to handle multiple formats.
>   */
>  
> -static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	/*
>  	 * The 257 is the "conversion ratio". This number is obtained by the
> @@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
>  	 * the best color value in a pixel format with more possibilities.
>  	 * A similar idea applies to others RGB color conversions.
>  	 */
> -	out_pixel->a = (u16)src_pixels[3] * 257;
> -	out_pixel->r = (u16)src_pixels[2] * 257;
> -	out_pixel->g = (u16)src_pixels[1] * 257;
> -	out_pixel->b = (u16)src_pixels[0] * 257;
> +	out_pixel->a = (u16)in_pixel[3] * 257;
> +	out_pixel->r = (u16)in_pixel[2] * 257;
> +	out_pixel->g = (u16)in_pixel[1] * 257;
> +	out_pixel->b = (u16)in_pixel[0] * 257;
>  }
>  
> -static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = (u16)src_pixels[2] * 257;
> -	out_pixel->g = (u16)src_pixels[1] * 257;
> -	out_pixel->b = (u16)src_pixels[0] * 257;
> +	out_pixel->r = (u16)in_pixel[2] * 257;
> +	out_pixel->g = (u16)in_pixel[1] * 257;
> +	out_pixel->b = (u16)in_pixel[0] * 257;
>  }
>  
> -static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>  
> -	out_pixel->a = le16_to_cpu(pixels[3]);
> -	out_pixel->r = le16_to_cpu(pixels[2]);
> -	out_pixel->g = le16_to_cpu(pixels[1]);
> -	out_pixel->b = le16_to_cpu(pixels[0]);
> +	out_pixel->a = le16_to_cpu(pixel[3]);
> +	out_pixel->r = le16_to_cpu(pixel[2]);
> +	out_pixel->g = le16_to_cpu(pixel[1]);
> +	out_pixel->b = le16_to_cpu(pixel[0]);
>  }
>  
> -static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>  
>  	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = le16_to_cpu(pixels[2]);
> -	out_pixel->g = le16_to_cpu(pixels[1]);
> -	out_pixel->b = le16_to_cpu(pixels[0]);
> +	out_pixel->r = le16_to_cpu(pixel[2]);
> +	out_pixel->g = le16_to_cpu(pixel[1]);
> +	out_pixel->b = le16_to_cpu(pixel[0]);
>  }
>  
> -static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>  
>  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
>  
> -	u16 rgb_565 = le16_to_cpu(*pixels);
> +	u16 rgb_565 = le16_to_cpu(*pixel);
>  	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
>  	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
>  	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> @@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>  
>  /*
>   * The following functions take one argb_u16 pixel and convert it to a specific format. The
> - * result is stored in @dst_pixels.
> + * result is stored in @out_pixel.
>   *
>   * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
>   * the writeback buffer.
>   */
> -static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  {
>  	/*
>  	 * This sequence below is important because the format's byte order is
> @@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
>  	 * | Addr + 2 | = Red channel
>  	 * | Addr + 3 | = Alpha channel
>  	 */
> -	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> +	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>  }
>  
> -static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  {
> -	dst_pixels[3] = 0xff;
> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> +	out_pixel[3] = 0xff;
> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>  }
>  
> -static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>  
> -	pixels[3] = cpu_to_le16(in_pixel->a);
> -	pixels[2] = cpu_to_le16(in_pixel->r);
> -	pixels[1] = cpu_to_le16(in_pixel->g);
> -	pixels[0] = cpu_to_le16(in_pixel->b);
> +	pixel[3] = cpu_to_le16(in_pixel->a);
> +	pixel[2] = cpu_to_le16(in_pixel->r);
> +	pixel[1] = cpu_to_le16(in_pixel->g);
> +	pixel[0] = cpu_to_le16(in_pixel->b);
>  }
>  
> -static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>  
> -	pixels[3] = 0xffff;
> -	pixels[2] = cpu_to_le16(in_pixel->r);
> -	pixels[1] = cpu_to_le16(in_pixel->g);
> -	pixels[0] = cpu_to_le16(in_pixel->b);
> +	pixel[3] = 0xffff;
> +	pixel[2] = cpu_to_le16(in_pixel->r);
> +	pixel[1] = cpu_to_le16(in_pixel->g);
> +	pixel[0] = cpu_to_le16(in_pixel->b);
>  }
>  
> -static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>  
>  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> @@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>  	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
>  	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
>  
> -	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
> +	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
>  }
>  
>  /**
> @@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> -void *get_pixel_conversion_function(u32 format)
> +pixel_read_t get_pixel_read_function(u32 format)
>  {
>  	switch (format) {
>  	case DRM_FORMAT_ARGB8888:
> @@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
>  	case DRM_FORMAT_RGB565:
>  		return &RGB565_to_argb_u16;
>  	default:
> -		return NULL;
> +		/*
> +		 * This is a bug in vkms_plane_atomic_check. All the supported
> +		 * format must:
> +		 * - Be listed in vkms_formats in vkms_plane.c
> +		 * - Have a pixel_read callback defined here
> +		 */
> +		WARN(true,
> +		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> +		     &format);
> +		return (pixel_read_t)NULL;
>  	}
>  }
>  
> @@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> -void *get_pixel_write_function(u32 format)
> +pixel_write_t get_pixel_write_function(u32 format)
>  {
>  	switch (format) {
>  	case DRM_FORMAT_ARGB8888:
> @@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
>  	case DRM_FORMAT_RGB565:
>  		return &argb_u16_to_RGB565;
>  	default:
> -		return NULL;
> +		/*
> +		 * This is a bug in vkms_writeback_atomic_check. All the supported
> +		 * format must:
> +		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> +		 * - Have a pixel_write callback defined here
> +		 */
> +		WARN(true,
> +		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> +		     &format);
> +		return (pixel_write_t)NULL;
>  	}
>  }
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index cf59c2ed8e9a..3ecea4563254 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -5,8 +5,8 @@
>  
>  #include "vkms_drv.h"
>  
> -void *get_pixel_conversion_function(u32 format);
> +pixel_read_t get_pixel_read_function(u32 format);
>  
> -void *get_pixel_write_function(u32 format);
> +pixel_write_t get_pixel_write_function(u32 format);
>  
>  #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 21b5adfb44aa..10e9b23dab28 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>  			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>  
> -	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
> +	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
>  }
>  
>  static int vkms_plane_atomic_check(struct drm_plane *plane,
> 



* Re: [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
  2024-03-13 19:08   ` Randy Dunlap
@ 2024-03-25 12:05   ` Pekka Paalanen
  2024-03-26 15:56     ` Louis Chauvet
  2024-03-25 13:59   ` Maíra Canal
  2 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:05 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:44:59 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Introduce two callbacks which do nothing. They are used in place of
> NULL, which avoids a kernel oops if the callback is ever called.
> 
> If those callbacks are used, it means that there is a mismatch between
> the formats announced by atomic_check and the formats really supported by
> atomic_update.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 55a4365d21a4..b57d85b8b935 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -136,6 +136,21 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
>  }
>  
> +/**
> + * black_to_argb_u16() - pixel_read callback which always read black
> + *
> + * This callback is used when an invalid format is requested for plane reading.
> + * It is used to avoid null pointer to be used as a function. In theory, this function should
> + * never be called, except if you found a bug in the driver/DRM core.
> + */
> +static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +{
> +	out_pixel->a = (u16)0xFFFF;
> +	out_pixel->r = 0;
> +	out_pixel->g = 0;
> +	out_pixel->b = 0;
> +}
> +
>  /**
>   * vkms_compose_row - compose a single row of a plane
>   * @stage_buffer: output line with the composed pixels
> @@ -238,6 +253,16 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
>  }
>  
> +/**
> + * argb_u16_to_nothing() - pixel_write callback with no effect
> + *
> + * This callback is used when an invalid format is requested for writeback.
> + * It is used to avoid null pointer to be used as a function. In theory, this should never
> + * happen, except if there is a bug in the driver
> + */
> +static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +{}
> +
>  /**
>   * Generic loop for all supported writeback format. It is executed just after the blending to
>   * write a line in the writeback buffer.
> @@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  
>  /**
>   * Retrieve the correct read_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_plane_state.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> + * function is returned.
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> @@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
>  		 * format must:
>  		 * - Be listed in vkms_formats in vkms_plane.c
>  		 * - Have a pixel_read callback defined here
> +		 *
> +		 * To avoid kernel crash, a dummy "always read black" function is used. It means
> +		 * that during the composition, this plane will always be black.
>  		 */
>  		WARN(true,
>  		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
>  		     &format);
> -		return (pixel_read_t)NULL;
> +		return &black_to_argb_u16;

Hi Louis,

I'm perhaps a bit paranoid in these things, but I'd make this not
black. Maybe something more "screaming" like magenta. There is a slight
chance that black might sometimes be expected, or not affect the
result. After all, blending something into black with pre-multiplied
alpha is equivalent to no-blending (a copy). The kernel warning is
good, the magenta is more like an assurance.
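
A minimal sketch of what such a fallback could look like, written as
standalone C so it compiles outside the kernel. The typedefs and
struct pixel_argb_u16 below are local stand-ins for the kernel
definitions, and magenta_to_argb_u16() is a hypothetical name, not
something from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel types so the sketch builds in userspace. */
typedef uint8_t u8;
typedef uint16_t u16;

struct pixel_argb_u16 {
	u16 a, r, g, b;
};

/*
 * Hypothetical replacement for black_to_argb_u16(): ignore the source pixel
 * and always produce opaque magenta (full red and blue, no green).
 */
static void magenta_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	(void)in_pixel; /* the source buffer is never read */

	out_pixel->a = 0xffff;
	out_pixel->r = 0xffff;
	out_pixel->g = 0;
	out_pixel->b = 0xffff;
}

/* Usage: whatever the input byte is, the result is the same loud color. */
static void magenta_example(void)
{
	u8 src = 0x42;
	struct pixel_argb_u16 px = { 0, 0, 0, 0 };

	magenta_to_argb_u16(&src, &px);
	assert(px.r == 0xffff && px.g == 0 && px.b == 0xffff);
}
```

Because the pixel is fully opaque, it replaces whatever is below it
during pre-multiplied blending, so the broken plane stays visible.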

Anyway,

Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>


Thanks,
pq


>  	}
>  }
>  
>  /**
>   * Retrieve the correct write_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_writeback_job.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> + * function is returned.
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> @@ -319,10 +347,13 @@ pixel_write_t get_pixel_write_function(u32 format)
>  		 * format must:
>  		 * - Be listed in vkms_wb_formats in vkms_writeback.c
>  		 * - Have a pixel_write callback defined here
> +		 *
> +		 * To avoid kernel crash, a dummy "don't do anything" function is used. It means
> +		 * that the resulting writeback buffer is not composed and can contain any values.
>  		 */
>  		WARN(true,
>  		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
>  		     &format);
> -		return (pixel_write_t)NULL;
> +		return &argb_u16_to_nothing;
>  	}
>  }
> 



* Re: [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions
  2024-03-13 17:45 ` [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions Louis Chauvet
@ 2024-03-25 12:05   ` Pekka Paalanen
  2024-03-25 14:00   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:05 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:45:00 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> As the pixel_read and pixel_write functions should never modify the input
> buffer, mark those pointers const.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com>


Thanks,
pq


> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  4 ++--
>  drivers/gpu/drm/vkms/vkms_formats.c | 24 ++++++++++++------------
>  2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 4bfc62d26f08..3ead8b39af4a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -61,7 +61,7 @@ struct line_buffer {
>   * @out_pixel: destination address to write the pixel
>   * @in_pixel: pixel to write
>   */
> -typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> +typedef void (*pixel_write_t)(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel);
>  
>  struct vkms_writeback_job {
>  	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
> @@ -76,7 +76,7 @@ struct vkms_writeback_job {
>   * @in_pixel: Pointer to the pixel to read
>   * @out_pixel: Pointer to write the converted pixel
>   */
> -typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
>  
>  /**
>   * vkms_plane_state - Driver specific plane state
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index b57d85b8b935..b2f8dfc26c35 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>   * They are used in the `vkms_compose_row` function to handle multiple formats.
>   */
>  
> -static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	/*
>  	 * The 257 is the "conversion ratio". This number is obtained by the
> @@ -90,7 +90,7 @@ static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  	out_pixel->b = (u16)in_pixel[0] * 257;
>  }
>  
> -static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	out_pixel->a = (u16)0xffff;
>  	out_pixel->r = (u16)in_pixel[2] * 257;
> @@ -98,7 +98,7 @@ static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  	out_pixel->b = (u16)in_pixel[0] * 257;
>  }
>  
> -static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	u16 *pixel = (u16 *)in_pixel;
>  
> @@ -108,7 +108,7 @@ static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
>  	out_pixel->b = le16_to_cpu(pixel[0]);
>  }
>  
> -static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	u16 *pixel = (u16 *)in_pixel;
>  
> @@ -118,7 +118,7 @@ static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
>  	out_pixel->b = le16_to_cpu(pixel[0]);
>  }
>  
> -static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	u16 *pixel = (u16 *)in_pixel;
>  
> @@ -143,7 +143,7 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   * It is used to avoid null pointer to be used as a function. In theory, this function should
>   * never be called, except if you found a bug in the driver/DRM core.
>   */
> -static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>  {
>  	out_pixel->a = (u16)0xFFFF;
>  	out_pixel->r = 0;
> @@ -189,7 +189,7 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>   * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
>   * the writeback buffer.
>   */
> -static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {
>  	/*
>  	 * This sequence below is important because the format's byte order is
> @@ -207,7 +207,7 @@ static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>  }
>  
> -static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {
>  	out_pixel[3] = 0xff;
>  	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> @@ -215,7 +215,7 @@ static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>  	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>  }
>  
> -static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {
>  	u16 *pixel = (u16 *)out_pixel;
>  
> @@ -225,7 +225,7 @@ static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
>  	pixel[0] = cpu_to_le16(in_pixel->b);
>  }
>  
> -static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {
>  	u16 *pixel = (u16 *)out_pixel;
>  
> @@ -235,7 +235,7 @@ static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
>  	pixel[0] = cpu_to_le16(in_pixel->b);
>  }
>  
> -static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_RGB565(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {
>  	u16 *pixel = (u16 *)out_pixel;
>  
> @@ -260,7 +260,7 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   * It is used to avoid null pointer to be used as a function. In theory, this should never
>   * happen, except if there is a bug in the driver
>   */
> -static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_nothing(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>  {}
>  
>  /**
> 



* Re: [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats.
  2024-03-13 17:45 ` [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats Louis Chauvet
@ 2024-03-25 12:40   ` Pekka Paalanen
  2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:40 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:45:01 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Introduce the usage of block_h/block_w to compute the offset and the
> pointer of a pixel. The previous implementation was specialized for
> planes with block_h == block_w == 1. Generalizing it avoids confusion and
> allows easier implementation of tiled formats. It also removes the usage
> of the deprecated format field `cpp`.
> 
> Introduce the plane_index parameter to get an offset/pointer on a
> different plane.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_formats.c | 76 +++++++++++++++++++++++++------------
>  1 file changed, 52 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index b2f8dfc26c35..649d75d05b1f 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -10,23 +10,43 @@
>  #include "vkms_formats.h"
>  
>  /**
> - * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
> + * packed_pixels_offset() - Get the offset of the block containing the pixel at coordinates x/y
>   *
>   * @frame_info: Buffer metadata
>   * @x: The x coordinate of the wanted pixel in the buffer
>   * @y: The y coordinate of the wanted pixel in the buffer
> + * @plane_index: The index of the plane to use
> + * @offset: The returned offset inside the buffer of the block
> + * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
>   *
> - * The caller must ensure that the framebuffer associated with this request uses a pixel format
> - * where block_h == block_w == 1.
> - * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
> - * outside of the buffer.
> + * As some pixel formats store multiple pixels in a block (DRM_FORMAT_R* for example), some
> + * pixels are not individually addressable. This function returns 3 values: the offset of the
> + * whole block, and the coordinates of the requested pixel inside this block.
> + * For example, if the format is DRM_FORMAT_R1 and the requested coordinate is 13,5, the offset
> + * will point to the byte 5*pitches + 13/8 (second byte of the 5th line), and the rem_x/rem_y
> + * coordinates will be (13 % 8, 5 % 1) = (5, 0)
> + *
> + * With this function, the caller just has to extract the correct pixel from the block.
>   */
> -static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> +static void packed_pixels_offset(const struct vkms_frame_info *frame_info, int x, int y,
> +				 int plane_index, int *offset, int *rem_x, int *rem_y)
>  {
>  	struct drm_framebuffer *fb = frame_info->fb;
> +	const struct drm_format_info *format = frame_info->fb->format;
> +	/* Directly using x and y to multiply pitches and format->cpp is not sufficient because
> +	 * in some formats a block can represent multiple pixels.
> +	 *
> +	 * Dividing x and y by the block size allows to extract the correct offset of the block
> +	 * containing the pixel.
> +	 */
>  
> -	return fb->offsets[0] + (y * fb->pitches[0])
> -			      + (x * fb->format->cpp[0]);
> +	int block_x = x / drm_format_info_block_width(format, plane_index);
> +	int block_y = y / drm_format_info_block_height(format, plane_index);
> +	*rem_x = x % drm_format_info_block_width(format, plane_index);
> +	*rem_y = x % drm_format_info_block_height(format, plane_index);

typo: x should be y
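
With rem_y derived from y, the block arithmetic can be checked in
isolation. The sketch below is standalone C; struct block_layout and
block_offset() are hypothetical stand-ins for the drm_framebuffer and
drm_format_info fields the patch reads. It reproduces the DRM_FORMAT_R1
example from the kernel-doc above, assuming a pitch of 4 bytes:

```c
#include <assert.h>

/* Hypothetical stand-in for the fields read from the FB and its format. */
struct block_layout {
	int fb_offset;      /* fb->offsets[plane_index] */
	int pitch;          /* fb->pitches[plane_index], bytes per row of blocks */
	int block_w;        /* drm_format_info_block_width() */
	int block_h;        /* drm_format_info_block_height() */
	int char_per_block; /* format->char_per_block[plane_index] */
};

/* Same arithmetic as packed_pixels_offset(), with rem_y derived from y. */
static void block_offset(const struct block_layout *l, int x, int y,
			 int *offset, int *rem_x, int *rem_y)
{
	int block_x = x / l->block_w;
	int block_y = y / l->block_h;

	*rem_x = x % l->block_w;
	*rem_y = y % l->block_h; /* y here, not x */
	*offset = l->fb_offset + block_y * l->pitch + block_x * l->char_per_block;
}

/* DRM_FORMAT_R1: blocks are 8 pixels wide, 1 pixel high, 1 byte each. */
static void block_offset_example(void)
{
	const struct block_layout r1 = { 0, 4, 8, 1, 1 };
	int offset, rem_x, rem_y;

	block_offset(&r1, 13, 5, &offset, &rem_x, &rem_y);
	assert(offset == 5 * 4 + 13 / 8); /* second byte of row 5 */
	assert(rem_x == 13 % 8 && rem_y == 0);
}
```

With block_h == 1 the typo is invisible, since both x % 1 and y % 1 are
zero. It would only show up with a format whose blocks are taller than
one pixel.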


> +	*offset = fb->offsets[plane_index] +
> +		  block_y * fb->pitches[plane_index] +
> +		  block_x * format->char_per_block[plane_index];
>  }

Ok, this function looks very much plausible for handling blocky
formats. Good.

>  
>  /**
> @@ -36,30 +56,35 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
>   * @frame_info: Buffer metadata
>   * @x: The x(width) coordinate inside the plane
>   * @y: The y(height) coordinate inside the plane
> + * @plane_index: The index of the plane
> + * @addr: The returned pointer
> + * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
>   *
> - * Takes the information stored in the frame_info, a pair of coordinates, and
> - * returns the address of the first color channel.
> - * This function assumes the channels are packed together, i.e. a color channel
> - * comes immediately after another in the memory. And therefore, this function
> - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> + * Takes the information stored in the frame_info, a pair of coordinates, and returns the address
> + * of the block containing this pixel and the pixel position inside this block.
>   *
> - * The caller must ensure that the framebuffer associated with this request uses a pixel format
> - * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
> + * See packed_pixels_offset() for details about rem_x/rem_y behavior.
>   */
> -static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
> -				int x, int y)
> +static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
> +			       int x, int y, int plane_index, u8 **addr, int *rem_x,
> +			       int *rem_y)
>  {
> -	size_t offset = pixel_offset(frame_info, x, y);
> +	int offset;
>  
> -	return (u8 *)frame_info->map[0].vaddr + offset;
> +	packed_pixels_offset(frame_info, x, y, plane_index, &offset, rem_x, rem_y);
> +	*addr = (u8 *)frame_info->map[0].vaddr + offset;
>  }
>  
> -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y)
> +static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> +				 int plane_index)
>  {
>  	int x_src = frame_info->src.x1 >> 16;
>  	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> +	u8 *addr;
> +	int rem_x, rem_y;
>  
> -	return packed_pixels_addr(frame_info, x_src, y_src);
> +	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);

How can the caller be not interested in rem_x, rem_y?

Maybe there is no IGT test that uses DRM_FORMAT_R1 FB on a plane and
has a source rectangle whose x is not divisible by 8 pixels?
Or maybe the FB is filled with a solid color instead of a pattern that
would show source rectangle positioning problems?

Maybe at this point of the series, this should assert that rem_x and
rem_y are zero? That's what vkms_compose_row() assumes, right?
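
A sketch of such a check, as standalone C. pixel_is_block_aligned() is a
hypothetical helper, and block_w/block_h stand for the values that
drm_format_info_block_width()/drm_format_info_block_height() would
return. In the driver the result would more likely feed a WARN_ON() than
a userspace assert(). It assumes non-negative coordinates:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when pixel (x, y) is the first pixel of its
 * block, i.e. the rem_x/rem_y returned by packed_pixels_addr() would both
 * be zero. block_w and block_h are the block dimensions in pixels.
 */
static bool pixel_is_block_aligned(int x, int y, int block_w, int block_h)
{
	return (x % block_w) == 0 && (y % block_h) == 0;
}

static void block_aligned_example(void)
{
	/* 1x1 blocks: every pixel is addressable, the check always passes. */
	assert(pixel_is_block_aligned(13, 5, 1, 1));
	/* DRM_FORMAT_R1 (8x1 blocks): x = 16 starts a byte, x = 13 does not. */
	assert(pixel_is_block_aligned(16, 5, 8, 1));
	assert(!pixel_is_block_aligned(13, 5, 8, 1));
}
```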


Thanks,
pq

> +	return addr;
>  }
>  
>  static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> @@ -168,14 +193,14 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>  {
>  	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
>  	struct vkms_frame_info *frame_info = plane->frame_info;
> -	u8 *src_pixels = get_packed_src_addr(frame_info, y);
> +	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
>  	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
>  
>  	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
>  		int x_pos = get_x_position(frame_info, limit, x);
>  
>  		if (drm_rotation_90_or_270(frame_info->rotation))
> -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> +			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
>  				+ frame_info->fb->format->cpp[0] * y;
>  
>  		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> @@ -276,7 +301,10 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  {
>  	struct vkms_frame_info *frame_info = &wb->wb_frame_info;
>  	int x_dst = frame_info->dst.x1;
> -	u8 *dst_pixels = packed_pixels_addr(frame_info, x_dst, y);
> +	u8 *dst_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(frame_info, x_dst, y, 0, &dst_pixels, &rem_x, &rem_y);
>  	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
>  	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
>  
> 



* Re: [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
  2024-03-13 17:45 ` [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend Louis Chauvet
@ 2024-03-25 12:41   ` Pekka Paalanen
  2024-03-26 15:57     ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 12:41 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:45:02 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> The pre_mul_alpha_blend is dedicated to blending, so to avoid mixing
> different concepts (coordinate calculation and color management), extract
> the x_limit and x_dst computation outside of this helper.
> It also increases the maintainability by grouping the computation related
> to coordinates in the same place: the loop in `blend`.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_composer.c | 40 +++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index da0651a94c9b..9254086f23ff 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -24,34 +24,30 @@ static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
>  
>  /**
>   * pre_mul_alpha_blend - alpha blending equation
> - * @frame_info: Source framebuffer's metadata
>   * @stage_buffer: The line with the pixels from src_plane
>   * @output_buffer: A line buffer that receives all the blends output
> + * @x_start: The start offset to avoid useless copy

I'd say just:

+ * @x_start: The start offset

It describes the parameter, and the paragraph below explains the why.

It would be worth explaining that x_start applies to output_buffer, while
the input (stage_buffer) is always read starting from 0.

> + * @count: The number of byte to copy

You named it pixel_count, and it counts pixels, not bytes. It's not a
copy but a blend into output_buffer.
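
To make the intended semantics concrete, here is a standalone C sketch
of a line blend where the stage line is read from index 0 and the output
line is written from x_start, for pixel_count pixels. The names
blend_channel() and blend_line() are hypothetical, the struct is a local
stand-in, and the channel formula is the usual pre-multiplied "source
over" equation rather than a copy of the driver code:

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;

struct pixel_argb_u16 {
	u16 a, r, g, b;
};

/* Pre-multiplied "source over": src + dst * (1 - alpha), rounded to nearest. */
static u16 blend_channel(u16 src, u16 dst, u16 alpha)
{
	uint64_t v = (uint64_t)src * 0xffff + (uint64_t)dst * (0xffffu - alpha);

	v = (v + 0x7fff) / 0xffff;
	return v > 0xffff ? 0xffff : (u16)v;
}

/*
 * Blend pixel_count pixels (not bytes). The stage line is always read from
 * index 0; the output line is written starting at x_start.
 */
static void blend_line(const struct pixel_argb_u16 *stage,
		       struct pixel_argb_u16 *output,
		       int x_start, int pixel_count)
{
	struct pixel_argb_u16 *out = &output[x_start];

	for (int i = 0; i < pixel_count; i++) {
		out[i].r = blend_channel(stage[i].r, out[i].r, stage[i].a);
		out[i].g = blend_channel(stage[i].g, out[i].g, stage[i].a);
		out[i].b = blend_channel(stage[i].b, out[i].b, stage[i].a);
		out[i].a = 0xffff; /* the background is assumed opaque */
	}
}

static void blend_line_example(void)
{
	/* Stage line: one opaque red pixel, one fully transparent pixel. */
	const struct pixel_argb_u16 stage[2] = {
		{ 0xffff, 0xffff, 0, 0 },
		{ 0, 0, 0, 0 },
	};
	struct pixel_argb_u16 output[4] = {
		{ 0xffff, 0x1111, 0x2222, 0x3333 },
		{ 0xffff, 0x1111, 0x2222, 0x3333 },
		{ 0xffff, 0x1111, 0x2222, 0x3333 },
		{ 0xffff, 0x1111, 0x2222, 0x3333 },
	};

	blend_line(stage, output, 1, 2);

	assert(output[0].r == 0x1111);                     /* before x_start */
	assert(output[1].r == 0xffff && output[1].g == 0); /* opaque red wins */
	assert(output[2].r == 0x1111);                     /* transparent source */
	assert(output[3].r == 0x1111);                     /* past the count */
}
```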

>   *
> - * Using the information from the `frame_info`, this blends only the
> - * necessary pixels from the `stage_buffer` to the `output_buffer`
> - * using premultiplied blend formula.
> + * Using @x_start and @count information, only a few pixels can be blended instead of the whole
> + * line each time.
>   *
>   * The current DRM assumption is that pixel color values have been already
>   * pre-multiplied with the alpha channel values. See more
>   * drm_plane_create_blend_mode_property(). Also, this formula assumes a
>   * completely opaque background.
>   */
> -static void pre_mul_alpha_blend(struct vkms_frame_info *frame_info,
> -				struct line_buffer *stage_buffer,
> -				struct line_buffer *output_buffer)
> +static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> +				struct line_buffer *output_buffer, int x_start, int pixel_count)
>  {
> -	int x_dst = frame_info->dst.x1;
> -	struct pixel_argb_u16 *out = output_buffer->pixels + x_dst;
> -	struct pixel_argb_u16 *in = stage_buffer->pixels;
> -	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
> -			    stage_buffer->n_pixels);
> -
> -	for (int x = 0; x < x_limit; x++) {
> -		out[x].a = (u16)0xffff;
> -		out[x].r = pre_mul_blend_channel(in[x].r, out[x].r, in[x].a);
> -		out[x].g = pre_mul_blend_channel(in[x].g, out[x].g, in[x].a);
> -		out[x].b = pre_mul_blend_channel(in[x].b, out[x].b, in[x].a);
> +	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> +	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> +
> +	for (int i = 0; i < pixel_count; i++) {
> +		out[i].a = (u16)0xffff;
> +		out[i].r = pre_mul_blend_channel(in[i].r, out[i].r, in[i].a);
> +		out[i].g = pre_mul_blend_channel(in[i].g, out[i].g, in[i].a);
> +		out[i].b = pre_mul_blend_channel(in[i].b, out[i].b, in[i].a);
>  	}
>  }
>  
> @@ -183,7 +179,7 @@ static void blend(struct vkms_writeback_job *wb,
>  {
>  	struct vkms_plane_state **plane = crtc_state->active_planes;
>  	u32 n_active_planes = crtc_state->num_active_planes;
> -	int y_pos;
> +	int y_pos, x_dst, x_limit;
>  
>  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
>  
> @@ -201,14 +197,16 @@ static void blend(struct vkms_writeback_job *wb,
>  
>  		/* The active planes are composed associatively in z-order. */
>  		for (size_t i = 0; i < n_active_planes; i++) {
> +			x_dst = plane[i]->frame_info->dst.x1;
> +			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> +					stage_buffer->n_pixels);

Are those input values to min_t() really of type size_t? Or why is
size_t here?

>  			y_pos = get_y_pos(plane[i]->frame_info, y);
>  
>  			if (!check_limit(plane[i]->frame_info, y_pos))
>  				continue;
>  
>  			vkms_compose_row(stage_buffer, plane[i], y_pos);
> -			pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
> -					    output_buffer);
> +			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);

I thought it was a count, not a limit?

"Limit" sounds to me like "end", and end - start = count.

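One way to keep the two notions apart is to compute an exclusive end
first and derive the count from it. The sketch below is standalone C
with a hypothetical helper, blend_pixel_count(). Clamping against the
length of the output line is an assumption made for the example (the
patch clamps the width against stage_buffer->n_pixels instead), and
plain int is used throughout rather than size_t:

```c
#include <assert.h>

/*
 * Hypothetical helper. x_start is where the plane starts on the output line,
 * dst_width is the destination width of the plane, line_pixels is the length
 * of the output line. Assumes 0 <= x_start and 0 <= dst_width.
 */
static int blend_pixel_count(int x_start, int dst_width, int line_pixels)
{
	int x_end = x_start + dst_width; /* exclusive end: the "limit" */

	if (x_end > line_pixels)
		x_end = line_pixels;
	if (x_end <= x_start)
		return 0;

	return x_end - x_start; /* end - start = count */
}

static void blend_pixel_count_example(void)
{
	assert(blend_pixel_count(10, 20, 100) == 20); /* fits entirely */
	assert(blend_pixel_count(90, 20, 100) == 10); /* clipped at the line end */
	assert(blend_pixel_count(100, 5, 100) == 0);  /* starts past the line */
}
```
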
>  		}
>  
>  		apply_lut(crtc_state, output_buffer);
> 

The details aside, this is a good move.


Thanks,
pq


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-13 17:45 ` [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum Louis Chauvet
@ 2024-03-25 13:11   ` Pekka Paalanen
  2024-03-26 15:57     ` Louis Chauvet
  2024-03-25 14:07   ` Maíra Canal
  1 sibling, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 13:11 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:45:03 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> The pixel_read_direction enum is useful to describe the reading direction
> in a plane. It avoids using the rotation property of DRM, which is not
> practical for knowing the direction of reading.
> This patch also introduces two helpers, one to compute the
> pixel_read_direction from the DRM rotation property, and one to compute
> the step, in bytes, between two successive pixels in a specific direction.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
>  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 77 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index 9254086f23ff..989bcf59f375 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
>  	}
>  }
>  
> +/**
> + * direction_for_rotation() - Get the correct reading direction for a given rotation
> + *
> + * This function will use the @rotation setting of a source plane to compute the reading
> + * direction in this plane which correspond to a "left to right writing" in the CRTC.
> + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> + * to be written from left to right on the CRTC.

That is a well written description.

> + *
> + * @rotation: Rotation to analyze. It corresponds to the field @frame_info.rotation.
> + */
> +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> +{
> +	if (rotation & DRM_MODE_ROTATE_0) {
> +		if (rotation & DRM_MODE_REFLECT_X)
> +			return READ_RIGHT_TO_LEFT;
> +		else
> +			return READ_LEFT_TO_RIGHT;
> +	} else if (rotation & DRM_MODE_ROTATE_90) {
> +		if (rotation & DRM_MODE_REFLECT_Y)
> +			return READ_BOTTOM_TO_TOP;
> +		else
> +			return READ_TOP_TO_BOTTOM;
> +	} else if (rotation & DRM_MODE_ROTATE_180) {
> +		if (rotation & DRM_MODE_REFLECT_X)
> +			return READ_LEFT_TO_RIGHT;
> +		else
> +			return READ_RIGHT_TO_LEFT;
> +	} else if (rotation & DRM_MODE_ROTATE_270) {
> +		if (rotation & DRM_MODE_REFLECT_Y)
> +			return READ_TOP_TO_BOTTOM;
> +		else
> +			return READ_BOTTOM_TO_TOP;
> +	}
> +	return READ_LEFT_TO_RIGHT;

I'm a little worried to see that REFLECT_X is handled only for some
rotations, and REFLECT_Y only for the others. Why is an analysis of all
combinations not necessary?

I hope IGT uses FB patterns instead of solid color in its tests of
rotation to be able to detect the difference.

The return values do seem correct to me, assuming I have guessed
correctly what "X" and "Y" refer to when combined with rotation. I did
not find good documentation about that.

By the way, if there are already functions that are able to transform
coordinates based on the rotation bitfield, you could alternatively use
them. Transform CRTC point (0, 0) to A, and (1, 0) to B. Now A and B
are in plane coordinate system, and vector B - A gives you the
direction. The reason I'm mentioning this is that then you don't have
to implement yet another copy of the rotation bitfield semantics from
scratch.
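
The point-transform idea described above can be sketched in userspace C as
follows. This is only an illustration under stated assumptions: ROT_* and
REFL_* are local stand-ins for the DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_*
bits, crtc_to_plane() and direction_from_points() are hypothetical helpers
(not existing DRM functions), and the sketch assumes the reflection is applied
first and the rotation is counter-clockwise.

```c
/* Local stand-ins for the DRM rotation/reflection bits (hypothetical names). */
#define ROT_0	(1u << 0)
#define ROT_90	(1u << 1)
#define ROT_180	(1u << 2)
#define ROT_270	(1u << 3)
#define REFL_X	(1u << 4)
#define REFL_Y	(1u << 5)

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

struct point {
	int x, y;
};

/*
 * Hypothetical helper: map a point of the rotated (CRTC-side) image back to
 * the source plane of size w x h. Assumes reflection first, then a
 * counter-clockwise rotation, so the rotation is undone before the reflection.
 */
static struct point crtc_to_plane(unsigned int rotation, int w, int h, struct point p)
{
	struct point s = p;

	if (rotation & ROT_90) {
		s.x = w - 1 - p.y;
		s.y = p.x;
	} else if (rotation & ROT_180) {
		s.x = w - 1 - p.x;
		s.y = h - 1 - p.y;
	} else if (rotation & ROT_270) {
		s.x = p.y;
		s.y = h - 1 - p.x;
	}

	if (rotation & REFL_X)
		s.x = w - 1 - s.x;
	if (rotation & REFL_Y)
		s.y = h - 1 - s.y;

	return s;
}

/* Transform CRTC points (0, 0) and (1, 0), then look at the vector B - A. */
static enum pixel_read_direction direction_from_points(unsigned int rotation)
{
	const int w = 4, h = 3;	/* arbitrary source size, at least 2x2 */
	struct point a = crtc_to_plane(rotation, w, h, (struct point){ 0, 0 });
	struct point b = crtc_to_plane(rotation, w, h, (struct point){ 1, 0 });
	int dx = b.x - a.x;
	int dy = b.y - a.y;

	if (dx > 0)
		return READ_LEFT_TO_RIGHT;
	if (dx < 0)
		return READ_RIGHT_TO_LEFT;
	if (dy > 0)
		return READ_TOP_TO_BOTTOM;
	return READ_BOTTOM_TO_TOP;
}
```

Under these assumptions the sketch gives the same answers as the quoted
direction_for_rotation(). It also suggests why the reflection on the other
axis can be ignored there: it changes which row or column is read, not the
direction of travel along it.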


> +}
> +
>  /**
>   * blend - blend the pixels from all planes and compute crc
>   * @wb: The writeback frame buffer metadata
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 3ead8b39af4a..985e7a92b7bc 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -69,6 +69,17 @@ struct vkms_writeback_job {
>  	pixel_write_t pixel_write;
>  };
>  
> +/**
> + * enum pixel_read_direction - Enum used internally by VKMS to represent a reading direction in a
> + * plane.
> + */
> +enum pixel_read_direction {
> +	READ_BOTTOM_TO_TOP,
> +	READ_TOP_TO_BOTTOM,
> +	READ_RIGHT_TO_LEFT,
> +	READ_LEFT_TO_RIGHT
> +};
> +
>  /**
>   * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
>   * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 649d75d05b1f..743b6fd06db5 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
>  	*addr = (u8 *)frame_info->map[0].vaddr + offset;
>  }
>  
> +/**
> + * get_step_next_block() - Common helper to compute the correct step value between each pixel block
> + * to read in a certain direction.
> + *
> + * As the returned offset is the number of bytes between two consecutive blocks in a direction,
> + * the caller may have to read multiple pixels before using the next one (for example, to read from
> + * left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be used
> + * only every 8 pixels).
> + *
> + * @fb: Framebuffer to iter on
> + * @direction: Direction of the reading
> + * @plane_index: Plane to get the step from
> + */
> +static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
> +			       int plane_index)
> +{

I would have called this something like get_block_step_bytes() for
example. That makes it clear it returns bytes (not e.g. pixels). "next"
implies to me that I tell the function the current block, and then it
gets me the next one. It does not do that, so I'd not use "next".

> +	switch (direction) {
> +	case READ_LEFT_TO_RIGHT:
> +		return fb->format->char_per_block[plane_index];
> +	case READ_RIGHT_TO_LEFT:
> +		return -fb->format->char_per_block[plane_index];
> +	case READ_TOP_TO_BOTTOM:
> +		return (int)fb->pitches[plane_index];
> +	case READ_BOTTOM_TO_TOP:
> +		return -(int)fb->pitches[plane_index];
> +	}
> +
> +	return 0;
> +}

Looks good.
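
To make the "step every 8 pixels" remark in the quoted comment concrete, here
is a small standalone sketch. It uses the get_block_step_bytes() name suggested
above; struct fb_layout is a hypothetical stand-in for the two
struct drm_framebuffer values the quoted helper reads, read_r1_line() is an
illustrative caller that does not exist in the series, and the MSB-first pixel
order within a byte is an assumption of this sketch.

```c
#include <stdint.h>

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

/* Hypothetical stand-in for fb->format->char_per_block[i] and fb->pitches[i]. */
struct fb_layout {
	int char_per_block;
	int pitch;
};

/* Signed number of bytes between two consecutive blocks in @direction. */
static int get_block_step_bytes(const struct fb_layout *fb,
				enum pixel_read_direction direction)
{
	switch (direction) {
	case READ_LEFT_TO_RIGHT:
		return fb->char_per_block;
	case READ_RIGHT_TO_LEFT:
		return -fb->char_per_block;
	case READ_TOP_TO_BOTTOM:
		return fb->pitch;
	case READ_BOTTOM_TO_TOP:
		return -fb->pitch;
	}

	return 0;
}

/*
 * Read @count 1-bit pixels from left to right. One block (one byte) holds
 * 8 pixels, so the byte step is applied only after every 8th pixel.
 * Assumption: the leftmost pixel is stored in the most significant bit.
 */
static void read_r1_line(const uint8_t *line, const struct fb_layout *fb,
			 int count, uint8_t *out)
{
	int step = get_block_step_bytes(fb, READ_LEFT_TO_RIGHT);
	const uint8_t *block = line;

	for (int x = 0; x < count; x++) {
		int bit = x % 8;

		out[x] = (*block >> (7 - bit)) & 1;
		if (bit == 7)
			block += step;
	}
}
```

With a "_bytes" suffix in the name, a caller like read_r1_line() is less likely
to advance the pointer once per pixel by mistake.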


Thanks,
pq

> +
>  static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
>  				 int plane_index)
>  {
> 


* Re: [PATCH v5 01/16] drm/vkms: Code formatting
  2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
  2024-03-25 12:03   ` Pekka Paalanen
@ 2024-03-25 13:13   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 13:13 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:44, Louis Chauvet wrote:
> A few no-op changes to remove double spaces and fix incorrect alignment.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>

Reviewed-by: Maíra Canal <mcanal@igalia.com>

Best Regards,
- Maíra

> ---
>   drivers/gpu/drm/vkms/vkms_composer.c | 10 +++++-----
>   drivers/gpu/drm/vkms/vkms_crtc.c     |  6 ++----
>   drivers/gpu/drm/vkms/vkms_drv.c      |  3 +--
>   drivers/gpu/drm/vkms/vkms_plane.c    |  8 ++++----
>   4 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index e7441b227b3c..c6d9b4a65809 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -96,7 +96,7 @@ static u16 lerp_u16(u16 a, u16 b, s64 t)
>   	s64 a_fp = drm_int2fixp(a);
>   	s64 b_fp = drm_int2fixp(b);
>   
> -	s64 delta = drm_fixp_mul(b_fp - a_fp,  t);
> +	s64 delta = drm_fixp_mul(b_fp - a_fp, t);
>   
>   	return drm_fixp2int(a_fp + delta);
>   }
> @@ -302,8 +302,8 @@ static int compose_active_planes(struct vkms_writeback_job *active_wb,
>   void vkms_composer_worker(struct work_struct *work)
>   {
>   	struct vkms_crtc_state *crtc_state = container_of(work,
> -						struct vkms_crtc_state,
> -						composer_work);
> +							  struct vkms_crtc_state,
> +							  composer_work);
>   	struct drm_crtc *crtc = crtc_state->base.crtc;
>   	struct vkms_writeback_job *active_wb = crtc_state->active_writeback;
>   	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> @@ -328,7 +328,7 @@ void vkms_composer_worker(struct work_struct *work)
>   		crtc_state->gamma_lut.base = (struct drm_color_lut *)crtc->state->gamma_lut->data;
>   		crtc_state->gamma_lut.lut_length =
>   			crtc->state->gamma_lut->length / sizeof(struct drm_color_lut);
> -		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length  - 1);
> +		max_lut_index_fp = drm_int2fixp(crtc_state->gamma_lut.lut_length - 1);
>   		crtc_state->gamma_lut.channel_value2index_ratio = drm_fixp_div(max_lut_index_fp,
>   									       u16_max_fp);
>   
> @@ -367,7 +367,7 @@ void vkms_composer_worker(struct work_struct *work)
>   		drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
>   }
>   
> -static const char * const pipe_crc_sources[] = {"auto"};
> +static const char *const pipe_crc_sources[] = { "auto" };
>   
>   const char *const *vkms_get_crc_sources(struct drm_crtc *crtc,
>   					size_t *count)
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 61e500b8c9da..7586ae2e1dd3 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -191,8 +191,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
>   		return ret;
>   
>   	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
> -		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
> -								  plane);
> +		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
>   		WARN_ON(!plane_state);
>   
>   		if (!plane_state->visible)
> @@ -208,8 +207,7 @@ static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
>   
>   	i = 0;
>   	drm_for_each_plane_mask(plane, crtc->dev, crtc_state->plane_mask) {
> -		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state,
> -								  plane);
> +		plane_state = drm_atomic_get_existing_plane_state(crtc_state->state, plane);
>   
>   		if (!plane_state->visible)
>   			continue;
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> index dd0af086e7fa..83e6c9b9ff46 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.c
> +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> @@ -81,8 +81,7 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
>   	drm_atomic_helper_wait_for_flip_done(dev, old_state);
>   
>   	for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> -		struct vkms_crtc_state *vkms_state =
> -			to_vkms_crtc_state(old_crtc_state);
> +		struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(old_crtc_state);
>   
>   		flush_work(&vkms_state->composer_work);
>   	}
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index e5c625ab8e3e..5a8d295e65f2 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -117,10 +117,10 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
>   	drm_framebuffer_get(frame_info->fb);
>   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> -						     DRM_MODE_ROTATE_90 |
> -						     DRM_MODE_ROTATE_270 |
> -						     DRM_MODE_REFLECT_X |
> -						     DRM_MODE_REFLECT_Y);
> +									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_270 |
> +									  DRM_MODE_REFLECT_X |
> +									  DRM_MODE_REFLECT_Y);
>   
>   	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>   			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> 

* Re: [PATCH v5 02/16] drm/vkms: Use drm_frame directly
  2024-03-13 17:44 ` [PATCH v5 02/16] drm/vkms: Use drm_frame directly Louis Chauvet
  2024-03-25 12:04   ` Pekka Paalanen
@ 2024-03-25 13:20   ` Maíra Canal
  2024-03-26 15:56     ` Louis Chauvet
  1 sibling, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 13:20 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:44, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Remove intermediary variables and access the variables directly from
> drm_frame. These changes should be a no-op.
> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_drv.h       |  3 ---
>   drivers/gpu/drm/vkms/vkms_formats.c   | 12 +++++++-----
>   drivers/gpu/drm/vkms/vkms_plane.c     |  3 ---
>   drivers/gpu/drm/vkms/vkms_writeback.c |  5 -----
>   4 files changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 8f5710debb1e..b4b357447292 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -31,9 +31,6 @@ struct vkms_frame_info {
>   	struct drm_rect rotated;
>   	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
>   	unsigned int rotation;
> -	unsigned int offset;
> -	unsigned int pitch;
> -	unsigned int cpp;
>   };
>   
>   struct pixel_argb_u16 {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 36046b12f296..172830a3936a 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -11,8 +11,10 @@
>   
>   static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
>   {
> -	return frame_info->offset + (y * frame_info->pitch)
> -				  + (x * frame_info->cpp);
> +	struct drm_framebuffer *fb = frame_info->fb;
> +
> +	return fb->offsets[0] + (y * fb->pitches[0])
> +			      + (x * fb->format->cpp[0]);

Nitpicking: Could this be packed into a single line?

Anyway,

Reviewed-by: Maíra Canal <mcanal@igalia.com>

Best Regards,
- Maíra
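
For reference, the single-line form asked about above could look like the
sketch below. The two structs are hypothetical stand-ins that only carry the
fields of struct drm_framebuffer and its format info used by the quoted
pixel_offset(), so that the snippet compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the format info and framebuffer fields used here. */
struct example_format {
	unsigned char cpp[4];
};

struct example_fb {
	const struct example_format *format;
	unsigned int pitches[4];
	unsigned int offsets[4];
};

/* Byte offset of pixel (x, y) in the first plane, written on one line. */
static size_t pixel_offset(const struct example_fb *fb, int x, int y)
{
	return fb->offsets[0] + (y * fb->pitches[0]) + (x * fb->format->cpp[0]);
}
```

Even with a leading tab, the return statement stays well under 100 columns.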

>   }
>   
>   /*
> @@ -131,12 +133,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>   	u8 *src_pixels = get_packed_src_addr(frame_info, y);
>   	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
>   
> -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->cpp) {
> +	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
>   		int x_pos = get_x_position(frame_info, limit, x);
>   
>   		if (drm_rotation_90_or_270(frame_info->rotation))
>   			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> -				+ frame_info->cpp * y;
> +				+ frame_info->fb->format->cpp[0] * y;
>   
>   		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
>   	}
> @@ -223,7 +225,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>   	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
>   	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
>   
> -	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->cpp)
> +	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->fb->format->cpp[0])
>   		wb->pixel_write(dst_pixels, &in_pixels[x]);
>   }
>   
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 5a8d295e65f2..21b5adfb44aa 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -125,9 +125,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>   			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>   
> -	frame_info->offset = fb->offsets[0];
> -	frame_info->pitch = fb->pitches[0];
> -	frame_info->cpp = fb->format->cpp[0];
>   	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
>   }
>   
> diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c b/drivers/gpu/drm/vkms/vkms_writeback.c
> index bc724cbd5e3a..c8582df1f739 100644
> --- a/drivers/gpu/drm/vkms/vkms_writeback.c
> +++ b/drivers/gpu/drm/vkms/vkms_writeback.c
> @@ -149,11 +149,6 @@ static void vkms_wb_atomic_commit(struct drm_connector *conn,
>   	crtc_state->active_writeback = active_wb;
>   	crtc_state->wb_pending = true;
>   	spin_unlock_irq(&output->composer_lock);
> -
> -	wb_frame_info->offset = fb->offsets[0];
> -	wb_frame_info->pitch = fb->pitches[0];
> -	wb_frame_info->cpp = fb->format->cpp[0];
> -
>   	drm_writeback_queue_job(wb_conn, connector_state);
>   	active_wb->pixel_write = get_pixel_write_function(wb_format);
>   	drm_rect_init(&wb_frame_info->src, 0, 0, crtc_width, crtc_height);
> 

* Re: [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions
  2024-03-13 17:44 ` [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions Louis Chauvet
  2024-03-13 19:02   ` Randy Dunlap
@ 2024-03-25 13:32   ` Maíra Canal
  2024-03-26 15:56     ` Louis Chauvet
  1 sibling, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 13:32 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:44, Louis Chauvet wrote:
> Add some documentation on pixel conversion functions.
> Update outdated comments for pixel_write functions.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_composer.c |  7 ++++
>   drivers/gpu/drm/vkms/vkms_drv.h      | 13 ++++++++
>   drivers/gpu/drm/vkms/vkms_formats.c  | 62 ++++++++++++++++++++++++++++++------
>   3 files changed, 73 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index c6d9b4a65809..da0651a94c9b 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -189,6 +189,13 @@ static void blend(struct vkms_writeback_job *wb,
>   
>   	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
>   
> +	/*
> +	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> +	 * complexity to avoid poor blending performance.
> +	 *
> +	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> +	 * buffer.
> +	 */
>   	for (size_t y = 0; y < crtc_y_limit; y++) {
>   		fill_background(&background_color, output_buffer);
>   
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index b4b357447292..18086423a3a7 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -25,6 +25,17 @@
>   
>   #define VKMS_LUT_SIZE 256
>   
> +/**
> + * struct vkms_frame_info - structure to store the state of a frame
> + *
> + * @fb: backing drm framebuffer
> + * @src: source rectangle of this frame in the source framebuffer
> + * @dst: destination rectangle in the crtc buffer
> + * @map: see drm_shadow_plane_state@data
> + * @rotation: rotation applied to the source.
> + *
> + * @src and @dst should have the same size modulo the rotation.
> + */
>   struct vkms_frame_info {
>   	struct drm_framebuffer *fb;
>   	struct drm_rect src, dst;
> @@ -52,6 +63,8 @@ struct vkms_writeback_job {
>    * vkms_plane_state - Driver specific plane state

It should be "* struct vkms_plane_state - Driver specific plane state".

>    * @base: base plane state
>    * @frame_info: data required for composing computation
> + * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> + * ensure that this pointer is valid
>    */
>   struct vkms_plane_state {
>   	struct drm_shadow_plane_state base;
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 172830a3936a..6e3dc8682ff9 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -9,6 +9,18 @@
>   
>   #include "vkms_formats.h"
>   
> +/**
> + * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
> + *
> + * @frame_info: Buffer metadata
> + * @x: The x coordinate of the wanted pixel in the buffer
> + * @y: The y coordinate of the wanted pixel in the buffer
> + *
> + * The caller must ensure that the framebuffer associated with this request uses a pixel format
> + * where block_h == block_w == 1.
> + * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
> + * outside of the buffer.
> + */
>   static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
>   {
>   	struct drm_framebuffer *fb = frame_info->fb;
> @@ -17,18 +29,22 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
>   			      + (x * fb->format->cpp[0]);
>   }
>   
> -/*
> - * packed_pixels_addr - Get the pointer to pixel of a given pair of coordinates
> +/**
> + * packed_pixels_addr() - Get the pointer to the block containing the pixel at the given
> + * coordinates
>    *
>    * @frame_info: Buffer metadata
> - * @x: The x(width) coordinate of the 2D buffer
> - * @y: The y(Heigth) coordinate of the 2D buffer
> + * @x: The x(width) coordinate inside the plane
> + * @y: The y(height) coordinate inside the plane

I would add a space after x and y.

>    *
>    * Takes the information stored in the frame_info, a pair of coordinates, and
>    * returns the address of the first color channel.
>    * This function assumes the channels are packed together, i.e. a color channel
>    * comes immediately after another in the memory. And therefore, this function
>    * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> + *
> + * The caller must ensure that the framebuffer associated with this request uses a pixel format
> + * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
>    */
>   static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
>   				int x, int y)
> @@ -53,6 +69,13 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>   	return x;
>   }
>   
> +/*
> + * The following  functions take pixel data from the buffer and convert them to the format

Double-spacing.

> + * ARGB16161616 in out_pixel.
> + *
> + * They are used in the `vkms_compose_row` function to handle multiple formats.

For cross-referencing functions, we use vkms_compose_row() [1].

[1] 
https://docs.kernel.org/doc-guide/kernel-doc.html#highlights-and-cross-references
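
As an illustration of the cross-reference conventions from that page, a comment
could be written as in the sketch below. example_read_pixel() is a made-up
function, and the struct is a local stand-in for the driver's pixel type so
that the snippet compiles on its own; only the comment syntax is the point.

```c
#include <stdint.h>

/* Local stand-in for the driver's pixel type. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/**
 * example_read_pixel() - Convert one XRGB8888 pixel to &struct pixel_argb_u16
 * @in_pixel: pointer to the pixel to read
 * @out_pixel: pointer to write the converted pixel
 *
 * Meant to be called from vkms_compose_row() for each pixel of a line.
 * Functions are referenced with trailing parentheses, structures with the
 * "&struct name" form and parameters with the "@name" form.
 */
static void example_read_pixel(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	out_pixel->a = 0xffff;
	out_pixel->r = in_pixel[2] * 257;
	out_pixel->g = in_pixel[1] * 257;
	out_pixel->b = in_pixel[0] * 257;
}
```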

> + */
> +
>   static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>   {
>   	/*
> @@ -145,12 +168,11 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>   }
>   
>   /*
> - * The following  functions take an line of argb_u16 pixels from the
> - * src_buffer, convert them to a specific format, and store them in the
> - * destination.
> + * The following functions take one argb_u16 pixel and convert it to a specific format. The

For cross-referencing structs, look here [1].

> + * result is stored in @dst_pixels.
>    *
> - * They are used in the `compose_active_planes` to convert and store a line
> - * from the src_buffer to the writeback buffer.
> + * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to

Same.

> + * the writeback buffer.
>    */
>   static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>   {
> @@ -216,6 +238,14 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>   	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
>   }
>   
> +/**
> + * Generic loop for all supported writeback format. It is executed just after the blending to
> + * write a line in the writeback buffer.
> + *
> + * @wb: Job where to insert the final image
> + * @src_buffer: Line to write
> + * @y: Row to write in the writeback buffer
> + */
>   void vkms_writeback_row(struct vkms_writeback_job *wb,
>   			const struct line_buffer *src_buffer, int y)
>   {
> @@ -229,6 +259,13 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>   		wb->pixel_write(dst_pixels, &in_pixels[x]);
>   }
>   
> +/**

Where is the function name?
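
For illustration, a kernel-doc block that names its function has the shape
sketched below. All identifiers are made up for the example
(example_get_read_function(), example_read_t and EXAMPLE_FORMAT_GREY8 are not
part of the series); only the layout of the comment is the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-function type and format code, for illustration only. */
typedef void (*example_read_t)(const uint8_t *in_pixel, uint16_t *out_pixel);

#define EXAMPLE_FORMAT_GREY8 1u

static void example_grey8_read(const uint8_t *in_pixel, uint16_t *out_pixel)
{
	*out_pixel = *in_pixel * 257;
}

/**
 * example_get_read_function() - Retrieve the read function for a format
 * @format: format code for which to obtain a conversion function
 *
 * Return: the matching function, or NULL for unsupported formats. The caller
 * must check the pointer before using it.
 */
static example_read_t example_get_read_function(uint32_t format)
{
	switch (format) {
	case EXAMPLE_FORMAT_GREY8:
		return example_grey8_read;
	default:
		return NULL;
	}
}
```

The first line carries the function name followed by "()" and a short summary;
the parameters follow, and the return value is described in a "Return:"
section.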

> + * Retrieve the correct read_pixel function for a specific format.
> + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> + * pointer is valid before using it in a vkms_plane_state.
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + */
>   void *get_pixel_conversion_function(u32 format)
>   {
>   	switch (format) {
> @@ -247,6 +284,13 @@ void *get_pixel_conversion_function(u32 format)
>   	}
>   }
>   
> +/**

Same.

Best Regards,
- Maíra

> + * Retrieve the correct write_pixel function for a specific format.
> + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> + * pointer is valid before using it in a vkms_writeback_job.
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + */
>   void *get_pixel_write_function(u32 format)
>   {
>   	switch (format) {
> 

* Re: [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-13 17:44 ` [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions Louis Chauvet
  2024-03-25 12:04   ` Pekka Paalanen
@ 2024-03-25 13:56   ` Maíra Canal
  2024-03-26 15:56     ` Louis Chauvet
  1 sibling, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 13:56 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:44, Louis Chauvet wrote:
> Introduce two typedefs: pixel_read_t and pixel_write_t. They allow the
> compiler to check that the passed functions take the correct arguments.
> Such typedefs will help ensure consistency across the code base if
> these prototypes are updated.
> 
> Rename input/output variables in a consistent way between read_line and
> write_line.
> 
> A warning has been added in get_pixel_*_function to alert when an unsupported
> pixel format is requested. As those formats are checked before the
> atomic_update callbacks, it should never happen.
> 
> Add documentation for those typedefs.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
>   drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
>   drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
>   drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
>   4 files changed, 95 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 18086423a3a7..4bfc62d26f08 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -53,12 +53,31 @@ struct line_buffer {
>   	struct pixel_argb_u16 *pixels;
>   };
>   
> +/**
> + * typedef pixel_write_t - These functions are used to read a pixel from a
> + * `struct pixel_argb_u16*`, convert it in a specific format and write it in the @dst_pixels
> + * buffer.

Your brief description looks a bit long to me. Also, take a look at the 
cross-references docs [1].

[1] 
https://docs.kernel.org/doc-guide/kernel-doc.html#highlights-and-cross-references

> + *
> + * @out_pixel: destination address to write the pixel
> + * @in_pixel: pixel to write
> + */
> +typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> +
>   struct vkms_writeback_job {
>   	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
>   	struct vkms_frame_info wb_frame_info;
> -	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
> +	pixel_write_t pixel_write;
>   };
>   
> +/**
> + * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> + * convert it to `struct pixel_argb_u16` and write it to @out_pixel.

Same.

> + *
> + * @in_pixel: Pointer to the pixel to read
> + * @out_pixel: Pointer to write the converted pixel

s/Pointer/pointer

> + */
> +typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +
>   /**
>    * vkms_plane_state - Driver specific plane state
>    * @base: base plane state
> @@ -69,7 +88,7 @@ struct vkms_writeback_job {
>   struct vkms_plane_state {
>   	struct drm_shadow_plane_state base;
>   	struct vkms_frame_info *frame_info;
> -	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
> +	pixel_read_t pixel_read;
>   };
>   
>   struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 6e3dc8682ff9..55a4365d21a4 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>    * They are used in the `vkms_compose_row` function to handle multiple formats.
>    */
>   
> -static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	/*
>   	 * The 257 is the "conversion ratio". This number is obtained by the
> @@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
>   	 * the best color value in a pixel format with more possibilities.
>   	 * A similar idea applies to others RGB color conversions.
>   	 */
> -	out_pixel->a = (u16)src_pixels[3] * 257;
> -	out_pixel->r = (u16)src_pixels[2] * 257;
> -	out_pixel->g = (u16)src_pixels[1] * 257;
> -	out_pixel->b = (u16)src_pixels[0] * 257;
> +	out_pixel->a = (u16)in_pixel[3] * 257;
> +	out_pixel->r = (u16)in_pixel[2] * 257;
> +	out_pixel->g = (u16)in_pixel[1] * 257;
> +	out_pixel->b = (u16)in_pixel[0] * 257;
>   }
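
A short aside on the 257 "conversion ratio" from the quoted comment: 257 is
65535 / 255, so multiplying an 8-bit channel by 257 maps 0x00 to 0x0000 and
0xff to 0xffff, and is the same as repeating the byte in both halves of the
16-bit value. The sketch below checks this in userspace; u8_to_u16() and
u16_to_u8() are illustrative names, and the second one reimplements the
rounding of DIV_ROUND_CLOSEST() for non-negative values instead of using the
kernel macro.

```c
#include <stdint.h>

/* Widen an 8-bit channel to 16 bits: v * 257 == (v << 8) | v. */
static uint16_t u8_to_u16(uint8_t v)
{
	return (uint16_t)(v * 257);
}

/* Narrow back with round-to-nearest, like DIV_ROUND_CLOSEST(v, 257). */
static uint8_t u16_to_u8(uint16_t v)
{
	return (uint8_t)((v + 257 / 2) / 257);
}
```

Every 8-bit value survives the round trip through the 16-bit representation
unchanged, which is what makes 257 the natural factor here.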
>   
> -static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = (u16)src_pixels[2] * 257;
> -	out_pixel->g = (u16)src_pixels[1] * 257;
> -	out_pixel->b = (u16)src_pixels[0] * 257;
> +	out_pixel->r = (u16)in_pixel[2] * 257;
> +	out_pixel->g = (u16)in_pixel[1] * 257;
> +	out_pixel->b = (u16)in_pixel[0] * 257;
>   }
>   
> -static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>   
> -	out_pixel->a = le16_to_cpu(pixels[3]);
> -	out_pixel->r = le16_to_cpu(pixels[2]);
> -	out_pixel->g = le16_to_cpu(pixels[1]);
> -	out_pixel->b = le16_to_cpu(pixels[0]);
> +	out_pixel->a = le16_to_cpu(pixel[3]);
> +	out_pixel->r = le16_to_cpu(pixel[2]);
> +	out_pixel->g = le16_to_cpu(pixel[1]);
> +	out_pixel->b = le16_to_cpu(pixel[0]);
>   }
>   
> -static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>   
>   	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = le16_to_cpu(pixels[2]);
> -	out_pixel->g = le16_to_cpu(pixels[1]);
> -	out_pixel->b = le16_to_cpu(pixels[0]);
> +	out_pixel->r = le16_to_cpu(pixel[2]);
> +	out_pixel->g = le16_to_cpu(pixel[1]);
> +	out_pixel->b = le16_to_cpu(pixel[0]);
>   }
>   
> -static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> +static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
> -	u16 *pixels = (u16 *)src_pixels;
> +	u16 *pixel = (u16 *)in_pixel;
>   
>   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
>   
> -	u16 rgb_565 = le16_to_cpu(*pixels);
> +	u16 rgb_565 = le16_to_cpu(*pixel);
>   	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
>   	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
>   	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> @@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>   
>   /*
>    * The following functions take one argb_u16 pixel and convert it to a specific format. The
> - * result is stored in @dst_pixels.
> + * result is stored in @out_pixel.
>    *
>    * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
>    * the writeback buffer.
>    */
> -static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   {
>   	/*
>   	 * This sequence below is important because the format's byte order is
> @@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
>   	 * | Addr + 2 | = Red channel
>   	 * | Addr + 3 | = Alpha channel
>   	 */
> -	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> +	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>   }
>   
> -static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   {
> -	dst_pixels[3] = 0xff;
> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> +	out_pixel[3] = 0xff;
> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>   }
>   
> -static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>   
> -	pixels[3] = cpu_to_le16(in_pixel->a);
> -	pixels[2] = cpu_to_le16(in_pixel->r);
> -	pixels[1] = cpu_to_le16(in_pixel->g);
> -	pixels[0] = cpu_to_le16(in_pixel->b);
> +	pixel[3] = cpu_to_le16(in_pixel->a);
> +	pixel[2] = cpu_to_le16(in_pixel->r);
> +	pixel[1] = cpu_to_le16(in_pixel->g);
> +	pixel[0] = cpu_to_le16(in_pixel->b);
>   }
>   
> -static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>   
> -	pixels[3] = 0xffff;
> -	pixels[2] = cpu_to_le16(in_pixel->r);
> -	pixels[1] = cpu_to_le16(in_pixel->g);
> -	pixels[0] = cpu_to_le16(in_pixel->b);
> +	pixel[3] = 0xffff;
> +	pixel[2] = cpu_to_le16(in_pixel->r);
> +	pixel[1] = cpu_to_le16(in_pixel->g);
> +	pixel[0] = cpu_to_le16(in_pixel->b);
>   }
>   
> -static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   {
> -	u16 *pixels = (u16 *)dst_pixels;
> +	u16 *pixel = (u16 *)out_pixel;
>   
>   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> @@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>   	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
>   	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
>   
> -	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
> +	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
>   }
>   
>   /**
> @@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>    *
>    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>    */
> -void *get_pixel_conversion_function(u32 format)
> +pixel_read_t get_pixel_read_function(u32 format)
>   {
>   	switch (format) {
>   	case DRM_FORMAT_ARGB8888:
> @@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
>   	case DRM_FORMAT_RGB565:
>   		return &RGB565_to_argb_u16;
>   	default:
> -		return NULL;
> +		/*
> +		 * This is a bug in vkms_plane_atomic_check. All the supported

s/vkms_plane_atomic_check/vkms_plane_atomic_check()

Best Regards,
- Maíra

> +		 * format must:
> +		 * - Be listed in vkms_formats in vkms_plane.c
> +		 * - Have a pixel_read callback defined here
> +		 */
> +		WARN(true,
> +		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> +		     &format);
> +		return (pixel_read_t)NULL;
>   	}
>   }
>   
> @@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
>    *
>    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>    */
> -void *get_pixel_write_function(u32 format)
> +pixel_write_t get_pixel_write_function(u32 format)
>   {
>   	switch (format) {
>   	case DRM_FORMAT_ARGB8888:
> @@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
>   	case DRM_FORMAT_RGB565:
>   		return &argb_u16_to_RGB565;
>   	default:
> -		return NULL;
> +		/*
> +		 * This is a bug in vkms_writeback_atomic_check. All the supported
> +		 * format must:
> +		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> +		 * - Have a pixel_write callback defined here
> +		 */
> +		WARN(true,
> +		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> +		     &format);
> +		return (pixel_write_t)NULL;
>   	}
>   }
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index cf59c2ed8e9a..3ecea4563254 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -5,8 +5,8 @@
>   
>   #include "vkms_drv.h"
>   
> -void *get_pixel_conversion_function(u32 format);
> +pixel_read_t get_pixel_read_function(u32 format);
>   
> -void *get_pixel_write_function(u32 format);
> +pixel_write_t get_pixel_write_function(u32 format);
>   
>   #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 21b5adfb44aa..10e9b23dab28 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>   			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>   
> -	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
> +	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
>   }
>   
>   static int vkms_plane_atomic_check(struct drm_plane *plane,
> 
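For readers following the thread, the 257 used by the argb_u16_to_* writers quoted above can be checked in isolation. The block below is only a user-space sketch: div_round_closest(), channel_8_to_16() and channel_16_to_8() are hypothetical names, and div_round_closest() is a simplified stand-in for the kernel's DIV_ROUND_CLOSEST() that handles unsigned operands only.

```c
#include <stdint.h>

/*
 * Simplified user-space stand-in for DIV_ROUND_CLOSEST(), valid for
 * unsigned operands only (assumption: the kernel macro also handles
 * signed values, which is not needed here).
 */
static inline uint32_t div_round_closest(uint32_t x, uint32_t divisor)
{
	return (x + divisor / 2) / divisor;
}

/* Expand an 8-bit channel to 16 bits: 0xff * 257 == 0xffff. */
static inline uint16_t channel_8_to_16(uint8_t v)
{
	return (uint16_t)(v * 257u);
}

/* Reduce a 16-bit channel to 8 bits, rounding to the closest value. */
static inline uint8_t channel_16_to_8(uint16_t v)
{
	return (uint8_t)div_round_closest(v, 257);
}
```

Because 65535 / 255 == 257 exactly, expanding an 8-bit value and reducing it again gives back the original value for all 256 inputs.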

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
  2024-03-13 19:08   ` Randy Dunlap
  2024-03-25 12:05   ` Pekka Paalanen
@ 2024-03-25 13:59   ` Maíra Canal
  2024-03-26 15:56     ` Louis Chauvet
  2 siblings, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 13:59 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:44, Louis Chauvet wrote:
> Introduce two callbacks which do nothing. They are used in place of
> NULL and avoid a kernel oops if the NULL pointer would otherwise be called.
> 
> If those callbacks are used, it means that there is a mismatch between
> what formats are announced by atomic_check and what is really supported by
> atomic_update.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
>   1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 55a4365d21a4..b57d85b8b935 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -136,6 +136,21 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
>   }
>   
> +/**
> + * black_to_argb_u16() - pixel_read callback which always read black
> + *
> + * This callback is used when an invalid format is requested for plane reading.
> + * It is used to avoid null pointer to be used as a function. In theory, this function should
> + * never be called, except if you found a bug in the driver/DRM core.
> + */
> +static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +{
> +	out_pixel->a = (u16)0xFFFF;
> +	out_pixel->r = 0;
> +	out_pixel->g = 0;
> +	out_pixel->b = 0;
> +}
> +
>   /**
>    * vkms_compose_row - compose a single row of a plane
>    * @stage_buffer: output line with the composed pixels
> @@ -238,6 +253,16 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
>   }
>   
> +/**
> + * argb_u16_to_nothing() - pixel_write callback with no effect
> + *
> + * This callback is used when an invalid format is requested for writeback.
> + * It is used to avoid null pointer to be used as a function. In theory, this should never
> + * happen, except if there is a bug in the driver
> + */
> +static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +{}
> +
>   /**
>    * Generic loop for all supported writeback format. It is executed just after the blending to
>    * write a line in the writeback buffer.
> @@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>   
>   /**
>    * Retrieve the correct read_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_plane_state.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"

"If the format is not supported by VKMS, a warning is emitted and a 
dummy "always read black"..."

> + * function is returned.
>    *
>    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>    */
> @@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
>   		 * format must:
>   		 * - Be listed in vkms_formats in vkms_plane.c
>   		 * - Have a pixel_read callback defined here
> +		 *
> +		 * To avoid kernel crash, a dummy "always read black" function is used. It means
> +		 * that during the composition, this plane will always be black.
>   		 */
>   		WARN(true,
>   		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
>   		     &format);
> -		return (pixel_read_t)NULL;
> +		return &black_to_argb_u16;
>   	}
>   }
>   
>   /**
>    * Retrieve the correct write_pixel function for a specific format.
> - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> - * pointer is valid before using it in a vkms_writeback_job.
> + * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"

"If the format is not supported by VKMS, a warning is emitted and a 
dummy "don't do anything"..."

Best Regards,
- Maíra

> + * function is returned.
>    *
>    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>    */
> @@ -319,10 +347,13 @@ pixel_write_t get_pixel_write_function(u32 format)
>   		 * format must:
>   		 * - Be listed in vkms_wb_formats in vkms_writeback.c
>   		 * - Have a pixel_write callback defined here
> +		 *
> +		 * To avoid kernel crash, a dummy "don't do anything" function is used. It means
> +		 * that the resulting writeback buffer is not composed and can contain any values.
>   		 */
>   		WARN(true,
>   		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
>   		     &format);
> -		return (pixel_write_t)NULL;
> +		return &argb_u16_to_nothing;
>   	}
>   }
> 
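The fallback pattern discussed in this patch can be sketched outside the kernel as follows. This is an illustration under stated assumptions, not the driver code: FMT_XRGB8888 is a hypothetical format code, fprintf() stands in for WARN(), and the const qualifier on the input pointer is a choice made for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

typedef void (*pixel_read_t)(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel);

/* Hypothetical format code, used only by this sketch. */
#define FMT_XRGB8888 1u

static void xrgb8888_to_argb_u16(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	out_pixel->a = 0xffff;
	out_pixel->r = (uint16_t)(in_pixel[2] * 257u);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257u);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257u);
}

/* Fallback reader: ignores its input and always produces opaque black. */
static void black_to_argb_u16(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	(void)in_pixel;
	out_pixel->a = 0xffff;
	out_pixel->r = 0;
	out_pixel->g = 0;
	out_pixel->b = 0;
}

/* Never returns NULL: unknown formats get the black fallback plus a warning. */
static pixel_read_t get_pixel_read_function(uint32_t format)
{
	switch (format) {
	case FMT_XRGB8888:
		return xrgb8888_to_argb_u16;
	default:
		fprintf(stderr, "format 0x%08x is not supported, reading black\n",
			(unsigned int)format);
		return black_to_argb_u16;
	}
}
```

With this shape the caller can invoke the returned pointer unconditionally. The trade-off is that a format mismatch shows up as a warning and a black plane rather than as a crash.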

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions
  2024-03-13 17:45 ` [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions Louis Chauvet
  2024-03-25 12:05   ` Pekka Paalanen
@ 2024-03-25 14:00   ` Maíra Canal
  1 sibling, 0 replies; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 14:00 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:45, Louis Chauvet wrote:
> As the pixel_read and pixel_write functions should never modify the input
> buffer, mark those pointers const.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>

Reviewed-by: Maíra Canal <mcanal@igalia.com>

Best Regards,
- Maíra

> ---
>   drivers/gpu/drm/vkms/vkms_drv.h     |  4 ++--
>   drivers/gpu/drm/vkms/vkms_formats.c | 24 ++++++++++++------------
>   2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 4bfc62d26f08..3ead8b39af4a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -61,7 +61,7 @@ struct line_buffer {
>    * @out_pixel: destination address to write the pixel
>    * @in_pixel: pixel to write
>    */
> -typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> +typedef void (*pixel_write_t)(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel);
>   
>   struct vkms_writeback_job {
>   	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
> @@ -76,7 +76,7 @@ struct vkms_writeback_job {
>    * @in_pixel: Pointer to the pixel to read
>    * @out_pixel: Pointer to write the converted pixel
>    */
> -typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
>   
>   /**
>    * vkms_plane_state - Driver specific plane state
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index b57d85b8b935..b2f8dfc26c35 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>    * They are used in the `vkms_compose_row` function to handle multiple formats.
>    */
>   
> -static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	/*
>   	 * The 257 is the "conversion ratio". This number is obtained by the
> @@ -90,7 +90,7 @@ static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   	out_pixel->b = (u16)in_pixel[0] * 257;
>   }
>   
> -static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	out_pixel->a = (u16)0xffff;
>   	out_pixel->r = (u16)in_pixel[2] * 257;
> @@ -98,7 +98,7 @@ static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   	out_pixel->b = (u16)in_pixel[0] * 257;
>   }
>   
> -static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	u16 *pixel = (u16 *)in_pixel;
>   
> @@ -108,7 +108,7 @@ static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
>   	out_pixel->b = le16_to_cpu(pixel[0]);
>   }
>   
> -static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	u16 *pixel = (u16 *)in_pixel;
>   
> @@ -118,7 +118,7 @@ static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pi
>   	out_pixel->b = le16_to_cpu(pixel[0]);
>   }
>   
> -static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	u16 *pixel = (u16 *)in_pixel;
>   
> @@ -143,7 +143,7 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>    * It is used to avoid null pointer to be used as a function. In theory, this function should
>    * never be called, except if you found a bug in the driver/DRM core.
>    */
> -static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>   {
>   	out_pixel->a = (u16)0xFFFF;
>   	out_pixel->r = 0;
> @@ -189,7 +189,7 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>    * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
>    * the writeback buffer.
>    */
> -static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {
>   	/*
>   	 * This sequence below is important because the format's byte order is
> @@ -207,7 +207,7 @@ static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>   }
>   
> -static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB8888(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {
>   	out_pixel[3] = 0xff;
>   	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> @@ -215,7 +215,7 @@ static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>   	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>   }
>   
> -static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_ARGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {
>   	u16 *pixel = (u16 *)out_pixel;
>   
> @@ -225,7 +225,7 @@ static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
>   	pixel[0] = cpu_to_le16(in_pixel->b);
>   }
>   
> -static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_XRGB16161616(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {
>   	u16 *pixel = (u16 *)out_pixel;
>   
> @@ -235,7 +235,7 @@ static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pi
>   	pixel[0] = cpu_to_le16(in_pixel->b);
>   }
>   
> -static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_RGB565(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {
>   	u16 *pixel = (u16 *)out_pixel;
>   
> @@ -260,7 +260,7 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>    * It is used to avoid null pointer to be used as a function. In theory, this should never
>    * happen, except if there is a bug in the driver
>    */
> -static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> +static void argb_u16_to_nothing(u8 *out_pixel, const struct pixel_argb_u16 *in_pixel)
>   {}
>   
>   /**
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-13 17:45 ` [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum Louis Chauvet
  2024-03-25 13:11   ` Pekka Paalanen
@ 2024-03-25 14:07   ` Maíra Canal
  2024-03-26 15:57     ` Louis Chauvet
  1 sibling, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 14:07 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:45, Louis Chauvet wrote:
> The pixel_read_direction enum is useful to describe the reading direction
> in a plane. It avoids using the rotation property of DRM, which is not
> practical for determining the direction of reading.
> This patch also introduces two helpers, one to compute the
> pixel_read_direction from the DRM rotation property, and one to compute
> the step, in bytes, between two successive pixels in a specific direction.
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
>   drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
>   3 files changed, 77 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index 9254086f23ff..989bcf59f375 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
>   	}
>   }
>   
> +/**
> + * direction_for_rotation() - Get the correct reading direction for a given rotation
> + *
> + * This function will use the @rotation setting of a source plane to compute the reading
> + * direction in this plane which correspond to a "left to right writing" in the CRTC.
> + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> + * to be written from left to right on the CRTC.
> + *
> + * @rotation: Rotation to analyze. It correspond the field @frame_info.rotation.

A bit unusual to see arguments after the description. kernel-doc
normally lists the @argument lines right after the one-line summary.

> + */
> +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> +{
> +	if (rotation & DRM_MODE_ROTATE_0) {
> +		if (rotation & DRM_MODE_REFLECT_X)
> +			return READ_RIGHT_TO_LEFT;
> +		else
> +			return READ_LEFT_TO_RIGHT;
> +	} else if (rotation & DRM_MODE_ROTATE_90) {
> +		if (rotation & DRM_MODE_REFLECT_Y)
> +			return READ_BOTTOM_TO_TOP;
> +		else
> +			return READ_TOP_TO_BOTTOM;
> +	} else if (rotation & DRM_MODE_ROTATE_180) {
> +		if (rotation & DRM_MODE_REFLECT_X)
> +			return READ_LEFT_TO_RIGHT;
> +		else
> +			return READ_RIGHT_TO_LEFT;
> +	} else if (rotation & DRM_MODE_ROTATE_270) {
> +		if (rotation & DRM_MODE_REFLECT_Y)
> +			return READ_TOP_TO_BOTTOM;
> +		else
> +			return READ_BOTTOM_TO_TOP;
> +	}
> +	return READ_LEFT_TO_RIGHT;
> +}
> +
>   /**
>    * blend - blend the pixels from all planes and compute crc
>    * @wb: The writeback frame buffer metadata
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 3ead8b39af4a..985e7a92b7bc 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -69,6 +69,17 @@ struct vkms_writeback_job {
>   	pixel_write_t pixel_write;
>   };
>   
> +/**
> + * enum pixel_read_direction - Enum used internally by VKMS to represent a reading direction in a
> + * plane.
> + */
> +enum pixel_read_direction {
> +	READ_BOTTOM_TO_TOP,
> +	READ_TOP_TO_BOTTOM,
> +	READ_RIGHT_TO_LEFT,
> +	READ_LEFT_TO_RIGHT
> +};
> +
>   /**
>    * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
>    * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 649d75d05b1f..743b6fd06db5 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
>   	*addr = (u8 *)frame_info->map[0].vaddr + offset;
>   }
>   
> +/**
> + * get_step_next_block() - Common helper to compute the correct step value between each pixel block
> + * to read in a certain direction.
> + *
> + * As the returned offset is the number of bytes between two consecutive blocks in a direction,
> + * the caller may have to read multiple pixel before using the next one (for example, to read from
> + * left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be used
> + * only every 8 pixels.
> + *
> + * @fb: Framebuffer to iter on
> + * @direction: Direction of the reading
> + * @plane_index: Plane to get the step from

Same.

Best Regards,
- Maíra

> + */
> +static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
> +			       int plane_index)
> +{
> +	switch (direction) {
> +	case READ_LEFT_TO_RIGHT:
> +		return fb->format->char_per_block[plane_index];
> +	case READ_RIGHT_TO_LEFT:
> +		return -fb->format->char_per_block[plane_index];
> +	case READ_TOP_TO_BOTTOM:
> +		return (int)fb->pitches[plane_index];
> +	case READ_BOTTOM_TO_TOP:
> +		return -(int)fb->pitches[plane_index];
> +	}
> +
> +	return 0;
> +}
> +
>   static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
>   				 int plane_index)
>   {
> 
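The two helpers quoted above can be exercised in user space with the sketch below. Everything here is a stand-in: the ROT_* and REFLECT_* bits are local definitions assumed to mirror the DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* bit layout, and struct fb_layout is a hypothetical replacement for the fields read from struct drm_framebuffer.

```c
#include <stdint.h>

/* Local stand-ins, assumed to mirror the DRM rotation/reflection bits. */
#define ROT_0     (1u << 0)
#define ROT_90    (1u << 1)
#define ROT_180   (1u << 2)
#define ROT_270   (1u << 3)
#define REFLECT_X (1u << 4)
#define REFLECT_Y (1u << 5)

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

/* Hypothetical subset of the framebuffer description for one plane. */
struct fb_layout {
	unsigned int bytes_per_block; /* bytes in one pixel block */
	unsigned int pitch;           /* bytes between two consecutive lines */
};

/* Same decision table as the quoted direction_for_rotation(). */
static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
{
	if (rotation & ROT_0)
		return (rotation & REFLECT_X) ? READ_RIGHT_TO_LEFT : READ_LEFT_TO_RIGHT;
	if (rotation & ROT_90)
		return (rotation & REFLECT_Y) ? READ_BOTTOM_TO_TOP : READ_TOP_TO_BOTTOM;
	if (rotation & ROT_180)
		return (rotation & REFLECT_X) ? READ_LEFT_TO_RIGHT : READ_RIGHT_TO_LEFT;
	if (rotation & ROT_270)
		return (rotation & REFLECT_Y) ? READ_TOP_TO_BOTTOM : READ_BOTTOM_TO_TOP;
	return READ_LEFT_TO_RIGHT;
}

/* Byte offset from one block to the next one in the given direction. */
static int get_step_next_block(const struct fb_layout *fb,
			       enum pixel_read_direction direction)
{
	switch (direction) {
	case READ_LEFT_TO_RIGHT:
		return (int)fb->bytes_per_block;
	case READ_RIGHT_TO_LEFT:
		return -(int)fb->bytes_per_block;
	case READ_TOP_TO_BOTTOM:
		return (int)fb->pitch;
	case READ_BOTTOM_TO_TOP:
		return -(int)fb->pitch;
	}
	return 0;
}
```

For a 4-byte-per-pixel buffer with a 256-byte pitch, horizontal reads step by 4 bytes and vertical reads by 256 bytes, with the sign giving the direction.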

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-13 17:45 ` [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm Louis Chauvet
@ 2024-03-25 14:15   ` Maíra Canal
  2024-03-25 14:56     ` Pekka Paalanen
  2024-03-26 15:57     ` Louis Chauvet
  2024-03-25 15:43   ` Pekka Paalanen
  1 sibling, 2 replies; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 14:15 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:45, Louis Chauvet wrote:
> Re-introduce a line-by-line composition algorithm for each pixel format.
> This allows more performance by not requiring an indirection per pixel
> read. This patch is focused on readability of the code.
> 
> Line-by-line composition was introduced by [1] but rewritten back to
> pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
> on performance, and it was merged.
> 
> This patch is almost a revert of [2], but in addition efforts have been
> made to increase readability and maintainability of the rotation handling.
> The blend function is now divided in two parts:
> - Transformation of coordinates from the output referential to the source
> referential
> - Line conversion and blending
> 
> Most of the complexity of the rotation management is avoided by using
> drm_rect_* helpers. The remaining complexity is around the clipping, to
> avoid reading/writing outside source/destination buffers.
> 
> The pixel conversion is now done line-by-line, so the read_pixel_t was
> replaced with read_pixel_line_t callback. This way the indirection is only
> required once per line and per plane, instead of once per pixel and per
> plane.
> 
> The read_line_t callbacks are very similar for most pixel formats, but this
> duplication is required to avoid a performance impact. Some helpers for color
> conversion were introduced to avoid code repetition:
> - *_to_argb_u16: perform colors conversion. They should be inlined by the
>    compiler, and they are used to avoid repetition between multiple variants
>    of the same format (argb/xrgb and maybe in the future for formats like
>    bgr formats).
> 
> This new algorithm was tested with:
> - kms_plane (for color conversions)
> - kms_rotation_crc (for rotations of planes)
> - kms_cursor_crc (for translations of planes)
> - kms_rotation (for all rotations and formats combinations) [3]
> The performance gain was measured with:
> - kms_fb_stress

Could you tell us what the performance gain was?

> 
> [1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
>       new formats")
>       https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
> [2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
>       functionality")
>       https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
> [3]:
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_composer.c | 167 +++++++++++++++++++------
>   drivers/gpu/drm/vkms/vkms_drv.h      |  27 ++--
>   drivers/gpu/drm/vkms/vkms_formats.c  | 236 ++++++++++++++++++++++-------------
>   drivers/gpu/drm/vkms/vkms_formats.h  |   2 +-
>   drivers/gpu/drm/vkms/vkms_plane.c    |   5 +-
>   5 files changed, 292 insertions(+), 145 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index 989bcf59f375..5d78c33dbf41 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -41,7 +41,7 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
>   				struct line_buffer *output_buffer, int x_start, int pixel_count)
>   {
>   	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> -	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> +	const struct pixel_argb_u16 *in = &stage_buffer->pixels[x_start];
>   
>   	for (int i = 0; i < pixel_count; i++) {
>   		out[i].a = (u16)0xffff;
> @@ -51,33 +51,6 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
>   	}
>   }
>   
> -static int get_y_pos(struct vkms_frame_info *frame_info, int y)
> -{
> -	if (frame_info->rotation & DRM_MODE_REFLECT_Y)
> -		return drm_rect_height(&frame_info->rotated) - y - 1;
> -
> -	switch (frame_info->rotation & DRM_MODE_ROTATE_MASK) {
> -	case DRM_MODE_ROTATE_90:
> -		return frame_info->rotated.x2 - y - 1;
> -	case DRM_MODE_ROTATE_270:
> -		return y + frame_info->rotated.x1;
> -	default:
> -		return y;
> -	}
> -}
> -
> -static bool check_limit(struct vkms_frame_info *frame_info, int pos)
> -{
> -	if (drm_rotation_90_or_270(frame_info->rotation)) {
> -		if (pos >= 0 && pos < drm_rect_width(&frame_info->rotated))
> -			return true;
> -	} else {
> -		if (pos >= frame_info->rotated.y1 && pos < frame_info->rotated.y2)
> -			return true;
> -	}
> -
> -	return false;
> -}
>   
>   static void fill_background(const struct pixel_argb_u16 *background_color,
>   			    struct line_buffer *output_buffer)
> @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
>   {
>   	struct vkms_plane_state **plane = crtc_state->active_planes;
>   	u32 n_active_planes = crtc_state->num_active_planes;
> -	int y_pos, x_dst, x_limit;
>   
>   	const struct pixel_argb_u16 background_color = { .a = 0xffff };
>   
> -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;

Shouldn't it be `unsigned int`?

>   
>   	/*
>   	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
>   	 * complexity to avoid poor blending performance.
>   	 *
> -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> -	 * buffer.
> +	 * The function pixel_read_line callback is used to read a line, using an efficient
> +	 * algorithm for a specific format, into the staging buffer.
>   	 */
>   	for (size_t y = 0; y < crtc_y_limit; y++) {
>   		fill_background(&background_color, output_buffer);
>   
>   		/* The active planes are composed associatively in z-order. */
>   		for (size_t i = 0; i < n_active_planes; i++) {
> -			x_dst = plane[i]->frame_info->dst.x1;
> -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> -					stage_buffer->n_pixels);
> -			y_pos = get_y_pos(plane[i]->frame_info, y);
> +			struct vkms_plane_state *current_plane = plane[i];
>   
> -			if (!check_limit(plane[i]->frame_info, y_pos))
> +			/* Avoid rendering useless lines */
> +			if (y < current_plane->frame_info->dst.y1 ||
> +			    y >= current_plane->frame_info->dst.y2)
>   				continue;
>   
> -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> +			/*
> +			 * dst_line is the line to copy. The initial coordinates are inside the
> +			 * destination framebuffer, and then drm_rect_* helpers are used to
> +			 * compute the correct position into the source framebuffer.
> +			 */
> +			struct drm_rect dst_line = DRM_RECT_INIT(

Please run checkpatch on this patch.

> +				current_plane->frame_info->dst.x1, y,
> +				drm_rect_width(&current_plane->frame_info->dst), 1);
> +			struct drm_rect tmp_src;
> +
> +			drm_rect_fp_to_int(&tmp_src, &current_plane->frame_info->src);
> +
> +			/*
> +			 * [1]: Clamping dst_line to the crtc_x_limit to avoid writing outside of
> +			 * the destination buffer
> +			 */
> +			dst_line.x1 = max_t(int, dst_line.x1, 0);
> +			dst_line.x2 = min_t(int, dst_line.x2, crtc_x_limit);
> +			/* The destination is completely outside of the crtc. */
> +			if (dst_line.x2 <= dst_line.x1)
> +				continue;
> +
> +			struct drm_rect src_line = dst_line;
> +
> +			/*
> +			 * Transform the x/y coordinates from the crtc into coordinates for the
> +			 * src buffer.
> +			 *
> +			 * - Cancel the offset of the dst buffer.
> +			 * - Invert the rotation. This assumes that
> +			 *   dst = drm_rect_rotate(src, rotation) (dst and src have the
> +			 *   same size, but can be rotated).
> +			 * - Apply the offset of the source rectangle to the coordinate.
> +			 */
> +			drm_rect_translate(&src_line, -current_plane->frame_info->dst.x1,
> +					   -current_plane->frame_info->dst.y1);
> +			drm_rect_rotate_inv(&src_line,
> +					    drm_rect_width(&tmp_src),
> +					    drm_rect_height(&tmp_src),
> +					    current_plane->frame_info->rotation);
> +			drm_rect_translate(&src_line, tmp_src.x1, tmp_src.y1);
> +
> +			/* Get the correct reading direction in the source buffer. */
> +
> +			enum pixel_read_direction direction =
> +				direction_for_rotation(current_plane->frame_info->rotation);
> +
> +			int x_start = src_line.x1;
> +			int y_start = src_line.y1;
> +			int pixel_count;
> +			/* [2]: Compute and clamp the number of pixels to read */
> +			if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
> +				/*
> +				 * In horizontal reading, the src_line width is the number of pixels
> +				 * to read
> +				 */
> +				pixel_count = drm_rect_width(&src_line);
> +				if (x_start < 0) {
> +					pixel_count += x_start;
> +					x_start = 0;
> +				}
> +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> +					pixel_count =
> +						(int)current_plane->frame_info->fb->width - x_start;
> +				}
> +			} else {
> +				/*
> +				 * In vertical reading, the src_line height is the number of pixels
> +				 * to read
> +				 */
> +				pixel_count = drm_rect_height(&src_line);
> +				if (y_start < 0) {
> +					pixel_count += y_start;
> +					y_start = 0;
> +				}
> +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> +					pixel_count =
> +						(int)current_plane->frame_info->fb->height - y_start;
> +				}
> +			}
> +
> +			if (pixel_count <= 0) {
> +				/* Nothing to read, so avoid multiple function calls for nothing */
> +				continue;
> +			}
> +
> +			/*
> +			 * Modify the starting point to take into account the rotation
> +			 *
> +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> +			 * corner.
> +			 */
> +			if (direction == READ_RIGHT_TO_LEFT) {
> +				// x_start is now the right point
> +				x_start += pixel_count - 1;
> +			} else if (direction == READ_BOTTOM_TO_TOP) {
> +				// y_start is now the bottom point
> +				y_start += pixel_count - 1;
> +			}

Any chance this code could be a separate function? I believe it would
make it more readable.
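
A rough sketch of what such a helper could look like, written as standalone C so it can be tried outside the kernel. The name clamp_read_line, its parameter list, and the redeclared enum are made up for illustration. It folds the [2] clamping and the start-point adjustment into one place, and clamps vertical reads against the framebuffer height.

```c
#include <stdbool.h>

/* Redeclared here so the sketch builds on its own. */
enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

/*
 * Hypothetical helper: clamp the line described by (x1, y1, width, height)
 * to a framebuffer of fb_width x fb_height, then return the first pixel to
 * read and the number of pixels. Returns false when nothing is left to read.
 */
static bool clamp_read_line(enum pixel_read_direction direction,
			    int x1, int y1, int width, int height,
			    int fb_width, int fb_height,
			    int *x_start, int *y_start, int *pixel_count)
{
	bool horizontal = direction == READ_LEFT_TO_RIGHT ||
			  direction == READ_RIGHT_TO_LEFT;
	/* Only the coordinate along the reading axis needs clamping. */
	int *pos = horizontal ? &x1 : &y1;
	int limit = horizontal ? fb_width : fb_height;
	int count = horizontal ? width : height;

	if (*pos < 0) {
		count += *pos;
		*pos = 0;
	}
	if (*pos + count > limit)
		count = limit - *pos;
	if (count <= 0)
		return false;

	/* Reversed reads start from the far end of the clamped run. */
	if (direction == READ_RIGHT_TO_LEFT)
		x1 += count - 1;
	else if (direction == READ_BOTTOM_TO_TOP)
		y1 += count - 1;

	*x_start = x1;
	*y_start = y1;
	*pixel_count = count;
	return true;
}
```

blend() would then only need a single call and a `continue` when the helper returns false.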

Best Regards,
- Maíra

> +
> +			/*
> +			 * Perform the conversion and the blending
> +			 *
> +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> +			 * inside the source buffer [2] and we don't write outside the stage
> +			 * buffer [1]
> +			 */
> +			current_plane->pixel_read_line(
> +				current_plane, x_start, y_start, direction, pixel_count,
> +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> +
> +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> +					    current_plane->frame_info->dst.x1,
> +					    pixel_count);
>   		}
>   
>   		apply_lut(crtc_state, output_buffer);
> @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
>   		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
>   
>   		if (wb)
> -			vkms_writeback_row(wb, output_buffer, y_pos);
> +			vkms_writeback_row(wb, output_buffer, y);
>   	}
>   }
>   
> @@ -261,7 +346,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
>   	u32 n_active_planes = crtc_state->num_active_planes;
>   
>   	for (size_t i = 0; i < n_active_planes; i++)
> -		if (!planes[i]->pixel_read)
> +		if (!planes[i]->pixel_read_line)
>   			return -1;
>   
>   	if (active_wb && !active_wb->pixel_write)
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 985e7a92b7bc..23e1d247468d 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -39,7 +39,6 @@
>   struct vkms_frame_info {
>   	struct drm_framebuffer *fb;
>   	struct drm_rect src, dst;
> -	struct drm_rect rotated;
>   	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
>   	unsigned int rotation;
>   };
> @@ -80,26 +79,37 @@ enum pixel_read_direction {
>   	READ_LEFT_TO_RIGHT
>   };
>   
> +struct vkms_plane_state;
> +
>   /**
> - * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> + * typedef pixel_read_line_t - These functions are used to read a pixel line in the source frame,
>    * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
>    *
> - * @in_pixel: Pointer to the pixel to read
> - * @out_pixel: Pointer to write the converted pixel
> + * @plane: Plane used as source for the pixel value
> + * @x_start: X (width) coordinate of the first pixel to copy. The caller must ensure that x_start
> + * is positive and smaller than @plane->frame_info->fb->width.
> + * @y_start: Y (height) coordinate of the first pixel to copy. The caller must ensure that y_start
> + * is positive and smaller than @plane->frame_info->fb->height.
> + * @direction: Direction to use for the copy, starting at @x_start/@y_start
> + * @count: Number of pixels to copy
> + * @out_pixel: Pointer where to write the pixel values. They will be written from @out_pixel[0]
> + * to @out_pixel[@count - 1]. The caller must ensure that out_pixel has a length of at least @count.
>    */
> -typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_start,
> +				  int y_start, enum pixel_read_direction direction, int count,
> +				  struct pixel_argb_u16 out_pixel[]);
>   
>   /**
>    * vkms_plane_state - Driver specific plane state
>    * @base: base plane state
>    * @frame_info: data required for composing computation
> - * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> - * ensure that this pointer is valid
> + * @pixel_read_line: function to read a pixel line in this plane. The creator of a vkms_plane_state
> + * must ensure that this pointer is valid
>    */
>   struct vkms_plane_state {
>   	struct drm_shadow_plane_state base;
>   	struct vkms_frame_info *frame_info;
> -	pixel_read_t pixel_read;
> +	pixel_read_line_t pixel_read_line;
>   };
>   
>   struct vkms_plane {
> @@ -204,7 +214,6 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
>   /* Composer Support */
>   void vkms_composer_worker(struct work_struct *work);
>   void vkms_set_composer(struct vkms_output *out, bool enabled);
> -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
>   void vkms_writeback_row(struct vkms_writeback_job *wb, const struct line_buffer *src_buffer, int y);
>   
>   /* Writeback */
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 743b6fd06db5..1449a0e6c706 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,77 +105,45 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>   	return 0;
>   }
>   
> -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> -				 int plane_index)
> -{
> -	int x_src = frame_info->src.x1 >> 16;
> -	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> -	u8 *addr;
> -	int rem_x, rem_y;
> -
> -	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
> -	return addr;
> -}
> -
> -static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> -{
> -	if (frame_info->rotation & (DRM_MODE_REFLECT_X | DRM_MODE_ROTATE_270))
> -		return limit - x - 1;
> -	return x;
> -}
> -
>   /*
> - * The following  functions take pixel data from the buffer and convert them to the format
> + * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>    * ARGB16161616 in out_pixel.
>    *
> - * They are used in the `vkms_compose_row` function to handle multiple formats.
> + * They are used in the `read_line`s functions to avoid duplicate work for some pixel formats.
>    */
>   
> -static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_u8888(int a, int r, int g, int b)
>   {
> +	struct pixel_argb_u16 out_pixel;
>   	/*
>   	 * The 257 is the "conversion ratio". This number is obtained by the
>   	 * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
>   	 * the best color value in a pixel format with more possibilities.
>   	 * A similar idea applies to others RGB color conversions.
>   	 */
> -	out_pixel->a = (u16)in_pixel[3] * 257;
> -	out_pixel->r = (u16)in_pixel[2] * 257;
> -	out_pixel->g = (u16)in_pixel[1] * 257;
> -	out_pixel->b = (u16)in_pixel[0] * 257;
> -}
> +	out_pixel.a = (u16)a * 257;
> +	out_pixel.r = (u16)r * 257;
> +	out_pixel.g = (u16)g * 257;
> +	out_pixel.b = (u16)b * 257;
>   
> -static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> -{
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = (u16)in_pixel[2] * 257;
> -	out_pixel->g = (u16)in_pixel[1] * 257;
> -	out_pixel->b = (u16)in_pixel[0] * 257;
> +	return out_pixel;
>   }
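
As a side note on the 257 factor: 65535 / 255 is exactly 257, so the multiplication maps 0 to 0 and 255 to 65535, and is the same as replicating the byte into both halves of the 16-bit value. A tiny standalone sketch (the helper name is mine, not from the patch):

```c
#include <stdint.h>

/* Expand an 8-bit channel to 16 bits; v * 257 == (v << 8) | v. */
static uint16_t expand_u8_to_u16(uint8_t v)
{
	return (uint16_t)(v * 257u);
}
```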
>   
> -static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_u16161616(int a, int r, int g, int b)
>   {
> -	u16 *pixel = (u16 *)in_pixel;
> +	struct pixel_argb_u16 out_pixel;
>   
> -	out_pixel->a = le16_to_cpu(pixel[3]);
> -	out_pixel->r = le16_to_cpu(pixel[2]);
> -	out_pixel->g = le16_to_cpu(pixel[1]);
> -	out_pixel->b = le16_to_cpu(pixel[0]);
> -}
> +	out_pixel.a = le16_to_cpu(a);
> +	out_pixel.r = le16_to_cpu(r);
> +	out_pixel.g = le16_to_cpu(g);
> +	out_pixel.b = le16_to_cpu(b);
>   
> -static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> -{
> -	u16 *pixel = (u16 *)in_pixel;
> -
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = le16_to_cpu(pixel[2]);
> -	out_pixel->g = le16_to_cpu(pixel[1]);
> -	out_pixel->b = le16_to_cpu(pixel[0]);
> +	return out_pixel;
>   }
>   
> -static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>   {
> -	u16 *pixel = (u16 *)in_pixel;
> +	struct pixel_argb_u16 out_pixel;
>   
>   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> @@ -185,12 +153,26 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
>   	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
>   	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
>   
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> -	out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> -	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> +	out_pixel.a = (u16)0xffff;
> +	out_pixel.r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> +	out_pixel.g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> +	out_pixel.b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> +
> +	return out_pixel;
>   }
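
For readers unfamiliar with the RGB565 case, the idea is the same as for 8-bit channels, except that 65535 / 31 and 65535 / 63 are not integers, hence the fixed-point ratios. Below is a standalone integer-only sketch of the unpacking and scaling. It assumes the 16-bit value is already in CPU byte order, and its round-to-nearest may not match the drm_fixp result bit for bit; the struct and function names are local to the sketch.

```c
#include <stdint.h>

struct argb_u16 {
	uint16_t a, r, g, b;
};

/* Unpack 5/6/5 bit fields and scale each one to the full 16-bit range. */
static struct argb_u16 argb_u16_from_rgb565(uint16_t rgb_565)
{
	uint32_t r5 = (rgb_565 >> 11) & 0x1f;
	uint32_t g6 = (rgb_565 >> 5) & 0x3f;
	uint32_t b5 = rgb_565 & 0x1f;
	struct argb_u16 out;

	out.a = 0xffff;
	/* Scale by 65535/31 (or 65535/63), rounding to nearest. */
	out.r = (uint16_t)((r5 * 65535u + 15u) / 31u);
	out.g = (uint16_t)((g6 * 65535u + 31u) / 63u);
	out.b = (uint16_t)((b5 * 65535u + 15u) / 31u);
	return out;
}
```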
>   
> +/*
> + * The following functions are the read_line functions for each pixel format supported by VKMS.
> + *
> + * They read a line starting at the point @x_start,@y_start following the @direction. The result
> + * is stored in @out_pixel and in the format ARGB16161616.
> + *
> + * These functions are very similar, but the duplication is required for performance reasons. In
> + * the past, some experiments were done, and a generic loop reduced performance significantly [1].
> + *
> + * [1]: https://lore.kernel.org/dri-devel/d258c8dc-78e9-4509-9037-a98f7f33b3a3@riseup.net/
> + */
> +
>   /**
>    * black_to_argb_u16() - pixel_read callback which always read black
>    *
> @@ -198,42 +180,116 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
>    * It is used to avoid null pointer to be used as a function. In theory, this function should
>    * never be called, except if you found a bug in the driver/DRM core.
>    */
> -static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
> +			      int y_start, enum pixel_read_direction direction, int count,
> +			      struct pixel_argb_u16 out_pixel[])
>   {
> -	out_pixel->a = (u16)0xFFFF;
> -	out_pixel->r = 0;
> -	out_pixel->g = 0;
> -	out_pixel->b = 0;
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +
> +	while (out_pixel < end) {
> +		*out_pixel = argb_u16_from_u8888(255, 0, 0, 0);
> +		out_pixel += 1;
> +	}
>   }
>   
> -/**
> - * vkms_compose_row - compose a single row of a plane
> - * @stage_buffer: output line with the composed pixels
> - * @plane: state of the plane that is being composed
> - * @y: y coordinate of the row
> - *
> - * This function composes a single row of a plane. It gets the source pixels
> - * through the y coordinate (see get_packed_src_addr()) and goes linearly
> - * through the source pixel, reading the pixels and converting it to
> - * ARGB16161616 (see the pixel_read() callback). For rotate-90 and rotate-270,
> - * the source pixels are not traversed linearly. The source pixels are queried
> - * on each iteration in order to traverse the pixels vertically.
> - */
> -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y)
> +static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> +			       enum pixel_read_direction direction, int count,
> +			       struct pixel_argb_u16 out_pixel[])
>   {
> -	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
> -	struct vkms_frame_info *frame_info = plane->frame_info;
> -	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
> -	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u8 *px = (u8 *)src_pixels;
> +		*out_pixel = argb_u16_from_u8888(px[3], px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
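
To check my understanding of the per-format read_line pattern, here is a self-contained userspace sketch of it for a packed 4-bytes-per-pixel buffer. Everything in it is a stand-in: step_to_next_pixel() is my guess at what get_step_next_block() returns for a packed format (plus or minus one pixel horizontally, plus or minus one pitch vertically), and the buffer is passed as a plain byte pointer instead of going through packed_pixels_addr().

```c
#include <stddef.h>
#include <stdint.h>

enum read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

struct argb_u16 {
	uint16_t a, r, g, b;
};

/* Byte offset from one pixel to the next one in the reading direction. */
static ptrdiff_t step_to_next_pixel(enum read_direction direction,
				    size_t pitch, size_t cpp)
{
	switch (direction) {
	case READ_LEFT_TO_RIGHT:
		return (ptrdiff_t)cpp;
	case READ_RIGHT_TO_LEFT:
		return -(ptrdiff_t)cpp;
	case READ_TOP_TO_BOTTOM:
		return (ptrdiff_t)pitch;
	case READ_BOTTOM_TO_TOP:
		return -(ptrdiff_t)pitch;
	}
	return 0;
}

/* Read count ARGB8888 pixels starting at (x_start, y_start). */
static void argb8888_read_line(const uint8_t *fb, size_t pitch,
			       int x_start, int y_start,
			       enum read_direction direction, int count,
			       struct argb_u16 out_pixel[])
{
	const uint8_t *first = fb + (size_t)y_start * pitch + (size_t)x_start * 4;
	ptrdiff_t step = step_to_next_pixel(direction, pitch, 4);

	for (int i = 0; i < count; i++) {
		/* Only addresses of pixels that are actually read get computed. */
		const uint8_t *px = first + i * step;

		out_pixel[i].a = (uint16_t)(px[3] * 257u);
		out_pixel[i].r = (uint16_t)(px[2] * 257u);
		out_pixel[i].g = (uint16_t)(px[1] * 257u);
		out_pixel[i].b = (uint16_t)(px[0] * 257u);
	}
}
```

The caller is expected to have clamped x_start, y_start and count beforehand, as blend() does, since the loop itself performs no bounds checking.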
> +
> +static void XRGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> +			       enum pixel_read_direction direction, int count,
> +			       struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u8 *px = (u8 *)src_pixels;
> +		*out_pixel = argb_u16_from_u8888(255, px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void ARGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> +				   int y_start, enum pixel_read_direction direction, int count,
> +				   struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
> +		*out_pixel = argb_u16_from_u16161616(px[3], px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void XRGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> +				   int y_start, enum pixel_read_direction direction, int count,
> +				   struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
> +		*out_pixel = argb_u16_from_u16161616(0xFFFF, px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> +			     int y_start, enum pixel_read_direction direction, int count,
> +			     struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
>   
> -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> -		int x_pos = get_x_position(frame_info, limit, x);
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
>   
> -		if (drm_rotation_90_or_270(frame_info->rotation))
> -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
> -				+ frame_info->fb->format->cpp[0] * y;
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
>   
> -		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> +		*out_pixel = argb_u16_from_RGB565(px);
> +		out_pixel += 1;
> +		src_pixels += step;
>   	}
>   }
>   
> @@ -343,25 +399,25 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>   }
>   
>   /**
> - * Retrieve the correct read_pixel function for a specific format.
> + * Retrieve the correct read_line function for a specific format.
>    * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
>    * function is returned.
>    *
>    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>    */
> -pixel_read_t get_pixel_read_function(u32 format)
> +pixel_read_line_t get_pixel_read_line_function(u32 format)
>   {
>   	switch (format) {
>   	case DRM_FORMAT_ARGB8888:
> -		return &ARGB8888_to_argb_u16;
> +		return &ARGB8888_read_line;
>   	case DRM_FORMAT_XRGB8888:
> -		return &XRGB8888_to_argb_u16;
> +		return &XRGB8888_read_line;
>   	case DRM_FORMAT_ARGB16161616:
> -		return &ARGB16161616_to_argb_u16;
> +		return &ARGB16161616_read_line;
>   	case DRM_FORMAT_XRGB16161616:
> -		return &XRGB16161616_to_argb_u16;
> +		return &XRGB16161616_read_line;
>   	case DRM_FORMAT_RGB565:
> -		return &RGB565_to_argb_u16;
> +		return &RGB565_read_line;
>   	default:
>   		/*
>   		 * This is a bug in vkms_plane_atomic_check. All the supported
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 3ecea4563254..8d2bef95ff79 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -5,7 +5,7 @@
>   
>   #include "vkms_drv.h"
>   
> -pixel_read_t get_pixel_read_function(u32 format);
> +pixel_read_line_t get_pixel_read_line_function(u32 format);
>   
>   pixel_write_t get_pixel_write_function(u32 format);
>   
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 10e9b23dab28..8875bed76410 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -112,7 +112,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	frame_info = vkms_plane_state->frame_info;
>   	memcpy(&frame_info->src, &new_state->src, sizeof(struct drm_rect));
>   	memcpy(&frame_info->dst, &new_state->dst, sizeof(struct drm_rect));
> -	memcpy(&frame_info->rotated, &new_state->dst, sizeof(struct drm_rect));
>   	frame_info->fb = fb;
>   	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
>   	drm_framebuffer_get(frame_info->fb);
> @@ -122,10 +121,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   									  DRM_MODE_REFLECT_X |
>   									  DRM_MODE_REFLECT_Y);
>   
> -	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> -			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>   
> -	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> +	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
>   }
>   
>   static int vkms_plane_atomic_check(struct drm_plane *plane,
> 


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
  2024-03-13 19:20   ` Randy Dunlap
@ 2024-03-25 14:26   ` Maíra Canal
  2024-03-26 15:57     ` Louis Chauvet
  2024-03-27 12:11   ` Philipp Zabel
  2024-03-27 14:23   ` Pekka Paalanen
  3 siblings, 1 reply; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 14:26 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:45, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support to the YUV formats below:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 fixed-point values and the drm_fixed helpers.
> 
> To do the conversion, a specific matrix must be used for each color range
> (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> the `conversion_matrix` struct, along with the specific y_offset needed.
> This matrix is queried only once, in `vkms_plane_atomic_update` and
> stored in a `vkms_plane_state`. Those conversion matrices of each
> encoding and range were obtained by rounding the values of the original
> conversion matrices multiplied by 2^32. This is done to avoid the use of
> floating point operations.
> 
> The same reading function is used for YUV and YVU formats. As the only
> difference between these two categories of formats is the order of the
> fields, a simple swap of conversion matrix columns allows using the same
> function.
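
To convince myself of the arithmetic described above, I wrote the following standalone sketch. It is not the code from the patch: the struct is a local copy of struct conversion_matrix, the output is clamped to 8 bits for simplicity, and centring the chroma samples on 128 is my assumption. The BT.601 full-range values are copied from the patch.

```c
#include <stdint.h>

/* Local stand-in for the patch's struct conversion_matrix (32.32 fixed point). */
struct conversion_matrix {
	int64_t matrix[3][3];
	int64_t y_offset;
};

/* BT.601 full-range values copied from the patch. */
static const struct conversion_matrix yuv_bt601_full = {
	.matrix = {
		{ 4294967296, 0,           6021544149 },
		{ 4294967296, -1478054095, -3067191994 },
		{ 4294967296, 7610682049,  0 },
	},
	.y_offset = 0,
};

/* Round a 32.32 fixed-point value to an integer and clamp it to 0..255. */
static uint8_t fixp_to_u8(int64_t v)
{
	if (v <= 0)
		return 0;
	v = (v + (INT64_C(1) << 31)) >> 32;
	return v > 255 ? 255 : (uint8_t)v;
}

/* c1 and c2 are the chroma samples in the order the matrix expects. */
static void convert_to_rgb8(const struct conversion_matrix *m,
			    uint8_t y, uint8_t c1, uint8_t c2, uint8_t rgb[3])
{
	int64_t in[3] = { y - m->y_offset, c1 - 128, c2 - 128 };

	for (int row = 0; row < 3; row++) {
		int64_t acc = 0;

		for (int col = 0; col < 3; col++)
			acc += m->matrix[row][col] * in[col];
		rgb[row] = fixp_to_u8(acc);
	}
}

/* Build the YVU variant by swapping the two chroma columns. */
static struct conversion_matrix swap_chroma_columns(struct conversion_matrix m)
{
	for (int row = 0; row < 3; row++) {
		int64_t tmp = m.matrix[row][1];

		m.matrix[row][1] = m.matrix[row][2];
		m.matrix[row][2] = tmp;
	}
	return m;
}
```

Feeding (y, v, u) through the swapped matrix gives the same RGB triple as feeding (y, u, v) through the original one, which is why a single reading function can serve both families of formats.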
> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]

A Co-developed-by tag would be more appropriate.

> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>   4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>   				  int y_start, enum pixel_read_direction direction, int count,
>   				  struct pixel_argb_u16 out_pixel[]);
>   
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> + * @y_offest: Offset to apply on the y value.

s/y_offest/y_offset

> + */
> +struct conversion_matrix {
> +	s64 matrix[3][3];
> +	s64 y_offset;
> +};
> +
>   /**
>    * vkms_plane_state - Driver specific plane state
>    * @base: base plane state
> @@ -110,6 +131,7 @@ struct vkms_plane_state {
>   	struct drm_shadow_plane_state base;
>   	struct vkms_frame_info *frame_info;
>   	pixel_read_line_t pixel_read_line;
> +	struct conversion_matrix *conversion_matrix;

Add @conversion_matrix to the kernel-doc of struct
vkms_plane_state.

>   };
>   
>   struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>   	return 0;
>   }
>   
> +/**
> + * get_subsampling() - Get the subsampling divisor value on a specific direction

Where are the arguments?
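
For instance, something along these lines, with the arguments documented right after the short description, which is the layout kernel-doc expects. The wording of the @-lines and of the Return: section is only a suggestion, and the enum and struct below are simplified stand-ins for the kernel types so that the snippet builds on its own (the WARN_ONCE() is left out for the same reason).

```c
/* Simplified stand-ins for the kernel types used by the patch. */
enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

struct format_info {
	unsigned char hsub; /* horizontal chroma subsampling factor */
	unsigned char vsub; /* vertical chroma subsampling factor */
};

/**
 * get_subsampling() - Get the subsampling divisor value on a specific direction
 *
 * @format: Format whose chroma subsampling factors are queried
 * @direction: Direction in which the pixels are read
 *
 * Return: the vertical factor for vertical reads, the horizontal factor for
 * horizontal reads, and 1 for an invalid direction.
 */
static int get_subsampling(const struct format_info *format,
			   enum pixel_read_direction direction)
{
	switch (direction) {
	case READ_BOTTOM_TO_TOP:
	case READ_TOP_TO_BOTTOM:
		return format->vsub;
	case READ_RIGHT_TO_LEFT:
	case READ_LEFT_TO_RIGHT:
		return format->hsub;
	}
	return 1;
}
```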

> + */
> +static int get_subsampling(const struct drm_format_info *format,
> +			   enum pixel_read_direction direction)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +	case READ_TOP_TO_BOTTOM:
> +		return format->vsub;
> +	case READ_RIGHT_TO_LEFT:
> +	case READ_LEFT_TO_RIGHT:
> +		return format->hsub;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 1;
> +}
> +
> +/**
> + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> + * x_start and y_start values

Same.

> + */
> +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +		return -y_start - 1;
> +	case READ_TOP_TO_BOTTOM:
> +		return y_start;
> +	case READ_RIGHT_TO_LEFT:
> +		return -x_start - 1;
> +	case READ_LEFT_TO_RIGHT:
> +		return x_start;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 0;
> +}
> +
>   /*
>    * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>    * ARGB16161616 in out_pixel.
> @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>   	return out_pixel;
>   }
>   

[...]

>   
> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it returns a simple
> + * diagonal matrix, which acts as a "no-op".
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix

A bit odd to see the arguments after the description.

> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {
> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrices were generated using the colour python framework
> +	 *
> +	 * Below are the function calls used to generate each matrix, go to
> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 0,           6021544149 },
> +			{ 4294967296, -1478054095, -3067191994 },
> +			{ 4294967296, 7610682049,  0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,           6881764740 },
> +			{ 5020601039, -1689204679, -3505362278 },
> +			{ 5020601039, 8697922339,  0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6763714498 },
> +			{ 4294967296, -804551626, -2010578443 },
> +			{ 4294967296, 7969741314, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7729959424 },
> +			{ 5020601039, -919487572, -2297803934 },
> +			{ 5020601039, 9108275786, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6333358775 },
> +			{ 4294967296, -706750298, -2453942994 },
> +			{ 4294967296, 8080551471, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7238124312 },
> +			{ 5020601039, -807714626, -2804506279 },
> +			{ 5020601039, 9234915964, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * The next matrices are just the previous ones, but with the second and
> +	 * third columns (the two chroma columns) swapped
> +	 */
> +	static struct conversion_matrix yvu_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 6021544149,  0 },
> +			{ 4294967296, -3067191994, -1478054095 },
> +			{ 4294967296, 0,           7610682049 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 6881764740,  0 },
> +			{ 5020601039, -3505362278, -1689204679 },
> +			{ 5020601039, 0,           8697922339 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 6763714498,  0 },
> +			{ 4294967296, -2010578443, -804551626 },
> +			{ 4294967296, 0,           7969741314 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 7729959424,  0 },
> +			{ 5020601039, -2297803934, -919487572 },
> +			{ 5020601039, 0,           9108275786 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 6333358775,  0 },
> +			{ 4294967296, -2453942994, -706750298 },
> +			{ 4294967296, 0,           8080551471 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 7238124312,  0 },
> +			{ 5020601039, -2804506279, -807714626 },
> +			{ 5020601039, 0,           9234915964 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/* Breaking in this switch means that the color format+encoding+range is not supported */

s/color format+encoding+range/color format + encoding + range

> +	switch (format) {
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_ARGB8888:
> +	case DRM_FORMAT_XRGB8888:
> +	case DRM_FORMAT_ARGB16161616:
> +	case DRM_FORMAT_XRGB16161616:
> +	case DRM_FORMAT_RGB565:
> +		/*
> +		 * Those formats are supported, but they don't need a conversion matrix. Return
> +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> +		 * somewhere.
> +		 */
> +		return &no_operation;
> +	default:
> +		break;
> +	}
> +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> +	     encoding, range, &format);
> +	return &no_operation;
> +}
> +
>   /**
>    * Retrieve the correct write_pixel function for a specific format.
>    * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 8d2bef95ff79..e1d324764b17 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
>   
>   pixel_write_t get_pixel_write_function(u32 format);
>   
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range);
> +
>   #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
>   	DRM_FORMAT_XRGB8888,
>   	DRM_FORMAT_XRGB16161616,
>   	DRM_FORMAT_ARGB16161616,
> -	DRM_FORMAT_RGB565
> +	DRM_FORMAT_RGB565,
> +	DRM_FORMAT_NV12,
> +	DRM_FORMAT_NV16,
> +	DRM_FORMAT_NV24,
> +	DRM_FORMAT_NV21,
> +	DRM_FORMAT_NV61,
> +	DRM_FORMAT_NV42,
> +	DRM_FORMAT_YUV420,
> +	DRM_FORMAT_YUV422,
> +	DRM_FORMAT_YUV444,
> +	DRM_FORMAT_YVU420,
> +	DRM_FORMAT_YVU422,
> +	DRM_FORMAT_YVU444

Let's add a trailing comma to this entry, so that adding a new format
later doesn't modify this line in the diff.
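
To illustrate the suggestion, here is a minimal standalone sketch (plain
C, not VKMS code; the array name and the values are hypothetical
placeholders). C accepts a comma after the last initializer, and it does
not change the number of elements:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format codes; the values are placeholders for illustration. */
static const unsigned int example_formats[] = {
	0x1001,
	0x1002,
	0x1003,	/* trailing comma: appending 0x1004 later is a one-line addition */
};

/* Element count; the trailing comma does not add an element. */
static size_t example_format_count(void)
{
	return sizeof(example_formats) / sizeof(example_formats[0]);
}

/* Sanity check: three initializers, three elements. */
static void example_formats_self_check(void)
{
	assert(example_format_count() == 3);
	assert(example_formats[2] == 0x1003);
}
```

With the comma in place, a later patch that appends an entry shows up as
a single added line instead of one removed line plus two added lines.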

>   };
>   
>   static struct drm_plane_state *
> @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	drm_framebuffer_get(frame_info->fb);
>   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
>   									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_180 |

Why do we need to add DRM_MODE_ROTATE_180 here? Isn't it the same as
reflecting along both the X and Y axes?
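
The geometric equivalence behind this question can be checked with a
small standalone sketch. It is plain C with hypothetical helper names and
only models pixel coordinates; it is not how drm_rotation_simplify() is
implemented. A 180 degree rotation is built from two quarter turns and
compared against an X reflection followed by a Y reflection:

```c
#include <assert.h>
#include <stdbool.h>

struct pt {
	int x, y;
};

/* Mirror horizontally: column x maps to column w - 1 - x. */
static struct pt reflect_x(struct pt p, int w)
{
	return (struct pt){ w - 1 - p.x, p.y };
}

/* Mirror vertically: row y maps to row h - 1 - y. */
static struct pt reflect_y(struct pt p, int h)
{
	return (struct pt){ p.x, h - 1 - p.y };
}

/* One quarter turn of an image that is in_h rows tall. */
static struct pt rotate_90(struct pt p, int in_h)
{
	return (struct pt){ in_h - 1 - p.y, p.x };
}

/* Two quarter turns of a w x h image. */
static struct pt rotate_180(struct pt p, int w, int h)
{
	struct pt q = rotate_90(p, h);	/* the image is now h wide and w tall */

	return rotate_90(q, w);
}

/* Compare both transforms for every pixel of a w x h image. */
static bool rotate_180_equals_double_reflect(int w, int h)
{
	for (int y = 0; y < h; y++) {
		for (int x = 0; x < w; x++) {
			struct pt p = { x, y };
			struct pt a = rotate_180(p, w, h);
			struct pt b = reflect_y(reflect_x(p, w), h);

			if (a.x != b.x || a.y != b.y)
				return false;
		}
	}
	return true;
}
```

The check only shows that the two transforms land on the same pixel; it
says nothing about which representation the driver should advertise.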

Best Regards,
- Maíra

>   									  DRM_MODE_ROTATE_270 |
>   									  DRM_MODE_REFLECT_X |
>   									  DRM_MODE_REFLECT_Y);
>   
>   
>   	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> +		(fmt, new_state->color_encoding, new_state->color_range);
>   }
>   
>   static int vkms_plane_atomic_check(struct drm_plane *plane,
> 


* Re: [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions
  2024-03-13 17:45 ` [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions Louis Chauvet
@ 2024-03-25 14:34   ` Maíra Canal
  2024-03-26 15:57     ` Louis Chauvet
  2024-03-28 13:26     ` Pekka Paalanen
  0 siblings, 2 replies; 75+ messages in thread
From: Maíra Canal @ 2024-03-25 14:34 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/13/24 14:45, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Create KUnit tests to test the conversion between YUV and RGB. Test each
> conversion and range combination with some common colors.
> 
> The code used to compute the expected result can be found in comment.
> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - fix minor formating issues (whitespace, double line)
> - change expected alpha from 0x0000 to 0xffff
> - adapt to the new get_conversion_matrix usage
> - apply the changes from Arthur
> - move struct pixel_yuv_u8 to the test itself]

Again, a Co-developed-by tag might be more appropriate.

> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/Kconfig                  |  15 ++
>   drivers/gpu/drm/vkms/Makefile                 |   1 +
>   drivers/gpu/drm/vkms/tests/.kunitconfig       |   4 +
>   drivers/gpu/drm/vkms/tests/Makefile           |   3 +
>   drivers/gpu/drm/vkms/tests/vkms_format_test.c | 230 ++++++++++++++++++++++++++
>   drivers/gpu/drm/vkms/vkms_formats.c           |   7 +-
>   drivers/gpu/drm/vkms/vkms_formats.h           |   4 +
>   7 files changed, 262 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
> index b9ecdebecb0b..9b0e1940c14f 100644
> --- a/drivers/gpu/drm/vkms/Kconfig
> +++ b/drivers/gpu/drm/vkms/Kconfig
> @@ -13,3 +13,18 @@ config DRM_VKMS
>   	  a VKMS.
>   
>   	  If M is selected the module will be called vkms.
> +
> +config DRM_VKMS_KUNIT_TESTS
> +	tristate "Tests for VKMS" if !KUNIT_ALL_TESTS

"KUnit tests for VKMS"

> +	depends on DRM_VKMS && KUNIT
> +	default KUNIT_ALL_TESTS
> +	help
> +	  This builds unit tests for VKMS. This option is not useful for
> +	  distributions or general kernels, but only for kernel
> +	  developers working on VKMS.
> +
> +	  For more information on KUnit and unit tests in general,
> +	  please refer to the KUnit documentation in
> +	  Documentation/dev-tools/kunit/.
> +
> +	  If in doubt, say "N".
> \ No newline at end of file
> diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> index 1b28a6a32948..8d3e46dde635 100644
> --- a/drivers/gpu/drm/vkms/Makefile
> +++ b/drivers/gpu/drm/vkms/Makefile
> @@ -9,3 +9,4 @@ vkms-y := \
>   	vkms_writeback.o
>   
>   obj-$(CONFIG_DRM_VKMS) += vkms.o
> +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += tests/
> diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig b/drivers/gpu/drm/vkms/tests/.kunitconfig
> new file mode 100644
> index 000000000000..70e378228cbd
> --- /dev/null
> +++ b/drivers/gpu/drm/vkms/tests/.kunitconfig
> @@ -0,0 +1,4 @@
> +CONFIG_KUNIT=y
> +CONFIG_DRM=y
> +CONFIG_DRM_VKMS=y
> +CONFIG_DRM_VKMS_KUNIT_TESTS=y
> diff --git a/drivers/gpu/drm/vkms/tests/Makefile b/drivers/gpu/drm/vkms/tests/Makefile
> new file mode 100644
> index 000000000000..2d1df668569e
> --- /dev/null
> +++ b/drivers/gpu/drm/vkms/tests/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += vkms_format_test.o
> diff --git a/drivers/gpu/drm/vkms/tests/vkms_format_test.c b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
> new file mode 100644
> index 000000000000..0954d606e44a
> --- /dev/null
> +++ b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
> @@ -0,0 +1,230 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +#include <kunit/test.h>
> +
> +#include <drm/drm_fixed.h>
> +#include <drm/drm_fourcc.h>
> +#include <drm/drm_print.h>
> +
> +#include "../../drm_crtc_internal.h"
> +
> +#include "../vkms_drv.h"
> +#include "../vkms_formats.h"
> +
> +#define TEST_BUFF_SIZE 50
> +
> +struct pixel_yuv_u8 {
> +	u8 y, u, v;
> +};
> +
> +struct yuv_u8_to_argb_u16_case {
> +	enum drm_color_encoding encoding;
> +	enum drm_color_range range;
> +	size_t n_colors;
> +	struct format_pair {
> +		char *name;
> +		struct pixel_yuv_u8 yuv;
> +		struct pixel_argb_u16 argb;
> +	} colors[TEST_BUFF_SIZE];
> +};
> +
> +/*
> + * The YUV color representation were acquired via the colour python framework.
> + * Below are the function calls used for generating each case.
> + *
> + * for more information got to the docs:

s/for/For

> + * https://colour.readthedocs.io/en/master/generated/colour.RGB_to_YCbCr.html
> + */
> +static struct yuv_u8_to_argb_u16_case yuv_u8_to_argb_u16_cases[] = {
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                     in_bits = 16,
> +	 *                     in_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = False,
> +	 *                     out_int = True)
> +	 */

I feel that this Python code is kind of polluting the test cases.

> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT601,
> +		.range = DRM_COLOR_YCBCR_FULL_RANGE,
> +		.n_colors = 6,
> +		.colors = {
> +			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x4c, 0x55, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0x96, 0x2c, 0x15 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x1d, 0xff, 0x6b }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                     in_bits = 16,
> +	 *                     in_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = True,
> +	 *                     out_int = True)
> +	 */
> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT601,
> +		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
> +		.n_colors = 6,
> +		.colors = {
> +			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x51, 0x5a, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0x91, 0x36, 0x22 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x29, 0xf0, 0x6e }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                     in_bits = 16,
> +	 *                     in_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = False,
> +	 *                     out_int = True)
> +	 */
> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT709,
> +		.range = DRM_COLOR_YCBCR_FULL_RANGE,
> +		.n_colors = 4,
> +		.colors = {
> +			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x36, 0x63, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0xb6, 0x1e, 0x0c }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x12, 0xff, 0x74 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                     in_bits = 16,
> +	 *                     int_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = True,
> +	 *                     out_int = True)
> +	 */
> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT709,
> +		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
> +		.n_colors = 4,
> +		.colors = {
> +			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x3f, 0x66, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0xad, 0x2a, 0x1a }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x20, 0xf0, 0x76 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                     in_bits = 16,
> +	 *                     in_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = False,
> +	 *                     out_int = True)
> +	 */
> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT2020,
> +		.range = DRM_COLOR_YCBCR_FULL_RANGE,
> +		.n_colors = 4,
> +		.colors = {
> +			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x43, 0x5c, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0xad, 0x24, 0x0b }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x0f, 0xff, 0x76 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +	/*
> +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                     in_bits = 16,
> +	 *                     in_legal = False,
> +	 *                     in_int = True,
> +	 *                     out_bits = 8,
> +	 *                     out_legal = True,
> +	 *                     out_int = True)
> +	 */
> +	{
> +		.encoding = DRM_COLOR_YCBCR_BT2020,
> +		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
> +		.n_colors = 4,
> +		.colors = {
> +			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> +			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> +			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> +			{ "red",   { 0x4a, 0x61, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> +			{ "green", { 0xa4, 0x2f, 0x19 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> +			{ "blue",  { 0x1d, 0xf0, 0x77 }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> +		},
> +	},
> +};
> +
> +static void vkms_format_test_yuv_u8_to_argb_u16(struct kunit *test)
> +{
> +	const struct yuv_u8_to_argb_u16_case *param = test->param_value;
> +	struct pixel_argb_u16 argb;
> +
> +	for (size_t i = 0; i < param->n_colors; i++) {
> +		const struct format_pair *color = &param->colors[i];
> +
> +		struct conversion_matrix *matrix = get_conversion_matrix_to_argb_u16
> +			(DRM_FORMAT_NV12, param->encoding, param->range);
> +
> +		argb = argb_u16_from_yuv888(color->yuv.y, color->yuv.u, color->yuv.v, matrix);
> +
> +		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.a, color->argb.a), 257,
> +				    "On the A channel of the color %s expected 0x%04x, got 0x%04x",
> +				    color->name, color->argb.a, argb.a);
> +		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.r, color->argb.r), 257,
> +				    "On the R channel of the color %s expected 0x%04x, got 0x%04x",
> +				    color->name, color->argb.r, argb.r);
> +		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.g, color->argb.g), 257,
> +				    "On the G channel of the color %s expected 0x%04x, got 0x%04x",
> +				    color->name, color->argb.g, argb.g);
> +		KUNIT_EXPECT_LE_MSG(test, abs_diff(argb.b, color->argb.b), 257,
> +				    "On the B channel of the color %s expected 0x%04x, got 0x%04x",
> +				    color->name, color->argb.b, argb.b);
> +	}
> +}
> +
> +static void vkms_format_test_yuv_u8_to_argb_u16_case_desc(struct yuv_u8_to_argb_u16_case *t,
> +							  char *desc)
> +{
> +	snprintf(desc, KUNIT_PARAM_DESC_SIZE, "%s - %s",
> +		 drm_get_color_encoding_name(t->encoding), drm_get_color_range_name(t->range));
> +}
> +
> +KUNIT_ARRAY_PARAM(yuv_u8_to_argb_u16, yuv_u8_to_argb_u16_cases,
> +		  vkms_format_test_yuv_u8_to_argb_u16_case_desc
> +);
> +
> +static struct kunit_case vkms_format_test_cases[] = {
> +	KUNIT_CASE_PARAM(vkms_format_test_yuv_u8_to_argb_u16, yuv_u8_to_argb_u16_gen_params),
> +	{}
> +};
> +
> +static struct kunit_suite vkms_format_test_suite = {
> +	.name = "vkms-format",
> +	.test_cases = vkms_format_test_cases,
> +};
> +
> +kunit_test_suite(vkms_format_test_suite);
> +
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index edbf4b321b91..863fc91d6d48 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -7,6 +7,8 @@
>   #include <drm/drm_rect.h>
>   #include <drm/drm_fixed.h>
>   
> +#include <kunit/visibility.h>
> +
>   #include "vkms_formats.h"
>   
>   /**
> @@ -199,8 +201,8 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>   	return out_pixel;
>   }
>   
> -static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> -						  struct conversion_matrix *matrix)
> +VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> +							    struct conversion_matrix *matrix)
>   {
>   	u8 r, g, b;
>   	s64 fp_y, fp_cb, fp_cr;
> @@ -234,6 +236,7 @@ static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
>   
>   	return argb_u16_from_u8888(255, r, g, b);
>   }
> +EXPORT_SYMBOL_IF_KUNIT(argb_u16_from_yuv888);
>   
>   /*
>    * The following functions are read_line function for each pixel format supported by VKMS.
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index e1d324764b17..21e66a0cac16 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -13,4 +13,8 @@ struct conversion_matrix *
>   get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
>   				  enum drm_color_range range);
>   
> +#if IS_ENABLED(CONFIG_KUNIT)

What about CONFIG_DRM_EXPORT_FOR_TESTS?

Best Regards,
- Maíra

> +struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr, struct conversion_matrix *matrix);
> +#endif
> +
>   #endif /* _VKMS_FORMATS_H_ */
> 


* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-25 14:15   ` Maíra Canal
@ 2024-03-25 14:56     ` Pekka Paalanen
  2024-03-26 15:57     ` Louis Chauvet
  1 sibling, 0 replies; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 14:56 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Mon, 25 Mar 2024 11:15:13 -0300
Maíra Canal <mcanal@igalia.com> wrote:

> On 3/13/24 14:45, Louis Chauvet wrote:
> > Re-introduce a line-by-line composition algorithm for each pixel format.
> > This allows more performance by not requiring an indirection per pixel
> > read. This patch is focused on readability of the code.
> > 
> > Line-by-line composition was introduced by [1] but rewritten back to
> > pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
> > on performance, and it was merged.
> > 
> > This patch is almost a revert of [2], but in addition efforts have been
> > made to increase readability and maintainability of the rotation handling.
> > The blend function is now divided in two parts:
> > - Transformation of coordinates from the output referential to the source
> > referential
> > - Line conversion and blending
> > 
> > Most of the complexity of the rotation management is avoided by using
> > drm_rect_* helpers. The remaining complexity is around the clipping, to
> > avoid reading/writing outside source/destination buffers.
> > 
> > The pixel conversion is now done line-by-line, so the read_pixel_t was
> > replaced with read_pixel_line_t callback. This way the indirection is only
> > required once per line and per plane, instead of once per pixel and per
> > plane.
> > 
> > The read_line_t callbacks are very similar for most pixel format, but it
> > is required to avoid performance impact. Some helpers for color
> > conversion were introduced to avoid code repetition:
> > - *_to_argb_u16: perform colors conversion. They should be inlined by the
> >    compiler, and they are used to avoid repetition between multiple variants
> >    of the same format (argb/xrgb and maybe in the future for formats like
> >    bgr formats).
> > 
> > This new algorithm was tested with:
> > - kms_plane (for color conversions)
> > - kms_rotation_crc (for rotations of planes)
> > - kms_cursor_crc (for translations of planes)
> > - kms_rotation (for all rotations and formats combinations) [3]
> > The performance gain was mesured with:
> > - kms_fb_stress  
> 
> Could you tell us what was the performance gain?
> 
> > 
> > [1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
> >       new formats")
> >       https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
> > [2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
> >       functionality")
> >       https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
> > [3]:
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_composer.c | 167 +++++++++++++++++++------
> >   drivers/gpu/drm/vkms/vkms_drv.h      |  27 ++--
> >   drivers/gpu/drm/vkms/vkms_formats.c  | 236 ++++++++++++++++++++++-------------
> >   drivers/gpu/drm/vkms/vkms_formats.h  |   2 +-
> >   drivers/gpu/drm/vkms/vkms_plane.c    |   5 +-
> >   5 files changed, 292 insertions(+), 145 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index 989bcf59f375..5d78c33dbf41 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c

...

> > @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
> >   {
> >   	struct vkms_plane_state **plane = crtc_state->active_planes;
> >   	u32 n_active_planes = crtc_state->num_active_planes;
> > -	int y_pos, x_dst, x_limit;
> >   
> >   	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> >   
> > -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;  
> 
> Shouldn't it be `unsigned int`?

No. It's not good to mix signed and unsigned variables in computations.
I for sure would not remember all the implicit promotion rules that
apply, and you'd probably be forced to add explicit signedness casts to
get the correct behaviour. It causes far fewer surprises to "normalize"
all variables to the same signedness before computing with them. Some
values in this function can be negative.
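
A minimal standalone sketch of the kind of surprise meant here (plain C,
hypothetical function names, not taken from the patch). It assumes a
platform where size_t is at least as wide as int, which is the common
case; there, comparing a size_t with a negative int converts the int to a
huge unsigned value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Mixed signedness: with a size_t y and an int y1, `y < y1` converts y1
 * to size_t first. The cast below spells out that implicit conversion.
 */
static bool before_start_mixed(size_t y, int y1)
{
	return y < (size_t)y1;
}

/* Same test with both operands signed: a negative y1 behaves as expected. */
static bool before_start_signed(int y, int y1)
{
	return y < y1;
}
```

With y = 0 and y1 = -1, the mixed version reports that 0 is "before" -1,
while the all-signed version does not. For non-negative values both
agree, which is why this kind of problem tends to go unnoticed until a
coordinate actually becomes negative.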


Thanks,
pq



* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-13 17:45 ` [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm Louis Chauvet
  2024-03-25 14:15   ` Maíra Canal
@ 2024-03-25 15:43   ` Pekka Paalanen
  2024-03-26 15:57     ` Louis Chauvet
  1 sibling, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-25 15:43 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

[-- Attachment #1: Type: text/plain, Size: 29050 bytes --]

On Wed, 13 Mar 2024 18:45:04 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Re-introduce a line-by-line composition algorithm for each pixel format.
> This allows more performance by not requiring an indirection per pixel
> read. This patch is focused on readability of the code.
> 
> Line-by-line composition was introduced by [1] but rewritten back to
> pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
> on performance, and it was merged.
> 
> This patch is almost a revert of [2], but in addition efforts have been
> made to increase readability and maintainability of the rotation handling.
> The blend function is now divided in two parts:
> - Transformation of coordinates from the output referential to the source
> referential
> - Line conversion and blending
> 
> Most of the complexity of the rotation management is avoided by using
> drm_rect_* helpers. The remaining complexity is around the clipping, to
> avoid reading/writing outside source/destination buffers.
> 
> The pixel conversion is now done line-by-line, so the read_pixel_t was
> replaced with read_pixel_line_t callback. This way the indirection is only
> required once per line and per plane, instead of once per pixel and per
> plane.
> 
> The read_line_t callbacks are very similar for most pixel format, but it
> is required to avoid performance impact. Some helpers for color
> conversion were introduced to avoid code repetition:
> - *_to_argb_u16: perform colors conversion. They should be inlined by the
>   compiler, and they are used to avoid repetition between multiple variants
>   of the same format (argb/xrgb and maybe in the future for formats like
>   bgr formats).
> 
> This new algorithm was tested with:
> - kms_plane (for color conversions)
> - kms_rotation_crc (for rotations of planes)
> - kms_cursor_crc (for translations of planes)
> - kms_rotation (for all rotations and formats combinations) [3]
> The performance gain was mesured with:
> - kms_fb_stress
> 
> [1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
>      new formats")
>      https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
> [2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
>      functionality")
>      https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
> [3]:
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_composer.c | 167 +++++++++++++++++++------
>  drivers/gpu/drm/vkms/vkms_drv.h      |  27 ++--
>  drivers/gpu/drm/vkms/vkms_formats.c  | 236 ++++++++++++++++++++++-------------
>  drivers/gpu/drm/vkms/vkms_formats.h  |   2 +-
>  drivers/gpu/drm/vkms/vkms_plane.c    |   5 +-
>  5 files changed, 292 insertions(+), 145 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> index 989bcf59f375..5d78c33dbf41 100644
> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> @@ -41,7 +41,7 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
>  				struct line_buffer *output_buffer, int x_start, int pixel_count)
>  {
>  	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> -	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> +	const struct pixel_argb_u16 *in = &stage_buffer->pixels[x_start];
>  
>  	for (int i = 0; i < pixel_count; i++) {
>  		out[i].a = (u16)0xffff;
> @@ -51,33 +51,6 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
>  	}
>  }
>  
> -static int get_y_pos(struct vkms_frame_info *frame_info, int y)
> -{
> -	if (frame_info->rotation & DRM_MODE_REFLECT_Y)
> -		return drm_rect_height(&frame_info->rotated) - y - 1;
> -
> -	switch (frame_info->rotation & DRM_MODE_ROTATE_MASK) {
> -	case DRM_MODE_ROTATE_90:
> -		return frame_info->rotated.x2 - y - 1;
> -	case DRM_MODE_ROTATE_270:
> -		return y + frame_info->rotated.x1;
> -	default:
> -		return y;
> -	}
> -}
> -
> -static bool check_limit(struct vkms_frame_info *frame_info, int pos)
> -{
> -	if (drm_rotation_90_or_270(frame_info->rotation)) {
> -		if (pos >= 0 && pos < drm_rect_width(&frame_info->rotated))
> -			return true;
> -	} else {
> -		if (pos >= frame_info->rotated.y1 && pos < frame_info->rotated.y2)
> -			return true;
> -	}
> -
> -	return false;
> -}
>  
>  static void fill_background(const struct pixel_argb_u16 *background_color,
>  			    struct line_buffer *output_buffer)
> @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
>  {
>  	struct vkms_plane_state **plane = crtc_state->active_planes;
>  	u32 n_active_planes = crtc_state->num_active_planes;
> -	int y_pos, x_dst, x_limit;
>  
>  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
>  
> -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
>  
>  	/*
>  	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
>  	 * complexity to avoid poor blending performance.
>  	 *
> -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> -	 * buffer.
> +	 * The pixel_read_line callback is used to read a line, using an efficient
> +	 * algorithm for a specific format, into the staging buffer.
>  	 */
>  	for (size_t y = 0; y < crtc_y_limit; y++) {
>  		fill_background(&background_color, output_buffer);
>  
>  		/* The active planes are composed associatively in z-order. */
>  		for (size_t i = 0; i < n_active_planes; i++) {
> -			x_dst = plane[i]->frame_info->dst.x1;
> -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> -					stage_buffer->n_pixels);
> -			y_pos = get_y_pos(plane[i]->frame_info, y);
> +			struct vkms_plane_state *current_plane = plane[i];
>  
> -			if (!check_limit(plane[i]->frame_info, y_pos))
> +			/* Avoid rendering useless lines */
> +			if (y < current_plane->frame_info->dst.y1 ||
> +			    y >= current_plane->frame_info->dst.y2)
>  				continue;
>  
> -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> +			/*
> +			 * dst_line is the line to copy. The initial coordinates are inside the
> +			 * destination framebuffer, and then drm_rect_* helpers are used to
> +			 * compute the correct position into the source framebuffer.
> +			 */
> +			struct drm_rect dst_line = DRM_RECT_INIT(
> +				current_plane->frame_info->dst.x1, y,
> +				drm_rect_width(&current_plane->frame_info->dst), 1);
> +			struct drm_rect tmp_src;
> +
> +			drm_rect_fp_to_int(&tmp_src, &current_plane->frame_info->src);
> +
> +			/*
> +			 * [1]: Clamp dst_line to crtc_x_limit to avoid writing outside of
> +			 * the destination buffer
> +			 */
> +			dst_line.x1 = max_t(int, dst_line.x1, 0);
> +			dst_line.x2 = min_t(int, dst_line.x2, crtc_x_limit);
> +			/* The destination is completely outside of the crtc. */
> +			if (dst_line.x2 <= dst_line.x1)
> +				continue;
> +
> +			struct drm_rect src_line = dst_line;
> +
> +			/*
> +			 * Transform the x/y coordinates from the crtc into coordinates for
> +			 * the src buffer.
> +			 *
> +			 * - Cancel the offset of the dst buffer.
> +			 * - Invert the rotation. This assumes that
> +			 *   dst = drm_rect_rotate(src, rotation) (dst and src have the
> +			 *   same size, but can be rotated).
> +			 * - Apply the offset of the source rectangle to the coordinate.
> +			 */
> +			drm_rect_translate(&src_line, -current_plane->frame_info->dst.x1,
> +					   -current_plane->frame_info->dst.y1);
> +			drm_rect_rotate_inv(&src_line,
> +					    drm_rect_width(&tmp_src),
> +					    drm_rect_height(&tmp_src),
> +					    current_plane->frame_info->rotation);
> +			drm_rect_translate(&src_line, tmp_src.x1, tmp_src.y1);
> +
> +			/* Get the correct reading direction in the source buffer. */
> +
> +			enum pixel_read_direction direction =
> +				direction_for_rotation(current_plane->frame_info->rotation);
> +
> +			int x_start = src_line.x1;
> +			int y_start = src_line.y1;
> +			int pixel_count;
> +			/* [2]: Compute and clamp the number of pixels to read */
> +			if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
> +				/*
> +				 * In horizontal reading, the src_line width is the number of pixels
> +				 * to read
> +				 */
> +				pixel_count = drm_rect_width(&src_line);
> +				if (x_start < 0) {
> +					pixel_count += x_start;
> +					x_start = 0;
> +				}
> +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> +					pixel_count =
> +						(int)current_plane->frame_info->fb->width - x_start;
> +				}
> +			} else {
> +				/*
> +				 * In vertical reading, the src_line height is the number of pixels
> +				 * to read
> +				 */
> +				pixel_count = drm_rect_height(&src_line);
> +				if (y_start < 0) {
> +					pixel_count += y_start;
> +					y_start = 0;
> +				}
> +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> +					pixel_count =
> +						(int)current_plane->frame_info->fb->height - y_start;
> +				}

When you are clamping x_start or y_start or pixel_count to be inside
the source FB, should you not equally adjust the destination
coordinates as well?

If we take a step back and look at the UAPI, I believe the answer is
"no", but it's in no way obvious. It results from the combination of
several facts:

- UAPI checks reject any source rectangle that extends outside of the
  source FB.

- The source rectangle stretches to fill the destination rectangle
  exactly.

- VKMS does not support stretching (scaling), so its UAPI checks reject
  any commit with source and destination rectangles of different sizes
  after accounting for rotation. (Right?)

I think this results in the clamping code being actually dead code.
However, I would not delete the clamping code, because it is a cheap
safety net in case something goes wrong.

If you agree that it's just a safety net, then maybe explain that in a
comment? If the safety net catches anything, the composition result
will be wrong anyway, so it doesn't matter to adjust the destination
rectangle to match.

When the last point is relaxed and VKMS gains scaling support, I think
it won't change the fact that the clamping remains as a safety net. It
just increases the risk of bugs that would be caught by the net.

Going outside of FB boundaries is a serious bug and deserves to be
checked. Going outside of the source rectangle would be a bug too,
assuming that partially included pixels are considered fully included,
but it's not serious enough to warrant explicit checks. Ideally IGT
would catch it.
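
To make the safety-net argument concrete, here is a minimal standalone
sketch of the same clamping logic, reduced to one dimension. It is not
VKMS code: clamp_span() and its parameters are hypothetical names chosen
for illustration. It reports whether anything was actually clamped,
which, given the three facts above, should never happen for a commit
that passed the UAPI checks.

```c
/*
 * Hypothetical helper, not part of VKMS: clamp the 1-D read span
 * [*start, *start + *count) so that it lies inside [0, limit).
 *
 * Returns 1 if the span had to be modified, 0 if it was already inside.
 * After clamping, *count may be <= 0, meaning there is nothing to read.
 */
static int clamp_span(int *start, int *count, int limit)
{
	int clamped = 0;

	/* Span begins before the buffer: drop the part that is outside. */
	if (*start < 0) {
		*count += *start;
		*start = 0;
		clamped = 1;
	}

	/* Span ends after the buffer: shorten it to end at the limit. */
	if (*start + *count > limit) {
		*count = limit - *start;
		clamped = 1;
	}

	return clamped;
}
```

For a span that is already inside the FB, for example start = 10 and
count = 100 with limit = 640, the helper returns 0 and leaves both
values untouched. If it ever returned 1, the destination would no longer
match the source span, so the composition result would already be wrong.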

> +			}
> +
> +			if (pixel_count <= 0) {
> +				/* Nothing to read, so avoid multiple function calls for nothing */
> +				continue;
> +			}
> +
> +			/*
> +			 * Modify the starting point to take into account the rotation
> +			 *
> +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> +			 * corner.
> +			 */
> +			if (direction == READ_RIGHT_TO_LEFT) {
> +				// x_start is now the right point
> +				x_start += pixel_count - 1;
> +			} else if (direction == READ_BOTTOM_TO_TOP) {
> +				// y_start is now the bottom point
> +				y_start += pixel_count - 1;
> +			}
> +
> +			/*
> +			 * Perform the conversion and the blending
> +			 *
> +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> +			 * inside the source buffer [2] and we don't write outside the stage
> +			 * buffer [1]
> +			 */
> +			current_plane->pixel_read_line(
> +				current_plane, x_start, y_start, direction, pixel_count,
> +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> +
> +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> +					    current_plane->frame_info->dst.x1,
> +					    pixel_count);
>  		}
>  
>  		apply_lut(crtc_state, output_buffer);
> @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
>  		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
>  
>  		if (wb)
> -			vkms_writeback_row(wb, output_buffer, y_pos);
> +			vkms_writeback_row(wb, output_buffer, y);
>  	}
>  }
>  
> @@ -261,7 +346,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
>  	u32 n_active_planes = crtc_state->num_active_planes;
>  
>  	for (size_t i = 0; i < n_active_planes; i++)
> -		if (!planes[i]->pixel_read)
> +		if (!planes[i]->pixel_read_line)
>  			return -1;
>  
>  	if (active_wb && !active_wb->pixel_write)
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 985e7a92b7bc..23e1d247468d 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -39,7 +39,6 @@
>  struct vkms_frame_info {
>  	struct drm_framebuffer *fb;
>  	struct drm_rect src, dst;
> -	struct drm_rect rotated;
>  	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
>  	unsigned int rotation;
>  };
> @@ -80,26 +79,37 @@ enum pixel_read_direction {
>  	READ_LEFT_TO_RIGHT
>  };
>  
> +struct vkms_plane_state;
> +
>  /**
> - * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> + * typedef pixel_read_line_t - These functions are used to read a pixel line in the source frame,
>   * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
>   *
> - * @in_pixel: Pointer to the pixel to read
> - * @out_pixel: Pointer to write the converted pixel
> + * @plane: Plane used as source for the pixel value
> + * @x_start: X (width) coordinate of the first pixel to copy. The caller must ensure that x_start
> + * is positive and smaller than @plane->frame_info->fb->width.
> + * @y_start: Y (height) coordinate of the first pixel to copy. The caller must ensure that y_start
> + * is positive and smaller than @plane->frame_info->fb->height.

s/positive/non-negative/ because zero is valid too. At least, there is
debate whether zero is positive or not, but non-negative is clear.

> + * @direction: Direction to use for the copy, starting at @x_start/@y_start
> + * @count: Number of pixels to copy
> + * @out_pixel: Pointer where to write the pixel values. They will be written from @out_pixel[0]
> + * to @out_pixel[@count - 1]. The caller must ensure that out_pixel has a length of at least @count.
>   */
> -typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> +typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_start,
> +				  int y_start, enum pixel_read_direction direction, int count,
> +				  struct pixel_argb_u16 out_pixel[]);
>  
>  /**
>   * vkms_plane_state - Driver specific plane state
>   * @base: base plane state
>   * @frame_info: data required for composing computation
> - * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> - * ensure that this pointer is valid
> + * @pixel_read_line: function to read a pixel line in this plane. The creator of a vkms_plane_state
> + * must ensure that this pointer is valid
>   */
>  struct vkms_plane_state {
>  	struct drm_shadow_plane_state base;
>  	struct vkms_frame_info *frame_info;
> -	pixel_read_t pixel_read;
> +	pixel_read_line_t pixel_read_line;
>  };
>  
>  struct vkms_plane {
> @@ -204,7 +214,6 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
>  /* Composer Support */
>  void vkms_composer_worker(struct work_struct *work);
>  void vkms_set_composer(struct vkms_output *out, bool enabled);
> -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
>  void vkms_writeback_row(struct vkms_writeback_job *wb, const struct line_buffer *src_buffer, int y);
>  
>  /* Writeback */
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 743b6fd06db5..1449a0e6c706 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,77 +105,45 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>  	return 0;
>  }
>  
> -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> -				 int plane_index)
> -{
> -	int x_src = frame_info->src.x1 >> 16;
> -	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> -	u8 *addr;
> -	int rem_x, rem_y;
> -
> -	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
> -	return addr;
> -}
> -
> -static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> -{
> -	if (frame_info->rotation & (DRM_MODE_REFLECT_X | DRM_MODE_ROTATE_270))
> -		return limit - x - 1;
> -	return x;
> -}
> -
>  /*
> - * The following  functions take pixel data from the buffer and convert them to the format
> + * The following functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>   * ARGB16161616 in out_pixel.
>   *
> - * They are used in the `vkms_compose_row` function to handle multiple formats.
> + * They are used in the `read_line` functions to avoid duplicate work for some pixel formats.
>   */
>  
> -static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_u8888(int a, int r, int g, int b)
>  {
> +	struct pixel_argb_u16 out_pixel;
>  	/*
>  	 * The 257 is the "conversion ratio". This number is obtained by the
>  	 * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
>  	 * the best color value in a pixel format with more possibilities.
>  	 * A similar idea applies to others RGB color conversions.
>  	 */
> -	out_pixel->a = (u16)in_pixel[3] * 257;
> -	out_pixel->r = (u16)in_pixel[2] * 257;
> -	out_pixel->g = (u16)in_pixel[1] * 257;
> -	out_pixel->b = (u16)in_pixel[0] * 257;
> -}
> +	out_pixel.a = (u16)a * 257;
> +	out_pixel.r = (u16)r * 257;
> +	out_pixel.g = (u16)g * 257;
> +	out_pixel.b = (u16)b * 257;
>  
> -static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> -{
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = (u16)in_pixel[2] * 257;
> -	out_pixel->g = (u16)in_pixel[1] * 257;
> -	out_pixel->b = (u16)in_pixel[0] * 257;
> +	return out_pixel;
>  }
>  
> -static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_u16161616(int a, int r, int g, int b)
>  {
> -	u16 *pixel = (u16 *)in_pixel;
> +	struct pixel_argb_u16 out_pixel;
>  
> -	out_pixel->a = le16_to_cpu(pixel[3]);
> -	out_pixel->r = le16_to_cpu(pixel[2]);
> -	out_pixel->g = le16_to_cpu(pixel[1]);
> -	out_pixel->b = le16_to_cpu(pixel[0]);
> -}
> +	out_pixel.a = le16_to_cpu(a);
> +	out_pixel.r = le16_to_cpu(r);
> +	out_pixel.g = le16_to_cpu(g);
> +	out_pixel.b = le16_to_cpu(b);
>  
> -static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> -{
> -	u16 *pixel = (u16 *)in_pixel;
> -
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = le16_to_cpu(pixel[2]);
> -	out_pixel->g = le16_to_cpu(pixel[1]);
> -	out_pixel->b = le16_to_cpu(pixel[0]);
> +	return out_pixel;
>  }
>  
> -static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> +static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>  {
> -	u16 *pixel = (u16 *)in_pixel;
> +	struct pixel_argb_u16 out_pixel;
>  
>  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> @@ -185,12 +153,26 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
>  	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
>  	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
>  
> -	out_pixel->a = (u16)0xffff;
> -	out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> -	out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> -	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> +	out_pixel.a = (u16)0xffff;
> +	out_pixel.r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> +	out_pixel.g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> +	out_pixel.b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> +
> +	return out_pixel;
>  }
>  
> +/*
> + * The following functions are read_line function for each pixel format supported by VKMS.
> + *
> + * They read a line starting at the point @x_start,@y_start following the @direction. The result
> + * is stored in @out_pixel and in the format ARGB16161616.
> + *
> + * Those function are very similar, but it is required for performance reason. In the past, some
> + * experiment were done, and with a generic loop the performance are very reduced [1].

The English here feels a bit awkward. How about:

These functions are very repetitive, but the innermost pixel loops must
be kept inside these functions for performance reasons. Some
benchmarking was done in [1] where having the innermost loop factored
out of these functions showed a slowdown by a factor of three.
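
As a rough illustration of what "innermost loop inside the function"
means, here is a simplified standalone sketch. None of it is the driver
code: struct px_argb_u16, px_read_t, generic_read_line() and
xrgb8888_read_line() are hypothetical names, and the source is assumed
to be a packed XRGB8888 line with a fixed byte step between pixels. Both
variants produce the same output; the difference is only that the first
one pays an indirect call per pixel, while the second keeps the
conversion inside its own loop.

```c
#include <stdint.h>

/* Simplified stand-in for the ARGB16161616 intermediate pixel. */
struct px_argb_u16 {
	uint16_t a, r, g, b;
};

/* Variant A: one callback invocation per pixel, loop factored out. */
typedef void (*px_read_t)(const uint8_t *in, struct px_argb_u16 *out);

static void xrgb8888_px(const uint8_t *in, struct px_argb_u16 *out)
{
	/* 257 = (2^16 - 1) / (2^8 - 1), expands 8-bit to 16-bit range. */
	out->a = 0xffff;
	out->r = (uint16_t)(in[2] * 257);
	out->g = (uint16_t)(in[1] * 257);
	out->b = (uint16_t)(in[0] * 257);
}

static void generic_read_line(px_read_t read, const uint8_t *src, int step,
			      int count, struct px_argb_u16 *out)
{
	for (int i = 0; i < count; i++, src += step)
		read(src, &out[i]);
}

/* Variant B: the format-specific function owns the innermost loop. */
static void xrgb8888_read_line(const uint8_t *src, int step, int count,
			       struct px_argb_u16 *out)
{
	for (int i = 0; i < count; i++, src += step) {
		out[i].a = 0xffff;
		out[i].r = (uint16_t)(src[2] * 257);
		out[i].g = (uint16_t)(src[1] * 257);
		out[i].b = (uint16_t)(src[0] * 257);
	}
}
```

Variant B has to be repeated for every format, which is the
repetitiveness the comment should acknowledge.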

> + *
> + * [1]: https://lore.kernel.org/dri-devel/d258c8dc-78e9-4509-9037-a98f7f33b3a3@riseup.net/
> + */
> +
>  /**
>   * black_to_argb_u16() - pixel_read callback which always read black
>   *
> @@ -198,42 +180,116 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
>   * It is used to avoid null pointer to be used as a function. In theory, this function should
>   * never be called, except if you found a bug in the driver/DRM core.
>   */
> +static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
> +			      int y_start, enum pixel_read_direction direction, int count,
> +			      struct pixel_argb_u16 out_pixel[])
>  {
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +
> +	while (out_pixel < end) {
> +		*out_pixel = argb_u16_from_u8888(255, 0, 0, 0);
> +		out_pixel += 1;
> +	}
>  }
>  
> +static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> +			       enum pixel_read_direction direction, int count,
> +			       struct pixel_argb_u16 out_pixel[])
>  {
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u8 *px = (u8 *)src_pixels;
> +		*out_pixel = argb_u16_from_u8888(px[3], px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void XRGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> +			       enum pixel_read_direction direction, int count,
> +			       struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u8 *px = (u8 *)src_pixels;
> +		*out_pixel = argb_u16_from_u8888(255, px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void ARGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> +				   int y_start, enum pixel_read_direction direction, int count,
> +				   struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
> +		*out_pixel = argb_u16_from_u16161616(px[3], px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void XRGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> +				   int y_start, enum pixel_read_direction direction, int count,
> +				   struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> +
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
> +		*out_pixel = argb_u16_from_u16161616(0xFFFF, px[2], px[1], px[0]);
> +		out_pixel += 1;
> +		src_pixels += step;
> +	}
> +}
> +
> +static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> +			     int y_start, enum pixel_read_direction direction, int count,
> +			     struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
>  
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
>  
> +	while (out_pixel < end) {
> +		u16 *px = (u16 *)src_pixels;
>  
> +		*out_pixel = argb_u16_from_RGB565(px);
> +		out_pixel += 1;
> +		src_pixels += step;
>  	}
>  }
>  
> @@ -343,25 +399,25 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>  }
>  
>  /**
> - * Retrieve the correct read_pixel function for a specific format.
> + * Retrieve the correct read_line function for a specific format.
>   * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
>   * function is returned.
>   *
>   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>   */
> -pixel_read_t get_pixel_read_function(u32 format)
> +pixel_read_line_t get_pixel_read_line_function(u32 format)
>  {
>  	switch (format) {
>  	case DRM_FORMAT_ARGB8888:
> -		return &ARGB8888_to_argb_u16;
> +		return &ARGB8888_read_line;
>  	case DRM_FORMAT_XRGB8888:
> -		return &XRGB8888_to_argb_u16;
> +		return &XRGB8888_read_line;
>  	case DRM_FORMAT_ARGB16161616:
> -		return &ARGB16161616_to_argb_u16;
> +		return &ARGB16161616_read_line;
>  	case DRM_FORMAT_XRGB16161616:
> -		return &XRGB16161616_to_argb_u16;
> +		return &XRGB16161616_read_line;
>  	case DRM_FORMAT_RGB565:
> -		return &RGB565_to_argb_u16;
> +		return &RGB565_read_line;
>  	default:
>  		/*
>  		 * This is a bug in vkms_plane_atomic_check. All the supported
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 3ecea4563254..8d2bef95ff79 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -5,7 +5,7 @@
>  
>  #include "vkms_drv.h"
>  
> -pixel_read_t get_pixel_read_function(u32 format);
> +pixel_read_line_t get_pixel_read_line_function(u32 format);
>  
>  pixel_write_t get_pixel_write_function(u32 format);
>  
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 10e9b23dab28..8875bed76410 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -112,7 +112,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	frame_info = vkms_plane_state->frame_info;
>  	memcpy(&frame_info->src, &new_state->src, sizeof(struct drm_rect));
>  	memcpy(&frame_info->dst, &new_state->dst, sizeof(struct drm_rect));
> -	memcpy(&frame_info->rotated, &new_state->dst, sizeof(struct drm_rect));
>  	frame_info->fb = fb;
>  	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
>  	drm_framebuffer_get(frame_info->fb);
> @@ -122,10 +121,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  									  DRM_MODE_REFLECT_X |
>  									  DRM_MODE_REFLECT_Y);
>  
> -	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> -			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>  
> -	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> +	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
>  }
>  
>  static int vkms_plane_atomic_check(struct drm_plane *plane,
> 

This is looking good enough that I can give an

Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>


Thanks,
pq


* Re: [PATCH v5 02/16] drm/vkms: Use drm_frame directly
  2024-03-25 13:20   ` Maíra Canal
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 25/03/24 - 10:20, Maíra Canal wrote:
> On 3/13/24 14:44, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Remove intermediary variables and access the variables directly from
> > drm_frame. These changes should be a no-op.
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_drv.h       |  3 ---
> >   drivers/gpu/drm/vkms/vkms_formats.c   | 12 +++++++-----
> >   drivers/gpu/drm/vkms/vkms_plane.c     |  3 ---
> >   drivers/gpu/drm/vkms/vkms_writeback.c |  5 -----
> >   4 files changed, 7 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 8f5710debb1e..b4b357447292 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -31,9 +31,6 @@ struct vkms_frame_info {
> >   	struct drm_rect rotated;
> >   	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
> >   	unsigned int rotation;
> > -	unsigned int offset;
> > -	unsigned int pitch;
> > -	unsigned int cpp;
> >   };
> >   
> >   struct pixel_argb_u16 {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 36046b12f296..172830a3936a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -11,8 +11,10 @@
> >   
> >   static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> >   {
> > -	return frame_info->offset + (y * frame_info->pitch)
> > -				  + (x * frame_info->cpp);
> > +	struct drm_framebuffer *fb = frame_info->fb;
> > +
> > +	return fb->offsets[0] + (y * fb->pitches[0])
> > +			      + (x * fb->format->cpp[0]);
> 
> Nitpicking: Could this be packed into a single line?

Applied in v6.

Thanks,
Louis Chauvet
 
> Anyway,
> 
> Reviewed-by: Maíra Canal <mcanal@igalia.com>
> 
> Best Regards,
> - Maíra
> 
> >   }
> >   
> >   /*
> > @@ -131,12 +133,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> >   	u8 *src_pixels = get_packed_src_addr(frame_info, y);
> >   	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> >   
> > -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->cpp) {
> > +	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> >   		int x_pos = get_x_position(frame_info, limit, x);
> >   
> >   		if (drm_rotation_90_or_270(frame_info->rotation))
> >   			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> > -				+ frame_info->cpp * y;
> > +				+ frame_info->fb->format->cpp[0] * y;
> >   
> >   		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> >   	}
> > @@ -223,7 +225,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
> >   	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
> >   
> > -	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->cpp)
> > +	for (size_t x = 0; x < x_limit; x++, dst_pixels += frame_info->fb->format->cpp[0])
> >   		wb->pixel_write(dst_pixels, &in_pixels[x]);
> >   }
> >   
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 5a8d295e65f2..21b5adfb44aa 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -125,9 +125,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> >   			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> >   
> > -	frame_info->offset = fb->offsets[0];
> > -	frame_info->pitch = fb->pitches[0];
> > -	frame_info->cpp = fb->format->cpp[0];
> >   	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
> >   }
> >   
> > diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c b/drivers/gpu/drm/vkms/vkms_writeback.c
> > index bc724cbd5e3a..c8582df1f739 100644
> > --- a/drivers/gpu/drm/vkms/vkms_writeback.c
> > +++ b/drivers/gpu/drm/vkms/vkms_writeback.c
> > @@ -149,11 +149,6 @@ static void vkms_wb_atomic_commit(struct drm_connector *conn,
> >   	crtc_state->active_writeback = active_wb;
> >   	crtc_state->wb_pending = true;
> >   	spin_unlock_irq(&output->composer_lock);
> > -
> > -	wb_frame_info->offset = fb->offsets[0];
> > -	wb_frame_info->pitch = fb->pitches[0];
> > -	wb_frame_info->cpp = fb->format->cpp[0];
> > -
> >   	drm_writeback_queue_job(wb_conn, connector_state);
> >   	active_wb->pixel_write = get_pixel_write_function(wb_format);
> >   	drm_rect_init(&wb_frame_info->src, 0, 0, crtc_width, crtc_height);
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions
  2024-03-25 13:32   ` Maíra Canal
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 25/03/24 - 10:32, Maíra Canal wrote:
> On 3/13/24 14:44, Louis Chauvet wrote:
> > Add some documentation on pixel conversion functions.
> > Update outdated comments for pixel_write functions.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_composer.c |  7 ++++
> >   drivers/gpu/drm/vkms/vkms_drv.h      | 13 ++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.c  | 62 ++++++++++++++++++++++++++++++------
> >   3 files changed, 73 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index c6d9b4a65809..da0651a94c9b 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > @@ -189,6 +189,13 @@ static void blend(struct vkms_writeback_job *wb,
> >   
> >   	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> >   
> > +	/*
> > +	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> > +	 * complexity to avoid poor blending performance.
> > +	 *
> > +	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > +	 * buffer.
> > +	 */
> >   	for (size_t y = 0; y < crtc_y_limit; y++) {
> >   		fill_background(&background_color, output_buffer);
> >   
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index b4b357447292..18086423a3a7 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -25,6 +25,17 @@
> >   
> >   #define VKMS_LUT_SIZE 256
> >   
> > +/**
> > + * struct vkms_frame_info - structure to store the state of a frame
> > + *
> > + * @fb: backing drm framebuffer
> > + * @src: source rectangle of this frame in the source framebuffer
> > + * @dst: destination rectangle in the crtc buffer
> > + * @map: see drm_shadow_plane_state@data
> > + * @rotation: rotation applied to the source.
> > + *
> > + * @src and @dst should have the same size modulo the rotation.
> > + */
> >   struct vkms_frame_info {
> >   	struct drm_framebuffer *fb;
> >   	struct drm_rect src, dst;
> > @@ -52,6 +63,8 @@ struct vkms_writeback_job {
> >    * vkms_plane_state - Driver specific plane state
> 
> It should be "* struct vkms_plane_state - Driver specific plane state".

Fixed in v6.
 
> >    * @base: base plane state
> >    * @frame_info: data required for composing computation
> > + * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> > + * ensure that this pointer is valid
> >    */
> >   struct vkms_plane_state {
> >   	struct drm_shadow_plane_state base;
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 172830a3936a..6e3dc8682ff9 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -9,6 +9,18 @@
> >   
> >   #include "vkms_formats.h"
> >   
> > +/**
> > + * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
> > + *
> > + * @frame_info: Buffer metadata
> > + * @x: The x coordinate of the wanted pixel in the buffer
> > + * @y: The y coordinate of the wanted pixel in the buffer
> > + *
> > + * The caller must ensure that the framebuffer associated with this request uses a pixel format
> > + * where block_h == block_w == 1.
> > + * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
> > + * outside of the buffer.
> > + */
> >   static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> >   {
> >   	struct drm_framebuffer *fb = frame_info->fb;
> > @@ -17,18 +29,22 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
> >   			      + (x * fb->format->cpp[0]);
> >   }
> >   
> > -/*
> > - * packed_pixels_addr - Get the pointer to pixel of a given pair of coordinates
> > +/**
> > + * packed_pixels_addr() - Get the pointer to the block containing the pixel at the given
> > + * coordinates
> >    *
> >    * @frame_info: Buffer metadata
> > - * @x: The x(width) coordinate of the 2D buffer
> > - * @y: The y(Heigth) coordinate of the 2D buffer
> > + * @x: The x(width) coordinate inside the plane
> > + * @y: The y(height) coordinate inside the plane
> 
> I would add a space after x and y.

I just followed what was there before; fixed in v6.

> >    *
> >    * Takes the information stored in the frame_info, a pair of coordinates, and
> >    * returns the address of the first color channel.
> >    * This function assumes the channels are packed together, i.e. a color channel
> >    * comes immediately after another in the memory. And therefore, this function
> >    * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> > + *
> > + * The caller must ensure that the framebuffer associated with this request uses a pixel format
> > + * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
> >    */
> >   static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
> >   				int x, int y)
> > @@ -53,6 +69,13 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
> >   	return x;
> >   }
> >   
> > +/*
> > + * The following  functions take pixel data from the buffer and convert them to the format
> 
> Double-spacing.

Fixed in v6.

> > + * ARGB16161616 in out_pixel.
> > + *
> > + * They are used in the `vkms_compose_row` function to handle multiple formats.
> 
> For cross-referencing functions, we use vkms_compose_row() [1].
> 
> [1] 
> https://docs.kernel.org/doc-guide/kernel-doc.html#highlights-and-cross-references

Thanks for the reference; fixed in v6.
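
For readers unfamiliar with the kernel-doc conventions raised in this review, here is a minimal sketch of the three points: the "struct name -" prefix, the "name() -" prefix for functions, and cross-references written as name() and &struct name. Every identifier below (example_line, example_fill_line, example_clear_line) is hypothetical and is not part of VKMS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/**
 * struct example_line - One row of 16-bit samples (hypothetical type)
 * @pixels: array of @n_pixels samples
 * @n_pixels: number of entries in @pixels
 */
struct example_line {
	uint16_t *pixels;
	size_t n_pixels;
};

/**
 * example_fill_line() - Set every sample of a line to the same value
 * @line: line to fill, see &struct example_line
 * @value: value written to each sample
 *
 * Return: number of samples written.
 */
static size_t example_fill_line(struct example_line *line, uint16_t value)
{
	assert(line != NULL);
	for (size_t i = 0; i < line->n_pixels; i++)
		line->pixels[i] = value;
	return line->n_pixels;
}

/**
 * example_clear_line() - Reset a line to zero
 * @line: line to clear
 *
 * Thin wrapper around example_fill_line().
 *
 * Return: number of samples written.
 */
static size_t example_clear_line(struct example_line *line)
{
	return example_fill_line(line, 0);
}
```

A comment opened with "/**" but lacking the "name -" first line is what triggers the "Where is the function name?" remarks further down.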

> > + */
> > +
> >   static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> >   {
> >   	/*
> > @@ -145,12 +168,11 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> >   }
> >   
> >   /*
> > - * The following  functions take an line of argb_u16 pixels from the
> > - * src_buffer, convert them to a specific format, and store them in the
> > - * destination.
> > + * The following functions take one argb_u16 pixel and convert it to a specific format. The
> 
> For cross-referencing structs, look here [1].

Fixed in v6.

> > + * result is stored in @dst_pixels.
> >    *
> > - * They are used in the `compose_active_planes` to convert and store a line
> > - * from the src_buffer to the writeback buffer.
> > + * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
> 
> Same.

Fixed in v6.

> > + * the writeback buffer.
> >    */
> >   static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> >   {
> > @@ -216,6 +238,14 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> >   	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
> >   }
> >   
> > +/**
> > + * Generic loop for all supported writeback format. It is executed just after the blending to
> > + * write a line in the writeback buffer.
> > + *
> > + * @wb: Job where to insert the final image
> > + * @src_buffer: Line to write
> > + * @y: Row to write in the writeback buffer
> > + */
> >   void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   			const struct line_buffer *src_buffer, int y)
> >   {
> > @@ -229,6 +259,13 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   		wb->pixel_write(dst_pixels, &in_pixels[x]);
> >   }
> >   
> > +/**
> 
> Where is the function name?

Fixed in v6.

> > + * Retrieve the correct read_pixel function for a specific format.
> > + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > + * pointer is valid before using it in a vkms_plane_state.
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + */
> >   void *get_pixel_conversion_function(u32 format)
> >   {
> >   	switch (format) {
> > @@ -247,6 +284,13 @@ void *get_pixel_conversion_function(u32 format)
> >   	}
> >   }
> >   
> > +/**
> 
> Same.

Fixed in v6.

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> > + * Retrieve the correct write_pixel function for a specific format.
> > + * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > + * pointer is valid before using it in a vkms_writeback_job.
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + */
> >   void *get_pixel_write_function(u32 format)
> >   {
> >   	switch (format) {
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
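
The pixel_offset() documentation quoted above restricts the helper to formats where block_h == block_w == 1. A minimal userspace sketch of why: the offset is a plain "plane offset + y * pitch + x * bytes-per-pixel" sum, which only addresses a single pixel when each block holds exactly one pixel. The struct example_fb type and its field names are assumptions made for illustration; they only mimic the framebuffer fields such a helper would read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few framebuffer fields the sketch needs. */
struct example_fb {
	size_t offset;	/* byte offset of the first plane in the buffer */
	size_t pitch;	/* bytes per row, padding included */
	size_t cpp;	/* bytes per pixel; meaningful only for 1x1 blocks */
};

/* Byte offset of pixel (x, y); valid only when block_w == block_h == 1. */
static size_t example_pixel_offset(const struct example_fb *fb, int x, int y)
{
	assert(fb != NULL && x >= 0 && y >= 0);
	return fb->offset + (size_t)y * fb->pitch + (size_t)x * fb->cpp;
}
```

With a format that packs several pixels per block, "x * cpp" no longer lands on a pixel boundary, which is the out-of-buffer risk the new comment warns about.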

* Re: [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-25 12:04   ` Pekka Paalanen
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 25/03/24 - 14:04, Pekka Paalanen wrote:
> On Wed, 13 Mar 2024 18:44:58 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Introduce two typedefs: pixel_read_t and pixel_write_t. It allows the
> > compiler to check if the passed functions take the correct arguments.
> > Such typedefs will help ensuring consistency across the code base in
> > case of update of these prototypes.
> > 
> > Rename input/output variable in a consistent way between read_line and
> > write_line.
> > 
> > A warn has been added in get_pixel_*_function to alert when an unsupported
> > pixel format is requested. As those formats are checked before
> > atomic_update callbacks, it should never append.
> 
> s/append/happen/

Fixed in v6.

Thanks,
Louis Chauvet
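
As an illustration of the commit message quoted above, here is a minimal userspace sketch of a typed callback lookup: returning a function-pointer typedef instead of void * lets the compiler reject a callback with the wrong prototype, and the default case reports the unsupported format before returning NULL. All example_* names and the enum are hypothetical, and fprintf() is only a stand-in for the kernel's WARN().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct example_argb_u16 {
	uint16_t a, r, g, b;
};

/* Typed callback: a function with another prototype will not convert silently. */
typedef void (*example_read_t)(const uint8_t *in_pixel,
			       struct example_argb_u16 *out_pixel);

enum example_format {
	EXAMPLE_FMT_ARGB8888,
	EXAMPLE_FMT_XRGB8888,
	EXAMPLE_FMT_UNKNOWN,
};

static void example_argb8888_read(const uint8_t *in_pixel,
				  struct example_argb_u16 *out_pixel)
{
	assert(in_pixel != NULL && out_pixel != NULL);
	/* Bytes in memory are B, G, R, A; 257 widens 8 bits to 16 bits. */
	out_pixel->a = (uint16_t)(in_pixel[3] * 257);
	out_pixel->r = (uint16_t)(in_pixel[2] * 257);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257);
}

static void example_xrgb8888_read(const uint8_t *in_pixel,
				  struct example_argb_u16 *out_pixel)
{
	assert(in_pixel != NULL && out_pixel != NULL);
	out_pixel->a = 0xffff;	/* the X byte is ignored, pixel is opaque */
	out_pixel->r = (uint16_t)(in_pixel[2] * 257);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257);
}

static example_read_t example_get_read_function(enum example_format format)
{
	switch (format) {
	case EXAMPLE_FMT_ARGB8888:
		return example_argb8888_read;
	case EXAMPLE_FMT_XRGB8888:
		return example_xrgb8888_read;
	default:
		/* Userspace stand-in for WARN(): report, then return NULL. */
		fprintf(stderr, "format %d is not supported\n", (int)format);
		return NULL;
	}
}
```

The caller still has to check the returned pointer; the warning only makes a missing atomic-check rejection visible.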
 
> 
> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com>
> 
> Thanks,
> pq
> 
> > 
> > Document for those typedefs.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
> >  drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
> >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
> >  drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
> >  4 files changed, 95 insertions(+), 58 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 18086423a3a7..4bfc62d26f08 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -53,12 +53,31 @@ struct line_buffer {
> >  	struct pixel_argb_u16 *pixels;
> >  };
> >  
> > +/**
> > + * typedef pixel_write_t - These functions are used to read a pixel from a
> > + * `struct pixel_argb_u16*`, convert it in a specific format and write it in the @dst_pixels
> > + * buffer.
> > + *
> > + * @out_pixel: destination address to write the pixel
> > + * @in_pixel: pixel to write
> > + */
> > +typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> > +
> >  struct vkms_writeback_job {
> >  	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
> >  	struct vkms_frame_info wb_frame_info;
> > -	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
> > +	pixel_write_t pixel_write;
> >  };
> >  
> > +/**
> > + * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> > + * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> > + *
> > + * @in_pixel: Pointer to the pixel to read
> > + * @out_pixel: Pointer to write the converted pixel
> > + */
> > +typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> > +
> >  /**
> >   * vkms_plane_state - Driver specific plane state
> >   * @base: base plane state
> > @@ -69,7 +88,7 @@ struct vkms_writeback_job {
> >  struct vkms_plane_state {
> >  	struct drm_shadow_plane_state base;
> >  	struct vkms_frame_info *frame_info;
> > -	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
> > +	pixel_read_t pixel_read;
> >  };
> >  
> >  struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 6e3dc8682ff9..55a4365d21a4 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
> >   * They are used in the `vkms_compose_row` function to handle multiple formats.
> >   */
> >  
> > -static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  {
> >  	/*
> >  	 * The 257 is the "conversion ratio". This number is obtained by the
> > @@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
> >  	 * the best color value in a pixel format with more possibilities.
> >  	 * A similar idea applies to others RGB color conversions.
> >  	 */
> > -	out_pixel->a = (u16)src_pixels[3] * 257;
> > -	out_pixel->r = (u16)src_pixels[2] * 257;
> > -	out_pixel->g = (u16)src_pixels[1] * 257;
> > -	out_pixel->b = (u16)src_pixels[0] * 257;
> > +	out_pixel->a = (u16)in_pixel[3] * 257;
> > +	out_pixel->r = (u16)in_pixel[2] * 257;
> > +	out_pixel->g = (u16)in_pixel[1] * 257;
> > +	out_pixel->b = (u16)in_pixel[0] * 257;
> >  }
> >  
> > -static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  {
> >  	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = (u16)src_pixels[2] * 257;
> > -	out_pixel->g = (u16)src_pixels[1] * 257;
> > -	out_pixel->b = (u16)src_pixels[0] * 257;
> > +	out_pixel->r = (u16)in_pixel[2] * 257;
> > +	out_pixel->g = (u16)in_pixel[1] * 257;
> > +	out_pixel->b = (u16)in_pixel[0] * 257;
> >  }
> >  
> > -static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >  
> > -	out_pixel->a = le16_to_cpu(pixels[3]);
> > -	out_pixel->r = le16_to_cpu(pixels[2]);
> > -	out_pixel->g = le16_to_cpu(pixels[1]);
> > -	out_pixel->b = le16_to_cpu(pixels[0]);
> > +	out_pixel->a = le16_to_cpu(pixel[3]);
> > +	out_pixel->r = le16_to_cpu(pixel[2]);
> > +	out_pixel->g = le16_to_cpu(pixel[1]);
> > +	out_pixel->b = le16_to_cpu(pixel[0]);
> >  }
> >  
> > -static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >  
> >  	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = le16_to_cpu(pixels[2]);
> > -	out_pixel->g = le16_to_cpu(pixels[1]);
> > -	out_pixel->b = le16_to_cpu(pixels[0]);
> > +	out_pixel->r = le16_to_cpu(pixel[2]);
> > +	out_pixel->g = le16_to_cpu(pixel[1]);
> > +	out_pixel->b = le16_to_cpu(pixel[0]);
> >  }
> >  
> > -static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >  
> >  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> >  
> > -	u16 rgb_565 = le16_to_cpu(*pixels);
> > +	u16 rgb_565 = le16_to_cpu(*pixel);
> >  	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
> >  	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
> >  	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> > @@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> >  
> >  /*
> >   * The following functions take one argb_u16 pixel and convert it to a specific format. The
> > - * result is stored in @dst_pixels.
> > + * result is stored in @out_pixel.
> >   *
> >   * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
> >   * the writeback buffer.
> >   */
> > -static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  {
> >  	/*
> >  	 * This sequence below is important because the format's byte order is
> > @@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
> >  	 * | Addr + 2 | = Red channel
> >  	 * | Addr + 3 | = Alpha channel
> >  	 */
> > -	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> > -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> > +	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> > +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> >  }
> >  
> > -static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  {
> > -	dst_pixels[3] = 0xff;
> > -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> > +	out_pixel[3] = 0xff;
> > +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> >  }
> >  
> > -static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >  
> > -	pixels[3] = cpu_to_le16(in_pixel->a);
> > -	pixels[2] = cpu_to_le16(in_pixel->r);
> > -	pixels[1] = cpu_to_le16(in_pixel->g);
> > -	pixels[0] = cpu_to_le16(in_pixel->b);
> > +	pixel[3] = cpu_to_le16(in_pixel->a);
> > +	pixel[2] = cpu_to_le16(in_pixel->r);
> > +	pixel[1] = cpu_to_le16(in_pixel->g);
> > +	pixel[0] = cpu_to_le16(in_pixel->b);
> >  }
> >  
> > -static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >  
> > -	pixels[3] = 0xffff;
> > -	pixels[2] = cpu_to_le16(in_pixel->r);
> > -	pixels[1] = cpu_to_le16(in_pixel->g);
> > -	pixels[0] = cpu_to_le16(in_pixel->b);
> > +	pixel[3] = 0xffff;
> > +	pixel[2] = cpu_to_le16(in_pixel->r);
> > +	pixel[1] = cpu_to_le16(in_pixel->g);
> > +	pixel[0] = cpu_to_le16(in_pixel->b);
> >  }
> >  
> > -static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >  
> >  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> > @@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> >  	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
> >  	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
> >  
> > -	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
> > +	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
> >  }
> >  
> >  /**
> > @@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   *
> >   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >   */
> > -void *get_pixel_conversion_function(u32 format)
> > +pixel_read_t get_pixel_read_function(u32 format)
> >  {
> >  	switch (format) {
> >  	case DRM_FORMAT_ARGB8888:
> > @@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
> >  	case DRM_FORMAT_RGB565:
> >  		return &RGB565_to_argb_u16;
> >  	default:
> > -		return NULL;
> > +		/*
> > +		 * This is a bug in vkms_plane_atomic_check. All the supported
> > +		 * format must:
> > +		 * - Be listed in vkms_formats in vkms_plane.c
> > +		 * - Have a pixel_read callback defined here
> > +		 */
> > +		WARN(true,
> > +		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> > +		     &format);
> > +		return (pixel_read_t)NULL;
> >  	}
> >  }
> >  
> > @@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
> >   *
> >   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >   */
> > -void *get_pixel_write_function(u32 format)
> > +pixel_write_t get_pixel_write_function(u32 format)
> >  {
> >  	switch (format) {
> >  	case DRM_FORMAT_ARGB8888:
> > @@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
> >  	case DRM_FORMAT_RGB565:
> >  		return &argb_u16_to_RGB565;
> >  	default:
> > -		return NULL;
> > +		/*
> > +		 * This is a bug in vkms_writeback_atomic_check. All the supported
> > +		 * format must:
> > +		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> > +		 * - Have a pixel_write callback defined here
> > +		 */
> > +		WARN(true,
> > +		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> > +		     &format);
> > +		return (pixel_write_t)NULL;
> >  	}
> >  }
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index cf59c2ed8e9a..3ecea4563254 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -5,8 +5,8 @@
> >  
> >  #include "vkms_drv.h"
> >  
> > -void *get_pixel_conversion_function(u32 format);
> > +pixel_read_t get_pixel_read_function(u32 format);
> >  
> > -void *get_pixel_write_function(u32 format);
> > +pixel_write_t get_pixel_write_function(u32 format);
> >  
> >  #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 21b5adfb44aa..10e9b23dab28 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >  	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> >  			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> >  
> > -	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
> > +	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> >  }
> >  
> >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 
> 



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
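
The "257 conversion ratio" mentioned in the quoted ARGB8888 helpers comes from 65535 / 255 == 257: multiplying an 8-bit channel by 257 maps 0x00 to 0x0000 and 0xff to 0xffff, and a round-to-nearest division by 257 maps it back. A minimal userspace sketch follows; example_div_round_closest() is a hypothetical stand-in for the kernel's DIV_ROUND_CLOSEST() and handles unsigned values only.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned round-to-nearest division (stand-in for DIV_ROUND_CLOSEST()). */
static uint32_t example_div_round_closest(uint32_t x, uint32_t divisor)
{
	assert(divisor != 0);
	return (x + divisor / 2) / divisor;
}

/* Widen an 8-bit channel to 16 bits; same as (v << 8) | v. */
static uint16_t example_u8_to_u16(uint8_t v)
{
	return (uint16_t)(v * 257u);
}

/* Narrow a 16-bit channel to 8 bits, rounding to the nearest level. */
static uint8_t example_u16_to_u8(uint16_t v)
{
	return (uint8_t)example_div_round_closest(v, 257);
}
```

Widening then narrowing returns the original 8-bit value for every input, which is why a read followed by a writeback in the same 8-bit format is lossless.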

* Re: [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-25 13:56   ` Maíra Canal
@ 2024-03-26 15:56     ` Louis Chauvet
  2024-03-27 15:03       ` Maíra Canal
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 25/03/24 - 10:56, Maíra Canal wrote:
> On 3/13/24 14:44, Louis Chauvet wrote:
> > Introduce two typedefs: pixel_read_t and pixel_write_t. It allows the
> > compiler to check if the passed functions take the correct arguments.
> > Such typedefs will help ensuring consistency across the code base in
> > case of update of these prototypes.
> > 
> > Rename input/output variable in a consistent way between read_line and
> > write_line.
> > 
> > A warn has been added in get_pixel_*_function to alert when an unsupported
> > pixel format is requested. As those formats are checked before
> > atomic_update callbacks, it should never append.
> > 
> > Document for those typedefs.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
> >   drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
> >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
> >   drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
> >   4 files changed, 95 insertions(+), 58 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 18086423a3a7..4bfc62d26f08 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -53,12 +53,31 @@ struct line_buffer {
> >   	struct pixel_argb_u16 *pixels;
> >   };
> >   
> > +/**
> > + * typedef pixel_write_t - These functions are used to read a pixel from a
> > + * `struct pixel_argb_u16*`, convert it in a specific format and write it in the @dst_pixels
> > + * buffer.
> 
> Your brief description looks a bit big to me. Also, take a look at the 
> cross-references docs [1].

Is this description sufficient?

	typedef pixel_write_t - Convert a pixel from a &struct pixel_argb_u16 into a specific format
 
> [1] 
> https://docs.kernel.org/doc-guide/kernel-doc.html#highlights-and-cross-references
> 
> > + *
> > + * @out_pixel: destination address to write the pixel
> > + * @in_pixel: pixel to write
> > + */
> > +typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
> > +
> >   struct vkms_writeback_job {
> >   	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
> >   	struct vkms_frame_info wb_frame_info;
> > -	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
> > +	pixel_write_t pixel_write;
> >   };
> >   
> > +/**
> > + * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> > + * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> 
> Same.

	typedef pixel_read_t - Read a pixel and convert it to a &struct pixel_argb_u16
 
> > + *
> > + * @in_pixel: Pointer to the pixel to read
> > + * @out_pixel: Pointer to write the converted pixel
> 
> s/Pointer/pointer

Fixed in v6.

> > + */
> > +typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> > +
> >   /**
> >    * vkms_plane_state - Driver specific plane state
> >    * @base: base plane state
> > @@ -69,7 +88,7 @@ struct vkms_writeback_job {
> >   struct vkms_plane_state {
> >   	struct drm_shadow_plane_state base;
> >   	struct vkms_frame_info *frame_info;
> > -	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
> > +	pixel_read_t pixel_read;
> >   };
> >   
> >   struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 6e3dc8682ff9..55a4365d21a4 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
> >    * They are used in the `vkms_compose_row` function to handle multiple formats.
> >    */
> >   
> > -static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   {
> >   	/*
> >   	 * The 257 is the "conversion ratio". This number is obtained by the
> > @@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
> >   	 * the best color value in a pixel format with more possibilities.
> >   	 * A similar idea applies to others RGB color conversions.
> >   	 */
> > -	out_pixel->a = (u16)src_pixels[3] * 257;
> > -	out_pixel->r = (u16)src_pixels[2] * 257;
> > -	out_pixel->g = (u16)src_pixels[1] * 257;
> > -	out_pixel->b = (u16)src_pixels[0] * 257;
> > +	out_pixel->a = (u16)in_pixel[3] * 257;
> > +	out_pixel->r = (u16)in_pixel[2] * 257;
> > +	out_pixel->g = (u16)in_pixel[1] * 257;
> > +	out_pixel->b = (u16)in_pixel[0] * 257;
> >   }
> >   
> > -static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   {
> >   	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = (u16)src_pixels[2] * 257;
> > -	out_pixel->g = (u16)src_pixels[1] * 257;
> > -	out_pixel->b = (u16)src_pixels[0] * 257;
> > +	out_pixel->r = (u16)in_pixel[2] * 257;
> > +	out_pixel->g = (u16)in_pixel[1] * 257;
> > +	out_pixel->b = (u16)in_pixel[0] * 257;
> >   }
> >   
> > -static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >   
> > -	out_pixel->a = le16_to_cpu(pixels[3]);
> > -	out_pixel->r = le16_to_cpu(pixels[2]);
> > -	out_pixel->g = le16_to_cpu(pixels[1]);
> > -	out_pixel->b = le16_to_cpu(pixels[0]);
> > +	out_pixel->a = le16_to_cpu(pixel[3]);
> > +	out_pixel->r = le16_to_cpu(pixel[2]);
> > +	out_pixel->g = le16_to_cpu(pixel[1]);
> > +	out_pixel->b = le16_to_cpu(pixel[0]);
> >   }
> >   
> > -static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >   
> >   	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = le16_to_cpu(pixels[2]);
> > -	out_pixel->g = le16_to_cpu(pixels[1]);
> > -	out_pixel->b = le16_to_cpu(pixels[0]);
> > +	out_pixel->r = le16_to_cpu(pixel[2]);
> > +	out_pixel->g = le16_to_cpu(pixel[1]);
> > +	out_pixel->b = le16_to_cpu(pixel[0]);
> >   }
> >   
> > -static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
> > +static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   {
> > -	u16 *pixels = (u16 *)src_pixels;
> > +	u16 *pixel = (u16 *)in_pixel;
> >   
> >   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> >   
> > -	u16 rgb_565 = le16_to_cpu(*pixels);
> > +	u16 rgb_565 = le16_to_cpu(*pixel);
> >   	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
> >   	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
> >   	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> > @@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> >   
> >   /*
> >    * The following functions take one argb_u16 pixel and convert it to a specific format. The
> > - * result is stored in @dst_pixels.
> > + * result is stored in @out_pixel.
> >    *
> >    * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
> >    * the writeback buffer.
> >    */
> > -static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   {
> >   	/*
> >   	 * This sequence below is important because the format's byte order is
> > @@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
> >   	 * | Addr + 2 | = Red channel
> >   	 * | Addr + 3 | = Alpha channel
> >   	 */
> > -	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> > -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> > +	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
> > +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> >   }
> >   
> > -static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   {
> > -	dst_pixels[3] = 0xff;
> > -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> > +	out_pixel[3] = 0xff;
> > +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
> > +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
> > +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
> >   }
> >   
> > -static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >   
> > -	pixels[3] = cpu_to_le16(in_pixel->a);
> > -	pixels[2] = cpu_to_le16(in_pixel->r);
> > -	pixels[1] = cpu_to_le16(in_pixel->g);
> > -	pixels[0] = cpu_to_le16(in_pixel->b);
> > +	pixel[3] = cpu_to_le16(in_pixel->a);
> > +	pixel[2] = cpu_to_le16(in_pixel->r);
> > +	pixel[1] = cpu_to_le16(in_pixel->g);
> > +	pixel[0] = cpu_to_le16(in_pixel->b);
> >   }
> >   
> > -static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >   
> > -	pixels[3] = 0xffff;
> > -	pixels[2] = cpu_to_le16(in_pixel->r);
> > -	pixels[1] = cpu_to_le16(in_pixel->g);
> > -	pixels[0] = cpu_to_le16(in_pixel->b);
> > +	pixel[3] = 0xffff;
> > +	pixel[2] = cpu_to_le16(in_pixel->r);
> > +	pixel[1] = cpu_to_le16(in_pixel->g);
> > +	pixel[0] = cpu_to_le16(in_pixel->b);
> >   }
> >   
> > -static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> > +static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   {
> > -	u16 *pixels = (u16 *)dst_pixels;
> > +	u16 *pixel = (u16 *)out_pixel;
> >   
> >   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> > @@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
> >   	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
> >   	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
> >   
> > -	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
> > +	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
> >   }
> >   
> >   /**
> > @@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >    *
> >    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >    */
> > -void *get_pixel_conversion_function(u32 format)
> > +pixel_read_t get_pixel_read_function(u32 format)
> >   {
> >   	switch (format) {
> >   	case DRM_FORMAT_ARGB8888:
> > @@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
> >   	case DRM_FORMAT_RGB565:
> >   		return &RGB565_to_argb_u16;
> >   	default:
> > -		return NULL;
> > +		/*
> > +		 * This is a bug in vkms_plane_atomic_check. All the supported
> 
> s/vkms_plane_atomic_check/vkms_plane_atomic_check()

Fixed in v6.

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> > +		 * format must:
> > +		 * - Be listed in vkms_formats in vkms_plane.c
> > +		 * - Have a pixel_read callback defined here
> > +		 */
> > +		WARN(true,
> > +		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> > +		     &format);
> > +		return (pixel_read_t)NULL;
> >   	}
> >   }
> >   
> > @@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
> >    *
> >    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >    */
> > -void *get_pixel_write_function(u32 format)
> > +pixel_write_t get_pixel_write_function(u32 format)
> >   {
> >   	switch (format) {
> >   	case DRM_FORMAT_ARGB8888:
> > @@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
> >   	case DRM_FORMAT_RGB565:
> >   		return &argb_u16_to_RGB565;
> >   	default:
> > -		return NULL;
> > +		/*
> > +		 * This is a bug in vkms_writeback_atomic_check. All the supported
> > +		 * format must:
> > +		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> > +		 * - Have a pixel_write callback defined here
> > +		 */
> > +		WARN(true,
> > +		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> > +		     &format);
> > +		return (pixel_write_t)NULL;
> >   	}
> >   }
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index cf59c2ed8e9a..3ecea4563254 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -5,8 +5,8 @@
> >   
> >   #include "vkms_drv.h"
> >   
> > -void *get_pixel_conversion_function(u32 format);
> > +pixel_read_t get_pixel_read_function(u32 format);
> >   
> > -void *get_pixel_write_function(u32 format);
> > +pixel_write_t get_pixel_write_function(u32 format);
> >   
> >   #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 21b5adfb44aa..10e9b23dab28 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> >   			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> >   
> > -	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
> > +	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> >   }
> >   
> >   static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-25 12:05   ` Pekka Paalanen
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 14:05, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:44:59 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Introduce two callbacks which does nothing. They are used in replacement
> > of NULL and it avoid kernel OOPS if this NULL is called.
> > 
> > If those callback are used, it means that there is a mismatch between
> > what formats are announced by atomic_check and what is realy supported by
> > atomic_update.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
> >  1 file changed, 37 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 55a4365d21a4..b57d85b8b935 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -136,6 +136,21 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >  	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> >  }
> >  
> > +/**
> > + * black_to_argb_u16() - pixel_read callback which always read black
> > + *
> > + * This callback is used when an invalid format is requested for plane reading.
> > + * It is used to avoid null pointer to be used as a function. In theory, this function should
> > + * never be called, except if you found a bug in the driver/DRM core.
> > + */
> > +static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +{
> > +	out_pixel->a = (u16)0xFFFF;
> > +	out_pixel->r = 0;
> > +	out_pixel->g = 0;
> > +	out_pixel->b = 0;
> > +}
> > +
> >  /**
> >   * vkms_compose_row - compose a single row of a plane
> >   * @stage_buffer: output line with the composed pixels
> > @@ -238,6 +253,16 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >  	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
> >  }
> >  
> > +/**
> > + * argb_u16_to_nothing() - pixel_write callback with no effect
> > + *
> > + * This callback is used when an invalid format is requested for writeback.
> > + * It is used to avoid null pointer to be used as a function. In theory, this should never
> > + * happen, except if there is a bug in the driver
> > + */
> > +static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> > +{}
> > +
> >  /**
> >   * Generic loop for all supported writeback format. It is executed just after the blending to
> >   * write a line in the writeback buffer.
> > @@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >  
> >  /**
> >   * Retrieve the correct read_pixel function for a specific format.
> > - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > - * pointer is valid before using it in a vkms_plane_state.
> > + * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> > + * function is returned.
> >   *
> >   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >   */
> > @@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
> >  		 * format must:
> >  		 * - Be listed in vkms_formats in vkms_plane.c
> >  		 * - Have a pixel_read callback defined here
> > +		 *
> > +		 * To avoid kernel crash, a dummy "always read black" function is used. It means
> > +		 * that during the composition, this plane will always be black.
> >  		 */
> >  		WARN(true,
> >  		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> >  		     &format);
> > -		return (pixel_read_t)NULL;
> > +		return &black_to_argb_u16;
> 
> Hi Louis,
> 
> I'm perhaps a bit paranoid in these things, but I'd make this not
> black. Maybe something more "screaming" like magenta. There is a slight
> chance that black might sometimes be expected, or not affect the
> result. After all, blending something into black with pre-multiplied
> alpha is equivalent to no-blending (a copy). The kernel warning is
> good, the magenta is more like an assurance.

Changed to magenta (0xFF/0x00/0xFF) in v6.

Thanks,
Louis Chauvet
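
For illustration only, here is a minimal userspace sketch of the fallback idea discussed above: the lookup returns a typed callback and, for an unknown format, a callback that always reads opaque magenta instead of NULL. It is not the v6 code. The enum sketch_format values, the magenta_to_argb_u16() name and the fprintf() standing in for WARN() are assumptions made for this example; struct pixel_argb_u16 is re-declared locally with userspace types.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct pixel_argb_u16. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Typed callback, so the compiler checks every assignment and call. */
typedef void (*pixel_read_t)(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel);

/* Hypothetical format identifiers, standing in for DRM_FORMAT_* values. */
enum sketch_format { SKETCH_XRGB8888, SKETCH_UNSUPPORTED };

static void xrgb8888_to_argb_u16(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	/* Little-endian XRGB8888: bytes are B, G, R, X. 8 -> 16 bits via * 257. */
	out_pixel->a = 0xffff;
	out_pixel->r = (uint16_t)(in_pixel[2] * 257);
	out_pixel->g = (uint16_t)(in_pixel[1] * 257);
	out_pixel->b = (uint16_t)(in_pixel[0] * 257);
}

/* Fallback (hypothetical name): ignores the input, always yields opaque magenta. */
static void magenta_to_argb_u16(const uint8_t *in_pixel, struct pixel_argb_u16 *out_pixel)
{
	(void)in_pixel;
	out_pixel->a = 0xffff;
	out_pixel->r = 0xffff;
	out_pixel->g = 0x0000;
	out_pixel->b = 0xffff;
}

static pixel_read_t get_pixel_read_function(enum sketch_format format)
{
	switch (format) {
	case SKETCH_XRGB8888:
		return &xrgb8888_to_argb_u16;
	default:
		/* The driver would WARN() here; this sketch only logs. */
		fprintf(stderr, "unsupported format %d, reading magenta\n", (int)format);
		return &magenta_to_argb_u16;
	}
}
```

With this shape the caller never has to test the returned pointer, and a format mismatch shows up in the composed output as a colour that is unlikely to be expected by any test.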
 
> Anyway,
> 
> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
> 
> 
> Thanks,
> pq
> 
> 
> >  	}
> >  }
> >  
> >  /**
> >   * Retrieve the correct write_pixel function for a specific format.
> > - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > - * pointer is valid before using it in a vkms_writeback_job.
> > + * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> > + * function is returned.
> >   *
> >   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >   */
> > @@ -319,10 +347,13 @@ pixel_write_t get_pixel_write_function(u32 format)
> >  		 * format must:
> >  		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> >  		 * - Have a pixel_write callback defined here
> > +		 *
> > +		 * To avoid kernel crash, a dummy "don't do anything" function is used. It means
> > +		 * that the resulting writeback buffer is not composed and can contains any values.
> >  		 */
> >  		WARN(true,
> >  		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> >  		     &format);
> > -		return (pixel_write_t)NULL;
> > +		return &argb_u16_to_nothing;
> >  	}
> >  }
> > 
> 



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
  2024-03-25 13:59   ` Maíra Canal
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 10:59, Maíra Canal a écrit :
> On 3/13/24 14:44, Louis Chauvet wrote:
> > Introduce two callbacks which does nothing. They are used in replacement
> > of NULL and it avoid kernel OOPS if this NULL is called.
> > 
> > If those callback are used, it means that there is a mismatch between
> > what formats are announced by atomic_check and what is realy supported by
> > atomic_update.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_formats.c | 43 +++++++++++++++++++++++++++++++------
> >   1 file changed, 37 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 55a4365d21a4..b57d85b8b935 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -136,6 +136,21 @@ static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> >   	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> >   }
> >   
> > +/**
> > + * black_to_argb_u16() - pixel_read callback which always read black
> > + *
> > + * This callback is used when an invalid format is requested for plane reading.
> > + * It is used to avoid null pointer to be used as a function. In theory, this function should
> > + * never be called, except if you found a bug in the driver/DRM core.
> > + */
> > +static void black_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +{
> > +	out_pixel->a = (u16)0xFFFF;
> > +	out_pixel->r = 0;
> > +	out_pixel->g = 0;
> > +	out_pixel->b = 0;
> > +}
> > +
> >   /**
> >    * vkms_compose_row - compose a single row of a plane
> >    * @stage_buffer: output line with the composed pixels
> > @@ -238,6 +253,16 @@ static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> >   	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
> >   }
> >   
> > +/**
> > + * argb_u16_to_nothing() - pixel_write callback with no effect
> > + *
> > + * This callback is used when an invalid format is requested for writeback.
> > + * It is used to avoid null pointer to be used as a function. In theory, this should never
> > + * happen, except if there is a bug in the driver
> > + */
> > +static void argb_u16_to_nothing(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
> > +{}
> > +
> >   /**
> >    * Generic loop for all supported writeback format. It is executed just after the blending to
> >    * write a line in the writeback buffer.
> > @@ -261,8 +286,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   
> >   /**
> >    * Retrieve the correct read_pixel function for a specific format.
> > - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > - * pointer is valid before using it in a vkms_plane_state.
> > + * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> 
> "If the format is not supported by VKMS, a warning is emitted and a 
> dummy "always read black"..."

Fixed in v6.
 
> > + * function is returned.
> >    *
> >    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >    */
> > @@ -285,18 +310,21 @@ pixel_read_t get_pixel_read_function(u32 format)
> >   		 * format must:
> >   		 * - Be listed in vkms_formats in vkms_plane.c
> >   		 * - Have a pixel_read callback defined here
> > +		 *
> > +		 * To avoid kernel crash, a dummy "always read black" function is used. It means
> > +		 * that during the composition, this plane will always be black.
> >   		 */
> >   		WARN(true,
> >   		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
> >   		     &format);
> > -		return (pixel_read_t)NULL;
> > +		return &black_to_argb_u16;
> >   	}
> >   }
> >   
> >   /**
> >    * Retrieve the correct write_pixel function for a specific format.
> > - * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
> > - * pointer is valid before using it in a vkms_writeback_job.
> > + * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> 
> "If the format is not supported by VKMS, a warning is emitted and a 
> dummy "don't do anything"..."

Fixed in v6.

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> > + * function is returned.
> >    *
> >    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >    */
> > @@ -319,10 +347,13 @@ pixel_write_t get_pixel_write_function(u32 format)
> >   		 * format must:
> >   		 * - Be listed in vkms_wb_formats in vkms_writeback.c
> >   		 * - Have a pixel_write callback defined here
> > +		 *
> > +		 * To avoid kernel crash, a dummy "don't do anything" function is used. It means
> > +		 * that the resulting writeback buffer is not composed and can contains any values.
> >   		 */
> >   		WARN(true,
> >   		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
> >   		     &format);
> > -		return (pixel_write_t)NULL;
> > +		return &argb_u16_to_nothing;
> >   	}
> >   }
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats.
  2024-03-25 12:40   ` Pekka Paalanen
@ 2024-03-26 15:56     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:56 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 14:40, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:01 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Introduce the usage of block_h/block_w to compute the offset and the
> > pointer of a pixel. The previous implementation was specialized for
> > planes with block_h == block_w == 1. To avoid confusion and allow easier
> > implementation of tiled formats. It also remove the usage of the
> > deprecated format field `cpp`.
> > 
> > Introduce the plane_index parameter to get an offset/pointer on a
> > different plane.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_formats.c | 76 +++++++++++++++++++++++++------------
> >  1 file changed, 52 insertions(+), 24 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index b2f8dfc26c35..649d75d05b1f 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -10,23 +10,43 @@
> >  #include "vkms_formats.h"
> >  
> >  /**
> > - * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
> > + * packed_pixels_offset() - Get the offset of the block containing the pixel at coordinates x/y
> >   *
> >   * @frame_info: Buffer metadata
> >   * @x: The x coordinate of the wanted pixel in the buffer
> >   * @y: The y coordinate of the wanted pixel in the buffer
> > + * @plane_index: The index of the plane to use
> > + * @offset: The returned offset inside the buffer of the block
> > + * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
> >   *
> > - * The caller must ensure that the framebuffer associated with this request uses a pixel format
> > - * where block_h == block_w == 1.
> > - * If this requirement is not fulfilled, the resulting offset can point to an other pixel or
> > - * outside of the buffer.
> > + * As some pixel formats store multiple pixels in a block (DRM_FORMAT_R* for example), some
> > + * pixels are not individually addressable. This function return 3 values: the offset of the
> > + * whole block, and the coordinate of the requested pixel inside this block.
> > + * For example, if the format is DRM_FORMAT_R1 and the requested coordinate is 13,5, the offset
> > + * will point to the byte 5*pitches + 13/8 (second byte of the 5th line), and the rem_x/rem_y
> > + * coordinates will be (13 % 8, 5 % 1) = (5, 0)
> > + *
> > + * With this function, the caller just have to extract the correct pixel from the block.
> >   */
> > -static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> > +static void packed_pixels_offset(const struct vkms_frame_info *frame_info, int x, int y,
> > +				 int plane_index, int *offset, int *rem_x, int *rem_y)
> >  {
> >  	struct drm_framebuffer *fb = frame_info->fb;
> > +	const struct drm_format_info *format = frame_info->fb->format;
> > +	/* Directly using x and y to multiply pitches and format->ccp is not sufficient because
> > +	 * in some formats a block can represent multiple pixels.
> > +	 *
> > +	 * Dividing x and y by the block size allows to extract the correct offset of the block
> > +	 * containing the pixel.
> > +	 */
> >  
> > -	return fb->offsets[0] + (y * fb->pitches[0])
> > -			      + (x * fb->format->cpp[0]);
> > +	int block_x = x / drm_format_info_block_width(format, plane_index);
> > +	int block_y = y / drm_format_info_block_height(format, plane_index);
> > +	*rem_x = x % drm_format_info_block_width(format, plane_index);
> > +	*rem_y = x % drm_format_info_block_height(format, plane_index);
> 
> typo: x should be y

Fixed in v6.
 
> 
> > +	*offset = fb->offsets[plane_index] +
> > +		  block_y * fb->pitches[plane_index] +
> > +		  block_x * format->char_per_block[plane_index];
> >  }
> 
> Ok, this function looks very much plausible for handling blocky
> formats. Good.

Thanks!
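
As a worked example of the block arithmetic reviewed above, here is a small userspace sketch, assuming a simplified plane description. struct sketch_plane and sketch_block_offset() are hypothetical names introduced for this example, not the driver's types; the remainder for y is computed from y, as pointed out in the review.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified description of one plane of a framebuffer. */
struct sketch_plane {
	size_t offset;          /* start of the plane in the buffer, in bytes */
	size_t pitch;           /* bytes per row of blocks */
	int block_w, block_h;   /* pixels per block, horizontally and vertically */
	size_t bytes_per_block; /* plays the role of char_per_block[] */
};

/*
 * Compute the byte offset of the block containing pixel (x, y) and the
 * position of that pixel inside the block. Coordinates must be >= 0.
 */
static void sketch_block_offset(const struct sketch_plane *p, int x, int y,
				size_t *offset, int *rem_x, int *rem_y)
{
	int block_x = x / p->block_w;
	int block_y = y / p->block_h;

	*rem_x = x % p->block_w;
	*rem_y = y % p->block_h; /* y here, not x */
	*offset = p->offset + (size_t)block_y * p->pitch +
		  (size_t)block_x * p->bytes_per_block;
}
```

For an R1-like layout (8x1 pixels per 1-byte block) and pixel (13, 5), this gives the offset 5 * pitch + 1 and the in-block position (5, 0), matching the example in the doc comment. For a format with 1x1 blocks both remainders are always zero.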

> >  
> >  /**
> > @@ -36,30 +56,35 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
> >   * @frame_info: Buffer metadata
> >   * @x: The x(width) coordinate inside the plane
> >   * @y: The y(height) coordinate inside the plane
> > + * @plane_index: The index of the plane
> > + * @addr: The returned pointer
> > + * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block
> >   *
> > - * Takes the information stored in the frame_info, a pair of coordinates, and
> > - * returns the address of the first color channel.
> > - * This function assumes the channels are packed together, i.e. a color channel
> > - * comes immediately after another in the memory. And therefore, this function
> > - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> > + * Takes the information stored in the frame_info, a pair of coordinates, and returns the address
> > + * of the block containing this pixel and the pixel position inside this block.
> >   *
> > - * The caller must ensure that the framebuffer associated with this request uses a pixel format
> > - * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
> > + * See @packed_pixel_offset for details about rem_x/rem_y behavior.
> >   */
> > -static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
> > -				int x, int y)
> > +static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
> > +			       int x, int y, int plane_index, u8 **addr, int *rem_x,
> > +			       int *rem_y)
> >  {
> > -	size_t offset = pixel_offset(frame_info, x, y);
> > +	int offset;
> >  
> > -	return (u8 *)frame_info->map[0].vaddr + offset;
> > +	packed_pixels_offset(frame_info, x, y, plane_index, &offset, rem_x, rem_y);
> > +	*addr = (u8 *)frame_info->map[0].vaddr + offset;
> >  }
> >  
> > -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y)
> > +static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> > +				 int plane_index)
> >  {
> >  	int x_src = frame_info->src.x1 >> 16;
> >  	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> > +	u8 *addr;
> > +	int rem_x, rem_y;
> >  
> > -	return packed_pixels_addr(frame_info, x_src, y_src);
> > +	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
> 
> How can the caller be not interested in rem_x, rem_y?

At this point in the series, I did not change how the rest works. As 
this function will be deleted later, I just adapted it to use the new 
packed_pixels_addr() implementation.
 
> Maybe there is no IGT test that uses DRM_FORMAT_R1 FB on a plane and
> has a source rectangle whose x is not divisible by 8 pixels?
> Or maybe the FB is filled with a solid color instead of a pattern that
> would show source rectangle positioning problems?

Currently, there is no DRM_FORMAT_R1 test in IGT, and all formats 
supported by VKMS are aligned on 8/16 bits with block_w == block_h == 1, 
so rem_x and rem_y will be equal to zero.

> Maybe at this point of the series, this should assert that rem_x and
> rem_y are zero? That's what vkms_compose_row() assumes, right?

More specifically, vkms_compose_row() assumes that
block_w == block_h == 1, so maybe something more like:

	WARN_ONCE(drm_format_info_block_width(format, plane_index) != 1, "get_packed_src_addr() only supports formats with block_w == 1");
	WARN_ONCE(drm_format_info_block_height(format, plane_index) != 1, "get_packed_src_addr() only supports formats with block_h == 1");

Thanks,
Louis Chauvet

> 
> Thanks,
> pq
> 
> > +	return addr;
> >  }
> >  
> >  static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> > @@ -168,14 +193,14 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> >  {
> >  	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
> >  	struct vkms_frame_info *frame_info = plane->frame_info;
> > -	u8 *src_pixels = get_packed_src_addr(frame_info, y);
> > +	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
> >  	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> >  
> >  	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> >  		int x_pos = get_x_position(frame_info, limit, x);
> >  
> >  		if (drm_rotation_90_or_270(frame_info->rotation))
> > -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> > +			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
> >  				+ frame_info->fb->format->cpp[0] * y;
> >  
> >  		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> > @@ -276,7 +301,10 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >  {
> >  	struct vkms_frame_info *frame_info = &wb->wb_frame_info;
> >  	int x_dst = frame_info->dst.x1;
> > -	u8 *dst_pixels = packed_pixels_addr(frame_info, x_dst, y);
> > +	u8 *dst_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(frame_info, x_dst, y, 0, &dst_pixels, &rem_x, &rem_y);
> >  	struct pixel_argb_u16 *in_pixels = src_buffer->pixels;
> >  	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), src_buffer->n_pixels);
> >  
> > 
> 



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
  2024-03-25 12:41   ` Pekka Paalanen
@ 2024-03-26 15:57     ` Louis Chauvet
  2024-03-27 11:48       ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 14:41, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:02 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > The pre_mul_alpha_blend is dedicated to blending, so to avoid mixing
> > different concepts (coordinate calculation and color management), extract
> > the x_limit and x_dst computation outside of this helper.
> > It also increases the maintainability by grouping the computation related
> > to coordinates in the same place: the loop in `blend`.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_composer.c | 40 +++++++++++++++++-------------------
> >  1 file changed, 19 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index da0651a94c9b..9254086f23ff 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > @@ -24,34 +24,30 @@ static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
> >  
> >  /**
> >   * pre_mul_alpha_blend - alpha blending equation
> > - * @frame_info: Source framebuffer's metadata
> >   * @stage_buffer: The line with the pixels from src_plane
> >   * @output_buffer: A line buffer that receives all the blends output
> > + * @x_start: The start offset to avoid useless copy
> 
> I'd say just:
> 
> + * @x_start: The start offset
> 
> It describes the parameter, and the paragraph below explains the why.
> 
> It would be explaining, that x_start applies to output_buffer, but
> input_buffer is always read starting from 0.

I will change it to:

 * Using @x_start and @count information, only a few pixels can be blended instead of the whole line
 * each time. @x_start is only used for the output buffer. The staging buffer is always read from
 * the start (0..@count in stage_buffer is blended at @x_start..@x_start+@count in output_buffer).
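
To make the indexing described in that comment concrete, here is a rough userspace sketch. The per-channel formula (src + dst * (1 - alpha), rounded to nearest) is an assumption about what pre_mul_blend_channel() computes, and the sketch_ names are made up for this example. It takes plain pixel arrays rather than struct line_buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct pixel_argb_u16. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Assumed premultiplied blend: src + dst * (1 - alpha), rounded to nearest. */
static uint16_t sketch_blend_channel(uint16_t src, uint16_t dst, uint16_t alpha)
{
	uint64_t new_color = (uint64_t)src * 0xffff +
			     (uint64_t)dst * (uint64_t)(0xffff - alpha);

	return (uint16_t)((new_color + 0xffff / 2) / 0xffff);
}

/*
 * Blend stage[0 .. pixel_count) into output[x_start .. x_start + pixel_count).
 * The staging line is always read from index 0; only the output is offset.
 */
static void sketch_alpha_blend(const struct pixel_argb_u16 *stage,
			       struct pixel_argb_u16 *output,
			       int x_start, int pixel_count)
{
	struct pixel_argb_u16 *out = &output[x_start];

	for (int i = 0; i < pixel_count; i++) {
		out[i].a = 0xffff;
		out[i].r = sketch_blend_channel(stage[i].r, out[i].r, stage[i].a);
		out[i].g = sketch_blend_channel(stage[i].g, out[i].g, stage[i].a);
		out[i].b = sketch_blend_channel(stage[i].b, out[i].b, stage[i].a);
	}
}
```

Pixels of the output line outside the blended range are left untouched, and a fully transparent staging pixel leaves the corresponding output pixel unchanged.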

> > + * @count: The number of byte to copy
> 
> You named it pixel_count, and it counts pixels, not bytes. It's not a
> copy but a blend into output_buffer.

Oops, fixed in v6.
 
> >   *
> > - * Using the information from the `frame_info`, this blends only the
> > - * necessary pixels from the `stage_buffer` to the `output_buffer`
> > - * using premultiplied blend formula.
> > + * Using @x_start and @count information, only few pixel can be blended instead of the whole line
> > + * each time.
> >   *
> >   * The current DRM assumption is that pixel color values have been already
> >   * pre-multiplied with the alpha channel values. See more
> >   * drm_plane_create_blend_mode_property(). Also, this formula assumes a
> >   * completely opaque background.
> >   */
> > -static void pre_mul_alpha_blend(struct vkms_frame_info *frame_info,
> > -				struct line_buffer *stage_buffer,
> > -				struct line_buffer *output_buffer)
> > +static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> > +				struct line_buffer *output_buffer, int x_start, int pixel_count)
> >  {
> > -	int x_dst = frame_info->dst.x1;
> > -	struct pixel_argb_u16 *out = output_buffer->pixels + x_dst;
> > -	struct pixel_argb_u16 *in = stage_buffer->pixels;
> > -	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
> > -			    stage_buffer->n_pixels);
> > -
> > -	for (int x = 0; x < x_limit; x++) {
> > -		out[x].a = (u16)0xffff;
> > -		out[x].r = pre_mul_blend_channel(in[x].r, out[x].r, in[x].a);
> > -		out[x].g = pre_mul_blend_channel(in[x].g, out[x].g, in[x].a);
> > -		out[x].b = pre_mul_blend_channel(in[x].b, out[x].b, in[x].a);
> > +	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> > +	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> > +
> > +	for (int i = 0; i < pixel_count; i++) {
> > +		out[i].a = (u16)0xffff;
> > +		out[i].r = pre_mul_blend_channel(in[i].r, out[i].r, in[i].a);
> > +		out[i].g = pre_mul_blend_channel(in[i].g, out[i].g, in[i].a);
> > +		out[i].b = pre_mul_blend_channel(in[i].b, out[i].b, in[i].a);
> >  	}
> >  }
> >  
> > @@ -183,7 +179,7 @@ static void blend(struct vkms_writeback_job *wb,
> >  {
> >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> >  	u32 n_active_planes = crtc_state->num_active_planes;
> > -	int y_pos;
> > +	int y_pos, x_dst, x_limit;
> >  
> >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> >  
> > @@ -201,14 +197,16 @@ static void blend(struct vkms_writeback_job *wb,
> >  
> >  		/* The active planes are composed associatively in z-order. */
> >  		for (size_t i = 0; i < n_active_planes; i++) {
> > +			x_dst = plane[i]->frame_info->dst.x1;
> > +			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > +					stage_buffer->n_pixels);
> 
> Are those input values to min_t() really of type size_t? Or why is
> size_t here?

n_pixels is size_t, drm_rect_width() returns int. I will change everything
to int. Is there a way to ask the compiler "please don't do implicit
conversions, and report them as warnings/errors"?
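
On the compiler question: GCC and Clang both provide -Wconversion and
-Wsign-conversion, which warn about implicit conversions that may change
a value, and -Werror=conversion turns those warnings into errors. The
sketch below is a standalone userspace illustration, not VKMS code.
min_as_size_t() and min_as_int() are hypothetical names; they only show
why the type chosen for the comparison would matter if the width were
ever negative.

```c
#include <stddef.h>

/* Mimics min_t(size_t, width, n_pixels): a negative width wraps to a huge value. */
static size_t min_as_size_t(int width, size_t n_pixels)
{
	size_t w = (size_t)width;

	return w < n_pixels ? w : n_pixels;
}

/* Mimics min_t(int, width, n_pixels): a negative width stays negative. */
static int min_as_int(int width, int n_pixels)
{
	return width < n_pixels ? width : n_pixels;
}
```

With width = -5 and n_pixels = 100, the size_t variant picks n_pixels
while the int variant picks the negative width, so the caller can detect
it.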

> >  			y_pos = get_y_pos(plane[i]->frame_info, y);
> >  
> >  			if (!check_limit(plane[i]->frame_info, y_pos))
> >  				continue;
> >  
> >  			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > -			pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
> > -					    output_buffer);
> > +			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> 
> I thought it was a count, not a limit?
> 
> "Limit" sounds to me like "end", and end - start = count.

It is effectively a pixel count. I just took that naming from the 
original pre_mul_alpha_blend. I will change it to pixel_count.

Thanks,
Louis Chauvet

> >  		}
> >  
> >  		apply_lut(crtc_state, output_buffer);
> > 
> 
> The details aside, this is a good move.
> 
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-25 13:11   ` Pekka Paalanen
@ 2024-03-26 15:57     ` Louis Chauvet
  2024-03-27 12:16       ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 15:11, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:03 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > The pixel_read_direction enum is useful to describe the reading direction
> > in a plane. It avoids using the rotation property of DRM, which is not
> > practical to know the direction of reading.
> > This patch also introduces two helpers, one to compute the
> > pixel_read_direction from the DRM rotation property, and one to compute
> > the step, in bytes, between two successive pixels in a specific direction.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> >  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> >  3 files changed, 77 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index 9254086f23ff..989bcf59f375 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
> >  	}
> >  }
> >  
> > +/**
> > + * direction_for_rotation() - Get the correct reading direction for a given rotation
> > + *
> > + * This function will use the @rotation setting of a source plane to compute the reading
> > + * direction in this plane which corresponds to a "left to right writing" in the CRTC.
> > + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> > + * to be written from left to right on the CRTC.
> 
> That is a well written description.

Thanks
 
> > + *
> > + * @rotation: Rotation to analyze. It corresponds to the field @frame_info.rotation.
> > + */
> > +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> > +{
> > +	if (rotation & DRM_MODE_ROTATE_0) {
> > +		if (rotation & DRM_MODE_REFLECT_X)
> > +			return READ_RIGHT_TO_LEFT;
> > +		else
> > +			return READ_LEFT_TO_RIGHT;
> > +	} else if (rotation & DRM_MODE_ROTATE_90) {
> > +		if (rotation & DRM_MODE_REFLECT_Y)
> > +			return READ_BOTTOM_TO_TOP;
> > +		else
> > +			return READ_TOP_TO_BOTTOM;
> > +	} else if (rotation & DRM_MODE_ROTATE_180) {
> > +		if (rotation & DRM_MODE_REFLECT_X)
> > +			return READ_LEFT_TO_RIGHT;
> > +		else
> > +			return READ_RIGHT_TO_LEFT;
> > +	} else if (rotation & DRM_MODE_ROTATE_270) {
> > +		if (rotation & DRM_MODE_REFLECT_Y)
> > +			return READ_TOP_TO_BOTTOM;
> > +		else
> > +			return READ_BOTTOM_TO_TOP;
> > +	}
> > +	return READ_LEFT_TO_RIGHT;
> 
> I'm a little worried seeing REFLECT_X is supported only for some
> rotations, and REFLECT_Y for other rotations. Why is an analysis of all
> combinations not necessary?

I don't need to handle all the combinations because this is only about 
the "horizontal writing".

So, if you want to write a line in the CRTC, with:
- ROT_0 | REF_X => You need to read the source line from right to left
- ROT_0 => You need to read the source line from left to right
- ROT_0 | REF_Y => You need to read the source line from left to right

In this case, REF_Y only has an effect on the "column reading". It is not 
needed here because the new version of the blend function will use the 
drm_rect_* helpers to compute the correct y coordinate.

If you think it's clearer, I can create a big switch(rotation) like this:

	switch (rotation) {
	case ROT_0:
	case ROT_0 | REF_Y:
		return L2R;
	case ROT_0 | REF_X:
		return R2L;
	case ROT_90:
	case ROT_90 | REF_X:
		return T2B;
	[...]
	}

So all cases are clearly covered?
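
Spelled out in full, and following the if/else chain of the quoted
direction_for_rotation(), such a switch could look like the sketch
below. It is a standalone illustration: the short ROT_*/REF_* names are
local to the sketch (to my knowledge they carry the same bit values as
the DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* flags), and
direction_for_rotation_switch() is a hypothetical name.

```c
/* Local short names; the bit positions follow DRM_MODE_ROTATE_* / DRM_MODE_REFLECT_*. */
#define ROT_0   (1u << 0)
#define ROT_90  (1u << 1)
#define ROT_180 (1u << 2)
#define ROT_270 (1u << 3)
#define REF_X   (1u << 4)
#define REF_Y   (1u << 5)

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

/* One case label per rotation/reflection combination, 16 in total. */
static enum pixel_read_direction direction_for_rotation_switch(unsigned int rotation)
{
	switch (rotation) {
	case ROT_0:
	case ROT_0 | REF_Y:
	case ROT_180 | REF_X:
	case ROT_180 | REF_X | REF_Y:
		return READ_LEFT_TO_RIGHT;
	case ROT_0 | REF_X:
	case ROT_0 | REF_X | REF_Y:
	case ROT_180:
	case ROT_180 | REF_Y:
		return READ_RIGHT_TO_LEFT;
	case ROT_90:
	case ROT_90 | REF_X:
	case ROT_270 | REF_Y:
	case ROT_270 | REF_X | REF_Y:
		return READ_TOP_TO_BOTTOM;
	case ROT_90 | REF_Y:
	case ROT_90 | REF_X | REF_Y:
	case ROT_270:
	case ROT_270 | REF_X:
		return READ_BOTTOM_TO_TOP;
	default:
		/* Invalid bitfield: same fallback as the if/else version. */
		return READ_LEFT_TO_RIGHT;
	}
}
```

The table makes it visible that REF_Y never changes the result for
ROT_0/ROT_180, and REF_X never changes it for ROT_90/ROT_270.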

> I hope IGT uses FB patterns instead of solid color in its tests of
> rotation to be able to detect the difference.

They use solid colors, and even my new rotation test [3] uses solid colors. 
This is mainly because of YUV formats with subsampling: with such formats, 
a "software rotated buffer" and a "hardware rotated buffer" will not apply 
the same subsampling, so the colors will be slightly different.

> The return values do seem correct to me, assuming I have guessed
> correctly what "X" and "Y" refer to when combined with rotation. I did
> not find good documentation about that.

Yes, it is difficult to understand how rotation and reflection should 
work in DRM. I spent half a day testing all the combinations in the 
drm_rect_* helpers to understand how this works. According to the code:
- With only a rotation or only a reflection, it behaves as expected
- If reflection and rotation are mixed, the source buffer is first 
  reflected and then rotated.
 
> Btw. if there are already functions that are able to transform
> coordinates based on the rotation bitfield, you could alternatively use
> them. Transform CRTC point (0, 0) to A, and (1, 0) to B. Now A and B
> are in plane coordinate system, and vector B - A gives you the
> direction. The reason I'm mentioning this is that then you don't have
> to implement yet another copy of the rotation bitfield semantics from
> scratch.

You are totally right. I will try this elegant method. Yes, there are some 
helpers (drm_rect_rotate_inv), so I will try to do something with them.
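
The vector idea can be sketched in userspace as follows. This is only an
illustration under stated assumptions: to_source() is a hypothetical
stand-in for drm_rect_rotate_inv() applied to a single point, and it
assumes the order described above (the source is reflected, then
rotated, so the inverse undoes the rotation first and the reflection
second). The ROT_*/REF_* names are local to the sketch.

```c
#define ROT_0    (1u << 0)
#define ROT_90   (1u << 1)
#define ROT_180  (1u << 2)
#define ROT_270  (1u << 3)
#define REF_X    (1u << 4)
#define REF_Y    (1u << 5)
#define ROT_MASK (ROT_0 | ROT_90 | ROT_180 | ROT_270)

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

struct point {
	int x, y;
};

/* Map a point of the rotated plane back into a width x height source buffer. */
static struct point to_source(struct point p, int width, int height, unsigned int rotation)
{
	struct point s = p;

	/* Undo the rotation. */
	switch (rotation & ROT_MASK) {
	case ROT_90:
		s.x = width - 1 - p.y;
		s.y = p.x;
		break;
	case ROT_180:
		s.x = width - 1 - p.x;
		s.y = height - 1 - p.y;
		break;
	case ROT_270:
		s.x = p.y;
		s.y = height - 1 - p.x;
		break;
	default:
		break;
	}

	/* Undo the reflection. */
	if (rotation & REF_X)
		s.x = width - 1 - s.x;
	if (rotation & REF_Y)
		s.y = height - 1 - s.y;

	return s;
}

/* Transform (0, 0) and (1, 0), then classify the vector B - A. */
static enum pixel_read_direction direction_from_vector(unsigned int rotation)
{
	/* A 2x2 source is enough: only the sign of the difference matters. */
	struct point a = to_source((struct point){ 0, 0 }, 2, 2, rotation);
	struct point b = to_source((struct point){ 1, 0 }, 2, 2, rotation);

	if (b.x > a.x)
		return READ_LEFT_TO_RIGHT;
	if (b.x < a.x)
		return READ_RIGHT_TO_LEFT;
	if (b.y > a.y)
		return READ_TOP_TO_BOTTOM;
	return READ_BOTTOM_TO_TOP;
}
```

The advantage is that the rotation semantics live in one transform
function only; the direction is derived from it instead of being encoded
a second time.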

> 
> > +}
> > +
> >  /**
> >   * blend - blend the pixels from all planes and compute crc
> >   * @wb: The writeback frame buffer metadata
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 3ead8b39af4a..985e7a92b7bc 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -69,6 +69,17 @@ struct vkms_writeback_job {
> >  	pixel_write_t pixel_write;
> >  };
> >  
> > +/**
> > + * enum pixel_read_direction - Enum used internally by VKMS to represent a reading direction in a
> > + * plane.
> > + */
> > +enum pixel_read_direction {
> > +	READ_BOTTOM_TO_TOP,
> > +	READ_TOP_TO_BOTTOM,
> > +	READ_RIGHT_TO_LEFT,
> > +	READ_LEFT_TO_RIGHT
> > +};
> > +
> >  /**
> >   * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> >   * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 649d75d05b1f..743b6fd06db5 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
> >  	*addr = (u8 *)frame_info->map[0].vaddr + offset;
> >  }
> >  
> > +/**
> > + * get_step_next_block() - Common helper to compute the correct step value between each pixel block
> > + * to read in a certain direction.
> > + *
> > + * As the returned offset is the number of bytes between two consecutive blocks in a direction,
> > + * the caller may have to read multiple pixels before using the next one (for example, to read from
> > + * left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be used
> > + * only every 8 pixels).
> > + *
> > + * @fb: Framebuffer to iterate on
> > + * @direction: Direction of the reading
> > + * @plane_index: Plane to get the step from
> > + */
> > +static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
> > +			       int plane_index)
> > +{
> 
> I would have called this something like get_block_step_bytes() for
> example. That makes it clear it returns bytes (not e.g. pixels). "next"
> implies to me that I tell the function the current block, and then it
> gets me the next one. It does not do that, so I'd not use "next".

Nice name, I will take it for v6.
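
To make the "step in bytes" idea concrete, here is a minimal userspace
sketch of how such a helper could drive a line read. It is not the VKMS
code: struct fb_layout is a hypothetical stand-in for the two
drm_framebuffer fields used by the quoted helper, and read_line_r8() is
a made-up reader for a one-byte-per-pixel format.

```c
#include <stddef.h>
#include <stdint.h>

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

/* Hypothetical stand-in for the framebuffer fields used by the helper. */
struct fb_layout {
	int bytes_per_block;	/* plays the role of fb->format->char_per_block[i] */
	int pitch;		/* plays the role of fb->pitches[i], in bytes */
};

/* Signed distance, in bytes, between two consecutive blocks in @direction. */
static int get_block_step_bytes(const struct fb_layout *fb,
				enum pixel_read_direction direction)
{
	switch (direction) {
	case READ_LEFT_TO_RIGHT:
		return fb->bytes_per_block;
	case READ_RIGHT_TO_LEFT:
		return -fb->bytes_per_block;
	case READ_TOP_TO_BOTTOM:
		return fb->pitch;
	case READ_BOTTOM_TO_TOP:
		return -fb->pitch;
	}

	return 0;
}

/* Copy @count one-byte pixels starting at (@x, @y), walking in @direction. */
static void read_line_r8(const uint8_t *base, const struct fb_layout *fb, int x, int y,
			 enum pixel_read_direction direction, int count, uint8_t *out)
{
	ptrdiff_t offset = (ptrdiff_t)y * fb->pitch + (ptrdiff_t)x * fb->bytes_per_block;
	int step = get_block_step_bytes(fb, direction);

	for (int i = 0; i < count; i++) {
		out[i] = base[offset];
		offset += step;
	}
}
```

The caller stays responsible for keeping the whole span inside the
buffer; the helper only says how far apart two blocks are.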

Thanks,
Louis Chauvet

> > +	switch (direction) {
> > +	case READ_LEFT_TO_RIGHT:
> > +		return fb->format->char_per_block[plane_index];
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -fb->format->char_per_block[plane_index];
> > +	case READ_TOP_TO_BOTTOM:
> > +		return (int)fb->pitches[plane_index];
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -(int)fb->pitches[plane_index];
> > +	}
> > +
> > +	return 0;
> > +}
> 
> Looks good.
> 
> 
> Thanks,
> pq
> 
> > +
> >  static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> >  				 int plane_index)
> >  {
> > 
> 



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-25 14:15   ` Maíra Canal
  2024-03-25 14:56     ` Pekka Paalanen
@ 2024-03-26 15:57     ` Louis Chauvet
  1 sibling, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 11:15, Maíra Canal a écrit :
> On 3/13/24 14:45, Louis Chauvet wrote:
> > Re-introduce a line-by-line composition algorithm for each pixel format.
> > This allows more performance by not requiring an indirection per pixel
> > read. This patch is focused on readability of the code.
> > 
> > Line-by-line composition was introduced by [1] but rewritten back to
> > pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
> > on performance, and it was merged.
> > 
> > This patch is almost a revert of [2], but in addition efforts have been
> > made to increase readability and maintainability of the rotation handling.
> > The blend function is now divided in two parts:
> > - Transformation of coordinates from the output referential to the source
> > referential
> > - Line conversion and blending
> > 
> > Most of the complexity of the rotation management is avoided by using
> > drm_rect_* helpers. The remaining complexity is around the clipping, to
> > avoid reading/writing outside source/destination buffers.
> > 
> > The pixel conversion is now done line-by-line, so the pixel_read_t was
> > replaced with the pixel_read_line_t callback. This way the indirection is only
> > required once per line and per plane, instead of once per pixel and per
> > plane.
> > 
> > The pixel_read_line_t callbacks are very similar for most pixel formats, but
> > this is required to avoid a performance impact. Some helpers for color
> > conversion were introduced to avoid code repetition:
> > - *_to_argb_u16: perform colors conversion. They should be inlined by the
> >    compiler, and they are used to avoid repetition between multiple variants
> >    of the same format (argb/xrgb and maybe in the future for formats like
> >    bgr formats).
> > 
> > This new algorithm was tested with:
> > - kms_plane (for color conversions)
> > - kms_rotation_crc (for rotations of planes)
> > - kms_cursor_crc (for translations of planes)
> > - kms_rotation (for all rotations and formats combinations) [3]
> > The performance gain was measured with:
> > - kms_fb_stress
> 
> Could you tell us what was the performance gain?

I will measure the gain to put the correct number here, but it was at 
least twice as fast.
 
> > 
> > [1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
> >       new formats")
> >       https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
> > [2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
> >       functionality")
> >       https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
> > [3]:
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_composer.c | 167 +++++++++++++++++++------
> >   drivers/gpu/drm/vkms/vkms_drv.h      |  27 ++--
> >   drivers/gpu/drm/vkms/vkms_formats.c  | 236 ++++++++++++++++++++++-------------
> >   drivers/gpu/drm/vkms/vkms_formats.h  |   2 +-
> >   drivers/gpu/drm/vkms/vkms_plane.c    |   5 +-
> >   5 files changed, 292 insertions(+), 145 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index 989bcf59f375..5d78c33dbf41 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > @@ -41,7 +41,7 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> >   				struct line_buffer *output_buffer, int x_start, int pixel_count)
> >   {
> >   	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> > -	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> > +	const struct pixel_argb_u16 *in = &stage_buffer->pixels[x_start];
> >   
> >   	for (int i = 0; i < pixel_count; i++) {
> >   		out[i].a = (u16)0xffff;
> > @@ -51,33 +51,6 @@ static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> >   	}
> >   }
> >   
> > -static int get_y_pos(struct vkms_frame_info *frame_info, int y)
> > -{
> > -	if (frame_info->rotation & DRM_MODE_REFLECT_Y)
> > -		return drm_rect_height(&frame_info->rotated) - y - 1;
> > -
> > -	switch (frame_info->rotation & DRM_MODE_ROTATE_MASK) {
> > -	case DRM_MODE_ROTATE_90:
> > -		return frame_info->rotated.x2 - y - 1;
> > -	case DRM_MODE_ROTATE_270:
> > -		return y + frame_info->rotated.x1;
> > -	default:
> > -		return y;
> > -	}
> > -}
> > -
> > -static bool check_limit(struct vkms_frame_info *frame_info, int pos)
> > -{
> > -	if (drm_rotation_90_or_270(frame_info->rotation)) {
> > -		if (pos >= 0 && pos < drm_rect_width(&frame_info->rotated))
> > -			return true;
> > -	} else {
> > -		if (pos >= frame_info->rotated.y1 && pos < frame_info->rotated.y2)
> > -			return true;
> > -	}
> > -
> > -	return false;
> > -}
> >   
> >   static void fill_background(const struct pixel_argb_u16 *background_color,
> >   			    struct line_buffer *output_buffer)
> > @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
> >   {
> >   	struct vkms_plane_state **plane = crtc_state->active_planes;
> >   	u32 n_active_planes = crtc_state->num_active_planes;
> > -	int y_pos, x_dst, x_limit;
> >   
> >   	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> >   
> > -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
> 
> Shouldn't it be `unsigned int`?

It was a suggestion from Pekka.

> >   
> >   	/*
> >   	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> >   	 * complexity to avoid poor blending performance.
> >   	 *
> > -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > -	 * buffer.
> > +	 * The pixel_read_line callback is used to read a line, using an efficient
> > +	 * algorithm for a specific format, into the staging buffer.
> >   	 */
> >   	for (size_t y = 0; y < crtc_y_limit; y++) {
> >   		fill_background(&background_color, output_buffer);
> >   
> >   		/* The active planes are composed associatively in z-order. */
> >   		for (size_t i = 0; i < n_active_planes; i++) {
> > -			x_dst = plane[i]->frame_info->dst.x1;
> > -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > -					stage_buffer->n_pixels);
> > -			y_pos = get_y_pos(plane[i]->frame_info, y);
> > +			struct vkms_plane_state *current_plane = plane[i];
> >   
> > -			if (!check_limit(plane[i]->frame_info, y_pos))
> > +			/* Avoid rendering useless lines */
> > +			if (y < current_plane->frame_info->dst.y1 ||
> > +			    y >= current_plane->frame_info->dst.y2)
> >   				continue;
> >   
> > -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> > +			/*
> > +			 * dst_line is the line to copy. The initial coordinates are inside the
> > +			 * destination framebuffer, and then drm_rect_* helpers are used to
> > +			 * compute the correct position into the source framebuffer.
> > +			 */
> > +			struct drm_rect dst_line = DRM_RECT_INIT(
> 
> Please, run checkpatch on this patch.

I ran checkpatch and ignored this warning for readability; I did not find 
a better solution.

But as you suggested, I will split this big function into smaller ones, so 
it should solve this.

> > +				current_plane->frame_info->dst.x1, y,
> > +				drm_rect_width(&current_plane->frame_info->dst), 1);
> > +			struct drm_rect tmp_src;
> > +
> > +			drm_rect_fp_to_int(&tmp_src, &current_plane->frame_info->src);
> > +
> > +			/*
> > +			 * [1]: Clamping dst_line to the crtc_x_limit to avoid writing outside of
> > +			 * the destination buffer
> > +			 */
> > +			dst_line.x1 = max_t(int, dst_line.x1, 0);
> > +			dst_line.x2 = min_t(int, dst_line.x2, crtc_x_limit);
> > +			/* The destination is completely outside of the crtc. */
> > +			if (dst_line.x2 <= dst_line.x1)
> > +				continue;
> > +
> > +			struct drm_rect src_line = dst_line;
> > +
> > +			/*
> > +			 * Transform the x/y coordinates from the crtc into coordinates for the
> > +			 * src buffer.
> > +			 *
> > +			 * - Cancel the offset of the dst buffer.
> > +			 * - Invert the rotation. This assumes that
> > +			 *   dst = drm_rect_rotate(src, rotation) (dst and src have the
> > +			 *   same size, but can be rotated).
> > +			 * - Apply the offset of the source rectangle to the coordinate.
> > +			 */
> > +			drm_rect_translate(&src_line, -current_plane->frame_info->dst.x1,
> > +					   -current_plane->frame_info->dst.y1);
> > +			drm_rect_rotate_inv(&src_line,
> > +					    drm_rect_width(&tmp_src),
> > +					    drm_rect_height(&tmp_src),
> > +					    current_plane->frame_info->rotation);
> > +			drm_rect_translate(&src_line, tmp_src.x1, tmp_src.y1);
> > +
> > +			/* Get the correct reading direction in the source buffer. */
> > +
> > +			enum pixel_read_direction direction =
> > +				direction_for_rotation(current_plane->frame_info->rotation);
> > +
> > +			int x_start = src_line.x1;
> > +			int y_start = src_line.y1;
> > +			int pixel_count;
> > +			/* [2]: Compute and clamp the number of pixel to read */
> > +			if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
> > +				/*
> > +				 * In horizontal reading, the src_line width is the number of pixel
> > +				 * to read
> > +				 */
> > +				pixel_count = drm_rect_width(&src_line);
> > +				if (x_start < 0) {
> > +					pixel_count += x_start;
> > +					x_start = 0;
> > +				}
> > +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> > +					pixel_count =
> > +						(int)current_plane->frame_info->fb->width - x_start;
> > +				}
> > +			} else {
> > +				/*
> > +				 * In vertical reading, the src_line height is the number of pixel
> > +				 * to read
> > +				 */
> > +				pixel_count = drm_rect_height(&src_line);
> > +				if (y_start < 0) {
> > +					pixel_count += y_start;
> > +					y_start = 0;
> > +				}
> > +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> > +					pixel_count =
> > +						(int)current_plane->frame_info->fb->height - y_start;
> > +				}
> > +			}
> > +
> > +			if (pixel_count <= 0) {
> > +				/* Nothing to read, so avoid multiple function calls for nothing */
> > +				continue;
> > +			}
> > +
> > +			/*
> > +			 * Modify the starting point to take in account the rotation
> > +			 *
> > +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> > +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> > +			 * corner.
> > +			 */
> > +			if (direction == READ_RIGHT_TO_LEFT) {
> > +				// x_start is now the right point
> > +				x_start += pixel_count - 1;
> > +			} else if (direction == READ_BOTTOM_TO_TOP) {
> > +				// y_start is now the bottom point
> > +				y_start += pixel_count - 1;
> > +			}
> 
> Any chance this code could be a separate function? I believe it would
> make it more readable.

I will try to split this big function, good idea. At least I can try to 
extract the inner loop content to separate "manage all the planes" from 
"manage the rotation of a plane".

I think it will solve the previous checkpatch issue.
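
As a rough idea of what could be extracted, the clamping at [2] in the
quoted hunk is the same computation for both axes, so it could live in
one small helper. The sketch below is standalone and hypothetical:
clamp_read_span() and last_pixel_of_span() are made-up names, and the
logic is only my reading of the quoted code.

```c
/*
 * Clamp the span [*start, *start + count) to [0, limit).
 * Returns the number of pixels left to read, or 0 if the span is
 * entirely outside. *start is moved to 0 when it was negative.
 */
static int clamp_read_span(int *start, int count, int limit)
{
	if (*start < 0) {
		count += *start;
		*start = 0;
	}
	if (*start + count > limit)
		count = limit - *start;

	return count > 0 ? count : 0;
}

/* First pixel to read when the span is walked backwards (R2L or B2T). */
static int last_pixel_of_span(int start, int count)
{
	return start + count - 1;
}
```

The horizontal case would pass x_start and fb->width, the vertical case
y_start and fb->height, which also removes the risk of mixing the two
limits.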

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> > +
> > +			/*
> > +			 * Perform the conversion and the blending
> > +			 *
> > +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> > +			 * inside the source buffer [2] and we don't write outside the stage
> > +			 * buffer [1]
> > +			 */
> > +			current_plane->pixel_read_line(
> > +				current_plane, x_start, y_start, direction, pixel_count,
> > +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> > +
> > +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> > +					    current_plane->frame_info->dst.x1,
> > +					    pixel_count);
> >   		}
> >   
> >   		apply_lut(crtc_state, output_buffer);
> > @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
> >   		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
> >   
> >   		if (wb)
> > -			vkms_writeback_row(wb, output_buffer, y_pos);
> > +			vkms_writeback_row(wb, output_buffer, y);
> >   	}
> >   }
> >   
> > @@ -261,7 +346,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
> >   	u32 n_active_planes = crtc_state->num_active_planes;
> >   
> >   	for (size_t i = 0; i < n_active_planes; i++)
> > -		if (!planes[i]->pixel_read)
> > +		if (!planes[i]->pixel_read_line)
> >   			return -1;
> >   
> >   	if (active_wb && !active_wb->pixel_write)
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 985e7a92b7bc..23e1d247468d 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -39,7 +39,6 @@
> >   struct vkms_frame_info {
> >   	struct drm_framebuffer *fb;
> >   	struct drm_rect src, dst;
> > -	struct drm_rect rotated;
> >   	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
> >   	unsigned int rotation;
> >   };
> > @@ -80,26 +79,37 @@ enum pixel_read_direction {
> >   	READ_LEFT_TO_RIGHT
> >   };
> >   
> > +struct vkms_plane_state;
> > +
> >   /**
> > - * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> > + * typedef pixel_read_line_t - These functions are used to read a pixel line in the source frame,
> >    * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> >    *
> > - * @in_pixel: Pointer to the pixel to read
> > - * @out_pixel: Pointer to write the converted pixel
> > + * @plane: Plane used as source for the pixel value
> > + * @x_start: X (width) coordinate of the first pixel to copy. The caller must ensure that x_start
> > + * is positive and smaller than @plane->frame_info->fb->width.
> > + * @y_start: Y (height) coordinate of the first pixel to copy. The caller must ensure that y_start
> > + * is positive and smaller than @plane->frame_info->fb->height.
> > + * @direction: Direction to use for the copy, starting at @x_start/@y_start
> > + * @count: Number of pixels to copy
> > + * @out_pixel: Pointer where to write the pixel values. They will be written from @out_pixel[0]
> > + * to @out_pixel[@count - 1]. The caller must ensure that out_pixel has a length of at least @count.
> >    */
> > -typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> > +typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_start,
> > +				  int y_start, enum pixel_read_direction direction, int count,
> > +				  struct pixel_argb_u16 out_pixel[]);
> >   
> >   /**
> >    * vkms_plane_state - Driver specific plane state
> >    * @base: base plane state
> >    * @frame_info: data required for composing computation
> > - * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> > - * ensure that this pointer is valid
> > + * @pixel_read_line: function to read a pixel line in this plane. The creator of a vkms_plane_state
> > + * must ensure that this pointer is valid
> >    */
> >   struct vkms_plane_state {
> >   	struct drm_shadow_plane_state base;
> >   	struct vkms_frame_info *frame_info;
> > -	pixel_read_t pixel_read;
> > +	pixel_read_line_t pixel_read_line;
> >   };
> >   
> >   struct vkms_plane {
> > @@ -204,7 +214,6 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
> >   /* Composer Support */
> >   void vkms_composer_worker(struct work_struct *work);
> >   void vkms_set_composer(struct vkms_output *out, bool enabled);
> > -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
> >   void vkms_writeback_row(struct vkms_writeback_job *wb, const struct line_buffer *src_buffer, int y);
> >   
> >   /* Writeback */
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 743b6fd06db5..1449a0e6c706 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,77 +105,45 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >   	return 0;
> >   }
> >   
> > -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> > -				 int plane_index)
> > -{
> > -	int x_src = frame_info->src.x1 >> 16;
> > -	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> > -	u8 *addr;
> > -	int rem_x, rem_y;
> > -
> > -	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
> > -	return addr;
> > -}
> > -
> > -static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> > -{
> > -	if (frame_info->rotation & (DRM_MODE_REFLECT_X | DRM_MODE_ROTATE_270))
> > -		return limit - x - 1;
> > -	return x;
> > -}
> > -
> >   /*
> > - * The following  functions take pixel data from the buffer and convert them to the format
> > + * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >    * ARGB16161616 in out_pixel.
> >    *
> > - * They are used in the `vkms_compose_row` function to handle multiple formats.
> > + * They are used in the `read_line`s functions to avoid duplicate work for some pixel formats.
> >    */
> >   
> > -static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_u8888(int a, int r, int g, int b)
> >   {
> > +	struct pixel_argb_u16 out_pixel;
> >   	/*
> >   	 * The 257 is the "conversion ratio". This number is obtained by the
> >   	 * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
> >   	 * the best color value in a pixel format with more possibilities.
> >   	 * A similar idea applies to others RGB color conversions.
> >   	 */
> > -	out_pixel->a = (u16)in_pixel[3] * 257;
> > -	out_pixel->r = (u16)in_pixel[2] * 257;
> > -	out_pixel->g = (u16)in_pixel[1] * 257;
> > -	out_pixel->b = (u16)in_pixel[0] * 257;
> > -}
> > +	out_pixel.a = (u16)a * 257;
> > +	out_pixel.r = (u16)r * 257;
> > +	out_pixel.g = (u16)g * 257;
> > +	out_pixel.b = (u16)b * 257;
> >   
> > -static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > -{
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = (u16)in_pixel[2] * 257;
> > -	out_pixel->g = (u16)in_pixel[1] * 257;
> > -	out_pixel->b = (u16)in_pixel[0] * 257;
> > +	return out_pixel;
> >   }
> >   
> > -static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_u16161616(int a, int r, int g, int b)
> >   {
> > -	u16 *pixel = (u16 *)in_pixel;
> > +	struct pixel_argb_u16 out_pixel;
> >   
> > -	out_pixel->a = le16_to_cpu(pixel[3]);
> > -	out_pixel->r = le16_to_cpu(pixel[2]);
> > -	out_pixel->g = le16_to_cpu(pixel[1]);
> > -	out_pixel->b = le16_to_cpu(pixel[0]);
> > -}
> > +	out_pixel.a = le16_to_cpu(a);
> > +	out_pixel.r = le16_to_cpu(r);
> > +	out_pixel.g = le16_to_cpu(g);
> > +	out_pixel.b = le16_to_cpu(b);
> >   
> > -static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > -{
> > -	u16 *pixel = (u16 *)in_pixel;
> > -
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = le16_to_cpu(pixel[2]);
> > -	out_pixel->g = le16_to_cpu(pixel[1]);
> > -	out_pixel->b = le16_to_cpu(pixel[0]);
> > +	return out_pixel;
> >   }
> >   
> > -static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >   {
> > -	u16 *pixel = (u16 *)in_pixel;
> > +	struct pixel_argb_u16 out_pixel;
> >   
> >   	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >   	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> > @@ -185,12 +153,26 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
> >   	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
> >   	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> >   
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> > -	out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> > -	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> > +	out_pixel.a = (u16)0xffff;
> > +	out_pixel.r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> > +	out_pixel.g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> > +	out_pixel.b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> > +
> > +	return out_pixel;
> >   }
> >   
> > +/*
> > + * The following functions are read_line function for each pixel format supported by VKMS.
> > + *
> > + * They read a line starting at the point @x_start,@y_start following the @direction. The result
> > + * is stored in @out_pixel and in the format ARGB16161616.
> > + *
> > + * Those functions are very similar, but this is required for performance reasons. In the past,
> > + * some experiments were done, and with a generic loop the performance was greatly reduced [1].
> > + *
> > + * [1]: https://lore.kernel.org/dri-devel/d258c8dc-78e9-4509-9037-a98f7f33b3a3@riseup.net/
> > + */
> > +
> >   /**
> >    * black_to_argb_u16() - pixel_read callback which always read black
> >    *
> > @@ -198,42 +180,116 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
> >    * It is used to avoid null pointer to be used as a function. In theory, this function should
> >    * never be called, except if you found a bug in the driver/DRM core.
> >    */
> > -static void black_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
> > +			      int y_start, enum pixel_read_direction direction, int count,
> > +			      struct pixel_argb_u16 out_pixel[])
> >   {
> > -	out_pixel->a = (u16)0xFFFF;
> > -	out_pixel->r = 0;
> > -	out_pixel->g = 0;
> > -	out_pixel->b = 0;
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +
> > +	while (out_pixel < end) {
> > +		*out_pixel = argb_u16_from_u8888(255, 0, 0, 0);
> > +		out_pixel += 1;
> > +	}
> >   }
> >   
> > -/**
> > - * vkms_compose_row - compose a single row of a plane
> > - * @stage_buffer: output line with the composed pixels
> > - * @plane: state of the plane that is being composed
> > - * @y: y coordinate of the row
> > - *
> > - * This function composes a single row of a plane. It gets the source pixels
> > - * through the y coordinate (see get_packed_src_addr()) and goes linearly
> > - * through the source pixel, reading the pixels and converting it to
> > - * ARGB16161616 (see the pixel_read() callback). For rotate-90 and rotate-270,
> > - * the source pixels are not traversed linearly. The source pixels are queried
> > - * on each iteration in order to traverse the pixels vertically.
> > - */
> > -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y)
> > +static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> > +			       enum pixel_read_direction direction, int count,
> > +			       struct pixel_argb_u16 out_pixel[])
> >   {
> > -	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
> > -	struct vkms_frame_info *frame_info = plane->frame_info;
> > -	u8 *src_pixels = get_packed_src_addr(frame_info, y, 0);
> > -	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u8 *px = (u8 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u8888(px[3], px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void XRGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> > +			       enum pixel_read_direction direction, int count,
> > +			       struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u8 *px = (u8 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u8888(255, px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void ARGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				   int y_start, enum pixel_read_direction direction, int count,
> > +				   struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u16161616(px[3], px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void XRGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				   int y_start, enum pixel_read_direction direction, int count,
> > +				   struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u16161616(0xFFFF, px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			     int y_start, enum pixel_read_direction direction, int count,
> > +			     struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> >   
> > -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> > -		int x_pos = get_x_position(frame_info, limit, x);
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> >   
> > -		if (drm_rotation_90_or_270(frame_info->rotation))
> > -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1, 0)
> > -				+ frame_info->fb->format->cpp[0] * y;
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> >   
> > -		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> > +		*out_pixel = argb_u16_from_RGB565(px);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> >   	}
> >   }
> >   
> > @@ -343,25 +399,25 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >   }
> >   
> >   /**
> > - * Retrieve the correct read_pixel function for a specific format.
> > + * Retrieve the correct read_line function for a specific format.
> >    * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> >    * function is returned.
> >    *
> >    * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >    */
> > -pixel_read_t get_pixel_read_function(u32 format)
> > +pixel_read_line_t get_pixel_read_line_function(u32 format)
> >   {
> >   	switch (format) {
> >   	case DRM_FORMAT_ARGB8888:
> > -		return &ARGB8888_to_argb_u16;
> > +		return &ARGB8888_read_line;
> >   	case DRM_FORMAT_XRGB8888:
> > -		return &XRGB8888_to_argb_u16;
> > +		return &XRGB8888_read_line;
> >   	case DRM_FORMAT_ARGB16161616:
> > -		return &ARGB16161616_to_argb_u16;
> > +		return &ARGB16161616_read_line;
> >   	case DRM_FORMAT_XRGB16161616:
> > -		return &XRGB16161616_to_argb_u16;
> > +		return &XRGB16161616_read_line;
> >   	case DRM_FORMAT_RGB565:
> > -		return &RGB565_to_argb_u16;
> > +		return &RGB565_read_line;
> >   	default:
> >   		/*
> >   		 * This is a bug in vkms_plane_atomic_check. All the supported
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 3ecea4563254..8d2bef95ff79 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -5,7 +5,7 @@
> >   
> >   #include "vkms_drv.h"
> >   
> > -pixel_read_t get_pixel_read_function(u32 format);
> > +pixel_read_line_t get_pixel_read_line_function(u32 format);
> >   
> >   pixel_write_t get_pixel_write_function(u32 format);
> >   
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 10e9b23dab28..8875bed76410 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -112,7 +112,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   	frame_info = vkms_plane_state->frame_info;
> >   	memcpy(&frame_info->src, &new_state->src, sizeof(struct drm_rect));
> >   	memcpy(&frame_info->dst, &new_state->dst, sizeof(struct drm_rect));
> > -	memcpy(&frame_info->rotated, &new_state->dst, sizeof(struct drm_rect));
> >   	frame_info->fb = fb;
> >   	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
> >   	drm_framebuffer_get(frame_info->fb);
> > @@ -122,10 +121,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   									  DRM_MODE_REFLECT_X |
> >   									  DRM_MODE_REFLECT_Y);
> >   
> > -	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> > -			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> >   
> > -	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> > +	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> >   }
> >   
> >   static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-25 14:07   ` Maíra Canal
@ 2024-03-26 15:57     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 11:07, Maíra Canal a écrit :
> On 3/13/24 14:45, Louis Chauvet wrote:
> > The pixel_read_direction enum is useful to describe the reading direction
> > in a plane. It avoids using the rotation property of DRM, which not
> > practical to know the direction of reading.
> > This patch also introduce two helpers, one to compute the
> > pixel_read_direction from the DRM rotation property, and one to compute
> > the step, in byte, between two successive pixel in a specific direction.
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> >   3 files changed, 77 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > index 9254086f23ff..989bcf59f375 100644
> > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
> >   	}
> >   }
> >   
> > +/**
> > + * direction_for_rotation() - Get the correct reading direction for a given rotation
> > + *
> > + * This function will use the @rotation setting of a source plane to compute the reading
> > + * direction in this plane which correspond to a "left to right writing" in the CRTC.
> > + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> > + * to be written from left to right on the CRTC.
> > + *
> > + * @rotation: Rotation to analyze. It correspond the field @frame_info.rotation.
> 
> A bit unusual to see arguments after the description.

Fixed in v6.
 
> > + */
> > +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> > +{
> > +	if (rotation & DRM_MODE_ROTATE_0) {
> > +		if (rotation & DRM_MODE_REFLECT_X)
> > +			return READ_RIGHT_TO_LEFT;
> > +		else
> > +			return READ_LEFT_TO_RIGHT;
> > +	} else if (rotation & DRM_MODE_ROTATE_90) {
> > +		if (rotation & DRM_MODE_REFLECT_Y)
> > +			return READ_BOTTOM_TO_TOP;
> > +		else
> > +			return READ_TOP_TO_BOTTOM;
> > +	} else if (rotation & DRM_MODE_ROTATE_180) {
> > +		if (rotation & DRM_MODE_REFLECT_X)
> > +			return READ_LEFT_TO_RIGHT;
> > +		else
> > +			return READ_RIGHT_TO_LEFT;
> > +	} else if (rotation & DRM_MODE_ROTATE_270) {
> > +		if (rotation & DRM_MODE_REFLECT_Y)
> > +			return READ_TOP_TO_BOTTOM;
> > +		else
> > +			return READ_BOTTOM_TO_TOP;
> > +	}
> > +	return READ_LEFT_TO_RIGHT;
> > +}
> > +
> >   /**
> >    * blend - blend the pixels from all planes and compute crc
> >    * @wb: The writeback frame buffer metadata
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 3ead8b39af4a..985e7a92b7bc 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -69,6 +69,17 @@ struct vkms_writeback_job {
> >   	pixel_write_t pixel_write;
> >   };
> >   
> > +/**
> > + * enum pixel_read_direction - Enum used internaly by VKMS to represent a reading direction in a
> > + * plane.
> > + */
> > +enum pixel_read_direction {
> > +	READ_BOTTOM_TO_TOP,
> > +	READ_TOP_TO_BOTTOM,
> > +	READ_RIGHT_TO_LEFT,
> > +	READ_LEFT_TO_RIGHT
> > +};
> > +
> >   /**
> >    * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> >    * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 649d75d05b1f..743b6fd06db5 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
> >   	*addr = (u8 *)frame_info->map[0].vaddr + offset;
> >   }
> >   
> > +/**
> > + * get_step_next_block() - Common helper to compute the correct step value between each pixel block
> > + * to read in a certain direction.
> > + *
> > + * As the returned offset is the number of bytes between two consecutive blocks in a direction,
> > + * the caller may have to read multiple pixel before using the next one (for example, to read from
> > + * left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be used
> > + * only every 8 pixels.
> > + *
> > + * @fb: Framebuffer to iter on
> > + * @direction: Direction of the reading
> > + * @plane_index: Plane to get the step from
> 
> Same.

Fixed in v6.
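
For reference, a minimal sketch of the usual kernel-doc layout, with the
arguments listed right after the short description and the longer
description afterwards. The struct fake_fb and step_next_block() below are
hypothetical userspace stand-ins for struct drm_framebuffer and
get_step_next_block(), only there so the comment has something to document;
this is not the v6 code.

```c
#include <assert.h>

/* Hypothetical stand-ins so the sketch compiles outside the kernel. */
enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT
};

struct fake_fb {
	unsigned int pitch;          /* bytes per line, like fb->pitches[0] */
	unsigned int char_per_block; /* bytes per block, like format->char_per_block[0] */
};

/**
 * step_next_block() - Compute the byte step between two consecutive blocks
 * @fb: Framebuffer description to iterate on
 * @direction: Direction of the reading
 *
 * Horizontal directions move by one block, vertical directions by one line.
 *
 * Return: signed offset in bytes, 0 for an unknown direction.
 */
static int step_next_block(const struct fake_fb *fb, enum pixel_read_direction direction)
{
	assert(fb);

	switch (direction) {
	case READ_LEFT_TO_RIGHT:
		return (int)fb->char_per_block;
	case READ_RIGHT_TO_LEFT:
		return -(int)fb->char_per_block;
	case READ_TOP_TO_BOTTOM:
		return (int)fb->pitch;
	case READ_BOTTOM_TO_TOP:
		return -(int)fb->pitch;
	}

	return 0;
}
```

With a 4-byte block and a 256-byte pitch, the horizontal steps are +4/-4 and
the vertical steps are +256/-256.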

Thanks,
Louis Chauvet
 
> Best Regards,
> - Maíra
> 
> > + */
> > +static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
> > +			       int plane_index)
> > +{
> > +	switch (direction) {
> > +	case READ_LEFT_TO_RIGHT:
> > +		return fb->format->char_per_block[plane_index];
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -fb->format->char_per_block[plane_index];
> > +	case READ_TOP_TO_BOTTOM:
> > +		return (int)fb->pitches[plane_index];
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -(int)fb->pitches[plane_index];
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >   static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> >   				 int plane_index)
> >   {
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-25 15:43   ` Pekka Paalanen
@ 2024-03-26 15:57     ` Louis Chauvet
  2024-03-27 12:29       ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

[...]

> > @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
> >  {
> >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> >  	u32 n_active_planes = crtc_state->num_active_planes;
> > -	int y_pos, x_dst, x_limit;
> >  
> >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> >  
> > -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
> >  
> >  	/*
> >  	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> >  	 * complexity to avoid poor blending performance.
> >  	 *
> > -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > -	 * buffer.
> > +	 * The function pixel_read_line callback is used to read a line, using an efficient
> > +	 * algorithm for a specific format, into the staging buffer.
> >  	 */
> >  	for (size_t y = 0; y < crtc_y_limit; y++) {
> >  		fill_background(&background_color, output_buffer);
> >  
> >  		/* The active planes are composed associatively in z-order. */
> >  		for (size_t i = 0; i < n_active_planes; i++) {
> > -			x_dst = plane[i]->frame_info->dst.x1;
> > -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > -					stage_buffer->n_pixels);
> > -			y_pos = get_y_pos(plane[i]->frame_info, y);
> > +			struct vkms_plane_state *current_plane = plane[i];
> >  
> > -			if (!check_limit(plane[i]->frame_info, y_pos))
> > +			/* Avoid rendering useless lines */
> > +			if (y < current_plane->frame_info->dst.y1 ||
> > +			    y >= current_plane->frame_info->dst.y2)
> >  				continue;
> >  
> > -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> > +			/*
> > +			 * dst_line is the line to copy. The initial coordinates are inside the

[...]

> > +				 */
> > +				pixel_count = drm_rect_width(&src_line);
> > +				if (x_start < 0) {
> > +					pixel_count += x_start;
> > +					x_start = 0;
> > +				}
> > +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> > +					pixel_count =
> > +						(int)current_plane->frame_info->fb->width - x_start;
> > +				}
> > +			} else {
> > +				/*
> > +				 * In vertical reading, the src_line height is the number of pixel
> > +				 * to read
> > +				 */
> > +				pixel_count = drm_rect_height(&src_line);
> > +				if (y_start < 0) {
> > +					pixel_count += y_start;
> > +					y_start = 0;
> > +				}
> > +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> > +					pixel_count =
> > +						(int)current_plane->frame_info->fb->width - y_start;
> > +				}
> 
> When you are clamping x_start or y_start or pixel_count to be inside
> the source FB, should you not equally adjust the destination
> coordinates as well?

I had not thought about it. Currently this is not an issue: nothing is read
or written outside a buffer because the pixel count is properly limited.
But there is indeed an issue here, I will fix it in v6.
 
> If we take a step back and look at the UAPI, I believe the answer is
> "no", but it's in no way obvious. It results from the combination of
> several facts:
> 
> - UAPI checks reject any source rectangle that extends outside of the
>   source FB.
> 
> - The source rectangle stretches to fill the destination rectangle
>   exactly.
> 
> - VKMS does not support stretching (scaling), so its UAPI checks reject
>   any commit with source and destination rectangles of different sizes
>   after accounting for rotation. (Right?)

I don't know exactly what the UAPI contract is, but as the dst can be
outside the CRTC, I assumed that the src could be outside the source plane.
After thinking about it, that doesn't really make sense.

> I think this results in the clamping code being actually dead code.
> However, I would not delete the clamping code, because it is a cheap
> safety net in case something goes wrong.

If the UAPI checks that the source rectangle is inside the plane, then yes,
it is just a safety net. Otherwise, it is required to properly handle
userspace requests. In v6, the outside of a source buffer is
transparent.

> If you agree that it's just a safety net, then maybe explain that in a
> comment? If the safety net catches anything, the composition result
> will be wrong anyway, so it doesn't matter to adjust the destination
> rectangle to match.

I will extract all of this clamping into a function. Is this comment
enough?

 * This function is mainly a safety net to avoid reading outside the source buffer. As the
 * userspace should never ask to read outside the source plane, all the cases covered here should
 * be dead code.
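
A rough sketch of what such a helper could look like, handling one
dimension at a time. The name clamp_line_to_fb() and its signature are
invented here for illustration; this is not the v6 code. The caller would
pass fb->width for horizontal reads and fb->height for vertical reads (the
vertical branch quoted above subtracts from fb->width, which looks like a
typo for fb->height).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: clamp a 1-D read of @count pixels starting at @start
 * so that it stays inside [0, @fb_size). Returns false when nothing is left
 * to read, so the caller can skip the plane for this line.
 */
static bool clamp_line_to_fb(int *start, int *count, int fb_size)
{
	assert(start && count);

	/* Drop the pixels that lie before the beginning of the buffer. */
	if (*start < 0) {
		*count += *start;
		*start = 0;
	}

	/* Drop the pixels that lie past the end of the buffer. */
	if (*start + *count > fb_size)
		*count = fb_size - *start;

	return *count > 0;
}
```

The destination coordinates are deliberately left untouched: if the safety
net triggers, the composition result is wrong anyway.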

> When the last point is relaxed and VKMS gains scaling support, I think
> it won't change the fact that the clamping remains as a safety net. It
> just increases the risk of bugs that would be caught by the net.
> 
> Going outside of FB boundaries is a serious bug and deserves to be
> checked. Going outside of the source rectangle would be a bug too,
> assuming that partially included pixels are considered fully included,
> but it's not serious enough to warrant explicit checks. Ideally IGT
> would catch it.

That was exactly the idea behind all those checks and clamping: avoid
accesses outside the buffers.

> > +			}
> > +
> > +			if (pixel_count <= 0) {
> > +				/* Nothing to read, so avoid multiple function calls for nothing */
> > +				continue;
> > +			}
> > +
> > +			/*
> > +			 * Modify the starting point to take in account the rotation
> > +			 *
> > +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> > +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> > +			 * corner.
> > +			 */
> > +			if (direction == READ_RIGHT_TO_LEFT) {
> > +				// x_start is now the right point
> > +				x_start += pixel_count - 1;
> > +			} else if (direction == READ_BOTTOM_TO_TOP) {
> > +				// y_start is now the bottom point
> > +				y_start += pixel_count - 1;
> > +			}
> > +
> > +			/*
> > +			 * Perform the conversion and the blending
> > +			 *
> > +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> > +			 * inside the source buffer [2] and we don't write outside the stage
> > +			 * buffer [1]
> > +			 */
> > +			current_plane->pixel_read_line(
> > +				current_plane, x_start, y_start, direction, pixel_count,
> > +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> > +
> > +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> > +					    current_plane->frame_info->dst.x1,
> > +					    pixel_count);
> >  		}
> >  
> >  		apply_lut(crtc_state, output_buffer);
> > @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
> >  		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
> >  
> >  		if (wb)
> > -			vkms_writeback_row(wb, output_buffer, y_pos);
> > +			vkms_writeback_row(wb, output_buffer, y);
> >  	}
> >  }
> >  
> > @@ -261,7 +346,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
> >  	u32 n_active_planes = crtc_state->num_active_planes;
> >  
> >  	for (size_t i = 0; i < n_active_planes; i++)
> > -		if (!planes[i]->pixel_read)
> > +		if (!planes[i]->pixel_read_line)
> >  			return -1;
> >  
> >  	if (active_wb && !active_wb->pixel_write)
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 985e7a92b7bc..23e1d247468d 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -39,7 +39,6 @@
> >  struct vkms_frame_info {
> >  	struct drm_framebuffer *fb;
> >  	struct drm_rect src, dst;
> > -	struct drm_rect rotated;
> >  	struct iosys_map map[DRM_FORMAT_MAX_PLANES];
> >  	unsigned int rotation;
> >  };
> > @@ -80,26 +79,37 @@ enum pixel_read_direction {
> >  	READ_LEFT_TO_RIGHT
> >  };
> >  
> > +struct vkms_plane_state;
> > +
> >  /**
> > - * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> > + * typedef pixel_read_line_t - These functions are used to read a pixel line in the source frame,
> >   * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> >   *
> > - * @in_pixel: Pointer to the pixel to read
> > - * @out_pixel: Pointer to write the converted pixel
> > + * @plane: Plane used as source for the pixel value
> > + * @x_start: X (width) coordinate of the first pixel to copy. The caller must ensure that x_start
> > + * is positive and smaller than @plane->frame_info->fb->width.
> > + * @y_start: Y (width) coordinate of the first pixel to copy. The caller must ensure that y_start
> > + * is positive and smaller than @plane->frame_info->fb->height.
> 
> s/positive/non-negative/ because zero is valid too. At least, there is
> debate whether zero is positive or not, but non-negative is clear.

Edited in v6.

> > + * @direction: Direction to use for the copy, starting at @x_start/@y_start
> > + * @count: Number of pixels to copy
> > + * @out_pixel: Pointer where to write the pixel values. They will be written from @out_pixel[0]
> > + * to @out_pixel[@count]. The caller must ensure that out_pixel have a length of at least @count.
> >   */
> > -typedef void (*pixel_read_t)(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
> > +typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_start,
> > +				  int y_start, enum pixel_read_direction direction, int count,
> > +				  struct pixel_argb_u16 out_pixel[]);
> >  
> >  /**
> >   * vkms_plane_state - Driver specific plane state
> >   * @base: base plane state
> >   * @frame_info: data required for composing computation
> > - * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
> > - * ensure that this pointer is valid
> > + * @pixel_read_line: function to read a pixel line in this plane. The creator of a vkms_plane_state
> > + * must ensure that this pointer is valid
> >   */
> >  struct vkms_plane_state {
> >  	struct drm_shadow_plane_state base;
> >  	struct vkms_frame_info *frame_info;
> > -	pixel_read_t pixel_read;
> > +	pixel_read_line_t pixel_read_line;
> >  };
> >  
> >  struct vkms_plane {
> > @@ -204,7 +214,6 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
> >  /* Composer Support */
> >  void vkms_composer_worker(struct work_struct *work);
> >  void vkms_set_composer(struct vkms_output *out, bool enabled);
> > -void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
> >  void vkms_writeback_row(struct vkms_writeback_job *wb, const struct line_buffer *src_buffer, int y);
> >  
> >  /* Writeback */
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 743b6fd06db5..1449a0e6c706 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,77 +105,45 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >  	return 0;
> >  }
> >  
> > -static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> > -				 int plane_index)
> > -{
> > -	int x_src = frame_info->src.x1 >> 16;
> > -	int y_src = y - frame_info->rotated.y1 + (frame_info->src.y1 >> 16);
> > -	u8 *addr;
> > -	int rem_x, rem_y;
> > -
> > -	packed_pixels_addr(frame_info, x_src, y_src, plane_index, &addr, &rem_x, &rem_y);
> > -	return addr;
> > -}
> > -
> > -static int get_x_position(const struct vkms_frame_info *frame_info, int limit, int x)
> > -{
> > -	if (frame_info->rotation & (DRM_MODE_REFLECT_X | DRM_MODE_ROTATE_270))
> > -		return limit - x - 1;
> > -	return x;
> > -}
> > -
> >  /*
> > - * The following  functions take pixel data from the buffer and convert them to the format
> > + * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >   * ARGB16161616 in out_pixel.
> >   *
> > - * They are used in the `vkms_compose_row` function to handle multiple formats.
> > + * They are used in the `read_line`s functions to avoid duplicate work for some pixel formats.
> >   */
> >  
> > -static void ARGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_u8888(int a, int r, int g, int b)
> >  {
> > +	struct pixel_argb_u16 out_pixel;
> >  	/*
> >  	 * The 257 is the "conversion ratio". This number is obtained by the
> >  	 * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
> >  	 * the best color value in a pixel format with more possibilities.
> >  	 * A similar idea applies to others RGB color conversions.
> >  	 */
> > -	out_pixel->a = (u16)in_pixel[3] * 257;
> > -	out_pixel->r = (u16)in_pixel[2] * 257;
> > -	out_pixel->g = (u16)in_pixel[1] * 257;
> > -	out_pixel->b = (u16)in_pixel[0] * 257;
> > -}
> > +	out_pixel.a = (u16)a * 257;
> > +	out_pixel.r = (u16)r * 257;
> > +	out_pixel.g = (u16)g * 257;
> > +	out_pixel.b = (u16)b * 257;
> >  
> > -static void XRGB8888_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > -{
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = (u16)in_pixel[2] * 257;
> > -	out_pixel->g = (u16)in_pixel[1] * 257;
> > -	out_pixel->b = (u16)in_pixel[0] * 257;
> > +	return out_pixel;
> >  }
> >  
> > -static void ARGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_u16161616(int a, int r, int g, int b)
> >  {
> > -	u16 *pixel = (u16 *)in_pixel;
> > +	struct pixel_argb_u16 out_pixel;
> >  
> > -	out_pixel->a = le16_to_cpu(pixel[3]);
> > -	out_pixel->r = le16_to_cpu(pixel[2]);
> > -	out_pixel->g = le16_to_cpu(pixel[1]);
> > -	out_pixel->b = le16_to_cpu(pixel[0]);
> > -}
> > +	out_pixel.a = le16_to_cpu(a);
> > +	out_pixel.r = le16_to_cpu(r);
> > +	out_pixel.g = le16_to_cpu(g);
> > +	out_pixel.b = le16_to_cpu(b);
> >  
> > -static void XRGB16161616_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > -{
> > -	u16 *pixel = (u16 *)in_pixel;
> > -
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = le16_to_cpu(pixel[2]);
> > -	out_pixel->g = le16_to_cpu(pixel[1]);
> > -	out_pixel->b = le16_to_cpu(pixel[0]);
> > +	return out_pixel;
> >  }
> >  
> > -static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
> > +static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >  {
> > -	u16 *pixel = (u16 *)in_pixel;
> > +	struct pixel_argb_u16 out_pixel;
> >  
> >  	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
> >  	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
> > @@ -185,12 +153,26 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
> >  	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
> >  	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
> >  
> > -	out_pixel->a = (u16)0xffff;
> > -	out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> > -	out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> > -	out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> > +	out_pixel.a = (u16)0xffff;
> > +	out_pixel.r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
> > +	out_pixel.g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
> > +	out_pixel.b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
> > +
> > +	return out_pixel;
> >  }
> >  
> > +/*
> > + * The following functions are read_line function for each pixel format supported by VKMS.
> > + *
> > + * They read a line starting at the point @x_start,@y_start following the @direction. The result
> > + * is stored in @out_pixel and in the format ARGB16161616.
> > + *
> > + * Those function are very similar, but it is required for performance reason. In the past, some
> > + * experiment were done, and with a generic loop the performance are very reduced [1].
> 
> The English here feels a bit awkward. How about:
> 
> These functions are very repetitive, but the innermost pixel loops must
> be kept inside these functions for performance reasons. Some
> benchmarking was done in [1] where having the innermost loop factored
> out of these functions showed a slowdown by a factor of three.

It is better, thanks.
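
For readers following the thread, here is a minimal userspace sketch of
the pattern the comment describes: each format has its own read_line
function, and the pixel loop lives inside it rather than behind a
per-pixel callback. All names ending in _sketch are hypothetical
stand-ins, not the VKMS code. The sketch assumes a little-endian
XRGB8888 layout (bytes B, G, R, X), a positive byte step between
pixels, and widening from 8 to 16 bits by multiplying by 257.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the driver's internal ARGB16161616 pixel. */
struct pixel_argb_u16_sketch {
	uint16_t a, r, g, b;
};

/* Widen 8-bit channels to 16 bits: x * 257 maps 0x00..0xff onto 0x0000..0xffff. */
static struct pixel_argb_u16_sketch
argb_u16_from_u8888_sketch(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
	struct pixel_argb_u16_sketch out = {
		.a = (uint16_t)(a * 257),
		.r = (uint16_t)(r * 257),
		.g = (uint16_t)(g * 257),
		.b = (uint16_t)(b * 257),
	};

	return out;
}

/*
 * Read @count pixels starting at @src, advancing by @step bytes per pixel.
 * The innermost loop is kept inside the format-specific function, so the
 * conversion helper can be inlined into it by the compiler.
 */
static void xrgb8888_read_line_sketch(const uint8_t *src, size_t step, int count,
				      struct pixel_argb_u16_sketch out_pixel[])
{
	struct pixel_argb_u16_sketch *end;

	assert(count >= 0);
	end = out_pixel + count;

	while (out_pixel < end) {
		/* X byte (src[3]) is ignored, alpha is forced to opaque. */
		*out_pixel = argb_u16_from_u8888_sketch(255, src[2], src[1], src[0]);
		out_pixel += 1;
		src += step;
	}
}
```

A generic variant would instead call a function pointer once per pixel
from a shared loop, which is the shape the benchmark in [1] is reported
to have found slower.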

> > + *
> > + * [1]: https://lore.kernel.org/dri-devel/d258c8dc-78e9-4509-9037-a98f7f33b3a3@riseup.net/
> > + */
> > +
> >  /**
> >   * black_to_argb_u16() - pixel_read callback which always read black
> >   *
> > @@ -198,42 +180,116 @@ static void RGB565_to_argb_u16(const u8 *in_pixel, struct pixel_argb_u16 *out_pi
> >   * It is used to avoid null pointer to be used as a function. In theory, this function should
> >   * never be called, except if you found a bug in the driver/DRM core.
> >   */
> > +static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
> > +			      int y_start, enum pixel_read_direction direction, int count,
> > +			      struct pixel_argb_u16 out_pixel[])
> >  {
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +
> > +	while (out_pixel < end) {
> > +		*out_pixel = argb_u16_from_u8888(255, 0, 0, 0);
> > +		out_pixel += 1;
> > +	}
> >  }
> >  
> > +static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> > +			       enum pixel_read_direction direction, int count,
> > +			       struct pixel_argb_u16 out_pixel[])
> >  {
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u8 *px = (u8 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u8888(px[3], px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void XRGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> > +			       enum pixel_read_direction direction, int count,
> > +			       struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u8 *px = (u8 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u8888(255, px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void ARGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				   int y_start, enum pixel_read_direction direction, int count,
> > +				   struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u16161616(px[3], px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void XRGB16161616_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				   int y_start, enum pixel_read_direction direction, int count,
> > +				   struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> > +
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> > +		*out_pixel = argb_u16_from_u16161616(0xFFFF, px[2], px[1], px[0]);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> > +	}
> > +}
> > +
> > +static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			     int y_start, enum pixel_read_direction direction, int count,
> > +			     struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> >  
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> >  
> > +	while (out_pixel < end) {
> > +		u16 *px = (u16 *)src_pixels;
> >  
> > +		*out_pixel = argb_u16_from_RGB565(px);
> > +		out_pixel += 1;
> > +		src_pixels += step;
> >  	}
> >  }
> >  
> > @@ -343,25 +399,25 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
> >  }
> >  
> >  /**
> > - * Retrieve the correct read_pixel function for a specific format.
> > + * Retrieve the correct read_line function for a specific format.
> >   * If the format is not supported by VKMS a warn is emitted and a dummy "always read black"
> >   * function is returned.
> >   *
> >   * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> >   */
> > -pixel_read_t get_pixel_read_function(u32 format)
> > +pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  {
> >  	switch (format) {
> >  	case DRM_FORMAT_ARGB8888:
> > -		return &ARGB8888_to_argb_u16;
> > +		return &ARGB8888_read_line;
> >  	case DRM_FORMAT_XRGB8888:
> > -		return &XRGB8888_to_argb_u16;
> > +		return &XRGB8888_read_line;
> >  	case DRM_FORMAT_ARGB16161616:
> > -		return &ARGB16161616_to_argb_u16;
> > +		return &ARGB16161616_read_line;
> >  	case DRM_FORMAT_XRGB16161616:
> > -		return &XRGB16161616_to_argb_u16;
> > +		return &XRGB16161616_read_line;
> >  	case DRM_FORMAT_RGB565:
> > -		return &RGB565_to_argb_u16;
> > +		return &RGB565_read_line;
> >  	default:
> >  		/*
> >  		 * This is a bug in vkms_plane_atomic_check. All the supported
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 3ecea4563254..8d2bef95ff79 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -5,7 +5,7 @@
> >  
> >  #include "vkms_drv.h"
> >  
> > -pixel_read_t get_pixel_read_function(u32 format);
> > +pixel_read_line_t get_pixel_read_line_function(u32 format);
> >  
> >  pixel_write_t get_pixel_write_function(u32 format);
> >  
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 10e9b23dab28..8875bed76410 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -112,7 +112,6 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >  	frame_info = vkms_plane_state->frame_info;
> >  	memcpy(&frame_info->src, &new_state->src, sizeof(struct drm_rect));
> >  	memcpy(&frame_info->dst, &new_state->dst, sizeof(struct drm_rect));
> > -	memcpy(&frame_info->rotated, &new_state->dst, sizeof(struct drm_rect));
> >  	frame_info->fb = fb;
> >  	memcpy(&frame_info->map, &shadow_plane_state->data, sizeof(frame_info->map));
> >  	drm_framebuffer_get(frame_info->fb);
> > @@ -122,10 +121,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >  									  DRM_MODE_REFLECT_X |
> >  									  DRM_MODE_REFLECT_Y);
> >  
> > -	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
> > -			drm_rect_height(&frame_info->rotated), frame_info->rotation);
> >  
> > -	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
> > +	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> >  }
> >  
> >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 
> 
> This is looking good enough that I can give an

Thanks for all your feedback!

> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>

As I changed the code, I will not keep it in the commit.

Thanks,
Louis Chauvet

> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-25 14:26   ` Maíra Canal
@ 2024-03-26 15:57     ` Louis Chauvet
  2024-03-27 12:59       ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 11:26, Maíra Canal a écrit :
> On 3/13/24 14:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> > 
> > To do the conversion, a specific matrix must be used for each color range
> > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > the `conversion_matrix` struct, along with the specific y_offset needed.
> > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > stored in a `vkms_plane_state`. Those conversion matrices of each
> > encoding and range were obtained by rounding the values of the original
> > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > floating point operations.
> > 
> > The same reading function is used for YUV and YVU formats. As the only
> > difference between those two category of formats is the order of field, a
> > simple swap in conversion matrix columns allows using the same function.
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> 
> A Co-developed-by tag would be more appropriate.

I am not the main author of this part, I only applied a few simple 
suggestions, the complex part was done by Arthur.

I will wait for Arthur's confirmation before changing it to
Co-developed-by.

 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >   4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >   				  int y_start, enum pixel_read_direction direction, int count,
> >   				  struct pixel_argb_u16 out_pixel[]);
> >   
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> > + * @y_offest: Offset to apply on the y value.
> 
> s/y_offest/y_offset

Fixed in v6.
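
As an illustration of the matrix layout documented above, here is a
minimal userspace sketch of a row-major matrix with 32 fractional bits
applied to one 8-bit YUV sample. It is not the patch's implementation:
the names ending in _sketch are hypothetical, the coefficients are the
textbook full-range BT.601 values rather than the tables from the
series, centering U and V on 128 is an assumption, and y_offset is
treated here as a plain integer. Doubles are used only to build the
table; the conversion itself is integer-only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct conversion_matrix_sketch {
	int64_t matrix[3][3];	/* row-major, 32 fractional bits */
	int y_offset;		/* assumed to be a plain integer offset */
};

/* Round a real coefficient to the nearest 32.32 fixed-point value. */
static int64_t fixp_from_double_sketch(double x)
{
	return (int64_t)(x * 4294967296.0 + (x < 0 ? -0.5 : 0.5));
}

/* Textbook full-range BT.601 coefficients (assumption, not from the patch). */
static struct conversion_matrix_sketch bt601_full_matrix_sketch(void)
{
	struct conversion_matrix_sketch m = {
		.matrix = {
			{ fixp_from_double_sketch(1.0), 0,
			  fixp_from_double_sketch(1.402) },
			{ fixp_from_double_sketch(1.0),
			  fixp_from_double_sketch(-0.344136),
			  fixp_from_double_sketch(-0.714136) },
			{ fixp_from_double_sketch(1.0),
			  fixp_from_double_sketch(1.772), 0 },
		},
		.y_offset = 0,
	};

	return m;
}

/* Round a 32.32 value to the nearest integer and clamp it to 0..255. */
static uint8_t fixp_to_u8_sketch(int64_t v)
{
	int64_t rounded;

	if (v <= 0)
		return 0;
	rounded = (v + (INT64_C(1) << 31)) >> 32;
	return rounded > 255 ? 255 : (uint8_t)rounded;
}

/* [[r],[g],[b]] = matrix * [[y - y_offset],[u - 128],[v - 128]] */
static void rgb_from_yuv888_sketch(const struct conversion_matrix_sketch *m,
				   uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
	int64_t in[3];

	assert(m != NULL && rgb != NULL);
	in[0] = (int64_t)y - m->y_offset;
	in[1] = (int64_t)u - 128;
	in[2] = (int64_t)v - 128;

	for (int row = 0; row < 3; row++) {
		int64_t acc = 0;

		for (int col = 0; col < 3; col++)
			acc += m->matrix[row][col] * in[col];
		rgb[row] = fixp_to_u8_sketch(acc);
	}
}
```

With this layout, handling a YVU format only requires swapping the
second and third columns of the matrix, which matches the column-swap
approach described in the commit message.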

> > + */
> > +struct conversion_matrix {
> > +	s64 matrix[3][3];
> > +	s64 y_offset;
> > +};
> > +
> >   /**
> >    * vkms_plane_state - Driver specific plane state
> >    * @base: base plane state
> > @@ -110,6 +131,7 @@ struct vkms_plane_state {
> >   	struct drm_shadow_plane_state base;
> >   	struct vkms_frame_info *frame_info;
> >   	pixel_read_line_t pixel_read_line;
> > +	struct conversion_matrix *conversion_matrix;
> 
> Add @conversion_matrix on the kernel-doc from the struct
> vkms_plane_state.

Fixed in v6.

> >   };
> >   
> >   struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >   	return 0;
> >   }
> >   
> > +/**
> > + * get_subsampling() - Get the subsampling divisor value on a specific direction
> 
> Where are the arguments?

Fixed in v6.

> > + */
> > +static int get_subsampling(const struct drm_format_info *format,
> > +			   enum pixel_read_direction direction)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +	case READ_TOP_TO_BOTTOM:
> > +		return format->vsub;
> > +	case READ_RIGHT_TO_LEFT:
> > +	case READ_LEFT_TO_RIGHT:
> > +		return format->hsub;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 1;
> > +}
> > +
> > +/**
> > + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> > + * x_start and y_start values
> 
> Same.

Fixed in v6.

> > + */
> > +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -y_start - 1;
> > +	case READ_TOP_TO_BOTTOM:
> > +		return y_start;
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -x_start - 1;
> > +	case READ_LEFT_TO_RIGHT:
> > +		return x_start;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 0;
> > +}
> > +
> >   /*
> >    * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >    * ARGB16161616 in out_pixel.
> > @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >   	return out_pixel;
> >   }
> >   
> 
> [...]
> 
> >   
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> 
> A bit odd to see the arguments after the description.

Fixed in v6.

> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};

[...]

> > +
> > +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> 
> s/color format+encoding+range/color format + encoding + range

Fixed in v6.

> > +	switch (format) {
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:

[...]

> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 8d2bef95ff79..e1d324764b17 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
> >   
> >   pixel_write_t get_pixel_write_function(u32 format);
> >   
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range);
> > +
> >   #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
> >   	DRM_FORMAT_XRGB8888,
> >   	DRM_FORMAT_XRGB16161616,
> >   	DRM_FORMAT_ARGB16161616,
> > -	DRM_FORMAT_RGB565
> > +	DRM_FORMAT_RGB565,
> > +	DRM_FORMAT_NV12,
> > +	DRM_FORMAT_NV16,
> > +	DRM_FORMAT_NV24,
> > +	DRM_FORMAT_NV21,
> > +	DRM_FORMAT_NV61,
> > +	DRM_FORMAT_NV42,
> > +	DRM_FORMAT_YUV420,
> > +	DRM_FORMAT_YUV422,
> > +	DRM_FORMAT_YUV444,
> > +	DRM_FORMAT_YVU420,
> > +	DRM_FORMAT_YVU422,
> > +	DRM_FORMAT_YVU444
> 
> Let's add a comma by the end of this entry, to avoid deleting this line
> when adding a new format.

Fixed in v6.

> >   };
> >   
> >   static struct drm_plane_state *
> > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   	drm_framebuffer_get(frame_info->fb);
> >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> >   									  DRM_MODE_ROTATE_90 |
> > +									  DRM_MODE_ROTATE_180 |
> 
> Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> reflecting both along the X and Y axis?

Oops, I had no intention of putting that change here. I will move it to 
another patch.

I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
the drm_rotation_simplify documentation, it explains that this argument 
should contain all supported rotations and reflections, and ROT_180 is 
supported by VKMS. Perhaps this call is unnecessary because all 
combinations are supported by vkms?
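
The equivalence Maíra mentions can be checked on plain pixel
coordinates. The sketch below is independent of the DRM helpers: the
names ending in _sketch are hypothetical, and the direction chosen for
the quarter turn is arbitrary, since two quarter turns in either
direction give the same half turn.

```c
#include <assert.h>

struct point_sketch {
	int x, y;
};

/* Quarter turn of a pixel in an image of height @h; the result is h wide. */
static struct point_sketch rotate_90_sketch(struct point_sketch p, int h)
{
	struct point_sketch out = { .x = h - 1 - p.y, .y = p.x };

	return out;
}

/* Mirror the x coordinate inside an image of width @w. */
static struct point_sketch reflect_x_sketch(struct point_sketch p, int w)
{
	struct point_sketch out = { .x = w - 1 - p.x, .y = p.y };

	return out;
}

/* Mirror the y coordinate inside an image of height @h. */
static struct point_sketch reflect_y_sketch(struct point_sketch p, int h)
{
	struct point_sketch out = { .x = p.x, .y = h - 1 - p.y };

	return out;
}

/* Returns 1 if two quarter turns equal both reflections for every pixel. */
static int rotate_180_is_double_reflection_sketch(int w, int h)
{
	assert(w > 0 && h > 0);

	for (int y = 0; y < h; y++) {
		for (int x = 0; x < w; x++) {
			struct point_sketch p = { .x = x, .y = y };
			/* After the first turn the image is h x w, so its height is w. */
			struct point_sketch r =
				rotate_90_sketch(rotate_90_sketch(p, h), w);
			struct point_sketch f =
				reflect_y_sketch(reflect_x_sketch(p, w), h);

			if (r.x != f.x || r.y != f.y)
				return 0;
		}
	}
	return 1;
}
```

This only shows that the two transformations agree geometrically. It
does not answer whether the supported-rotations mask passed to
drm_rotation_simplify should list DRM_MODE_ROTATE_180.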

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> >   									  DRM_MODE_ROTATE_270 |
> >   									  DRM_MODE_REFLECT_X |
> >   									  DRM_MODE_REFLECT_Y);
> >   
> >   
> >   	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> > +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> > +		(fmt, new_state->color_encoding, new_state->color_range);
> >   }
> >   
> >   static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

* Re: [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions
  2024-03-25 14:34   ` Maíra Canal
@ 2024-03-26 15:57     ` Louis Chauvet
  2024-03-28 13:26     ` Pekka Paalanen
  1 sibling, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-03-26 15:57 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 25/03/24 - 11:34, Maíra Canal a écrit :
> On 3/13/24 14:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Create KUnit tests to test the conversion between YUV and RGB. Test each
> > conversion and range combination with some common colors.
> > 
> > The code used to compute the expected result can be found in comment.
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - fix minor formating issues (whitespace, double line)
> > - change expected alpha from 0x0000 to 0xffff
> > - adapt to the new get_conversion_matrix usage
> > - apply the changes from Arthur
> > - move struct pixel_yuv_u8 to the test itself]
> 
> Again, a Co-developed-by tag might be more proper.

For this patch, my contribution was very minimal (I only added a call to 
get_conversion_matrix_to_argb_u16), so I will not add the Co-developed-by.

> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/Kconfig                  |  15 ++
> >   drivers/gpu/drm/vkms/Makefile                 |   1 +
> >   drivers/gpu/drm/vkms/tests/.kunitconfig       |   4 +
> >   drivers/gpu/drm/vkms/tests/Makefile           |   3 +
> >   drivers/gpu/drm/vkms/tests/vkms_format_test.c | 230 ++++++++++++++++++++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.c           |   7 +-
> >   drivers/gpu/drm/vkms/vkms_formats.h           |   4 +
> >   7 files changed, 262 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
> > index b9ecdebecb0b..9b0e1940c14f 100644
> > --- a/drivers/gpu/drm/vkms/Kconfig
> > +++ b/drivers/gpu/drm/vkms/Kconfig
> > @@ -13,3 +13,18 @@ config DRM_VKMS
> >   	  a VKMS.
> >   
> >   	  If M is selected the module will be called vkms.
> > +
> > +config DRM_VKMS_KUNIT_TESTS
> > +	tristate "Tests for VKMS" if !KUNIT_ALL_TESTS
> 
> "KUnit tests for VKMS"

Fixed in v6.

> > +	depends on DRM_VKMS && KUNIT
> > +	default KUNIT_ALL_TESTS
> > +	help
> > +	  This builds unit tests for VKMS. This option is not useful for
> > +	  distributions or general kernels, but only for kernel
> > +	  developers working on VKMS.
> > +
> > +	  For more information on KUnit and unit tests in general,
> > +	  please refer to the KUnit documentation in
> > +	  Documentation/dev-tools/kunit/.
> > +
> > +	  If in doubt, say "N".
> > \ No newline at end of file
> > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > index 1b28a6a32948..8d3e46dde635 100644
> > --- a/drivers/gpu/drm/vkms/Makefile
> > +++ b/drivers/gpu/drm/vkms/Makefile
> > @@ -9,3 +9,4 @@ vkms-y := \
> >   	vkms_writeback.o
> >   
> >   obj-$(CONFIG_DRM_VKMS) += vkms.o
> > +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += tests/
> > diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig b/drivers/gpu/drm/vkms/tests/.kunitconfig
> > new file mode 100644
> > index 000000000000..70e378228cbd
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/.kunitconfig
> > @@ -0,0 +1,4 @@
> > +CONFIG_KUNIT=y
> > +CONFIG_DRM=y
> > +CONFIG_DRM_VKMS=y
> > +CONFIG_DRM_VKMS_KUNIT_TESTS=y
> > diff --git a/drivers/gpu/drm/vkms/tests/Makefile b/drivers/gpu/drm/vkms/tests/Makefile
> > new file mode 100644
> > index 000000000000..2d1df668569e
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += vkms_format_test.o
> > diff --git a/drivers/gpu/drm/vkms/tests/vkms_format_test.c b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
> > new file mode 100644
> > index 000000000000..0954d606e44a
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/vkms_format_test.c

[...]

> > +/*
> > + * The YUV color representation were acquired via the colour python framework.
> > + * Below are the function calls used for generating each case.
> > + *
> > + * for more information got to the docs:
> 
> s/for/For

Fixed in v6.

> > + * https://colour.readthedocs.io/en/master/generated/colour.RGB_to_YCbCr.html
> > + */
> > +static struct yuv_u8_to_argb_u16_case yuv_u8_to_argb_u16_cases[] = {
> > +	/*
> > +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> > +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                     in_bits = 16,
> > +	 *                     in_legal = False,
> > +	 *                     in_int = True,
> > +	 *                     out_bits = 8,
> > +	 *                     out_legal = False,
> > +	 *                     out_int = True)
> > +	 */
> 
> I feel that this Python code is kind of poluting the test cases.

This Python code is needed to understand where the values come from. I 
think we should keep it for future reference (adding more cases, testing 
16-bit YUV...)

Maybe we can change the array comment to

 /*
  * The YUV color representations were acquired via the colour python framework:
  *
  * colour.RGB_to_YCbCr(<rgb color in 16-bit form>,
  *			K=colour.WEIGHTS_YCBCR["<format>"],
  *			[...],
  *			out_legal = <limited or full range>)
  *
  * The exact function call arguments are given for each element of this list.
  *
  * [...]
  */

And above each test case:

 /*
  * format = "ITU-R BT.601"
  * out_legal = False
  */

@Arthur, do you agree with those modifications?

> > +	{
> > +		.encoding = DRM_COLOR_YCBCR_BT601,
> > +		.range = DRM_COLOR_YCBCR_FULL_RANGE,
> > +		.n_colors = 6,
> > +		.colors = {
> > +			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> > +			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> > +			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> > +			{ "red",   { 0x4c, 0x55, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> > +			{ "green", { 0x96, 0x2c, 0x15 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> > +			{ "blue",  { 0x1d, 0xff, 0x6b }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> > +		},
> > +	},

[...]
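
The six full-range BT.601 entries quoted above can be cross-checked
without Python. The sketch below uses the textbook full-range BT.601
equations on 8-bit RGB with rounding and clamping. It is not a port of
the colour library, and the names ending in _sketch are hypothetical.
Agreement on these entries does not show that colour computes them the
same way.

```c
#include <assert.h>
#include <stdint.h>

/* Round a non-negative value to the nearest integer and clamp it to 0..255. */
static uint8_t round_clamp_u8_sketch(double v)
{
	if (v <= 0.0)
		return 0;
	if (v >= 255.0)
		return 255;
	return (uint8_t)(v + 0.5);
}

/* Textbook full-range BT.601 RGB to YCbCr on 8-bit values. */
static void ycbcr_from_rgb_bt601_full_sketch(uint8_t r, uint8_t g, uint8_t b,
					     uint8_t ycbcr[3])
{
	ycbcr[0] = round_clamp_u8_sketch(0.299 * r + 0.587 * g + 0.114 * b);
	ycbcr[1] = round_clamp_u8_sketch(128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b);
	ycbcr[2] = round_clamp_u8_sketch(128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b);
}

/* Abort if the computed YCbCr triple differs from the expected table entry. */
static void check_entry_sketch(uint8_t r, uint8_t g, uint8_t b,
			       uint8_t y, uint8_t cb, uint8_t cr)
{
	uint8_t got[3];

	ycbcr_from_rgb_bt601_full_sketch(r, g, b, got);
	assert(got[0] == y);
	assert(got[1] == cb);
	assert(got[2] == cr);
}
```

For example, check_entry_sketch(0xff, 0x00, 0x00, 0x4c, 0x55, 0xff)
corresponds to the "red" entry of the table.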

> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index edbf4b321b91..863fc91d6d48 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -7,6 +7,8 @@
> >   #include <drm/drm_rect.h>
> >   #include <drm/drm_fixed.h>
> >   
> > +#include <kunit/visibility.h>
> > +
> >   #include "vkms_formats.h"
> >   
> >   /**
> > @@ -199,8 +201,8 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >   	return out_pixel;
> >   }
> >   
> > -static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > -						  struct conversion_matrix *matrix)
> > +VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > +							    struct conversion_matrix *matrix)
> >   {
> >   	u8 r, g, b;
> >   	s64 fp_y, fp_cb, fp_cr;
> > @@ -234,6 +236,7 @@ static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> >   
> >   	return argb_u16_from_u8888(255, r, g, b);
> >   }
> > +EXPORT_SYMBOL_IF_KUNIT(argb_u16_from_yuv888);
> >   
> >   /*
> >    * The following functions are read_line function for each pixel format supported by VKMS.
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index e1d324764b17..21e66a0cac16 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -13,4 +13,8 @@ struct conversion_matrix *
> >   get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> >   				  enum drm_color_range range);
> >   
> > +#if IS_ENABLED(CONFIG_KUNIT)
> 
> What about the CONFIG_DRM_EXPORT_FOR_TESTS?

As the documentation for CONFIG_DRM_EXPORT_FOR_TESTS doesn't exist, I don't 
know what to use. Maybe Arthur knows what to do here? If needed, I can 
apply the modifications in the next iteration.

Thanks for all your reviews,
Louis Chauvet
 
[...]

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

* Re: [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
  2024-03-26 15:57     ` Louis Chauvet
@ 2024-03-27 11:48       ` Pekka Paalanen
  2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-27 11:48 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Tue, 26 Mar 2024 16:57:00 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Le 25/03/24 - 14:41, Pekka Paalanen a écrit :
> > On Wed, 13 Mar 2024 18:45:02 +0100
> > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> >   
> > > The pre_mul_alpha_blend is dedicated to blending, so to avoid mixing
> > > different concepts (coordinate calculation and color management), extract
> > > the x_limit and x_dst computation outside of this helper.
> > > It also increases the maintainability by grouping the computation related
> > > to coordinates in the same place: the loop in `blend`.
> > > 
> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_composer.c | 40 +++++++++++++++++-------------------
> > >  1 file changed, 19 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > index da0651a94c9b..9254086f23ff 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > @@ -24,34 +24,30 @@ static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
> > >  
> > >  /**
> > >   * pre_mul_alpha_blend - alpha blending equation
> > > - * @frame_info: Source framebuffer's metadata
> > >   * @stage_buffer: The line with the pixels from src_plane
> > >   * @output_buffer: A line buffer that receives all the blends output
> > > + * @x_start: The start offset to avoid useless copy  
> > 
> > I'd say just:
> > 
> > + * @x_start: The start offset
> > 
> > It describes the parameter, and the paragraph below explains the why.
> > 
> > It would be explaining, that x_start applies to output_buffer, but
> > input_buffer is always read starting from 0.  
> 
> I will change it to:
> 
>  * Using @x_start and @count information, only few pixel can be blended instead of the whole line
>  * each time. @x_start is only used for the output buffer. The staging buffer is always read from
>  * the start (0..@count in stage_buffer is blended at @x_start..@x_start+@count in output_buffer).

The important part is

0..@count in stage_buffer is blended at @x_start..@x_start+@count in output_buffer

and everything else from that paragraph is not really adding much.

Remember to update the doc in "drm/vkms: Re-introduce line-per-line
composition algorithm" to follow the changes.
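
The indexing rule stated above can be sketched in userspace as follows:
pixels 0..count of the staging line are blended into x_start..x_start +
count of the output line. The names ending in _sketch are hypothetical,
and the channel formula (premultiplied source over an opaque
destination, rounded to nearest) is an assumption about what
pre_mul_blend_channel computes, not a copy of it.

```c
#include <assert.h>
#include <stdint.h>

struct pixel_argb_u16_sketch {
	uint16_t a, r, g, b;
};

/* Premultiplied "source over": src + dst * (1 - alpha), rounded to nearest. */
static uint16_t pre_mul_blend_channel_sketch(uint16_t src, uint16_t dst, uint16_t alpha)
{
	uint64_t v = (uint64_t)src * 0xffff + (uint64_t)dst * (0xffff - alpha);

	v = (v + 0x7fff) / 0xffff;
	/* Only reachable if src was not premultiplied (src > alpha). */
	return v > 0xffff ? 0xffff : (uint16_t)v;
}

/*
 * Blend stage[0..pixel_count) into output[x_start..x_start + pixel_count).
 * The staging line is always read from index 0.
 */
static void blend_line_sketch(const struct pixel_argb_u16_sketch *stage,
			      struct pixel_argb_u16_sketch *output,
			      int x_start, int pixel_count)
{
	struct pixel_argb_u16_sketch *out;

	assert(x_start >= 0 && pixel_count >= 0);
	out = &output[x_start];

	for (int i = 0; i < pixel_count; i++) {
		out[i].a = 0xffff;
		out[i].r = pre_mul_blend_channel_sketch(stage[i].r, out[i].r, stage[i].a);
		out[i].g = pre_mul_blend_channel_sketch(stage[i].g, out[i].g, stage[i].a);
		out[i].b = pre_mul_blend_channel_sketch(stage[i].b, out[i].b, stage[i].a);
	}
}
```

The caller is responsible for making sure that x_start + pixel_count
does not exceed the output line, which is where the x_limit computation
in blend() comes in.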


> > > + * @count: The number of byte to copy  
> > 
> > You named it pixel_count, and it counts pixels, not bytes. It's not a
> > copy but a blend into output_buffer.  
> 
> Oops, fixed in v6.
>  
> > >   *
> > > - * Using the information from the `frame_info`, this blends only the
> > > - * necessary pixels from the `stage_buffer` to the `output_buffer`
> > > - * using premultiplied blend formula.
> > > + * Using @x_start and @count information, only few pixel can be blended instead of the whole line
> > > + * each time.
> > >   *
> > >   * The current DRM assumption is that pixel color values have been already
> > >   * pre-multiplied with the alpha channel values. See more
> > >   * drm_plane_create_blend_mode_property(). Also, this formula assumes a
> > >   * completely opaque background.
> > >   */
> > > -static void pre_mul_alpha_blend(struct vkms_frame_info *frame_info,
> > > -				struct line_buffer *stage_buffer,
> > > -				struct line_buffer *output_buffer)
> > > +static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> > > +				struct line_buffer *output_buffer, int x_start, int pixel_count)
> > >  {
> > > -	int x_dst = frame_info->dst.x1;
> > > -	struct pixel_argb_u16 *out = output_buffer->pixels + x_dst;
> > > -	struct pixel_argb_u16 *in = stage_buffer->pixels;
> > > -	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
> > > -			    stage_buffer->n_pixels);
> > > -
> > > -	for (int x = 0; x < x_limit; x++) {
> > > -		out[x].a = (u16)0xffff;
> > > -		out[x].r = pre_mul_blend_channel(in[x].r, out[x].r, in[x].a);
> > > -		out[x].g = pre_mul_blend_channel(in[x].g, out[x].g, in[x].a);
> > > -		out[x].b = pre_mul_blend_channel(in[x].b, out[x].b, in[x].a);
> > > +	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> > > +	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> > > +
> > > +	for (int i = 0; i < pixel_count; i++) {
> > > +		out[i].a = (u16)0xffff;
> > > +		out[i].r = pre_mul_blend_channel(in[i].r, out[i].r, in[i].a);
> > > +		out[i].g = pre_mul_blend_channel(in[i].g, out[i].g, in[i].a);
> > > +		out[i].b = pre_mul_blend_channel(in[i].b, out[i].b, in[i].a);
> > >  	}
> > >  }
> > >  
> > > @@ -183,7 +179,7 @@ static void blend(struct vkms_writeback_job *wb,
> > >  {
> > >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> > >  	u32 n_active_planes = crtc_state->num_active_planes;
> > > -	int y_pos;
> > > +	int y_pos, x_dst, x_limit;
> > >  
> > >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> > >  
> > > @@ -201,14 +197,16 @@ static void blend(struct vkms_writeback_job *wb,
> > >  
> > >  		/* The active planes are composed associatively in z-order. */
> > >  		for (size_t i = 0; i < n_active_planes; i++) {
> > > +			x_dst = plane[i]->frame_info->dst.x1;
> > > +			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > > +					stage_buffer->n_pixels);  
> > 
> > Are those input values to min_t() really of type size_t? Or why is
> > size_t here?  
> 
> n_pixel is size_t, drm_rect_width is int. I will change everything to int. 
> Is there a way to ask the compiler "please don't do implicit conversion 
> and report them as warn/errors"?

There probably is, you can find it in the gcc manual. However, I suspect
you would drown in warnings for cases where the implicit conversion is
wanted and an explicit cast is unwanted.
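
For reference, the GCC options in question are -Wconversion and
-Wsign-conversion. The following userspace sketch only illustrates why the
size_t question matters; min_as_size_t() and min_as_int() are made-up names
that mimic what min_t(size_t, ...) and an all-int version would compute.

```c
#include <stddef.h>

/*
 * Mimics min_t(size_t, width, n_pixels): both operands are converted to
 * size_t before the comparison, so a negative width becomes a huge value.
 */
static size_t min_as_size_t(int width, size_t n_pixels)
{
	size_t w = (size_t)width;

	return w < n_pixels ? w : n_pixels;
}

/* Same operation with int on both sides: a negative width stays negative. */
static int min_as_int(int width, int n_pixels)
{
	return width < n_pixels ? width : n_pixels;
}
```

With a width of -1 and 100 pixels, the first helper returns 100 and the second
returns -1, so only the int version lets the caller notice the bogus width.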


Thanks,
pq

> > >  			y_pos = get_y_pos(plane[i]->frame_info, y);
> > >  
> > >  			if (!check_limit(plane[i]->frame_info, y_pos))
> > >  				continue;
> > >  
> > >  			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > > -			pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
> > > -					    output_buffer);
> > > +			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);  
> > 
> > I thought it was a count, not a limit?
> > 
> > "Limit" sounds to me like "end", and end - start = count.  
> 
> It is effectively a pixel count. I just took those naming from the 
> original pre_mul_alpha_blend. I will change it to pixel_count.
> 
> Thanks,
> Louis Chauvet
> 
> > >  		}
> > >  
> > >  		apply_lut(crtc_state, output_buffer);
> > >   
> > 
> > The details aside, this is a good move.
> > 
> > 
> > Thanks,
> > pq  
> 
> 
> 



* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
  2024-03-13 19:20   ` Randy Dunlap
  2024-03-25 14:26   ` Maíra Canal
@ 2024-03-27 12:11   ` Philipp Zabel
  2024-04-08  7:50     ` Louis Chauvet
  2024-03-27 14:23   ` Pekka Paalanen
  3 siblings, 1 reply; 75+ messages in thread
From: Philipp Zabel @ 2024-03-27 12:11 UTC (permalink / raw)
  To: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Maíra Canal,
	Haneen Mohammed, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen
  Cc: dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Hi Louis,

On Wed, 2024-03-13 at 18:45 +0100, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support for the YUV formats below:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 floats and the drm_fixed helpers.

s/floats/fixed-point numbers/

Nothing floating here, the point is fixed.
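
To illustrate the terminology, here is a small userspace sketch of a 32.32
fixed-point value. The names FRACTIONAL_BITS, fixp_from_double() and
fixp_round_to_int() are invented for this example and are not the drm_fixed
helpers. The double is only used to build the constant, which corresponds to
precomputing the matrix entries offline.

```c
#include <stdint.h>

#define FRACTIONAL_BITS 32
#define FIXP_ONE ((int64_t)1 << FRACTIONAL_BITS)

/* Convert a real coefficient to 32.32 fixed point, rounding to nearest. */
static int64_t fixp_from_double(double v)
{
	double scaled = v * (double)FIXP_ONE;

	return (int64_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}

/* Round a 32.32 fixed-point value to the nearest integer, ties away from zero. */
static int64_t fixp_round_to_int(int64_t a)
{
	const int64_t half = FIXP_ONE / 2;

	return a >= 0 ? (a + half) / FIXP_ONE : -((-a + half) / FIXP_ONE);
}
```

Multiplying such a coefficient by a small integer sample (for example an 8-bit
luma value) stays within int64_t, and all arithmetic after the constant is
built is integer-only.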

[...]
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

s/CONVERSION_MATRIX_FLOAT_DEPTH/CONVERSION_MATRIX_FRACTIONAL_BITS/

Just a suggestion, maybe there are better terms, but using "FLOAT" here
is confusing.

> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH

s/fixed floats/fixed-point numbers/

regards
Philipp


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-26 15:57     ` Louis Chauvet
@ 2024-03-27 12:16       ` Pekka Paalanen
  2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-27 12:16 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Tue, 26 Mar 2024 16:57:00 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> On 25/03/24 - 15:11, Pekka Paalanen wrote:
> > On Wed, 13 Mar 2024 18:45:03 +0100
> > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> >   
> > > The pixel_read_direction enum is useful to describe the reading direction
> > > in a plane. It avoids using the rotation property of DRM, which is not
> > > practical to know the direction of reading.
> > > This patch also introduces two helpers, one to compute the
> > > pixel_read_direction from the DRM rotation property, and one to compute
> > > the step, in bytes, between two successive pixels in a specific direction.
> > > 
> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> > >  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> > >  3 files changed, 77 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > index 9254086f23ff..989bcf59f375 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
> > >  	}
> > >  }
> > >  
> > > +/**
> > > + * direction_for_rotation() - Get the correct reading direction for a given rotation
> > > + *
> > > + * This function will use the @rotation setting of a source plane to compute the reading
> > > + * direction in this plane which corresponds to a "left to right writing" in the CRTC.
> > > + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> > > + * to be written from left to right on the CRTC.  
> > 
> > That is a well written description.  
> 
> Thanks
>  
> > > + *
> > > + * @rotation: Rotation to analyze. It corresponds to the field @frame_info.rotation.
> > > + */
> > > +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> > > +{
> > > +	if (rotation & DRM_MODE_ROTATE_0) {
> > > +		if (rotation & DRM_MODE_REFLECT_X)
> > > +			return READ_RIGHT_TO_LEFT;
> > > +		else
> > > +			return READ_LEFT_TO_RIGHT;
> > > +	} else if (rotation & DRM_MODE_ROTATE_90) {
> > > +		if (rotation & DRM_MODE_REFLECT_Y)
> > > +			return READ_BOTTOM_TO_TOP;
> > > +		else
> > > +			return READ_TOP_TO_BOTTOM;
> > > +	} else if (rotation & DRM_MODE_ROTATE_180) {
> > > +		if (rotation & DRM_MODE_REFLECT_X)
> > > +			return READ_LEFT_TO_RIGHT;
> > > +		else
> > > +			return READ_RIGHT_TO_LEFT;
> > > +	} else if (rotation & DRM_MODE_ROTATE_270) {
> > > +		if (rotation & DRM_MODE_REFLECT_Y)
> > > +			return READ_TOP_TO_BOTTOM;
> > > +		else
> > > +			return READ_BOTTOM_TO_TOP;
> > > +	}
> > > +	return READ_LEFT_TO_RIGHT;  
> > 
> > I'm a little worried seeing REFLECT_X is supported only for some
> > rotations, and REFLECT_Y for other rotations. Why is an analysis of all
> > combinations not necessary?  
> 
> I don't need to manage all the combination because this is only about 
> the "horizontal writing".
> 
> So, if you want to write a line in the CRTC, with:
> - ROT_0 | REF_X => You need to read the source line from right to left
> - ROT_0 => You need to read source buffer from left to right
> - ROT_0 | REF_Y => You need to read the source line from left to right

That is true, indeed.

> In this case, REF_Y only have an effect on the "column reading". It is not 
> needed here because the new version of the blend function will use the 
> drm_rect_* helpers to compute the correct y coordinate.
> 
> If you think it's clearer, I can create a big switch(rotation) like this:
> 
> 	switch (rotation) {
> 	case ROT_0:
> 	case ROT_0 | REF_Y:
> 		return L2R;
> 	case ROT_0 | REF_X:
> 		return R2L;
> 	case ROT_90:
> 	case ROT_90 | REF_X:
> 		return T2B;
> 	[...]
> 	}
> 
> So all cases are clearly covered?

I think that would suit my personal taste better. It would not raise
questions nor need a comment. It does become a long function, but I
tend to favour long and clear more than short and needs thinking to
figure out if it works, everything else being equivalent.

I wonder how DRM maintainers feel.
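
For illustration, a fully spelled-out version could look like the sketch
below. It reproduces the mapping of the if/else chain in the patch, with the
flags combined by bitwise OR. The ROT_*/REF_* constants and the short enum
names are local stand-ins for the DRM_MODE_* bits and READ_* values, so treat
this as a sketch rather than the code for v6.

```c
/* Local stand-ins for the DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* bits. */
enum {
	ROT_0 = 1 << 0, ROT_90 = 1 << 1, ROT_180 = 1 << 2, ROT_270 = 1 << 3,
	REF_X = 1 << 4, REF_Y = 1 << 5,
};

enum read_direction { L2R, R2L, T2B, B2T };

static enum read_direction direction_for_rotation(unsigned int rotation)
{
	switch (rotation) {
	case ROT_0:
	case ROT_0 | REF_Y:
	case ROT_180 | REF_X:
	case ROT_180 | REF_X | REF_Y:
		return L2R;
	case ROT_0 | REF_X:
	case ROT_0 | REF_X | REF_Y:
	case ROT_180:
	case ROT_180 | REF_Y:
		return R2L;
	case ROT_90:
	case ROT_90 | REF_X:
	case ROT_270 | REF_Y:
	case ROT_270 | REF_X | REF_Y:
		return T2B;
	case ROT_90 | REF_Y:
	case ROT_90 | REF_X | REF_Y:
	case ROT_270:
	case ROT_270 | REF_X:
		return B2T;
	default:
		/* Unexpected bit pattern: same fallback as the if/else version. */
		return L2R;
	}
}
```

All sixteen single-rotation combinations are listed once, so a missing case
is visible at a glance.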

> > I hope IGT uses FB patterns instead of solid color in its tests of
> > rotation to be able to detect the difference.  
> 
> They use solid colors, and even my new rotation test [3] uses solid colors.

That will completely fail to detect rotation and reflection bugs then.
E.g. userspace asks for 180-degree rotation, and the driver does not
rotate at all. Or rotate-180 getting confused with one reflection.
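
A small userspace sketch of the problem, not IGT code: fill_solid(),
fill_quadrants() and same_after_rotate_180() are hypothetical helpers that
show a solid image is unchanged by a 180-degree rotation while a four-quadrant
pattern is not.

```c
#include <stdbool.h>
#include <stdint.h>

#define W 4
#define H 4

/* Fill with one color: every orientation of this image looks identical. */
static void fill_solid(uint32_t fb[H][W], uint32_t color)
{
	for (int y = 0; y < H; y++)
		for (int x = 0; x < W; x++)
			fb[y][x] = color;
}

/* Give each quadrant its own color so that orientations can be told apart. */
static void fill_quadrants(uint32_t fb[H][W])
{
	static const uint32_t colors[2][2] = {
		{ 0xff0000, 0x00ff00 },
		{ 0x0000ff, 0xffffff },
	};

	for (int y = 0; y < H; y++)
		for (int x = 0; x < W; x++)
			fb[y][x] = colors[y >= H / 2][x >= W / 2];
}

/* True if rotating the image by 180 degrees leaves it unchanged. */
static bool same_after_rotate_180(uint32_t fb[H][W])
{
	for (int y = 0; y < H; y++)
		for (int x = 0; x < W; x++)
			if (fb[y][x] != fb[H - 1 - y][W - 1 - x])
				return false;
	return true;
}
```

A CRC comparison on the solid image passes whether or not the driver rotated
anything, which is exactly the blind spot described above.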

> It is mainly for yuv formats with subsampling: if you have formats with 
> subsampling, a "software rotated buffer" and a "hardware rotated buffer" 
> will not apply the same subsampling, so the colors will be slightly 
> different.

Why would they not use the same subsampling?

The framebuffer contents are defined in its natural orientation, and
the subsampling applies in the natural orientation. If such a FB
is on a rotated plane, one must account for subsampling first, and
rotate second. 90-degree rotation does not change the encoded color.
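
As a sketch of "subsampling first": the chroma sample for a pixel is picked
from its coordinates in the natural orientation of the FB, and rotation has no
say in it. chroma_sample_index() is a hypothetical helper, it assumes one
chroma sample per index in a planar layout and ignores chroma siting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Index of the chroma sample that covers source pixel (x, y).
 * hsub and vsub are the horizontal and vertical subsampling factors,
 * chroma_pitch is the number of chroma samples per chroma line.
 */
static size_t chroma_sample_index(int x, int y, int hsub, int vsub, size_t chroma_pitch)
{
	assert(x >= 0 && y >= 0 && hsub > 0 && vsub > 0);

	return (size_t)(y / vsub) * chroma_pitch + (size_t)(x / hsub);
}
```

Whatever direction the line is read in afterwards, the same source pixel always
resolves to the same chroma sample.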

Getting the subsampling exactly right is going to be necessary sooner
or later. There is no UAPI for setting chroma siting yet, but ideally
there should be.

> > The return values do seem correct to me, assuming I have guessed
> > correctly what "X" and "Y" refer to when combined with rotation. I did
> > not find good documentation about that.  
> 
> Yes, it is difficult to understand how rotation and reflection should
> work in drm. I spent half a day testing all the combinations in drm_rect_*
> helpers to understand how this works. According to the code:
> - If only rotation or only reflection, easy as expected
> - If reflection and rotation are mixed, the source buffer is first
>   reflected and then rotated.

Now that you know, you could send a documentation patch. :-)

For me as a userspace developer, the important place is
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties

>  
> > Btw. if there are already functions that are able to transform
> > coordinates based on the rotation bitfield, you could alternatively use
> > them. Transform CRTC point (0, 0) to A, and (1, 0) to B. Now A and B
> > are in plane coordinate system, and vector B - A gives you the
> > direction. The reason I'm mentioning this is that then you don't have
> > to implement yet another copy of the rotation bitfield semantics from
> > scratch.  
> 
> You are totally right. I will try this elegant method. Yes, there are some 
> helpers (drm_rect_rotate_inv), so I will try to do something.
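
A rough userspace sketch of the A/B vector idea follows. It does not use
drm_rect_rotate_inv(): display_to_source() is a hand-written stand-in that
assumes the source is reflected first and then rotated counter-clockwise, with
exactly one rotation bit set. The ROT_*/REF_* constants are local stand-ins
for the DRM_MODE_* bits.

```c
enum {
	ROT_0 = 1 << 0, ROT_90 = 1 << 1, ROT_180 = 1 << 2, ROT_270 = 1 << 3,
	REF_X = 1 << 4, REF_Y = 1 << 5,
};

enum read_direction { L2R, R2L, T2B, B2T };

struct point { int x, y; };

/* Map a point of the displayed image back into a source buffer of sw x sh pixels. */
static struct point display_to_source(unsigned int rotation, int sw, int sh, struct point p)
{
	struct point s;

	/* Undo the rotation first. */
	switch (rotation & (ROT_0 | ROT_90 | ROT_180 | ROT_270)) {
	case ROT_90:
		s.x = sw - 1 - p.y;
		s.y = p.x;
		break;
	case ROT_180:
		s.x = sw - 1 - p.x;
		s.y = sh - 1 - p.y;
		break;
	case ROT_270:
		s.x = p.y;
		s.y = sh - 1 - p.x;
		break;
	default:
		s = p;
		break;
	}

	/* Then undo the reflections. */
	if (rotation & REF_X)
		s.x = sw - 1 - s.x;
	if (rotation & REF_Y)
		s.y = sh - 1 - s.y;
	return s;
}

/* Transform (0, 0) to A and (1, 0) to B; the vector B - A is the reading direction. */
static enum read_direction direction_from_vector(unsigned int rotation, int sw, int sh)
{
	struct point a = display_to_source(rotation, sw, sh, (struct point){ 0, 0 });
	struct point b = display_to_source(rotation, sw, sh, (struct point){ 1, 0 });
	int dx = b.x - a.x;
	int dy = b.y - a.y;

	if (dx > 0)
		return L2R;
	if (dx < 0)
		return R2L;
	return dy > 0 ? T2B : B2T;
}
```

The direction falls out of the coordinate transform, so the rotation semantics
live in one place only.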


Cool, thanks,
pq

> >   
> > > +}
> > > +
> > >  /**
> > >   * blend - blend the pixels from all planes and compute crc
> > >   * @wb: The writeback frame buffer metadata
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > index 3ead8b39af4a..985e7a92b7bc 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > @@ -69,6 +69,17 @@ struct vkms_writeback_job {
> > >  	pixel_write_t pixel_write;
> > >  };
> > >  
> > > +/**
> > > + * enum pixel_read_direction - Enum used internally by VKMS to represent a reading direction in a
> > > + * plane.
> > > + */
> > > +enum pixel_read_direction {
> > > +	READ_BOTTOM_TO_TOP,
> > > +	READ_TOP_TO_BOTTOM,
> > > +	READ_RIGHT_TO_LEFT,
> > > +	READ_LEFT_TO_RIGHT
> > > +};
> > > +
> > >  /**
> > >   * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
> > >   * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
> > > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > > index 649d75d05b1f..743b6fd06db5 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > > @@ -75,6 +75,36 @@ static void packed_pixels_addr(const struct vkms_frame_info *frame_info,
> > >  	*addr = (u8 *)frame_info->map[0].vaddr + offset;
> > >  }
> > >  
> > > +/**
> > > + * get_step_next_block() - Common helper to compute the correct step value between each pixel block
> > > + * to read in a certain direction.
> > > + *
> > > + * As the returned offset is the number of bytes between two consecutive blocks in a direction,
> > > + * the caller may have to read multiple pixels before using the next one (for example, to read from
> > > + * left to right in a DRM_FORMAT_R1 plane, each block contains 8 pixels, so the step must be used
> > > + * only every 8 pixels).
> > > + *
> > > + * @fb: Framebuffer to iterate on
> > > + * @direction: Direction of the reading
> > > + * @plane_index: Plane to get the step from
> > > + */
> > > +static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direction direction,
> > > +			       int plane_index)
> > > +{  
> > 
> > I would have called this something like get_block_step_bytes() for
> > example. That makes it clear it returns bytes (not e.g. pixels). "next"
> > implies to me that I tell the function the current block, and then it
> > gets me the next one. It does not do that, so I'd not use "next".  
> 
> Nice name, I will take it for the v6.
> 
> Thanks,
> Louis Chauvet
> 
> > > +	switch (direction) {
> > > +	case READ_LEFT_TO_RIGHT:
> > > +		return fb->format->char_per_block[plane_index];
> > > +	case READ_RIGHT_TO_LEFT:
> > > +		return -fb->format->char_per_block[plane_index];
> > > +	case READ_TOP_TO_BOTTOM:
> > > +		return (int)fb->pitches[plane_index];
> > > +	case READ_BOTTOM_TO_TOP:
> > > +		return -(int)fb->pitches[plane_index];
> > > +	}
> > > +
> > > +	return 0;
> > > +}  
> > 
> > Looks good.
> > 
> > 
> > Thanks,
> > pq
> >   
> > > +
> > >  static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y,
> > >  				 int plane_index)
> > >  {
> > >   
> >   
> 
> 
> 



* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-26 15:57     ` Louis Chauvet
@ 2024-03-27 12:29       ` Pekka Paalanen
  2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-27 12:29 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Tue, 26 Mar 2024 16:57:02 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> [...]
> 
> > > @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
> > >  {
> > >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> > >  	u32 n_active_planes = crtc_state->num_active_planes;
> > > -	int y_pos, x_dst, x_limit;
> > >  
> > >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> > >  
> > > -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > > +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > > +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
> > >  
> > >  	/*
> > >  	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> > >  	 * complexity to avoid poor blending performance.
> > >  	 *
> > > -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > > -	 * buffer.
> > > +	 * The function pixel_read_line callback is used to read a line, using an efficient
> > > +	 * algorithm for a specific format, into the staging buffer.
> > >  	 */
> > >  	for (size_t y = 0; y < crtc_y_limit; y++) {
> > >  		fill_background(&background_color, output_buffer);
> > >  
> > >  		/* The active planes are composed associatively in z-order. */
> > >  		for (size_t i = 0; i < n_active_planes; i++) {
> > > -			x_dst = plane[i]->frame_info->dst.x1;
> > > -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > > -					stage_buffer->n_pixels);
> > > -			y_pos = get_y_pos(plane[i]->frame_info, y);
> > > +			struct vkms_plane_state *current_plane = plane[i];
> > >  
> > > -			if (!check_limit(plane[i]->frame_info, y_pos))
> > > +			/* Avoid rendering useless lines */
> > > +			if (y < current_plane->frame_info->dst.y1 ||
> > > +			    y >= current_plane->frame_info->dst.y2)
> > >  				continue;
> > >  
> > > -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > > -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> > > +			/*
> > > +			 * dst_line is the line to copy. The initial coordinates are inside the  
> 
> [...]
> 
> > > +				 */
> > > +				pixel_count = drm_rect_width(&src_line);
> > > +				if (x_start < 0) {
> > > +					pixel_count += x_start;
> > > +					x_start = 0;
> > > +				}
> > > +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> > > +					pixel_count =
> > > +						(int)current_plane->frame_info->fb->width - x_start;
> > > +				}
> > > +			} else {
> > > +				/*
> > > +				 * In vertical reading, the src_line height is the number of pixel
> > > +				 * to read
> > > +				 */
> > > +				pixel_count = drm_rect_height(&src_line);
> > > +				if (y_start < 0) {
> > > +					pixel_count += y_start;
> > > +					y_start = 0;
> > > +				}
> > > +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> > > +					pixel_count =
> > > +						(int)current_plane->frame_info->fb->height - y_start;
> > > +				}  
> > 
> > When you are clamping x_start or y_start or pixel_count to be inside
> > the source FB, should you not equally adjust the destination
> > coordinates as well?  
> 
> I did not think about it. Currently it is not an issue and it will not 
> read or write outside a buffer because the pixel count is properly 
> limited. But indeed, there is an issue here. I will fix it in the v6.
>  
> > If we take a step back and look at the UAPI, I believe the answer is
> > "no", but it's in no way obvious. It results from the combination of
> > several facts:
> > 
> > - UAPI checks reject any source rectangle that extends outside of the
> >   source FB.
> > 
> > - The source rectangle stretches to fill the destination rectangle
> >   exactly.
> > 
> > - VKMS does not support stretching (scaling), so its UAPI checks reject
> >   any commit with source and destination rectangles of different sizes
> >   after accounting for rotation. (Right?)  
> 
> I don't know exactly what the UAPI contract is, but as the dst can be 
> outside the CRTC, I assumed that the src can be outside the source plane. 
> After thinking about it, that doesn't really make sense.

The UAPI contract for source and destination rectangles is here:
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties

I assume there is some shared (helper?) code in DRM that enforces the
contract and returns error to userspace if it is violated.

> > I think this results in the clamping code being actually dead code.
> > However, I would not delete the clamping code, because it is a cheap
> > safety net in case something goes wrong.  
> 
> If the UAPI checks that the source rectangle is inside the plane, yes it is 
> just a safety net. Otherwise, it is required to properly handle the 
> userspace requests. In the v6, the outside of a source buffer is 
> transparent.
> 
> > If you agree that it's just a safety net, then maybe explain that in a
> > comment? If the safety net catches anything, the composition result
> > will be wrong anyway, so it doesn't matter to adjust the destination
> > rectangle to match.  
> 
> I will extract this whole clamping stuff in a function, is this comment 
> enough?
> 
>  * This function is mainly a safety net to avoid reading outside the source buffer. As the
>  * userspace should never ask to read outside the source plane, all the cases covered here should
>  * be dead code.

Sure. Perhaps use a bit more assertive tone about what the UAPI
contract guarantees. Yes, userspace "should not", but the kernel DRM
code ensures that it does not.
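
As a sketch of what the extracted safety net could look like (clamp_line() is
a hypothetical name and signature, not something from the series), the same
helper can serve both reading directions if the caller passes the FB width for
horizontal reads and the FB height for vertical reads:

```c
/*
 * Safety net: clamp the range [*start, *start + *count) to [0, size).
 * The UAPI checks already keep the source rectangle inside the FB, so the
 * branches below are only expected to trigger if something else went wrong.
 * Returns the number of pixels left to read, 0 if nothing remains.
 */
static int clamp_line(int *start, int *count, int size)
{
	if (*start < 0) {
		*count += *start;
		*start = 0;
	}
	if (*start + *count > size)
		*count = size - *start;
	if (*count < 0)
		*count = 0;
	return *count;
}
```

Sharing one helper also avoids clamping the vertical case against the wrong
dimension by accident.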

> > When the last point is relaxed and VKMS gains scaling support, I think
> > it won't change the fact that the clamping remains as a safety net. It
> > just increases the risk of bugs that would be caught by the net.
> > 
> > Going outside of FB boundaries is a serious bug and deserves to be
> > checked. Going outside of the source rectangle would be a bug too,
> > assuming that partially included pixels are considered fully included,
> > but it's not serious enough to warrant explicit checks. Ideally IGT
> > would catch it.  
> 
> That was exactly the idea behind all those checks and clamping: avoid 
> access outside the buffers.

Good.

To catch a driver using pixels outside of a source rectangle, the test
FB in IGT should be painted to have a different non-black color outside
of the source rectangle.
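
Something along these lines, as a plain C sketch rather than IGT code:
struct rect and paint_test_fb() are hypothetical, with x2/y2 exclusive as in
drm_rect, and the two colors are whatever the test picks.

```c
#include <assert.h>
#include <stdint.h>

struct rect {
	int x1, y1, x2, y2;	/* x2 and y2 are exclusive */
};

/* Paint the source rectangle with one color and everything outside it with another. */
static void paint_test_fb(uint32_t *pixels, int width, int height, struct rect src,
			  uint32_t inside, uint32_t outside)
{
	assert(width > 0 && height > 0);

	for (int y = 0; y < height; y++) {
		for (int x = 0; x < width; x++) {
			int in = x >= src.x1 && x < src.x2 && y >= src.y1 && y < src.y2;

			pixels[y * width + x] = in ? inside : outside;
		}
	}
}
```

If any of the "outside" color shows up in the composed output or changes the
CRC, the driver sampled beyond the source rectangle.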

> > > +			}
> > > +
> > > +			if (pixel_count <= 0) {
> > > +				/* Nothing to read, so avoid multiple function calls for nothing */
> > > +				continue;
> > > +			}
> > > +
> > > +			/*
> > > +			 * Modify the starting point to take in account the rotation
> > > +			 *
> > > +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> > > +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> > > +			 * corner.
> > > +			 */
> > > +			if (direction == READ_RIGHT_TO_LEFT) {
> > > +				// x_start is now the right point
> > > +				x_start += pixel_count - 1;
> > > +			} else if (direction == READ_BOTTOM_TO_TOP) {
> > > +				// y_start is now the bottom point
> > > +				y_start += pixel_count - 1;
> > > +			}
> > > +
> > > +			/*
> > > +			 * Perform the conversion and the blending
> > > +			 *
> > > +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> > > +			 * inside the source buffer [2] and we don't write outside the stage
> > > +			 * buffer [1]
> > > +			 */
> > > +			current_plane->pixel_read_line(
> > > +				current_plane, x_start, y_start, direction, pixel_count,
> > > +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> > > +
> > > +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> > > +					    current_plane->frame_info->dst.x1,
> > > +					    pixel_count);
> > >  		}
> > >  
> > >  		apply_lut(crtc_state, output_buffer);
> > > @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
> > >  		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
> > >  
> > >  		if (wb)
> > > -			vkms_writeback_row(wb, output_buffer, y_pos);
> > > +			vkms_writeback_row(wb, output_buffer, y);
> > >  	}
> > >  }


Thanks,
pq


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-26 15:57     ` Louis Chauvet
@ 2024-03-27 12:59       ` Pekka Paalanen
  2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-27 12:59 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Tue, 26 Mar 2024 16:57:03 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> On 25/03/24 - 11:26, Maíra Canal wrote:
> > On 3/13/24 14:45, Louis Chauvet wrote:  
> > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > 
> > > Add support for the YUV formats below:
> > > 
> > > - NV12/NV16/NV24
> > > - NV21/NV61/NV42
> > > - YUV420/YUV422/YUV444
> > > - YVU420/YVU422/YVU444
> > > 
> > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > 32.32 floats and the drm_fixed helpers.
> > > 
> > > To do the conversion, a specific matrix must be used for each color range
> > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > encoding and range were obtained by rounding the values of the original
> > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > floating point operations.
> > > 
> > > The same reading function is used for YUV and YVU formats. As the only
> > > difference between those two category of formats is the order of field, a
> > > simple swap in conversion matrix columns allows using the same function.
> > > 
> > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > [Louis Chauvet:
> > > - Adapted Arthur's work
> > > - Implemented the read_line_t callbacks for yuv
> > > - add struct conversion_matrix
> > > - remove struct pixel_yuv_u8
> > > - update the commit message
> > > - Merge the modifications from Arthur]  
> > 
> > A Co-developed-by tag would be more appropriate.  
> 
> I am not the main author of this part, I only applied a few simple 
> suggestions, the complex part was done by Arthur.
> 
> I will wait for Arthur's confirmation to change it to Co-developed by if
> he agrees.

Co-developed-by is an additional tag, and does not replace S-o-b. To my
understanding, the kernel rules and Developer's Certificate of Origin
require S-o-b to be added by anyone who has taken a patch and
re-submitted it, regardless of who the original author is, and
especially if the patch was modified.

Personally I also like to keep the list of changes like Louis added, to
credit people better.

> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > >   4 files changed, 473 insertions(+), 1 deletion(-)
> > > 

...

> > >   };
> > >   
> > >   static struct drm_plane_state *
> > > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> > >   	drm_framebuffer_get(frame_info->fb);
> > >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> > >   									  DRM_MODE_ROTATE_90 |
> > > +									  DRM_MODE_ROTATE_180 |  
> > 
> > Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> > reflecting both along the X and Y axis?  
> 
> Oops, I had no intention of putting that change here. I will move it to 
> another patch.
> 
> I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
> the drm_rotation_simplify documentation, it explains that this argument 
> should contain all supported rotations and reflections, and ROT_180 is 
> supported by VKMS. Perhaps this call is unnecessary because all 
> combinations are supported by vkms?

If you truly handle all bit patterns that the rotation bitfield can
have, then yes, the call seems unnecessary.

However, as documented, the bitfield contains redundancy: the same
orientation can be expressed in more than one bit pattern. One example
is that ROTATE_180 is equivalent to REFLECT_X | REFLECT_Y.

Since it's a bitmask, userspace can give you funny values like
ROTATE_0 | ROTATE_90 | ROTATE_180. That is a valid orientation of
270-degree rotation (according to UAPI doc), but it is very awkwardly
expressed, hence the need to normalise it into a minimal bit pattern.

It does not look like drm_rotation_simplify() actually does this
minimisation!

I was not able to tell if DRM common code actually stops userspace from
combining multiple ROTATE bits in the same value. I suspect it must
stop them, or perhaps all code dealing with rotation is actually broken.

drm_rotation_simplify() is useful for cases where your hardware does
not have exactly the same flexibility. Maybe it cannot do REFLECT_Y but
it can do everything else? Then drm_rotation_simplify() gives you a bit
pattern that you can use directly, or fails if the orientation is not
representable with what your hardware can do.

At least, that's my understanding of quickly glancing over it.

IOW, if you wanted to never have to deal with REFLECT_Y bit, you could
leave it out here. Or, if you never want to deal with ROTATE_180, leave
that out - you will get REFLECT_X | REFLECT_Y instead. In theory.
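
To show what such a minimisation could look like, here is a userspace sketch.
It is not drm_rotation_simplify(): normalize_rotation() and
without_rotate_180() are hypothetical helpers, the ROT_*/REF_* constants are
local stand-ins for the DRM_MODE_* bits, and multiple ROTATE bits are
interpreted as a sum of angles as described above.

```c
enum {
	ROT_0 = 1 << 0, ROT_90 = 1 << 1, ROT_180 = 1 << 2, ROT_270 = 1 << 3,
	REF_X = 1 << 4, REF_Y = 1 << 5,
};

/* Collapse any combination of ROT_* bits into a single one by summing the angles. */
static unsigned int normalize_rotation(unsigned int rotation)
{
	unsigned int quarter_turns = 0;

	if (rotation & ROT_90)
		quarter_turns += 1;
	if (rotation & ROT_180)
		quarter_turns += 2;
	if (rotation & ROT_270)
		quarter_turns += 3;

	return (rotation & (REF_X | REF_Y)) | (1u << (quarter_turns % 4));
}

/*
 * Re-express ROT_180 and ROT_270 with ROT_0 or ROT_90 only, using the fact
 * that a 180-degree rotation equals reflecting along both X and Y.
 */
static unsigned int without_rotate_180(unsigned int rotation)
{
	unsigned int reflect;

	rotation = normalize_rotation(rotation);
	reflect = (rotation & (REF_X | REF_Y)) ^ (REF_X | REF_Y);

	if (rotation & ROT_180)
		return reflect | ROT_0;
	if (rotation & ROT_270)
		return reflect | ROT_90;
	return rotation;
}
```

A driver that runs every value through helpers like these only ever has to
handle eight bit patterns.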


Thanks,
pq


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
                     ` (2 preceding siblings ...)
  2024-03-27 12:11   ` Philipp Zabel
@ 2024-03-27 14:23   ` Pekka Paalanen
  2024-04-08  7:50     ` Louis Chauvet
  3 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-27 14:23 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Wed, 13 Mar 2024 18:45:05 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support to the YUV formats bellow:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 floats and the drm_fixed helpers.

You mean fixed-point, not floating-point (floats).

> 
> To do the conversion, a specific matrix must be used for each color range
> (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> the `conversion_matrix` struct, along with the specific y_offset needed.
> This matrix is queried only once, in `vkms_plane_atomic_update` and
> stored in a `vkms_plane_state`. Those conversion matrices of each
> encoding and range were obtained by rounding the values of the original
> conversion matrices multiplied by 2^32. This is done to avoid the use of
> floating point operations.
> 
> The same reading function is used for YUV and YVU formats. As the only
> difference between those two category of formats is the order of field, a
> simple swap in conversion matrix columns allows using the same function.

Sounds good!

> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>  4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32

Fraction, not float.

> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH

Fixed float is not a thing. They are signed fixed-point values with
32-bit fractional part.
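
To illustrate what "signed fixed-point with a 32-bit fractional part" means for the matrix entries, here is a small user-space sketch. The helper names are made up for illustration and are not part of drm_fixed.h; the only assumption is that a real coefficient c is stored as round(c * 2^32) in a signed 64-bit integer, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 32 /* same depth as CONVERSION_MATRIX_FLOAT_DEPTH in the patch */
#define FIXED_ONE ((int64_t)1 << FRAC_BITS)

/* Hypothetical helper: encode a real coefficient as signed 32.32 fixed-point. */
static int64_t coeff_to_fixed(double x)
{
	double scaled = x * (double)FIXED_ONE;

	/* The integer part must fit in the upper 32 bits, sign included. */
	assert(x > -2147483648.0 && x < 2147483648.0);

	/* Round half away from zero; the cast itself truncates towards zero. */
	return (int64_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}

/* Decode back to a real number, for inspection only. */
static double fixed_to_double(int64_t v)
{
	return (double)v / (double)FIXED_ONE;
}
```

For example, 1.0 encodes to 4294967296, and the BT.601 Cr-to-R coefficient 1.402 encodes to 6021544149, which is the value that appears in the patch's yuv_bt601_full matrix.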

> + * @y_offest: Offset to apply on the y value.
> + */
> +struct conversion_matrix {
> +	s64 matrix[3][3];
> +	s64 y_offset;
> +};

Btw. too bad that drm_fixed.h does not define something like

	typedef struct drm_fixed {
		s64 v;
	} drm_fixed_t;

and use it throughout the API wherever a fixed-point value is passed. It
would make the type very explicit, and the struct would prevent implicit
conversion to/from regular integer types.

Then you could use drm_fixed_t instead of s64 and it would be obvious
how the values must be handled and which API is appropriate.
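
A rough user-space sketch of how such a wrapper type could behave is shown here. Everything in it (fx_t, fx_from_int(), fx_add(), fx_to_int_round()) is hypothetical and only illustrates the idea; none of it exists in drm_fixed.h.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_ONE ((int64_t)1 << 32)

/* Hypothetical wrapper type for a signed 32.32 fixed-point value. */
typedef struct fx {
	int64_t v;
} fx_t;

static_assert(sizeof(fx_t) == sizeof(int64_t), "wrapper must add no overhead");

static fx_t fx_from_int(int i)
{
	/* Multiply rather than shift, so negative inputs stay well defined. */
	return (fx_t){ .v = (int64_t)i * FIXED_ONE };
}

static fx_t fx_add(fx_t a, fx_t b)
{
	return (fx_t){ .v = a.v + b.v };
}

/* Round to the nearest integer, ties towards +infinity. */
static int fx_to_int_round(fx_t a)
{
	int64_t biased = a.v + FIXED_ONE / 2;
	int64_t q = biased / FIXED_ONE;

	/* C division truncates towards zero; step down to get a floor. */
	if (biased % FIXED_ONE < 0)
		q--;
	return (int)q;
}
```

With this shape, a call such as fx_add(a, 3) does not compile, because a plain int cannot be passed where the struct is expected; the caller has to write fx_from_int(3) and thereby state the intent.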

> +
>  /**
>   * vkms_plane_state - Driver specific plane state
>   * @base: base plane state
> @@ -110,6 +131,7 @@ struct vkms_plane_state {
>  	struct drm_shadow_plane_state base;
>  	struct vkms_frame_info *frame_info;
>  	pixel_read_line_t pixel_read_line;
> +	struct conversion_matrix *conversion_matrix;

If the matrix were embedded as a copy instead of a pointer to a (const!)
matrix, you would not need to hardcode the YVU variants of the
matrices by hand; you could simply swap the columns of the YUV matrix
while copying it into this field.
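
A minimal user-space sketch of that copy-with-swap follows. The struct has the same layout as the patch's struct conversion_matrix (with int64_t standing in for s64) and the BT.601 full-range values are taken from the patch; copy_conversion_matrix() and its swap_chroma parameter are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the patch's struct conversion_matrix. */
struct conversion_matrix {
	int64_t matrix[3][3];
	int64_t y_offset;
};

/* BT.601 full-range values, as listed in the patch. */
static const struct conversion_matrix yuv_bt601_full = {
	.matrix = {
		{ 4294967296, 0,           6021544149 },
		{ 4294967296, -1478054095, -3067191994 },
		{ 4294967296, 7610682049,  0 },
	},
	.y_offset = 0,
};

/* Hypothetical helper: copy @src into @dst, optionally swapping the two chroma columns. */
static void copy_conversion_matrix(struct conversion_matrix *dst,
				   const struct conversion_matrix *src,
				   bool swap_chroma)
{
	assert(dst != NULL && src != NULL);

	*dst = *src;
	if (!swap_chroma)
		return;

	for (int row = 0; row < 3; row++) {
		int64_t tmp = dst->matrix[row][1];

		dst->matrix[row][1] = dst->matrix[row][2];
		dst->matrix[row][2] = tmp;
	}
}
```

Copying yuv_bt601_full with swap_chroma set yields the same values as the hand-written yvu_bt601_full matrix in the patch.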


>  };
>  
>  struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>  	return 0;
>  }
>  
> +/**
> + * get_subsampling() - Get the subsampling divisor value on a specific direction
> + */
> +static int get_subsampling(const struct drm_format_info *format,
> +			   enum pixel_read_direction direction)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +	case READ_TOP_TO_BOTTOM:
> +		return format->vsub;
> +	case READ_RIGHT_TO_LEFT:
> +	case READ_LEFT_TO_RIGHT:
> +		return format->hsub;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 1;
> +}
> +
> +/**
> + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> + * x_start and y_start values
> + */
> +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +		return -y_start - 1;
> +	case READ_TOP_TO_BOTTOM:
> +		return y_start;
> +	case READ_RIGHT_TO_LEFT:
> +		return -x_start - 1;
> +	case READ_LEFT_TO_RIGHT:
> +		return x_start;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 0;
> +}
> +
>  /*
>   * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>   * ARGB16161616 in out_pixel.
> @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>  	return out_pixel;
>  }
>  
> +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> +						  struct conversion_matrix *matrix)

If you are using the "swap the matrix columns" trick, then you cannot
call these cb, cr nor even u,v, because they might be the opposite.
They are simply the first and second chroma channel, and their meaning
depends on the given matrix.

> +{
> +	u8 r, g, b;
> +	s64 fp_y, fp_cb, fp_cr;
> +	s64 fp_r, fp_g, fp_b;
> +
> +	fp_y = y - matrix->y_offset;
> +	fp_cb = cb - 128;
> +	fp_cr = cr - 128;

This looks like an incorrect way to convert u8 to fixed-point, but...

> +
> +	fp_y = drm_int2fixp(fp_y);
> +	fp_cb = drm_int2fixp(fp_cb);
> +	fp_cr = drm_int2fixp(fp_cr);

I find it confusing to re-purpose variables like this.

I'd do just

	fp_c1 = drm_int2fixp((int)c1 - 128);

If the function arguments were int to begin with, then the cast would
be obviously unnecessary.

So, what you have in the fp variables at this point is fractional numbers
in the 8-bit integer scale. However, because the target format is
16-bit, you should not throw the extra precision away here. Instead,
multiply by 257 to bring the values to 16-bit scale, and do the RGB
clamping to 16-bit, not 8-bit.
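
One way this could look is sketched below as stand-alone user-space C. It is not the patch's code: the types are stand-ins for the kernel ones, argb_u16_from_yuv() and fixed_to_u16() are hypothetical names, and instead of drm_fixp_mul() the sketch multiplies the 32.32 coefficient by a plain integer sample, which also gives a 32.32 result and cannot overflow 64 bits for 8-bit inputs. The chroma arguments are called c1 and c2 because their meaning depends on the matrix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_ONE ((int64_t)1 << 32)

/* User-space stand-ins for the kernel types used by the patch. */
struct conversion_matrix {
	int64_t matrix[3][3]; /* signed 32.32 fixed-point */
	int64_t y_offset;
};

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* BT.601 full-range values, as listed in the patch. */
static const struct conversion_matrix yuv_bt601_full = {
	.matrix = {
		{ 4294967296, 0,           6021544149 },
		{ 4294967296, -1478054095, -3067191994 },
		{ 4294967296, 7610682049,  0 },
	},
	.y_offset = 0,
};

/* Clamp a 32.32 value to [0, 0xffff], then round to the nearest integer. */
static uint16_t fixed_to_u16(int64_t fp)
{
	if (fp <= 0)
		return 0;
	if (fp >= 0xffff * FIXED_ONE)
		return 0xffff;
	/* fp is positive here, so the right shift is well defined. */
	return (uint16_t)((fp + FIXED_ONE / 2) >> 32);
}

/* Hypothetical 16-bit variant of the conversion; c1/c2 are the chroma channels. */
static struct pixel_argb_u16 argb_u16_from_yuv(int y, int c1, int c2,
					       const struct conversion_matrix *m)
{
	struct pixel_argb_u16 out;
	int64_t in[3], fp[3];

	assert(m != NULL);

	/* Remove the offsets in 8-bit scale, then go to 16-bit scale (255 * 257 == 65535). */
	in[0] = ((int64_t)y - m->y_offset) * 257;
	in[1] = ((int64_t)c1 - 128) * 257;
	in[2] = ((int64_t)c2 - 128) * 257;

	/* Integer sample times 32.32 coefficient gives a 32.32 result. */
	for (int i = 0; i < 3; i++)
		fp[i] = m->matrix[i][0] * in[0] + m->matrix[i][1] * in[1] +
			m->matrix[i][2] * in[2];

	out.a = 0xffff;
	out.r = fixed_to_u16(fp[0]);
	out.g = fixed_to_u16(fp[1]);
	out.b = fixed_to_u16(fp[2]);
	return out;
}
```

Because the result is clamped and rounded directly in the 16-bit scale, nothing goes through an 8-bit intermediate, so the fractional part produced by the matrix multiplication contributes to the final 16-bit value.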

> +
> +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> +
> +	fp_r = drm_fixp2int_round(fp_r);
> +	fp_g = drm_fixp2int_round(fp_g);
> +	fp_b = drm_fixp2int_round(fp_b);
> +
> +	r = clamp(fp_r, 0, 0xff);
> +	g = clamp(fp_g, 0, 0xff);
> +	b = clamp(fp_b, 0, 0xff);
> +
> +	return argb_u16_from_u8888(255, r, g, b);

Going through argb_u16_from_u8888() will throw away precision.

> +}
> +
>  /*
>   * The following functions are read_line function for each pixel format supported by VKMS.
>   *
> @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
>  	}
>  }
>  
> +/*
> + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> + * (column inversion)

Would be nice to explain what semi_planar_yuv means, so that the
documentation for these functions would show how they differ rather
than all saying exactly the same thing.

> + */
> +static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> +				      int y_start, enum pixel_read_direction direction, int count,
> +				      struct pixel_argb_u16 out_pixel[])
> +{
> +	int rem_x, rem_y;
> +	u8 *y_plane;
> +	u8 *uv_plane;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);

Assert rem_x, rem_y are zero, or block is 1x1.

> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   1, &uv_plane, &rem_x, &rem_y);

Assert rem_x, rem_y are zero, or block is 1x1.

Actually, this is so common that maybe there should be a wrapper for
packed_pixels_addr(), or another variant of it, that asserts that the
block size is 1x1 and does not return rem_x, rem_y at all.

> +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> +	int step_uv = get_step_next_block(plane->frame_info->fb, direction, 1);
> +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> +
> +	for (int i = 0; i < count; i++) {
> +		*out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
> +						  conversion_matrix);
> +		out_pixel += 1;
> +		y_plane += step_y;
> +		if ((i + subsampling_offset + 1) % subsampling == 0)
> +			uv_plane += step_uv;
> +	}
> +}
> +
> +/*
> + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> + * (column inversion)
> + */
> +static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> +				 int y_start, enum pixel_read_direction direction, int count,
> +				 struct pixel_argb_u16 out_pixel[])
> +{
> +	int rem_x, rem_y;
> +	u8 *y_plane;
> +	u8 *u_plane;
> +	u8 *v_plane;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   1, &u_plane, &rem_x, &rem_y);
> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   2, &v_plane, &rem_x, &rem_y);
> +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> +	int step_u = get_step_next_block(plane->frame_info->fb, direction, 1);
> +	int step_v = get_step_next_block(plane->frame_info->fb, direction, 2);
> +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> +
> +	for (int i = 0; i < count; i++) {
> +		*out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
> +		out_pixel += 1;
> +		y_plane += step_y;
> +		if ((i + subsampling_offset + 1) % subsampling == 0) {
> +			u_plane += step_u;
> +			v_plane += step_v;
> +		}
> +	}
> +}
> +
>  /*
>   * The following functions take one argb_u16 pixel and convert it to a specific format. The
>   * result is stored in @out_pixel.
> @@ -418,6 +565,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
>  		return &XRGB16161616_read_line;
>  	case DRM_FORMAT_RGB565:
>  		return &RGB565_read_line;
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		return &semi_planar_yuv_read_line;
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +		return &planar_yuv_read_line;
>  	default:
>  		/*
>  		 * This is a bug in vkms_plane_atomic_check. All the supported
> @@ -435,6 +596,276 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
>  	}
>  }
>  
> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> + * diagonal matrix, which act as a "no-op".

This comment about NULL seems bogus.

> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {

Every matrix here should be 'static const' rather than only 'static'.

> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrixies were generated using the colour python framework
> +	 *
> +	 * Below are the function calls used to generate eac matrix, go to
> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,

Ugh, colour.matrix_YCbCr documentation is confusing. This is the first
time I've heard of "legal range", so I had to look it up. Of course,
the doc does not explain it.

Reading
https://kb.pomfort.com/livegrade/advanced-grading-features/legal-and-extended-sdi-signals-and-luts-in-livegrade/
it sounds like extended range in 8-bit is 1-254, not 0-255 that
we use in computer graphics. This matches what I've read before
elsewhere in ITU or SMPTE specs.

SDI signals reserve the 8-bit code points 0 and 255 for
synchronization, making them invalid as data. It scales to higher bit
depths, so 10-bit code points 0-3 and 1020-1023 inclusive are reserved
for synchronization.

IOW, there are two different "full range" quantizations: extended and full.
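
As a small illustration of the reserved code points described above, here is a sketch that checks whether a code value is usable as data. It assumes the 8-bit reservation (0 and 255) simply scales by 2^(bits - 8) at higher depths, which reproduces the 10-bit figures quoted above; sdi_code_is_data() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true if @code can carry data at @bits per component,
 * i.e. it is not one of the codes reserved for synchronisation.
 */
static bool sdi_code_is_data(unsigned int bits, unsigned int code)
{
	assert(bits >= 8 && bits <= 16);

	unsigned int scale = 1u << (bits - 8);
	unsigned int first_data = scale;          /* 1 for 8-bit, 4 for 10-bit */
	unsigned int last_data = 255 * scale - 1; /* 254 for 8-bit, 1019 for 10-bit */

	return code >= first_data && code <= last_data;
}
```

Under that assumption the extended range is 1-254 in 8-bit and 4-1019 in 10-bit, while full range uses every code from 0 to 2^bits - 1.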

Does is_legal=False refer to extended or full? The documentation
does not say.

However, given that changing 'bits' value with is_legal=False does not
change the result, and with is_legal=True it does change the result, I
suspect is_legal=False means full range, not extended range.

So I think the python snippet is correct.

> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 0,           6021544149 },
> +			{ 4294967296, -1478054095, -3067191994 },
> +			{ 4294967296, 7610682049,  0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,           6881764740 },
> +			{ 5020601039, -1689204679, -3505362278 },
> +			{ 5020601039, 8697922339,  0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6763714498 },
> +			{ 4294967296, -804551626, -2010578443 },
> +			{ 4294967296, 7969741314, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7729959424 },
> +			{ 5020601039, -919487572, -2297803934 },
> +			{ 5020601039, 9108275786, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6333358775 },
> +			{ 4294967296, -706750298, -2453942994 },
> +			{ 4294967296, 8080551471, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7238124312 },
> +			{ 5020601039, -807714626, -2804506279 },
> +			{ 5020601039, 9234915964, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * The next matrices are just the previous ones, but with the first and
> +	 * second columns swapped

As I mentioned earlier, you could derive those below from the above
matrices in code, so you don't need all these open-coded.

You also would not need the switch ladder below twice; you'd only need
a 'bool need_to_swap_columns' derived from the pixel format.

You could also have a 'bool limited_range', and do

	case DRM_COLOR_YCBCR_BT601:
		return limited_range ? &yuv_bt601_limited : &yuv_bt601_full;
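
Spelled out as a stand-alone user-space sketch, the single ladder could look roughly like this. The enum and get_yuv_matrix() are local stand-ins (the kernel uses enum drm_color_encoding and enum drm_color_range), int64_t replaces s64, and the matrix values are copied from the patch. Column swapping for YVU formats is deliberately left out here; it would be applied on the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct conversion_matrix {
	int64_t matrix[3][3];
	int64_t y_offset;
};

/* Local stand-in for enum drm_color_encoding. */
enum color_encoding { ENC_BT601, ENC_BT709, ENC_BT2020, ENC_MAX };

/* Values as listed in the patch. */
static const struct conversion_matrix yuv_bt601_full = {
	.matrix = { { 4294967296, 0, 6021544149 },
		    { 4294967296, -1478054095, -3067191994 },
		    { 4294967296, 7610682049, 0 } },
	.y_offset = 0,
};
static const struct conversion_matrix yuv_bt601_limited = {
	.matrix = { { 5020601039, 0, 6881764740 },
		    { 5020601039, -1689204679, -3505362278 },
		    { 5020601039, 8697922339, 0 } },
	.y_offset = 16,
};
static const struct conversion_matrix yuv_bt709_full = {
	.matrix = { { 4294967296, 0, 6763714498 },
		    { 4294967296, -804551626, -2010578443 },
		    { 4294967296, 7969741314, 0 } },
	.y_offset = 0,
};
static const struct conversion_matrix yuv_bt709_limited = {
	.matrix = { { 5020601039, 0, 7729959424 },
		    { 5020601039, -919487572, -2297803934 },
		    { 5020601039, 9108275786, 0 } },
	.y_offset = 16,
};
static const struct conversion_matrix yuv_bt2020_full = {
	.matrix = { { 4294967296, 0, 6333358775 },
		    { 4294967296, -706750298, -2453942994 },
		    { 4294967296, 8080551471, 0 } },
	.y_offset = 0,
};
static const struct conversion_matrix yuv_bt2020_limited = {
	.matrix = { { 5020601039, 0, 7238124312 },
		    { 5020601039, -807714626, -2804506279 },
		    { 5020601039, 9234915964, 0 } },
	.y_offset = 16,
};

/* Hypothetical lookup: one switch on the encoding, one ternary on the range. */
static const struct conversion_matrix *
get_yuv_matrix(enum color_encoding encoding, bool limited_range)
{
	switch (encoding) {
	case ENC_BT601:
		return limited_range ? &yuv_bt601_limited : &yuv_bt601_full;
	case ENC_BT709:
		return limited_range ? &yuv_bt709_limited : &yuv_bt709_full;
	case ENC_BT2020:
		return limited_range ? &yuv_bt2020_limited : &yuv_bt2020_full;
	case ENC_MAX:
		break;
	}
	assert(!"unsupported encoding");
	return &yuv_bt601_full;
}
```

Only the six YUV matrices remain as data, and every table can be const because nothing is modified in place.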


> +	 */
> +	static struct conversion_matrix yvu_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 6021544149,  0 },
> +			{ 4294967296, -3067191994, -1478054095 },
> +			{ 4294967296, 0,           7610682049 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 6881764740,  0 },
> +			{ 5020601039, -3505362278, -1689204679 },
> +			{ 5020601039, 0,           8697922339 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 6763714498,  0 },
> +			{ 4294967296, -2010578443, -804551626 },
> +			{ 4294967296, 0,           7969741314 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 7729959424,  0 },
> +			{ 5020601039, -2297803934, -919487572 },
> +			{ 5020601039, 0,           9108275786 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 6333358775,  0 },
> +			{ 4294967296, -2453942994, -706750298 },
> +			{ 4294967296, 0,           8080551471 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 7238124312,  0 },
> +			{ 5020601039, -2804506279, -807714626 },
> +			{ 5020601039, 0,           9234915964 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> +	switch (format) {
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_ARGB8888:
> +	case DRM_FORMAT_XRGB8888:
> +	case DRM_FORMAT_ARGB16161616:
> +	case DRM_FORMAT_XRGB16161616:
> +	case DRM_FORMAT_RGB565:
> +		/*
> +		 * Those formats are supported, but they don't need a conversion matrix. Return
> +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> +		 * somewhere.
> +		 */
> +		return &no_operation;
> +	default:
> +		break;
> +	}
> +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> +	     encoding, range, &format);
> +	return &no_operation;
> +}
> +
>  /**
>   * Retrieve the correct write_pixel function for a specific format.
>   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 8d2bef95ff79..e1d324764b17 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
>  
>  pixel_write_t get_pixel_write_function(u32 format);
>  
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range);
> +
>  #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
>  	DRM_FORMAT_XRGB8888,
>  	DRM_FORMAT_XRGB16161616,
>  	DRM_FORMAT_ARGB16161616,
> -	DRM_FORMAT_RGB565
> +	DRM_FORMAT_RGB565,
> +	DRM_FORMAT_NV12,
> +	DRM_FORMAT_NV16,
> +	DRM_FORMAT_NV24,
> +	DRM_FORMAT_NV21,
> +	DRM_FORMAT_NV61,
> +	DRM_FORMAT_NV42,
> +	DRM_FORMAT_YUV420,
> +	DRM_FORMAT_YUV422,
> +	DRM_FORMAT_YUV444,
> +	DRM_FORMAT_YVU420,
> +	DRM_FORMAT_YVU422,
> +	DRM_FORMAT_YVU444
>  };
>  
>  static struct drm_plane_state *
> @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	drm_framebuffer_get(frame_info->fb);
>  	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
>  									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_180 |
>  									  DRM_MODE_ROTATE_270 |
>  									  DRM_MODE_REFLECT_X |
>  									  DRM_MODE_REFLECT_Y);
>  
>  
>  	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> +		(fmt, new_state->color_encoding, new_state->color_range);
>  }
>  
>  static int vkms_plane_atomic_check(struct drm_plane *plane,
> 

I couldn't pinpoint what would need to be fixed so that rotation would
not change chroma siting, but I also cannot say that chroma siting is
definitely correct already.


Thanks,
pq


* Re: [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
  2024-03-26 15:56     ` Louis Chauvet
@ 2024-03-27 15:03       ` Maíra Canal
  0 siblings, 0 replies; 75+ messages in thread
From: Maíra Canal @ 2024-03-27 15:03 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Haneen Mohammed, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, arthurgrillo, Jonathan Corbet, pekka.paalanen,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On 3/26/24 12:56, Louis Chauvet wrote:
> Le 25/03/24 - 10:56, Maíra Canal a écrit :
>> On 3/13/24 14:44, Louis Chauvet wrote:
>>> Introduce two typedefs: pixel_read_t and pixel_write_t. It allows the
>>> compiler to check if the passed functions take the correct arguments.
>>> Such typedefs will help ensuring consistency across the code base in
>>> case of update of these prototypes.
>>>
>>> Rename input/output variable in a consistent way between read_line and
>>> write_line.
>>>
>>> A warn has been added in get_pixel_*_function to alert when an unsupported
>>> pixel format is requested. As those formats are checked before
>>> atomic_update callbacks, it should never append.
>>>
>>> Document for those typedefs.
>>>
>>> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
>>> ---
>>>    drivers/gpu/drm/vkms/vkms_drv.h     |  23 ++++++-
>>>    drivers/gpu/drm/vkms/vkms_formats.c | 124 +++++++++++++++++++++---------------
>>>    drivers/gpu/drm/vkms/vkms_formats.h |   4 +-
>>>    drivers/gpu/drm/vkms/vkms_plane.c   |   2 +-
>>>    4 files changed, 95 insertions(+), 58 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
>>> index 18086423a3a7..4bfc62d26f08 100644
>>> --- a/drivers/gpu/drm/vkms/vkms_drv.h
>>> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
>>> @@ -53,12 +53,31 @@ struct line_buffer {
>>>    	struct pixel_argb_u16 *pixels;
>>>    };
>>>    
>>> +/**
>>> + * typedef pixel_write_t - These functions are used to read a pixel from a
>>> + * `struct pixel_argb_u16*`, convert it in a specific format and write it in the @dst_pixels
>>> + * buffer.
>>
>> Your brief description looks a bit big to me. Also, take a look at the
>> cross-references docs [1].
> 
> Is this description sufficient?
> 
> 	typedef pixel_write_t - Convert a pixel from a &struct pixel_argb_u16 into a specific format

Yeah.

Best Regards,
- Maíra

>   
>> [1]
>> https://docs.kernel.org/doc-guide/kernel-doc.html#highlights-and-cross-references
>>
>>> + *
>>> + * @out_pixel: destination address to write the pixel
>>> + * @in_pixel: pixel to write
>>> + */
>>> +typedef void (*pixel_write_t)(u8 *out_pixel, struct pixel_argb_u16 *in_pixel);
>>> +
>>>    struct vkms_writeback_job {
>>>    	struct iosys_map data[DRM_FORMAT_MAX_PLANES];
>>>    	struct vkms_frame_info wb_frame_info;
>>> -	void (*pixel_write)(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel);
>>> +	pixel_write_t pixel_write;
>>>    };
>>>    
>>> +/**
>>> + * typedef pixel_read_t - These functions are used to read a pixel in the source frame,
>>> + * convert it to `struct pixel_argb_u16` and write it to @out_pixel.
>>
>> Same.
> 
> 	typedef pixel_read_t - Read a pixel and convert it to a &struct pixel_argb_u16
>   
>>> + *
>>> + * @in_pixel: Pointer to the pixel to read
>>> + * @out_pixel: Pointer to write the converted pixel
>>
>> s/Pointer/pointer
> 
> Fixed in v6.
> 
>>> + */
>>> +typedef void (*pixel_read_t)(u8 *in_pixel, struct pixel_argb_u16 *out_pixel);
>>> +
>>>    /**
>>>     * vkms_plane_state - Driver specific plane state
>>>     * @base: base plane state
>>> @@ -69,7 +88,7 @@ struct vkms_writeback_job {
>>>    struct vkms_plane_state {
>>>    	struct drm_shadow_plane_state base;
>>>    	struct vkms_frame_info *frame_info;
>>> -	void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
>>> +	pixel_read_t pixel_read;
>>>    };
>>>    
>>>    struct vkms_plane {
>>> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
>>> index 6e3dc8682ff9..55a4365d21a4 100644
>>> --- a/drivers/gpu/drm/vkms/vkms_formats.c
>>> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
>>> @@ -76,7 +76,7 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
>>>     * They are used in the `vkms_compose_row` function to handle multiple formats.
>>>     */
>>>    
>>> -static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>>> +static void ARGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>>>    {
>>>    	/*
>>>    	 * The 257 is the "conversion ratio". This number is obtained by the
>>> @@ -84,48 +84,48 @@ static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixe
>>>    	 * the best color value in a pixel format with more possibilities.
>>>    	 * A similar idea applies to others RGB color conversions.
>>>    	 */
>>> -	out_pixel->a = (u16)src_pixels[3] * 257;
>>> -	out_pixel->r = (u16)src_pixels[2] * 257;
>>> -	out_pixel->g = (u16)src_pixels[1] * 257;
>>> -	out_pixel->b = (u16)src_pixels[0] * 257;
>>> +	out_pixel->a = (u16)in_pixel[3] * 257;
>>> +	out_pixel->r = (u16)in_pixel[2] * 257;
>>> +	out_pixel->g = (u16)in_pixel[1] * 257;
>>> +	out_pixel->b = (u16)in_pixel[0] * 257;
>>>    }
>>>    
>>> -static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>>> +static void XRGB8888_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>>>    {
>>>    	out_pixel->a = (u16)0xffff;
>>> -	out_pixel->r = (u16)src_pixels[2] * 257;
>>> -	out_pixel->g = (u16)src_pixels[1] * 257;
>>> -	out_pixel->b = (u16)src_pixels[0] * 257;
>>> +	out_pixel->r = (u16)in_pixel[2] * 257;
>>> +	out_pixel->g = (u16)in_pixel[1] * 257;
>>> +	out_pixel->b = (u16)in_pixel[0] * 257;
>>>    }
>>>    
>>> -static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>>> +static void ARGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)src_pixels;
>>> +	u16 *pixel = (u16 *)in_pixel;
>>>    
>>> -	out_pixel->a = le16_to_cpu(pixels[3]);
>>> -	out_pixel->r = le16_to_cpu(pixels[2]);
>>> -	out_pixel->g = le16_to_cpu(pixels[1]);
>>> -	out_pixel->b = le16_to_cpu(pixels[0]);
>>> +	out_pixel->a = le16_to_cpu(pixel[3]);
>>> +	out_pixel->r = le16_to_cpu(pixel[2]);
>>> +	out_pixel->g = le16_to_cpu(pixel[1]);
>>> +	out_pixel->b = le16_to_cpu(pixel[0]);
>>>    }
>>>    
>>> -static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>>> +static void XRGB16161616_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)src_pixels;
>>> +	u16 *pixel = (u16 *)in_pixel;
>>>    
>>>    	out_pixel->a = (u16)0xffff;
>>> -	out_pixel->r = le16_to_cpu(pixels[2]);
>>> -	out_pixel->g = le16_to_cpu(pixels[1]);
>>> -	out_pixel->b = le16_to_cpu(pixels[0]);
>>> +	out_pixel->r = le16_to_cpu(pixel[2]);
>>> +	out_pixel->g = le16_to_cpu(pixel[1]);
>>> +	out_pixel->b = le16_to_cpu(pixel[0]);
>>>    }
>>>    
>>> -static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
>>> +static void RGB565_to_argb_u16(u8 *in_pixel, struct pixel_argb_u16 *out_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)src_pixels;
>>> +	u16 *pixel = (u16 *)in_pixel;
>>>    
>>>    	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>>>    	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
>>>    
>>> -	u16 rgb_565 = le16_to_cpu(*pixels);
>>> +	u16 rgb_565 = le16_to_cpu(*pixel);
>>>    	s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
>>>    	s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
>>>    	s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
>>> @@ -169,12 +169,12 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>>>    
>>>    /*
>>>     * The following functions take one argb_u16 pixel and convert it to a specific format. The
>>> - * result is stored in @dst_pixels.
>>> + * result is stored in @out_pixel.
>>>     *
>>>     * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
>>>     * the writeback buffer.
>>>     */
>>> -static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>> +static void argb_u16_to_ARGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>>>    {
>>>    	/*
>>>    	 * This sequence below is important because the format's byte order is
>>> @@ -186,43 +186,43 @@ static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel
>>>    	 * | Addr + 2 | = Red channel
>>>    	 * | Addr + 3 | = Alpha channel
>>>    	 */
>>> -	dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
>>> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
>>> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
>>> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>>> +	out_pixel[3] = DIV_ROUND_CLOSEST(in_pixel->a, 257);
>>> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
>>> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
>>> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>>>    }
>>>    
>>> -static void argb_u16_to_XRGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>> +static void argb_u16_to_XRGB8888(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>>>    {
>>> -	dst_pixels[3] = 0xff;
>>> -	dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
>>> -	dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
>>> -	dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>>> +	out_pixel[3] = 0xff;
>>> +	out_pixel[2] = DIV_ROUND_CLOSEST(in_pixel->r, 257);
>>> +	out_pixel[1] = DIV_ROUND_CLOSEST(in_pixel->g, 257);
>>> +	out_pixel[0] = DIV_ROUND_CLOSEST(in_pixel->b, 257);
>>>    }
>>>    
>>> -static void argb_u16_to_ARGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>> +static void argb_u16_to_ARGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)dst_pixels;
>>> +	u16 *pixel = (u16 *)out_pixel;
>>>    
>>> -	pixels[3] = cpu_to_le16(in_pixel->a);
>>> -	pixels[2] = cpu_to_le16(in_pixel->r);
>>> -	pixels[1] = cpu_to_le16(in_pixel->g);
>>> -	pixels[0] = cpu_to_le16(in_pixel->b);
>>> +	pixel[3] = cpu_to_le16(in_pixel->a);
>>> +	pixel[2] = cpu_to_le16(in_pixel->r);
>>> +	pixel[1] = cpu_to_le16(in_pixel->g);
>>> +	pixel[0] = cpu_to_le16(in_pixel->b);
>>>    }
>>>    
>>> -static void argb_u16_to_XRGB16161616(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>> +static void argb_u16_to_XRGB16161616(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)dst_pixels;
>>> +	u16 *pixel = (u16 *)out_pixel;
>>>    
>>> -	pixels[3] = 0xffff;
>>> -	pixels[2] = cpu_to_le16(in_pixel->r);
>>> -	pixels[1] = cpu_to_le16(in_pixel->g);
>>> -	pixels[0] = cpu_to_le16(in_pixel->b);
>>> +	pixel[3] = 0xffff;
>>> +	pixel[2] = cpu_to_le16(in_pixel->r);
>>> +	pixel[1] = cpu_to_le16(in_pixel->g);
>>> +	pixel[0] = cpu_to_le16(in_pixel->b);
>>>    }
>>>    
>>> -static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>> +static void argb_u16_to_RGB565(u8 *out_pixel, struct pixel_argb_u16 *in_pixel)
>>>    {
>>> -	u16 *pixels = (u16 *)dst_pixels;
>>> +	u16 *pixel = (u16 *)out_pixel;
>>>    
>>>    	s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
>>>    	s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
>>> @@ -235,7 +235,7 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
>>>    	u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
>>>    	u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
>>>    
>>> -	*pixels = cpu_to_le16(r << 11 | g << 5 | b);
>>> +	*pixel = cpu_to_le16(r << 11 | g << 5 | b);
>>>    }
>>>    
>>>    /**
>>> @@ -266,7 +266,7 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
>>>     *
>>>     * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>>>     */
>>> -void *get_pixel_conversion_function(u32 format)
>>> +pixel_read_t get_pixel_read_function(u32 format)
>>>    {
>>>    	switch (format) {
>>>    	case DRM_FORMAT_ARGB8888:
>>> @@ -280,7 +280,16 @@ void *get_pixel_conversion_function(u32 format)
>>>    	case DRM_FORMAT_RGB565:
>>>    		return &RGB565_to_argb_u16;
>>>    	default:
>>> -		return NULL;
>>> +		/*
>>> +		 * This is a bug in vkms_plane_atomic_check. All the supported
>>
>> s/vkms_plane_atomic_check/vkms_plane_atomic_check()
> 
> Fixed in v6.
> 
> Thanks,
> Louis Chauvet
> 
>> Best Regards,
>> - Maíra
>>
>>> +		 * format must:
>>> +		 * - Be listed in vkms_formats in vkms_plane.c
>>> +		 * - Have a pixel_read callback defined here
>>> +		 */
>>> +		WARN(true,
>>> +		     "Pixel format %p4cc is not supported by VKMS planes. This is a kernel bug, atomic check must forbid this configuration.\n",
>>> +		     &format);
>>> +		return (pixel_read_t)NULL;
>>>    	}
>>>    }
>>>    
>>> @@ -291,7 +300,7 @@ void *get_pixel_conversion_function(u32 format)
>>>     *
>>>     * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
>>>     */
>>> -void *get_pixel_write_function(u32 format)
>>> +pixel_write_t get_pixel_write_function(u32 format)
>>>    {
>>>    	switch (format) {
>>>    	case DRM_FORMAT_ARGB8888:
>>> @@ -305,6 +314,15 @@ void *get_pixel_write_function(u32 format)
>>>    	case DRM_FORMAT_RGB565:
>>>    		return &argb_u16_to_RGB565;
>>>    	default:
>>> -		return NULL;
>>> +		/*
>>> +		 * This is a bug in vkms_writeback_atomic_check. All the supported
>>> +		 * format must:
>>> +		 * - Be listed in vkms_wb_formats in vkms_writeback.c
>>> +		 * - Have a pixel_write callback defined here
>>> +		 */
>>> +		WARN(true,
>>> +		     "Pixel format %p4cc is not supported by VKMS writeback. This is a kernel bug, atomic check must forbid this configuration.\n",
>>> +		     &format);
>>> +		return (pixel_write_t)NULL;
>>>    	}
>>>    }
>>> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
>>> index cf59c2ed8e9a..3ecea4563254 100644
>>> --- a/drivers/gpu/drm/vkms/vkms_formats.h
>>> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
>>> @@ -5,8 +5,8 @@
>>>    
>>>    #include "vkms_drv.h"
>>>    
>>> -void *get_pixel_conversion_function(u32 format);
>>> +pixel_read_t get_pixel_read_function(u32 format);
>>>    
>>> -void *get_pixel_write_function(u32 format);
>>> +pixel_write_t get_pixel_write_function(u32 format);
>>>    
>>>    #endif /* _VKMS_FORMATS_H_ */
>>> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
>>> index 21b5adfb44aa..10e9b23dab28 100644
>>> --- a/drivers/gpu/drm/vkms/vkms_plane.c
>>> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
>>> @@ -125,7 +125,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>>>    	drm_rect_rotate(&frame_info->rotated, drm_rect_width(&frame_info->rotated),
>>>    			drm_rect_height(&frame_info->rotated), frame_info->rotation);
>>>    
>>> -	vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
>>> +	vkms_plane_state->pixel_read = get_pixel_read_function(fmt);
>>>    }
>>>    
>>>    static int vkms_plane_atomic_check(struct drm_plane *plane,
>>>
> 


* Re: [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions
  2024-03-25 14:34   ` Maíra Canal
  2024-03-26 15:57     ` Louis Chauvet
@ 2024-03-28 13:26     ` Pekka Paalanen
  1 sibling, 0 replies; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-28 13:26 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Louis Chauvet, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Mon, 25 Mar 2024 11:34:17 -0300
Maíra Canal <mcanal@igalia.com> wrote:

> On 3/13/24 14:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Create KUnit tests to test the conversion between YUV and RGB. Test each
> > conversion and range combination with some common colors.
> > 
> > The code used to compute the expected result can be found in comment.
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - fix minor formating issues (whitespace, double line)
> > - change expected alpha from 0x0000 to 0xffff
> > - adapt to the new get_conversion_matrix usage
> > - apply the changes from Arthur
> > - move struct pixel_yuv_u8 to the test itself]  
> 
> Again, a Co-developed-by tag might be more proper.
> 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/Kconfig                  |  15 ++
> >   drivers/gpu/drm/vkms/Makefile                 |   1 +
> >   drivers/gpu/drm/vkms/tests/.kunitconfig       |   4 +
> >   drivers/gpu/drm/vkms/tests/Makefile           |   3 +
> >   drivers/gpu/drm/vkms/tests/vkms_format_test.c | 230 ++++++++++++++++++++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.c           |   7 +-
> >   drivers/gpu/drm/vkms/vkms_formats.h           |   4 +
> >   7 files changed, 262 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
> > index b9ecdebecb0b..9b0e1940c14f 100644
> > --- a/drivers/gpu/drm/vkms/Kconfig
> > +++ b/drivers/gpu/drm/vkms/Kconfig
> > @@ -13,3 +13,18 @@ config DRM_VKMS
> >   	  a VKMS.
> >   
> >   	  If M is selected the module will be called vkms.
> > +
> > +config DRM_VKMS_KUNIT_TESTS
> > +	tristate "Tests for VKMS" if !KUNIT_ALL_TESTS  
> 
> "KUnit tests for VKMS"
> 
> > +	depends on DRM_VKMS && KUNIT
> > +	default KUNIT_ALL_TESTS
> > +	help
> > +	  This builds unit tests for VKMS. This option is not useful for
> > +	  distributions or general kernels, but only for kernel
> > +	  developers working on VKMS.
> > +
> > +	  For more information on KUnit and unit tests in general,
> > +	  please refer to the KUnit documentation in
> > +	  Documentation/dev-tools/kunit/.
> > +
> > +	  If in doubt, say "N".
> > \ No newline at end of file
> > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > index 1b28a6a32948..8d3e46dde635 100644
> > --- a/drivers/gpu/drm/vkms/Makefile
> > +++ b/drivers/gpu/drm/vkms/Makefile
> > @@ -9,3 +9,4 @@ vkms-y := \
> >   	vkms_writeback.o
> >   
> >   obj-$(CONFIG_DRM_VKMS) += vkms.o
> > +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += tests/
> > diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig b/drivers/gpu/drm/vkms/tests/.kunitconfig
> > new file mode 100644
> > index 000000000000..70e378228cbd
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/.kunitconfig
> > @@ -0,0 +1,4 @@
> > +CONFIG_KUNIT=y
> > +CONFIG_DRM=y
> > +CONFIG_DRM_VKMS=y
> > +CONFIG_DRM_VKMS_KUNIT_TESTS=y
> > diff --git a/drivers/gpu/drm/vkms/tests/Makefile b/drivers/gpu/drm/vkms/tests/Makefile
> > new file mode 100644
> > index 000000000000..2d1df668569e
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +obj-$(CONFIG_DRM_VKMS_KUNIT_TESTS) += vkms_format_test.o
> > diff --git a/drivers/gpu/drm/vkms/tests/vkms_format_test.c b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
> > new file mode 100644
> > index 000000000000..0954d606e44a
> > --- /dev/null
> > +++ b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
> > @@ -0,0 +1,230 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +
> > +#include <kunit/test.h>
> > +
> > +#include <drm/drm_fixed.h>
> > +#include <drm/drm_fourcc.h>
> > +#include <drm/drm_print.h>
> > +
> > +#include "../../drm_crtc_internal.h"
> > +
> > +#include "../vkms_drv.h"
> > +#include "../vkms_formats.h"
> > +
> > +#define TEST_BUFF_SIZE 50
> > +
> > +struct pixel_yuv_u8 {
> > +	u8 y, u, v;
> > +};
> > +
> > +struct yuv_u8_to_argb_u16_case {
> > +	enum drm_color_encoding encoding;
> > +	enum drm_color_range range;
> > +	size_t n_colors;
> > +	struct format_pair {
> > +		char *name;
> > +		struct pixel_yuv_u8 yuv;
> > +		struct pixel_argb_u16 argb;
> > +	} colors[TEST_BUFF_SIZE];
> > +};
> > +
> > +/*
> > + * The YUV color representation were acquired via the colour python framework.
> > + * Below are the function calls used for generating each case.
> > + *
> > + * for more information got to the docs:  
> 
> s/for/For
> 
> > + * https://colour.readthedocs.io/en/master/generated/colour.RGB_to_YCbCr.html
> > + */
> > +static struct yuv_u8_to_argb_u16_case yuv_u8_to_argb_u16_cases[] = {
> > +	/*
> > +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> > +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                     in_bits = 16,
> > +	 *                     in_legal = False,
> > +	 *                     in_int = True,
> > +	 *                     out_bits = 8,
> > +	 *                     out_legal = False,
> > +	 *                     out_int = True)
> > +	 */  
> 
> I feel that this Python code is kind of poluting the test cases.

I asked for the python code.

How would you verify that the expected values are actually correct
without these comments?

You cannot trust that the test values are good if the tests pass. Both
the test values and the tested code could be wrong simultaneously and
agree on wrong results.


I love these comments that explicitly give the python snippets used to
generate the test values. Well done!

Louis' suggestion of collecting the common python bits together is
fine, too. As long as the comments clearly explain what python commands
produced the test values, I'm happy.
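
As an illustration of checking the table independently of both the
driver and the Python library, here is a minimal userspace sketch that
recomputes the BT.601 full-range entries. It is not part of the patch:
the names rgb8_to_ycbcr_bt601_full() and check_entry() are hypothetical,
it takes 8-bit RGB instead of the 16-bit input used in the comments, and
it assumes round-to-nearest with clamping to [0, 255].

```c
#include <assert.h>
#include <stdint.h>

struct yuv8 {
	uint8_t y, cb, cr;
};

/* Round to nearest and clamp to the 8-bit range (assumed rounding mode). */
static uint8_t clamp_round_u8(double v)
{
	if (v < 0.0)
		return 0;
	if (v > 255.0)
		return 255;
	return (uint8_t)(v + 0.5);
}

/* Full-range BT.601 RGB -> YCbCr, with Kr = 0.299 and Kb = 0.114. */
static struct yuv8 rgb8_to_ycbcr_bt601_full(uint8_t r, uint8_t g, uint8_t b)
{
	const double kr = 0.299, kb = 0.114, kg = 1.0 - kr - kb;
	double y = kr * r + kg * g + kb * b;
	struct yuv8 out;

	out.y = clamp_round_u8(y);
	out.cb = clamp_round_u8(128.0 + (b - y) / (2.0 * (1.0 - kb)));
	out.cr = clamp_round_u8(128.0 + (r - y) / (2.0 * (1.0 - kr)));
	return out;
}

/* Abort if the recomputed value disagrees with a table entry. */
static void check_entry(uint8_t r, uint8_t g, uint8_t b, struct yuv8 expected)
{
	struct yuv8 got = rgb8_to_ycbcr_bt601_full(r, g, b);

	assert(got.y == expected.y);
	assert(got.cb == expected.cb);
	assert(got.cr == expected.cr);
}
```

Feeding it the "red", "green" and "blue" rows of the first table is
enough to catch a transposed coefficient or a wrong offset.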

Anyway,

Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>


Thanks,
pq

> > +	{
> > +		.encoding = DRM_COLOR_YCBCR_BT601,
> > +		.range = DRM_COLOR_YCBCR_FULL_RANGE,
> > +		.n_colors = 6,
> > +		.colors = {
> > +			{ "white", { 0xff, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> > +			{ "gray",  { 0x80, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> > +			{ "black", { 0x00, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> > +			{ "red",   { 0x4c, 0x55, 0xff }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> > +			{ "green", { 0x96, 0x2c, 0x15 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> > +			{ "blue",  { 0x1d, 0xff, 0x6b }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> > +		},
> > +	},
> > +	/*
> > +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> > +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                     in_bits = 16,
> > +	 *                     in_legal = False,
> > +	 *                     in_int = True,
> > +	 *                     out_bits = 8,
> > +	 *                     out_legal = True,
> > +	 *                     out_int = True)
> > +	 */
> > +	{
> > +		.encoding = DRM_COLOR_YCBCR_BT601,
> > +		.range = DRM_COLOR_YCBCR_LIMITED_RANGE,
> > +		.n_colors = 6,
> > +		.colors = {
> > +			{ "white", { 0xeb, 0x80, 0x80 }, { 0xffff, 0xffff, 0xffff, 0xffff }},
> > +			{ "gray",  { 0x7e, 0x80, 0x80 }, { 0xffff, 0x8080, 0x8080, 0x8080 }},
> > +			{ "black", { 0x10, 0x80, 0x80 }, { 0xffff, 0x0000, 0x0000, 0x0000 }},
> > +			{ "red",   { 0x51, 0x5a, 0xf0 }, { 0xffff, 0xffff, 0x0000, 0x0000 }},
> > +			{ "green", { 0x91, 0x36, 0x22 }, { 0xffff, 0x0000, 0xffff, 0x0000 }},
> > +			{ "blue",  { 0x29, 0xf0, 0x6e }, { 0xffff, 0x0000, 0x0000, 0xffff }},
> > +		},
> > +	},
> > +	/*
> > +	 * colour.RGB_to_YCbCr(<rgb color in 16 bit form>,
> > +	 *                     K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> > +	 *                     in_bits = 16,
> > +	 *                     in_legal = False,
> > +	 *                     in_int = True,
> > +	 *                     out_bits = 8,
> > +	 *                     out_legal = False,
> > +	 *                     out_int = True)
> > +	 */
> > +	{



* Re: [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R*
  2024-03-13 17:45 ` [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R* Louis Chauvet
@ 2024-03-28 14:00   ` Pekka Paalanen
  2024-04-08  7:50     ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-03-28 14:00 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

On Wed, 13 Mar 2024 18:45:10 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> This add the support for:
> - R1/R2/R4/R8
> 
> R1 format was tested with [1] and [2].
> 
> [1]: https://lore.kernel.org/r/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com
> [2]: https://lore.kernel.org/igt-dev/20240306-b4-kms_tests-v1-0-8fe451efd2ac@bootlin.com/
> 
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_formats.c | 100 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_plane.c   |   6 ++-
>  2 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 863fc91d6d48..cbb2ec09564a 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -201,6 +201,11 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>  	return out_pixel;
>  }
>  
> +static struct pixel_argb_u16 argb_u16_from_gray8(u8 gray)
> +{
> +	return argb_u16_from_u8888(255, gray, gray, gray);
> +}
> +
>  VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
>  							    struct conversion_matrix *matrix)
>  {
> @@ -269,6 +274,89 @@ static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
>  	}
>  }
>  
> +static void Rx_read_line(const struct vkms_plane_state *plane, int x_start,
> +			 int y_start, enum pixel_read_direction direction, int count,
> +			 struct pixel_argb_u16 out_pixel[], u8 bit_per_pixel, u8 lum_per_level)
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);

Maybe assert that rem_y == 0? Or block_h == 1.

> +	int bit_offset = (int)rem_x * bit_per_pixel;

Why cast rem_x to int when it was defined to be int?

> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +	int mask = (0x1 << bit_per_pixel) - 1;

Since mask will interact with u8, it should be unsigned too.

> +
> +	if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
> +		int restart_bit_offset = 0;
> +		int step_bit_offset = bit_per_pixel;
> +
> +		if (direction == READ_RIGHT_TO_LEFT) {
> +			restart_bit_offset = 8 - bit_per_pixel;
> +			step_bit_offset = -bit_per_pixel;
> +		}
> +
> +		while (out_pixel < end) {
> +			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;

or shorter: (*src_pixels >> bit_offset) & mask

However, shouldn't the first pixel be on the high bits?

That is how I would understand the comments in drm_fourcc.h.

Again a reason to avoid a solid color fill in IGT.
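
To make the bit-order point concrete, here is a minimal userspace
sketch of the extraction with the first pixel in the most significant
bits. The helper names rx_pixel_msb_first() and rx_to_gray8() are
hypothetical and not from the patch, and the MSB-first layout is an
assumption based on my reading of the drm_fourcc.h comments.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return pixel @index of a byte packing 8 / bits_per_pixel pixels,
 * assuming pixel 0 sits in the most significant bits.
 */
static uint8_t rx_pixel_msb_first(uint8_t byte, unsigned int bits_per_pixel,
				  unsigned int index)
{
	unsigned int mask, shift;

	assert(bits_per_pixel == 1 || bits_per_pixel == 2 || bits_per_pixel == 4);
	assert(index < 8 / bits_per_pixel);

	/* Unsigned mask, applied after the shift (the shorter form above). */
	mask = (1u << bits_per_pixel) - 1;
	shift = 8 - bits_per_pixel * (index + 1);
	return (byte >> shift) & mask;
}

/* Scale an n-bit value to 8 bits: factors 0xFF, 0x55, 0x11 for 1, 2, 4 bpp. */
static uint8_t rx_to_gray8(uint8_t val, unsigned int bits_per_pixel)
{
	unsigned int max = (1u << bits_per_pixel) - 1;

	return (uint8_t)(val * (255u / max));
}
```

With this layout an R2 byte 0x1b holds the pixels 0, 1, 2, 3 from left
to right, so a test image with distinct neighbouring values would
catch a reversed bit order, while a solid fill would not.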

> +
> +			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
> +
> +			bit_offset += step_bit_offset;
> +			if (bit_offset < 0 || 8 <= bit_offset) {
> +				bit_offset = restart_bit_offset;
> +				src_pixels += step;
> +			}
> +			out_pixel += 1;
> +		}
> +	} else if (direction == READ_TOP_TO_BOTTOM || direction == READ_BOTTOM_TO_TOP) {
> +		while (out_pixel < end) {
> +			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;
> +			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
> +			src_pixels += step;
> +			out_pixel += 1;
> +		}
> +	}
> +}
> +
> +static void R1_read_line(const struct vkms_plane_state *plane, int x_start,
> +			 int y_start, enum pixel_read_direction direction, int count,
> +			 struct pixel_argb_u16 out_pixel[])
> +{
> +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 1, 0xFF);
> +}
> +
> +static void R2_read_line(const struct vkms_plane_state *plane, int x_start,
> +			 int y_start, enum pixel_read_direction direction, int count,
> +			 struct pixel_argb_u16 out_pixel[])
> +{
> +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 2, 0x55);
> +}
> +
> +static void R4_read_line(const struct vkms_plane_state *plane, int x_start,
> +			 int y_start, enum pixel_read_direction direction, int count,
> +			 struct pixel_argb_u16 out_pixel[])
> +{
> +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 4, 0x11);
> +}
> +
> +static void R8_read_line(const struct vkms_plane_state *plane, int x_start,
> +			 int y_start, enum pixel_read_direction direction, int count,
> +			 struct pixel_argb_u16 out_pixel[])
> +{
> +	struct pixel_argb_u16 *end = out_pixel + count;
> +	u8 *src_pixels;
> +	int rem_x, rem_y;
> +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);

Assert on block size?


> +
> +	while (out_pixel < end) {
> +		*out_pixel = argb_u16_from_gray8(*src_pixels);
> +		src_pixels += step;
> +		out_pixel += 1;
> +	}
> +}
> +
>  static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
>  			       enum pixel_read_direction direction, int count,
>  			       struct pixel_argb_u16 out_pixel[])
> @@ -582,6 +670,14 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
>  	case DRM_FORMAT_YVU422:
>  	case DRM_FORMAT_YVU444:
>  		return &planar_yuv_read_line;
> +	case DRM_FORMAT_R1:
> +		return &R1_read_line;
> +	case DRM_FORMAT_R2:
> +		return &R2_read_line;
> +	case DRM_FORMAT_R4:
> +		return &R4_read_line;
> +	case DRM_FORMAT_R8:
> +		return &R8_read_line;
>  	default:
>  		/*
>  		 * This is a bug in vkms_plane_atomic_check. All the supported
> @@ -855,6 +951,10 @@ get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
>  	case DRM_FORMAT_ARGB16161616:
>  	case DRM_FORMAT_XRGB16161616:
>  	case DRM_FORMAT_RGB565:
> +	case DRM_FORMAT_R1:
> +	case DRM_FORMAT_R2:
> +	case DRM_FORMAT_R4:
> +	case DRM_FORMAT_R8:
>  		/*
>  		 * Those formats are supported, but they don't need a conversion matrix. Return

It is strange that you need to list irrelevant formats here.


>  		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index e21cc92cf497..dc9d62acf350 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -29,7 +29,11 @@ static const u32 vkms_formats[] = {
>  	DRM_FORMAT_YUV444,
>  	DRM_FORMAT_YVU420,
>  	DRM_FORMAT_YVU422,
> -	DRM_FORMAT_YVU444
> +	DRM_FORMAT_YVU444,
> +	DRM_FORMAT_R1,
> +	DRM_FORMAT_R2,
> +	DRM_FORMAT_R4,
> +	DRM_FORMAT_R8
>  };
>  
>  static struct drm_plane_state *
> 

Thanks,
pq


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-03-27 12:16       ` Pekka Paalanen
@ 2024-04-08  7:50         ` Louis Chauvet
  2024-04-09  7:35           ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 27/03/24 - 14:16, Pekka Paalanen a écrit :
> On Tue, 26 Mar 2024 16:57:00 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 25/03/24 - 15:11, Pekka Paalanen a écrit :
> > > On Wed, 13 Mar 2024 18:45:03 +0100
> > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >   
> > > > The pixel_read_direction enum is useful to describe the reading direction
> > > > in a plane. It avoids using the rotation property of DRM, which not
> > > > practical to know the direction of reading.
> > > > This patch also introduce two helpers, one to compute the
> > > > pixel_read_direction from the DRM rotation property, and one to compute
> > > > the step, in byte, between two successive pixel in a specific direction.
> > > > 
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> > > >  3 files changed, 77 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > index 9254086f23ff..989bcf59f375 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > @@ -159,6 +159,42 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
> > > >  	}
> > > >  }
> > > >  
> > > > +/**
> > > > + * direction_for_rotation() - Get the correct reading direction for a given rotation
> > > > + *
> > > > + * This function will use the @rotation setting of a source plane to compute the reading
> > > > + * direction in this plane which correspond to a "left to right writing" in the CRTC.
> > > > + * For example, if the buffer is reflected on X axis, the pixel must be read from right to left
> > > > + * to be written from left to right on the CRTC.  
> > > 
> > > That is a well written description.  
> > 
> > Thanks
> >  
> > > > + *
> > > > + * @rotation: Rotation to analyze. It correspond the field @frame_info.rotation.
> > > > + */
> > > > +static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
> > > > +{
> > > > +	if (rotation & DRM_MODE_ROTATE_0) {
> > > > +		if (rotation & DRM_MODE_REFLECT_X)
> > > > +			return READ_RIGHT_TO_LEFT;
> > > > +		else
> > > > +			return READ_LEFT_TO_RIGHT;
> > > > +	} else if (rotation & DRM_MODE_ROTATE_90) {
> > > > +		if (rotation & DRM_MODE_REFLECT_Y)
> > > > +			return READ_BOTTOM_TO_TOP;
> > > > +		else
> > > > +			return READ_TOP_TO_BOTTOM;
> > > > +	} else if (rotation & DRM_MODE_ROTATE_180) {
> > > > +		if (rotation & DRM_MODE_REFLECT_X)
> > > > +			return READ_LEFT_TO_RIGHT;
> > > > +		else
> > > > +			return READ_RIGHT_TO_LEFT;
> > > > +	} else if (rotation & DRM_MODE_ROTATE_270) {
> > > > +		if (rotation & DRM_MODE_REFLECT_Y)
> > > > +			return READ_TOP_TO_BOTTOM;
> > > > +		else
> > > > +			return READ_BOTTOM_TO_TOP;
> > > > +	}
> > > > +	return READ_LEFT_TO_RIGHT;  
> > > 
> > > I'm a little worried seeing REFLECT_X is supported only for some
> > > rotations, and REFLECT_Y for other rotations. Why is an analysis of all
> > > combinations not necessary?  
> > 
> > I don't need to manage all the combination because this is only about 
> > the "horizontal writing".
> > 
> > So, if you want to write a line in the CRTC, with:
> > - ROT_0 || REF_X => You need to read the source line from right to left
> > - ROT_0 => You need to read source buffer from left to right
> > - ROT_0 || REF_Y => You need to read the source line from left to right
> 
> That is true, indeed.
> 
> > In this case, REF_Y only have an effect on the "column reading". It is not 
> > needed here because the new version of the blend function will use the 
> > drm_rect_* helpers to compute the correct y coordinate.
> > 
> > If you think it's clearer, I can create a big switch(rotation) like this:
> > 
> > 	switch (rotation) {
> > 	case ROT_0:
> > 	case ROT_0 || REF_X:
> > 		return L2R;
> > 	case ROT_0 || REF_Y:
> > 		return R2L;
> > 	case ROT_90:
> > 	case ROT_90 || REF_X:
> > 		return T2B;
> > 	[...]
> > 	}
> > 
> > So all cases are clearly covered?
> 
> I think that would suit my personal taste better. It would not raise
> questions nor need a comment. It does become a long function, but I
> tend to favour long and clear more than short and needs thinking to
> figure out if it works, everything else being equivalent.

I will change it to a switch statement.
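
A minimal sketch of what such a switch could look like, written as
standalone userspace C so it can be compiled on its own. The ROT_* and
REF_* macros are local stand-ins assumed to use the same bit positions
as DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* in drm_mode.h, and the enum
order is arbitrary. The return values only restate the if/else chain
from the patch; the sketch makes no claim that the 90/270 plus
reflection cases are right.

```c
#include <assert.h>

/* Local stand-ins for the DRM_MODE_ROTATE_* / DRM_MODE_REFLECT_* bits. */
#define ROT_0   (1u << 0)
#define ROT_90  (1u << 1)
#define ROT_180 (1u << 2)
#define ROT_270 (1u << 3)
#define REF_X   (1u << 4)
#define REF_Y   (1u << 5)

enum pixel_read_direction {
	READ_BOTTOM_TO_TOP,
	READ_TOP_TO_BOTTOM,
	READ_RIGHT_TO_LEFT,
	READ_LEFT_TO_RIGHT,
};

static enum pixel_read_direction direction_for_rotation(unsigned int rotation)
{
	/* Only the six known bits may be set. */
	assert((rotation & ~(ROT_0 | ROT_90 | ROT_180 | ROT_270 | REF_X | REF_Y)) == 0);

	switch (rotation) {
	case ROT_0:
	case ROT_0 | REF_Y:
		return READ_LEFT_TO_RIGHT;
	case ROT_0 | REF_X:
	case ROT_0 | REF_X | REF_Y:
		return READ_RIGHT_TO_LEFT;
	case ROT_90:
	case ROT_90 | REF_X:
		return READ_TOP_TO_BOTTOM;
	case ROT_90 | REF_Y:
	case ROT_90 | REF_X | REF_Y:
		return READ_BOTTOM_TO_TOP;
	case ROT_180:
	case ROT_180 | REF_Y:
		return READ_RIGHT_TO_LEFT;
	case ROT_180 | REF_X:
	case ROT_180 | REF_X | REF_Y:
		return READ_LEFT_TO_RIGHT;
	case ROT_270:
	case ROT_270 | REF_X:
		return READ_BOTTOM_TO_TOP;
	case ROT_270 | REF_Y:
	case ROT_270 | REF_X | REF_Y:
		return READ_TOP_TO_BOTTOM;
	default:
		/* Same fallback as the last return in the patch. */
		return READ_LEFT_TO_RIGHT;
	}
}
```

Listing every combination makes the function longer, but each case can
be checked against the documentation one line at a time.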
 
> I wonder how DRM maintainers feel.
> 
> > > I hope IGT uses FB patterns instead of solid color in its tests of
> > > rotation to be able to detect the difference.  
> > 
> > They use solid colors, and even my new rotation test [3] use solid colors.
> 
> That will completely fail to detect rotation and reflection bugs then.
> E.g. userspace asks for 180-degree rotation, and the driver does not
> rotate at all. Or rotate-180 getting confused with one reflection.

I think I misunderstood what you mean by "solid colors".

The tests use a plane with multiple solid colors:

+-------+-------+
| White | Red   |
+-------+-------+
| Blue  | Green |
+-------+-------+

But they don't use gradients because of YUV.
 
> > It is mainly for yuv formats with subsampling: if you have formats with 
> > subsampling, a "software rotated buffer" and a "hardware rotated buffer" 
> > will not apply the same subsampling, so the colors will be slightly 
> > different.
> 
> Why would they not use the same subsampling?

With YUV422, for each pair of pixels along a horizontal line, the U and V 
components are shared between those two pixels. However, along a vertical 
line, each pixel has its own U and V components.

When you rotate an image by 90 degrees:
 - Hardware Rotation: If you use hardware rotation, the YUV subsampling 
   axis will align with what was previously the "White-Red" axis. The 
   hardware will handle the rotation.
 - Software Rotation: If you use software rotation, the YUV subsampling 
   axis will align with what was previously the "Red-Green" axis.

Because the subsampling compression axis changes depending on whether 
you're using hardware or software rotation, the compression effect on 
colors will differ. Specifically:
 - Hardware rotation: a gradient along the "White-Red" axis may be 
   compressed (i.e. the same UV components for multiple pixels along the 
   gradient).
 - Software rotation: the same gradient will not be compressed (i.e. each 
   different color in the gradient has dedicated UV components).

The same reasoning also applies to "color borders", and my series [3] avoids 
this issue by choosing the right number of pixels.
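
To illustrate the effect with numbers, here is a minimal userspace
sketch of horizontal 4:2:2 chroma subsampling on a single row. The
function name subsample_422_row() is hypothetical, and averaging the
two samples of a pair is an assumption of this sketch (an encoder could
just as well keep only one of them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Reduce one row of full-resolution chroma samples to 4:2:2: every pair
 * of horizontal neighbours ends up sharing a single sample. @out must
 * have room for width / 2 samples.
 */
static void subsample_422_row(const uint8_t *chroma, size_t width, uint8_t *out)
{
	assert(width % 2 == 0);

	for (size_t x = 0; x < width; x += 2)
		out[x / 2] = (uint8_t)((chroma[x] + chroma[x + 1] + 1) / 2);
}
```

A color change inside a horizontal pair (0, 255) collapses to one
averaged sample, whereas the same change between two rows, (0, 0) above
(255, 255), survives untouched. Which of the two situations a test
image ends up in depends on whether the rotation is applied before or
after the subsampling.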

> The framebuffer contents are defined in its natural orientation, and
> the subsampling applies in the natural orientation. If such a FB
> is on a rotated plane, one must account for subsampling first, and
> rotate second. 90-degree rotation does not change the encoded color.
> 
> Getting the subsampling exactly right is going to be necessary sooner
> or later. There is no UAPI for setting chroma siting yet, but ideally
> there should be.
> 
> > > The return values do seem correct to me, assuming I have guessed
> > > correctly what "X" and "Y" refer to when combined with rotation. I did
> > > not find good documentation about that.  
> > 
> > Yes, it is difficult to understand how rotation and reflexion should 
> > works in drm. I spend half a day testing all the combination in drm_rect_* 
> > helpers to understand how this works. According to the code:
> > - If only rotation or only reflexion, easy as expected
> > - If reflexion and rotation are mixed, the source buffer is first 
> >   reflected and then rotated.
> 
> Now that you know, you could send a documentation patch. :-)

And now I'm not sure about it :)

I was running the tests on my v6, and for the first time ran my new 
rotation [3] on the previous VKMS code. None of the tests for 
ROT_90+reflexion and ROT_270+reflexion are passing...

So, either the previous vkms implementation was wrong, or mine is wrong :)

So, if a DRM expert could explain this, it would be nice.

To have a common example, if I take the same buffer as above 
(white+red+blue+green), if I create a plane with rotation = 
ROTATION_90 | REFLECTION_X, what is the expected result?

1 - rotation then reflection 

+-------+-------+
| Green | Red   |
+-------+-------+
| Blue  | White |
+-------+-------+

2 - reflection then rotation (my vkms implementation)

+-------+-------+
| White | Blue  |
+-------+-------+
| Red   | Green |
+-------+-------+
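
The two candidate results can be reproduced with a small sketch. It assumes 
that ROTATE_90 means a 90-degree counter-clockwise rotation and that 
REFLECT_X mirrors left and right. Both conventions, and all names below, are 
assumptions for illustration. The sketch does not say which order DRM 
expects.

```c
#define N 2

enum color { WHITE, RED, BLUE, GREEN };

/* Rotate by 90 degrees counter-clockwise (assumed meaning of ROTATE_90). */
static void rotate_90(enum color in[N][N], enum color out[N][N])
{
	for (int r = 0; r < N; r++)
		for (int c = 0; c < N; c++)
			out[r][c] = in[c][N - 1 - r];
}

/* Mirror left and right (assumed meaning of REFLECT_X). */
static void reflect_x(enum color in[N][N], enum color out[N][N])
{
	for (int r = 0; r < N; r++)
		for (int c = 0; c < N; c++)
			out[r][c] = in[r][N - 1 - c];
}

/* Candidate 1: rotation first, reflection second. */
static void rotate_then_reflect(enum color in[N][N], enum color out[N][N])
{
	enum color tmp[N][N];

	rotate_90(in, tmp);
	reflect_x(tmp, out);
}

/* Candidate 2: reflection first, rotation second. */
static void reflect_then_rotate(enum color in[N][N], enum color out[N][N])
{
	enum color tmp[N][N];

	reflect_x(in, tmp);
	rotate_90(tmp, out);
}
```

Under these assumptions, the first function yields Green/Red over Blue/White 
and the second yields White/Blue over Red/Green for the White/Red over 
Blue/Green buffer.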

Thanks a lot,
Louis Chauvet

> For me as a userspace developer, the important place is
> https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties
> 
> >  
> > > Btw. if there are already functions that are able to transform
> > > coordinates based on the rotation bitfield, you could alternatively use
> > > them. Transform CRTC point (0, 0) to A, and (1, 0) to B. Now A and B
> > > are in plane coordinate system, and vector B - A gives you the
> > > direction. The reason I'm mentioning this is that then you don't have
> > > to implement yet another copy of the rotation bitfield semantics from
> > > scratch.  
> > 
> > You are totaly right. I will try this elegant method. Yes, there are some 
> > helpers (drm_rect_rotate_inv), so I will try to do something.

It works fine, although it is a bit more tricky to get the implementation 
right. It will be in v6.
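
The idea (transform two CRTC points and look at the difference) can be 
sketched as follows. This is not the v6 code and it does not call 
drm_rect_rotate_inv(): dst_to_src() is a hypothetical stand-in that only 
handles counter-clockwise rotations of a square n x n buffer, without 
reflection.

```c
enum read_direction {
	LEFT_TO_RIGHT,
	RIGHT_TO_LEFT,
	TOP_TO_BOTTOM,
	BOTTOM_TO_TOP,
};

struct point {
	int x, y;
};

/* Hypothetical helper: map a destination point back to the source buffer. */
static struct point dst_to_src(struct point p, int rotation_deg, int n)
{
	struct point s = p;

	switch (rotation_deg) {
	case 90:
		s.x = n - 1 - p.y;
		s.y = p.x;
		break;
	case 180:
		s.x = n - 1 - p.x;
		s.y = n - 1 - p.y;
		break;
	case 270:
		s.x = p.y;
		s.y = n - 1 - p.x;
		break;
	default:
		break;
	}
	return s;
}

/* Map (0, 0) to A and (1, 0) to B; the vector B - A is the read direction. */
static enum read_direction direction_for(int rotation_deg, int n)
{
	struct point a = dst_to_src((struct point){ 0, 0 }, rotation_deg, n);
	struct point b = dst_to_src((struct point){ 1, 0 }, rotation_deg, n);
	int dx = b.x - a.x;
	int dy = b.y - a.y;

	if (dx > 0)
		return LEFT_TO_RIGHT;
	if (dx < 0)
		return RIGHT_TO_LEFT;
	return dy > 0 ? TOP_TO_BOTTOM : BOTTOM_TO_TOP;
}
```

The benefit is that the direction is derived from the coordinate transform 
itself, so the rotation semantics live in only one place.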
 
> Cool, thanks,
> pq
> 
   
[...]


-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm
  2024-03-27 12:29       ` Pekka Paalanen
@ 2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 27/03/24 - 14:29, Pekka Paalanen a écrit :
> On Tue, 26 Mar 2024 16:57:02 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > [...]
> > 
> > > > @@ -215,34 +188,146 @@ static void blend(struct vkms_writeback_job *wb,
> > > >  {
> > > >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> > > >  	u32 n_active_planes = crtc_state->num_active_planes;
> > > > -	int y_pos, x_dst, x_limit;
> > > >  
> > > >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> > > >  
> > > > -	size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > > > +	int crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> > > > +	int crtc_x_limit = crtc_state->base.crtc->mode.hdisplay;
> > > >  
> > > >  	/*
> > > >  	 * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> > > >  	 * complexity to avoid poor blending performance.
> > > >  	 *
> > > > -	 * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > > > -	 * buffer.
> > > > +	 * The function pixel_read_line callback is used to read a line, using an efficient
> > > > +	 * algorithm for a specific format, into the staging buffer.
> > > >  	 */
> > > >  	for (size_t y = 0; y < crtc_y_limit; y++) {
> > > >  		fill_background(&background_color, output_buffer);
> > > >  
> > > >  		/* The active planes are composed associatively in z-order. */
> > > >  		for (size_t i = 0; i < n_active_planes; i++) {
> > > > -			x_dst = plane[i]->frame_info->dst.x1;
> > > > -			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > > > -					stage_buffer->n_pixels);
> > > > -			y_pos = get_y_pos(plane[i]->frame_info, y);
> > > > +			struct vkms_plane_state *current_plane = plane[i];
> > > >  
> > > > -			if (!check_limit(plane[i]->frame_info, y_pos))
> > > > +			/* Avoid rendering useless lines */
> > > > +			if (y < current_plane->frame_info->dst.y1 ||
> > > > +			    y >= current_plane->frame_info->dst.y2)
> > > >  				continue;
> > > >  
> > > > -			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > > > -			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);
> > > > +			/*
> > > > +			 * dst_line is the line to copy. The initial coordinates are inside the  
> > 
> > [...]
> > 
> > > > +				 */
> > > > +				pixel_count = drm_rect_width(&src_line);
> > > > +				if (x_start < 0) {
> > > > +					pixel_count += x_start;
> > > > +					x_start = 0;
> > > > +				}
> > > > +				if (x_start + pixel_count > current_plane->frame_info->fb->width) {
> > > > +					pixel_count =
> > > > +						(int)current_plane->frame_info->fb->width - x_start;
> > > > +				}
> > > > +			} else {
> > > > +				/*
> > > > +				 * In vertical reading, the src_line height is the number of pixel
> > > > +				 * to read
> > > > +				 */
> > > > +				pixel_count = drm_rect_height(&src_line);
> > > > +				if (y_start < 0) {
> > > > +					pixel_count += y_start;
> > > > +					y_start = 0;
> > > > +				}
> > > > +				if (y_start + pixel_count > current_plane->frame_info->fb->height) {
> > > > +					pixel_count =
> > > > +						(int)current_plane->frame_info->fb->height - y_start;
> > > > +				}  
> > > 
> > > When you are clamping x_start or y_start or pixel_count to be inside
> > > the source FB, should you not equally adjust the destination
> > > coordinates as well?  
> > 
> > I did not think about it. Currently it is not an issue and it will not 
> > read or write outside a buffer because the pixel count is properly 
> > limited. But indeed, there is an issue here. I will fix it in the v6.
> >  
> > > If we take a step back and look at the UAPI, I believe the answer is
> > > "no", but it's in no way obvious. It results from the combination of
> > > several facts:
> > > 
> > > - UAPI checks reject any source rectangle that extends outside of the
> > >   source FB.
> > > 
> > > - The source rectangle stretches to fill the destination rectangle
> > >   exactly.
> > > 
> > > - VKMS does not support stretching (scaling), so its UAPI checks reject
> > >   any commit with source and destination rectangles of different sizes
> > >   after accounting for rotation. (Right?)  
> > 
> > I don't know what are exactly the UAPI contract but as the dst can be 
> > outside the CRTC, I assumed that the src can be outside the source plane. 
> > After thinking it doesn't really make sense.
> 
> The UAPI contract for source and destination rectangles is here:
> https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties
> 
> I assume there is some shared (helper?) code in DRM that enforces the
> contract and returns error to userspace if it is violated.
> 
> > > I think this results in the clamping code being actually dead code.
> > > However, I would not delete the clamping code, because it is a cheap
> > > safety net in case something goes wrong.  
> > 
> > If UAPI check that the source rectangle is inside the plane, yes it is 
> > just a safety net. Otherwise, it is required to manage properly the 
> > userspace requests. In the v6, the outside of a source buffer is 
> > transparent.
> > 
> > > If you agree that it's just a safety net, then maybe explain that in a
> > > comment? If the safety net catches anything, the composition result
> > > will be wrong anyway, so it doesn't matter to adjust the destination
> > > rectangle to match.  
> > 
> > I will extract this whole clamping stuff in a function, is this comment 
> > enough?
> > 
> >  * This function is mainly a safety net to avoid reading outside the source buffer. As the
> >  * userspace should never ask to read outside the source plane, all the cases covered here should
> >  * be dead code.
> 
> Sure. Perhaps use a bit more assertive tone about what the UAPI
> contract guarantees. Yes, userspace "should not", but the kernel DRM
> code ensures that it does not.

 * This function is mainly a safety net to avoid reading outside the source buffer. As the
 * userspace can't ask to read outside the source plane, all the cases covered here should
 * be dead code.
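
As a rough sketch of what such an extracted helper could look like (the name 
clamp_read and its signature are hypothetical, not the v6 code), the same 
logic can serve both the horizontal and the vertical case by passing either 
the framebuffer width or height as the size:

```c
/*
 * Clamp a read of `count` pixels starting at `*start` to the range [0, size).
 *
 * Returns the number of pixels that can safely be read (0 if none) and
 * updates *start so that it is never negative. It is assumed that
 * *start + count does not overflow an int.
 */
static int clamp_read(int *start, int count, int size)
{
	if (*start < 0) {
		/* Drop the pixels located before the buffer. */
		count += *start;
		*start = 0;
	}
	if (*start + count > size) {
		/* Drop the pixels located after the buffer. */
		count = size - *start;
	}
	return count > 0 ? count : 0;
}
```

For example, a read of 5 pixels starting at -2 in a 10-pixel line becomes a 
read of 3 pixels starting at 0.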
 
> > > When the last point is relaxed and VKMS gains scaling support, I think
> > > it won't change the fact that the clamping remains as a safety net. It
> > > just increases the risk of bugs that would be caught by the net.
> > > 
> > > Going outside of FB boundaries is a serious bug and deserves to be
> > > checked. Going outside of the source rectangle would be a bug too,
> > > assuming that partially included pixels are considered fully included,
> > > but it's not serious enough to warrant explicit checks. Ideally IGT
> > > would catch it.  
> > 
> > That was exactly the idea behind all those check and clamping: avoid 
> > access outside the buffers.
> 
> Good.
> 
> To catch a driver using pixels outside of a source rectangle, the test
> FB in IGT should be painted to have a different non-black color outside
> of the source rectangle.

The IGT test kms_plane already does that, you can use an option to ask for 
a border around the framebuffer.

Thanks,
Louis Chauvet

> > > > +			}
> > > > +
> > > > +			if (pixel_count <= 0) {
> > > > +				/* Nothing to read, so avoid multiple function calls for nothing */
> > > > +				continue;
> > > > +			}
> > > > +
> > > > +			/*
> > > > +			 * Modify the starting point to take in account the rotation
> > > > +			 *
> > > > +			 * src_line is the top-left corner, so when reading READ_RIGHT_TO_LEFT or
> > > > +			 * READ_BOTTOM_TO_TOP, it must be changed to the top-right/bottom-left
> > > > +			 * corner.
> > > > +			 */
> > > > +			if (direction == READ_RIGHT_TO_LEFT) {
> > > > +				// x_start is now the right point
> > > > +				x_start += pixel_count - 1;
> > > > +			} else if (direction == READ_BOTTOM_TO_TOP) {
> > > > +				// y_start is now the bottom point
> > > > +				y_start += pixel_count - 1;
> > > > +			}
> > > > +
> > > > +			/*
> > > > +			 * Perform the conversion and the blending
> > > > +			 *
> > > > +			 * Here we know that the read line (x_start, y_start, pixel_count) is
> > > > +			 * inside the source buffer [2] and we don't write outside the stage
> > > > +			 * buffer [1]
> > > > +			 */
> > > > +			current_plane->pixel_read_line(
> > > > +				current_plane, x_start, y_start, direction, pixel_count,
> > > > +				&stage_buffer->pixels[current_plane->frame_info->dst.x1]);
> > > > +
> > > > +			pre_mul_alpha_blend(stage_buffer, output_buffer,
> > > > +					    current_plane->frame_info->dst.x1,
> > > > +					    pixel_count);
> > > >  		}
> > > >  
> > > >  		apply_lut(crtc_state, output_buffer);
> > > > @@ -250,7 +335,7 @@ static void blend(struct vkms_writeback_job *wb,
> > > >  		*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, row_size);
> > > >  
> > > >  		if (wb)
> > > > -			vkms_writeback_row(wb, output_buffer, y_pos);
> > > > +			vkms_writeback_row(wb, output_buffer, y);
> > > >  	}
> > > >  }
> 
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
  2024-03-27 11:48       ` Pekka Paalanen
@ 2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 27/03/24 - 13:48, Pekka Paalanen a écrit :
> On Tue, 26 Mar 2024 16:57:00 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 25/03/24 - 14:41, Pekka Paalanen a écrit :
> > > On Wed, 13 Mar 2024 18:45:02 +0100
> > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >   
> > > > The pre_mul_alpha_blend is dedicated to blending, so to avoid mixing
> > > > different concepts (coordinate calculation and color management), extract
> > > > the x_limit and x_dst computation outside of this helper.
> > > > It also increases the maintainability by grouping the computation related
> > > > to coordinates in the same place: the loop in `blend`.
> > > > 
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_composer.c | 40 +++++++++++++++++-------------------
> > > >  1 file changed, 19 insertions(+), 21 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > index da0651a94c9b..9254086f23ff 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > @@ -24,34 +24,30 @@ static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
> > > >  
> > > >  /**
> > > >   * pre_mul_alpha_blend - alpha blending equation
> > > > - * @frame_info: Source framebuffer's metadata
> > > >   * @stage_buffer: The line with the pixels from src_plane
> > > >   * @output_buffer: A line buffer that receives all the blends output
> > > > + * @x_start: The start offset to avoid useless copy  
> > > 
> > > I'd say just:
> > > 
> > > + * @x_start: The start offset
> > > 
> > > It describes the parameter, and the paragraph below explains the why.
> > > 
> > > It would be explaining, that x_start applies to output_buffer, but
> > > input_buffer is always read starting from 0.  
> > 
> > I will change it to:
> > 
> >  * Using @x_start and @count information, only few pixel can be blended instead of the whole line
> >  * each time. @x_start is only used for the output buffer. The staging buffer is always read from
> >  * the start (0..@count in stage_buffer is blended at @x_start..@x_start+@count in output_buffer).
> 
> The important part is
> 
> 0..@count in stage_buffer is blended at @x_start..@x_start+@count in output_buffer
> 
> and everything else from that paragraph is not really adding much.

Ok, I will only keep this sentence.
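
The indexing described by that sentence can be shown with a standalone 
sketch. It is modeled on the patch quoted below but is not the kernel code: 
the types use <stdint.h>, the channel helper uses 64-bit intermediates, and 
the names blend_channel and blend_line are hypothetical.

```c
#include <stdint.h>

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Premultiplied-alpha "over" for one channel, rounded to nearest. */
static uint16_t blend_channel(uint16_t src, uint16_t dst, uint16_t alpha)
{
	uint64_t v = (uint64_t)src * 0xffff + (uint64_t)dst * (0xffff - alpha);

	v = (v + 0xffff / 2) / 0xffff;
	return v > 0xffff ? 0xffff : (uint16_t)v;
}

/* stage[0..count) is blended into output[x_start..x_start + count). */
static void blend_line(const struct pixel_argb_u16 *stage,
		       struct pixel_argb_u16 *output, int x_start, int count)
{
	struct pixel_argb_u16 *out = &output[x_start];

	for (int i = 0; i < count; i++) {
		out[i].a = 0xffff;
		out[i].r = blend_channel(stage[i].r, out[i].r, stage[i].a);
		out[i].g = blend_channel(stage[i].g, out[i].g, stage[i].a);
		out[i].b = blend_channel(stage[i].b, out[i].b, stage[i].a);
	}
}
```

Pixels of the output line outside x_start..x_start+count are left untouched, 
and the staging buffer is always read from index 0.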
 
> Remember to update the doc in "drm/vkms: Re-introduce line-per-line
> composition  algorithm" to follow the changes.

Thanks for the reminder, I will check!

> 
> > > > + * @count: The number of byte to copy  
> > > 
> > > You named it pixel_count, and it counts pixels, not bytes. It's not a
> > > copy but a blend into output_buffer.  
> > 
> > Oops, fixed in v6.
> >  
> > > >   *
> > > > - * Using the information from the `frame_info`, this blends only the
> > > > - * necessary pixels from the `stage_buffer` to the `output_buffer`
> > > > - * using premultiplied blend formula.
> > > > + * Using @x_start and @count information, only few pixel can be blended instead of the whole line
> > > > + * each time.
> > > >   *
> > > >   * The current DRM assumption is that pixel color values have been already
> > > >   * pre-multiplied with the alpha channel values. See more
> > > >   * drm_plane_create_blend_mode_property(). Also, this formula assumes a
> > > >   * completely opaque background.
> > > >   */
> > > > -static void pre_mul_alpha_blend(struct vkms_frame_info *frame_info,
> > > > -				struct line_buffer *stage_buffer,
> > > > -				struct line_buffer *output_buffer)
> > > > +static void pre_mul_alpha_blend(const struct line_buffer *stage_buffer,
> > > > +				struct line_buffer *output_buffer, int x_start, int pixel_count)
> > > >  {
> > > > -	int x_dst = frame_info->dst.x1;
> > > > -	struct pixel_argb_u16 *out = output_buffer->pixels + x_dst;
> > > > -	struct pixel_argb_u16 *in = stage_buffer->pixels;
> > > > -	int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
> > > > -			    stage_buffer->n_pixels);
> > > > -
> > > > -	for (int x = 0; x < x_limit; x++) {
> > > > -		out[x].a = (u16)0xffff;
> > > > -		out[x].r = pre_mul_blend_channel(in[x].r, out[x].r, in[x].a);
> > > > -		out[x].g = pre_mul_blend_channel(in[x].g, out[x].g, in[x].a);
> > > > -		out[x].b = pre_mul_blend_channel(in[x].b, out[x].b, in[x].a);
> > > > +	struct pixel_argb_u16 *out = &output_buffer->pixels[x_start];
> > > > +	const struct pixel_argb_u16 *in = stage_buffer->pixels;
> > > > +
> > > > +	for (int i = 0; i < pixel_count; i++) {
> > > > +		out[i].a = (u16)0xffff;
> > > > +		out[i].r = pre_mul_blend_channel(in[i].r, out[i].r, in[i].a);
> > > > +		out[i].g = pre_mul_blend_channel(in[i].g, out[i].g, in[i].a);
> > > > +		out[i].b = pre_mul_blend_channel(in[i].b, out[i].b, in[i].a);
> > > >  	}
> > > >  }
> > > >  
> > > > @@ -183,7 +179,7 @@ static void blend(struct vkms_writeback_job *wb,
> > > >  {
> > > >  	struct vkms_plane_state **plane = crtc_state->active_planes;
> > > >  	u32 n_active_planes = crtc_state->num_active_planes;
> > > > -	int y_pos;
> > > > +	int y_pos, x_dst, x_limit;
> > > >  
> > > >  	const struct pixel_argb_u16 background_color = { .a = 0xffff };
> > > >  
> > > > @@ -201,14 +197,16 @@ static void blend(struct vkms_writeback_job *wb,
> > > >  
> > > >  		/* The active planes are composed associatively in z-order. */
> > > >  		for (size_t i = 0; i < n_active_planes; i++) {
> > > > +			x_dst = plane[i]->frame_info->dst.x1;
> > > > +			x_limit = min_t(size_t, drm_rect_width(&plane[i]->frame_info->dst),
> > > > +					stage_buffer->n_pixels);  
> > > 
> > > Are those input values to min_t() really of type size_t? Or why is
> > > size_t here?  
> > 
> > n_pixel is size_t, drm_rect_width is int. I will change everything to int. 
> > Is there a way to ask the compiler "please don't do implicit conversion 
> > and report them as warn/errors"?
> 
> There probably is, you can find it in the gcc manual. However, I suspect
> you would drown in warnings for cases where the implicit conversion is
> wanted and an explicit cast is unwanted.

That's true. I found it (-Wconversion), but it is very noisy...
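
A small standalone sketch of why the type used for the comparison matters 
(the function names are hypothetical; min_t() itself casts both operands to 
the given type before comparing):

```c
#include <stddef.h>

/* Compare as size_t: a negative width wraps to a huge value first. */
static size_t min_as_size_t(int width, size_t n_pixels)
{
	size_t w = (size_t)width;

	return w < n_pixels ? w : n_pixels;
}

/* Compare as int: correct as long as n_pixels fits in an int. */
static int min_as_int(int width, size_t n_pixels)
{
	int n = (int)n_pixels;

	return width < n ? width : n;
}
```

With a width of -1 and 100 pixels, the size_t variant returns 100 while the 
int variant returns -1, so the two versions only agree for non-negative 
widths.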

Thanks,
Louis Chauvet

> 
> Thanks,
> pq
> 
> > > >  			y_pos = get_y_pos(plane[i]->frame_info, y);
> > > >  
> > > >  			if (!check_limit(plane[i]->frame_info, y_pos))
> > > >  				continue;
> > > >  
> > > >  			vkms_compose_row(stage_buffer, plane[i], y_pos);
> > > > -			pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
> > > > -					    output_buffer);
> > > > +			pre_mul_alpha_blend(stage_buffer, output_buffer, x_dst, x_limit);  
> > > 
> > > I thought it was a count, not a limit?
> > > 
> > > "Limit" sounds to me like "end", and end - start = count.  
> > 
> > It is effectively a pixel count. I just took those naming from the 
> > original pre_mul_alpha_blend. I will change it to pixel_count.
> > 
> > Thanks,
> > Louis Chauvet
> > 
> > > >  		}
> > > >  
> > > >  		apply_lut(crtc_state, output_buffer);
> > > >   
> > > 
> > > The details aside, this is a good move.
> > > 
> > > 
> > > Thanks,
> > > pq  
> > 
> > 
> > 
> 



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-27 12:11   ` Philipp Zabel
@ 2024-04-08  7:50     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Philipp Zabel
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	pekka.paalanen, dri-devel, linux-kernel, jeremie.dautheribes,
	miquel.raynal, thomas.petazzoni, seanpaul, marcheu,
	nicolejadeyee

Le 27/03/24 - 13:11, Philipp Zabel a écrit :
> Hi Louis,
> 
> On Mi, 2024-03-13 at 18:45 +0100, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> 
> s/floats/fixed-point numbers/
> 
> Nothing floating here, the point is fixed.
> 
> [...]
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> s/CONVERSION_MATRIX_FLOAT_DEPTH/CONVERSION_MATRIX_FRACTIONAL_BITS/
> 
> Just a suggestion, maybe there are better terms, but using "FLOAT" here
> is confusing.
> 
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> 
> s/fixed floats/fixed-point numbers/

Thanks for the clarification, I will change the wording in v6.
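
For readers unfamiliar with the term, here is a minimal fixed-point sketch 
with 32 fractional bits. It does not use the drm_fixed helpers. The names 
FRACTIONAL_BITS, FIXED_ONE and fixed_mul_int are hypothetical, and the 
product is assumed to fit in 64 bits (small coefficients, 8- or 16-bit 
samples).

```c
#include <stdint.h>

#define FRACTIONAL_BITS 32
#define FIXED_ONE ((int64_t)1 << FRACTIONAL_BITS)

/*
 * Multiply an integer sample by a signed 32.32 fixed-point coefficient and
 * return the integer result, rounded to nearest (ties away from zero).
 */
static int64_t fixed_mul_int(int64_t coeff, int32_t sample)
{
	int64_t prod = coeff * sample;
	int64_t half = FIXED_ONE / 2;

	return (prod >= 0 ? prod + half : prod - half) / FIXED_ONE;
}
```

Here a coefficient of 0.5 is stored as FIXED_ONE / 2, so multiplying it by 
200 gives 100 without any floating-point operation.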

Louis Chauvet
 
> regards
> Philipp

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-27 12:59       ` Pekka Paalanen
@ 2024-04-08  7:50         ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Maíra Canal, Rodrigo Siqueira, Melissa Wen, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 27/03/24 - 14:59, Pekka Paalanen a écrit :
> On Tue, 26 Mar 2024 16:57:03 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 25/03/24 - 11:26, Maíra Canal a écrit :
> > > On 3/13/24 14:45, Louis Chauvet wrote:  
> > > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > > 
> > > > Add support to the YUV formats bellow:
> > > > 
> > > > - NV12/NV16/NV24
> > > > - NV21/NV61/NV42
> > > > - YUV420/YUV422/YUV444
> > > > - YVU420/YVU422/YVU444
> > > > 
> > > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > > 32.32 floats and the drm_fixed helpers.
> > > > 
> > > > To do the conversion, a specific matrix must be used for each color range
> > > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > > encoding and range were obtained by rounding the values of the original
> > > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > > floating point operations.
> > > > 
> > > > The same reading function is used for YUV and YVU formats. As the only
> > > > difference between those two category of formats is the order of field, a
> > > > simple swap in conversion matrix columns allows using the same function.
> > > > 
> > > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > > [Louis Chauvet:
> > > > - Adapted Arthur's work
> > > > - Implemented the read_line_t callbacks for yuv
> > > > - add struct conversion_matrix
> > > > - remove struct pixel_yuv_u8
> > > > - update the commit message
> > > > - Merge the modifications from Arthur]  
> > > 
> > > A Co-developed-by tag would be more appropriate.  
> > 
> > I am not the main author of this part, I only applied a few simple 
> > suggestions, the complex part was done by Arthur.
> > 
> > I will wait for Arthur's confirmation to change it to Co-developed by if
> > he agrees.
> 
> Co-developed-by is an additional tag, and does not replace S-o-b. To my
> understanding, the kernel rules and Developers' Certificate of Origin
> require S-o-b to be added by anyone who has taken a patch and
> re-submitted it, regardless of who the original author is, and
> especially if the patch was modified.
> 
> Personally I also like to keep the list of changes like Louis added, to
> credit people better.

I will keep everything, don't worry!
 
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > > >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > > >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > > >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > > >   4 files changed, 473 insertions(+), 1 deletion(-)
> > > > 
> 
> ...
> 
> > > >   };
> > > >   
> > > >   static struct drm_plane_state *
> > > > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> > > >   	drm_framebuffer_get(frame_info->fb);
> > > >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> > > >   									  DRM_MODE_ROTATE_90 |
> > > > +									  DRM_MODE_ROTATE_180 |  
> > > 
> > > Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> > > reflecting both along the X and Y axis?  
> > 
> > Oops, I had no intention of putting that change here. I will move it to 
> > another patch.
> > 
> > I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
> > the drm_rotation_simplify documentation, it explains that this argument 
> > should contain all supported rotations and reflections, and ROT_180 is 
> > supported by VKMS. Perhaps this call is unnecessary because all 
> > combinations are supported by vkms?
> 
> If you truly handle all bit patterns that the rotation bitfield can
> have, then yes, the call seems unnecessary.
> 
> However, as documented, the bitfield contains redundancy: the same
> orientation can be expressed in more than one bit pattern. One example
> is that ROTATE_180 is equivalent to REFLECT_X | REFLECT_Y.
> 
> Since it's a bitmask, userspace can give you funny values like
> ROTATE_0 | ROTATE_90 | ROTATE_180. That is a valid orientation of
> 270-degree rotation (according to UAPI doc), but it is very awkwardly
> expressed, hence the need to normalise it into a minimal bit pattern.

Userspace can't set multiple rotation bits, see [1]:

	if (!is_power_of_2(val & DRM_MODE_ROTATE_MASK)) {
		drm_dbg_atomic(plane->dev,
			       "[PLANE:%d:%s] bad rotation bitmask: 0x%llx\n",
			       plane->base.id, plane->name, val);
		return -EINVAL;
	}

I have a series ready to improve the DRM documentation; you will be in 
Cc.

[1]: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L527
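
The effect of that check can be tried outside the kernel with the sketch 
below. The constants mirror the values of the DRM_MODE_ROTATE_* and 
DRM_MODE_REFLECT_* UAPI bits; the function names are local stand-ins, not 
kernel symbols.

```c
#include <stdint.h>

#define ROTATE_0	(1u << 0)
#define ROTATE_90	(1u << 1)
#define ROTATE_180	(1u << 2)
#define ROTATE_270	(1u << 3)
#define REFLECT_X	(1u << 4)
#define REFLECT_Y	(1u << 5)
#define ROTATE_MASK	(ROTATE_0 | ROTATE_90 | ROTATE_180 | ROTATE_270)

/* True when exactly one bit is set. */
static int has_single_bit(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* A rotation value is accepted only if exactly one ROTATE_* bit is set. */
static int rotation_is_valid(uint64_t val)
{
	return has_single_bit(val & ROTATE_MASK);
}
```

With this rule, ROTATE_90 | REFLECT_X is accepted, while 
ROTATE_0 | ROTATE_90 | ROTATE_180 and a value with no rotation bit at all 
are rejected.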
 
> It does not look like drm_rotation_simplify() actually does this
> minimisation!

drm_rotation_simplify() applies an additional 
REFLECT_X|REFLECT_Y|ROTATE_180, which is a "no-op" combination, but it 
transforms ROT_90|REF_X into ROT_270|REF_Y, so REF_X and ROT_90 are 
eliminated. I will create a patch to document this function a bit more.

In the current vkms code, it will just remove the 180° rotation. I 
think we can just delete this call as everything is supported. I will add 
the patch in v6 (I don't know why it was there in the first place, 
and I don't want to introduce a regression).
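
The behaviour described above can be sketched as follows. This is a 
standalone illustration written from that description, not a copy of the 
kernel function. The constants mirror the DRM_MODE_* bit values, and 
rotation_simplify is a local name.

```c
#define ROTATE_0	(1u << 0)
#define ROTATE_90	(1u << 1)
#define ROTATE_180	(1u << 2)
#define ROTATE_270	(1u << 3)
#define REFLECT_X	(1u << 4)
#define REFLECT_Y	(1u << 5)
#define ROTATE_MASK	(ROTATE_0 | ROTATE_90 | ROTATE_180 | ROTATE_270)
#define REFLECT_MASK	(REFLECT_X | REFLECT_Y)

/*
 * If the requested value uses an unsupported bit, apply the equivalent
 * "REFLECT_X | REFLECT_Y | ROTATE_180" once. Assumes exactly one rotation
 * bit is set in `rotation`.
 */
static unsigned int rotation_simplify(unsigned int rotation,
				      unsigned int supported)
{
	if (rotation & ~supported) {
		unsigned int rot = rotation & ROTATE_MASK;

		/* Toggle both reflections. */
		rotation ^= REFLECT_X | REFLECT_Y;
		/* Add 180 degrees: move the rotation bit by two, wrapping. */
		rot = ((rot << 2) | (rot >> 2)) & ROTATE_MASK;
		rotation = (rotation & REFLECT_MASK) | rot;
	}
	return rotation;
}
```

With ROT_90 and REF_X unsupported, ROT_90|REF_X becomes ROT_270|REF_Y. With 
only ROT_180 unsupported, ROT_180 becomes ROT_0|REF_X|REF_Y.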

> I was not able to tell if DRM common code actually stops userspace from
> combining multiple ROTATE bits in the same value. I suspect it must
> stop them, or perhaps all code dealing with rotation is actually broken.

See [1], and yes, the drm helpers are broken when multiple bits are set: 
they expect only one rotation bit.

> drm_rotation_simplify() is useful for cases where your hardware does
> not have exactly the same flexibility. Maybe it cannot do REFLECT_Y but
> it can do everything else? Then drm_rotation_simplify() gives you a bit
> pattern that you can use directly, or fails if the orientation is not
> representable with what your hardware can do.
> 
> At least, that's my understanding of quickly glancing over it.
> 
> IOW, if you wanted to never have to deal with REFLECT_Y bit, you could
> leave it out here. Or, if you never want to deal with ROTATE_180, leave
> that out - you will get REFLECT_X | REFLECT_Y instead. In theory.
> 
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-03-27 14:23   ` Pekka Paalanen
@ 2024-04-08  7:50     ` Louis Chauvet
  2024-04-09  7:58       ` Pekka Paalanen
  0 siblings, 1 reply; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:05 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> 
> You mean fixed-point, not floating-point (floats).
> 
> > 
> > To do the conversion, a specific matrix must be used for each color range
> > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > the `conversion_matrix` struct, along with the specific y_offset needed.
> > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > stored in a `vkms_plane_state`. Those conversion matrices of each
> > encoding and range were obtained by rounding the values of the original
> > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > floating point operations.
> > 
> > The same reading function is used for YUV and YVU formats. As the only
> > difference between those two category of formats is the order of field, a
> > simple swap in conversion matrix columns allows using the same function.
> 
> Sounds good!
> 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >  4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> 
> Fraction, not float.
> 
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> 
> Fixed float is not a thing. They are signed fixed-point values with
> 32-bit fractional part.

I will edit all of this for v6.
 
> > + * @y_offset: Offset to apply on the y value.
> > + */
> > +struct conversion_matrix {
> > +	s64 matrix[3][3];
> > +	s64 y_offset;
> > +};
> 
> Btw. too bad that drm_fixed.h does not use something like
> 
> 	typedef struct drm_fixed {
> 		s64 v;
> 	} drm_fixed_t;
> 
> and use that in all the API where a fixed-point value is passed. It
> would make the type very explicit, and the struct prevents it from
> implicitly casting to/from regular integer formats.
> 
> Then you could use drm_fixed_t instead of s64 and it would be obvious
> how the values must be handled and which API is appropriate.

I agree this could be a nice improvement, but it may require touching a lot 
of places. 
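
To make the idea concrete, here is a minimal user-space sketch of such a 
wrapper type. The names fixp_t, fixp_from_int, fixp_add and 
fixp_to_int_round are hypothetical and do not exist in drm_fixed.h; only 
the 32.32 split mirrors what the existing helpers use:

```c
#include <assert.h>
#include <stdint.h>

#define FIXP_FRAC_BITS 32	/* 32.32 split, as with the drm_fixed helpers */

/* Hypothetical wrapper type: the struct blocks implicit integer conversions. */
typedef struct fixp {
	int64_t v;
} fixp_t;

static inline fixp_t fixp_from_int(int32_t i)
{
	/* Multiply instead of shifting so negative inputs stay well defined. */
	return (fixp_t){ .v = (int64_t)i * ((int64_t)1 << FIXP_FRAC_BITS) };
}

static inline fixp_t fixp_add(fixp_t a, fixp_t b)
{
	return (fixp_t){ .v = a.v + b.v };
}

static inline int64_t fixp_to_int_round(fixp_t a)
{
	/* This sketch only rounds non-negative values. */
	assert(a.v >= 0);
	return (a.v + ((int64_t)1 << (FIXP_FRAC_BITS - 1))) >> FIXP_FRAC_BITS;
}
```

With a type like this, "fixp_t x = 3;" or passing a plain s64 where a 
fixp_t is expected no longer compiles, which is the point of the 
suggestion.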

> > +
> >  /**
> >   * vkms_plane_state - Driver specific plane state
> >   * @base: base plane state
> > @@ -110,6 +131,7 @@ struct vkms_plane_state {
> >  	struct drm_shadow_plane_state base;
> >  	struct vkms_frame_info *frame_info;
> >  	pixel_read_line_t pixel_read_line;
> > +	struct conversion_matrix *conversion_matrix;
> 
> If the matrix was embedded as a copy instead of a pointer to (const!)
> matrix, you would not need to manually hardcode YVU variant of the
> matrices, but you could simply swap the columns of the YUV matrix while
> copying them into this field.

Very good suggestion thanks, applied for the v6!
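
A rough sketch of the copy-with-swap, assuming a hypothetical helper named 
copy_conversion_matrix and a swap_chroma flag derived from the pixel 
format. The struct layout and the matrix values are taken from this patch; 
the swap exchanges the two chroma columns (indices 1 and 2):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct conversion_matrix {
	int64_t matrix[3][3];
	int64_t y_offset;
};

/* Values of yvu_bt601_full as they appear in this patch. */
static const struct conversion_matrix yvu_bt601_full = {
	.matrix = {
		{ 4294967296, 6021544149,  0 },
		{ 4294967296, -3067191994, -1478054095 },
		{ 4294967296, 0,           7610682049 },
	},
	.y_offset = 0,
};

/*
 * Hypothetical helper: copy @src into @dst, optionally exchanging the two
 * chroma columns so that one const table can serve both YUV and YVU formats.
 */
static void copy_conversion_matrix(struct conversion_matrix *dst,
				   const struct conversion_matrix *src,
				   bool swap_chroma)
{
	assert(dst != src);
	for (int row = 0; row < 3; row++) {
		dst->matrix[row][0] = src->matrix[row][0];
		dst->matrix[row][1] = src->matrix[row][swap_chroma ? 2 : 1];
		dst->matrix[row][2] = src->matrix[row][swap_chroma ? 1 : 2];
	}
	dst->y_offset = src->y_offset;
}
```

Because the plane state owns a copy, the source tables can be 'static 
const' and only one table per encoding/range pair is needed.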

> 
> >  };
> >  
> >  struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >  	return 0;
> >  }
> >  
> > +/**
> > + * get_subsampling() - Get the subsampling divisor value on a specific direction
> > + */
> > +static int get_subsampling(const struct drm_format_info *format,
> > +			   enum pixel_read_direction direction)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +	case READ_TOP_TO_BOTTOM:
> > +		return format->vsub;
> > +	case READ_RIGHT_TO_LEFT:
> > +	case READ_LEFT_TO_RIGHT:
> > +		return format->hsub;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 1;
> > +}
> > +
> > +/**
> > + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> > + * x_start and y_start values
> > + */
> > +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -y_start - 1;
> > +	case READ_TOP_TO_BOTTOM:
> > +		return y_start;
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -x_start - 1;
> > +	case READ_LEFT_TO_RIGHT:
> > +		return x_start;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 0;
> > +}
> > +
> >  /*
> >   * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >   * ARGB16161616 in out_pixel.
> > @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >  	return out_pixel;
> >  }
> >  
> > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > +						  struct conversion_matrix *matrix)
> 
> If you are using the "swap the matrix columns" trick, then you cannot
> call these cb, cr nor even u,v, because they might be the opposite.
> They are simply the first and second chroma channel, and their meaning
> depends on the given matrix.

I will rename them for v6, channel_1 and channel_2.

> > +{
> > +	u8 r, g, b;
> > +	s64 fp_y, fp_cb, fp_cr;
> > +	s64 fp_r, fp_g, fp_b;
> > +
> > +	fp_y = y - matrix->y_offset;
> > +	fp_cb = cb - 128;
> > +	fp_cr = cr - 128;
> 
> This looks like an incorrect way to convert u8 to fixed-point, but...
>
> > +
> > +	fp_y = drm_int2fixp(fp_y);
> > +	fp_cb = drm_int2fixp(fp_cb);
> > +	fp_cr = drm_int2fixp(fp_cr);
> 
> I find it confusing to re-purpose variables like this.
> 
> I'd do just
> 
> 	fp_c1 = drm_int2fixp((int)c1 - 128);

I agree with this remark, I will change it for the v6.

> If the function arguments were int to begin with, then the cast would
> be obviously unnecessary.

For this I'm less sure. The name of the function and its usage are 
explicit: we want to use u8 as input. As we manipulate pointers in 
read_line, I don't know how it would work if the pointer were dereferenced 
to an int instead of a u8.
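
For reference, a small user-space sketch of the case in question. The 
function names are hypothetical. The pointer stays a u8 pointer; only the 
value obtained by dereferencing it is converted to int when it is passed 
to a function that takes int parameters:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper taking a plain int channel value. */
static int center_chroma(int c)
{
	assert(c >= 0 && c <= 255);
	return c - 128;
}

static int first_chroma_centered(const uint8_t *plane)
{
	/* *plane has type uint8_t; the value is converted to int at the call. */
	return center_chroma(*plane);
}
```

The subtraction then happens on int, so a sample of 0 gives -128 without 
any explicit cast.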

> So, what you have in fp variables at this point is fractional numbers
> in the 8-bit integer scale. However, because the target format is
> 16-bit, you should not throw the extra precision away here. Instead,
> multiply by 257 to bring the values to 16-bit scale, and do the RGB
> clamping to 16-bit, not 8-bit.
> 
> > +
> > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > +
> > +	fp_r = drm_fixp2int_round(fp_r);
> > +	fp_g = drm_fixp2int_round(fp_g);
> > +	fp_b = drm_fixp2int_round(fp_b);
> > +
> > +	r = clamp(fp_r, 0, 0xff);
> > +	g = clamp(fp_g, 0, 0xff);
> > +	b = clamp(fp_b, 0, 0xff);
> > +
> > +	return argb_u16_from_u8888(255, r, g, b);
> 
> Going through argb_u16_from_u8888() will throw away precision.

I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
v6, please let me know.
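
For reference, a user-space sketch of the 16-bit variant described in the 
review. This is not the v6 code: fixp_mul and fixp_to_u16 are stand-ins 
for the drm_fixed helpers, the 128-bit intermediate relies on a GCC/Clang 
extension, and the matrix is yvu_bt601_full from this patch with its 
chroma columns swapped back:

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 32

struct conversion_matrix {
	int64_t matrix[3][3];
	int64_t y_offset;
};

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

static const struct conversion_matrix bt601_full = {
	.matrix = {
		{ 4294967296, 0,           6021544149 },
		{ 4294967296, -1478054095, -3067191994 },
		{ 4294967296, 7610682049,  0 },
	},
	.y_offset = 0,
};

/* 32.32 multiply through a 128-bit intermediate (compiler extension). */
static int64_t fixp_mul(int64_t a, int64_t b)
{
	return (int64_t)(((__int128)a * b) / ((__int128)1 << FRAC_BITS));
}

/* Round a 32.32 value to the nearest integer and clamp it to [0, 0xffff]. */
static uint16_t fixp_to_u16(int64_t v)
{
	if (v <= 0)
		return 0;
	v = (v + ((int64_t)1 << (FRAC_BITS - 1))) >> FRAC_BITS;
	return v > 0xffff ? 0xffff : (uint16_t)v;
}

static struct pixel_argb_u16 argb_u16_from_yuv_16bit(uint8_t y, uint8_t c1, uint8_t c2,
						     const struct conversion_matrix *m)
{
	const int64_t one = (int64_t)1 << FRAC_BITS;
	int64_t in[3];
	uint16_t out[3];

	assert(m);
	/* Bring the 8-bit inputs to the 16-bit scale: 0xff * 257 == 0xffff. */
	in[0] = ((int64_t)y - m->y_offset) * 257 * one;
	in[1] = ((int64_t)c1 - 128) * 257 * one;
	in[2] = ((int64_t)c2 - 128) * 257 * one;

	for (int row = 0; row < 3; row++) {
		int64_t acc = 0;

		for (int col = 0; col < 3; col++)
			acc += fixp_mul(m->matrix[row][col], in[col]);
		out[row] = fixp_to_u16(acc);
	}

	return (struct pixel_argb_u16){ .a = 0xffff, .r = out[0], .g = out[1], .b = out[2] };
}
```

The clamp is done once, on the 16-bit result, so nothing goes through 
argb_u16_from_u8888() anymore.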

> > +}
> > +
> >  /*
> >   * The following functions are read_line function for each pixel format supported by VKMS.
> >   *
> > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> >  	}
> >  }
> >  
> > +/*
> > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > + * (column inversion)
> 
> Would be nice to explain what semi_planar_yuv means, so that the
> documentation for these functions would show how they differ rather
> than all saying exactly the same thing.

 /*
  * This callback can be used for YUV formats where each color component is 
  * stored in a different plane (often called planar formats). It will 
  * correctly handle subsampling.
  */

 /*
  * This callback can be used for YUV formats where U and V values are 
  * stored in the same plane (often called semi-planar formats). It will 
  * correctly handle subsampling.
  * 
  * The conversion matrix stored in the @plane is used to:
  * - Apply the correct color range and encoding
  * - Convert YUV and YVU with the same function (a simple column swap is 
  *   needed)
  */
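
To illustrate what "handling subsampling" means for the semi-planar case, 
here is a simplified user-space sketch for a left-to-right read. The 
helper name and its signature are hypothetical and much simpler than the 
read_line callbacks: several luma pixels share one interleaved chroma 
pair, selected by dividing the x position by the horizontal subsampling 
factor:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: fetch the chroma pair for each of @count pixels of one
 * row, reading left to right from an interleaved (semi-planar) chroma row.
 */
static void semi_planar_chroma_for_line(const uint8_t *chroma_row, int x_start, int count,
					int hsub, uint8_t out[][2])
{
	assert(hsub >= 1 && x_start >= 0);
	for (int i = 0; i < count; i++) {
		/* Pixels x and x + 1 share a chroma sample when hsub == 2. */
		int cx = (x_start + i) / hsub;

		out[i][0] = chroma_row[2 * cx];
		out[i][1] = chroma_row[2 * cx + 1];
	}
}
```

For a planar format the same index cx applies, but it is used on two 
separate rows, one per chroma plane.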
 
> > + */
> > +static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				      int y_start, enum pixel_read_direction direction, int count,
> > +				      struct pixel_argb_u16 out_pixel[])
> > +{
> > +	int rem_x, rem_y;
> > +	u8 *y_plane;
> > +	u8 *uv_plane;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> 
> Assert rem_x, rem_y are zero, or block is 1x1.
> 
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   1, &uv_plane, &rem_x, &rem_y);
> 
> Assert rem_x, rem_y are zero, or block is 1x1.
> 
> Actually, this is so common, that maybe there should be a wrapper for
> packed_pixels_addr() or another variant of it, that asserts that the
> block size is 1x1 and does not return rem_x, rem_y at all.

I will create a packed_pixel_addr_1x1 for this, and add this assert 
inside, so no code duplication.
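
Roughly, the wrapper could take the following shape. This is a user-space 
sketch on a simplified stand-in for the framebuffer, not the VKMS code: 
struct fake_plane and both function names are hypothetical, and in the 
driver the check would be a WARN rather than assert():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the framebuffer description used by VKMS. */
struct fake_plane {
	uint8_t *base;
	size_t pitch;		/* bytes per row of blocks */
	int block_w, block_h;	/* pixels per block */
	size_t block_bytes;	/* bytes per block */
};

/* Stand-in for packed_pixels_addr(): block address plus position in the block. */
static void fake_packed_pixels_addr(const struct fake_plane *p, int x, int y,
				    uint8_t **addr, int *rem_x, int *rem_y)
{
	*rem_x = x % p->block_w;
	*rem_y = y % p->block_h;
	*addr = p->base + (size_t)(y / p->block_h) * p->pitch +
		(size_t)(x / p->block_w) * p->block_bytes;
}

/* Variant for formats with 1x1 blocks: no remainder is returned. */
static void fake_packed_pixels_addr_1x1(const struct fake_plane *p, int x, int y,
					uint8_t **addr)
{
	int rem_x, rem_y;

	assert(p->block_w == 1 && p->block_h == 1);
	fake_packed_pixels_addr(p, x, y, addr, &rem_x, &rem_y);
}
```

Callers that only deal with 1x1 blocks then no longer need to declare 
rem_x and rem_y at all.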

> > +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +	int step_uv = get_step_next_block(plane->frame_info->fb, direction, 1);
> > +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> > +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> > +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> > +
> > +	for (int i = 0; i < count; i++) {
> > +		*out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
> > +						  conversion_matrix);
> > +		out_pixel += 1;
> > +		y_plane += step_y;
> > +		if ((i + subsampling_offset + 1) % subsampling == 0)
> > +			uv_plane += step_uv;
> > +	}
> > +}
> > +
> > +/*
> > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > + * (column inversion)
> > + */
> > +static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				 int y_start, enum pixel_read_direction direction, int count,
> > +				 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	int rem_x, rem_y;
> > +	u8 *y_plane;
> > +	u8 *u_plane;
> > +	u8 *v_plane;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   1, &u_plane, &rem_x, &rem_y);
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   2, &v_plane, &rem_x, &rem_y);
> > +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +	int step_u = get_step_next_block(plane->frame_info->fb, direction, 1);
> > +	int step_v = get_step_next_block(plane->frame_info->fb, direction, 2);
> > +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> > +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> > +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> > +
> > +	for (int i = 0; i < count; i++) {
> > +		*out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
> > +		out_pixel += 1;
> > +		y_plane += step_y;
> > +		if ((i + subsampling_offset + 1) % subsampling == 0) {
> > +			u_plane += step_u;
> > +			v_plane += step_v;
> > +		}
> > +	}
> > +}
> > +
> >  /*
> >   * The following functions take one argb_u16 pixel and convert it to a specific format. The
> >   * result is stored in @out_pixel.
> > @@ -418,6 +565,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  		return &XRGB16161616_read_line;
> >  	case DRM_FORMAT_RGB565:
> >  		return &RGB565_read_line;
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_NV21:
> > +	case DRM_FORMAT_NV61:
> > +	case DRM_FORMAT_NV42:
> > +		return &semi_planar_yuv_read_line;
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +	case DRM_FORMAT_YVU420:
> > +	case DRM_FORMAT_YVU422:
> > +	case DRM_FORMAT_YVU444:
> > +		return &planar_yuv_read_line;
> >  	default:
> >  		/*
> >  		 * This is a bug in vkms_plane_atomic_check. All the supported
> > @@ -435,6 +596,276 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  	}
> >  }
> >  
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> 
> This comment about NULL seems bogus.

Because it is... and it becomes useless when using the "copy matrix" 
method.

> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> 
> Every matrix here should be 'static const' rather than only 'static'.
>
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +
> > +	/*
> > +	 * Those matrices were generated using the colour python framework
> > +	 *
> > +	 * Below are the function calls used to generate each matrix, go to
> > +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> > +	 * for more info:
> > +	 *
> > +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                                  is_legal = False,
> 
> Ugh, colour.matrix_YCbCr documentation is confusing. This is the first
> time I've heard of "legal range", so I had to look it up. Of course,
> the doc does not explain it.
> 
> Reading
> https://kb.pomfort.com/livegrade/advanced-grading-features/legal-and-extended-sdi-signals-and-luts-in-livegrade/
> it sounds like extended range in 8-bit is 1-254, not 0-255 that
> we use in computer graphics. This matches what I've read before
> elsewhere in ITU or SMPTE specs.
> 
> SDI signals reserve the 8-bit code points 0 and 255 for
> synchronization, making them invalid as data. It scales to higher bit
> depths, so 10-bit code points 0-3 and 1020-1023 inclusive are reserved
> for synchronization.
> 
> IOW, there are two different "full range" quantizations: extended and full.
> 
> Does is_legal=False refer to extended or full? The documentation
> does not say.
> 
> However, given that changing 'bits' value with is_legal=False does not
> change the result, and with is_legal=True it does change the result, I
> suspect is_legal=False means full range, not extended range.
> 
> So I think the python snippet is correct.
> 
> > +	 *                                  bits = 8) * 2**32).astype(int)
> > +	 */
> > +	static struct conversion_matrix yuv_bt601_full = {
> > +		.matrix = {
> > +			{ 4294967296, 0,           6021544149 },

[...]

> > +			{ 5020601039, 9234915964, 0 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +
> > +	/*
> > +	 * The next matrices are just the previous ones, but with the first and
> > +	 * second columns swapped
> 
> As I mentioned earlier, you could derive those below from the above
> matrices in code, so you don't need all these open-coded.
> 
> You also would not need twice the switch-ladders below, you'd only need
> a 'bool need_to_swap_columns' from the pixel format.

It is done in the v6, the code is much simpler.

Thanks,
Louis Chauvet

> You could also have a 'bool limited_range', and do
> 
> 	case DRM_COLOR_YCBCR_BT601:
> 		return limited_range ? &yuv_bt601_limited : &yuv_bt601_full;
> 
> 
> > +	 */
> > +	static struct conversion_matrix yvu_bt601_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6021544149,  0 },
> > +			{ 4294967296, -3067191994, -1478054095 },
> > +			{ 4294967296, 0,           7610682049 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt601_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 6881764740,  0 },
> > +			{ 5020601039, -3505362278, -1689204679 },
> > +			{ 5020601039, 0,           8697922339 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +	static struct conversion_matrix yvu_bt709_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6763714498,  0 },
> > +			{ 4294967296, -2010578443, -804551626 },
> > +			{ 4294967296, 0,           7969741314 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt709_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 7729959424,  0 },
> > +			{ 5020601039, -2297803934, -919487572 },
> > +			{ 5020601039, 0,           9108275786 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +	static struct conversion_matrix yvu_bt2020_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6333358775,  0 },
> > +			{ 4294967296, -2453942994, -706750298 },
> > +			{ 4294967296, 0,           8080551471 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt2020_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 7238124312,  0 },
> > +			{ 5020601039, -2804506279, -807714626 },
> > +			{ 5020601039, 0,           9234915964 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +
> > +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> > +	switch (format) {
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt601_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT709:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt709_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt709_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT2020:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt2020_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt2020_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_ENCODING_MAX:
> > +			break;
> > +		}
> > +		break;
> > +	case DRM_FORMAT_YVU420:
> > +	case DRM_FORMAT_YVU422:
> > +	case DRM_FORMAT_YVU444:
> > +	case DRM_FORMAT_NV21:
> > +	case DRM_FORMAT_NV61:
> > +	case DRM_FORMAT_NV42:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt601_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT709:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt709_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt709_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT2020:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt2020_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt2020_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_ENCODING_MAX:
> > +			break;
> > +		}
> > +		break;
> > +	case DRM_FORMAT_ARGB8888:
> > +	case DRM_FORMAT_XRGB8888:
> > +	case DRM_FORMAT_ARGB16161616:
> > +	case DRM_FORMAT_XRGB16161616:
> > +	case DRM_FORMAT_RGB565:
> > +		/*
> > +		 * Those formats are supported, but they don't need a conversion matrix. Return
> > +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> > +		 * somewhere.
> > +		 */
> > +		return &no_operation;
> > +	default:
> > +		break;
> > +	}
> > +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> > +	     encoding, range, &format);
> > +	return &no_operation;
> > +}
> > +
> >  /**
> >   * Retrieve the correct write_pixel function for a specific format.
> >   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 8d2bef95ff79..e1d324764b17 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
> >  
> >  pixel_write_t get_pixel_write_function(u32 format);
> >  
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range);
> > +
> >  #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
> >  	DRM_FORMAT_XRGB8888,
> >  	DRM_FORMAT_XRGB16161616,
> >  	DRM_FORMAT_ARGB16161616,
> > -	DRM_FORMAT_RGB565
> > +	DRM_FORMAT_RGB565,
> > +	DRM_FORMAT_NV12,
> > +	DRM_FORMAT_NV16,
> > +	DRM_FORMAT_NV24,
> > +	DRM_FORMAT_NV21,
> > +	DRM_FORMAT_NV61,
> > +	DRM_FORMAT_NV42,
> > +	DRM_FORMAT_YUV420,
> > +	DRM_FORMAT_YUV422,
> > +	DRM_FORMAT_YUV444,
> > +	DRM_FORMAT_YVU420,
> > +	DRM_FORMAT_YVU422,
> > +	DRM_FORMAT_YVU444
> >  };
> >  
> >  static struct drm_plane_state *
> > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >  	drm_framebuffer_get(frame_info->fb);
> >  	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> >  									  DRM_MODE_ROTATE_90 |
> > +									  DRM_MODE_ROTATE_180 |
> >  									  DRM_MODE_ROTATE_270 |
> >  									  DRM_MODE_REFLECT_X |
> >  									  DRM_MODE_REFLECT_Y);
> >  
> >  
> >  	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> > +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> > +		(fmt, new_state->color_encoding, new_state->color_range);
> >  }
> >  
> >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 
> 
> I couldn't pinpoint what would need to be fixed so that rotation would
> not change chroma siting, but I also cannot say that chroma siting is
> definitely correct already.
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R*
  2024-03-28 14:00   ` Pekka Paalanen
@ 2024-04-08  7:50     ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-08  7:50 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 28/03/24 - 16:00, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:10 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > This adds support for:
> > - R1/R2/R4/R8
> > 
> > R1 format was tested with [1] and [2].
> > 
> > [1]: https://lore.kernel.org/r/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com
> > [2]: https://lore.kernel.org/igt-dev/20240306-b4-kms_tests-v1-0-8fe451efd2ac@bootlin.com/
> > 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_formats.c | 100 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_plane.c   |   6 ++-
> >  2 files changed, 105 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 863fc91d6d48..cbb2ec09564a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -201,6 +201,11 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >  	return out_pixel;
> >  }
> >  
> > +static struct pixel_argb_u16 argb_u16_from_gray8(u8 gray)
> > +{
> > +	return argb_u16_from_u8888(255, gray, gray, gray);
> > +}
> > +
> >  VISIBLE_IF_KUNIT struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> >  							    struct conversion_matrix *matrix)
> >  {
> > @@ -269,6 +274,89 @@ static void black_to_argb_u16(const struct vkms_plane_state *plane, int x_start,
> >  	}
> >  }
> >  
> > +static void Rx_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			 int y_start, enum pixel_read_direction direction, int count,
> > +			 struct pixel_argb_u16 out_pixel[], u8 bit_per_pixel, u8 lum_per_level)
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> 
> Maybe assert that rem_y = 0? Or block_h = 1.

Done for the v6.
 
> > +	int bit_offset = (int)rem_x * bit_per_pixel;
> 
> Why cast rem_x to int when it was defined to be int?

Because it was not the case in my first implementation, and I missed this 
cast... Thanks.

> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +	int mask = (0x1 << bit_per_pixel) - 1;
> 
> Since mask will interact with u8, it should be unsigned too.

Ok, I will change it for the v6.

> > +
> > +	if (direction == READ_LEFT_TO_RIGHT || direction == READ_RIGHT_TO_LEFT) {
> > +		int restart_bit_offset = 0;
> > +		int step_bit_offset = bit_per_pixel;
> > +
> > +		if (direction == READ_RIGHT_TO_LEFT) {
> > +			restart_bit_offset = 8 - bit_per_pixel;
> > +			step_bit_offset = -bit_per_pixel;
> > +		}
> > +
> > +		while (out_pixel < end) {
> > +			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;
> 
> or shorter: (*src_pixels >> bit_offset) & mask
> 
> However, shouldn't the first pixel be on the high bits?

Obviously yes... I misunderstood it... fixed in the v6 (and it will be 
fixed in the next iteration of my igt series).
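
A small user-space sketch of the corrected bit order, following the 
drm_fourcc.h comments where the first pixel sits in the most significant 
bits of the byte. rx_get_pixel and rx_to_gray8 are hypothetical names, 
not the v6 code:

```c
#include <assert.h>
#include <stdint.h>

/* Extract pixel @index (0 is the leftmost pixel) from one packed byte. */
static uint8_t rx_get_pixel(uint8_t byte, unsigned int bits_per_pixel, unsigned int index)
{
	assert(bits_per_pixel == 1 || bits_per_pixel == 2 || bits_per_pixel == 4);
	assert(index < 8 / bits_per_pixel);

	unsigned int mask = (1u << bits_per_pixel) - 1;
	/* Pixel 0 occupies the highest bits, so the shift shrinks as index grows. */
	unsigned int shift = 8 - bits_per_pixel * (index + 1);

	return (byte >> shift) & mask;
}

/* Scale a 1, 2 or 4 bit value to 8 bits: factors 0xff, 0x55 and 0x11. */
static uint8_t rx_to_gray8(uint8_t value, unsigned int bits_per_pixel)
{
	return value * (0xff / ((1u << bits_per_pixel) - 1));
}
```

For example, the R2 byte 0xb4 (binary 10 11 01 00) decodes to the pixels 
2, 3, 1, 0 from left to right. A non-uniform pattern like this one is 
also what lets a test catch a reversed bit order.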

> That how I would understand the comments in drm_fourcc.h.
> 
> Again a reason to avoid a solid color fill in IGT.

Yes, but in this case I wrote the IGT test too... So "wrong + wrong = 
test SUCCESS" :)

There are some patterns in IGT, but they are only for "color". None of 
them are available for "black and white"/"gray" formats.

> > +
> > +			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
> > +
> > +			bit_offset += step_bit_offset;
> > +			if (bit_offset < 0 || 8 <= bit_offset) {
> > +				bit_offset = restart_bit_offset;
> > +				src_pixels += step;
> > +			}
> > +			out_pixel += 1;
> > +		}
> > +	} else if (direction == READ_TOP_TO_BOTTOM || direction == READ_BOTTOM_TO_TOP) {
> > +		while (out_pixel < end) {
> > +			u8 val = (*src_pixels & (mask << bit_offset)) >> bit_offset;
> > +			*out_pixel = argb_u16_from_gray8(val * lum_per_level);
> > +			src_pixels += step;
> > +			out_pixel += 1;
> > +		}
> > +	}
> > +}
> > +
> > +static void R1_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			 int y_start, enum pixel_read_direction direction, int count,
> > +			 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 1, 0xFF);
> > +}
> > +
> > +static void R2_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			 int y_start, enum pixel_read_direction direction, int count,
> > +			 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 2, 0x55);
> > +}
> > +
> > +static void R4_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			 int y_start, enum pixel_read_direction direction, int count,
> > +			 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	Rx_read_line(plane, x_start, y_start, direction, count, out_pixel, 4, 0x11);
> > +}
> > +
> > +static void R8_read_line(const struct vkms_plane_state *plane, int x_start,
> > +			 int y_start, enum pixel_read_direction direction, int count,
> > +			 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	struct pixel_argb_u16 *end = out_pixel + count;
> > +	u8 *src_pixels;
> > +	int rem_x, rem_y;
> > +	int step = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &src_pixels, &rem_x, &rem_y);
> 
> Assert on block size?

Fixed in v6.

> > +
> > +	while (out_pixel < end) {
> > +		*out_pixel = argb_u16_from_gray8(*src_pixels);
> > +		src_pixels += step;
> > +		out_pixel += 1;
> > +	}
> > +}
> > +
> >  static void ARGB8888_read_line(const struct vkms_plane_state *plane, int x_start, int y_start,
> >  			       enum pixel_read_direction direction, int count,
> >  			       struct pixel_argb_u16 out_pixel[])
> > @@ -582,6 +670,14 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  	case DRM_FORMAT_YVU422:
> >  	case DRM_FORMAT_YVU444:
> >  		return &planar_yuv_read_line;
> > +	case DRM_FORMAT_R1:
> > +		return &R1_read_line;
> > +	case DRM_FORMAT_R2:
> > +		return &R2_read_line;
> > +	case DRM_FORMAT_R4:
> > +		return &R4_read_line;
> > +	case DRM_FORMAT_R8:
> > +		return &R8_read_line;
> >  	default:
> >  		/*
> >  		 * This is a bug in vkms_plane_atomic_check. All the supported
> > @@ -855,6 +951,10 @@ get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> >  	case DRM_FORMAT_ARGB16161616:
> >  	case DRM_FORMAT_XRGB16161616:
> >  	case DRM_FORMAT_RGB565:
> > +	case DRM_FORMAT_R1:
> > +	case DRM_FORMAT_R2:
> > +	case DRM_FORMAT_R4:
> > +	case DRM_FORMAT_R8:
> >  		/*
> >  		 * Those formats are supported, but they don't need a conversion matrix. Return
> 
> It is strange that you need to list irrelevant formats here.

It is not needed anymore.

Thanks for your review,
Louis Chauvet

> 
> >  		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index e21cc92cf497..dc9d62acf350 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -29,7 +29,11 @@ static const u32 vkms_formats[] = {
> >  	DRM_FORMAT_YUV444,
> >  	DRM_FORMAT_YVU420,
> >  	DRM_FORMAT_YVU422,
> > -	DRM_FORMAT_YVU444
> > +	DRM_FORMAT_YVU444,
> > +	DRM_FORMAT_R1,
> > +	DRM_FORMAT_R2,
> > +	DRM_FORMAT_R4,
> > +	DRM_FORMAT_R8
> >  };
> >  
> >  static struct drm_plane_state *
> > 
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-04-08  7:50         ` Louis Chauvet
@ 2024-04-09  7:35           ` Pekka Paalanen
  2024-04-09 10:06             ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-04-09  7:35 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Mon, 8 Apr 2024 09:50:18 +0200
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Le 27/03/24 - 14:16, Pekka Paalanen a écrit :
> > On Tue, 26 Mar 2024 16:57:00 +0100
> > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> >   
> > > Le 25/03/24 - 15:11, Pekka Paalanen a écrit :  
> > > > On Wed, 13 Mar 2024 18:45:03 +0100
> > > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > > >     
> > > > > The pixel_read_direction enum is useful to describe the reading direction
> > > > > in a plane. It avoids using the rotation property of DRM, which is not
> > > > > practical for determining the reading direction.
> > > > > This patch also introduces two helpers, one to compute the
> > > > > pixel_read_direction from the DRM rotation property, and one to compute
> > > > > the step, in bytes, between two successive pixels in a specific direction.
> > > > > 
> > > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > > ---
> > > > >  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> > > > >  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> > > > >  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> > > > >  3 files changed, 77 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > index 9254086f23ff..989bcf59f375 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c

> > > > I hope IGT uses FB patterns instead of solid color in its tests of
> > > > rotation to be able to detect the difference.    
> > > 
> > > They use solid colors, and even my new rotation test [3] uses solid colors.  
> > 
> > That will completely fail to detect rotation and reflection bugs then.
> > E.g. userspace asks for 180-degree rotation, and the driver does not
> > rotate at all. Or rotate-180 getting confused with one reflection.  
> 
> I think I misunderstood what you meant by "solid colors".
> 
> The tests use a plane with multiple solid colors:
> 
> +-------+-------+
> | White | Red   |
> +-------+-------+
> | Blue  | Green |
> +-------+-------+
> 
> But they don't use gradients because of YUV.
>

Oh, that works. No worries then.

> > > It is mainly for yuv formats with subsampling: if you have formats with 
> > > subsampling, a "software rotated buffer" and a "hardware rotated buffer" 
> > > will not apply the same subsampling, so the colors will be slightly 
> > > different.  
> > 
> > Why would they not use the same subsampling?  
> 
> In YUV422, for each pair of pixels along a horizontal line, the U and V 
> components are shared between those two pixels. However, along a vertical 
> line, each pixel has its own U and V components.
> 
> When you rotate an image by 90 degrees:
>  - Hardware Rotation: If you use hardware rotation, the YUV subsampling 
>    axis will align with what was previously the "White-Red" axis. The 
>    hardware will handle the rotation.
>  - Software Rotation: If you use software rotation, the YUV subsampling 
>    axis will align with what was previously the "Red-Green" axis.

That would be a bug in the software rotation.

> Because the subsampling compression axis changes depending on whether 
> you're using hardware or software rotation, the compression effect on 
> colors will differ. Specifically:
>  - With hardware rotation, a gradient along the "White-Red" axis may be 
>    compressed (i.e. the same UV component for multiple pixels along the 
>    gradient).
>  - With software rotation, the same gradient will not be compressed (i.e. each 
>    different color in the gradient has a dedicated UV component).
> 
> The same reasoning also applies to "color borders", and my series [3] avoids 
> this issue by choosing the right number of pixels.

What is [3]?

I've used similar tactics in the Weston test suite, when I have no
implementation for chroma siting: the input and reference images
consist of 2x2 equal color pixel groups, so that chroma siting makes no
difference. When chroma siting will be implemented, the tests will be
extended.

Is there a TODO item to fix the software rotation bug and make the
tests more sensitive?

I think documenting this would be an ok intermediate solution.
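
The ordering discussed in this thread (resolve the subsampling in the
framebuffer's natural orientation first, rotate second) can be sketched in
userspace C. This is only an illustration under stated assumptions: a planar
4:2:2 layout with one U and one V sample per horizontal pair of pixels, a
90-degree counter-clockwise rotation, and hypothetical names (yuv422_fb,
fb_sample, rotated_90_sample) that are not part of vkms:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical planar YUV 4:2:2 framebuffer, natural orientation. */
struct yuv422_fb {
	int width, height;   /* luma size; width is assumed to be even */
	const uint8_t *y;    /* width * height samples */
	const uint8_t *u;    /* (width / 2) * height samples */
	const uint8_t *v;    /* (width / 2) * height samples */
};

struct yuv {
	uint8_t y, u, v;
};

/* Sample data: a 2x2 buffer with one chroma pair per row. */
static const uint8_t example_y[] = { 1, 2, 3, 4 };
static const uint8_t example_u[] = { 10, 20 };
static const uint8_t example_v[] = { 30, 40 };
static const struct yuv422_fb example_fb = {
	2, 2, example_y, example_u, example_v
};

/* Natural-orientation lookup: chroma is shared by horizontal pairs. */
static struct yuv fb_sample(const struct yuv422_fb *fb, int x, int y)
{
	assert(x >= 0 && x < fb->width && y >= 0 && y < fb->height);

	int chroma = y * (fb->width / 2) + x / 2;

	return (struct yuv){ fb->y[y * fb->width + x],
			     fb->u[chroma], fb->v[chroma] };
}

/*
 * Pixel (dx, dy) of the image rotated by 90 degrees counter-clockwise.
 * The destination coordinate is mapped back to the source first, and the
 * subsampling is applied there, so the rotation never changes which
 * chroma sample a given source pixel uses.
 */
static struct yuv rotated_90_sample(const struct yuv422_fb *fb, int dx, int dy)
{
	int sx = fb->width - 1 - dy;
	int sy = dx;

	return fb_sample(fb, sx, sy);
}
```

With the sample data, the output pixels (0, 0) and (0, 1) come from the same
source row, so they share one chroma pair even though they are vertical
neighbours after rotation.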

> > The framebuffer contents are defined in its natural orientation, and
> > the subsampling applies in the natural orientation. If such a FB
> > is on a rotated plane, one must account for subsampling first, and
> > rotate second. 90-degree rotation does not change the encoded color.
> > 
> > Getting the subsampling exactly right is going to be necessary sooner
> > or later. There is no UAPI for setting chroma siting yet, but ideally
> > there should be.
> >   
> > > > The return values do seem correct to me, assuming I have guessed
> > > > correctly what "X" and "Y" refer to when combined with rotation. I did
> > > > not find good documentation about that.    
> > > 
> > > Yes, it is difficult to understand how rotation and reflection should 
> > > work in drm. I spent half a day testing all the combinations in the drm_rect_* 
> > > helpers to understand how this works. According to the code:
> > > - With only rotation or only reflection, it behaves as expected
> > > - If reflection and rotation are mixed, the source buffer is first 
> > >   reflected and then rotated.  
> > 
> > Now that you know, you could send a documentation patch. :-)  
> 
> And now I'm not sure about it :)

You'll have people review the patch and confirm your understanding or
point out a mistake. A doc patch is easier to notice and jump in than
this series.

> I was running the tests on my v6, and for the first time ran my new 
> rotation tests [3] on the previous VKMS code. None of the tests for 
> ROT_90+reflection and ROT_270+reflection are passing...
> 
> So, either the previous vkms implementation was wrong, or mine is wrong :)
> 
> So, if a DRM expert could explain this, it would be nice.
> 
> To have a common example, if I take the same buffer as above 
> (white+red+blue+green), if I create a plane with rotation = 
> ROTATION_90 | REFLECTION_X, what is the expected result?
> 
> 1 - rotation then reflection 
> 
> +-------+-------+
> | Green | Red   |
> +-------+-------+
> | Blue  | White |
> +-------+-------+
> 
> 2 - reflection then rotation (my vkms implementation)
> 
> +-------+-------+
> | White | Blue  |
> +-------+-------+
> | Red   | Green |
> +-------+-------+
> 

I wish I knew. :-)
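
For reference, the two candidate results quoted above can be reproduced with
a small userspace sketch. It assumes that ROTATION_90 means a 90-degree
counter-clockwise rotation and that REFLECTION_X mirrors the image
horizontally; the type and function names are hypothetical and are not DRM or
vkms helpers. It only shows that the two orderings differ, not which one DRM
expects:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 2

struct grid {
	char cell[N][N]; /* cell[row][column], row 0 is the top row */
};

/* The buffer from the thread: White Red / Blue Green. */
static const struct grid thread_example = { { { 'W', 'R' }, { 'B', 'G' } } };

/* Mirror horizontally: each row is read right to left. */
static struct grid reflect_x(struct grid src)
{
	struct grid dst;

	for (int y = 0; y < N; y++)
		for (int x = 0; x < N; x++)
			dst.cell[y][x] = src.cell[y][N - 1 - x];
	return dst;
}

/* Rotate 90 degrees counter-clockwise: the right column becomes the top row. */
static struct grid rotate_90_ccw(struct grid src)
{
	struct grid dst;

	for (int y = 0; y < N; y++)
		for (int x = 0; x < N; x++)
			dst.cell[y][x] = src.cell[x][N - 1 - y];
	return dst;
}

static struct grid rotate_then_reflect(struct grid src)
{
	return reflect_x(rotate_90_ccw(src));
}

static struct grid reflect_then_rotate(struct grid src)
{
	return rotate_90_ccw(reflect_x(src));
}

/* Compare a grid with its cells written row by row, e.g. "WRBG". */
static int grid_matches(struct grid g, const char *cells)
{
	assert(strlen(cells) == (size_t)(N * N));

	for (int y = 0; y < N; y++)
		for (int x = 0; x < N; x++)
			if (g.cell[y][x] != cells[y * N + x])
				return 0;
	return 1;
}
```

Under these assumptions, rotate_then_reflect(thread_example) gives
Green Red / Blue White (case 1) and reflect_then_rotate(thread_example) gives
White Blue / Red Green (case 2).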

Thanks,
pq


> > For me as a userspace developer, the important place is
> > https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties
> >   


* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-04-08  7:50     ` Louis Chauvet
@ 2024-04-09  7:58       ` Pekka Paalanen
  2024-04-09 10:06         ` Louis Chauvet
  0 siblings, 1 reply; 75+ messages in thread
From: Pekka Paalanen @ 2024-04-09  7:58 UTC (permalink / raw)
  To: Louis Chauvet
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee


On Mon, 8 Apr 2024 09:50:19 +0200
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> > On Wed, 13 Mar 2024 18:45:05 +0100
> > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> >   
> > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > 
> > > Add support for the YUV formats below:
> > > 
> > > - NV12/NV16/NV24
> > > - NV21/NV61/NV42
> > > - YUV420/YUV422/YUV444
> > > - YVU420/YVU422/YVU444
> > > 
> > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > 32.32 floats and the drm_fixed helpers.  
> > 
> > You mean fixed-point, not floating-point (floats).
> >   
> > > 
> > > To do the conversion, a specific matrix must be used for each color range
> > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > encoding and range were obtained by rounding the values of the original
> > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > floating point operations.
> > > 
> > > The same reading function is used for YUV and YVU formats. As the only
> > > difference between those two categories of formats is the order of the fields, a
> > > simple swap in conversion matrix columns allows using the same function.  
> > 
> > Sounds good!
> >   
> > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > [Louis Chauvet:
> > > - Adapted Arthur's work
> > > - Implemented the read_line_t callbacks for yuv
> > > - add struct conversion_matrix
> > > - remove struct pixel_yuv_u8
> > > - update the commit message
> > > - Merge the modifications from Arthur]
> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > >  4 files changed, 473 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > index 23e1d247468d..f3116084de5a 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h

...

> > > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > > +						  struct conversion_matrix *matrix)  
> > 
> > If you are using the "swap the matrix columns" trick, then you cannot
> > call these cb, cr nor even u,v, because they might be the opposite.
> > They are simply the first and second chroma channel, and their meaning
> > depends on the given matrix.  
> 
> I will rename them for v6, channel_1 and channel_2.
> 
> > > +{
> > > +	u8 r, g, b;
> > > +	s64 fp_y, fp_cb, fp_cr;
> > > +	s64 fp_r, fp_g, fp_b;
> > > +
> > > +	fp_y = y - matrix->y_offset;
> > > +	fp_cb = cb - 128;
> > > +	fp_cr = cr - 128;  
> > 
> > This looks like an incorrect way to convert u8 to fixed-point, but...
> >  
> > > +
> > > +	fp_y = drm_int2fixp(fp_y);
> > > +	fp_cb = drm_int2fixp(fp_cb);
> > > +	fp_cr = drm_int2fixp(fp_cr);  
> > 
> > I find it confusing to re-purpose variables like this.
> > 
> > I'd do just
> > 
> > 	fp_c1 = drm_int2fixp((int)c1 - 128);  
> 
> I agree with this remark, I will change it for the v6.
> 
> > If the function arguments were int to begin with, then the cast would
> > be obviously unnecessary.  
> 
> For this I'm less sure. The name of the function and the usage are 
> explicit: we want to use u8 as input. As we manipulate pointers in 
> read_line, I don't know how it will work if the pointer is dereferenced 
> to an int instead of a u8.

Dereference operator acts on its input type. What happens to the result
is irrelevant.

If we have

u8 *p = ...;

void foo(int x);

then you can call

foo(*p);

if that was your question. Dereference acts on u8* which results in u8.
Then it gets implicitly converted to int.

However, you have a semantic reason to keep the argument as u8, and
that is fine.
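
A minimal userspace sketch of the point above, assuming u8 is a typedef for
uint8_t and using hypothetical helper names (center_chroma,
center_chroma_at) that do not come from the patch. The dereference yields a
u8, which is then converted to int at the call, so the subtraction can go
negative without wrapping:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8; /* userspace stand-in for the kernel's u8 */

/* Hypothetical helper taking an int, like foo() above. */
static int center_chroma(int c)
{
	return c - 128;
}

/* Reads one byte through a u8 pointer and passes it to an int parameter. */
static int center_chroma_at(const u8 *p)
{
	assert(p != NULL);

	/* *p has type u8; it is implicitly converted to int for the call. */
	return center_chroma(*p);
}
```

For example, a byte holding 0 yields -128 and a byte holding 255 yields 127.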

> > So, what you have in fp variables at this point is fractional numbers
> > in the 8-bit integer scale. However, because the target format is
> > 16-bit, you should not throw the extra precision away here. Instead,
> > multiply by 257 to bring the values to 16-bit scale, and do the RGB
> > clamping to 16-bit, not 8-bit.
> >   
> > > +
> > > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > > +
> > > +	fp_r = drm_fixp2int_round(fp_r);
> > > +	fp_g = drm_fixp2int_round(fp_g);
> > > +	fp_b = drm_fixp2int_round(fp_b);
> > > +
> > > +	r = clamp(fp_r, 0, 0xff);
> > > +	g = clamp(fp_g, 0, 0xff);
> > > +	b = clamp(fp_b, 0, 0xff);
> > > +
> > > +	return argb_u16_from_u8888(255, r, g, b);  
> > 
> > Going through argb_u16_from_u8888() will throw away precision.  
> 
> I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
> v6, please let me know.
> 
> > > +}
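
The precision remark quoted above (scale to 16 bits by multiplying by 257 and
clamp to 16 bits instead of 8) can be sketched in userspace C. This is not the
v6 code: the struct and function names are hypothetical, plain int64_t
arithmetic stands in for the drm_fixed helpers, the coefficients are computed
with floating point at compile time for brevity, and the BT.601 full-range
matrix is used purely as sample data:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXP_SHIFT 32
#define FIXP_HALF  ((int64_t)1 << (FIXP_SHIFT - 1))
/* Round a real coefficient to 32.32 fixed point (compile-time only). */
#define FIXP(x) ((int64_t)((x) * 4294967296.0 + ((x) < 0 ? -0.5 : 0.5)))

/* Hypothetical stand-in for the conversion matrix of the patch. */
struct conv_matrix {
	int64_t m[3][3]; /* 32.32 fixed-point coefficients */
	int y_offset;    /* luma offset on the 8-bit scale */
};

struct rgb16 {
	uint16_t r, g, b;
};

/* Sample data: BT.601 full-range YCbCr to RGB. */
static const struct conv_matrix bt601_full = {
	{
		{ FIXP(1.0), FIXP(0.0),       FIXP(1.402) },
		{ FIXP(1.0), FIXP(-0.344136), FIXP(-0.714136) },
		{ FIXP(1.0), FIXP(1.772),     FIXP(0.0) },
	},
	0,
};

/* Round a 32.32 value to the nearest integer and clamp it to 16 bits. */
static uint16_t fixp_round_clamp_u16(int64_t fp)
{
	if (fp <= 0)
		return 0;
	fp = (fp + FIXP_HALF) >> FIXP_SHIFT;
	return fp > 0xffff ? 0xffff : (uint16_t)fp;
}

static struct rgb16 yuv888_to_rgb16(uint8_t y, uint8_t c1, uint8_t c2,
				    const struct conv_matrix *cm)
{
	assert(cm != NULL);

	/* Bring the 8-bit inputs to the 16-bit scale before the matrix. */
	int64_t in[3] = {
		((int64_t)y - cm->y_offset) * 257,
		((int64_t)c1 - 128) * 257,
		((int64_t)c2 - 128) * 257,
	};
	int64_t out[3];

	/* Integer inputs times 32.32 coefficients give 32.32 results. */
	for (int i = 0; i < 3; i++)
		out[i] = cm->m[i][0] * in[0] + cm->m[i][1] * in[1] +
			 cm->m[i][2] * in[2];

	return (struct rgb16){ fixp_round_clamp_u16(out[0]),
			       fixp_round_clamp_u16(out[1]),
			       fixp_round_clamp_u16(out[2]) };
}
```

With this scaling, the input (128, 128, 129) yields a red value of 33256,
whereas rounding to 8 bits first and then expanding (129 * 257) would give
33153.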
> > > +
> > >  /*
> > >   * The following functions are read_line function for each pixel format supported by VKMS.
> > >   *
> > > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > >  	}
> > >  }
> > >  
> > > +/*
> > > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > > + * (column inversion)  
> > 
> > Would be nice to explain what semi_planar_yuv means, so that the
> > documentation for these functions would show how they differ rather
> > than all saying exactly the same thing.  
> 
>  /* This callback can be used for YUV format where each color component is 
>   * stored in a different plane (often called planar formats). It will 
>   * handle correctly subsampling.
> 
>  /*
>   * This callback can be used for YUV formats where U and V values are 
>   * stored in the same plane (often called semi-planar formats). It will 
>   * correctly handle subsampling.
>   * 
>   * The conversion matrix stored in the @plane is used to:
>   * - Apply the correct color range and encoding
>   * - Convert YUV and YVU with the same function (a simple column swap is 
>   *   needed)
>   */

Sounds good. I'd just drop the "It will handle correctly subsampling."
because all code is supposed to be correct by default.

If there is a function that intentionally overlooks something, that
certainly should be documented.


Thanks,
pq


* Re: [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum
  2024-04-09  7:35           ` Pekka Paalanen
@ 2024-04-09 10:06             ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-09 10:06 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 09/04/24 - 10:35, Pekka Paalanen a écrit :
> On Mon, 8 Apr 2024 09:50:18 +0200
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 27/03/24 - 14:16, Pekka Paalanen a écrit :
> > > On Tue, 26 Mar 2024 16:57:00 +0100
> > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >   
> > > > Le 25/03/24 - 15:11, Pekka Paalanen a écrit :  
> > > > > On Wed, 13 Mar 2024 18:45:03 +0100
> > > > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > > > >     
> > > > > > The pixel_read_direction enum is useful to describe the reading direction
> > > > > > in a plane. It avoids using the rotation property of DRM, which is not
> > > > > > practical for determining the reading direction.
> > > > > > This patch also introduces two helpers, one to compute the
> > > > > > pixel_read_direction from the DRM rotation property, and one to compute
> > > > > > the step, in bytes, between two successive pixels in a specific direction.
> > > > > > 
> > > > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/vkms/vkms_composer.c | 36 ++++++++++++++++++++++++++++++++++++
> > > > > >  drivers/gpu/drm/vkms/vkms_drv.h      | 11 +++++++++++
> > > > > >  drivers/gpu/drm/vkms/vkms_formats.c  | 30 ++++++++++++++++++++++++++++++
> > > > > >  3 files changed, 77 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > > index 9254086f23ff..989bcf59f375 100644
> > > > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> 
> > > > > I hope IGT uses FB patterns instead of solid color in its tests of
> > > > > rotation to be able to detect the difference.    
> > > > 
> > > > They use solid colors, and even my new rotation test [3] uses solid colors.  
> > > 
> > > That will completely fail to detect rotation and reflection bugs then.
> > > E.g. userspace asks for 180-degree rotation, and the driver does not
> > > rotate at all. Or rotate-180 getting confused with one reflection.  
> > 
> > I think I misunderstood what you meant by "solid colors".
> > 
> > The tests use a plane with multiple solid colors:
> > 
> > +-------+-------+
> > | White | Red   |
> > +-------+-------+
> > | Blue  | Green |
> > +-------+-------+
> > 
> > But they don't use gradients because of YUV.
> >
> 
> Oh, that works. No worries then.
> 
> > > > It is mainly for yuv formats with subsampling: if you have formats with 
> > > > subsampling, a "software rotated buffer" and a "hardware rotated buffer" 
> > > > will not apply the same subsampling, so the colors will be slightly 
> > > > different.  
> > > 
> > > Why would they not use the same subsampling?  
> > 
> > In YUV422, for each pair of pixels along a horizontal line, the U and V 
> > components are shared between those two pixels. However, along a vertical 
> > line, each pixel has its own U and V components.
> > 
> > When you rotate an image by 90 degrees:
> >  - Hardware Rotation: If you use hardware rotation, the YUV subsampling 
> >    axis will align with what was previously the "White-Red" axis. The 
> >    hardware will handle the rotation.
> >  - Software Rotation: If you use software rotation, the YUV subsampling 
> >    axis will align with what was previously the "Red-Green" axis.
> 
> That would be a bug in the software rotation.

Yes, but I think it is very complex to fix, so I did not choose 
this path :)
 
> > Because the subsampling compression axis changes depending on whether 
> > you're using hardware or software rotation, the compression effect on 
> > colors will differ. Specifically:
> >  - With hardware rotation, a gradient along the "White-Red" axis may be 
> >    compressed (i.e. the same UV component for multiple pixels along the 
> >    gradient).
> >  - With software rotation, the same gradient will not be compressed (i.e. each 
> >    different color in the gradient has a dedicated UV component).
> > 
> > The same reasoning also applies to "color borders", and my series [3] avoids 
> > this issue by choosing the right number of pixels.
> 
> What is [3]?

I don't know why I put [3] here; I probably mixed up references between mails.

[3]: https://lore.kernel.org/all/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com/
 
> I've used similar tactics in the Weston test suite, when I have no
> implementation for chroma siting: the input and reference images
> consist of 2x2 equal color pixel groups, so that chroma siting makes no
> difference. When chroma siting will be implemented, the tests will be
> extended.
> 
> Is there a TODO item to fix the software rotation bug and make the
> tests more sensitive?
> 
> I think documenting this would be an ok intermediate solution.
> 
> > > The framebuffer contents are defined in its natural orientation, and
> > > the subsampling applies in the natural orientation. If such a FB
> > > is on a rotated plane, one must account for subsampling first, and
> > > rotate second. 90-degree rotation does not change the encoded color.
> > > 
> > > Getting the subsampling exactly right is going to be necessary sooner
> > > or later. There is no UAPI for setting chroma siting yet, but ideally
> > > there should be.
> > >   
> > > > > The return values do seem correct to me, assuming I have guessed
> > > > > correctly what "X" and "Y" refer to when combined with rotation. I did
> > > > > not find good documentation about that.    
> > > > 
> > > > Yes, it is difficult to understand how rotation and reflection should 
> > > > work in drm. I spent half a day testing all the combinations in the drm_rect_* 
> > > > helpers to understand how this works. According to the code:
> > > > - With only rotation or only reflection, it behaves as expected
> > > > - If reflection and rotation are mixed, the source buffer is first 
> > > >   reflected and then rotated.  
> > > 
> > > Now that you know, you could send a documentation patch. :-)  
> > 
> > And now I'm not sure about it :)
> 
> You'll have people review the patch and confirm your understanding or
> point out a mistake. A doc patch is easier to notice and jump in than
> this series.

I just sent it [4]; you are in Cc.

[4]: https://lore.kernel.org/all/20240409-google-drm-doc-v1-0-033d55cc8250@bootlin.com/

> > I was running the tests on my v6, and for the first time ran my new 
> > rotation tests [3] on the previous VKMS code. None of the tests for 
> > ROT_90+reflection and ROT_270+reflection are passing...
> > 
> > So, either the previous vkms implementation was wrong, or mine is wrong :)
> > 
> > So, if a DRM expert could explain this, it would be nice.
> > 
> > To have a common example, if I take the same buffer as above 
> > (white+red+blue+green), if I create a plane with rotation = 
> > ROTATION_90 | REFLECTION_X, what is the expected result?
> > 
> > 1 - rotation then reflection 
> > 
> > +-------+-------+
> > | Green | Red   |
> > +-------+-------+
> > | Blue  | White |
> > +-------+-------+
> > 
> > 2 - reflection then rotation (my vkms implementation)
> > 
> > +-------+-------+
> > | White | Blue  |
> > +-------+-------+
> > | Red   | Green |
> > +-------+-------+
> > 
> 
> I wish I knew. :-)
> 
> Thanks,
> pq
> 
> 
> > > For me as a userspace developer, the important place is
> > > https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#standard-plane-properties
> > >   




* Re: [PATCH v5 11/16] drm/vkms: Add YUV support
  2024-04-09  7:58       ` Pekka Paalanen
@ 2024-04-09 10:06         ` Louis Chauvet
  0 siblings, 0 replies; 75+ messages in thread
From: Louis Chauvet @ 2024-04-09 10:06 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Rodrigo Siqueira, Melissa Wen, Maíra Canal, Haneen Mohammed,
	Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, arthurgrillo, Jonathan Corbet,
	dri-devel, linux-kernel, jeremie.dautheribes, miquel.raynal,
	thomas.petazzoni, seanpaul, marcheu, nicolejadeyee

Le 09/04/24 - 10:58, Pekka Paalanen a écrit :
> On Mon, 8 Apr 2024 09:50:19 +0200
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> > > On Wed, 13 Mar 2024 18:45:05 +0100
> > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >   
> > > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > > 
> > > > Add support for the YUV formats below:
> > > > 
> > > > - NV12/NV16/NV24
> > > > - NV21/NV61/NV42
> > > > - YUV420/YUV422/YUV444
> > > > - YVU420/YVU422/YVU444
> > > > 
> > > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > > 32.32 floats and the drm_fixed helpers.  
> > > 
> > > You mean fixed-point, not floating-point (floats).
> > >   
> > > > 
> > > > To do the conversion, a specific matrix must be used for each color range
> > > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > > encoding and range were obtained by rounding the values of the original
> > > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > > floating point operations.
> > > > 
> > > > The same reading function is used for YUV and YVU formats. As the only
> > > > difference between those two categories of formats is the order of the fields, a
> > > > simple swap in conversion matrix columns allows using the same function.  
> > > 
> > > Sounds good!
> > >   
> > > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > > [Louis Chauvet:
> > > > - Adapted Arthur's work
> > > > - Implemented the read_line_t callbacks for yuv
> > > > - add struct conversion_matrix
> > > > - remove struct pixel_yuv_u8
> > > > - update the commit message
> > > > - Merge the modifications from Arthur]
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > > >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > > >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > > >  4 files changed, 473 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > index 23e1d247468d..f3116084de5a 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> 
> ...
> 
> > > > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > > > +						  struct conversion_matrix *matrix)  
> > > 
> > > If you are using the "swap the matrix columns" trick, then you cannot
> > > call these cb, cr nor even u,v, because they might be the opposite.
> > > They are simply the first and second chroma channel, and their meaning
> > > depends on the given matrix.  
> > 
> > I will rename them for v6, channel_1 and channel_2.
> > 
> > > > +{
> > > > +	u8 r, g, b;
> > > > +	s64 fp_y, fp_cb, fp_cr;
> > > > +	s64 fp_r, fp_g, fp_b;
> > > > +
> > > > +	fp_y = y - matrix->y_offset;
> > > > +	fp_cb = cb - 128;
> > > > +	fp_cr = cr - 128;  
> > > 
> > > This looks like an incorrect way to convert u8 to fixed-point, but...
> > >  
> > > > +
> > > > +	fp_y = drm_int2fixp(fp_y);
> > > > +	fp_cb = drm_int2fixp(fp_cb);
> > > > +	fp_cr = drm_int2fixp(fp_cr);  
> > > 
> > > I find it confusing to re-purpose variables like this.
> > > 
> > > I'd do just
> > > 
> > > 	fp_c1 = drm_int2fixp((int)c1 - 128);  
> > 
> > I agree with this remark, I will change it for the v6.
> > 
> > > If the function arguments were int to begin with, then the cast would
> > > be obviously unnecessary.  
> > 
> > For this I'm less sure. The name of the function and the usage are 
> > explicit: we want to use u8 as input. As we manipulate pointers in 
> > read_line, I don't know how it will work if the pointer is dereferenced 
> > to an int instead of a u8.
> 
> Dereference operator acts on its input type. What happens to the result
> is irrelevant.
> 
> If we have
> 
> u8 *p = ...;
> 
> void foo(int x);
> 
> then you can call
> 
> foo(*p);
> 
> if that was your question. Dereference acts on u8* which results in u8.
> Then it gets implicitly converted to int.

Thanks for the clear explanation!
 
> However, you have a semantic reason to keep the argument as u8, and
> that is fine.

So I will keep u8 for the v6.

> > > So, what you have in fp variables at this point is fractional numbers
> > > in the 8-bit integer scale. However, because the target format is
> > > 16-bit, you should not throw the extra precision away here. Instead,
> > > multiply by 257 to bring the values to 16-bit scale, and do the RGB
> > > clamping to 16-bit, not 8-bit.
> > >   
> > > > +
> > > > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > > > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > > > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > > > +
> > > > +	fp_r = drm_fixp2int_round(fp_r);
> > > > +	fp_g = drm_fixp2int_round(fp_g);
> > > > +	fp_b = drm_fixp2int_round(fp_b);
> > > > +
> > > > +	r = clamp(fp_r, 0, 0xff);
> > > > +	g = clamp(fp_g, 0, 0xff);
> > > > +	b = clamp(fp_b, 0, 0xff);
> > > > +
> > > > +	return argb_u16_from_u8888(255, r, g, b);  
> > > 
> > > Going through argb_u16_from_u8888() will throw away precision.  
> > 
> > I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
> > v6, please let me know.
> > 
> > > > +}
> > > > +
> > > >  /*
> > > >   * The following functions are read_line function for each pixel format supported by VKMS.
> > > >   *
> > > > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > > >  	}
> > > >  }
> > > >  
> > > > +/*
> > > > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > > > + * (column inversion)  
> > > 
> > > Would be nice to explain what semi_planar_yuv means, so that the
> > > documentation for these functions would show how they differ rather
> > > than all saying exactly the same thing.  
> > 
> >  /* This callback can be used for YUV format where each color component is 
> >   * stored in a different plane (often called planar formats). It will 
> >   * handle correctly subsampling.
> > 
> >  /*
> >   * This callback can be used for YUV formats where U and V values are 
> >   * stored in the same plane (often called semi-planar formats). It will 
> >   * correctly handle subsampling.
> >   * 
> >   * The conversion matrix stored in the @plane is used to:
> >   * - Apply the correct color range and encoding
> >   * - Convert YUV and YVU with the same function (a simple column swap is 
> >   *   needed)
> >   */
> 
> Sounds good. I'd just drop the "It will handle correctly subsampling."
> because all code is supposed to be correct by default.

Will do for the v6.

Thanks,
Louis Chauvet
 
> If there is a function that intentionally overlooks something, that
> certainly should be documented.
> 
> 
> Thanks,
> pq



-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


end of thread, other threads:[~2024-04-09 10:06 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-13 17:44 [PATCH v5 00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading Louis Chauvet
2024-03-13 17:44 ` [PATCH v5 01/16] drm/vkms: Code formatting Louis Chauvet
2024-03-25 12:03   ` Pekka Paalanen
2024-03-25 13:13   ` Maíra Canal
2024-03-13 17:44 ` [PATCH v5 02/16] drm/vkms: Use drm_frame directly Louis Chauvet
2024-03-25 12:04   ` Pekka Paalanen
2024-03-25 13:20   ` Maíra Canal
2024-03-26 15:56     ` Louis Chauvet
2024-03-13 17:44 ` [PATCH v5 03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions Louis Chauvet
2024-03-13 19:02   ` Randy Dunlap
2024-03-25 13:32   ` Maíra Canal
2024-03-26 15:56     ` Louis Chauvet
2024-03-13 17:44 ` [PATCH v5 04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions Louis Chauvet
2024-03-25 12:04   ` Pekka Paalanen
2024-03-26 15:56     ` Louis Chauvet
2024-03-25 13:56   ` Maíra Canal
2024-03-26 15:56     ` Louis Chauvet
2024-03-27 15:03       ` Maíra Canal
2024-03-13 17:44 ` [PATCH v5 05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers Louis Chauvet
2024-03-13 19:08   ` Randy Dunlap
2024-03-25 12:05   ` Pekka Paalanen
2024-03-26 15:56     ` Louis Chauvet
2024-03-25 13:59   ` Maíra Canal
2024-03-26 15:56     ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions Louis Chauvet
2024-03-25 12:05   ` Pekka Paalanen
2024-03-25 14:00   ` Maíra Canal
2024-03-13 17:45 ` [PATCH v5 07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats Louis Chauvet
2024-03-25 12:40   ` Pekka Paalanen
2024-03-26 15:56     ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend Louis Chauvet
2024-03-25 12:41   ` Pekka Paalanen
2024-03-26 15:57     ` Louis Chauvet
2024-03-27 11:48       ` Pekka Paalanen
2024-04-08  7:50         ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 09/16] drm/vkms: Introduce pixel_read_direction enum Louis Chauvet
2024-03-25 13:11   ` Pekka Paalanen
2024-03-26 15:57     ` Louis Chauvet
2024-03-27 12:16       ` Pekka Paalanen
2024-04-08  7:50         ` Louis Chauvet
2024-04-09  7:35           ` Pekka Paalanen
2024-04-09 10:06             ` Louis Chauvet
2024-03-25 14:07   ` Maíra Canal
2024-03-26 15:57     ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 10/16] drm/vkms: Re-introduce line-per-line composition algorithm Louis Chauvet
2024-03-25 14:15   ` Maíra Canal
2024-03-25 14:56     ` Pekka Paalanen
2024-03-26 15:57     ` Louis Chauvet
2024-03-25 15:43   ` Pekka Paalanen
2024-03-26 15:57     ` Louis Chauvet
2024-03-27 12:29       ` Pekka Paalanen
2024-04-08  7:50         ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 11/16] drm/vkms: Add YUV support Louis Chauvet
2024-03-13 19:20   ` Randy Dunlap
2024-03-14 14:41     ` Louis Chauvet
2024-03-25 14:26   ` Maíra Canal
2024-03-26 15:57     ` Louis Chauvet
2024-03-27 12:59       ` Pekka Paalanen
2024-04-08  7:50         ` Louis Chauvet
2024-03-27 12:11   ` Philipp Zabel
2024-04-08  7:50     ` Louis Chauvet
2024-03-27 14:23   ` Pekka Paalanen
2024-04-08  7:50     ` Louis Chauvet
2024-04-09  7:58       ` Pekka Paalanen
2024-04-09 10:06         ` Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 12/16] drm/vkms: Add range and encoding properties to the plane Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 13/16] drm/vkms: Drop YUV formats TODO Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 14/16] drm/vkms: Create KUnit tests for YUV conversions Louis Chauvet
2024-03-25 14:34   ` Maíra Canal
2024-03-26 15:57     ` Louis Chauvet
2024-03-28 13:26     ` Pekka Paalanen
2024-03-13 17:45 ` [PATCH v5 15/16] drm/vkms: Add how to run the Kunit tests Louis Chauvet
2024-03-13 17:45 ` [PATCH v5 16/16] drm/vkms: Add support for DRM_FORMAT_R* Louis Chauvet
2024-03-28 14:00   ` Pekka Paalanen
2024-04-08  7:50     ` Louis Chauvet
