amd-gfx.lists.freedesktop.org archive mirror
* 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
@ 2021-03-19 21:03 Mario Kleiner
  2021-03-19 21:03 ` [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats Mario Kleiner
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

Hi,

this patch series adds the fourccs for 16 bit fixed point unorm
framebuffers to the core, and then an implementation for AMD GPUs
with DisplayCore.

This is intended to allow for pageflipping to, and direct scanout of,
Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
for swapchains, mapping to DRM_FORMAT_XBGR16161616:
Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4

My main motivation for this is squeezing every bit of precision
out of the hardware for scientific and medical research applications,
where fp16 in the unorm range is limited to ~11 bpc effective linear
precision in the upper half [0.5;1.0] of the unorm range, although
the hardware could do at least 12 bpc.
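
A back-of-the-envelope check of the ~11 bpc figure, as a minimal sketch:
IEEE 754 binary16 has 10 explicit mantissa bits, so values in [0.5, 1.0)
are spaced 2^-11 apart, which is coarser than a 12 bpc unorm step. The
helper names below are made up for illustration and are not part of any
driver.

```c
#include <assert.h>

/* 2^e as a double, computed without libm. */
double pow2(int e)
{
	double v = 1.0;

	while (e > 0) {
		v *= 2.0;
		e--;
	}
	while (e < 0) {
		v *= 0.5;
		e++;
	}
	return v;
}

/*
 * Spacing between adjacent binary16 values in [2^exponent, 2^(exponent+1)).
 * binary16 has 10 explicit mantissa bits. Valid for normal numbers only.
 */
double fp16_spacing(int exponent)
{
	assert(exponent >= -14 && exponent <= 15);
	return pow2(exponent - 10);
}

/* Step size of an n-bit unsigned normalized (unorm) channel. */
double unorm_step(int bits)
{
	assert(bits >= 1 && bits <= 16);
	return 1.0 / (double)((1u << bits) - 1u);
}

/* Distinct binary16 values in [0.5, 1.0]: 1024 codes below 1.0, plus 1.0. */
int fp16_codes_upper_half(void)
{
	return (int)(0.5 / fp16_spacing(-1)) + 1;
}
```

With these helpers, fp16_spacing(-1) lies between unorm_step(11) and
unorm_step(12), which is where the "~11 bpc in the upper half" estimate
comes from.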

It has been successfully tested on AMD RavenRidge (DCN-1), and with
Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
(DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
on my hw, both running at 10 bpc DP output depth.

Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
Apple Retina panel), all running at 10 bpc output depth.

No malfunctions, visual artifacts or other oddities were observed
(apart from an adventurous mess of cables and adapters on my desk),
suggesting it works.

I used my automatic photometer measurement procedure to verify the
effective output precision of 10 bpc DP native signal + spatial
dithering in the GPU as enabled by the amdgpu driver. Results show
the expected 12 bpc precision I hoped for -- the current upper limit
for AMD display hw afaik.
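
To illustrate why a 10 bpc signal plus spatial dithering can carry 12 bpc
on average, here is a minimal sketch using a generic 2x2 ordered dither.
This pattern is an assumption chosen for illustration; the thread does not
say which dithering algorithm the AMD hardware actually uses, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Generic 2x2 ordered-dither thresholds for the 2 bits that are dropped
 * when going from 12 to 10 bits. Assumed pattern, not AMD's.
 */
static const uint16_t bayer2x2[2][2] = { { 0, 2 }, { 3, 1 } };

/* Reduce a 12 bit value to 10 bits for the pixel at position (x, y). */
uint16_t dither_12_to_10(uint16_t v12, unsigned int x, unsigned int y)
{
	uint16_t base, frac, out;

	assert(v12 <= 0xfff);
	base = v12 >> 2;	/* the 10 bits that fit the signal */
	frac = v12 & 0x3;	/* the 2 bits that would be lost */
	out = base + (frac > bayer2x2[y & 1][x & 1] ? 1 : 0);
	return out > 0x3ff ? 0x3ff : out; /* clamp at the 10 bit maximum */
}

/* Sum of the four 10 bit outputs over one 2x2 tile of constant input. */
unsigned int tile_sum(uint16_t v12)
{
	unsigned int x, y, sum = 0;

	for (y = 0; y < 2; y++)
		for (x = 0; x < 2; x++)
			sum += dither_12_to_10(v12, x, y);
	return sum;
}
```

In this model the tile sum equals the 12 bit input for every value up to
4092, so the spatial average over four pixels recovers the two bits that a
single 10 bit sample cannot hold. Only the top three codes are lost to
clamping.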

So it seems to work in the way I hoped :).

Some open questions wrt. AMD DC, to be addressed in this patch series,
or follow up patches if necessary:

- For the atomic check for plane scaling, the current patch will
apply the same hw limits as for other rgb fixed point fb's, e.g.,
for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
limits, because this is also a 64 bpp format? Or something new
entirely?

- I haven't added the new fourcc to the DCC tables yet. Should I?

- I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
It looks to me as if that assert was inconsistent with other places
in the driver where COLOR_DEPTH_121212 is supported, and looking at
the code, the change seems harmless. At least on DCE-11.2 the change
didn't cause any noticeable (by myself) or measurable (by my equipment)
problems on any of the 3 connected displays.

- Related to that change, while I needed to increase lb pixelsize to 36bpp
to get > 10 bpc effective precision on DCN, I didn't need to do that
on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
to get > 10 bpc precision for fp16 framebuffers, so something seems to
behave differently for floating point 16 vs. fixed point 16. This all
seems to suggest one could leave lb pixelsize at the old 30 bpp value
on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
to avoid the changes of patch 4/5.

Thanks,
-mario


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
@ 2021-03-19 21:03 ` Mario Kleiner
  2021-03-19 21:16   ` Ville Syrjälä
  2021-03-19 21:03 ` [PATCH 2/5] drm/amd/display: Add support for SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 Mario Kleiner
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

These are 16 bits per color channel unsigned normalized formats.
They are supported by at least AMD display hw, and suitable for
direct scanout of Vulkan swapchain images in the format
VK_FORMAT_R16G16B16A16_UNORM.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/drm_fourcc.c  | 4 ++++
 include/uapi/drm/drm_fourcc.h | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
index 03262472059c..ce13d2be5d7b 100644
--- a/drivers/gpu/drm/drm_fourcc.c
+++ b/drivers/gpu/drm/drm_fourcc.c
@@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
 		{ .format = DRM_FORMAT_ARGB16161616F,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
 		{ .format = DRM_FORMAT_ABGR16161616F,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
 		{ .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
+		{ .format = DRM_FORMAT_XRGB16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
+		{ .format = DRM_FORMAT_XBGR16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
+		{ .format = DRM_FORMAT_ARGB16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
+		{ .format = DRM_FORMAT_ABGR16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
 		{ .format = DRM_FORMAT_RGB888_A8,	.depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
 		{ .format = DRM_FORMAT_BGR888_A8,	.depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
 		{ .format = DRM_FORMAT_XRGB8888_A8,	.depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index f76de49c768f..f7156322aba5 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -168,6 +168,13 @@ extern "C" {
 #define DRM_FORMAT_RGBA1010102	fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
 #define DRM_FORMAT_BGRA1010102	fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
 
+/* 64 bpp RGB */
+#define DRM_FORMAT_XRGB16161616	fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
+#define DRM_FORMAT_XBGR16161616	fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
+
+#define DRM_FORMAT_ARGB16161616	fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
+#define DRM_FORMAT_ABGR16161616	fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */
+
 /*
  * Floating point 64bpp RGB
  * IEEE 754-2008 binary16 half-precision float
-- 
2.25.1
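
For readers unfamiliar with the fourcc scheme, a minimal sketch of how the
new code for XBGR16161616 is built and how one pixel is laid out according
to the "[63:0] x:B:G:R" comment in the patch. FOURCC_CODE mirrors the
fourcc_code() construction in drm_fourcc.h; the EXAMPLE_ define and the
helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the fourcc_code() construction used by drm_fourcc.h. */
#define FOURCC_CODE(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define EXAMPLE_FORMAT_XBGR16161616 FOURCC_CODE('X', 'B', '4', '8')

/* Hypothetical helper: pack one pixel as [63:0] x:B:G:R 16:16:16:16. */
uint64_t pack_xbgr16161616(uint16_t r, uint16_t g, uint16_t b)
{
	return ((uint64_t)b << 32) | ((uint64_t)g << 16) | (uint64_t)r;
}

/*
 * Hypothetical helper: the 16 bit component found at a given byte offset
 * when the 64 bit pixel is stored little endian.
 */
uint16_t component_at_byte_offset(uint64_t pixel, unsigned int byte_offset)
{
	assert(byte_offset % 2 == 0 && byte_offset < 8);
	return (uint16_t)(pixel >> (8 * byte_offset));
}

/* Convert a value in [0.0, 1.0] to 16 bit unorm with rounding. */
uint16_t unorm16_from_float(double v)
{
	if (v <= 0.0)
		return 0;
	if (v >= 1.0)
		return 0xffff;
	return (uint16_t)(v * 65535.0 + 0.5);
}
```

In little endian memory red ends up in bytes 0..1, green in bytes 2..3 and
blue in bytes 4..5, which is the component order of
VK_FORMAT_R16G16B16A16_UNORM.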

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/5] drm/amd/display: Add support for SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
  2021-03-19 21:03 ` [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats Mario Kleiner
@ 2021-03-19 21:03 ` Mario Kleiner
  2021-03-19 21:03 ` [PATCH 3/5] drm/amd/display: Increase linebuffer pixel depth to 36bpp Mario Kleiner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

Add the necessary format definition, bandwidth and pixel size mappings,
prescaler setup, and pixelformat selection, following the logic
already present for SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616.

The new SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 is implemented as the
old SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 format, but with swapped
red <-> blue color channel, by use of the hardware xbar.
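
The crossbar idea can be sketched as a small software model: the same 64
bit pixel is decoded as ARGB or ABGR purely by choosing which 16 bit slot
feeds the red and the blue output. The slot numbers and function names
below belong to this model only; they are not the register encodings used
by the xbar fields in the driver.

```c
#include <assert.h>
#include <stdint.h>

struct rgba16 {
	uint16_t r, g, b, a;
};

/* Extract 16 bit slot n from a 64 bit pixel, slot 0 being bits [15:0]. */
uint16_t slot(uint64_t pixel, unsigned int n)
{
	assert(n < 4);
	return (uint16_t)(pixel >> (16 * n));
}

/*
 * Model of a channel crossbar: red_slot and blue_slot select which slot
 * is routed to the red and blue outputs. Green and alpha sit in the same
 * slots for both formats.
 */
struct rgba16 decode(uint64_t pixel, unsigned int red_slot,
		     unsigned int blue_slot)
{
	struct rgba16 c;

	c.r = slot(pixel, red_slot);
	c.g = slot(pixel, 1);
	c.b = slot(pixel, blue_slot);
	c.a = slot(pixel, 3);
	return c;
}

/* [63:0] A:R:G:B, so red is slot 2 and blue is slot 0. */
struct rgba16 decode_argb16161616(uint64_t pixel)
{
	return decode(pixel, 2, 0);
}

/* [63:0] A:B:G:R, so red is slot 0 and blue is slot 2. */
struct rgba16 decode_abgr16161616(uint64_t pixel)
{
	return decode(pixel, 0, 2);
}
```

Everything downstream of the crossbar sees the same channel order for both
formats, which is why the ABGR variant can reuse the ARGB16161616 code
paths.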

Please note that on the DCN 1/2/3 display engines, the pixelformat
in hubp and dpp setup for the old SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616
and the new SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 was changed from
format id 22 to id 26. See amd/include/navi10_enum.h for the meaning
of the id's.

For format 22, the display engine read the framebuffer in 16 bpc format,
but truncated to the 12 bpc actually supported by later pipeline stages.
However, the engine took the 12 LSB of each color component for
truncation, which is incompatible with rendering at least under Vulkan,
where content is 16 bit wide, and a 12 MSB alignment would be appropriate,
if any. Format 20 for ARGB16161616_12MSB does work, but even better, we
can choose format 26 for ARGB16161616_UNORM, keeping all 16 bits around
until later stages of the display pipeline.
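
The difference between the two alignments can be shown with a few lines of
C. This is a model of the arithmetic only, under the assumption that the
truncation behaves like a plain mask or shift; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the 'bits' least significant bits of a 16 bit component. */
uint16_t take_lsb(uint16_t v16, unsigned int bits)
{
	assert(bits >= 1 && bits <= 16);
	return (uint16_t)(v16 & ((1u << bits) - 1u));
}

/* Keep the 'bits' most significant bits of a 16 bit component. */
uint16_t take_msb(uint16_t v16, unsigned int bits)
{
	assert(bits >= 1 && bits <= 16);
	return (uint16_t)(v16 >> (16 - bits));
}
```

For a 16 bit unorm mid-gray of 0x8000, take_msb(0x8000, 12) gives 0x800,
still mid-gray on a 12 bit scale, while take_lsb(0x8000, 12) gives 0. With
LSB truncation the upper four bits are discarded and the intensity wraps
around sixteen times across the input range, which fits the pixel trash
described for the old format id.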

This makes it possible to directly consume what the rendering hw
produces under Vulkan for swapchain format VK_FORMAT_R16G16B16A16_UNORM,
as tested with a patched version of the current AMD open-source amdvlk
driver which maps swapchain format VK_FORMAT_R16G16B16A16_UNORM onto
DRM_FORMAT_XBGR16161616.

The old id 22 would cause colorful pixeltrash to be displayed instead.

Tested under DCN-1.0 and DCE-11.2.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c            | 2 ++
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c            | 2 ++
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c           | 2 ++
 drivers/gpu/drm/amd/display/dc/dc_hw_types.h                | 2 ++
 drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c          | 2 ++
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 1 +
 drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c  | 1 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c            | 6 ++++--
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c         | 1 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c           | 4 +++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c            | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c         | 1 +
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c           | 4 +++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c       | 1 +
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c            | 3 ++-
 15 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
index e633f8a51edb..4e3664db7456 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
@@ -2827,6 +2827,7 @@ static void populate_initial_data(
 			data->bytes_per_pixel[num_displays + 4] = 4;
 			break;
 		case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+		case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 		case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 			data->bytes_per_pixel[num_displays + 4] = 8;
 			break;
@@ -2930,6 +2931,7 @@ static void populate_initial_data(
 				data->bytes_per_pixel[num_displays + 4] = 4;
 				break;
 			case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+			case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 			case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 				data->bytes_per_pixel[num_displays + 4] = 8;
 				break;
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index d4df4da5b81a..0e18df1283b6 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -236,6 +236,7 @@ static enum dcn_bw_defs tl_pixel_format_to_bw_defs(enum surface_pixel_format for
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS:
 		return dcn_bw_rgb_sub_32;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		return dcn_bw_rgb_sub_64;
@@ -375,6 +376,7 @@ static void pipe_ctx_to_e2e_pipe_params (
 		input->src.viewport_height_c   = input->src.viewport_height / 2;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		input->src.source_format = dm_444_64;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 0c26c2ade782..f1aed40b3124 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -562,6 +562,7 @@ static enum pixel_format convert_pixel_format_to_dalsurface(
 		dal_pixel_format = PIXEL_FORMAT_420BPP10;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	default:
 		dal_pixel_format = PIXEL_FORMAT_UNKNOWN;
 		break;
@@ -2990,6 +2991,7 @@ unsigned int resource_pixel_format_to_bpp(enum surface_pixel_format format)
 #endif
 		return 32;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		return 64;
diff --git a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
index b41e6367b15e..87f8b1b486d3 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
@@ -182,6 +182,8 @@ enum surface_pixel_format {
 	SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS,
 	/*64 bpp */
 	SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616,
+	/*swapped*/
+	SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616,
 	/*float*/
 	SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F,
 	/*swaped & float*/
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
index 79a6f261a0da..4cdd4dacb761 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
@@ -566,6 +566,7 @@ static void program_grph_pixel_format(
 			 *  should problem swap endian*/
 		format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010 ||
 		format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS ||
+		format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 ||
 		format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F) {
 		/* ABGR formats */
 		red_xbar = 2;
@@ -606,6 +607,7 @@ static void program_grph_pixel_format(
 		fallthrough;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F: /* shouldn't this get float too? */
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 		grph_depth = 3;
 		grph_format = 0;
 		break;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index caee1c9f54bd..a4eec436ba2e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -263,6 +263,7 @@ static void build_prescale_params(struct ipp_prescale_params *prescale_params,
 		prescale_params->scale = 0x2008;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		prescale_params->scale = 0x2000;
 		break;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c
index 8bbb499067f7..db7557a1c613 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c
@@ -393,6 +393,7 @@ static void program_pixel_format(
 			grph_format = 1;
 			break;
 		case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+		case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 		case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 			grph_depth = 3;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
index 7f8456b9988b..a77e7bd3b8d5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
@@ -257,7 +257,8 @@ static void dpp1_setup_format_flags(enum surface_pixel_format input_format,\
 	if (input_format == SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F ||
 		input_format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F)
 		*fmt = PIXEL_FORMAT_FLOAT;
-	else if (input_format == SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616)
+	else if (input_format == SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 ||
+		input_format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616)
 		*fmt = PIXEL_FORMAT_FIXED16;
 	else
 		*fmt = PIXEL_FORMAT_FIXED;
@@ -368,7 +369,8 @@ void dpp1_cnv_setup (
 		select = INPUT_CSC_SELECT_ICSC;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
-		pixel_format = 22;
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
+		pixel_format = 26; /* ARGB16161616_UNORM */
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 		pixel_format = 24;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
index 6f42d10dd772..f4f423d0b8c3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
@@ -785,6 +785,7 @@ static bool hubbub1_dcc_support_pixel_format(
 		*bytes_per_element = 4;
 		return true;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		*bytes_per_element = 8;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
index 9e796dfeac20..4e2ac6c5e35d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
@@ -245,6 +245,7 @@ void hubp1_program_pixel_format(
 	if (format == SURFACE_PIXEL_FORMAT_GRPH_ABGR8888
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS
+			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F) {
 		red_bar = 2;
 		blue_bar = 3;
@@ -277,8 +278,9 @@ void hubp1_program_pixel_format(
 				SURFACE_PIXEL_FORMAT, 10);
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616: /*we use crossbar already*/
 		REG_UPDATE(DCSURF_SURFACE_CONFIG,
-				SURFACE_PIXEL_FORMAT, 22);
+				SURFACE_PIXEL_FORMAT, 26); /* ARGB16161616_UNORM */
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:/*we use crossbar already*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
index 4af96cc5d9d6..f2f44ddf522a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
@@ -166,7 +166,8 @@ static void dpp2_cnv_setup (
 		select = DCN2_ICSC_SELECT_ICSC_A;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
-		pixel_format = 22;
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
+		pixel_format = 26; /* ARGB16161616_UNORM */
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 		pixel_format = 24;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c
index 6d03d98fca22..91a9305d42e8 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c
@@ -158,6 +158,7 @@ bool hubbub2_dcc_support_pixel_format(
 		*bytes_per_element = 4;
 		return true;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		*bytes_per_element = 8;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
index 0df0da2e6a4d..05c5494bf00f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
@@ -428,6 +428,7 @@ void hubp2_program_pixel_format(
 	if (format == SURFACE_PIXEL_FORMAT_GRPH_ABGR8888
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS
+			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616
 			|| format == SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F) {
 		red_bar = 2;
 		blue_bar = 3;
@@ -460,8 +461,9 @@ void hubp2_program_pixel_format(
 				SURFACE_PIXEL_FORMAT, 10);
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616: /*we use crossbar already*/
 		REG_UPDATE(DCSURF_SURFACE_CONFIG,
-				SURFACE_PIXEL_FORMAT, 22);
+				SURFACE_PIXEL_FORMAT, 26); /* ARGB16161616_UNORM */
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:/*we use crossbar already*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 2c2dbfcd8957..4083075c1ee6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2358,6 +2358,7 @@ int dcn20_populate_dml_pipes_from_context(
 				pipes[pipe_cnt].pipe.src.source_format = dm_420_10;
 				break;
 			case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+			case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
 			case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 			case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 				pipes[pipe_cnt].pipe.src.source_format = dm_444_64;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
index 6e864b1a95c4..0bc5c5eba7af 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
@@ -245,7 +245,8 @@ static void dpp3_cnv_setup (
 		select = INPUT_CSC_SELECT_ICSC;
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
-		pixel_format = 22;
+	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
+		pixel_format = 26; /* ARGB16161616_UNORM */
 		break;
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
 		pixel_format = 24;
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/5] drm/amd/display: Increase linebuffer pixel depth to 36bpp.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
  2021-03-19 21:03 ` [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats Mario Kleiner
  2021-03-19 21:03 ` [PATCH 2/5] drm/amd/display: Add support for SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 Mario Kleiner
@ 2021-03-19 21:03 ` Mario Kleiner
  2021-03-19 21:03 ` [PATCH 4/5] drm/amd/display: Make assert in DCE's program_bit_depth_reduction more lenient Mario Kleiner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

Testing with the photometer shows that at least Raven Ridge DCN-1.0
does not achieve more than 10 bpc effective output precision with a
16 bpc unorm surface of type SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616,
unless linebuffer depth is increased from LB_PIXEL_DEPTH_30BPP to
LB_PIXEL_DEPTH_36BPP. Otherwise precision gets truncated somewhere
to 10 bpc effective depth.
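
A minimal model of why the linebuffer depth caps precision, assuming the
linebuffer bits are split evenly across the three color components and
that components are stored MSB-aligned. Both are assumptions made for this
sketch; the thread does not describe the actual hardware storage layout,
and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits per color component for an RGB linebuffer of the given depth. */
unsigned int lb_bits_per_component(unsigned int lb_bpp)
{
	assert(lb_bpp % 3 == 0 && lb_bpp >= 18 && lb_bpp <= 36);
	return lb_bpp / 3;
}

/* Number of distinct levels per component the linebuffer can hold. */
unsigned int lb_levels(unsigned int lb_bpp)
{
	return 1u << lb_bits_per_component(lb_bpp);
}

/* Truncate a 16 bit unorm component to what fits in the linebuffer. */
uint16_t lb_store(uint16_t v16, unsigned int lb_bpp)
{
	return (uint16_t)(v16 >> (16 - lb_bits_per_component(lb_bpp)));
}
```

In this model the inputs 0x8010 and 0x8030 collapse to the same stored
value at 30 bpp but stay distinct at 36 bpp, matching the 10 bpc versus 12
bpc behaviour measured on DCN.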

Strangely this increase was not needed on Polaris11 DCE-11.2 during
testing to get 12 bpc effective precision. It also is not needed for
fp16 framebuffers.

Tested on DCN-1.0 and DCE-11.2.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c          | 7 +++++--
 drivers/gpu/drm/amd/display/dc/dce/dce_transform.c         | 6 ++++--
 drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c           | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c           | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c         | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c           | 3 ++-
 8 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index f1aed40b3124..51e91b546d69 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1167,9 +1167,12 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
 
 	/**
 	 * Setting line buffer pixel depth to 24bpp yields banding
-	 * on certain displays, such as the Sharp 4k
+	 * on certain displays, such as the Sharp 4k. 36bpp is needed
+	 * to support SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 and
+	 * SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
+	 * precision on at least DCN display engines.
 	 */
-	pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
+	pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
 	pipe_ctx->plane_res.scl_data.lb_params.alpha_en = plane_state->per_pixel_alpha;
 
 	pipe_ctx->plane_res.scl_data.recout.x += timing->h_border_left;
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
index 151dc7bf6d23..92b53a30d954 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
@@ -1647,7 +1647,8 @@ void dce_transform_construct(
 	xfm_dce->lb_pixel_depth_supported =
 			LB_PIXEL_DEPTH_18BPP |
 			LB_PIXEL_DEPTH_24BPP |
-			LB_PIXEL_DEPTH_30BPP;
+			LB_PIXEL_DEPTH_30BPP |
+			LB_PIXEL_DEPTH_36BPP;
 
 	xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
 	xfm_dce->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x6B0*/
@@ -1675,7 +1676,8 @@ void dce60_transform_construct(
 	xfm_dce->lb_pixel_depth_supported =
 			LB_PIXEL_DEPTH_18BPP |
 			LB_PIXEL_DEPTH_24BPP |
-			LB_PIXEL_DEPTH_30BPP;
+			LB_PIXEL_DEPTH_30BPP |
+			LB_PIXEL_DEPTH_36BPP;
 
 	xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
 	xfm_dce->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x6B0*/
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
index 29438c6050db..45bca0db5e5e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
@@ -708,7 +708,8 @@ bool dce110_transform_v_construct(
 	xfm_dce->lb_pixel_depth_supported =
 			LB_PIXEL_DEPTH_18BPP |
 			LB_PIXEL_DEPTH_24BPP |
-			LB_PIXEL_DEPTH_30BPP;
+			LB_PIXEL_DEPTH_30BPP |
+			LB_PIXEL_DEPTH_36BPP;
 
 	xfm_dce->prescaler_on = true;
 	xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
index a77e7bd3b8d5..91fdfcd8a14e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
@@ -568,7 +568,8 @@ void dpp1_construct(
 	dpp->lb_pixel_depth_supported =
 		LB_PIXEL_DEPTH_18BPP |
 		LB_PIXEL_DEPTH_24BPP |
-		LB_PIXEL_DEPTH_30BPP;
+		LB_PIXEL_DEPTH_30BPP |
+		LB_PIXEL_DEPTH_36BPP;
 
 	dpp->lb_bits_per_entry = LB_BITS_PER_ENTRY;
 	dpp->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x1404*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 89912bb5014f..25d198f60a1c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -2470,7 +2470,7 @@ static void update_scaler(struct pipe_ctx *pipe_ctx)
 			pipe_ctx->plane_state->per_pixel_alpha && pipe_ctx->bottom_pipe;
 
 	pipe_ctx->plane_res.scl_data.lb_params.alpha_en = per_pixel_alpha;
-	pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
+	pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
 	/* scaler configuration */
 	pipe_ctx->plane_res.dpp->funcs->dpp_set_scaler(
 			pipe_ctx->plane_res.dpp, &pipe_ctx->plane_res.scl_data);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
index f2f44ddf522a..a9e420c7d75a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
@@ -432,7 +432,8 @@ bool dpp2_construct(
 	dpp->lb_pixel_depth_supported =
 		LB_PIXEL_DEPTH_18BPP |
 		LB_PIXEL_DEPTH_24BPP |
-		LB_PIXEL_DEPTH_30BPP;
+		LB_PIXEL_DEPTH_30BPP |
+		LB_PIXEL_DEPTH_36BPP;
 
 	dpp->lb_bits_per_entry = LB_BITS_PER_ENTRY;
 	dpp->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x1404*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 0726fb435e2a..cd924f4688e1 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1467,7 +1467,7 @@ static void dcn20_update_dchubp_dpp(
 			plane_state->update_flags.bits.per_pixel_alpha_change ||
 			pipe_ctx->stream->update_flags.bits.scaling) {
 		pipe_ctx->plane_res.scl_data.lb_params.alpha_en = pipe_ctx->plane_state->per_pixel_alpha;
-		ASSERT(pipe_ctx->plane_res.scl_data.lb_params.depth == LB_PIXEL_DEPTH_30BPP);
+		ASSERT(pipe_ctx->plane_res.scl_data.lb_params.depth == LB_PIXEL_DEPTH_36BPP);
 		/* scaler configuration */
 		pipe_ctx->plane_res.dpp->funcs->dpp_set_scaler(
 				pipe_ctx->plane_res.dpp, &pipe_ctx->plane_res.scl_data);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
index 0bc5c5eba7af..9c8138e52ded 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
@@ -1443,7 +1443,8 @@ bool dpp3_construct(
 	dpp->lb_pixel_depth_supported =
 		LB_PIXEL_DEPTH_18BPP |
 		LB_PIXEL_DEPTH_24BPP |
-		LB_PIXEL_DEPTH_30BPP;
+		LB_PIXEL_DEPTH_30BPP |
+		LB_PIXEL_DEPTH_36BPP;
 
 	dpp->lb_bits_per_entry = LB_BITS_PER_ENTRY;
 	dpp->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x1404*/
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/5] drm/amd/display: Make assert in DCE's program_bit_depth_reduction more lenient.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
                   ` (2 preceding siblings ...)
  2021-03-19 21:03 ` [PATCH 3/5] drm/amd/display: Increase linebuffer pixel depth to 36bpp Mario Kleiner
@ 2021-03-19 21:03 ` Mario Kleiner
  2021-03-19 21:03 ` [PATCH 5/5] drm/amd/display: Enable support for 16 bpc fixed-point framebuffers Mario Kleiner
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

This is needed to avoid warnings with linebuffer depth 36 bpp.
Testing on a Polaris11, DCE-11.2 on a 10 bit HDR-10 monitor
showed no obvious problems, and this 12 bpc limit is consistent
with what other functions in the DCE bit depth reduction path use.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
index 92b53a30d954..d9fd4ec60588 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
@@ -794,7 +794,7 @@ static void program_bit_depth_reduction(
 	enum dcp_out_trunc_round_mode trunc_mode;
 	bool spatial_dither_enable;
 
-	ASSERT(depth < COLOR_DEPTH_121212); /* Invalid clamp bit depth */
+	ASSERT(depth <= COLOR_DEPTH_121212); /* Invalid clamp bit depth */
 
 	spatial_dither_enable = bit_depth_params->flags.SPATIAL_DITHER_ENABLED;
 	/* Default to 12 bit truncation without rounding */
@@ -854,7 +854,7 @@ static void dce60_program_bit_depth_reduction(
 	enum dcp_out_trunc_round_mode trunc_mode;
 	bool spatial_dither_enable;
 
-	ASSERT(depth < COLOR_DEPTH_121212); /* Invalid clamp bit depth */
+	ASSERT(depth <= COLOR_DEPTH_121212); /* Invalid clamp bit depth */
 
 	spatial_dither_enable = bit_depth_params->flags.SPATIAL_DITHER_ENABLED;
 	/* Default to 12 bit truncation without rounding */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/5] drm/amd/display: Enable support for 16 bpc fixed-point framebuffers.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
                   ` (3 preceding siblings ...)
  2021-03-19 21:03 ` [PATCH 4/5] drm/amd/display: Make assert in DCE's program_bit_depth_reduction more lenient Mario Kleiner
@ 2021-03-19 21:03 ` Mario Kleiner
  2021-03-22 15:52 ` 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Ville Syrjälä
  2021-04-16 16:29 ` Mario Kleiner
  6 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:03 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexander.deucher, mario.kleiner.de, harry.wentland, nicholas.kazlauskas

This is intended to enable direct high-precision scanout and pageflip
of Vulkan swapchain images in format VK_FORMAT_R16G16B16A16_UNORM.

Expose DRM_FORMAT_XRGB16161616, DRM_FORMAT_ARGB16161616,
DRM_FORMAT_XBGR16161616 and DRM_FORMAT_ABGR16161616 as 16 bpc
unsigned normalized formats. These make it possible to take full
advantage of the maximum precision of the display hardware, i.e.
currently up to 12 bpc.

Searching through old AMD M56, M76 and RV630 hw programming docs
suggests that these 16 bpc formats are supported by all DCE and
DCN display engines, so we can expose the formats unconditionally.

Successfully tested on AMD Polaris11 DCE-11.2 and RavenRidge DCN-1.0
with an HDR-10 monitor over 10 bpc DP output with spatial dithering
enabled by the driver. Picture looks good, and my photometer
measurement procedure confirms an effective 12 bpc color
reproduction.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 94cd5ddd67ef..1a6e90e20f10 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4563,6 +4563,14 @@ fill_dc_plane_info_and_addr(struct amdgpu_device *adev,
 	case DRM_FORMAT_ABGR16161616F:
 		plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F;
 		break;
+	case DRM_FORMAT_XRGB16161616:
+	case DRM_FORMAT_ARGB16161616:
+		plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616;
+		break;
+	case DRM_FORMAT_XBGR16161616:
+	case DRM_FORMAT_ABGR16161616:
+		plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616;
+		break;
 	default:
 		DRM_ERROR(
 			"Unsupported screen format %s\n",
@@ -6541,6 +6549,10 @@ static const uint32_t rgb_formats[] = {
 	DRM_FORMAT_XBGR2101010,
 	DRM_FORMAT_ARGB2101010,
 	DRM_FORMAT_ABGR2101010,
+	DRM_FORMAT_XRGB16161616,
+	DRM_FORMAT_XBGR16161616,
+	DRM_FORMAT_ARGB16161616,
+	DRM_FORMAT_ABGR16161616,
 	DRM_FORMAT_XBGR8888,
 	DRM_FORMAT_ABGR8888,
 	DRM_FORMAT_RGB565,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-03-19 21:03 ` [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats Mario Kleiner
@ 2021-03-19 21:16   ` Ville Syrjälä
  2021-03-19 21:45     ` Mario Kleiner
  0 siblings, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2021-03-19 21:16 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: alexander.deucher, dri-devel, amd-gfx, nicholas.kazlauskas

On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> These are 16 bits per color channel unsigned normalized formats.
> They are supported by at least AMD display hw, and suitable for
> direct scanout of Vulkan swapchain images in the format
> VK_FORMAT_R16G16B16A16_UNORM.
> 
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> ---
>  drivers/gpu/drm/drm_fourcc.c  | 4 ++++
>  include/uapi/drm/drm_fourcc.h | 7 +++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> index 03262472059c..ce13d2be5d7b 100644
> --- a/drivers/gpu/drm/drm_fourcc.c
> +++ b/drivers/gpu/drm/drm_fourcc.c
> @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
>  		{ .format = DRM_FORMAT_ARGB16161616F,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_ABGR16161616F,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> +		{ .format = DRM_FORMAT_XRGB16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> +		{ .format = DRM_FORMAT_XBGR16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> +		{ .format = DRM_FORMAT_ARGB16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> +		{ .format = DRM_FORMAT_ABGR16161616,	.depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_RGB888_A8,	.depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_BGR888_A8,	.depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_XRGB8888_A8,	.depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index f76de49c768f..f7156322aba5 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -168,6 +168,13 @@ extern "C" {
>  #define DRM_FORMAT_RGBA1010102	fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
>  #define DRM_FORMAT_BGRA1010102	fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
>  
> +/* 64 bpp RGB */
> +#define DRM_FORMAT_XRGB16161616	fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> +#define DRM_FORMAT_XBGR16161616	fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> +
> +#define DRM_FORMAT_ARGB16161616	fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> +#define DRM_FORMAT_ABGR16161616	fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */

These look reasonable enough to me. IIRC we should be able to expose
them on some recent Intel hw as well.
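
For reference, a minimal sketch of how the four new codes pack into a
32 bit value. It assumes the usual fourcc_code() packing rule from
drm_fourcc.h (first character in the lowest byte); the macro is
redefined locally so the snippet builds without kernel headers, and
fourcc_char() is a hypothetical helper added only for illustration.

```c
#include <stdint.h>

/* Local copy of the packing rule, assumed to match drm_fourcc.h:
 * the first character ends up in the least significant byte. */
#define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
				 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* The four formats added by the quoted patch. */
#define DRM_FORMAT_XRGB16161616	fourcc_code('X', 'R', '4', '8')
#define DRM_FORMAT_XBGR16161616	fourcc_code('X', 'B', '4', '8')
#define DRM_FORMAT_ARGB16161616	fourcc_code('A', 'R', '4', '8')
#define DRM_FORMAT_ABGR16161616	fourcc_code('A', 'B', '4', '8')

/* Hypothetical helper: return character i (0..3) of a fourcc code. */
static char fourcc_char(uint32_t code, unsigned int i)
{
	return (char)((code >> (8 * i)) & 0xff);
}
```

Printing a code with "%08x" shows the characters in reverse order,
which is a handy sanity check when comparing against format lists
reported by the kernel.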

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-03-19 21:16   ` Ville Syrjälä
@ 2021-03-19 21:45     ` Mario Kleiner
  2021-03-20  2:09       ` Ville Syrjälä
  0 siblings, 1 reply; 20+ messages in thread
From: Mario Kleiner @ 2021-03-19 21:45 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > These are 16 bits per color channel unsigned normalized formats.
> > They are supported by at least AMD display hw, and suitable for
> > direct scanout of Vulkan swapchain images in the format
> > VK_FORMAT_R16G16B16A16_UNORM.
> >
> > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > ---
> >  drivers/gpu/drm/drm_fourcc.c  | 4 ++++
> >  include/uapi/drm/drm_fourcc.h | 7 +++++++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > index 03262472059c..ce13d2be5d7b 100644
> > --- a/drivers/gpu/drm/drm_fourcc.c
> > +++ b/drivers/gpu/drm/drm_fourcc.c
> > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
> >               { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> >               { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> >               { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > +             { .format = DRM_FORMAT_XRGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > +             { .format = DRM_FORMAT_XBGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > +             { .format = DRM_FORMAT_ARGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > +             { .format = DRM_FORMAT_ABGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> >               { .format = DRM_FORMAT_RGB888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> >               { .format = DRM_FORMAT_BGR888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> >               { .format = DRM_FORMAT_XRGB8888_A8,     .depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > index f76de49c768f..f7156322aba5 100644
> > --- a/include/uapi/drm/drm_fourcc.h
> > +++ b/include/uapi/drm/drm_fourcc.h
> > @@ -168,6 +168,13 @@ extern "C" {
> >  #define DRM_FORMAT_RGBA1010102       fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
> >  #define DRM_FORMAT_BGRA1010102       fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
> >
> > +/* 64 bpp RGB */
> > +#define DRM_FORMAT_XRGB16161616      fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> > +#define DRM_FORMAT_XBGR16161616      fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> > +
> > +#define DRM_FORMAT_ARGB16161616      fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> > +#define DRM_FORMAT_ABGR16161616      fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */
>
> These look reasonable enough to me. IIRC we should be able to expose
> them on some recent Intel hw as well.
>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>

Thanks Ville!

Indeed i looked over the Intel PRM's, and while fp16 support seems to
be rather recent (Gen8? Gen9? Gen10? Can't remember atm.), iirc, I
found references to rgb16 fixed point back to gen5 / Ironlake. That
would be pretty cool! The precision limit for the encoders on Intel is
also 12 bpc atm., right?

-mario

> --
> Ville Syrjälä
> Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-03-19 21:45     ` Mario Kleiner
@ 2021-03-20  2:09       ` Ville Syrjälä
  2021-05-06  6:37         ` Ville Syrjälä
  0 siblings, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2021-03-20  2:09 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Fri, Mar 19, 2021 at 10:45:10PM +0100, Mario Kleiner wrote:
> On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> >
> > On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > > These are 16 bits per color channel unsigned normalized formats.
> > > They are supported by at least AMD display hw, and suitable for
> > > direct scanout of Vulkan swapchain images in the format
> > > VK_FORMAT_R16G16B16A16_UNORM.
> > >
> > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > > ---
> > >  drivers/gpu/drm/drm_fourcc.c  | 4 ++++
> > >  include/uapi/drm/drm_fourcc.h | 7 +++++++
> > >  2 files changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > > index 03262472059c..ce13d2be5d7b 100644
> > > --- a/drivers/gpu/drm/drm_fourcc.c
> > > +++ b/drivers/gpu/drm/drm_fourcc.c
> > > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
> > >               { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > >               { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > >               { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > +             { .format = DRM_FORMAT_XRGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > +             { .format = DRM_FORMAT_XBGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > +             { .format = DRM_FORMAT_ARGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > +             { .format = DRM_FORMAT_ABGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > >               { .format = DRM_FORMAT_RGB888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > >               { .format = DRM_FORMAT_BGR888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > >               { .format = DRM_FORMAT_XRGB8888_A8,     .depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > > index f76de49c768f..f7156322aba5 100644
> > > --- a/include/uapi/drm/drm_fourcc.h
> > > +++ b/include/uapi/drm/drm_fourcc.h
> > > @@ -168,6 +168,13 @@ extern "C" {
> > >  #define DRM_FORMAT_RGBA1010102       fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
> > >  #define DRM_FORMAT_BGRA1010102       fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
> > >
> > > +/* 64 bpp RGB */
> > > +#define DRM_FORMAT_XRGB16161616      fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> > > +#define DRM_FORMAT_XBGR16161616      fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> > > +
> > > +#define DRM_FORMAT_ARGB16161616      fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> > > +#define DRM_FORMAT_ABGR16161616      fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */
> >
> > These look reasonable enough to me. IIRC we should be able to expose
> > them on some recent Intel hw as well.
> >
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> 
> Thanks Ville!
> 
> Indeed i looked over the Intel PRM's, and while fp16 support seems to
> be rather recent (Gen8? Gen9? Gen10? Can't remember atm.), iirc, I
> found references to rgb16 fixed point back to gen5 / Ironlake.

fp16 has been around since forever (gen4+).
uint16 is much more recent, IIRC something like glk+.

> That
> would be pretty cool! The precision limit for the encoders on Intel is
> also 12 bpc atm., right?

Yes.
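
To illustrate the precision argument from the cover letter, here is a
rough sketch comparing the value spacing of fp16 and 16 bit unorm
against what a 12 bpc output needs. It relies only on the standard
binary16 layout (10 explicit mantissa bits); the function names are
made up for this example and are not taken from any driver.

```c
#include <stdint.h>

/* Spacing between adjacent binary16 (fp16) values near x.
 * Assumes x is positive and within the normal fp16 range. */
static float fp16_spacing(float x)
{
	float step = 1.0f / 2048.0f;	/* 2^-11, spacing inside [0.5, 1.0) */

	if (!(x > 0.0f))
		return 0.0f;
	while (x < 0.5f) {		/* each lower binade halves the spacing */
		x *= 2.0f;
		step *= 0.5f;
	}
	while (x >= 1.0f) {		/* each higher binade doubles it */
		x *= 0.5f;
		step *= 2.0f;
	}
	return step;
}

/* Spacing of 16 bit unorm values, constant over [0.0, 1.0]. */
static float unorm16_spacing(void)
{
	return 1.0f / 65535.0f;
}

/* Spacing a 12 bpc output can resolve. */
static float out12_spacing(void)
{
	return 1.0f / 4095.0f;
}
```

In [0.5, 1.0) the fp16 spacing is 2^-11, i.e. only 1024 distinct
values in the upper half, which corresponds to ~11 bpc linear
precision. The unorm16 spacing stays below the 12 bpc step everywhere.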

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
                   ` (4 preceding siblings ...)
  2021-03-19 21:03 ` [PATCH 5/5] drm/amd/display: Enable support for 16 bpc fixed-point framebuffers Mario Kleiner
@ 2021-03-22 15:52 ` Ville Syrjälä
  2021-04-16 16:27   ` Mario Kleiner
  2021-04-16 16:29 ` Mario Kleiner
  6 siblings, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2021-03-22 15:52 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: alexander.deucher, dri-devel, amd-gfx, nicholas.kazlauskas

On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> Hi,
> 
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
> 
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4

We should also add support for these formats into igt. Should
be semi-easy by just adding the suitable float<->uint16
conversion stuff.

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-03-22 15:52 ` 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Ville Syrjälä
@ 2021-04-16 16:27   ` Mario Kleiner
  2021-04-16 17:31     ` Ville Syrjälä
  0 siblings, 1 reply; 20+ messages in thread
From: Mario Kleiner @ 2021-04-16 16:27 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Alex Deucher, Harry Wentland, dri-devel, amd-gfx list,
	Nicholas Kazlauskas

On Mon, Mar 22, 2021 at 4:52 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> > Hi,
> >
> > this patch series adds the fourcc's for 16 bit fixed point unorm
> > framebuffers to the core, and then an implementation for AMD gpu's
> > with DisplayCore.
> >
> > This is intended to allow for pageflipping to, and direct scanout of,
> > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> We should also add support for these formats into igt. Should
> be semi-easy by just adding the suitable float<->uint16
> conversion stuff.
>

Hi Ville,

Could you point me to a specific test case / file that I should look
at for adding this?

thanks,
-mario

> --
> Ville Syrjälä
> Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
                   ` (5 preceding siblings ...)
  2021-03-22 15:52 ` 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Ville Syrjälä
@ 2021-04-16 16:29 ` Mario Kleiner
  2021-04-20 21:25   ` Alex Deucher
  6 siblings, 1 reply; 20+ messages in thread
From: Mario Kleiner @ 2021-04-16 16:29 UTC (permalink / raw)
  To: amd-gfx list, dri-devel; +Cc: Alex Deucher, Harry Wentland, Nicholas Kazlauskas

Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
Would be great to get this in sooner rather than later.

Thanks and have a nice weekend,
-mario

On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
<mario.kleiner.de@gmail.com> wrote:
>
> Hi,
>
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
>
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> My main motivation for this is squeezing every bit of precision
> out of the hardware for scientific and medical research applications,
> where fp16 in the unorm range is limited to ~11 bpc effective linear
> precision in the upper half [0.5;1.0] of the unorm range, although
> the hardware could do at least 12 bpc.
>
> It has been successfully tested on AMD RavenRidge (DCN-1), and with
> Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> on my hw, both running at 10 bpc DP output depth.
>
> Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> Apple Retina panel), all running at 10 bpc output depth.
>
> No malfunctions, visual artifacts or other oddities were observed
> (apart from an adventurous mess of cables and adapters on my desk),
> suggesting it works.
>
> I used my automatic photometer measurement procedure to verify the
> effective output precision of 10 bpc DP native signal + spatial
> dithering in the gpu as enabled by the amdgpu driver. Results show
> the expected 12 bpc precision i hoped for -- the current upper limit
> for AMD display hw afaik.
>
> So it seems to work in the way i hoped :).
>
> Some open questions wrt. AMD DC, to be addressed in this patch series, or follow-up
> patches if necessary:
>
> - For the atomic check for plane scaling, the current patch will
> apply the same hw limits as for other rgb fixed point fb's, e.g.,
> for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> limits, because this is also a 64 bpp format? Or something new
> entirely?
>
> - I haven't added the new fourcc to the DCC tables yet. Should i?
>
> - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> It looks to me as if that assert was inconsistent with other places
> in the driver where COLOR_DEPTH_121212 is supported, and looking at
> the code, the change seems harmless. At least on DCE-11.2 the change
> didn't cause any noticeable (by myself) or measurable (by my equipment)
> problems on any of the 3 connected displays.
>
> - Related to that change, while i needed to increase lb pixelsize to 36bpp
> to get > 10 bpc effective precision on DCN, i didn't need to do that
> on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> to get > 10 bpc precision for fp16 framebuffers, so something seems to
> behave differently for floating point 16 vs. fixed point 16. This all
> seems to suggest one could leave lb pixelsize at the old 30 bpp value
> on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> to avoid the changes of patch 4/5.
>
> Thanks,
> -mario
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-04-16 16:27   ` Mario Kleiner
@ 2021-04-16 17:31     ` Ville Syrjälä
  0 siblings, 0 replies; 20+ messages in thread
From: Ville Syrjälä @ 2021-04-16 17:31 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Alex Deucher, Harry Wentland, dri-devel, amd-gfx list,
	Nicholas Kazlauskas

On Fri, Apr 16, 2021 at 06:27:23PM +0200, Mario Kleiner wrote:
> On Mon, Mar 22, 2021 at 4:52 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> >
> > On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> > > Hi,
> > >
> > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > framebuffers to the core, and then an implementation for AMD gpu's
> > > with DisplayCore.
> > >
> > > This is intended to allow for pageflipping to, and direct scanout of,
> > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> >
> > We should also add support for these formats into igt. Should
> > be semi-easy by just adding the suitable float<->uint16
> > conversion stuff.
> >
> 
> Hi Ville,
> 
> Could you point me to a specific test case / file that I should look
> at for adding this?

lib/igt_fb.c is the main thing. It has a bunch of conversion magic
to support rendering into all kinds of weird framebuffer formats
via cairo.

So this should be mostly a matter of adding convert_uint16_to_float()
and convert_float_to_uint16(), plugging those into fb_convert(),
and declaring the new formats in format_desc[]. There might be
a few little extra details I'm forgetting though.

Once igt_fb has the required stuff kms_plane/pixel-format*
should automagically pick it up if the kernel reports the
format as supported.

Oh, and you need some >1.17 version of cairo for the float
support.
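
A minimal sketch of the per-sample math such conversion helpers would
need, assuming plain unorm scaling with round-to-nearest. This is not
igt code: unorm16_to_float(), float_to_unorm16() and
pack_xbgr16161616() are hypothetical names, and the bit layout follows
the "[63:0] x:B:G:R" comment in the fourcc patch (R in the lowest 16
bits).

```c
#include <stdint.h>

/* Map a 16 bit unorm sample to [0.0, 1.0]. */
static float unorm16_to_float(uint16_t v)
{
	return (float)v / 65535.0f;
}

/* Map a float to a 16 bit unorm sample, clamping to [0.0, 1.0]
 * and rounding to the nearest representable value. */
static uint16_t float_to_unorm16(float f)
{
	if (!(f > 0.0f))		/* also catches NaN */
		return 0;
	if (f >= 1.0f)
		return 65535;
	return (uint16_t)(f * 65535.0f + 0.5f);
}

/* Pack one pixel as XBGR16161616: x in [63:48], B in [47:32],
 * G in [31:16], R in [15:0]. The x bits are set to all ones here,
 * which is an arbitrary choice for this sketch. */
static uint64_t pack_xbgr16161616(float r, float g, float b)
{
	return (uint64_t)float_to_unorm16(r) |
	       ((uint64_t)float_to_unorm16(g) << 16) |
	       ((uint64_t)float_to_unorm16(b) << 32) |
	       ((uint64_t)0xffff << 48);
}
```

The real helpers would presumably operate on whole rows or buffers
with the stride handling that fb_convert() already uses; only the
per-sample conversion is shown here.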

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-04-16 16:29 ` Mario Kleiner
@ 2021-04-20 21:25   ` Alex Deucher
  2021-04-28 21:21     ` Alex Deucher
  0 siblings, 1 reply; 20+ messages in thread
From: Alex Deucher @ 2021-04-20 21:25 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
<mario.kleiner.de@gmail.com> wrote:
>
> Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> Would be great to get this in sooner than later.
>

No objections from me.

Alex


> Thanks and have a nice weekend,
> -mario
>
> On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> <mario.kleiner.de@gmail.com> wrote:
> >
> > Hi,
> >
> > this patch series adds the fourcc's for 16 bit fixed point unorm
> > framebuffers to the core, and then an implementation for AMD gpu's
> > with DisplayCore.
> >
> > This is intended to allow for pageflipping to, and direct scanout of,
> > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> >
> > My main motivation for this is squeezing every bit of precision
> > out of the hardware for scientific and medical research applications,
> > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > precision in the upper half [0.5;1.0] of the unorm range, although
> > the hardware could do at least 12 bpc.
> >
> > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > on my hw, both running at 10 bpc DP output depth.
> >
> > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > Apple Retina panel), all running at 10 bpc output depth.
> >
> > No malfunctions, visual artifacts or other oddities were observed
> > (apart from an adventurous mess of cables and adapters on my desk),
> > suggesting it works.
> >
> > I used my automatic photometer measurement procedure to verify the
> > effective output precision of 10 bpc DP native signal + spatial
> > dithering in the gpu as enabled by the amdgpu driver. Results show
> > the expected 12 bpc precision i hoped for -- the current upper limit
> > for AMD display hw afaik.
> >
> > So it seems to work in the way i hoped :).
> >
> > Some open questions wrt. AMD DC, to be addressed in this patch series, or follow-up
> > patches if necessary:
> >
> > - For the atomic check for plane scaling, the current patch will
> > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > limits, because this is also a 64 bpp format? Or something new
> > entirely?
> >
> > - I haven't added the new fourcc to the DCC tables yet. Should i?
> >
> > - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> > It looks to me as if that assert was inconsistent with other places
> > in the driver where COLOR_DEPTH_121212 is supported, and looking at
> > the code, the change seems harmless. At least on DCE-11.2 the change
> > didn't cause any noticeable (by myself) or measurable (by my equipment)
> > problems on any of the 3 connected displays.
> >
> > - Related to that change, while i needed to increase lb pixelsize to 36bpp
> > to get > 10 bpc effective precision on DCN, i didn't need to do that
> > on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> > to get > 10 bpc precision for fp16 framebuffers, so something seems to
> > behave differently for floating point 16 vs. fixed point 16. This all
> > seems to suggest one could leave lb pixelsize at the old 30 bpp value
> > on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> > to avoid the changes of patch 4/5.
> >
> > Thanks,
> > -mario
> >
> >
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-04-20 21:25   ` Alex Deucher
@ 2021-04-28 21:21     ` Alex Deucher
  2021-05-04 19:22       ` Alex Deucher
  2021-05-06  0:33       ` Mario Kleiner
  0 siblings, 2 replies; 20+ messages in thread
From: Alex Deucher @ 2021-04-28 21:21 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> <mario.kleiner.de@gmail.com> wrote:
> >
> > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > Would be great to get this in sooner than later.
> >
>
> No objections from me.
>

I don't have any objections to merging this.  Are the IGT tests available?

Alex

> Alex
>
>
> > Thanks and have a nice weekend,
> > -mario
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-04-28 21:21     ` Alex Deucher
@ 2021-05-04 19:22       ` Alex Deucher
  2021-05-05 14:01         ` Mario Kleiner
  2021-05-06  0:33       ` Mario Kleiner
  1 sibling, 1 reply; 20+ messages in thread
From: Alex Deucher @ 2021-05-04 19:22 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Wed, Apr 28, 2021 at 5:21 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> > <mario.kleiner.de@gmail.com> wrote:
> > >
> > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > Would be great to get this in sooner than later.
> > >
> >
> > No objections from me.
> >
>
> I don't have any objections to merging this.  Are the IGT tests available?
>

Any preference on whether I merge this through the AMD tree or drm-misc?

Alex


> Alex
>
> > Alex
> >
> >
> > > Thanks and have a nice weekend,
> > > -mario
> > >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-05-04 19:22       ` Alex Deucher
@ 2021-05-05 14:01         ` Mario Kleiner
  0 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-05-05 14:01 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Alex Deucher, dri-devel, amd-gfx list, Nicholas Kazlauskas

On Tue, May 4, 2021 at 9:22 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 5:21 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >
> > > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> > > <mario.kleiner.de@gmail.com> wrote:
> > > >
> > > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > > Would be great to get this in sooner than later.
> > > >
> > >
> > > No objections from me.
> > >
> >
> > I don't have any objections to merging this.  Are the IGT tests available?
> >
>
> Any preference on whether I merge this through the AMD tree or drm-misc?
>
> Alex
>

Hi Alex, in case the question is addressed to me: I prefer
whatever gets it into drm-next asap, so we can sync the drm_fourcc.h
headers from drm-next to the IGT tests, libdrm, amdvlk etc.

Another thing: Unless this would still make it into the Linux 5.13
merge window, we'd also need a KMS_DRIVER_MINOR bump 41 -> 42. This
way amdgpu-pro's Vulkan driver could know about the new 16 bpc pixel
formats for the out of tree amdgpu-dkms package when running against
older kernels.
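
A minimal sketch of the version gate described above, assuming the bump
lands as minor version 42 and that the amdgpu KMS major version is 3
(both values are assumptions here, nothing was merged at this point).
amdgpu_has_rgba16() is a hypothetical helper name; a real client would
take the major/minor numbers from libdrm's drmGetVersion():

```c
#include <stdbool.h>

/* Assumed threshold: first amdgpu KMS driver version that would expose
 * the 16 bpc fixed point formats (41 -> 42 as proposed in this mail). */
#define RGBA16_MIN_KMS_MAJOR 3
#define RGBA16_MIN_KMS_MINOR 42

/* Hypothetical helper: true if the reported driver version is new
 * enough to know about the XRGB/XBGR/ARGB/ABGR16161616 formats. */
static bool amdgpu_has_rgba16(int major, int minor)
{
	if (major != RGBA16_MIN_KMS_MAJOR)
		return major > RGBA16_MIN_KMS_MAJOR;

	return minor >= RGBA16_MIN_KMS_MINOR;
}
```

With such a check, a Vulkan driver running against an out of tree
amdgpu-dkms build on an older kernel could decide from the version
alone whether to offer the new swapchain formats.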

thanks,
-mario

>
> > Alex
> >
> > > Alex
> > >
> > >
> > > > Thanks and have a nice weekend,
> > > > -mario
> > > >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.
  2021-04-28 21:21     ` Alex Deucher
  2021-05-04 19:22       ` Alex Deucher
@ 2021-05-06  0:33       ` Mario Kleiner
  1 sibling, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-05-06  0:33 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Alex Deucher, Ville Syrjälä,
	dri-devel, amd-gfx list, Nicholas Kazlauskas

[-- Attachment #1: Type: text/plain, Size: 6086 bytes --]

On Wed, Apr 28, 2021 at 11:22 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> > <mario.kleiner.de@gmail.com> wrote:
> > >
> > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > Would be great to get this in sooner than later.
> > >
> >
> > No objections from me.
> >
>
> I don't have any objections to merging this.  Are the IGT tests available?
>
> Alex
>

IGT patches are out now, already r-b by Ville, cc'd to you. As
mentioned in the cover letter for those, the new 16 bpc test cases on
top of IGT master for the kms_plane test now work nicely on my
RavenRidge, but I had to add hacks on top of the kms_plane test to get
it to the point where it could execute the tests for the new formats
on RV at all. Unmodified kms_plane from master doesn't even work on RV
with Linux 5.8. Seems IGT is quite a bit out of date wrt. the kernel?

Things i had to do:

- Skip all tests for modifiers other than linear. --> Test
requirements wrt. tiling not met. Seems all the modifier support for
DCC, DCC_RETILE on Vega+ is missing from IGT so far?

- Skip test for format DRM_FORMAT_RGB565. CRC mismatch. Probably
because a 5 bpc container can't represent the net 8 bpc content from
the reference test image? Maybe all tests for < 8 bpc formats should
be skipped?

- Skip tests for yuv planar formats with BT2020 color space: Limited
range unsupported by DC, full range causes CRC mismatch.

- Problems with crc vblank count expected vs. actual for planar YUV formats.

- If the tests try to test more than the primary plane,
igt_pipe_crc_start() fails to open the crtc/crc/data file with -EIO.

See the attached patch with all the needed hacks. Not sure which of
these are limitations of the IGT test, and which are amdgpu bugs or hw
limitations, but applying this hack-patch on top of the patches for
the new formats makes kms_plane pass.
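
To illustrate the guess about the RGB565 CRC mismatch above, here is a
small sketch of why a 5 bit channel cannot carry 8 bpc reference
content without loss. The conversion used (drop the low bits, expand
by bit replication) is an assumption for illustration only; I have not
checked what IGT or the hw actually does, and the helper names are
made up:

```c
#include <stdint.h>

/* Quantize an 8 bit channel value to 5 bits by dropping the low 3 bits. */
static uint8_t quantize_8_to_5(uint8_t v)
{
	return v >> 3;
}

/* Expand 5 bits back to 8 bits by bit replication (assumed method). */
static uint8_t expand_5_to_8(uint8_t q)
{
	return (uint8_t)((q << 3) | (q >> 2));
}

/* Count how many of the 256 possible 8 bit values survive the
 * 8 -> 5 -> 8 bit round trip unchanged. */
static int count_lossless_values(void)
{
	int n = 0;

	for (int v = 0; v < 256; v++)
		if (expand_5_to_8(quantize_8_to_5((uint8_t)v)) == v)
			n++;

	return n;
}
```

Only one value per 5 bit code can come back unchanged, so any
reference image with real 8 bpc content would produce a different CRC
after passing through such a channel.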

-mario





> > Alex
> >
> >
> > > Thanks and have a nice weekend,
> > > -mario
> > >

[-- Attachment #2: 0001-kms_plane-Hacks-to-make-all-AMD-Raven-format-tests-p.patch --]
[-- Type: text/x-patch, Size: 4032 bytes --]

From 09f18c39bcafcd884cad47c7f33e892a57a2bf50 Mon Sep 17 00:00:00 2001
From: Mario Kleiner <mario.kleiner.de@gmail.com>
Date: Sat, 1 May 2021 16:06:12 +0200
Subject: [PATCH i-g-t] kms_plane: Hacks to make all AMD Raven format tests
 pass.

These cause failure of atomic commit or other asserts and cause
the tests to abort:

- Problems with crc vblank count expected vs. actual for planar YUV.
- No support for YUV BT2020 limited range by amdgpu.
- Failure of igt_pipe_crc_start() for secondary planes.
- No support for DCC / DCC_RETILE modifiers introduced in Linux 5.11.

These do not cause abort of all tests iirc., just reporting the mismatch:

- CRC mismatch for YUV BT2020 full range.
- CRC mismatch on all YUV for src crop test.
- CRC mismatch for 16bpp RGB565 format.
- src crop test needs 16 pixel borders for 64bpp rgba16/fp16 formats.
---
 tests/kms_plane.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/tests/kms_plane.c b/tests/kms_plane.c
index 9fe253a8..6eb9a122 100644
--- a/tests/kms_plane.c
+++ b/tests/kms_plane.c
@@ -550,7 +550,7 @@ static void capture_crc(data_t *data, unsigned int vblank, igt_crc_t *crc)
 	igt_pipe_crc_get_for_frame(data->drm_fd, data->pipe_crc, vblank, crc);
 
 	igt_fail_on_f(!igt_skip_crc_compare &&
-		      crc->has_valid_frame && crc->frame != vblank,
+		      crc->has_valid_frame && crc->frame != (vblank + (is_amdgpu_device(data->drm_fd) ? 2 : 0)),
 		      "Got CRC for the wrong frame (got %u, expected %u). CRC buffer overflow?\n",
 		      crc->frame, vblank);
 }
@@ -787,6 +787,15 @@ static bool test_format_plane_yuv(data_t *data, enum pipe pipe,
 						     igt_color_range_to_str(r)))
 				continue;
 
+			/* AMD can't do IGT_COLOR_YCBCR_BT2020 with limited range, only full range,
+			 * otherwise atomic test fails with -EINVAL.
+			 * With full range, we still get crc mismatch.
+			 * Therefore skip IGT_COLOR_YCBCR_BT2020 encodings.
+			 * Also skip tests for src cropping test -> crc mismatch.
+			 */
+			if (is_amdgpu_device(data->drm_fd) && (e == IGT_COLOR_YCBCR_BT2020 || data->crop))
+				continue;
+
 			igt_info("Testing format " IGT_FORMAT_FMT " / modifier 0x%" PRIx64 " (%s, %s) on %s.%u\n",
 				 IGT_FORMAT_ARGS(format), modifier,
 				 igt_color_encoding_to_str(e),
@@ -921,6 +930,15 @@ static bool test_format_plane(data_t *data, enum pipe pipe,
 		    f.modifier == ref.modifier)
 			continue;
 
+		/* Prevent use of yet unsupported DCC / DCC_RETILE modifiers on GFX9+ */
+		if (is_amdgpu_device(data->drm_fd) && (f.modifier != DRM_FORMAT_MOD_LINEAR))
+			continue;
+
+		/* Don't test formats which only hold less than 8 bpc of content */
+		// Or (igt_drm_format_to_bpp(f.format) == 16 && !igt_format_is_yuv(f.format)) for skip due to < 8 bpc?
+		if (f.format == DRM_FORMAT_RGB565)
+			continue;
+
 		/* test each format "class" only once in non-extended tests */
 		if (!data->extended && f.modifier != DRM_FORMAT_MOD_LINEAR) {
 			struct format_mod rf = {
@@ -981,6 +999,14 @@ static bool skip_plane(data_t *data, igt_plane_t *plane)
 	if (data->extended)
 		return false;
 
+	/* igt_pipe_crc_start() fails for futher planes, so only test primary plane.
+	 * The error is a -EIO error when opening the ../crtc/crc/data file, which
+	 * suggests that the DRM crtc_crc_open() function rejects open, because the
+	 * associated crtc is (!crtc->state->active)?
+	 */
+	if (is_amdgpu_device(data->drm_fd) && (plane->type != DRM_PLANE_TYPE_PRIMARY))
+		return true;
+
 	if (!is_i915_device(data->drm_fd))
 		return false;
 
@@ -1073,7 +1099,8 @@ run_tests_for_pipe_plane(data_t *data, enum pipe pipe)
 	igt_describe("verify the pixel formats for given plane and pipe with source clamping");
 	igt_subtest_f("pixel-format-pipe-%s-planes-source-clamping",
 		      kmstest_pipe_name(pipe)) {
-		data->crop = 4;
+		/* At least AMD needs crop to be multiple of 16 for 64bpp pixel formats */
+		data->crop = 4 * (is_amdgpu_device(data->drm_fd) ? 4 : 1);
 		test_pixel_formats(data, pipe);
 	}
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-03-20  2:09       ` Ville Syrjälä
@ 2021-05-06  6:37         ` Ville Syrjälä
  2021-05-13 19:27           ` Mario Kleiner
  0 siblings, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2021-05-06  6:37 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Alex Deucher, amd-gfx list, dri-devel, Nicholas Kazlauskas

On Sat, Mar 20, 2021 at 04:09:47AM +0200, Ville Syrjälä wrote:
> On Fri, Mar 19, 2021 at 10:45:10PM +0100, Mario Kleiner wrote:
> > On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä
> > <ville.syrjala@linux.intel.com> wrote:
> > >
> > > On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > > > These are 16 bits per color channel unsigned normalized formats.
> > > > They are supported by at least AMD display hw, and suitable for
> > > > direct scanout of Vulkan swapchain images in the format
> > > > VK_FORMAT_R16G16B16A16_UNORM.
> > > >
> > > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/drm_fourcc.c  | 4 ++++
> > > >  include/uapi/drm/drm_fourcc.h | 7 +++++++
> > > >  2 files changed, 11 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > > > index 03262472059c..ce13d2be5d7b 100644
> > > > --- a/drivers/gpu/drm/drm_fourcc.c
> > > > +++ b/drivers/gpu/drm/drm_fourcc.c
> > > > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
> > > >               { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > >               { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > >               { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > +             { .format = DRM_FORMAT_XRGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > +             { .format = DRM_FORMAT_XBGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > +             { .format = DRM_FORMAT_ARGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > +             { .format = DRM_FORMAT_ABGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > >               { .format = DRM_FORMAT_RGB888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > >               { .format = DRM_FORMAT_BGR888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > >               { .format = DRM_FORMAT_XRGB8888_A8,     .depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > > > index f76de49c768f..f7156322aba5 100644
> > > > --- a/include/uapi/drm/drm_fourcc.h
> > > > +++ b/include/uapi/drm/drm_fourcc.h
> > > > @@ -168,6 +168,13 @@ extern "C" {
> > > >  #define DRM_FORMAT_RGBA1010102       fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
> > > >  #define DRM_FORMAT_BGRA1010102       fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
> > > >
> > > > +/* 64 bpp RGB */
> > > > +#define DRM_FORMAT_XRGB16161616      fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> > > > +#define DRM_FORMAT_XBGR16161616      fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> > > > +
> > > > +#define DRM_FORMAT_ARGB16161616      fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> > > > +#define DRM_FORMAT_ABGR16161616      fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */
> > >
> > > These look reasonable enough to me. IIRC we should be able to expose
> > > them on some recent Intel hw as well.
> > >
> > > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > 
> > Thanks Ville!
> > 
> > Indeed i looked over the Intel PRM's, and while fp16 support seems to
> > be rather recent (Gen8? Gen9? Gen10? Can't remember atm.), iirc, I
> > found references to rgb16 fixed point back to gen5 / Ironlake.
> 
> fp16 has been around since forever (gen4+)
> uint16 is much more recent, IIRC is something ~glk+

FYI I just hacked something together for i915:
git://github.com/vsyrjala/linux.git uint16

Tests seem to pass on a glk here at least.

-- 
Ville Syrjälä
Intel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.
  2021-05-06  6:37         ` Ville Syrjälä
@ 2021-05-13 19:27           ` Mario Kleiner
  0 siblings, 0 replies; 20+ messages in thread
From: Mario Kleiner @ 2021-05-13 19:27 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Alex Deucher, amd-gfx list, dri-devel, Nicholas Kazlauskas

On Thu, May 6, 2021 at 8:37 AM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Sat, Mar 20, 2021 at 04:09:47AM +0200, Ville Syrjälä wrote:
> > On Fri, Mar 19, 2021 at 10:45:10PM +0100, Mario Kleiner wrote:
> > > On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä
> > > <ville.syrjala@linux.intel.com> wrote:
> > > >
> > > > On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > > > > These are 16 bits per color channel unsigned normalized formats.
> > > > > They are supported by at least AMD display hw, and suitable for
> > > > > direct scanout of Vulkan swapchain images in the format
> > > > > VK_FORMAT_R16G16B16A16_UNORM.
> > > > >
> > > > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/drm_fourcc.c  | 4 ++++
> > > > >  include/uapi/drm/drm_fourcc.h | 7 +++++++
> > > > >  2 files changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > > > > index 03262472059c..ce13d2be5d7b 100644
> > > > > --- a/drivers/gpu/drm/drm_fourcc.c
> > > > > +++ b/drivers/gpu/drm/drm_fourcc.c
> > > > > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
> > > > >               { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >               { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >               { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > > +             { .format = DRM_FORMAT_XRGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > > +             { .format = DRM_FORMAT_XBGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > > +             { .format = DRM_FORMAT_ARGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > > +             { .format = DRM_FORMAT_ABGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >               { .format = DRM_FORMAT_RGB888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >               { .format = DRM_FORMAT_BGR888_A8,       .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >               { .format = DRM_FORMAT_XRGB8888_A8,     .depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > > > > index f76de49c768f..f7156322aba5 100644
> > > > > --- a/include/uapi/drm/drm_fourcc.h
> > > > > +++ b/include/uapi/drm/drm_fourcc.h
> > > > > @@ -168,6 +168,13 @@ extern "C" {
> > > > >  #define DRM_FORMAT_RGBA1010102       fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
> > > > >  #define DRM_FORMAT_BGRA1010102       fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
> > > > >
> > > > > +/* 64 bpp RGB */
> > > > > +#define DRM_FORMAT_XRGB16161616      fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> > > > > +#define DRM_FORMAT_XBGR16161616      fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> > > > > +
> > > > > +#define DRM_FORMAT_ARGB16161616      fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> > > > > +#define DRM_FORMAT_ABGR16161616      fourcc_code('A', 'B', '4', '8') /* [63:0] A:B:G:R 16:16:16:16 little endian */
> > > >
> > > > These look reasonable enough to me. IIRC we should be able to expose
> > > > them on some recent Intel hw as well.
> > > >
> > > > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > >
> > >
> > > Thanks Ville!
> > >
> > > Indeed i looked over the Intel PRM's, and while fp16 support seems to
> > > be rather recent (Gen8? Gen9? Gen10? Can't remember atm.), iirc, I
> > > found references to rgb16 fixed point back to gen5 / Ironlake.
> >
> > fp16 has been around since forever (gen4+)
> > uint16 is much more recent, IIRC is something ~glk+
>
> FYI I just hacked something together for i915:
> git://github.com/vsyrjala/linux.git uint16
>
> Tests seem to pass on a glk here at least.

Great! Thanks for doing this. I reviewed those 3 patches of yours and
they look good to me. I also added R-b's to the individual patches on
your git://github.com/vsyrjala/linux.git uint16:

Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>

Too bad uint16 isn't supported on KBL hw, which is the most modern
Intel hw I have atm, so I can't test them.

-mario



Thread overview: 20+ messages
2021-03-19 21:03 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Mario Kleiner
2021-03-19 21:03 ` [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats Mario Kleiner
2021-03-19 21:16   ` Ville Syrjälä
2021-03-19 21:45     ` Mario Kleiner
2021-03-20  2:09       ` Ville Syrjälä
2021-05-06  6:37         ` Ville Syrjälä
2021-05-13 19:27           ` Mario Kleiner
2021-03-19 21:03 ` [PATCH 2/5] drm/amd/display: Add support for SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 Mario Kleiner
2021-03-19 21:03 ` [PATCH 3/5] drm/amd/display: Increase linebuffer pixel depth to 36bpp Mario Kleiner
2021-03-19 21:03 ` [PATCH 4/5] drm/amd/display: Make assert in DCE's program_bit_depth_reduction more lenient Mario Kleiner
2021-03-19 21:03 ` [PATCH 5/5] drm/amd/display: Enable support for 16 bpc fixed-point framebuffers Mario Kleiner
2021-03-22 15:52 ` 16 bpc fixed point (RGBA16) framebuffer support for core and AMD Ville Syrjälä
2021-04-16 16:27   ` Mario Kleiner
2021-04-16 17:31     ` Ville Syrjälä
2021-04-16 16:29 ` Mario Kleiner
2021-04-20 21:25   ` Alex Deucher
2021-04-28 21:21     ` Alex Deucher
2021-05-04 19:22       ` Alex Deucher
2021-05-05 14:01         ` Mario Kleiner
2021-05-06  0:33       ` Mario Kleiner
