* [PATCH 0/9] drm/i915: SKL+ render decompression support
@ 2017-01-04 18:42 ville.syrjala
  2017-01-04 18:42 ` [PATCH 1/9] drm: Add mode_config .get_format_info() hook ville.syrjala
                   ` (12 more replies)
  0 siblings, 13 replies; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: Ben Widawsky, dri-devel, Laurent Pinchart, Vandana Kannan

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

This series enables the SKL+ display engine render decompression
support. Some bits and pieces of the i915 code are based on work
from various people, but I just put my name on it since it
would be hard to figure out which parts came from where.

Entire series available here:
git://github.com/vsyrjala/linux.git fb_format_dedup_4_ccs

Cc: Vandana Kannan <vandana.kannan@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Ville Syrjälä (9):
  drm: Add mode_config .get_format_info() hook
  drm/i915: Plumb drm_framebuffer into more places
  drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
  drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
  drm/i915: Fix Yf tile width
  drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
  drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
  drm/i915: Implement .get_format_info() hook for CCS
  drm/i915: Add render decompression support

 drivers/gpu/drm/drm_fb_cma_helper.c  |   2 +-
 drivers/gpu/drm/drm_fourcc.c         |  25 ++
 drivers/gpu/drm/drm_framebuffer.c    |   9 +-
 drivers/gpu/drm/drm_modeset_helper.c |   2 +-
 drivers/gpu/drm/i915/i915_reg.h      |  22 ++
 drivers/gpu/drm/i915/intel_display.c | 474 +++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_drv.h     |  11 +-
 drivers/gpu/drm/i915/intel_fbdev.c   |   4 +-
 drivers/gpu/drm/i915/intel_pm.c      |   8 +-
 drivers/gpu/drm/i915/intel_sprite.c  |   5 +
 include/drm/drm_fourcc.h             |   6 +
 include/drm/drm_mode_config.h        |  14 ++
 include/uapi/drm/drm_fourcc.h        |  49 ++++
 13 files changed, 478 insertions(+), 153 deletions(-)

-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 1/9] drm: Add mode_config .get_format_info() hook
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-01-05  3:15   ` Ben Widawsky
  2017-01-04 18:42 ` [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places ville.syrjala
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Laurent Pinchart, dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Allow drivers to return a custom drm_format_info structure for special
fb layouts. We'll use this for the compression control surface in i915.

v2: Fix drm_get_format_info() kernel doc (Laurent)
    Don't pass 'dev' to the new hook (Laurent)

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
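Reader's note, outside the commit message: the lookup order described
above (ask the driver hook first, fall back to the generic table) can be
sketched in plain C as follows. This is only an illustration under stated
assumptions. The types fmt_info and fb_cmd, the MOD_CCS value, and the
two-plane CCS entry are hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for drm_format_info / drm_mode_fb_cmd2. */
struct fmt_info {
	uint32_t format;
	int num_planes;
};

struct fb_cmd {
	uint32_t pixel_format;
	uint64_t modifier;
};

typedef const struct fmt_info *(*get_format_info_fn)(const struct fb_cmd *cmd);

#define FMT_XRGB 0x34325258u	/* illustrative fourcc value */
#define MOD_CCS  1ull		/* hypothetical modifier value */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Generic table: one plane per format. */
static const struct fmt_info generic_formats[] = {
	{ FMT_XRGB, 1 },
};

/* Driver table: same fourcc, but with an extra auxiliary plane (assumed). */
static const struct fmt_info ccs_formats[] = {
	{ FMT_XRGB, 2 },
};

/* Generic lookup keyed only on the pixel format. */
static const struct fmt_info *generic_format_info(uint32_t format)
{
	for (size_t i = 0; i < ARRAY_LEN(generic_formats); i++)
		if (generic_formats[i].format == format)
			return &generic_formats[i];
	return NULL;
}

/* Driver hook: only answers for its special layout, NULL otherwise. */
static const struct fmt_info *driver_get_format_info(const struct fb_cmd *cmd)
{
	if (cmd->modifier != MOD_CCS)
		return NULL;

	for (size_t i = 0; i < ARRAY_LEN(ccs_formats); i++)
		if (ccs_formats[i].format == cmd->pixel_format)
			return &ccs_formats[i];
	return NULL;
}

/* Hook first, generic table as the fallback; NULL if neither knows the format. */
static const struct fmt_info *get_format_info(get_format_info_fn hook,
					      const struct fb_cmd *cmd)
{
	const struct fmt_info *info = NULL;

	if (hook)
		info = hook(cmd);
	if (!info)
		info = generic_format_info(cmd->pixel_format);

	return info;
}
```

A driver without a hook (hook == NULL) keeps the generic behaviour, which
is why callers can switch to the new helper unconditionally.
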
 drivers/gpu/drm/drm_fb_cma_helper.c  |  2 +-
 drivers/gpu/drm/drm_fourcc.c         | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/drm_framebuffer.c    |  9 +++++++--
 drivers/gpu/drm/drm_modeset_helper.c |  2 +-
 include/drm/drm_fourcc.h             |  6 ++++++
 include/drm/drm_mode_config.h        | 14 ++++++++++++++
 6 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_cma_helper.c b/drivers/gpu/drm/drm_fb_cma_helper.c
index 20a4011f790d..de3c9fe116fc 100644
--- a/drivers/gpu/drm/drm_fb_cma_helper.c
+++ b/drivers/gpu/drm/drm_fb_cma_helper.c
@@ -177,7 +177,7 @@ struct drm_framebuffer *drm_fb_cma_create_with_funcs(struct drm_device *dev,
 	int ret;
 	int i;
 
-	info = drm_format_info(mode_cmd->pixel_format);
+	info = drm_get_format_info(dev, mode_cmd);
 	if (!info)
 		return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
index 90d2cc8da8eb..f9b6445e846a 100644
--- a/drivers/gpu/drm/drm_fourcc.c
+++ b/drivers/gpu/drm/drm_fourcc.c
@@ -199,6 +199,31 @@ const struct drm_format_info *drm_format_info(u32 format)
 EXPORT_SYMBOL(drm_format_info);
 
 /**
+ * drm_get_format_info - query information for a given framebuffer configuration
+ * @dev: DRM device
+ * @mode_cmd: metadata from the userspace fb creation request
+ *
+ * Returns:
+ * The instance of struct drm_format_info that describes the pixel format, or
+ * NULL if the format is unsupported.
+ */
+const struct drm_format_info *
+drm_get_format_info(struct drm_device *dev,
+		    const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	const struct drm_format_info *info = NULL;
+
+	if (dev->mode_config.funcs->get_format_info)
+		info = dev->mode_config.funcs->get_format_info(mode_cmd);
+
+	if (!info)
+		info = drm_format_info(mode_cmd->pixel_format);
+
+	return info;
+}
+EXPORT_SYMBOL(drm_get_format_info);
+
+/**
  * drm_format_num_planes - get the number of planes for format
  * @format: pixel format (DRM_FORMAT_*)
  *
diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index 588ccc3a2218..e276fcdc3a62 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -126,11 +126,13 @@ int drm_mode_addfb(struct drm_device *dev,
 	return 0;
 }
 
-static int framebuffer_check(const struct drm_mode_fb_cmd2 *r)
+static int framebuffer_check(struct drm_device *dev,
+			     const struct drm_mode_fb_cmd2 *r)
 {
 	const struct drm_format_info *info;
 	int i;
 
+	/* check if the format is supported at all */
 	info = __drm_format_info(r->pixel_format & ~DRM_FORMAT_BIG_ENDIAN);
 	if (!info) {
 		struct drm_format_name_buf format_name;
@@ -140,6 +142,9 @@ static int framebuffer_check(const struct drm_mode_fb_cmd2 *r)
 		return -EINVAL;
 	}
 
+	/* now let the driver pick its own format info */
+	info = drm_get_format_info(dev, r);
+
 	if (r->width == 0 || r->width % info->hsub) {
 		DRM_DEBUG_KMS("bad framebuffer width %u\n", r->width);
 		return -EINVAL;
@@ -263,7 +268,7 @@ drm_internal_framebuffer_create(struct drm_device *dev,
 		return ERR_PTR(-EINVAL);
 	}
 
-	ret = framebuffer_check(r);
+	ret = framebuffer_check(dev, r);
 	if (ret)
 		return ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/drm_modeset_helper.c b/drivers/gpu/drm/drm_modeset_helper.c
index cc44a9a4b004..2b33825f2f93 100644
--- a/drivers/gpu/drm/drm_modeset_helper.c
+++ b/drivers/gpu/drm/drm_modeset_helper.c
@@ -78,7 +78,7 @@ void drm_helper_mode_fill_fb_struct(struct drm_device *dev,
 	int i;
 
 	fb->dev = dev;
-	fb->format = drm_format_info(mode_cmd->pixel_format);
+	fb->format = drm_get_format_info(dev, mode_cmd);
 	fb->width = mode_cmd->width;
 	fb->height = mode_cmd->height;
 	for (i = 0; i < 4; i++) {
diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h
index fcc08da850c8..6942e84b6edd 100644
--- a/include/drm/drm_fourcc.h
+++ b/include/drm/drm_fourcc.h
@@ -25,6 +25,9 @@
 #include <linux/types.h>
 #include <uapi/drm/drm_fourcc.h>
 
+struct drm_device;
+struct drm_mode_fb_cmd2;
+
 /**
  * struct drm_format_info - information about a DRM format
  * @format: 4CC format identifier (DRM_FORMAT_*)
@@ -55,6 +58,9 @@ struct drm_format_name_buf {
 
 const struct drm_format_info *__drm_format_info(u32 format);
 const struct drm_format_info *drm_format_info(u32 format);
+const struct drm_format_info *
+drm_get_format_info(struct drm_device *dev,
+		    const struct drm_mode_fb_cmd2 *mode_cmd);
 uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth);
 int drm_format_num_planes(uint32_t format);
 int drm_format_plane_cpp(uint32_t format, int plane);
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 17942c0f32a8..b5c2eb825965 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -34,6 +34,7 @@ struct drm_file;
 struct drm_device;
 struct drm_atomic_state;
 struct drm_mode_fb_cmd2;
+struct drm_format_info;
 
 /**
  * struct drm_mode_config_funcs - basic driver provided mode setting functions
@@ -70,6 +71,19 @@ struct drm_mode_config_funcs {
 					     const struct drm_mode_fb_cmd2 *mode_cmd);
 
 	/**
+	 * @get_format_info:
+	 *
+	 * Allows a driver to return custom format information for special
+	 * fb layouts (eg. ones with auxiliary compression control planes).
+	 *
+	 * RETURNS:
+	 *
+	 * The format information specific to the given fb metadata, or
+	 * NULL if none is found.
+	 */
+	const struct drm_format_info *(*get_format_info)(const struct drm_mode_fb_cmd2 *mode_cmd);
+
+	/**
 	 * @output_poll_changed:
 	 *
 	 * Callback used by helpers to inform the driver of output configuration
-- 
2.10.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
  2017-01-04 18:42 ` [PATCH 1/9] drm: Add mode_config .get_format_info() hook ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 13:30   ` [Intel-gfx] " Imre Deak
  2017-01-04 18:42 ` [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment() ville.syrjala
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Now that framebuffers can be used even before calling
drm_framebuffer_init() we can start to plumb them into more places,
instead of passing individual pieces for fb metadata.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 127 +++++++++++++++--------------------
 drivers/gpu/drm/i915/intel_drv.h     |  11 +--
 drivers/gpu/drm/i915/intel_fbdev.c   |   4 +-
 3 files changed, 57 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index e2150a64860c..f0cb80aba89a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2050,10 +2050,13 @@ static unsigned int intel_tile_size(const struct drm_i915_private *dev_priv)
 	return IS_GEN2(dev_priv) ? 2048 : 4096;
 }
 
-static unsigned int intel_tile_width_bytes(const struct drm_i915_private *dev_priv,
-					   uint64_t fb_modifier, unsigned int cpp)
+static unsigned int
+intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
 {
-	switch (fb_modifier) {
+	struct drm_i915_private *dev_priv = to_i915(fb->dev);
+	unsigned int cpp = fb->format->cpp[plane];
+
+	switch (fb->modifier) {
 	case DRM_FORMAT_MOD_NONE:
 		return cpp;
 	case I915_FORMAT_MOD_X_TILED:
@@ -2082,41 +2085,38 @@ static unsigned int intel_tile_width_bytes(const struct drm_i915_private *dev_pr
 		}
 		break;
 	default:
-		MISSING_CASE(fb_modifier);
+		MISSING_CASE(fb->modifier);
 		return cpp;
 	}
 }
 
-unsigned int intel_tile_height(const struct drm_i915_private *dev_priv,
-			       uint64_t fb_modifier, unsigned int cpp)
+static unsigned int
+intel_tile_height(const struct drm_framebuffer *fb, int plane)
 {
-	if (fb_modifier == DRM_FORMAT_MOD_NONE)
+	if (fb->modifier == DRM_FORMAT_MOD_NONE)
 		return 1;
 	else
-		return intel_tile_size(dev_priv) /
-			intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
+		return intel_tile_size(to_i915(fb->dev)) /
+			intel_tile_width_bytes(fb, plane);
 }
 
 /* Return the tile dimensions in pixel units */
-static void intel_tile_dims(const struct drm_i915_private *dev_priv,
+static void intel_tile_dims(const struct drm_framebuffer *fb, int plane,
 			    unsigned int *tile_width,
-			    unsigned int *tile_height,
-			    uint64_t fb_modifier,
-			    unsigned int cpp)
+			    unsigned int *tile_height)
 {
-	unsigned int tile_width_bytes =
-		intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
+	unsigned int tile_width_bytes = intel_tile_width_bytes(fb, plane);
+	unsigned int cpp = fb->format->cpp[plane];
 
 	*tile_width = tile_width_bytes / cpp;
-	*tile_height = intel_tile_size(dev_priv) / tile_width_bytes;
+	*tile_height = intel_tile_size(to_i915(fb->dev)) / tile_width_bytes;
 }
 
 unsigned int
-intel_fb_align_height(struct drm_device *dev, unsigned int height,
-		      uint32_t pixel_format, uint64_t fb_modifier)
+intel_fb_align_height(const struct drm_framebuffer *fb,
+		      int plane, unsigned int height)
 {
-	unsigned int cpp = drm_format_plane_cpp(pixel_format, 0);
-	unsigned int tile_height = intel_tile_height(to_i915(dev), fb_modifier, cpp);
+	unsigned int tile_height = intel_tile_height(fb, plane);
 
 	return ALIGN(height, tile_height);
 }
@@ -2158,21 +2158,23 @@ static unsigned int intel_linear_alignment(const struct drm_i915_private *dev_pr
 		return 0;
 }
 
-static unsigned int intel_surf_alignment(const struct drm_i915_private *dev_priv,
-					 uint64_t fb_modifier)
+static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
+					 int plane)
 {
-	switch (fb_modifier) {
+	struct drm_i915_private *dev_priv = to_i915(fb->dev);
+
+	switch (fb->modifier) {
 	case DRM_FORMAT_MOD_NONE:
 		return intel_linear_alignment(dev_priv);
 	case I915_FORMAT_MOD_X_TILED:
-		if (INTEL_INFO(dev_priv)->gen >= 9)
+		if (INTEL_GEN(dev_priv) >= 9)
 			return 256 * 1024;
 		return 0;
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		return 1 * 1024 * 1024;
 	default:
-		MISSING_CASE(fb_modifier);
+		MISSING_CASE(fb->modifier);
 		return 0;
 	}
 }
@@ -2189,7 +2191,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
 
 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	alignment = intel_surf_alignment(dev_priv, fb->modifier);
+	alignment = intel_surf_alignment(fb, 0);
 
 	intel_fill_fb_ggtt_view(&view, fb, rotation);
 
@@ -2355,8 +2357,7 @@ static u32 intel_adjust_tile_offset(int *x, int *y,
 		unsigned int pitch_tiles;
 
 		tile_size = intel_tile_size(dev_priv);
-		intel_tile_dims(dev_priv, &tile_width, &tile_height,
-				fb->modifier, cpp);
+		intel_tile_dims(fb, plane, &tile_width, &tile_height);
 
 		if (drm_rotation_90_or_270(rotation)) {
 			pitch_tiles = pitch / tile_height;
@@ -2411,8 +2412,7 @@ static u32 _intel_compute_tile_offset(const struct drm_i915_private *dev_priv,
 		unsigned int tile_rows, tiles, pitch_tiles;
 
 		tile_size = intel_tile_size(dev_priv);
-		intel_tile_dims(dev_priv, &tile_width, &tile_height,
-				fb_modifier, cpp);
+		intel_tile_dims(fb, plane, &tile_width, &tile_height);
 
 		if (drm_rotation_90_or_270(rotation)) {
 			pitch_tiles = pitch / tile_height;
@@ -2458,7 +2458,7 @@ u32 intel_compute_tile_offset(int *x, int *y,
 	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
 		alignment = 4096;
 	else
-		alignment = intel_surf_alignment(dev_priv, fb->modifier);
+		alignment = intel_surf_alignment(fb, plane);
 
 	return _intel_compute_tile_offset(dev_priv, x, y, fb, plane, pitch,
 					  rotation, alignment);
@@ -2544,8 +2544,7 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 			unsigned int pitch_tiles;
 			struct drm_rect r;
 
-			intel_tile_dims(dev_priv, &tile_width, &tile_height,
-					fb->modifier, cpp);
+			intel_tile_dims(fb, i, &tile_width, &tile_height);
 
 			rot_info->plane[i].offset = offset;
 			rot_info->plane[i].stride = DIV_ROUND_UP(fb->pitches[i], tile_width * cpp);
@@ -2873,7 +2872,6 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 
 static int skl_check_main_surface(struct intel_plane_state *plane_state)
 {
-	const struct drm_i915_private *dev_priv = to_i915(plane_state->base.plane->dev);
 	const struct drm_framebuffer *fb = plane_state->base.fb;
 	unsigned int rotation = plane_state->base.rotation;
 	int x = plane_state->base.src.x1 >> 16;
@@ -2892,8 +2890,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 
 	intel_add_fb_offsets(&x, &y, plane_state, 0);
 	offset = intel_compute_tile_offset(&x, &y, plane_state, 0);
-
-	alignment = intel_surf_alignment(dev_priv, fb->modifier);
+	alignment = intel_surf_alignment(fb, 0);
 
 	/*
 	 * AUX surface offset is specified as the distance from the
@@ -3210,16 +3207,13 @@ static void ironlake_update_primary_plane(struct drm_plane *primary,
 	POSTING_READ(reg);
 }
 
-u32 intel_fb_stride_alignment(const struct drm_i915_private *dev_priv,
-			      uint64_t fb_modifier, uint32_t pixel_format)
+static u32
+intel_fb_stride_alignment(const struct drm_framebuffer *fb, int plane)
 {
-	if (fb_modifier == DRM_FORMAT_MOD_NONE) {
+	if (fb->modifier == DRM_FORMAT_MOD_NONE)
 		return 64;
-	} else {
-		int cpp = drm_format_plane_cpp(pixel_format, 0);
-
-		return intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
-	}
+	else
+		return intel_tile_width_bytes(fb, plane);
 }
 
 u32 intel_fb_gtt_offset(struct drm_framebuffer *fb,
@@ -3269,21 +3263,16 @@ static void skl_detach_scalers(struct intel_crtc *intel_crtc)
 u32 skl_plane_stride(const struct drm_framebuffer *fb, int plane,
 		     unsigned int rotation)
 {
-	const struct drm_i915_private *dev_priv = to_i915(fb->dev);
 	u32 stride = intel_fb_pitch(fb, plane, rotation);
 
 	/*
 	 * The stride is either expressed as a multiple of 64 bytes chunks for
 	 * linear buffers or in number of tiles for tiled buffers.
 	 */
-	if (drm_rotation_90_or_270(rotation)) {
-		int cpp = fb->format->cpp[plane];
-
-		stride /= intel_tile_height(dev_priv, fb->modifier, cpp);
-	} else {
-		stride /= intel_fb_stride_alignment(dev_priv, fb->modifier,
-						    fb->format->format);
-	}
+	if (drm_rotation_90_or_270(rotation))
+		stride /= intel_tile_height(fb, plane);
+	else
+		stride /= intel_fb_stride_alignment(fb, plane);
 
 	return stride;
 }
@@ -8773,9 +8762,7 @@ i9xx_get_initial_plane_config(struct intel_crtc *crtc,
 	val = I915_READ(DSPSTRIDE(pipe));
 	fb->pitches[0] = val & 0xffffffc0;
 
-	aligned_height = intel_fb_align_height(dev, fb->height,
-					       fb->format->format,
-					       fb->modifier);
+	aligned_height = intel_fb_align_height(fb, 0, fb->height);
 
 	plane_config->size = fb->pitches[0] * aligned_height;
 
@@ -9810,13 +9797,10 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 	fb->width = ((val >> 0) & 0x1fff) + 1;
 
 	val = I915_READ(PLANE_STRIDE(pipe, 0));
-	stride_mult = intel_fb_stride_alignment(dev_priv, fb->modifier,
-						fb->format->format);
+	stride_mult = intel_fb_stride_alignment(fb, 0);
 	fb->pitches[0] = (val & 0x3ff) * stride_mult;
 
-	aligned_height = intel_fb_align_height(dev, fb->height,
-					       fb->format->format,
-					       fb->modifier);
+	aligned_height = intel_fb_align_height(fb, 0, fb->height);
 
 	plane_config->size = fb->pitches[0] * aligned_height;
 
@@ -9912,9 +9896,7 @@ ironlake_get_initial_plane_config(struct intel_crtc *crtc,
 	val = I915_READ(DSPSTRIDE(pipe));
 	fb->pitches[0] = val & 0xffffffc0;
 
-	aligned_height = intel_fb_align_height(dev, fb->height,
-					       fb->format->format,
-					       fb->modifier);
+	aligned_height = intel_fb_align_height(fb, 0, fb->height);
 
 	plane_config->size = fb->pitches[0] * aligned_height;
 
@@ -15967,15 +15949,6 @@ static int intel_framebuffer_init(struct drm_device *dev,
 		return -EINVAL;
 	}
 
-	stride_alignment = intel_fb_stride_alignment(dev_priv,
-						     mode_cmd->modifier[0],
-						     mode_cmd->pixel_format);
-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
-			  mode_cmd->pitches[0], stride_alignment);
-		return -EINVAL;
-	}
-
 	pitch_limit = intel_fb_pitch_limit(dev_priv, mode_cmd->modifier[0],
 					   mode_cmd->pixel_format);
 	if (mode_cmd->pitches[0] > pitch_limit) {
@@ -16057,6 +16030,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
 		return -EINVAL;
 
 	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
+
+	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
+	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
+		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
+			  mode_cmd->pitches[0], stride_alignment);
+		return -EINVAL;
+	}
+
 	intel_fb->obj = obj;
 
 	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6b02dac6ea26..e3d19fc6720c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1208,12 +1208,8 @@ void intel_ddi_set_vc_payload_alloc(struct drm_crtc *crtc, bool state);
 uint32_t ddi_signal_levels(struct intel_dp *intel_dp);
 struct intel_shared_dpll *intel_ddi_get_link_dpll(struct intel_dp *intel_dp,
 						  int clock);
-unsigned int intel_fb_align_height(struct drm_device *dev,
-				   unsigned int height,
-				   uint32_t pixel_format,
-				   uint64_t fb_format_modifier);
-u32 intel_fb_stride_alignment(const struct drm_i915_private *dev_priv,
-			      uint64_t fb_modifier, uint32_t pixel_format);
+unsigned int intel_fb_align_height(const struct drm_framebuffer *fb,
+				   int plane, unsigned int height);
 
 /* intel_audio.c */
 void intel_init_audio_hooks(struct drm_i915_private *dev_priv);
@@ -1325,9 +1321,6 @@ int intel_plane_atomic_set_property(struct drm_plane *plane,
 int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 				    struct drm_plane_state *plane_state);
 
-unsigned int intel_tile_height(const struct drm_i915_private *dev_priv,
-			       uint64_t fb_modifier, unsigned int cpp);
-
 void assert_pch_transcoder_disabled(struct drm_i915_private *dev_priv,
 				    enum pipe pipe);
 
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 73d02d21c768..3716554e32a9 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -631,9 +631,7 @@ static bool intel_fbdev_init_bios(struct drm_device *dev,
 		}
 
 		cur_size = intel_crtc->config->base.adjusted_mode.crtc_vdisplay;
-		cur_size = intel_fb_align_height(dev, cur_size,
-						 fb->base.format->format,
-						 fb->base.modifier);
+		cur_size = intel_fb_align_height(&fb->base, 0, cur_size);
 		cur_size *= fb->base.pitches[0];
 		DRM_DEBUG_KMS("pipe %c area: %dx%d, bpp: %d, size: %d\n",
 			      pipe_name(intel_crtc->pipe),
-- 
2.10.2

* [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
  2017-01-04 18:42 ` [PATCH 1/9] drm: Add mode_config .get_format_info() hook ville.syrjala
  2017-01-04 18:42 ` [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 13:34   ` Imre Deak
  2017-01-04 18:42 ` [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane ville.syrjala
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Let's try to keep the alignment requirements in one place, and so
towards that end let's move the AUX_DIST alignment handling into
intel_surf_alignment() alongside the main surface alignment stuff.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
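Reader's note, outside the commit message: a minimal sketch of what
"alignment requirements in one place" looks like once the NV12 chroma
special case lives next to the per-modifier cases. The struct fb_desc and
its fields are hypothetical; the numeric values mirror the ones visible in
the diffs of this series, and the linear alignment is passed in because it
is platform dependent.

```c
#include <stdint.h>

enum tiling { TILING_LINEAR, TILING_X, TILING_Y, TILING_YF };

/* Hypothetical, flattened description of a framebuffer. */
struct fb_desc {
	enum tiling tiling;
	int is_nv12;			/* non-zero for an NV12 framebuffer */
	int gen;			/* hardware generation */
	uint32_t linear_alignment;	/* platform-specific, supplied by caller */
};

/* Single entry point for surface alignment, including the chroma plane. */
static uint32_t surf_alignment(const struct fb_desc *fb, int plane)
{
	/* The NV12 chroma plane (AUX_DIST) needs only 4K alignment. */
	if (fb->is_nv12 && plane == 1)
		return 4096;

	switch (fb->tiling) {
	case TILING_LINEAR:
		return fb->linear_alignment;
	case TILING_X:
		return fb->gen >= 9 ? 256 * 1024 : 0;
	case TILING_Y:
	case TILING_YF:
		return 1 * 1024 * 1024;
	}

	return 0;
}
```

With the check inside the helper, callers such as the tile offset
computation no longer need their own NV12 branch.
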
 drivers/gpu/drm/i915/intel_display.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f0cb80aba89a..4d514ca1da88 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2163,6 +2163,10 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 {
 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
 
+	/* AUX_DIST needs only 4K alignment */
+	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
+		return 4096;
+
 	switch (fb->modifier) {
 	case DRM_FORMAT_MOD_NONE:
 		return intel_linear_alignment(dev_priv);
@@ -2452,13 +2456,7 @@ u32 intel_compute_tile_offset(int *x, int *y,
 	const struct drm_framebuffer *fb = state->base.fb;
 	unsigned int rotation = state->base.rotation;
 	int pitch = intel_fb_pitch(fb, plane, rotation);
-	u32 alignment;
-
-	/* AUX_DIST needs only 4K alignment */
-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
-		alignment = 4096;
-	else
-		alignment = intel_surf_alignment(fb, plane);
+	u32 alignment = intel_surf_alignment(fb, plane);
 
 	return _intel_compute_tile_offset(dev_priv, x, y, fb, plane, pitch,
 					  rotation, alignment);
-- 
2.10.2

* [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (2 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment() ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 13:38   ` Imre Deak
  2017-01-04 18:42 ` [PATCH 5/9] drm/i915: Fix Yf tile width ville.syrjala
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

To make life easier let's allow skl_plane_stride() to be called for the
AUX surface even when there is no AUX surface. Avoids special cases in
the callers.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
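Reader's note, outside the commit message: the guard can be illustrated
with a small standalone sketch. The struct fb_layout and the per-plane
stride_unit array are hypothetical; the sketch assumes an unused plane has
a stride unit of 0, which is what would make an unguarded division trap.

```c
#include <stdint.h>

/* Hypothetical, simplified framebuffer layout. */
struct fb_layout {
	int num_planes;
	uint32_t pitches[4];		/* bytes per row, per plane */
	uint32_t stride_unit[4];	/* 64-byte chunks or tile width in bytes;
					 * assumed to be 0 for unused planes */
};

/* Stride in hardware units; 0 for a plane the framebuffer does not have. */
static uint32_t plane_stride(const struct fb_layout *fb, int plane)
{
	/* Bail out before dividing by the (zero) unit of a missing plane. */
	if (plane >= fb->num_planes)
		return 0;

	return fb->pitches[plane] / fb->stride_unit[plane];
}
```

A caller can then program the AUX stride unconditionally, for example
plane_stride(fb, 1), and get 0 when there is no AUX surface.
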
 drivers/gpu/drm/i915/intel_display.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4d514ca1da88..bc398743e941 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3261,7 +3261,12 @@ static void skl_detach_scalers(struct intel_crtc *intel_crtc)
 u32 skl_plane_stride(const struct drm_framebuffer *fb, int plane,
 		     unsigned int rotation)
 {
-	u32 stride = intel_fb_pitch(fb, plane, rotation);
+	u32 stride;
+
+	if (plane >= fb->format->num_planes)
+		return 0;
+
+	stride = intel_fb_pitch(fb, plane, rotation);
 
 	/*
 	 * The stride is either expressed as a multiple of 64 bytes chunks for
-- 
2.10.2

* [PATCH 5/9] drm/i915: Fix Yf tile width
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (3 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 15:07   ` Imre Deak
  2017-01-04 18:42 ` [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset() ville.syrjala
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Based on empirical evidence the display engine (at least) always
treats Yf tiles as 128Bx32. Currently we're assuming the tile dimensions
change based on the pixel format. Let's adjust our code to match the
hardware.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
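Reader's note, outside the commit message: the "128Bx32" figure follows
from the 4096-byte tile size used on everything except gen2, since
4096 / 128 = 32 rows. A small sketch of the arithmetic, with hypothetical
names, shows that only the width in pixels still varies with cpp.

```c
#include <stdint.h>

#define TILE_SIZE_BYTES     4096u	/* tile size on non-gen2 hardware */
#define YF_TILE_WIDTH_BYTES 128u	/* fixed width, independent of cpp */

/* Yf tile dimensions in pixel units for a given bytes-per-pixel value. */
static void yf_tile_dims(unsigned int cpp,
			 unsigned int *tile_width,
			 unsigned int *tile_height)
{
	*tile_width = YF_TILE_WIDTH_BYTES / cpp;
	*tile_height = TILE_SIZE_BYTES / YF_TILE_WIDTH_BYTES;
}
```

Under the old per-cpp table a 1 byte-per-pixel format would have used a
64-byte-wide tile (64 rows), whereas this sketch always yields 32 rows.
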
 drivers/gpu/drm/i915/intel_display.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index bc398743e941..0ca0dbccc005 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2070,20 +2070,12 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
 		else
 			return 512;
 	case I915_FORMAT_MOD_Yf_TILED:
-		switch (cpp) {
-		case 1:
-			return 64;
-		case 2:
-		case 4:
-			return 128;
-		case 8:
-		case 16:
-			return 256;
-		default:
-			MISSING_CASE(cpp);
-			return cpp;
-		}
-		break;
+		/*
+		 * Bspec seems to suggest that the Yf tile width would
+		 * depend on the cpp. In reality it doesn't, at least
+		 * as far as the display engine is concerned.
+		 */
+		return 128;
 	default:
 		MISSING_CASE(fb->modifier);
 		return cpp;
-- 
2.10.2

* [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (4 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 5/9] drm/i915: Fix Yf tile width ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 16:10   ` Imre Deak
  2017-01-04 18:42 ` [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages ville.syrjala
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

intel_fill_fb_info() should pass the correct plane index to
_intel_compute_tile_offset() once we start to care about the AUX
surface.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0ca0dbccc005..5fee5a7ac9a4 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2525,7 +2525,7 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 		intel_fb->normal[i].y = y;
 
 		offset = _intel_compute_tile_offset(dev_priv, &x, &y,
-						    fb, 0, fb->pitches[i],
+						    fb, i, fb->pitches[i],
 						    DRM_ROTATE_0, tile_size);
 		offset /= tile_size;
 
-- 
2.10.2

* [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (5 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset() ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-02-02 16:19   ` Imre Deak
  2017-01-04 18:42 ` [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS ville.syrjala
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

DRM_UT_CORE generates way too much noise usually, so having the
framebuffer init failures use DRM_UT_CORE is a pain when trying to
find out the reason why you failed in creating a framebuffer.
Let's use DRM_UT_KMS for these debug messages instead.
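
To illustrate why the category matters, here is a minimal user-space
sketch of category-gated debug output. The DRM_UT_CORE and DRM_UT_KMS
bit values are assumed to match the ones in drmP.h; debug_mask and
debug_enabled() are hypothetical stand-ins for the drm.debug module
parameter and are not kernel API.

```c
#include <stdbool.h>

/* Category bits, assumed to match the DRM_UT_* values in drmP.h. */
#define DRM_UT_CORE 0x01
#define DRM_UT_KMS  0x04

/* Hypothetical stand-in for the drm.debug module parameter. */
static unsigned int debug_mask;

/* A message is emitted only if its category bit is set in the mask. */
static bool debug_enabled(unsigned int category)
{
	return (debug_mask & category) != 0;
}
```

With only the KMS bit set in the mask, the framebuffer failure messages
are printed while the much noisier core messages stay out of the log.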

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 66 ++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 5fee5a7ac9a4..c4662b2e9613 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2512,8 +2512,8 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 		 */
 		if (i915_gem_object_is_tiled(intel_fb->obj) &&
 		    (x + width) * cpp > fb->pitches[i]) {
-			DRM_DEBUG("bad fb plane %d offset: 0x%x\n",
-				  i, fb->offsets[i]);
+			DRM_DEBUG_KMS("bad fb plane %d offset: 0x%x\n",
+				      i, fb->offsets[i]);
 			return -EINVAL;
 		}
 
@@ -2594,9 +2594,9 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 		max_size = max(max_size, offset + size);
 	}
 
-	if (max_size * tile_size > to_intel_framebuffer(fb)->obj->base.size) {
-		DRM_DEBUG("fb too big for bo (need %u bytes, have %zu bytes)\n",
-			  max_size * tile_size, to_intel_framebuffer(fb)->obj->base.size);
+	if (max_size * tile_size > intel_fb->obj->base.size) {
+		DRM_DEBUG_KMS("fb too big for bo (need %u bytes, have %zu bytes)\n",
+			      max_size * tile_size, intel_fb->obj->base.size);
 		return -EINVAL;
 	}
 
@@ -15904,14 +15904,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
 		 */
 		if (tiling != I915_TILING_NONE &&
 		    tiling != intel_fb_modifier_to_tiling(mode_cmd->modifier[0])) {
-			DRM_DEBUG("tiling_mode doesn't match fb modifier\n");
+			DRM_DEBUG_KMS("tiling_mode doesn't match fb modifier\n");
 			return -EINVAL;
 		}
 	} else {
 		if (tiling == I915_TILING_X) {
 			mode_cmd->modifier[0] = I915_FORMAT_MOD_X_TILED;
 		} else if (tiling == I915_TILING_Y) {
-			DRM_DEBUG("No Y tiling for legacy addfb\n");
+			DRM_DEBUG_KMS("No Y tiling for legacy addfb\n");
 			return -EINVAL;
 		}
 	}
@@ -15921,16 +15921,16 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		if (INTEL_GEN(dev_priv) < 9) {
-			DRM_DEBUG("Unsupported tiling 0x%llx!\n",
-				  mode_cmd->modifier[0]);
+			DRM_DEBUG_KMS("Unsupported tiling 0x%llx!\n",
+				      mode_cmd->modifier[0]);
 			return -EINVAL;
 		}
 	case DRM_FORMAT_MOD_NONE:
 	case I915_FORMAT_MOD_X_TILED:
 		break;
 	default:
-		DRM_DEBUG("Unsupported fb modifier 0x%llx!\n",
-			  mode_cmd->modifier[0]);
+		DRM_DEBUG_KMS("Unsupported fb modifier 0x%llx!\n",
+			      mode_cmd->modifier[0]);
 		return -EINVAL;
 	}
 
@@ -15940,17 +15940,17 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	 */
 	if (INTEL_INFO(dev_priv)->gen < 4 &&
 	    tiling != intel_fb_modifier_to_tiling(mode_cmd->modifier[0])) {
-		DRM_DEBUG("tiling_mode must match fb modifier exactly on gen2/3\n");
+		DRM_DEBUG_KMS("tiling_mode must match fb modifier exactly on gen2/3\n");
 		return -EINVAL;
 	}
 
 	pitch_limit = intel_fb_pitch_limit(dev_priv, mode_cmd->modifier[0],
 					   mode_cmd->pixel_format);
 	if (mode_cmd->pitches[0] > pitch_limit) {
-		DRM_DEBUG("%s pitch (%u) must be at less than %d\n",
-			  mode_cmd->modifier[0] != DRM_FORMAT_MOD_NONE ?
-			  "tiled" : "linear",
-			  mode_cmd->pitches[0], pitch_limit);
+		DRM_DEBUG_KMS("%s pitch (%u) must be at less than %d\n",
+			      mode_cmd->modifier[0] != DRM_FORMAT_MOD_NONE ?
+			      "tiled" : "linear",
+			      mode_cmd->pitches[0], pitch_limit);
 		return -EINVAL;
 	}
 
@@ -15960,9 +15960,9 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	 */
 	if (tiling != I915_TILING_NONE &&
 	    mode_cmd->pitches[0] != i915_gem_object_get_stride(obj)) {
-		DRM_DEBUG("pitch (%d) must match tiling stride (%d)\n",
-			  mode_cmd->pitches[0],
-			  i915_gem_object_get_stride(obj));
+		DRM_DEBUG_KMS("pitch (%d) must match tiling stride (%d)\n",
+			      mode_cmd->pitches[0],
+			      i915_gem_object_get_stride(obj));
 		return -EINVAL;
 	}
 
@@ -15975,16 +15975,16 @@ static int intel_framebuffer_init(struct drm_device *dev,
 		break;
 	case DRM_FORMAT_XRGB1555:
 		if (INTEL_GEN(dev_priv) > 3) {
-			DRM_DEBUG("unsupported pixel format: %s\n",
-			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 			return -EINVAL;
 		}
 		break;
 	case DRM_FORMAT_ABGR8888:
 		if (!IS_VALLEYVIEW(dev_priv) && !IS_CHERRYVIEW(dev_priv) &&
 		    INTEL_GEN(dev_priv) < 9) {
-			DRM_DEBUG("unsupported pixel format: %s\n",
-			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 			return -EINVAL;
 		}
 		break;
@@ -15992,15 +15992,15 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	case DRM_FORMAT_XRGB2101010:
 	case DRM_FORMAT_XBGR2101010:
 		if (INTEL_GEN(dev_priv) < 4) {
-			DRM_DEBUG("unsupported pixel format: %s\n",
-			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 			return -EINVAL;
 		}
 		break;
 	case DRM_FORMAT_ABGR2101010:
 		if (!IS_VALLEYVIEW(dev_priv) && !IS_CHERRYVIEW(dev_priv)) {
-			DRM_DEBUG("unsupported pixel format: %s\n",
-			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 			return -EINVAL;
 		}
 		break;
@@ -16009,14 +16009,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	case DRM_FORMAT_YVYU:
 	case DRM_FORMAT_VYUY:
 		if (INTEL_GEN(dev_priv) < 5) {
-			DRM_DEBUG("unsupported pixel format: %s\n",
-			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 			return -EINVAL;
 		}
 		break;
 	default:
-		DRM_DEBUG("unsupported pixel format: %s\n",
-		          drm_get_format_name(mode_cmd->pixel_format, &format_name));
+		DRM_DEBUG_KMS("unsupported pixel format: %s\n",
+			      drm_get_format_name(mode_cmd->pixel_format, &format_name));
 		return -EINVAL;
 	}
 
@@ -16028,8 +16028,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
 
 	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
 	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
-			  mode_cmd->pitches[0], stride_alignment);
+		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
+			      mode_cmd->pitches[0], stride_alignment);
 		return -EINVAL;
 	}
 
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (6 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-01-05 16:24   ` Tvrtko Ursulin
  2017-02-26 22:41   ` Chad Versace
  2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jason Ekstrand, Ben Widawsky, Vandana Kannan, dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

The SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involves telling the display engine
the location of the color control surface (CCS), which describes which
parts of the main surface are compressed and which are not. The location
of the CCS is provided by userspace as just another plane with its own offset.
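
As a rough illustration of how the CCS describes the main surface, the
sketch below maps a main surface pixel to the CCS byte and bit pair
that covers it, following the layout documented in the drm_fourcc.h
comment added by this patch. It assumes the 8 byte strips are stored in
memory in the order they are numbered; struct ccs_pos and ccs_locate()
are hypothetical names used only for this example.

```c
#include <stdint.h>

struct ccs_pos {
	uint32_t byte;	/* byte offset within the 64x64 byte CCS tile */
	uint32_t shift;	/* bit position of the 2-bit pair in that byte */
};

/* x/y are main surface pixel coordinates; only the intra-tile part is used. */
static struct ccs_pos ccs_locate(uint32_t x, uint32_t y)
{
	uint32_t bx = (x % 1024) / 16;	/* CCS byte column, 0..63 */
	uint32_t by = (y % 512) / 8;	/* CCS byte row, 0..63 */
	/* Strips are 8 bytes wide and numbered down each column first. */
	uint32_t strip = (bx / 8) * 64 + by;
	/* Each pair covers 8x4 pixels; pairs form a 2x2 grid per byte. */
	uint32_t pair = ((y % 8) / 4) * 2 + (x % 16) / 8;
	struct ccs_pos pos = { strip * 8 + bx % 8, pair * 2 };

	return pos;
}
```

For example, pixel (128, 0) lands in strip 64, the top strip of the
second column in the layout diagram.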

By providing our own format information for the CCS formats, we should
be able to make framebuffer_check() do the right thing for the CCS
surface as well.
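
The effect of the hsub = 16, vsub = 8, cpp = 1 format info can be
sketched as a small calculation of the CCS plane size. This is only an
illustration: the helper names are hypothetical, and rounding partial
blocks up is a choice made for this example rather than something the
patch specifies.

```c
#include <stdint.h>

#define CCS_HSUB	16	/* main surface pixels per CCS byte, horizontally */
#define CCS_VSUB	8	/* main surface rows per CCS byte row */
#define CCS_TILE_WIDTH	64	/* CCS tile width in bytes */

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* Smallest CCS pitch that covers fb_width and is a multiple of 64 bytes. */
static uint32_t ccs_min_pitch(uint32_t fb_width)
{
	uint32_t ccs_width = div_round_up(fb_width, CCS_HSUB);

	return div_round_up(ccs_width, CCS_TILE_WIDTH) * CCS_TILE_WIDTH;
}

/* Number of CCS byte rows needed to cover fb_height. */
static uint32_t ccs_height(uint32_t fb_height)
{
	return div_round_up(fb_height, CCS_VSUB);
}
```

A 1920x1080 framebuffer would thus need a CCS plane of at least 135
rows with a pitch of 128 bytes.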

Note that we'll return the same format info for both Y and Yf tiled
formats, as that's what happens with the non-CCS Y vs. Yf as well. If
desired, we could potentially return a unique pointer for each
pixel_format+tiling+ccs combination, in which case we would immediately
be able to tell if any of that stuff changed by just comparing the
pointers. But that does sound a bit wasteful space-wise.

v2: Drop the 'dev' argument from the hook
v3: Include the description of the CCS surface layout

Cc: Vandana Kannan <vandana.kannan@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
 include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c4662b2e9613..38de9df0ec60 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2478,6 +2478,41 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
 	}
 }
 
+static const struct drm_format_info ccs_formats[] = {
+	{ .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
+	{ .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
+	{ .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
+	{ .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
+};
+
+static const struct drm_format_info *
+lookup_format_info(const struct drm_format_info formats[],
+		   int num_formats, u32 format)
+{
+	int i;
+
+	for (i = 0; i < num_formats; i++) {
+		if (formats[i].format == format)
+			return &formats[i];
+	}
+
+	return NULL;
+}
+
+static const struct drm_format_info *
+intel_get_format_info(const struct drm_mode_fb_cmd2 *cmd)
+{
+	switch (cmd->modifier[0]) {
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		return lookup_format_info(ccs_formats,
+					  ARRAY_SIZE(ccs_formats),
+					  cmd->pixel_format);
+	default:
+		return NULL;
+	}
+}
+
 static int
 intel_fill_fb_info(struct drm_i915_private *dev_priv,
 		   struct drm_framebuffer *fb)
@@ -16083,6 +16118,7 @@ static void intel_atomic_state_free(struct drm_atomic_state *state)
 
 static const struct drm_mode_config_funcs intel_mode_funcs = {
 	.fb_create = intel_user_framebuffer_create,
+	.get_format_info = intel_get_format_info,
 	.output_poll_changed = intel_fbdev_output_poll_changed,
 	.atomic_check = intel_atomic_check,
 	.atomic_commit = intel_atomic_commit,
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 9e1bb7fabcde..4581e3d41e5c 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -230,6 +230,55 @@ extern "C" {
 #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
 
 /*
+ * Intel color control surface (CCS) for render compression
+ *
+ * The framebuffer format must be one of the 8:8:8:8 RGB formats,
+ * the main surface will be plane index 0 and will be Y/Yf-tiled,
+ * the CCS will be plane index 1.
+ *
+ * Each byte of CCS contains 4 pairs of bits, with each pair of bits
+ * matching an area of 8x4 pixels of the main surface, which would seem
+ * to match 2 cachelines containing 4x4 pixels each. The pairs of bits within
+ * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
+ * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
+ * -----------
+ * | 01 | 23 |
+ * -----------
+ * | 45 | 67 |
+ * -----------
+ *
+ * Actually only the lower bit of the pair seems to have any effect.
+ * No idea why. 0 in the lower bit would seem to mean not compressed,
+ * and 1 is compressed.  The interpretation of the main surface data
+ * when the block is marked compressed is unknown as of now.
+ *
+ * The CCS tile is laid out in 8 byte horizontal strips; each strip thus
+ * corresponds to a 128x8 pixel area of the main surface. So each 8x8 bytes of
+ * the CCS (1 cacheline) will match an area of 4x2 tiles on the main surface.
+ *
+ * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
+ * ------------------------
+ * |  0 |  64 | ... | 448 |
+ * |  1 |  65 |     | 449 |
+ * |  2 |  66 |     | 450 |
+ * |  . |   . |     |   . |
+ * |  . |   . |     |   . |
+ * |  . |   . |     |   . |
+ * | 63 | 127 |     | 511 |
+ * ------------------------
+ *
+ * This will match an area of 1024x512 pixels on the main surface.
+ *
+ * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
+ * and for the purposes of the CCS plane offset we assume cpp==1. As each
+ * byte matches a 16x8 area of the main surface, the dimensions of the CCS
+ * plane will thus appear to be width/16 x height/8. Both planes must be
+ * contained within the same gem object.
+ */
+#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
+#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)
+
+/*
  * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks
  *
  * Macroblocks are laid in a Z-shape, and each pixel data is following the
-- 
2.10.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 9/9] drm/i915: Add render decompression support
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (7 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS ville.syrjala
@ 2017-01-04 18:42 ` ville.syrjala
  2017-01-04 19:14   ` Paulo Zanoni
                     ` (2 more replies)
  2017-01-04 19:45 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ " Patchwork
                   ` (3 subsequent siblings)
  12 siblings, 3 replies; 44+ messages in thread
From: ville.syrjala @ 2017-01-04 18:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Vandana Kannan, dri-devel

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

The SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involves telling the display engine
the location of the color control surface (CCS), which describes
which parts of the main surface are compressed and which are not. The
location of the CCS is provided by userspace as just another plane with its
own offset.

Add the required stuff to validate the user provided AUX plane metadata
and convert the user provided linear offset into something the hardware
can consume.

Due to hardware limitations we require that the main surface and
the AUX surface (CCS) be part of the same bo. The hardware also
makes life hard by not allowing you to provide separate x/y offsets
for the main and AUX surfaces (except with NV12), so finding suitable
offsets for both requires a bit of work, assuming we still want to keep
playing tricks with the offsets. I've just gone with a dumb "search
backward for suitable offsets" approach, which is far from optimal,
but it works.
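
The constraint behind all this can be sketched as a stand-alone
predicate. It mirrors the check this patch adds to
intel_fill_fb_info(): since the CCS has no x/y offset register of its
own, the intra CCS tile offsets of the two surfaces must agree. The
function name is hypothetical; main_x/main_y are in main surface
pixels and ccs_x/ccs_y are in CCS bytes.

```c
#include <stdbool.h>

/*
 * Each CCS byte covers a 16x8 pixel area and each CCS tile is 64x64
 * bytes, so one CCS tile spans 1024x512 main surface pixels.
 */
static bool ccs_offsets_match(int main_x, int main_y, int ccs_x, int ccs_y)
{
	const int tile_w = 64 * 16;
	const int tile_h = 64 * 8;

	return main_x % tile_w == (ccs_x * 16) % tile_w &&
	       main_y % tile_h == (ccs_y * 8) % tile_h;
}
```

Note that a main surface x offset that is not a multiple of 16 (or a y
offset that is not a multiple of 8) can never pass this particular
check, since the CCS side only moves in whole bytes.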

Also, not all planes will be capable of scanning out compressed surfaces,
and e.g. 90/270 degree rotation is not supported in combination with
decompression either.

This patch may contain work from at least the following people:
* Vandana Kannan <vandana.kannan@intel.com>
* Daniel Vetter <daniel@ffwll.ch>
* Ben Widawsky <ben@bwidawsk.net>

Cc: Vandana Kannan <vandana.kannan@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h      |  22 ++++
 drivers/gpu/drm/i915/intel_display.c | 219 +++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_pm.c      |   8 +-
 drivers/gpu/drm/i915/intel_sprite.c  |   5 +
 4 files changed, 240 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 00970aa77afa..05e18e742776 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6209,6 +6209,28 @@ enum {
 			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
 			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
 
+#define PLANE_AUX_DIST_1_A		0x701c0
+#define PLANE_AUX_DIST_2_A		0x702c0
+#define PLANE_AUX_DIST_1_B		0x711c0
+#define PLANE_AUX_DIST_2_B		0x712c0
+#define _PLANE_AUX_DIST_1(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
+#define _PLANE_AUX_DIST_2(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
+#define PLANE_AUX_DIST(pipe, plane)     \
+	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
+
+#define PLANE_AUX_OFFSET_1_A		0x701c4
+#define PLANE_AUX_OFFSET_2_A		0x702c4
+#define PLANE_AUX_OFFSET_1_B		0x711c4
+#define PLANE_AUX_OFFSET_2_B		0x712c4
+#define _PLANE_AUX_OFFSET_1(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
+#define _PLANE_AUX_OFFSET_2(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
+#define PLANE_AUX_OFFSET(pipe, plane)   \
+	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
+
 /* legacy palette */
 #define _LGC_PALETTE_A           0x4a000
 #define _LGC_PALETTE_B           0x4a800
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 38de9df0ec60..b547332eeda1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		if (plane == 1)
+			return 64;
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		if (plane == 1)
+			return 64;
+		/* fall through */
 	case I915_FORMAT_MOD_Yf_TILED:
 		/*
 		 * Bspec seems to suggest that the Yf tile width would
@@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
 
 	/* AUX_DIST needs only 4K alignment */
-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
+	if (plane == 1)
 		return 4096;
 
 	switch (fb->modifier) {
@@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 		if (INTEL_GEN(dev_priv) >= 9)
 			return 256 * 1024;
 		return 0;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		return 1 * 1024 * 1024;
@@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
 	case I915_FORMAT_MOD_X_TILED:
 		return I915_TILING_X;
 	case I915_FORMAT_MOD_Y_TILED:
+	case I915_FORMAT_MOD_Y_TILED_CCS:
 		return I915_TILING_Y;
 	default:
 		return I915_TILING_NONE;
@@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 
 		intel_fb_offset_to_xy(&x, &y, fb, i);
 
+		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
+			int main_x, main_y;
+			int ccs_x, ccs_y;
+
+			/*
+			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
+			 * each CCS tile is 64x64 bytes.
+			 */
+			ccs_x = (x * 16) % (64 * 16);
+			ccs_y = (y * 8) % (64 * 8);
+			main_x = intel_fb->normal[0].x % (64 * 16);
+			main_y = intel_fb->normal[0].y % (64 * 8);
+
+			/*
+			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
+			 * x/y offsets must match between CCS and the main surface.
+			 */
+			if (main_x != ccs_x || main_y != ccs_y) {
+				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
+					      main_x, main_y,
+					      ccs_x, ccs_y,
+					      intel_fb->normal[0].x,
+					      intel_fb->normal[0].y,
+					      x, y);
+				return -EINVAL;
+			}
+		}
+
 		/*
 		 * The fence (if used) is aligned to the start of the object
 		 * so having the framebuffer wrap around across the edge of the
@@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 			break;
 		}
 		break;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		/* FIXME AUX plane? */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		switch (cpp) {
@@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 	return 2048;
 }
 
+static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
+					   int main_x, int main_y, u32 main_offset)
+{
+	const struct drm_framebuffer *fb = plane_state->base.fb;
+	int aux_x = plane_state->aux.x;
+	int aux_y = plane_state->aux.y;
+	u32 aux_offset = plane_state->aux.offset;
+	u32 alignment = intel_surf_alignment(fb, 1);
+
+	while (aux_offset >= main_offset && aux_y <= main_y) {
+		int x, y;
+
+		if (aux_x == main_x && aux_y == main_y)
+			break;
+
+		if (aux_offset == 0)
+			break;
+
+		x = aux_x / 16;
+		y = aux_y / 8;
+		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
+						      aux_offset, aux_offset - alignment);
+		aux_x = x * 16 + aux_x % 16;
+		aux_y = y * 8 + aux_y % 8;
+	}
+
+	if (aux_x != main_x || aux_y != main_y)
+		return false;
+
+	plane_state->aux.offset = aux_offset;
+	plane_state->aux.x = aux_x;
+	plane_state->aux.y = aux_y;
+
+	return true;
+}
+
 static int skl_check_main_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 
 		while ((x + w) * cpp > fb->pitches[0]) {
 			if (offset == 0) {
-				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
+				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
 				return -EINVAL;
 			}
 
@@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 		}
 	}
 
+	/*
+	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
+	 * they match with the main surface x/y offsets.
+	 */
+	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
+			if (offset == 0)
+				break;
+
+			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
+							  offset, offset - alignment);
+		}
+
+		if (x != plane_state->aux.x || y != plane_state->aux.y) {
+			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
+			return -EINVAL;
+		}
+	}
+
 	plane_state->main.offset = offset;
 	plane_state->main.x = x;
 	plane_state->main.y = y;
@@ -2982,6 +3081,53 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
 	return 0;
 }
 
+static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
+{
+	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
+	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
+	int src_x = plane_state->base.src.x1 >> 16;
+	int src_y = plane_state->base.src.y1 >> 16;
+	int x = src_x / 16;
+	int y = src_y / 8;
+	u32 offset;
+
+	switch (plane->id) {
+	case PLANE_PRIMARY:
+	case PLANE_SPRITE0:
+		break;
+	default:
+		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
+		return -EINVAL;
+	}
+
+	if (crtc->pipe == PIPE_C) {
+		DRM_DEBUG_KMS("No RC support on pipe C\n");
+		return -EINVAL;
+	}
+	/*
+	 * TODO:
+	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
+	 * 2. Render decompression must not be used in VTd pass-through mode
+	 * 3. Program hashing select CHICKEN_MISC1 bit 15
+	 */
+
+	if (plane_state->base.rotation &&
+	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
+			      plane_state->base.rotation);
+		return -EINVAL;
+	}
+
+	intel_add_fb_offsets(&x, &y, plane_state, 1);
+	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
+
+	plane_state->aux.offset = offset;
+	plane_state->aux.x = x * 16 + src_x % 16;
+	plane_state->aux.y = y * 8 + src_y % 8;
+
+	return 0;
+}
+
 int skl_check_plane_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -3002,6 +3148,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
 		ret = skl_check_nv12_aux_surface(plane_state);
 		if (ret)
 			return ret;
+	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		ret = skl_check_ccs_aux_surface(plane_state);
+		if (ret)
+			return ret;
 	} else {
 		plane_state->aux.offset = ~0xfff;
 		plane_state->aux.x = 0;
@@ -3357,8 +3508,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
 		return PLANE_CTL_TILED_X;
 	case I915_FORMAT_MOD_Y_TILED:
 		return PLANE_CTL_TILED_Y;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
 	case I915_FORMAT_MOD_Yf_TILED:
 		return PLANE_CTL_TILED_YF;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
 	default:
 		MISSING_CASE(fb_modifier);
 	}
@@ -3401,6 +3556,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	u32 plane_ctl;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	u32 surf_addr = plane_state->main.offset;
 	int scaler_id = plane_state->scaler_id;
 	int src_x = plane_state->main.x;
@@ -3436,6 +3592,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
+		   (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
+		   (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	if (scaler_id >= 0) {
 		uint32_t ps_ctrl = 0;
@@ -9807,10 +9967,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 		fb->modifier = I915_FORMAT_MOD_X_TILED;
 		break;
 	case PLANE_CTL_TILED_Y:
-		fb->modifier = I915_FORMAT_MOD_Y_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Y_TILED;
 		break;
 	case PLANE_CTL_TILED_YF:
-		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
 		break;
 	default:
 		MISSING_CASE(tiling);
@@ -12014,7 +12180,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
 	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
 
 	ctl = I915_READ(PLANE_CTL(pipe, 0));
-	ctl &= ~PLANE_CTL_TILED_MASK;
+	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
 	switch (fb->modifier) {
 	case DRM_FORMAT_MOD_NONE:
 		break;
@@ -12024,9 +12190,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
 	case I915_FORMAT_MOD_Y_TILED:
 		ctl |= PLANE_CTL_TILED_Y;
 		break;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
+		break;
 	case I915_FORMAT_MOD_Yf_TILED:
 		ctl |= PLANE_CTL_TILED_YF;
 		break;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
+		break;
 	default:
 		MISSING_CASE(fb->modifier);
 	}
@@ -15926,8 +16098,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	unsigned int tiling = i915_gem_object_get_tiling(obj);
-	int ret;
-	u32 pitch_limit, stride_alignment;
+	int ret, i;
+	u32 pitch_limit;
 	struct drm_format_name_buf format_name;
 
 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -15953,6 +16125,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
 
 	/* Passed in modifier sanity checking. */
 	switch (mode_cmd->modifier[0]) {
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		switch (mode_cmd->pixel_format) {
+		case DRM_FORMAT_XBGR8888:
+		case DRM_FORMAT_ABGR8888:
+		case DRM_FORMAT_XRGB8888:
+		case DRM_FORMAT_ARGB8888:
+			break;
+		default:
+			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
+			return -EINVAL;
+		}
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		if (INTEL_GEN(dev_priv) < 9) {
@@ -16061,11 +16246,21 @@ static int intel_framebuffer_init(struct drm_device *dev,
 
 	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
 
-	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
-			      mode_cmd->pitches[0], stride_alignment);
-		return -EINVAL;
+	for (i = 0; i < intel_fb->base.format->num_planes; i++) {
+		u32 stride_alignment;
+
+		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
+			DRM_DEBUG_KMS("bad plane %d handle\n", i);
+			return -EINVAL;
+		}
+
+		stride_alignment = intel_fb_stride_alignment(&intel_fb->base, i);
+
+		if (mode_cmd->pitches[i] & (stride_alignment - 1)) {
+			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
+				      i, mode_cmd->pitches[i], stride_alignment);
+			return -EINVAL;
+		}
 	}
 
 	intel_fb->obj = obj;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 249623d45be0..add359022c96 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3314,7 +3314,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 	/* For Non Y-tile return 8-blocks */
 	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
+	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
 		return 8;
 
 	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3590,7 +3592,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
+		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
 	/* Display WA #1141: kbl. */
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 7031bc733d97..063a994815d0 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
 	u32 surf_addr = plane_state->main.offset;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
 	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
@@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
+		   (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
+		   (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	/* program plane scaler */
 	if (plane_state->scaler_id >= 0) {
-- 
2.10.2
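
The PLANE_AUX_DIST writes above OR the main-to-AUX distance with the AUX
stride into a single register value. A minimal sketch of why that is safe,
assuming the stride occupies the low 12 bits of the register (inferred from
the "AUX_DIST needs only 4K alignment" comment, not checked against Bspec).
pack_aux_dist() and AUX_DIST_ALIGN are hypothetical names, not part of the
patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the low 12 bits of PLANE_AUX_DIST carry the AUX stride. */
#define AUX_DIST_ALIGN 4096u

/*
 * Mirror "(aux.offset - surf_addr) | aux_stride" from the patch, but
 * refuse inputs where the two fields would overlap.
 */
static bool pack_aux_dist(uint32_t main_offset, uint32_t aux_offset,
			  uint32_t aux_stride, uint32_t *val)
{
	uint32_t dist;

	assert(val != NULL);

	/* The sketch assumes the AUX surface is placed after the main one. */
	if (aux_offset < main_offset)
		return false;

	dist = aux_offset - main_offset;
	if (dist & (AUX_DIST_ALIGN - 1))
		return false;	/* distance must be 4K aligned */
	if (aux_stride >= AUX_DIST_ALIGN)
		return false;	/* stride would spill into the distance bits */

	*val = dist | aux_stride;
	return true;
}
```

The patch itself does no such checking at register write time; it relies on
intel_surf_alignment() returning 4096 for plane 1 when the offsets are
computed.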

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 9/9] drm/i915: Add render decompression support
  2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
@ 2017-01-04 19:14   ` Paulo Zanoni
  2017-01-05 15:12     ` Ville Syrjälä
  2017-01-05  4:25   ` Ben Widawsky
  2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
  2 siblings, 1 reply; 44+ messages in thread
From: Paulo Zanoni @ 2017-01-04 19:14 UTC (permalink / raw)
  To: ville.syrjala, intel-gfx; +Cc: Ben Widawsky, Vandana Kannan, dri-devel

On Wed, 2017-01-04 at 20:42 +0200, ville.syrjala@linux.intel.com
wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involves telling the display engine
> the location of the color control surface (CCS) which describes
> which parts of the main surface are compressed and which are not. The
> location of CCS is provided by userspace as just another plane with its
> own offset.
> 
> Add the required stuff to validate the user provided AUX plane metadata
> and convert the user provided linear offset into something the hardware
> can consume.
> 
> Due to hardware limitations we require that the main surface and
> the AUX surface (CCS) be part of the same bo. The hardware also
> makes life hard by not allowing you to provide separate x/y offsets
> for the main and AUX surfaces (except with NV12), so finding suitable
> offsets for both requires a bit of work. Assuming we still want to keep
> playing tricks with the offsets. I've just gone with a dumb "search
> backward for suitable offsets" approach, which is far from optimal,
> but it works.
> 
> Also not all planes will be capable of scanning out compressed surfaces,
> and eg. 90/270 degree rotation is not supported in combination with
> decompression either.
> 
> This patch may contain work from at least the following people:
> * Vandana Kannan <vandana.kannan@intel.com>
> * Daniel Vetter <daniel@ffwll.ch>
> * Ben Widawsky <ben@bwidawsk.net>

As I mentioned to Ben in the other email, there are some points of
BSpec that say "if render decompression is enabled, do this", which we
largely ignored so far. I hope they are all marked as workarounds. From
a quick look, it looks like we need at least Display WAs #0390, #0531
and #1125, and maybe some other non-display WAs (please take a look at
the BSpec list). I'll assume they were not implemented yet since I
don't see WA comments on the patches. I think we need them, otherwise
we may introduce more SKL flickering problems.

Thanks,
Paulo

> 
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h      |  22 ++++
>  drivers/gpu/drm/i915/intel_display.c | 219 +++++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_pm.c      |   8 +-
>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>  4 files changed, 240 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 00970aa77afa..05e18e742776 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6209,6 +6209,28 @@ enum {
>  			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>  			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>  
> +#define PLANE_AUX_DIST_1_A		0x701c0
> +#define PLANE_AUX_DIST_2_A		0x702c0
> +#define PLANE_AUX_DIST_1_B		0x711c0
> +#define PLANE_AUX_DIST_2_B		0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane)     \
> +	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A		0x701c4
> +#define PLANE_AUX_OFFSET_2_A		0x702c4
> +#define PLANE_AUX_OFFSET_1_B		0x711c4
> +#define PLANE_AUX_OFFSET_2_B		0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> +	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A           0x4a000
>  #define _LGC_PALETTE_B           0x4a800
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 38de9df0ec60..b547332eeda1 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
>  			return 128;
>  		else
>  			return 512;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		if (plane == 1)
> +			return 64;
> +		/* fall through */
>  	case I915_FORMAT_MOD_Y_TILED:
>  		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>  			return 128;
>  		else
>  			return 512;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		if (plane == 1)
> +			return 64;
> +		/* fall through */
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		/*
>  		 * Bspec seems to suggest that the Yf tile width would
> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>  	struct drm_i915_private *dev_priv = to_i915(fb->dev);
>  
>  	/* AUX_DIST needs only 4K alignment */
> -	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> +	if (plane == 1)
>  		return 4096;
>  
>  	switch (fb->modifier) {
> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>  		if (INTEL_GEN(dev_priv) >= 9)
>  			return 256 * 1024;
>  		return 0;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		return 1 * 1024 * 1024;
> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>  	case I915_FORMAT_MOD_X_TILED:
>  		return I915_TILING_X;
>  	case I915_FORMAT_MOD_Y_TILED:
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
>  		return I915_TILING_Y;
>  	default:
>  		return I915_TILING_NONE;
> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>  
>  		intel_fb_offset_to_xy(&x, &y, fb, i);
>  
> +		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> +			int main_x, main_y;
> +			int ccs_x, ccs_y;
> +
> +			/*
> +			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> +			 * each CCS tile is 64x64 bytes.
> +			 */
> +			ccs_x = (x * 16) % (64 * 16);
> +			ccs_y = (y * 8) % (64 * 8);
> +			main_x = intel_fb->normal[0].x % (64 * 16);
> +			main_y = intel_fb->normal[0].y % (64 * 8);
> +
> +			/*
> +			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
> +			 * x/y offsets must match between CCS and the main surface.
> +			 */
> +			if (main_x != ccs_x || main_y != ccs_y) {
> +				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
> +					      main_x, main_y,
> +					      ccs_x, ccs_y,
> +					      intel_fb->normal[0].x,
> +					      intel_fb->normal[0].y,
> +					      x, y);
> +				return -EINVAL;
> +			}
> +		}
> +
>  		/*
>  		 * The fence (if used) is aligned to the start of the object
>  		 * so having the framebuffer wrap around across the edge of the
> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>  			break;
>  		}
>  		break;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		/* FIXME AUX plane? */
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		switch (cpp) {
> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>  	return 2048;
>  }
>  
> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
> +					   int main_x, int main_y, u32 main_offset)
> +{
> +	const struct drm_framebuffer *fb = plane_state->base.fb;
> +	int aux_x = plane_state->aux.x;
> +	int aux_y = plane_state->aux.y;
> +	u32 aux_offset = plane_state->aux.offset;
> +	u32 alignment = intel_surf_alignment(fb, 1);
> +
> +	while (aux_offset >= main_offset && aux_y <= main_y) {
> +		int x, y;
> +
> +		if (aux_x == main_x && aux_y == main_y)
> +			break;
> +
> +		if (aux_offset == 0)
> +			break;
> +
> +		x = aux_x / 16;
> +		y = aux_y / 8;
> +		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
> +						      aux_offset, aux_offset - alignment);
> +		aux_x = x * 16 + aux_x % 16;
> +		aux_y = y * 8 + aux_y % 8;
> +	}
> +
> +	if (aux_x != main_x || aux_y != main_y)
> +		return false;
> +
> +	plane_state->aux.offset = aux_offset;
> +	plane_state->aux.x = aux_x;
> +	plane_state->aux.y = aux_y;
> +
> +	return true;
> +}
> +
>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  {
>  	const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  
>  		while ((x + w) * cpp > fb->pitches[0]) {
>  			if (offset == 0) {
> -				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
> +				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
>  				return -EINVAL;
>  			}
>  
> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  		}
>  	}
>  
> +	/*
> +	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
> +	 * they match with the main surface x/y offsets.
> +	 */
> +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
> +			if (offset == 0)
> +				break;
> +
> +			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
> +							  offset, offset - alignment);
> +		}
> +
> +		if (x != plane_state->aux.x || y != plane_state->aux.y) {
> +			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
> +			return -EINVAL;
> +		}
> +	}
> +
>  	plane_state->main.offset = offset;
>  	plane_state->main.x = x;
>  	plane_state->main.y = y;
> @@ -2982,6 +3081,53 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
>  	return 0;
>  }
>  
> +static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
> +{
> +	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> +	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> +	int src_x = plane_state->base.src.x1 >> 16;
> +	int src_y = plane_state->base.src.y1 >> 16;
> +	int x = src_x / 16;
> +	int y = src_y / 8;
> +	u32 offset;
> +
> +	switch (plane->id) {
> +	case PLANE_PRIMARY:
> +	case PLANE_SPRITE0:
> +		break;
> +	default:
> +		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> +		return -EINVAL;
> +	}
> +
> +	if (crtc->pipe == PIPE_C) {
> +		DRM_DEBUG_KMS("No RC support on pipe C\n");
> +		return -EINVAL;
> +	}
> +	/*
> +	 * TODO:
> +	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> +	 * 2. Render decompression must not be used in VTd pass-through mode
> +	 * 3. Program hashing select CHICKEN_MISC1 bit 15
> +	 */
> +
> +	if (plane_state->base.rotation &&
> +	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
> +		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
> +			      plane_state->base.rotation);
> +		return -EINVAL;
> +	}
> +
> +	intel_add_fb_offsets(&x, &y, plane_state, 1);
> +	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> +
> +	plane_state->aux.offset = offset;
> +	plane_state->aux.x = x * 16 + src_x % 16;
> +	plane_state->aux.y = y * 8 + src_y % 8;
> +
> +	return 0;
> +}
> +
>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  {
>  	const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -3002,6 +3148,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  		ret = skl_check_nv12_aux_surface(plane_state);
>  		if (ret)
>  			return ret;
> +	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +		ret = skl_check_ccs_aux_surface(plane_state);
> +		if (ret)
> +			return ret;
>  	} else {
>  		plane_state->aux.offset = ~0xfff;
>  		plane_state->aux.x = 0;
> @@ -3357,8 +3508,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>  		return PLANE_CTL_TILED_X;
>  	case I915_FORMAT_MOD_Y_TILED:
>  		return PLANE_CTL_TILED_Y;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		return PLANE_CTL_TILED_YF;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>  	default:
>  		MISSING_CASE(fb_modifier);
>  	}
> @@ -3401,6 +3556,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	u32 plane_ctl;
>  	unsigned int rotation = plane_state->base.rotation;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
> +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>  	u32 surf_addr = plane_state->main.offset;
>  	int scaler_id = plane_state->scaler_id;
>  	int src_x = plane_state->main.x;
> @@ -3436,6 +3592,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +		   (plane_state->aux.y << 16) | plane_state->aux.x);
>  
>  	if (scaler_id >= 0) {
>  		uint32_t ps_ctrl = 0;
> @@ -9807,10 +9967,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
>  		fb->modifier = I915_FORMAT_MOD_X_TILED;
>  		break;
>  	case PLANE_CTL_TILED_Y:
> -		fb->modifier = I915_FORMAT_MOD_Y_TILED;
> +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> +		else
> +			fb->modifier = I915_FORMAT_MOD_Y_TILED;
>  		break;
>  	case PLANE_CTL_TILED_YF:
> -		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> +		else
> +			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>  		break;
>  	default:
>  		MISSING_CASE(tiling);
> @@ -12014,7 +12180,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>  	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>  
>  	ctl = I915_READ(PLANE_CTL(pipe, 0));
> -	ctl &= ~PLANE_CTL_TILED_MASK;
> +	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>  	switch (fb->modifier) {
>  	case DRM_FORMAT_MOD_NONE:
>  		break;
> @@ -12024,9 +12190,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>  	case I915_FORMAT_MOD_Y_TILED:
>  		ctl |= PLANE_CTL_TILED_Y;
>  		break;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> +		break;
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		ctl |= PLANE_CTL_TILED_YF;
>  		break;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> +		break;
>  	default:
>  		MISSING_CASE(fb->modifier);
>  	}
> @@ -15926,8 +16098,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	unsigned int tiling = i915_gem_object_get_tiling(obj);
> -	int ret;
> -	u32 pitch_limit, stride_alignment;
> +	int ret, i;
> +	u32 pitch_limit;
>  	struct drm_format_name_buf format_name;
>  
>  	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -15953,6 +16125,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  
>  	/* Passed in modifier sanity checking. */
>  	switch (mode_cmd->modifier[0]) {
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		switch (mode_cmd->pixel_format) {
> +		case DRM_FORMAT_XBGR8888:
> +		case DRM_FORMAT_ABGR8888:
> +		case DRM_FORMAT_XRGB8888:
> +		case DRM_FORMAT_ARGB8888:
> +			break;
> +		default:
> +			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> +			return -EINVAL;
> +		}
> +		/* fall through */
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		if (INTEL_GEN(dev_priv) < 9) {
> @@ -16061,11 +16246,21 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  
>  	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
>  
> -	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> -	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
> -			      mode_cmd->pitches[0], stride_alignment);
> -		return -EINVAL;
> +	for (i = 0; i < intel_fb->base.format->num_planes; i++) {
> +		u32 stride_alignment;
> +
> +		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> +			DRM_DEBUG_KMS("bad plane %d handle\n", i);
> +			return -EINVAL;
> +		}
> +
> +		stride_alignment = intel_fb_stride_alignment(&intel_fb->base, i);
> +
> +		if (mode_cmd->pitches[i] & (stride_alignment - 1)) {
> +			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
> +				      i, mode_cmd->pitches[i], stride_alignment);
> +			return -EINVAL;
> +		}
>  	}
>  
>  	intel_fb->obj = obj;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 249623d45be0..add359022c96 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3314,7 +3314,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>  
>  	/* For Non Y-tile return 8-blocks */
>  	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> -	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> +	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>  		return 8;
>  
>  	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3590,7 +3592,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	}
>  
>  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> -		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> +		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>  	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>  
>  	/* Display WA #1141: kbl. */
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 7031bc733d97..063a994815d0 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	u32 surf_addr = plane_state->main.offset;
>  	unsigned int rotation = plane_state->base.rotation;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
> +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>  	int crtc_x = plane_state->base.dst.x1;
>  	int crtc_y = plane_state->base.dst.y1;
>  	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +		   (plane_state->aux.y << 16) | plane_state->aux.x);
>  
>  	/* program plane scaler */
>  	if (plane_state->scaler_id >= 0) {
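
The intel_fill_fb_info() hunk quoted above rejects framebuffers whose CCS
and main surface offsets disagree within a CCS tile. A minimal standalone
sketch of that comparison, using the 16x8 and 64x64 figures from the patch
comment. ccs_intra_tile_match() and the enum names are hypothetical, and
the main surface coordinates are taken to be in pixels while the CCS
coordinates are in CCS bytes/rows:

```c
#include <assert.h>
#include <stdbool.h>

enum {
	CCS_BYTE_W = 16,	/* main surface pixels covered by one CCS byte, x */
	CCS_BYTE_H = 8,		/* main surface rows covered by one CCS byte, y */
	CCS_TILE_W = 64,	/* CCS tile width in bytes */
	CCS_TILE_H = 64,	/* CCS tile height in rows */
};

/*
 * CCS has no x/y offset register of its own, so the offset inside a CCS
 * tile must be the same for the CCS and for the main surface.
 */
static bool ccs_intra_tile_match(int main_x, int main_y, int ccs_x, int ccs_y)
{
	const int span_x = CCS_TILE_W * CCS_BYTE_W;	/* 1024 pixels */
	const int span_y = CCS_TILE_H * CCS_BYTE_H;	/* 512 rows */

	assert(main_x >= 0 && main_y >= 0 && ccs_x >= 0 && ccs_y >= 0);

	return (ccs_x * CCS_BYTE_W) % span_x == main_x % span_x &&
	       (ccs_y * CCS_BYTE_H) % span_y == main_y % span_y;
}
```

One consequence visible in the sketch: a main surface x offset that is not
a multiple of 16 (or a y offset that is not a multiple of 8) within the
span can never match, since the scaled CCS offset is always such a
multiple.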

^ permalink raw reply	[flat|nested] 44+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: SKL+ render decompression support
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (8 preceding siblings ...)
  2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
@ 2017-01-04 19:45 ` Patchwork
  2017-01-05 15:54 ` ✗ Fi.CI.BAT: failure for drm/i915: SKL+ render decompression support (rev2) Patchwork
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2017-01-04 19:45 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: SKL+ render decompression support
URL   : https://patchwork.freedesktop.org/series/17507/
State : success

== Summary ==

Series 17507v1 drm/i915: SKL+ render decompression support
https://patchwork.freedesktop.org/api/1.0/series/17507/revisions/1/mbox/


fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 

ea0500897bf72bbbf6eca6e695c9d49289dfc768 drm-tip: 2017y-01m-04d-16h-46m-35s UTC integration manifest
47b7d18 drm/i915: Add render decompression support
09b28ed drm/i915: Implement .get_format_info() hook for CCS
5dc5c70 drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
f9e982d drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
0c72ad2 drm/i915: Fix Yf tile width
47a0285 drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
0acd973 drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
38a0f9c drm/i915: Plumb drm_framebuffer into more places
fd640ca drm: Add mode_config .get_format_info() hook

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3435/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/9] drm: Add mode_config .get_format_info() hook
  2017-01-04 18:42 ` [PATCH 1/9] drm: Add mode_config .get_format_info() hook ville.syrjala
@ 2017-01-05  3:15   ` Ben Widawsky
  2017-01-05  8:48     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Ben Widawsky @ 2017-01-05  3:15 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, Laurent Pinchart, dri-devel

On 17-01-04 20:42:24, Ville Syrjälä wrote:
>From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
>Allow drivers to return a custom drm_format_info structure for special
>fb layouts. We'll use this for the compression control surface in i915.
>
>v2: Fix drm_get_format_info() kernel doc (Laurent)
>    Don't pass 'dev' to the new hook (Laurent)
>
>Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>Cc: Ben Widawsky <ben@bwidawsk.net>
>Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>---
> drivers/gpu/drm/drm_fb_cma_helper.c  |  2 +-
> drivers/gpu/drm/drm_fourcc.c         | 25 +++++++++++++++++++++++++
> drivers/gpu/drm/drm_framebuffer.c    |  9 +++++++--
> drivers/gpu/drm/drm_modeset_helper.c |  2 +-
> include/drm/drm_fourcc.h             |  6 ++++++
> include/drm/drm_mode_config.h        | 14 ++++++++++++++
> 6 files changed, 54 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/gpu/drm/drm_fb_cma_helper.c b/drivers/gpu/drm/drm_fb_cma_helper.c
>index 20a4011f790d..de3c9fe116fc 100644
>--- a/drivers/gpu/drm/drm_fb_cma_helper.c
>+++ b/drivers/gpu/drm/drm_fb_cma_helper.c
>@@ -177,7 +177,7 @@ struct drm_framebuffer *drm_fb_cma_create_with_funcs(struct drm_device *dev,
> 	int ret;
> 	int i;
>
>-	info = drm_format_info(mode_cmd->pixel_format);
>+	info = drm_get_format_info(dev, mode_cmd);
> 	if (!info)
> 		return ERR_PTR(-EINVAL);
>
>diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
>index 90d2cc8da8eb..f9b6445e846a 100644
>--- a/drivers/gpu/drm/drm_fourcc.c
>+++ b/drivers/gpu/drm/drm_fourcc.c
>@@ -199,6 +199,31 @@ const struct drm_format_info *drm_format_info(u32 format)
> EXPORT_SYMBOL(drm_format_info);
>
> /**
>+ * drm_get_format_info - query information for a given framebuffer configuration
>+ * @dev: DRM device
>+ * @mode_cmd: metadata from the userspace fb creation request
>+ *
>+ * Returns:
>+ * The instance of struct drm_format_info that describes the pixel format, or
>+ * NULL if the format is unsupported.
>+ */
>+const struct drm_format_info *
>+drm_get_format_info(struct drm_device *dev,
>+		    const struct drm_mode_fb_cmd2 *mode_cmd)
>+{
>+	const struct drm_format_info *info = NULL;
>+
>+	if (dev->mode_config.funcs->get_format_info)
>+		info = dev->mode_config.funcs->get_format_info(mode_cmd);
>+
>+	if (!info)
>+		info = drm_format_info(mode_cmd->pixel_format);
>+
>+	return info;
>+}
>+EXPORT_SYMBOL(drm_get_format_info);
>+
>+/**
>  * drm_format_num_planes - get the number of planes for format
>  * @format: pixel format (DRM_FORMAT_*)
>  *
>diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
>index 588ccc3a2218..e276fcdc3a62 100644
>--- a/drivers/gpu/drm/drm_framebuffer.c
>+++ b/drivers/gpu/drm/drm_framebuffer.c
>@@ -126,11 +126,13 @@ int drm_mode_addfb(struct drm_device *dev,
> 	return 0;
> }
>
>-static int framebuffer_check(const struct drm_mode_fb_cmd2 *r)
>+static int framebuffer_check(struct drm_device *dev,
>+			     const struct drm_mode_fb_cmd2 *r)
> {
> 	const struct drm_format_info *info;
> 	int i;
>
>+	/* check if the format is supported at all */
> 	info = __drm_format_info(r->pixel_format & ~DRM_FORMAT_BIG_ENDIAN);
> 	if (!info) {
> 		struct drm_format_name_buf format_name;
>@@ -140,6 +142,9 @@ static int framebuffer_check(const struct drm_mode_fb_cmd2 *r)
> 		return -EINVAL;
> 	}
>
>+	/* now let the driver pick its own format info */
>+	info = drm_get_format_info(dev, r);
>+
> 	if (r->width == 0 || r->width % info->hsub) {
> 		DRM_DEBUG_KMS("bad framebuffer width %u\n", r->width);
> 		return -EINVAL;
>@@ -263,7 +268,7 @@ drm_internal_framebuffer_create(struct drm_device *dev,
> 		return ERR_PTR(-EINVAL);
> 	}
>
>-	ret = framebuffer_check(r);
>+	ret = framebuffer_check(dev, r);
> 	if (ret)
> 		return ERR_PTR(ret);
>
>diff --git a/drivers/gpu/drm/drm_modeset_helper.c b/drivers/gpu/drm/drm_modeset_helper.c
>index cc44a9a4b004..2b33825f2f93 100644
>--- a/drivers/gpu/drm/drm_modeset_helper.c
>+++ b/drivers/gpu/drm/drm_modeset_helper.c
>@@ -78,7 +78,7 @@ void drm_helper_mode_fill_fb_struct(struct drm_device *dev,
> 	int i;
>
> 	fb->dev = dev;
>-	fb->format = drm_format_info(mode_cmd->pixel_format);
>+	fb->format = drm_get_format_info(dev, mode_cmd);
> 	fb->width = mode_cmd->width;
> 	fb->height = mode_cmd->height;
> 	for (i = 0; i < 4; i++) {
>diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h
>index fcc08da850c8..6942e84b6edd 100644
>--- a/include/drm/drm_fourcc.h
>+++ b/include/drm/drm_fourcc.h
>@@ -25,6 +25,9 @@
> #include <linux/types.h>
> #include <uapi/drm/drm_fourcc.h>
>
>+struct drm_device;
>+struct drm_mode_fb_cmd2;
>+
> /**
>  * struct drm_format_info - information about a DRM format
>  * @format: 4CC format identifier (DRM_FORMAT_*)
>@@ -55,6 +58,9 @@ struct drm_format_name_buf {
>
> const struct drm_format_info *__drm_format_info(u32 format);
> const struct drm_format_info *drm_format_info(u32 format);
>+const struct drm_format_info *
>+drm_get_format_info(struct drm_device *dev,
>+		    const struct drm_mode_fb_cmd2 *mode_cmd);
> uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth);
> int drm_format_num_planes(uint32_t format);
> int drm_format_plane_cpp(uint32_t format, int plane);
>diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
>index 17942c0f32a8..b5c2eb825965 100644
>--- a/include/drm/drm_mode_config.h
>+++ b/include/drm/drm_mode_config.h
>@@ -34,6 +34,7 @@ struct drm_file;
> struct drm_device;
> struct drm_atomic_state;
> struct drm_mode_fb_cmd2;
>+struct drm_format_info;
>
> /**
>  * struct drm_mode_config_funcs - basic driver provided mode setting functions
>@@ -70,6 +71,19 @@ struct drm_mode_config_funcs {
> 					     const struct drm_mode_fb_cmd2 *mode_cmd);
>
> 	/**
>+	 * @get_format_info:
>+	 *
>+	 * Allows a driver to return custom format information for special
>+	 * fb layouts (eg. ones with auxiliary compresssion control planes).

compression

>+	 *
>+	 * RETURNS:
>+	 *
>+	 * The format information specific to the given fb metadata, or
>+	 * NULL if none is found.
>+	 */
>+	const struct drm_format_info *(*get_format_info)(const struct drm_mode_fb_cmd2 *mode_cmd);
>+
>+	/**
> 	 * @output_poll_changed:
> 	 *
> 	 * Callback used by helpers to inform the driver of output configuration

Looks like msm and omap could use this too, and then if you allowed mode_cmd
to be NULL, you could potentially deprecate drm_format_info. Just a thought.

LGTM:
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

>-- 
>2.10.2
>
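
The lookup order that drm_get_format_info() implements in the patch quoted
above (driver hook first, generic format table as the fallback) can be
sketched standalone as follows. All types, constants and tables here are
hypothetical stand-ins rather than the real DRM definitions, and the
two-plane CCS entry is an assumption about what a driver hook might return:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real DRM formats or modifiers. */
enum { FMT_XRGB = 1, FMT_NV12 = 2 };
enum { MOD_NONE = 0, MOD_CCS = 1 };

struct fmt_info {
	uint32_t format;
	int num_planes;
};

struct fb_cmd {
	uint32_t pixel_format;
	uint64_t modifier;
};

typedef const struct fmt_info *(*get_info_fn)(const struct fb_cmd *cmd);

static const struct fmt_info generic_formats[] = {
	{ FMT_XRGB, 1 },
	{ FMT_NV12, 2 },
};

/* Driver-private table: same pixel format, plus an auxiliary plane. */
static const struct fmt_info ccs_formats[] = {
	{ FMT_XRGB, 2 },
};

static const struct fmt_info *generic_lookup(uint32_t format)
{
	size_t i;

	for (i = 0; i < sizeof(generic_formats) / sizeof(generic_formats[0]); i++)
		if (generic_formats[i].format == format)
			return &generic_formats[i];
	return NULL;
}

/* Example hook: only claims the layouts it knows about. */
static const struct fmt_info *driver_hook(const struct fb_cmd *cmd)
{
	if (cmd->modifier == MOD_CCS && cmd->pixel_format == FMT_XRGB)
		return &ccs_formats[0];
	return NULL;
}

/* Ask the hook first; fall back to the generic table if it declines. */
static const struct fmt_info *get_format_info(get_info_fn hook,
					      const struct fb_cmd *cmd)
{
	const struct fmt_info *info = NULL;

	assert(cmd != NULL);

	if (hook)
		info = hook(cmd);
	if (!info)
		info = generic_lookup(cmd->pixel_format);
	return info;
}
```

Because the hook returns NULL for anything it does not recognize, drivers
without special layouts see no behaviour change.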

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 9/9] drm/i915: Add render decompression support
  2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
  2017-01-04 19:14   ` Paulo Zanoni
@ 2017-01-05  4:25   ` Ben Widawsky
  2017-01-05 15:11     ` Ville Syrjälä
  2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
  2 siblings, 1 reply; 44+ messages in thread
From: Ben Widawsky @ 2017-01-05  4:25 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, Vandana Kannan, dri-devel

On 17-01-04 20:42:32, Ville Syrjälä wrote:
>From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
>SKL+ display engine can scan out certain kinds of compressed surfaces
>produced by the render engine. This involves telling the display engine
>the location of the color control surface (CCS) which describes
>which parts of the main surface are compressed and which are not. The
>location of CCS is provided by userspace as just another plane with its
>own offset.
>
>Add the required stuff to validate the user provided AUX plane metadata
>and convert the user provided linear offset into something the hardware
>can consume.
>
>Due to hardware limitations we require that the main surface and
>the AUX surface (CCS) be part of the same bo. The hardware also
>makes life hard by not allowing you to provide separate x/y offsets
>for the main and AUX surfaces (except with NV12), so finding suitable
>offsets for both requires a bit of work. Assuming we still want to keep
>playing tricks with the offsets. I've just gone with a dumb "search
>backward for suitable offsets" approach, which is far from optimal,
>but it works.
>
>Also not all planes will be capable of scanning out compressed surfaces,
>and eg. 90/270 degree rotation is not supported in combination with
>decompression either.
>
>This patch may contain work from at least the following people:
>* Vandana Kannan <vandana.kannan@intel.com>
>* Daniel Vetter <daniel@ffwll.ch>
>* Ben Widawsky <ben@bwidawsk.net>
>
>Cc: Vandana Kannan <vandana.kannan@intel.com>
>Cc: Daniel Vetter <daniel@ffwll.ch>
>Cc: Ben Widawsky <ben@bwidawsk.net>
>Cc: Jason Ekstrand <jason@jlekstrand.net>
>Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>---
> drivers/gpu/drm/i915/i915_reg.h      |  22 ++++
> drivers/gpu/drm/i915/intel_display.c | 219 +++++++++++++++++++++++++++++++++--
> drivers/gpu/drm/i915/intel_pm.c      |   8 +-
> drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> 4 files changed, 240 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>index 00970aa77afa..05e18e742776 100644
>--- a/drivers/gpu/drm/i915/i915_reg.h
>+++ b/drivers/gpu/drm/i915/i915_reg.h
>@@ -6209,6 +6209,28 @@ enum {
> 			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
> 			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>
>+#define PLANE_AUX_DIST_1_A		0x701c0
>+#define PLANE_AUX_DIST_2_A		0x702c0
>+#define PLANE_AUX_DIST_1_B		0x711c0
>+#define PLANE_AUX_DIST_2_B		0x712c0
>+#define _PLANE_AUX_DIST_1(pipe) \
>+			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
>+#define _PLANE_AUX_DIST_2(pipe) \
>+			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
>+#define PLANE_AUX_DIST(pipe, plane)     \
>+	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
>+
>+#define PLANE_AUX_OFFSET_1_A		0x701c4
>+#define PLANE_AUX_OFFSET_2_A		0x702c4
>+#define PLANE_AUX_OFFSET_1_B		0x711c4
>+#define PLANE_AUX_OFFSET_2_B		0x712c4
>+#define _PLANE_AUX_OFFSET_1(pipe)       \
>+		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
>+#define _PLANE_AUX_OFFSET_2(pipe)       \
>+		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
>+#define PLANE_AUX_OFFSET(pipe, plane)   \
>+	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
>+
> /* legacy palette */
> #define _LGC_PALETTE_A           0x4a000
> #define _LGC_PALETTE_B           0x4a800
>diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>index 38de9df0ec60..b547332eeda1 100644
>--- a/drivers/gpu/drm/i915/intel_display.c
>+++ b/drivers/gpu/drm/i915/intel_display.c
>@@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
> 			return 128;
> 		else
> 			return 512;
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+		if (plane == 1)
>+			return 64;
>+		/* fall through */
> 	case I915_FORMAT_MOD_Y_TILED:
> 		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
> 			return 128;
> 		else
> 			return 512;
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>+		if (plane == 1)
>+			return 64;
>+		/* fall through */
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		/*
> 		 * Bspec seems to suggest that the Yf tile width would
>@@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
>
> 	/* AUX_DIST needs only 4K alignment */
>-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>+	if (plane == 1)

This looks wrong, at least within this context. Surely other multi-planar
formats might have different alignment restrictions?

> 		return 4096;
>
> 	switch (fb->modifier) {
>@@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> 		if (INTEL_GEN(dev_priv) >= 9)
> 			return 256 * 1024;
> 		return 0;
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> 	case I915_FORMAT_MOD_Y_TILED:
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		return 1 * 1024 * 1024;
>@@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
> 	case I915_FORMAT_MOD_X_TILED:
> 		return I915_TILING_X;
> 	case I915_FORMAT_MOD_Y_TILED:
>+	case I915_FORMAT_MOD_Y_TILED_CCS:

Is I915_FORMAT_MOD_Yf_TILED_CCS supposed to be here?
 
> 		return I915_TILING_Y;
> 	default:
> 		return I915_TILING_NONE;
>@@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>
> 		intel_fb_offset_to_xy(&x, &y, fb, i);
>
>+		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {

In one of my branches I turned this into a macro/inline because I ended up
checking it pretty often (if modifier is CCS type and plane == 1). Just a
thought.
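
Roughly what I have in mind, as a sketch only. The EXAMPLE_* values are placeholders rather than the real I915_FORMAT_MOD_* definitions, and the helper names are made up:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real modifier definitions live in the uapi headers. */
#define EXAMPLE_MOD_Y_TILED		2ull
#define EXAMPLE_MOD_Yf_TILED		3ull
#define EXAMPLE_MOD_Y_TILED_CCS		4ull
#define EXAMPLE_MOD_Yf_TILED_CCS	5ull

/* True for either of the two CCS-carrying modifiers. */
static inline bool example_is_ccs_modifier(uint64_t modifier)
{
	return modifier == EXAMPLE_MOD_Y_TILED_CCS ||
	       modifier == EXAMPLE_MOD_Yf_TILED_CCS;
}

/* True only for the AUX (CCS) plane of a CCS framebuffer, i.e. plane 1. */
static inline bool example_is_ccs_plane(uint64_t modifier, int plane)
{
	assert(plane >= 0);
	return example_is_ccs_modifier(modifier) && plane == 1;
}
```

The open-coded "modifier is CCS && i == 1" test would then collapse to a single call, and the modifier-only variant could be reused wherever just the modifier is checked.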

>+			int main_x, main_y;
>+			int ccs_x, ccs_y;
>+
>+			/*
>+			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
>+			 * each CCS tile is 64x64 bytes.
>+			 */
>+			ccs_x = (x * 16) % (64 * 16);
>+			ccs_y = (y * 8) % (64 * 8);
>+			main_x = intel_fb->normal[0].x % (64 * 16);
>+			main_y = intel_fb->normal[0].y % (64 * 8);
>+
>+			/*
>+			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
>+			 * x/y offsets must match between CCS and the main surface.
>+			 */
>+			if (main_x != ccs_x || main_y != ccs_y) {
>+				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
>+					      main_x, main_y,
>+					      ccs_x, ccs_y,
>+					      intel_fb->normal[0].x,
>+					      intel_fb->normal[0].y,
>+					      x, y);
>+				return -EINVAL;
>+			}
>+		}
>+
> 		/*
> 		 * The fence (if used) is aligned to the start of the object
> 		 * so having the framebuffer wrap around across the edge of the
>@@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> 			break;
> 		}
> 		break;
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>+		/* FIXME AUX plane? */
> 	case I915_FORMAT_MOD_Y_TILED:
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		switch (cpp) {
>@@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> 	return 2048;
> }
>
>+static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
>+					   int main_x, int main_y, u32 main_offset)
>+{
>+	const struct drm_framebuffer *fb = plane_state->base.fb;
>+	int aux_x = plane_state->aux.x;
>+	int aux_y = plane_state->aux.y;
>+	u32 aux_offset = plane_state->aux.offset;
>+	u32 alignment = intel_surf_alignment(fb, 1);
>+
>+	while (aux_offset >= main_offset && aux_y <= main_y) {
>+		int x, y;
>+
>+		if (aux_x == main_x && aux_y == main_y)
>+			break;
>+
>+		if (aux_offset == 0)
>+			break;
>+
>+		x = aux_x / 16;
>+		y = aux_y / 8;
>+		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
>+						      aux_offset, aux_offset - alignment);
>+		aux_x = x * 16 + aux_x % 16;
>+		aux_y = y * 8 + aux_y % 8;
>+	}
>+
>+	if (aux_x != main_x || aux_y != main_y)
>+		return false;
>+
>+	plane_state->aux.offset = aux_offset;
>+	plane_state->aux.x = aux_x;
>+	plane_state->aux.y = aux_y;
>+
>+	return true;
>+}
>+
> static int skl_check_main_surface(struct intel_plane_state *plane_state)
> {
> 	const struct drm_framebuffer *fb = plane_state->base.fb;
>@@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>
> 		while ((x + w) * cpp > fb->pitches[0]) {
> 			if (offset == 0) {
>-				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
>+				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
> 				return -EINVAL;
> 			}
>
>@@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
> 		}
> 	}
>
>+	/*
>+	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
>+	 * they match with the main surface x/y offsets.
>+	 */
>+	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>+		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
>+			if (offset == 0)
>+				break;
>+
>+			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
>+							  offset, offset - alignment);
>+		}
>+
>+		if (x != plane_state->aux.x || y != plane_state->aux.y) {
>+			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
>+			return -EINVAL;
>+		}
>+	}
>+
> 	plane_state->main.offset = offset;
> 	plane_state->main.x = x;
> 	plane_state->main.y = y;
>@@ -2982,6 +3081,53 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
> 	return 0;
> }
>
>+static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
>+{
>+	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
>+	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
>+	int src_x = plane_state->base.src.x1 >> 16;
>+	int src_y = plane_state->base.src.y1 >> 16;
>+	int x = src_x / 16;
>+	int y = src_y / 8;
>+	u32 offset;
>+
>+	switch (plane->id) {
>+	case PLANE_PRIMARY:
>+	case PLANE_SPRITE0:
>+		break;
>+	default:
>+		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
>+		return -EINVAL;
>+	}
>+
>+	if (crtc->pipe == PIPE_C) {
>+		DRM_DEBUG_KMS("No RC support on pipe C\n");
>+		return -EINVAL;
>+	}
>+	/*
>+	 * TODO:
>+	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
>+	 * 2. Render decompression must not be used in VTd pass-through mode
>+	 * 3. Program hashing select CHICKEN_MISC1 bit 15
>+	 */
>+
>+	if (plane_state->base.rotation &&
>+	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
>+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
>+			      plane_state->base.rotation);
>+		return -EINVAL;
>+	}
>+
>+	intel_add_fb_offsets(&x, &y, plane_state, 1);
>+	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
>+
>+	plane_state->aux.offset = offset;
>+	plane_state->aux.x = x * 16 + src_x % 16;
>+	plane_state->aux.y = y * 8 + src_y % 8;
>+
>+	return 0;
>+}
>+
> int skl_check_plane_surface(struct intel_plane_state *plane_state)
> {
> 	const struct drm_framebuffer *fb = plane_state->base.fb;
>@@ -3002,6 +3148,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
> 		ret = skl_check_nv12_aux_surface(plane_state);
> 		if (ret)
> 			return ret;
>+	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>+		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>+		ret = skl_check_ccs_aux_surface(plane_state);
>+		if (ret)
>+			return ret;
> 	} else {
> 		plane_state->aux.offset = ~0xfff;
> 		plane_state->aux.x = 0;
>@@ -3357,8 +3508,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
> 		return PLANE_CTL_TILED_X;
> 	case I915_FORMAT_MOD_Y_TILED:
> 		return PLANE_CTL_TILED_Y;
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		return PLANE_CTL_TILED_YF;
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>+		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> 	default:
> 		MISSING_CASE(fb_modifier);
> 	}
>@@ -3401,6 +3556,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> 	u32 plane_ctl;
> 	unsigned int rotation = plane_state->base.rotation;
> 	u32 stride = skl_plane_stride(fb, 0, rotation);
>+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> 	u32 surf_addr = plane_state->main.offset;
> 	int scaler_id = plane_state->scaler_id;
> 	int src_x = plane_state->main.x;
>@@ -3436,6 +3592,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
> 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>+		   (plane_state->aux.offset - surf_addr) | aux_stride);
>+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>+		   (plane_state->aux.y << 16) | plane_state->aux.x);
>
> 	if (scaler_id >= 0) {
> 		uint32_t ps_ctrl = 0;
>@@ -9807,10 +9967,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
> 		fb->modifier = I915_FORMAT_MOD_X_TILED;
> 		break;
> 	case PLANE_CTL_TILED_Y:
>-		fb->modifier = I915_FORMAT_MOD_Y_TILED;
>+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>+			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
>+		else
>+			fb->modifier = I915_FORMAT_MOD_Y_TILED;
> 		break;
> 	case PLANE_CTL_TILED_YF:
>-		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>+			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
>+		else
>+			fb->modifier = I915_FORMAT_MOD_Yf_TILED;

I'm wondering if this is actually feasible. If we find compression enabled, I'd
think we should just disable it.

> 		break;
> 	default:
> 		MISSING_CASE(tiling);
>@@ -12014,7 +12180,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> 	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>
> 	ctl = I915_READ(PLANE_CTL(pipe, 0));
>-	ctl &= ~PLANE_CTL_TILED_MASK;
>+	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
> 	switch (fb->modifier) {
> 	case DRM_FORMAT_MOD_NONE:
> 		break;
>@@ -12024,9 +12190,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> 	case I915_FORMAT_MOD_Y_TILED:
> 		ctl |= PLANE_CTL_TILED_Y;
> 		break;
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>+		break;
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		ctl |= PLANE_CTL_TILED_YF;
> 		break;
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>+		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>+		break;
> 	default:
> 		MISSING_CASE(fb->modifier);
> 	}
>@@ -15926,8 +16098,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
> {
> 	struct drm_i915_private *dev_priv = to_i915(dev);
> 	unsigned int tiling = i915_gem_object_get_tiling(obj);
>-	int ret;
>-	u32 pitch_limit, stride_alignment;
>+	int ret, i;
>+	u32 pitch_limit;
> 	struct drm_format_name_buf format_name;
>
> 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>@@ -15953,6 +16125,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
>
> 	/* Passed in modifier sanity checking. */
> 	switch (mode_cmd->modifier[0]) {
>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>+		switch (mode_cmd->pixel_format) {
>+		case DRM_FORMAT_XBGR8888:
>+		case DRM_FORMAT_ABGR8888:
>+		case DRM_FORMAT_XRGB8888:
>+		case DRM_FORMAT_ARGB8888:
>+			break;
>+		default:
>+			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
>+			return -EINVAL;
>+		}
>+		/* fall through */
> 	case I915_FORMAT_MOD_Y_TILED:
> 	case I915_FORMAT_MOD_Yf_TILED:
> 		if (INTEL_GEN(dev_priv) < 9) {
>@@ -16061,11 +16246,21 @@ static int intel_framebuffer_init(struct drm_device *dev,
>
> 	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
>
>-	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
>-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
>-		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
>-			      mode_cmd->pitches[0], stride_alignment);
>-		return -EINVAL;
>+	for (i = 0; i < intel_fb->base.format->num_planes; i++) {
>+		u32 stride_alignment;
>+
>+		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
>+			DRM_DEBUG_KMS("bad plane %d handle\n", i);
>+			return -EINVAL;
>+		}
>+
>+		stride_alignment = intel_fb_stride_alignment(&intel_fb->base, i);
>+
>+		if (mode_cmd->pitches[i] & (stride_alignment - 1)) {
>+			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
>+				      i, mode_cmd->pitches[i], stride_alignment);
>+			return -EINVAL;
>+		}
> 	}
>
> 	intel_fb->obj = obj;
>diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>index 249623d45be0..add359022c96 100644
>--- a/drivers/gpu/drm/i915/intel_pm.c
>+++ b/drivers/gpu/drm/i915/intel_pm.c
>@@ -3314,7 +3314,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>
> 	/* For Non Y-tile return 8-blocks */
> 	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
>-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
>+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
>+	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
>+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
> 		return 8;
>
> 	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
>@@ -3590,7 +3592,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
> 	}
>
> 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
>+		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
> 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>
> 	/* Display WA #1141: kbl. */
>diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
>index 7031bc733d97..063a994815d0 100644
>--- a/drivers/gpu/drm/i915/intel_sprite.c
>+++ b/drivers/gpu/drm/i915/intel_sprite.c
>@@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
> 	u32 surf_addr = plane_state->main.offset;
> 	unsigned int rotation = plane_state->base.rotation;
> 	u32 stride = skl_plane_stride(fb, 0, rotation);
>+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> 	int crtc_x = plane_state->base.dst.x1;
> 	int crtc_y = plane_state->base.dst.y1;
> 	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
>@@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
> 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
> 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>+		   (plane_state->aux.offset - surf_addr) | aux_stride);
>+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>+		   (plane_state->aux.y << 16) | plane_state->aux.x);
>
> 	/* program plane scaler */
> 	if (plane_state->scaler_id >= 0) {

Looks good. I haven't tested it.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
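
For reference, a small sketch of how I read the two register writes in the hunks above. It assumes, going by the "AUX_DIST needs only 4K alignment" comment, that the distance is 4K aligned so the low 12 bits are free to carry the stride; the function names and the EXAMPLE_AUX_ALIGN constant are made up, and the actual bit layout should be checked against Bspec.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_AUX_ALIGN	4096u	/* assumed AUX_DIST granularity */

/* Value for PLANE_AUX_DIST: distance from main surface to AUX, OR'd with stride. */
static uint32_t example_aux_dist(uint32_t aux_offset, uint32_t main_offset,
				 uint32_t aux_stride)
{
	uint32_t dist = aux_offset - main_offset;

	assert(aux_offset >= main_offset);		/* AUX sits after the main surface */
	assert((dist & (EXAMPLE_AUX_ALIGN - 1)) == 0);	/* distance is 4K aligned */
	assert(aux_stride < EXAMPLE_AUX_ALIGN);		/* stride fits in the low bits */

	return dist | aux_stride;
}

/* Value for PLANE_AUX_OFFSET: y in the high half, x in the low half. */
static uint32_t example_aux_offset(uint32_t x, uint32_t y)
{
	assert(x <= 0xffff && y <= 0xffff);
	return (y << 16) | x;
}
```

The asserts only spell out the assumptions above; they are not checks the patch itself performs at this point.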

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/9] drm: Add mode_config .get_format_info() hook
  2017-01-05  3:15   ` Ben Widawsky
@ 2017-01-05  8:48     ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2017-01-05  8:48 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: dri-devel, intel-gfx, Laurent Pinchart

On Wed, Jan 04, 2017 at 07:15:34PM -0800, Ben Widawsky wrote:
> On 17-01-04 20:42:24, Ville Syrjälä wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Allow drivers to return a custom drm_format_info structure for special
> > fb layouts. We'll use this for the compression control surface in i915.
> > 
> > v2: Fix drm_get_format_info() kernel doc (Laurent)
> >    Don't pass 'dev' to the new hook (Laurent)
> > 
> > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Ok if I still merge this through drm-misc? Looks like the next few
patches don't need this yet, and the CCS stuff probably needs a bit more
wrangling for all the pieces to be fully ready.

> > +	 *
> > +	 * RETURNS:
> > +	 *
> > +	 * The format information specific to the given fb metadata, or
> > +	 * NULL if none is found.
> > +	 */
> > +	const struct drm_format_info *(*get_format_info)(const struct drm_mode_fb_cmd2 *mode_cmd);
> > +
> > +	/**
> > 	 * @output_poll_changed:
> > 	 *
> > 	 * Callback used by helpers to inform the driver of output configuration
> 
> Looks like msm and omap could use this too, and then if you allowed mode_cmd
> to be NULL, you could potentially deprecate drm_format_info. Just a thought.

Hm, what do you mean here with deprecating drm_format_info and a NULL
mode_cmd? I don't follow at all ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 9/9] drm/i915: Add render decompression support
  2017-01-05  4:25   ` Ben Widawsky
@ 2017-01-05 15:11     ` Ville Syrjälä
  0 siblings, 0 replies; 44+ messages in thread
From: Ville Syrjälä @ 2017-01-05 15:11 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Vandana Kannan, dri-devel

On Wed, Jan 04, 2017 at 08:25:23PM -0800, Ben Widawsky wrote:
> On 17-01-04 20:42:32, Ville Syrjälä wrote:
> >From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> >SKL+ display engine can scan out certain kinds of compressed surfaces
> >produced by the render engine. This involves telling the display engine
> >the location of the color control surface (CCS) which describes
> >which parts of the main surface are compressed and which are not. The
> >location of CCS is provided by userspace as just another plane with its
> >own offset.
> >
> >Add the required stuff to validate the user provided AUX plane metadata
> >and convert the user provided linear offset into something the hardware
> >can consume.
> >
> >Due to hardware limitations we require that the main surface and
> >the AUX surface (CCS) be part of the same bo. The hardware also
> >makes life hard by not allowing you to provide separate x/y offsets
> >for the main and AUX surfaces (except with NV12), so finding suitable
> >offsets for both requires a bit of work. Assuming we still want to keep
> >playing tricks with the offsets. I've just gone with a dumb "search
> >backward for suitable offsets" approach, which is far from optimal,
> >but it works.
> >
> >Also not all planes will be capable of scanning out compressed surfaces,
> >and eg. 90/270 degree rotation is not supported in combination with
> >decompression either.
> >
> >This patch may contain work from at least the following people:
> >* Vandana Kannan <vandana.kannan@intel.com>
> >* Daniel Vetter <daniel@ffwll.ch>
> >* Ben Widawsky <ben@bwidawsk.net>
> >
> >Cc: Vandana Kannan <vandana.kannan@intel.com>
> >Cc: Daniel Vetter <daniel@ffwll.ch>
> >Cc: Ben Widawsky <ben@bwidawsk.net>
> >Cc: Jason Ekstrand <jason@jlekstrand.net>
> >Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >---
> > drivers/gpu/drm/i915/i915_reg.h      |  22 ++++
> > drivers/gpu/drm/i915/intel_display.c | 219 +++++++++++++++++++++++++++++++++--
> > drivers/gpu/drm/i915/intel_pm.c      |   8 +-
> > drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> > 4 files changed, 240 insertions(+), 14 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >index 00970aa77afa..05e18e742776 100644
> >--- a/drivers/gpu/drm/i915/i915_reg.h
> >+++ b/drivers/gpu/drm/i915/i915_reg.h
> >@@ -6209,6 +6209,28 @@ enum {
> > 			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
> > 			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
> >
> >+#define PLANE_AUX_DIST_1_A		0x701c0
> >+#define PLANE_AUX_DIST_2_A		0x702c0
> >+#define PLANE_AUX_DIST_1_B		0x711c0
> >+#define PLANE_AUX_DIST_2_B		0x712c0
> >+#define _PLANE_AUX_DIST_1(pipe) \
> >+			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> >+#define _PLANE_AUX_DIST_2(pipe) \
> >+			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> >+#define PLANE_AUX_DIST(pipe, plane)     \
> >+	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
> >+
> >+#define PLANE_AUX_OFFSET_1_A		0x701c4
> >+#define PLANE_AUX_OFFSET_2_A		0x702c4
> >+#define PLANE_AUX_OFFSET_1_B		0x711c4
> >+#define PLANE_AUX_OFFSET_2_B		0x712c4
> >+#define _PLANE_AUX_OFFSET_1(pipe)       \
> >+		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> >+#define _PLANE_AUX_OFFSET_2(pipe)       \
> >+		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> >+#define PLANE_AUX_OFFSET(pipe, plane)   \
> >+	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
> >+
> > /* legacy palette */
> > #define _LGC_PALETTE_A           0x4a000
> > #define _LGC_PALETTE_B           0x4a800
> >diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> >index 38de9df0ec60..b547332eeda1 100644
> >--- a/drivers/gpu/drm/i915/intel_display.c
> >+++ b/drivers/gpu/drm/i915/intel_display.c
> >@@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
> > 			return 128;
> > 		else
> > 			return 512;
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+		if (plane == 1)
> >+			return 64;
> >+		/* fall through */
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
> > 			return 128;
> > 		else
> > 			return 512;
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >+		if (plane == 1)
> >+			return 64;
> >+		/* fall through */
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		/*
> > 		 * Bspec seems to suggest that the Yf tile width would
> >@@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> > 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
> >
> > 	/* AUX_DIST needs only 4K alignment */
> >-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> >+	if (plane == 1)
> 
> This looks wrong at least within this context, surely multi-planar formats might
> have different alignment restrictions?

Nope. Well, NV12 is the only other case, and it also requires 4k alignment
for AUX_DIST. Note that we don't actually support NV12 yet; we just happen
to have some code for it which will become useful in the future (I hope).

> 
> > 		return 4096;
> >
> > 	switch (fb->modifier) {
> >@@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> > 		if (INTEL_GEN(dev_priv) >= 9)
> > 			return 256 * 1024;
> > 		return 0;
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		return 1 * 1024 * 1024;
> >@@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
> > 	case I915_FORMAT_MOD_X_TILED:
> > 		return I915_TILING_X;
> > 	case I915_FORMAT_MOD_Y_TILED:
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> 
> Is I915_FORMAT_MOD_Yf_TILED_CCS supposed to be here?

Nope. There's no fence tiling format for Yf.
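
To spell that out, a minimal sketch of the mapping with stand-in names (the EXAMPLE_* modifier values and the enum are placeholders, not the real I915_FORMAT_MOD_* / I915_TILING_* definitions): only X and Y have a fence tiling mode, so both Yf variants end up in the default case.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the fence tiling modes. */
enum example_tiling {
	EXAMPLE_TILING_NONE,
	EXAMPLE_TILING_X,
	EXAMPLE_TILING_Y,
};

static_assert(EXAMPLE_TILING_NONE == 0, "NONE is expected to be the zero value");

/* Placeholder modifier values, for illustration only. */
#define EXAMPLE_MOD_LINEAR		0ull
#define EXAMPLE_MOD_X_TILED		1ull
#define EXAMPLE_MOD_Y_TILED		2ull
#define EXAMPLE_MOD_Yf_TILED		3ull
#define EXAMPLE_MOD_Y_TILED_CCS		4ull
#define EXAMPLE_MOD_Yf_TILED_CCS	5ull

/* Map a framebuffer modifier to the fence tiling the object must use. */
static enum example_tiling example_modifier_to_tiling(uint64_t modifier)
{
	switch (modifier) {
	case EXAMPLE_MOD_X_TILED:
		return EXAMPLE_TILING_X;
	case EXAMPLE_MOD_Y_TILED:
	case EXAMPLE_MOD_Y_TILED_CCS:
		return EXAMPLE_TILING_Y;
	default:
		/* Linear, Yf and Yf+CCS: no matching fence tiling mode. */
		return EXAMPLE_TILING_NONE;
	}
}
```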

>  
> > 		return I915_TILING_Y;
> > 	default:
> > 		return I915_TILING_NONE;
> >@@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
> >
> > 		intel_fb_offset_to_xy(&x, &y, fb, i);
> >
> >+		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> 
> In one of my branches I turned this into a macro/inline because I ended up
> checking it pretty often (if modifier is CCS type and plane == 1). Just a
> thought.

I think this might be the only place we check it now.

> 
> >+			int main_x, main_y;
> >+			int ccs_x, ccs_y;
> >+
> >+			/*
> >+			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> >+			 * each CCS tile is 64x64 bytes.
> >+			 */
> >+			ccs_x = (x * 16) % (64 * 16);
> >+			ccs_y = (y * 8) % (64 * 8);
> >+			main_x = intel_fb->normal[0].x % (64 * 16);
> >+			main_y = intel_fb->normal[0].y % (64 * 8);
> >+
> >+			/*
> >+			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
> >+			 * x/y offsets must match between CCS and the main surface.
> >+			 */
> >+			if (main_x != ccs_x || main_y != ccs_y) {
> >+				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
> >+					      main_x, main_y,
> >+					      ccs_x, ccs_y,
> >+					      intel_fb->normal[0].x,
> >+					      intel_fb->normal[0].y,
> >+					      x, y);
> >+				return -EINVAL;
> >+			}
> >+		}
> >+
> > 		/*
> > 		 * The fence (if used) is aligned to the start of the object
> > 		 * so having the framebuffer wrap around across the edge of the
> >@@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> > 			break;
> > 		}
> > 		break;
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >+		/* FIXME AUX plane? */
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		switch (cpp) {
> >@@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> > 	return 2048;
> > }
> >
> >+static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
> >+					   int main_x, int main_y, u32 main_offset)
> >+{
> >+	const struct drm_framebuffer *fb = plane_state->base.fb;
> >+	int aux_x = plane_state->aux.x;
> >+	int aux_y = plane_state->aux.y;
> >+	u32 aux_offset = plane_state->aux.offset;
> >+	u32 alignment = intel_surf_alignment(fb, 1);
> >+
> >+	while (aux_offset >= main_offset && aux_y <= main_y) {
> >+		int x, y;
> >+
> >+		if (aux_x == main_x && aux_y == main_y)
> >+			break;
> >+
> >+		if (aux_offset == 0)
> >+			break;
> >+
> >+		x = aux_x / 16;
> >+		y = aux_y / 8;
> >+		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
> >+						      aux_offset, aux_offset - alignment);
> >+		aux_x = x * 16 + aux_x % 16;
> >+		aux_y = y * 8 + aux_y % 8;
> >+	}
> >+
> >+	if (aux_x != main_x || aux_y != main_y)
> >+		return false;
> >+
> >+	plane_state->aux.offset = aux_offset;
> >+	plane_state->aux.x = aux_x;
> >+	plane_state->aux.y = aux_y;
> >+
> >+	return true;
> >+}
> >+
> > static int skl_check_main_surface(struct intel_plane_state *plane_state)
> > {
> > 	const struct drm_framebuffer *fb = plane_state->base.fb;
> >@@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
> >
> > 		while ((x + w) * cpp > fb->pitches[0]) {
> > 			if (offset == 0) {
> >-				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
> >+				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
> > 				return -EINVAL;
> > 			}
> >
> >@@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
> > 		}
> > 	}
> >
> >+	/*
> >+	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
> >+	 * they match with the main surface x/y offsets.
> >+	 */
> >+	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> >+		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
> >+			if (offset == 0)
> >+				break;
> >+
> >+			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
> >+							  offset, offset - alignment);
> >+		}
> >+
> >+		if (x != plane_state->aux.x || y != plane_state->aux.y) {
> >+			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
> >+			return -EINVAL;
> >+		}
> >+	}
> >+
> > 	plane_state->main.offset = offset;
> > 	plane_state->main.x = x;
> > 	plane_state->main.y = y;
> >@@ -2982,6 +3081,53 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
> > 	return 0;
> > }
> >
> >+static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
> >+{
> >+	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> >+	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> >+	int src_x = plane_state->base.src.x1 >> 16;
> >+	int src_y = plane_state->base.src.y1 >> 16;
> >+	int x = src_x / 16;
> >+	int y = src_y / 8;
> >+	u32 offset;
> >+
> >+	switch (plane->id) {
> >+	case PLANE_PRIMARY:
> >+	case PLANE_SPRITE0:
> >+		break;
> >+	default:
> >+		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> >+		return -EINVAL;
> >+	}
> >+
> >+	if (crtc->pipe == PIPE_C) {
> >+		DRM_DEBUG_KMS("No RC support on pipe C\n");
> >+		return -EINVAL;
> >+	}
> >+	/*
> >+	 * TODO:
> >+	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> >+	 * 2. Render decompression must not be used in VTd pass-through mode
> >+	 * 3. Program hashing select CHICKEN_MISC1 bit 15
> >+	 */
> >+
> >+	if (plane_state->base.rotation &&
> >+	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
> >+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
> >+			      plane_state->base.rotation);
> >+		return -EINVAL;
> >+	}
> >+
> >+	intel_add_fb_offsets(&x, &y, plane_state, 1);
> >+	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> >+
> >+	plane_state->aux.offset = offset;
> >+	plane_state->aux.x = x * 16 + src_x % 16;
> >+	plane_state->aux.y = y * 8 + src_y % 8;
> >+
> >+	return 0;
> >+}
> >+
> > int skl_check_plane_surface(struct intel_plane_state *plane_state)
> > {
> > 	const struct drm_framebuffer *fb = plane_state->base.fb;
> >@@ -3002,6 +3148,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
> > 		ret = skl_check_nv12_aux_surface(plane_state);
> > 		if (ret)
> > 			return ret;
> >+	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >+		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> >+		ret = skl_check_ccs_aux_surface(plane_state);
> >+		if (ret)
> >+			return ret;
> > 	} else {
> > 		plane_state->aux.offset = ~0xfff;
> > 		plane_state->aux.x = 0;
> >@@ -3357,8 +3508,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
> > 		return PLANE_CTL_TILED_X;
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 		return PLANE_CTL_TILED_Y;
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		return PLANE_CTL_TILED_YF;
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >+		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> > 	default:
> > 		MISSING_CASE(fb_modifier);
> > 	}
> >@@ -3401,6 +3556,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> > 	u32 plane_ctl;
> > 	unsigned int rotation = plane_state->base.rotation;
> > 	u32 stride = skl_plane_stride(fb, 0, rotation);
> >+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> > 	u32 surf_addr = plane_state->main.offset;
> > 	int scaler_id = plane_state->scaler_id;
> > 	int src_x = plane_state->main.x;
> >@@ -3436,6 +3592,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> > 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
> > 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> > 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> >+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> >+		   (plane_state->aux.offset - surf_addr) | aux_stride);
> >+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> >+		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >
> > 	if (scaler_id >= 0) {
> > 		uint32_t ps_ctrl = 0;
> >@@ -9807,10 +9967,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
> > 		fb->modifier = I915_FORMAT_MOD_X_TILED;
> > 		break;
> > 	case PLANE_CTL_TILED_Y:
> >-		fb->modifier = I915_FORMAT_MOD_Y_TILED;
> >+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> >+			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> >+		else
> >+			fb->modifier = I915_FORMAT_MOD_Y_TILED;
> > 		break;
> > 	case PLANE_CTL_TILED_YF:
> >-		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> >+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> >+			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> >+		else
> >+			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> 
> I'm wondering if this is actually feasible. If we find compression enabled, I'd
> think we should just disable it.

Adding more thorough plane readout support is somewhere on my todo list.
I think that would be step 1 in properly sanitizing the plane state (and
hopefully eliminating the _noatomic() disable stuff).

> 
> > 		break;
> > 	default:
> > 		MISSING_CASE(tiling);
> >@@ -12014,7 +12180,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> > 	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
> >
> > 	ctl = I915_READ(PLANE_CTL(pipe, 0));
> >-	ctl &= ~PLANE_CTL_TILED_MASK;
> >+	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
> > 	switch (fb->modifier) {
> > 	case DRM_FORMAT_MOD_NONE:
> > 		break;
> >@@ -12024,9 +12190,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 		ctl |= PLANE_CTL_TILED_Y;
> > 		break;
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> >+		break;
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		ctl |= PLANE_CTL_TILED_YF;
> > 		break;
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >+		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> >+		break;
> > 	default:
> > 		MISSING_CASE(fb->modifier);
> > 	}
> >@@ -15926,8 +16098,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
> > {
> > 	struct drm_i915_private *dev_priv = to_i915(dev);
> > 	unsigned int tiling = i915_gem_object_get_tiling(obj);
> >-	int ret;
> >-	u32 pitch_limit, stride_alignment;
> >+	int ret, i;
> >+	u32 pitch_limit;
> > 	struct drm_format_name_buf format_name;
> >
> > 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> >@@ -15953,6 +16125,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
> >
> > 	/* Passed in modifier sanity checking. */
> > 	switch (mode_cmd->modifier[0]) {
> >+	case I915_FORMAT_MOD_Y_TILED_CCS:
> >+	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >+		switch (mode_cmd->pixel_format) {
> >+		case DRM_FORMAT_XBGR8888:
> >+		case DRM_FORMAT_ABGR8888:
> >+		case DRM_FORMAT_XRGB8888:
> >+		case DRM_FORMAT_ARGB8888:
> >+			break;
> >+		default:
> >+			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> >+			return -EINVAL;
> >+		}
> >+		/* fall through */
> > 	case I915_FORMAT_MOD_Y_TILED:
> > 	case I915_FORMAT_MOD_Yf_TILED:
> > 		if (INTEL_GEN(dev_priv) < 9) {
> >@@ -16061,11 +16246,21 @@ static int intel_framebuffer_init(struct drm_device *dev,
> >
> > 	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> >
> >-	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> >-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> >-		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
> >-			      mode_cmd->pitches[0], stride_alignment);
> >-		return -EINVAL;
> >+	for (i = 0; i < intel_fb->base.format->num_planes; i++) {
> >+		u32 stride_alignment;
> >+
> >+		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> >+			DRM_DEBUG_KMS("bad plane %d handle\n", i);
> >+			return -EINVAL;
> >+		}
> >+
> >+		stride_alignment = intel_fb_stride_alignment(&intel_fb->base, i);
> >+
> >+		if (mode_cmd->pitches[i] & (stride_alignment - 1)) {
> >+			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
> >+				      i, mode_cmd->pitches[i], stride_alignment);
> >+			return -EINVAL;
> >+		}
> > 	}
> >
> > 	intel_fb->obj = obj;
> >diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >index 249623d45be0..add359022c96 100644
> >--- a/drivers/gpu/drm/i915/intel_pm.c
> >+++ b/drivers/gpu/drm/i915/intel_pm.c
> >@@ -3314,7 +3314,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
> >
> > 	/* For Non Y-tile return 8-blocks */
> > 	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> >-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> >+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> >+	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> >+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
> > 		return 8;
> >
> > 	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> >@@ -3590,7 +3592,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
> > 	}
> >
> > 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> >-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> >+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> >+		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
> > 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
> >
> > 	/* Display WA #1141: kbl. */
> >diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> >index 7031bc733d97..063a994815d0 100644
> >--- a/drivers/gpu/drm/i915/intel_sprite.c
> >+++ b/drivers/gpu/drm/i915/intel_sprite.c
> >@@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
> > 	u32 surf_addr = plane_state->main.offset;
> > 	unsigned int rotation = plane_state->base.rotation;
> > 	u32 stride = skl_plane_stride(fb, 0, rotation);
> >+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> > 	int crtc_x = plane_state->base.dst.x1;
> > 	int crtc_y = plane_state->base.dst.y1;
> > 	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> >@@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
> > 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
> > 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> > 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> >+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> >+		   (plane_state->aux.offset - surf_addr) | aux_stride);
> >+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> >+		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >
> > 	/* program plane scaler */
> > 	if (plane_state->scaler_id >= 0) {
> 
> Looks good. I haven't tested it.
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 9/9] drm/i915: Add render decompression support
  2017-01-04 19:14   ` Paulo Zanoni
@ 2017-01-05 15:12     ` Ville Syrjälä
  0 siblings, 0 replies; 44+ messages in thread
From: Ville Syrjälä @ 2017-01-05 15:12 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Ben Widawsky, dri-devel, Vandana Kannan

On Wed, Jan 04, 2017 at 05:14:04PM -0200, Paulo Zanoni wrote:
> On Wed, 2017-01-04 at 20:42 +0200, ville.syrjala@linux.intel.com
> wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > SKL+ display engine can scan out certain kinds of compressed surfaces
> > produced by the render engine. This involves telling the display
> > engine
> > the location of the color control surface (CCS) which describes
> > which parts of the main surface are compressed and which are not. The
> > location of CCS is provided by userspace as just another plane with
> > its
> > own offset.
> > 
> > Add the required stuff to validate the user provided AUX plane
> > metadata
> > and convert the user provided linear offset into something the
> > hardware
> > can consume.
> > 
> > Due to hardware limitations we require that the main surface and
> > the AUX surface (CCS) be part of the same bo. The hardware also
> > makes life hard by not allowing you to provide separate x/y offsets
> > for the main and AUX surfaces (except with NV12), so finding suitable
> > offsets for both requires a bit of work. Assuming we still want to keep
> > playing tricks with the offsets. I've just gone with a dumb "search
> > backward for suitable offsets" approach, which is far from optimal,
> > but it works.
> > 
> > Also not all planes will be capable of scanning out compressed
> > surfaces,
> > and eg. 90/270 degree rotation is not supported in combination with
> > decompression either.
> > 
> > This patch may contain work from at least the following people:
> > * Vandana Kannan <vandana.kannan@intel.com>
> > * Daniel Vetter <daniel@ffwll.ch>
> > * Ben Widawsky <ben@bwidawsk.net>
> 
> As I mentioned to Ben in the other email, there are some points of
> BSpec that say "if render decompression is enabled, do this", which we
> largely ignored so far. I hope they are all marked as workarounds. From
> a quick look, it looks like we need at least Display WAs #0390, #0531
> and #1125, and maybe some other non-display WAs (please take a look at
> the BSpec list). I'll assume they were not implemented yet since I
> don't see WA comments on the patches. I think we need them, otherwise
> we may introduce more SKL flickering problems.

I went through the list and those three do seem like the likely things
we want. I'll post a revised patch with those included.

> 
> Thanks,
> Paulo
> 
> > 
> > Cc: Vandana Kannan <vandana.kannan@intel.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h      |  22 ++++
> >  drivers/gpu/drm/i915/intel_display.c | 219
> > +++++++++++++++++++++++++++++++++--
> >  drivers/gpu/drm/i915/intel_pm.c      |   8 +-
> >  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> >  4 files changed, 240 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index 00970aa77afa..05e18e742776 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -6209,6 +6209,28 @@ enum {
> >  			_ID(id, _PS_ECC_STAT_1A,
> > _PS_ECC_STAT_2A),   \
> >  			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
> >  
> > +#define PLANE_AUX_DIST_1_A		0x701c0
> > +#define PLANE_AUX_DIST_2_A		0x702c0
> > +#define PLANE_AUX_DIST_1_B		0x711c0
> > +#define PLANE_AUX_DIST_2_B		0x712c0
> > +#define _PLANE_AUX_DIST_1(pipe) \
> > +			_PIPE(pipe, PLANE_AUX_DIST_1_A,
> > PLANE_AUX_DIST_1_B)
> > +#define _PLANE_AUX_DIST_2(pipe) \
> > +			_PIPE(pipe, PLANE_AUX_DIST_2_A,
> > PLANE_AUX_DIST_2_B)
> > +#define PLANE_AUX_DIST(pipe, plane)     \
> > +	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
> > _PLANE_AUX_DIST_2(pipe))
> > +
> > +#define PLANE_AUX_OFFSET_1_A		0x701c4
> > +#define PLANE_AUX_OFFSET_2_A		0x702c4
> > +#define PLANE_AUX_OFFSET_1_B		0x711c4
> > +#define PLANE_AUX_OFFSET_2_B		0x712c4
> > +#define _PLANE_AUX_OFFSET_1(pipe)       \
> > +		_PIPE(pipe, PLANE_AUX_OFFSET_1_A,
> > PLANE_AUX_OFFSET_1_B)
> > +#define _PLANE_AUX_OFFSET_2(pipe)       \
> > +		_PIPE(pipe, PLANE_AUX_OFFSET_2_A,
> > PLANE_AUX_OFFSET_2_B)
> > +#define PLANE_AUX_OFFSET(pipe, plane)   \
> > +	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
> > _PLANE_AUX_OFFSET_2(pipe))
> > +
> >  /* legacy palette */
> >  #define _LGC_PALETTE_A           0x4a000
> >  #define _LGC_PALETTE_B           0x4a800
> > diff --git a/drivers/gpu/drm/i915/intel_display.c
> > b/drivers/gpu/drm/i915/intel_display.c
> > index 38de9df0ec60..b547332eeda1 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
> > drm_framebuffer *fb, int plane)
> >  			return 128;
> >  		else
> >  			return 512;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		if (plane == 1)
> > +			return 64;
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		if (IS_GEN2(dev_priv) ||
> > HAS_128_BYTE_Y_TILING(dev_priv))
> >  			return 128;
> >  		else
> >  			return 512;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		if (plane == 1)
> > +			return 64;
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		/*
> >  		 * Bspec seems to suggest that the Yf tile width
> > would
> > @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> > struct drm_framebuffer *fb,
> >  	struct drm_i915_private *dev_priv = to_i915(fb->dev);
> >  
> >  	/* AUX_DIST needs only 4K alignment */
> > -	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> > +	if (plane == 1)
> >  		return 4096;
> >  
> >  	switch (fb->modifier) {
> > @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> > struct drm_framebuffer *fb,
> >  		if (INTEL_GEN(dev_priv) >= 9)
> >  			return 256 * 1024;
> >  		return 0;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		return 1 * 1024 * 1024;
> > @@ -2472,6 +2482,7 @@ static unsigned int
> > intel_fb_modifier_to_tiling(uint64_t fb_modifier)
> >  	case I915_FORMAT_MOD_X_TILED:
> >  		return I915_TILING_X;
> >  	case I915_FORMAT_MOD_Y_TILED:
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> >  		return I915_TILING_Y;
> >  	default:
> >  		return I915_TILING_NONE;
> > @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> > *dev_priv,
> >  
> >  		intel_fb_offset_to_xy(&x, &y, fb, i);
> >  
> > +		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS)
> > && i == 1) {
> > +			int main_x, main_y;
> > +			int ccs_x, ccs_y;
> > +
> > +			/*
> > +			 * Each byte of CCS corresponds to a 16x8
> > area of the main surface, and
> > +			 * each CCS tile is 64x64 bytes.
> > +			 */
> > +			ccs_x = (x * 16) % (64 * 16);
> > +			ccs_y = (y * 8) % (64 * 8);
> > +			main_x = intel_fb->normal[0].x % (64 * 16);
> > +			main_y = intel_fb->normal[0].y % (64 * 8);
> > +
> > +			/*
> > +			 * CCS doesn't have its own x/y offset
> > register, so the intra CCS tile
> > +			 * x/y offsets must match between CCS and
> > the main surface.
> > +			 */
> > +			if (main_x != ccs_x || main_y != ccs_y) {
> > +				DRM_DEBUG_KMS("Bad CCS x/y (main
> > %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
> > +					      main_x, main_y,
> > +					      ccs_x, ccs_y,
> > +					      intel_fb->normal[0].x,
> > +					      intel_fb->normal[0].y,
> > +					      x, y);
> > +				return -EINVAL;
> > +			}
> > +		}
> > +
> >  		/*
> >  		 * The fence (if used) is aligned to the start of
> > the object
> >  		 * so having the framebuffer wrap around across the
> > edge of the
> > @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
> > drm_framebuffer *fb, int plane,
> >  			break;
> >  		}
> >  		break;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		/* FIXME AUX plane? */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		switch (cpp) {
> > @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
> > drm_framebuffer *fb, int plane,
> >  	return 2048;
> >  }
> >  
> > +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
> > *plane_state,
> > +					   int main_x, int main_y,
> > u32 main_offset)
> > +{
> > +	const struct drm_framebuffer *fb = plane_state->base.fb;
> > +	int aux_x = plane_state->aux.x;
> > +	int aux_y = plane_state->aux.y;
> > +	u32 aux_offset = plane_state->aux.offset;
> > +	u32 alignment = intel_surf_alignment(fb, 1);
> > +
> > +	while (aux_offset >= main_offset && aux_y <= main_y) {
> > +		int x, y;
> > +
> > +		if (aux_x == main_x && aux_y == main_y)
> > +			break;
> > +
> > +		if (aux_offset == 0)
> > +			break;
> > +
> > +		x = aux_x / 16;
> > +		y = aux_y / 8;
> > +		aux_offset = intel_adjust_tile_offset(&x, &y,
> > plane_state, 1,
> > +						      aux_offset,
> > aux_offset - alignment);
> > +		aux_x = x * 16 + aux_x % 16;
> > +		aux_y = y * 8 + aux_y % 8;
> > +	}
> > +
> > +	if (aux_x != main_x || aux_y != main_y)
> > +		return false;
> > +
> > +	plane_state->aux.offset = aux_offset;
> > +	plane_state->aux.x = aux_x;
> > +	plane_state->aux.y = aux_y;
> > +
> > +	return true;
> > +}
> > +
> >  static int skl_check_main_surface(struct intel_plane_state
> > *plane_state)
> >  {
> >  	const struct drm_framebuffer *fb = plane_state->base.fb;
> > @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
> > intel_plane_state *plane_state)
> >  
> >  		while ((x + w) * cpp > fb->pitches[0]) {
> >  			if (offset == 0) {
> > -				DRM_DEBUG_KMS("Unable to find
> > suitable display surface offset\n");
> > +				DRM_DEBUG_KMS("Unable to find
> > suitable display surface offset due to X-tiling\n");
> >  				return -EINVAL;
> >  			}
> >  
> > @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
> > intel_plane_state *plane_state)
> >  		}
> >  	}
> >  
> > +	/*
> > +	 * CCS AUX surface doesn't have its own x/y offsets, we must
> > make sure
> > +	 * they match with the main surface x/y offsets.
> > +	 */
> > +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> > +		while (!skl_check_main_ccs_coordinates(plane_state,
> > x, y, offset)) {
> > +			if (offset == 0)
> > +				break;
> > +
> > +			offset = intel_adjust_tile_offset(&x, &y,
> > plane_state, 0,
> > +							  offset,
> > offset - alignment);
> > +		}
> > +
> > +		if (x != plane_state->aux.x || y != plane_state-
> > >aux.y) {
> > +			DRM_DEBUG_KMS("Unable to find suitable
> > display surface offset due to CCS\n");
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> >  	plane_state->main.offset = offset;
> >  	plane_state->main.x = x;
> >  	plane_state->main.y = y;
> > @@ -2982,6 +3081,53 @@ static int skl_check_nv12_aux_surface(struct
> > intel_plane_state *plane_state)
> >  	return 0;
> >  }
> >  
> > +static int skl_check_ccs_aux_surface(struct intel_plane_state
> > *plane_state)
> > +{
> > +	struct intel_plane *plane = to_intel_plane(plane_state-
> > >base.plane);
> > +	struct intel_crtc *crtc = to_intel_crtc(plane_state-
> > >base.crtc);
> > +	int src_x = plane_state->base.src.x1 >> 16;
> > +	int src_y = plane_state->base.src.y1 >> 16;
> > +	int x = src_x / 16;
> > +	int y = src_y / 8;
> > +	u32 offset;
> > +
> > +	switch (plane->id) {
> > +	case PLANE_PRIMARY:
> > +	case PLANE_SPRITE0:
> > +		break;
> > +	default:
> > +		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (crtc->pipe == PIPE_C) {
> > +		DRM_DEBUG_KMS("No RC support on pipe C\n");
> > +		return -EINVAL;
> > +	}
> > +	/*
> > +	 * TODO:
> > +	 * 1. Disable stereo 3D when render decomp is enabled (bit
> > 7:6)
> > +	 * 2. Render decompression must not be used in VTd pass-
> > through mode
> > +	 * 3. Program hashing select CHICKEN_MISC1 bit 15
> > +	 */
> > +
> > +	if (plane_state->base.rotation &&
> > +	    plane_state->base.rotation & ~(DRM_ROTATE_0 |
> > DRM_ROTATE_180)) {
> > +		DRM_DEBUG_KMS("RC support only with 0/180 degree
> > rotation %x\n",
> > +			      plane_state->base.rotation);
> > +		return -EINVAL;
> > +	}
> > +
> > +	intel_add_fb_offsets(&x, &y, plane_state, 1);
> > +	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> > +
> > +	plane_state->aux.offset = offset;
> > +	plane_state->aux.x = x * 16 + src_x % 16;
> > +	plane_state->aux.y = y * 8 + src_y % 8;
> > +
> > +	return 0;
> > +}
> > +
> >  int skl_check_plane_surface(struct intel_plane_state *plane_state)
> >  {
> >  	const struct drm_framebuffer *fb = plane_state->base.fb;
> > @@ -3002,6 +3148,11 @@ int skl_check_plane_surface(struct
> > intel_plane_state *plane_state)
> >  		ret = skl_check_nv12_aux_surface(plane_state);
> >  		if (ret)
> >  			return ret;
> > +	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> > +		ret = skl_check_ccs_aux_surface(plane_state);
> > +		if (ret)
> > +			return ret;
> >  	} else {
> >  		plane_state->aux.offset = ~0xfff;
> >  		plane_state->aux.x = 0;
> > @@ -3357,8 +3508,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
> >  		return PLANE_CTL_TILED_X;
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		return PLANE_CTL_TILED_Y;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		return PLANE_CTL_TILED_Y |
> > PLANE_CTL_DECOMPRESSION_ENABLE;
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		return PLANE_CTL_TILED_YF;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		return PLANE_CTL_TILED_YF |
> > PLANE_CTL_DECOMPRESSION_ENABLE;
> >  	default:
> >  		MISSING_CASE(fb_modifier);
> >  	}
> > @@ -3401,6 +3556,7 @@ static void skylake_update_primary_plane(struct
> > drm_plane *plane,
> >  	u32 plane_ctl;
> >  	unsigned int rotation = plane_state->base.rotation;
> >  	u32 stride = skl_plane_stride(fb, 0, rotation);
> > +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >  	u32 surf_addr = plane_state->main.offset;
> >  	int scaler_id = plane_state->scaler_id;
> >  	int src_x = plane_state->main.x;
> > @@ -3436,6 +3592,10 @@ static void
> > skylake_update_primary_plane(struct drm_plane *plane,
> >  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) |
> > src_x);
> >  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) |
> > src_w);
> > +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> > +		   (plane_state->aux.offset - surf_addr) |
> > aux_stride);
> > +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> > +		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >  
> >  	if (scaler_id >= 0) {
> >  		uint32_t ps_ctrl = 0;
> > @@ -9807,10 +9967,16 @@ skylake_get_initial_plane_config(struct
> > intel_crtc *crtc,
> >  		fb->modifier = I915_FORMAT_MOD_X_TILED;
> >  		break;
> >  	case PLANE_CTL_TILED_Y:
> > -		fb->modifier = I915_FORMAT_MOD_Y_TILED;
> > +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> > +			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> > +		else
> > +			fb->modifier = I915_FORMAT_MOD_Y_TILED;
> >  		break;
> >  	case PLANE_CTL_TILED_YF:
> > -		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> > +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> > +			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> > +		else
> > +			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> >  		break;
> >  	default:
> >  		MISSING_CASE(tiling);
> > @@ -12014,7 +12180,7 @@ static void skl_do_mmio_flip(struct
> > intel_crtc *intel_crtc,
> >  	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
> >  
> >  	ctl = I915_READ(PLANE_CTL(pipe, 0));
> > -	ctl &= ~PLANE_CTL_TILED_MASK;
> > +	ctl &= ~(PLANE_CTL_TILED_MASK |
> > PLANE_CTL_DECOMPRESSION_ENABLE);
> >  	switch (fb->modifier) {
> >  	case DRM_FORMAT_MOD_NONE:
> >  		break;
> > @@ -12024,9 +12190,15 @@ static void skl_do_mmio_flip(struct
> > intel_crtc *intel_crtc,
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		ctl |= PLANE_CTL_TILED_Y;
> >  		break;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		ctl |= PLANE_CTL_TILED_Y |
> > PLANE_CTL_DECOMPRESSION_ENABLE;
> > +		break;
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		ctl |= PLANE_CTL_TILED_YF;
> >  		break;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		ctl |= PLANE_CTL_TILED_YF |
> > PLANE_CTL_DECOMPRESSION_ENABLE;
> > +		break;
> >  	default:
> >  		MISSING_CASE(fb->modifier);
> >  	}
> > @@ -15926,8 +16098,8 @@ static int intel_framebuffer_init(struct
> > drm_device *dev,
> >  {
> >  	struct drm_i915_private *dev_priv = to_i915(dev);
> >  	unsigned int tiling = i915_gem_object_get_tiling(obj);
> > -	int ret;
> > -	u32 pitch_limit, stride_alignment;
> > +	int ret, i;
> > +	u32 pitch_limit;
> >  	struct drm_format_name_buf format_name;
> >  
> >  	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> > @@ -15953,6 +16125,19 @@ static int intel_framebuffer_init(struct
> > drm_device *dev,
> >  
> >  	/* Passed in modifier sanity checking. */
> >  	switch (mode_cmd->modifier[0]) {
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		switch (mode_cmd->pixel_format) {
> > +		case DRM_FORMAT_XBGR8888:
> > +		case DRM_FORMAT_ABGR8888:
> > +		case DRM_FORMAT_XRGB8888:
> > +		case DRM_FORMAT_ARGB8888:
> > +			break;
> > +		default:
> > +			DRM_DEBUG_KMS("RC supported only with
> > RGB8888 formats\n");
> > +			return -EINVAL;
> > +		}
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		if (INTEL_GEN(dev_priv) < 9) {
> > @@ -16061,11 +16246,21 @@ static int intel_framebuffer_init(struct
> > drm_device *dev,
> >  
> >  	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base,
> > mode_cmd);
> >  
> > -	stride_alignment = intel_fb_stride_alignment(&intel_fb-
> > >base, 0);
> > -	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> > -		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
> > aligned\n",
> > -			      mode_cmd->pitches[0],
> > stride_alignment);
> > -		return -EINVAL;
> > +	for (i = 0; i < intel_fb->base.format->num_planes; i++) {
> > +		u32 stride_alignment;
> > +
> > +		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> > +			DRM_DEBUG_KMS("bad plane %d handle\n", i);
> > +			return -EINVAL;
> > +		}
> > +
> > +		stride_alignment =
> > intel_fb_stride_alignment(&intel_fb->base, i);
> > +
> > +		if (mode_cmd->pitches[i] & (stride_alignment - 1)) {
> > +			DRM_DEBUG_KMS("plane %d pitch (%d) must be
> > at least %u byte aligned\n",
> > +				      i, mode_cmd->pitches[i],
> > stride_alignment);
> > +			return -EINVAL;
> > +		}
> >  	}
> >  
> >  	intel_fb->obj = obj;
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 249623d45be0..add359022c96 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3314,7 +3314,9 @@ skl_ddb_min_alloc(const struct drm_plane_state
> > *pstate,
> >  
> >  	/* For Non Y-tile return 8-blocks */
> >  	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> > -	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> > +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> > +	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> > +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
> >  		return 8;
> >  
> >  	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> > @@ -3590,7 +3592,9 @@ static int skl_compute_plane_wm(const struct
> > drm_i915_private *dev_priv,
> >  	}
> >  
> >  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> > -		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> > +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> > +		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
> >  	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
> >  
> >  	/* Display WA #1141: kbl. */
> > diff --git a/drivers/gpu/drm/i915/intel_sprite.c
> > b/drivers/gpu/drm/i915/intel_sprite.c
> > index 7031bc733d97..063a994815d0 100644
> > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
> >  	u32 surf_addr = plane_state->main.offset;
> >  	unsigned int rotation = plane_state->base.rotation;
> >  	u32 stride = skl_plane_stride(fb, 0, rotation);
> > +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >  	int crtc_x = plane_state->base.dst.x1;
> >  	int crtc_y = plane_state->base.dst.y1;
> >  	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> > @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
> >  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
> >  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) |
> > src_w);
> > +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> > +		   (plane_state->aux.offset - surf_addr) |
> > aux_stride);
> > +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> > +		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >  
> >  	/* program plane scaler */
> >  	if (plane_state->scaler_id >= 0) {

-- 
Ville Syrjälä
Intel OTC


* [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
  2017-01-04 19:14   ` Paulo Zanoni
  2017-01-05  4:25   ` Ben Widawsky
@ 2017-01-05 15:14   ` ville.syrjala
  2017-01-09 19:20     ` Jason Ekstrand
                       ` (2 more replies)
  2 siblings, 3 replies; 44+ messages in thread
From: ville.syrjala @ 2017-01-05 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, dri-devel, Vandana Kannan, Paulo Zanoni

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involves telling the display engine
the location of the color control surface (CCS) which describes
which parts of the main surface are compressed and which are not. The
location of CCS is provided by userspace as just another plane with its
own offset.

Add the required stuff to validate the user provided AUX plane metadata
and convert the user provided linear offset into something the hardware
can consume.

Due to hardware limitations we require that the main surface and
the AUX surface (CCS) be part of the same bo. The hardware also
makes life hard by not allowing you to provide separate x/y offsets
for the main and AUX surfaces (except with NV12), so finding suitable
offsets for both requires a bit of work, assuming we still want to keep
playing tricks with the offsets. I've just gone with a dumb "search
backward for suitable offsets" approach, which is far from optimal,
but it works.
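
A minimal sketch of the offset constraint described above, not the driver
code itself. It assumes the geometry stated in the patch comments (each CCS
byte covers a 16x8 pixel area of the main surface, and a CCS tile is 64x64
bytes); the helper name and the macros are hypothetical:

```c
#include <stdbool.h>

/* Assumed geometry, taken from the comments in this patch. */
#define EXAMPLE_CCS_PIXELS_PER_BYTE_X	16
#define EXAMPLE_CCS_PIXELS_PER_BYTE_Y	8
#define EXAMPLE_CCS_TILE_WIDTH_BYTES	64
#define EXAMPLE_CCS_TILE_HEIGHT_ROWS	64

/*
 * Hypothetical helper: main_x/main_y are pixel offsets into the main
 * surface, ccs_x/ccs_y are byte/row offsets into the CCS. Since the CCS
 * has no x/y offset register of its own, the offsets within one CCS
 * tile (1024x512 main surface pixels) have to agree.
 */
static bool example_ccs_offsets_match(int main_x, int main_y,
				      int ccs_x, int ccs_y)
{
	int tile_w = EXAMPLE_CCS_TILE_WIDTH_BYTES * EXAMPLE_CCS_PIXELS_PER_BYTE_X;
	int tile_h = EXAMPLE_CCS_TILE_HEIGHT_ROWS * EXAMPLE_CCS_PIXELS_PER_BYTE_Y;

	/* Scale the CCS offsets up to main surface pixels before comparing. */
	return (ccs_x * EXAMPLE_CCS_PIXELS_PER_BYTE_X) % tile_w == main_x % tile_w &&
	       (ccs_y * EXAMPLE_CCS_PIXELS_PER_BYTE_Y) % tile_h == main_y % tile_h;
}
```

With these assumptions a main surface offset of (16, 8) pairs with a CCS
offset of (1, 1), whereas (8, 0) cannot be paired with any CCS offset, which
is why the search has to step the main surface offset backward as well.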

Also not all planes will be capable of scanning out compressed surfaces,
and e.g. 90/270 degree rotation is not supported in combination with
decompression either.

This patch may contain work from at least the following people:
* Vandana Kannan <vandana.kannan@intel.com>
* Daniel Vetter <daniel@ffwll.ch>
* Ben Widawsky <ben@bwidawsk.net>

v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Vandana Kannan <vandana.kannan@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
 drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
 drivers/gpu/drm/i915/intel_sprite.c  |   5 +
 4 files changed, 274 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 00970aa77afa..6849ba93f1d9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6209,6 +6209,28 @@ enum {
 			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
 			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
 
+#define PLANE_AUX_DIST_1_A		0x701c0
+#define PLANE_AUX_DIST_2_A		0x702c0
+#define PLANE_AUX_DIST_1_B		0x711c0
+#define PLANE_AUX_DIST_2_B		0x712c0
+#define _PLANE_AUX_DIST_1(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
+#define _PLANE_AUX_DIST_2(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
+#define PLANE_AUX_DIST(pipe, plane)     \
+	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
+
+#define PLANE_AUX_OFFSET_1_A		0x701c4
+#define PLANE_AUX_OFFSET_2_A		0x702c4
+#define PLANE_AUX_OFFSET_1_B		0x711c4
+#define PLANE_AUX_OFFSET_2_B		0x712c4
+#define _PLANE_AUX_OFFSET_1(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
+#define _PLANE_AUX_OFFSET_2(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
+#define PLANE_AUX_OFFSET(pipe, plane)   \
+	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
+
 /* legacy palette */
 #define _LGC_PALETTE_A           0x4a000
 #define _LGC_PALETTE_B           0x4a800
@@ -6433,6 +6455,7 @@ enum {
 # define CHICKEN3_DGMG_DONE_FIX_DISABLE		(1 << 2)
 
 #define CHICKEN_PAR1_1		_MMIO(0x42080)
+#define  SKL_RC_HASH_OUTSIDE	(1 << 15)
 #define  DPA_MASK_VBLANK_SRD	(1 << 15)
 #define  FORCE_ARB_IDLE_PLANES	(1 << 14)
 #define  SKL_EDP_PSR_FIX_RDWRAP	(1 << 3)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 38de9df0ec60..2236abebd8bc 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		if (plane == 1)
+			return 64;
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		if (plane == 1)
+			return 64;
+		/* fall through */
 	case I915_FORMAT_MOD_Yf_TILED:
 		/*
 		 * Bspec seems to suggest that the Yf tile width would
@@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
 
 	/* AUX_DIST needs only 4K alignment */
-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
+	if (plane == 1)
 		return 4096;
 
 	switch (fb->modifier) {
@@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 		if (INTEL_GEN(dev_priv) >= 9)
 			return 256 * 1024;
 		return 0;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		return 1 * 1024 * 1024;
@@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
 	case I915_FORMAT_MOD_X_TILED:
 		return I915_TILING_X;
 	case I915_FORMAT_MOD_Y_TILED:
+	case I915_FORMAT_MOD_Y_TILED_CCS:
 		return I915_TILING_Y;
 	default:
 		return I915_TILING_NONE;
@@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 
 		intel_fb_offset_to_xy(&x, &y, fb, i);
 
+		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
+			int main_x, main_y;
+			int ccs_x, ccs_y;
+
+			/*
+			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
+			 * each CCS tile is 64x64 bytes.
+			 */
+			ccs_x = (x * 16) % (64 * 16);
+			ccs_y = (y * 8) % (64 * 8);
+			main_x = intel_fb->normal[0].x % (64 * 16);
+			main_y = intel_fb->normal[0].y % (64 * 8);
+
+			/*
+			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
+			 * x/y offsets must match between CCS and the main surface.
+			 */
+			if (main_x != ccs_x || main_y != ccs_y) {
+				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
+					      main_x, main_y,
+					      ccs_x, ccs_y,
+					      intel_fb->normal[0].x,
+					      intel_fb->normal[0].y,
+					      x, y);
+				return -EINVAL;
+			}
+		}
+
 		/*
 		 * The fence (if used) is aligned to the start of the object
 		 * so having the framebuffer wrap around across the edge of the
@@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 			break;
 		}
 		break;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		/* FIXME AUX plane? */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		switch (cpp) {
@@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 	return 2048;
 }
 
+static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
+					   int main_x, int main_y, u32 main_offset)
+{
+	const struct drm_framebuffer *fb = plane_state->base.fb;
+	int aux_x = plane_state->aux.x;
+	int aux_y = plane_state->aux.y;
+	u32 aux_offset = plane_state->aux.offset;
+	u32 alignment = intel_surf_alignment(fb, 1);
+
+	while (aux_offset >= main_offset && aux_y <= main_y) {
+		int x, y;
+
+		if (aux_x == main_x && aux_y == main_y)
+			break;
+
+		if (aux_offset == 0)
+			break;
+
+		x = aux_x / 16;
+		y = aux_y / 8;
+		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
+						      aux_offset, aux_offset - alignment);
+		aux_x = x * 16 + aux_x % 16;
+		aux_y = y * 8 + aux_y % 8;
+	}
+
+	if (aux_x != main_x || aux_y != main_y)
+		return false;
+
+	plane_state->aux.offset = aux_offset;
+	plane_state->aux.x = aux_x;
+	plane_state->aux.y = aux_y;
+
+	return true;
+}
+
 static int skl_check_main_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 
 		while ((x + w) * cpp > fb->pitches[0]) {
 			if (offset == 0) {
-				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
+				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
 				return -EINVAL;
 			}
 
@@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 		}
 	}
 
+	/*
+	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
+	 * they match with the main surface x/y offsets.
+	 */
+	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
+			if (offset == 0)
+				break;
+
+			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
+							  offset, offset - alignment);
+		}
+
+		if (x != plane_state->aux.x || y != plane_state->aux.y) {
+			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
+			return -EINVAL;
+		}
+	}
+
 	plane_state->main.offset = offset;
 	plane_state->main.x = x;
 	plane_state->main.y = y;
@@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
 	return 0;
 }
 
+static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
+{
+	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
+	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
+	int src_x = plane_state->base.src.x1 >> 16;
+	int src_y = plane_state->base.src.y1 >> 16;
+	int x = src_x / 16;
+	int y = src_y / 8;
+	u32 offset;
+
+	switch (plane->id) {
+	case PLANE_PRIMARY:
+	case PLANE_SPRITE0:
+		break;
+	default:
+		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
+		return -EINVAL;
+	}
+
+	if (crtc->pipe == PIPE_C) {
+		DRM_DEBUG_KMS("No RC support on pipe C\n");
+		return -EINVAL;
+	}
+
+	if (plane_state->base.rotation &&
+	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
+			      plane_state->base.rotation);
+		return -EINVAL;
+	}
+
+	intel_add_fb_offsets(&x, &y, plane_state, 1);
+	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
+
+	plane_state->aux.offset = offset;
+	plane_state->aux.x = x * 16 + src_x % 16;
+	plane_state->aux.y = y * 8 + src_y % 8;
+
+	return 0;
+}
+
 int skl_check_plane_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
 		ret = skl_check_nv12_aux_surface(plane_state);
 		if (ret)
 			return ret;
+	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		ret = skl_check_ccs_aux_surface(plane_state);
+		if (ret)
+			return ret;
 	} else {
 		plane_state->aux.offset = ~0xfff;
 		plane_state->aux.x = 0;
@@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
 		return PLANE_CTL_TILED_X;
 	case I915_FORMAT_MOD_Y_TILED:
 		return PLANE_CTL_TILED_Y;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
 	case I915_FORMAT_MOD_Yf_TILED:
 		return PLANE_CTL_TILED_YF;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
 	default:
 		MISSING_CASE(fb_modifier);
 	}
@@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	u32 plane_ctl;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	u32 surf_addr = plane_state->main.offset;
 	int scaler_id = plane_state->scaler_id;
 	int src_x = plane_state->main.x;
@@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
+		   (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
+		   (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	if (scaler_id >= 0) {
 		uint32_t ps_ctrl = 0;
@@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 		fb->modifier = I915_FORMAT_MOD_X_TILED;
 		break;
 	case PLANE_CTL_TILED_Y:
-		fb->modifier = I915_FORMAT_MOD_Y_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Y_TILED;
 		break;
 	case PLANE_CTL_TILED_YF:
-		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
 		break;
 	default:
 		MISSING_CASE(tiling);
@@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
 	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
 
 	ctl = I915_READ(PLANE_CTL(pipe, 0));
-	ctl &= ~PLANE_CTL_TILED_MASK;
+	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
 	switch (fb->modifier) {
 	case DRM_FORMAT_MOD_NONE:
 		break;
@@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
 	case I915_FORMAT_MOD_Y_TILED:
 		ctl |= PLANE_CTL_TILED_Y;
 		break;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
+		break;
 	case I915_FORMAT_MOD_Yf_TILED:
 		ctl |= PLANE_CTL_TILED_YF;
 		break;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
+		break;
 	default:
 		MISSING_CASE(fb->modifier);
 	}
@@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct drm_device *dev,
 				  struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_framebuffer *fb = &intel_fb->base;
 	unsigned int tiling = i915_gem_object_get_tiling(obj);
-	int ret;
-	u32 pitch_limit, stride_alignment;
+	int ret, i;
+	u32 pitch_limit;
 	struct drm_format_name_buf format_name;
 
 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
 
 	/* Passed in modifier sanity checking. */
 	switch (mode_cmd->modifier[0]) {
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		switch (mode_cmd->pixel_format) {
+		case DRM_FORMAT_XBGR8888:
+		case DRM_FORMAT_ABGR8888:
+		case DRM_FORMAT_XRGB8888:
+		case DRM_FORMAT_ARGB8888:
+			break;
+		default:
+			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
+			return -EINVAL;
+		}
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		if (INTEL_GEN(dev_priv) < 9) {
@@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	if (mode_cmd->offsets[0] != 0)
 		return -EINVAL;
 
-	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
+	drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
 
-	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
-			      mode_cmd->pitches[0], stride_alignment);
-		return -EINVAL;
+	for (i = 0; i < fb->format->num_planes; i++) {
+		u32 stride_alignment;
+
+		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
+			DRM_DEBUG_KMS("bad plane %d handle\n", i);
+			return -EINVAL;
+		}
+
+		stride_alignment = intel_fb_stride_alignment(fb, i);
+
+		/*
+		 * Display WA #0531: skl,bxt,kbl,glk
+		 *
+		 * Render decompression and plane width > 3840
+		 * combined with horizontal panning requires the
+		 * plane stride to be a multiple of 4. We'll just
+		 * require the entire fb to accommodate that to avoid
+		 * potential runtime errors at plane configuration time.
+		 */
+		if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
+		    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
+			stride_alignment *= 4;
+
+		if (fb->pitches[i] & (stride_alignment - 1)) {
+			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
+				      i, fb->pitches[i], stride_alignment);
+			return -EINVAL;
+		}
 	}
 
 	intel_fb->obj = obj;
 
-	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
+	ret = intel_fill_fb_info(dev_priv, fb);
 	if (ret)
 		return ret;
 
-	ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
+	ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
 	if (ret) {
 		DRM_ERROR("framebuffer init failed %d\n", ret);
 		return ret;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 249623d45be0..25782cd174c0 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
 
+	/*
+	 * Display WA#0390: skl,bxt,kbl,glk
+	 *
+	 * Must match Sampler, Pixel Back End, and Media
+	 * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
+	 *
+	 * Including bits outside the page in the hash would
+	 * require 2 (or 4?) MiB alignment of resources. Just
+	 * assume the default hashing mode which only uses bits
+	 * within the page.
+	 */
+	I915_WRITE(CHICKEN_PAR1_1,
+		   I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
+
 	I915_WRITE(GEN8_CONFIG0,
 		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
 
@@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 	/* For Non Y-tile return 8-blocks */
 	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
+	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
 		return 8;
 
 	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
+		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
 	/* Display WA #1141: kbl. */
@@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	res_lines = DIV_ROUND_UP(selected_result.val,
 				 plane_blocks_per_line.val);
 
+	/* Display WA #1125: skl,bxt,kbl,glk */
+	if (level == 0 &&
+	    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+	     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
+		res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+
+	/* Display WA #1126: skl,bxt,kbl,glk */
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
 			res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 7031bc733d97..063a994815d0 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
 	u32 surf_addr = plane_state->main.offset;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
 	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
@@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
 	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
 	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
+		   (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
+		   (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	/* program plane scaler */
 	if (plane_state->scaler_id >= 0) {
-- 
2.10.2


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: SKL+ render decompression support (rev2)
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (9 preceding siblings ...)
  2017-01-04 19:45 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ " Patchwork
@ 2017-01-05 15:54 ` Patchwork
  2017-01-06 23:41 ` [PATCH 0/9] drm/i915: SKL+ render decompression support Ben Widawsky
  2017-01-10 19:23 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ render decompression support (rev2) Patchwork
  12 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2017-01-05 15:54 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: SKL+ render decompression support (rev2)
URL   : https://patchwork.freedesktop.org/series/17507/
State : failure

== Summary ==

Series 17507v2 drm/i915: SKL+ render decompression support
https://patchwork.freedesktop.org/api/1.0/series/17507/revisions/2/mbox/

Test kms_force_connector_basic:
        Subgroup prune-stale-modes:
                dmesg-warn -> PASS       (fi-snb-2520m)
Test kms_setmode:
        Subgroup basic-clone-single-crtc:
                pass       -> INCOMPLETE (fi-skl-6700k)

fi-bdw-5557u     total:246  pass:231  dwarn:0   dfail:0   fail:1   skip:14 
fi-bsw-n3050     total:246  pass:206  dwarn:0   dfail:0   fail:1   skip:39 
fi-bxt-j4205     total:246  pass:223  dwarn:0   dfail:0   fail:1   skip:22 
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:246  pass:218  dwarn:0   dfail:0   fail:1   skip:27 
fi-byt-n2820     total:246  pass:214  dwarn:0   dfail:0   fail:1   skip:31 
fi-hsw-4770      total:246  pass:226  dwarn:0   dfail:0   fail:1   skip:19 
fi-hsw-4770r     total:246  pass:226  dwarn:0   dfail:0   fail:1   skip:19 
fi-ivb-3520m     total:246  pass:224  dwarn:0   dfail:0   fail:1   skip:21 
fi-ivb-3770      total:246  pass:224  dwarn:0   dfail:0   fail:1   skip:21 
fi-kbl-7500u     total:246  pass:224  dwarn:0   dfail:0   fail:1   skip:21 
fi-skl-6260u     total:246  pass:232  dwarn:0   dfail:0   fail:1   skip:13 
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:211  pass:187  dwarn:3   dfail:0   fail:1   skip:19 
fi-skl-6770hq    total:246  pass:232  dwarn:0   dfail:0   fail:1   skip:13 
fi-snb-2520m     total:246  pass:214  dwarn:0   dfail:0   fail:1   skip:31 
fi-snb-2600      total:246  pass:213  dwarn:0   dfail:0   fail:1   skip:32 

6a304f1f2e7446fe71bf7845c34dcdc73436c547 drm-tip: 2017y-01m-05d-11h-52m-09s UTC integration manifest
2989e4e drm/i915: Add render decompression support
701d909 drm/i915: Implement .get_format_info() hook for CCS
2909de8 drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
7b784ce drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
4130c11 drm/i915: Fix Yf tile width
0784dcc drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
cd53d3c drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
0d03743 drm/i915: Plumb drm_framebuffer into more places
6494b83 drm: Add mode_config .get_format_info() hook

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3438/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-01-04 18:42 ` [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS ville.syrjala
@ 2017-01-05 16:24   ` Tvrtko Ursulin
  2017-01-05 17:40     ` Ben Widawsky
  2017-02-26 22:41   ` Chad Versace
  1 sibling, 1 reply; 44+ messages in thread
From: Tvrtko Ursulin @ 2017-01-05 16:24 UTC (permalink / raw)
  To: ville.syrjala, intel-gfx; +Cc: Ben Widawsky, Vandana Kannan, dri-devel


On 04/01/2017 18:42, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involves telling the display engine
> the location of the color control surface (CCS) which describes which
> parts of the main surface are compressed and which are not. The location
> of CCS is provided by userspace as just another plane with its own offset.
>
> By providing our own format information for the CCS formats, we should
> be able to make framebuffer_check() do the right thing for the CCS
> surface as well.
>
> Note that we'll return the same format info for both Y and Yf tiled
> format as that's what happens with the non-CCS Y vs. Yf as well. If
> desired, we could potentially return a unique pointer for each
> pixel_format+tiling+ccs combination, in which case we would immediately be
> able to tell if any of that stuff changed by just comparing the
> pointers. But that does sound a bit wasteful space wise.
>
> v2: Drop the 'dev' argument from the hook
> v3: Include the description of the CCS surface layout
>
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
>  include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index c4662b2e9613..38de9df0ec60 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2478,6 +2478,41 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>  	}
>  }
>
> +static const struct drm_format_info ccs_formats[] = {
> +	{ .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
> +	{ .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
> +	{ .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
> +	{ .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
> +};
> +
> +static const struct drm_format_info *
> +lookup_format_info(const struct drm_format_info formats[],
> +		   int num_formats, u32 format)
> +{
> +	int i;
> +
> +	for (i = 0; i < num_formats; i++) {
> +		if (formats[i].format == format)
> +			return &formats[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static const struct drm_format_info *
> +intel_get_format_info(const struct drm_mode_fb_cmd2 *cmd)
> +{
> +	switch (cmd->modifier[0]) {
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		return lookup_format_info(ccs_formats,
> +					  ARRAY_SIZE(ccs_formats),
> +					  cmd->pixel_format);
> +	default:
> +		return NULL;
> +	}
> +}
> +
>  static int
>  intel_fill_fb_info(struct drm_i915_private *dev_priv,
>  		   struct drm_framebuffer *fb)
> @@ -16083,6 +16118,7 @@ static void intel_atomic_state_free(struct drm_atomic_state *state)
>
>  static const struct drm_mode_config_funcs intel_mode_funcs = {
>  	.fb_create = intel_user_framebuffer_create,
> +	.get_format_info = intel_get_format_info,
>  	.output_poll_changed = intel_fbdev_output_poll_changed,
>  	.atomic_check = intel_atomic_check,
>  	.atomic_commit = intel_atomic_commit,
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index 9e1bb7fabcde..4581e3d41e5c 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -230,6 +230,55 @@ extern "C" {
>  #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
>
>  /*
> + * Intel color control surface (CCS) for render compression
> + *
> + * The framebuffer format must be one of the 8:8:8:8 RGB formats,
> + * the main surface will be plane index 0 and will be Y/Yf-tiled,
> + * the CCS will be plane index 1.
> + *
> + * Each byte of CCS contains 4 pairs of bits, with each pair of bits
> + * matching an area of 8x4 pixels of the main surface. Which would seem
> + * to match 2 cachelines containing 4x4 pixels each. The pairs of bits within
> + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
> + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
> + * -----------
> + * | 01 | 23 |
> + *  ----------
> + * | 45 | 67 |
> + * -----------
> + *
> + * Actually only the lower bit of the pair seems to have any effect.
> + * No idea why. 0 in the lower bit would seem to mean not compressed,
> + * and 1 is compressed.  The interpretation of the main surface data
> + * when the block is marked compressed is unknown as of now.
> + *
> + * CCS tile is laid out in 8 byte horizontal strips each strip thus corresponds
> + * to a 128x8 pixel area of the main surface. So each 8x8 bytes of the CCS
> + * (1 cacheline) will match an area of 4x2 tiles on the main surface.
> + *
> + * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
> + * ------------------------
> + * |  0 |  64 | ... | 448 |
> + * |  1 |  65 |     | 449 |
> + * |  2 |  66 |     | 450 |
> + * |  . |   . |     |   . |
> + * |  . |   . |     |   . |
> + * |  . |   . |     |   . |
> + * | 63 | 127 |     | 511 |
> + * ------------------------
> + *
> + * This will match an area of 1024x512 pixels on the main surface.
> + *
> + * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
> + * and for the purposes of the CCS plane offset we assume cpp==1. As each
> + * byte matches a 16x8 area of the main surface, the dimensions of the CCS
> + * plane will thus appear to be width/16 x height/8. Both planes must be
> + * contained within the same gem object.
> + */
> +#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
> +#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)

Are we sure this is better than reserving some bits for tiling mode and
having CCS as a separate bit flag? IMHO the code is already a bit unsightly
with this scheme and it would take just one more orthogonal but 
simultaneously usable set of modifiers to drown us in permutations. We 
got plenty of bits available.
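
A rough sketch of what such a scheme could look like, purely for
illustration. The bit layout, the EXAMPLE_ names and the helpers are all
hypothetical and are not part of this series or of drm_fourcc.h; only the
"vendor ID in the top 8 bits" convention is borrowed from fourcc_mod_code():

```c
#include <stdbool.h>
#include <stdint.h>

/* Vendor ID in the top 8 bits, as fourcc_mod_code() does (0x01 assumed for Intel). */
#define EXAMPLE_MOD_VENDOR_INTEL	0x01ULL
#define EXAMPLE_MOD_CODE(val) \
	((EXAMPLE_MOD_VENDOR_INTEL << 56) | ((val) & 0x00ffffffffffffffULL))

/* Hypothetical layout: low 4 bits select the tiling, bit 8 flags CCS. */
#define EXAMPLE_TILING_MASK	0xfULL
#define EXAMPLE_TILING_X	1ULL
#define EXAMPLE_TILING_Y	2ULL
#define EXAMPLE_TILING_YF	3ULL
#define EXAMPLE_FLAG_CCS	(1ULL << 8)

static uint64_t example_tiling(uint64_t modifier)
{
	return modifier & EXAMPLE_TILING_MASK;
}

/* One test covers both the CCS and non-CCS variants of Y/Yf tiling. */
static bool example_is_y_tiled(uint64_t modifier)
{
	uint64_t tiling = example_tiling(modifier);

	return tiling == EXAMPLE_TILING_Y || tiling == EXAMPLE_TILING_YF;
}

static bool example_has_ccs(uint64_t modifier)
{
	return (modifier & EXAMPLE_FLAG_CCS) != 0;
}
```

With a layout like this, checks such as the y_tiled computation in
skl_compute_plane_wm() would not need an extra comparison per CCS variant.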

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-01-05 16:24   ` Tvrtko Ursulin
@ 2017-01-05 17:40     ` Ben Widawsky
  0 siblings, 0 replies; 44+ messages in thread
From: Ben Widawsky @ 2017-01-05 17:40 UTC (permalink / raw)
  To: Tvrtko Ursulin, Kristian Høgsberg
  Cc: dri-devel, intel-gfx, Vandana Kannan

On 17-01-05 16:24:56, Tvrtko Ursulin wrote:
>
>On 04/01/2017 18:42, ville.syrjala@linux.intel.com wrote:
>>From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>
>>SKL+ display engine can scan out certain kinds of compressed surfaces
>>produced by the render engine. This involved telling the display engine
>>the location of the color control surface (CCS) which describes which
>>parts of the main surface are compressed and which are not. The location
>>of CCS is provided by userspace as just another plane with its own offset.
>>
>>By providing our own format information for the CCS formats, we should
>>be able to make framebuffer_check() do the right thing for the CCS
>>surface as well.
>>
>>Note that we'll return the same format info for both Y and Yf tiled
>>format as that's what happens with the non-CCS Y vs. Yf as well. If
>>desired, we could potentially return a unique pointer for each
>>pixel_format+tiling+ccs combination, in which case we would immediately be
>>able to tell if any of that stuff changed by just comparing the
>>pointers. But that does sound a bit wasteful space wise.
>>
>>v2: Drop the 'dev' argument from the hook
>>v3: Include the description of the CCS surface layout
>>
>>Cc: Vandana Kannan <vandana.kannan@intel.com>
>>Cc: Daniel Vetter <daniel@ffwll.ch>
>>Cc: Ben Widawsky <ben@bwidawsk.net>
>>Cc: Jason Ekstrand <jason@jlekstrand.net>
>>Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>>Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>---
>> drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
>> include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
>> 2 files changed, 85 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>index c4662b2e9613..38de9df0ec60 100644
>>--- a/drivers/gpu/drm/i915/intel_display.c
>>+++ b/drivers/gpu/drm/i915/intel_display.c
>>@@ -2478,6 +2478,41 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>> 	}
>> }
>>
>>+static const struct drm_format_info ccs_formats[] = {
>>+	{ .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
>>+	{ .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
>>+	{ .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
>>+	{ .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 16, .vsub = 8, },
>>+};
>>+
>>+static const struct drm_format_info *
>>+lookup_format_info(const struct drm_format_info formats[],
>>+		   int num_formats, u32 format)
>>+{
>>+	int i;
>>+
>>+	for (i = 0; i < num_formats; i++) {
>>+		if (formats[i].format == format)
>>+			return &formats[i];
>>+	}
>>+
>>+	return NULL;
>>+}
>>+
>>+static const struct drm_format_info *
>>+intel_get_format_info(const struct drm_mode_fb_cmd2 *cmd)
>>+{
>>+	switch (cmd->modifier[0]) {
>>+	case I915_FORMAT_MOD_Y_TILED_CCS:
>>+	case I915_FORMAT_MOD_Yf_TILED_CCS:
>>+		return lookup_format_info(ccs_formats,
>>+					  ARRAY_SIZE(ccs_formats),
>>+					  cmd->pixel_format);
>>+	default:
>>+		return NULL;
>>+	}
>>+}
>>+
>> static int
>> intel_fill_fb_info(struct drm_i915_private *dev_priv,
>> 		   struct drm_framebuffer *fb)
>>@@ -16083,6 +16118,7 @@ static void intel_atomic_state_free(struct drm_atomic_state *state)
>>
>> static const struct drm_mode_config_funcs intel_mode_funcs = {
>> 	.fb_create = intel_user_framebuffer_create,
>>+	.get_format_info = intel_get_format_info,
>> 	.output_poll_changed = intel_fbdev_output_poll_changed,
>> 	.atomic_check = intel_atomic_check,
>> 	.atomic_commit = intel_atomic_commit,
>>diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
>>index 9e1bb7fabcde..4581e3d41e5c 100644
>>--- a/include/uapi/drm/drm_fourcc.h
>>+++ b/include/uapi/drm/drm_fourcc.h
>>@@ -230,6 +230,55 @@ extern "C" {
>> #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
>>
>> /*
>>+ * Intel color control surface (CCS) for render compression
>>+ *
>>+ * The framebuffer format must be one of the 8:8:8:8 RGB formats,
>>+ * the main surface will be plane index 0 and will be Y/Yf-tiled,
>>+ * the CCS will be plane index 1.
>>+ *
>>+ * Each byte of CCS contains 4 pairs of bits, with each pair of bits
>>+ * matching an area of 8x4 pixels of the main surface. Which would seem
>>+ * to match 2 cachelines containing 4x4 pixels each. The bit pairs within
>>+ * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
>>+ * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
>>+ * -----------
>>+ * | 01 | 23 |
>>+ *  ----------
>>+ * | 45 | 67 |
>>+ * -----------
>>+ *
>>+ * Actually only the lower bit of the pair seems to have any effect.
>>+ * No idea why. 0 in the lower bit would seem to mean not compressed,
>>+ * and 1 is compressed.  The interpretation of the main surface data
>>+ * when the block is marked compressed is unknown as of now.
>>+ *
>>+ * CCS tile is laid out in 8 byte horizontal strips; each strip thus corresponds
>>+ * to a 128x8 pixel area of the main surface. So each 8x8 bytes of the CCS
>>+ * (1 cacheline) will match an area of 4x2 tiles on the main surface.
>>+ *
>>+ * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
>>+ * ------------------------
>>+ * |  0 |  64 | ... | 448 |
>>+ * |  1 |  65 |     | 449 |
>>+ * |  2 |  66 |     | 450 |
>>+ * |  . |   . |     |   . |
>>+ * |  . |   . |     |   . |
>>+ * |  . |   . |     |   . |
>>+ * | 63 | 127 |     | 511 |
>>+ * ------------------------
>>+ *
>>+ * This will match an area of 1024x512 pixels on the main surface.
>>+ *
>>+ * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
>>+ * and for the purposes of the CCS plane offset we assume cpp==1. As each
>>+ * byte matches a 16x8 area of the main surface, the dimensions of the CCS
>>+ * plane will thus appear to be width/16 x height/8. Both planes must be
>>+ * contained within the same gem object.
>>+ */
>>+#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
>>+#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)
>
>Are we sure this is better than reserving some bits for tiling mode 
>and having CCS as separate bit flag? IMHO code is already a bit 
>unsightly with this scheme and it would take just one more orthogonal 
>but simultaneously usable set of modifiers to drown us in 
>permutations. We got plenty of bits available.
>
>Regards,
>
>Tvrtko

Adding Kristian...

Obviously it's a matter of opinion, but I certainly don't think it's unsightly
now. I agree that one more orthogonal modifier would make it bad, and two would
pretty much make it terrible, which is actually what I liked about having
per-plane modifiers. Anyway, it seems everyone has mostly agreed on this
already, so I'd propose we move forward and say that if this scheme doesn't
work, we bail for addfb3.

I'm not entirely convinced we need a Yf even now....

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/9] drm/i915: SKL+ render decompression support
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (10 preceding siblings ...)
  2017-01-05 15:54 ` ✗ Fi.CI.BAT: failure for drm/i915: SKL+ render decompression support (rev2) Patchwork
@ 2017-01-06 23:41 ` Ben Widawsky
  2017-01-10 19:23 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ render decompression support (rev2) Patchwork
  12 siblings, 0 replies; 44+ messages in thread
From: Ben Widawsky @ 2017-01-06 23:41 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel, Laurent Pinchart, Vandana Kannan

On 17-01-04 20:42:23, Ville Syrjälä wrote:
>From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
>This series enables the SKL+ display engine render decompression
>support. Some bits and pieces of the i915 code are based on work
>from various people, but I just put my name on it since it
>would be hard to figure out which parts came from where.
>
>Entire series available here:
>git://github.com/vsyrjala/linux.git fb_format_dedup_4_ccs
>
>Cc: Vandana Kannan <vandana.kannan@intel.com>
>Cc: Daniel Vetter <daniel@ffwll.ch>
>Cc: Ben Widawsky <ben@bwidawsk.net>
>Cc: Jason Ekstrand <jason@jlekstrand.net>
>Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>

In addition to review comments I've left, this series (patch 8 and 9 in
particular) is:
Tested-by: Ben Widawsky <ben@bwidawsk.net>

[snip]

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
@ 2017-01-09 19:20     ` Jason Ekstrand
  2017-01-10 17:04       ` Ville Syrjälä
  2017-02-07 15:37     ` Imre Deak
  2017-02-28 20:18     ` Jason Ekstrand
  2 siblings, 1 reply; 44+ messages in thread
From: Jason Ekstrand @ 2017-01-09 19:20 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx,
	Maling list - DRI developers, Vandana Kannan


[-- Attachment #1.1: Type: text/plain, Size: 27286 bytes --]

On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:

> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involved telling the display engine
> the location of the color control surface (CCS) which describes
> which parts of the main surface are compressed and which are not. The
> location of CCS is provided by userspace as just another plane with its
> own offset.
>
> Add the required stuff to validate the user provided AUX plane metadata
> and convert the user provided linear offset into something the hardware
> can consume.
>
> Due to hardware limitations we require that the main surface and
> the AUX surface (CCS) be part of the same bo. The hardware also
> makes life hard by not allowing you to provide separate x/y offsets
> for the main and AUX surfaces (except with NV12), so finding suitable
> offsets for both requires a bit of work. Assuming we still want to keep
> playing tricks with the offsets. I've just gone with a dumb "search
> backward for suitable offsets" approach, which is far from optimal,
> but it works.
>
> Also not all planes will be capable of scanning out compressed surfaces,
> and eg. 90/270 degree rotation is not supported in combination with
> decompression either.
>
> This patch may contain work from at least the following people:
> * Vandana Kannan <vandana.kannan@intel.com>
> * Daniel Vetter <daniel@ffwll.ch>
> * Ben Widawsky <ben@bwidawsk.net>
>
> v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>  4 files changed, 274 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 00970aa77afa..6849ba93f1d9 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6209,6 +6209,28 @@ enum {
>                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>
> +#define PLANE_AUX_DIST_1_A             0x701c0
> +#define PLANE_AUX_DIST_2_A             0x702c0
> +#define PLANE_AUX_DIST_1_B             0x711c0
> +#define PLANE_AUX_DIST_2_B             0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane)     \
> +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
> _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A           0x701c4
> +#define PLANE_AUX_OFFSET_2_A           0x702c4
> +#define PLANE_AUX_OFFSET_1_B           0x711c4
> +#define PLANE_AUX_OFFSET_2_B           0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)       \
> +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)       \
> +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
> _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A           0x4a000
>  #define _LGC_PALETTE_B           0x4a800
> @@ -6433,6 +6455,7 @@ enum {
>  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
>
>  #define CHICKEN_PAR1_1         _MMIO(0x42080)
> +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
>  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
>  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
>  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 38de9df0ec60..2236abebd8bc 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
> drm_framebuffer *fb, int plane)
>                         return 128;
>                 else
>                         return 512;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               if (plane == 1)
> +                       return 64;
> +               /* fall through */
>         case I915_FORMAT_MOD_Y_TILED:
>                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>                         return 128;
>                 else
>                         return 512;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               if (plane == 1)
> +                       return 64;
>

I still think a CCS tile is 128B wide. :-)
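
For what it's worth, the two views can be compared with simple arithmetic. A rough sketch, assuming a 4 KiB CCS tile (512 strips of 8 bytes, per the layout comment in patch 8/9) that covers 1024x512 pixels of the main surface; the helper name is made up, and the 128-byte case only shows what that width would imply under those assumptions:

```c
#include <assert.h>

/* Figures taken from the CCS layout comment: 4 KiB tile, 1024x512 px. */
enum { CCS_TILE_PX_W = 1024, CCS_TILE_PX_H = 512, CCS_TILE_BYTES = 4096 };

struct ccs_byte_area {
	int px_w;	/* main-surface pixels covered horizontally */
	int px_h;	/* main-surface pixels covered vertically */
};

/*
 * Main-surface area covered by one CCS byte, for a 4 KiB CCS tile
 * that is tile_width_bytes wide.
 */
static struct ccs_byte_area ccs_area_per_byte(int tile_width_bytes)
{
	struct ccs_byte_area area;
	int rows;

	assert(tile_width_bytes > 0 && CCS_TILE_BYTES % tile_width_bytes == 0);
	rows = CCS_TILE_BYTES / tile_width_bytes;

	area.px_w = CCS_TILE_PX_W / tile_width_bytes;
	area.px_h = CCS_TILE_PX_H / rows;
	return area;
}
```

A 64-byte-wide tile gives the 16x8 pixels per byte that the patch uses for hsub/vsub, while a 128-byte-wide tile with the same footprint would give 8x16 per byte.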


> +               /* fall through */
>         case I915_FORMAT_MOD_Yf_TILED:
>                 /*
>                  * Bspec seems to suggest that the Yf tile width would
> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> struct drm_framebuffer *fb,
>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>
>         /* AUX_DIST needs only 4K alignment */
> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> +       if (plane == 1)
>                 return 4096;
>
>         switch (fb->modifier) {
> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> struct drm_framebuffer *fb,
>                 if (INTEL_GEN(dev_priv) >= 9)
>                         return 256 * 1024;
>                 return 0;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 return 1 * 1024 * 1024;
> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
> fb_modifier)
>         case I915_FORMAT_MOD_X_TILED:
>                 return I915_TILING_X;
>         case I915_FORMAT_MOD_Y_TILED:
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>                 return I915_TILING_Y;
>         default:
>                 return I915_TILING_NONE;
> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> *dev_priv,
>
>                 intel_fb_offset_to_xy(&x, &y, fb, i);
>
> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i ==
> 1) {
> +                       int main_x, main_y;
> +                       int ccs_x, ccs_y;
> +
> +                       /*
> +                        * Each byte of CCS corresponds to a 16x8 area of
> the main surface, and
> +                        * each CCS tile is 64x64 bytes.
> +                        */
> +                       ccs_x = (x * 16) % (64 * 16);
> +                       ccs_y = (y * 8) % (64 * 8);
>

This makes me nervous.  Why are we multiplying CCS coordinates by something
before we do the modulus?  Why aren't coordinates in both surfaces in
pixels?  So long as you keep things in pixels and know that a CCS tile is
1024x512px and a color tile is 32x32 pixels, you can safely do tile
offsetting and it all makes sense.  Having different units looks like a
recipe for some very confusing bugs.  Am I just completely misunderstanding
what's going on here?
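
Something like the following is what keeping everything in pixels could look like. This is only a sketch: it assumes the plane 1 x/y values are in CCS bytes and rows (cpp == 1), which is what the *16 and *8 scaling in the hunk suggests, and all of the names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Figures from the patch: a CCS byte covers 16x8 px, a CCS tile is 64x64 bytes. */
enum {
	CCS_BYTE_PX_W = 16,
	CCS_BYTE_PX_H = 8,
	CCS_TILE_PX_W = 64 * CCS_BYTE_PX_W,	/* 1024 */
	CCS_TILE_PX_H = 64 * CCS_BYTE_PX_H,	/* 512 */
};

struct px_pos {
	int x;
	int y;
};

/* Convert a CCS plane position (bytes, rows) into main-surface pixels. */
static struct px_pos ccs_to_px(int ccs_x, int ccs_y)
{
	struct px_pos p = { ccs_x * CCS_BYTE_PX_W, ccs_y * CCS_BYTE_PX_H };

	return p;
}

/* Offset of a pixel position within the CCS tile footprint containing it. */
static struct px_pos intra_ccs_tile(struct px_pos p)
{
	struct px_pos r;

	assert(p.x >= 0 && p.y >= 0);
	r.x = p.x % CCS_TILE_PX_W;
	r.y = p.y % CCS_TILE_PX_H;
	return r;
}

/* True if main and CCS start at the same place within a CCS tile footprint. */
static bool ccs_offsets_match(struct px_pos main_px, int ccs_x, int ccs_y)
{
	struct px_pos a = intra_ccs_tile(main_px);
	struct px_pos b = intra_ccs_tile(ccs_to_px(ccs_x, ccs_y));

	return a.x == b.x && a.y == b.y;
}
```

The unit conversion happens exactly once, in ccs_to_px(), and every comparison after that is pixel against pixel.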


> +                       main_x = intel_fb->normal[0].x % (64 * 16);
> +                       main_y = intel_fb->normal[0].y % (64 * 8);
> +
> +                       /*
> +                        * CCS doesn't have its own x/y offset register,
> so the intra CCS tile
> +                        * x/y offsets must match between CCS and the main
> surface.
> +                        */
> +                       if (main_x != ccs_x || main_y != ccs_y) {
> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs
> %d,%d) full (main %d,%d ccs %d,%d)\n",
> +                                             main_x, main_y,
> +                                             ccs_x, ccs_y,
> +                                             intel_fb->normal[0].x,
> +                                             intel_fb->normal[0].y,
> +                                             x, y);
> +                               return -EINVAL;
> +                       }
> +               }
> +
>                 /*
>                  * The fence (if used) is aligned to the start of the
> object
>                  * so having the framebuffer wrap around across the edge
> of the
> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
> drm_framebuffer *fb, int plane,
>                         break;
>                 }
>                 break;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               /* FIXME AUX plane? */
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 switch (cpp) {
> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
> drm_framebuffer *fb, int plane,
>         return 2048;
>  }
>
> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
> *plane_state,
> +                                          int main_x, int main_y, u32
> main_offset)
> +{
> +       const struct drm_framebuffer *fb = plane_state->base.fb;
> +       int aux_x = plane_state->aux.x;
> +       int aux_y = plane_state->aux.y;
> +       u32 aux_offset = plane_state->aux.offset;
> +       u32 alignment = intel_surf_alignment(fb, 1);
> +
> +       while (aux_offset >= main_offset && aux_y <= main_y) {
> +               int x, y;
> +
> +               if (aux_x == main_x && aux_y == main_y)
> +                       break;
> +
> +               if (aux_offset == 0)
> +                       break;
> +
> +               x = aux_x / 16;
> +               y = aux_y / 8;
> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state,
> 1,
> +                                                     aux_offset,
> aux_offset - alignment);
> +               aux_x = x * 16 + aux_x % 16;
> +               aux_y = y * 8 + aux_y % 8;
> +       }
> +
> +       if (aux_x != main_x || aux_y != main_y)
> +               return false;
> +
> +       plane_state->aux.offset = aux_offset;
> +       plane_state->aux.x = aux_x;
> +       plane_state->aux.y = aux_y;
> +
> +       return true;
> +}
> +
>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  {
>         const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
> intel_plane_state *plane_state)
>
>                 while ((x + w) * cpp > fb->pitches[0]) {
>                         if (offset == 0) {
> -                               DRM_DEBUG_KMS("Unable to find suitable
> display surface offset\n");
> +                               DRM_DEBUG_KMS("Unable to find suitable
> display surface offset due to X-tiling\n");
>                                 return -EINVAL;
>                         }
>
> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
> intel_plane_state *plane_state)
>                 }
>         }
>
> +       /*
> +        * CCS AUX surface doesn't have its own x/y offsets, we must make
> sure
> +        * they match with the main surface x/y offsets.
> +        */
> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y,
> offset)) {
> +                       if (offset == 0)
> +                               break;
> +
> +                       offset = intel_adjust_tile_offset(&x, &y,
> plane_state, 0,
> +                                                         offset, offset -
> alignment);
> +               }
> +
> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
> +                       DRM_DEBUG_KMS("Unable to find suitable display
> surface offset due to CCS\n");
> +                       return -EINVAL;
> +               }
> +       }
> +
>         plane_state->main.offset = offset;
>         plane_state->main.x = x;
>         plane_state->main.y = y;
> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct
> intel_plane_state *plane_state)
>         return 0;
>  }
>
> +static int skl_check_ccs_aux_surface(struct intel_plane_state
> *plane_state)
> +{
> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> +       int src_x = plane_state->base.src.x1 >> 16;
> +       int src_y = plane_state->base.src.y1 >> 16;
> +       int x = src_x / 16;
> +       int y = src_y / 8;
> +       u32 offset;
> +
> +       switch (plane->id) {
> +       case PLANE_PRIMARY:
> +       case PLANE_SPRITE0:
> +               break;
> +       default:
> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> +               return -EINVAL;
> +       }
> +
> +       if (crtc->pipe == PIPE_C) {
> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
> +               return -EINVAL;
> +       }
> +
> +       if (plane_state->base.rotation &&
> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180))
> {
> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation
> %x\n",
> +                             plane_state->base.rotation);
> +               return -EINVAL;
> +       }
> +
> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> +
> +       plane_state->aux.offset = offset;
> +       plane_state->aux.x = x * 16 + src_x % 16;
> +       plane_state->aux.y = y * 8 + src_y % 8;
> +
> +       return 0;
> +}
> +
>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  {
>         const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct
> intel_plane_state *plane_state)
>                 ret = skl_check_nv12_aux_surface(plane_state);
>                 if (ret)
>                         return ret;
> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +               ret = skl_check_ccs_aux_surface(plane_state);
> +               if (ret)
> +                       return ret;
>         } else {
>                 plane_state->aux.offset = ~0xfff;
>                 plane_state->aux.x = 0;
> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>                 return PLANE_CTL_TILED_X;
>         case I915_FORMAT_MOD_Y_TILED:
>                 return PLANE_CTL_TILED_Y;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>         case I915_FORMAT_MOD_Yf_TILED:
>                 return PLANE_CTL_TILED_YF;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>         default:
>                 MISSING_CASE(fb_modifier);
>         }
> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct
> drm_plane *plane,
>         u32 plane_ctl;
>         unsigned int rotation = plane_state->base.rotation;
>         u32 stride = skl_plane_stride(fb, 0, rotation);
> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>         u32 surf_addr = plane_state->main.offset;
>         int scaler_id = plane_state->scaler_id;
>         int src_x = plane_state->main.x;
> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct
> drm_plane *plane,
>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>
>         if (scaler_id >= 0) {
>                 uint32_t ps_ctrl = 0;
> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct
> intel_crtc *crtc,
>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
>                 break;
>         case PLANE_CTL_TILED_Y:
> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> +               else
> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
>                 break;
>         case PLANE_CTL_TILED_YF:
> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> +               else
> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>                 break;
>         default:
>                 MISSING_CASE(tiling);
> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc
> *intel_crtc,
>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>
>         ctl = I915_READ(PLANE_CTL(pipe, 0));
> -       ctl &= ~PLANE_CTL_TILED_MASK;
> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>         switch (fb->modifier) {
>         case DRM_FORMAT_MOD_NONE:
>                 break;
> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc
> *intel_crtc,
>         case I915_FORMAT_MOD_Y_TILED:
>                 ctl |= PLANE_CTL_TILED_Y;
>                 break;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> +               break;
>         case I915_FORMAT_MOD_Yf_TILED:
>                 ctl |= PLANE_CTL_TILED_YF;
>                 break;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> +               break;
>         default:
>                 MISSING_CASE(fb->modifier);
>         }
> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>                                   struct drm_i915_gem_object *obj)
>  {
>         struct drm_i915_private *dev_priv = to_i915(dev);
> +       struct drm_framebuffer *fb = &intel_fb->base;
>         unsigned int tiling = i915_gem_object_get_tiling(obj);
> -       int ret;
> -       u32 pitch_limit, stride_alignment;
> +       int ret, i;
> +       u32 pitch_limit;
>         struct drm_format_name_buf format_name;
>
>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>
>         /* Passed in modifier sanity checking. */
>         switch (mode_cmd->modifier[0]) {
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               switch (mode_cmd->pixel_format) {
> +               case DRM_FORMAT_XBGR8888:
> +               case DRM_FORMAT_ABGR8888:
> +               case DRM_FORMAT_XRGB8888:
> +               case DRM_FORMAT_ARGB8888:
> +                       break;
> +               default:
> +                       DRM_DEBUG_KMS("RC supported only with RGB8888
> formats\n");
> +                       return -EINVAL;
> +               }
> +               /* fall through */
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 if (INTEL_GEN(dev_priv) < 9) {
> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>         if (mode_cmd->offsets[0] != 0)
>                 return -EINVAL;
>
> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>
> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
> aligned\n",
> -                             mode_cmd->pitches[0], stride_alignment);
> -               return -EINVAL;
> +       for (i = 0; i < fb->format->num_planes; i++) {
> +               u32 stride_alignment;
> +
> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
> +                       return -EINVAL;
> +               }
> +
> +               stride_alignment = intel_fb_stride_alignment(fb, i);
> +
> +               /*
> +                * Display WA #0531: skl,bxt,kbl,glk
> +                *
> +                * Render decompression and plane width > 3840
> +                * combined with horizontal panning requires the
> +                * plane stride to be a multiple of 4. We'll just
> +                * require the entire fb to accommodate that to avoid
> +                * potential runtime errors at plane configuration time.
> +                */
> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +                       stride_alignment *= 4;
> +
> +               if (fb->pitches[i] & (stride_alignment - 1)) {
> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at
> least %u byte aligned\n",
> +                                     i, fb->pitches[i], stride_alignment);
> +                       return -EINVAL;
> +               }
>         }
>
>         intel_fb->obj = obj;
>
> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> +       ret = intel_fill_fb_info(dev_priv, fb);
>         if (ret)
>                 return ret;
>
> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>         if (ret) {
>                 DRM_ERROR("framebuffer init failed %d\n", ret);
>                 return ret;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 249623d45be0..25782cd174c0 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
>         I915_WRITE(CHICKEN_PAR1_1,
>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>
> +       /*
> +        * Display WA#0390: skl,bxt,kbl,glk
> +        *
> +        * Must match Sampler, Pixel Back End, and Media
> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
> +        *
> +        * Including bits outside the page in the hash would
> +        * require 2 (or 4?) MiB alignment of resources. Just
> +        * assume the default hashing mode which only uses bits
> +        * within the page.
> +        */
> +       I915_WRITE(CHICKEN_PAR1_1,
> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
> +
>         I915_WRITE(GEN8_CONFIG0,
>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>
> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>
>         /* For Non Y-tile return 8-blocks */
>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>                 return 8;
>
>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>         }
>
>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>
>         /* Display WA #1141: kbl. */
> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>         res_lines = DIV_ROUND_UP(selected_result.val,
>                                  plane_blocks_per_line.val);
>
> +       /* Display WA #1125: skl,bxt,kbl,glk */
> +       if (level == 0 &&
> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> +
> +       /* Display WA #1126: skl,bxt,kbl,glk */
>         if (level >= 1 && level <= 7) {
>                 if (y_tiled) {
>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 7031bc733d97..063a994815d0 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>         u32 surf_addr = plane_state->main.offset;
>         unsigned int rotation = plane_state->base.rotation;
>         u32 stride = skl_plane_stride(fb, 0, rotation);
> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>         int crtc_x = plane_state->base.dst.x1;
>         int crtc_y = plane_state->base.dst.y1;
>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>
>         /* program plane scaler */
>         if (plane_state->scaler_id >= 0) {
> --
> 2.10.2
>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-09 19:20     ` Jason Ekstrand
@ 2017-01-10 17:04       ` Ville Syrjälä
  2017-01-11 21:49         ` Jason Ekstrand
  0 siblings, 1 reply; 44+ messages in thread
From: Ville Syrjälä @ 2017-01-10 17:04 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx,
	Maling list - DRI developers, Vandana Kannan

On Mon, Jan 09, 2017 at 11:20:57AM -0800, Jason Ekstrand wrote:
> On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:
> 
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > SKL+ display engine can scan out certain kinds of compressed surfaces
> > produced by the render engine. This involves telling the display engine
> > the location of the color control surface (CCS) which describes
> > which parts of the main surface are compressed and which are not. The
> > location of CCS is provided by userspace as just another plane with its
> > own offset.
> >
> > Add the required stuff to validate the user provided AUX plane metadata
> > and convert the user provided linear offset into something the hardware
> > can consume.
> >
> > Due to hardware limitations we require that the main surface and
> > the AUX surface (CCS) be part of the same bo. The hardware also
> > makes life hard by not allowing you to provide separate x/y offsets
> > for the main and AUX surfaces (except with NV12), so finding suitable
> > offsets for both requires a bit of work. Assuming we still want to keep
> > playing tricks with the offsets. I've just gone with a dumb "search
> > backward for suitable offsets" approach, which is far from optimal,
> > but it works.
> >
> > Also not all planes will be capable of scanning out compressed surfaces,
> > and eg. 90/270 degree rotation is not supported in combination with
> > decompression either.
> >
> > This patch may contain work from at least the following people:
> > * Vandana Kannan <vandana.kannan@intel.com>
> > * Daniel Vetter <daniel@ffwll.ch>
> > * Ben Widawsky <ben@bwidawsk.net>
> >
> > v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
> >
> > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Cc: Vandana Kannan <vandana.kannan@intel.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
> >  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++
> > ++---
> >  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
> >  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> >  4 files changed, 274 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_
> > reg.h
> > index 00970aa77afa..6849ba93f1d9 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -6209,6 +6209,28 @@ enum {
> >                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
> >                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
> >
> > +#define PLANE_AUX_DIST_1_A             0x701c0
> > +#define PLANE_AUX_DIST_2_A             0x702c0
> > +#define PLANE_AUX_DIST_1_B             0x711c0
> > +#define PLANE_AUX_DIST_2_B             0x712c0
> > +#define _PLANE_AUX_DIST_1(pipe) \
> > +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> > +#define _PLANE_AUX_DIST_2(pipe) \
> > +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> > +#define PLANE_AUX_DIST(pipe, plane)     \
> > +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
> > _PLANE_AUX_DIST_2(pipe))
> > +
> > +#define PLANE_AUX_OFFSET_1_A           0x701c4
> > +#define PLANE_AUX_OFFSET_2_A           0x702c4
> > +#define PLANE_AUX_OFFSET_1_B           0x711c4
> > +#define PLANE_AUX_OFFSET_2_B           0x712c4
> > +#define _PLANE_AUX_OFFSET_1(pipe)       \
> > +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> > +#define _PLANE_AUX_OFFSET_2(pipe)       \
> > +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> > +#define PLANE_AUX_OFFSET(pipe, plane)   \
> > +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
> > _PLANE_AUX_OFFSET_2(pipe))
> > +
> >  /* legacy palette */
> >  #define _LGC_PALETTE_A           0x4a000
> >  #define _LGC_PALETTE_B           0x4a800
> > @@ -6433,6 +6455,7 @@ enum {
> >  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
> >
> >  #define CHICKEN_PAR1_1         _MMIO(0x42080)
> > +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
> >  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
> >  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
> >  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
> > diff --git a/drivers/gpu/drm/i915/intel_display.c
> > b/drivers/gpu/drm/i915/intel_display.c
> > index 38de9df0ec60..2236abebd8bc 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
> > drm_framebuffer *fb, int plane)
> >                         return 128;
> >                 else
> >                         return 512;
> > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> > +               if (plane == 1)
> > +                       return 64;
> > +               /* fall through */
> >         case I915_FORMAT_MOD_Y_TILED:
> >                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
> >                         return 128;
> >                 else
> >                         return 512;
> > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +               if (plane == 1)
> > +                       return 64;
> >
> 
> I still think a CCS tile is 128B wide. :-)

The spec kinda suggests the same. But I still couldn't figure out where
that notion really came from, so I just went with the value that gave
me the expected result on my screen. That is, writing 64 bytes into the
tile is exactly what's required to fill a single row/column; writing
more would wrap.

> 
> 
> > +               /* fall through */
> >         case I915_FORMAT_MOD_Yf_TILED:
> >                 /*
> >                  * Bspec seems to suggest that the Yf tile width would
> > @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> > struct drm_framebuffer *fb,
> >         struct drm_i915_private *dev_priv = to_i915(fb->dev);
> >
> >         /* AUX_DIST needs only 4K alignment */
> > -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> > +       if (plane == 1)
> >                 return 4096;
> >
> >         switch (fb->modifier) {
> > @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> > struct drm_framebuffer *fb,
> >                 if (INTEL_GEN(dev_priv) >= 9)
> >                         return 256 * 1024;
> >                 return 0;
> > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >         case I915_FORMAT_MOD_Y_TILED:
> >         case I915_FORMAT_MOD_Yf_TILED:
> >                 return 1 * 1024 * 1024;
> > @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
> > fb_modifier)
> >         case I915_FORMAT_MOD_X_TILED:
> >                 return I915_TILING_X;
> >         case I915_FORMAT_MOD_Y_TILED:
> > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >                 return I915_TILING_Y;
> >         default:
> >                 return I915_TILING_NONE;
> > @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> > *dev_priv,
> >
> >                 intel_fb_offset_to_xy(&x, &y, fb, i);
> >
> > +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> > +                       int main_x, main_y;
> > +                       int ccs_x, ccs_y;
> > +
> > +                       /*
> > +                        * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> > +                        * each CCS tile is 64x64 bytes.
> > +                        */
> > +                       ccs_x = (x * 16) % (64 * 16);
> > +                       ccs_y = (y * 8) % (64 * 8);
> >
> 
> This makes me nervous.  Why are we multiplying CCS coordinates by something
> before we do the modulus?  Why aren't coordinates in both surfaces in
> pixels?

For converting the linear offset (which is in bytes) into x/y we just
consider each pixel to be 1 byte. Hence, to get the corresponding pixel
coordinates, we multiply the byte-based coordinates by 16x8. We can't
really deal with <1 byte pixels in most places.
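
As a rough illustration of that unit conversion, here is a standalone
sketch that treats each CCS byte as one "pixel" and then scales the byte
coordinates by 16x8 to get the main-surface pixel position within one CCS
tile. The struct and helper name are hypothetical and not part of the
patch; only the 16x8 and 64x64 figures come from the quoted comment.

```c
#include <assert.h>

/*
 * Figures taken from the quoted patch comment: one CCS byte covers a
 * 16x8 pixel area of the main surface, and a CCS tile is 64x64 bytes.
 */
#define CCS_BYTE_W_PX	16
#define CCS_BYTE_H_PX	8
#define CCS_TILE_W_B	64
#define CCS_TILE_H_B	64

/* Hypothetical container for a main-surface pixel position. */
struct px_pos {
	int x;
	int y;
};

/*
 * Hypothetical helper: (x, y) are CCS coordinates in bytes, as obtained
 * by treating the CCS as a surface with 1 byte per "pixel". The result
 * is the matching main-surface pixel position relative to the start of
 * the CCS tile, mirroring the ccs_x/ccs_y computation in the patch.
 */
static struct px_pos ccs_bytes_to_main_px(int x, int y)
{
	struct px_pos p;

	assert(x >= 0 && y >= 0);

	/* A full CCS tile spans 64 * 16 = 1024 px by 64 * 8 = 512 px. */
	p.x = (x * CCS_BYTE_W_PX) % (CCS_TILE_W_B * CCS_BYTE_W_PX);
	p.y = (y * CCS_BYTE_H_PX) % (CCS_TILE_H_B * CCS_BYTE_H_PX);
	return p;
}
```

For example, CCS byte coordinates (70, 3) map to main-surface pixel
(96, 24) within the tile, since 70 * 16 = 1120 wraps at 1024.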

> So long as you keep things in pixels and know that a CCS tile is
> 1024x512px and a color tile is 32x32 pixels, you can safely do tile
> offsetting and it all makes sense.  Having different units looks like a
> recipe for some very confusing bugs.  Am I just completely misunderstanding
> what's going on here?

Doing things in pixels would involve totally custom code for the CCS.
By thinking of CCS as having 1 byte pixels we can share the code already
used for everything else (apart from this one special check, which is
really only necessary because the HW ignores the AUX x/y offsets for CCS).

I suppose it would be possible to rewrite a bunch of other things to
allow <1 byte pixels but I couldn't be bothered to go there.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 44+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: SKL+ render decompression support (rev2)
  2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
                   ` (11 preceding siblings ...)
  2017-01-06 23:41 ` [PATCH 0/9] drm/i915: SKL+ render decompression support Ben Widawsky
@ 2017-01-10 19:23 ` Patchwork
  12 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2017-01-10 19:23 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: SKL+ render decompression support (rev2)
URL   : https://patchwork.freedesktop.org/series/17507/
State : success

== Summary ==

Series 17507v2 drm/i915: SKL+ render decompression support
https://patchwork.freedesktop.org/api/1.0/series/17507/revisions/2/mbox/


fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

44b1ad71b67ecfae6ba9b816c6f3a6ebe1fd182e drm-tip: 2017y-01m-10d-16h-32m-29s UTC integration manifest
e897d71 drm/i915: Add render decompression support
847f24e drm/i915: Implement .get_format_info() hook for CCS
c99e4d9 drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
6f9b893 drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
d6f2958 drm/i915: Fix Yf tile width
698b29a drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
0bb427e drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
05022a8 drm/i915: Plumb drm_framebuffer into more places
04db250 drm: Add mode_config .get_format_info() hook

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3470/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-10 17:04       ` Ville Syrjälä
@ 2017-01-11 21:49         ` Jason Ekstrand
  2017-01-11 22:29           ` Jason Ekstrand
  0 siblings, 1 reply; 44+ messages in thread
From: Jason Ekstrand @ 2017-01-11 21:49 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx,
	Maling list - DRI developers, Vandana Kannan


On Tue, Jan 10, 2017 at 9:04 AM, Ville Syrjälä <
ville.syrjala@linux.intel.com> wrote:

> On Mon, Jan 09, 2017 at 11:20:57AM -0800, Jason Ekstrand wrote:
> > On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:
> >
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > SKL+ display engine can scan out certain kinds of compressed surfaces
> > > produced by the render engine. This involves telling the display engine
> > > the location of the color control surface (CCS) which describes
> > > which parts of the main surface are compressed and which are not. The
> > > location of CCS is provided by userspace as just another plane with its
> > > own offset.
> > >
> > > Add the required stuff to validate the user provided AUX plane metadata
> > > and convert the user provided linear offset into something the hardware
> > > can consume.
> > >
> > > Due to hardware limitations we require that the main surface and
> > > the AUX surface (CCS) be part of the same bo. The hardware also
> > > makes life hard by not allowing you to provide separate x/y offsets
> > > for the main and AUX surfaces (except with NV12), so finding suitable
> > > offsets for both requires a bit of work. Assuming we still want to keep
> > > playing tricks with the offsets. I've just gone with a dumb "search
> > > backward for suitable offsets" approach, which is far from optimal,
> > > but it works.
> > >
> > > Also not all planes will be capable of scanning out compressed
> surfaces,
> > > and eg. 90/270 degree rotation is not supported in combination with
> > > decompression either.
> > >
> > > This patch may contain work from at least the following people:
> > > * Vandana Kannan <vandana.kannan@intel.com>
> > > * Daniel Vetter <daniel@ffwll.ch>
> > > * Ben Widawsky <ben@bwidawsk.net>
> > >
> > > v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
> > >
> > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > > Cc: Vandana Kannan <vandana.kannan@intel.com>
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
> > >  drivers/gpu/drm/i915/intel_display.c | 234
> ++++++++++++++++++++++++++++++
> > > ++---
> > >  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
> > >  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> > >  4 files changed, 274 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_
> > > reg.h
> > > index 00970aa77afa..6849ba93f1d9 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -6209,6 +6209,28 @@ enum {
> > >                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
> > >                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
> > >
> > > +#define PLANE_AUX_DIST_1_A             0x701c0
> > > +#define PLANE_AUX_DIST_2_A             0x702c0
> > > +#define PLANE_AUX_DIST_1_B             0x711c0
> > > +#define PLANE_AUX_DIST_2_B             0x712c0
> > > +#define _PLANE_AUX_DIST_1(pipe) \
> > > +                       _PIPE(pipe, PLANE_AUX_DIST_1_A,
> PLANE_AUX_DIST_1_B)
> > > +#define _PLANE_AUX_DIST_2(pipe) \
> > > +                       _PIPE(pipe, PLANE_AUX_DIST_2_A,
> PLANE_AUX_DIST_2_B)
> > > +#define PLANE_AUX_DIST(pipe, plane)     \
> > > +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
> > > _PLANE_AUX_DIST_2(pipe))
> > > +
> > > +#define PLANE_AUX_OFFSET_1_A           0x701c4
> > > +#define PLANE_AUX_OFFSET_2_A           0x702c4
> > > +#define PLANE_AUX_OFFSET_1_B           0x711c4
> > > +#define PLANE_AUX_OFFSET_2_B           0x712c4
> > > +#define _PLANE_AUX_OFFSET_1(pipe)       \
> > > +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> > > +#define _PLANE_AUX_OFFSET_2(pipe)       \
> > > +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> > > +#define PLANE_AUX_OFFSET(pipe, plane)   \
> > > +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
> > > _PLANE_AUX_OFFSET_2(pipe))
> > > +
> > >  /* legacy palette */
> > >  #define _LGC_PALETTE_A           0x4a000
> > >  #define _LGC_PALETTE_B           0x4a800
> > > @@ -6433,6 +6455,7 @@ enum {
> > >  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
> > >
> > >  #define CHICKEN_PAR1_1         _MMIO(0x42080)
> > > +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
> > >  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
> > >  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
> > >  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c
> > > b/drivers/gpu/drm/i915/intel_display.c
> > > index 38de9df0ec60..2236abebd8bc 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
> > > drm_framebuffer *fb, int plane)
> > >                         return 128;
> > >                 else
> > >                         return 512;
> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> > > +               if (plane == 1)
> > > +                       return 64;
> > > +               /* fall through */
> > >         case I915_FORMAT_MOD_Y_TILED:
> > >                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_
> priv))
> > >                         return 128;
> > >                 else
> > >                         return 512;
> > > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> > > +               if (plane == 1)
> > > +                       return 64;
> > >
> >
> > I still think a CCS tile is 128B wide. :-)
>
> The spec kinda suggests the same. But I still couldn't figure out where
> that notion really came from, so I just went with the value that gave
> me the expected result on my screen. That is writing 64 bytes into the
> tile is exactly what's required to fill a single row/column, writing
> more would wrap.
>

Yeah.  Ultimately it doesn't matter.  It's an arbitrary number userspace
multiplies by and the kernel divides by.  We just need to settle on
something sensible.  In userspace, we've more-or-less settled on 128 to
match the docs (and what the hardware does for W-tiling and some other edge
cases) but it's not a huge deal.
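
To make the "multiplies by / divides by" point concrete, here is a small
sketch. It assumes the kernel recovers the AUX stride in tiles by dividing
the byte pitch by its own idea of the tile width; the helper names are
hypothetical and are not i915 or Mesa functions.

```c
#include <assert.h>

/*
 * Hypothetical userspace side: build a byte pitch from a tile count and
 * the tile width (in bytes) that userspace assumes.
 */
static unsigned int pitch_from_tiles(unsigned int tiles,
				     unsigned int tile_width_bytes)
{
	return tiles * tile_width_bytes;
}

/*
 * Hypothetical kernel side: recover the stride in tiles from the byte
 * pitch using the tile width (in bytes) that the kernel assumes.
 */
static unsigned int tiles_from_pitch(unsigned int pitch,
				     unsigned int tile_width_bytes)
{
	assert(tile_width_bytes != 0);
	assert(pitch % tile_width_bytes == 0);
	return pitch / tile_width_bytes;
}
```

With 4 tiles and 64 on both sides the round trip gives 4 again. If
userspace multiplies by 128 while the kernel divides by 64, the same
surface comes out as 8 tiles, which is why the value itself matters less
than both sides agreeing on it.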


> >
> >
> > > +               /* fall through */
> > >         case I915_FORMAT_MOD_Yf_TILED:
> > >                 /*
> > >                  * Bspec seems to suggest that the Yf tile width would
> > > @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> > > struct drm_framebuffer *fb,
> > >         struct drm_i915_private *dev_priv = to_i915(fb->dev);
> > >
> > >         /* AUX_DIST needs only 4K alignment */
> > > -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> > > +       if (plane == 1)
> > >                 return 4096;
> > >
> > >         switch (fb->modifier) {
> > > @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> > > struct drm_framebuffer *fb,
> > >                 if (INTEL_GEN(dev_priv) >= 9)
> > >                         return 256 * 1024;
> > >                 return 0;
> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> > > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> > >         case I915_FORMAT_MOD_Y_TILED:
> > >         case I915_FORMAT_MOD_Yf_TILED:
> > >                 return 1 * 1024 * 1024;
> > > @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(
> uint64_t
> > > fb_modifier)
> > >         case I915_FORMAT_MOD_X_TILED:
> > >                 return I915_TILING_X;
> > >         case I915_FORMAT_MOD_Y_TILED:
> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
> > >                 return I915_TILING_Y;
> > >         default:
> > >                 return I915_TILING_NONE;
> > > @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> > > *dev_priv,
> > >
> > >                 intel_fb_offset_to_xy(&x, &y, fb, i);
> > >
> > > +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > > +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> > > +                       int main_x, main_y;
> > > +                       int ccs_x, ccs_y;
> > > +
> > > +                       /*
> > > +                        * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> > > +                        * each CCS tile is 64x64 bytes.
> > > +                        */
> > > +                       ccs_x = (x * 16) % (64 * 16);
> > > +                       ccs_y = (y * 8) % (64 * 8);
> > >
> >
> > This makes me nervous.  Why are we multiplying CCS coordinates by something
> > before we do the modulus?  Why aren't coordinates in both surfaces in
> > pixels?
>
> For converting the linear offset (which is in bytes) into x/y we just
> consider each pixel to be 1 byte. Hence to get the corresponding pixel
> coordinates we multiply the byte based coordinates by 16x8. We can't
> really deal with <1 byte pixels in most places.
>

So are the x/y offsets provided by userspace in bytes or pixels for regular
color surfaces?  I had kind-of assumed pixels, but I guess I could also see
bytes.


> > So long as you keep things in pixels and know that a CCS tile is
> > 1024x512px and a color tile is 32x32 pixels, you can safely do tile
> > offsetting and it all makes sense.  Having different units looks like a
> > recipe for some very confusing bugs.  Am I just completely misunderstanding
> > what's going on here?
>
> Doing things in pixels would involve totally custom code for the CCS.
> By thinking of CCS as having 1 byte pixels we can share the code already
> used for everything else (apart from this one special check which is
> really only necessary because the HW ignores the AUX x/y offsets for CCS).
>
> I suppose it would be possible to rewrite a bunch of other things to
> allow <1 byte pixels but I couldn't be bothered to go there.
>

I think I need a better mental model of what the X/Y offset API looks like
before I can reply to that properly.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-11 21:49         ` Jason Ekstrand
@ 2017-01-11 22:29           ` Jason Ekstrand
  0 siblings, 0 replies; 44+ messages in thread
From: Jason Ekstrand @ 2017-01-11 22:29 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: intel-gfx, Paulo Zanoni, Maling list - DRI developers, Ben Widawsky


On Wed, Jan 11, 2017 at 1:49 PM, Jason Ekstrand <jason@jlekstrand.net>
wrote:

> On Tue, Jan 10, 2017 at 9:04 AM, Ville Syrjälä <
> ville.syrjala@linux.intel.com> wrote:
>
>> On Mon, Jan 09, 2017 at 11:20:57AM -0800, Jason Ekstrand wrote:
>> > On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:
>> >
>> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> > >
>> > > SKL+ display engine can scan out certain kinds of compressed surfaces
>> > > produced by the render engine. This involves telling the display engine
>> > > the location of the color control surface (CCS) which describes
>> > > which parts of the main surface are compressed and which are not. The
>> > > location of CCS is provided by userspace as just another plane with
>> its
>> > > own offset.
>> > >
>> > > Add the required stuff to validate the user provided AUX plane
>> metadata
>> > > and convert the user provided linear offset into something the
>> hardware
>> > > can consume.
>> > >
>> > > Due to hardware limitations we require that the main surface and
>> > > the AUX surface (CCS) be part of the same bo. The hardware also
>> > > makes life hard by not allowing you to provide separate x/y offsets
>> > > for the main and AUX surfaces (except with NV12), so finding suitable
>> > > offsets for both requires a bit of work. Assuming we still want to keep
>> > > playing tricks with the offsets. I've just gone with a dumb "search
>> > > backward for suitable offsets" approach, which is far from optimal,
>> > > but it works.
>> > >
>> > > Also not all planes will be capable of scanning out compressed
>> surfaces,
>> > > and eg. 90/270 degree rotation is not supported in combination with
>> > > decompression either.
>> > >
>> > > This patch may contain work from at least the following people:
>> > > * Vandana Kannan <vandana.kannan@intel.com>
>> > > * Daniel Vetter <daniel@ffwll.ch>
>> > > * Ben Widawsky <ben@bwidawsk.net>
>> > >
>> > > v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
>> > >
>> > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> > > Cc: Vandana Kannan <vandana.kannan@intel.com>
>> > > Cc: Daniel Vetter <daniel@ffwll.ch>
>> > > Cc: Ben Widawsky <ben@bwidawsk.net>
>> > > Cc: Jason Ekstrand <jason@jlekstrand.net>
>> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> > > ---
>> > >  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>> > >  drivers/gpu/drm/i915/intel_display.c | 234
>> ++++++++++++++++++++++++++++++
>> > > ++---
>> > >  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>> > >  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>> > >  4 files changed, 274 insertions(+), 17 deletions(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_
>> > > reg.h
>> > > index 00970aa77afa..6849ba93f1d9 100644
>> > > --- a/drivers/gpu/drm/i915/i915_reg.h
>> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
>> > > @@ -6209,6 +6209,28 @@ enum {
>> > >                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>> > >                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>> > >
>> > > +#define PLANE_AUX_DIST_1_A             0x701c0
>> > > +#define PLANE_AUX_DIST_2_A             0x702c0
>> > > +#define PLANE_AUX_DIST_1_B             0x711c0
>> > > +#define PLANE_AUX_DIST_2_B             0x712c0
>> > > +#define _PLANE_AUX_DIST_1(pipe) \
>> > > +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
>> > > +#define _PLANE_AUX_DIST_2(pipe) \
>> > > +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
>> > > +#define PLANE_AUX_DIST(pipe, plane)     \
>> > > +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
>> > > +
>> > > +#define PLANE_AUX_OFFSET_1_A           0x701c4
>> > > +#define PLANE_AUX_OFFSET_2_A           0x702c4
>> > > +#define PLANE_AUX_OFFSET_1_B           0x711c4
>> > > +#define PLANE_AUX_OFFSET_2_B           0x712c4
>> > > +#define _PLANE_AUX_OFFSET_1(pipe)       \
>> > > +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
>> > > +#define _PLANE_AUX_OFFSET_2(pipe)       \
>> > > +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
>> > > +#define PLANE_AUX_OFFSET(pipe, plane)   \
>> > > +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
>> > > +
>> > >  /* legacy palette */
>> > >  #define _LGC_PALETTE_A           0x4a000
>> > >  #define _LGC_PALETTE_B           0x4a800
>> > > @@ -6433,6 +6455,7 @@ enum {
>> > >  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
>> > >
>> > >  #define CHICKEN_PAR1_1         _MMIO(0x42080)
>> > > +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
>> > >  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
>> > >  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
>> > >  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
>> > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> > > index 38de9df0ec60..2236abebd8bc 100644
>> > > --- a/drivers/gpu/drm/i915/intel_display.c
>> > > +++ b/drivers/gpu/drm/i915/intel_display.c
>> > > @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
>> > >                         return 128;
>> > >                 else
>> > >                         return 512;
>> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> > > +               if (plane == 1)
>> > > +                       return 64;
>> > > +               /* fall through */
>> > >         case I915_FORMAT_MOD_Y_TILED:
>> > >                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>> > >                         return 128;
>> > >                 else
>> > >                         return 512;
>> > > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> > > +               if (plane == 1)
>> > > +                       return 64;
>> > >
>> >
>> > I still think a CCS tile is 128B wide. :-)
>>
>> The spec kinda suggests the same. But I still couldn't figure out where
>> that notion really came from, so I just went with the value that gave
>> me the expected result on my screen. That is writing 64 bytes into the
>> tile is exactly what's required to fill a single row/column, writing
>> more would wrap.
>>
>
> Yeah.  Ultimately it doesn't matter.  It's an arbitrary number userspace
> multiplies by and the kernel divides by.  We just need to settle on
> something sensible.  In userspace, we've more-or-less settled on 128 to
> match the docs (and what the hardware does for W-tiling and some other edge
> cases) but it's not a huge deal.
>
>
>> >
>> >
>> > > +               /* fall through */
>> > >         case I915_FORMAT_MOD_Yf_TILED:
>> > >                 /*
>> > >                  * Bspec seems to suggest that the Yf tile width would
>> > > @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>> > >         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>> > >
>> > >         /* AUX_DIST needs only 4K alignment */
>> > > -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>> > > +       if (plane == 1)
>> > >                 return 4096;
>> > >
>> > >         switch (fb->modifier) {
>> > > @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>> > >                 if (INTEL_GEN(dev_priv) >= 9)
>> > >                         return 256 * 1024;
>> > >                 return 0;
>> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> > > +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> > >         case I915_FORMAT_MOD_Y_TILED:
>> > >         case I915_FORMAT_MOD_Yf_TILED:
>> > >                 return 1 * 1024 * 1024;
>> > > @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>> > >         case I915_FORMAT_MOD_X_TILED:
>> > >                 return I915_TILING_X;
>> > >         case I915_FORMAT_MOD_Y_TILED:
>> > > +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> > >                 return I915_TILING_Y;
>> > >         default:
>> > >                 return I915_TILING_NONE;
>> > > @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>> > >
>> > >                 intel_fb_offset_to_xy(&x, &y, fb, i);
>> > >
>> > > +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> > > +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
>> > > +                       int main_x, main_y;
>> > > +                       int ccs_x, ccs_y;
>> > > +
>> > > +                       /*
>> > > +                        * Each byte of CCS corresponds to a 16x8 area of the main surface, and
>> > > +                        * each CCS tile is 64x64 bytes.
>> > > +                        */
>> > > +                       ccs_x = (x * 16) % (64 * 16);
>> > > +                       ccs_y = (y * 8) % (64 * 8);
>> > >
>> >
>> > This makes me nervous.  Why are we multiplying CCS coordinates by something
>> > before we do the modulus?  Why aren't coordinates in both surfaces in
>> > pixels?
>>
>> For converting the linear offset (which is in bytes) into x/y we just
>> consider each pixel to be 1 byte. Hence to get the corresponding pixel
>> coordinates we multiply the byte based coordinates by 16x8. We can't
>> really deal with <1 byte pixels in most places.
>>
>
> So are the x/y offsets provided by userspace in bytes or pixels for
> regular color surfaces?  I had kind-of assumed pixels, but I guess I could
> also see bytes.
>
>
>> > So long as you keep things in pixels and know that a CCS tile is
>> > 1024x512px and a color tile is 32x32 pixels, you can safely do tile
>> > offsetting and it all makes sense.  Having different units looks like a
>> > recipe for some very confusing bugs.  Am I just completely misunderstanding
>> > what's going on here?
>>
>> Doing things in pixels would involve totally custom code for the CCS.
>> By thinking of CCS as having 1 byte pixels we can share the code already
>> used for everything else (apart from this one special check which is
>> really only necessary because the HW ignores the AUX x/y offsets for CCS).
>>
>> I suppose it would be possible to rewrite a bunch of other things to
>> allow <1 byte pixels but I couldn't be bothered to go there.
>>
>
> I think I need a better mental model of what the X/Y offset API looks like
> before I can reply to that properly.
>

I had a very useful chat with Kristian on IRC today and he explained that
the offset in drm_mode_fb_cmd2 is just to get you to the upper-left pixel
of the image and the actual X/Y offset used for scanout is provided in
pixels when you do SetCrtc.  This seems completely sane and, at that point,
I really don't care how you do things internally so long as the results are
correct.  Consider my comments here dropped.  I didn't realize I was
arguing over an implementation detail.
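
For anyone following along, the unit conversion discussed above can be
sketched roughly as follows. This is only an illustration built from the
comment in the patch (one CCS byte covers a 16x8 pixel area, one CCS tile
is 64x64 bytes); struct px_xy and ccs_to_main_px() are made-up names and
not part of the series.

```c
#include <assert.h>

/* Assumed geometry, taken from the comment in the patch:
 * one CCS byte covers a 16x8 pixel area of the main surface,
 * and one CCS tile is 64x64 bytes. */
#define CCS_BYTE_W_PX	16
#define CCS_BYTE_H_PX	8
#define CCS_TILE_W	64
#define CCS_TILE_H	64

struct px_xy { int x, y; };	/* hypothetical helper type */

/* Convert a CCS x/y position expressed in bytes (1 byte == 1 "pixel")
 * into the main surface pixel position it describes, wrapped to the
 * 1024x512 pixel area covered by a single CCS tile. */
static struct px_xy ccs_to_main_px(int ccs_x, int ccs_y)
{
	struct px_xy p;

	assert(ccs_x >= 0 && ccs_y >= 0);
	p.x = (ccs_x * CCS_BYTE_W_PX) % (CCS_TILE_W * CCS_BYTE_W_PX);
	p.y = (ccs_y * CCS_BYTE_H_PX) % (CCS_TILE_H * CCS_BYTE_H_PX);
	return p;
}
```

So a CCS position of (65, 3) bytes would correspond to main surface
pixel (16, 24) within the area covered by one CCS tile, since 65 * 16
wraps at 1024.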

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Intel-gfx] [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places
  2017-01-04 18:42 ` [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places ville.syrjala
@ 2017-02-02 13:30   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 13:30 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:25PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Now that framebuffers can be used even before calling
> drm_framebuffer_init() we can start to plumb them into more places,
> instead of passing individual pieces for fb metadata.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reviewed-by: Imre Deak <imre.deak@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_display.c | 127 +++++++++++++++--------------------
>  drivers/gpu/drm/i915/intel_drv.h     |  11 +--
>  drivers/gpu/drm/i915/intel_fbdev.c   |   4 +-
>  3 files changed, 57 insertions(+), 85 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index e2150a64860c..f0cb80aba89a 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2050,10 +2050,13 @@ static unsigned int intel_tile_size(const struct drm_i915_private *dev_priv)
>  	return IS_GEN2(dev_priv) ? 2048 : 4096;
>  }
>  
> -static unsigned int intel_tile_width_bytes(const struct drm_i915_private *dev_priv,
> -					   uint64_t fb_modifier, unsigned int cpp)
> +static unsigned int
> +intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
>  {
> -	switch (fb_modifier) {
> +	struct drm_i915_private *dev_priv = to_i915(fb->dev);
> +	unsigned int cpp = fb->format->cpp[plane];
> +
> +	switch (fb->modifier) {
>  	case DRM_FORMAT_MOD_NONE:
>  		return cpp;
>  	case I915_FORMAT_MOD_X_TILED:
> @@ -2082,41 +2085,38 @@ static unsigned int intel_tile_width_bytes(const struct drm_i915_private *dev_pr
>  		}
>  		break;
>  	default:
> -		MISSING_CASE(fb_modifier);
> +		MISSING_CASE(fb->modifier);
>  		return cpp;
>  	}
>  }
>  
> -unsigned int intel_tile_height(const struct drm_i915_private *dev_priv,
> -			       uint64_t fb_modifier, unsigned int cpp)
> +static unsigned int
> +intel_tile_height(const struct drm_framebuffer *fb, int plane)
>  {
> -	if (fb_modifier == DRM_FORMAT_MOD_NONE)
> +	if (fb->modifier == DRM_FORMAT_MOD_NONE)
>  		return 1;
>  	else
> -		return intel_tile_size(dev_priv) /
> -			intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
> +		return intel_tile_size(to_i915(fb->dev)) /
> +			intel_tile_width_bytes(fb, plane);
>  }
>  
>  /* Return the tile dimensions in pixel units */
> -static void intel_tile_dims(const struct drm_i915_private *dev_priv,
> +static void intel_tile_dims(const struct drm_framebuffer *fb, int plane,
>  			    unsigned int *tile_width,
> -			    unsigned int *tile_height,
> -			    uint64_t fb_modifier,
> -			    unsigned int cpp)
> +			    unsigned int *tile_height)
>  {
> -	unsigned int tile_width_bytes =
> -		intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
> +	unsigned int tile_width_bytes = intel_tile_width_bytes(fb, plane);
> +	unsigned int cpp = fb->format->cpp[plane];
>  
>  	*tile_width = tile_width_bytes / cpp;
> -	*tile_height = intel_tile_size(dev_priv) / tile_width_bytes;
> +	*tile_height = intel_tile_size(to_i915(fb->dev)) / tile_width_bytes;
>  }
>  
>  unsigned int
> -intel_fb_align_height(struct drm_device *dev, unsigned int height,
> -		      uint32_t pixel_format, uint64_t fb_modifier)
> +intel_fb_align_height(const struct drm_framebuffer *fb,
> +		      int plane, unsigned int height)
>  {
> -	unsigned int cpp = drm_format_plane_cpp(pixel_format, 0);
> -	unsigned int tile_height = intel_tile_height(to_i915(dev), fb_modifier, cpp);
> +	unsigned int tile_height = intel_tile_height(fb, plane);
>  
>  	return ALIGN(height, tile_height);
>  }
> @@ -2158,21 +2158,23 @@ static unsigned int intel_linear_alignment(const struct drm_i915_private *dev_pr
>  		return 0;
>  }
>  
> -static unsigned int intel_surf_alignment(const struct drm_i915_private *dev_priv,
> -					 uint64_t fb_modifier)
> +static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> +					 int plane)
>  {
> -	switch (fb_modifier) {
> +	struct drm_i915_private *dev_priv = to_i915(fb->dev);
> +
> +	switch (fb->modifier) {
>  	case DRM_FORMAT_MOD_NONE:
>  		return intel_linear_alignment(dev_priv);
>  	case I915_FORMAT_MOD_X_TILED:
> -		if (INTEL_INFO(dev_priv)->gen >= 9)
> +		if (INTEL_GEN(dev_priv) >= 9)
>  			return 256 * 1024;
>  		return 0;
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		return 1 * 1024 * 1024;
>  	default:
> -		MISSING_CASE(fb_modifier);
> +		MISSING_CASE(fb->modifier);
>  		return 0;
>  	}
>  }
> @@ -2189,7 +2191,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
>  
>  	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>  
> -	alignment = intel_surf_alignment(dev_priv, fb->modifier);
> +	alignment = intel_surf_alignment(fb, 0);
>  
>  	intel_fill_fb_ggtt_view(&view, fb, rotation);
>  
> @@ -2355,8 +2357,7 @@ static u32 intel_adjust_tile_offset(int *x, int *y,
>  		unsigned int pitch_tiles;
>  
>  		tile_size = intel_tile_size(dev_priv);
> -		intel_tile_dims(dev_priv, &tile_width, &tile_height,
> -				fb->modifier, cpp);
> +		intel_tile_dims(fb, plane, &tile_width, &tile_height);
>  
>  		if (drm_rotation_90_or_270(rotation)) {
>  			pitch_tiles = pitch / tile_height;
> @@ -2411,8 +2412,7 @@ static u32 _intel_compute_tile_offset(const struct drm_i915_private *dev_priv,
>  		unsigned int tile_rows, tiles, pitch_tiles;
>  
>  		tile_size = intel_tile_size(dev_priv);
> -		intel_tile_dims(dev_priv, &tile_width, &tile_height,
> -				fb_modifier, cpp);
> +		intel_tile_dims(fb, plane, &tile_width, &tile_height);
>  
>  		if (drm_rotation_90_or_270(rotation)) {
>  			pitch_tiles = pitch / tile_height;
> @@ -2458,7 +2458,7 @@ u32 intel_compute_tile_offset(int *x, int *y,
>  	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>  		alignment = 4096;
>  	else
> -		alignment = intel_surf_alignment(dev_priv, fb->modifier);
> +		alignment = intel_surf_alignment(fb, plane);
>  
>  	return _intel_compute_tile_offset(dev_priv, x, y, fb, plane, pitch,
>  					  rotation, alignment);
> @@ -2544,8 +2544,7 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>  			unsigned int pitch_tiles;
>  			struct drm_rect r;
>  
> -			intel_tile_dims(dev_priv, &tile_width, &tile_height,
> -					fb->modifier, cpp);
> +			intel_tile_dims(fb, i, &tile_width, &tile_height);
>  
>  			rot_info->plane[i].offset = offset;
>  			rot_info->plane[i].stride = DIV_ROUND_UP(fb->pitches[i], tile_width * cpp);
> @@ -2873,7 +2872,6 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>  
>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  {
> -	const struct drm_i915_private *dev_priv = to_i915(plane_state->base.plane->dev);
>  	const struct drm_framebuffer *fb = plane_state->base.fb;
>  	unsigned int rotation = plane_state->base.rotation;
>  	int x = plane_state->base.src.x1 >> 16;
> @@ -2892,8 +2890,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  
>  	intel_add_fb_offsets(&x, &y, plane_state, 0);
>  	offset = intel_compute_tile_offset(&x, &y, plane_state, 0);
> -
> -	alignment = intel_surf_alignment(dev_priv, fb->modifier);
> +	alignment = intel_surf_alignment(fb, 0);
>  
>  	/*
>  	 * AUX surface offset is specified as the distance from the
> @@ -3210,16 +3207,13 @@ static void ironlake_update_primary_plane(struct drm_plane *primary,
>  	POSTING_READ(reg);
>  }
>  
> -u32 intel_fb_stride_alignment(const struct drm_i915_private *dev_priv,
> -			      uint64_t fb_modifier, uint32_t pixel_format)
> +static u32
> +intel_fb_stride_alignment(const struct drm_framebuffer *fb, int plane)
>  {
> -	if (fb_modifier == DRM_FORMAT_MOD_NONE) {
> +	if (fb->modifier == DRM_FORMAT_MOD_NONE)
>  		return 64;
> -	} else {
> -		int cpp = drm_format_plane_cpp(pixel_format, 0);
> -
> -		return intel_tile_width_bytes(dev_priv, fb_modifier, cpp);
> -	}
> +	else
> +		return intel_tile_width_bytes(fb, plane);
>  }
>  
>  u32 intel_fb_gtt_offset(struct drm_framebuffer *fb,
> @@ -3269,21 +3263,16 @@ static void skl_detach_scalers(struct intel_crtc *intel_crtc)
>  u32 skl_plane_stride(const struct drm_framebuffer *fb, int plane,
>  		     unsigned int rotation)
>  {
> -	const struct drm_i915_private *dev_priv = to_i915(fb->dev);
>  	u32 stride = intel_fb_pitch(fb, plane, rotation);
>  
>  	/*
>  	 * The stride is either expressed as a multiple of 64 bytes chunks for
>  	 * linear buffers or in number of tiles for tiled buffers.
>  	 */
> -	if (drm_rotation_90_or_270(rotation)) {
> -		int cpp = fb->format->cpp[plane];
> -
> -		stride /= intel_tile_height(dev_priv, fb->modifier, cpp);
> -	} else {
> -		stride /= intel_fb_stride_alignment(dev_priv, fb->modifier,
> -						    fb->format->format);
> -	}
> +	if (drm_rotation_90_or_270(rotation))
> +		stride /= intel_tile_height(fb, plane);
> +	else
> +		stride /= intel_fb_stride_alignment(fb, plane);
>  
>  	return stride;
>  }
> @@ -8773,9 +8762,7 @@ i9xx_get_initial_plane_config(struct intel_crtc *crtc,
>  	val = I915_READ(DSPSTRIDE(pipe));
>  	fb->pitches[0] = val & 0xffffffc0;
>  
> -	aligned_height = intel_fb_align_height(dev, fb->height,
> -					       fb->format->format,
> -					       fb->modifier);
> +	aligned_height = intel_fb_align_height(fb, 0, fb->height);
>  
>  	plane_config->size = fb->pitches[0] * aligned_height;
>  
> @@ -9810,13 +9797,10 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
>  	fb->width = ((val >> 0) & 0x1fff) + 1;
>  
>  	val = I915_READ(PLANE_STRIDE(pipe, 0));
> -	stride_mult = intel_fb_stride_alignment(dev_priv, fb->modifier,
> -						fb->format->format);
> +	stride_mult = intel_fb_stride_alignment(fb, 0);
>  	fb->pitches[0] = (val & 0x3ff) * stride_mult;
>  
> -	aligned_height = intel_fb_align_height(dev, fb->height,
> -					       fb->format->format,
> -					       fb->modifier);
> +	aligned_height = intel_fb_align_height(fb, 0, fb->height);
>  
>  	plane_config->size = fb->pitches[0] * aligned_height;
>  
> @@ -9912,9 +9896,7 @@ ironlake_get_initial_plane_config(struct intel_crtc *crtc,
>  	val = I915_READ(DSPSTRIDE(pipe));
>  	fb->pitches[0] = val & 0xffffffc0;
>  
> -	aligned_height = intel_fb_align_height(dev, fb->height,
> -					       fb->format->format,
> -					       fb->modifier);
> +	aligned_height = intel_fb_align_height(fb, 0, fb->height);
>  
>  	plane_config->size = fb->pitches[0] * aligned_height;
>  
> @@ -15967,15 +15949,6 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  		return -EINVAL;
>  	}
>  
> -	stride_alignment = intel_fb_stride_alignment(dev_priv,
> -						     mode_cmd->modifier[0],
> -						     mode_cmd->pixel_format);
> -	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
> -			  mode_cmd->pitches[0], stride_alignment);
> -		return -EINVAL;
> -	}
> -
>  	pitch_limit = intel_fb_pitch_limit(dev_priv, mode_cmd->modifier[0],
>  					   mode_cmd->pixel_format);
>  	if (mode_cmd->pitches[0] > pitch_limit) {
> @@ -16057,6 +16030,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  		return -EINVAL;
>  
>  	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> +
> +	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> +	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> +		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
> +			  mode_cmd->pitches[0], stride_alignment);
> +		return -EINVAL;
> +	}
> +
>  	intel_fb->obj = obj;
>  
>  	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 6b02dac6ea26..e3d19fc6720c 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1208,12 +1208,8 @@ void intel_ddi_set_vc_payload_alloc(struct drm_crtc *crtc, bool state);
>  uint32_t ddi_signal_levels(struct intel_dp *intel_dp);
>  struct intel_shared_dpll *intel_ddi_get_link_dpll(struct intel_dp *intel_dp,
>  						  int clock);
> -unsigned int intel_fb_align_height(struct drm_device *dev,
> -				   unsigned int height,
> -				   uint32_t pixel_format,
> -				   uint64_t fb_format_modifier);
> -u32 intel_fb_stride_alignment(const struct drm_i915_private *dev_priv,
> -			      uint64_t fb_modifier, uint32_t pixel_format);
> +unsigned int intel_fb_align_height(const struct drm_framebuffer *fb,
> +				   int plane, unsigned int height);
>  
>  /* intel_audio.c */
>  void intel_init_audio_hooks(struct drm_i915_private *dev_priv);
> @@ -1325,9 +1321,6 @@ int intel_plane_atomic_set_property(struct drm_plane *plane,
>  int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
>  				    struct drm_plane_state *plane_state);
>  
> -unsigned int intel_tile_height(const struct drm_i915_private *dev_priv,
> -			       uint64_t fb_modifier, unsigned int cpp);
> -
>  void assert_pch_transcoder_disabled(struct drm_i915_private *dev_priv,
>  				    enum pipe pipe);
>  
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 73d02d21c768..3716554e32a9 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -631,9 +631,7 @@ static bool intel_fbdev_init_bios(struct drm_device *dev,
>  		}
>  
>  		cur_size = intel_crtc->config->base.adjusted_mode.crtc_vdisplay;
> -		cur_size = intel_fb_align_height(dev, cur_size,
> -						 fb->base.format->format,
> -						 fb->base.modifier);
> +		cur_size = intel_fb_align_height(&fb->base, 0, cur_size);
>  		cur_size *= fb->base.pitches[0];
>  		DRM_DEBUG_KMS("pipe %c area: %dx%d, bpp: %d, size: %d\n",
>  			      pipe_name(intel_crtc->pipe),
> -- 
> 2.10.2
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment()
  2017-01-04 18:42 ` [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment() ville.syrjala
@ 2017-02-02 13:34   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 13:34 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:26PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Let's try to keep the alignment requirements in one place, and so
> towards that end let's move the AUX_DIST alignment handling into
> intel_surf_alignment() alongside the main surface alignment stuff.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reviewed-by: Imre Deak <imre.deak@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_display.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f0cb80aba89a..4d514ca1da88 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2163,6 +2163,10 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>  {
>  	struct drm_i915_private *dev_priv = to_i915(fb->dev);
>  
> +	/* AUX_DIST needs only 4K alignment */
> +	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> +		return 4096;
> +
>  	switch (fb->modifier) {
>  	case DRM_FORMAT_MOD_NONE:
>  		return intel_linear_alignment(dev_priv);
> @@ -2452,13 +2456,7 @@ u32 intel_compute_tile_offset(int *x, int *y,
>  	const struct drm_framebuffer *fb = state->base.fb;
>  	unsigned int rotation = state->base.rotation;
>  	int pitch = intel_fb_pitch(fb, plane, rotation);
> -	u32 alignment;
> -
> -	/* AUX_DIST needs only 4K alignment */
> -	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> -		alignment = 4096;
> -	else
> -		alignment = intel_surf_alignment(fb, plane);
> +	u32 alignment = intel_surf_alignment(fb, plane);
>  
>  	return _intel_compute_tile_offset(dev_priv, x, y, fb, plane, pitch,
>  					  rotation, alignment);
> -- 
> 2.10.2
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane
  2017-01-04 18:42 ` [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane ville.syrjala
@ 2017-02-02 13:38   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 13:38 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:27PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> To make life easier let's allow skl_plane_stride() to be called for the
> AUX surface even when there is no AUX surface. Avoids special cases in
> the callers.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reviewed-by: Imre Deak <imre.deak@intel.com>
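
As a rough illustration of the guard (not the actual driver code), the
idea is to bail out before any division when the requested plane does
not exist. struct fb_model and plane_stride() below are hypothetical
stand-ins for drm_framebuffer and skl_plane_stride(), and the
stride_unit field is an assumption made to keep the sketch standalone.

```c
#include <assert.h>

struct fb_model {			/* hypothetical, not struct drm_framebuffer */
	unsigned int num_planes;
	unsigned int pitch[2];		/* bytes per row, per plane */
	unsigned int stride_unit[2];	/* bytes per stride unit, 0 if plane absent */
};

/* Return the stride in hardware units, or 0 for a plane that does not
 * exist, so callers need no special case for a missing AUX plane. */
static unsigned int plane_stride(const struct fb_model *fb, unsigned int plane)
{
	if (plane >= fb->num_planes)
		return 0;

	assert(fb->stride_unit[plane] != 0);
	return fb->pitch[plane] / fb->stride_unit[plane];
}
```

With a single-plane fb, asking for plane 1 simply yields 0 instead of
dividing by the zero-sized unit of the absent plane.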

> ---
>  drivers/gpu/drm/i915/intel_display.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4d514ca1da88..bc398743e941 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3261,7 +3261,12 @@ static void skl_detach_scalers(struct intel_crtc *intel_crtc)
>  u32 skl_plane_stride(const struct drm_framebuffer *fb, int plane,
>  		     unsigned int rotation)
>  {
> -	u32 stride = intel_fb_pitch(fb, plane, rotation);
> +	u32 stride;
> +
> +	if (plane >= fb->format->num_planes)
> +		return 0;
> +
> +	stride = intel_fb_pitch(fb, plane, rotation);
>  
>  	/*
>  	 * The stride is either expressed as a multiple of 64 bytes chunks for
> -- 
> 2.10.2
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 5/9] drm/i915: Fix Yf tile width
  2017-01-04 18:42 ` [PATCH 5/9] drm/i915: Fix Yf tile width ville.syrjala
@ 2017-02-02 15:07   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 15:07 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:28PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Based on empirical evidence the display engine (at least) always
> treats Yf tiles as 128Bx32. Currently we're assuming the tile dimensions
> change based on the pixel format. Let's adjust our code to match the
> hardware.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reviewed-by: Imre Deak <imre.deak@intel.com>

BSpec "Address Tiling Function Introduction/Linear vs Tiled Storage":
"""
Note that the dimensions of tiles are irrespective of the data contained
within – e.g., a tile can hold twice as many 16-bit pixels (256
pixels/row x 8 rows = 2K pixels) than 32-bit pixels (128 pixels/row x 8
rows = 1K pixels).
"""

"Tile-Yf Format":
"""
The 64 Byte block is always 4-high. Width (in pixels) is defined by bpp.
"""

It then goes on to specify different tile aspect ratios for different
bpps, which contradicts the above. That aspect ratio definition is what
the current code matches. It would be good to file a BSpec issue for
this.
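
To make the resulting geometry concrete, here is a small standalone
sketch of the arithmetic intel_tile_dims() would do for Yf after this
patch, assuming the 4096 byte tile size that intel_tile_size() returns
on anything newer than gen2. struct tile_dims and yf_tile_dims() are
illustrative names only.

```c
#include <assert.h>

#define TILE_SIZE_BYTES		4096	/* assumed: non-gen2 tile size */
#define YF_TILE_WIDTH_BYTES	128	/* fixed Yf width after this patch */

struct tile_dims { unsigned int width_px, height_rows; };

/* Tile dimensions for a fixed 128 byte wide tile: the height is always
 * 32 rows, only the width in pixels varies with cpp. */
static struct tile_dims yf_tile_dims(unsigned int cpp)
{
	struct tile_dims d;

	assert(cpp != 0 && YF_TILE_WIDTH_BYTES % cpp == 0);
	d.width_px = YF_TILE_WIDTH_BYTES / cpp;
	d.height_rows = TILE_SIZE_BYTES / YF_TILE_WIDTH_BYTES;
	return d;
}
```

That gives 32x32 pixels for a 4 byte format and 128x32 for a 1 byte
format, i.e. the byte dimensions stay at 128Bx32 regardless of bpp.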

> ---
>  drivers/gpu/drm/i915/intel_display.c | 20 ++++++--------------
>  1 file changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index bc398743e941..0ca0dbccc005 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2070,20 +2070,12 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
>  		else
>  			return 512;
>  	case I915_FORMAT_MOD_Yf_TILED:
> -		switch (cpp) {
> -		case 1:
> -			return 64;
> -		case 2:
> -		case 4:
> -			return 128;
> -		case 8:
> -		case 16:
> -			return 256;
> -		default:
> -			MISSING_CASE(cpp);
> -			return cpp;
> -		}
> -		break;
> +		/*
> +		 * Bspec seems to suggest that the Yf tile width would
> +		 * depend on the cpp. In reality it doesn't, at least
> +		 * as far as the display engine is concerned.
> +		 */
> +		return 128;
>  	default:
>  		MISSING_CASE(fb->modifier);
>  		return cpp;
> -- 
> 2.10.2
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset()
  2017-01-04 18:42 ` [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset() ville.syrjala
@ 2017-02-02 16:10   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 16:10 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:29PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> intel_fill_fb_info() should pass the correct plane index to
> _intel_compute_tile_offset() once we start to care about the AUX
> surface.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

This already changes how x/y and the offset are calculated for planes
whose cpp differs from that of plane 0, but the end result remains
the same:
Reviewed-by: Imre Deak <imre.deak@intel.com>
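
For reference, a simplified and unrotated model of the tile offset
arithmetic, showing where the per-plane cpp enters. This is a sketch
under the assumption of 4096 byte tiles and a pitch that is a whole
number of tiles; tile_offset() is a made-up helper and leaves out the
alignment handling the real function does.

```c
#include <assert.h>

#define TILE_SIZE 4096u

/* Find the byte offset of the tile containing pixel (*x, *y) and reduce
 * *x and *y to coordinates within that tile. */
static unsigned int tile_offset(unsigned int *x, unsigned int *y,
				unsigned int pitch, unsigned int cpp,
				unsigned int tile_width_bytes)
{
	unsigned int tile_width, tile_height, pitch_tiles, tile_rows, tiles;

	assert(cpp != 0 && tile_width_bytes != 0);
	tile_width = tile_width_bytes / cpp;		/* in pixels */
	tile_height = TILE_SIZE / tile_width_bytes;	/* in rows */
	pitch_tiles = pitch / tile_width_bytes;		/* tiles per tile row */

	tile_rows = *y / tile_height;
	*y %= tile_height;
	tiles = *x / tile_width;
	*x %= tile_width;

	return (tile_rows * pitch_tiles + tiles) * TILE_SIZE;
}
```

In this model only the tile width in pixels depends on cpp, so the same
byte position expressed with a different cpp lands in the same tile.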


> ---
>  drivers/gpu/drm/i915/intel_display.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 0ca0dbccc005..5fee5a7ac9a4 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2525,7 +2525,7 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>  		intel_fb->normal[i].y = y;
>  
>  		offset = _intel_compute_tile_offset(dev_priv, &x, &y,
> -						    fb, 0, fb->pitches[i],
> +						    fb, i, fb->pitches[i],
>  						    DRM_ROTATE_0, tile_size);
>  		offset /= tile_size;
>  
> -- 
> 2.10.2
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages
  2017-01-04 18:42 ` [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages ville.syrjala
@ 2017-02-02 16:19   ` Imre Deak
  0 siblings, 0 replies; 44+ messages in thread
From: Imre Deak @ 2017-02-02 16:19 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, dri-devel

On Wed, Jan 04, 2017 at 08:42:30PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> DRM_UT_CORE generates way too much noise usually, so having the
> framebuffer init failures use DRM_UT_CORE is a pain when trying to
> find out the reason why you failed in creating a framebuffer.
> Let's use DRM_UT_KMS for these debug messages instead.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 66 ++++++++++++++++++------------------
>  1 file changed, 33 insertions(+), 33 deletions(-)
> [...]  
> @@ -15940,17 +15940,17 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  	 */
>  	if (INTEL_INFO(dev_priv)->gen < 4 &&
>  	    tiling != intel_fb_modifier_to_tiling(mode_cmd->modifier[0])) {
> -		DRM_DEBUG("tiling_mode must match fb modifier exactly on gen2/3\n");
> +		DRM_DEBUG_KMS("tiling_mode must match fb modifier exactly on gen2/3\n");
>  		return -EINVAL;
>  	}
>  
>  	pitch_limit = intel_fb_pitch_limit(dev_priv, mode_cmd->modifier[0],
>  					   mode_cmd->pixel_format);
>  	if (mode_cmd->pitches[0] > pitch_limit) {
> -		DRM_DEBUG("%s pitch (%u) must be at less than %d\n",
> -			  mode_cmd->modifier[0] != DRM_FORMAT_MOD_NONE ?
> -			  "tiled" : "linear",
> -			  mode_cmd->pitches[0], pitch_limit);
> +		DRM_DEBUG_KMS("%s pitch (%u) must be at less than %d\n",

While at it: s/at less than/at most/

Reviewed-by: Imre Deak <imre.deak@intel.com>

> +			      mode_cmd->modifier[0] != DRM_FORMAT_MOD_NONE ?
> +			      "tiled" : "linear",
> +			      mode_cmd->pitches[0], pitch_limit);
>  		return -EINVAL;
>  	}
>  
> @@ -15960,9 +15960,9 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  	 */
>  	if (tiling != I915_TILING_NONE &&
>  	    mode_cmd->pitches[0] != i915_gem_object_get_stride(obj)) {
> -		DRM_DEBUG("pitch (%d) must match tiling stride (%d)\n",
> -			  mode_cmd->pitches[0],
> -			  i915_gem_object_get_stride(obj));
> +		DRM_DEBUG_KMS("pitch (%d) must match tiling stride (%d)\n",
> +			      mode_cmd->pitches[0],
> +			      i915_gem_object_get_stride(obj));
>  		return -EINVAL;
>  	}
>  
> @@ -15975,16 +15975,16 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  		break;
>  	case DRM_FORMAT_XRGB1555:
>  		if (INTEL_GEN(dev_priv) > 3) {
> -			DRM_DEBUG("unsupported pixel format: %s\n",
> -			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  			return -EINVAL;
>  		}
>  		break;
>  	case DRM_FORMAT_ABGR8888:
>  		if (!IS_VALLEYVIEW(dev_priv) && !IS_CHERRYVIEW(dev_priv) &&
>  		    INTEL_GEN(dev_priv) < 9) {
> -			DRM_DEBUG("unsupported pixel format: %s\n",
> -			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  			return -EINVAL;
>  		}
>  		break;
> @@ -15992,15 +15992,15 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  	case DRM_FORMAT_XRGB2101010:
>  	case DRM_FORMAT_XBGR2101010:
>  		if (INTEL_GEN(dev_priv) < 4) {
> -			DRM_DEBUG("unsupported pixel format: %s\n",
> -			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  			return -EINVAL;
>  		}
>  		break;
>  	case DRM_FORMAT_ABGR2101010:
>  		if (!IS_VALLEYVIEW(dev_priv) && !IS_CHERRYVIEW(dev_priv)) {
> -			DRM_DEBUG("unsupported pixel format: %s\n",
> -			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  			return -EINVAL;
>  		}
>  		break;
> @@ -16009,14 +16009,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  	case DRM_FORMAT_YVYU:
>  	case DRM_FORMAT_VYUY:
>  		if (INTEL_GEN(dev_priv) < 5) {
> -			DRM_DEBUG("unsupported pixel format: %s\n",
> -			          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +			DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +				      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  			return -EINVAL;
>  		}
>  		break;
>  	default:
> -		DRM_DEBUG("unsupported pixel format: %s\n",
> -		          drm_get_format_name(mode_cmd->pixel_format, &format_name));
> +		DRM_DEBUG_KMS("unsupported pixel format: %s\n",
> +			      drm_get_format_name(mode_cmd->pixel_format, &format_name));
>  		return -EINVAL;
>  	}
>  
> @@ -16028,8 +16028,8 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  
>  	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
>  	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -		DRM_DEBUG("pitch (%d) must be at least %u byte aligned\n",
> -			  mode_cmd->pitches[0], stride_alignment);
> +		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
> +			      mode_cmd->pitches[0], stride_alignment);
>  		return -EINVAL;
>  	}
>  
> -- 
> 2.10.2
> 
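A minimal sketch of the category filtering this patch relies on. The bit
values are the ones I recall from the DRM core headers of this era
(DRM_DEBUG() logs under DRM_UT_CORE, DRM_DEBUG_KMS() under DRM_UT_KMS);
drm_debug_category_enabled() is a hypothetical stand-in for the check
behind those macros, not a real kernel function.

```c
#include <stdbool.h>

/* Debug category bits, as recalled from the DRM core headers. */
#define DRM_UT_CORE	0x01
#define DRM_UT_DRIVER	0x02
#define DRM_UT_KMS	0x04

/*
 * Hypothetical stand-in: a message is emitted only if its category bit
 * is set in the drm.debug module parameter.
 */
static bool drm_debug_category_enabled(unsigned int drm_debug,
				       unsigned int category)
{
	return (drm_debug & category) != 0;
}
```

Under that assumption, booting with drm.debug=0x04 would show the
framebuffer init failures after this patch without also enabling the
DRM_UT_CORE output.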


* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
  2017-01-09 19:20     ` Jason Ekstrand
@ 2017-02-07 15:37     ` Imre Deak
  2017-02-13 17:13       ` Ville Syrjälä
  2017-02-28 20:18     ` Jason Ekstrand
  2 siblings, 1 reply; 44+ messages in thread
From: Imre Deak @ 2017-02-07 15:37 UTC (permalink / raw)
  To: ville.syrjala
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx, dri-devel, Vandana Kannan

On Thu, Jan 05, 2017 at 05:14:54PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involves telling the display engine
> the location of the color control surface (CCS) which describes
> which parts of the main surface are compressed and which are not. The
> location of CCS is provided by userspace as just another plane with its
> own offset.
> 
> Add the required stuff to validate the user provided AUX plane metadata
> and convert the user provided linear offset into something the hardware
> can consume.
> 
> Due to hardware limitations we require that the main surface and
> the AUX surface (CCS) be part of the same bo. The hardware also
> makes life hard by not allowing you to provide separate x/y offsets
> for the main and AUX surfaces (except with NV12), so finding suitable
> offsets for both requires a bit of work. Assuming we still want to keep
> playing tricks with the offsets. I've just gone with a dumb "search
> backward for suitable offsets" approach, which is far from optimal,
> but it works.
> 
> Also not all planes will be capable of scanning out compressed surfaces,
> and eg. 90/270 degree rotation is not supported in combination with
> decompression either.
> 
> This patch may contain work from at least the following people:
> * Vandana Kannan <vandana.kannan@intel.com>
> * Daniel Vetter <daniel@ffwll.ch>
> * Ben Widawsky <ben@bwidawsk.net>
> 
> v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
> 
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>  4 files changed, 274 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 00970aa77afa..6849ba93f1d9 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6209,6 +6209,28 @@ enum {
>  			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>  			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>  
> +#define PLANE_AUX_DIST_1_A		0x701c0
> +#define PLANE_AUX_DIST_2_A		0x702c0
> +#define PLANE_AUX_DIST_1_B		0x711c0
> +#define PLANE_AUX_DIST_2_B		0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane)     \
> +	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A		0x701c4
> +#define PLANE_AUX_OFFSET_2_A		0x702c4
> +#define PLANE_AUX_OFFSET_1_B		0x711c4
> +#define PLANE_AUX_OFFSET_2_B		0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> +	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A           0x4a000
>  #define _LGC_PALETTE_B           0x4a800
> @@ -6433,6 +6455,7 @@ enum {
>  # define CHICKEN3_DGMG_DONE_FIX_DISABLE		(1 << 2)
>  
>  #define CHICKEN_PAR1_1		_MMIO(0x42080)
> +#define  SKL_RC_HASH_OUTSIDE	(1 << 15)
>  #define  DPA_MASK_VBLANK_SRD	(1 << 15)
>  #define  FORCE_ARB_IDLE_PLANES	(1 << 14)
>  #define  SKL_EDP_PSR_FIX_RDWRAP	(1 << 3)
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 38de9df0ec60..2236abebd8bc 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
>  			return 128;
>  		else
>  			return 512;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		if (plane == 1)
> +			return 64;
> +		/* fall through */
>  	case I915_FORMAT_MOD_Y_TILED:
>  		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>  			return 128;
>  		else
>  			return 512;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		if (plane == 1)
> +			return 64;
> +		/* fall through */
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		/*
>  		 * Bspec seems to suggest that the Yf tile width would
> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>  	struct drm_i915_private *dev_priv = to_i915(fb->dev);
>  
>  	/* AUX_DIST needs only 4K alignment */
> -	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> +	if (plane == 1)
>  		return 4096;
>  
>  	switch (fb->modifier) {
> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>  		if (INTEL_GEN(dev_priv) >= 9)
>  			return 256 * 1024;
>  		return 0;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		return 1 * 1024 * 1024;
> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>  	case I915_FORMAT_MOD_X_TILED:
>  		return I915_TILING_X;
>  	case I915_FORMAT_MOD_Y_TILED:
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
>  		return I915_TILING_Y;
>  	default:
>  		return I915_TILING_NONE;
> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>  
>  		intel_fb_offset_to_xy(&x, &y, fb, i);
>  
> +		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> +			int main_x, main_y;
> +			int ccs_x, ccs_y;
> +
> +			/*
> +			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> +			 * each CCS tile is 64x64 bytes.
> +			 */
> +			ccs_x = (x * 16) % (64 * 16);
> +			ccs_y = (y * 8) % (64 * 8);
> +			main_x = intel_fb->normal[0].x % (64 * 16);
> +			main_y = intel_fb->normal[0].y % (64 * 8);
> +
> +			/*
> +			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
> +			 * x/y offsets must match between CCS and the main surface.
> +			 */
> +			if (main_x != ccs_x || main_y != ccs_y) {
> +				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
> +					      main_x, main_y,
> +					      ccs_x, ccs_y,
> +					      intel_fb->normal[0].x,
> +					      intel_fb->normal[0].y,
> +					      x, y);
> +				return -EINVAL;
> +			}
> +		}
> +
>  		/*
>  		 * The fence (if used) is aligned to the start of the object
>  		 * so having the framebuffer wrap around across the edge of the
> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>  			break;
>  		}
>  		break;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		/* FIXME AUX plane? */
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		switch (cpp) {
> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>  	return 2048;
>  }
>  
> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
> +					   int main_x, int main_y, u32 main_offset)
> +{
> +	const struct drm_framebuffer *fb = plane_state->base.fb;
> +	int aux_x = plane_state->aux.x;
> +	int aux_y = plane_state->aux.y;
> +	u32 aux_offset = plane_state->aux.offset;
> +	u32 alignment = intel_surf_alignment(fb, 1);
> +
> +	while (aux_offset >= main_offset && aux_y <= main_y) {
> +		int x, y;
> +
> +		if (aux_x == main_x && aux_y == main_y)
> +			break;
> +
> +		if (aux_offset == 0)
> +			break;
> +
> +		x = aux_x / 16;
> +		y = aux_y / 8;
> +		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
> +						      aux_offset, aux_offset - alignment);
> +		aux_x = x * 16 + aux_x % 16;
> +		aux_y = y * 8 + aux_y % 8;
> +	}
> +
> +	if (aux_x != main_x || aux_y != main_y)
> +		return false;
> +
> +	plane_state->aux.offset = aux_offset;
> +	plane_state->aux.x = aux_x;
> +	plane_state->aux.y = aux_y;
> +
> +	return true;
> +}
> +
>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  {
>  	const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  
>  		while ((x + w) * cpp > fb->pitches[0]) {
>  			if (offset == 0) {
> -				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
> +				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
>  				return -EINVAL;
>  			}
>  
> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  		}
>  	}
>  
> +	/*
> +	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
> +	 * they match with the main surface x/y offsets.
> +	 */
> +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
> +			if (offset == 0)
> +				break;
> +
> +			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
> +							  offset, offset - alignment);
> +		}
> +
> +		if (x != plane_state->aux.x || y != plane_state->aux.y) {
> +			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
> +			return -EINVAL;
> +		}
> +	}
> +
>  	plane_state->main.offset = offset;
>  	plane_state->main.x = x;
>  	plane_state->main.y = y;
> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
>  	return 0;
>  }
>  
> +static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
> +{
> +	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> +	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> +	int src_x = plane_state->base.src.x1 >> 16;
> +	int src_y = plane_state->base.src.y1 >> 16;
> +	int x = src_x / 16;
> +	int y = src_y / 8;
> +	u32 offset;
> +
> +	switch (plane->id) {
> +	case PLANE_PRIMARY:
> +	case PLANE_SPRITE0:
> +		break;
> +	default:
> +		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> +		return -EINVAL;
> +	}
> +
> +	if (crtc->pipe == PIPE_C) {
> +		DRM_DEBUG_KMS("No RC support on pipe C\n");
> +		return -EINVAL;
> +	}
> +
> +	if (plane_state->base.rotation &&
> +	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
> +		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
> +			      plane_state->base.rotation);
> +		return -EINVAL;
> +	}
> +
> +	intel_add_fb_offsets(&x, &y, plane_state, 1);
> +	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> +
> +	plane_state->aux.offset = offset;
> +	plane_state->aux.x = x * 16 + src_x % 16;
> +	plane_state->aux.y = y * 8 + src_y % 8;
> +
> +	return 0;
> +}
> +
>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  {
>  	const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  		ret = skl_check_nv12_aux_surface(plane_state);
>  		if (ret)
>  			return ret;
> +	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +		ret = skl_check_ccs_aux_surface(plane_state);
> +		if (ret)
> +			return ret;
>  	} else {
>  		plane_state->aux.offset = ~0xfff;
>  		plane_state->aux.x = 0;
> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>  		return PLANE_CTL_TILED_X;
>  	case I915_FORMAT_MOD_Y_TILED:
>  		return PLANE_CTL_TILED_Y;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		return PLANE_CTL_TILED_YF;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>  	default:
>  		MISSING_CASE(fb_modifier);
>  	}
> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	u32 plane_ctl;
>  	unsigned int rotation = plane_state->base.rotation;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
> +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>  	u32 surf_addr = plane_state->main.offset;
>  	int scaler_id = plane_state->scaler_id;
>  	int src_x = plane_state->main.x;
> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +		   (plane_state->aux.y << 16) | plane_state->aux.x);
>  
>  	if (scaler_id >= 0) {
>  		uint32_t ps_ctrl = 0;
> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
>  		fb->modifier = I915_FORMAT_MOD_X_TILED;
>  		break;
>  	case PLANE_CTL_TILED_Y:
> -		fb->modifier = I915_FORMAT_MOD_Y_TILED;
> +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> +		else
> +			fb->modifier = I915_FORMAT_MOD_Y_TILED;
>  		break;
>  	case PLANE_CTL_TILED_YF:
> -		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> +		else
> +			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>  		break;
>  	default:
>  		MISSING_CASE(tiling);
> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>  	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>  
>  	ctl = I915_READ(PLANE_CTL(pipe, 0));
> -	ctl &= ~PLANE_CTL_TILED_MASK;
> +	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>  	switch (fb->modifier) {
>  	case DRM_FORMAT_MOD_NONE:
>  		break;
> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>  	case I915_FORMAT_MOD_Y_TILED:
>  		ctl |= PLANE_CTL_TILED_Y;
>  		break;
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> +		break;
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		ctl |= PLANE_CTL_TILED_YF;
>  		break;
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> +		break;
>  	default:
>  		MISSING_CASE(fb->modifier);
>  	}
> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  				  struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct drm_framebuffer *fb = &intel_fb->base;
>  	unsigned int tiling = i915_gem_object_get_tiling(obj);
> -	int ret;
> -	u32 pitch_limit, stride_alignment;
> +	int ret, i;
> +	u32 pitch_limit;
>  	struct drm_format_name_buf format_name;
>  
>  	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  
>  	/* Passed in modifier sanity checking. */
>  	switch (mode_cmd->modifier[0]) {
> +	case I915_FORMAT_MOD_Y_TILED_CCS:
> +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> +		switch (mode_cmd->pixel_format) {
> +		case DRM_FORMAT_XBGR8888:
> +		case DRM_FORMAT_ABGR8888:
> +		case DRM_FORMAT_XRGB8888:
> +		case DRM_FORMAT_ARGB8888:
> +			break;
> +		default:
> +			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> +			return -EINVAL;
> +		}
> +		/* fall through */
>  	case I915_FORMAT_MOD_Y_TILED:
>  	case I915_FORMAT_MOD_Yf_TILED:
>  		if (INTEL_GEN(dev_priv) < 9) {
> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct drm_device *dev,
>  	if (mode_cmd->offsets[0] != 0)
>  		return -EINVAL;
>  
> -	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> +	drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>  
> -	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> -	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
> -			      mode_cmd->pitches[0], stride_alignment);
> -		return -EINVAL;
> +	for (i = 0; i < fb->format->num_planes; i++) {
> +		u32 stride_alignment;
> +
> +		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> +			DRM_DEBUG_KMS("bad plane %d handle\n", i);
> +			return -EINVAL;
> +		}
> +
> +		stride_alignment = intel_fb_stride_alignment(fb, i);
> +
> +		/*
> +		 * Display WA #0531: skl,bxt,kbl,glk
> +		 *
> +		 * Render decompression and plane width > 3840
> +		 * combined with horizontal panning requires the
> +		 * plane stride to be a multiple of 4. We'll just
> +		 * require the entire fb to accommodate that to avoid
> +		 * potential runtime errors at plane configuration time.
> +		 */
> +		if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
> +		    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +			stride_alignment *= 4;
> +
> +		if (fb->pitches[i] & (stride_alignment - 1)) {
> +			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
> +				      i, fb->pitches[i], stride_alignment);
> +			return -EINVAL;
> +		}
>  	}
>  
>  	intel_fb->obj = obj;
>  
> -	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> +	ret = intel_fill_fb_info(dev_priv, fb);
>  	if (ret)
>  		return ret;
>  
> -	ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
> +	ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>  	if (ret) {
>  		DRM_ERROR("framebuffer init failed %d\n", ret);
>  		return ret;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 249623d45be0..25782cd174c0 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
>  	I915_WRITE(CHICKEN_PAR1_1,
>  		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>  
> +	/*
> +	 * Display WA#0390: skl,bxt,kbl,glk
> +	 *
> +	 * Must match Sampler, Pixel Back End, and Media
> +	 * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
> +	 *
> +	 * Including bits outside the page in the hash would
> +	 * require 2 (or 4?) MiB alignment of resources. Just
> +	 * assume the default hashing mode which only uses bits
> +	 * within the page.
> +	 */
> +	I915_WRITE(CHICKEN_PAR1_1,
> +		   I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
> +
>  	I915_WRITE(GEN8_CONFIG0,
>  		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>  
> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>  
>  	/* For Non Y-tile return 8-blocks */
>  	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> -	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> +	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>  		return 8;
>  
>  	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	}
>  
>  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> -		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> +		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>  	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>  
>  	/* Display WA #1141: kbl. */
> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	res_lines = DIV_ROUND_UP(selected_result.val,
>  				 plane_blocks_per_line.val);
>  
> +	/* Display WA #1125: skl,bxt,kbl,glk */
> +	if (level == 0 &&
> +	    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +	     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +		res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);

This WA also requires adjusting res_lines and making the level 0
res_blocks and res_lines the minimum for the higher levels.
The level 0 res_lines won't be used later though, and the minimum could
already be guaranteed by the larger memory latency values of the higher
levels. In any case this check can be added later if needed. The rest
looks good:
Reviewed-by: Imre Deak <imre.deak@intel.com>

> +
> +	/* Display WA #1126: skl,bxt,kbl,glk */
>  	if (level >= 1 && level <= 7) {
>  		if (y_tiled) {
>  			res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 7031bc733d97..063a994815d0 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	u32 surf_addr = plane_state->main.offset;
>  	unsigned int rotation = plane_state->base.rotation;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
> +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>  	int crtc_x = plane_state->base.dst.x1;
>  	int crtc_y = plane_state->base.dst.y1;
>  	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +		   (plane_state->aux.y << 16) | plane_state->aux.x);
>  
>  	/* program plane scaler */
>  	if (plane_state->scaler_id >= 0) {
> -- 
> 2.10.2
> 
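A minimal sketch of the intra-tile offset check that the patch adds to
intel_fill_fb_info(), pulled out into a standalone function. The scale
factors come from the comment in the patch (each CCS byte covers a 16x8
area of the main surface, each CCS tile is 64x64 bytes); the helper name
and macro names are hypothetical, and all coordinates are assumed to be
non-negative.

```c
#include <stdbool.h>

/* Main-surface area covered by one 64x64 byte CCS tile. */
#define CCS_TILE_MAIN_W	(64 * 16)
#define CCS_TILE_MAIN_H	(64 * 8)

/*
 * Hypothetical helper: the CCS has no x/y offset register of its own,
 * so the offsets within a CCS tile must agree between the CCS and the
 * main surface once the CCS x/y are scaled up to main-surface units.
 */
static bool ccs_intra_tile_offsets_match(int main_x, int main_y,
					 int ccs_x, int ccs_y)
{
	int scaled_ccs_x = (ccs_x * 16) % CCS_TILE_MAIN_W;
	int scaled_ccs_y = (ccs_y * 8) % CCS_TILE_MAIN_H;

	return main_x % CCS_TILE_MAIN_W == scaled_ccs_x &&
	       main_y % CCS_TILE_MAIN_H == scaled_ccs_y;
}
```

For example, a main surface x/y of 16,8 matches a CCS x/y of 1,1, while
a main surface x/y of 16,0 does not match a CCS x/y of 0,0. The latter
is the case the patch rejects with -EINVAL.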


* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-02-07 15:37     ` Imre Deak
@ 2017-02-13 17:13       ` Ville Syrjälä
  0 siblings, 0 replies; 44+ messages in thread
From: Ville Syrjälä @ 2017-02-13 17:13 UTC (permalink / raw)
  To: Imre Deak
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx, dri-devel, Jason Ekstrand,
	Vandana Kannan

On Tue, Feb 07, 2017 at 05:37:44PM +0200, Imre Deak wrote:
> On Thu, Jan 05, 2017 at 05:14:54PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > SKL+ display engine can scan out certain kinds of compressed surfaces
> > produced by the render engine. This involves telling the display engine
> > the location of the color control surface (CCS) which describes
> > which parts of the main surface are compressed and which are not. The
> > location of CCS is provided by userspace as just another plane with its
> > own offset.
> > 
> > Add the required stuff to validate the user provided AUX plane metadata
> > and convert the user provided linear offset into something the hardware
> > can consume.
> > 
> > Due to hardware limitations we require that the main surface and
> > the AUX surface (CCS) be part of the same bo. The hardware also
> > makes life hard by not allowing you to provide separate x/y offsets
> > for the main and AUX surfaces (except with NV12), so finding suitable
> > offsets for both requires a bit of work. Assuming we still want to keep
> > playing tricks with the offsets. I've just gone with a dumb "search
> > backward for suitable offsets" approach, which is far from optimal,
> > but it works.
> > 
> > Also not all planes will be capable of scanning out compressed surfaces,
> > and eg. 90/270 degree rotation is not supported in combination with
> > decompression either.
> > 
> > This patch may contain work from at least the following people:
> > * Vandana Kannan <vandana.kannan@intel.com>
> > * Daniel Vetter <daniel@ffwll.ch>
> > * Ben Widawsky <ben@bwidawsk.net>
> > 
> > v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
> > 
> > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Cc: Vandana Kannan <vandana.kannan@intel.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
> >  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
> >  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
> >  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
> >  4 files changed, 274 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 00970aa77afa..6849ba93f1d9 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -6209,6 +6209,28 @@ enum {
> >  			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
> >  			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
> >  
> > +#define PLANE_AUX_DIST_1_A		0x701c0
> > +#define PLANE_AUX_DIST_2_A		0x702c0
> > +#define PLANE_AUX_DIST_1_B		0x711c0
> > +#define PLANE_AUX_DIST_2_B		0x712c0
> > +#define _PLANE_AUX_DIST_1(pipe) \
> > +			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> > +#define _PLANE_AUX_DIST_2(pipe) \
> > +			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> > +#define PLANE_AUX_DIST(pipe, plane)     \
> > +	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
> > +
> > +#define PLANE_AUX_OFFSET_1_A		0x701c4
> > +#define PLANE_AUX_OFFSET_2_A		0x702c4
> > +#define PLANE_AUX_OFFSET_1_B		0x711c4
> > +#define PLANE_AUX_OFFSET_2_B		0x712c4
> > +#define _PLANE_AUX_OFFSET_1(pipe)       \
> > +		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> > +#define _PLANE_AUX_OFFSET_2(pipe)       \
> > +		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> > +#define PLANE_AUX_OFFSET(pipe, plane)   \
> > +	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
> > +
> >  /* legacy palette */
> >  #define _LGC_PALETTE_A           0x4a000
> >  #define _LGC_PALETTE_B           0x4a800
> > @@ -6433,6 +6455,7 @@ enum {
> >  # define CHICKEN3_DGMG_DONE_FIX_DISABLE		(1 << 2)
> >  
> >  #define CHICKEN_PAR1_1		_MMIO(0x42080)
> > +#define  SKL_RC_HASH_OUTSIDE	(1 << 15)
> >  #define  DPA_MASK_VBLANK_SRD	(1 << 15)
> >  #define  FORCE_ARB_IDLE_PLANES	(1 << 14)
> >  #define  SKL_EDP_PSR_FIX_RDWRAP	(1 << 3)
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 38de9df0ec60..2236abebd8bc 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
> >  			return 128;
> >  		else
> >  			return 512;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		if (plane == 1)
> > +			return 64;
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
> >  			return 128;
> >  		else
> >  			return 512;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		if (plane == 1)
> > +			return 64;
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		/*
> >  		 * Bspec seems to suggest that the Yf tile width would
> > @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> >  	struct drm_i915_private *dev_priv = to_i915(fb->dev);
> >  
> >  	/* AUX_DIST needs only 4K alignment */
> > -	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> > +	if (plane == 1)
> >  		return 4096;
> >  
> >  	switch (fb->modifier) {
> > @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> >  		if (INTEL_GEN(dev_priv) >= 9)
> >  			return 256 * 1024;
> >  		return 0;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		return 1 * 1024 * 1024;
> > @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
> >  	case I915_FORMAT_MOD_X_TILED:
> >  		return I915_TILING_X;
> >  	case I915_FORMAT_MOD_Y_TILED:
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> >  		return I915_TILING_Y;
> >  	default:
> >  		return I915_TILING_NONE;
> > @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
> >  
> >  		intel_fb_offset_to_xy(&x, &y, fb, i);
> >  
> > +		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
> > +			int main_x, main_y;
> > +			int ccs_x, ccs_y;
> > +
> > +			/*
> > +			 * Each byte of CCS corresponds to a 16x8 area of the main surface, and
> > +			 * each CCS tile is 64x64 bytes.
> > +			 */
> > +			ccs_x = (x * 16) % (64 * 16);
> > +			ccs_y = (y * 8) % (64 * 8);
> > +			main_x = intel_fb->normal[0].x % (64 * 16);
> > +			main_y = intel_fb->normal[0].y % (64 * 8);
> > +
> > +			/*
> > +			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
> > +			 * x/y offsets must match between CCS and the main surface.
> > +			 */
> > +			if (main_x != ccs_x || main_y != ccs_y) {
> > +				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
> > +					      main_x, main_y,
> > +					      ccs_x, ccs_y,
> > +					      intel_fb->normal[0].x,
> > +					      intel_fb->normal[0].y,
> > +					      x, y);
> > +				return -EINVAL;
> > +			}
> > +		}
> > +
> >  		/*
> >  		 * The fence (if used) is aligned to the start of the object
> >  		 * so having the framebuffer wrap around across the edge of the
> > @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> >  			break;
> >  		}
> >  		break;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		/* FIXME AUX plane? */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		switch (cpp) {
> > @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
> >  	return 2048;
> >  }
> >  
> > +static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
> > +					   int main_x, int main_y, u32 main_offset)
> > +{
> > +	const struct drm_framebuffer *fb = plane_state->base.fb;
> > +	int aux_x = plane_state->aux.x;
> > +	int aux_y = plane_state->aux.y;
> > +	u32 aux_offset = plane_state->aux.offset;
> > +	u32 alignment = intel_surf_alignment(fb, 1);
> > +
> > +	while (aux_offset >= main_offset && aux_y <= main_y) {
> > +		int x, y;
> > +
> > +		if (aux_x == main_x && aux_y == main_y)
> > +			break;
> > +
> > +		if (aux_offset == 0)
> > +			break;
> > +
> > +		x = aux_x / 16;
> > +		y = aux_y / 8;
> > +		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
> > +						      aux_offset, aux_offset - alignment);
> > +		aux_x = x * 16 + aux_x % 16;
> > +		aux_y = y * 8 + aux_y % 8;
> > +	}
> > +
> > +	if (aux_x != main_x || aux_y != main_y)
> > +		return false;
> > +
> > +	plane_state->aux.offset = aux_offset;
> > +	plane_state->aux.x = aux_x;
> > +	plane_state->aux.y = aux_y;
> > +
> > +	return true;
> > +}
> > +
> >  static int skl_check_main_surface(struct intel_plane_state *plane_state)
> >  {
> >  	const struct drm_framebuffer *fb = plane_state->base.fb;
> > @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
> >  
> >  		while ((x + w) * cpp > fb->pitches[0]) {
> >  			if (offset == 0) {
> > -				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
> > +				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
> >  				return -EINVAL;
> >  			}
> >  
> > @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
> >  		}
> >  	}
> >  
> > +	/*
> > +	 * CCS AUX surface doesn't have its own x/y offsets, we must make sure
> > +	 * they match with the main surface x/y offsets.
> > +	 */
> > +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> > +		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
> > +			if (offset == 0)
> > +				break;
> > +
> > +			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
> > +							  offset, offset - alignment);
> > +		}
> > +
> > +		if (x != plane_state->aux.x || y != plane_state->aux.y) {
> > +			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> >  	plane_state->main.offset = offset;
> >  	plane_state->main.x = x;
> >  	plane_state->main.y = y;
> > @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
> >  	return 0;
> >  }
> >  
> > +static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
> > +{
> > +	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> > +	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> > +	int src_x = plane_state->base.src.x1 >> 16;
> > +	int src_y = plane_state->base.src.y1 >> 16;
> > +	int x = src_x / 16;
> > +	int y = src_y / 8;
> > +	u32 offset;
> > +
> > +	switch (plane->id) {
> > +	case PLANE_PRIMARY:
> > +	case PLANE_SPRITE0:
> > +		break;
> > +	default:
> > +		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (crtc->pipe == PIPE_C) {
> > +		DRM_DEBUG_KMS("No RC support on pipe C\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (plane_state->base.rotation &&
> > +	    plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
> > +		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
> > +			      plane_state->base.rotation);
> > +		return -EINVAL;
> > +	}
> > +
> > +	intel_add_fb_offsets(&x, &y, plane_state, 1);
> > +	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> > +
> > +	plane_state->aux.offset = offset;
> > +	plane_state->aux.x = x * 16 + src_x % 16;
> > +	plane_state->aux.y = y * 8 + src_y % 8;
> > +
> > +	return 0;
> > +}
> > +
> >  int skl_check_plane_surface(struct intel_plane_state *plane_state)
> >  {
> >  	const struct drm_framebuffer *fb = plane_state->base.fb;
> > @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
> >  		ret = skl_check_nv12_aux_surface(plane_state);
> >  		if (ret)
> >  			return ret;
> > +	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> > +		ret = skl_check_ccs_aux_surface(plane_state);
> > +		if (ret)
> > +			return ret;
> >  	} else {
> >  		plane_state->aux.offset = ~0xfff;
> >  		plane_state->aux.x = 0;
> > @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
> >  		return PLANE_CTL_TILED_X;
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		return PLANE_CTL_TILED_Y;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		return PLANE_CTL_TILED_YF;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> >  	default:
> >  		MISSING_CASE(fb_modifier);
> >  	}
> > @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> >  	u32 plane_ctl;
> >  	unsigned int rotation = plane_state->base.rotation;
> >  	u32 stride = skl_plane_stride(fb, 0, rotation);
> > +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >  	u32 surf_addr = plane_state->main.offset;
> >  	int scaler_id = plane_state->scaler_id;
> >  	int src_x = plane_state->main.x;
> > @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
> >  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
> >  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> > +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> > +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> > +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> > +		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >  
> >  	if (scaler_id >= 0) {
> >  		uint32_t ps_ctrl = 0;
> > @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
> >  		fb->modifier = I915_FORMAT_MOD_X_TILED;
> >  		break;
> >  	case PLANE_CTL_TILED_Y:
> > -		fb->modifier = I915_FORMAT_MOD_Y_TILED;
> > +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> > +			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> > +		else
> > +			fb->modifier = I915_FORMAT_MOD_Y_TILED;
> >  		break;
> >  	case PLANE_CTL_TILED_YF:
> > -		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> > +		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> > +			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> > +		else
> > +			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> >  		break;
> >  	default:
> >  		MISSING_CASE(tiling);
> > @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> >  	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
> >  
> >  	ctl = I915_READ(PLANE_CTL(pipe, 0));
> > -	ctl &= ~PLANE_CTL_TILED_MASK;
> > +	ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
> >  	switch (fb->modifier) {
> >  	case DRM_FORMAT_MOD_NONE:
> >  		break;
> > @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  		ctl |= PLANE_CTL_TILED_Y;
> >  		break;
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +		ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> > +		break;
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		ctl |= PLANE_CTL_TILED_YF;
> >  		break;
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> > +		break;
> >  	default:
> >  		MISSING_CASE(fb->modifier);
> >  	}
> > @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct drm_device *dev,
> >  				  struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_i915_private *dev_priv = to_i915(dev);
> > +	struct drm_framebuffer *fb = &intel_fb->base;
> >  	unsigned int tiling = i915_gem_object_get_tiling(obj);
> > -	int ret;
> > -	u32 pitch_limit, stride_alignment;
> > +	int ret, i;
> > +	u32 pitch_limit;
> >  	struct drm_format_name_buf format_name;
> >  
> >  	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> > @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
> >  
> >  	/* Passed in modifier sanity checking. */
> >  	switch (mode_cmd->modifier[0]) {
> > +	case I915_FORMAT_MOD_Y_TILED_CCS:
> > +	case I915_FORMAT_MOD_Yf_TILED_CCS:
> > +		switch (mode_cmd->pixel_format) {
> > +		case DRM_FORMAT_XBGR8888:
> > +		case DRM_FORMAT_ABGR8888:
> > +		case DRM_FORMAT_XRGB8888:
> > +		case DRM_FORMAT_ARGB8888:
> > +			break;
> > +		default:
> > +			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> > +			return -EINVAL;
> > +		}
> > +		/* fall through */
> >  	case I915_FORMAT_MOD_Y_TILED:
> >  	case I915_FORMAT_MOD_Yf_TILED:
> >  		if (INTEL_GEN(dev_priv) < 9) {
> > @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct drm_device *dev,
> >  	if (mode_cmd->offsets[0] != 0)
> >  		return -EINVAL;
> >  
> > -	drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> > +	drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
> >  
> > -	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> > -	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> > -		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
> > -			      mode_cmd->pitches[0], stride_alignment);
> > -		return -EINVAL;
> > +	for (i = 0; i < fb->format->num_planes; i++) {
> > +		u32 stride_alignment;
> > +
> > +		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> > +			DRM_DEBUG_KMS("bad plane %d handle\n", i);
> > +			return -EINVAL;
> > +		}
> > +
> > +		stride_alignment = intel_fb_stride_alignment(fb, i);
> > +
> > +		/*
> > +		 * Display WA #0531: skl,bxt,kbl,glk
> > +		 *
> > +		 * Render decompression and plane width > 3840
> > +		 * combined with horizontal panning requires the
> > +		 * plane stride to be a multiple of 4. We'll just
> > +		 * require the entire fb to accommodate that to avoid
> > +		 * potential runtime errors at plane configuration time.
> > +		 */
> > +		if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
> > +		    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> > +			stride_alignment *= 4;
> > +
> > +		if (fb->pitches[i] & (stride_alignment - 1)) {
> > +			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
> > +				      i, fb->pitches[i], stride_alignment);
> > +			return -EINVAL;
> > +		}
> >  	}
> >  
> >  	intel_fb->obj = obj;
> >  
> > -	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> > +	ret = intel_fill_fb_info(dev_priv, fb);
> >  	if (ret)
> >  		return ret;
> >  
> > -	ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
> > +	ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
> >  	if (ret) {
> >  		DRM_ERROR("framebuffer init failed %d\n", ret);
> >  		return ret;
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 249623d45be0..25782cd174c0 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
> >  	I915_WRITE(CHICKEN_PAR1_1,
> >  		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
> >  
> > +	/*
> > +	 * Display WA#0390: skl,bxt,kbl,glk
> > +	 *
> > +	 * Must match Sampler, Pixel Back End, and Media
> > +	 * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
> > +	 *
> > +	 * Including bits outside the page in the hash would
> > +	 * require 2 (or 4?) MiB alignment of resources. Just
> > 	 * assume the default hashing mode which only uses bits
> > +	 * within the page.
> > +	 */
> > +	I915_WRITE(CHICKEN_PAR1_1,
> > +		   I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
> > +
> >  	I915_WRITE(GEN8_CONFIG0,
> >  		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
> >  
> > @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
> >  
> >  	/* For Non Y-tile return 8-blocks */
> >  	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> > -	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> > +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> > +	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> > +	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
> >  		return 8;
> >  
> >  	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> > @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> > -		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> > +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> > +		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
> >  	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
> >  
> >  	/* Display WA #1141: kbl. */
> > @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
> >  	res_lines = DIV_ROUND_UP(selected_result.val,
> >  				 plane_blocks_per_line.val);
> >  
> > +	/* Display WA #1125: skl,bxt,kbl,glk */
> > +	if (level == 0 &&
> > +	    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> > +	     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> > +		res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> 
> This WA also requires adjusting res_lines and making the level 0
> res_blocks and res_lines the minimum for the higher levels.
> The level 0 res_lines won't be used though later and the minimum could
> be guaranteed already by the higher memory latency values for higher
> levels. In any case this check can be added later if needed. The rest
> looks good:
> Reviewed-by: Imre Deak <imre.deak@intel.com>
> 
> > +
> > +	/* Display WA #1126: skl,bxt,kbl,glk */
> >  	if (level >= 1 && level <= 7) {
> >  		if (y_tiled) {
> >  			res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);

Yeah so this ^ is where we do the same adjustment for level >= 1
regardless of CCS or not, and so assuming the memory latency values
are non-decreasing this should give us the required guarantee already.

> > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> > index 7031bc733d97..063a994815d0 100644
> > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
> >  	u32 surf_addr = plane_state->main.offset;
> >  	unsigned int rotation = plane_state->base.rotation;
> >  	u32 stride = skl_plane_stride(fb, 0, rotation);
> > +	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >  	int crtc_x = plane_state->base.dst.x1;
> >  	int crtc_y = plane_state->base.dst.y1;
> >  	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> > @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
> >  	I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
> >  	I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >  	I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> > +	I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> > +		   (plane_state->aux.offset - surf_addr) | aux_stride);
> > +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> > +		   (plane_state->aux.y << 16) | plane_state->aux.x);
> >  
> >  	/* program plane scaler */
> >  	if (plane_state->scaler_id >= 0) {
> > -- 
> > 2.10.2
> > 

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-01-04 18:42 ` [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS ville.syrjala
  2017-01-05 16:24   ` Tvrtko Ursulin
@ 2017-02-26 22:41   ` Chad Versace
  2017-02-27 15:13     ` [Intel-gfx] " Ville Syrjälä
  1 sibling, 1 reply; 44+ messages in thread
From: Chad Versace @ 2017-02-26 22:41 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx, Ben Widawsky, dri-devel, Vandana Kannan

On Wed 04 Jan 2017, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involves telling the display engine
> the location of the color control surface (CCS) which describes which
> parts of the main surface are compressed and which are not. The location
> of CCS is provided by userspace as just another plane with its own offset.
> 
> By providing our own format information for the CCS formats, we should
> be able to make framebuffer_check() do the right thing for the CCS
> surface as well.
> 
> Note that we'll return the same format info for both Y and Yf tiled
> format as that's what happens with the non-CCS Y vs. Yf as well. If
> desired, we could potentially return a unique pointer for each
> pixel_format+tiling+ccs combination, in which case we would immediately be
> able to tell if any of that stuff changed by just comparing the
> pointers. But that does sound a bit wasteful space wise.
> 
> v2: Drop the 'dev' argument from the hook
> v3: Include the description of the CCS surface layout
> 
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
>  include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)


> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index 9e1bb7fabcde..4581e3d41e5c 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -230,6 +230,55 @@ extern "C" {
>  #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
>  
>  /*
> + * Intel color control surface (CCS) for render compression
> + *
> + * The framebuffer format must be one of the 8:8:8:8 RGB formats,
> + * the main surface will be plane index 0 and will be Y/Yf-tiled,
> + * the CCS will be plane index 1.
> + *

The paragraph below is...

> + * Each byte of CCS contains 4 pairs of bits, with each pair of bits
> + * matching an area of 8x4 pixels of the main surface. Which would seem
> + * to match 2 cachelines containing 4x4 pixels each. The bit pairs within
> + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
> + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
> + * -----------
> + * | 01 | 23 |
> + *  ----------
> + * | 45 | 67 |
> + * -----------

...almost correct. The hw docs state that each bit-pair of the CCS maps to a
cacheline-pair, horizontally adjacent in the Y tile, of the primary surface. As
a consequence, the remainder of the above paragraph is correct for 32-bit
formats, but not others.

This is not a nitpick, because Vulkan and EGL users may want to share
dma_bufs with a fat format like R32G32B32A32_UNORM, and want to have CCS
enabled when possible. As long as the users use the dma_buf only with the 3D
engine, and don't submit it to KMS, it's all safe.

For those users, we should document the bit-pair/cacheline-pair relationship
correctly. And then preface the following detailed explanation and nice ascii
diagrams by saying "this applies only to the 32-bit formats".

Here's the relevant PRM quote:

     The Color Control Surface (CCS) contains the compression status
     of the cache-line pairs. The compression state of the cache-line
     pair is specified by 2 bits in the CCS.  Each CCS cache-line
     represents an area on the main surface of 16x16 sets of 128 byte
     Y-tiled cache-line-pairs. CCS is always Y tiled.

See this Mesa comment for more details:
https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/isl/isl.c?h=17.0#n211
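
To make the cpp dependence concrete, here is a minimal sketch (not taken from
the patch or from Mesa) of the pixel area one CCS bit-pair covers. It assumes
a Y-tile cacheline is 16 bytes wide by 4 rows, which is consistent with the
"4x4 pixels" per cacheline figure in the comment above for 32-bit formats.
The struct and function names are made up for illustration.

```c
/*
 * Assumed Y-tile geometry: one 64-byte cacheline spans 16 bytes
 * horizontally and 4 rows vertically. A CCS bit-pair covers two
 * such cachelines that sit side by side.
 */
#define YTILE_CACHELINE_WIDTH_BYTES	16
#define YTILE_CACHELINE_HEIGHT_ROWS	4

/* Hypothetical helper type: area of the main surface, in pixels. */
struct ccs_pair_area {
	int width_px;
	int height_px;
};

/*
 * Pixel area described by one CCS bit-pair for a main surface with
 * the given bytes per pixel (cpp). cpp must divide 32 evenly.
 */
static struct ccs_pair_area ccs_bit_pair_area(int cpp)
{
	struct ccs_pair_area area;

	area.width_px = 2 * YTILE_CACHELINE_WIDTH_BYTES / cpp;
	area.height_px = YTILE_CACHELINE_HEIGHT_ROWS;

	return area;
}
```

With cpp == 4 this gives the 8x4 pixel area from the comment, while a 128-bit
format such as R32G32B32A32_UNORM (cpp == 16) would get only 2x4 pixels per
bit-pair.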

> + * Actually only the lower bit of the pair seems to have any effect.
> + * No idea why. 0 in the lower bit would seem to mean not compressed,
> + * and 1 is compressed.  The interpretation of the main surface data
> + * when the block is marked compressed is unknown as of now.

If I recall correctly (it's been several months since I inspected the
bits), the high bit is actually significant. Bit pattern 11 means that
the data in the primary surface's cacheline-pair is invalid; the 3D engine
interprets the color of all pixels in that cacheline-pair to be the
clear color stored in RENDER_SURFACE_STATE. Before the display engine
can consume the surface, userspace is required to do a partial resolve,
flushing the clear color into the primary surface. So it makes sense
that the kernel would never observe that bit pattern in an incoming ccs
surface, as long as userspace is doing its job correctly. And it makes
sense that the display engine would ignore the high bit, because there is
no way to provide the clear color to the display engine (at least
according to my reading of the docs).

Jason, does this match your understanding of the high bit?
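
For anyone who wants to poke at the bits, a rough sketch of pulling the four
bit-pairs out of a CCS byte, following the 2x2 layout in the comment (bits 0-1
top left, 2-3 top right, 4-5 bottom left, 6-7 bottom right). The helper names
are hypothetical, and the meaning attached to pattern 11 is only my
recollection as described above, not something confirmed by the docs quoted
here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the 2-bit value for the block at (col, row) within the 2x2
 * grid encoded in one CCS byte. col and row must each be 0 or 1.
 */
static unsigned int ccs_byte_get_pair(uint8_t ccs_byte,
				      unsigned int col, unsigned int row)
{
	unsigned int shift = (row * 2 + col) * 2;

	return (ccs_byte >> shift) & 0x3;
}

/*
 * Assumption: pattern 11 marks a cacheline-pair whose contents are the
 * clear color, which userspace must resolve before scanout.
 */
static bool ccs_pair_is_clear(unsigned int pair)
{
	return pair == 0x3;
}

/* True if any of the four bit-pairs in the byte is the 11 pattern. */
static bool ccs_byte_needs_resolve(uint8_t ccs_byte)
{
	unsigned int col, row;

	for (row = 0; row < 2; row++)
		for (col = 0; col < 2; col++)
			if (ccs_pair_is_clear(ccs_byte_get_pair(ccs_byte, col, row)))
				return true;

	return false;
}
```

Something like this could be used to sanity check an incoming CCS in a debug
build, if we ever want to catch userspace handing over unresolved surfaces.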

> + *
> + * CCS tile is laid out in 8 byte horizontal strips each strip thus corresponds
> + * to a 128x8 pixel area of the main surface. So each 8x8 bytes of the CCS
> + * (1 cacheline) will match an area of 4x2 tiles on the main surface.
> + *
> + * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
> + * ------------------------
> + * |  0 |  64 | ... | 448 |
> + * |  1 |  65 |     | 449 |
> + * |  2 |  66 |     | 450 |
> + * |  . |   . |     |   . |
> + * |  . |   . |     |   . |
> + * |  . |   . |     |   . |
> + * | 63 | 127 |     | 511 |
> + * ------------------------
> + *
> + * This will match an area of 1024x512 pixels on the main surface.
> + *
> + * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
> + * and for the purposes of the CCS plane offset we assume cpp==1. As each
> + * byte matches a 16x8 area of the main surface, the dimensions of the CCS
> + * plane will thus appear to be width/16 x height/8. Both planes must be
> + * contained within the same gem object.

Again, the above paragraphs should clarify that they apply only to 32-bit formats.
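
For the 32-bit case, the arithmetic in that last paragraph can be sketched as
below. This is only an illustration of the numbers in the comment (one CCS
byte per 16x8 pixels, pitch a multiple of the 64 byte CCS tile width, cpp==1);
the names are invented, and rounding partial blocks up is my assumption, since
the comment just says width/16 x height/8.

```c
#include <stdint.h>

#define CCS_PIXELS_PER_BYTE_X	16	/* main surface pixels per CCS byte, horizontally */
#define CCS_PIXELS_PER_BYTE_Y	8	/* main surface pixels per CCS byte, vertically */
#define CCS_TILE_WIDTH_BYTES	64	/* CCS pitch must be a multiple of this */

/* Hypothetical helper type: apparent size of the CCS plane (cpp == 1). */
struct ccs_plane_size {
	uint32_t width;		/* in bytes */
	uint32_t height;	/* in rows */
	uint32_t pitch;		/* in bytes, tile-width aligned */
};

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* Minimal CCS plane size for a 32-bit main surface of the given size. */
static struct ccs_plane_size ccs_plane_size_for(uint32_t main_width,
						uint32_t main_height)
{
	struct ccs_plane_size size;

	size.width = div_round_up(main_width, CCS_PIXELS_PER_BYTE_X);
	size.height = div_round_up(main_height, CCS_PIXELS_PER_BYTE_Y);
	size.pitch = div_round_up(size.width, CCS_TILE_WIDTH_BYTES) *
		     CCS_TILE_WIDTH_BYTES;

	return size;
}
```

So a 1920x1080 XRGB8888 fb would have a CCS plane that appears to be 120x135
with a pitch of at least 128 bytes.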

> + */
> +#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
> +#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)

* Re: [Intel-gfx] [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-02-26 22:41   ` Chad Versace
@ 2017-02-27 15:13     ` Ville Syrjälä
  2017-02-28  5:36       ` Ben Widawsky
  0 siblings, 1 reply; 44+ messages in thread
From: Ville Syrjälä @ 2017-02-27 15:13 UTC (permalink / raw)
  To: Chad Versace, intel-gfx, Ben Widawsky, Vandana Kannan, dri-devel,
	Jason Ekstrand

On Sun, Feb 26, 2017 at 02:41:50PM -0800, Chad Versace wrote:
> On Wed 04 Jan 2017, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > SKL+ display engine can scan out certain kinds of compressed surfaces
> > produced by the render engine. This involves telling the display engine
> > the location of the color control surface (CCS) which describes which
> > parts of the main surface are compressed and which are not. The location
> > of CCS is provided by userspace as just another plane with its own offset.
> > 
> > By providing our own format information for the CCS formats, we should
> > be able to make framebuffer_check() do the right thing for the CCS
> > surface as well.
> > 
> > Note that we'll return the same format info for both Y and Yf tiled
> > format as that's what happens with the non-CCS Y vs. Yf as well. If
> > desired, we could potentially return a unique pointer for each
> > pixel_format+tiling+ccs combination, in which case we would immediately be
> > able to tell if any of that stuff changed by just comparing the
> > pointers. But that does sound a bit wasteful space wise.
> > 
> > v2: Drop the 'dev' argument from the hook
> > v3: Include the description of the CCS surface layout
> > 
> > Cc: Vandana Kannan <vandana.kannan@intel.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
> >  include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
> >  2 files changed, 85 insertions(+)
> 
> 
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > index 9e1bb7fabcde..4581e3d41e5c 100644
> > --- a/include/uapi/drm/drm_fourcc.h
> > +++ b/include/uapi/drm/drm_fourcc.h
> > @@ -230,6 +230,55 @@ extern "C" {
> >  #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
> >  
> >  /*
> > + * Intel color control surface (CCS) for render compression
> > + *
> > + * The framebuffer format must be one of the 8:8:8:8 RGB formats,
> > + * the main surface will be plane index 0 and will be Y/Yf-tiled,
> > + * the CCS will be plane index 1.
> > + *
> 
> The paragraph below is...
> 
> > + * Each byte of CCS contains 4 pairs of bits, with each pair of bits
> > + * matching an area of 8x4 pixels of the main surface. Which would seem
> > + * to match 2 cachelines containing 4x4 pixels each. The bit pairs within
> > + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
> > + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
> > + * -----------
> > + * | 01 | 23 |
> > + *  ----------
> > + * | 45 | 67 |
> > + * -----------
> 
> ...almost correct. The hw docs state that each bit-pair of the CCS maps to a
> cacheline-pair, horizontally adjacent in the Y tile, of the primary surface. As
> a consequence, the remainder of the above paragraph is correct for 32-bit
> formats, but not others.

Which is why the comment says at the very top that the fb needs to
use a 8:8:8:8 format. IIRC that's the only thing the hardware supports.

> 
> This is not a nitpick, because Vulkan and EGL users may want to share
> dma_bufs with a fat format like R32G32B32A32_UNORM, and want to have CCS
> enabled when possible. As long as the users use the dma_buf only with the 3D
> engine, and don't submit it to KMS, it's all safe.
> 
> For those users, we should document the bit-pair/cacheline-pair relationship
> correctly. And then preface the following detailed explanation and nice ascii
> diagrams by saying "this applies only to the 32-bit formats".
> 
> Here's the relevant PRM quote:
> 
>      The Color Control Surface (CCS) contains the compression status
>      of the cache-line pairs. The compression state of the cache-line
>      pair is specified by 2 bits in the CCS.  Each CCS cache-line
>      represents an area on the main surface of 16x16 sets of 128 byte
>      Y-tiled cache-line-pairs. CCS is always Y tiled.
> 
> See this Mesa comment for more details:
> https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/isl/isl.c?h=17.0#n211
> 
> > + * Actually only the lower bit of the pair seems to have any effect.
> > + * No idea why. 0 in the lower bit would seem to mean not compressed,
> > + * and 1 is compressed.  The interpretation of the main surface data
> > + * when the block is marked compressed is unknown as of now.
> 
> If I recall correctly (it's been several months since I inspected the
> bits), the high bit is actually significant. Bit pattern 11 means that
> the data in primary surface's cacheline-pair is invalid; the 3D engine
> interprets the color of all pixels in that cacheline-pair to be the
> clear color stored in RENDER_SURFACE_STATE. Before the display engine
> can consume the surface, userspace is required to do a partial resolve,
> flushing the clear color into the primary surface. So it makes sense
> that the kernel would never observe that bit pattern in an incoming ccs
> surface, as long as userspace is doing its job correctly. And it makes
> sense that the display engine would ignore the high bit, because there is
> no way to provide the clear color to the display engine (at least
> according to my reading of the docs).
> 
> Jason, does this match your understanding of the high bit?
> 
> > + *
> > + * CCS tile is laid out in 8 byte horizontal strips; each strip thus corresponds
> > + * to a 128x8 pixel area of the main surface. So each 8x8 bytes of the CCS
> > + * (1 cacheline) will match an area of 4x2 tiles on the main surface.
> > + *
> > + * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
> > + * ------------------------
> > + * |  0 |  64 | ... | 448 |
> > + * |  1 |  65 |     | 449 |
> > + * |  2 |  66 |     | 450 |
> > + * |  . |   . |     |   . |
> > + * |  . |   . |     |   . |
> > + * |  . |   . |     |   . |
> > + * | 63 | 127 |     | 511 |
> > + * ------------------------
> > + *
> > + * This will match an area of 1024x512 pixels on the main surface.
> > + *
> > + * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
> > + * and for the purposes of the CCS plane offset we assume cpp==1. As each
> > + * byte matches a 16x8 area of the main surface, the dimensions of the CCS
> > + * plane will thus appear to be width/16 x height/8. Both planes must be
> > + * contained within the same gem object.
> 
> Again, the above paragraphs should clarify that they apply only to 32-bit formats.
> 
> > + */
> > +#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
> > +#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)
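
For reference, the layout in the comment above can be written down as a
small lookup. This is only a sketch that transcribes the comment as I read
it, for the 8:8:8:8 formats; ccs_locate() and struct ccs_pos are made-up
names, not anything in the patch, and none of this has been checked
against hardware.

```c
#include <stdint.h>

/* Hypothetical helper type: where the bit pair for a main surface pixel lives. */
struct ccs_pos {
	uint32_t tile_col;     /* CCS tile column (each tile covers 1024 pixels) */
	uint32_t tile_row;     /* CCS tile row (each tile covers 512 pixels) */
	uint32_t byte_in_tile; /* byte offset inside the 4096 byte CCS tile */
	unsigned int shift;    /* bit position of the pair's lower bit in that byte */
};

/* Assumes a 32bpp main surface: one CCS byte covers 16x8 pixels. */
static struct ccs_pos ccs_locate(uint32_t px, uint32_t py)
{
	struct ccs_pos p;
	uint32_t cx = px / 16;          /* CCS "pixel" (byte) coordinates */
	uint32_t cy = py / 8;
	uint32_t tx = cx % 64;          /* position inside the 64x64 byte tile */
	uint32_t ty = cy % 64;
	/* 8 byte strips are numbered down the first column, then the next one */
	uint32_t strip = (tx / 8) * 64 + ty;
	/* 2x2 grid of bit pairs, each pair covering 8x4 pixels */
	unsigned int pair = ((py % 8) / 4) * 2 + (px % 16) / 8;

	p.tile_col = cx / 64;
	p.tile_row = cy / 64;
	p.byte_in_tile = strip * 8 + tx % 8;
	p.shift = pair * 2;
	return p;
}
```

With this, pixel (128, 0) lands in strip 64 (byte 512 of the tile), and
the last pixel covered by a tile, (1023, 511), lands in the top bit pair
of byte 4095.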

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
  2017-02-27 15:13     ` [Intel-gfx] " Ville Syrjälä
@ 2017-02-28  5:36       ` Ben Widawsky
  0 siblings, 0 replies; 44+ messages in thread
From: Ben Widawsky @ 2017-02-28  5:36 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Chad Versace, intel-gfx, dri-devel, Vandana Kannan

On 17-02-27 17:13:41, Ville Syrjälä wrote:
>On Sun, Feb 26, 2017 at 02:41:50PM -0800, Chad Versace wrote:
>> On Wed 04 Jan 2017, ville.syrjala@linux.intel.com wrote:
>> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >
>> > SKL+ display engine can scan out certain kinds of compressed surfaces
>> > produced by the render engine. This involved telling the display engine
>> > the location of the color control surface (CCS) which describes which
>> > parts of the main surface are compressed and which are not. The location
>> > of CCS is provided by userspace as just another plane with its own offset.
>> >
>> > By providing our own format information for the CCS formats, we should
>> > be able to make framebuffer_check() do the right thing for the CCS
>> > surface as well.
>> >
>> > Note that we'll return the same format info for both Y and Yf tiled
>> > format as that's what happens with the non-CCS Y vs. Yf as well. If
>> > desired, we could potentially return a unique pointer for each
>> > pixel_format+tiling+ccs combination, in which case we would immediately be
>> > able to tell if any of that stuff changed by just comparing the
>> > pointers. But that does sound a bit wasteful space wise.
>> >
>> > v2: Drop the 'dev' argument from the hook
>> > v3: Include the description of the CCS surface layout
>> >
>> > Cc: Vandana Kannan <vandana.kannan@intel.com>
>> > Cc: Daniel Vetter <daniel@ffwll.ch>
>> > Cc: Ben Widawsky <ben@bwidawsk.net>
>> > Cc: Jason Ekstrand <jason@jlekstrand.net>
>> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++
>> >  include/uapi/drm/drm_fourcc.h        | 49 ++++++++++++++++++++++++++++++++++++
>> >  2 files changed, 85 insertions(+)
>>
>>
>> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
>> > index 9e1bb7fabcde..4581e3d41e5c 100644
>> > --- a/include/uapi/drm/drm_fourcc.h
>> > +++ b/include/uapi/drm/drm_fourcc.h
>> > @@ -230,6 +230,55 @@ extern "C" {
>> >  #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
>> >
>> >  /*
>> > + * Intel color control surface (CCS) for render compression
>> > + *
>> > + * The framebuffer format must be one of the 8:8:8:8 RGB formats,
>> > + * the main surface will be plane index 0 and will be Y/Yf-tiled,
>> > + * the CCS will be plane index 1.
>> > + *
>>
>> The paragraph below is...
>>
>> > + * Each byte of CCS contains 4 pairs of bits, with each pair of bits
>> > + * matching an area of 8x4 pixels of the main surface. Which would seem
>> > + * to match 2 cachelines containing 4x4 pixels each. The bit pairs within
>> > + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the
>> > + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb):
>> > + * -----------
>> > + * | 01 | 23 |
>> > + *  ----------
>> > + * | 45 | 67 |
>> > + * -----------
>>
>> ...almost correct. The hw docs state that each bit-pair of the CCS maps to a
>> cacheline-pair, horizontally adjacent in the Y tile, of the primary surface. As
>> a consequence, the remainder of the above paragraph is correct for 32-bit
>> formats, but not others.
>
>Which is why the comment says at the very top that the fb needs to
>use a 8:8:8:8 format. IIRC that's the only thing the hardware supports.
>

It is, and it will be for the foreseeable future too. Chad, granted, you say
below that this isn't a nitpick, but it is: Ville's patch is for the KMS case,
so it's not entirely relevant here.


>>
>> This is not a nitpick, because Vulkan and EGL users may want to share
>> dma_bufs with a fat format like R32G32B32A32_UNORM, and want to have CCS
>> enabled when possible. As long as the users use the dma_buf only with the 3D
>> engine, and don't submit it to KMS, it's all safe.
>>
>> For those users, we should document the bit-pair/cacheline-pair relationship
>> correctly. And then preface the following detailed explanation and nice ascii
>> diagrams by saying "this applies only to the 32-bit formats".
>>
>> Here's the relevant PRM quote:
>>
>>      The Color Control Surface (CCS) contains the compression status
>>      of the cache-line pairs. The compression state of the cache-line
>>      pair is specified by 2 bits in the CCS.  Each CCS cache-line
>>      represents an area on the main surface of 16x16 sets of 128 byte
>>      Y-tiled cache-line-pairs. CCS is always Y tiled.
>>
>> See this Mesa comment for more details:
>> https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/isl/isl.c?h=17.0#n211
>>
>> > + * Actually only the lower bit of the pair seems to have any effect.
>> > + * No idea why. 0 in the lower bit would seem to mean not compressed,
>> > + * and 1 is compressed.  The interpretation of the main surface data
>> > + * when the block is marked compressed is unknown as of now.
>>
>> If I recall correctly (it's been several months since I inspected the
>> bits), the high bit is actually significant. Bit pattern 11 means that
>> the data in primary surface's cacheline-pair is invalid; the 3D engine
>> interprets the color of all pixels in that cacheline-pair to be the
>> clear color stored in RENDER_SURFACE_STATE. Before the display engine
>> can consume the surface, userspace is required to do a partial resolve,
>> flushing the clear color into the primary surface. So it makes sense
>> that the kernel would never observe that bit pattern in an incoming ccs
>> surface, as long as userspace is doing its job correctly. And it makes
>> sense that the display engine would ignore the high bit, because there is
>> no way to provide the clear color to the display engine (at least
>> according to my reading of the docs).
>>
>> Jason, does this match your understanding of the high bit?
>>
>> > + *
>> > + * CCS tile is laid out in 8 byte horizontal strips; each strip thus corresponds
>> > + * to a 128x8 pixel area of the main surface. So each 8x8 bytes of the CCS
>> > + * (1 cacheline) will match an area of 4x2 tiles on the main surface.
>> > + *
>> > + * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511:
>> > + * ------------------------
>> > + * |  0 |  64 | ... | 448 |
>> > + * |  1 |  65 |     | 449 |
>> > + * |  2 |  66 |     | 450 |
>> > + * |  . |   . |     |   . |
>> > + * |  . |   . |     |   . |
>> > + * |  . |   . |     |   . |
>> > + * | 63 | 127 |     | 511 |
>> > + * ------------------------
>> > + *
>> > + * This will match an area of 1024x512 pixels on the main surface.
>> > + *
>> > + * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes),
>> > + * and for the purposes of the CCS plane offset we assume cpp==1. As each
>> > + * byte matches a 16x8 area of the main surface, the dimensions of the CCS
>> > + * plane will thus appear to be width/16 x height/8. Both planes must be
>> > + * contained within the same gem object.
>>
>> Again, the above paragraphs should clarify that they apply only to 32-bit formats.
>>
>> > + */
>> > +#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
>> > +#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)
>
>-- 
>Ville Syrjälä
>Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
  2017-01-09 19:20     ` Jason Ekstrand
  2017-02-07 15:37     ` Imre Deak
@ 2017-02-28 20:18     ` Jason Ekstrand
  2017-02-28 21:00       ` Ben Widawsky
  2017-02-28 23:20       ` Ben Widawsky
  2 siblings, 2 replies; 44+ messages in thread
From: Jason Ekstrand @ 2017-02-28 20:18 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Ben Widawsky, Paulo Zanoni, intel-gfx,
	Maling list - DRI developers, Chad Versace, Vandana Kannan



On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:

> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> SKL+ display engine can scan out certain kinds of compressed surfaces
> produced by the render engine. This involved telling the display engine
> the location of the color control surface (CCS) which describes
> which parts of the main surface are compressed and which are not. The
> location of CCS is provided by userspace as just another plane with its
> own offset.
>
> Add the required stuff to validate the user provided AUX plane metadata
> and convert the user provided linear offset into something the hardware
> can consume.
>
> Due to hardware limitations we require that the main surface and
> the AUX surface (CCS) be part of the same bo. The hardware also
> makes life hard by not allowing you to provide separate x/y offsets
> for the main and AUX surfaces (except with NV12), so finding suitable
> offsets for both requires a bit of work. Assuming we still want to keep
> playing tricks with the offsets. I've just gone with a dumb "search
> backward for suitable offsets" approach, which is far from optimal,
> but it works.
>
> Also not all planes will be capable of scanning out compressed surfaces,
> and eg. 90/270 degree rotation is not supported in combination with
> decompression either.
>
> This patch may contain work from at least the following people:
> * Vandana Kannan <vandana.kannan@intel.com>
> * Daniel Vetter <daniel@ffwll.ch>
> * Ben Widawsky <ben@bwidawsk.net>
>
> v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>  4 files changed, 274 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 00970aa77afa..6849ba93f1d9 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6209,6 +6209,28 @@ enum {
>                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>
> +#define PLANE_AUX_DIST_1_A             0x701c0
> +#define PLANE_AUX_DIST_2_A             0x702c0
> +#define PLANE_AUX_DIST_1_B             0x711c0
> +#define PLANE_AUX_DIST_2_B             0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane)     \
> +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
> _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A           0x701c4
> +#define PLANE_AUX_OFFSET_2_A           0x702c4
> +#define PLANE_AUX_OFFSET_1_B           0x711c4
> +#define PLANE_AUX_OFFSET_2_B           0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)       \
> +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)       \
> +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
> _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A           0x4a000
>  #define _LGC_PALETTE_B           0x4a800
> @@ -6433,6 +6455,7 @@ enum {
>  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
>
>  #define CHICKEN_PAR1_1         _MMIO(0x42080)
> +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
>  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
>  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
>  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 38de9df0ec60..2236abebd8bc 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
> drm_framebuffer *fb, int plane)
>                         return 128;
>                 else
>                         return 512;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               if (plane == 1)
> +                       return 64;
> +               /* fall through */
>         case I915_FORMAT_MOD_Y_TILED:
>                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>                         return 128;
>                 else
>                         return 512;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               if (plane == 1)
> +                       return 64;
>

I've said it before, but reading through Ben's patches again makes me want to
be peskier about it.  I would really like the UABI to treat the CCS as if
it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
what all the docs say it is.  From the display docs for "Color Control
Surface":

"The Color Control Surface (CCS) contains the compression status of the
cache-line pairs. The
compression state of the cache-line pair is specified by 2 bits in the CCS.
Each CCS cache-line represents
an area on the main surface of 16 x16 sets of 128 byte Y-tiled
cache-line-pairs. CCS is always Y tiled."

This contains 95% of the information needed to know the relation between
the CCS and the main surface.  The other 5% (which is badly documented) is
that cache line pairs are horizontally adjacent.  This gives a relationship
in which one cache line in the CCS maps to 32x64 cache lines in the main surface.
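
To make the arithmetic concrete, here is a rough sketch of how that
relationship turns into pixel areas for a given cpp. The only assumption
beyond the quote is that a Y-tiled cache line is 16 bytes wide by 4 rows,
which is what Ville's "4x4 pixels" per cacheline implies at 4 bytes per
pixel. The function and struct names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical result type: an area of the main surface, in pixels. */
struct ccs_area {
	uint32_t width_px;
	uint32_t height_px;
};

enum {
	CL_WIDTH_BYTES = 16, /* assumed: Y-tiled cache line is 16 bytes wide... */
	CL_HEIGHT_ROWS = 4,  /* ...by 4 rows, i.e. 64 bytes */
	PAIRS_PER_CL_X = 16, /* "16x16 sets" of cache-line pairs per CCS cache line */
	PAIRS_PER_CL_Y = 16,
};

/* Area covered by one 2-bit CCS entry: two horizontally adjacent cache lines. */
static struct ccs_area ccs_pair_area(uint32_t cpp)
{
	struct ccs_area a;

	a.width_px = 2 * CL_WIDTH_BYTES / cpp;
	a.height_px = CL_HEIGHT_ROWS;
	return a;
}

/* Area covered by one 64 byte CCS cache line (16x16 cache-line pairs). */
static struct ccs_area ccs_cacheline_area(uint32_t cpp)
{
	struct ccs_area pair = ccs_pair_area(cpp);
	struct ccs_area a;

	a.width_px = PAIRS_PER_CL_X * pair.width_px;
	a.height_px = PAIRS_PER_CL_Y * pair.height_px;
	return a;
}
```

For cpp == 4 this gives 8x4 pixels per bit pair and 128x64 pixels per CCS
cache line, which agrees with the numbers in Ville's comment. For a 16
byte format the same rule gives 2x4 and 32x64, which is why the per-pixel
description only holds for the 32-bit formats.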

But it's not actually Y-tiled?  Of course not.  I've worked out the exact
tiling and it looks something like Y but isn't quite the same.  However,
this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
single-sampled MCS also each have their own tiling (Haswell MCS is
especially exotic) but the docs call all of them Y-tiled and I think the
hardware internally treats them as Y-tiled with the cache lines shuffled
around a bit.

Given the fact that they seem to like to change the MCS/CCS tiling around
on every hardware generation, I'm reluctant to base UABI on the fact that
the CCS appears to have 64x64 "pixels" per tile with each "pixel"
corresponding to 16x8 pixels in the color surface.  That's not what we had
on any previous gen and may change on gen10 for no apparent reason.  I'd
much rather base it on Y-tiling and a relationship between cache lines
which, even if they change the exact swizzle on gen10, will probably remain
the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
generation but the relationship of one MCS cache line maps to 32x128 color
cache lines remained the same).
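
As a sketch of what the two conventions mean for the numbers userspace
hands over: both describe the same 4096 byte tiles, only the shape
differs. The helper names below are made up, the 32bpp main surface is
assumed, and rounding the row count up to whole tiles is my assumption
rather than something the patch states.

```c
#include <stdint.h>

/* Hypothetical result type: CCS plane pitch in bytes and height in rows. */
struct ccs_dims {
	uint32_t pitch;
	uint32_t rows;
};

static uint32_t align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) / a * a;
}

/* Convention in the patch: one byte per 16x8 pixels, tile = 64 bytes x 64 rows. */
static struct ccs_dims ccs_dims_64x64(uint32_t width, uint32_t height)
{
	struct ccs_dims d;

	d.pitch = align_up((width + 15) / 16, 64);
	d.rows = align_up((height + 7) / 8, 64);
	return d;
}

/* Y-tile convention: the same tiles described as 128 bytes x 32 rows. */
static struct ccs_dims ccs_dims_128x32(uint32_t width, uint32_t height)
{
	struct ccs_dims d = ccs_dims_64x64(width, height);

	d.pitch *= 2; /* each tile column is twice as wide... */
	d.rows /= 2;  /* ...and each tile row half as tall */
	return d;
}
```

For a 1920x1080 fb this gives pitch 128 / 192 rows in the first
convention and pitch 256 / 96 rows in the second; the total size is the
same either way, and the factor of two on the pitch is the "divide by 2"
I mention below.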

Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
die, but I'd like to get the UABI right before we chisel it in stone.

--Jason


> +               /* fall through */
>         case I915_FORMAT_MOD_Yf_TILED:
>                 /*
>                  * Bspec seems to suggest that the Yf tile width would
> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> struct drm_framebuffer *fb,
>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>
>         /* AUX_DIST needs only 4K alignment */
> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> +       if (plane == 1)
>                 return 4096;
>
>         switch (fb->modifier) {
> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> struct drm_framebuffer *fb,
>                 if (INTEL_GEN(dev_priv) >= 9)
>                         return 256 * 1024;
>                 return 0;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 return 1 * 1024 * 1024;
> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
> fb_modifier)
>         case I915_FORMAT_MOD_X_TILED:
>                 return I915_TILING_X;
>         case I915_FORMAT_MOD_Y_TILED:
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>                 return I915_TILING_Y;
>         default:
>                 return I915_TILING_NONE;
> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> *dev_priv,
>
>                 intel_fb_offset_to_xy(&x, &y, fb, i);
>
> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i ==
> 1) {
> +                       int main_x, main_y;
> +                       int ccs_x, ccs_y;
> +
> +                       /*
> +                        * Each byte of CCS corresponds to a 16x8 area of
> the main surface, and
> +                        * each CCS tile is 64x64 bytes.
> +                        */
> +                       ccs_x = (x * 16) % (64 * 16);
> +                       ccs_y = (y * 8) % (64 * 8);
> +                       main_x = intel_fb->normal[0].x % (64 * 16);
> +                       main_y = intel_fb->normal[0].y % (64 * 8);
> +
> +                       /*
> +                        * CCS doesn't have its own x/y offset register,
> so the intra CCS tile
> +                        * x/y offsets must match between CCS and the main
> surface.
> +                        */
> +                       if (main_x != ccs_x || main_y != ccs_y) {
> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs
> %d,%d) full (main %d,%d ccs %d,%d)\n",
> +                                             main_x, main_y,
> +                                             ccs_x, ccs_y,
> +                                             intel_fb->normal[0].x,
> +                                             intel_fb->normal[0].y,
> +                                             x, y);
> +                               return -EINVAL;
> +                       }
> +               }
> +
>                 /*
>                  * The fence (if used) is aligned to the start of the
> object
>                  * so having the framebuffer wrap around across the edge
> of the
> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
> drm_framebuffer *fb, int plane,
>                         break;
>                 }
>                 break;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               /* FIXME AUX plane? */
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 switch (cpp) {
> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
> drm_framebuffer *fb, int plane,
>         return 2048;
>  }
>
> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
> *plane_state,
> +                                          int main_x, int main_y, u32
> main_offset)
> +{
> +       const struct drm_framebuffer *fb = plane_state->base.fb;
> +       int aux_x = plane_state->aux.x;
> +       int aux_y = plane_state->aux.y;
> +       u32 aux_offset = plane_state->aux.offset;
> +       u32 alignment = intel_surf_alignment(fb, 1);
> +
> +       while (aux_offset >= main_offset && aux_y <= main_y) {
> +               int x, y;
> +
> +               if (aux_x == main_x && aux_y == main_y)
> +                       break;
> +
> +               if (aux_offset == 0)
> +                       break;
> +
> +               x = aux_x / 16;
> +               y = aux_y / 8;
> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state,
> 1,
> +                                                     aux_offset,
> aux_offset - alignment);
> +               aux_x = x * 16 + aux_x % 16;
> +               aux_y = y * 8 + aux_y % 8;
> +       }
> +
> +       if (aux_x != main_x || aux_y != main_y)
> +               return false;
> +
> +       plane_state->aux.offset = aux_offset;
> +       plane_state->aux.x = aux_x;
> +       plane_state->aux.y = aux_y;
> +
> +       return true;
> +}
> +
>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>  {
>         const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
> intel_plane_state *plane_state)
>
>                 while ((x + w) * cpp > fb->pitches[0]) {
>                         if (offset == 0) {
> -                               DRM_DEBUG_KMS("Unable to find suitable
> display surface offset\n");
> +                               DRM_DEBUG_KMS("Unable to find suitable
> display surface offset due to X-tiling\n");
>                                 return -EINVAL;
>                         }
>
> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
> intel_plane_state *plane_state)
>                 }
>         }
>
> +       /*
> +        * CCS AUX surface doesn't have its own x/y offsets, we must make
> sure
> +        * they match with the main surface x/y offsets.
> +        */
> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y,
> offset)) {
> +                       if (offset == 0)
> +                               break;
> +
> +                       offset = intel_adjust_tile_offset(&x, &y,
> plane_state, 0,
> +                                                         offset, offset -
> alignment);
> +               }
> +
> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
> +                       DRM_DEBUG_KMS("Unable to find suitable display
> surface offset due to CCS\n");
> +                       return -EINVAL;
> +               }
> +       }
> +
>         plane_state->main.offset = offset;
>         plane_state->main.x = x;
>         plane_state->main.y = y;
> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct
> intel_plane_state *plane_state)
>         return 0;
>  }
>
> +static int skl_check_ccs_aux_surface(struct intel_plane_state
> *plane_state)
> +{
> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> +       int src_x = plane_state->base.src.x1 >> 16;
> +       int src_y = plane_state->base.src.y1 >> 16;
> +       int x = src_x / 16;
> +       int y = src_y / 8;
> +       u32 offset;
> +
> +       switch (plane->id) {
> +       case PLANE_PRIMARY:
> +       case PLANE_SPRITE0:
> +               break;
> +       default:
> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> +               return -EINVAL;
> +       }
> +
> +       if (crtc->pipe == PIPE_C) {
> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
> +               return -EINVAL;
> +       }
> +
> +       if (plane_state->base.rotation &&
> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180))
> {
> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation
> %x\n",
> +                             plane_state->base.rotation);
> +               return -EINVAL;
> +       }
> +
> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> +
> +       plane_state->aux.offset = offset;
> +       plane_state->aux.x = x * 16 + src_x % 16;
> +       plane_state->aux.y = y * 8 + src_y % 8;
> +
> +       return 0;
> +}
> +
>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>  {
>         const struct drm_framebuffer *fb = plane_state->base.fb;
> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct
> intel_plane_state *plane_state)
>                 ret = skl_check_nv12_aux_surface(plane_state);
>                 if (ret)
>                         return ret;
> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> +               ret = skl_check_ccs_aux_surface(plane_state);
> +               if (ret)
> +                       return ret;
>         } else {
>                 plane_state->aux.offset = ~0xfff;
>                 plane_state->aux.x = 0;
> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>                 return PLANE_CTL_TILED_X;
>         case I915_FORMAT_MOD_Y_TILED:
>                 return PLANE_CTL_TILED_Y;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>         case I915_FORMAT_MOD_Yf_TILED:
>                 return PLANE_CTL_TILED_YF;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>         default:
>                 MISSING_CASE(fb_modifier);
>         }
> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct
> drm_plane *plane,
>         u32 plane_ctl;
>         unsigned int rotation = plane_state->base.rotation;
>         u32 stride = skl_plane_stride(fb, 0, rotation);
> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>         u32 surf_addr = plane_state->main.offset;
>         int scaler_id = plane_state->scaler_id;
>         int src_x = plane_state->main.x;
> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct
> drm_plane *plane,
>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>
>         if (scaler_id >= 0) {
>                 uint32_t ps_ctrl = 0;
> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct
> intel_crtc *crtc,
>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
>                 break;
>         case PLANE_CTL_TILED_Y:
> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> +               else
> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
>                 break;
>         case PLANE_CTL_TILED_YF:
> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> +               else
> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>                 break;
>         default:
>                 MISSING_CASE(tiling);
> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc
> *intel_crtc,
>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>
>         ctl = I915_READ(PLANE_CTL(pipe, 0));
> -       ctl &= ~PLANE_CTL_TILED_MASK;
> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>         switch (fb->modifier) {
>         case DRM_FORMAT_MOD_NONE:
>                 break;
> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc
> *intel_crtc,
>         case I915_FORMAT_MOD_Y_TILED:
>                 ctl |= PLANE_CTL_TILED_Y;
>                 break;
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> +               break;
>         case I915_FORMAT_MOD_Yf_TILED:
>                 ctl |= PLANE_CTL_TILED_YF;
>                 break;
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> +               break;
>         default:
>                 MISSING_CASE(fb->modifier);
>         }
> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>                                   struct drm_i915_gem_object *obj)
>  {
>         struct drm_i915_private *dev_priv = to_i915(dev);
> +       struct drm_framebuffer *fb = &intel_fb->base;
>         unsigned int tiling = i915_gem_object_get_tiling(obj);
> -       int ret;
> -       u32 pitch_limit, stride_alignment;
> +       int ret, i;
> +       u32 pitch_limit;
>         struct drm_format_name_buf format_name;
>
>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>
>         /* Passed in modifier sanity checking. */
>         switch (mode_cmd->modifier[0]) {
> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> +               switch (mode_cmd->pixel_format) {
> +               case DRM_FORMAT_XBGR8888:
> +               case DRM_FORMAT_ABGR8888:
> +               case DRM_FORMAT_XRGB8888:
> +               case DRM_FORMAT_ARGB8888:
> +                       break;
> +               default:
> +                       DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> +                       return -EINVAL;
> +               }
> +               /* fall through */
>         case I915_FORMAT_MOD_Y_TILED:
>         case I915_FORMAT_MOD_Yf_TILED:
>                 if (INTEL_GEN(dev_priv) < 9) {
> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct
> drm_device *dev,
>         if (mode_cmd->offsets[0] != 0)
>                 return -EINVAL;
>
> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>
> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
> aligned\n",
> -                             mode_cmd->pitches[0], stride_alignment);
> -               return -EINVAL;
> +       for (i = 0; i < fb->format->num_planes; i++) {
> +               u32 stride_alignment;
> +
> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
> +                       return -EINVAL;
> +               }
> +
> +               stride_alignment = intel_fb_stride_alignment(fb, i);
> +
> +               /*
> +                * Display WA #0531: skl,bxt,kbl,glk
> +                *
> +                * Render decompression and plane width > 3840
> +                * combined with horizontal panning requires the
> +                * plane stride to be a multiple of 4. We'll just
> +                * require the entire fb to accommodate that to avoid
> +                * potential runtime errors at plane configuration time.
>

Note to Ben: Userspace needs to know about this too.  In this case, I
believe "multiple of 4" means "multiple of 4 tiles".  You've never hit this
because the standard 1920x1080 is 60 tiles wide which is a multiple of 4.


> +                */
> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +                       stride_alignment *= 4;
> +
> +               if (fb->pitches[i] & (stride_alignment - 1)) {
> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
> +                                     i, fb->pitches[i], stride_alignment);
> +                       return -EINVAL;
> +               }
>         }
>
>         intel_fb->obj = obj;
>
> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> +       ret = intel_fill_fb_info(dev_priv, fb);
>         if (ret)
>                 return ret;
>
> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>         if (ret) {
>                 DRM_ERROR("framebuffer init failed %d\n", ret);
>                 return ret;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 249623d45be0..25782cd174c0 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct
> drm_i915_private *dev_priv)
>         I915_WRITE(CHICKEN_PAR1_1,
>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>
> +       /*
> +        * Display WA#0390: skl,bxt,kbl,glk
> +        *
> +        * Must match Sampler, Pixel Back End, and Media
> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
> +        *
> +        * Including bits outside the page in the hash would
> +        * require 2 (or 4?) MiB alignment of resources. Just
> +        * assume the default hashing mode which only uses bits
> +        * within the page.
> +        */
> +       I915_WRITE(CHICKEN_PAR1_1,
> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
> +
>         I915_WRITE(GEN8_CONFIG0,
>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>
> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state
> *pstate,
>
>         /* For Non Y-tile return 8-blocks */
>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>                 return 8;
>
>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>         }
>
>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>
>         /* Display WA #1141: kbl. */
> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>         res_lines = DIV_ROUND_UP(selected_result.val,
>                                  plane_blocks_per_line.val);
>
> +       /* Display WA #1125: skl,bxt,kbl,glk */
> +       if (level == 0 &&
> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> +
> +       /* Display WA #1126: skl,bxt,kbl,glk */
>         if (level >= 1 && level <= 7) {
>                 if (y_tiled) {
>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c
> b/drivers/gpu/drm/i915/intel_sprite.c
> index 7031bc733d97..063a994815d0 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>         u32 surf_addr = plane_state->main.offset;
>         unsigned int rotation = plane_state->base.rotation;
>         u32 stride = skl_plane_stride(fb, 0, rotation);
> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>         int crtc_x = plane_state->base.dst.x1;
>         int crtc_y = plane_state->base.dst.y1;
>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>
>         /* program plane scaler */
>         if (plane_state->scaler_id >= 0) {
> --
> 2.10.2
>
>
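
The PLANE_AUX_DIST write in the quoted hunks ORs a byte distance with the AUX stride. A minimal sketch of that packing follows, assuming the low 12 bits of the register hold the stride (inferred from the "AUX_DIST needs only 4K alignment" comment, not confirmed by the patch) and using a hypothetical helper name, pack_aux_dist():

```c
#include <stdint.h>

/* Assumption: distance is 4 KiB aligned, so bits 11:0 are free for the stride. */
#define AUX_DIST_ALIGN 4096u

/*
 * Hypothetical helper, not part of the patch.
 * Returns 0 and stores the register value in *out on success,
 * or -1 if the inputs cannot be packed under the assumption above.
 */
static int pack_aux_dist(uint32_t main_offset, uint32_t aux_offset,
                         uint32_t aux_stride, uint32_t *out)
{
    uint32_t dist;

    /* The patch's search loop keeps the AUX offset at or above the main offset. */
    if (aux_offset < main_offset)
        return -1;

    dist = aux_offset - main_offset;

    /* The distance must not spill into the bits used for the stride. */
    if (dist & (AUX_DIST_ALIGN - 1))
        return -1;

    /* The stride must fit below the first distance bit. */
    if (aux_stride >= AUX_DIST_ALIGN)
        return -1;

    *out = dist | aux_stride;
    return 0;
}
```

With a main surface at 0x100000, a CCS at 0x180000 and a stride value of 8, this sketch yields 0x80008; a CCS offset that is not 4 KiB aligned relative to the main surface is rejected.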

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-02-28 20:18     ` Jason Ekstrand
@ 2017-02-28 21:00       ` Ben Widawsky
  2017-02-28 23:20       ` Ben Widawsky
  1 sibling, 0 replies; 44+ messages in thread
From: Ben Widawsky @ 2017-02-28 21:00 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Vandana Kannan, intel-gfx, Mailing list - DRI developers,
	Chad Versace, Paulo Zanoni

On 17-02-28 12:18:39, Jason Ekstrand wrote:
>On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:
>
>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>
>> SKL+ display engine can scan out certain kinds of compressed surfaces
>> produced by the render engine. This involves telling the display engine
>> the location of the color control surface (CCS) which describes
>> which parts of the main surface are compressed and which are not. The
>> location of CCS is provided by userspace as just another plane with its
>> own offset.
>>
>> Add the required stuff to validate the user provided AUX plane metadata
>> and convert the user provided linear offset into something the hardware
>> can consume.
>>
>> Due to hardware limitations we require that the main surface and
>> the AUX surface (CCS) be part of the same bo. The hardware also
>> makes life hard by not allowing you to provide separate x/y offsets
>> for the main and AUX surfaces (except with NV12), so finding suitable
>> offsets for both requires a bit of work. Assuming we still want to keep
>> playing tricks with the offsets. I've just gone with a dumb "search
>> backward for suitable offsets" approach, which is far from optimal,
>> but it works.
>>
>> Also not all planes will be capable of scanning out compressed surfaces,
>> and eg. 90/270 degree rotation is not supported in combination with
>> decompression either.
>>
>> This patch may contain work from at least the following people:
>> * Vandana Kannan <vandana.kannan@intel.com>
>> * Daniel Vetter <daniel@ffwll.ch>
>> * Ben Widawsky <ben@bwidawsk.net>
>>
>> v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
>>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Vandana Kannan <vandana.kannan@intel.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Ben Widawsky <ben@bwidawsk.net>
>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>>  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>>  4 files changed, 274 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 00970aa77afa..6849ba93f1d9 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -6209,6 +6209,28 @@ enum {
>>                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>>                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>>
>> +#define PLANE_AUX_DIST_1_A             0x701c0
>> +#define PLANE_AUX_DIST_2_A             0x702c0
>> +#define PLANE_AUX_DIST_1_B             0x711c0
>> +#define PLANE_AUX_DIST_2_B             0x712c0
>> +#define _PLANE_AUX_DIST_1(pipe) \
>> +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
>> +#define _PLANE_AUX_DIST_2(pipe) \
>> +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
>> +#define PLANE_AUX_DIST(pipe, plane)     \
>> +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
>> +
>> +#define PLANE_AUX_OFFSET_1_A           0x701c4
>> +#define PLANE_AUX_OFFSET_2_A           0x702c4
>> +#define PLANE_AUX_OFFSET_1_B           0x711c4
>> +#define PLANE_AUX_OFFSET_2_B           0x712c4
>> +#define _PLANE_AUX_OFFSET_1(pipe)       \
>> +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
>> +#define _PLANE_AUX_OFFSET_2(pipe)       \
>> +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
>> +#define PLANE_AUX_OFFSET(pipe, plane)   \
>> +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
>> +
>>  /* legacy palette */
>>  #define _LGC_PALETTE_A           0x4a000
>>  #define _LGC_PALETTE_B           0x4a800
>> @@ -6433,6 +6455,7 @@ enum {
>>  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
>>
>>  #define CHICKEN_PAR1_1         _MMIO(0x42080)
>> +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
>>  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
>>  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
>>  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
>> diff --git a/drivers/gpu/drm/i915/intel_display.c
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 38de9df0ec60..2236abebd8bc 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
>> drm_framebuffer *fb, int plane)
>>                         return 128;
>>                 else
>>                         return 512;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               if (plane == 1)
>> +                       return 64;
>> +               /* fall through */
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>>                         return 128;
>>                 else
>>                         return 512;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               if (plane == 1)
>> +                       return 64;
>>
>
>I've said it before but reading through Ben's patches again makes me want to
>be peskier about it.  I would really like the UABI to treat the CCS as if
>it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
>what all the docs say it is.  From the display docs for "Color Control
>Surface":
>
>"The Color Control Surface (CCS) contains the compression status of the
>cache-line pairs. The compression state of the cache-line pair is specified
>by 2 bits in the CCS. Each CCS cache-line represents an area on the main
>surface of 16x16 sets of 128 byte Y-tiled cache-line-pairs. CCS is always
>Y tiled."
>
>This contains 95% of the information needed to know the relation between
>the CCS and the main surface.  The other 5% (which is badly documented) is
>that cache line pairs are horizontally adjacent.  This gives a relationship
>of one cache line in the CCS maps to 32x64 cache lines in the main surface.
>
>But it's not actually Y-tiled?  Of course not.  I've worked out the exact
>tiling and it looks something like Y but isn't quite the same.  However,
>this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
>single-sampled MCS also each have their own tiling (Haswell MCS is
>especially exotic) but the docs call all of them Y-tiled and I think the
>hardware internally treats them as Y-tiled with the cache lines shuffled
>around a bit.
>
>Given the fact that they seem to like to change the MCS/CCS tiling around
>on every hardware generation, I'm reluctant to base UABI on the fact that
>the CCS appears to have 64x64 "pixels" per tile with each "pixel"
>corresponding to 16x8 pixels in the color surface.  That's not what we had
>on any previous gen and may change on gen10 for no apparent reason.  I'd
>much rather base it on Y-tiling and a relationship between cache lines
>which, even if they change the exact swizzle on gen10, will probably remain
>the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
>generation but the relationship of one MCS cache line maps to 32x128 color
>cache lines remained the same).
>
>Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
>die, but I'd like to get the UABI right before we chisel it in stone.
>
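
For reference, the geometry the patch currently encodes (each CCS byte covers a 16x8 pixel area, each CCS tile is 64x64 bytes) can be written out as plain arithmetic. This is only a sketch of the check in the quoted intel_fill_fb_info() hunk; the helper names are hypothetical and the coordinates are taken as unsigned for simplicity:

```c
#include <stdbool.h>
#include <stdint.h>

/* Geometry from the comment in the quoted intel_fill_fb_info() hunk. */
#define CCS_TILE_BYTES_W   64u  /* CCS tile width in bytes */
#define CCS_TILE_ROWS      64u  /* CCS tile height in rows */
#define CCS_PX_PER_BYTE_X  16u  /* main-surface pixels per CCS byte, horizontally */
#define CCS_PX_PER_BYTE_Y   8u  /* main-surface pixels per CCS byte, vertically */

/* Main-surface pixels covered by one CCS tile (hypothetical helpers). */
static uint32_t ccs_tile_span_x(void)
{
    return CCS_TILE_BYTES_W * CCS_PX_PER_BYTE_X;
}

static uint32_t ccs_tile_span_y(void)
{
    return CCS_TILE_ROWS * CCS_PX_PER_BYTE_Y;
}

/*
 * Mirrors the patch's test: the intra-tile x/y of the CCS, scaled to
 * main-surface pixels, must equal the intra-tile x/y of the main surface.
 * ccs_x/ccs_y are in CCS bytes/rows, main_x/main_y in pixels.
 */
static bool ccs_intra_tile_match(uint32_t main_x, uint32_t main_y,
                                 uint32_t ccs_x, uint32_t ccs_y)
{
    uint32_t span_x = ccs_tile_span_x();
    uint32_t span_y = ccs_tile_span_y();

    return (ccs_x * CCS_PX_PER_BYTE_X) % span_x == main_x % span_x &&
           (ccs_y * CCS_PX_PER_BYTE_Y) % span_y == main_y % span_y;
}
```

One CCS tile therefore spans 1024x512 main-surface pixels under this description. A 64B x 64 row tile and a 128B x 32 row tile both hold 4096 bytes, so the two descriptions differ in how the pitch is expressed rather than in tile size.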
>--Jason
>
>
>> +               /* fall through */
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 /*
>>                  * Bspec seems to suggest that the Yf tile width would
>> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
>> struct drm_framebuffer *fb,
>>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>>
>>         /* AUX_DIST needs only 4K alignment */
>> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>> +       if (plane == 1)
>>                 return 4096;
>>
>>         switch (fb->modifier) {
>> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
>> struct drm_framebuffer *fb,
>>                 if (INTEL_GEN(dev_priv) >= 9)
>>                         return 256 * 1024;
>>                 return 0;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 return 1 * 1024 * 1024;
>> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
>> fb_modifier)
>>         case I915_FORMAT_MOD_X_TILED:
>>                 return I915_TILING_X;
>>         case I915_FORMAT_MOD_Y_TILED:
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>>                 return I915_TILING_Y;
>>         default:
>>                 return I915_TILING_NONE;
>> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
>> *dev_priv,
>>
>>                 intel_fb_offset_to_xy(&x, &y, fb, i);
>>
>> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i ==
>> 1) {
>> +                       int main_x, main_y;
>> +                       int ccs_x, ccs_y;
>> +
>> +                       /*
>> +                        * Each byte of CCS corresponds to a 16x8 area of
>> the main surface, and
>> +                        * each CCS tile is 64x64 bytes.
>> +                        */
>> +                       ccs_x = (x * 16) % (64 * 16);
>> +                       ccs_y = (y * 8) % (64 * 8);
>> +                       main_x = intel_fb->normal[0].x % (64 * 16);
>> +                       main_y = intel_fb->normal[0].y % (64 * 8);
>> +
>> +                       /*
>> +                        * CCS doesn't have its own x/y offset register,
>> so the intra CCS tile
>> +                        * x/y offsets must match between CCS and the main
>> surface.
>> +                        */
>> +                       if (main_x != ccs_x || main_y != ccs_y) {
>> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
>> +                                             main_x, main_y,
>> +                                             ccs_x, ccs_y,
>> +                                             intel_fb->normal[0].x,
>> +                                             intel_fb->normal[0].y,
>> +                                             x, y);
>> +                               return -EINVAL;
>> +                       }
>> +               }
>> +
>>                 /*
>>                  * The fence (if used) is aligned to the start of the
>> object
>>                  * so having the framebuffer wrap around across the edge
>> of the
>> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
>> drm_framebuffer *fb, int plane,
>>                         break;
>>                 }
>>                 break;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               /* FIXME AUX plane? */
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 switch (cpp) {
>> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
>> drm_framebuffer *fb, int plane,
>>         return 2048;
>>  }
>>
>> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
>> *plane_state,
>> +                                          int main_x, int main_y, u32
>> main_offset)
>> +{
>> +       const struct drm_framebuffer *fb = plane_state->base.fb;
>> +       int aux_x = plane_state->aux.x;
>> +       int aux_y = plane_state->aux.y;
>> +       u32 aux_offset = plane_state->aux.offset;
>> +       u32 alignment = intel_surf_alignment(fb, 1);
>> +
>> +       while (aux_offset >= main_offset && aux_y <= main_y) {
>> +               int x, y;
>> +
>> +               if (aux_x == main_x && aux_y == main_y)
>> +                       break;
>> +
>> +               if (aux_offset == 0)
>> +                       break;
>> +
>> +               x = aux_x / 16;
>> +               y = aux_y / 8;
>> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state,
>> 1,
>> +                                                     aux_offset,
>> aux_offset - alignment);
>> +               aux_x = x * 16 + aux_x % 16;
>> +               aux_y = y * 8 + aux_y % 8;
>> +       }
>> +
>> +       if (aux_x != main_x || aux_y != main_y)
>> +               return false;
>> +
>> +       plane_state->aux.offset = aux_offset;
>> +       plane_state->aux.x = aux_x;
>> +       plane_state->aux.y = aux_y;
>> +
>> +       return true;
>> +}
>> +
>>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>>  {
>>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
>> intel_plane_state *plane_state)
>>
>>                 while ((x + w) * cpp > fb->pitches[0]) {
>>                         if (offset == 0) {
>> -                               DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
>> +                               DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
>>                                 return -EINVAL;
>>                         }
>>
>> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
>> intel_plane_state *plane_state)
>>                 }
>>         }
>>
>> +       /*
>> +        * CCS AUX surface doesn't have its own x/y offsets, we must make
>> sure
>> +        * they match with the main surface x/y offsets.
>> +        */
>> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y,
>> offset)) {
>> +                       if (offset == 0)
>> +                               break;
>> +
>> +                       offset = intel_adjust_tile_offset(&x, &y,
>> plane_state, 0,
>> +                                                         offset, offset -
>> alignment);
>> +               }
>> +
>> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
>> +                       DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
>> +                       return -EINVAL;
>> +               }
>> +       }
>> +
>>         plane_state->main.offset = offset;
>>         plane_state->main.x = x;
>>         plane_state->main.y = y;
>> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct
>> intel_plane_state *plane_state)
>>         return 0;
>>  }
>>
>> +static int skl_check_ccs_aux_surface(struct intel_plane_state
>> *plane_state)
>> +{
>> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
>> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
>> +       int src_x = plane_state->base.src.x1 >> 16;
>> +       int src_y = plane_state->base.src.y1 >> 16;
>> +       int x = src_x / 16;
>> +       int y = src_y / 8;
>> +       u32 offset;
>> +
>> +       switch (plane->id) {
>> +       case PLANE_PRIMARY:
>> +       case PLANE_SPRITE0:
>> +               break;
>> +       default:
>> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (crtc->pipe == PIPE_C) {
>> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (plane_state->base.rotation &&
>> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180))
>> {
>> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
>> +                             plane_state->base.rotation);
>> +               return -EINVAL;
>> +       }
>> +
>> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
>> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
>> +
>> +       plane_state->aux.offset = offset;
>> +       plane_state->aux.x = x * 16 + src_x % 16;
>> +       plane_state->aux.y = y * 8 + src_y % 8;
>> +
>> +       return 0;
>> +}
>> +
>>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>>  {
>>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct
>> intel_plane_state *plane_state)
>>                 ret = skl_check_nv12_aux_surface(plane_state);
>>                 if (ret)
>>                         return ret;
>> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> +               ret = skl_check_ccs_aux_surface(plane_state);
>> +               if (ret)
>> +                       return ret;
>>         } else {
>>                 plane_state->aux.offset = ~0xfff;
>>                 plane_state->aux.x = 0;
>> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>>                 return PLANE_CTL_TILED_X;
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 return PLANE_CTL_TILED_Y;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 return PLANE_CTL_TILED_YF;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>>         default:
>>                 MISSING_CASE(fb_modifier);
>>         }
>> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct
>> drm_plane *plane,
>>         u32 plane_ctl;
>>         unsigned int rotation = plane_state->base.rotation;
>>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>>         u32 surf_addr = plane_state->main.offset;
>>         int scaler_id = plane_state->scaler_id;
>>         int src_x = plane_state->main.x;
>> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct
>> drm_plane *plane,
>>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>>
>>         if (scaler_id >= 0) {
>>                 uint32_t ps_ctrl = 0;
>> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct
>> intel_crtc *crtc,
>>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
>>                 break;
>>         case PLANE_CTL_TILED_Y:
>> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
>> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
>> +               else
>> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
>>                 break;
>>         case PLANE_CTL_TILED_YF:
>> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
>> +               else
>> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>>                 break;
>>         default:
>>                 MISSING_CASE(tiling);
>> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc
>> *intel_crtc,
>>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>>
>>         ctl = I915_READ(PLANE_CTL(pipe, 0));
>> -       ctl &= ~PLANE_CTL_TILED_MASK;
>> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>>         switch (fb->modifier) {
>>         case DRM_FORMAT_MOD_NONE:
>>                 break;
>> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc
>> *intel_crtc,
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 ctl |= PLANE_CTL_TILED_Y;
>>                 break;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>> +               break;
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 ctl |= PLANE_CTL_TILED_YF;
>>                 break;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>> +               break;
>>         default:
>>                 MISSING_CASE(fb->modifier);
>>         }
>> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>                                   struct drm_i915_gem_object *obj)
>>  {
>>         struct drm_i915_private *dev_priv = to_i915(dev);
>> +       struct drm_framebuffer *fb = &intel_fb->base;
>>         unsigned int tiling = i915_gem_object_get_tiling(obj);
>> -       int ret;
>> -       u32 pitch_limit, stride_alignment;
>> +       int ret, i;
>> +       u32 pitch_limit;
>>         struct drm_format_name_buf format_name;
>>
>>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>
>>         /* Passed in modifier sanity checking. */
>>         switch (mode_cmd->modifier[0]) {
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               switch (mode_cmd->pixel_format) {
>> +               case DRM_FORMAT_XBGR8888:
>> +               case DRM_FORMAT_ABGR8888:
>> +               case DRM_FORMAT_XRGB8888:
>> +               case DRM_FORMAT_ARGB8888:
>> +                       break;
>> +               default:
>> +                       DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
>> +                       return -EINVAL;
>> +               }
>> +               /* fall through */
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 if (INTEL_GEN(dev_priv) < 9) {
>> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>         if (mode_cmd->offsets[0] != 0)
>>                 return -EINVAL;
>>
>> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
>> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>>
>> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
>> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
>> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
>> aligned\n",
>> -                             mode_cmd->pitches[0], stride_alignment);
>> -               return -EINVAL;
>> +       for (i = 0; i < fb->format->num_planes; i++) {
>> +               u32 stride_alignment;
>> +
>> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
>> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
>> +                       return -EINVAL;
>> +               }
>> +
>> +               stride_alignment = intel_fb_stride_alignment(fb, i);
>> +
>> +               /*
>> +                * Display WA #0531: skl,bxt,kbl,glk
>> +                *
>> +                * Render decompression and plane width > 3840
>> +                * combined with horizontal panning requires the
>> +                * plane stride to be a multiple of 4. We'll just
>> +                * require the entire fb to accommodate that to avoid
>> +                * potential runtime errors at plane configuration time.
>>
>
>Note to Ben: Userspace needs to know about this too.  In this case, I
>believe "multiple of 4" means "multiple of 4 tiles".  You've never hit this
>because the standard 1920x1080 is 60 tiles wide which is a multiple of 4.
>
>

I'm more likely not hitting it because the width must be > 3840. Okay, I'll
modify my patch - I suppose I can test this with Daniel's Weston branch,
otherwise I'm not sure how I'd hit it.
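
For reference, the arithmetic behind both remarks can be sketched roughly as below. This is only an illustration, assuming a 4 bytes-per-pixel format and a 128-byte-wide Y tile as the base stride alignment; the helper names width_in_tiles() and pitch_ok() are made up for this example and are not part of the patch.

```c
#include <stdbool.h>

/* Assumption: Y-tile width in bytes, used as the base stride alignment. */
#define Y_TILE_WIDTH_BYTES 128u

/* Hypothetical helper: width of a tightly packed row in Y tiles, rounded up. */
static unsigned int width_in_tiles(unsigned int width_px, unsigned int cpp)
{
	return (width_px * cpp + Y_TILE_WIDTH_BYTES - 1) / Y_TILE_WIDTH_BYTES;
}

/*
 * Hypothetical mirror of the check in the quoted hunk: the alignment is
 * multiplied by 4 only for CCS framebuffers wider than 3840 pixels.
 */
static bool pitch_ok(unsigned int width_px, unsigned int pitch_bytes, bool ccs)
{
	unsigned int stride_alignment = Y_TILE_WIDTH_BYTES;

	if (ccs && width_px > 3840)
		stride_alignment *= 4;

	/* Same power-of-two mask test as the patch uses. */
	return (pitch_bytes & (stride_alignment - 1)) == 0;
}
```

Under these assumptions a 1920 pixel wide fb is 7680 bytes per row, which is the 60 tiles mentioned above, and it never takes the stricter path because its width is not > 3840. A 4000 pixel wide CCS fb with a tightly packed pitch of 16000 bytes (125 tiles) would be rejected, while 4096 pixels with a 16384 byte pitch (128 tiles) would pass.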

>> +                */
>> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
>> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> +                       stride_alignment *= 4;
>> +
>> +               if (fb->pitches[i] & (stride_alignment - 1)) {
>> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at
>> least %u byte aligned\n",
>> +                                     i, fb->pitches[i], stride_alignment);
>> +                       return -EINVAL;
>> +               }
>>         }
>>
>>         intel_fb->obj = obj;
>>
>> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
>> +       ret = intel_fill_fb_info(dev_priv, fb);
>>         if (ret)
>>                 return ret;
>>
>> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
>> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>>         if (ret) {
>>                 DRM_ERROR("framebuffer init failed %d\n", ret);
>>                 return ret;
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 249623d45be0..25782cd174c0 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct
>> drm_i915_private *dev_priv)
>>         I915_WRITE(CHICKEN_PAR1_1,
>>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>>
>> +       /*
>> +        * Display WA#0390: skl,bxt,kbl,glk
>> +        *
>> +        * Must match Sampler, Pixel Back End, and Media
>> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
>> +        *
>> +        * Including bits outside the page in the hash would
>> +        * require 2 (or 4?) MiB alignment of resources. Just
>> +        * assume the default hashing mode which only uses bits
>> +        * within the page.
>> +        */
>> +       I915_WRITE(CHICKEN_PAR1_1,
>> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
>> +
>>         I915_WRITE(GEN8_CONFIG0,
>>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>>
>> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state
>> *pstate,
>>
>>         /* For Non Y-tile return 8-blocks */
>>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
>> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
>> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
>> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
>> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>>                 return 8;
>>
>>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
>> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>         }
>>
>>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
>> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>>
>>         /* Display WA #1141: kbl. */
>> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>         res_lines = DIV_ROUND_UP(selected_result.val,
>>                                  plane_blocks_per_line.val);
>>
>> +       /* Display WA #1125: skl,bxt,kbl,glk */
>> +       if (level == 0 &&
>> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> +
>> +       /* Display WA #1126: skl,bxt,kbl,glk */
>>         if (level >= 1 && level <= 7) {
>>                 if (y_tiled) {
>>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> diff --git a/drivers/gpu/drm/i915/intel_sprite.c
>> b/drivers/gpu/drm/i915/intel_sprite.c
>> index 7031bc733d97..063a994815d0 100644
>> --- a/drivers/gpu/drm/i915/intel_sprite.c
>> +++ b/drivers/gpu/drm/i915/intel_sprite.c
>> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>>         u32 surf_addr = plane_state->main.offset;
>>         unsigned int rotation = plane_state->base.rotation;
>>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>>         int crtc_x = plane_state->base.dst.x1;
>>         int crtc_y = plane_state->base.dst.y1;
>>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
>> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>>
>>         /* program plane scaler */
>>         if (plane_state->scaler_id >= 0) {
>> --
>> 2.10.2
>>
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-02-28 20:18     ` Jason Ekstrand
  2017-02-28 21:00       ` Ben Widawsky
@ 2017-02-28 23:20       ` Ben Widawsky
  2017-03-01 10:51         ` Ville Syrjälä
  1 sibling, 1 reply; 44+ messages in thread
From: Ben Widawsky @ 2017-02-28 23:20 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Vandana Kannan, intel-gfx, Mailing list - DRI developers,
	Chad Versace, Paulo Zanoni

On 17-02-28 12:18:39, Jason Ekstrand wrote:
>On Thu, Jan 5, 2017 at 7:14 AM, <ville.syrjala@linux.intel.com> wrote:
>
>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>
>> SKL+ display engine can scan out certain kinds of compressed surfaces
>> produced by the render engine. This involves telling the display engine
>> the location of the color control surface (CCS) which describes
>> which parts of the main surface are compressed and which are not. The
>> location of CCS is provided by userspace as just another plane with its
>> own offset.
>>
>> Add the required stuff to validate the user provided AUX plane metadata
>> and convert the user provided linear offset into something the hardware
>> can consume.
>>
>> Due to hardware limitations we require that the main surface and
>> the AUX surface (CCS) be part of the same bo. The hardware also
>> makes life hard by not allowing you to provide separate x/y offsets
>> for the main and AUX surfaces (except with NV12), so finding suitable
>> offsets for both requires a bit of work. Assuming we still want to keep
>> playing tricks with the offsets. I've just gone with a dumb "search
>> backward for suitable offsets" approach, which is far from optimal,
>> but it works.
>>
>> Also not all planes will be capable of scanning out compressed surfaces,
>> and eg. 90/270 degree rotation is not supported in combination with
>> decompression either.
>>
>> This patch may contain work from at least the following people:
>> * Vandana Kannan <vandana.kannan@intel.com>
>> * Daniel Vetter <daniel@ffwll.ch>
>> * Ben Widawsky <ben@bwidawsk.net>
>>
>> v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
>>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Vandana Kannan <vandana.kannan@intel.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Ben Widawsky <ben@bwidawsk.net>
>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_reg.h      |  23 ++++
>>  drivers/gpu/drm/i915/intel_display.c | 234 ++++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/i915/intel_pm.c      |  29 ++++-
>>  drivers/gpu/drm/i915/intel_sprite.c  |   5 +
>>  4 files changed, 274 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 00970aa77afa..6849ba93f1d9 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -6209,6 +6209,28 @@ enum {
>>                         _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>>                         _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>>
>> +#define PLANE_AUX_DIST_1_A             0x701c0
>> +#define PLANE_AUX_DIST_2_A             0x702c0
>> +#define PLANE_AUX_DIST_1_B             0x711c0
>> +#define PLANE_AUX_DIST_2_B             0x712c0
>> +#define _PLANE_AUX_DIST_1(pipe) \
>> +                       _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
>> +#define _PLANE_AUX_DIST_2(pipe) \
>> +                       _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
>> +#define PLANE_AUX_DIST(pipe, plane)     \
>> +       _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe),
>> _PLANE_AUX_DIST_2(pipe))
>> +
>> +#define PLANE_AUX_OFFSET_1_A           0x701c4
>> +#define PLANE_AUX_OFFSET_2_A           0x702c4
>> +#define PLANE_AUX_OFFSET_1_B           0x711c4
>> +#define PLANE_AUX_OFFSET_2_B           0x712c4
>> +#define _PLANE_AUX_OFFSET_1(pipe)       \
>> +               _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
>> +#define _PLANE_AUX_OFFSET_2(pipe)       \
>> +               _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
>> +#define PLANE_AUX_OFFSET(pipe, plane)   \
>> +       _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe),
>> _PLANE_AUX_OFFSET_2(pipe))
>> +
>>  /* legacy palette */
>>  #define _LGC_PALETTE_A           0x4a000
>>  #define _LGC_PALETTE_B           0x4a800
>> @@ -6433,6 +6455,7 @@ enum {
>>  # define CHICKEN3_DGMG_DONE_FIX_DISABLE                (1 << 2)
>>
>>  #define CHICKEN_PAR1_1         _MMIO(0x42080)
>> +#define  SKL_RC_HASH_OUTSIDE   (1 << 15)
>>  #define  DPA_MASK_VBLANK_SRD   (1 << 15)
>>  #define  FORCE_ARB_IDLE_PLANES (1 << 14)
>>  #define  SKL_EDP_PSR_FIX_RDWRAP        (1 << 3)
>> diff --git a/drivers/gpu/drm/i915/intel_display.c
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 38de9df0ec60..2236abebd8bc 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -2064,11 +2064,19 @@ intel_tile_width_bytes(const struct
>> drm_framebuffer *fb, int plane)
>>                         return 128;
>>                 else
>>                         return 512;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               if (plane == 1)
>> +                       return 64;
>> +               /* fall through */
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
>>                         return 128;
>>                 else
>>                         return 512;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               if (plane == 1)
>> +                       return 64;
>>
>
>I've said it before but reading through Ben's patches again makes me want to
>be peskier about it.  I would really like the UABI to treat the CCS as if
>it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
>what all the docs say it is.  From the display docs for "Color Control
>Surface":
>
>"The Color Control Surface (CCS) contains the compression status of the
>cache-line pairs. The
>compression state of the cache-line pair is specified by 2 bits in the CCS.
>Each CCS cache-line represents
>an area on the main surface of 16 x16 sets of 128 byte Y-tiled
>cache-line-pairs. CCS is always Y tiled."
>
>This contains 95% of the information needed to know the relation between
>the CCS and the main surface.  The other 5% (which is badly documented) is
>that cache line pairs are horizontally adjacent.  This gives a relationship
>of one cache line in the CCS maps to 32x64 cache lines in the main surface.
>
>But it's not actually Y-tiled?  Of course not.  I've worked out the exact
>tiling and it looks something like Y but isn't quite the same.  However,
>this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
>single-sampled MCS also each have their own tiling (Haswell MCS is
>especially exotic) but the docs call all of them Y-tiled and I think the
>hardware internally treats them as Y-tiled with the cache lines shuffled
>around a bit.
>
>Given the fact that they seem to like to change the MCS/CCS tiling around
>on every hardware generation, I'm reluctant to base UABI on the fact that
>the CCS appears to have 64x64 "pixels" per tile with each "pixel"
>corresponding to 16x8 pixels in the color surface.  That's not what we had
>on any previous gen and may change on gen10 for no apparent reason.  I'd
>much rather base it on Y-tiling and a relationship between cache lines
>which, even if they change the exact swizzle on gen10, will probably remain
>the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
>generation but the relationship of one MCS cache line maps to 32x128 color
>cache lines remained the same).
>
>Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
>die, but I'd like to get the UABI right before we chisel it in stone.
>
>--Jason
>
>

I vote we make the stride in units of tiles when we have the CCS modifier.
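
To make the difference concrete, here is a rough sketch of how the CCS pitch would come out under the two conventions discussed above. It takes the numbers from the patch comment (each CCS byte covers a 16x8 area of the main surface, each CCS tile is 64x64 bytes) and from the 128B x 32 rows description; the macro and function names are invented for this example, and the whole thing is only my reading of the thread, not something the patch implements.

```c
#include <stdint.h>

/* From the patch comment: one CCS byte covers 16 pixels horizontally. */
#define CCS_PX_PER_BYTE_X	16u
/* CCS tile width in bytes as the patch returns it for plane 1. */
#define CCS_TILE_W_PATCH	64u
/* CCS tile width in bytes if the CCS is treated as plain Y-tiled. */
#define CCS_TILE_W_YTILED	128u

/* Hypothetical helper: CCS tiles needed to cover one row of the main surface. */
static uint32_t ccs_tiles_per_row(uint32_t main_width_px)
{
	/* One CCS tile spans 64 * 16 = 1024 main surface pixels horizontally. */
	uint32_t px_per_tile = CCS_TILE_W_PATCH * CCS_PX_PER_BYTE_X;

	return (main_width_px + px_per_tile - 1) / px_per_tile;
}

/* Hypothetical helper: byte pitch for a given tile count and tile width. */
static uint32_t ccs_pitch_bytes(uint32_t tiles, uint32_t tile_width_bytes)
{
	return tiles * tile_width_bytes;
}
```

With these assumptions a 1920 pixel wide main surface needs 2 CCS tiles per row. Expressed in bytes that is 128 with the 64-byte-wide tile and 256 with the 128-byte-wide tile, which is the factor of 2 mentioned above; expressed in tiles it is 2 either way.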

>> +               /* fall through */
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 /*
>>                  * Bspec seems to suggest that the Yf tile width would
>> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
>> struct drm_framebuffer *fb,
>>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>>
>>         /* AUX_DIST needs only 4K alignment */
>> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>> +       if (plane == 1)
>>                 return 4096;
>>
>>         switch (fb->modifier) {
>> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
>> struct drm_framebuffer *fb,
>>                 if (INTEL_GEN(dev_priv) >= 9)
>>                         return 256 * 1024;
>>                 return 0;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 return 1 * 1024 * 1024;
>> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
>> fb_modifier)
>>         case I915_FORMAT_MOD_X_TILED:
>>                 return I915_TILING_X;
>>         case I915_FORMAT_MOD_Y_TILED:
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>>                 return I915_TILING_Y;
>>         default:
>>                 return I915_TILING_NONE;
>> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
>> *dev_priv,
>>
>>                 intel_fb_offset_to_xy(&x, &y, fb, i);
>>
>> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i ==
>> 1) {
>> +                       int main_x, main_y;
>> +                       int ccs_x, ccs_y;
>> +
>> +                       /*
>> +                        * Each byte of CCS corresponds to a 16x8 area of
>> the main surface, and
>> +                        * each CCS tile is 64x64 bytes.
>> +                        */
>> +                       ccs_x = (x * 16) % (64 * 16);
>> +                       ccs_y = (y * 8) % (64 * 8);
>> +                       main_x = intel_fb->normal[0].x % (64 * 16);
>> +                       main_y = intel_fb->normal[0].y % (64 * 8);
>> +
>> +                       /*
>> +                        * CCS doesn't have its own x/y offset register,
>> so the intra CCS tile
>> +                        * x/y offsets must match between CCS and the main
>> surface.
>> +                        */
>> +                       if (main_x != ccs_x || main_y != ccs_y) {
>> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs
>> %d,%d) full (main %d,%d ccs %d,%d)\n",
>> +                                             main_x, main_y,
>> +                                             ccs_x, ccs_y,
>> +                                             intel_fb->normal[0].x,
>> +                                             intel_fb->normal[0].y,
>> +                                             x, y);
>> +                               return -EINVAL;
>> +                       }
>> +               }
>> +
>>                 /*
>>                  * The fence (if used) is aligned to the start of the
>> object
>>                  * so having the framebuffer wrap around across the edge
>> of the
>> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
>> drm_framebuffer *fb, int plane,
>>                         break;
>>                 }
>>                 break;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               /* FIXME AUX plane? */
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 switch (cpp) {
>> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
>> drm_framebuffer *fb, int plane,
>>         return 2048;
>>  }
>>
>> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
>> *plane_state,
>> +                                          int main_x, int main_y, u32
>> main_offset)
>> +{
>> +       const struct drm_framebuffer *fb = plane_state->base.fb;
>> +       int aux_x = plane_state->aux.x;
>> +       int aux_y = plane_state->aux.y;
>> +       u32 aux_offset = plane_state->aux.offset;
>> +       u32 alignment = intel_surf_alignment(fb, 1);
>> +
>> +       while (aux_offset >= main_offset && aux_y <= main_y) {
>> +               int x, y;
>> +
>> +               if (aux_x == main_x && aux_y == main_y)
>> +                       break;
>> +
>> +               if (aux_offset == 0)
>> +                       break;
>> +
>> +               x = aux_x / 16;
>> +               y = aux_y / 8;
>> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state,
>> 1,
>> +                                                     aux_offset,
>> aux_offset - alignment);
>> +               aux_x = x * 16 + aux_x % 16;
>> +               aux_y = y * 8 + aux_y % 8;
>> +       }
>> +
>> +       if (aux_x != main_x || aux_y != main_y)
>> +               return false;
>> +
>> +       plane_state->aux.offset = aux_offset;
>> +       plane_state->aux.x = aux_x;
>> +       plane_state->aux.y = aux_y;
>> +
>> +       return true;
>> +}
>> +
>>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>>  {
>>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
>> intel_plane_state *plane_state)
>>
>>                 while ((x + w) * cpp > fb->pitches[0]) {
>>                         if (offset == 0) {
>> -                               DRM_DEBUG_KMS("Unable to find suitable
>> display surface offset\n");
>> +                               DRM_DEBUG_KMS("Unable to find suitable
>> display surface offset due to X-tiling\n");
>>                                 return -EINVAL;
>>                         }
>>
>> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
>> intel_plane_state *plane_state)
>>                 }
>>         }
>>
>> +       /*
>> +        * CCS AUX surface doesn't have its own x/y offsets, we must make
>> sure
>> +        * they match with the main surface x/y offsets.
>> +        */
>> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y,
>> offset)) {
>> +                       if (offset == 0)
>> +                               break;
>> +
>> +                       offset = intel_adjust_tile_offset(&x, &y,
>> plane_state, 0,
>> +                                                         offset, offset -
>> alignment);
>> +               }
>> +
>> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
>> +                       DRM_DEBUG_KMS("Unable to find suitable display
>> surface offset due to CCS\n");
>> +                       return -EINVAL;
>> +               }
>> +       }
>> +
>>         plane_state->main.offset = offset;
>>         plane_state->main.x = x;
>>         plane_state->main.y = y;
>> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct
>> intel_plane_state *plane_state)
>>         return 0;
>>  }
>>
>> +static int skl_check_ccs_aux_surface(struct intel_plane_state
>> *plane_state)
>> +{
>> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
>> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
>> +       int src_x = plane_state->base.src.x1 >> 16;
>> +       int src_y = plane_state->base.src.y1 >> 16;
>> +       int x = src_x / 16;
>> +       int y = src_y / 8;
>> +       u32 offset;
>> +
>> +       switch (plane->id) {
>> +       case PLANE_PRIMARY:
>> +       case PLANE_SPRITE0:
>> +               break;
>> +       default:
>> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (crtc->pipe == PIPE_C) {
>> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (plane_state->base.rotation &&
>> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180))
>> {
>> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation
>> %x\n",
>> +                             plane_state->base.rotation);
>> +               return -EINVAL;
>> +       }
>> +
>> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
>> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
>> +
>> +       plane_state->aux.offset = offset;
>> +       plane_state->aux.x = x * 16 + src_x % 16;
>> +       plane_state->aux.y = y * 8 + src_y % 8;
>> +
>> +       return 0;
>> +}
>> +
>>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>>  {
>>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct
>> intel_plane_state *plane_state)
>>                 ret = skl_check_nv12_aux_surface(plane_state);
>>                 if (ret)
>>                         return ret;
>> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> +               ret = skl_check_ccs_aux_surface(plane_state);
>> +               if (ret)
>> +                       return ret;
>>         } else {
>>                 plane_state->aux.offset = ~0xfff;
>>                 plane_state->aux.x = 0;
>> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>>                 return PLANE_CTL_TILED_X;
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 return PLANE_CTL_TILED_Y;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 return PLANE_CTL_TILED_YF;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>>         default:
>>                 MISSING_CASE(fb_modifier);
>>         }
>> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct
>> drm_plane *plane,
>>         u32 plane_ctl;
>>         unsigned int rotation = plane_state->base.rotation;
>>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>>         u32 surf_addr = plane_state->main.offset;
>>         int scaler_id = plane_state->scaler_id;
>>         int src_x = plane_state->main.x;
>> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct
>> drm_plane *plane,
>>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>>
>>         if (scaler_id >= 0) {
>>                 uint32_t ps_ctrl = 0;
>> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct
>> intel_crtc *crtc,
>>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
>>                 break;
>>         case PLANE_CTL_TILED_Y:
>> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
>> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
>> +               else
>> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
>>                 break;
>>         case PLANE_CTL_TILED_YF:
>> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
>> +               else
>> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>>                 break;
>>         default:
>>                 MISSING_CASE(tiling);
>> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc
>> *intel_crtc,
>>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>>
>>         ctl = I915_READ(PLANE_CTL(pipe, 0));
>> -       ctl &= ~PLANE_CTL_TILED_MASK;
>> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>>         switch (fb->modifier) {
>>         case DRM_FORMAT_MOD_NONE:
>>                 break;
>> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc
>> *intel_crtc,
>>         case I915_FORMAT_MOD_Y_TILED:
>>                 ctl |= PLANE_CTL_TILED_Y;
>>                 break;
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>> +               break;
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 ctl |= PLANE_CTL_TILED_YF;
>>                 break;
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>> +               break;
>>         default:
>>                 MISSING_CASE(fb->modifier);
>>         }
>> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>                                   struct drm_i915_gem_object *obj)
>>  {
>>         struct drm_i915_private *dev_priv = to_i915(dev);
>> +       struct drm_framebuffer *fb = &intel_fb->base;
>>         unsigned int tiling = i915_gem_object_get_tiling(obj);
>> -       int ret;
>> -       u32 pitch_limit, stride_alignment;
>> +       int ret, i;
>> +       u32 pitch_limit;
>>         struct drm_format_name_buf format_name;
>>
>>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>
>>         /* Passed in modifier sanity checking. */
>>         switch (mode_cmd->modifier[0]) {
>> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> +               switch (mode_cmd->pixel_format) {
>> +               case DRM_FORMAT_XBGR8888:
>> +               case DRM_FORMAT_ABGR8888:
>> +               case DRM_FORMAT_XRGB8888:
>> +               case DRM_FORMAT_ARGB8888:
>> +                       break;
>> +               default:
>> +                       DRM_DEBUG_KMS("RC supported only with RGB8888
>> formats\n");
>> +                       return -EINVAL;
>> +               }
>> +               /* fall through */
>>         case I915_FORMAT_MOD_Y_TILED:
>>         case I915_FORMAT_MOD_Yf_TILED:
>>                 if (INTEL_GEN(dev_priv) < 9) {
>> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct
>> drm_device *dev,
>>         if (mode_cmd->offsets[0] != 0)
>>                 return -EINVAL;
>>
>> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
>> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>>
>> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
>> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
>> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
>> aligned\n",
>> -                             mode_cmd->pitches[0], stride_alignment);
>> -               return -EINVAL;
>> +       for (i = 0; i < fb->format->num_planes; i++) {
>> +               u32 stride_alignment;
>> +
>> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
>> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
>> +                       return -EINVAL;
>> +               }
>> +
>> +               stride_alignment = intel_fb_stride_alignment(fb, i);
>> +
>> +               /*
>> +                * Display WA #0531: skl,bxt,kbl,glk
>> +                *
>> +                * Render decompression and plane width > 3840
>> +                * combined with horizontal panning requires the
>> +                * plane stride to be a multiple of 4. We'll just
>> +                * require the entire fb to accommodate that to avoid
>> +                * potential runtime errors at plane configuration time.
>>
>
>Note to Ben: Userspace needs to know about this too.  In this case, I
>believe "multiple of 4" means "multiple of 4 tiles".  You've never hit this
>because the standard 1920x1080 is 60 tiles wide which is a multiple of 4.
>
>
>> +                */
>> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
>> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> +                       stride_alignment *= 4;
>> +
>> +               if (fb->pitches[i] & (stride_alignment - 1)) {
>> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at
>> least %u byte aligned\n",
>> +                                     i, fb->pitches[i], stride_alignment);
>> +                       return -EINVAL;
>> +               }
>>         }
>>
>>         intel_fb->obj = obj;
>>
>> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
>> +       ret = intel_fill_fb_info(dev_priv, fb);
>>         if (ret)
>>                 return ret;
>>
>> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
>> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>>         if (ret) {
>>                 DRM_ERROR("framebuffer init failed %d\n", ret);
>>                 return ret;
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 249623d45be0..25782cd174c0 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct
>> drm_i915_private *dev_priv)
>>         I915_WRITE(CHICKEN_PAR1_1,
>>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>>
>> +       /*
>> +        * Display WA#0390: skl,bxt,kbl,glk
>> +        *
>> +        * Must match Sampler, Pixel Back End, and Media
>> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
>> +        *
>> +        * Including bits outside the page in the hash would
>> +        * require 2 (or 4?) MiB alignment of resources. Just
>> +        * assume the default hashing mode which only uses bits
>> +        * within the page.
>> +        */
>> +       I915_WRITE(CHICKEN_PAR1_1,
>> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
>> +
>>         I915_WRITE(GEN8_CONFIG0,
>>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>>
>> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state
>> *pstate,
>>
>>         /* For Non Y-tile return 8-blocks */
>>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
>> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
>> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
>> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
>> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>>                 return 8;
>>
>>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
>> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>         }
>>
>>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
>> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>>
>>         /* Display WA #1141: kbl. */
>> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>         res_lines = DIV_ROUND_UP(selected_result.val,
>>                                  plane_blocks_per_line.val);
>>
>> +       /* Display WA #1125: skl,bxt,kbl,glk */
>> +       if (level == 0 &&
>> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> +
>> +       /* Display WA #1126: skl,bxt,kbl,glk */
>>         if (level >= 1 && level <= 7) {
>>                 if (y_tiled) {
>>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> diff --git a/drivers/gpu/drm/i915/intel_sprite.c
>> b/drivers/gpu/drm/i915/intel_sprite.c
>> index 7031bc733d97..063a994815d0 100644
>> --- a/drivers/gpu/drm/i915/intel_sprite.c
>> +++ b/drivers/gpu/drm/i915/intel_sprite.c
>> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>>         u32 surf_addr = plane_state->main.offset;
>>         unsigned int rotation = plane_state->base.rotation;
>>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>>         int crtc_x = plane_state->base.dst.x1;
>>         int crtc_y = plane_state->base.dst.y1;
>>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
>> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>>
>>         /* program plane scaler */
>>         if (plane_state->scaler_id >= 0) {
>> --
>> 2.10.2
>>
>>

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-02-28 23:20       ` Ben Widawsky
@ 2017-03-01 10:51         ` Ville Syrjälä
  2017-03-01 17:50           ` Ben Widawsky
  0 siblings, 1 reply; 44+ messages in thread
From: Ville Syrjälä @ 2017-03-01 10:51 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Vandana Kannan, intel-gfx, Maling list - DRI developers,
	Chad Versace, Jason Ekstrand, Paulo Zanoni

On Tue, Feb 28, 2017 at 03:20:38PM -0800, Ben Widawsky wrote:
> On 17-02-28 12:18:39, Jason Ekstrand wrote:
<snip>
> >I've said it before but reading through Ben's patches again makes me want to
> >be peskier about it.  I would really like the UABI to treat the CCS as if
> >it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
> >what all the docs say it is.  From the display docs for "Color Control
> >Surface":
> >
> >"The Color Control Surface (CCS) contains the compression status of the
> >cache-line pairs. The
> >compression state of the cache-line pair is specified by 2 bits in the CCS.
> >Each CCS cache-line represents
> >an area on the main surface of 16 x16 sets of 128 byte Y-tiled
> >cache-line-pairs. CCS is always Y tiled."
> >
> >This contains 95% of the information needed to know the relation between
> >the CCS and the main surface.  The other 5% (which is badly documented) is
> >that cache line pairs are horizontally adjacent.  This gives a relationship
> >of one cache line in the CCS maps to 32x64 cache lines in the main surface.
> >
> >But it's not actually Y-tiled?  Of course not.  I've worked out the exact
> >tiling and it looks something like Y but isn't quite the same.  However,
> >this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
> >single-sampled MCS also each have their own tiling (Haswell MCS is
> >especially exotic) but the docs call all of them Y-tiled and I think the
> >hardware internally treats them as Y-tiled with the cache lines shuffled
> >around a bit.
> >
> >Given the fact that they seem to like to change the MCS/CCS tiling around
> >on every hardware generation, I'm reluctant to base UABI on the fact that
> >the CCS appears to have 64x64 "pixels" per tile with each "pixel"
> >corresponding to 16x8 pixels in the color surface.  That's not what we had
> >on any previous gen and may change on gen10 for no apparent reason.  I'd
> >much rather base it on Y-tiling and a relationship between cache lines
> >which, even if they change the exact swizzle on gen10, will probably remain
> >the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
> >generation but the relationship of one MCS cache line maps to 32x128 color
> >cache lines remained the same).
> >
> >Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
> >die, but I'd like to get the UABI right before we chisel it in stone.
> >
> >--Jason
> >
> >
> 
> I vote we make the stride in units of tiles when we have the CCS modifier.

That won't fly with the KMS API. I suppose I could make the tile 128 bytes
wide by swapping the "1 byte == 16x8 pixels" notion with a
"1 byte == 8x16 pixels" notion. Doesn't match the actual truth anymore,
but I guess no one really cares.
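
To illustrate why the two notions are interchangeable from the UABI point
of view, here is a minimal sketch. It is not driver code: the struct and
helper names are made up for this example. It assumes the figures from the
patch comment (a 4 KiB CCS tile of 64x64 bytes, each byte covering 16x8
main surface pixels) and compares them with a 128 byte x 32 row tile where
each byte covers 8x16 pixels.

```c
#include <assert.h>

/* Hypothetical description of a CCS tile layout; not a real i915 structure. */
struct ccs_layout {
	unsigned int tile_w_bytes;	/* CCS tile width in bytes */
	unsigned int tile_h_rows;	/* CCS tile height in rows */
	unsigned int px_per_byte_x;	/* main surface pixels per CCS byte, horizontally */
	unsigned int px_per_byte_y;	/* main surface pixels per CCS byte, vertically */
};

/* "1 byte == 16x8 pixels" with a 64x64 byte tile, as in the patch comment. */
static const struct ccs_layout ccs_64x64 = { 64, 64, 16, 8 };

/* "1 byte == 8x16 pixels" with a 128 byte x 32 row tile, the proposed swap. */
static const struct ccs_layout ccs_128x32 = { 128, 32, 8, 16 };

/* Width, in main surface pixels, covered by one CCS tile. */
static unsigned int ccs_tile_main_w(const struct ccs_layout *l)
{
	/* Both layouts in this sketch are assumed to describe a 4 KiB tile. */
	assert(l->tile_w_bytes * l->tile_h_rows == 4096);
	return l->tile_w_bytes * l->px_per_byte_x;
}

/* Height, in main surface pixels, covered by one CCS tile. */
static unsigned int ccs_tile_main_h(const struct ccs_layout *l)
{
	return l->tile_h_rows * l->px_per_byte_y;
}
```

Under these assumptions both layouts make one CCS tile span 1024x512 main
surface pixels. What changes is the byte stride userspace would pass for
plane 1, which differs by a factor of 2 between the two notions.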

> 
> >> +               /* fall through */
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 /*
> >>                  * Bspec seems to suggest that the Yf tile width would
> >> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const
> >> struct drm_framebuffer *fb,
> >>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
> >>
> >>         /* AUX_DIST needs only 4K alignment */
> >> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
> >> +       if (plane == 1)
> >>                 return 4096;
> >>
> >>         switch (fb->modifier) {
> >> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const
> >> struct drm_framebuffer *fb,
> >>                 if (INTEL_GEN(dev_priv) >= 9)
> >>                         return 256 * 1024;
> >>                 return 0;
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >>         case I915_FORMAT_MOD_Y_TILED:
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 return 1 * 1024 * 1024;
> >> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t
> >> fb_modifier)
> >>         case I915_FORMAT_MOD_X_TILED:
> >>                 return I915_TILING_X;
> >>         case I915_FORMAT_MOD_Y_TILED:
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >>                 return I915_TILING_Y;
> >>         default:
> >>                 return I915_TILING_NONE;
> >> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private
> >> *dev_priv,
> >>
> >>                 intel_fb_offset_to_xy(&x, &y, fb, i);
> >>
> >> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i ==
> >> 1) {
> >> +                       int main_x, main_y;
> >> +                       int ccs_x, ccs_y;
> >> +
> >> +                       /*
> >> +                        * Each byte of CCS corresponds to a 16x8 area of
> >> the main surface, and
> >> +                        * each CCS tile is 64x64 bytes.
> >> +                        */
> >> +                       ccs_x = (x * 16) % (64 * 16);
> >> +                       ccs_y = (y * 8) % (64 * 8);
> >> +                       main_x = intel_fb->normal[0].x % (64 * 16);
> >> +                       main_y = intel_fb->normal[0].y % (64 * 8);
> >> +
> >> +                       /*
> >> +                        * CCS doesn't have its own x/y offset register,
> >> so the intra CCS tile
> >> +                        * x/y offsets must match between CCS and the main
> >> surface.
> >> +                        */
> >> +                       if (main_x != ccs_x || main_y != ccs_y) {
> >> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs
> >> %d,%d) full (main %d,%d ccs %d,%d)\n",
> >> +                                             main_x, main_y,
> >> +                                             ccs_x, ccs_y,
> >> +                                             intel_fb->normal[0].x,
> >> +                                             intel_fb->normal[0].y,
> >> +                                             x, y);
> >> +                               return -EINVAL;
> >> +                       }
> >> +               }
> >> +
> >>                 /*
> >>                  * The fence (if used) is aligned to the start of the
> >> object
> >>                  * so having the framebuffer wrap around across the edge
> >> of the
> >> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct
> >> drm_framebuffer *fb, int plane,
> >>                         break;
> >>                 }
> >>                 break;
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >> +               /* FIXME AUX plane? */
> >>         case I915_FORMAT_MOD_Y_TILED:
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 switch (cpp) {
> >> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct
> >> drm_framebuffer *fb, int plane,
> >>         return 2048;
> >>  }
> >>
> >> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state
> >> *plane_state,
> >> +                                          int main_x, int main_y, u32
> >> main_offset)
> >> +{
> >> +       const struct drm_framebuffer *fb = plane_state->base.fb;
> >> +       int aux_x = plane_state->aux.x;
> >> +       int aux_y = plane_state->aux.y;
> >> +       u32 aux_offset = plane_state->aux.offset;
> >> +       u32 alignment = intel_surf_alignment(fb, 1);
> >> +
> >> +       while (aux_offset >= main_offset && aux_y <= main_y) {
> >> +               int x, y;
> >> +
> >> +               if (aux_x == main_x && aux_y == main_y)
> >> +                       break;
> >> +
> >> +               if (aux_offset == 0)
> >> +                       break;
> >> +
> >> +               x = aux_x / 16;
> >> +               y = aux_y / 8;
> >> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state,
> >> 1,
> >> +                                                     aux_offset,
> >> aux_offset - alignment);
> >> +               aux_x = x * 16 + aux_x % 16;
> >> +               aux_y = y * 8 + aux_y % 8;
> >> +       }
> >> +
> >> +       if (aux_x != main_x || aux_y != main_y)
> >> +               return false;
> >> +
> >> +       plane_state->aux.offset = aux_offset;
> >> +       plane_state->aux.x = aux_x;
> >> +       plane_state->aux.y = aux_y;
> >> +
> >> +       return true;
> >> +}
> >> +
> >>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
> >>  {
> >>         const struct drm_framebuffer *fb = plane_state->base.fb;
> >> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct
> >> intel_plane_state *plane_state)
> >>
> >>                 while ((x + w) * cpp > fb->pitches[0]) {
> >>                         if (offset == 0) {
> >> -                               DRM_DEBUG_KMS("Unable to find suitable
> >> display surface offset\n");
> >> +                               DRM_DEBUG_KMS("Unable to find suitable
> >> display surface offset due to X-tiling\n");
> >>                                 return -EINVAL;
> >>                         }
> >>
> >> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct
> >> intel_plane_state *plane_state)
> >>                 }
> >>         }
> >>
> >> +       /*
> >> +        * CCS AUX surface doesn't have its own x/y offsets, we must make
> >> sure
> >> +        * they match with the main surface x/y offsets.
> >> +        */
> >> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> >> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y,
> >> offset)) {
> >> +                       if (offset == 0)
> >> +                               break;
> >> +
> >> +                       offset = intel_adjust_tile_offset(&x, &y,
> >> plane_state, 0,
> >> +                                                         offset, offset -
> >> alignment);
> >> +               }
> >> +
> >> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
> >> +                       DRM_DEBUG_KMS("Unable to find suitable display
> >> surface offset due to CCS\n");
> >> +                       return -EINVAL;
> >> +               }
> >> +       }
> >> +
> >>         plane_state->main.offset = offset;
> >>         plane_state->main.x = x;
> >>         plane_state->main.y = y;
> >> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct
> >> intel_plane_state *plane_state)
> >>         return 0;
> >>  }
> >>
> >> +static int skl_check_ccs_aux_surface(struct intel_plane_state
> >> *plane_state)
> >> +{
> >> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> >> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
> >> +       int src_x = plane_state->base.src.x1 >> 16;
> >> +       int src_y = plane_state->base.src.y1 >> 16;
> >> +       int x = src_x / 16;
> >> +       int y = src_y / 8;
> >> +       u32 offset;
> >> +
> >> +       switch (plane->id) {
> >> +       case PLANE_PRIMARY:
> >> +       case PLANE_SPRITE0:
> >> +               break;
> >> +       default:
> >> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
> >> +               return -EINVAL;
> >> +       }
> >> +
> >> +       if (crtc->pipe == PIPE_C) {
> >> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
> >> +               return -EINVAL;
> >> +       }
> >> +
> >> +       if (plane_state->base.rotation &&
> >> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180))
> >> {
> >> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation
> >> %x\n",
> >> +                             plane_state->base.rotation);
> >> +               return -EINVAL;
> >> +       }
> >> +
> >> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
> >> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
> >> +
> >> +       plane_state->aux.offset = offset;
> >> +       plane_state->aux.x = x * 16 + src_x % 16;
> >> +       plane_state->aux.y = y * 8 + src_y % 8;
> >> +
> >> +       return 0;
> >> +}
> >> +
> >>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
> >>  {
> >>         const struct drm_framebuffer *fb = plane_state->base.fb;
> >> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct
> >> intel_plane_state *plane_state)
> >>                 ret = skl_check_nv12_aux_surface(plane_state);
> >>                 if (ret)
> >>                         return ret;
> >> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
> >> +               ret = skl_check_ccs_aux_surface(plane_state);
> >> +               if (ret)
> >> +                       return ret;
> >>         } else {
> >>                 plane_state->aux.offset = ~0xfff;
> >>                 plane_state->aux.x = 0;
> >> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
> >>                 return PLANE_CTL_TILED_X;
> >>         case I915_FORMAT_MOD_Y_TILED:
> >>                 return PLANE_CTL_TILED_Y;
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 return PLANE_CTL_TILED_YF;
> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> >>         default:
> >>                 MISSING_CASE(fb_modifier);
> >>         }
> >> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct
> >> drm_plane *plane,
> >>         u32 plane_ctl;
> >>         unsigned int rotation = plane_state->base.rotation;
> >>         u32 stride = skl_plane_stride(fb, 0, rotation);
> >> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >>         u32 surf_addr = plane_state->main.offset;
> >>         int scaler_id = plane_state->scaler_id;
> >>         int src_x = plane_state->main.x;
> >> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct
> >> drm_plane *plane,
> >>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
> >>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> >> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> >> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> >> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> >> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
> >>
> >>         if (scaler_id >= 0) {
> >>                 uint32_t ps_ctrl = 0;
> >> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct
> >> intel_crtc *crtc,
> >>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
> >>                 break;
> >>         case PLANE_CTL_TILED_Y:
> >> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
> >> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> >> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> >> +               else
> >> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
> >>                 break;
> >>         case PLANE_CTL_TILED_YF:
> >> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> >> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
> >> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
> >> +               else
> >> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
> >>                 break;
> >>         default:
> >>                 MISSING_CASE(tiling);
> >> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc
> >> *intel_crtc,
> >>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
> >>
> >>         ctl = I915_READ(PLANE_CTL(pipe, 0));
> >> -       ctl &= ~PLANE_CTL_TILED_MASK;
> >> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
> >>         switch (fb->modifier) {
> >>         case DRM_FORMAT_MOD_NONE:
> >>                 break;
> >> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc
> >> *intel_crtc,
> >>         case I915_FORMAT_MOD_Y_TILED:
> >>                 ctl |= PLANE_CTL_TILED_Y;
> >>                 break;
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
> >> +               break;
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 ctl |= PLANE_CTL_TILED_YF;
> >>                 break;
> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
> >> +               break;
> >>         default:
> >>                 MISSING_CASE(fb->modifier);
> >>         }
> >> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct
> >> drm_device *dev,
> >>                                   struct drm_i915_gem_object *obj)
> >>  {
> >>         struct drm_i915_private *dev_priv = to_i915(dev);
> >> +       struct drm_framebuffer *fb = &intel_fb->base;
> >>         unsigned int tiling = i915_gem_object_get_tiling(obj);
> >> -       int ret;
> >> -       u32 pitch_limit, stride_alignment;
> >> +       int ret, i;
> >> +       u32 pitch_limit;
> >>         struct drm_format_name_buf format_name;
> >>
> >>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> >> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct
> >> drm_device *dev,
> >>
> >>         /* Passed in modifier sanity checking. */
> >>         switch (mode_cmd->modifier[0]) {
> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
> >> +               switch (mode_cmd->pixel_format) {
> >> +               case DRM_FORMAT_XBGR8888:
> >> +               case DRM_FORMAT_ABGR8888:
> >> +               case DRM_FORMAT_XRGB8888:
> >> +               case DRM_FORMAT_ARGB8888:
> >> +                       break;
> >> +               default:
> >> +                       DRM_DEBUG_KMS("RC supported only with RGB8888
> >> formats\n");
> >> +                       return -EINVAL;
> >> +               }
> >> +               /* fall through */
> >>         case I915_FORMAT_MOD_Y_TILED:
> >>         case I915_FORMAT_MOD_Yf_TILED:
> >>                 if (INTEL_GEN(dev_priv) < 9) {
> >> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct
> >> drm_device *dev,
> >>         if (mode_cmd->offsets[0] != 0)
> >>                 return -EINVAL;
> >>
> >> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
> >> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
> >>
> >> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
> >> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
> >> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte
> >> aligned\n",
> >> -                             mode_cmd->pitches[0], stride_alignment);
> >> -               return -EINVAL;
> >> +       for (i = 0; i < fb->format->num_planes; i++) {
> >> +               u32 stride_alignment;
> >> +
> >> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
> >> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
> >> +                       return -EINVAL;
> >> +               }
> >> +
> >> +               stride_alignment = intel_fb_stride_alignment(fb, i);
> >> +
> >> +               /*
> >> +                * Display WA #0531: skl,bxt,kbl,glk
> >> +                *
> >> +                * Render decompression and plane width > 3840
> >> +                * combined with horizontal panning requires the
> >> +                * plane stride to be a multiple of 4. We'll just
> >> +                * require the entire fb to accommodate that to avoid
> >> +                * potential runtime errors at plane configuration time.
> >>
> >
> >Note to Ben: Userspace needs to know about this too.  In this case, I
> >believe "multiple of 4" means "multiple of 4 tiles".  You've never hit this
> >because the standard 1920x1080 is 60 tiles wide which is a multiple of 4.
> >
> >
> >> +                */
> >> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
> >> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> >> +                       stride_alignment *= 4;
> >> +
> >> +               if (fb->pitches[i] & (stride_alignment - 1)) {
> >> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at
> >> least %u byte aligned\n",
> >> +                                     i, fb->pitches[i], stride_alignment);
> >> +                       return -EINVAL;
> >> +               }
> >>         }
> >>
> >>         intel_fb->obj = obj;
> >>
> >> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
> >> +       ret = intel_fill_fb_info(dev_priv, fb);
> >>         if (ret)
> >>                 return ret;
> >>
> >> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
> >> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
> >>         if (ret) {
> >>                 DRM_ERROR("framebuffer init failed %d\n", ret);
> >>                 return ret;
> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> index 249623d45be0..25782cd174c0 100644
> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct
> >> drm_i915_private *dev_priv)
> >>         I915_WRITE(CHICKEN_PAR1_1,
> >>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
> >>
> >> +       /*
> >> +        * Display WA#0390: skl,bxt,kbl,glk
> >> +        *
> >> +        * Must match Sampler, Pixel Back End, and Media
> >> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
> >> +        *
> >> +        * Including bits outside the page in the hash would
> >> +        * require 2 (or 4?) MiB alignment of resources. Just
> >> +        * assume the default hashing mode which only uses bits
> >> +        * within the page.
> >> +        */
> >> +       I915_WRITE(CHICKEN_PAR1_1,
> >> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
> >> +
> >>         I915_WRITE(GEN8_CONFIG0,
> >>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
> >>
> >> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state
> >> *pstate,
> >>
> >>         /* For Non Y-tile return 8-blocks */
> >>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
> >> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
> >> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
> >> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
> >> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
> >>                 return 8;
> >>
> >>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
> >> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct
> >> drm_i915_private *dev_priv,
> >>         }
> >>
> >>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
> >> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> >> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
> >> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
> >>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
> >>
> >>         /* Display WA #1141: kbl. */
> >> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct
> >> drm_i915_private *dev_priv,
> >>         res_lines = DIV_ROUND_UP(selected_result.val,
> >>                                  plane_blocks_per_line.val);
> >>
> >> +       /* Display WA #1125: skl,bxt,kbl,glk */
> >> +       if (level == 0 &&
> >> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
> >> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
> >> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> >> +
> >> +       /* Display WA #1126: skl,bxt,kbl,glk */
> >>         if (level >= 1 && level <= 7) {
> >>                 if (y_tiled) {
> >>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
> >> diff --git a/drivers/gpu/drm/i915/intel_sprite.c
> >> b/drivers/gpu/drm/i915/intel_sprite.c
> >> index 7031bc733d97..063a994815d0 100644
> >> --- a/drivers/gpu/drm/i915/intel_sprite.c
> >> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> >> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
> >>         u32 surf_addr = plane_state->main.offset;
> >>         unsigned int rotation = plane_state->base.rotation;
> >>         u32 stride = skl_plane_stride(fb, 0, rotation);
> >> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
> >>         int crtc_x = plane_state->base.dst.x1;
> >>         int crtc_y = plane_state->base.dst.y1;
> >>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
> >> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
> >>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
> >>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
> >>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
> >> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
> >> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
> >> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
> >> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
> >>
> >>         /* program plane scaler */
> >>         if (plane_state->scaler_id >= 0) {
> >> --
> >> 2.10.2
> >>
> >>
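
For reference, the WA #0531 stride check in the quoted
intel_framebuffer_init() hunk can be sketched in isolation as below. This
is only an illustration: the helper names are invented here, and the 128
byte tile width is an assumption chosen to match the "1920x1080 is 60 tiles
wide" remark (1920 pixels * 4 bytes = 7680 bytes = 60 * 128). Whether
"multiple of 4" really means 4 tiles is Jason's reading, not something the
patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same test as "fb->pitches[i] & (stride_alignment - 1)" in the patch,
 * which only works when the alignment is a power of two.
 */
static bool pitch_is_aligned(uint32_t pitch, uint32_t stride_alignment)
{
	assert(stride_alignment != 0 &&
	       (stride_alignment & (stride_alignment - 1)) == 0);
	return (pitch & (stride_alignment - 1)) == 0;
}

/*
 * Hypothetical helper: alignment required for the main plane of a CCS fb.
 * tile_w_bytes stands in for whatever intel_fb_stride_alignment() returns.
 */
static uint32_t ccs_main_stride_alignment(uint32_t tile_w_bytes,
					  uint32_t fb_width)
{
	/* WA #0531: fbs wider than 3840 pixels need 4x the usual alignment. */
	return fb_width > 3840 ? tile_w_bytes * 4 : tile_w_bytes;
}
```

With these assumptions a 1920 pixel wide XRGB8888 fb (pitch 7680) passes
with the normal 128 byte alignment, while a pitch of 15616 bytes (122
tiles) would be rejected for an fb wider than 3840 pixels because 122 is
not a multiple of 4.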

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-03-01 10:51         ` Ville Syrjälä
@ 2017-03-01 17:50           ` Ben Widawsky
  2017-03-01 18:00             ` Ville Syrjälä
  0 siblings, 1 reply; 44+ messages in thread
From: Ben Widawsky @ 2017-03-01 17:50 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Vandana Kannan, intel-gfx, Maling list - DRI developers,
	Chad Versace, Jason Ekstrand, Paulo Zanoni

On 17-03-01 12:51:17, Ville Syrjälä wrote:
>On Tue, Feb 28, 2017 at 03:20:38PM -0800, Ben Widawsky wrote:
>> On 17-02-28 12:18:39, Jason Ekstrand wrote:
><snip>
>> >I've said it before but reading through Ben's patches again makes me want to
>> >be peskier about it.  I would really like the UABI to treat the CCS as if
>> >it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
>> >what all the docs say it is.  From the display docs for "Color Control
>> >Surface":
>> >
>> >"The Color Control Surface (CCS) contains the compression status of the
>> >cache-line pairs. The
>> >compression state of the cache-line pair is specified by 2 bits in the CCS.
>> >Each CCS cache-line represents
>> >an area on the main surface of 16 x16 sets of 128 byte Y-tiled
>> >cache-line-pairs. CCS is always Y tiled."
>> >
>> >This contains 95% of the information needed to know the relation between
>> >the CCS and the main surface.  The other 5% (which is badly documented) is
>> >that cache line pairs are horizontally adjacent.  This gives a relationship
>> >of one cache line in the CCS maps to 32x64 cache lines in the main surface.
>> >
>> >But it's not actually Y-tiled?  Of course not.  I've worked out the exact
>> >tiling and it looks something like Y but isn't quite the same.  However,
>> >this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
>> >single-sampled MCS also each have their own tiling (Haswell MCS is
>> >especially exotic) but the docs call all of them Y-tiled and I think the
>> >hardware internally treats them as Y-tiled with the cache lines shuffled
>> >around a bit.
>> >
>> >Given the fact that they seem to like to change the MCS/CCS tiling around
>> >on every hardware generation, I'm reluctant to base UABI on the fact that
>> >the CCS appears to have 64x64 "pixels" per tile with each "pixel"
>> >corresponding to 16x8 pixels in the color surface.  That's not what we had
>> >on any previous gen and may change on gen10 for no apparent reason.  I'd
>> >much rather base it on Y-tiling and a relationship between cache lines
>> >which, even if they change the exact swizzle on gen10, will probably remain
>> >the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
>> >generation but the relationship of one MCS cache line maps to 32x128 color
>> >cache lines remained the same).
>> >
>> >Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
>> >die, but I'd like to get the UABI right before we chisel it in stone.
>> >
>> >--Jason
>> >
>> >
>>
>> I vote we make the stride in units of tiles when we have the CCS modifier.
>
>That won't fly with the KMS API. I suppose I could make the tile 128 bytes
>wide by swapping the "1 byte == 16x8 pixels" notion with a
>"1 byte == 8x16 pixels" notion. Doesn't match the actual truth anymore,
>but I guess no one really cares.
>

KMS API goes right out the window with modifiers. Isn't that really the whole
point of modifiers?
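
For reference, here is a minimal sketch of the arithmetic behind the two
notions quoted above. It assumes a 4 KiB CCS tile and that one CCS byte
covers 128 main surface pixels either way. The struct and helper names are
made up for illustration and are not i915 code.

```c
/* Hypothetical description of how one CCS byte and one CCS tile are viewed. */
struct ccs_notion {
	unsigned int px_w, px_h;	/* main surface pixels covered by one CCS byte */
	unsigned int tile_w, tile_h;	/* CCS tile size: bytes wide x rows tall */
};

/* "1 byte == 16x8 pixels", 64 byte wide tile (what the patch uses). */
static const struct ccs_notion notion_16x8 = { 16, 8, 64, 64 };
/* "1 byte == 8x16 pixels", 128 byte wide tile (Y-tile shaped). */
static const struct ccs_notion notion_8x16 = { 8, 16, 128, 32 };

static unsigned int round_up_to(unsigned int v, unsigned int a)
{
	return (v + a - 1) / a * a;
}

/* CCS pitch in bytes for a main surface width in pixels, padded to whole tiles. */
static unsigned int ccs_pitch(const struct ccs_notion *n, unsigned int main_width)
{
	unsigned int bytes = (main_width + n->px_w - 1) / n->px_w;

	return round_up_to(bytes, n->tile_w);
}

/* Main surface pixels covered by one CCS tile, horizontally and vertically. */
static unsigned int ccs_tile_px_w(const struct ccs_notion *n)
{
	return n->tile_w * n->px_w;
}

static unsigned int ccs_tile_px_h(const struct ccs_notion *n)
{
	return n->tile_h * n->px_h;
}
```

Under these assumptions both notions describe the same 1024x512 pixel area
per CCS tile. Only the pitch handed to the kernel differs by a factor of two,
e.g. 128 vs. 256 bytes for a 1920 pixel wide surface.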

>>
>> >> +               /* fall through */
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 /*
>> >>                  * Bspec seems to suggest that the Yf tile width would
>> >> @@ -2156,7 +2164,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>> >>         struct drm_i915_private *dev_priv = to_i915(fb->dev);
>> >>
>> >>         /* AUX_DIST needs only 4K alignment */
>> >> -       if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
>> >> +       if (plane == 1)
>> >>                 return 4096;
>> >>
>> >>         switch (fb->modifier) {
>> >> @@ -2166,6 +2174,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
>> >>                 if (INTEL_GEN(dev_priv) >= 9)
>> >>                         return 256 * 1024;
>> >>                 return 0;
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 return 1 * 1024 * 1024;
>> >> @@ -2472,6 +2482,7 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
>> >>         case I915_FORMAT_MOD_X_TILED:
>> >>                 return I915_TILING_X;
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >>                 return I915_TILING_Y;
>> >>         default:
>> >>                 return I915_TILING_NONE;
>> >> @@ -2536,6 +2547,35 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
>> >>
>> >>                 intel_fb_offset_to_xy(&x, &y, fb, i);
>> >>
>> >> +               if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
>> >> +                       int main_x, main_y;
>> >> +                       int ccs_x, ccs_y;
>> >> +
>> >> +                       /*
>> >> +                        * Each byte of CCS corresponds to a 16x8 area of the main surface, and
>> >> +                        * each CCS tile is 64x64 bytes.
>> >> +                        */
>> >> +                       ccs_x = (x * 16) % (64 * 16);
>> >> +                       ccs_y = (y * 8) % (64 * 8);
>> >> +                       main_x = intel_fb->normal[0].x % (64 * 16);
>> >> +                       main_y = intel_fb->normal[0].y % (64 * 8);
>> >> +
>> >> +                       /*
>> >> +                        * CCS doesn't have its own x/y offset register, so the intra CCS tile
>> >> +                        * x/y offsets must match between CCS and the main surface.
>> >> +                        */
>> >> +                       if (main_x != ccs_x || main_y != ccs_y) {
>> >> +                               DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
>> >> +                                             main_x, main_y,
>> >> +                                             ccs_x, ccs_y,
>> >> +                                             intel_fb->normal[0].x,
>> >> +                                             intel_fb->normal[0].y,
>> >> +                                             x, y);
>> >> +                               return -EINVAL;
>> >> +                       }
>> >> +               }
>> >> +
>> >>                 /*
>> >>                  * The fence (if used) is aligned to the start of the object
>> >>                  * so having the framebuffer wrap around across the edge of the
>> >> @@ -2873,6 +2913,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>> >>                         break;
>> >>                 }
>> >>                 break;
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> >> +               /* FIXME AUX plane? */
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 switch (cpp) {
>> >> @@ -2895,6 +2938,42 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
>> >>         return 2048;
>> >>  }
>> >>
>> >> +static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
>> >> +                                          int main_x, int main_y, u32 main_offset)
>> >> +{
>> >> +       const struct drm_framebuffer *fb = plane_state->base.fb;
>> >> +       int aux_x = plane_state->aux.x;
>> >> +       int aux_y = plane_state->aux.y;
>> >> +       u32 aux_offset = plane_state->aux.offset;
>> >> +       u32 alignment = intel_surf_alignment(fb, 1);
>> >> +
>> >> +       while (aux_offset >= main_offset && aux_y <= main_y) {
>> >> +               int x, y;
>> >> +
>> >> +               if (aux_x == main_x && aux_y == main_y)
>> >> +                       break;
>> >> +
>> >> +               if (aux_offset == 0)
>> >> +                       break;
>> >> +
>> >> +               x = aux_x / 16;
>> >> +               y = aux_y / 8;
>> >> +               aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
>> >> +                                                     aux_offset, aux_offset - alignment);
>> >> +               aux_x = x * 16 + aux_x % 16;
>> >> +               aux_y = y * 8 + aux_y % 8;
>> >> +       }
>> >> +
>> >> +       if (aux_x != main_x || aux_y != main_y)
>> >> +               return false;
>> >> +
>> >> +       plane_state->aux.offset = aux_offset;
>> >> +       plane_state->aux.x = aux_x;
>> >> +       plane_state->aux.y = aux_y;
>> >> +
>> >> +       return true;
>> >> +}
>> >> +
>> >>  static int skl_check_main_surface(struct intel_plane_state *plane_state)
>> >>  {
>> >>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> >> @@ -2937,7 +3016,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>> >>
>> >>                 while ((x + w) * cpp > fb->pitches[0]) {
>> >>                         if (offset == 0) {
>> >> -                               DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
>> >> +                               DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
>> >>                                 return -EINVAL;
>> >>                         }
>> >>
>> >> @@ -2946,6 +3025,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
>> >>                 }
>> >>         }
>> >>
>> >> +       /*
>> >> +        * CCS AUX surface doesn't have its own x/y offsets, we must make sure
>> >> +        * they match with the main surface x/y offsets.
>> >> +        */
>> >> +       if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +           fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> >> +               while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
>> >> +                       if (offset == 0)
>> >> +                               break;
>> >> +
>> >> +                       offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
>> >> +                                                         offset, offset - alignment);
>> >> +               }
>> >> +
>> >> +               if (x != plane_state->aux.x || y != plane_state->aux.y) {
>> >> +                       DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
>> >> +                       return -EINVAL;
>> >> +               }
>> >> +       }
>> >> +
>> >>         plane_state->main.offset = offset;
>> >>         plane_state->main.x = x;
>> >>         plane_state->main.y = y;
>> >> @@ -2982,6 +3081,47 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
>> >>         return 0;
>> >>  }
>> >>
>> >> +static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
>> >> +{
>> >> +       struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
>> >> +       struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
>> >> +       int src_x = plane_state->base.src.x1 >> 16;
>> >> +       int src_y = plane_state->base.src.y1 >> 16;
>> >> +       int x = src_x / 16;
>> >> +       int y = src_y / 8;
>> >> +       u32 offset;
>> >> +
>> >> +       switch (plane->id) {
>> >> +       case PLANE_PRIMARY:
>> >> +       case PLANE_SPRITE0:
>> >> +               break;
>> >> +       default:
>> >> +               DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
>> >> +               return -EINVAL;
>> >> +       }
>> >> +
>> >> +       if (crtc->pipe == PIPE_C) {
>> >> +               DRM_DEBUG_KMS("No RC support on pipe C\n");
>> >> +               return -EINVAL;
>> >> +       }
>> >> +
>> >> +       if (plane_state->base.rotation &&
>> >> +           plane_state->base.rotation & ~(DRM_ROTATE_0 | DRM_ROTATE_180)) {
>> >> +               DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
>> >> +                             plane_state->base.rotation);
>> >> +               return -EINVAL;
>> >> +       }
>> >> +
>> >> +       intel_add_fb_offsets(&x, &y, plane_state, 1);
>> >> +       offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
>> >> +
>> >> +       plane_state->aux.offset = offset;
>> >> +       plane_state->aux.x = x * 16 + src_x % 16;
>> >> +       plane_state->aux.y = y * 8 + src_y % 8;
>> >> +
>> >> +       return 0;
>> >> +}
>> >> +
>> >>  int skl_check_plane_surface(struct intel_plane_state *plane_state)
>> >>  {
>> >>         const struct drm_framebuffer *fb = plane_state->base.fb;
>> >> @@ -3002,6 +3142,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
>> >>                 ret = skl_check_nv12_aux_surface(plane_state);
>> >>                 if (ret)
>> >>                         return ret;
>> >> +       } else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +                  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
>> >> +               ret = skl_check_ccs_aux_surface(plane_state);
>> >> +               if (ret)
>> >> +                       return ret;
>> >>         } else {
>> >>                 plane_state->aux.offset = ~0xfff;
>> >>                 plane_state->aux.x = 0;
>> >> @@ -3357,8 +3502,12 @@ u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
>> >>                 return PLANE_CTL_TILED_X;
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >>                 return PLANE_CTL_TILED_Y;
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >> +               return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 return PLANE_CTL_TILED_YF;
>> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> >> +               return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>> >>         default:
>> >>                 MISSING_CASE(fb_modifier);
>> >>         }
>> >> @@ -3401,6 +3550,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>> >>         u32 plane_ctl;
>> >>         unsigned int rotation = plane_state->base.rotation;
>> >>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> >> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>> >>         u32 surf_addr = plane_state->main.offset;
>> >>         int scaler_id = plane_state->scaler_id;
>> >>         int src_x = plane_state->main.x;
>> >> @@ -3436,6 +3586,10 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>> >>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
>> >>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>> >>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> >> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> >> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> >> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> >> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>> >>
>> >>         if (scaler_id >= 0) {
>> >>                 uint32_t ps_ctrl = 0;
>> >> @@ -9807,10 +9961,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
>> >>                 fb->modifier = I915_FORMAT_MOD_X_TILED;
>> >>                 break;
>> >>         case PLANE_CTL_TILED_Y:
>> >> -               fb->modifier = I915_FORMAT_MOD_Y_TILED;
>> >> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> >> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
>> >> +               else
>> >> +                       fb->modifier = I915_FORMAT_MOD_Y_TILED;
>> >>                 break;
>> >>         case PLANE_CTL_TILED_YF:
>> >> -               fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>> >> +               if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
>> >> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
>> >> +               else
>> >> +                       fb->modifier = I915_FORMAT_MOD_Yf_TILED;
>> >>                 break;
>> >>         default:
>> >>                 MISSING_CASE(tiling);
>> >> @@ -12014,7 +12174,7 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>> >>         u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
>> >>
>> >>         ctl = I915_READ(PLANE_CTL(pipe, 0));
>> >> -       ctl &= ~PLANE_CTL_TILED_MASK;
>> >> +       ctl &= ~(PLANE_CTL_TILED_MASK | PLANE_CTL_DECOMPRESSION_ENABLE);
>> >>         switch (fb->modifier) {
>> >>         case DRM_FORMAT_MOD_NONE:
>> >>                 break;
>> >> @@ -12024,9 +12184,15 @@ static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >>                 ctl |= PLANE_CTL_TILED_Y;
>> >>                 break;
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >> +               ctl |= PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
>> >> +               break;
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 ctl |= PLANE_CTL_TILED_YF;
>> >>                 break;
>> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> >> +               ctl |= PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
>> >> +               break;
>> >>         default:
>> >>                 MISSING_CASE(fb->modifier);
>> >>         }
>> >> @@ -15925,9 +16091,10 @@ static int intel_framebuffer_init(struct drm_device *dev,
>> >>                                   struct drm_i915_gem_object *obj)
>> >>  {
>> >>         struct drm_i915_private *dev_priv = to_i915(dev);
>> >> +       struct drm_framebuffer *fb = &intel_fb->base;
>> >>         unsigned int tiling = i915_gem_object_get_tiling(obj);
>> >> -       int ret;
>> >> -       u32 pitch_limit, stride_alignment;
>> >> +       int ret, i;
>> >> +       u32 pitch_limit;
>> >>         struct drm_format_name_buf format_name;
>> >>
>> >>         WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>> >> @@ -15953,6 +16120,19 @@ static int intel_framebuffer_init(struct drm_device *dev,
>> >>
>> >>         /* Passed in modifier sanity checking. */
>> >>         switch (mode_cmd->modifier[0]) {
>> >> +       case I915_FORMAT_MOD_Y_TILED_CCS:
>> >> +       case I915_FORMAT_MOD_Yf_TILED_CCS:
>> >> +               switch (mode_cmd->pixel_format) {
>> >> +               case DRM_FORMAT_XBGR8888:
>> >> +               case DRM_FORMAT_ABGR8888:
>> >> +               case DRM_FORMAT_XRGB8888:
>> >> +               case DRM_FORMAT_ARGB8888:
>> >> +                       break;
>> >> +               default:
>> >> +                       DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
>> >> +                       return -EINVAL;
>> >> +               }
>> >> +               /* fall through */
>> >>         case I915_FORMAT_MOD_Y_TILED:
>> >>         case I915_FORMAT_MOD_Yf_TILED:
>> >>                 if (INTEL_GEN(dev_priv) < 9) {
>> >> @@ -16059,22 +16239,46 @@ static int intel_framebuffer_init(struct drm_device *dev,
>> >>         if (mode_cmd->offsets[0] != 0)
>> >>                 return -EINVAL;
>> >>
>> >> -       drm_helper_mode_fill_fb_struct(dev, &intel_fb->base, mode_cmd);
>> >> +       drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
>> >>
>> >> -       stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
>> >> -       if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
>> >> -               DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
>> >> -                             mode_cmd->pitches[0], stride_alignment);
>> >> -               return -EINVAL;
>> >> +       for (i = 0; i < fb->format->num_planes; i++) {
>> >> +               u32 stride_alignment;
>> >> +
>> >> +               if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
>> >> +                       DRM_DEBUG_KMS("bad plane %d handle\n", i);
>> >> +                       return -EINVAL;
>> >> +               }
>> >> +
>> >> +               stride_alignment = intel_fb_stride_alignment(fb, i);
>> >> +
>> >> +               /*
>> >> +                * Display WA #0531: skl,bxt,kbl,glk
>> >> +                *
>> >> +                * Render decompression and plane width > 3840
>> >> +                * combined with horizontal panning requires the
>> >> +                * plane stride to be a multiple of 4. We'll just
>> >> +                * require the entire fb to accommodate that to avoid
>> >> +                * potential runtime errors at plane configuration time.
>> >>
>> >
>> >Note to Ben: Userspace needs to know about this too.  In this case, I
>> >believe "multiple of 4" means "multiple of 4 tiles".  You've never hit this
>> >because the standard 1920x1080 is 60 tiles wide which is a multiple of 4.
>> >
>> >
>> >> +                */
>> >> +               if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
>> >> +                   (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +                    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> >> +                       stride_alignment *= 4;
>> >> +
>> >> +               if (fb->pitches[i] & (stride_alignment - 1)) {
>> >> +                       DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
>> >> +                                     i, fb->pitches[i], stride_alignment);
>> >> +                       return -EINVAL;
>> >> +               }
>> >>         }
>> >>
>> >>         intel_fb->obj = obj;
>> >>
>> >> -       ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
>> >> +       ret = intel_fill_fb_info(dev_priv, fb);
>> >>         if (ret)
>> >>                 return ret;
>> >>
>> >> -       ret = drm_framebuffer_init(dev, &intel_fb->base, &intel_fb_funcs);
>> >> +       ret = drm_framebuffer_init(dev, fb, &intel_fb_funcs);
>> >>         if (ret) {
>> >>                 DRM_ERROR("framebuffer init failed %d\n", ret);
>> >>                 return ret;
>> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> >> index 249623d45be0..25782cd174c0 100644
>> >> --- a/drivers/gpu/drm/i915/intel_pm.c
>> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> >> @@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
>> >>         I915_WRITE(CHICKEN_PAR1_1,
>> >>                    I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
>> >>
>> >> +       /*
>> >> +        * Display WA#0390: skl,bxt,kbl,glk
>> >> +        *
>> >> +        * Must match Sampler, Pixel Back End, and Media
>> >> +        * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
>> >> +        *
>> >> +        * Including bits outside the page in the hash would
>> >> +        * require 2 (or 4?) MiB alignment of resources. Just
>> >> +        * assume the default hashing mode which only uses bits
>> >> +        * within the page.
>> >> +        */
>> >> +       I915_WRITE(CHICKEN_PAR1_1,
>> >> +                  I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
>> >> +
>> >>         I915_WRITE(GEN8_CONFIG0,
>> >>                    I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
>> >>
>> >> @@ -3314,7 +3328,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>> >>
>> >>         /* For Non Y-tile return 8-blocks */
>> >>         if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
>> >> -           fb->modifier != I915_FORMAT_MOD_Yf_TILED)
>> >> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
>> >> +           fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
>> >> +           fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
>> >>                 return 8;
>> >>
>> >>         src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
>> >> @@ -3590,7 +3606,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>> >>         }
>> >>
>> >>         y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>> >> -                 fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>> >> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
>> >> +                 fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +                 fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
>> >>         x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
>> >>
>> >>         /* Display WA #1141: kbl. */
>> >> @@ -3675,6 +3693,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>> >>         res_lines = DIV_ROUND_UP(selected_result.val,
>> >>                                  plane_blocks_per_line.val);
>> >>
>> >> +       /* Display WA #1125: skl,bxt,kbl,glk */
>> >> +       if (level == 0 &&
>> >> +           (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
>> >> +            fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
>> >> +               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> >> +
>> >> +       /* Display WA #1126: skl,bxt,kbl,glk */
>> >>         if (level >= 1 && level <= 7) {
>> >>                 if (y_tiled) {
>> >>                         res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
>> >> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
>> >> index 7031bc733d97..063a994815d0 100644
>> >> --- a/drivers/gpu/drm/i915/intel_sprite.c
>> >> +++ b/drivers/gpu/drm/i915/intel_sprite.c
>> >> @@ -210,6 +210,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>> >>         u32 surf_addr = plane_state->main.offset;
>> >>         unsigned int rotation = plane_state->base.rotation;
>> >>         u32 stride = skl_plane_stride(fb, 0, rotation);
>> >> +       u32 aux_stride = skl_plane_stride(fb, 1, rotation);
>> >>         int crtc_x = plane_state->base.dst.x1;
>> >>         int crtc_y = plane_state->base.dst.y1;
>> >>         uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
>> >> @@ -248,6 +249,10 @@ skl_update_plane(struct drm_plane *drm_plane,
>> >>         I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
>> >>         I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride);
>> >>         I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
>> >> +       I915_WRITE(PLANE_AUX_DIST(pipe, plane_id),
>> >> +                  (plane_state->aux.offset - surf_addr) | aux_stride);
>> >> +       I915_WRITE(PLANE_AUX_OFFSET(pipe, plane_id),
>> >> +                  (plane_state->aux.y << 16) | plane_state->aux.x);
>> >>
>> >>         /* program plane scaler */
>> >>         if (plane_state->scaler_id >= 0) {
>> >> --
>> >> 2.10.2
>> >>
>> >>
>
>-- 
>Ville Syrjälä
>Intel OTC


* Re: [PATCH v2 9/9] drm/i915: Add render decompression support
  2017-03-01 17:50           ` Ben Widawsky
@ 2017-03-01 18:00             ` Ville Syrjälä
  0 siblings, 0 replies; 44+ messages in thread
From: Ville Syrjälä @ 2017-03-01 18:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Vandana Kannan, intel-gfx, Maling list - DRI developers,
	Chad Versace, Jason Ekstrand, Paulo Zanoni

On Wed, Mar 01, 2017 at 09:50:56AM -0800, Ben Widawsky wrote:
> On 17-03-01 12:51:17, Ville Syrjälä wrote:
> >On Tue, Feb 28, 2017 at 03:20:38PM -0800, Ben Widawsky wrote:
> >> On 17-02-28 12:18:39, Jason Ekstrand wrote:
> ><snip>
> >> >I've said it before but reading through Ben's patches again makes me want to
> >> >be peskier about it.  I would really like the UABI to treat the CCS as if
> >> >it's Y-tiled with a tile size of 128B x 32 rows.  Why?  Because this is
> >> >what all the docs say it is.  From the display docs for "Color Control
> >> >Surface":
> >> >
> >> >"The Color Control Surface (CCS) contains the compression status of the
> >> >cache-line pairs. The
> >> >compression state of the cache-line pair is specified by 2 bits in the CCS.
> >> >Each CCS cache-line represents
> >> >an area on the main surface of 16 x16 sets of 128 byte Y-tiled
> >> >cache-line-pairs. CCS is always Y tiled."
> >> >
> >> >This contains 95% of the information needed to know the relation between
> >> >the CCS and the main surface.  The other 5% (which is badly documented) is
> >> >that cache line pairs are horizontally adjacent.  This gives a relationship
> >> >of one cache line in the CCS maps to 32x64 cache lines in the main surface.
> >> >
> >> >But it's not actually Y-tiled?  Of course not.  I've worked out the exact
> >> >tiling and it looks something like Y but isn't quite the same.  However,
> >> >this isn't unique to CCS.  Stencil (W-tiled), HiZ, and gen7-8
> >> >single-sampled MCS also each have their own tiling (Haswell MCS is
> >> >especially exotic) but the docs call all of them Y-tiled and I think the
> >> >hardware internally treats them as Y-tiled with the cache lines shuffled
> >> >around a bit.
> >> >
> >> >Given the fact that they seem to like to change the MCS/CCS tiling around
> >> >on every hardware generation, I'm reluctant to base UABI on the fact that
> >> >the CCS appears to have 64x64 "pixels" per tile with each "pixel"
> >> >corresponding to 16x8 pixels in the color surface.  That's not what we had
> >> >on any previous gen and may change on gen10 for no apparent reason.  I'd
> >> >much rather base it on Y-tiling and a relationship between cache lines
> >> >which, even if they change the exact swizzle on gen10, will probably remain
> >> >the same.  (For the gen7-8 analogue of CCS, they changed the tiling every
> >> >generation but the relationship of one MCS cache line maps to 32x128 color
> >> >cache lines remained the same).
> >> >
> >> >Ok, I've said my piece.  If we have to divide by 2 in userspace, we won't
> >> >die, but I'd like to get the UABI right before we chisel it in stone.
> >> >
> >> >--Jason
> >> >
> >> >
> >>
> >> I vote we make the stride in units of tiles when we have the CCS modifier.
> >
> >That won't fly with the KMS API. I suppose I could make the tile 128 bytes
> >wide by swapping the "1 byte == 16x8 pixels" notion with a
> >"1 byte == 8x16 pixels" notion. Doesn't match the actual truth anymore,
> >but I guess no one really cares.
> >
> 
> KMS API goes right out the window with modifiers. Isn't that really the whole
> point of modifiers?

Not in my opinion. It's just a mild extension. The basics still apply
perfectly fine. 

And I really don't want to add special exceptions for CCS since
that'll just mean more code and more bugs. Right now the same code
that works for your typical planar formats works for CCS as well.
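
As an illustration, the intra-tile offset check from intel_fill_fb_info()
in the patch can be reduced to a standalone sketch that treats the CCS like
a plane subsampled by 16x8. The macro and function names below are invented
for the example. The real code works on the fb and the plane state.

```c
#include <stdbool.h>

/* Assumed from the patch comment: one CCS byte covers 16x8 main surface pixels. */
#define CCS_HSUB	16
#define CCS_VSUB	8
/* Assumed from the patch comment: one CCS tile is 64x64 bytes. */
#define CCS_TILE_W	64
#define CCS_TILE_H	64

/*
 * Hypothetical helper: returns true when the CCS x/y offset (in CCS bytes
 * and rows) lands on the same position within a CCS tile as the main
 * surface x/y offset (in pixels). The hardware has no separate CCS x/y
 * offset register, so a mismatch cannot be programmed.
 */
static bool ccs_intra_tile_offsets_match(int main_x, int main_y,
					 int ccs_x, int ccs_y)
{
	int tile_w_px = CCS_TILE_W * CCS_HSUB;	/* 1024 pixels */
	int tile_h_px = CCS_TILE_H * CCS_VSUB;	/* 512 pixels */

	return (ccs_x * CCS_HSUB) % tile_w_px == main_x % tile_w_px &&
	       (ccs_y * CCS_VSUB) % tile_h_px == main_y % tile_h_px;
}
```

Apart from the 16x8 scale factor this is the same kind of bookkeeping a
subsampled chroma plane needs, which is why no CCS specific offset math is
required elsewhere.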

-- 
Ville Syrjälä
Intel OTC


end of thread, other threads:[~2017-03-01 18:00 UTC | newest]

Thread overview: 44+ messages
2017-01-04 18:42 [PATCH 0/9] drm/i915: SKL+ render decompression support ville.syrjala
2017-01-04 18:42 ` [PATCH 1/9] drm: Add mode_config .get_format_info() hook ville.syrjala
2017-01-05  3:15   ` Ben Widawsky
2017-01-05  8:48     ` Daniel Vetter
2017-01-04 18:42 ` [PATCH 2/9] drm/i915: Plumb drm_framebuffer into more places ville.syrjala
2017-02-02 13:30   ` [Intel-gfx] " Imre Deak
2017-01-04 18:42 ` [PATCH 3/9] drm/i915: Move nv12 chroma plane handling into intel_surf_alignment() ville.syrjala
2017-02-02 13:34   ` Imre Deak
2017-01-04 18:42 ` [PATCH 4/9] drm/i915: Avoid div-by-zero when computing aux_stride w/o an aux plane ville.syrjala
2017-02-02 13:38   ` Imre Deak
2017-01-04 18:42 ` [PATCH 5/9] drm/i915: Fix Yf tile width ville.syrjala
2017-02-02 15:07   ` Imre Deak
2017-01-04 18:42 ` [PATCH 6/9] drm/i915: Pass the correct plane index to _intel_compute_tile_offset() ville.syrjala
2017-02-02 16:10   ` Imre Deak
2017-01-04 18:42 ` [PATCH 7/9] drm/i915: Use DRM_DEBUG_KMS() for framebuffer failure debug messages ville.syrjala
2017-02-02 16:19   ` Imre Deak
2017-01-04 18:42 ` [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS ville.syrjala
2017-01-05 16:24   ` Tvrtko Ursulin
2017-01-05 17:40     ` Ben Widawsky
2017-02-26 22:41   ` Chad Versace
2017-02-27 15:13     ` [Intel-gfx] " Ville Syrjälä
2017-02-28  5:36       ` Ben Widawsky
2017-01-04 18:42 ` [PATCH 9/9] drm/i915: Add render decompression support ville.syrjala
2017-01-04 19:14   ` Paulo Zanoni
2017-01-05 15:12     ` Ville Syrjälä
2017-01-05  4:25   ` Ben Widawsky
2017-01-05 15:11     ` Ville Syrjälä
2017-01-05 15:14   ` [PATCH v2 " ville.syrjala
2017-01-09 19:20     ` Jason Ekstrand
2017-01-10 17:04       ` Ville Syrjälä
2017-01-11 21:49         ` Jason Ekstrand
2017-01-11 22:29           ` Jason Ekstrand
2017-02-07 15:37     ` Imre Deak
2017-02-13 17:13       ` Ville Syrjälä
2017-02-28 20:18     ` Jason Ekstrand
2017-02-28 21:00       ` Ben Widawsky
2017-02-28 23:20       ` Ben Widawsky
2017-03-01 10:51         ` Ville Syrjälä
2017-03-01 17:50           ` Ben Widawsky
2017-03-01 18:00             ` Ville Syrjälä
2017-01-04 19:45 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ " Patchwork
2017-01-05 15:54 ` ✗ Fi.CI.BAT: failure for drm/i915: SKL+ render decompression support (rev2) Patchwork
2017-01-06 23:41 ` [PATCH 0/9] drm/i915: SKL+ render decompression support Ben Widawsky
2017-01-10 19:23 ` ✓ Fi.CI.BAT: success for drm/i915: SKL+ render decompression support (rev2) Patchwork
