All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Paulo Zanoni <przanoni@gmail.com>
Cc: intel-gfx@lists.freedesktop.org, Paulo Zanoni <paulo.r.zanoni@intel.com>
Subject: Re: [PATCH] drm/i915: replace snb_update_wm with haswell_update_wm on HSW
Date: Tue, 21 May 2013 16:08:06 +0300	[thread overview]
Message-ID: <20130521130806.GB16772@intel.com> (raw)
In-Reply-To: <1368130422-15392-1-git-send-email-przanoni@gmail.com>

On Thu, May 09, 2013 at 05:13:41PM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> We were previously calling sandybridge_update_wm on HSW, but the SNB
> function didn't really match the HSW specification, so we were just
> writing the wrong values. For example, I was not seeing any LP
> watermark ever enabled.
> 
> This patch implements the haswell_update_wm function as described in
> our specification. The values match the "Haswell Watermark Calculator"
> too.
> 
> With this patch I can finally see us reaching PC7 state with screen
> sizes <= 1920x1080.
> 
> The only thing I see we're missing is to properly account for the case
> where the primary plane is disabled. We still don't do this and I
> believe we'll need some small reworks on intel_sprite.c if we want to
> do this correctly, so let's leave this feature for a future patch.
> 
> v2: - Refactor code in the hope that it will be more useful for
>       Ville's rework
>     - Immpletement the 2 pieces missing on v1: sprite watermarks and
>       support for 5/6 data buffer partitioning.

Apart from I said yesterday, this is a rather large patch and could be
split up a bit to make it easier to digest.

- IMHO the 1/2 vs. 5/6 split stuff could be dropped completely for now
  since it doesn't even appear functional
- the pfit downscaling pixel rate adjustment part at least could be a
  separate patch

A few more bikesheds below.

> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h     |    7 +
>  drivers/gpu/drm/i915/intel_drv.h    |   12 +
>  drivers/gpu/drm/i915/intel_pm.c     |  522 +++++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_sprite.c |    6 +-
>  4 files changed, 527 insertions(+), 20 deletions(-)
> 
> Hi
> 
> I had some discussions with Ville based on his plans to rework the watermarks
> code, so I decided to do a little refactoring on the previous patch in order to
> make it easier for him. Now we have a clear split between the 3 steps: (i)
> gather the WM parameters, (ii) calculate the WM values and (iii) write the
> values to the registers. My idea is that he'll be able to take parts of my
> Haswell-specific code and make them become usable by the other hardware
> generations. He'll probably have to rename some of the hsw_ structs and move
> them around, but I hope my code can be used as a starting point for his rework.
> 
> In addition to the refactoring I also added support for sprite watermarks and
> the 5/6 data buffer partitioning mode, so we should be "feature complete".
> 
> I checked the values set by the Kernel and they do match the watermarks
> calculator.
> 
> Thanks,
> Paulo
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 92dcbe3..33b5de3 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3055,6 +3055,10 @@
>  #define WM3S_LP_IVB		0x45128
>  #define  WM1S_LP_EN		(1<<31)
>  
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> +	(WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> +	 ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
>  /* Memory latency timer register */
>  #define MLTR_ILK		0x11222
>  #define  MLTR_WM1_SHIFT		0
> @@ -4938,6 +4942,9 @@
>  #define  SFUSE_STRAP_DDIC_DETECTED	(1<<1)
>  #define  SFUSE_STRAP_DDID_DETECTED	(1<<0)
>  
> +#define WM_MISC				0x45260
> +#define  WM_MISC_DATA_PARTITION_5_6	(1 << 0)
> +
>  #define WM_DBG				0x45280
>  #define  WM_DBG_DISALLOW_MULTIPLE_LP	(1<<0)
>  #define  WM_DBG_DISALLOW_MAXFIFO	(1<<1)
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 80b417a..b8376e1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -319,6 +319,18 @@ struct intel_plane {
>  	unsigned int crtc_w, crtc_h;
>  	uint32_t src_x, src_y;
>  	uint32_t src_w, src_h;
> +
> +	/* Since we need to change the watermarks before/after
> +	 * enabling/disabling the planes, we need to store the parameters here
> +	 * as the other pieces of the struct may not reflect the values we want
> +	 * for the watermark calculations. Currently only Haswell uses this.
> +	 */
> +	struct plane_watermark_parameters {

The type isn't used anywhere, so no need to give it a type name.

> +		bool enable;
> +		int horiz_pixels;
> +		int bytes_per_pixel;
> +	} wm;
> +
>  	void (*update_plane)(struct drm_plane *plane,
>  			     struct drm_framebuffer *fb,
>  			     struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 478518d..afc4705 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2068,20 +2068,236 @@ static void ivybridge_update_wm(struct drm_device *dev)
>  		   cursor_wm);
>  }
>  
> -static void
> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> +static int hsw_wm_get_pixel_rate(struct drm_device *dev,
> +				 struct drm_crtc *crtc)
> +{
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	int pixel_rate, pfit_size;
> +
> +	if (intel_crtc->config.pixel_target_clock)
> +		pixel_rate = intel_crtc->config.pixel_target_clock;
> +	else
> +		pixel_rate = intel_crtc->config.adjusted_mode.clock;
> +
> +	/* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> +	 * adjust the pixel_rate here. */
> +
> +	pfit_size = intel_crtc->config.pch_pfit.size;
> +	if (pfit_size) {
> +		int x, y, crtc_x, crtc_y, hscale, vscale, totscale;
> +
> +		x = (pfit_size >> 16) & 0xFFFF;
> +		y = pfit_size & 0xFFFF;
> +		crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
> +		crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
> +
> +		hscale = crtc_x * 1000 / x;
> +		vscale = crtc_y * 1000 / y;
> +		hscale = (hscale < 1000) ? 1000 : hscale;
> +		vscale = (vscale < 1000) ? 1000 : vscale;
> +		totscale = hscale * vscale / 1000;
> +		pixel_rate = pixel_rate * totscale / 1000;
> +	}
> +
> +	return pixel_rate;
> +}
> +
> +static int hsw_wm_method1(int pixel_rate, int bytes_per_pixel, int latency)
> +{
> +	int ret;
> +
> +	ret = pixel_rate * bytes_per_pixel * latency;
> +	ret = DIV_ROUND_UP(ret, 64 * 10000) + 2;
> +	return ret;
> +}
> +
> +static int hsw_wm_method2(int pixel_rate, int pipe_htotal, int horiz_pixels,
> +			  int bytes_per_pixel, int latency)
> +{
> +	int ret;
> +
> +	ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
> +	ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
> +	ret = DIV_ROUND_UP(ret, 64) + 2;
> +	return ret;
> +}
> +
> +static int hsw_wm_fbc(int pri_val, int horiz_pixels, int bytes_per_pixel)
> +{
> +	return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> +struct hsw_pipe_wm_parameters {
> +	bool active;
> +	bool sprite_enabled;
> +	int pipe_htotal;
> +	int pixel_rate;
> +	int pri_bytes_per_pixel;
> +	int spr_bytes_per_pixel;
> +	int cur_bytes_per_pixel;
> +	int pri_horiz_pixels;
> +	int spr_horiz_pixels;
> +	int cur_horiz_pixels;
> +};
> +
> +struct hsw_wm_maximums {
> +	int pri;
> +	int spr;
> +	int cur;
> +	int fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> +	bool enable;
> +	bool fbc_enable;

This guy seems a bit pointless. You could just delay the fbc wm max
check until the time when you derive the final enable_fbc_wm value.

> +	int pri_val;
> +	int spr_val;
> +	int cur_val;
> +	int fbc_val;
> +};
> +
> +struct hsw_wm_values {
> +	uint32_t wm_pipe[3];
> +	uint32_t wm_lp[3];
> +	uint32_t wm_lp_spr[3];
> +	uint32_t wm_linetime[3];
> +	bool enable_fbc_wm;
> +};
> +
> +static void hsw_compute_lp_wm(int mem_value, struct hsw_wm_maximums *max,
> +			      struct hsw_pipe_wm_parameters *params,
> +			      struct hsw_lp_wm_result *result)
> +{
> +	enum pipe pipe;
> +	int pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +	int method1, method2;
> +
> +	for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> +		struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> +		if (p->active) {
> +			/* TODO: for now, assume the primary plane is always
> +			 * enabled. */
> +			method1 = hsw_wm_method1(p->pixel_rate,
> +						 p->pri_bytes_per_pixel,
> +						 mem_value);
> +			method2 = hsw_wm_method2(p->pixel_rate,
> +						 p->pipe_htotal,
> +						 p->pri_horiz_pixels,
> +						 p->pri_bytes_per_pixel,
> +						 mem_value);
> +			pri_val[pipe] = min(method1, method2);

I think these things could be refactored a bit into
hsw_wm_compute_primary(), hsw_wm_compute_sprite() etc. like I had in
my big rewrite patch.

> +
> +			if (p->sprite_enabled) {
> +				method1 = hsw_wm_method1(p->pixel_rate,
> +							 p->spr_bytes_per_pixel,
> +							 mem_value);
> +				method2 = hsw_wm_method2(p->pixel_rate,
> +							 p->pipe_htotal,
> +							 p->spr_horiz_pixels,
> +							 p->spr_bytes_per_pixel,
> +							 mem_value);
> +				spr_val[pipe] = min(method1, method2);
> +			} else {
> +				spr_val[pipe] = 0;
> +			}
> +
> +			cur_val[pipe] = hsw_wm_method2(p->pixel_rate,
> +						       p->pipe_htotal,
> +						       p->cur_horiz_pixels,
> +						       p->cur_bytes_per_pixel,
> +						       mem_value);
> +			fbc_val[pipe] = hsw_wm_fbc(pri_val[pipe],
> +						   p->pri_horiz_pixels,
> +						   p->pri_bytes_per_pixel);
> +		} else {
> +			pri_val[pipe] = 0;
> +			spr_val[pipe] = 0;
> +			cur_val[pipe] = 0;
> +			fbc_val[pipe] = 0;
> +		}
> +	}
> +
> +	result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> +	result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> +	result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> +	result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> +	if (result->fbc_val > max->fbc) {
> +		result->fbc_enable = false;
> +		result->fbc_val = 0;
> +	} else {
> +		result->fbc_enable = true;
> +	}
> +
> +	result->enable = result->pri_val <= max->pri &&
> +			 result->spr_val <= max->spr &&
> +			 result->cur_val <= max->cur;
> +}
> +
> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> +				    int mem_value, enum pipe pipe,
> +				    struct hsw_pipe_wm_parameters *params)
> +{
> +	int pri_val, cur_val, spr_val, method1, method2;
> +
> +	if (params->active) {
> +		/* TODO: for now, assume the primary plane is always enabled. */
> +		pri_val = hsw_wm_method1(params->pixel_rate,
> +					 params->pri_bytes_per_pixel,
> +					 mem_value);
> +
> +		if (params->sprite_enabled) {
> +			method1 = hsw_wm_method1(params->pixel_rate,
> +						 params->spr_bytes_per_pixel,
> +						 mem_value);
> +			method2 = hsw_wm_method2(params->pixel_rate,
> +						 params->pipe_htotal,
> +						 params->spr_horiz_pixels,
> +						 params->spr_bytes_per_pixel,
> +						 mem_value);
> +			spr_val = min(method1, method2);
> +		} else {
> +			spr_val = 0;
> +		}
> +
> +		cur_val = hsw_wm_method2(params->pixel_rate,
> +					 params->pipe_htotal,
> +					 params->cur_horiz_pixels,
> +					 params->cur_bytes_per_pixel,
> +					 mem_value);
> +	} else {
> +		pri_val = 0;
> +		spr_val = 0;
> +		cur_val = 0;
> +	}
> +
> +	WARN(pri_val > 127,
> +	     "Primary WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +	WARN(spr_val > 127,
> +	     "Sprite WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +	WARN(cur_val > 63,
> +	     "Cursor WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +
> +	return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> +	       (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> +	       cur_val;
> +}
> +
> +static uint32_t
> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	enum pipe pipe = intel_crtc->pipe;
>  	struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
>  	int target_clock;
>  	u32 linetime, ips_linetime;
>  
> -	if (!intel_crtc_active(crtc)) {
> -		I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> -		return;
> -	}
> +	if (!intel_crtc_active(crtc))
> +		return 0;
>  
>  	if (intel_crtc->config.pixel_target_clock)
>  		target_clock = intel_crtc->config.pixel_target_clock;
> @@ -2095,29 +2311,296 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>  	ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
>  					 intel_ddi_get_cdclk_freq(dev_priv));
>  
> -	I915_WRITE(PIPE_WM_LINETIME(pipe),
> -		   PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> -		   PIPE_WM_LINETIME_TIME(linetime));
> +	return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> +	       PIPE_WM_LINETIME_TIME(linetime);
>  }
>  
> -static void haswell_update_wm(struct drm_device *dev)
> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> +				      struct hsw_pipe_wm_parameters *params,
> +				      uint32_t *wm,
> +				      struct hsw_wm_maximums *lp_max_1_2,
> +				      struct hsw_wm_maximums *lp_max_5_6)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_crtc *crtc;
> +	struct drm_plane *plane;
> +	uint64_t sskpd = I915_READ64(MCH_SSKPD);
>  	enum pipe pipe;
> +	int pipes_active = 0, sprites_enabled = 0;
>  
> -	/* Disable the LP WMs before changine the linetime registers. This is
> -	 * just a temporary code that will be replaced soon. */
> -	I915_WRITE(WM3_LP_ILK, 0);
> -	I915_WRITE(WM2_LP_ILK, 0);
> -	I915_WRITE(WM1_LP_ILK, 0);
> +	if ((sskpd >> 56) & 0xFF)
> +		wm[0] = (sskpd >> 56) & 0xFF;
> +	else
> +		wm[0] = sskpd & 0xF;
> +	wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> +	wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> +	wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> +	wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> +
> +	for_each_pipe(pipe) {

list_for_each_entry() would be more consistent with most of our code.

> +		struct intel_crtc *intel_crtc;
> +
> +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> +		intel_crtc = to_intel_crtc(crtc);
> +
> +		params[pipe].active = intel_crtc_active(crtc);
> +		if (!params[pipe].active)
> +			continue;
> +
> +		pipes_active++;
> +
> +		params[pipe].pipe_htotal =
> +			intel_crtc->config.adjusted_mode.htotal;
> +		params[pipe].pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> +		params[pipe].pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> +		params[pipe].cur_bytes_per_pixel = 4;
> +		params[pipe].pri_horiz_pixels =
> +			intel_crtc->config.adjusted_mode.hdisplay;
> +		params[pipe].cur_horiz_pixels = 64;

Could avoid repeating the same params[pipe] stuff with eg.

	struct hsw_pipe_wm_parameters *p = &params[pipe];

	p->foo = bar;
	...

> +	}
> +
> +	list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> +		struct intel_plane *intel_plane = to_intel_plane(plane);
> +
> +		pipe = intel_plane->pipe;
> +		params[pipe].sprite_enabled = intel_plane->wm.enable;
> +		params[pipe].spr_bytes_per_pixel =
> +			intel_plane->wm.bytes_per_pixel;
> +		params[pipe].spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> +		if (params[pipe].sprite_enabled)
> +			sprites_enabled++;
> +	}
> +
> +	if (pipes_active > 1) {
> +		lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
> +		lp_max_1_2->spr = lp_max_5_6->spr = 128;
> +		lp_max_1_2->cur = lp_max_5_6->cur = 64;
> +	} else {
> +		lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> +		lp_max_5_6->pri = sprites_enabled ? 128 : 768;
> +		lp_max_1_2->spr = 384;
> +		lp_max_5_6->spr = 640;
> +		lp_max_1_2->cur = lp_max_5_6->cur = 255;
> +	}
> +	lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
> +}
> +
> +static void hsw_compute_wm_results(struct drm_device *dev,
> +				   struct hsw_pipe_wm_parameters *params,
> +				   uint32_t *wm,
> +				   struct hsw_wm_maximums *lp_maximums,
> +				   struct hsw_wm_values *results)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_crtc *crtc;
> +	struct hsw_lp_wm_result lp_results[4];
> +	enum pipe pipe;
> +	int i;
> +
> +	hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
> +	hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
> +	hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
> +	hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
> +
> +	/* The spec says it is preferred to disable FBC WMs instead of disabling
> +	 * a WM level. */
> +	results->enable_fbc_wm = true;
> +	for (i = 0; i < 4; i++) {
> +		if (lp_results[i].enable && !lp_results[i].fbc_enable) {
> +			results->enable_fbc_wm = false;
> +			break;
> +		}
> +	}
> +
> +	if (lp_results[3].enable) {
> +		results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
> +						  lp_results[3].pri_val,
> +						  lp_results[3].cur_val);
> +		results->wm_lp_spr[2] = lp_results[3].spr_val;
> +	} else if (lp_results[2].enable) {
> +		results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> +						  lp_results[2].pri_val,
> +						  lp_results[2].cur_val);
> +		results->wm_lp_spr[2] = lp_results[2].spr_val;
> +	} else {
> +		results->wm_lp[2] = 0;
> +		results->wm_lp_spr[2] = 0;
> +	}
> +
> +	if (lp_results[3].enable && lp_results[2].enable) {
> +		results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> +						  lp_results[2].pri_val,
> +						  lp_results[2].cur_val);
> +		results->wm_lp_spr[1] = lp_results[2].spr_val;
> +	} else if (!lp_results[3].enable && lp_results[1].enable) {
> +		results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
> +						  lp_results[1].pri_val,
> +						  lp_results[1].cur_val);
> +		results->wm_lp_spr[1] = lp_results[1].spr_val;
> +	} else {
> +		results->wm_lp[1] = 0;
> +		results->wm_lp_spr[1] = 0;
> +	}
> +
> +	if (lp_results[0].enable) {
> +		results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
> +						  lp_results[0].pri_val,
> +						  lp_results[0].cur_val);
> +		results->wm_lp_spr[0] = lp_results[0].spr_val;
> +	} else {
> +		results->wm_lp[0] = 0;
> +		results->wm_lp_spr[0] = 0;
> +	}
> +
> +	for_each_pipe(pipe)
> +		results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> +							     pipe,
> +							     &params[pipe]);
>  
>  	for_each_pipe(pipe) {
>  		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> -		haswell_update_linetime_wm(dev, crtc);
> +		results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> +	}
> +}
> +
> +/* Find the result with the highest level enabled. Check for enable_fbc_wm in
> + * case both are at the same level. Prefer r1 in case they're the same. */
> +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
> +					   struct hsw_wm_values *r2)
> +{
> +	int i, val_r1 = 0, val_r2 = 0;
> +
> +	for (i = 0; i < 3; i++) {
> +		if (r1->wm_lp[i] & WM3_LP_EN)
> +			val_r1 |= (1 << i);
> +		if (r2->wm_lp[i] & WM3_LP_EN)
> +			val_r2 |= (1 << i);
> +	}
> +
> +	if (val_r1 == val_r2) {
> +		if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
> +			return r2;
> +		else
> +			return r1;
> +	} else if (val_r1 > val_r2) {
> +		return r1;
> +	} else {
> +		return r2;
> +	}
> +}
> +
> +/*
> + * The spec says we shouldn't write when we don't need, because every write
> + * causes WMs to be re-evaluated, expending some power.
> + */
> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> +				struct hsw_wm_values *results)
> +{
> +	struct hsw_wm_values previous;
> +	uint32_t val;
> +
> +	val = I915_READ(DISP_ARB_CTL);
> +	if (results->enable_fbc_wm)
> +		val &= ~DISP_FBC_WM_DIS;
> +	else
> +		val |= DISP_FBC_WM_DIS;
> +	I915_WRITE(DISP_ARB_CTL, val);
> +
> +	previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> +	previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> +	previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> +	previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> +	previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> +	previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> +	previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> +	previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> +	previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> +	previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> +	previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> +	previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> +
> +	if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> +	    memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> +	    memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> +	    memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0)
> +		return;
> +
> +	if (previous.wm_lp[2] != 0)
> +		I915_WRITE(WM3_LP_ILK, 0);
> +	if (previous.wm_lp[1] != 0)
> +		I915_WRITE(WM2_LP_ILK, 0);
> +	if (previous.wm_lp[0] != 0)
> +		I915_WRITE(WM1_LP_ILK, 0);
> +
> +	if (previous.wm_pipe[0] != results->wm_pipe[0])
> +		I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> +	if (previous.wm_pipe[1] != results->wm_pipe[1])
> +		I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> +	if (previous.wm_pipe[2] != results->wm_pipe[2])
> +		I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> +
> +	if (previous.wm_linetime[0] != results->wm_linetime[0])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> +	if (previous.wm_linetime[1] != results->wm_linetime[1])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> +	if (previous.wm_linetime[2] != results->wm_linetime[2])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> +
> +	if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> +		I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> +	if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> +		I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> +	if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> +		I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> +
> +	if (results->wm_lp[0] != 0)
> +		I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> +	if (results->wm_lp[1] != 0)
> +		I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> +	if (results->wm_lp[2] != 0)
> +		I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> +}
> +
> +static void haswell_update_wm(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
> +	struct hsw_pipe_wm_parameters params[3];
> +	struct hsw_wm_values results_1_2, results_5_6, *best_results;
> +	uint32_t wm[5];
> +
> +	hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
> +
> +	hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
> +	if (lp_max_1_2.pri != lp_max_5_6.pri) {
> +		hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
> +				       &results_5_6);
> +		best_results = hsw_find_best_result(&results_1_2, &results_5_6);
> +	} else {
> +		best_results = &results_1_2;
>  	}
>  
> -	sandybridge_update_wm(dev);
> +	hsw_write_wm_values(dev_priv, best_results);
> +}
> +
> +static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> +				     uint32_t sprite_width, int pixel_size)
> +{
> +	struct drm_plane *plane;
> +
> +	list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> +		struct intel_plane *intel_plane = to_intel_plane(plane);
> +
> +		if (intel_plane->pipe == pipe) {
> +			intel_plane->wm.enable = true;
> +			intel_plane->wm.horiz_pixels = sprite_width + 1;
> +			intel_plane->wm.bytes_per_pixel = pixel_size;
> +			break;
> +		}
> +	}
> +
> +	haswell_update_wm(dev);
>  }
>  
>  static bool
> @@ -4564,7 +5047,8 @@ void intel_init_pm(struct drm_device *dev)
>  		} else if (IS_HASWELL(dev)) {
>  			if (I915_READ64(MCH_SSKPD)) {
>  				dev_priv->display.update_wm = haswell_update_wm;
> -				dev_priv->display.update_sprite_wm = sandybridge_update_sprite_wm;
> +				dev_priv->display.update_sprite_wm =
> +					haswell_update_sprite_wm;
>  			} else {
>  				DRM_DEBUG_KMS("Failed to read display plane latency. "
>  					      "Disable CxSR\n");
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 19b9cb9..95b39ef 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -335,8 +335,12 @@ ivb_disable_plane(struct drm_plane *plane)
>  
>  	dev_priv->sprite_scaling_enabled &= ~(1 << pipe);
>  
> +	if (IS_HASWELL(dev))
> +		intel_plane->wm.enable = false;
> +
>  	/* potentially re-enable LP watermarks */
> -	if (scaling_was_enabled && !dev_priv->sprite_scaling_enabled)
> +	if ((scaling_was_enabled && !dev_priv->sprite_scaling_enabled) ||
> +	    IS_HASWELL(dev))
>  		intel_update_watermarks(dev);

Maybe just add an 'enable' parameter to update_sprite_wm() instead an
call it for disable_plane too.

>  }
>  
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

      parent reply	other threads:[~2013-05-21 13:08 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-06 20:38 [PATCH] drm/i915: replace snb_update_wm with haswell_update_wm on HSW Paulo Zanoni
2013-05-09 20:13 ` Paulo Zanoni
2013-05-20 15:56   ` Ville Syrjälä
2013-05-21 21:24     ` Paulo Zanoni
2013-05-22  7:25       ` Ville Syrjälä
2013-05-24 15:05         ` Paulo Zanoni
2013-05-24 16:07           ` Daniel Vetter
2013-05-24 16:13             ` Daniel Vetter
2013-05-24 16:20               ` Ville Syrjälä
2013-05-25 16:01                 ` Daniel Vetter
2013-05-21 13:08   ` Ville Syrjälä [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130521130806.GB16772@intel.com \
    --to=ville.syrjala@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=paulo.r.zanoni@intel.com \
    --cc=przanoni@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.