From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Subject: Re: [PATCH] drm/i915: replace snb_update_wm with
 haswell_update_wm on HSW
Date: Tue, 21 May 2013 16:08:06 +0300
Message-ID: <20130521130806.GB16772@intel.com>
References: <1367872696-4617-1-git-send-email-przanoni@gmail.com>
	<1368130422-15392-1-git-send-email-przanoni@gmail.com>
In-Reply-To: <1368130422-15392-1-git-send-email-przanoni@gmail.com>
To: Paulo Zanoni <przanoni@gmail.com>
Cc: intel-gfx@lists.freedesktop.org, Paulo Zanoni <paulo.r.zanoni@intel.com>
List-Id: intel-gfx@lists.freedesktop.org

On Thu, May 09, 2013 at 05:13:41PM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously calling sandybridge_update_wm on HSW, but the SNB
> function didn't really match the HSW specification, so we were just
> writing the wrong values. For example, I was not seeing any LP
> watermark ever enabled.
>
> This patch implements the haswell_update_wm function as described in
> our specification. The values match the "Haswell Watermark Calculator"
> too.
>
> With this patch I can finally see us reaching PC7 state with screen
> sizes <= 1920x1080.
>
> The only thing I see we're missing is to properly account for the case
> where the primary plane is disabled. We still don't do this and I
> believe we'll need some small reworks on intel_sprite.c if we want to
> do this correctly, so let's leave this feature for a future patch.
>
> v2: - Refactor code in the hope that it will be more useful for
>       Ville's rework
>     - Implement the 2 pieces missing on v1: sprite watermarks and
>       support for 5/6 data buffer partitioning.

Apart from what I said yesterday, this is a rather large patch and could be
split up a bit to make it easier to digest.

- IMHO the 1/2 vs. 5/6 split stuff could be dropped completely for now
  since it doesn't even appear functional
- the pfit downscaling pixel rate adjustment part at least could be a
  separate patch
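
For reference, the pfit part is small enough to live in its own helper.
A standalone sketch of the same arithmetic as in the patch; the helper
name and the explicit width/height arguments are made up for
illustration, and I've used 64-bit intermediates here, which the patch
doesn't:

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale the pixel rate up when the panel fitter is
 * downscaling. Ratios are kept in 1/1000 units and never go below 1.0,
 * so upscaling leaves the pixel rate unchanged.
 */
static uint32_t example_pfit_adjust_pixel_rate(uint32_t pixel_rate,
					       uint32_t pipe_w, uint32_t pipe_h,
					       uint32_t pfit_w, uint32_t pfit_h)
{
	uint64_t hscale, vscale, totscale;

	/* Panel fitter not in use. */
	if (!pfit_w || !pfit_h)
		return pixel_rate;

	hscale = (uint64_t)pipe_w * 1000 / pfit_w;
	vscale = (uint64_t)pipe_h * 1000 / pfit_h;
	if (hscale < 1000)
		hscale = 1000;
	if (vscale < 1000)
		vscale = 1000;

	totscale = hscale * vscale / 1000;

	return (uint32_t)(pixel_rate * totscale / 1000);
}
```

With that split out, the watermark patch itself wouldn't need to know
anything about the panel fitter.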

A few more bikesheds below.

>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h     |    7 +
>  drivers/gpu/drm/i915/intel_drv.h    |   12 +
>  drivers/gpu/drm/i915/intel_pm.c     |  522 +++++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_sprite.c |    6 +-
>  4 files changed, 527 insertions(+), 20 deletions(-)
>
> Hi
>
> I had some discussions with Ville based on his plans to rework the watermarks
> code, so I decided to do a little refactoring on the previous patch in order to
> make it easier for him. Now we have a clear split between the 3 steps: (i)
> gather the WM parameters, (ii) calculate the WM values and (iii) write the
> values to the registers. My idea is that he'll be able to take parts of my
> Haswell-specific code and make them become usable by the other hardware
> generations. He'll probably have to rename some of the hsw_ structs and move
> them around, but I hope my code can be used as a starting point for his rework.
>
> In addition to the refactoring I also added support for sprite watermarks and
> the 5/6 data buffer partitioning mode, so we should be "feature complete".
>
> I checked the values set by the Kernel and they do match the watermarks
> calculator.
>
> Thanks,
> Paulo
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 92dcbe3..33b5de3 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3055,6 +3055,10 @@
>  #define WM3S_LP_IVB		0x45128
>  #define  WM1S_LP_EN		(1<<31)
>  
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> +	(WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> +	 ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
>  /* Memory latency timer register */
>  #define MLTR_ILK		0x11222
>  #define  MLTR_WM1_SHIFT		0
> @@ -4938,6 +4942,9 @@
>  #define  SFUSE_STRAP_DDIC_DETECTED	(1<<1)
>  #define  SFUSE_STRAP_DDID_DETECTED	(1<<0)
>  
> +#define WM_MISC				0x45260
> +#define  WM_MISC_DATA_PARTITION_5_6	(1 << 0)
> +
>  #define WM_DBG				0x45280
>  #define  WM_DBG_DISALLOW_MULTIPLE_LP	(1<<0)
>  #define  WM_DBG_DISALLOW_MAXFIFO	(1<<1)
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 80b417a..b8376e1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -319,6 +319,18 @@ struct intel_plane {
>  	unsigned int crtc_w, crtc_h;
>  	uint32_t src_x, src_y;
>  	uint32_t src_w, src_h;
> +
> +	/* Since we need to change the watermarks before/after
> +	 * enabling/disabling the planes, we need to store the parameters here
> +	 * as the other pieces of the struct may not reflect the values we want
> +	 * for the watermark calculations. Currently only Haswell uses this.
> +	 */
> +	struct plane_watermark_parameters {

The type isn't used anywhere, so no need to give it a type name.
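
Something along these lines would do. This is a minimal standalone
sketch, with the surrounding struct cut down to a hypothetical
example_plane rather than the real intel_plane:

```c
#include <stdbool.h>

/* Hypothetical cut-down stand-in for struct intel_plane. */
struct example_plane {
	unsigned int crtc_w, crtc_h;

	/* Anonymous struct type: only the member name 'wm' is needed. */
	struct {
		bool enable;
		int horiz_pixels;
		int bytes_per_pixel;
	} wm;
};

/* The fields are still reached through the member name. */
static inline void example_plane_wm_disable(struct example_plane *plane)
{
	plane->wm.enable = false;
}
```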

> +		bool enable;
> +		int horiz_pixels;
> +		int bytes_per_pixel;
> +	} wm;
> +
>  	void (*update_plane)(struct drm_plane *plane,
>  			     struct drm_framebuffer *fb,
>  			     struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 478518d..afc4705 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2068,20 +2068,236 @@ static void ivybridge_update_wm(struct drm_device *dev)
>  		   cursor_wm);
>  }
>  
> -static void
> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> +static int hsw_wm_get_pixel_rate(struct drm_device *dev,
> +				 struct drm_crtc *crtc)
> +{
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	int pixel_rate, pfit_size;
> +
> +	if (intel_crtc->config.pixel_target_clock)
> +		pixel_rate = intel_crtc->config.pixel_target_clock;
> +	else
> +		pixel_rate = intel_crtc->config.adjusted_mode.clock;
> +
> +	/* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> +	 * adjust the pixel_rate here. */
> +
> +	pfit_size = intel_crtc->config.pch_pfit.size;
> +	if (pfit_size) {
> +		int x, y, crtc_x, crtc_y, hscale, vscale, totscale;
> +
> +		x = (pfit_size >> 16) & 0xFFFF;
> +		y = pfit_size & 0xFFFF;
> +		crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
> +		crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
> +
> +		hscale = crtc_x * 1000 / x;
> +		vscale = crtc_y * 1000 / y;
> +		hscale = (hscale < 1000) ? 1000 : hscale;
> +		vscale = (vscale < 1000) ? 1000 : vscale;
> +		totscale = hscale * vscale / 1000;
> +		pixel_rate = pixel_rate * totscale / 1000;
> +	}
> +
> +	return pixel_rate;
> +}
> +
> +static int hsw_wm_method1(int pixel_rate, int bytes_per_pixel, int latency)
> +{
> +	int ret;
> +
> +	ret = pixel_rate * bytes_per_pixel * latency;
> +	ret = DIV_ROUND_UP(ret, 64 * 10000) + 2;
> +	return ret;
> +}
> +
> +static int hsw_wm_method2(int pixel_rate, int pipe_htotal, int horiz_pixels,
> +			  int bytes_per_pixel, int latency)
> +{
> +	int ret;
> +
> +	ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
> +	ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
> +	ret = DIV_ROUND_UP(ret, 64) + 2;
> +	return ret;
> +}
> +
> +static int hsw_wm_fbc(int pri_val, int horiz_pixels, int bytes_per_pixel)
> +{
> +	return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> +struct hsw_pipe_wm_parameters {
> +	bool active;
> +	bool sprite_enabled;
> +	int pipe_htotal;
> +	int pixel_rate;
> +	int pri_bytes_per_pixel;
> +	int spr_bytes_per_pixel;
> +	int cur_bytes_per_pixel;
> +	int pri_horiz_pixels;
> +	int spr_horiz_pixels;
> +	int cur_horiz_pixels;
> +};
> +
> +struct hsw_wm_maximums {
> +	int pri;
> +	int spr;
> +	int cur;
> +	int fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> +	bool enable;
> +	bool fbc_enable;

This guy seems a bit pointless. You could just delay the fbc wm max
check until the time when you derive the final enable_fbc_wm value.
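
Roughly what I have in mind, as a standalone sketch. The structs are
cut-down copies of the ones in the patch with fbc_enable dropped, and
the helper name hsw_fbc_wm_ok() is made up:

```c
#include <stdbool.h>

/* Cut-down copies of the structs from the patch, without fbc_enable. */
struct hsw_wm_maximums {
	int pri, spr, cur, fbc;
};

struct hsw_lp_wm_result {
	bool enable;
	int pri_val, spr_val, cur_val, fbc_val;
};

/*
 * Hypothetical helper: FBC WMs stay enabled only if every enabled LP
 * level has an FBC value within the maximum.
 */
static bool hsw_fbc_wm_ok(const struct hsw_lp_wm_result *lp, int num_levels,
			  const struct hsw_wm_maximums *max)
{
	int i;

	for (i = 0; i < num_levels; i++) {
		if (lp[i].enable && lp[i].fbc_val > max->fbc)
			return false;
	}

	return true;
}
```

The caller would then do enable_fbc_wm = hsw_fbc_wm_ok(...) in one
place. Zeroing fbc_val for levels over the limit would have to move to
wherever the register value is assembled.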

> +	int pri_val;
> +	int spr_val;
> +	int cur_val;
> +	int fbc_val;
> +};
> +
> +struct hsw_wm_values {
> +	uint32_t wm_pipe[3];
> +	uint32_t wm_lp[3];
> +	uint32_t wm_lp_spr[3];
> +	uint32_t wm_linetime[3];
> +	bool enable_fbc_wm;
> +};
> +
> +static void hsw_compute_lp_wm(int mem_value, struct hsw_wm_maximums *max,
> +			      struct hsw_pipe_wm_parameters *params,
> +			      struct hsw_lp_wm_result *result)
> +{
> +	enum pipe pipe;
> +	int pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +	int method1, method2;
> +
> +	for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> +		struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> +		if (p->active) {
> +			/* TODO: for now, assume the primary plane is always
> +			 * enabled. */
> +			method1 = hsw_wm_method1(p->pixel_rate,
> +						 p->pri_bytes_per_pixel,
> +						 mem_value);
> +			method2 = hsw_wm_method2(p->pixel_rate,
> +						 p->pipe_htotal,
> +						 p->pri_horiz_pixels,
> +						 p->pri_bytes_per_pixel,
> +						 mem_value);
> +			pri_val[pipe] = min(method1, method2);

I think these things could be refactored a bit into
hsw_wm_compute_primary(), hsw_wm_compute_sprite() etc. like I had in
my big rewrite patch.
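
A rough standalone sketch of the shape I mean. The names and signatures
are illustrative, not copied from any posted patch; method1/method2 are
taken from this patch, and DIV_ROUND_UP is redefined locally only so the
snippet compiles on its own:

```c
#include <stdbool.h>

/* Local stand-in for the kernel macro of the same name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Cut-down copy of the per-pipe parameters from the patch. */
struct hsw_pipe_wm_parameters {
	bool active;
	bool sprite_enabled;
	int pipe_htotal;
	int pixel_rate;
	int pri_bytes_per_pixel, spr_bytes_per_pixel;
	int pri_horiz_pixels, spr_horiz_pixels;
};

static int hsw_wm_method1(int pixel_rate, int bytes_per_pixel, int latency)
{
	int ret = pixel_rate * bytes_per_pixel * latency;

	return DIV_ROUND_UP(ret, 64 * 10000) + 2;
}

static int hsw_wm_method2(int pixel_rate, int pipe_htotal, int horiz_pixels,
			  int bytes_per_pixel, int latency)
{
	int ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);

	ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
	return DIV_ROUND_UP(ret, 64) + 2;
}

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* One helper per plane type; each handles its own "disabled" case. */
static int hsw_wm_compute_primary(const struct hsw_pipe_wm_parameters *p,
				  int mem_value)
{
	if (!p->active)
		return 0;

	return min_int(hsw_wm_method1(p->pixel_rate, p->pri_bytes_per_pixel,
				      mem_value),
		       hsw_wm_method2(p->pixel_rate, p->pipe_htotal,
				      p->pri_horiz_pixels,
				      p->pri_bytes_per_pixel, mem_value));
}

static int hsw_wm_compute_sprite(const struct hsw_pipe_wm_parameters *p,
				 int mem_value)
{
	if (!p->active || !p->sprite_enabled)
		return 0;

	return min_int(hsw_wm_method1(p->pixel_rate, p->spr_bytes_per_pixel,
				      mem_value),
		       hsw_wm_method2(p->pixel_rate, p->pipe_htotal,
				      p->spr_horiz_pixels,
				      p->spr_bytes_per_pixel, mem_value));
}
```

The per-pipe loop then shrinks to one call per plane, and the
active/enabled checks aren't duplicated between the LP and pipe
watermark paths.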

> +
> +			if (p->sprite_enabled) {
> +				method1 = hsw_wm_method1(p->pixel_rate,
> +							 p->spr_bytes_per_pixel,
> +							 mem_value);
> +				method2 = hsw_wm_method2(p->pixel_rate,
> +							 p->pipe_htotal,
> +							 p->spr_horiz_pixels,
> +							 p->spr_bytes_per_pixel,
> +							 mem_value);
> +				spr_val[pipe] = min(method1, method2);
> +			} else {
> +				spr_val[pipe] = 0;
> +			}
> +
> +			cur_val[pipe] = hsw_wm_method2(p->pixel_rate,
> +						       p->pipe_htotal,
> +						       p->cur_horiz_pixels,
> +						       p->cur_bytes_per_pixel,
> +						       mem_value);
> +			fbc_val[pipe] = hsw_wm_fbc(pri_val[pipe],
> +						   p->pri_horiz_pixels,
> +						   p->pri_bytes_per_pixel);
> +		} else {
> +			pri_val[pipe] = 0;
> +			spr_val[pipe] = 0;
> +			cur_val[pipe] = 0;
> +			fbc_val[pipe] = 0;
> +		}
> +	}
> +
> +	result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> +	result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> +	result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> +	result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> +	if (result->fbc_val > max->fbc) {
> +		result->fbc_enable = false;
> +		result->fbc_val = 0;
> +	} else {
> +		result->fbc_enable = true;
> +	}
> +
> +	result->enable = result->pri_val <= max->pri &&
> +			 result->spr_val <= max->spr &&
> +			 result->cur_val <= max->cur;
> +}
> +
> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> +				    int mem_value, enum pipe pipe,
> +				    struct hsw_pipe_wm_parameters *params)
> +{
> +	int pri_val, cur_val, spr_val, method1, method2;
> +
> +	if (params->active) {
> +		/* TODO: for now, assume the primary plane is always enabled. */
> +		pri_val = hsw_wm_method1(params->pixel_rate,
> +					 params->pri_bytes_per_pixel,
> +					 mem_value);
> +
> +		if (params->sprite_enabled) {
> +			method1 = hsw_wm_method1(params->pixel_rate,
> +						 params->spr_bytes_per_pixel,
> +						 mem_value);
> +			method2 = hsw_wm_method2(params->pixel_rate,
> +						 params->pipe_htotal,
> +						 params->spr_horiz_pixels,
> +						 params->spr_bytes_per_pixel,
> +						 mem_value);
> +			spr_val = min(method1, method2);
> +		} else {
> +			spr_val = 0;
> +		}
> +
> +		cur_val = hsw_wm_method2(params->pixel_rate,
> +					 params->pipe_htotal,
> +					 params->cur_horiz_pixels,
> +					 params->cur_bytes_per_pixel,
> +					 mem_value);
> +	} else {
> +		pri_val = 0;
> +		spr_val = 0;
> +		cur_val = 0;
> +	}
> +
> +	WARN(pri_val > 127,
> +	     "Primary WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +	WARN(spr_val > 127,
> +	     "Sprite WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +	WARN(cur_val > 63,
> +	     "Cursor WM error, mode not supported for pipe %c\n",
> +	     pipe_name(pipe));
> +
> +	return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> +	       (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> +	       cur_val;
> +}
> +
> +static uint32_t
> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	enum pipe pipe = intel_crtc->pipe;
>  	struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
>  	int target_clock;
>  	u32 linetime, ips_linetime;
>  
> -	if (!intel_crtc_active(crtc)) {
> -		I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> -		return;
> -	}
> +	if (!intel_crtc_active(crtc))
> +		return 0;
>  
>  	if (intel_crtc->config.pixel_target_clock)
>  		target_clock = intel_crtc->config.pixel_target_clock;
> @@ -2095,29 +2311,296 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>  	ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
>  					 intel_ddi_get_cdclk_freq(dev_priv));
>  
> -	I915_WRITE(PIPE_WM_LINETIME(pipe),
> -		   PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> -		   PIPE_WM_LINETIME_TIME(linetime));
> +	return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> +	       PIPE_WM_LINETIME_TIME(linetime);
>  }
>  
> -static void haswell_update_wm(struct drm_device *dev)
> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> +				      struct hsw_pipe_wm_parameters *params,
> +				      uint32_t *wm,
> +				      struct hsw_wm_maximums *lp_max_1_2,
> +				      struct hsw_wm_maximums *lp_max_5_6)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_crtc *crtc;
> +	struct drm_plane *plane;
> +	uint64_t sskpd = I915_READ64(MCH_SSKPD);
>  	enum pipe pipe;
> +	int pipes_active = 0, sprites_enabled = 0;
>  
> -	/* Disable the LP WMs before changine the linetime registers. This is
> -	 * just a temporary code that will be replaced soon. */
> -	I915_WRITE(WM3_LP_ILK, 0);
> -	I915_WRITE(WM2_LP_ILK, 0);
> -	I915_WRITE(WM1_LP_ILK, 0);
> +	if ((sskpd >> 56) & 0xFF)
> +		wm[0] = (sskpd >> 56) & 0xFF;
> +	else
> +		wm[0] = sskpd & 0xF;
> +	wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> +	wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> +	wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> +	wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> +
> +	for_each_pipe(pipe) {

list_for_each_entry() would be more consistent with most of our code.

> +		struct intel_crtc *intel_crtc;
> +
> +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> +		intel_crtc = to_intel_crtc(crtc);
> +
> +		params[pipe].active = intel_crtc_active(crtc);
> +		if (!params[pipe].active)
> +			continue;
> +
> +		pipes_active++;
> +
> +		params[pipe].pipe_htotal =
> +			intel_crtc->config.adjusted_mode.htotal;
> +		params[pipe].pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> +		params[pipe].pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> +		params[pipe].cur_bytes_per_pixel = 4;
> +		params[pipe].pri_horiz_pixels =
> +			intel_crtc->config.adjusted_mode.hdisplay;
> +		params[pipe].cur_horiz_pixels = 64;

You could avoid repeating the same params[pipe] stuff with e.g.

	struct hsw_pipe_wm_parameters *p = &params[pipe];

	p->foo = bar;
	...

> +	}
> +
> +	list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> +		struct intel_plane *intel_plane = to_intel_plane(plane);
> +
> +		pipe = intel_plane->pipe;
> +		params[pipe].sprite_enabled = intel_plane->wm.enable;
> +		params[pipe].spr_bytes_per_pixel =
> +			intel_plane->wm.bytes_per_pixel;
> +		params[pipe].spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> +		if (params[pipe].sprite_enabled)
> +			sprites_enabled++;
> +	}
> +
> +	if (pipes_active > 1) {
> +		lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
> +		lp_max_1_2->spr = lp_max_5_6->spr = 128;
> +		lp_max_1_2->cur = lp_max_5_6->cur = 64;
> +	} else {
> +		lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> +		lp_max_5_6->pri = sprites_enabled ? 128 : 768;
> +		lp_max_1_2->spr = 384;
> +		lp_max_5_6->spr = 640;
> +		lp_max_1_2->cur = lp_max_5_6->cur = 255;
> +	}
> +	lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
> +}
> +
> +static void hsw_compute_wm_results(struct drm_device *dev,
> +				   struct hsw_pipe_wm_parameters *params,
> +				   uint32_t *wm,
> +				   struct hsw_wm_maximums *lp_maximums,
> +				   struct hsw_wm_values *results)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_crtc *crtc;
> +	struct hsw_lp_wm_result lp_results[4];
> +	enum pipe pipe;
> +	int i;
> +
> +	hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
> +	hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
> +	hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
> +	hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
> +
> +	/* The spec says it is preferred to disable FBC WMs instead of disabling
> +	 * a WM level. */
> +	results->enable_fbc_wm = true;
> +	for (i = 0; i < 4; i++) {
> +		if (lp_results[i].enable && !lp_results[i].fbc_enable) {
> +			results->enable_fbc_wm = false;
> +			break;
> +		}
> +	}
> +
> +	if (lp_results[3].enable) {
> +		results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
> +						  lp_results[3].pri_val,
> +						  lp_results[3].cur_val);
> +		results->wm_lp_spr[2] = lp_results[3].spr_val;
> +	} else if (lp_results[2].enable) {
> +		results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> +						  lp_results[2].pri_val,
> +						  lp_results[2].cur_val);
> +		results->wm_lp_spr[2] = lp_results[2].spr_val;
> +	} else {
> +		results->wm_lp[2] = 0;
> +		results->wm_lp_spr[2] = 0;
> +	}
> +
> +	if (lp_results[3].enable && lp_results[2].enable) {
> +		results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> +						  lp_results[2].pri_val,
> +						  lp_results[2].cur_val);
> +		results->wm_lp_spr[1] = lp_results[2].spr_val;
> +	} else if (!lp_results[3].enable && lp_results[1].enable) {
> +		results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
> +						  lp_results[1].pri_val,
> +						  lp_results[1].cur_val);
> +		results->wm_lp_spr[1] = lp_results[1].spr_val;
> +	} else {
> +		results->wm_lp[1] = 0;
> +		results->wm_lp_spr[1] = 0;
> +	}
> +
> +	if (lp_results[0].enable) {
> +		results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
> +						  lp_results[0].pri_val,
> +						  lp_results[0].cur_val);
> +		results->wm_lp_spr[0] = lp_results[0].spr_val;
> +	} else {
> +		results->wm_lp[0] = 0;
> +		results->wm_lp_spr[0] = 0;
> +	}
> +
> +	for_each_pipe(pipe)
> +		results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> +							     pipe,
> +							     &params[pipe]);
>  
>  	for_each_pipe(pipe) {
>  		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> -		haswell_update_linetime_wm(dev, crtc);
> +		results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> +	}
> +}
> +
> +/* Find the result with the highest level enabled. Check for enable_fbc_wm in
> + * case both are at the same level. Prefer r1 in case they're the same. */
> +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
> +					   struct hsw_wm_values *r2)
> +{
> +	int i, val_r1 = 0, val_r2 = 0;
> +
> +	for (i = 0; i < 3; i++) {
> +		if (r1->wm_lp[i] & WM3_LP_EN)
> +			val_r1 |= (1 << i);
> +		if (r2->wm_lp[i] & WM3_LP_EN)
> +			val_r2 |= (1 << i);
> +	}
> +
> +	if (val_r1 == val_r2) {
> +		if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
> +			return r2;
> +		else
> +			return r1;
> +	} else if (val_r1 > val_r2) {
> +		return r1;
> +	} else {
> +		return r2;
> +	}
> +}
> +
> +/*
> + * The spec says we shouldn't write when we don't need, because every write
> + * causes WMs to be re-evaluated, expending some power.
> + */
> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> +				struct hsw_wm_values *results)
> +{
> +	struct hsw_wm_values previous;
> +	uint32_t val;
> +
> +	val = I915_READ(DISP_ARB_CTL);
> +	if (results->enable_fbc_wm)
> +		val &= ~DISP_FBC_WM_DIS;
> +	else
> +		val |= DISP_FBC_WM_DIS;
> +	I915_WRITE(DISP_ARB_CTL, val);
> +
> +	previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> +	previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> +	previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> +	previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> +	previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> +	previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> +	previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> +	previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> +	previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> +	previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> +	previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> +	previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> +
> +	if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> +	    memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> +	    memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> +	    memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0)
> +		return;
> +
> +	if (previous.wm_lp[2] != 0)
> +		I915_WRITE(WM3_LP_ILK, 0);
> +	if (previous.wm_lp[1] != 0)
> +		I915_WRITE(WM2_LP_ILK, 0);
> +	if (previous.wm_lp[0] != 0)
> +		I915_WRITE(WM1_LP_ILK, 0);
> +
> +	if (previous.wm_pipe[0] != results->wm_pipe[0])
> +		I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> +	if (previous.wm_pipe[1] != results->wm_pipe[1])
> +		I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> +	if (previous.wm_pipe[2] != results->wm_pipe[2])
> +		I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> +
> +	if (previous.wm_linetime[0] != results->wm_linetime[0])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> +	if (previous.wm_linetime[1] != results->wm_linetime[1])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> +	if (previous.wm_linetime[2] != results->wm_linetime[2])
> +		I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> +
> +	if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> +		I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> +	if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> +		I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> +	if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> +		I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> +
> +	if (results->wm_lp[0] != 0)
> +		I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> +	if (results->wm_lp[1] != 0)
> +		I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> +	if (results->wm_lp[2] != 0)
> +		I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> +}
> +
> +static void haswell_update_wm(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
> +	struct hsw_pipe_wm_parameters params[3];
> +	struct hsw_wm_values results_1_2, results_5_6, *best_results;
> +	uint32_t wm[5];
> +
> +	hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
> +
> +	hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
> +	if (lp_max_1_2.pri != lp_max_5_6.pri) {
> +		hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
> +				       &results_5_6);
> +		best_results = hsw_find_best_result(&results_1_2, &results_5_6);
> +	} else {
> +		best_results = &results_1_2;
>  	}
>  
> -	sandybridge_update_wm(dev);
> +	hsw_write_wm_values(dev_priv, best_results);
> +}
> +
> +static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> +				     uint32_t sprite_width, int pixel_size)
> +{
> +	struct drm_plane *plane;
> +
> +	list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> +		struct intel_plane *intel_plane = to_intel_plane(plane);
> +
> +		if (intel_plane->pipe == pipe) {
> +			intel_plane->wm.enable = true;
> +			intel_plane->wm.horiz_pixels = sprite_width + 1;
> +			intel_plane->wm.bytes_per_pixel = pixel_size;
> +			break;
> +		}
> +	}
> +
> +	haswell_update_wm(dev);
>  }
>  
>  static bool
> @@ -4564,7 +5047,8 @@ void intel_init_pm(struct drm_device *dev)
>  		} else if (IS_HASWELL(dev)) {
>  			if (I915_READ64(MCH_SSKPD)) {
>  				dev_priv->display.update_wm = haswell_update_wm;
> -				dev_priv->display.update_sprite_wm = sandybridge_update_sprite_wm;
> +				dev_priv->display.update_sprite_wm =
> +					haswell_update_sprite_wm;
>  			} else {
>  				DRM_DEBUG_KMS("Failed to read display plane latency. "
>  					      "Disable CxSR\n");
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 19b9cb9..95b39ef 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -335,8 +335,12 @@ ivb_disable_plane(struct drm_plane *plane)
>  
>  	dev_priv->sprite_scaling_enabled &= ~(1 << pipe);
>  
> +	if (IS_HASWELL(dev))
> +		intel_plane->wm.enable = false;
> +
>  	/* potentially re-enable LP watermarks */
> -	if (scaling_was_enabled && !dev_priv->sprite_scaling_enabled)
> +	if ((scaling_was_enabled && !dev_priv->sprite_scaling_enabled) ||
> +	    IS_HASWELL(dev))
>  		intel_update_watermarks(dev);

Maybe just add an 'enable' parameter to update_sprite_wm() instead, and
call it for disable_plane too.
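
As a standalone sketch of what I mean; the struct and function names are
made up for illustration and this is not the real hook signature:

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for the wm fields the patch adds to struct intel_plane. */
struct example_plane_wm {
	bool enable;
	int horiz_pixels;
	int bytes_per_pixel;
};

/*
 * Hypothetical variant of the update_sprite_wm() hook with an 'enable'
 * argument, so that the enable and disable paths both go through it.
 */
static void example_update_sprite_wm(struct example_plane_wm *wm,
				     uint32_t sprite_width, int pixel_size,
				     bool enable)
{
	wm->enable = enable;
	wm->horiz_pixels = sprite_width + 1;
	wm->bytes_per_pixel = pixel_size;

	/* The real hook would recompute and write the watermarks here. */
}
```

That way ivb_disable_plane() wouldn't need the IS_HASWELL() checks, and
the wm state would only ever be touched from one place.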

>  }
>  
> -- 
> 1.7.10.4
>

-- 
Ville Syrjälä
Intel OTC