From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Subject: Re: [PATCH] drm/i915: replace snb_update_wm with haswell_update_wm on HSW Date: Tue, 21 May 2013 16:08:06 +0300 Message-ID: <20130521130806.GB16772@intel.com> References: <1367872696-4617-1-git-send-email-przanoni@gmail.com> <1368130422-15392-1-git-send-email-przanoni@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTP id 663EEE615E for ; Tue, 21 May 2013 06:08:17 -0700 (PDT) Content-Disposition: inline In-Reply-To: <1368130422-15392-1-git-send-email-przanoni@gmail.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Paulo Zanoni Cc: intel-gfx@lists.freedesktop.org, Paulo Zanoni List-Id: intel-gfx@lists.freedesktop.org On Thu, May 09, 2013 at 05:13:41PM -0300, Paulo Zanoni wrote: > From: Paulo Zanoni > = > We were previously calling sandybridge_update_wm on HSW, but the SNB > function didn't really match the HSW specification, so we were just > writing the wrong values. For example, I was not seeing any LP > watermark ever enabled. > = > This patch implements the haswell_update_wm function as described in > our specification. The values match the "Haswell Watermark Calculator" > too. > = > With this patch I can finally see us reaching PC7 state with screen > sizes <=3D 1920x1080. > = > The only thing I see we're missing is to properly account for the case > where the primary plane is disabled. We still don't do this and I > believe we'll need some small reworks on intel_sprite.c if we want to > do this correctly, so let's leave this feature for a future patch. > = > v2: - Refactor code in the hope that it will be more useful for > Ville's rework > - Immpletement the 2 pieces missing on v1: sprite watermarks and > support for 5/6 data buffer partitioning. Apart from I said yesterday, this is a rather large patch and could be split up a bit to make it easier to digest. - IMHO the 1/2 vs. 5/6 split stuff could be dropped completely for now since it doesn't even appear functional - the pfit downscaling pixel rate adjustment part at least could be a separate patch A few more bikesheds below. > = > Signed-off-by: Paulo Zanoni > --- > drivers/gpu/drm/i915/i915_reg.h | 7 + > drivers/gpu/drm/i915/intel_drv.h | 12 + > drivers/gpu/drm/i915/intel_pm.c | 522 +++++++++++++++++++++++++++++= ++++-- > drivers/gpu/drm/i915/intel_sprite.c | 6 +- > 4 files changed, 527 insertions(+), 20 deletions(-) > = > Hi > = > I had some discussions with Ville based on his plans to rework the waterm= arks > code, so I decided to do a little refactoring on the previous patch in or= der to > make it easier for him. Now we have a clear split between the 3 steps: (i) > gather the WM parameters, (ii) calculate the WM values and (iii) write the > values to the registers. My idea is that he'll be able to take parts of my > Haswell-specific code and make them become usable by the other hardware > generations. He'll probably have to rename some of the hsw_ structs and m= ove > them around, but I hope my code can be used as a starting point for his r= ework. > = > In addition to the refactoring I also added support for sprite watermarks= and > the 5/6 data buffer partitioning mode, so we should be "feature complete". > = > I checked the values set by the Kernel and they do match the watermarks > calculator. > = > Thanks, > Paulo > = > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_= reg.h > index 92dcbe3..33b5de3 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -3055,6 +3055,10 @@ > #define WM3S_LP_IVB 0x45128 > #define WM1S_LP_EN (1<<31) > = > +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \ > + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \ > + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur)) > + > /* Memory latency timer register */ > #define MLTR_ILK 0x11222 > #define MLTR_WM1_SHIFT 0 > @@ -4938,6 +4942,9 @@ > #define SFUSE_STRAP_DDIC_DETECTED (1<<1) > #define SFUSE_STRAP_DDID_DETECTED (1<<0) > = > +#define WM_MISC 0x45260 > +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0) > + > #define WM_DBG 0x45280 > #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0) > #define WM_DBG_DISALLOW_MAXFIFO (1<<1) > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/inte= l_drv.h > index 80b417a..b8376e1 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -319,6 +319,18 @@ struct intel_plane { > unsigned int crtc_w, crtc_h; > uint32_t src_x, src_y; > uint32_t src_w, src_h; > + > + /* Since we need to change the watermarks before/after > + * enabling/disabling the planes, we need to store the parameters here > + * as the other pieces of the struct may not reflect the values we want > + * for the watermark calculations. Currently only Haswell uses this. > + */ > + struct plane_watermark_parameters { The type isn't used anywhere, so no need to give it a type name. > + bool enable; > + int horiz_pixels; > + int bytes_per_pixel; > + } wm; > + > void (*update_plane)(struct drm_plane *plane, > struct drm_framebuffer *fb, > struct drm_i915_gem_object *obj, > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel= _pm.c > index 478518d..afc4705 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2068,20 +2068,236 @@ static void ivybridge_update_wm(struct drm_devic= e *dev) > cursor_wm); > } > = > -static void > -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc) > +static int hsw_wm_get_pixel_rate(struct drm_device *dev, > + struct drm_crtc *crtc) > +{ > + struct intel_crtc *intel_crtc =3D to_intel_crtc(crtc); > + int pixel_rate, pfit_size; > + > + if (intel_crtc->config.pixel_target_clock) > + pixel_rate =3D intel_crtc->config.pixel_target_clock; > + else > + pixel_rate =3D intel_crtc->config.adjusted_mode.clock; > + > + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to > + * adjust the pixel_rate here. */ > + > + pfit_size =3D intel_crtc->config.pch_pfit.size; > + if (pfit_size) { > + int x, y, crtc_x, crtc_y, hscale, vscale, totscale; > + > + x =3D (pfit_size >> 16) & 0xFFFF; > + y =3D pfit_size & 0xFFFF; > + crtc_x =3D intel_crtc->config.adjusted_mode.hdisplay; > + crtc_y =3D intel_crtc->config.adjusted_mode.vdisplay; > + > + hscale =3D crtc_x * 1000 / x; > + vscale =3D crtc_y * 1000 / y; > + hscale =3D (hscale < 1000) ? 1000 : hscale; > + vscale =3D (vscale < 1000) ? 1000 : vscale; > + totscale =3D hscale * vscale / 1000; > + pixel_rate =3D pixel_rate * totscale / 1000; > + } > + > + return pixel_rate; > +} > + > +static int hsw_wm_method1(int pixel_rate, int bytes_per_pixel, int laten= cy) > +{ > + int ret; > + > + ret =3D pixel_rate * bytes_per_pixel * latency; > + ret =3D DIV_ROUND_UP(ret, 64 * 10000) + 2; > + return ret; > +} > + > +static int hsw_wm_method2(int pixel_rate, int pipe_htotal, int horiz_pix= els, > + int bytes_per_pixel, int latency) > +{ > + int ret; > + > + ret =3D DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate); > + ret =3D ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel; > + ret =3D DIV_ROUND_UP(ret, 64) + 2; > + return ret; > +} > + > +static int hsw_wm_fbc(int pri_val, int horiz_pixels, int bytes_per_pixel) > +{ > + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2; > +} > + > +struct hsw_pipe_wm_parameters { > + bool active; > + bool sprite_enabled; > + int pipe_htotal; > + int pixel_rate; > + int pri_bytes_per_pixel; > + int spr_bytes_per_pixel; > + int cur_bytes_per_pixel; > + int pri_horiz_pixels; > + int spr_horiz_pixels; > + int cur_horiz_pixels; > +}; > + > +struct hsw_wm_maximums { > + int pri; > + int spr; > + int cur; > + int fbc; > +}; > + > +struct hsw_lp_wm_result { > + bool enable; > + bool fbc_enable; This guy seems a bit pointless. You could just delay the fbc wm max check until the time when you derive the final enable_fbc_wm value. > + int pri_val; > + int spr_val; > + int cur_val; > + int fbc_val; > +}; > + > +struct hsw_wm_values { > + uint32_t wm_pipe[3]; > + uint32_t wm_lp[3]; > + uint32_t wm_lp_spr[3]; > + uint32_t wm_linetime[3]; > + bool enable_fbc_wm; > +}; > + > +static void hsw_compute_lp_wm(int mem_value, struct hsw_wm_maximums *max, > + struct hsw_pipe_wm_parameters *params, > + struct hsw_lp_wm_result *result) > +{ > + enum pipe pipe; > + int pri_val[3], spr_val[3], cur_val[3], fbc_val[3]; > + int method1, method2; > + > + for (pipe =3D PIPE_A; pipe <=3D PIPE_C; pipe++) { > + struct hsw_pipe_wm_parameters *p =3D ¶ms[pipe]; > + > + if (p->active) { > + /* TODO: for now, assume the primary plane is always > + * enabled. */ > + method1 =3D hsw_wm_method1(p->pixel_rate, > + p->pri_bytes_per_pixel, > + mem_value); > + method2 =3D hsw_wm_method2(p->pixel_rate, > + p->pipe_htotal, > + p->pri_horiz_pixels, > + p->pri_bytes_per_pixel, > + mem_value); > + pri_val[pipe] =3D min(method1, method2); I think these things could be refactored a bit into hsw_wm_compute_primary(), hsw_wm_compute_sprite() etc. like I had in my big rewrite patch. > + > + if (p->sprite_enabled) { > + method1 =3D hsw_wm_method1(p->pixel_rate, > + p->spr_bytes_per_pixel, > + mem_value); > + method2 =3D hsw_wm_method2(p->pixel_rate, > + p->pipe_htotal, > + p->spr_horiz_pixels, > + p->spr_bytes_per_pixel, > + mem_value); > + spr_val[pipe] =3D min(method1, method2); > + } else { > + spr_val[pipe] =3D 0; > + } > + > + cur_val[pipe] =3D hsw_wm_method2(p->pixel_rate, > + p->pipe_htotal, > + p->cur_horiz_pixels, > + p->cur_bytes_per_pixel, > + mem_value); > + fbc_val[pipe] =3D hsw_wm_fbc(pri_val[pipe], > + p->pri_horiz_pixels, > + p->pri_bytes_per_pixel); > + } else { > + pri_val[pipe] =3D 0; > + spr_val[pipe] =3D 0; > + cur_val[pipe] =3D 0; > + fbc_val[pipe] =3D 0; > + } > + } > + > + result->pri_val =3D max3(pri_val[0], pri_val[1], pri_val[2]); > + result->spr_val =3D max3(spr_val[0], spr_val[1], spr_val[2]); > + result->cur_val =3D max3(cur_val[0], cur_val[1], cur_val[2]); > + result->fbc_val =3D max3(fbc_val[0], fbc_val[1], fbc_val[2]); > + > + if (result->fbc_val > max->fbc) { > + result->fbc_enable =3D false; > + result->fbc_val =3D 0; > + } else { > + result->fbc_enable =3D true; > + } > + > + result->enable =3D result->pri_val <=3D max->pri && > + result->spr_val <=3D max->spr && > + result->cur_val <=3D max->cur; > +} > + > +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv, > + int mem_value, enum pipe pipe, > + struct hsw_pipe_wm_parameters *params) > +{ > + int pri_val, cur_val, spr_val, method1, method2; > + > + if (params->active) { > + /* TODO: for now, assume the primary plane is always enabled. */ > + pri_val =3D hsw_wm_method1(params->pixel_rate, > + params->pri_bytes_per_pixel, > + mem_value); > + > + if (params->sprite_enabled) { > + method1 =3D hsw_wm_method1(params->pixel_rate, > + params->spr_bytes_per_pixel, > + mem_value); > + method2 =3D hsw_wm_method2(params->pixel_rate, > + params->pipe_htotal, > + params->spr_horiz_pixels, > + params->spr_bytes_per_pixel, > + mem_value); > + spr_val =3D min(method1, method2); > + } else { > + spr_val =3D 0; > + } > + > + cur_val =3D hsw_wm_method2(params->pixel_rate, > + params->pipe_htotal, > + params->cur_horiz_pixels, > + params->cur_bytes_per_pixel, > + mem_value); > + } else { > + pri_val =3D 0; > + spr_val =3D 0; > + cur_val =3D 0; > + } > + > + WARN(pri_val > 127, > + "Primary WM error, mode not supported for pipe %c\n", > + pipe_name(pipe)); > + WARN(spr_val > 127, > + "Sprite WM error, mode not supported for pipe %c\n", > + pipe_name(pipe)); > + WARN(cur_val > 63, > + "Cursor WM error, mode not supported for pipe %c\n", > + pipe_name(pipe)); > + > + return (pri_val << WM0_PIPE_PLANE_SHIFT) | > + (spr_val << WM0_PIPE_SPRITE_SHIFT) | > + cur_val; > +} > + > +static uint32_t > +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc) > { > struct drm_i915_private *dev_priv =3D dev->dev_private; > struct intel_crtc *intel_crtc =3D to_intel_crtc(crtc); > - enum pipe pipe =3D intel_crtc->pipe; > struct drm_display_mode *mode =3D &intel_crtc->config.adjusted_mode; > int target_clock; > u32 linetime, ips_linetime; > = > - if (!intel_crtc_active(crtc)) { > - I915_WRITE(PIPE_WM_LINETIME(pipe), 0); > - return; > - } > + if (!intel_crtc_active(crtc)) > + return 0; > = > if (intel_crtc->config.pixel_target_clock) > target_clock =3D intel_crtc->config.pixel_target_clock; > @@ -2095,29 +2311,296 @@ haswell_update_linetime_wm(struct drm_device *de= v, struct drm_crtc *crtc) > ips_linetime =3D DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8, > intel_ddi_get_cdclk_freq(dev_priv)); > = > - I915_WRITE(PIPE_WM_LINETIME(pipe), > - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) | > - PIPE_WM_LINETIME_TIME(linetime)); > + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) | > + PIPE_WM_LINETIME_TIME(linetime); > } > = > -static void haswell_update_wm(struct drm_device *dev) > +static void hsw_compute_wm_parameters(struct drm_device *dev, > + struct hsw_pipe_wm_parameters *params, > + uint32_t *wm, > + struct hsw_wm_maximums *lp_max_1_2, > + struct hsw_wm_maximums *lp_max_5_6) > { > struct drm_i915_private *dev_priv =3D dev->dev_private; > struct drm_crtc *crtc; > + struct drm_plane *plane; > + uint64_t sskpd =3D I915_READ64(MCH_SSKPD); > enum pipe pipe; > + int pipes_active =3D 0, sprites_enabled =3D 0; > = > - /* Disable the LP WMs before changine the linetime registers. This is > - * just a temporary code that will be replaced soon. */ > - I915_WRITE(WM3_LP_ILK, 0); > - I915_WRITE(WM2_LP_ILK, 0); > - I915_WRITE(WM1_LP_ILK, 0); > + if ((sskpd >> 56) & 0xFF) > + wm[0] =3D (sskpd >> 56) & 0xFF; > + else > + wm[0] =3D sskpd & 0xF; > + wm[1] =3D ((sskpd >> 4) & 0xFF) * 5; > + wm[2] =3D ((sskpd >> 12) & 0xFF) * 5; > + wm[3] =3D ((sskpd >> 20) & 0x1FF) * 5; > + wm[4] =3D ((sskpd >> 32) & 0x1FF) * 5; > + > + for_each_pipe(pipe) { list_for_each_entry() would be more consistent with most of our code. > + struct intel_crtc *intel_crtc; > + > + crtc =3D dev_priv->pipe_to_crtc_mapping[pipe]; > + intel_crtc =3D to_intel_crtc(crtc); > + > + params[pipe].active =3D intel_crtc_active(crtc); > + if (!params[pipe].active) > + continue; > + > + pipes_active++; > + > + params[pipe].pipe_htotal =3D > + intel_crtc->config.adjusted_mode.htotal; > + params[pipe].pixel_rate =3D hsw_wm_get_pixel_rate(dev, crtc); > + params[pipe].pri_bytes_per_pixel =3D crtc->fb->bits_per_pixel / 8; > + params[pipe].cur_bytes_per_pixel =3D 4; > + params[pipe].pri_horiz_pixels =3D > + intel_crtc->config.adjusted_mode.hdisplay; > + params[pipe].cur_horiz_pixels =3D 64; Could avoid repeating the same params[pipe] stuff with eg. struct hsw_pipe_wm_parameters *p =3D ¶ms[pipe]; p->foo =3D bar; ... > + } > + > + list_for_each_entry(plane, &dev->mode_config.plane_list, head) { > + struct intel_plane *intel_plane =3D to_intel_plane(plane); > + > + pipe =3D intel_plane->pipe; > + params[pipe].sprite_enabled =3D intel_plane->wm.enable; > + params[pipe].spr_bytes_per_pixel =3D > + intel_plane->wm.bytes_per_pixel; > + params[pipe].spr_horiz_pixels =3D intel_plane->wm.horiz_pixels; > + > + if (params[pipe].sprite_enabled) > + sprites_enabled++; > + } > + > + if (pipes_active > 1) { > + lp_max_1_2->pri =3D lp_max_5_6->pri =3D sprites_enabled ? 128 : 256; > + lp_max_1_2->spr =3D lp_max_5_6->spr =3D 128; > + lp_max_1_2->cur =3D lp_max_5_6->cur =3D 64; > + } else { > + lp_max_1_2->pri =3D sprites_enabled ? 384 : 768; > + lp_max_5_6->pri =3D sprites_enabled ? 128 : 768; > + lp_max_1_2->spr =3D 384; > + lp_max_5_6->spr =3D 640; > + lp_max_1_2->cur =3D lp_max_5_6->cur =3D 255; > + } > + lp_max_1_2->fbc =3D lp_max_5_6->fbc =3D 15; > +} > + > +static void hsw_compute_wm_results(struct drm_device *dev, > + struct hsw_pipe_wm_parameters *params, > + uint32_t *wm, > + struct hsw_wm_maximums *lp_maximums, > + struct hsw_wm_values *results) > +{ > + struct drm_i915_private *dev_priv =3D dev->dev_private; > + struct drm_crtc *crtc; > + struct hsw_lp_wm_result lp_results[4]; > + enum pipe pipe; > + int i; > + > + hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]); > + hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]); > + hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]); > + hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]); > + > + /* The spec says it is preferred to disable FBC WMs instead of disabling > + * a WM level. */ > + results->enable_fbc_wm =3D true; > + for (i =3D 0; i < 4; i++) { > + if (lp_results[i].enable && !lp_results[i].fbc_enable) { > + results->enable_fbc_wm =3D false; > + break; > + } > + } > + > + if (lp_results[3].enable) { > + results->wm_lp[2] =3D HSW_WM_LP_VAL(8, lp_results[3].fbc_val, > + lp_results[3].pri_val, > + lp_results[3].cur_val); > + results->wm_lp_spr[2] =3D lp_results[3].spr_val; > + } else if (lp_results[2].enable) { > + results->wm_lp[2] =3D HSW_WM_LP_VAL(6, lp_results[2].fbc_val, > + lp_results[2].pri_val, > + lp_results[2].cur_val); > + results->wm_lp_spr[2] =3D lp_results[2].spr_val; > + } else { > + results->wm_lp[2] =3D 0; > + results->wm_lp_spr[2] =3D 0; > + } > + > + if (lp_results[3].enable && lp_results[2].enable) { > + results->wm_lp[1] =3D HSW_WM_LP_VAL(6, lp_results[2].fbc_val, > + lp_results[2].pri_val, > + lp_results[2].cur_val); > + results->wm_lp_spr[1] =3D lp_results[2].spr_val; > + } else if (!lp_results[3].enable && lp_results[1].enable) { > + results->wm_lp[1] =3D HSW_WM_LP_VAL(4, lp_results[1].fbc_val, > + lp_results[1].pri_val, > + lp_results[1].cur_val); > + results->wm_lp_spr[1] =3D lp_results[1].spr_val; > + } else { > + results->wm_lp[1] =3D 0; > + results->wm_lp_spr[1] =3D 0; > + } > + > + if (lp_results[0].enable) { > + results->wm_lp[0] =3D HSW_WM_LP_VAL(2, lp_results[0].fbc_val, > + lp_results[0].pri_val, > + lp_results[0].cur_val); > + results->wm_lp_spr[0] =3D lp_results[0].spr_val; > + } else { > + results->wm_lp[0] =3D 0; > + results->wm_lp_spr[0] =3D 0; > + } > + > + for_each_pipe(pipe) > + results->wm_pipe[pipe] =3D hsw_compute_wm_pipe(dev_priv, wm[0], > + pipe, > + ¶ms[pipe]); > = > for_each_pipe(pipe) { > crtc =3D dev_priv->pipe_to_crtc_mapping[pipe]; > - haswell_update_linetime_wm(dev, crtc); > + results->wm_linetime[pipe] =3D hsw_compute_linetime_wm(dev, crtc); > + } > +} > + > +/* Find the result with the highest level enabled. Check for enable_fbc_= wm in > + * case both are at the same level. Prefer r1 in case they're the same. = */ > +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1, > + struct hsw_wm_values *r2) > +{ > + int i, val_r1 =3D 0, val_r2 =3D 0; > + > + for (i =3D 0; i < 3; i++) { > + if (r1->wm_lp[i] & WM3_LP_EN) > + val_r1 |=3D (1 << i); > + if (r2->wm_lp[i] & WM3_LP_EN) > + val_r2 |=3D (1 << i); > + } > + > + if (val_r1 =3D=3D val_r2) { > + if (r2->enable_fbc_wm && !r1->enable_fbc_wm) > + return r2; > + else > + return r1; > + } else if (val_r1 > val_r2) { > + return r1; > + } else { > + return r2; > + } > +} > + > +/* > + * The spec says we shouldn't write when we don't need, because every wr= ite > + * causes WMs to be re-evaluated, expending some power. > + */ > +static void hsw_write_wm_values(struct drm_i915_private *dev_priv, > + struct hsw_wm_values *results) > +{ > + struct hsw_wm_values previous; > + uint32_t val; > + > + val =3D I915_READ(DISP_ARB_CTL); > + if (results->enable_fbc_wm) > + val &=3D ~DISP_FBC_WM_DIS; > + else > + val |=3D DISP_FBC_WM_DIS; > + I915_WRITE(DISP_ARB_CTL, val); > + > + previous.wm_pipe[0] =3D I915_READ(WM0_PIPEA_ILK); > + previous.wm_pipe[1] =3D I915_READ(WM0_PIPEB_ILK); > + previous.wm_pipe[2] =3D I915_READ(WM0_PIPEC_IVB); > + previous.wm_lp[0] =3D I915_READ(WM1_LP_ILK); > + previous.wm_lp[1] =3D I915_READ(WM2_LP_ILK); > + previous.wm_lp[2] =3D I915_READ(WM3_LP_ILK); > + previous.wm_lp_spr[0] =3D I915_READ(WM1S_LP_ILK); > + previous.wm_lp_spr[1] =3D I915_READ(WM2S_LP_IVB); > + previous.wm_lp_spr[2] =3D I915_READ(WM3S_LP_IVB); > + previous.wm_linetime[0] =3D I915_READ(PIPE_WM_LINETIME(PIPE_A)); > + previous.wm_linetime[1] =3D I915_READ(PIPE_WM_LINETIME(PIPE_B)); > + previous.wm_linetime[2] =3D I915_READ(PIPE_WM_LINETIME(PIPE_C)); > + > + if (memcmp(results->wm_pipe, previous.wm_pipe, 3) =3D=3D 0 && > + memcmp(results->wm_lp, previous.wm_lp, 3) =3D=3D 0 && > + memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) =3D=3D 0 && > + memcmp(results->wm_linetime, previous.wm_linetime, 3) =3D=3D 0) > + return; > + > + if (previous.wm_lp[2] !=3D 0) > + I915_WRITE(WM3_LP_ILK, 0); > + if (previous.wm_lp[1] !=3D 0) > + I915_WRITE(WM2_LP_ILK, 0); > + if (previous.wm_lp[0] !=3D 0) > + I915_WRITE(WM1_LP_ILK, 0); > + > + if (previous.wm_pipe[0] !=3D results->wm_pipe[0]) > + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]); > + if (previous.wm_pipe[1] !=3D results->wm_pipe[1]) > + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]); > + if (previous.wm_pipe[2] !=3D results->wm_pipe[2]) > + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]); > + > + if (previous.wm_linetime[0] !=3D results->wm_linetime[0]) > + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]); > + if (previous.wm_linetime[1] !=3D results->wm_linetime[1]) > + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]); > + if (previous.wm_linetime[2] !=3D results->wm_linetime[2]) > + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]); > + > + if (previous.wm_lp_spr[0] !=3D results->wm_lp_spr[0]) > + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]); > + if (previous.wm_lp_spr[1] !=3D results->wm_lp_spr[1]) > + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]); > + if (previous.wm_lp_spr[2] !=3D results->wm_lp_spr[2]) > + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]); > + > + if (results->wm_lp[0] !=3D 0) > + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]); > + if (results->wm_lp[1] !=3D 0) > + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]); > + if (results->wm_lp[2] !=3D 0) > + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]); > +} > + > +static void haswell_update_wm(struct drm_device *dev) > +{ > + struct drm_i915_private *dev_priv =3D dev->dev_private; > + struct hsw_wm_maximums lp_max_1_2, lp_max_5_6; > + struct hsw_pipe_wm_parameters params[3]; > + struct hsw_wm_values results_1_2, results_5_6, *best_results; > + uint32_t wm[5]; > + > + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6); > + > + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2); > + if (lp_max_1_2.pri !=3D lp_max_5_6.pri) { > + hsw_compute_wm_results(dev, params, wm, &lp_max_5_6, > + &results_5_6); > + best_results =3D hsw_find_best_result(&results_1_2, &results_5_6); > + } else { > + best_results =3D &results_1_2; > } > = > - sandybridge_update_wm(dev); > + hsw_write_wm_values(dev_priv, best_results); > +} > + > +static void haswell_update_sprite_wm(struct drm_device *dev, int pipe, > + uint32_t sprite_width, int pixel_size) > +{ > + struct drm_plane *plane; > + > + list_for_each_entry(plane, &dev->mode_config.plane_list, head) { > + struct intel_plane *intel_plane =3D to_intel_plane(plane); > + > + if (intel_plane->pipe =3D=3D pipe) { > + intel_plane->wm.enable =3D true; > + intel_plane->wm.horiz_pixels =3D sprite_width + 1; > + intel_plane->wm.bytes_per_pixel =3D pixel_size; > + break; > + } > + } > + > + haswell_update_wm(dev); > } > = > static bool > @@ -4564,7 +5047,8 @@ void intel_init_pm(struct drm_device *dev) > } else if (IS_HASWELL(dev)) { > if (I915_READ64(MCH_SSKPD)) { > dev_priv->display.update_wm =3D haswell_update_wm; > - dev_priv->display.update_sprite_wm =3D sandybridge_update_sprite_wm; > + dev_priv->display.update_sprite_wm =3D > + haswell_update_sprite_wm; > } else { > DRM_DEBUG_KMS("Failed to read display plane latency. " > "Disable CxSR\n"); > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/i= ntel_sprite.c > index 19b9cb9..95b39ef 100644 > --- a/drivers/gpu/drm/i915/intel_sprite.c > +++ b/drivers/gpu/drm/i915/intel_sprite.c > @@ -335,8 +335,12 @@ ivb_disable_plane(struct drm_plane *plane) > = > dev_priv->sprite_scaling_enabled &=3D ~(1 << pipe); > = > + if (IS_HASWELL(dev)) > + intel_plane->wm.enable =3D false; > + > /* potentially re-enable LP watermarks */ > - if (scaling_was_enabled && !dev_priv->sprite_scaling_enabled) > + if ((scaling_was_enabled && !dev_priv->sprite_scaling_enabled) || > + IS_HASWELL(dev)) > intel_update_watermarks(dev); Maybe just add an 'enable' parameter to update_sprite_wm() instead an call it for disable_plane too. > } > = > -- = > 1.7.10.4 > = > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- = Ville Syrj=E4l=E4 Intel OTC