* [PATCH 0/5] drm/i915: Optimize plane updates a bit @ 2017-03-09 15:44 ville.syrjala 2017-03-09 15:44 ` [PATCH 1/5] drm/i915: Use I915_READ_FW in i915_get_vblank_counter() ville.syrjala ` (8 more replies) 0 siblings, 9 replies; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Now that commit e1edbd44e23b ("drm/i915: Complain if we take too long under vblank evasion.") has expose just how badly we suck, it seems like a good time to optimize things a little bit. Prior to this one of my VLV machines exceed the 100 usec vblank evade deadline pretty regularly, and at ever boot I was hitting numbers as high as 500 usec. Granted that was with lockdep and all kinds of other debug things enabled. After these changes things seem stay below the 33 usec mark, and with all that debug junk enabled we seem to staying below 22 usec. Note that I was testing single plane updates mostly, so I'm not 100% sure multi plane updates couldn't still exceed the deadline. That will need to be checked. Ville Syrjälä (5): drm/i915: Use I915_READ_FW in i915_get_vblank_counter() drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a drm/i915: Organize plane register writes into tighter bunches drm/i915: Use I915_READ_FW for plane updates drm/i915: Optimize VLV/CHV display FIFO updates drivers/gpu/drm/i915/i915_drv.c | 1 - drivers/gpu/drm/i915/i915_drv.h | 3 - drivers/gpu/drm/i915/i915_irq.c | 14 +- drivers/gpu/drm/i915/intel_display.c | 176 ++++++++++++++--------- drivers/gpu/drm/i915/intel_pm.c | 36 +++-- drivers/gpu/drm/i915/intel_sprite.c | 269 ++++++++++++++++++++--------------- 6 files changed, 290 insertions(+), 209 deletions(-) -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/5] drm/i915: Use I915_READ_FW in i915_get_vblank_counter() 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala @ 2017-03-09 15:44 ` ville.syrjala 2017-03-09 15:44 ` [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a ville.syrjala ` (7 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Optimize the multi-register read in i915_get_vblank_counter() a little bit by grabbing the uncore lock manually and using I915_READ_FW(). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/i915_irq.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 22ac70374ac1..70214e1fcd10 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -728,6 +728,7 @@ static u32 i915_get_vblank_counter(struct drm_device *dev, unsigned int pipe) struct intel_crtc *intel_crtc = intel_get_crtc_for_pipe(dev_priv, pipe); const struct drm_display_mode *mode = &intel_crtc->base.hwmode; + unsigned long irqflags; htotal = mode->crtc_htotal; hsync_start = mode->crtc_hsync_start; @@ -744,17 +745,21 @@ static u32 i915_get_vblank_counter(struct drm_device *dev, unsigned int pipe) high_frame = PIPEFRAME(pipe); low_frame = PIPEFRAMEPIXEL(pipe); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + /* * High & low register fields aren't synchronized, so make sure * we get a low value that's stable across two reads of the high * register. */ do { - high1 = I915_READ(high_frame) & PIPE_FRAME_HIGH_MASK; - low = I915_READ(low_frame); - high2 = I915_READ(high_frame) & PIPE_FRAME_HIGH_MASK; + high1 = I915_READ_FW(high_frame) & PIPE_FRAME_HIGH_MASK; + low = I915_READ_FW(low_frame); + high2 = I915_READ_FW(high_frame) & PIPE_FRAME_HIGH_MASK; } while (high1 != high2); + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); + high1 >>= PIPE_FRAME_HIGH_SHIFT; pixel = low & PIPE_PIXEL_MASK; low >>= PIPE_FRAME_LOW_SHIFT; -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala 2017-03-09 15:44 ` [PATCH 1/5] drm/i915: Use I915_READ_FW in i915_get_vblank_counter() ville.syrjala @ 2017-03-09 15:44 ` ville.syrjala 2017-03-13 9:29 ` Mika Kahola 2017-03-09 15:44 ` [PATCH 3/5] drm/i915: Organize plane register writes into tighter bunches ville.syrjala ` (6 subsequent siblings) 8 siblings, 1 reply; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Replace __raw_i915_read32() with I915_READ_FW() in the workaround for the SKL+ scanline counter hardware fail. The two are the same thing but everyone else uses I915_READ_FW() so let's follow suit. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/i915_irq.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 70214e1fcd10..3c2109b4be2c 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -817,8 +817,7 @@ static int __intel_get_crtc_scanline(struct intel_crtc *crtc) for (i = 0; i < 100; i++) { udelay(1); - temp = __raw_i915_read32(dev_priv, PIPEDSL(pipe)) & - DSL_LINEMASK_GEN3; + temp = I915_READ_FW(PIPEDSL(pipe)) & DSL_LINEMASK_GEN3; if (temp != position) { position = temp; break; -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a 2017-03-09 15:44 ` [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a ville.syrjala @ 2017-03-13 9:29 ` Mika Kahola 0 siblings, 0 replies; 15+ messages in thread From: Mika Kahola @ 2017-03-13 9:29 UTC (permalink / raw) To: ville.syrjala, intel-gfx On Thu, 2017-03-09 at 17:44 +0200, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Replace __raw_i915_read32() with I915_READ_FW() in the workaround for > the SKL+ scanline counter hardware fail. The two are the same thing > but everyone else uses I915_READ_FW() so let's follow suit. > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> > --- > drivers/gpu/drm/i915/i915_irq.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_irq.c > b/drivers/gpu/drm/i915/i915_irq.c > index 70214e1fcd10..3c2109b4be2c 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -817,8 +817,7 @@ static int __intel_get_crtc_scanline(struct > intel_crtc *crtc) > > for (i = 0; i < 100; i++) { > udelay(1); > - temp = __raw_i915_read32(dev_priv, > PIPEDSL(pipe)) & > - DSL_LINEMASK_GEN3; > + temp = I915_READ_FW(PIPEDSL(pipe)) & > DSL_LINEMASK_GEN3; > if (temp != position) { > position = temp; > break; -- Mika Kahola - Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/5] drm/i915: Organize plane register writes into tighter bunches 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala 2017-03-09 15:44 ` [PATCH 1/5] drm/i915: Use I915_READ_FW in i915_get_vblank_counter() ville.syrjala 2017-03-09 15:44 ` [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a ville.syrjala @ 2017-03-09 15:44 ` ville.syrjala 2017-03-09 15:44 ` [PATCH 4/5] drm/i915: Use I915_READ_FW for plane updates ville.syrjala ` (5 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Pull all the plane register writes closer together to avoid having a lot of unrelated stuff in between them. This will make things more clear once we'll grab the uncore lock around the entire bunch. Also in the future we might even consider moving more of the register value computation out from the plane update hooks. This should make that easier to do. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/intel_display.c | 44 +++++++++++++++------------ drivers/gpu/drm/i915/intel_sprite.c | 59 ++++++++++++++++++------------------ 2 files changed, 54 insertions(+), 49 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 5526a196e8a2..253f064bde3e 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2984,20 +2984,6 @@ static void i9xx_update_primary_plane(struct drm_plane *primary, if (INTEL_GEN(dev_priv) < 4) { if (intel_crtc->pipe == PIPE_B) dspcntr |= DISPPLANE_SEL_PIPE_B; - - /* pipesrc and dspsize control the size that is scaled from, - * which should always be the user's requested size. - */ - I915_WRITE(DSPSIZE(plane), - ((crtc_state->pipe_src_h - 1) << 16) | - (crtc_state->pipe_src_w - 1)); - I915_WRITE(DSPPOS(plane), 0); - } else if (IS_CHERRYVIEW(dev_priv) && plane == PLANE_B) { - I915_WRITE(PRIMSIZE(plane), - ((crtc_state->pipe_src_h - 1) << 16) | - (crtc_state->pipe_src_w - 1)); - I915_WRITE(PRIMPOS(plane), 0); - I915_WRITE(PRIMCNSTALPHA(plane), 0); } switch (fb->format->format) { @@ -3060,6 +3046,22 @@ static void i9xx_update_primary_plane(struct drm_plane *primary, intel_crtc->adjusted_x = x; intel_crtc->adjusted_y = y; + if (INTEL_GEN(dev_priv) < 4) { + /* pipesrc and dspsize control the size that is scaled from, + * which should always be the user's requested size. + */ + I915_WRITE(DSPSIZE(plane), + ((crtc_state->pipe_src_h - 1) << 16) | + (crtc_state->pipe_src_w - 1)); + I915_WRITE(DSPPOS(plane), 0); + } else if (IS_CHERRYVIEW(dev_priv) && plane == PLANE_B) { + I915_WRITE(PRIMSIZE(plane), + ((crtc_state->pipe_src_h - 1) << 16) | + (crtc_state->pipe_src_w - 1)); + I915_WRITE(PRIMPOS(plane), 0); + I915_WRITE(PRIMCNSTALPHA(plane), 0); + } + I915_WRITE(reg, dspcntr); I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]); @@ -3344,12 +3346,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane, plane_ctl = PLANE_CTL_ENABLE; - if (IS_GEMINILAKE(dev_priv)) { - I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), - PLANE_COLOR_PIPE_GAMMA_ENABLE | - PLANE_COLOR_PIPE_CSC_ENABLE | - PLANE_COLOR_PLANE_GAMMA_DISABLE); - } else { + if (!IS_GEMINILAKE(dev_priv)) { plane_ctl |= PLANE_CTL_PIPE_GAMMA_ENABLE | PLANE_CTL_PIPE_CSC_ENABLE | @@ -3371,6 +3368,13 @@ static void skylake_update_primary_plane(struct drm_plane *plane, intel_crtc->adjusted_x = src_x; intel_crtc->adjusted_y = src_y; + if (IS_GEMINILAKE(dev_priv)) { + I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), + PLANE_COLOR_PIPE_GAMMA_ENABLE | + PLANE_COLOR_PIPE_CSC_ENABLE | + PLANE_COLOR_PLANE_GAMMA_DISABLE); + } + I915_WRITE(PLANE_CTL(pipe, plane_id), plane_ctl); I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x); I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride); diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c index 300776b9e935..532db7d62290 100644 --- a/drivers/gpu/drm/i915/intel_sprite.c +++ b/drivers/gpu/drm/i915/intel_sprite.c @@ -233,12 +233,7 @@ skl_update_plane(struct drm_plane *drm_plane, plane_ctl = PLANE_CTL_ENABLE; - if (IS_GEMINILAKE(dev_priv)) { - I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), - PLANE_COLOR_PIPE_GAMMA_ENABLE | - PLANE_COLOR_PIPE_CSC_ENABLE | - PLANE_COLOR_PLANE_GAMMA_DISABLE); - } else { + if (!IS_GEMINILAKE(dev_priv)) { plane_ctl |= PLANE_CTL_PIPE_GAMMA_ENABLE | PLANE_CTL_PIPE_CSC_ENABLE | @@ -249,12 +244,6 @@ skl_update_plane(struct drm_plane *drm_plane, plane_ctl |= skl_plane_ctl_tiling(fb->modifier); plane_ctl |= skl_plane_ctl_rotation(rotation); - if (key->flags) { - I915_WRITE(PLANE_KEYVAL(pipe, plane_id), key->min_value); - I915_WRITE(PLANE_KEYMAX(pipe, plane_id), key->max_value); - I915_WRITE(PLANE_KEYMSK(pipe, plane_id), key->channel_mask); - } - if (key->flags & I915_SET_COLORKEY_DESTINATION) plane_ctl |= PLANE_CTL_KEY_ENABLE_DESTINATION; else if (key->flags & I915_SET_COLORKEY_SOURCE) @@ -266,6 +255,19 @@ skl_update_plane(struct drm_plane *drm_plane, crtc_w--; crtc_h--; + if (IS_GEMINILAKE(dev_priv)) { + I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), + PLANE_COLOR_PIPE_GAMMA_ENABLE | + PLANE_COLOR_PIPE_CSC_ENABLE | + PLANE_COLOR_PLANE_GAMMA_DISABLE); + } + + if (key->flags) { + I915_WRITE(PLANE_KEYVAL(pipe, plane_id), key->min_value); + I915_WRITE(PLANE_KEYMAX(pipe, plane_id), key->max_value); + I915_WRITE(PLANE_KEYMSK(pipe, plane_id), key->channel_mask); + } + I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x); I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride); I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w); @@ -433,6 +435,9 @@ vlv_update_plane(struct drm_plane *dplane, if (rotation & DRM_REFLECT_X) sprctl |= SP_MIRROR; + if (key->flags & I915_SET_COLORKEY_SOURCE) + sprctl |= SP_SOURCE_KEY; + /* Sizes are 0 based */ src_w--; src_h--; @@ -451,18 +456,14 @@ vlv_update_plane(struct drm_plane *dplane, linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0); + if (IS_CHERRYVIEW(dev_priv) && pipe == PIPE_B) + chv_update_csc(intel_plane, fb->format->format); + if (key->flags) { I915_WRITE(SPKEYMINVAL(pipe, plane_id), key->min_value); I915_WRITE(SPKEYMAXVAL(pipe, plane_id), key->max_value); I915_WRITE(SPKEYMSK(pipe, plane_id), key->channel_mask); } - - if (key->flags & I915_SET_COLORKEY_SOURCE) - sprctl |= SP_SOURCE_KEY; - - if (IS_CHERRYVIEW(dev_priv) && pipe == PIPE_B) - chv_update_csc(intel_plane, fb->format->format); - I915_WRITE(SPSTRIDE(pipe, plane_id), fb->pitches[0]); I915_WRITE(SPPOS(pipe, plane_id), (crtc_y << 16) | crtc_x); @@ -563,6 +564,11 @@ ivb_update_plane(struct drm_plane *plane, if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) sprctl |= SPRITE_PIPE_CSC_ENABLE; + if (key->flags & I915_SET_COLORKEY_DESTINATION) + sprctl |= SPRITE_DEST_KEY; + else if (key->flags & I915_SET_COLORKEY_SOURCE) + sprctl |= SPRITE_SOURCE_KEY; + /* Sizes are 0 based */ src_w--; src_h--; @@ -590,11 +596,6 @@ ivb_update_plane(struct drm_plane *plane, I915_WRITE(SPRKEYMSK(pipe), key->channel_mask); } - if (key->flags & I915_SET_COLORKEY_DESTINATION) - sprctl |= SPRITE_DEST_KEY; - else if (key->flags & I915_SET_COLORKEY_SOURCE) - sprctl |= SPRITE_SOURCE_KEY; - I915_WRITE(SPRSTRIDE(pipe), fb->pitches[0]); I915_WRITE(SPRPOS(pipe), (crtc_y << 16) | crtc_x); @@ -696,6 +697,11 @@ ilk_update_plane(struct drm_plane *plane, if (IS_GEN6(dev_priv)) dvscntr |= DVS_TRICKLE_FEED_DISABLE; /* must disable */ + if (key->flags & I915_SET_COLORKEY_DESTINATION) + dvscntr |= DVS_DEST_KEY; + else if (key->flags & I915_SET_COLORKEY_SOURCE) + dvscntr |= DVS_SOURCE_KEY; + /* Sizes are 0 based */ src_w--; src_h--; @@ -722,11 +728,6 @@ ilk_update_plane(struct drm_plane *plane, I915_WRITE(DVSKEYMSK(pipe), key->channel_mask); } - if (key->flags & I915_SET_COLORKEY_DESTINATION) - dvscntr |= DVS_DEST_KEY; - else if (key->flags & I915_SET_COLORKEY_SOURCE) - dvscntr |= DVS_SOURCE_KEY; - I915_WRITE(DVSSTRIDE(pipe), fb->pitches[0]); I915_WRITE(DVSPOS(pipe), (crtc_y << 16) | crtc_x); -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/5] drm/i915: Use I915_READ_FW for plane updates 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (2 preceding siblings ...) 2017-03-09 15:44 ` [PATCH 3/5] drm/i915: Organize plane register writes into tighter bunches ville.syrjala @ 2017-03-09 15:44 ` ville.syrjala 2017-03-09 15:44 ` [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates ville.syrjala ` (4 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Optimize the plane register accesses a little bit by grabbing the uncore lock manually across the entire pile of accesses and using I915_READ_FW(). This helps keep the pipe update vblank evade critical section below our 100 usec deadline, particularly with lockdep enabled. And in general we want to keep that critical section as short as possible as it's executed with interrupts disabled. Not all plane updates currently happen from within the vblank evade critical section, so we must use the irqsave/irqrestore variants of the spinlock functions in the plane hooks. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/intel_display.c | 158 ++++++++++++++---------- drivers/gpu/drm/i915/intel_sprite.c | 228 ++++++++++++++++++++--------------- 2 files changed, 228 insertions(+), 158 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 253f064bde3e..abd05e0d7896 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2976,6 +2976,7 @@ static void i9xx_update_primary_plane(struct drm_plane *primary, unsigned int rotation = plane_state->base.rotation; int x = plane_state->base.src.x1 >> 16; int y = plane_state->base.src.y1 >> 16; + unsigned long irqflags; dspcntr = DISPPLANE_GAMMA_ENABLE; @@ -3046,37 +3047,41 @@ static void i9xx_update_primary_plane(struct drm_plane *primary, intel_crtc->adjusted_x = x; intel_crtc->adjusted_y = y; + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (INTEL_GEN(dev_priv) < 4) { /* pipesrc and dspsize control the size that is scaled from, * which should always be the user's requested size. */ - I915_WRITE(DSPSIZE(plane), - ((crtc_state->pipe_src_h - 1) << 16) | - (crtc_state->pipe_src_w - 1)); - I915_WRITE(DSPPOS(plane), 0); + I915_WRITE_FW(DSPSIZE(plane), + ((crtc_state->pipe_src_h - 1) << 16) | + (crtc_state->pipe_src_w - 1)); + I915_WRITE_FW(DSPPOS(plane), 0); } else if (IS_CHERRYVIEW(dev_priv) && plane == PLANE_B) { - I915_WRITE(PRIMSIZE(plane), - ((crtc_state->pipe_src_h - 1) << 16) | - (crtc_state->pipe_src_w - 1)); - I915_WRITE(PRIMPOS(plane), 0); - I915_WRITE(PRIMCNSTALPHA(plane), 0); + I915_WRITE_FW(PRIMSIZE(plane), + ((crtc_state->pipe_src_h - 1) << 16) | + (crtc_state->pipe_src_w - 1)); + I915_WRITE_FW(PRIMPOS(plane), 0); + I915_WRITE_FW(PRIMCNSTALPHA(plane), 0); } - I915_WRITE(reg, dspcntr); + I915_WRITE_FW(reg, dspcntr); - I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]); + I915_WRITE_FW(DSPSTRIDE(plane), fb->pitches[0]); if (INTEL_GEN(dev_priv) >= 4) { - I915_WRITE(DSPSURF(plane), - intel_plane_ggtt_offset(plane_state) + - intel_crtc->dspaddr_offset); - I915_WRITE(DSPTILEOFF(plane), (y << 16) | x); - I915_WRITE(DSPLINOFF(plane), linear_offset); + I915_WRITE_FW(DSPSURF(plane), + intel_plane_ggtt_offset(plane_state) + + intel_crtc->dspaddr_offset); + I915_WRITE_FW(DSPTILEOFF(plane), (y << 16) | x); + I915_WRITE_FW(DSPLINOFF(plane), linear_offset); } else { - I915_WRITE(DSPADDR(plane), - intel_plane_ggtt_offset(plane_state) + - intel_crtc->dspaddr_offset); + I915_WRITE_FW(DSPADDR(plane), + intel_plane_ggtt_offset(plane_state) + + intel_crtc->dspaddr_offset); } - POSTING_READ(reg); + POSTING_READ_FW(reg); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void i9xx_disable_primary_plane(struct drm_plane *primary, @@ -3086,13 +3091,18 @@ static void i9xx_disable_primary_plane(struct drm_plane *primary, struct drm_i915_private *dev_priv = to_i915(dev); struct intel_crtc *intel_crtc = to_intel_crtc(crtc); int plane = intel_crtc->plane; + unsigned long irqflags; - I915_WRITE(DSPCNTR(plane), 0); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + + I915_WRITE_FW(DSPCNTR(plane), 0); if (INTEL_INFO(dev_priv)->gen >= 4) - I915_WRITE(DSPSURF(plane), 0); + I915_WRITE_FW(DSPSURF(plane), 0); else - I915_WRITE(DSPADDR(plane), 0); - POSTING_READ(DSPCNTR(plane)); + I915_WRITE_FW(DSPADDR(plane), 0); + POSTING_READ_FW(DSPCNTR(plane)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void ironlake_update_primary_plane(struct drm_plane *primary, @@ -3110,6 +3120,7 @@ static void ironlake_update_primary_plane(struct drm_plane *primary, unsigned int rotation = plane_state->base.rotation; int x = plane_state->base.src.x1 >> 16; int y = plane_state->base.src.y1 >> 16; + unsigned long irqflags; dspcntr = DISPPLANE_GAMMA_ENABLE; dspcntr |= DISPLAY_PLANE_ENABLE; @@ -3166,19 +3177,23 @@ static void ironlake_update_primary_plane(struct drm_plane *primary, intel_crtc->adjusted_x = x; intel_crtc->adjusted_y = y; - I915_WRITE(reg, dspcntr); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + + I915_WRITE_FW(reg, dspcntr); - I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]); - I915_WRITE(DSPSURF(plane), - intel_plane_ggtt_offset(plane_state) + - intel_crtc->dspaddr_offset); + I915_WRITE_FW(DSPSTRIDE(plane), fb->pitches[0]); + I915_WRITE_FW(DSPSURF(plane), + intel_plane_ggtt_offset(plane_state) + + intel_crtc->dspaddr_offset); if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) { - I915_WRITE(DSPOFFSET(plane), (y << 16) | x); + I915_WRITE_FW(DSPOFFSET(plane), (y << 16) | x); } else { - I915_WRITE(DSPTILEOFF(plane), (y << 16) | x); - I915_WRITE(DSPLINOFF(plane), linear_offset); + I915_WRITE_FW(DSPTILEOFF(plane), (y << 16) | x); + I915_WRITE_FW(DSPLINOFF(plane), linear_offset); } - POSTING_READ(reg); + POSTING_READ_FW(reg); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static u32 @@ -3343,6 +3358,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane, int dst_y = plane_state->base.dst.y1; int dst_w = drm_rect_width(&plane_state->base.dst); int dst_h = drm_rect_height(&plane_state->base.dst); + unsigned long irqflags; plane_ctl = PLANE_CTL_ENABLE; @@ -3368,17 +3384,19 @@ static void skylake_update_primary_plane(struct drm_plane *plane, intel_crtc->adjusted_x = src_x; intel_crtc->adjusted_y = src_y; + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (IS_GEMINILAKE(dev_priv)) { - I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), - PLANE_COLOR_PIPE_GAMMA_ENABLE | - PLANE_COLOR_PIPE_CSC_ENABLE | - PLANE_COLOR_PLANE_GAMMA_DISABLE); + I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id), + PLANE_COLOR_PIPE_GAMMA_ENABLE | + PLANE_COLOR_PIPE_CSC_ENABLE | + PLANE_COLOR_PLANE_GAMMA_DISABLE); } - I915_WRITE(PLANE_CTL(pipe, plane_id), plane_ctl); - I915_WRITE(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x); - I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride); - I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w); + I915_WRITE_FW(PLANE_CTL(pipe, plane_id), plane_ctl); + I915_WRITE_FW(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x); + I915_WRITE_FW(PLANE_STRIDE(pipe, plane_id), stride); + I915_WRITE_FW(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w); if (scaler_id >= 0) { uint32_t ps_ctrl = 0; @@ -3386,19 +3404,21 @@ static void skylake_update_primary_plane(struct drm_plane *plane, WARN_ON(!dst_w || !dst_h); ps_ctrl = PS_SCALER_EN | PS_PLANE_SEL(plane_id) | crtc_state->scaler_state.scalers[scaler_id].mode; - I915_WRITE(SKL_PS_CTRL(pipe, scaler_id), ps_ctrl); - I915_WRITE(SKL_PS_PWR_GATE(pipe, scaler_id), 0); - I915_WRITE(SKL_PS_WIN_POS(pipe, scaler_id), (dst_x << 16) | dst_y); - I915_WRITE(SKL_PS_WIN_SZ(pipe, scaler_id), (dst_w << 16) | dst_h); - I915_WRITE(PLANE_POS(pipe, plane_id), 0); + I915_WRITE_FW(SKL_PS_CTRL(pipe, scaler_id), ps_ctrl); + I915_WRITE_FW(SKL_PS_PWR_GATE(pipe, scaler_id), 0); + I915_WRITE_FW(SKL_PS_WIN_POS(pipe, scaler_id), (dst_x << 16) | dst_y); + I915_WRITE_FW(SKL_PS_WIN_SZ(pipe, scaler_id), (dst_w << 16) | dst_h); + I915_WRITE_FW(PLANE_POS(pipe, plane_id), 0); } else { - I915_WRITE(PLANE_POS(pipe, plane_id), (dst_y << 16) | dst_x); + I915_WRITE_FW(PLANE_POS(pipe, plane_id), (dst_y << 16) | dst_x); } - I915_WRITE(PLANE_SURF(pipe, plane_id), - intel_plane_ggtt_offset(plane_state) + surf_addr); + I915_WRITE_FW(PLANE_SURF(pipe, plane_id), + intel_plane_ggtt_offset(plane_state) + surf_addr); + + POSTING_READ_FW(PLANE_SURF(pipe, plane_id)); - POSTING_READ(PLANE_SURF(pipe, plane_id)); + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void skylake_disable_primary_plane(struct drm_plane *primary, @@ -3408,10 +3428,15 @@ static void skylake_disable_primary_plane(struct drm_plane *primary, struct drm_i915_private *dev_priv = to_i915(dev); enum plane_id plane_id = to_intel_plane(primary)->id; enum pipe pipe = to_intel_plane(primary)->pipe; + unsigned long irqflags; - I915_WRITE(PLANE_CTL(pipe, plane_id), 0); - I915_WRITE(PLANE_SURF(pipe, plane_id), 0); - POSTING_READ(PLANE_SURF(pipe, plane_id)); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + + I915_WRITE_FW(PLANE_CTL(pipe, plane_id), 0); + I915_WRITE_FW(PLANE_SURF(pipe, plane_id), 0); + POSTING_READ_FW(PLANE_SURF(pipe, plane_id)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } /* Assume fb object is pinned & idle & fenced and just update base pointers */ @@ -9174,24 +9199,24 @@ static void i845_update_cursor(struct drm_crtc *crtc, u32 base, /* On these chipsets we can only modify the base/size/stride * whilst the cursor is disabled. */ - I915_WRITE(CURCNTR(PIPE_A), 0); - POSTING_READ(CURCNTR(PIPE_A)); + I915_WRITE_FW(CURCNTR(PIPE_A), 0); + POSTING_READ_FW(CURCNTR(PIPE_A)); intel_crtc->cursor_cntl = 0; } if (intel_crtc->cursor_base != base) { - I915_WRITE(CURBASE(PIPE_A), base); + I915_WRITE_FW(CURBASE(PIPE_A), base); intel_crtc->cursor_base = base; } if (intel_crtc->cursor_size != size) { - I915_WRITE(CURSIZE, size); + I915_WRITE_FW(CURSIZE, size); intel_crtc->cursor_size = size; } if (intel_crtc->cursor_cntl != cntl) { - I915_WRITE(CURCNTR(PIPE_A), cntl); - POSTING_READ(CURCNTR(PIPE_A)); + I915_WRITE_FW(CURCNTR(PIPE_A), cntl); + POSTING_READ_FW(CURCNTR(PIPE_A)); intel_crtc->cursor_cntl = cntl; } } @@ -9231,14 +9256,14 @@ static void i9xx_update_cursor(struct drm_crtc *crtc, u32 base, } if (intel_crtc->cursor_cntl != cntl) { - I915_WRITE(CURCNTR(pipe), cntl); - POSTING_READ(CURCNTR(pipe)); + I915_WRITE_FW(CURCNTR(pipe), cntl); + POSTING_READ_FW(CURCNTR(pipe)); intel_crtc->cursor_cntl = cntl; } /* and commit changes on next vblank */ - I915_WRITE(CURBASE(pipe), base); - POSTING_READ(CURBASE(pipe)); + I915_WRITE_FW(CURBASE(pipe), base); + POSTING_READ_FW(CURBASE(pipe)); intel_crtc->cursor_base = base; } @@ -9252,6 +9277,7 @@ static void intel_crtc_update_cursor(struct drm_crtc *crtc, struct intel_crtc *intel_crtc = to_intel_crtc(crtc); int pipe = intel_crtc->pipe; u32 base = intel_crtc->cursor_addr; + unsigned long irqflags; u32 pos = 0; if (plane_state) { @@ -9278,12 +9304,16 @@ static void intel_crtc_update_cursor(struct drm_crtc *crtc, } } - I915_WRITE(CURPOS(pipe), pos); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + + I915_WRITE_FW(CURPOS(pipe), pos); if (IS_I845G(dev_priv) || IS_I865G(dev_priv)) i845_update_cursor(crtc, base, plane_state); else i9xx_update_cursor(crtc, base, plane_state); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static bool cursor_size_ok(struct drm_i915_private *dev_priv, diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c index 532db7d62290..b931d0bd7a64 100644 --- a/drivers/gpu/drm/i915/intel_sprite.c +++ b/drivers/gpu/drm/i915/intel_sprite.c @@ -230,6 +230,7 @@ skl_update_plane(struct drm_plane *drm_plane, uint32_t y = plane_state->main.y; uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16; uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16; + unsigned long irqflags; plane_ctl = PLANE_CTL_ENABLE; @@ -255,22 +256,24 @@ skl_update_plane(struct drm_plane *drm_plane, crtc_w--; crtc_h--; + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (IS_GEMINILAKE(dev_priv)) { - I915_WRITE(PLANE_COLOR_CTL(pipe, plane_id), - PLANE_COLOR_PIPE_GAMMA_ENABLE | - PLANE_COLOR_PIPE_CSC_ENABLE | - PLANE_COLOR_PLANE_GAMMA_DISABLE); + I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id), + PLANE_COLOR_PIPE_GAMMA_ENABLE | + PLANE_COLOR_PIPE_CSC_ENABLE | + PLANE_COLOR_PLANE_GAMMA_DISABLE); } if (key->flags) { - I915_WRITE(PLANE_KEYVAL(pipe, plane_id), key->min_value); - I915_WRITE(PLANE_KEYMAX(pipe, plane_id), key->max_value); - I915_WRITE(PLANE_KEYMSK(pipe, plane_id), key->channel_mask); + I915_WRITE_FW(PLANE_KEYVAL(pipe, plane_id), key->min_value); + I915_WRITE_FW(PLANE_KEYMAX(pipe, plane_id), key->max_value); + I915_WRITE_FW(PLANE_KEYMSK(pipe, plane_id), key->channel_mask); } - I915_WRITE(PLANE_OFFSET(pipe, plane_id), (y << 16) | x); - I915_WRITE(PLANE_STRIDE(pipe, plane_id), stride); - I915_WRITE(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w); + I915_WRITE_FW(PLANE_OFFSET(pipe, plane_id), (y << 16) | x); + I915_WRITE_FW(PLANE_STRIDE(pipe, plane_id), stride); + I915_WRITE_FW(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w); /* program plane scaler */ if (plane_state->scaler_id >= 0) { @@ -279,22 +282,24 @@ skl_update_plane(struct drm_plane *drm_plane, scaler = &crtc_state->scaler_state.scalers[scaler_id]; - I915_WRITE(SKL_PS_CTRL(pipe, scaler_id), - PS_SCALER_EN | PS_PLANE_SEL(plane_id) | scaler->mode); - I915_WRITE(SKL_PS_PWR_GATE(pipe, scaler_id), 0); - I915_WRITE(SKL_PS_WIN_POS(pipe, scaler_id), (crtc_x << 16) | crtc_y); - I915_WRITE(SKL_PS_WIN_SZ(pipe, scaler_id), - ((crtc_w + 1) << 16)|(crtc_h + 1)); + I915_WRITE_FW(SKL_PS_CTRL(pipe, scaler_id), + PS_SCALER_EN | PS_PLANE_SEL(plane_id) | scaler->mode); + I915_WRITE_FW(SKL_PS_PWR_GATE(pipe, scaler_id), 0); + I915_WRITE_FW(SKL_PS_WIN_POS(pipe, scaler_id), (crtc_x << 16) | crtc_y); + I915_WRITE_FW(SKL_PS_WIN_SZ(pipe, scaler_id), + ((crtc_w + 1) << 16)|(crtc_h + 1)); - I915_WRITE(PLANE_POS(pipe, plane_id), 0); + I915_WRITE_FW(PLANE_POS(pipe, plane_id), 0); } else { - I915_WRITE(PLANE_POS(pipe, plane_id), (crtc_y << 16) | crtc_x); + I915_WRITE_FW(PLANE_POS(pipe, plane_id), (crtc_y << 16) | crtc_x); } - I915_WRITE(PLANE_CTL(pipe, plane_id), plane_ctl); - I915_WRITE(PLANE_SURF(pipe, plane_id), - intel_plane_ggtt_offset(plane_state) + surf_addr); - POSTING_READ(PLANE_SURF(pipe, plane_id)); + I915_WRITE_FW(PLANE_CTL(pipe, plane_id), plane_ctl); + I915_WRITE_FW(PLANE_SURF(pipe, plane_id), + intel_plane_ggtt_offset(plane_state) + surf_addr); + POSTING_READ_FW(PLANE_SURF(pipe, plane_id)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -305,11 +310,16 @@ skl_disable_plane(struct drm_plane *dplane, struct drm_crtc *crtc) struct intel_plane *intel_plane = to_intel_plane(dplane); enum plane_id plane_id = intel_plane->id; enum pipe pipe = intel_plane->pipe; + unsigned long irqflags; + + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); - I915_WRITE(PLANE_CTL(pipe, plane_id), 0); + I915_WRITE_FW(PLANE_CTL(pipe, plane_id), 0); - I915_WRITE(PLANE_SURF(pipe, plane_id), 0); - POSTING_READ(PLANE_SURF(pipe, plane_id)); + I915_WRITE_FW(PLANE_SURF(pipe, plane_id), 0); + POSTING_READ_FW(PLANE_SURF(pipe, plane_id)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -332,23 +342,23 @@ chv_update_csc(struct intel_plane *intel_plane, uint32_t format) * Cb and Cr apparently come in as signed already, so no * need for any offset. For Y we need to remove the offset. */ - I915_WRITE(SPCSCYGOFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(-64)); - I915_WRITE(SPCSCCBOFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(0)); - I915_WRITE(SPCSCCROFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(0)); - - I915_WRITE(SPCSCC01(plane_id), SPCSC_C1(4769) | SPCSC_C0(6537)); - I915_WRITE(SPCSCC23(plane_id), SPCSC_C1(-3330) | SPCSC_C0(0)); - I915_WRITE(SPCSCC45(plane_id), SPCSC_C1(-1605) | SPCSC_C0(4769)); - I915_WRITE(SPCSCC67(plane_id), SPCSC_C1(4769) | SPCSC_C0(0)); - I915_WRITE(SPCSCC8(plane_id), SPCSC_C0(8263)); - - I915_WRITE(SPCSCYGICLAMP(plane_id), SPCSC_IMAX(940) | SPCSC_IMIN(64)); - I915_WRITE(SPCSCCBICLAMP(plane_id), SPCSC_IMAX(448) | SPCSC_IMIN(-448)); - I915_WRITE(SPCSCCRICLAMP(plane_id), SPCSC_IMAX(448) | SPCSC_IMIN(-448)); - - I915_WRITE(SPCSCYGOCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); - I915_WRITE(SPCSCCBOCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); - I915_WRITE(SPCSCCROCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); + I915_WRITE_FW(SPCSCYGOFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(-64)); + I915_WRITE_FW(SPCSCCBOFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(0)); + I915_WRITE_FW(SPCSCCROFF(plane_id), SPCSC_OOFF(0) | SPCSC_IOFF(0)); + + I915_WRITE_FW(SPCSCC01(plane_id), SPCSC_C1(4769) | SPCSC_C0(6537)); + I915_WRITE_FW(SPCSCC23(plane_id), SPCSC_C1(-3330) | SPCSC_C0(0)); + I915_WRITE_FW(SPCSCC45(plane_id), SPCSC_C1(-1605) | SPCSC_C0(4769)); + I915_WRITE_FW(SPCSCC67(plane_id), SPCSC_C1(4769) | SPCSC_C0(0)); + I915_WRITE_FW(SPCSCC8(plane_id), SPCSC_C0(8263)); + + I915_WRITE_FW(SPCSCYGICLAMP(plane_id), SPCSC_IMAX(940) | SPCSC_IMIN(64)); + I915_WRITE_FW(SPCSCCBICLAMP(plane_id), SPCSC_IMAX(448) | SPCSC_IMIN(-448)); + I915_WRITE_FW(SPCSCCRICLAMP(plane_id), SPCSC_IMAX(448) | SPCSC_IMIN(-448)); + + I915_WRITE_FW(SPCSCYGOCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); + I915_WRITE_FW(SPCSCCBOCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); + I915_WRITE_FW(SPCSCCROCLAMP(plane_id), SPCSC_OMAX(1023) | SPCSC_OMIN(0)); } static void @@ -374,6 +384,7 @@ vlv_update_plane(struct drm_plane *dplane, uint32_t y = plane_state->base.src.y1 >> 16; uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16; uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16; + unsigned long irqflags; sprctl = SP_ENABLE; @@ -456,29 +467,33 @@ vlv_update_plane(struct drm_plane *dplane, linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (IS_CHERRYVIEW(dev_priv) && pipe == PIPE_B) chv_update_csc(intel_plane, fb->format->format); if (key->flags) { - I915_WRITE(SPKEYMINVAL(pipe, plane_id), key->min_value); - I915_WRITE(SPKEYMAXVAL(pipe, plane_id), key->max_value); - I915_WRITE(SPKEYMSK(pipe, plane_id), key->channel_mask); + I915_WRITE_FW(SPKEYMINVAL(pipe, plane_id), key->min_value); + I915_WRITE_FW(SPKEYMAXVAL(pipe, plane_id), key->max_value); + I915_WRITE_FW(SPKEYMSK(pipe, plane_id), key->channel_mask); } - I915_WRITE(SPSTRIDE(pipe, plane_id), fb->pitches[0]); - I915_WRITE(SPPOS(pipe, plane_id), (crtc_y << 16) | crtc_x); + I915_WRITE_FW(SPSTRIDE(pipe, plane_id), fb->pitches[0]); + I915_WRITE_FW(SPPOS(pipe, plane_id), (crtc_y << 16) | crtc_x); if (fb->modifier == I915_FORMAT_MOD_X_TILED) - I915_WRITE(SPTILEOFF(pipe, plane_id), (y << 16) | x); + I915_WRITE_FW(SPTILEOFF(pipe, plane_id), (y << 16) | x); else - I915_WRITE(SPLINOFF(pipe, plane_id), linear_offset); + I915_WRITE_FW(SPLINOFF(pipe, plane_id), linear_offset); - I915_WRITE(SPCONSTALPHA(pipe, plane_id), 0); + I915_WRITE_FW(SPCONSTALPHA(pipe, plane_id), 0); - I915_WRITE(SPSIZE(pipe, plane_id), (crtc_h << 16) | crtc_w); - I915_WRITE(SPCNTR(pipe, plane_id), sprctl); - I915_WRITE(SPSURF(pipe, plane_id), - intel_plane_ggtt_offset(plane_state) + sprsurf_offset); - POSTING_READ(SPSURF(pipe, plane_id)); + I915_WRITE_FW(SPSIZE(pipe, plane_id), (crtc_h << 16) | crtc_w); + I915_WRITE_FW(SPCNTR(pipe, plane_id), sprctl); + I915_WRITE_FW(SPSURF(pipe, plane_id), + intel_plane_ggtt_offset(plane_state) + sprsurf_offset); + POSTING_READ_FW(SPSURF(pipe, plane_id)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -489,11 +504,16 @@ vlv_disable_plane(struct drm_plane *dplane, struct drm_crtc *crtc) struct intel_plane *intel_plane = to_intel_plane(dplane); enum pipe pipe = intel_plane->pipe; enum plane_id plane_id = intel_plane->id; + unsigned long irqflags; + + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); - I915_WRITE(SPCNTR(pipe, plane_id), 0); + I915_WRITE_FW(SPCNTR(pipe, plane_id), 0); - I915_WRITE(SPSURF(pipe, plane_id), 0); - POSTING_READ(SPSURF(pipe, plane_id)); + I915_WRITE_FW(SPSURF(pipe, plane_id), 0); + POSTING_READ_FW(SPSURF(pipe, plane_id)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -518,6 +538,7 @@ ivb_update_plane(struct drm_plane *plane, uint32_t y = plane_state->base.src.y1 >> 16; uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16; uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16; + unsigned long irqflags; sprctl = SPRITE_ENABLE; @@ -590,31 +611,35 @@ ivb_update_plane(struct drm_plane *plane, linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (key->flags) { - I915_WRITE(SPRKEYVAL(pipe), key->min_value); - I915_WRITE(SPRKEYMAX(pipe), key->max_value); - I915_WRITE(SPRKEYMSK(pipe), key->channel_mask); + I915_WRITE_FW(SPRKEYVAL(pipe), key->min_value); + I915_WRITE_FW(SPRKEYMAX(pipe), key->max_value); + I915_WRITE_FW(SPRKEYMSK(pipe), key->channel_mask); } - I915_WRITE(SPRSTRIDE(pipe), fb->pitches[0]); - I915_WRITE(SPRPOS(pipe), (crtc_y << 16) | crtc_x); + I915_WRITE_FW(SPRSTRIDE(pipe), fb->pitches[0]); + I915_WRITE_FW(SPRPOS(pipe), (crtc_y << 16) | crtc_x); /* HSW consolidates SPRTILEOFF and SPRLINOFF into a single SPROFFSET * register */ if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) - I915_WRITE(SPROFFSET(pipe), (y << 16) | x); + I915_WRITE_FW(SPROFFSET(pipe), (y << 16) | x); else if (fb->modifier == I915_FORMAT_MOD_X_TILED) - I915_WRITE(SPRTILEOFF(pipe), (y << 16) | x); + I915_WRITE_FW(SPRTILEOFF(pipe), (y << 16) | x); else - I915_WRITE(SPRLINOFF(pipe), linear_offset); + I915_WRITE_FW(SPRLINOFF(pipe), linear_offset); - I915_WRITE(SPRSIZE(pipe), (crtc_h << 16) | crtc_w); + I915_WRITE_FW(SPRSIZE(pipe), (crtc_h << 16) | crtc_w); if (intel_plane->can_scale) - I915_WRITE(SPRSCALE(pipe), sprscale); - I915_WRITE(SPRCTL(pipe), sprctl); - I915_WRITE(SPRSURF(pipe), - intel_plane_ggtt_offset(plane_state) + sprsurf_offset); - POSTING_READ(SPRSURF(pipe)); + I915_WRITE_FW(SPRSCALE(pipe), sprscale); + I915_WRITE_FW(SPRCTL(pipe), sprctl); + I915_WRITE_FW(SPRSURF(pipe), + intel_plane_ggtt_offset(plane_state) + sprsurf_offset); + POSTING_READ_FW(SPRSURF(pipe)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -624,14 +649,19 @@ ivb_disable_plane(struct drm_plane *plane, struct drm_crtc *crtc) struct drm_i915_private *dev_priv = to_i915(dev); struct intel_plane *intel_plane = to_intel_plane(plane); int pipe = intel_plane->pipe; + unsigned long irqflags; + + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); - I915_WRITE(SPRCTL(pipe), 0); + I915_WRITE_FW(SPRCTL(pipe), 0); /* Can't leave the scaler enabled... */ if (intel_plane->can_scale) - I915_WRITE(SPRSCALE(pipe), 0); + I915_WRITE_FW(SPRSCALE(pipe), 0); - I915_WRITE(SPRSURF(pipe), 0); - POSTING_READ(SPRSURF(pipe)); + I915_WRITE_FW(SPRSURF(pipe), 0); + POSTING_READ_FW(SPRSURF(pipe)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -656,6 +686,7 @@ ilk_update_plane(struct drm_plane *plane, uint32_t y = plane_state->base.src.y1 >> 16; uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16; uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16; + unsigned long irqflags; dvscntr = DVS_ENABLE; @@ -722,26 +753,30 @@ ilk_update_plane(struct drm_plane *plane, linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + if (key->flags) { - I915_WRITE(DVSKEYVAL(pipe), key->min_value); - I915_WRITE(DVSKEYMAX(pipe), key->max_value); - I915_WRITE(DVSKEYMSK(pipe), key->channel_mask); + I915_WRITE_FW(DVSKEYVAL(pipe), key->min_value); + I915_WRITE_FW(DVSKEYMAX(pipe), key->max_value); + I915_WRITE_FW(DVSKEYMSK(pipe), key->channel_mask); } - I915_WRITE(DVSSTRIDE(pipe), fb->pitches[0]); - I915_WRITE(DVSPOS(pipe), (crtc_y << 16) | crtc_x); + I915_WRITE_FW(DVSSTRIDE(pipe), fb->pitches[0]); + I915_WRITE_FW(DVSPOS(pipe), (crtc_y << 16) | crtc_x); if (fb->modifier == I915_FORMAT_MOD_X_TILED) - I915_WRITE(DVSTILEOFF(pipe), (y << 16) | x); + I915_WRITE_FW(DVSTILEOFF(pipe), (y << 16) | x); else - I915_WRITE(DVSLINOFF(pipe), linear_offset); - - I915_WRITE(DVSSIZE(pipe), (crtc_h << 16) | crtc_w); - I915_WRITE(DVSSCALE(pipe), dvsscale); - I915_WRITE(DVSCNTR(pipe), dvscntr); - I915_WRITE(DVSSURF(pipe), - intel_plane_ggtt_offset(plane_state) + dvssurf_offset); - POSTING_READ(DVSSURF(pipe)); + I915_WRITE_FW(DVSLINOFF(pipe), linear_offset); + + I915_WRITE_FW(DVSSIZE(pipe), (crtc_h << 16) | crtc_w); + I915_WRITE_FW(DVSSCALE(pipe), dvsscale); + I915_WRITE_FW(DVSCNTR(pipe), dvscntr); + I915_WRITE_FW(DVSSURF(pipe), + intel_plane_ggtt_offset(plane_state) + dvssurf_offset); + POSTING_READ_FW(DVSSURF(pipe)); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static void @@ -751,13 +786,18 @@ ilk_disable_plane(struct drm_plane *plane, struct drm_crtc *crtc) struct drm_i915_private *dev_priv = to_i915(dev); struct intel_plane *intel_plane = to_intel_plane(plane); int pipe = intel_plane->pipe; + unsigned long irqflags; - I915_WRITE(DVSCNTR(pipe), 0); + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + + I915_WRITE_FW(DVSCNTR(pipe), 0); /* Disable the scaler */ - I915_WRITE(DVSSCALE(pipe), 0); + I915_WRITE_FW(DVSSCALE(pipe), 0); + + I915_WRITE_FW(DVSSURF(pipe), 0); + POSTING_READ_FW(DVSSURF(pipe)); - I915_WRITE(DVSSURF(pipe), 0); - POSTING_READ(DVSSURF(pipe)); + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); } static int -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (3 preceding siblings ...) 2017-03-09 15:44 ` [PATCH 4/5] drm/i915: Use I915_READ_FW for plane updates ville.syrjala @ 2017-03-09 15:44 ` ville.syrjala 2017-03-09 21:26 ` Chris Wilson 2017-03-09 15:56 ` [PATCH 0/5] drm/i915: Optimize plane updates a bit Maarten Lankhorst ` (3 subsequent siblings) 8 siblings, 1 reply; 15+ messages in thread From: ville.syrjala @ 2017-03-09 15:44 UTC (permalink / raw) To: intel-gfx From: Ville Syrjälä <ville.syrjala@linux.intel.com> Use I915_{READ,WRITE}_FW() for updating the DSPARB registers on VLV/CHV. This is less expesive as we can grab the uncore.lock across the entire sequence of reads and writes instead of each register access grabbing it. This also allows us to eliminate the dsparb lock entirely as the uncore.lock now effectively protects the contents of the DSPARB registers. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/i915_drv.c | 1 - drivers/gpu/drm/i915/i915_drv.h | 3 --- drivers/gpu/drm/i915/intel_pm.c | 36 +++++++++++++++++++++--------------- 3 files changed, 21 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index b1e9027a4f80..debb74a2b2a9 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -827,7 +827,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, spin_lock_init(&dev_priv->mm.object_stat_lock); spin_lock_init(&dev_priv->mmio_flip_lock); - spin_lock_init(&dev_priv->wm.dsparb_lock); mutex_init(&dev_priv->sb_lock); mutex_init(&dev_priv->modeset_restore_lock); mutex_init(&dev_priv->av_mutex); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3002996ddbed..6af0b1d33cab 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2375,9 +2375,6 @@ struct drm_i915_private { } sagv_status; struct { - /* protects DSPARB registers on pre-g4x/vlv/chv */ - spinlock_t dsparb_lock; - /* * Raw watermark latency values: * in 0.1us units for WM0, diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 99e09f63d4b3..24cdc13a416a 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -1358,13 +1358,19 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, trace_vlv_fifo_size(crtc, sprite0_start, sprite1_start, fifo_size); - spin_lock(&dev_priv->wm.dsparb_lock); + /* + * uncore.lock serves a double purpose here. It allows us to + * use the less expensive I915_{READ,WRITE}_FW() functions, and + * it protects the DSPARB registers from getting clobbered by + * parallel updates from multiple pipes. + */ + spin_lock(&dev_priv->uncore.lock); switch (crtc->pipe) { uint32_t dsparb, dsparb2, dsparb3; case PIPE_A: - dsparb = I915_READ(DSPARB); - dsparb2 = I915_READ(DSPARB2); + dsparb = I915_READ_FW(DSPARB); + dsparb2 = I915_READ_FW(DSPARB2); dsparb &= ~(VLV_FIFO(SPRITEA, 0xff) | VLV_FIFO(SPRITEB, 0xff)); @@ -1376,12 +1382,12 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, dsparb2 |= (VLV_FIFO(SPRITEA_HI, sprite0_start >> 8) | VLV_FIFO(SPRITEB_HI, sprite1_start >> 8)); - I915_WRITE(DSPARB, dsparb); - I915_WRITE(DSPARB2, dsparb2); + I915_WRITE_FW(DSPARB, dsparb); + I915_WRITE_FW(DSPARB2, dsparb2); break; case PIPE_B: - dsparb = I915_READ(DSPARB); - dsparb2 = I915_READ(DSPARB2); + dsparb = I915_READ_FW(DSPARB); + dsparb2 = I915_READ_FW(DSPARB2); dsparb &= ~(VLV_FIFO(SPRITEC, 0xff) | VLV_FIFO(SPRITED, 0xff)); @@ -1393,12 +1399,12 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, dsparb2 |= (VLV_FIFO(SPRITEC_HI, sprite0_start >> 8) | VLV_FIFO(SPRITED_HI, sprite1_start >> 8)); - I915_WRITE(DSPARB, dsparb); - I915_WRITE(DSPARB2, dsparb2); + I915_WRITE_FW(DSPARB, dsparb); + I915_WRITE_FW(DSPARB2, dsparb2); break; case PIPE_C: - dsparb3 = I915_READ(DSPARB3); - dsparb2 = I915_READ(DSPARB2); + dsparb3 = I915_READ_FW(DSPARB3); + dsparb2 = I915_READ_FW(DSPARB2); dsparb3 &= ~(VLV_FIFO(SPRITEE, 0xff) | VLV_FIFO(SPRITEF, 0xff)); @@ -1410,16 +1416,16 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, dsparb2 |= (VLV_FIFO(SPRITEE_HI, sprite0_start >> 8) | VLV_FIFO(SPRITEF_HI, sprite1_start >> 8)); - I915_WRITE(DSPARB3, dsparb3); - I915_WRITE(DSPARB2, dsparb2); + I915_WRITE_FW(DSPARB3, dsparb3); + I915_WRITE_FW(DSPARB2, dsparb2); break; default: break; } - POSTING_READ(DSPARB); + POSTING_READ_FW(DSPARB); - spin_unlock(&dev_priv->wm.dsparb_lock); + spin_unlock(&dev_priv->uncore.lock); } #undef VLV_FIFO -- 2.10.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates 2017-03-09 15:44 ` [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates ville.syrjala @ 2017-03-09 21:26 ` Chris Wilson 2017-03-10 10:07 ` Ville Syrjälä 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2017-03-09 21:26 UTC (permalink / raw) To: ville.syrjala; +Cc: intel-gfx On Thu, Mar 09, 2017 at 05:44:34PM +0200, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Use I915_{READ,WRITE}_FW() for updating the DSPARB registers on > VLV/CHV. This is less expesive as we can grab the uncore.lock across > the entire sequence of reads and writes instead of each register > access grabbing it. > > This also allows us to eliminate the dsparb lock entirely as the > uncore.lock now effectively protects the contents of the DSPARB > registers. > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > --- > drivers/gpu/drm/i915/i915_drv.c | 1 - > drivers/gpu/drm/i915/i915_drv.h | 3 --- > drivers/gpu/drm/i915/intel_pm.c | 36 +++++++++++++++++++++--------------- > 3 files changed, 21 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c > index b1e9027a4f80..debb74a2b2a9 100644 > --- a/drivers/gpu/drm/i915/i915_drv.c > +++ b/drivers/gpu/drm/i915/i915_drv.c > @@ -827,7 +827,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, > > spin_lock_init(&dev_priv->mm.object_stat_lock); > spin_lock_init(&dev_priv->mmio_flip_lock); > - spin_lock_init(&dev_priv->wm.dsparb_lock); > mutex_init(&dev_priv->sb_lock); > mutex_init(&dev_priv->modeset_restore_lock); > mutex_init(&dev_priv->av_mutex); > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 3002996ddbed..6af0b1d33cab 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -2375,9 +2375,6 @@ struct drm_i915_private { > } sagv_status; > > struct { > - /* protects DSPARB registers on pre-g4x/vlv/chv */ > - spinlock_t dsparb_lock; > - > /* > * Raw watermark latency values: > * in 0.1us units for WM0, > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 99e09f63d4b3..24cdc13a416a 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -1358,13 +1358,19 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, > > trace_vlv_fifo_size(crtc, sprite0_start, sprite1_start, fifo_size); > > - spin_lock(&dev_priv->wm.dsparb_lock); > + /* > + * uncore.lock serves a double purpose here. It allows us to > + * use the less expensive I915_{READ,WRITE}_FW() functions, and > + * it protects the DSPARB registers from getting clobbered by > + * parallel updates from multiple pipes. > + */ > + spin_lock(&dev_priv->uncore.lock); Might be nice to document that the irq is disabled by intel_pipe_update_start() (hence why spin_lock is safe). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates 2017-03-09 21:26 ` Chris Wilson @ 2017-03-10 10:07 ` Ville Syrjälä 2017-03-13 19:18 ` Ville Syrjälä 0 siblings, 1 reply; 15+ messages in thread From: Ville Syrjälä @ 2017-03-10 10:07 UTC (permalink / raw) To: Chris Wilson, intel-gfx On Thu, Mar 09, 2017 at 09:26:36PM +0000, Chris Wilson wrote: > On Thu, Mar 09, 2017 at 05:44:34PM +0200, ville.syrjala@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > Use I915_{READ,WRITE}_FW() for updating the DSPARB registers on > > VLV/CHV. This is less expesive as we can grab the uncore.lock across > > the entire sequence of reads and writes instead of each register > > access grabbing it. > > > > This also allows us to eliminate the dsparb lock entirely as the > > uncore.lock now effectively protects the contents of the DSPARB > > registers. > > > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > > --- > > drivers/gpu/drm/i915/i915_drv.c | 1 - > > drivers/gpu/drm/i915/i915_drv.h | 3 --- > > drivers/gpu/drm/i915/intel_pm.c | 36 +++++++++++++++++++++--------------- > > 3 files changed, 21 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c > > index b1e9027a4f80..debb74a2b2a9 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > @@ -827,7 +827,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, > > > > spin_lock_init(&dev_priv->mm.object_stat_lock); > > spin_lock_init(&dev_priv->mmio_flip_lock); > > - spin_lock_init(&dev_priv->wm.dsparb_lock); > > mutex_init(&dev_priv->sb_lock); > > mutex_init(&dev_priv->modeset_restore_lock); > > mutex_init(&dev_priv->av_mutex); > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 3002996ddbed..6af0b1d33cab 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -2375,9 +2375,6 @@ struct drm_i915_private { > > } sagv_status; > > > > struct { > > - /* protects DSPARB registers on pre-g4x/vlv/chv */ > > - spinlock_t dsparb_lock; > > - > > /* > > * Raw watermark latency values: > > * in 0.1us units for WM0, > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > index 99e09f63d4b3..24cdc13a416a 100644 > > --- a/drivers/gpu/drm/i915/intel_pm.c > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > @@ -1358,13 +1358,19 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, > > > > trace_vlv_fifo_size(crtc, sprite0_start, sprite1_start, fifo_size); > > > > - spin_lock(&dev_priv->wm.dsparb_lock); > > + /* > > + * uncore.lock serves a double purpose here. It allows us to > > + * use the less expensive I915_{READ,WRITE}_FW() functions, and > > + * it protects the DSPARB registers from getting clobbered by > > + * parallel updates from multiple pipes. > > + */ > > + spin_lock(&dev_priv->uncore.lock); > > Might be nice to document that the irq is disabled by > intel_pipe_update_start() (hence why spin_lock is safe). Sure. I'll add a note. -- Ville Syrjälä Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates 2017-03-10 10:07 ` Ville Syrjälä @ 2017-03-13 19:18 ` Ville Syrjälä 0 siblings, 0 replies; 15+ messages in thread From: Ville Syrjälä @ 2017-03-13 19:18 UTC (permalink / raw) To: Chris Wilson, intel-gfx On Fri, Mar 10, 2017 at 12:07:53PM +0200, Ville Syrjälä wrote: > On Thu, Mar 09, 2017 at 09:26:36PM +0000, Chris Wilson wrote: > > On Thu, Mar 09, 2017 at 05:44:34PM +0200, ville.syrjala@linux.intel.com wrote: > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > > > Use I915_{READ,WRITE}_FW() for updating the DSPARB registers on > > > VLV/CHV. This is less expesive as we can grab the uncore.lock across > > > the entire sequence of reads and writes instead of each register > > > access grabbing it. > > > > > > This also allows us to eliminate the dsparb lock entirely as the > > > uncore.lock now effectively protects the contents of the DSPARB > > > registers. > > > > > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > --- > > > drivers/gpu/drm/i915/i915_drv.c | 1 - > > > drivers/gpu/drm/i915/i915_drv.h | 3 --- > > > drivers/gpu/drm/i915/intel_pm.c | 36 +++++++++++++++++++++--------------- > > > 3 files changed, 21 insertions(+), 19 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c > > > index b1e9027a4f80..debb74a2b2a9 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.c > > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > > @@ -827,7 +827,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, > > > > > > spin_lock_init(&dev_priv->mm.object_stat_lock); > > > spin_lock_init(&dev_priv->mmio_flip_lock); > > > - spin_lock_init(&dev_priv->wm.dsparb_lock); > > > mutex_init(&dev_priv->sb_lock); > > > mutex_init(&dev_priv->modeset_restore_lock); > > > mutex_init(&dev_priv->av_mutex); > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index 3002996ddbed..6af0b1d33cab 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -2375,9 +2375,6 @@ struct drm_i915_private { > > > } sagv_status; > > > > > > struct { > > > - /* protects DSPARB registers on pre-g4x/vlv/chv */ > > > - spinlock_t dsparb_lock; > > > - > > > /* > > > * Raw watermark latency values: > > > * in 0.1us units for WM0, > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > > index 99e09f63d4b3..24cdc13a416a 100644 > > > --- a/drivers/gpu/drm/i915/intel_pm.c > > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > > @@ -1358,13 +1358,19 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, > > > > > > trace_vlv_fifo_size(crtc, sprite0_start, sprite1_start, fifo_size); > > > > > > - spin_lock(&dev_priv->wm.dsparb_lock); > > > + /* > > > + * uncore.lock serves a double purpose here. It allows us to > > > + * use the less expensive I915_{READ,WRITE}_FW() functions, and > > > + * it protects the DSPARB registers from getting clobbered by > > > + * parallel updates from multiple pipes. > > > + */ > > > + spin_lock(&dev_priv->uncore.lock); > > > > Might be nice to document that the irq is disabled by > > intel_pipe_update_start() (hence why spin_lock is safe). > > Sure. I'll add a note. + * intel_pipe_update_start() has already disabled interrupts + * for us, so a plain spin_lock() is sufficient here. */ And entire series pushed to dinq. Thanks for the reviews. -- Ville Syrjälä Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/5] drm/i915: Optimize plane updates a bit 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (4 preceding siblings ...) 2017-03-09 15:44 ` [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates ville.syrjala @ 2017-03-09 15:56 ` Maarten Lankhorst 2017-03-09 16:14 ` Ville Syrjälä 2017-03-09 18:53 ` ✗ Fi.CI.BAT: failure for " Patchwork ` (2 subsequent siblings) 8 siblings, 1 reply; 15+ messages in thread From: Maarten Lankhorst @ 2017-03-09 15:56 UTC (permalink / raw) To: ville.syrjala, intel-gfx Hey, Op 09-03-17 om 16:44 schreef ville.syrjala@linux.intel.com: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Now that commit e1edbd44e23b ("drm/i915: Complain if we take too > long under vblank evasion.") has expose just how badly we suck, > it seems like a good time to optimize things a little bit. > > Prior to this one of my VLV machines exceed the 100 usec vblank > evade deadline pretty regularly, and at ever boot I was hitting > numbers as high as 500 usec. Granted that was with lockdep and > all kinds of other debug things enabled. After these changes > things seem stay below the 33 usec mark, and with all that debug > junk enabled we seem to staying below 22 usec. > > Note that I was testing single plane updates mostly, so I'm > not 100% sure multi plane updates couldn't still exceed the > deadline. That will need to be checked. I've considered doing the same before, but thanks to the commit it's clear that it seems to worth the effort to do so. Maybe a bit more radical, but could we grab the uncore lock for the entire update perhaps if it's still an issue? Patch series looks sane, might be worth fixing all chips at some point. :) Assuming testbot is happy.. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> ~Maarten _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/5] drm/i915: Optimize plane updates a bit 2017-03-09 15:56 ` [PATCH 0/5] drm/i915: Optimize plane updates a bit Maarten Lankhorst @ 2017-03-09 16:14 ` Ville Syrjälä 0 siblings, 0 replies; 15+ messages in thread From: Ville Syrjälä @ 2017-03-09 16:14 UTC (permalink / raw) To: Maarten Lankhorst; +Cc: intel-gfx On Thu, Mar 09, 2017 at 04:56:23PM +0100, Maarten Lankhorst wrote: > Hey, > > Op 09-03-17 om 16:44 schreef ville.syrjala@linux.intel.com: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > Now that commit e1edbd44e23b ("drm/i915: Complain if we take too > > long under vblank evasion.") has expose just how badly we suck, > > it seems like a good time to optimize things a little bit. > > > > Prior to this one of my VLV machines exceed the 100 usec vblank > > evade deadline pretty regularly, and at ever boot I was hitting > > numbers as high as 500 usec. Granted that was with lockdep and > > all kinds of other debug things enabled. After these changes > > things seem stay below the 33 usec mark, and with all that debug > > junk enabled we seem to staying below 22 usec. > > > > Note that I was testing single plane updates mostly, so I'm > > not 100% sure multi plane updates couldn't still exceed the > > deadline. That will need to be checked. > I've considered doing the same before, but thanks to the commit it's clear that it seems to worth the effort to do so. > Maybe a bit more radical, but could we grab the uncore lock for the entire update perhaps if it's still an issue? Yes that might be the ultimate goal. One of the problems is that we don't use pipe_update_start()/end() everywhere we do plane updates. Some of those we need to fix (eg. crtc disable really should disable things atomically). The "disable all extra planes at driver load time" case I'd like to eliminate by having proper state readout/takeover for planes. And then there's the legacy cursor thing. I guess we could avoid bigger surgery by simply grabbing the lock manually around the update_plane/disable_plane call in those places. Another thing I'd like to figure out is how much are we wasting CPU cycles for computing the register values in the .update_plane() hooks. If it's at all significant we should probably try to move absolutely everything but the register writes out from those functions. Oh, and we might even try to take advantage of the hardware's limite atomic capabilities and only write the absolute minimum set of registers under the vblank evasion (ie. just the ones that actually arm the update to occur on the next vblank). Unfortunately the split between the two classes of registers isn't all that clear. It would also make it impossible to do the fps>vrefresh thing. -- Ville Syrjälä Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* ✗ Fi.CI.BAT: failure for drm/i915: Optimize plane updates a bit 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (5 preceding siblings ...) 2017-03-09 15:56 ` [PATCH 0/5] drm/i915: Optimize plane updates a bit Maarten Lankhorst @ 2017-03-09 18:53 ` Patchwork 2017-03-09 19:53 ` Patchwork 2017-03-13 17:47 ` ✓ Fi.CI.BAT: success " Patchwork 8 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2017-03-09 18:53 UTC (permalink / raw) To: ville.syrjala; +Cc: intel-gfx == Series Details == Series: drm/i915: Optimize plane updates a bit URL : https://patchwork.freedesktop.org/series/21002/ State : failure == Summary == Series 21002v1 drm/i915: Optimize plane updates a bit https://patchwork.freedesktop.org/api/1.0/series/21002/revisions/1/mbox/ Test gem_exec_flush: Subgroup basic-batch-kernel-default-uc: incomplete -> PASS (fi-skl-6700k) fdo#100130 Test gem_exec_reloc: Subgroup basic-gtt-read-noreloc: pass -> INCOMPLETE (fi-byt-j1900) fdo#100130 https://bugs.freedesktop.org/show_bug.cgi?id=100130 fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 466s fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 615s fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 549s fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time: 608s fi-byt-j1900 total:85 pass:75 dwarn:0 dfail:0 fail:0 skip:9 time: 0s fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 498s fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 434s fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 435s fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time: 437s fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 510s fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 495s fi-kbl-7500u total:278 pass:259 dwarn:1 dfail:0 fail:0 skip:18 time: 479s fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 509s fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time: 597s fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 496s fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 551s fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 553s fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time: 423s 510c200742ced5a91d07e48220b669a3c9b30c0c drm-tip: 2017y-03m-09d-15h-21m-14s UTC integration manifest 06676dc drm/i915: Optimize VLV/CHV display FIFO updates bea48aa drm/i915: Use I915_READ_FW for plane updates c63745b drm/i915: Organize plane register writes into tighter bunches 6f81bf0 drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a 8fcdc34 drm/i915: Use I915_READ_FW in i915_get_vblank_counter() == Logs == For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4121/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* ✗ Fi.CI.BAT: failure for drm/i915: Optimize plane updates a bit 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (6 preceding siblings ...) 2017-03-09 18:53 ` ✗ Fi.CI.BAT: failure for " Patchwork @ 2017-03-09 19:53 ` Patchwork 2017-03-13 17:47 ` ✓ Fi.CI.BAT: success " Patchwork 8 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2017-03-09 19:53 UTC (permalink / raw) To: ville.syrjala; +Cc: intel-gfx == Series Details == Series: drm/i915: Optimize plane updates a bit URL : https://patchwork.freedesktop.org/series/21002/ State : failure == Summary == Series 21002v1 drm/i915: Optimize plane updates a bit https://patchwork.freedesktop.org/api/1.0/series/21002/revisions/1/mbox/ Test gem_exec_flush: Subgroup basic-batch-kernel-default-uc: incomplete -> PASS (fi-skl-6700k) fdo#100130 Subgroup basic-uc-prw-default: pass -> INCOMPLETE (fi-skl-6700hq) fdo#100130 Test gem_exec_suspend: Subgroup basic-s3: pass -> INCOMPLETE (fi-bxt-t5700) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: pass -> DMESG-WARN (fi-byt-j1900) fdo#100130 https://bugs.freedesktop.org/show_bug.cgi?id=100130 fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 459s fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 607s fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 541s fi-bxt-t5700 total:108 pass:95 dwarn:0 dfail:0 fail:0 skip:12 time: 0s fi-byt-j1900 total:278 pass:250 dwarn:1 dfail:0 fail:0 skip:27 time: 507s fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 498s fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 441s fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 435s fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time: 448s fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 506s fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 484s fi-kbl-7500u total:278 pass:259 dwarn:1 dfail:0 fail:0 skip:18 time: 474s fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 520s fi-skl-6700hq total:54 pass:46 dwarn:0 dfail:0 fail:0 skip:7 time: 0s fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 497s fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 560s fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 555s fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time: 422s 510c200742ced5a91d07e48220b669a3c9b30c0c drm-tip: 2017y-03m-09d-15h-21m-14s UTC integration manifest aa6c0ed drm/i915: Optimize VLV/CHV display FIFO updates 7bdb347 drm/i915: Use I915_READ_FW for plane updates b03cfd4 drm/i915: Organize plane register writes into tighter bunches 6e789b6 drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a d3af5ca drm/i915: Use I915_READ_FW in i915_get_vblank_counter() == Logs == For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4123/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915: Optimize plane updates a bit 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala ` (7 preceding siblings ...) 2017-03-09 19:53 ` Patchwork @ 2017-03-13 17:47 ` Patchwork 8 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2017-03-13 17:47 UTC (permalink / raw) To: ville.syrjala; +Cc: intel-gfx == Series Details == Series: drm/i915: Optimize plane updates a bit URL : https://patchwork.freedesktop.org/series/21002/ State : success == Summary == Series 21002v1 drm/i915: Optimize plane updates a bit https://patchwork.freedesktop.org/api/1.0/series/21002/revisions/1/mbox/ Test gem_exec_flush: Subgroup basic-batch-kernel-default-uc: pass -> FAIL (fi-snb-2600) fdo#100007 fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007 fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 461s fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 609s fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 536s fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time: 592s fi-byt-j1900 total:278 pass:251 dwarn:0 dfail:0 fail:0 skip:27 time: 508s fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 504s fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 441s fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 434s fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time: 438s fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 502s fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 490s fi-kbl-7500u total:278 pass:259 dwarn:1 dfail:0 fail:0 skip:18 time: 483s fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 504s fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time: 584s fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 489s fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 531s fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 550s fi-snb-2600 total:278 pass:248 dwarn:0 dfail:0 fail:1 skip:29 time: 432s ee2d731bf1079972596e5714dd70f98dd0177b0e drm-tip: 2017y-03m-13d-16h-06m-45s UTC integration manifest 184f5df drm/i915: Optimize VLV/CHV display FIFO updates c8ffbc1 drm/i915: Use I915_READ_FW for plane updates 2989119 drm/i915: Organize plane register writes into tighter bunches 05c4acc drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a c53d265 drm/i915: Use I915_READ_FW in i915_get_vblank_counter() == Logs == For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4156/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-03-13 19:19 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-09 15:44 [PATCH 0/5] drm/i915: Optimize plane updates a bit ville.syrjala 2017-03-09 15:44 ` [PATCH 1/5] drm/i915: Use I915_READ_FW in i915_get_vblank_counter() ville.syrjala 2017-03-09 15:44 ` [PATCH 2/5] drm/i915: s/__raw_i915_read32/I915_READ_FW/ in the SKL+ scanline read w/a ville.syrjala 2017-03-13 9:29 ` Mika Kahola 2017-03-09 15:44 ` [PATCH 3/5] drm/i915: Organize plane register writes into tighter bunches ville.syrjala 2017-03-09 15:44 ` [PATCH 4/5] drm/i915: Use I915_READ_FW for plane updates ville.syrjala 2017-03-09 15:44 ` [PATCH 5/5] drm/i915: Optimize VLV/CHV display FIFO updates ville.syrjala 2017-03-09 21:26 ` Chris Wilson 2017-03-10 10:07 ` Ville Syrjälä 2017-03-13 19:18 ` Ville Syrjälä 2017-03-09 15:56 ` [PATCH 0/5] drm/i915: Optimize plane updates a bit Maarten Lankhorst 2017-03-09 16:14 ` Ville Syrjälä 2017-03-09 18:53 ` ✗ Fi.CI.BAT: failure for " Patchwork 2017-03-09 19:53 ` Patchwork 2017-03-13 17:47 ` ✓ Fi.CI.BAT: success " Patchwork
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.