All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
@ 2018-04-30  4:00 Tarun Vyas
  2018-04-30  8:20 ` Jani Nikula
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Tarun Vyas @ 2018-04-30  4:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: rodrigo.vivi, dhinakaran.pandiyan

From: Tarun <tarun.vyas@intel.com>

The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
the pipe_update_start call schedules itself out to check back later.

On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
lags w.r.t core kernel code, hot plugging an external display triggers
tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
closer analysis reveals that we try to read the scanline 3 times and
eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
reason we loop inside intel_pipe_update start for ~2+ msec which in this
case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
counter, hence no error. On the other hand, the ChromeOS kernel spends
~1.1 msec looping inside intel_pipe_update_start and hence errors out
b/c the source is still in PSR.

Regardless, we should wait for PSR exit (if PSR is supported and active
on the current pipe) before reading the PIPEDSL, b/c if we haven't
fully exited PSR, then checking for vblank evasion isn't actually
applicable.

This scenario applies to a configuration with an additional pipe,
as of now.

---
 drivers/gpu/drm/i915/intel_sprite.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index aa1dfaa692b9..135b41568503 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -107,14 +107,17 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 						      VBLANK_EVASION_TIME_US);
 	max = vblank_start - 1;
 
-	local_irq_disable();
-
 	if (min <= 0 || max <= 0)
 		return;
 
 	if (WARN_ON(drm_crtc_vblank_get(&crtc->base)))
 		return;
 
+	if(new_crtc_state->has_psr && dev_priv->psr.active)
+		intel_wait_for_register(dev_priv, EDP_PSR_STATUS, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 5);
+
+	local_irq_disable();
+
 	crtc->debug.min_vbl = min;
 	crtc->debug.max_vbl = max;
 	trace_i915_pipe_update_start(crtc);
-- 
2.13.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
@ 2018-04-30  8:20 ` Jani Nikula
  2018-04-30 10:48 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jani Nikula @ 2018-04-30  8:20 UTC (permalink / raw)
  To: Tarun Vyas, intel-gfx; +Cc: dhinakaran.pandiyan, rodrigo.vivi

On Sun, 29 Apr 2018, Tarun Vyas <tarun.vyas@intel.com> wrote:
> From: Tarun <tarun.vyas@intel.com>
>
> The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> the pipe_update_start call schedules itself out to check back later.
>
> On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> lags w.r.t core kernel code, hot plugging an external display triggers
> tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> closer analysis reveals that we try to read the scanline 3 times and
> eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> reason we loop inside intel_pipe_update start for ~2+ msec which in this
> case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> counter, hence no error. On the other hand, the ChromeOS kernel spends
> ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> b/c the source is still in PSR.
>
> Regardless, we should wait for PSR exit (if PSR is supported and active
> on the current pipe) before reading the PIPEDSL, b/c if we haven't
> fully exited PSR, then checking for vblank evasion isn't actually
> applicable.
>
> This scenario applies to a configuration with an additional pipe,
> as of now.

I'll let others look at the content of the patch, but for this to be
considered for inclusion this needs your Signed-off-by [1].

BR,
Jani.

[1] https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1

>
> ---
>  drivers/gpu/drm/i915/intel_sprite.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index aa1dfaa692b9..135b41568503 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -107,14 +107,17 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  						      VBLANK_EVASION_TIME_US);
>  	max = vblank_start - 1;
>  
> -	local_irq_disable();
> -
>  	if (min <= 0 || max <= 0)
>  		return;
>  
>  	if (WARN_ON(drm_crtc_vblank_get(&crtc->base)))
>  		return;
>  
> +	if(new_crtc_state->has_psr && dev_priv->psr.active)
> +		intel_wait_for_register(dev_priv, EDP_PSR_STATUS, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 5);
> +
> +	local_irq_disable();
> +
>  	crtc->debug.min_vbl = min;
>  	crtc->debug.max_vbl = max;
>  	trace_i915_pipe_update_start(crtc);

-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
  2018-04-30  8:20 ` Jani Nikula
@ 2018-04-30 10:48 ` Patchwork
  2018-04-30 11:04 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2018-04-30 10:48 UTC (permalink / raw)
  To: Tarun Vyas; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
URL   : https://patchwork.freedesktop.org/series/42461/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
136000f52a51 drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
-:46: ERROR:SPACING: space required before the open parenthesis '('
#46: FILE: drivers/gpu/drm/i915/intel_sprite.c:116:
+	if(new_crtc_state->has_psr && dev_priv->psr.active)

-:47: WARNING:LONG_LINE: line over 100 characters
#47: FILE: drivers/gpu/drm/i915/intel_sprite.c:117:
+		intel_wait_for_register(dev_priv, EDP_PSR_STATUS, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 5);

-:53: ERROR:MISSING_SIGN_OFF: Missing Signed-off-by: line(s)

total: 2 errors, 1 warnings, 0 checks, 19 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
  2018-04-30  8:20 ` Jani Nikula
  2018-04-30 10:48 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-04-30 11:04 ` Patchwork
  2018-04-30 13:39 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2018-04-30 11:04 UTC (permalink / raw)
  To: Tarun Vyas; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
URL   : https://patchwork.freedesktop.org/series/42461/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4112 -> Patchwork_8839 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_8839 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_8839, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/42461/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_8839:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_gttfill@basic:
      fi-pnv-d510:        PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_8839 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_suspend@basic-s3:
      fi-ivb-3520m:       PASS -> DMESG-WARN (fdo#106084)

    igt@kms_frontbuffer_tracking@basic:
      fi-hsw-4200u:       PASS -> DMESG-FAIL (fdo#106103)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
      fi-bxt-dsi:         NOTRUN -> INCOMPLETE (fdo#103927)
      fi-cnl-psr:         PASS -> DMESG-WARN (fdo#104951)

    
    ==== Possible fixes ====

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
      fi-ivb-3520m:       DMESG-WARN (fdo#106084) -> PASS

    
  fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
  fdo#104951 https://bugs.freedesktop.org/show_bug.cgi?id=104951
  fdo#106084 https://bugs.freedesktop.org/show_bug.cgi?id=106084
  fdo#106103 https://bugs.freedesktop.org/show_bug.cgi?id=106103


== Participating hosts (39 -> 36) ==

  Missing    (3): fi-ctg-p8600 fi-ilk-m540 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4112 -> Patchwork_8839

  CI_DRM_4112: 423a00794c9d9610a71d8a02cd3bc17c6fe5fae1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4452: 29ae12bd764e3b1876356e7628a32192b4ec9066 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_8839: 136000f52a51f95837247de854eeac040380c507 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4452: 04a2952c5b3782eb03cb136bb16d89daaf243f14 @ git://anongit.freedesktop.org/piglit


== Linux commits ==

136000f52a51 drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8839/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
                   ` (2 preceding siblings ...)
  2018-04-30 11:04 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-04-30 13:39 ` Patchwork
  2018-04-30 17:19 ` [PATCH] " Rodrigo Vivi
  2018-05-14 12:53 ` Jani Nikula
  5 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2018-04-30 13:39 UTC (permalink / raw)
  To: Tarun Vyas; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
URL   : https://patchwork.freedesktop.org/series/42461/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4112_full -> Patchwork_8839_full =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_8839_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_8839_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/42461/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_8839_full:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_schedule@deep-bsd2:
      shard-kbl:          PASS -> SKIP +1

    
== Known issues ==

  Here are the changes found in Patchwork_8839_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_flip@basic-flip-vs-wf_vblank:
      shard-apl:          PASS -> FAIL (fdo#100368)

    igt@kms_flip@dpms-vs-vblank-race-interruptible:
      shard-glk:          PASS -> FAIL (fdo#103060)

    igt@kms_flip@plain-flip-fb-recreate:
      shard-hsw:          PASS -> FAIL (fdo#100368)

    
    ==== Possible fixes ====

    igt@gem_ppgtt@blt-vs-render-ctx0:
      shard-kbl:          INCOMPLETE (fdo#106023, fdo#103665) -> PASS

    igt@kms_cursor_legacy@2x-long-flip-vs-cursor-atomic:
      shard-hsw:          FAIL (fdo#104873) -> PASS

    igt@kms_flip@2x-flip-vs-absolute-wf_vblank-interruptible:
      shard-hsw:          FAIL (fdo#103928) -> PASS

    igt@kms_flip@flip-vs-expired-vblank:
      shard-glk:          FAIL (fdo#105707) -> PASS

    igt@kms_flip@flip-vs-expired-vblank-interruptible:
      shard-glk:          FAIL (fdo#102887, fdo#105363) -> PASS

    igt@kms_flip@plain-flip-fb-recreate-interruptible:
      shard-glk:          FAIL (fdo#100368) -> PASS +1

    igt@kms_setmode@basic:
      shard-hsw:          FAIL (fdo#99912) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
  fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#103928 https://bugs.freedesktop.org/show_bug.cgi?id=103928
  fdo#104873 https://bugs.freedesktop.org/show_bug.cgi?id=104873
  fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
  fdo#105707 https://bugs.freedesktop.org/show_bug.cgi?id=105707
  fdo#106023 https://bugs.freedesktop.org/show_bug.cgi?id=106023
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (9 -> 8) ==

  Missing    (1): shard-glkb 


== Build changes ==

    * Linux: CI_DRM_4112 -> Patchwork_8839

  CI_DRM_4112: 423a00794c9d9610a71d8a02cd3bc17c6fe5fae1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4452: 29ae12bd764e3b1876356e7628a32192b4ec9066 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_8839: 136000f52a51f95837247de854eeac040380c507 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4452: 04a2952c5b3782eb03cb136bb16d89daaf243f14 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8839/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
                   ` (3 preceding siblings ...)
  2018-04-30 13:39 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-04-30 17:19 ` Rodrigo Vivi
  2018-05-02 18:19   ` Tarun Vyas
  2018-05-14 12:53 ` Jani Nikula
  5 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2018-04-30 17:19 UTC (permalink / raw)
  To: Tarun Vyas, imre.deak; +Cc: intel-gfx, dhinakaran.pandiyan

On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> From: Tarun <tarun.vyas@intel.com>
> 
> The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> the pipe_update_start call schedules itself out to check back later.
> 
> On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> lags w.r.t core kernel code, hot plugging an external display triggers
> tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> closer analysis reveals that we try to read the scanline 3 times and
> eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> reason we loop inside intel_pipe_update start for ~2+ msec which in this
> case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> counter, hence no error. On the other hand, the ChromeOS kernel spends
> ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> b/c the source is still in PSR.
> 
> Regardless, we should wait for PSR exit (if PSR is supported and active
> on the current pipe) before reading the PIPEDSL, b/c if we haven't
> fully exited PSR, then checking for vblank evasion isn't actually
> applicable.
> 
> This scenario applies to a configuration with an additional pipe,
> as of now.

I honestly believe you picking the wrong culprit here. By "coincidence".
PSR will allow DC state with screen on and DC state will mess up with all
registers reads....

probably what you are missing you your kernel is some power domain
grab that would keep DC_OFF and consequently a sane read of these
registers.

Maybe Imre has a quick idea of what you could be missing on your kernel
that we already have on upstream one.

Thanks,
Rodrigo.

> 
> ---
>  drivers/gpu/drm/i915/intel_sprite.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index aa1dfaa692b9..135b41568503 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -107,14 +107,17 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  						      VBLANK_EVASION_TIME_US);
>  	max = vblank_start - 1;
>  
> -	local_irq_disable();
> -
>  	if (min <= 0 || max <= 0)
>  		return;
>  
>  	if (WARN_ON(drm_crtc_vblank_get(&crtc->base)))
>  		return;
>  
> +	if(new_crtc_state->has_psr && dev_priv->psr.active)
> +		intel_wait_for_register(dev_priv, EDP_PSR_STATUS, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 5);
> +
> +	local_irq_disable();
> +
>  	crtc->debug.min_vbl = min;
>  	crtc->debug.max_vbl = max;
>  	trace_i915_pipe_update_start(crtc);
> -- 
> 2.13.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30 17:19 ` [PATCH] " Rodrigo Vivi
@ 2018-05-02 18:19   ` Tarun Vyas
  2018-05-02 18:51     ` Ville Syrjälä
  0 siblings, 1 reply; 13+ messages in thread
From: Tarun Vyas @ 2018-05-02 18:19 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Deak, Pandiyan, Dhinakaran, intel-gfx

On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > From: Tarun <tarun.vyas@intel.com>
> > 
> > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > the pipe_update_start call schedules itself out to check back later.
> > 
> > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > lags w.r.t core kernel code, hot plugging an external display triggers
> > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > closer analysis reveals that we try to read the scanline 3 times and
> > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > b/c the source is still in PSR.
> > 
> > Regardless, we should wait for PSR exit (if PSR is supported and active
> > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > fully exited PSR, then checking for vblank evasion isn't actually
> > applicable.
> > 
> > This scenario applies to a configuration with an additional pipe,
> > as of now.
> 
> I honestly believe you picking the wrong culprit here. By "coincidence".
> PSR will allow DC state with screen on and DC state will mess up with all
> registers reads....
> 
> probably what you are missing you your kernel is some power domain
> grab that would keep DC_OFF and consequently a sane read of these
> registers.
> 
> Maybe Imre has a quick idea of what you could be missing on your kernel
> that we already have on upstream one.
> 
> Thanks,
> Rodrigo.
>
Thanks for the quick response Rodrigo !
Some key observations based on my experiments so far:
       for (;;) {                                                                 
                /*                                                                
                 * prepare_to_wait() has a memory barrier, which guarantees       
                 * other CPUs can see the task state update by the time we        
                 * read the scanline.                                             
                 */                                                               
                prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);                 
                                                                                  
                scanline = intel_get_crtc_scanline(crtc);                         
                if (scanline < min || scanline > max)                             
                        break;                                                    
                                                                                  
                if (timeout <= 0) {                                               
                        DRM_ERROR("Potential atomic update failure on pipe %c\n", 
                                  pipe_name(crtc->pipe));                         
                        break;                                                    
                }                                                                 
                                                                                  
                local_irq_enable();                                               
                                                                                  
                timeout = schedule_timeout(timeout);                              
                                                                                  
                local_irq_disable();                                              
        }
1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.

2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.

3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.

My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.

Thanks,
Tarun
> > 
> > ---
> >  drivers/gpu/drm/i915/intel_sprite.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> > index aa1dfaa692b9..135b41568503 100644
> > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > @@ -107,14 +107,17 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> >  						      VBLANK_EVASION_TIME_US);
> >  	max = vblank_start - 1;
> >  
> > -	local_irq_disable();
> > -
> >  	if (min <= 0 || max <= 0)
> >  		return;
> >  
> >  	if (WARN_ON(drm_crtc_vblank_get(&crtc->base)))
> >  		return;
> >  
> > +	if(new_crtc_state->has_psr && dev_priv->psr.active)
> > +		intel_wait_for_register(dev_priv, EDP_PSR_STATUS, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 5);
> > +
> > +	local_irq_disable();
> > +
> >  	crtc->debug.min_vbl = min;
> >  	crtc->debug.max_vbl = max;
> >  	trace_i915_pipe_update_start(crtc);
> > -- 
> > 2.13.5
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-05-02 18:19   ` Tarun Vyas
@ 2018-05-02 18:51     ` Ville Syrjälä
  2018-05-02 20:04       ` Rodrigo Vivi
  0 siblings, 1 reply; 13+ messages in thread
From: Ville Syrjälä @ 2018-05-02 18:51 UTC (permalink / raw)
  To: Tarun Vyas; +Cc: Deak, intel-gfx, Pandiyan, Dhinakaran, Rodrigo Vivi

On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > From: Tarun <tarun.vyas@intel.com>
> > > 
> > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > > the pipe_update_start call schedules itself out to check back later.
> > > 
> > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > closer analysis reveals that we try to read the scanline 3 times and
> > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > > b/c the source is still in PSR.
> > > 
> > > Regardless, we should wait for PSR exit (if PSR is supported and active
> > > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > > fully exited PSR, then checking for vblank evasion isn't actually
> > > applicable.
> > > 
> > > This scenario applies to a configuration with an additional pipe,
> > > as of now.
> > 
> > I honestly believe you picking the wrong culprit here. By "coincidence".
> > PSR will allow DC state with screen on and DC state will mess up with all
> > registers reads....
> > 
> > probably what you are missing you your kernel is some power domain
> > grab that would keep DC_OFF and consequently a sane read of these
> > registers.
> > 
> > Maybe Imre has a quick idea of what you could be missing on your kernel
> > that we already have on upstream one.
> > 
> > Thanks,
> > Rodrigo.
> >
> Thanks for the quick response Rodrigo !
> Some key observations based on my experiments so far:
>        for (;;) {                                                                 
>                 /*                                                                
>                  * prepare_to_wait() has a memory barrier, which guarantees       
>                  * other CPUs can see the task state update by the time we        
>                  * read the scanline.                                             
>                  */                                                               
>                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);                 
>                                                                                   
>                 scanline = intel_get_crtc_scanline(crtc);                         
>                 if (scanline < min || scanline > max)                             
>                         break;                                                    
>                                                                                   
>                 if (timeout <= 0) {                                               
>                         DRM_ERROR("Potential atomic update failure on pipe %c\n", 
>                                   pipe_name(crtc->pipe));                         
>                         break;                                                    
>                 }                                                                 
>                                                                                   
>                 local_irq_enable();                                               
>                                                                                   
>                 timeout = schedule_timeout(timeout);                              
>                                                                                   
>                 local_irq_disable();                                              
>         }
> 1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.
> 
> 2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.
> 
> 3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.
> 
> My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.

Yeah, I think this is the right direction. The problem really is the
extra vblank pulse that the hardware generates (or at least can
generate depending on a chicken bit) when it exits PSR. We have no
control over when that happens and hence we have no control over when
the registers get latched. And yet we still have to somehow prevent
the register latching from occurring while we're in middle of
reprogramming them.

There are a couple of ways to avoid this:
1) Set the chicken bit so that we don't get the vblank pulse. The
   pipe should restart from the vblank start, so we would have one
   full frame to reprogam the registers. Howver IIRC DK told me
   there is no way to fully eliminate it in all cases so this
   option is probably out. There was also some implication for FBC
   which I already forgot.
2) Make sure we've exited PSR before repgrogamming the registers
   (ie. what you do).
3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
   latching the registers while we're still reprogramming them.
   This feature only exists on SKL+ so is not a solution for
   HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
   pulse?

Option 2) does provide a consistent behaviour on all platforms, so I
do kinda like it. It also avoids a bigger reword on account of the
DOUBLE_BUFFER_CTL. I do think we'll have to start using
DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
we don't block PSR progress on that work.

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-05-02 18:51     ` Ville Syrjälä
@ 2018-05-02 20:04       ` Rodrigo Vivi
  2018-05-02 22:31         ` Tarun Vyas
  0 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2018-05-02 20:04 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Deak, intel-gfx, Pandiyan, Dhinakaran

On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > From: Tarun <tarun.vyas@intel.com>
> > > > 
> > > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > > > the pipe_update_start call schedules itself out to check back later.
> > > > 
> > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > > > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > > > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > > > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > > > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > > > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > > > b/c the source is still in PSR.
> > > > 
> > > > Regardless, we should wait for PSR exit (if PSR is supported and active
> > > > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > > > fully exited PSR, then checking for vblank evasion isn't actually
> > > > applicable.
> > > > 
> > > > This scenario applies to a configuration with an additional pipe,
> > > > as of now.
> > > 
> > > I honestly believe you picking the wrong culprit here. By "coincidence".
> > > PSR will allow DC state with screen on and DC state will mess up with all
> > > registers reads....
> > > 
> > > probably what you are missing you your kernel is some power domain
> > > grab that would keep DC_OFF and consequently a sane read of these
> > > registers.
> > > 
> > > Maybe Imre has a quick idea of what you could be missing on your kernel
> > > that we already have on upstream one.
> > > 
> > > Thanks,
> > > Rodrigo.
> > >
> > Thanks for the quick response Rodrigo !
> > Some key observations based on my experiments so far:
> >        for (;;) {                                                                 
> >                 /*                                                                
> >                  * prepare_to_wait() has a memory barrier, which guarantees       
> >                  * other CPUs can see the task state update by the time we        
> >                  * read the scanline.                                             
> >                  */                                                               
> >                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);                 
> >                                                                                   
> >                 scanline = intel_get_crtc_scanline(crtc);                         
> >                 if (scanline < min || scanline > max)                             
> >                         break;                                                    
> >                                                                                   
> >                 if (timeout <= 0) {                                               
> >                         DRM_ERROR("Potential atomic update failure on pipe %c\n", 
> >                                   pipe_name(crtc->pipe));                         
> >                         break;                                                    
> >                 }                                                                 
> >                                                                                   
> >                 local_irq_enable();                                               
> >                                                                                   
> >                 timeout = schedule_timeout(timeout);                              
> >                                                                                   
> >                 local_irq_disable();                                              
> >         }
> > 1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.
> > 
> > 2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.
> > 
> > 3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.

Oh! So you really are getting reliable counters....

> > 
> > My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.
> 
> Yeah, I think this is the right direction. The problem really is the
> extra vblank pulse that the hardware generates (or at least can
> generate depending on a chicken bit) when it exits PSR. We have no
> control over when that happens and hence we have no control over when
> the registers get latched. And yet we still have to somehow prevent
> the register latching from occurring while we're in middle of
> reprogramming them.

I see the problem now. Thanks for the explanation.

> 
> There are a couple of ways to avoid this:
> 1) Set the chicken bit so that we don't get the vblank pulse. The
>    pipe should restart from the vblank start, so we would have one
>    full frame to reprogam the registers. Howver IIRC DK told me
>    there is no way to fully eliminate it in all cases so this
>    option is probably out. There was also some implication for FBC
>    which I already forgot.
> 2) Make sure we've exited PSR before repgrogamming the registers
>    (ie. what you do).
> 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
>    latching the registers while we're still reprogramming them.
>    This feature only exists on SKL+ so is not a solution for
>    HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
>    pulse?
> 
> Option 2) does provide a consistent behaviour on all platforms, so I
> do kinda like it. It also avoids a bigger reword on account of the
> DOUBLE_BUFFER_CTL. I do think we'll have to start using
> DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
> we don't block PSR progress on that work.

My vote is for the option 2. Seems more straighforward and more broad.

DK?

My only request on the patch itself would be to create a function
on intel_psr.c intel_psr_wait_for_idle... or something like this
and put the register wait logic inside it instead of spreading
the psr code around.

Thanks,
Rodrigo.

> 
> -- 
> Ville Syrjälä
> Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-05-02 20:04       ` Rodrigo Vivi
@ 2018-05-02 22:31         ` Tarun Vyas
  2018-05-03 16:58           ` Rodrigo Vivi
  0 siblings, 1 reply; 13+ messages in thread
From: Tarun Vyas @ 2018-05-02 22:31 UTC (permalink / raw)
  To: Vivi, Rodrigo; +Cc: Deak, Pandiyan, Dhinakaran, intel-gfx

On Wed, May 02, 2018 at 01:04:06PM -0700, Vivi, Rodrigo wrote:
> On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> > On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > > From: Tarun <tarun.vyas@intel.com>
> > > > >
> > > > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > > > > the pipe_update_start call schedules itself out to check back later.
> > > > >
> > > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > > > > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > > > > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > > > > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > > > > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > > > > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > > > > b/c the source is still in PSR.
> > > > >
> > > > > Regardless, we should wait for PSR exit (if PSR is supported and active
> > > > > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > > > > fully exited PSR, then checking for vblank evasion isn't actually
> > > > > applicable.
> > > > >
> > > > > This scenario applies to a configuration with an additional pipe,
> > > > > as of now.
> > > >
> > > > I honestly believe you picking the wrong culprit here. By "coincidence".
> > > > PSR will allow DC state with screen on and DC state will mess up with all
> > > > registers reads....
> > > >
> > > > probably what you are missing you your kernel is some power domain
> > > > grab that would keep DC_OFF and consequently a sane read of these
> > > > registers.
> > > >
> > > > Maybe Imre has a quick idea of what you could be missing on your kernel
> > > > that we already have on upstream one.
> > > >
> > > > Thanks,
> > > > Rodrigo.
> > > >
> > > Thanks for the quick response Rodrigo !
> > > Some key observations based on my experiments so far:
> > >        for (;;) {
> > >                 /*
> > >                  * prepare_to_wait() has a memory barrier, which guarantees
> > >                  * other CPUs can see the task state update by the time we
> > >                  * read the scanline.
> > >                  */
> > >                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > >
> > >                 scanline = intel_get_crtc_scanline(crtc);
> > >                 if (scanline < min || scanline > max)
> > >                         break;
> > >
> > >                 if (timeout <= 0) {
> > >                         DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > >                                   pipe_name(crtc->pipe));
> > >                         break;
> > >                 }
> > >
> > >                 local_irq_enable();
> > >
> > >                 timeout = schedule_timeout(timeout);
> > >
> > >                 local_irq_disable();
> > >         }
> > > 1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.
> > >
> > > 2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.
> > >
> > > 3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.
> 
> Oh! So you really are getting reliable counters....
> 
> > >
> > > My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.
> >
> > Yeah, I think this is the right direction. The problem really is the
> > extra vblank pulse that the hardware generates (or at least can
> > generate depending on a chicken bit) when it exits PSR. We have no
> > control over when that happens and hence we have no control over when
> > the registers get latched. And yet we still have to somehow prevent
> > the register latching from occurring while we're in middle of
> > reprogramming them.
> 
> I see the problem now. Thanks for the explanation.
> 
> >
> > There are a couple of ways to avoid this:
> > 1) Set the chicken bit so that we don't get the vblank pulse. The
> >    pipe should restart from the vblank start, so we would have one
> >    full frame to reprogam the registers. Howver IIRC DK told me
> >    there is no way to fully eliminate it in all cases so this
> >    option is probably out. There was also some implication for FBC
> >    which I already forgot.
> > 2) Make sure we've exited PSR before repgrogamming the registers
> >    (ie. what you do).
> > 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
> >    latching the registers while we're still reprogramming them.
> >    This feature only exists on SKL+ so is not a solution for
> >    HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
> >    pulse?
> >
> > Option 2) does provide a consistent behaviour on all platforms, so I
> > do kinda like it. It also avoids a bigger reword on account of the
> > DOUBLE_BUFFER_CTL. I do think we'll have to start using
> > DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
> > we don't block PSR progress on that work.
> 
> My vote is for the option 2. Seems more straighforward and more broad.
> 
> DK?
> 
> My only request on the patch itself would be to create a function
> on intel_psr.c intel_psr_wait_for_idle... or something like this
> and put the register wait logic inside it instead of spreading
> the psr code around.
> 
> Thanks,
> Rodrigo.
> 
> >
> > --
> > Ville Syrjälä
> > Intel
Thanks for the comments, Ville and Rodrigo. I'll rework this to move the wait to intel_psr.c. There is a psr_wait_for_idle() in there, but there are some PSR locks being passed around inside it (eventually released by the caller). Also,the max timeout specified there is 50 msec which might be way too much ?

Best,
Tarun
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-05-02 22:31         ` Tarun Vyas
@ 2018-05-03 16:58           ` Rodrigo Vivi
  2018-05-03 17:08             ` Tarun Vyas
  0 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2018-05-03 16:58 UTC (permalink / raw)
  To: Tarun Vyas; +Cc: Deak, Pandiyan, Dhinakaran, intel-gfx

On Wed, May 02, 2018 at 03:31:15PM -0700, Tarun Vyas wrote:
> On Wed, May 02, 2018 at 01:04:06PM -0700, Vivi, Rodrigo wrote:
> > On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> > > On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > > > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > > > From: Tarun <tarun.vyas@intel.com>
> > > > > >
> > > > > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > > > > > the pipe_update_start call schedules itself out to check back later.
> > > > > >
> > > > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > > > > > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > > > > > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > > > > > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > > > > > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > > > > > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > > > > > b/c the source is still in PSR.
> > > > > >
> > > > > > Regardless, we should wait for PSR exit (if PSR is supported and active
> > > > > > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > > > > > fully exited PSR, then checking for vblank evasion isn't actually
> > > > > > applicable.
> > > > > >
> > > > > > This scenario applies to a configuration with an additional pipe,
> > > > > > as of now.
> > > > >
> > > > > I honestly believe you picking the wrong culprit here. By "coincidence".
> > > > > PSR will allow DC state with screen on and DC state will mess up with all
> > > > > registers reads....
> > > > >
> > > > > probably what you are missing you your kernel is some power domain
> > > > > grab that would keep DC_OFF and consequently a sane read of these
> > > > > registers.
> > > > >
> > > > > Maybe Imre has a quick idea of what you could be missing on your kernel
> > > > > that we already have on upstream one.
> > > > >
> > > > > Thanks,
> > > > > Rodrigo.
> > > > >
> > > > Thanks for the quick response Rodrigo !
> > > > Some key observations based on my experiments so far:
> > > >        for (;;) {
> > > >                 /*
> > > >                  * prepare_to_wait() has a memory barrier, which guarantees
> > > >                  * other CPUs can see the task state update by the time we
> > > >                  * read the scanline.
> > > >                  */
> > > >                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > >
> > > >                 scanline = intel_get_crtc_scanline(crtc);
> > > >                 if (scanline < min || scanline > max)
> > > >                         break;
> > > >
> > > >                 if (timeout <= 0) {
> > > >                         DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > >                                   pipe_name(crtc->pipe));
> > > >                         break;
> > > >                 }
> > > >
> > > >                 local_irq_enable();
> > > >
> > > >                 timeout = schedule_timeout(timeout);
> > > >
> > > >                 local_irq_disable();
> > > >         }
> > > > 1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.
> > > >
> > > > 2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.
> > > >
> > > > 3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.
> > 
> > Oh! So you really are getting reliable counters....
> > 
> > > >
> > > > My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.
> > >
> > > Yeah, I think this is the right direction. The problem really is the
> > > extra vblank pulse that the hardware generates (or at least can
> > > generate depending on a chicken bit) when it exits PSR. We have no
> > > control over when that happens and hence we have no control over when
> > > the registers get latched. And yet we still have to somehow prevent
> > > the register latching from occurring while we're in middle of
> > > reprogramming them.
> > 
> > I see the problem now. Thanks for the explanation.
> > 
> > >
> > > There are a couple of ways to avoid this:
> > > 1) Set the chicken bit so that we don't get the vblank pulse. The
> > >    pipe should restart from the vblank start, so we would have one
> > >    full frame to reprogam the registers. Howver IIRC DK told me
> > >    there is no way to fully eliminate it in all cases so this
> > >    option is probably out. There was also some implication for FBC
> > >    which I already forgot.
> > > 2) Make sure we've exited PSR before repgrogamming the registers
> > >    (ie. what you do).
> > > 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
> > >    latching the registers while we're still reprogramming them.
> > >    This feature only exists on SKL+ so is not a solution for
> > >    HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
> > >    pulse?
> > >
> > > Option 2) does provide a consistent behaviour on all platforms, so I
> > > do kinda like it. It also avoids a bigger reword on account of the
> > > DOUBLE_BUFFER_CTL. I do think we'll have to start using
> > > DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
> > > we don't block PSR progress on that work.
> > 
> > My vote is for the option 2. Seems more straighforward and more broad.
> > 
> > DK?
> > 
> > My only request on the patch itself would be to create a function
> > on intel_psr.c intel_psr_wait_for_idle... or something like this
> > and put the register wait logic inside it instead of spreading
> > the psr code around.
> > 
> > Thanks,
> > Rodrigo.
> > 
> > >
> > > --
> > > Ville Syrjälä
> > > Intel
> Thanks for the comments, Ville and Rodrigo. I'll rework this to move the wait to intel_psr.c. There is a psr_wait_for_idle() in there, but there are some PSR locks being passed around inside it (eventually released by the caller). Also,the max timeout specified there is 50 msec which might be way too much ?

ouch! that function is ugly.... unlock than lock back again...
(specially unlock without any assert locked... :/)

If you can improve that or split in a way that we reuse some code it would be nice...

>
> Best,
> Tarun
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-05-03 16:58           ` Rodrigo Vivi
@ 2018-05-03 17:08             ` Tarun Vyas
  0 siblings, 0 replies; 13+ messages in thread
From: Tarun Vyas @ 2018-05-03 17:08 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dhinakaran.pandiyan

On Thu, May 03, 2018 at 09:58:56AM -0700, Rodrigo Vivi wrote:
> On Wed, May 02, 2018 at 03:31:15PM -0700, Tarun Vyas wrote:
> > On Wed, May 02, 2018 at 01:04:06PM -0700, Vivi, Rodrigo wrote:
> > > On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> > > > On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > > > > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > > > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > > > > From: Tarun <tarun.vyas@intel.com>
> > > > > > >
> > > > > > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> > > > > > > the pipe_update_start call schedules itself out to check back later.
> > > > > > >
> > > > > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> > > > > > > stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> > > > > > > reason we loop inside intel_pipe_update start for ~2+ msec which in this
> > > > > > > case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> > > > > > > counter, hence no error. On the other hand, the ChromeOS kernel spends
> > > > > > > ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> > > > > > > b/c the source is still in PSR.
> > > > > > >
> > > > > > > Regardless, we should wait for PSR exit (if PSR is supported and active
> > > > > > > on the current pipe) before reading the PIPEDSL, b/c if we haven't
> > > > > > > fully exited PSR, then checking for vblank evasion isn't actually
> > > > > > > applicable.
> > > > > > >
> > > > > > > This scenario applies to a configuration with an additional pipe,
> > > > > > > as of now.
> > > > > >
> > > > > > I honestly believe you picking the wrong culprit here. By "coincidence".
> > > > > > PSR will allow DC state with screen on and DC state will mess up with all
> > > > > > registers reads....
> > > > > >
> > > > > > probably what you are missing you your kernel is some power domain
> > > > > > grab that would keep DC_OFF and consequently a sane read of these
> > > > > > registers.
> > > > > >
> > > > > > Maybe Imre has a quick idea of what you could be missing on your kernel
> > > > > > that we already have on upstream one.
> > > > > >
> > > > > > Thanks,
> > > > > > Rodrigo.
> > > > > >
> > > > > Thanks for the quick response Rodrigo !
> > > > > Some key observations based on my experiments so far:
> > > > >        for (;;) {
> > > > >                 /*
> > > > >                  * prepare_to_wait() has a memory barrier, which guarantees
> > > > >                  * other CPUs can see the task state update by the time we
> > > > >                  * read the scanline.
> > > > >                  */
> > > > >                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > > >
> > > > >                 scanline = intel_get_crtc_scanline(crtc);
> > > > >                 if (scanline < min || scanline > max)
> > > > >                         break;
> > > > >
> > > > >                 if (timeout <= 0) {
> > > > >                         DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > >                                   pipe_name(crtc->pipe));
> > > > >                         break;
> > > > >                 }
> > > > >
> > > > >                 local_irq_enable();
> > > > >
> > > > >                 timeout = schedule_timeout(timeout);
> > > > >
> > > > >                 local_irq_disable();
> > > > >         }
> > > > > 1. In the above loop inside pipe_update_start, the *first time*, we read the PIPEDSL, with PSR1 and external display connected, it always reads 1599, for *both* the kernels(upstream and ChromeOS-4.4) . The PSR_STATUS also reads the exact same for *both* kernels and shows that we haven't *fully* exited PSR.
> > > > >
> > > > > 2. The difference between the two kernels comes after this first read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop and upstream spends ~2msec. I suspect that it is because of the scheduling changes between the two kernels, b/c I can't find any i915 specific code running in that loop, except for vblank processing.
> > > > >
> > > > > 3. So to summarize it, both the kernels are in the same state w.r.t PSR and PIPEDSL value when they read the PIPEDSL for the first time inside the loop. *When* the kernels *transition* to a *full PSR exit* is what is differing.
> > > 
> > > Oh! So you really are getting reliable counters....
> > > 
> > > > >
> > > > > My rationale for this patch is that, the pipe_update_start function is meant to evade 100 usec before a vblank, but, *if* we haven't *fully* exited PSR (which is true for both the kernels for the first PIPEDSL read), then vblank evasion is *not applicable* b/c the PIPEDSL will be messed up. So we shouldn't bother evading vblank until we have fully exited PSR.
> > > >
> > > > Yeah, I think this is the right direction. The problem really is the
> > > > extra vblank pulse that the hardware generates (or at least can
> > > > generate depending on a chicken bit) when it exits PSR. We have no
> > > > control over when that happens and hence we have no control over when
> > > > the registers get latched. And yet we still have to somehow prevent
> > > > the register latching from occurring while we're in middle of
> > > > reprogramming them.
> > > 
> > > I see the problem now. Thanks for the explanation.
> > > 
> > > >
> > > > There are a couple of ways to avoid this:
> > > > 1) Set the chicken bit so that we don't get the vblank pulse. The
> > > >    pipe should restart from the vblank start, so we would have one
> > > >    full frame to reprogam the registers. Howver IIRC DK told me
> > > >    there is no way to fully eliminate it in all cases so this
> > > >    option is probably out. There was also some implication for FBC
> > > >    which I already forgot.
> > > > 2) Make sure we've exited PSR before repgrogamming the registers
> > > >    (ie. what you do).
> > > > 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
> > > >    latching the registers while we're still reprogramming them.
> > > >    This feature only exists on SKL+ so is not a solution for
> > > >    HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
> > > >    pulse?
> > > >
> > > > Option 2) does provide a consistent behaviour on all platforms, so I
> > > > do kinda like it. It also avoids a bigger reword on account of the
> > > > DOUBLE_BUFFER_CTL. I do think we'll have to start using
> > > > DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
> > > > we don't block PSR progress on that work.
> > > 
> > > My vote is for the option 2. Seems more straighforward and more broad.
> > > 
> > > DK?
> > > 
> > > My only request on the patch itself would be to create a function
> > > on intel_psr.c intel_psr_wait_for_idle... or something like this
> > > and put the register wait logic inside it instead of spreading
> > > the psr code around.
> > > 
> > > Thanks,
> > > Rodrigo.
> > > 
> > > >
> > > > --
> > > > Ville Syrjälä
> > > > Intel
> > Thanks for the comments, Ville and Rodrigo. I'll rework this to move the wait to intel_psr.c. There is a psr_wait_for_idle() in there, but there are some PSR locks being passed around inside it (eventually released by the caller). Also,the max timeout specified there is 50 msec which might be way too much ?
> 
> ouch! that function is ugly.... unlock than lock back again...
> (specially unlock without any assert locked... :/)
> 
> If you can improve that or split in a way that we reuse some code it would be nice...
>
Yes, I was also considering splitting it up. I'll rework and send it out. Thanks !

-Tarun 
> >
> > Best,
> > Tarun
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update
  2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
                   ` (4 preceding siblings ...)
  2018-04-30 17:19 ` [PATCH] " Rodrigo Vivi
@ 2018-05-14 12:53 ` Jani Nikula
  5 siblings, 0 replies; 13+ messages in thread
From: Jani Nikula @ 2018-05-14 12:53 UTC (permalink / raw)
  To: Tarun Vyas, intel-gfx; +Cc: dhinakaran.pandiyan, rodrigo.vivi

On Sun, 29 Apr 2018, Tarun Vyas <tarun.vyas@intel.com> wrote:
> From: Tarun <tarun.vyas@intel.com>

Please use your full name. This will appear in git logs.

> The PIPEDSL freezes on PSR entry and if PSR hasn't fully exited, then
> the pipe_update_start call schedules itself out to check back later.
>
> On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> lags w.r.t core kernel code, hot plugging an external display triggers
> tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> closer analysis reveals that we try to read the scanline 3 times and
> eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL
> stuck @ 1599. This issue is not seen on upstream kernels, b/c for *some*
> reason we loop inside intel_pipe_update start for ~2+ msec which in this
> case is more than enough to exit PSR fully, hence an *unstuck* PIPEDSL
> counter, hence no error. On the other hand, the ChromeOS kernel spends
> ~1.1 msec looping inside intel_pipe_update_start and hence errors out
> b/c the source is still in PSR.
>
> Regardless, we should wait for PSR exit (if PSR is supported and active
> on the current pipe) before reading the PIPEDSL, b/c if we haven't
> fully exited PSR, then checking for vblank evasion isn't actually
> applicable.
>
> This scenario applies to a configuration with an additional pipe,
> as of now.

Sign your work. See the Developer's Certificate of Origin in
Documentation/process/submitting-patches.rst.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-05-14 12:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-30  4:00 [PATCH] drm/i915: Wait for PSR exit before checking for vblank evasion for an atomic update Tarun Vyas
2018-04-30  8:20 ` Jani Nikula
2018-04-30 10:48 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-04-30 11:04 ` ✓ Fi.CI.BAT: success " Patchwork
2018-04-30 13:39 ` ✓ Fi.CI.IGT: " Patchwork
2018-04-30 17:19 ` [PATCH] " Rodrigo Vivi
2018-05-02 18:19   ` Tarun Vyas
2018-05-02 18:51     ` Ville Syrjälä
2018-05-02 20:04       ` Rodrigo Vivi
2018-05-02 22:31         ` Tarun Vyas
2018-05-03 16:58           ` Rodrigo Vivi
2018-05-03 17:08             ` Tarun Vyas
2018-05-14 12:53 ` Jani Nikula

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.