All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
@ 2023-05-19 22:56 Umesh Nerlige Ramappa
  2023-05-19 23:58 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-05-19 22:56 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lionel G Landwerlin

On DG2, capturing OA reports while running heavy render workloads
sometimes results in invalid OA reports where 64-byte chunks inside
reports have stale values. Under memory pressure, high OA sampling rates
(13.3 us) and heavy render workload, occassionally, the OA HW TAIL
pointer does not progress as fast as the sampling rate. When these
glitches occur, the TAIL pointer takes approx. 200us to progress.  While
this is expected behavior from the HW perspective, invalid reports are
not expected.

In oa_buffer_check_unlocked(), when we execute the if condition, we are
updating the oa_buffer.tail to the aging tail and then setting pollin
based on this tail value, however, we do not have a chance to rewind and
validate the reports prior to setting pollin. The validation happens
in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
before this validation, then we end up reading reports up until this
oa_buffer.tail value which includes invalid reports. Though found on
DG2, this affects all platforms.

Set the pollin only in the else condition in oa_buffer_check_unlocked.

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 19d5652300ee..61536e3c4ac9 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
 	int report_size = stream->oa_buffer.format->size;
 	unsigned long flags;
-	bool pollin;
+	bool pollin = false;
 	u32 hw_tail;
 	u64 now;
 	u32 partial_report_size;
@@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 		stream->oa_buffer.tail = gtt_offset + tail;
 		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
 		stream->oa_buffer.aging_timestamp = now;
-	}
 
-	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
-			  stream->oa_buffer.head - gtt_offset) >= report_size;
+		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
+				  stream->oa_buffer.head - gtt_offset) >= report_size;
+	}
 
 	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for i915/perf: Avoid reading OA reports before they land
  2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
@ 2023-05-19 23:58 ` Patchwork
  2023-05-20  7:28 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2023-05-19 23:58 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

== Series Details ==

Series: i915/perf: Avoid reading OA reports before they land
URL   : https://patchwork.freedesktop.org/series/118054/
State : warning

== Summary ==

Error: dim checkpatch failed
0e9ec65e25cf i915/perf: Avoid reading OA reports before they land
-:9: WARNING:TYPO_SPELLING: 'occassionally' may be misspelled - perhaps 'occasionally'?
#9: 
(13.3 us) and heavy render workload, occassionally, the OA HW TAIL
                                     ^^^^^^^^^^^^^

-:26: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'Bug:', use 'Link:' or 'Closes:' instead
#26: 
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484

-:27: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'Bug:', use 'Link:' or 'Closes:' instead
#27: 
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757

total: 0 errors, 3 warnings, 0 checks, 21 lines checked



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for i915/perf: Avoid reading OA reports before they land
  2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
  2023-05-19 23:58 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2023-05-20  7:28 ` Patchwork
  2023-05-20 12:05 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2023-05-20  7:28 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 3855 bytes --]

== Series Details ==

Series: i915/perf: Avoid reading OA reports before they land
URL   : https://patchwork.freedesktop.org/series/118054/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_13169 -> Patchwork_118054v1
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/index.html

Participating hosts (36 -> 35)
------------------------------

  Missing    (1): fi-snb-2520m 

Known issues
------------

  Here are the changes found in Patchwork_118054v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_pm_backlight@basic-brightness@edp-1:
    - bat-rplp-1:         NOTRUN -> [ABORT][1] ([i915#7077])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/bat-rplp-1/igt@i915_pm_backlight@basic-brightness@edp-1.html

  * igt@i915_selftest@live@reset:
    - bat-rpls-1:         NOTRUN -> [ABORT][2] ([i915#4983] / [i915#7461] / [i915#8347] / [i915#8384])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/bat-rpls-1/igt@i915_selftest@live@reset.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - fi-hsw-4770:        NOTRUN -> [SKIP][3] ([fdo#109271])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/fi-hsw-4770/igt@kms_pipe_crc_basic@suspend-read-crc.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@requests:
    - bat-rpls-1:         [ABORT][4] ([i915#4983] / [i915#7911] / [i915#7920]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/bat-rpls-1/igt@i915_selftest@live@requests.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/bat-rpls-1/igt@i915_selftest@live@requests.html

  
#### Warnings ####

  * igt@kms_setmode@basic-clone-single-crtc:
    - bat-rplp-1:         [ABORT][6] ([i915#4579] / [i915#8260]) -> [SKIP][7] ([i915#3555] / [i915#4579])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/bat-rplp-1/igt@kms_setmode@basic-clone-single-crtc.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/bat-rplp-1/igt@kms_setmode@basic-clone-single-crtc.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#7077]: https://gitlab.freedesktop.org/drm/intel/issues/7077
  [i915#7461]: https://gitlab.freedesktop.org/drm/intel/issues/7461
  [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
  [i915#7911]: https://gitlab.freedesktop.org/drm/intel/issues/7911
  [i915#7920]: https://gitlab.freedesktop.org/drm/intel/issues/7920
  [i915#7953]: https://gitlab.freedesktop.org/drm/intel/issues/7953
  [i915#8260]: https://gitlab.freedesktop.org/drm/intel/issues/8260
  [i915#8347]: https://gitlab.freedesktop.org/drm/intel/issues/8347
  [i915#8384]: https://gitlab.freedesktop.org/drm/intel/issues/8384


Build changes
-------------

  * Linux: CI_DRM_13169 -> Patchwork_118054v1

  CI-20190529: 20190529
  CI_DRM_13169: f533234d40e8f5b8599bd5bc97fa8e30384aec03 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7297: 0f0754413f14abe2fe6434fd0873c158dbc47ec9 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_118054v1: f533234d40e8f5b8599bd5bc97fa8e30384aec03 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

bd939ad4e8fd i915/perf: Avoid reading OA reports before they land

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/index.html

[-- Attachment #2: Type: text/html, Size: 4551 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for i915/perf: Avoid reading OA reports before they land
  2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
  2023-05-19 23:58 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
  2023-05-20  7:28 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2023-05-20 12:05 ` Patchwork
  2023-05-20 17:27 ` [Intel-gfx] [PATCH] " Lionel Landwerlin
  2023-05-22 20:20 ` Dixit, Ashutosh
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2023-05-20 12:05 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 9191 bytes --]

== Series Details ==

Series: i915/perf: Avoid reading OA reports before they land
URL   : https://patchwork.freedesktop.org/series/118054/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_13169_full -> Patchwork_118054v1_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_118054v1_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_118054v1_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_118054v1_full:

### IGT changes ###

#### Possible regressions ####

  * igt@perf@gen12-group-concurrent-oa-buffer-read:
    - shard-apl:          [PASS][1] -> [FAIL][2] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-apl1/igt@perf@gen12-group-concurrent-oa-buffer-read.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-apl4/igt@perf@gen12-group-concurrent-oa-buffer-read.html

  
Known issues
------------

  Here are the changes found in Patchwork_118054v1_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_fair@basic-deadline:
    - shard-glk:          [PASS][3] -> [FAIL][4] ([i915#2846])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-glk2/igt@gem_exec_fair@basic-deadline.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-glk8/igt@gem_exec_fair@basic-deadline.html

  * igt@kms_color@ctm-0-75@pipe-b-vga-1:
    - shard-snb:          NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#4579]) +14 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-snb2/igt@kms_color@ctm-0-75@pipe-b-vga-1.html

  * igt@kms_flip@2x-nonexisting-fb:
    - shard-snb:          NOTRUN -> [SKIP][6] ([fdo#109271]) +16 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-snb5/igt@kms_flip@2x-nonexisting-fb.html

  
#### Possible fixes ####

  * igt@gem_eio@hibernate:
    - {shard-tglu}:       [ABORT][7] ([i915#7975] / [i915#8213] / [i915#8398]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-tglu-10/igt@gem_eio@hibernate.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-tglu-2/igt@gem_eio@hibernate.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - {shard-rkl}:        [FAIL][9] ([i915#2842]) -> [PASS][10] +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-rkl-1/igt@gem_exec_fair@basic-pace@rcs0.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-rkl-3/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_lmem_swapping@smem-oom@lmem0:
    - {shard-dg1}:        [TIMEOUT][11] ([i915#5493]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-dg1-12/igt@gem_lmem_swapping@smem-oom@lmem0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-dg1-17/igt@gem_lmem_swapping@smem-oom@lmem0.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-hdmi-a:
    - {shard-rkl}:        [SKIP][13] ([i915#1937] / [i915#4579]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-rkl-2/igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-hdmi-a.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-rkl-7/igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-hdmi-a.html

  * igt@i915_pm_rc6_residency@rc6-idle@vcs0:
    - {shard-dg1}:        [FAIL][15] ([i915#3591]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-dg1-18/igt@i915_pm_rc6_residency@rc6-idle@vcs0.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-dg1-18/igt@i915_pm_rc6_residency@rc6-idle@vcs0.html

  * igt@i915_suspend@basic-s3-without-i915:
    - {shard-rkl}:        [FAIL][17] ([fdo#103375]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-rkl-6/igt@i915_suspend@basic-s3-without-i915.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-rkl-7/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-apl:          [FAIL][19] ([i915#2346]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-apl4/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-apl2/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled:
    - shard-glk:          [DMESG-WARN][21] ([i915#7936]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-glk5/igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-glk3/igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - {shard-dg1}:        [FAIL][23] ([fdo#103375]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13169/shard-dg1-18/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/shard-dg1-14/igt@kms_frontbuffer_tracking@fbc-suspend.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1937]: https://gitlab.freedesktop.org/drm/intel/issues/1937
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3840]: https://gitlab.freedesktop.org/drm/intel/issues/3840
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4816]: https://gitlab.freedesktop.org/drm/intel/issues/4816
  [i915#5099]: https://gitlab.freedesktop.org/drm/intel/issues/5099
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#5493]: https://gitlab.freedesktop.org/drm/intel/issues/5493
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6268]: https://gitlab.freedesktop.org/drm/intel/issues/6268
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6786]: https://gitlab.freedesktop.org/drm/intel/issues/6786
  [i915#6944]: https://gitlab.freedesktop.org/drm/intel/issues/6944
  [i915#7116]: https://gitlab.freedesktop.org/drm/intel/issues/7116
  [i915#7118]: https://gitlab.freedesktop.org/drm/intel/issues/7118
  [i915#7697]: https://gitlab.freedesktop.org/drm/intel/issues/7697
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7936]: https://gitlab.freedesktop.org/drm/intel/issues/7936
  [i915#7975]: https://gitlab.freedesktop.org/drm/intel/issues/7975
  [i915#8011]: https://gitlab.freedesktop.org/drm/intel/issues/8011
  [i915#8213]: https://gitlab.freedesktop.org/drm/intel/issues/8213
  [i915#8398]: https://gitlab.freedesktop.org/drm/intel/issues/8398


Build changes
-------------

  * Linux: CI_DRM_13169 -> Patchwork_118054v1

  CI-20190529: 20190529
  CI_DRM_13169: f533234d40e8f5b8599bd5bc97fa8e30384aec03 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7297: 0f0754413f14abe2fe6434fd0873c158dbc47ec9 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_118054v1: f533234d40e8f5b8599bd5bc97fa8e30384aec03 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118054v1/index.html

[-- Attachment #2: Type: text/html, Size: 8414 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
  2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
                   ` (2 preceding siblings ...)
  2023-05-20 12:05 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
@ 2023-05-20 17:27 ` Lionel Landwerlin
  2023-05-22 20:20 ` Dixit, Ashutosh
  4 siblings, 0 replies; 9+ messages in thread
From: Lionel Landwerlin @ 2023-05-20 17:27 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa, intel-gfx

Hi Umesh,

Looks like there is still a problem with the if block moving the 
stream->oa_buffer.tail forward.
An application not doing any polling would still run into the same problem.

If I understand correctly this change, it means the time based 
workaround doesn't work.
We need to actually check the report's content before moving the 
software tracked tail.
If that's the case, they maybe we should just delete that code.

-Lionel

On 20/05/2023 01:56, Umesh Nerlige Ramappa wrote:
> On DG2, capturing OA reports while running heavy render workloads
> sometimes results in invalid OA reports where 64-byte chunks inside
> reports have stale values. Under memory pressure, high OA sampling rates
> (13.3 us) and heavy render workload, occassionally, the OA HW TAIL
> pointer does not progress as fast as the sampling rate. When these
> glitches occur, the TAIL pointer takes approx. 200us to progress.  While
> this is expected behavior from the HW perspective, invalid reports are
> not expected.
>
> In oa_buffer_check_unlocked(), when we execute the if condition, we are
> updating the oa_buffer.tail to the aging tail and then setting pollin
> based on this tail value, however, we do not have a chance to rewind and
> validate the reports prior to setting pollin. The validation happens
> in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
> before this validation, then we end up reading reports up until this
> oa_buffer.tail value which includes invalid reports. Though found on
> DG2, this affects all platforms.
>
> Set the pollin only in the else condition in oa_buffer_check_unlocked.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 19d5652300ee..61536e3c4ac9 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>   	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>   	int report_size = stream->oa_buffer.format->size;
>   	unsigned long flags;
> -	bool pollin;
> +	bool pollin = false;
>   	u32 hw_tail;
>   	u64 now;
>   	u32 partial_report_size;
> @@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>   		stream->oa_buffer.tail = gtt_offset + tail;
>   		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
>   		stream->oa_buffer.aging_timestamp = now;
> -	}
>   
> -	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> -			  stream->oa_buffer.head - gtt_offset) >= report_size;
> +		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> +				  stream->oa_buffer.head - gtt_offset) >= report_size;
> +	}
>   
>   	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>   



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
  2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
                   ` (3 preceding siblings ...)
  2023-05-20 17:27 ` [Intel-gfx] [PATCH] " Lionel Landwerlin
@ 2023-05-22 20:20 ` Dixit, Ashutosh
  2023-05-22 22:08   ` Umesh Nerlige Ramappa
  4 siblings, 1 reply; 9+ messages in thread
From: Dixit, Ashutosh @ 2023-05-22 20:20 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: Lionel G Landwerlin, intel-gfx

On Fri, 19 May 2023 15:56:42 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> On DG2, capturing OA reports while running heavy render workloads
> sometimes results in invalid OA reports where 64-byte chunks inside
> reports have stale values. Under memory pressure, high OA sampling rates
> (13.3 us) and heavy render workload, occassionally, the OA HW TAIL
> pointer does not progress as fast as the sampling rate. When these
> glitches occur, the TAIL pointer takes approx. 200us to progress.  While
> this is expected behavior from the HW perspective, invalid reports are
> not expected.
>
> In oa_buffer_check_unlocked(), when we execute the if condition, we are
> updating the oa_buffer.tail to the aging tail and then setting pollin
> based on this tail value, however, we do not have a chance to rewind and
> validate the reports prior to setting pollin. The validation happens
> in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
> before this validation, then we end up reading reports up until this
> oa_buffer.tail value which includes invalid reports. Though found on
> DG2, this affects all platforms.
>
> Set the pollin only in the else condition in oa_buffer_check_unlocked.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 19d5652300ee..61536e3c4ac9 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>	int report_size = stream->oa_buffer.format->size;
>	unsigned long flags;
> -	bool pollin;
> +	bool pollin = false;
>	u32 hw_tail;
>	u64 now;
>	u32 partial_report_size;
> @@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>		stream->oa_buffer.tail = gtt_offset + tail;
>		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
>		stream->oa_buffer.aging_timestamp = now;
> -	}
>
> -	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> -			  stream->oa_buffer.head - gtt_offset) >= report_size;
> +		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> +				  stream->oa_buffer.head - gtt_offset) >= report_size;
> +	}

The issue has been correctly identified above. But seems that the real
cause for the issue is not that pollin statement above is misplaced but
that updating the tail via aging is unreliable (at least with the present
timeout as you mention above). Also, it is not clear why we have tail aging
at all, since it seems we can detect when reports land (by checking
report_id and timestamp). So rather than move the pollin into the else, we
should just eliminate the if () part. And if we are eliminating the if ()
we can just eliminate the concept of tail aging from the code (and
comments) and rely solely on explicit detection of reports landing.

Separately, there seems to be another related bug in the code, I have sent
a patch for that here:

https://patchwork.freedesktop.org/series/118151/

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
  2023-05-22 20:20 ` Dixit, Ashutosh
@ 2023-05-22 22:08   ` Umesh Nerlige Ramappa
  2023-05-23 18:20     ` Dixit, Ashutosh
  0 siblings, 1 reply; 9+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-05-22 22:08 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Lionel G Landwerlin, intel-gfx

On Mon, May 22, 2023 at 01:20:12PM -0700, Dixit, Ashutosh wrote:
>On Fri, 19 May 2023 15:56:42 -0700, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> On DG2, capturing OA reports while running heavy render workloads
>> sometimes results in invalid OA reports where 64-byte chunks inside
>> reports have stale values. Under memory pressure, high OA sampling rates
>> (13.3 us) and heavy render workload, occassionally, the OA HW TAIL
>> pointer does not progress as fast as the sampling rate. When these
>> glitches occur, the TAIL pointer takes approx. 200us to progress.  While
>> this is expected behavior from the HW perspective, invalid reports are
>> not expected.
>>
>> In oa_buffer_check_unlocked(), when we execute the if condition, we are
>> updating the oa_buffer.tail to the aging tail and then setting pollin
>> based on this tail value, however, we do not have a chance to rewind and
>> validate the reports prior to setting pollin. The validation happens
>> in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
>> before this validation, then we end up reading reports up until this
>> oa_buffer.tail value which includes invalid reports. Though found on
>> DG2, this affects all platforms.
>>
>> Set the pollin only in the else condition in oa_buffer_check_unlocked.
>>
>> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
>> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 19d5652300ee..61536e3c4ac9 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>>	int report_size = stream->oa_buffer.format->size;
>>	unsigned long flags;
>> -	bool pollin;
>> +	bool pollin = false;
>>	u32 hw_tail;
>>	u64 now;
>>	u32 partial_report_size;
>> @@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>		stream->oa_buffer.tail = gtt_offset + tail;
>>		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
>>		stream->oa_buffer.aging_timestamp = now;
>> -	}
>>
>> -	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>> -			  stream->oa_buffer.head - gtt_offset) >= report_size;
>> +		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>> +				  stream->oa_buffer.head - gtt_offset) >= report_size;
>> +	}
>
>The issue has been correctly identified above. But seems that the real
>cause for the issue is not that pollin statement above is misplaced but
>that updating the tail via aging is unreliable (at least with the present
>timeout as you mention above). Also, it is not clear why we have tail aging
>at all, since it seems we can detect when reports land (by checking
>report_id and timestamp). So rather than move the pollin into the else, we
>should just eliminate the if () part. And if we are eliminating the if ()
>we can just eliminate the concept of tail aging from the code (and
>comments) and rely solely on explicit detection of reports landing.

I thought so too, it would be much simpler code. Looks like Lionel 
agrees with removing this code as well. 

I do have a couple concerns though.

- In the blocking case, i915_perf_read() path waits on a queue with the
condition being oa_buffer_check_unlocked(). If sampling rate is high, 
oa_buffer_check_unlocked will almost always return true. If we remove 
the if block, we may run the rewind logic too often to detect reports 
that landed. The aging logic is just giving a 100us buffer to avoid 
repeated checks here if tail hasn't moved. (although tbh, 100 us is very 
small).

- The other concern - by dropping all this aging logic, are we changing 
   underlying behavior?

- Is there a significant ROI on current patch vs. dropping all the aging 
   logic?

>
>Separately, there seems to be another related bug in the code, I have sent
>a patch for that here:
>
>https://patchwork.freedesktop.org/series/118151/

That's a valid new issue and different from this one, but related to the 
rewind logic. lgtm.

Thanks,
Umesh
>
>Thanks.
>--
>Ashutosh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
  2023-05-22 22:08   ` Umesh Nerlige Ramappa
@ 2023-05-23 18:20     ` Dixit, Ashutosh
  2023-05-23 22:22       ` Umesh Nerlige Ramappa
  0 siblings, 1 reply; 9+ messages in thread
From: Dixit, Ashutosh @ 2023-05-23 18:20 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: Lionel G Landwerlin, intel-gfx

On Mon, 22 May 2023 15:08:42 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> On Mon, May 22, 2023 at 01:20:12PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 19 May 2023 15:56:42 -0700, Umesh Nerlige Ramappa wrote:
> >>
> >> On DG2, capturing OA reports while running heavy render workloads
> >> sometimes results in invalid OA reports where 64-byte chunks inside
> >> reports have stale values. Under memory pressure, high OA sampling rates
> >> (13.3 us) and heavy render workload, occassionally, the OA HW TAIL
> >> pointer does not progress as fast as the sampling rate. When these
> >> glitches occur, the TAIL pointer takes approx. 200us to progress.  While
> >> this is expected behavior from the HW perspective, invalid reports are
> >> not expected.
> >>
> >> In oa_buffer_check_unlocked(), when we execute the if condition, we are
> >> updating the oa_buffer.tail to the aging tail and then setting pollin
> >> based on this tail value, however, we do not have a chance to rewind and
> >> validate the reports prior to setting pollin. The validation happens
> >> in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
> >> before this validation, then we end up reading reports up until this
> >> oa_buffer.tail value which includes invalid reports. Though found on
> >> DG2, this affects all platforms.
> >>
> >> Set the pollin only in the else condition in oa_buffer_check_unlocked.
> >>
> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> >> index 19d5652300ee..61536e3c4ac9 100644
> >> --- a/drivers/gpu/drm/i915/i915_perf.c
> >> +++ b/drivers/gpu/drm/i915/i915_perf.c
> >> @@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
> >>	int report_size = stream->oa_buffer.format->size;
> >>	unsigned long flags;
> >> -	bool pollin;
> >> +	bool pollin = false;
> >>	u32 hw_tail;
> >>	u64 now;
> >>	u32 partial_report_size;
> >> @@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >>		stream->oa_buffer.tail = gtt_offset + tail;
> >>		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
> >>		stream->oa_buffer.aging_timestamp = now;
> >> -	}
> >>
> >> -	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> >> -			  stream->oa_buffer.head - gtt_offset) >= report_size;
> >> +		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> >> +				  stream->oa_buffer.head - gtt_offset) >= report_size;
> >> +	}
> >
> > The issue has been correctly identified above. But seems that the real
> > cause for the issue is not that pollin statement above is misplaced but
> > that updating the tail via aging is unreliable (at least with the present
> > timeout as you mention above). Also, it is not clear why we have tail aging
> > at all, since it seems we can detect when reports land (by checking
> > report_id and timestamp). So rather than move the pollin into the else, we
> > should just eliminate the if () part. And if we are eliminating the if ()
> > we can just eliminate the concept of tail aging from the code (and
> > comments) and rely solely on explicit detection of reports landing.

I missed this yesterday but the above patch is basically incorrect. We need
to return pollin true when we have a "non-zero distance between head and
tail", i.e. when there is data to be read. And we have violated this for
the if () part with this patch (because we are unconditionally returning
false from the if () even when there is data to be read). So there are only
two ways to solve this:

a. Increase OA_TAIL_MARGIN_NSEC (the aging time)
b. Eliminate tail aging (i.e. eliminate the if ())

We cannot move the pollin statement into the else.

The preferred way is b. since it makes the overall code consistent
again. And it seems easy enough to do.

> I thought so too, it would be much simpler code. Looks like Lionel agrees
> with removing this code as well.
> I do have a couple concerns though.
>
> - In the blocking case, i915_perf_read() path waits on a queue with the
> condition being oa_buffer_check_unlocked(). If sampling rate is high,
> oa_buffer_check_unlocked will almost always return true. If we remove the
> if block, we may run the rewind logic too often to detect reports that
> landed. The aging logic is just giving a 100us buffer to avoid repeated
> checks here if tail hasn't moved. (although tbh, 100 us is very small).

I am pretty sure if we eliminate tail aging, we would fairly easily be able
to solve the problem of rewind logic running too often (e.g. put an 'if
(hw_tail != oa_buffer.tail) around the rewind logic etc).

> - The other concern - by dropping all this aging logic, are we changing
> underlying behavior?

I don't think eliminating tail aging makes a significant change to
underlying behavior from what we have today (and I doubt we worried about
changing underlying behavior when we implemented explicit detection of
reports landing in d1df41eb72ef).

>
> - Is there a significant ROI on current patch vs. dropping all the aging
> logic?

Yes, the biggest ROI for me is to have the code make sense again (the code
is incomprehensible the moment you ask "why do we the concept of tail aging
when we can explicitly detect reports landing?"). And anyway as I mentioned
above the current patch is just incorrect so we need a different solution
anyway.

Thanks.
--
Ashutosh

> > Separately, there seems to be another related bug in the code, I have sent
> > a patch for that here:
> >
> > https://patchwork.freedesktop.org/series/118151/
>
> That's a valid new issue and different from this one, but related to the
> rewind logic. lgtm.
>
> Thanks,
> Umesh
> >
> > Thanks.
> > --
> > Ashutosh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land
  2023-05-23 18:20     ` Dixit, Ashutosh
@ 2023-05-23 22:22       ` Umesh Nerlige Ramappa
  0 siblings, 0 replies; 9+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-05-23 22:22 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Lionel G Landwerlin, intel-gfx

On Tue, May 23, 2023 at 11:20:54AM -0700, Dixit, Ashutosh wrote:
>On Mon, 22 May 2023 15:08:42 -0700, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> On Mon, May 22, 2023 at 01:20:12PM -0700, Dixit, Ashutosh wrote:
>> > On Fri, 19 May 2023 15:56:42 -0700, Umesh Nerlige Ramappa wrote:
>> >>
>> >> On DG2, capturing OA reports while running heavy render workloads
>> >> sometimes results in invalid OA reports where 64-byte chunks inside
>> >> reports have stale values. Under memory pressure, high OA sampling rates
>> >> (13.3 us) and heavy render workload, occassionally, the OA HW TAIL
>> >> pointer does not progress as fast as the sampling rate. When these
>> >> glitches occur, the TAIL pointer takes approx. 200us to progress.  While
>> >> this is expected behavior from the HW perspective, invalid reports are
>> >> not expected.
>> >>
>> >> In oa_buffer_check_unlocked(), when we execute the if condition, we are
>> >> updating the oa_buffer.tail to the aging tail and then setting pollin
>> >> based on this tail value, however, we do not have a chance to rewind and
>> >> validate the reports prior to setting pollin. The validation happens
>> >> in a subsequent call to oa_buffer_check_unlocked(). If a read occurs
>> >> before this validation, then we end up reading reports up until this
>> >> oa_buffer.tail value which includes invalid reports. Though found on
>> >> DG2, this affects all platforms.
>> >>
>> >> Set the pollin only in the else condition in oa_buffer_check_unlocked.
>> >>
>> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7484
>> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7757
>> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_perf.c | 8 ++++----
>> >>  1 file changed, 4 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> >> index 19d5652300ee..61536e3c4ac9 100644
>> >> --- a/drivers/gpu/drm/i915/i915_perf.c
>> >> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> >> @@ -545,7 +545,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>> >>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>> >>	int report_size = stream->oa_buffer.format->size;
>> >>	unsigned long flags;
>> >> -	bool pollin;
>> >> +	bool pollin = false;
>> >>	u32 hw_tail;
>> >>	u64 now;
>> >>	u32 partial_report_size;
>> >> @@ -620,10 +620,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>> >>		stream->oa_buffer.tail = gtt_offset + tail;
>> >>		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
>> >>		stream->oa_buffer.aging_timestamp = now;
>> >> -	}
>> >>
>> >> -	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>> >> -			  stream->oa_buffer.head - gtt_offset) >= report_size;
>> >> +		pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>> >> +				  stream->oa_buffer.head - gtt_offset) >= report_size;
>> >> +	}
>> >
>> > The issue has been correctly identified above. But seems that the real
>> > cause for the issue is not that pollin statement above is misplaced but
>> > that updating the tail via aging is unreliable (at least with the present
>> > timeout as you mention above). Also, it is not clear why we have tail aging
>> > at all, since it seems we can detect when reports land (by checking
>> > report_id and timestamp). So rather than move the pollin into the else, we
>> > should just eliminate the if () part. And if we are eliminating the if ()
>> > we can just eliminate the concept of tail aging from the code (and
>> > comments) and rely solely on explicit detection of reports landing.
>
>I missed this yesterday but the above patch is basically incorrect. We need
>to return pollin true when we have a "non-zero distance between head and
>tail", i.e. when there is data to be read. And we have violated this for
>the if () part with this patch (because we are unconditionally returning
>false from the if () even when there is data to be read). So there are only
>two ways to solve this:

Yikes, didn't see that. Ideally if the tail did progress the first time 
we entered this function, then let's say pollin is true since we find 
some valid reports. If the tail hasn't moved when the function is 
entered second time, then we return false (which is wrong) since there 
may still be data to be read.

>
>a. Increase OA_TAIL_MARGIN_NSEC (the aging time)
>b. Eliminate tail aging (i.e. eliminate the if ())
>
>We cannot move the pollin statement into the else.
>
>The preferred way is b. since it makes the overall code consistent
>again. And it seems easy enough to do.
>
>> I thought so too, it would be much simpler code. Looks like Lionel agrees
>> with removing this code as well.
>> I do have a couple concerns though.
>>
>> - In the blocking case, i915_perf_read() path waits on a queue with the
>> condition being oa_buffer_check_unlocked(). If sampling rate is high,
>> oa_buffer_check_unlocked will almost always return true. If we remove the
>> if block, we may run the rewind logic too often to detect reports that
>> landed. The aging logic is just giving a 100us buffer to avoid repeated
>> checks here if tail hasn't moved. (although tbh, 100 us is very small).
>
>I am pretty sure if we eliminate tail aging, we would fairly easily be able
>to solve the problem of rewind logic running too often (e.g. put an 'if
>(hw_tail != oa_buffer.tail) around the rewind logic etc).
>
>> - The other concern - by dropping all this aging logic, are we changing
>> underlying behavior?
>
>I don't think eliminating tail aging makes a significant change to
>underlying behavior from what we have today (and I doubt we worried about
>changing underlying behavior when we implemented explicit detection of
>reports landing in d1df41eb72ef).
>
>>
>> - Is there a significant ROI on current patch vs. dropping all the aging
>> logic?
>
>Yes, the biggest ROI for me is to have the code make sense again (the code
>is incomprehensible the moment you ask "why do we the concept of tail aging
>when we can explicitly detect reports landing?"). And anyway as I mentioned
>above the current patch is just incorrect so we need a different solution
>anyway.

okay, I agree more since this patch is still buggy :). Maybe it's time 
to rip out this logic. Will look into it.

Thanks,
Umesh
>
>Thanks.
>--
>Ashutosh
>
>> > Separately, there seems to be another related bug in the code, I have sent
>> > a patch for that here:
>> >
>> > https://patchwork.freedesktop.org/series/118151/
>>
>> That's a valid new issue and different from this one, but related to the
>> rewind logic. lgtm.
>>
>> Thanks,
>> Umesh
>> >
>> > Thanks.
>> > --
>> > Ashutosh

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-05-23 22:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-19 22:56 [Intel-gfx] [PATCH] i915/perf: Avoid reading OA reports before they land Umesh Nerlige Ramappa
2023-05-19 23:58 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2023-05-20  7:28 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-05-20 12:05 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2023-05-20 17:27 ` [Intel-gfx] [PATCH] " Lionel Landwerlin
2023-05-22 20:20 ` Dixit, Ashutosh
2023-05-22 22:08   ` Umesh Nerlige Ramappa
2023-05-23 18:20     ` Dixit, Ashutosh
2023-05-23 22:22       ` Umesh Nerlige Ramappa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.