intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
@ 2019-09-03  6:21 Chris Wilson
  2019-09-03  7:08 ` ✓ Fi.CI.BAT: success for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Chris Wilson @ 2019-09-03  6:21 UTC (permalink / raw)
  To: intel-gfx

If we make sure we grab a strong reference to each object as we dump it,
we can reduce the locks outside of our iterators to an rcu_read_lock.

This should prevent errors like:
[ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
[ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293

[ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
[ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
[ 2138.371959] Call Trace:
[ 2138.371974]  dump_stack+0x7c/0xbb
[ 2138.372099]  ? per_file_stats+0x43/0x380 [i915]
[ 2138.372108]  print_address_description+0x73/0x3a0
[ 2138.372231]  ? per_file_stats+0x43/0x380 [i915]
[ 2138.372352]  ? per_file_stats+0x43/0x380 [i915]
[ 2138.372362]  __kasan_report+0x14e/0x192
[ 2138.372489]  ? per_file_stats+0x43/0x380 [i915]
[ 2138.372502]  kasan_report+0xe/0x20
[ 2138.372625]  per_file_stats+0x43/0x380 [i915]
[ 2138.372751]  ? i915_panel_show+0x110/0x110 [i915]
[ 2138.372761]  idr_for_each+0xa7/0x160
[ 2138.372773]  ? idr_get_next_ul+0x110/0x110
[ 2138.372782]  ? do_raw_spin_lock+0x10a/0x1d0
[ 2138.372923]  print_context_stats+0x264/0x510 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9798f27a697a..708855e051b5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -237,6 +237,9 @@ static int per_file_stats(int id, void *ptr, void *data)
 	struct file_stats *stats = data;
 	struct i915_vma *vma;
 
+	if (!kref_get_unless_zero(&obj->base.refcount))
+		return 0;
+
 	stats->count++;
 	stats->total += obj->base.size;
 	if (!atomic_read(&obj->bind_count))
@@ -284,6 +287,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 	}
 	spin_unlock(&obj->vma.lock);
 
+	i915_gem_object_put(obj);
 	return 0;
 }
 
@@ -313,10 +317,12 @@ static void print_context_stats(struct seq_file *m,
 				    i915_gem_context_lock_engines(ctx), it) {
 			intel_context_lock_pinned(ce);
 			if (intel_context_is_pinned(ce)) {
+				rcu_read_lock();
 				if (ce->state)
 					per_file_stats(0,
 						       ce->state->obj, &kstats);
 				per_file_stats(0, ce->ring->vma->obj, &kstats);
+				rcu_read_unlock();
 			}
 			intel_context_unlock_pinned(ce);
 		}
@@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
 			struct task_struct *task;
 			char name[80];
 
-			spin_lock(&file->table_lock);
+			rcu_read_lock();
 			idr_for_each(&file->object_idr, per_file_stats, &stats);
-			spin_unlock(&file->table_lock);
+			rcu_read_unlock();
 
 			rcu_read_lock();
 			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Protect debugfs per_file_stats with RCU lock
  2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
@ 2019-09-03  7:08 ` Patchwork
  2019-09-03  9:08 ` ✗ Fi.CI.IGT: failure " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2019-09-03  7:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Protect debugfs per_file_stats with RCU lock
URL   : https://patchwork.freedesktop.org/series/66149/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6823 -> Patchwork_14263
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/

Known issues
------------

  Here are the changes found in Patchwork_14263 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_execlists:
    - fi-skl-gvtdvm:      [PASS][1] -> [DMESG-FAIL][2] ([fdo#111108])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/fi-skl-gvtdvm/igt@i915_selftest@live_execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/fi-skl-gvtdvm/igt@i915_selftest@live_execlists.html

  
#### Possible fixes ####

  * igt@gem_busy@busy-all:
    - fi-icl-u3:          [DMESG-WARN][3] ([fdo#107724]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/fi-icl-u3/igt@gem_busy@busy-all.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/fi-icl-u3/igt@gem_busy@busy-all.html

  * igt@i915_selftest@live_workarounds:
    - fi-bsw-kefka:       [DMESG-WARN][5] ([fdo#111373]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/fi-bsw-kefka/igt@i915_selftest@live_workarounds.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/fi-bsw-kefka/igt@i915_selftest@live_workarounds.html

  
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#111108]: https://bugs.freedesktop.org/show_bug.cgi?id=111108
  [fdo#111373]: https://bugs.freedesktop.org/show_bug.cgi?id=111373


Participating hosts (52 -> 46)
------------------------------

  Additional (1): fi-byt-j1900 
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6823 -> Patchwork_14263

  CI-20190529: 20190529
  CI_DRM_6823: 8f0a4755577623776cfb9e1a52eb8e873562bce5 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5163: 9bd37d5e307748a5b43e92a5b0ea9be8fc60b339 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14263: e36bcda69c2699dae8ac9e0701b546cb0b3f5625 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e36bcda69c26 drm/i915: Protect debugfs per_file_stats with RCU lock

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* ✗ Fi.CI.IGT: failure for drm/i915: Protect debugfs per_file_stats with RCU lock
  2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
  2019-09-03  7:08 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2019-09-03  9:08 ` Patchwork
  2019-09-04 12:32 ` [PATCH] " David Weinehall
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2019-09-03  9:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Protect debugfs per_file_stats with RCU lock
URL   : https://patchwork.freedesktop.org/series/66149/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6823_full -> Patchwork_14263_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_14263_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14263_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_14263_full:

### Piglit changes ###

#### Possible regressions ####

  * spec@ext_texture_compression_s3tc@fbo-generatemipmap-formats (NEW):
    - pig-hsw-4770r:      NOTRUN -> [CRASH][1] +3 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/pig-hsw-4770r/spec@ext_texture_compression_s3tc@fbo-generatemipmap-formats.html

  
New tests
---------

  New tests have been introduced between CI_DRM_6823_full and Patchwork_14263_full:

### New Piglit tests (4) ###

  * spec@ext_texture_compression_s3tc@compressedteximage gl_compressed_rgba_s3tc_dxt5_ext:
    - Statuses : 1 crash(s)
    - Exec time: [0.08] s

  * spec@ext_texture_compression_s3tc@compressedteximage gl_compressed_srgb_alpha_s3tc_dxt3_ext:
    - Statuses : 1 crash(s)
    - Exec time: [0.08] s

  * spec@ext_texture_compression_s3tc@compressedteximage gl_compressed_srgb_alpha_s3tc_dxt5_ext:
    - Statuses : 1 crash(s)
    - Exec time: [0.08] s

  * spec@ext_texture_compression_s3tc@fbo-generatemipmap-formats:
    - Statuses : 1 crash(s)
    - Exec time: [0.14] s

  

Known issues
------------

  Here are the changes found in Patchwork_14263_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [PASS][2] -> [SKIP][3] ([fdo#110841])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb7/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb1/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_eio@in-flight-internal-10ms:
    - shard-skl:          [PASS][4] -> [DMESG-WARN][5] ([fdo#106107])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl9/igt@gem_eio@in-flight-internal-10ms.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl2/igt@gem_eio@in-flight-internal-10ms.html

  * igt@gem_eio@reset-stress:
    - shard-glk:          [PASS][6] -> [FAIL][7] ([fdo#109661])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-glk2/igt@gem_eio@reset-stress.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-glk5/igt@gem_eio@reset-stress.html
    - shard-snb:          [PASS][8] -> [FAIL][9] ([fdo#109661])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-snb7/igt@gem_eio@reset-stress.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-snb5/igt@gem_eio@reset-stress.html

  * igt@gem_exec_balancer@smoke:
    - shard-iclb:         [PASS][10] -> [SKIP][11] ([fdo#110854])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb2/igt@gem_exec_balancer@smoke.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb3/igt@gem_exec_balancer@smoke.html

  * igt@gem_exec_schedule@preempt-other-chain-bsd:
    - shard-iclb:         [PASS][12] -> [SKIP][13] ([fdo#111325]) +7 similar issues
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb5/igt@gem_exec_schedule@preempt-other-chain-bsd.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@gem_exec_schedule@preempt-other-chain-bsd.html

  * igt@gem_exec_schedule@preempt-queue-contexts-chain-render:
    - shard-glk:          [PASS][14] -> [INCOMPLETE][15] ([fdo#103359] / [k.org#198133])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-glk6/igt@gem_exec_schedule@preempt-queue-contexts-chain-render.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-glk3/igt@gem_exec_schedule@preempt-queue-contexts-chain-render.html

  * igt@kms_cursor_legacy@pipe-c-single-bo:
    - shard-apl:          [PASS][16] -> [INCOMPLETE][17] ([fdo#103927])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-apl4/igt@kms_cursor_legacy@pipe-c-single-bo.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-apl4/igt@kms_cursor_legacy@pipe-c-single-bo.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled:
    - shard-skl:          [PASS][18] -> [FAIL][19] ([fdo#103184] / [fdo#103232])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl10/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl1/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-skl:          [PASS][20] -> [FAIL][21] ([fdo#105363])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl2/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl3/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-skl:          [PASS][22] -> [INCOMPLETE][23] ([fdo#109507])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl8/igt@kms_flip@flip-vs-suspend.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl6/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw:
    - shard-iclb:         [PASS][24] -> [FAIL][25] ([fdo#103167]) +6 similar issues
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw.html

  * igt@kms_frontbuffer_tracking@fbc-tilingchange:
    - shard-skl:          [PASS][26] -> [FAIL][27] ([fdo#103167])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl3/igt@kms_frontbuffer_tracking@fbc-tilingchange.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl1/igt@kms_frontbuffer_tracking@fbc-tilingchange.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
    - shard-apl:          [PASS][28] -> [DMESG-WARN][29] ([fdo#108566]) +4 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-apl5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-apl2/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [PASS][30] -> [FAIL][31] ([fdo#108145])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl3/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-iclb:         [PASS][32] -> [FAIL][33] ([fdo#103166])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb1/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_psr@psr2_dpms:
    - shard-iclb:         [PASS][34] -> [SKIP][35] ([fdo#109441]) +1 similar issue
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb2/igt@kms_psr@psr2_dpms.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb7/igt@kms_psr@psr2_dpms.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:         [PASS][36] -> [SKIP][37] ([fdo#109276]) +27 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb2/igt@prime_busy@hang-bsd2.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb6/igt@prime_busy@hang-bsd2.html

  
#### Possible fixes ####

  * igt@gem_eio@unwedge-stress:
    - shard-snb:          [FAIL][38] ([fdo#109661]) -> [PASS][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-snb2/igt@gem_eio@unwedge-stress.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-snb7/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_schedule@independent-bsd2:
    - shard-iclb:         [SKIP][40] ([fdo#109276]) -> [PASS][41] +22 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb7/igt@gem_exec_schedule@independent-bsd2.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb1/igt@gem_exec_schedule@independent-bsd2.html

  * igt@gem_exec_schedule@reorder-wide-bsd:
    - shard-iclb:         [SKIP][42] ([fdo#111325]) -> [PASS][43] +9 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb2/igt@gem_exec_schedule@reorder-wide-bsd.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb3/igt@gem_exec_schedule@reorder-wide-bsd.html

  * igt@i915_pm_backlight@fade_with_suspend:
    - shard-skl:          [INCOMPLETE][44] ([fdo#104108]) -> [PASS][45]
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl7/igt@i915_pm_backlight@fade_with_suspend.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl6/igt@i915_pm_backlight@fade_with_suspend.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-skl:          [INCOMPLETE][46] ([fdo#109507]) -> [PASS][47]
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl9/igt@kms_flip@flip-vs-suspend-interruptible.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl2/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-apl:          [DMESG-WARN][48] ([fdo#108566]) -> [PASS][49] +4 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-apl2/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-apl5/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-iclb:         [FAIL][50] ([fdo#103167]) -> [PASS][51] +5 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb4/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
    - shard-skl:          [FAIL][52] ([fdo#108145]) -> [PASS][53]
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-skl5/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [FAIL][54] ([fdo#103166]) -> [PASS][55]
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb1/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [SKIP][56] ([fdo#109642] / [fdo#111068]) -> [PASS][57]
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb3/igt@kms_psr2_su@page_flip.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][58] ([fdo#109441]) -> [PASS][59] +2 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb3/igt@kms_psr@psr2_no_drrs.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  
#### Warnings ####

  * igt@gem_mocs_settings@mocs-isolation-bsd2:
    - shard-iclb:         [SKIP][60] ([fdo#109276]) -> [FAIL][61] ([fdo#111330])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb3/igt@gem_mocs_settings@mocs-isolation-bsd2.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb2/igt@gem_mocs_settings@mocs-isolation-bsd2.html

  * igt@gem_mocs_settings@mocs-reset-bsd2:
    - shard-iclb:         [FAIL][62] ([fdo#111330]) -> [SKIP][63] ([fdo#109276]) +1 similar issue
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6823/shard-iclb1/igt@gem_mocs_settings@mocs-reset-bsd2.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/shard-iclb6/igt@gem_mocs_settings@mocs-reset-bsd2.html

  
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#106107]: https://bugs.freedesktop.org/show_bug.cgi?id=106107
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109507]: https://bugs.freedesktop.org/show_bug.cgi?id=109507
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#109661]: https://bugs.freedesktop.org/show_bug.cgi?id=109661
  [fdo#110841]: https://bugs.freedesktop.org/show_bug.cgi?id=110841
  [fdo#110854]: https://bugs.freedesktop.org/show_bug.cgi?id=110854
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111325]: https://bugs.freedesktop.org/show_bug.cgi?id=111325
  [fdo#111330]: https://bugs.freedesktop.org/show_bug.cgi?id=111330
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6823 -> Patchwork_14263

  CI-20190529: 20190529
  CI_DRM_6823: 8f0a4755577623776cfb9e1a52eb8e873562bce5 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5163: 9bd37d5e307748a5b43e92a5b0ea9be8fc60b339 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14263: e36bcda69c2699dae8ac9e0701b546cb0b3f5625 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14263/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
  2019-09-03  7:08 ` ✓ Fi.CI.BAT: success for " Patchwork
  2019-09-03  9:08 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2019-09-04 12:32 ` David Weinehall
  2019-09-04 13:19 ` Mika Kuoppala
  2020-06-30  5:39 ` [Intel-gfx] " Guenter Roeck
  4 siblings, 0 replies; 10+ messages in thread
From: David Weinehall @ 2019-09-04 12:32 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, Sep 03, 2019 at 07:21:33AM +0100, Chris Wilson wrote:
> If we make sure we grab a strong reference to each object as we dump it,
> we can reduce the locks outside of our iterators to an rcu_read_lock.
> 
> This should prevent errors like:
> [ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
> [ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
> 
> [ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
> [ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
> [ 2138.371959] Call Trace:
> [ 2138.371974]  dump_stack+0x7c/0xbb
> [ 2138.372099]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372108]  print_address_description+0x73/0x3a0
> [ 2138.372231]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372352]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372362]  __kasan_report+0x14e/0x192
> [ 2138.372489]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372502]  kasan_report+0xe/0x20
> [ 2138.372625]  per_file_stats+0x43/0x380 [i915]
> [ 2138.372751]  ? i915_panel_show+0x110/0x110 [i915]
> [ 2138.372761]  idr_for_each+0xa7/0x160
> [ 2138.372773]  ? idr_get_next_ul+0x110/0x110
> [ 2138.372782]  ? do_raw_spin_lock+0x10a/0x1d0
> [ 2138.372923]  print_context_stats+0x264/0x510 [i915]
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Tested-by: David Weinehall <david.weinehall@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 9798f27a697a..708855e051b5 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -237,6 +237,9 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	struct file_stats *stats = data;
>  	struct i915_vma *vma;
>  
> +	if (!kref_get_unless_zero(&obj->base.refcount))
> +		return 0;
> +
>  	stats->count++;
>  	stats->total += obj->base.size;
>  	if (!atomic_read(&obj->bind_count))
> @@ -284,6 +287,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	}
>  	spin_unlock(&obj->vma.lock);
>  
> +	i915_gem_object_put(obj);
>  	return 0;
>  }
>  
> @@ -313,10 +317,12 @@ static void print_context_stats(struct seq_file *m,
>  				    i915_gem_context_lock_engines(ctx), it) {
>  			intel_context_lock_pinned(ce);
>  			if (intel_context_is_pinned(ce)) {
> +				rcu_read_lock();
>  				if (ce->state)
>  					per_file_stats(0,
>  						       ce->state->obj, &kstats);
>  				per_file_stats(0, ce->ring->vma->obj, &kstats);
> +				rcu_read_unlock();
>  			}
>  			intel_context_unlock_pinned(ce);
>  		}
> @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
>  			struct task_struct *task;
>  			char name[80];
>  
> -			spin_lock(&file->table_lock);
> +			rcu_read_lock();
>  			idr_for_each(&file->object_idr, per_file_stats, &stats);
> -			spin_unlock(&file->table_lock);
> +			rcu_read_unlock();
>  
>  			rcu_read_lock();
>  			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
> -- 
> 2.23.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
                   ` (2 preceding siblings ...)
  2019-09-04 12:32 ` [PATCH] " David Weinehall
@ 2019-09-04 13:19 ` Mika Kuoppala
  2020-06-30  5:39 ` [Intel-gfx] " Guenter Roeck
  4 siblings, 0 replies; 10+ messages in thread
From: Mika Kuoppala @ 2019-09-04 13:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If we make sure we grab a strong reference to each object as we dump it,
> we can reduce the locks outside of our iterators to an rcu_read_lock.
>
> This should prevent errors like:
> [ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
> [ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
>
> [ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
> [ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
> [ 2138.371959] Call Trace:
> [ 2138.371974]  dump_stack+0x7c/0xbb
> [ 2138.372099]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372108]  print_address_description+0x73/0x3a0
> [ 2138.372231]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372352]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372362]  __kasan_report+0x14e/0x192
> [ 2138.372489]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372502]  kasan_report+0xe/0x20
> [ 2138.372625]  per_file_stats+0x43/0x380 [i915]
> [ 2138.372751]  ? i915_panel_show+0x110/0x110 [i915]
> [ 2138.372761]  idr_for_each+0xa7/0x160
> [ 2138.372773]  ? idr_get_next_ul+0x110/0x110
> [ 2138.372782]  ? do_raw_spin_lock+0x10a/0x1d0
> [ 2138.372923]  print_context_stats+0x264/0x510 [i915]
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 9798f27a697a..708855e051b5 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -237,6 +237,9 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	struct file_stats *stats = data;
>  	struct i915_vma *vma;
>  
> +	if (!kref_get_unless_zero(&obj->base.refcount))
> +		return 0;
> +

Not a single callsite using this. Eyesore is the
getting through this and putting thru nice object wrapper.

But for the content, seems to do what it claims.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


>  	stats->count++;
>  	stats->total += obj->base.size;
>  	if (!atomic_read(&obj->bind_count))
> @@ -284,6 +287,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	}
>  	spin_unlock(&obj->vma.lock);
>  
> +	i915_gem_object_put(obj);
>  	return 0;
>  }
>  
> @@ -313,10 +317,12 @@ static void print_context_stats(struct seq_file *m,
>  				    i915_gem_context_lock_engines(ctx), it) {
>  			intel_context_lock_pinned(ce);
>  			if (intel_context_is_pinned(ce)) {
> +				rcu_read_lock();
>  				if (ce->state)
>  					per_file_stats(0,
>  						       ce->state->obj, &kstats);
>  				per_file_stats(0, ce->ring->vma->obj, &kstats);
> +				rcu_read_unlock();
>  			}
>  			intel_context_unlock_pinned(ce);
>  		}
> @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
>  			struct task_struct *task;
>  			char name[80];
>  
> -			spin_lock(&file->table_lock);
> +			rcu_read_lock();
>  			idr_for_each(&file->object_idr, per_file_stats, &stats);
> -			spin_unlock(&file->table_lock);
> +			rcu_read_unlock();
>  
>  			rcu_read_lock();
>  			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
> -- 
> 2.23.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
                   ` (3 preceding siblings ...)
  2019-09-04 13:19 ` Mika Kuoppala
@ 2020-06-30  5:39 ` Guenter Roeck
  2020-06-30  9:14   ` Chris Wilson
  4 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2020-06-30  5:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, bgeffon, levinale

On Tue, Sep 03, 2019 at 07:21:33AM +0100, Chris Wilson wrote:
> If we make sure we grab a strong reference to each object as we dump it,
> we can reduce the locks outside of our iterators to an rcu_read_lock.
> 
> This should prevent errors like:
> [ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
> [ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
> 
> [ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
> [ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
> [ 2138.371959] Call Trace:
> [ 2138.371974]  dump_stack+0x7c/0xbb
> [ 2138.372099]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372108]  print_address_description+0x73/0x3a0
> [ 2138.372231]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372352]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372362]  __kasan_report+0x14e/0x192
> [ 2138.372489]  ? per_file_stats+0x43/0x380 [i915]
> [ 2138.372502]  kasan_report+0xe/0x20
> [ 2138.372625]  per_file_stats+0x43/0x380 [i915]
> [ 2138.372751]  ? i915_panel_show+0x110/0x110 [i915]
> [ 2138.372761]  idr_for_each+0xa7/0x160
> [ 2138.372773]  ? idr_get_next_ul+0x110/0x110
> [ 2138.372782]  ? do_raw_spin_lock+0x10a/0x1d0
> [ 2138.372923]  print_context_stats+0x264/0x510 [i915]
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Tested-by: David Weinehall <david.weinehall@linux.intel.com>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 9798f27a697a..708855e051b5 100644
[ ... ]
>  		}
> @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
>  			struct task_struct *task;
>  			char name[80];
>  
> -			spin_lock(&file->table_lock);
> +			rcu_read_lock();
>  			idr_for_each(&file->object_idr, per_file_stats, &stats);
> -			spin_unlock(&file->table_lock);
> +			rcu_read_unlock();
>  
For my education - is it indeed possible and valid to replace spin_lock()
with rcu_read_lock() to prevent list manipulation for a list used by
idr_for_each(), even if that list is otherwise manipulated under the
spinlock ?

Background: we are seeing a crash with the following call trace.

[ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 1016.651693] Call Trace:
[ 1016.651703]  idr_for_each+0x8a/0xe8
[ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
[ 1016.651720]  seq_read+0x162/0x3ca
[ 1016.651727]  full_proxy_read+0x5b/0x8d
[ 1016.651733]  __vfs_read+0x45/0x1bb
[ 1016.651741]  vfs_read+0xc9/0x15e
[ 1016.651746]  ksys_read+0x7e/0xde
[ 1016.651752]  do_syscall_64+0x54/0x68
[ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

I have not tried to track down what exactly is NULL in this case.
Before spending more time on it, I'd like to understand the above
change a little better.

Thanks,
Guenter
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2020-06-30  5:39 ` [Intel-gfx] " Guenter Roeck
@ 2020-06-30  9:14   ` Chris Wilson
  2020-06-30 15:01     ` Guenter Roeck
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2020-06-30  9:14 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: intel-gfx, bgeffon, levinale

Quoting Guenter Roeck (2020-06-30 06:39:36)
> On Tue, Sep 03, 2019 at 07:21:33AM +0100, Chris Wilson wrote:
> > If we make sure we grab a strong reference to each object as we dump it,
> > we can reduce the locks outside of our iterators to an rcu_read_lock.
> > 
> > This should prevent errors like:
> > [ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
> > [ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
> > 
> > [ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
> > [ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
> > [ 2138.371959] Call Trace:
> > [ 2138.371974]  dump_stack+0x7c/0xbb
> > [ 2138.372099]  ? per_file_stats+0x43/0x380 [i915]
> > [ 2138.372108]  print_address_description+0x73/0x3a0
> > [ 2138.372231]  ? per_file_stats+0x43/0x380 [i915]
> > [ 2138.372352]  ? per_file_stats+0x43/0x380 [i915]
> > [ 2138.372362]  __kasan_report+0x14e/0x192
> > [ 2138.372489]  ? per_file_stats+0x43/0x380 [i915]
> > [ 2138.372502]  kasan_report+0xe/0x20
> > [ 2138.372625]  per_file_stats+0x43/0x380 [i915]
> > [ 2138.372751]  ? i915_panel_show+0x110/0x110 [i915]
> > [ 2138.372761]  idr_for_each+0xa7/0x160
> > [ 2138.372773]  ? idr_get_next_ul+0x110/0x110
> > [ 2138.372782]  ? do_raw_spin_lock+0x10a/0x1d0
> > [ 2138.372923]  print_context_stats+0x264/0x510 [i915]
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Tested-by: David Weinehall <david.weinehall@linux.intel.com>
> > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 9798f27a697a..708855e051b5 100644
> [ ... ]
> >               }
> > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> >                       struct task_struct *task;
> >                       char name[80];
> >  
> > -                     spin_lock(&file->table_lock);
> > +                     rcu_read_lock();
> >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > -                     spin_unlock(&file->table_lock);
> > +                     rcu_read_unlock();
> >  
> For my education - is it indeed possible and valid to replace spin_lock()
> with rcu_read_lock() to prevent list manipulation for a list used by
> idr_for_each(), even if that list is otherwise manipulated under the
> spinlock ?

It's a pure read of a radixtree here, and is supposed to be RCU safe:

 * idr_for_each() can be called concurrently with idr_alloc() and
 * idr_remove() if protected by RCU.  Newly added entries may not be
 * seen and deleted entries may be seen, but adding and removing entries
 * will not cause other entries to be skipped, nor spurious ones to be seen.

That is the tree structure is stable.

> Background: we are seeing a crash with the following call trace.
> 
> [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> ...
> [ 1016.651693] Call Trace:
> [ 1016.651703]  idr_for_each+0x8a/0xe8
> [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> [ 1016.651720]  seq_read+0x162/0x3ca
> [ 1016.651727]  full_proxy_read+0x5b/0x8d
> [ 1016.651733]  __vfs_read+0x45/0x1bb
> [ 1016.651741]  vfs_read+0xc9/0x15e
> [ 1016.651746]  ksys_read+0x7e/0xde
> [ 1016.651752]  do_syscall_64+0x54/0x68
> [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Is there a reason you are using this slow debugfs in the first place?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2020-06-30  9:14   ` Chris Wilson
@ 2020-06-30 15:01     ` Guenter Roeck
  2020-06-30 15:08       ` Chris Wilson
  0 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2020-06-30 15:01 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, bgeffon, levinale

On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote:
[ ... ]
> > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> > >                       struct task_struct *task;
> > >                       char name[80];
> > >  
> > > -                     spin_lock(&file->table_lock);
> > > +                     rcu_read_lock();
> > >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > > -                     spin_unlock(&file->table_lock);
> > > +                     rcu_read_unlock();
> > >  
> > For my education - is it indeed possible and valid to replace spin_lock()
> > with rcu_read_lock() to prevent list manipulation for a list used by
> > idr_for_each(), even if that list is otherwise manipulated under the
> > spinlock ?
> 
> It's a pure read of a radixtree here, and is supposed to be RCU safe:
> 
>  * idr_for_each() can be called concurrently with idr_alloc() and
>  * idr_remove() if protected by RCU.  Newly added entries may not be
>  * seen and deleted entries may be seen, but adding and removing entries
>  * will not cause other entries to be skipped, nor spurious ones to be seen.
> 
> That is the tree structure is stable.
> 
Ah, that makes sense. Thanks for the clarification.

> > Background: we are seeing a crash with the following call trace.
> > 
> > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > ...
> > [ 1016.651693] Call Trace:
> > [ 1016.651703]  idr_for_each+0x8a/0xe8
> > [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> > [ 1016.651720]  seq_read+0x162/0x3ca
> > [ 1016.651727]  full_proxy_read+0x5b/0x8d
> > [ 1016.651733]  __vfs_read+0x45/0x1bb
> > [ 1016.651741]  vfs_read+0xc9/0x15e
> > [ 1016.651746]  ksys_read+0x7e/0xde
> > [ 1016.651752]  do_syscall_64+0x54/0x68
> > [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
Actually, the crash is not in idr_for_each, but in per_file_stats:

[ 1016.651637] RIP: 0010:per_file_stats+0xe/0x16e
[ 1016.651646] Code: d2 41 0f b6 8e 69 8c 00 00 48 89 df 48 c7 c6 7b 74 8c be 31 c0 e8 0c 89 cf ff eb d2 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 <8b> 06 85 c0 0f 84 4d 01 00 00 49 89 d6 48 89 f3 3d ff ff ff 7f 73
[ 1016.651651] RSP: 0018:ffffad3a01337ba0 EFLAGS: 00010293
[ 1016.651656] RAX: 0000000000000018 RBX: ffff96fe040d65e0 RCX: 0000000000000002
[ 1016.651660] RDX: ffffad3a01337c50 RSI: 0000000000000000 RDI: 00000000000001e8
[ 1016.651663] RBP: ffffad3a01337bb8 R08: 0000000000000000 R09: 00000000000001c0
[ 1016.651667] R10: 0000000000000000 R11: ffffffffbdbe5fce R12: 0000000000000000
[ 1016.651671] R13: ffffffffbdbe5fce R14: ffffad3a01337c50 R15: 0000000000000001
[ 1016.651676] FS:  00007a597e2d7480(0000) GS:ffff96ff3bb00000(0000) knlGS:0000000000000000
[ 1016.651680] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1016.651683] CR2: 0000000000000000 CR3: 0000000171fc2001 CR4: 00000000003606e0
[ 1016.651687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1016.651690] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1016.651693] Call Trace:

Sorry for the confusion. From the context it appears that the second parameter
of per_file_stats() may be NULL, though I am not entirely sure if I got that
correctly.

> Is there a reason you are using this slow debugfs in the first place?

AFAICS ChromeOS is using the information to calculate graphics memory use.

Guenter
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2020-06-30 15:01     ` Guenter Roeck
@ 2020-06-30 15:08       ` Chris Wilson
  2020-06-30 15:48         ` Guenter Roeck
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2020-06-30 15:08 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: intel-gfx, bgeffon, levinale

Quoting Guenter Roeck (2020-06-30 16:01:05)
> On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote:
> [ ... ]
> > > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> > > >                       struct task_struct *task;
> > > >                       char name[80];
> > > >  
> > > > -                     spin_lock(&file->table_lock);
> > > > +                     rcu_read_lock();
> > > >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > > > -                     spin_unlock(&file->table_lock);
> > > > +                     rcu_read_unlock();
> > > >  
> > > For my education - is it indeed possible and valid to replace spin_lock()
> > > with rcu_read_lock() to prevent list manipulation for a list used by
> > > idr_for_each(), even if that list is otherwise manipulated under the
> > > spinlock ?
> > 
> > It's a pure read of a radixtree here, and is supposed to be RCU safe:
> > 
> >  * idr_for_each() can be called concurrently with idr_alloc() and
> >  * idr_remove() if protected by RCU.  Newly added entries may not be
> >  * seen and deleted entries may be seen, but adding and removing entries
> >  * will not cause other entries to be skipped, nor spurious ones to be seen.
> > 
> > That is the tree structure is stable.
> > 
> Ah, that makes sense. Thanks for the clarification.
> 
> > > Background: we are seeing a crash with the following call trace.
> > > 
> > > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > ...
> > > [ 1016.651693] Call Trace:
> > > [ 1016.651703]  idr_for_each+0x8a/0xe8
> > > [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> > > [ 1016.651720]  seq_read+0x162/0x3ca
> > > [ 1016.651727]  full_proxy_read+0x5b/0x8d
> > > [ 1016.651733]  __vfs_read+0x45/0x1bb
> > > [ 1016.651741]  vfs_read+0xc9/0x15e
> > > [ 1016.651746]  ksys_read+0x7e/0xde
> > > [ 1016.651752]  do_syscall_64+0x54/0x68
> > > [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> Actually, the crash is not in idr_for_each, but in per_file_stats:

Ok, let's assume that the object is being closed as we read the idr. The
idr will temporarily hold an error pointer for the handle to indicate the
in-progress closure, so something like:

@@ -230,7 +230,7 @@ static int per_file_stats(int id, void *ptr, void *data)
        struct file_stats *stats = data;
        struct i915_vma *vma;

-       if (!kref_get_unless_zero(&obj->base.refcount))
+       if (IS_ERR_OR_NULL(obj) || !kref_get_unless_zero(&obj->base.refcount))
                return 0;

-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock
  2020-06-30 15:08       ` Chris Wilson
@ 2020-06-30 15:48         ` Guenter Roeck
  0 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2020-06-30 15:48 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, bgeffon, levinale

On Tue, Jun 30, 2020 at 04:08:00PM +0100, Chris Wilson wrote:
> Quoting Guenter Roeck (2020-06-30 16:01:05)
> > On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote:
> > [ ... ]
> > > > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> > > > >                       struct task_struct *task;
> > > > >                       char name[80];
> > > > >  
> > > > > -                     spin_lock(&file->table_lock);
> > > > > +                     rcu_read_lock();
> > > > >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > > > > -                     spin_unlock(&file->table_lock);
> > > > > +                     rcu_read_unlock();
> > > > >  
> > > > For my education - is it indeed possible and valid to replace spin_lock()
> > > > with rcu_read_lock() to prevent list manipulation for a list used by
> > > > idr_for_each(), even if that list is otherwise manipulated under the
> > > > spinlock ?
> > > 
> > > It's a pure read of a radixtree here, and is supposed to be RCU safe:
> > > 
> > >  * idr_for_each() can be called concurrently with idr_alloc() and
> > >  * idr_remove() if protected by RCU.  Newly added entries may not be
> > >  * seen and deleted entries may be seen, but adding and removing entries
> > >  * will not cause other entries to be skipped, nor spurious ones to be seen.
> > > 
> > > That is the tree structure is stable.
> > > 
> > Ah, that makes sense. Thanks for the clarification.
> > 
> > > > Background: we are seeing a crash with the following call trace.
> > > > 
> > > > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > ...
> > > > [ 1016.651693] Call Trace:
> > > > [ 1016.651703]  idr_for_each+0x8a/0xe8
> > > > [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> > > > [ 1016.651720]  seq_read+0x162/0x3ca
> > > > [ 1016.651727]  full_proxy_read+0x5b/0x8d
> > > > [ 1016.651733]  __vfs_read+0x45/0x1bb
> > > > [ 1016.651741]  vfs_read+0xc9/0x15e
> > > > [ 1016.651746]  ksys_read+0x7e/0xde
> > > > [ 1016.651752]  do_syscall_64+0x54/0x68
> > > > [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > 
> > Actually, the crash is not in idr_for_each, but in per_file_stats:
> 
> Ok, let's assume that the object is being closed as we read the idr. The
> idr will temporarily hold an error pointer for the handle to indicate the
> in-progress closure, so something like:
> 
> @@ -230,7 +230,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>         struct file_stats *stats = data;
>         struct i915_vma *vma;
> 
> -       if (!kref_get_unless_zero(&obj->base.refcount))
> +       if (IS_ERR_OR_NULL(obj) || !kref_get_unless_zero(&obj->base.refcount))
>                 return 0;
> 
Makes sense. Thanks a lot for the patch!

Guenter
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-06-30 15:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-03  6:21 [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock Chris Wilson
2019-09-03  7:08 ` ✓ Fi.CI.BAT: success for " Patchwork
2019-09-03  9:08 ` ✗ Fi.CI.IGT: failure " Patchwork
2019-09-04 12:32 ` [PATCH] " David Weinehall
2019-09-04 13:19 ` Mika Kuoppala
2020-06-30  5:39 ` [Intel-gfx] " Guenter Roeck
2020-06-30  9:14   ` Chris Wilson
2020-06-30 15:01     ` Guenter Roeck
2020-06-30 15:08       ` Chris Wilson
2020-06-30 15:48         ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).