All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt
@ 2019-07-18 14:54 Chris Wilson
  2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-18 14:54 UTC (permalink / raw)
  To: intel-gfx

Inside pread, we only ever read from the GTT so the serialising wmb()
instructions around the GGTT PTE updates are pointless.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a207b90924e4..fed0bc421a55 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -395,11 +395,9 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 		unsigned page_length = PAGE_SIZE - page_offset;
 		page_length = remain < page_length ? remain : page_length;
 		if (node.allocated) {
-			wmb();
 			ggtt->vm.insert_page(&ggtt->vm,
 					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
 					     node.start, I915_CACHE_NONE, 0);
-			wmb();
 		} else {
 			page_base += offset & PAGE_MASK;
 		}
@@ -419,7 +417,6 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 out_unpin:
 	mutex_lock(&i915->drm.struct_mutex);
 	if (node.allocated) {
-		wmb();
 		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
 		remove_mappable_node(&node);
 	} else {
-- 
2.22.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
@ 2019-07-18 14:54 ` Chris Wilson
  2019-07-18 18:28   ` Ville Syrjälä
  2019-07-19  0:45   ` Sasha Levin
  2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-18 14:54 UTC (permalink / raw)
  To: intel-gfx
  Cc: Chris Wilson, Joonas Lahtinen, Matthew Auld,
	Ville Syrjälä,
	stable

As recently disovered by forcing big-core (!llc) machines to use the GTT
paths, we need our full GTT write flush before manipulating the GTT PTE
or else the writes may be directed to the wrong page.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fed0bc421a55..c6ba350e6e4f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -610,7 +610,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 		unsigned int page_length = PAGE_SIZE - page_offset;
 		page_length = remain < page_length ? remain : page_length;
 		if (node.allocated) {
-			wmb(); /* flush the write before we modify the GGTT */
+			/* flush the write before we modify the GGTT */
+			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 			ggtt->vm.insert_page(&ggtt->vm,
 					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
 					     node.start, I915_CACHE_NONE, 0);
@@ -639,8 +640,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 	i915_gem_object_unlock_fence(obj, fence);
 out_unpin:
 	mutex_lock(&i915->drm.struct_mutex);
+	intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 	if (node.allocated) {
-		wmb();
 		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
 		remove_mappable_node(&node);
 	} else {
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
  2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
@ 2019-07-18 14:54 ` Chris Wilson
  2019-07-19 10:01   ` Joonas Lahtinen
                     ` (2 more replies)
  2019-07-18 14:54 ` [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level Chris Wilson
                   ` (4 subsequent siblings)
  6 siblings, 3 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-18 14:54 UTC (permalink / raw)
  To: intel-gfx

Since userspace has the ability to bypass the CPU cache from within its
unprivileged command stream, we have to flush the CPU cache to memory
in order to overwrite the previous contents on creation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stablevger.kernel.org
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index d2a1158868e7..f752b326d399 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
 {
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
-	unsigned int cache_level;
 	gfp_t mask;
 	int ret;
 
@@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-	if (HAS_LLC(i915))
-		/* On some devices, we can have the GPU use the LLC (the CPU
-		 * cache) for about a 10% performance improvement
-		 * compared to uncached.  Graphics requests other than
-		 * display scanout are coherent with the CPU in
-		 * accessing this cache.  This means in this mode we
-		 * don't need to clflush on the CPU side, and on the
-		 * GPU side we only need to flush internal caches to
-		 * get data visible to the CPU.
-		 *
-		 * However, we maintain the display planes as UC, and so
-		 * need to rebind when first used as such.
-		 */
-		cache_level = I915_CACHE_LLC;
-	else
-		cache_level = I915_CACHE_NONE;
-
-	i915_gem_object_set_cache_coherency(obj, cache_level);
+	/*
+	 * Note that userspace has control over cache-bypass
+	 * via its command stream, so even on LLC architectures
+	 * we have to flush out the CPU cache to memory to
+	 * clear previous contents.
+	 */
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
 
 	trace_i915_gem_object_create(obj);
 
-- 
2.22.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
  2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
  2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
@ 2019-07-18 14:54 ` Chris Wilson
  2019-07-19 10:03     ` Joonas Lahtinen
  2019-11-12  9:09     ` Daniel Vetter
  2019-07-18 15:35 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt Patchwork
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-18 14:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Joonas Lahtinen, stable

Ensure that we flush any cache dirt out to main memory before the user
changes the cache-level as they may elect to bypass the cache (even after
declaring their access cache-coherent) via use of unprivileged MOCS.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 2e3ce2a69653..5d41e769a428 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 	list_for_each_entry(vma, &obj->vma.list, obj_link)
 		vma->node.color = cache_level;
+
+	/* Flush any previous cache dirt in case of cache bypass */
+	if (obj->cache_dirty & ~obj->cache_coherent)
+		i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 	obj->cache_dirty = true; /* Always invalidate stale cachelines */
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
                   ` (2 preceding siblings ...)
  2019-07-18 14:54 ` [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level Chris Wilson
@ 2019-07-18 15:35 ` Patchwork
  2019-07-18 16:22 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2019-07-18 15:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
URL   : https://patchwork.freedesktop.org/series/63895/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
7a958734910b drm/i915: Drop wmb() inside pread_gtt
45e21a6a99db drm/i915: Use maximum write flush for pwrite_gtt
de5d49b49195 drm/i915: Flush all user surfaces prior to first use
-:12: ERROR:BAD_SIGN_OFF: Unrecognized email address: 'stablevger.kernel.org'
#12: 
Cc: stablevger.kernel.org

total: 1 errors, 0 warnings, 0 checks, 38 lines checked
7e8db5f4d4f9 drm/i915: Flush stale cachelines on set-cache-level

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
                   ` (3 preceding siblings ...)
  2019-07-18 15:35 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt Patchwork
@ 2019-07-18 16:22 ` Patchwork
  2019-07-18 17:29 ` ✓ Fi.CI.IGT: " Patchwork
  2019-07-18 18:23 ` [PATCH 1/4] " Ville Syrjälä
  6 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2019-07-18 16:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
URL   : https://patchwork.freedesktop.org/series/63895/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6504 -> Patchwork_13690
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/

Known issues
------------

  Here are the changes found in Patchwork_13690 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-kbl-7567u:       [PASS][1] -> [DMESG-WARN][2] ([fdo#108566])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/fi-kbl-7567u/igt@gem_exec_suspend@basic-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/fi-kbl-7567u/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_module_load@reload-no-display:
    - fi-icl-u3:          [PASS][3] -> [DMESG-WARN][4] ([fdo#107724])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/fi-icl-u3/igt@i915_module_load@reload-no-display.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/fi-icl-u3/igt@i915_module_load@reload-no-display.html

  * igt@i915_selftest@live_hangcheck:
    - fi-icl-dsi:         [PASS][5] -> [INCOMPLETE][6] ([fdo#107713] / [fdo#108569])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/fi-icl-dsi/igt@i915_selftest@live_hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/fi-icl-dsi/igt@i915_selftest@live_hangcheck.html

  
#### Possible fixes ####

  * igt@kms_chamelium@dp-edid-read:
    - fi-cml-u2:          [FAIL][7] ([fdo#109483]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][9] ([fdo#109485]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#109483]: https://bugs.freedesktop.org/show_bug.cgi?id=109483
  [fdo#109485]: https://bugs.freedesktop.org/show_bug.cgi?id=109485


Participating hosts (54 -> 47)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6504 -> Patchwork_13690

  CI_DRM_6504: a23df173f63ed05ae452ab478a01131a89938654 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5102: 6038ace76016d8892f4d13aef5301f71ca1a6e2d @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13690: 7e8db5f4d4f9d58379fd4519ddb4b70b61697e70 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

7e8db5f4d4f9 drm/i915: Flush stale cachelines on set-cache-level
de5d49b49195 drm/i915: Flush all user surfaces prior to first use
45e21a6a99db drm/i915: Use maximum write flush for pwrite_gtt
7a958734910b drm/i915: Drop wmb() inside pread_gtt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ✓ Fi.CI.IGT: success for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
                   ` (4 preceding siblings ...)
  2019-07-18 16:22 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-07-18 17:29 ` Patchwork
  2019-07-18 18:23 ` [PATCH 1/4] " Ville Syrjälä
  6 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2019-07-18 17:29 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt
URL   : https://patchwork.freedesktop.org/series/63895/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6504_full -> Patchwork_13690_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_13690_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-kbl:          [PASS][1] -> [DMESG-WARN][2] ([fdo#108566]) +4 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-kbl3/igt@gem_ctx_isolation@rcs0-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-kbl6/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@kms_cursor_edge_walk@pipe-c-128x128-top-edge:
    - shard-iclb:         [PASS][3] -> [INCOMPLETE][4] ([fdo#107713])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb2/igt@kms_cursor_edge_walk@pipe-c-128x128-top-edge.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb7/igt@kms_cursor_edge_walk@pipe-c-128x128-top-edge.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
    - shard-hsw:          [PASS][5] -> [FAIL][6] ([fdo#105767])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-hsw4/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-hsw6/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [PASS][7] -> [FAIL][8] ([fdo#105363])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl8/igt@kms_flip@flip-vs-expired-vblank.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl7/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_flip@modeset-vs-vblank-race-interruptible:
    - shard-glk:          [PASS][9] -> [FAIL][10] ([fdo#103060])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-glk1/igt@kms_flip@modeset-vs-vblank-race-interruptible.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-glk7/igt@kms_flip@modeset-vs-vblank-race-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw:
    - shard-iclb:         [PASS][11] -> [FAIL][12] ([fdo#103167]) +6 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw.html

  * igt@kms_frontbuffer_tracking@fbcpsr-indfb-scaledprimary:
    - shard-iclb:         [PASS][13] -> [DMESG-WARN][14] ([fdo#107724])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-indfb-scaledprimary.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-indfb-scaledprimary.html

  * igt@kms_frontbuffer_tracking@fbcpsr-shrfb-scaledprimary:
    - shard-iclb:         [PASS][15] -> [DMESG-FAIL][16] ([fdo#103167] / [fdo#107724])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-shrfb-scaledprimary.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-shrfb-scaledprimary.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen:
    - shard-iclb:         [PASS][17] -> [INCOMPLETE][18] ([fdo#106978] / [fdo#107713])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb7/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          [PASS][19] -> [FAIL][20] ([fdo#108145]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl10/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl2/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_psr@psr2_cursor_mmap_cpu:
    - shard-iclb:         [PASS][21] -> [SKIP][22] ([fdo#109441]) +5 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_cpu.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb1/igt@kms_psr@psr2_cursor_mmap_cpu.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-apl:          [PASS][23] -> [DMESG-WARN][24] ([fdo#108566]) +3 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-apl2/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-apl1/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs0-s3:
    - shard-kbl:          [DMESG-WARN][25] ([fdo#108566]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-kbl6/igt@gem_ctx_isolation@vcs0-s3.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-kbl7/igt@gem_ctx_isolation@vcs0-s3.html

  * igt@i915_pm_rpm@i2c:
    - shard-hsw:          [FAIL][27] ([fdo#104097]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-hsw6/igt@i915_pm_rpm@i2c.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-hsw1/igt@i915_pm_rpm@i2c.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x42-random:
    - shard-skl:          [FAIL][29] ([fdo#103232]) -> [PASS][30] +1 similar issue
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl8/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html

  * igt@kms_cursor_legacy@cursor-vs-flip-atomic:
    - shard-hsw:          [FAIL][31] ([fdo#103355]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-hsw6/igt@kms_cursor_legacy@cursor-vs-flip-atomic.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-hsw4/igt@kms_cursor_legacy@cursor-vs-flip-atomic.html

  * igt@kms_flip@2x-flip-vs-expired-vblank:
    - shard-glk:          [FAIL][33] ([fdo#105363]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-glk7/igt@kms_flip@2x-flip-vs-expired-vblank.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-glk1/igt@kms_flip@2x-flip-vs-expired-vblank.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-hsw:          [INCOMPLETE][35] ([fdo#103540]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-hsw4/igt@kms_flip@flip-vs-suspend-interruptible.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-hsw6/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
    - shard-iclb:         [FAIL][37] ([fdo#103167]) -> [PASS][38] +2 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb5/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb7/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-suspend:
    - shard-iclb:         [INCOMPLETE][39] ([fdo#106978] / [fdo#107713]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-suspend.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-suspend.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes:
    - shard-apl:          [DMESG-WARN][41] ([fdo#108566]) -> [PASS][42] +3 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-apl7/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-apl1/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [FAIL][43] ([fdo#108145] / [fdo#110403]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl8/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl9/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [FAIL][45] ([fdo#108341]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb1/igt@kms_psr@no_drrs.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb3/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_dpms:
    - shard-iclb:         [SKIP][47] ([fdo#109441]) -> [PASS][48] +2 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-iclb3/igt@kms_psr@psr2_dpms.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-iclb2/igt@kms_psr@psr2_dpms.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [FAIL][49] ([fdo#99912]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-kbl4/igt@kms_setmode@basic.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-kbl2/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-a-wait-forked-busy-hang:
    - shard-snb:          [SKIP][51] ([fdo#109271]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-snb1/igt@kms_vblank@pipe-a-wait-forked-busy-hang.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-snb1/igt@kms_vblank@pipe-a-wait-forked-busy-hang.html

  * igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend:
    - shard-skl:          [INCOMPLETE][53] ([fdo#104108]) -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl8/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl9/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html

  
#### Warnings ####

  * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-mmap-wc:
    - shard-skl:          [FAIL][55] ([fdo#103167]) -> [FAIL][56] ([fdo#108040])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl2/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-mmap-wc.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl9/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-blt:
    - shard-skl:          [FAIL][57] ([fdo#108040]) -> [FAIL][58] ([fdo#103167])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-skl2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-blt.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-skl9/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-cpu:
    - shard-apl:          [SKIP][59] ([fdo#109271]) -> [INCOMPLETE][60] ([fdo#103927])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6504/shard-apl6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-cpu.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/shard-apl3/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-cpu.html

  
  [fdo#103060]: https://bugs.freedesktop.org/show_bug.cgi?id=103060
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103355]: https://bugs.freedesktop.org/show_bug.cgi?id=103355
  [fdo#103540]: https://bugs.freedesktop.org/show_bug.cgi?id=103540
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104097]: https://bugs.freedesktop.org/show_bug.cgi?id=104097
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105767]: https://bugs.freedesktop.org/show_bug.cgi?id=105767
  [fdo#106978]: https://bugs.freedesktop.org/show_bug.cgi?id=106978
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#108040]: https://bugs.freedesktop.org/show_bug.cgi?id=108040
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108341]: https://bugs.freedesktop.org/show_bug.cgi?id=108341
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#110403]: https://bugs.freedesktop.org/show_bug.cgi?id=110403
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_6504 -> Patchwork_13690

  CI_DRM_6504: a23df173f63ed05ae452ab478a01131a89938654 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5102: 6038ace76016d8892f4d13aef5301f71ca1a6e2d @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13690: 7e8db5f4d4f9d58379fd4519ddb4b70b61697e70 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13690/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt
  2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
                   ` (5 preceding siblings ...)
  2019-07-18 17:29 ` ✓ Fi.CI.IGT: " Patchwork
@ 2019-07-18 18:23 ` Ville Syrjälä
  6 siblings, 0 replies; 32+ messages in thread
From: Ville Syrjälä @ 2019-07-18 18:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Jul 18, 2019 at 03:54:04PM +0100, Chris Wilson wrote:
> Inside pread, we only ever read from the GTT so the serialising wmb()
> instructions around the GGTT PTE updates are pointless.

Hard to argue with that.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a207b90924e4..fed0bc421a55 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -395,11 +395,9 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
>  		unsigned page_length = PAGE_SIZE - page_offset;
>  		page_length = remain < page_length ? remain : page_length;
>  		if (node.allocated) {
> -			wmb();
>  			ggtt->vm.insert_page(&ggtt->vm,
>  					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
>  					     node.start, I915_CACHE_NONE, 0);
> -			wmb();
>  		} else {
>  			page_base += offset & PAGE_MASK;
>  		}
> @@ -419,7 +417,6 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
>  out_unpin:
>  	mutex_lock(&i915->drm.struct_mutex);
>  	if (node.allocated) {
> -		wmb();
>  		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
>  		remove_mappable_node(&node);
>  	} else {
> -- 
> 2.22.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt
  2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
@ 2019-07-18 18:28   ` Ville Syrjälä
  2019-07-18 19:19     ` Chris Wilson
  2019-07-19  0:45   ` Sasha Levin
  1 sibling, 1 reply; 32+ messages in thread
From: Ville Syrjälä @ 2019-07-18 18:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, Matthew Auld, stable

On Thu, Jul 18, 2019 at 03:54:05PM +0100, Chris Wilson wrote:
> As recently disovered by forcing big-core (!llc) machines to use the GTT
> paths, we need our full GTT write flush before manipulating the GTT PTE
> or else the writes may be directed to the wrong page.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthew Auld <matthew.william.auld@gmail.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fed0bc421a55..c6ba350e6e4f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -610,7 +610,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  		unsigned int page_length = PAGE_SIZE - page_offset;
>  		page_length = remain < page_length ? remain : page_length;
>  		if (node.allocated) {
> -			wmb(); /* flush the write before we modify the GGTT */
> +			/* flush the write before we modify the GGTT */
> +			intel_gt_flush_ggtt_writes(ggtt->vm.gt);

Matches the story told by intel_gt_flush_ggtt_writes().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>  			ggtt->vm.insert_page(&ggtt->vm,
>  					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
>  					     node.start, I915_CACHE_NONE, 0);
> @@ -639,8 +640,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  	i915_gem_object_unlock_fence(obj, fence);
>  out_unpin:
>  	mutex_lock(&i915->drm.struct_mutex);
> +	intel_gt_flush_ggtt_writes(ggtt->vm.gt);
>  	if (node.allocated) {
> -		wmb();
>  		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
>  		remove_mappable_node(&node);
>  	} else {
> -- 
> 2.22.0

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt
  2019-07-18 18:28   ` Ville Syrjälä
@ 2019-07-18 19:19     ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-18 19:19 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Joonas Lahtinen, Matthew Auld, stable

Quoting Ville Syrjälä (2019-07-18 19:28:43)
> On Thu, Jul 18, 2019 at 03:54:05PM +0100, Chris Wilson wrote:
> > As recently disovered by forcing big-core (!llc) machines to use the GTT
> > paths, we need our full GTT write flush before manipulating the GTT PTE
> > or else the writes may be directed to the wrong page.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Matthew Auld <matthew.william.auld@gmail.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index fed0bc421a55..c6ba350e6e4f 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -610,7 +610,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
> >               unsigned int page_length = PAGE_SIZE - page_offset;
> >               page_length = remain < page_length ? remain : page_length;
> >               if (node.allocated) {
> > -                     wmb(); /* flush the write before we modify the GGTT */
> > +                     /* flush the write before we modify the GGTT */
> > +                     intel_gt_flush_ggtt_writes(ggtt->vm.gt);
> 
> Matches the story told by intel_gt_flush_ggtt_writes().
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Ta, pushed to dinq. Hopefully, this may explain some mystery fails!
(Not that any sane userspace does for(;;) { gem_write(); gem_read(); })
-Chris

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt
  2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
  2019-07-18 18:28   ` Ville Syrjälä
@ 2019-07-19  0:45   ` Sasha Levin
  1 sibling, 0 replies; 32+ messages in thread
From: Sasha Levin @ 2019-07-19  0:45 UTC (permalink / raw)
  To: Sasha Levin, Chris Wilson, intel-gfx; +Cc: stable

Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.2.1, v5.1.18, v4.19.59, v4.14.133, v4.9.185, v4.4.185.

v5.2.1: Failed to apply! Possible dependencies:
    09407579abf5 ("drm/i915: Store the default sseu setup on the engine")
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")

v5.1.18: Failed to apply! Possible dependencies:
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    13f1bfd3b332 ("drm/i915: Make object/vma allocation caches global")
    2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
    32eb6bcfdda9 ("drm/i915: Make request allocation caches global")
    39e2f501c1b4 ("drm/i915: Split struct intel_context definition to its own header")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    7ae1940014ef ("drm/i915: Defer removing fence register tracking to rpm wakeup")
    7e3d9a59410d ("drm/i915: Track active engines within a context")
    7f4127c4839b ("drm/i915: Use time based guilty context banning")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")
    ba4fda620a5f ("drm/i915: Optionally disable automatic recovery after a GPU reset")
    c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained")

v4.19.59: Failed to apply! Possible dependencies:
    0e39037b3165 ("drm/i915: Cache the error string")
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    16e4dd0342a8 ("drm/i915: Markup paired operations on wakerefs")
    39e2f501c1b4 ("drm/i915: Split struct intel_context definition to its own header")
    52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
    538ef96b9dae ("drm/i915/gem: Track the rpm wakerefs")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    6b048706f407 ("drm/i915: Forcibly flush unwanted requests in drop-caches")
    87f1ef225242 ("drm/i915: Record the sseu configuration per-context & engine")
    95fd94a645f7 ("drm/i915: avoid rebuilding i915_gpu_error.o on version string updates")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")
    c0a6aa7ec2c3 ("drm/i915: Show actual alongside requested frequency in debugfs/i915_rps_boost_info")
    c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained")
    c44301fce614 ("drm/i915: Allow control of PSR at runtime through debugfs, v6")
    cab870b7fdf3 ("drm/i915/ilk: Fix warning when reading emon_status with no output")
    e6154e4cb8b0 ("drm/i915: Skip the ERR_PTR error state")
    eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")
    fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture")

v4.14.133: Failed to apply! Possible dependencies:
    3bd4073524fa ("drm/i915: Consolidate get_fence with pin_fence")
    465c403cb508 ("drm/i915: introduce simple gemfs")
    66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
    7393b7ee3a9c ("drm/i915/debugfs: include some gtt page size metrics")
    73ebd503034c ("drm/i915: make mappable struct resource centric")
    7789422665f5 ("drm/i915: make dsm struct resource centric")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    969b0950a188 ("drm/i915: Add interface to reserve fence registers for vGPU")
    a65adaf8a834 ("drm/i915: Track user GTT faulting per-vma")
    b1ace60107e6 ("drm/i915: give stolen_usable_size a more suitable home")
    b7128ef125b4 ("drm/i915: prefer resource_size_t for everything stolen")
    da1dd0dbe024 ("drm/i915: Make the report about a bogus stolen reserved area an error")
    db7fb60593e4 ("drm/i915: Check if the stolen memory "reserved" area is enabled or not")
    e91ef99b9543 ("drm/i915/selftests: Remember to create the fake preempt context")
    f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")

v4.9.185: Failed to apply! Possible dependencies:
    0e70447605f4 ("drm/i915: Move common code out of i915_gpu_error.c")
    1b36595ffb35 ("drm/i915: Show RING registers through debugfs")
    28a60dee2ce6 ("drm/i915/gvt: vGPU HW resource management")
    3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    85fd4f58d7ef ("drm/i915: Mark all non-vma being inserted into the address spaces")
    9c870d03674f ("drm/i915: Use RPM as the barrier for controlling user mmap access")
    bb6dc8d96b68 ("drm/i915: Implement pread without struct-mutex")
    d636951ec01b ("drm/i915: Cleanup instdone collection")
    e007b19d7ba7 ("drm/i915: Use the MRU stack search after evicting")
    f9e613728090 ("drm/i915: Try to print INSTDONE bits for all slice/subslice")

v4.4.185: Failed to apply! Possible dependencies:
    09cfcb456941 ("drm/i915: Split out load time HW initialization")
    0a9d2bed5557 ("drm/i915/skl: Making DC6 entry is the last call in suspend flow.")
    1f814daca43a ("drm/i915: add support for checking if we hold an RPM reference")
    2f693e28b8df ("drm/i915: Make turning on/off PW1 and Misc I/O part of the init/fini sequences")
    399bb5b6db02 ("drm/i915: Move allocation of various workqueues earlier during init")
    414b7999b8be ("drm/i915/gen9: Remove csr.state, csr_lock and related code.")
    4f1959ee33c0 ("drm/i915: Use insert_page for pwrite_fast")
    62106b4f6b91 ("drm/i915: Rename dev_priv->gtt to dev_priv->ggtt")
    666a45379e2c ("drm/i915: Separate cherryview from valleyview")
    72e96d6450c0 ("drm/i915: Refer to GGTT {,VM} consistently")
    73dfc227ff5c ("drm/i915/skl: init/uninit display core as part of the HW power domain state")
    9c5308ea1cd4 ("drm/i915/skl: Refuse to load outdated dmc firmware")
    ad5c3d3ffbb2 ("drm/i915: Move MCHBAR setup earlier during init")
    b6e7d894c3d2 ("drm/i915/skl: Store and print the DMC firmware version we load")
    bc87229f323e ("drm/i915/skl: enable PC9/10 power states during suspend-to-idle")
    c73666f394fc ("drm/i915/skl: If needed sanitize bios programmed cdclk")
    ebae38d061df ("drm/i915/gen9: csr_init after runtime pm enable")
    f4448375467d ("drm/i915/gen9: Use dev_priv in csr functions")
    f514c2d84285 ("drm/i915/gen9: flush DMC fw loading work during system suspend")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
@ 2019-07-19 10:01   ` Joonas Lahtinen
  2019-07-19 10:18   ` Lionel Landwerlin
  2019-07-24 20:37   ` Francisco Jerez
  2 siblings, 0 replies; 32+ messages in thread
From: Joonas Lahtinen @ 2019-07-19 10:01 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Quoting Chris Wilson (2019-07-18 17:54:06)
> Since userspace has the ability to bypass the CPU cache from within its
> unprivileged command stream, we have to flush the CPU cache to memory
> in order to overwrite the previous contents on creation.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stablevger.kernel.org

Makes sense. If there is some platform where this absolutely can't
happen and they need optimization, we can special case later.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
  2019-07-18 14:54 ` [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level Chris Wilson
@ 2019-07-19 10:03     ` Joonas Lahtinen
  2019-11-12  9:09     ` Daniel Vetter
  1 sibling, 0 replies; 32+ messages in thread
From: Joonas Lahtinen @ 2019-07-19 10:03 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson, stable

Quoting Chris Wilson (2019-07-18 17:54:07)
> Ensure that we flush any cache dirt out to main memory before the user
> changes the cache-level as they may elect to bypass the cache (even after
> declaring their access cache-coherent) via use of unprivileged MOCS.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-07-19 10:03     ` Joonas Lahtinen
  0 siblings, 0 replies; 32+ messages in thread
From: Joonas Lahtinen @ 2019-07-19 10:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, stable

Quoting Chris Wilson (2019-07-18 17:54:07)
> Ensure that we flush any cache dirt out to main memory before the user
> changes the cache-level as they may elect to bypass the cache (even after
> declaring their access cache-coherent) via use of unprivileged MOCS.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
  2019-07-19 10:01   ` Joonas Lahtinen
@ 2019-07-19 10:18   ` Lionel Landwerlin
  2019-07-19 10:21     ` Chris Wilson
  2019-07-24 20:37   ` Francisco Jerez
  2 siblings, 1 reply; 32+ messages in thread
From: Lionel Landwerlin @ 2019-07-19 10:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 18/07/2019 17:54, Chris Wilson wrote:
> Since userspace has the ability to bypass the CPU cache from within its
> unprivileged command stream, we have to flush the CPU cache to memory
> in order to overwrite the previous contents on creation.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stablevger.kernel.org
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
>   1 file changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index d2a1158868e7..f752b326d399 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>   {
>   	struct drm_i915_gem_object *obj;
>   	struct address_space *mapping;
> -	unsigned int cache_level;
>   	gfp_t mask;
>   	int ret;
>   
> @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>   	obj->write_domain = I915_GEM_DOMAIN_CPU;
>   	obj->read_domains = I915_GEM_DOMAIN_CPU;
>   
> -	if (HAS_LLC(i915))
> -		/* On some devices, we can have the GPU use the LLC (the CPU
> -		 * cache) for about a 10% performance improvement
> -		 * compared to uncached.  Graphics requests other than
> -		 * display scanout are coherent with the CPU in
> -		 * accessing this cache.  This means in this mode we
> -		 * don't need to clflush on the CPU side, and on the
> -		 * GPU side we only need to flush internal caches to
> -		 * get data visible to the CPU.
> -		 *
> -		 * However, we maintain the display planes as UC, and so
> -		 * need to rebind when first used as such.
> -		 */
> -		cache_level = I915_CACHE_LLC;
> -	else
> -		cache_level = I915_CACHE_NONE;
> -
> -	i915_gem_object_set_cache_coherency(obj, cache_level);
> +	/*
> +	 * Note that userspace has control over cache-bypass
> +	 * via its command stream, so even on LLC architectures
> +	 * we have to flush out the CPU cache to memory to
> +	 * clear previous contents.
> +	 */
> +	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>   
>   	trace_i915_gem_object_create(obj);
>   

Does i915_drm.h needs updating? :


/**
  * I915_CACHING_CACHED
  *
  * GPU access is coherent with cpu caches and furthermore the data is 
cached in
  * last-level caches shared between cpu cores and the gpu GT. Default on
  * machines with HAS_LLC.
  */
#define I915_CACHING_CACHED             1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-19 10:18   ` Lionel Landwerlin
@ 2019-07-19 10:21     ` Chris Wilson
  2019-07-19 22:55       ` Jason Ekstrand
  0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2019-07-19 10:21 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-07-19 11:18:42)
> On 18/07/2019 17:54, Chris Wilson wrote:
> > Since userspace has the ability to bypass the CPU cache from within its
> > unprivileged command stream, we have to flush the CPU cache to memory
> > in order to overwrite the previous contents on creation.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stablevger.kernel.org
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
> >   1 file changed, 7 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index d2a1158868e7..f752b326d399 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >   {
> >       struct drm_i915_gem_object *obj;
> >       struct address_space *mapping;
> > -     unsigned int cache_level;
> >       gfp_t mask;
> >       int ret;
> >   
> > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >       obj->write_domain = I915_GEM_DOMAIN_CPU;
> >       obj->read_domains = I915_GEM_DOMAIN_CPU;
> >   
> > -     if (HAS_LLC(i915))
> > -             /* On some devices, we can have the GPU use the LLC (the CPU
> > -              * cache) for about a 10% performance improvement
> > -              * compared to uncached.  Graphics requests other than
> > -              * display scanout are coherent with the CPU in
> > -              * accessing this cache.  This means in this mode we
> > -              * don't need to clflush on the CPU side, and on the
> > -              * GPU side we only need to flush internal caches to
> > -              * get data visible to the CPU.
> > -              *
> > -              * However, we maintain the display planes as UC, and so
> > -              * need to rebind when first used as such.
> > -              */
> > -             cache_level = I915_CACHE_LLC;
> > -     else
> > -             cache_level = I915_CACHE_NONE;
> > -
> > -     i915_gem_object_set_cache_coherency(obj, cache_level);
> > +     /*
> > +      * Note that userspace has control over cache-bypass
> > +      * via its command stream, so even on LLC architectures
> > +      * we have to flush out the CPU cache to memory to
> > +      * clear previous contents.
> > +      */
> > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
> >   
> >       trace_i915_gem_object_create(obj);
> >   
> 
> Does i915_drm.h needs updating? :
> 
> 
> /**
>   * I915_CACHING_CACHED
>   *
>   * GPU access is coherent with cpu caches and furthermore the data is 
> cached in
>   * last-level caches shared between cpu cores and the gpu GT. Default on
>   * machines with HAS_LLC.
>   */
> #define I915_CACHING_CACHED             1

Sneaky. Thanks,
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-19 10:21     ` Chris Wilson
@ 2019-07-19 22:55       ` Jason Ekstrand
  2019-07-20 10:49         ` Chris Wilson
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Ekstrand @ 2019-07-19 22:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel GFX


[-- Attachment #1.1: Type: text/plain, Size: 3610 bytes --]

Just to be clear, is this just adding a CLFLUSH or is it actually changing
the default caching state of buffers from CACHED to NONE?  If it's actually
changing the default state, that's going to break userspace badly.

--Jason

On Fri, Jul 19, 2019 at 5:21 AM Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> Quoting Lionel Landwerlin (2019-07-19 11:18:42)
> > On 18/07/2019 17:54, Chris Wilson wrote:
> > > Since userspace has the ability to bypass the CPU cache from within its
> > > unprivileged command stream, we have to flush the CPU cache to memory
> > > in order to overwrite the previous contents on creation.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: stablevger.kernel.org
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26
> ++++++-----------------
> > >   1 file changed, 7 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > index d2a1158868e7..f752b326d399 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct
> drm_i915_private *i915, u64 size)
> > >   {
> > >       struct drm_i915_gem_object *obj;
> > >       struct address_space *mapping;
> > > -     unsigned int cache_level;
> > >       gfp_t mask;
> > >       int ret;
> > >
> > > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct
> drm_i915_private *i915, u64 size)
> > >       obj->write_domain = I915_GEM_DOMAIN_CPU;
> > >       obj->read_domains = I915_GEM_DOMAIN_CPU;
> > >
> > > -     if (HAS_LLC(i915))
> > > -             /* On some devices, we can have the GPU use the LLC (the
> CPU
> > > -              * cache) for about a 10% performance improvement
> > > -              * compared to uncached.  Graphics requests other than
> > > -              * display scanout are coherent with the CPU in
> > > -              * accessing this cache.  This means in this mode we
> > > -              * don't need to clflush on the CPU side, and on the
> > > -              * GPU side we only need to flush internal caches to
> > > -              * get data visible to the CPU.
> > > -              *
> > > -              * However, we maintain the display planes as UC, and so
> > > -              * need to rebind when first used as such.
> > > -              */
> > > -             cache_level = I915_CACHE_LLC;
> > > -     else
> > > -             cache_level = I915_CACHE_NONE;
> > > -
> > > -     i915_gem_object_set_cache_coherency(obj, cache_level);
> > > +     /*
> > > +      * Note that userspace has control over cache-bypass
> > > +      * via its command stream, so even on LLC architectures
> > > +      * we have to flush out the CPU cache to memory to
> > > +      * clear previous contents.
> > > +      */
> > > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
> > >
> > >       trace_i915_gem_object_create(obj);
> > >
> >
> > Does i915_drm.h needs updating? :
> >
> >
> > /**
> >   * I915_CACHING_CACHED
> >   *
> >   * GPU access is coherent with cpu caches and furthermore the data is
> > cached in
> >   * last-level caches shared between cpu cores and the gpu GT. Default on
> >   * machines with HAS_LLC.
> >   */
> > #define I915_CACHING_CACHED             1
>
> Sneaky. Thanks,
> -Chris
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[-- Attachment #1.2: Type: text/html, Size: 5147 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-19 22:55       ` Jason Ekstrand
@ 2019-07-20 10:49         ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-07-20 10:49 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX

Quoting Jason Ekstrand (2019-07-19 23:55:03)
> Just to be clear, is this just adding a CLFLUSH or is it actually changing the
> default caching state of buffers from CACHED to NONE?  If it's actually
> changing the default state, that's going to break userspace badly.

What userspace is actually broken? You don't use set-domain, preferring
to do your own cache management and MOCS; where you do default to PTE is
where we expect the buffers to be uncached anyway. In effect, it is only
swap in of the objects that the kernel will insert an extra clflush for
you.

I'm looking for the negative impact of this, and not finding much
repercussion at all. At the end of the day, we have no choice but to
close such a hole; the only question is how much leeway to give and how
to mitigate such issues in future interfaces.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
  2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
  2019-07-19 10:01   ` Joonas Lahtinen
  2019-07-19 10:18   ` Lionel Landwerlin
@ 2019-07-24 20:37   ` Francisco Jerez
  2019-11-12  9:38       ` [Intel-gfx] " Chris Wilson
  2 siblings, 1 reply; 32+ messages in thread
From: Francisco Jerez @ 2019-07-24 20:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 3805 bytes --]

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Since userspace has the ability to bypass the CPU cache from within its
> unprivileged command stream, we have to flush the CPU cache to memory
> in order to overwrite the previous contents on creation.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stablevger.kernel.org
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
>  1 file changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index d2a1158868e7..f752b326d399 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>  {
>  	struct drm_i915_gem_object *obj;
>  	struct address_space *mapping;
> -	unsigned int cache_level;
>  	gfp_t mask;
>  	int ret;
>  
> @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>  	obj->write_domain = I915_GEM_DOMAIN_CPU;
>  	obj->read_domains = I915_GEM_DOMAIN_CPU;
>  
> -	if (HAS_LLC(i915))
> -		/* On some devices, we can have the GPU use the LLC (the CPU
> -		 * cache) for about a 10% performance improvement
> -		 * compared to uncached.  Graphics requests other than
> -		 * display scanout are coherent with the CPU in
> -		 * accessing this cache.  This means in this mode we
> -		 * don't need to clflush on the CPU side, and on the
> -		 * GPU side we only need to flush internal caches to
> -		 * get data visible to the CPU.
> -		 *
> -		 * However, we maintain the display planes as UC, and so
> -		 * need to rebind when first used as such.
> -		 */
> -		cache_level = I915_CACHE_LLC;
> -	else
> -		cache_level = I915_CACHE_NONE;
> -
> -	i915_gem_object_set_cache_coherency(obj, cache_level);
> +	/*
> +	 * Note that userspace has control over cache-bypass
> +	 * via its command stream, so even on LLC architectures
> +	 * we have to flush out the CPU cache to memory to
> +	 * clear previous contents.
> +	 */
> +	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>  

I'm not sure if you've seen my comments about this in an internal thread
you were CC'ed to: I don't think this patch will have the intended
effect.  The various clflushes this could trigger before the first use
of the buffer are conditional on "obj->cache_dirty &
~obj->cache_coherent", which will always be false on LLC platforms
AFAICT, because on those platforms i915_gem_object_set_cache_coherency()
will always set bit 0 of obj->cache_coherent.

I attached a patch with the same purpose as this to that internal thread
which doesn't suffer from this bug, but my patch was specific to Gen12+
where cache bypass is actually exposed to userspace.  Why is this patch
enabling the flushes for all platforms?  AFAICT the only currently
exposed MOCS entries that might allow userspace to bypass the LLC are 16
and 17 on Gen11, which enable the SCF MOCS table bit, which is marked
"S/W should NOT set this field in client platforms" in the Gen9 docs,
and according to the Gen10-11 docs is "Not supported".  Does LLC bypass
actually work on ICL?  I doubt it but it might have been fixed in some
other Gen11 project.  Shouldn't this change be conditional on the
platform supporting LLC bypass?  Do you want me to resend my patch here
with the Gen12+ checks changed to Gen11+?

>  	trace_i915_gem_object_create(obj);
>  
> -- 
> 2.22.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
  2019-07-18 14:54 ` [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level Chris Wilson
  2019-07-19 10:03     ` Joonas Lahtinen
@ 2019-11-12  9:09     ` Daniel Vetter
  1 sibling, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12  9:09 UTC (permalink / raw)
  To: Chris Wilson, Joonas Lahtinen, Francisco Jerez; +Cc: intel-gfx, stable

On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Ensure that we flush any cache dirt out to main memory before the user
> changes the cache-level as they may elect to bypass the cache (even after
> declaring their access cache-coherent) via use of unprivileged MOCS.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 2e3ce2a69653..5d41e769a428 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>
>         list_for_each_entry(vma, &obj->vma.list, obj_link)
>                 vma->node.color = cache_level;
> +
> +       /* Flush any previous cache dirt in case of cache bypass */
> +       if (obj->cache_dirty & ~obj->cache_coherent)
> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);

I think writing out the bit logic instead of implicitly relying on the
#defines would be much better, i.e. && !(cache_coherent &
COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
true if we don't flush here already, to avoid double flushing?
-Daniel

> +
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>         obj->cache_dirty = true; /* Always invalidate stale cachelines */
>
> --
> 2.22.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12  9:09     ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12  9:09 UTC (permalink / raw)
  To: Chris Wilson, Joonas Lahtinen, Francisco Jerez; +Cc: intel-gfx, stable

On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Ensure that we flush any cache dirt out to main memory before the user
> changes the cache-level as they may elect to bypass the cache (even after
> declaring their access cache-coherent) via use of unprivileged MOCS.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 2e3ce2a69653..5d41e769a428 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>
>         list_for_each_entry(vma, &obj->vma.list, obj_link)
>                 vma->node.color = cache_level;
> +
> +       /* Flush any previous cache dirt in case of cache bypass */
> +       if (obj->cache_dirty & ~obj->cache_coherent)
> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);

I think writing out the bit logic instead of implicitly relying on the
#defines would be much better, i.e. && !(cache_coherent &
COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
true if we don't flush here already, to avoid double flushing?
-Daniel

> +
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>         obj->cache_dirty = true; /* Always invalidate stale cachelines */
>
> --
> 2.22.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12  9:09     ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12  9:09 UTC (permalink / raw)
  To: Chris Wilson, Joonas Lahtinen, Francisco Jerez; +Cc: intel-gfx, stable

On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Ensure that we flush any cache dirt out to main memory before the user
> changes the cache-level as they may elect to bypass the cache (even after
> declaring their access cache-coherent) via use of unprivileged MOCS.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 2e3ce2a69653..5d41e769a428 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>
>         list_for_each_entry(vma, &obj->vma.list, obj_link)
>                 vma->node.color = cache_level;
> +
> +       /* Flush any previous cache dirt in case of cache bypass */
> +       if (obj->cache_dirty & ~obj->cache_coherent)
> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);

I think writing out the bit logic instead of implicitly relying on the
#defines would be much better, i.e. && !(cache_coherent &
COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
true if we don't flush here already, to avoid double flushing?
-Daniel

> +
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>         obj->cache_dirty = true; /* Always invalidate stale cachelines */
>
> --
> 2.22.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
@ 2019-11-12  9:38       ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-11-12  9:38 UTC (permalink / raw)
  To: Francisco Jerez, intel-gfx

Quoting Francisco Jerez (2019-07-24 21:37:24)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Since userspace has the ability to bypass the CPU cache from within its
> > unprivileged command stream, we have to flush the CPU cache to memory
> > in order to overwrite the previous contents on creation.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stablevger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
> >  1 file changed, 7 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index d2a1158868e7..f752b326d399 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >  {
> >       struct drm_i915_gem_object *obj;
> >       struct address_space *mapping;
> > -     unsigned int cache_level;
> >       gfp_t mask;
> >       int ret;
> >  
> > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >       obj->write_domain = I915_GEM_DOMAIN_CPU;
> >       obj->read_domains = I915_GEM_DOMAIN_CPU;
> >  
> > -     if (HAS_LLC(i915))
> > -             /* On some devices, we can have the GPU use the LLC (the CPU
> > -              * cache) for about a 10% performance improvement
> > -              * compared to uncached.  Graphics requests other than
> > -              * display scanout are coherent with the CPU in
> > -              * accessing this cache.  This means in this mode we
> > -              * don't need to clflush on the CPU side, and on the
> > -              * GPU side we only need to flush internal caches to
> > -              * get data visible to the CPU.
> > -              *
> > -              * However, we maintain the display planes as UC, and so
> > -              * need to rebind when first used as such.
> > -              */
> > -             cache_level = I915_CACHE_LLC;
> > -     else
> > -             cache_level = I915_CACHE_NONE;
> > -
> > -     i915_gem_object_set_cache_coherency(obj, cache_level);
> > +     /*
> > +      * Note that userspace has control over cache-bypass
> > +      * via its command stream, so even on LLC architectures
> > +      * we have to flush out the CPU cache to memory to
> > +      * clear previous contents.
> > +      */
> > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
> >  
> 
> I'm not sure if you've seen my comments about this in an internal thread
> you were CC'ed to: I don't think this patch will have the intended
> effect.  The various clflushes this could trigger before the first use
> of the buffer are conditional on "obj->cache_dirty &
> ~obj->cache_coherent", which will always be false on LLC platforms
> AFAICT, because on those platforms i915_gem_object_set_cache_coherency()
> will always set bit 0 of obj->cache_coherent.

You only need to flush the stale written-to cachelines, so that the page
content is correct on reuse of the foreign struct page. After that point,
the CPU cache is managed by the client.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
@ 2019-11-12  9:38       ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-11-12  9:38 UTC (permalink / raw)
  To: Francisco Jerez, intel-gfx

Quoting Francisco Jerez (2019-07-24 21:37:24)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Since userspace has the ability to bypass the CPU cache from within its
> > unprivileged command stream, we have to flush the CPU cache to memory
> > in order to overwrite the previous contents on creation.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stablevger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
> >  1 file changed, 7 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index d2a1158868e7..f752b326d399 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >  {
> >       struct drm_i915_gem_object *obj;
> >       struct address_space *mapping;
> > -     unsigned int cache_level;
> >       gfp_t mask;
> >       int ret;
> >  
> > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
> >       obj->write_domain = I915_GEM_DOMAIN_CPU;
> >       obj->read_domains = I915_GEM_DOMAIN_CPU;
> >  
> > -     if (HAS_LLC(i915))
> > -             /* On some devices, we can have the GPU use the LLC (the CPU
> > -              * cache) for about a 10% performance improvement
> > -              * compared to uncached.  Graphics requests other than
> > -              * display scanout are coherent with the CPU in
> > -              * accessing this cache.  This means in this mode we
> > -              * don't need to clflush on the CPU side, and on the
> > -              * GPU side we only need to flush internal caches to
> > -              * get data visible to the CPU.
> > -              *
> > -              * However, we maintain the display planes as UC, and so
> > -              * need to rebind when first used as such.
> > -              */
> > -             cache_level = I915_CACHE_LLC;
> > -     else
> > -             cache_level = I915_CACHE_NONE;
> > -
> > -     i915_gem_object_set_cache_coherency(obj, cache_level);
> > +     /*
> > +      * Note that userspace has control over cache-bypass
> > +      * via its command stream, so even on LLC architectures
> > +      * we have to flush out the CPU cache to memory to
> > +      * clear previous contents.
> > +      */
> > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
> >  
> 
> I'm not sure if you've seen my comments about this in an internal thread
> you were CC'ed to: I don't think this patch will have the intended
> effect.  The various clflushes this could trigger before the first use
> of the buffer are conditional on "obj->cache_dirty &
> ~obj->cache_coherent", which will always be false on LLC platforms
> AFAICT, because on those platforms i915_gem_object_set_cache_coherency()
> will always set bit 0 of obj->cache_coherent.

You only need to flush the stale written-to cachelines, so that the page
content is correct on reuse of the foreign struct page. After that point,
the CPU cache is managed by the client.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12  9:42       ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-11-12  9:42 UTC (permalink / raw)
  To: Daniel Vetter, Francisco Jerez, Joonas Lahtinen; +Cc: intel-gfx, stable

Quoting Daniel Vetter (2019-11-12 09:09:06)
> On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Ensure that we flush any cache dirt out to main memory before the user
> > changes the cache-level as they may elect to bypass the cache (even after
> > declaring their access cache-coherent) via use of unprivileged MOCS.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > index 2e3ce2a69653..5d41e769a428 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >
> >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> >                 vma->node.color = cache_level;
> > +
> > +       /* Flush any previous cache dirt in case of cache bypass */
> > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> 
> I think writing out the bit logic instead of implicitly relying on the
> #defines would be much better, i.e. && !(cache_coherent &
> COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> true if we don't flush here already, to avoid double flushing?

No. The mask is being updated, so you need to flush before you lose
track. The cache is then cleared of the dirty bit so won't be flushed
again until dirty and no longer coherent. We need to flag that the page
is no longer coherent at the end of its lifetime (passing back to the
system) to force the flush then.
-Chris

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12  9:42       ` Chris Wilson
  0 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2019-11-12  9:42 UTC (permalink / raw)
  To: Daniel Vetter, Francisco Jerez, Joonas Lahtinen; +Cc: intel-gfx, stable

Quoting Daniel Vetter (2019-11-12 09:09:06)
> On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Ensure that we flush any cache dirt out to main memory before the user
> > changes the cache-level as they may elect to bypass the cache (even after
> > declaring their access cache-coherent) via use of unprivileged MOCS.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > index 2e3ce2a69653..5d41e769a428 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >
> >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> >                 vma->node.color = cache_level;
> > +
> > +       /* Flush any previous cache dirt in case of cache bypass */
> > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> 
> I think writing out the bit logic instead of implicitly relying on the
> #defines would be much better, i.e. && !(cache_coherent &
> COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> true if we don't flush here already, to avoid double flushing?

No. The mask is being updated, so you need to flush before you lose
track. The cache is then cleared of the dirty bit so won't be flushed
again until dirty and no longer coherent. We need to flag that the page
is no longer coherent at the end of its lifetime (passing back to the
system) to force the flush then.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12 10:57         ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12 10:57 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Francisco Jerez, Joonas Lahtinen, intel-gfx, stable, Maarten Lankhorst

On Tue, Nov 12, 2019 at 10:43 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-12 09:09:06)
> > On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > Ensure that we flush any cache dirt out to main memory before the user
> > > changes the cache-level as they may elect to bypass the cache (even after
> > > declaring their access cache-coherent) via use of unprivileged MOCS.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > index 2e3ce2a69653..5d41e769a428 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >
> > >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> > >                 vma->node.color = cache_level;
> > > +
> > > +       /* Flush any previous cache dirt in case of cache bypass */
> > > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> >
> > I think writing out the bit logic instead of implicitly relying on the
> > #defines would be much better, i.e. && !(cache_coherent &
> > COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> > true if we don't flush here already, to avoid double flushing?
>
> No. The mask is being updated, so you need to flush before you lose
> track. The cache is then cleared of the dirty bit so won't be flushed
> again until dirty and no longer coherent. We need to flag that the page
> is no longer coherent at the end of its lifetime (passing back to the
> system) to force the flush then.

Hm I think I overlooked that we only clear cache_dirty in
i915_gem_clflush_object when it's a coherent mode.

I also spotted more cases for (obj->cache_dirty
&~obj->cache_coherent), so that obscure/fragile pattern is
pre-existing :-/ One of them also checks outside of the object lock,
which I think is how these states are supposed to be protected. Smells
a bit fishy still, would be good to make a bit clearer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12 10:57         ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12 10:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On Tue, Nov 12, 2019 at 10:43 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-12 09:09:06)
> > On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > Ensure that we flush any cache dirt out to main memory before the user
> > > changes the cache-level as they may elect to bypass the cache (even after
> > > declaring their access cache-coherent) via use of unprivileged MOCS.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > index 2e3ce2a69653..5d41e769a428 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >
> > >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> > >                 vma->node.color = cache_level;
> > > +
> > > +       /* Flush any previous cache dirt in case of cache bypass */
> > > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> >
> > I think writing out the bit logic instead of implicitly relying on the
> > #defines would be much better, i.e. && !(cache_coherent &
> > COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> > true if we don't flush here already, to avoid double flushing?
>
> No. The mask is being updated, so you need to flush before you lose
> track. The cache is then cleared of the dirty bit so won't be flushed
> again until dirty and no longer coherent. We need to flag that the page
> is no longer coherent at the end of its lifetime (passing back to the
> system) to force the flush then.

Hm I think I overlooked that we only clear cache_dirty in
i915_gem_clflush_object when it's a coherent mode.

I also spotted more cases for (obj->cache_dirty
&~obj->cache_coherent), so that obscure/fragile pattern is
pre-existing :-/ One of them also checks outside of the object lock,
which I think is how these states are supposed to be protected. Smells
a bit fishy still, would be good to make a bit clearer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12 12:08           ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12 12:08 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Francisco Jerez, Joonas Lahtinen, intel-gfx, stable, Maarten Lankhorst

On Tue, Nov 12, 2019 at 11:57 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Nov 12, 2019 at 10:43 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Quoting Daniel Vetter (2019-11-12 09:09:06)
> > > On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > >
> > > > Ensure that we flush any cache dirt out to main memory before the user
> > > > changes the cache-level as they may elect to bypass the cache (even after
> > > > declaring their access cache-coherent) via use of unprivileged MOCS.
> > > >
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: stable@vger.kernel.org
> > > > ---
> > > >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > index 2e3ce2a69653..5d41e769a428 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >
> > > >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> > > >                 vma->node.color = cache_level;
> > > > +
> > > > +       /* Flush any previous cache dirt in case of cache bypass */
> > > > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > > > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> > >
> > > I think writing out the bit logic instead of implicitly relying on the
> > > #defines would be much better, i.e. && !(cache_coherent &
> > > COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> > > true if we don't flush here already, to avoid double flushing?
> >
> > No. The mask is being updated, so you need to flush before you lose
> > track. The cache is then cleared of the dirty bit so won't be flushed
> > again until dirty and no longer coherent. We need to flag that the page
> > is no longer coherent at the end of its lifetime (passing back to the
> > system) to force the flush then.
>
> Hm I think I overlooked that we only clear cache_dirty in
> i915_gem_clflush_object when it's a coherent mode.

Hm, the clear/blt code recently merged doesn't preserve the
->cache_dirty setting for this case, unlike clfush_object. Do we have
a bug there?

> I also spotted more cases for (obj->cache_dirty
> &~obj->cache_coherent), so that obscure/fragile pattern is
> pre-existing :-/ One of them also checks outside of the object lock,
> which I think is how these states are supposed to be protected. Smells
> a bit fishy still, would be good to make a bit clearer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level
@ 2019-11-12 12:08           ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-11-12 12:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On Tue, Nov 12, 2019 at 11:57 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Nov 12, 2019 at 10:43 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Quoting Daniel Vetter (2019-11-12 09:09:06)
> > > On Thu, Jul 18, 2019 at 4:54 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > >
> > > > Ensure that we flush any cache dirt out to main memory before the user
> > > > changes the cache-level as they may elect to bypass the cache (even after
> > > > declaring their access cache-coherent) via use of unprivileged MOCS.
> > > >
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: stable@vger.kernel.org
> > > > ---
> > > >  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > index 2e3ce2a69653..5d41e769a428 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > @@ -277,6 +277,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >
> > > >         list_for_each_entry(vma, &obj->vma.list, obj_link)
> > > >                 vma->node.color = cache_level;
> > > > +
> > > > +       /* Flush any previous cache dirt in case of cache bypass */
> > > > +       if (obj->cache_dirty & ~obj->cache_coherent)
> > > > +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
> > >
> > > I think writing out the bit logic instead of implicitly relying on the
> > > #defines would be much better, i.e. && !(cache_coherent &
> > > COHERENT_FOR_READ). Plus I think we only need to set cache_dirty =
> > > true if we don't flush here already, to avoid double flushing?
> >
> > No. The mask is being updated, so you need to flush before you lose
> > track. The cache is then cleared of the dirty bit so won't be flushed
> > again until dirty and no longer coherent. We need to flag that the page
> > is no longer coherent at the end of its lifetime (passing back to the
> > system) to force the flush then.
>
> Hm I think I overlooked that we only clear cache_dirty in
> i915_gem_clflush_object when it's a coherent mode.

Hm, the clear/blt code recently merged doesn't preserve the
->cache_dirty setting for this case, unlike clfush_object. Do we have
a bug there?

> I also spotted more cases for (obj->cache_dirty
> &~obj->cache_coherent), so that obscure/fragile pattern is
> pre-existing :-/ One of them also checks outside of the object lock,
> which I think is how these states are supposed to be protected. Smells
> a bit fishy still, would be good to make a bit clearer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
@ 2019-11-12 19:58         ` Francisco Jerez
  0 siblings, 0 replies; 32+ messages in thread
From: Francisco Jerez @ 2019-11-12 19:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 4189 bytes --]

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Francisco Jerez (2019-07-24 21:37:24)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Since userspace has the ability to bypass the CPU cache from within its
>> > unprivileged command stream, we have to flush the CPU cache to memory
>> > in order to overwrite the previous contents on creation.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > Cc: stablevger.kernel.org
>> > ---
>> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
>> >  1 file changed, 7 insertions(+), 19 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > index d2a1158868e7..f752b326d399 100644
>> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>> >  {
>> >       struct drm_i915_gem_object *obj;
>> >       struct address_space *mapping;
>> > -     unsigned int cache_level;
>> >       gfp_t mask;
>> >       int ret;
>> >  
>> > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>> >       obj->write_domain = I915_GEM_DOMAIN_CPU;
>> >       obj->read_domains = I915_GEM_DOMAIN_CPU;
>> >  
>> > -     if (HAS_LLC(i915))
>> > -             /* On some devices, we can have the GPU use the LLC (the CPU
>> > -              * cache) for about a 10% performance improvement
>> > -              * compared to uncached.  Graphics requests other than
>> > -              * display scanout are coherent with the CPU in
>> > -              * accessing this cache.  This means in this mode we
>> > -              * don't need to clflush on the CPU side, and on the
>> > -              * GPU side we only need to flush internal caches to
>> > -              * get data visible to the CPU.
>> > -              *
>> > -              * However, we maintain the display planes as UC, and so
>> > -              * need to rebind when first used as such.
>> > -              */
>> > -             cache_level = I915_CACHE_LLC;
>> > -     else
>> > -             cache_level = I915_CACHE_NONE;
>> > -
>> > -     i915_gem_object_set_cache_coherency(obj, cache_level);
>> > +     /*
>> > +      * Note that userspace has control over cache-bypass
>> > +      * via its command stream, so even on LLC architectures
>> > +      * we have to flush out the CPU cache to memory to
>> > +      * clear previous contents.
>> > +      */
>> > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>> >  
>> 
>> I'm not sure if you've seen my comments about this in an internal thread
>> you were CC'ed to: I don't think this patch will have the intended
>> effect.  The various clflushes this could trigger before the first use
>> of the buffer are conditional on "obj->cache_dirty &
>> ~obj->cache_coherent", which will always be false on LLC platforms
>> AFAICT, because on those platforms i915_gem_object_set_cache_coherency()
>> will always set bit 0 of obj->cache_coherent.
>
> You only need to flush the stale written-to cachelines, so that the page
> content is correct on reuse of the foreign struct page. After that point,
> the CPU cache is managed by the client.

Point was that the code above wouldn't have necessarily led to *any*
flushing on Gen11+ platforms, not even of stale written-to cachelines,
because the various clflushes this could have triggered before the first
use of the buffer were conditional on "obj->cache_dirty &
~obj->cache_coherent", which would still have been false on LLC
platforms.

Anyway you may have fixed that in the next revision of this patch you
sent today by doing things in a completely different way.  But you
didn't answer my question asking why you are doing this on all
platforms.  Cache bypass isn't available on ICL nor earlier platforms,
is it?  And it's been defeatured on TGL:

https://www.spinics.net/lists/intel-gfx/msg219552.html

> -Chris



[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use
@ 2019-11-12 19:58         ` Francisco Jerez
  0 siblings, 0 replies; 32+ messages in thread
From: Francisco Jerez @ 2019-11-12 19:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 4189 bytes --]

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Francisco Jerez (2019-07-24 21:37:24)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Since userspace has the ability to bypass the CPU cache from within its
>> > unprivileged command stream, we have to flush the CPU cache to memory
>> > in order to overwrite the previous contents on creation.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > Cc: stablevger.kernel.org
>> > ---
>> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++-----------------
>> >  1 file changed, 7 insertions(+), 19 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > index d2a1158868e7..f752b326d399 100644
>> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>> >  {
>> >       struct drm_i915_gem_object *obj;
>> >       struct address_space *mapping;
>> > -     unsigned int cache_level;
>> >       gfp_t mask;
>> >       int ret;
>> >  
>> > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
>> >       obj->write_domain = I915_GEM_DOMAIN_CPU;
>> >       obj->read_domains = I915_GEM_DOMAIN_CPU;
>> >  
>> > -     if (HAS_LLC(i915))
>> > -             /* On some devices, we can have the GPU use the LLC (the CPU
>> > -              * cache) for about a 10% performance improvement
>> > -              * compared to uncached.  Graphics requests other than
>> > -              * display scanout are coherent with the CPU in
>> > -              * accessing this cache.  This means in this mode we
>> > -              * don't need to clflush on the CPU side, and on the
>> > -              * GPU side we only need to flush internal caches to
>> > -              * get data visible to the CPU.
>> > -              *
>> > -              * However, we maintain the display planes as UC, and so
>> > -              * need to rebind when first used as such.
>> > -              */
>> > -             cache_level = I915_CACHE_LLC;
>> > -     else
>> > -             cache_level = I915_CACHE_NONE;
>> > -
>> > -     i915_gem_object_set_cache_coherency(obj, cache_level);
>> > +     /*
>> > +      * Note that userspace has control over cache-bypass
>> > +      * via its command stream, so even on LLC architectures
>> > +      * we have to flush out the CPU cache to memory to
>> > +      * clear previous contents.
>> > +      */
>> > +     i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>> >  
>> 
>> I'm not sure if you've seen my comments about this in an internal thread
>> you were CC'ed to: I don't think this patch will have the intended
>> effect.  The various clflushes this could trigger before the first use
>> of the buffer are conditional on "obj->cache_dirty &
>> ~obj->cache_coherent", which will always be false on LLC platforms
>> AFAICT, because on those platforms i915_gem_object_set_cache_coherency()
>> will always set bit 0 of obj->cache_coherent.
>
> You only need to flush the stale written-to cachelines, so that the page
> content is correct on reuse of the foreign struct page. After that point,
> the CPU cache is managed by the client.

Point was that the code above wouldn't have necessarily led to *any*
flushing on Gen11+ platforms, not even of stale written-to cachelines,
because the various clflushes this could have triggered before the first
use of the buffer were conditional on "obj->cache_dirty &
~obj->cache_coherent", which would still have been false on LLC
platforms.

Anyway you may have fixed that in the next revision of this patch you
sent today by doing things in a completely different way.  But you
didn't answer my question asking why you are doing this on all
platforms.  Cache bypass isn't available on ICL nor earlier platforms,
is it?  And it's been defeatured on TGL:

https://www.spinics.net/lists/intel-gfx/msg219552.html

> -Chris



[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-11-12 19:58 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-18 14:54 [PATCH 1/4] drm/i915: Drop wmb() inside pread_gtt Chris Wilson
2019-07-18 14:54 ` [PATCH 2/4] drm/i915: Use maximum write flush for pwrite_gtt Chris Wilson
2019-07-18 18:28   ` Ville Syrjälä
2019-07-18 19:19     ` Chris Wilson
2019-07-19  0:45   ` Sasha Levin
2019-07-18 14:54 ` [PATCH 3/4] drm/i915: Flush all user surfaces prior to first use Chris Wilson
2019-07-19 10:01   ` Joonas Lahtinen
2019-07-19 10:18   ` Lionel Landwerlin
2019-07-19 10:21     ` Chris Wilson
2019-07-19 22:55       ` Jason Ekstrand
2019-07-20 10:49         ` Chris Wilson
2019-07-24 20:37   ` Francisco Jerez
2019-11-12  9:38     ` Chris Wilson
2019-11-12  9:38       ` [Intel-gfx] " Chris Wilson
2019-11-12 19:58       ` Francisco Jerez
2019-11-12 19:58         ` [Intel-gfx] " Francisco Jerez
2019-07-18 14:54 ` [PATCH 4/4] drm/i915: Flush stale cachelines on set-cache-level Chris Wilson
2019-07-19 10:03   ` Joonas Lahtinen
2019-07-19 10:03     ` Joonas Lahtinen
2019-11-12  9:09   ` [Intel-gfx] " Daniel Vetter
2019-11-12  9:09     ` Daniel Vetter
2019-11-12  9:09     ` Daniel Vetter
2019-11-12  9:42     ` Chris Wilson
2019-11-12  9:42       ` Chris Wilson
2019-11-12 10:57       ` Daniel Vetter
2019-11-12 10:57         ` Daniel Vetter
2019-11-12 12:08         ` Daniel Vetter
2019-11-12 12:08           ` Daniel Vetter
2019-07-18 15:35 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915: Drop wmb() inside pread_gtt Patchwork
2019-07-18 16:22 ` ✓ Fi.CI.BAT: success " Patchwork
2019-07-18 17:29 ` ✓ Fi.CI.IGT: " Patchwork
2019-07-18 18:23 ` [PATCH 1/4] " Ville Syrjälä

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.