* [PATCH v10 0/6] Use drm_clflush* instead of clflush
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
This patch series re-works a few i915 functions to use drm_clflush_virt_range
instead of calling clflush or clflushopt directly. This prevents build errors
on non-x86 architectures.
v2: s/PAGE_SIZE/sizeof(value)/ in "Re-work intel_write_status_page" and add
more patches to convert additional clflush/clflushopt calls to drm_clflush*.
(Michael Cheng)
v3: Drop invalidate_csb_entries and directly invoke drm_clflush_virt_ran
v4: Remove extra memory barriers
v5: s/cache_clflush_range/drm_clflush_virt_range
v6: Fix up "Drop invalidate_csb_entries" to use correct parameters. Also
added in arm64 support for drm_clflush_virt_range.
v7: Re-order patches, and use correct macro for dcache flush for arm64.
v8: Remove ifdef for asm/cacheflush.
v9: Rebased
v10: Replaced asm/cacheflush with linux/cacheflush
Michael Cheng (6):
drm: Add arch arm64 for drm_clflush_virt_range
drm/i915/gt: Re-work intel_write_status_page
drm/i915/gt: Drop invalidate_csb_entries
drm/i915/gt: Re-work reset_csb
drm/i915/: Re-work clflush_write32
drm/i915/gt: replace cache_clflush_range
drivers/gpu/drm/drm_cache.c | 6 ++++++
.../gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-----
drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 +++++------
drivers/gpu/drm/i915/gt/intel_engine.h | 13 ++++--------
.../drm/i915/gt/intel_execlists_submission.c | 20 +++++++------------
drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +-
drivers/gpu/drm/i915/gt/intel_ppgtt.c | 2 +-
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
8 files changed, 29 insertions(+), 36 deletions(-)
--
2.25.1
* [PATCH v10 1/6] drm: Add arch arm64 for drm_clflush_virt_range
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Add arm64 support for drm_clflush_virt_range. dcache_clean_inval_poc
performs a flush by first performing a clean, followed by an invalidation
operation.
v2 (Michael Cheng): Use correct macro for cleaning and invalidating the
dcache.
v3 (Michael Cheng): Remove ifdef for asm/cacheflush.h
v4 (Michael Cheng): Rebase
v5 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
drivers/gpu/drm/drm_cache.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 66597e411764..2e233f53331e 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -28,6 +28,7 @@
* Authors: Thomas Hellström <thomas-at-tungstengraphics-dot-com>
*/
+#include <linux/cacheflush.h>
#include <linux/cc_platform.h>
#include <linux/export.h>
#include <linux/highmem.h>
@@ -174,6 +175,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
if (wbinvd_on_all_cpus())
pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+ void *end = addr + length;
+ dcache_clean_inval_poc((unsigned long)addr, (unsigned long)end);
+
#else
WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
#endif
--
2.25.1
* [PATCH v10 2/6] drm/i915/gt: Re-work intel_write_status_page
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Re-work intel_write_status_page to use drm_clflush_virt_range. This
will prevent compiler errors when building for non-x86 architectures.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
drivers/gpu/drm/i915/gt/intel_engine.h | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 0e353d8c2bc8..986777c2430d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -4,6 +4,7 @@
#include <asm/cacheflush.h>
#include <drm/drm_util.h>
+#include <drm/drm_cache.h>
#include <linux/hashtable.h>
#include <linux/irq_work.h>
@@ -143,15 +144,9 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
* of extra paranoia to try and ensure that the HWS takes the value
* we give and that it doesn't end up trapped inside the CPU!
*/
- if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
- mb();
- clflush(&engine->status_page.addr[reg]);
- engine->status_page.addr[reg] = value;
- clflush(&engine->status_page.addr[reg]);
- mb();
- } else {
- WRITE_ONCE(engine->status_page.addr[reg], value);
- }
+ drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
+ WRITE_ONCE(engine->status_page.addr[reg], value);
+ drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
}
/*
--
2.25.1
* [PATCH v10 3/6] drm/i915/gt: Drop invalidate_csb_entries
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
This saves one function call and prevents compiler errors when
building for non-x86 architectures.
v2(Michael Cheng): Drop invalidate_csb_entries function and directly
invoke drm_clflush_virt_range. Thanks to Tvrtko for the
suggestion.
v3(Michael Cheng): Use correct parameters for drm_clflush_virt_range.
Thanks to Tvrtko for pointing this out.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
.../gpu/drm/i915/gt/intel_execlists_submission.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 9bb7c863172f..6186a5e4b191 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1646,12 +1646,6 @@ cancel_port_requests(struct intel_engine_execlists * const execlists,
return inactive;
}
-static void invalidate_csb_entries(const u64 *first, const u64 *last)
-{
- clflush((void *)first);
- clflush((void *)last);
-}
-
/*
* Starting with Gen12, the status has a new format:
*
@@ -1999,7 +1993,7 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
* the wash as hardware, working or not, will need to do the
* invalidation before.
*/
- invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
+ drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
/*
* We assume that any event reflects a change in context flow
@@ -2783,8 +2777,9 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
/* Check that the GPU does indeed update the CSB entries! */
memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
- invalidate_csb_entries(&execlists->csb_status[0],
- &execlists->csb_status[reset_value]);
+ drm_clflush_virt_range(&execlists->csb_status[0],
+ execlists->csb_size *
+ sizeof(execlists->csb_status[0]));
/* Once more for luck and our trusty paranoia */
ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
--
2.25.1
* [PATCH v10 4/6] drm/i915/gt: Re-work reset_csb
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Use drm_clflush_virt_range instead of directly invoking clflush. This
will prevent compiler errors when building for non-x86 architectures.
v2(Michael Cheng): Remove extra clflush
v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range
takes care of it.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 6186a5e4b191..11b864fd68a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2945,9 +2945,8 @@ reset_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
{
struct intel_engine_execlists * const execlists = &engine->execlists;
- mb(); /* paranoia: read the CSB pointers from after the reset */
- clflush(execlists->csb_write);
- mb();
+ drm_clflush_virt_range(execlists->csb_write,
+ sizeof(execlists->csb_write));
inactive = process_csb(engine, inactive); /* drain preemption events */
--
2.25.1
* [PATCH v10 5/6] drm/i915/: Re-work clflush_write32
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Use drm_clflush_virt_range instead of clflushopt and remove the memory
barrier, since drm_clflush_virt_range takes care of that.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 498b458fd784..0854276ff7ba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
{
if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
- if (flushes & CLFLUSH_BEFORE) {
- clflushopt(addr);
- mb();
- }
+ if (flushes & CLFLUSH_BEFORE)
+ drm_clflush_virt_range(addr, sizeof(addr));
*addr = value;
@@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
* to ensure ordering of clflush wrt to the system.
*/
if (flushes & CLFLUSH_AFTER)
- clflushopt(addr);
+ drm_clflush_virt_range(addr, sizeof(addr));
} else
*addr = value;
}
--
2.25.1
* [PATCH v10 6/6] drm/i915/gt: replace cache_clflush_range
From: Michael Cheng @ 2022-02-10 18:36 UTC
To: intel-gfx
Cc: tvrtko.ursulin, michael.cheng, balasubramani.vivekanandan,
wayne.boyer, casey.g.bowman, lucas.demarchi, dri-devel
Replace all occurrences of clflush_cache_range with drm_clflush_virt_range.
This will prevent compile errors on non-x86 platforms.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 ++++++------
drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +-
drivers/gpu/drm/i915/gt/intel_ppgtt.c | 2 +-
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
5 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index c43e724afa9f..d0999e92621b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -444,11 +444,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
pd = pdp->entry[gen8_pd_index(idx, 2)];
}
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
}
} while (1);
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
return idx;
}
@@ -532,7 +532,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
}
} while (rem >= page_size && index < I915_PDES);
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
/*
* Is it safe to mark the 2M block as 64K? -- Either we have
@@ -548,7 +548,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
I915_GTT_PAGE_SIZE_2M)))) {
vaddr = px_vaddr(pd);
vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
page_size = I915_GTT_PAGE_SIZE_64K;
/*
@@ -569,7 +569,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
for (i = 1; i < index; i += 16)
memset64(vaddr + i, encode, 15);
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
}
}
@@ -617,7 +617,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
- clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
+ drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
}
static int gen8_init_scratch(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 11b864fd68a5..67dd4b1fc185 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2823,7 +2823,7 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
sanitize_hwsp(engine);
/* And scrub the dirty cachelines for the HWSP */
- clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
+ drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine);
}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 0d6bbc8c57f2..9b594be9102f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -255,7 +255,7 @@ fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
void *vaddr = __px_vaddr(p);
memset64(vaddr, val, count);
- clflush_cache_range(vaddr, PAGE_SIZE);
+ drm_clflush_virt_range(vaddr, PAGE_SIZE);
}
static void poison_scratch_page(struct drm_i915_gem_object *scratch)
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 48e6e2f87700..bd474a5123cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -90,7 +90,7 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
u64 * const vaddr = __px_vaddr(pdma);
vaddr[idx] = encoded_entry;
- clflush_cache_range(&vaddr[idx], sizeof(u64));
+ drm_clflush_virt_range(&vaddr[idx], sizeof(u64));
}
void
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0d..89020706adc4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3573,7 +3573,7 @@ static void guc_sanitize(struct intel_engine_cs *engine)
sanitize_hwsp(engine);
/* And scrub the dirty cachelines for the HWSP */
- clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
+ drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine);
}
--
2.25.1
* Re: [PATCH v10 1/6] drm: Add arch arm64 for drm_clflush_virt_range
From: Matt Roper @ 2022-02-22 21:49 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:31AM -0800, Michael Cheng wrote:
> Add arm64 support for drm_clflush_virt_range. dcache_clean_inval_poc
> performs a flush by first performing a clean, followed by an invalidation
> operation.
>
> v2 (Michael Cheng): Use correct macro for cleaning and invalidating the
> dcache.
>
> v3 (Michael Cheng): Remove ifdef for asm/cacheflush.h
>
> v4 (Michael Cheng): Rebase
>
> v5 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h
Note that you only really need to indicate that you're the one making
these updates in cases where you're picking up someone else's patch and
carrying it forward; otherwise it's pretty clear that you were also the
author of v2-v5.
However, when possible it is a good idea to indicate who suggested
various changes you're making. E.g., I think a lot of these were based
on feedback from Tvrtko?
>
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Change appears to accurately implement the same type of cache flush as
what we have on the x86 backend.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
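For readers comparing the two backends: the generic helper takes an address and a length, while the arm64 routine in the hunk below is called with a start and an end address. A minimal userspace sketch of that conversion follows. The struct and function names are hypothetical, and it assumes the end address is exclusive (one past the last byte); it performs no cache maintenance itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of bounds, mirroring the (start, end) arguments. */
struct example_bounds {
	uintptr_t start; /* first byte of the range */
	uintptr_t end;   /* one past the last byte (assumed exclusive) */
};

/*
 * Convert an (addr, length) pair into (start, end) bounds.
 * Uses integer arithmetic rather than arithmetic on void *, which
 * is a GNU extension and not portable C.
 */
static struct example_bounds example_bounds_from_length(void *addr,
							 unsigned long length)
{
	struct example_bounds b;

	b.start = (uintptr_t)addr;
	b.end = b.start + length;
	return b;
}
```

Calling example_bounds_from_length(buf, sizeof(buf)) on a 16-byte buffer yields bounds that are exactly 16 bytes apart.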
> ---
> drivers/gpu/drm/drm_cache.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> index 66597e411764..2e233f53331e 100644
> --- a/drivers/gpu/drm/drm_cache.c
> +++ b/drivers/gpu/drm/drm_cache.c
> @@ -28,6 +28,7 @@
> * Authors: Thomas Hellström <thomas-at-tungstengraphics-dot-com>
> */
>
> +#include <linux/cacheflush.h>
> #include <linux/cc_platform.h>
> #include <linux/export.h>
> #include <linux/highmem.h>
> @@ -174,6 +175,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
>
> if (wbinvd_on_all_cpus())
> pr_err("Timed out waiting for cache flush\n");
> +
> +#elif defined(CONFIG_ARM64)
> + void *end = addr + length;
> + dcache_clean_inval_poc((unsigned long)addr, (unsigned long)end);
> +
> #else
> WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
> #endif
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
* Re: [PATCH v10 2/6] drm/i915/gt: Re-work intel_write_status_page
From: Matt Roper @ 2022-02-22 22:04 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:32AM -0800, Michael Cheng wrote:
> Re-work intel_write_status_page to use drm_clflush_virt_range. This
> will prevent compiler errors when building for non-x86 architectures.
>
It looks like this will also cause old x86 CPUs that don't have clflush
to do an extra wbinvd that they didn't do before; based on commit
9a29dd85a09d ("drm/i915: Fixup intel_write_status_page() for old CPUs
without clflush") we were just hoping that they were sufficiently
coherent that we can get away without extra flushing.
As far as I can see, this function is only used from a selftest, not
from real driver codepaths, so the extra flushing shouldn't have any
negative impact on end users.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
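The behaviour change described above can be modelled as a small decision table. This is only a sketch of the reasoning, not driver code: the enum and function names are hypothetical, and the "after" case assumes the x86 path of drm_clflush_virt_range falls back to wbinvd_on_all_cpus() when clflush is unavailable, as the drm_cache.c context in patch 1 suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what happens around the status page write. */
enum example_flush {
	EXAMPLE_NO_FLUSH, /* plain WRITE_ONCE(), no cache maintenance */
	EXAMPLE_CLFLUSH,  /* per-cacheline flush of the written range */
	EXAMPLE_WBINVD,   /* write back and invalidate the whole cache */
};

/* Before the patch: flush only when the CPU has clflush. */
static enum example_flush example_flush_before(bool has_clflush)
{
	return has_clflush ? EXAMPLE_CLFLUSH : EXAMPLE_NO_FLUSH;
}

/* After the patch (x86, assumed): CPUs without clflush take the wbinvd path. */
static enum example_flush example_flush_after(bool has_clflush)
{
	return has_clflush ? EXAMPLE_CLFLUSH : EXAMPLE_WBINVD;
}
```

Only the has_clflush == false row differs between the two functions, which is the extra wbinvd mentioned above.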
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_engine.h | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 0e353d8c2bc8..986777c2430d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -4,6 +4,7 @@
>
> #include <asm/cacheflush.h>
> #include <drm/drm_util.h>
> +#include <drm/drm_cache.h>
>
> #include <linux/hashtable.h>
> #include <linux/irq_work.h>
> @@ -143,15 +144,9 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
> * of extra paranoia to try and ensure that the HWS takes the value
> * we give and that it doesn't end up trapped inside the CPU!
> */
> - if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
> - mb();
> - clflush(&engine->status_page.addr[reg]);
> - engine->status_page.addr[reg] = value;
> - clflush(&engine->status_page.addr[reg]);
> - mb();
> - } else {
> - WRITE_ONCE(engine->status_page.addr[reg], value);
> - }
> + drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
> + WRITE_ONCE(engine->status_page.addr[reg], value);
> + drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
> }
>
> /*
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
* Re: [PATCH v10 3/6] drm/i915/gt: Drop invalidate_csb_entries
From: Matt Roper @ 2022-02-22 22:31 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:33AM -0800, Michael Cheng wrote:
> Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
> This saves one function call and prevents compiler errors when
> building for non-x86 architectures.
>
> v2(Michael Cheng): Drop invalidate_csb_entries function and directly
> invoke drm_clflush_virt_range. Thanks to Tvrtko for the
> suggestion.
>
> v3(Michael Cheng): Use correct parameters for drm_clflush_virt_range.
> Thanks to Tvrtko for pointing this out.
>
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
> ---
> .../gpu/drm/i915/gt/intel_execlists_submission.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 9bb7c863172f..6186a5e4b191 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -1646,12 +1646,6 @@ cancel_port_requests(struct intel_engine_execlists * const execlists,
> return inactive;
> }
>
> -static void invalidate_csb_entries(const u64 *first, const u64 *last)
> -{
> - clflush((void *)first);
> - clflush((void *)last);
> -}
> -
> /*
> * Starting with Gen12, the status has a new format:
> *
> @@ -1999,7 +1993,7 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
> * the wash as hardware, working or not, will need to do the
> * invalidation before.
> */
> - invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
> + drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
>
> /*
> * We assume that any event reflects a change in context flow
> @@ -2783,8 +2777,9 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>
> /* Check that the GPU does indeed update the CSB entries! */
> memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
> - invalidate_csb_entries(&execlists->csb_status[0],
> - &execlists->csb_status[reset_value]);
> + drm_clflush_virt_range(&execlists->csb_status[0],
I think you could simplify the parameter slightly by just writing it as
'execlists->csb_status'
> + execlists->csb_size *
> + sizeof(execlists->csb_status[0]));
The existing code only issues a clflush for the first and last entries
rather than the range from 0..reset_value, but since there are only a
maximum of 12 u64 entries, which fits into two cachelines, the end
result should be the same either way.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
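The cacheline arithmetic behind the comment above can be checked with a short sketch. It assumes a 64-byte cacheline and, for the two-cacheline result, a buffer that starts on a cacheline boundary; the macro and helper names are hypothetical and the code only counts lines, it does not flush anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size; 64 bytes is typical on x86 but not guaranteed. */
#define EXAMPLE_CACHELINE_SIZE 64u

/* Hypothetical helper: cachelines overlapped by [addr, addr + len). */
static size_t example_cachelines_touched(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = addr / EXAMPLE_CACHELINE_SIZE;             /* line of first byte */
	last = (addr + len - 1) / EXAMPLE_CACHELINE_SIZE;  /* line of last byte */
	return (size_t)(last - first + 1);
}
```

With 12 u64 entries (96 bytes) starting at an aligned address, the helper reports two cachelines, so flushing the first and last entries reaches the same lines as flushing the whole range.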
>
> /* Once more for luck and our trusty paranoia */
> ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
* Re: [PATCH v10 4/6] drm/i915/gt: Re-work reset_csb
From: Matt Roper @ 2022-02-22 22:35 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:34AM -0800, Michael Cheng wrote:
> Use drm_clflush_virt_range instead of directly invoking clflush. This
> will prevent compiler errors when building for non-x86 architectures.
>
> v2(Michael Cheng): Remove extra clflush
>
> v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range
> takes care of it.
>
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 6186a5e4b191..11b864fd68a5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2945,9 +2945,8 @@ reset_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
> {
> struct intel_engine_execlists * const execlists = &engine->execlists;
>
> - mb(); /* paranoia: read the CSB pointers from after the reset */
> - clflush(execlists->csb_write);
> - mb();
> + drm_clflush_virt_range(execlists->csb_write,
> + sizeof(execlists->csb_write));
I think you technically want sizeof(execlists->csb_write[0]) here,
right? I.e., the size of the value (32-bits), not the size of the
pointer (32 or 64 depending on architecture). Not that it will really
change the behavior since it all works out to a single cacheline in the
end.
Aside from that,
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
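The pointer-versus-pointee distinction raised above is easy to demonstrate in isolation. The struct below is a hypothetical stand-in for the driver's field, assuming only that csb_write points at a 32-bit value as the review states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's CSB write pointer field. */
struct example_execlists {
	uint32_t *csb_write;
};

/* Size of the pointer itself: 4 or 8 bytes depending on the architecture. */
static size_t example_pointer_size(const struct example_execlists *el)
{
	return sizeof(el->csb_write);
}

/* Size of the 32-bit value the pointer refers to: 4 bytes. */
static size_t example_pointee_size(const struct example_execlists *el)
{
	/* sizeof does not evaluate its operand, so no dereference happens. */
	return sizeof(el->csb_write[0]);
}
```

On a 64-bit build the first helper returns 8 and the second returns 4; either length stays within one cacheline here, which is why the behaviour does not change in practice.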
>
> inactive = process_csb(engine, inactive); /* drain preemption events */
>
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
* Re: [PATCH v10 5/6] drm/i915/: Re-work clflush_write32
From: Matt Roper @ 2022-02-22 22:37 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:35AM -0800, Michael Cheng wrote:
> Use drm_clflush_virt_range instead of clflushopt and remove the memory
> barrier, since drm_clflush_virt_range takes care of that.
>
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
> ---
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 498b458fd784..0854276ff7ba 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
> static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
> {
> if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
> - if (flushes & CLFLUSH_BEFORE) {
> - clflushopt(addr);
> - mb();
> - }
> + if (flushes & CLFLUSH_BEFORE)
> + drm_clflush_virt_range(addr, sizeof(addr));
This is another case where it should technically be sizeof(*addr),
although in practice it won't change the behavior.
>
> *addr = value;
>
> @@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
> * to ensure ordering of clflush wrt to the system.
> */
> if (flushes & CLFLUSH_AFTER)
> - clflushopt(addr);
> + drm_clflush_virt_range(addr, sizeof(addr));
Ditto.
Aside from those,
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
> } else
> *addr = value;
> }
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
* Re: [PATCH v10 6/6] drm/i915/gt: replace cache_clflush_range
From: Matt Roper @ 2022-02-22 22:40 UTC
To: Michael Cheng
Cc: tvrtko.ursulin, balasubramani.vivekanandan, wayne.boyer,
intel-gfx, casey.g.bowman, lucas.demarchi, dri-devel
On Thu, Feb 10, 2022 at 10:36:36AM -0800, Michael Cheng wrote:
> Replace all occurrences of clflush_cache_range with drm_clflush_virt_range.
> This will prevent compile errors on non-x86 platforms.
>
> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
> ---
> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 ++++++------
> drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
> drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +-
> drivers/gpu/drm/i915/gt/intel_ppgtt.c | 2 +-
> drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
> 5 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index c43e724afa9f..d0999e92621b 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -444,11 +444,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
> pd = pdp->entry[gen8_pd_index(idx, 2)];
> }
>
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
> vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> }
> } while (1);
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
>
> return idx;
> }
> @@ -532,7 +532,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> }
> } while (rem >= page_size && index < I915_PDES);
>
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
>
> /*
> * Is it safe to mark the 2M block as 64K? -- Either we have
> @@ -548,7 +548,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> I915_GTT_PAGE_SIZE_2M)))) {
> vaddr = px_vaddr(pd);
> vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
> page_size = I915_GTT_PAGE_SIZE_64K;
>
> /*
> @@ -569,7 +569,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> for (i = 1; i < index; i += 16)
> memset64(vaddr + i, encode, 15);
>
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
> }
> }
>
> @@ -617,7 +617,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>
> vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
> - clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
> + drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
> }
>
> static int gen8_init_scratch(struct i915_address_space *vm)
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 11b864fd68a5..67dd4b1fc185 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2823,7 +2823,7 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
> sanitize_hwsp(engine);
>
> /* And scrub the dirty cachelines for the HWSP */
> - clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
> + drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
>
> intel_engine_reset_pinned_contexts(engine);
> }
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 0d6bbc8c57f2..9b594be9102f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -255,7 +255,7 @@ fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
> void *vaddr = __px_vaddr(p);
>
> memset64(vaddr, val, count);
> - clflush_cache_range(vaddr, PAGE_SIZE);
> + drm_clflush_virt_range(vaddr, PAGE_SIZE);
> }
>
> static void poison_scratch_page(struct drm_i915_gem_object *scratch)
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 48e6e2f87700..bd474a5123cb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -90,7 +90,7 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
> u64 * const vaddr = __px_vaddr(pdma);
>
> vaddr[idx] = encoded_entry;
> - clflush_cache_range(&vaddr[idx], sizeof(u64));
> + drm_clflush_virt_range(&vaddr[idx], sizeof(u64));
> }
>
> void
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index b3a429a92c0d..89020706adc4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3573,7 +3573,7 @@ static void guc_sanitize(struct intel_engine_cs *engine)
> sanitize_hwsp(engine);
>
> /* And scrub the dirty cachelines for the HWSP */
> - clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
> + drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
>
> intel_engine_reset_pinned_contexts(engine);
> }
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
Thread overview: 13+ messages
2022-02-10 18:36 [PATCH v10 0/6] Use drm_clflush* instead of clflush Michael Cheng
2022-02-10 18:36 ` [PATCH v10 1/6] drm: Add arch arm64 for drm_clflush_virt_range Michael Cheng
2022-02-22 21:49 ` Matt Roper
2022-02-10 18:36 ` [PATCH v10 2/6] drm/i915/gt: Re-work intel_write_status_page Michael Cheng
2022-02-22 22:04 ` Matt Roper
2022-02-10 18:36 ` [PATCH v10 3/6] drm/i915/gt: Drop invalidate_csb_entries Michael Cheng
2022-02-22 22:31 ` Matt Roper
2022-02-10 18:36 ` [PATCH v10 4/6] drm/i915/gt: Re-work reset_csb Michael Cheng
2022-02-22 22:35 ` Matt Roper
2022-02-10 18:36 ` [PATCH v10 5/6] drm/i915/: Re-work clflush_write32 Michael Cheng
2022-02-22 22:37 ` Matt Roper
2022-02-10 18:36 ` [PATCH v10 6/6] drm/i915/gt: replace cache_clflush_range Michael Cheng
2022-02-22 22:40 ` Matt Roper