* [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support
@ 2023-09-27 11:00 Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 1/5] drm/xe/pat: trim the xelp PAT table Matthew Auld
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe

Branch available here:
https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads

Series directly depends on the patches here:
https://patchwork.freedesktop.org/series/124225/

The goal here is to allow userspace to directly control the pat_index when
mapping memory via the ppGTT, in addition to the CPU caching mode. This is very
much needed on newer igpu platforms which allow incoherent GT access, where the
choice over the cache level and expected coherency is best left to userspace,
depending on the use case. In the future there may also be other attributes
encoded in the pat_index, so giving userspace direct control will also be
needed there.

To support this we add new gem_create uAPI for selecting the CPU caching
mode to use for system memory, including the expected GPU coherency mode. There
are various restrictions on which CPU caching modes are compatible with the
selected coherency mode. With that in place the actual pat_index can now be
provided as part of vm_bind. The only restriction is that the coherency mode of
the pat_index must be at least as coherent as the gem_create coherency mode.
There are also some special cases, such as userptr and dma-buf.
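
As a rough illustration of the intended flow (this sketch is not part of the
series; it uses the uapi names from patches 3 and 5 below, while the handle
plumbing, sync arguments and error handling are elided, and bo_size, vm_id,
gpu_addr and pat_index are placeholders):

    struct drm_xe_gem_create create = {
            .size = bo_size,
            .vm_id = vm_id,
            /* CPU mmaps of this object will be write-back cached... */
            .cpu_caching = XE_GEM_CPU_CACHING_WB,
            /* ...so GPU access must be at least 1way coherent. */
            .coh_mode = XE_GEM_COH_AT_LEAST_1WAY,
    };
    /* the gem_create ioctl returns the new object in create.handle */

    struct drm_xe_vm_bind_op bind = {
            .obj = create.handle,
            .addr = gpu_addr,
            .range = bo_size,
            /*
             * Platform specific index from the Bspec/PRM. Its coherency
             * mode must be at least coh_mode above, so here a 1way or
             * 2way entry.
             */
            .pat_index = pat_index,
    };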

v2:
  - Loads of improvements/tweaks. Main changes are to now allow
    gem_create.coh_mode <= coh_mode(pat_index), rather than it needing to match
    exactly. This simplifies the dma-buf policy from userspace pov. Also we now
    only consider COH_NONE and COH_AT_LEAST_1WAY.
v3:
  - Rebase. Split the pte_encode() refactoring, plus various smaller tweaks and
    fixes.
v4:
  - Rebase on Lucas' new series.
  - Drop UC cache mode.
  - s/smem_cpu_caching/cpu_caching/. Idea is to make VRAM WC explicit in the
    uapi, plus make it more future proof.

-- 
2.41.0



* [Intel-xe] [PATCH v4 1/5] drm/xe/pat: trim the xelp PAT table
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
@ 2023-09-27 11:00 ` Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode Matthew Auld
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, Matt Roper

We don't seem to use PAT indexes 4-7, even though they are defined by
the HW. In a later patch userspace will be able to directly set the
pat_index as part of vm_bind, and we don't want to allow setting 4-7.
Simplest is to just drop them from the table here.

Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_pat.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
index 6efa44556689..2c759ffa35ef 100644
--- a/drivers/gpu/drm/xe/xe_pat.c
+++ b/drivers/gpu/drm/xe/xe_pat.c
@@ -42,10 +42,6 @@ static const u32 xelp_pat_table[] = {
 	[1] = XELP_PAT_WC,
 	[2] = XELP_PAT_WT,
 	[3] = XELP_PAT_UC,
-	[4] = XELP_PAT_WB,
-	[5] = XELP_PAT_WB,
-	[6] = XELP_PAT_WB,
-	[7] = XELP_PAT_WB,
 };
 
 static const u32 xehpc_pat_table[] = {
-- 
2.41.0



* [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 1/5] drm/xe/pat: trim the xelp PAT table Matthew Auld
@ 2023-09-27 11:00 ` Matthew Auld
  2023-09-28  4:41   ` Niranjana Vishwanathapura
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 3/5] drm/xe/uapi: Add support for cache and coherency mode Matthew Auld
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe; +Cc: Matt Roper, Lucas De Marchi

In a later patch userspace will be able to directly set the pat_index
as part of vm_bind. To support this we need to move away from using
xe_cache_level in the low-level routines and instead use the pat_index
directly.
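
The conversion for existing kernel-internal callers is mechanical: resolve the
pat_index from the per-device table once and hand the raw index to the encode
functions. Roughly (simplified sketch of the pattern used in the hunks below;
"offset" stands in for whatever offset the caller computes):

    /* before */
    entry = vm->pt_ops->pde_encode_bo(bo, offset, XE_CACHE_WB);

    /* after: translate the cache level into the device's pat_index up front */
    u16 pat_index = xe->pat.idx[XE_CACHE_WB];

    entry = vm->pt_ops->pde_encode_bo(bo, offset, pat_index);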

v2: Rebase

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_ggtt.c       |  7 +++----
 drivers/gpu/drm/xe/xe_ggtt_types.h |  3 +--
 drivers/gpu/drm/xe/xe_migrate.c    | 19 +++++++++++--------
 drivers/gpu/drm/xe/xe_pt.c         | 11 ++++++-----
 drivers/gpu/drm/xe/xe_pt_types.h   |  8 ++++----
 drivers/gpu/drm/xe/xe_vm.c         | 24 +++++++++++-------------
 6 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 99b54794917e..2334c47c19cc 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -27,7 +27,7 @@
 #define GUC_GGTT_TOP	0xFEE00000
 
 static u64 xelp_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
-				   enum xe_cache_level cache)
+				   u16 pat_index)
 {
 	u64 pte;
 
@@ -41,13 +41,12 @@ static u64 xelp_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
 }
 
 static u64 xelpg_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
-				    enum xe_cache_level cache)
+				    u16 pat_index)
 {
 	struct xe_device *xe = xe_bo_device(bo);
-	u32 pat_index = xe->pat.idx[cache];
 	u64 pte;
 
-	pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, cache);
+	pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, pat_index);
 
 	xe_assert(xe, pat_index <= 3);
 
diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
index 486016ea5b67..d8c584d9a8c3 100644
--- a/drivers/gpu/drm/xe/xe_ggtt_types.h
+++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
@@ -14,8 +14,7 @@ struct xe_bo;
 struct xe_gt;
 
 struct xe_ggtt_pt_ops {
-	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
-			     enum xe_cache_level cache);
+	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset, u16 pat_index);
 };
 
 struct xe_ggtt {
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 258c2269c916..90a1ff1aca9b 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -158,6 +158,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 				 struct xe_vm *vm)
 {
 	struct xe_device *xe = tile_to_xe(tile);
+	u16 pat_index = xe->pat.idx[XE_CACHE_WB];
 	u8 id = tile->id;
 	u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level;
 	u32 map_ofs, level, i;
@@ -189,7 +190,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 		return ret;
 	}
 
-	entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, XE_CACHE_WB);
+	entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, pat_index);
 	xe_pt_write(xe, &vm->pt_root[id]->bo->vmap, 0, entry);
 
 	map_ofs = (num_entries - num_level) * XE_PAGE_SIZE;
@@ -197,7 +198,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 	/* Map the entire BO in our level 0 pt */
 	for (i = 0, level = 0; i < num_entries; level++) {
 		entry = vm->pt_ops->pte_encode_bo(bo, i * XE_PAGE_SIZE,
-						  XE_CACHE_WB, 0);
+						  pat_index, 0);
 
 		xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64, entry);
 
@@ -216,7 +217,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 		     i += vm->flags & XE_VM_FLAG_64K ? XE_64K_PAGE_SIZE :
 		     XE_PAGE_SIZE) {
 			entry = vm->pt_ops->pte_encode_bo(batch, i,
-							  XE_CACHE_WB, 0);
+							  pat_index, 0);
 
 			xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64,
 				  entry);
@@ -241,7 +242,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 			flags = XE_PDE_64K;
 
 		entry = vm->pt_ops->pde_encode_bo(bo, map_ofs + (level - 1) *
-						  XE_PAGE_SIZE, XE_CACHE_WB);
+						  XE_PAGE_SIZE, pat_index);
 		xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE * level, u64,
 			  entry | flags);
 	}
@@ -249,7 +250,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 	/* Write PDE's that point to our BO. */
 	for (i = 0; i < num_entries - num_level; i++) {
 		entry = vm->pt_ops->pde_encode_bo(bo, i * XE_PAGE_SIZE,
-						  XE_CACHE_WB);
+						  pat_index);
 
 		xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE +
 			  (i + 1) * 8, u64, entry);
@@ -261,7 +262,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 
 		level = 2;
 		ofs = map_ofs + XE_PAGE_SIZE * level + 256 * 8;
-		flags = vm->pt_ops->pte_encode_addr(xe, 0, XE_CACHE_WB, level,
+		flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
 						    true, 0);
 
 		/*
@@ -457,6 +458,7 @@ static void emit_pte(struct xe_migrate *m,
 		     struct xe_res_cursor *cur,
 		     u32 size, struct xe_bo *bo)
 {
+	u16 pat_index = m->tile->xe->pat.idx[XE_CACHE_WB];
 	u32 ptes;
 	u64 ofs = at_pt * XE_PAGE_SIZE;
 	u64 cur_ofs;
@@ -500,7 +502,7 @@ static void emit_pte(struct xe_migrate *m,
 			}
 
 			addr = m->q->vm->pt_ops->pte_encode_addr(m->tile->xe,
-								 addr, XE_CACHE_WB,
+								 addr, pat_index,
 								 0, devmem, flags);
 			bb->cs[bb->len++] = lower_32_bits(addr);
 			bb->cs[bb->len++] = upper_32_bits(addr);
@@ -1190,6 +1192,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 	bool first_munmap_rebind = vma &&
 		vma->gpuva.flags & XE_VMA_FIRST_REBIND;
 	struct xe_exec_queue *q_override = !q ? m->q : q;
+	u16 pat_index = xe->pat.idx[XE_CACHE_WB];
 
 	/* Use the CPU if no in syncs and engine is idle */
 	if (no_in_syncs(syncs, num_syncs) && xe_exec_queue_is_idle(q_override)) {
@@ -1261,7 +1264,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 
 			xe_tile_assert(tile, pt_bo->size == SZ_4K);
 
-			addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, XE_CACHE_WB, 0);
+			addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, pat_index, 0);
 			bb->cs[bb->len++] = lower_32_bits(addr);
 			bb->cs[bb->len++] = upper_32_bits(addr);
 		}
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 4d4c6a4c305e..92b512641b4a 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -50,6 +50,7 @@ static struct xe_pt *xe_pt_entry(struct xe_pt_dir *pt_dir, unsigned int index)
 static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
 			     unsigned int level)
 {
+	u16 pat_index = tile_to_xe(tile)->pat.idx[XE_CACHE_WB];
 	u8 id = tile->id;
 
 	if (!vm->scratch_bo[id])
@@ -57,9 +58,9 @@ static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
 
 	if (level > 0)
 		return vm->pt_ops->pde_encode_bo(vm->scratch_pt[id][level - 1]->bo,
-						 0, XE_CACHE_WB);
+						 0, pat_index);
 
-	return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, XE_CACHE_WB, 0);
+	return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, pat_index, 0);
 }
 
 /**
@@ -510,6 +511,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 {
 	struct xe_pt_stage_bind_walk *xe_walk =
 		container_of(walk, typeof(*xe_walk), base);
+	u16 pat_index = tile_to_xe(xe_walk->tile)->pat.idx[xe_walk->cache];
 	struct xe_pt *xe_parent = container_of(parent, typeof(*xe_parent), base);
 	struct xe_vm *vm = xe_walk->vm;
 	struct xe_pt *xe_child;
@@ -526,7 +528,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 
 		pte = vm->pt_ops->pte_encode_vma(is_null ? 0 :
 						 xe_res_dma(curs) + xe_walk->dma_offset,
-						 xe_walk->vma, xe_walk->cache, level);
+						 xe_walk->vma, pat_index, level);
 		pte |= xe_walk->default_pte;
 
 		/*
@@ -591,8 +593,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 			xe_child->is_compact = true;
 		}
 
-		pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0,
-						xe_walk->cache) | flags;
+		pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0, pat_index) | flags;
 		ret = xe_pt_insert_entry(xe_walk, xe_parent, offset, xe_child,
 					 pte);
 	}
diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
index bd6645295fe6..355fa8f014e9 100644
--- a/drivers/gpu/drm/xe/xe_pt_types.h
+++ b/drivers/gpu/drm/xe/xe_pt_types.h
@@ -38,14 +38,14 @@ struct xe_pt {
 
 struct xe_pt_ops {
 	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
-			     enum xe_cache_level cache, u32 pt_level);
+			     u16 pat_index, u32 pt_level);
 	u64 (*pte_encode_vma)(u64 pte, struct xe_vma *vma,
-			      enum xe_cache_level cache, u32 pt_level);
+			      u16 pat_index, u32 pt_level);
 	u64 (*pte_encode_addr)(struct xe_device *xe, u64 addr,
-			       enum xe_cache_level cache,
+			       u16 pat_index,
 			       u32 pt_level, bool devmem, u64 flags);
 	u64 (*pde_encode_bo)(struct xe_bo *bo, u64 bo_offset,
-			     const enum xe_cache_level cache);
+			     const u16 pat_index);
 };
 
 struct xe_pt_entry {
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index beffbb1039d3..962bfd2b0179 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1191,9 +1191,8 @@ static struct drm_gpuva_fn_ops gpuva_ops = {
 	.op_alloc = xe_vm_op_alloc,
 };
 
-static u64 pde_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
+static u64 pde_encode_pat_index(struct xe_device *xe, u16 pat_index)
 {
-	u32 pat_index = xe->pat.idx[cache];
 	u64 pte = 0;
 
 	if (pat_index & BIT(0))
@@ -1205,9 +1204,8 @@ static u64 pde_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
 	return pte;
 }
 
-static u64 pte_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
+static u64 pte_encode_pat_index(struct xe_device *xe, u16 pat_index)
 {
-	u32 pat_index = xe->pat.idx[cache];
 	u64 pte = 0;
 
 	if (pat_index & BIT(0))
@@ -1238,27 +1236,27 @@ static u64 pte_encode_ps(u32 pt_level)
 }
 
 static u64 xelp_pde_encode_bo(struct xe_bo *bo, u64 bo_offset,
-			      const enum xe_cache_level cache)
+			      const u16 pat_index)
 {
 	struct xe_device *xe = xe_bo_device(bo);
 	u64 pde;
 
 	pde = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
 	pde |= XE_PAGE_PRESENT | XE_PAGE_RW;
-	pde |= pde_encode_cache(xe, cache);
+	pde |= pde_encode_pat_index(xe, pat_index);
 
 	return pde;
 }
 
 static u64 xelp_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
-			      enum xe_cache_level cache, u32 pt_level)
+			      u16 pat_index, u32 pt_level)
 {
 	struct xe_device *xe = xe_bo_device(bo);
 	u64 pte;
 
 	pte = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
 	pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
-	pte |= pte_encode_cache(xe, cache);
+	pte |= pte_encode_pat_index(xe, pat_index);
 	pte |= pte_encode_ps(pt_level);
 
 	if (xe_bo_is_vram(bo) || xe_bo_is_stolen_devmem(bo))
@@ -1268,7 +1266,7 @@ static u64 xelp_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
 }
 
 static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
-			       enum xe_cache_level cache, u32 pt_level)
+			       u16 pat_index, u32 pt_level)
 {
 	struct xe_device *xe = xe_vma_vm(vma)->xe;
 
@@ -1277,7 +1275,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
 	if (likely(!xe_vma_read_only(vma)))
 		pte |= XE_PAGE_RW;
 
-	pte |= pte_encode_cache(xe, cache);
+	pte |= pte_encode_pat_index(xe, pat_index);
 	pte |= pte_encode_ps(pt_level);
 
 	if (unlikely(xe_vma_is_null(vma)))
@@ -1287,7 +1285,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
 }
 
 static u64 xelp_pte_encode_addr(struct xe_device *xe, u64 addr,
-				enum xe_cache_level cache,
+				u16 pat_index,
 				u32 pt_level, bool devmem, u64 flags)
 {
 	u64 pte;
@@ -1297,7 +1295,7 @@ static u64 xelp_pte_encode_addr(struct xe_device *xe, u64 addr,
 
 	pte = addr;
 	pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
-	pte |= pte_encode_cache(xe, cache);
+	pte |= pte_encode_pat_index(xe, pat_index);
 	pte |= pte_encode_ps(pt_level);
 
 	if (devmem)
@@ -1701,7 +1699,7 @@ struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id)
 u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile)
 {
 	return vm->pt_ops->pde_encode_bo(vm->pt_root[tile->id]->bo, 0,
-					 XE_CACHE_WB);
+					 tile->xe->pat.idx[XE_CACHE_WB]);
 }
 
 static struct dma_fence *
-- 
2.41.0



* [Intel-xe] [PATCH v4 3/5] drm/xe/uapi: Add support for cache and coherency mode
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 1/5] drm/xe/pat: trim the xelp PAT table Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode Matthew Auld
@ 2023-09-27 11:00 ` Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 4/5] drm/xe/pat: annotate pat_index with " Matthew Auld
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe
  Cc: Filip Hazubski, Lucas De Marchi, Carl Zhang, Effie Yu, Matt Roper

From: Pallavi Mishra <pallavi.mishra@intel.com>

Allow userspace to specify the CPU caching mode to use, in addition to the
coherency mode, during object creation. Modify the gem_create handler and
introduce xe_bo_create_user to replace xe_bo_create. In a later patch we
will support setting the pat_index as part of vm_bind, where the expectation
is that the coherency mode extracted from the pat_index must be at least as
coherent as the one set at object creation.
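
For reference, the compatibility rules enforced in xe_gem_create_ioctl() (see
the hunk below) amount to roughly the following; this helper is purely
illustrative and not part of the patch:

    static bool cpu_caching_is_compatible(u16 coh_mode, u16 cpu_caching,
                                          bool vram, bool scanout)
    {
            if (!coh_mode || coh_mode > XE_GEM_COH_AT_LEAST_1WAY)
                    return false; /* coh_mode must be provided and valid */
            if (!cpu_caching || cpu_caching > XE_GEM_CPU_CACHING_WC)
                    return false; /* cpu_caching must be provided and valid */
            if (vram && cpu_caching != XE_GEM_CPU_CACHING_WC)
                    return false; /* possible VRAM placement requires WC */
            if (scanout && cpu_caching == XE_GEM_CPU_CACHING_WB)
                    return false; /* scanout can't be CPU:WB */
            if (coh_mode == XE_GEM_COH_NONE &&
                cpu_caching == XE_GEM_CPU_CACHING_WB)
                    return false; /* CPU:WB needs at least 1way coherency */
            return true;
    }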

v2
  - s/smem_caching/smem_cpu_caching/ and
    s/XE_GEM_CACHING/XE_GEM_CPU_CACHING/. (Matt Roper)
  - Drop COH_2WAY and just use COH_NONE + COH_AT_LEAST_1WAY; KMD mostly
    just cares that zeroing/swap-in can't be bypassed with the given
    smem_caching mode. (Matt Roper)
  - Fix broken range check for coh_mode and smem_cpu_caching and also
    don't use constant value, but the already defined macros. (José)
  - Prefer switch statement for smem_cpu_caching -> ttm_caching. (José)
  - Add note in kernel-doc for dgpu and coherency modes for system
    memory. (José)
v3 (José):
  - Make sure to reject coh_mode == 0 for VRAM-only.
  - Also make sure to actually pass along the (start, end) for
    __xe_bo_create_locked.
v4
  - Drop UC caching mode. Can be added back if we need it. (Matt Roper)
  - s/smem_cpu_caching/cpu_caching. Idea is that VRAM is always WC, but
    that is currently implicit and KMD controlled. Make it explicit in
    the uapi with the limitation that it currently must be WC. For VRAM
    + SYS objects userspace must now select WC. (José)
  - Make sure to initialize bo_flags. (José)

Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Co-authored-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c       | 97 ++++++++++++++++++++++++++------
 drivers/gpu/drm/xe/xe_bo.h       |  3 +-
 drivers/gpu/drm/xe/xe_bo_types.h | 10 ++++
 drivers/gpu/drm/xe/xe_dma_buf.c  |  5 +-
 include/uapi/drm/xe_drm.h        | 50 +++++++++++++++-
 5 files changed, 143 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 61789c0e88fb..0de463907428 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -326,7 +326,7 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
 	struct xe_device *xe = xe_bo_device(bo);
 	struct xe_ttm_tt *tt;
 	unsigned long extra_pages;
-	enum ttm_caching caching = ttm_cached;
+	enum ttm_caching caching;
 	int err;
 
 	tt = kzalloc(sizeof(*tt), GFP_KERNEL);
@@ -340,13 +340,22 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
 		extra_pages = DIV_ROUND_UP(xe_device_ccs_bytes(xe, bo->size),
 					   PAGE_SIZE);
 
+	switch (bo->cpu_caching) {
+	case XE_GEM_CPU_CACHING_WC:
+		caching = ttm_write_combined;
+		break;
+	default:
+		caching = ttm_cached;
+		break;
+	}
+
 	/*
 	 * Display scanout is always non-coherent with the CPU cache.
 	 *
 	 * For Xe_LPG and beyond, PPGTT PTE lookups are also non-coherent and
 	 * require a CPU:WC mapping.
 	 */
-	if (bo->flags & XE_BO_SCANOUT_BIT ||
+	if ((!bo->cpu_caching && bo->flags & XE_BO_SCANOUT_BIT) ||
 	    (xe->info.graphics_verx100 >= 1270 && bo->flags & XE_BO_PAGETABLE))
 		caching = ttm_write_combined;
 
@@ -1190,9 +1199,10 @@ void xe_bo_free(struct xe_bo *bo)
 	kfree(bo);
 }
 
-struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
+struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 				    struct xe_tile *tile, struct dma_resv *resv,
 				    struct ttm_lru_bulk_move *bulk, size_t size,
+				    u16 cpu_caching, u16 coh_mode,
 				    enum ttm_bo_type type, u32 flags)
 {
 	struct ttm_operation_ctx ctx = {
@@ -1231,6 +1241,8 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	bo->tile = tile;
 	bo->size = size;
 	bo->flags = flags;
+	bo->cpu_caching = cpu_caching;
+	bo->coh_mode = coh_mode;
 	bo->ttm.base.funcs = &xe_gem_object_funcs;
 	bo->props.preferred_mem_class = XE_BO_PROPS_INVALID;
 	bo->props.preferred_gt = XE_BO_PROPS_INVALID;
@@ -1316,10 +1328,11 @@ static int __xe_bo_fixed_placement(struct xe_device *xe,
 }
 
 struct xe_bo *
-xe_bo_create_locked_range(struct xe_device *xe,
-			  struct xe_tile *tile, struct xe_vm *vm,
-			  size_t size, u64 start, u64 end,
-			  enum ttm_bo_type type, u32 flags)
+__xe_bo_create_locked(struct xe_device *xe,
+		      struct xe_tile *tile, struct xe_vm *vm,
+		      size_t size, u64 start, u64 end,
+		      u16 cpu_caching, u16 coh_mode,
+		      enum ttm_bo_type type, u32 flags)
 {
 	struct xe_bo *bo = NULL;
 	int err;
@@ -1340,10 +1353,11 @@ xe_bo_create_locked_range(struct xe_device *xe,
 		}
 	}
 
-	bo = __xe_bo_create_locked(xe, bo, tile, vm ? &vm->resv : NULL,
+	bo = ___xe_bo_create_locked(xe, bo, tile, vm ? &vm->resv : NULL,
 				   vm && !xe_vm_in_fault_mode(vm) &&
 				   flags & XE_BO_CREATE_USER_BIT ?
 				   &vm->lru_bulk_move : NULL, size,
+				   cpu_caching, coh_mode,
 				   type, flags);
 	if (IS_ERR(bo))
 		return bo;
@@ -1377,11 +1391,35 @@ xe_bo_create_locked_range(struct xe_device *xe,
 	return ERR_PTR(err);
 }
 
+struct xe_bo *
+xe_bo_create_locked_range(struct xe_device *xe,
+			  struct xe_tile *tile, struct xe_vm *vm,
+			  size_t size, u64 start, u64 end,
+			  enum ttm_bo_type type, u32 flags)
+{
+	return __xe_bo_create_locked(xe, tile, vm, size, start, end, 0, 0, type, flags);
+}
+
 struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
 				  struct xe_vm *vm, size_t size,
 				  enum ttm_bo_type type, u32 flags)
 {
-	return xe_bo_create_locked_range(xe, tile, vm, size, 0, ~0ULL, type, flags);
+	return __xe_bo_create_locked(xe, tile, vm, size, 0, ~0ULL, 0, 0, type, flags);
+}
+
+static struct xe_bo *xe_bo_create_user(struct xe_device *xe, struct xe_tile *tile,
+				       struct xe_vm *vm, size_t size,
+				       u16 cpu_caching, u16 coh_mode,
+				       enum ttm_bo_type type,
+				       u32 flags)
+{
+	struct xe_bo *bo = __xe_bo_create_locked(xe, tile, vm, size, 0, ~0ULL,
+						 cpu_caching, coh_mode, type,
+						 flags | XE_BO_CREATE_USER_BIT);
+	if (!IS_ERR(bo))
+		xe_bo_unlock_vm_held(bo);
+
+	return bo;
 }
 
 struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
@@ -1764,11 +1802,11 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	struct drm_xe_gem_create *args = data;
 	struct xe_vm *vm = NULL;
 	struct xe_bo *bo;
-	unsigned int bo_flags = XE_BO_CREATE_USER_BIT;
+	unsigned int bo_flags;
 	u32 handle;
 	int err;
 
-	if (XE_IOCTL_DBG(xe, args->extensions) || XE_IOCTL_DBG(xe, args->pad) ||
+	if (XE_IOCTL_DBG(xe, args->extensions) ||
 	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
 		return -EINVAL;
 
@@ -1795,6 +1833,7 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_DBG(xe, args->size & ~PAGE_MASK))
 		return -EINVAL;
 
+	bo_flags = 0;
 	if (args->flags & XE_GEM_CREATE_FLAG_DEFER_BACKING)
 		bo_flags |= XE_BO_DEFER_BACKING;
 
@@ -1810,6 +1849,26 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 		bo_flags |= XE_BO_NEEDS_CPU_ACCESS;
 	}
 
+	if (XE_IOCTL_DBG(xe, !args->coh_mode ||
+			 args->coh_mode > XE_GEM_COH_AT_LEAST_1WAY))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !args->cpu_caching ||
+			 args->cpu_caching > XE_GEM_CPU_CACHING_WC))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, bo_flags & XE_BO_CREATE_VRAM_MASK &&
+			 args->cpu_caching != XE_GEM_CPU_CACHING_WC))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, bo_flags & XE_BO_SCANOUT_BIT &&
+			 args->cpu_caching == XE_GEM_CPU_CACHING_WB))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, args->coh_mode == XE_GEM_COH_NONE &&
+			 args->cpu_caching == XE_GEM_CPU_CACHING_WB))
+		return -EINVAL;
+
 	if (args->vm_id) {
 		vm = xe_vm_lookup(xef, args->vm_id);
 		if (XE_IOCTL_DBG(xe, !vm))
@@ -1821,8 +1880,10 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 		}
 	}
 
-	bo = xe_bo_create(xe, NULL, vm, args->size, ttm_bo_type_device,
-			  bo_flags);
+	bo = xe_bo_create_user(xe, NULL, vm, args->size,
+			       args->cpu_caching, args->coh_mode,
+			       ttm_bo_type_device,
+			       bo_flags);
 	if (IS_ERR(bo)) {
 		err = PTR_ERR(bo);
 		goto out_vm;
@@ -2114,10 +2175,12 @@ int xe_bo_dumb_create(struct drm_file *file_priv,
 	args->size = ALIGN(mul_u32_u32(args->pitch, args->height),
 			   page_size);
 
-	bo = xe_bo_create(xe, NULL, NULL, args->size, ttm_bo_type_device,
-			  XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(xe)) |
-			  XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT |
-			  XE_BO_NEEDS_CPU_ACCESS);
+	bo = xe_bo_create_user(xe, NULL, NULL, args->size,
+			       XE_GEM_CPU_CACHING_WC, XE_GEM_COH_NONE,
+			       ttm_bo_type_device,
+			       XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(xe)) |
+			       XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT |
+			       XE_BO_NEEDS_CPU_ACCESS);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
 
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 5090bdd1e462..eccfa7d187f8 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -83,9 +83,10 @@ struct sg_table;
 struct xe_bo *xe_bo_alloc(void);
 void xe_bo_free(struct xe_bo *bo);
 
-struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
+struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 				    struct xe_tile *tile, struct dma_resv *resv,
 				    struct ttm_lru_bulk_move *bulk, size_t size,
+				    u16 cpu_caching, u16 coh_mode,
 				    enum ttm_bo_type type, u32 flags);
 struct xe_bo *
 xe_bo_create_locked_range(struct xe_device *xe,
diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index 051fe990c133..56f7f9a4975f 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -76,6 +76,16 @@ struct xe_bo {
 	struct llist_node freed;
 	/** @created: Whether the bo has passed initial creation */
 	bool created;
+	/**
+	 * @coh_mode: Coherency setting. Currently only used for userspace
+	 * objects.
+	 */
+	u16 coh_mode;
+	/**
+	 * @cpu_caching: CPU caching mode. Currently only used for userspace
+	 * objects.
+	 */
+	u16 cpu_caching;
 };
 
 #define intel_bo_to_drm_bo(bo) (&(bo)->ttm.base)
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index cfde3be3b0dc..9da5cffeef13 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -214,8 +214,9 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
 	int ret;
 
 	dma_resv_lock(resv, NULL);
-	bo = __xe_bo_create_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
-				   ttm_bo_type_sg, XE_BO_CREATE_SYSTEM_BIT);
+	bo = ___xe_bo_create_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
+				    0, 0, /* Will require 1way or 2way for vm_bind */
+				    ttm_bo_type_sg, XE_BO_CREATE_SYSTEM_BIT);
 	if (IS_ERR(bo)) {
 		ret = PTR_ERR(bo);
 		goto error;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index d48d8e3c898c..260417b60c41 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -456,8 +456,54 @@ struct drm_xe_gem_create {
 	 */
 	__u32 handle;
 
-	/** @pad: MBZ */
-	__u32 pad;
+	/**
+	 * @coh_mode: The coherency mode for this object. This will limit the
+	 * possible @cpu_caching values.
+	 *
+	 * Supported values:
+	 *
+	 * XE_GEM_COH_NONE: GPU access is assumed to be not coherent with
+	 * CPU. CPU caches are not snooped.
+	 *
+	 * XE_GEM_COH_AT_LEAST_1WAY:
+	 *
+	 * CPU-GPU coherency must be at least 1WAY.
+	 *
+	 * If 1WAY then GPU access is coherent with CPU (CPU caches are snooped)
+	 * until GPU acquires. The acquire by the GPU is not tracked by CPU
+	 * caches.
+	 *
+	 * If 2WAY then should be fully coherent between GPU and CPU.  Fully
+	 * tracked by CPU caches. Both CPU and GPU caches are snooped.
+	 *
+	 * Note: On dgpu the GPU device never caches system memory, so the
+	 * device should be thought of as always at least 1WAY coherent. On
+	 * current dgpu HW there is also no way to turn off snooping, so the
+	 * different coherency modes of the pat_index likely make no
+	 * difference for system memory.
+	 */
+#define XE_GEM_COH_NONE			1
+#define XE_GEM_COH_AT_LEAST_1WAY	2
+	__u16 coh_mode;
+
+	/**
+	 * @cpu_caching: The CPU caching mode to select for this object. If
+	 * mmapping the object, the mode selected here will also be used.
+	 *
+	 * Supported values:
+	 *
+	 * XE_GEM_CPU_CACHING_WB: Allocate the pages with write-back caching.
+	 * On iGPU this can't be used for scanout surfaces. The @coh_mode must
+	 * be XE_GEM_COH_AT_LEAST_1WAY. Currently not allowed for objects placed
+	 * in VRAM.
+	 *
+	 * XE_GEM_CPU_CACHING_WC: Allocate the pages as write-combined. This is
+	 * uncached. Any @coh_mode is permitted. Scanout surfaces should likely
+	 * use this. All objects that can be placed in VRAM must use this.
+	 */
+#define XE_GEM_CPU_CACHING_WB                      1
+#define XE_GEM_CPU_CACHING_WC                      2
+	__u16 cpu_caching;
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
-- 
2.41.0



* [Intel-xe] [PATCH v4 4/5] drm/xe/pat: annotate pat_index with coherency mode
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
                   ` (2 preceding siblings ...)
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 3/5] drm/xe/uapi: Add support for cache and coherency mode Matthew Auld
@ 2023-09-27 11:00 ` Matthew Auld
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 5/5] drm/xe/uapi: support pat_index selection with vm_bind Matthew Auld
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe
  Cc: Filip Hazubski, Lucas De Marchi, Carl Zhang, Effie Yu, Matt Roper

Future uapi needs to give userspace the ability to select the pat_index
for a given vm_bind. However, we need to be able to extract the coherency
mode from the provided pat_index to ensure it is compatible with the
coherency mode set at object creation. There are various security reasons
why this matters. The pat_index itself is very platform specific, so it
seems reasonable to annotate each platform definition of the PAT table.
On some older platforms there is no explicit coherency mode, so we just
pick whatever makes sense.
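
The annotation is what allows the bind path added in the next patch to
translate a userspace-provided pat_index back into a coherency mode before
accepting it; roughly (sketch only, see patch 5 for the real checks):

    u16 coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);

    /* the pat_index must be at least as coherent as the object itself */
    if (bo->coh_mode && coh_mode < bo->coh_mode)
            return -EINVAL;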

v2:
  - Simplify with COH_AT_LEAST_1_WAY
  - Add some kernel-doc
v3 (Matt Roper):
  - Some small tweaks
v4:
  - Rebase

Bspec: 45101, 44235 #xe
Bspec: 70552, 71582, 59400 #xe2
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h |  2 +-
 drivers/gpu/drm/xe/xe_pat.c          | 62 ++++++++++++++++------------
 drivers/gpu/drm/xe/xe_pat.h          | 28 +++++++++++++
 3 files changed, 65 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 7d0f2109c23a..18af9e29f42f 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -340,7 +340,7 @@ struct xe_device {
 		/** Internal operations to abstract platforms */
 		const struct xe_pat_ops *ops;
 		/** PAT table to program in the HW */
-		const u32 *table;
+		const struct xe_pat_table_entry *table;
 		/** Number of PAT entries */
 		int n_entries;
 		u32 idx[__XE_CACHE_LEVEL_COUNT];
diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
index 2c759ffa35ef..5943798b1f02 100644
--- a/drivers/gpu/drm/xe/xe_pat.c
+++ b/drivers/gpu/drm/xe/xe_pat.c
@@ -5,6 +5,8 @@
 
 #include "xe_pat.h"
 
+#include <drm/xe_drm.h>
+
 #include "regs/xe_reg_defs.h"
 #include "xe_gt.h"
 #include "xe_gt_mcr.h"
@@ -33,51 +35,58 @@
 #define XELP_PAT_UC				REG_FIELD_PREP(XELP_MEM_TYPE_MASK, 0)
 
 struct xe_pat_ops {
-	void (*program_graphics)(struct xe_gt *gt, const u32 table[], int n_entries);
-	void (*program_media)(struct xe_gt *gt, const u32 table[], int n_entries);
+	void (*program_graphics)(struct xe_gt *gt, const struct xe_pat_table_entry table[], int n_entries);
+	void (*program_media)(struct xe_gt *gt, const struct xe_pat_table_entry table[], int n_entries);
 };
 
-static const u32 xelp_pat_table[] = {
-	[0] = XELP_PAT_WB,
-	[1] = XELP_PAT_WC,
-	[2] = XELP_PAT_WT,
-	[3] = XELP_PAT_UC,
+static const struct xe_pat_table_entry xelp_pat_table[] = {
+	[0] = { XELP_PAT_WB, XE_GEM_COH_AT_LEAST_1WAY },
+	[1] = { XELP_PAT_WC, XE_GEM_COH_NONE },
+	[2] = { XELP_PAT_WT, XE_GEM_COH_NONE },
+	[3] = { XELP_PAT_UC, XE_GEM_COH_NONE },
 };
 
-static const u32 xehpc_pat_table[] = {
-	[0] = XELP_PAT_UC,
-	[1] = XELP_PAT_WC,
-	[2] = XELP_PAT_WT,
-	[3] = XELP_PAT_WB,
-	[4] = XEHPC_PAT_CLOS(1) | XELP_PAT_WT,
-	[5] = XEHPC_PAT_CLOS(1) | XELP_PAT_WB,
-	[6] = XEHPC_PAT_CLOS(2) | XELP_PAT_WT,
-	[7] = XEHPC_PAT_CLOS(2) | XELP_PAT_WB,
+static const struct xe_pat_table_entry xehpc_pat_table[] = {
+	[0] = { XELP_PAT_UC, XE_GEM_COH_NONE },
+	[1] = { XELP_PAT_WC, XE_GEM_COH_NONE },
+	[2] = { XELP_PAT_WT, XE_GEM_COH_NONE },
+	[3] = { XELP_PAT_WB, XE_GEM_COH_AT_LEAST_1WAY },
+	[4] = { XEHPC_PAT_CLOS(1) | XELP_PAT_WT, XE_GEM_COH_NONE },
+	[5] = { XEHPC_PAT_CLOS(1) | XELP_PAT_WB, XE_GEM_COH_AT_LEAST_1WAY },
+	[6] = { XEHPC_PAT_CLOS(2) | XELP_PAT_WT, XE_GEM_COH_NONE },
+	[7] = { XEHPC_PAT_CLOS(2) | XELP_PAT_WB, XE_GEM_COH_AT_LEAST_1WAY },
 };
 
-static const u32 xelpg_pat_table[] = {
-	[0] = XELPG_PAT_0_WB,
-	[1] = XELPG_PAT_1_WT,
-	[2] = XELPG_PAT_3_UC,
-	[3] = XELPG_PAT_0_WB | XELPG_2_COH_1W,
-	[4] = XELPG_PAT_0_WB | XELPG_3_COH_2W,
+static const struct xe_pat_table_entry xelpg_pat_table[] = {
+	[0] = { XELPG_PAT_0_WB, XE_GEM_COH_NONE },
+	[1] = { XELPG_PAT_1_WT, XE_GEM_COH_NONE },
+	[2] = { XELPG_PAT_3_UC, XE_GEM_COH_NONE },
+	[3] = { XELPG_PAT_0_WB | XELPG_2_COH_1W, XE_GEM_COH_AT_LEAST_1WAY },
+	[4] = { XELPG_PAT_0_WB | XELPG_3_COH_2W, XE_GEM_COH_AT_LEAST_1WAY },
 };
 
-static void program_pat(struct xe_gt *gt, const u32 table[], int n_entries)
+u16 xe_pat_index_get_coh_mode(struct xe_device *xe, u16 pat_index)
+{
+	WARN_ON(pat_index >= xe->pat.n_entries);
+	return xe->pat.table[pat_index].coh_mode;
+}
+
+static void program_pat(struct xe_gt *gt, const struct xe_pat_table_entry table[], int n_entries)
 {
 	for (int i = 0; i < n_entries; i++) {
 		struct xe_reg reg = XE_REG(_PAT_INDEX(i));
 
-		xe_mmio_write32(gt, reg, table[i]);
+		xe_mmio_write32(gt, reg, table[i].value);
 	}
 }
 
-static void program_pat_mcr(struct xe_gt *gt, const u32 table[], int n_entries)
+static void program_pat_mcr(struct xe_gt *gt, const struct xe_pat_table_entry table[],
+			    int n_entries)
 {
 	for (int i = 0; i < n_entries; i++) {
 		struct xe_reg_mcr reg_mcr = XE_REG_MCR(_PAT_INDEX(i));
 
-		xe_gt_mcr_multicast_write(gt, reg_mcr, table[i]);
+		xe_gt_mcr_multicast_write(gt, reg_mcr, table[i].value);
 	}
 }
 
@@ -125,6 +134,7 @@ void xe_pat_init_early(struct xe_device *xe)
 		xe->pat.idx[XE_CACHE_WT] = 2;
 		xe->pat.idx[XE_CACHE_WB] = 0;
 	} else if (GRAPHICS_VERx100(xe) <= 1210) {
+		WARN_ON_ONCE(!IS_DGFX(xe) && !xe->info.has_llc);
 		xe->pat.ops = &xelp_pat_ops;
 		xe->pat.table = xelp_pat_table;
 		xe->pat.n_entries = ARRAY_SIZE(xelp_pat_table);
diff --git a/drivers/gpu/drm/xe/xe_pat.h b/drivers/gpu/drm/xe/xe_pat.h
index 744318cab69b..4032c0ef975c 100644
--- a/drivers/gpu/drm/xe/xe_pat.h
+++ b/drivers/gpu/drm/xe/xe_pat.h
@@ -6,9 +6,29 @@
 #ifndef _XE_PAT_H_
 #define _XE_PAT_H_
 
+#include <linux/types.h>
+
 struct xe_gt;
 struct xe_device;
 
+/**
+ * struct xe_pat_table_entry - The pat_index encoding and other meta information.
+ */
+struct xe_pat_table_entry {
+	/**
+	 * @value: The platform specific value encoding the various memory
+	 * attributes (this maps to some fixed pat_index). So things like
+	 * caching, coherency, compression etc can be encoded here.
+	 */
+	u32 value;
+
+	/**
+	 * @coh_mode: The GPU coherency mode that @value maps to. Either
+	 * XE_GEM_COH_NONE or XE_GEM_COH_AT_LEAST_1WAY.
+	 */
+	u16 coh_mode;
+};
+
 /**
  * xe_pat_init_early - SW initialization, setting up data based on device
  * @xe: xe device
@@ -21,4 +41,12 @@ void xe_pat_init_early(struct xe_device *xe);
  */
 void xe_pat_init(struct xe_gt *gt);
 
+/**
+ * xe_pat_index_get_coh_mode - Extract the coherency mode for the given
+ * pat_index.
+ * @xe: xe device
+ * @pat_index: The pat_index to query
+ */
+u16 xe_pat_index_get_coh_mode(struct xe_device *xe, u16 pat_index);
+
 #endif
-- 
2.41.0



* [Intel-xe] [PATCH v4 5/5] drm/xe/uapi: support pat_index selection with vm_bind
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
                   ` (3 preceding siblings ...)
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 4/5] drm/xe/pat: annotate pat_index with " Matthew Auld
@ 2023-09-27 11:00 ` Matthew Auld
  2023-09-27 11:31 ` [Intel-xe] ✗ CI.Patch_applied: failure for PAT and cache coherency support (rev5) Patchwork
  2023-09-27 16:21 ` [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Souza, Jose
  6 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-27 11:00 UTC (permalink / raw)
  To: intel-xe
  Cc: Filip Hazubski, Lucas De Marchi, Carl Zhang, Effie Yu, Matt Roper

Allow userspace to directly control the pat_index for a given vm
binding. This allows direct control over the coherency, the caching and
potentially other attributes in the future for the ppGTT binding.

The exact meaning behind the pat_index is very platform specific (see
the Bspec or PRMs) but it effectively maps to some predefined memory
attributes. From the KMD point of view we only care about the coherency
provided by the pat_index, which falls into either NONE, 1WAY or 2WAY.
The vm_bind coherency mode for the given pat_index needs to be at least
as coherent as the coh_mode that was set at object creation. For
platforms that lack an explicit coherency mode, we treat UC/WT/WC as
NONE and WB as AT_LEAST_1WAY.

For userptr mappings we lack a corresponding gem object, so the expected
coherency mode is instead implicit and must fall into either 1WAY or
2WAY. Trying to use NONE will be rejected by the kernel. For imported
dma-buf (from a different device) the coherency mode is also implicit
and must likewise be either 1WAY or 2WAY, i.e. AT_LEAST_1WAY.
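
At ioctl time the above reduces to roughly the following (sketch; the real
checks are split between vm_bind_ioctl_check_args() and xe_vm_bind_ioctl()
below):

    coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);

    /* userptr: no gem object, so the mapping must be at least 1way coherent */
    if (coh_mode == XE_GEM_COH_NONE &&
        VM_BIND_OP(op) == XE_VM_BIND_OP_MAP_USERPTR)
            return -EINVAL;

    if (bos[i]->coh_mode) {
            /* normal gem object: never weaken the coherency set at creation */
            if (coh_mode < bos[i]->coh_mode)
                    return -EINVAL;
    } else if (coh_mode == XE_GEM_COH_NONE) {
            /* imported dma-buf: CPU caching unknown, assume it may be cached */
            return -EINVAL;
    }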

v2:
  - Undefined coh_mode(pat_index) can now be treated as programmer
    error. (Matt Roper)
  - We now allow gem_create.coh_mode <= coh_mode(pat_index), rather than
    having to match exactly. This ensures imported dma-buf can always
    just use 1way (or even 2way), now that we also bundle 1way/2way into
    at_least_1way. We still require 1way/2way for external dma-buf, but
    the policy can now be the same for self-import, if desired.
  - Use u16 for pat_index in uapi. u32 is massive overkill. (José)
  - Move as much of the pat_index validation as we can into
    vm_bind_ioctl_check_args. (José)
v3 (Matt Roper):
  - Split the pte_encode() refactoring into separate patch.
v4:
  - Rebase

Bspec: 45101, 44235 #xe
Bspec: 70552, 71582, 59400 #xe2
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
---
 drivers/gpu/drm/xe/xe_pt.c       | 11 ++----
 drivers/gpu/drm/xe/xe_vm.c       | 61 +++++++++++++++++++++++++++-----
 drivers/gpu/drm/xe/xe_vm_types.h |  7 ++++
 include/uapi/drm/xe_drm.h        | 43 +++++++++++++++++++++-
 4 files changed, 104 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 92b512641b4a..f9f9010dca10 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -290,8 +290,6 @@ struct xe_pt_stage_bind_walk {
 	struct xe_vm *vm;
 	/** @tile: The tile we're building for. */
 	struct xe_tile *tile;
-	/** @cache: Desired cache level for the ptes */
-	enum xe_cache_level cache;
 	/** @default_pte: PTE flag only template. No address is associated */
 	u64 default_pte;
 	/** @dma_offset: DMA offset to add to the PTE. */
@@ -511,7 +509,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 {
 	struct xe_pt_stage_bind_walk *xe_walk =
 		container_of(walk, typeof(*xe_walk), base);
-	u16 pat_index = tile_to_xe(xe_walk->tile)->pat.idx[xe_walk->cache];
+	u16 pat_index = xe_walk->vma->pat_index;
 	struct xe_pt *xe_parent = container_of(parent, typeof(*xe_parent), base);
 	struct xe_vm *vm = xe_walk->vm;
 	struct xe_pt *xe_child;
@@ -654,13 +652,8 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
 		if (vma && vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT)
 			xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
 		xe_walk.dma_offset = vram_region_gpu_offset(bo->ttm.resource);
-		xe_walk.cache = XE_CACHE_WB;
-	} else {
-		if (!xe_vma_has_no_bo(vma) && bo->flags & XE_BO_SCANOUT_BIT)
-			xe_walk.cache = XE_CACHE_WT;
-		else
-			xe_walk.cache = XE_CACHE_WB;
 	}
+
 	if (!xe_vma_has_no_bo(vma) && xe_bo_is_stolen(bo))
 		xe_walk.dma_offset = xe_ttm_stolen_gpu_offset(xe_bo_device(bo));
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 962bfd2b0179..d9f43ad05969 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -6,6 +6,7 @@
 #include "xe_vm.h"
 
 #include <linux/dma-fence-array.h>
+#include <linux/nospec.h>
 
 #include <drm/drm_exec.h>
 #include <drm/drm_print.h>
@@ -25,6 +26,7 @@
 #include "xe_gt_pagefault.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_migrate.h"
+#include "xe_pat.h"
 #include "xe_pm.h"
 #include "xe_preempt_fence.h"
 #include "xe_pt.h"
@@ -858,7 +860,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 				    u64 start, u64 end,
 				    bool read_only,
 				    bool is_null,
-				    u8 tile_mask)
+				    u8 tile_mask,
+				    u16 pat_index)
 {
 	struct xe_vma *vma;
 	struct xe_tile *tile;
@@ -897,6 +900,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 			vma->tile_mask |= 0x1 << id;
 	}
 
+	vma->pat_index = pat_index;
+
 	if (vm->xe->info.platform == XE_PVC)
 		vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT;
 
@@ -2389,7 +2394,7 @@ static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 static struct drm_gpuva_ops *
 vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			 u64 bo_offset_or_userptr, u64 addr, u64 range,
-			 u32 operation, u8 tile_mask, u32 region)
+			 u32 operation, u8 tile_mask, u32 region, u16 pat_index)
 {
 	struct drm_gem_object *obj = bo ? &bo->ttm.base : NULL;
 	struct drm_gpuva_ops *ops;
@@ -2416,6 +2421,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
 			op->tile_mask = tile_mask;
+			op->pat_index = pat_index;
 			op->map.immediate =
 				operation & XE_VM_BIND_FLAG_IMMEDIATE;
 			op->map.read_only =
@@ -2443,6 +2449,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
 			op->tile_mask = tile_mask;
+			op->pat_index = pat_index;
 			op->prefetch.region = region;
 		}
 		break;
@@ -2485,7 +2492,8 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 }
 
 static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
-			      u8 tile_mask, bool read_only, bool is_null)
+			      u8 tile_mask, bool read_only, bool is_null,
+			      u16 pat_index)
 {
 	struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL;
 	struct xe_vma *vma;
@@ -2501,7 +2509,7 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
 	vma = xe_vma_create(vm, bo, op->gem.offset,
 			    op->va.addr, op->va.addr +
 			    op->va.range - 1, read_only, is_null,
-			    tile_mask);
+			    tile_mask, pat_index);
 	if (bo)
 		xe_bo_unlock(bo);
 
@@ -2658,7 +2666,7 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 
 			vma = new_vma(vm, &op->base.map,
 				      op->tile_mask, op->map.read_only,
-				      op->map.is_null);
+				      op->map.is_null, op->pat_index);
 			if (IS_ERR(vma)) {
 				err = PTR_ERR(vma);
 				goto free_fence;
@@ -2686,7 +2694,7 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 
 				vma = new_vma(vm, op->base.remap.prev,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, op->pat_index);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -2722,7 +2730,7 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 
 				vma = new_vma(vm, op->base.remap.next,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, op->pat_index);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -3235,7 +3243,22 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		u32 obj = (*bind_ops)[i].obj;
 		u64 obj_offset = (*bind_ops)[i].obj_offset;
 		u32 region = (*bind_ops)[i].region;
+		u16 pat_index = (*bind_ops)[i].pat_index;
 		bool is_null = op & XE_VM_BIND_FLAG_NULL;
+		u16 coh_mode;
+
+		if (XE_IOCTL_DBG(xe, pat_index >= xe->pat.n_entries)) {
+			err = -EINVAL;
+			goto free_bind_ops;
+		}
+
+		pat_index = array_index_nospec(pat_index, xe->pat.n_entries);
+		(*bind_ops)[i].pat_index = pat_index;
+		coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);
+		if (XE_WARN_ON(!coh_mode || coh_mode > XE_GEM_COH_AT_LEAST_1WAY)) {
+			err = -EINVAL;
+			goto free_bind_ops;
+		}
 
 		if (i == 0) {
 			*async = !!(op & XE_VM_BIND_FLAG_ASYNC);
@@ -3277,6 +3300,8 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 				 VM_BIND_OP(op) == XE_VM_BIND_OP_UNMAP_ALL) ||
 		    XE_IOCTL_DBG(xe, obj &&
 				 VM_BIND_OP(op) == XE_VM_BIND_OP_MAP_USERPTR) ||
+		    XE_IOCTL_DBG(xe, coh_mode == XE_GEM_COH_NONE &&
+				 VM_BIND_OP(op) == XE_VM_BIND_OP_MAP_USERPTR) ||
 		    XE_IOCTL_DBG(xe, obj &&
 				 VM_BIND_OP(op) == XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, region &&
@@ -3425,6 +3450,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u64 addr = bind_ops[i].addr;
 		u32 obj = bind_ops[i].obj;
 		u64 obj_offset = bind_ops[i].obj_offset;
+		u16 pat_index = bind_ops[i].pat_index;
+		u16 coh_mode;
 
 		if (!obj)
 			continue;
@@ -3452,6 +3479,23 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 				goto put_obj;
 			}
 		}
+
+		coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);
+		if (bos[i]->coh_mode) {
+			if (XE_IOCTL_DBG(xe, coh_mode < bos[i]->coh_mode)) {
+				err = -EINVAL;
+				goto put_obj;
+			}
+		} else if (XE_IOCTL_DBG(xe, coh_mode == XE_GEM_COH_NONE)) {
+			/*
+			 * Imported dma-buf from a different device should
+			 * require 1way or 2way coherency since we don't know
+			 * how it was mapped on the CPU. Just assume it is
+			 * potentially cached on the CPU side.
+			 */
+			err = -EINVAL;
+			goto put_obj;
+		}
 	}
 
 	if (args->num_syncs) {
@@ -3489,10 +3533,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u64 obj_offset = bind_ops[i].obj_offset;
 		u8 tile_mask = bind_ops[i].tile_mask;
 		u32 region = bind_ops[i].region;
+		u16 pat_index = bind_ops[i].pat_index;
 
 		ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset,
 						  addr, range, op, tile_mask,
-						  region);
+						  region, pat_index);
 		if (IS_ERR(ops[i])) {
 			err = PTR_ERR(ops[i]);
 			ops[i] = NULL;
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 1c5553b842d7..692e1cecb64f 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -111,6 +111,11 @@ struct xe_vma {
 	 */
 	u8 tile_present;
 
+	/**
+	 * @pat_index: The pat index to use when encoding the PTEs for this vma.
+	 */
+	u16 pat_index;
+
 	struct {
 		struct list_head rebind_link;
 	} notifier;
@@ -418,6 +423,8 @@ struct xe_vma_op {
 	struct async_op_fence *fence;
 	/** @tile_mask: gt mask for this operation */
 	u8 tile_mask;
+	/** @pat_index: The pat index to use for this operation. */
+	u16 pat_index;
 	/** @flags: operation flags */
 	enum xe_vma_op_flags flags;
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 260417b60c41..61012b194f74 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -598,8 +598,49 @@ struct drm_xe_vm_bind_op {
 	 */
 	__u32 obj;
 
+	/**
+	 * @pat_index: The platform defined @pat_index to use for this mapping.
+	 * The index basically maps to some predefined memory attributes,
+	 * including things like caching, coherency, compression etc.  The exact
+	 * meaning of the pat_index is platform specific and defined in the
+	 * Bspec and PRMs.  When the KMD sets up the binding the index here is
+	 * encoded into the ppGTT PTE.
+	 *
+	 * For coherency the @pat_index needs to be at least as coherent as
+	 * drm_xe_gem_create.coh_mode, i.e. coh_mode(pat_index) >=
+	 * drm_xe_gem_create.coh_mode. The KMD will extract the coherency mode
+	 * from the @pat_index and reject if there is a mismatch (see note below
+	 * for pre-MTL platforms).
+	 *
+	 * Note: On pre-MTL platforms there is only a caching mode and no
+	 * explicit coherency mode, but on such hardware there is always a
+	 * shared-LLC (or it is dgpu), so all GT memory accesses are coherent with
+	 * CPU caches even with the caching mode set as uncached.  It's only the
+	 * display engine that is incoherent (on dgpu it must be in VRAM which
+	 * is always mapped as WC on the CPU). However to keep the uapi somewhat
+	 * consistent with newer platforms the KMD groups the different cache
+	 * levels into the following coherency buckets on all pre-MTL platforms:
+	 *
+	 *	ppGTT UC -> XE_GEM_COH_NONE
+	 *	ppGTT WC -> XE_GEM_COH_NONE
+	 *	ppGTT WT -> XE_GEM_COH_NONE
+	 *	ppGTT WB -> XE_GEM_COH_AT_LEAST_1WAY
+	 *
+	 * In practice UC/WC/WT should only ever be used for scanout surfaces on
+	 * such platforms (or perhaps in general for dma-buf if shared with
+	 * another device) since it is only the display engine that is actually
+	 * incoherent.  Everything else should typically use WB given that we
+	 * have a shared-LLC.  On MTL+ this completely changes and the HW
+	 * defines the coherency mode as part of the @pat_index, where
+	 * incoherent GT access is possible.
+	 *
+	 * Note: For userptr and externally imported dma-buf the kernel expects
+	 * either 1WAY or 2WAY for the @pat_index.
+	 */
+	__u16 pat_index;
+
 	/** @pad: MBZ */
-	__u32 pad;
+	__u16 pad;
 
 	union {
 		/**
-- 
2.41.0



* [Intel-xe] ✗ CI.Patch_applied: failure for PAT and cache coherency support (rev5)
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
                   ` (4 preceding siblings ...)
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 5/5] drm/xe/uapi: support pat_index selection with vm_bind Matthew Auld
@ 2023-09-27 11:31 ` Patchwork
  2023-09-27 16:21 ` [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Souza, Jose
  6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2023-09-27 11:31 UTC (permalink / raw)
  To: Souza, Jose; +Cc: intel-xe

== Series Details ==

Series: PAT and cache coherency support (rev5)
URL   : https://patchwork.freedesktop.org/series/123027/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: fc8ec3c56 drm/xe: Add Wa_18028616096
=== git am output follows ===
error: patch failed: drivers/gpu/drm/xe/xe_pat.c:42
error: drivers/gpu/drm/xe/xe_pat.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Applying: drm/xe/pat: trim the xelp PAT table
Patch failed at 0001 drm/xe/pat: trim the xelp PAT table
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




* Re: [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support
  2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
                   ` (5 preceding siblings ...)
  2023-09-27 11:31 ` [Intel-xe] ✗ CI.Patch_applied: failure for PAT and cache coherency support (rev5) Patchwork
@ 2023-09-27 16:21 ` Souza, Jose
  2023-09-28  7:53   ` Matthew Auld
  6 siblings, 1 reply; 11+ messages in thread
From: Souza, Jose @ 2023-09-27 16:21 UTC (permalink / raw)
  To: intel-xe, Auld,  Matthew

On Wed, 2023-09-27 at 12:00 +0100, Matthew Auld wrote:
> Branch available here:
> https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads
> 
> Series directly depends on the patches here:
> https://patchwork.freedesktop.org/series/124225/
> 
> The goal here is to allow userspace to directly control the pat_index when
> mapping memory via the ppGTT, in addition to the CPU caching mode. This is very
> much needed on newer igpu platforms which allow incoherent GT access, where the
> choice over the cache level and expected coherency is best left to userspace,
> depending on the use case. In the future there may also be other attributes
> encoded in the pat_index, so giving userspace direct control will also be
> needed there.
> 
> To support this we add new gem_create uAPI for selecting the CPU caching
> mode to use for system memory, including the expected GPU coherency mode. There
> are various restrictions on which CPU caching modes are compatible with the
> selected coherency mode. With that in place the actual pat_index can now be
> provided as part of vm_bind. The only restriction is that the coherency mode of
> the pat_index must be at least as coherent as the gem_create coherency mode.
> There are also some special cases, such as userptr and dma-buf.
> 
> v2:
>   - Loads of improvements/tweaks. Main changes are to now allow
>     gem_create.coh_mode <= coh_mode(pat_index), rather than it needing to match
>     exactly. This simplifies the dma-buf policy from userspace pov. Also we now
>     only consider COH_NONE and COH_AT_LEAST_1WAY.
> v3:
>   - Rebase. Split the pte_encode() refactoring, plus various smaller tweaks and
>     fixes.
> v4:
>   - Rebase on Lucas' new series.
>   - Drop UC cache mode.
>   - s/smem_cpu_caching/cpu_caching/. Idea is to make VRAM WC explicit in the
>     uapi, plus make it more future proof.
> 

Thanks for the smem_cpu_caching to cpu_caching change.

This latest version is causing a GuC fw load failure on MTL; I have bisected it to "drm/xe: directly use pat_index for pte_encode".

[  173.995308] xe 0000:00:02.0: [drm:xe_guc_init [xe]] GuC param[12] = 0x00000000
[  173.995388] xe 0000:00:02.0: [drm:xe_guc_init [xe]] GuC param[13] = 0x00000000
[  173.995467] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
[  173.995609] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [2048K, 832K)
[  174.234667] xe 0000:00:02.0: [drm] GuC load failed: status = 0x80007134
[  174.234681] xe 0000:00:02.0: [drm] GuC load failed: status: Reset = 0, BootROM = 0x1A, UKernel = 0x71, MIA = 0x00, Auth = 0x02
[  174.234690] xe 0000:00:02.0: [drm] 0xcabba9e6 0xdeadfeed 0x00000000 0x00000078
[  174.234697] xe 0000:00:02.0: [drm] 0x00010000 0x00000000 0x0000fff0 0x00000000
[  174.234703] xe 0000:00:02.0: [drm] 0x00000002 0xcabba9e6 0x8086dead 0x00000000
[  174.234709] xe 0000:00:02.0: [drm] 0x00000000 0x00002000 0x00000000 0x00002000
[  174.234714] xe 0000:00:02.0: [drm] 0x00000000 0x00000002 0xcabba9f6 0xbeeffeed
[  174.234719] xe 0000:00:02.0: [drm] 0x00000000 0x00000000 0x00004000 0x00000000
[  174.234724] xe 0000:00:02.0: [drm] 0x00004000 0x00000000 0x00000002 0x8086900d
[  174.234730] xe 0000:00:02.0: [drm] 0x00010000 0x00000006 0x00010001 0x00460606
[  174.234735] xe 0000:00:02.0: [drm] 0x00020001 0x00004050 0x00030001 0x00004b00
[  174.234741] xe 0000:00:02.0: [drm] 0x00000000 0x00000000 0x00000000 0x00000000


uAPI-wise it needs some renames to align with the uAPI alignment series (https://patchwork.freedesktop.org/series/124271/); take a look at:
https://patchwork.freedesktop.org/patch/559576/?series=124271&rev=1
https://patchwork.freedesktop.org/patch/559577/?series=124271&rev=1



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode
  2023-09-27 11:00 ` [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode Matthew Auld
@ 2023-09-28  4:41   ` Niranjana Vishwanathapura
  2023-09-28  7:25     ` Matthew Auld
  0 siblings, 1 reply; 11+ messages in thread
From: Niranjana Vishwanathapura @ 2023-09-28  4:41 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Lucas De Marchi, Matt Roper, intel-xe

On Wed, Sep 27, 2023 at 12:00:08PM +0100, Matthew Auld wrote:
>In the next patch userspace will be able to directly set the pat_index
>as part of vm_bind. To support this we need to get away from using
>xe_cache_level in the low level routines and rather just use the
>pat_index directly.
>
>v2: Rebase
>
>Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>Cc: Pallavi Mishra <pallavi.mishra@intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>Cc: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_ggtt.c       |  7 +++----
> drivers/gpu/drm/xe/xe_ggtt_types.h |  3 +--
> drivers/gpu/drm/xe/xe_migrate.c    | 19 +++++++++++--------
> drivers/gpu/drm/xe/xe_pt.c         | 11 ++++++-----
> drivers/gpu/drm/xe/xe_pt_types.h   |  8 ++++----
> drivers/gpu/drm/xe/xe_vm.c         | 24 +++++++++++-------------
> 6 files changed, 36 insertions(+), 36 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>index 99b54794917e..2334c47c19cc 100644
>--- a/drivers/gpu/drm/xe/xe_ggtt.c
>+++ b/drivers/gpu/drm/xe/xe_ggtt.c
>@@ -27,7 +27,7 @@
> #define GUC_GGTT_TOP	0xFEE00000
>
> static u64 xelp_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>-				   enum xe_cache_level cache)
>+				   u16 pat_index)
> {
> 	u64 pte;
>
>@@ -41,13 +41,12 @@ static u64 xelp_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
> }
>
> static u64 xelpg_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>-				    enum xe_cache_level cache)
>+				    u16 pat_index)
> {
> 	struct xe_device *xe = xe_bo_device(bo);
>-	u32 pat_index = xe->pat.idx[cache];
> 	u64 pte;
>
>-	pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, cache);
>+	pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, pat_index);
>
> 	xe_assert(xe, pat_index <= 3);
>

Looks like this file has a couple of instances of pte_encode_bo() calls which
need to be updated to use pat_index instead of cache level.
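
Something like this for the remaining call sites, I'd assume (just a sketch,
the exact call sites in xe_ggtt.c may look different):

	/* before: callers passed an xe_cache_level */
	pte = ggtt->pt_ops->pte_encode_bo(bo, offset, XE_CACHE_WB);

	/* after: callers resolve the default pat_index themselves */
	u16 pat_index = xe->pat.idx[XE_CACHE_WB]; /* assuming an xe_device is in scope */

	pte = ggtt->pt_ops->pte_encode_bo(bo, offset, pat_index);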

>diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
>index 486016ea5b67..d8c584d9a8c3 100644
>--- a/drivers/gpu/drm/xe/xe_ggtt_types.h
>+++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
>@@ -14,8 +14,7 @@ struct xe_bo;
> struct xe_gt;
>
> struct xe_ggtt_pt_ops {
>-	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>-			     enum xe_cache_level cache);
>+	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset, u16 pat_index);
> };
>
> struct xe_ggtt {
>diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>index 258c2269c916..90a1ff1aca9b 100644
>--- a/drivers/gpu/drm/xe/xe_migrate.c
>+++ b/drivers/gpu/drm/xe/xe_migrate.c
>@@ -158,6 +158,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 				 struct xe_vm *vm)
> {
> 	struct xe_device *xe = tile_to_xe(tile);
>+	u16 pat_index = xe->pat.idx[XE_CACHE_WB];
> 	u8 id = tile->id;
> 	u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level;
> 	u32 map_ofs, level, i;
>@@ -189,7 +190,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 		return ret;
> 	}
>
>-	entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, XE_CACHE_WB);
>+	entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, pat_index);
> 	xe_pt_write(xe, &vm->pt_root[id]->bo->vmap, 0, entry);
>
> 	map_ofs = (num_entries - num_level) * XE_PAGE_SIZE;
>@@ -197,7 +198,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 	/* Map the entire BO in our level 0 pt */
> 	for (i = 0, level = 0; i < num_entries; level++) {
> 		entry = vm->pt_ops->pte_encode_bo(bo, i * XE_PAGE_SIZE,
>-						  XE_CACHE_WB, 0);
>+						  pat_index, 0);
>
> 		xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64, entry);
>
>@@ -216,7 +217,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 		     i += vm->flags & XE_VM_FLAG_64K ? XE_64K_PAGE_SIZE :
> 		     XE_PAGE_SIZE) {
> 			entry = vm->pt_ops->pte_encode_bo(batch, i,
>-							  XE_CACHE_WB, 0);
>+							  pat_index, 0);
>
> 			xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64,
> 				  entry);
>@@ -241,7 +242,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 			flags = XE_PDE_64K;
>
> 		entry = vm->pt_ops->pde_encode_bo(bo, map_ofs + (level - 1) *
>-						  XE_PAGE_SIZE, XE_CACHE_WB);
>+						  XE_PAGE_SIZE, pat_index);
> 		xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE * level, u64,
> 			  entry | flags);
> 	}
>@@ -249,7 +250,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 	/* Write PDE's that point to our BO. */
> 	for (i = 0; i < num_entries - num_level; i++) {
> 		entry = vm->pt_ops->pde_encode_bo(bo, i * XE_PAGE_SIZE,
>-						  XE_CACHE_WB);
>+						  pat_index);
>
> 		xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE +
> 			  (i + 1) * 8, u64, entry);
>@@ -261,7 +262,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>
> 		level = 2;
> 		ofs = map_ofs + XE_PAGE_SIZE * level + 256 * 8;
>-		flags = vm->pt_ops->pte_encode_addr(xe, 0, XE_CACHE_WB, level,
>+		flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
> 						    true, 0);
>
> 		/*
>@@ -457,6 +458,7 @@ static void emit_pte(struct xe_migrate *m,
> 		     struct xe_res_cursor *cur,
> 		     u32 size, struct xe_bo *bo)
> {
>+	u16 pat_index = m->tile->xe->pat.idx[XE_CACHE_WB];

NIT...probably use tile_to_xe() instead of tile->xe here and elsewhere
just to be consistent?
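
i.e. something like this (untested sketch):

	-	u16 pat_index = m->tile->xe->pat.idx[XE_CACHE_WB];
	+	u16 pat_index = tile_to_xe(m->tile)->pat.idx[XE_CACHE_WB];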

> 	u32 ptes;
> 	u64 ofs = at_pt * XE_PAGE_SIZE;
> 	u64 cur_ofs;
>@@ -500,7 +502,7 @@ static void emit_pte(struct xe_migrate *m,
> 			}
>
> 			addr = m->q->vm->pt_ops->pte_encode_addr(m->tile->xe,
>-								 addr, XE_CACHE_WB,
>+								 addr, pat_index,
> 								 0, devmem, flags);
> 			bb->cs[bb->len++] = lower_32_bits(addr);
> 			bb->cs[bb->len++] = upper_32_bits(addr);
>@@ -1190,6 +1192,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
> 	bool first_munmap_rebind = vma &&
> 		vma->gpuva.flags & XE_VMA_FIRST_REBIND;
> 	struct xe_exec_queue *q_override = !q ? m->q : q;
>+	u16 pat_index = xe->pat.idx[XE_CACHE_WB];
>
> 	/* Use the CPU if no in syncs and engine is idle */
> 	if (no_in_syncs(syncs, num_syncs) && xe_exec_queue_is_idle(q_override)) {
>@@ -1261,7 +1264,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
>
> 			xe_tile_assert(tile, pt_bo->size == SZ_4K);
>
>-			addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, XE_CACHE_WB, 0);
>+			addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, pat_index, 0);
> 			bb->cs[bb->len++] = lower_32_bits(addr);
> 			bb->cs[bb->len++] = upper_32_bits(addr);
> 		}
>diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>index 4d4c6a4c305e..92b512641b4a 100644
>--- a/drivers/gpu/drm/xe/xe_pt.c
>+++ b/drivers/gpu/drm/xe/xe_pt.c
>@@ -50,6 +50,7 @@ static struct xe_pt *xe_pt_entry(struct xe_pt_dir *pt_dir, unsigned int index)
> static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
> 			     unsigned int level)
> {
>+	u16 pat_index = tile_to_xe(tile)->pat.idx[XE_CACHE_WB];
> 	u8 id = tile->id;
>
> 	if (!vm->scratch_bo[id])
>@@ -57,9 +58,9 @@ static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
>
> 	if (level > 0)
> 		return vm->pt_ops->pde_encode_bo(vm->scratch_pt[id][level - 1]->bo,
>-						 0, XE_CACHE_WB);
>+						 0, pat_index);
>
>-	return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, XE_CACHE_WB, 0);
>+	return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, pat_index, 0);
> }
>
> /**
>@@ -510,6 +511,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> {
> 	struct xe_pt_stage_bind_walk *xe_walk =
> 		container_of(walk, typeof(*xe_walk), base);
>+	u16 pat_index = tile_to_xe(xe_walk->tile)->pat.idx[xe_walk->cache];

why not change xe_walk->cache to a xe_walk->pat_index?

Niranjana

> 	struct xe_pt *xe_parent = container_of(parent, typeof(*xe_parent), base);
> 	struct xe_vm *vm = xe_walk->vm;
> 	struct xe_pt *xe_child;
>@@ -526,7 +528,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
>
> 		pte = vm->pt_ops->pte_encode_vma(is_null ? 0 :
> 						 xe_res_dma(curs) + xe_walk->dma_offset,
>-						 xe_walk->vma, xe_walk->cache, level);
>+						 xe_walk->vma, pat_index, level);
> 		pte |= xe_walk->default_pte;
>
> 		/*
>@@ -591,8 +593,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> 			xe_child->is_compact = true;
> 		}
>
>-		pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0,
>-						xe_walk->cache) | flags;
>+		pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0, pat_index) | flags;
> 		ret = xe_pt_insert_entry(xe_walk, xe_parent, offset, xe_child,
> 					 pte);
> 	}
>diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
>index bd6645295fe6..355fa8f014e9 100644
>--- a/drivers/gpu/drm/xe/xe_pt_types.h
>+++ b/drivers/gpu/drm/xe/xe_pt_types.h
>@@ -38,14 +38,14 @@ struct xe_pt {
>
> struct xe_pt_ops {
> 	u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>-			     enum xe_cache_level cache, u32 pt_level);
>+			     u16 pat_index, u32 pt_level);
> 	u64 (*pte_encode_vma)(u64 pte, struct xe_vma *vma,
>-			      enum xe_cache_level cache, u32 pt_level);
>+			      u16 pat_index, u32 pt_level);
> 	u64 (*pte_encode_addr)(struct xe_device *xe, u64 addr,
>-			       enum xe_cache_level cache,
>+			       u16 pat_index,
> 			       u32 pt_level, bool devmem, u64 flags);
> 	u64 (*pde_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>-			     const enum xe_cache_level cache);
>+			     const u16 pat_index);
> };
>
> struct xe_pt_entry {
>diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>index beffbb1039d3..962bfd2b0179 100644
>--- a/drivers/gpu/drm/xe/xe_vm.c
>+++ b/drivers/gpu/drm/xe/xe_vm.c
>@@ -1191,9 +1191,8 @@ static struct drm_gpuva_fn_ops gpuva_ops = {
> 	.op_alloc = xe_vm_op_alloc,
> };
>
>-static u64 pde_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
>+static u64 pde_encode_pat_index(struct xe_device *xe, u16 pat_index)
> {
>-	u32 pat_index = xe->pat.idx[cache];
> 	u64 pte = 0;
>
> 	if (pat_index & BIT(0))
>@@ -1205,9 +1204,8 @@ static u64 pde_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
> 	return pte;
> }
>
>-static u64 pte_encode_cache(struct xe_device *xe, enum xe_cache_level cache)
>+static u64 pte_encode_pat_index(struct xe_device *xe, u16 pat_index)
> {
>-	u32 pat_index = xe->pat.idx[cache];
> 	u64 pte = 0;
>
> 	if (pat_index & BIT(0))
>@@ -1238,27 +1236,27 @@ static u64 pte_encode_ps(u32 pt_level)
> }
>
> static u64 xelp_pde_encode_bo(struct xe_bo *bo, u64 bo_offset,
>-			      const enum xe_cache_level cache)
>+			      const u16 pat_index)
> {
> 	struct xe_device *xe = xe_bo_device(bo);
> 	u64 pde;
>
> 	pde = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
> 	pde |= XE_PAGE_PRESENT | XE_PAGE_RW;
>-	pde |= pde_encode_cache(xe, cache);
>+	pde |= pde_encode_pat_index(xe, pat_index);
>
> 	return pde;
> }
>
> static u64 xelp_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>-			      enum xe_cache_level cache, u32 pt_level)
>+			      u16 pat_index, u32 pt_level)
> {
> 	struct xe_device *xe = xe_bo_device(bo);
> 	u64 pte;
>
> 	pte = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
> 	pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
>-	pte |= pte_encode_cache(xe, cache);
>+	pte |= pte_encode_pat_index(xe, pat_index);
> 	pte |= pte_encode_ps(pt_level);
>
> 	if (xe_bo_is_vram(bo) || xe_bo_is_stolen_devmem(bo))
>@@ -1268,7 +1266,7 @@ static u64 xelp_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
> }
>
> static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
>-			       enum xe_cache_level cache, u32 pt_level)
>+			       u16 pat_index, u32 pt_level)
> {
> 	struct xe_device *xe = xe_vma_vm(vma)->xe;
>
>@@ -1277,7 +1275,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
> 	if (likely(!xe_vma_read_only(vma)))
> 		pte |= XE_PAGE_RW;
>
>-	pte |= pte_encode_cache(xe, cache);
>+	pte |= pte_encode_pat_index(xe, pat_index);
> 	pte |= pte_encode_ps(pt_level);
>
> 	if (unlikely(xe_vma_is_null(vma)))
>@@ -1287,7 +1285,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
> }
>
> static u64 xelp_pte_encode_addr(struct xe_device *xe, u64 addr,
>-				enum xe_cache_level cache,
>+				u16 pat_index,
> 				u32 pt_level, bool devmem, u64 flags)
> {
> 	u64 pte;
>@@ -1297,7 +1295,7 @@ static u64 xelp_pte_encode_addr(struct xe_device *xe, u64 addr,
>
> 	pte = addr;
> 	pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
>-	pte |= pte_encode_cache(xe, cache);
>+	pte |= pte_encode_pat_index(xe, pat_index);
> 	pte |= pte_encode_ps(pt_level);
>
> 	if (devmem)
>@@ -1701,7 +1699,7 @@ struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id)
> u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile)
> {
> 	return vm->pt_ops->pde_encode_bo(vm->pt_root[tile->id]->bo, 0,
>-					 XE_CACHE_WB);
>+					 tile->xe->pat.idx[XE_CACHE_WB]);
> }
>
> static struct dma_fence *
>-- 
>2.41.0
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode
  2023-09-28  4:41   ` Niranjana Vishwanathapura
@ 2023-09-28  7:25     ` Matthew Auld
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-28  7:25 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: Lucas De Marchi, Matt Roper, intel-xe

On 28/09/2023 05:41, Niranjana Vishwanathapura wrote:
> On Wed, Sep 27, 2023 at 12:00:08PM +0100, Matthew Auld wrote:
>> In the next patch userspace will be able to directly set the pat_index
>> as part of vm_bind. To support this we need to get away from using
>> xe_cache_level in the low level routines and rather just use the
>> pat_index directly.
>>
>> v2: Rebase
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Cc: Matt Roper <matthew.d.roper@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_ggtt.c       |  7 +++----
>> drivers/gpu/drm/xe/xe_ggtt_types.h |  3 +--
>> drivers/gpu/drm/xe/xe_migrate.c    | 19 +++++++++++--------
>> drivers/gpu/drm/xe/xe_pt.c         | 11 ++++++-----
>> drivers/gpu/drm/xe/xe_pt_types.h   |  8 ++++----
>> drivers/gpu/drm/xe/xe_vm.c         | 24 +++++++++++-------------
>> 6 files changed, 36 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>> index 99b54794917e..2334c47c19cc 100644
>> --- a/drivers/gpu/drm/xe/xe_ggtt.c
>> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
>> @@ -27,7 +27,7 @@
>> #define GUC_GGTT_TOP    0xFEE00000
>>
>> static u64 xelp_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>> -                   enum xe_cache_level cache)
>> +                   u16 pat_index)
>> {
>>     u64 pte;
>>
>> @@ -41,13 +41,12 @@ static u64 xelp_ggtt_pte_encode_bo(struct xe_bo 
>> *bo, u64 bo_offset,
>> }
>>
>> static u64 xelpg_ggtt_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>> -                    enum xe_cache_level cache)
>> +                    u16 pat_index)
>> {
>>     struct xe_device *xe = xe_bo_device(bo);
>> -    u32 pat_index = xe->pat.idx[cache];
>>     u64 pte;
>>
>> -    pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, cache);
>> +    pte = xelp_ggtt_pte_encode_bo(bo, bo_offset, pat_index);
>>
>>     xe_assert(xe, pat_index <= 3);
>>
> 
> Looks like this file has a couple of instances of pte_encode_bo() calls which
> need to be updated to use pat_index instead of cache level.

Indeed, I missed a few it seems. Thanks for catching that.

> 
>> diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h 
>> b/drivers/gpu/drm/xe/xe_ggtt_types.h
>> index 486016ea5b67..d8c584d9a8c3 100644
>> --- a/drivers/gpu/drm/xe/xe_ggtt_types.h
>> +++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
>> @@ -14,8 +14,7 @@ struct xe_bo;
>> struct xe_gt;
>>
>> struct xe_ggtt_pt_ops {
>> -    u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>> -                 enum xe_cache_level cache);
>> +    u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset, u16 
>> pat_index);
>> };
>>
>> struct xe_ggtt {
>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c 
>> b/drivers/gpu/drm/xe/xe_migrate.c
>> index 258c2269c916..90a1ff1aca9b 100644
>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>> @@ -158,6 +158,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>                  struct xe_vm *vm)
>> {
>>     struct xe_device *xe = tile_to_xe(tile);
>> +    u16 pat_index = xe->pat.idx[XE_CACHE_WB];
>>     u8 id = tile->id;
>>     u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level;
>>     u32 map_ofs, level, i;
>> @@ -189,7 +190,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>         return ret;
>>     }
>>
>> -    entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, 
>> XE_CACHE_WB);
>> +    entry = vm->pt_ops->pde_encode_bo(bo, bo->size - XE_PAGE_SIZE, 
>> pat_index);
>>     xe_pt_write(xe, &vm->pt_root[id]->bo->vmap, 0, entry);
>>
>>     map_ofs = (num_entries - num_level) * XE_PAGE_SIZE;
>> @@ -197,7 +198,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>     /* Map the entire BO in our level 0 pt */
>>     for (i = 0, level = 0; i < num_entries; level++) {
>>         entry = vm->pt_ops->pte_encode_bo(bo, i * XE_PAGE_SIZE,
>> -                          XE_CACHE_WB, 0);
>> +                          pat_index, 0);
>>
>>         xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64, entry);
>>
>> @@ -216,7 +217,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>              i += vm->flags & XE_VM_FLAG_64K ? XE_64K_PAGE_SIZE :
>>              XE_PAGE_SIZE) {
>>             entry = vm->pt_ops->pte_encode_bo(batch, i,
>> -                              XE_CACHE_WB, 0);
>> +                              pat_index, 0);
>>
>>             xe_map_wr(xe, &bo->vmap, map_ofs + level * 8, u64,
>>                   entry);
>> @@ -241,7 +242,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>             flags = XE_PDE_64K;
>>
>>         entry = vm->pt_ops->pde_encode_bo(bo, map_ofs + (level - 1) *
>> -                          XE_PAGE_SIZE, XE_CACHE_WB);
>> +                          XE_PAGE_SIZE, pat_index);
>>         xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE * level, u64,
>>               entry | flags);
>>     }
>> @@ -249,7 +250,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>     /* Write PDE's that point to our BO. */
>>     for (i = 0; i < num_entries - num_level; i++) {
>>         entry = vm->pt_ops->pde_encode_bo(bo, i * XE_PAGE_SIZE,
>> -                          XE_CACHE_WB);
>> +                          pat_index);
>>
>>         xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE +
>>               (i + 1) * 8, u64, entry);
>> @@ -261,7 +262,7 @@ static int xe_migrate_prepare_vm(struct xe_tile 
>> *tile, struct xe_migrate *m,
>>
>>         level = 2;
>>         ofs = map_ofs + XE_PAGE_SIZE * level + 256 * 8;
>> -        flags = vm->pt_ops->pte_encode_addr(xe, 0, XE_CACHE_WB, level,
>> +        flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
>>                             true, 0);
>>
>>         /*
>> @@ -457,6 +458,7 @@ static void emit_pte(struct xe_migrate *m,
>>              struct xe_res_cursor *cur,
>>              u32 size, struct xe_bo *bo)
>> {
>> +    u16 pat_index = m->tile->xe->pat.idx[XE_CACHE_WB];
> 
> NIT...probably use tile_to_xe() instead of tile->xe here and elsewhere
> just to be consistent?

Ok, will fix.

> 
>>     u32 ptes;
>>     u64 ofs = at_pt * XE_PAGE_SIZE;
>>     u64 cur_ofs;
>> @@ -500,7 +502,7 @@ static void emit_pte(struct xe_migrate *m,
>>             }
>>
>>             addr = m->q->vm->pt_ops->pte_encode_addr(m->tile->xe,
>> -                                 addr, XE_CACHE_WB,
>> +                                 addr, pat_index,
>>                                  0, devmem, flags);
>>             bb->cs[bb->len++] = lower_32_bits(addr);
>>             bb->cs[bb->len++] = upper_32_bits(addr);
>> @@ -1190,6 +1192,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
>>     bool first_munmap_rebind = vma &&
>>         vma->gpuva.flags & XE_VMA_FIRST_REBIND;
>>     struct xe_exec_queue *q_override = !q ? m->q : q;
>> +    u16 pat_index = xe->pat.idx[XE_CACHE_WB];
>>
>>     /* Use the CPU if no in syncs and engine is idle */
>>     if (no_in_syncs(syncs, num_syncs) && 
>> xe_exec_queue_is_idle(q_override)) {
>> @@ -1261,7 +1264,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
>>
>>             xe_tile_assert(tile, pt_bo->size == SZ_4K);
>>
>> -            addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, XE_CACHE_WB, 0);
>> +            addr = vm->pt_ops->pte_encode_bo(pt_bo, 0, pat_index, 0);
>>             bb->cs[bb->len++] = lower_32_bits(addr);
>>             bb->cs[bb->len++] = upper_32_bits(addr);
>>         }
>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>> index 4d4c6a4c305e..92b512641b4a 100644
>> --- a/drivers/gpu/drm/xe/xe_pt.c
>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>> @@ -50,6 +50,7 @@ static struct xe_pt *xe_pt_entry(struct xe_pt_dir 
>> *pt_dir, unsigned int index)
>> static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
>>                  unsigned int level)
>> {
>> +    u16 pat_index = tile_to_xe(tile)->pat.idx[XE_CACHE_WB];
>>     u8 id = tile->id;
>>
>>     if (!vm->scratch_bo[id])
>> @@ -57,9 +58,9 @@ static u64 __xe_pt_empty_pte(struct xe_tile *tile, 
>> struct xe_vm *vm,
>>
>>     if (level > 0)
>>         return vm->pt_ops->pde_encode_bo(vm->scratch_pt[id][level - 
>> 1]->bo,
>> -                         0, XE_CACHE_WB);
>> +                         0, pat_index);
>>
>> -    return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, 
>> XE_CACHE_WB, 0);
>> +    return vm->pt_ops->pte_encode_bo(vm->scratch_bo[id], 0, 
>> pat_index, 0);
>> }
>>
>> /**
>> @@ -510,6 +511,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, 
>> pgoff_t offset,
>> {
>>     struct xe_pt_stage_bind_walk *xe_walk =
>>         container_of(walk, typeof(*xe_walk), base);
>> +    u16 pat_index = tile_to_xe(xe_walk->tile)->pat.idx[xe_walk->cache];
> 
> why not change xe_walk->cache to a xe_walk->pat_index?

In the later vm_bind patch xe_walk->cache/pat_index is removed anyway, 
with the value instead extracted directly from the vma, so I figured 
there was not much point. The above line just becomes 
xe_walk->vma->pat_index.
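
Roughly (sketch of what the later patch ends up with, exact details may
differ slightly):

	u16 pat_index = xe_walk->vma->pat_index;

	pte = vm->pt_ops->pte_encode_vma(is_null ? 0 :
					 xe_res_dma(curs) + xe_walk->dma_offset,
					 xe_walk->vma, pat_index, level);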

> 
> Niranjana
> 
>>     struct xe_pt *xe_parent = container_of(parent, typeof(*xe_parent), 
>> base);
>>     struct xe_vm *vm = xe_walk->vm;
>>     struct xe_pt *xe_child;
>> @@ -526,7 +528,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, 
>> pgoff_t offset,
>>
>>         pte = vm->pt_ops->pte_encode_vma(is_null ? 0 :
>>                          xe_res_dma(curs) + xe_walk->dma_offset,
>> -                         xe_walk->vma, xe_walk->cache, level);
>> +                         xe_walk->vma, pat_index, level);
>>         pte |= xe_walk->default_pte;
>>
>>         /*
>> @@ -591,8 +593,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, 
>> pgoff_t offset,
>>             xe_child->is_compact = true;
>>         }
>>
>> -        pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0,
>> -                        xe_walk->cache) | flags;
>> +        pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0, pat_index) | 
>> flags;
>>         ret = xe_pt_insert_entry(xe_walk, xe_parent, offset, xe_child,
>>                      pte);
>>     }
>> diff --git a/drivers/gpu/drm/xe/xe_pt_types.h 
>> b/drivers/gpu/drm/xe/xe_pt_types.h
>> index bd6645295fe6..355fa8f014e9 100644
>> --- a/drivers/gpu/drm/xe/xe_pt_types.h
>> +++ b/drivers/gpu/drm/xe/xe_pt_types.h
>> @@ -38,14 +38,14 @@ struct xe_pt {
>>
>> struct xe_pt_ops {
>>     u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>> -                 enum xe_cache_level cache, u32 pt_level);
>> +                 u16 pat_index, u32 pt_level);
>>     u64 (*pte_encode_vma)(u64 pte, struct xe_vma *vma,
>> -                  enum xe_cache_level cache, u32 pt_level);
>> +                  u16 pat_index, u32 pt_level);
>>     u64 (*pte_encode_addr)(struct xe_device *xe, u64 addr,
>> -                   enum xe_cache_level cache,
>> +                   u16 pat_index,
>>                    u32 pt_level, bool devmem, u64 flags);
>>     u64 (*pde_encode_bo)(struct xe_bo *bo, u64 bo_offset,
>> -                 const enum xe_cache_level cache);
>> +                 const u16 pat_index);
>> };
>>
>> struct xe_pt_entry {
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index beffbb1039d3..962bfd2b0179 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -1191,9 +1191,8 @@ static struct drm_gpuva_fn_ops gpuva_ops = {
>>     .op_alloc = xe_vm_op_alloc,
>> };
>>
>> -static u64 pde_encode_cache(struct xe_device *xe, enum xe_cache_level 
>> cache)
>> +static u64 pde_encode_pat_index(struct xe_device *xe, u16 pat_index)
>> {
>> -    u32 pat_index = xe->pat.idx[cache];
>>     u64 pte = 0;
>>
>>     if (pat_index & BIT(0))
>> @@ -1205,9 +1204,8 @@ static u64 pde_encode_cache(struct xe_device 
>> *xe, enum xe_cache_level cache)
>>     return pte;
>> }
>>
>> -static u64 pte_encode_cache(struct xe_device *xe, enum xe_cache_level 
>> cache)
>> +static u64 pte_encode_pat_index(struct xe_device *xe, u16 pat_index)
>> {
>> -    u32 pat_index = xe->pat.idx[cache];
>>     u64 pte = 0;
>>
>>     if (pat_index & BIT(0))
>> @@ -1238,27 +1236,27 @@ static u64 pte_encode_ps(u32 pt_level)
>> }
>>
>> static u64 xelp_pde_encode_bo(struct xe_bo *bo, u64 bo_offset,
>> -                  const enum xe_cache_level cache)
>> +                  const u16 pat_index)
>> {
>>     struct xe_device *xe = xe_bo_device(bo);
>>     u64 pde;
>>
>>     pde = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
>>     pde |= XE_PAGE_PRESENT | XE_PAGE_RW;
>> -    pde |= pde_encode_cache(xe, cache);
>> +    pde |= pde_encode_pat_index(xe, pat_index);
>>
>>     return pde;
>> }
>>
>> static u64 xelp_pte_encode_bo(struct xe_bo *bo, u64 bo_offset,
>> -                  enum xe_cache_level cache, u32 pt_level)
>> +                  u16 pat_index, u32 pt_level)
>> {
>>     struct xe_device *xe = xe_bo_device(bo);
>>     u64 pte;
>>
>>     pte = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE);
>>     pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
>> -    pte |= pte_encode_cache(xe, cache);
>> +    pte |= pte_encode_pat_index(xe, pat_index);
>>     pte |= pte_encode_ps(pt_level);
>>
>>     if (xe_bo_is_vram(bo) || xe_bo_is_stolen_devmem(bo))
>> @@ -1268,7 +1266,7 @@ static u64 xelp_pte_encode_bo(struct xe_bo *bo, 
>> u64 bo_offset,
>> }
>>
>> static u64 xelp_pte_encode_vma(u64 pte, struct xe_vma *vma,
>> -                   enum xe_cache_level cache, u32 pt_level)
>> +                   u16 pat_index, u32 pt_level)
>> {
>>     struct xe_device *xe = xe_vma_vm(vma)->xe;
>>
>> @@ -1277,7 +1275,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct 
>> xe_vma *vma,
>>     if (likely(!xe_vma_read_only(vma)))
>>         pte |= XE_PAGE_RW;
>>
>> -    pte |= pte_encode_cache(xe, cache);
>> +    pte |= pte_encode_pat_index(xe, pat_index);
>>     pte |= pte_encode_ps(pt_level);
>>
>>     if (unlikely(xe_vma_is_null(vma)))
>> @@ -1287,7 +1285,7 @@ static u64 xelp_pte_encode_vma(u64 pte, struct 
>> xe_vma *vma,
>> }
>>
>> static u64 xelp_pte_encode_addr(struct xe_device *xe, u64 addr,
>> -                enum xe_cache_level cache,
>> +                u16 pat_index,
>>                 u32 pt_level, bool devmem, u64 flags)
>> {
>>     u64 pte;
>> @@ -1297,7 +1295,7 @@ static u64 xelp_pte_encode_addr(struct xe_device 
>> *xe, u64 addr,
>>
>>     pte = addr;
>>     pte |= XE_PAGE_PRESENT | XE_PAGE_RW;
>> -    pte |= pte_encode_cache(xe, cache);
>> +    pte |= pte_encode_pat_index(xe, pat_index);
>>     pte |= pte_encode_ps(pt_level);
>>
>>     if (devmem)
>> @@ -1701,7 +1699,7 @@ struct xe_vm *xe_vm_lookup(struct xe_file *xef, 
>> u32 id)
>> u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile)
>> {
>>     return vm->pt_ops->pde_encode_bo(vm->pt_root[tile->id]->bo, 0,
>> -                     XE_CACHE_WB);
>> +                     tile->xe->pat.idx[XE_CACHE_WB]);
>> }
>>
>> static struct dma_fence *
>> -- 
>> 2.41.0
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support
  2023-09-27 16:21 ` [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Souza, Jose
@ 2023-09-28  7:53   ` Matthew Auld
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Auld @ 2023-09-28  7:53 UTC (permalink / raw)
  To: Souza, Jose, intel-xe

On 27/09/2023 17:21, Souza, Jose wrote:
> On Wed, 2023-09-27 at 12:00 +0100, Matthew Auld wrote:
>> Branch available here:
>> https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads
>>
>> Series directly depends on the patches here:
>> https://patchwork.freedesktop.org/series/124225/
>>
>> Goal here is to allow userspace to directly control the pat_index when mapping
>> memory via the ppGTT, in addtion to the CPU caching mode. This is very much
>> needed on newer igpu platforms which allow incoherent GT access, where the
>> choice over the cache level and expected coherency is best left to userspace
>> depending on their usecase.  In the future there may also be other stuff encoded
>> in the pat_index, so giving userspace direct control will also be needed there.
>>
>> To support this we added new gem_create uAPI for selecting the CPU cache
>> mode to use for system memory, including the expected GPU coherency mode. There
>> are various restrictions here for the selected coherency mode and compatible CPU
>> cache modes.  With that in place the actual pat_index can now be provided as
>> part of vm_bind. The only restriction is that the coherency mode of the
>> pat_index must be at least as coherent as the gem_create coherency mode. There
>> are also some special cases like with userptr and dma-buf.
>>
>> v2:
>>    - Loads of improvements/tweaks. Main changes are to now allow
>>      gem_create.coh_mode <= coh_mode(pat_index), rather than it needing to match
>>      exactly. This simplifies the dma-buf policy from userspace pov. Also we now
>>      only consider COH_NONE and COH_AT_LEAST_1WAY.
>> v3:
>>    - Rebase. Split the pte_encode() refactoring, plus various smaller tweaks and
>>      fixes.
>> v4:
>>    - Rebase on Lucas' new series.
>>    - Drop UC cache mode.
>>    - s/smem_cpu_caching/cpu_caching/. Idea is to make VRAM WC explicit in the
>>      uapi, plus make it more future proof.
>>
> 
> Thanks for the smem_cpu_caching to cpu_caching change.
> 
> This latest version is causing a GuC fw load failure on MTL; I have bisected it to "drm/xe: directly use pat_index for pte_encode".

Ok, there are some fixes in the latest version around that patch, 
perhaps that resolves it.

> 
> [  173.995308] xe 0000:00:02.0: [drm:xe_guc_init [xe]] GuC param[12] = 0x00000000
> [  173.995388] xe 0000:00:02.0: [drm:xe_guc_init [xe]] GuC param[13] = 0x00000000
> [  173.995467] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
> [  173.995609] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [2048K, 832K)
> [  174.234667] xe 0000:00:02.0: [drm] GuC load failed: status = 0x80007134
> [  174.234681] xe 0000:00:02.0: [drm] GuC load failed: status: Reset = 0, BootROM = 0x1A, UKernel = 0x71, MIA = 0x00, Auth = 0x02
> [  174.234690] xe 0000:00:02.0: [drm] 0xcabba9e6 0xdeadfeed 0x00000000 0x00000078
> [  174.234697] xe 0000:00:02.0: [drm] 0x00010000 0x00000000 0x0000fff0 0x00000000
> [  174.234703] xe 0000:00:02.0: [drm] 0x00000002 0xcabba9e6 0x8086dead 0x00000000
> [  174.234709] xe 0000:00:02.0: [drm] 0x00000000 0x00002000 0x00000000 0x00002000
> [  174.234714] xe 0000:00:02.0: [drm] 0x00000000 0x00000002 0xcabba9f6 0xbeeffeed
> [  174.234719] xe 0000:00:02.0: [drm] 0x00000000 0x00000000 0x00004000 0x00000000
> [  174.234724] xe 0000:00:02.0: [drm] 0x00004000 0x00000000 0x00000002 0x8086900d
> [  174.234730] xe 0000:00:02.0: [drm] 0x00010000 0x00000006 0x00010001 0x00460606
> [  174.234735] xe 0000:00:02.0: [drm] 0x00020001 0x00004050 0x00030001 0x00004b00
> [  174.234741] xe 0000:00:02.0: [drm] 0x00000000 0x00000000 0x00000000 0x00000000
> 
> 
> uAPI-wise it needs some renames to align with the uAPI alignment series (https://patchwork.freedesktop.org/series/124271/); take a look at:
> https://patchwork.freedesktop.org/patch/559576/?series=124271&rev=1

Ok, will align with that.

> https://patchwork.freedesktop.org/patch/559577/?series=124271&rev=1

I think we are only dealing with adding properties here and not so much 
flags, i.e. you can't combine them like flags: cpu_mapping = WC | WB.
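
For illustration only (made-up names, just to show the distinction):

	/* property: userspace picks exactly one value */
	create.cpu_caching = CPU_CACHING_WB;

	/* flags: single bits that can be OR'ed together */
	create.flags = FLAG_FOO | FLAG_BAR;

	/* so something like cpu_caching = CPU_CACHING_WB | CPU_CACHING_WC
	 * is not meaningful */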

> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-09-28  7:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-27 11:00 [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Matthew Auld
2023-09-27 11:00 ` [Intel-xe] [PATCH v4 1/5] drm/xe/pat: trim the xelp PAT table Matthew Auld
2023-09-27 11:00 ` [Intel-xe] [PATCH v4 2/5] drm/xe: directly use pat_index for pte_encode Matthew Auld
2023-09-28  4:41   ` Niranjana Vishwanathapura
2023-09-28  7:25     ` Matthew Auld
2023-09-27 11:00 ` [Intel-xe] [PATCH v4 3/5] drm/xe/uapi: Add support for cache and coherency mode Matthew Auld
2023-09-27 11:00 ` [Intel-xe] [PATCH v4 4/5] drm/xe/pat: annotate pat_index with " Matthew Auld
2023-09-27 11:00 ` [Intel-xe] [PATCH v4 5/5] drm/xe/uapi: support pat_index selection with vm_bind Matthew Auld
2023-09-27 11:31 ` [Intel-xe] ✗ CI.Patch_applied: failure for PAT and cache coherency support (rev5) Patchwork
2023-09-27 16:21 ` [Intel-xe] [PATCH v4 0/5] PAT and cache coherency support Souza, Jose
2023-09-28  7:53   ` Matthew Auld
