[Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds

* [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds
@ 2023-03-15 18:25 Matthew Brost
  2023-03-15 18:25 ` [Intel-xe] [PATCH 1/4] maple_tree: split up MA_STATE() macro Matthew Brost
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 18:25 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

GPUVA is common code, written primarily by Danilo, that provides a common
place to track GPUVAs (VMAs in Xe) within an address space (VMs in Xe),
tracks all the GPUVAs attached to GEMs, and offers a common way to
implement VM binds / unbinds with MMAP / MUNMAP semantics by creating
operation lists. All of this adds up to a common way to implement VK
sparse bindings.

This series pulls in the GPUVA code written by Danilo, plus some small
fixes of my own, as one large patch. Once GPUVA makes it upstream, we
can rebase and drop this patch. I believe what lands upstream should be
nearly identical to this patch, at least from an API perspective.

The last two patches port Xe to GPUVA and add support for NULL VM binds
(writes dropped, read zero, VK sparse support). An example of the
semantics of this is below.

MAP 0x0000-0x8000 to NULL 	- 0x0000-0x8000 writes dropped + read zero
MAP 0x4000-0x5000 to a GEM 	- 0x0000-0x4000, 0x5000-0x8000 writes dropped + read zero; 0x4000-0x5000 mapped to a GEM
UNMAP 0x3000-0x6000		- 0x0000-0x3000, 0x6000-0x8000 writes dropped + read zero
UNMAP 0x0000-0x8000		- Nothing mapped
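
The sequence above can be sketched as a toy userspace model. This is not Xe
code: every name below (toy_vm, toy_set_range, toy_read, toy_write) is
hypothetical, the 4 KiB page size is an assumption, and reporting an access
to an unmapped page as a fault is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define TOY_NUM_PAGES	8	/* covers 0x0000-0x8000 */

enum toy_page_state { TOY_UNMAPPED, TOY_NULL, TOY_GEM };

struct toy_vm {
	enum toy_page_state state[TOY_NUM_PAGES];
	uint32_t backing[TOY_NUM_PAGES];	/* one word of "GEM" storage per page */
};

/* Put every page in [start, end) into @state; models MAP and UNMAP. */
static void toy_set_range(struct toy_vm *vm, uint64_t start, uint64_t end,
			  enum toy_page_state state)
{
	assert(start < end &&
	       end <= ((uint64_t)TOY_NUM_PAGES << TOY_PAGE_SHIFT));

	for (uint64_t p = start >> TOY_PAGE_SHIFT;
	     p < (end >> TOY_PAGE_SHIFT); p++) {
		vm->state[p] = state;
		vm->backing[p] = 0;
	}
}

/* Returns 0 on success, -1 on a fault (assumed for unmapped pages). */
static int toy_write(struct toy_vm *vm, uint64_t addr, uint32_t val)
{
	uint64_t p = addr >> TOY_PAGE_SHIFT;

	if (p >= TOY_NUM_PAGES || vm->state[p] == TOY_UNMAPPED)
		return -1;
	if (vm->state[p] == TOY_GEM)
		vm->backing[p] = val;
	return 0;	/* NULL binding: the write is silently dropped */
}

/* Returns 0 on success, -1 on a fault (assumed for unmapped pages). */
static int toy_read(const struct toy_vm *vm, uint64_t addr, uint32_t *val)
{
	uint64_t p = addr >> TOY_PAGE_SHIFT;

	if (p >= TOY_NUM_PAGES || vm->state[p] == TOY_UNMAPPED)
		return -1;
	/* NULL binding: reads return zero */
	*val = (vm->state[p] == TOY_GEM) ? vm->backing[p] : 0;
	return 0;
}
```

Replaying the four steps means calling toy_set_range() with TOY_NULL over
0x0000-0x8000, TOY_GEM over 0x4000-0x5000, and TOY_UNMAPPED over
0x3000-0x6000 and then 0x0000-0x8000, checking toy_read() and toy_write()
between steps.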

No changes to existing behavior, just new functionality.

A follow-up will optimize the REBIND operation to avoid using dma-resv slots
for ordering (partial unbinds when not changing page sizes) and prune
the xe_vma object data members.
 
Signed-off-by: Matthew Brost <matthew.brost@intel.com>

Danilo Krummrich (2):
  maple_tree: split up MA_STATE() macro
  drm: manager to keep track of GPUs VA mappings

Matthew Brost (2):
  drm/xe: Port Xe to GPUVA
  drm/xe: NULL binding implementation

 Documentation/gpu/drm-mm.rst                |   31 +
 drivers/gpu/drm/Makefile                    |    1 +
 drivers/gpu/drm/drm_debugfs.c               |   56 +
 drivers/gpu/drm/drm_gem.c                   |    3 +
 drivers/gpu/drm/drm_gpuva_mgr.c             | 1891 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_bo.c                  |   10 +-
 drivers/gpu/drm/xe/xe_bo.h                  |    1 +
 drivers/gpu/drm/xe/xe_device.c              |    2 +-
 drivers/gpu/drm/xe/xe_exec.c                |    4 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   27 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   14 +-
 drivers/gpu/drm/xe/xe_guc_ct.c              |    6 +-
 drivers/gpu/drm/xe/xe_migrate.c             |    5 +-
 drivers/gpu/drm/xe/xe_pt.c                  |  166 +-
 drivers/gpu/drm/xe/xe_trace.h               |   10 +-
 drivers/gpu/drm/xe/xe_vm.c                  | 1872 +++++++++---------
 drivers/gpu/drm/xe/xe_vm.h                  |   76 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c          |   87 +-
 drivers/gpu/drm/xe/xe_vm_types.h            |  168 +-
 include/drm/drm_debugfs.h                   |   23 +
 include/drm/drm_drv.h                       |    7 +
 include/drm/drm_gem.h                       |   75 +
 include/drm/drm_gpuva_mgr.h                 |  735 +++++++
 include/linux/maple_tree.h                  |    7 +-
 include/uapi/drm/xe_drm.h                   |    8 +
 25 files changed, 4072 insertions(+), 1213 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 include/drm/drm_gpuva_mgr.h

-- 
2.34.1


* [Intel-xe] [PATCH 1/4] maple_tree: split up MA_STATE() macro
  2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
@ 2023-03-15 18:25 ` Matthew Brost
  2023-03-15 18:25 ` [Intel-xe] [PATCH 2/4] drm: manager to keep track of GPUs VA mappings Matthew Brost
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 18:25 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

From: Danilo Krummrich <dakr@redhat.com>

Split up the MA_STATE() macro such that components using the maple tree
can easily inherit from struct ma_state and build custom tree walk
macros to hide their internals from users.

Example:

struct sample_iterator {
	struct ma_state mas;
	struct sample_mgr *mgr;
};

#define SAMPLE_ITERATOR(name, __mgr, start)			\
	struct sample_iterator name = {				\
		.mas = MA_STATE_INIT(&(__mgr)->mt, start, 0),	\
		.mgr = __mgr,					\
	}

#define sample_iter_for_each_range(it__, entry__, end__) \
	mas_for_each(&(it__).mas, entry__, end__)

--

struct sample *sample;
SAMPLE_ITERATOR(si, mgr, min);

sample_iter_for_each_range(si, sample, max) {
	frob(mgr, sample);
}

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 include/linux/maple_tree.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
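
The pattern this split enables can be tried outside the kernel with a small
userspace analogue. This is only a sketch: struct toy_state is a hypothetical
stand-in for struct ma_state (it is not the kernel definition), and all toy_*
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ma_state; not the kernel definition. */
struct toy_state {
	const void *tree;
	unsigned long index;
	unsigned long last;
};

/* Initializer-only macro: usable wherever a brace initializer is valid. */
#define TOY_STATE_INIT(t, first, end)	\
	{				\
		.tree = (t),		\
		.index = (first),	\
		.last = (end),		\
	}

/* Declaration macro: keeps the original one-step form working. */
#define TOY_STATE(name, t, first, end)	\
	struct toy_state name = TOY_STATE_INIT(t, first, end)

/* A wrapper that embeds the state next to its own private data. */
struct toy_iterator {
	struct toy_state st;
	int cookie;
};

/* Only possible because the initializer is available on its own. */
#define TOY_ITERATOR(name, t, start, c)			\
	struct toy_iterator name = {			\
		.st = TOY_STATE_INIT(t, start, 0),	\
		.cookie = (c),				\
	}

/* Small accessor so users never touch the embedded state directly. */
static unsigned long toy_iterator_start(const struct toy_iterator *it)
{
	assert(it != NULL);
	return it->st.index;
}
```

With the initializer separated from the declaration, TOY_ITERATOR() can
initialize the embedded state as a nested designated initializer, which the
combined declare-and-initialize form could not express.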

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index e594db58a0f1..baeb989c6e14 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -424,8 +424,8 @@ struct ma_wr_state {
 #define MA_ERROR(err) \
 		((struct maple_enode *)(((unsigned long)err << 2) | 2UL))
 
-#define MA_STATE(name, mt, first, end)					\
-	struct ma_state name = {					\
+#define MA_STATE_INIT(mt, first, end)					\
+	{								\
 		.tree = mt,						\
 		.index = first,						\
 		.last = end,						\
@@ -435,6 +435,9 @@ struct ma_wr_state {
 		.alloc = NULL,						\
 	}
 
+#define MA_STATE(name, mt, first, end)					\
+	struct ma_state name = MA_STATE_INIT(mt, first, end)
+
 #define MA_WR_STATE(name, ma_state, wr_entry)				\
 	struct ma_wr_state name = {					\
 		.mas = ma_state,					\
-- 
2.34.1


* [Intel-xe] [PATCH 2/4] drm: manager to keep track of GPUs VA mappings
  2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
  2023-03-15 18:25 ` [Intel-xe] [PATCH 1/4] maple_tree: split up MA_STATE() macro Matthew Brost
@ 2023-03-15 18:25 ` Matthew Brost
  2023-03-15 18:25 ` [Intel-xe] [PATCH 3/4] drm/xe: Port Xe to GPUVA Matthew Brost
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 18:25 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr, Dave Airlie

From: Danilo Krummrich <dakr@redhat.com>

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by the Vulkan sparse memory bindings that graphics
drivers are starting to implement, allow userspace applications to request
multiple and arbitrary GPU VA mappings of buffer objects. The DRM GPU VA
manager is intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
   making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
   particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
   operations on the GPU VA space. In particular splitting and merging
   of GPU VA mappings, e.g. for intersecting mapping requests or partial
   unmap requests.

Suggested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 Documentation/gpu/drm-mm.rst    |   31 +
 drivers/gpu/drm/Makefile        |    1 +
 drivers/gpu/drm/drm_debugfs.c   |   56 +
 drivers/gpu/drm/drm_gem.c       |    3 +
 drivers/gpu/drm/drm_gpuva_mgr.c | 1891 +++++++++++++++++++++++++++++++
 include/drm/drm_debugfs.h       |   23 +
 include/drm/drm_drv.h           |    7 +
 include/drm/drm_gem.h           |   75 ++
 include/drm/drm_gpuva_mgr.h     |  735 ++++++++++++
 9 files changed, 2822 insertions(+)
 create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 include/drm/drm_gpuva_mgr.h
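
To make purpose 3) from the commit message more concrete, here is a minimal
userspace sketch of how an existing mapping is split around an overlapping
request. It is not the drm_gpuva_mgr implementation: struct toy_va,
struct toy_split and toy_split_around() are hypothetical names, and the
sketch only computes the pieces of the old mapping that survive, along with
the adjusted BO offset of the right-hand piece.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified mapping: VA range plus offset into its BO. */
struct toy_va {
	uint64_t addr;
	uint64_t range;
	uint64_t bo_offset;
};

/* The parts of an old mapping left over after a request cuts into it. */
struct toy_split {
	bool has_prev;		/* a piece survives before the request */
	bool has_next;		/* a piece survives after the request */
	struct toy_va prev;
	struct toy_va next;
};

/*
 * Split @old around the request [req_addr, req_addr + req_range).
 * The caller must make sure that the two ranges actually overlap.
 */
static struct toy_split toy_split_around(const struct toy_va *old,
					 uint64_t req_addr, uint64_t req_range)
{
	struct toy_split s = { 0 };
	uint64_t old_end = old->addr + old->range;
	uint64_t req_end = req_addr + req_range;

	assert(req_range && req_addr < old_end && old->addr < req_end);

	if (old->addr < req_addr) {
		/* Left piece keeps the original start and BO offset. */
		s.has_prev = true;
		s.prev.addr = old->addr;
		s.prev.range = req_addr - old->addr;
		s.prev.bo_offset = old->bo_offset;
	}

	if (req_end < old_end) {
		/* Right piece starts later, so its BO offset moves along. */
		s.has_next = true;
		s.next.addr = req_end;
		s.next.range = old_end - req_end;
		s.next.bo_offset = old->bo_offset + (req_end - old->addr);
	}

	return s;
}
```

For an old mapping covering units 0 to 3 at BO offset n and a request
covering 1 to 2, this yields a left piece at offset n and a right piece at
offset n+2. At most one such call per end of the request is needed, since a
request can only partially overlap one mapping at its start and one at its
end.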

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a79fd3549ff8..fe40ee686f6e 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
 .. kernel-doc:: drivers/gpu/drm/drm_mm.c
    :export:
 
+DRM GPU VA Manager
+==================
+
+Overview
+--------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Overview
+
+Split and Merge
+---------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Split and Merge
+
+Locking
+-------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Locking
+
+
+DRM GPU VA Manager Function References
+--------------------------------------
+
+.. kernel-doc:: include/drm/drm_gpuva_mgr.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :export:
+
 DRM Buddy Allocator
 ===================
 
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 55e66bf89334..26d21c864757 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -46,6 +46,7 @@ drm-y := \
 	drm_vblank.o \
 	drm_vblank_work.o \
 	drm_vma_manager.o \
+	drm_gpuva_mgr.o \
 	drm_writeback.o
 drm-$(CONFIG_DRM_LEGACY) += \
 	drm_agpsupport.o \
diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index ee445f4605ba..fc86650031ba 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -38,6 +38,7 @@
 #include <drm/drm_edid.h>
 #include <drm/drm_file.h>
 #include <drm/drm_gem.h>
+#include <drm/drm_gpuva_mgr.h>
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -160,6 +161,61 @@ static const struct file_operations drm_debugfs_fops = {
 	.release = single_release,
 };
 
+/**
+ * drm_debugfs_gpuva_info - dump the given DRM GPU VA space
+ * @m: pointer to the &seq_file to write
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ *
+ * Dumps the GPU VA regions and mappings of a given DRM GPU VA manager.
+ *
+ * For each DRM GPU VA space drivers should call this function from their
+ * &drm_info_list's show callback.
+ *
+ * Returns: 0 on success, -ENODEV if @mgr is not initialized
+ */
+int drm_debugfs_gpuva_info(struct seq_file *m,
+			   struct drm_gpuva_manager *mgr)
+{
+	DRM_GPUVA_ITER(it, mgr, 0);
+	DRM_GPUVA_REGION_ITER(__it, mgr, 0);
+
+	if (!mgr->name)
+		return -ENODEV;
+
+	seq_printf(m, "DRM GPU VA space (%s)\n", mgr->name);
+	seq_puts  (m, "\n");
+	seq_puts  (m, " VA regions  | start              | range              | end                | sparse\n");
+	seq_puts  (m, "------------------------------------------------------------------------------------\n");
+	seq_printf(m, " VA space    | 0x%016llx | 0x%016llx | 0x%016llx |   -\n",
+		   mgr->mm_start, mgr->mm_range, mgr->mm_start + mgr->mm_range);
+	seq_puts  (m, "-----------------------------------------------------------------------------------\n");
+	drm_gpuva_iter_for_each(__it) {
+		struct drm_gpuva_region *reg = __it.reg;
+
+		if (reg == &mgr->kernel_alloc_region) {
+			seq_printf(m, " kernel node | 0x%016llx | 0x%016llx | 0x%016llx |   -\n",
+				   reg->va.addr, reg->va.range, reg->va.addr + reg->va.range);
+			continue;
+		}
+
+		seq_printf(m, "             | 0x%016llx | 0x%016llx | 0x%016llx | %s\n",
+			   reg->va.addr, reg->va.range, reg->va.addr + reg->va.range,
+			   reg->sparse ? "true" : "false");
+	}
+	seq_puts(m, "\n");
+	seq_puts(m, " VAs | start              | range              | end                | object             | object offset\n");
+	seq_puts(m, "-------------------------------------------------------------------------------------------------------------\n");
+	drm_gpuva_iter_for_each(it) {
+		struct drm_gpuva *va = it.va;
+
+		seq_printf(m, "     | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx\n",
+			   va->va.addr, va->va.range, va->va.addr + va->va.range,
+			   (u64)va->gem.obj, va->gem.offset);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(drm_debugfs_gpuva_info);
 
 /**
  * drm_debugfs_create_files - Initialize a given set of debugfs files for DRM
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 59a0bb5ebd85..65115fe88627 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
 	if (!obj->resv)
 		obj->resv = &obj->_resv;
 
+	if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
+		drm_gem_gpuva_init(obj);
+
 	drm_vma_node_reset(&obj->vma_node);
 	INIT_LIST_HEAD(&obj->lru_node);
 }
diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
new file mode 100644
index 000000000000..c359db7924ba
--- /dev/null
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -0,0 +1,1891 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *     Danilo Krummrich <dakr@redhat.com>
+ *
+ */
+
+#include <drm/drm_gem.h>
+#include <drm/drm_gpuva_mgr.h>
+
+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by struct drm_gpuva_manager, keeps
+ * track of a GPU's virtual address (VA) space and manages the corresponding
+ * virtual mappings represented by &drm_gpuva objects. It also keeps track of
+ * the mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
+ * &drm_gpuva objects representing all existent GPU VA mappings using this
+ * &drm_gem_object as backing buffer.
+ *
+ * If the &DRM_GPUVA_MANAGER_REGIONS feature is enabled, a GPU VA mapping can
+ * only be created within a previously allocated &drm_gpuva_region, which
+ * represents a reserved portion of the GPU VA space. GPU VA mappings are not
+ * allowed to span over a &drm_gpuva_region's boundary.
+ *
+ * GPU VA regions can also be flagged as sparse, which allows drivers to create
+ * sparse mappings for a whole GPU VA region in order to support Vulkan
+ * 'Sparse Resources'.
+ *
+ * The GPU VA manager internally uses &maple_tree structures to manage the
+ * &drm_gpuva mappings and the &drm_gpuva_regions within a GPU's virtual address
+ * space.
+ *
+ * Besides the GPU VA space regions (&drm_gpuva_region) allocated by a driver
+ * the &drm_gpuva_manager contains a special region representing the portion of
+ * VA space reserved by the kernel. This node is initialized together with the
+ * GPU VA manager instance and removed when the GPU VA manager is destroyed.
+ *
+ * In a typical application, drivers embed struct drm_gpuva_manager,
+ * struct drm_gpuva_region and struct drm_gpuva within their own driver
+ * specific structures; the manager performs no memory allocations of its own,
+ * nor does it allocate &drm_gpuva or &drm_gpuva_region entries.
+ */
+
+/**
+ * DOC: Split and Merge
+ *
+ * The DRM GPU VA manager also provides an algorithm implementing splitting and
+ * merging of existent GPU VA mappings with the ones that are requested to be
+ * mapped or unmapped. This feature is required by the Vulkan API to implement
+ * Vulkan 'Sparse Memory Bindings' - driver UAPIs often refer to this as
+ * VM BIND.
+ *
+ * Drivers can call drm_gpuva_sm_map() to receive a sequence of callbacks
+ * containing map, unmap and remap operations for a given newly requested
+ * mapping. The sequence of callbacks represents the set of operations to
+ * execute in order to integrate the new mapping cleanly into the current state
+ * of the GPU VA space.
+ *
+ * Depending on how the new GPU VA mapping intersects with the existent mappings
+ * of the GPU VA space the &drm_gpuva_fn_ops callbacks contain an arbitrary
+ * number of unmap operations, a maximum of two remap operations and a single
+ * map operation. The caller might receive no callback at all if no operation is
+ * required, e.g. if the requested mapping already exists in the exact same way.
+ *
+ * The single map operation, if existent, represents the original map operation
+ * requested by the caller. Please note that this operation might differ from
+ * the original map operation, e.g. because it was merged with an already
+ * existent mapping. Hence, drivers must execute this map operation instead of
+ * the original one passed to drm_gpuva_sm_map().
+ *
+ * &drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the
+ * &drm_gpuva to unmap is physically contiguous with the original mapping
+ * request. Optionally, if 'keep' is set, drivers may keep the actual page table
+ * entries for this &drm_gpuva, add only the missing page table entries and
+ * update the &drm_gpuva_manager's view of things accordingly.
+ *
+ * Drivers may do the same optimization, namely delta page table updates, also
+ * for remap operations. This is possible since &drm_gpuva_op_remap consists of
+ * one unmap operation and one or two map operations, such that drivers can
+ * derive the page table update delta accordingly.
+ *
+ * Note that there can't be more than two existent mappings to split up, one at
+ * the beginning and one at the end of the new mapping, hence there is a
+ * maximum of two remap operations.
+ *
+ * Generally, the DRM GPU VA manager never merges mappings across the
+ * boundaries of &drm_gpuva_regions. This is the case since merging between
+ * GPU VA regions would result in unmap and map operations being issued for
+ * both regions involved although the original mapping request referred to
+ * one specific GPU VA region only. Since the other GPU VA region, the one not
+ * explicitly requested to be altered, might be in use by the GPU, we are not
+ * allowed to issue any map/unmap operations for this region.
+ *
+ * To update the &drm_gpuva_manager's view of the GPU VA space
+ * drm_gpuva_insert() and drm_gpuva_remove() should be used.
+ *
+ * Analogous to drm_gpuva_sm_map() drm_gpuva_sm_unmap() uses &drm_gpuva_fn_ops
+ * to call back into the driver in order to unmap a range of GPU VA space. The
+ * logic behind this function is way simpler though: For all existent mappings
+ * enclosed by the given range unmap operations are created. For mappings which
+ * are only partially located within the given range, remap operations are
+ * created such that those mappings are split up and re-mapped partially.
+ *
+ * The following diagram depicts the basic relationships of existent GPU VA
+ * mappings, a newly requested mapping and the resulting mappings as implemented
+ * by drm_gpuva_sm_map() - it doesn't cover any arbitrary combinations of these.
+ *
+ * 1) Requested mapping is identical, hence noop.
+ *
+ *    ::
+ *
+ *	     0     a     1
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	     0     a     1
+ *	req: |-----------| (bo_offset=n)
+ *
+ *	     0     a     1
+ *	new: |-----------| (bo_offset=n)
+ *
+ *
+ * 2) Requested mapping is identical, except for the BO offset, hence replace
+ *    the mapping.
+ *
+ *    ::
+ *
+ *	     0     a     1
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	     0     a     1
+ *	req: |-----------| (bo_offset=m)
+ *
+ *	     0     a     1
+ *	new: |-----------| (bo_offset=m)
+ *
+ *
+ * 3) Requested mapping is identical, except for the backing BO, hence replace
+ *    the mapping.
+ *
+ *    ::
+ *
+ *	     0     a     1
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	     0     b     1
+ *	req: |-----------| (bo_offset=n)
+ *
+ *	     0     b     1
+ *	new: |-----------| (bo_offset=n)
+ *
+ *
+ * 4) Existent mapping is a left aligned subset of the requested one, hence
+ *    replace the existent one.
+ *
+ *    ::
+ *
+ *	     0  a  1
+ *	old: |-----|       (bo_offset=n)
+ *
+ *	     0     a     2
+ *	req: |-----------| (bo_offset=n)
+ *
+ *	     0     a     2
+ *	new: |-----------| (bo_offset=n)
+ *
+ *    .. note::
+ *       We expect to see the same result for a request with a different BO
+ *       and/or non-contiguous BO offset.
+ *
+ *
+ * 5) Requested mapping's range is a left aligned subset of the existent one,
+ *    but backed by a different BO. Hence, map the requested mapping and split
+ *    the existent one adjusting its BO offset.
+ *
+ *    ::
+ *
+ *	     0     a     2
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	     0  b  1
+ *	req: |-----|       (bo_offset=n)
+ *
+ *	     0  b  1  a' 2
+ *	new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)
+ *
+ *    .. note::
+ *       We expect to see the same result for a request with a different BO
+ *       and/or non-contiguous BO offset.
+ *
+ *
+ * 6) Existent mapping is a superset of the requested mapping, hence noop.
+ *
+ *    ::
+ *
+ *	     0     a     2
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	     0  a  1
+ *	req: |-----|       (bo_offset=n)
+ *
+ *	     0     a     2
+ *	new: |-----------| (bo_offset=n)
+ *
+ *
+ * 7) Requested mapping's range is a right aligned subset of the existent one,
+ *    but backed by a different BO. Hence, map the requested mapping and split
+ *    the existent one, without adjusting the BO offset.
+ *
+ *    ::
+ *
+ *	     0     a     2
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	           1  b  2
+ *	req:       |-----| (bo_offset=m)
+ *
+ *	     0  a  1  b  2
+ *	new: |-----|-----| (a.bo_offset=n,b.bo_offset=m)
+ *
+ *
+ * 8) Existent mapping is a superset of the requested mapping, hence noop.
+ *
+ *    ::
+ *
+ *	     0     a     2
+ *	old: |-----------| (bo_offset=n)
+ *
+ *	           1  a  2
+ *	req:       |-----| (bo_offset=n+1)
+ *
+ *	     0     a     2
+ *	new: |-----------| (bo_offset=n)
+ *
+ *
+ * 9) Existent mapping is overlapped at the end by the requested mapping backed
+ *    by a different BO. Hence, map the requested mapping and split up the
+ *    existent one, without adjusting the BO offset.
+ *
+ *    ::
+ *
+ *	     0     a     2
+ *	old: |-----------|       (bo_offset=n)
+ *
+ *	           1     b     3
+ *	req:       |-----------| (bo_offset=m)
+ *
+ *	     0  a  1     b     3
+ *	new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
+ *
+ *
+ * 10) Existent mapping is overlapped by the requested mapping, both having the
+ *     same backing BO with a contiguous offset. Hence, merge both mappings.
+ *
+ *     ::
+ *
+ *	      0     a     2
+ *	 old: |-----------|       (bo_offset=n)
+ *
+ *	            1     a     3
+ *	 req:       |-----------| (bo_offset=n+1)
+ *
+ *	      0        a        3
+ *	 new: |-----------------| (bo_offset=n)
+ *
+ *
+ * 11) Requested mapping's range is a centered subset of the existent one
+ *     having a different backing BO. Hence, map the requested mapping and split
+ *     up the existent one in two mappings, adjusting the BO offset of the right
+ *     one accordingly.
+ *
+ *     ::
+ *
+ *	      0        a        3
+ *	 old: |-----------------| (bo_offset=n)
+ *
+ *	            1  b  2
+ *	 req:       |-----|       (bo_offset=m)
+ *
+ *	      0  a  1  b  2  a' 3
+ *	 new: |-----|-----|-----| (a.bo_offset=n,b.bo_offset=m,a'.bo_offset=n+2)
+ *
+ *
+ * 12) Requested mapping is a contiguous subset of the existent one, hence noop.
+ *
+ *     ::
+ *
+ *	      0        a        3
+ *	 old: |-----------------| (bo_offset=n)
+ *
+ *	            1  a  2
+ *	 req:       |-----|       (bo_offset=n+1)
+ *
+ *	      0        a        3
+ *	 new: |-----------------| (bo_offset=n)
+ *
+ *
+ * 13) Existent mapping is a right aligned subset of the requested one, hence
+ *     replace the existent one.
+ *
+ *     ::
+ *
+ *	            1  a  2
+ *	 old:       |-----| (bo_offset=n+1)
+ *
+ *	      0     a     2
+ *	 req: |-----------| (bo_offset=n)
+ *
+ *	      0     a     2
+ *	 new: |-----------| (bo_offset=n)
+ *
+ *     .. note::
+ *        We expect to see the same result for a request with a different bo
+ *        and/or non-contiguous bo_offset.
+ *
+ *
+ * 14) Existent mapping is a centered subset of the requested one, hence
+ *     replace the existent one.
+ *
+ *     ::
+ *
+ *	            1  a  2
+ *	 old:       |-----| (bo_offset=n+1)
+ *
+ *	      0        a       3
+ *	 req: |----------------| (bo_offset=n)
+ *
+ *	      0        a       3
+ *	 new: |----------------| (bo_offset=n)
+ *
+ *     .. note::
+ *        We expect to see the same result for a request with a different bo
+ *        and/or non-contiguous bo_offset.
+ *
+ *
+ * 15) Existent mapping is overlapped at the beginning by the requested mapping
+ *     backed by a different BO. Hence, map the requested mapping and split up
+ *     the existent one, adjusting its BO offset accordingly.
+ *
+ *     ::
+ *
+ *	            1     a     3
+ *	 old:       |-----------| (bo_offset=n)
+ *
+ *	      0     b     2
+ *	 req: |-----------|       (bo_offset=m)
+ *
+ *	      0     b     2  a' 3
+ *	 new: |-----------|-----| (b.bo_offset=m,a.bo_offset=n+1)
+ *
+ *
+ * 16) Requested mapping fills the gap between two existent mappings all having
+ *     the same backing BO, such that all three have a contiguous BO offset.
+ *     Hence, merge all mappings.
+ *
+ *     ::
+ *
+ *	      0     a     1
+ *	 old: |-----------|                        (bo_offset=n)
+ *
+ *	                             2     a     3
+ *	 old':                       |-----------| (bo_offset=n+2)
+ *
+ *	                 1     a     2
+ *	 req:            |-----------|             (bo_offset=n+1)
+ *
+ *	                       a
+ *	 new: |----------------------------------| (bo_offset=n)
+ */
+
+/**
+ * DOC: Locking
+ *
+ * Generally, the GPU VA manager does not take care of locking itself; it is
+ * the driver's responsibility to take care of locking. Drivers might want to
+ * protect the following operations: inserting, removing and iterating
+ * &drm_gpuva and &drm_gpuva_region objects as well as generating all kinds of
+ * operations, such as split / merge or prefetch.
+ *
+ * The GPU VA manager also does not take care of the locking of the backing
+ * &drm_gem_object buffers' GPU VA lists by itself; drivers are responsible for
+ * enforcing mutual exclusion.
+ */
+
+ /*
+  * Maple Tree Locking
+  *
+  * The maple tree's advanced API requires the user of the API to protect
+  * certain tree operations with a lock (either the external or internal tree
+  * lock) for tree internal reasons.
+  *
+  * The actual rules (when to acquire/release the lock) are enforced by lockdep
+  * through the maple tree implementation.
+  *
+  * For this reason the DRM GPUVA manager takes the maple tree's internal
+  * spinlock according to the lockdep enforced rules.
+  *
+  * Please note that this lock is *only* meant to fulfill the maple tree's
+  * requirements and intentionally does not protect the DRM GPUVA manager
+  * against concurrent access.
+  *
+  * The following mail thread provides more details on why the maple tree
+  * has this requirement.
+  *
+  * https://lore.kernel.org/lkml/20230217134422.14116-5-dakr@redhat.com/
+  */
+
+static int __drm_gpuva_region_insert(struct drm_gpuva_manager *mgr,
+				     struct drm_gpuva_region *reg);
+static void __drm_gpuva_region_remove(struct drm_gpuva_region *reg);
+
+/**
+ * drm_gpuva_manager_init - initialize a &drm_gpuva_manager
+ * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @name: the name of the GPU VA space
+ * @start_offset: the start offset of the GPU VA space
+ * @range: the size of the GPU VA space
+ * @reserve_offset: the start of the kernel reserved GPU VA area
+ * @reserve_range: the size of the kernel reserved GPU VA area
+ * @ops: &drm_gpuva_fn_ops called on &drm_gpuva_sm_map / &drm_gpuva_sm_unmap
+ * @flags: the feature flags of the &drm_gpuva_manager
+ *
+ * The &drm_gpuva_manager must be initialized with this function before use.
+ *
+ * Note that @mgr must be cleared to 0 before calling this function. The given
+ * @name is expected to be managed by the surrounding driver structures.
+ */
+void
+drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+		       const char *name,
+		       u64 start_offset, u64 range,
+		       u64 reserve_offset, u64 reserve_range,
+		       struct drm_gpuva_fn_ops *ops,
+		       enum drm_gpuva_mgr_flags flags)
+{
+	mt_init(&mgr->region_mt);
+	mt_init(&mgr->va_mt);
+
+	mgr->mm_start = start_offset;
+	mgr->mm_range = range;
+
+	mgr->name = name ? name : "unknown";
+	mgr->ops = ops;
+	mgr->flags = flags;
+
+	memset(&mgr->kernel_alloc_region, 0, sizeof(struct drm_gpuva_region));
+
+	if (reserve_offset || reserve_range) {
+		mgr->kernel_alloc_region.va.addr = reserve_offset;
+		mgr->kernel_alloc_region.va.range = reserve_range;
+
+		__drm_gpuva_region_insert(mgr, &mgr->kernel_alloc_region);
+	}
+}
+EXPORT_SYMBOL(drm_gpuva_manager_init);
+
+/**
+ * drm_gpuva_manager_destroy - cleanup a &drm_gpuva_manager
+ * @mgr: pointer to the &drm_gpuva_manager to clean up
+ *
+ * Note that it is a bug to call this function on a manager that still
+ * holds GPU VA mappings.
+ */
+void
+drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
+{
+	mgr->name = NULL;
+	if (mgr->kernel_alloc_region.va.addr ||
+	    mgr->kernel_alloc_region.va.range)
+		__drm_gpuva_region_remove(&mgr->kernel_alloc_region);
+
+	mtree_lock(&mgr->va_mt);
+	WARN(!mtree_empty(&mgr->va_mt),
+	     "GPUVA tree is not empty, potentially leaking memory.");
+	__mt_destroy(&mgr->va_mt);
+	mtree_unlock(&mgr->va_mt);
+
+	mtree_lock(&mgr->region_mt);
+	WARN(!mtree_empty(&mgr->region_mt),
+	     "GPUVA region tree is not empty, potentially leaking memory.");
+	__mt_destroy(&mgr->region_mt);
+	mtree_unlock(&mgr->region_mt);
+}
+EXPORT_SYMBOL(drm_gpuva_manager_destroy);
+
+static inline bool
+drm_gpuva_in_mm_range(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+	u64 end = addr + range;
+	u64 mm_start = mgr->mm_start;
+	u64 mm_end = mm_start + mgr->mm_range;
+
+	return addr < mm_end && mm_start < end;
+}
+
+static inline bool
+drm_gpuva_in_kernel_region(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+	u64 end = addr + range;
+	u64 kstart = mgr->kernel_alloc_region.va.addr;
+	u64 kend = kstart + mgr->kernel_alloc_region.va.range;
+
+	return addr < kend && kstart < end;
+}
+
+static struct drm_gpuva_region *
+drm_gpuva_in_region(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+	DRM_GPUVA_REGION_ITER(it, mgr, addr);
+
+	/* Find the VA region the requested range is strictly enclosed by. */
+	drm_gpuva_iter_for_each_range(it, addr + range) {
+		struct drm_gpuva_region *reg = it.reg;
+
+		if (reg->va.addr <= addr &&
+		    reg->va.addr + reg->va.range >= addr + range &&
+		    reg != &mgr->kernel_alloc_region)
+			return reg;
+	}
+
+	return NULL;
+}
+
+static bool
+drm_gpuva_in_any_region(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+	return !!drm_gpuva_in_region(mgr, addr, range);
+}
+
+/**
+ * drm_gpuva_iter_remove - removes the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ *
+ * This removes the element the iterator currently points to.
+ */
+void
+drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
+{
+	mas_lock(&it->mas);
+	mas_erase(&it->mas);
+	mas_unlock(&it->mas);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_remove);
+
+static int
+drm_gpuva_iter_common_replace(struct drm_gpuva_iterator *it,
+			      u64 addr, u64 range, void *entry)
+{
+	struct ma_state *mas = &it->mas;
+	u64 last = addr + range - 1;
+	int ret;
+
+	if (unlikely(addr < mas->index ||
+		     last > mas->last))
+		return -EINVAL;
+
+	mas_lock(mas);
+
+	ret = mas_preallocate(mas, entry, GFP_KERNEL);
+	if (ret)
+		goto err_unlock;
+
+	mas_erase(mas);
+
+	mas->index = addr;
+	mas->last = last;
+	mas_store_prealloc(mas, entry);
+
+	mas_unlock(mas);
+
+	return 0;
+
+err_unlock:
+	mas_unlock(mas);
+	return ret;
+}
+
+/**
+ * drm_gpuva_iter_va_replace - replaces the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ * @va: the &drm_gpuva to insert
+ *
+ * This replaces the element the iterator currently points to.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_iter_va_replace(struct drm_gpuva_iterator *it,
+			  struct drm_gpuva *va)
+{
+	u64 addr = va->va.addr;
+	u64 range = va->va.range;
+
+	return drm_gpuva_iter_common_replace(it, addr, range, va);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_va_replace);
+
+/**
+ * drm_gpuva_iter_region_replace - replaces the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ * @reg: the &drm_gpuva_region to insert
+ *
+ * This replaces the element the iterator currently points to.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_iter_region_replace(struct drm_gpuva_iterator *it,
+			      struct drm_gpuva_region *reg)
+{
+	u64 addr = reg->va.addr;
+	u64 range = reg->va.range;
+
+	return drm_gpuva_iter_common_replace(it, addr, range, reg);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_region_replace);
+
+/**
+ * drm_gpuva_insert - insert a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ *
+ * Insert a &drm_gpuva into a &drm_gpuva_manager. The address and range of the
+ * mapping are taken from the va.addr and va.range fields of @va.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove(),
+ * drm_gpuva_iter_va_replace() or drm_gpuva_iter_region_replace() instead.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+		 struct drm_gpuva *va)
+{
+	u64 addr = va->va.addr;
+	u64 range = va->va.range;
+	u64 last = addr + range - 1;
+	MA_STATE(mas, &mgr->va_mt, addr, addr);
+	struct drm_gpuva_region *reg = NULL;
+	int ret;
+
+	if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
+		return -EINVAL;
+
+	if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
+		return -EINVAL;
+
+	if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
+		reg = drm_gpuva_in_region(mgr, addr, range);
+		if (unlikely(!reg))
+			return -EINVAL;
+	}
+
+	mas_lock(&mas);
+
+	if (unlikely(mas_walk(&mas))) {
+		ret = -EEXIST;
+		goto err_unlock;
+	}
+
+	if (unlikely(mas.last < last)) {
+		ret = -EEXIST;
+		goto err_unlock;
+	}
+
+	mas.index = addr;
+	mas.last = last;
+	ret = mas_store_gfp(&mas, va, GFP_KERNEL);
+	if (unlikely(ret))
+		goto err_unlock;
+
+	mas_unlock(&mas);
+
+	va->mgr = mgr;
+	va->region = reg;
+
+	return 0;
+
+err_unlock:
+	mas_unlock(&mas);
+	return ret;
+}
+EXPORT_SYMBOL(drm_gpuva_insert);
+
+/**
+ * drm_gpuva_remove - remove a &drm_gpuva
+ * @va: the &drm_gpuva to remove
+ *
+ * This removes the given &va from the underlying tree.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove(),
+ * drm_gpuva_iter_va_replace() or drm_gpuva_iter_region_replace() instead.
+ */
+void
+drm_gpuva_remove(struct drm_gpuva *va)
+{
+	MA_STATE(mas, &va->mgr->va_mt, va->va.addr, 0);
+
+	mas_lock(&mas);
+	mas_erase(&mas);
+	mas_unlock(&mas);
+}
+EXPORT_SYMBOL(drm_gpuva_remove);
+
+/**
+ * drm_gpuva_link - link a &drm_gpuva
+ * @va: the &drm_gpuva to link
+ *
+ * This adds the given &va to the GPU VA list of the &drm_gem_object it is
+ * associated with.
+ *
+ * This function expects the caller to protect the GEM's GPUVA list against
+ * concurrent access.
+ */
+void
+drm_gpuva_link(struct drm_gpuva *va)
+{
+	if (likely(va->gem.obj))
+		list_add_tail(&va->head, &va->gem.obj->gpuva.list);
+}
+EXPORT_SYMBOL(drm_gpuva_link);
+
+/**
+ * drm_gpuva_unlink - unlink a &drm_gpuva
+ * @va: the &drm_gpuva to unlink
+ *
+ * This removes the given &va from the GPU VA list of the &drm_gem_object it is
+ * associated with.
+ *
+ * This function expects the caller to protect the GEM's GPUVA list against
+ * concurrent access.
+ */
+void
+drm_gpuva_unlink(struct drm_gpuva *va)
+{
+	if (likely(va->gem.obj))
+		list_del_init(&va->head);
+}
+EXPORT_SYMBOL(drm_gpuva_unlink);
+
+/**
+ * drm_gpuva_find_first - find the first &drm_gpuva in the given range
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the start address of the range to search
+ * @range: the size of the range to search
+ *
+ * Returns: the first &drm_gpuva within the given range
+ */
+struct drm_gpuva *
+drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+		     u64 addr, u64 range)
+{
+	MA_STATE(mas, &mgr->va_mt, addr, 0);
+	struct drm_gpuva *va;
+
+	mas_lock(&mas);
+	va = mas_find(&mas, addr + range - 1);
+	mas_unlock(&mas);
+
+	return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_first);
+
+/**
+ * drm_gpuva_find - find a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the &drm_gpuva's address
+ * @range: the &drm_gpuva's range
+ *
+ * Returns: the &drm_gpuva at a given &addr and with a given &range
+ */
+struct drm_gpuva *
+drm_gpuva_find(struct drm_gpuva_manager *mgr,
+	       u64 addr, u64 range)
+{
+	struct drm_gpuva *va;
+
+	va = drm_gpuva_find_first(mgr, addr, range);
+	if (!va)
+		goto out;
+
+	if (va->va.addr != addr ||
+	    va->va.range != range)
+		goto out;
+
+	return va;
+
+out:
+	return NULL;
+}
+EXPORT_SYMBOL(drm_gpuva_find);
+
+/**
+ * drm_gpuva_find_prev - find the &drm_gpuva before the given address
+ * @mgr: the &drm_gpuva_manager to search in
+ * @start: the given GPU VA's start address
+ *
+ * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
+ *
+ * Note that if there is any free space between the GPU VA mappings no mapping
+ * is returned.
+ *
+ * Returns: a pointer to the found &drm_gpuva or NULL if none was found
+ */
+struct drm_gpuva *
+drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
+{
+	MA_STATE(mas, &mgr->va_mt, start - 1, 0);
+	struct drm_gpuva *va;
+
+	if (start <= mgr->mm_start ||
+	    start > (mgr->mm_start + mgr->mm_range))
+		return NULL;
+
+	mas_lock(&mas);
+	va = mas_walk(&mas);
+	mas_unlock(&mas);
+
+	return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_prev);
+
+/**
+ * drm_gpuva_find_next - find the &drm_gpuva after the given address
+ * @mgr: the &drm_gpuva_manager to search in
+ * @end: the given GPU VA's end address
+ *
+ * Find the adjacent &drm_gpuva after the GPU VA with given &end address.
+ *
+ * Note that if there is any free space between the GPU VA mappings no mapping
+ * is returned.
+ *
+ * Returns: a pointer to the found &drm_gpuva or NULL if none was found
+ */
+struct drm_gpuva *
+drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end)
+{
+	MA_STATE(mas, &mgr->va_mt, end, 0);
+	struct drm_gpuva *va;
+
+	if (end < mgr->mm_start ||
+	    end >= (mgr->mm_start + mgr->mm_range))
+		return NULL;
+
+	mas_lock(&mas);
+	va = mas_walk(&mas);
+	mas_unlock(&mas);
+
+	return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_next);
+
+static int
+__drm_gpuva_region_insert(struct drm_gpuva_manager *mgr,
+			  struct drm_gpuva_region *reg)
+{
+	u64 addr = reg->va.addr;
+	u64 range = reg->va.range;
+	u64 last = addr + range - 1;
+	MA_STATE(mas, &mgr->region_mt, addr, addr);
+	int ret;
+
+	if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
+		return -EINVAL;
+
+	mas_lock(&mas);
+
+	if (unlikely(mas_walk(&mas))) {
+		ret = -EEXIST;
+		goto err_unlock;
+	}
+
+	if (unlikely(mas.last < last)) {
+		ret = -EEXIST;
+		goto err_unlock;
+	}
+
+	mas.index = addr;
+	mas.last = last;
+	ret = mas_store_gfp(&mas, reg, GFP_KERNEL);
+	if (unlikely(ret))
+		goto err_unlock;
+
+	mas_unlock(&mas);
+
+	reg->mgr = mgr;
+
+	return 0;
+
+err_unlock:
+	mas_unlock(&mas);
+	return ret;
+}
+
+/**
+ * drm_gpuva_region_insert - insert a &drm_gpuva_region
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva_region in
+ * @reg: the &drm_gpuva_region to insert
+ *
+ * Insert a &drm_gpuva_region into a &drm_gpuva_manager. The address and range
+ * of the region are taken from the va.addr and va.range fields of @reg.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove(),
+ * drm_gpuva_iter_va_replace() or drm_gpuva_iter_region_replace() instead.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_region_insert(struct drm_gpuva_manager *mgr,
+			struct drm_gpuva_region *reg)
+{
+	if (unlikely(!(mgr->flags & DRM_GPUVA_MANAGER_REGIONS)))
+		return -EINVAL;
+
+	return __drm_gpuva_region_insert(mgr, reg);
+}
+EXPORT_SYMBOL(drm_gpuva_region_insert);
+
+static void
+__drm_gpuva_region_remove(struct drm_gpuva_region *reg)
+{
+	struct drm_gpuva_manager *mgr = reg->mgr;
+	MA_STATE(mas, &mgr->region_mt, reg->va.addr, 0);
+
+	mas_lock(&mas);
+	mas_erase(&mas);
+	mas_unlock(&mas);
+}
+
+/**
+ * drm_gpuva_region_remove - remove a &drm_gpuva_region
+ * @reg: the &drm_gpuva_region to remove
+ *
+ * This removes the given &reg from the underlying tree.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove(),
+ * drm_gpuva_iter_va_replace() or drm_gpuva_iter_region_replace() instead.
+ */
+void
+drm_gpuva_region_remove(struct drm_gpuva_region *reg)
+{
+	struct drm_gpuva_manager *mgr = reg->mgr;
+
+	if (unlikely(!(mgr->flags & DRM_GPUVA_MANAGER_REGIONS)))
+		return;
+
+	if (unlikely(reg == &mgr->kernel_alloc_region)) {
+		WARN(1, "Can't destroy kernel reserved region.\n");
+		return;
+	}
+
+	if (unlikely(!drm_gpuva_region_empty(reg)))
+		WARN(1, "GPU VA region should be empty on destroy.\n");
+
+	__drm_gpuva_region_remove(reg);
+}
+EXPORT_SYMBOL(drm_gpuva_region_remove);
+
+/**
+ * drm_gpuva_region_empty - indicate whether a &drm_gpuva_region is empty
+ * @reg: the &drm_gpuva_region to check
+ *
+ * Returns: true if the &drm_gpuva_region is empty, false otherwise
+ */
+bool
+drm_gpuva_region_empty(struct drm_gpuva_region *reg)
+{
+	DRM_GPUVA_ITER(it, reg->mgr, reg->va.addr);
+
+	drm_gpuva_iter_for_each_range(it, reg->va.addr + reg->va.range)
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL(drm_gpuva_region_empty);
+
+/**
+ * drm_gpuva_region_find_first - find the first &drm_gpuva_region in the given
+ * range
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the start address of the range to search
+ * @range: the size of the range to search
+ *
+ * Returns: the first &drm_gpuva_region within the given range
+ */
+struct drm_gpuva_region *
+drm_gpuva_region_find_first(struct drm_gpuva_manager *mgr,
+			    u64 addr, u64 range)
+{
+	MA_STATE(mas, &mgr->region_mt, addr, 0);
+	struct drm_gpuva_region *reg;
+
+	mas_lock(&mas);
+	reg = mas_find(&mas, addr + range - 1);
+	mas_unlock(&mas);
+
+	return reg;
+}
+EXPORT_SYMBOL(drm_gpuva_region_find_first);
+
+/**
+ * drm_gpuva_region_find - find a &drm_gpuva_region
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the &drm_gpuva_region's address
+ * @range: the &drm_gpuva_region's range
+ *
+ * Returns: the &drm_gpuva_region at a given &addr and with a given &range
+ */
+struct drm_gpuva_region *
+drm_gpuva_region_find(struct drm_gpuva_manager *mgr,
+		      u64 addr, u64 range)
+{
+	struct drm_gpuva_region *reg;
+
+	reg = drm_gpuva_region_find_first(mgr, addr, range);
+	if (!reg)
+		goto out;
+
+	if (reg->va.addr != addr ||
+	    reg->va.range != range)
+		goto out;
+
+	return reg;
+
+out:
+	return NULL;
+}
+EXPORT_SYMBOL(drm_gpuva_region_find);
+
+static int
+op_map_cb(int (*step)(struct drm_gpuva_op *, void *),
+	  void *priv,
+	  u64 addr, u64 range,
+	  struct drm_gem_object *obj, u64 offset)
+{
+	struct drm_gpuva_op op = {};
+
+	op.op = DRM_GPUVA_OP_MAP;
+	op.map.va.addr = addr;
+	op.map.va.range = range;
+	op.map.gem.obj = obj;
+	op.map.gem.offset = offset;
+
+	return step(&op, priv);
+}
+
+static int
+op_remap_cb(int (*step)(struct drm_gpuva_op *, void *),
+	    void *priv,
+	    struct drm_gpuva_op_map *prev,
+	    struct drm_gpuva_op_map *next,
+	    struct drm_gpuva_op_unmap *unmap)
+{
+	struct drm_gpuva_op op = {};
+	struct drm_gpuva_op_remap *r;
+
+	op.op = DRM_GPUVA_OP_REMAP;
+	r = &op.remap;
+	r->prev = prev;
+	r->next = next;
+	r->unmap = unmap;
+
+	return step(&op, priv);
+}
+
+static int
+op_unmap_cb(int (*step)(struct drm_gpuva_op *, void *),
+	    void *priv,
+	    struct drm_gpuva *va, bool merge)
+{
+	struct drm_gpuva_op op = {};
+
+	op.op = DRM_GPUVA_OP_UNMAP;
+	op.unmap.va = va;
+	op.unmap.keep = merge;
+
+	return step(&op, priv);
+}
+
+static inline bool
+gpuva_should_merge(struct drm_gpuva_manager *mgr, struct drm_gpuva *va)
+{
+	/* Never merge mappings with NULL GEMs. */
+	return mgr->flags & DRM_GPUVA_MANAGER_REGIONS && !!va->gem.obj;
+}
+
+static int
+__drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
+		   struct drm_gpuva_fn_ops *ops, void *priv,
+		   u64 req_addr, u64 req_range,
+		   struct drm_gem_object *req_obj, u64 req_offset)
+{
+	DRM_GPUVA_ITER(it, mgr, req_addr);
+	int (*step)(struct drm_gpuva_op *, void *);
+	struct drm_gpuva *va, *prev = NULL;
+	u64 req_end = req_addr + req_range;
+	bool skip_pmerge = false, skip_nmerge = false;
+	int ret;
+
+	step = ops->sm_map_step;
+
+	if (unlikely(!drm_gpuva_in_mm_range(mgr, req_addr, req_range)))
+		return -EINVAL;
+
+	if (unlikely(drm_gpuva_in_kernel_region(mgr, req_addr, req_range)))
+		return -EINVAL;
+
+	if ((mgr->flags & DRM_GPUVA_MANAGER_REGIONS) &&
+	    !drm_gpuva_in_any_region(mgr, req_addr, req_range))
+		return -EINVAL;
+
+	drm_gpuva_iter_for_each_range(it, req_end) {
+		struct drm_gpuva *va = it.va;
+		struct drm_gem_object *obj = va->gem.obj;
+		u64 offset = va->gem.offset;
+		u64 addr = va->va.addr;
+		u64 range = va->va.range;
+		u64 end = addr + range;
+		bool merge = gpuva_should_merge(mgr, va);
+
+		/* Generally, we want to skip merging with potential mappings
+		 * left and right of the requested one when we found a
+		 * collision, since merging happens in this loop already.
+		 *
+		 * However, there is one exception when the requested mapping
+		 * spans into a free VM area. If this is the case we might
+		 * still hit the boundary of another mapping before and/or
+		 * after the free VM area.
+		 */
+		skip_pmerge = true;
+		skip_nmerge = true;
+
+		if (addr == req_addr) {
+			merge &= obj == req_obj &&
+				 offset == req_offset;
+
+			if (end == req_end) {
+				if (merge)
+					goto done;
+
+				ret = op_unmap_cb(step, priv, va, false);
+				if (ret)
+					return ret;
+				break;
+			}
+
+			if (end < req_end) {
+				skip_nmerge = false;
+				ret = op_unmap_cb(step, priv, va, merge);
+				if (ret)
+					return ret;
+				goto next;
+			}
+
+			if (end > req_end) {
+				struct drm_gpuva_op_map n = {
+					.va.addr = req_end,
+					.va.range = range - req_range,
+					.gem.obj = obj,
+					.gem.offset = offset + req_range,
+				};
+				struct drm_gpuva_op_unmap u = { .va = va };
+
+				if (merge)
+					goto done;
+
+				ret = op_remap_cb(step, priv, NULL, &n, &u);
+				if (ret)
+					return ret;
+				break;
+			}
+		} else if (addr < req_addr) {
+			u64 ls_range = req_addr - addr;
+			struct drm_gpuva_op_map p = {
+				.va.addr = addr,
+				.va.range = ls_range,
+				.gem.obj = obj,
+				.gem.offset = offset,
+			};
+			struct drm_gpuva_op_unmap u = { .va = va };
+
+			merge &= obj == req_obj &&
+				 offset + ls_range == req_offset;
+
+			if (end == req_end) {
+				if (merge)
+					goto done;
+
+				ret = op_remap_cb(step, priv, &p, NULL, &u);
+				if (ret)
+					return ret;
+				break;
+			}
+
+			if (end < req_end) {
+				u64 new_addr = addr;
+				u64 new_range = req_range + ls_range;
+				u64 new_offset = offset;
+
+				/* We validated that the requested mapping is
+				 * within a single VA region already.
+				 * Since it overlaps the current mapping (which
+				 * can't cross a VA region boundary) we can be
+				 * sure that we're still within the boundaries
+				 * of the same VA region after merging.
+				 */
+				if (merge) {
+					req_offset = new_offset;
+					req_addr = new_addr;
+					req_range = new_range;
+					ret = op_unmap_cb(step, priv, va, true);
+					if (ret)
+						return ret;
+					goto next;
+				}
+
+				ret = op_remap_cb(step, priv, &p, NULL, &u);
+				if (ret)
+					return ret;
+				goto next;
+			}
+
+			if (end > req_end) {
+				struct drm_gpuva_op_map n = {
+					.va.addr = req_end,
+					.va.range = end - req_end,
+					.gem.obj = obj,
+					.gem.offset = offset + ls_range +
+						      req_range,
+				};
+
+				if (merge)
+					goto done;
+
+				ret = op_remap_cb(step, priv, &p, &n, &u);
+				if (ret)
+					return ret;
+				break;
+			}
+		} else if (addr > req_addr) {
+			merge &= obj == req_obj &&
+				 offset == req_offset +
+					   (addr - req_addr);
+
+			if (!prev)
+				skip_pmerge = false;
+
+			if (end == req_end) {
+				ret = op_unmap_cb(step, priv, va, merge);
+				if (ret)
+					return ret;
+				break;
+			}
+
+			if (end < req_end) {
+				skip_nmerge = false;
+				ret = op_unmap_cb(step, priv, va, merge);
+				if (ret)
+					return ret;
+				goto next;
+			}
+
+			if (end > req_end) {
+				struct drm_gpuva_op_map n = {
+					.va.addr = req_end,
+					.va.range = end - req_end,
+					.gem.obj = obj,
+					.gem.offset = offset + req_end - addr,
+				};
+				struct drm_gpuva_op_unmap u = { .va = va };
+				u64 new_end = end;
+				u64 new_range = new_end - req_addr;
+
+				/* We validated that the requested mapping is
+				 * within a single VA region already.
+				 * Since it overlaps the current mapping (which
+				 * can't cross a VA region boundary) we can be
+				 * sure that we're still within the boundaries
+				 * of the same VA region after merging.
+				 */
+				if (merge) {
+					req_end = new_end;
+					req_range = new_range;
+					ret = op_unmap_cb(step, priv, va, true);
+					if (ret)
+						return ret;
+					break;
+				}
+
+				ret = op_remap_cb(step, priv, NULL, &n, &u);
+				if (ret)
+					return ret;
+				break;
+			}
+		}
+next:
+		prev = va;
+	}
+
+	va = skip_pmerge ? NULL : drm_gpuva_find_prev(mgr, req_addr);
+	if (va) {
+		struct drm_gem_object *obj = va->gem.obj;
+		u64 offset = va->gem.offset;
+		u64 addr = va->va.addr;
+		u64 range = va->va.range;
+		u64 new_offset = offset;
+		u64 new_addr = addr;
+		u64 new_range = req_range + range;
+		bool merge = gpuva_should_merge(mgr, va) &&
+			     obj == req_obj &&
+			     offset + range == req_offset;
+
+		if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS)
+			merge &= drm_gpuva_in_any_region(mgr, new_addr,
+							 new_range);
+
+		if (merge) {
+			ret = op_unmap_cb(step, priv, va, true);
+			if (ret)
+				return ret;
+
+			req_offset = new_offset;
+			req_addr = new_addr;
+			req_range = new_range;
+		}
+	}
+
+	va = skip_nmerge ? NULL : drm_gpuva_find_next(mgr, req_end);
+	if (va) {
+		struct drm_gem_object *obj = va->gem.obj;
+		u64 offset = va->gem.offset;
+		u64 addr = va->va.addr;
+		u64 range = va->va.range;
+		u64 end = addr + range;
+		u64 new_range = req_range + range;
+		u64 new_end = end;
+		bool merge = gpuva_should_merge(mgr, va) &&
+			     obj == req_obj &&
+			     offset == req_offset + req_range;
+
+		if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS)
+			merge &= drm_gpuva_in_any_region(mgr, req_addr,
+							 new_range);
+
+		if (merge) {
+			ret = op_unmap_cb(step, priv, va, true);
+			if (ret)
+				return ret;
+
+			req_range = new_range;
+			req_end = new_end;
+		}
+	}
+
+	ret = op_map_cb(step, priv,
+			req_addr, req_range,
+			req_obj, req_offset);
+	if (ret)
+		return ret;
+
+done:
+	return 0;
+}
+
+static int
+__drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
+		     struct drm_gpuva_fn_ops *ops, void *priv,
+		     u64 req_addr, u64 req_range)
+{
+	DRM_GPUVA_ITER(it, mgr, req_addr);
+	int (*step)(struct drm_gpuva_op *, void *);
+	u64 req_end = req_addr + req_range;
+	int ret;
+
+	step = ops->sm_unmap_step;
+
+	drm_gpuva_iter_for_each_range(it, req_end) {
+		struct drm_gpuva *va = it.va;
+		struct drm_gpuva_op_map prev = {}, next = {};
+		bool prev_split = false, next_split = false;
+		struct drm_gem_object *obj = va->gem.obj;
+		u64 offset = va->gem.offset;
+		u64 addr = va->va.addr;
+		u64 range = va->va.range;
+		u64 end = addr + range;
+
+		if (addr < req_addr) {
+			prev.va.addr = addr;
+			prev.va.range = req_addr - addr;
+			prev.gem.obj = obj;
+			prev.gem.offset = offset;
+
+			prev_split = true;
+		}
+
+		if (end > req_end) {
+			next.va.addr = req_end;
+			next.va.range = end - req_end;
+			next.gem.obj = obj;
+			next.gem.offset = offset + (req_end - addr);
+
+			next_split = true;
+		}
+
+		if (prev_split || next_split) {
+			struct drm_gpuva_op_unmap unmap = { .va = va };
+
+			ret = op_remap_cb(step, priv, prev_split ? &prev : NULL,
+					  next_split ? &next : NULL, &unmap);
+			if (ret)
+				return ret;
+		} else {
+			ret = op_unmap_cb(step, priv, va, false);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * drm_gpuva_sm_map - creates the &drm_gpuva_op split/merge steps
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ * @priv: pointer to a driver private data structure
+ *
+ * This function iterates the given range of the GPU VA space. It utilizes the
+ * &drm_gpuva_fn_ops to call back into the driver providing the split and merge
+ * steps.
+ *
+ * Drivers may use these callbacks to update the GPU VA space right away within
+ * the callback. In case the driver decides to copy and store the operations for
+ * later processing neither this function nor &drm_gpuva_sm_unmap is allowed to
+ * be called before the &drm_gpuva_manager's view of the GPU VA space was
+ * updated with the previous set of operations. To update the
+ * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * A sequence of callbacks can contain map, unmap and remap operations, but
+ * the sequence of callbacks might also be empty if no operation is required,
+ * e.g. if the requested mapping already exists in the exact same way.
+ *
+ * There can be an arbitrary number of unmap operations, a maximum of two remap
+ * operations and a single map operation. The latter one, if present,
+ * represents the original map operation requested by the caller. Please note
+ * that the map operation might have been modified, e.g. if it was merged with
+ * an existing mapping.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+		 u64 req_addr, u64 req_range,
+		 struct drm_gem_object *req_obj, u64 req_offset)
+{
+	if (!mgr->ops || !mgr->ops->sm_map_step)
+		return -EINVAL;
+
+	return __drm_gpuva_sm_map(mgr, mgr->ops, priv,
+				  req_addr, req_range,
+				  req_obj, req_offset);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_map);
+
+/**
+ * drm_gpuva_sm_unmap - creates the &drm_gpuva_ops to split on unmap
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ * @priv: pointer to a driver private data structure
+ *
+ * This function iterates the given range of the GPU VA space. It utilizes the
+ * &drm_gpuva_fn_ops to call back into the driver providing the operations to
+ * unmap and, if required, split existent mappings.
+ *
+ * Drivers may use these callbacks to update the GPU VA space right away within
+ * the callback. In case the driver decides to copy and store the operations for
+ * later processing neither this function nor &drm_gpuva_sm_map is allowed to be
+ * called before the &drm_gpuva_manager's view of the GPU VA space was updated
+ * with the previous set of operations. To update the &drm_gpuva_manager's view
+ * of the GPU VA space drm_gpuva_insert(), drm_gpuva_destroy_locked() and/or
+ * drm_gpuva_destroy_unlocked() should be used.
+ *
+ * A sequence of callbacks can contain unmap and remap operations, depending on
+ * whether there are actual overlapping mappings to split.
+ *
+ * There can be an arbitrary number of unmap operations and a maximum of two
+ * remap operations.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+		   u64 req_addr, u64 req_range)
+{
+	if (!mgr->ops || !mgr->ops->sm_unmap_step)
+		return -EINVAL;
+
+	return __drm_gpuva_sm_unmap(mgr, mgr->ops, priv,
+				    req_addr, req_range);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_unmap);
+
+static struct drm_gpuva_op *
+gpuva_op_alloc(struct drm_gpuva_manager *mgr)
+{
+	struct drm_gpuva_fn_ops *fn = mgr->ops;
+	struct drm_gpuva_op *op;
+
+	if (fn && fn->op_alloc)
+		op = fn->op_alloc();
+	else
+		op = kzalloc(sizeof(*op), GFP_KERNEL);
+
+	if (unlikely(!op))
+		return NULL;
+
+	return op;
+}
+
+static void
+gpuva_op_free(struct drm_gpuva_manager *mgr,
+	      struct drm_gpuva_op *op)
+{
+	struct drm_gpuva_fn_ops *fn = mgr->ops;
+
+	if (fn && fn->op_free)
+		fn->op_free(op);
+	else
+		kfree(op);
+}
+
+int drm_gpuva_sm_step(struct drm_gpuva_op *__op, void *priv)
+{
+	struct {
+		struct drm_gpuva_manager *mgr;
+		struct drm_gpuva_ops *ops;
+	} *args = priv;
+	struct drm_gpuva_manager *mgr = args->mgr;
+	struct drm_gpuva_ops *ops = args->ops;
+	struct drm_gpuva_op *op;
+
+	op = gpuva_op_alloc(mgr);
+	if (unlikely(!op))
+		goto err;
+
+	memcpy(op, __op, sizeof(*op));
+
+	if (op->op == DRM_GPUVA_OP_REMAP) {
+		struct drm_gpuva_op_remap *__r = &__op->remap;
+		struct drm_gpuva_op_remap *r = &op->remap;
+
+		r->unmap = kmemdup(__r->unmap, sizeof(*r->unmap),
+				   GFP_KERNEL);
+		if (unlikely(!r->unmap))
+			goto err_free_op;
+
+		if (__r->prev) {
+			r->prev = kmemdup(__r->prev, sizeof(*r->prev),
+					  GFP_KERNEL);
+			if (unlikely(!r->prev))
+				goto err_free_unmap;
+		}
+
+		if (__r->next) {
+			r->next = kmemdup(__r->next, sizeof(*r->next),
+					  GFP_KERNEL);
+			if (unlikely(!r->next))
+				goto err_free_prev;
+		}
+	}
+
+	list_add_tail(&op->entry, &ops->list);
+
+	return 0;
+
+err_free_prev:
+	kfree(op->remap.prev);
+err_free_unmap:
+	kfree(op->remap.unmap);
+err_free_op:
+	gpuva_op_free(mgr, op);
+err:
+	return -ENOMEM;
+}
+EXPORT_SYMBOL(drm_gpuva_sm_step);
+
+static struct drm_gpuva_fn_ops gpuva_list_ops = {
+	.sm_map_step = drm_gpuva_sm_step,
+	.sm_unmap_step = drm_gpuva_sm_step,
+};
+
+/**
+ * drm_gpuva_sm_map_ops_create - creates the &drm_gpuva_ops to split and merge
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ *
+ * This function creates a list of operations to perform splitting and merging
+ * of existent mapping(s) with the newly requested one.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain map, unmap and remap operations, but it
+ * can also be empty if no operation is required, e.g. if the requested mapping
+ * already exists in the exact same way.
+ *
+ * There can be an arbitrary number of unmap operations, a maximum of two remap
+ * operations and a single map operation. The latter one, if present,
+ * represents the original map operation requested by the caller. Please note
+ * that the map operation might have been modified, e.g. if it was merged with
+ * an existing mapping.
+ *
+ * Note that before calling this function again with another mapping request it
+ * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * previously obtained operations must be either processed or abandoned. To
+ * update the &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+			    u64 req_addr, u64 req_range,
+			    struct drm_gem_object *req_obj, u64 req_offset)
+{
+	struct drm_gpuva_ops *ops;
+	struct {
+		struct drm_gpuva_manager *mgr;
+		struct drm_gpuva_ops *ops;
+	} args;
+	int ret;
+
+	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+	if (unlikely(!ops))
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&ops->list);
+
+	args.mgr = mgr;
+	args.ops = ops;
+
+	ret = __drm_gpuva_sm_map(mgr, &gpuva_list_ops, &args,
+				 req_addr, req_range,
+				 req_obj, req_offset);
+	if (ret) {
+		drm_gpuva_ops_free(mgr, ops);
+		return ERR_PTR(ret);
+	}
+
+	return ops;
+}
+EXPORT_SYMBOL(drm_gpuva_sm_map_ops_create);
+
+/**
+ * drm_gpuva_sm_unmap_ops_create - creates the &drm_gpuva_ops to split on unmap
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ *
+ * This function creates a list of operations to perform unmapping and, if
+ * required, splitting of the mappings overlapping the unmap range.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain unmap and remap operations, depending on
+ * whether there are actual overlapping mappings to split.
+ *
+ * There can be an arbitrary number of unmap operations and a maximum of two
+ * remap operations.
+ *
+ * Note that before calling this function again with another range to unmap it
+ * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * previously obtained operations must be processed or abandoned. To update the
+ * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+			      u64 req_addr, u64 req_range)
+{
+	struct drm_gpuva_ops *ops;
+	struct {
+		struct drm_gpuva_manager *mgr;
+		struct drm_gpuva_ops *ops;
+	} args;
+	int ret;
+
+	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+	if (unlikely(!ops))
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&ops->list);
+
+	args.mgr = mgr;
+	args.ops = ops;
+
+	ret = __drm_gpuva_sm_unmap(mgr, &gpuva_list_ops, &args,
+				   req_addr, req_range);
+	if (ret) {
+		drm_gpuva_ops_free(mgr, ops);
+		return ERR_PTR(ret);
+	}
+
+	return ops;
+}
+EXPORT_SYMBOL(drm_gpuva_sm_unmap_ops_create);
+
+/**
+ * drm_gpuva_prefetch_ops_create - creates the &drm_gpuva_ops to prefetch
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @addr: the start address of the range to prefetch
+ * @range: the range of the mappings to prefetch
+ *
+ * This function creates a list of operations to perform prefetching.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain prefetch operations.
+ *
+ * There can be an arbitrary number of prefetch operations.
+ *
+ * After the caller has finished processing the returned &drm_gpuva_ops, they
+ * must be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+			      u64 addr, u64 range)
+{
+	DRM_GPUVA_ITER(it, mgr, addr);
+	struct drm_gpuva_ops *ops;
+	struct drm_gpuva_op *op;
+	int ret;
+
+	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+	if (!ops)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&ops->list);
+
+	drm_gpuva_iter_for_each_range(it, addr + range) {
+		op = gpuva_op_alloc(mgr);
+		if (!op) {
+			ret = -ENOMEM;
+			goto err_free_ops;
+		}
+
+		op->op = DRM_GPUVA_OP_PREFETCH;
+		op->prefetch.va = it.va;
+		list_add_tail(&op->entry, &ops->list);
+	}
+
+	return ops;
+
+err_free_ops:
+	drm_gpuva_ops_free(mgr, ops);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_prefetch_ops_create);
+
+/**
+ * drm_gpuva_gem_unmap_ops_create - creates the &drm_gpuva_ops to unmap a GEM
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @obj: the &drm_gem_object to unmap
+ *
+ * This function creates a list of operations to perform unmapping for every
+ * GPUVA attached to a GEM.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and consists of an
+ * arbitrary number of unmap operations.
+ *
+ * After the caller has finished processing the returned &drm_gpuva_ops, they
+ * must be freed with &drm_gpuva_ops_free.
+ *
+ * It is the caller's responsibility to protect the GEM's GPUVA list against
+ * concurrent access.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+			       struct drm_gem_object *obj)
+{
+	struct drm_gpuva_ops *ops;
+	struct drm_gpuva_op *op;
+	struct drm_gpuva *va;
+	int ret;
+
+	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+	if (!ops)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&ops->list);
+
+	drm_gem_for_each_gpuva(va, obj) {
+		op = gpuva_op_alloc(mgr);
+		if (!op) {
+			ret = -ENOMEM;
+			goto err_free_ops;
+		}
+
+		op->op = DRM_GPUVA_OP_UNMAP;
+		op->unmap.va = va;
+		list_add_tail(&op->entry, &ops->list);
+	}
+
+	return ops;
+
+err_free_ops:
+	drm_gpuva_ops_free(mgr, ops);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_gem_unmap_ops_create);
+
+
+/**
+ * drm_gpuva_ops_free - free the given &drm_gpuva_ops
+ * @mgr: the &drm_gpuva_manager the ops were created for
+ * @ops: the &drm_gpuva_ops to free
+ *
+ * Frees the given &drm_gpuva_ops structure including all the ops associated
+ * with it.
+ */
+void
+drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+		   struct drm_gpuva_ops *ops)
+{
+	struct drm_gpuva_op *op, *next;
+
+	drm_gpuva_for_each_op_safe(op, next, ops) {
+		list_del(&op->entry);
+
+		if (op->op == DRM_GPUVA_OP_REMAP) {
+			kfree(op->remap.prev);
+			kfree(op->remap.next);
+			kfree(op->remap.unmap);
+		}
+
+		gpuva_op_free(mgr, op);
+	}
+
+	kfree(ops);
+}
+EXPORT_SYMBOL(drm_gpuva_ops_free);
diff --git a/include/drm/drm_debugfs.h b/include/drm/drm_debugfs.h
index 2188dc83957f..10bc2372d019 100644
--- a/include/drm/drm_debugfs.h
+++ b/include/drm/drm_debugfs.h
@@ -34,6 +34,22 @@
 
 #include <linux/types.h>
 #include <linux/seq_file.h>
+
+#include <drm/drm_gpuva_mgr.h>
+
+/**
+ * DRM_DEBUGFS_GPUVA_INFO - &drm_info_list entry to dump a GPU VA space
+ * @show: the &drm_info_list's show callback
+ * @data: driver private data
+ *
+ * Drivers should use this macro to define a &drm_info_list entry to provide a
+ * debugfs file for dumping the GPU VA space regions and mappings.
+ *
+ * For each DRM GPU VA space drivers should call drm_debugfs_gpuva_info() from
+ * their @show callback.
+ */
+#define DRM_DEBUGFS_GPUVA_INFO(show, data) {"gpuvas", show, DRIVER_GEM_GPUVA, data}
+
 /**
  * struct drm_info_list - debugfs info list entry
  *
@@ -85,6 +101,8 @@ void drm_debugfs_create_files(const struct drm_info_list *files,
 			      struct drm_minor *minor);
 int drm_debugfs_remove_files(const struct drm_info_list *files,
 			     int count, struct drm_minor *minor);
+int drm_debugfs_gpuva_info(struct seq_file *m,
+			   struct drm_gpuva_manager *mgr);
 #else
 static inline void drm_debugfs_create_files(const struct drm_info_list *files,
 					    int count, struct dentry *root,
@@ -96,6 +114,11 @@ static inline int drm_debugfs_remove_files(const struct drm_info_list *files,
 {
 	return 0;
 }
+static inline int drm_debugfs_gpuva_info(struct seq_file *m,
+					 struct drm_gpuva_manager *mgr)
+{
+	return 0;
+}
 #endif
 
 #endif /* _DRM_DEBUGFS_H_ */
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d7c521e8860f..9d718870d783 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -105,6 +105,13 @@ enum drm_driver_feature {
 	 */
 	DRIVER_COMPUTE_ACCEL            = BIT(7),
 
+	/**
+	 * @DRIVER_GEM_GPUVA:
+	 *
+	 * Driver supports user defined GPU VA bindings for GEM objects.
+	 */
+	DRIVER_GEM_GPUVA		= BIT(8),
+
 	/* IMPORTANT: Below are all the legacy flags, add new ones above. */
 
 	/**
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 772a4adf5287..4a3679034966 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -36,6 +36,8 @@
 
 #include <linux/kref.h>
 #include <linux/dma-resv.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
 
 #include <drm/drm_vma_manager.h>
 
@@ -337,6 +339,17 @@ struct drm_gem_object {
 	 */
 	struct dma_resv _resv;
 
+	/**
+	 * @gpuva:
+	 *
+	 * Provides the list and list mutex of GPU VAs attached to this
+	 * GEM object.
+	 */
+	struct {
+		struct list_head list;
+		struct mutex mutex;
+	} gpuva;
+
 	/**
 	 * @funcs:
 	 *
@@ -479,4 +492,66 @@ void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj);
 unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
 			       bool (*shrink)(struct drm_gem_object *obj));
 
+/**
+ * drm_gem_gpuva_init - initialize the gpuva list of a GEM object
+ * @obj: the &drm_gem_object
+ *
+ * This initializes the &drm_gem_object's &drm_gpuva list and the mutex
+ * protecting it.
+ *
+ * Calling this function is only necessary for drivers intending to support the
+ * &drm_driver_feature DRIVER_GEM_GPUVA.
+ */
+static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
+{
+	INIT_LIST_HEAD(&obj->gpuva.list);
+	mutex_init(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_gpuva_lock - lock the GEM's gpuva list mutex
+ * @obj: the &drm_gem_object
+ *
+ * This locks the mutex protecting the &drm_gem_object's &drm_gpuva list.
+ */
+static inline void drm_gem_gpuva_lock(struct drm_gem_object *obj)
+{
+	mutex_lock(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_gpuva_unlock - unlock the GEM's gpuva list mutex
+ * @obj: the &drm_gem_object
+ *
+ * This unlocks the mutex protecting the &drm_gem_object's &drm_gpuva list.
+ */
+static inline void drm_gem_gpuva_unlock(struct drm_gem_object *obj)
+{
+	mutex_unlock(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_for_each_gpuva - iterator to walk over a list of gpuvas
+ * @entry: &drm_gpuva structure to assign to in each iteration step
+ * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gem_object.
+ */
+#define drm_gem_for_each_gpuva(entry, obj) \
+	list_for_each_entry(entry, &(obj)->gpuva.list, head)
+
+/**
+ * drm_gem_for_each_gpuva_safe - iterator to safely walk over a list of gpuvas
+ * @entry: &drm_gpuva structure to assign to in each iteration step
+ * @next: &drm_gpuva used to store the next step
+ * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
+ * it is safe against removal of elements.
+ */
+#define drm_gem_for_each_gpuva_safe(entry, next, obj) \
+	list_for_each_entry_safe(entry, next, &(obj)->gpuva.list, head)
+
 #endif /* __DRM_GEM_H__ */
diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
new file mode 100644
index 000000000000..675836cb1c7d
--- /dev/null
+++ b/include/drm/drm_gpuva_mgr.h
@@ -0,0 +1,735 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __DRM_GPUVA_MGR_H__
+#define __DRM_GPUVA_MGR_H__
+
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/maple_tree.h>
+#include <linux/mm.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+
+struct drm_gpuva_manager;
+struct drm_gpuva_fn_ops;
+struct drm_gpuva_op;
+
+/**
+ * struct drm_gpuva_region - structure to track a portion of GPU VA space
+ *
+ * This structure represents a portion of a GPU's VA space and is associated
+ * with a &drm_gpuva_manager.
+ *
+ * GPU VA mappings, represented by &drm_gpuva objects, are restricted to be
+ * placed within a &drm_gpuva_region.
+ */
+struct drm_gpuva_region {
+	/**
+	 * @mgr: the &drm_gpuva_manager this object is associated with
+	 */
+	struct drm_gpuva_manager *mgr;
+
+	/**
+	 * @va: structure containing the address and range of the &drm_gpuva_region
+	 */
+	struct {
+		/**
+		 * @addr: the start address
+		 */
+		u64 addr;
+
+		/**
+		 * @range: the range
+		 */
+		u64 range;
+	} va;
+
+	/**
+	 * @sparse: indicates whether this region is sparse
+	 */
+	bool sparse;
+};
+
+int drm_gpuva_sm_step(struct drm_gpuva_op *__op, void *priv);
+int drm_gpuva_region_insert(struct drm_gpuva_manager *mgr,
+			    struct drm_gpuva_region *reg);
+void drm_gpuva_region_remove(struct drm_gpuva_region *reg);
+
+bool
+drm_gpuva_region_empty(struct drm_gpuva_region *reg);
+
+struct drm_gpuva_region *
+drm_gpuva_region_find(struct drm_gpuva_manager *mgr,
+		      u64 addr, u64 range);
+struct drm_gpuva_region *
+drm_gpuva_region_find_first(struct drm_gpuva_manager *mgr,
+			    u64 addr, u64 range);
+
+/**
+ * enum drm_gpuva_flags - flags for struct drm_gpuva
+ */
+enum drm_gpuva_flags {
+	/**
+	 * @DRM_GPUVA_EVICTED:
+	 *
+	 * Flag indicating that the &drm_gpuva's backing GEM is evicted.
+	 */
+	DRM_GPUVA_EVICTED = (1 << 0),
+
+	/**
+	 * @DRM_GPUVA_USERBITS: user defined bits
+	 */
+	DRM_GPUVA_USERBITS = (1 << 1),
+};
+
+/**
+ * struct drm_gpuva - structure to track a GPU VA mapping
+ *
+ * This structure represents a GPU VA mapping and is associated with a
+ * &drm_gpuva_manager.
+ *
+ * Typically, this structure is embedded in bigger driver structures.
+ */
+struct drm_gpuva {
+	/**
+	 * @mgr: the &drm_gpuva_manager this object is associated with
+	 */
+	struct drm_gpuva_manager *mgr;
+
+	/**
+	 * @region: the &drm_gpuva_region the &drm_gpuva is mapped in
+	 */
+	struct drm_gpuva_region *region;
+
+	/**
+	 * @head: the &list_head to attach this object to a &drm_gem_object
+	 */
+	struct list_head head;
+
+	/**
+	 * @flags: the &drm_gpuva_flags for this mapping
+	 */
+	enum drm_gpuva_flags flags;
+
+	/**
+	 * @va: structure containing the address and range of the &drm_gpuva
+	 */
+	struct {
+		/**
+		 * @addr: the start address
+		 */
+		u64 addr;
+
+		/**
+		 * @range: the range
+		 */
+		u64 range;
+	} va;
+
+	/**
+	 * @gem: structure containing the &drm_gem_object and its offset
+	 */
+	struct {
+		/**
+		 * @offset: the offset within the &drm_gem_object
+		 */
+		u64 offset;
+
+		/**
+		 * @obj: the mapped &drm_gem_object
+		 */
+		struct drm_gem_object *obj;
+	} gem;
+};
+
+void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_unlink(struct drm_gpuva *va);
+
+int drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+		     struct drm_gpuva *va);
+void drm_gpuva_remove(struct drm_gpuva *va);
+
+struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
+				 u64 addr, u64 range);
+struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+				       u64 addr, u64 range);
+struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start);
+struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end);
+
+/**
+ * drm_gpuva_evict - sets whether the backing GEM of this &drm_gpuva is evicted
+ * @va: the &drm_gpuva to set the evict flag for
+ * @evict: indicates whether the &drm_gpuva is evicted
+ */
+static inline void drm_gpuva_evict(struct drm_gpuva *va, bool evict)
+{
+	if (evict)
+		va->flags |= DRM_GPUVA_EVICTED;
+	else
+		va->flags &= ~DRM_GPUVA_EVICTED;
+}
+
+/**
+ * drm_gpuva_evicted - indicates whether the backing BO of this &drm_gpuva
+ * is evicted
+ * @va: the &drm_gpuva to check
+ */
+static inline bool drm_gpuva_evicted(struct drm_gpuva *va)
+{
+	return va->flags & DRM_GPUVA_EVICTED;
+}
+
+/**
+ * enum drm_gpuva_mgr_flags - the feature flags for the &drm_gpuva_manager
+ */
+enum drm_gpuva_mgr_flags {
+	/**
+	 * @DRM_GPUVA_MANAGER_REGIONS:
+	 *
+	 * Enable the &drm_gpuva_manager to separately track &drm_gpuva_regions.
+	 *
+	 * &drm_gpuva_regions represent a reserved portion of VA space drivers
+	 * can create mappings in. If regions are enabled, &drm_gpuvas can be
+	 * created within an existing &drm_gpuva_region only and merge
+	 * operations never indicate merging over region boundaries.
+	 */
+	DRM_GPUVA_MANAGER_REGIONS = (1 << 0),
+};
+
+/**
+ * struct drm_gpuva_manager - DRM GPU VA Manager
+ *
+ * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using
+ * &maple_tree structures. Typically, this structure is embedded in bigger
+ * driver structures.
+ *
+ * Drivers can pass addresses and ranges in an arbitrary unit, e.g. bytes or
+ * pages.
+ *
+ * There should be one manager instance per GPU virtual address space.
+ */
+struct drm_gpuva_manager {
+	/**
+	 * @name: the name of the DRM GPU VA space
+	 */
+	const char *name;
+
+	/**
+	 * @mm_start: start of the VA space
+	 */
+	u64 mm_start;
+
+	/**
+	 * @mm_range: length of the VA space
+	 */
+	u64 mm_range;
+
+	/**
+	 * @region_mt: the &maple_tree to track GPU VA regions
+	 */
+	struct maple_tree region_mt;
+
+	/**
+	 * @va_mt: the &maple_tree to track GPU VA mappings
+	 */
+	struct maple_tree va_mt;
+
+	/**
+	 * @kernel_alloc_region:
+	 *
+	 * &drm_gpuva_region representing the address space cutout reserved for
+	 * the kernel
+	 */
+	struct drm_gpuva_region kernel_alloc_region;
+
+	/**
+	 * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
+	 */
+	struct drm_gpuva_fn_ops *ops;
+
+	/**
+	 * @flags: the feature flags of the &drm_gpuva_manager
+	 */
+	enum drm_gpuva_mgr_flags flags;
+};
+
+void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+			    const char *name,
+			    u64 start_offset, u64 range,
+			    u64 reserve_offset, u64 reserve_range,
+			    struct drm_gpuva_fn_ops *ops,
+			    enum drm_gpuva_mgr_flags flags);
+void drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr);
+
+/**
+ * struct drm_gpuva_iterator - iterator for walking the internal (maple) tree
+ */
+struct drm_gpuva_iterator {
+	/**
+	 * @mas: the maple tree iterator (maple advanced state)
+	 */
+	struct ma_state mas;
+
+	/**
+	 * @mgr: the &drm_gpuva_manager to iterate
+	 */
+	struct drm_gpuva_manager *mgr;
+
+	union {
+		/**
+		 * @va: the current &drm_gpuva entry
+		 */
+		struct drm_gpuva *va;
+
+		/**
+		 * @reg: the current &drm_gpuva_region entry
+		 */
+		struct drm_gpuva_region *reg;
+
+		/**
+		 * @entry: the current entry
+		 */
+		void *entry;
+	};
+};
+
+void drm_gpuva_iter_remove(struct drm_gpuva_iterator *it);
+int drm_gpuva_iter_va_replace(struct drm_gpuva_iterator *it,
+			      struct drm_gpuva *va);
+int drm_gpuva_iter_region_replace(struct drm_gpuva_iterator *it,
+				  struct drm_gpuva_region *reg);
+
+static inline bool
+drm_gpuva_iter_find(struct drm_gpuva_iterator *it, unsigned long max)
+{
+
+	mas_lock(&it->mas);
+	it->entry = mas_find(&it->mas, max);
+	mas_unlock(&it->mas);
+
+	return !!it->entry;
+}
+
+/**
+ * DRM_GPUVA_ITER - create an iterator structure to iterate the &drm_gpuva tree
+ * @name: the name of the &drm_gpuva_iterator to create
+ * @mgr__: the &drm_gpuva_manager to iterate
+ * @start: starting offset, the first entry will overlap this
+ */
+#define DRM_GPUVA_ITER(name, mgr__, start)				\
+	struct drm_gpuva_iterator name = {				\
+		.mas = MA_STATE_INIT(&(mgr__)->va_mt, start, 0),	\
+		.mgr = mgr__,						\
+		.va = NULL,						\
+	}
+
+/**
+ * DRM_GPUVA_REGION_ITER - create an iterator structure to iterate the
+ * &drm_gpuva_region tree
+ * @name: the name of the &drm_gpuva_iterator to create
+ * @mgr__: the &drm_gpuva_manager to iterate
+ * @start: starting offset, the first entry will overlap this
+ */
+#define DRM_GPUVA_REGION_ITER(name, mgr__, start)			\
+	struct drm_gpuva_iterator name = {				\
+		.mas = MA_STATE_INIT(&(mgr__)->region_mt, start, 0),	\
+		.mgr = mgr__,						\
+		.reg = NULL,						\
+	}
+
+/**
+ * drm_gpuva_iter_for_each_range - iterator to walk over a range of entries
+ * @it__: &drm_gpuva_iterator structure to assign to in each iteration step; its
+ * starting offset is set when the iterator is created, e.g. by &DRM_GPUVA_ITER
+ * @end__: ending offset, the last entry will start before this (but may overlap)
+ *
+ * This function can be used to iterate both &drm_gpuva objects and
+ * &drm_gpuva_region objects.
+ *
+ * It is safe against the removal of elements using &drm_gpuva_iter_remove,
+ * however it is not safe against the removal of elements using
+ * &drm_gpuva_remove and &drm_gpuva_region_remove.
+ */
+#define drm_gpuva_iter_for_each_range(it__, end__) \
+	while (drm_gpuva_iter_find(&(it__), (end__) - 1))
+
+/**
+ * drm_gpuva_iter_for_each - iterator to walk over all existing entries
+ * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
+ *
+ * This function can be used to iterate both &drm_gpuva objects and
+ * &drm_gpuva_region objects.
+ *
+ * In order to walk over all potentially existing entries, the
+ * &drm_gpuva_iterator must be initialized to start at
+ * &drm_gpuva_manager->mm_start or simply 0.
+ *
+ * It is safe against the removal of elements using &drm_gpuva_iter_remove,
+ * however it is not safe against the removal of elements using
+ * &drm_gpuva_remove and &drm_gpuva_region_remove.
+ */
+#define drm_gpuva_iter_for_each(it__) \
+	drm_gpuva_iter_for_each_range(it__, (it__).mgr->mm_start + (it__).mgr->mm_range)
+
+/**
+ * enum drm_gpuva_op_type - GPU VA operation type
+ *
+ * Operations to alter the GPU VA mappings tracked by the &drm_gpuva_manager.
+ */
+enum drm_gpuva_op_type {
+	/**
+	 * @DRM_GPUVA_OP_MAP: the map op type
+	 */
+	DRM_GPUVA_OP_MAP,
+
+	/**
+	 * @DRM_GPUVA_OP_REMAP: the remap op type
+	 */
+	DRM_GPUVA_OP_REMAP,
+
+	/**
+	 * @DRM_GPUVA_OP_UNMAP: the unmap op type
+	 */
+	DRM_GPUVA_OP_UNMAP,
+
+	/**
+	 * @DRM_GPUVA_OP_PREFETCH: the prefetch op type
+	 */
+	DRM_GPUVA_OP_PREFETCH,
+};
+
+/**
+ * struct drm_gpuva_op_map - GPU VA map operation
+ *
+ * This structure represents a single map operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_map {
+	/**
+	 * @va: structure containing address and range of a map
+	 * operation
+	 */
+	struct {
+		/**
+		 * @addr: the base address of the new mapping
+		 */
+		u64 addr;
+
+		/**
+		 * @range: the range of the new mapping
+		 */
+		u64 range;
+	} va;
+
+	/**
+	 * @gem: structure containing the &drm_gem_object and its offset
+	 */
+	struct {
+		/**
+		 * @offset: the offset within the &drm_gem_object
+		 */
+		u64 offset;
+
+		/**
+		 * @obj: the &drm_gem_object to map
+		 */
+		struct drm_gem_object *obj;
+	} gem;
+};
+
+/**
+ * struct drm_gpuva_op_unmap - GPU VA unmap operation
+ *
+ * This structure represents a single unmap operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_unmap {
+	/**
+	 * @va: the &drm_gpuva to unmap
+	 */
+	struct drm_gpuva *va;
+
+	/**
+	 * @keep:
+	 *
+	 * Indicates whether this &drm_gpuva is physically contiguous with the
+	 * original mapping request.
+	 *
+	 * Optionally, if &keep is set, drivers may keep the actual page table
+	 * mappings for this &drm_gpuva, adding the missing page table entries
+	 * only and update the &drm_gpuva_manager accordingly.
+	 */
+	bool keep;
+};
+
+/**
+ * struct drm_gpuva_op_remap - GPU VA remap operation
+ *
+ * This represents a single remap operation generated by the DRM GPU VA manager.
+ *
+ * A remap operation is generated when an existing GPU VA mapping is split up
+ * by inserting a new GPU VA mapping or by partially unmapping existing
+ * mapping(s), hence it consists of a maximum of two map and one unmap
+ * operation.
+ *
+ * The @unmap operation takes care of removing the original existing mapping.
+ * @prev is used to remap the preceding part, @next the subsequent part.
+ *
+ * If either a new mapping's start address is aligned with the start address
+ * of the old mapping or the new mapping's end address is aligned with the
+ * end address of the old mapping, either @prev or @next is NULL.
+ *
+ * Note, the reason for a dedicated remap operation, rather than arbitrary
+ * unmap and map operations, is to give drivers the chance of extracting driver
+ * specific data for creating the new mappings from the unmap operation's
+ * &drm_gpuva structure which typically is embedded in larger driver specific
+ * structures.
+ */
+struct drm_gpuva_op_remap {
+	/**
+	 * @prev: the preceding part of a split mapping
+	 */
+	struct drm_gpuva_op_map *prev;
+
+	/**
+	 * @next: the subsequent part of a split mapping
+	 */
+	struct drm_gpuva_op_map *next;
+
+	/**
+	 * @unmap: the unmap operation for the original existing mapping
+	 */
+	struct drm_gpuva_op_unmap *unmap;
+};
+
+/**
+ * struct drm_gpuva_op_prefetch - GPU VA prefetch operation
+ *
+ * This structure represents a single prefetch operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_prefetch {
+	/**
+	 * @va: the &drm_gpuva to prefetch
+	 */
+	struct drm_gpuva *va;
+};
+
+/**
+ * struct drm_gpuva_op - GPU VA operation
+ *
+ * This structure represents a single generic operation.
+ *
+ * The particular type of the operation is defined by @op.
+ */
+struct drm_gpuva_op {
+	/**
+	 * @entry:
+	 *
+	 * The &list_head used to distribute instances of this struct within
+	 * &drm_gpuva_ops.
+	 */
+	struct list_head entry;
+
+	/**
+	 * @op: the type of the operation
+	 */
+	enum drm_gpuva_op_type op;
+
+	union {
+		/**
+		 * @map: the map operation
+		 */
+		struct drm_gpuva_op_map map;
+
+		/**
+		 * @remap: the remap operation
+		 */
+		struct drm_gpuva_op_remap remap;
+
+		/**
+		 * @unmap: the unmap operation
+		 */
+		struct drm_gpuva_op_unmap unmap;
+
+		/**
+		 * @prefetch: the prefetch operation
+		 */
+		struct drm_gpuva_op_prefetch prefetch;
+	};
+};
+
+/**
+ * struct drm_gpuva_ops - wraps a list of &drm_gpuva_op
+ */
+struct drm_gpuva_ops {
+	/**
+	 * @list: the &list_head
+	 */
+	struct list_head list;
+};
+
+/**
+ * drm_gpuva_for_each_op - iterator to walk over &drm_gpuva_ops
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations.
+ */
+#define drm_gpuva_for_each_op(op, ops) list_for_each_entry(op, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_for_each_op_safe - iterator to safely walk over &drm_gpuva_ops
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @next: &drm_gpuva_op used to store the next step
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations. It is
+ * implemented with list_for_each_entry_safe(), so it is safe against removal.
+ */
+#define drm_gpuva_for_each_op_safe(op, next, ops) \
+	list_for_each_entry_safe(op, next, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_for_each_op_from_reverse - iterate backwards from the given point
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations beginning
+ * from the given operation in reverse order.
+ */
+#define drm_gpuva_for_each_op_from_reverse(op, ops) \
+	list_for_each_entry_from_reverse(op, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_first_op - returns the first &drm_gpuva_op from &drm_gpuva_ops
+ * @ops: the &drm_gpuva_ops to get the first &drm_gpuva_op from
+ */
+#define drm_gpuva_first_op(ops) \
+	list_first_entry(&(ops)->list, struct drm_gpuva_op, entry)
+
+/**
+ * drm_gpuva_last_op - returns the last &drm_gpuva_op from &drm_gpuva_ops
+ * @ops: the &drm_gpuva_ops to get the last &drm_gpuva_op from
+ */
+#define drm_gpuva_last_op(ops) \
+	list_last_entry(&(ops)->list, struct drm_gpuva_op, entry)
+
+/**
+ * drm_gpuva_prev_op - previous &drm_gpuva_op in the list
+ * @op: the current &drm_gpuva_op
+ */
+#define drm_gpuva_prev_op(op) list_prev_entry(op, entry)
+
+/**
+ * drm_gpuva_next_op - next &drm_gpuva_op in the list
+ * @op: the current &drm_gpuva_op
+ */
+#define drm_gpuva_next_op(op) list_next_entry(op, entry)
+
+struct drm_gpuva_ops *
+drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+			    u64 addr, u64 range,
+			    struct drm_gem_object *obj, u64 offset);
+struct drm_gpuva_ops *
+drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+			      u64 addr, u64 range);
+
+struct drm_gpuva_ops *
+drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+			      u64 addr, u64 range);
+
+struct drm_gpuva_ops *
+drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+			       struct drm_gem_object *obj);
+
+void drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+			struct drm_gpuva_ops *ops);
+
+/**
+ * struct drm_gpuva_fn_ops - callbacks for split/merge steps
+ *
+ * This structure defines the callbacks used by &drm_gpuva_sm_map and
+ * &drm_gpuva_sm_unmap to provide the split/merge steps for map and unmap
+ * operations to drivers.
+ */
+struct drm_gpuva_fn_ops {
+	/**
+	 * @op_alloc: called when the &drm_gpuva_manager allocates
+	 * a struct drm_gpuva_op
+	 *
+	 * Some drivers may want to embed struct drm_gpuva_op into driver
+	 * specific structures. By implementing this callback drivers can
+	 * allocate memory accordingly.
+	 *
+	 * This callback is optional.
+	 */
+	struct drm_gpuva_op *(*op_alloc)(void);
+
+	/**
+	 * @op_free: called when the &drm_gpuva_manager frees a
+	 * struct drm_gpuva_op
+	 *
+	 * Some drivers may want to embed struct drm_gpuva_op into driver
+	 * specific structures. By implementing this callback drivers can
+	 * free the previously allocated memory accordingly.
+	 *
+	 * This callback is optional.
+	 */
+	void (*op_free)(struct drm_gpuva_op *op);
+
+	/**
+	 * @sm_map_step: called from &drm_gpuva_sm_map providing the split and
+	 * merge steps
+	 *
+	 * This callback provides a single split / merge step or, if no split
+	 * and merge is indicated, the original map operation.
+	 *
+	 * The &priv pointer is equal to the one drivers pass to
+	 * &drm_gpuva_sm_map.
+	 */
+	int (*sm_map_step)(struct drm_gpuva_op *op, void *priv);
+
+	/**
+	 * @sm_unmap_step: called from &drm_gpuva_sm_unmap providing the split
+	 * steps
+	 *
+	 * This callback provides a single split step or, if no split is
+	 * indicated, the plain unmap operations of the corresponding unmap
+	 * range originally passed to &drm_gpuva_sm_unmap.
+	 *
+	 * The &priv pointer is equal to the one drivers pass to
+	 * &drm_gpuva_sm_unmap.
+	 */
+	int (*sm_unmap_step)(struct drm_gpuva_op *op, void *priv);
+};
+
+int drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+		     u64 addr, u64 range,
+		     struct drm_gem_object *obj, u64 offset);
+
+int drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+		       u64 addr, u64 range);
+
+#endif /* __DRM_GPUVA_MGR_H__ */
-- 
2.34.1



* [Intel-xe] [PATCH 3/4] drm/xe: Port Xe to GPUVA
From: Matthew Brost @ 2023-03-15 18:25 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

Rather than open coding VM binds and VMA tracking, use the GPUVA
library. GPUVA provides a common infrastructure for VM binds to use mmap
/ munmap semantics and support for VK sparse bindings.
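
As a rough illustration of the munmap-style semantics, the sketch below
computes which pieces of one existing mapping survive a partial unmap. It is
a standalone userspace approximation and not the GPUVA implementation:
struct mapping, struct split_result and split_for_unmap() are hypothetical
names. The surviving pieces correspond to the prev / next parts of a remap
operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tracked mapping: [addr, addr + range). */
struct mapping {
	uint64_t addr;
	uint64_t range;
};

/* Hypothetical result of unmapping part of one mapping. */
struct split_result {
	bool has_prev;		/* a piece survives below the unmapped range */
	bool has_next;		/* a piece survives above the unmapped range */
	struct mapping prev;
	struct mapping next;
};

/*
 * Compute which pieces of @old survive when [req_addr, req_addr + req_range)
 * is unmapped. The caller must ensure that the two ranges overlap.
 */
static struct split_result split_for_unmap(struct mapping old,
					   uint64_t req_addr,
					   uint64_t req_range)
{
	uint64_t old_end = old.addr + old.range;
	uint64_t req_end = req_addr + req_range;
	struct split_result res = { 0 };

	assert(req_addr < old_end && req_end > old.addr);

	/* Part of the old mapping below the unmap request is kept. */
	if (req_addr > old.addr) {
		res.has_prev = true;
		res.prev.addr = old.addr;
		res.prev.range = req_addr - old.addr;
	}

	/* Part of the old mapping above the unmap request is kept. */
	if (req_end < old_end) {
		res.has_next = true;
		res.next.addr = req_end;
		res.next.range = old_end - req_end;
	}

	return res;
}
```

Unmapping 0x3000-0x6000 from a single mapping covering 0x0000-0x8000 leaves
both a prev and a next piece; an unmap covering the whole mapping leaves
neither, which corresponds to a plain unmap.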

The concepts are:

1) xe_vm inherits from drm_gpuva_manager
2) xe_vma inherits from drm_gpuva
3) xe_vma_op inherits from drm_gpuva_op
4) VM bind operations (MAP, UNMAP, PREFETCH, UNMAP_ALL) call into the
GPUVA code to generate a VMA operations list which is parsed, committed,
and executed.
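
"Inherits from" above presumably means struct embedding, with a
container_of() style helper to get from the base object back to the driver
object. A minimal userspace sketch of that pattern follows. struct base_va,
struct driver_vma, to_driver_vma() and driver_vma_end() are hypothetical
stand-ins for drm_gpuva, xe_vma, gpuva_to_vma() and xe_vma_end(), and the
exclusive end address is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct drm_gpuva, reduced to address and range. */
struct base_va {
	uint64_t addr;
	uint64_t range;
};

/* Hypothetical driver object that embeds the base object. */
struct driver_vma {
	unsigned int driver_flags;
	struct base_va gpuva;
};

/* Get from the embedded base object back to the enclosing driver object. */
static inline struct driver_vma *to_driver_vma(struct base_va *va)
{
	return container_of(va, struct driver_vma, gpuva);
}

/* Accessor deriving the (assumed exclusive) end address from the base object. */
static inline uint64_t driver_vma_end(struct driver_vma *vma)
{
	return vma->gpuva.addr + vma->gpuva.range;
}
```

With this layout, common code only ever sees the embedded base object, while
the driver recovers its own structure when walking lists of base objects.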

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c                  |   10 +-
 drivers/gpu/drm/xe/xe_device.c              |    2 +-
 drivers/gpu/drm/xe/xe_exec.c                |    2 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   23 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   14 +-
 drivers/gpu/drm/xe/xe_guc_ct.c              |    6 +-
 drivers/gpu/drm/xe/xe_migrate.c             |    5 +-
 drivers/gpu/drm/xe/xe_pt.c                  |  105 +-
 drivers/gpu/drm/xe/xe_trace.h               |   10 +-
 drivers/gpu/drm/xe/xe_vm.c                  | 1792 +++++++++----------
 drivers/gpu/drm/xe/xe_vm.h                  |   66 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c          |   87 +-
 drivers/gpu/drm/xe/xe_vm_types.h            |  165 +-
 13 files changed, 1116 insertions(+), 1171 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 73a7f2cd4ad8..764b3ca9fff7 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -404,7 +404,8 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
-	struct xe_vma *vma;
+	struct drm_gpuva *gpuva;
+	struct drm_gem_object *obj = &bo->ttm.base;
 	int ret = 0;
 
 	dma_resv_assert_held(bo->ttm.base.resv);
@@ -417,8 +418,9 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
 		dma_resv_iter_end(&cursor);
 	}
 
-	list_for_each_entry(vma, &bo->vmas, bo_link) {
-		struct xe_vm *vm = vma->vm;
+	drm_gem_for_each_gpuva(gpuva, obj) {
+		struct xe_vma *vma = gpuva_to_vma(gpuva);
+		struct xe_vm *vm = xe_vma_vm(vma);
 
 		trace_xe_vma_evict(vma);
 
@@ -443,10 +445,8 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
 			} else {
 				ret = timeout;
 			}
-
 		} else {
 			bool vm_resv_locked = false;
-			struct xe_vm *vm = vma->vm;
 
 			/*
 			 * We need to put the vma on the vm's rebind_list,
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 1553949d12b6..a8c206d7ba20 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -130,7 +130,7 @@ static struct drm_driver driver = {
 	.driver_features =
 	    DRIVER_GEM |
 	    DRIVER_RENDER | DRIVER_SYNCOBJ |
-	    DRIVER_SYNCOBJ_TIMELINE,
+	    DRIVER_SYNCOBJ_TIMELINE | DRIVER_GEM_GPUVA,
 	.open = xe_file_open,
 	.postclose = xe_file_close,
 
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 97fd1a311f2d..b798a11f168b 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -118,7 +118,7 @@ static int xe_exec_begin(struct xe_engine *e, struct ww_acquire_ctx *ww,
 		if (xe_vma_is_userptr(vma))
 			continue;
 
-		err = xe_bo_validate(vma->bo, vm, false);
+		err = xe_bo_validate(xe_vma_bo(vma), vm, false);
 		if (err) {
 			xe_vm_unlock_dma_resv(vm, tv_onstack, *tv, ww, objs);
 			*tv = NULL;
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 1677640e1075..f7a066090a13 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -75,9 +75,10 @@ static bool vma_is_valid(struct xe_gt *gt, struct xe_vma *vma)
 		!(BIT(gt->info.id) & vma->usm.gt_invalidated);
 }
 
-static bool vma_matches(struct xe_vma *vma, struct xe_vma *lookup)
+static bool vma_matches(struct xe_vma *vma, u64 page_addr)
 {
-	if (lookup->start > vma->end || lookup->end < vma->start)
+	if (page_addr > xe_vma_end(vma) - 1 ||
+	    page_addr + SZ_4K - 1 < xe_vma_start(vma))
 		return false;
 
 	return true;
@@ -90,16 +91,14 @@ static bool only_needs_bo_lock(struct xe_bo *bo)
 
 static struct xe_vma *lookup_vma(struct xe_vm *vm, u64 page_addr)
 {
-	struct xe_vma *vma = NULL, lookup;
+	struct xe_vma *vma = NULL;
 
-	lookup.start = page_addr;
-	lookup.end = lookup.start + SZ_4K - 1;
 	if (vm->usm.last_fault_vma) {   /* Fast lookup */
-		if (vma_matches(vm->usm.last_fault_vma, &lookup))
+		if (vma_matches(vm->usm.last_fault_vma, page_addr))
 			vma = vm->usm.last_fault_vma;
 	}
 	if (!vma)
-		vma = xe_vm_find_overlapping_vma(vm, &lookup);
+		vma = xe_vm_find_overlapping_vma(vm, page_addr, SZ_4K);
 
 	return vma;
 }
@@ -170,7 +169,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	}
 
 	/* Lock VM and BOs dma-resv */
-	bo = vma->bo;
+	bo = xe_vma_bo(vma);
 	if (only_needs_bo_lock(bo)) {
 		/* This path ensures the BO's LRU is updated */
 		ret = xe_bo_lock(bo, &ww, xe->info.tile_count, false);
@@ -487,12 +486,8 @@ static struct xe_vma *get_acc_vma(struct xe_vm *vm, struct acc *acc)
 {
 	u64 page_va = acc->va_range_base + (ffs(acc->sub_granularity) - 1) *
 		sub_granularity_in_byte(acc->granularity);
-	struct xe_vma lookup;
-
-	lookup.start = page_va;
-	lookup.end = lookup.start + SZ_4K - 1;
 
-	return xe_vm_find_overlapping_vma(vm, &lookup);
+	return xe_vm_find_overlapping_vma(vm, page_va, SZ_4K);
 }
 
 static int handle_acc(struct xe_gt *gt, struct acc *acc)
@@ -536,7 +531,7 @@ static int handle_acc(struct xe_gt *gt, struct acc *acc)
 		goto unlock_vm;
 
 	/* Lock VM and BOs dma-resv */
-	bo = vma->bo;
+	bo = xe_vma_bo(vma);
 	if (only_needs_bo_lock(bo)) {
 		/* This path ensures the BO's LRU is updated */
 		ret = xe_bo_lock(bo, &ww, xe->info.tile_count, false);
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index f279e21300aa..155f37aaf31c 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -201,8 +201,8 @@ int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
 	if (!xe->info.has_range_tlb_invalidation) {
 		action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
 	} else {
-		u64 start = vma->start;
-		u64 length = vma->end - vma->start + 1;
+		u64 start = xe_vma_start(vma);
+		u64 length = xe_vma_size(vma);
 		u64 align, end;
 
 		if (length < SZ_4K)
@@ -215,12 +215,12 @@ int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
 		 * address mask covering the required range.
 		 */
 		align = roundup_pow_of_two(length);
-		start = ALIGN_DOWN(vma->start, align);
-		end = ALIGN(vma->start + length, align);
+		start = ALIGN_DOWN(xe_vma_start(vma), align);
+		end = ALIGN(xe_vma_start(vma) + length, align);
 		length = align;
 		while (start + length < end) {
 			length <<= 1;
-			start = ALIGN_DOWN(vma->start, length);
+			start = ALIGN_DOWN(xe_vma_start(vma), length);
 		}
 
 		/*
@@ -229,7 +229,7 @@ int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
 		 */
 		if (length >= SZ_2M) {
 			length = max_t(u64, SZ_16M, length);
-			start = ALIGN_DOWN(vma->start, length);
+			start = ALIGN_DOWN(xe_vma_start(vma), length);
 		}
 
 		XE_BUG_ON(length < SZ_4K);
@@ -238,7 +238,7 @@ int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
 		XE_BUG_ON(!IS_ALIGNED(start, length));
 
 		action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
-		action[len++] = vma->vm->usm.asid;
+		action[len++] = xe_vma_vm(vma)->usm.asid;
 		action[len++] = lower_32_bits(start);
 		action[len++] = upper_32_bits(start);
 		action[len++] = ilog2(length) - ilog2(SZ_4K);
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 5e00b75d3ca2..e5ed9022a0a2 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -783,13 +783,13 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
 	if (type == GUC_HXG_TYPE_RESPONSE_FAILURE) {
 		g2h_fence->fail = true;
 		g2h_fence->error =
-			FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, msg[0]);
+			FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, msg[1]);
 		g2h_fence->hint =
-			FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, msg[0]);
+			FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, msg[1]);
 	} else if (type == GUC_HXG_TYPE_NO_RESPONSE_RETRY) {
 		g2h_fence->retry = true;
 		g2h_fence->reason =
-			FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, msg[0]);
+			FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, msg[1]);
 	} else if (g2h_fence->response_buffer) {
 		g2h_fence->response_len = response_len;
 		memcpy(g2h_fence->response_buffer, msg + GUC_CTB_MSG_MIN_LEN,
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index c0523d8fe944..77a6d71f6e89 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -992,6 +992,7 @@ xe_migrate_update_pgtables_cpu(struct xe_migrate *m,
 	if (wait_vm) {
 		long wait;
 
+		vm_dbg(&vm->xe->drm, "wait on VM for munmap\n");
 		wait = dma_resv_wait_timeout(&vm->resv,
 					     DMA_RESV_USAGE_BOOKKEEP,
 					     true, HZ / 100);
@@ -1089,7 +1090,8 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 	u64 addr;
 	int err = 0;
 	bool usm = !eng && xe->info.supports_usm;
-	bool first_munmap_rebind = vma && vma->first_munmap_rebind;
+	bool first_munmap_rebind = vma &&
+		vma->gpuva.flags & XE_VMA_FIRST_REBIND;
 
 	/* Use the CPU if no in syncs and engine is idle */
 	if (no_in_syncs(syncs, num_syncs) && engine_is_idle(eng)) {
@@ -1210,6 +1212,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 	 * trigger preempts before moving forward
 	 */
 	if (first_munmap_rebind) {
+		vm_dbg(&vm->xe->drm, "wait on first_munmap_rebind\n");
 		err = job_add_deps(job, &vm->resv,
 				   DMA_RESV_USAGE_BOOKKEEP);
 		if (err)
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index dfd97b0ec42a..d4f58ec8058e 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -94,7 +94,7 @@ static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
 				&cur);
 		return xe_res_dma(&cur) + offset;
 	} else {
-		return xe_bo_addr(vma->bo, offset, page_size, is_vram);
+		return xe_bo_addr(xe_vma_bo(vma), offset, page_size, is_vram);
 	}
 }
 
@@ -159,7 +159,7 @@ u64 gen8_pte_encode(struct xe_vma *vma, struct xe_bo *bo,
 
 	if (is_vram) {
 		pte |= GEN12_PPGTT_PTE_LM;
-		if (vma && vma->use_atomic_access_pte_bit)
+		if (vma && vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT)
 			pte |= GEN12_USM_PPGTT_PTE_AE;
 	}
 
@@ -738,7 +738,7 @@ static int
 xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 		 struct xe_vm_pgtable_update *entries, u32 *num_entries)
 {
-	struct xe_bo *bo = vma->bo;
+	struct xe_bo *bo = xe_vma_bo(vma);
 	bool is_vram = !xe_vma_is_userptr(vma) && bo && xe_bo_is_vram(bo);
 	struct xe_res_cursor curs;
 	struct xe_pt_stage_bind_walk xe_walk = {
@@ -747,20 +747,20 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 			.shifts = xe_normal_pt_shifts,
 			.max_level = XE_PT_HIGHEST_LEVEL,
 		},
-		.vm = vma->vm,
+		.vm = xe_vma_vm(vma),
 		.gt = gt,
 		.curs = &curs,
-		.va_curs_start = vma->start,
-		.pte_flags = vma->pte_flags,
+		.va_curs_start = xe_vma_start(vma),
+		.pte_flags = xe_vma_read_only(vma) ? PTE_READ_ONLY : 0,
 		.wupd.entries = entries,
-		.needs_64K = (vma->vm->flags & XE_VM_FLAGS_64K) && is_vram,
+		.needs_64K = (xe_vma_vm(vma)->flags & XE_VM_FLAGS_64K) && is_vram,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[gt->info.id];
 	int ret;
 
 	if (is_vram) {
 		xe_walk.default_pte = GEN12_PPGTT_PTE_LM;
-		if (vma && vma->use_atomic_access_pte_bit)
+		if (vma && vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT)
 			xe_walk.default_pte |= GEN12_USM_PPGTT_PTE_AE;
 		xe_walk.dma_offset = gt->mem.vram.io_start -
 			gt_to_xe(gt)->mem.vram.io_start;
@@ -776,17 +776,16 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 
 	xe_bo_assert_held(bo);
 	if (xe_vma_is_userptr(vma))
-		xe_res_first_sg(vma->userptr.sg, 0, vma->end - vma->start + 1,
-				&curs);
+		xe_res_first_sg(vma->userptr.sg, 0, xe_vma_size(vma), &curs);
 	else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
-		xe_res_first(bo->ttm.resource, vma->bo_offset,
-			     vma->end - vma->start + 1, &curs);
+		xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
+			     xe_vma_size(vma), &curs);
 	else
-		xe_res_first_sg(xe_bo_get_sg(bo), vma->bo_offset,
-				vma->end - vma->start + 1, &curs);
+		xe_res_first_sg(xe_bo_get_sg(bo), xe_vma_bo_offset(vma),
+				xe_vma_size(vma), &curs);
 
-	ret = drm_pt_walk_range(&pt->drm, pt->level, vma->start, vma->end + 1,
-				&xe_walk.drm);
+	ret = drm_pt_walk_range(&pt->drm, pt->level, xe_vma_start(vma),
+				xe_vma_end(vma), &xe_walk.drm);
 
 	*num_entries = xe_walk.wupd.num_used_entries;
 	return ret;
@@ -921,13 +920,13 @@ bool xe_pt_zap_ptes(struct xe_gt *gt, struct xe_vma *vma)
 		},
 		.gt = gt,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[gt->info.id];
 
 	if (!(vma->gt_present & BIT(gt->info.id)))
 		return false;
 
-	(void)drm_pt_walk_shared(&pt->drm, pt->level, vma->start, vma->end + 1,
-				 &xe_walk.drm);
+	(void)drm_pt_walk_shared(&pt->drm, pt->level, xe_vma_start(vma),
+				 xe_vma_end(vma), &xe_walk.drm);
 
 	return xe_walk.needs_invalidate;
 }
@@ -964,21 +963,21 @@ static void xe_pt_abort_bind(struct xe_vma *vma,
 			continue;
 
 		for (j = 0; j < entries[i].qwords; j++)
-			xe_pt_destroy(entries[i].pt_entries[j].pt, vma->vm->flags, NULL);
+			xe_pt_destroy(entries[i].pt_entries[j].pt, xe_vma_vm(vma)->flags, NULL);
 		kfree(entries[i].pt_entries);
 	}
 }
 
 static void xe_pt_commit_locks_assert(struct xe_vma *vma)
 {
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 
 	lockdep_assert_held(&vm->lock);
 
 	if (xe_vma_is_userptr(vma))
 		lockdep_assert_held_read(&vm->userptr.notifier_lock);
 	else
-		dma_resv_assert_held(vma->bo->ttm.base.resv);
+		dma_resv_assert_held(xe_vma_bo(vma)->ttm.base.resv);
 
 	dma_resv_assert_held(&vm->resv);
 }
@@ -1011,7 +1010,7 @@ static void xe_pt_commit_bind(struct xe_vma *vma,
 
 			if (xe_pt_entry(pt_dir, j_))
 				xe_pt_destroy(xe_pt_entry(pt_dir, j_),
-					      vma->vm->flags, deferred);
+					      xe_vma_vm(vma)->flags, deferred);
 
 			pt_dir->dir.entries[j_] = &newpte->drm;
 		}
@@ -1072,7 +1071,7 @@ static int xe_pt_userptr_inject_eagain(struct xe_vma *vma)
 	static u32 count;
 
 	if (count++ % divisor == divisor - 1) {
-		struct xe_vm *vm = vma->vm;
+		struct xe_vm *vm = xe_vma_vm(vma);
 
 		vma->userptr.divisor = divisor << 1;
 		spin_lock(&vm->userptr.invalidated_lock);
@@ -1115,7 +1114,7 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
 		container_of(pt_update, typeof(*userptr_update), base);
 	struct xe_vma *vma = pt_update->vma;
 	unsigned long notifier_seq = vma->userptr.notifier_seq;
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 
 	userptr_update->locked = false;
 
@@ -1286,20 +1285,20 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 		},
 		.bind = true,
 	};
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	u32 num_entries;
 	struct dma_fence *fence;
 	struct invalidation_fence *ifence = NULL;
 	int err;
 
 	bind_pt_update.locked = false;
-	xe_bo_assert_held(vma->bo);
+	xe_bo_assert_held(xe_vma_bo(vma));
 	xe_vm_assert_held(vm);
 	XE_BUG_ON(xe_gt_is_media_type(gt));
 
-	vm_dbg(&vma->vm->xe->drm,
+	vm_dbg(&xe_vma_vm(vma)->xe->drm,
 	       "Preparing bind, with range [%llx...%llx) engine %p.\n",
-	       vma->start, vma->end, e);
+	       xe_vma_start(vma), xe_vma_end(vma) - 1, e);
 
 	err = xe_pt_prepare_bind(gt, vma, entries, &num_entries, rebind);
 	if (err)
@@ -1308,23 +1307,28 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 
 	xe_vm_dbg_print_entries(gt_to_xe(gt), entries, num_entries);
 
-	if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
+	if (rebind && !xe_vm_no_dma_fences(xe_vma_vm(vma))) {
 		ifence = kzalloc(sizeof(*ifence), GFP_KERNEL);
 		if (!ifence)
 			return ERR_PTR(-ENOMEM);
 	}
 
 	fence = xe_migrate_update_pgtables(gt->migrate,
-					   vm, vma->bo,
+					   vm, xe_vma_bo(vma),
 					   e ? e : vm->eng[gt->info.id],
 					   entries, num_entries,
 					   syncs, num_syncs,
 					   &bind_pt_update.base);
 	if (!IS_ERR(fence)) {
+		bool last_munmap_rebind = vma->gpuva.flags & XE_VMA_LAST_REBIND;
 		LLIST_HEAD(deferred);
 
+
+		if (last_munmap_rebind)
+			vm_dbg(&vm->xe->drm, "last_munmap_rebind\n");
+
 		/* TLB invalidation must be done before signaling rebind */
-		if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
+		if (rebind && !xe_vm_no_dma_fences(xe_vma_vm(vma))) {
 			int err = invalidation_fence_init(gt, ifence, fence,
 							  vma);
 			if (err) {
@@ -1337,12 +1341,12 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 
 		/* add shared fence now for pagetable delayed destroy */
 		dma_resv_add_fence(&vm->resv, fence, !rebind &&
-				   vma->last_munmap_rebind ?
+				   last_munmap_rebind ?
 				   DMA_RESV_USAGE_KERNEL :
 				   DMA_RESV_USAGE_BOOKKEEP);
 
-		if (!xe_vma_is_userptr(vma) && !vma->bo->vm)
-			dma_resv_add_fence(vma->bo->ttm.base.resv, fence,
+		if (!xe_vma_is_userptr(vma) && !xe_vma_bo(vma)->vm)
+			dma_resv_add_fence(xe_vma_bo(vma)->ttm.base.resv, fence,
 					   DMA_RESV_USAGE_BOOKKEEP);
 		xe_pt_commit_bind(vma, entries, num_entries, rebind,
 				  bind_pt_update.locked ? &deferred : NULL);
@@ -1355,8 +1359,7 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 			up_read(&vm->userptr.notifier_lock);
 			xe_bo_put_commit(&deferred);
 		}
-		if (!rebind && vma->last_munmap_rebind &&
-		    xe_vm_in_compute_mode(vm))
+		if (!rebind && last_munmap_rebind && xe_vm_in_compute_mode(vm))
 			queue_work(vm->xe->ordered_wq,
 				   &vm->preempt.rebind_work);
 	} else {
@@ -1504,14 +1507,14 @@ static unsigned int xe_pt_stage_unbind(struct xe_gt *gt, struct xe_vma *vma,
 			.max_level = XE_PT_HIGHEST_LEVEL,
 		},
 		.gt = gt,
-		.modified_start = vma->start,
-		.modified_end = vma->end + 1,
+		.modified_start = xe_vma_start(vma),
+		.modified_end = xe_vma_end(vma),
 		.wupd.entries = entries,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[gt->info.id];
 
-	(void)drm_pt_walk_shared(&pt->drm, pt->level, vma->start, vma->end + 1,
-				 &xe_walk.drm);
+	(void)drm_pt_walk_shared(&pt->drm, pt->level, xe_vma_start(vma),
+				 xe_vma_end(vma), &xe_walk.drm);
 
 	return xe_walk.wupd.num_used_entries;
 }
@@ -1523,7 +1526,7 @@ xe_migrate_clear_pgtable_callback(struct xe_migrate_pt_update *pt_update,
 				  const struct xe_vm_pgtable_update *update)
 {
 	struct xe_vma *vma = pt_update->vma;
-	u64 empty = __xe_pt_empty_pte(gt, vma->vm, update->pt->level);
+	u64 empty = __xe_pt_empty_pte(gt, xe_vma_vm(vma), update->pt->level);
 	int i;
 
 	XE_BUG_ON(xe_gt_is_media_type(gt));
@@ -1561,7 +1564,7 @@ xe_pt_commit_unbind(struct xe_vma *vma,
 			     i++) {
 				if (xe_pt_entry(pt_dir, i))
 					xe_pt_destroy(xe_pt_entry(pt_dir, i),
-						      vma->vm->flags, deferred);
+						      xe_vma_vm(vma)->flags, deferred);
 
 				pt_dir->dir.entries[i] = NULL;
 			}
@@ -1610,19 +1613,19 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 			.vma = vma,
 		},
 	};
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	u32 num_entries;
 	struct dma_fence *fence = NULL;
 	struct invalidation_fence *ifence;
 	LLIST_HEAD(deferred);
 
-	xe_bo_assert_held(vma->bo);
+	xe_bo_assert_held(xe_vma_bo(vma));
 	xe_vm_assert_held(vm);
 	XE_BUG_ON(xe_gt_is_media_type(gt));
 
-	vm_dbg(&vma->vm->xe->drm,
+	vm_dbg(&xe_vma_vm(vma)->xe->drm,
 	       "Preparing unbind, with range [%llx...%llx) engine %p.\n",
-	       vma->start, vma->end, e);
+	       xe_vma_start(vma), xe_vma_end(vma) - 1, e);
 
 	num_entries = xe_pt_stage_unbind(gt, vma, entries);
 	XE_BUG_ON(num_entries > ARRAY_SIZE(entries));
@@ -1661,8 +1664,8 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 				   DMA_RESV_USAGE_BOOKKEEP);
 
 		/* This fence will be installed by caller when doing eviction */
-		if (!xe_vma_is_userptr(vma) && !vma->bo->vm)
-			dma_resv_add_fence(vma->bo->ttm.base.resv, fence,
+		if (!xe_vma_is_userptr(vma) && !xe_vma_bo(vma)->vm)
+			dma_resv_add_fence(xe_vma_bo(vma)->ttm.base.resv, fence,
 					   DMA_RESV_USAGE_BOOKKEEP);
 		xe_pt_commit_unbind(vma, entries, num_entries,
 				    unbind_pt_update.locked ? &deferred : NULL);
diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h
index 2f8eb7ebe9a7..12e12673fc91 100644
--- a/drivers/gpu/drm/xe/xe_trace.h
+++ b/drivers/gpu/drm/xe/xe_trace.h
@@ -18,7 +18,7 @@
 #include "xe_gt_types.h"
 #include "xe_guc_engine_types.h"
 #include "xe_sched_job.h"
-#include "xe_vm_types.h"
+#include "xe_vm.h"
 
 DECLARE_EVENT_CLASS(xe_gt_tlb_invalidation_fence,
 		    TP_PROTO(struct xe_gt_tlb_invalidation_fence *fence),
@@ -368,10 +368,10 @@ DECLARE_EVENT_CLASS(xe_vma,
 
 		    TP_fast_assign(
 			   __entry->vma = (unsigned long)vma;
-			   __entry->asid = vma->vm->usm.asid;
-			   __entry->start = vma->start;
-			   __entry->end = vma->end;
-			   __entry->ptr = (u64)vma->userptr.ptr;
+			   __entry->asid = xe_vma_vm(vma)->usm.asid;
+			   __entry->start = xe_vma_start(vma);
+			   __entry->end = xe_vma_end(vma) - 1;
+			   __entry->ptr = xe_vma_userptr(vma);
 			   ),
 
 		    TP_printk("vma=0x%016llx, asid=0x%05x, start=0x%012llx, end=0x%012llx, ptr=0x%012llx,",
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index e8e178922082..b312160f53ff 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -25,10 +25,8 @@
 #include "xe_preempt_fence.h"
 #include "xe_pt.h"
 #include "xe_res_cursor.h"
-#include "xe_sync.h"
 #include "xe_trace.h"
-
-#define TEST_VM_ASYNC_OPS_ERROR
+#include "xe_sync.h"
 
 /**
  * xe_vma_userptr_check_repin() - Advisory check for repin needed
@@ -51,20 +49,19 @@ int xe_vma_userptr_check_repin(struct xe_vma *vma)
 
 int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 {
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	struct xe_device *xe = vm->xe;
-	const unsigned long num_pages =
-		(vma->end - vma->start + 1) >> PAGE_SHIFT;
+	const unsigned long num_pages = xe_vma_size(vma) >> PAGE_SHIFT;
 	struct page **pages;
 	bool in_kthread = !current->mm;
 	unsigned long notifier_seq;
 	int pinned, ret, i;
-	bool read_only = vma->pte_flags & PTE_READ_ONLY;
+	bool read_only = xe_vma_read_only(vma);
 
 	lockdep_assert_held(&vm->lock);
 	XE_BUG_ON(!xe_vma_is_userptr(vma));
 retry:
-	if (vma->destroyed)
+	if (vma->gpuva.flags & XE_VMA_DESTROYED)
 		return 0;
 
 	notifier_seq = mmu_interval_read_begin(&vma->userptr.notifier);
@@ -94,7 +91,8 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	}
 
 	while (pinned < num_pages) {
-		ret = get_user_pages_fast(vma->userptr.ptr + pinned * PAGE_SIZE,
+		ret = get_user_pages_fast(xe_vma_userptr(vma) +
+					  pinned * PAGE_SIZE,
 					  num_pages - pinned,
 					  read_only ? 0 : FOLL_WRITE,
 					  &pages[pinned]);
@@ -282,7 +280,7 @@ void xe_vm_fence_all_extobjs(struct xe_vm *vm, struct dma_fence *fence,
 	struct xe_vma *vma;
 
 	list_for_each_entry(vma, &vm->extobj.list, extobj.link)
-		dma_resv_add_fence(vma->bo->ttm.base.resv, fence, usage);
+		dma_resv_add_fence(xe_vma_bo(vma)->ttm.base.resv, fence, usage);
 }
 
 static void resume_and_reinstall_preempt_fences(struct xe_vm *vm)
@@ -431,7 +429,7 @@ int xe_vm_lock_dma_resv(struct xe_vm *vm, struct ww_acquire_ctx *ww,
 	INIT_LIST_HEAD(objs);
 	list_for_each_entry(vma, &vm->extobj.list, extobj.link) {
 		tv_bo->num_shared = num_shared;
-		tv_bo->bo = &vma->bo->ttm;
+		tv_bo->bo = &xe_vma_bo(vma)->ttm;
 
 		list_add_tail(&tv_bo->head, objs);
 		tv_bo++;
@@ -446,10 +444,10 @@ int xe_vm_lock_dma_resv(struct xe_vm *vm, struct ww_acquire_ctx *ww,
 	spin_lock(&vm->notifier.list_lock);
 	list_for_each_entry_safe(vma, next, &vm->notifier.rebind_list,
 				 notifier.rebind_link) {
-		xe_bo_assert_held(vma->bo);
+		xe_bo_assert_held(xe_vma_bo(vma));
 
 		list_del_init(&vma->notifier.rebind_link);
-		if (vma->gt_present && !vma->destroyed)
+		if (vma->gt_present && !(vma->gpuva.flags & XE_VMA_DESTROYED))
 			list_move_tail(&vma->rebind_link, &vm->rebind_list);
 	}
 	spin_unlock(&vm->notifier.list_lock);
@@ -565,10 +563,11 @@ static void preempt_rebind_work_func(struct work_struct *w)
 		goto out_unlock;
 
 	list_for_each_entry(vma, &vm->rebind_list, rebind_link) {
-		if (xe_vma_is_userptr(vma) || vma->destroyed)
+		if (xe_vma_is_userptr(vma) ||
+		    vma->gpuva.flags & XE_VMA_DESTROYED)
 			continue;
 
-		err = xe_bo_validate(vma->bo, vm, false);
+		err = xe_bo_validate(xe_vma_bo(vma), vm, false);
 		if (err)
 			goto out_unlock;
 	}
@@ -627,17 +626,12 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	trace_xe_vm_rebind_worker_exit(vm);
 }
 
-struct async_op_fence;
-static int __xe_vm_bind(struct xe_vm *vm, struct xe_vma *vma,
-			struct xe_engine *e, struct xe_sync_entry *syncs,
-			u32 num_syncs, struct async_op_fence *afence);
-
 static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 				   const struct mmu_notifier_range *range,
 				   unsigned long cur_seq)
 {
 	struct xe_vma *vma = container_of(mni, struct xe_vma, userptr.notifier);
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 	long err;
@@ -661,7 +655,8 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 	 * Tell exec and rebind worker they need to repin and rebind this
 	 * userptr.
 	 */
-	if (!xe_vm_in_fault_mode(vm) && !vma->destroyed && vma->gt_present) {
+	if (!xe_vm_in_fault_mode(vm) &&
+	    !(vma->gpuva.flags & XE_VMA_DESTROYED) && vma->gt_present) {
 		spin_lock(&vm->userptr.invalidated_lock);
 		list_move_tail(&vma->userptr.invalidate_link,
 			       &vm->userptr.invalidated);
@@ -766,7 +761,8 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm)
 
 static struct dma_fence *
 xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
-	       struct xe_sync_entry *syncs, u32 num_syncs);
+	       struct xe_sync_entry *syncs, u32 num_syncs,
+	       bool first_op, bool last_op);
 
 struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
 {
@@ -787,7 +783,7 @@ struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
 			trace_xe_vma_rebind_worker(vma);
 		else
 			trace_xe_vma_rebind_exec(vma);
-		fence = xe_vm_bind_vma(vma, NULL, NULL, 0);
+		fence = xe_vm_bind_vma(vma, NULL, NULL, 0, false, false);
 		if (IS_ERR(fence))
 			return fence;
 	}
@@ -815,6 +811,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 		return vma;
 	}
 
+	/* FIXME: Way too many lists, should be able to reduce this */
 	INIT_LIST_HEAD(&vma->rebind_link);
 	INIT_LIST_HEAD(&vma->unbind_link);
 	INIT_LIST_HEAD(&vma->userptr_link);
@@ -822,11 +819,12 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 	INIT_LIST_HEAD(&vma->notifier.rebind_link);
 	INIT_LIST_HEAD(&vma->extobj.link);
 
-	vma->vm = vm;
-	vma->start = start;
-	vma->end = end;
+	INIT_LIST_HEAD(&vma->gpuva.head);
+	vma->gpuva.mgr = &vm->mgr;
+	vma->gpuva.va.addr = start;
+	vma->gpuva.va.range = end - start + 1;
 	if (read_only)
-		vma->pte_flags = PTE_READ_ONLY;
+		vma->gpuva.flags |= XE_VMA_READ_ONLY;
 
 	if (gt_mask) {
 		vma->gt_mask = gt_mask;
@@ -837,22 +835,24 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 	}
 
 	if (vm->xe->info.platform == XE_PVC)
-		vma->use_atomic_access_pte_bit = true;
+		vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT;
 
 	if (bo) {
 		xe_bo_assert_held(bo);
-		vma->bo_offset = bo_offset_or_userptr;
-		vma->bo = xe_bo_get(bo);
-		list_add_tail(&vma->bo_link, &bo->vmas);
+
+		drm_gem_object_get(&bo->ttm.base);
+		vma->gpuva.gem.obj = &bo->ttm.base;
+		vma->gpuva.gem.offset = bo_offset_or_userptr;
+		drm_gpuva_link(&vma->gpuva);
 	} else /* userptr */ {
 		u64 size = end - start + 1;
 		int err;
 
-		vma->userptr.ptr = bo_offset_or_userptr;
+		vma->gpuva.gem.offset = bo_offset_or_userptr;
 
 		err = mmu_interval_notifier_insert(&vma->userptr.notifier,
 						   current->mm,
-						   vma->userptr.ptr, size,
+						   xe_vma_userptr(vma), size,
 						   &vma_userptr_notifier_ops);
 		if (err) {
 			kfree(vma);
@@ -870,16 +870,16 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 static void vm_remove_extobj(struct xe_vma *vma)
 {
 	if (!list_empty(&vma->extobj.link)) {
-		vma->vm->extobj.entries--;
+		xe_vma_vm(vma)->extobj.entries--;
 		list_del_init(&vma->extobj.link);
 	}
 }
 
 static void xe_vma_destroy_late(struct xe_vma *vma)
 {
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	struct xe_device *xe = vm->xe;
-	bool read_only = vma->pte_flags & PTE_READ_ONLY;
+	bool read_only = xe_vma_read_only(vma);
 
 	if (xe_vma_is_userptr(vma)) {
 		if (vma->userptr.sg) {
@@ -899,7 +899,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		mmu_interval_notifier_remove(&vma->userptr.notifier);
 		xe_vm_put(vm);
 	} else {
-		xe_bo_put(vma->bo);
+		xe_bo_put(xe_vma_bo(vma));
 	}
 
 	kfree(vma);
@@ -924,21 +924,22 @@ static void vma_destroy_cb(struct dma_fence *fence,
 
 static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
 {
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 
 	lockdep_assert_held_write(&vm->lock);
 	XE_BUG_ON(!list_empty(&vma->unbind_link));
 
 	if (xe_vma_is_userptr(vma)) {
-		XE_WARN_ON(!vma->destroyed);
+		XE_WARN_ON(!(vma->gpuva.flags & XE_VMA_DESTROYED));
+
 		spin_lock(&vm->userptr.invalidated_lock);
 		list_del_init(&vma->userptr.invalidate_link);
 		spin_unlock(&vm->userptr.invalidated_lock);
 		list_del(&vma->userptr_link);
 	} else {
-		xe_bo_assert_held(vma->bo);
-		list_del(&vma->bo_link);
-		if (!vma->bo->vm)
+		xe_bo_assert_held(xe_vma_bo(vma));
+		drm_gpuva_unlink(&vma->gpuva);
+		if (!xe_vma_bo(vma)->vm)
 			vm_remove_extobj(vma);
 	}
 
@@ -963,13 +964,13 @@ static void xe_vma_destroy_unlocked(struct xe_vma *vma)
 {
 	struct ttm_validate_buffer tv[2];
 	struct ww_acquire_ctx ww;
-	struct xe_bo *bo = vma->bo;
+	struct xe_bo *bo = xe_vma_bo(vma);
 	LIST_HEAD(objs);
 	LIST_HEAD(dups);
 	int err;
 
 	memset(tv, 0, sizeof(tv));
-	tv[0].bo = xe_vm_ttm_bo(vma->vm);
+	tv[0].bo = xe_vm_ttm_bo(xe_vma_vm(vma));
 	list_add(&tv[0].head, &objs);
 
 	if (bo) {
@@ -986,77 +987,63 @@ static void xe_vma_destroy_unlocked(struct xe_vma *vma)
 		xe_bo_put(bo);
 }
 
-static struct xe_vma *to_xe_vma(const struct rb_node *node)
-{
-	BUILD_BUG_ON(offsetof(struct xe_vma, vm_node) != 0);
-	return (struct xe_vma *)node;
-}
-
-static int xe_vma_cmp(const struct xe_vma *a, const struct xe_vma *b)
-{
-	if (a->end < b->start) {
-		return -1;
-	} else if (b->end < a->start) {
-		return 1;
-	} else {
-		return 0;
-	}
-}
-
-static bool xe_vma_less_cb(struct rb_node *a, const struct rb_node *b)
-{
-	return xe_vma_cmp(to_xe_vma(a), to_xe_vma(b)) < 0;
-}
-
-int xe_vma_cmp_vma_cb(const void *key, const struct rb_node *node)
-{
-	struct xe_vma *cmp = to_xe_vma(node);
-	const struct xe_vma *own = key;
-
-	if (own->start > cmp->end)
-		return 1;
-
-	if (own->end < cmp->start)
-		return -1;
-
-	return 0;
-}
-
 struct xe_vma *
-xe_vm_find_overlapping_vma(struct xe_vm *vm, const struct xe_vma *vma)
+xe_vm_find_overlapping_vma(struct xe_vm *vm, u64 start, u64 range)
 {
-	struct rb_node *node;
+	struct drm_gpuva *gpuva;
 
 	if (xe_vm_is_closed(vm))
 		return NULL;
 
-	XE_BUG_ON(vma->end >= vm->size);
+	XE_BUG_ON(start + range > vm->size);
 	lockdep_assert_held(&vm->lock);
 
-	node = rb_find(vma, &vm->vmas, xe_vma_cmp_vma_cb);
+	gpuva = drm_gpuva_find_first(&vm->mgr, start, range);
 
-	return node ? to_xe_vma(node) : NULL;
+	return gpuva ? gpuva_to_vma(gpuva) : NULL;
 }
 
 static void xe_vm_insert_vma(struct xe_vm *vm, struct xe_vma *vma)
 {
-	XE_BUG_ON(vma->vm != vm);
+	int err;
+
+	XE_BUG_ON(xe_vma_vm(vma) != vm);
 	lockdep_assert_held(&vm->lock);
 
-	rb_add(&vma->vm_node, &vm->vmas, xe_vma_less_cb);
+	err = drm_gpuva_insert(&vm->mgr, &vma->gpuva);
+	XE_WARN_ON(err);
 }
 
-static void xe_vm_remove_vma(struct xe_vm *vm, struct xe_vma *vma)
+static void xe_vm_remove_vma(struct xe_vm *vm, struct xe_vma *vma, bool remove)
 {
-	XE_BUG_ON(vma->vm != vm);
+	XE_BUG_ON(xe_vma_vm(vma) != vm);
 	lockdep_assert_held(&vm->lock);
 
-	rb_erase(&vma->vm_node, &vm->vmas);
+	if (remove)
+		drm_gpuva_remove(&vma->gpuva);
 	if (vm->usm.last_fault_vma == vma)
 		vm->usm.last_fault_vma = NULL;
 }
 
-static void async_op_work_func(struct work_struct *w);
+static struct drm_gpuva_op *xe_vm_op_alloc(void)
+{
+	struct xe_vma_op *op;
+
+	op = kzalloc(sizeof(*op), GFP_KERNEL);
+
+	if (unlikely(!op))
+		return NULL;
+
+	return &op->base;
+}
+
+static struct drm_gpuva_fn_ops gpuva_ops = {
+	.op_alloc = xe_vm_op_alloc,
+	.sm_map_step = drm_gpuva_sm_step,
+	.sm_unmap_step = drm_gpuva_sm_step,
+};
+
+static void xe_vma_op_work_func(struct work_struct *w);
 static void vm_destroy_work_func(struct work_struct *w);
 
 struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
@@ -1076,7 +1063,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 
 	vm->size = 1ull << xe_pt_shift(xe->info.vm_max_level + 1);
 
-	vm->vmas = RB_ROOT;
 	vm->flags = flags;
 
 	init_rwsem(&vm->lock);
@@ -1092,7 +1078,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	spin_lock_init(&vm->notifier.list_lock);
 
 	INIT_LIST_HEAD(&vm->async_ops.pending);
-	INIT_WORK(&vm->async_ops.work, async_op_work_func);
+	INIT_WORK(&vm->async_ops.work, xe_vma_op_work_func);
 	spin_lock_init(&vm->async_ops.lock);
 
 	INIT_WORK(&vm->destroy_work, vm_destroy_work_func);
@@ -1112,6 +1098,8 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	if (err)
 		goto err_put;
 
+	drm_gpuva_manager_init(&vm->mgr, "Xe VM", 0, vm->size, 0, 0,
+			       &gpuva_ops, 0);
 	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
 		vm->flags |= XE_VM_FLAGS_64K;
 
@@ -1217,6 +1205,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 			xe_pt_destroy(vm->pt_root[id], vm->flags, NULL);
 	}
 	dma_resv_unlock(&vm->resv);
+	drm_gpuva_manager_destroy(&vm->mgr);
 err_put:
 	dma_resv_fini(&vm->resv);
 	kfree(vm);
@@ -1266,14 +1255,18 @@ static void vm_error_capture(struct xe_vm *vm, int err,
 
 void xe_vm_close_and_put(struct xe_vm *vm)
 {
-	struct rb_root contested = RB_ROOT;
+	struct list_head contested;
 	struct ww_acquire_ctx ww;
 	struct xe_device *xe = vm->xe;
 	struct xe_gt *gt;
+	struct xe_vma *vma, *next_vma;
+	DRM_GPUVA_ITER(it, &vm->mgr, 0);
 	u8 id;
 
 	XE_BUG_ON(vm->preempt.num_engines);
 
+	INIT_LIST_HEAD(&contested);
+
 	vm->size = 0;
 	smp_mb();
 	flush_async_ops(vm);
@@ -1290,24 +1283,25 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 
 	down_write(&vm->lock);
 	xe_vm_lock(vm, &ww, 0, false);
-	while (vm->vmas.rb_node) {
-		struct xe_vma *vma = to_xe_vma(vm->vmas.rb_node);
+	drm_gpuva_iter_for_each(it) {
+		vma = gpuva_to_vma(it.va);
 
 		if (xe_vma_is_userptr(vma)) {
 			down_read(&vm->userptr.notifier_lock);
-			vma->destroyed = true;
+			vma->gpuva.flags |= XE_VMA_DESTROYED;
 			up_read(&vm->userptr.notifier_lock);
 		}
 
-		rb_erase(&vma->vm_node, &vm->vmas);
+		xe_vm_remove_vma(vm, vma, false);
+		drm_gpuva_iter_remove(&it);
 
 		/* easy case, remove from VMA? */
-		if (xe_vma_is_userptr(vma) || vma->bo->vm) {
+		if (xe_vma_is_userptr(vma) || xe_vma_bo(vma)->vm) {
 			xe_vma_destroy(vma, NULL);
 			continue;
 		}
 
-		rb_add(&vma->vm_node, &contested, xe_vma_less_cb);
+		list_add_tail(&vma->unbind_link, &contested);
 	}
 
 	/*
@@ -1330,19 +1324,14 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	}
 	xe_vm_unlock(vm, &ww);
 
-	if (contested.rb_node) {
-
-		/*
-		 * VM is now dead, cannot re-add nodes to vm->vmas if it's NULL
-		 * Since we hold a refcount to the bo, we can remove and free
-		 * the members safely without locking.
-		 */
-		while (contested.rb_node) {
-			struct xe_vma *vma = to_xe_vma(contested.rb_node);
-
-			rb_erase(&vma->vm_node, &contested);
-			xe_vma_destroy_unlocked(vma);
-		}
+	/*
+	 * VM is now dead, cannot re-add nodes to vm->vmas if it's NULL
+	 * Since we hold a refcount to the bo, we can remove and free
+	 * the members safely without locking.
+	 */
+	list_for_each_entry_safe(vma, next_vma, &contested, unbind_link) {
+		list_del_init(&vma->unbind_link);
+		xe_vma_destroy_unlocked(vma);
 	}
 
 	if (vm->async_ops.error_capture.addr)
@@ -1393,6 +1382,8 @@ static void vm_destroy_work_func(struct work_struct *w)
 	}
 	xe_vm_unlock(vm, &ww);
 
+	drm_gpuva_manager_destroy(&vm->mgr);
+
 	mutex_lock(&xe->usm.lock);
 	if (vm->flags & XE_VM_FLAG_FAULT_MODE)
 		xe->usm.num_vm_in_fault_mode--;
@@ -1439,13 +1430,14 @@ u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_gt *full_gt)
 
 static struct dma_fence *
 xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
-		 struct xe_sync_entry *syncs, u32 num_syncs)
+		 struct xe_sync_entry *syncs, u32 num_syncs,
+		 bool first_op, bool last_op)
 {
 	struct xe_gt *gt;
 	struct dma_fence *fence = NULL;
 	struct dma_fence **fences = NULL;
 	struct dma_fence_array *cf = NULL;
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	int cur_fence = 0, i;
 	int number_gts = hweight_long(vma->gt_present);
 	int err;
@@ -1466,7 +1458,8 @@ xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
 
 		XE_BUG_ON(xe_gt_is_media_type(gt));
 
-		fence = __xe_pt_unbind_vma(gt, vma, e, syncs, num_syncs);
+		fence = __xe_pt_unbind_vma(gt, vma, e, first_op ? syncs : NULL,
+					   first_op ? num_syncs : 0);
 		if (IS_ERR(fence)) {
 			err = PTR_ERR(fence);
 			goto err_fences;
@@ -1492,7 +1485,7 @@ xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
 		}
 	}
 
-	for (i = 0; i < num_syncs; i++)
+	for (i = 0; last_op && i < num_syncs; i++)
 		xe_sync_entry_signal(&syncs[i], NULL, cf ? &cf->base : fence);
 
 	return cf ? &cf->base : !fence ? dma_fence_get_stub() : fence;
@@ -1511,13 +1504,14 @@ xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
 
 static struct dma_fence *
 xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
-	       struct xe_sync_entry *syncs, u32 num_syncs)
+	       struct xe_sync_entry *syncs, u32 num_syncs,
+	       bool first_op, bool last_op)
 {
 	struct xe_gt *gt;
 	struct dma_fence *fence;
 	struct dma_fence **fences = NULL;
 	struct dma_fence_array *cf = NULL;
-	struct xe_vm *vm = vma->vm;
+	struct xe_vm *vm = xe_vma_vm(vma);
 	int cur_fence = 0, i;
 	int number_gts = hweight_long(vma->gt_mask);
 	int err;
@@ -1537,7 +1531,8 @@ xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
 			goto next;
 
 		XE_BUG_ON(xe_gt_is_media_type(gt));
-		fence = __xe_pt_bind_vma(gt, vma, e, syncs, num_syncs,
+		fence = __xe_pt_bind_vma(gt, vma, e, first_op ? syncs : NULL,
+					 first_op ? num_syncs : 0,
 					 vma->gt_present & BIT(id));
 		if (IS_ERR(fence)) {
 			err = PTR_ERR(fence);
@@ -1564,7 +1559,7 @@ xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
 		}
 	}
 
-	for (i = 0; i < num_syncs; i++)
+	for (i = 0; last_op && i < num_syncs; i++)
 		xe_sync_entry_signal(&syncs[i], NULL, cf ? &cf->base : fence);
 
 	return cf ? &cf->base : fence;
@@ -1663,15 +1658,27 @@ int xe_vm_async_fence_wait_start(struct dma_fence *fence)
 
 static int __xe_vm_bind(struct xe_vm *vm, struct xe_vma *vma,
 			struct xe_engine *e, struct xe_sync_entry *syncs,
-			u32 num_syncs, struct async_op_fence *afence)
+			u32 num_syncs, struct async_op_fence *afence,
+			bool immediate, bool first_op, bool last_op)
 {
 	struct dma_fence *fence;
 
 	xe_vm_assert_held(vm);
 
-	fence = xe_vm_bind_vma(vma, e, syncs, num_syncs);
-	if (IS_ERR(fence))
-		return PTR_ERR(fence);
+	if (immediate) {
+		fence = xe_vm_bind_vma(vma, e, syncs, num_syncs, first_op,
+				       last_op);
+		if (IS_ERR(fence))
+			return PTR_ERR(fence);
+	} else {
+		int i;
+
+		XE_BUG_ON(!xe_vm_in_fault_mode(vm));
+
+		fence = dma_fence_get_stub();
+		for (i = 0; last_op && i < num_syncs; i++)
+			xe_sync_entry_signal(&syncs[i], NULL, fence);
+	}
 	if (afence)
 		add_async_op_fence_cb(vm, fence, afence);
 
@@ -1681,32 +1688,35 @@ static int __xe_vm_bind(struct xe_vm *vm, struct xe_vma *vma,
 
 static int xe_vm_bind(struct xe_vm *vm, struct xe_vma *vma, struct xe_engine *e,
 		      struct xe_bo *bo, struct xe_sync_entry *syncs,
-		      u32 num_syncs, struct async_op_fence *afence)
+		      u32 num_syncs, struct async_op_fence *afence,
+		      bool immediate, bool first_op, bool last_op)
 {
 	int err;
 
 	xe_vm_assert_held(vm);
 	xe_bo_assert_held(bo);
 
-	if (bo) {
+	if (bo && immediate) {
 		err = xe_bo_validate(bo, vm, true);
 		if (err)
 			return err;
 	}
 
-	return __xe_vm_bind(vm, vma, e, syncs, num_syncs, afence);
+	return __xe_vm_bind(vm, vma, e, syncs, num_syncs, afence, immediate,
+			    first_op, last_op);
 }
 
 static int xe_vm_unbind(struct xe_vm *vm, struct xe_vma *vma,
 			struct xe_engine *e, struct xe_sync_entry *syncs,
-			u32 num_syncs, struct async_op_fence *afence)
+			u32 num_syncs, struct async_op_fence *afence,
+			bool first_op, bool last_op)
 {
 	struct dma_fence *fence;
 
 	xe_vm_assert_held(vm);
-	xe_bo_assert_held(vma->bo);
+	xe_bo_assert_held(xe_vma_bo(vma));
 
-	fence = xe_vm_unbind_vma(vma, e, syncs, num_syncs);
+	fence = xe_vm_unbind_vma(vma, e, syncs, num_syncs, first_op, last_op);
 	if (IS_ERR(fence))
 		return PTR_ERR(fence);
 	if (afence)
@@ -1929,26 +1939,27 @@ static const u32 region_to_mem_type[] = {
 static int xe_vm_prefetch(struct xe_vm *vm, struct xe_vma *vma,
 			  struct xe_engine *e, u32 region,
 			  struct xe_sync_entry *syncs, u32 num_syncs,
-			  struct async_op_fence *afence)
+			  struct async_op_fence *afence, bool first_op,
+			  bool last_op)
 {
 	int err;
 
 	XE_BUG_ON(region > ARRAY_SIZE(region_to_mem_type));
 
 	if (!xe_vma_is_userptr(vma)) {
-		err = xe_bo_migrate(vma->bo, region_to_mem_type[region]);
+		err = xe_bo_migrate(xe_vma_bo(vma), region_to_mem_type[region]);
 		if (err)
 			return err;
 	}
 
 	if (vma->gt_mask != (vma->gt_present & ~vma->usm.gt_invalidated)) {
-		return xe_vm_bind(vm, vma, e, vma->bo, syncs, num_syncs,
-				  afence);
+		return xe_vm_bind(vm, vma, e, xe_vma_bo(vma), syncs, num_syncs,
+				  afence, true, first_op, last_op);
 	} else {
 		int i;
 
 		/* Nothing to do, signal fences now */
-		for (i = 0; i < num_syncs; i++)
+		for (i = 0; last_op && i < num_syncs; i++)
 			xe_sync_entry_signal(&syncs[i], NULL,
 					     dma_fence_get_stub());
 		if (afence)
@@ -1959,29 +1970,6 @@ static int xe_vm_prefetch(struct xe_vm *vm, struct xe_vma *vma,
 
 #define VM_BIND_OP(op)	(op & 0xffff)
 
-static int __vm_bind_ioctl(struct xe_vm *vm, struct xe_vma *vma,
-			   struct xe_engine *e, struct xe_bo *bo, u32 op,
-			   u32 region, struct xe_sync_entry *syncs,
-			   u32 num_syncs, struct async_op_fence *afence)
-{
-	switch (VM_BIND_OP(op)) {
-	case XE_VM_BIND_OP_MAP:
-		return xe_vm_bind(vm, vma, e, bo, syncs, num_syncs, afence);
-	case XE_VM_BIND_OP_UNMAP:
-	case XE_VM_BIND_OP_UNMAP_ALL:
-		return xe_vm_unbind(vm, vma, e, syncs, num_syncs, afence);
-	case XE_VM_BIND_OP_MAP_USERPTR:
-		return xe_vm_bind(vm, vma, e, NULL, syncs, num_syncs, afence);
-	case XE_VM_BIND_OP_PREFETCH:
-		return xe_vm_prefetch(vm, vma, e, region, syncs, num_syncs,
-				      afence);
-		break;
-	default:
-		XE_BUG_ON("NOT POSSIBLE");
-		return -EINVAL;
-	}
-}
-
 struct ttm_buffer_object *xe_vm_ttm_bo(struct xe_vm *vm)
 {
 	int idx = vm->flags & XE_VM_FLAG_MIGRATION ?
@@ -1997,834 +1985,807 @@ static void xe_vm_tv_populate(struct xe_vm *vm, struct ttm_validate_buffer *tv)
 	tv->bo = xe_vm_ttm_bo(vm);
 }
 
-static bool is_map_op(u32 op)
+static void vm_set_async_error(struct xe_vm *vm, int err)
 {
-	return VM_BIND_OP(op) == XE_VM_BIND_OP_MAP ||
-		VM_BIND_OP(op) == XE_VM_BIND_OP_MAP_USERPTR;
+	lockdep_assert_held(&vm->lock);
+	vm->async_ops.error = err;
 }
 
-static bool is_unmap_op(u32 op)
+static bool bo_has_vm_references(struct xe_bo *bo, struct xe_vm *vm,
+				 struct xe_vma *ignore)
 {
-	return VM_BIND_OP(op) == XE_VM_BIND_OP_UNMAP ||
-		VM_BIND_OP(op) == XE_VM_BIND_OP_UNMAP_ALL;
+	struct ww_acquire_ctx ww;
+	struct drm_gpuva *gpuva;
+	struct drm_gem_object *obj = &bo->ttm.base;
+	bool ret = false;
+
+	xe_bo_lock(bo, &ww, 0, false);
+	drm_gem_for_each_gpuva(gpuva, obj) {
+		struct xe_vma *vma = gpuva_to_vma(gpuva);
+
+		if (vma != ignore && xe_vma_vm(vma) == vm &&
+		    !(vma->gpuva.flags & XE_VMA_DESTROYED)) {
+			ret = true;
+			break;
+		}
+	}
+	xe_bo_unlock(bo, &ww);
+
+	return ret;
 }
 
-static int vm_bind_ioctl(struct xe_vm *vm, struct xe_vma *vma,
-			 struct xe_engine *e, struct xe_bo *bo,
-			 struct drm_xe_vm_bind_op *bind_op,
-			 struct xe_sync_entry *syncs, u32 num_syncs,
-			 struct async_op_fence *afence)
+static int vm_insert_extobj(struct xe_vm *vm, struct xe_vma *vma)
 {
-	LIST_HEAD(objs);
-	LIST_HEAD(dups);
-	struct ttm_validate_buffer tv_bo, tv_vm;
-	struct ww_acquire_ctx ww;
-	struct xe_bo *vbo;
-	int err, i;
+	struct xe_bo *bo = xe_vma_bo(vma);
 
-	lockdep_assert_held(&vm->lock);
-	XE_BUG_ON(!list_empty(&vma->unbind_link));
+	lockdep_assert_held_write(&vm->lock);
 
-	/* Binds deferred to faults, signal fences now */
-	if (xe_vm_in_fault_mode(vm) && is_map_op(bind_op->op) &&
-	    !(bind_op->op & XE_VM_BIND_FLAG_IMMEDIATE)) {
-		for (i = 0; i < num_syncs; i++)
-			xe_sync_entry_signal(&syncs[i], NULL,
-					     dma_fence_get_stub());
-		if (afence)
-			dma_fence_signal(&afence->fence);
+	if (bo_has_vm_references(bo, vm, vma))
 		return 0;
-	}
 
-	xe_vm_tv_populate(vm, &tv_vm);
-	list_add_tail(&tv_vm.head, &objs);
-	vbo = vma->bo;
-	if (vbo) {
-		/*
-		 * An unbind can drop the last reference to the BO and
-		 * the BO is needed for ttm_eu_backoff_reservation so
-		 * take a reference here.
-		 */
-		xe_bo_get(vbo);
+	list_add(&vma->extobj.link, &vm->extobj.list);
+	vm->extobj.entries++;
 
-		tv_bo.bo = &vbo->ttm;
-		tv_bo.num_shared = 1;
-		list_add(&tv_bo.head, &objs);
-	}
+	return 0;
+}
 
-again:
-	err = ttm_eu_reserve_buffers(&ww, &objs, true, &dups);
-	if (!err) {
-		err = __vm_bind_ioctl(vm, vma, e, bo,
-				      bind_op->op, bind_op->region, syncs,
-				      num_syncs, afence);
-		ttm_eu_backoff_reservation(&ww, &objs);
-		if (err == -EAGAIN && xe_vma_is_userptr(vma)) {
-			lockdep_assert_held_write(&vm->lock);
-			err = xe_vma_userptr_pin_pages(vma);
-			if (!err)
-				goto again;
-		}
+static int __vm_bind_ioctl_lookup_vma(struct xe_vm *vm, struct xe_bo *bo,
+				      u64 addr, u64 range, u32 op)
+{
+	struct xe_device *xe = vm->xe;
+	struct xe_vma *vma;
+	bool async = !!(op & XE_VM_BIND_FLAG_ASYNC);
+
+	lockdep_assert_held(&vm->lock);
+
+	return 0;
+
+	switch (VM_BIND_OP(op)) {
+	case XE_VM_BIND_OP_MAP:
+	case XE_VM_BIND_OP_MAP_USERPTR:
+		vma = xe_vm_find_overlapping_vma(vm, addr, range);
+		if (XE_IOCTL_ERR(xe, vma))
+			return -EBUSY;
+		break;
+	case XE_VM_BIND_OP_UNMAP:
+	case XE_VM_BIND_OP_PREFETCH:
+		vma = xe_vm_find_overlapping_vma(vm, addr, range);
+		if (XE_IOCTL_ERR(xe, !vma) ||
+		    XE_IOCTL_ERR(xe, (xe_vma_start(vma) != addr ||
+				 xe_vma_end(vma) != addr + range) && !async))
+			return -EINVAL;
+		break;
+	case XE_VM_BIND_OP_UNMAP_ALL:
+		if (XE_IOCTL_ERR(xe, list_empty(&bo->ttm.base.gpuva.list)))
+			return -EINVAL;
+		break;
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
+		return -EINVAL;
 	}
-	xe_bo_put(vbo);
 
-	return err;
+	return 0;
 }
 
-struct async_op {
-	struct xe_vma *vma;
-	struct xe_engine *engine;
-	struct xe_bo *bo;
-	struct drm_xe_vm_bind_op bind_op;
-	struct xe_sync_entry *syncs;
-	u32 num_syncs;
-	struct list_head link;
-	struct async_op_fence *fence;
-};
-
-static void async_op_cleanup(struct xe_vm *vm, struct async_op *op)
+static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma)
 {
-	while (op->num_syncs--)
-		xe_sync_entry_cleanup(&op->syncs[op->num_syncs]);
-	kfree(op->syncs);
-	xe_bo_put(op->bo);
-	if (op->engine)
-		xe_engine_put(op->engine);
-	xe_vm_put(vm);
-	if (op->fence)
-		dma_fence_put(&op->fence->fence);
-	kfree(op);
+	down_read(&vm->userptr.notifier_lock);
+	vma->gpuva.flags |= XE_VMA_DESTROYED;
+	up_read(&vm->userptr.notifier_lock);
+	xe_vm_remove_vma(vm, vma, true);
 }
 
-static struct async_op *next_async_op(struct xe_vm *vm)
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG_VM)
+static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 {
-	return list_first_entry_or_null(&vm->async_ops.pending,
-					struct async_op, link);
-}
+	struct xe_vma *vma;
 
-static void vm_set_async_error(struct xe_vm *vm, int err)
+	switch (op->op) {
+	case DRM_GPUVA_OP_MAP:
+		vm_dbg(&xe->drm, "MAP: addr=0x%016llx, range=0x%016llx",
+		       op->map.va.addr, op->map.va.range);
+		break;
+	case DRM_GPUVA_OP_REMAP:
+		vma = gpuva_to_vma(op->remap.unmap->va);
+		vm_dbg(&xe->drm, "REMAP:UNMAP: addr=0x%016llx, range=0x%016llx, keep=%d",
+		       xe_vma_start(vma), xe_vma_size(vma),
+		       op->remap.unmap->keep ? 1 : 0);
+		if (op->remap.prev)
+			vm_dbg(&xe->drm,
+			       "REMAP:PREV: addr=0x%016llx, range=0x%016llx",
+			       op->remap.prev->va.addr,
+			       op->remap.prev->va.range);
+		if (op->remap.next)
+			vm_dbg(&xe->drm,
+			       "REMAP:NEXT: addr=0x%016llx, range=0x%016llx",
+			       op->remap.next->va.addr,
+			       op->remap.next->va.range);
+		break;
+	case DRM_GPUVA_OP_UNMAP:
+		vma = gpuva_to_vma(op->unmap.va);
+		vm_dbg(&xe->drm, "UNMAP: addr=0x%016llx, range=0x%016llx, keep=%d",
+		       xe_vma_start(vma), xe_vma_size(vma),
+		       op->unmap.keep ? 1 : 0);
+		break;
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
+	}
+}
+#else
+static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 {
-	lockdep_assert_held(&vm->lock);
-	vm->async_ops.error = err;
 }
+#endif
 
-static void async_op_work_func(struct work_struct *w)
+/*
+ * Create operations list from IOCTL arguments, setup operations fields so parse
+ * and commit steps are decoupled from IOCTL arguments. This step can fail.
+ */
+static struct drm_gpuva_ops *
+vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
+			 u64 bo_offset_or_userptr, u64 addr, u64 range,
+			 u32 operation, u64 gt_mask, u32 region)
 {
-	struct xe_vm *vm = container_of(w, struct xe_vm, async_ops.work);
-
-	for (;;) {
-		struct async_op *op;
-		int err;
-
-		if (vm->async_ops.error && !xe_vm_is_closed(vm))
-			break;
+	struct drm_gem_object *obj = bo ? &bo->ttm.base : NULL;
+	struct ww_acquire_ctx ww;
+	struct drm_gpuva_ops *ops;
+	struct drm_gpuva_op *__op;
+	struct xe_vma_op *op;
+	int err;
 
-		spin_lock_irq(&vm->async_ops.lock);
-		op = next_async_op(vm);
-		if (op)
-			list_del_init(&op->link);
-		spin_unlock_irq(&vm->async_ops.lock);
+	lockdep_assert_held_write(&vm->lock);
 
-		if (!op)
-			break;
+	vm_dbg(&vm->xe->drm,
+	       "op=%d, addr=0x%016llx, range=0x%016llx, bo_offset_or_userptr=0x%016llx",
+	       VM_BIND_OP(operation), addr, range, bo_offset_or_userptr);
 
-		if (!xe_vm_is_closed(vm)) {
-			bool first, last;
+	switch (VM_BIND_OP(operation)) {
+	case XE_VM_BIND_OP_MAP:
+	case XE_VM_BIND_OP_MAP_USERPTR:
+		ops = drm_gpuva_sm_map_ops_create(&vm->mgr, addr, range,
+						  obj, bo_offset_or_userptr);
+		drm_gpuva_for_each_op(__op, ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
-			down_write(&vm->lock);
-again:
-			first = op->vma->first_munmap_rebind;
-			last = op->vma->last_munmap_rebind;
-#ifdef TEST_VM_ASYNC_OPS_ERROR
-#define FORCE_ASYNC_OP_ERROR	BIT(31)
-			if (!(op->bind_op.op & FORCE_ASYNC_OP_ERROR)) {
-				err = vm_bind_ioctl(vm, op->vma, op->engine,
-						    op->bo, &op->bind_op,
-						    op->syncs, op->num_syncs,
-						    op->fence);
-			} else {
-				err = -ENOMEM;
-				op->bind_op.op &= ~FORCE_ASYNC_OP_ERROR;
-			}
-#else
-			err = vm_bind_ioctl(vm, op->vma, op->engine, op->bo,
-					    &op->bind_op, op->syncs,
-					    op->num_syncs, op->fence);
-#endif
-			/*
-			 * In order for the fencing to work (stall behind
-			 * existing jobs / prevent new jobs from running) all
-			 * the dma-resv slots need to be programmed in a batch
-			 * relative to execs / the rebind worker. The vm->lock
-			 * ensure this.
-			 */
-			if (!err && ((first && VM_BIND_OP(op->bind_op.op) ==
-				      XE_VM_BIND_OP_UNMAP) ||
-				     vm->async_ops.munmap_rebind_inflight)) {
-				if (last) {
-					op->vma->last_munmap_rebind = false;
-					vm->async_ops.munmap_rebind_inflight =
-						false;
-				} else {
-					vm->async_ops.munmap_rebind_inflight =
-						true;
-
-					async_op_cleanup(vm, op);
-
-					spin_lock_irq(&vm->async_ops.lock);
-					op = next_async_op(vm);
-					XE_BUG_ON(!op);
-					list_del_init(&op->link);
-					spin_unlock_irq(&vm->async_ops.lock);
-
-					goto again;
-				}
-			}
-			if (err) {
-				trace_xe_vma_fail(op->vma);
-				drm_warn(&vm->xe->drm, "Async VM op(%d) failed with %d",
-					 VM_BIND_OP(op->bind_op.op),
-					 err);
+			op->gt_mask = gt_mask;
+			op->map.immediate =
+				operation & XE_VM_BIND_FLAG_IMMEDIATE;
+			op->map.read_only =
+				operation & XE_VM_BIND_FLAG_READONLY;
+		}
+		break;
+	case XE_VM_BIND_OP_UNMAP:
+		ops = drm_gpuva_sm_unmap_ops_create(&vm->mgr, addr, range);
+		drm_gpuva_for_each_op(__op, ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
-				spin_lock_irq(&vm->async_ops.lock);
-				list_add(&op->link, &vm->async_ops.pending);
-				spin_unlock_irq(&vm->async_ops.lock);
+			op->gt_mask = gt_mask;
+		}
+		break;
+	case XE_VM_BIND_OP_PREFETCH:
+		ops = drm_gpuva_prefetch_ops_create(&vm->mgr, addr, range);
+		drm_gpuva_for_each_op(__op, ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
-				vm_set_async_error(vm, err);
-				up_write(&vm->lock);
+			op->gt_mask = gt_mask;
+			op->prefetch.region = region;
+		}
+		break;
+	case XE_VM_BIND_OP_UNMAP_ALL:
+		XE_BUG_ON(!bo);
 
-				if (vm->async_ops.error_capture.addr)
-					vm_error_capture(vm, err,
-							 op->bind_op.op,
-							 op->bind_op.addr,
-							 op->bind_op.range);
-				break;
-			}
-			up_write(&vm->lock);
-		} else {
-			trace_xe_vma_flush(op->vma);
+		err = xe_bo_lock(bo, &ww, 0, true);
+		if (err)
+			return ERR_PTR(err);
+		ops = drm_gpuva_gem_unmap_ops_create(&vm->mgr, obj);
+		xe_bo_unlock(bo, &ww);
 
-			if (is_unmap_op(op->bind_op.op)) {
-				down_write(&vm->lock);
-				xe_vma_destroy_unlocked(op->vma);
-				up_write(&vm->lock);
-			}
+		drm_gpuva_for_each_op(__op, ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
-			if (op->fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-						   &op->fence->fence.flags)) {
-				if (!xe_vm_no_dma_fences(vm)) {
-					op->fence->started = true;
-					smp_wmb();
-					wake_up_all(&op->fence->wq);
-				}
-				dma_fence_signal(&op->fence->fence);
-			}
+			op->gt_mask = gt_mask;
 		}
+		break;
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
+		ops = ERR_PTR(-EINVAL);
+	}
 
-		async_op_cleanup(vm, op);
+#ifdef TEST_VM_ASYNC_OPS_ERROR
+	if (!IS_ERR(ops) && operation & FORCE_ASYNC_OP_ERROR) {
+		op = list_first_entry_or_null(&ops->list, struct xe_vma_op,
+					      base.entry);
+		if (op)
+			op->inject_error = true;
 	}
+#endif
+
+	if (!IS_ERR(ops))
+		drm_gpuva_for_each_op(__op, ops)
+			print_op(vm->xe, __op);
+
+	return ops;
 }
 
-static int __vm_bind_ioctl_async(struct xe_vm *vm, struct xe_vma *vma,
-				 struct xe_engine *e, struct xe_bo *bo,
-				 struct drm_xe_vm_bind_op *bind_op,
-				 struct xe_sync_entry *syncs, u32 num_syncs)
+static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
+			      u64 gt_mask, bool read_only)
 {
-	struct async_op *op;
-	bool installed = false;
-	u64 seqno;
-	int i;
+	struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL;
+	struct xe_vma *vma;
+	struct ww_acquire_ctx ww;
+	int err;
 
-	lockdep_assert_held(&vm->lock);
+	lockdep_assert_held_write(&vm->lock);
 
-	op = kmalloc(sizeof(*op), GFP_KERNEL);
-	if (!op) {
-		return -ENOMEM;
+	if (bo) {
+		err = xe_bo_lock(bo, &ww, 0, true);
+		if (err)
+			return ERR_PTR(err);
 	}
+	vma = xe_vma_create(vm, bo, op->gem.offset,
+			    op->va.addr, op->va.addr +
+			    op->va.range - 1, read_only,
+			    gt_mask);
+	if (bo)
+		xe_bo_unlock(bo, &ww);
 
-	if (num_syncs) {
-		op->fence = kmalloc(sizeof(*op->fence), GFP_KERNEL);
-		if (!op->fence) {
-			kfree(op);
-			return -ENOMEM;
+	if (xe_vma_is_userptr(vma)) {
+		err = xe_vma_userptr_pin_pages(vma);
+		if (err) {
+			xe_vma_destroy(vma, NULL);
+			return ERR_PTR(err);
 		}
+	} else if (!bo->vm) {
+		vm_insert_extobj(vm, vma);
+		err = add_preempt_fences(vm, bo);
+		if (err) {
+			xe_vma_destroy(vma, NULL);
+			return ERR_PTR(err);
+		}
+	}
+
+	return vma;
+}
+
+/*
+ * Parse operations list and create any resources needed for the operations
+ * prior to fully committing to the operations. This step can fail.
+ */
+static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_engine *e,
+				   struct drm_gpuva_ops **ops, int num_ops_list,
+				   struct xe_sync_entry *syncs, u32 num_syncs,
+				   struct list_head *ops_list, bool async)
+{
+	struct xe_vma_op *last_op = NULL;
+	struct list_head *async_list = NULL;
+	struct async_op_fence *fence = NULL;
+	int err, i;
+
+	lockdep_assert_held_write(&vm->lock);
+	XE_BUG_ON(num_ops_list > 1 && !async);
+
+	if (num_syncs && async) {
+		u64 seqno;
+
+		fence = kmalloc(sizeof(*fence), GFP_KERNEL);
+		if (!fence)
+			return -ENOMEM;
 
 		seqno = e ? ++e->bind.fence_seqno : ++vm->async_ops.fence.seqno;
-		dma_fence_init(&op->fence->fence, &async_op_fence_ops,
+		dma_fence_init(&fence->fence, &async_op_fence_ops,
 			       &vm->async_ops.lock, e ? e->bind.fence_ctx :
 			       vm->async_ops.fence.context, seqno);
 
 		if (!xe_vm_no_dma_fences(vm)) {
-			op->fence->vm = vm;
-			op->fence->started = false;
-			init_waitqueue_head(&op->fence->wq);
+			fence->vm = vm;
+			fence->started = false;
+			init_waitqueue_head(&fence->wq);
 		}
-	} else {
-		op->fence = NULL;
 	}
-	op->vma = vma;
-	op->engine = e;
-	op->bo = bo;
-	op->bind_op = *bind_op;
-	op->syncs = syncs;
-	op->num_syncs = num_syncs;
-	INIT_LIST_HEAD(&op->link);
 
-	for (i = 0; i < num_syncs; i++)
-		installed |= xe_sync_entry_signal(&syncs[i], NULL,
-						  &op->fence->fence);
+	for (i = 0; i < num_ops_list; ++i) {
+		struct drm_gpuva_ops *__ops = ops[i];
+		struct drm_gpuva_op *__op;
 
-	if (!installed && op->fence)
-		dma_fence_signal(&op->fence->fence);
+		drm_gpuva_for_each_op(__op, __ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
+			bool first = !async_list;
 
-	spin_lock_irq(&vm->async_ops.lock);
-	list_add_tail(&op->link, &vm->async_ops.pending);
-	spin_unlock_irq(&vm->async_ops.lock);
+			XE_BUG_ON(!first && !async);
 
-	if (!vm->async_ops.error)
-		queue_work(system_unbound_wq, &vm->async_ops.work);
+			INIT_LIST_HEAD(&op->link);
+			if (first)
+				async_list = ops_list;
+			list_add_tail(&op->link, async_list);
 
-	return 0;
-}
-
-static int vm_bind_ioctl_async(struct xe_vm *vm, struct xe_vma *vma,
-			       struct xe_engine *e, struct xe_bo *bo,
-			       struct drm_xe_vm_bind_op *bind_op,
-			       struct xe_sync_entry *syncs, u32 num_syncs)
-{
-	struct xe_vma *__vma, *next;
-	struct list_head rebind_list;
-	struct xe_sync_entry *in_syncs = NULL, *out_syncs = NULL;
-	u32 num_in_syncs = 0, num_out_syncs = 0;
-	bool first = true, last;
-	int err;
-	int i;
+			if (first) {
+				op->flags |= XE_VMA_OP_FIRST;
+				op->num_syncs = num_syncs;
+				op->syncs = syncs;
+			}
 
-	lockdep_assert_held(&vm->lock);
+			op->engine = e;
 
-	/* Not a linked list of unbinds + rebinds, easy */
-	if (list_empty(&vma->unbind_link))
-		return __vm_bind_ioctl_async(vm, vma, e, bo, bind_op,
-					     syncs, num_syncs);
+			switch (op->base.op) {
+			case DRM_GPUVA_OP_MAP:
+			{
+				struct xe_vma *vma;
 
-	/*
-	 * Linked list of unbinds + rebinds, decompose syncs into 'in / out'
-	 * passing the 'in' to the first operation and 'out' to the last. Also
-	 * the reference counting is a little tricky, increment the VM / bind
-	 * engine ref count on all but the last operation and increment the BOs
-	 * ref count on each rebind.
-	 */
+				vma = new_vma(vm, &op->base.map,
+					      op->gt_mask, op->map.read_only);
+				if (IS_ERR(vma)) {
+					err = PTR_ERR(vma);
+					goto free_fence;
+				}
 
-	XE_BUG_ON(VM_BIND_OP(bind_op->op) != XE_VM_BIND_OP_UNMAP &&
-		  VM_BIND_OP(bind_op->op) != XE_VM_BIND_OP_UNMAP_ALL &&
-		  VM_BIND_OP(bind_op->op) != XE_VM_BIND_OP_PREFETCH);
+				op->map.vma = vma;
+				break;
+			}
+			case DRM_GPUVA_OP_REMAP:
+				if (op->base.remap.prev) {
+					struct xe_vma *vma;
+					bool read_only =
+						op->base.remap.unmap->va->flags &
+						XE_VMA_READ_ONLY;
+
+					vma = new_vma(vm, op->base.remap.prev,
+						      op->gt_mask, read_only);
+					if (IS_ERR(vma)) {
+						err = PTR_ERR(vma);
+						goto free_fence;
+					}
+
+					op->remap.prev = vma;
+				}
 
-	/* Decompose syncs */
-	if (num_syncs) {
-		in_syncs = kmalloc(sizeof(*in_syncs) * num_syncs, GFP_KERNEL);
-		out_syncs = kmalloc(sizeof(*out_syncs) * num_syncs, GFP_KERNEL);
-		if (!in_syncs || !out_syncs) {
-			err = -ENOMEM;
-			goto out_error;
-		}
+				if (op->base.remap.next) {
+					struct xe_vma *vma;
+					bool read_only =
+						op->base.remap.unmap->va->flags &
+						XE_VMA_READ_ONLY;
 
-		for (i = 0; i < num_syncs; ++i) {
-			bool signal = syncs[i].flags & DRM_XE_SYNC_SIGNAL;
+					vma = new_vma(vm, op->base.remap.next,
+						      op->gt_mask, read_only);
+					if (IS_ERR(vma)) {
+						err = PTR_ERR(vma);
+						goto free_fence;
+					}
 
-			if (signal)
-				out_syncs[num_out_syncs++] = syncs[i];
-			else
-				in_syncs[num_in_syncs++] = syncs[i];
-		}
-	}
+					op->remap.next = vma;
+				}
 
-	/* Do unbinds + move rebinds to new list */
-	INIT_LIST_HEAD(&rebind_list);
-	list_for_each_entry_safe(__vma, next, &vma->unbind_link, unbind_link) {
-		if (__vma->destroyed ||
-		    VM_BIND_OP(bind_op->op) == XE_VM_BIND_OP_PREFETCH) {
-			list_del_init(&__vma->unbind_link);
-			xe_bo_get(bo);
-			err = __vm_bind_ioctl_async(xe_vm_get(vm), __vma,
-						    e ? xe_engine_get(e) : NULL,
-						    bo, bind_op, first ?
-						    in_syncs : NULL,
-						    first ? num_in_syncs : 0);
-			if (err) {
-				xe_bo_put(bo);
-				xe_vm_put(vm);
-				if (e)
-					xe_engine_put(e);
-				goto out_error;
+				/* XXX: Support not doing remaps */
+				op->remap.start =
+					xe_vma_start(gpuva_to_vma(op->base.remap.unmap->va));
+				op->remap.range =
+					xe_vma_size(gpuva_to_vma(op->base.remap.unmap->va));
+				break;
+			case DRM_GPUVA_OP_UNMAP:
+				op->unmap.start =
+					xe_vma_start(gpuva_to_vma(op->base.unmap.va));
+				op->unmap.range =
+					xe_vma_size(gpuva_to_vma(op->base.unmap.va));
+				break;
+			case DRM_GPUVA_OP_PREFETCH:
+				/* Nothing to do */
+				break;
+			default:
+				XE_BUG_ON("NOT POSSIBLE");
 			}
-			in_syncs = NULL;
-			first = false;
-		} else {
-			list_move_tail(&__vma->unbind_link, &rebind_list);
-		}
-	}
-	last = list_empty(&rebind_list);
-	if (!last) {
-		xe_vm_get(vm);
-		if (e)
-			xe_engine_get(e);
-	}
-	err = __vm_bind_ioctl_async(vm, vma, e,
-				    bo, bind_op,
-				    first ? in_syncs :
-				    last ? out_syncs : NULL,
-				    first ? num_in_syncs :
-				    last ? num_out_syncs : 0);
-	if (err) {
-		if (!last) {
-			xe_vm_put(vm);
-			if (e)
-				xe_engine_put(e);
-		}
-		goto out_error;
-	}
-	in_syncs = NULL;
-
-	/* Do rebinds */
-	list_for_each_entry_safe(__vma, next, &rebind_list, unbind_link) {
-		list_del_init(&__vma->unbind_link);
-		last = list_empty(&rebind_list);
 
-		if (xe_vma_is_userptr(__vma)) {
-			bind_op->op = XE_VM_BIND_FLAG_ASYNC |
-				XE_VM_BIND_OP_MAP_USERPTR;
-		} else {
-			bind_op->op = XE_VM_BIND_FLAG_ASYNC |
-				XE_VM_BIND_OP_MAP;
-			xe_bo_get(__vma->bo);
-		}
-
-		if (!last) {
-			xe_vm_get(vm);
-			if (e)
-				xe_engine_get(e);
+			last_op = op;
 		}
 
-		err = __vm_bind_ioctl_async(vm, __vma, e,
-					    __vma->bo, bind_op, last ?
-					    out_syncs : NULL,
-					    last ? num_out_syncs : 0);
-		if (err) {
-			if (!last) {
-				xe_vm_put(vm);
-				if (e)
-					xe_engine_put(e);
-			}
-			goto out_error;
-		}
+		last_op->ops = __ops;
 	}
 
-	kfree(syncs);
-	return 0;
+	XE_BUG_ON(!last_op);	/* FIXME: This is not an error, handle */
 
-out_error:
-	kfree(in_syncs);
-	kfree(out_syncs);
-	kfree(syncs);
+	last_op->flags |= XE_VMA_OP_LAST;
+	last_op->num_syncs = num_syncs;
+	last_op->syncs = syncs;
+	last_op->fence = fence;
 
+	return 0;
+
+free_fence:
+	kfree(fence);
 	return err;
 }
 
-static bool bo_has_vm_references(struct xe_bo *bo, struct xe_vm *vm,
-				 struct xe_vma *ignore)
+static void xe_vma_op_commit(struct xe_vm *vm, struct xe_vma_op *op)
 {
-	struct ww_acquire_ctx ww;
-	struct xe_vma *vma;
-	bool ret = false;
+	lockdep_assert_held_write(&vm->lock);
 
-	xe_bo_lock(bo, &ww, 0, false);
-	list_for_each_entry(vma, &bo->vmas, bo_link) {
-		if (vma != ignore && vma->vm == vm && !vma->destroyed) {
-			ret = true;
-			break;
-		}
+	switch (op->base.op) {
+	case DRM_GPUVA_OP_MAP:
+		xe_vm_insert_vma(vm, op->map.vma);
+		break;
+	case DRM_GPUVA_OP_REMAP:
+		prep_vma_destroy(vm, gpuva_to_vma(op->base.remap.unmap->va));
+		if (op->remap.prev)
+			xe_vm_insert_vma(vm, op->remap.prev);
+		if (op->remap.next)
+			xe_vm_insert_vma(vm, op->remap.next);
+		break;
+	case DRM_GPUVA_OP_UNMAP:
+		prep_vma_destroy(vm, gpuva_to_vma(op->base.unmap.va));
+		break;
+	case DRM_GPUVA_OP_PREFETCH:
+		/* Nothing to do */
+		break;
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
 	}
-	xe_bo_unlock(bo, &ww);
-
-	return ret;
 }
 
-static int vm_insert_extobj(struct xe_vm *vm, struct xe_vma *vma)
+static int __xe_vma_op_execute(struct xe_vm *vm, struct xe_vma *vma,
+			       struct xe_vma_op *op)
 {
-	struct xe_bo *bo = vma->bo;
+	LIST_HEAD(objs);
+	LIST_HEAD(dups);
+	struct ttm_validate_buffer tv_bo, tv_vm;
+	struct ww_acquire_ctx ww;
+	struct xe_bo *vbo;
+	int err;
 
 	lockdep_assert_held_write(&vm->lock);
 
-	if (bo_has_vm_references(bo, vm, vma))
-		return 0;
+	xe_vm_tv_populate(vm, &tv_vm);
+	list_add_tail(&tv_vm.head, &objs);
+	vbo = xe_vma_bo(vma);
+	if (vbo) {
+		/*
+		 * An unbind can drop the last reference to the BO and
+		 * the BO is needed for ttm_eu_backoff_reservation so
+		 * take a reference here.
+		 */
+		xe_bo_get(vbo);
 
-	list_add(&vma->extobj.link, &vm->extobj.list);
-	vm->extobj.entries++;
+		tv_bo.bo = &vbo->ttm;
+		tv_bo.num_shared = 1;
+		list_add(&tv_bo.head, &objs);
+	}
 
-	return 0;
-}
+again:
+	err = ttm_eu_reserve_buffers(&ww, &objs, true, &dups);
+	if (err) {
+		xe_bo_put(vbo);
+		return err;
+	}
 
-static int __vm_bind_ioctl_lookup_vma(struct xe_vm *vm, struct xe_bo *bo,
-				      u64 addr, u64 range, u32 op)
-{
-	struct xe_device *xe = vm->xe;
-	struct xe_vma *vma, lookup;
-	bool async = !!(op & XE_VM_BIND_FLAG_ASYNC);
+	xe_vm_assert_held(vm);
+	xe_bo_assert_held(xe_vma_bo(vma));
+
+	switch (op->base.op) {
+	case DRM_GPUVA_OP_MAP:
+		err = xe_vm_bind(vm, vma, op->engine, xe_vma_bo(vma),
+				 op->syncs, op->num_syncs, op->fence,
+				 op->map.immediate || !xe_vm_in_fault_mode(vm),
+				 op->flags & XE_VMA_OP_FIRST,
+				 op->flags & XE_VMA_OP_LAST);
+		break;
+	case DRM_GPUVA_OP_REMAP:
+	{
+		bool prev = !!op->remap.prev;
+		bool next = !!op->remap.next;
+
+		if (!op->remap.unmap_done) {
+			vm->async_ops.munmap_rebind_inflight = true;
+			if (prev || next)
+				vma->gpuva.flags |= XE_VMA_FIRST_REBIND;
+			err = xe_vm_unbind(vm, vma, op->engine, op->syncs,
+					   op->num_syncs,
+					   !prev && !next ? op->fence : NULL,
+					   op->flags & XE_VMA_OP_FIRST,
+					   op->flags & XE_VMA_OP_LAST && !prev &&
+					   !next);
+			if (err)
+				break;
+			op->remap.unmap_done = true;
+		}
 
-	lockdep_assert_held(&vm->lock);
+		if (prev) {
+			op->remap.prev->gpuva.flags |= XE_VMA_LAST_REBIND;
+			err = xe_vm_bind(vm, op->remap.prev, op->engine,
+					 xe_vma_bo(op->remap.prev), op->syncs,
+					 op->num_syncs,
+					 !next ? op->fence : NULL, true, false,
+					 op->flags & XE_VMA_OP_LAST && !next);
+			op->remap.prev->gpuva.flags &= ~XE_VMA_LAST_REBIND;
+			if (err)
+				break;
+			op->remap.prev = NULL;
+		}
 
-	lookup.start = addr;
-	lookup.end = addr + range - 1;
+		if (next) {
+			op->remap.next->gpuva.flags |= XE_VMA_LAST_REBIND;
+			err = xe_vm_bind(vm, op->remap.next, op->engine,
+					 xe_vma_bo(op->remap.next),
+					 op->syncs, op->num_syncs,
+					 op->fence, true, false,
+					 op->flags & XE_VMA_OP_LAST);
+			op->remap.next->gpuva.flags &= ~XE_VMA_LAST_REBIND;
+			if (err)
+				break;
+			op->remap.next = NULL;
+		}
+		vm->async_ops.munmap_rebind_inflight = false;
 
-	switch (VM_BIND_OP(op)) {
-	case XE_VM_BIND_OP_MAP:
-	case XE_VM_BIND_OP_MAP_USERPTR:
-		vma = xe_vm_find_overlapping_vma(vm, &lookup);
-		if (XE_IOCTL_ERR(xe, vma))
-			return -EBUSY;
 		break;
-	case XE_VM_BIND_OP_UNMAP:
-	case XE_VM_BIND_OP_PREFETCH:
-		vma = xe_vm_find_overlapping_vma(vm, &lookup);
-		if (XE_IOCTL_ERR(xe, !vma) ||
-		    XE_IOCTL_ERR(xe, (vma->start != addr ||
-				 vma->end != addr + range - 1) && !async))
-			return -EINVAL;
+	}
+	case DRM_GPUVA_OP_UNMAP:
+		err = xe_vm_unbind(vm, vma, op->engine, op->syncs,
+				   op->num_syncs, op->fence,
+				   op->flags & XE_VMA_OP_FIRST,
+				   op->flags & XE_VMA_OP_LAST);
 		break;
-	case XE_VM_BIND_OP_UNMAP_ALL:
+	case DRM_GPUVA_OP_PREFETCH:
+		err = xe_vm_prefetch(vm, vma, op->engine, op->prefetch.region,
+				     op->syncs, op->num_syncs, op->fence,
+				     op->flags & XE_VMA_OP_FIRST,
+				     op->flags & XE_VMA_OP_LAST);
 		break;
 	default:
 		XE_BUG_ON("NOT POSSIBLE");
-		return -EINVAL;
 	}
 
-	return 0;
-}
-
-static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma)
-{
-	down_read(&vm->userptr.notifier_lock);
-	vma->destroyed = true;
-	up_read(&vm->userptr.notifier_lock);
-	xe_vm_remove_vma(vm, vma);
-}
-
-static int prep_replacement_vma(struct xe_vm *vm, struct xe_vma *vma)
-{
-	int err;
-
-	if (vma->bo && !vma->bo->vm) {
-		vm_insert_extobj(vm, vma);
-		err = add_preempt_fences(vm, vma->bo);
-		if (err)
-			return err;
+	ttm_eu_backoff_reservation(&ww, &objs);
+	if (err == -EAGAIN && xe_vma_is_userptr(vma)) {
+		lockdep_assert_held_write(&vm->lock);
+		err = xe_vma_userptr_pin_pages(vma);
+		if (!err)
+			goto again;
 	}
+	xe_bo_put(vbo);
 
-	return 0;
+	if (err)
+		trace_xe_vma_fail(vma);
+
+	return err;
 }
 
-/*
- * Find all overlapping VMAs in lookup range and add to a list in the returned
- * VMA, all of VMAs found will be unbound. Also possibly add 2 new VMAs that
- * need to be bound if first / last VMAs are not fully unbound. This is akin to
- * how munmap works.
- */
-static struct xe_vma *vm_unbind_lookup_vmas(struct xe_vm *vm,
-					    struct xe_vma *lookup)
+static int xe_vma_op_execute(struct xe_vm *vm, struct xe_vma_op *op)
 {
-	struct xe_vma *vma = xe_vm_find_overlapping_vma(vm, lookup);
-	struct rb_node *node;
-	struct xe_vma *first = vma, *last = vma, *new_first = NULL,
-		      *new_last = NULL, *__vma, *next;
-	int err = 0;
-	bool first_munmap_rebind = false;
+	int ret = 0;
 
-	lockdep_assert_held(&vm->lock);
-	XE_BUG_ON(!vma);
-
-	node = &vma->vm_node;
-	while ((node = rb_next(node))) {
-		if (!xe_vma_cmp_vma_cb(lookup, node)) {
-			__vma = to_xe_vma(node);
-			list_add_tail(&__vma->unbind_link, &vma->unbind_link);
-			last = __vma;
-		} else {
-			break;
-		}
-	}
+	lockdep_assert_held_write(&vm->lock);
 
-	node = &vma->vm_node;
-	while ((node = rb_prev(node))) {
-		if (!xe_vma_cmp_vma_cb(lookup, node)) {
-			__vma = to_xe_vma(node);
-			list_add(&__vma->unbind_link, &vma->unbind_link);
-			first = __vma;
-		} else {
-			break;
-		}
+#ifdef TEST_VM_ASYNC_OPS_ERROR
+	if (op->inject_error) {
+		op->inject_error = false;
+		return -ENOMEM;
 	}
+#endif
 
-	if (first->start != lookup->start) {
-		struct ww_acquire_ctx ww;
+	switch (op->base.op) {
+	case DRM_GPUVA_OP_MAP:
+		ret = __xe_vma_op_execute(vm, op->map.vma, op);
+		break;
+	case DRM_GPUVA_OP_UNMAP:
+		ret = __xe_vma_op_execute(vm,
+					  gpuva_to_vma(op->base.unmap.va),
+					  op);
+		break;
+	case DRM_GPUVA_OP_REMAP:
+	{
+		struct xe_vma *vma;
+
+		if (!op->remap.unmap_done)
+			vma = gpuva_to_vma(op->base.remap.unmap->va);
+		else if (op->remap.prev)
+			vma = op->remap.prev;
+		else
+			vma = op->remap.next;
 
-		if (first->bo)
-			err = xe_bo_lock(first->bo, &ww, 0, true);
-		if (err)
-			goto unwind;
-		new_first = xe_vma_create(first->vm, first->bo,
-					  first->bo ? first->bo_offset :
-					  first->userptr.ptr,
-					  first->start,
-					  lookup->start - 1,
-					  (first->pte_flags & PTE_READ_ONLY),
-					  first->gt_mask);
-		if (first->bo)
-			xe_bo_unlock(first->bo, &ww);
-		if (!new_first) {
-			err = -ENOMEM;
-			goto unwind;
-		}
-		if (!first->bo) {
-			err = xe_vma_userptr_pin_pages(new_first);
-			if (err)
-				goto unwind;
-		}
-		err = prep_replacement_vma(vm, new_first);
-		if (err)
-			goto unwind;
+		ret = __xe_vma_op_execute(vm, vma, op);
+		break;
 	}
-
-	if (last->end != lookup->end) {
-		struct ww_acquire_ctx ww;
-		u64 chunk = lookup->end + 1 - last->start;
-
-		if (last->bo)
-			err = xe_bo_lock(last->bo, &ww, 0, true);
-		if (err)
-			goto unwind;
-		new_last = xe_vma_create(last->vm, last->bo,
-					 last->bo ? last->bo_offset + chunk :
-					 last->userptr.ptr + chunk,
-					 last->start + chunk,
-					 last->end,
-					 (last->pte_flags & PTE_READ_ONLY),
-					 last->gt_mask);
-		if (last->bo)
-			xe_bo_unlock(last->bo, &ww);
-		if (!new_last) {
-			err = -ENOMEM;
-			goto unwind;
-		}
-		if (!last->bo) {
-			err = xe_vma_userptr_pin_pages(new_last);
-			if (err)
-				goto unwind;
-		}
-		err = prep_replacement_vma(vm, new_last);
-		if (err)
-			goto unwind;
+	case DRM_GPUVA_OP_PREFETCH:
+		ret = __xe_vma_op_execute(vm,
+					  gpuva_to_vma(op->base.prefetch.va),
+					  op);
+		break;
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
 	}
 
-	prep_vma_destroy(vm, vma);
-	if (list_empty(&vma->unbind_link) && (new_first || new_last))
-		vma->first_munmap_rebind = true;
-	list_for_each_entry(__vma, &vma->unbind_link, unbind_link) {
-		if ((new_first || new_last) && !first_munmap_rebind) {
-			__vma->first_munmap_rebind = true;
-			first_munmap_rebind = true;
-		}
-		prep_vma_destroy(vm, __vma);
-	}
-	if (new_first) {
-		xe_vm_insert_vma(vm, new_first);
-		list_add_tail(&new_first->unbind_link, &vma->unbind_link);
-		if (!new_last)
-			new_first->last_munmap_rebind = true;
+	return ret;
+}
+
+static void xe_vma_op_cleanup(struct xe_vm *vm, struct xe_vma_op *op)
+{
+	if (op->flags & XE_VMA_OP_LAST) {
+		while (op->num_syncs--)
+			xe_sync_entry_cleanup(&op->syncs[op->num_syncs]);
+		kfree(op->syncs);
+		if (op->engine)
+			xe_engine_put(op->engine);
+		if (op->fence)
+			dma_fence_put(&op->fence->fence);
 	}
-	if (new_last) {
-		xe_vm_insert_vma(vm, new_last);
-		list_add_tail(&new_last->unbind_link, &vma->unbind_link);
-		new_last->last_munmap_rebind = true;
+	if (!list_empty(&op->link)) {
+		spin_lock_irq(&vm->async_ops.lock);
+		list_del(&op->link);
+		spin_unlock_irq(&vm->async_ops.lock);
 	}
+	if (op->ops)
+		drm_gpuva_ops_free(&vm->mgr, op->ops);
+}
 
-	return vma;
+static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op)
+{
+	lockdep_assert_held_write(&vm->lock);
+
+	switch (op->base.op) {
+	case DRM_GPUVA_OP_MAP:
+		prep_vma_destroy(vm, op->map.vma);
+		xe_vma_destroy(op->map.vma, NULL);
+		break;
+	case DRM_GPUVA_OP_UNMAP:
+	{
+		struct xe_vma *vma = gpuva_to_vma(op->base.unmap.va);
 
-unwind:
-	list_for_each_entry_safe(__vma, next, &vma->unbind_link, unbind_link)
-		list_del_init(&__vma->unbind_link);
-	if (new_last) {
-		prep_vma_destroy(vm, new_last);
-		xe_vma_destroy_unlocked(new_last);
+		down_read(&vm->userptr.notifier_lock);
+		vma->gpuva.flags &= ~XE_VMA_DESTROYED;
+		up_read(&vm->userptr.notifier_lock);
+		xe_vm_insert_vma(vm, vma);
+		break;
 	}
-	if (new_first) {
-		prep_vma_destroy(vm, new_first);
-		xe_vma_destroy_unlocked(new_first);
+	case DRM_GPUVA_OP_PREFETCH:
+		/* Nothing to do */
+		break;
+	case DRM_GPUVA_OP_REMAP:
+	default:
+		XE_BUG_ON("NOT POSSIBLE");
 	}
+}
 
-	return ERR_PTR(err);
+static struct xe_vma_op *next_vma_op(struct xe_vm *vm)
+{
+	return list_first_entry_or_null(&vm->async_ops.pending,
+					struct xe_vma_op, link);
 }
 
-/*
- * Similar to vm_unbind_lookup_vmas, find all VMAs in lookup range to prefetch
- */
-static struct xe_vma *vm_prefetch_lookup_vmas(struct xe_vm *vm,
-					      struct xe_vma *lookup,
-					      u32 region)
+static void xe_vma_op_work_func(struct work_struct *w)
 {
-	struct xe_vma *vma = xe_vm_find_overlapping_vma(vm, lookup), *__vma,
-		      *next;
-	struct rb_node *node;
+	struct xe_vm *vm = container_of(w, struct xe_vm, async_ops.work);
 
-	if (!xe_vma_is_userptr(vma)) {
-		if (!xe_bo_can_migrate(vma->bo, region_to_mem_type[region]))
-			return ERR_PTR(-EINVAL);
-	}
+	for (;;) {
+		struct xe_vma_op *op;
+		int err;
 
-	node = &vma->vm_node;
-	while ((node = rb_next(node))) {
-		if (!xe_vma_cmp_vma_cb(lookup, node)) {
-			__vma = to_xe_vma(node);
-			if (!xe_vma_is_userptr(__vma)) {
-				if (!xe_bo_can_migrate(__vma->bo, region_to_mem_type[region]))
-					goto flush_list;
-			}
-			list_add_tail(&__vma->unbind_link, &vma->unbind_link);
-		} else {
+		if (vm->async_ops.error && !xe_vm_is_closed(vm))
 			break;
-		}
-	}
 
-	node = &vma->vm_node;
-	while ((node = rb_prev(node))) {
-		if (!xe_vma_cmp_vma_cb(lookup, node)) {
-			__vma = to_xe_vma(node);
-			if (!xe_vma_is_userptr(__vma)) {
-				if (!xe_bo_can_migrate(__vma->bo, region_to_mem_type[region]))
-					goto flush_list;
-			}
-			list_add(&__vma->unbind_link, &vma->unbind_link);
-		} else {
+		spin_lock_irq(&vm->async_ops.lock);
+		op = next_vma_op(vm);
+		spin_unlock_irq(&vm->async_ops.lock);
+
+		if (!op)
 			break;
-		}
-	}
 
-	return vma;
+		if (!xe_vm_is_closed(vm)) {
+			down_write(&vm->lock);
+			err = xe_vma_op_execute(vm, op);
+			if (err) {
+				drm_warn(&vm->xe->drm, "Async VM op(%d) failed with %d",
+					 0, err);
 
-flush_list:
-	list_for_each_entry_safe(__vma, next, &vma->unbind_link,
-				 unbind_link)
-		list_del_init(&__vma->unbind_link);
+				vm_set_async_error(vm, err);
+				up_write(&vm->lock);
 
-	return ERR_PTR(-EINVAL);
-}
+				if (vm->async_ops.error_capture.addr)
+					vm_error_capture(vm, err, 0, 0, 0);
+				break;
+			}
+			up_write(&vm->lock);
+		} else {
+			struct xe_vma *vma;
 
-static struct xe_vma *vm_unbind_all_lookup_vmas(struct xe_vm *vm,
-						struct xe_bo *bo)
-{
-	struct xe_vma *first = NULL, *vma;
+			switch (op->base.op) {
+			case DRM_GPUVA_OP_REMAP:
+				vma = gpuva_to_vma(op->base.remap.unmap->va);
+				trace_xe_vma_flush(vma);
 
-	lockdep_assert_held(&vm->lock);
-	xe_bo_assert_held(bo);
+				down_write(&vm->lock);
+				xe_vma_destroy_unlocked(vma);
+				up_write(&vm->lock);
+				break;
+			case DRM_GPUVA_OP_UNMAP:
+				vma = gpuva_to_vma(op->base.unmap.va);
+				trace_xe_vma_flush(vma);
 
-	list_for_each_entry(vma, &bo->vmas, bo_link) {
-		if (vma->vm != vm)
-			continue;
+				down_write(&vm->lock);
+				xe_vma_destroy_unlocked(vma);
+				up_write(&vm->lock);
+				break;
+			default:
+				/* Nothing to do */
+				break;
+			}
 
-		prep_vma_destroy(vm, vma);
-		if (!first)
-			first = vma;
-		else
-			list_add_tail(&vma->unbind_link, &first->unbind_link);
-	}
+			if (op->fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+						   &op->fence->fence.flags)) {
+				if (!xe_vm_no_dma_fences(vm)) {
+					op->fence->started = true;
+					smp_wmb();
+					wake_up_all(&op->fence->wq);
+				}
+				dma_fence_signal(&op->fence->fence);
+			}
+		}
 
-	return first;
+		xe_vma_op_cleanup(vm, op);
+	}
 }
 
-static struct xe_vma *vm_bind_ioctl_lookup_vma(struct xe_vm *vm,
-					       struct xe_bo *bo,
-					       u64 bo_offset_or_userptr,
-					       u64 addr, u64 range, u32 op,
-					       u64 gt_mask, u32 region)
+/*
+ * Commit the operations list. This step cannot fail in async mode; in sync
+ * mode it fails if the bind operation fails.
+ */
+static int vm_bind_ioctl_ops_commit(struct xe_vm *vm,
+				    struct list_head *ops_list, bool async)
 {
-	struct ww_acquire_ctx ww;
-	struct xe_vma *vma, lookup;
-	int err;
-
-	lockdep_assert_held(&vm->lock);
+	struct xe_vma_op *op, *last_op;
+	int err = 0;
 
-	lookup.start = addr;
-	lookup.end = addr + range - 1;
+	lockdep_assert_held_write(&vm->lock);
 
-	switch (VM_BIND_OP(op)) {
-	case XE_VM_BIND_OP_MAP:
-		XE_BUG_ON(!bo);
+	list_for_each_entry(op, ops_list, link) {
+		last_op = op;
+		xe_vma_op_commit(vm, op);
+	}
 
-		err = xe_bo_lock(bo, &ww, 0, true);
+	if (!async) {
+		err = xe_vma_op_execute(vm, last_op);
 		if (err)
-			return ERR_PTR(err);
-		vma = xe_vma_create(vm, bo, bo_offset_or_userptr, addr,
-				    addr + range - 1,
-				    op & XE_VM_BIND_FLAG_READONLY,
-				    gt_mask);
-		xe_bo_unlock(bo, &ww);
-		if (!vma)
-			return ERR_PTR(-ENOMEM);
+			xe_vma_op_unwind(vm, last_op);
+		xe_vma_op_cleanup(vm, last_op);
+	} else {
+		int i;
+		bool installed = false;
 
-		xe_vm_insert_vma(vm, vma);
-		if (!bo->vm) {
-			vm_insert_extobj(vm, vma);
-			err = add_preempt_fences(vm, bo);
-			if (err) {
-				prep_vma_destroy(vm, vma);
-				xe_vma_destroy_unlocked(vma);
+		for (i = 0; i < last_op->num_syncs; i++)
+			installed |= xe_sync_entry_signal(&last_op->syncs[i],
+							  NULL,
+							  &last_op->fence->fence);
+		if (!installed && last_op->fence)
+			dma_fence_signal(&last_op->fence->fence);
 
-				return ERR_PTR(err);
-			}
-		}
-		break;
-	case XE_VM_BIND_OP_UNMAP:
-		vma = vm_unbind_lookup_vmas(vm, &lookup);
-		break;
-	case XE_VM_BIND_OP_PREFETCH:
-		vma = vm_prefetch_lookup_vmas(vm, &lookup, region);
-		break;
-	case XE_VM_BIND_OP_UNMAP_ALL:
-		XE_BUG_ON(!bo);
+		spin_lock_irq(&vm->async_ops.lock);
+		list_splice_tail(ops_list, &vm->async_ops.pending);
+		spin_unlock_irq(&vm->async_ops.lock);
 
-		err = xe_bo_lock(bo, &ww, 0, true);
-		if (err)
-			return ERR_PTR(err);
-		vma = vm_unbind_all_lookup_vmas(vm, bo);
-		if (!vma)
-			vma = ERR_PTR(-EINVAL);
-		xe_bo_unlock(bo, &ww);
-		break;
-	case XE_VM_BIND_OP_MAP_USERPTR:
-		XE_BUG_ON(bo);
+		if (!vm->async_ops.error)
+			queue_work(system_unbound_wq, &vm->async_ops.work);
+	}
 
-		vma = xe_vma_create(vm, NULL, bo_offset_or_userptr, addr,
-				    addr + range - 1,
-				    op & XE_VM_BIND_FLAG_READONLY,
-				    gt_mask);
-		if (!vma)
-			return ERR_PTR(-ENOMEM);
+	return err;
+}
 
-		err = xe_vma_userptr_pin_pages(vma);
-		if (err) {
-			prep_vma_destroy(vm, vma);
-			xe_vma_destroy_unlocked(vma);
+/*
+ * Unwind operations list, called after a failure of vm_bind_ioctl_ops_create or
+ * vm_bind_ioctl_ops_parse.
+ */
+static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm,
+				     struct drm_gpuva_ops **ops,
+				     int num_ops_list)
+{
+	int i;
 
-			return ERR_PTR(err);
-		} else {
-			xe_vm_insert_vma(vm, vma);
+	for (i = 0; i < num_ops_list; ++i) {
+		struct drm_gpuva_ops *__ops = ops[i];
+		struct drm_gpuva_op *__op;
+
+		if (!__ops)
+			continue;
+
+		drm_gpuva_for_each_op(__op, __ops) {
+			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
+
+			xe_vma_op_unwind(vm, op);
 		}
-		break;
-	default:
-		XE_BUG_ON("NOT POSSIBLE");
-		vma = ERR_PTR(-EINVAL);
 	}
-
-	return vma;
 }
 
 #ifdef TEST_VM_ASYNC_OPS_ERROR
@@ -2954,15 +2915,16 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	struct drm_xe_vm_bind *args = data;
 	struct drm_xe_sync __user *syncs_user;
 	struct xe_bo **bos = NULL;
-	struct xe_vma **vmas = NULL;
+	struct drm_gpuva_ops **ops = NULL;
 	struct xe_vm *vm;
 	struct xe_engine *e = NULL;
 	u32 num_syncs;
 	struct xe_sync_entry *syncs = NULL;
 	struct drm_xe_vm_bind_op *bind_ops;
+	LIST_HEAD(ops_list);
 	bool async;
 	int err;
-	int i, j = 0;
+	int i;
 
 	err = vm_bind_ioctl_check_args(xe, args, &bind_ops, &async);
 	if (err)
@@ -3050,8 +3012,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		goto put_engine;
 	}
 
-	vmas = kzalloc(sizeof(*vmas) * args->num_binds, GFP_KERNEL);
-	if (!vmas) {
+	ops = kzalloc(sizeof(*ops) * args->num_binds, GFP_KERNEL);
+	if (!ops) {
 		err = -ENOMEM;
 		goto put_engine;
 	}
@@ -3131,128 +3093,40 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u64 gt_mask = bind_ops[i].gt_mask;
 		u32 region = bind_ops[i].region;
 
-		vmas[i] = vm_bind_ioctl_lookup_vma(vm, bos[i], obj_offset,
-						   addr, range, op, gt_mask,
-						   region);
-		if (IS_ERR(vmas[i])) {
-			err = PTR_ERR(vmas[i]);
-			vmas[i] = NULL;
-			goto destroy_vmas;
-		}
-	}
-
-	for (j = 0; j < args->num_binds; ++j) {
-		struct xe_sync_entry *__syncs;
-		u32 __num_syncs = 0;
-		bool first_or_last = j == 0 || j == args->num_binds - 1;
-
-		if (args->num_binds == 1) {
-			__num_syncs = num_syncs;
-			__syncs = syncs;
-		} else if (first_or_last && num_syncs) {
-			bool first = j == 0;
-
-			__syncs = kmalloc(sizeof(*__syncs) * num_syncs,
-					  GFP_KERNEL);
-			if (!__syncs) {
-				err = ENOMEM;
-				break;
-			}
-
-			/* in-syncs on first bind, out-syncs on last bind */
-			for (i = 0; i < num_syncs; ++i) {
-				bool signal = syncs[i].flags &
-					DRM_XE_SYNC_SIGNAL;
-
-				if ((first && !signal) || (!first && signal))
-					__syncs[__num_syncs++] = syncs[i];
-			}
-		} else {
-			__num_syncs = 0;
-			__syncs = NULL;
-		}
-
-		if (async) {
-			bool last = j == args->num_binds - 1;
-
-			/*
-			 * Each pass of async worker drops the ref, take a ref
-			 * here, 1 set of refs taken above
-			 */
-			if (!last) {
-				if (e)
-					xe_engine_get(e);
-				xe_vm_get(vm);
-			}
-
-			err = vm_bind_ioctl_async(vm, vmas[j], e, bos[j],
-						  bind_ops + j, __syncs,
-						  __num_syncs);
-			if (err && !last) {
-				if (e)
-					xe_engine_put(e);
-				xe_vm_put(vm);
-			}
-			if (err)
-				break;
-		} else {
-			XE_BUG_ON(j != 0);	/* Not supported */
-			err = vm_bind_ioctl(vm, vmas[j], e, bos[j],
-					    bind_ops + j, __syncs,
-					    __num_syncs, NULL);
-			break;	/* Needed so cleanup loops work */
+		ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset,
+						  addr, range, op, gt_mask,
+						  region);
+		if (IS_ERR(ops[i])) {
+			err = PTR_ERR(ops[i]);
+			ops[i] = NULL;
+			goto unwind_ops;
 		}
 	}
 
-	/* Most of cleanup owned by the async bind worker */
-	if (async && !err) {
-		up_write(&vm->lock);
-		if (args->num_binds > 1)
-			kfree(syncs);
-		goto free_objs;
-	}
+	err = vm_bind_ioctl_ops_parse(vm, e, ops, args->num_binds,
+				      syncs, num_syncs, &ops_list, async);
+	if (err)
+		goto unwind_ops;
 
-destroy_vmas:
-	for (i = j; err && i < args->num_binds; ++i) {
-		u32 op = bind_ops[i].op;
-		struct xe_vma *vma, *next;
+	err = vm_bind_ioctl_ops_commit(vm, &ops_list, async);
+	up_write(&vm->lock);
 
-		if (!vmas[i])
-			break;
+	for (i = 0; i < args->num_binds; ++i)
+		xe_bo_put(bos[i]);
 
-		list_for_each_entry_safe(vma, next, &vma->unbind_link,
-					 unbind_link) {
-			list_del_init(&vma->unbind_link);
-			if (!vma->destroyed) {
-				prep_vma_destroy(vm, vma);
-				xe_vma_destroy_unlocked(vma);
-			}
-		}
+	goto free_objs;
 
-		switch (VM_BIND_OP(op)) {
-		case XE_VM_BIND_OP_MAP:
-			prep_vma_destroy(vm, vmas[i]);
-			xe_vma_destroy_unlocked(vmas[i]);
-			break;
-		case XE_VM_BIND_OP_MAP_USERPTR:
-			prep_vma_destroy(vm, vmas[i]);
-			xe_vma_destroy_unlocked(vmas[i]);
-			break;
-		}
-	}
+unwind_ops:
+	vm_bind_ioctl_ops_unwind(vm, ops, args->num_binds);
 release_vm_lock:
 	up_write(&vm->lock);
 free_syncs:
-	while (num_syncs--) {
-		if (async && j &&
-		    !(syncs[num_syncs].flags & DRM_XE_SYNC_SIGNAL))
-			continue;	/* Still in async worker */
+	while (num_syncs--)
 		xe_sync_entry_cleanup(&syncs[num_syncs]);
-	}
 
 	kfree(syncs);
 put_obj:
-	for (i = j; i < args->num_binds; ++i)
+	for (i = 0; i < args->num_binds; ++i)
 		xe_bo_put(bos[i]);
 put_engine:
 	if (e)
@@ -3261,7 +3135,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	xe_vm_put(vm);
 free_objs:
 	kfree(bos);
-	kfree(vmas);
+	kfree(ops);
 	if (args->num_binds > 1)
 		kfree(bind_ops);
 	return err;
@@ -3305,14 +3179,14 @@ void xe_vm_unlock(struct xe_vm *vm, struct ww_acquire_ctx *ww)
  */
 int xe_vm_invalidate_vma(struct xe_vma *vma)
 {
-	struct xe_device *xe = vma->vm->xe;
+	struct xe_device *xe = xe_vma_vm(vma)->xe;
 	struct xe_gt *gt;
 	u32 gt_needs_invalidate = 0;
 	int seqno[XE_MAX_GT];
 	u8 id;
 	int ret;
 
-	XE_BUG_ON(!xe_vm_in_fault_mode(vma->vm));
+	XE_BUG_ON(!xe_vm_in_fault_mode(xe_vma_vm(vma)));
 	trace_xe_vma_usm_invalidate(vma);
 
 	/* Check that we don't race with page-table updates */
@@ -3321,11 +3195,11 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 			WARN_ON_ONCE(!mmu_interval_check_retry
 				     (&vma->userptr.notifier,
 				      vma->userptr.notifier_seq));
-			WARN_ON_ONCE(!dma_resv_test_signaled(&vma->vm->resv,
+			WARN_ON_ONCE(!dma_resv_test_signaled(&xe_vma_vm(vma)->resv,
 							     DMA_RESV_USAGE_BOOKKEEP));
 
 		} else {
-			xe_bo_assert_held(vma->bo);
+			xe_bo_assert_held(xe_vma_bo(vma));
 		}
 	}
 
@@ -3355,7 +3229,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 #if IS_ENABLED(CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE)
 int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 {
-	struct rb_node *node;
+	struct drm_gpuva *gpuva;
 	bool is_vram;
 	uint64_t addr;
 
@@ -3368,8 +3242,8 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 		drm_printf(p, " VM root: A:0x%llx %s\n", addr, is_vram ? "VRAM" : "SYS");
 	}
 
-	for (node = rb_first(&vm->vmas); node; node = rb_next(node)) {
-		struct xe_vma *vma = to_xe_vma(node);
+	drm_gpuva_for_each_va(gpuva, &vm->mgr) {
+		struct xe_vma *vma = gpuva_to_vma(gpuva);
 		bool is_userptr = xe_vma_is_userptr(vma);
 
 		if (is_userptr) {
@@ -3378,10 +3252,10 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 			xe_res_first_sg(vma->userptr.sg, 0, GEN8_PAGE_SIZE, &cur);
 			addr = xe_res_dma(&cur);
 		} else {
-			addr = xe_bo_addr(vma->bo, 0, GEN8_PAGE_SIZE, &is_vram);
+			addr = xe_bo_addr(xe_vma_bo(vma), 0, GEN8_PAGE_SIZE, &is_vram);
 		}
 		drm_printf(p, " [%016llx-%016llx] S:0x%016llx A:%016llx %s\n",
-			   vma->start, vma->end, vma->end - vma->start + 1ull,
+			   xe_vma_start(vma), xe_vma_end(vma) - 1, xe_vma_size(vma),
 			   addr, is_userptr ? "USR" : is_vram ? "VRAM" : "SYS");
 	}
 	up_read(&vm->lock);
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 3468ed9d0528..ef665068bcf7 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -6,6 +6,7 @@
 #ifndef _XE_VM_H_
 #define _XE_VM_H_
 
+#include "xe_bo_types.h"
 #include "xe_macros.h"
 #include "xe_map.h"
 #include "xe_vm_types.h"
@@ -25,7 +26,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags);
 void xe_vm_free(struct kref *ref);
 
 struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id);
-int xe_vma_cmp_vma_cb(const void *key, const struct rb_node *node);
 
 static inline struct xe_vm *xe_vm_get(struct xe_vm *vm)
 {
@@ -50,7 +50,67 @@ static inline bool xe_vm_is_closed(struct xe_vm *vm)
 }
 
 struct xe_vma *
-xe_vm_find_overlapping_vma(struct xe_vm *vm, const struct xe_vma *vma);
+xe_vm_find_overlapping_vma(struct xe_vm *vm, u64 start, u64 range);
+
+static inline struct xe_vm *gpuva_to_vm(struct drm_gpuva *gpuva)
+{
+	return container_of(gpuva->mgr, struct xe_vm, mgr);
+}
+
+static inline struct xe_vma *gpuva_to_vma(struct drm_gpuva *gpuva)
+{
+	return container_of(gpuva, struct xe_vma, gpuva);
+}
+
+static inline struct xe_vma_op *gpuva_op_to_vma_op(struct drm_gpuva_op *op)
+{
+	return container_of(op, struct xe_vma_op, base);
+}
+
+/*
+ * Let's abstract start, size, end, bo_offset, vm, and bo as the underlying
+ * implementation may change
+ */
+static inline u64 xe_vma_start(struct xe_vma *vma)
+{
+	return vma->gpuva.va.addr;
+}
+
+static inline u64 xe_vma_size(struct xe_vma *vma)
+{
+	return vma->gpuva.va.range;
+}
+
+static inline u64 xe_vma_end(struct xe_vma *vma)
+{
+	return xe_vma_start(vma) + xe_vma_size(vma);
+}
+
+static inline u64 xe_vma_bo_offset(struct xe_vma *vma)
+{
+	return vma->gpuva.gem.offset;
+}
+
+static inline struct xe_bo *xe_vma_bo(struct xe_vma *vma)
+{
+	return !vma->gpuva.gem.obj ? NULL :
+		container_of(vma->gpuva.gem.obj, struct xe_bo, ttm.base);
+}
+
+static inline struct xe_vm *xe_vma_vm(struct xe_vma *vma)
+{
+	return container_of(vma->gpuva.mgr, struct xe_vm, mgr);
+}
+
+static inline bool xe_vma_read_only(struct xe_vma *vma)
+{
+	return vma->gpuva.flags & XE_VMA_READ_ONLY;
+}
+
+static inline u64 xe_vma_userptr(struct xe_vma *vma)
+{
+	return vma->gpuva.gem.offset;
+}
 
 #define xe_vm_assert_held(vm) dma_resv_assert_held(&(vm)->resv)
 
@@ -100,7 +160,7 @@ struct ttm_buffer_object *xe_vm_ttm_bo(struct xe_vm *vm);
 
 static inline bool xe_vma_is_userptr(struct xe_vma *vma)
 {
-	return !vma->bo;
+	return !xe_vma_bo(vma);
 }
 
 int xe_vma_userptr_pin_pages(struct xe_vma *vma);
diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
index 29815852985a..46d1b8d7b72f 100644
--- a/drivers/gpu/drm/xe/xe_vm_madvise.c
+++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
@@ -30,7 +30,7 @@ static int madvise_preferred_mem_class(struct xe_device *xe, struct xe_vm *vm,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 
 		err = xe_bo_lock(bo, &ww, 0, true);
 		if (err)
@@ -55,7 +55,7 @@ static int madvise_preferred_gt(struct xe_device *xe, struct xe_vm *vm,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 
 		err = xe_bo_lock(bo, &ww, 0, true);
 		if (err)
@@ -91,7 +91,7 @@ static int madvise_preferred_mem_class_gt(struct xe_device *xe,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 
 		err = xe_bo_lock(bo, &ww, 0, true);
 		if (err)
@@ -114,7 +114,7 @@ static int madvise_cpu_atomic(struct xe_device *xe, struct xe_vm *vm,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 		if (XE_IOCTL_ERR(xe, !(bo->flags & XE_BO_CREATE_SYSTEM_BIT)))
 			return -EINVAL;
 
@@ -145,7 +145,7 @@ static int madvise_device_atomic(struct xe_device *xe, struct xe_vm *vm,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 		if (XE_IOCTL_ERR(xe, !(bo->flags & XE_BO_CREATE_VRAM0_BIT) &&
 				 !(bo->flags & XE_BO_CREATE_VRAM1_BIT)))
 			return -EINVAL;
@@ -176,7 +176,7 @@ static int madvise_priority(struct xe_device *xe, struct xe_vm *vm,
 		struct xe_bo *bo;
 		struct ww_acquire_ctx ww;
 
-		bo = vmas[i]->bo;
+		bo = xe_vma_bo(vmas[i]);
 
 		err = xe_bo_lock(bo, &ww, 0, true);
 		if (err)
@@ -210,19 +210,12 @@ static const madvise_func madvise_funcs[] = {
 	[DRM_XE_VM_MADVISE_PIN] = madvise_pin,
 };
 
-static struct xe_vma *node_to_vma(const struct rb_node *node)
-{
-	BUILD_BUG_ON(offsetof(struct xe_vma, vm_node) != 0);
-	return (struct xe_vma *)node;
-}
-
 static struct xe_vma **
 get_vmas(struct xe_vm *vm, int *num_vmas, u64 addr, u64 range)
 {
-	struct xe_vma **vmas;
-	struct xe_vma *vma, *__vma, lookup;
+	struct xe_vma **vmas, **__vmas;
 	int max_vmas = 8;
-	struct rb_node *node;
+	DRM_GPUVA_ITER(it, &vm->mgr, addr);
 
 	lockdep_assert_held(&vm->lock);
 
@@ -230,64 +223,24 @@ get_vmas(struct xe_vm *vm, int *num_vmas, u64 addr, u64 range)
 	if (!vmas)
 		return NULL;
 
-	lookup.start = addr;
-	lookup.end = addr + range - 1;
+	drm_gpuva_iter_for_each_range(it, addr + range) {
+		struct xe_vma *vma = gpuva_to_vma(it.va);
 
-	vma = xe_vm_find_overlapping_vma(vm, &lookup);
-	if (!vma)
-		return vmas;
+		if (xe_vma_is_userptr(vma))
+			continue;
 
-	if (!xe_vma_is_userptr(vma)) {
+		if (*num_vmas == max_vmas) {
+			max_vmas <<= 1;
+			__vmas = krealloc(vmas, max_vmas * sizeof(*vmas),
+					  GFP_KERNEL);
+			if (!__vmas)
+				return NULL;
+			vmas = __vmas;
+		}
 		vmas[*num_vmas] = vma;
 		*num_vmas += 1;
 	}
 
-	node = &vma->vm_node;
-	while ((node = rb_next(node))) {
-		if (!xe_vma_cmp_vma_cb(&lookup, node)) {
-			__vma = node_to_vma(node);
-			if (xe_vma_is_userptr(__vma))
-				continue;
-
-			if (*num_vmas == max_vmas) {
-				struct xe_vma **__vmas =
-					krealloc(vmas, max_vmas * sizeof(*vmas),
-						 GFP_KERNEL);
-
-				if (!__vmas)
-					return NULL;
-				vmas = __vmas;
-			}
-			vmas[*num_vmas] = __vma;
-			*num_vmas += 1;
-		} else {
-			break;
-		}
-	}
-
-	node = &vma->vm_node;
-	while ((node = rb_prev(node))) {
-		if (!xe_vma_cmp_vma_cb(&lookup, node)) {
-			__vma = node_to_vma(node);
-			if (xe_vma_is_userptr(__vma))
-				continue;
-
-			if (*num_vmas == max_vmas) {
-				struct xe_vma **__vmas =
-					krealloc(vmas, max_vmas * sizeof(*vmas),
-						 GFP_KERNEL);
-
-				if (!__vmas)
-					return NULL;
-				vmas = __vmas;
-			}
-			vmas[*num_vmas] = __vma;
-			*num_vmas += 1;
-		} else {
-			break;
-		}
-	}
-
 	return vmas;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 2a3b911ab358..ae7f25233410 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -6,6 +6,8 @@
 #ifndef _XE_VM_TYPES_H_
 #define _XE_VM_TYPES_H_
 
+#include <drm/drm_gpuva_mgr.h>
+
 #include <linux/dma-resv.h>
 #include <linux/kref.h>
 #include <linux/mmu_notifier.h>
@@ -14,28 +16,23 @@
 #include "xe_device_types.h"
 #include "xe_pt_types.h"
 
+struct async_op_fence;
 struct xe_bo;
+struct xe_sync_entry;
 struct xe_vm;
 
-struct xe_vma {
-	struct rb_node vm_node;
-	/** @vm: VM which this VMA belongs to */
-	struct xe_vm *vm;
+#define TEST_VM_ASYNC_OPS_ERROR
+#define FORCE_ASYNC_OP_ERROR	BIT(31)
 
-	/**
-	 * @start: start address of this VMA within its address domain, end -
-	 * start + 1 == VMA size
-	 */
-	u64 start;
-	/** @end: end address of this VMA within its address domain */
-	u64 end;
-	/** @pte_flags: pte flags for this VMA */
-	u32 pte_flags;
+#define XE_VMA_READ_ONLY	DRM_GPUVA_USERBITS
+#define XE_VMA_DESTROYED	(DRM_GPUVA_USERBITS << 1)
+#define XE_VMA_ATOMIC_PTE_BIT	(DRM_GPUVA_USERBITS << 2)
+#define XE_VMA_FIRST_REBIND	(DRM_GPUVA_USERBITS << 3)
+#define XE_VMA_LAST_REBIND	(DRM_GPUVA_USERBITS << 4)
 
-	/** @bo: BO if not a userptr, must be NULL is userptr */
-	struct xe_bo *bo;
-	/** @bo_offset: offset into BO if not a userptr, unused for userptr */
-	u64 bo_offset;
+struct xe_vma {
+	/** @gpuva: Base GPUVA object */
+	struct drm_gpuva gpuva;
 
 	/** @gt_mask: GT mask of where to create binding for this VMA */
 	u64 gt_mask;
@@ -49,40 +46,8 @@ struct xe_vma {
 	 */
 	u64 gt_present;
 
-	/**
-	 * @destroyed: VMA is destroyed, in the sense that it shouldn't be
-	 * subject to rebind anymore. This field must be written under
-	 * the vm lock in write mode and the userptr.notifier_lock in
-	 * either mode. Read under the vm lock or the userptr.notifier_lock in
-	 * write mode.
-	 */
-	bool destroyed;
-
-	/**
-	 * @first_munmap_rebind: VMA is first in a sequence of ops that triggers
-	 * a rebind (munmap style VM unbinds). This indicates the operation
-	 * using this VMA must wait on all dma-resv slots (wait for pending jobs
-	 * / trigger preempt fences).
-	 */
-	bool first_munmap_rebind;
-
-	/**
-	 * @last_munmap_rebind: VMA is first in a sequence of ops that triggers
-	 * a rebind (munmap style VM unbinds). This indicates the operation
-	 * using this VMA must install itself into kernel dma-resv slot (blocks
-	 * future jobs) and kick the rebind work in compute mode.
-	 */
-	bool last_munmap_rebind;
-
-	/** @use_atomic_access_pte_bit: Set atomic access bit in PTE */
-	bool use_atomic_access_pte_bit;
-
-	union {
-		/** @bo_link: link into BO if not a userptr */
-		struct list_head bo_link;
-		/** @userptr_link: link into VM repin list if userptr */
-		struct list_head userptr_link;
-	};
+	/** @userptr_link: link into VM repin list if userptr */
+	struct list_head userptr_link;
 
 	/**
 	 * @rebind_link: link into VM if this VMA needs rebinding, and
@@ -105,8 +70,6 @@ struct xe_vma {
 
 	/** @userptr: user pointer state */
 	struct {
-		/** @ptr: user pointer */
-		uintptr_t ptr;
 		/** @invalidate_link: Link for the vm::userptr.invalidated list */
 		struct list_head invalidate_link;
 		/**
@@ -154,6 +117,9 @@ struct xe_device;
 #define xe_vm_assert_held(vm) dma_resv_assert_held(&(vm)->resv)
 
 struct xe_vm {
+	/** @mgr: base GPUVA used to track VMAs */
+	struct drm_gpuva_manager mgr;
+
 	struct xe_device *xe;
 
 	struct kref refcount;
@@ -165,7 +131,6 @@ struct xe_vm {
 	struct dma_resv resv;
 
 	u64 size;
-	struct rb_root vmas;
 
 	struct xe_pt *pt_root[XE_MAX_GT];
 	struct xe_bo *scratch_bo[XE_MAX_GT];
@@ -334,4 +299,96 @@ struct xe_vm {
 	} error_capture;
 };
 
+/** struct xe_vma_op_map - VMA map operation */
+struct xe_vma_op_map {
+	/** @vma: VMA to map */
+	struct xe_vma *vma;
+	/** @immediate: Immediate bind */
+	bool immediate;
+	/** @read_only: Read only */
+	bool read_only;
+};
+
+/** struct xe_vma_op_unmap - VMA unmap operation */
+struct xe_vma_op_unmap {
+	/** @start: start of the VMA unmap */
+	u64 start;
+	/** @range: range of the VMA unmap */
+	u64 range;
+};
+
+/** struct xe_vma_op_remap - VMA remap operation */
+struct xe_vma_op_remap {
+	/** @prev: VMA preceding part of a split mapping */
+	struct xe_vma *prev;
+	/** @next: VMA subsequent part of a split mapping */
+	struct xe_vma *next;
+	/** @start: start of the VMA unmap */
+	u64 start;
+	/** @range: range of the VMA unmap */
+	u64 range;
+	/** @unmap_done: unmap operation is done */
+	bool unmap_done;
+};
+
+/** struct xe_vma_op_prefetch - VMA prefetch operation */
+struct xe_vma_op_prefetch {
+	/** @region: memory region to prefetch to */
+	u32 region;
+};
+
+/** enum xe_vma_op_flags - flags for VMA operation */
+enum xe_vma_op_flags {
+	/** @XE_VMA_OP_FIRST: first VMA operation for a set of syncs */
+	XE_VMA_OP_FIRST		= (0x1 << 0),
+	/** @XE_VMA_OP_LAST: last VMA operation for a set of syncs */
+	XE_VMA_OP_LAST		= (0x1 << 1),
+};
+
+/** struct xe_vma_op - VMA operation */
+struct xe_vma_op {
+	/** @base: GPUVA base operation */
+	struct drm_gpuva_op base;
+	/**
+	 * @ops: GPUVA ops, when set call drm_gpuva_ops_free after this
+	 * operation is processed
+	 */
+	struct drm_gpuva_ops *ops;
+	/** @engine: engine for this operation */
+	struct xe_engine *engine;
+	/**
+	 * @syncs: syncs for this operation, only used on first and last
+	 * operation
+	 */
+	struct xe_sync_entry *syncs;
+	/** @num_syncs: number of syncs */
+	u32 num_syncs;
+	/** @link: async operation link */
+	struct list_head link;
+	/**
+	 * @fence: async operation fence, signaled on last operation complete
+	 */
+	struct async_op_fence *fence;
+	/** @gt_mask: gt mask for this operation */
+	u64 gt_mask;
+	/** @flags: operation flags */
+	enum xe_vma_op_flags flags;
+
+#ifdef TEST_VM_ASYNC_OPS_ERROR
+	/** @inject_error: inject error to test async op error handling */
+	bool inject_error;
+#endif
+
+	union {
+		/** @map: VMA map operation specific data */
+		struct xe_vma_op_map map;
+		/** @unmap: VMA unmap operation specific data */
+		struct xe_vma_op_unmap unmap;
+		/** @remap: VMA remap operation specific data */
+		struct xe_vma_op_remap remap;
+		/** @prefetch: VMA prefetch operation specific data */
+		struct xe_vma_op_prefetch prefetch;
+	};
+};
+
 #endif
-- 
2.34.1



* [Intel-xe] [PATCH 4/4] drm/xe: NULL binding implementation
  2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
                   ` (2 preceding siblings ...)
  2023-03-15 18:25 ` [Intel-xe] [PATCH 3/4] drm/xe: Port Xe to GPUVA Matthew Brost
@ 2023-03-15 18:25 ` Matthew Brost
  2023-03-15 18:27 ` [Intel-xe] ✓ CI.Patch_applied: success for Port Xe to use GPUVA and implement NULL VM binds Patchwork
  2023-03-15 18:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
  5 siblings, 0 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 18:25 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

Add uAPI and implementation for NULL bindings. A NULL binding is one in
which writes are dropped and reads return zero. A single bit has been
added to the uAPI, which results in a single bit being set in the PTEs.
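
As a rough illustration of the PTE side, the leaf entry for a NULL binding
could be encoded along the lines below. This is only a sketch: bit 9 for the
NULL bit comes from the xe_bo.h hunk in this patch, while the EX_* names and
the present / RW / page-size bit positions are assumptions made for this
example and are not taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_BIT_ULL(n)		(1ULL << (n))

/* Bit 9 matches GEN12_PTE_NULL as added by this patch. */
#define EX_PTE_NULL		EX_BIT_ULL(9)
/* Assumed bit positions, for illustration only. */
#define EX_PAGE_PRESENT		EX_BIT_ULL(0)
#define EX_PAGE_RW		EX_BIT_ULL(1)
#define EX_PS_HUGE		EX_BIT_ULL(7)	/* 2M at level 1, 1G at level 2 */

/*
 * Hypothetical helper: build a leaf entry for a NULL binding. There is no
 * DMA address to encode, so the entry consists of flag bits only.
 */
static uint64_t ex_null_pte_encode(unsigned int level, bool read_only)
{
	uint64_t pte = EX_PAGE_PRESENT | EX_PTE_NULL;

	if (!read_only)
		pte |= EX_PAGE_RW;
	if (level == 1 || level == 2)
		pte |= EX_PS_HUGE;

	return pte;
}
```

With these assumed bit positions, a writable 4K entry has only the present,
RW and NULL bits set.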

NULL bindings are intended to be used to implement VK sparse bindings.
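
The argument rules for the new flag (BO handle and BO offset must be zero,
only valid for MAP) can be sketched as a standalone check. The flag value is
the one added to xe_drm.h in this patch; the EX_* names, the operation values
and the "low 16 bits hold the operation" mask are assumptions for this
example, not the real uAPI definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as XE_VM_BIND_FLAG_NULL in the uAPI hunk of this patch. */
#define EX_VM_BIND_FLAG_NULL	(0x1 << 19)

/* Assumed for illustration: the low 16 bits select the operation. */
#define EX_VM_BIND_OP(op)	((op) & 0xffff)
#define EX_OP_MAP		0x0	/* hypothetical value */
#define EX_OP_UNMAP		0x1	/* hypothetical value */

/* Hypothetical helper: true if the NULL-related arguments are acceptable. */
static bool ex_null_bind_args_ok(uint32_t op, uint32_t obj, uint64_t obj_offset)
{
	bool null = op & EX_VM_BIND_FLAG_NULL;

	/* A NULL bind must not carry a BO handle or a BO offset. */
	if (null && (obj || obj_offset))
		return false;
	/* The NULL flag is only meaningful for MAP operations. */
	if (null && EX_VM_BIND_OP(op) != EX_OP_MAP)
		return false;
	/* A MAP without a BO is only valid when it is a NULL bind. */
	if (!obj && EX_VM_BIND_OP(op) == EX_OP_MAP && !null)
		return false;

	return true;
}
```

For example, a MAP with the NULL flag and no BO passes, while the same call
with a non-zero BO handle or with an UNMAP operation is rejected.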

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.h           |  1 +
 drivers/gpu/drm/xe/xe_exec.c         |  2 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c |  4 +-
 drivers/gpu/drm/xe/xe_pt.c           | 77 ++++++++++++++++-------
 drivers/gpu/drm/xe/xe_vm.c           | 92 ++++++++++++++++++----------
 drivers/gpu/drm/xe/xe_vm.h           | 10 +++
 drivers/gpu/drm/xe/xe_vm_madvise.c   |  2 +-
 drivers/gpu/drm/xe/xe_vm_types.h     |  3 +
 include/uapi/drm/xe_drm.h            |  8 +++
 9 files changed, 144 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index f841e74cd417..f4303810f213 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -54,6 +54,7 @@
 #define GEN8_PDE_IPS_64K		BIT_ULL(11)
 
 #define GEN12_GGTT_PTE_LM		BIT_ULL(1)
+#define GEN12_PTE_NULL			BIT_ULL(9)
 #define GEN12_USM_PPGTT_PTE_AE		BIT_ULL(10)
 #define GEN12_PPGTT_PTE_LM		BIT_ULL(11)
 #define GEN12_PDE_64K			BIT_ULL(6)
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index b798a11f168b..2b3a15623013 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -115,6 +115,8 @@ static int xe_exec_begin(struct xe_engine *e, struct ww_acquire_ctx *ww,
 	 * to a location where the GPU can access it).
 	 */
 	list_for_each_entry(vma, &vm->rebind_list, rebind_link) {
+		XE_BUG_ON(xe_vma_is_null(vma));
+
 		if (xe_vma_is_userptr(vma))
 			continue;
 
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index f7a066090a13..cfffe3398fe4 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -526,8 +526,8 @@ static int handle_acc(struct xe_gt *gt, struct acc *acc)
 
 	trace_xe_vma_acc(vma);
 
-	/* Userptr can't be migrated, nothing to do */
-	if (xe_vma_is_userptr(vma))
+	/* Userptr or null can't be migrated, nothing to do */
+	if (xe_vma_has_no_bo(vma))
 		goto unlock_vm;
 
 	/* Lock VM and BOs dma-resv */
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index d4f58ec8058e..b6e2fdb5f06c 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -82,7 +82,9 @@ u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
 static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
 			   size_t page_size, bool *is_vram)
 {
-	if (xe_vma_is_userptr(vma)) {
+	if (xe_vma_is_null(vma)) {
+		return 0;
+	} else if (xe_vma_is_userptr(vma)) {
 		struct xe_res_cursor cur;
 		u64 page;
 
@@ -563,6 +565,10 @@ static bool xe_pt_hugepte_possible(u64 addr, u64 next, unsigned int level,
 	if (next - xe_walk->va_curs_start > xe_walk->curs->size)
 		return false;
 
+	/* null VMAs do not have DMA addresses */
+	if (xe_walk->pte_flags & GEN12_PTE_NULL)
+		return true;
+
 	/* Is the DMA address huge PTE size aligned? */
 	size = next - addr;
 	dma = addr - xe_walk->va_curs_start + xe_res_dma(xe_walk->curs);
@@ -585,6 +591,10 @@ xe_pt_scan_64K(u64 addr, u64 next, struct xe_pt_stage_bind_walk *xe_walk)
 	if (next > xe_walk->l0_end_addr)
 		return false;
 
+	/* null VMAs do not have DMA addresses */
+	if (xe_walk->pte_flags & GEN12_PTE_NULL)
+		return true;
+
 	xe_res_next(&curs, addr - xe_walk->va_curs_start);
 	for (; addr < next; addr += SZ_64K) {
 		if (!IS_ALIGNED(xe_res_dma(&curs), SZ_64K) || curs.size < SZ_64K)
@@ -630,17 +640,34 @@ xe_pt_stage_bind_entry(struct drm_pt *parent, pgoff_t offset,
 	struct xe_pt *xe_child;
 	bool covers;
 	int ret = 0;
-	u64 pte;
+	u64 pte = 0;
 
 	/* Is this a leaf entry ?*/
 	if (level == 0 || xe_pt_hugepte_possible(addr, next, level, xe_walk)) {
 		struct xe_res_cursor *curs = xe_walk->curs;
+		bool null = xe_walk->pte_flags & GEN12_PTE_NULL;
 
 		XE_WARN_ON(xe_walk->va_curs_start != addr);
 
-		pte = __gen8_pte_encode(xe_res_dma(curs) + xe_walk->dma_offset,
-					xe_walk->cache, xe_walk->pte_flags,
-					level);
+		if (null) {
+			pte |= GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+			if (unlikely(xe_walk->pte_flags & PTE_READ_ONLY))
+				pte &= ~GEN8_PAGE_RW;
+
+			if (level == 1)
+				pte |= GEN8_PDE_PS_2M;
+			else if (level == 2)
+				pte |= GEN8_PDPE_PS_1G;
+
+			pte |= GEN12_PTE_NULL;
+		} else {
+			pte = __gen8_pte_encode(xe_res_dma(curs) +
+						xe_walk->dma_offset,
+						xe_walk->cache,
+						xe_walk->pte_flags,
+						level);
+		}
 		pte |= xe_walk->default_pte;
 
 		/*
@@ -658,7 +685,8 @@ xe_pt_stage_bind_entry(struct drm_pt *parent, pgoff_t offset,
 		if (unlikely(ret))
 			return ret;
 
-		xe_res_next(curs, next - addr);
+		if (!null)
+			xe_res_next(curs, next - addr);
 		xe_walk->va_curs_start = next;
 		*action = ACTION_CONTINUE;
 
@@ -751,7 +779,8 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 		.gt = gt,
 		.curs = &curs,
 		.va_curs_start = xe_vma_start(vma),
-		.pte_flags = xe_vma_read_only(vma) ? PTE_READ_ONLY : 0,
+		.pte_flags = (xe_vma_read_only(vma) ? PTE_READ_ONLY : 0) |
+			(xe_vma_is_null(vma) ? GEN12_PTE_NULL : 0),
 		.wupd.entries = entries,
 		.needs_64K = (xe_vma_vm(vma)->flags & XE_VM_FLAGS_64K) && is_vram,
 	};
@@ -766,23 +795,28 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 			gt_to_xe(gt)->mem.vram.io_start;
 		xe_walk.cache = XE_CACHE_WB;
 	} else {
-		if (!xe_vma_is_userptr(vma) && bo->flags & XE_BO_SCANOUT_BIT)
+		if (!xe_vma_has_no_bo(vma) && bo->flags & XE_BO_SCANOUT_BIT)
 			xe_walk.cache = XE_CACHE_WT;
 		else
 			xe_walk.cache = XE_CACHE_WB;
 	}
-	if (!xe_vma_is_userptr(vma) && xe_bo_is_stolen(bo))
+	if (!xe_vma_has_no_bo(vma) && xe_bo_is_stolen(bo))
 		xe_walk.dma_offset = xe_ttm_stolen_gpu_offset(xe_bo_device(bo));
 
 	xe_bo_assert_held(bo);
-	if (xe_vma_is_userptr(vma))
-		xe_res_first_sg(vma->userptr.sg, 0, xe_vma_size(vma), &curs);
-	else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
-		xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
-			     xe_vma_size(vma), &curs);
-	else
-		xe_res_first_sg(xe_bo_get_sg(bo), xe_vma_bo_offset(vma),
-				xe_vma_size(vma), &curs);
+	if (!xe_vma_is_null(vma)) {
+		if (xe_vma_is_userptr(vma))
+			xe_res_first_sg(vma->userptr.sg, 0, xe_vma_size(vma),
+					&curs);
+		else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
+			xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
+				     xe_vma_size(vma), &curs);
+		else
+			xe_res_first_sg(xe_bo_get_sg(bo), xe_vma_bo_offset(vma),
+					xe_vma_size(vma), &curs);
+	} else {
+		curs.size = xe_vma_size(vma);
+	}
 
 	ret = drm_pt_walk_range(&pt->drm, pt->level, xe_vma_start(vma),
 				xe_vma_end(vma), &xe_walk.drm);
@@ -976,7 +1010,7 @@ static void xe_pt_commit_locks_assert(struct xe_vma *vma)
 
 	if (xe_vma_is_userptr(vma))
 		lockdep_assert_held_read(&vm->userptr.notifier_lock);
-	else
+	else if (!xe_vma_is_null(vma))
 		dma_resv_assert_held(xe_vma_bo(vma)->ttm.base.resv);
 
 	dma_resv_assert_held(&vm->resv);
@@ -1280,7 +1314,8 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 	struct xe_vm_pgtable_update entries[XE_VM_MAX_LEVEL * 2 + 1];
 	struct xe_pt_migrate_pt_update bind_pt_update = {
 		.base = {
-			.ops = xe_vma_is_userptr(vma) ? &userptr_bind_ops : &bind_ops,
+			.ops = xe_vma_is_userptr(vma) ? &userptr_bind_ops :
+				&bind_ops,
 			.vma = vma,
 		},
 		.bind = true,
@@ -1345,7 +1380,7 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 				   DMA_RESV_USAGE_KERNEL :
 				   DMA_RESV_USAGE_BOOKKEEP);
 
-		if (!xe_vma_is_userptr(vma) && !xe_vma_bo(vma)->vm)
+		if (!xe_vma_has_no_bo(vma) && !xe_vma_bo(vma)->vm)
 			dma_resv_add_fence(xe_vma_bo(vma)->ttm.base.resv, fence,
 					   DMA_RESV_USAGE_BOOKKEEP);
 		xe_pt_commit_bind(vma, entries, num_entries, rebind,
@@ -1664,7 +1699,7 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 				   DMA_RESV_USAGE_BOOKKEEP);
 
 		/* This fence will be installed by caller when doing eviction */
-		if (!xe_vma_is_userptr(vma) && !xe_vma_bo(vma)->vm)
+		if (!xe_vma_has_no_bo(vma) && !xe_vma_bo(vma)->vm)
 			dma_resv_add_fence(xe_vma_bo(vma)->ttm.base.resv, fence,
 					   DMA_RESV_USAGE_BOOKKEEP);
 		xe_pt_commit_unbind(vma, entries, num_entries,
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index b312160f53ff..939528a5a17f 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -60,6 +60,7 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 
 	lockdep_assert_held(&vm->lock);
 	XE_BUG_ON(!xe_vma_is_userptr(vma));
+	XE_BUG_ON(xe_vma_is_null(vma));
 retry:
 	if (vma->gpuva.flags & XE_VMA_DESTROYED)
 		return 0;
@@ -563,7 +564,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
 		goto out_unlock;
 
 	list_for_each_entry(vma, &vm->rebind_list, rebind_link) {
-		if (xe_vma_is_userptr(vma) ||
+		if (xe_vma_has_no_bo(vma) ||
 		    vma->gpuva.flags & XE_VMA_DESTROYED)
 			continue;
 
@@ -795,7 +796,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 				    struct xe_bo *bo,
 				    u64 bo_offset_or_userptr,
 				    u64 start, u64 end,
-				    bool read_only,
+				    bool read_only, bool null,
 				    u64 gt_mask)
 {
 	struct xe_vma *vma;
@@ -825,6 +826,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 	vma->gpuva.va.range = end - start + 1;
 	if (read_only)
 		vma->gpuva.flags |= XE_VMA_READ_ONLY;
+	if (null)
+		vma->gpuva.flags |= XE_VMA_NULL;
 
 	if (gt_mask) {
 		vma->gt_mask = gt_mask;
@@ -844,23 +847,26 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 		vma->gpuva.gem.obj = &bo->ttm.base;
 		vma->gpuva.gem.offset = bo_offset_or_userptr;
 		drm_gpuva_link(&vma->gpuva);
-	} else /* userptr */ {
-		u64 size = end - start + 1;
-		int err;
-
-		vma->gpuva.gem.offset = bo_offset_or_userptr;
+	} else /* userptr or null */ {
+		if (!null) {
+			u64 size = end - start + 1;
+			int err;
+
+			vma->gpuva.gem.offset = bo_offset_or_userptr;
+			err = mmu_interval_notifier_insert(&vma->userptr.notifier,
+							   current->mm,
+							   xe_vma_userptr(vma),
+							   size,
+							   &vma_userptr_notifier_ops);
+			if (err) {
+				kfree(vma);
+				vma = ERR_PTR(err);
+				return vma;
+			}
 
-		err = mmu_interval_notifier_insert(&vma->userptr.notifier,
-						   current->mm,
-						   xe_vma_userptr(vma), size,
-						   &vma_userptr_notifier_ops);
-		if (err) {
-			kfree(vma);
-			vma = ERR_PTR(err);
-			return vma;
+			vma->userptr.notifier_seq = LONG_MAX;
 		}
 
-		vma->userptr.notifier_seq = LONG_MAX;
 		xe_vm_get(vm);
 	}
 
@@ -898,6 +904,8 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		 */
 		mmu_interval_notifier_remove(&vma->userptr.notifier);
 		xe_vm_put(vm);
+	} else if (xe_vma_is_null(vma)) {
+		xe_vm_put(vm);
 	} else {
 		xe_bo_put(xe_vma_bo(vma));
 	}
@@ -936,7 +944,7 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
 		list_del_init(&vma->userptr.invalidate_link);
 		spin_unlock(&vm->userptr.invalidated_lock);
 		list_del(&vma->userptr_link);
-	} else {
+	} else if (!xe_vma_is_null(vma)) {
 		xe_bo_assert_held(xe_vma_bo(vma));
 		drm_gpuva_unlink(&vma->gpuva);
 		if (!xe_vma_bo(vma)->vm)
@@ -1286,7 +1294,7 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	drm_gpuva_iter_for_each(it) {
 		vma = gpuva_to_vma(it.va);
 
-		if (xe_vma_is_userptr(vma)) {
+		if (xe_vma_has_no_bo(vma)) {
 			down_read(&vm->userptr.notifier_lock);
 			vma->gpuva.flags |= XE_VMA_DESTROYED;
 			up_read(&vm->userptr.notifier_lock);
@@ -1296,7 +1304,7 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 		drm_gpuva_iter_remove(&it);
 
 		/* easy case, remove from VMA? */
-		if (xe_vma_is_userptr(vma) || xe_vma_bo(vma)->vm) {
+		if (xe_vma_has_no_bo(vma) || xe_vma_bo(vma)->vm) {
 			xe_vma_destroy(vma, NULL);
 			continue;
 		}
@@ -1946,7 +1954,7 @@ static int xe_vm_prefetch(struct xe_vm *vm, struct xe_vma *vma,
 
 	XE_BUG_ON(region > ARRAY_SIZE(region_to_mem_type));
 
-	if (!xe_vma_is_userptr(vma)) {
+	if (!xe_vma_has_no_bo(vma)) {
 		err = xe_bo_migrate(xe_vma_bo(vma), region_to_mem_type[region]);
 		if (err)
 			return err;
@@ -2152,6 +2160,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 				operation & XE_VM_BIND_FLAG_IMMEDIATE;
 			op->map.read_only =
 				operation & XE_VM_BIND_FLAG_READONLY;
+			op->map.null = operation & XE_VM_BIND_FLAG_NULL;
 		}
 		break;
 	case XE_VM_BIND_OP_UNMAP:
@@ -2208,7 +2217,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 }
 
 static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
-			      u64 gt_mask, bool read_only)
+			      u64 gt_mask, bool read_only, bool null)
 {
 	struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL;
 	struct xe_vma *vma;
@@ -2224,7 +2233,7 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
 	}
 	vma = xe_vma_create(vm, bo, op->gem.offset,
 			    op->va.addr, op->va.addr +
-			    op->va.range - 1, read_only,
+			    op->va.range - 1, read_only, null,
 			    gt_mask);
 	if (bo)
 		xe_bo_unlock(bo, &ww);
@@ -2235,7 +2244,7 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
 			xe_vma_destroy(vma, NULL);
 			return ERR_PTR(err);
 		}
-	} else if(!bo->vm) {
+	} else if (!xe_vma_has_no_bo(vma) && !bo->vm) {
 		vm_insert_extobj(vm, vma);
 		err = add_preempt_fences(vm, bo);
 		if (err) {
@@ -2312,7 +2321,8 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_engine *e,
 				struct xe_vma *vma;
 
 				vma = new_vma(vm, &op->base.map,
-					      op->gt_mask, op->map.read_only);
+					      op->gt_mask, op->map.read_only,
+					      op->map.null);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -2327,9 +2337,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_engine *e,
 					bool read_only =
 						op->base.remap.unmap->va->flags &
 						XE_VMA_READ_ONLY;
+					bool null =
+						op->base.remap.unmap->va->flags &
+						XE_VMA_NULL;
 
 					vma = new_vma(vm, op->base.remap.prev,
-						      op->gt_mask, read_only);
+						      op->gt_mask, read_only,
+						      null);
 					if (IS_ERR(vma)) {
 						err = PTR_ERR(vma);
 						goto free_fence;
@@ -2344,8 +2358,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_engine *e,
 						op->base.remap.unmap->va->flags &
 						XE_VMA_READ_ONLY;
 
+					bool null =
+						op->base.remap.unmap->va->flags &
+						XE_VMA_NULL;
+
 					vma = new_vma(vm, op->base.remap.next,
-						      op->gt_mask, read_only);
+						      op->gt_mask, read_only,
+						      null);
 					if (IS_ERR(vma)) {
 						err = PTR_ERR(vma);
 						goto free_fence;
@@ -2791,11 +2810,12 @@ static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm,
 #ifdef TEST_VM_ASYNC_OPS_ERROR
 #define SUPPORTED_FLAGS	\
 	(FORCE_ASYNC_OP_ERROR | XE_VM_BIND_FLAG_ASYNC | \
-	 XE_VM_BIND_FLAG_READONLY | XE_VM_BIND_FLAG_IMMEDIATE | 0xffff)
+	 XE_VM_BIND_FLAG_READONLY | XE_VM_BIND_FLAG_IMMEDIATE | \
+	 XE_VM_BIND_FLAG_NULL | 0xffff)
 #else
 #define SUPPORTED_FLAGS	\
 	(XE_VM_BIND_FLAG_ASYNC | XE_VM_BIND_FLAG_READONLY | \
-	 XE_VM_BIND_FLAG_IMMEDIATE | 0xffff)
+	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | 0xffff)
 #endif
 #define XE_64K_PAGE_MASK 0xffffull
 
@@ -2841,6 +2861,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		u32 obj = (*bind_ops)[i].obj;
 		u64 obj_offset = (*bind_ops)[i].obj_offset;
 		u32 region = (*bind_ops)[i].region;
+		bool null = op & XE_VM_BIND_FLAG_NULL;
 
 		if (i == 0) {
 			*async = !!(op & XE_VM_BIND_FLAG_ASYNC);
@@ -2867,8 +2888,12 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		if (XE_IOCTL_ERR(xe, VM_BIND_OP(op) >
 				 XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_ERR(xe, op & ~SUPPORTED_FLAGS) ||
+		    XE_IOCTL_ERR(xe, obj && null) ||
+		    XE_IOCTL_ERR(xe, obj_offset && null) ||
+		    XE_IOCTL_ERR(xe, VM_BIND_OP(op) != XE_VM_BIND_OP_MAP &&
+				 null) ||
 		    XE_IOCTL_ERR(xe, !obj &&
-				 VM_BIND_OP(op) == XE_VM_BIND_OP_MAP) ||
+				 VM_BIND_OP(op) == XE_VM_BIND_OP_MAP && !null) ||
 		    XE_IOCTL_ERR(xe, !obj &&
 				 VM_BIND_OP(op) == XE_VM_BIND_OP_UNMAP_ALL) ||
 		    XE_IOCTL_ERR(xe, addr &&
@@ -3187,6 +3212,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 	int ret;
 
 	XE_BUG_ON(!xe_vm_in_fault_mode(xe_vma_vm(vma)));
+	XE_BUG_ON(xe_vma_is_null(vma));
 	trace_xe_vma_usm_invalidate(vma);
 
 	/* Check that we don't race with page-table updates */
@@ -3245,8 +3271,11 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 	drm_gpuva_for_each_va(gpuva, &vm->mgr) {
 		struct xe_vma *vma = gpuva_to_vma(vma);
 		bool is_userptr = xe_vma_is_userptr(vma);
+		bool null = xe_vma_is_null(vma);
 
-		if (is_userptr) {
+		if (null) {
+			addr = 0;
+		} else if (is_userptr) {
 			struct xe_res_cursor cur;
 
 			xe_res_first_sg(vma->userptr.sg, 0, GEN8_PAGE_SIZE, &cur);
@@ -3256,7 +3285,8 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 		}
 		drm_printf(p, " [%016llx-%016llx] S:0x%016llx A:%016llx %s\n",
 			   xe_vma_start(vma), xe_vma_end(vma), xe_vma_size(vma),
-			   addr, is_userptr ? "USR" : is_vram ? "VRAM" : "SYS");
+			   addr, null ? "NULL" :
+			   is_userptr ? "USR" : is_vram ? "VRAM" : "SYS");
 	}
 	up_read(&vm->lock);
 
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index ef665068bcf7..db30443e121d 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -158,7 +158,17 @@ extern struct ttm_device_funcs xe_ttm_funcs;
 
 struct ttm_buffer_object *xe_vm_ttm_bo(struct xe_vm *vm);
 
+static inline bool xe_vma_is_null(struct xe_vma *vma)
+{
+	return vma->gpuva.flags & XE_VMA_NULL;
+}
+
 static inline bool xe_vma_is_userptr(struct xe_vma *vma)
+{
+	return !xe_vma_bo(vma) && !xe_vma_is_null(vma);
+}
+
+static inline bool xe_vma_has_no_bo(struct xe_vma *vma)
 {
 	return !xe_vma_bo(vma);
 }
diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
index 46d1b8d7b72f..29c99136a57f 100644
--- a/drivers/gpu/drm/xe/xe_vm_madvise.c
+++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
@@ -226,7 +226,7 @@ get_vmas(struct xe_vm *vm, int *num_vmas, u64 addr, u64 range)
 	drm_gpuva_iter_for_each_range(it, addr + range) {
 		struct xe_vma *vma = gpuva_to_vma(it.va);
 
-		if (xe_vma_is_userptr(vma))
+		if (xe_vma_has_no_bo(vma))
 			continue;
 
 		if (*num_vmas == max_vmas) {
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index ae7f25233410..1d17dae726c9 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -29,6 +29,7 @@ struct xe_vm;
 #define XE_VMA_ATOMIC_PTE_BIT	(DRM_GPUVA_USERBITS << 2)
 #define XE_VMA_FIRST_REBIND	(DRM_GPUVA_USERBITS << 3)
 #define XE_VMA_LAST_REBIND	(DRM_GPUVA_USERBITS << 4)
+#define XE_VMA_NULL		(DRM_GPUVA_USERBITS << 5)
 
 struct xe_vma {
 	/** @gpuva: Base GPUVA object */
@@ -307,6 +308,8 @@ struct xe_vma_op_map {
 	bool immediate;
 	/** @read_only: Read only */
 	bool read_only;
+	/** @null: NULL (writes dropped, read zero) */
+	bool null;
 };
 
 /** struct xe_vma_op_unmap - VMA unmap operation */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 593b01ba5919..4bde10875d45 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -446,6 +446,14 @@ struct drm_xe_vm_bind_op {
 	 * than differing the MAP to the page fault handler.
 	 */
 #define XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 18)
+	/*
+	 * When the NULL flag is set, the page tables are set up with a special
+	 * bit which indicates writes are dropped and all reads return zero. The
+	 * NULL flag is only valid for XE_VM_BIND_OP_MAP operations, and both the
+	 * BO handle and the BO offset MBZ. This flag is intended to implement
+	 * VK sparse bindings.
+	 */
+#define XE_VM_BIND_FLAG_NULL		(0x1 << 19)
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
-- 
2.34.1



* [Intel-xe] ✓ CI.Patch_applied: success for Port Xe to use GPUVA and implement NULL VM binds
  2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
                   ` (3 preceding siblings ...)
  2023-03-15 18:25 ` [Intel-xe] [PATCH 4/4] drm/xe: NULL binding implementation Matthew Brost
@ 2023-03-15 18:27 ` Patchwork
  2023-03-15 18:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
  5 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2023-03-15 18:27 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-xe

== Series Details ==

Series: Port Xe to use GPUVA and implement NULL VM binds
URL   : https://patchwork.freedesktop.org/series/115217/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
commit 8f6c3eaf3f9daab25b31e80c8ba277877fd10547
Author:     Mauro Carvalho Chehab <mchehab@kernel.org>
AuthorDate: Fri Mar 10 09:13:39 2023 +0100
Commit:     Lucas De Marchi <lucas.demarchi@intel.com>
CommitDate: Wed Mar 15 10:28:56 2023 -0700

    drm/xe/xe_uc_fw: Use firmware files from standard locations
    
    The GuC/HuC firmware files used by Xe drivers are the same as
    used by i915. Use the already-known location to find those
    firmware files, for a couple of reasons:
    
    1. Avoid having the same firmware placed on two different
       places on MODULE_FIRMWARE(), if both 915 and xe drivers
       are compiled;
    
    2. Having firmware files located on different locations may end
       creating bigger initramfs, as the same files will be copied
       twice my mkinitrd/dracut/...;
    
    3. this is the place where those firmware files are located at
       https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
       Upstream doesn't expect them to have on other places;
    
    4. When built with display support, DMC firmware will be
       loaded from i915/ directory. It is very confusing to have
       some firmware files on a different place for the same driver.
    
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: Lucas de Marchi <lucas.demarchi@intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: David Airlie <airlied@gmail.com>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
    [ Mostly agree with the direction of "use the firmware blobs from
      upstream at their current location for these platforms". Previous
      directory was not wrong as the plan was to have it handled in the
      upstream firmware repo. For future platforms the location can be
      changed if the support is only in xe ]
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Link: https://lore.kernel.org/r/20230310081338.3275583-1-mauro.chehab@linux.intel.com
=== git am output follows ===
Applying: maple_tree: split up MA_STATE() macro
Applying: drm: manager to keep track of GPUs VA mappings
Applying: drm/xe: Port Xe to GPUVA
Applying: drm/xe: NULL binding implementation




* [Intel-xe] ✗ CI.KUnit: failure for Port Xe to use GPUVA and implement NULL VM binds
  2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
                   ` (4 preceding siblings ...)
  2023-03-15 18:27 ` [Intel-xe] ✓ CI.Patch_applied: success for Port Xe to use GPUVA and implement NULL VM binds Patchwork
@ 2023-03-15 18:28 ` Patchwork
  5 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2023-03-15 18:28 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-xe

== Series Details ==

Series: Port Xe to use GPUVA and implement NULL VM binds
URL   : https://patchwork.freedesktop.org/series/115217/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../drivers/gpu/drm/xe/xe_vm.c: In function ‘xe_vma_op_work_func’:
../drivers/gpu/drm/xe/xe_vm.c:2722:4: error: label at end of compound statement
 2722 |    default:
      |    ^~~~~~~
make[6]: *** [../scripts/Makefile.build:250: drivers/gpu/drm/xe/xe_vm.o] Error 1
make[6]: *** Waiting for unfinished jobs....
make[5]: *** [../scripts/Makefile.build:500: drivers/gpu/drm/xe] Error 2
make[4]: *** [../scripts/Makefile.build:500: drivers/gpu/drm] Error 2
make[3]: *** [../scripts/Makefile.build:500: drivers/gpu] Error 2
make[2]: *** [../scripts/Makefile.build:500: drivers] Error 2
make[1]: *** [/kernel/Makefile:1992: .] Error 2
make: *** [Makefile:231: __sub-make] Error 2

[18:27:59] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[18:28:03] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel




* [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds
@ 2023-03-15 23:14 Matthew Brost
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 23:14 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

GPUVA is common code written primarily by Danilo. The idea is to provide
a common place to track GPUVAs (VMAs in Xe) within an address space (VMs
in Xe), to track all the GPUVAs attached to GEMs, and to offer a common
way to implement VM binds / unbinds with MMAP / MUNMAP semantics by
creating operation lists. All of this adds up to a common way to
implement VK sparse bindings.

This series pulls in the GPUVA code written by Danilo, plus some small
fixes of my own, as one large patch. Once GPUVA makes it upstream, we
can rebase and drop this patch. I believe what lands upstream should be
nearly identical to this patch, at least from an API perspective.

The last two patches port Xe to GPUVA and add support for NULL VM binds
(writes dropped, read zero, VK sparse support). An example of the
semantics of this is below.

MAP 0x0000-0x8000 to NULL   - 0x0000-0x8000 writes dropped + read zero
MAP 0x4000-0x5000 to a GEM  - 0x0000-0x4000, 0x5000-0x8000 writes dropped + read zero; 0x4000-0x5000 mapped to a GEM
UNMAP 0x3000-0x6000         - 0x0000-0x3000, 0x6000-0x8000 writes dropped + read zero
UNMAP 0x0000-0x8000         - Nothing mapped
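
The sequence above can be replayed with a toy page-granular model. This is
only a sketch of the resulting address-space state, assuming 4K pages; it
does not use the GPUVA manager or any Xe code, and every ex_* / EX_* name is
made up for this example.

```c
#include <stdint.h>

#define EX_PAGE_SIZE	0x1000u
#define EX_NUM_PAGES	8	/* covers 0x0000-0x8000 */

enum ex_state { EX_UNMAPPED, EX_NULL, EX_GEM };

static enum ex_state ex_pages[EX_NUM_PAGES];

/* Set every page in [start, end) to the given state. */
static void ex_set_range(uint32_t start, uint32_t end, enum ex_state state)
{
	for (uint32_t addr = start; addr < end; addr += EX_PAGE_SIZE)
		ex_pages[addr / EX_PAGE_SIZE] = state;
}

/* Look up the state of the page containing addr (addr < 0x8000). */
static enum ex_state ex_lookup(uint32_t addr)
{
	return ex_pages[addr / EX_PAGE_SIZE];
}

/* Reset the model, then replay the first 'steps' operations of the example. */
static void ex_replay(int steps)
{
	ex_set_range(0x0000, 0x8000, EX_UNMAPPED);

	if (steps >= 1)		/* MAP 0x0000-0x8000 to NULL */
		ex_set_range(0x0000, 0x8000, EX_NULL);
	if (steps >= 2)		/* MAP 0x4000-0x5000 to a GEM */
		ex_set_range(0x4000, 0x5000, EX_GEM);
	if (steps >= 3)		/* UNMAP 0x3000-0x6000 */
		ex_set_range(0x3000, 0x6000, EX_UNMAPPED);
	if (steps >= 4)		/* UNMAP 0x0000-0x8000 */
		ex_set_range(0x0000, 0x8000, EX_UNMAPPED);
}
```

In the real implementation the split of the NULL mapping around the GEM
mapping is expressed as a remap operation with prev / next VMAs rather than
as per-page state; the model only shows what each address ends up bound to.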

No changes to existing behavior, just new functionality.

A follow-up will optimize the REBIND operation to avoid using dma-resv
slots for ordering (partial unbinds when not changing page sizes) and
prune the xe_vma object data members.

v2: Fix CI build failure
 
Signed-off-by: Matthew Brost <matthew.brost@intel.com>

Danilo Krummrich (1):
  maple_tree: split up MA_STATE() macro

Matthew Brost (2):
  drm/xe: Port Xe to GPUVA
  drm/xe: NULL binding implementation

Danilo Krummrich (1):
  drm: manager to keep track of GPUs VA mappings

 Documentation/gpu/drm-mm.rst                |   31 +
 drivers/gpu/drm/Makefile                    |    1 +
 drivers/gpu/drm/drm_debugfs.c               |   56 +
 drivers/gpu/drm/drm_gem.c                   |    3 +
 drivers/gpu/drm/drm_gpuva_mgr.c             | 1891 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_bo.c                  |   10 +-
 drivers/gpu/drm/xe/xe_bo.h                  |    1 +
 drivers/gpu/drm/xe/xe_device.c              |    2 +-
 drivers/gpu/drm/xe/xe_exec.c                |    4 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   27 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   14 +-
 drivers/gpu/drm/xe/xe_guc_ct.c              |    6 +-
 drivers/gpu/drm/xe/xe_migrate.c             |    5 +-
 drivers/gpu/drm/xe/xe_pt.c                  |  166 +-
 drivers/gpu/drm/xe/xe_trace.h               |   10 +-
 drivers/gpu/drm/xe/xe_vm.c                  | 1873 +++++++++---------
 drivers/gpu/drm/xe/xe_vm.h                  |   76 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c          |   87 +-
 drivers/gpu/drm/xe/xe_vm_types.h            |  168 +-
 include/drm/drm_debugfs.h                   |   23 +
 include/drm/drm_drv.h                       |    7 +
 include/drm/drm_gem.h                       |   75 +
 include/drm/drm_gpuva_mgr.h                 |  735 +++++++
 include/linux/maple_tree.h                  |    7 +-
 include/uapi/drm/xe_drm.h                   |    8 +
 25 files changed, 4073 insertions(+), 1213 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 include/drm/drm_gpuva_mgr.h

-- 
2.34.1



* [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds
@ 2023-03-15 18:35 Matthew Brost
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Brost @ 2023-03-15 18:35 UTC (permalink / raw)
  To: intel-xe; +Cc: paulo.r.zanoni, lionel.g.landwerlin, dakr

GPUVA is common code, written primarily by Danilo, intended as a common
place to track GPUVAs (VMAs in Xe) within an address space (VMs in Xe),
to track all the GPUVAs attached to GEMs, and to implement VM binds /
unbinds with MMAP / MUNMAP semantics by creating operation lists. All of
this adds up to a common way to implement VK sparse bindings.

This series pulls in the GPUVA code written by Danilo, plus some small
fixes of my own, as one large patch. Once GPUVA makes it upstream, we
can rebase and drop this patch. I believe what lands upstream should be
nearly identical to this patch, at least from an API perspective.

The last two patches port Xe to GPUVA and add support for NULL VM binds
(writes are dropped and reads return zero, as needed for VK sparse
support). An example of the semantics is below.

MAP 0x0000-0x8000 to NULL   - 0x0000-0x8000 writes dropped + read zero
MAP 0x4000-0x5000 to a GEM  - 0x0000-0x4000, 0x5000-0x8000 writes dropped + read zero;
                              0x4000-0x5000 mapped to a GEM
UNMAP 0x3000-0x6000         - 0x0000-0x3000, 0x6000-0x8000 writes dropped + read zero
UNMAP 0x0000-0x8000         - Nothing mapped

No changes to existing behavior; this only adds new functionality.

A follow-up will optimize the REBIND operation to avoid using dma-resv
slots for ordering (partial unbinds when not changing page sizes) and
prune the xe_vma object data members.

v2: Fix CI build failure
 
Signed-off-by: Matthew Brost <matthew.brost@intel.com>

Danilo Krummrich (1):
  maple_tree: split up MA_STATE() macro

Matthew Brost (2):
  drm/xe: Port Xe to GPUVA
  drm/xe: NULL binding implementation

Danilo Krummrich (1):
  drm: manager to keep track of GPUs VA mappings

 Documentation/gpu/drm-mm.rst                |   31 +
 drivers/gpu/drm/Makefile                    |    1 +
 drivers/gpu/drm/drm_debugfs.c               |   56 +
 drivers/gpu/drm/drm_gem.c                   |    3 +
 drivers/gpu/drm/drm_gpuva_mgr.c             | 1891 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_bo.c                  |   10 +-
 drivers/gpu/drm/xe/xe_bo.h                  |    1 +
 drivers/gpu/drm/xe/xe_device.c              |    2 +-
 drivers/gpu/drm/xe/xe_exec.c                |    4 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   27 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   14 +-
 drivers/gpu/drm/xe/xe_guc_ct.c              |    6 +-
 drivers/gpu/drm/xe/xe_migrate.c             |    5 +-
 drivers/gpu/drm/xe/xe_pt.c                  |  166 +-
 drivers/gpu/drm/xe/xe_trace.h               |   10 +-
 drivers/gpu/drm/xe/xe_vm.c                  | 1873 +++++++++---------
 drivers/gpu/drm/xe/xe_vm.h                  |   76 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c          |   87 +-
 drivers/gpu/drm/xe/xe_vm_types.h            |  168 +-
 include/drm/drm_debugfs.h                   |   23 +
 include/drm/drm_drv.h                       |    7 +
 include/drm/drm_gem.h                       |   75 +
 include/drm/drm_gpuva_mgr.h                 |  735 +++++++
 include/linux/maple_tree.h                  |    7 +-
 include/uapi/drm/xe_drm.h                   |    8 +
 25 files changed, 4073 insertions(+), 1213 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 include/drm/drm_gpuva_mgr.h

-- 
2.34.1




Thread overview: 9+ messages
2023-03-15 18:25 [Intel-xe] [PATCH 0/4] Port Xe to use GPUVA and implement NULL VM binds Matthew Brost
2023-03-15 18:25 ` [Intel-xe] [PATCH 1/4] maple_tree: split up MA_STATE() macro Matthew Brost
2023-03-15 18:25 ` [Intel-xe] [PATCH 2/4] drm: manager to keep track of GPUs VA mappings Matthew Brost
2023-03-15 18:25 ` [Intel-xe] [PATCH 3/4] drm/xe: Port Xe to GPUVA Matthew Brost
2023-03-15 18:25 ` [Intel-xe] [PATCH 4/4] drm/xe: NULL binding implementation Matthew Brost
2023-03-15 18:27 ` [Intel-xe] ✓ CI.Patch_applied: success for Port Xe to use GPUVA and implement NULL VM binds Patchwork
2023-03-15 18:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
2023-03-15 18:35 [Intel-xe] [PATCH 0/4] " Matthew Brost
2023-03-15 23:14 Matthew Brost
