* [RFC 00/13] drm/i915/svm: Add SVM support
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Shared Virtual Memory (SVM) allows the programmer to use a single virtual
address space that is shared between threads executing on CPUs and GPUs.
It abstracts away from the user the location of the backing memory, and hence
simplifies the user programming model.
SVM supports two types of virtual memory allocation methods.
The runtime allocator requires the driver to provide a memory allocation and
management interface, such as a buffer object (BO) interface, whereas the
system allocator makes use of the default OS memory allocation and management
support, such as malloc().
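
To illustrate the system allocator path: once SVM is enabled on a VM (using
the UAPI added later in this series), regular heap allocations become
GPU-visible. A minimal sketch, with 'fd', 'vm_id' and 'size' as placeholders
and assuming the usual <sys/ioctl.h> plus the updated i915_drm.h:

	/* Enable SVM on this VM. */
	struct drm_i915_gem_vm_param vp = {
		.vm_id = vm_id,
		.param = I915_VM_PARAM | I915_GEM_VM_PARAM_SVM,
		.value = 1,
	};
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_SETPARAM, &vp);

	/* Ordinary malloc'ed memory is now usable by the GPU as well. */
	void *buf = malloc(size);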

This patch series adds both SVM system and runtime allocator support
to i915 driver.

The patch series includes:
 - SVM support for both system and runtime allocation.
 - Plugging device memory into the Linux kernel.
 - User API advertising the SVM capability and allowing configuration by the
   user on a per-VM basis.
 - User API to bind an address range or a BO to a device page table.
 - User API to migrate an address range to device memory.
 - Implicit migration by moving pages or BOs back from device to host
   memory upon CPU access.
 - CPU copy and blitter copy support for migrating the pages/BOs.
 - Large page mapping support.
 - Page table dump support.

References:
HMM use cases in the Linux kernel:
https://www.kernel.org/doc/Documentation/vm/hmm.rst

Niranjana Vishwanathapura (12):
  drm/i915/svm: Add SVM documentation
  drm/i915/svm: Define SVM UAPI
  drm/i915/svm: Runtime (RT) allocator support
  drm/i915/svm: Page table update support for SVM
  drm/i915/svm: Page table mirroring support
  drm/i915/svm: Device memory support
  drm/i915/svm: Implicitly migrate pages upon CPU fault
  drm/i915/svm: Page copy support during migration
  drm/i915/svm: Add functions to blitter copy SVM buffers
  drm/i915/svm: Use blitter copy for migration
  drm/i915/svm: Add support to en/disable SVM
  drm/i915/svm: Add page table dump support

Venkata Sandeep Dhanalakota (1):
  drm/i915/svm: Implicitly migrate BOs upon CPU access

 Documentation/gpu/i915.rst                    |  29 +
 drivers/gpu/drm/i915/Kconfig                  |  23 +
 drivers/gpu/drm/i915/Kconfig.debug            |  14 +
 drivers/gpu/drm/i915/Makefile                 |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  95 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  65 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  10 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  43 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_svm.c       |  51 ++
 drivers/gpu/drm/i915/gem/i915_gem_svm.h       |  22 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |   2 +-
 drivers/gpu/drm/i915/i915_buddy.h             |  12 +
 drivers/gpu/drm/i915/i915_drv.c               |  30 +-
 drivers/gpu/drm/i915/i915_drv.h               |  32 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 158 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  41 +
 drivers/gpu/drm/i915/i915_getparam.c          |   3 +
 drivers/gpu/drm/i915/i915_svm.c               | 298 +++++++
 drivers/gpu/drm/i915/i915_svm.h               |  71 ++
 drivers/gpu/drm/i915/i915_svm_copy.c          | 172 ++++
 drivers/gpu/drm/i915/i915_svm_devmem.c        | 772 ++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.c    |   4 -
 drivers/gpu/drm/i915/intel_memory_region.h    |  18 +
 drivers/gpu/drm/i915/intel_region_lmem.c      |  10 +
 include/uapi/drm/i915_drm.h                   |  70 ++
 28 files changed, 2024 insertions(+), 36 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h
 create mode 100644 drivers/gpu/drm/i915/i915_svm.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm.h
 create mode 100644 drivers/gpu/drm/i915/i915_svm_copy.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c

-- 
2.21.0.rc0.32.g243a4c7e27

* [RFC 01/13] drm/i915/svm: Add SVM documentation
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Add Shared Virtual Memory (SVM) support information.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/i915.rst | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index d0947c5c4ab8..592fc26d2ca0 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -415,6 +415,35 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
    :doc: buffer object tiling
 
+Shared Virtual Memory (SVM)
+---------------------------
+
+Shared Virtual Memory (SVM) allows the programmer to use a single virtual
+address space that is shared between threads executing on CPUs and GPUs.
+It abstracts away from the user the location of the backing memory, and hence
+simplifies the user programming model.
+SVM supports two types of virtual memory allocation methods.
+The runtime allocator requires the driver to provide a memory allocation and
+management interface, such as a buffer object (BO) interface, whereas the
+system allocator makes use of the default OS memory allocation and management
+support, such as malloc().
+
+The Linux kernel has a Heterogeneous Memory Management (HMM) framework to
+support the SVM system allocator. HMM's address space mirroring support allows
+sharing of the address space by duplicating sections of the CPU page tables in
+the device page tables. This enables both the CPU and the GPU to access a
+physical memory location using the same virtual address. The Linux kernel also
+provides the ability to plug device memory into the system (as a special
+ZONE_DEVICE type) and allocates a struct page for each device memory page. It
+also provides a mechanism to migrate pages from host to device memory and vice
+versa. More information on HMM can be found here:
+https://www.kernel.org/doc/Documentation/vm/hmm.rst
+
+i915 supports both the SVM system and runtime allocators. As PCIe is a
+non-coherent bus, i915 plugs in device memory as DEVICE_PRIVATE, and no memory
+access across the PCIe link is allowed. Any such access will trigger migration
+of the pages or BOs before the memory is accessed.
+
 Microcontrollers
 ================
 
-- 
2.21.0.rc0.32.g243a4c7e27

* [RFC 02/13] drm/i915/svm: Define SVM UAPI
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Define the UAPI for Shared Virtual Memory (SVM) functionality, including
SVM capability and configurability.
For platforms without GPU page fault support, add ioctls to explicitly
bind and migrate virtual address ranges.
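
A rough usage sketch of the new interfaces (illustrative only; 'fd', 'va',
'len' and 'region' are placeholders for values the application provides):

	/* Query whether the driver/hardware supports SVM. */
	int has_svm = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SVM,
		.value = &has_svm,
	};
	ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);

	/* Explicitly migrate a virtual address range to a memory region. */
	struct drm_i915_svm_migrate migrate = {
		.start = (__u64)va,
		.length = len,
		.region = region,
	};
	ioctl(fd, DRM_IOCTL_I915_SVM_MIGRATE, &migrate);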

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 include/uapi/drm/i915_drm.h | 70 +++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index b127a99da1c1..33b3127fc3ae 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -360,6 +360,10 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY		0x3b
 #define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
+#define DRM_I915_GEM_VM_GETPARAM        DRM_I915_GEM_CONTEXT_GETPARAM
+#define DRM_I915_GEM_VM_SETPARAM        DRM_I915_GEM_CONTEXT_SETPARAM
+#define DRM_I915_BIND			0x3c
+#define DRM_I915_SVM_MIGRATE		0x3d
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -423,6 +427,10 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
+#define DRM_IOCTL_I915_GEM_VM_GETPARAM		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_GETPARAM, struct drm_i915_gem_vm_param)
+#define DRM_IOCTL_I915_GEM_VM_SETPARAM		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_SETPARAM, struct drm_i915_gem_vm_param)
+#define DRM_IOCTL_I915_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_BIND, struct drm_i915_bind)
+#define DRM_IOCTL_I915_SVM_MIGRATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_SVM_MIGRATE, struct drm_i915_svm_migrate)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -620,6 +628,9 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_PERF_REVISION	54
 
+/* Shared Virtual Memory (SVM) support capability */
+#define I915_PARAM_HAS_SVM		55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1815,6 +1826,17 @@ struct drm_i915_gem_vm_control {
 	__u32 vm_id;
 };
 
+struct drm_i915_gem_vm_param {
+	__u32 vm_id;
+	__u32 rsvd;
+
+#define I915_VM_PARAM     (2ull << 32)
+#define I915_GEM_VM_PARAM_SVM   0x1
+	__u64 param;
+
+	__u64 value;
+};
+
 struct drm_i915_reg_read {
 	/*
 	 * Register offset.
@@ -2268,6 +2290,54 @@ struct drm_i915_query_perf_config {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_bind
+ *
+ * Bind an object/buffer in a vm's page table.
+ */
+struct drm_i915_bind {
+	/** VA start to bind */
+	__u64 start;
+
+	/**
+	 * VA length to [un]bind.
+	 * Length is only required when binding buffers.
+	 */
+	__u64 length;
+
+	/** Type of memory to [un]bind */
+	__u32 type;
+#define I915_BIND_SVM_BUFFER       0
+#define I915_BIND_SVM_GEM_OBJ      1
+
+	/** Object handle to [un]bind for I915_BIND_SVM_GEM_OBJ type */
+	__u32 handle;
+
+	/** vm to [un]bind */
+	__u32 vm_id;
+
+	/** Flags */
+	__u32 flags;
+#define I915_BIND_UNBIND      (1 << 0)
+#define I915_BIND_READONLY    (1 << 1)
+};
+
+/**
+ * struct drm_i915_svm_migrate
+ *
+ * Migrate an address range to a memory region.
+ */
+struct drm_i915_svm_migrate {
+	/** VA start to migrate */
+	__u64 start;
+
+	/** VA length to migrate */
+	__u64 length;
+
+	/** Memory region to migrate to */
+	__u32 region;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.21.0.rc0.32.g243a4c7e27

* [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Shared Virtual Memory (SVM) runtime allocator support allows
binding a shared virtual address to a buffer object (BO) in the
device page table through an ioctl call.
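
For example (illustrative only; 'fd', 'bo_handle', 'vm_id' and 'va' are
placeholders), binding a BO at a given virtual address and later unbinding
it looks roughly like:

	struct drm_i915_bind bind = {
		.start = (__u64)va,		/* VA to bind the BO at */
		.type = I915_BIND_SVM_GEM_OBJ,
		.handle = bo_handle,
		.vm_id = vm_id,
	};
	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);	/* bind */

	bind.flags = I915_BIND_UNBIND;
	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);	/* unbind */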

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig                  | 11 ++++
 drivers/gpu/drm/i915/Makefile                 |  3 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 58 +++++++++++++++----
 drivers/gpu/drm/i915/gem/i915_gem_svm.c       | 51 ++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_svm.h       | 22 +++++++
 drivers/gpu/drm/i915/i915_drv.c               | 22 +++++++
 drivers/gpu/drm/i915/i915_drv.h               | 22 +++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.h           | 13 +++++
 9 files changed, 192 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index ba9595960bbe..c2e48710eec8 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -137,6 +137,17 @@ config DRM_I915_GVT_KVMGT
 	  Choose this option if you want to enable KVMGT support for
 	  Intel GVT-g.
 
+config DRM_I915_SVM
+	bool "Enable Shared Virtual Memory support in i915"
+	depends on STAGING
+	depends on DRM_I915
+	default n
+	help
+	  Choose this option if you want Shared Virtual Memory (SVM)
+	  support in i915. With SVM support, one can share the virtual
+	  address space between a process and the GPU. SVM is supported
+	  on both integrated and discrete Intel GPUs.
+
 menu "drm/i915 Debugging"
 depends on DRM_I915
 depends on EXPERT
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e0fd10c0cfb8..75fe45633779 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -153,6 +153,9 @@ i915-y += \
 	  intel_region_lmem.o \
 	  intel_wopcm.o
 
+# SVM code
+i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o
+
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
 i915-y += gt/uc/intel_uc.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 7a87e8270460..9d43ae6d643a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2864,10 +2864,14 @@ int
 i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file)
 {
+	struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user;
 	struct drm_i915_gem_execbuffer2 *args = data;
-	struct drm_i915_gem_exec_object2 *exec2_list;
-	struct drm_syncobj **fences = NULL;
 	const size_t count = args->buffer_count;
+	struct drm_syncobj **fences = NULL;
+	unsigned int i = 0, svm_count = 0;
+	struct i915_address_space *vm;
+	struct i915_gem_context *ctx;
+	struct i915_svm_obj *svm_obj;
 	int err;
 
 	if (!check_buffer_count(count)) {
@@ -2878,15 +2882,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 	if (!i915_gem_check_execbuffer(args))
 		return -EINVAL;
 
+	ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1);
+	if (!ctx || !rcu_access_pointer(ctx->vm))
+		return -ENOENT;
+
+	rcu_read_lock();
+	vm = i915_vm_get(ctx->vm);
+	rcu_read_unlock();
+
+alloc_again:
+	svm_count = vm->svm_count;
 	/* Allocate an extra slot for use by the command parser */
-	exec2_list = kvmalloc_array(count + 1, eb_element_size(),
+	exec2_list = kvmalloc_array(count + svm_count + 1, eb_element_size(),
 				    __GFP_NOWARN | GFP_KERNEL);
 	if (exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %zd buffers\n",
-			  count);
+			  count + svm_count);
 		return -ENOMEM;
 	}
-	if (copy_from_user(exec2_list,
+	mutex_lock(&vm->mutex);
+	if (svm_count != vm->svm_count) {
+		mutex_unlock(&vm->mutex);
+		kvfree(exec2_list);
+		goto alloc_again;
+	}
+
+	list_for_each_entry(svm_obj, &vm->svm_list, link) {
+		memset(&exec2_list[i], 0, sizeof(*exec2_list));
+		exec2_list[i].handle = svm_obj->handle;
+		exec2_list[i].offset = svm_obj->offset;
+		exec2_list[i].flags = EXEC_OBJECT_PINNED |
+				      EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		i++;
+	}
+	exec2_list_user = &exec2_list[i];
+	args->buffer_count += svm_count;
+	mutex_unlock(&vm->mutex);
+	i915_vm_put(vm);
+	i915_gem_context_put(ctx);
+
+	if (copy_from_user(exec2_list_user,
 			   u64_to_user_ptr(args->buffers_ptr),
 			   sizeof(*exec2_list) * count)) {
 		DRM_DEBUG("copy %zd exec entries failed\n", count);
@@ -2903,6 +2938,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 	}
 
 	err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
+	args->buffer_count -= svm_count;
 
 	/*
 	 * Now that we have begun execution of the batchbuffer, we ignore
@@ -2913,7 +2949,6 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & __EXEC_HAS_RELOC) {
 		struct drm_i915_gem_exec_object2 __user *user_exec_list =
 			u64_to_user_ptr(args->buffers_ptr);
-		unsigned int i;
 
 		/* Copy the new buffer offsets back to the user's exec list. */
 		/*
@@ -2927,13 +2962,14 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			goto end;
 
 		for (i = 0; i < args->buffer_count; i++) {
-			if (!(exec2_list[i].offset & UPDATE))
+			u64 *offset = &exec2_list_user[i].offset;
+
+			if (!(*offset & UPDATE))
 				continue;
 
-			exec2_list[i].offset =
-				gen8_canonical_addr(exec2_list[i].offset & PIN_OFFSET_MASK);
-			unsafe_put_user(exec2_list[i].offset,
-					&user_exec_list[i].offset,
+			*offset = gen8_canonical_addr(*offset &
+						      PIN_OFFSET_MASK);
+			unsafe_put_user(*offset, &user_exec_list[i].offset,
 					end_user);
 		}
 end_user:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.c b/drivers/gpu/drm/i915/gem/i915_gem_svm.c
new file mode 100644
index 000000000000..973070056726
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+
+int i915_gem_svm_bind(struct i915_address_space *vm,
+		      struct drm_i915_bind *args,
+		      struct drm_file *file)
+{
+	struct i915_svm_obj *svm_obj, *tmp;
+	struct drm_i915_gem_object *obj;
+	int ret = 0;
+
+	obj = i915_gem_object_lookup(file, args->handle);
+	if (!obj)
+		return -ENOENT;
+
+	/* FIXME: Need to handle case with unending batch buffers */
+	if (!(args->flags & I915_BIND_UNBIND)) {
+		svm_obj = kmalloc(sizeof(*svm_obj), GFP_KERNEL);
+		if (!svm_obj) {
+			ret = -ENOMEM;
+			goto put_obj;
+		}
+		svm_obj->handle = args->handle;
+		svm_obj->offset = args->start;
+	}
+
+	mutex_lock(&vm->mutex);
+	if (!(args->flags & I915_BIND_UNBIND)) {
+		list_add(&svm_obj->link, &vm->svm_list);
+		vm->svm_count++;
+	} else {
+		list_for_each_entry_safe(svm_obj, tmp, &vm->svm_list, link) {
+			if (svm_obj->handle != args->handle)
+				continue;
+
+			list_del_init(&svm_obj->link);
+			vm->svm_count--;
+			kfree(svm_obj);
+			break;
+		}
+	}
+	mutex_unlock(&vm->mutex);
+put_obj:
+	i915_gem_object_put(obj);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.h b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
new file mode 100644
index 000000000000..c394542dba75
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_SVM_H
+#define __I915_GEM_SVM_H
+
+#include "i915_drv.h"
+
+#if defined(CONFIG_DRM_I915_SVM)
+int i915_gem_svm_bind(struct i915_address_space *vm,
+		      struct drm_i915_bind *args,
+		      struct drm_file *file);
+#else
+static inline int i915_gem_svm_bind(struct i915_address_space *vm,
+				    struct drm_i915_bind *args,
+				    struct drm_file *file)
+{ return -ENOTSUPP; }
+#endif
+
+#endif /* __I915_GEM_SVM_H */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4defadb26e7e..9c525d3f694c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -61,6 +61,7 @@
 
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_svm.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
@@ -2687,6 +2688,26 @@ i915_gem_reject_pin_ioctl(struct drm_device *dev, void *data,
 	return -ENODEV;
 }
 
+static int i915_bind_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file)
+{
+	struct drm_i915_bind *args = data;
+	struct i915_address_space *vm;
+	int ret = -EINVAL;
+
+	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
+	if (unlikely(!vm))
+		return -ENOENT;
+
+	switch (args->type) {
+	case I915_BIND_SVM_GEM_OBJ:
+		ret = i915_gem_svm_bind(vm, args, file);
+	}
+
+	i915_vm_put(vm);
+	return ret;
+}
+
 static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_FLUSH, drm_noop, DRM_AUTH),
@@ -2746,6 +2767,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bbf4dfdfa8ba..f7051e6df656 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1911,6 +1911,28 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline struct i915_address_space *
+__i915_gem_address_space_lookup_rcu(struct drm_i915_file_private *file_priv,
+				    u32 id)
+{
+	return idr_find(&file_priv->vm_idr, id);
+}
+
+static inline struct i915_address_space *
+i915_gem_address_space_lookup(struct drm_i915_file_private *file_priv,
+			      u32 id)
+{
+	struct i915_address_space *vm;
+
+	rcu_read_lock();
+	vm = __i915_gem_address_space_lookup_rcu(file_priv, id);
+	if (vm)
+		vm = i915_vm_get(vm);
+	rcu_read_unlock();
+
+	return vm;
+}
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6239a9adbf14..44ff4074db12 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -574,6 +574,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	stash_init(&vm->free_pages);
 
 	INIT_LIST_HEAD(&vm->bound_list);
+	INIT_LIST_HEAD(&vm->svm_list);
 }
 
 static int __setup_page_dma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 402283ce2864..d618a5787c61 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -285,6 +285,13 @@ struct pagestash {
 	struct pagevec pvec;
 };
 
+struct i915_svm_obj {
+	/** This obj's place in the SVM object list */
+	struct list_head link;
+	u32 handle;
+	u64 offset;
+};
+
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -329,6 +336,12 @@ struct i915_address_space {
 	 */
 	struct list_head bound_list;
 
+	/**
+	 * List of SVM bind objects.
+	 */
+	struct list_head svm_list;
+	unsigned int svm_count;
+
 	struct pagestash free_pages;
 
 	/* Global GTT */
-- 
2.21.0.rc0.32.g243a4c7e27

new file mode 100644
index 000000000000..973070056726
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+
+int i915_gem_svm_bind(struct i915_address_space *vm,
+		      struct drm_i915_bind *args,
+		      struct drm_file *file)
+{
+	struct i915_svm_obj *svm_obj, *tmp;
+	struct drm_i915_gem_object *obj;
+	int ret = 0;
+
+	obj = i915_gem_object_lookup(file, args->handle);
+	if (!obj)
+		return -ENOENT;
+
+	/* FIXME: Need to handle case with unending batch buffers */
+	if (!(args->flags & I915_BIND_UNBIND)) {
+		svm_obj = kmalloc(sizeof(*svm_obj), GFP_KERNEL);
+		if (!svm_obj) {
+			ret = -ENOMEM;
+			goto put_obj;
+		}
+		svm_obj->handle = args->handle;
+		svm_obj->offset = args->start;
+	}
+
+	mutex_lock(&vm->mutex);
+	if (!(args->flags & I915_BIND_UNBIND)) {
+		list_add(&svm_obj->link, &vm->svm_list);
+		vm->svm_count++;
+	} else {
+		list_for_each_entry_safe(svm_obj, tmp, &vm->svm_list, link) {
+			if (svm_obj->handle != args->handle)
+				continue;
+
+			list_del_init(&svm_obj->link);
+			vm->svm_count--;
+			kfree(svm_obj);
+			break;
+		}
+	}
+	mutex_unlock(&vm->mutex);
+put_obj:
+	i915_gem_object_put(obj);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.h b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
new file mode 100644
index 000000000000..c394542dba75
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_SVM_H
+#define __I915_GEM_SVM_H
+
+#include "i915_drv.h"
+
+#if defined(CONFIG_DRM_I915_SVM)
+int i915_gem_svm_bind(struct i915_address_space *vm,
+		      struct drm_i915_bind *args,
+		      struct drm_file *file);
+#else
+static inline int i915_gem_svm_bind(struct i915_address_space *vm,
+				    struct drm_i915_bind *args,
+				    struct drm_file *file)
+{ return -ENOTSUPP; }
+#endif
+
+#endif /* __I915_GEM_SVM_H */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4defadb26e7e..9c525d3f694c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -61,6 +61,7 @@
 
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_svm.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
@@ -2687,6 +2688,26 @@ i915_gem_reject_pin_ioctl(struct drm_device *dev, void *data,
 	return -ENODEV;
 }
 
+static int i915_bind_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file)
+{
+	struct drm_i915_bind *args = data;
+	struct i915_address_space *vm;
+	int ret = -EINVAL;
+
+	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
+	if (unlikely(!vm))
+		return -ENOENT;
+
+	switch (args->type) {
+	case I915_BIND_SVM_GEM_OBJ:
+		ret = i915_gem_svm_bind(vm, args, file);
+	}
+
+	i915_vm_put(vm);
+	return ret;
+}
+
 static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_FLUSH, drm_noop, DRM_AUTH),
@@ -2746,6 +2767,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bbf4dfdfa8ba..f7051e6df656 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1911,6 +1911,28 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline struct i915_address_space *
+__i915_gem_address_space_lookup_rcu(struct drm_i915_file_private *file_priv,
+				    u32 id)
+{
+	return idr_find(&file_priv->vm_idr, id);
+}
+
+static inline struct i915_address_space *
+i915_gem_address_space_lookup(struct drm_i915_file_private *file_priv,
+			      u32 id)
+{
+	struct i915_address_space *vm;
+
+	rcu_read_lock();
+	vm = __i915_gem_address_space_lookup_rcu(file_priv, id);
+	if (vm)
+		vm = i915_vm_get(vm);
+	rcu_read_unlock();
+
+	return vm;
+}
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6239a9adbf14..44ff4074db12 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -574,6 +574,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	stash_init(&vm->free_pages);
 
 	INIT_LIST_HEAD(&vm->bound_list);
+	INIT_LIST_HEAD(&vm->svm_list);
 }
 
 static int __setup_page_dma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 402283ce2864..d618a5787c61 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -285,6 +285,13 @@ struct pagestash {
 	struct pagevec pvec;
 };
 
+struct i915_svm_obj {
+	/** This obj's place in the SVM object list */
+	struct list_head link;
+	u32 handle;
+	u64 offset;
+};
+
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -329,6 +336,12 @@ struct i915_address_space {
 	 */
 	struct list_head bound_list;
 
+	/**
+	 * List of SVM bind objects.
+	 */
+	struct list_head svm_list;
+	unsigned int svm_count;
+
 	struct pagestash free_pages;
 
 	/* Global GTT */
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 04/13] drm/i915/svm: Implicitly migrate BOs upon CPU access
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

As PCIe is a non-coherent link, do not allow direct access to buffer
objects across the PCIe link in the SVM case. Upon CPU access (mmap,
pread), migrate the buffer object back to host memory.
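
As a sketch of the resulting userspace-visible behaviour, assuming a BO
that is SVM mapped and currently placed in device local memory, a plain
CPU access through an existing mmap offset is now enough to pull it
back:

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   fd, mmap_offset);
	/* First CPU touch faults into i915_gem_fault(), which migrates
	 * the BO from LMEM to SMEM before inserting the CPU PTEs. */
	((volatile char *)ptr)[0];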

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 10 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 29 +++++++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  3 +++
 drivers/gpu/drm/i915/intel_memory_region.c |  4 ---
 drivers/gpu/drm/i915/intel_memory_region.h |  4 +++
 5 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index d60973603cc1..0b41e2f3098c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -13,6 +13,7 @@
 #include "i915_drv.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_ioctls.h"
+#include "i915_gem_lmem.h"
 #include "i915_gem_object.h"
 #include "i915_trace.h"
 #include "i915_vma.h"
@@ -235,6 +236,15 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	if (i915_gem_object_is_readonly(obj) && write)
 		return VM_FAULT_SIGBUS;
 
+	/* Implicitly migrate BO to SMEM if it is SVM mapped */
+	if (i915_gem_object_svm_mapped(obj) && i915_gem_object_is_lmem(obj)) {
+		u32 regions[] = { REGION_MAP(INTEL_MEMORY_SYSTEM, 0) };
+
+		ret = i915_gem_object_migrate_region(obj, regions, 1);
+		if (ret)
+			goto err;
+	}
+
 	/* We don't use vmf->pgoff since that has the fake offset */
 	page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 10defa723bc9..a7dee1b749cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -493,12 +493,17 @@ __region_id(u32 region)
 	return INTEL_REGION_UNKNOWN;
 }
 
+bool
+i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
+{
+	return false;
+}
+
 static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 					 struct drm_i915_gem_object_param *args,
 					 struct drm_file *file,
 					 struct drm_i915_gem_object *obj)
 {
-	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
 	u32 __user *uregions = u64_to_user_ptr(args->data);
 	u32 uregions_copy[INTEL_REGION_UNKNOWN];
 	int i, ret;
@@ -518,16 +523,28 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 		++uregions;
 	}
 
+	ret = i915_gem_object_migrate_region(obj, uregions_copy,
+					     args->size);
+
+	return ret;
+}
+
+int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj,
+				   u32 *regions, int size)
+{
+	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
+	int i, ret;
+
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	ret = i915_gem_object_prepare_move(obj);
 	if (ret) {
 		DRM_ERROR("Cannot set memory region, object in use\n");
-	        goto err;
+		goto err;
 	}
 
-	for (i = 0; i < args->size; i++) {
-		u32 region = uregions_copy[i];
-		enum intel_region_id id = __region_id(region);
+	for (i = 0; i < size; i++) {
+		enum intel_region_id id = __region_id(regions[i]);
 
 		if (id == INTEL_REGION_UNKNOWN) {
 			ret = -EINVAL;
@@ -537,7 +554,7 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 		ret = i915_gem_object_migrate(obj, ce, id);
 		if (!ret) {
 			if (!i915_gem_object_has_pages(obj) &&
-			    MEMORY_TYPE_FROM_REGION(region) ==
+			    MEMORY_TYPE_FROM_REGION(regions[i]) ==
 			    INTEL_MEMORY_LOCAL) {
 				/*
 				 * TODO: this should be part of get_pages(),
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 4d6da91beaff..eec31d9a4fa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,9 @@ int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj);
 int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
 			    struct intel_context *ce,
 			    enum intel_region_id id);
+bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj);
+int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj,
+				   u32 *uregions, int size);
 
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index baaeaecc64af..049a1b482d9d 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -6,10 +6,6 @@
 #include "intel_memory_region.h"
 #include "i915_drv.h"
 
-/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */
-#define REGION_MAP(type, inst) \
-	BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst)
-
 const u32 intel_region_map[] = {
 	[INTEL_REGION_SMEM] = REGION_MAP(INTEL_MEMORY_SYSTEM, 0),
 	[INTEL_REGION_LMEM] = REGION_MAP(INTEL_MEMORY_LOCAL, 0),
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 238722009677..4622c086c06d 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -41,6 +41,10 @@ enum intel_region_id {
 
 #define INTEL_MEMORY_TYPE_SHIFT 16
 
+/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */
+#define REGION_MAP(type, inst) \
+	(BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst))
+
 #define MEMORY_TYPE_FROM_REGION(r) (ilog2((r) >> INTEL_MEMORY_TYPE_SHIFT))
 #define MEMORY_INSTANCE_FROM_REGION(r) (ilog2((r) & 0xffff))
 
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 05/13] drm/i915/svm: Page table update support for SVM
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

For the Shared Virtual Memory (SVM) system (SYS) allocator, there is no
backing buffer object (BO). Add support to bind a VA-to-PA mapping
in the device page table.
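
A minimal sketch of how a caller uses the new helpers, assuming it has
already built an sg_table 'st' of the backing physical pages and the
page-size mask 'sg_page_sizes' from the segment lengths:

	err = svm_bind_addr(vm, start, length,
			    read_only ? I915_GTT_SVM_READONLY : 0,
			    &st, sg_page_sizes);
	if (err)
		return err;

	/* ... and on invalidation or unbind: */
	svm_unbind_addr(vm, start, length);

The bind is split into a prepare step (page table allocation, which can
fail) and a commit step, so that a caller can allocate page tables
first and publish the mapping later under its own locks.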

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 60 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 10 +++++
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 44ff4074db12..d9e95229ba1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -183,6 +183,50 @@ static void ppgtt_unbind_vma(struct i915_vma *vma)
 		vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 }
 
+int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size)
+{
+	return vm->allocate_va_range(vm, start, size);
+}
+
+int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size,
+			 u64 flags, struct sg_table *st, u32 sg_page_sizes)
+{
+	struct i915_vma vma = {0};
+	u32 pte_flags = 0;
+
+	/* use a vma wrapper */
+	vma.page_sizes.sg = sg_page_sizes;
+	vma.node.start = start;
+	vma.node.size = size;
+	vma.pages = st;
+	vma.vm = vm;
+
+	/* Applicable to VLV, and gen8+ */
+	if (flags & I915_GTT_SVM_READONLY)
+		pte_flags |= PTE_READ_ONLY;
+
+	vm->insert_entries(vm, &vma, 0, pte_flags);
+	return 0;
+}
+
+int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size,
+		  u64 flags, struct sg_table *st, u32 sg_page_sizes)
+{
+	int ret;
+
+	ret = svm_bind_addr_prepare(vm, start, size);
+	if (ret)
+		return ret;
+
+	return svm_bind_addr_commit(vm, start, size, flags, st, sg_page_sizes);
+}
+
+void svm_unbind_addr(struct i915_address_space *vm,
+		     u64 start, u64 size)
+{
+	vm->clear_range(vm, start, size);
+}
+
 static int ppgtt_set_pages(struct i915_vma *vma)
 {
 	GEM_BUG_ON(vma->pages);
@@ -973,11 +1017,21 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 	DBG("%s(%p):{ lvl:%d, start:%llx, end:%llx, idx:%d, len:%d, used:%d }\n",
 	    __func__, vm, lvl + 1, start, end,
 	    idx, len, atomic_read(px_used(pd)));
-	GEM_BUG_ON(!len || len >= atomic_read(px_used(pd)));
+	/*
+	 * FIXME: In the SVM case, during mmu invalidation, we need to clear
+	 * the ppgtt, but we don't know whether the entry exists. So we can't
+	 * assume this is called only when the entry exists; revisit.
+	 * Also need to add the ability to properly handle partial
+	 * invalidations by downgrading the large mappings.
+	 */
+	GEM_BUG_ON(!len);
 
 	do {
 		struct i915_page_table *pt = pd->entry[idx];
 
+		if (!pt)
+			continue;
+
 		if (atomic_fetch_inc(&pt->used) >> gen8_pd_shift(1) &&
 		    gen8_pd_contains(start, end, lvl)) {
 			DBG("%s(%p):{ lvl:%d, idx:%d, start:%llx, end:%llx } removing pd\n",
@@ -1000,7 +1054,9 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    __func__, vm, lvl, start, end,
 			    gen8_pd_index(start, 0), count,
 			    atomic_read(&pt->used));
-			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
+			GEM_BUG_ON(!count);
+			if (count > atomic_read(&pt->used))
+				count = atomic_read(&pt->used);
 
 			vaddr = kmap_atomic_px(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d618a5787c61..6a8d55490e39 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -39,6 +39,7 @@
 #include <linux/mm.h>
 #include <linux/pagevec.h>
 #include <linux/workqueue.h>
+#include <linux/scatterlist.h>
 
 #include <drm/drm_mm.h>
 
@@ -678,4 +679,13 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
 
+/* SVM API */
+#define I915_GTT_SVM_READONLY  BIT(0)
+
+int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size);
+int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size,
+			 u64 flags, struct sg_table *st, u32 sg_page_sizes);
+int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size,
+		  u64 flags, struct sg_table *st, u32 sg_page_sizes);
+void svm_unbind_addr(struct i915_address_space *vm, u64 start, u64 size);
 #endif
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Use HMM page table mirroring support to build the device page table.
Implement the bind ioctl and bind the process address range in the
specified context's ppgtt.
Handle invalidation notifications by unbinding the address range.
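
The bind path follows the usual HMM snapshot-and-retry pattern; in
outline (simplified from i915_svm_bind() below):

	again:
		i915_range_fault(svm, &range);	/* snapshot CPU page table */
		i915_svm_build_sg(vm, &range, &st);
		mutex_lock(&svm->mutex);
		if (!i915_range_done(&range)) {
			/* invalidated meanwhile, retry */
			mutex_unlock(&svm->mutex);
			goto again;
		}
		svm_bind_addr_commit(vm, ...);	/* snapshot still valid */
		mutex_unlock(&svm->mutex);

The invalidation callback only has to zap the range from the ppgtt; any
bind racing with it notices the invalidation and retries.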

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig        |   3 +
 drivers/gpu/drm/i915/Makefile       |   3 +-
 drivers/gpu/drm/i915/i915_drv.c     |   3 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |   5 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |   4 +
 drivers/gpu/drm/i915/i915_svm.c     | 296 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_svm.h     |  50 +++++
 7 files changed, 363 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm.h

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index c2e48710eec8..689e57fe3973 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -141,6 +141,9 @@ config DRM_I915_SVM
 	bool "Enable Shared Virtual Memory support in i915"
 	depends on STAGING
 	depends on DRM_I915
+	depends on MMU
+	select HMM_MIRROR
+	select MMU_NOTIFIER
 	default n
 	help
 	  Choose this option if you want Shared Virtual Memory (SVM)
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 75fe45633779..7d4cd9eefd12 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -154,7 +154,8 @@ i915-y += \
 	  intel_wopcm.o
 
 # SVM code
-i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o
+i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
+			       i915_svm.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9c525d3f694c..c190df614c48 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -73,6 +73,7 @@
 #include "i915_perf.h"
 #include "i915_query.h"
 #include "i915_suspend.h"
+#include "i915_svm.h"
 #include "i915_switcheroo.h"
 #include "i915_sysfs.h"
 #include "i915_trace.h"
@@ -2700,6 +2701,8 @@ static int i915_bind_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	switch (args->type) {
+	case I915_BIND_SVM_BUFFER:
+		return i915_svm_bind(vm, args);
 	case I915_BIND_SVM_GEM_OBJ:
 		ret = i915_gem_svm_bind(vm, args, file);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d9e95229ba1d..08cbbb4d91cb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -42,6 +42,7 @@
 
 #include "i915_drv.h"
 #include "i915_scatterlist.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
@@ -550,6 +551,7 @@ static void i915_address_space_fini(struct i915_address_space *vm)
 	drm_mm_takedown(&vm->mm);
 
 	mutex_destroy(&vm->mutex);
+	mutex_destroy(&vm->svm_mutex);
 }
 
 void __i915_vm_close(struct i915_address_space *vm)
@@ -593,6 +595,7 @@ void i915_vm_release(struct kref *kref)
 	GEM_BUG_ON(i915_is_ggtt(vm));
 	trace_i915_ppgtt_release(vm);
 
+	i915_svm_unbind_mm(vm);
 	queue_rcu_work(vm->i915->wq, &vm->rcu);
 }
 
@@ -608,6 +611,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	 * attempt holding the lock is immediately reported by lockdep.
 	 */
 	mutex_init(&vm->mutex);
+	mutex_init(&vm->svm_mutex);
 	lockdep_set_subclass(&vm->mutex, subclass);
 	i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
 
@@ -619,6 +623,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 
 	INIT_LIST_HEAD(&vm->bound_list);
 	INIT_LIST_HEAD(&vm->svm_list);
+	RCU_INIT_POINTER(vm->svm, NULL);
 }
 
 static int __setup_page_dma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 6a8d55490e39..722b1e7ce291 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -293,6 +293,8 @@ struct i915_svm_obj {
 	u64 offset;
 };
 
+struct i915_svm;
+
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -342,6 +344,8 @@ struct i915_address_space {
 	 */
 	struct list_head svm_list;
 	unsigned int svm_count;
+	struct i915_svm *svm;
+	struct mutex svm_mutex; /* protects svm enabling */
 
 	struct pagestash free_pages;
 
diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c
new file mode 100644
index 000000000000..4e414f257121
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm.c
@@ -0,0 +1,296 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/mm_types.h>
+#include <linux/sched/mm.h>
+
+#include "i915_svm.h"
+#include "intel_memory_region.h"
+#include "gem/i915_gem_context.h"
+
+static const u64 i915_range_flags[HMM_PFN_FLAG_MAX] = {
+	[HMM_PFN_VALID]          = (1 << 0), /* HMM_PFN_VALID */
+	[HMM_PFN_WRITE]          = (1 << 1), /* HMM_PFN_WRITE */
+	[HMM_PFN_DEVICE_PRIVATE] = (1 << 2) /* HMM_PFN_DEVICE_PRIVATE */
+};
+
+static const u64 i915_range_values[HMM_PFN_VALUE_MAX] = {
+	[HMM_PFN_ERROR]   = 0xfffffffffffffffeUL, /* HMM_PFN_ERROR */
+	[HMM_PFN_NONE]    = 0, /* HMM_PFN_NONE */
+	[HMM_PFN_SPECIAL] = 0xfffffffffffffffcUL /* HMM_PFN_SPECIAL */
+};
+
+static inline bool
+i915_range_done(struct hmm_range *range)
+{
+	bool ret = hmm_range_valid(range);
+
+	hmm_range_unregister(range);
+	return ret;
+}
+
+static int
+i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
+{
+	long ret;
+
+	range->default_flags = 0;
+	range->pfn_flags_mask = -1UL;
+
+	ret = hmm_range_register(range, &svm->mirror);
+	if (ret) {
+		up_read(&svm->mm->mmap_sem);
+		return (int)ret;
+	}
+
+	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
+		up_read(&svm->mm->mmap_sem);
+		return -EBUSY;
+	}
+
+	ret = hmm_range_fault(range, 0);
+	if (ret <= 0) {
+		if (ret == 0)
+			ret = -EBUSY;
+		up_read(&svm->mm->mmap_sem);
+		hmm_range_unregister(range);
+		return ret;
+	}
+	return 0;
+}
+
+static struct i915_svm *vm_get_svm(struct i915_address_space *vm)
+{
+	struct i915_svm *svm = vm->svm;
+
+	mutex_lock(&vm->svm_mutex);
+	if (svm && !kref_get_unless_zero(&svm->ref))
+		svm = NULL;
+
+	mutex_unlock(&vm->svm_mutex);
+	return svm;
+}
+
+static void release_svm(struct kref *ref)
+{
+	struct i915_svm *svm = container_of(ref, typeof(*svm), ref);
+	struct i915_address_space *vm = svm->vm;
+
+	hmm_mirror_unregister(&svm->mirror);
+	mutex_destroy(&svm->mutex);
+	vm->svm = NULL;
+	kfree(svm);
+}
+
+static void vm_put_svm(struct i915_address_space *vm)
+{
+	mutex_lock(&vm->svm_mutex);
+	if (vm->svm)
+		kref_put(&vm->svm->ref, release_svm);
+	mutex_unlock(&vm->svm_mutex);
+}
+
+static u32 i915_svm_build_sg(struct i915_address_space *vm,
+			     struct hmm_range *range,
+			     struct sg_table *st)
+{
+	struct scatterlist *sg;
+	u32 sg_page_sizes = 0;
+	u64 i, npages;
+
+	sg = NULL;
+	st->nents = 0;
+	npages = (range->end - range->start) / PAGE_SIZE;
+
+	/*
+	 * No need to dma map the host pages and later unmap them, as
+	 * the GPU is not allowed to access them with SVM. Hence, there is
+	 * no need for any intermediate data structure to hold the mappings.
+	 */
+	for (i = 0; i < npages; i++) {
+		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
+
+		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
+			sg->length += PAGE_SIZE;
+			sg_dma_len(sg) += PAGE_SIZE;
+			continue;
+		}
+
+		if (sg)
+			sg_page_sizes |= sg->length;
+
+		sg = sg ? __sg_next(sg) : st->sgl;
+		sg_dma_address(sg) = addr;
+		sg_dma_len(sg) = PAGE_SIZE;
+		sg->length = PAGE_SIZE;
+		st->nents++;
+	}
+
+	sg_page_sizes |= sg->length;
+	sg_mark_end(sg);
+	return sg_page_sizes;
+}
+
+static void i915_svm_unbind_addr(struct i915_address_space *vm,
+				 u64 start, u64 length)
+{
+	struct i915_svm *svm = vm->svm;
+
+	mutex_lock(&svm->mutex);
+	/* FIXME: Need to flush the TLB */
+	svm_unbind_addr(vm, start, length);
+	mutex_unlock(&svm->mutex);
+}
+
+int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args)
+{
+	struct vm_area_struct *vma;
+	struct hmm_range range;
+	struct i915_svm *svm;
+	u64 i, flags, npages;
+	struct sg_table st;
+	u32 sg_page_sizes;
+	int ret = 0;
+
+	svm = vm_get_svm(vm);
+	if (!svm)
+		return -EINVAL;
+
+	args->length += (args->start & ~PAGE_MASK);
+	args->start &= PAGE_MASK;
+	DRM_DEBUG_DRIVER("%sing start 0x%llx length 0x%llx vm_id 0x%x\n",
+			 (args->flags & I915_BIND_UNBIND) ? "Unbind" : "Bind",
+			 args->start, args->length, args->vm_id);
+	if (args->flags & I915_BIND_UNBIND) {
+		i915_svm_unbind_addr(vm, args->start, args->length);
+		goto bind_done;
+	}
+
+	npages = args->length / PAGE_SIZE;
+	if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL))) {
+		ret = -ENOMEM;
+		goto bind_done;
+	}
+
+	range.pfns = kvmalloc_array(npages, sizeof(uint64_t), GFP_KERNEL);
+	if (unlikely(!range.pfns)) {
+		ret = -ENOMEM;
+		goto range_done;
+	}
+
+	ret = svm_bind_addr_prepare(vm, args->start, args->length);
+	if (unlikely(ret))
+		goto prepare_done;
+
+	range.pfn_shift = PAGE_SHIFT;
+	range.start = args->start;
+	range.flags = i915_range_flags;
+	range.values = i915_range_values;
+	range.end = range.start + args->length;
+	for (i = 0; i < npages; ++i) {
+		range.pfns[i] = range.flags[HMM_PFN_VALID];
+		if (!(args->flags & I915_BIND_READONLY))
+			range.pfns[i] |= range.flags[HMM_PFN_WRITE];
+	}
+
+	down_read(&svm->mm->mmap_sem);
+	vma = find_vma(svm->mm, range.start);
+	if (unlikely(!vma || vma->vm_end < range.end)) {
+		ret = -EFAULT;
+		goto vma_done;
+	}
+again:
+	ret = i915_range_fault(svm, &range);
+	if (unlikely(ret))
+		goto vma_done;
+
+	sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
+
+	mutex_lock(&svm->mutex);
+	if (!i915_range_done(&range)) {
+		mutex_unlock(&svm->mutex);
+		goto again;
+	}
+
+	flags = (args->flags & I915_BIND_READONLY) ? I915_GTT_SVM_READONLY : 0;
+	ret = svm_bind_addr_commit(vm, args->start, args->length,
+				   flags, &st, sg_page_sizes);
+	mutex_unlock(&svm->mutex);
+vma_done:
+	up_read(&svm->mm->mmap_sem);
+	if (unlikely(ret))
+		i915_svm_unbind_addr(vm, args->start, args->length);
+prepare_done:
+	kvfree(range.pfns);
+range_done:
+	sg_free_table(&st);
+bind_done:
+	vm_put_svm(vm);
+	return ret;
+}
+
+static int
+i915_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
+				const struct mmu_notifier_range *update)
+{
+	struct i915_svm *svm = container_of(mirror, typeof(*svm), mirror);
+	unsigned long length = update->end - update->start;
+
+	DRM_DEBUG_DRIVER("start 0x%lx length 0x%lx\n", update->start, length);
+	if (!mmu_notifier_range_blockable(update))
+		return -EAGAIN;
+
+	i915_svm_unbind_addr(svm->vm, update->start, length);
+	return 0;
+}
+
+static void i915_mirror_release(struct hmm_mirror *mirror)
+{
+}
+
+static const struct hmm_mirror_ops i915_mirror_ops = {
+	.sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables,
+	.release = &i915_mirror_release,
+};
+
+void i915_svm_unbind_mm(struct i915_address_space *vm)
+{
+	vm_put_svm(vm);
+}
+
+int i915_svm_bind_mm(struct i915_address_space *vm)
+{
+	struct i915_svm *svm;
+	struct mm_struct *mm;
+	int ret = 0;
+
+	mm = get_task_mm(current);
+	down_write(&mm->mmap_sem);
+	mutex_lock(&vm->svm_mutex);
+	if (vm->svm)
+		goto bind_out;
+
+	svm = kzalloc(sizeof(*svm), GFP_KERNEL);
+	if (!svm) {
+		ret = -ENOMEM;
+		goto bind_out;
+	}
+	svm->mirror.ops = &i915_mirror_ops;
+	mutex_init(&svm->mutex);
+	kref_init(&svm->ref);
+	svm->mm = mm;
+	svm->vm = vm;
+
+	ret = hmm_mirror_register(&svm->mirror, mm);
+	if (ret)
+		goto bind_out;
+
+	vm->svm = svm;
+bind_out:
+	mutex_unlock(&vm->svm_mutex);
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h
new file mode 100644
index 000000000000..f176f1dc493f
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_SVM_H
+#define __I915_SVM_H
+
+#include <linux/hmm.h>
+
+#include "i915_drv.h"
+
+#if defined(CONFIG_DRM_I915_SVM)
+struct i915_svm {
+	/* i915 address space */
+	struct i915_address_space *vm;
+
+	/* Associated process mm */
+	struct mm_struct *mm;
+
+	/* hmm_mirror to track memory invalidations for a process */
+	struct hmm_mirror mirror;
+
+	struct mutex mutex; /* protects svm operations */
+	struct kref ref;
+};
+
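+/*
+ * Illustrative userspace flow (a sketch only; DRM_IOCTL_I915_BIND and
+ * the struct drm_i915_bind layout come from the UAPI patch in this
+ * series, so treat the exact names as assumptions):
+ *
+ *	struct drm_i915_bind bind = {
+ *		.vm_id  = vm_id,
+ *		.start  = (__u64)addr,
+ *		.length = len,
+ *		.type   = I915_BIND_SVM_BUFFER,
+ *	};
+ *	drmIoctl(fd, DRM_IOCTL_I915_BIND, &bind);
+ */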
+int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args);
+void i915_svm_unbind_mm(struct i915_address_space *vm);
+int i915_svm_bind_mm(struct i915_address_space *vm);
+static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
+{
+	return vm->svm;
+}
+
+#else
+
+struct i915_svm { };
+static inline int i915_svm_bind(struct i915_address_space *vm,
+				struct drm_i915_bind *args)
+{ return -ENOTSUPP; }
+static inline void i915_svm_unbind_mm(struct i915_address_space *vm) { }
+static inline int i915_svm_bind_mm(struct i915_address_space *vm)
+{ return -ENOTSUPP; }
+static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
+{ return false; }
+
+#endif
+
+#endif /* __I915_SVM_H */
-- 
2.21.0.rc0.32.g243a4c7e27

* [RFC 07/13] drm/i915/svm: Device memory support
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Plug in device memory through HMM as DEVICE_PRIVATE.
Add support functions to allocate and free pages from device memory.
Implement an ioctl to migrate pages from host to device memory.
For now, only migration from host memory to device memory is supported.
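
As an illustrative sketch (not part of the patch), userspace would drive
this ioctl roughly as below. DRM_IOCTL_I915_SVM_MIGRATE and the struct
layout come from the UAPI patch in this series, so the exact names here
are assumptions; only start/length/region match fields this handler
actually reads, and migration is best effort.

	#include <stdint.h>
	#include <xf86drm.h>

	int migrate_to_lmem(int fd, void *buf, uint64_t len, uint32_t region)
	{
		struct drm_i915_svm_migrate migrate = {
			.start  = (uint64_t)(uintptr_t)buf,
			.length = len,
			.region = region,	/* an LMEM region instance */
		};

		/* Pages that fail to migrate simply stay in host memory */
		return drmIoctl(fd, DRM_IOCTL_I915_SVM_MIGRATE, &migrate);
	}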

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig               |   9 +
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  13 -
 drivers/gpu/drm/i915/i915_buddy.h          |  12 +
 drivers/gpu/drm/i915/i915_drv.c            |   1 +
 drivers/gpu/drm/i915/i915_svm.c            |   2 +
 drivers/gpu/drm/i915/i915_svm.h            |  15 +
 drivers/gpu/drm/i915/i915_svm_devmem.c     | 391 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.h |  14 +
 drivers/gpu/drm/i915/intel_region_lmem.c   |  10 +
 10 files changed, 456 insertions(+), 14 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 689e57fe3973..66337f2ca2bf 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -141,9 +141,18 @@ config DRM_I915_SVM
 	bool "Enable Shared Virtual Memory support in i915"
 	depends on STAGING
 	depends on DRM_I915
+	depends on ARCH_ENABLE_MEMORY_HOTPLUG
+	depends on ARCH_ENABLE_MEMORY_HOTREMOVE
+	depends on MEMORY_HOTPLUG
+	depends on MEMORY_HOTREMOVE
+	depends on ARCH_HAS_PTE_DEVMAP
+	depends on SPARSEMEM_VMEMMAP
+	depends on ZONE_DEVICE
+	depends on DEVICE_PRIVATE
 	depends on MMU
 	select HMM_MIRROR
 	select MMU_NOTIFIER
+	select MIGRATE_VMA_HELPER
 	default n
 	help
 	  Choose this option if you want Shared Virtual Memory (SVM)
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7d4cd9eefd12..b574ec31ea2e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -155,7 +155,8 @@ i915-y += \
 
 # SVM code
 i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
-			       i915_svm.o
+			       i915_svm.o \
+			       i915_svm_devmem.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index a7dee1b749cb..dd88fa87b7fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -480,19 +480,6 @@ int __init i915_global_objects_init(void)
 	return 0;
 }
 
-static enum intel_region_id
-__region_id(u32 region)
-{
-	enum intel_region_id id;
-
-	for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) {
-		if (intel_region_map[id] == region)
-			return id;
-	}
-
-	return INTEL_REGION_UNKNOWN;
-}
-
 bool
 i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h
index ed41f3507cdc..afc493e6c130 100644
--- a/drivers/gpu/drm/i915/i915_buddy.h
+++ b/drivers/gpu/drm/i915/i915_buddy.h
@@ -9,6 +9,9 @@
 #include <linux/bitops.h>
 #include <linux/list.h>
 
+/* 512 bits (one per page) supports 2MB blocks */
+#define I915_BUDDY_MAX_PAGES   512
+
 struct i915_buddy_block {
 #define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
 #define I915_BUDDY_HEADER_STATE  GENMASK_ULL(11, 10)
@@ -32,6 +35,15 @@ struct i915_buddy_block {
 	 */
 	struct list_head link;
 	struct list_head tmp_link;
+
+	unsigned long pfn_first;
+	/*
+	 * FIXME: There are alternatives to a bitmap, such as splitting the
+	 * block into contiguous 4K sized blocks. But that is part of bigger
+	 * issues involving partially invalidating large mappings, freeing
+	 * the blocks, etc. Revisit.
+	 */
+	unsigned long bitmap[BITS_TO_LONGS(I915_BUDDY_MAX_PAGES)];
 };
 
 #define I915_BUDDY_MAX_ORDER  I915_BUDDY_HEADER_ORDER
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c190df614c48..740b4b9d39a8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2771,6 +2771,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_SVM_MIGRATE, i915_svm_migrate_ioctl, DRM_RENDER_ALLOW)
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c
index 4e414f257121..fe7d53634606 100644
--- a/drivers/gpu/drm/i915/i915_svm.c
+++ b/drivers/gpu/drm/i915/i915_svm.c
@@ -206,6 +206,8 @@ int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args)
 	if (unlikely(ret))
 		goto vma_done;
 
+	/* XXX: Assuming the range is exclusively LMEM or SMEM, fix it */
+	i915_dmem_convert_pfn(vm->i915, &range);
 	sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
 
 	mutex_lock(&svm->mutex);
diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h
index f176f1dc493f..a1b62997e925 100644
--- a/drivers/gpu/drm/i915/i915_svm.h
+++ b/drivers/gpu/drm/i915/i915_svm.h
@@ -33,6 +33,14 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
 	return vm->svm;
 }
 
+void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv,
+			   struct hmm_range *range);
+int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv);
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915,
+					u64 size);
+void i915_svm_devmem_remove(struct i915_devmem *devmem);
+
 #else
 
 struct i915_svm { };
@@ -45,6 +53,13 @@ static inline int i915_svm_bind_mm(struct i915_address_space *vm)
 static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
 { return false; }
 
+static inline int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+					 struct drm_file *file_priv)
+{ return -ENOTSUPP; }
+static inline
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size)
+{ return NULL; }
+static inline void i915_svm_devmem_remove(struct i915_devmem *devmem) { }
 #endif
 
 #endif /* __I915_SVM_H */
diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c
new file mode 100644
index 000000000000..40c2f79ff614
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm_devmem.c
@@ -0,0 +1,391 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/mm_types.h>
+#include <linux/sched/mm.h>
+
+#include "i915_svm.h"
+#include "intel_memory_region.h"
+
+struct i915_devmem_migrate {
+	struct drm_i915_private *i915;
+	struct migrate_vma *args;
+
+	enum intel_region_id src_id;
+	enum intel_region_id dst_id;
+	u64 npages;
+};
+
+struct i915_devmem {
+	struct drm_i915_private *i915;
+	struct dev_pagemap pagemap;
+	unsigned long pfn_first;
+	unsigned long pfn_last;
+};
+
+static inline bool
+i915_dmem_page(struct drm_i915_private *dev_priv, struct page *page)
+{
+	return is_device_private_page(page);
+}
+
+void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv,
+			   struct hmm_range *range)
+{
+	unsigned long i, npages;
+
+	npages = (range->end - range->start) >> PAGE_SHIFT;
+	for (i = 0; i < npages; ++i) {
+		struct i915_buddy_block *block;
+		struct intel_memory_region *mem;
+		struct page *page;
+		u64 addr;
+
+		page = hmm_device_entry_to_page(range, range->pfns[i]);
+		if (!page)
+			continue;
+
+		if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE]))
+			continue;
+
+		if (!i915_dmem_page(dev_priv, page)) {
+			WARN(1, "Unknown device memory!\n");
+			range->pfns[i] = 0;
+			continue;
+		}
+
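+		/*
+		 * Translate the device-private page into an LMEM address:
+		 * region start + block offset + page offset within the block.
+		 */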
+		block = page->zone_device_data;
+		mem = block->private;
+		addr = mem->region.start +
+		       i915_buddy_block_offset(block);
+		addr += (page_to_pfn(page) - block->pfn_first) << PAGE_SHIFT;
+
+		range->pfns[i] &= ~range->flags[HMM_PFN_DEVICE_PRIVATE];
+		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
+		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
+	}
+}
+
+static int
+i915_devmem_page_alloc_locked(struct intel_memory_region *mem,
+			      unsigned long npages,
+			      struct list_head *blocks)
+{
+	unsigned long size = ALIGN((npages * PAGE_SIZE), mem->mm.chunk_size);
+	struct i915_buddy_block *block;
+	int ret;
+
+	INIT_LIST_HEAD(blocks);
+	ret = __intel_memory_region_get_pages_buddy(mem, size, 0, blocks);
+	if (unlikely(ret))
+		goto alloc_failed;
+
+	list_for_each_entry(block, blocks, link) {
+		block->pfn_first = mem->devmem->pfn_first;
+		block->pfn_first += i915_buddy_block_offset(block) /
+				    PAGE_SIZE;
+		bitmap_zero(block->bitmap, I915_BUDDY_MAX_PAGES);
+		DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n",
+				 "Allocated block", block->pfn_first,
+				 i915_buddy_block_offset(block),
+				 i915_buddy_block_size(&mem->mm, block));
+	}
+
+alloc_failed:
+	return ret;
+}
+
+static struct page *
+i915_devmem_page_get_locked(struct intel_memory_region *mem,
+			    struct list_head *blocks)
+{
+	struct i915_buddy_block *block, *on;
+
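+	/*
+	 * Hand out one 4K page at a time from the buddy blocks, tracking
+	 * per-page usage in the block bitmap; a fully used block is
+	 * dropped from the allocation list.
+	 */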
+	list_for_each_entry_safe(block, on, blocks, link) {
+		unsigned long weight, max;
+		unsigned long i, pfn;
+		struct page *page;
+
+		max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE;
+		i = find_first_zero_bit(block->bitmap, max);
+		if (unlikely(i == max)) {
+			WARN(1, "Getting a page should have never failed\n");
+			break;
+		}
+
+		set_bit(i, block->bitmap);
+		pfn = block->pfn_first + i;
+		page = pfn_to_page(pfn);
+		get_page(page);
+		lock_page(page);
+		page->zone_device_data = block;
+		weight = bitmap_weight(block->bitmap, max);
+		if (weight == max)
+			list_del_init(&block->link);
+		DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n",
+				 "Allocated page", pfn, weight);
+		return page;
+	}
+	return NULL;
+}
+
+static void
+i915_devmem_page_free_locked(struct drm_i915_private *dev_priv,
+			     struct page *page)
+{
+	unlock_page(page);
+	put_page(page);
+}
+
+static int
+i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
+{
+	struct drm_i915_private *i915 = migrate->i915;
+	struct migrate_vma *args = migrate->args;
+	struct intel_memory_region *mem;
+	struct list_head blocks = {0};
+	unsigned long i, npages, cnt;
+	struct page *page;
+	int ret;
+
+	npages = (args->end - args->start) >> PAGE_SHIFT;
+	DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages);
+
+	/* Check source pages */
+	for (i = 0, cnt = 0; i < npages; i++) {
+		args->dst[i] = 0;
+		page = migrate_pfn_to_page(args->src[i]);
+		if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE)))
+			continue;
+
+		args->dst[i] = MIGRATE_PFN_VALID;
+		cnt++;
+	}
+
+	if (!cnt) {
+		ret = -ENOMEM;
+		goto migrate_out;
+	}
+
+	mem = i915->mm.regions[migrate->dst_id];
+	ret = i915_devmem_page_alloc_locked(mem, cnt, &blocks);
+	if (unlikely(ret))
+		goto migrate_out;
+
+	/* Allocate device memory */
+	for (i = 0, cnt = 0; i < npages; i++) {
+		if (!args->dst[i])
+			continue;
+
+		page = i915_devmem_page_get_locked(mem, &blocks);
+		if (unlikely(!page)) {
+			WARN(1, "Failed to get dst page\n");
+			args->dst[i] = 0;
+			continue;
+		}
+
+		cnt++;
+		args->dst[i] = migrate_pfn(page_to_pfn(page)) |
+			       MIGRATE_PFN_LOCKED;
+	}
+
+	if (!cnt) {
+		ret = -ENOMEM;
+		goto migrate_out;
+	}
+
+	/* Copy the pages */
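+	/* The actual page copy is wired up by a later patch in this series */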
+	migrate->npages = npages;
+migrate_out:
+	if (unlikely(ret)) {
+		for (i = 0; i < npages; i++) {
+			if (args->dst[i] & MIGRATE_PFN_LOCKED) {
+				page = migrate_pfn_to_page(args->dst[i]);
+				i915_devmem_page_free_locked(i915, page);
+			}
+			args->dst[i] = 0;
+		}
+	}
+
+	return ret;
+}
+
+static void
+i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate)
+{
+	DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages);
+}
+
+static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate)
+{
+	int ret;
+
+	ret = i915_devmem_migrate_alloc_and_copy(migrate);
+	if (!ret) {
+		migrate_vma_pages(migrate->args);
+		i915_devmem_migrate_finalize_and_map(migrate);
+	}
+	migrate_vma_finalize(migrate->args);
+}
+
+int i915_devmem_migrate_vma(struct intel_memory_region *mem,
+			    struct vm_area_struct *vma,
+			    unsigned long start,
+			    unsigned long end)
+{
+	unsigned long npages = (end - start) >> PAGE_SHIFT;
+	unsigned long max = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages);
+	struct i915_devmem_migrate migrate = {0};
+	struct migrate_vma args = {
+		.vma		= vma,
+		.start		= start,
+	};
+	unsigned long c, i;
+	int ret = 0;
+
+	/* XXX: Opportunistically migrate additional pages? */
+	DRM_DEBUG_DRIVER("start 0x%lx end 0x%lx\n", start, end);
+	args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
+	if (unlikely(!args.src))
+		return -ENOMEM;
+
+	args.dst = kcalloc(max, sizeof(*args.dst), GFP_KERNEL);
+	if (unlikely(!args.dst)) {
+		kfree(args.src);
+		return -ENOMEM;
+	}
+
+	/* XXX: Support migrating from LMEM to SMEM */
+	migrate.args = &args;
+	migrate.i915 = mem->i915;
+	migrate.src_id = INTEL_REGION_SMEM;
+	migrate.dst_id = MEMORY_TYPE_FROM_REGION(mem->id);
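+	/* Migrate in chunks of at most I915_BUDDY_MAX_PAGES pages */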
+	for (i = 0; i < npages; i += c) {
+		c = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages - i);
+		args.end = args.start + (c << PAGE_SHIFT);
+		ret = migrate_vma_setup(&args);
+		if (unlikely(ret))
+			goto migrate_done;
+		if (args.cpages)
+			i915_devmem_migrate_chunk(&migrate);
+		args.start = args.end;
+	}
+migrate_done:
+	kfree(args.dst);
+	kfree(args.src);
+	return ret;
+}
+
+static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf)
+{
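+	/* Migration back to host memory is not supported yet */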
+	return VM_FAULT_SIGBUS;
+}
+
+static void i915_devmem_page_free(struct page *page)
+{
+	struct i915_buddy_block *block = page->zone_device_data;
+	struct intel_memory_region *mem = block->private;
+	unsigned long i, max, weight;
+
+	max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE;
+	i = page_to_pfn(page) - block->pfn_first;
+	clear_bit(i, block->bitmap);
+	weight = bitmap_weight(block->bitmap, max);
+	DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n",
+			 "Freeing page", page_to_pfn(page), weight);
+	if (!weight) {
+		DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n",
+				 "Freeing block", block->pfn_first,
+				 i915_buddy_block_offset(block),
+				 i915_buddy_block_size(&mem->mm, block));
+		__intel_memory_region_put_block_buddy(block);
+	}
+}
+
+static const struct dev_pagemap_ops i915_devmem_pagemap_ops = {
+	.page_free = i915_devmem_page_free,
+	.migrate_to_ram = i915_devmem_migrate_to_ram,
+};
+
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size)
+{
+	struct device *dev = &i915->drm.pdev->dev;
+	struct i915_devmem *devmem;
+	struct resource *res;
+
+	devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
+	if (!devmem)
+		return NULL;
+
+	devmem->i915 = i915;
+	res = devm_request_free_mem_region(dev, &iomem_resource, size);
+	if (IS_ERR(res))
+		goto out_free;
+
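+	/* Create device-private struct pages covering the reserved range */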
+	devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+	devmem->pagemap.res = *res;
+	devmem->pagemap.ops = &i915_devmem_pagemap_ops;
+	if (IS_ERR(devm_memremap_pages(dev, &devmem->pagemap)))
+		goto out_free;
+
+	devmem->pfn_first = res->start >> PAGE_SHIFT;
+	devmem->pfn_last = res->end >> PAGE_SHIFT;
+	return devmem;
+out_free:
+	kfree(devmem);
+	return NULL;
+}
+
+void i915_svm_devmem_remove(struct i915_devmem *devmem)
+{
+	if (!devmem)
+		return;
+
+	/* XXX: Is it the right way to release? */
+	release_resource(&devmem->pagemap.res);
+	kfree(devmem);
+}
+
+int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_svm_migrate *args = data;
+	unsigned long addr, end, size = args->length;
+	struct intel_memory_region *mem;
+	enum intel_region_id id;
+	struct mm_struct *mm;
+
+	DRM_DEBUG_DRIVER("start 0x%llx length 0x%llx region 0x%x\n",
+			 args->start, args->length, args->region);
+	id = __region_id(args->region);
+	if ((MEMORY_TYPE_FROM_REGION(args->region) != INTEL_MEMORY_LOCAL) ||
+	    id == INTEL_REGION_UNKNOWN)
+		return -EINVAL;
+
+	mem = i915->mm.regions[id];
+
+	mm = get_task_mm(current);
+	if (!mm)
+		return -EINVAL;
+
+	down_read(&mm->mmap_sem);
+
+	for (addr = args->start, end = args->start + size; addr < end;) {
+		struct vm_area_struct *vma;
+		unsigned long next;
+
+		vma = find_vma_intersection(mm, addr, end);
+		if (!vma)
+			break;
+
+		addr &= PAGE_MASK;
+		next = min(vma->vm_end, end);
+		next = round_up(next, PAGE_SIZE);
+		/* This is a best effort so we ignore errors */
+		i915_devmem_migrate_vma(mem, vma, addr, next);
+		addr = next;
+	}
+
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 4622c086c06d..95e1eff0c0b0 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -71,6 +71,7 @@ struct intel_memory_region_ops {
 struct intel_memory_region {
 	struct drm_i915_private *i915;
 
+	struct i915_devmem *devmem;
 	const struct intel_memory_region_ops *ops;
 
 	struct io_mapping iomap;
@@ -100,6 +101,19 @@ struct intel_memory_region {
 	} objects;
 };
 
+static inline enum intel_region_id
+__region_id(u32 region)
+{
+	enum intel_region_id id;
+
+	for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) {
+		if (intel_region_map[id] == region)
+			return id;
+	}
+
+	return INTEL_REGION_UNKNOWN;
+}
+
 int intel_memory_region_init_buddy(struct intel_memory_region *mem);
 void intel_memory_region_release_buddy(struct intel_memory_region *mem);
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index eddb392917aa..2ba4a4720eb6 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -4,6 +4,7 @@
  */
 
 #include "i915_drv.h"
+#include "i915_svm.h"
 #include "intel_memory_region.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
@@ -66,6 +67,7 @@ static void release_fake_lmem_bar(struct intel_memory_region *mem)
 static void
 region_lmem_release(struct intel_memory_region *mem)
 {
+	i915_svm_devmem_remove(mem->devmem);
 	release_fake_lmem_bar(mem);
 	io_mapping_fini(&mem->iomap);
 	intel_memory_region_release_buddy(mem);
@@ -122,6 +124,14 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
 					 PAGE_SIZE,
 					 io_start,
 					 &intel_region_lmem_ops);
+	if (!IS_ERR(mem)) {
+		mem->devmem = i915_svm_devmem_add(i915, mappable_end);
+		if (!mem->devmem) {
+			intel_memory_region_put(mem);
+			mem = ERR_PTR(-ENOMEM);
+		}
+	}
+
 	if (!IS_ERR(mem)) {
 		DRM_INFO("Intel graphics fake LMEM: %pR\n", &mem->region);
 		DRM_INFO("Intel graphics fake LMEM IO start: %llx\n",
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 07/13] drm/i915/svm: Device memory support
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Plugin device memory through HMM as DEVICE_PRIVATE.
Add support functions to allocate pages and free pages from device memory.
Implement ioctl to migrate pages from host to device memory.
For now, only support migrating pages from host memory to device memory.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig               |   9 +
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  13 -
 drivers/gpu/drm/i915/i915_buddy.h          |  12 +
 drivers/gpu/drm/i915/i915_drv.c            |   1 +
 drivers/gpu/drm/i915/i915_svm.c            |   2 +
 drivers/gpu/drm/i915/i915_svm.h            |  15 +
 drivers/gpu/drm/i915/i915_svm_devmem.c     | 391 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.h |  14 +
 drivers/gpu/drm/i915/intel_region_lmem.c   |  10 +
 10 files changed, 456 insertions(+), 14 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 689e57fe3973..66337f2ca2bf 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -141,9 +141,18 @@ config DRM_I915_SVM
 	bool "Enable Shared Virtual Memory support in i915"
 	depends on STAGING
 	depends on DRM_I915
+	depends on ARCH_ENABLE_MEMORY_HOTPLUG
+	depends on ARCH_ENABLE_MEMORY_HOTREMOVE
+	depends on MEMORY_HOTPLUG
+	depends on MEMORY_HOTREMOVE
+	depends on ARCH_HAS_PTE_DEVMAP
+	depends on SPARSEMEM_VMEMMAP
+	depends on ZONE_DEVICE
+	depends on DEVICE_PRIVATE
 	depends on MMU
 	select HMM_MIRROR
 	select MMU_NOTIFIER
+	select MIGRATE_VMA_HELPER
 	default n
 	help
 	  Choose this option if you want Shared Virtual Memory (SVM)
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7d4cd9eefd12..b574ec31ea2e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -155,7 +155,8 @@ i915-y += \
 
 # SVM code
 i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
-			       i915_svm.o
+			       i915_svm.o \
+			       i915_svm_devmem.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index a7dee1b749cb..dd88fa87b7fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -480,19 +480,6 @@ int __init i915_global_objects_init(void)
 	return 0;
 }
 
-static enum intel_region_id
-__region_id(u32 region)
-{
-	enum intel_region_id id;
-
-	for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) {
-		if (intel_region_map[id] == region)
-			return id;
-	}
-
-	return INTEL_REGION_UNKNOWN;
-}
-
 bool
 i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h
index ed41f3507cdc..afc493e6c130 100644
--- a/drivers/gpu/drm/i915/i915_buddy.h
+++ b/drivers/gpu/drm/i915/i915_buddy.h
@@ -9,6 +9,9 @@
 #include <linux/bitops.h>
 #include <linux/list.h>
 
+/* 512 bits (one per pages) supports 2MB blocks */
+#define I915_BUDDY_MAX_PAGES   512
+
 struct i915_buddy_block {
 #define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
 #define I915_BUDDY_HEADER_STATE  GENMASK_ULL(11, 10)
@@ -32,6 +35,15 @@ struct i915_buddy_block {
 	 */
 	struct list_head link;
 	struct list_head tmp_link;
+
+	unsigned long pfn_first;
+	/*
+	 * FIXME: There are other alternatives to bitmap. Like splitting the
+	 * block into contiguous 4K sized blocks. But it is part of bigger
+	 * issues involving partially invalidating large mapping, freeing the
+	 * blocks etc., revisit.
+	 */
+	unsigned long bitmap[BITS_TO_LONGS(I915_BUDDY_MAX_PAGES)];
 };
 
 #define I915_BUDDY_MAX_ORDER  I915_BUDDY_HEADER_ORDER
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c190df614c48..740b4b9d39a8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2771,6 +2771,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_SVM_MIGRATE, i915_svm_migrate_ioctl, DRM_RENDER_ALLOW)
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c
index 4e414f257121..fe7d53634606 100644
--- a/drivers/gpu/drm/i915/i915_svm.c
+++ b/drivers/gpu/drm/i915/i915_svm.c
@@ -206,6 +206,8 @@ int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args)
 	if (unlikely(ret))
 		goto vma_done;
 
+	/* XXX: Assuming the range is exclusively LMEM or SMEM, fix it */
+	i915_dmem_convert_pfn(vm->i915, &range);
 	sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
 
 	mutex_lock(&svm->mutex);
diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h
index f176f1dc493f..a1b62997e925 100644
--- a/drivers/gpu/drm/i915/i915_svm.h
+++ b/drivers/gpu/drm/i915/i915_svm.h
@@ -33,6 +33,14 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
 	return vm->svm;
 }
 
+void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv,
+			   struct hmm_range *range);
+int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv);
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915,
+					u64 size);
+void i915_svm_devmem_remove(struct i915_devmem *devmem);
+
 #else
 
 struct i915_svm { };
@@ -45,6 +53,13 @@ static inline int i915_svm_bind_mm(struct i915_address_space *vm)
 static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
 { return false; }
 
+static inline int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+					 struct drm_file *file_priv)
+{ return -ENOTSUPP; }
+static inline
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size)
+{ return NULL; }
+static inline void i915_svm_devmem_remove(struct i915_devmem *devmem) { }
 #endif
 
 #endif /* __I915_SVM_H */
diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c
new file mode 100644
index 000000000000..40c2f79ff614
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm_devmem.c
@@ -0,0 +1,391 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/mm_types.h>
+#include <linux/sched/mm.h>
+
+#include "i915_svm.h"
+#include "intel_memory_region.h"
+
+struct i915_devmem_migrate {
+	struct drm_i915_private *i915;
+	struct migrate_vma *args;
+
+	enum intel_region_id src_id;
+	enum intel_region_id dst_id;
+	u64 npages;
+};
+
+struct i915_devmem {
+	struct drm_i915_private *i915;
+	struct dev_pagemap pagemap;
+	unsigned long pfn_first;
+	unsigned long pfn_last;
+};
+
+static inline bool
+i915_dmem_page(struct drm_i915_private *dev_priv, struct page *page)
+{
+	if (!is_device_private_page(page))
+		return false;
+
+	return true;
+}
+
+void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv,
+			   struct hmm_range *range)
+{
+	unsigned long i, npages;
+
+	npages = (range->end - range->start) >> PAGE_SHIFT;
+	for (i = 0; i < npages; ++i) {
+		struct i915_buddy_block *block;
+		struct intel_memory_region *mem;
+		struct page *page;
+		u64 addr;
+
+		page = hmm_device_entry_to_page(range, range->pfns[i]);
+		if (!page)
+			continue;
+
+		if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE]))
+			continue;
+
+		if (!i915_dmem_page(dev_priv, page)) {
+			WARN(1, "Some unknown device memory !\n");
+			range->pfns[i] = 0;
+			continue;
+		}
+
+		block = page->zone_device_data;
+		mem = block->private;
+		addr = mem->region.start +
+		       i915_buddy_block_offset(block);
+		addr += (page_to_pfn(page) - block->pfn_first) << PAGE_SHIFT;
+
+		range->pfns[i] &= ~range->flags[HMM_PFN_DEVICE_PRIVATE];
+		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
+		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
+	}
+}
+
+static int
+i915_devmem_page_alloc_locked(struct intel_memory_region *mem,
+			      unsigned long npages,
+			      struct list_head *blocks)
+{
+	unsigned long size = ALIGN((npages * PAGE_SIZE), mem->mm.chunk_size);
+	struct i915_buddy_block *block;
+	int ret;
+
+	INIT_LIST_HEAD(blocks);
+	ret = __intel_memory_region_get_pages_buddy(mem, size, 0, blocks);
+	if (unlikely(ret))
+		goto alloc_failed;
+
+	list_for_each_entry(block, blocks, link) {
+		block->pfn_first = mem->devmem->pfn_first;
+		block->pfn_first += i915_buddy_block_offset(block) /
+				    PAGE_SIZE;
+		bitmap_zero(block->bitmap, I915_BUDDY_MAX_PAGES);
+		DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n",
+				 "Allocated block", block->pfn_first,
+				 i915_buddy_block_offset(block),
+				 i915_buddy_block_size(&mem->mm, block));
+	}
+
+alloc_failed:
+	return ret;
+}
+
+static struct page *
+i915_devmem_page_get_locked(struct intel_memory_region *mem,
+			    struct list_head *blocks)
+{
+	struct i915_buddy_block *block, *on;
+
+	list_for_each_entry_safe(block, on, blocks, link) {
+		unsigned long weight, max;
+		unsigned long i, pfn;
+		struct page *page;
+
+		max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE;
+		i = find_first_zero_bit(block->bitmap, max);
+		if (unlikely(i == max)) {
+			WARN(1, "Getting a page should have never failed\n");
+			break;
+		}
+
+		set_bit(i, block->bitmap);
+		pfn = block->pfn_first + i;
+		page = pfn_to_page(pfn);
+		get_page(page);
+		lock_page(page);
+		page->zone_device_data = block;
+		weight = bitmap_weight(block->bitmap, max);
+		if (weight == max)
+			list_del_init(&block->link);
+		DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n",
+				 "Allocated page", pfn, weight);
+		return page;
+	}
+	return NULL;
+}
+
+static void
+i915_devmem_page_free_locked(struct drm_i915_private *dev_priv,
+			     struct page *page)
+{
+	unlock_page(page);
+	put_page(page);
+}
+
+static int
+i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
+{
+	struct drm_i915_private *i915 = migrate->i915;
+	struct migrate_vma *args = migrate->args;
+	struct intel_memory_region *mem;
+	struct list_head blocks = {0};
+	unsigned long i, npages, cnt;
+	struct page *page;
+	int ret;
+
+	npages = (args->end - args->start) >> PAGE_SHIFT;
+	DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages);
+
+	/* Check source pages */
+	for (i = 0, cnt = 0; i < npages; i++) {
+		args->dst[i] = 0;
+		page = migrate_pfn_to_page(args->src[i]);
+		if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE)))
+			continue;
+
+		args->dst[i] = MIGRATE_PFN_VALID;
+		cnt++;
+	}
+
+	if (!cnt) {
+		ret = -ENOMEM;
+		goto migrate_out;
+	}
+
+	mem = i915->mm.regions[migrate->dst_id];
+	ret = i915_devmem_page_alloc_locked(mem, cnt, &blocks);
+	if (unlikely(ret))
+		goto migrate_out;
+
+	/* Allocate device memory */
+	for (i = 0, cnt = 0; i < npages; i++) {
+		if (!args->dst[i])
+			continue;
+
+		page = i915_devmem_page_get_locked(mem, &blocks);
+		if (unlikely(!page)) {
+			WARN(1, "Failed to get dst page\n");
+			args->dst[i] = 0;
+			continue;
+		}
+
+		cnt++;
+		args->dst[i] = migrate_pfn(page_to_pfn(page)) |
+			       MIGRATE_PFN_LOCKED;
+	}
+
+	if (!cnt) {
+		ret = -ENOMEM;
+		goto migrate_out;
+	}
+
+	/* Copy the pages */
+	migrate->npages = npages;
+migrate_out:
+	if (unlikely(ret)) {
+		for (i = 0; i < npages; i++) {
+			if (args->dst[i] & MIGRATE_PFN_LOCKED) {
+				page = migrate_pfn_to_page(args->dst[i]);
+				i915_devmem_page_free_locked(i915, page);
+			}
+			args->dst[i] = 0;
+		}
+	}
+
+	return ret;
+}
+
+static void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate)
+{
+	DRM_DEBUG_DRIVER("npages %llu\n", migrate->npages);
+}
+
+static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate)
+{
+	int ret;
+
+	ret = i915_devmem_migrate_alloc_and_copy(migrate);
+	if (!ret) {
+		migrate_vma_pages(migrate->args);
+		i915_devmem_migrate_finalize_and_map(migrate);
+	}
+	migrate_vma_finalize(migrate->args);
+}
+
+int i915_devmem_migrate_vma(struct intel_memory_region *mem,
+			    struct vm_area_struct *vma,
+			    unsigned long start,
+			    unsigned long end)
+{
+	unsigned long npages = (end - start) >> PAGE_SHIFT;
+	unsigned long max = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages);
+	struct i915_devmem_migrate migrate = {0};
+	struct migrate_vma args = {
+		.vma		= vma,
+		.start		= start,
+	};
+	unsigned long c, i;
+	int ret = 0;
+
+	/* XXX: Opportunistically migrate additional pages? */
+	DRM_DEBUG_DRIVER("start 0x%lx end 0x%lx\n", start, end);
+	args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
+	if (unlikely(!args.src))
+		return -ENOMEM;
+
+	args.dst = kcalloc(max, sizeof(*args.dst), GFP_KERNEL);
+	if (unlikely(!args.dst)) {
+		kfree(args.src);
+		return -ENOMEM;
+	}
+
+	/* XXX: Support migrating from LMEM to SMEM */
+	migrate.args = &args;
+	migrate.i915 = mem->i915;
+	migrate.src_id = INTEL_REGION_SMEM;
+	migrate.dst_id = MEMORY_TYPE_FROM_REGION(mem->id);
+	for (i = 0; i < npages; i += c) {
+		c = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages - i);
+		args.end = args.start + (c << PAGE_SHIFT);
+		ret = migrate_vma_setup(&args);
+		if (unlikely(ret))
+			goto migrate_done;
+		if (args.cpages)
+			i915_devmem_migrate_chunk(&migrate);
+		args.start = args.end;
+	}
+migrate_done:
+	kfree(args.dst);
+	kfree(args.src);
+	return ret;
+}
+
+static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf)
+{
+	return VM_FAULT_SIGBUS;
+}
+
+static void i915_devmem_page_free(struct page *page)
+{
+	struct i915_buddy_block *block = page->zone_device_data;
+	struct intel_memory_region *mem = block->private;
+	unsigned long i, max, weight;
+
+	max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE;
+	i = page_to_pfn(page) - block->pfn_first;
+	clear_bit(i, block->bitmap);
+	weight = bitmap_weight(block->bitmap, max);
+	DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n",
+			 "Freeing page", page_to_pfn(page), weight);
+	if (!weight) {
+		DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n",
+				 "Freeing block", block->pfn_first,
+				 i915_buddy_block_offset(block),
+				 i915_buddy_block_size(&mem->mm, block));
+		__intel_memory_region_put_block_buddy(block);
+	}
+}
+
+static const struct dev_pagemap_ops i915_devmem_pagemap_ops = {
+	.page_free = i915_devmem_page_free,
+	.migrate_to_ram = i915_devmem_migrate_to_ram,
+};
+
+struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size)
+{
+	struct device *dev = &i915->drm.pdev->dev;
+	struct i915_devmem *devmem;
+	struct resource *res;
+
+	devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
+	if (!devmem)
+		return NULL;
+
+	devmem->i915 = i915;
+	res = devm_request_free_mem_region(dev, &iomem_resource, size);
+	if (IS_ERR(res))
+		goto out_free;
+
+	devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+	devmem->pagemap.res = *res;
+	devmem->pagemap.ops = &i915_devmem_pagemap_ops;
+	if (IS_ERR(devm_memremap_pages(dev, &devmem->pagemap)))
+		goto out_free;
+
+	devmem->pfn_first = res->start >> PAGE_SHIFT;
+	devmem->pfn_last = res->end >> PAGE_SHIFT;
+	return devmem;
+out_free:
+	kfree(devmem);
+	return NULL;
+}
+
+void i915_svm_devmem_remove(struct i915_devmem *devmem)
+{
+	/* XXX: Is it the right way to release? */
+	release_resource(&devmem->pagemap.res);
+	kfree(devmem);
+}
+
+int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_svm_migrate *args = data;
+	unsigned long addr, end, size = args->length;
+	struct intel_memory_region *mem;
+	enum intel_region_id id;
+	struct mm_struct *mm;
+
+	DRM_DEBUG_DRIVER("start 0x%llx length 0x%llx region 0x%x\n",
+			 args->start, args->length, args->region);
+	id = __region_id(args->region);
+	if ((MEMORY_TYPE_FROM_REGION(args->region) != INTEL_MEMORY_LOCAL) ||
+	    id == INTEL_REGION_UNKNOWN)
+		return -EINVAL;
+
+	mem = i915->mm.regions[id];
+
+	mm = get_task_mm(current);
+	down_read(&mm->mmap_sem);
+
+	for (addr = args->start, end = args->start + size; addr < end;) {
+		struct vm_area_struct *vma;
+		unsigned long next;
+
+		vma = find_vma_intersection(mm, addr, end);
+		if (!vma)
+			break;
+
+		addr &= PAGE_MASK;
+		next = min(vma->vm_end, end);
+		next = round_up(next, PAGE_SIZE);
+		/* This is a best effort so we ignore errors */
+		i915_devmem_migrate_vma(mem, vma, addr, next);
+		addr = next;
+	}
+
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 4622c086c06d..95e1eff0c0b0 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -71,6 +71,7 @@ struct intel_memory_region_ops {
 struct intel_memory_region {
 	struct drm_i915_private *i915;
 
+	struct i915_devmem *devmem;
 	const struct intel_memory_region_ops *ops;
 
 	struct io_mapping iomap;
@@ -100,6 +101,19 @@ struct intel_memory_region {
 	} objects;
 };
 
+static inline enum intel_region_id
+__region_id(u32 region)
+{
+	enum intel_region_id id;
+
+	for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) {
+		if (intel_region_map[id] == region)
+			return id;
+	}
+
+	return INTEL_REGION_UNKNOWN;
+}
+
 int intel_memory_region_init_buddy(struct intel_memory_region *mem);
 void intel_memory_region_release_buddy(struct intel_memory_region *mem);
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index eddb392917aa..2ba4a4720eb6 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -4,6 +4,7 @@
  */
 
 #include "i915_drv.h"
+#include "i915_svm.h"
 #include "intel_memory_region.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
@@ -66,6 +67,7 @@ static void release_fake_lmem_bar(struct intel_memory_region *mem)
 static void
 region_lmem_release(struct intel_memory_region *mem)
 {
+	i915_svm_devmem_remove(mem->devmem);
 	release_fake_lmem_bar(mem);
 	io_mapping_fini(&mem->iomap);
 	intel_memory_region_release_buddy(mem);
@@ -122,6 +124,14 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
 					 PAGE_SIZE,
 					 io_start,
 					 &intel_region_lmem_ops);
+	if (!IS_ERR(mem)) {
+		mem->devmem = i915_svm_devmem_add(i915, mappable_end);
+		if (IS_ERR(mem->devmem)) {
+			intel_memory_region_put(mem);
+			mem = ERR_CAST(mem->devmem);
+		}
+	}
+
 	if (!IS_ERR(mem)) {
 		DRM_INFO("Intel graphics fake LMEM: %pR\n", &mem->region);
 		DRM_INFO("Intel graphics fake LMEM IO start: %llx\n",
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 08/13] drm/i915/svm: Implicitly migrate pages upon CPU fault
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

As PCIe is a non-coherent link, do not allow direct memory access across
the PCIe link. Handle CPU faults by migrating pages back to host memory.
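
For illustration, the intended user-visible flow is roughly the sketch
below (a hedged sketch: the migrate ioctl and its struct come from the
SVM UAPI patch earlier in this series, while the DRM_IOCTL_I915_SVM_MIGRATE
wrapper name and the fd/region setup here are assumptions):

	/* Userspace sketch; i915_fd, buf, len, lmem_region set up elsewhere */
	struct drm_i915_svm_migrate migrate = {
		.start  = (uint64_t)(uintptr_t)buf, /* SVM range, e.g. malloc() */
		.length = len,
		.region = lmem_region,              /* device memory region */
	};

	/* Best-effort migration of [buf, buf + len) to device memory */
	ioctl(i915_fd, DRM_IOCTL_I915_SVM_MIGRATE, &migrate);

	/*
	 * A later CPU access faults on the DEVICE_PRIVATE entries; this
	 * patch handles the fault by migrating the touched pages back to
	 * host memory, since the non-coherent PCIe link rules out direct
	 * CPU access to device memory.
	 */
	memset(buf, 0, len);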

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_svm_devmem.c | 70 +++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c
index 40c2f79ff614..37a205d3a82f 100644
--- a/drivers/gpu/drm/i915/i915_svm_devmem.c
+++ b/drivers/gpu/drm/i915/i915_svm_devmem.c
@@ -280,9 +280,77 @@ int i915_devmem_migrate_vma(struct intel_memory_region *mem,
 	return ret;
 }
 
+static vm_fault_t
+i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
+{
+	struct device *dev = migrate->i915->drm.dev;
+	struct migrate_vma *args = migrate->args;
+	struct page *dpage, *spage;
+
+	DRM_DEBUG_DRIVER("start 0x%lx\n", args->start);
+	/* Allocate host page */
+	spage = migrate_pfn_to_page(args->src[0]);
+	if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE)))
+		return 0;
+
+	dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, args->start);
+	if (unlikely(!dpage))
+		return VM_FAULT_SIGBUS;
+	lock_page(dpage);
+
+	args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
+
+	/* Copy the pages */
+	migrate->npages = 1;
+
+	return 0;
+}
+
+static void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate)
+{
+	DRM_DEBUG_DRIVER("\n");
+}
+
+static inline struct i915_devmem *page_to_devmem(struct page *page)
+{
+	return container_of(page->pgmap, struct i915_devmem, pagemap);
+}
+
 static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf)
 {
-	return VM_FAULT_SIGBUS;
+	struct i915_devmem *devmem = page_to_devmem(vmf->page);
+	struct drm_i915_private *i915 = devmem->i915;
+	struct i915_devmem_migrate migrate = {0};
+	unsigned long src = 0, dst = 0;
+	vm_fault_t ret;
+	struct migrate_vma args = {
+		.vma		= vmf->vma,
+		.start		= vmf->address,
+		.end		= vmf->address + PAGE_SIZE,
+		.src		= &src,
+		.dst		= &dst,
+	};
+
+	/* XXX: Opportunistically migrate more pages? */
+	DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start);
+	migrate.i915 = i915;
+	migrate.args = &args;
+	migrate.src_id = INTEL_REGION_LMEM;
+	migrate.dst_id = INTEL_REGION_SMEM;
+	if (migrate_vma_setup(&args) < 0)
+		return VM_FAULT_SIGBUS;
+	if (!args.cpages)
+		return 0;
+
+	ret = i915_devmem_fault_alloc_and_copy(&migrate);
+	if (ret || dst == 0)
+		goto done;
+
+	migrate_vma_pages(&args);
+	i915_devmem_fault_finalize_and_map(&migrate);
+done:
+	migrate_vma_finalize(&args);
+	return ret;
 }
 
 static void i915_devmem_page_free(struct page *page)
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 09/13] drm/i915/svm: Page copy support during migration
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Copy the pages during SVM migration using memcpy().
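
The per-page address computation in i915_migrate_cpu_copy() boils down
to the sketch below (illustration only, not code added by this patch;
the helper name is made up): system memory pages are reached through
the kernel linear map, while device memory pages are reached through
the region's io_mapping at an offset relative to the pagemap's first pfn.

	static void *map_page_for_copy(struct intel_memory_region *mem,
				       struct page *page, bool is_smem)
	{
		u64 addr = page_to_pfn(page);

		/* LMEM pfns are offset by where the pagemap was inserted */
		if (!is_smem)
			addr -= mem->devmem->pfn_first;
		addr <<= PAGE_SHIFT;

		/* SMEM: linear map; LMEM: atomic WC map of the aperture */
		return is_smem ? phys_to_virt(addr) :
				 io_mapping_map_atomic_wc(&mem->iomap, addr);
	}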

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_svm_devmem.c | 72 ++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c
index 37a205d3a82f..960b04091e44 100644
--- a/drivers/gpu/drm/i915/i915_svm_devmem.c
+++ b/drivers/gpu/drm/i915/i915_svm_devmem.c
@@ -142,6 +142,69 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv,
 	put_page(page);
 }
 
+static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate)
+{
+	const unsigned long *src = migrate->args->src;
+	unsigned long *dst = migrate->args->dst;
+	struct drm_i915_private *i915 = migrate->i915;
+	struct intel_memory_region *mem;
+	void *src_vaddr, *dst_vaddr;
+	u64 src_addr, dst_addr;
+	struct page *page;
+	int i, ret = 0;
+
+	/* XXX: Copy multiple pages at a time */
+	for (i = 0; !ret && i < migrate->npages; i++) {
+		if (!dst[i])
+			continue;
+
+		page = migrate_pfn_to_page(dst[i]);
+		mem = i915->mm.regions[migrate->dst_id];
+		dst_addr = page_to_pfn(page);
+		if (migrate->dst_id != INTEL_REGION_SMEM)
+			dst_addr -= mem->devmem->pfn_first;
+		dst_addr <<= PAGE_SHIFT;
+
+		if (migrate->dst_id == INTEL_REGION_SMEM)
+			dst_vaddr = phys_to_virt(dst_addr);
+		else
+			dst_vaddr = io_mapping_map_atomic_wc(&mem->iomap,
+							     dst_addr);
+		if (unlikely(!dst_vaddr))
+			return -ENOMEM;
+
+		page = migrate_pfn_to_page(src[i]);
+		mem = i915->mm.regions[migrate->src_id];
+		src_addr = page_to_pfn(page);
+		if (migrate->src_id != INTEL_REGION_SMEM)
+			src_addr -= mem->devmem->pfn_first;
+		src_addr <<= PAGE_SHIFT;
+
+		if (migrate->src_id == INTEL_REGION_SMEM)
+			src_vaddr = phys_to_virt(src_addr);
+		else
+			src_vaddr = io_mapping_map_atomic_wc(&mem->iomap,
+							     src_addr);
+
+		if (likely(src_vaddr))
+			memcpy(dst_vaddr, src_vaddr, PAGE_SIZE);
+		else
+			ret = -ENOMEM;
+
+		if (migrate->dst_id != INTEL_REGION_SMEM)
+			io_mapping_unmap_atomic(dst_vaddr);
+
+		if (migrate->src_id != INTEL_REGION_SMEM && src_vaddr)
+			io_mapping_unmap_atomic(src_vaddr);
+
+		DRM_DEBUG_DRIVER("src [%d] 0x%llx, dst [%d] 0x%llx\n",
+				 migrate->src_id, src_addr,
+				 migrate->dst_id, dst_addr);
+	}
+
+	return ret;
+}
+
 static int
 i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 {
@@ -201,6 +264,8 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 
 	/* Copy the pages */
 	migrate->npages = npages;
+	/* XXX: Flush the caches? */
+	ret = i915_migrate_cpu_copy(migrate);
 migrate_out:
 	if (unlikely(ret)) {
 		for (i = 0; i < npages; i++) {
@@ -286,6 +351,7 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
 	struct device *dev = migrate->i915->drm.dev;
 	struct migrate_vma *args = migrate->args;
 	struct page *dpage, *spage;
+	int err;
 
 	DRM_DEBUG_DRIVER("start 0x%lx\n", args->start);
 	/* Allocate host page */
@@ -302,6 +368,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
 
 	/* Copy the pages */
 	migrate->npages = 1;
+	err = i915_migrate_cpu_copy(migrate);
+	if (unlikely(err)) {
+		i915_devmem_page_free_locked(migrate->i915, dpage);
+		args->dst[0] = 0;
+		return VM_FAULT_SIGBUS;
+	}
 
 	return 0;
 }
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 10/13] drm/i915/svm: Add functions to blitter copy SVM buffers
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Add a support function to blitter-copy SVM VAs without requiring any
GEM objects. Also add a function to wait for completion of the copy.
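
A minimal sketch of the intended calling sequence (the wrapper below is
illustrative rather than part of the patch, and fence reference
handling is elided):

	static int svm_blt_copy_sync(struct intel_context *ce,
				     u64 src, u64 dst, u64 size)
	{
		struct dma_fence *fence = NULL;
		int err;

		/* Emit the blitter copy batch and publish its fence */
		err = i915_svm_copy_blt(ce, src, dst, size, &fence);
		if (err)
			return err;

		/* Block until the blitter has finished the copy */
		return i915_svm_copy_blt_wait(ce->vm->i915, fence);
	}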

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c   |   2 +-
 drivers/gpu/drm/i915/i915_svm.h            |   6 +
 drivers/gpu/drm/i915/i915_svm_copy.c       | 172 +++++++++++++++++++++
 5 files changed, 184 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm_copy.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b574ec31ea2e..97d40172bf27 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -156,7 +156,8 @@ i915-y += \
 # SVM code
 i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
 			       i915_svm.o \
-			       i915_svm_devmem.o
+			       i915_svm_devmem.o \
+			       i915_svm_copy.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index eec31d9a4fa2..3fe697659366 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -481,6 +481,9 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
 int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 			 unsigned int flags,
 			 long timeout);
+long i915_gem_object_wait_fence(struct dma_fence *fence,
+				unsigned int flags,
+				long timeout);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
 				  const struct i915_sched_attr *attr);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 8af55cd3e690..b7905aa8f821 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -12,7 +12,7 @@
 #include "i915_gem_ioctls.h"
 #include "i915_gem_object.h"
 
-static long
+long
 i915_gem_object_wait_fence(struct dma_fence *fence,
 			   unsigned int flags,
 			   long timeout)
diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h
index a1b62997e925..5d28e531f737 100644
--- a/drivers/gpu/drm/i915/i915_svm.h
+++ b/drivers/gpu/drm/i915/i915_svm.h
@@ -33,6 +33,12 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
 	return vm->svm;
 }
 
+int i915_svm_copy_blt(struct intel_context *ce,
+		      u64 src_start, u64 dst_start, u64 size,
+		      struct dma_fence **fence);
+int i915_svm_copy_blt_wait(struct drm_i915_private *i915,
+			   struct dma_fence *fence);
+
 void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv,
 			   struct hmm_range *range);
 int i915_svm_migrate_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_svm_copy.c b/drivers/gpu/drm/i915/i915_svm_copy.c
new file mode 100644
index 000000000000..42f7d563f6b4
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm_copy.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "gem/i915_gem_object_blt.h"
+#include "gt/intel_engine_pm.h"
+#include "gt/intel_engine_pool.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+static struct i915_vma *
+intel_emit_svm_copy_blt(struct intel_context *ce,
+			u64 src_start, u64 dst_start, u64 buff_size)
+{
+	struct drm_i915_private *i915 = ce->vm->i915;
+	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
+	struct intel_engine_pool_node *pool;
+	struct i915_vma *batch;
+	u64 count, rem;
+	u32 size, *cmd;
+	int err;
+
+	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
+	if (INTEL_GEN(i915) < 8)
+		return ERR_PTR(-ENOTSUPP);
+
+	intel_engine_pm_get(ce->engine);
+
+	count = div_u64(round_up(buff_size, block_size), block_size);
+	size = (1 + 11 * count) * sizeof(u32);
+	size = round_up(size, PAGE_SIZE);
+	pool = intel_engine_get_pool(ce->engine, size);
+	if (IS_ERR(pool)) {
+		err = PTR_ERR(pool);
+		goto out_pm;
+	}
+
+	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC);
+	if (IS_ERR(cmd)) {
+		err = PTR_ERR(cmd);
+		goto out_put;
+	}
+
+	rem = buff_size;
+	do {
+		size = min_t(u64, rem, block_size);
+		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
+
+		if (INTEL_GEN(i915) >= 9) {
+			*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
+			*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
+			*cmd++ = 0;
+			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+			*cmd++ = lower_32_bits(dst_start);
+			*cmd++ = upper_32_bits(dst_start);
+			*cmd++ = 0;
+			*cmd++ = PAGE_SIZE;
+			*cmd++ = lower_32_bits(src_start);
+			*cmd++ = upper_32_bits(src_start);
+		} else if (INTEL_GEN(i915) >= 8) {
+			*cmd++ = XY_SRC_COPY_BLT_CMD |
+				 BLT_WRITE_RGBA | (10 - 2);
+			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+			*cmd++ = 0;
+			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+			*cmd++ = lower_32_bits(dst_start);
+			*cmd++ = upper_32_bits(dst_start);
+			*cmd++ = 0;
+			*cmd++ = PAGE_SIZE;
+			*cmd++ = lower_32_bits(src_start);
+			*cmd++ = upper_32_bits(src_start);
+		}
+
+		/* Allow ourselves to be preempted in between blocks */
+		*cmd++ = MI_ARB_CHECK;
+
+		src_start += size;
+		dst_start += size;
+		rem -= size;
+	} while (rem);
+
+	*cmd = MI_BATCH_BUFFER_END;
+	intel_gt_chipset_flush(ce->vm->gt);
+
+	i915_gem_object_unpin_map(pool->obj);
+
+	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
+	if (IS_ERR(batch)) {
+		err = PTR_ERR(batch);
+		goto out_put;
+	}
+
+	err = i915_vma_pin(batch, 0, 0, PIN_USER);
+	if (unlikely(err))
+		goto out_put;
+
+	batch->private = pool;
+	return batch;
+
+out_put:
+	intel_engine_pool_put(pool);
+out_pm:
+	intel_engine_pm_put(ce->engine);
+	return ERR_PTR(err);
+}
+
+int i915_svm_copy_blt(struct intel_context *ce,
+		      u64 src_start, u64 dst_start, u64 size,
+		      struct dma_fence **fence)
+{
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+	struct i915_request *rq;
+	struct i915_vma *batch;
+	int err;
+
+	DRM_DEBUG_DRIVER("src_start 0x%llx dst_start 0x%llx size 0x%llx\n",
+			 src_start, dst_start, size);
+	mutex_lock(&i915->drm.struct_mutex);
+	batch = intel_emit_svm_copy_blt(ce, src_start, dst_start, size);
+	if (IS_ERR(batch)) {
+		err = PTR_ERR(batch);
+		goto out_unlock;
+	}
+
+	rq = i915_request_create(ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_batch;
+	}
+
+	err = intel_emit_vma_mark_active(batch, rq);
+	if (unlikely(err))
+		goto out_request;
+
+	if (ce->engine->emit_init_breadcrumb) {
+		err = ce->engine->emit_init_breadcrumb(rq);
+		if (unlikely(err))
+			goto out_request;
+	}
+
+	err = rq->engine->emit_bb_start(rq,
+					batch->node.start, batch->node.size,
+					0);
+out_request:
+	if (unlikely(err))
+		i915_request_skip(rq, err);
+	else
+		*fence = &rq->fence;
+
+	i915_request_add(rq);
+out_batch:
+	intel_emit_vma_release(ce, batch);
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
+int i915_svm_copy_blt_wait(struct drm_i915_private *i915,
+			   struct dma_fence *fence)
+{
+	long timeout;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	timeout = i915_gem_object_wait_fence(fence,
+					     I915_WAIT_INTERRUPTIBLE |
+					     I915_WAIT_ALL,
+					     MAX_SCHEDULE_TIMEOUT);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return (timeout < 0) ? timeout : 0;
+}
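
A minimal caller sketch for the two helpers above (illustrative only;
"ce" is assumed to be a blitter-capable kernel context whose VM already
has the source and destination ranges bound, as done later in the
series):

	struct dma_fence *fence = NULL;
	int err;

	/* emit the copy and receive the fence of the issued request */
	err = i915_svm_copy_blt(ce, src_start, dst_start, size, &fence);
	if (!err)
		/* block (interruptibly) until the blitter copy completes */
		err = i915_svm_copy_blt_wait(i915, fence);
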
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 11/13] drm/i915/svm: Use blitter copy for migration
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Use the blitter engine to copy pages during migration.
As the blitter context's virtual address space is shared with other
flows, ensure virtual addresses are allocated properly from that
address space. Also ensure completion of the blitter copy by waiting
on the fence of the issued request; the overall flow is sketched below.
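
In outline, the blitter path added below does the following (sketch
only; the exact helpers and error handling are in the diff):

	migrate->ce = i915->engine[BCS0]->kernel_context;
	err = i915_devmem_bind_addr(migrate, true);  /* bind source, read-only */
	if (!err)
		err = i915_devmem_bind_addr(migrate, false);  /* bind destination */
	if (!err)
		err = i915_svm_copy_blt(migrate->ce, migrate->src_node.start,
					migrate->dst_node.start,
					migrate->npages * PAGE_SIZE,
					&migrate->fence);
	/* completion: i915_svm_copy_blt_wait() in the finalize callbacks,
	 * followed by unbind and dma_unmap */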

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_svm_devmem.c | 249 ++++++++++++++++++++++++-
 1 file changed, 245 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c
index 960b04091e44..4f772983679b 100644
--- a/drivers/gpu/drm/i915/i915_svm_devmem.c
+++ b/drivers/gpu/drm/i915/i915_svm_devmem.c
@@ -15,7 +15,15 @@ struct i915_devmem_migrate {
 
 	enum intel_region_id src_id;
 	enum intel_region_id dst_id;
+	dma_addr_t *host_dma;
+	bool blitter_copy;
 	u64 npages;
+
+	/* required for blitter copy */
+	struct drm_mm_node src_node;
+	struct drm_mm_node dst_node;
+	struct intel_context *ce;
+	struct dma_fence *fence;
 };
 
 struct i915_devmem {
@@ -142,6 +150,139 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv,
 	put_page(page);
 }
 
+static int i915_devmem_bind_addr(struct i915_devmem_migrate *migrate,
+				 bool is_src)
+{
+	struct i915_gem_context *ctx = migrate->ce->gem_context;
+	struct drm_i915_private *i915 = migrate->i915;
+	struct i915_address_space *vm = ctx->vm;
+	struct intel_memory_region *mem;
+	u64 npages = migrate->npages;
+	enum intel_region_id mem_id;
+	struct drm_mm_node *node;
+	struct scatterlist *sg;
+	u32 sg_page_sizes = 0;
+	struct sg_table st;
+	u64 flags = 0;
+	int i, ret;
+
+	if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL)))
+		return -ENOMEM;
+
+	if (is_src) {
+		node = &migrate->src_node;
+		mem_id = migrate->src_id;
+		flags |= I915_GTT_SVM_READONLY;
+	} else {
+		node = &migrate->dst_node;
+		mem_id = migrate->dst_id;
+	}
+
+	mutex_lock(&vm->mutex);
+	ret = i915_gem_gtt_insert(vm, node, npages * PAGE_SIZE,
+				  I915_GTT_PAGE_SIZE_2M, 0,
+				  0, vm->total, PIN_USER);
+	mutex_unlock(&vm->mutex);
+	if (unlikely(ret)) {
+		sg_free_table(&st);
+		return ret;
+	}
+
+	sg = NULL;
+	st.nents = 0;
+
+	/*
+	 * XXX: If the source page is missing, source it from a temporary
+	 * zero filled page. If needed, destination page missing scenarios
+	 * can be similarly handled by draining data into a temporary page.
+	 */
+	for (i = 0; i < npages; i++) {
+		u64 addr;
+
+		if (mem_id == INTEL_REGION_SMEM) {
+			addr = migrate->host_dma[i];
+		} else {
+			struct page *page;
+			unsigned long mpfn;
+
+			mpfn = is_src ? migrate->args->src[i] :
+					migrate->args->dst[i];
+			page = migrate_pfn_to_page(mpfn);
+			mem = i915->mm.regions[mem_id];
+			addr = page_to_pfn(page) - mem->devmem->pfn_first;
+			addr <<= PAGE_SHIFT;
+			addr += mem->region.start;
+		}
+
+		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
+			sg->length += PAGE_SIZE;
+			sg_dma_len(sg) += PAGE_SIZE;
+			continue;
+		}
+
+		if (sg)
+			sg_page_sizes |= sg->length;
+
+		sg = sg ? __sg_next(sg) : st.sgl;
+		sg_dma_address(sg) = addr;
+		sg_dma_len(sg) = PAGE_SIZE;
+		sg->length = PAGE_SIZE;
+		st.nents++;
+	}
+
+	sg_page_sizes |= sg->length;
+	sg_mark_end(sg);
+	i915_sg_trim(&st);
+
+	ret = svm_bind_addr(vm, node->start, npages * PAGE_SIZE,
+			    flags, &st, sg_page_sizes);
+	sg_free_table(&st);
+
+	return ret;
+}
+
+static void i915_devmem_unbind_addr(struct i915_devmem_migrate *migrate,
+				    bool is_src)
+{
+	struct i915_gem_context *ctx = migrate->ce->gem_context;
+	struct i915_address_space *vm = ctx->vm;
+	struct drm_mm_node *node;
+
+	node = is_src ? &migrate->src_node : &migrate->dst_node;
+	svm_unbind_addr(vm, node->start, migrate->npages * PAGE_SIZE);
+	mutex_lock(&vm->mutex);
+	drm_mm_remove_node(node);
+	mutex_unlock(&vm->mutex);
+}
+
+static int i915_migrate_blitter_copy(struct i915_devmem_migrate *migrate)
+{
+	struct drm_i915_private *i915 = migrate->i915;
+	int ret;
+
+	migrate->ce = i915->engine[BCS0]->kernel_context;
+	ret = i915_devmem_bind_addr(migrate, true);
+	if (unlikely(ret))
+		return ret;
+
+	ret = i915_devmem_bind_addr(migrate, false);
+	if (unlikely(ret)) {
+		i915_devmem_unbind_addr(migrate, true);
+		return ret;
+	}
+
+	DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages);
+	ret = i915_svm_copy_blt(migrate->ce,
+				migrate->src_node.start,
+				migrate->dst_node.start,
+				migrate->npages * PAGE_SIZE,
+				&migrate->fence);
+	if (unlikely(ret)) {
+		i915_devmem_unbind_addr(migrate, false);
+		i915_devmem_unbind_addr(migrate, true);
+	}
+
+	return ret;
+}
+
 static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate)
 {
 	const unsigned long *src = migrate->args->src;
@@ -210,6 +351,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 {
 	struct drm_i915_private *i915 = migrate->i915;
 	struct migrate_vma *args = migrate->args;
+	struct device *dev = i915->drm.dev;
 	struct intel_memory_region *mem;
 	struct list_head blocks = {0};
 	unsigned long i, npages, cnt;
@@ -218,13 +360,35 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 
 	npages = (args->end - args->start) >> PAGE_SHIFT;
 	DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages);
+	/*
+	 * XXX: Set proper condition for cpu vs blitter copy for performance,
+	 * functionality and to avoid any deadlocks around blitter usage.
+	 */
+	migrate->blitter_copy = true;
+
+	if (migrate->blitter_copy) {
+		/* Allocate storage for DMA addresses, so we can unmap later. */
+		migrate->host_dma = kcalloc(npages, sizeof(*migrate->host_dma),
+					    GFP_KERNEL);
+		if (unlikely(!migrate->host_dma))
+			migrate->blitter_copy = false;
+	}
 
-	/* Check source pages */
+	/* Check and DMA map source pages */
 	for (i = 0, cnt = 0; i < npages; i++) {
 		args->dst[i] = 0;
 		page = migrate_pfn_to_page(args->src[i]);
-		if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE)))
+		if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) {
+			migrate->blitter_copy = false;
 			continue;
+		}
+
+		/* Skip the DMA mapping if the host_dma array allocation failed */
+		if (migrate->blitter_copy) {
+			migrate->host_dma[i] = dma_map_page(dev, page, 0,
+							    PAGE_SIZE,
+							    PCI_DMA_TODEVICE);
+			if (unlikely(dma_mapping_error(dev,
+						       migrate->host_dma[i]))) {
+				migrate->blitter_copy = false;
+				migrate->host_dma[i] = 0;
+			}
+		}
 
 		args->dst[i] = MIGRATE_PFN_VALID;
 		cnt++;
@@ -248,6 +412,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 		page = i915_devmem_page_get_locked(mem, &blocks);
 		if (unlikely(!page)) {
 			WARN(1, "Failed to get dst page\n");
+			migrate->blitter_copy = false;
 			args->dst[i] = 0;
 			continue;
 		}
@@ -265,7 +430,11 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 	/* Copy the pages */
 	migrate->npages = npages;
 	/* XXX: Flush the caches? */
-	ret = i915_migrate_cpu_copy(migrate);
+	/* XXX: Fallback to cpu_copy if blitter_copy fails? */
+	if (migrate->blitter_copy)
+		ret = i915_migrate_blitter_copy(migrate);
+	else
+		ret = i915_migrate_cpu_copy(migrate);
 migrate_out:
 	if (unlikely(ret)) {
 		for (i = 0; i < npages; i++) {
@@ -277,12 +446,42 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
 		}
 	}
 
+	if (unlikely(migrate->host_dma && (!migrate->blitter_copy || ret))) {
+		for (i = 0; i < npages; i++) {
+			if (migrate->host_dma[i])
+				dma_unmap_page(dev, migrate->host_dma[i],
+					       PAGE_SIZE, PCI_DMA_TODEVICE);
+		}
+		kfree(migrate->host_dma);
+		migrate->host_dma = NULL;
+	}
+
 	return ret;
 }
 
 void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate)
 {
+	struct drm_i915_private *i915 = migrate->i915;
+	unsigned long i;
+	int ret;
+
+	if (!migrate->blitter_copy)
+		return;
+
 	DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages);
+	ret = i915_svm_copy_blt_wait(i915, migrate->fence);
+	if (unlikely(ret < 0))
+		WARN(1, "Waiting for dma fence failed %d\n", ret);
+
+	i915_devmem_unbind_addr(migrate, true);
+	i915_devmem_unbind_addr(migrate, false);
+
+	for (i = 0; i < migrate->npages; i++)
+		dma_unmap_page(i915->drm.dev, migrate->host_dma[i],
+			       PAGE_SIZE, PCI_DMA_TODEVICE);
+
+	kfree(migrate->host_dma);
+	migrate->host_dma = NULL;
 }
 
 static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate)
@@ -354,6 +553,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
 	int err;
 
 	DRM_DEBUG_DRIVER("start 0x%lx\n", args->start);
+	/*
+	 * XXX: Set proper condition for cpu vs blitter copy for performance,
+	 * functionality and to avoid any deadlocks around blitter usage.
+	 */
+	migrate->blitter_copy = false;
+
 	/* Allocate host page */
 	spage = migrate_pfn_to_page(args->src[0]);
 	if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE)))
@@ -364,12 +569,31 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
 		return VM_FAULT_SIGBUS;
 	lock_page(dpage);
 
+	/* DMA map host page */
+	if (migrate->blitter_copy) {
+		migrate->host_dma[0] = dma_map_page(dev, dpage, 0, PAGE_SIZE,
+						    PCI_DMA_FROMDEVICE);
+		if (unlikely(dma_mapping_error(dev, migrate->host_dma[0]))) {
+			migrate->host_dma[0] = 0;
+			err = -ENOMEM;
+			goto fault_out;
+		}
+	}
+
 	args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
 
 	/* Copy the pages */
 	migrate->npages = 1;
-	err = i915_migrate_cpu_copy(migrate);
+	/* XXX: Fallback to cpu_copy if blitter_copy fails? */
+	if (migrate->blitter_copy)
+		err = i915_migrate_blitter_copy(migrate);
+	else
+		err = i915_migrate_cpu_copy(migrate);
+fault_out:
 	if (unlikely(err)) {
+		if (migrate->host_dma[0])
+			dma_unmap_page(dev, migrate->host_dma[0],
+				       PAGE_SIZE, PCI_DMA_FROMDEVICE);
 		__free_page(dpage);
 		args->dst[0] = 0;
 		return VM_FAULT_SIGBUS;
@@ -380,7 +604,22 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
 
 void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate)
 {
+	struct drm_i915_private *i915 = migrate->i915;
+	int err;
+
 	DRM_DEBUG_DRIVER("\n");
+	if (!migrate->blitter_copy)
+		return;
+
+	err = i915_svm_copy_blt_wait(i915, migrate->fence);
+	if (unlikely(err < 0))
+		WARN(1, "Waiting for dma fence failed %d\n", err);
+
+	i915_devmem_unbind_addr(migrate, true);
+	i915_devmem_unbind_addr(migrate, false);
+
+	dma_unmap_page(i915->drm.dev, migrate->host_dma[0],
+		       PAGE_SIZE, PCI_DMA_FROMDEVICE);
 }
 
 static inline struct i915_devmem *page_to_devmem(struct page *page)
@@ -394,6 +633,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf)
 	struct drm_i915_private *i915 = devmem->i915;
 	struct i915_devmem_migrate migrate = {0};
 	unsigned long src = 0, dst = 0;
+	dma_addr_t dma_addr = 0;
 	vm_fault_t ret;
 	struct migrate_vma args = {
 		.vma		= vmf->vma,
@@ -407,6 +647,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf)
 	DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start);
 	migrate.i915 = i915;
 	migrate.args = &args;
+	migrate.host_dma = &dma_addr;
 	migrate.src_id = INTEL_REGION_LMEM;
 	migrate.dst_id = INTEL_REGION_SMEM;
 	if (migrate_vma_setup(&args) < 0)
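
For reference, the three-phase migrate_vma flow that these callbacks
plug into looks roughly like this (sketch of the core-kernel API usage;
the i915-specific steps are noted in comments):

	if (migrate_vma_setup(&args) < 0)  /* collect and isolate source pages */
		return VM_FAULT_SIGBUS;
	/* i915_devmem_fault_alloc_and_copy(): allocate destination pages
	 * and issue the CPU or blitter copy */
	migrate_vma_pages(&args);          /* commit the migrated pages */
	/* i915_devmem_fault_finalize_and_map(): wait on the copy fence,
	 * unbind the transfer VAs and dma_unmap the host page */
	migrate_vma_finalize(&args);       /* release isolated source pages */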
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 12/13] drm/i915/svm: Add support to en/disable SVM
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Add SVM as a capability and allow the user to enable/disable SVM
functionality on a per-context basis; a userspace usage sketch follows.
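
From userspace, the capability is probed with GETPARAM and toggled
through the setparam path, roughly as follows (sketch; the
drm_i915_gem_vm_param layout and the class encoding in the upper 32
bits of param follow the UAPI patch earlier in this series, and the
reuse of DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM for the VM class is an
assumption based on the dispatch code below):

	int has_svm = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SVM,
		.value = &has_svm,
	};
	struct drm_i915_gem_vm_param arg = {
		.vm_id = vm_id,		/* from DRM_IOCTL_I915_GEM_VM_CREATE */
		.param = (2ULL << 32) | I915_GEM_VM_PARAM_SVM,	/* class 2 = VM */
		.value = 1,		/* 1 = enable, 0 = disable */
	};

	drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
	if (has_svm)
		drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);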

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 95 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  6 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 11 +++
 drivers/gpu/drm/i915/i915_drv.c               |  4 +-
 drivers/gpu/drm/i915/i915_drv.h               | 10 ++
 drivers/gpu/drm/i915/i915_getparam.c          |  3 +
 8 files changed, 128 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 46926478ed2e..bc6cbf3b3d97 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -76,6 +76,7 @@
 
 #include "i915_gem_context.h"
 #include "i915_globals.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 #include "i915_gem_ioctls.h"
@@ -965,6 +966,78 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int i915_gem_vm_setparam_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_vm_param *args = data;
+	struct i915_address_space *vm;
+	int err = 0;
+	u32 id;
+
+	id = args->vm_id;
+	if (!id)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_idr_lock);
+	if (err)
+		return err;
+
+	vm = idr_find(&file_priv->vm_idr, id);
+
+	mutex_unlock(&file_priv->vm_idr_lock);
+	if (!vm)
+		return -ENOENT;
+
+	switch (lower_32_bits(args->param)) {
+	case I915_GEM_VM_PARAM_SVM:
+		/* FIXME: Ensure ppgtt is empty before switching */
+		if (!i915_has_svm(file_priv->dev_priv))
+			err = -ENOTSUPP;
+		else if (args->value)
+			err = i915_svm_bind_mm(vm);
+		else
+			i915_svm_unbind_mm(vm);
+		break;
+	default:
+		err = -EINVAL;
+	}
+	return err;
+}
+
+static int i915_gem_vm_getparam_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_vm_param *args = data;
+	struct i915_address_space *vm;
+	int err = 0;
+	u32 id;
+
+	id = args->vm_id;
+	if (!id)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_idr_lock);
+	if (err)
+		return err;
+
+	vm = idr_find(&file_priv->vm_idr, id);
+
+	mutex_unlock(&file_priv->vm_idr_lock);
+	if (!vm)
+		return -ENOENT;
+
+	switch (lower_32_bits(args->param)) {
+	case I915_GEM_VM_PARAM_SVM:
+		args->value = i915_vm_is_svm_enabled(vm);
+		break;
+	default:
+		err = -EINVAL;
+	}
+	return err;
+}
+
 struct context_barrier_task {
 	struct i915_active base;
 	void (*task)(void *data);
@@ -2379,6 +2452,21 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file)
+{
+	struct drm_i915_gem_context_param *args = data;
+	u32 class = upper_32_bits(args->param);
+
+	switch (class) {
+	case 0:
+		return i915_gem_context_getparam_ioctl(dev, data, file);
+	case 2:
+		return i915_gem_vm_getparam_ioctl(dev, data, file);
+	}
+	return -EINVAL;
+}
+
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -2401,14 +2489,15 @@ int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
 {
 	struct drm_i915_gem_context_param *args = data;
-	u32 object_class = upper_32_bits(args->param);
+	u32 class = upper_32_bits(args->param);
 
-	switch (object_class) {
+	switch (class) {
 	case 0:
 		return i915_gem_context_setparam_ioctl(dev, data, file);
 	case 1:
 		return i915_gem_object_setparam_ioctl(dev, data, file);
-
+	case 2:
+		return i915_gem_vm_setparam_ioctl(dev, data, file);
 	}
 	return -EINVAL;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 0b17636cfea2..59eedb9e8ae5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -175,6 +175,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file);
 int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index c060bc428f49..eefd2ee64c3f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -129,6 +129,7 @@ struct i915_gem_context {
 #define UCONTEXT_BANNABLE		2
 #define UCONTEXT_RECOVERABLE		3
 #define UCONTEXT_PERSISTENCE		4
+#define UCONTEXT_SVM_ENABLED		5
 
 	/**
 	 * @flags: small set of booleans
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9d43ae6d643a..cbcfbf86ea61 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 
 enum {
@@ -444,6 +445,11 @@ eb_validate_vma(struct i915_execbuffer *eb,
 	if (unlikely(entry->alignment && !is_power_of_2(entry->alignment)))
 		return -EINVAL;
 
+	/* Only allow user PINNED addresses for SVM enabled contexts */
+	if (unlikely(i915_vm_is_svm_enabled(eb->gem_context->vm) &&
+		     !(entry->flags & EXEC_OBJECT_PINNED)))
+		return -EINVAL;
+
 	/*
 	 * Offset can be used as input (EXEC_OBJECT_PINNED), reject
 	 * any non-page-aligned or non-canonical addresses.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index dd88fa87b7fe..93d8aef22b92 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -34,6 +34,7 @@
 #include "i915_gem_object_blt.h"
 #include "i915_gem_region.h"
 #include "i915_globals.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 
 static struct i915_global_object {
@@ -483,6 +484,16 @@ int __init i915_global_objects_init(void)
 bool
 i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
 {
+	struct i915_vma *vma;
+
+	spin_lock(&obj->vma.lock);
+	list_for_each_entry(vma, &obj->vma.list, obj_link)
+		if (i915_vm_is_svm_enabled(vma->vm)) {
+			spin_unlock(&obj->vma.lock);
+			return true;
+		}
+
+	spin_unlock(&obj->vma.lock);
 	return false;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 740b4b9d39a8..f78912ffa03d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2699,6 +2699,8 @@ static int i915_bind_ioctl(struct drm_device *dev, void *data,
 	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
 	if (unlikely(!vm))
 		return -ENOENT;
+	if (unlikely(!i915_vm_is_svm_enabled(vm)))
+		return -ENOTSUPP;
 
 	switch (args->type) {
 	case I915_BIND_SVM_BUFFER:
@@ -2762,7 +2764,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_getparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7051e6df656..2ea9d6ac9743 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -38,6 +38,7 @@
 #include <linux/i2c-algo-bit.h>
 #include <linux/backlight.h>
 #include <linux/hash.h>
+#include <linux/hmm.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/mm_types.h>
@@ -1735,6 +1736,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 /* Only valid when HAS_DISPLAY() is true */
 #define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)
 
+static inline bool i915_has_svm(struct drm_i915_private *dev_priv)
+{
+#ifdef CONFIG_DRM_I915_SVM
+	return INTEL_GEN(dev_priv) >= 8;
+#else
+	return false;
+#endif
+}
+
 static inline bool intel_vtd_active(void)
 {
 #ifdef CONFIG_INTEL_IOMMU
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index cf8a8c3ef047..3023b7b40f66 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -160,6 +160,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_PERF_REVISION:
 		value = i915_perf_ioctl_version();
 		break;
+	case I915_PARAM_HAS_SVM:
+		value = i915_has_svm(i915);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC 12/13] drm/i915/svm: Add support to en/disable SVM
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Add SVM as a capability and allow the user to enable/disable SVM
functionality on a per-context basis.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 95 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  6 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 11 +++
 drivers/gpu/drm/i915/i915_drv.c               |  4 +-
 drivers/gpu/drm/i915/i915_drv.h               | 10 ++
 drivers/gpu/drm/i915/i915_getparam.c          |  3 +
 8 files changed, 128 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 46926478ed2e..bc6cbf3b3d97 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -76,6 +76,7 @@
 
 #include "i915_gem_context.h"
 #include "i915_globals.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 #include "i915_gem_ioctls.h"
@@ -965,6 +966,78 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int i915_gem_vm_setparam_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_vm_param *args = data;
+	struct i915_address_space *vm;
+	int err = 0;
+	u32 id;
+
+	id = args->vm_id;
+	if (!id)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_idr_lock);
+	if (err)
+		return err;
+
+	vm = idr_find(&file_priv->vm_idr, id);
+
+	mutex_unlock(&file_priv->vm_idr_lock);
+	if (!vm)
+		return -ENOENT;
+
+	switch (lower_32_bits(args->param)) {
+	case I915_GEM_VM_PARAM_SVM:
+		/* FIXME: Ensure ppgtt is empty before switching */
+		if (!i915_has_svm(file_priv->dev_priv))
+			err = -ENOTSUPP;
+		else if (args->value)
+			err = i915_svm_bind_mm(vm);
+		else
+			i915_svm_unbind_mm(vm);
+		break;
+	default:
+		err = -EINVAL;
+	}
+	return err;
+}
+
+static int i915_gem_vm_getparam_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_vm_param *args = data;
+	struct i915_address_space *vm;
+	int err = 0;
+	u32 id;
+
+	id = args->vm_id;
+	if (!id)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_idr_lock);
+	if (err)
+		return err;
+
+	vm = idr_find(&file_priv->vm_idr, id);
+
+	mutex_unlock(&file_priv->vm_idr_lock);
+	if (!vm)
+		return -ENOENT;
+
+	switch (lower_32_bits(args->param)) {
+	case I915_GEM_VM_PARAM_SVM:
+		args->value = i915_vm_is_svm_enabled(vm);
+		break;
+	default:
+		err = -EINVAL;
+	}
+	return err;
+}
+
 struct context_barrier_task {
 	struct i915_active base;
 	void (*task)(void *data);
@@ -2379,6 +2452,21 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file)
+{
+	struct drm_i915_gem_context_param *args = data;
+	u32 class = upper_32_bits(args->param);
+
+	switch (class) {
+	case 0:
+		return i915_gem_context_getparam_ioctl(dev, data, file);
+	case 2:
+		return i915_gem_vm_getparam_ioctl(dev, data, file);
+	}
+	return -EINVAL;
+}
+
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -2401,14 +2489,15 @@ int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
 {
 	struct drm_i915_gem_context_param *args = data;
-	u32 object_class = upper_32_bits(args->param);
+	u32 class = upper_32_bits(args->param);
 
-	switch (object_class) {
+	switch (class) {
 	case 0:
 		return i915_gem_context_setparam_ioctl(dev, data, file);
 	case 1:
 		return i915_gem_object_setparam_ioctl(dev, data, file);
-
+	case 2:
+		return i915_gem_vm_setparam_ioctl(dev, data, file);
 	}
 	return -EINVAL;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 0b17636cfea2..59eedb9e8ae5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -175,6 +175,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file);
 int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index c060bc428f49..eefd2ee64c3f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -129,6 +129,7 @@ struct i915_gem_context {
 #define UCONTEXT_BANNABLE		2
 #define UCONTEXT_RECOVERABLE		3
 #define UCONTEXT_PERSISTENCE		4
+#define UCONTEXT_SVM_ENABLED		5
 
 	/**
 	 * @flags: small set of booleans
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9d43ae6d643a..cbcfbf86ea61 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 
 enum {
@@ -444,6 +445,11 @@ eb_validate_vma(struct i915_execbuffer *eb,
 	if (unlikely(entry->alignment && !is_power_of_2(entry->alignment)))
 		return -EINVAL;
 
+	/* Only allow user PINNED addresses for SVM enabled contexts */
+	if (unlikely(i915_vm_is_svm_enabled(eb->gem_context->vm) &&
+		     !(entry->flags & EXEC_OBJECT_PINNED)))
+		return -EINVAL;
+
 	/*
 	 * Offset can be used as input (EXEC_OBJECT_PINNED), reject
 	 * any non-page-aligned or non-canonical addresses.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index dd88fa87b7fe..93d8aef22b92 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -34,6 +34,7 @@
 #include "i915_gem_object_blt.h"
 #include "i915_gem_region.h"
 #include "i915_globals.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 
 static struct i915_global_object {
@@ -483,6 +484,16 @@ int __init i915_global_objects_init(void)
 bool
 i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
 {
+	struct i915_vma *vma;
+
+	spin_lock(&obj->vma.lock);
+	list_for_each_entry(vma, &obj->vma.list, obj_link)
+		if (i915_vm_is_svm_enabled(vma->vm)) {
+			spin_unlock(&obj->vma.lock);
+			return true;
+		}
+
+	spin_unlock(&obj->vma.lock);
 	return false;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 740b4b9d39a8..f78912ffa03d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2699,6 +2699,8 @@ static int i915_bind_ioctl(struct drm_device *dev, void *data,
 	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
 	if (unlikely(!vm))
 		return -ENOENT;
+	if (unlikely(!i915_vm_is_svm_enabled(vm)))
+		return -ENOTSUPP;
 
 	switch (args->type) {
 	case I915_BIND_SVM_BUFFER:
@@ -2762,7 +2764,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_getparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7051e6df656..2ea9d6ac9743 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -38,6 +38,7 @@
 #include <linux/i2c-algo-bit.h>
 #include <linux/backlight.h>
 #include <linux/hash.h>
+#include <linux/hmm.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/mm_types.h>
@@ -1735,6 +1736,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 /* Only valid when HAS_DISPLAY() is true */
 #define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)
 
+static inline bool i915_has_svm(struct drm_i915_private *dev_priv)
+{
+#ifdef CONFIG_DRM_I915_SVM
+	return INTEL_GEN(dev_priv) >= 8;
+#else
+	return false;
+#endif
+}
+
 static inline bool intel_vtd_active(void)
 {
 #ifdef CONFIG_INTEL_IOMMU
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index cf8a8c3ef047..3023b7b40f66 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -160,6 +160,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_PERF_REVISION:
 		value = i915_perf_ioctl_version();
 		break;
+	case I915_PARAM_HAS_SVM:
+		value = i915_has_svm(i915);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 111+ messages in thread
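
For context, here is a minimal userspace sketch of the resulting uAPI
flow. It is illustrative only: it assumes the definitions from patch 02
of this series (I915_PARAM_HAS_SVM, I915_GEM_VM_PARAM_SVM, and struct
drm_i915_gem_vm_param with vm_id/param/value fields), and it writes the
class-2 encoding into the upper 32 bits of 'param' explicitly to match
the dispatch in i915_gem_setparam_ioctl() above; patch 02 may instead
fold that into the macro itself.

	#include <stdint.h>
	#include <xf86drm.h>
	#include <i915_drm.h>

	static int has_svm(int fd)
	{
		int value = 0;
		struct drm_i915_getparam gp = {
			.param = I915_PARAM_HAS_SVM,	/* from patch 02 */
			.value = &value,
		};

		/* Treat an ioctl failure as "no SVM". */
		if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
			return 0;
		return value;
	}

	static int enable_svm(int fd, uint32_t vm_id)
	{
		struct drm_i915_gem_vm_param vp = {
			.vm_id = vm_id,
			/* Class 2 (VM) in the upper 32 bits routes this to
			 * i915_gem_vm_setparam_ioctl(); the low 32 bits
			 * select the parameter. */
			.param = ((uint64_t)2 << 32) | I915_GEM_VM_PARAM_SVM,
			.value = 1,	/* 1 = enable, 0 = disable */
		};

		return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &vp);
	}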

* [RFC 13/13] drm/i915/svm: Add page table dump support
@ 2019-11-22 20:57   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjana Vishwanathapura @ 2019-11-22 20:57 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen, jglisse,
	jon.bloomfield, daniel.vetter, dan.j.williams, ira.weiny, jgg

Add support to dump the page tables for debugging purposes.
Here is an example dump. The format is:
[<page table index>] <start VA pfn>: <value>

Page Table dump start 0x0 len 0xffffffffffffffff
      [0x0fe] 0x7f0000000: 0x6b0003
              [0x1e6] 0x7f7980000: 0x6c0003
                      [0x16d] 0x7f79ada00: 0x5f0003
                              [0x000] 0x7f79ada00: 0x610803
                      [0x16e] 0x7f79adc00: 0x6d0003
                              [0x000] 0x7f79adc00: 0x630803
      [0x100] 0x800000000: 0x6f0003
              [0x000] 0x800000000: 0x700000
                      [0x000] 0x800000000: 0x710003
                              [0x000] 0x800000000: 0x5d0803
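
The bracketed indices are the 9-bit page-walk indices taken from the
virtual address at each level; the dump prints VA pfns, i.e. VA >> 12.
For example, decoding the pfn 0x7f7980000 above:

	u64 va    = 0x7f7980000ULL << 12;	/* VA = pfn << PAGE_SHIFT */
	u32 pml4e = (va >> 39) & 0x1ff;		/* = 0x0fe, top level */
	u32 pdpe  = (va >> 30) & 0x1ff;		/* = 0x1e6, next level */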

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug            | 14 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 92 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h           | 14 +++
 4 files changed, 121 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 1140525da75a..06953c1d6fd4 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -223,3 +223,17 @@ config DRM_I915_DEBUG_RUNTIME_PM
 	  driver loading, suspend and resume operations.
 
 	  If in doubt, say "N"
+
+config DRM_I915_DUMP_PPGTT
+	bool "Enable PPGTT Page Table dump support"
+	depends on DRM_I915
+	default n
+	help
+	  Choose this option to enable PPGTT page table dump support.
+	  The page table snapshot helps developers to debug page table
+	  related issues. This will affect performance, and it dumps a
+	  lot of information.
+
+	  Recommended for driver developers only.
+
+	  If in doubt, say "N".
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cbcfbf86ea61..c771659d72e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2707,6 +2707,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		intel_engine_pool_mark_active(eb.batch->private, eb.request);
 
 	trace_i915_request_queue(eb.request, eb.batch_flags);
+	ppgtt_dump(eb.context->vm, 0, eb.context->vm->total);
 	err = eb_submit(&eb);
 err_request:
 	add_to_client(eb.request, file);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 08cbbb4d91cb..6a7348af5717 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1215,6 +1215,97 @@ static int gen8_ppgtt_alloc(struct i915_address_space *vm,
 	return err;
 }
 
+#ifdef CONFIG_DRM_I915_DUMP_PPGTT
+static void __gen8_ppgtt_dump(struct i915_address_space * const vm,
+			      struct i915_page_directory * const pd,
+			      u64 start, u64 end, int lvl)
+{
+	char *prefix[4] = { "\t\t\t\t", "\t\t\t", "\t\t", "\t"};
+	char *format = "%s [0x%03x] 0x%llx: 0x%llx\n";
+	unsigned int idx, len;
+	gen8_pte_t *vaddr;
+	unsigned int pdpe;
+	bool is_large;
+
+	GEM_BUG_ON(end > vm->total >> GEN8_PTE_SHIFT);
+
+	len = gen8_pd_range(start, end, lvl--, &idx);
+	GEM_BUG_ON(!len || (idx + len - 1) >> gen8_pd_shift(1));
+
+	spin_lock(&pd->lock);
+	GEM_BUG_ON(!atomic_read(px_used(pd))); /* Must be pinned! */
+	do {
+		struct i915_page_table *pt = pd->entry[idx];
+
+		if (!pt) {
+			start += (1 << gen8_pd_shift(lvl + 1));
+			continue;
+		}
+
+		vaddr = kmap_atomic_px(&pd->pt);
+		pdpe = gen8_pd_index(start, lvl + 1);
+		DRM_DEBUG_DRIVER(format, prefix[lvl + 1], pdpe,
+				 start, vaddr[pdpe]);
+		is_large = (vaddr[pdpe] & GEN8_PDE_PS_2M);
+		kunmap_atomic(vaddr);
+		if (is_large) {
+			start += (1 << gen8_pd_shift(lvl + 1));
+			continue;
+		}
+
+		if (lvl) {
+			atomic_inc(&pt->used);
+			spin_unlock(&pd->lock);
+
+			__gen8_ppgtt_dump(vm, as_pd(pt),
+					  start, end, lvl);
+
+			start += (1 << gen8_pd_shift(lvl + 1));
+			spin_lock(&pd->lock);
+			atomic_dec(&pt->used);
+			GEM_BUG_ON(!atomic_read(&pt->used));
+		} else {
+			unsigned int count = gen8_pt_count(start, end);
+
+			pdpe = gen8_pd_index(start, lvl);
+			vaddr = kmap_atomic_px(pt);
+			while (count) {
+				if (vaddr[pdpe] != vm->scratch[lvl].encode)
+					DRM_DEBUG_DRIVER(format, prefix[lvl],
+							 pdpe, start,
+							 vaddr[pdpe]);
+				start++;
+				count--;
+				pdpe++;
+			}
+
+			kunmap_atomic(vaddr);
+			GEM_BUG_ON(atomic_read(&pt->used) > I915_PDES);
+		}
+	} while (idx++, --len);
+	spin_unlock(&pd->lock);
+}
+
+static void gen8_ppgtt_dump(struct i915_address_space *vm,
+			    u64 start, u64 length)
+{
+	GEM_BUG_ON(!IS_ALIGNED(start, BIT_ULL(GEN8_PTE_SHIFT)));
+	GEM_BUG_ON(!IS_ALIGNED(length, BIT_ULL(GEN8_PTE_SHIFT)));
+	GEM_BUG_ON(range_overflows(start, length, vm->total));
+
+	start >>= GEN8_PTE_SHIFT;
+	length >>= GEN8_PTE_SHIFT;
+	GEM_BUG_ON(length == 0);
+
+	DRM_DEBUG_DRIVER("PPGTT dump: start 0x%llx length 0x%llx\n",
+			 start, length);
+	__gen8_ppgtt_dump(vm, i915_vm_to_ppgtt(vm)->pd,
+			  start, start + length, vm->top);
+}
+#else
+#define gen8_ppgtt_dump   NULL
+#endif
+
 static inline struct sgt_dma {
 	struct scatterlist *sg;
 	dma_addr_t dma, max;
@@ -1584,6 +1675,7 @@ static struct i915_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915)
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
+	ppgtt->vm.dump_va_range = gen8_ppgtt_dump;
 
 	if (intel_vgpu_active(i915))
 		gen8_ppgtt_notify_vgt(ppgtt, true);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 722b1e7ce291..a4358fb1c711 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -367,6 +367,8 @@ struct i915_address_space {
 				 u64 start, u64 length);
 	void (*clear_range)(struct i915_address_space *vm,
 			    u64 start, u64 length);
+	void (*dump_va_range)(struct i915_address_space *vm,
+			      u64 start, u64 length);
 	void (*insert_page)(struct i915_address_space *vm,
 			    dma_addr_t addr,
 			    u64 offset,
@@ -683,6 +685,18 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
 
+#ifdef CONFIG_DRM_I915_DUMP_PPGTT
+static inline void ppgtt_dump(struct i915_address_space *vm,
+			      u64 start, u64 length)
+{
+	if (vm->dump_va_range)
+		vm->dump_va_range(vm, start, length);
+}
+#else
+static inline void ppgtt_dump(struct i915_address_space *vm,
+			      u64 start, u64 length) { }
+#endif
+
 /* SVM API */
 #define I915_GTT_SVM_READONLY  BIT(0)
 
-- 
2.21.0.rc0.32.g243a4c7e27

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 111+ messages in thread
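
Two usage notes on the dump above. The output goes through
DRM_DEBUG_DRIVER(), so it only shows up when the drm "driver" debug
category is enabled (e.g. drm.debug=0x02 on the kernel command line).
And since the execbuffer hook dumps the whole VM on every submission,
a narrower call can be handier when chasing a single address; a
hypothetical variant, where suspect_va is whatever address is being
debugged:

	/* Dump just the 2 MiB slice containing a suspect address. */
	ppgtt_dump(eb.context->vm, suspect_va & ~(u64)(SZ_2M - 1), SZ_2M);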

* ✗ Fi.CI.BUILD: failure for drm/i915/svm: Add SVM support
@ 2019-11-22 21:32   ` Patchwork
  0 siblings, 0 replies; 111+ messages in thread
From: Patchwork @ 2019-11-22 21:32 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/svm: Add SVM support
URL   : https://patchwork.freedesktop.org/series/69908/
State : failure

== Summary ==

Applying: drm/i915/svm: Add SVM documentation
Applying: drm/i915/svm: Define SVM UAPI
error: sha1 information is lacking or useless (include/uapi/drm/i915_drm.h).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0002 drm/i915/svm: Define SVM UAPI
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-22 23:33     ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-22 23:33 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:

> +static inline bool
> +i915_range_done(struct hmm_range *range)
> +{
> +	bool ret = hmm_range_valid(range);
> +
> +	hmm_range_unregister(range);
> +	return ret;
> +}

This needs to be updated to follow the new API, but this pattern is
generally wrong, failure if hmm_range_valid should retry the
range_fault and so forth. See the hmm.rst.
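
For reference, the retry pattern hmm.rst describes for this era of the
API looks roughly like the following. This is a paraphrased sketch, not
the i915 code; take_lock()/release_lock() stand in for whatever lock the
driver uses to serialize against the mirror's invalidation callback:

	again:
		down_read(&mm->mmap_sem);
		ret = hmm_range_fault(&range, 0);
		if (ret < 0) {
			up_read(&mm->mmap_sem);
			if (ret == -EBUSY) {
				/* The range was invalidated; wait and retry. */
				hmm_range_wait_until_valid(&range, TIMEOUT);
				goto again;
			}
			hmm_range_unregister(&range);
			return ret;
		}
		take_lock(driver->update);
		if (!hmm_range_valid(&range)) {
			release_lock(driver->update);
			up_read(&mm->mmap_sem);
			goto again;
		}
		/* Only now is it safe to consume range->pfns, still
		 * under the driver update lock. */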

> +static int
> +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> +{
> +	long ret;
> +
> +	range->default_flags = 0;
> +	range->pfn_flags_mask = -1UL;
> +
> +	ret = hmm_range_register(range, &svm->mirror);
> +	if (ret) {
> +		up_read(&svm->mm->mmap_sem);
> +		return (int)ret;
> +	}


Using a temporary range is the pattern from nouveau, is it really
necessary in this driver?

> +static u32 i915_svm_build_sg(struct i915_address_space *vm,
> +			     struct hmm_range *range,
> +			     struct sg_table *st)
> +{
> +	struct scatterlist *sg;
> +	u32 sg_page_sizes = 0;
> +	u64 i, npages;
> +
> +	sg = NULL;
> +	st->nents = 0;
> +	npages = (range->end - range->start) / PAGE_SIZE;
> +
> +	/*
> +	 * No need to dma map the host pages and later unmap them, as
> +	 * the GPU is not allowed to access them with SVM. Hence, no need
> +	 * for any intermediate data structure to hold the mappings.
> +	 */
> +	for (i = 0; i < npages; i++) {
> +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
> +
> +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
> +			sg->length += PAGE_SIZE;
> +			sg_dma_len(sg) += PAGE_SIZE;
> +			continue;
> +		}
> +
> +		if (sg)
> +			sg_page_sizes |= sg->length;
> +
> +		sg =  sg ? __sg_next(sg) : st->sgl;
> +		sg_dma_address(sg) = addr;
> +		sg_dma_len(sg) = PAGE_SIZE;
> +		sg->length = PAGE_SIZE;
> +		st->nents++;

It is odd to build the range into a sgl.

IMHO it is not a good idea to use the sg_dma_address like this, that
should only be filled in by a dma map. Where does it end up being
used?

> +	range.pfn_shift = PAGE_SHIFT;
> +	range.start = args->start;
> +	range.flags = i915_range_flags;
> +	range.values = i915_range_values;
> +	range.end = range.start + args->length;
> +	for (i = 0; i < npages; ++i) {
> +		range.pfns[i] = range.flags[HMM_PFN_VALID];
> +		if (!(args->flags & I915_BIND_READONLY))
> +			range.pfns[i] |= range.flags[HMM_PFN_WRITE];
> +	}
> +
> +	down_read(&svm->mm->mmap_sem);
> +	vma = find_vma(svm->mm, range.start);

Why is find_vma needed?

> +	if (unlikely(!vma || vma->vm_end < range.end)) {
> +		ret = -EFAULT;
> +		goto vma_done;
> +	}
> +again:
> +	ret = i915_range_fault(svm, &range);
> +	if (unlikely(ret))
> +		goto vma_done;
> +
> +	sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
>

Keep in mind what can be done with the range values is limited until
the lock is obtained. Reading the pfns and flags is OK though.

> +int i915_svm_bind_mm(struct i915_address_space *vm)
> +{
> +	struct i915_svm *svm;
> +	struct mm_struct *mm;
> +	int ret = 0;
> +
> +	mm = get_task_mm(current);

I don't think the get_task_mm(current) is needed, the mmget is already
done for current->mm ?

Jason
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-23 23:53         ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-23 23:53 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Fri, Nov 22, 2019 at 08:44:18PM -0800, Niranjan Vishwanathapura wrote:
> On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
> > 
> > > +static inline bool
> > > +i915_range_done(struct hmm_range *range)
> > > +{
> > > +	bool ret = hmm_range_valid(range);
> > > +
> > > +	hmm_range_unregister(range);
> > > +	return ret;
> > > +}
> > 
> > This needs to be updated to follow the new API, but this pattern is
> > generally wrong, failure if hmm_range_valid should retry the
> > range_fault and so forth. See the hmm.rst.
> > 
> 
> Thanks Jason for the feedback.
> Yah, will update as per new API in the next revision.
> The caller of this function is indeed retrying if the range is not valid.
> But I got the point. I made changes in my local build as per hmm.rst
> and it is working fine. Will include the change in next revision.

Generally speaking this locking approach is a live-lockable
collision-retry scheme. 

For maintainability it is best if the retry loop is explicit and
doesn't unregister the range during the retry. I also encourage adding
an absolute time bound to the retry as it *could* live lock.
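
Roughly, the pattern from hmm.rst looks like this (just a sketch; the
calls approximate the API this series is built on, and svm->lock stands
in for whatever lock your invalidate callback takes):

	/* sketch only: explicit collision-retry with an absolute bound */
	unsigned long end = jiffies +
			    msecs_to_jiffies(2 * HMM_RANGE_DEFAULT_TIMEOUT);
	long ret;

again:
	if (time_after(jiffies, end))
		return -EBUSY;		/* absolute bound across retries */
	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT))
		return -EBUSY;

	down_read(&svm->mm->mmap_sem);
	ret = hmm_range_fault(range, 0);
	up_read(&svm->mm->mmap_sem);
	if (ret <= 0) {
		if (ret == -EBUSY)
			goto again;	/* collided with an invalidation */
		return ret ? ret : -EFAULT;
	}

	mutex_lock(&svm->lock);
	if (!hmm_range_valid(range)) {
		mutex_unlock(&svm->lock);
		goto again;		/* retry, do not unregister here */
	}
	/* ... update the device page table from range->pfns ... */
	mutex_unlock(&svm->lock);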

> > > +static int
> > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > > +{
> > > +	long ret;
> > > +
> > > +	range->default_flags = 0;
> > > +	range->pfn_flags_mask = -1UL;
> > > +
> > > +	ret = hmm_range_register(range, &svm->mirror);
> > > +	if (ret) {
> > > +		up_read(&svm->mm->mmap_sem);
> > > +		return (int)ret;
> > > +	}
> > 
> > 
> > Using a temporary range is the pattern from nouveau, is it really
> > necessary in this driver?
> 
> Yah, not required. In my local build I tried with proper default_flags
> and set pfn_flags_mask to 0 and it is working fine.

Sorry, I meant calling hmm_range_register during fault processing.

If your driver works around user space objects that cover a VA then
the range should be created when the object is created.

> > > +	/*
> > > +	 * No need to dma map the host pages and later unmap them, as
> > > +	 * the GPU is not allowed to access them with SVM. Hence, no need
> > > +	 * for any intermediate data structure to hold the mappings.
> > > +	 */
> > > +	for (i = 0; i < npages; i++) {
> > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
> > > +
> > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
> > > +			sg->length += PAGE_SIZE;
> > > +			sg_dma_len(sg) += PAGE_SIZE;
> > > +			continue;
> > > +		}
> > > +
> > > +		if (sg)
> > > +			sg_page_sizes |= sg->length;
> > > +
> > > +		sg =  sg ? __sg_next(sg) : st->sgl;
> > > +		sg_dma_address(sg) = addr;
> > > +		sg_dma_len(sg) = PAGE_SIZE;
> > > +		sg->length = PAGE_SIZE;
> > > +		st->nents++;
> > 
> > It is odd to build the range into a sgl.
> > 
> > IMHO it is not a good idea to use the sg_dma_address like this, that
> > should only be filled in by a dma map. Where does it end up being
> > used?
> 
> The sgl is used to plug into the page table update function in i915.
> 
> For the device memory in discrete card, we don't need dma map which
> is the case here.

How did we get to device memory on a card? Isn't range->pfns a CPU PFN
at this point?

I'm confused.

> > > +int i915_svm_bind_mm(struct i915_address_space *vm)
> > > +{
> > > +	struct i915_svm *svm;
> > > +	struct mm_struct *mm;
> > > +	int ret = 0;
> > > +
> > > +	mm = get_task_mm(current);
> > 
> > I don't think the get_task_mm(current) is needed, the mmget is already
> > done for current->mm ?
> 
> No, I don't think mmget is already done for current->mm here.

I'm not certain, but I thought it is already done because it is
'current' and current cannot begin to destroy the mm while a syscall
is currently running.

Jason
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-24 21:12           ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-24 21:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Sat, Nov 23, 2019 at 11:53:52PM +0000, Jason Gunthorpe wrote:
>On Fri, Nov 22, 2019 at 08:44:18PM -0800, Niranjan Vishwanathapura wrote:
>> On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
>> > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
>> >
>> > > +static inline bool
>> > > +i915_range_done(struct hmm_range *range)
>> > > +{
>> > > +	bool ret = hmm_range_valid(range);
>> > > +
>> > > +	hmm_range_unregister(range);
>> > > +	return ret;
>> > > +}
>> >
>> > This needs to be updated to follow the new API, but this pattern is
>> > generally wrong, failure if hmm_range_valid should retry the
>> > range_fault and so forth. See the hmm.rst.
>> >
>>
>> Thanks Jason for the feedback.
>> Yah, will update as per new API in the next revision.
>> The caller of this function is indeed retrying if the range is not valid.
>> But I got the point. I made changes in my local build as per hmm.rst
>> and it is working fine. Will include the change in next revision.
>
>Generally speaking this locking approach is a live-lockable
>collision-retry scheme.
>
>For maintainability it is best if the retry loop is explicit and
>doesn't unregister the range during the retry. I also encourage adding
>an absolute time bound to the retry as it *could* live lock.
>

Ok, thanks for the suggestion, will do.

>> > > +static int
>> > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
>> > > +{
>> > > +	long ret;
>> > > +
>> > > +	range->default_flags = 0;
>> > > +	range->pfn_flags_mask = -1UL;
>> > > +
>> > > +	ret = hmm_range_register(range, &svm->mirror);
>> > > +	if (ret) {
>> > > +		up_read(&svm->mm->mmap_sem);
>> > > +		return (int)ret;
>> > > +	}
>> >
>> >
>> > Using a temporary range is the pattern from nouveau, is it really
>> > necessary in this driver?
>>
>> Yah, not required. In my local build I tried with proper default_flags
>> and set pfn_flags_mask to 0 and it is working fine.
>
>Sorry, I ment calling hmm_range_register during fault processing.
>
>If your driver works around user space objects that cover a VA then
>the range should be created when the object is created.
>

Oh ok. No, there is no user space object here.
Binding a user space object to the device page table is handled in
patch 03 of this series (no HMM there).
This is for binding system allocated (malloc) memory; the user calls
the bind ioctl with the VA range, roughly as sketched below.
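
(Illustration only: the ioctl request and struct layout below are made
up for the sketch; only the I915_BIND_READONLY flag exists in this
series.)

	void *buf = aligned_alloc(4096, 16 * 4096);	/* system memory */
	struct i915_bind_args args = {			/* hypothetical */
		.start  = (uintptr_t)buf,
		.length = 16 * 4096,
		.flags  = 0,	/* writable: I915_BIND_READONLY not set */
	};

	ioctl(drm_fd, I915_IOCTL_SVM_BIND /* hypothetical */, &args);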

>> > > +	/*
>> > > +	 * No need to dma map the host pages and later unmap them, as
>> > > +	 * the GPU is not allowed to access them with SVM. Hence, no need
>> > > +	 * for any intermediate data structure to hold the mappings.
>> > > +	 */
>> > > +	for (i = 0; i < npages; i++) {
>> > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
>> > > +
>> > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
>> > > +			sg->length += PAGE_SIZE;
>> > > +			sg_dma_len(sg) += PAGE_SIZE;
>> > > +			continue;
>> > > +		}
>> > > +
>> > > +		if (sg)
>> > > +			sg_page_sizes |= sg->length;
>> > > +
>> > > +		sg =  sg ? __sg_next(sg) : st->sgl;
>> > > +		sg_dma_address(sg) = addr;
>> > > +		sg_dma_len(sg) = PAGE_SIZE;
>> > > +		sg->length = PAGE_SIZE;
>> > > +		st->nents++;
>> >
>> > It is odd to build the range into a sgl.
>> >
>> > IMHO it is not a good idea to use the sg_dma_address like this, that
>> > should only be filled in by a dma map. Where does it end up being
>> > used?
>>
>> The sgl is used to plug into the page table update function in i915.
>>
>> For the device memory in discrete card, we don't need dma map which
>> is the case here.
>
>How did we get to device memory on a card? Isn't range->pfns a CPU PFN
>at this point?
>
>I'm confused.
>

The device memory plugin is done through devm_memremap_pages() in patch
07 of this series. In that patch, we convert the CPU PFN to a device PFN
before building the sgl (this is similar to the nouveau driver).
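
The conversion step looks roughly like this (sketch only; dmem->pfn_first
follows the nouveau pattern and is not spelled out here):

	for (i = 0; i < npages; i++) {
		struct page *page;
		u64 addr;

		page = hmm_device_entry_to_page(range, range->pfns[i]);
		if (!page || !is_device_private_page(page))
			continue;

		/* device-local address of this ZONE_DEVICE page */
		addr = (page_to_pfn(page) - dmem->pfn_first) << PAGE_SHIFT;
		/* keep the flag bits, swap in the device PFN */
		range->pfns[i] &= (1UL << range->pfn_shift) - 1;
		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
	}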

>> > > +int i915_svm_bind_mm(struct i915_address_space *vm)
>> > > +{
>> > > +	struct i915_svm *svm;
>> > > +	struct mm_struct *mm;
>> > > +	int ret = 0;
>> > > +
>> > > +	mm = get_task_mm(current);
>> >
>> > I don't think the get_task_mm(current) is needed, the mmget is already
>> > done for current->mm ?
>>
>> No, I don't think mmget is already done for current->mm here.
>
>I'm not certain, but I thought it is already done because it is
>'current' and current cannot begin to destroy the mm while a syscall
>is currently running.
>

Ok, I don't know; at least the driver is not calling it. But it makes
sense to me. I will remove get_task_mm and use current->mm directly.

Niranjana

>Jason
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-11-25  9:59     ` Chris Wilson
  0 siblings, 0 replies; 111+ messages in thread
From: Chris Wilson @ 2019-11-25  9:59 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
> Shared Virtual Memory (SVM) runtime allocator support allows
> binding a shared virtual address to a buffer object (BO) in the
> device page table through an ioctl call.

The ioctl, though, is not svm specific; it has to do with "bulk residency"
and can be used to reduce execbuf traffic and to provide virtual address
layout control to e.g. Vulkan clients.

I915_VM_BIND {
	uint32_t vm_id;
	int32_t fd; /* or -1 for anon, or buf depending on flags */
	uint64_t flags;
	uint64_t offset; /* offset into fd [page aligned] */
	uint64_t length; /* page aligned */
	uint64_t iova; /* page aligned */
	uint64_t extensions;
}; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */

as I recall. I also recall it being part of a future command stream
interface to reduce ioctls, but that is another story.
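
For concreteness, a client call might look roughly like this (the
request macro is a placeholder; none of this is merged uAPI):

	struct i915_vm_bind bind = {
		.vm_id  = vm_id,
		.fd     = -1,		/* anonymous backing */
		.offset = 0,		/* offset into fd, page aligned */
		.length = 64 << 10,	/* page aligned */
		.iova   = 0x1000000,	/* page aligned GPU VA */
	};

	/* DRM_IOCTL_I915_VM_BIND is a placeholder, no such ioctl exists */
	if (drmIoctl(drm_fd, DRM_IOCTL_I915_VM_BIND, &bind))
		err(1, "vm_bind");
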
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-25 13:24             ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-25 13:24 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:

> > > > Using a temporary range is the pattern from nouveau, is it really
> > > > necessary in this driver?
> > > 
> > > Yah, not required. In my local build I tried with proper default_flags
> > > and set pfn_flags_mask to 0 and it is working fine.
> > 
> > Sorry, I ment calling hmm_range_register during fault processing.
> > 
> > If your driver works around user space objects that cover a VA then
> > the range should be created when the object is created.
> > 
> 
> Oh ok. No, there is no user space object here.
> Binding the user space object to device page table is handled in
> patch 03 of this series (no HMM there).
> This is for binding a system allocated (malloc) memory. User calls
> the bind ioctl with the VA range.
> 
> > > > > +	/*
> > > > > +	 * No need to dma map the host pages and later unmap them, as
> > > > > +	 * the GPU is not allowed to access them with SVM. Hence, no need
> > > > > +	 * for any intermediate data structure to hold the mappings.
> > > > > +	 */
> > > > > +	for (i = 0; i < npages; i++) {
> > > > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
> > > > > +
> > > > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
> > > > > +			sg->length += PAGE_SIZE;
> > > > > +			sg_dma_len(sg) += PAGE_SIZE;
> > > > > +			continue;
> > > > > +		}
> > > > > +
> > > > > +		if (sg)
> > > > > +			sg_page_sizes |= sg->length;
> > > > > +
> > > > > +		sg =  sg ? __sg_next(sg) : st->sgl;
> > > > > +		sg_dma_address(sg) = addr;
> > > > > +		sg_dma_len(sg) = PAGE_SIZE;
> > > > > +		sg->length = PAGE_SIZE;
> > > > > +		st->nents++;
> > > >
> > > > It is odd to build the range into a sgl.
> > > >
> > > > IMHO it is not a good idea to use the sg_dma_address like this, that
> > > > should only be filled in by a dma map. Where does it end up being
> > > > used?
> > > 
> > > The sgl is used to plug into the page table update function in i915.
> > > 
> > > For the device memory in discrete card, we don't need dma map which
> > > is the case here.
> > 
> > How did we get to device memory on a card? Isn't range->pfns a CPU PFN
> > at this point?
> > 
> > I'm confused.
> 
> Device memory plugin is done through devm_memremap_pages() in patch 07 of
> this series. In that patch, we convert the CPU PFN to device PFN before
> building the sgl (this is similar to the nouveau driver).

But hmm_range_fault() was just called earlier, and it can return all
kinds of pages. If only device pages are allowed here, then that must be
checked (under the lock).

And putting the cpu PFN of a ZONE_DEVICE device page into
sg_dma_address still looks very wrong to me

Jason
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-25 16:32               ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-25 16:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	jglisse, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Mon, Nov 25, 2019 at 01:24:18PM +0000, Jason Gunthorpe wrote:
>On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:
>
>> > > > Using a temporary range is the pattern from nouveau, is it really
>> > > > necessary in this driver?
>> > >
>> > > Yah, not required. In my local build I tried with proper default_flags
>> > > and set pfn_flags_mask to 0 and it is working fine.
>> >
>> > Sorry, I ment calling hmm_range_register during fault processing.
>> >
>> > If your driver works around user space objects that cover a VA then
>> > the range should be created when the object is created.
>> >
>>
>> Oh ok. No, there is no user space object here.
>> Binding the user space object to device page table is handled in
>> patch 03 of this series (no HMM there).
>> This is for binding a system allocated (malloc) memory. User calls
>> the bind ioctl with the VA range.
>>
>> > > > > +	/*
>> > > > > +	 * No need to dma map the host pages and later unmap them, as
>> > > > > +	 * the GPU is not allowed to access them with SVM. Hence, no need
>> > > > > +	 * for any intermediate data structure to hold the mappings.
>> > > > > +	 */
>> > > > > +	for (i = 0; i < npages; i++) {
>> > > > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
>> > > > > +
>> > > > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
>> > > > > +			sg->length += PAGE_SIZE;
>> > > > > +			sg_dma_len(sg) += PAGE_SIZE;
>> > > > > +			continue;
>> > > > > +		}
>> > > > > +
>> > > > > +		if (sg)
>> > > > > +			sg_page_sizes |= sg->length;
>> > > > > +
>> > > > > +		sg =  sg ? __sg_next(sg) : st->sgl;
>> > > > > +		sg_dma_address(sg) = addr;
>> > > > > +		sg_dma_len(sg) = PAGE_SIZE;
>> > > > > +		sg->length = PAGE_SIZE;
>> > > > > +		st->nents++;
>> > > >
>> > > > It is odd to build the range into a sgl.
>> > > >
>> > > > IMHO it is not a good idea to use the sg_dma_address like this, that
>> > > > should only be filled in by a dma map. Where does it end up being
>> > > > used?
>> > >
>> > > The sgl is used to plug into the page table update function in i915.
>> > >
>> > > For the device memory in discrete card, we don't need dma map which
>> > > is the case here.
>> >
>> > How did we get to device memory on a card? Isn't range->pfns a CPU PFN
>> > at this point?
>> >
>> > I'm confused.
>>
>> Device memory plugin is done through devm_memremap_pages() in patch 07 of
>> this series. In that patch, we convert the CPU PFN to device PFN before
>> building the sgl (this is similar to the nouveau driver).
>
>But earlier just called hmm_range_fault(), it can return all kinds of
>pages. If these are only allowed to be device pages here then that
>must be checked (under lock)
>

Yah, let me add the check.
(I kept it unchecked for debug purposes and forgot to add the check
before sending the patches.)
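
Something along these lines, done under the driver lock (sketch only;
dmem stands for the device memory state added in patch 07):

	for (i = 0; i < npages; i++) {
		struct page *page =
			hmm_device_entry_to_page(range, range->pfns[i]);

		/* only pages from our own ZONE_DEVICE pagemap are allowed */
		if (!page || !is_device_private_page(page) ||
		    page->pgmap != &dmem->pagemap)
			return -EFAULT;
	}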

>And putting the cpu PFN of a ZONE_DEVICE device page into
>sg_dma_address still looks very wrong to me
>

The call below, added in patch 7, converts any CPU PFN to a device
address, so it won't be a CPU PFN:
i915_dmem_convert_pfn(vm->i915, &range);

Also, the only reason to use an sgl is that the i915 page table update
functions take an sgl; otherwise we could deal directly with the
range.pfns array.

Niranjana

>Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-25 16:33       ` Jerome Glisse
  0 siblings, 0 replies; 111+ messages in thread
From: Jerome Glisse @ 2019-11-25 16:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	jon.bloomfield, daniel.vetter, Niranjana Vishwanathapura,
	ira.weiny, dan.j.williams

On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:

[...]

> > +static int
> > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > +{
> > +	long ret;
> > +
> > +	range->default_flags = 0;
> > +	range->pfn_flags_mask = -1UL;
> > +
> > +	ret = hmm_range_register(range, &svm->mirror);
> > +	if (ret) {
> > +		up_read(&svm->mm->mmap_sem);
> > +		return (int)ret;
> > +	}
> 
> 
> Using a temporary range is the pattern from nouveau, is it really
> necessary in this driver?

Just to comment on this: for a GPU, the usage model is not that the
application registers the ranges of virtual addresses it wants to use.
Rather, the GPU can access _any_ valid CPU address just like the CPU
would (modulo mmap of the device file).

This is because the API you want in userspace is the application passing
an arbitrary pointer to the GPU and the GPU being able to chase down any
kind of pointer chain (assuming all pointers are valid, i.e. pointing to
valid virtual addresses for the process).

This is unlike the RDMA case.


That being said, for best performance we still expect a well behaved
application to provide hints to the kernel, so that we know whether a
range of virtual addresses is likely to be used by the GPU or not. But
this is not, and should not be, a requirement.


I have posted a patchset and given talks about this, but long term I
believe we want a common API to manage the hints provided by userspace
(see my talk at LPC this year about a new syscall to bind memory to a
device). With such a thing in place we could hang the mmu notifier range
off it. But the driver will still need to handle the case where no hint
is provided by userspace, and thus there is no prior knowledge of what
VA might be accessed.

Cheers,
Jérôme

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-25 16:34               ` Jerome Glisse
  0 siblings, 0 replies; 111+ messages in thread
From: Jerome Glisse @ 2019-11-25 16:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, daniel.vetter,
	ira.weiny, dan.j.williams

On Mon, Nov 25, 2019 at 01:24:18PM +0000, Jason Gunthorpe wrote:
> On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:
> 
> > > > > Using a temporary range is the pattern from nouveau, is it really
> > > > > necessary in this driver?
> > > > 
> > > > Yah, not required. In my local build I tried with proper default_flags
> > > > and set pfn_flags_mask to 0 and it is working fine.
> > > 
> > > Sorry, I meant calling hmm_range_register during fault processing.
> > > 
> > > If your driver works around user space objects that cover a VA then
> > > the range should be created when the object is created.
> > > 
> > 
> > Oh ok. No, there is no user space object here.
> > Binding the user space object to device page table is handled in
> > patch 03 of this series (no HMM there).
> > This is for binding system-allocated (malloc) memory. The user calls
> > the bind ioctl with the VA range.
> > 
> > > > > > +	/*
> > > > > > +	 * No need to dma map the host pages and later unmap them, as
> > > > > > +	 * the GPU is not allowed to access them with SVM. Hence, no need
> > > > > > +	 * for any intermediate data structure to hold the mappings.
> > > > > > +	 */
> > > > > > +	for (i = 0; i < npages; i++) {
> > > > > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
> > > > > > +
> > > > > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
> > > > > > +			sg->length += PAGE_SIZE;
> > > > > > +			sg_dma_len(sg) += PAGE_SIZE;
> > > > > > +			continue;
> > > > > > +		}
> > > > > > +
> > > > > > +		if (sg)
> > > > > > +			sg_page_sizes |= sg->length;
> > > > > > +
> > > > > > +		sg =  sg ? __sg_next(sg) : st->sgl;
> > > > > > +		sg_dma_address(sg) = addr;
> > > > > > +		sg_dma_len(sg) = PAGE_SIZE;
> > > > > > +		sg->length = PAGE_SIZE;
> > > > > > +		st->nents++;
> > > > >
> > > > > It is odd to build the range into a sgl.
> > > > >
> > > > > IMHO it is not a good idea to use the sg_dma_address like this, that
> > > > > should only be filled in by a dma map. Where does it end up being
> > > > > used?
> > > > 
> > > > The sgl is used to plug into the page table update function in i915.
> > > > 
> > > > For the device memory in discrete card, we don't need dma map which
> > > > is the case here.
> > > 
> > > How did we get to device memory on a card? Isn't range->pfns a CPU PFN
> > > at this point?
> > > 
> > > I'm confused.
> > 
> > Device memory plugin is done through devm_memremap_pages() in patch 07 of
> > this series. In that patch, we convert the CPU PFN to device PFN before
> > building the sgl (this is similar to the nouveau driver).
> 
> But earlier just called hmm_range_fault(), it can return all kinds of
> pages. If these are only allowed to be device pages here then that
> must be checked (under lock)
> 
> And putting the cpu PFN of a ZONE_DEVICE device page into
> sg_dma_address still looks very wrong to me

Yeah, nouveau has a different code path, but that is because the nouveau
driver architecture allows it; I do not see any easy way to hammer this
into the current i915 architecture. I will ponder this a bit
more.

Cheers,
Jérôme

* Re: [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-11-25 16:40       ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-25 16:40 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
>Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
>> Shared Virtual Memory (SVM) runtime allocator support allows
>> binding a shared virtual address to a buffer object (BO) in the
>> device page table through an ioctl call.
>
>The ioctl though is not svm specific, it is to do with "bulk residency"
>and can be used to reduce execbuf traffic to provide virtual address
>layout controls to e.g. Vulkan clients.
>
>I915_VM_BIND {
>	uint32_t vm_id;
>	int32_t fd; /* or -1 for anon, or buf depending on flags */
>	uint64_t flags;
>	uint64_t offset; /* offset into fd [page aligned] */
>	uint64_t length; /* page aligned */
>	uint64_t iova; /* page aligned */
>	uint64_t extensions;
>}; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
>
>as I recall. I also recall it being part of a future command stream
>interface to reduce ioctls, but that is another story.

Thanks Chris.
I will change I915_BIND to I915_VM_BIND.
Currently it only addresses binding SVM system (buffer) and runtime (BO)
allocations, but it can be expanded for other bindings. I have a 'type' field
instead of 'fd', and 'extensions' & 'iova' can be added later if required.
Is that OK?
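
For the record, here is roughly how userspace might drive such a bind,
following the struct sketched above; the ioctl request number, struct tag
and field values are all hypothetical (needs <sys/ioctl.h>, <stdio.h>):

struct i915_vm_bind bind = {
	.vm_id  = vm_id,		/* from GEM VM create */
	.fd     = -1,			/* anon, per the flags comment */
	.flags  = 0,
	.offset = 0,
	.length = 2 * 1024 * 1024,	/* page aligned */
	.iova   = gpu_va,		/* page aligned GPU virtual address */
};

if (ioctl(drm_fd, DRM_IOCTL_I915_VM_BIND, &bind))
	perror("vm_bind");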

>-Chris

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-25 21:29         ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-25 21:29 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	Jason Gunthorpe, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
>On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
>> On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
>
>[...]
>
>> > +static int
>> > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
>> > +{
>> > +	long ret;
>> > +
>> > +	range->default_flags = 0;
>> > +	range->pfn_flags_mask = -1UL;
>> > +
>> > +	ret = hmm_range_register(range, &svm->mirror);
>> > +	if (ret) {
>> > +		up_read(&svm->mm->mmap_sem);
>> > +		return (int)ret;
>> > +	}
>>
>>
>> Using a temporary range is the pattern from nouveau, is it really
>> necessary in this driver?
>
>Just to comment on this: for a GPU the usage model is not that the
>application registers the ranges of virtual addresses it wants to use.
>Rather, the GPU can access _any_ valid CPU address just like the CPU
>would (modulo mmap of the device file).
>
>This is because the API you want in userspace is the application passing
>an arbitrary pointer to the GPU, and the GPU being able to chase down any
>kind of pointer chain (assuming it is all valid, i.e. pointing to valid
>virtual addresses of the process).
>
>This is unlike the RDMA case.
>
>
>That being said, for best performance we still expect a well-behaved
>application to provide hints to the kernel, so that we know whether a range
>of virtual addresses is likely to be used by the GPU or not. But this is
>not, and should not be, a requirement.
>
>
>I have posted a patchset and given talks about this, but long term I believe
>we want a common API to manage hints provided by userspace (see my talk at
>LPC this year about a new syscall to bind memory to a device). With such a
>thing in place we could hang the mmu notifier range off it. But the driver
>will still need to handle the case where there is no hint provided by
>userspace, and thus no advance knowledge of which VAs might be accessed.
>

Thanks Jerome for the explanation; I will check out your LPC talk.
Yes, I agree. When GPU faulting support is available, the driver will handle
the fault, migrate the page if needed, and bind the page using HMM.
This patch series adds support for prefetch and bind hints (via explicit ioctls).
Also, patch 12 of the series gives the user the ability to enable/disable SVM
on a per-VM basis; by default SVM is disabled.

Niranjana

>Cheers,
>Jérôme
>

* Re: [RFC 02/13] drm/i915/svm: Define SVM UAPI
@ 2019-11-26 13:44     ` Joonas Lahtinen
  0 siblings, 0 replies; 111+ messages in thread
From: Joonas Lahtinen @ 2019-11-26 13:44 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Quoting Niranjana Vishwanathapura (2019-11-22 22:57:23)
> Define UAPI for Shared Virtual Memory (SVM) functionality including
> SVM capability and configurability.
> When there is no GPU page fault support available, add ioctls
> to explicitly bind and migrate virtual address ranges.
> 
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

Having this as a separate commit ahead of the functionality breaks
bisecting.

The uAPI should be exposed in similarly sized chunks as it is implemented.
If it has to go in as a single patch, that should be after all the
plumbing is already in place.

That also gives a clear indication to anybody backporting that they
need to backport up to the uAPI patch.

Regards, Joonas

* Re: [Intel-gfx] [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-11-26 13:59         ` Joonas Lahtinen
  0 siblings, 0 replies; 111+ messages in thread
From: Joonas Lahtinen @ 2019-11-26 13:59 UTC (permalink / raw)
  To: Chris Wilson, Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

Quoting Niranjan Vishwanathapura (2019-11-25 18:40:57)
> On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
> >Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
> >> Shared Virtual Memory (SVM) runtime allocator support allows
> >> binding a shared virtual address to a buffer object (BO) in the
> >> device page table through an ioctl call.
> >
> >The ioctl though is not svm specific, it is to do with "bulk residency"
> >and can be used to reduce execbuf traffic to provide virtual address
> >layout controls to e.g. Vulkan clients.
> >
> >I915_VM_BIND {
> >       uint32_t vm_id;
> >       int32_t fd; /* or -1 for anon, or buf depending on flags */
> >       uint64_t flags;
> >       uint64_t offset; /* offset into fd [page aligned] */
> >       uint64_t length; /* page aligned */
> >       uint64_t iova; /* page aligned */
> >       uint64_t extensions;
> >}; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
> >
> >as I recall. I also recall it being part of a future command stream
> >interface to reduce ioctls, but that is another story.
> 
> Thanks Chris.
> I will change I915_BIND to I915_VM_BIND.

We're very much depending on GEM VM even if BOs wouldn't be used,
so this is best called I915_GEM_VM_BIND to match I915_GEM_VM_CREATE
and avoid user confusion.

> Currently it only addresses binding SVM system (buffer) and runtime (BO)
> allocations, but it can be expanded for other bindings. I have a 'type' field
> instead of 'fd', and 'extensions' & 'iova' can be added later if required.

We should try to get the uAPI as final as possible, as early as possible.
The change process is harder the later it is done, even for an RFC :)

On the same note, I'm inclined to have I915_SVM_MIGRATE called
I915_GEM_VM_PREFAULT from the start, as I have already gotten some
confused questions from folks who mix it up with memory pressure
initiated migration. And it's inherently racy, as anybody could
race with it, so PREFAULT would give an impression of that.

And I think I915_GEM_VM_PREFAULT would be a generally applicable
function, just like I915_GEM_VM_BIND. So we should use struct
member names that are close to I915_GEM_VM_BIND's.

Any better ideas for a name to emphasize the nature of what is being
done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...

Regards, Joonas

> Is that OK?
> 
> >-Chris

* Re: [RFC 00/13] drm/i915/svm: Add SVM support
@ 2019-11-26 14:03   ` Joonas Lahtinen
  0 siblings, 0 replies; 111+ messages in thread
From: Joonas Lahtinen @ 2019-11-26 14:03 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

Quoting Niranjana Vishwanathapura (2019-11-22 22:57:21)
> Shared Virtual Memory (SVM) allows the programmer to use a single virtual
> address space which will be shared between threads executing on CPUs and GPUs.
> It abstracts away from the user the location of the backing memory, and hence
> simplifies the user programming model.
> SVM supports two types of virtual memory allocation methods.
> Runtime allocator requires the driver to provide memory allocation and
> management interface, like buffer object (BO) interface.
> Whereas system allocator makes use of default OS memory allocation and
> management support like malloc().
> 
> This patch series adds both SVM system and runtime allocator support
> to i915 driver.
> 
> The patch series includes
>  - SVM support for both system and runtime allocation.
> >  - Plugging in device memory with the Linux kernel.
>  - User API advertising SVM capability and configuration by user on per
>    vm basis.
>  - User API to bind an address range or a BO with a device page table.
>  - User API to migrate an address range to device memory.
>  - Implicit migration by moving pages or BOs back from device to host
>    memory upon CPU access.
>  - CPU copy and blitter copy support for migrating the pages/BOs.
>  - Large page mapping support
>  - Page table dump support.

Link to the IGTs, any selftests?
Link to userspace changes?

Regards, Joonas

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-26 18:32         ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-26 18:32 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, daniel.vetter,
	ira.weiny, dan.j.williams

On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
> On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
> 
> [...]
> 
> > > +static int
> > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > > +{
> > > +	long ret;
> > > +
> > > +	range->default_flags = 0;
> > > +	range->pfn_flags_mask = -1UL;
> > > +
> > > +	ret = hmm_range_register(range, &svm->mirror);
> > > +	if (ret) {
> > > +		up_read(&svm->mm->mmap_sem);
> > > +		return (int)ret;
> > > +	}
> > 
> > 
> > Using a temporary range is the pattern from nouveau, is it really
> > necessary in this driver?
> 
> Just to comment on this: for a GPU the usage model is not that the
> application registers the ranges of virtual addresses it wants to use.
> Rather, the GPU can access _any_ valid CPU address just like the CPU
> would (modulo mmap of the device file).
> 
> This is because the API you want in userspace is the application passing
> an arbitrary pointer to the GPU, and the GPU being able to chase down any
> kind of pointer chain (assuming it is all valid, i.e. pointing to valid
> virtual addresses of the process).
> 
> This is unlike the RDMA case.

No, RDMA has the full-address-space option as well; it is called
'implicit ODP'.

This is implemented by registering ranges at a level in our page
table (currently 512G) whenever that level has populated pages below
it.

I think this is a better strategy than temporary ranges.
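
Roughly, the implicit scheme looks like the sketch below; the constant and
helper are illustrative, not lifted from any driver:

#define IMPLICIT_RANGE_SHIFT	39	/* one 512GiB, PML4-level slot */
#define IMPLICIT_RANGE_SIZE	(1UL << IMPLICIT_RANGE_SHIFT)

/* register a mirror range covering the whole 512GiB slot the first time
 * a fault lands in it, instead of a temporary range per fault */
static unsigned long implicit_range_start(unsigned long addr)
{
	return addr & ~(IMPLICIT_RANGE_SIZE - 1);
}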

But other GPU drivers like AMD use BO and TTM objects with fixed
VA ranges, and the range is tied to the BO/TTM.

I'm not sure if this i915 usage is closer to AMD or closer to nouveau.

Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-26 18:45                 ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-26 18:45 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Mon, Nov 25, 2019 at 08:32:58AM -0800, Niranjan Vishwanathapura wrote:
> > And putting the cpu PFN of a ZONE_DEVICE device page into
> > sg_dma_address still looks very wrong to me
> 
> The call below in patch 7 does convert any CPU PFN to a device address,
> so it won't be a CPU PFN:
> i915_dmem_convert_pfn(vm->i915, &range);

Well, then it should be done here in patch 6, surely?

> Also, the only reason to use an sgl is that the i915 page table update
> functions take an sgl; otherwise we could work with the range.pfns array directly.

Maybe that should be fixed instead of abusing the sgl.
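
One possible shape for that fix, assuming the page table update path could
consume the HMM pfn array directly; the function and the insert_pte() helper
are hypothetical:

static int i915_vm_insert_pfns(struct i915_address_space *vm, u64 start,
			       const u64 *pfns, unsigned long npages,
			       unsigned int pfn_shift)
{
	unsigned long i;

	for (i = 0; i < npages; i++) {
		/* same masking as the quoted loop, but no sgl involved */
		u64 addr = pfns[i] & ~((1UL << pfn_shift) - 1);

		insert_pte(vm, start + ((u64)i << PAGE_SHIFT), addr);
	}
	return 0;
}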

Jason

* Re: [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-11-27 19:23           ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-27 19:23 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

On Tue, Nov 26, 2019 at 03:59:31PM +0200, Joonas Lahtinen wrote:
>Quoting Niranjan Vishwanathapura (2019-11-25 18:40:57)
>> On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
>> >Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
>> >> Shared Virtual Memory (SVM) runtime allocator support allows
>> >> binding a shared virtual address to a buffer object (BO) in the
>> >> device page table through an ioctl call.
>> >
>> >The ioctl though is not svm specific, it is to do with "bulk residency"
>> >and can be used to reduce execbuf traffic to provide virtual address
>> >layout controls to e.g. Vulkan clients.
>> >
>> >I915_VM_BIND {
>> >       uint32_t vm_id;
>> >       int32_t fd; /* or -1 for anon, or buf depending on flags */
>> >       uint64_t flags;
>> >       uint64_t offset; /* offset into fd [page aligned] */
>> >       uint64_t length; /* page aligned */
>> >       uint64_t iova; /* page aligned */
>> >       uint64_t extensions;
>> >}; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
>> >
>> >as I recall. I also recall it being part of a future command stream
>> >interface to reduce ioctls, but that is another story.
>>
>> Thanks Chris.
>> I will change I915_BIND to I915_VM_BIND.
>
>We're very much depending on GEM VM even if BOs wouldn't be used,
>so this is best called I915_GEM_VM_BIND to match I915_GEM_VM_CREATE
>and avoid user confusion.
>

Thanks Joonas.
OK, makes sense; I will name it as such.

>> Currently it only addresses binding SVM system (buffer) and runtime (BO)
>> allocations, but it can be expanded for other bindings. I have a 'type' field
>> instead of 'fd', and 'extensions' & 'iova' can be added later if required.
>
>We should try to get the uAPI as final as possible, as early as possible.
>The change process is harder the later it is done, even for an RFC :)
>
>On the same note, I'm inclined to have I915_SVM_MIGRATE called
>I915_GEM_VM_PREFAULT from the start, as I have already gotten some
>confused questions from folks who mix it up with memory pressure
>initiated migration. And it's inherently racy, as anybody could
>race with it, so PREFAULT would give an impression of that.
>
>And I think I915_GEM_VM_PREFAULT would be a generally applicable
>function, just like I915_GEM_VM_BIND. So we should use struct
>member names that are close to I915_GEM_VM_BIND's.
>
>Any better ideas for a name to emphasize the nature of what is being
>done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
>

Though the current patchset only supports migrating pages from host to
device memory, I intend to support migrating from device to host memory as
well with the same ioctl; users will want that. Not sure what a good name
would be then: _MIGRATE/_PREFETCH/_MOVE?

Also, migrating GEM objects is currently handled by a separate ioctl which
is part of the LMEM patch series. I am open to merging these ioctls together
(similar to VM_BIND) into a single MIGRATE ioctl.
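
A hedged sketch of what such a direction-agnostic payload could look like;
the struct tag, fields and region encoding are tentative, not from the series:

struct drm_i915_gem_vm_prefetch {
	__u32 vm_id;
	__u32 region;	/* destination: a device memory instance or
			 * system memory, so it works both ways */
	__u64 start;	/* page-aligned VA */
	__u64 length;	/* page-aligned */
};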

Niranjana

>Regards, Joonas
>
>> Is that OK?
>>
>> >-Chris

* Re: [RFC 02/13] drm/i915/svm: Define SVM UAPI
@ 2019-11-28  2:04       ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-28  2:04 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny, jgg

On Tue, Nov 26, 2019 at 03:44:26PM +0200, Joonas Lahtinen wrote:
>Quoting Niranjana Vishwanathapura (2019-11-22 22:57:23)
>> Define UAPI for Shared Virtual Memory (SVM) functionality including
>> SVM capability and configurability.
>> When there is no GPU page fault support available, add ioctls
>> to explicitly bind and migrate virtual address ranges.
>>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>
>Having this as a separate commit ahead of the functionality breaks
>bisecting.
>
>The uAPI should be exposed in similarly sized chunks as it's implemented.
>If it has to go in as a single patch, that should be after all the
>plumbing is already in place.
>
>That also gives a clear indication to anybody backporting that you
>need to backport until the uAPI patch.
>

Ok, I will include the ioctls in the relevant patches instead of this separate patch.

Thanks,
Niranjana

>Regards, Joonas

* Re: [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-11-28 12:12             ` Joonas Lahtinen
  0 siblings, 0 replies; 111+ messages in thread
From: Joonas Lahtinen @ 2019-11-28 12:12 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

Quoting Niranjan Vishwanathapura (2019-11-27 21:23:56)
> >We should try to have the uAPI as final as early as possible. The
> >change process is harder the later it is done, even for RFC :)
> >
> >On the same note, I'm inclined to have I915_SVM_MIGRATE called
> >I915_GEM_VM_PREFAULT from the start, as I have already got some
> >confused questions from folks who mix it up with memory-pressure-
> >initiated migration. And it's inherently racy, as anybody could
> >race with it, so PREFAULT would convey that impression.
> >
> >And I think I915_GEM_VM_PREFAULT would be a generally applicable
> >function, just like I915_GEM_VM_BIND. So we should use struct
> >member names that are close to those of I915_GEM_VM_BIND.
> >
> >Better ideas for a name to emphasize the nature of what is being
> >done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
> >
> 
> Though the current patchset only supports migrating pages from
> host to device memory, I intend to support migrating from device
> to host memory as well with the same ioctl. Users will want that.
> Not sure what would be a good name then; _MIGRATE/_PREFETCH/_MOVE?

In the HMM concept, a CPU access would trigger a fault and trigger
the transition, wouldn't it? But you're correct that it is kind of
tied to the HMM concept, and may be confusing for BOs.

_PREFETCH is a good suggestion for the term, which led to a
discussion about avoiding an explosion of IOCTLs. Chris suggested
consolidation; maybe we should have I915_GEM_VM_{M,}ADVISE?

If we're looking at connections to fadvise(2), we're basically
talking about the equivalent of FADV_WILLNEED. That concept would
be quite familiar to users. GEM_VM_{M,}ADVISE with WILLNEED
and explicitly passing the memory region? Because we can't decipher
that from the running thread like the CPU can.

Thoughts?
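To make the fadvise(2) analogy concrete, a rough sketch of what that could
look like; the struct, the advice value and the region field are all
illustrative guesses, not a settled interface:

/* Hypothetical sketch of GEM_VM_ADVISE, loosely mirroring
 * posix_fadvise(POSIX_FADV_WILLNEED); all names are illustrative. */
#include <stdint.h>

#define I915_VM_ADVISE_WILLNEED	1	/* prefetch range into 'region' */

struct drm_i915_gem_vm_advise {
	uint32_t vm_id;
	uint32_t advice;	/* I915_VM_ADVISE_* */
	uint64_t start;		/* [page aligned] */
	uint64_t length;	/* [page aligned] */
	uint32_t region;	/* explicit target; cannot be inferred as on CPU */
	uint32_t pad;
};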

> Also, migrating GEM objects is currently handled by a separate ioctl
> which is part of the LMEM patch series. I am open to merging these
> ioctls together (similar to VM_BIND) into a single MIGRATE ioctl.

The IOCTL in the LMEM series is about setting the allowed backing
store types of a BO, not about residency. There was some
discussion around doing explicit migrations by changing that list.
Current thinking is that we only need to allow setting it once,
at creation. That also means it might be convertible to a
creation-time-only property.

That'd eliminate the need for the BO freeze IOCTL that was discussed
with Mesa folks.

Regards, Joonas

* Re: [RFC 00/13] drm/i915/svm: Add SVM support
@ 2019-12-02 19:37     ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-12-02 19:37 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny, jgg

On Tue, Nov 26, 2019 at 04:03:21PM +0200, Joonas Lahtinen wrote:
>Quoting Niranjana Vishwanathapura (2019-11-22 22:57:21)
>> Shared Virtual Memory (SVM) allows the programmer to use a single virtual
>> address space which will be shared between threads executing on CPUs and GPUs.
>> It abstracts away from the user the location of the backing memory, and hence
>> simplifies the user programming model.
>> SVM supports two types of virtual memory allocation methods.
>> Runtime allocator requires the driver to provide memory allocation and
>> management interface, like buffer object (BO) interface.
>> Whereas system allocator makes use of default OS memory allocation and
>> management support like malloc().
>>
>> This patch series adds both SVM system and runtime allocator support
>> to i915 driver.
>>
>> The patch series includes
>>  - SVM support for both system and runtime allocation.
>>  - Plugin in device memory with the Linux kernel.
>>  - User API advertising SVM capability and configuration by user on per
>>    vm basis.
>>  - User API to bind an address range or a BO with a device page table.
>>  - User API to migrate an address range to device memory.
>>  - Implicit migration by moving pages or BOs back from device to host
>>    memory upon CPU access.
>>  - CPU copy and blitter copy support for migrating the pages/BOs.
>>  - Large page mapping support
>>  - Page table dump support.
>
>Link to the IGTs, any selftests?
>Link to userspace changes?

I have some IGT tests; I will post them soon.

Niranjana

>
>Regards, Joonas

* Re: [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support
@ 2019-12-02 19:59               ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-12-02 19:59 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse, jgg,
	daniel.vetter, dan.j.williams, ira.weiny

On Thu, Nov 28, 2019 at 02:12:30PM +0200, Joonas Lahtinen wrote:
>Quoting Niranjan Vishwanathapura (2019-11-27 21:23:56)
>> >We should try to have the uAPI as final as early as possible. The
>> >change process is harder the later it is done, even for RFC :)
>> >
>> >On the same note, I'm inclined to have I915_SVM_MIGRATE called
>> >I915_GEM_VM_PREFAULT from the start, as I have already got some
>> >confused questions from folks who mix it up with memory-pressure-
>> >initiated migration. And it's inherently racy, as anybody could
>> >race with it, so PREFAULT would convey that impression.
>> >
>> >And I think I915_GEM_VM_PREFAULT would be a generally applicable
>> >function, just like I915_GEM_VM_BIND. So we should use struct
>> >member names that are close to those of I915_GEM_VM_BIND.
>> >
>> >Better ideas for a name to emphasize the nature of what is being
>> >done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
>> >
>>
>> Though the current patchset only supports migrating pages from
>> host to device memory, I intend to support migrating from device
>> to host memory as well with the same ioctl. Users will want that.
>> Not sure what would be a good name then; _MIGRATE/_PREFETCH/_MOVE?
>
>In the HMM concept, a CPU access would trigger a fault and trigger
>the transition, wouldn't it? But you're correct that it is kind of
>tied to the HMM concept, and may be confusing for BOs.
>

Yes, it does. But I think we should give the user a mechanism to explicitly
migrate/prefetch it back to system memory.

>_PREFETCH is a good suggestion for the term, which led to a
>discussion about avoiding an explosion of IOCTLs. Chris suggested
>consolidation; maybe we should have I915_GEM_VM_{M,}ADVISE?
>
>If we're looking at connections to fadvise(2), we're basically
>talking about the equivalent of FADV_WILLNEED. That concept would
>be quite familiar to users. GEM_VM_{M,}ADVISE with WILLNEED
>and explicitly passing the memory region? Because we can't decipher
>that from the running thread like the CPU can.
>
>Thoughts?

Yeah, it is closer to mbind(2) (instead of a nodemask, we specify memory
region(s)). So, I915_GEM_VM_MBIND? I am OK with _PREFETCH also.
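For reference, the mbind(2) analogy in code; this is the existing Linux NUMA
call, and the proposed ioctl would replace the node mask with a memory
region list:

/* mbind(2) as it exists today, for comparison with the proposed ioctl. */
#include <numaif.h>

void bind_to_node0(void *addr, unsigned long len)
{
	unsigned long nodemask = 1UL << 0;	/* node 0 */

	/* maxnode is the number of bits in the mask;
	 * error handling omitted in this sketch. */
	mbind(addr, len, MPOL_BIND, &nodemask, sizeof(nodemask) * 8, 0);
}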

>
>> Also, migrating GEM objects is currently handled by a separate ioctl
>> which is part of the LMEM patch series. I am open to merging these
>> ioctls together (similar to VM_BIND) into a single MIGRATE ioctl.
>
>The IOCTL in the LMEM series is about setting the allowed backing
>store types of a BO, not about residency. There was some
>discussion around doing explicit migrations by changing that list.
>Current thinking is that we only need to allow setting it once,
>at creation. That also means it might be convertible to a
>creation-time-only property.
>
>That'd eliminate the need for the BO freeze IOCTL that was discussed
>with Mesa folks.
>

Ok.

Thanks,
Niranjana

>Regards, Joonas

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
  2019-11-26 18:45                 ` Jason Gunthorpe
@ 2019-12-03 19:07                   ` Niranjan Vishwanathapura
  -1 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-12-03 19:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	jglisse, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Tue, Nov 26, 2019 at 06:45:14PM +0000, Jason Gunthorpe wrote:
>On Mon, Nov 25, 2019 at 08:32:58AM -0800, Niranjan Vishwanathapura wrote:
>> > And putting the cpu PFN of a ZONE_DEVICE device page into
>> > sg_dma_address still looks very wrong to me
>>
>> The below call in patch 7 does convert any CPU PFN to a device address.
>> So, it won't be a CPU PFN.
>> i915_dmem_convert_pfn(vm->i915, &range);
>
>Well, then it should be done here in patch 6, surely?
>
>> Also, the only reason to use an sgl is that the i915 driver page table update
>> functions take an sgl; otherwise we could deal directly with the range.pfns array.
>
>Maybe that should be fixed instead of abusing sgl
>

Sorry, missed these comments.

Well, this i915 SVM support can also be extended to work on Intel integrated
graphics with host memory (no ZONE_DEVICE), though enabling that is not the
primary focus of this patch series.
There we need to DMA-map these CPU PFNs and assign the result to sg_dma_address.
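A minimal sketch of that step, assuming the HMM API of this kernel era
(range->pfns plus hmm_device_entry_to_page()); i915_svm_dma_map_range() is a
made-up helper name, and error unwinding is omitted:

/* Sketch only: DMA-map the CPU pages backing an hmm_range. */
#include <linux/hmm.h>
#include <linux/dma-mapping.h>

static int i915_svm_dma_map_range(struct device *dev,
				  struct hmm_range *range,
				  dma_addr_t *dma_addrs)
{
	unsigned long i, npages = (range->end - range->start) >> PAGE_SHIFT;

	for (i = 0; i < npages; i++) {
		/* valid system page backing this pfn entry, if any */
		struct page *page =
			hmm_device_entry_to_page(range, range->pfns[i]);

		if (!page)
			return -EFAULT;

		dma_addrs[i] = dma_map_page(dev, page, 0, PAGE_SIZE,
					    DMA_BIDIRECTIONAL);
		if (dma_mapping_error(dev, dma_addrs[i]))
			return -EFAULT;
	}

	return 0;
}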

The device memory plugin for discrete graphics is added in patch 7 of this
series, hence this convert_pfn function is added there.

Niranjana

>Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
  2019-11-26 18:32         ` Jason Gunthorpe
@ 2019-12-03 19:19           ` Niranjan Vishwanathapura
  -1 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-12-03 19:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	Jerome Glisse, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
>On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
>> On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
>> > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
>>
>> [...]
>>
>> > > +static int
>> > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
>> > > +{
>> > > +	long ret;
>> > > +
>> > > +	range->default_flags = 0;
>> > > +	range->pfn_flags_mask = -1UL;
>> > > +
>> > > +	ret = hmm_range_register(range, &svm->mirror);
>> > > +	if (ret) {
>> > > +		up_read(&svm->mm->mmap_sem);
>> > > +		return (int)ret;
>> > > +	}
>> >
>> >
>> > Using a temporary range is the pattern from nouveau, is it really
>> > necessary in this driver?
>>
>> Just to comment on this: for GPUs the usage model is not that the
>> application registers the ranges of virtual addresses it wants to
>> use. It is that the GPU can access _any_ valid CPU address just like
>> the CPU would (modulo mmap of the device file).
>>
>> This is because the API you want in userspace is the application
>> passing a random pointer to the GPU and the GPU being able to chase
>> down any kind of random pointer chain (assuming all pointers are
>> valid, i.e. pointing to valid virtual addresses for the process).
>>
>> This is unlike the RDMA case.
>
>No, RDMA has the full address space option as well; it is called
>'implicit ODP'.
>
>This is implemented by registering ranges at a level in our page
>table (currently 512G) whenever that level has populated pages below
>it.
>
>I think it is a better strategy than temporary ranges.
>
>But other GPU drivers like AMD are using BO and TTM objects with fixed
>VA ranges and the range is tied to the BO/TTM.
>
>I'm not sure if this i915 usage is closer to AMD or closer to nouveau.
>

I don't know the full details of the HMM use cases in amd and nouveau.
AMD seems to be using it for userptr objects, which are tied to a BO.
I am not sure if nouveau has any BOs tied to these address ranges.

Niranjana

>Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
  2019-12-03 19:19           ` [Intel-gfx] " Niranjan Vishwanathapura
@ 2019-12-04 21:51             ` Jerome Glisse
  -1 siblings, 0 replies; 111+ messages in thread
From: Jerome Glisse @ 2019-12-04 21:51 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	Jason Gunthorpe, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Tue, Dec 03, 2019 at 11:19:43AM -0800, Niranjan Vishwanathapura wrote:
> On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
> > On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
> > > On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> > > > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
> > > 
> > > [...]
> > > 
> > > > > +static int
> > > > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > > > > +{
> > > > > +	long ret;
> > > > > +
> > > > > +	range->default_flags = 0;
> > > > > +	range->pfn_flags_mask = -1UL;
> > > > > +
> > > > > +	ret = hmm_range_register(range, &svm->mirror);
> > > > > +	if (ret) {
> > > > > +		up_read(&svm->mm->mmap_sem);
> > > > > +		return (int)ret;
> > > > > +	}
> > > >
> > > >
> > > > Using a temporary range is the pattern from nouveau, is it really
> > > > necessary in this driver?
> > > 
> > > Just to comment on this: for GPUs the usage model is not that the
> > > application registers the ranges of virtual addresses it wants to
> > > use. It is that the GPU can access _any_ valid CPU address just like
> > > the CPU would (modulo mmap of the device file).
> > > 
> > > This is because the API you want in userspace is the application
> > > passing a random pointer to the GPU and the GPU being able to chase
> > > down any kind of random pointer chain (assuming all pointers are
> > > valid, i.e. pointing to valid virtual addresses for the process).
> > > 
> > > This is unlike the RDMA case.
> > 
> > No, RDMA has the full address space option as well; it is called
> > 'implicit ODP'.
> > 
> > This is implemented by registering ranges at a level in our page
> > table (currently 512G) whenever that level has populated pages below
> > it.
> > 
> > I think it is a better strategy than temporary ranges.

HMM's original design did not have ranges and was well suited to nouveau.
Recent changes make it more tied to ranges and less suited to nouveau.
I would not consider a 512GB implicit range a good thing. The plan I have
is to create implicit ranges and align them on VMAs. I do not know when I
will have time to get to that.

> > 
> > But other GPU drivers like AMD are using BO and TTM objects with fixed
> > VA ranges and the range is tied to the BO/TTM.
> > 
> > I'm not sure if this i915 usage is closer to AMD or closer to nouveau.
> > 
> 
> I don't know the full details of the HMM use cases in amd and nouveau.
> AMD seems to be using it for userptr objects, which are tied to a BO.
> I am not sure if nouveau has any BOs tied to these address ranges.

It is closer to nouveau; the AMD usage is the old userptr use case where
you have a BO tied to a range. SVM means any valid CPU address, and thus
implies that there is no BO tied to it (though there is still a BO use
case that must keep working here).

Cheers,
Jérôme


* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
  2019-12-04 21:51             ` [Intel-gfx] " Jerome Glisse
@ 2019-12-08 18:23               ` Jason Gunthorpe
  -1 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-12-08 18:23 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	jon.bloomfield, daniel.vetter, Niranjan Vishwanathapura,
	ira.weiny, dan.j.williams

On Wed, Dec 04, 2019 at 04:51:36PM -0500, Jerome Glisse wrote:
> On Tue, Dec 03, 2019 at 11:19:43AM -0800, Niranjan Vishwanathapura wrote:
> > On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
> > > On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
> > > > On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> > > > > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > > +static int
> > > > > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > > > > > +{
> > > > > > +	long ret;
> > > > > > +
> > > > > > +	range->default_flags = 0;
> > > > > > +	range->pfn_flags_mask = -1UL;
> > > > > > +
> > > > > > +	ret = hmm_range_register(range, &svm->mirror);
> > > > > > +	if (ret) {
> > > > > > +		up_read(&svm->mm->mmap_sem);
> > > > > > +		return (int)ret;
> > > > > > +	}
> > > > >
> > > > >
> > > > > Using a temporary range is the pattern from nouveau, is it really
> > > > > necessary in this driver?
> > > > 
> > > > Just to comment on this: for GPUs the usage model is not that the
> > > > application registers the ranges of virtual addresses it wants to
> > > > use. It is that the GPU can access _any_ valid CPU address just
> > > > like the CPU would (modulo mmap of the device file).
> > > > 
> > > > This is because the API you want in userspace is the application
> > > > passing a random pointer to the GPU and the GPU being able to chase
> > > > down any kind of random pointer chain (assuming all pointers are
> > > > valid, i.e. pointing to valid virtual addresses for the process).
> > > > 
> > > > This is unlike the RDMA case.
> > > 
> > > No, RDMA has the full address space option as well; it is called
> > > 'implicit ODP'.
> > > 
> > > This is implemented by registering ranges at a level in our page
> > > table (currently 512G) whenever that level has populated pages below
> > > it.
> > > 
> > > I think it is a better strategy than temporary ranges.
> 
> HMM's original design did not have ranges and was well suited to nouveau.
> Recent changes make it more tied to ranges and less suited to nouveau.
> I would not consider a 512GB implicit range a good thing. The plan I have
> is to create implicit ranges and align them on VMAs. I do not know when I
> will have time to get to that.

For mlx5 the 512G is aligned to the page table levels, so it is a
reasonable approximation. A GPU could do the same.

I'm not sure a VMA-aligned scheme is really any better.
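For concreteness, snapping a faulting address to its 512G slot (2^39 bytes,
the span of one top-level entry in x86-64 4-level paging) is just a mask; a
tiny illustrative sketch, not code from either driver:

/* Illustrative only: round an address to its 512G (2^39) slot. */
#include <stdint.h>

#define SLOT_SHIFT 39	/* one 4-level top-level entry spans 512G */

static inline uint64_t slot_start(uint64_t addr)
{
	return addr & ~((1ULL << SLOT_SHIFT) - 1);
}

static inline uint64_t slot_end(uint64_t addr)
{
	return slot_start(addr) + (1ULL << SLOT_SHIFT);
}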

Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-22 20:14       ` Jason Gunthorpe
  0 siblings, 0 replies; 111+ messages in thread
From: Jason Gunthorpe @ 2019-11-22 20:14 UTC (permalink / raw)
  To: Niranjan Vishwanathapura
  Cc: intel-gfx, sanjay.k.kumar, dri-devel, dave.hansen, jglisse,
	daniel.vetter, dan.j.williams, ira.weiny

On Fri, Nov 22, 2019 at 12:01:17PM -0800, Niranjan Vishwanathapura wrote:
> On Fri, Nov 22, 2019 at 11:54:45AM -0800, Niranjana Vishwanathapura wrote:
> > Use HMM page table mirroring support to build device page table.
> > Implement the bind ioctl and bind the process address range in the
> > specified context's ppgtt.
> > Handle invalidation notifications by unbinding the address range.
> > 
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> > +int i915_svm_bind_mm(struct i915_address_space *vm)
> > +{
> > +	struct i915_svm *svm;
> > +	struct mm_struct *mm;
> > +	int ret = 0;
> > +
> > +	mm = get_task_mm(current);
> > +	down_write(&mm->mmap_sem);
> > +	mutex_lock(&vm->svm_mutex);
> > +	if (vm->svm)
> > +		goto bind_out;
> > +
> > +	svm = kzalloc(sizeof(*svm), GFP_KERNEL);
> > +	if (!svm) {
> > +		ret = -ENOMEM;
> > +		goto bind_out;
> > +	}
> > +	svm->mirror.ops = &i915_mirror_ops;
> > +	mutex_init(&svm->mutex);
> > +	kref_init(&svm->ref);
> > +	svm->mm = mm;
> > +	svm->vm = vm;
> > +
> > +	ret = hmm_mirror_register(&svm->mirror, mm);
> 
> I saw that these APIs have been removed.
> I will update once it gets included in a kernel release.

I would like to see all the mmu notifier use in i915 updated to use
the new APIs :)

Please cc me when you post patches using the new APIs, I'd like to see
how they are being used.
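
For illustration, a minimal sketch of a conversion to the new
mmu_interval_notifier API (merged in v5.5); the i915_svm fields and
the i915_svm_unbind_range() helper are hypothetical stand-ins, not
code from this series:

#include <linux/mmu_notifier.h>

static bool i915_svm_invalidate(struct mmu_interval_notifier *mni,
				const struct mmu_notifier_range *range,
				unsigned long cur_seq)
{
	/* Assumes i915_svm embeds the notifier and a mutex. */
	struct i915_svm *svm = container_of(mni, struct i915_svm, notifier);

	if (!mmu_notifier_range_blockable(range))
		return false;

	mutex_lock(&svm->mutex);
	mmu_interval_set_seq(mni, cur_seq);
	/* Unbind the invalidated span from the device page table
	 * (hypothetical helper).
	 */
	i915_svm_unbind_range(svm, range->start, range->end);
	mutex_unlock(&svm->mutex);

	return true;
}

static const struct mmu_interval_notifier_ops i915_svm_notifier_ops = {
	.invalidate = i915_svm_invalidate,
};

/* Replaces hmm_mirror_register(): track [start, start + length). */
static int i915_svm_register(struct i915_svm *svm, struct mm_struct *mm,
			     unsigned long start, unsigned long length)
{
	return mmu_interval_notifier_insert(&svm->notifier, mm, start,
					    length, &i915_svm_notifier_ops);
}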

Regards,
Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
  2019-11-22 20:14       ` Jason Gunthorpe
  (?)
@ 2019-11-22 20:11       ` Niranjan Vishwanathapura
  -1 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-22 20:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-gfx, sanjay.k.kumar, sudeep.dutt, dri-devel, dave.hansen,
	jglisse, jon.bloomfield, daniel.vetter, dan.j.williams,
	ira.weiny

On Fri, Nov 22, 2019 at 08:14:51PM +0000, Jason Gunthorpe wrote:
>On Fri, Nov 22, 2019 at 12:01:17PM -0800, Niranjan Vishwanathapura wrote:
>> On Fri, Nov 22, 2019 at 11:54:45AM -0800, Niranjana Vishwanathapura wrote:
>> > Use HMM page table mirroring support to build device page table.
>> > Implement the bind ioctl and bind the process address range in the
>> > specified context's ppgtt.
>> > Handle invalidation notifications by unbinding the address range.
>> >
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>> > Cc: Daniel Vetter <daniel.vetter@intel.com>
>> > Cc: Sudeep Dutt <sudeep.dutt@intel.com>
>> > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> > +int i915_svm_bind_mm(struct i915_address_space *vm)
>> > +{
>> > +	struct i915_svm *svm;
>> > +	struct mm_struct *mm;
>> > +	int ret = 0;
>> > +
>> > +	mm = get_task_mm(current);
>> > +	down_write(&mm->mmap_sem);
>> > +	mutex_lock(&vm->svm_mutex);
>> > +	if (vm->svm)
>> > +		goto bind_out;
>> > +
>> > +	svm = kzalloc(sizeof(*svm), GFP_KERNEL);
>> > +	if (!svm) {
>> > +		ret = -ENOMEM;
>> > +		goto bind_out;
>> > +	}
>> > +	svm->mirror.ops = &i915_mirror_ops;
>> > +	mutex_init(&svm->mutex);
>> > +	kref_init(&svm->ref);
>> > +	svm->mm = mm;
>> > +	svm->vm = vm;
>> > +
>> > +	ret = hmm_mirror_register(&svm->mirror, mm);
>>
>> I saw that these APIs have been removed.
>> I will update once it gets included in a kernel release.
>
>I would like to see all the mmu notifier use in i915 updated to use
>the new APIs :)
>
>Please cc me when you post patches using the new APIs, I'd like to see
>how they are being used.
>

Ok, sure, will do.

Thanks,
Niranjana

>Regards,
>Jason

* Re: [RFC 06/13] drm/i915/svm: Page table mirroring support
@ 2019-11-22 20:01     ` Niranjan Vishwanathapura
  0 siblings, 0 replies; 111+ messages in thread
From: Niranjan Vishwanathapura @ 2019-11-22 20:01 UTC (permalink / raw)
  To: intel-gfx
  Cc: sanjay.k.kumar, dri-devel, dave.hansen, jglisse, daniel.vetter,
	dan.j.williams, ira.weiny, jgg

On Fri, Nov 22, 2019 at 11:54:45AM -0800, Niranjana Vishwanathapura wrote:
>Use HMM page table mirroring support to build device page table.
>Implement the bind ioctl and bind the process address range in the
>specified context's ppgtt.
>Handle invalidation notifications by unbinding the address range.
>
>Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>Cc: Daniel Vetter <daniel.vetter@intel.com>
>Cc: Sudeep Dutt <sudeep.dutt@intel.com>
>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>---
>+int i915_svm_bind_mm(struct i915_address_space *vm)
>+{
>+	struct i915_svm *svm;
>+	struct mm_struct *mm;
>+	int ret = 0;
>+
>+	mm = get_task_mm(current);
>+	down_write(&mm->mmap_sem);
>+	mutex_lock(&vm->svm_mutex);
>+	if (vm->svm)
>+		goto bind_out;
>+
>+	svm = kzalloc(sizeof(*svm), GFP_KERNEL);
>+	if (!svm) {
>+		ret = -ENOMEM;
>+		goto bind_out;
>+	}
>+	svm->mirror.ops = &i915_mirror_ops;
>+	mutex_init(&svm->mutex);
>+	kref_init(&svm->ref);
>+	svm->mm = mm;
>+	svm->vm = vm;
>+
>+	ret = hmm_mirror_register(&svm->mirror, mm);

I saw that these APIs have been removed.
I will update once it gets included in a kernel release.
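
In the meantime, a minimal sketch of the invalidation side of
i915_mirror_ops under the hmm_mirror interface this version uses;
i915_svm_unbind_range() is a hypothetical stand-in for the unbinding
described in the commit message:

static int
i915_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
				const struct mmu_notifier_range *update)
{
	struct i915_svm *svm = container_of(mirror, struct i915_svm, mirror);

	/* Invalidations may not be allowed to block. */
	if (!mmu_notifier_range_blockable(update))
		return -EAGAIN;

	mutex_lock(&svm->mutex);
	/* Unbind the invalidated range (hypothetical helper). */
	i915_svm_unbind_range(svm, update->start, update->end);
	mutex_unlock(&svm->mutex);

	return 0;
}

static const struct hmm_mirror_ops i915_mirror_ops = {
	.sync_cpu_device_pagetables = i915_sync_cpu_device_pagetables,
};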

Niranjana 

>+	if (ret)
>+		goto bind_out;

end of thread, other threads:[~2019-12-09  9:47 UTC | newest]

Thread overview: 111+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-22 20:57 [RFC 00/13] drm/i915/svm: Add SVM support Niranjana Vishwanathapura
2019-11-22 20:57 ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 01/13] drm/i915/svm: Add SVM documentation Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 02/13] drm/i915/svm: Define SVM UAPI Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-26 13:44   ` Joonas Lahtinen
2019-11-26 13:44     ` [Intel-gfx] " Joonas Lahtinen
2019-11-26 13:44     ` Joonas Lahtinen
2019-11-28  2:04     ` Niranjan Vishwanathapura
2019-11-28  2:04       ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-28  2:04       ` Niranjan Vishwanathapura
2019-11-22 20:57 ` [RFC 03/13] drm/i915/svm: Runtime (RT) allocator support Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-25  9:59   ` Chris Wilson
2019-11-25  9:59     ` [Intel-gfx] " Chris Wilson
2019-11-25  9:59     ` Chris Wilson
2019-11-25 16:40     ` Niranjan Vishwanathapura
2019-11-25 16:40       ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-25 16:40       ` Niranjan Vishwanathapura
2019-11-26 13:59       ` Joonas Lahtinen
2019-11-26 13:59         ` Joonas Lahtinen
2019-11-27 19:23         ` Niranjan Vishwanathapura
2019-11-27 19:23           ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-27 19:23           ` Niranjan Vishwanathapura
2019-11-28 12:12           ` Joonas Lahtinen
2019-11-28 12:12             ` [Intel-gfx] " Joonas Lahtinen
2019-11-28 12:12             ` Joonas Lahtinen
2019-12-02 19:59             ` Niranjan Vishwanathapura
2019-12-02 19:59               ` [Intel-gfx] " Niranjan Vishwanathapura
2019-12-02 19:59               ` Niranjan Vishwanathapura
2019-11-22 20:57 ` [RFC 04/13] drm/i915/svm: Implicitly migrate BOs upon CPU access Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 05/13] drm/i915/svm: Page table update support for SVM Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 06/13] drm/i915/svm: Page table mirroring support Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 23:33   ` Jason Gunthorpe
2019-11-22 23:33     ` [Intel-gfx] " Jason Gunthorpe
2019-11-22 23:33     ` Jason Gunthorpe
2019-11-23  4:44     ` Niranjan Vishwanathapura
2019-11-23  4:44       ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-23  4:44       ` Niranjan Vishwanathapura
2019-11-23 23:53       ` Jason Gunthorpe
2019-11-23 23:53         ` [Intel-gfx] " Jason Gunthorpe
2019-11-23 23:53         ` Jason Gunthorpe
2019-11-24 21:12         ` Niranjan Vishwanathapura
2019-11-24 21:12           ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-24 21:12           ` Niranjan Vishwanathapura
2019-11-25 13:24           ` Jason Gunthorpe
2019-11-25 13:24             ` [Intel-gfx] " Jason Gunthorpe
2019-11-25 13:24             ` Jason Gunthorpe
2019-11-25 16:32             ` Niranjan Vishwanathapura
2019-11-25 16:32               ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-26 18:45               ` Jason Gunthorpe
2019-11-26 18:45                 ` [Intel-gfx] " Jason Gunthorpe
2019-11-26 18:45                 ` Jason Gunthorpe
2019-12-03 19:07                 ` Niranjan Vishwanathapura
2019-12-03 19:07                   ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-25 16:34             ` Jerome Glisse
2019-11-25 16:34               ` [Intel-gfx] " Jerome Glisse
2019-11-25 16:34               ` Jerome Glisse
2019-11-25 16:33     ` Jerome Glisse
2019-11-25 16:33       ` [Intel-gfx] " Jerome Glisse
2019-11-25 21:29       ` Niranjan Vishwanathapura
2019-11-25 21:29         ` [Intel-gfx] " Niranjan Vishwanathapura
2019-11-26 18:32       ` Jason Gunthorpe
2019-11-26 18:32         ` [Intel-gfx] " Jason Gunthorpe
2019-11-26 18:32         ` Jason Gunthorpe
2019-12-03 19:19         ` Niranjan Vishwanathapura
2019-12-03 19:19           ` [Intel-gfx] " Niranjan Vishwanathapura
2019-12-04 21:51           ` Jerome Glisse
2019-12-04 21:51             ` [Intel-gfx] " Jerome Glisse
2019-12-08 18:23             ` Jason Gunthorpe
2019-12-08 18:23               ` [Intel-gfx] " Jason Gunthorpe
2019-11-22 20:57 ` [RFC 07/13] drm/i915/svm: Device memory support Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 08/13] drm/i915/svm: Implicitly migrate pages upon CPU fault Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 09/13] drm/i915/svm: Page copy support during migration Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 10/13] drm/i915/svm: Add functions to blitter copy SVM buffers Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 11/13] drm/i915/svm: Use blitter copy for migration Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 12/13] drm/i915/svm: Add support to en/disable SVM Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 20:57   ` Niranjana Vishwanathapura
2019-11-22 20:57 ` [RFC 13/13] drm/i915/svm: Add page table dump support Niranjana Vishwanathapura
2019-11-22 20:57   ` [Intel-gfx] " Niranjana Vishwanathapura
2019-11-22 21:32 ` ✗ Fi.CI.BUILD: failure for drm/i915/svm: Add SVM support Patchwork
2019-11-22 21:32   ` [Intel-gfx] " Patchwork
2019-11-26 14:03 ` [RFC 00/13] " Joonas Lahtinen
2019-11-26 14:03   ` [Intel-gfx] " Joonas Lahtinen
2019-11-26 14:03   ` Joonas Lahtinen
2019-12-02 19:37   ` Niranjan Vishwanathapura
2019-12-02 19:37     ` [Intel-gfx] " Niranjan Vishwanathapura
2019-12-02 19:37     ` Niranjan Vishwanathapura
     [not found] <20191122195452.14346-1-niranjana.vishwanathapura@intel.com>
     [not found] ` <20191122195452.14346-7-niranjana.vishwanathapura@intel.com>
2019-11-22 20:01   ` [RFC 06/13] drm/i915/svm: Page table mirroring support Niranjan Vishwanathapura
2019-11-22 20:01     ` Niranjan Vishwanathapura
2019-11-22 20:14     ` Jason Gunthorpe
2019-11-22 20:14       ` Jason Gunthorpe
2019-11-22 20:11       ` Niranjan Vishwanathapura
