* [PATCH v1 0/1] gpu/cuda: expose GPU memory with GDRCopy
@ 2022-01-11 17:39 eagostini
  2022-01-11 17:39 ` [PATCH v1 1/1] " eagostini
  2022-02-21 22:44 ` [PATCH v2] gpu/cuda: CPU map " eagostini
  0 siblings, 2 replies; 8+ messages in thread
From: eagostini @ 2022-01-11 17:39 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

GPU CUDA implementation of the new gpudev functions
to expose GPU memory to the CPU.

Today the GDRCopy library is required to pin and DMA map
the GPU memory through the BAR1 of the GPU and expose
it to the CPU.
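
For reference, here is a minimal sketch of the raw GDRCopy
sequence that this driver wraps (signatures as declared in
gdrapi.h and mirrored by the sym_* function pointers in this
patch; d_addr and size stand for the CUDA device address and
length of an already allocated buffer):

    gdr_t g = gdr_open();           /* handle to the gdrdrv module */
    gdr_mh_t mh;
    void *va = NULL;

    /* Pin the GPU buffer through BAR1 (no p2p token / va_space) */
    if (gdr_pin_buffer(g, d_addr, size, 0, 0, &mh) == 0) {
        if (gdr_map(g, mh, &va, size) == 0) {
            /* CPU loads/stores now reach GPU memory through va */
            gdr_unmap(g, mh, va, size);
        }
        gdr_unpin_buffer(g, mh);
    }
    gdr_close(g);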

The goal here is to hide the technical details of the
GDRCopy library and expose the functionality through the
generic gpudev layer.

GDRCopy can be found here: https://github.com/NVIDIA/gdrcopy

To build the GPU CUDA driver with GDRCopy, build DPDK
pointing at the gdrapi.h header file with
-Dc_args="-I/path/to/gdrapi/".

To run, indicate the path to the libgdrapi.so library
with the environment variable
GDRCOPY_PATH_L=/path/to/gdrcopy/lib/
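
Concretely, a build-and-run sequence could look like this
(paths illustrative):

    $ meson build -Dc_args="-I/path/to/gdrapi/"
    $ ninja -C build
    $ GDRCOPY_PATH_L=/path/to/gdrcopy/lib/ ./build/app/dpdk-test-gpudev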

If the GPU CUDA driver is built without GDRCopy, the GPU memory
expose functionality will not be supported by the driver.

This is an independent feature.
All other GPU CUDA driver capabilities are unaffected
if GDRCopy is not built.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

---
Dependency on https://patches.dpdk.org/project/dpdk/patch/20220108000457.31104-1-eagostini@nvidia.com/

Elena Agostini (1):
  gpu/cuda: expose GPU memory with GDRCopy

 drivers/gpu/cuda/cuda.c      | 101 +++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.c   | 139 +++++++++++++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.h   |  29 ++++++++
 drivers/gpu/cuda/meson.build |   6 +-
 4 files changed, 274 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/cuda/gdrcopy.c
 create mode 100644 drivers/gpu/cuda/gdrcopy.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v1 1/1] gpu/cuda: expose GPU memory with GDRCopy
  2022-01-11 17:39 [PATCH v1 0/1] gpu/cuda: expose GPU memory with GDRCopy eagostini
@ 2022-01-11 17:39 ` eagostini
  2022-02-21 22:44 ` [PATCH v2] gpu/cuda: CPU map " eagostini
  1 sibling, 0 replies; 8+ messages in thread
From: eagostini @ 2022-01-11 17:39 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

GPU CUDA implementation of the new gpudev functions
to expose GPU memory to the CPU.

Today the GDRCopy library is required to pin and DMA map
the GPU memory through the BAR1 of the GPU and expose
it to the CPU.

The goal here is to hide the technical details of the
GDRCopy library and expose the functionality through the
generic gpudev layer.

GDRCopy can be found here: https://github.com/NVIDIA/gdrcopy

To build the GPU CUDA driver with GDRCopy, build DPDK
pointing at the gdrapi.h header file with
-Dc_args="-I/path/to/gdrapi/".

To run, indicate the path to the libgdrapi.so library
with the environment variable
GDRCOPY_PATH_L=/path/to/gdrcopy/lib/

If the GPU CUDA driver is built without GDRCopy, the GPU memory
expose functionality will not be supported by the driver.

This is an independent feature.
All other GPU CUDA driver capabilities are unaffected
if GDRCopy is not built.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

---
Dependency on https://patches.dpdk.org/project/dpdk/patch/20220108000457.31104-1-eagostini@nvidia.com/
---
 drivers/gpu/cuda/cuda.c      | 101 +++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.c   | 139 +++++++++++++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.h   |  29 ++++++++
 drivers/gpu/cuda/meson.build |   6 +-
 4 files changed, 274 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/cuda/gdrcopy.c
 create mode 100644 drivers/gpu/cuda/gdrcopy.h

diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index 882df08e56..d66d6b76b9 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -17,6 +17,8 @@
 #include <cuda.h>
 #include <cudaTypedefs.h>
 
+#include "gdrcopy.h"
+
 #define CUDA_DRIVER_MIN_VERSION 11040
 #define CUDA_API_MIN_VERSION 3020
 
@@ -52,6 +54,8 @@ static void *cudalib;
 static unsigned int cuda_api_version;
 static int cuda_driver_version;
 
+static gdr_t gdrc_h;
+
 /* NVIDIA GPU vendor */
 #define NVIDIA_GPU_VENDOR_ID (0x10de)
 
@@ -144,6 +148,7 @@ struct mem_entry {
 	struct rte_gpu *dev;
 	CUcontext ctx;
 	cuda_ptr_key pkey;
+	gdr_mh_t mh;
 	enum mem_type mtype;
 	struct mem_entry *prev;
 	struct mem_entry *next;
@@ -943,6 +948,87 @@ cuda_wmb(struct rte_gpu *dev)
 	return 0;
 }
 
+static int
+cuda_mem_expose(struct rte_gpu *dev, __rte_unused size_t size, void *ptr_in, void **ptr_out)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	if (gdrc_h == NULL) {
+		rte_cuda_log(ERR, "GDRCopy not built or loaded. Can't expose GPU memory.");
+		rte_errno = ENOTSUP;
+		return -rte_errno;
+	}
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->mtype != GPU_MEM) {
+		rte_cuda_log(ERR, "Memory address 0x%p is not GPU memory type.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->size != size)
+		rte_cuda_log(WARNING,
+				"Can't expose memory area with size (%zd) different from original size (%zd).",
+				size, mem_item->size);
+
+	if (gdrcopy_pin(gdrc_h, &(mem_item->mh), (uint64_t)mem_item->ptr_d,
+			mem_item->size, &(mem_item->ptr_h))) {
+		rte_cuda_log(ERR, "Error exposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	*ptr_out = mem_item->ptr_h;
+
+	return 0;
+}
+
+static int
+cuda_mem_unexpose(struct rte_gpu *dev, void *ptr_in)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	if (gdrc_h == NULL) {
+		rte_cuda_log(ERR, "GDRCopy not built or loaded. Can't unexpose GPU memory.");
+		rte_errno = ENOTSUP;
+		return -rte_errno;
+	}
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (gdrcopy_unpin(gdrc_h, mem_item->mh, (void *)mem_item->ptr_d,
+			mem_item->size)) {
+		rte_cuda_log(ERR, "Error unexposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	return 0;
+}
+
 static int
 cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 {
@@ -1018,6 +1104,19 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 			rte_errno = ENOTSUP;
 			return -rte_errno;
 		}
+
+		gdrc_h = NULL;
+
+		#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+			if (gdrcopy_loader())
+				rte_cuda_log(ERR, "GDRCopy shared library not found.\n");
+			else {
+				if (gdrcopy_open(&gdrc_h))
+					rte_cuda_log(ERR, "GDRCopy handler can't be created. Is gdrdrv driver installed and loaded?\n");
+			}
+		#else
+			gdrc_h = NULL;
+		#endif
 	}
 
 	/* Fill HW specific part of device structure */
@@ -1160,6 +1259,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 	dev->ops.mem_free = cuda_mem_free;
 	dev->ops.mem_register = cuda_mem_register;
 	dev->ops.mem_unregister = cuda_mem_unregister;
+	dev->ops.mem_expose = cuda_mem_expose;
+	dev->ops.mem_unexpose = cuda_mem_unexpose;
 	dev->ops.wmb = cuda_wmb;
 
 	rte_gpu_complete_new(dev);
diff --git a/drivers/gpu/cuda/gdrcopy.c b/drivers/gpu/cuda/gdrcopy.c
new file mode 100644
index 0000000000..1dc6b676e5
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#include "gdrcopy.h"
+
+static void *gdrclib;
+
+static gdr_t (*sym_gdr_open)(void);
+static int (*sym_gdr_close)(gdr_t g);
+static int (*sym_gdr_pin_buffer)(gdr_t g, unsigned long addr, size_t size, uint64_t p2p_token, uint32_t va_space, gdr_mh_t *handle);
+static int (*sym_gdr_unpin_buffer)(gdr_t g, gdr_mh_t handle);
+static int (*sym_gdr_map)(gdr_t g, gdr_mh_t handle, void **va, size_t size);
+static int (*sym_gdr_unmap)(gdr_t g, gdr_mh_t handle, void *va, size_t size);
+
+int
+gdrcopy_loader(void)
+{
+	char gdrcopy_path[1024];
+
+	if (getenv("GDRCOPY_PATH_L") == NULL)
+		snprintf(gdrcopy_path, 1024, "%s", "libgdrapi.so");
+	else
+		snprintf(gdrcopy_path, 1024, "%s/%s", getenv("GDRCOPY_PATH_L"), "libgdrapi.so");
+
+	gdrclib = dlopen(gdrcopy_path, RTLD_LAZY);
+	if (gdrclib == NULL) {
+		fprintf(stderr, "Failed to find GDRCopy library in %s (GDRCOPY_PATH_L=%s)\n",
+				gdrcopy_path, getenv("GDRCOPY_PATH_L"));
+		return -1;
+	}
+
+	sym_gdr_open = dlsym(gdrclib, "gdr_open");
+	if (sym_gdr_open == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_close = dlsym(gdrclib, "gdr_close");
+	if (sym_gdr_close == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_pin_buffer = dlsym(gdrclib, "gdr_pin_buffer");
+	if (sym_gdr_pin_buffer == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unpin_buffer = dlsym(gdrclib, "gdr_unpin_buffer");
+	if (sym_gdr_unpin_buffer == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_map = dlsym(gdrclib, "gdr_map");
+	if (sym_gdr_map == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unmap = dlsym(gdrclib, "gdr_unmap");
+	if (sym_gdr_unmap == NULL) {
+		fprintf(stderr, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+gdrcopy_open(gdr_t *g)
+{
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	gdr_t g_;
+
+	g_ = sym_gdr_open();
+	if (!g_)
+		return -1;
+
+	*g = g_;
+#else
+	*g = NULL;
+#endif
+	return 0;
+}
+
+int
+gdrcopy_close(__rte_unused gdr_t *g)
+{
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	sym_gdr_close(*g);
+#endif
+	return 0;
+}
+
+int
+gdrcopy_pin(gdr_t g, __rte_unused gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr)
+{
+	if (g == NULL)
+		return -ENOTSUP;
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	/* Pin the device buffer */
+	if (sym_gdr_pin_buffer(g, d_addr, size, 0, 0, mh) != 0) {
+		fprintf(stderr, "sym_gdr_pin_buffer\n");
+		return -1;
+	}
+
+	/* Map the buffer to user space */
+	if (sym_gdr_map(g, *mh, h_addr, size) != 0) {
+		fprintf(stderr, "sym_gdr_map\n");
+		sym_gdr_unpin_buffer(g, *mh);
+		return -1;
+	}
+#endif
+	return 0;
+}
+
+int
+gdrcopy_unpin(gdr_t g, __rte_unused gdr_mh_t mh, void *d_addr, size_t size)
+{
+	if (g == NULL)
+		return -ENOTSUP;
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	/* Unmap the buffer from user space */
+	if (sym_gdr_unmap(g, mh, d_addr, size) != 0)
+		fprintf(stderr, "sym_gdr_unmap\n");
+
+	/* Unpin the device buffer */
+	if (sym_gdr_unpin_buffer(g, mh) != 0) {
+		fprintf(stderr, "sym_gdr_unpin_buffer\n");
+		return -1;
+	}
+#endif
+	return 0;
+}
diff --git a/drivers/gpu/cuda/gdrcopy.h b/drivers/gpu/cuda/gdrcopy.h
new file mode 100644
index 0000000000..e5c1997731
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef _CUDA_GDRCOPY_H_
+#define _CUDA_GDRCOPY_H_
+
+#include <dlfcn.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	#include <gdrapi.h>
+#else
+	struct gdr;
+	typedef struct gdr *gdr_t;
+	struct gdr_mh_s;
+	typedef struct gdr_mh_s gdr_mh_t;
+#endif
+
+int gdrcopy_loader(void);
+int gdrcopy_open(gdr_t *g);
+int gdrcopy_close(gdr_t *g);
+int gdrcopy_pin(gdr_t g, gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr);
+int gdrcopy_unpin(gdr_t g, gdr_mh_t mh, void *d_addr, size_t size);
+
+#endif
diff --git a/drivers/gpu/cuda/meson.build b/drivers/gpu/cuda/meson.build
index 3fe20929fa..784fa8bf0d 100644
--- a/drivers/gpu/cuda/meson.build
+++ b/drivers/gpu/cuda/meson.build
@@ -17,5 +17,9 @@ if not cc.has_header('cudaTypedefs.h')
         subdir_done()
 endif
 
+if cc.has_header('gdrapi.h')
+        dpdk_conf.set('DRIVERS_GPU_CUDA_GDRCOPY_H', 1)
+endif
+
 deps += ['gpudev', 'pci', 'bus_pci']
-sources = files('cuda.c')
+sources = files('cuda.c', 'gdrcopy.c')
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2] gpu/cuda: CPU map GPU memory with GDRCopy
  2022-01-11 17:39 [PATCH v1 0/1] gpu/cuda: expose GPU memory with GDRCopy eagostini
  2022-01-11 17:39 ` [PATCH v1 1/1] " eagostini
@ 2022-02-21 22:44 ` eagostini
  2022-02-23 19:44   ` [PATCH v3] " eagostini
  2022-02-25  3:12   ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features eagostini
  1 sibling, 2 replies; 8+ messages in thread
From: eagostini @ 2022-02-21 22:44 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

To enable the gpudev rte_gpu_mem_cpu_map feature to expose
GPU memory to the CPU, the GPU CUDA driver library needs
the GDRCopy library and driver.

If DPDK is built without GDRCopy, the GPU CUDA driver returns
an error when rte_gpu_mem_cpu_map is invoked.

All other GPU CUDA driver functionalities are unaffected by
the absence of GDRCopy; this is an optional capability
that can be enabled in the GPU CUDA driver.

CUDA driver documentation has been updated accordingly.
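
From the application, the capability is reached through the
gpudev API. A minimal usage sketch (error handling trimmed;
dev_id and buf_size are placeholders, and the exact gpudev
prototypes may differ slightly across releases):

    void *gpu_ptr = rte_gpu_mem_alloc(dev_id, buf_size);
    /* Pin + CPU-map the GPU buffer; returns NULL on failure,
     * e.g. when GDRCopy support was not built into the driver.
     */
    uint8_t *cpu_ptr = rte_gpu_mem_cpu_map(dev_id, buf_size, gpu_ptr);
    if (cpu_ptr != NULL) {
        cpu_ptr[0] = 0x1; /* direct CPU store into GPU memory */
        rte_gpu_mem_cpu_unmap(dev_id, gpu_ptr);
    }
    rte_gpu_mem_free(dev_id, gpu_ptr);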

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
---
 doc/guides/gpus/cuda.rst             |  52 +++++++++
 doc/guides/gpus/features/default.ini |   2 +
 drivers/gpu/cuda/cuda.c              |  78 +++++++++++++-
 drivers/gpu/cuda/gdrcopy.c           | 154 +++++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.h           |  27 +++++
 drivers/gpu/cuda/meson.build         |   6 +-
 6 files changed, 316 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/cuda/gdrcopy.c
 create mode 100644 drivers/gpu/cuda/gdrcopy.h

diff --git a/doc/guides/gpus/cuda.rst b/doc/guides/gpus/cuda.rst
index 38e22dc2c0..edf6f93008 100644
--- a/doc/guides/gpus/cuda.rst
+++ b/doc/guides/gpus/cuda.rst
@@ -29,6 +29,34 @@ Three ways:
 
 If headers are not found, the CUDA GPU driver library is not built.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+To enable this gpudev feature (i.e. to implement ``rte_gpu_mem_cpu_map``),
+you need the `GDRCopy <https://github.com/NVIDIA/gdrcopy>`_ library and driver
+installed on your system.
+
+A quick recipe to download, build and run the GDRCopy library and driver:
+
+.. code-block:: console
+
+  $ git clone https://github.com/NVIDIA/gdrcopy.git && cd gdrcopy
+  $ make
+  $ # make install to install the GDRCopy library system-wide
+  $ # Load the gdrdrv kernel module on the system
+  $ sudo ./insmod.sh
+
+You need to indicate to meson where the GDRCopy header files are, as in the case of CUDA headers.
+An example would be:
+
+.. code-block:: console
+
+  $ meson build -Dc_args="-I/usr/local/cuda/include -I/path/to/gdrcopy/include"
+
+If headers are not found, the CUDA GPU driver library is built without the CPU map capability
+and will return an error if the application invokes the gpudev ``rte_gpu_mem_cpu_map`` function.
+
+
 CUDA Shared Library
 -------------------
 
@@ -46,6 +74,30 @@ All CUDA API symbols are loaded at runtime as well.
 For this reason, to build the CUDA driver library,
 no need to install the CUDA library.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+Similarly to the CUDA shared library, if the **libgdrapi.so** shared library is not
+installed in a default location (e.g. /usr/local/lib), you can use the
+``GDRCOPY_PATH_L`` environment variable.
+
+As an example, to enable the CPU map feature sanity check, run the ``app/test-gpudev``
+application with:
+
+.. code-block:: console
+
+  $ sudo CUDA_PATH_L=/path/to/libcuda GDRCOPY_PATH_L=/path/to/libgdrapi ./build/app/dpdk-test-gpudev
+
+Additionally, the ``gdrdrv`` kernel module built with the GDRCopy project has to be loaded
+on the system:
+
+.. code-block:: console
+
+  $ lsmod | egrep gdrdrv
+  gdrdrv                 20480  0
+  nvidia              35307520  19 nvidia_uvm,nv_peer_mem,gdrdrv,nvidia_modeset
+
+
 Design
 ------
 
diff --git a/doc/guides/gpus/features/default.ini b/doc/guides/gpus/features/default.ini
index 87e9966424..817113f2c2 100644
--- a/doc/guides/gpus/features/default.ini
+++ b/doc/guides/gpus/features/default.ini
@@ -11,3 +11,5 @@ Get device info                =
 Share CPU memory with device   =
 Allocate device memory         =
 Free memory                    =
+CPU map device memory          =
+CPU unmap device memory        =
diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index b43d5a32b7..ca400d473c 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -16,6 +16,7 @@
 #include <gpudev_driver.h>
 #include <cuda.h>
 #include <cudaTypedefs.h>
+#include "gdrcopy.h"
 
 #define CUDA_DRIVER_MIN_VERSION 11040
 #define CUDA_API_MIN_VERSION 3020
@@ -51,6 +52,7 @@ static PFN_cuFlushGPUDirectRDMAWrites pfn_cuFlushGPUDirectRDMAWrites;
 static void *cudalib;
 static unsigned int cuda_api_version;
 static int cuda_driver_version;
+static gdr_t gdrc_h;
 
 /* NVIDIA GPU vendor */
 #define NVIDIA_GPU_VENDOR_ID (0x10de)
@@ -157,6 +159,7 @@ struct mem_entry {
 	CUcontext ctx;
 	cuda_ptr_key pkey;
 	enum mem_type mtype;
+	gdr_mh_t mh;
 	struct mem_entry *prev;
 	struct mem_entry *next;
 };
@@ -797,6 +800,47 @@ cuda_mem_register(struct rte_gpu *dev, size_t size, void *ptr)
 	return 0;
 }
 
+static int
+cuda_mem_cpu_map(struct rte_gpu *dev, __rte_unused size_t size, void *ptr_in, void **ptr_out)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->mtype != GPU_MEM) {
+		rte_cuda_log(ERR, "Memory address 0x%p is not GPU memory type.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->size != size)
+		rte_cuda_log(WARNING,
+				"Can't expose memory area with size (%zd) different from original size (%zd).",
+				size, mem_item->size);
+
+	if (gdrcopy_pin(&gdrc_h, &(mem_item->mh), (uint64_t)mem_item->ptr_d,
+					mem_item->size, &(mem_item->ptr_h))) {
+		rte_cuda_log(ERR, "Error exposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	*ptr_out = mem_item->ptr_h;
+
+	return 0;
+}
+
 static int
 cuda_mem_free(struct rte_gpu *dev, void *ptr)
 {
@@ -874,6 +918,34 @@ cuda_mem_unregister(struct rte_gpu *dev, void *ptr)
 	return -rte_errno;
 }
 
+static int
+cuda_mem_cpu_unmap(struct rte_gpu *dev, void *ptr_in)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (gdrcopy_unpin(gdrc_h, mem_item->mh, (void *)mem_item->ptr_d,
+			mem_item->size)) {
+		rte_cuda_log(ERR, "Error unexposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	return 0;
+}
+
 static int
 cuda_dev_close(struct rte_gpu *dev)
 {
@@ -1040,6 +1112,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 			rte_errno = ENOTSUP;
 			return -rte_errno;
 		}
+
+		gdrc_h = NULL;
 	}
 
 	/* Fill HW specific part of device structure */
@@ -1182,8 +1256,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 	dev->ops.mem_free = cuda_mem_free;
 	dev->ops.mem_register = cuda_mem_register;
 	dev->ops.mem_unregister = cuda_mem_unregister;
-	dev->ops.mem_cpu_map = NULL;
-	dev->ops.mem_cpu_unmap = NULL;
+	dev->ops.mem_cpu_map = cuda_mem_cpu_map;
+	dev->ops.mem_cpu_unmap = cuda_mem_cpu_unmap;
 	dev->ops.wmb = cuda_wmb;
 
 	rte_gpu_complete_new(dev);
diff --git a/drivers/gpu/cuda/gdrcopy.c b/drivers/gpu/cuda/gdrcopy.c
new file mode 100644
index 0000000000..1c105adc1c
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.c
@@ -0,0 +1,154 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+#include <rte_common.h>
+#include <rte_log.h>
+#include "gdrcopy.h"
+
+static RTE_LOG_REGISTER_DEFAULT(cuda_gdr_logtype, NOTICE);
+
+/* Helper macro for logging */
+#define rte_cuda_gdrc_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, cuda_gdr_logtype, fmt "\n", ##__VA_ARGS__)
+
+static void *gdrclib;
+static gdr_t (*sym_gdr_open)(void);
+static int (*sym_gdr_close)(gdr_t g);
+static int (*sym_gdr_pin_buffer)(gdr_t g, unsigned long addr, size_t size, uint64_t p2p_token, uint32_t va_space, gdr_mh_t *handle);
+static int (*sym_gdr_unpin_buffer)(gdr_t g, gdr_mh_t handle);
+static int (*sym_gdr_map)(gdr_t g, gdr_mh_t handle, void **va, size_t size);
+static int (*sym_gdr_unmap)(gdr_t g, gdr_mh_t handle, void *va, size_t size);
+
+static int
+gdrcopy_loader(void)
+{
+	char gdrcopy_path[1024];
+
+	if (getenv("GDRCOPY_PATH_L") == NULL)
+		snprintf(gdrcopy_path, 1024, "%s", "libgdrapi.so");
+	else
+		snprintf(gdrcopy_path, 1024, "%s/%s", getenv("GDRCOPY_PATH_L"), "libgdrapi.so");
+
+	gdrclib = dlopen(gdrcopy_path, RTLD_LAZY);
+	if (gdrclib == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to find GDRCopy library %s (GDRCOPY_PATH_L=%s)\n",
+				gdrcopy_path, getenv("GDRCOPY_PATH_L"));
+		return -1;
+	}
+
+	sym_gdr_open = dlsym(gdrclib, "gdr_open");
+	if (sym_gdr_open == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_close = dlsym(gdrclib, "gdr_close");
+	if (sym_gdr_close == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_pin_buffer = dlsym(gdrclib, "gdr_pin_buffer");
+	if (sym_gdr_pin_buffer == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unpin_buffer = dlsym(gdrclib, "gdr_unpin_buffer");
+	if (sym_gdr_unpin_buffer == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_map = dlsym(gdrclib, "gdr_map");
+	if (sym_gdr_map == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unmap = dlsym(gdrclib, "gdr_unmap");
+	if (sym_gdr_unmap == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+gdrcopy_open(gdr_t *g)
+{
+	gdr_t g_;
+
+	g_ = sym_gdr_open();
+	if (!g_)
+		return -1;
+	*g = g_;
+
+	return 0;
+}
+
+static int
+gdrcopy_close(gdr_t *g)
+{
+	sym_gdr_close(*g);
+	return 0;
+}
+
+int
+gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr)
+{
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	if (*gdrc_h == NULL) {
+		if (gdrcopy_loader())
+			return -ENOTSUP;
+
+		if (gdrcopy_open(gdrc_h)) {
+			rte_cuda_gdrc_log(ERR,
+					"GDRCopy gdrdrv kernel module not found. Can't CPU map GPU memory.");
+			return -EPERM;
+		}
+	}
+
+	/* Pin the device buffer */
+	if (sym_gdr_pin_buffer(*gdrc_h, d_addr, size, 0, 0, mh) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy pin buffer error.");
+		return -1;
+	}
+
+	/* Map the buffer to user space */
+	if (sym_gdr_map(*gdrc_h, *mh, h_addr, size) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy map buffer error.");
+		sym_gdr_unpin_buffer(*gdrc_h, *mh);
+		return -1;
+	}
+
+	return 0;
+#else
+	rte_cuda_gdrc_log(ERR,
+			"GDRCopy headers not provided at DPDK building time. Can't CPU map GPU memory.");
+	return -ENOTSUP;
+#endif
+}
+
+int
+gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh, void *d_addr, size_t size)
+{
+	if (gdrc_h == NULL)
+		return -EINVAL;
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	/* Unmap the buffer from user space */
+	if (sym_gdr_unmap(gdrc_h, mh, d_addr, size) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy unmap buffer error.");
+		return -1;
+	}
+	/* Unpin the device buffer */
+	if (sym_gdr_unpin_buffer(gdrc_h, mh) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy unpin buffer error.");
+		return -1;
+	}
+#endif
+
+	return 0;
+}
diff --git a/drivers/gpu/cuda/gdrcopy.h b/drivers/gpu/cuda/gdrcopy.h
new file mode 100644
index 0000000000..11960424c9
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef _CUDA_GDRCOPY_H_
+#define _CUDA_GDRCOPY_H_
+
+#include <dlfcn.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	#include <gdrapi.h>
+#else
+	struct gdr;
+	typedef struct gdr *gdr_t;
+	struct gdr_mh_s;
+	typedef struct gdr_mh_s gdr_mh_t;
+#endif
+
+int gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr);
+int gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh, void *d_addr, size_t size);
+
+#endif
+
diff --git a/drivers/gpu/cuda/meson.build b/drivers/gpu/cuda/meson.build
index 3fe20929fa..784fa8bf0d 100644
--- a/drivers/gpu/cuda/meson.build
+++ b/drivers/gpu/cuda/meson.build
@@ -17,5 +17,9 @@ if not cc.has_header('cudaTypedefs.h')
         subdir_done()
 endif
 
+if cc.has_header('gdrapi.h')
+        dpdk_conf.set('DRIVERS_GPU_CUDA_GDRCOPY_H', 1)
+endif
+
 deps += ['gpudev', 'pci', 'bus_pci']
-sources = files('cuda.c')
+sources = files('cuda.c', 'gdrcopy.c')
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3] gpu/cuda: CPU map GPU memory with GDRCopy
  2022-02-21 22:44 ` [PATCH v2] gpu/cuda: CPU map " eagostini
@ 2022-02-23 19:44   ` eagostini
  2022-02-25  3:12   ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features eagostini
  1 sibling, 0 replies; 8+ messages in thread
From: eagostini @ 2022-02-23 19:44 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

To enable the gpudev rte_gpu_mem_cpu_map feature to expose
GPU memory to the CPU, the GPU CUDA driver library needs
the GDRCopy library and driver.

If DPDK is built without GDRCopy, the GPU CUDA driver returns
an error when rte_gpu_mem_cpu_map is invoked.

All other GPU CUDA driver functionalities are unaffected by
the absence of GDRCopy; this is an optional capability
that can be enabled in the GPU CUDA driver.

CUDA driver documentation has been updated accordingly.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

----

Changelog:
- Fix checkpatch and doc build issue
---
 doc/guides/gpus/cuda.rst             |  52 +++++++++
 doc/guides/gpus/features/default.ini |   2 +
 drivers/gpu/cuda/cuda.c              |  78 +++++++++++++-
 drivers/gpu/cuda/gdrcopy.c           | 155 +++++++++++++++++++++++++++
 drivers/gpu/cuda/gdrcopy.h           |  29 +++++
 drivers/gpu/cuda/meson.build         |   6 +-
 6 files changed, 319 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/cuda/gdrcopy.c
 create mode 100644 drivers/gpu/cuda/gdrcopy.h

diff --git a/doc/guides/gpus/cuda.rst b/doc/guides/gpus/cuda.rst
index 38e22dc2c0..5d4fe02d77 100644
--- a/doc/guides/gpus/cuda.rst
+++ b/doc/guides/gpus/cuda.rst
@@ -29,6 +29,34 @@ Three ways:
 
 If headers are not found, the CUDA GPU driver library is not built.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+To enable this gpudev feature (i.e. to implement ``rte_gpu_mem_cpu_map``),
+you need the `GDRCopy <https://github.com/NVIDIA/gdrcopy>`_ library and driver
+installed on your system.
+
+A quick recipe to download, build and run the GDRCopy library and driver:
+
+.. code-block:: console
+
+  $ git clone https://github.com/NVIDIA/gdrcopy.git && cd gdrcopy
+  $ make
+  $ # make install to install the GDRCopy library system-wide
+  $ # Load the gdrdrv kernel module on the system
+  $ sudo ./insmod.sh
+
+You need to indicate to meson where the GDRCopy header files are, as in the case of CUDA headers.
+An example would be:
+
+.. code-block:: console
+
+  $ meson build -Dc_args="-I/usr/local/cuda/include -I/path/to/gdrcopy/include"
+
+If headers are not found, the CUDA GPU driver library is built without the CPU map capability
+and will return an error if the application invokes the gpudev ``rte_gpu_mem_cpu_map`` function.
+
+
 CUDA Shared Library
 -------------------
 
@@ -46,6 +74,30 @@ All CUDA API symbols are loaded at runtime as well.
 For this reason, to build the CUDA driver library,
 no need to install the CUDA library.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+Similarly to the CUDA shared library, if the **libgdrapi.so** shared library is not
+installed in a default location (e.g. /usr/local/lib), you can use the
+``GDRCOPY_PATH_L`` environment variable.
+
+As an example, to enable the CPU map feature sanity check, run the ``app/test-gpudev``
+application with:
+
+.. code-block:: console
+
+  $ sudo CUDA_PATH_L=/path/to/libcuda GDRCOPY_PATH_L=/path/to/libgdrapi ./build/app/dpdk-test-gpudev
+
+Additionally, the ``gdrdrv`` kernel module built with the GDRCopy project has to be loaded
+on the system:
+
+.. code-block:: console
+
+  $ lsmod | egrep gdrdrv
+  gdrdrv                 20480  0
+  nvidia              35307520  19 nvidia_uvm,nv_peer_mem,gdrdrv,nvidia_modeset
+
+
 Design
 ------
 
diff --git a/doc/guides/gpus/features/default.ini b/doc/guides/gpus/features/default.ini
index 87e9966424..817113f2c2 100644
--- a/doc/guides/gpus/features/default.ini
+++ b/doc/guides/gpus/features/default.ini
@@ -11,3 +11,5 @@ Get device info                =
 Share CPU memory with device   =
 Allocate device memory         =
 Free memory                    =
+CPU map device memory          =
+CPU unmap device memory        =
diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index b43d5a32b7..ca400d473c 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -16,6 +16,7 @@
 #include <gpudev_driver.h>
 #include <cuda.h>
 #include <cudaTypedefs.h>
+#include "gdrcopy.h"
 
 #define CUDA_DRIVER_MIN_VERSION 11040
 #define CUDA_API_MIN_VERSION 3020
@@ -51,6 +52,7 @@ static PFN_cuFlushGPUDirectRDMAWrites pfn_cuFlushGPUDirectRDMAWrites;
 static void *cudalib;
 static unsigned int cuda_api_version;
 static int cuda_driver_version;
+static gdr_t gdrc_h;
 
 /* NVIDIA GPU vendor */
 #define NVIDIA_GPU_VENDOR_ID (0x10de)
@@ -157,6 +159,7 @@ struct mem_entry {
 	CUcontext ctx;
 	cuda_ptr_key pkey;
 	enum mem_type mtype;
+	gdr_mh_t mh;
 	struct mem_entry *prev;
 	struct mem_entry *next;
 };
@@ -797,6 +800,47 @@ cuda_mem_register(struct rte_gpu *dev, size_t size, void *ptr)
 	return 0;
 }
 
+static int
+cuda_mem_cpu_map(struct rte_gpu *dev, __rte_unused size_t size, void *ptr_in, void **ptr_out)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->mtype != GPU_MEM) {
+		rte_cuda_log(ERR, "Memory address 0x%p is not GPU memory type.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->size != size)
+		rte_cuda_log(WARNING,
+				"Can't expose memory area with size (%zd) different from original size (%zd).",
+				size, mem_item->size);
+
+	if (gdrcopy_pin(&gdrc_h, &(mem_item->mh), (uint64_t)mem_item->ptr_d,
+					mem_item->size, &(mem_item->ptr_h))) {
+		rte_cuda_log(ERR, "Error exposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	*ptr_out = mem_item->ptr_h;
+
+	return 0;
+}
+
 static int
 cuda_mem_free(struct rte_gpu *dev, void *ptr)
 {
@@ -874,6 +918,34 @@ cuda_mem_unregister(struct rte_gpu *dev, void *ptr)
 	return -rte_errno;
 }
 
+static int
+cuda_mem_cpu_unmap(struct rte_gpu *dev, void *ptr_in)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (gdrcopy_unpin(gdrc_h, mem_item->mh, (void *)mem_item->ptr_d,
+			mem_item->size)) {
+		rte_cuda_log(ERR, "Error unexposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	return 0;
+}
+
 static int
 cuda_dev_close(struct rte_gpu *dev)
 {
@@ -1040,6 +1112,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 			rte_errno = ENOTSUP;
 			return -rte_errno;
 		}
+
+		gdrc_h = NULL;
 	}
 
 	/* Fill HW specific part of device structure */
@@ -1182,8 +1256,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 	dev->ops.mem_free = cuda_mem_free;
 	dev->ops.mem_register = cuda_mem_register;
 	dev->ops.mem_unregister = cuda_mem_unregister;
-	dev->ops.mem_cpu_map = NULL;
-	dev->ops.mem_cpu_unmap = NULL;
+	dev->ops.mem_cpu_map = cuda_mem_cpu_map;
+	dev->ops.mem_cpu_unmap = cuda_mem_cpu_unmap;
 	dev->ops.wmb = cuda_wmb;
 
 	rte_gpu_complete_new(dev);
diff --git a/drivers/gpu/cuda/gdrcopy.c b/drivers/gpu/cuda/gdrcopy.c
new file mode 100644
index 0000000000..102ba3f577
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.c
@@ -0,0 +1,155 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+#include <rte_common.h>
+#include <rte_log.h>
+#include "gdrcopy.h"
+
+static RTE_LOG_REGISTER_DEFAULT(cuda_gdr_logtype, NOTICE);
+
+/* Helper macro for logging */
+#define rte_cuda_gdrc_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, cuda_gdr_logtype, fmt "\n", ##__VA_ARGS__)
+
+static void *gdrclib;
+static gdr_t (*sym_gdr_open)(void);
+static int (*sym_gdr_close)(gdr_t g);
+static int (*sym_gdr_pin_buffer)(gdr_t g, unsigned long addr, size_t size,
+		uint64_t p2p_token, uint32_t va_space, gdr_mh_t *handle);
+static int (*sym_gdr_unpin_buffer)(gdr_t g, gdr_mh_t handle);
+static int (*sym_gdr_map)(gdr_t g, gdr_mh_t handle, void **va, size_t size);
+static int (*sym_gdr_unmap)(gdr_t g, gdr_mh_t handle, void *va, size_t size);
+
+static int
+gdrcopy_loader(void)
+{
+	char gdrcopy_path[1024];
+
+	if (getenv("GDRCOPY_PATH_L") == NULL)
+		snprintf(gdrcopy_path, 1024, "%s", "libgdrapi.so");
+	else
+		snprintf(gdrcopy_path, 1024, "%s/%s", getenv("GDRCOPY_PATH_L"), "libgdrapi.so");
+
+	gdrclib = dlopen(gdrcopy_path, RTLD_LAZY);
+	if (gdrclib == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to find GDRCopy library %s (GDRCOPY_PATH_L=%s)\n",
+				gdrcopy_path, getenv("GDRCOPY_PATH_L"));
+		return -1;
+	}
+
+	sym_gdr_open = dlsym(gdrclib, "gdr_open");
+	if (sym_gdr_open == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_close = dlsym(gdrclib, "gdr_close");
+	if (sym_gdr_close == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_pin_buffer = dlsym(gdrclib, "gdr_pin_buffer");
+	if (sym_gdr_pin_buffer == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unpin_buffer = dlsym(gdrclib, "gdr_unpin_buffer");
+	if (sym_gdr_unpin_buffer == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_map = dlsym(gdrclib, "gdr_map");
+	if (sym_gdr_map == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unmap = dlsym(gdrclib, "gdr_unmap");
+	if (sym_gdr_unmap == NULL) {
+		rte_cuda_gdrc_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+gdrcopy_open(gdr_t *g)
+{
+	gdr_t g_;
+
+	g_ = sym_gdr_open();
+	if (!g_)
+		return -1;
+	*g = g_;
+
+	return 0;
+}
+
+static int
+gdrcopy_close(gdr_t *g)
+{
+	sym_gdr_close(*g);
+	return 0;
+}
+
+int
+gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr)
+{
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	if (*gdrc_h == NULL) {
+		if (gdrcopy_loader())
+			return -ENOTSUP;
+
+		if (gdrcopy_open(gdrc_h)) {
+			rte_cuda_gdrc_log(ERR,
+					"GDRCopy gdrdrv kernel module not found. Can't CPU map GPU memory.");
+			return -EPERM;
+		}
+	}
+
+	/* Pin the device buffer */
+	if (sym_gdr_pin_buffer(*gdrc_h, d_addr, size, 0, 0, mh) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy pin buffer error.");
+		return -1;
+	}
+
+	/* Map the buffer to user space */
+	if (sym_gdr_map(*gdrc_h, *mh, h_addr, size) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy map buffer error.");
+		sym_gdr_unpin_buffer(*gdrc_h, *mh);
+		return -1;
+	}
+
+	return 0;
+#else
+	rte_cuda_gdrc_log(ERR,
+			"GDRCopy headers not provided at DPDK building time. Can't CPU map GPU memory.");
+	return -ENOTSUP;
+#endif
+}
+
+int
+gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh, void *d_addr, size_t size)
+{
+	if (gdrc_h == NULL)
+		return -EINVAL;
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	/* Unmap the buffer from user space */
+	if (sym_gdr_unmap(gdrc_h, mh, d_addr, size) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy unmap buffer error.");
+		return -1;
+	}
+	/* Unpin the device buffer */
+	if (sym_gdr_unpin_buffer(gdrc_h, mh) != 0) {
+		rte_cuda_gdrc_log(ERR, "GDRCopy unpin buffer error.");
+		return -1;
+	}
+#endif
+
+	return 0;
+}
diff --git a/drivers/gpu/cuda/gdrcopy.h b/drivers/gpu/cuda/gdrcopy.h
new file mode 100644
index 0000000000..68b80de805
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef _CUDA_GDRCOPY_H_
+#define _CUDA_GDRCOPY_H_
+
+#include <dlfcn.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	#include <gdrapi.h>
+#else
+	struct gdr;
+	typedef struct gdr *gdr_t;
+	struct gdr_mh_s;
+	typedef struct gdr_mh_s gdr_mh_t;
+#endif
+
+int gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh,
+		uint64_t d_addr, size_t size, void **h_addr);
+int gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh,
+		void *d_addr, size_t size);
+
+#endif
+
diff --git a/drivers/gpu/cuda/meson.build b/drivers/gpu/cuda/meson.build
index 3fe20929fa..784fa8bf0d 100644
--- a/drivers/gpu/cuda/meson.build
+++ b/drivers/gpu/cuda/meson.build
@@ -17,5 +17,9 @@ if not cc.has_header('cudaTypedefs.h')
         subdir_done()
 endif
 
+if cc.has_header('gdrapi.h')
+        dpdk_conf.set('DRIVERS_GPU_CUDA_GDRCOPY_H', 1)
+endif
+
 deps += ['gpudev', 'pci', 'bus_pci']
-sources = files('cuda.c')
+sources = files('cuda.c', 'gdrcopy.c')
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 1/2] doc/gpus: add cuda.ini into features
  2022-02-21 22:44 ` [PATCH v2] gpu/cuda: CPU map " eagostini
  2022-02-23 19:44   ` [PATCH v3] " eagostini
@ 2022-02-25  3:12   ` eagostini
  2022-02-25  3:12     ` [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy eagostini
  2022-02-27 16:48     ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features Thomas Monjalon
  1 sibling, 2 replies; 8+ messages in thread
From: eagostini @ 2022-02-25  3:12 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
---
 doc/guides/gpus/features/cuda.ini | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 doc/guides/gpus/features/cuda.ini

diff --git a/doc/guides/gpus/features/cuda.ini b/doc/guides/gpus/features/cuda.ini
new file mode 100644
index 0000000000..eb1aff9a80
--- /dev/null
+++ b/doc/guides/gpus/features/cuda.ini
@@ -0,0 +1,12 @@
+;
+; Supported features of the 'cuda' gpu driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Get device info                = Y
+Share CPU memory with device   = Y
+Allocate device memory         = Y
+Free memory                    = Y
+CPU map device memory          = Y
+CPU unmap device memory        = Y
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy
  2022-02-25  3:12   ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features eagostini
@ 2022-02-25  3:12     ` eagostini
  2022-02-27 16:49       ` Thomas Monjalon
  2022-02-27 16:48     ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features Thomas Monjalon
  1 sibling, 1 reply; 8+ messages in thread
From: eagostini @ 2022-02-25  3:12 UTC (permalink / raw)
  To: dev; +Cc: Elena Agostini

From: Elena Agostini <eagostini@nvidia.com>

To enable the gpudev rte_gpu_mem_cpu_map feature to expose
GPU memory to the CPU, the GPU CUDA driver library needs
the GDRCopy library and driver.

If DPDK is built without GDRCopy, the GPU CUDA driver returns
an error when rte_gpu_mem_cpu_map is invoked.

All other GPU CUDA driver functionalities are unaffected by
the absence of GDRCopy; this is an optional capability
that can be enabled in the GPU CUDA driver.

CUDA driver documentation has been updated accordingly.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

----

Changelog:
- Fix checkpatch and doc build issue
- Added common header to cuda.c and gdrcopy.c
---
 doc/guides/gpus/cuda.rst             |  52 ++++++++++
 doc/guides/gpus/features/default.ini |   2 +
 drivers/gpu/cuda/common.h            |  39 +++++++
 drivers/gpu/cuda/cuda.c              |  90 +++++++++++++---
 drivers/gpu/cuda/gdrcopy.c           | 148 +++++++++++++++++++++++++++
 drivers/gpu/cuda/meson.build         |   6 +-
 6 files changed, 322 insertions(+), 15 deletions(-)
 create mode 100644 drivers/gpu/cuda/common.h
 create mode 100644 drivers/gpu/cuda/gdrcopy.c

diff --git a/doc/guides/gpus/cuda.rst b/doc/guides/gpus/cuda.rst
index 38e22dc2c0..5d4fe02d77 100644
--- a/doc/guides/gpus/cuda.rst
+++ b/doc/guides/gpus/cuda.rst
@@ -29,6 +29,34 @@ Three ways:
 
 If headers are not found, the CUDA GPU driver library is not built.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+To enable this gpudev feature (i.e. to implement ``rte_gpu_mem_cpu_map``),
+you need the `GDRCopy <https://github.com/NVIDIA/gdrcopy>`_ library and driver
+installed on your system.
+
+A quick recipe to download, build and run the GDRCopy library and driver:
+
+.. code-block:: console
+
+  $ git clone https://github.com/NVIDIA/gdrcopy.git && cd gdrcopy
+  $ make
+  $ # make install to install the GDRCopy library system-wide
+  $ # Load the gdrdrv kernel module on the system
+  $ sudo ./insmod.sh
+
+You need to indicate to meson where the GDRCopy header files are, as in the case of CUDA headers.
+An example would be:
+
+.. code-block:: console
+
+  $ meson build -Dc_args="-I/usr/local/cuda/include -I/path/to/gdrcopy/include"
+
+If headers are not found, the CUDA GPU driver library is built without the CPU map capability
+and will return an error if the application invokes the gpudev ``rte_gpu_mem_cpu_map`` function.
+
+
 CUDA Shared Library
 -------------------
 
@@ -46,6 +74,30 @@ All CUDA API symbols are loaded at runtime as well.
 For this reason, to build the CUDA driver library,
 no need to install the CUDA library.
 
+CPU map GPU memory
+~~~~~~~~~~~~~~~~~~
+
+Similarly to the CUDA shared library, if the **libgdrapi.so** shared library is not
+installed in a default location (e.g. /usr/local/lib), you can use the
+``GDRCOPY_PATH_L`` environment variable.
+
+As an example, to enable the CPU map feature sanity check, run the ``app/test-gpudev``
+application with:
+
+.. code-block:: console
+
+  $ sudo CUDA_PATH_L=/path/to/libcuda GDRCOPY_PATH_L=/path/to/libgdrapi ./build/app/dpdk-test-gpudev
+
+Additionally, the ``gdrdrv`` kernel module built with the GDRCopy project has to be loaded
+on the system:
+
+.. code-block:: console
+
+  $ lsmod | egrep gdrdrv
+  gdrdrv                 20480  0
+  nvidia              35307520  19 nvidia_uvm,nv_peer_mem,gdrdrv,nvidia_modeset
+
+
 Design
 ------
 
diff --git a/doc/guides/gpus/features/default.ini b/doc/guides/gpus/features/default.ini
index 87e9966424..817113f2c2 100644
--- a/doc/guides/gpus/features/default.ini
+++ b/doc/guides/gpus/features/default.ini
@@ -11,3 +11,5 @@ Get device info                =
 Share CPU memory with device   =
 Allocate device memory         =
 Free memory                    =
+CPU map device memory          =
+CPU unmap device memory        =
diff --git a/drivers/gpu/cuda/common.h b/drivers/gpu/cuda/common.h
new file mode 100644
index 0000000000..323bbacd6e
--- /dev/null
+++ b/drivers/gpu/cuda/common.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef _CUDA_COMMON_H_
+#define _CUDA_COMMON_H_
+
+#include <dlfcn.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+
+RTE_LOG_REGISTER_DEFAULT(cuda_logtype, NOTICE);
+
+/* Helper macro for logging */
+#define rte_cuda_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, cuda_logtype, fmt "\n", ##__VA_ARGS__)
+
+#define rte_cuda_debug(fmt, ...) \
+	rte_cuda_log(DEBUG, RTE_STR(__LINE__) ":%s() " fmt, __func__, \
+		##__VA_ARGS__)
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	#include <gdrapi.h>
+#else
+	struct gdr;
+	typedef struct gdr *gdr_t;
+	struct gdr_mh_s;
+	typedef struct gdr_mh_s gdr_mh_t;
+#endif
+
+int gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh,
+		uint64_t d_addr, size_t size, void **h_addr);
+int gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh,
+		void *d_addr, size_t size);
+
+#endif
+
diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index b43d5a32b7..c247ce0179 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -4,8 +4,6 @@
 
 #include <dlfcn.h>
 
-#include <rte_common.h>
-#include <rte_log.h>
 #include <rte_malloc.h>
 #include <rte_errno.h>
 #include <rte_pci.h>
@@ -16,6 +14,7 @@
 #include <gpudev_driver.h>
 #include <cuda.h>
 #include <cudaTypedefs.h>
+#include "common.h"
 
 #define CUDA_DRIVER_MIN_VERSION 11040
 #define CUDA_API_MIN_VERSION 3020
@@ -51,6 +50,7 @@ static PFN_cuFlushGPUDirectRDMAWrites pfn_cuFlushGPUDirectRDMAWrites;
 static void *cudalib;
 static unsigned int cuda_api_version;
 static int cuda_driver_version;
+static gdr_t gdrc_h;
 
 /* NVIDIA GPU vendor */
 #define NVIDIA_GPU_VENDOR_ID (0x10de)
@@ -74,16 +74,6 @@ static int cuda_driver_version;
 #define GPU_PAGE_SHIFT 16
 #define GPU_PAGE_SIZE (1UL << GPU_PAGE_SHIFT)
 
-static RTE_LOG_REGISTER_DEFAULT(cuda_logtype, NOTICE);
-
-/* Helper macro for logging */
-#define rte_cuda_log(level, fmt, ...) \
-	rte_log(RTE_LOG_ ## level, cuda_logtype, fmt "\n", ##__VA_ARGS__)
-
-#define rte_cuda_debug(fmt, ...) \
-	rte_cuda_log(DEBUG, RTE_STR(__LINE__) ":%s() " fmt, __func__, \
-		##__VA_ARGS__)
-
 /* NVIDIA GPU address map */
 static const struct rte_pci_id pci_id_cuda_map[] = {
 	{
@@ -157,6 +147,7 @@ struct mem_entry {
 	CUcontext ctx;
 	cuda_ptr_key pkey;
 	enum mem_type mtype;
+	gdr_mh_t mh;
 	struct mem_entry *prev;
 	struct mem_entry *next;
 };
@@ -797,6 +788,47 @@ cuda_mem_register(struct rte_gpu *dev, size_t size, void *ptr)
 	return 0;
 }
 
+static int
+cuda_mem_cpu_map(struct rte_gpu *dev, __rte_unused size_t size, void *ptr_in, void **ptr_out)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->mtype != GPU_MEM) {
+		rte_cuda_log(ERR, "Memory address 0x%p is not GPU memory type.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (mem_item->size != size)
+		rte_cuda_log(WARNING,
+				"Can't expose memory area with size (%zd) different from original size (%zd).",
+				size, mem_item->size);
+
+	if (gdrcopy_pin(&gdrc_h, &(mem_item->mh), (uint64_t)mem_item->ptr_d,
+					mem_item->size, &(mem_item->ptr_h))) {
+		rte_cuda_log(ERR, "Error exposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	*ptr_out = mem_item->ptr_h;
+
+	return 0;
+}
+
 static int
 cuda_mem_free(struct rte_gpu *dev, void *ptr)
 {
@@ -874,6 +906,34 @@ cuda_mem_unregister(struct rte_gpu *dev, void *ptr)
 	return -rte_errno;
 }
 
+static int
+cuda_mem_cpu_unmap(struct rte_gpu *dev, void *ptr_in)
+{
+	struct mem_entry *mem_item;
+	cuda_ptr_key hk;
+
+	if (dev == NULL)
+		return -ENODEV;
+
+	hk = get_hash_from_ptr((void *)ptr_in);
+
+	mem_item = mem_list_find_item(hk);
+	if (mem_item == NULL) {
+		rte_cuda_log(ERR, "Memory address 0x%p not found in driver memory.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	if (gdrcopy_unpin(gdrc_h, mem_item->mh, (void *)mem_item->ptr_d,
+			mem_item->size)) {
+		rte_cuda_log(ERR, "Error unexposing GPU memory address 0x%p.", ptr_in);
+		rte_errno = EPERM;
+		return -rte_errno;
+	}
+
+	return 0;
+}
+
 static int
 cuda_dev_close(struct rte_gpu *dev)
 {
@@ -1040,6 +1100,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 			rte_errno = ENOTSUP;
 			return -rte_errno;
 		}
+
+		gdrc_h = NULL;
 	}
 
 	/* Fill HW specific part of device structure */
@@ -1182,8 +1244,8 @@ cuda_gpu_probe(__rte_unused struct rte_pci_driver *pci_drv, struct rte_pci_devic
 	dev->ops.mem_free = cuda_mem_free;
 	dev->ops.mem_register = cuda_mem_register;
 	dev->ops.mem_unregister = cuda_mem_unregister;
-	dev->ops.mem_cpu_map = NULL;
-	dev->ops.mem_cpu_unmap = NULL;
+	dev->ops.mem_cpu_map = cuda_mem_cpu_map;
+	dev->ops.mem_cpu_unmap = cuda_mem_cpu_unmap;
 	dev->ops.wmb = cuda_wmb;
 
 	rte_gpu_complete_new(dev);
diff --git a/drivers/gpu/cuda/gdrcopy.c b/drivers/gpu/cuda/gdrcopy.c
new file mode 100644
index 0000000000..8e6178c09d
--- /dev/null
+++ b/drivers/gpu/cuda/gdrcopy.c
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+
+#include "common.h"
+
+static void *gdrclib;
+static gdr_t (*sym_gdr_open)(void);
+static int (*sym_gdr_close)(gdr_t g);
+static int (*sym_gdr_pin_buffer)(gdr_t g, unsigned long addr, size_t size,
+		uint64_t p2p_token, uint32_t va_space, gdr_mh_t *handle);
+static int (*sym_gdr_unpin_buffer)(gdr_t g, gdr_mh_t handle);
+static int (*sym_gdr_map)(gdr_t g, gdr_mh_t handle, void **va, size_t size);
+static int (*sym_gdr_unmap)(gdr_t g, gdr_mh_t handle, void *va, size_t size);
+
+static int
+gdrcopy_loader(void)
+{
+	char gdrcopy_path[1024];
+
+	if (getenv("GDRCOPY_PATH_L") == NULL)
+		snprintf(gdrcopy_path, 1024, "%s", "libgdrapi.so");
+	else
+		snprintf(gdrcopy_path, 1024, "%s/%s", getenv("GDRCOPY_PATH_L"), "libgdrapi.so");
+
+	gdrclib = dlopen(gdrcopy_path, RTLD_LAZY);
+	if (gdrclib == NULL) {
+		rte_cuda_log(ERR, "Failed to find GDRCopy library %s (GDRCOPY_PATH_L=%s)\n",
+				gdrcopy_path, getenv("GDRCOPY_PATH_L"));
+		return -1;
+	}
+
+	sym_gdr_open = dlsym(gdrclib, "gdr_open");
+	if (sym_gdr_open == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_close = dlsym(gdrclib, "gdr_close");
+	if (sym_gdr_close == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_pin_buffer = dlsym(gdrclib, "gdr_pin_buffer");
+	if (sym_gdr_pin_buffer == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unpin_buffer = dlsym(gdrclib, "gdr_unpin_buffer");
+	if (sym_gdr_unpin_buffer == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_map = dlsym(gdrclib, "gdr_map");
+	if (sym_gdr_map == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	sym_gdr_unmap = dlsym(gdrclib, "gdr_unmap");
+	if (sym_gdr_unmap == NULL) {
+		rte_cuda_log(ERR, "Failed to load GDRCopy symbols\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+gdrcopy_open(gdr_t *g)
+{
+	gdr_t g_;
+
+	g_ = sym_gdr_open();
+	if (!g_)
+		return -1;
+	*g = g_;
+
+	return 0;
+}
+
+static int
+gdrcopy_close(gdr_t *g)
+{
+	sym_gdr_close(*g);
+	return 0;
+}
+
+int
+gdrcopy_pin(gdr_t *gdrc_h, __rte_unused gdr_mh_t *mh, uint64_t d_addr, size_t size, void **h_addr)
+{
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	if (*gdrc_h == NULL) {
+		if (gdrcopy_loader())
+			return -ENOTSUP;
+
+		if (gdrcopy_open(gdrc_h)) {
+			rte_cuda_log(ERR,
+					"GDRCopy gdrdrv kernel module not found. Can't CPU map GPU memory.");
+			return -EPERM;
+		}
+	}
+
+	/* Pin the device buffer */
+	if (sym_gdr_pin_buffer(*gdrc_h, d_addr, size, 0, 0, mh) != 0) {
+		rte_cuda_log(ERR, "GDRCopy pin buffer error.");
+		return -1;
+	}
+
+	/* Map the buffer to user space */
+	if (sym_gdr_map(*gdrc_h, *mh, h_addr, size) != 0) {
+		rte_cuda_log(ERR, "GDRCopy map buffer error.");
+		sym_gdr_unpin_buffer(*gdrc_h, *mh);
+		return -1;
+	}
+
+	return 0;
+#else
+	rte_cuda_log(ERR,
+			"GDRCopy headers not provided at DPDK building time. Can't CPU map GPU memory.");
+	return -ENOTSUP;
+#endif
+}
+
+int
+gdrcopy_unpin(gdr_t gdrc_h, __rte_unused gdr_mh_t mh, void *d_addr, size_t size)
+{
+	if (gdrc_h == NULL)
+		return -EINVAL;
+
+#ifdef DRIVERS_GPU_CUDA_GDRCOPY_H
+	/* Unmap the buffer from user space */
+	if (sym_gdr_unmap(gdrc_h, mh, d_addr, size) != 0) {
+		rte_cuda_log(ERR, "GDRCopy unmap buffer error.");
+		return -1;
+	}
+	/* Unpin the device buffer */
+	if (sym_gdr_unpin_buffer(gdrc_h, mh) != 0) {
+		rte_cuda_log(ERR, "GDRCopy unpin buffer error.");
+		return -1;
+	}
+#endif
+
+	return 0;
+}
diff --git a/drivers/gpu/cuda/meson.build b/drivers/gpu/cuda/meson.build
index 3fe20929fa..784fa8bf0d 100644
--- a/drivers/gpu/cuda/meson.build
+++ b/drivers/gpu/cuda/meson.build
@@ -17,5 +17,9 @@ if not cc.has_header('cudaTypedefs.h')
         subdir_done()
 endif
 
+if cc.has_header('gdrapi.h')
+        dpdk_conf.set('DRIVERS_GPU_CUDA_GDRCOPY_H', 1)
+endif
+
 deps += ['gpudev', 'pci', 'bus_pci']
-sources = files('cuda.c')
+sources = files('cuda.c', 'gdrcopy.c')
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/2] doc/gpus: add cuda.ini into features
  2022-02-25  3:12   ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features eagostini
  2022-02-25  3:12     ` [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy eagostini
@ 2022-02-27 16:48     ` Thomas Monjalon
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2022-02-27 16:48 UTC (permalink / raw)
  To: Elena Agostini; +Cc: dev

25/02/2022 04:12, eagostini@nvidia.com:

    The features list was missed when introducing the driver.
    
    Fixes: 1306a73b1958 ("gpu/cuda: introduce CUDA driver")
    Cc: stable@dpdk.org

> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> ---
>  doc/guides/gpus/features/cuda.ini | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>  create mode 100644 doc/guides/gpus/features/cuda.ini
> 
> diff --git a/doc/guides/gpus/features/cuda.ini b/doc/guides/gpus/features/cuda.ini
> new file mode 100644
> index 0000000000..eb1aff9a80
> --- /dev/null
> +++ b/doc/guides/gpus/features/cuda.ini
> @@ -0,0 +1,12 @@
> +;
> +; Supported features of the 'cuda' gpu driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Get device info                = Y
> +Share CPU memory with device   = Y
> +Allocate device memory         = Y
> +Free memory                    = Y
> +CPU map device memory          = Y
> +CPU unmap device memory        = Y

The last 2 lines will be moved to the next patch, which implements the feature.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy
  2022-02-25  3:12     ` [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy eagostini
@ 2022-02-27 16:49       ` Thomas Monjalon
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2022-02-27 16:49 UTC (permalink / raw)
  To: Elena Agostini; +Cc: dev

25/02/2022 04:12, eagostini@nvidia.com:
> From: Elena Agostini <eagostini@nvidia.com>
> 
> To enable the gpudev rte_gpu_mem_cpu_map feature to expose
> GPU memory to the CPU, the GPU CUDA driver library needs
> the GDRCopy library and driver.
> 
> If DPDK is built without GDRCopy, the GPU CUDA driver returns
> an error when rte_gpu_mem_cpu_map is invoked.
> 
> All other GPU CUDA driver functionalities are unaffected by
> the absence of GDRCopy; this is an optional capability
> that can be enabled in the GPU CUDA driver.
> 
> CUDA driver documentation has been updated accordingly.
> 
> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> 
> ----

Should be only 3 dashes to be interpreted by git.
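
(git-am treats a line of exactly three dashes after the commit
message as the cut point: everything from there to the diff is
dropped from the commit message, e.g.:

    Signed-off-by: ...
    ---
    Changelog notes here are discarded by git-am
    diff --git a/... b/...

With four dashes the separator is not recognized, so the
changelog notes would leak into the commit message.)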

> 
> Changelog:
> - Fix checkpatch and doc build issue
> - Added common header to cuda.c and gdrcopy.c

Applied, thanks.



^ permalink raw reply	[flat|nested] 8+ messages in thread

Thread overview: 8+ messages
2022-01-11 17:39 [PATCH v1 0/1] gpu/cuda: expose GPU memory with GDRCopy eagostini
2022-01-11 17:39 ` [PATCH v1 1/1] " eagostini
2022-02-21 22:44 ` [PATCH v2] gpu/cuda: CPU map " eagostini
2022-02-23 19:44   ` [PATCH v3] " eagostini
2022-02-25  3:12   ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features eagostini
2022-02-25  3:12     ` [PATCH v4 2/2] gpu/cuda: CPU map GPU memory with GDRCopy eagostini
2022-02-27 16:49       ` Thomas Monjalon
2022-02-27 16:48     ` [PATCH v4 1/2] doc/gpus: add cuda.ini into features Thomas Monjalon
