linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/30] Driver for Hyper-v virtual compute device
@ 2022-03-01 19:45 Iouri Tarassov
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                   ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

The driver code in the new patch series is split to more simple patches
for easy review. The definitions are added to the header files as
they become to be used.

The patches introduces a driver for the Hyper-V virtual compute hardware
devices. This driver powers the Windows Subsystem for Linux (WSL) and will
soon power the Windows Subsystem for Android (WSA).

Per earlier feedback, the driver is now located under the Hyper-V node as it
is not a classic Linux GPU (KMS/DRM) driver and really only make sense
when running under a Windows host under Hyper-V. Although we refer to this
driver as VGPU for shorthand, in reality this is a generic virtualization
infrastructure for various class of compute accelerators, the most popular
and ubiquitous being the GPU. We support virtualization of non-GPU devices
through this infrastructure. These devices are exposed to user-space and
used by various API and framework, such as CUDA, OpenCL, OpenVINO,
OneAPI, DX12, etc...

One critical piece of feedback, provided by the community in our earlier
submission, was the lack of an open source user-space to go along with our
kernel compute driver. At the time we only had CUDA and DX12 user-space APIs,
both of which being closed source. This is a divisive issue in the community
and is heavily debated (https://lwn.net/Articles/821817/).

We took this feedback to heart and we spent the last year working on a way
to address this key piece of feedback. Working closely with our partner Intel,
we're happy to say that we now have a fully open source user-space for the
OpenCL, OpenVINO and OneAPI compute family of APIs on Intel GPU platforms.
This is supported by this open source project
(https://github.com/intel/compute-runtime).

To make it easy for our partners to build compute drivers, which are usable
in both Windows and WSL environment, we provide a library, that provides a
stable interface to our compute device abstraction. This was originally part
of the libdxcore closed source API, but have now spawn that off into it's
own open source project (https://github.com/microsoft/libdxg).

Between the Intel compute runtime project and libdxg, we now have a fully
open source implementation of our virtualized compute stack inside of WSL.
We will continue to support both open source user-space API against our
compute abstraction as well as closed source one (CUDA, DX12), leaving
it to the API owners and partners to decide what makes the most sense for them.

A lot of efforts went into addressing community feedback in this revised
set of patches and we hope this is getting closer to what the community
would like to see.

We're looking forward additional feedback.

Driver overview
-------------------------------------------------------
dxgkrnl is a driver for Hyper-V virtual compute devices, such as VGPU
devices, which are projected to a Linux virtual machine (VM) by a Windows
host. dxgkrnl works in context of WDDM (Windows Display Driver Model)
for GPU or MCDM (Microsoft Compute Driver Model) for non-GPU devices.
WDDM/MCDM consists of the following components:
- Graphics or Compute applications
- A graphics or compute user mode API (for example OpenGL, Vulkan, OpenCL,
  OpenVINO, OneAPI, CUDA, DX12, ...)
- User Mode Driver (UMD), written by a hardware vendor
- optional libdxg library helping UMD portability across Windows and Linux
- dxgkrnl Linux kernel driver (this driver)
- Kernel mode port driver on the Windows host (dxgkrnl.sys / dxgmms2.sys)
- Kernel mode miniport driver (KMD) on the Windows host, written by a
  hardware vendor running on the Windows host and interfacing with the
  hardware device.

dxgkrnl exposes a subset the WDDM/MCDM D3DKMT interface to user space. See
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
This interface provides user space with an abstracted view and control
of compute devices in a portable way across Windows and WSL. It is used for
devices such as GPU or AI/ML processors. This interface is mapped to custom
IOCTLs on Linux (see d3dkmthk.h). The libdxg library translates the D3DKMT
interface calls to the driver IOCTLs.

A more detailed overview of this architecture is available here:
https://devblogs.microsoft.com/directx/directx-heart-linux/

Virtual compute devices are paravirtualized, meaning that the actual
communication with the corresponding device hardware happens on the host.
The version of dxgkrnl inside of the Linux kernel coordinates with the
version of dxgkrnl running on Windows to provide a consistent and
portable abstraction for the device that the various APIs and UMD can
rely on across Windows and Linux.

Dxgkrnl creates the /dev/dxg device, which can be used to enumerate virtual
compute devices (also called adapters). UMD or an API runtime open the
device and send ioctls to create various device objects (dxgdevice,
dxgcontext, dxgallocation, etc., defined in dxgkrnl.h) and to submit work
to the device. The WDDM objects are represented in user mode as opaque
handles (struct d3dkmthandle). Dxgkrnl creates a dxgprocess object
for each process, which opens the /dev/dxg device. This object has a handle
table, which is used to translate d3dkmt handles to kernel objects. Handle
table definitions are in hmgr.h. There is also a global handle table for
objects, which do not belong to a particular process.

Driver initialization
-------------------------------------------------------
When dxgkrnl is loaded, dxgkrnl registers for virtual PCI device arrival
notifications and VM bus channel device notifications (see dxgmodule.c). When
the first virtual device is started, dxgkrnl creates a misc device (/dev/dxg).
A user mode client can open the /dev/dxg device and send IOCTLs to enumerate
and control virtual compute devices.

Virtual compute device initialization
-------------------------------------------------------
A virtual device is represented by a dxgadapter object. It is created when
the corresponding device arrives on the virtual PCI bus. The device vendor
is PCI_VENDOR_ID_MICROSOFT and the device id is PCI_DEVICE_ID_VIRTUAL_RENDER.
The adapter is started when the corresponding VM bus channel and the global
VM bus channel are initialized. Dynamic arrival/removal of devices is
supported.

Internal objects
-------------------------------------------------------
dxgkrnl creates various internal objects in response to IOCTL calls. The
corresponsing objects are also created on the host. Each object is placed to
a process or global handle table. Object handles (d3dkmthandle) are returned
to user mode. The object handles are also used to reference objects on the host.
Corresponding objects in the guest and the host have the same handle value to
avoid handle translation. The internal objects are:
- dxgadapter
  Represents a virtual compute device object. It is created for every device
  projected by the host to the VM.
- dxgprocess
  The object is created for each Linux process, which opens /dxg/dev. It has the
  object handle table, which holds pointers to all internal objects, which are
  created by this process.
- dxgcontext
  Represents a device execution thread in the packet scheduling mode (as oppose
  to the hardware scheduling mode).
- dxghwqueue
  Represents a device execution thread in the hardware scheduling mode.
- dxgdevice
  A collection of device contexts, allocations, resources, sync objects, etc.
- dxgallocation
  Represents a device accessible memory allocation.
- dxgresource
  A collection of dxgallocation objects. This object could be shared between
  devices and processes.
- dxgsharedresource
  Represents a dxgresource object, which is sharable between processes.
- dxgsyncobject
  Represents a device synchronization object, which is used to synchronize
  execution of DMA buffers in the device execution contexts.
- dxgsharedsyncobject
  Represent a device synchronization object, which is sharable between processes.
- dxgpagingqueue
  Represents a queue, which is used to manage residency of the device allocation
  objects.

Communications with the host
-------------------------------------------------------
Dxgkrnl communicates with the host via Hyper-V VM bus channels. There
is a global channel and a per virtual computed device channel. The VM bus
messages are defined in dxgvmbus.h and the implementation is in dxgvmbus.c.

Most VM bus messages to the host are synchronous. When the host enables the
asynchronous mode, some high frequency VM bus messages are sent asynchronously
to improve performance. When async messages are enabled, all VM bus messages are
sent only via the global channel to maintain the order of messages on the host.

The host could send asynchronous messages to the Linux dxgkrnl driver via
the global VM bus channel. The host messages are handled by
dxgvmbuschannel_receive().

PCI config space of the device is used to exchange information between the host
and the guest during dxgkrnl initialization. Look at dxg_pci_probe_device().

CPU access to device accessible allocations
-------------------------------------------------------
D3DKMT API allows creation of allocations, which are accessible by the device
and the CPU. The global VM bus channels has associated IO space, which is used
to implement CPU access to CPU visible allocations. For each such allocation
the host allocates a portion of the guest IO space and maps it to the
allocation memory (it could be in system memory or in device local memory).
A user mode application calls the LX_DXLOCK2 ioctl to get the allocation
CPU address. Dxgkrnl uses vm_mmap to allocate a user space virtual address
range and maps it to the allocation IO space using io_remap_pfn_range().
This way Linux user mode virtual addresses point to the host system memory
or device local memory.

Sharing objects between processes
-------------------------------------------------------
Some dxgkrnl objects could be shared between processes. This includes resources
(dxgresource) and synchronization objects (dxgsyncobject).
The WDDM API provides a way to share objects using so called "NT handles".
"NT handle" on Windows is a native Windows process handle (HANDLE). "NT handles"
are implemented as file descriptors (FD) on Linux.
The LX_DXSHAREOBJECTS ioctl is used to get an FD for a shared object.
Before a shared object can be used, it needs to be "opened" to get the "local"
d3dkmthandle handle. The LX_DXOPENRESOURCEFROMNTHANDLE and
LX_DXOPENSYNCOBJECTFROMNTHANDLE2 ioctls are used to open a shared object

Iouri Tarassov (30):
  drivers: hv: dxgkrnl: Add virtual compute device VM bus
    channel guids
  drivers: hv: dxgkrnl: Driver initialization and loading
  drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus
    channels.
  drivers: hv: dxgkrnl: Creation of dxgadapter object
  drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess
    creation
  drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects
  drivers: hv: dxgkrnl: Creation of dxgdevice objects
  drivers: hv: dxgkrnl: Creation of dxgcontext objects
  drivers: hv: dxgkrnl: Creation of compute device allocations and
    resources
  drivers: hv: dxgkrnl: Creation of compute device sync objects
  drivers: hv: dxgkrnl: Operations using sync objects
  drivers: hv: dxgkrnl: Sharing of dxgresource objects
  drivers: hv: dxgkrnl: Sharing of sync objects
  drivers: hv: dxgkrnl: Creation of hardware queues. Sync object
    operations to hw queue.
  drivers: hv: dxgkrnl: Creation of paging queue objects.
  drivers: hv: dxgkrnl: Submit execution commands to the compute device
  drivers: hv: dxgkrnl: Share objects with the host
  drivers: hv: dxgkrnl: Query the dxgdevice state
  drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation
  drivers: hv: dxgkrnl: Manage device allocation properties
  drivers: hv: dxgkrnl: Flush heap transitions
  drivers: hv: dxgkrnl: Query video memory information
  drivers: hv: dxgkrnl: The escape ioctl
  drivers: hv: dxgkrnl: Ioctl to put device to error state
  drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration
  drivers: hv: dxgkrnl: Offer and reclaim allocations
  drivers: hv: dxgkrnl: Ioctls to manage scheduling priority
  drivers: hv: dxgkrnl: Manage residency of allocations
  drivers: hv: dxgkrnl: Manage compute device virtual addresses
  drivers: hv: dxgkrnl: Add support to map guest pages by host

 MAINTAINERS                     |    7 +
 drivers/hv/Kconfig              |    2 +
 drivers/hv/Makefile             |    1 +
 drivers/hv/dxgkrnl/Kconfig      |   26 +
 drivers/hv/dxgkrnl/Makefile     |    5 +
 drivers/hv/dxgkrnl/dxgadapter.c | 1375 ++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  962 ++++++
 drivers/hv/dxgkrnl/dxgmodule.c  |  942 ++++++
 drivers/hv/dxgkrnl/dxgprocess.c |  334 ++
 drivers/hv/dxgkrnl/dxgvmbus.c   | 3788 ++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  868 +++++
 drivers/hv/dxgkrnl/hmgr.c       |  566 ++++
 drivers/hv/dxgkrnl/hmgr.h       |  112 +
 drivers/hv/dxgkrnl/ioctl.c      | 5368 +++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.c       |   43 +
 drivers/hv/dxgkrnl/misc.h       |   96 +
 include/linux/hyperv.h          |   16 +
 include/uapi/misc/d3dkmthk.h    | 1679 ++++++++++
 18 files changed, 16190 insertions(+)
 create mode 100644 drivers/hv/dxgkrnl/Kconfig
 create mode 100644 drivers/hv/dxgkrnl/Makefile
 create mode 100644 drivers/hv/dxgkrnl/dxgadapter.c
 create mode 100644 drivers/hv/dxgkrnl/dxgkrnl.h
 create mode 100644 drivers/hv/dxgkrnl/dxgmodule.c
 create mode 100644 drivers/hv/dxgkrnl/dxgprocess.c
 create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.c
 create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.h
 create mode 100644 drivers/hv/dxgkrnl/hmgr.c
 create mode 100644 drivers/hv/dxgkrnl/hmgr.h
 create mode 100644 drivers/hv/dxgkrnl/ioctl.c
 create mode 100644 drivers/hv/dxgkrnl/misc.c
 create mode 100644 drivers/hv/dxgkrnl/misc.h
 create mode 100644 include/uapi/misc/d3dkmthk.h

-- 
2.35.1


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids
  2022-03-01 19:45 [PATCH v3 00/30] Driver for Hyper-v virtual compute device Iouri Tarassov
@ 2022-03-01 19:45 ` Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
                     ` (29 more replies)
  2022-03-02  8:51 ` [PATCH v3 00/30] Driver for Hyper-v virtual compute device Christoph Hellwig
  2022-03-02 14:53 ` Wei Liu
  2 siblings, 30 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Add VM bus channel guids, which are used by hyper-v virtual
compute device driver.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 include/linux/hyperv.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index b823311eac79..9d21055b003d 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1414,6 +1414,22 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size);
 	.guid = GUID_INIT(0xda0a7802, 0xe377, 0x4aac, 0x8e, 0x77, \
 			  0x05, 0x58, 0xeb, 0x10, 0x73, 0xf8)
 
+/*
+ * GPU paravirtualization global DXGK channel
+ * {DDE9CBC0-5060-4436-9448-EA1254A5D177}
+ */
+#define HV_GPUP_DXGK_GLOBAL_GUID \
+	.guid = GUID_INIT(0xdde9cbc0, 0x5060, 0x4436, 0x94, 0x48, \
+			  0xea, 0x12, 0x54, 0xa5, 0xd1, 0x77)
+
+/*
+ * GPU paravirtualization per virtual GPU DXGK channel
+ * {6E382D18-3336-4F4B-ACC4-2B7703D4DF4A}
+ */
+#define HV_GPUP_DXGK_VGPU_GUID \
+	.guid = GUID_INIT(0x6e382d18, 0x3336, 0x4f4b, 0xac, 0xc4, \
+			  0x2b, 0x77, 0x3, 0xd4, 0xdf, 0x4a)
+
 /*
  * Synthetic FC GUID
  * {2f9bcc4a-0069-4af3-b76b-6fd0be528cda}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 20:45     ` Greg KH
  2022-03-01 22:06     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels Iouri Tarassov
                     ` (28 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

- Create skeleton and add basic functionality for the
hyper-v compute device driver (dxgkrnl).

- Register for PCI and VM bus driver notifications and
handle initialization of VM bus channels.

- Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig

- Create a MAINTAINERS entry

A VM bus channel is a communication interface between the hyper-v guest
and the host. The are two type of VM bus channels, used in the driver:
  - the global channel
  - per virtual compute device channel

A PCI device is created for each virtual compute device, projected
by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
arrival of such devices. The PCI config space of the virtual compute
device has luid of the corresponding virtual compute device VM
bus channel. This is how the compute device adapter objects are
linked to VM bus channels.

VM bus interface version is exchanged by reading/writing the PCI config
space of the virtual compute device.

The IO space is used to handle CPU accessible compute device
allocations. Hyper-v allocates IO space for the global VM bus channel.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 MAINTAINERS                    |   7 +
 drivers/hv/Kconfig             |   2 +
 drivers/hv/Makefile            |   1 +
 drivers/hv/dxgkrnl/Kconfig     |  26 ++
 drivers/hv/dxgkrnl/Makefile    |   5 +
 drivers/hv/dxgkrnl/dxgkrnl.h   | 119 ++++++++
 drivers/hv/dxgkrnl/dxgmodule.c | 531 +++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h   |  23 ++
 8 files changed, 714 insertions(+)
 create mode 100644 drivers/hv/dxgkrnl/Kconfig
 create mode 100644 drivers/hv/dxgkrnl/Makefile
 create mode 100644 drivers/hv/dxgkrnl/dxgkrnl.h
 create mode 100644 drivers/hv/dxgkrnl/dxgmodule.c
 create mode 100644 include/uapi/misc/d3dkmthk.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a2bd991db512..5856e09d834c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8841,6 +8841,13 @@ F:	Documentation/devicetree/bindings/mtd/ti,am654-hbmc.yaml
 F:	drivers/mtd/hyperbus/
 F:	include/linux/mtd/hyperbus.h
 
+Hyper-V vGPU DRIVER
+M:	Iouri Tarassov <iourit@microsoft.com>
+L:	linux-hyperv@vger.kernel.org
+S:	Supported
+F:	drivers/hv/dxgkrnl/
+F:	include/uapi/misc/d3dkmthk.h
+
 HYPERVISOR VIRTUAL CONSOLE DRIVER
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Odd Fixes
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index dd12af20e467..7006f7b66200 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -29,4 +29,6 @@ config HYPERV_BALLOON
 	help
 	  Select this option to enable Hyper-V Balloon driver.
 
+source "drivers/hv/dxgkrnl/Kconfig"
+
 endmenu
diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index d76df5c8c2a9..aa1cbdb5d0d2 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_HYPERV)		+= hv_vmbus.o
 obj-$(CONFIG_HYPERV_UTILS)	+= hv_utils.o
 obj-$(CONFIG_HYPERV_BALLOON)	+= hv_balloon.o
+obj-$(CONFIG_DXGKRNL)		+= dxgkrnl/
 
 CFLAGS_hv_trace.o = -I$(src)
 CFLAGS_hv_balloon.o = -I$(src)
diff --git a/drivers/hv/dxgkrnl/Kconfig b/drivers/hv/dxgkrnl/Kconfig
new file mode 100644
index 000000000000..bcd92bbff939
--- /dev/null
+++ b/drivers/hv/dxgkrnl/Kconfig
@@ -0,0 +1,26 @@
+# SPDX-License-Identifier: GPL-2.0
+# Configuration for the hyper-v virtual compute driver (dxgkrnl)
+#
+
+config DXGKRNL
+	tristate "Microsoft Paravirtualized GPU support"
+	depends on HYPERV
+	depends on 64BIT || COMPILE_TEST
+	help
+	  This driver supports paravirtualized virtual compute devices, exposed
+	  by Microsoft Hyper-V when Linux is running inside of a virtual machine
+	  hosted by Windows. The virtual machines needs to be configured to use
+	  host compute adapters. The driver name is dxgkrnl.
+
+	  An example of such virtual machine is a  Windows Subsystem for
+	  Linux container. When such container is instantiated, the Windows host
+	  assigns compatible host GPU adapters to the container. The corresponding
+	  virtual GPU devices appear on the PCI bus in the container. These
+	  devices are enumerated and accessed by this driver.
+
+	  Communications with the driver are done by using the Microsoft libdxcore
+	  library, which translates the D3DKMT interface
+	  <https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/>
+	  to the driver IOCTLs. The virtual GPU devices are paravirtualized,
+	  which means that access to the hardware is done in the host. The driver
+	  communicates with the host using Hyper-V VM bus communication channels.
diff --git a/drivers/hv/dxgkrnl/Makefile b/drivers/hv/dxgkrnl/Makefile
new file mode 100644
index 000000000000..052f8bd62ad3
--- /dev/null
+++ b/drivers/hv/dxgkrnl/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for the hyper-v compute device driver (dxgkrnl).
+
+obj-$(CONFIG_DXGKRNL)	+= dxgkrnl.o
+dxgkrnl-y		:= dxgmodule.o
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
new file mode 100644
index 000000000000..6b3e9334bce8
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Headers for internal objects
+ *
+ */
+
+#ifndef _DXGKRNL_H
+#define _DXGKRNL_H
+
+#include <linux/uuid.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/semaphore.h>
+#include <linux/refcount.h>
+#include <linux/rwsem.h>
+#include <linux/atomic.h>
+#include <linux/spinlock.h>
+#include <linux/gfp.h>
+#include <linux/miscdevice.h>
+#include <linux/pci.h>
+#include <linux/hyperv.h>
+
+struct dxgadapter;
+
+#include <uapi/misc/d3dkmthk.h>
+
+struct dxgvmbuschannel {
+	struct vmbus_channel	*channel;
+	struct hv_device	*hdev;
+	spinlock_t		packet_list_mutex;
+	struct list_head	packet_list_head;
+	struct kmem_cache	*packet_cache;
+	atomic64_t		packet_request_id;
+};
+
+int dxgvmbuschannel_init(struct dxgvmbuschannel *ch, struct hv_device *hdev);
+void dxgvmbuschannel_destroy(struct dxgvmbuschannel *ch);
+void dxgvmbuschannel_receive(void *ctx);
+
+/*
+ * The structure defines an offered vGPU vm bus channel.
+ */
+struct dxgvgpuchannel {
+	struct list_head	vgpu_ch_list_entry;
+	struct winluid		adapter_luid;
+	struct hv_device	*hdev;
+};
+
+struct dxgglobal {
+	struct dxgvmbuschannel	channel;
+	struct delayed_work	dwork;
+	struct hv_device	*hdev;
+	u32			num_adapters;
+	u32			vmbus_ver;	/* Interface version */
+	struct resource		*mem;
+	u64			mmiospace_base;
+	u64			mmiospace_size;
+	struct miscdevice	dxgdevice;
+	struct mutex		device_mutex;
+
+	/*  list of created  processes */
+	struct list_head	plisthead;
+	struct mutex		plistmutex;
+
+	/*
+	 * List of the vGPU VM bus channels (dxgvgpuchannel)
+	 * Protected by device_mutex
+	 */
+	struct list_head	vgpu_ch_list_head;
+
+	/* protects acces to the global VM bus channel */
+	struct rw_semaphore	channel_lock;
+
+	bool			dxg_dev_initialized;
+	bool			vmbus_registered;
+	bool			pci_registered;
+	bool			global_channel_initialized;
+	bool			async_msg_enabled;
+};
+
+extern struct dxgglobal		*dxgglobal;
+
+struct dxgprocess {
+	/* Placeholder */
+};
+
+void init_ioctls(void);
+long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
+long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
+
+static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
+{
+	*luid = *(struct winluid *)&guid->b[0];
+}
+
+/*
+ * VM bus interface
+ *
+ */
+
+/*
+ * The interface version is used to ensure that the host and the guest use the
+ * same VM bus protocol. It needs to be incremented every time the VM bus
+ * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
+ * incremented each time the earlier versions of the interface are no longer
+ * compatible with the current version.
+ */
+#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
+#define DXGK_VMBUS_INTERFACE_VERSION			40
+#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
+
+#endif
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
new file mode 100644
index 000000000000..412c37bc592d
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -0,0 +1,531 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Interface with Linux kernel, PCI driver and the VM bus driver
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/eventfd.h>
+#include <linux/hyperv.h>
+#include <linux/pci.h>
+
+#include "dxgkrnl.h"
+
+/*
+ * Pointer to the global device data. By design
+ * there is a single vGPU device on the VM bus and a single /dev/dxg device
+ * is created.
+ */
+struct dxgglobal *dxgglobal;
+
+#define DXGKRNL_VERSION			0x2216
+#define PCI_VENDOR_ID_MICROSOFT		0x1414
+#define PCI_DEVICE_ID_VIRTUAL_RENDER	0x008E
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+//
+// Interface from dxgglobal
+//
+
+struct vmbus_channel *dxgglobal_get_vmbus(void)
+{
+	return dxgglobal->channel.channel;
+}
+
+struct dxgvmbuschannel *dxgglobal_get_dxgvmbuschannel(void)
+{
+	return &dxgglobal->channel;
+}
+
+int dxgglobal_acquire_channel_lock(void)
+{
+	down_read(&dxgglobal->channel_lock);
+	if (dxgglobal->channel.channel == NULL) {
+		pr_err("Failed to acquire global channel lock");
+		return -ENODEV;
+	} else {
+		return 0;
+	}
+}
+
+void dxgglobal_release_channel_lock(void)
+{
+	up_read(&dxgglobal->channel_lock);
+}
+
+/*
+ * File operations for the /dev/dxg device
+ */
+
+static int dxgk_open(struct inode *n, struct file *f)
+{
+	return 0;
+}
+
+static int dxgk_release(struct inode *n, struct file *f)
+{
+	return 0;
+}
+
+static ssize_t dxgk_read(struct file *f, char __user *s, size_t len,
+			 loff_t *o)
+{
+	pr_debug("file read\n");
+	return 0;
+}
+
+static ssize_t dxgk_write(struct file *f, const char __user *s, size_t len,
+			  loff_t *o)
+{
+	pr_debug("file write\n");
+	return len;
+}
+
+const struct file_operations dxgk_fops = {
+	.owner = THIS_MODULE,
+	.open = dxgk_open,
+	.release = dxgk_release,
+	.write = dxgk_write,
+	.read = dxgk_read,
+};
+
+/*
+ * Interface with the PCI driver
+ */
+
+/*
+ * Part of the PCI config space of the vGPU device is used for vGPU
+ * configuration data. Reading/writing of the PCI config space is forwarded
+ * to the host.
+ */
+
+/* vGPU VM bus channel instance ID */
+#define DXGK_VMBUS_CHANNEL_ID_OFFSET 192
+/* DXGK_VMBUS_INTERFACE_VERSION (u32) */
+#define DXGK_VMBUS_VERSION_OFFSET	(DXGK_VMBUS_CHANNEL_ID_OFFSET + \
+					sizeof(guid_t))
+/* Luid of the virtual GPU on the host (struct winluid) */
+#define DXGK_VMBUS_VGPU_LUID_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
+					sizeof(u32))
+/* The guest writes its capavilities to this adderss */
+#define DXGK_VMBUS_GUESTCAPS_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
+					sizeof(u32))
+
+/* Capabilities of the guest driver, reported to the host */
+struct dxgk_vmbus_guestcaps {
+	union {
+		struct {
+			u32	wsl2		: 1;
+			u32	reserved	: 31;
+		};
+		u32 guest_caps;
+	};
+};
+
+/*
+ * A helper function to read PCI config space.
+ */
+static int dxg_pci_read_dwords(struct pci_dev *dev, int offset, int size,
+			       void *val)
+{
+	int off = offset;
+	int ret;
+	int i;
+
+	for (i = 0; i < size / sizeof(int); i++) {
+		ret = pci_read_config_dword(dev, off, &((int *)val)[i]);
+		if (ret) {
+			pr_err("Failed to read PCI config: %d", off);
+			return ret;
+		}
+		off += sizeof(int);
+	}
+	return 0;
+}
+
+static int dxg_pci_probe_device(struct pci_dev *dev,
+				const struct pci_device_id *id)
+{
+	int ret;
+	guid_t guid;
+	u32 vmbus_interface_ver = DXGK_VMBUS_INTERFACE_VERSION;
+	struct winluid vgpu_luid = {};
+	struct dxgk_vmbus_guestcaps guest_caps = {.wsl2 = 1};
+
+	mutex_lock(&dxgglobal->device_mutex);
+
+	if (dxgglobal->vmbus_ver == 0)  {
+		/* Report capabilities to the host */
+
+		ret = pci_write_config_dword(dev, DXGK_VMBUS_GUESTCAPS_OFFSET,
+					guest_caps.guest_caps);
+		if (ret)
+			goto cleanup;
+
+		/* Negotiate the VM bus version */
+
+		ret = pci_read_config_dword(dev, DXGK_VMBUS_VERSION_OFFSET,
+					&vmbus_interface_ver);
+		if (ret == 0 && vmbus_interface_ver != 0)
+			dxgglobal->vmbus_ver = vmbus_interface_ver;
+		else
+			dxgglobal->vmbus_ver = DXGK_VMBUS_INTERFACE_VERSION_OLD;
+
+		if (dxgglobal->vmbus_ver < DXGK_VMBUS_INTERFACE_VERSION)
+			goto read_channel_id;
+
+		ret = pci_write_config_dword(dev, DXGK_VMBUS_VERSION_OFFSET,
+					DXGK_VMBUS_INTERFACE_VERSION);
+		if (ret)
+			goto cleanup;
+
+		if (dxgglobal->vmbus_ver > DXGK_VMBUS_INTERFACE_VERSION)
+			dxgglobal->vmbus_ver = DXGK_VMBUS_INTERFACE_VERSION;
+	}
+
+read_channel_id:
+
+	/* Get the VM bus channel ID for the virtual GPU */
+	ret = dxg_pci_read_dwords(dev, DXGK_VMBUS_CHANNEL_ID_OFFSET,
+				sizeof(guid), (int *)&guid);
+	if (ret)
+		goto cleanup;
+
+	if (dxgglobal->vmbus_ver >= DXGK_VMBUS_INTERFACE_VERSION) {
+		ret = dxg_pci_read_dwords(dev, DXGK_VMBUS_VGPU_LUID_OFFSET,
+					  sizeof(vgpu_luid), &vgpu_luid);
+		if (ret)
+			goto cleanup;
+	}
+
+	pr_debug("Adapter channel: %pUb\n", &guid);
+	pr_debug("Vmbus interface version: %d\n",
+		dxgglobal->vmbus_ver);
+	pr_debug("Host vGPU luid: %x-%x\n",
+		vgpu_luid.b, vgpu_luid.a);
+
+cleanup:
+
+	mutex_unlock(&dxgglobal->device_mutex);
+
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static void dxg_pci_remove_device(struct pci_dev *dev)
+{
+	/* Placeholder */
+}
+
+static struct pci_device_id dxg_pci_id_table = {
+	.vendor = PCI_VENDOR_ID_MICROSOFT,
+	.device = PCI_DEVICE_ID_VIRTUAL_RENDER,
+	.subvendor = PCI_ANY_ID,
+	.subdevice = PCI_ANY_ID
+};
+
+static struct pci_driver dxg_pci_drv = {
+	.name = KBUILD_MODNAME,
+	.id_table = &dxg_pci_id_table,
+	.probe = dxg_pci_probe_device,
+	.remove = dxg_pci_remove_device
+};
+
+/*
+ * Interface with the VM bus driver
+ */
+
+static int dxgglobal_getiospace(struct dxgglobal *dxgglobal)
+{
+	/* Get mmio space for the global channel */
+	struct hv_device *hdev = dxgglobal->hdev;
+	struct vmbus_channel *channel = hdev->channel;
+	resource_size_t pot_start = 0;
+	resource_size_t pot_end = -1;
+	int ret;
+
+	dxgglobal->mmiospace_size = channel->offermsg.offer.mmio_megabytes;
+	if (dxgglobal->mmiospace_size == 0) {
+		pr_debug("zero mmio space is offered\n");
+		return -ENOMEM;
+	}
+	dxgglobal->mmiospace_size <<= 20;
+	pr_debug("mmio offered: %llx\n",
+		dxgglobal->mmiospace_size);
+
+	ret = vmbus_allocate_mmio(&dxgglobal->mem, hdev, pot_start, pot_end,
+				  dxgglobal->mmiospace_size, 0x10000, false);
+	if (ret) {
+		pr_err("Unable to allocate mmio memory: %d\n", ret);
+		return ret;
+	}
+	dxgglobal->mmiospace_size = dxgglobal->mem->end -
+	    dxgglobal->mem->start + 1;
+	dxgglobal->mmiospace_base = dxgglobal->mem->start;
+	pr_info("mmio allocated %llx  %llx %llx %llx\n",
+		dxgglobal->mmiospace_base,
+		dxgglobal->mmiospace_size,
+		dxgglobal->mem->start, dxgglobal->mem->end);
+
+	return 0;
+}
+
+int dxgglobal_init_global_channel(void)
+{
+	int ret = 0;
+
+	ret = dxgvmbuschannel_init(&dxgglobal->channel, dxgglobal->hdev);
+	if (ret) {
+		pr_err("dxgvmbuschannel_init failed: %d\n", ret);
+		goto error;
+	}
+
+	ret = dxgglobal_getiospace(dxgglobal);
+	if (ret) {
+		pr_err("getiospace failed: %d\n", ret);
+		goto error;
+	}
+
+	hv_set_drvdata(dxgglobal->hdev, dxgglobal);
+
+	dxgglobal->dxgdevice.minor = MISC_DYNAMIC_MINOR;
+	dxgglobal->dxgdevice.name = "dxg";
+	dxgglobal->dxgdevice.fops = &dxgk_fops;
+	dxgglobal->dxgdevice.mode = 0666;
+	ret = misc_register(&dxgglobal->dxgdevice);
+	if (ret) {
+		pr_err("misc_register failed: %d", ret);
+		goto error;
+	}
+	dxgglobal->dxg_dev_initialized = true;
+
+error:
+	return ret;
+}
+
+void dxgglobal_destroy_global_channel(void)
+{
+	down_write(&dxgglobal->channel_lock);
+
+	dxgglobal->global_channel_initialized = false;
+
+	if (dxgglobal->dxg_dev_initialized) {
+		misc_deregister(&dxgglobal->dxgdevice);
+		dxgglobal->dxg_dev_initialized = false;
+	}
+
+	if (dxgglobal->mem) {
+		vmbus_free_mmio(dxgglobal->mmiospace_base,
+				dxgglobal->mmiospace_size);
+		dxgglobal->mem = NULL;
+	}
+
+	dxgvmbuschannel_destroy(&dxgglobal->channel);
+
+	if (dxgglobal->hdev) {
+		hv_set_drvdata(dxgglobal->hdev, NULL);
+		dxgglobal->hdev = NULL;
+	}
+
+	up_write(&dxgglobal->channel_lock);
+}
+
+static const struct hv_vmbus_device_id id_table[] = {
+	/* Per GPU Device GUID */
+	{ HV_GPUP_DXGK_VGPU_GUID },
+	/* Global Dxgkgnl channel for the virtual machine */
+	{ HV_GPUP_DXGK_GLOBAL_GUID },
+	{ }
+};
+
+static int dxg_probe_vmbus(struct hv_device *hdev,
+			   const struct hv_vmbus_device_id *dev_id)
+{
+	int ret = 0;
+	struct winluid luid;
+	struct dxgvgpuchannel *vgpuch;
+
+	mutex_lock(&dxgglobal->device_mutex);
+
+	if (uuid_le_cmp(hdev->dev_type, id_table[0].guid) == 0) {
+		/* This is a new virtual GPU channel */
+		guid_to_luid(&hdev->channel->offermsg.offer.if_instance, &luid);
+		pr_debug("vGPU channel: %pUb",
+			    &hdev->channel->offermsg.offer.if_instance);
+		vgpuch = vzalloc(sizeof(struct dxgvgpuchannel));
+		if (vgpuch == NULL) {
+			ret = -ENOMEM;
+			goto error;
+		}
+		vgpuch->adapter_luid = luid;
+		vgpuch->hdev = hdev;
+		list_add_tail(&vgpuch->vgpu_ch_list_entry,
+			      &dxgglobal->vgpu_ch_list_head);
+	} else if (uuid_le_cmp(hdev->dev_type, id_table[1].guid) == 0) {
+		/* This is the global Dxgkgnl channel */
+		pr_debug("Global channel: %pUb",
+			    &hdev->channel->offermsg.offer.if_instance);
+		if (dxgglobal->hdev) {
+			/* This device should appear only once */
+			pr_err("global channel already present\n");
+			ret = -EBADE;
+			goto error;
+		}
+		dxgglobal->hdev = hdev;
+	} else {
+		/* Unknown device type */
+		pr_err("probe: unknown device type\n");
+		ret = -EBADE;
+		goto error;
+	}
+
+error:
+
+	mutex_unlock(&dxgglobal->device_mutex);
+
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static int dxg_remove_vmbus(struct hv_device *hdev)
+{
+	int ret = 0;
+	struct dxgvgpuchannel *vgpu_channel;
+
+	mutex_lock(&dxgglobal->device_mutex);
+
+	if (uuid_le_cmp(hdev->dev_type, id_table[0].guid) == 0) {
+		pr_debug("Remove virtual GPU channel\n");
+		list_for_each_entry(vgpu_channel,
+				    &dxgglobal->vgpu_ch_list_head,
+				    vgpu_ch_list_entry) {
+			if (vgpu_channel->hdev == hdev) {
+				list_del(&vgpu_channel->vgpu_ch_list_entry);
+				vfree(vgpu_channel);
+				break;
+			}
+		}
+	} else if (uuid_le_cmp(hdev->dev_type, id_table[1].guid) == 0) {
+		pr_debug("Remove global channel device\n");
+		dxgglobal_destroy_global_channel();
+	} else {
+		/* Unknown device type */
+		pr_err("remove: unknown device type\n");
+		ret = -EBADE;
+	}
+
+	mutex_unlock(&dxgglobal->device_mutex);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+MODULE_DEVICE_TABLE(vmbus, id_table);
+
+static struct hv_driver dxg_drv = {
+	.name = KBUILD_MODNAME,
+	.id_table = id_table,
+	.probe = dxg_probe_vmbus,
+	.remove = dxg_remove_vmbus,
+	.driver = {
+		   .probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		    },
+};
+
+/*
+ * Interface with Linux kernel
+ */
+
+static int dxgglobal_create(void)
+{
+	int ret = 0;
+
+	dxgglobal = vzalloc(sizeof(struct dxgglobal));
+	if (!dxgglobal)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&dxgglobal->plisthead);
+	mutex_init(&dxgglobal->plistmutex);
+	mutex_init(&dxgglobal->device_mutex);
+
+	INIT_LIST_HEAD(&dxgglobal->vgpu_ch_list_head);
+
+	init_rwsem(&dxgglobal->channel_lock);
+
+	pr_debug("dxgglobal_init end\n");
+	return ret;
+}
+
+static void dxgglobal_destroy(void)
+{
+	if (dxgglobal) {
+		if (dxgglobal->vmbus_registered)
+			vmbus_driver_unregister(&dxg_drv);
+
+		dxgglobal_destroy_global_channel();
+
+		if (dxgglobal->pci_registered)
+			pci_unregister_driver(&dxg_pci_drv);
+
+		vfree(dxgglobal);
+		dxgglobal = NULL;
+	}
+}
+
+/*
+ * Driver entry points
+ */
+
+static int __init dxg_drv_init(void)
+{
+	int ret;
+
+
+	ret = dxgglobal_create();
+	if (ret) {
+		pr_err("dxgglobal_init failed");
+		return -ENOMEM;
+	}
+
+	ret = vmbus_driver_register(&dxg_drv);
+	if (ret) {
+		pr_err("vmbus_driver_register failed: %d", ret);
+		return ret;
+	}
+	dxgglobal->vmbus_registered = true;
+
+	pr_info("%s  Version: %x", __func__, DXGKRNL_VERSION);
+
+	ret = pci_register_driver(&dxg_pci_drv);
+	if (ret) {
+		pr_err("pci_driver_register failed: %d", ret);
+		return ret;
+	}
+	dxgglobal->pci_registered = true;
+
+	init_ioctls();
+
+	return 0;
+}
+
+static void __exit dxg_drv_exit(void)
+{
+	dxgglobal_destroy();
+}
+
+module_init(dxg_drv_init);
+module_exit(dxg_drv_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Microsoft Dxgkrnl virtual GPU Driver");
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
new file mode 100644
index 000000000000..bdb7bc325d1a
--- /dev/null
+++ b/include/uapi/misc/d3dkmthk.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * User mode WDDM interface definitions
+ *
+ */
+
+#ifndef _D3DKMTHK_H
+#define _D3DKMTHK_H
+
+/* Matches Windows LUID definition */
+struct winluid {
+	__u32 a;
+	__u32 b;
+};
+
+#endif /* _D3DKMTHK_H */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels.
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 22:57     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object Iouri Tarassov
                     ` (27 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement support for sending/receiving VM bus messages between
the host and the guest.

Initialize the VM bus channels and notify the host about IO space
settings of the VM bus global channel.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/Makefile    |   2 +-
 drivers/hv/dxgkrnl/dxgkrnl.h   |  14 ++
 drivers/hv/dxgkrnl/dxgmodule.c |   7 +
 drivers/hv/dxgkrnl/dxgvmbus.c  | 413 +++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h  |  86 +++++++
 drivers/hv/dxgkrnl/hmgr.h      |  75 ++++++
 drivers/hv/dxgkrnl/ioctl.c     |  24 ++
 drivers/hv/dxgkrnl/misc.h      |  72 ++++++
 include/uapi/misc/d3dkmthk.h   |  34 +++
 9 files changed, 726 insertions(+), 1 deletion(-)
 create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.c
 create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.h
 create mode 100644 drivers/hv/dxgkrnl/hmgr.h
 create mode 100644 drivers/hv/dxgkrnl/ioctl.c
 create mode 100644 drivers/hv/dxgkrnl/misc.h

diff --git a/drivers/hv/dxgkrnl/Makefile b/drivers/hv/dxgkrnl/Makefile
index 052f8bd62ad3..76349064b60a 100644
--- a/drivers/hv/dxgkrnl/Makefile
+++ b/drivers/hv/dxgkrnl/Makefile
@@ -2,4 +2,4 @@
 # Makefile for the hyper-v compute device driver (dxgkrnl).
 
 obj-$(CONFIG_DXGKRNL)	+= dxgkrnl.o
-dxgkrnl-y		:= dxgmodule.o
+dxgkrnl-y		:= dxgmodule.o dxgvmbus.o
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 6b3e9334bce8..533f3bb3ece2 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -29,6 +29,7 @@
 
 struct dxgadapter;
 
+#include "misc.h"
 #include <uapi/misc/d3dkmthk.h>
 
 struct dxgvmbuschannel {
@@ -87,6 +88,13 @@ struct dxgglobal {
 
 extern struct dxgglobal		*dxgglobal;
 
+int dxgglobal_init_global_channel(void);
+void dxgglobal_destroy_global_channel(void);
+struct vmbus_channel *dxgglobal_get_vmbus(void);
+struct dxgvmbuschannel *dxgglobal_get_dxgvmbuschannel(void);
+int dxgglobal_acquire_channel_lock(void);
+void dxgglobal_release_channel_lock(void);
+
 struct dxgprocess {
 	/* Placeholder */
 };
@@ -116,4 +124,10 @@ static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
 #define DXGK_VMBUS_INTERFACE_VERSION			40
 #define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
 
+void dxgvmb_initialize(void);
+int dxgvmb_send_set_iospace_region(u64 start, u64 len,
+				   struct vmbus_gpadl *shared_mem_gpadl);
+
+int ntstatus2int(struct ntstatus status);
+
 #endif
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index 412c37bc592d..78f233e354b9 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -296,6 +296,13 @@ int dxgglobal_init_global_channel(void)
 		goto error;
 	}
 
+	ret = dxgvmb_send_set_iospace_region(dxgglobal->mmiospace_base,
+					     dxgglobal->mmiospace_size, 0);
+	if (ret < 0) {
+		pr_err("send_set_iospace_region failed");
+		goto error;
+	}
+
 	hv_set_drvdata(dxgglobal->hdev, dxgglobal);
 
 	dxgglobal->dxgdevice.minor = MISC_DYNAMIC_MINOR;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
new file mode 100644
index 000000000000..399dec02a3a1
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -0,0 +1,413 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * VM bus interface implementation
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/completion.h>
+#include <linux/slab.h>
+#include <linux/eventfd.h>
+#include <linux/hyperv.h>
+#include <linux/mman.h>
+#include <linux/delay.h>
+#include <linux/pagemap.h>
+#include "dxgkrnl.h"
+#include "dxgvmbus.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+#define RING_BUFSIZE (256 * 1024)
+
+/*
+ * The structure is used to track VM bus packets, waiting for completion.
+ */
+struct dxgvmbuspacket {
+	struct list_head packet_list_entry;
+	u64 request_id;
+	struct completion wait;
+	void *buffer;
+	u32 buffer_length;
+	int status;
+	bool completed;
+};
+
+struct dxgvmb_ext_header {
+	/* Offset from the start of the message to DXGKVMB_COMMAND_BASE */
+	u32		command_offset;
+	u32		reserved;
+	struct winluid	vgpu_luid;
+};
+
+#define VMBUSMESSAGEONSTACK	64
+
+struct dxgvmbusmsg {
+/* Points to the allocated buffer */
+	struct dxgvmb_ext_header	*hdr;
+/* Points to dxgkvmb_command_vm_to_host or dxgkvmb_command_vgpu_to_host */
+	void				*msg;
+/* The vm bus channel, used to pass the message to the host */
+	struct dxgvmbuschannel		*channel;
+/* Message size in bytes including the header and the payload */
+	u32				size;
+/* Buffer used for small messages */
+	char				msg_on_stack[VMBUSMESSAGEONSTACK];
+};
+
+struct dxgvmbusmsgres {
+/* Points to the allocated buffer */
+	struct dxgvmb_ext_header	*hdr;
+/* Points to dxgkvmb_command_vm_to_host or dxgkvmb_command_vgpu_to_host */
+	void				*msg;
+/* The vm bus channel, used to pass the message to the host */
+	struct dxgvmbuschannel		*channel;
+/* Message size in bytes including the header, the payload and the result */
+	u32				size;
+/* Result buffer size in bytes */
+	u32				res_size;
+/* Points to the result within the allocated buffer */
+	void				*res;
+};
+
+static int init_message(struct dxgvmbusmsg *msg,
+			struct dxgprocess *process, u32 size)
+{
+	bool use_ext_header = dxgglobal->vmbus_ver >=
+			      DXGK_VMBUS_INTERFACE_VERSION;
+
+	if (use_ext_header)
+		size += sizeof(struct dxgvmb_ext_header);
+	msg->size = size;
+	if (size <= VMBUSMESSAGEONSTACK) {
+		msg->hdr = (void *)msg->msg_on_stack;
+		memset(msg->hdr, 0, size);
+	} else {
+		msg->hdr = vzalloc(size);
+		if (msg->hdr == NULL)
+			return -ENOMEM;
+	}
+	if (use_ext_header) {
+		msg->msg = (char *)&msg->hdr[1];
+		msg->hdr->command_offset = sizeof(msg->hdr[0]);
+	} else {
+		msg->msg = (char *)msg->hdr;
+	}
+	msg->channel = &dxgglobal->channel;
+	return 0;
+}
+
+static void free_message(struct dxgvmbusmsg *msg, struct dxgprocess *process)
+{
+	if (msg->hdr && (char *)msg->hdr != msg->msg_on_stack)
+		vfree(msg->hdr);
+}
+
+/*
+ * Helper functions
+ */
+
+int ntstatus2int(struct ntstatus status)
+{
+	if (NT_SUCCESS(status))
+		return (int)status.v;
+	switch (status.v) {
+	case STATUS_OBJECT_NAME_COLLISION:
+		return -EEXIST;
+	case STATUS_NO_MEMORY:
+		return -ENOMEM;
+	case STATUS_INVALID_PARAMETER:
+		return -EINVAL;
+	case STATUS_OBJECT_NAME_INVALID:
+	case STATUS_OBJECT_NAME_NOT_FOUND:
+		return -ENOENT;
+	case STATUS_TIMEOUT:
+		return -EAGAIN;
+	case STATUS_BUFFER_TOO_SMALL:
+		return -EOVERFLOW;
+	case STATUS_DEVICE_REMOVED:
+		return -ENODEV;
+	case STATUS_ACCESS_DENIED:
+		return -EACCES;
+	case STATUS_NOT_SUPPORTED:
+		return -EPERM;
+	case STATUS_ILLEGAL_INSTRUCTION:
+		return -EOPNOTSUPP;
+	case STATUS_INVALID_HANDLE:
+		return -EBADF;
+	case STATUS_GRAPHICS_ALLOCATION_BUSY:
+		return -EINPROGRESS;
+	case STATUS_OBJECT_TYPE_MISMATCH:
+		return -EPROTOTYPE;
+	case STATUS_NOT_IMPLEMENTED:
+		return -EPERM;
+	default:
+		return -EINVAL;
+	}
+}
+
+int dxgvmbuschannel_init(struct dxgvmbuschannel *ch, struct hv_device *hdev)
+{
+	int ret;
+
+	ch->hdev = hdev;
+	spin_lock_init(&ch->packet_list_mutex);
+	INIT_LIST_HEAD(&ch->packet_list_head);
+	atomic64_set(&ch->packet_request_id, 0);
+
+	ch->packet_cache = kmem_cache_create("DXGK packet cache",
+					     sizeof(struct dxgvmbuspacket), 0,
+					     0, NULL);
+	if (ch->packet_cache == NULL) {
+		pr_err("packet_cache alloc failed");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	hdev->channel->max_pkt_size = DXG_MAX_VM_BUS_PACKET_SIZE;
+	ret = vmbus_open(hdev->channel, RING_BUFSIZE, RING_BUFSIZE,
+			 NULL, 0, dxgvmbuschannel_receive, ch);
+	if (ret) {
+		pr_err("vmbus_open failed: %d", ret);
+		goto cleanup;
+	}
+
+	ch->channel = hdev->channel;
+
+cleanup:
+
+	return ret;
+}
+
+void dxgvmbuschannel_destroy(struct dxgvmbuschannel *ch)
+{
+	kmem_cache_destroy(ch->packet_cache);
+	ch->packet_cache = NULL;
+
+	if (ch->channel) {
+		vmbus_close(ch->channel);
+		ch->channel = NULL;
+	}
+}
+
+static void command_vm_to_host_init1(struct dxgkvmb_command_vm_to_host *command,
+				     enum dxgkvmb_commandtype_global type)
+{
+	command->command_type = type;
+	command->process.v = 0;
+	command->command_id = 0;
+	command->channel_type = DXGKVMB_VM_TO_HOST;
+}
+
+static void process_inband_packet(struct dxgvmbuschannel *channel,
+				  struct vmpacket_descriptor *desc)
+{
+	u32 packet_length = hv_pkt_datalen(desc);
+	struct dxgkvmb_command_host_to_vm *packet;
+
+	if (packet_length < sizeof(struct dxgkvmb_command_host_to_vm)) {
+		pr_err("Invalid global packet");
+	} else {
+		packet = hv_pkt_data(desc);
+		pr_debug("global packet %d",
+				packet->command_type);
+		switch (packet->command_type) {
+		case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
+		case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
+			break;
+		case DXGK_VMBCOMMAND_SENDWNFNOTIFICATION:
+			break;
+		default:
+			pr_err("unexpected host message %d",
+					packet->command_type);
+		}
+	}
+}
+
+static void process_completion_packet(struct dxgvmbuschannel *channel,
+				      struct vmpacket_descriptor *desc)
+{
+	struct dxgvmbuspacket *packet = NULL;
+	struct dxgvmbuspacket *entry;
+	u32 packet_length = hv_pkt_datalen(desc);
+	unsigned long flags;
+
+	spin_lock_irqsave(&channel->packet_list_mutex, flags);
+	list_for_each_entry(entry, &channel->packet_list_head,
+			    packet_list_entry) {
+		if (desc->trans_id == entry->request_id) {
+			packet = entry;
+			list_del(&packet->packet_list_entry);
+			packet->completed = true;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&channel->packet_list_mutex, flags);
+	if (packet) {
+		if (packet->buffer_length) {
+			if (packet_length < packet->buffer_length) {
+				pr_debug("invalid size %d Expected:%d",
+					    packet_length,
+					    packet->buffer_length);
+				packet->status = -EOVERFLOW;
+			} else {
+				memcpy(packet->buffer, hv_pkt_data(desc),
+				       packet->buffer_length);
+			}
+		}
+		complete(&packet->wait);
+	} else {
+		pr_err("did not find packet to complete");
+	}
+}
+
+/* Receive callback for messages from the host */
+void dxgvmbuschannel_receive(void *ctx)
+{
+	struct dxgvmbuschannel *channel = ctx;
+	struct vmpacket_descriptor *desc;
+	u32 packet_length = 0;
+
+	foreach_vmbus_pkt(desc, channel->channel) {
+		packet_length = hv_pkt_datalen(desc);
+		pr_debug("next packet (id, size, type): %llu %d %d",
+			desc->trans_id, packet_length, desc->type);
+		if (desc->type == VM_PKT_COMP) {
+			process_completion_packet(channel, desc);
+		} else {
+			if (desc->type != VM_PKT_DATA_INBAND)
+				pr_err("unexpected packet type");
+			else
+				process_inband_packet(channel, desc);
+		}
+	}
+}
+
+int dxgvmb_send_sync_msg(struct dxgvmbuschannel *channel,
+			 void *command,
+			 u32 cmd_size,
+			 void *result,
+			 u32 result_size)
+{
+	int ret;
+	struct dxgvmbuspacket *packet = NULL;
+	struct dxgkvmb_command_vm_to_host *cmd1;
+
+	if (cmd_size > DXG_MAX_VM_BUS_PACKET_SIZE ||
+	    result_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("%s invalid data size", __func__);
+		return -EINVAL;
+	}
+
+	packet = kmem_cache_alloc(channel->packet_cache, 0);
+	if (packet == NULL) {
+		pr_err("kmem_cache_alloc failed");
+		return -ENOMEM;
+	}
+
+	pr_debug("send_sync_msg global: %d %p %d %d",
+		cmd1->command_type, command, cmd_size, result_size);
+
+	packet->request_id = atomic64_inc_return(&channel->packet_request_id);
+	init_completion(&packet->wait);
+	packet->buffer = result;
+	packet->buffer_length = result_size;
+	packet->status = 0;
+	packet->completed = false;
+	spin_lock_irq(&channel->packet_list_mutex);
+	list_add_tail(&packet->packet_list_entry, &channel->packet_list_head);
+	spin_unlock_irq(&channel->packet_list_mutex);
+
+	ret = vmbus_sendpacket(channel->channel, command, cmd_size,
+			       packet->request_id, VM_PKT_DATA_INBAND,
+			       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
+	if (ret) {
+		pr_err("vmbus_sendpacket failed: %x", ret);
+		spin_lock_irq(&channel->packet_list_mutex);
+		list_del(&packet->packet_list_entry);
+		spin_unlock_irq(&channel->packet_list_mutex);
+		goto cleanup;
+	}
+
+	pr_debug("waiting completion: %llu", packet->request_id);
+	ret = wait_for_completion_killable(&packet->wait);
+	if (ret) {
+		pr_err("wait_for_complition failed: %x", ret);
+		spin_lock_irq(&channel->packet_list_mutex);
+		if (!packet->completed)
+			list_del(&packet->packet_list_entry);
+		spin_unlock_irq(&channel->packet_list_mutex);
+		goto cleanup;
+	}
+	pr_debug("completion done: %llu %x",
+		packet->request_id, packet->status);
+	ret = packet->status;
+
+cleanup:
+
+	kmem_cache_free(channel->packet_cache, packet);
+	if (ret < 0)
+		pr_debug("%s failed: %x", __func__, ret);
+	return ret;
+}
+
+static int
+dxgvmb_send_sync_msg_ntstatus(struct dxgvmbuschannel *channel,
+			      void *command, u32 cmd_size)
+{
+	struct ntstatus status;
+	int ret;
+
+	ret = dxgvmb_send_sync_msg(channel, command, cmd_size,
+				   &status, sizeof(status));
+	if (ret >= 0)
+		ret = ntstatus2int(status);
+	return ret;
+}
+
+/*
+ * Global messages to the host
+ */
+
+int dxgvmb_send_set_iospace_region(u64 start, u64 len,
+	struct vmbus_gpadl *shared_mem_gpadl)
+{
+	int ret;
+	struct dxgkvmb_command_setiospaceregion *command;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	command_vm_to_host_init1(&command->hdr,
+				 DXGK_VMBCOMMAND_SETIOSPACEREGION);
+	command->start = start;
+	command->length = len;
+	if (command->shared_page_gpadl)
+		command->shared_page_gpadl = shared_mem_gpadl->gpadl_handle;
+	ret = dxgvmb_send_sync_msg_ntstatus(&dxgglobal->channel, msg.hdr,
+					    msg.size);
+	if (ret < 0)
+		pr_err("send_set_iospace_region failed %x", ret);
+
+	dxgglobal_release_channel_lock();
+cleanup:
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
new file mode 100644
index 000000000000..d6136fd1ce9a
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * VM bus interface with the host definitions
+ *
+ */
+
+#ifndef _DXGVMBUS_H
+#define _DXGVMBUS_H
+
+#define DXG_MAX_VM_BUS_PACKET_SIZE	(1024 * 128)
+
+enum dxgkvmb_commandchanneltype {
+	DXGKVMB_VGPU_TO_HOST,
+	DXGKVMB_VM_TO_HOST,
+	DXGKVMB_HOST_TO_VM
+};
+
+/*
+ *
+ * Commands, sent to the host via the guest global VM bus channel
+ * DXG_GUEST_GLOBAL_VMBUS
+ *
+ */
+
+enum dxgkvmb_commandtype_global {
+	DXGK_VMBCOMMAND_VM_TO_HOST_FIRST	= 1000,
+	DXGK_VMBCOMMAND_CREATEPROCESS	= DXGK_VMBCOMMAND_VM_TO_HOST_FIRST,
+	DXGK_VMBCOMMAND_DESTROYPROCESS		= 1001,
+	DXGK_VMBCOMMAND_OPENSYNCOBJECT		= 1002,
+	DXGK_VMBCOMMAND_DESTROYSYNCOBJECT	= 1003,
+	DXGK_VMBCOMMAND_CREATENTSHAREDOBJECT	= 1004,
+	DXGK_VMBCOMMAND_DESTROYNTSHAREDOBJECT	= 1005,
+	DXGK_VMBCOMMAND_SIGNALFENCE		= 1006,
+	DXGK_VMBCOMMAND_NOTIFYPROCESSFREEZE	= 1007,
+	DXGK_VMBCOMMAND_NOTIFYPROCESSTHAW	= 1008,
+	DXGK_VMBCOMMAND_QUERYETWSESSION		= 1009,
+	DXGK_VMBCOMMAND_SETIOSPACEREGION	= 1010,
+	DXGK_VMBCOMMAND_COMPLETETRANSACTION	= 1011,
+	DXGK_VMBCOMMAND_SHAREOBJECTWITHHOST	= 1021,
+	DXGK_VMBCOMMAND_INVALID_VM_TO_HOST
+};
+
+/*
+ * Commands, sent by the host to the VM
+ */
+enum dxgkvmb_commandtype_host_to_vm {
+	DXGK_VMBCOMMAND_SIGNALGUESTEVENT,
+	DXGK_VMBCOMMAND_PROPAGATEPRESENTHISTORYTOKEN,
+	DXGK_VMBCOMMAND_SETGUESTDATA,
+	DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE,
+	DXGK_VMBCOMMAND_SENDWNFNOTIFICATION,
+	DXGK_VMBCOMMAND_INVALID_HOST_TO_VM
+};
+
+struct dxgkvmb_command_vm_to_host {
+	u64				command_id;
+	struct d3dkmthandle		process;
+	enum dxgkvmb_commandchanneltype	channel_type;
+	enum dxgkvmb_commandtype_global	command_type;
+};
+
+struct dxgkvmb_command_host_to_vm {
+	u64					command_id;
+	struct d3dkmthandle			process;
+	u32					channel_type	: 8;
+	u32					async_msg	: 1;
+	u32					reserved	: 23;
+	enum dxgkvmb_commandtype_host_to_vm	command_type;
+};
+
+/* Returns ntstatus */
+struct dxgkvmb_command_setiospaceregion {
+	struct dxgkvmb_command_vm_to_host hdr;
+	u64				start;
+	u64				length;
+	u32				shared_page_gpadl;
+};
+
+#endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/hmgr.h b/drivers/hv/dxgkrnl/hmgr.h
new file mode 100644
index 000000000000..b8b8f3ae5939
--- /dev/null
+++ b/drivers/hv/dxgkrnl/hmgr.h
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Handle manager definitions
+ *
+ */
+
+#ifndef _HMGR_H_
+#define _HMGR_H_
+
+#include "misc.h"
+
+struct hmgrentry;
+
+/*
+ * Handle manager table.
+ *
+ * Implementation notes:
+ *   A list of free handles is built on top of the array of table entries.
+ *   free_handle_list_head is the index of the first entry in the list.
+ *   m_FreeHandleListTail is the index of an entry in the list, which is
+ *   HMGRTABLE_MIN_FREE_ENTRIES from the head. It means that when a handle is
+ *   freed, the next time the handle can be re-used is after allocating
+ *   HMGRTABLE_MIN_FREE_ENTRIES number of handles.
+ *   Handles are allocated from the start of the list and free handles are
+ *   inserted after the tail of the list.
+ *
+ */
+struct hmgrtable {
+	struct dxgprocess	*process;
+	struct hmgrentry	*entry_table;
+	u32			free_handle_list_head;
+	u32			free_handle_list_tail;
+	u32			table_size;
+	u32			free_count;
+	struct rw_semaphore	table_lock;
+};
+
+/*
+ * Handle entry data types.
+ */
+#define HMGRENTRY_TYPE_BITS 5
+
+enum hmgrentry_type {
+	HMGRENTRY_TYPE_FREE				= 0,
+	HMGRENTRY_TYPE_DXGADAPTER			= 1,
+	HMGRENTRY_TYPE_DXGSHAREDRESOURCE		= 2,
+	HMGRENTRY_TYPE_DXGDEVICE			= 3,
+	HMGRENTRY_TYPE_DXGRESOURCE			= 4,
+	HMGRENTRY_TYPE_DXGALLOCATION			= 5,
+	HMGRENTRY_TYPE_DXGOVERLAY			= 6,
+	HMGRENTRY_TYPE_DXGCONTEXT			= 7,
+	HMGRENTRY_TYPE_DXGSYNCOBJECT			= 8,
+	HMGRENTRY_TYPE_DXGKEYEDMUTEX			= 9,
+	HMGRENTRY_TYPE_DXGPAGINGQUEUE			= 10,
+	HMGRENTRY_TYPE_DXGDEVICESYNCOBJECT		= 11,
+	HMGRENTRY_TYPE_DXGPROCESS			= 12,
+	HMGRENTRY_TYPE_DXGSHAREDVMOBJECT		= 13,
+	HMGRENTRY_TYPE_DXGPROTECTEDSESSION		= 14,
+	HMGRENTRY_TYPE_DXGHWQUEUE			= 15,
+	HMGRENTRY_TYPE_DXGREMOTEBUNDLEOBJECT		= 16,
+	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEOBJECT	= 17,
+	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEPROXY	= 18,
+	HMGRENTRY_TYPE_DXGTRACKEDWORKLOAD		= 19,
+	HMGRENTRY_TYPE_LIMIT		= ((1 << HMGRENTRY_TYPE_BITS) - 1),
+	HMGRENTRY_TYPE_MONITOREDFENCE	= HMGRENTRY_TYPE_LIMIT + 1,
+};
+
+#endif
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
new file mode 100644
index 000000000000..277e25e5d8c6
--- /dev/null
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Ioctl implementation
+ *
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/anon_inodes.h>
+#include <linux/mman.h>
+
+#include "dxgkrnl.h"
+#include "dxgvmbus.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
new file mode 100644
index 000000000000..9fa3c7c8c3f5
--- /dev/null
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Misc definitions
+ *
+ */
+
+#ifndef _MISC_H_
+#define _MISC_H_
+
+extern const struct d3dkmthandle zerohandle;
+
+/*
+ * Synchronization lock hierarchy.
+ *
+ * The higher enum value, the higher is the lock order.
+ * When a lower lock ois held, the higher lock should not be acquired.
+ *
+ * channel_lock
+ * device_mutex
+ */
+
+/*
+ * Some of the Windows return codes, which needs to be translated to Linux
+ * IOCTL return codes. Positive values are success codes and need to be
+ * returned from the driver IOCTLs. libdxcore.so depends on returning
+ * specific return codes.
+ */
+#define STATUS_SUCCESS					((int)(0))
+#define	STATUS_OBJECT_NAME_INVALID			((int)(0xC0000033L))
+#define	STATUS_DEVICE_REMOVED				((int)(0xC00002B6L))
+#define	STATUS_INVALID_HANDLE				((int)(0xC0000008L))
+#define	STATUS_ILLEGAL_INSTRUCTION			((int)(0xC000001DL))
+#define	STATUS_NOT_IMPLEMENTED				((int)(0xC0000002L))
+#define	STATUS_PENDING					((int)(0x00000103L))
+#define	STATUS_ACCESS_DENIED				((int)(0xC0000022L))
+#define	STATUS_BUFFER_TOO_SMALL				((int)(0xC0000023L))
+#define	STATUS_OBJECT_TYPE_MISMATCH			((int)(0xC0000024L))
+#define	STATUS_GRAPHICS_ALLOCATION_BUSY			((int)(0xC01E0102L))
+#define	STATUS_NOT_SUPPORTED				((int)(0xC00000BBL))
+#define	STATUS_TIMEOUT					((int)(0x00000102L))
+#define	STATUS_INVALID_PARAMETER			((int)(0xC000000DL))
+#define	STATUS_NO_MEMORY				((int)(0xC0000017L))
+#define	STATUS_OBJECT_NAME_COLLISION			((int)(0xC0000035L))
+#define STATUS_OBJECT_NAME_NOT_FOUND			((int)(0xC0000034L))
+
+
+#define NT_SUCCESS(status)				(status.v >= 0)
+
+#ifndef DEBUG
+
+#define DXGKRNL_ASSERT(exp)
+
+#else
+
+#define DXGKRNL_ASSERT(exp)	\
+do {				\
+	if (!(exp)) {		\
+		dump_stack();	\
+		BUG_ON(true);	\
+	}			\
+} while (0)
+
+#endif /* DEBUG */
+
+#endif /* _MISC_H_ */
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index bdb7bc325d1a..2b9ed954a520 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -14,6 +14,40 @@
 #ifndef _D3DKMTHK_H
 #define _D3DKMTHK_H
 
+/*
+ * This structure matches the definition of D3DKMTHANDLE in Windows.
+ * The handle is opaque in user mode. It is used by user mode applications to
+ * represent kernel mode objects, created by dxgkrnl.
+ */
+struct d3dkmthandle {
+	union {
+		struct {
+			__u32 instance	:  6;
+			__u32 index	: 24;
+			__u32 unique	: 2;
+		};
+		__u32 v;
+	};
+};
+
+/*
+ * VM bus messages return Windows' NTSTATUS, which is integer and only negative
+ * value indicates a failure. A positive number is a success and needs to be
+ * returned to user mode as the IOCTL return code. Negative status codes are
+ * converted to Linux error codes.
+ */
+struct ntstatus {
+	union {
+		struct {
+			int code	: 16;
+			int facility	: 13;
+			int customer	: 1;
+			int severity	: 2;
+		};
+		int v;
+	};
+};
+
 /* Matches Windows LUID definition */
 struct winluid {
 	__u32 a;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 23:25     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
                     ` (26 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Handle creation and destruction of dxgadapter object, which
represents a virtual compute device, projected to the VM by
the host. The dxgadapter object is created when the
corresponding VM bus channel is offered by Hyper-V.

There could be multiple virtual compute device objects,
projected by the host to VM. They are enumerated by
issuing IOCTLs to the /dev/dxg device.

The adapter object can start functioning only when the global VM bus
channel and the corresponding per device VM bus channel are
initialized. Notifications about arrival of a virtual compute PCI
device and VM bus channels can happen in any order. Therefore,
the initial dxgadapter object state is DXGADAPTER_STATE_WAITING_VMBUS.
A list of VM bus channels and a list of waiting dxgadapter objects
are maintained. When dxgkrnl is notified about a VM bus channel
arrival, if tries to start all adapters, which are not started yet.

Properties of the adapter object are determined by sending VM
bus messages to the host to the corresponding VM bus channel.

When the per virtual compute device VM bus channel or the global
channel are destroyed, the adapter object is destroyed.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/Makefile     |   2 +-
 drivers/hv/dxgkrnl/dxgadapter.c | 172 +++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  89 +++++++++++++
 drivers/hv/dxgkrnl/dxgmodule.c  | 200 ++++++++++++++++++++++++++++-
 drivers/hv/dxgkrnl/dxgvmbus.c   | 216 +++++++++++++++++++++++++++++---
 drivers/hv/dxgkrnl/dxgvmbus.h   | 128 +++++++++++++++++++
 drivers/hv/dxgkrnl/hmgr.h       |  75 -----------
 drivers/hv/dxgkrnl/misc.c       |  37 ++++++
 drivers/hv/dxgkrnl/misc.h       |  17 +++
 9 files changed, 839 insertions(+), 97 deletions(-)
 create mode 100644 drivers/hv/dxgkrnl/dxgadapter.c
 delete mode 100644 drivers/hv/dxgkrnl/hmgr.h
 create mode 100644 drivers/hv/dxgkrnl/misc.c

diff --git a/drivers/hv/dxgkrnl/Makefile b/drivers/hv/dxgkrnl/Makefile
index 76349064b60a..2ed07d877c91 100644
--- a/drivers/hv/dxgkrnl/Makefile
+++ b/drivers/hv/dxgkrnl/Makefile
@@ -2,4 +2,4 @@
 # Makefile for the hyper-v compute device driver (dxgkrnl).
 
 obj-$(CONFIG_DXGKRNL)	+= dxgkrnl.o
-dxgkrnl-y		:= dxgmodule.o dxgvmbus.o
+dxgkrnl-y		:= dxgmodule.o misc.o dxgadapter.o ioctl.o dxgvmbus.o
diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
new file mode 100644
index 000000000000..e0a6fea00bd5
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Implementation of dxgadapter and its objects
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/hyperv.h>
+#include <linux/pagemap.h>
+#include <linux/eventfd.h>
+
+#include "dxgkrnl.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+int dxgadapter_set_vmbus(struct dxgadapter *adapter, struct hv_device *hdev)
+{
+	int ret;
+
+	guid_to_luid(&hdev->channel->offermsg.offer.if_instance,
+		     &adapter->luid);
+	pr_debug("%s: %x:%x %p %pUb\n",
+		    __func__, adapter->luid.b, adapter->luid.a, hdev->channel,
+		    &hdev->channel->offermsg.offer.if_instance);
+
+	ret = dxgvmbuschannel_init(&adapter->channel, hdev);
+	if (ret)
+		goto cleanup;
+
+	adapter->channel.adapter = adapter;
+	adapter->hv_dev = hdev;
+
+	ret = dxgvmb_send_open_adapter(adapter);
+	if (ret < 0) {
+		pr_err("dxgvmb_send_open_adapter failed: %d\n", ret);
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_get_internal_adapter_info(adapter);
+	if (ret < 0)
+		pr_err("get_internal_adapter_info failed: %d", ret);
+
+cleanup:
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+void dxgadapter_start(struct dxgadapter *adapter)
+{
+	struct dxgvgpuchannel *ch = NULL;
+	struct dxgvgpuchannel *entry;
+	int ret;
+
+	pr_debug("%s %x-%x",
+		__func__, adapter->luid.a, adapter->luid.b);
+
+	/* Find the corresponding vGPU vm bus channel */
+	list_for_each_entry(entry, &dxgglobal->vgpu_ch_list_head,
+			    vgpu_ch_list_entry) {
+		if (memcmp(&adapter->luid,
+			   &entry->adapter_luid,
+			   sizeof(struct winluid)) == 0) {
+			ch = entry;
+			break;
+		}
+	}
+	if (ch == NULL) {
+		pr_debug("%s vGPU chanel is not ready", __func__);
+		return;
+	}
+
+	/* The global channel is initialized when the first adapter starts */
+	if (!dxgglobal->global_channel_initialized) {
+		ret = dxgglobal_init_global_channel();
+		if (ret) {
+			dxgglobal_destroy_global_channel();
+			return;
+		}
+		dxgglobal->global_channel_initialized = true;
+	}
+
+	/* Initialize vGPU vm bus channel */
+	ret = dxgadapter_set_vmbus(adapter, ch->hdev);
+	if (ret) {
+		pr_err("Failed to start adapter %p", adapter);
+		adapter->adapter_state = DXGADAPTER_STATE_STOPPED;
+		return;
+	}
+
+	adapter->adapter_state = DXGADAPTER_STATE_ACTIVE;
+	pr_debug("%s Adapter started %p", __func__, adapter);
+}
+
+void dxgadapter_stop(struct dxgadapter *adapter)
+{
+	bool adapter_stopped = false;
+
+	down_write(&adapter->core_lock);
+	if (!adapter->stopping_adapter)
+		adapter->stopping_adapter = true;
+	else
+		adapter_stopped = true;
+	up_write(&adapter->core_lock);
+
+	if (adapter_stopped)
+		return;
+
+	if (dxgadapter_acquire_lock_exclusive(adapter) == 0) {
+		dxgvmb_send_close_adapter(adapter);
+		dxgadapter_release_lock_exclusive(adapter);
+	}
+	dxgvmbuschannel_destroy(&adapter->channel);
+
+	adapter->adapter_state = DXGADAPTER_STATE_STOPPED;
+}
+
+void dxgadapter_release(struct kref *refcount)
+{
+	struct dxgadapter *adapter;
+
+	adapter = container_of(refcount, struct dxgadapter, adapter_kref);
+	pr_debug("%s %p\n", __func__, adapter);
+	vfree(adapter);
+}
+
+bool dxgadapter_is_active(struct dxgadapter *adapter)
+{
+	return adapter->adapter_state == DXGADAPTER_STATE_ACTIVE;
+}
+
+int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter)
+{
+	down_write(&adapter->core_lock);
+	if (adapter->adapter_state != DXGADAPTER_STATE_ACTIVE) {
+		dxgadapter_release_lock_exclusive(adapter);
+		return -ENODEV;
+	}
+	return 0;
+}
+
+void dxgadapter_acquire_lock_forced(struct dxgadapter *adapter)
+{
+	down_write(&adapter->core_lock);
+}
+
+void dxgadapter_release_lock_exclusive(struct dxgadapter *adapter)
+{
+	up_write(&adapter->core_lock);
+}
+
+int dxgadapter_acquire_lock_shared(struct dxgadapter *adapter)
+{
+	down_read(&adapter->core_lock);
+	if (adapter->adapter_state == DXGADAPTER_STATE_ACTIVE)
+		return 0;
+	dxgadapter_release_lock_shared(adapter);
+	return -ENODEV;
+}
+
+void dxgadapter_release_lock_shared(struct dxgadapter *adapter)
+{
+	up_read(&adapter->core_lock);
+}
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 533f3bb3ece2..c0fd0bb267f2 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -32,9 +32,39 @@ struct dxgadapter;
 #include "misc.h"
 #include <uapi/misc/d3dkmthk.h>
 
+struct dxgk_device_types {
+	u32 post_device:1;
+	u32 post_device_certain:1;
+	u32 software_device:1;
+	u32 soft_gpu_device:1;
+	u32 warp_device:1;
+	u32 bdd_device:1;
+	u32 support_miracast:1;
+	u32 mismatched_lda:1;
+	u32 indirect_display_device:1;
+	u32 xbox_one_device:1;
+	u32 child_id_support_dwm_clone:1;
+	u32 child_id_support_dwm_clone2:1;
+	u32 has_internal_panel:1;
+	u32 rfx_vgpu_device:1;
+	u32 virtual_render_device:1;
+	u32 support_preserve_boot_display:1;
+	u32 is_uefi_frame_buffer:1;
+	u32 removable_device:1;
+	u32 virtual_monitor_device:1;
+};
+
+enum dxgobjectstate {
+	DXGOBJECTSTATE_CREATED,
+	DXGOBJECTSTATE_ACTIVE,
+	DXGOBJECTSTATE_STOPPED,
+	DXGOBJECTSTATE_DESTROYED,
+};
+
 struct dxgvmbuschannel {
 	struct vmbus_channel	*channel;
 	struct hv_device	*hdev;
+	struct dxgadapter	*adapter;
 	spinlock_t		packet_list_mutex;
 	struct list_head	packet_list_head;
 	struct kmem_cache	*packet_cache;
@@ -70,6 +100,14 @@ struct dxgglobal {
 	struct list_head	plisthead;
 	struct mutex		plistmutex;
 
+	/* list of created adapters */
+	struct list_head	adapter_list_head;
+	struct rw_semaphore	adapter_list_lock;
+
+	/* List of all current threads for lock order tracking. */
+	struct mutex		thread_info_mutex;
+	struct list_head	thread_info_list_head;
+
 	/*
 	 * List of the vGPU VM bus channels (dxgvgpuchannel)
 	 * Protected by device_mutex
@@ -88,6 +126,10 @@ struct dxgglobal {
 
 extern struct dxgglobal		*dxgglobal;
 
+int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
+			     struct winluid host_vgpu_luid);
+void dxgglobal_acquire_adapter_list_lock(enum dxglockstate state);
+void dxgglobal_release_adapter_list_lock(enum dxglockstate state);
 int dxgglobal_init_global_channel(void);
 void dxgglobal_destroy_global_channel(void);
 struct vmbus_channel *dxgglobal_get_vmbus(void);
@@ -99,6 +141,47 @@ struct dxgprocess {
 	/* Placeholder */
 };
 
+enum dxgadapter_state {
+	DXGADAPTER_STATE_ACTIVE		= 0,
+	DXGADAPTER_STATE_STOPPED	= 1,
+	DXGADAPTER_STATE_WAITING_VMBUS	= 2,
+};
+
+/*
+ * This object represents the grapchis adapter.
+ * Objects, which take reference on the adapter:
+ * - dxgglobal
+ * - adapter handle (struct d3dkmthandle)
+ */
+struct dxgadapter {
+	struct rw_semaphore	core_lock;
+	struct kref		adapter_kref;
+	/* Entry in the list of adapters in dxgglobal */
+	struct list_head	adapter_list_entry;
+	struct pci_dev		*pci_dev;
+	struct hv_device	*hv_dev;
+	struct dxgvmbuschannel	channel;
+	struct d3dkmthandle	host_handle;
+	enum dxgadapter_state	adapter_state;
+	struct winluid		host_adapter_luid;
+	struct winluid		host_vgpu_luid;
+	struct winluid		luid;	/* VM bus channel luid */
+	u16			device_description[80];
+	u16			device_instance_id[WIN_MAX_PATH];
+	bool			stopping_adapter;
+};
+
+int dxgadapter_set_vmbus(struct dxgadapter *adapter, struct hv_device *hdev);
+bool dxgadapter_is_active(struct dxgadapter *adapter);
+void dxgadapter_start(struct dxgadapter *adapter);
+void dxgadapter_stop(struct dxgadapter *adapter);
+void dxgadapter_release(struct kref *refcount);
+int dxgadapter_acquire_lock_shared(struct dxgadapter *adapter);
+void dxgadapter_release_lock_shared(struct dxgadapter *adapter);
+int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter);
+void dxgadapter_acquire_lock_forced(struct dxgadapter *adapter);
+void dxgadapter_release_lock_exclusive(struct dxgadapter *adapter);
+
 void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
 long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
@@ -127,6 +210,12 @@ static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
 void dxgvmb_initialize(void);
 int dxgvmb_send_set_iospace_region(u64 start, u64 len,
 				   struct vmbus_gpadl *shared_mem_gpadl);
+int dxgvmb_send_open_adapter(struct dxgadapter *adapter);
+int dxgvmb_send_close_adapter(struct dxgadapter *adapter);
+int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter);
+int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
+			  void *command,
+			  u32 cmd_size);
 
 int ntstatus2int(struct ntstatus status);
 
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index 78f233e354b9..5ae57977731c 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -62,6 +62,148 @@ void dxgglobal_release_channel_lock(void)
 	up_read(&dxgglobal->channel_lock);
 }
 
+void dxgglobal_acquire_adapter_list_lock(enum dxglockstate state)
+{
+	if (state == DXGLOCK_EXCL)
+		down_write(&dxgglobal->adapter_list_lock);
+	else
+		down_read(&dxgglobal->adapter_list_lock);
+}
+
+void dxgglobal_release_adapter_list_lock(enum dxglockstate state)
+{
+	if (state == DXGLOCK_EXCL)
+		up_write(&dxgglobal->adapter_list_lock);
+	else
+		up_read(&dxgglobal->adapter_list_lock);
+}
+
+/*
+ * Returns a pointer to dxgadapter object, which corresponds to the given PCI
+ * device, or NULL.
+ */
+static struct dxgadapter *find_pci_adapter(struct pci_dev *dev)
+{
+	struct dxgadapter *entry;
+	struct dxgadapter *adapter = NULL;
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (dev == entry->pci_dev) {
+			adapter = entry;
+			break;
+		}
+	}
+
+	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+	return adapter;
+}
+
+/*
+ * Returns a pointer to dxgadapter object, which has the givel LUID
+ * device, or NULL.
+ */
+static struct dxgadapter *find_adapter(struct winluid *luid)
+{
+	struct dxgadapter *entry;
+	struct dxgadapter *adapter = NULL;
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (memcmp(luid, &entry->luid, sizeof(struct winluid)) == 0) {
+			adapter = entry;
+			break;
+		}
+	}
+
+	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+	return adapter;
+}
+
+/*
+ * Creates a new dxgadapter object, which represents a virtual GPU, projected
+ * by the host.
+ * The adapter is in the waiting state. It will become active when the global
+ * VM bus channel and the adapter VM bus channel are created.
+ */
+int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
+			     struct winluid host_vgpu_luid)
+{
+	struct dxgadapter *adapter;
+	int ret = 0;
+
+	adapter = vzalloc(sizeof(struct dxgadapter));
+	if (adapter == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	adapter->adapter_state = DXGADAPTER_STATE_WAITING_VMBUS;
+	adapter->host_vgpu_luid = host_vgpu_luid;
+	kref_init(&adapter->adapter_kref);
+	init_rwsem(&adapter->core_lock);
+
+	adapter->pci_dev = dev;
+	guid_to_luid(guid, &adapter->luid);
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+
+	list_add_tail(&adapter->adapter_list_entry,
+		      &dxgglobal->adapter_list_head);
+	dxgglobal->num_adapters++;
+	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+
+	pr_debug("new adapter added %p %x-%x\n", adapter,
+		    adapter->luid.a, adapter->luid.b);
+cleanup:
+	pr_debug("%s end: %d", __func__, ret);
+	return ret;
+}
+
+/*
+ * Attempts to start dxgadapter objects, which are not active yet.
+ */
+static void dxgglobal_start_adapters(void)
+{
+	struct dxgadapter *adapter;
+
+	if (dxgglobal->hdev == NULL) {
+		pr_debug("Global channel is not ready");
+		return;
+	}
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+	list_for_each_entry(adapter, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (adapter->adapter_state == DXGADAPTER_STATE_WAITING_VMBUS)
+			dxgadapter_start(adapter);
+	}
+	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+}
+
+/*
+ * Stopsthe active dxgadapter objects.
+ */
+static void dxgglobal_stop_adapters(void)
+{
+	struct dxgadapter *adapter;
+
+	if (dxgglobal->hdev == NULL) {
+		pr_debug("Global channel is not ready");
+		return;
+	}
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+	list_for_each_entry(adapter, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (adapter->adapter_state == DXGADAPTER_STATE_ACTIVE)
+			dxgadapter_stop(adapter);
+	}
+	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+}
+
 /*
  * File operations for the /dev/dxg device
  */
@@ -207,12 +349,22 @@ static int dxg_pci_probe_device(struct pci_dev *dev,
 			goto cleanup;
 	}
 
+	/* Create new virtual GPU adapter */
+
 	pr_debug("Adapter channel: %pUb\n", &guid);
 	pr_debug("Vmbus interface version: %d\n",
 		dxgglobal->vmbus_ver);
 	pr_debug("Host vGPU luid: %x-%x\n",
 		vgpu_luid.b, vgpu_luid.a);
 
+	ret = dxgglobal_create_adapter(dev, &guid, vgpu_luid);
+	if (ret)
+		goto cleanup;
+
+	/* Attempt to start the adapter in case VM bus channels are created */
+
+	dxgglobal_start_adapters();
+
 cleanup:
 
 	mutex_unlock(&dxgglobal->device_mutex);
@@ -224,7 +376,24 @@ static int dxg_pci_probe_device(struct pci_dev *dev,
 
 static void dxg_pci_remove_device(struct pci_dev *dev)
 {
-	/* Placeholder */
+	struct dxgadapter *adapter;
+
+	mutex_lock(&dxgglobal->device_mutex);
+
+	adapter = find_pci_adapter(dev);
+	if (adapter) {
+		dxgglobal_acquire_adapter_list_lock(DXGLOCK_EXCL);
+		list_del(&adapter->adapter_list_entry);
+		dxgglobal->num_adapters--;
+		dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
+
+		dxgadapter_stop(adapter);
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	} else {
+		pr_err("Failed to find dxgadapter");
+	}
+
+	mutex_unlock(&dxgglobal->device_mutex);
 }
 
 static struct pci_device_id dxg_pci_id_table = {
@@ -347,6 +516,25 @@ void dxgglobal_destroy_global_channel(void)
 	up_write(&dxgglobal->channel_lock);
 }
 
+static void dxgglobal_stop_adapter_vmbus(struct hv_device *hdev)
+{
+	struct dxgadapter *adapter = NULL;
+	struct winluid luid;
+
+	guid_to_luid(&hdev->channel->offermsg.offer.if_instance, &luid);
+
+	pr_debug("%s: %x:%x\n", __func__, luid.b, luid.a);
+
+	adapter = find_adapter(&luid);
+
+	if (adapter && adapter->adapter_state == DXGADAPTER_STATE_ACTIVE) {
+		down_write(&adapter->core_lock);
+		dxgvmbuschannel_destroy(&adapter->channel);
+		adapter->adapter_state = DXGADAPTER_STATE_STOPPED;
+		up_write(&adapter->core_lock);
+	}
+}
+
 static const struct hv_vmbus_device_id id_table[] = {
 	/* Per GPU Device GUID */
 	{ HV_GPUP_DXGK_VGPU_GUID },
@@ -378,6 +566,7 @@ static int dxg_probe_vmbus(struct hv_device *hdev,
 		vgpuch->hdev = hdev;
 		list_add_tail(&vgpuch->vgpu_ch_list_entry,
 			      &dxgglobal->vgpu_ch_list_head);
+		dxgglobal_start_adapters();
 	} else if (uuid_le_cmp(hdev->dev_type, id_table[1].guid) == 0) {
 		/* This is the global Dxgkgnl channel */
 		pr_debug("Global channel: %pUb",
@@ -389,6 +578,7 @@ static int dxg_probe_vmbus(struct hv_device *hdev,
 			goto error;
 		}
 		dxgglobal->hdev = hdev;
+		dxgglobal_start_adapters();
 	} else {
 		/* Unknown device type */
 		pr_err("probe: unknown device type\n");
@@ -414,6 +604,7 @@ static int dxg_remove_vmbus(struct hv_device *hdev)
 
 	if (uuid_le_cmp(hdev->dev_type, id_table[0].guid) == 0) {
 		pr_debug("Remove virtual GPU channel\n");
+		dxgglobal_stop_adapter_vmbus(hdev);
 		list_for_each_entry(vgpu_channel,
 				    &dxgglobal->vgpu_ch_list_head,
 				    vgpu_ch_list_entry) {
@@ -466,7 +657,12 @@ static int dxgglobal_create(void)
 	mutex_init(&dxgglobal->plistmutex);
 	mutex_init(&dxgglobal->device_mutex);
 
+	INIT_LIST_HEAD(&dxgglobal->thread_info_list_head);
+	mutex_init(&dxgglobal->thread_info_mutex);
+
 	INIT_LIST_HEAD(&dxgglobal->vgpu_ch_list_head);
+	INIT_LIST_HEAD(&dxgglobal->adapter_list_head);
+	init_rwsem(&dxgglobal->adapter_list_lock);
 
 	init_rwsem(&dxgglobal->channel_lock);
 
@@ -477,6 +673,8 @@ static int dxgglobal_create(void)
 static void dxgglobal_destroy(void)
 {
 	if (dxgglobal) {
+		dxgglobal_stop_adapters();
+
 		if (dxgglobal->vmbus_registered)
 			vmbus_driver_unregister(&dxg_drv);
 
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 399dec02a3a1..6b02932819c8 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -77,7 +77,7 @@ struct dxgvmbusmsgres {
 	void				*res;
 };
 
-static int init_message(struct dxgvmbusmsg *msg,
+static int init_message(struct dxgvmbusmsg *msg, struct dxgadapter *adapter,
 			struct dxgprocess *process, u32 size)
 {
 	bool use_ext_header = dxgglobal->vmbus_ver >=
@@ -97,10 +97,15 @@ static int init_message(struct dxgvmbusmsg *msg,
 	if (use_ext_header) {
 		msg->msg = (char *)&msg->hdr[1];
 		msg->hdr->command_offset = sizeof(msg->hdr[0]);
+		if (adapter)
+			msg->hdr->vgpu_luid = adapter->host_vgpu_luid;
 	} else {
 		msg->msg = (char *)msg->hdr;
 	}
-	msg->channel = &dxgglobal->channel;
+	if (adapter && !dxgglobal->async_msg_enabled)
+		msg->channel = &adapter->channel;
+	else
+		msg->channel = &dxgglobal->channel;
 	return 0;
 }
 
@@ -114,6 +119,37 @@ static void free_message(struct dxgvmbusmsg *msg, struct dxgprocess *process)
  * Helper functions
  */
 
+static void command_vm_to_host_init2(struct dxgkvmb_command_vm_to_host *command,
+				     enum dxgkvmb_commandtype_global t,
+				     struct d3dkmthandle process)
+{
+	command->command_type	= t;
+	command->process	= process;
+	command->command_id	= 0;
+	command->channel_type	= DXGKVMB_VM_TO_HOST;
+}
+
+static void command_vgpu_to_host_init1(struct dxgkvmb_command_vgpu_to_host
+					*command,
+					enum dxgkvmb_commandtype type)
+{
+	command->command_type	= type;
+	command->process.v	= 0;
+	command->command_id	= 0;
+	command->channel_type	= DXGKVMB_VGPU_TO_HOST;
+}
+
+static void command_vgpu_to_host_init2(struct dxgkvmb_command_vgpu_to_host
+					*command,
+					enum dxgkvmb_commandtype type,
+					struct d3dkmthandle process)
+{
+	command->command_type	= type;
+	command->process	= process;
+	command->command_id	= 0;
+	command->channel_type	= DXGKVMB_VGPU_TO_HOST;
+}
+
 int ntstatus2int(struct ntstatus status)
 {
 	if (NT_SUCCESS(status))
@@ -212,22 +248,26 @@ static void process_inband_packet(struct dxgvmbuschannel *channel,
 	u32 packet_length = hv_pkt_datalen(desc);
 	struct dxgkvmb_command_host_to_vm *packet;
 
-	if (packet_length < sizeof(struct dxgkvmb_command_host_to_vm)) {
-		pr_err("Invalid global packet");
-	} else {
-		packet = hv_pkt_data(desc);
-		pr_debug("global packet %d",
-				packet->command_type);
-		switch (packet->command_type) {
-		case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
-		case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
-			break;
-		case DXGK_VMBCOMMAND_SENDWNFNOTIFICATION:
-			break;
-		default:
-			pr_err("unexpected host message %d",
-					packet->command_type);
+	if (channel->adapter == NULL) {
+		if (packet_length < sizeof(struct dxgkvmb_command_host_to_vm)) {
+			pr_err("Invalid global packet");
+		} else {
+			packet = hv_pkt_data(desc);
+			pr_debug("global packet %d",
+				    packet->command_type);
+			switch (packet->command_type) {
+			case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
+			case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
+				break;
+			case DXGK_VMBCOMMAND_SENDWNFNOTIFICATION:
+				break;
+			default:
+				pr_err("unexpected host message %d",
+					   packet->command_type);
+			}
 		}
+	} else {
+		pr_err("Unexpected packet for adapter channel");
 	}
 }
 
@@ -275,6 +315,7 @@ void dxgvmbuschannel_receive(void *ctx)
 	struct vmpacket_descriptor *desc;
 	u32 packet_length = 0;
 
+	pr_debug("%s %p", __func__, channel->adapter);
 	foreach_vmbus_pkt(desc, channel->channel) {
 		packet_length = hv_pkt_datalen(desc);
 		pr_debug("next packet (id, size, type): %llu %d %d",
@@ -299,6 +340,7 @@ int dxgvmb_send_sync_msg(struct dxgvmbuschannel *channel,
 	int ret;
 	struct dxgvmbuspacket *packet = NULL;
 	struct dxgkvmb_command_vm_to_host *cmd1;
+	struct dxgkvmb_command_vgpu_to_host *cmd2;
 
 	if (cmd_size > DXG_MAX_VM_BUS_PACKET_SIZE ||
 	    result_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
@@ -312,8 +354,15 @@ int dxgvmb_send_sync_msg(struct dxgvmbuschannel *channel,
 		return -ENOMEM;
 	}
 
-	pr_debug("send_sync_msg global: %d %p %d %d",
-		cmd1->command_type, command, cmd_size, result_size);
+	if (channel->adapter == NULL) {
+		cmd1 = command;
+		pr_debug("send_sync_msg global: %d %p %d %d",
+			cmd1->command_type, command, cmd_size, result_size);
+	} else {
+		cmd2 = command;
+		pr_debug("send_sync_msg adapter: %d %p %d %d",
+			cmd2->command_type, command, cmd_size, result_size);
+	}
 
 	packet->request_id = atomic64_inc_return(&channel->packet_request_id);
 	init_completion(&packet->wait);
@@ -358,6 +407,41 @@ int dxgvmb_send_sync_msg(struct dxgvmbuschannel *channel,
 	return ret;
 }
 
+int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
+			  void *command,
+			  u32 cmd_size)
+{
+	int ret;
+	int try_count = 0;
+
+	if (cmd_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("%s invalid data size", __func__);
+		return -EINVAL;
+	}
+
+	if (channel->adapter) {
+		pr_err("Async messages should be sent to the global channel");
+		return -EINVAL;
+	}
+
+	do {
+		ret = vmbus_sendpacket(channel->channel, command, cmd_size,
+				0, VM_PKT_DATA_INBAND, 0);
+		/*
+		 * -EAGAIN is returned when the VM bus ring buffer if full.
+		 * Wait 2ms to allow the host to process messages and try again.
+		 */
+		if (ret == -EAGAIN) {
+			usleep_range(1000, 2000);
+			try_count++;
+		}
+	} while (ret == -EAGAIN && try_count < 5000);
+	if (ret < 0)
+		pr_err("vmbus_sendpacket failed: %x", ret);
+
+	return ret;
+}
+
 static int
 dxgvmb_send_sync_msg_ntstatus(struct dxgvmbuschannel *channel,
 			      void *command, u32 cmd_size)
@@ -383,7 +467,7 @@ int dxgvmb_send_set_iospace_region(u64 start, u64 len,
 	struct dxgkvmb_command_setiospaceregion *command;
 	struct dxgvmbusmsg msg;
 
-	ret = init_message(&msg, NULL, sizeof(*command));
+	ret = init_message(&msg, NULL, NULL, sizeof(*command));
 	if (ret)
 		return ret;
 	command = (void *)msg.msg;
@@ -411,3 +495,95 @@ int dxgvmb_send_set_iospace_region(u64 start, u64 len,
 	return ret;
 }
 
+/*
+ * Virtual GPU messages to the host
+ */
+
+int dxgvmb_send_open_adapter(struct dxgadapter *adapter)
+{
+	int ret;
+	struct dxgkvmb_command_openadapter *command;
+	struct dxgkvmb_command_openadapter_return result = { };
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, adapter, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init1(&command->hdr, DXGK_VMBCOMMAND_OPENADAPTER);
+	command->vmbus_interface_version = dxgglobal->vmbus_ver;
+	command->vmbus_last_compatible_interface_version =
+	    DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result.status);
+	adapter->host_handle = result.host_adapter_handle;
+
+cleanup:
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_close_adapter(struct dxgadapter *adapter)
+{
+	int ret;
+	struct dxgkvmb_command_closeadapter *command;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, adapter, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init1(&command->hdr, DXGK_VMBCOMMAND_CLOSEADAPTER);
+	command->host_handle = adapter->host_handle;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   NULL, 0);
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter)
+{
+	int ret;
+	struct dxgkvmb_command_getinternaladapterinfo *command;
+	struct dxgkvmb_command_getinternaladapterinfo_return result = { };
+	struct dxgvmbusmsg msg;
+	u32 result_size = sizeof(result);
+
+	ret = init_message(&msg, adapter, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init1(&command->hdr,
+				   DXGK_VMBCOMMAND_GETINTERNALADAPTERINFO);
+	if (dxgglobal->vmbus_ver < DXGK_VMBUS_INTERFACE_VERSION)
+		result_size -= sizeof(struct winluid);
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, result_size);
+	if (ret >= 0) {
+		adapter->host_adapter_luid = result.host_adapter_luid;
+		adapter->host_vgpu_luid = result.host_vgpu_luid;
+		wcsncpy(adapter->device_description, result.device_description,
+			sizeof(adapter->device_description) / sizeof(u16));
+		wcsncpy(adapter->device_instance_id, result.device_instance_id,
+			sizeof(adapter->device_instance_id) / sizeof(u16));
+		dxgglobal->async_msg_enabled = result.async_msg_enabled != 0;
+	}
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index d6136fd1ce9a..eda259324588 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -47,6 +47,83 @@ enum dxgkvmb_commandtype_global {
 	DXGK_VMBCOMMAND_INVALID_VM_TO_HOST
 };
 
+/*
+ *
+ * Commands, sent to the host via the per adapter VM bus channel
+ * DXG_GUEST_VGPU_VMBUS
+ *
+ */
+
+enum dxgkvmb_commandtype {
+	DXGK_VMBCOMMAND_CREATEDEVICE		= 0,
+	DXGK_VMBCOMMAND_DESTROYDEVICE		= 1,
+	DXGK_VMBCOMMAND_QUERYADAPTERINFO	= 2,
+	DXGK_VMBCOMMAND_DDIQUERYADAPTERINFO	= 3,
+	DXGK_VMBCOMMAND_CREATEALLOCATION	= 4,
+	DXGK_VMBCOMMAND_DESTROYALLOCATION	= 5,
+	DXGK_VMBCOMMAND_CREATECONTEXTVIRTUAL	= 6,
+	DXGK_VMBCOMMAND_DESTROYCONTEXT		= 7,
+	DXGK_VMBCOMMAND_CREATESYNCOBJECT	= 8,
+	DXGK_VMBCOMMAND_CREATEPAGINGQUEUE	= 9,
+	DXGK_VMBCOMMAND_DESTROYPAGINGQUEUE	= 10,
+	DXGK_VMBCOMMAND_MAKERESIDENT		= 11,
+	DXGK_VMBCOMMAND_EVICT			= 12,
+	DXGK_VMBCOMMAND_ESCAPE			= 13,
+	DXGK_VMBCOMMAND_OPENADAPTER		= 14,
+	DXGK_VMBCOMMAND_CLOSEADAPTER		= 15,
+	DXGK_VMBCOMMAND_FREEGPUVIRTUALADDRESS	= 16,
+	DXGK_VMBCOMMAND_MAPGPUVIRTUALADDRESS	= 17,
+	DXGK_VMBCOMMAND_RESERVEGPUVIRTUALADDRESS = 18,
+	DXGK_VMBCOMMAND_UPDATEGPUVIRTUALADDRESS	= 19,
+	DXGK_VMBCOMMAND_SUBMITCOMMAND		= 20,
+	dxgk_vmbcommand_queryvideomemoryinfo	= 21,
+	DXGK_VMBCOMMAND_WAITFORSYNCOBJECTFROMCPU = 22,
+	DXGK_VMBCOMMAND_LOCK2			= 23,
+	DXGK_VMBCOMMAND_UNLOCK2			= 24,
+	DXGK_VMBCOMMAND_WAITFORSYNCOBJECTFROMGPU = 25,
+	DXGK_VMBCOMMAND_SIGNALSYNCOBJECT	= 26,
+	DXGK_VMBCOMMAND_SIGNALFENCENTSHAREDBYREF = 27,
+	DXGK_VMBCOMMAND_GETDEVICESTATE		= 28,
+	DXGK_VMBCOMMAND_MARKDEVICEASERROR	= 29,
+	DXGK_VMBCOMMAND_ADAPTERSTOP		= 30,
+	DXGK_VMBCOMMAND_SETQUEUEDLIMIT		= 31,
+	DXGK_VMBCOMMAND_OPENRESOURCE		= 32,
+	DXGK_VMBCOMMAND_SETCONTEXTSCHEDULINGPRIORITY = 33,
+	DXGK_VMBCOMMAND_PRESENTHISTORYTOKEN	= 34,
+	DXGK_VMBCOMMAND_SETREDIRECTEDFLIPFENCEVALUE = 35,
+	DXGK_VMBCOMMAND_GETINTERNALADAPTERINFO	= 36,
+	DXGK_VMBCOMMAND_FLUSHHEAPTRANSITIONS	= 37,
+	DXGK_VMBCOMMAND_BLT			= 38,
+	DXGK_VMBCOMMAND_DDIGETSTANDARDALLOCATIONDRIVERDATA = 39,
+	DXGK_VMBCOMMAND_CDDGDICOMMAND		= 40,
+	DXGK_VMBCOMMAND_QUERYALLOCATIONRESIDENCY = 41,
+	DXGK_VMBCOMMAND_FLUSHDEVICE		= 42,
+	DXGK_VMBCOMMAND_FLUSHADAPTER		= 43,
+	DXGK_VMBCOMMAND_DDIGETNODEMETADATA	= 44,
+	DXGK_VMBCOMMAND_SETEXISTINGSYSMEMSTORE	= 45,
+	DXGK_VMBCOMMAND_ISSYNCOBJECTSIGNALED	= 46,
+	DXGK_VMBCOMMAND_CDDSYNCGPUACCESS	= 47,
+	DXGK_VMBCOMMAND_QUERYSTATISTICS		= 48,
+	DXGK_VMBCOMMAND_CHANGEVIDEOMEMORYRESERVATION = 49,
+	DXGK_VMBCOMMAND_CREATEHWQUEUE		= 50,
+	DXGK_VMBCOMMAND_DESTROYHWQUEUE		= 51,
+	DXGK_VMBCOMMAND_SUBMITCOMMANDTOHWQUEUE	= 52,
+	DXGK_VMBCOMMAND_GETDRIVERSTOREFILE	= 53,
+	DXGK_VMBCOMMAND_READDRIVERSTOREFILE	= 54,
+	DXGK_VMBCOMMAND_GETNEXTHARDLINK		= 55,
+	DXGK_VMBCOMMAND_UPDATEALLOCATIONPROPERTY = 56,
+	DXGK_VMBCOMMAND_OFFERALLOCATIONS	= 57,
+	DXGK_VMBCOMMAND_RECLAIMALLOCATIONS	= 58,
+	DXGK_VMBCOMMAND_SETALLOCATIONPRIORITY	= 59,
+	DXGK_VMBCOMMAND_GETALLOCATIONPRIORITY	= 60,
+	DXGK_VMBCOMMAND_GETCONTEXTSCHEDULINGPRIORITY = 61,
+	DXGK_VMBCOMMAND_QUERYCLOCKCALIBRATION	= 62,
+	DXGK_VMBCOMMAND_QUERYRESOURCEINFO	= 64,
+	DXGK_VMBCOMMAND_LOGEVENT		= 65,
+	DXGK_VMBCOMMAND_SETEXISTINGSYSMEMPAGES	= 66,
+	DXGK_VMBCOMMAND_INVALID
+};
+
 /*
  * Commands, sent by the host to the VM
  */
@@ -66,6 +143,15 @@ struct dxgkvmb_command_vm_to_host {
 	enum dxgkvmb_commandtype_global	command_type;
 };
 
+struct dxgkvmb_command_vgpu_to_host {
+	u64				command_id;
+	struct d3dkmthandle		process;
+	u32				channel_type	: 8;
+	u32				async_msg	: 1;
+	u32				reserved	: 23;
+	enum dxgkvmb_commandtype	command_type;
+};
+
 struct dxgkvmb_command_host_to_vm {
 	u64					command_id;
 	struct d3dkmthandle			process;
@@ -83,4 +169,46 @@ struct dxgkvmb_command_setiospaceregion {
 	u32				shared_page_gpadl;
 };
 
+struct dxgkvmb_command_openadapter {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	u32				vmbus_interface_version;
+	u32				vmbus_last_compatible_interface_version;
+	struct winluid			guest_adapter_luid;
+};
+
+struct dxgkvmb_command_openadapter_return {
+	struct d3dkmthandle		host_adapter_handle;
+	struct ntstatus			status;
+	u32				vmbus_interface_version;
+	u32				vmbus_last_compatible_interface_version;
+};
+
+struct dxgkvmb_command_closeadapter {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		host_handle;
+};
+
+struct dxgkvmb_command_getinternaladapterinfo {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+};
+
+struct dxgkvmb_command_getinternaladapterinfo_return {
+	struct dxgk_device_types	device_types;
+	u32				driver_store_copy_mode;
+	u32				driver_ddi_version;
+	u32				secure_virtual_machine	: 1;
+	u32				virtual_machine_reset	: 1;
+	u32				is_vail_supported	: 1;
+	u32				hw_sch_enabled		: 1;
+	u32				hw_sch_capable		: 1;
+	u32				va_backed_vm		: 1;
+	u32				async_msg_enabled	: 1;
+	u32				hw_support_state	: 2;
+	u32				reserved		: 23;
+	struct winluid			host_adapter_luid;
+	u16				device_description[80];
+	u16				device_instance_id[WIN_MAX_PATH];
+	struct winluid			host_vgpu_luid;
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/hmgr.h b/drivers/hv/dxgkrnl/hmgr.h
deleted file mode 100644
index b8b8f3ae5939..000000000000
--- a/drivers/hv/dxgkrnl/hmgr.h
+++ /dev/null
@@ -1,75 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- * Copyright (c) 2019, Microsoft Corporation.
- *
- * Author:
- *   Iouri Tarassov <iourit@linux.microsoft.com>
- *
- * Dxgkrnl Graphics Driver
- * Handle manager definitions
- *
- */
-
-#ifndef _HMGR_H_
-#define _HMGR_H_
-
-#include "misc.h"
-
-struct hmgrentry;
-
-/*
- * Handle manager table.
- *
- * Implementation notes:
- *   A list of free handles is built on top of the array of table entries.
- *   free_handle_list_head is the index of the first entry in the list.
- *   m_FreeHandleListTail is the index of an entry in the list, which is
- *   HMGRTABLE_MIN_FREE_ENTRIES from the head. It means that when a handle is
- *   freed, the next time the handle can be re-used is after allocating
- *   HMGRTABLE_MIN_FREE_ENTRIES number of handles.
- *   Handles are allocated from the start of the list and free handles are
- *   inserted after the tail of the list.
- *
- */
-struct hmgrtable {
-	struct dxgprocess	*process;
-	struct hmgrentry	*entry_table;
-	u32			free_handle_list_head;
-	u32			free_handle_list_tail;
-	u32			table_size;
-	u32			free_count;
-	struct rw_semaphore	table_lock;
-};
-
-/*
- * Handle entry data types.
- */
-#define HMGRENTRY_TYPE_BITS 5
-
-enum hmgrentry_type {
-	HMGRENTRY_TYPE_FREE				= 0,
-	HMGRENTRY_TYPE_DXGADAPTER			= 1,
-	HMGRENTRY_TYPE_DXGSHAREDRESOURCE		= 2,
-	HMGRENTRY_TYPE_DXGDEVICE			= 3,
-	HMGRENTRY_TYPE_DXGRESOURCE			= 4,
-	HMGRENTRY_TYPE_DXGALLOCATION			= 5,
-	HMGRENTRY_TYPE_DXGOVERLAY			= 6,
-	HMGRENTRY_TYPE_DXGCONTEXT			= 7,
-	HMGRENTRY_TYPE_DXGSYNCOBJECT			= 8,
-	HMGRENTRY_TYPE_DXGKEYEDMUTEX			= 9,
-	HMGRENTRY_TYPE_DXGPAGINGQUEUE			= 10,
-	HMGRENTRY_TYPE_DXGDEVICESYNCOBJECT		= 11,
-	HMGRENTRY_TYPE_DXGPROCESS			= 12,
-	HMGRENTRY_TYPE_DXGSHAREDVMOBJECT		= 13,
-	HMGRENTRY_TYPE_DXGPROTECTEDSESSION		= 14,
-	HMGRENTRY_TYPE_DXGHWQUEUE			= 15,
-	HMGRENTRY_TYPE_DXGREMOTEBUNDLEOBJECT		= 16,
-	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEOBJECT	= 17,
-	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEPROXY	= 18,
-	HMGRENTRY_TYPE_DXGTRACKEDWORKLOAD		= 19,
-	HMGRENTRY_TYPE_LIMIT		= ((1 << HMGRENTRY_TYPE_BITS) - 1),
-	HMGRENTRY_TYPE_MONITOREDFENCE	= HMGRENTRY_TYPE_LIMIT + 1,
-};
-
-#endif
diff --git a/drivers/hv/dxgkrnl/misc.c b/drivers/hv/dxgkrnl/misc.c
new file mode 100644
index 000000000000..cb1e0635bebc
--- /dev/null
+++ b/drivers/hv/dxgkrnl/misc.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Helper functions
+ *
+ */
+
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/uaccess.h>
+
+#include "dxgkrnl.h"
+#include "misc.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+u16 *wcsncpy(u16 *dest, const u16 *src, size_t n)
+{
+	int i;
+
+	for (i = 0; i < n; i++) {
+		dest[i] = src[i];
+		if (src[i] == 0) {
+			i++;
+			break;
+		}
+	}
+	dest[i - 1] = 0;
+	return dest;
+}
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 9fa3c7c8c3f5..1ff0c0e28332 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -14,6 +14,9 @@
 #ifndef _MISC_H_
 #define _MISC_H_
 
+/* Max characters in Windows path */
+#define WIN_MAX_PATH		260
+
 extern const struct d3dkmthandle zerohandle;
 
 /*
@@ -23,9 +26,23 @@ extern const struct d3dkmthandle zerohandle;
  * When a lower lock ois held, the higher lock should not be acquired.
  *
  * channel_lock
+ * fd_mutex
+ * plistmutex
+ * table_lock
+ * core_lock
+ * device_lock
+ * process->process_mutex
+ * adapter_list_lock
  * device_mutex
  */
 
+u16 *wcsncpy(u16 *dest, const u16 *src, size_t n);
+
+enum dxglockstate {
+	DXGLOCK_SHARED,
+	DXGLOCK_EXCL
+};
+
 /*
  * Some of the Windows return codes, which needs to be translated to Linux
  * IOCTL return codes. Positive values are success codes and need to be
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (2 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 20:46     ` Greg KH
                       ` (2 more replies)
  2022-03-01 19:45   ` [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects Iouri Tarassov
                     ` (25 subsequent siblings)
  29 siblings, 3 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

- Implement opening of the device (/dev/dxg) file object and creation of
dxgprocess objects.

- Add VM bus messages to create and destroy the host side of a dxgprocess
object.

- Implement the handle manager, which manages d3dkmthandle handles
for the internal process objects. The handles are used by a user mode
client to reference dxgkrnl objects.

dxgprocess is created for each process, which opens /dev/dxg.
dxgprocess is ref counted, so the existing dxgprocess objects is used
for a process, which opens the device object multiple time.
dxgprocess is destroyed when the file object is released.

A corresponding dxgprocess object is created on the host for every
dxgprocess object in the guest.

When a dxgkrnl object is created, in most cases the corresponding
object is created in the host. The VM references the host objects by
handles (d3dkmthandle). d3dkmthandle values for a host object and
the corresponding VM object are the same. A host handle is allocated
first and its value is assigned to the guest object.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/Makefile     |   2 +-
 drivers/hv/dxgkrnl/dxgkrnl.h    |  40 ++-
 drivers/hv/dxgkrnl/dxgmodule.c  |  67 +++-
 drivers/hv/dxgkrnl/dxgprocess.c |  97 ++++++
 drivers/hv/dxgkrnl/dxgvmbus.c   |  79 +++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  24 ++
 drivers/hv/dxgkrnl/hmgr.c       | 566 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/hmgr.h       | 112 +++++++
 drivers/hv/dxgkrnl/ioctl.c      |  70 ++++
 drivers/hv/dxgkrnl/misc.h       |   1 -
 include/uapi/misc/d3dkmthk.h    |   2 +
 11 files changed, 1056 insertions(+), 4 deletions(-)
 create mode 100644 drivers/hv/dxgkrnl/dxgprocess.c
 create mode 100644 drivers/hv/dxgkrnl/hmgr.c
 create mode 100644 drivers/hv/dxgkrnl/hmgr.h

diff --git a/drivers/hv/dxgkrnl/Makefile b/drivers/hv/dxgkrnl/Makefile
index 2ed07d877c91..9d821e83448a 100644
--- a/drivers/hv/dxgkrnl/Makefile
+++ b/drivers/hv/dxgkrnl/Makefile
@@ -2,4 +2,4 @@
 # Makefile for the hyper-v compute device driver (dxgkrnl).
 
 obj-$(CONFIG_DXGKRNL)	+= dxgkrnl.o
-dxgkrnl-y		:= dxgmodule.o misc.o dxgadapter.o ioctl.o dxgvmbus.o
+dxgkrnl-y		:= dxgmodule.o hmgr.o misc.o dxgadapter.o ioctl.o dxgvmbus.o dxgprocess.o
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index c0fd0bb267f2..61512463baa4 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -27,9 +27,11 @@
 #include <linux/pci.h>
 #include <linux/hyperv.h>
 
+struct dxgprocess;
 struct dxgadapter;
 
 #include "misc.h"
+#include "hmgr.h"
 #include <uapi/misc/d3dkmthk.h>
 
 struct dxgk_device_types {
@@ -137,10 +139,44 @@ struct dxgvmbuschannel *dxgglobal_get_dxgvmbuschannel(void);
 int dxgglobal_acquire_channel_lock(void);
 void dxgglobal_release_channel_lock(void);
 
+/*
+ * The structure represents a process, which opened the /dev/dxg device.
+ * A corresponding object is created on the host.
+ */
 struct dxgprocess {
-	/* Placeholder */
+	/*
+	 * Process list entry in dxgglobal.
+	 * Protected by the dxgglobal->plistmutex.
+	 */
+	struct list_head	plistentry;
+	struct task_struct	*process;
+	pid_t			pid;
+	pid_t			tgid;
+	/* how many time the process was opened */
+	struct kref		process_kref;
+	/*
+	 * This handle table is used for all objects except dxgadapter
+	 * The handle table lock order is higher than the local_handle_table
+	 * lock
+	 */
+	struct hmgrtable	handle_table;
+	/*
+	 * This handle table is used for dxgadapter objects.
+	 * The handle table lock order is lowest.
+	 */
+	struct hmgrtable	local_handle_table;
+	/* Handle of the corresponding objec on the host */
+	struct d3dkmthandle	host_handle;
 };
 
+struct dxgprocess *dxgprocess_create(void);
+void dxgprocess_destroy(struct dxgprocess *process);
+void dxgprocess_release(struct kref *refcount);
+void dxgprocess_ht_lock_shared_down(struct dxgprocess *process);
+void dxgprocess_ht_lock_shared_up(struct dxgprocess *process);
+void dxgprocess_ht_lock_exclusive_down(struct dxgprocess *process);
+void dxgprocess_ht_lock_exclusive_up(struct dxgprocess *process);
+
 enum dxgadapter_state {
 	DXGADAPTER_STATE_ACTIVE		= 0,
 	DXGADAPTER_STATE_STOPPED	= 1,
@@ -210,6 +246,8 @@ static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
 void dxgvmb_initialize(void);
 int dxgvmb_send_set_iospace_region(u64 start, u64 len,
 				   struct vmbus_gpadl *shared_mem_gpadl);
+int dxgvmb_send_create_process(struct dxgprocess *process);
+int dxgvmb_send_destroy_process(struct d3dkmthandle process);
 int dxgvmb_send_open_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_close_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter);
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index 5ae57977731c..c79be67b8d56 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -204,17 +204,80 @@ static void dxgglobal_stop_adapters(void)
 	dxgglobal_release_adapter_list_lock(DXGLOCK_EXCL);
 }
 
+/*
+ * Returns dxgprocess for the current executing process.
+ * Creates dxgprocess if it doesn't exist.
+ */
+static struct dxgprocess *dxgglobal_get_current_process(void)
+{
+	/*
+	 * Find the DXG process for the current process.
+	 * A new process is created if necessary.
+	 */
+	struct dxgprocess *process = NULL;
+	struct dxgprocess *entry = NULL;
+
+	mutex_lock(&dxgglobal->plistmutex);
+	list_for_each_entry(entry, &dxgglobal->plisthead, plistentry) {
+		/* All threads of a process have the same thread group ID */
+		if (entry->process->tgid == current->tgid) {
+			if (kref_get_unless_zero(&entry->process_kref)) {
+				process = entry;
+				pr_debug("found dxgprocess");
+			} else {
+				pr_debug("process is destroyed");
+			}
+			break;
+		}
+	}
+	mutex_unlock(&dxgglobal->plistmutex);
+
+	if (process == NULL)
+		process = dxgprocess_create();
+
+	return process;
+}
+
 /*
  * File operations for the /dev/dxg device
  */
 
 static int dxgk_open(struct inode *n, struct file *f)
 {
-	return 0;
+	int ret = 0;
+	struct dxgprocess *process;
+
+	pr_debug("%s %p %d %d",
+		     __func__, f, current->pid, current->tgid);
+
+
+	/* Find/create a dxgprocess structure for this process */
+	process = dxgglobal_get_current_process();
+
+	if (process) {
+		f->private_data = process;
+	} else {
+		pr_debug("cannot create dxgprocess for open\n");
+		ret = -EBADF;
+	}
+
+	pr_debug("%s end %x", __func__, ret);
+	return ret;
 }
 
 static int dxgk_release(struct inode *n, struct file *f)
 {
+	struct dxgprocess *process;
+
+	process = (struct dxgprocess *)f->private_data;
+	pr_debug("%s %p, %p", __func__, f, process);
+
+	if (process == NULL)
+		return -EINVAL;
+
+	kref_put(&process->process_kref, dxgprocess_release);
+
+	f->private_data = NULL;
 	return 0;
 }
 
@@ -236,6 +299,8 @@ const struct file_operations dxgk_fops = {
 	.owner = THIS_MODULE,
 	.open = dxgk_open,
 	.release = dxgk_release,
+	.compat_ioctl = dxgk_compat_ioctl,
+	.unlocked_ioctl = dxgk_unlocked_ioctl,
 	.write = dxgk_write,
 	.read = dxgk_read,
 };
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
new file mode 100644
index 000000000000..2a86ba1b4b1b
--- /dev/null
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * DXGPROCESS implementation
+ *
+ */
+
+#include "dxgkrnl.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+/*
+ * Creates a new dxgprocess object
+ * Must be called when dxgglobal->plistmutex is held
+ */
+struct dxgprocess *dxgprocess_create(void)
+{
+	struct dxgprocess *process;
+	int ret;
+
+	process = vzalloc(sizeof(struct dxgprocess));
+	if (process != NULL) {
+		pr_debug("new dxgprocess created\n");
+		process->process = current;
+		process->pid = current->pid;
+		process->tgid = current->tgid;
+		ret = dxgvmb_send_create_process(process);
+		if (ret < 0) {
+			pr_debug("send_create_process failed\n");
+			vfree(process);
+			process = NULL;
+		} else {
+			INIT_LIST_HEAD(&process->plistentry);
+			kref_init(&process->process_kref);
+
+			mutex_lock(&dxgglobal->plistmutex);
+			list_add_tail(&process->plistentry,
+				      &dxgglobal->plisthead);
+			mutex_unlock(&dxgglobal->plistmutex);
+
+			hmgrtable_init(&process->handle_table, process);
+			hmgrtable_init(&process->local_handle_table, process);
+		}
+	}
+	return process;
+}
+
+void dxgprocess_destroy(struct dxgprocess *process)
+{
+	hmgrtable_destroy(&process->handle_table);
+	hmgrtable_destroy(&process->local_handle_table);
+}
+
+void dxgprocess_release(struct kref *refcount)
+{
+	struct dxgprocess *process;
+
+	process = container_of(refcount, struct dxgprocess, process_kref);
+
+	mutex_lock(&dxgglobal->plistmutex);
+	list_del(&process->plistentry);
+	process->plistentry.next = NULL;
+	mutex_unlock(&dxgglobal->plistmutex);
+
+	dxgprocess_destroy(process);
+
+	if (process->host_handle.v)
+		dxgvmb_send_destroy_process(process->host_handle);
+	vfree(process);
+}
+
+void dxgprocess_ht_lock_shared_down(struct dxgprocess *process)
+{
+	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
+}
+
+void dxgprocess_ht_lock_shared_up(struct dxgprocess *process)
+{
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_SHARED);
+}
+
+void dxgprocess_ht_lock_exclusive_down(struct dxgprocess *process)
+{
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+}
+
+void dxgprocess_ht_lock_exclusive_up(struct dxgprocess *process)
+{
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 6b02932819c8..988372c5812e 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -495,6 +495,85 @@ int dxgvmb_send_set_iospace_region(u64 start, u64 len,
 	return ret;
 }
 
+int dxgvmb_send_create_process(struct dxgprocess *process)
+{
+	int ret;
+	struct dxgkvmb_command_createprocess *command;
+	struct dxgkvmb_command_createprocess_return result = { 0 };
+	struct dxgvmbusmsg msg;
+	char s[WIN_MAX_PATH];
+	int i;
+
+	ret = init_message(&msg, NULL, process, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	command_vm_to_host_init1(&command->hdr, DXGK_VMBCOMMAND_CREATEPROCESS);
+	command->process = process;
+	command->process_id = process->process->pid;
+	command->linux_process = 1;
+	s[0] = 0;
+	__get_task_comm(s, WIN_MAX_PATH, process->process);
+	for (i = 0; i < WIN_MAX_PATH; i++) {
+		command->process_name[i] = s[i];
+		if (s[i] == 0)
+			break;
+	}
+
+	ret = dxgvmb_send_sync_msg(&dxgglobal->channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0) {
+		pr_err("create_process failed %d", ret);
+	} else if (result.hprocess.v == 0) {
+		pr_err("create_process returned 0 handle");
+		ret = -ENOTRECOVERABLE;
+	} else {
+		process->host_handle = result.hprocess;
+		pr_debug("create_process returned %x",
+			    process->host_handle.v);
+	}
+
+	dxgglobal_release_channel_lock();
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_destroy_process(struct d3dkmthandle process)
+{
+	int ret;
+	struct dxgkvmb_command_destroyprocess *command;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, NULL, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+	command_vm_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_DESTROYPROCESS,
+				 process);
+	ret = dxgvmb_send_sync_msg_ntstatus(&dxgglobal->channel,
+					    msg.hdr, msg.size);
+	dxgglobal_release_channel_lock();
+
+cleanup:
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 /*
  * Virtual GPU messages to the host
  */
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index eda259324588..ddd7b6bf9964 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -14,7 +14,11 @@
 #ifndef _DXGVMBUS_H
 #define _DXGVMBUS_H
 
+struct dxgprocess;
+struct dxgadapter;
+
 #define DXG_MAX_VM_BUS_PACKET_SIZE	(1024 * 128)
+#define DXG_VM_PROCESS_NAME_LENGTH	260
 
 enum dxgkvmb_commandchanneltype {
 	DXGKVMB_VGPU_TO_HOST,
@@ -169,6 +173,26 @@ struct dxgkvmb_command_setiospaceregion {
 	u32				shared_page_gpadl;
 };
 
+struct dxgkvmb_command_createprocess {
+	struct dxgkvmb_command_vm_to_host hdr;
+	void			*process;
+	u64			process_id;
+	u16			process_name[DXG_VM_PROCESS_NAME_LENGTH + 1];
+	u8			csrss_process:1;
+	u8			dwm_process:1;
+	u8			wow64_process:1;
+	u8			linux_process:1;
+};
+
+struct dxgkvmb_command_createprocess_return {
+	struct d3dkmthandle	hprocess;
+};
+
+// The command returns ntstatus
+struct dxgkvmb_command_destroyprocess {
+	struct dxgkvmb_command_vm_to_host hdr;
+};
+
 struct dxgkvmb_command_openadapter {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	u32				vmbus_interface_version;
diff --git a/drivers/hv/dxgkrnl/hmgr.c b/drivers/hv/dxgkrnl/hmgr.c
new file mode 100644
index 000000000000..4c012046dcaf
--- /dev/null
+++ b/drivers/hv/dxgkrnl/hmgr.c
@@ -0,0 +1,566 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Handle manager implementation
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/rwsem.h>
+
+#include "misc.h"
+#include "dxgkrnl.h"
+#include "hmgr.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"dxgk: " fmt
+
+const struct d3dkmthandle zerohandle;
+
+/*
+ * Handle parameters
+ */
+#define HMGRHANDLE_INSTANCE_BITS	6
+#define HMGRHANDLE_INDEX_BITS		24
+#define HMGRHANDLE_UNIQUE_BITS		2
+
+#define HMGRHANDLE_INSTANCE_SHIFT	0
+#define HMGRHANDLE_INDEX_SHIFT	\
+	(HMGRHANDLE_INSTANCE_BITS + HMGRHANDLE_INSTANCE_SHIFT)
+#define HMGRHANDLE_UNIQUE_SHIFT	\
+	(HMGRHANDLE_INDEX_BITS + HMGRHANDLE_INDEX_SHIFT)
+
+#define HMGRHANDLE_INSTANCE_MASK \
+	(((1 << HMGRHANDLE_INSTANCE_BITS) - 1) << HMGRHANDLE_INSTANCE_SHIFT)
+#define HMGRHANDLE_INDEX_MASK      \
+	(((1 << HMGRHANDLE_INDEX_BITS)    - 1) << HMGRHANDLE_INDEX_SHIFT)
+#define HMGRHANDLE_UNIQUE_MASK     \
+	(((1 << HMGRHANDLE_UNIQUE_BITS)   - 1) << HMGRHANDLE_UNIQUE_SHIFT)
+
+#define HMGRHANDLE_INSTANCE_MAX	((1 << HMGRHANDLE_INSTANCE_BITS) - 1)
+#define HMGRHANDLE_INDEX_MAX	((1 << HMGRHANDLE_INDEX_BITS) - 1)
+#define HMGRHANDLE_UNIQUE_MAX	((1 << HMGRHANDLE_UNIQUE_BITS) - 1)
+
+/*
+ * Handle entry
+ */
+struct hmgrentry {
+	union {
+		void *object;
+		struct {
+			u32 prev_free_index;
+			u32 next_free_index;
+		};
+	};
+	u32 type:HMGRENTRY_TYPE_BITS + 1;
+	u32 unique:HMGRHANDLE_UNIQUE_BITS;
+	u32 instance:HMGRHANDLE_INSTANCE_BITS;
+	u32 destroyed:1;
+};
+
+#define HMGRTABLE_SIZE_INCREMENT	1024
+#define HMGRTABLE_MIN_FREE_ENTRIES 128
+#define HMGRTABLE_INVALID_INDEX (~((1 << HMGRHANDLE_INDEX_BITS) - 1))
+#define HMGRTABLE_SIZE_MAX		0xFFFFFFF
+
+static u32 table_size_increment = HMGRTABLE_SIZE_INCREMENT;
+
+static u32 get_unique(struct d3dkmthandle h)
+{
+	return (h.v & HMGRHANDLE_UNIQUE_MASK) >> HMGRHANDLE_UNIQUE_SHIFT;
+}
+
+static u32 get_index(struct d3dkmthandle h)
+{
+	return (h.v & HMGRHANDLE_INDEX_MASK) >> HMGRHANDLE_INDEX_SHIFT;
+}
+
+static bool is_handle_valid(struct hmgrtable *table, struct d3dkmthandle h,
+			    bool ignore_destroyed, enum hmgrentry_type t)
+{
+	u32 index = get_index(h);
+	u32 unique = get_unique(h);
+	struct hmgrentry *entry;
+
+	if (index >= table->table_size) {
+		pr_err("%s Invalid index %x %d\n", __func__, h.v, index);
+		return false;
+	}
+
+	entry = &table->entry_table[index];
+	if (unique != entry->unique) {
+		pr_err("%s Invalid unique %x %d %d %d %p",
+			   __func__, h.v, unique, entry->unique,
+			   index, entry->object);
+		return false;
+	}
+
+	if (entry->destroyed && !ignore_destroyed) {
+		pr_err("%s Invalid destroyed", __func__);
+		return false;
+	}
+
+	if (entry->type == HMGRENTRY_TYPE_FREE) {
+		pr_err("%s Entry is freed %x %d", __func__, h.v, index);
+		return false;
+	}
+
+	if (t != HMGRENTRY_TYPE_FREE && t != entry->type) {
+		pr_err("%s type mismatch %x %d %d", __func__, h.v,
+			   t, entry->type);
+		return false;
+	}
+
+	return true;
+}
+
+static struct d3dkmthandle build_handle(u32 index, u32 unique, u32 instance)
+{
+	struct d3dkmthandle handle;
+
+	handle.v = (index << HMGRHANDLE_INDEX_SHIFT) & HMGRHANDLE_INDEX_MASK;
+	handle.v |= (unique << HMGRHANDLE_UNIQUE_SHIFT) &
+	    HMGRHANDLE_UNIQUE_MASK;
+	handle.v |= (instance << HMGRHANDLE_INSTANCE_SHIFT) &
+	    HMGRHANDLE_INSTANCE_MASK;
+
+	return handle;
+}
+
+inline u32 hmgrtable_get_used_entry_count(struct hmgrtable *table)
+{
+	DXGKRNL_ASSERT(table->table_size >= table->free_count);
+	return (table->table_size - table->free_count);
+}
+
+bool hmgrtable_mark_destroyed(struct hmgrtable *table, struct d3dkmthandle h)
+{
+	if (!is_handle_valid(table, h, false, HMGRENTRY_TYPE_FREE))
+		return false;
+
+	table->entry_table[get_index(h)].destroyed = true;
+	return true;
+}
+
+bool hmgrtable_unmark_destroyed(struct hmgrtable *table, struct d3dkmthandle h)
+{
+	if (!is_handle_valid(table, h, true, HMGRENTRY_TYPE_FREE))
+		return true;
+
+	DXGKRNL_ASSERT(table->entry_table[get_index(h)].destroyed);
+	table->entry_table[get_index(h)].destroyed = 0;
+	return true;
+}
+
+static bool expand_table(struct hmgrtable *table, u32 NumEntries)
+{
+	u32 new_table_size;
+	struct hmgrentry *new_entry;
+	u32 table_index;
+	u32 new_free_count;
+	u32 prev_free_index;
+	u32 tail_index = table->free_handle_list_tail;
+
+	/* The tail should point to the last free element in the list */
+	if (table->free_count != 0) {
+		if (tail_index >= table->table_size ||
+		    table->entry_table[tail_index].next_free_index !=
+		    HMGRTABLE_INVALID_INDEX) {
+			pr_err("%s:corruption\n", __func__);
+			pr_err("tail_index: %x", tail_index);
+			pr_err("table size: %x", table->table_size);
+			pr_err("free_count: %d", table->free_count);
+			pr_err("NumEntries: %x", NumEntries);
+			return false;
+		}
+	}
+
+	new_free_count = table_size_increment + table->free_count;
+	new_table_size = table->table_size + table_size_increment;
+	if (new_table_size < NumEntries) {
+		new_free_count += NumEntries - new_table_size;
+		new_table_size = NumEntries;
+	}
+
+	if (new_table_size > HMGRHANDLE_INDEX_MAX) {
+		pr_err("%s:corruption\n", __func__);
+		return false;
+	}
+
+	new_entry = (struct hmgrentry *)
+	    vzalloc(new_table_size * sizeof(struct hmgrentry));
+	if (new_entry == NULL) {
+		pr_err("%s:allocation failed\n", __func__);
+		return false;
+	}
+
+	if (table->entry_table) {
+		memcpy(new_entry, table->entry_table,
+		       table->table_size * sizeof(struct hmgrentry));
+		vfree(table->entry_table);
+	} else {
+		table->free_handle_list_head = 0;
+	}
+
+	table->entry_table = new_entry;
+
+	/* Initialize new table entries and add to the free list */
+	table_index = table->table_size;
+
+	prev_free_index = table->free_handle_list_tail;
+
+	while (table_index < new_table_size) {
+		struct hmgrentry *entry = &table->entry_table[table_index];
+
+		entry->prev_free_index = prev_free_index;
+		entry->next_free_index = table_index + 1;
+		entry->type = HMGRENTRY_TYPE_FREE;
+		entry->unique = 1;
+		entry->instance = 0;
+		prev_free_index = table_index;
+
+		table_index++;
+	}
+
+	table->entry_table[table_index - 1].next_free_index =
+	    (u32) HMGRTABLE_INVALID_INDEX;
+
+	if (table->free_count != 0) {
+		/* Link the current free list with the new entries */
+		struct hmgrentry *entry;
+
+		entry = &table->entry_table[table->free_handle_list_tail];
+		entry->next_free_index = table->table_size;
+	}
+	table->free_handle_list_tail = new_table_size - 1;
+	if (table->free_handle_list_head == HMGRTABLE_INVALID_INDEX)
+		table->free_handle_list_head = table->table_size;
+
+	table->table_size = new_table_size;
+	table->free_count = new_free_count;
+
+	return true;
+}
+
+void hmgrtable_init(struct hmgrtable *table, struct dxgprocess *process)
+{
+	table->process = process;
+	table->entry_table = NULL;
+	table->table_size = 0;
+	table->free_handle_list_head = HMGRTABLE_INVALID_INDEX;
+	table->free_handle_list_tail = HMGRTABLE_INVALID_INDEX;
+	table->free_count = 0;
+	init_rwsem(&table->table_lock);
+}
+
+void hmgrtable_destroy(struct hmgrtable *table)
+{
+	if (table->entry_table) {
+		vfree(table->entry_table);
+		table->entry_table = NULL;
+	}
+}
+
+void hmgrtable_lock(struct hmgrtable *table, enum dxglockstate state)
+{
+	if (state == DXGLOCK_EXCL)
+		down_write(&table->table_lock);
+	else
+		down_read(&table->table_lock);
+}
+
+void hmgrtable_unlock(struct hmgrtable *table, enum dxglockstate state)
+{
+	if (state == DXGLOCK_EXCL)
+		up_write(&table->table_lock);
+	else
+		up_read(&table->table_lock);
+}
+
+struct d3dkmthandle hmgrtable_alloc_handle(struct hmgrtable *table,
+					   void *object,
+					   enum hmgrentry_type type,
+					   bool make_valid)
+{
+	u32 index;
+	struct hmgrentry *entry;
+	u32 unique;
+
+	DXGKRNL_ASSERT(type <= HMGRENTRY_TYPE_LIMIT);
+	DXGKRNL_ASSERT(type > HMGRENTRY_TYPE_FREE);
+
+	if (table->free_count <= HMGRTABLE_MIN_FREE_ENTRIES) {
+		if (!expand_table(table, 0)) {
+			pr_err("hmgrtable expand_table failed\n");
+			return zerohandle;
+		}
+	}
+
+	if (table->free_handle_list_head >= table->table_size) {
+		pr_err("hmgrtable corrupted handle table head\n");
+		return zerohandle;
+	}
+
+	index = table->free_handle_list_head;
+	entry = &table->entry_table[index];
+
+	if (entry->type != HMGRENTRY_TYPE_FREE) {
+		pr_err("hmgrtable expected free handle\n");
+		return zerohandle;
+	}
+
+	table->free_handle_list_head = entry->next_free_index;
+
+	if (entry->next_free_index != table->free_handle_list_tail) {
+		if (entry->next_free_index >= table->table_size) {
+			pr_err("hmgrtable invalid next free index\n");
+			return zerohandle;
+		}
+		table->entry_table[entry->next_free_index].prev_free_index =
+		    HMGRTABLE_INVALID_INDEX;
+	}
+
+	unique = table->entry_table[index].unique;
+
+	table->entry_table[index].object = object;
+	table->entry_table[index].type = type;
+	table->entry_table[index].instance = 0;
+	table->entry_table[index].destroyed = !make_valid;
+	table->free_count--;
+	DXGKRNL_ASSERT(table->free_count <= table->table_size);
+
+	return build_handle(index, unique, table->entry_table[index].instance);
+}
+
+int hmgrtable_assign_handle_safe(struct hmgrtable *table,
+				 void *object,
+				 enum hmgrentry_type type,
+				 struct d3dkmthandle h)
+{
+	int ret;
+
+	hmgrtable_lock(table, DXGLOCK_EXCL);
+	ret = hmgrtable_assign_handle(table, object, type, h);
+	hmgrtable_unlock(table, DXGLOCK_EXCL);
+	return ret;
+}
+
+int hmgrtable_assign_handle(struct hmgrtable *table, void *object,
+			    enum hmgrentry_type type, struct d3dkmthandle h)
+{
+	u32 index = get_index(h);
+	u32 unique = get_unique(h);
+	struct hmgrentry *entry = NULL;
+
+	pr_debug("%s %x, %d %p, %p\n",
+		    __func__, h.v, index, object, table);
+
+	if (index >= HMGRHANDLE_INDEX_MAX) {
+		pr_err("handle index is too big: %x %d", h.v, index);
+		return -EINVAL;
+	}
+
+	if (index >= table->table_size) {
+		u32 new_size = index + table_size_increment;
+
+		if (new_size > HMGRHANDLE_INDEX_MAX)
+			new_size = HMGRHANDLE_INDEX_MAX;
+		if (!expand_table(table, new_size)) {
+			pr_err("failed to expand handle table %d", new_size);
+			return -ENOMEM;
+		}
+	}
+
+	entry = &table->entry_table[index];
+
+	if (entry->type != HMGRENTRY_TYPE_FREE) {
+		pr_err("the entry is already busy: %d %x",
+			   entry->type,
+			   hmgrtable_build_entry_handle(table, index).v);
+		return -EINVAL;
+	}
+
+	if (index != table->free_handle_list_tail) {
+		if (entry->next_free_index >= table->table_size) {
+			pr_err("hmgr: invalid next free index %d\n",
+				   entry->next_free_index);
+			return -EINVAL;
+		}
+		table->entry_table[entry->next_free_index].prev_free_index =
+		    entry->prev_free_index;
+	} else {
+		table->free_handle_list_tail = entry->prev_free_index;
+	}
+
+	if (index != table->free_handle_list_head) {
+		if (entry->prev_free_index >= table->table_size) {
+			pr_err("hmgr: invalid next prev index %d\n",
+				   entry->prev_free_index);
+			return -EINVAL;
+		}
+		table->entry_table[entry->prev_free_index].next_free_index =
+		    entry->next_free_index;
+	} else {
+		table->free_handle_list_head = entry->next_free_index;
+	}
+
+	entry->prev_free_index = HMGRTABLE_INVALID_INDEX;
+	entry->next_free_index = HMGRTABLE_INVALID_INDEX;
+	entry->object = object;
+	entry->type = type;
+	entry->instance = 0;
+	entry->unique = unique;
+	entry->destroyed = false;
+
+	table->free_count--;
+	DXGKRNL_ASSERT(table->free_count <= table->table_size);
+	return 0;
+}
+
+struct d3dkmthandle hmgrtable_alloc_handle_safe(struct hmgrtable *table,
+						void *obj,
+						enum hmgrentry_type type,
+						bool make_valid)
+{
+	struct d3dkmthandle h;
+
+	hmgrtable_lock(table, DXGLOCK_EXCL);
+	h = hmgrtable_alloc_handle(table, obj, type, make_valid);
+	hmgrtable_unlock(table, DXGLOCK_EXCL);
+	return h;
+}
+
+void hmgrtable_free_handle(struct hmgrtable *table, enum hmgrentry_type t,
+			   struct d3dkmthandle h)
+{
+	struct hmgrentry *entry;
+	u32 i = get_index(h);
+
+	pr_debug("%s: %p %x\n", __func__, table, h.v);
+
+	/* Ignore the destroyed flag when checking the handle */
+	if (is_handle_valid(table, h, true, t)) {
+		DXGKRNL_ASSERT(table->free_count < table->table_size);
+		entry = &table->entry_table[i];
+		entry->unique = 1;
+		entry->type = HMGRENTRY_TYPE_FREE;
+		entry->destroyed = 0;
+		if (entry->unique != HMGRHANDLE_UNIQUE_MAX)
+			entry->unique += 1;
+		else
+			entry->unique = 1;
+
+		table->free_count++;
+		DXGKRNL_ASSERT(table->free_count <= table->table_size);
+
+		/*
+		 * Insert the index to the free list at the tail.
+		 */
+		entry->next_free_index = HMGRTABLE_INVALID_INDEX;
+		entry->prev_free_index = table->free_handle_list_tail;
+		entry = &table->entry_table[table->free_handle_list_tail];
+		entry->next_free_index = i;
+		table->free_handle_list_tail = i;
+	} else {
+		pr_err("%s:error: %d %x\n", __func__, i, h.v);
+	}
+}
+
+void hmgrtable_free_handle_safe(struct hmgrtable *table, enum hmgrentry_type t,
+				struct d3dkmthandle h)
+{
+	hmgrtable_lock(table, DXGLOCK_EXCL);
+	hmgrtable_free_handle(table, t, h);
+	hmgrtable_unlock(table, DXGLOCK_EXCL);
+}
+
+struct d3dkmthandle hmgrtable_build_entry_handle(struct hmgrtable *table,
+						 u32 index)
+{
+	DXGKRNL_ASSERT(index < table->table_size);
+
+	return build_handle(index, table->entry_table[index].unique,
+			    table->entry_table[index].instance);
+}
+
+void *hmgrtable_get_object(struct hmgrtable *table, struct d3dkmthandle h)
+{
+	if (!is_handle_valid(table, h, false, HMGRENTRY_TYPE_FREE))
+		return NULL;
+
+	return table->entry_table[get_index(h)].object;
+}
+
+void *hmgrtable_get_object_by_type(struct hmgrtable *table,
+				   enum hmgrentry_type type,
+				   struct d3dkmthandle h)
+{
+	if (!is_handle_valid(table, h, false, type)) {
+		pr_err("%s invalid handle %x\n", __func__, h.v);
+		return NULL;
+	}
+	return table->entry_table[get_index(h)].object;
+}
+
+void *hmgrtable_get_entry_object(struct hmgrtable *table, u32 index)
+{
+	DXGKRNL_ASSERT(index < table->table_size);
+	DXGKRNL_ASSERT(table->entry_table[index].type != HMGRENTRY_TYPE_FREE);
+
+	return table->entry_table[index].object;
+}
+
+static enum hmgrentry_type hmgrtable_get_entry_type(struct hmgrtable *table,
+						    u32 index)
+{
+	DXGKRNL_ASSERT(index < table->table_size);
+	return (enum hmgrentry_type)table->entry_table[index].type;
+}
+
+enum hmgrentry_type hmgrtable_get_object_type(struct hmgrtable *table,
+					      struct d3dkmthandle h)
+{
+	if (!is_handle_valid(table, h, false, HMGRENTRY_TYPE_FREE))
+		return HMGRENTRY_TYPE_FREE;
+
+	return hmgrtable_get_entry_type(table, get_index(h));
+}
+
+void *hmgrtable_get_object_ignore_destroyed(struct hmgrtable *table,
+					    struct d3dkmthandle h,
+					    enum hmgrentry_type type)
+{
+	if (!is_handle_valid(table, h, true, type))
+		return NULL;
+	return table->entry_table[get_index(h)].object;
+}
+
+bool hmgrtable_next_entry(struct hmgrtable *tbl,
+			  u32 *index,
+			  enum hmgrentry_type *type,
+			  struct d3dkmthandle *handle,
+			  void **object)
+{
+	u32 i;
+	struct hmgrentry *entry;
+
+	for (i = *index; i < tbl->table_size; i++) {
+		entry = &tbl->entry_table[i];
+		if (entry->type != HMGRENTRY_TYPE_FREE) {
+			*index = i + 1;
+			*object = entry->object;
+			*handle = build_handle(i, entry->unique,
+					       entry->instance);
+			*type = entry->type;
+			return true;
+		}
+	}
+	return false;
+}
diff --git a/drivers/hv/dxgkrnl/hmgr.h b/drivers/hv/dxgkrnl/hmgr.h
new file mode 100644
index 000000000000..9ce4f536bf29
--- /dev/null
+++ b/drivers/hv/dxgkrnl/hmgr.h
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2019, Microsoft Corporation.
+ *
+ * Author:
+ *   Iouri Tarassov <iourit@linux.microsoft.com>
+ *
+ * Dxgkrnl Graphics Driver
+ * Handle manager definitions
+ *
+ */
+
+#ifndef _HMGR_H_
+#define _HMGR_H_
+
+#include "misc.h"
+
+struct hmgrentry;
+
+/*
+ * Handle manager table.
+ *
+ * Implementation notes:
+ *   A list of free handles is built on top of the array of table entries.
+ *   free_handle_list_head is the index of the first entry in the list.
+ *   m_FreeHandleListTail is the index of an entry in the list, which is
+ *   HMGRTABLE_MIN_FREE_ENTRIES from the head. It means that when a handle is
+ *   freed, the next time the handle can be re-used is after allocating
+ *   HMGRTABLE_MIN_FREE_ENTRIES number of handles.
+ *   Handles are allocated from the start of the list and free handles are
+ *   inserted after the tail of the list.
+ *
+ */
+struct hmgrtable {
+	struct dxgprocess	*process;
+	struct hmgrentry	*entry_table;
+	u32			free_handle_list_head;
+	u32			free_handle_list_tail;
+	u32			table_size;
+	u32			free_count;
+	struct rw_semaphore	table_lock;
+};
+
+/*
+ * Handle entry data types.
+ */
+#define HMGRENTRY_TYPE_BITS 5
+
+enum hmgrentry_type {
+	HMGRENTRY_TYPE_FREE				= 0,
+	HMGRENTRY_TYPE_DXGADAPTER			= 1,
+	HMGRENTRY_TYPE_DXGSHAREDRESOURCE		= 2,
+	HMGRENTRY_TYPE_DXGDEVICE			= 3,
+	HMGRENTRY_TYPE_DXGRESOURCE			= 4,
+	HMGRENTRY_TYPE_DXGALLOCATION			= 5,
+	HMGRENTRY_TYPE_DXGOVERLAY			= 6,
+	HMGRENTRY_TYPE_DXGCONTEXT			= 7,
+	HMGRENTRY_TYPE_DXGSYNCOBJECT			= 8,
+	HMGRENTRY_TYPE_DXGKEYEDMUTEX			= 9,
+	HMGRENTRY_TYPE_DXGPAGINGQUEUE			= 10,
+	HMGRENTRY_TYPE_DXGDEVICESYNCOBJECT		= 11,
+	HMGRENTRY_TYPE_DXGPROCESS			= 12,
+	HMGRENTRY_TYPE_DXGSHAREDVMOBJECT		= 13,
+	HMGRENTRY_TYPE_DXGPROTECTEDSESSION		= 14,
+	HMGRENTRY_TYPE_DXGHWQUEUE			= 15,
+	HMGRENTRY_TYPE_DXGREMOTEBUNDLEOBJECT		= 16,
+	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEOBJECT	= 17,
+	HMGRENTRY_TYPE_DXGCOMPOSITIONSURFACEPROXY	= 18,
+	HMGRENTRY_TYPE_DXGTRACKEDWORKLOAD		= 19,
+	HMGRENTRY_TYPE_LIMIT		= ((1 << HMGRENTRY_TYPE_BITS) - 1),
+	HMGRENTRY_TYPE_MONITOREDFENCE	= HMGRENTRY_TYPE_LIMIT + 1,
+};
+
+void hmgrtable_init(struct hmgrtable *tbl, struct dxgprocess *process);
+void hmgrtable_destroy(struct hmgrtable *tbl);
+void hmgrtable_lock(struct hmgrtable *tbl, enum dxglockstate state);
+void hmgrtable_unlock(struct hmgrtable *tbl, enum dxglockstate state);
+struct d3dkmthandle hmgrtable_alloc_handle(struct hmgrtable *tbl, void *object,
+				     enum hmgrentry_type t, bool make_valid);
+struct d3dkmthandle hmgrtable_alloc_handle_safe(struct hmgrtable *tbl,
+						void *obj,
+						enum hmgrentry_type t,
+						bool reserve);
+int hmgrtable_assign_handle(struct hmgrtable *tbl, void *obj,
+			    enum hmgrentry_type, struct d3dkmthandle h);
+int hmgrtable_assign_handle_safe(struct hmgrtable *tbl, void *obj,
+				 enum hmgrentry_type t, struct d3dkmthandle h);
+void hmgrtable_free_handle(struct hmgrtable *tbl, enum hmgrentry_type t,
+			   struct d3dkmthandle h);
+void hmgrtable_free_handle_safe(struct hmgrtable *tbl, enum hmgrentry_type t,
+				struct d3dkmthandle h);
+struct d3dkmthandle hmgrtable_build_entry_handle(struct hmgrtable *tbl,
+						 u32 index);
+enum hmgrentry_type hmgrtable_get_object_type(struct hmgrtable *tbl,
+					      struct d3dkmthandle h);
+void *hmgrtable_get_object(struct hmgrtable *tbl, struct d3dkmthandle h);
+void *hmgrtable_get_object_by_type(struct hmgrtable *tbl, enum hmgrentry_type t,
+				   struct d3dkmthandle h);
+void *hmgrtable_get_object_ignore_destroyed(struct hmgrtable *tbl,
+					    struct d3dkmthandle h,
+					    enum hmgrentry_type t);
+bool hmgrtable_mark_destroyed(struct hmgrtable *tbl, struct d3dkmthandle h);
+bool hmgrtable_unmark_destroyed(struct hmgrtable *tbl, struct d3dkmthandle h);
+void *hmgrtable_get_entry_object(struct hmgrtable *tbl, u32 index);
+bool hmgrtable_next_entry(struct hmgrtable *tbl,
+			  u32 *start_index,
+			  enum hmgrentry_type *type,
+			  struct d3dkmthandle *handle,
+			  void **object);
+
+#endif
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 277e25e5d8c6..fa7fc321addb 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -22,3 +22,73 @@
 
 #undef pr_fmt
 #define pr_fmt(fmt)	"dxgk: " fmt
+
+struct ioctl_desc {
+	int (*ioctl_callback)(struct dxgprocess *p, void __user *arg);
+	u32 ioctl;
+	u32 arg_size;
+};
+static struct ioctl_desc ioctls[LX_IO_MAX + 1];
+
+static char *errorstr(int ret)
+{
+	return ret < 0 ? "err" : "";
+}
+
+/*
+ * IOCTL processing
+ * The driver IOCTLs return
+ * - 0 in case of success
+ * - positive values, which are Windows NTSTATUS (for example, STATUS_PENDING).
+ *   Positive values are success codes.
+ * - Linux negative error codes
+ */
+static int dxgk_ioctl(struct file *f, unsigned int p1, unsigned long p2)
+{
+	int code = _IOC_NR(p1);
+	int status;
+	struct dxgprocess *process;
+
+	if (code < 1 || code > LX_IO_MAX) {
+		pr_err("bad ioctl %x %x %x %x",
+			   code, _IOC_TYPE(p1), _IOC_SIZE(p1), _IOC_DIR(p1));
+		return -ENOTTY;
+	}
+	if (ioctls[code].ioctl_callback == NULL) {
+		pr_err("ioctl callback is NULL %x", code);
+		return -ENOTTY;
+	}
+	if (ioctls[code].ioctl != p1) {
+		pr_err("ioctl mismatch. Code: %x User: %x Kernel: %x",
+			   code, p1, ioctls[code].ioctl);
+		return -ENOTTY;
+	}
+	process = (struct dxgprocess *)f->private_data;
+	if (process->tgid != current->tgid) {
+		pr_err("Call from a wrong process: %d %d",
+			   process->tgid, current->tgid);
+		return -ENOTTY;
+	}
+	status = ioctls[code].ioctl_callback(process, (void *__user)p2);
+	return status;
+}
+
+long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2)
+{
+	pr_debug("  compat ioctl %x", p1);
+	return dxgk_ioctl(f, p1, p2);
+}
+
+long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2)
+{
+	pr_debug("   unlocked ioctl %x Code:%d", p1, _IOC_NR(p1));
+	return dxgk_ioctl(f, p1, p2);
+}
+
+#define SET_IOCTL(callback, v)				\
+	ioctls[_IOC_NR(v)].ioctl_callback = callback;	\
+	ioctls[_IOC_NR(v)].ioctl = v
+
+void init_ioctls(void)
+{
+}
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 1ff0c0e28332..433b59d3eb23 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -31,7 +31,6 @@ extern const struct d3dkmthandle zerohandle;
  * table_lock
  * core_lock
  * device_lock
- * process->process_mutex
  * adapter_list_lock
  * device_mutex
  */
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 2b9ed954a520..7e2c520abf5f 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -54,4 +54,6 @@ struct winluid {
 	__u32 b;
 };
 
+#define LX_IO_MAX 0x45
+
 #endif /* _D3DKMTHK_H */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (3 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-02 12:43     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 07/30] drivers: hv: dxgkrnl: Creation of dxgdevice objects Iouri Tarassov
                     ` (24 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to enumerate dxgadapter objects:
  - The LX_DXENUMADAPTERS2 ioctl
  - The LX_DXENUMADAPTERS3 ioctl.

Implement ioctls to open adapter by luid and to close adapter
handle:
  - The LX_DXOPENADAPTERFROMLUID ioctl
  - the LX_DXCLOSEADAPTER ioctl

Impllement the ioctl to query dxgadapter information:
  - The LX_DXQUERYADAPTERINFO ioctl

When a dxgadapter is enumerated, it is implicitely opened and
a handle (d3dkmthandle) is created in the current process handle
table. The handle is returned to the caller and can be used
by user mode to reference the adapter object in other ioctls.

The caller is responsible for closing the adapter when it is not
longer used by issuing the LX_DXCLOSEADAPTER ioctl.

A dxgprocess has a list of opened dxgadapter objects
(dxgprocess_adapter is used to represent the entry in the list).
A dxgadapter also has a list of dxgprocess_adapter objects.
This is needed for cleanup because either a process or an adapter
could be destroyed first.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  77 ++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  49 ++++
 drivers/hv/dxgkrnl/dxgmodule.c  |  12 +
 drivers/hv/dxgkrnl/dxgprocess.c | 166 +++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.c   |  82 +++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  12 +
 drivers/hv/dxgkrnl/ioctl.c      | 412 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.h       |   1 +
 include/uapi/misc/d3dkmthk.h    | 103 ++++++++
 9 files changed, 914 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index e0a6fea00bd5..2c7823713547 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -102,6 +102,7 @@ void dxgadapter_start(struct dxgadapter *adapter)
 
 void dxgadapter_stop(struct dxgadapter *adapter)
 {
+	struct dxgprocess_adapter *entry;
 	bool adapter_stopped = false;
 
 	down_write(&adapter->core_lock);
@@ -114,6 +115,15 @@ void dxgadapter_stop(struct dxgadapter *adapter)
 	if (adapter_stopped)
 		return;
 
+	dxgglobal_acquire_process_adapter_lock();
+
+	list_for_each_entry(entry, &adapter->adapter_process_list_head,
+			    adapter_process_list_entry) {
+		dxgprocess_adapter_stop(entry);
+	}
+
+	dxgglobal_release_process_adapter_lock();
+
 	if (dxgadapter_acquire_lock_exclusive(adapter) == 0) {
 		dxgvmb_send_close_adapter(adapter);
 		dxgadapter_release_lock_exclusive(adapter);
@@ -137,6 +147,24 @@ bool dxgadapter_is_active(struct dxgadapter *adapter)
 	return adapter->adapter_state == DXGADAPTER_STATE_ACTIVE;
 }
 
+/* Protected by dxgglobal_acquire_process_adapter_lock */
+void dxgadapter_add_process(struct dxgadapter *adapter,
+			    struct dxgprocess_adapter *process_info)
+{
+	pr_debug("%s %p %p", __func__, adapter, process_info);
+	list_add_tail(&process_info->adapter_process_list_entry,
+		      &adapter->adapter_process_list_head);
+}
+
+void dxgadapter_remove_process(struct dxgprocess_adapter *process_info)
+{
+	pr_debug("%s %p %p", __func__,
+		    process_info->adapter, process_info);
+	list_del(&process_info->adapter_process_list_entry);
+	process_info->adapter_process_list_entry.next = NULL;
+	process_info->adapter_process_list_entry.prev = NULL;
+}
+
 int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter)
 {
 	down_write(&adapter->core_lock);
@@ -170,3 +198,52 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter)
 {
 	up_read(&adapter->core_lock);
 }
+
+struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
+						     struct dxgadapter *adapter)
+{
+	struct dxgprocess_adapter *adapter_info;
+
+	adapter_info = vzalloc(sizeof(*adapter_info));
+	if (adapter_info) {
+		if (kref_get_unless_zero(&adapter->adapter_kref) == 0) {
+			pr_err("failed to acquire adapter reference");
+			goto cleanup;
+		}
+		adapter_info->adapter = adapter;
+		adapter_info->process = process;
+		adapter_info->refcount = 1;
+		list_add_tail(&adapter_info->process_adapter_list_entry,
+			      &process->process_adapter_list_head);
+		dxgadapter_add_process(adapter, adapter_info);
+	}
+	return adapter_info;
+cleanup:
+	if (adapter_info)
+		vfree(adapter_info);
+	return NULL;
+}
+
+void dxgprocess_adapter_stop(struct dxgprocess_adapter *adapter_info)
+{
+}
+
+void dxgprocess_adapter_destroy(struct dxgprocess_adapter *adapter_info)
+{
+	dxgadapter_remove_process(adapter_info);
+	kref_put(&adapter_info->adapter->adapter_kref, dxgadapter_release);
+	list_del(&adapter_info->process_adapter_list_entry);
+	vfree(adapter_info);
+}
+
+/*
+ * Must be called when dxgglobal::process_adapter_mutex is held
+ */
+void dxgprocess_adapter_release(struct dxgprocess_adapter *adapter_info)
+{
+	pr_debug("%s %p %d",
+		    __func__, adapter_info, adapter_info->refcount);
+	adapter_info->refcount--;
+	if (adapter_info->refcount == 0)
+		dxgprocess_adapter_destroy(adapter_info);
+}
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 61512463baa4..fbc15731cbd5 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -119,6 +119,9 @@ struct dxgglobal {
 	/* protects acces to the global VM bus channel */
 	struct rw_semaphore	channel_lock;
 
+	/* protects the dxgprocess_adapter lists */
+	struct mutex		process_adapter_mutex;
+
 	bool			dxg_dev_initialized;
 	bool			vmbus_registered;
 	bool			pci_registered;
@@ -136,9 +139,31 @@ int dxgglobal_init_global_channel(void);
 void dxgglobal_destroy_global_channel(void);
 struct vmbus_channel *dxgglobal_get_vmbus(void);
 struct dxgvmbuschannel *dxgglobal_get_dxgvmbuschannel(void);
+void dxgglobal_acquire_process_adapter_lock(void);
+void dxgglobal_release_process_adapter_lock(void);
 int dxgglobal_acquire_channel_lock(void);
 void dxgglobal_release_channel_lock(void);
 
+/*
+ * Describes adapter information for each process
+ */
+struct dxgprocess_adapter {
+	/* Entry in dxgadapter::adapter_process_list_head */
+	struct list_head	adapter_process_list_entry;
+	/* Entry in dxgprocess::process_adapter_list_head */
+	struct list_head	process_adapter_list_entry;
+	struct dxgadapter	*adapter;
+	struct dxgprocess	*process;
+	int			refcount;
+};
+
+struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
+						     struct dxgadapter
+						     *adapter);
+void dxgprocess_adapter_release(struct dxgprocess_adapter *adapter);
+void dxgprocess_adapter_stop(struct dxgprocess_adapter *adapter_info);
+void dxgprocess_adapter_destroy(struct dxgprocess_adapter *adapter_info);
+
 /*
  * The structure represents a process, which opened the /dev/dxg device.
  * A corresponding object is created on the host.
@@ -167,15 +192,31 @@ struct dxgprocess {
 	struct hmgrtable	local_handle_table;
 	/* Handle of the corresponding objec on the host */
 	struct d3dkmthandle	host_handle;
+
+	/* List of opened adapters (dxgprocess_adapter) */
+	struct list_head	process_adapter_list_head;
 };
 
 struct dxgprocess *dxgprocess_create(void);
 void dxgprocess_destroy(struct dxgprocess *process);
 void dxgprocess_release(struct kref *refcount);
+int dxgprocess_open_adapter(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmthandle *handle);
+int dxgprocess_close_adapter(struct dxgprocess *process,
+					 struct d3dkmthandle handle);
+struct dxgadapter *dxgprocess_get_adapter(struct dxgprocess *process,
+					  struct d3dkmthandle handle);
+struct dxgadapter *dxgprocess_adapter_by_handle(struct dxgprocess *process,
+						struct d3dkmthandle handle);
 void dxgprocess_ht_lock_shared_down(struct dxgprocess *process);
 void dxgprocess_ht_lock_shared_up(struct dxgprocess *process);
 void dxgprocess_ht_lock_exclusive_down(struct dxgprocess *process);
 void dxgprocess_ht_lock_exclusive_up(struct dxgprocess *process);
+struct dxgprocess_adapter *dxgprocess_get_adapter_info(struct dxgprocess
+						       *process,
+						       struct dxgadapter
+						       *adapter);
 
 enum dxgadapter_state {
 	DXGADAPTER_STATE_ACTIVE		= 0,
@@ -194,6 +235,8 @@ struct dxgadapter {
 	struct kref		adapter_kref;
 	/* Entry in the list of adapters in dxgglobal */
 	struct list_head	adapter_list_entry;
+	/* The list of dxgprocess_adapter entries */
+	struct list_head	adapter_process_list_head;
 	struct pci_dev		*pci_dev;
 	struct hv_device	*hv_dev;
 	struct dxgvmbuschannel	channel;
@@ -217,6 +260,9 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter);
 int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter);
 void dxgadapter_acquire_lock_forced(struct dxgadapter *adapter);
 void dxgadapter_release_lock_exclusive(struct dxgadapter *adapter);
+void dxgadapter_add_process(struct dxgadapter *adapter,
+			    struct dxgprocess_adapter *process_info);
+void dxgadapter_remove_process(struct dxgprocess_adapter *process_info);
 
 void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
@@ -251,6 +297,9 @@ int dxgvmb_send_destroy_process(struct d3dkmthandle process);
 int dxgvmb_send_open_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_close_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter);
+int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
+				   struct dxgadapter *adapter,
+				   struct d3dkmt_queryadapterinfo *args);
 int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
 			  void *command,
 			  u32 cmd_size);
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index c79be67b8d56..c1c3274197d8 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -124,6 +124,16 @@ static struct dxgadapter *find_adapter(struct winluid *luid)
 	return adapter;
 }
 
+void dxgglobal_acquire_process_adapter_lock(void)
+{
+	mutex_lock(&dxgglobal->process_adapter_mutex);
+}
+
+void dxgglobal_release_process_adapter_lock(void)
+{
+	mutex_unlock(&dxgglobal->process_adapter_mutex);
+}
+
 /*
  * Creates a new dxgadapter object, which represents a virtual GPU, projected
  * by the host.
@@ -147,6 +157,7 @@ int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
 	kref_init(&adapter->adapter_kref);
 	init_rwsem(&adapter->core_lock);
 
+	INIT_LIST_HEAD(&adapter->adapter_process_list_head);
 	adapter->pci_dev = dev;
 	guid_to_luid(guid, &adapter->luid);
 
@@ -721,6 +732,7 @@ static int dxgglobal_create(void)
 	INIT_LIST_HEAD(&dxgglobal->plisthead);
 	mutex_init(&dxgglobal->plistmutex);
 	mutex_init(&dxgglobal->device_mutex);
+	mutex_init(&dxgglobal->process_adapter_mutex);
 
 	INIT_LIST_HEAD(&dxgglobal->thread_info_list_head);
 	mutex_init(&dxgglobal->thread_info_mutex);
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 2a86ba1b4b1b..1fd7b7659792 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -47,6 +47,7 @@ struct dxgprocess *dxgprocess_create(void)
 
 			hmgrtable_init(&process->handle_table, process);
 			hmgrtable_init(&process->local_handle_table, process);
+			INIT_LIST_HEAD(&process->process_adapter_list_head);
 		}
 	}
 	return process;
@@ -54,6 +55,35 @@ struct dxgprocess *dxgprocess_create(void)
 
 void dxgprocess_destroy(struct dxgprocess *process)
 {
+	int i;
+	enum hmgrentry_type t;
+	struct d3dkmthandle h;
+	void *o;
+	struct dxgprocess_adapter *entry;
+	struct dxgprocess_adapter *tmp;
+
+	/* Destroy all adapter state */
+	dxgglobal_acquire_process_adapter_lock();
+	list_for_each_entry_safe(entry, tmp,
+				 &process->process_adapter_list_head,
+				 process_adapter_list_entry) {
+		dxgprocess_adapter_destroy(entry);
+	}
+	dxgglobal_release_process_adapter_lock();
+
+	i = 0;
+	while (hmgrtable_next_entry(&process->local_handle_table,
+				    &i, &t, &h, &o)) {
+		switch (t) {
+		case HMGRENTRY_TYPE_DXGADAPTER:
+			dxgprocess_close_adapter(process, h);
+			break;
+		default:
+			pr_err("invalid entry in local handle table %d", t);
+			break;
+		}
+	}
+
 	hmgrtable_destroy(&process->handle_table);
 	hmgrtable_destroy(&process->local_handle_table);
 }
@@ -76,6 +106,142 @@ void dxgprocess_release(struct kref *refcount)
 	vfree(process);
 }
 
+struct dxgprocess_adapter *dxgprocess_get_adapter_info(struct dxgprocess
+						       *process,
+						       struct dxgadapter
+						       *adapter)
+{
+	struct dxgprocess_adapter *entry;
+
+	list_for_each_entry(entry, &process->process_adapter_list_head,
+			    process_adapter_list_entry) {
+		if (adapter == entry->adapter) {
+			pr_debug("Found process info %p", entry);
+			return entry;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * Dxgprocess takes references on dxgadapter and dxgprocess_adapter.
+ */
+int dxgprocess_open_adapter(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmthandle *h)
+{
+	int ret = 0;
+	struct dxgprocess_adapter *adapter_info;
+	struct d3dkmthandle handle;
+
+	h->v = 0;
+	adapter_info = dxgprocess_get_adapter_info(process, adapter);
+	if (adapter_info == NULL) {
+		pr_debug("creating new process adapter info\n");
+		adapter_info = dxgprocess_adapter_create(process, adapter);
+		if (adapter_info == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+	} else {
+		adapter_info->refcount++;
+	}
+
+	handle = hmgrtable_alloc_handle_safe(&process->local_handle_table,
+					     adapter, HMGRENTRY_TYPE_DXGADAPTER,
+					     true);
+	if (handle.v) {
+		*h = handle;
+	} else {
+		pr_err("failed to create adapter handle\n");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		if (adapter_info) {
+			dxgglobal_acquire_process_adapter_lock();
+			dxgprocess_adapter_release(adapter_info);
+			dxgglobal_release_process_adapter_lock();
+		}
+	}
+
+	return ret;
+}
+
+int dxgprocess_close_adapter(struct dxgprocess *process,
+			     struct d3dkmthandle handle)
+{
+	struct dxgadapter *adapter;
+	struct dxgprocess_adapter *adapter_info;
+	int ret = 0;
+
+	if (handle.v == 0)
+		return 0;
+
+	hmgrtable_lock(&process->local_handle_table, DXGLOCK_EXCL);
+	adapter = dxgprocess_get_adapter(process, handle);
+	if (adapter)
+		hmgrtable_free_handle(&process->local_handle_table,
+				      HMGRENTRY_TYPE_DXGADAPTER, handle);
+	hmgrtable_unlock(&process->local_handle_table, DXGLOCK_EXCL);
+
+	if (adapter) {
+		adapter_info = dxgprocess_get_adapter_info(process, adapter);
+		if (adapter_info) {
+			dxgglobal_acquire_process_adapter_lock();
+			dxgprocess_adapter_release(adapter_info);
+			dxgglobal_release_process_adapter_lock();
+		} else {
+			ret = -EINVAL;
+		}
+	} else {
+		pr_err("%s failed %x", __func__, handle.v);
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+struct dxgadapter *dxgprocess_get_adapter(struct dxgprocess *process,
+					  struct d3dkmthandle handle)
+{
+	struct dxgadapter *adapter;
+
+	adapter = hmgrtable_get_object_by_type(&process->local_handle_table,
+					       HMGRENTRY_TYPE_DXGADAPTER,
+					       handle);
+	if (adapter == NULL)
+		pr_err("%s failed %x\n", __func__, handle.v);
+	return adapter;
+}
+
+/*
+ * Gets the adapter object from the process handle table.
+ * The adapter object is referenced.
+ * The function acquired the handle table lock shared.
+ */
+struct dxgadapter *dxgprocess_adapter_by_handle(struct dxgprocess *process,
+						struct d3dkmthandle handle)
+{
+	struct dxgadapter *adapter;
+
+	hmgrtable_lock(&process->local_handle_table, DXGLOCK_SHARED);
+	adapter = hmgrtable_get_object_by_type(&process->local_handle_table,
+					       HMGRENTRY_TYPE_DXGADAPTER,
+					       handle);
+	if (adapter == NULL)
+		pr_err("adapter_by_handle failed %x\n", handle.v);
+	else if (kref_get_unless_zero(&adapter->adapter_kref) == 0) {
+		pr_err("failed to acquire adapter reference\n");
+		adapter = NULL;
+	}
+	hmgrtable_unlock(&process->local_handle_table, DXGLOCK_SHARED);
+	return adapter;
+}
+
 void dxgprocess_ht_lock_shared_down(struct dxgprocess *process)
 {
 	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 988372c5812e..1e7e34b45c3d 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -666,3 +666,85 @@ int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter)
 		pr_debug("err: %s %d", __func__, ret);
 	return ret;
 }
+
+int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
+				   struct dxgadapter *adapter,
+				   struct d3dkmt_queryadapterinfo *args)
+{
+	struct dxgkvmb_command_queryadapterinfo *command;
+	u32 cmd_size = sizeof(*command) + args->private_data_size - 1;
+	int ret;
+	u32 private_data_size;
+	void *private_data;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	ret = copy_from_user(command->private_data,
+			     args->private_data, args->private_data_size);
+	if (ret) {
+		pr_err("%s Faled to copy private data", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_QUERYADAPTERINFO,
+				   process->host_handle);
+	command->private_data_size = args->private_data_size;
+	command->query_type = args->type;
+
+	if (dxgglobal->vmbus_ver >= DXGK_VMBUS_INTERFACE_VERSION) {
+		private_data = msg.msg;
+		private_data_size = command->private_data_size +
+				    sizeof(struct ntstatus);
+	} else {
+		private_data = command->private_data;
+		private_data_size = command->private_data_size;
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   private_data, private_data_size);
+	if (ret < 0)
+		goto cleanup;
+
+	if (dxgglobal->vmbus_ver >= DXGK_VMBUS_INTERFACE_VERSION) {
+		ret = ntstatus2int(*(struct ntstatus *)private_data);
+		if (ret < 0)
+			goto cleanup;
+		private_data = (char *)private_data + sizeof(struct ntstatus);
+	}
+
+	switch (args->type) {
+	case _KMTQAITYPE_ADAPTERTYPE:
+	case _KMTQAITYPE_ADAPTERTYPE_RENDER:
+		{
+			struct d3dkmt_adaptertype *adapter_type =
+			    (void *)private_data;
+			adapter_type->paravirtualized = 1;
+			adapter_type->display_supported = 0;
+			adapter_type->post_device = 0;
+			adapter_type->indirect_display_device = 0;
+			adapter_type->acg_supported = 0;
+			adapter_type->support_set_timings_from_vidpn = 0;
+			break;
+		}
+	default:
+		break;
+	}
+	ret = copy_to_user(args->private_data, private_data,
+			   args->private_data_size);
+	if (ret) {
+		pr_err("%s Faled to copy private data to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index ddd7b6bf9964..1fbb89dee576 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -235,4 +235,16 @@ struct dxgkvmb_command_getinternaladapterinfo_return {
 	struct winluid			host_vgpu_luid;
 };
 
+struct dxgkvmb_command_queryadapterinfo {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	enum kmtqueryadapterinfotype	query_type;
+	u32				private_data_size;
+	u8				private_data[1];
+};
+
+struct dxgkvmb_command_queryadapterinfo_return {
+	struct ntstatus			status;
+	u8				private_data[1];
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index fa7fc321addb..62a958f6f146 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -35,6 +35,408 @@ static char *errorstr(int ret)
 	return ret < 0 ? "err" : "";
 }
 
+static int dxgk_open_adapter_from_luid(struct dxgprocess *process,
+						   void *__user inargs)
+{
+	struct d3dkmt_openadapterfromluid args;
+	int ret;
+	struct dxgadapter *entry;
+	struct dxgadapter *adapter = NULL;
+	struct d3dkmt_openadapterfromluid *__user result = inargs;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s Faled to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_SHARED);
+	dxgglobal_acquire_process_adapter_lock();
+
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (dxgadapter_acquire_lock_shared(entry) == 0) {
+			pr_debug("Compare luids: %d:%d  %d:%d",
+				    entry->luid.b, entry->luid.a,
+				    args.adapter_luid.b, args.adapter_luid.a);
+			if (*(u64 *) &entry->luid ==
+			    *(u64 *) &args.adapter_luid) {
+				ret =
+				    dxgprocess_open_adapter(process, entry,
+						    &args.adapter_handle);
+
+				if (ret >= 0) {
+					ret = copy_to_user(
+						&result->adapter_handle,
+						&args.adapter_handle,
+						sizeof(struct d3dkmthandle));
+					if (ret)
+						ret = -EINVAL;
+				}
+				adapter = entry;
+			}
+			dxgadapter_release_lock_shared(entry);
+			if (adapter)
+				break;
+		}
+	}
+
+	dxgglobal_release_process_adapter_lock();
+	dxgglobal_release_adapter_list_lock(DXGLOCK_SHARED);
+
+	if (args.adapter_handle.v == 0)
+		ret = -EINVAL;
+
+cleanup:
+
+	if (ret < 0)
+		dxgprocess_close_adapter(process, args.adapter_handle);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgkp_enum_adapters(struct dxgprocess *process,
+		    union d3dkmt_enumadapters_filter filter,
+		    u32 adapter_count_max,
+		    struct d3dkmt_adapterinfo *__user info_out,
+		    u32 * __user adapter_count_out)
+{
+	int ret = 0;
+	struct dxgadapter *entry;
+	struct d3dkmt_adapterinfo *info = NULL;
+	struct dxgadapter **adapters = NULL;
+	int adapter_count = 0;
+	int i;
+
+	pr_debug("ioctl: %s", __func__);
+	if (info_out == NULL || adapter_count_max == 0) {
+		pr_debug("buffer is NULL");
+		ret = copy_to_user(adapter_count_out,
+				   &dxgglobal->num_adapters, sizeof(u32));
+		if (ret) {
+			pr_err("%s copy_to_user faled",	__func__);
+			ret = -EINVAL;
+		}
+		goto cleanup;
+	}
+
+	if (adapter_count_max > 0xFFFF) {
+		pr_err("too many adapters");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	info = vzalloc(sizeof(struct d3dkmt_adapterinfo) * adapter_count_max);
+	if (info == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	adapters = vzalloc(sizeof(struct dxgadapter *) * adapter_count_max);
+	if (adapters == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_SHARED);
+	dxgglobal_acquire_process_adapter_lock();
+
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (dxgadapter_acquire_lock_shared(entry) == 0) {
+			struct d3dkmt_adapterinfo *inf = &info[adapter_count];
+
+			ret = dxgprocess_open_adapter(process, entry,
+						      &inf->adapter_handle);
+			if (ret >= 0) {
+				inf->adapter_luid = entry->luid;
+				adapters[adapter_count] = entry;
+				pr_debug("adapter: %x %x:%x",
+					    inf->adapter_handle.v,
+					    inf->adapter_luid.b,
+					    inf->adapter_luid.a);
+				adapter_count++;
+			}
+			dxgadapter_release_lock_shared(entry);
+		}
+		if (ret < 0)
+			break;
+	}
+
+	dxgglobal_release_process_adapter_lock();
+	dxgglobal_release_adapter_list_lock(DXGLOCK_SHARED);
+
+	if (adapter_count > adapter_count_max) {
+		ret = STATUS_BUFFER_TOO_SMALL;
+		pr_debug("Too many adapters");
+		ret = copy_to_user(adapter_count_out,
+				   &dxgglobal->num_adapters, sizeof(u32));
+		if (ret) {
+			pr_err("%s copy_to_user failed", __func__);
+			ret = -EINVAL;
+		}
+		goto cleanup;
+	}
+
+	ret = copy_to_user(adapter_count_out, &adapter_count,
+			   sizeof(adapter_count));
+	if (ret) {
+		pr_err("%s failed to copy adapter_count", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_to_user(info_out, info, sizeof(info[0]) * adapter_count);
+	if (ret) {
+		pr_err("%s failed to copy adapter info", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (ret >= 0) {
+		pr_debug("found %d adapters", adapter_count);
+		goto success;
+	}
+	if (info) {
+		for (i = 0; i < adapter_count; i++)
+			dxgprocess_close_adapter(process,
+						 info[i].adapter_handle);
+	}
+success:
+	if (info)
+		vfree(info);
+	if (adapters)
+		vfree(adapters);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_enum_adapters(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_enumadapters2 args;
+	int ret;
+	struct dxgadapter *entry;
+	struct d3dkmt_adapterinfo *info = NULL;
+	struct dxgadapter **adapters = NULL;
+	int adapter_count = 0;
+	int i;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.adapters == NULL) {
+		pr_debug("buffer is NULL");
+		args.num_adapters = dxgglobal->num_adapters;
+		ret = copy_to_user(inargs, &args, sizeof(args));
+		if (ret) {
+			pr_err("%s failed to copy args to user", __func__);
+			ret = -EINVAL;
+		}
+		goto cleanup;
+	}
+	if (args.num_adapters < dxgglobal->num_adapters) {
+		args.num_adapters = dxgglobal->num_adapters;
+		pr_debug("buffer is too small");
+		ret = -EOVERFLOW;
+		goto cleanup;
+	}
+
+	if (args.num_adapters > D3DKMT_ADAPTERS_MAX) {
+		pr_debug("too many adapters");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	info = vzalloc(sizeof(struct d3dkmt_adapterinfo) * args.num_adapters);
+	if (info == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	adapters = vzalloc(sizeof(struct dxgadapter *) * args.num_adapters);
+	if (adapters == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_SHARED);
+	dxgglobal_acquire_process_adapter_lock();
+
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (dxgadapter_acquire_lock_shared(entry) == 0) {
+			struct d3dkmt_adapterinfo *inf = &info[adapter_count];
+
+			ret = dxgprocess_open_adapter(process, entry,
+						      &inf->adapter_handle);
+			if (ret >= 0) {
+				inf->adapter_luid = entry->luid;
+				adapters[adapter_count] = entry;
+				pr_debug("adapter: %x %llx",
+					    inf->adapter_handle.v,
+					    *(u64 *) &inf->adapter_luid);
+				adapter_count++;
+			}
+			dxgadapter_release_lock_shared(entry);
+		}
+		if (ret < 0)
+			break;
+	}
+
+	dxgglobal_release_process_adapter_lock();
+	dxgglobal_release_adapter_list_lock(DXGLOCK_SHARED);
+
+	args.num_adapters = adapter_count;
+
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy args to user", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_to_user(args.adapters, info,
+			   sizeof(info[0]) * args.num_adapters);
+	if (ret) {
+		pr_err("%s failed to copy adapter info to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		if (info) {
+			for (i = 0; i < args.num_adapters; i++) {
+				dxgprocess_close_adapter(process,
+							info[i].adapter_handle);
+			}
+		}
+	} else {
+		pr_debug("found %d adapters", args.num_adapters);
+	}
+
+	if (info)
+		vfree(info);
+	if (adapters)
+		vfree(adapters);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_enum_adapters3(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_enumadapters3 args;
+	int ret;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgkp_enum_adapters(process, args.filter,
+				  args.adapter_count,
+				  args.adapters,
+				  &((struct d3dkmt_enumadapters3 *)inargs)->
+				  adapter_count);
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_close_adapter(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmthandle args;
+	int ret;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgprocess_close_adapter(process, args);
+	if (ret < 0)
+		pr_err("%s failed", __func__);
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_query_adapter_info(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_queryadapterinfo args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.private_data_size > DXG_MAX_VM_BUS_PACKET_SIZE ||
+	    args.private_data_size == 0) {
+		pr_err("invalid private data size");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	pr_debug("Type: %d Size: %x",
+		args.type, args.private_data_size);
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = dxgvmb_send_query_adapter_info(process, adapter, &args);
+
+	dxgadapter_release_lock_shared(adapter);
+
+cleanup:
+
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -91,4 +493,14 @@ long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2)
 
 void init_ioctls(void)
 {
+	SET_IOCTL(/*0x1 */ dxgk_open_adapter_from_luid,
+		  LX_DXOPENADAPTERFROMLUID);
+	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
+		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
+		  LX_DXENUMADAPTERS2);
+	SET_IOCTL(/*0x15 */ dxgk_close_adapter,
+		  LX_DXCLOSEADAPTER);
+	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
+		  LX_DXENUMADAPTERS3);
 }
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 433b59d3eb23..d00e7cc00470 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -31,6 +31,7 @@ extern const struct d3dkmthandle zerohandle;
  * table_lock
  * core_lock
  * device_lock
+ * process_adapter_mutex
  * adapter_list_lock
  * device_mutex
  */
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 7e2c520abf5f..ba7723ebd283 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -54,6 +54,109 @@ struct winluid {
 	__u32 b;
 };
 
+#define D3DKMT_ADAPTERS_MAX			64
+
+struct d3dkmt_adapterinfo {
+	struct d3dkmthandle		adapter_handle;
+	struct winluid			adapter_luid;
+	__u32				num_sources;
+	__u32				present_move_regions_preferred;
+};
+
+struct d3dkmt_enumadapters2 {
+	__u32				num_adapters;
+	__u32				reserved;
+#ifdef __KERNEL__
+	struct d3dkmt_adapterinfo	*adapters;
+#else
+	__u64				*adapters;
+#endif
+};
+
+struct d3dkmt_closeadapter {
+	struct d3dkmthandle		adapter_handle;
+};
+
+struct d3dkmt_openadapterfromluid {
+	struct winluid			adapter_luid;
+	struct d3dkmthandle		adapter_handle;
+};
+
+struct d3dkmt_adaptertype {
+	union {
+		struct {
+			__u32		render_supported:1;
+			__u32		display_supported:1;
+			__u32		software_device:1;
+			__u32		post_device:1;
+			__u32		hybrid_discrete:1;
+			__u32		hybrid_integrated:1;
+			__u32		indirect_display_device:1;
+			__u32		paravirtualized:1;
+			__u32		acg_supported:1;
+			__u32		support_set_timings_from_vidpn:1;
+			__u32		detachable:1;
+			__u32		compute_only:1;
+			__u32		prototype:1;
+			__u32		reserved:19;
+		};
+		__u32			value;
+	};
+};
+
+enum kmtqueryadapterinfotype {
+	_KMTQAITYPE_UMDRIVERPRIVATE	= 0,
+	_KMTQAITYPE_ADAPTERTYPE		= 15,
+	_KMTQAITYPE_ADAPTERTYPE_RENDER	= 57
+};
+
+struct d3dkmt_queryadapterinfo {
+	struct d3dkmthandle		adapter;
+	enum kmtqueryadapterinfotype	type;
+#ifdef __KERNEL__
+	void				*private_data;
+#else
+	__u64				private_data;
+#endif
+	__u32				private_data_size;
+};
+
+union d3dkmt_enumadapters_filter {
+	struct {
+		__u64	include_compute_only:1;
+		__u64	include_display_only:1;
+		__u64	reserved:62;
+	};
+	__u64		value;
+};
+
+struct d3dkmt_enumadapters3 {
+	union d3dkmt_enumadapters_filter	filter;
+	__u32					adapter_count;
+	__u32					reserved;
+#ifdef __KERNEL__
+	struct d3dkmt_adapterinfo		*adapters;
+#else
+	__u64					adapters;
+#endif
+};
+
+/*
+ * Dxgkrnl Graphics Port Driver ioctl definitions
+ *
+ */
+
+#define LX_DXOPENADAPTERFROMLUID	\
+	_IOWR(0x47, 0x01, struct d3dkmt_openadapterfromluid)
+#define LX_DXQUERYADAPTERINFO		\
+	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXENUMADAPTERS2		\
+	_IOWR(0x47, 0x14, struct d3dkmt_enumadapters2)
+#define LX_DXCLOSEADAPTER		\
+	_IOWR(0x47, 0x15, struct d3dkmt_closeadapter)
+#define LX_DXENUMADAPTERS3		\
+	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
+
 #define LX_IO_MAX 0x45
 
 #endif /* _D3DKMTHK_H */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 07/30] drivers: hv: dxgkrnl: Creation of dxgdevice objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (4 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects Iouri Tarassov
                     ` (23 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls for creation and destruction of dxgdevice
objects:
 - the LX_DXCREATEDEVICE ioctl
 - the LX_DXDESTROYDEVICE ioctl

A dxgdevice object represents a container of other virtual
compute device objects (allocations, sync objects, contexts,
etc.). It belongs to a dxgadapter object.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c | 187 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  58 ++++++++++
 drivers/hv/dxgkrnl/dxgprocess.c |  43 ++++++++
 drivers/hv/dxgkrnl/dxgvmbus.c   |  80 ++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  22 ++++
 drivers/hv/dxgkrnl/ioctl.c      | 134 +++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.h       |   1 +
 include/uapi/misc/d3dkmthk.h    |  82 ++++++++++++++
 8 files changed, 607 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index 2c7823713547..ad71ba65e6af 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -199,6 +199,121 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter)
 	up_read(&adapter->core_lock);
 }
 
+struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
+				   struct dxgprocess *process)
+{
+	struct dxgdevice *device = vzalloc(sizeof(struct dxgdevice));
+	int ret;
+
+	if (device) {
+		kref_init(&device->device_kref);
+		device->adapter = adapter;
+		device->process = process;
+		kref_get(&adapter->adapter_kref);
+		init_rwsem(&device->device_lock);
+		INIT_LIST_HEAD(&device->pqueue_list_head);
+		device->object_state = DXGOBJECTSTATE_CREATED;
+		device->execution_state = _D3DKMT_DEVICEEXECUTION_ACTIVE;
+
+		ret = dxgprocess_adapter_add_device(process, adapter, device);
+		if (ret < 0) {
+			kref_put(&device->device_kref, dxgdevice_release);
+			device = NULL;
+		}
+	}
+	return device;
+}
+
+void dxgdevice_stop(struct dxgdevice *device)
+{
+}
+
+void dxgdevice_mark_destroyed(struct dxgdevice *device)
+{
+	down_write(&device->device_lock);
+	device->object_state = DXGOBJECTSTATE_DESTROYED;
+	up_write(&device->device_lock);
+}
+
+void dxgdevice_destroy(struct dxgdevice *device)
+{
+	struct dxgprocess *process = device->process;
+	struct dxgadapter *adapter = device->adapter;
+	struct d3dkmthandle device_handle = {};
+
+	pr_debug("%s: %p\n", __func__, device);
+
+	down_write(&device->device_lock);
+
+	if (device->object_state != DXGOBJECTSTATE_ACTIVE)
+		goto cleanup;
+
+	device->object_state = DXGOBJECTSTATE_DESTROYED;
+
+	dxgdevice_stop(device);
+
+	/* Guest handles need to be released before the host handles */
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	if (device->handle_valid) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGDEVICE, device->handle);
+		device_handle = device->handle;
+		device->handle_valid = 0;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (device_handle.v) {
+		up_write(&device->device_lock);
+		if (dxgadapter_acquire_lock_shared(adapter) == 0) {
+			dxgvmb_send_destroy_device(adapter, process,
+						   device_handle);
+			dxgadapter_release_lock_shared(adapter);
+		}
+		down_write(&device->device_lock);
+	}
+
+cleanup:
+
+	if (device->adapter) {
+		dxgprocess_adapter_remove_device(device);
+		kref_put(&device->adapter->adapter_kref, dxgadapter_release);
+		device->adapter = NULL;
+	}
+
+	up_write(&device->device_lock);
+
+	kref_put(&device->device_kref, dxgdevice_release);
+	pr_debug("dxgdevice_destroy_end\n");
+}
+
+int dxgdevice_acquire_lock_shared(struct dxgdevice *device)
+{
+	down_read(&device->device_lock);
+	if (!dxgdevice_is_active(device)) {
+		up_read(&device->device_lock);
+		return -ENODEV;
+	}
+	return 0;
+}
+
+void dxgdevice_release_lock_shared(struct dxgdevice *device)
+{
+	up_read(&device->device_lock);
+}
+
+bool dxgdevice_is_active(struct dxgdevice *device)
+{
+	return device->object_state == DXGOBJECTSTATE_ACTIVE;
+}
+
+void dxgdevice_release(struct kref *refcount)
+{
+	struct dxgdevice *device;
+
+	device = container_of(refcount, struct dxgdevice, device_kref);
+	vfree(device);
+}
+
 struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 						     struct dxgadapter *adapter)
 {
@@ -213,6 +328,8 @@ struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 		adapter_info->adapter = adapter;
 		adapter_info->process = process;
 		adapter_info->refcount = 1;
+		mutex_init(&adapter_info->device_list_mutex);
+		INIT_LIST_HEAD(&adapter_info->device_list_head);
 		list_add_tail(&adapter_info->process_adapter_list_entry,
 			      &process->process_adapter_list_head);
 		dxgadapter_add_process(adapter, adapter_info);
@@ -226,10 +343,35 @@ struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 
 void dxgprocess_adapter_stop(struct dxgprocess_adapter *adapter_info)
 {
+	struct dxgdevice *device;
+
+	mutex_lock(&adapter_info->device_list_mutex);
+	list_for_each_entry(device, &adapter_info->device_list_head,
+			    device_list_entry) {
+		dxgdevice_stop(device);
+	}
+	mutex_unlock(&adapter_info->device_list_mutex);
 }
 
 void dxgprocess_adapter_destroy(struct dxgprocess_adapter *adapter_info)
 {
+	struct dxgdevice *device;
+
+	mutex_lock(&adapter_info->device_list_mutex);
+	while (!list_empty(&adapter_info->device_list_head)) {
+		device = list_first_entry(&adapter_info->device_list_head,
+					  struct dxgdevice, device_list_entry);
+		list_del(&device->device_list_entry);
+		device->device_list_entry.next = NULL;
+		mutex_unlock(&adapter_info->device_list_mutex);
+		dxgvmb_send_flush_device(device,
+			DXGDEVICE_FLUSHSCHEDULER_DEVICE_TERMINATE);
+		dxgdevice_destroy(device);
+		dxgdevice_destroy(device);
+		mutex_lock(&adapter_info->device_list_mutex);
+	}
+	mutex_unlock(&adapter_info->device_list_mutex);
+
 	dxgadapter_remove_process(adapter_info);
 	kref_put(&adapter_info->adapter->adapter_kref, dxgadapter_release);
 	list_del(&adapter_info->process_adapter_list_entry);
@@ -247,3 +389,48 @@ void dxgprocess_adapter_release(struct dxgprocess_adapter *adapter_info)
 	if (adapter_info->refcount == 0)
 		dxgprocess_adapter_destroy(adapter_info);
 }
+
+int dxgprocess_adapter_add_device(struct dxgprocess *process,
+				  struct dxgadapter *adapter,
+				  struct dxgdevice *device)
+{
+	struct dxgprocess_adapter *entry;
+	struct dxgprocess_adapter *adapter_info = NULL;
+	int ret = 0;
+
+	dxgglobal_acquire_process_adapter_lock();
+
+	list_for_each_entry(entry, &process->process_adapter_list_head,
+			    process_adapter_list_entry) {
+		if (entry->adapter == adapter) {
+			adapter_info = entry;
+			break;
+		}
+	}
+	if (adapter_info == NULL) {
+		pr_err("failed to find process adapter info\n");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	mutex_lock(&adapter_info->device_list_mutex);
+	list_add_tail(&device->device_list_entry,
+		      &adapter_info->device_list_head);
+	device->adapter_info = adapter_info;
+	mutex_unlock(&adapter_info->device_list_mutex);
+
+cleanup:
+
+	dxgglobal_release_process_adapter_lock();
+	return ret;
+}
+
+void dxgprocess_adapter_remove_device(struct dxgdevice *device)
+{
+	pr_debug("%s %p\n", __func__, device);
+	mutex_lock(&device->adapter_info->device_list_mutex);
+	if (device->device_list_entry.next) {
+		list_del(&device->device_list_entry);
+		device->device_list_entry.next = NULL;
+	}
+	mutex_unlock(&device->adapter_info->device_list_mutex);
+}
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index fbc15731cbd5..242b98fc20d5 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -29,6 +29,7 @@
 
 struct dxgprocess;
 struct dxgadapter;
+struct dxgdevice;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -56,6 +57,10 @@ struct dxgk_device_types {
 	u32 virtual_monitor_device:1;
 };
 
+enum dxgdevice_flushschedulerreason {
+	DXGDEVICE_FLUSHSCHEDULER_DEVICE_TERMINATE = 4,
+};
+
 enum dxgobjectstate {
 	DXGOBJECTSTATE_CREATED,
 	DXGOBJECTSTATE_ACTIVE,
@@ -152,6 +157,9 @@ struct dxgprocess_adapter {
 	struct list_head	adapter_process_list_entry;
 	/* Entry in dxgprocess::process_adapter_list_head */
 	struct list_head	process_adapter_list_entry;
+	/* List of all dxgdevice objects created for the process on adapter */
+	struct list_head	device_list_head;
+	struct mutex		device_list_mutex;
 	struct dxgadapter	*adapter;
 	struct dxgprocess	*process;
 	int			refcount;
@@ -161,6 +169,10 @@ struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 						     struct dxgadapter
 						     *adapter);
 void dxgprocess_adapter_release(struct dxgprocess_adapter *adapter);
+int dxgprocess_adapter_add_device(struct dxgprocess *process,
+					      struct dxgadapter *adapter,
+					      struct dxgdevice *device);
+void dxgprocess_adapter_remove_device(struct dxgdevice *device);
 void dxgprocess_adapter_stop(struct dxgprocess_adapter *adapter_info);
 void dxgprocess_adapter_destroy(struct dxgprocess_adapter *adapter_info);
 
@@ -209,6 +221,11 @@ struct dxgadapter *dxgprocess_get_adapter(struct dxgprocess *process,
 					  struct d3dkmthandle handle);
 struct dxgadapter *dxgprocess_adapter_by_handle(struct dxgprocess *process,
 						struct d3dkmthandle handle);
+struct dxgdevice *dxgprocess_device_by_handle(struct dxgprocess *process,
+					      struct d3dkmthandle handle);
+struct dxgdevice *dxgprocess_device_by_object_handle(struct dxgprocess *process,
+						     enum hmgrentry_type t,
+						     struct d3dkmthandle h);
 void dxgprocess_ht_lock_shared_down(struct dxgprocess *process);
 void dxgprocess_ht_lock_shared_up(struct dxgprocess *process);
 void dxgprocess_ht_lock_exclusive_down(struct dxgprocess *process);
@@ -228,6 +245,7 @@ enum dxgadapter_state {
  * This object represents the grapchis adapter.
  * Objects, which take reference on the adapter:
  * - dxgglobal
+ * - dxgdevice
  * - adapter handle (struct d3dkmthandle)
  */
 struct dxgadapter {
@@ -264,6 +282,38 @@ void dxgadapter_add_process(struct dxgadapter *adapter,
 			    struct dxgprocess_adapter *process_info);
 void dxgadapter_remove_process(struct dxgprocess_adapter *process_info);
 
+/*
+ * The object represent the device object.
+ * The following objects take reference on the device
+ * - device handle (struct d3dkmthandle)
+ */
+struct dxgdevice {
+	enum dxgobjectstate	object_state;
+	/* Device takes reference on the adapter */
+	struct dxgadapter	*adapter;
+	struct dxgprocess_adapter *adapter_info;
+	struct dxgprocess	*process;
+	/* Entry in the DGXPROCESS_ADAPTER device list */
+	struct list_head	device_list_entry;
+	struct kref		device_kref;
+	/* Protects destcruction of the device object */
+	struct rw_semaphore	device_lock;
+	/* List of paging queues. Protected by process handle table lock. */
+	struct list_head	pqueue_list_head;
+	struct d3dkmthandle	handle;
+	enum d3dkmt_deviceexecution_state execution_state;
+	u32			handle_valid;
+};
+
+struct dxgdevice *dxgdevice_create(struct dxgadapter *a, struct dxgprocess *p);
+void dxgdevice_destroy(struct dxgdevice *device);
+void dxgdevice_stop(struct dxgdevice *device);
+void dxgdevice_mark_destroyed(struct dxgdevice *device);
+int dxgdevice_acquire_lock_shared(struct dxgdevice *dev);
+void dxgdevice_release_lock_shared(struct dxgdevice *dev);
+void dxgdevice_release(struct kref *refcount);
+bool dxgdevice_is_active(struct dxgdevice *dev);
+
 void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
 long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
@@ -297,6 +347,14 @@ int dxgvmb_send_destroy_process(struct d3dkmthandle process);
 int dxgvmb_send_open_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_close_adapter(struct dxgadapter *adapter);
 int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter);
+struct d3dkmthandle dxgvmb_send_create_device(struct dxgadapter *adapter,
+					      struct dxgprocess *process,
+					      struct d3dkmt_createdevice *args);
+int dxgvmb_send_destroy_device(struct dxgadapter *adapter,
+			       struct dxgprocess *process,
+			       struct d3dkmthandle h);
+int dxgvmb_send_flush_device(struct dxgdevice *device,
+			     enum dxgdevice_flushschedulerreason reason);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 1fd7b7659792..734585689213 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -242,6 +242,49 @@ struct dxgadapter *dxgprocess_adapter_by_handle(struct dxgprocess *process,
 	return adapter;
 }
 
+struct dxgdevice *dxgprocess_device_by_object_handle(struct dxgprocess *process,
+						     enum hmgrentry_type t,
+						     struct d3dkmthandle handle)
+{
+	struct dxgdevice *device = NULL;
+	void *obj;
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
+	obj = hmgrtable_get_object_by_type(&process->handle_table, t, handle);
+	if (obj) {
+		struct d3dkmthandle device_handle = {};
+
+		switch (t) {
+		case HMGRENTRY_TYPE_DXGDEVICE:
+			device = obj;
+			break;
+		default:
+			pr_err("invalid handle type: %d\n", t);
+			break;
+		}
+		if (device == NULL)
+			device = hmgrtable_get_object_by_type(
+					&process->handle_table,
+					 HMGRENTRY_TYPE_DXGDEVICE,
+					 device_handle);
+		if (device)
+			if (kref_get_unless_zero(&device->device_kref) == 0)
+				device = NULL;
+	}
+	if (device == NULL)
+		pr_err("device_by_handle failed: %d %x\n", t, handle.v);
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_SHARED);
+	return device;
+}
+
+struct dxgdevice *dxgprocess_device_by_handle(struct dxgprocess *process,
+					      struct d3dkmthandle handle)
+{
+	return dxgprocess_device_by_object_handle(process,
+						  HMGRENTRY_TYPE_DXGDEVICE,
+						  handle);
+}
+
 void dxgprocess_ht_lock_shared_down(struct dxgprocess *process)
 {
 	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 1e7e34b45c3d..2bc2eca0e7da 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -667,6 +667,86 @@ int dxgvmb_send_get_internal_adapter_info(struct dxgadapter *adapter)
 	return ret;
 }
 
+struct d3dkmthandle dxgvmb_send_create_device(struct dxgadapter *adapter,
+					struct dxgprocess *process,
+					struct d3dkmt_createdevice *args)
+{
+	int ret;
+	struct dxgkvmb_command_createdevice *command;
+	struct dxgkvmb_command_createdevice_return result = { };
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_CREATEDEVICE,
+				   process->host_handle);
+	command->flags = args->flags;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		result.device.v = 0;
+	free_message(&msg, process);
+cleanup:
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return result.device;
+}
+
+int dxgvmb_send_destroy_device(struct dxgadapter *adapter,
+			       struct dxgprocess *process,
+			       struct d3dkmthandle h)
+{
+	int ret;
+	struct dxgkvmb_command_destroydevice *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_DESTROYDEVICE,
+				   process->host_handle);
+	command->device = h;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_flush_device(struct dxgdevice *device,
+			     enum dxgdevice_flushschedulerreason reason)
+{
+	int ret;
+	struct dxgkvmb_command_flushdevice *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+	struct dxgprocess *process = device->process;
+
+	ret = init_message(&msg, device->adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_FLUSHDEVICE,
+				   process->host_handle);
+	command->device = device->handle;
+	command->reason = reason;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 1fbb89dee576..67448bee392a 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -247,4 +247,26 @@ struct dxgkvmb_command_queryadapterinfo_return {
 	u8				private_data[1];
 };
 
+struct dxgkvmb_command_createdevice {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_createdeviceflags	flags;
+	bool				cdd_device;
+	void				*error_code;
+};
+
+struct dxgkvmb_command_createdevice_return {
+	struct d3dkmthandle		device;
+};
+
+struct dxgkvmb_command_destroydevice {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+};
+
+struct dxgkvmb_command_flushdevice {
+	struct dxgkvmb_command_vgpu_to_host	hdr;
+	struct d3dkmthandle			device;
+	enum dxgdevice_flushschedulerreason	reason;
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 62a958f6f146..a6be88b6c792 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -437,6 +437,136 @@ dxgk_query_adapter_info(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_create_device(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_createdevice args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct d3dkmthandle host_device_handle = {};
+	bool adapter_locked = false;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/* The call acquires reference on the adapter */
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgdevice_create(adapter, process);
+	if (device == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0)
+		goto cleanup;
+
+	adapter_locked = true;
+
+	host_device_handle = dxgvmb_send_create_device(adapter, process, &args);
+	if (host_device_handle.v) {
+		ret = copy_to_user(&((struct d3dkmt_createdevice *)inargs)->
+				   device, &host_device_handle,
+				   sizeof(struct d3dkmthandle));
+		if (ret) {
+			pr_err("%s failed to copy device handle", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+
+		hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+		ret = hmgrtable_assign_handle(&process->handle_table, device,
+					      HMGRENTRY_TYPE_DXGDEVICE,
+					      host_device_handle);
+		if (ret >= 0) {
+			device->handle = host_device_handle;
+			device->handle_valid = 1;
+			device->object_state = DXGOBJECTSTATE_ACTIVE;
+		}
+		hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		if (host_device_handle.v)
+			dxgvmb_send_destroy_device(adapter, process,
+						   host_device_handle);
+		if (device)
+			dxgdevice_destroy(device);
+	}
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_destroy_device(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_destroydevice args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	device = hmgrtable_get_object_by_type(&process->handle_table,
+					      HMGRENTRY_TYPE_DXGDEVICE,
+					      args.device);
+	if (device) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGDEVICE, args.device);
+		device->handle_valid = 0;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (device == NULL) {
+		pr_err("invalid device handle: %x", args.device.v);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+
+	dxgdevice_destroy(device);
+
+	if (dxgadapter_acquire_lock_shared(adapter) == 0) {
+		dxgvmb_send_destroy_device(adapter, process, args.device);
+		dxgadapter_release_lock_shared(adapter);
+	}
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -495,12 +625,16 @@ void init_ioctls(void)
 {
 	SET_IOCTL(/*0x1 */ dxgk_open_adapter_from_luid,
 		  LX_DXOPENADAPTERFROMLUID);
+	SET_IOCTL(/*0x2 */ dxgk_create_device,
+		  LX_DXCREATEDEVICE);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
 		  LX_DXENUMADAPTERS2);
 	SET_IOCTL(/*0x15 */ dxgk_close_adapter,
 		  LX_DXCLOSEADAPTER);
+	SET_IOCTL(/*0x19 */ dxgk_destroy_device,
+		  LX_DXDESTROYDEVICE);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
 }
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index d00e7cc00470..8948f48ec9dc 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -25,6 +25,7 @@ extern const struct d3dkmthandle zerohandle;
  * The higher enum value, the higher is the lock order.
  * When a lower lock ois held, the higher lock should not be acquired.
  *
+ * device_list_mutex
  * channel_lock
  * fd_mutex
  * plistmutex
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index ba7723ebd283..f303b52be8bc 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -82,6 +82,74 @@ struct d3dkmt_openadapterfromluid {
 	struct d3dkmthandle		adapter_handle;
 };
 
+struct d3dddi_allocationlist {
+	struct d3dkmthandle		allocation;
+	union {
+		struct {
+			__u32		write_operation		:1;
+			__u32		do_not_retire_instance	:1;
+			__u32		offer_priority		:3;
+			__u32		reserved		:27;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dddi_patchlocationlist {
+	__u32				allocation_index;
+	union {
+		struct {
+			__u32		slot_id:24;
+			__u32		reserved:8;
+		};
+		__u32			value;
+	};
+	__u32				driver_id;
+	__u32				allocation_offset;
+	__u32				patch_offset;
+	__u32				split_offset;
+};
+
+struct d3dkmt_createdeviceflags {
+	__u32				legacy_mode:1;
+	__u32				request_vSync:1;
+	__u32				disable_gpu_timeout:1;
+	__u32				gdi_device:1;
+	__u32				reserved:28;
+};
+
+struct d3dkmt_createdevice {
+	struct d3dkmthandle		adapter;
+	__u32				reserved3;
+	struct d3dkmt_createdeviceflags	flags;
+	struct d3dkmthandle		device;
+#ifdef __KERNEL__
+	void				*command_buffer;
+#else
+	__u64				command_buffer;
+#endif
+	__u32				command_buffer_size;
+	__u32				reserved;
+#ifdef __KERNEL__
+	struct d3dddi_allocationlist	*allocation_list;
+#else
+	__u64				allocation_list;
+#endif
+	__u32				allocation_list_size;
+	__u32				reserved1;
+#ifdef __KERNEL__
+	struct d3dddi_patchlocationlist	*patch_location_list;
+#else
+	__u64				patch_location_list;
+#endif
+	__u32				patch_location_list_size;
+	__u32				reserved2;
+};
+
+struct d3dkmt_destroydevice {
+	struct d3dkmthandle		device;
+};
+
 struct d3dkmt_adaptertype {
 	union {
 		struct {
@@ -121,6 +189,16 @@ struct d3dkmt_queryadapterinfo {
 	__u32				private_data_size;
 };
 
+enum d3dkmt_deviceexecution_state {
+	_D3DKMT_DEVICEEXECUTION_ACTIVE			= 1,
+	_D3DKMT_DEVICEEXECUTION_RESET			= 2,
+	_D3DKMT_DEVICEEXECUTION_HUNG			= 3,
+	_D3DKMT_DEVICEEXECUTION_STOPPED			= 4,
+	_D3DKMT_DEVICEEXECUTION_ERROR_OUTOFMEMORY	= 5,
+	_D3DKMT_DEVICEEXECUTION_ERROR_DMAFAULT		= 6,
+	_D3DKMT_DEVICEEXECUTION_ERROR_DMAPAGEFAULT	= 7,
+};
+
 union d3dkmt_enumadapters_filter {
 	struct {
 		__u64	include_compute_only:1;
@@ -148,12 +226,16 @@ struct d3dkmt_enumadapters3 {
 
 #define LX_DXOPENADAPTERFROMLUID	\
 	_IOWR(0x47, 0x01, struct d3dkmt_openadapterfromluid)
+#define LX_DXCREATEDEVICE		\
+	_IOWR(0x47, 0x02, struct d3dkmt_createdevice)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXENUMADAPTERS2		\
 	_IOWR(0x47, 0x14, struct d3dkmt_enumadapters2)
 #define LX_DXCLOSEADAPTER		\
 	_IOWR(0x47, 0x15, struct d3dkmt_closeadapter)
+#define LX_DXDESTROYDEVICE		\
+	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (5 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 07/30] drivers: hv: dxgkrnl: Creation of dxgdevice objects Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-02 12:59     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 09/30] drivers: hv: dxgkrnl: Creation of compute device allocations and resources Iouri Tarassov
                     ` (22 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls for creation/destruction of dxgcontext
objects:
  - the LX_DXCREATECONTEXTVIRTUAL ioctl
  - the LX_DXDESTROYCONTEXT ioctl.

A dxgcontext object represents a compute device execution thread.
Ccompute device DMA buffers and synchronization operations are
submitted for execution to a dxgcontext. dxgcontexts objects
belong to a dxgdevice object.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c | 102 +++++++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  38 +++++++
 drivers/hv/dxgkrnl/dxgprocess.c |   4 +
 drivers/hv/dxgkrnl/dxgvmbus.c   | 102 ++++++++++++++++++-
 drivers/hv/dxgkrnl/dxgvmbus.h   |  18 ++++
 drivers/hv/dxgkrnl/ioctl.c      | 172 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.h       |   1 +
 include/uapi/misc/d3dkmthk.h    |  47 +++++++++
 8 files changed, 483 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index ad71ba65e6af..52ed325792fa 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -210,7 +210,9 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 		device->adapter = adapter;
 		device->process = process;
 		kref_get(&adapter->adapter_kref);
+		INIT_LIST_HEAD(&device->context_list_head);
 		init_rwsem(&device->device_lock);
+		init_rwsem(&device->context_list_lock);
 		INIT_LIST_HEAD(&device->pqueue_list_head);
 		device->object_state = DXGOBJECTSTATE_CREATED;
 		device->execution_state = _D3DKMT_DEVICEEXECUTION_ACTIVE;
@@ -252,6 +254,20 @@ void dxgdevice_destroy(struct dxgdevice *device)
 
 	dxgdevice_stop(device);
 
+	{
+		struct dxgcontext *context;
+		struct dxgcontext *tmp;
+
+		pr_debug("destroying contexts\n");
+		dxgdevice_acquire_context_list_lock(device);
+		list_for_each_entry_safe(context, tmp,
+					 &device->context_list_head,
+					 context_list_entry) {
+			dxgcontext_destroy(process, context);
+		}
+		dxgdevice_release_context_list_lock(device);
+	}
+
 	/* Guest handles need to be released before the host handles */
 	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
 	if (device->handle_valid) {
@@ -306,6 +322,32 @@ bool dxgdevice_is_active(struct dxgdevice *device)
 	return device->object_state == DXGOBJECTSTATE_ACTIVE;
 }
 
+void dxgdevice_acquire_context_list_lock(struct dxgdevice *device)
+{
+	down_write(&device->context_list_lock);
+}
+
+void dxgdevice_release_context_list_lock(struct dxgdevice *device)
+{
+	up_write(&device->context_list_lock);
+}
+
+void dxgdevice_add_context(struct dxgdevice *device, struct dxgcontext *context)
+{
+	down_write(&device->context_list_lock);
+	list_add_tail(&context->context_list_entry, &device->context_list_head);
+	up_write(&device->context_list_lock);
+}
+
+void dxgdevice_remove_context(struct dxgdevice *device,
+			      struct dxgcontext *context)
+{
+	if (context->context_list_entry.next) {
+		list_del(&context->context_list_entry);
+		context->context_list_entry.next = NULL;
+	}
+}
+
 void dxgdevice_release(struct kref *refcount)
 {
 	struct dxgdevice *device;
@@ -314,6 +356,66 @@ void dxgdevice_release(struct kref *refcount)
 	vfree(device);
 }
 
+struct dxgcontext *dxgcontext_create(struct dxgdevice *device)
+{
+	struct dxgcontext *context = vzalloc(sizeof(struct dxgcontext));
+
+	if (context) {
+		kref_init(&context->context_kref);
+		context->device = device;
+		context->process = device->process;
+		context->device_handle = device->handle;
+		kref_get(&device->device_kref);
+		INIT_LIST_HEAD(&context->hwqueue_list_head);
+		init_rwsem(&context->hwqueue_list_lock);
+		dxgdevice_add_context(device, context);
+		context->object_state = DXGOBJECTSTATE_ACTIVE;
+	}
+	return context;
+}
+
+/*
+ * Called when the device context list lock is held
+ */
+void dxgcontext_destroy(struct dxgprocess *process, struct dxgcontext *context)
+{
+	pr_debug("%s %p\n", __func__, context);
+	context->object_state = DXGOBJECTSTATE_DESTROYED;
+	if (context->device) {
+		if (context->handle.v) {
+			hmgrtable_free_handle_safe(&process->handle_table,
+						   HMGRENTRY_TYPE_DXGCONTEXT,
+						   context->handle);
+		}
+		dxgdevice_remove_context(context->device, context);
+		kref_put(&context->device->device_kref, dxgdevice_release);
+	}
+	kref_put(&context->context_kref, dxgcontext_release);
+}
+
+void dxgcontext_destroy_safe(struct dxgprocess *process,
+			     struct dxgcontext *context)
+{
+	struct dxgdevice *device = context->device;
+
+	dxgdevice_acquire_context_list_lock(device);
+	dxgcontext_destroy(process, context);
+	dxgdevice_release_context_list_lock(device);
+}
+
+bool dxgcontext_is_active(struct dxgcontext *context)
+{
+	return context->object_state == DXGOBJECTSTATE_ACTIVE;
+}
+
+void dxgcontext_release(struct kref *refcount)
+{
+	struct dxgcontext *context;
+
+	context = container_of(refcount, struct dxgcontext, context_kref);
+	vfree(context);
+}
+
 struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 						     struct dxgadapter *adapter)
 {
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 242b98fc20d5..1ac2d4a64dc1 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -30,6 +30,7 @@
 struct dxgprocess;
 struct dxgadapter;
 struct dxgdevice;
+struct dxgcontext;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -285,6 +286,7 @@ void dxgadapter_remove_process(struct dxgprocess_adapter *process_info);
 /*
  * The object represent the device object.
  * The following objects take reference on the device
+ * - dxgcontext
  * - device handle (struct d3dkmthandle)
  */
 struct dxgdevice {
@@ -298,6 +300,8 @@ struct dxgdevice {
 	struct kref		device_kref;
 	/* Protects destcruction of the device object */
 	struct rw_semaphore	device_lock;
+	struct rw_semaphore	context_list_lock;
+	struct list_head	context_list_head;
 	/* List of paging queues. Protected by process handle table lock. */
 	struct list_head	pqueue_list_head;
 	struct d3dkmthandle	handle;
@@ -312,7 +316,33 @@ void dxgdevice_mark_destroyed(struct dxgdevice *device);
 int dxgdevice_acquire_lock_shared(struct dxgdevice *dev);
 void dxgdevice_release_lock_shared(struct dxgdevice *dev);
 void dxgdevice_release(struct kref *refcount);
+void dxgdevice_add_context(struct dxgdevice *dev, struct dxgcontext *ctx);
+void dxgdevice_remove_context(struct dxgdevice *dev, struct dxgcontext *ctx);
 bool dxgdevice_is_active(struct dxgdevice *dev);
+void dxgdevice_acquire_context_list_lock(struct dxgdevice *dev);
+void dxgdevice_release_context_list_lock(struct dxgdevice *dev);
+
+/*
+ * The object represent the execution context of a device.
+ */
+struct dxgcontext {
+	enum dxgobjectstate	object_state;
+	struct dxgdevice	*device;
+	struct dxgprocess	*process;
+	/* entry in the device context list */
+	struct list_head	context_list_entry;
+	struct list_head	hwqueue_list_head;
+	struct rw_semaphore	hwqueue_list_lock;
+	struct kref		context_kref;
+	struct d3dkmthandle	handle;
+	struct d3dkmthandle	device_handle;
+};
+
+struct dxgcontext *dxgcontext_create(struct dxgdevice *dev);
+void dxgcontext_destroy(struct dxgprocess *pr, struct dxgcontext *ctx);
+void dxgcontext_destroy_safe(struct dxgprocess *pr, struct dxgcontext *ctx);
+void dxgcontext_release(struct kref *refcount);
+bool dxgcontext_is_active(struct dxgcontext *ctx);
 
 void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
@@ -355,6 +385,14 @@ int dxgvmb_send_destroy_device(struct dxgadapter *adapter,
 			       struct d3dkmthandle h);
 int dxgvmb_send_flush_device(struct dxgdevice *device,
 			     enum dxgdevice_flushschedulerreason reason);
+struct d3dkmthandle
+dxgvmb_send_create_context(struct dxgadapter *adapter,
+			   struct dxgprocess *process,
+			   struct d3dkmt_createcontextvirtual
+			   *args);
+int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
+				struct dxgprocess *process,
+				struct d3dkmthandle h);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 734585689213..1e6500e12a22 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -258,6 +258,10 @@ struct dxgdevice *dxgprocess_device_by_object_handle(struct dxgprocess *process,
 		case HMGRENTRY_TYPE_DXGDEVICE:
 			device = obj;
 			break;
+		case HMGRENTRY_TYPE_DXGCONTEXT:
+			device_handle =
+			    ((struct dxgcontext *)obj)->device_handle;
+			break;
 		default:
 			pr_err("invalid handle type: %d\n", t);
 			break;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 2bc2eca0e7da..e85c1d9abc47 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -725,7 +725,7 @@ int dxgvmb_send_flush_device(struct dxgdevice *device,
 			     enum dxgdevice_flushschedulerreason reason)
 {
 	int ret;
-	struct dxgkvmb_command_flushdevice *command;
+	struct dxgkvmb_command_flushdevice *command = NULL;
 	struct dxgvmbusmsg msg = {.hdr = NULL};
 	struct dxgprocess *process = device->process;
 
@@ -739,6 +739,106 @@ int dxgvmb_send_flush_device(struct dxgdevice *device,
 	command->device = device->handle;
 	command->reason = reason;
 
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+struct d3dkmthandle
+dxgvmb_send_create_context(struct dxgadapter *adapter,
+			   struct dxgprocess *process,
+			   struct d3dkmt_createcontextvirtual *args)
+{
+	struct dxgkvmb_command_createcontextvirtual *command = NULL;
+	u32 cmd_size;
+	int ret;
+	struct d3dkmthandle context = {};
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->priv_drv_data_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("PrivateDriverDataSize is invalid");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	cmd_size = sizeof(struct dxgkvmb_command_createcontextvirtual) +
+	    args->priv_drv_data_size - 1;
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CREATECONTEXTVIRTUAL,
+				   process->host_handle);
+	command->device = args->device;
+	command->node_ordinal = args->node_ordinal;
+	command->engine_affinity = args->engine_affinity;
+	command->flags = args->flags;
+	command->client_hint = args->client_hint;
+	command->priv_drv_data_size = args->priv_drv_data_size;
+	if (args->priv_drv_data_size) {
+		ret = copy_from_user(command->priv_drv_data,
+				     args->priv_drv_data,
+				     args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s Faled to copy private data",
+				__func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+	/* Input command is returned back as output */
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   command, cmd_size);
+	if (ret < 0) {
+		goto cleanup;
+	} else {
+		context = command->context;
+		if (args->priv_drv_data_size) {
+			ret = copy_to_user(args->priv_drv_data,
+					   command->priv_drv_data,
+					   args->priv_drv_data_size);
+			if (ret) {
+				pr_err("%s Faled to copy private data to user",
+					__func__);
+				ret = -EINVAL;
+				dxgvmb_send_destroy_context(adapter, process,
+							    context);
+				context.v = 0;
+			}
+		}
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return context;
+}
+
+int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
+				struct dxgprocess *process,
+				struct d3dkmthandle h)
+{
+	int ret;
+	struct dxgkvmb_command_destroycontext *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_DESTROYCONTEXT,
+				   process->host_handle);
+	command->context = h;
+
 	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
 cleanup:
 	free_message(&msg, process);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 67448bee392a..adaccc464e21 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -269,4 +269,22 @@ struct dxgkvmb_command_flushdevice {
 	enum dxgdevice_flushschedulerreason	reason;
 };
 
+struct dxgkvmb_command_createcontextvirtual {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		context;
+	struct d3dkmthandle		device;
+	u32				node_ordinal;
+	u32				engine_affinity;
+	struct d3dddi_createcontextflags flags;
+	enum d3dkmt_clienthint		client_hint;
+	u32				priv_drv_data_size;
+	u8				priv_drv_data[1];
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_destroycontext {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle	context;
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index a6be88b6c792..879fb3c6b7b2 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -567,6 +567,174 @@ dxgk_destroy_device(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_create_context_virtual(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_createcontextvirtual args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct dxgcontext *context = NULL;
+	struct d3dkmthandle host_context_handle = {};
+	bool device_lock_acquired = false;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0)
+		goto cleanup;
+
+	device_lock_acquired = true;
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	context = dxgcontext_create(device);
+	if (context == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	host_context_handle = dxgvmb_send_create_context(adapter,
+							 process, &args);
+	if (host_context_handle.v) {
+		hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+		ret = hmgrtable_assign_handle(&process->handle_table, context,
+					      HMGRENTRY_TYPE_DXGCONTEXT,
+					      host_context_handle);
+		if (ret >= 0)
+			context->handle = host_context_handle;
+		hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+		if (ret < 0)
+			goto cleanup;
+		ret = copy_to_user(&((struct d3dkmt_createcontextvirtual *)
+				   inargs)->context, &host_context_handle,
+				   sizeof(struct d3dkmthandle));
+		if (ret) {
+			pr_err("%s failed to copy context handle", __func__);
+			ret = -EINVAL;
+		}
+	} else {
+		pr_err("invalid host handle");
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		if (host_context_handle.v) {
+			dxgvmb_send_destroy_context(adapter, process,
+						    host_context_handle);
+		}
+		if (context)
+			dxgcontext_destroy_safe(process, context);
+	}
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device) {
+		if (device_lock_acquired)
+			dxgdevice_release_lock_shared(device);
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_destroy_context(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_destroycontext args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgcontext *context = NULL;
+	struct dxgdevice *device = NULL;
+	struct d3dkmthandle device_handle = {};
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	context = hmgrtable_get_object_by_type(&process->handle_table,
+					       HMGRENTRY_TYPE_DXGCONTEXT,
+					       args.context);
+	if (context) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGCONTEXT, args.context);
+		context->handle.v = 0;
+		device_handle = context->device_handle;
+		context->object_state = DXGOBJECTSTATE_DESTROYED;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (context == NULL) {
+		pr_err("invalid context handle: %x", args.context.v);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, device_handle);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_destroy_context(adapter, process, args.context);
+
+	dxgcontext_destroy_safe(process, context);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -627,6 +795,10 @@ void init_ioctls(void)
 		  LX_DXOPENADAPTERFROMLUID);
 	SET_IOCTL(/*0x2 */ dxgk_create_device,
 		  LX_DXCREATEDEVICE);
+	SET_IOCTL(/*0x4 */ dxgk_create_context_virtual,
+		  LX_DXCREATECONTEXTVIRTUAL);
+	SET_IOCTL(/*0x5 */ dxgk_destroy_context,
+		  LX_DXDESTROYCONTEXT);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 8948f48ec9dc..8f7b37049308 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -30,6 +30,7 @@ extern const struct d3dkmthandle zerohandle;
  * fd_mutex
  * plistmutex
  * table_lock
+ * context_list_lock
  * core_lock
  * device_lock
  * process_adapter_mutex
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index f303b52be8bc..b17ef6e30ece 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -150,6 +150,49 @@ struct d3dkmt_destroydevice {
 	struct d3dkmthandle		device;
 };
 
+enum d3dkmt_clienthint {
+	_D3DKMT_CLIENTHNT_UNKNOWN	= 0,
+	_D3DKMT_CLIENTHINT_OPENGL	= 1,
+	_D3DKMT_CLIENTHINT_CDD		= 2,
+	_D3DKMT_CLIENTHINT_DX7		= 7,
+	_D3DKMT_CLIENTHINT_DX8		= 8,
+	_D3DKMT_CLIENTHINT_DX9		= 9,
+	_D3DKMT_CLIENTHINT_DX10		= 10,
+};
+
+struct d3dddi_createcontextflags {
+	union {
+		struct {
+			__u32		null_rendering:1;
+			__u32		initial_data:1;
+			__u32		disable_gpu_timeout:1;
+			__u32		synchronization_only:1;
+			__u32		hw_queue_supported:1;
+			__u32		reserved:27;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_destroycontext {
+	struct d3dkmthandle		context;
+};
+
+struct d3dkmt_createcontextvirtual {
+	struct d3dkmthandle		device;
+	__u32				node_ordinal;
+	__u32				engine_affinity;
+	struct d3dddi_createcontextflags flags;
+#ifdef __KERNEL__
+	void				*priv_drv_data;
+#else
+	__u64				priv_drv_data;
+#endif
+	__u32				priv_drv_data_size;
+	enum d3dkmt_clienthint		client_hint;
+	struct d3dkmthandle		context;
+};
+
 struct d3dkmt_adaptertype {
 	union {
 		struct {
@@ -228,6 +271,10 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x01, struct d3dkmt_openadapterfromluid)
 #define LX_DXCREATEDEVICE		\
 	_IOWR(0x47, 0x02, struct d3dkmt_createdevice)
+#define LX_DXCREATECONTEXTVIRTUAL	\
+	_IOWR(0x47, 0x04, struct d3dkmt_createcontextvirtual)
+#define LX_DXDESTROYCONTEXT		\
+	_IOWR(0x47, 0x05, struct d3dkmt_destroycontext)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXENUMADAPTERS2		\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 09/30] drivers: hv: dxgkrnl: Creation of compute device allocations and resources
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (6 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects Iouri Tarassov
                     ` (21 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implemented ioctls to create and destroy virtual compute device
allocations (dxgallocation) and resources (dxgresource):
  - the LX_DXCREATEALLOCATION ioctl,
  - the LX_DXDESTROYALLOCATION2 ioctl.

Compute device allocations (dxgallocation objects) represent memory
allocation, which could be accessible by the device. Allocations can
be created around existing system memory (provided by an application)
or memory, allocated by dxgkrnl on the host.

Compute device resources (dxgresource objects) represent containers of
compute device allocations. Allocations could be dynamically added,
removed from a resource.

Each allocation/resource has associated driver private data, which
is provided during creation.

Each created resource or allocation have a handle (d3dkmthandle),
which is used to reference the corresponding object in other ioctls.

A dxgallocation can be resident (meaning that it is accessible by
the compute device) or evicted. When an allocation is evicted,
its content is stored in the backing store in system memory.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c | 274 ++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    | 109 ++++++
 drivers/hv/dxgkrnl/dxgmodule.c  |   1 +
 drivers/hv/dxgkrnl/dxgvmbus.c   | 645 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   | 123 ++++++
 drivers/hv/dxgkrnl/ioctl.c      | 637 +++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.h       |   3 +
 include/uapi/misc/d3dkmthk.h    | 204 ++++++++++
 8 files changed, 1996 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index 52ed325792fa..01dada36a463 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -211,8 +211,11 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 		device->process = process;
 		kref_get(&adapter->adapter_kref);
 		INIT_LIST_HEAD(&device->context_list_head);
+		INIT_LIST_HEAD(&device->alloc_list_head);
+		INIT_LIST_HEAD(&device->resource_list_head);
 		init_rwsem(&device->device_lock);
 		init_rwsem(&device->context_list_lock);
+		init_rwsem(&device->alloc_list_lock);
 		INIT_LIST_HEAD(&device->pqueue_list_head);
 		device->object_state = DXGOBJECTSTATE_CREATED;
 		device->execution_state = _D3DKMT_DEVICEEXECUTION_ACTIVE;
@@ -228,6 +231,16 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 
 void dxgdevice_stop(struct dxgdevice *device)
 {
+	struct dxgallocation *alloc;
+
+	pr_debug("%s: %p", __func__, device);
+	dxgdevice_acquire_alloc_list_lock(device);
+	list_for_each_entry(alloc, &device->alloc_list_head, alloc_list_entry) {
+		dxgallocation_stop(alloc);
+	}
+	dxgdevice_release_alloc_list_lock(device);
+
+	pr_debug("%s: end %p\n", __func__, device);
 }
 
 void dxgdevice_mark_destroyed(struct dxgdevice *device)
@@ -254,6 +267,33 @@ void dxgdevice_destroy(struct dxgdevice *device)
 
 	dxgdevice_stop(device);
 
+	dxgdevice_acquire_alloc_list_lock(device);
+
+	{
+		struct dxgallocation *alloc;
+		struct dxgallocation *tmp;
+
+		pr_debug("destroying allocations\n");
+		list_for_each_entry_safe(alloc, tmp, &device->alloc_list_head,
+					 alloc_list_entry) {
+			dxgallocation_destroy(alloc);
+		}
+	}
+
+	{
+		struct dxgresource *resource;
+		struct dxgresource *tmp;
+
+		pr_debug("destroying resources\n");
+		list_for_each_entry_safe(resource, tmp,
+					 &device->resource_list_head,
+					 resource_list_entry) {
+			dxgresource_destroy(resource);
+		}
+	}
+
+	dxgdevice_release_alloc_list_lock(device);
+
 	{
 		struct dxgcontext *context;
 		struct dxgcontext *tmp;
@@ -332,6 +372,26 @@ void dxgdevice_release_context_list_lock(struct dxgdevice *device)
 	up_write(&device->context_list_lock);
 }
 
+void dxgdevice_acquire_alloc_list_lock(struct dxgdevice *device)
+{
+	down_write(&device->alloc_list_lock);
+}
+
+void dxgdevice_release_alloc_list_lock(struct dxgdevice *device)
+{
+	up_write(&device->alloc_list_lock);
+}
+
+void dxgdevice_acquire_alloc_list_lock_shared(struct dxgdevice *device)
+{
+	down_read(&device->alloc_list_lock);
+}
+
+void dxgdevice_release_alloc_list_lock_shared(struct dxgdevice *device)
+{
+	up_read(&device->alloc_list_lock);
+}
+
 void dxgdevice_add_context(struct dxgdevice *device, struct dxgcontext *context)
 {
 	down_write(&device->context_list_lock);
@@ -348,6 +408,160 @@ void dxgdevice_remove_context(struct dxgdevice *device,
 	}
 }
 
+void dxgdevice_add_alloc(struct dxgdevice *device, struct dxgallocation *alloc)
+{
+	dxgdevice_acquire_alloc_list_lock(device);
+	list_add_tail(&alloc->alloc_list_entry, &device->alloc_list_head);
+	kref_get(&device->device_kref);
+	alloc->owner.device = device;
+	dxgdevice_release_alloc_list_lock(device);
+}
+
+void dxgdevice_remove_alloc(struct dxgdevice *device,
+			    struct dxgallocation *alloc)
+{
+	if (alloc->alloc_list_entry.next) {
+		list_del(&alloc->alloc_list_entry);
+		alloc->alloc_list_entry.next = NULL;
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+}
+
+void dxgdevice_remove_alloc_safe(struct dxgdevice *device,
+				 struct dxgallocation *alloc)
+{
+	dxgdevice_acquire_alloc_list_lock(device);
+	dxgdevice_remove_alloc(device, alloc);
+	dxgdevice_release_alloc_list_lock(device);
+}
+
+void dxgdevice_add_resource(struct dxgdevice *device, struct dxgresource *res)
+{
+	dxgdevice_acquire_alloc_list_lock(device);
+	list_add_tail(&res->resource_list_entry, &device->resource_list_head);
+	kref_get(&device->device_kref);
+	dxgdevice_release_alloc_list_lock(device);
+}
+
+void dxgdevice_remove_resource(struct dxgdevice *device,
+			       struct dxgresource *res)
+{
+	if (res->resource_list_entry.next) {
+		list_del(&res->resource_list_entry);
+		res->resource_list_entry.next = NULL;
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+}
+
+struct dxgresource *dxgresource_create(struct dxgdevice *device)
+{
+	struct dxgresource *resource = vzalloc(sizeof(struct dxgresource));
+
+	if (resource) {
+		kref_init(&resource->resource_kref);
+		resource->device = device;
+		resource->process = device->process;
+		resource->object_state = DXGOBJECTSTATE_ACTIVE;
+		mutex_init(&resource->resource_mutex);
+		INIT_LIST_HEAD(&resource->alloc_list_head);
+		dxgdevice_add_resource(device, resource);
+	}
+	return resource;
+}
+
+void dxgresource_free_handle(struct dxgresource *resource)
+{
+	struct dxgallocation *alloc;
+	struct dxgprocess *process;
+
+	if (resource->handle_valid) {
+		process = resource->device->process;
+		hmgrtable_free_handle_safe(&process->handle_table,
+					   HMGRENTRY_TYPE_DXGRESOURCE,
+					   resource->handle);
+		resource->handle_valid = 0;
+	}
+	list_for_each_entry(alloc, &resource->alloc_list_head,
+			    alloc_list_entry) {
+		dxgallocation_free_handle(alloc);
+	}
+}
+
+void dxgresource_destroy(struct dxgresource *resource)
+{
+	/* device->alloc_list_lock is held */
+	struct dxgallocation *alloc;
+	struct dxgallocation *tmp;
+	struct d3dkmt_destroyallocation2 args = { };
+	int destroyed = test_and_set_bit(0, &resource->flags);
+	struct dxgdevice *device = resource->device;
+
+	if (!destroyed) {
+		dxgresource_free_handle(resource);
+		if (resource->handle.v) {
+			args.device = device->handle;
+			args.resource = resource->handle;
+			dxgvmb_send_destroy_allocation(device->process,
+						       device, &args, NULL);
+			resource->handle.v = 0;
+		}
+		list_for_each_entry_safe(alloc, tmp, &resource->alloc_list_head,
+					 alloc_list_entry) {
+			dxgallocation_destroy(alloc);
+		}
+		dxgdevice_remove_resource(device, resource);
+	}
+	kref_put(&resource->resource_kref, dxgresource_release);
+}
+
+void dxgresource_release(struct kref *refcount)
+{
+	struct dxgresource *resource;
+
+	resource = container_of(refcount, struct dxgresource, resource_kref);
+	vfree(resource);
+}
+
+bool dxgresource_is_active(struct dxgresource *resource)
+{
+	return resource->object_state == DXGOBJECTSTATE_ACTIVE;
+}
+
+int dxgresource_add_alloc(struct dxgresource *resource,
+				      struct dxgallocation *alloc)
+{
+	int ret = -ENODEV;
+	struct dxgdevice *device = resource->device;
+
+	dxgdevice_acquire_alloc_list_lock(device);
+	if (dxgresource_is_active(resource)) {
+		list_add_tail(&alloc->alloc_list_entry,
+			      &resource->alloc_list_head);
+		alloc->owner.resource = resource;
+		ret = 0;
+	}
+	alloc->resource_owner = 1;
+	dxgdevice_release_alloc_list_lock(device);
+	return ret;
+}
+
+void dxgresource_remove_alloc(struct dxgresource *resource,
+			      struct dxgallocation *alloc)
+{
+	if (alloc->alloc_list_entry.next) {
+		list_del(&alloc->alloc_list_entry);
+		alloc->alloc_list_entry.next = NULL;
+	}
+}
+
+void dxgresource_remove_alloc_safe(struct dxgresource *resource,
+				   struct dxgallocation *alloc)
+{
+	dxgdevice_acquire_alloc_list_lock(resource->device);
+	dxgresource_remove_alloc(resource, alloc);
+	dxgdevice_release_alloc_list_lock(resource->device);
+}
+
 void dxgdevice_release(struct kref *refcount)
 {
 	struct dxgdevice *device;
@@ -416,6 +630,66 @@ void dxgcontext_release(struct kref *refcount)
 	vfree(context);
 }
 
+struct dxgallocation *dxgallocation_create(struct dxgprocess *process)
+{
+	struct dxgallocation *alloc = vzalloc(sizeof(struct dxgallocation));
+
+	if (alloc)
+		alloc->process = process;
+	return alloc;
+}
+
+void dxgallocation_stop(struct dxgallocation *alloc)
+{
+	if (alloc->pages) {
+		release_pages(alloc->pages, alloc->num_pages);
+		vfree(alloc->pages);
+		alloc->pages = NULL;
+	}
+}
+
+void dxgallocation_free_handle(struct dxgallocation *alloc)
+{
+	dxgprocess_ht_lock_exclusive_down(alloc->process);
+	if (alloc->handle_valid) {
+		hmgrtable_free_handle(&alloc->process->handle_table,
+				      HMGRENTRY_TYPE_DXGALLOCATION,
+				      alloc->alloc_handle);
+		alloc->handle_valid = 0;
+	}
+	dxgprocess_ht_lock_exclusive_up(alloc->process);
+}
+
+void dxgallocation_destroy(struct dxgallocation *alloc)
+{
+	struct dxgprocess *process = alloc->process;
+	struct d3dkmt_destroyallocation2 args = { };
+
+	dxgallocation_stop(alloc);
+	if (alloc->resource_owner)
+		dxgresource_remove_alloc(alloc->owner.resource, alloc);
+	else if (alloc->owner.device)
+		dxgdevice_remove_alloc(alloc->owner.device, alloc);
+	dxgallocation_free_handle(alloc);
+	if (alloc->alloc_handle.v && !alloc->resource_owner) {
+		args.device = alloc->owner.device->handle;
+		args.alloc_count = 1;
+		dxgvmb_send_destroy_allocation(process,
+					       alloc->owner.device,
+					       &args, &alloc->alloc_handle);
+	}
+	if (alloc->gpadl.gpadl_handle) {
+		pr_debug("Teardown gpadl %d",
+			alloc->gpadl.gpadl_handle);
+		vmbus_teardown_gpadl(dxgglobal_get_vmbus(), &alloc->gpadl);
+		pr_debug("Teardown gpadl end");
+		alloc->gpadl.gpadl_handle = 0;
+	}
+	if (alloc->priv_drv_data)
+		vfree(alloc->priv_drv_data);
+	vfree(alloc);
+}
+
 struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 						     struct dxgadapter *adapter)
 {
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 1ac2d4a64dc1..766f5214cc57 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -31,6 +31,8 @@ struct dxgprocess;
 struct dxgadapter;
 struct dxgdevice;
 struct dxgcontext;
+struct dxgallocation;
+struct dxgresource;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -256,6 +258,8 @@ struct dxgadapter {
 	struct list_head	adapter_list_entry;
 	/* The list of dxgprocess_adapter entries */
 	struct list_head	adapter_process_list_head;
+	/* This lock protects shared resource and syncobject lists */
+	struct rw_semaphore	shared_resource_list_lock;
 	struct pci_dev		*pci_dev;
 	struct hv_device	*hv_dev;
 	struct dxgvmbuschannel	channel;
@@ -302,6 +306,10 @@ struct dxgdevice {
 	struct rw_semaphore	device_lock;
 	struct rw_semaphore	context_list_lock;
 	struct list_head	context_list_head;
+	/* List of device allocations */
+	struct rw_semaphore	alloc_list_lock;
+	struct list_head	alloc_list_head;
+	struct list_head	resource_list_head;
 	/* List of paging queues. Protected by process handle table lock. */
 	struct list_head	pqueue_list_head;
 	struct d3dkmthandle	handle;
@@ -318,9 +326,19 @@ void dxgdevice_release_lock_shared(struct dxgdevice *dev);
 void dxgdevice_release(struct kref *refcount);
 void dxgdevice_add_context(struct dxgdevice *dev, struct dxgcontext *ctx);
 void dxgdevice_remove_context(struct dxgdevice *dev, struct dxgcontext *ctx);
+void dxgdevice_add_alloc(struct dxgdevice *dev, struct dxgallocation *a);
+void dxgdevice_remove_alloc(struct dxgdevice *dev, struct dxgallocation *a);
+void dxgdevice_remove_alloc_safe(struct dxgdevice *dev,
+				 struct dxgallocation *a);
+void dxgdevice_add_resource(struct dxgdevice *dev, struct dxgresource *res);
+void dxgdevice_remove_resource(struct dxgdevice *dev, struct dxgresource *res);
 bool dxgdevice_is_active(struct dxgdevice *dev);
 void dxgdevice_acquire_context_list_lock(struct dxgdevice *dev);
 void dxgdevice_release_context_list_lock(struct dxgdevice *dev);
+void dxgdevice_acquire_alloc_list_lock(struct dxgdevice *dev);
+void dxgdevice_release_alloc_list_lock(struct dxgdevice *dev);
+void dxgdevice_acquire_alloc_list_lock_shared(struct dxgdevice *dev);
+void dxgdevice_release_alloc_list_lock_shared(struct dxgdevice *dev);
 
 /*
  * The object represent the execution context of a device.
@@ -344,6 +362,79 @@ void dxgcontext_destroy_safe(struct dxgprocess *pr, struct dxgcontext *ctx);
 void dxgcontext_release(struct kref *refcount);
 bool dxgcontext_is_active(struct dxgcontext *ctx);
 
+struct dxgresource {
+	struct kref		resource_kref;
+	enum dxgobjectstate	object_state;
+	struct d3dkmthandle	handle;
+	struct list_head	alloc_list_head;
+	struct list_head	resource_list_entry;
+	struct list_head	shared_resource_list_entry;
+	struct dxgdevice	*device;
+	struct dxgprocess	*process;
+	/* Protects adding allocations to resource and resource destruction */
+	struct mutex		resource_mutex;
+	u64			private_runtime_handle;
+	union {
+		struct {
+			u32	destroyed:1;	/* Must be the first */
+			u32	handle_valid:1;
+			u32	reserved:30;
+		};
+		long		flags;
+	};
+};
+
+struct dxgresource *dxgresource_create(struct dxgdevice *dev);
+void dxgresource_destroy(struct dxgresource *res);
+void dxgresource_free_handle(struct dxgresource *res);
+void dxgresource_release(struct kref *refcount);
+int dxgresource_add_alloc(struct dxgresource *res,
+				      struct dxgallocation *a);
+void dxgresource_remove_alloc(struct dxgresource *res, struct dxgallocation *a);
+void dxgresource_remove_alloc_safe(struct dxgresource *res,
+				   struct dxgallocation *a);
+bool dxgresource_is_active(struct dxgresource *res);
+
+struct privdata {
+	u32 data_size;
+	u8 data[1];
+};
+
+struct dxgallocation {
+	/* Entry in the device list or resource list (when resource exists) */
+	struct list_head		alloc_list_entry;
+	/* Allocation owner */
+	union {
+		struct dxgdevice	*device;
+		struct dxgresource	*resource;
+	} owner;
+	struct dxgprocess		*process;
+	/* Pointer to private driver data desc. Used for shared resources */
+	struct privdata			*priv_drv_data;
+	struct d3dkmthandle		alloc_handle;
+	/* Set to 1 when allocation belongs to resource. */
+	u32				resource_owner:1;
+	/* Set to 1 when the allocatio is mapped as cached */
+	u32				cached:1;
+	u32				handle_valid:1;
+	/* GPADL address list for existing sysmem allocations */
+	struct vmbus_gpadl		gpadl;
+	/* Number of pages in the 'pages' array */
+	u32				num_pages;
+	/*
+	 * CPU address from the existing sysmem allocation, or
+	 * mapped to the CPU visible backing store in the IO space
+	 */
+	void				*cpu_address;
+	/* Describes pages for the existing sysmem allocation */
+	struct page			**pages;
+};
+
+struct dxgallocation *dxgallocation_create(struct dxgprocess *process);
+void dxgallocation_stop(struct dxgallocation *a);
+void dxgallocation_destroy(struct dxgallocation *a);
+void dxgallocation_free_handle(struct dxgallocation *a);
+
 void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
 long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
@@ -393,9 +484,27 @@ dxgvmb_send_create_context(struct dxgadapter *adapter,
 int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
 				struct dxgprocess *process,
 				struct d3dkmthandle h);
+int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
+				  struct d3dkmt_createallocation *args,
+				  struct d3dkmt_createallocation *__user inargs,
+				  struct dxgresource *res,
+				  struct dxgallocation **allocs,
+				  struct d3dddi_allocationinfo2 *alloc_info,
+				  struct d3dkmt_createstandardallocation *stda);
+int dxgvmb_send_destroy_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
+				   struct d3dkmt_destroyallocation2 *args,
+				   struct d3dkmthandle *alloc_handles);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
+int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
+				  enum d3dkmdt_standardallocationtype t,
+				  struct d3dkmdt_gdisurfacedata *data,
+				  u32 physical_adapter_index,
+				  u32 *alloc_priv_driver_size,
+				  void *prive_alloc_data,
+				  u32 *res_priv_data_size,
+				  void *priv_res_data);
 int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
 			  void *command,
 			  u32 cmd_size);
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index c1c3274197d8..eb8ef8a69f28 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -158,6 +158,7 @@ int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
 	init_rwsem(&adapter->core_lock);
 
 	INIT_LIST_HEAD(&adapter->adapter_process_list_head);
+	init_rwsem(&adapter->shared_resource_list_lock);
 	adapter->pci_dev = dev;
 	guid_to_luid(guid, &adapter->luid);
 
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index e85c1d9abc47..8a82838d86db 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -109,6 +109,40 @@ static int init_message(struct dxgvmbusmsg *msg, struct dxgadapter *adapter,
 	return 0;
 }
 
+static int init_message_res(struct dxgvmbusmsgres *msg,
+			    struct dxgadapter *adapter,
+			    struct dxgprocess *process,
+			    u32 size,
+			    u32 result_size)
+{
+	bool use_ext_header = dxgglobal->vmbus_ver >=
+			      DXGK_VMBUS_INTERFACE_VERSION;
+
+	if (use_ext_header)
+		size += sizeof(struct dxgvmb_ext_header);
+	msg->size = size;
+	msg->res_size += (result_size + 7) & ~7;
+	size += msg->res_size;
+	msg->hdr = vzalloc(size);
+	if (msg->hdr == NULL) {
+		pr_err("Failed to allocate VM bus message: %d", size);
+		return -ENOMEM;
+	}
+	if (use_ext_header) {
+		msg->msg = (char *)&msg->hdr[1];
+		msg->hdr->command_offset = sizeof(msg->hdr[0]);
+		msg->hdr->vgpu_luid = adapter->host_vgpu_luid;
+	} else {
+		msg->msg = (char *)msg->hdr;
+	}
+	msg->res = (char *)msg->hdr + msg->size;
+	if (dxgglobal->async_msg_enabled)
+		msg->channel = &dxgglobal->channel;
+	else
+		msg->channel = &adapter->channel;
+	return 0;
+}
+
 static void free_message(struct dxgvmbusmsg *msg, struct dxgprocess *process)
 {
 	if (msg->hdr && (char *)msg->hdr != msg->msg_on_stack)
@@ -847,6 +881,617 @@ int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
 	return ret;
 }
 
+static int
+copy_private_data(struct d3dkmt_createallocation *args,
+		  struct dxgkvmb_command_createallocation *command,
+		  struct d3dddi_allocationinfo2 *input_alloc_info,
+		  struct d3dkmt_createstandardallocation *standard_alloc)
+{
+	struct dxgkvmb_command_createallocation_allocinfo *alloc_info;
+	struct d3dddi_allocationinfo2 *input_alloc;
+	int ret = 0;
+	int i;
+	u8 *private_data_dest = (u8 *) &command[1] +
+	    (args->alloc_count *
+	     sizeof(struct dxgkvmb_command_createallocation_allocinfo));
+
+	if (args->private_runtime_data_size) {
+		ret = copy_from_user(private_data_dest,
+				     args->private_runtime_data,
+				     args->private_runtime_data_size);
+		if (ret) {
+			pr_err("%s failed to copy runtime data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		private_data_dest += args->private_runtime_data_size;
+	}
+
+	if (args->flags.standard_allocation) {
+		pr_debug("private data offset %d",
+			(u32) (private_data_dest - (u8 *) command));
+
+		args->priv_drv_data_size = sizeof(*args->standard_allocation);
+		memcpy(private_data_dest, standard_alloc,
+		       sizeof(*standard_alloc));
+		private_data_dest += args->priv_drv_data_size;
+	} else if (args->priv_drv_data_size) {
+		ret = copy_from_user(private_data_dest,
+				     args->priv_drv_data,
+				     args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s failed to copy private data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		private_data_dest += args->priv_drv_data_size;
+	}
+
+	alloc_info = (void *)&command[1];
+	input_alloc = input_alloc_info;
+	if (input_alloc_info[0].sysmem)
+		command->flags.existing_sysmem = 1;
+	for (i = 0; i < args->alloc_count; i++) {
+		alloc_info->flags = input_alloc->flags.value;
+		alloc_info->vidpn_source_id = input_alloc->vidpn_source_id;
+		alloc_info->priv_drv_data_size =
+		    input_alloc->priv_drv_data_size;
+		if (input_alloc->priv_drv_data_size) {
+			ret = copy_from_user(private_data_dest,
+					     input_alloc->priv_drv_data,
+					     input_alloc->priv_drv_data_size);
+			if (ret) {
+				pr_err("%s failed to copy alloc data",
+					__func__);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+			private_data_dest += input_alloc->priv_drv_data_size;
+		}
+		alloc_info++;
+		input_alloc++;
+	}
+
+cleanup:
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static
+int create_existing_sysmem(struct dxgdevice *device,
+			   struct dxgkvmb_command_allocinfo_return *host_alloc,
+			   struct dxgallocation *dxgalloc,
+			   bool read_only,
+			   const void *sysmem)
+{
+	int ret1 = 0;
+	void *kmem = NULL;
+	int ret = 0;
+	struct dxgkvmb_command_setexistingsysmemstore *set_store_command;
+	u64 alloc_size = host_alloc->allocation_size;
+	u32 npages = alloc_size >> PAGE_SHIFT;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, device->adapter, device->process,
+			   sizeof(*set_store_command));
+	if (ret)
+		goto cleanup;
+	set_store_command = (void *)msg.msg;
+
+	/*
+	 * Create a guest physical address list and set it as the allocation
+	 * backing store in the host. This is done after creating the host
+	 * allocation, because only now the allocation size is known.
+	 */
+
+	pr_debug("   Alloc size: %lld", alloc_size);
+
+	dxgalloc->cpu_address = (void *)sysmem;
+	dxgalloc->pages = vzalloc(npages * sizeof(void *));
+	if (dxgalloc->pages == NULL) {
+		pr_err("failed to allocate pages");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret1 = get_user_pages_fast((unsigned long)sysmem, npages, !read_only,
+				  dxgalloc->pages);
+	if (ret1 != npages) {
+		pr_err("get_user_pages_fast failed: %d", ret1);
+		if (ret1 > 0 && ret1 < npages)
+			release_pages(dxgalloc->pages, ret1);
+		vfree(dxgalloc->pages);
+		dxgalloc->pages = NULL;
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	kmem = vmap(dxgalloc->pages, npages, VM_MAP, PAGE_KERNEL);
+	if (kmem == NULL) {
+		pr_err("vmap failed");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret1 = vmbus_establish_gpadl(dxgglobal_get_vmbus(), kmem,
+				     alloc_size, &dxgalloc->gpadl);
+	if (ret1) {
+		pr_err("establish_gpadl failed: %d", ret1);
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	pr_debug("New gpadl %d", dxgalloc->gpadl.gpadl_handle);
+
+	command_vgpu_to_host_init2(&set_store_command->hdr,
+				   DXGK_VMBCOMMAND_SETEXISTINGSYSMEMSTORE,
+				   device->process->host_handle);
+	set_store_command->device = device->handle;
+	set_store_command->device = device->handle;
+	set_store_command->allocation = host_alloc->allocation;
+	set_store_command->gpadl = dxgalloc->gpadl.gpadl_handle;
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+	if (ret < 0)
+		pr_err("failed to set existing store: %x", ret);
+
+cleanup:
+	if (kmem)
+		vunmap(kmem);
+	free_message(&msg, device->process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static int
+process_allocation_handles(struct dxgprocess *process,
+			   struct dxgdevice *device,
+			   struct d3dkmt_createallocation *args,
+			   struct dxgkvmb_command_createallocation_return *res,
+			   struct dxgallocation **dxgalloc,
+			   struct dxgresource *resource)
+{
+	int ret = 0;
+	int i;
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	if (args->flags.create_resource) {
+		ret = hmgrtable_assign_handle(&process->handle_table, resource,
+					      HMGRENTRY_TYPE_DXGRESOURCE,
+					      res->resource);
+		if (ret < 0) {
+			pr_err("failed to assign resource handle %x",
+				   res->resource.v);
+		} else {
+			resource->handle = res->resource;
+			resource->handle_valid = 1;
+		}
+	}
+	for (i = 0; i < args->alloc_count; i++) {
+		struct dxgkvmb_command_allocinfo_return *host_alloc;
+
+		host_alloc = &res->allocation_info[i];
+		ret = hmgrtable_assign_handle(&process->handle_table,
+					      dxgalloc[i],
+					      HMGRENTRY_TYPE_DXGALLOCATION,
+					      host_alloc->allocation);
+		if (ret < 0) {
+			pr_err("failed to assign alloc handle %x %d %d",
+				   host_alloc->allocation.v,
+				   args->alloc_count, i);
+			break;
+		}
+		dxgalloc[i]->alloc_handle = host_alloc->allocation;
+		dxgalloc[i]->handle_valid = 1;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static int
+create_local_allocations(struct dxgprocess *process,
+			 struct dxgdevice *device,
+			 struct d3dkmt_createallocation *args,
+			 struct d3dkmt_createallocation *__user input_args,
+			 struct d3dddi_allocationinfo2 *alloc_info,
+			 struct dxgkvmb_command_createallocation_return *result,
+			 struct dxgresource *resource,
+			 struct dxgallocation **dxgalloc,
+			 u32 destroy_buffer_size)
+{
+	int i;
+	int alloc_count = args->alloc_count;
+	u8 *alloc_private_data = NULL;
+	int ret = 0;
+	int ret1;
+	struct dxgkvmb_command_destroyallocation *destroy_buf;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, device->adapter, process,
+			    destroy_buffer_size);
+	if (ret)
+		goto cleanup;
+	destroy_buf = (void *)msg.msg;
+
+	/* Prepare the command to destroy allocation in case of failure */
+	command_vgpu_to_host_init2(&destroy_buf->hdr,
+				   DXGK_VMBCOMMAND_DESTROYALLOCATION,
+				   process->host_handle);
+	destroy_buf->device = args->device;
+	destroy_buf->resource = args->resource;
+	destroy_buf->alloc_count = alloc_count;
+	destroy_buf->flags.assume_not_in_use = 1;
+	for (i = 0; i < alloc_count; i++) {
+		pr_debug("host allocation: %d %x",
+			     i, result->allocation_info[i].allocation.v);
+		destroy_buf->allocations[i] =
+		    result->allocation_info[i].allocation;
+	}
+
+	if (args->flags.create_resource) {
+		pr_debug("new resource: %x", result->resource.v);
+		ret = copy_to_user(&input_args->resource, &result->resource,
+				   sizeof(struct d3dkmthandle));
+		if (ret) {
+			pr_err("%s failed to copy resource handle", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	alloc_private_data = (u8 *) result +
+	    sizeof(struct dxgkvmb_command_createallocation_return) +
+	    sizeof(struct dxgkvmb_command_allocinfo_return) * (alloc_count - 1);
+
+	for (i = 0; i < alloc_count; i++) {
+		struct dxgkvmb_command_allocinfo_return *host_alloc;
+		struct d3dddi_allocationinfo2 *user_alloc;
+
+		host_alloc = &result->allocation_info[i];
+		user_alloc = &alloc_info[i];
+		dxgalloc[i]->num_pages =
+		    host_alloc->allocation_size >> PAGE_SHIFT;
+		if (user_alloc->sysmem) {
+			ret = create_existing_sysmem(device, host_alloc,
+						     dxgalloc[i],
+						     args->flags.read_only != 0,
+						     user_alloc->sysmem);
+			if (ret < 0)
+				goto cleanup;
+		}
+		dxgalloc[i]->cached = host_alloc->allocation_flags.cached;
+		if (host_alloc->priv_drv_data_size) {
+			ret = copy_to_user(user_alloc->priv_drv_data,
+					   alloc_private_data,
+					   host_alloc->priv_drv_data_size);
+			if (ret) {
+				pr_err("%s failed to copy private data",
+					__func__);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+			alloc_private_data += host_alloc->priv_drv_data_size;
+		}
+		ret = copy_to_user(&args->allocation_info[i].allocation,
+				   &host_alloc->allocation,
+				   sizeof(struct d3dkmthandle));
+		if (ret) {
+			pr_err("%s failed to copy alloc handle", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	ret = process_allocation_handles(process, device, args, result,
+					 dxgalloc, resource);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(&input_args->global_share, &args->global_share,
+			   sizeof(struct d3dkmthandle));
+	if (ret) {
+		pr_err("%s failed to copy global share", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		/* Free local handles before freeing the handles in the host */
+		dxgdevice_acquire_alloc_list_lock(device);
+		if (dxgalloc)
+			for (i = 0; i < alloc_count; i++)
+				if (dxgalloc[i])
+					dxgallocation_free_handle(dxgalloc[i]);
+		if (resource && args->flags.create_resource)
+			dxgresource_free_handle(resource);
+		dxgdevice_release_alloc_list_lock(device);
+
+		/* Destroy allocations in the host to unmap gpadls */
+		ret1 = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						     msg.size);
+		if (ret1 < 0)
+			pr_err("failed to destroy allocations: %x", ret1);
+
+		dxgdevice_acquire_alloc_list_lock(device);
+		if (dxgalloc) {
+			for (i = 0; i < alloc_count; i++) {
+				if (dxgalloc[i]) {
+					dxgalloc[i]->alloc_handle.v = 0;
+					dxgallocation_destroy(dxgalloc[i]);
+					dxgalloc[i] = NULL;
+				}
+			}
+		}
+		if (resource && args->flags.create_resource) {
+			/*
+			 * Prevent the resource memory from freeing.
+			 * It will be freed in the top level function.
+			 */
+			kref_get(&resource->resource_kref);
+			dxgresource_destroy(resource);
+		}
+		dxgdevice_release_alloc_list_lock(device);
+	}
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_create_allocation(struct dxgprocess *process,
+				  struct dxgdevice *device,
+				  struct d3dkmt_createallocation *args,
+				  struct d3dkmt_createallocation *__user
+				  input_args,
+				  struct dxgresource *resource,
+				  struct dxgallocation **dxgalloc,
+				  struct d3dddi_allocationinfo2 *alloc_info,
+				  struct d3dkmt_createstandardallocation
+				  *standard_alloc)
+{
+	struct dxgkvmb_command_createallocation *command = NULL;
+	struct dxgkvmb_command_createallocation_return *result = NULL;
+	int ret = -EINVAL;
+	int i;
+	u32 result_size = 0;
+	u32 cmd_size = 0;
+	u32 destroy_buffer_size = 0;
+	u32 priv_drv_data_size;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->private_runtime_data_size >= DXG_MAX_VM_BUS_PACKET_SIZE ||
+	    args->priv_drv_data_size >= DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EOVERFLOW;
+		goto cleanup;
+	}
+
+	/*
+	 * Preallocate the buffer, which will be used for destruction in case
+	 * of a failure
+	 */
+	destroy_buffer_size = sizeof(struct dxgkvmb_command_destroyallocation) +
+	    args->alloc_count * sizeof(struct d3dkmthandle);
+
+	/* Compute the total private driver size */
+
+	priv_drv_data_size = 0;
+
+	for (i = 0; i < args->alloc_count; i++) {
+		if (alloc_info[i].priv_drv_data_size >=
+		    DXG_MAX_VM_BUS_PACKET_SIZE) {
+			ret = -EOVERFLOW;
+			goto cleanup;
+		} else {
+			priv_drv_data_size += alloc_info[i].priv_drv_data_size;
+		}
+		if (priv_drv_data_size >= DXG_MAX_VM_BUS_PACKET_SIZE) {
+			ret = -EOVERFLOW;
+			goto cleanup;
+		}
+	}
+
+	/*
+	 * Private driver data for the result includes only per allocation
+	 * private data
+	 */
+	result_size = sizeof(struct dxgkvmb_command_createallocation_return) +
+	    (args->alloc_count - 1) *
+	    sizeof(struct dxgkvmb_command_allocinfo_return) +
+	    priv_drv_data_size;
+	result = vzalloc(result_size);
+	if (result == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	/* Private drv data for the command includes the global private data */
+	priv_drv_data_size += args->priv_drv_data_size;
+
+	cmd_size = sizeof(struct dxgkvmb_command_createallocation) +
+	    args->alloc_count *
+	    sizeof(struct dxgkvmb_command_createallocation_allocinfo) +
+	    args->private_runtime_data_size + priv_drv_data_size;
+	if (cmd_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EOVERFLOW;
+		goto cleanup;
+	}
+
+	pr_debug("command size, driver_data_size %d %d %ld %ld",
+		    cmd_size, priv_drv_data_size,
+		    sizeof(struct dxgkvmb_command_createallocation),
+		    sizeof(struct dxgkvmb_command_createallocation_allocinfo));
+
+	ret = init_message(&msg, device->adapter, process,
+			   cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CREATEALLOCATION,
+				   process->host_handle);
+	command->device = args->device;
+	command->flags = args->flags;
+	command->resource = args->resource;
+	command->private_runtime_resource_handle =
+	    args->private_runtime_resource_handle;
+	command->alloc_count = args->alloc_count;
+	command->private_runtime_data_size = args->private_runtime_data_size;
+	command->priv_drv_data_size = args->priv_drv_data_size;
+
+	ret = copy_private_data(args, command, alloc_info, standard_alloc);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, result_size);
+	if (ret < 0) {
+		pr_err("send_create_allocation failed %x", ret);
+		goto cleanup;
+	}
+
+	ret = create_local_allocations(process, device, args, input_args,
+				       alloc_info, result, resource, dxgalloc,
+				       destroy_buffer_size);
+cleanup:
+
+	if (result)
+		vfree(result);
+	free_message(&msg, process);
+
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
+				   struct dxgdevice *device,
+				   struct d3dkmt_destroyallocation2 *args,
+				   struct d3dkmthandle *alloc_handles)
+{
+	struct dxgkvmb_command_destroyallocation *destroy_buffer;
+	u32 destroy_buffer_size;
+	int ret;
+	int allocations_size = args->alloc_count * sizeof(struct d3dkmthandle);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	destroy_buffer_size = sizeof(struct dxgkvmb_command_destroyallocation) +
+	    allocations_size;
+
+	ret = init_message(&msg, device->adapter, process,
+			    destroy_buffer_size);
+	if (ret)
+		goto cleanup;
+	destroy_buffer = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&destroy_buffer->hdr,
+				   DXGK_VMBCOMMAND_DESTROYALLOCATION,
+				   process->host_handle);
+	destroy_buffer->device = args->device;
+	destroy_buffer->resource = args->resource;
+	destroy_buffer->alloc_count = args->alloc_count;
+	destroy_buffer->flags = args->flags;
+	if (allocations_size)
+		memcpy(destroy_buffer->allocations, alloc_handles,
+		       allocations_size);
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
+				  enum d3dkmdt_standardallocationtype alloctype,
+				  struct d3dkmdt_gdisurfacedata *alloc_data,
+				  u32 physical_adapter_index,
+				  u32 *alloc_priv_driver_size,
+				  void *priv_alloc_data,
+				  u32 *res_priv_data_size,
+				  void *priv_res_data)
+{
+	struct dxgkvmb_command_getstandardallocprivdata *command;
+	struct dxgkvmb_command_getstandardallocprivdata_return *result = NULL;
+	u32 result_size = sizeof(*result);
+	int ret;
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	if (priv_alloc_data)
+		result_size += *alloc_priv_driver_size;
+	if (priv_res_data)
+		result_size += *res_priv_data_size;
+	ret = init_message_res(&msg, device->adapter, device->process,
+			       sizeof(*command), result_size);
+	if (ret)
+		goto cleanup;
+	command = msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr,
+			DXGK_VMBCOMMAND_DDIGETSTANDARDALLOCATIONDRIVERDATA,
+			device->process->host_handle);
+
+	command->alloc_type = alloctype;
+	command->priv_driver_data_size = *alloc_priv_driver_size;
+	command->physical_adapter_index = physical_adapter_index;
+	command->priv_driver_resource_size = *res_priv_data_size;
+	switch (alloctype) {
+	case _D3DKMDT_STANDARDALLOCATION_GDISURFACE:
+		command->gdi_surface = *alloc_data;
+		break;
+	case _D3DKMDT_STANDARDALLOCATION_SHAREDPRIMARYSURFACE:
+	case _D3DKMDT_STANDARDALLOCATION_SHADOWSURFACE:
+	case _D3DKMDT_STANDARDALLOCATION_STAGINGSURFACE:
+	default:
+		pr_err("Invalid standard alloc type");
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result->status);
+	if (ret < 0)
+		goto cleanup;
+
+	if (*alloc_priv_driver_size &&
+	    result->priv_driver_data_size != *alloc_priv_driver_size) {
+		pr_err("Priv data size mismatch");
+		goto cleanup;
+	}
+	if (*res_priv_data_size &&
+	    result->priv_driver_resource_size != *res_priv_data_size) {
+		pr_err("Resource priv data size mismatch");
+		goto cleanup;
+	}
+	*alloc_priv_driver_size = result->priv_driver_data_size;
+	*res_priv_data_size = result->priv_driver_resource_size;
+	if (priv_alloc_data) {
+		memcpy(priv_alloc_data, &result[1],
+		       result->priv_driver_data_size);
+	}
+	if (priv_res_data) {
+		memcpy(priv_res_data,
+		       (char *)(&result[1]) + result->priv_driver_data_size,
+		       result->priv_driver_resource_size);
+	}
+
+cleanup:
+
+	free_message((struct dxgvmbusmsg *)&msg, device->process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index adaccc464e21..312ce049dbc2 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -173,6 +173,14 @@ struct dxgkvmb_command_setiospaceregion {
 	u32				shared_page_gpadl;
 };
 
+/* Returns ntstatus */
+struct dxgkvmb_command_setexistingsysmemstore {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		allocation;
+	u32				gpadl;
+};
+
 struct dxgkvmb_command_createprocess {
 	struct dxgkvmb_command_vm_to_host hdr;
 	void			*process;
@@ -269,6 +277,121 @@ struct dxgkvmb_command_flushdevice {
 	enum dxgdevice_flushschedulerreason	reason;
 };
 
+struct dxgkvmb_command_createallocation_allocinfo {
+	u32				flags;
+	u32				priv_drv_data_size;
+	u32				vidpn_source_id;
+};
+
+struct dxgkvmb_command_createallocation {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+	u32				private_runtime_data_size;
+	u32				priv_drv_data_size;
+	u32				alloc_count;
+	struct d3dkmt_createallocationflags flags;
+	u64				private_runtime_resource_handle;
+	bool				make_resident;
+/* dxgkvmb_command_createallocation_allocinfo alloc_info[alloc_count]; */
+/* u8 private_rutime_data[private_runtime_data_size] */
+/* u8 priv_drv_data[] for each alloc_info */
+};
+
+struct dxgkvmb_command_getstandardallocprivdata {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	enum d3dkmdt_standardallocationtype alloc_type;
+	u32				priv_driver_data_size;
+	u32				priv_driver_resource_size;
+	u32				physical_adapter_index;
+	union {
+		struct d3dkmdt_sharedprimarysurfacedata	primary;
+		struct d3dkmdt_shadowsurfacedata	shadow;
+		struct d3dkmdt_stagingsurfacedata	staging;
+		struct d3dkmdt_gdisurfacedata		gdi_surface;
+	};
+};
+
+struct dxgkvmb_command_getstandardallocprivdata_return {
+	struct ntstatus			status;
+	u32				priv_driver_data_size;
+	u32				priv_driver_resource_size;
+	union {
+		struct d3dkmdt_sharedprimarysurfacedata	primary;
+		struct d3dkmdt_shadowsurfacedata	shadow;
+		struct d3dkmdt_stagingsurfacedata	staging;
+		struct d3dkmdt_gdisurfacedata		gdi_surface;
+	};
+/* char alloc_priv_data[priv_driver_data_size]; */
+/* char resource_priv_data[priv_driver_resource_size]; */
+};
+
+struct dxgkarg_describeallocation {
+	u64				allocation;
+	u32				width;
+	u32				height;
+	u32				format;
+	u32				multisample_method;
+	struct d3dddi_rational		refresh_rate;
+	u32				private_driver_attribute;
+	u32				flags;
+	u32				rotation;
+};
+
+struct dxgkvmb_allocflags {
+	union {
+		u32			flags;
+		struct {
+			u32		primary:1;
+			u32		cdd_primary:1;
+			u32		dod_primary:1;
+			u32		overlay:1;
+			u32		reserved6:1;
+			u32		capture:1;
+			u32		reserved0:4;
+			u32		reserved1:1;
+			u32		existing_sysmem:1;
+			u32		stereo:1;
+			u32		direct_flip:1;
+			u32		hardware_protected:1;
+			u32		reserved2:1;
+			u32		reserved3:1;
+			u32		reserved4:1;
+			u32		protected:1;
+			u32		cached:1;
+			u32		independent_primary:1;
+			u32		reserved:11;
+		};
+	};
+};
+
+struct dxgkvmb_command_allocinfo_return {
+	struct d3dkmthandle		allocation;
+	u32				priv_drv_data_size;
+	struct dxgkvmb_allocflags	allocation_flags;
+	u64				allocation_size;
+	struct dxgkarg_describeallocation driver_info;
+};
+
+struct dxgkvmb_command_createallocation_return {
+	struct d3dkmt_createallocationflags flags;
+	struct d3dkmthandle		resource;
+	struct d3dkmthandle		global_share;
+	u32				vgpu_flags;
+	struct dxgkvmb_command_allocinfo_return allocation_info[1];
+	/* Private driver data for allocations */
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_destroyallocation {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+	u32				alloc_count;
+	struct d3dddicb_destroyallocation2flags flags;
+	struct d3dkmthandle		allocations[1];
+};
+
 struct dxgkvmb_command_createcontextvirtual {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmthandle		context;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 879fb3c6b7b2..1f07a883debb 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -735,6 +735,639 @@ dxgk_destroy_context(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+get_standard_alloc_priv_data(struct dxgdevice *device,
+			     struct d3dkmt_createstandardallocation *alloc_info,
+			     u32 *standard_alloc_priv_data_size,
+			     void **standard_alloc_priv_data,
+			     u32 *standard_res_priv_data_size,
+			     void **standard_res_priv_data)
+{
+	int ret;
+	struct d3dkmdt_gdisurfacedata gdi_data = { };
+	u32 priv_data_size = 0;
+	u32 res_priv_data_size = 0;
+	void *priv_data = NULL;
+	void *res_priv_data = NULL;
+
+	gdi_data.type = _D3DKMDT_GDISURFACE_TEXTURE_CROSSADAPTER;
+	gdi_data.width = alloc_info->existing_heap_data.size;
+	gdi_data.height = 1;
+	gdi_data.format = _D3DDDIFMT_UNKNOWN;
+
+	*standard_alloc_priv_data_size = 0;
+	ret = dxgvmb_send_get_stdalloc_data(device,
+					_D3DKMDT_STANDARDALLOCATION_GDISURFACE,
+					&gdi_data, 0,
+					&priv_data_size, NULL,
+					&res_priv_data_size,
+					NULL);
+	if (ret < 0)
+		goto cleanup;
+	pr_debug("Priv data size: %d", priv_data_size);
+	if (priv_data_size == 0) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	priv_data = vzalloc(priv_data_size);
+	if (priv_data == NULL) {
+		ret = -ENOMEM;
+		pr_err("failed to allocate memory for priv data: %d",
+			   priv_data_size);
+		goto cleanup;
+	}
+	if (res_priv_data_size) {
+		res_priv_data = vzalloc(res_priv_data_size);
+		if (res_priv_data == NULL) {
+			ret = -ENOMEM;
+			pr_err("failed to alloc memory for res priv data: %d",
+				res_priv_data_size);
+			goto cleanup;
+		}
+	}
+	ret = dxgvmb_send_get_stdalloc_data(device,
+					_D3DKMDT_STANDARDALLOCATION_GDISURFACE,
+					&gdi_data, 0,
+					&priv_data_size,
+					priv_data,
+					&res_priv_data_size,
+					res_priv_data);
+	if (ret < 0)
+		goto cleanup;
+	*standard_alloc_priv_data_size = priv_data_size;
+	*standard_alloc_priv_data = priv_data;
+	*standard_res_priv_data_size = res_priv_data_size;
+	*standard_res_priv_data = res_priv_data;
+	priv_data = NULL;
+	res_priv_data = NULL;
+
+cleanup:
+	if (priv_data)
+		vfree(priv_data);
+	if (res_priv_data)
+		vfree(res_priv_data);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_create_allocation(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_createallocation args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct d3dddi_allocationinfo2 *alloc_info = NULL;
+	struct d3dkmt_createstandardallocation standard_alloc;
+	u32 alloc_info_size = 0;
+	struct dxgresource *resource = NULL;
+	struct dxgallocation **dxgalloc = NULL;
+	bool resource_mutex_acquired = false;
+	u32 standard_alloc_priv_data_size = 0;
+	void *standard_alloc_priv_data = NULL;
+	u32 res_priv_data_size = 0;
+	void *res_priv_data = NULL;
+	int i;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.alloc_count > D3DKMT_CREATEALLOCATION_MAX ||
+	    args.alloc_count == 0) {
+		pr_err("invalid number of allocations to create");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	alloc_info_size = sizeof(struct d3dddi_allocationinfo2) *
+	    args.alloc_count;
+	alloc_info = vzalloc(alloc_info_size);
+	if (alloc_info == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret = copy_from_user(alloc_info, args.allocation_info,
+				 alloc_info_size);
+	if (ret) {
+		pr_err("%s failed to copy alloc info", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	for (i = 0; i < args.alloc_count; i++) {
+		if (args.flags.standard_allocation) {
+			if (alloc_info[i].priv_drv_data_size != 0) {
+				pr_err("private data size is not zero");
+				ret = -EINVAL;
+				goto cleanup;
+			}
+		}
+		if (alloc_info[i].priv_drv_data_size >=
+		    DXG_MAX_VM_BUS_PACKET_SIZE) {
+			pr_err("private data size is too big: %d %d %ld",
+				   i, alloc_info[i].priv_drv_data_size,
+				   sizeof(alloc_info[0]));
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	if (args.flags.existing_section || args.flags.create_protected) {
+		pr_err("invalid allocation flags");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.flags.standard_allocation) {
+		if (args.standard_allocation == NULL) {
+			pr_err("invalid standard allocation");
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		ret = copy_from_user(&standard_alloc,
+				     args.standard_allocation,
+				     sizeof(standard_alloc));
+		if (ret) {
+			pr_err("%s failed to copy std alloc data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		if (standard_alloc.type ==
+		    _D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP) {
+			if (alloc_info[0].sysmem == NULL ||
+			   (unsigned long)alloc_info[0].sysmem &
+			   (PAGE_SIZE - 1)) {
+				pr_err("invalid sysmem pointer");
+				ret = STATUS_INVALID_PARAMETER;
+				goto cleanup;
+			}
+			if (!args.flags.existing_sysmem) {
+				pr_err("expected existing_sysmem flag");
+				ret = STATUS_INVALID_PARAMETER;
+				goto cleanup;
+			}
+		} else if (standard_alloc.type ==
+		    _D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER) {
+			if (args.flags.existing_sysmem) {
+				pr_err("existing_sysmem flag is invalid");
+				ret = STATUS_INVALID_PARAMETER;
+				goto cleanup;
+
+			}
+			if (alloc_info[0].sysmem != NULL) {
+				pr_err("sysmem should be NULL");
+				ret = STATUS_INVALID_PARAMETER;
+				goto cleanup;
+			}
+		} else {
+			pr_err("invalid standard allocation type");
+			ret = STATUS_INVALID_PARAMETER;
+			goto cleanup;
+		}
+
+		if (args.priv_drv_data_size != 0 ||
+		    args.alloc_count != 1 ||
+		    standard_alloc.existing_heap_data.size == 0 ||
+		    standard_alloc.existing_heap_data.size & (PAGE_SIZE - 1)) {
+			pr_err("invalid standard allocation");
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		args.priv_drv_data_size =
+		    sizeof(struct d3dkmt_createstandardallocation);
+	}
+
+	if (args.flags.create_shared && !args.flags.create_resource) {
+		pr_err("create_resource must be set for create_shared");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0) {
+		kref_put(&device->device_kref, dxgdevice_release);
+		device = NULL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	if (args.flags.standard_allocation) {
+		ret = get_standard_alloc_priv_data(device,
+						&standard_alloc,
+						&standard_alloc_priv_data_size,
+						&standard_alloc_priv_data,
+						&res_priv_data_size,
+						&res_priv_data);
+		if (ret < 0)
+			goto cleanup;
+		pr_debug("Alloc private data: %d",
+			    standard_alloc_priv_data_size);
+	}
+
+	if (args.flags.create_resource) {
+		resource = dxgresource_create(device);
+		if (resource == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		resource->private_runtime_handle =
+		    args.private_runtime_resource_handle;
+	} else {
+		if (args.resource.v) {
+			/* Adding new allocations to the given resource */
+
+			dxgprocess_ht_lock_shared_down(process);
+			resource = hmgrtable_get_object_by_type(
+				&process->handle_table,
+				HMGRENTRY_TYPE_DXGRESOURCE,
+				args.resource);
+			kref_get(&resource->resource_kref);
+			dxgprocess_ht_lock_shared_up(process);
+
+			if (resource == NULL || resource->device != device) {
+				pr_err("invalid resource handle %x",
+					   args.resource.v);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+			/* Synchronize with resource destruction */
+			mutex_lock(&resource->resource_mutex);
+			if (!dxgresource_is_active(resource)) {
+				mutex_unlock(&resource->resource_mutex);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+			resource_mutex_acquired = true;
+		}
+	}
+
+	dxgalloc = vzalloc(sizeof(struct dxgallocation *) * args.alloc_count);
+	if (dxgalloc == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	for (i = 0; i < args.alloc_count; i++) {
+		struct dxgallocation *alloc;
+		u32 priv_data_size;
+
+		if (args.flags.standard_allocation)
+			priv_data_size = standard_alloc_priv_data_size;
+		else
+			priv_data_size = alloc_info[i].priv_drv_data_size;
+
+		if (alloc_info[i].sysmem && !args.flags.standard_allocation) {
+			if ((unsigned long)
+			    alloc_info[i].sysmem & (PAGE_SIZE - 1)) {
+				pr_err("invalid sysmem alloc %d, %p",
+					   i, alloc_info[i].sysmem);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+		}
+		if ((alloc_info[0].sysmem == NULL) !=
+		    (alloc_info[i].sysmem == NULL)) {
+			pr_err("All allocations must have sysmem pointer");
+			ret = -EINVAL;
+			goto cleanup;
+		}
+
+		dxgalloc[i] = dxgallocation_create(process);
+		if (dxgalloc[i] == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		alloc = dxgalloc[i];
+
+		if (resource) {
+			ret = dxgresource_add_alloc(resource, alloc);
+			if (ret < 0)
+				goto cleanup;
+		} else {
+			dxgdevice_add_alloc(device, alloc);
+		}
+		if (args.flags.create_shared) {
+			/* Remember alloc private data to use it during open */
+			alloc->priv_drv_data = vzalloc(priv_data_size +
+					offsetof(struct privdata, data) - 1);
+			if (alloc->priv_drv_data == NULL) {
+				ret = -ENOMEM;
+				goto cleanup;
+			}
+			if (args.flags.standard_allocation) {
+				memcpy(alloc->priv_drv_data->data,
+				       standard_alloc_priv_data,
+				       standard_alloc_priv_data_size);
+				alloc->priv_drv_data->data_size =
+				    standard_alloc_priv_data_size;
+			} else {
+				ret = copy_from_user(
+					alloc->priv_drv_data->data,
+					alloc_info[i].priv_drv_data,
+					priv_data_size);
+				if (ret) {
+					pr_err("%s failed to copy priv data",
+						__func__);
+					ret = -EINVAL;
+					goto cleanup;
+				}
+				alloc->priv_drv_data->data_size =
+				    priv_data_size;
+			}
+		}
+	}
+
+	ret = dxgvmb_send_create_allocation(process, device, &args, inargs,
+					    resource, dxgalloc, alloc_info,
+					    &standard_alloc);
+cleanup:
+
+	if (resource_mutex_acquired) {
+		mutex_unlock(&resource->resource_mutex);
+		kref_put(&resource->resource_kref, dxgresource_release);
+	}
+	if (ret < 0) {
+		if (dxgalloc) {
+			for (i = 0; i < args.alloc_count; i++) {
+				if (dxgalloc[i])
+					dxgallocation_destroy(dxgalloc[i]);
+			}
+		}
+		if (resource && args.flags.create_resource) {
+			dxgresource_destroy(resource);
+		}
+	}
+	if (dxgalloc)
+		vfree(dxgalloc);
+	if (standard_alloc_priv_data)
+		vfree(standard_alloc_priv_data);
+	if (res_priv_data)
+		vfree(res_priv_data);
+	if (alloc_info)
+		vfree(alloc_info);
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device) {
+		dxgdevice_release_lock_shared(device);
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int validate_alloc(struct dxgallocation *alloc0,
+			  struct dxgallocation *alloc,
+			  struct dxgdevice *device,
+			  struct d3dkmthandle alloc_handle)
+{
+	u32 fail_reason;
+
+	if (alloc == NULL) {
+		fail_reason = 1;
+		goto cleanup;
+	}
+	if (alloc->resource_owner != alloc0->resource_owner) {
+		fail_reason = 2;
+		goto cleanup;
+	}
+	if (alloc->resource_owner) {
+		if (alloc->owner.resource != alloc0->owner.resource) {
+			fail_reason = 3;
+			goto cleanup;
+		}
+		if (alloc->owner.resource->device != device) {
+			fail_reason = 4;
+			goto cleanup;
+		}
+	} else {
+		if (alloc->owner.device != device) {
+			fail_reason = 6;
+			goto cleanup;
+		}
+	}
+	return 0;
+cleanup:
+	pr_err("Alloc validation failed: reason: %d %x",
+		   fail_reason, alloc_handle.v);
+	return -EINVAL;
+}
+
+static int
+dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_destroyallocation2 args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int ret;
+	struct d3dkmthandle *alloc_handles = NULL;
+	struct dxgallocation **allocs = NULL;
+	struct dxgresource *resource = NULL;
+	int i;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.alloc_count > D3DKMT_CREATEALLOCATION_MAX ||
+	    ((args.alloc_count == 0) == (args.resource.v == 0))) {
+		pr_err("invalid number of allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.alloc_count) {
+		u32 handle_size = sizeof(struct d3dkmthandle) *
+				   args.alloc_count;
+
+		alloc_handles = vzalloc(handle_size);
+		if (alloc_handles == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		allocs = vzalloc(sizeof(struct dxgallocation *) *
+				 args.alloc_count);
+		if (allocs == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		ret = copy_from_user(alloc_handles, args.allocations,
+					 handle_size);
+		if (ret) {
+			pr_err("%s failed to copy alloc handles", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/* Acquire the device lock to synchronize with the device destriction */
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0) {
+		kref_put(&device->device_kref, dxgdevice_release);
+		device = NULL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	/*
+	 * Destroy the local allocation handles first. If the host handle
+	 * is destroyed first, another object could be assigned to the process
+	 * table at the same place as the allocation handle and it will fail.
+	 */
+	if (args.alloc_count) {
+		dxgprocess_ht_lock_exclusive_down(process);
+		for (i = 0; i < args.alloc_count; i++) {
+			allocs[i] =
+			    hmgrtable_get_object_by_type(&process->handle_table,
+						HMGRENTRY_TYPE_DXGALLOCATION,
+						alloc_handles[i]);
+			ret =
+			    validate_alloc(allocs[0], allocs[i], device,
+					   alloc_handles[i]);
+			if (ret < 0) {
+				dxgprocess_ht_lock_exclusive_up(process);
+				goto cleanup;
+			}
+		}
+		dxgprocess_ht_lock_exclusive_up(process);
+		for (i = 0; i < args.alloc_count; i++)
+			dxgallocation_free_handle(allocs[i]);
+	} else {
+		struct dxgallocation *alloc;
+
+		dxgprocess_ht_lock_exclusive_down(process);
+		resource = hmgrtable_get_object_by_type(&process->handle_table,
+						HMGRENTRY_TYPE_DXGRESOURCE,
+						args.resource);
+		if (resource == NULL) {
+			pr_err("Invalid resource handle: %x",
+				   args.resource.v);
+			ret = -EINVAL;
+		} else if (resource->device != device) {
+			pr_err("Resource belongs to wrong device: %x",
+				   args.resource.v);
+			ret = -EINVAL;
+		} else {
+			hmgrtable_free_handle(&process->handle_table,
+					      HMGRENTRY_TYPE_DXGRESOURCE,
+					      args.resource);
+			resource->object_state = DXGOBJECTSTATE_DESTROYED;
+			resource->handle.v = 0;
+			resource->handle_valid = 0;
+		}
+		dxgprocess_ht_lock_exclusive_up(process);
+
+		if (ret < 0)
+			goto cleanup;
+
+		dxgdevice_acquire_alloc_list_lock_shared(device);
+		list_for_each_entry(alloc, &resource->alloc_list_head,
+				    alloc_list_entry) {
+			dxgallocation_free_handle(alloc);
+		}
+		dxgdevice_release_alloc_list_lock_shared(device);
+	}
+
+	if (args.alloc_count && allocs[0]->resource_owner)
+		resource = allocs[0]->owner.resource;
+
+	if (resource) {
+		kref_get(&resource->resource_kref);
+		mutex_lock(&resource->resource_mutex);
+	}
+
+	ret = dxgvmb_send_destroy_allocation(process, device, &args,
+					     alloc_handles);
+
+	/*
+	 * Destroy the allocations after the host destroyed it.
+	 * The allocation gpadl teardown will wait until the host unmaps its
+	 * gpadl.
+	 */
+	dxgdevice_acquire_alloc_list_lock(device);
+	if (args.alloc_count) {
+		for (i = 0; i < args.alloc_count; i++) {
+			if (allocs[i]) {
+				allocs[i]->alloc_handle.v = 0;
+				dxgallocation_destroy(allocs[i]);
+			}
+		}
+	} else {
+		dxgresource_destroy(resource);
+	}
+	dxgdevice_release_alloc_list_lock(device);
+
+	if (resource) {
+		mutex_unlock(&resource->resource_mutex);
+		kref_put(&resource->resource_kref, dxgresource_release);
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device) {
+		dxgdevice_release_lock_shared(device);
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+
+	if (alloc_handles)
+		vfree(alloc_handles);
+
+	if (allocs)
+		vfree(allocs);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -799,8 +1432,12 @@ void init_ioctls(void)
 		  LX_DXCREATECONTEXTVIRTUAL);
 	SET_IOCTL(/*0x5 */ dxgk_destroy_context,
 		  LX_DXDESTROYCONTEXT);
+	SET_IOCTL(/*0x6 */ dxgk_create_allocation,
+		  LX_DXCREATEALLOCATION);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0x13 */ dxgk_destroy_allocation,
+		  LX_DXDESTROYALLOCATION2);
 	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
 		  LX_DXENUMADAPTERS2);
 	SET_IOCTL(/*0x15 */ dxgk_close_adapter,
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 8f7b37049308..41abdc504bc2 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -31,6 +31,9 @@ extern const struct d3dkmthandle zerohandle;
  * plistmutex
  * table_lock
  * context_list_lock
+ * alloc_list_lock
+ * resource_mutex
+ * shared_resource_list_lock
  * core_lock
  * device_lock
  * process_adapter_mutex
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index b17ef6e30ece..676ca1db85c9 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -54,6 +54,7 @@ struct winluid {
 	__u32 b;
 };
 
+#define D3DKMT_CREATEALLOCATION_MAX		1024
 #define D3DKMT_ADAPTERS_MAX			64
 
 struct d3dkmt_adapterinfo {
@@ -193,6 +194,205 @@ struct d3dkmt_createcontextvirtual {
 	struct d3dkmthandle		context;
 };
 
+enum d3dkmdt_gdisurfacetype {
+	_D3DKMDT_GDISURFACE_INVALID				= 0,
+	_D3DKMDT_GDISURFACE_TEXTURE				= 1,
+	_D3DKMDT_GDISURFACE_STAGING_CPUVISIBLE			= 2,
+	_D3DKMDT_GDISURFACE_STAGING				= 3,
+	_D3DKMDT_GDISURFACE_LOOKUPTABLE				= 4,
+	_D3DKMDT_GDISURFACE_EXISTINGSYSMEM			= 5,
+	_D3DKMDT_GDISURFACE_TEXTURE_CPUVISIBLE			= 6,
+	_D3DKMDT_GDISURFACE_TEXTURE_CROSSADAPTER		= 7,
+	_D3DKMDT_GDISURFACE_TEXTURE_CPUVISIBLE_CROSSADAPTER	= 8,
+};
+
+struct d3dddi_rational {
+	__u32	numerator;
+	__u32	denominator;
+};
+
+enum d3dddiformat {
+	_D3DDDIFMT_UNKNOWN = 0,
+};
+
+struct d3dkmdt_gdisurfacedata {
+	__u32				width;
+	__u32				height;
+	__u32				format;
+	enum d3dkmdt_gdisurfacetype	type;
+	__u32				flags;
+	__u32				pitch;
+};
+
+struct d3dkmdt_stagingsurfacedata {
+	__u32	width;
+	__u32	height;
+	__u32	pitch;
+};
+
+struct d3dkmdt_sharedprimarysurfacedata {
+	__u32			width;
+	__u32			height;
+	enum d3dddiformat	format;
+	struct d3dddi_rational	refresh_rate;
+	__u32			vidpn_source_id;
+};
+
+struct d3dkmdt_shadowsurfacedata {
+	__u32			width;
+	__u32			height;
+	enum d3dddiformat	format;
+	__u32			pitch;
+};
+
+enum d3dkmdt_standardallocationtype {
+	_D3DKMDT_STANDARDALLOCATION_SHAREDPRIMARYSURFACE	= 1,
+	_D3DKMDT_STANDARDALLOCATION_SHADOWSURFACE		= 2,
+	_D3DKMDT_STANDARDALLOCATION_STAGINGSURFACE		= 3,
+	_D3DKMDT_STANDARDALLOCATION_GDISURFACE			= 4,
+};
+
+enum d3dkmt_standardallocationtype {
+	_D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP	= 1,
+	_D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER	= 2,
+};
+
+struct d3dkmt_standardallocation_existingheap {
+	__u64	size;
+};
+
+struct d3dkmt_createstandardallocationflags {
+	union {
+		struct {
+			__u32		reserved:32;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_createstandardallocation {
+	enum d3dkmt_standardallocationtype		type;
+	__u32						reserved;
+	struct d3dkmt_standardallocation_existingheap	existing_heap_data;
+	struct d3dkmt_createstandardallocationflags	flags;
+	__u32						reserved1;
+};
+
+struct d3dddi_allocationinfo2 {
+	struct d3dkmthandle	allocation;
+#ifdef __KERNEL__
+	const void		*sysmem;
+#else
+	__u64			sysmem;
+#endif
+#ifdef __KERNEL__
+	void			*priv_drv_data;
+#else
+	__u64			priv_drv_data;
+#endif
+	__u32			priv_drv_data_size;
+	__u32			vidpn_source_id;
+	union {
+		struct {
+			__u32	primary:1;
+			__u32	stereo:1;
+			__u32	override_priority:1;
+			__u32	reserved:29;
+		};
+		__u32		value;
+	} flags;
+	__u64			gpu_virtual_address;
+	union {
+		__u32		priority;
+		__u64		unused;
+	};
+	__u64			reserved[5];
+};
+
+struct d3dkmt_createallocationflags {
+	union {
+		struct {
+			__u32		create_resource:1;
+			__u32		create_shared:1;
+			__u32		non_secure:1;
+			__u32		create_protected:1;
+			__u32		restrict_shared_access:1;
+			__u32		existing_sysmem:1;
+			__u32		nt_security_sharing:1;
+			__u32		read_only:1;
+			__u32		create_write_combined:1;
+			__u32		create_cached:1;
+			__u32		swap_chain_back_buffer:1;
+			__u32		cross_adapter:1;
+			__u32		open_cross_adapter:1;
+			__u32		partial_shared_creation:1;
+			__u32		zeroed:1;
+			__u32		write_watch:1;
+			__u32		standard_allocation:1;
+			__u32		existing_section:1;
+			__u32		reserved:14;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_createallocation {
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+	struct d3dkmthandle		global_share;
+	__u32				reserved;
+#ifdef __KERNEL__
+	const void			*private_runtime_data;
+#else
+	__u64				private_runtime_data;
+#endif
+	__u32				private_runtime_data_size;
+	__u32				reserved1;
+	union {
+#ifdef __KERNEL__
+		struct d3dkmt_createstandardallocation *standard_allocation;
+		const void *priv_drv_data;
+#else
+		__u64	standard_allocation;
+		__u64	priv_drv_data;
+#endif
+	};
+	__u32				priv_drv_data_size;
+	__u32				alloc_count;
+#ifdef __KERNEL__
+	struct d3dddi_allocationinfo2	*allocation_info;
+#else
+	__u64				allocation_info;
+#endif
+	struct d3dkmt_createallocationflags flags;
+	__u32				reserved2;
+	__u64				private_runtime_resource_handle;
+};
+
+struct d3dddicb_destroyallocation2flags {
+	union {
+		struct {
+			__u32		assume_not_in_use:1;
+			__u32		synchronous_destroy:1;
+			__u32		reserved:29;
+			__u32		system_use_only:1;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_destroyallocation2 {
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+#ifdef __KERNEL__
+	const struct d3dkmthandle	*allocations;
+#else
+	__u64				allocations;
+#endif
+	__u32				alloc_count;
+	struct d3dddicb_destroyallocation2flags flags;
+};
+
 struct d3dkmt_adaptertype {
 	union {
 		struct {
@@ -275,8 +475,12 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x04, struct d3dkmt_createcontextvirtual)
 #define LX_DXDESTROYCONTEXT		\
 	_IOWR(0x47, 0x05, struct d3dkmt_destroycontext)
+#define LX_DXCREATEALLOCATION		\
+	_IOWR(0x47, 0x06, struct d3dkmt_createallocation)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXDESTROYALLOCATION2		\
+	_IOWR(0x47, 0x13, struct d3dkmt_destroyallocation2)
 #define LX_DXENUMADAPTERS2		\
 	_IOWR(0x47, 0x14, struct d3dkmt_enumadapters2)
 #define LX_DXCLOSEADAPTER		\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (7 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 09/30] drivers: hv: dxgkrnl: Creation of compute device allocations and resources Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-02 13:25     ` Wei Liu
  2022-03-01 19:45   ` [PATCH v3 11/30] drivers: hv: dxgkrnl: Operations using " Iouri Tarassov
                     ` (20 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to create and destroy compute devicesync objects:
 - the LX_DXCREATESYNCHRONIZATIONOBJECT ioctl,
 - the LX_DXDESTROYSYNCHRONIZATIONOBJECT ioctl.

Compute device synchronization objects are used to synchronize
execution of compute device commands, which are queued to
different execution contexts (dxgcontext objects).

There are several types of sync objects (mutex, monitored
fence, CPU event, fence). A "signal" or a "wait" operation
could be queued to an execution context.

Monitored fence sync objects are particular important.
A monitored fence object has a fence value, which could be
monitored by the compute device or by CPU. Therefore, a CPU
virtual address is allocated during object creation to allow
an application to read the fence value. dxg_map_iospace and
dxg_unmap_iospace implement creation of the CPU virtual address.
This is done as follow:
- The host allocates a portion of the guest IO space, which is mapped
  to the actual fence value memory on the host
- The host returns the guest IO space address to the guest
- The guest allocates a CPU virtual address and updates page tables
  to point to the IO space address

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c | 182 +++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  80 +++++++++++++
 drivers/hv/dxgkrnl/dxgmodule.c  |   1 +
 drivers/hv/dxgkrnl/dxgprocess.c |  16 +++
 drivers/hv/dxgkrnl/dxgvmbus.c   | 199 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  20 ++++
 drivers/hv/dxgkrnl/ioctl.c      | 130 +++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  95 +++++++++++++++
 8 files changed, 723 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index 01dada36a463..71e3e1ce5c2b 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -165,6 +165,24 @@ void dxgadapter_remove_process(struct dxgprocess_adapter *process_info)
 	process_info->adapter_process_list_entry.prev = NULL;
 }
 
+void dxgadapter_add_syncobj(struct dxgadapter *adapter,
+			    struct dxgsyncobject *object)
+{
+	down_write(&adapter->shared_resource_list_lock);
+	list_add_tail(&object->syncobj_list_entry, &adapter->syncobj_list_head);
+	up_write(&adapter->shared_resource_list_lock);
+}
+
+void dxgadapter_remove_syncobj(struct dxgsyncobject *object)
+{
+	down_write(&object->adapter->shared_resource_list_lock);
+	if (object->syncobj_list_entry.next) {
+		list_del(&object->syncobj_list_entry);
+		object->syncobj_list_entry.next = NULL;
+	}
+	up_write(&object->adapter->shared_resource_list_lock);
+}
+
 int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter)
 {
 	down_write(&adapter->core_lock);
@@ -217,6 +235,7 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 		init_rwsem(&device->context_list_lock);
 		init_rwsem(&device->alloc_list_lock);
 		INIT_LIST_HEAD(&device->pqueue_list_head);
+		INIT_LIST_HEAD(&device->syncobj_list_head);
 		device->object_state = DXGOBJECTSTATE_CREATED;
 		device->execution_state = _D3DKMT_DEVICEEXECUTION_ACTIVE;
 
@@ -232,6 +251,7 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 void dxgdevice_stop(struct dxgdevice *device)
 {
 	struct dxgallocation *alloc;
+	struct dxgsyncobject *syncobj;
 
 	pr_debug("%s: %p", __func__, device);
 	dxgdevice_acquire_alloc_list_lock(device);
@@ -240,6 +260,12 @@ void dxgdevice_stop(struct dxgdevice *device)
 	}
 	dxgdevice_release_alloc_list_lock(device);
 
+	hmgrtable_lock(&device->process->handle_table, DXGLOCK_EXCL);
+	list_for_each_entry(syncobj, &device->syncobj_list_head,
+			    syncobj_list_entry) {
+		dxgsyncobject_stop(syncobj);
+	}
+	hmgrtable_unlock(&device->process->handle_table, DXGLOCK_EXCL);
 	pr_debug("%s: end %p\n", __func__, device);
 }
 
@@ -269,6 +295,20 @@ void dxgdevice_destroy(struct dxgdevice *device)
 
 	dxgdevice_acquire_alloc_list_lock(device);
 
+	while (!list_empty(&device->syncobj_list_head)) {
+		struct dxgsyncobject *syncobj =
+		    list_first_entry(&device->syncobj_list_head,
+				     struct dxgsyncobject,
+				     syncobj_list_entry);
+		list_del(&syncobj->syncobj_list_entry);
+		syncobj->syncobj_list_entry.next = NULL;
+		dxgdevice_release_alloc_list_lock(device);
+
+		dxgsyncobject_destroy(process, syncobj);
+
+		dxgdevice_acquire_alloc_list_lock(device);
+	}
+
 	{
 		struct dxgallocation *alloc;
 		struct dxgallocation *tmp;
@@ -570,6 +610,30 @@ void dxgdevice_release(struct kref *refcount)
 	vfree(device);
 }
 
+void dxgdevice_add_syncobj(struct dxgdevice *device,
+			   struct dxgsyncobject *syncobj)
+{
+	dxgdevice_acquire_alloc_list_lock(device);
+	list_add_tail(&syncobj->syncobj_list_entry, &device->syncobj_list_head);
+	kref_get(&syncobj->syncobj_kref);
+	dxgdevice_release_alloc_list_lock(device);
+}
+
+void dxgdevice_remove_syncobj(struct dxgsyncobject *entry)
+{
+	struct dxgdevice *device = entry->device;
+
+	dxgdevice_acquire_alloc_list_lock(device);
+	if (entry->syncobj_list_entry.next) {
+		list_del(&entry->syncobj_list_entry);
+		entry->syncobj_list_entry.next = NULL;
+		kref_put(&entry->syncobj_kref, dxgsyncobject_release);
+	}
+	dxgdevice_release_alloc_list_lock(device);
+	kref_put(&device->device_kref, dxgdevice_release);
+	entry->device = NULL;
+}
+
 struct dxgcontext *dxgcontext_create(struct dxgdevice *device)
 {
 	struct dxgcontext *context = vzalloc(sizeof(struct dxgcontext));
@@ -810,3 +874,121 @@ void dxgprocess_adapter_remove_device(struct dxgdevice *device)
 	}
 	mutex_unlock(&device->adapter_info->device_list_mutex);
 }
+
+struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
+					   struct dxgdevice *device,
+					   struct dxgadapter *adapter,
+					   enum
+					   d3dddi_synchronizationobject_type
+					   type,
+					   struct
+					   d3dddi_synchronizationobject_flags
+					   flags)
+{
+	struct dxgsyncobject *syncobj;
+
+	syncobj = vzalloc(sizeof(*syncobj));
+	if (syncobj == NULL)
+		goto cleanup;
+	syncobj->type = type;
+	syncobj->process = process;
+	switch (type) {
+	case _D3DDDI_MONITORED_FENCE:
+	case _D3DDDI_PERIODIC_MONITORED_FENCE:
+		syncobj->monitored_fence = 1;
+		break;
+	default:
+		break;
+	}
+	if (flags.shared) {
+		syncobj->shared = 1;
+		if (!flags.nt_security_sharing) {
+			pr_err("%s: nt_security_sharing must be set", __func__);
+			goto cleanup;
+		}
+	}
+
+	kref_init(&syncobj->syncobj_kref);
+
+	if (syncobj->monitored_fence) {
+		syncobj->device = device;
+		syncobj->device_handle = device->handle;
+		kref_get(&device->device_kref);
+		dxgdevice_add_syncobj(device, syncobj);
+	} else {
+		dxgadapter_add_syncobj(adapter, syncobj);
+	}
+	syncobj->adapter = adapter;
+	kref_get(&adapter->adapter_kref);
+
+	pr_debug("%s 0x%p\n", __func__, syncobj);
+	return syncobj;
+cleanup:
+	if (syncobj)
+		vfree(syncobj);
+	return NULL;
+}
+
+void dxgsyncobject_destroy(struct dxgprocess *process,
+			   struct dxgsyncobject *syncobj)
+{
+	int destroyed;
+
+	pr_debug("%s 0x%p", __func__, syncobj);
+
+	dxgsyncobject_stop(syncobj);
+
+	destroyed = test_and_set_bit(0, &syncobj->flags);
+	if (!destroyed) {
+		pr_debug("Deleting handle: %x", syncobj->handle.v);
+		hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+		if (syncobj->handle.v) {
+			hmgrtable_free_handle(&process->handle_table,
+					      HMGRENTRY_TYPE_DXGSYNCOBJECT,
+					      syncobj->handle);
+			syncobj->handle.v = 0;
+			kref_put(&syncobj->syncobj_kref, dxgsyncobject_release);
+		}
+		hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+		if (syncobj->monitored_fence)
+			dxgdevice_remove_syncobj(syncobj);
+		else
+			dxgadapter_remove_syncobj(syncobj);
+		if (syncobj->adapter) {
+			kref_put(&syncobj->adapter->adapter_kref,
+				 dxgadapter_release);
+			syncobj->adapter = NULL;
+		}
+	}
+	kref_put(&syncobj->syncobj_kref, dxgsyncobject_release);
+}
+
+void dxgsyncobject_stop(struct dxgsyncobject *syncobj)
+{
+	int stopped = test_and_set_bit(1, &syncobj->flags);
+
+	if (!stopped) {
+		pr_debug("stopping");
+		if (syncobj->monitored_fence) {
+			if (syncobj->mapped_address) {
+				int ret =
+				    dxg_unmap_iospace(syncobj->mapped_address,
+						      PAGE_SIZE);
+
+				(void)ret;
+				pr_debug("unmap fence %d %p\n",
+					ret, syncobj->mapped_address);
+				syncobj->mapped_address = NULL;
+			}
+		}
+	}
+}
+
+void dxgsyncobject_release(struct kref *refcount)
+{
+	struct dxgsyncobject *syncobj;
+
+	syncobj = container_of(refcount, struct dxgsyncobject, syncobj_kref);
+	vfree(syncobj);
+}
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 766f5214cc57..f0b0e1bdd897 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -33,6 +33,7 @@ struct dxgdevice;
 struct dxgcontext;
 struct dxgallocation;
 struct dxgresource;
+struct dxgsyncobject;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -85,6 +86,56 @@ int dxgvmbuschannel_init(struct dxgvmbuschannel *ch, struct hv_device *hdev);
 void dxgvmbuschannel_destroy(struct dxgvmbuschannel *ch);
 void dxgvmbuschannel_receive(void *ctx);
 
+/*
+ * This is GPU synchronization object, which is used to synchronize execution
+ * between GPU contextx/hardware queues or for tracking GPU execution progress.
+ * A dxgsyncobject is created when somebody creates a syncobject or opens a
+ * shared syncobject.
+ * A syncobject belongs to an adapter, unless it is a cross-adapter object.
+ * Cross adapter syncobjects are currently not implemented.
+ *
+ * D3DDDI_MONITORED_FENCE and D3DDDI_PERIODIC_MONITORED_FENCE are called
+ * "device" syncobject, because the belong to a device (dxgdevice).
+ * Device syncobjects are inserted to a list in dxgdevice.
+ *
+ */
+struct dxgsyncobject {
+	struct kref			syncobj_kref;
+	enum d3dddi_synchronizationobject_type	type;
+	/*
+	 * List entry in dxgdevice for device sync objects.
+	 * List entry in dxgadapter for other objects
+	 */
+	struct list_head		syncobj_list_entry;
+	/* Adapter, the syncobject belongs to. NULL for stopped sync obejcts. */
+	struct dxgadapter		*adapter;
+	/*
+	 * Pointer to the device, which was used to create the object.
+	 * This is NULL for non-device syncbjects
+	 */
+	struct dxgdevice		*device;
+	struct dxgprocess		*process;
+	/* CPU virtual address of the fence value for "device" syncobjects */
+	void				*mapped_address;
+	/* Handle in the process handle table */
+	struct d3dkmthandle		handle;
+	/* Cached handle of the device. Used to avoid device dereference. */
+	struct d3dkmthandle		device_handle;
+	union {
+		struct {
+			/* Must be the first bit */
+			u32		destroyed:1;
+			/* Must be the second bit */
+			u32		stopped:1;
+			/* device syncobject */
+			u32		monitored_fence:1;
+			u32		shared:1;
+			u32		reserved:27;
+		};
+		long			flags;
+	};
+};
+
 /*
  * The structure defines an offered vGPU vm bus channel.
  */
@@ -94,6 +145,20 @@ struct dxgvgpuchannel {
 	struct hv_device	*hdev;
 };
 
+struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
+					   struct dxgdevice *device,
+					   struct dxgadapter *adapter,
+					   enum
+					   d3dddi_synchronizationobject_type
+					   type,
+					   struct
+					   d3dddi_synchronizationobject_flags
+					   flags);
+void dxgsyncobject_destroy(struct dxgprocess *process,
+			   struct dxgsyncobject *syncobj);
+void dxgsyncobject_stop(struct dxgsyncobject *syncobj);
+void dxgsyncobject_release(struct kref *refcount);
+
 struct dxgglobal {
 	struct dxgvmbuschannel	channel;
 	struct delayed_work	dwork;
@@ -258,6 +323,8 @@ struct dxgadapter {
 	struct list_head	adapter_list_entry;
 	/* The list of dxgprocess_adapter entries */
 	struct list_head	adapter_process_list_head;
+	/* List of all non-device dxgsyncobject objects */
+	struct list_head	syncobj_list_head;
 	/* This lock protects shared resource and syncobject lists */
 	struct rw_semaphore	shared_resource_list_lock;
 	struct pci_dev		*pci_dev;
@@ -283,6 +350,9 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter);
 int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter);
 void dxgadapter_acquire_lock_forced(struct dxgadapter *adapter);
 void dxgadapter_release_lock_exclusive(struct dxgadapter *adapter);
+void dxgadapter_add_syncobj(struct dxgadapter *adapter,
+			    struct dxgsyncobject *so);
+void dxgadapter_remove_syncobj(struct dxgsyncobject *so);
 void dxgadapter_add_process(struct dxgadapter *adapter,
 			    struct dxgprocess_adapter *process_info);
 void dxgadapter_remove_process(struct dxgprocess_adapter *process_info);
@@ -312,6 +382,7 @@ struct dxgdevice {
 	struct list_head	resource_list_head;
 	/* List of paging queues. Protected by process handle table lock. */
 	struct list_head	pqueue_list_head;
+	struct list_head	syncobj_list_head;
 	struct d3dkmthandle	handle;
 	enum d3dkmt_deviceexecution_state execution_state;
 	u32			handle_valid;
@@ -332,6 +403,8 @@ void dxgdevice_remove_alloc_safe(struct dxgdevice *dev,
 				 struct dxgallocation *a);
 void dxgdevice_add_resource(struct dxgdevice *dev, struct dxgresource *res);
 void dxgdevice_remove_resource(struct dxgdevice *dev, struct dxgresource *res);
+void dxgdevice_add_syncobj(struct dxgdevice *dev, struct dxgsyncobject *so);
+void dxgdevice_remove_syncobj(struct dxgsyncobject *so);
 bool dxgdevice_is_active(struct dxgdevice *dev);
 void dxgdevice_acquire_context_list_lock(struct dxgdevice *dev);
 void dxgdevice_release_context_list_lock(struct dxgdevice *dev);
@@ -439,6 +512,7 @@ void init_ioctls(void);
 long dxgk_compat_ioctl(struct file *f, unsigned int p1, unsigned long p2);
 long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
 
+int dxg_unmap_iospace(void *va, u32 size);
 static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
 {
 	*luid = *(struct winluid *)&guid->b[0];
@@ -494,6 +568,12 @@ int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 int dxgvmb_send_destroy_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 				   struct d3dkmt_destroyallocation2 *args,
 				   struct d3dkmthandle *alloc_handles);
+int dxgvmb_send_create_sync_object(struct dxgprocess *pr,
+				   struct dxgadapter *adapter,
+				   struct d3dkmt_createsynchronizationobject2
+				   *args, struct dxgsyncobject *so);
+int dxgvmb_send_destroy_sync_object(struct dxgprocess *pr,
+				    struct d3dkmthandle h);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index eb8ef8a69f28..d64504a1cff0 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -158,6 +158,7 @@ int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
 	init_rwsem(&adapter->core_lock);
 
 	INIT_LIST_HEAD(&adapter->adapter_process_list_head);
+	INIT_LIST_HEAD(&adapter->syncobj_list_head);
 	init_rwsem(&adapter->shared_resource_list_lock);
 	adapter->pci_dev = dev;
 	guid_to_luid(guid, &adapter->luid);
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 1e6500e12a22..30af930cc8c0 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -59,6 +59,7 @@ void dxgprocess_destroy(struct dxgprocess *process)
 	enum hmgrentry_type t;
 	struct d3dkmthandle h;
 	void *o;
+	struct dxgsyncobject *syncobj;
 	struct dxgprocess_adapter *entry;
 	struct dxgprocess_adapter *tmp;
 
@@ -84,6 +85,21 @@ void dxgprocess_destroy(struct dxgprocess *process)
 		}
 	}
 
+	i = 0;
+	while (hmgrtable_next_entry(&process->handle_table, &i, &t, &h, &o)) {
+		switch (t) {
+		case HMGRENTRY_TYPE_DXGSYNCOBJECT:
+			pr_debug("Destroy syncobj: %p %d", o, i);
+			syncobj = o;
+			syncobj->handle.v = 0;
+			dxgsyncobject_destroy(process, syncobj);
+			break;
+		default:
+			pr_err("invalid entry in handle table %d", t);
+			break;
+		}
+	}
+
 	hmgrtable_destroy(&process->handle_table);
 	hmgrtable_destroy(&process->local_handle_table);
 }
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 8a82838d86db..067a60319a12 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -490,6 +490,86 @@ dxgvmb_send_sync_msg_ntstatus(struct dxgvmbuschannel *channel,
 	return ret;
 }
 
+static int check_iospace_address(unsigned long address, u32 size)
+{
+	if (address < dxgglobal->mmiospace_base ||
+	    size > dxgglobal->mmiospace_size ||
+	    address >= (dxgglobal->mmiospace_base +
+			dxgglobal->mmiospace_size - size)) {
+		pr_err("invalid iospace address %lx", address);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+int dxg_unmap_iospace(void *va, u32 size)
+{
+	int ret = 0;
+
+	pr_debug("%s %p %x", __func__, va, size);
+
+	/*
+	 * When an app calls exit(), dxgkrnl is called to close the device
+	 * with current->mm equal to NULL.
+	 */
+	if (current->mm) {
+		ret = vm_munmap((unsigned long)va, size);
+		if (ret) {
+			pr_err("vm_munmap failed %d", ret);
+			return -ENOTRECOVERABLE;
+		}
+	}
+	return 0;
+}
+
+static u8 *dxg_map_iospace(u64 iospace_address, u32 size,
+			   unsigned long protection, bool cached)
+{
+	struct vm_area_struct *vma;
+	unsigned long va;
+	int ret = 0;
+
+	pr_debug("%s: %llx %x %lx",
+		    __func__, iospace_address, size, protection);
+	if (check_iospace_address(iospace_address, size) < 0) {
+		pr_err("%s: invalid address", __func__);
+		return NULL;
+	}
+
+	va = vm_mmap(NULL, 0, size, protection, MAP_SHARED | MAP_ANONYMOUS, 0);
+	if ((long)va <= 0) {
+		pr_err("vm_mmap failed %lx %d", va, size);
+		return NULL;
+	}
+
+	mmap_read_lock(current->mm);
+	vma = find_vma(current->mm, (unsigned long)va);
+	if (vma) {
+		pgprot_t prot = vma->vm_page_prot;
+
+		if (!cached)
+			prot = pgprot_writecombine(prot);
+		pr_debug("vma: %lx %lx %lx",
+			    vma->vm_start, vma->vm_end, va);
+		vma->vm_pgoff = iospace_address >> PAGE_SHIFT;
+		ret = io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+					 size, prot);
+		if (ret)
+			pr_err("io_remap_pfn_range failed: %d", ret);
+	} else {
+		pr_err("failed to find vma: %p %lx", vma, va);
+		ret = -ENOMEM;
+	}
+	mmap_read_unlock(current->mm);
+
+	if (ret) {
+		dxg_unmap_iospace((void *)va, size);
+		return NULL;
+	}
+	pr_debug("%s end: %lx", __func__, va);
+	return (u8 *) va;
+}
+
 /*
  * Global messages to the host
  */
@@ -608,6 +688,39 @@ int dxgvmb_send_destroy_process(struct d3dkmthandle process)
 	return ret;
 }
 
+int dxgvmb_send_destroy_sync_object(struct dxgprocess *process,
+				    struct d3dkmthandle sync_object)
+{
+	struct dxgkvmb_command_destroysyncobject *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, NULL, process, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	command_vm_to_host_init2(&command->hdr,
+				 DXGK_VMBCOMMAND_DESTROYSYNCOBJECT,
+				 process->host_handle);
+	command->sync_object = sync_object;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(dxgglobal_get_dxgvmbuschannel(),
+					    msg.hdr, msg.size);
+
+	dxgglobal_release_channel_lock();
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 /*
  * Virtual GPU messages to the host
  */
@@ -1492,6 +1605,92 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 	return ret;
 }
 
+static void set_result(struct d3dkmt_createsynchronizationobject2 *args,
+		       u64 fence_gpu_va, u8 *va)
+{
+	args->info.periodic_monitored_fence.fence_gpu_virtual_address =
+	    fence_gpu_va;
+	args->info.periodic_monitored_fence.fence_cpu_virtual_address = va;
+}
+
+int
+dxgvmb_send_create_sync_object(struct dxgprocess *process,
+			       struct dxgadapter *adapter,
+			       struct d3dkmt_createsynchronizationobject2 *args,
+			       struct dxgsyncobject *syncobj)
+{
+	struct dxgkvmb_command_createsyncobject_return result = { };
+	struct dxgkvmb_command_createsyncobject *command;
+	int ret;
+	u8 *va = 0;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CREATESYNCOBJECT,
+				   process->host_handle);
+	command->args = *args;
+	command->client_hint = 1;	/* CLIENTHINT_UMD */
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+				   sizeof(result));
+	if (ret < 0) {
+		pr_err("%s failed %d", __func__, ret);
+		goto cleanup;
+	}
+	args->sync_object = result.sync_object;
+	if (syncobj->shared) {
+		if (result.global_sync_object.v == 0) {
+			pr_err("shared handle is 0");
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		args->info.shared_handle = result.global_sync_object;
+	}
+
+	if (syncobj->monitored_fence) {
+		va = dxg_map_iospace(result.fence_storage_address, PAGE_SIZE,
+				     PROT_READ | PROT_WRITE, true);
+		if (va == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		if (args->info.type == _D3DDDI_MONITORED_FENCE) {
+			args->info.monitored_fence.fence_gpu_virtual_address =
+			    result.fence_gpu_va;
+			args->info.monitored_fence.fence_cpu_virtual_address =
+			    va;
+			{
+				unsigned long value;
+
+				pr_debug("fence cpu va: %p", va);
+				ret = copy_from_user(&value, va,
+						     sizeof(u64));
+				if (ret) {
+					pr_err("failed to read fence");
+					ret = -EINVAL;
+				} else {
+					pr_debug("fence value:%lx",
+						value);
+				}
+			}
+		} else {
+			set_result(args, result.fence_gpu_va, va);
+		}
+		syncobj->mapped_address = va;
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 312ce049dbc2..a2cfdb832664 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -410,4 +410,24 @@ struct dxgkvmb_command_destroycontext {
 	struct d3dkmthandle	context;
 };
 
+struct dxgkvmb_command_createsyncobject {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_createsynchronizationobject2 args;
+	u32				client_hint;
+};
+
+struct dxgkvmb_command_createsyncobject_return {
+	struct d3dkmthandle	sync_object;
+	struct d3dkmthandle	global_sync_object;
+	u64			fence_gpu_va;
+	u64			fence_storage_address;
+	u32			fence_storage_offset;
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_destroysyncobject {
+	struct dxgkvmb_command_vm_to_host hdr;
+	struct d3dkmthandle	sync_object;
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 1f07a883debb..bc1ac2dd302f 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1368,6 +1368,132 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_createsynchronizationobject2 args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct dxgsyncobject *syncobj = NULL;
+	bool device_lock_acquired = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0)
+		goto cleanup;
+
+	device_lock_acquired = true;
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	syncobj = dxgsyncobject_create(process, device, adapter, args.info.type,
+				       args.info.flags);
+	if (syncobj == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_create_sync_object(process, adapter, &args, syncobj);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy output args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	ret = hmgrtable_assign_handle(&process->handle_table, syncobj,
+				      HMGRENTRY_TYPE_DXGSYNCOBJECT,
+				      args.sync_object);
+	if (ret >= 0)
+		syncobj->handle = args.sync_object;
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+cleanup:
+
+	if (ret < 0) {
+		if (syncobj) {
+			dxgsyncobject_destroy(process, syncobj);
+			if (args.sync_object.v)
+				dxgvmb_send_destroy_sync_object(process,
+							args.sync_object);
+		}
+	}
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device_lock_acquired)
+		dxgdevice_release_lock_shared(device);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_destroy_sync_object(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_destroysynchronizationobject args;
+	struct dxgsyncobject *syncobj = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	pr_debug("handle 0x%x", args.sync_object.v);
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	syncobj = hmgrtable_get_object_by_type(&process->handle_table,
+					       HMGRENTRY_TYPE_DXGSYNCOBJECT,
+					       args.sync_object);
+	if (syncobj) {
+		pr_debug("syncobj 0x%p", syncobj);
+		syncobj->handle.v = 0;
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGSYNCOBJECT,
+				      args.sync_object);
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (syncobj == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	dxgsyncobject_destroy(process, syncobj);
+
+	ret = dxgvmb_send_destroy_sync_object(process, args.sync_object);
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -1436,6 +1562,8 @@ void init_ioctls(void)
 		  LX_DXCREATEALLOCATION);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0x10 */ dxgk_create_sync_object,
+		  LX_DXCREATESYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x13 */ dxgk_destroy_allocation,
 		  LX_DXDESTROYALLOCATION2);
 	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
@@ -1444,6 +1572,8 @@ void init_ioctls(void)
 		  LX_DXCLOSEADAPTER);
 	SET_IOCTL(/*0x19 */ dxgk_destroy_device,
 		  LX_DXDESTROYDEVICE);
+	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
+		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
 }
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 676ca1db85c9..52d54aa9266e 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -252,6 +252,97 @@ enum d3dkmdt_standardallocationtype {
 	_D3DKMDT_STANDARDALLOCATION_GDISURFACE			= 4,
 };
 
+struct d3dddi_synchronizationobject_flags {
+	union {
+		struct {
+			__u32	shared:1;
+			__u32	nt_security_sharing:1;
+			__u32	cross_adapter:1;
+			__u32	top_of_pipeline:1;
+			__u32	no_signal:1;
+			__u32	no_wait:1;
+			__u32	no_signal_max_value_on_tdr:1;
+			__u32	no_gpu_access:1;
+			__u32	reserved:23;
+		};
+		__u32		value;
+	};
+};
+
+enum d3dddi_synchronizationobject_type {
+	_D3DDDI_SYNCHRONIZATION_MUTEX		= 1,
+	_D3DDDI_SEMAPHORE			= 2,
+	_D3DDDI_FENCE				= 3,
+	_D3DDDI_CPU_NOTIFICATION		= 4,
+	_D3DDDI_MONITORED_FENCE			= 5,
+	_D3DDDI_PERIODIC_MONITORED_FENCE	= 6,
+	_D3DDDI_SYNCHRONIZATION_TYPE_LIMIT
+};
+
+struct d3dddi_synchronizationobjectinfo2 {
+	enum d3dddi_synchronizationobject_type	type;
+	struct d3dddi_synchronizationobject_flags flags;
+	union {
+		struct {
+			__u32	initial_state;
+		} synchronization_mutex;
+
+		struct {
+			__u32			max_count;
+			__u32			initial_count;
+		} semaphore;
+
+		struct {
+			__u64		fence_value;
+		} fence;
+
+		struct {
+			__u64		event;
+		} cpu_notification;
+
+		struct {
+			__u64	initial_fence_value;
+#ifdef __KERNEL__
+			void	*fence_cpu_virtual_address;
+#else
+			__u64	*fence_cpu_virtual_address;
+#endif
+			__u64	fence_gpu_virtual_address;
+			__u32	engine_affinity;
+		} monitored_fence;
+
+		struct {
+			struct d3dkmthandle	adapter;
+			__u32			vidpn_target_id;
+			__u64			time;
+#ifdef __KERNEL__
+			void			*fence_cpu_virtual_address;
+#else
+			__u64			fence_cpu_virtual_address;
+#endif
+			__u64			fence_gpu_virtual_address;
+			__u32			engine_affinity;
+		} periodic_monitored_fence;
+
+		struct {
+			__u64	reserved[8];
+		} reserved;
+	};
+	struct d3dkmthandle			shared_handle;
+};
+
+struct d3dkmt_createsynchronizationobject2 {
+	struct d3dkmthandle				device;
+	__u32						reserved;
+	struct d3dddi_synchronizationobjectinfo2	info;
+	struct d3dkmthandle				sync_object;
+	__u32						reserved1;
+};
+
+struct d3dkmt_destroysynchronizationobject {
+	struct d3dkmthandle	sync_object;
+};
+
 enum d3dkmt_standardallocationtype {
 	_D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP	= 1,
 	_D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER	= 2,
@@ -479,6 +570,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x06, struct d3dkmt_createallocation)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXCREATESYNCHRONIZATIONOBJECT \
+	_IOWR(0x47, 0x10, struct d3dkmt_createsynchronizationobject2)
 #define LX_DXDESTROYALLOCATION2		\
 	_IOWR(0x47, 0x13, struct d3dkmt_destroyallocation2)
 #define LX_DXENUMADAPTERS2		\
@@ -487,6 +580,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x15, struct d3dkmt_closeadapter)
 #define LX_DXDESTROYDEVICE		\
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
+#define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
+	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 11/30] drivers: hv: dxgkrnl: Operations using sync objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (8 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-01 19:45   ` [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects Iouri Tarassov
                     ` (19 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to submit operations with compute device
sync objects:
  - the LX_DXSIGNALSYNCHRONIZATIONOBJECT ioctl.
    The ioctl is used to submit a signal to a sync object.
  - the LX_DXWAITFORSYNCHRONIZATIONOBJECT ioctl.
    The ioctl is used to submit a wait for a sync object
  - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU ioctl
    The ioctl is used to signal to a monitored fence sync object
    from a CPU thread.
  - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU ioctl.
    The ioctl is used to submit a signal to a monitored fence
    sync object..
  - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2 ioctl.
    The ioctl is used to submit a signal to a monitored fence
    sync object.
  - the LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU ioctl.
    The ioctl is used to submit a wait for a monitored fence
    sync object.

Compute device synchronization objects are used to synchronize
execution of DMA buffers between different execution contexts.
Operations with sync objects include "signal" and "wait". A wait
for a sync object is satisfied when the sync object is signaled.

A signal operation could be submitted to a compute device context or
the sync object could be signaled by a CPU thread.

To improve performance, submitting operations to the host is done
asynchronously when the host supports it.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  21 +
 drivers/hv/dxgkrnl/dxgkrnl.h    |  62 +++
 drivers/hv/dxgkrnl/dxgmodule.c  |  97 +++++
 drivers/hv/dxgkrnl/dxgvmbus.c   | 213 ++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  48 +++
 drivers/hv/dxgkrnl/ioctl.c      | 693 ++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/misc.h       |   1 +
 include/uapi/misc/d3dkmthk.h    | 159 ++++++++
 8 files changed, 1294 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index 71e3e1ce5c2b..c001f58a30db 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -897,6 +897,12 @@ struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
 	case _D3DDDI_PERIODIC_MONITORED_FENCE:
 		syncobj->monitored_fence = 1;
 		break;
+	case _D3DDDI_CPU_NOTIFICATION:
+		syncobj->cpu_event = 1;
+		syncobj->host_event = vzalloc(sizeof(struct dxghostevent));
+		if (syncobj->host_event == NULL)
+			goto cleanup;
+		break;
 	default:
 		break;
 	}
@@ -924,6 +930,8 @@ struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
 	pr_debug("%s 0x%p\n", __func__, syncobj);
 	return syncobj;
 cleanup:
+	if (syncobj->host_event)
+		vfree(syncobj->host_event);
 	if (syncobj)
 		vfree(syncobj);
 	return NULL;
@@ -933,6 +941,7 @@ void dxgsyncobject_destroy(struct dxgprocess *process,
 			   struct dxgsyncobject *syncobj)
 {
 	int destroyed;
+	struct dxghosteventcpu *host_event;
 
 	pr_debug("%s 0x%p", __func__, syncobj);
 
@@ -951,6 +960,16 @@ void dxgsyncobject_destroy(struct dxgprocess *process,
 		}
 		hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
 
+		if (syncobj->cpu_event) {
+			host_event = syncobj->host_event;
+			if (host_event->cpu_event) {
+				eventfd_ctx_put(host_event->cpu_event);
+				if (host_event->hdr.event_id)
+					dxgglobal_remove_host_event(
+						&host_event->hdr);
+				host_event->cpu_event = NULL;
+			}
+		}
 		if (syncobj->monitored_fence)
 			dxgdevice_remove_syncobj(syncobj);
 		else
@@ -990,5 +1009,7 @@ void dxgsyncobject_release(struct kref *refcount)
 	struct dxgsyncobject *syncobj;
 
 	syncobj = container_of(refcount, struct dxgsyncobject, syncobj_kref);
+	if (syncobj->host_event)
+		vfree(syncobj->host_event);
 	vfree(syncobj);
 }
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index f0b0e1bdd897..cb9096bd7a12 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -86,6 +86,29 @@ int dxgvmbuschannel_init(struct dxgvmbuschannel *ch, struct hv_device *hdev);
 void dxgvmbuschannel_destroy(struct dxgvmbuschannel *ch);
 void dxgvmbuschannel_receive(void *ctx);
 
+/*
+ * The structure describes an event, which will be signaled by
+ * a message from host.
+ */
+enum dxghosteventtype {
+	dxghostevent_cpu_event = 1,
+};
+
+struct dxghostevent {
+	struct list_head	host_event_list_entry;
+	u64			event_id;
+	enum dxghosteventtype	event_type;
+};
+
+struct dxghosteventcpu {
+	struct dxghostevent	hdr;
+	struct dxgprocess	*process;
+	struct eventfd_ctx	*cpu_event;
+	struct completion	*completion_event;
+	bool			destroy_after_signal;
+	bool			remove_from_list;
+};
+
 /*
  * This is GPU synchronization object, which is used to synchronize execution
  * between GPU contextx/hardware queues or for tracking GPU execution progress.
@@ -115,6 +138,8 @@ struct dxgsyncobject {
 	 */
 	struct dxgdevice		*device;
 	struct dxgprocess		*process;
+	/* Used by D3DDDI_CPU_NOTIFICATION objects */
+	struct dxghosteventcpu		*host_event;
 	/* CPU virtual address of the fence value for "device" syncobjects */
 	void				*mapped_address;
 	/* Handle in the process handle table */
@@ -129,6 +154,7 @@ struct dxgsyncobject {
 			u32		stopped:1;
 			/* device syncobject */
 			u32		monitored_fence:1;
+			u32		cpu_event:1;
 			u32		shared:1;
 			u32		reserved:27;
 		};
@@ -195,6 +221,11 @@ struct dxgglobal {
 	/* protects the dxgprocess_adapter lists */
 	struct mutex		process_adapter_mutex;
 
+	/*  list of events, waiting to be signaled by the host */
+	struct list_head	host_event_list_head;
+	spinlock_t		host_event_list_mutex;
+	atomic64_t		host_event_id;
+
 	bool			dxg_dev_initialized;
 	bool			vmbus_registered;
 	bool			pci_registered;
@@ -214,6 +245,11 @@ struct vmbus_channel *dxgglobal_get_vmbus(void);
 struct dxgvmbuschannel *dxgglobal_get_dxgvmbuschannel(void);
 void dxgglobal_acquire_process_adapter_lock(void);
 void dxgglobal_release_process_adapter_lock(void);
+void dxgglobal_add_host_event(struct dxghostevent *hostevent);
+void dxgglobal_remove_host_event(struct dxghostevent *hostevent);
+u64 dxgglobal_new_host_event_id(void);
+void dxgglobal_signal_host_event(u64 event_id);
+struct dxghostevent *dxgglobal_get_host_event(u64 event_id);
 int dxgglobal_acquire_channel_lock(void);
 void dxgglobal_release_channel_lock(void);
 
@@ -574,6 +610,31 @@ int dxgvmb_send_create_sync_object(struct dxgprocess *pr,
 				   *args, struct dxgsyncobject *so);
 int dxgvmb_send_destroy_sync_object(struct dxgprocess *pr,
 				    struct d3dkmthandle h);
+int dxgvmb_send_signal_sync_object(struct dxgprocess *process,
+				   struct dxgadapter *adapter,
+				   struct d3dddicb_signalflags flags,
+				   u64 legacy_fence_value,
+				   struct d3dkmthandle context,
+				   u32 object_count,
+				   struct d3dkmthandle *object,
+				   u32 context_count,
+				   struct d3dkmthandle *contexts,
+				   u32 fence_count, u64 *fences,
+				   struct eventfd_ctx *cpu_event,
+				   struct d3dkmthandle device);
+int dxgvmb_send_wait_sync_object_gpu(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmthandle context,
+				     u32 object_count,
+				     struct d3dkmthandle *objects,
+				     u64 *fences,
+				     bool legacy_fence);
+int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct
+				     d3dkmt_waitforsynchronizationobjectfromcpu
+				     *args,
+				     u64 cpu_event);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
@@ -589,6 +650,7 @@ int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
 			  void *command,
 			  u32 cmd_size);
 
+void signal_host_cpu_event(struct dxghostevent *eventhdr);
 int ntstatus2int(struct ntstatus status);
 
 #endif
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index d64504a1cff0..80af0de4aa37 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -124,6 +124,98 @@ static struct dxgadapter *find_adapter(struct winluid *luid)
 	return adapter;
 }
 
+void dxgglobal_add_host_event(struct dxghostevent *event)
+{
+	spin_lock_irq(&dxgglobal->host_event_list_mutex);
+	list_add_tail(&event->host_event_list_entry,
+		      &dxgglobal->host_event_list_head);
+	spin_unlock_irq(&dxgglobal->host_event_list_mutex);
+}
+
+void dxgglobal_remove_host_event(struct dxghostevent *event)
+{
+	spin_lock_irq(&dxgglobal->host_event_list_mutex);
+	if (event->host_event_list_entry.next != NULL) {
+		list_del(&event->host_event_list_entry);
+		event->host_event_list_entry.next = NULL;
+	}
+	spin_unlock_irq(&dxgglobal->host_event_list_mutex);
+}
+
+void signal_host_cpu_event(struct dxghostevent *eventhdr)
+{
+	struct  dxghosteventcpu *event = (struct  dxghosteventcpu *)eventhdr;
+
+	if (event->remove_from_list ||
+		event->destroy_after_signal) {
+		list_del(&eventhdr->host_event_list_entry);
+		eventhdr->host_event_list_entry.next = NULL;
+	}
+	if (event->cpu_event) {
+		pr_debug("signal cpu event\n");
+		eventfd_signal(event->cpu_event, 1);
+		if (event->destroy_after_signal)
+			eventfd_ctx_put(event->cpu_event);
+	} else {
+		pr_debug("signal completion\n");
+		complete(event->completion_event);
+	}
+	if (event->destroy_after_signal) {
+		pr_debug("destroying event %p\n",
+			event);
+		vfree(event);
+	}
+}
+
+void dxgglobal_signal_host_event(u64 event_id)
+{
+	struct dxghostevent *event;
+	unsigned long flags;
+
+	pr_debug("%s %lld\n", __func__, event_id);
+
+	spin_lock_irqsave(&dxgglobal->host_event_list_mutex, flags);
+	list_for_each_entry(event, &dxgglobal->host_event_list_head,
+			    host_event_list_entry) {
+		if (event->event_id == event_id) {
+			pr_debug("found event to signal %lld\n",
+				    event_id);
+			if (event->event_type == dxghostevent_cpu_event)
+				signal_host_cpu_event(event);
+			else
+				pr_err("Unknown host event type");
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&dxgglobal->host_event_list_mutex, flags);
+	pr_debug("dxgglobal_signal_host_event_end %lld\n",
+		event_id);
+}
+
+struct dxghostevent *dxgglobal_get_host_event(u64 event_id)
+{
+	struct dxghostevent *entry;
+	struct dxghostevent *event = NULL;
+
+	spin_lock_irq(&dxgglobal->host_event_list_mutex);
+	list_for_each_entry(entry, &dxgglobal->host_event_list_head,
+			    host_event_list_entry) {
+		if (entry->event_id == event_id) {
+			list_del(&entry->host_event_list_entry);
+			entry->host_event_list_entry.next = NULL;
+			event = entry;
+			break;
+		}
+	}
+	spin_unlock_irq(&dxgglobal->host_event_list_mutex);
+	return event;
+}
+
+u64 dxgglobal_new_host_event_id(void)
+{
+	return atomic64_inc_return(&dxgglobal->host_event_id);
+}
+
 void dxgglobal_acquire_process_adapter_lock(void)
 {
 	mutex_lock(&dxgglobal->process_adapter_mutex);
@@ -745,6 +837,11 @@ static int dxgglobal_create(void)
 
 	init_rwsem(&dxgglobal->channel_lock);
 
+	INIT_LIST_HEAD(&dxgglobal->host_event_list_head);
+	spin_lock_init(&dxgglobal->host_event_list_mutex);
+	atomic64_set(&dxgglobal->host_event_id, 1);
+
+	pr_debug("dxgglobal_init end\n");
 	pr_debug("dxgglobal_init end\n");
 	return ret;
 }
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 067a60319a12..a6979b4ae955 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -276,6 +276,22 @@ static void command_vm_to_host_init1(struct dxgkvmb_command_vm_to_host *command,
 	command->channel_type = DXGKVMB_VM_TO_HOST;
 }
 
+static void signal_guest_event(struct dxgkvmb_command_host_to_vm *packet,
+			       u32 packet_length)
+{
+	struct dxgkvmb_command_signalguestevent *command = (void *)packet;
+
+	if (packet_length < sizeof(struct dxgkvmb_command_signalguestevent)) {
+		pr_err("invalid packet size");
+		return;
+	}
+	if (command->event == 0) {
+		pr_err("invalid event pointer");
+		return;
+	}
+	dxgglobal_signal_host_event(command->event);
+}
+
 static void process_inband_packet(struct dxgvmbuschannel *channel,
 				  struct vmpacket_descriptor *desc)
 {
@@ -292,6 +308,7 @@ static void process_inband_packet(struct dxgvmbuschannel *channel,
 			switch (packet->command_type) {
 			case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
 			case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
+				signal_guest_event(packet, packet_length);
 				break;
 			case DXGK_VMBCOMMAND_SENDWNFNOTIFICATION:
 				break;
@@ -1691,6 +1708,202 @@ dxgvmb_send_create_sync_object(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_signal_sync_object(struct dxgprocess *process,
+				   struct dxgadapter *adapter,
+				   struct d3dddicb_signalflags flags,
+				   u64 legacy_fence_value,
+				   struct d3dkmthandle context,
+				   u32 object_count,
+				   struct d3dkmthandle __user *objects,
+				   u32 context_count,
+				   struct d3dkmthandle __user *contexts,
+				   u32 fence_count,
+				   u64 __user *fences,
+				   struct eventfd_ctx *cpu_event_handle,
+				   struct d3dkmthandle device)
+{
+	int ret;
+	struct dxgkvmb_command_signalsyncobject *command;
+	u32 object_size = object_count * sizeof(struct d3dkmthandle);
+	u32 context_size = context_count * sizeof(struct d3dkmthandle);
+	u32 fence_size = fences ? fence_count * sizeof(u64) : 0;
+	u8 *current_pos;
+	u32 cmd_size = sizeof(struct dxgkvmb_command_signalsyncobject) +
+	    object_size + context_size + fence_size;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (context.v)
+		cmd_size += sizeof(struct d3dkmthandle);
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_SIGNALSYNCOBJECT,
+				   process->host_handle);
+
+	if (flags.enqueue_cpu_event)
+		command->cpu_event_handle = (u64) cpu_event_handle;
+	else
+		command->device = device;
+	command->flags = flags;
+	command->fence_value = legacy_fence_value;
+	command->object_count = object_count;
+	command->context_count = context_count;
+	current_pos = (u8 *) &command[1];
+	ret = copy_from_user(current_pos, objects, object_size);
+	if (ret) {
+		pr_err("Failed to read objects %p %d",
+			objects, object_size);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	current_pos += object_size;
+	if (context.v) {
+		command->context_count++;
+		*(struct d3dkmthandle *) current_pos = context;
+		current_pos += sizeof(struct d3dkmthandle);
+	}
+	if (context_size) {
+		ret = copy_from_user(current_pos, contexts, context_size);
+		if (ret) {
+			pr_err("Failed to read contexts %p %d",
+				contexts, context_size);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		current_pos += context_size;
+	}
+	if (fence_size) {
+		ret = copy_from_user(current_pos, fences, fence_size);
+		if (ret) {
+			pr_err("Failed to read fences %p %d",
+				fences, fence_size);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	if (dxgglobal->async_msg_enabled) {
+		command->hdr.async_msg = 1;
+		ret = dxgvmb_send_async_msg(msg.channel, msg.hdr, msg.size);
+	} else {
+		ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						    msg.size);
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct
+				     d3dkmt_waitforsynchronizationobjectfromcpu
+				     *args,
+				     u64 cpu_event)
+{
+	int ret = -EINVAL;
+	struct dxgkvmb_command_waitforsyncobjectfromcpu *command;
+	u32 object_size = args->object_count * sizeof(struct d3dkmthandle);
+	u32 fence_size = args->object_count * sizeof(u64);
+	u8 *current_pos;
+	u32 cmd_size = sizeof(*command) + object_size + fence_size;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_WAITFORSYNCOBJECTFROMCPU,
+				   process->host_handle);
+	command->device = args->device;
+	command->flags = args->flags;
+	command->object_count = args->object_count;
+	command->guest_event_pointer = (u64) cpu_event;
+	current_pos = (u8 *) &command[1];
+	ret = copy_from_user(current_pos, args->objects, object_size);
+	if (ret) {
+		pr_err("%s failed to copy objects", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	current_pos += object_size;
+	ret = copy_from_user(current_pos, args->fence_values, fence_size);
+	if (ret) {
+		pr_err("%s failed to copy fences", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_wait_sync_object_gpu(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmthandle context,
+				     u32 object_count,
+				     struct d3dkmthandle *objects,
+				     u64 *fences,
+				     bool legacy_fence)
+{
+	int ret;
+	struct dxgkvmb_command_waitforsyncobjectfromgpu *command;
+	u32 fence_size = object_count * sizeof(u64);
+	u32 object_size = object_count * sizeof(struct d3dkmthandle);
+	u8 *current_pos;
+	u32 cmd_size = object_size + fence_size - sizeof(u64) +
+	    sizeof(struct dxgkvmb_command_waitforsyncobjectfromgpu);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (object_count == 0 || object_count > D3DDDI_MAX_OBJECT_WAITED_ON) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_WAITFORSYNCOBJECTFROMGPU,
+				   process->host_handle);
+	command->context = context;
+	command->object_count = object_count;
+	command->legacy_fence_object = legacy_fence;
+	current_pos = (u8 *) command->fence_values;
+	memcpy(current_pos, fences, fence_size);
+	current_pos += fence_size;
+	memcpy(current_pos, objects, object_size);
+
+	if (dxgglobal->async_msg_enabled) {
+		command->hdr.async_msg = 1;
+		ret = dxgvmb_send_async_msg(msg.channel, msg.hdr, msg.size);
+	} else {
+		ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						    msg.size);
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index a2cfdb832664..da63f89ad822 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -165,6 +165,13 @@ struct dxgkvmb_command_host_to_vm {
 	enum dxgkvmb_commandtype_host_to_vm	command_type;
 };
 
+struct dxgkvmb_command_signalguestevent {
+	struct dxgkvmb_command_host_to_vm hdr;
+	u64				event;
+	u64				process_id;
+	bool				dereference_event;
+};
+
 /* Returns ntstatus */
 struct dxgkvmb_command_setiospaceregion {
 	struct dxgkvmb_command_vm_to_host hdr;
@@ -430,4 +437,45 @@ struct dxgkvmb_command_destroysyncobject {
 	struct d3dkmthandle	sync_object;
 };
 
+/* The command returns ntstatus */
+struct dxgkvmb_command_signalsyncobject {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	u32				object_count;
+	struct d3dddicb_signalflags	flags;
+	u32				context_count;
+	u64				fence_value;
+	union {
+		/* Pointer to the guest event object */
+		u64			cpu_event_handle;
+		/* Non zero when signal from CPU is done */
+		struct d3dkmthandle		device;
+	};
+	/* struct d3dkmthandle ObjectHandleArray[object_count] */
+	/* struct d3dkmthandle ContextArray[context_count]     */
+	/* u64 MonitoredFenceValueArray[object_count] */
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_waitforsyncobjectfromcpu {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	u32				object_count;
+	struct d3dddi_waitforsynchronizationobjectfromcpu_flags flags;
+	u64				guest_event_pointer;
+	bool				dereference_event;
+	/* struct d3dkmthandle ObjectHandleArray[object_count] */
+	/* u64 FenceValueArray [object_count] */
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_waitforsyncobjectfromgpu {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		context;
+	/* Must be 1 when bLegacyFenceObject is TRUE */
+	u32				object_count;
+	bool				legacy_fence_object;
+	u64				fence_values[1];
+	/* struct d3dkmthandle ObjectHandles[object_count] */
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index bc1ac2dd302f..b945da0da55a 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1375,8 +1375,10 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 	struct d3dkmt_createsynchronizationobject2 args;
 	struct dxgdevice *device = NULL;
 	struct dxgadapter *adapter = NULL;
+	struct eventfd_ctx *event = NULL;
 	struct dxgsyncobject *syncobj = NULL;
 	bool device_lock_acquired = false;
+	struct dxghosteventcpu *host_event = NULL;
 
 	ret = copy_from_user(&args, inargs, sizeof(args));
 	if (ret) {
@@ -1411,6 +1413,27 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 		goto cleanup;
 	}
 
+	if (args.info.type == _D3DDDI_CPU_NOTIFICATION) {
+		event = eventfd_ctx_fdget((int)
+					  args.info.cpu_notification.event);
+		if (IS_ERR(event)) {
+			pr_err("failed to reference the event");
+			event = NULL;
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		host_event = syncobj->host_event;
+		host_event->hdr.event_id = dxgglobal_new_host_event_id();
+		host_event->cpu_event = event;
+		host_event->remove_from_list = false;
+		host_event->destroy_after_signal = false;
+		host_event->hdr.event_type = dxghostevent_cpu_event;
+		dxgglobal_add_host_event(&host_event->hdr);
+		args.info.cpu_notification.event = host_event->hdr.event_id;
+		pr_debug("creating CPU notification event: %lld",
+			args.info.cpu_notification.event);
+	}
+
 	ret = dxgvmb_send_create_sync_object(process, adapter, &args, syncobj);
 	if (ret < 0)
 		goto cleanup;
@@ -1438,7 +1461,10 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 			if (args.sync_object.v)
 				dxgvmb_send_destroy_sync_object(process,
 							args.sync_object);
+			event = NULL;
 		}
+		if (event)
+			eventfd_ctx_put(event);
 	}
 	if (adapter)
 		dxgadapter_release_lock_shared(adapter);
@@ -1494,6 +1520,659 @@ dxgk_destroy_sync_object(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_signal_sync_object(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_signalsynchronizationobject2 args;
+	struct d3dkmt_signalsynchronizationobject2 *__user in_args = inargs;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int ret;
+	u32 fence_count = 1;
+	struct eventfd_ctx *event = NULL;
+	struct dxghosteventcpu *host_event = NULL;
+	bool host_event_added = false;
+	u64 host_event_id = 0;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.context_count >= D3DDDI_MAX_BROADCAST_CONTEXT ||
+	    args.object_count > D3DDDI_MAX_OBJECT_SIGNALED) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.flags.enqueue_cpu_event) {
+		host_event = vzalloc(sizeof(*host_event));
+		if (host_event == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		host_event->process = process;
+		event = eventfd_ctx_fdget((int)args.cpu_event_handle);
+		if (IS_ERR(event)) {
+			pr_err("failed to reference the event");
+			event = NULL;
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		fence_count = 0;
+		host_event->cpu_event = event;
+		host_event_id = dxgglobal_new_host_event_id();
+		host_event->hdr.event_type = dxghostevent_cpu_event;
+		host_event->hdr.event_id = host_event_id;
+		host_event->remove_from_list = true;
+		host_event->destroy_after_signal = true;
+		dxgglobal_add_host_event(&host_event->hdr);
+		host_event_added = true;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    args.context);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_signal_sync_object(process, adapter,
+					     args.flags, args.fence.fence_value,
+					     args.context, args.object_count,
+					     in_args->object_array,
+					     args.context_count,
+					     in_args->contexts, fence_count,
+					     NULL, (void *)host_event_id,
+					     zerohandle);
+
+	/*
+	 * When the send operation succeeds, the host event will be destroyed
+	 * after signal from the host
+	 */
+
+cleanup:
+
+	if (ret < 0) {
+		if (host_event_added) {
+			/* The event might be signaled and destroyed by host */
+			host_event = (struct dxghosteventcpu *)
+				dxgglobal_get_host_event(host_event_id);
+			if (host_event) {
+				eventfd_ctx_put(event);
+				event = NULL;
+				vfree(host_event);
+				host_event = NULL;
+			}
+		}
+		if (event)
+			eventfd_ctx_put(event);
+		if (host_event)
+			vfree(host_event);
+	}
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_signal_sync_object_cpu(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_signalsynchronizationobjectfromcpu args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (args.object_count == 0 ||
+	    args.object_count > D3DDDI_MAX_OBJECT_SIGNALED) {
+		pr_debug("Too many objects: %d",
+			args.object_count);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_signal_sync_object(process, adapter,
+					     args.flags, 0, zerohandle,
+					     args.object_count, args.objects, 0,
+					     NULL, args.object_count,
+					     args.fence_values, NULL,
+					     args.device);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_signal_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_signalsynchronizationobjectfromgpu args;
+	struct d3dkmt_signalsynchronizationobjectfromgpu *__user user_args =
+	    inargs;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dddicb_signalflags flags = { };
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count == 0 ||
+	    args.object_count > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    args.context);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_signal_sync_object(process, adapter,
+					     flags, 0, zerohandle,
+					     args.object_count,
+					     args.objects, 1,
+					     &user_args->context,
+					     args.object_count,
+					     args.monitored_fence_values, NULL,
+					     zerohandle);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_signal_sync_object_gpu2(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_signalsynchronizationobjectfromgpu2 args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dkmthandle context_handle;
+	struct eventfd_ctx *event = NULL;
+	u64 *fences = NULL;
+	u32 fence_count = 0;
+	int ret;
+	struct dxghosteventcpu *host_event = NULL;
+	bool host_event_added = false;
+	u64 host_event_id = 0;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.flags.enqueue_cpu_event) {
+		if (args.object_count != 0 || args.cpu_event_handle == 0) {
+			pr_err("Bad input for EnqueueCpuEvent: %d %lld",
+				   args.object_count, args.cpu_event_handle);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	} else if (args.object_count == 0 ||
+		   args.object_count > DXG_MAX_VM_BUS_PACKET_SIZE ||
+		   args.context_count == 0 ||
+		   args.context_count > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("Invalid input: %d %d",
+			   args.object_count, args.context_count);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = copy_from_user(&context_handle, args.contexts,
+			     sizeof(struct d3dkmthandle));
+	if (ret) {
+		pr_err("%s failed to copy context handle", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.flags.enqueue_cpu_event) {
+		host_event = vzalloc(sizeof(*host_event));
+		if (host_event == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		host_event->process = process;
+		event = eventfd_ctx_fdget((int)args.cpu_event_handle);
+		if (IS_ERR(event)) {
+			pr_err("failed to reference the event");
+			event = NULL;
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		fence_count = 0;
+		host_event->cpu_event = event;
+		host_event_id = dxgglobal_new_host_event_id();
+		host_event->hdr.event_id = host_event_id;
+		host_event->hdr.event_type = dxghostevent_cpu_event;
+		host_event->remove_from_list = true;
+		host_event->destroy_after_signal = true;
+		dxgglobal_add_host_event(&host_event->hdr);
+		host_event_added = true;
+	} else {
+		fences = args.monitored_fence_values;
+		fence_count = args.object_count;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    context_handle);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_signal_sync_object(process, adapter,
+					     args.flags, 0, zerohandle,
+					     args.object_count, args.objects,
+					     args.context_count, args.contexts,
+					     fence_count, fences,
+					     (void *)host_event_id, zerohandle);
+
+cleanup:
+
+	if (ret < 0) {
+		if (host_event_added) {
+			/* The event might be signaled and destroyed by host */
+			host_event = (struct dxghosteventcpu *)
+				dxgglobal_get_host_event(host_event_id);
+			if (host_event) {
+				eventfd_ctx_put(event);
+				event = NULL;
+				vfree(host_event);
+				host_event = NULL;
+			}
+		}
+		if (event)
+			eventfd_ctx_put(event);
+		if (host_event)
+			vfree(host_event);
+	}
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_wait_sync_object(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_waitforsynchronizationobject2 args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count > D3DDDI_MAX_OBJECT_WAITED_ON ||
+	    args.object_count == 0) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    args.context);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	pr_debug("Fence value: %lld", args.fence.fence_value);
+	ret = dxgvmb_send_wait_sync_object_gpu(process, adapter,
+					       args.context, args.object_count,
+					       args.object_array,
+					       &args.fence.fence_value, true);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_wait_sync_object_cpu(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_waitforsynchronizationobjectfromcpu args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct eventfd_ctx *event = NULL;
+	struct dxghosteventcpu host_event = { };
+	struct dxghosteventcpu *async_host_event = NULL;
+	struct completion local_event = { };
+	u64 event_id = 0;
+	int ret;
+	bool host_event_added = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count > DXG_MAX_VM_BUS_PACKET_SIZE ||
+	    args.object_count == 0) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.async_event) {
+		async_host_event = vzalloc(sizeof(*async_host_event));
+		if (async_host_event == NULL) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		async_host_event->process = process;
+		event = eventfd_ctx_fdget((int)args.async_event);
+		if (IS_ERR(event)) {
+			pr_err("failed to reference the event");
+			event = NULL;
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		async_host_event->cpu_event = event;
+		async_host_event->hdr.event_id = dxgglobal_new_host_event_id();
+		async_host_event->destroy_after_signal = true;
+		async_host_event->hdr.event_type = dxghostevent_cpu_event;
+		dxgglobal_add_host_event(&async_host_event->hdr);
+		event_id = async_host_event->hdr.event_id;
+		host_event_added = true;
+	} else {
+		init_completion(&local_event);
+		host_event.completion_event = &local_event;
+		host_event.hdr.event_id = dxgglobal_new_host_event_id();
+		host_event.hdr.event_type = dxghostevent_cpu_event;
+		dxgglobal_add_host_event(&host_event.hdr);
+		event_id = host_event.hdr.event_id;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_wait_sync_object_cpu(process, adapter,
+					       &args, event_id);
+	if (ret < 0)
+		goto cleanup;
+
+	if (args.async_event == 0) {
+		dxgadapter_release_lock_shared(adapter);
+		adapter = NULL;
+		ret = wait_for_completion_interruptible(&local_event);
+		if (ret) {
+			pr_err("%s: wait_completion_interruptible failed: %d",
+			       __func__, ret);
+			ret = -ERESTARTSYS;
+		}
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	if (host_event.hdr.event_id)
+		dxgglobal_remove_host_event(&host_event.hdr);
+	if (ret < 0) {
+		if (host_event_added) {
+			async_host_event = (struct dxghosteventcpu *)
+				dxgglobal_get_host_event(event_id);
+			if (async_host_event) {
+				if (async_host_event->hdr.event_type ==
+				    dxghostevent_cpu_event) {
+					eventfd_ctx_put(event);
+					event = NULL;
+					vfree(async_host_event);
+					async_host_event = NULL;
+				} else {
+					pr_err("Invalid event type");
+					DXGKRNL_ASSERT(0);
+				}
+			}
+		}
+		if (event)
+			eventfd_ctx_put(event);
+		if (async_host_event)
+			vfree(async_host_event);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_waitforsynchronizationobjectfromgpu args;
+	struct dxgcontext *context = NULL;
+	struct d3dkmthandle device_handle = {};
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct dxgsyncobject *syncobj = NULL;
+	struct d3dkmthandle *objects = NULL;
+	u32 object_size;
+	u64 *fences = NULL;
+	int ret;
+	enum hmgrentry_type syncobj_type = HMGRENTRY_TYPE_FREE;
+	bool monitored_fence = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count > DXG_MAX_VM_BUS_PACKET_SIZE ||
+	    args.object_count == 0) {
+		pr_err("Invalid object count: %d", args.object_count);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	object_size = sizeof(struct d3dkmthandle) * args.object_count;
+	objects = vzalloc(object_size);
+	if (objects == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret = copy_from_user(objects, args.objects, object_size);
+	if (ret) {
+		pr_err("%s failed to copy objects", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
+	context = hmgrtable_get_object_by_type(&process->handle_table,
+					       HMGRENTRY_TYPE_DXGCONTEXT,
+					       args.context);
+	if (context) {
+		device_handle = context->device_handle;
+		syncobj_type =
+		    hmgrtable_get_object_type(&process->handle_table,
+					      objects[0]);
+	}
+	if (device_handle.v == 0) {
+		pr_err("Invalid context handle: %x", args.context.v);
+		ret = -EINVAL;
+	} else {
+		if (syncobj_type == HMGRENTRY_TYPE_MONITOREDFENCE) {
+			monitored_fence = true;
+		} else if (syncobj_type == HMGRENTRY_TYPE_DXGSYNCOBJECT) {
+			syncobj =
+			    hmgrtable_get_object_by_type(&process->handle_table,
+						HMGRENTRY_TYPE_DXGSYNCOBJECT,
+						objects[0]);
+			if (syncobj == NULL) {
+				pr_err("Invalid syncobj: %x", objects[0].v);
+				ret = -EINVAL;
+			} else {
+				monitored_fence = syncobj->monitored_fence;
+			}
+		} else {
+			pr_err("Invalid syncobj type: %x", objects[0].v);
+			ret = -EINVAL;
+		}
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_SHARED);
+
+	if (ret < 0)
+		goto cleanup;
+
+	if (monitored_fence) {
+		object_size = sizeof(u64) * args.object_count;
+		fences = vzalloc(object_size);
+		if (fences == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		ret = copy_from_user(fences, args.monitored_fence_values,
+				     object_size);
+		if (ret) {
+			pr_err("%s failed to copy fences", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	} else {
+		fences = &args.fence_value;
+	}
+
+	device = dxgprocess_device_by_handle(process, device_handle);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_wait_sync_object_gpu(process, adapter,
+					       args.context, args.object_count,
+					       objects, fences,
+					       !monitored_fence);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	if (objects)
+		vfree(objects);
+	if (fences && fences != &args.fence_value)
+		vfree(fences);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -1564,6 +2243,10 @@ void init_ioctls(void)
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0x10 */ dxgk_create_sync_object,
 		  LX_DXCREATESYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x11 */ dxgk_signal_sync_object,
+		  LX_DXSIGNALSYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x12 */ dxgk_wait_sync_object,
+		  LX_DXWAITFORSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x13 */ dxgk_destroy_allocation,
 		  LX_DXDESTROYALLOCATION2);
 	SET_IOCTL(/*0x14 */ dxgk_enum_adapters,
@@ -1574,6 +2257,16 @@ void init_ioctls(void)
 		  LX_DXDESTROYDEVICE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
+		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU);
+	SET_IOCTL(/*0x32 */ dxgk_signal_sync_object_gpu,
+		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU);
+	SET_IOCTL(/*0x33 */ dxgk_signal_sync_object_gpu2,
+		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2);
+	SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
+		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
+	SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
+		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
 }
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index 41abdc504bc2..f56d21b48814 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -26,6 +26,7 @@ extern const struct d3dkmthandle zerohandle;
  * When a lower lock ois held, the higher lock should not be acquired.
  *
  * device_list_mutex
+ * host_event_list_mutex
  * channel_lock
  * fd_mutex
  * plistmutex
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 52d54aa9266e..13bc3adf0abb 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -56,6 +56,9 @@ struct winluid {
 
 #define D3DKMT_CREATEALLOCATION_MAX		1024
 #define D3DKMT_ADAPTERS_MAX			64
+#define D3DDDI_MAX_BROADCAST_CONTEXT		64
+#define D3DDDI_MAX_OBJECT_WAITED_ON		32
+#define D3DDDI_MAX_OBJECT_SIGNALED		32
 
 struct d3dkmt_adapterinfo {
 	struct d3dkmthandle		adapter_handle;
@@ -339,6 +342,148 @@ struct d3dkmt_createsynchronizationobject2 {
 	__u32						reserved1;
 };
 
+struct d3dkmt_waitforsynchronizationobject2 {
+	struct d3dkmthandle	context;
+	__u32			object_count;
+	struct d3dkmthandle	object_array[D3DDDI_MAX_OBJECT_WAITED_ON];
+	union {
+		struct {
+			__u64	fence_value;
+		} fence;
+		__u64		reserved[8];
+	};
+};
+
+struct d3dddicb_signalflags {
+	union {
+		struct {
+			__u32	signal_at_submission:1;
+			__u32	enqueue_cpu_event:1;
+			__u32	allow_fence_rewind:1;
+			__u32	reserved:28;
+			__u32	DXGK_SIGNAL_FLAG_INTERNAL0:1;
+		};
+		__u32		value;
+	};
+};
+
+struct d3dkmt_signalsynchronizationobject2 {
+	struct d3dkmthandle		context;
+	__u32				object_count;
+	struct d3dkmthandle	object_array[D3DDDI_MAX_OBJECT_SIGNALED];
+	struct d3dddicb_signalflags	flags;
+	__u32				context_count;
+	struct d3dkmthandle		contexts[D3DDDI_MAX_BROADCAST_CONTEXT];
+	union {
+		struct {
+			__u64		fence_value;
+		} fence;
+		__u64			cpu_event_handle;
+		__u64			reserved[8];
+	};
+};
+
+struct d3dddi_waitforsynchronizationobjectfromcpu_flags {
+	union {
+		struct {
+			__u32	wait_any:1;
+			__u32	reserved:31;
+		};
+		__u32		value;
+	};
+};
+
+struct d3dkmt_waitforsynchronizationobjectfromcpu {
+	struct d3dkmthandle	device;
+	__u32			object_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*objects;
+	__u64			*fence_values;
+#else
+	__u64			objects;
+	__u64			fence_values;
+#endif
+	__u64			async_event;
+	struct d3dddi_waitforsynchronizationobjectfromcpu_flags flags;
+};
+
+struct d3dkmt_signalsynchronizationobjectfromcpu {
+	struct d3dkmthandle	device;
+	__u32			object_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*objects;
+	__u64			*fence_values;
+#else
+	__u64			objects;
+	__u64			fence_values;
+#endif
+	struct d3dddicb_signalflags	flags;
+};
+
+struct d3dkmt_waitforsynchronizationobjectfromgpu {
+	struct d3dkmthandle	context;
+	__u32			object_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*objects;
+#else
+	__u64			objects;
+#endif
+	union {
+#ifdef __KERNEL__
+		__u64		*monitored_fence_values;
+#else
+		__u64		monitored_fence_values;
+#endif
+		__u64		fence_value;
+		__u64		reserved[8];
+	};
+};
+
+struct d3dkmt_signalsynchronizationobjectfromgpu {
+	struct d3dkmthandle	context;
+	__u32			object_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*objects;
+#else
+	__u64			objects;
+#endif
+	union {
+#ifdef __KERNEL__
+		__u64		*monitored_fence_values;
+#else
+		__u64		monitored_fence_values;
+#endif
+		__u64		reserved[8];
+	};
+};
+
+struct d3dkmt_signalsynchronizationobjectfromgpu2 {
+	__u32				object_count;
+	__u32				reserved1;
+#ifdef __KERNEL__
+	struct d3dkmthandle		*objects;
+#else
+	__u64				objects;
+#endif
+	struct d3dddicb_signalflags	flags;
+	__u32				context_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle		*contexts;
+#else
+	__u64				contexts;
+#endif
+	union {
+		__u64			fence_value;
+		__u64			cpu_event_handle;
+#ifdef __KERNEL__
+		__u64			*monitored_fence_values;
+#else
+		__u64			monitored_fence_values;
+#endif
+		__u64			reserved[8];
+	};
+};
+
 struct d3dkmt_destroysynchronizationobject {
 	struct d3dkmthandle	sync_object;
 };
@@ -572,6 +717,10 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXCREATESYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x10, struct d3dkmt_createsynchronizationobject2)
+#define LX_DXSIGNALSYNCHRONIZATIONOBJECT \
+	_IOWR(0x47, 0x11, struct d3dkmt_signalsynchronizationobject2)
+#define LX_DXWAITFORSYNCHRONIZATIONOBJECT \
+	_IOWR(0x47, 0x12, struct d3dkmt_waitforsynchronizationobject2)
 #define LX_DXDESTROYALLOCATION2		\
 	_IOWR(0x47, 0x13, struct d3dkmt_destroyallocation2)
 #define LX_DXENUMADAPTERS2		\
@@ -582,6 +731,16 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
+#define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU \
+	_IOWR(0x47, 0x31, struct d3dkmt_signalsynchronizationobjectfromcpu)
+#define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU \
+	_IOWR(0x47, 0x32, struct d3dkmt_signalsynchronizationobjectfromgpu)
+#define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2 \
+	_IOWR(0x47, 0x33, struct d3dkmt_signalsynchronizationobjectfromgpu2)
+#define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU \
+	_IOWR(0x47, 0x3a, struct d3dkmt_waitforsynchronizationobjectfromcpu)
+#define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU \
+	_IOWR(0x47, 0x3b, struct d3dkmt_waitforsynchronizationobjectfromgpu)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (9 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 11/30] drivers: hv: dxgkrnl: Operations using " Iouri Tarassov
@ 2022-03-01 19:45   ` Iouri Tarassov
  2022-03-02 14:13     ` Wei Liu
  2022-03-02 14:15     ` Wei Liu
  2022-03-01 19:46   ` [PATCH v3 13/30] drivers: hv: dxgkrnl: Sharing of sync objects Iouri Tarassov
                     ` (18 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:45 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement creation of shared resources and ioctls for sharing
dxgresource objects between processes in the virtual machine.

A dxgresource object is a collection of dxgallocation objects.
The driver API allows addition/removal of allocations to a resource,
but has limitations on addition/removal of allocations to a shared
resource. When a resource is "sealed", addition/removal of allocations
is not allowed.

Resources are shared using file descriptor (FD) handles. The name
"NT handle" is used to be compatible with Windows implementation.

An FD handle is created by the LX_DXSHAREOBJECTS ioctl. The given FD
handle could be sent to another process using any Linux API.

To use a shared resource object in other ioctls the object needs to be
opened using its FD handle. An resource object is opened by the
LX_DXOPENRESOURCEFROMNTHANDLE ioctl. This ioctl returns a d3dkmthandle
value, which can be used to reference the resource object.

The LX_DXQUERYRESOURCEINFOFROMNTHANDLE ioctl is used to query private
driver data of a shared resource object. This private data needs to be
used to actually open the object using the LX_DXOPENRESOURCEFROMNTHANDLE
ioctl.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  81 ++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  77 ++++
 drivers/hv/dxgkrnl/dxgmodule.c  |   1 +
 drivers/hv/dxgkrnl/dxgvmbus.c   | 127 ++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  30 ++
 drivers/hv/dxgkrnl/ioctl.c      | 786 ++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  96 ++++
 7 files changed, 1198 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index c001f58a30db..48a39805944b 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -165,6 +165,17 @@ void dxgadapter_remove_process(struct dxgprocess_adapter *process_info)
 	process_info->adapter_process_list_entry.prev = NULL;
 }
 
+void dxgadapter_remove_shared_resource(struct dxgadapter *adapter,
+				       struct dxgsharedresource *object)
+{
+	down_write(&adapter->shared_resource_list_lock);
+	if (object->shared_resource_list_entry.next) {
+		list_del(&object->shared_resource_list_entry);
+		object->shared_resource_list_entry.next = NULL;
+	}
+	up_write(&adapter->shared_resource_list_lock);
+}
+
 void dxgadapter_add_syncobj(struct dxgadapter *adapter,
 			    struct dxgsyncobject *object)
 {
@@ -493,6 +504,69 @@ void dxgdevice_remove_resource(struct dxgdevice *device,
 	}
 }
 
+struct dxgsharedresource *dxgsharedresource_create(struct dxgadapter *adapter)
+{
+	struct dxgsharedresource *resource;
+
+	resource = vzalloc(sizeof(*resource));
+	if (resource) {
+		INIT_LIST_HEAD(&resource->resource_list_head);
+		kref_init(&resource->sresource_kref);
+		mutex_init(&resource->fd_mutex);
+		resource->adapter = adapter;
+	}
+	return resource;
+}
+
+void dxgsharedresource_destroy(struct kref *refcount)
+{
+	struct dxgsharedresource *resource;
+
+	resource = container_of(refcount, struct dxgsharedresource,
+				sresource_kref);
+	if (resource->runtime_private_data)
+		vfree(resource->runtime_private_data);
+	if (resource->resource_private_data)
+		vfree(resource->resource_private_data);
+	if (resource->alloc_private_data_sizes)
+		vfree(resource->alloc_private_data_sizes);
+	if (resource->alloc_private_data)
+		vfree(resource->alloc_private_data);
+	vfree(resource);
+}
+
+void dxgsharedresource_add_resource(struct dxgsharedresource *shared_resource,
+				    struct dxgresource *resource)
+{
+	down_write(&shared_resource->adapter->shared_resource_list_lock);
+	pr_debug("%s: %p %p", __func__, shared_resource, resource);
+	list_add_tail(&resource->shared_resource_list_entry,
+		      &shared_resource->resource_list_head);
+	kref_get(&shared_resource->sresource_kref);
+	kref_get(&resource->resource_kref);
+	resource->shared_owner = shared_resource;
+	up_write(&shared_resource->adapter->shared_resource_list_lock);
+}
+
+void dxgsharedresource_remove_resource(struct dxgsharedresource
+				       *shared_resource,
+				       struct dxgresource *resource)
+{
+	struct dxgadapter *adapter = shared_resource->adapter;
+
+	down_write(&adapter->shared_resource_list_lock);
+	pr_debug("%s: %p %p", __func__, shared_resource, resource);
+	if (resource->shared_resource_list_entry.next) {
+		list_del(&resource->shared_resource_list_entry);
+		resource->shared_resource_list_entry.next = NULL;
+		kref_put(&shared_resource->sresource_kref,
+			 dxgsharedresource_destroy);
+		resource->shared_owner = NULL;
+		kref_put(&resource->resource_kref, dxgresource_release);
+	}
+	up_write(&adapter->shared_resource_list_lock);
+}
+
 struct dxgresource *dxgresource_create(struct dxgdevice *device)
 {
 	struct dxgresource *resource = vzalloc(sizeof(struct dxgresource));
@@ -535,6 +609,7 @@ void dxgresource_destroy(struct dxgresource *resource)
 	struct d3dkmt_destroyallocation2 args = { };
 	int destroyed = test_and_set_bit(0, &resource->flags);
 	struct dxgdevice *device = resource->device;
+	struct dxgsharedresource *shared_resource;
 
 	if (!destroyed) {
 		dxgresource_free_handle(resource);
@@ -550,6 +625,12 @@ void dxgresource_destroy(struct dxgresource *resource)
 			dxgallocation_destroy(alloc);
 		}
 		dxgdevice_remove_resource(device, resource);
+		shared_resource = resource->shared_owner;
+		if (shared_resource) {
+			dxgsharedresource_remove_resource(shared_resource,
+							  resource);
+			resource->shared_owner = NULL;
+		}
 	}
 	kref_put(&resource->resource_kref, dxgresource_release);
 }
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index cb9096bd7a12..bad69cf4dfd7 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -33,6 +33,7 @@ struct dxgdevice;
 struct dxgcontext;
 struct dxgallocation;
 struct dxgresource;
+struct dxgsharedresource;
 struct dxgsyncobject;
 
 #include "misc.h"
@@ -359,6 +360,8 @@ struct dxgadapter {
 	struct list_head	adapter_list_entry;
 	/* The list of dxgprocess_adapter entries */
 	struct list_head	adapter_process_list_head;
+	/* List of all dxgsharedresource objects */
+	struct list_head	shared_resource_list_head;
 	/* List of all non-device dxgsyncobject objects */
 	struct list_head	syncobj_list_head;
 	/* This lock protects shared resource and syncobject lists */
@@ -392,6 +395,8 @@ void dxgadapter_remove_syncobj(struct dxgsyncobject *so);
 void dxgadapter_add_process(struct dxgadapter *adapter,
 			    struct dxgprocess_adapter *process_info);
 void dxgadapter_remove_process(struct dxgprocess_adapter *process_info);
+void dxgadapter_remove_shared_resource(struct dxgadapter *adapter,
+				       struct dxgsharedresource *object);
 
 /*
  * The object represent the device object.
@@ -471,6 +476,64 @@ void dxgcontext_destroy_safe(struct dxgprocess *pr, struct dxgcontext *ctx);
 void dxgcontext_release(struct kref *refcount);
 bool dxgcontext_is_active(struct dxgcontext *ctx);
 
+/*
+ * A shared resource object is created to track the list of dxgresource objects,
+ * which are opened for the same underlying shared resource.
+ * Objects are shared by using a file descriptor handle.
+ * FD is created by calling dxgk_share_objects and providing shandle to
+ * dxgsharedresource. The FD points to a dxgresource object, which is created
+ * by calling dxgk_open_resource_nt.  dxgresource object is referenced by the
+ * FD.
+ *
+ * The object is referenced by every dxgresource in its list.
+ *
+ */
+struct dxgsharedresource {
+	/* Every dxgresource object in the resource list takes a reference */
+	struct kref		sresource_kref;
+	struct dxgadapter	*adapter;
+	/* List of dxgresource objects, opened for the shared resource. */
+	/* Protected by dxgadapter::shared_resource_list_lock */
+	struct list_head	resource_list_head;
+	/* Entry in the list of dxgsharedresource in dxgadapter */
+	/* Protected by dxgadapter::shared_resource_list_lock */
+	struct list_head	shared_resource_list_entry;
+	struct mutex		fd_mutex;
+	/* Referenced by file descriptors */
+	int			host_shared_handle_nt_reference;
+	/* Corresponding global handle in the host */
+	struct d3dkmthandle	host_shared_handle;
+	/*
+	 * When the sync object is shared by NT handle, this is the
+	 * corresponding handle in the host
+	 */
+	struct d3dkmthandle	host_shared_handle_nt;
+	/* Values below are computed when the resource is sealed */
+	u32			runtime_private_data_size;
+	u32			alloc_private_data_size;
+	u32			resource_private_data_size;
+	u32			allocation_count;
+	union {
+		struct {
+			/* Cannot add new allocations */
+			u32	sealed:1;
+			u32	reserved:31;
+		};
+		long		flags;
+	};
+	u32			*alloc_private_data_sizes;
+	u8			*alloc_private_data;
+	u8			*runtime_private_data;
+	u8			*resource_private_data;
+};
+
+struct dxgsharedresource *dxgsharedresource_create(struct dxgadapter *adapter);
+void dxgsharedresource_destroy(struct kref *refcount);
+void dxgsharedresource_add_resource(struct dxgsharedresource *sres,
+				    struct dxgresource *res);
+void dxgsharedresource_remove_resource(struct dxgsharedresource *sres,
+				       struct dxgresource *res);
+
 struct dxgresource {
 	struct kref		resource_kref;
 	enum dxgobjectstate	object_state;
@@ -491,6 +554,8 @@ struct dxgresource {
 		};
 		long		flags;
 	};
+	/* Owner of the shared resource */
+	struct dxgsharedresource *shared_owner;
 };
 
 struct dxgresource *dxgresource_create(struct dxgdevice *dev);
@@ -638,6 +703,18 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
+int dxgvmb_send_create_nt_shared_object(struct dxgprocess *process,
+					struct d3dkmthandle object,
+					struct d3dkmthandle *shared_handle);
+int dxgvmb_send_destroy_nt_shared_object(struct d3dkmthandle shared_handle);
+int dxgvmb_send_open_resource(struct dxgprocess *process,
+			      struct dxgadapter *adapter,
+			      struct d3dkmthandle device,
+			      struct d3dkmthandle global_share,
+			      u32 allocation_count,
+			      u32 total_priv_drv_data_size,
+			      struct d3dkmthandle *resource_handle,
+			      struct d3dkmthandle *alloc_handles);
 int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 				  enum d3dkmdt_standardallocationtype t,
 				  struct d3dkmdt_gdisurfacedata *data,
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index 80af0de4aa37..7ccf76c6ad9b 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -250,6 +250,7 @@ int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
 	init_rwsem(&adapter->core_lock);
 
 	INIT_LIST_HEAD(&adapter->adapter_process_list_head);
+	INIT_LIST_HEAD(&adapter->shared_resource_list_head);
 	INIT_LIST_HEAD(&adapter->syncobj_list_head);
 	init_rwsem(&adapter->shared_resource_list_lock);
 	adapter->pci_dev = dev;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index a6979b4ae955..b6b072b189ff 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -705,6 +705,79 @@ int dxgvmb_send_destroy_process(struct d3dkmthandle process)
 	return ret;
 }
 
+int dxgvmb_send_create_nt_shared_object(struct dxgprocess *process,
+					struct d3dkmthandle object,
+					struct d3dkmthandle *shared_handle)
+{
+	struct dxgkvmb_command_createntsharedobject *command;
+	int ret;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, NULL, process, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vm_to_host_init2(&command->hdr,
+				 DXGK_VMBCOMMAND_CREATENTSHAREDOBJECT,
+				 process->host_handle);
+	command->object = object;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	ret = dxgvmb_send_sync_msg(dxgglobal_get_dxgvmbuschannel(),
+				   msg.hdr, msg.size, shared_handle,
+				   sizeof(*shared_handle));
+
+	dxgglobal_release_channel_lock();
+
+	if (ret < 0)
+		goto cleanup;
+	if (shared_handle->v == 0) {
+		pr_err("failed to create NT shared object");
+		ret = -ENOTRECOVERABLE;
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_destroy_nt_shared_object(struct d3dkmthandle shared_handle)
+{
+	struct dxgkvmb_command_destroyntsharedobject *command;
+	int ret;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, NULL, NULL, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vm_to_host_init1(&command->hdr,
+				 DXGK_VMBCOMMAND_DESTROYNTSHAREDOBJECT);
+	command->shared_handle = shared_handle;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(dxgglobal_get_dxgvmbuschannel(),
+					    msg.hdr, msg.size);
+
+	dxgglobal_release_channel_lock();
+
+cleanup:
+	free_message(&msg, NULL);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_destroy_sync_object(struct dxgprocess *process,
 				    struct d3dkmthandle sync_object)
 {
@@ -1537,6 +1610,60 @@ int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_open_resource(struct dxgprocess *process,
+			      struct dxgadapter *adapter,
+			      struct d3dkmthandle device,
+			      struct d3dkmthandle global_share,
+			      u32 allocation_count,
+			      u32 total_priv_drv_data_size,
+			      struct d3dkmthandle *resource_handle,
+			      struct d3dkmthandle *alloc_handles)
+{
+	struct dxgkvmb_command_openresource *command;
+	struct dxgkvmb_command_openresource_return *result;
+	struct d3dkmthandle *handles;
+	int ret;
+	int i;
+	u32 result_size = allocation_count * sizeof(struct d3dkmthandle) +
+			   sizeof(*result);
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	ret = init_message_res(&msg, adapter, process, sizeof(*command),
+			       result_size);
+	if (ret)
+		goto cleanup;
+	command = msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_OPENRESOURCE,
+				   process->host_handle);
+	command->device = device;
+	command->nt_security_sharing = 1;
+	command->global_share = global_share;
+	command->allocation_count = allocation_count;
+	command->total_priv_drv_data_size = total_priv_drv_data_size;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result->status);
+	if (ret < 0)
+		goto cleanup;
+
+	*resource_handle = result->resource;
+	handles = (struct d3dkmthandle *) &result[1];
+	for (i = 0; i < allocation_count; i++)
+		alloc_handles[i] = handles[i];
+
+cleanup:
+	free_message((struct dxgvmbusmsg *)&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 				  enum d3dkmdt_standardallocationtype alloctype,
 				  struct d3dkmdt_gdisurfacedata *alloc_data,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index da63f89ad822..60433744b46b 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -172,6 +172,21 @@ struct dxgkvmb_command_signalguestevent {
 	bool				dereference_event;
 };
 
+/*
+ * The command returns struct d3dkmthandle of a shared object for the
+ * given pre-process object
+ */
+struct dxgkvmb_command_createntsharedobject {
+	struct dxgkvmb_command_vm_to_host hdr;
+	struct d3dkmthandle		object;
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_destroyntsharedobject {
+	struct dxgkvmb_command_vm_to_host hdr;
+	struct d3dkmthandle		shared_handle;
+};
+
 /* Returns ntstatus */
 struct dxgkvmb_command_setiospaceregion {
 	struct dxgkvmb_command_vm_to_host hdr;
@@ -305,6 +320,21 @@ struct dxgkvmb_command_createallocation {
 /* u8 priv_drv_data[] for each alloc_info */
 };
 
+struct dxgkvmb_command_openresource {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	bool				nt_security_sharing;
+	struct d3dkmthandle		global_share;
+	u32				allocation_count;
+	u32				total_priv_drv_data_size;
+};
+
+struct dxgkvmb_command_openresource_return {
+	struct d3dkmthandle		resource;
+	struct ntstatus			status;
+/* struct d3dkmthandle   allocation[allocation_count]; */
+};
+
 struct dxgkvmb_command_getstandardallocprivdata {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	enum d3dkmdt_standardallocationtype alloc_type;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index b945da0da55a..05f59854cbbf 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -35,6 +35,33 @@ static char *errorstr(int ret)
 	return ret < 0 ? "err" : "";
 }
 
+static int dxgsharedresource_release(struct inode *inode, struct file *file)
+{
+	struct dxgsharedresource *resource = file->private_data;
+
+	pr_debug("%s: %p", __func__, resource);
+	mutex_lock(&resource->fd_mutex);
+	kref_get(&resource->sresource_kref);
+	resource->host_shared_handle_nt_reference--;
+	if (resource->host_shared_handle_nt_reference == 0) {
+		if (resource->host_shared_handle_nt.v) {
+			dxgvmb_send_destroy_nt_shared_object(
+					resource->host_shared_handle_nt);
+			pr_debug("Resource host_handle_nt destroyed: %x",
+				resource->host_shared_handle_nt.v);
+			resource->host_shared_handle_nt.v = 0;
+		}
+		kref_put(&resource->sresource_kref, dxgsharedresource_destroy);
+	}
+	mutex_unlock(&resource->fd_mutex);
+	kref_put(&resource->sresource_kref, dxgsharedresource_destroy);
+	return 0;
+}
+
+static const struct file_operations dxg_resource_fops = {
+	.release = dxgsharedresource_release,
+};
+
 static int dxgk_open_adapter_from_luid(struct dxgprocess *process,
 						   void *__user inargs)
 {
@@ -217,6 +244,97 @@ dxgkp_enum_adapters(struct dxgprocess *process,
 	return ret;
 }
 
+static int dxgsharedresource_seal(struct dxgsharedresource *shared_resource)
+{
+	int ret = 0;
+	int i = 0;
+	u8 *private_data;
+	u32 data_size;
+	struct dxgresource *resource;
+	struct dxgallocation *alloc;
+
+	pr_debug("Sealing resource: %p", shared_resource);
+
+	down_write(&shared_resource->adapter->shared_resource_list_lock);
+	if (shared_resource->sealed) {
+		pr_debug("Resource already sealed");
+		goto cleanup;
+	}
+	shared_resource->sealed = 1;
+	if (!list_empty(&shared_resource->resource_list_head)) {
+		resource =
+		    list_first_entry(&shared_resource->resource_list_head,
+				     struct dxgresource,
+				     shared_resource_list_entry);
+		pr_debug("First resource: %p", resource);
+		mutex_lock(&resource->resource_mutex);
+		list_for_each_entry(alloc, &resource->alloc_list_head,
+				    alloc_list_entry) {
+			pr_debug("Resource alloc: %p %d", alloc,
+				    alloc->priv_drv_data->data_size);
+			shared_resource->allocation_count++;
+			shared_resource->alloc_private_data_size +=
+			    alloc->priv_drv_data->data_size;
+			if (shared_resource->alloc_private_data_size <
+			    alloc->priv_drv_data->data_size) {
+				pr_err("alloc private data overflow");
+				ret = -EINVAL;
+				goto cleanup1;
+			}
+		}
+		if (shared_resource->alloc_private_data_size == 0) {
+			ret = -EINVAL;
+			goto cleanup1;
+		}
+		shared_resource->alloc_private_data =
+			vzalloc(shared_resource->alloc_private_data_size);
+		if (shared_resource->alloc_private_data == NULL) {
+			ret = -EINVAL;
+			goto cleanup1;
+		}
+		shared_resource->alloc_private_data_sizes =
+			vzalloc(sizeof(u32)*shared_resource->allocation_count);
+		if (shared_resource->alloc_private_data_sizes == NULL) {
+			ret = -EINVAL;
+			goto cleanup1;
+		}
+		private_data = shared_resource->alloc_private_data;
+		data_size = shared_resource->alloc_private_data_size;
+		i = 0;
+		list_for_each_entry(alloc, &resource->alloc_list_head,
+				    alloc_list_entry) {
+			u32 alloc_data_size = alloc->priv_drv_data->data_size;
+
+			if (alloc_data_size) {
+				if (data_size < alloc_data_size) {
+					pr_err("Invalid private data size");
+					ret = -EINVAL;
+					goto cleanup1;
+				}
+				shared_resource->alloc_private_data_sizes[i] =
+				    alloc_data_size;
+				memcpy(private_data,
+				       alloc->priv_drv_data->data,
+				       alloc_data_size);
+				vfree(alloc->priv_drv_data);
+				alloc->priv_drv_data = NULL;
+				private_data += alloc_data_size;
+				data_size -= alloc_data_size;
+			}
+			i++;
+		}
+		if (data_size != 0) {
+			pr_err("Data size mismatch");
+			ret = -EINVAL;
+		}
+cleanup1:
+		mutex_unlock(&resource->resource_mutex);
+	}
+cleanup:
+	up_write(&shared_resource->adapter->shared_resource_list_lock);
+	return ret;
+}
+
 static int
 dxgk_enum_adapters(struct dxgprocess *process, void *__user inargs)
 {
@@ -823,6 +941,7 @@ dxgk_create_allocation(struct dxgprocess *process, void *__user inargs)
 	u32 alloc_info_size = 0;
 	struct dxgresource *resource = NULL;
 	struct dxgallocation **dxgalloc = NULL;
+	struct dxgsharedresource *shared_resource = NULL;
 	bool resource_mutex_acquired = false;
 	u32 standard_alloc_priv_data_size = 0;
 	void *standard_alloc_priv_data = NULL;
@@ -995,6 +1114,76 @@ dxgk_create_allocation(struct dxgprocess *process, void *__user inargs)
 		}
 		resource->private_runtime_handle =
 		    args.private_runtime_resource_handle;
+		if (args.flags.create_shared) {
+			if (!args.flags.nt_security_sharing) {
+				pr_err("%s: nt_security_sharing must be set",
+					__func__);
+				ret = -EINVAL;
+				goto cleanup;
+			}
+			shared_resource = dxgsharedresource_create(adapter);
+			if (shared_resource == NULL) {
+				ret = -ENOMEM;
+				goto cleanup;
+			}
+			shared_resource->runtime_private_data_size =
+			    args.priv_drv_data_size;
+			shared_resource->resource_private_data_size =
+			    args.priv_drv_data_size;
+
+			shared_resource->runtime_private_data_size =
+			    args.private_runtime_data_size;
+			shared_resource->resource_private_data_size =
+			    args.priv_drv_data_size;
+			dxgsharedresource_add_resource(shared_resource,
+						       resource);
+			if (args.flags.standard_allocation) {
+				shared_resource->resource_private_data =
+					res_priv_data;
+				shared_resource->resource_private_data_size =
+					res_priv_data_size;
+				res_priv_data = NULL;
+			}
+			if (args.private_runtime_data_size) {
+				shared_resource->runtime_private_data =
+				    vzalloc(args.private_runtime_data_size);
+				if (shared_resource->runtime_private_data ==
+				    NULL) {
+					ret = -ENOMEM;
+					goto cleanup;
+				}
+				ret = copy_from_user(
+					shared_resource->runtime_private_data,
+					args.private_runtime_data,
+					args.private_runtime_data_size);
+				if (ret) {
+					pr_err("%s failed to copy runtime data",
+						__func__);
+					ret = -EINVAL;
+					goto cleanup;
+				}
+			}
+			if (args.priv_drv_data_size &&
+			    !args.flags.standard_allocation) {
+				shared_resource->resource_private_data =
+				    vzalloc(args.priv_drv_data_size);
+				if (shared_resource->resource_private_data ==
+				    NULL) {
+					ret = -ENOMEM;
+					goto cleanup;
+				}
+				ret = copy_from_user(
+					shared_resource->resource_private_data,
+					args.priv_drv_data,
+					args.priv_drv_data_size);
+				if (ret) {
+					pr_err("%s failed to copy res data",
+						__func__);
+					ret = -EINVAL;
+					goto cleanup;
+				}
+			}
+		}
 	} else {
 		if (args.resource.v) {
 			/* Adding new allocations to the given resource */
@@ -1013,6 +1202,12 @@ dxgk_create_allocation(struct dxgprocess *process, void *__user inargs)
 				ret = -EINVAL;
 				goto cleanup;
 			}
+			if (resource->shared_owner &&
+			    resource->shared_owner->sealed) {
+				pr_err("Resource is sealed");
+				ret = -EINVAL;
+				goto cleanup;
+			}
 			/* Synchronize with resource destruction */
 			mutex_lock(&resource->resource_mutex);
 			if (!dxgresource_is_active(resource)) {
@@ -1117,9 +1312,16 @@ dxgk_create_allocation(struct dxgprocess *process, void *__user inargs)
 			}
 		}
 		if (resource && args.flags.create_resource) {
+			if (shared_resource) {
+				dxgsharedresource_remove_resource
+				    (shared_resource, resource);
+			}
 			dxgresource_destroy(resource);
 		}
 	}
+	if (shared_resource)
+		kref_put(&shared_resource->sresource_kref,
+			 dxgsharedresource_destroy);
 	if (dxgalloc)
 		vfree(dxgalloc);
 	if (standard_alloc_priv_data)
@@ -1165,6 +1367,10 @@ static int validate_alloc(struct dxgallocation *alloc0,
 			fail_reason = 4;
 			goto cleanup;
 		}
+		if (alloc->owner.resource->shared_owner) {
+			fail_reason = 5;
+			goto cleanup;
+		}
 	} else {
 		if (alloc->owner.device != device) {
 			fail_reason = 6;
@@ -2173,6 +2379,580 @@ dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgsharedresource_get_host_nt_handle(struct dxgsharedresource *resource,
+				     struct dxgprocess *process,
+				     struct d3dkmthandle objecthandle)
+{
+	int ret = 0;
+
+	mutex_lock(&resource->fd_mutex);
+	if (resource->host_shared_handle_nt_reference == 0) {
+		ret = dxgvmb_send_create_nt_shared_object(process,
+					objecthandle,
+					&resource->host_shared_handle_nt);
+		if (ret < 0)
+			goto cleanup;
+		pr_debug("Resource host_shared_handle_ht: %x",
+			    resource->host_shared_handle_nt.v);
+		kref_get(&resource->sresource_kref);
+	}
+	resource->host_shared_handle_nt_reference++;
+cleanup:
+	mutex_unlock(&resource->fd_mutex);
+	return ret;
+}
+
+enum dxg_sharedobject_type {
+	DXG_SHARED_RESOURCE
+};
+
+static int get_object_fd(enum dxg_sharedobject_type type,
+				     void *object, int *fdout)
+{
+	struct file *file;
+	int fd;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		pr_err("get_unused_fd_flags failed: %x", fd);
+		return -ENOTRECOVERABLE;
+	}
+
+	switch (type) {
+	case DXG_SHARED_RESOURCE:
+		file = anon_inode_getfile("dxgresource",
+					  &dxg_resource_fops, object, 0);
+		break;
+	default:
+		return -EINVAL;
+	};
+	if (IS_ERR(file)) {
+		pr_err("anon_inode_getfile failed: %x", fd);
+		put_unused_fd(fd);
+		return -ENOTRECOVERABLE;
+	}
+
+	fd_install(fd, file);
+	*fdout = fd;
+	return 0;
+}
+
+static int
+dxgk_share_objects(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_shareobjects args;
+	enum hmgrentry_type object_type;
+	struct dxgsyncobject *syncobj = NULL;
+	struct dxgresource *resource = NULL;
+	struct dxgsharedresource *shared_resource = NULL;
+	struct d3dkmthandle *handles = NULL;
+	int object_fd = 0;
+	void *obj = NULL;
+	u32 handle_size;
+	int ret;
+	u64 tmp = 0;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count == 0 || args.object_count > 1) {
+		pr_err("invalid object count %d", args.object_count);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	handle_size = args.object_count * sizeof(struct d3dkmthandle);
+
+	handles = vzalloc(handle_size);
+	if (handles == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret = copy_from_user(handles, args.objects, handle_size);
+	if (ret) {
+		pr_err("%s failed to copy object handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	pr_debug("Sharing handle: %x", handles[0].v);
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
+	object_type = hmgrtable_get_object_type(&process->handle_table,
+						handles[0]);
+	obj = hmgrtable_get_object(&process->handle_table, handles[0]);
+	if (obj == NULL) {
+		pr_err("invalid object handle %x", handles[0].v);
+		ret = -EINVAL;
+	} else {
+		switch (object_type) {
+		case HMGRENTRY_TYPE_DXGRESOURCE:
+			resource = obj;
+			if (resource->shared_owner) {
+				kref_get(&resource->resource_kref);
+				shared_resource = resource->shared_owner;
+			} else {
+				resource = NULL;
+				pr_err("resource object is not shared");
+				ret = -EINVAL;
+			}
+			break;
+		default:
+			pr_err("invalid object type %d", object_type);
+			ret = -EINVAL;
+			break;
+		}
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_SHARED);
+
+	if (ret < 0)
+		goto cleanup;
+
+	switch (object_type) {
+	case HMGRENTRY_TYPE_DXGRESOURCE:
+		ret = get_object_fd(DXG_SHARED_RESOURCE, shared_resource,
+				    &object_fd);
+		if (ret < 0) {
+			pr_err("%s get_object_fd failed for resource",
+				__func__);
+			goto cleanup;
+		}
+		ret = dxgsharedresource_get_host_nt_handle(shared_resource,
+							   process, handles[0]);
+		if (ret < 0) {
+			pr_err("%s get_host_res_nt_handle failed", __func__);
+			goto cleanup;
+		}
+		ret = dxgsharedresource_seal(shared_resource);
+		if (ret < 0) {
+			pr_err("%s dxgsharedresource_seal failed", __func__);
+			goto cleanup;
+		}
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (ret < 0)
+		goto cleanup;
+
+	pr_debug("Object FD: %x", object_fd);
+
+	tmp = (u64) object_fd;
+
+	ret = copy_to_user(args.shared_handle, &tmp, sizeof(u64));
+	if (ret < 0)
+		pr_err("%s failed to copy shared handle", __func__);
+
+cleanup:
+	if (ret < 0) {
+		if (object_fd > 0)
+			put_unused_fd(object_fd);
+	}
+
+	if (handles)
+		vfree(handles);
+
+	if (syncobj)
+		kref_put(&syncobj->syncobj_kref, dxgsyncobject_release);
+
+	if (resource)
+		kref_put(&resource->resource_kref, dxgresource_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_query_resource_info_nt(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_queryresourceinfofromnthandle args;
+	int ret;
+	struct dxgdevice *device = NULL;
+	struct dxgsharedresource *shared_resource = NULL;
+	struct file *file = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	file = fget(args.nt_handle);
+	if (!file) {
+		pr_err("failed to get file from handle: %llx",
+			   args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (file->f_op != &dxg_resource_fops) {
+		pr_err("invalid fd: %llx", args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	shared_resource = file->private_data;
+	if (shared_resource == NULL) {
+		pr_err("invalid private data: %llx", args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0) {
+		kref_put(&device->device_kref, dxgdevice_release);
+		device = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgsharedresource_seal(shared_resource);
+	if (ret < 0)
+		goto cleanup;
+
+	args.private_runtime_data_size =
+	    shared_resource->runtime_private_data_size;
+	args.resource_priv_drv_data_size =
+	    shared_resource->resource_private_data_size;
+	args.allocation_count = shared_resource->allocation_count;
+	args.total_priv_drv_data_size =
+	    shared_resource->alloc_private_data_size;
+
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret < 0)
+		pr_err("%s failed to copy output args", __func__);
+
+cleanup:
+
+	if (file)
+		fput(file);
+	if (device)
+		dxgdevice_release_lock_shared(device);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+assign_resource_handles(struct dxgprocess *process,
+			struct dxgsharedresource *shared_resource,
+			struct d3dkmt_openresourcefromnthandle *args,
+			struct d3dkmthandle resource_handle,
+			struct dxgresource *resource,
+			struct dxgallocation **allocs,
+			struct d3dkmthandle *handles)
+{
+	int ret;
+	int i;
+	u8 *cur_priv_data;
+	u32 total_priv_data_size = 0;
+	struct d3dddi_openallocationinfo2 open_alloc_info = { };
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	ret = hmgrtable_assign_handle(&process->handle_table, resource,
+				      HMGRENTRY_TYPE_DXGRESOURCE,
+				      resource_handle);
+	if (ret < 0)
+		goto cleanup;
+	resource->handle = resource_handle;
+	resource->handle_valid = 1;
+	cur_priv_data = args->total_priv_drv_data;
+	for (i = 0; i < args->allocation_count; i++) {
+		ret = hmgrtable_assign_handle(&process->handle_table, allocs[i],
+					      HMGRENTRY_TYPE_DXGALLOCATION,
+					      handles[i]);
+		if (ret < 0)
+			goto cleanup;
+		allocs[i]->alloc_handle = handles[i];
+		allocs[i]->handle_valid = 1;
+		open_alloc_info.allocation = handles[i];
+		if (shared_resource->alloc_private_data_sizes)
+			open_alloc_info.priv_drv_data_size =
+			    shared_resource->alloc_private_data_sizes[i];
+		else
+			open_alloc_info.priv_drv_data_size = 0;
+
+		total_priv_data_size += open_alloc_info.priv_drv_data_size;
+		open_alloc_info.priv_drv_data = cur_priv_data;
+		cur_priv_data += open_alloc_info.priv_drv_data_size;
+
+		ret = copy_to_user(&args->open_alloc_info[i],
+				   &open_alloc_info,
+				   sizeof(open_alloc_info));
+		if (ret < 0) {
+			pr_err("%s failed to copy alloc info", __func__);
+			goto cleanup;
+		}
+	}
+	args->total_priv_drv_data_size = total_priv_data_size;
+cleanup:
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+	if (ret < 0) {
+		for (i = 0; i < args->allocation_count; i++)
+			dxgallocation_free_handle(allocs[i]);
+		dxgresource_free_handle(resource);
+	}
+	pr_debug("%s end %x", __func__, ret);
+	return ret;
+}
+
+static int
+open_resource(struct dxgprocess *process,
+	      struct d3dkmt_openresourcefromnthandle *args,
+	      __user struct d3dkmthandle *res_out,
+	      __user u32 *total_driver_data_size_out)
+{
+	int ret = 0;
+	int i;
+	struct d3dkmthandle *alloc_handles = NULL;
+	int alloc_handles_size = sizeof(struct d3dkmthandle) *
+				 args->allocation_count;
+	struct dxgsharedresource *shared_resource = NULL;
+	struct dxgresource *resource = NULL;
+	struct dxgallocation **allocs = NULL;
+	struct d3dkmthandle global_share = {};
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dkmthandle resource_handle = {};
+	struct file *file = NULL;
+
+	pr_debug("Opening resource handle: %llx", args->nt_handle);
+
+	file = fget(args->nt_handle);
+	if (!file) {
+		pr_err("failed to get file from handle: %llx",
+				args->nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (file->f_op != &dxg_resource_fops) {
+		pr_err("invalid fd type: %llx", args->nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	shared_resource = file->private_data;
+	if (shared_resource == NULL) {
+		pr_err("invalid private data: %llx",
+				args->nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (kref_get_unless_zero(&shared_resource->sresource_kref) == 0)
+		shared_resource = NULL;
+	else
+		global_share = shared_resource->host_shared_handle_nt;
+
+	if (shared_resource == NULL) {
+		pr_err("Invalid shared resource handle: %x",
+			   (u32)args->nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	pr_debug("Shared resource: %p %x", shared_resource,
+		    global_share.v);
+
+	device = dxgprocess_device_by_handle(process, args->device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0) {
+		kref_put(&device->device_kref, dxgdevice_release);
+		device = NULL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgsharedresource_seal(shared_resource);
+	if (ret < 0)
+		goto cleanup;
+
+	if (args->allocation_count != shared_resource->allocation_count ||
+	    args->private_runtime_data_size <
+	    shared_resource->runtime_private_data_size ||
+	    args->resource_priv_drv_data_size <
+	    shared_resource->resource_private_data_size ||
+	    args->total_priv_drv_data_size <
+	    shared_resource->alloc_private_data_size) {
+		ret = -EINVAL;
+		pr_err("Invalid data sizes");
+		goto cleanup;
+	}
+
+	alloc_handles = vzalloc(alloc_handles_size);
+	if (alloc_handles == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	allocs = vzalloc(sizeof(void *) * args->allocation_count);
+	if (allocs == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	resource = dxgresource_create(device);
+	if (resource == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	dxgsharedresource_add_resource(shared_resource, resource);
+
+	for (i = 0; i < args->allocation_count; i++) {
+		allocs[i] = dxgallocation_create(process);
+		if (allocs[i] == NULL)
+			goto cleanup;
+		ret = dxgresource_add_alloc(resource, allocs[i]);
+		if (ret < 0)
+			goto cleanup;
+	}
+
+	ret = dxgvmb_send_open_resource(process, adapter,
+					device->handle, global_share,
+					args->allocation_count,
+					args->total_priv_drv_data_size,
+					&resource_handle, alloc_handles);
+	if (ret < 0) {
+		pr_err("dxgvmb_send_open_resource failed");
+		goto cleanup;
+	}
+
+	if (shared_resource->runtime_private_data_size) {
+		ret = copy_to_user(args->private_runtime_data,
+				shared_resource->runtime_private_data,
+				shared_resource->runtime_private_data_size);
+		if (ret < 0) {
+			pr_err("%s failed to copy runtime data", __func__);
+			goto cleanup;
+		}
+	}
+
+	if (shared_resource->resource_private_data_size) {
+		ret = copy_to_user(args->resource_priv_drv_data,
+				shared_resource->resource_private_data,
+				shared_resource->resource_private_data_size);
+		if (ret < 0) {
+			pr_err("%s failed to copy resource data", __func__);
+			goto cleanup;
+		}
+	}
+
+	if (shared_resource->alloc_private_data_size) {
+		ret = copy_to_user(args->total_priv_drv_data,
+				shared_resource->alloc_private_data,
+				shared_resource->alloc_private_data_size);
+		if (ret < 0) {
+			pr_err("%s failed to copy alloc data", __func__);
+			goto cleanup;
+		}
+	}
+
+	ret = assign_resource_handles(process, shared_resource, args,
+				      resource_handle, resource, allocs,
+				      alloc_handles);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(res_out, &resource_handle,
+			   sizeof(struct d3dkmthandle));
+	if (ret < 0) {
+		pr_err("%s failed to copy resource handle to user", __func__);
+		goto cleanup;
+	}
+
+	ret = copy_to_user(total_driver_data_size_out,
+			   &args->total_priv_drv_data_size, sizeof(u32));
+	if (ret < 0)
+		pr_err("%s failed to copy total driver data size", __func__);
+
+cleanup:
+
+	if (ret < 0) {
+		if (resource_handle.v) {
+			struct d3dkmt_destroyallocation2 tmp = { };
+
+			tmp.flags.assume_not_in_use = 1;
+			tmp.device = args->device;
+			tmp.resource = resource_handle;
+			ret = dxgvmb_send_destroy_allocation(process, device,
+							     &tmp, NULL);
+		}
+		if (resource)
+			dxgresource_destroy(resource);
+	}
+
+	if (file)
+		fput(file);
+	if (allocs)
+		vfree(allocs);
+	if (shared_resource)
+		kref_put(&shared_resource->sresource_kref,
+			 dxgsharedresource_destroy);
+	if (alloc_handles)
+		vfree(alloc_handles);
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		dxgdevice_release_lock_shared(device);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	return ret;
+}
+
+static int
+dxgk_open_resource_nt(struct dxgprocess *process,
+				      void *__user inargs)
+{
+	struct d3dkmt_openresourcefromnthandle args;
+	struct d3dkmt_openresourcefromnthandle *__user args_user = inargs;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = open_resource(process, &args,
+			    &args_user->resource,
+			    &args_user->total_priv_drv_data_size);
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -2269,4 +3049,10 @@ void init_ioctls(void)
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
+	SET_IOCTL(/*0x3f */ dxgk_share_objects,
+		  LX_DXSHAREOBJECTS);
+	SET_IOCTL(/*0x41 */ dxgk_query_resource_info_nt,
+		  LX_DXQUERYRESOURCEINFOFROMNTHANDLE);
+	SET_IOCTL(/*0x42 */ dxgk_open_resource_nt,
+		  LX_DXOPENRESOURCEFROMNTHANDLE);
 }
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 13bc3adf0abb..4f44dd855238 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -678,6 +678,94 @@ enum d3dkmt_deviceexecution_state {
 	_D3DKMT_DEVICEEXECUTION_ERROR_DMAPAGEFAULT	= 7,
 };
 
+struct d3dddi_openallocationinfo2 {
+	struct d3dkmthandle	allocation;
+#ifdef __KERNEL__
+	void			*priv_drv_data;
+#else
+	__u64			priv_drv_data;
+#endif
+	__u32			priv_drv_data_size;
+	__u64			gpu_va;
+	__u64			reserved[6];
+};
+
+struct d3dkmt_openresourcefromnthandle {
+	struct d3dkmthandle	device;
+	__u32			reserved;
+	__u64			nt_handle;
+	__u32			allocation_count;
+	__u32			reserved1;
+#ifdef __KERNEL__
+	struct d3dddi_openallocationinfo2 *open_alloc_info;
+#else
+	__u64			open_alloc_info;
+#endif
+	int			private_runtime_data_size;
+	__u32			reserved2;
+#ifdef __KERNEL__
+	void			*private_runtime_data;
+#else
+	__u64			private_runtime_data;
+#endif
+	__u32			resource_priv_drv_data_size;
+	__u32			reserved3;
+#ifdef __KERNEL__
+	void			*resource_priv_drv_data;
+#else
+	__u64			resource_priv_drv_data;
+#endif
+	__u32			total_priv_drv_data_size;
+#ifdef __KERNEL__
+	void			*total_priv_drv_data;
+#else
+	__u64			total_priv_drv_data;
+#endif
+	struct d3dkmthandle	resource;
+	struct d3dkmthandle	keyed_mutex;
+#ifdef __KERNEL__
+	void			*keyed_mutex_private_data;
+#else
+	__u64			keyed_mutex_private_data;
+#endif
+	__u32			keyed_mutex_private_data_size;
+	struct d3dkmthandle	sync_object;
+};
+
+struct d3dkmt_queryresourceinfofromnthandle {
+	struct d3dkmthandle	device;
+	__u32			reserved;
+	__u64			nt_handle;
+#ifdef __KERNEL__
+	void			*private_runtime_data;
+#else
+	__u64			private_runtime_data;
+#endif
+	__u32			private_runtime_data_size;
+	__u32			total_priv_drv_data_size;
+	__u32			resource_priv_drv_data_size;
+	__u32			allocation_count;
+};
+
+struct d3dkmt_shareobjects {
+	__u32			object_count;
+	__u32			reserved;
+#ifdef __KERNEL__
+	const struct d3dkmthandle *objects;
+	void			*object_attr;	/* security attributes */
+#else
+	__u64			objects;
+	__u64			object_attr;
+#endif
+	__u32			desired_access;
+	__u32			reserved1;
+#ifdef __KERNEL__
+	__u64			*shared_handle;	/* output file descriptors */
+#else
+	__u64			shared_handle;
+#endif
+};
+
 union d3dkmt_enumadapters_filter {
 	struct {
 		__u64	include_compute_only:1;
@@ -743,6 +831,14 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x3b, struct d3dkmt_waitforsynchronizationobjectfromgpu)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
+#define LX_DXSHAREOBJECTS		\
+	_IOWR(0x47, 0x3f, struct d3dkmt_shareobjects)
+#define LX_DXOPENSYNCOBJECTFROMNTHANDLE2 \
+	_IOWR(0x47, 0x40, struct d3dkmt_opensyncobjectfromnthandle2)
+#define LX_DXQUERYRESOURCEINFOFROMNTHANDLE \
+	_IOWR(0x47, 0x41, struct d3dkmt_queryresourceinfofromnthandle)
+#define LX_DXOPENRESOURCEFROMNTHANDLE	\
+	_IOWR(0x47, 0x42, struct d3dkmt_openresourcefromnthandle)
 
 #define LX_IO_MAX 0x45
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 13/30] drivers: hv: dxgkrnl: Sharing of sync objects
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (10 preceding siblings ...)
  2022-03-01 19:45   ` [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 14/30] drivers: hv: dxgkrnl: Creation of hardware queues. Sync object operations to hw queue Iouri Tarassov
                     ` (17 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement creation of a shared sync objects and the ioctl for sharing
dxgsyncobject objects between processes in the virtual machine.

Sync objects are shared using file descriptor (FD) handles.
The name "NT handle" is used to be compatible with Windows implementation.

An FD handle is created by the LX_DXSHAREOBJECTS ioctl. The created FD
handle could be sent to another process using any Linux API.

To use a shared sync object in other ioctls, the object needs to be
opened using its FD handle. A sync object is opened by the
LX_DXOPENSYNCOBJECTFROMNTHANDLE2 ioctl, which returns a d3dkmthandle
value.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  83 +++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  63 +++++++++
 drivers/hv/dxgkrnl/dxgmodule.c  |   1 +
 drivers/hv/dxgkrnl/dxgvmbus.c   |  63 +++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  15 ++
 drivers/hv/dxgkrnl/ioctl.c      | 240 ++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  20 +++
 7 files changed, 485 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index 48a39805944b..cb5575cdc308 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -176,6 +176,26 @@ void dxgadapter_remove_shared_resource(struct dxgadapter *adapter,
 	up_write(&adapter->shared_resource_list_lock);
 }
 
+void dxgadapter_add_shared_syncobj(struct dxgadapter *adapter,
+				   struct dxgsharedsyncobject *object)
+{
+	down_write(&adapter->shared_resource_list_lock);
+	list_add_tail(&object->adapter_shared_syncobj_list_entry,
+		      &adapter->adapter_shared_syncobj_list_head);
+	up_write(&adapter->shared_resource_list_lock);
+}
+
+void dxgadapter_remove_shared_syncobj(struct dxgadapter *adapter,
+				      struct dxgsharedsyncobject *object)
+{
+	down_write(&adapter->shared_resource_list_lock);
+	if (object->adapter_shared_syncobj_list_entry.next) {
+		list_del(&object->adapter_shared_syncobj_list_entry);
+		object->adapter_shared_syncobj_list_entry.next = NULL;
+	}
+	up_write(&adapter->shared_resource_list_lock);
+}
+
 void dxgadapter_add_syncobj(struct dxgadapter *adapter,
 			    struct dxgsyncobject *object)
 {
@@ -956,6 +976,63 @@ void dxgprocess_adapter_remove_device(struct dxgdevice *device)
 	mutex_unlock(&device->adapter_info->device_list_mutex);
 }
 
+struct dxgsharedsyncobject *dxgsharedsyncobj_create(struct dxgadapter *adapter,
+						    struct dxgsyncobject *so)
+{
+	struct dxgsharedsyncobject *syncobj;
+
+	syncobj = vzalloc(sizeof(*syncobj));
+	if (syncobj) {
+		kref_init(&syncobj->ssyncobj_kref);
+		INIT_LIST_HEAD(&syncobj->shared_syncobj_list_head);
+		syncobj->adapter = adapter;
+		syncobj->type = so->type;
+		syncobj->monitored_fence = so->monitored_fence;
+		dxgadapter_add_shared_syncobj(adapter, syncobj);
+		kref_get(&adapter->adapter_kref);
+		init_rwsem(&syncobj->syncobj_list_lock);
+		mutex_init(&syncobj->fd_mutex);
+	}
+	return syncobj;
+}
+
+void dxgsharedsyncobj_release(struct kref *refcount)
+{
+	struct dxgsharedsyncobject *syncobj;
+
+	syncobj = container_of(refcount, struct dxgsharedsyncobject,
+			       ssyncobj_kref);
+	pr_debug("Destroying shared sync object %p", syncobj);
+	if (syncobj->adapter) {
+		dxgadapter_remove_shared_syncobj(syncobj->adapter,
+							syncobj);
+		kref_put(&syncobj->adapter->adapter_kref,
+				dxgadapter_release);
+	}
+	vfree(syncobj);
+}
+
+void dxgsharedsyncobj_add_syncobj(struct dxgsharedsyncobject *shared,
+				  struct dxgsyncobject *syncobj)
+{
+	pr_debug("%s 0x%p 0x%p", __func__, shared, syncobj);
+	kref_get(&shared->ssyncobj_kref);
+	down_write(&shared->syncobj_list_lock);
+	list_add(&syncobj->shared_syncobj_list_entry,
+		 &shared->shared_syncobj_list_head);
+	syncobj->shared_owner = shared;
+	up_write(&shared->syncobj_list_lock);
+}
+
+void dxgsharedsyncobj_remove_syncobj(struct dxgsharedsyncobject *shared,
+				     struct dxgsyncobject *syncobj)
+{
+	pr_debug("%s 0x%p", __func__, shared);
+	down_write(&shared->syncobj_list_lock);
+	list_del(&syncobj->shared_syncobj_list_entry);
+	up_write(&shared->syncobj_list_lock);
+}
+
 struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
 					   struct dxgdevice *device,
 					   struct dxgadapter *adapter,
@@ -1090,6 +1167,12 @@ void dxgsyncobject_release(struct kref *refcount)
 	struct dxgsyncobject *syncobj;
 
 	syncobj = container_of(refcount, struct dxgsyncobject, syncobj_kref);
+	if (syncobj->shared_owner) {
+		dxgsharedsyncobj_remove_syncobj(syncobj->shared_owner,
+						syncobj);
+		kref_put(&syncobj->shared_owner->ssyncobj_kref,
+			 dxgsharedsyncobj_release);
+	}
 	if (syncobj->host_event)
 		vfree(syncobj->host_event);
 	vfree(syncobj);
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index bad69cf4dfd7..15cacdd43dee 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -35,6 +35,7 @@ struct dxgallocation;
 struct dxgresource;
 struct dxgsharedresource;
 struct dxgsyncobject;
+struct dxgsharedsyncobject;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -122,6 +123,18 @@ struct dxghosteventcpu {
  * "device" syncobject, because the belong to a device (dxgdevice).
  * Device syncobjects are inserted to a list in dxgdevice.
  *
+ * A syncobject can be "shared", meaning that it could be opened by many
+ * processes.
+ *
+ * Shared syncobjects are inserted to a list in its owner
+ * (dxgsharedsyncobject).
+ * A syncobject can be shared by using a global handle or by using
+ * "NT security handle".
+ * When global handle sharing is used, the handle is created durinig object
+ * creation.
+ * When "NT security" is used, the handle for sharing is create be calling
+ * dxgk_share_objects. On Linux "NT handle" is represented by a file
+ * descriptor. FD points to dxgsharedsyncobject.
  */
 struct dxgsyncobject {
 	struct kref			syncobj_kref;
@@ -131,6 +144,8 @@ struct dxgsyncobject {
 	 * List entry in dxgadapter for other objects
 	 */
 	struct list_head		syncobj_list_entry;
+	/* List entry in the dxgsharedsyncobject object for shared synobjects */
+	struct list_head		shared_syncobj_list_entry;
 	/* Adapter, the syncobject belongs to. NULL for stopped sync obejcts. */
 	struct dxgadapter		*adapter;
 	/*
@@ -141,6 +156,8 @@ struct dxgsyncobject {
 	struct dxgprocess		*process;
 	/* Used by D3DDDI_CPU_NOTIFICATION objects */
 	struct dxghosteventcpu		*host_event;
+	/* Owner object for shared syncobjects */
+	struct dxgsharedsyncobject	*shared_owner;
 	/* CPU virtual address of the fence value for "device" syncobjects */
 	void				*mapped_address;
 	/* Handle in the process handle table */
@@ -172,6 +189,41 @@ struct dxgvgpuchannel {
 	struct hv_device	*hdev;
 };
 
+/*
+ * The object is used as parent of all sync objects, created for a shared
+ * syncobject. When a shared syncobject is created without NT security, the
+ * handle in the global handle table will point to this object.
+ */
+struct dxgsharedsyncobject {
+	struct kref			ssyncobj_kref;
+	/* Referenced by file descriptors */
+	int				host_shared_handle_nt_reference;
+	/* Corresponding handle in the host global handle table */
+	struct d3dkmthandle		host_shared_handle;
+	/*
+	 * When the sync object is shared by NT handle, this is the
+	 * corresponding handle in the host
+	 */
+	struct d3dkmthandle		host_shared_handle_nt;
+	/* Protects access to host_shared_handle_nt */
+	struct mutex			fd_mutex;
+	struct rw_semaphore		syncobj_list_lock;
+	struct list_head		shared_syncobj_list_head;
+	struct list_head		adapter_shared_syncobj_list_entry;
+	struct dxgadapter		*adapter;
+	enum d3dddi_synchronizationobject_type type;
+	u32				monitored_fence:1;
+};
+
+struct dxgsharedsyncobject *dxgsharedsyncobj_create(struct dxgadapter *adapter,
+						    struct dxgsyncobject
+						    *syncobj);
+void dxgsharedsyncobj_release(struct kref *refcount);
+void dxgsharedsyncobj_add_syncobj(struct dxgsharedsyncobject *sharedsyncobj,
+				  struct dxgsyncobject *syncobj);
+void dxgsharedsyncobj_remove_syncobj(struct dxgsharedsyncobject *sharedsyncobj,
+				     struct dxgsyncobject *syncobj);
+
 struct dxgsyncobject *dxgsyncobject_create(struct dxgprocess *process,
 					   struct dxgdevice *device,
 					   struct dxgadapter *adapter,
@@ -362,6 +414,8 @@ struct dxgadapter {
 	struct list_head	adapter_process_list_head;
 	/* List of all dxgsharedresource objects */
 	struct list_head	shared_resource_list_head;
+	/* List of all dxgsharedsyncobject objects */
+	struct list_head	adapter_shared_syncobj_list_head;
 	/* List of all non-device dxgsyncobject objects */
 	struct list_head	syncobj_list_head;
 	/* This lock protects shared resource and syncobject lists */
@@ -389,6 +443,10 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter);
 int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter);
 void dxgadapter_acquire_lock_forced(struct dxgadapter *adapter);
 void dxgadapter_release_lock_exclusive(struct dxgadapter *adapter);
+void dxgadapter_add_shared_syncobj(struct dxgadapter *adapter,
+				   struct dxgsharedsyncobject *so);
+void dxgadapter_remove_shared_syncobj(struct dxgadapter *adapter,
+				      struct dxgsharedsyncobject *so);
 void dxgadapter_add_syncobj(struct dxgadapter *adapter,
 			    struct dxgsyncobject *so);
 void dxgadapter_remove_syncobj(struct dxgsyncobject *so);
@@ -703,6 +761,11 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
+int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
+				    struct dxgvmbuschannel *channel,
+				    struct d3dkmt_opensyncobjectfromnthandle2
+				    *args,
+				    struct dxgsyncobject *syncobj);
 int dxgvmb_send_create_nt_shared_object(struct dxgprocess *process,
 					struct d3dkmthandle object,
 					struct d3dkmthandle *shared_handle);
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index 7ccf76c6ad9b..cfef880a512d 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -251,6 +251,7 @@ int dxgglobal_create_adapter(struct pci_dev *dev, guid_t *guid,
 
 	INIT_LIST_HEAD(&adapter->adapter_process_list_head);
 	INIT_LIST_HEAD(&adapter->shared_resource_list_head);
+	INIT_LIST_HEAD(&adapter->adapter_shared_syncobj_list_head);
 	INIT_LIST_HEAD(&adapter->syncobj_list_head);
 	init_rwsem(&adapter->shared_resource_list_lock);
 	adapter->pci_dev = dev;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index b6b072b189ff..e159b9e70e1c 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -705,6 +705,69 @@ int dxgvmb_send_destroy_process(struct d3dkmthandle process)
 	return ret;
 }
 
+int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
+				    struct dxgvmbuschannel *channel,
+				    struct d3dkmt_opensyncobjectfromnthandle2
+				    *args,
+				    struct dxgsyncobject *syncobj)
+{
+	struct dxgkvmb_command_opensyncobject *command;
+	struct dxgkvmb_command_opensyncobject_return result = { };
+	int ret;
+	struct dxgvmbusmsg msg;
+
+	ret = init_message(&msg, NULL, process, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	command_vm_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_OPENSYNCOBJECT,
+				 process->host_handle);
+	command->device = args->device;
+	command->global_sync_object = syncobj->shared_owner->host_shared_handle;
+	command->flags = args->flags;
+	if (syncobj->monitored_fence)
+		command->engine_affinity =
+			args->monitored_fence.engine_affinity;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	ret = dxgvmb_send_sync_msg(channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+
+	dxgglobal_release_channel_lock();
+
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result.status);
+	if (ret < 0)
+		goto cleanup;
+
+	args->sync_object = result.sync_object;
+	if (syncobj->monitored_fence) {
+		void *va = dxg_map_iospace(result.guest_cpu_physical_address,
+					   PAGE_SIZE, PROT_READ | PROT_WRITE,
+					   true);
+		if (va == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		args->monitored_fence.fence_value_cpu_va = va;
+		args->monitored_fence.fence_value_gpu_va =
+		    result.gpu_virtual_address;
+		syncobj->mapped_address = va;
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_create_nt_shared_object(struct dxgprocess *process,
 					struct d3dkmthandle object,
 					struct d3dkmthandle *shared_handle)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 60433744b46b..d9cc88dee186 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -172,6 +172,21 @@ struct dxgkvmb_command_signalguestevent {
 	bool				dereference_event;
 };
 
+struct dxgkvmb_command_opensyncobject {
+	struct dxgkvmb_command_vm_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		global_sync_object;
+	u32				engine_affinity;
+	struct d3dddi_synchronizationobject_flags flags;
+};
+
+struct dxgkvmb_command_opensyncobject_return {
+	struct d3dkmthandle		sync_object;
+	struct ntstatus			status;
+	u64				gpu_virtual_address;
+	u64				guest_cpu_physical_address;
+};
+
 /*
  * The command returns struct d3dkmthandle of a shared object for the
  * given pre-process object
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 05f59854cbbf..be9d72c4ae4e 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -35,6 +35,33 @@ static char *errorstr(int ret)
 	return ret < 0 ? "err" : "";
 }
 
+static int dxgsyncobj_release(struct inode *inode, struct file *file)
+{
+	struct dxgsharedsyncobject *syncobj = file->private_data;
+
+	pr_debug("%s: %p", __func__, syncobj);
+	mutex_lock(&syncobj->fd_mutex);
+	kref_get(&syncobj->ssyncobj_kref);
+	syncobj->host_shared_handle_nt_reference--;
+	if (syncobj->host_shared_handle_nt_reference == 0) {
+		if (syncobj->host_shared_handle_nt.v) {
+			dxgvmb_send_destroy_nt_shared_object(
+					syncobj->host_shared_handle_nt);
+			pr_debug("Syncobj host_handle_nt destroyed: %x",
+				    syncobj->host_shared_handle_nt.v);
+			syncobj->host_shared_handle_nt.v = 0;
+		}
+		kref_put(&syncobj->ssyncobj_kref, dxgsharedsyncobj_release);
+	}
+	mutex_unlock(&syncobj->fd_mutex);
+	kref_put(&syncobj->ssyncobj_kref, dxgsharedsyncobj_release);
+	return 0;
+}
+
+static const struct file_operations dxg_syncobj_fops = {
+	.release = dxgsyncobj_release,
+};
+
 static int dxgsharedresource_release(struct inode *inode, struct file *file)
 {
 	struct dxgsharedresource *resource = file->private_data;
@@ -1584,6 +1611,7 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 	struct eventfd_ctx *event = NULL;
 	struct dxgsyncobject *syncobj = NULL;
 	bool device_lock_acquired = false;
+	struct dxgsharedsyncobject *syncobjgbl = NULL;
 	struct dxghosteventcpu *host_event = NULL;
 
 	ret = copy_from_user(&args, inargs, sizeof(args));
@@ -1644,6 +1672,22 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 	if (ret < 0)
 		goto cleanup;
 
+	if (args.info.flags.shared) {
+		if (args.info.shared_handle.v == 0) {
+			pr_err("shared handle should not be 0");
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		syncobjgbl = dxgsharedsyncobj_create(device->adapter, syncobj);
+		if (syncobjgbl == NULL) {
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		dxgsharedsyncobj_add_syncobj(syncobjgbl, syncobj);
+
+		syncobjgbl->host_shared_handle = args.info.shared_handle;
+	}
+
 	ret = copy_to_user(inargs, &args, sizeof(args));
 	if (ret) {
 		pr_err("%s failed to copy output args", __func__);
@@ -1672,6 +1716,8 @@ dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 		if (event)
 			eventfd_ctx_put(event);
 	}
+	if (syncobjgbl)
+		kref_put(&syncobjgbl->ssyncobj_kref, dxgsharedsyncobj_release);
 	if (adapter)
 		dxgadapter_release_lock_shared(adapter);
 	if (device_lock_acquired)
@@ -1726,6 +1772,141 @@ dxgk_destroy_sync_object(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_open_sync_object_nt(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_opensyncobjectfromnthandle2 args;
+	struct dxgsyncobject *syncobj = NULL;
+	struct dxgsharedsyncobject *syncobj_fd = NULL;
+	struct file *file = NULL;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dddi_synchronizationobject_flags flags = { };
+	int ret;
+	bool device_lock_acquired = false;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	args.sync_object.v = 0;
+
+	if (args.device.v) {
+		device = dxgprocess_device_by_handle(process, args.device);
+		if (device == NULL) {
+			return -EINVAL;
+			goto cleanup;
+		}
+	} else {
+		pr_err("device handle is missing");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0)
+		goto cleanup;
+
+	device_lock_acquired = true;
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	file = fget(args.nt_handle);
+	if (!file) {
+		pr_err("failed to get file from handle: %llx",
+			   args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (file->f_op != &dxg_syncobj_fops) {
+		pr_err("invalid fd: %llx", args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	syncobj_fd = file->private_data;
+	if (syncobj_fd == NULL) {
+		pr_err("invalid private data: %llx", args.nt_handle);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	flags.shared = 1;
+	flags.nt_security_sharing = 1;
+	syncobj = dxgsyncobject_create(process, device, adapter,
+				       syncobj_fd->type, flags);
+	if (syncobj == NULL) {
+		pr_err("failed to create sync object");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	dxgsharedsyncobj_add_syncobj(syncobj_fd, syncobj);
+
+	ret = dxgvmb_send_open_sync_object_nt(process, &dxgglobal->channel,
+					      &args, syncobj);
+	if (ret < 0) {
+		pr_err("failed to open sync object on host: %x",
+			syncobj_fd->host_shared_handle.v);
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	ret = hmgrtable_assign_handle(&process->handle_table, syncobj,
+				      HMGRENTRY_TYPE_DXGSYNCOBJECT,
+				      args.sync_object);
+	if (ret >= 0) {
+		syncobj->handle = args.sync_object;
+		kref_get(&syncobj->syncobj_kref);
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret >= 0)
+		goto success;
+	pr_err("%s failed to copy output args", __func__);
+
+cleanup:
+
+	if (syncobj) {
+		dxgsyncobject_destroy(process, syncobj);
+		syncobj = NULL;
+	}
+
+	if (args.sync_object.v)
+		dxgvmb_send_destroy_sync_object(process, args.sync_object);
+
+success:
+
+	if (file)
+		fput(file);
+	if (syncobj)
+		kref_put(&syncobj->syncobj_kref, dxgsyncobject_release);
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device_lock_acquired)
+		dxgdevice_release_lock_shared(device);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_signal_sync_object(struct dxgprocess *process, void *__user inargs)
 {
@@ -2379,6 +2560,30 @@ dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgsharedsyncobj_get_host_nt_handle(struct dxgsharedsyncobject *syncobj,
+				    struct dxgprocess *process,
+				    struct d3dkmthandle objecthandle)
+{
+	int ret = 0;
+
+	mutex_lock(&syncobj->fd_mutex);
+	if (syncobj->host_shared_handle_nt_reference == 0) {
+		ret = dxgvmb_send_create_nt_shared_object(process,
+			objecthandle,
+			&syncobj->host_shared_handle_nt);
+		if (ret < 0)
+			goto cleanup;
+		pr_debug("Host_shared_handle_ht: %x",
+			    syncobj->host_shared_handle_nt.v);
+		kref_get(&syncobj->ssyncobj_kref);
+	}
+	syncobj->host_shared_handle_nt_reference++;
+cleanup:
+	mutex_unlock(&syncobj->fd_mutex);
+	return ret;
+}
+
 static int
 dxgsharedresource_get_host_nt_handle(struct dxgsharedresource *resource,
 				     struct dxgprocess *process,
@@ -2404,6 +2609,7 @@ dxgsharedresource_get_host_nt_handle(struct dxgsharedresource *resource,
 }
 
 enum dxg_sharedobject_type {
+	DXG_SHARED_SYNCOBJECT,
 	DXG_SHARED_RESOURCE
 };
 
@@ -2420,6 +2626,10 @@ static int get_object_fd(enum dxg_sharedobject_type type,
 	}
 
 	switch (type) {
+	case DXG_SHARED_SYNCOBJECT:
+		file = anon_inode_getfile("dxgsyncobj",
+					  &dxg_syncobj_fops, object, 0);
+		break;
 	case DXG_SHARED_RESOURCE:
 		file = anon_inode_getfile("dxgresource",
 					  &dxg_resource_fops, object, 0);
@@ -2445,6 +2655,7 @@ dxgk_share_objects(struct dxgprocess *process, void *__user inargs)
 	enum hmgrentry_type object_type;
 	struct dxgsyncobject *syncobj = NULL;
 	struct dxgresource *resource = NULL;
+	struct dxgsharedsyncobject *shared_syncobj = NULL;
 	struct dxgsharedresource *shared_resource = NULL;
 	struct d3dkmthandle *handles = NULL;
 	int object_fd = 0;
@@ -2493,6 +2704,17 @@ dxgk_share_objects(struct dxgprocess *process, void *__user inargs)
 		ret = -EINVAL;
 	} else {
 		switch (object_type) {
+		case HMGRENTRY_TYPE_DXGSYNCOBJECT:
+			syncobj = obj;
+			if (syncobj->shared) {
+				kref_get(&syncobj->syncobj_kref);
+				shared_syncobj = syncobj->shared_owner;
+			} else {
+				pr_err("sync object is not shared");
+				syncobj = NULL;
+				ret = -EINVAL;
+			}
+			break;
 		case HMGRENTRY_TYPE_DXGRESOURCE:
 			resource = obj;
 			if (resource->shared_owner) {
@@ -2516,6 +2738,22 @@ dxgk_share_objects(struct dxgprocess *process, void *__user inargs)
 		goto cleanup;
 
 	switch (object_type) {
+	case HMGRENTRY_TYPE_DXGSYNCOBJECT:
+		ret = get_object_fd(DXG_SHARED_SYNCOBJECT, shared_syncobj,
+				    &object_fd);
+		if (ret < 0) {
+			pr_err("%s get_object_fd failed for sync object",
+				__func__);
+			goto cleanup;
+		}
+		ret = dxgsharedsyncobj_get_host_nt_handle(shared_syncobj,
+							  process,
+							  handles[0]);
+		if (ret < 0) {
+			pr_err("%s get_host_nt_handle failed", __func__);
+			goto cleanup;
+		}
+		break;
 	case HMGRENTRY_TYPE_DXGRESOURCE:
 		ret = get_object_fd(DXG_SHARED_RESOURCE, shared_resource,
 				    &object_fd);
@@ -3051,6 +3289,8 @@ void init_ioctls(void)
 		  LX_DXENUMADAPTERS3);
 	SET_IOCTL(/*0x3f */ dxgk_share_objects,
 		  LX_DXSHAREOBJECTS);
+	SET_IOCTL(/*0x40 */ dxgk_open_sync_object_nt,
+		  LX_DXOPENSYNCOBJECTFROMNTHANDLE2);
 	SET_IOCTL(/*0x41 */ dxgk_query_resource_info_nt,
 		  LX_DXQUERYRESOURCEINFOFROMNTHANDLE);
 	SET_IOCTL(/*0x42 */ dxgk_open_resource_nt,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 4f44dd855238..36466fc18608 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -690,6 +690,26 @@ struct d3dddi_openallocationinfo2 {
 	__u64			reserved[6];
 };
 
+struct d3dkmt_opensyncobjectfromnthandle2 {
+	__u64			nt_handle;
+	struct d3dkmthandle	device;
+	struct d3dddi_synchronizationobject_flags flags;
+	struct d3dkmthandle	sync_object;
+	__u32			reserved1;
+	union {
+		struct {
+#ifdef __KERNEL__
+			void	*fence_value_cpu_va;
+#else
+			__u64	fence_value_cpu_va;
+#endif
+			__u64	fence_value_gpu_va;
+			__u32	engine_affinity;
+		} monitored_fence;
+		__u64	reserved[8];
+	};
+};
+
 struct d3dkmt_openresourcefromnthandle {
 	struct d3dkmthandle	device;
 	__u32			reserved;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 14/30] drivers: hv: dxgkrnl: Creation of hardware queues. Sync object operations to hw queue.
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (11 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 13/30] drivers: hv: dxgkrnl: Sharing of sync objects Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 15/30] drivers: hv: dxgkrnl: Creation of paging queue objects Iouri Tarassov
                     ` (16 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls for creation/destruction of the hardware queue objects:
  - LX_DXCREATEHWQUEUE
  - LX_DXDESTROYHWQUEUE

Implement ioctls to submit "signal" and "wait" operations for
sync objects to hardware queues:
  - LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE,
  - LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE

Hardware queues are used when a compute device supports "hardware
scheduling". This means that the compute device itself schedules execution
of DMA buffers from hardware queues without CPU intervention. This is as
oppose to the "packet scheduling" mode where the software scheduler on the
host schedules DMA buffer execution.

A hardware queue belongs to a dxgcontext object and has an associated
monitored fence object to track execution progress. The monitored fence
object has a mapped CPU address to read the fence value by CPU.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  96 ++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  33 ++++
 drivers/hv/dxgkrnl/dxgprocess.c |   4 +
 drivers/hv/dxgkrnl/dxgvmbus.c   | 155 +++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  20 ++
 drivers/hv/dxgkrnl/ioctl.c      | 321 ++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  73 ++++++++
 7 files changed, 702 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index cb5575cdc308..cdc057371bfe 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -758,6 +758,9 @@ struct dxgcontext *dxgcontext_create(struct dxgdevice *device)
  */
 void dxgcontext_destroy(struct dxgprocess *process, struct dxgcontext *context)
 {
+	struct dxghwqueue *hwqueue;
+	struct dxghwqueue *tmp;
+
 	pr_debug("%s %p\n", __func__, context);
 	context->object_state = DXGOBJECTSTATE_DESTROYED;
 	if (context->device) {
@@ -769,6 +772,10 @@ void dxgcontext_destroy(struct dxgprocess *process, struct dxgcontext *context)
 		dxgdevice_remove_context(context->device, context);
 		kref_put(&context->device->device_kref, dxgdevice_release);
 	}
+	list_for_each_entry_safe(hwqueue, tmp, &context->hwqueue_list_head,
+				 hwqueue_list_entry) {
+		dxghwqueue_destroy(process, hwqueue);
+	}
 	kref_put(&context->context_kref, dxgcontext_release);
 }
 
@@ -795,6 +802,38 @@ void dxgcontext_release(struct kref *refcount)
 	vfree(context);
 }
 
+int dxgcontext_add_hwqueue(struct dxgcontext *context,
+			   struct dxghwqueue *hwqueue)
+{
+	int ret = 0;
+
+	down_write(&context->hwqueue_list_lock);
+	if (dxgcontext_is_active(context))
+		list_add_tail(&hwqueue->hwqueue_list_entry,
+			      &context->hwqueue_list_head);
+	else
+		ret = -ENODEV;
+	up_write(&context->hwqueue_list_lock);
+	return ret;
+}
+
+void dxgcontext_remove_hwqueue(struct dxgcontext *context,
+			       struct dxghwqueue *hwqueue)
+{
+	if (hwqueue->hwqueue_list_entry.next) {
+		list_del(&hwqueue->hwqueue_list_entry);
+		hwqueue->hwqueue_list_entry.next = NULL;
+	}
+}
+
+void dxgcontext_remove_hwqueue_safe(struct dxgcontext *context,
+				    struct dxghwqueue *hwqueue)
+{
+	down_write(&context->hwqueue_list_lock);
+	dxgcontext_remove_hwqueue(context, hwqueue);
+	up_write(&context->hwqueue_list_lock);
+}
+
 struct dxgallocation *dxgallocation_create(struct dxgprocess *process)
 {
 	struct dxgallocation *alloc = vzalloc(sizeof(struct dxgallocation));
@@ -1177,3 +1216,60 @@ void dxgsyncobject_release(struct kref *refcount)
 		vfree(syncobj->host_event);
 	vfree(syncobj);
 }
+
+struct dxghwqueue *dxghwqueue_create(struct dxgcontext *context)
+{
+	struct dxgprocess *process = context->device->process;
+	struct dxghwqueue *hwqueue = vzalloc(sizeof(*hwqueue));
+
+	if (hwqueue) {
+		kref_init(&hwqueue->hwqueue_kref);
+		hwqueue->context = context;
+		hwqueue->process = process;
+		hwqueue->device_handle = context->device->handle;
+		if (dxgcontext_add_hwqueue(context, hwqueue) < 0) {
+			kref_put(&hwqueue->hwqueue_kref, dxghwqueue_release);
+			hwqueue = NULL;
+		} else {
+			kref_get(&context->context_kref);
+		}
+	}
+	return hwqueue;
+}
+
+void dxghwqueue_destroy(struct dxgprocess *process, struct dxghwqueue *hwqueue)
+{
+	pr_debug("%s %p\n", __func__, hwqueue);
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	if (hwqueue->handle.v) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGHWQUEUE,
+				      hwqueue->handle);
+		hwqueue->handle.v = 0;
+	}
+	if (hwqueue->progress_fence_sync_object.v) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_MONITOREDFENCE,
+				      hwqueue->progress_fence_sync_object);
+		hwqueue->progress_fence_sync_object.v = 0;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (hwqueue->progress_fence_mapped_address) {
+		dxg_unmap_iospace(hwqueue->progress_fence_mapped_address,
+				  PAGE_SIZE);
+		hwqueue->progress_fence_mapped_address = NULL;
+	}
+	dxgcontext_remove_hwqueue_safe(hwqueue->context, hwqueue);
+
+	kref_put(&hwqueue->context->context_kref, dxgcontext_release);
+	kref_put(&hwqueue->hwqueue_kref, dxghwqueue_release);
+}
+
+void dxghwqueue_release(struct kref *refcount)
+{
+	struct dxghwqueue *hwqueue;
+
+	hwqueue = container_of(refcount, struct dxghwqueue, hwqueue_kref);
+	vfree(hwqueue);
+}
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 15cacdd43dee..74f412b0d6f5 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -36,6 +36,7 @@ struct dxgresource;
 struct dxgsharedresource;
 struct dxgsyncobject;
 struct dxgsharedsyncobject;
+struct dxghwqueue;
 
 #include "misc.h"
 #include "hmgr.h"
@@ -532,8 +533,32 @@ struct dxgcontext *dxgcontext_create(struct dxgdevice *dev);
 void dxgcontext_destroy(struct dxgprocess *pr, struct dxgcontext *ctx);
 void dxgcontext_destroy_safe(struct dxgprocess *pr, struct dxgcontext *ctx);
 void dxgcontext_release(struct kref *refcount);
+int dxgcontext_add_hwqueue(struct dxgcontext *ctx,
+				       struct dxghwqueue *hq);
+void dxgcontext_remove_hwqueue(struct dxgcontext *ctx, struct dxghwqueue *hq);
+void dxgcontext_remove_hwqueue_safe(struct dxgcontext *ctx,
+				    struct dxghwqueue *hq);
 bool dxgcontext_is_active(struct dxgcontext *ctx);
 
+/*
+ * The object represent the execution hardware queue of a device.
+ */
+struct dxghwqueue {
+	/* entry in the context hw queue list */
+	struct list_head	hwqueue_list_entry;
+	struct kref		hwqueue_kref;
+	struct dxgcontext	*context;
+	struct dxgprocess	*process;
+	struct d3dkmthandle	progress_fence_sync_object;
+	struct d3dkmthandle	handle;
+	struct d3dkmthandle	device_handle;
+	void			*progress_fence_mapped_address;
+};
+
+struct dxghwqueue *dxghwqueue_create(struct dxgcontext *ctx);
+void dxghwqueue_destroy(struct dxgprocess *pr, struct dxghwqueue *hq);
+void dxghwqueue_release(struct kref *refcount);
+
 /*
  * A shared resource object is created to track the list of dxgresource objects,
  * which are opened for the same underlying shared resource.
@@ -758,6 +783,14 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 				     d3dkmt_waitforsynchronizationobjectfromcpu
 				     *args,
 				     u64 cpu_event);
+int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
+			       struct dxgadapter *adapter,
+			       struct d3dkmt_createhwqueue *args,
+			       struct d3dkmt_createhwqueue *__user inargs,
+			       struct dxghwqueue *hq);
+int dxgvmb_send_destroy_hwqueue(struct dxgprocess *process,
+				struct dxgadapter *adapter,
+				struct d3dkmthandle handle);
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 30af930cc8c0..32aad8bfa1a5 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -278,6 +278,10 @@ struct dxgdevice *dxgprocess_device_by_object_handle(struct dxgprocess *process,
 			device_handle =
 			    ((struct dxgcontext *)obj)->device_handle;
 			break;
+		case HMGRENTRY_TYPE_DXGHWQUEUE:
+			device_handle =
+			    ((struct dxghwqueue *)obj)->device_handle;
+			break;
 		default:
 			pr_err("invalid handle type: %d\n", t);
 			break;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index e159b9e70e1c..5d292858edec 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2094,6 +2094,161 @@ int dxgvmb_send_wait_sync_object_gpu(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
+			       struct dxgadapter *adapter,
+			       struct d3dkmt_createhwqueue *args,
+			       struct d3dkmt_createhwqueue *__user inargs,
+			       struct dxghwqueue *hwqueue)
+{
+	struct dxgkvmb_command_createhwqueue *command = NULL;
+	u32 cmd_size = sizeof(struct dxgkvmb_command_createhwqueue);
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->priv_drv_data_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("invalid private driver data size");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args->priv_drv_data_size)
+		cmd_size += args->priv_drv_data_size - 1;
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CREATEHWQUEUE,
+				   process->host_handle);
+	command->context = args->context;
+	command->flags = args->flags;
+	command->priv_drv_data_size = args->priv_drv_data_size;
+	if (args->priv_drv_data_size) {
+		ret = copy_from_user(command->priv_drv_data,
+				     args->priv_drv_data,
+				     args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s failed to copy private data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   command, cmd_size);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(command->status);
+	if (ret < 0) {
+		pr_err("dxgvmb_send_sync_msg failed: %x", command->status.v);
+		goto cleanup;
+	}
+
+	ret = hmgrtable_assign_handle_safe(&process->handle_table, hwqueue,
+					   HMGRENTRY_TYPE_DXGHWQUEUE,
+					   command->hwqueue);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = hmgrtable_assign_handle_safe(&process->handle_table,
+				NULL,
+				HMGRENTRY_TYPE_MONITOREDFENCE,
+				command->hwqueue_progress_fence);
+	if (ret < 0)
+		goto cleanup;
+
+	hwqueue->handle = command->hwqueue;
+	hwqueue->progress_fence_sync_object = command->hwqueue_progress_fence;
+
+	hwqueue->progress_fence_mapped_address =
+		dxg_map_iospace((u64)command->hwqueue_progress_fence_cpuva,
+				PAGE_SIZE, PROT_READ | PROT_WRITE, true);
+	if (hwqueue->progress_fence_mapped_address == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	ret = copy_to_user(&inargs->queue, &command->hwqueue,
+			   sizeof(struct d3dkmthandle));
+	if (ret < 0) {
+		pr_err("%s failed to copy hwqueue handle", __func__);
+		goto cleanup;
+	}
+	ret = copy_to_user(&inargs->queue_progress_fence,
+			   &command->hwqueue_progress_fence,
+			   sizeof(struct d3dkmthandle));
+	if (ret < 0) {
+		pr_err("%s failed to progress fence", __func__);
+		goto cleanup;
+	}
+	ret = copy_to_user(&inargs->queue_progress_fence_cpu_va,
+			   &hwqueue->progress_fence_mapped_address,
+			   sizeof(inargs->queue_progress_fence_cpu_va));
+	if (ret < 0) {
+		pr_err("%s failed to copy fence cpu va", __func__);
+		goto cleanup;
+	}
+	ret = copy_to_user(&inargs->queue_progress_fence_gpu_va,
+			   &command->hwqueue_progress_fence_gpuva,
+			   sizeof(u64));
+	if (ret < 0) {
+		pr_err("%s failed to copy fence gpu va", __func__);
+		goto cleanup;
+	}
+	if (args->priv_drv_data_size) {
+		ret = copy_to_user(args->priv_drv_data,
+				   command->priv_drv_data,
+				   args->priv_drv_data_size);
+		if (ret < 0)
+			pr_err("%s failed to copy private data", __func__);
+	}
+
+cleanup:
+	if (ret < 0) {
+		pr_err("%s failed %x", __func__, ret);
+		if (hwqueue->handle.v) {
+			hmgrtable_free_handle_safe(&process->handle_table,
+						   HMGRENTRY_TYPE_DXGHWQUEUE,
+						   hwqueue->handle);
+			hwqueue->handle.v = 0;
+		}
+		if (command && command->hwqueue.v)
+			dxgvmb_send_destroy_hwqueue(process, adapter,
+						    command->hwqueue);
+	}
+	free_message(&msg, process);
+	return ret;
+}
+
+int dxgvmb_send_destroy_hwqueue(struct dxgprocess *process,
+				struct dxgadapter *adapter,
+				struct d3dkmthandle handle)
+{
+	int ret;
+	struct dxgkvmb_command_destroyhwqueue *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_DESTROYHWQUEUE,
+				   process->host_handle);
+	command->hwqueue = handle;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index d9cc88dee186..a4000b3a1743 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -523,4 +523,24 @@ struct dxgkvmb_command_waitforsyncobjectfromgpu {
 	/* struct d3dkmthandle ObjectHandles[object_count] */
 };
 
+/* Returns the same structure */
+struct dxgkvmb_command_createhwqueue {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct ntstatus			status;
+	struct d3dkmthandle		hwqueue;
+	struct d3dkmthandle		hwqueue_progress_fence;
+	void				*hwqueue_progress_fence_cpuva;
+	u64				hwqueue_progress_fence_gpuva;
+	struct d3dkmthandle		context;
+	struct d3dddi_createhwqueueflags flags;
+	u32				priv_drv_data_size;
+	char				priv_drv_data[1];
+};
+
+/* The command returns ntstatus */
+struct dxgkvmb_command_destroyhwqueue {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		hwqueue;
+};
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index be9d72c4ae4e..111a63235627 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -880,6 +880,160 @@ dxgk_destroy_context(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_create_hwqueue(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_createhwqueue args;
+	struct dxgdevice *device = NULL;
+	struct dxgcontext *context = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct dxghwqueue *hwqueue = NULL;
+	int ret;
+	bool device_lock_acquired = false;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    args.context);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0)
+		goto cleanup;
+
+	device_lock_acquired = true;
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_SHARED);
+	context = hmgrtable_get_object_by_type(&process->handle_table,
+					       HMGRENTRY_TYPE_DXGCONTEXT,
+					       args.context);
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_SHARED);
+
+	if (context == NULL) {
+		pr_err("Invalid context handle %x", args.context.v);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hwqueue = dxghwqueue_create(context);
+	if (hwqueue == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_create_hwqueue(process, adapter, &args,
+					 inargs, hwqueue);
+
+cleanup:
+
+	if (ret < 0 && hwqueue)
+		dxghwqueue_destroy(process, hwqueue);
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device_lock_acquired)
+		dxgdevice_release_lock_shared(device);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int dxgk_destroy_hwqueue(struct dxgprocess *process,
+					    void *__user inargs)
+{
+	struct d3dkmt_destroyhwqueue args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct dxghwqueue *hwqueue = NULL;
+	struct d3dkmthandle device_handle = {};
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	hwqueue = hmgrtable_get_object_by_type(&process->handle_table,
+					       HMGRENTRY_TYPE_DXGHWQUEUE,
+					       args.queue);
+	if (hwqueue) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGHWQUEUE, args.queue);
+		hwqueue->handle.v = 0;
+		device_handle = hwqueue->device_handle;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	if (hwqueue == NULL) {
+		pr_err("invalid hwqueue handle: %x", args.queue.v);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, device_handle);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_destroy_hwqueue(process, adapter, args.queue);
+
+	dxghwqueue_destroy(process, hwqueue);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 get_standard_alloc_priv_data(struct dxgdevice *device,
 			     struct d3dkmt_createstandardallocation *alloc_info,
@@ -1601,6 +1755,165 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_submit_signal_to_hwqueue(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_submitsignalsyncobjectstohwqueue args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dkmthandle hwqueue = {};
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.hwqueue_count > D3DDDI_MAX_BROADCAST_CONTEXT ||
+	    args.hwqueue_count == 0) {
+		pr_err("invalid hwqueue count");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count > D3DDDI_MAX_OBJECT_SIGNALED ||
+	    args.object_count == 0) {
+		pr_err("invalid number of syn cobject");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = copy_from_user(&hwqueue, args.hwqueues,
+			     sizeof(struct d3dkmthandle));
+	if (ret) {
+		pr_err("%s failed to copy hwqueue handle", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGHWQUEUE,
+						    hwqueue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_signal_sync_object(process, adapter,
+					     args.flags, 0, zerohandle,
+					     args.object_count, args.objects,
+					     args.hwqueue_count, args.hwqueues,
+					     args.object_count,
+					     args.fence_values, NULL,
+					     zerohandle);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_submit_wait_to_hwqueue(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_submitwaitforsyncobjectstohwqueue args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int ret;
+	struct d3dkmthandle *objects = NULL;
+	u32 object_size;
+	u64 *fences = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.object_count > D3DDDI_MAX_OBJECT_WAITED_ON ||
+	    args.object_count == 0) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	object_size = sizeof(struct d3dkmthandle) * args.object_count;
+	objects = vzalloc(object_size);
+	if (objects == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret = copy_from_user(objects, args.objects, object_size);
+	if (ret) {
+		pr_err("%s failed to copy objects", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	object_size = sizeof(u64) * args.object_count;
+	fences = vzalloc(object_size);
+	if (fences == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	ret = copy_from_user(fences, args.fence_values, object_size);
+	if (ret) {
+		pr_err("%s failed to copy fence values", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGHWQUEUE,
+						    args.hwqueue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_wait_sync_object_gpu(process, adapter,
+					       args.hwqueue, args.object_count,
+					       objects, fences, false);
+
+cleanup:
+
+	if (objects)
+		vfree(objects);
+	if (fences)
+		vfree(fences);
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 {
@@ -3271,8 +3584,12 @@ void init_ioctls(void)
 		  LX_DXENUMADAPTERS2);
 	SET_IOCTL(/*0x15 */ dxgk_close_adapter,
 		  LX_DXCLOSEADAPTER);
+	SET_IOCTL(/*0x18 */ dxgk_create_hwqueue,
+		  LX_DXCREATEHWQUEUE);
 	SET_IOCTL(/*0x19 */ dxgk_destroy_device,
 		  LX_DXDESTROYDEVICE);
+	SET_IOCTL(/*0x1b */ dxgk_destroy_hwqueue,
+		  LX_DXDESTROYHWQUEUE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
@@ -3281,6 +3598,10 @@ void init_ioctls(void)
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU);
 	SET_IOCTL(/*0x33 */ dxgk_signal_sync_object_gpu2,
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2);
+	SET_IOCTL(/*0x35 */ dxgk_submit_wait_to_hwqueue,
+		  LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE);
+	SET_IOCTL(/*0x36 */ dxgk_submit_signal_to_hwqueue,
+		  LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE);
 	SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 36466fc18608..0b3cbe8ddcab 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -197,6 +197,16 @@ struct d3dkmt_createcontextvirtual {
 	struct d3dkmthandle		context;
 };
 
+struct d3dddi_createhwqueueflags {
+	union {
+		struct {
+			__u32		disable_gpu_timeout:1;
+			__u32		reserved:31;
+		};
+		__u32			value;
+	};
+};
+
 enum d3dkmdt_gdisurfacetype {
 	_D3DKMDT_GDISURFACE_INVALID				= 0,
 	_D3DKMDT_GDISURFACE_TEXTURE				= 1,
@@ -690,6 +700,61 @@ struct d3dddi_openallocationinfo2 {
 	__u64			reserved[6];
 };
 
+struct d3dkmt_createhwqueue {
+	struct d3dkmthandle	context;
+	struct d3dddi_createhwqueueflags flags;
+	__u32			priv_drv_data_size;
+	__u32			reserved;
+#ifdef __KERNEL__
+	void			*priv_drv_data;
+#else
+	__u64			priv_drv_data;
+#endif
+	struct d3dkmthandle	queue;
+	struct d3dkmthandle	queue_progress_fence;
+#ifdef __KERNEL__
+	void			*queue_progress_fence_cpu_va;
+#else
+	__u64			queue_progress_fence_cpu_va;
+#endif
+	__u64			queue_progress_fence_gpu_va;
+};
+
+struct d3dkmt_destroyhwqueue {
+	struct d3dkmthandle	queue;
+};
+
+struct d3dkmt_submitwaitforsyncobjectstohwqueue {
+	struct d3dkmthandle	hwqueue;
+	__u32			object_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*objects;
+	__u64			*fence_values;
+#else
+	__u64			objects;
+	__u64			fence_values;
+#endif
+};
+
+struct d3dkmt_submitsignalsyncobjectstohwqueue {
+	struct d3dddicb_signalflags	flags;
+	__u32				hwqueue_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle		*hwqueues;
+#else
+	__u64				hwqueues;
+#endif
+	__u32				object_count;
+	__u32				reserved;
+#ifdef __KERNEL__
+	struct d3dkmthandle		*objects;
+	__u64				*fence_values;
+#else
+	__u64				objects;
+	__u64				fence_values;
+#endif
+};
+
 struct d3dkmt_opensyncobjectfromnthandle2 {
 	__u64			nt_handle;
 	struct d3dkmthandle	device;
@@ -835,6 +900,10 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x14, struct d3dkmt_enumadapters2)
 #define LX_DXCLOSEADAPTER		\
 	_IOWR(0x47, 0x15, struct d3dkmt_closeadapter)
+#define LX_DXCREATEHWQUEUE		\
+	_IOWR(0x47, 0x18, struct d3dkmt_createhwqueue)
+#define LX_DXDESTROYHWQUEUE		\
+	_IOWR(0x47, 0x1b, struct d3dkmt_destroyhwqueue)
 #define LX_DXDESTROYDEVICE		\
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
@@ -845,6 +914,10 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x32, struct d3dkmt_signalsynchronizationobjectfromgpu)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2 \
 	_IOWR(0x47, 0x33, struct d3dkmt_signalsynchronizationobjectfromgpu2)
+#define LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE \
+	_IOWR(0x47, 0x35, struct d3dkmt_submitsignalsyncobjectstohwqueue)
+#define LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE \
+	_IOWR(0x47, 0x36, struct d3dkmt_submitwaitforsyncobjectstohwqueue)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x3a, struct d3dkmt_waitforsynchronizationobjectfromcpu)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 15/30] drivers: hv: dxgkrnl: Creation of paging queue objects.
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (12 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 14/30] drivers: hv: dxgkrnl: Creation of hardware queues. Sync object operations to hw queue Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 16/30] drivers: hv: dxgkrnl: Submit execution commands to the compute device Iouri Tarassov
                     ` (15 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls for creation/destruction of the paging queue objects:
  - LX_DXCREATEPAGINGQUEUE,
  - LX_DXDESTROYPAGINGQUEUE

Paging queue objects (dxgpagingqueue) contain operations, which
handle residency of device accessible allocations. An allocation is
resident, when the device has access to it. For example, the allocation
resides in local device memory or device page tables point to system
memory which is made non-pageable.

Each paging queue has an associated monitored fence sync object, which
is used to detect when a paging operation is completed.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  89 +++++++++++++++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  24 +++++
 drivers/hv/dxgkrnl/dxgprocess.c |   4 +
 drivers/hv/dxgkrnl/dxgvmbus.c   |  74 +++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  17 +++
 drivers/hv/dxgkrnl/ioctl.c      | 184 ++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  27 +++++
 7 files changed, 419 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index cdc057371bfe..b162a37cf389 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -282,6 +282,7 @@ struct dxgdevice *dxgdevice_create(struct dxgadapter *adapter,
 void dxgdevice_stop(struct dxgdevice *device)
 {
 	struct dxgallocation *alloc;
+	struct dxgpagingqueue *pqueue;
 	struct dxgsyncobject *syncobj;
 
 	pr_debug("%s: %p", __func__, device);
@@ -292,6 +293,10 @@ void dxgdevice_stop(struct dxgdevice *device)
 	dxgdevice_release_alloc_list_lock(device);
 
 	hmgrtable_lock(&device->process->handle_table, DXGLOCK_EXCL);
+	list_for_each_entry(pqueue, &device->pqueue_list_head,
+			    pqueue_list_entry) {
+		dxgpagingqueue_stop(pqueue);
+	}
 	list_for_each_entry(syncobj, &device->syncobj_list_head,
 			    syncobj_list_entry) {
 		dxgsyncobject_stop(syncobj);
@@ -379,6 +384,17 @@ void dxgdevice_destroy(struct dxgdevice *device)
 		dxgdevice_release_context_list_lock(device);
 	}
 
+	{
+		struct dxgpagingqueue *tmp;
+		struct dxgpagingqueue *pqueue;
+
+		pr_debug("destroying paging queues\n");
+		list_for_each_entry_safe(pqueue, tmp, &device->pqueue_list_head,
+					 pqueue_list_entry) {
+			dxgpagingqueue_destroy(pqueue);
+		}
+	}
+
 	/* Guest handles need to be released before the host handles */
 	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
 	if (device->handle_valid) {
@@ -711,6 +727,26 @@ void dxgdevice_release(struct kref *refcount)
 	vfree(device);
 }
 
+void dxgdevice_add_paging_queue(struct dxgdevice *device,
+				struct dxgpagingqueue *entry)
+{
+	dxgdevice_acquire_alloc_list_lock(device);
+	list_add_tail(&entry->pqueue_list_entry, &device->pqueue_list_head);
+	dxgdevice_release_alloc_list_lock(device);
+}
+
+void dxgdevice_remove_paging_queue(struct dxgpagingqueue *pqueue)
+{
+	struct dxgdevice *device = pqueue->device;
+
+	dxgdevice_acquire_alloc_list_lock(device);
+	if (pqueue->pqueue_list_entry.next) {
+		list_del(&pqueue->pqueue_list_entry);
+		pqueue->pqueue_list_entry.next = NULL;
+	}
+	dxgdevice_release_alloc_list_lock(device);
+}
+
 void dxgdevice_add_syncobj(struct dxgdevice *device,
 			   struct dxgsyncobject *syncobj)
 {
@@ -894,6 +930,59 @@ void dxgallocation_destroy(struct dxgallocation *alloc)
 	vfree(alloc);
 }
 
+struct dxgpagingqueue *dxgpagingqueue_create(struct dxgdevice *device)
+{
+	struct dxgpagingqueue *pqueue;
+
+	pqueue = vzalloc(sizeof(*pqueue));
+	if (pqueue) {
+		pqueue->device = device;
+		pqueue->process = device->process;
+		pqueue->device_handle = device->handle;
+		dxgdevice_add_paging_queue(device, pqueue);
+	}
+	return pqueue;
+}
+
+void dxgpagingqueue_stop(struct dxgpagingqueue *pqueue)
+{
+	int ret;
+
+	if (pqueue->mapped_address) {
+		ret = dxg_unmap_iospace(pqueue->mapped_address, PAGE_SIZE);
+		pr_debug("fence is unmapped %d %p",
+			    ret, pqueue->mapped_address);
+		pqueue->mapped_address = NULL;
+	}
+}
+
+void dxgpagingqueue_destroy(struct dxgpagingqueue *pqueue)
+{
+	struct dxgprocess *process = pqueue->process;
+
+	pr_debug("%s %p %x\n", __func__, pqueue, pqueue->handle.v);
+
+	dxgpagingqueue_stop(pqueue);
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	if (pqueue->handle.v) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+				      pqueue->handle);
+		pqueue->handle.v = 0;
+	}
+	if (pqueue->syncobj_handle.v) {
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_MONITOREDFENCE,
+				      pqueue->syncobj_handle);
+		pqueue->syncobj_handle.v = 0;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+	if (pqueue->device)
+		dxgdevice_remove_paging_queue(pqueue);
+	vfree(pqueue);
+}
+
 struct dxgprocess_adapter *dxgprocess_adapter_create(struct dxgprocess *process,
 						     struct dxgadapter *adapter)
 {
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 74f412b0d6f5..a9aaefe9c168 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -89,6 +89,16 @@ int dxgvmbuschannel_init(struct dxgvmbuschannel *ch, struct hv_device *hdev);
 void dxgvmbuschannel_destroy(struct dxgvmbuschannel *ch);
 void dxgvmbuschannel_receive(void *ctx);
 
+struct dxgpagingqueue {
+	struct dxgdevice	*device;
+	struct dxgprocess	*process;
+	struct list_head	pqueue_list_entry;
+	struct d3dkmthandle	device_handle;
+	struct d3dkmthandle	handle;
+	struct d3dkmthandle	syncobj_handle;
+	void			*mapped_address;
+};
+
 /*
  * The structure describes an event, which will be signaled by
  * a message from host.
@@ -112,6 +122,10 @@ struct dxghosteventcpu {
 	bool			remove_from_list;
 };
 
+struct dxgpagingqueue *dxgpagingqueue_create(struct dxgdevice *device);
+void dxgpagingqueue_destroy(struct dxgpagingqueue *pqueue);
+void dxgpagingqueue_stop(struct dxgpagingqueue *pqueue);
+
 /*
  * This is GPU synchronization object, which is used to synchronize execution
  * between GPU contextx/hardware queues or for tracking GPU execution progress.
@@ -503,6 +517,9 @@ void dxgdevice_remove_alloc_safe(struct dxgdevice *dev,
 				 struct dxgallocation *a);
 void dxgdevice_add_resource(struct dxgdevice *dev, struct dxgresource *res);
 void dxgdevice_remove_resource(struct dxgdevice *dev, struct dxgresource *res);
+void dxgdevice_add_paging_queue(struct dxgdevice *dev,
+				struct dxgpagingqueue *pqueue);
+void dxgdevice_remove_paging_queue(struct dxgpagingqueue *pqueue);
 void dxgdevice_add_syncobj(struct dxgdevice *dev, struct dxgsyncobject *so);
 void dxgdevice_remove_syncobj(struct dxgsyncobject *so);
 bool dxgdevice_is_active(struct dxgdevice *dev);
@@ -742,6 +759,13 @@ dxgvmb_send_create_context(struct dxgadapter *adapter,
 int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
 				struct dxgprocess *process,
 				struct d3dkmthandle h);
+int dxgvmb_send_create_paging_queue(struct dxgprocess *pr,
+				    struct dxgdevice *dev,
+				    struct d3dkmt_createpagingqueue *args,
+				    struct dxgpagingqueue *pq);
+int dxgvmb_send_destroy_paging_queue(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmthandle h);
 int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 				  struct d3dkmt_createallocation *args,
 				  struct d3dkmt_createallocation *__user inargs,
diff --git a/drivers/hv/dxgkrnl/dxgprocess.c b/drivers/hv/dxgkrnl/dxgprocess.c
index 32aad8bfa1a5..d4199995d279 100644
--- a/drivers/hv/dxgkrnl/dxgprocess.c
+++ b/drivers/hv/dxgkrnl/dxgprocess.c
@@ -278,6 +278,10 @@ struct dxgdevice *dxgprocess_device_by_object_handle(struct dxgprocess *process,
 			device_handle =
 			    ((struct dxgcontext *)obj)->device_handle;
 			break;
+		case HMGRENTRY_TYPE_DXGPAGINGQUEUE:
+			device_handle =
+			    ((struct dxgpagingqueue *)obj)->device_handle;
+			break;
 		case HMGRENTRY_TYPE_DXGHWQUEUE:
 			device_handle =
 			    ((struct dxghwqueue *)obj)->device_handle;
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 5d292858edec..fcfd5544f651 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1147,6 +1147,80 @@ int dxgvmb_send_destroy_context(struct dxgadapter *adapter,
 	return ret;
 }
 
+int dxgvmb_send_create_paging_queue(struct dxgprocess *process,
+				    struct dxgdevice *device,
+				    struct d3dkmt_createpagingqueue *args,
+				    struct dxgpagingqueue *pqueue)
+{
+	struct dxgkvmb_command_createpagingqueue_return result;
+	struct dxgkvmb_command_createpagingqueue *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, device->adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CREATEPAGINGQUEUE,
+				   process->host_handle);
+	command->args = *args;
+	args->paging_queue.v = 0;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+				   sizeof(result));
+	if (ret < 0) {
+		pr_err("send_create_paging_queue failed %x", ret);
+		goto cleanup;
+	}
+
+	args->paging_queue = result.paging_queue;
+	args->sync_object = result.sync_object;
+	args->fence_cpu_virtual_address =
+	    dxg_map_iospace(result.fence_storage_physical_address, PAGE_SIZE,
+			    PROT_READ | PROT_WRITE, true);
+	if (args->fence_cpu_virtual_address == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	pqueue->mapped_address = args->fence_cpu_virtual_address;
+	pqueue->handle = args->paging_queue;
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_destroy_paging_queue(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmthandle h)
+{
+	int ret;
+	struct dxgkvmb_command_destroypagingqueue *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_DESTROYPAGINGQUEUE,
+				   process->host_handle);
+	command->paging_queue = h;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, NULL, 0);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 static int
 copy_private_data(struct d3dkmt_createallocation *args,
 		  struct dxgkvmb_command_createallocation *command,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index a4000b3a1743..10ec7efa4f27 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -462,6 +462,23 @@ struct dxgkvmb_command_destroycontext {
 	struct d3dkmthandle	context;
 };
 
+struct dxgkvmb_command_createpagingqueue {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_createpagingqueue	args;
+};
+
+struct dxgkvmb_command_createpagingqueue_return {
+	struct d3dkmthandle	paging_queue;
+	struct d3dkmthandle	sync_object;
+	u64			fence_storage_physical_address;
+	u64			fence_storage_offset;
+};
+
+struct dxgkvmb_command_destroypagingqueue {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle	paging_queue;
+};
+
 struct dxgkvmb_command_createsyncobject {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_createsynchronizationobject2 args;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 111a63235627..8992679de5c4 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1034,6 +1034,186 @@ static int dxgk_destroy_hwqueue(struct dxgprocess *process,
 	return ret;
 }
 
+static int
+dxgk_create_paging_queue(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_createpagingqueue args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct dxgpagingqueue *pqueue = NULL;
+	int ret;
+	struct d3dkmthandle host_handle = {};
+	bool device_lock_acquired = false;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0)
+		goto cleanup;
+
+	device_lock_acquired = true;
+	adapter = device->adapter;
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	pqueue = dxgpagingqueue_create(device);
+	if (pqueue == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_create_paging_queue(process, device, &args, pqueue);
+	if (ret >= 0) {
+		host_handle = args.paging_queue;
+
+		ret = copy_to_user(inargs, &args, sizeof(args));
+		if (ret < 0) {
+			pr_err("%s failed to copy input args", __func__);
+			goto cleanup;
+		}
+
+		hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+		ret = hmgrtable_assign_handle(&process->handle_table, pqueue,
+					      HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+					      host_handle);
+		if (ret >= 0) {
+			pqueue->handle = host_handle;
+			ret = hmgrtable_assign_handle(&process->handle_table,
+						NULL,
+						HMGRENTRY_TYPE_MONITOREDFENCE,
+						args.sync_object);
+			if (ret >= 0)
+				pqueue->syncobj_handle = args.sync_object;
+		}
+		hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+		/* should not fail after this */
+	}
+
+cleanup:
+
+	if (ret < 0) {
+		if (pqueue)
+			dxgpagingqueue_destroy(pqueue);
+		if (host_handle.v)
+			dxgvmb_send_destroy_paging_queue(process,
+							 adapter,
+							 host_handle);
+	}
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device) {
+		if (device_lock_acquired)
+			dxgdevice_release_lock_shared(device);
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_destroy_paging_queue(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dddi_destroypagingqueue args;
+	struct dxgpagingqueue *paging_queue = NULL;
+	int ret;
+	struct d3dkmthandle device_handle = {};
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	paging_queue = hmgrtable_get_object_by_type(&process->handle_table,
+						HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+						args.paging_queue);
+	if (paging_queue) {
+		device_handle = paging_queue->device_handle;
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+				      args.paging_queue);
+		hmgrtable_free_handle(&process->handle_table,
+				      HMGRENTRY_TYPE_MONITOREDFENCE,
+				      paging_queue->syncobj_handle);
+		paging_queue->syncobj_handle.v = 0;
+		paging_queue->handle.v = 0;
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	if (device_handle.v)
+		device = dxgprocess_device_by_handle(process, device_handle);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgdevice_acquire_lock_shared(device);
+	if (ret < 0) {
+		kref_put(&device->device_kref, dxgdevice_release);
+		device = NULL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_destroy_paging_queue(process, adapter,
+					       args.paging_queue);
+
+	dxgpagingqueue_destroy(paging_queue);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device) {
+		dxgdevice_release_lock_shared(device);
+		kref_put(&device->device_kref, dxgdevice_release);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 get_standard_alloc_priv_data(struct dxgdevice *device,
 			     struct d3dkmt_createstandardallocation *alloc_info,
@@ -3570,6 +3750,8 @@ void init_ioctls(void)
 		  LX_DXDESTROYCONTEXT);
 	SET_IOCTL(/*0x6 */ dxgk_create_allocation,
 		  LX_DXCREATEALLOCATION);
+	SET_IOCTL(/*0x7 */ dxgk_create_paging_queue,
+		  LX_DXCREATEPAGINGQUEUE);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0x10 */ dxgk_create_sync_object,
@@ -3590,6 +3772,8 @@ void init_ioctls(void)
 		  LX_DXDESTROYDEVICE);
 	SET_IOCTL(/*0x1b */ dxgk_destroy_hwqueue,
 		  LX_DXDESTROYHWQUEUE);
+	SET_IOCTL(/*0x1c */ dxgk_destroy_paging_queue,
+		  LX_DXDESTROYPAGINGQUEUE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 0b3cbe8ddcab..a35e02c4cf17 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -207,6 +207,29 @@ struct d3dddi_createhwqueueflags {
 	};
 };
 
+enum d3dddi_pagingqueue_priority {
+	_D3DDDI_PAGINGQUEUE_PRIORITY_BELOW_NORMAL	= -1,
+	_D3DDDI_PAGINGQUEUE_PRIORITY_NORMAL		= 0,
+	_D3DDDI_PAGINGQUEUE_PRIORITY_ABOVE_NORMAL	= 1,
+};
+
+struct d3dkmt_createpagingqueue {
+	struct d3dkmthandle		device;
+	enum d3dddi_pagingqueue_priority priority;
+	struct d3dkmthandle		paging_queue;
+	struct d3dkmthandle		sync_object;
+#ifdef __KERNEL__
+	void				*fence_cpu_virtual_address;
+#else
+	__u64				fence_cpu_virtual_address;
+#endif
+	__u32				physical_adapter_index;
+};
+
+struct d3dddi_destroypagingqueue {
+	struct d3dkmthandle		paging_queue;
+};
+
 enum d3dkmdt_gdisurfacetype {
 	_D3DKMDT_GDISURFACE_INVALID				= 0,
 	_D3DKMDT_GDISURFACE_TEXTURE				= 1,
@@ -886,6 +909,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x05, struct d3dkmt_destroycontext)
 #define LX_DXCREATEALLOCATION		\
 	_IOWR(0x47, 0x06, struct d3dkmt_createallocation)
+#define LX_DXCREATEPAGINGQUEUE		\
+	_IOWR(0x47, 0x07, struct d3dkmt_createpagingqueue)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXCREATESYNCHRONIZATIONOBJECT \
@@ -904,6 +929,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x18, struct d3dkmt_createhwqueue)
 #define LX_DXDESTROYHWQUEUE		\
 	_IOWR(0x47, 0x1b, struct d3dkmt_destroyhwqueue)
+#define LX_DXDESTROYPAGINGQUEUE		\
+	_IOWR(0x47, 0x1c, struct d3dddi_destroypagingqueue)
 #define LX_DXDESTROYDEVICE		\
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 16/30] drivers: hv: dxgkrnl: Submit execution commands to the compute device
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (13 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 15/30] drivers: hv: dxgkrnl: Creation of paging queue objects Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 17/30] drivers: hv: dxgkrnl: Share objects with the host Iouri Tarassov
                     ` (14 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implements ioctls for submission of compute device buffers for execution:
  - LX_DXSUBMITCOMMAND
    The ioctl is used to submit a command buffer to the device,
    working in the "packet scheduling" mode.

  - LX_DXSUBMITCOMMANDTOHWQUEUE
  The ioctl is used to submit a command buffer to the device,
  working in the "hardware scheduling" mode.

To improve performance both ioctls use asynchronous VM bus messages
to communicate with the host as these are high frequency operations.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |   6 ++
 drivers/hv/dxgkrnl/dxgvmbus.c | 111 +++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  14 ++++
 drivers/hv/dxgkrnl/ioctl.c    | 129 ++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  58 +++++++++++++++
 5 files changed, 318 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index a9aaefe9c168..2a2a69169e71 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -776,6 +776,9 @@ int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 int dxgvmb_send_destroy_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 				   struct d3dkmt_destroyallocation2 *args,
 				   struct d3dkmthandle *alloc_handles);
+int dxgvmb_send_submit_command(struct dxgprocess *pr,
+			       struct dxgadapter *adapter,
+			       struct d3dkmt_submitcommand *args);
 int dxgvmb_send_create_sync_object(struct dxgprocess *pr,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_createsynchronizationobject2
@@ -818,6 +821,9 @@ int dxgvmb_send_destroy_hwqueue(struct dxgprocess *process,
 int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_queryadapterinfo *args);
+int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
+				       struct dxgadapter *adapter,
+				       struct d3dkmt_submitcommandtohwqueue *a);
 int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
 				    struct dxgvmbuschannel *channel,
 				    struct d3dkmt_opensyncobjectfromnthandle2
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index fcfd5544f651..eeb7cb76e1a5 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1886,6 +1886,60 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 	return ret;
 }
 
+int dxgvmb_send_submit_command(struct dxgprocess *process,
+			       struct dxgadapter *adapter,
+			       struct d3dkmt_submitcommand *args)
+{
+	int ret;
+	u32 cmd_size;
+	struct dxgkvmb_command_submitcommand *command;
+	u32 hbufsize = args->num_history_buffers * sizeof(struct d3dkmthandle);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	cmd_size = sizeof(struct dxgkvmb_command_submitcommand) +
+	    hbufsize + args->priv_drv_data_size;
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	ret = copy_from_user(&command[1], args->history_buffer_array,
+			     hbufsize);
+	if (ret) {
+		pr_err("%s failed to copy history buffer", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_from_user((u8 *) &command[1] + hbufsize,
+			     args->priv_drv_data, args->priv_drv_data_size);
+	if (ret) {
+		pr_err("%s failed to copy history priv data", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_SUBMITCOMMAND,
+				   process->host_handle);
+	command->args = *args;
+
+	if (dxgglobal->async_msg_enabled) {
+		command->hdr.async_msg = 1;
+		ret = dxgvmb_send_async_msg(msg.channel, msg.hdr, msg.size);
+	} else {
+		ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						    msg.size);
+	}
+
+cleanup:
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 static void set_result(struct d3dkmt_createsynchronizationobject2 *args,
 		       u64 fence_gpu_va, u8 *va)
 {
@@ -2404,3 +2458,60 @@ int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 		pr_debug("err: %s %d", __func__, ret);
 	return ret;
 }
+
+int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
+				       struct dxgadapter *adapter,
+				       struct d3dkmt_submitcommandtohwqueue
+				       *args)
+{
+	int ret = -EINVAL;
+	u32 cmd_size;
+	struct dxgkvmb_command_submitcommandtohwqueue *command;
+	u32 primaries_size = args->num_primaries * sizeof(struct d3dkmthandle);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	cmd_size = sizeof(*command) + args->priv_drv_data_size + primaries_size;
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	if (primaries_size) {
+		ret = copy_from_user(&command[1], args->written_primaries,
+					 primaries_size);
+		if (ret) {
+			pr_err("%s failed to copy primaries handles", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+	if (args->priv_drv_data_size) {
+		ret = copy_from_user((char *)&command[1] + primaries_size,
+				      args->priv_drv_data,
+				      args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s failed to copy primaries data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_SUBMITCOMMANDTOHWQUEUE,
+				   process->host_handle);
+	command->args = *args;
+
+	if (dxgglobal->async_msg_enabled) {
+		command->hdr.async_msg = 1;
+		ret = dxgvmb_send_async_msg(msg.channel, msg.hdr, msg.size);
+	} else {
+		ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						    msg.size);
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 10ec7efa4f27..d65412a57a7c 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -314,6 +314,20 @@ struct dxgkvmb_command_flushdevice {
 	enum dxgdevice_flushschedulerreason	reason;
 };
 
+struct dxgkvmb_command_submitcommand {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_submitcommand	args;
+	/* HistoryBufferHandles */
+	/* PrivateDriverData    */
+};
+
+struct dxgkvmb_command_submitcommandtohwqueue {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_submitcommandtohwqueue args;
+	/* Written primaries */
+	/* PrivateDriverData */
+};
+
 struct dxgkvmb_command_createallocation_allocinfo {
 	u32				flags;
 	u32				priv_drv_data_size;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 8992679de5c4..38d28d1792df 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1935,6 +1935,131 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_submit_command(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_submitcommand args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.broadcast_context_count > D3DDDI_MAX_BROADCAST_CONTEXT ||
+	    args.broadcast_context_count == 0) {
+		pr_err("invalid number of contexts");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.priv_drv_data_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("invalid private data size");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.num_history_buffers > 1024) {
+		pr_err("invalid number of history buffers");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.num_primaries > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("invalid number of primaries");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    args.broadcast_context[0]);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_submit_command(process, adapter, &args);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_submit_command_to_hwqueue(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_submitcommandtohwqueue args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.priv_drv_data_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("invalid private data size");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.num_primaries > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		pr_err("invalid number of primaries");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGHWQUEUE,
+						    args.hwqueue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_submit_command_hwqueue(process, adapter, &args);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_submit_signal_to_hwqueue(struct dxgprocess *process, void *__user inargs)
 {
@@ -3754,6 +3879,8 @@ void init_ioctls(void)
 		  LX_DXCREATEPAGINGQUEUE);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0xf */ dxgk_submit_command,
+		  LX_DXSUBMITCOMMAND);
 	SET_IOCTL(/*0x10 */ dxgk_create_sync_object,
 		  LX_DXCREATESYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x11 */ dxgk_signal_sync_object,
@@ -3782,6 +3909,8 @@ void init_ioctls(void)
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU);
 	SET_IOCTL(/*0x33 */ dxgk_signal_sync_object_gpu2,
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2);
+	SET_IOCTL(/*0x34 */ dxgk_submit_command_to_hwqueue,
+		  LX_DXSUBMITCOMMANDTOHWQUEUE);
 	SET_IOCTL(/*0x35 */ dxgk_submit_wait_to_hwqueue,
 		  LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE);
 	SET_IOCTL(/*0x36 */ dxgk_submit_signal_to_hwqueue,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index a35e02c4cf17..fe0b391d7f6b 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -54,6 +54,8 @@ struct winluid {
 	__u32 b;
 };
 
+#define D3DDDI_MAX_WRITTEN_PRIMARIES		16
+
 #define D3DKMT_CREATEALLOCATION_MAX		1024
 #define D3DKMT_ADAPTERS_MAX			64
 #define D3DDDI_MAX_BROADCAST_CONTEXT		64
@@ -521,6 +523,58 @@ struct d3dkmt_destroysynchronizationobject {
 	struct d3dkmthandle	sync_object;
 };
 
+struct d3dkmt_submitcommandflags {
+	__u32					null_rendering:1;
+	__u32					present_redirected:1;
+	__u32					reserved:30;
+};
+
+struct d3dkmt_submitcommand {
+	__u64					command_buffer;
+	__u32					command_length;
+	struct d3dkmt_submitcommandflags	flags;
+	__u64					present_history_token;
+	__u32					broadcast_context_count;
+	struct d3dkmthandle	broadcast_context[D3DDDI_MAX_BROADCAST_CONTEXT];
+	__u32					reserved;
+#ifdef __KERNEL__
+	void					*priv_drv_data;
+#else
+	__u64					priv_drv_data;
+#endif
+	__u32					priv_drv_data_size;
+	__u32					num_primaries;
+	struct d3dkmthandle	written_primaries[D3DDDI_MAX_WRITTEN_PRIMARIES];
+	__u32					num_history_buffers;
+	__u32					reserved1;
+#ifdef __KERNEL__
+	struct d3dkmthandle			*history_buffer_array;
+#else
+	__u64					history_buffer_array;
+#endif
+};
+
+struct d3dkmt_submitcommandtohwqueue {
+	struct d3dkmthandle	hwqueue;
+	__u32			reserved;
+	__u64			hwqueue_progress_fence_id;
+	__u64			command_buffer;
+	__u32			command_length;
+	__u32			priv_drv_data_size;
+#ifdef __KERNEL__
+	void			*priv_drv_data;
+#else
+	__u64			priv_drv_data;
+#endif
+	__u32			num_primaries;
+	__u32			reserved1;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*written_primaries;
+#else
+	__u64			written_primaries;
+#endif
+};
+
 enum d3dkmt_standardallocationtype {
 	_D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP	= 1,
 	_D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER	= 2,
@@ -913,6 +967,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x07, struct d3dkmt_createpagingqueue)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXSUBMITCOMMAND		\
+	_IOWR(0x47, 0x0f, struct d3dkmt_submitcommand)
 #define LX_DXCREATESYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x10, struct d3dkmt_createsynchronizationobject2)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECT \
@@ -941,6 +997,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x32, struct d3dkmt_signalsynchronizationobjectfromgpu)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2 \
 	_IOWR(0x47, 0x33, struct d3dkmt_signalsynchronizationobjectfromgpu2)
+#define LX_DXSUBMITCOMMANDTOHWQUEUE	\
+	_IOWR(0x47, 0x34, struct d3dkmt_submitcommandtohwqueue)
 #define LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE \
 	_IOWR(0x47, 0x35, struct d3dkmt_submitsignalsyncobjectstohwqueue)
 #define LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 17/30] drivers: hv: dxgkrnl: Share objects with the host
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (14 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 16/30] drivers: hv: dxgkrnl: Submit execution commands to the compute device Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 18/30] drivers: hv: dxgkrnl: Query the dxgdevice state Iouri Tarassov
                     ` (13 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the LX_DXSHAREOBJECTWITHHOST ioctl.
This ioctl is used to create a Windows NT handle on the host
for the given shared object (resource or sync object). The NT
handle is returned to the caller. The caller could share the NT
handle with a host application, which needs to access the object.
The host application can open the shared resource using the NT
handle. This way the guest and the host have access to the same
object.

Fix incorrect handling of error results from copy_from_user().

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  2 ++
 drivers/hv/dxgkrnl/dxgvmbus.c | 60 ++++++++++++++++++++++++++++---
 drivers/hv/dxgkrnl/dxgvmbus.h | 18 ++++++++++
 drivers/hv/dxgkrnl/ioctl.c    | 68 ++++++++++++++++++++++++++++-------
 include/uapi/misc/d3dkmthk.h  |  9 +++++
 5 files changed, 140 insertions(+), 17 deletions(-)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 2a2a69169e71..96f04cdadeb5 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -852,6 +852,8 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
 			  void *command,
 			  u32 cmd_size);
+int dxgvmb_send_share_object_with_host(struct dxgprocess *process,
+				struct d3dkmt_shareobjectwithhost *args);
 
 void signal_host_cpu_event(struct dxghostevent *eventhdr);
 int ntstatus2int(struct ntstatus status);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index eeb7cb76e1a5..c1c24ea9c14b 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -874,6 +874,50 @@ int dxgvmb_send_destroy_sync_object(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_share_object_with_host(struct dxgprocess *process,
+				struct d3dkmt_shareobjectwithhost *args)
+{
+	struct dxgkvmb_command_shareobjectwithhost *command;
+	struct dxgkvmb_command_shareobjectwithhost_return result = {};
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, NULL, process, sizeof(*command));
+	if (ret)
+		return ret;
+	command = (void *)msg.msg;
+
+	ret = dxgglobal_acquire_channel_lock();
+	if (ret < 0)
+		goto cleanup;
+
+	command_vm_to_host_init2(&command->hdr,
+				 DXGK_VMBCOMMAND_SHAREOBJECTWITHHOST,
+				 process->host_handle);
+	command->device_handle = args->device_handle;
+	command->object_handle = args->object_handle;
+
+	ret = dxgvmb_send_sync_msg(dxgglobal_get_dxgvmbuschannel(),
+				   msg.hdr, msg.size, &result, sizeof(result));
+
+	dxgglobal_release_channel_lock();
+
+	if (ret || !NT_SUCCESS(result.status)) {
+		if (ret == 0)
+			ret = ntstatus2int(result.status);
+		pr_err("DXGK_VMBCOMMAND_SHAREOBJECTWITHHOST failed: %d %x",
+			ret, result.status.v);
+		goto cleanup;
+	}
+	args->object_vail_nt_handle = result.vail_nt_handle;
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 /*
  * Virtual GPU messages to the host
  */
@@ -2301,37 +2345,43 @@ int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
 
 	ret = copy_to_user(&inargs->queue, &command->hwqueue,
 			   sizeof(struct d3dkmthandle));
-	if (ret < 0) {
+	if (ret) {
 		pr_err("%s failed to copy hwqueue handle", __func__);
+		ret = -EINVAL;
 		goto cleanup;
 	}
 	ret = copy_to_user(&inargs->queue_progress_fence,
 			   &command->hwqueue_progress_fence,
 			   sizeof(struct d3dkmthandle));
-	if (ret < 0) {
+	if (ret) {
 		pr_err("%s failed to progress fence", __func__);
+		ret = -EINVAL;
 		goto cleanup;
 	}
 	ret = copy_to_user(&inargs->queue_progress_fence_cpu_va,
 			   &hwqueue->progress_fence_mapped_address,
 			   sizeof(inargs->queue_progress_fence_cpu_va));
-	if (ret < 0) {
+	if (ret) {
 		pr_err("%s failed to copy fence cpu va", __func__);
+		ret = -EINVAL;
 		goto cleanup;
 	}
 	ret = copy_to_user(&inargs->queue_progress_fence_gpu_va,
 			   &command->hwqueue_progress_fence_gpuva,
 			   sizeof(u64));
-	if (ret < 0) {
+	if (ret) {
 		pr_err("%s failed to copy fence gpu va", __func__);
+		ret = -EINVAL;
 		goto cleanup;
 	}
 	if (args->priv_drv_data_size) {
 		ret = copy_to_user(args->priv_drv_data,
 				   command->priv_drv_data,
 				   args->priv_drv_data_size);
-		if (ret < 0)
+		if (ret) {
 			pr_err("%s failed to copy private data", __func__);
+			ret = -EINVAL;
+		}
 	}
 
 cleanup:
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index d65412a57a7c..20e0659accf1 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -574,4 +574,22 @@ struct dxgkvmb_command_destroyhwqueue {
 	struct d3dkmthandle		hwqueue;
 };
 
+struct dxgkvmb_command_shareobjectwithhost {
+	struct dxgkvmb_command_vm_to_host hdr;
+	struct d3dkmthandle	device_handle;
+	struct d3dkmthandle	object_handle;
+	u64			reserved;
+};
+
+struct dxgkvmb_command_shareobjectwithhost_return {
+	struct ntstatus	status;
+	u32		alignment;
+	u64		vail_nt_handle;
+};
+
+int
+dxgvmb_send_sync_msg(struct dxgvmbuschannel *channel,
+		     void *command, u32 command_size, void *result,
+		     u32 result_size);
+
 #endif /* _DXGVMBUS_H */
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 38d28d1792df..71ba7b75c60f 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1088,8 +1088,9 @@ dxgk_create_paging_queue(struct dxgprocess *process, void *__user inargs)
 		host_handle = args.paging_queue;
 
 		ret = copy_to_user(inargs, &args, sizeof(args));
-		if (ret < 0) {
+		if (ret) {
 			pr_err("%s failed to copy input args", __func__);
+			ret = -EINVAL;
 			goto cleanup;
 		}
 
@@ -2494,8 +2495,9 @@ dxgk_open_sync_object_nt(struct dxgprocess *process, void *__user inargs)
 		goto cleanup;
 
 	ret = copy_to_user(inargs, &args, sizeof(args));
-	if (ret >= 0)
+	if (ret == 0)
 		goto success;
+	ret = -EINVAL;
 	pr_err("%s failed to copy output args", __func__);
 
 cleanup:
@@ -3405,8 +3407,10 @@ dxgk_share_objects(struct dxgprocess *process, void *__user inargs)
 	tmp = (u64) object_fd;
 
 	ret = copy_to_user(args.shared_handle, &tmp, sizeof(u64));
-	if (ret < 0)
+	if (ret) {
 		pr_err("%s failed to copy shared handle", __func__);
+		ret = -EINVAL;
+	}
 
 cleanup:
 	if (ret < 0) {
@@ -3436,8 +3440,6 @@ dxgk_query_resource_info_nt(struct dxgprocess *process, void *__user inargs)
 	struct dxgsharedresource *shared_resource = NULL;
 	struct file *file = NULL;
 
-	pr_debug("ioctl: %s", __func__);
-
 	ret = copy_from_user(&args, inargs, sizeof(args));
 	if (ret) {
 		pr_err("%s failed to copy input args", __func__);
@@ -3492,8 +3494,10 @@ dxgk_query_resource_info_nt(struct dxgprocess *process, void *__user inargs)
 	    shared_resource->alloc_private_data_size;
 
 	ret = copy_to_user(inargs, &args, sizeof(args));
-	if (ret < 0)
+	if (ret) {
 		pr_err("%s failed to copy output args", __func__);
+		ret = -EINVAL;
+	}
 
 cleanup:
 
@@ -3554,8 +3558,9 @@ assign_resource_handles(struct dxgprocess *process,
 		ret = copy_to_user(&args->open_alloc_info[i],
 				   &open_alloc_info,
 				   sizeof(open_alloc_info));
-		if (ret < 0) {
+		if (ret) {
 			pr_err("%s failed to copy alloc info", __func__);
+			ret = -EINVAL;
 			goto cleanup;
 		}
 	}
@@ -3705,8 +3710,9 @@ open_resource(struct dxgprocess *process,
 		ret = copy_to_user(args->private_runtime_data,
 				shared_resource->runtime_private_data,
 				shared_resource->runtime_private_data_size);
-		if (ret < 0) {
+		if (ret) {
 			pr_err("%s failed to copy runtime data", __func__);
+			ret = -EINVAL;
 			goto cleanup;
 		}
 	}
@@ -3715,8 +3721,9 @@ open_resource(struct dxgprocess *process,
 		ret = copy_to_user(args->resource_priv_drv_data,
 				shared_resource->resource_private_data,
 				shared_resource->resource_private_data_size);
-		if (ret < 0) {
+		if (ret) {
 			pr_err("%s failed to copy resource data", __func__);
+			ret = -EINVAL;
 			goto cleanup;
 		}
 	}
@@ -3725,8 +3732,9 @@ open_resource(struct dxgprocess *process,
 		ret = copy_to_user(args->total_priv_drv_data,
 				shared_resource->alloc_private_data,
 				shared_resource->alloc_private_data_size);
-		if (ret < 0) {
+		if (ret) {
 			pr_err("%s failed to copy alloc data", __func__);
+			ret = -EINVAL;
 			goto cleanup;
 		}
 	}
@@ -3739,15 +3747,18 @@ open_resource(struct dxgprocess *process,
 
 	ret = copy_to_user(res_out, &resource_handle,
 			   sizeof(struct d3dkmthandle));
-	if (ret < 0) {
+	if (ret) {
 		pr_err("%s failed to copy resource handle to user", __func__);
+		ret = -EINVAL;
 		goto cleanup;
 	}
 
 	ret = copy_to_user(total_driver_data_size_out,
 			   &args->total_priv_drv_data_size, sizeof(u32));
-	if (ret < 0)
+	if (ret) {
 		pr_err("%s failed to copy total driver data size", __func__);
+		ret = -EINVAL;
+	}
 
 cleanup:
 
@@ -3809,6 +3820,37 @@ dxgk_open_resource_nt(struct dxgprocess *process,
 	return ret;
 }
 
+static int
+dxgk_share_object_with_host(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_shareobjectwithhost args;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_share_object_with_host(process, &args);
+	if (ret) {
+		pr_err("dxgvmb_send_share_object_with_host dailed");
+		goto cleanup;
+	}
+
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy data to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 /*
  * IOCTL processing
  * The driver IOCTLs return
@@ -3929,4 +3971,6 @@ void init_ioctls(void)
 		  LX_DXQUERYRESOURCEINFOFROMNTHANDLE);
 	SET_IOCTL(/*0x42 */ dxgk_open_resource_nt,
 		  LX_DXOPENRESOURCEFROMNTHANDLE);
+	SET_IOCTL(/*0x44 */ dxgk_share_object_with_host,
+		  LX_DXSHAREOBJECTWITHHOST);
 }
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index fe0b391d7f6b..e3b39c1d2b30 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -948,6 +948,13 @@ struct d3dkmt_enumadapters3 {
 #endif
 };
 
+struct d3dkmt_shareobjectwithhost {
+	struct d3dkmthandle	device_handle;
+	struct d3dkmthandle	object_handle;
+	__u64			reserved;
+	__u64			object_vail_nt_handle;
+};
+
 /*
  * Dxgkrnl Graphics Port Driver ioctl definitions
  *
@@ -1017,6 +1024,8 @@ struct d3dkmt_enumadapters3 {
 	_IOWR(0x47, 0x41, struct d3dkmt_queryresourceinfofromnthandle)
 #define LX_DXOPENRESOURCEFROMNTHANDLE	\
 	_IOWR(0x47, 0x42, struct d3dkmt_openresourcefromnthandle)
+#define LX_DXSHAREOBJECTWITHHOST	\
+	_IOWR(0x47, 0x44, struct d3dkmt_shareobjectwithhost)
 
 #define LX_IO_MAX 0x45
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 18/30] drivers: hv: dxgkrnl: Query the dxgdevice state
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (15 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 17/30] drivers: hv: dxgkrnl: Share objects with the host Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 19/30] drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation Iouri Tarassov
                     ` (12 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the ioctl to query the dxgdevice state - LX_DXGETDEVICESTATE.
The IOCTL is used to query the state of the given dxgdevice object (active,
error, etc.).

A call to the dxgdevice execution state could be high frequency.
The following method is used to avoid sending a synchronous VM
bus message to the host for every call:
- When a dxgdevice is created, a pointer to dxgglobal->device_state_counter
  is sent to the host
- Every time the device state on the host is changed, the host will send
  an asynchronous message to the guest (DXGK_VMBCOMMAND_SETGUESTDATA) and
  the guest will increment the device_state_counter value.
- the dxgdevice object has execution_state_counter member, which is equal
  to dxgglobal->device_state_counter value at the time when
  LX_DXGETDEVICESTATE was last processed..
- if execution_state_counter is different from device_state_counter, the
  dxgk_vmbcommand_getdevicestate VM bus message is sent to the host.
  Otherwise, the cached value is returned to the caller.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  11 ++++
 drivers/hv/dxgkrnl/dxgvmbus.c |  66 ++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  26 +++++++++
 drivers/hv/dxgkrnl/ioctl.c    |  66 ++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 101 ++++++++++++++++++++++++++++++----
 5 files changed, 260 insertions(+), 10 deletions(-)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 96f04cdadeb5..4a2c8149dcc5 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -253,12 +253,18 @@ void dxgsyncobject_destroy(struct dxgprocess *process,
 void dxgsyncobject_stop(struct dxgsyncobject *syncobj);
 void dxgsyncobject_release(struct kref *refcount);
 
+/*
+ * device_state_counter - incremented every time the execition state of
+ *	a DXGDEVICE is changed in the host. Used to optimize access to the
+ *	device execution state.
+ */
 struct dxgglobal {
 	struct dxgvmbuschannel	channel;
 	struct delayed_work	dwork;
 	struct hv_device	*hdev;
 	u32			num_adapters;
 	u32			vmbus_ver;	/* Interface version */
+	atomic_t		device_state_counter;
 	struct resource		*mem;
 	u64			mmiospace_base;
 	u64			mmiospace_size;
@@ -499,6 +505,7 @@ struct dxgdevice {
 	struct list_head	syncobj_list_head;
 	struct d3dkmthandle	handle;
 	enum d3dkmt_deviceexecution_state execution_state;
+	int			execution_state_counter;
 	u32			handle_valid;
 };
 
@@ -829,6 +836,10 @@ int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
 				    struct d3dkmt_opensyncobjectfromnthandle2
 				    *args,
 				    struct dxgsyncobject *syncobj);
+int dxgvmb_send_get_device_state(struct dxgprocess *process,
+				 struct dxgadapter *adapter,
+				 struct d3dkmt_getdevicestate *args,
+				 struct d3dkmt_getdevicestate *__user inargs);
 int dxgvmb_send_create_nt_shared_object(struct dxgprocess *process,
 					struct d3dkmthandle object,
 					struct d3dkmthandle *shared_handle);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index c1c24ea9c14b..44b95a63d7ce 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -276,6 +276,23 @@ static void command_vm_to_host_init1(struct dxgkvmb_command_vm_to_host *command,
 	command->channel_type = DXGKVMB_VM_TO_HOST;
 }
 
+static void set_guest_data(struct dxgkvmb_command_host_to_vm *packet,
+			   u32 packet_length)
+{
+	struct dxgkvmb_command_setguestdata *command = (void *)packet;
+
+	pr_debug("%s: %d %d %p %p", __func__,
+		command->data_type,
+		command->data32,
+		command->guest_pointer,
+		&dxgglobal->device_state_counter);
+	if (command->data_type == SETGUESTDATA_DATATYPE_DWORD &&
+	    command->guest_pointer == &dxgglobal->device_state_counter &&
+	    command->data32 != 0) {
+		atomic_inc(&dxgglobal->device_state_counter);
+	}
+}
+
 static void signal_guest_event(struct dxgkvmb_command_host_to_vm *packet,
 			       u32 packet_length)
 {
@@ -306,6 +323,9 @@ static void process_inband_packet(struct dxgvmbuschannel *channel,
 			pr_debug("global packet %d",
 				    packet->command_type);
 			switch (packet->command_type) {
+			case DXGK_VMBCOMMAND_SETGUESTDATA:
+				set_guest_data(packet, packet_length);
+				break;
 			case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
 			case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
 				signal_guest_event(packet, packet_length);
@@ -1028,6 +1048,7 @@ struct d3dkmthandle dxgvmb_send_create_device(struct dxgadapter *adapter,
 	command_vgpu_to_host_init2(&command->hdr, DXGK_VMBCOMMAND_CREATEDEVICE,
 				   process->host_handle);
 	command->flags = args->flags;
+	command->error_code = &dxgglobal->device_state_counter;
 
 	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
 				   &result, sizeof(result));
@@ -1791,6 +1812,51 @@ int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_get_device_state(struct dxgprocess *process,
+				 struct dxgadapter *adapter,
+				 struct d3dkmt_getdevicestate *args,
+				 struct d3dkmt_getdevicestate *__user output)
+{
+	int ret;
+	struct dxgkvmb_command_getdevicestate *command;
+	struct dxgkvmb_command_getdevicestate_return result = { };
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_GETDEVICESTATE,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result.status);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(output, &result.args, sizeof(result.args));
+	if (ret) {
+		pr_err("%s failed to copy output args", __func__);
+		ret = -EINVAL;
+	}
+
+	if (args->state_type == _D3DKMT_DEVICESTATE_EXECUTION)
+		args->execution_state = result.args.execution_state;
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_open_resource(struct dxgprocess *process,
 			      struct dxgadapter *adapter,
 			      struct d3dkmthandle device,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 20e0659accf1..62e2fc3e5d14 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -172,6 +172,22 @@ struct dxgkvmb_command_signalguestevent {
 	bool				dereference_event;
 };
 
+enum set_guestdata_type {
+	SETGUESTDATA_DATATYPE_DWORD	= 0,
+	SETGUESTDATA_DATATYPE_UINT64	= 1
+};
+
+struct dxgkvmb_command_setguestdata {
+	struct dxgkvmb_command_host_to_vm hdr;
+	void *guest_pointer;
+	union {
+		u64	data64;
+		u32	data32;
+	};
+	u32	dereference	: 1;
+	u32	data_type	: 4;
+};
+
 struct dxgkvmb_command_opensyncobject {
 	struct dxgkvmb_command_vm_to_host hdr;
 	struct d3dkmthandle		device;
@@ -574,6 +590,16 @@ struct dxgkvmb_command_destroyhwqueue {
 	struct d3dkmthandle		hwqueue;
 };
 
+struct dxgkvmb_command_getdevicestate {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_getdevicestate	args;
+};
+
+struct dxgkvmb_command_getdevicestate_return {
+	struct d3dkmt_getdevicestate	args;
+	struct ntstatus			status;
+};
+
 struct dxgkvmb_command_shareobjectwithhost {
 	struct dxgkvmb_command_vm_to_host hdr;
 	struct d3dkmthandle	device_handle;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 71ba7b75c60f..bb6eab08898f 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3180,6 +3180,70 @@ dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_getdevicestate args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int global_device_state_counter = 0;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	if (args.state_type == _D3DKMT_DEVICESTATE_EXECUTION) {
+		global_device_state_counter =
+			atomic_read(&dxgglobal->device_state_counter);
+		if (device->execution_state_counter ==
+		    global_device_state_counter) {
+			args.execution_state = device->execution_state;
+			ret = copy_to_user(inargs, &args, sizeof(args));
+			if (ret) {
+				pr_err("%s failed to copy args to user",
+					__func__);
+				ret = -EINVAL;
+			}
+			goto cleanup;
+		}
+	}
+
+	ret = dxgvmb_send_get_device_state(process, adapter, &args, inargs);
+
+	if (ret == 0 && args.state_type == _D3DKMT_DEVICESTATE_EXECUTION) {
+		device->execution_state = args.execution_state;
+		device->execution_state_counter = global_device_state_counter;
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	if (ret < 0)
+		pr_err("%s failed %x", __func__, ret);
+
+	return ret;
+}
+
 static int
 dxgsharedsyncobj_get_host_nt_handle(struct dxgsharedsyncobject *syncobj,
 				    struct dxgprocess *process,
@@ -3921,6 +3985,8 @@ void init_ioctls(void)
 		  LX_DXCREATEPAGINGQUEUE);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0xe */ dxgk_get_device_state,
+		  LX_DXGETDEVICESTATE);
 	SET_IOCTL(/*0xf */ dxgk_submit_command,
 		  LX_DXSUBMITCOMMAND);
 	SET_IOCTL(/*0x10 */ dxgk_create_sync_object,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index e3b39c1d2b30..a0af63f046c6 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -232,6 +232,95 @@ struct d3dddi_destroypagingqueue {
 	struct d3dkmthandle		paging_queue;
 };
 
+enum dxgk_render_pipeline_stage {
+	_DXGK_RENDER_PIPELINE_STAGE_UNKNOWN		= 0,
+	_DXGK_RENDER_PIPELINE_STAGE_INPUT_ASSEMBLER	= 1,
+	_DXGK_RENDER_PIPELINE_STAGE_VERTEX_SHADER	= 2,
+	_DXGK_RENDER_PIPELINE_STAGE_GEOMETRY_SHADER	= 3,
+	_DXGK_RENDER_PIPELINE_STAGE_STREAM_OUTPUT	= 4,
+	_DXGK_RENDER_PIPELINE_STAGE_RASTERIZER		= 5,
+	_DXGK_RENDER_PIPELINE_STAGE_PIXEL_SHADER	= 6,
+	_DXGK_RENDER_PIPELINE_STAGE_OUTPUT_MERGER	= 7,
+};
+
+enum dxgk_page_fault_flags {
+	_DXGK_PAGE_FAULT_WRITE			= 0x1,
+	_DXGK_PAGE_FAULT_FENCE_INVALID		= 0x2,
+	_DXGK_PAGE_FAULT_ADAPTER_RESET_REQUIRED	= 0x4,
+	_DXGK_PAGE_FAULT_ENGINE_RESET_REQUIRED	= 0x8,
+	_DXGK_PAGE_FAULT_FATAL_HARDWARE_ERROR	= 0x10,
+	_DXGK_PAGE_FAULT_IOMMU			= 0x20,
+	_DXGK_PAGE_FAULT_HW_CONTEXT_VALID	= 0x40,
+	_DXGK_PAGE_FAULT_PROCESS_HANDLE_VALID	= 0x80,
+};
+
+enum dxgk_general_error_code {
+	_DXGK_GENERAL_ERROR_PAGE_FAULT		= 0,
+	_DXGK_GENERAL_ERROR_INVALID_INSTRUCTION	= 1,
+};
+
+struct dxgk_fault_error_code {
+	union {
+		struct {
+			__u32	is_device_specific_code:1;
+			enum dxgk_general_error_code general_error_code:31;
+		};
+		struct {
+			__u32	is_device_specific_code_reserved_bit:1;
+			__u32	device_specific_code:31;
+		};
+	};
+};
+
+struct d3dkmt_devicereset_state {
+	union {
+		struct {
+			__u32	desktop_switched:1;
+			__u32	reserved:31;
+		};
+		__u32		value;
+	};
+};
+
+struct d3dkmt_devicepagefault_state {
+	__u64				faulted_primitive_api_sequence_number;
+	enum dxgk_render_pipeline_stage	faulted_pipeline_stage;
+	__u32				faulted_bind_table_entry;
+	enum dxgk_page_fault_flags	page_fault_flags;
+	struct dxgk_fault_error_code	fault_error_code;
+	__u64				faulted_virtual_address;
+};
+
+enum d3dkmt_deviceexecution_state {
+	_D3DKMT_DEVICEEXECUTION_ACTIVE			= 1,
+	_D3DKMT_DEVICEEXECUTION_RESET			= 2,
+	_D3DKMT_DEVICEEXECUTION_HUNG			= 3,
+	_D3DKMT_DEVICEEXECUTION_STOPPED			= 4,
+	_D3DKMT_DEVICEEXECUTION_ERROR_OUTOFMEMORY	= 5,
+	_D3DKMT_DEVICEEXECUTION_ERROR_DMAFAULT		= 6,
+	_D3DKMT_DEVICEEXECUTION_ERROR_DMAPAGEFAULT	= 7,
+};
+
+enum d3dkmt_devicestate_type {
+	_D3DKMT_DEVICESTATE_EXECUTION		= 1,
+	_D3DKMT_DEVICESTATE_PRESENT		= 2,
+	_D3DKMT_DEVICESTATE_RESET		= 3,
+	_D3DKMT_DEVICESTATE_PRESENT_DWM		= 4,
+	_D3DKMT_DEVICESTATE_PAGE_FAULT		= 5,
+	_D3DKMT_DEVICESTATE_PRESENT_QUEUE	= 6,
+};
+
+struct d3dkmt_getdevicestate {
+	struct d3dkmthandle				device;
+	enum d3dkmt_devicestate_type			state_type;
+	union {
+		enum d3dkmt_deviceexecution_state	execution_state;
+		struct d3dkmt_devicereset_state		reset_state;
+		struct d3dkmt_devicepagefault_state	page_fault_state;
+		char alignment[48];
+	};
+};
+
 enum d3dkmdt_gdisurfacetype {
 	_D3DKMDT_GDISURFACE_INVALID				= 0,
 	_D3DKMDT_GDISURFACE_TEXTURE				= 1,
@@ -755,16 +844,6 @@ struct d3dkmt_queryadapterinfo {
 	__u32				private_data_size;
 };
 
-enum d3dkmt_deviceexecution_state {
-	_D3DKMT_DEVICEEXECUTION_ACTIVE			= 1,
-	_D3DKMT_DEVICEEXECUTION_RESET			= 2,
-	_D3DKMT_DEVICEEXECUTION_HUNG			= 3,
-	_D3DKMT_DEVICEEXECUTION_STOPPED			= 4,
-	_D3DKMT_DEVICEEXECUTION_ERROR_OUTOFMEMORY	= 5,
-	_D3DKMT_DEVICEEXECUTION_ERROR_DMAFAULT		= 6,
-	_D3DKMT_DEVICEEXECUTION_ERROR_DMAPAGEFAULT	= 7,
-};
-
 struct d3dddi_openallocationinfo2 {
 	struct d3dkmthandle	allocation;
 #ifdef __KERNEL__
@@ -974,6 +1053,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x07, struct d3dkmt_createpagingqueue)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXGETDEVICESTATE		\
+	_IOWR(0x47, 0x0e, struct d3dkmt_getdevicestate)
 #define LX_DXSUBMITCOMMAND		\
 	_IOWR(0x47, 0x0f, struct d3dkmt_submitcommand)
 #define LX_DXCREATESYNCHRONIZATIONOBJECT \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 19/30] drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (16 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 18/30] drivers: hv: dxgkrnl: Query the dxgdevice state Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 20/30] drivers: hv: dxgkrnl: Manage device allocation properties Iouri Tarassov
                     ` (11 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to map/unmap CPU virtual addresses to compute device
allocations - LX_DXLOCK2 and LX_DXUNLOCK2.

The LX_DXLOCK2 ioctl maps a CPU virtual address to a compute device
allocation. The allocation could be located in system memory or local
device memory on the host. When the device allocation is created
from the guest system memory (existing sysmem allocation), the
allocation CPU address is known and is returned to the caller.
For other CPU visible allocations the code flow is the following:
1. A VM bus message is sent to the host to map the allocation
2. The host allocates a portion of the guest IO space and maps it
   to the allocation backing store. The IO space address of the
   allocation is returned back to the guest.
3. The guest allocates a CPU virtual address and maps it to the IO
   space (see the dxg_map_iospace function).
4. The CPU VA is returned back to the caller
cpu_address_mapped and cpu_address_refcount are used to track how
many times an allocation was mapped.

The LX_DXUNLOCK2 ioctl unmaps a CPU virtual address from a compute
device allocation.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgadapter.c |  11 +++
 drivers/hv/dxgkrnl/dxgkrnl.h    |  14 +++
 drivers/hv/dxgkrnl/dxgvmbus.c   | 107 +++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h   |  19 ++++
 drivers/hv/dxgkrnl/ioctl.c      | 161 ++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h    |  30 ++++++
 6 files changed, 342 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index b162a37cf389..bf0b5d236b92 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -886,6 +886,15 @@ void dxgallocation_stop(struct dxgallocation *alloc)
 		vfree(alloc->pages);
 		alloc->pages = NULL;
 	}
+	dxgprocess_ht_lock_exclusive_down(alloc->process);
+	if (alloc->cpu_address_mapped) {
+		dxg_unmap_iospace(alloc->cpu_address,
+				  alloc->num_pages << PAGE_SHIFT);
+		alloc->cpu_address_mapped = false;
+		alloc->cpu_address = NULL;
+		alloc->cpu_address_refcount = 0;
+	}
+	dxgprocess_ht_lock_exclusive_up(alloc->process);
 }
 
 void dxgallocation_free_handle(struct dxgallocation *alloc)
@@ -927,6 +936,8 @@ void dxgallocation_destroy(struct dxgallocation *alloc)
 	}
 	if (alloc->priv_drv_data)
 		vfree(alloc->priv_drv_data);
+	if (alloc->cpu_address_mapped)
+		pr_err("Alloc IO space is mapped: %p", alloc);
 	vfree(alloc);
 }
 
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 4a2c8149dcc5..cf41a3eef75e 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -695,6 +695,8 @@ struct dxgallocation {
 	struct d3dkmthandle		alloc_handle;
 	/* Set to 1 when allocation belongs to resource. */
 	u32				resource_owner:1;
+	/* Set to 1 when 'cpu_address' is mapped to the IO space. */
+	u32				cpu_address_mapped:1;
 	/* Set to 1 when the allocatio is mapped as cached */
 	u32				cached:1;
 	u32				handle_valid:1;
@@ -702,6 +704,11 @@ struct dxgallocation {
 	struct vmbus_gpadl		gpadl;
 	/* Number of pages in the 'pages' array */
 	u32				num_pages;
+	/*
+	 * How many times dxgk_lock2 is called to allocation, which is mapped
+	 * to IO space.
+	 */
+	u32				cpu_address_refcount;
 	/*
 	 * CPU address from the existing sysmem allocation, or
 	 * mapped to the CPU visible backing store in the IO space
@@ -817,6 +824,13 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 				     d3dkmt_waitforsynchronizationobjectfromcpu
 				     *args,
 				     u64 cpu_event);
+int dxgvmb_send_lock2(struct dxgprocess *process,
+		      struct dxgadapter *adapter,
+		      struct d3dkmt_lock2 *args,
+		      struct d3dkmt_lock2 *__user outargs);
+int dxgvmb_send_unlock2(struct dxgprocess *process,
+			struct dxgadapter *adapter,
+			struct d3dkmt_unlock2 *args);
 int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_createhwqueue *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 44b95a63d7ce..0c2796934d1d 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2332,6 +2332,113 @@ int dxgvmb_send_wait_sync_object_gpu(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_lock2(struct dxgprocess *process,
+		      struct dxgadapter *adapter,
+		      struct d3dkmt_lock2 *args,
+		      struct d3dkmt_lock2 *__user outargs)
+{
+	int ret;
+	struct dxgkvmb_command_lock2 *command;
+	struct dxgkvmb_command_lock2_return result = { };
+	struct dxgallocation *alloc = NULL;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_LOCK2, process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result.status);
+	if (ret < 0)
+		goto cleanup;
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	alloc = hmgrtable_get_object_by_type(&process->handle_table,
+					     HMGRENTRY_TYPE_DXGALLOCATION,
+					     args->allocation);
+	if (alloc == NULL) {
+		pr_err("%s invalid alloc", __func__);
+		ret = -EINVAL;
+	} else {
+		if (alloc->cpu_address) {
+			args->data = alloc->cpu_address;
+			if (alloc->cpu_address_mapped)
+				alloc->cpu_address_refcount++;
+		} else {
+			u64 offset = (u64)result.cpu_visible_buffer_offset;
+
+			args->data = dxg_map_iospace(offset,
+					alloc->num_pages << PAGE_SHIFT,
+					PROT_READ | PROT_WRITE, alloc->cached);
+			if (args->data) {
+				alloc->cpu_address_refcount = 1;
+				alloc->cpu_address_mapped = true;
+				alloc->cpu_address = args->data;
+			}
+		}
+		if (args->data == NULL) {
+			ret = -ENOMEM;
+		} else {
+			ret = copy_to_user(&outargs->data, &args->data,
+					   sizeof(args->data));
+			if (ret) {
+				pr_err("%s failed to copy data", __func__);
+				ret = -EINVAL;
+				alloc->cpu_address_refcount--;
+				if (alloc->cpu_address_refcount == 0) {
+					dxg_unmap_iospace(alloc->cpu_address,
+					   alloc->num_pages << PAGE_SHIFT);
+					alloc->cpu_address_mapped = false;
+					alloc->cpu_address = NULL;
+				}
+			}
+		}
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_unlock2(struct dxgprocess *process,
+			struct dxgadapter *adapter,
+			struct d3dkmt_unlock2 *args)
+{
+	int ret;
+	struct dxgkvmb_command_unlock2 *command;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_UNLOCK2,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_createhwqueue *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 62e2fc3e5d14..3e2edfeefd1a 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -570,6 +570,25 @@ struct dxgkvmb_command_waitforsyncobjectfromgpu {
 	/* struct d3dkmthandle ObjectHandles[object_count] */
 };
 
+struct dxgkvmb_command_lock2 {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_lock2		args;
+	bool				use_legacy_lock;
+	u32				flags;
+	u32				priv_drv_data;
+};
+
+struct dxgkvmb_command_lock2_return {
+	struct ntstatus			status;
+	void				*cpu_visible_buffer_offset;
+};
+
+struct dxgkvmb_command_unlock2 {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_unlock2		args;
+	bool				use_legacy_unlock;
+};
+
 /* Returns the same structure */
 struct dxgkvmb_command_createhwqueue {
 	struct dxgkvmb_command_vgpu_to_host hdr;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index bb6eab08898f..12a4593c6aa5 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3180,6 +3180,163 @@ dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_lock2(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_lock2 args;
+	struct d3dkmt_lock2 *__user result = inargs;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct dxgallocation *alloc = NULL;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	args.data = NULL;
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	alloc = hmgrtable_get_object_by_type(&process->handle_table,
+					     HMGRENTRY_TYPE_DXGALLOCATION,
+					     args.allocation);
+	if (alloc == NULL) {
+		ret = -EINVAL;
+	} else {
+		if (alloc->cpu_address) {
+			ret = copy_to_user(&result->data,
+					   &alloc->cpu_address,
+					   sizeof(args.data));
+			if (ret == 0) {
+				args.data = alloc->cpu_address;
+				if (alloc->cpu_address_mapped)
+					alloc->cpu_address_refcount++;
+			} else {
+				pr_err("%s Failed to copy cpu address",
+					__func__);
+				ret = -EINVAL;
+			}
+		}
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+	if (ret < 0)
+		goto cleanup;
+	if (args.data)
+		goto success;
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_lock2(process, adapter, &args, result);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+success:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_unlock2(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_unlock2 args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	struct dxgallocation *alloc = NULL;
+	bool done = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+	alloc = hmgrtable_get_object_by_type(&process->handle_table,
+					     HMGRENTRY_TYPE_DXGALLOCATION,
+					     args.allocation);
+	if (alloc == NULL) {
+		ret = -EINVAL;
+	} else {
+		if (alloc->cpu_address == NULL) {
+			pr_err("Allocation is not locked: %p", alloc);
+			ret = -EINVAL;
+		} else if (alloc->cpu_address_mapped) {
+			if (alloc->cpu_address_refcount > 0) {
+				alloc->cpu_address_refcount--;
+				if (alloc->cpu_address_refcount != 0) {
+					done = true;
+				} else {
+					dxg_unmap_iospace(alloc->cpu_address,
+						alloc->num_pages << PAGE_SHIFT);
+					alloc->cpu_address_mapped = false;
+					alloc->cpu_address = NULL;
+				}
+			} else {
+				pr_err("Invalid cpu access refcount");
+				done = true;
+			}
+		}
+	}
+	hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+	if (done)
+		goto success;
+	if (ret < 0)
+		goto cleanup;
+
+	/*
+	 * The call acquires reference on the device. It is safe to access the
+	 * adapter, because the device holds reference on it.
+	 */
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_unlock2(process, adapter, &args);
+
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+success:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
 {
@@ -4011,6 +4168,8 @@ void init_ioctls(void)
 		  LX_DXDESTROYPAGINGQUEUE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x25 */ dxgk_lock2,
+		  LX_DXLOCK2);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x32 */ dxgk_signal_sync_object_gpu,
@@ -4023,6 +4182,8 @@ void init_ioctls(void)
 		  LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE);
 	SET_IOCTL(/*0x36 */ dxgk_submit_signal_to_hwqueue,
 		  LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE);
+	SET_IOCTL(/*0x37 */ dxgk_unlock2,
+		  LX_DXUNLOCK2);
 	SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index a0af63f046c6..5bc1535ea881 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -664,6 +664,32 @@ struct d3dkmt_submitcommandtohwqueue {
 #endif
 };
 
+struct d3dddicb_lock2flags {
+	union {
+		struct {
+			__u32	reserved:32;
+		};
+		__u32		value;
+	};
+};
+
+struct d3dkmt_lock2 {
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		allocation;
+	struct d3dddicb_lock2flags	flags;
+	__u32				reserved;
+#ifdef __KERNEL__
+	void				*data;
+#else
+	__u64				data;
+#endif
+};
+
+struct d3dkmt_unlock2 {
+	struct d3dkmthandle			device;
+	struct d3dkmthandle			allocation;
+};
+
 enum d3dkmt_standardallocationtype {
 	_D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP	= 1,
 	_D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER	= 2,
@@ -1079,6 +1105,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
+#define LX_DXLOCK2			\
+	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x31, struct d3dkmt_signalsynchronizationobjectfromcpu)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU \
@@ -1091,6 +1119,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x35, struct d3dkmt_submitsignalsyncobjectstohwqueue)
 #define LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE \
 	_IOWR(0x47, 0x36, struct d3dkmt_submitwaitforsyncobjectstohwqueue)
+#define LX_DXUNLOCK2			\
+	_IOWR(0x47, 0x37, struct d3dkmt_unlock2)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x3a, struct d3dkmt_waitforsynchronizationobjectfromcpu)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 20/30] drivers: hv: dxgkrnl: Manage device allocation properties
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (17 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 19/30] drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 21/30] drivers: hv: dxgkrnl: Flush heap transitions Iouri Tarassov
                     ` (10 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to manage properties of a compute device allocation:
  - LX_DXUPDATEALLOCPROPERTY,
  - LX_DXSETALLOCATIONPRIORITY,
  - LX_DXGETALLOCATIONPRIORITY,
  - LX_DXQUERYALLOCATIONRESIDENCY.
  - LX_DXCHANGEVIDEOMEMORYRESERVATION,

The LX_DXUPDATEALLOCPROPERTY ioctl requests the host to update
various properties of a compute devoce allocation.

The LX_DXSETALLOCATIONPRIORITY and LX_DXGETALLOCATIONPRIORITY ioctls
are used to set/get allocation priority, which defines the
importance of the allocation to be in the local device memory.

The LX_DXQUERYALLOCATIONRESIDENCY ioctl queries if the allocation
is located in the compute device accessible memory.

The LX_DXCHANGEVIDEOMEMORYRESERVATION ioctl changes compute device
memory reservation of an allocation.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  21 +++
 drivers/hv/dxgkrnl/dxgvmbus.c | 300 ++++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  50 ++++++
 drivers/hv/dxgkrnl/ioctl.c    | 212 ++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 127 ++++++++++++++
 5 files changed, 710 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index cf41a3eef75e..e50f9dbbd5fb 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -831,6 +831,23 @@ int dxgvmb_send_lock2(struct dxgprocess *process,
 int dxgvmb_send_unlock2(struct dxgprocess *process,
 			struct dxgadapter *adapter,
 			struct d3dkmt_unlock2 *args);
+int dxgvmb_send_update_alloc_property(struct dxgprocess *process,
+				      struct dxgadapter *adapter,
+				      struct d3dddi_updateallocproperty *args,
+				      struct d3dddi_updateallocproperty *__user
+				      inargs);
+int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmt_setallocationpriority *a);
+int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmt_getallocationpriority *a);
+int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
+					  struct dxgadapter *adapter,
+					  struct d3dkmthandle other_process,
+					  struct
+					  d3dkmt_changevideomemoryreservation
+					  *args);
 int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_createhwqueue *args,
@@ -850,6 +867,10 @@ int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
 				    struct d3dkmt_opensyncobjectfromnthandle2
 				    *args,
 				    struct dxgsyncobject *syncobj);
+int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
+				      struct dxgadapter *adapter,
+				      struct d3dkmt_queryallocationresidency
+				      *args);
 int dxgvmb_send_get_device_state(struct dxgprocess *process,
 				 struct dxgadapter *adapter,
 				 struct d3dkmt_getdevicestate *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 0c2796934d1d..7c95f63e7f88 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1812,6 +1812,79 @@ int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
+				      struct dxgadapter *adapter,
+				      struct d3dkmt_queryallocationresidency
+				      *args)
+{
+	int ret = -EINVAL;
+	struct dxgkvmb_command_queryallocationresidency *command = NULL;
+	u32 cmd_size = sizeof(*command);
+	u32 alloc_size = 0;
+	u32 result_allocation_size = 0;
+	struct dxgkvmb_command_queryallocationresidency_return *result = NULL;
+	u32 result_size = sizeof(*result);
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	if (args->allocation_count > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args->allocation_count) {
+		alloc_size = args->allocation_count *
+			     sizeof(struct d3dkmthandle);
+		cmd_size += alloc_size;
+		result_allocation_size = args->allocation_count *
+		    sizeof(args->residency_status[0]);
+	} else {
+		result_allocation_size = sizeof(args->residency_status[0]);
+	}
+	result_size += result_allocation_size;
+
+	ret = init_message_res(&msg, adapter, process, cmd_size, result_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_QUERYALLOCATIONRESIDENCY,
+				   process->host_handle);
+	command->args = *args;
+	if (alloc_size) {
+		ret = copy_from_user(&command[1], args->allocations,
+				     alloc_size);
+		if (ret) {
+			pr_err("%s failed to copy alloc handles", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result->status);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(args->residency_status, &result[1],
+			   result_allocation_size);
+	if (ret) {
+		pr_err("%s failed to copy residency status", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+	free_message((struct dxgvmbusmsg *)&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_get_device_state(struct dxgprocess *process,
 				 struct dxgadapter *adapter,
 				 struct d3dkmt_getdevicestate *args,
@@ -2439,6 +2512,233 @@ int dxgvmb_send_unlock2(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_update_alloc_property(struct dxgprocess *process,
+				      struct dxgadapter *adapter,
+				      struct d3dddi_updateallocproperty *args,
+				      struct d3dddi_updateallocproperty *__user
+				      inargs)
+{
+	int ret;
+	int ret1;
+	struct dxgkvmb_command_updateallocationproperty *command;
+	struct dxgkvmb_command_updateallocationproperty_return result = { };
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_UPDATEALLOCATIONPROPERTY,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+
+	if (ret < 0)
+		goto cleanup;
+	ret = ntstatus2int(result.status);
+	/* STATUS_PENING is a success code > 0 */
+	if (ret == STATUS_PENDING) {
+		ret1 = copy_to_user(&inargs->paging_fence_value,
+				    &result.paging_fence_value,
+				    sizeof(u64));
+		if (ret1) {
+			pr_err("%s failed to copy paging fence", __func__);
+			ret = -EINVAL;
+		}
+	}
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
+				struct dxgadapter *adapter,
+				struct d3dkmt_setallocationpriority *args)
+{
+	u32 cmd_size = sizeof(struct dxgkvmb_command_setallocationpriority);
+	u32 alloc_size = 0;
+	u32 priority_size = 0;
+	struct dxgkvmb_command_setallocationpriority *command;
+	int ret;
+	struct d3dkmthandle *allocations;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->allocation_count > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (args->resource.v) {
+		priority_size = sizeof(u32);
+		if (args->allocation_count != 0) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	} else {
+		if (args->allocation_count == 0) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		alloc_size = args->allocation_count *
+			     sizeof(struct d3dkmthandle);
+		cmd_size += alloc_size;
+		priority_size = sizeof(u32) * args->allocation_count;
+	}
+	cmd_size += priority_size;
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_SETALLOCATIONPRIORITY,
+				   process->host_handle);
+	command->device = args->device;
+	command->allocation_count = args->allocation_count;
+	command->resource = args->resource;
+	allocations = (struct d3dkmthandle *) &command[1];
+	ret = copy_from_user(allocations, args->allocation_list,
+			     alloc_size);
+	if (ret) {
+		pr_err("%s failed to copy alloc handle", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_from_user((u8 *) allocations + alloc_size,
+				args->priorities, priority_size);
+	if (ret) {
+		pr_err("%s failed to copy alloc priority", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
+				struct dxgadapter *adapter,
+				struct d3dkmt_getallocationpriority *args)
+{
+	u32 cmd_size = sizeof(struct dxgkvmb_command_getallocationpriority);
+	u32 result_size;
+	u32 alloc_size = 0;
+	u32 priority_size = 0;
+	struct dxgkvmb_command_getallocationpriority *command;
+	struct dxgkvmb_command_getallocationpriority_return *result;
+	int ret;
+	struct d3dkmthandle *allocations;
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	if (args->allocation_count > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (args->resource.v) {
+		priority_size = sizeof(u32);
+		if (args->allocation_count != 0) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	} else {
+		if (args->allocation_count == 0) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		alloc_size = args->allocation_count *
+			sizeof(struct d3dkmthandle);
+		cmd_size += alloc_size;
+		priority_size = sizeof(u32) * args->allocation_count;
+	}
+	result_size = sizeof(*result) + priority_size;
+
+	ret = init_message_res(&msg, adapter, process, cmd_size, result_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_GETALLOCATIONPRIORITY,
+				   process->host_handle);
+	command->device = args->device;
+	command->allocation_count = args->allocation_count;
+	command->resource = args->resource;
+	allocations = (struct d3dkmthandle *) &command[1];
+	ret = copy_from_user(allocations, args->allocation_list,
+			     alloc_size);
+	if (ret) {
+		pr_err("%s failed to copy alloc handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr,
+				   msg.size + msg.res_size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = ntstatus2int(result->status);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(args->priorities,
+			   (u8 *) result + sizeof(*result),
+			   priority_size);
+	if (ret) {
+		pr_err("%s failed to copy priorities", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+	free_message((struct dxgvmbusmsg *)&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
+					  struct dxgadapter *adapter,
+					  struct d3dkmthandle other_process,
+					  struct
+					  d3dkmt_changevideomemoryreservation
+					  *args)
+{
+	struct dxgkvmb_command_changevideomemoryreservation *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_CHANGEVIDEOMEMORYRESERVATION,
+				   process->host_handle);
+	command->args = *args;
+	command->args.process = other_process.v;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_createhwqueue *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 3e2edfeefd1a..5b7654ef63f3 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -308,6 +308,29 @@ struct dxgkvmb_command_queryadapterinfo_return {
 	u8				private_data[1];
 };
 
+/* Returns ntstatus */
+struct dxgkvmb_command_setallocationpriority {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+	u32				allocation_count;
+	/* struct d3dkmthandle    allocations[allocation_count or 0]; */
+	/* u32 priorities[allocation_count or 1]; */
+};
+
+struct dxgkvmb_command_getallocationpriority {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+	u32				allocation_count;
+	/* struct d3dkmthandle allocations[allocation_count or 0]; */
+};
+
+struct dxgkvmb_command_getallocationpriority_return {
+	struct ntstatus			status;
+	/* u32 priorities[allocation_count or 1]; */
+};
+
 struct dxgkvmb_command_createdevice {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_createdeviceflags	flags;
@@ -589,6 +612,22 @@ struct dxgkvmb_command_unlock2 {
 	bool				use_legacy_unlock;
 };
 
+struct dxgkvmb_command_updateallocationproperty {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dddi_updateallocproperty args;
+};
+
+struct dxgkvmb_command_updateallocationproperty_return {
+	u64				paging_fence_value;
+	struct ntstatus			status;
+};
+
+/* Returns ntstatus */
+struct dxgkvmb_command_changevideomemoryreservation {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_changevideomemoryreservation args;
+};
+
 /* Returns the same structure */
 struct dxgkvmb_command_createhwqueue {
 	struct dxgkvmb_command_vgpu_to_host hdr;
@@ -609,6 +648,17 @@ struct dxgkvmb_command_destroyhwqueue {
 	struct d3dkmthandle		hwqueue;
 };
 
+struct dxgkvmb_command_queryallocationresidency {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_queryallocationresidency args;
+	/* struct d3dkmthandle allocations[0 or number of allocations] */
+};
+
+struct dxgkvmb_command_queryallocationresidency_return {
+	struct ntstatus			status;
+	/* d3dkmt_allocationresidencystatus[NumAllocations] */
+};
+
 struct dxgkvmb_command_getdevicestate {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_getdevicestate	args;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 12a4593c6aa5..7ebad4524f94 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3337,6 +3337,208 @@ dxgk_unlock2(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_update_alloc_property(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dddi_updateallocproperty args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+						args.paging_queue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_update_alloc_property(process, adapter,
+						&args, inargs);
+
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_query_alloc_residency(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_queryallocationresidency args;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if ((args.allocation_count == 0) == (args.resource.v == 0)) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	ret = dxgvmb_send_query_alloc_residency(process, adapter, &args);
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_set_allocation_priority(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_setallocationpriority args;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	ret = dxgvmb_send_set_allocation_priority(process, adapter, &args);
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_get_allocation_priority(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_getallocationpriority args;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	ret = dxgvmb_send_get_allocation_priority(process, adapter, &args);
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_change_vidmem_reservation(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_changevideomemoryreservation args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	bool adapter_locked = false;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.process != 0) {
+		pr_err("setting memory reservation for other process");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	adapter_locked = true;
+	args.adapter.v = 0;
+	ret = dxgvmb_send_change_vidmem_reservation(process, adapter,
+						    zerohandle, &args);
+
+cleanup:
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
 {
@@ -4158,6 +4360,8 @@ void init_ioctls(void)
 		  LX_DXENUMADAPTERS2);
 	SET_IOCTL(/*0x15 */ dxgk_close_adapter,
 		  LX_DXCLOSEADAPTER);
+	SET_IOCTL(/*0x16 */ dxgk_change_vidmem_reservation,
+		  LX_DXCHANGEVIDEOMEMORYRESERVATION);
 	SET_IOCTL(/*0x18 */ dxgk_create_hwqueue,
 		  LX_DXCREATEHWQUEUE);
 	SET_IOCTL(/*0x19 */ dxgk_destroy_device,
@@ -4170,6 +4374,10 @@ void init_ioctls(void)
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x25 */ dxgk_lock2,
 		  LX_DXLOCK2);
+	SET_IOCTL(/*0x2a */ dxgk_query_alloc_residency,
+		  LX_DXQUERYALLOCATIONRESIDENCY);
+	SET_IOCTL(/*0x2e */ dxgk_set_allocation_priority,
+		  LX_DXSETALLOCATIONPRIORITY);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x32 */ dxgk_signal_sync_object_gpu,
@@ -4184,10 +4392,14 @@ void init_ioctls(void)
 		  LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE);
 	SET_IOCTL(/*0x37 */ dxgk_unlock2,
 		  LX_DXUNLOCK2);
+	SET_IOCTL(/*0x38 */ dxgk_update_alloc_property,
+		  LX_DXUPDATEALLOCPROPERTY);
 	SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU);
+	SET_IOCTL(/*0x3c */ dxgk_get_allocation_priority,
+		  LX_DXGETALLOCATIONPRIORITY);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
 	SET_IOCTL(/*0x3f */ dxgk_share_objects,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 5bc1535ea881..2dac2d13196a 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -664,6 +664,63 @@ struct d3dkmt_submitcommandtohwqueue {
 #endif
 };
 
+struct d3dkmt_setallocationpriority {
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+#ifdef __KERNEL__
+	const struct d3dkmthandle	*allocation_list;
+#else
+	__u64				allocation_list;
+#endif
+	__u32				allocation_count;
+	__u32				reserved;
+#ifdef __KERNEL__
+	const __u32			*priorities;
+#else
+	__u64				priorities;
+#endif
+};
+
+struct d3dkmt_getallocationpriority {
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		resource;
+#ifdef __KERNEL__
+	const struct d3dkmthandle	*allocation_list;
+#else
+	__u64				allocation_list;
+#endif
+	__u32				allocation_count;
+	__u32				reserved;
+#ifdef __KERNEL__
+	__u32				*priorities;
+#else
+	__u64				priorities;
+#endif
+};
+
+enum d3dkmt_allocationresidencystatus {
+	_D3DKMT_ALLOCATIONRESIDENCYSTATUS_RESIDENTINGPUMEMORY		= 1,
+	_D3DKMT_ALLOCATIONRESIDENCYSTATUS_RESIDENTINSHAREDMEMORY	= 2,
+	_D3DKMT_ALLOCATIONRESIDENCYSTATUS_NOTRESIDENT			= 3,
+};
+
+struct d3dkmt_queryallocationresidency {
+	struct d3dkmthandle			device;
+	struct d3dkmthandle			resource;
+#ifdef __KERNEL__
+	struct d3dkmthandle			*allocations;
+#else
+	__u64					allocations;
+#endif
+	__u32					allocation_count;
+	__u32					reserved;
+#ifdef __KERNEL__
+	enum d3dkmt_allocationresidencystatus	*residency_status;
+#else
+	__u64					residency_status;
+#endif
+};
+
 struct d3dddicb_lock2flags {
 	union {
 		struct {
@@ -831,6 +888,11 @@ struct d3dkmt_destroyallocation2 {
 	struct d3dddicb_destroyallocation2flags flags;
 };
 
+enum d3dkmt_memory_segment_group {
+	_D3DKMT_MEMORY_SEGMENT_GROUP_LOCAL	= 0,
+	_D3DKMT_MEMORY_SEGMENT_GROUP_NON_LOCAL	= 1
+};
+
 struct d3dkmt_adaptertype {
 	union {
 		struct {
@@ -882,6 +944,61 @@ struct d3dddi_openallocationinfo2 {
 	__u64			reserved[6];
 };
 
+struct d3dddi_updateallocproperty_flags {
+	union {
+		struct {
+			__u32			accessed_physically:1;
+			__u32			reserved:31;
+		};
+		__u32				value;
+	};
+};
+
+struct d3dddi_segmentpreference {
+	union {
+		struct {
+			__u32			segment_id0:5;
+			__u32			direction0:1;
+			__u32			segment_id1:5;
+			__u32			direction1:1;
+			__u32			segment_id2:5;
+			__u32			direction2:1;
+			__u32			segment_id3:5;
+			__u32			direction3:1;
+			__u32			segment_id4:5;
+			__u32			direction4:1;
+			__u32			reserved:2;
+		};
+		__u32				value;
+	};
+};
+
+struct d3dddi_updateallocproperty {
+	struct d3dkmthandle			paging_queue;
+	struct d3dkmthandle			allocation;
+	__u32					supported_segment_set;
+	struct d3dddi_segmentpreference		preferred_segment;
+	struct d3dddi_updateallocproperty_flags	flags;
+	__u64					paging_fence_value;
+	union {
+		struct {
+			__u32			set_accessed_physically:1;
+			__u32			set_supported_segmentSet:1;
+			__u32			set_preferred_segment:1;
+			__u32			reserved:29;
+		};
+		__u32				property_mask_value;
+	};
+};
+
+struct d3dkmt_changevideomemoryreservation {
+	__u64			process;
+	struct d3dkmthandle	adapter;
+	enum d3dkmt_memory_segment_group memory_segment_group;
+	__u64			reservation;
+	__u32			physical_adapter_index;
+};
+
 struct d3dkmt_createhwqueue {
 	struct d3dkmthandle	context;
 	struct d3dddi_createhwqueueflags flags;
@@ -1095,6 +1212,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x14, struct d3dkmt_enumadapters2)
 #define LX_DXCLOSEADAPTER		\
 	_IOWR(0x47, 0x15, struct d3dkmt_closeadapter)
+#define LX_DXCHANGEVIDEOMEMORYRESERVATION \
+	_IOWR(0x47, 0x16, struct d3dkmt_changevideomemoryreservation)
 #define LX_DXCREATEHWQUEUE		\
 	_IOWR(0x47, 0x18, struct d3dkmt_createhwqueue)
 #define LX_DXDESTROYHWQUEUE		\
@@ -1107,6 +1226,10 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
 #define LX_DXLOCK2			\
 	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
+#define LX_DXQUERYALLOCATIONRESIDENCY	\
+	_IOWR(0x47, 0x2a, struct d3dkmt_queryallocationresidency)
+#define LX_DXSETALLOCATIONPRIORITY	\
+	_IOWR(0x47, 0x2e, struct d3dkmt_setallocationpriority)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x31, struct d3dkmt_signalsynchronizationobjectfromcpu)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU \
@@ -1121,10 +1244,14 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x36, struct d3dkmt_submitwaitforsyncobjectstohwqueue)
 #define LX_DXUNLOCK2			\
 	_IOWR(0x47, 0x37, struct d3dkmt_unlock2)
+#define LX_DXUPDATEALLOCPROPERTY	\
+	_IOWR(0x47, 0x38, struct d3dddi_updateallocproperty)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x3a, struct d3dkmt_waitforsynchronizationobjectfromcpu)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU \
 	_IOWR(0x47, 0x3b, struct d3dkmt_waitforsynchronizationobjectfromgpu)
+#define LX_DXGETALLOCATIONPRIORITY	\
+	_IOWR(0x47, 0x3c, struct d3dkmt_getallocationpriority)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
 #define LX_DXSHAREOBJECTS		\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 21/30] drivers: hv: dxgkrnl: Flush heap transitions
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (18 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 20/30] drivers: hv: dxgkrnl: Manage device allocation properties Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 22/30] drivers: hv: dxgkrnl: Query video memory information Iouri Tarassov
                     ` (9 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the ioctl to flush heap transitions
(LX_DXFLUSHHEAPTRANSITIONS).

The ioctl is used to ensure that the video memory manager on the host
flushes all internal operations.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  3 +++
 drivers/hv/dxgkrnl/dxgvmbus.c | 23 ++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  5 ++++
 drivers/hv/dxgkrnl/ioctl.c    | 49 +++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  6 +++++
 5 files changed, 86 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index e50f9dbbd5fb..7d3f2617414d 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -862,6 +862,9 @@ int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
 				       struct dxgadapter *adapter,
 				       struct d3dkmt_submitcommandtohwqueue *a);
+int dxgvmb_send_flush_heap_transitions(struct dxgprocess *process,
+				       struct dxgadapter *adapter,
+				       struct d3dkmt_flushheaptransitions *arg);
 int dxgvmb_send_open_sync_object_nt(struct dxgprocess *process,
 				    struct dxgvmbuschannel *channel,
 				    struct d3dkmt_opensyncobjectfromnthandle2
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 7c95f63e7f88..67b9e1b45b97 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1812,6 +1812,29 @@ int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_flush_heap_transitions(struct dxgprocess *process,
+				       struct dxgadapter *adapter,
+				       struct d3dkmt_flushheaptransitions *args)
+{
+	struct dxgkvmb_command_flushheaptransitions *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_FLUSHHEAPTRANSITIONS,
+				   process->host_handle);
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
 				      struct dxgadapter *adapter,
 				      struct d3dkmt_queryallocationresidency
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 5b7654ef63f3..95c8b4a0cff9 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -367,6 +367,11 @@ struct dxgkvmb_command_submitcommandtohwqueue {
 	/* PrivateDriverData */
 };
 
+/* Returns  ntstatus */
+struct dxgkvmb_command_flushheaptransitions {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+};
+
 struct dxgkvmb_command_createallocation_allocinfo {
 	u32				flags;
 	u32				priv_drv_data_size;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 7ebad4524f94..4f2d762811ba 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3539,6 +3539,53 @@ dxgk_change_vidmem_reservation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_flush_heap_transitions(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_flushheaptransitions args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	bool adapter_locked = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	adapter_locked = true;
+
+	args.adapter = adapter->host_handle;
+	ret = dxgvmb_send_flush_heap_transitions(process, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy output args", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	return ret;
+}
+
 static int
 dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
 {
@@ -4372,6 +4419,8 @@ void init_ioctls(void)
 		  LX_DXDESTROYPAGINGQUEUE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
+		  LX_DXFLUSHHEAPTRANSITIONS);
 	SET_IOCTL(/*0x25 */ dxgk_lock2,
 		  LX_DXLOCK2);
 	SET_IOCTL(/*0x2a */ dxgk_query_alloc_residency,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 2dac2d13196a..b0a2c858f122 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -932,6 +932,10 @@ struct d3dkmt_queryadapterinfo {
 	__u32				private_data_size;
 };
 
+struct d3dkmt_flushheaptransitions {
+	struct d3dkmthandle	adapter;
+};
+
 struct d3dddi_openallocationinfo2 {
 	struct d3dkmthandle	allocation;
 #ifdef __KERNEL__
@@ -1224,6 +1228,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
+#define LX_DXFLUSHHEAPTRANSITIONS	\
+	_IOWR(0x47, 0x1f, struct d3dkmt_flushheaptransitions)
 #define LX_DXLOCK2			\
 	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
 #define LX_DXQUERYALLOCATIONRESIDENCY	\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 22/30] drivers: hv: dxgkrnl: Query video memory information
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (19 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 21/30] drivers: hv: dxgkrnl: Flush heap transitions Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 23/30] drivers: hv: dxgkrnl: The escape ioctl Iouri Tarassov
                     ` (8 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the ioctl to query video memory information from the host
(LX_DXQUERYVIDEOMEMORYINFO).

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  5 +++
 drivers/hv/dxgkrnl/dxgvmbus.c | 64 +++++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h | 14 ++++++++
 drivers/hv/dxgkrnl/ioctl.c    | 50 +++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 13 +++++++
 5 files changed, 146 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 7d3f2617414d..6cd6bc6d6b9a 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -874,6 +874,11 @@ int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
 				      struct dxgadapter *adapter,
 				      struct d3dkmt_queryallocationresidency
 				      *args);
+int dxgvmb_send_query_vidmem_info(struct dxgprocess *process,
+				  struct dxgadapter *adapter,
+				  struct d3dkmt_queryvideomemoryinfo *args,
+				  struct d3dkmt_queryvideomemoryinfo
+				  *__user iargs);
 int dxgvmb_send_get_device_state(struct dxgprocess *process,
 				 struct dxgadapter *adapter,
 				 struct d3dkmt_getdevicestate *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 67b9e1b45b97..40ed2e981d73 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1908,6 +1908,70 @@ int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_query_vidmem_info(struct dxgprocess *process,
+				  struct dxgadapter *adapter,
+				  struct d3dkmt_queryvideomemoryinfo *args,
+				  struct d3dkmt_queryvideomemoryinfo *__user
+				  output)
+{
+	int ret;
+	struct dxgkvmb_command_queryvideomemoryinfo *command;
+	struct dxgkvmb_command_queryvideomemoryinfo_return result = { };
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	command_vgpu_to_host_init2(&command->hdr,
+				   dxgk_vmbcommand_queryvideomemoryinfo,
+				   process->host_handle);
+	command->adapter = args->adapter;
+	command->memory_segment_group = args->memory_segment_group;
+	command->physical_adapter_index = args->physical_adapter_index;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(&output->budget, &result.budget,
+			   sizeof(output->budget));
+	if (ret) {
+		pr_err("%s failed to copy budget", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_to_user(&output->current_usage, &result.current_usage,
+			   sizeof(output->current_usage));
+	if (ret) {
+		pr_err("%s failed to copy current usage", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_to_user(&output->current_reservation,
+			   &result.current_reservation,
+			   sizeof(output->current_reservation));
+	if (ret) {
+		pr_err("%s failed to copy reservation", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = copy_to_user(&output->available_for_reservation,
+			   &result.available_for_reservation,
+			   sizeof(output->available_for_reservation));
+	if (ret) {
+		pr_err("%s failed to copy avail reservation", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_get_device_state(struct dxgprocess *process,
 				 struct dxgadapter *adapter,
 				 struct d3dkmt_getdevicestate *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 95c8b4a0cff9..8bca304fcb4b 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -664,6 +664,20 @@ struct dxgkvmb_command_queryallocationresidency_return {
 	/* d3dkmt_allocationresidencystatus[NumAllocations] */
 };
 
+struct dxgkvmb_command_queryvideomemoryinfo {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		adapter;
+	enum d3dkmt_memory_segment_group memory_segment_group;
+	u32				physical_adapter_index;
+};
+
+struct dxgkvmb_command_queryvideomemoryinfo_return {
+	u64			budget;
+	u64			current_usage;
+	u64			current_reservation;
+	u64			available_for_reservation;
+};
+
 struct dxgkvmb_command_getdevicestate {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_getdevicestate	args;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 4f2d762811ba..d7dbde41b8a2 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3586,6 +3586,54 @@ dxgk_flush_heap_transitions(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_query_vidmem_info(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_queryvideomemoryinfo args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	bool adapter_locked = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.process != 0) {
+		pr_err("query vidmem info from another process ");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	adapter_locked = true;
+
+	args.adapter = adapter->host_handle;
+	ret = dxgvmb_send_query_vidmem_info(process, adapter, &args, inargs);
+
+cleanup:
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	if (ret < 0)
+		pr_err("%s failed: %x", __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
 {
@@ -4391,6 +4439,8 @@ void init_ioctls(void)
 		  LX_DXCREATEPAGINGQUEUE);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
+	SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
+		  LX_DXQUERYVIDEOMEMORYINFO);
 	SET_IOCTL(/*0xe */ dxgk_get_device_state,
 		  LX_DXGETDEVICESTATE);
 	SET_IOCTL(/*0xf */ dxgk_submit_command,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index b0a2c858f122..f7f663f6e674 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -893,6 +893,17 @@ enum d3dkmt_memory_segment_group {
 	_D3DKMT_MEMORY_SEGMENT_GROUP_NON_LOCAL	= 1
 };
 
+struct d3dkmt_queryvideomemoryinfo {
+	__u64					process;
+	struct d3dkmthandle			adapter;
+	enum d3dkmt_memory_segment_group	memory_segment_group;
+	__u64					budget;
+	__u64					current_usage;
+	__u64					current_reservation;
+	__u64					available_for_reservation;
+	__u32					physical_adapter_index;
+};
+
 struct d3dkmt_adaptertype {
 	union {
 		struct {
@@ -1200,6 +1211,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x07, struct d3dkmt_createpagingqueue)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
+#define LX_DXQUERYVIDEOMEMORYINFO	\
+	_IOWR(0x47, 0x0a, struct d3dkmt_queryvideomemoryinfo)
 #define LX_DXGETDEVICESTATE		\
 	_IOWR(0x47, 0x0e, struct d3dkmt_getdevicestate)
 #define LX_DXSUBMITCOMMAND		\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 23/30] drivers: hv: dxgkrnl: The escape ioctl
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (20 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 22/30] drivers: hv: dxgkrnl: Query video memory information Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 24/30] drivers: hv: dxgkrnl: Ioctl to put device to error state Iouri Tarassov
                     ` (7 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the escape ioctl (LX_DXESCAPE).

This ioctl is used to send/receive private data between user mode
compute device driver (guest) and kernel mode compute device
driver (host). It allows the user mode driver to extend the virtual
compute device API.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  3 ++
 drivers/hv/dxgkrnl/dxgvmbus.c | 65 +++++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h | 12 +++++++
 drivers/hv/dxgkrnl/ioctl.c    | 42 ++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 41 ++++++++++++++++++++++
 5 files changed, 163 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 6cd6bc6d6b9a..875e336c11d3 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -874,6 +874,9 @@ int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
 				      struct dxgadapter *adapter,
 				      struct d3dkmt_queryallocationresidency
 				      *args);
+int dxgvmb_send_escape(struct dxgprocess *process,
+		       struct dxgadapter *adapter,
+		       struct d3dkmt_escape *args);
 int dxgvmb_send_query_vidmem_info(struct dxgprocess *process,
 				  struct dxgadapter *adapter,
 				  struct d3dkmt_queryvideomemoryinfo *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 40ed2e981d73..c76ab993e9ba 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1908,6 +1908,70 @@ int dxgvmb_send_query_alloc_residency(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_escape(struct dxgprocess *process,
+		       struct dxgadapter *adapter,
+		       struct d3dkmt_escape *args)
+{
+	int ret;
+	struct dxgkvmb_command_escape *command = NULL;
+	u32 cmd_size = sizeof(*command);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->priv_drv_data_size > DXG_MAX_VM_BUS_PACKET_SIZE) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	cmd_size = cmd_size - sizeof(args->priv_drv_data[0]) +
+	    args->priv_drv_data_size;
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_ESCAPE,
+				   process->host_handle);
+	command->adapter = args->adapter;
+	command->device = args->device;
+	command->type = args->type;
+	command->flags = args->flags;
+	command->priv_drv_data_size = args->priv_drv_data_size;
+	command->context = args->context;
+	if (args->priv_drv_data_size) {
+		ret = copy_from_user(command->priv_drv_data,
+				     args->priv_drv_data,
+				     args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s failed to copy priv data", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   command->priv_drv_data,
+				   args->priv_drv_data_size);
+	if (ret < 0)
+		goto cleanup;
+
+	if (args->priv_drv_data_size) {
+		ret = copy_to_user(args->priv_drv_data,
+				   command->priv_drv_data,
+				   args->priv_drv_data_size);
+		if (ret) {
+			pr_err("%s failed to copy priv data", __func__);
+			ret = -EINVAL;
+		}
+	}
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_query_vidmem_info(struct dxgprocess *process,
 				  struct dxgadapter *adapter,
 				  struct d3dkmt_queryvideomemoryinfo *args,
@@ -3125,3 +3189,4 @@ int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
 		pr_debug("err: %s %d", __func__, ret);
 	return ret;
 }
+
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 8bca304fcb4b..d2d22775645b 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -664,6 +664,18 @@ struct dxgkvmb_command_queryallocationresidency_return {
 	/* d3dkmt_allocationresidencystatus[NumAllocations] */
 };
 
+/* Returns only private data */
+struct dxgkvmb_command_escape {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		adapter;
+	struct d3dkmthandle		device;
+	enum d3dkmt_escapetype		type;
+	struct d3dddi_escapeflags	flags;
+	u32				priv_drv_data_size;
+	struct d3dkmthandle		context;
+	u8				priv_drv_data[1];
+};
+
 struct dxgkvmb_command_queryvideomemoryinfo {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmthandle		adapter;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index d7dbde41b8a2..763bd76382a0 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3586,6 +3586,46 @@ dxgk_flush_heap_transitions(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_escape(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_escape args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	bool adapter_locked = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	adapter_locked = true;
+
+	args.adapter = adapter->host_handle;
+	ret = dxgvmb_send_escape(process, adapter, &args);
+
+cleanup:
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_query_vidmem_info(struct dxgprocess *process, void *__user inargs)
 {
@@ -4441,6 +4481,8 @@ void init_ioctls(void)
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
 		  LX_DXQUERYVIDEOMEMORYINFO);
+	SET_IOCTL(/*0xd */ dxgk_escape,
+		  LX_DXESCAPE);
 	SET_IOCTL(/*0xe */ dxgk_get_device_state,
 		  LX_DXGETDEVICESTATE);
 	SET_IOCTL(/*0xf */ dxgk_submit_command,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index f7f663f6e674..cf08166f4e5d 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -232,6 +232,45 @@ struct d3dddi_destroypagingqueue {
 	struct d3dkmthandle		paging_queue;
 };
 
+enum d3dkmt_escapetype {
+	_D3DKMT_ESCAPE_DRIVERPRIVATE	= 0,
+	_D3DKMT_ESCAPE_VIDMM		= 1,
+	_D3DKMT_ESCAPE_VIDSCH		= 3,
+	_D3DKMT_ESCAPE_DEVICE		= 4,
+	_D3DKMT_ESCAPE_DRT_TEST		= 8,
+};
+
+struct d3dddi_escapeflags {
+	union {
+		struct {
+			__u32		hardware_access:1;
+			__u32		device_status_query:1;
+			__u32		change_frame_latency:1;
+			__u32		no_adapter_synchronization:1;
+			__u32		reserved:1;
+			__u32		virtual_machine_data:1;
+			__u32		driver_known_escape:1;
+			__u32		driver_common_escape:1;
+			__u32		reserved2:24;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_escape {
+	struct d3dkmthandle		adapter;
+	struct d3dkmthandle		device;
+	enum d3dkmt_escapetype		type;
+	struct d3dddi_escapeflags	flags;
+#ifdef __KERNEL__
+	void				*priv_drv_data;
+#else
+	__u64				priv_drv_data;
+#endif
+	__u32				priv_drv_data_size;
+	struct d3dkmthandle		context;
+};
+
 enum dxgk_render_pipeline_stage {
 	_DXGK_RENDER_PIPELINE_STAGE_UNKNOWN		= 0,
 	_DXGK_RENDER_PIPELINE_STAGE_INPUT_ASSEMBLER	= 1,
@@ -1213,6 +1252,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXQUERYVIDEOMEMORYINFO	\
 	_IOWR(0x47, 0x0a, struct d3dkmt_queryvideomemoryinfo)
+#define LX_DXESCAPE			\
+	_IOWR(0x47, 0x0d, struct d3dkmt_escape)
 #define LX_DXGETDEVICESTATE		\
 	_IOWR(0x47, 0x0e, struct d3dkmt_getdevicestate)
 #define LX_DXSUBMITCOMMAND		\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 24/30] drivers: hv: dxgkrnl: Ioctl to put device to error state
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (21 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 23/30] drivers: hv: dxgkrnl: The escape ioctl Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 25/30] drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration Iouri Tarassov
                     ` (6 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement the ioctl to put the virtual compute device to the error
state (LX_DXMARKDEVICEASERROR).

This ioctl is used by the user mode driver when it detects an
unrecoverable error condition.

When a compute device is put to the error state, all subsequent
ioctl calls to the device will fail.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  3 +++
 drivers/hv/dxgkrnl/dxgvmbus.c | 25 ++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  5 +++++
 drivers/hv/dxgkrnl/ioctl.c    | 39 +++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 12 +++++++++++
 5 files changed, 84 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 875e336c11d3..03acf8ce3baa 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -836,6 +836,9 @@ int dxgvmb_send_update_alloc_property(struct dxgprocess *process,
 				      struct d3dddi_updateallocproperty *args,
 				      struct d3dddi_updateallocproperty *__user
 				      inargs);
+int dxgvmb_send_mark_device_as_error(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmt_markdeviceaserror *args);
 int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
 					struct dxgadapter *adapter,
 					struct d3dkmt_setallocationpriority *a);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index c76ab993e9ba..bf65599635c6 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2708,6 +2708,31 @@ int dxgvmb_send_update_alloc_property(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_mark_device_as_error(struct dxgprocess *process,
+				     struct dxgadapter *adapter,
+				     struct d3dkmt_markdeviceaserror *args)
+{
+	struct dxgkvmb_command_markdeviceaserror *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_MARKDEVICEASERROR,
+				   process->host_handle);
+	command->args = *args;
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
 				struct dxgadapter *adapter,
 				struct d3dkmt_setallocationpriority *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index d2d22775645b..615a163f836e 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -627,6 +627,11 @@ struct dxgkvmb_command_updateallocationproperty_return {
 	struct ntstatus			status;
 };
 
+struct dxgkvmb_command_markdeviceaserror {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_markdeviceaserror args;
+};
+
 /* Returns ntstatus */
 struct dxgkvmb_command_changevideomemoryreservation {
 	struct dxgkvmb_command_vgpu_to_host hdr;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 763bd76382a0..b2cc2c6fc725 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3380,6 +3380,43 @@ dxgk_update_alloc_property(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_mark_device_as_error(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_markdeviceaserror args;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+	int ret;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	device->execution_state = _D3DKMT_DEVICEEXECUTION_RESET;
+	ret = dxgvmb_send_mark_device_as_error(process, adapter, &args);
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_query_alloc_residency(struct dxgprocess *process, void *__user inargs)
 {
@@ -4515,6 +4552,8 @@ void init_ioctls(void)
 		  LX_DXFLUSHHEAPTRANSITIONS);
 	SET_IOCTL(/*0x25 */ dxgk_lock2,
 		  LX_DXLOCK2);
+	SET_IOCTL(/*0x26 */ dxgk_mark_device_as_error,
+		  LX_DXMARKDEVICEASERROR);
 	SET_IOCTL(/*0x2a */ dxgk_query_alloc_residency,
 		  LX_DXQUERYALLOCATIONRESIDENCY);
 	SET_IOCTL(/*0x2e */ dxgk_set_allocation_priority,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index cf08166f4e5d..fb4ff4b905c4 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -786,6 +786,16 @@ struct d3dkmt_unlock2 {
 	struct d3dkmthandle			allocation;
 };
 
+enum d3dkmt_device_error_reason {
+	_D3DKMT_DEVICE_ERROR_REASON_GENERIC		= 0x80000000,
+	_D3DKMT_DEVICE_ERROR_REASON_DRIVER_ERROR	= 0x80000006,
+};
+
+struct d3dkmt_markdeviceaserror {
+	struct d3dkmthandle			device;
+	enum d3dkmt_device_error_reason		reason;
+};
+
 enum d3dkmt_standardallocationtype {
 	_D3DKMT_STANDARDALLOCATIONTYPE_EXISTINGHEAP	= 1,
 	_D3DKMT_STANDARDALLOCATIONTYPE_CROSSADAPTER	= 2,
@@ -1286,6 +1296,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x1f, struct d3dkmt_flushheaptransitions)
 #define LX_DXLOCK2			\
 	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
+#define LX_DXMARKDEVICEASERROR		\
+	_IOWR(0x47, 0x26, struct d3dkmt_markdeviceaserror)
 #define LX_DXQUERYALLOCATIONRESIDENCY	\
 	_IOWR(0x47, 0x2a, struct d3dkmt_queryallocationresidency)
 #define LX_DXSETALLOCATIONPRIORITY	\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 25/30] drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (22 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 24/30] drivers: hv: dxgkrnl: Ioctl to put device to error state Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations Iouri Tarassov
                     ` (5 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to query statistics from the VGPU device
(LX_DXQUERYSTATISTICS) and to query clock calibration
(LX_DXQUERYCLOCKCALIBRATION).

The LX_DXQUERYSTATISTICS ioctl is used to query various statistics from
the compute device on the host.

The LX_DXQUERYCLOCKCALIBRATION ioctl queries the compute device clock
and is used for performance monitoring.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |   8 +++
 drivers/hv/dxgkrnl/dxgvmbus.c |  77 +++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  21 +++++++
 drivers/hv/dxgkrnl/ioctl.c    | 112 ++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  62 +++++++++++++++++++
 5 files changed, 280 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 03acf8ce3baa..f1b3fa3214dc 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -865,6 +865,11 @@ int dxgvmb_send_query_adapter_info(struct dxgprocess *process,
 int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
 				       struct dxgadapter *adapter,
 				       struct d3dkmt_submitcommandtohwqueue *a);
+int dxgvmb_send_query_clock_calibration(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmt_queryclockcalibration *a,
+					struct d3dkmt_queryclockcalibration
+					*__user inargs);
 int dxgvmb_send_flush_heap_transitions(struct dxgprocess *process,
 				       struct dxgadapter *adapter,
 				       struct d3dkmt_flushheaptransitions *arg);
@@ -909,6 +914,9 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 				  void *prive_alloc_data,
 				  u32 *res_priv_data_size,
 				  void *priv_res_data);
+int dxgvmb_send_query_statistics(struct dxgprocess *process,
+				 struct dxgadapter *adapter,
+				 struct d3dkmt_querystatistics *args);
 int dxgvmb_send_async_msg(struct dxgvmbuschannel *channel,
 			  void *command,
 			  u32 cmd_size);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index bf65599635c6..04173dfe26a3 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1812,6 +1812,48 @@ int dxgvmb_send_destroy_allocation(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_query_clock_calibration(struct dxgprocess *process,
+					struct dxgadapter *adapter,
+					struct d3dkmt_queryclockcalibration
+					*args,
+					struct d3dkmt_queryclockcalibration
+					*__user inargs)
+{
+	struct dxgkvmb_command_queryclockcalibration *command;
+	struct dxgkvmb_command_queryclockcalibration_return result;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_QUERYCLOCKCALIBRATION,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+	ret = copy_to_user(&inargs->clock_data, &result.clock_data,
+			   sizeof(result.clock_data));
+	if (ret) {
+		pr_err("%s failed to copy clock data", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	ret = ntstatus2int(result.status);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_flush_heap_transitions(struct dxgprocess *process,
 				       struct dxgadapter *adapter,
 				       struct d3dkmt_flushheaptransitions *args)
@@ -3215,3 +3257,38 @@ int dxgvmb_send_submit_command_hwqueue(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_query_statistics(struct dxgprocess *process,
+				 struct dxgadapter *adapter,
+				 struct d3dkmt_querystatistics *args)
+{
+	struct dxgkvmb_command_querystatistics *command;
+	struct dxgkvmb_command_querystatistics_return *result;
+	int ret;
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	ret = init_message_res(&msg, adapter, process, sizeof(*command),
+			       sizeof(*result));
+	if (ret)
+		goto cleanup;
+	command = msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_QUERYSTATISTICS,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+
+	args->result = result->result;
+	ret = ntstatus2int(result->status);
+
+cleanup:
+	free_message((struct dxgvmbusmsg *)&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 615a163f836e..6c8ed5d547a5 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -372,6 +372,16 @@ struct dxgkvmb_command_flushheaptransitions {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 };
 
+struct dxgkvmb_command_queryclockcalibration {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_queryclockcalibration args;
+};
+
+struct dxgkvmb_command_queryclockcalibration_return {
+	struct ntstatus			status;
+	struct dxgk_gpuclockdata	clock_data;
+};
+
 struct dxgkvmb_command_createallocation_allocinfo {
 	u32				flags;
 	u32				priv_drv_data_size;
@@ -408,6 +418,17 @@ struct dxgkvmb_command_openresource_return {
 /* struct d3dkmthandle   allocation[allocation_count]; */
 };
 
+struct dxgkvmb_command_querystatistics {
+	struct dxgkvmb_command_vgpu_to_host	hdr;
+	struct d3dkmt_querystatistics		args;
+};
+
+struct dxgkvmb_command_querystatistics_return {
+	struct ntstatus				status;
+	u32					reserved;
+	struct d3dkmt_querystatistics_result	result;
+};
+
 struct dxgkvmb_command_getstandardallocprivdata {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	enum d3dkmdt_standardallocationtype alloc_type;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index b2cc2c6fc725..4bff78c7b1d3 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -153,6 +153,66 @@ static int dxgk_open_adapter_from_luid(struct dxgprocess *process,
 	return ret;
 }
 
+static int dxgk_query_statistics(struct dxgprocess *process,
+				 void __user *inargs)
+{
+	struct d3dkmt_querystatistics *args;
+	int ret;
+	struct dxgadapter *entry;
+	struct dxgadapter *adapter = NULL;
+	struct winluid tmp;
+
+	pr_debug("ioctl: %s", __func__);
+
+	args = vzalloc(sizeof(struct d3dkmt_querystatistics));
+	if (args == NULL) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	ret = copy_from_user(args, inargs, sizeof(*args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	dxgglobal_acquire_adapter_list_lock(DXGLOCK_SHARED);
+	list_for_each_entry(entry, &dxgglobal->adapter_list_head,
+			    adapter_list_entry) {
+		if (dxgadapter_acquire_lock_shared(entry) == 0) {
+			if (*(u64 *) &entry->luid ==
+			    *(u64 *) &args->adapter_luid) {
+				adapter = entry;
+				break;
+			}
+			dxgadapter_release_lock_shared(entry);
+		}
+	}
+	dxgglobal_release_adapter_list_lock(DXGLOCK_SHARED);
+	if (adapter) {
+		tmp = args->adapter_luid;
+		args->adapter_luid = adapter->host_adapter_luid;
+		ret = dxgvmb_send_query_statistics(process, adapter, args);
+		if (ret >= 0) {
+			args->adapter_luid = tmp;
+			ret = copy_to_user(inargs, args, sizeof(*args));
+			if (ret) {
+				pr_err("%s failed to copy args", __func__);
+				ret = -EINVAL;
+			}
+		}
+		dxgadapter_release_lock_shared(adapter);
+	}
+
+cleanup:
+	if (args)
+		vfree(args);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgkp_enum_adapters(struct dxgprocess *process,
 		    union d3dkmt_enumadapters_filter filter,
@@ -3576,6 +3636,54 @@ dxgk_change_vidmem_reservation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_query_clock_calibration(struct dxgprocess *process, void *__user inargs)
+{
+	struct d3dkmt_queryclockcalibration args;
+	int ret;
+	struct dxgadapter *adapter = NULL;
+	bool adapter_locked = false;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	adapter_locked = true;
+
+	args.adapter = adapter->host_handle;
+	ret = dxgvmb_send_query_clock_calibration(process, adapter,
+						  &args, inargs);
+	if (ret < 0)
+		goto cleanup;
+	ret = copy_to_user(inargs, &args, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy output args", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (adapter_locked)
+		dxgadapter_release_lock_shared(adapter);
+	if (adapter)
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	return ret;
+}
+
 static int
 dxgk_flush_heap_transitions(struct dxgprocess *process, void *__user inargs)
 {
@@ -4580,6 +4688,8 @@ void init_ioctls(void)
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU);
 	SET_IOCTL(/*0x3c */ dxgk_get_allocation_priority,
 		  LX_DXGETALLOCATIONPRIORITY);
+	SET_IOCTL(/*0x3d */ dxgk_query_clock_calibration,
+		  LX_DXQUERYCLOCKCALIBRATION);
 	SET_IOCTL(/*0x3e */ dxgk_enum_adapters3,
 		  LX_DXENUMADAPTERS3);
 	SET_IOCTL(/*0x3f */ dxgk_share_objects,
@@ -4590,6 +4700,8 @@ void init_ioctls(void)
 		  LX_DXQUERYRESOURCEINFOFROMNTHANDLE);
 	SET_IOCTL(/*0x42 */ dxgk_open_resource_nt,
 		  LX_DXOPENRESOURCEFROMNTHANDLE);
+	SET_IOCTL(/*0x43 */ dxgk_query_statistics,
+		  LX_DXQUERYSTATISTICS);
 	SET_IOCTL(/*0x44 */ dxgk_share_object_with_host,
 		  LX_DXSHAREOBJECTWITHHOST);
 }
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index fb4ff4b905c4..2cd88a9320d1 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -992,6 +992,34 @@ struct d3dkmt_queryadapterinfo {
 	__u32				private_data_size;
 };
 
+#pragma pack(push, 1)
+
+struct dxgk_gpuclockdata_flags {
+	union {
+		struct {
+			__u32	context_management_processor:1;
+			__u32	reserved:31;
+		};
+		__u32		value;
+	};
+};
+
+struct dxgk_gpuclockdata {
+	__u64				gpu_frequency;
+	__u64				gpu_clock_counter;
+	__u64				cpu_clock_counter;
+	struct dxgk_gpuclockdata_flags	flags;
+} __packed;
+
+struct d3dkmt_queryclockcalibration {
+	struct d3dkmthandle	adapter;
+	__u32			node_ordinal;
+	__u32			physical_adapter_index;
+	struct dxgk_gpuclockdata clock_data;
+};
+
+#pragma pack(pop)
+
 struct d3dkmt_flushheaptransitions {
 	struct d3dkmthandle	adapter;
 };
@@ -1234,6 +1262,36 @@ struct d3dkmt_enumadapters3 {
 #endif
 };
 
+enum d3dkmt_querystatistics_type {
+	_D3DKMT_QUERYSTATISTICS_ADAPTER                = 0,
+	_D3DKMT_QUERYSTATISTICS_PROCESS                = 1,
+	_D3DKMT_QUERYSTATISTICS_PROCESS_ADAPTER        = 2,
+	_D3DKMT_QUERYSTATISTICS_SEGMENT                = 3,
+	_D3DKMT_QUERYSTATISTICS_PROCESS_SEGMENT        = 4,
+	_D3DKMT_QUERYSTATISTICS_NODE                   = 5,
+	_D3DKMT_QUERYSTATISTICS_PROCESS_NODE           = 6,
+	_D3DKMT_QUERYSTATISTICS_VIDPNSOURCE            = 7,
+	_D3DKMT_QUERYSTATISTICS_PROCESS_VIDPNSOURCE    = 8,
+	_D3DKMT_QUERYSTATISTICS_PROCESS_SEGMENT_GROUP  = 9,
+	_D3DKMT_QUERYSTATISTICS_PHYSICAL_ADAPTER       = 10,
+};
+
+struct d3dkmt_querystatistics_result {
+	char size[0x308];
+};
+
+struct d3dkmt_querystatistics {
+	union {
+		struct {
+			enum d3dkmt_querystatistics_type	type;
+			struct winluid				adapter_luid;
+			__u64					process;
+			struct d3dkmt_querystatistics_result	result;
+		};
+		char size[0x328];
+	};
+};
+
 struct d3dkmt_shareobjectwithhost {
 	struct d3dkmthandle	device_handle;
 	struct d3dkmthandle	object_handle;
@@ -1324,6 +1382,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x3b, struct d3dkmt_waitforsynchronizationobjectfromgpu)
 #define LX_DXGETALLOCATIONPRIORITY	\
 	_IOWR(0x47, 0x3c, struct d3dkmt_getallocationpriority)
+#define LX_DXQUERYCLOCKCALIBRATION	\
+	_IOWR(0x47, 0x3d, struct d3dkmt_queryclockcalibration)
 #define LX_DXENUMADAPTERS3		\
 	_IOWR(0x47, 0x3e, struct d3dkmt_enumadapters3)
 #define LX_DXSHAREOBJECTS		\
@@ -1334,6 +1394,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x41, struct d3dkmt_queryresourceinfofromnthandle)
 #define LX_DXOPENRESOURCEFROMNTHANDLE	\
 	_IOWR(0x47, 0x42, struct d3dkmt_openresourcefromnthandle)
+#define LX_DXQUERYSTATISTICS	\
+	_IOWR(0x47, 0x43, struct d3dkmt_querystatistics)
 #define LX_DXSHAREOBJECTWITHHOST	\
 	_IOWR(0x47, 0x44, struct d3dkmt_shareobjectwithhost)
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (23 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 25/30] drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-02 14:25     ` Wei Liu
  2022-03-01 19:46   ` [PATCH v3 27/30] drivers: hv: dxgkrnl: Ioctls to manage scheduling priority Iouri Tarassov
                     ` (4 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to offer and reclaim compute device allocations:
  - LX_DXOFFERALLOCATIONS,
  - LX_DXRECLAIMALLOCATIONS2

When a user mode driver (UMD) does not need to access an allocation,
it can "offer" it by issuing the LX_DXOFFERALLOCATIONS ioctl.  This
means that the allocation is not in use and its local device memory
could be evicted. The freed space could be given to another allocation.
When the allocation is again needed, the UMD can attempt to"reclaim"
the allocation by issuing the LX_DXRECLAIMALLOCATIONS2 ioctl. If the
allocation is still not evicted, the reclaim operation succeeds and no
other action is required. If the reclaim operation fails, the caller
must restore the content of the allocation before it can be used by
the device.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |   8 +++
 drivers/hv/dxgkrnl/dxgvmbus.c | 122 ++++++++++++++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  27 ++++++++
 drivers/hv/dxgkrnl/ioctl.c    | 119 +++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  67 +++++++++++++++++++
 5 files changed, 343 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index f1b3fa3214dc..59e10e7202cd 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -845,6 +845,14 @@ int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
 int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
 					struct dxgadapter *adapter,
 					struct d3dkmt_getallocationpriority *a);
+int dxgvmb_send_offer_allocations(struct dxgprocess *process,
+				  struct dxgadapter *adapter,
+				  struct d3dkmt_offerallocations *args);
+int dxgvmb_send_reclaim_allocations(struct dxgprocess *process,
+				    struct dxgadapter *adapter,
+				    struct d3dkmthandle device,
+				    struct d3dkmt_reclaimallocations2 *args,
+				    u64 __user *paging_fence_value);
 int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
 					  struct dxgadapter *adapter,
 					  struct d3dkmthandle other_process,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 04173dfe26a3..0ab7acf3787b 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2927,6 +2927,128 @@ int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_offer_allocations(struct dxgprocess *process,
+				  struct dxgadapter *adapter,
+				  struct d3dkmt_offerallocations *args)
+{
+	struct dxgkvmb_command_offerallocations *command;
+	int ret = -EINVAL;
+	u32 alloc_size = sizeof(struct d3dkmthandle) * args->allocation_count;
+	u32 cmd_size = sizeof(struct dxgkvmb_command_offerallocations) +
+			alloc_size - sizeof(struct d3dkmthandle);
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_OFFERALLOCATIONS,
+				   process->host_handle);
+	command->flags = args->flags;
+	command->priority = args->priority;
+	command->device = args->device;
+	command->allocation_count = args->allocation_count;
+	if (args->resources) {
+		command->resources = true;
+		ret = copy_from_user(command->allocations, args->resources,
+				     alloc_size);
+	} else {
+		ret = copy_from_user(command->allocations,
+				     args->allocations, alloc_size);
+	}
+	if (ret) {
+		pr_err("%s failed to copy input handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_reclaim_allocations(struct dxgprocess *process,
+				    struct dxgadapter *adapter,
+				    struct d3dkmthandle device,
+				    struct d3dkmt_reclaimallocations2 *args,
+				    u64  __user *paging_fence_value)
+{
+	struct dxgkvmb_command_reclaimallocations *command;
+	struct dxgkvmb_command_reclaimallocations_return *result;
+	int ret;
+	u32 alloc_size = sizeof(struct d3dkmthandle) * args->allocation_count;
+	u32 cmd_size = sizeof(struct dxgkvmb_command_reclaimallocations) +
+	    alloc_size - sizeof(struct d3dkmthandle);
+	u32 result_size = sizeof(*result);
+	struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+	if (args->results)
+		result_size += (args->allocation_count - 1) *
+				sizeof(enum d3dddi_reclaim_result);
+
+	ret = init_message_res(&msg, adapter, process, cmd_size, result_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	result = msg.res;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_RECLAIMALLOCATIONS,
+				   process->host_handle);
+	command->device = device;
+	command->paging_queue = args->paging_queue;
+	command->allocation_count = args->allocation_count;
+	command->write_results = args->results != NULL;
+	if (args->resources) {
+		command->resources = true;
+		ret = copy_from_user(command->allocations, args->resources,
+					 alloc_size);
+	} else {
+		ret = copy_from_user(command->allocations,
+					 args->allocations, alloc_size);
+	}
+	if (ret) {
+		pr_err("%s failed to copy input handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   result, msg.res_size);
+	if (ret < 0)
+		goto cleanup;
+	ret = copy_to_user(paging_fence_value,
+			   &result->paging_fence_value, sizeof(u64));
+	if (ret) {
+		pr_err("%s failed to copy paging fence", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = ntstatus2int(result->status);
+	if (NT_SUCCESS(result->status) && args->results) {
+		ret = copy_to_user(args->results, result->discarded,
+				   sizeof(result->discarded[0]) *
+				   args->allocation_count);
+		if (ret) {
+			pr_err("%s failed to copy results", __func__);
+			ret = -EINVAL;
+		}
+	}
+
+cleanup:
+	free_message((struct dxgvmbusmsg *)&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
 					  struct dxgadapter *adapter,
 					  struct d3dkmthandle other_process,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 6c8ed5d547a5..7fa89ed16c2d 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -653,6 +653,33 @@ struct dxgkvmb_command_markdeviceaserror {
 	struct d3dkmt_markdeviceaserror args;
 };
 
+/* Returns ntstatus */
+struct dxgkvmb_command_offerallocations {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	u32				allocation_count;
+	enum d3dkmt_offer_priority	priority;
+	struct d3dkmt_offer_flags	flags;
+	bool				resources;
+	struct d3dkmthandle		allocations[1];
+};
+
+struct dxgkvmb_command_reclaimallocations {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		paging_queue;
+	u32				allocation_count;
+	bool				resources;
+	bool				write_results;
+	struct d3dkmthandle		allocations[1];
+};
+
+struct dxgkvmb_command_reclaimallocations_return {
+	u64				paging_fence_value;
+	struct ntstatus			status;
+	enum d3dddi_reclaim_result	discarded[1];
+};
+
 /* Returns ntstatus */
 struct dxgkvmb_command_changevideomemoryreservation {
 	struct dxgkvmb_command_vgpu_to_host hdr;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 4bff78c7b1d3..47ffdd1dda93 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1996,6 +1996,121 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_offer_allocations(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_offerallocations args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.allocation_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+	    args.allocation_count == 0) {
+		pr_err("invalid number of allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if ((args.resources == NULL) == (args.allocations == NULL)) {
+		pr_err("invalid pointer to resources/allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_offer_allocations(process, adapter, &args);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_reclaim_allocations(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_reclaimallocations2 args;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	struct d3dkmt_reclaimallocations2 * __user in_args = inargs;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.allocation_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+	    args.allocation_count == 0) {
+		pr_err("invalid number of allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if ((args.resources == NULL) == (args.allocations == NULL)) {
+		pr_err("invalid pointer to resources/allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+						args.paging_queue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_reclaim_allocations(process, adapter,
+					      device->handle, &args,
+					      &in_args->paging_fence_value);
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_submit_command(struct dxgprocess *process, void *__user inargs)
 {
@@ -4662,8 +4777,12 @@ void init_ioctls(void)
 		  LX_DXLOCK2);
 	SET_IOCTL(/*0x26 */ dxgk_mark_device_as_error,
 		  LX_DXMARKDEVICEASERROR);
+	SET_IOCTL(/*0x27 */ dxgk_offer_allocations,
+		  LX_DXOFFERALLOCATIONS);
 	SET_IOCTL(/*0x2a */ dxgk_query_alloc_residency,
 		  LX_DXQUERYALLOCATIONRESIDENCY);
+	SET_IOCTL(/*0x2c */ dxgk_reclaim_allocations,
+		  LX_DXRECLAIMALLOCATIONS2);
 	SET_IOCTL(/*0x2e */ dxgk_set_allocation_priority,
 		  LX_DXSETALLOCATIONPRIORITY);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 2cd88a9320d1..3849f714ae9c 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -57,6 +57,7 @@ struct winluid {
 #define D3DDDI_MAX_WRITTEN_PRIMARIES		16
 
 #define D3DKMT_CREATEALLOCATION_MAX		1024
+#define D3DKMT_MAKERESIDENT_ALLOC_MAX		(1024 * 10)
 #define D3DKMT_ADAPTERS_MAX			64
 #define D3DDDI_MAX_BROADCAST_CONTEXT		64
 #define D3DDDI_MAX_OBJECT_WAITED_ON		32
@@ -1083,6 +1084,68 @@ struct d3dddi_updateallocproperty {
 	};
 };
 
+enum d3dkmt_offer_priority {
+	_D3DKMT_OFFER_PRIORITY_LOW	= 1,
+	_D3DKMT_OFFER_PRIORITY_NORMAL	= 2,
+	_D3DKMT_OFFER_PRIORITY_HIGH	= 3,
+	_D3DKMT_OFFER_PRIORITY_AUTO	= 4,
+};
+
+struct d3dkmt_offer_flags {
+	union {
+		struct {
+			__u32	offer_immediately:1;
+			__u32	allow_decommit:1;
+			__u32	reserved:30;
+		};
+		__u32		value;
+	};
+};
+
+struct d3dkmt_offerallocations {
+	struct d3dkmthandle		device;
+	__u32				reserved;
+#ifdef __KERNEL__
+	struct d3dkmthandle		*resources;
+	const struct d3dkmthandle	*allocations;
+#else
+	__u64				resources;
+	__u64				allocations;
+#endif
+	__u32				allocation_count;
+	enum d3dkmt_offer_priority	priority;
+	struct d3dkmt_offer_flags	flags;
+	__u32				reserved1;
+};
+
+enum d3dddi_reclaim_result {
+	_D3DDDI_RECLAIM_RESULT_OK		= 0,
+	_D3DDDI_RECLAIM_RESULT_DISCARDED	= 1,
+	_D3DDDI_RECLAIM_RESULT_NOT_COMMITTED	= 2,
+};
+
+struct d3dkmt_reclaimallocations2 {
+	struct d3dkmthandle	paging_queue;
+	__u32			allocation_count;
+#ifdef __KERNEL__
+	struct d3dkmthandle	*resources;
+	struct d3dkmthandle	*allocations;
+#else
+	__u64			resources;
+	__u64			allocations;
+#endif
+	union {
+#ifdef __KERNEL__
+		__u32				*discarded;
+		enum d3dddi_reclaim_result	*results;
+#else
+		__u64				discarded;
+		__u64				results;
+#endif
+	};
+	__u64			paging_fence_value;
+};
+
 struct d3dkmt_changevideomemoryreservation {
 	__u64			process;
 	struct d3dkmthandle	adapter;
@@ -1356,8 +1419,12 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
 #define LX_DXMARKDEVICEASERROR		\
 	_IOWR(0x47, 0x26, struct d3dkmt_markdeviceaserror)
+#define LX_DXOFFERALLOCATIONS		\
+	_IOWR(0x47, 0x27, struct d3dkmt_offerallocations)
 #define LX_DXQUERYALLOCATIONRESIDENCY	\
 	_IOWR(0x47, 0x2a, struct d3dkmt_queryallocationresidency)
+#define LX_DXRECLAIMALLOCATIONS2	\
+	_IOWR(0x47, 0x2c, struct d3dkmt_reclaimallocations2)
 #define LX_DXSETALLOCATIONPRIORITY	\
 	_IOWR(0x47, 0x2e, struct d3dkmt_setallocationpriority)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 27/30] drivers: hv: dxgkrnl: Ioctls to manage scheduling priority
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (24 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 28/30] drivers: hv: dxgkrnl: Manage residency of allocations Iouri Tarassov
                     ` (3 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement iocts to manage compute device scheduling priority:
  - LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY
  - LX_DXGETCONTEXTSCHEDULINGPRIORITY
  - LX_DXSETCONTEXTINPROCESSSCHEDULINGPRIORITY
  - LX_DXSETCONTEXTSCHEDULINGPRIORITY

Each compute device execution context has an assigned scheduling
priority. It is used by the compute device scheduler on the host to
pick contexts for execution. There is a global priority and a
priority within a process.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |   9 ++
 drivers/hv/dxgkrnl/dxgvmbus.c |  63 +++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  19 ++++
 drivers/hv/dxgkrnl/ioctl.c    | 173 ++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  28 ++++++
 5 files changed, 292 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 59e10e7202cd..00c3bb5f3ab8 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -845,6 +845,15 @@ int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
 int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
 					struct dxgadapter *adapter,
 					struct d3dkmt_getallocationpriority *a);
+int dxgvmb_send_set_context_sch_priority(struct dxgprocess *process,
+					 struct dxgadapter *adapter,
+					 struct d3dkmthandle context,
+					 int priority, bool in_process);
+int dxgvmb_send_get_context_sch_priority(struct dxgprocess *process,
+					 struct dxgadapter *adapter,
+					 struct d3dkmthandle context,
+					 int *priority,
+					 bool in_process);
 int dxgvmb_send_offer_allocations(struct dxgprocess *process,
 				  struct dxgadapter *adapter,
 				  struct d3dkmt_offerallocations *args);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 0ab7acf3787b..2f0914b71c3c 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2927,6 +2927,69 @@ int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_set_context_sch_priority(struct dxgprocess *process,
+					 struct dxgadapter *adapter,
+					 struct d3dkmthandle context,
+					 int priority,
+					 bool in_process)
+{
+	struct dxgkvmb_command_setcontextschedulingpriority2 *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_SETCONTEXTSCHEDULINGPRIORITY,
+				   process->host_handle);
+	command->context = context;
+	command->priority = priority;
+	command->in_process = in_process;
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_get_context_sch_priority(struct dxgprocess *process,
+					 struct dxgadapter *adapter,
+					 struct d3dkmthandle context,
+					 int *priority,
+					 bool in_process)
+{
+	struct dxgkvmb_command_getcontextschedulingpriority *command;
+	struct dxgkvmb_command_getcontextschedulingpriority_return result = { };
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_GETCONTEXTSCHEDULINGPRIORITY,
+				   process->host_handle);
+	command->context = context;
+	command->in_process = in_process;
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret >= 0) {
+		ret = ntstatus2int(result.status);
+		*priority = result.priority;
+	}
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_offer_allocations(struct dxgprocess *process,
 				  struct dxgadapter *adapter,
 				  struct d3dkmt_offerallocations *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 7fa89ed16c2d..2e522d6652da 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -331,6 +331,25 @@ struct dxgkvmb_command_getallocationpriority_return {
 	/* u32 priorities[allocation_count or 1]; */
 };
 
+/* Returns ntstatus */
+struct dxgkvmb_command_setcontextschedulingpriority2 {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		context;
+	int				priority;
+	bool				in_process;
+};
+
+struct dxgkvmb_command_getcontextschedulingpriority {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		context;
+	bool				in_process;
+};
+
+struct dxgkvmb_command_getcontextschedulingpriority_return {
+	struct ntstatus			status;
+	int				priority;
+};
+
 struct dxgkvmb_command_createdevice {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_createdeviceflags	flags;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 47ffdd1dda93..0a4fca6ee2aa 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3703,6 +3703,171 @@ dxgk_get_allocation_priority(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+set_context_scheduling_priority(struct dxgprocess *process,
+				struct d3dkmthandle hcontext,
+				int priority, bool in_process)
+{
+	int ret = 0;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    hcontext);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	ret = dxgvmb_send_set_context_sch_priority(process, adapter,
+						   hcontext, priority,
+						   in_process);
+	if (ret < 0)
+		pr_err("send_set_context_scheduling_priority failed");
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	return ret;
+}
+
+static int
+dxgk_set_context_scheduling_priority(struct dxgprocess *process,
+				     void *__user inargs)
+{
+	struct d3dkmt_setcontextschedulingpriority args;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = set_context_scheduling_priority(process, args.context,
+					      args.priority, false);
+cleanup:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+get_context_scheduling_priority(struct dxgprocess *process,
+				struct d3dkmthandle hcontext,
+				int __user *priority,
+				bool in_process)
+{
+	int ret;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+	int pri = 0;
+
+	device = dxgprocess_device_by_object_handle(process,
+						    HMGRENTRY_TYPE_DXGCONTEXT,
+						    hcontext);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+	ret = dxgvmb_send_get_context_sch_priority(process, adapter,
+						   hcontext, &pri, in_process);
+	if (ret < 0)
+		goto cleanup;
+	ret = copy_to_user(priority, &pri, sizeof(pri));
+	if (ret) {
+		pr_err("%s failed to copy priority to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	return ret;
+}
+
+static int
+dxgk_get_context_scheduling_priority(struct dxgprocess *process,
+				     void *__user inargs)
+{
+	struct d3dkmt_getcontextschedulingpriority args;
+	struct d3dkmt_getcontextschedulingpriority __user *input = inargs;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = get_context_scheduling_priority(process, args.context,
+					      &input->priority, false);
+cleanup:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_set_context_process_scheduling_priority(struct dxgprocess *process,
+					     void *__user inargs)
+{
+	struct d3dkmt_setcontextinprocessschedulingpriority args;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = set_context_scheduling_priority(process, args.context,
+					      args.priority, true);
+cleanup:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_get_context_process_scheduling_priority(struct dxgprocess *process,
+					     void __user *inargs)
+{
+	struct d3dkmt_getcontextinprocessschedulingpriority args;
+	int ret;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = get_context_scheduling_priority(process, args.context,
+		&((struct d3dkmt_getcontextinprocessschedulingpriority *)
+		inargs)->priority, true);
+cleanup:
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_change_vidmem_reservation(struct dxgprocess *process, void *__user inargs)
 {
@@ -4773,6 +4938,10 @@ void init_ioctls(void)
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
 	SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
 		  LX_DXFLUSHHEAPTRANSITIONS);
+	SET_IOCTL(/*0x21 */ dxgk_get_context_process_scheduling_priority,
+		  LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY);
+	SET_IOCTL(/*0x22 */ dxgk_get_context_scheduling_priority,
+		  LX_DXGETCONTEXTSCHEDULINGPRIORITY);
 	SET_IOCTL(/*0x25 */ dxgk_lock2,
 		  LX_DXLOCK2);
 	SET_IOCTL(/*0x26 */ dxgk_mark_device_as_error,
@@ -4785,6 +4954,10 @@ void init_ioctls(void)
 		  LX_DXRECLAIMALLOCATIONS2);
 	SET_IOCTL(/*0x2e */ dxgk_set_allocation_priority,
 		  LX_DXSETALLOCATIONPRIORITY);
+	SET_IOCTL(/*0x2f */ dxgk_set_context_process_scheduling_priority,
+		  LX_DXSETCONTEXTINPROCESSSCHEDULINGPRIORITY);
+	SET_IOCTL(/*0x30 */ dxgk_set_context_scheduling_priority,
+		  LX_DXSETCONTEXTSCHEDULINGPRIORITY);
 	SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
 		  LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x32 */ dxgk_signal_sync_object_gpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 3849f714ae9c..11e2f3c9c88c 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -704,6 +704,26 @@ struct d3dkmt_submitcommandtohwqueue {
 #endif
 };
 
+struct d3dkmt_setcontextschedulingpriority {
+	struct d3dkmthandle			context;
+	int					priority;
+};
+
+struct d3dkmt_setcontextinprocessschedulingpriority {
+	struct d3dkmthandle			context;
+	int					priority;
+};
+
+struct d3dkmt_getcontextschedulingpriority {
+	struct d3dkmthandle			context;
+	int					priority;
+};
+
+struct d3dkmt_getcontextinprocessschedulingpriority {
+	struct d3dkmthandle			context;
+	int					priority;
+};
+
 struct d3dkmt_setallocationpriority {
 	struct d3dkmthandle		device;
 	struct d3dkmthandle		resource;
@@ -1415,6 +1435,10 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
 #define LX_DXFLUSHHEAPTRANSITIONS	\
 	_IOWR(0x47, 0x1f, struct d3dkmt_flushheaptransitions)
+#define LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY \
+	_IOWR(0x47, 0x21, struct d3dkmt_getcontextinprocessschedulingpriority)
+#define LX_DXGETCONTEXTSCHEDULINGPRIORITY \
+	_IOWR(0x47, 0x22, struct d3dkmt_getcontextschedulingpriority)
 #define LX_DXLOCK2			\
 	_IOWR(0x47, 0x25, struct d3dkmt_lock2)
 #define LX_DXMARKDEVICEASERROR		\
@@ -1427,6 +1451,10 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x2c, struct d3dkmt_reclaimallocations2)
 #define LX_DXSETALLOCATIONPRIORITY	\
 	_IOWR(0x47, 0x2e, struct d3dkmt_setallocationpriority)
+#define LX_DXSETCONTEXTINPROCESSSCHEDULINGPRIORITY \
+	_IOWR(0x47, 0x2f, struct d3dkmt_setcontextinprocessschedulingpriority)
+#define LX_DXSETCONTEXTSCHEDULINGPRIORITY \
+	_IOWR(0x47, 0x30, struct d3dkmt_setcontextschedulingpriority)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x31, struct d3dkmt_signalsynchronizationobjectfromcpu)
 #define LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 28/30] drivers: hv: dxgkrnl: Manage residency of allocations
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (25 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 27/30] drivers: hv: dxgkrnl: Ioctls to manage scheduling priority Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 29/30] drivers: hv: dxgkrnl: Manage compute device virtual addresses Iouri Tarassov
                     ` (2 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to manage residency of compute device allocations:
  - LX_DXMAKERESIDENT,
  - LX_DXEVICT.

An allocation is "resident" when the compute devoce is setup to
access it. It means that the allocation is in the local device
memory or in non-pageable system memory.

The current design does not support on demand compute device page
faulting. An allocation must be resident before the compute device
is allowed to access it.

The LX_DXMAKERESIDENT ioctl instructs the video memory manager to
make the given allocations resident. The operation is submitted to
a paging queue (dxgpagingqueue). When the ioctl returns a "pending"
status, a monitored fence sync object can be used to synchronize
with the completion of the operation.

The LX_DXEVICT ioctl istructs the video memory manager to evict
the given allocations from device accessible memory.

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |   4 +
 drivers/hv/dxgkrnl/dxgvmbus.c |  98 +++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  27 +++++++
 drivers/hv/dxgkrnl/ioctl.c    | 144 ++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  |  54 +++++++++++++
 5 files changed, 327 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 00c3bb5f3ab8..c841203a1683 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -790,6 +790,10 @@ int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 int dxgvmb_send_destroy_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
 				   struct d3dkmt_destroyallocation2 *args,
 				   struct d3dkmthandle *alloc_handles);
+int dxgvmb_send_make_resident(struct dxgprocess *pr, struct dxgadapter *adapter,
+			      struct d3dddi_makeresident *args);
+int dxgvmb_send_evict(struct dxgprocess *pr, struct dxgadapter *adapter,
+		      struct d3dkmt_evict *args);
 int dxgvmb_send_submit_command(struct dxgprocess *pr,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_submitcommand *args);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 2f0914b71c3c..9f5b8edb186e 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2262,6 +2262,104 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
 	return ret;
 }
 
+int dxgvmb_send_make_resident(struct dxgprocess *process,
+			      struct dxgadapter *adapter,
+			      struct d3dddi_makeresident *args)
+{
+	int ret;
+	u32 cmd_size;
+	struct dxgkvmb_command_makeresident_return result = { };
+	struct dxgkvmb_command_makeresident *command = NULL;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	cmd_size = (args->alloc_count - 1) * sizeof(struct d3dkmthandle) +
+		   sizeof(struct dxgkvmb_command_makeresident);
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	ret = copy_from_user(command->allocations, args->allocation_list,
+			     args->alloc_count *
+			     sizeof(struct d3dkmthandle));
+	if (ret) {
+		pr_err("%s failed to copy alloc handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_MAKERESIDENT,
+				   process->host_handle);
+	command->alloc_count = args->alloc_count;
+	command->paging_queue = args->paging_queue;
+	command->flags = args->flags;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0) {
+		pr_err("send_make_resident failed %x", ret);
+		goto cleanup;
+	}
+
+	args->paging_fence_value = result.paging_fence_value;
+	args->num_bytes_to_trim = result.num_bytes_to_trim;
+	ret = ntstatus2int(result.status);
+
+cleanup:
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_evict(struct dxgprocess *process,
+		      struct dxgadapter *adapter,
+		      struct d3dkmt_evict *args)
+{
+	int ret;
+	u32 cmd_size;
+	struct dxgkvmb_command_evict_return result = { };
+	struct dxgkvmb_command_evict *command = NULL;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	cmd_size = (args->alloc_count - 1) * sizeof(struct d3dkmthandle) +
+	    sizeof(struct dxgkvmb_command_evict);
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+	ret = copy_from_user(command->allocations, args->allocations,
+			     args->alloc_count *
+			     sizeof(struct d3dkmthandle));
+	if (ret) {
+		pr_err("%s failed to copy alloc handles", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_EVICT, process->host_handle);
+	command->alloc_count = args->alloc_count;
+	command->device = args->device;
+	command->flags = args->flags;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+				   &result, sizeof(result));
+	if (ret < 0) {
+		pr_err("send_evict failed %x", ret);
+		goto cleanup;
+	}
+	args->num_bytes_to_trim = result.num_bytes_to_trim;
+
+cleanup:
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 int dxgvmb_send_submit_command(struct dxgprocess *process,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_submitcommand *args)
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 2e522d6652da..59357bd5c7b9 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -372,6 +372,33 @@ struct dxgkvmb_command_flushdevice {
 	enum dxgdevice_flushschedulerreason	reason;
 };
 
+struct dxgkvmb_command_makeresident {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		paging_queue;
+	struct d3dddi_makeresident_flags flags;
+	u32				alloc_count;
+	struct d3dkmthandle		allocations[1];
+};
+
+struct dxgkvmb_command_makeresident_return {
+	u64			paging_fence_value;
+	u64			num_bytes_to_trim;
+	struct ntstatus		status;
+};
+
+struct dxgkvmb_command_evict {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dddi_evict_flags	flags;
+	u32				alloc_count;
+	struct d3dkmthandle		allocations[1];
+};
+
+struct dxgkvmb_command_evict_return {
+	u64			num_bytes_to_trim;
+};
+
 struct dxgkvmb_command_submitcommand {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_submitcommand	args;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 0a4fca6ee2aa..a90c1a897d55 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -1996,6 +1996,146 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_make_resident(struct dxgprocess *process, void *__user inargs)
+{
+	int ret, ret2;
+	struct d3dddi_makeresident args;
+	struct d3dddi_makeresident *input = inargs;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.alloc_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+	    args.alloc_count == 0) {
+		pr_err("invalid number of allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+	if (args.paging_queue.v == 0) {
+		pr_err("paging queue is missing");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+						HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+						args.paging_queue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_make_resident(process, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+	/* STATUS_PENING is a success code > 0. It is returned to user mode */
+	if (!(ret == STATUS_PENDING || ret == 0)) {
+		pr_err("%s Unexpected error %x", __func__, ret);
+		goto cleanup;
+	}
+
+	ret2 = copy_to_user(&input->paging_fence_value,
+			    &args.paging_fence_value, sizeof(u64));
+	if (ret2) {
+		pr_err("%s failed to copy paging fence", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret2 = copy_to_user(&input->num_bytes_to_trim,
+			    &args.num_bytes_to_trim, sizeof(u64));
+	if (ret2) {
+		pr_err("%s failed to copy bytes to trim", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+
+	return ret;
+}
+
+static int
+dxgk_evict(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_evict args;
+	struct d3dkmt_evict *input = inargs;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	if (args.alloc_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+	    args.alloc_count == 0) {
+		pr_err("invalid number of allocations");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_evict(process, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(&input->num_bytes_to_trim,
+			   &args.num_bytes_to_trim, sizeof(u64));
+	if (ret) {
+		pr_err("%s failed to copy bytes to trim to user", __func__);
+		ret = -EINVAL;
+	}
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
 static int
 dxgk_offer_allocations(struct dxgprocess *process, void *__user inargs)
 {
@@ -4906,6 +5046,8 @@ void init_ioctls(void)
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
 		  LX_DXQUERYVIDEOMEMORYINFO);
+	SET_IOCTL(/*0xb */ dxgk_make_resident,
+		  LX_DXMAKERESIDENT);
 	SET_IOCTL(/*0xd */ dxgk_escape,
 		  LX_DXESCAPE);
 	SET_IOCTL(/*0xe */ dxgk_get_device_state,
@@ -4936,6 +5078,8 @@ void init_ioctls(void)
 		  LX_DXDESTROYPAGINGQUEUE);
 	SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
 		  LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+	SET_IOCTL(/*0x1e */ dxgk_evict,
+		  LX_DXEVICT);
 	SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
 		  LX_DXFLUSHHEAPTRANSITIONS);
 	SET_IOCTL(/*0x21 */ dxgk_get_context_process_scheduling_priority,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 11e2f3c9c88c..95d6df5f01b5 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -958,6 +958,56 @@ struct d3dkmt_destroyallocation2 {
 	struct d3dddicb_destroyallocation2flags flags;
 };
 
+struct d3dddi_makeresident_flags {
+	union {
+		struct {
+			__u32		cant_trim_further:1;
+			__u32		must_succeed:1;
+			__u32		reserved:30;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dddi_makeresident {
+	struct d3dkmthandle		paging_queue;
+	__u32				alloc_count;
+#ifdef __KERNEL__
+	const struct d3dkmthandle	*allocation_list;
+	const __u32			*priority_list;
+#else
+	__u64				allocation_list;
+	__u64				priority_list;
+#endif
+	struct d3dddi_makeresident_flags flags;
+	__u64				paging_fence_value;
+	__u64				num_bytes_to_trim;
+};
+
+struct d3dddi_evict_flags {
+	union {
+		struct {
+			__u32		evict_only_if_necessary:1;
+			__u32		not_written_to:1;
+			__u32		reserved:30;
+		};
+		__u32			value;
+	};
+};
+
+struct d3dkmt_evict {
+	struct d3dkmthandle		device;
+	__u32				alloc_count;
+#ifdef __KERNEL__
+	const struct d3dkmthandle	*allocations;
+#else
+	__u64				allocations;
+#endif
+	struct d3dddi_evict_flags	flags;
+	__u32				reserved;
+	__u64				num_bytes_to_trim;
+};
+
 enum d3dkmt_memory_segment_group {
 	_D3DKMT_MEMORY_SEGMENT_GROUP_LOCAL	= 0,
 	_D3DKMT_MEMORY_SEGMENT_GROUP_NON_LOCAL	= 1
@@ -1403,6 +1453,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXQUERYVIDEOMEMORYINFO	\
 	_IOWR(0x47, 0x0a, struct d3dkmt_queryvideomemoryinfo)
+#define LX_DXMAKERESIDENT		\
+	_IOWR(0x47, 0x0b, struct d3dddi_makeresident)
 #define LX_DXESCAPE			\
 	_IOWR(0x47, 0x0d, struct d3dkmt_escape)
 #define LX_DXGETDEVICESTATE		\
@@ -1433,6 +1485,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x19, struct d3dkmt_destroydevice)
 #define LX_DXDESTROYSYNCHRONIZATIONOBJECT \
 	_IOWR(0x47, 0x1d, struct d3dkmt_destroysynchronizationobject)
+#define LX_DXEVICT			\
+	_IOWR(0x47, 0x1e, struct d3dkmt_evict)
 #define LX_DXFLUSHHEAPTRANSITIONS	\
 	_IOWR(0x47, 0x1f, struct d3dkmt_flushheaptransitions)
 #define LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 29/30] drivers: hv: dxgkrnl: Manage compute device virtual addresses
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (26 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 28/30] drivers: hv: dxgkrnl: Manage residency of allocations Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-01 19:46   ` [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host Iouri Tarassov
  2022-03-01 21:37   ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Wei Liu
  29 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement ioctls to manage compute device virtual addresses (VA):
  - LX_DXRESERVEGPUVIRTUALADDRESS,
  - LX_DXFREEGPUVIRTUALADDRESS,
  - LX_DXMAPGPUVIRTUALADDRESS,
  - LX_DXUPDATEGPUVIRTUALADDRESS.

Compute devices access memory by using virtual addressses.
Each process has a dedicated VA space. The video memory manager
on the host is responsible with updating device page tables
before submitting a DMA buffer for execution.

The LX_DXRESERVEGPUVIRTUALADDRESS ioctl reserves a portion of the
process compute device VA space.

The LX_DXMAPGPUVIRTUALADDRESS ioctl reserves a portion of the process
compute device VA space and maps it to the given compute device
allocation.

The LX_DXFREEGPUVIRTUALADDRESS frees the previously reserved portion
of the compute device VA space.

The LX_DXUPDATEGPUVIRTUALADDRESS ioctl adds operations to modify the
compute device VA space to a compute device execution context. It
allows the operations to be queued and synchronized with execution
of other compute device DMA buffers..

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/dxgkrnl.h  |  10 ++
 drivers/hv/dxgkrnl/dxgvmbus.c | 150 ++++++++++++++++++++++
 drivers/hv/dxgkrnl/dxgvmbus.h |  38 ++++++
 drivers/hv/dxgkrnl/ioctl.c    | 230 ++++++++++++++++++++++++++++++++++
 include/uapi/misc/d3dkmthk.h  | 126 +++++++++++++++++++
 5 files changed, 554 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index c841203a1683..de6333fdada6 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -797,6 +797,16 @@ int dxgvmb_send_evict(struct dxgprocess *pr, struct dxgadapter *adapter,
 int dxgvmb_send_submit_command(struct dxgprocess *pr,
 			       struct dxgadapter *adapter,
 			       struct d3dkmt_submitcommand *args);
+int dxgvmb_send_map_gpu_va(struct dxgprocess *pr, struct d3dkmthandle h,
+			   struct dxgadapter *adapter,
+			   struct d3dddi_mapgpuvirtualaddress *args);
+int dxgvmb_send_reserve_gpu_va(struct dxgprocess *pr,
+			       struct dxgadapter *adapter,
+			       struct d3dddi_reservegpuvirtualaddress *args);
+int dxgvmb_send_free_gpu_va(struct dxgprocess *pr, struct dxgadapter *adapter,
+			    struct d3dkmt_freegpuvirtualaddress *args);
+int dxgvmb_send_update_gpu_va(struct dxgprocess *pr, struct dxgadapter *adapter,
+			      struct d3dkmt_updategpuvirtualaddress *args);
 int dxgvmb_send_create_sync_object(struct dxgprocess *pr,
 				   struct dxgadapter *adapter,
 				   struct d3dkmt_createsynchronizationobject2
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 9f5b8edb186e..49c36d86c9bf 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2414,6 +2414,156 @@ int dxgvmb_send_submit_command(struct dxgprocess *process,
 	return ret;
 }
 
+int dxgvmb_send_map_gpu_va(struct dxgprocess *process,
+			   struct d3dkmthandle device,
+			   struct dxgadapter *adapter,
+			   struct d3dddi_mapgpuvirtualaddress *args)
+{
+	struct dxgkvmb_command_mapgpuvirtualaddress *command;
+	struct dxgkvmb_command_mapgpuvirtualaddress_return result;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_MAPGPUVIRTUALADDRESS,
+				   process->host_handle);
+	command->args = *args;
+	command->device = device;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+				   sizeof(result));
+	if (ret < 0)
+		goto cleanup;
+	args->virtual_address = result.virtual_address;
+	args->paging_fence_value = result.paging_fence_value;
+	ret = ntstatus2int(result.status);
+
+cleanup:
+
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_reserve_gpu_va(struct dxgprocess *process,
+			       struct dxgadapter *adapter,
+			       struct d3dddi_reservegpuvirtualaddress *args)
+{
+	struct dxgkvmb_command_reservegpuvirtualaddress *command;
+	struct dxgkvmb_command_reservegpuvirtualaddress_return result;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_RESERVEGPUVIRTUALADDRESS,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+				   sizeof(result));
+	args->virtual_address = result.virtual_address;
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_free_gpu_va(struct dxgprocess *process,
+			    struct dxgadapter *adapter,
+			    struct d3dkmt_freegpuvirtualaddress *args)
+{
+	struct dxgkvmb_command_freegpuvirtualaddress *command;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	ret = init_message(&msg, adapter, process, sizeof(*command));
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_FREEGPUVIRTUALADDRESS,
+				   process->host_handle);
+	command->args = *args;
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
+int dxgvmb_send_update_gpu_va(struct dxgprocess *process,
+			      struct dxgadapter *adapter,
+			      struct d3dkmt_updategpuvirtualaddress *args)
+{
+	struct dxgkvmb_command_updategpuvirtualaddress *command;
+	u32 cmd_size;
+	u32 op_size;
+	int ret;
+	struct dxgvmbusmsg msg = {.hdr = NULL};
+
+	if (args->num_operations == 0 ||
+	    (DXG_MAX_VM_BUS_PACKET_SIZE /
+	     sizeof(struct d3dddi_updategpuvirtualaddress_operation)) <
+	    args->num_operations) {
+		ret = -EINVAL;
+		pr_err("Invalid number of operations: %d",
+			   args->num_operations);
+		goto cleanup;
+	}
+
+	op_size = args->num_operations *
+	    sizeof(struct d3dddi_updategpuvirtualaddress_operation);
+	cmd_size = sizeof(struct dxgkvmb_command_updategpuvirtualaddress) +
+	    op_size - sizeof(args->operations[0]);
+
+	ret = init_message(&msg, adapter, process, cmd_size);
+	if (ret)
+		goto cleanup;
+	command = (void *)msg.msg;
+
+	command_vgpu_to_host_init2(&command->hdr,
+				   DXGK_VMBCOMMAND_UPDATEGPUVIRTUALADDRESS,
+				   process->host_handle);
+	command->fence_value = args->fence_value;
+	command->device = args->device;
+	command->context = args->context;
+	command->fence_object = args->fence_object;
+	command->num_operations = args->num_operations;
+	command->flags = args->flags.value;
+	ret = copy_from_user(command->operations, args->operations,
+			     op_size);
+	if (ret) {
+		pr_err("%s failed to copy operations", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+	free_message(&msg, process);
+	if (ret)
+		pr_debug("err: %s %d", __func__, ret);
+	return ret;
+}
+
 static void set_result(struct d3dkmt_createsynchronizationobject2 *args,
 		       u64 fence_gpu_va, u8 *va)
 {
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 59357bd5c7b9..3eda60da013d 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -418,6 +418,44 @@ struct dxgkvmb_command_flushheaptransitions {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 };
 
+struct dxgkvmb_command_freegpuvirtualaddress {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmt_freegpuvirtualaddress args;
+};
+
+struct dxgkvmb_command_mapgpuvirtualaddress {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dddi_mapgpuvirtualaddress args;
+	struct d3dkmthandle		device;
+};
+
+struct dxgkvmb_command_mapgpuvirtualaddress_return {
+	u64		virtual_address;
+	u64		paging_fence_value;
+	struct ntstatus	status;
+};
+
+struct dxgkvmb_command_reservegpuvirtualaddress {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dddi_reservegpuvirtualaddress args;
+};
+
+struct dxgkvmb_command_reservegpuvirtualaddress_return {
+	u64	virtual_address;
+	u64	paging_fence_value;
+};
+
+struct dxgkvmb_command_updategpuvirtualaddress {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	u64				fence_value;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		context;
+	struct d3dkmthandle		fence_object;
+	u32				num_operations;
+	u32				flags;
+	struct d3dddi_updategpuvirtualaddress_operation operations[1];
+};
+
 struct dxgkvmb_command_queryclockcalibration {
 	struct dxgkvmb_command_vgpu_to_host hdr;
 	struct d3dkmt_queryclockcalibration args;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index a90c1a897d55..b2fd3d55d72a 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -2535,6 +2535,228 @@ dxgk_submit_wait_to_hwqueue(struct dxgprocess *process, void *__user inargs)
 	return ret;
 }
 
+static int
+dxgk_map_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+	int ret, ret2;
+	struct d3dddi_mapgpuvirtualaddress args;
+	struct d3dddi_mapgpuvirtualaddress *input = inargs;
+	struct dxgdevice *device = NULL;
+	struct dxgadapter *adapter = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_object_handle(process,
+					HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+					args.paging_queue);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_map_gpu_va(process, zerohandle, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+	/* STATUS_PENING is a success code > 0. It is returned to user mode */
+	if (!(ret == STATUS_PENDING || ret == 0)) {
+		pr_err("%s Unexpected error %x", __func__, ret);
+		goto cleanup;
+	}
+
+	ret2 = copy_to_user(&input->paging_fence_value,
+			    &args.paging_fence_value, sizeof(u64));
+	if (ret2) {
+		pr_err("%s failed to copy paging fence to user", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret2 = copy_to_user(&input->virtual_address, &args.virtual_address,
+				sizeof(args.virtual_address));
+	if (ret2) {
+		pr_err("%s failed to copy va to user", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_reserve_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dddi_reservegpuvirtualaddress args;
+	struct d3dddi_reservegpuvirtualaddress *input = inargs;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+
+	pr_debug("ioctl: %s", __func__);
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		device = dxgprocess_device_by_object_handle(process,
+						HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+						args.adapter);
+		if (device == NULL) {
+			pr_err("invalid adapter or paging queue: 0x%x",
+				   args.adapter.v);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		adapter = device->adapter;
+		kref_get(&adapter->adapter_kref);
+		kref_put(&device->device_kref, dxgdevice_release);
+	} else {
+		args.adapter = adapter->host_handle;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_reserve_gpu_va(process, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(&input->virtual_address, &args.virtual_address,
+			   sizeof(args.virtual_address));
+	if (ret) {
+		pr_err("%s failed to copy VA to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (adapter) {
+		dxgadapter_release_lock_shared(adapter);
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	}
+
+	pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+	return ret;
+}
+
+static int
+dxgk_free_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_freegpuvirtualaddress args;
+	struct dxgadapter *adapter = NULL;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+	if (adapter == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	args.adapter = adapter->host_handle;
+	ret = dxgvmb_send_free_gpu_va(process, adapter, &args);
+
+cleanup:
+
+	if (adapter) {
+		dxgadapter_release_lock_shared(adapter);
+		kref_put(&adapter->adapter_kref, dxgadapter_release);
+	}
+
+	return ret;
+}
+
+static int
+dxgk_update_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+	int ret;
+	struct d3dkmt_updategpuvirtualaddress args;
+	struct d3dkmt_updategpuvirtualaddress *input = inargs;
+	struct dxgadapter *adapter = NULL;
+	struct dxgdevice *device = NULL;
+
+	ret = copy_from_user(&args, inargs, sizeof(args));
+	if (ret) {
+		pr_err("%s failed to copy input args", __func__);
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	device = dxgprocess_device_by_handle(process, args.device);
+	if (device == NULL) {
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	adapter = device->adapter;
+	ret = dxgadapter_acquire_lock_shared(adapter);
+	if (ret < 0) {
+		adapter = NULL;
+		goto cleanup;
+	}
+
+	ret = dxgvmb_send_update_gpu_va(process, adapter, &args);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = copy_to_user(&input->fence_value, &args.fence_value,
+			   sizeof(args.fence_value));
+	if (ret) {
+		pr_err("%s failed to copy fence value to user", __func__);
+		ret = -EINVAL;
+	}
+
+cleanup:
+
+	if (adapter)
+		dxgadapter_release_lock_shared(adapter);
+	if (device)
+		kref_put(&device->device_kref, dxgdevice_release);
+
+	return ret;
+}
+
 static int
 dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
 {
@@ -5042,12 +5264,16 @@ void init_ioctls(void)
 		  LX_DXCREATEALLOCATION);
 	SET_IOCTL(/*0x7 */ dxgk_create_paging_queue,
 		  LX_DXCREATEPAGINGQUEUE);
+	SET_IOCTL(/*0x8 */ dxgk_reserve_gpu_va,
+		  LX_DXRESERVEGPUVIRTUALADDRESS);
 	SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
 		  LX_DXQUERYADAPTERINFO);
 	SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
 		  LX_DXQUERYVIDEOMEMORYINFO);
 	SET_IOCTL(/*0xb */ dxgk_make_resident,
 		  LX_DXMAKERESIDENT);
+	SET_IOCTL(/*0xc */ dxgk_map_gpu_va,
+		  LX_DXMAPGPUVIRTUALADDRESS);
 	SET_IOCTL(/*0xd */ dxgk_escape,
 		  LX_DXESCAPE);
 	SET_IOCTL(/*0xe */ dxgk_get_device_state,
@@ -5082,6 +5308,8 @@ void init_ioctls(void)
 		  LX_DXEVICT);
 	SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
 		  LX_DXFLUSHHEAPTRANSITIONS);
+	SET_IOCTL(/*0x20 */ dxgk_free_gpu_va,
+		  LX_DXFREEGPUVIRTUALADDRESS);
 	SET_IOCTL(/*0x21 */ dxgk_get_context_process_scheduling_priority,
 		  LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY);
 	SET_IOCTL(/*0x22 */ dxgk_get_context_scheduling_priority,
@@ -5118,6 +5346,8 @@ void init_ioctls(void)
 		  LX_DXUNLOCK2);
 	SET_IOCTL(/*0x38 */ dxgk_update_alloc_property,
 		  LX_DXUPDATEALLOCPROPERTY);
+	SET_IOCTL(/*0x39 */ dxgk_update_gpu_va,
+		  LX_DXUPDATEGPUVIRTUALADDRESS);
 	SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
 		  LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
 	SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
diff --git a/include/uapi/misc/d3dkmthk.h b/include/uapi/misc/d3dkmthk.h
index 95d6df5f01b5..4c83b934816f 100644
--- a/include/uapi/misc/d3dkmthk.h
+++ b/include/uapi/misc/d3dkmthk.h
@@ -1008,6 +1008,124 @@ struct d3dkmt_evict {
 	__u64				num_bytes_to_trim;
 };
 
+struct d3dddigpuva_protection_type {
+	union {
+		struct {
+			__u64	write:1;
+			__u64	execute:1;
+			__u64	zero:1;
+			__u64	no_access:1;
+			__u64	system_use_only:1;
+			__u64	reserved:59;
+		};
+		__u64		value;
+	};
+};
+
+enum d3dddi_updategpuvirtualaddress_operation_type {
+	_D3DDDI_UPDATEGPUVIRTUALADDRESS_MAP		= 0,
+	_D3DDDI_UPDATEGPUVIRTUALADDRESS_UNMAP		= 1,
+	_D3DDDI_UPDATEGPUVIRTUALADDRESS_COPY		= 2,
+	_D3DDDI_UPDATEGPUVIRTUALADDRESS_MAP_PROTECT	= 3,
+};
+
+struct d3dddi_updategpuvirtualaddress_operation {
+	enum d3dddi_updategpuvirtualaddress_operation_type operation;
+	union {
+		struct {
+			__u64		base_address;
+			__u64		size;
+			struct d3dkmthandle allocation;
+			__u64		allocation_offset;
+			__u64		allocation_size;
+		} map;
+		struct {
+			__u64		base_address;
+			__u64		size;
+			struct d3dkmthandle allocation;
+			__u64		allocation_offset;
+			__u64		allocation_size;
+			struct d3dddigpuva_protection_type protection;
+			__u64		driver_protection;
+		} map_protect;
+		struct {
+			__u64	base_address;
+			__u64	size;
+			struct d3dddigpuva_protection_type protection;
+		} unmap;
+		struct {
+			__u64	source_address;
+			__u64	size;
+			__u64	dest_address;
+		} copy;
+	};
+};
+
+enum d3dddigpuva_reservation_type {
+	_D3DDDIGPUVA_RESERVE_NO_ACCESS		= 0,
+	_D3DDDIGPUVA_RESERVE_ZERO		= 1,
+	_D3DDDIGPUVA_RESERVE_NO_COMMIT		= 2
+};
+
+struct d3dkmt_updategpuvirtualaddress {
+	struct d3dkmthandle			device;
+	struct d3dkmthandle			context;
+	struct d3dkmthandle			fence_object;
+	__u32					num_operations;
+#ifdef __KERNEL__
+	struct d3dddi_updategpuvirtualaddress_operation *operations;
+#else
+	__u64					operations;
+#endif
+	__u32					reserved0;
+	__u32					reserved1;
+	__u64					reserved2;
+	__u64					fence_value;
+	union {
+		struct {
+			__u32			do_not_wait:1;
+			__u32			reserved:31;
+		};
+		__u32				value;
+	} flags;
+	__u32					reserved3;
+};
+
+struct d3dddi_mapgpuvirtualaddress {
+	struct d3dkmthandle			paging_queue;
+	__u64					base_address;
+	__u64					minimum_address;
+	__u64					maximum_address;
+	struct d3dkmthandle			allocation;
+	__u64					offset_in_pages;
+	__u64					size_in_pages;
+	struct d3dddigpuva_protection_type	protection;
+	__u64					driver_protection;
+	__u32					reserved0;
+	__u64					reserved1;
+	__u64					virtual_address;
+	__u64					paging_fence_value;
+};
+
+struct d3dddi_reservegpuvirtualaddress {
+	struct d3dkmthandle			adapter;
+	__u64					base_address;
+	__u64					minimum_address;
+	__u64					maximum_address;
+	__u64					size;
+	enum d3dddigpuva_reservation_type	reservation_type;
+	__u64					driver_protection;
+	__u64					virtual_address;
+	__u64					paging_fence_value;
+};
+
+struct d3dkmt_freegpuvirtualaddress {
+	struct d3dkmthandle	adapter;
+	__u32			reserved;
+	__u64			base_address;
+	__u64			size;
+};
+
 enum d3dkmt_memory_segment_group {
 	_D3DKMT_MEMORY_SEGMENT_GROUP_LOCAL	= 0,
 	_D3DKMT_MEMORY_SEGMENT_GROUP_NON_LOCAL	= 1
@@ -1449,12 +1567,16 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x06, struct d3dkmt_createallocation)
 #define LX_DXCREATEPAGINGQUEUE		\
 	_IOWR(0x47, 0x07, struct d3dkmt_createpagingqueue)
+#define LX_DXRESERVEGPUVIRTUALADDRESS	\
+	_IOWR(0x47, 0x08, struct d3dddi_reservegpuvirtualaddress)
 #define LX_DXQUERYADAPTERINFO		\
 	_IOWR(0x47, 0x09, struct d3dkmt_queryadapterinfo)
 #define LX_DXQUERYVIDEOMEMORYINFO	\
 	_IOWR(0x47, 0x0a, struct d3dkmt_queryvideomemoryinfo)
 #define LX_DXMAKERESIDENT		\
 	_IOWR(0x47, 0x0b, struct d3dddi_makeresident)
+#define LX_DXMAPGPUVIRTUALADDRESS	\
+	_IOWR(0x47, 0x0c, struct d3dddi_mapgpuvirtualaddress)
 #define LX_DXESCAPE			\
 	_IOWR(0x47, 0x0d, struct d3dkmt_escape)
 #define LX_DXGETDEVICESTATE		\
@@ -1489,6 +1611,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x1e, struct d3dkmt_evict)
 #define LX_DXFLUSHHEAPTRANSITIONS	\
 	_IOWR(0x47, 0x1f, struct d3dkmt_flushheaptransitions)
+#define LX_DXFREEGPUVIRTUALADDRESS	\
+	_IOWR(0x47, 0x20, struct d3dkmt_freegpuvirtualaddress)
 #define LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY \
 	_IOWR(0x47, 0x21, struct d3dkmt_getcontextinprocessschedulingpriority)
 #define LX_DXGETCONTEXTSCHEDULINGPRIORITY \
@@ -1525,6 +1649,8 @@ struct d3dkmt_shareobjectwithhost {
 	_IOWR(0x47, 0x37, struct d3dkmt_unlock2)
 #define LX_DXUPDATEALLOCPROPERTY	\
 	_IOWR(0x47, 0x38, struct d3dddi_updateallocproperty)
+#define LX_DXUPDATEGPUVIRTUALADDRESS	\
+	_IOWR(0x47, 0x39, struct d3dkmt_updategpuvirtualaddress)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU \
 	_IOWR(0x47, 0x3a, struct d3dkmt_waitforsynchronizationobjectfromcpu)
 #define LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (27 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 29/30] drivers: hv: dxgkrnl: Manage compute device virtual addresses Iouri Tarassov
@ 2022-03-01 19:46   ` Iouri Tarassov
  2022-03-02 14:29     ` Wei Liu
  2022-03-01 21:37   ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Wei Liu
  29 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 19:46 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-hyperv
  Cc: linux-kernel, spronovo, spronovo, gregkh

Implement support for mapping guest memory pages by the host.
This removes hyper-v limitations of using GPADL (guest physical
address list).

Dxgkrnl uses hyper-v GPADLs to share guest system memory with the
host. This method has limitations:
- a single GPADL can represent only ~32MB of memory
- there is a limit of how much memory the total size of GPADLs
  in a VM can represent.
To avoid these limitations the host implemented mapping guest memory
pages. Presence of this support is determined by reading PCI config
space. When the support is enabled, dxgkrnl does not use GPADLs and
instead uses the following code flow:
- memory pages of an existing system memory buffer are pinned
- PFNs of the pages are sent to the host via a VM bus message
- the host maps the PFNs to get access to the memory

Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
---
 drivers/hv/dxgkrnl/Makefile    |   2 +-
 drivers/hv/dxgkrnl/dxgkrnl.h   |   2 +
 drivers/hv/dxgkrnl/dxgmodule.c |  30 ++++++-
 drivers/hv/dxgkrnl/dxgvmbus.c  | 145 +++++++++++++++++++++++----------
 drivers/hv/dxgkrnl/dxgvmbus.h  |  10 +++
 drivers/hv/dxgkrnl/ioctl.c     |   7 +-
 drivers/hv/dxgkrnl/misc.c      |   6 ++
 drivers/hv/dxgkrnl/misc.h      |   1 +
 8 files changed, 153 insertions(+), 50 deletions(-)

diff --git a/drivers/hv/dxgkrnl/Makefile b/drivers/hv/dxgkrnl/Makefile
index 9d821e83448a..fc85a47a6ad5 100644
--- a/drivers/hv/dxgkrnl/Makefile
+++ b/drivers/hv/dxgkrnl/Makefile
@@ -2,4 +2,4 @@
 # Makefile for the hyper-v compute device driver (dxgkrnl).
 
 obj-$(CONFIG_DXGKRNL)	+= dxgkrnl.o
-dxgkrnl-y		:= dxgmodule.o hmgr.o misc.o dxgadapter.o ioctl.o dxgvmbus.o dxgprocess.o
+dxgkrnl-y := dxgmodule.o hmgr.o misc.o dxgadapter.o ioctl.o dxgvmbus.o dxgprocess.o
diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index de6333fdada6..2dfbdc4487e1 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -305,6 +305,7 @@ struct dxgglobal {
 	bool			pci_registered;
 	bool			global_channel_initialized;
 	bool			async_msg_enabled;
+	bool			map_guest_pages_enabled;
 };
 
 extern struct dxgglobal		*dxgglobal;
@@ -837,6 +838,7 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 				     struct
 				     d3dkmt_waitforsynchronizationobjectfromcpu
 				     *args,
+				     bool user_address,
 				     u64 cpu_event);
 int dxgvmb_send_lock2(struct dxgprocess *process,
 		      struct dxgadapter *adapter,
diff --git a/drivers/hv/dxgkrnl/dxgmodule.c b/drivers/hv/dxgkrnl/dxgmodule.c
index cfef880a512d..468c9b5dd43f 100644
--- a/drivers/hv/dxgkrnl/dxgmodule.c
+++ b/drivers/hv/dxgkrnl/dxgmodule.c
@@ -144,7 +144,7 @@ void dxgglobal_remove_host_event(struct dxghostevent *event)
 
 void signal_host_cpu_event(struct dxghostevent *eventhdr)
 {
-	struct  dxghosteventcpu *event = (struct  dxghosteventcpu *)eventhdr;
+	struct dxghosteventcpu *event = (struct dxghosteventcpu *)eventhdr;
 
 	if (event->remove_from_list ||
 		event->destroy_after_signal) {
@@ -430,6 +430,9 @@ const struct file_operations dxgk_fops = {
 /* Luid of the virtual GPU on the host (struct winluid) */
 #define DXGK_VMBUS_VGPU_LUID_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
 					sizeof(u32))
+/* The host caps (dxgk_vmbus_hostcaps) */
+#define DXGK_VMBUS_HOSTCAPS_OFFSET	(DXGK_VMBUS_VGPU_LUID_OFFSET + \
+					sizeof(struct winluid))
 /* The guest writes its capavilities to this adderss */
 #define DXGK_VMBUS_GUESTCAPS_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
 					sizeof(u32))
@@ -445,6 +448,23 @@ struct dxgk_vmbus_guestcaps {
 	};
 };
 
+/*
+ * The structure defines features, supported by the host.
+ *
+ * map_guest_memory
+ *   Host can map guest memory pages, so the guest can avoid using GPADLs
+ *   to represent existing system memory allocations.
+ */
+struct dxgk_vmbus_hostcaps {
+	union {
+		struct {
+			u32	map_guest_memory	: 1;
+			u32	reserved		: 31;
+		};
+		u32 host_caps;
+	};
+};
+
 /*
  * A helper function to read PCI config space.
  */
@@ -474,6 +494,7 @@ static int dxg_pci_probe_device(struct pci_dev *dev,
 	u32 vmbus_interface_ver = DXGK_VMBUS_INTERFACE_VERSION;
 	struct winluid vgpu_luid = {};
 	struct dxgk_vmbus_guestcaps guest_caps = {.wsl2 = 1};
+	struct dxgk_vmbus_hostcaps host_caps = {};
 
 	mutex_lock(&dxgglobal->device_mutex);
 
@@ -502,6 +523,13 @@ static int dxg_pci_probe_device(struct pci_dev *dev,
 		if (ret)
 			goto cleanup;
 
+		ret = pci_read_config_dword(dev, DXGK_VMBUS_HOSTCAPS_OFFSET,
+					&host_caps.host_caps);
+		if (ret == 0) {
+			if (host_caps.map_guest_memory)
+				dxgglobal->map_guest_pages_enabled = true;
+		}
+
 		if (dxgglobal->vmbus_ver > DXGK_VMBUS_INTERFACE_VERSION)
 			dxgglobal->vmbus_ver = DXGK_VMBUS_INTERFACE_VERSION;
 	}
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 49c36d86c9bf..da92bc53becb 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -1374,15 +1374,18 @@ int create_existing_sysmem(struct dxgdevice *device,
 	void *kmem = NULL;
 	int ret = 0;
 	struct dxgkvmb_command_setexistingsysmemstore *set_store_command;
+	struct dxgkvmb_command_setexistingsysmempages *set_pages_command;
 	u64 alloc_size = host_alloc->allocation_size;
 	u32 npages = alloc_size >> PAGE_SHIFT;
 	struct dxgvmbusmsg msg = {.hdr = NULL};
-
-	ret = init_message(&msg, device->adapter, device->process,
-			   sizeof(*set_store_command));
-	if (ret)
-		goto cleanup;
-	set_store_command = (void *)msg.msg;
+	const u32 max_pfns_in_message =
+		(DXG_MAX_VM_BUS_PACKET_SIZE - sizeof(*set_pages_command) -
+		PAGE_SIZE) / sizeof(__u64);
+	u32 alloc_offset_in_pages = 0;
+	struct page **page_in;
+	u64 *pfn;
+	u32 pages_to_send;
+	u32 i;
 
 	/*
 	 * Create a guest physical address list and set it as the allocation
@@ -1393,6 +1396,7 @@ int create_existing_sysmem(struct dxgdevice *device,
 	pr_debug("   Alloc size: %lld", alloc_size);
 
 	dxgalloc->cpu_address = (void *)sysmem;
+
 	dxgalloc->pages = vzalloc(npages * sizeof(void *));
 	if (dxgalloc->pages == NULL) {
 		pr_err("failed to allocate pages");
@@ -1410,31 +1414,80 @@ int create_existing_sysmem(struct dxgdevice *device,
 		ret = -ENOMEM;
 		goto cleanup;
 	}
-	kmem = vmap(dxgalloc->pages, npages, VM_MAP, PAGE_KERNEL);
-	if (kmem == NULL) {
-		pr_err("vmap failed");
-		ret = -ENOMEM;
-		goto cleanup;
-	}
-	ret1 = vmbus_establish_gpadl(dxgglobal_get_vmbus(), kmem,
-				     alloc_size, &dxgalloc->gpadl);
-	if (ret1) {
-		pr_err("establish_gpadl failed: %d", ret1);
-		ret = -ENOMEM;
-		goto cleanup;
-	}
-	pr_debug("New gpadl %d", dxgalloc->gpadl.gpadl_handle);
+	if (!dxgglobal->map_guest_pages_enabled) {
+		ret = init_message(&msg, device->adapter, device->process,
+				sizeof(*set_store_command));
+		if (ret)
+			goto cleanup;
+		set_store_command = (void *)msg.msg;
 
-	command_vgpu_to_host_init2(&set_store_command->hdr,
-				   DXGK_VMBCOMMAND_SETEXISTINGSYSMEMSTORE,
-				   device->process->host_handle);
-	set_store_command->device = device->handle;
-	set_store_command->device = device->handle;
-	set_store_command->allocation = host_alloc->allocation;
-	set_store_command->gpadl = dxgalloc->gpadl.gpadl_handle;
-	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
-	if (ret < 0)
-		pr_err("failed to set existing store: %x", ret);
+		kmem = vmap(dxgalloc->pages, npages, VM_MAP, PAGE_KERNEL);
+		if (kmem == NULL) {
+			pr_err("vmap failed");
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		ret1 = vmbus_establish_gpadl(dxgglobal_get_vmbus(), kmem,
+					alloc_size, &dxgalloc->gpadl);
+		if (ret1) {
+			pr_err("establish_gpadl failed: %d", ret1);
+			ret = -ENOMEM;
+			goto cleanup;
+		}
+		pr_debug("New gpadl %d",
+			dxgalloc->gpadl.gpadl_handle);
+
+		command_vgpu_to_host_init2(&set_store_command->hdr,
+					DXGK_VMBCOMMAND_SETEXISTINGSYSMEMSTORE,
+					device->process->host_handle);
+		set_store_command->device = device->handle;
+		set_store_command->allocation = host_alloc->allocation;
+		set_store_command->gpadl = dxgalloc->gpadl.gpadl_handle;
+		ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr,
+						    msg.size);
+		if (ret < 0)
+			pr_err("failed to set existing store: %x", ret);
+	} else {
+		/*
+		 * Send the list of the allocation PFNs to the host. The host
+		 * will map the pages for GPU access.
+		 */
+
+		ret = init_message(&msg, device->adapter, device->process,
+				sizeof(*set_pages_command) +
+				max_pfns_in_message * sizeof(u64));
+		if (ret)
+			goto cleanup;
+		set_pages_command = (void *)msg.msg;
+		command_vgpu_to_host_init2(&set_pages_command->hdr,
+					DXGK_VMBCOMMAND_SETEXISTINGSYSMEMPAGES,
+					device->process->host_handle);
+		set_pages_command->device = device->handle;
+		set_pages_command->allocation = host_alloc->allocation;
+
+		page_in = dxgalloc->pages;
+		while (alloc_offset_in_pages < npages) {
+			pfn = (u64 *)((char *)msg.msg +
+				sizeof(*set_pages_command));
+			pages_to_send = min(npages - alloc_offset_in_pages,
+					    max_pfns_in_message);
+			set_pages_command->num_pages = pages_to_send;
+			set_pages_command->alloc_offset_in_pages =
+				alloc_offset_in_pages;
+
+			for (i = 0; i < pages_to_send; i++)
+				*pfn++ = page_to_pfn(*page_in++);
+
+			ret = dxgvmb_send_sync_msg_ntstatus(msg.channel,
+							    msg.hdr,
+							    msg.size);
+			if (ret < 0) {
+				pr_err("failed to set existing pages: %x", ret);
+				break;
+			}
+			alloc_offset_in_pages += pages_to_send;
+		}
+	}
 
 cleanup:
 	if (kmem)
@@ -2748,6 +2801,7 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 				     struct
 				     d3dkmt_waitforsynchronizationobjectfromcpu
 				     *args,
+				     bool user_address,
 				     u64 cpu_event)
 {
 	int ret = -EINVAL;
@@ -2771,18 +2825,25 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
 	command->object_count = args->object_count;
 	command->guest_event_pointer = (u64) cpu_event;
 	current_pos = (u8 *) &command[1];
-	ret = copy_from_user(current_pos, args->objects, object_size);
-	if (ret) {
-		pr_err("%s failed to copy objects", __func__);
-		ret = -EINVAL;
-		goto cleanup;
-	}
-	current_pos += object_size;
-	ret = copy_from_user(current_pos, args->fence_values, fence_size);
-	if (ret) {
-		pr_err("%s failed to copy fences", __func__);
-		ret = -EINVAL;
-		goto cleanup;
+	if (user_address) {
+		ret = copy_from_user(current_pos, args->objects, object_size);
+		if (ret) {
+			pr_err("%s failed to copy objects", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		current_pos += object_size;
+		ret = copy_from_user(current_pos, args->fence_values,
+				     fence_size);
+		if (ret) {
+			pr_err("%s failed to copy fences", __func__);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	} else {
+		memcpy(current_pos, args->objects, object_size);
+		current_pos += object_size;
+		memcpy(current_pos, args->fence_values, fence_size);
 	}
 
 	ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.h b/drivers/hv/dxgkrnl/dxgvmbus.h
index 3eda60da013d..74f57ead10d7 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.h
+++ b/drivers/hv/dxgkrnl/dxgvmbus.h
@@ -234,6 +234,16 @@ struct dxgkvmb_command_setexistingsysmemstore {
 	u32				gpadl;
 };
 
+/* Returns ntstatus */
+struct dxgkvmb_command_setexistingsysmempages {
+	struct dxgkvmb_command_vgpu_to_host hdr;
+	struct d3dkmthandle		device;
+	struct d3dkmthandle		allocation;
+	u32				num_pages;
+	u32				alloc_offset_in_pages;
+	/* u64 pfn_array[num_pages] */
+};
+
 struct dxgkvmb_command_createprocess {
 	struct dxgkvmb_command_vm_to_host hdr;
 	void			*process;
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index b2fd3d55d72a..92e1031ba652 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -30,11 +30,6 @@ struct ioctl_desc {
 };
 static struct ioctl_desc ioctls[LX_IO_MAX + 1];
 
-static char *errorstr(int ret)
-{
-	return ret < 0 ? "err" : "";
-}
-
 static int dxgsyncobj_release(struct inode *inode, struct file *file)
 {
 	struct dxgsharedsyncobject *syncobj = file->private_data;
@@ -3536,7 +3531,7 @@ dxgk_wait_sync_object_cpu(struct dxgprocess *process, void *__user inargs)
 	}
 
 	ret = dxgvmb_send_wait_sync_object_cpu(process, adapter,
-					       &args, event_id);
+					       &args, true, event_id);
 	if (ret < 0)
 		goto cleanup;
 
diff --git a/drivers/hv/dxgkrnl/misc.c b/drivers/hv/dxgkrnl/misc.c
index cb1e0635bebc..39e52b165b27 100644
--- a/drivers/hv/dxgkrnl/misc.c
+++ b/drivers/hv/dxgkrnl/misc.c
@@ -35,3 +35,9 @@ u16 *wcsncpy(u16 *dest, const u16 *src, size_t n)
 	dest[i - 1] = 0;
 	return dest;
 }
+
+char *errorstr(int ret)
+{
+	return ret < 0 ? "err" : "";
+}
+
diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
index f56d21b48814..721d4f5be235 100644
--- a/drivers/hv/dxgkrnl/misc.h
+++ b/drivers/hv/dxgkrnl/misc.h
@@ -43,6 +43,7 @@ extern const struct d3dkmthandle zerohandle;
  */
 
 u16 *wcsncpy(u16 *dest, const u16 *src, size_t n);
+char *errorstr(int ret);
 
 enum dxglockstate {
 	DXGLOCK_SHARED,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
@ 2022-03-01 20:45     ` Greg KH
  2022-03-01 22:23       ` Wei Liu
                         ` (2 more replies)
  2022-03-01 22:06     ` Wei Liu
  1 sibling, 3 replies; 72+ messages in thread
From: Greg KH @ 2022-03-01 20:45 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> - Create skeleton and add basic functionality for the
> hyper-v compute device driver (dxgkrnl).
> 
> - Register for PCI and VM bus driver notifications and
> handle initialization of VM bus channels.
> 
> - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> 
> - Create a MAINTAINERS entry
> 
> A VM bus channel is a communication interface between the hyper-v guest
> and the host. The are two type of VM bus channels, used in the driver:
>   - the global channel
>   - per virtual compute device channel
> 
> A PCI device is created for each virtual compute device, projected
> by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> arrival of such devices. The PCI config space of the virtual compute
> device has luid of the corresponding virtual compute device VM
> bus channel. This is how the compute device adapter objects are
> linked to VM bus channels.
> 
> VM bus interface version is exchanged by reading/writing the PCI config
> space of the virtual compute device.
> 
> The IO space is used to handle CPU accessible compute device
> allocations. Hyper-v allocates IO space for the global VM bus channel.
> 
> Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>

Please work with internal developers to get reviews from them first,
before requiring the kernel community to point out basic issues.  It
will save you a lot of time and stop us from feeling like we are having
our time wasted.

Some simple examples below that your coworkers should have caught:

> --- /dev/null
> +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> @@ -0,0 +1,119 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Copyright (c) 2019, Microsoft Corporation.

It is now 2022 :)

> +void init_ioctls(void);

That is a horrible global function name you just added to the kernel's
namespace for a single driver :(

> +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> +
> +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> +{
> +	*luid = *(struct winluid *)&guid->b[0];

Why is the cast needed?  Shouldn't you use real types in your
structures?

> +/*
> + * The interface version is used to ensure that the host and the guest use the
> + * same VM bus protocol. It needs to be incremented every time the VM bus
> + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> + * incremented each time the earlier versions of the interface are no longer
> + * compatible with the current version.
> + */
> +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> +#define DXGK_VMBUS_INTERFACE_VERSION			40
> +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16

Where do these numbers come from, the hypervisor specification?

> +/*
> + * Pointer to the global device data. By design
> + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> + * is created.
> + */
> +struct dxgglobal *dxgglobal;

No, make this per-device, NEVER have a single device for your driver.
The Linux driver model makes it harder to do it this way than to do it
correctly.  Do it correctly please and have no global structures like
this.

> +#define DXGKRNL_VERSION			0x2216

What does this mean?

> +#define PCI_VENDOR_ID_MICROSOFT		0x1414
> +#define PCI_DEVICE_ID_VIRTUAL_RENDER	0x008E
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt)	"dxgk: " fmt

Use the dev_*() print functions, you are a driver, and you always have a
pointer to a struct device.  There's no need to ever call pr_*().

> +
> +//
> +// Interface from dxgglobal
> +//

You mix /* */ and // styles for large comment blocks like this.  Pick
one and use it consistently.

> +
> +struct vmbus_channel *dxgglobal_get_vmbus(void)
> +{
> +	return dxgglobal->channel.channel;
> +}

Again, no globals.

> +/*
> + * File operations for the /dev/dxg device
> + */
> +
> +static int dxgk_open(struct inode *n, struct file *f)
> +{
> +	return 0;
> +}
> +
> +static int dxgk_release(struct inode *n, struct file *f)
> +{
> +	return 0;
> +}

If you do not need open/release, do not provide them.

> +
> +static ssize_t dxgk_read(struct file *f, char __user *s, size_t len,
> +			 loff_t *o)
> +{
> +	pr_debug("file read\n");

ftrace is your friend, please remove.

> +	return 0;
> +}
> +
> +static ssize_t dxgk_write(struct file *f, const char __user *s, size_t len,
> +			  loff_t *o)
> +{
> +	pr_debug("file write\n");
> +	return len;
> +}
> +
> +const struct file_operations dxgk_fops = {
> +	.owner = THIS_MODULE,
> +	.open = dxgk_open,
> +	.release = dxgk_release,
> +	.write = dxgk_write,
> +	.read = dxgk_read,
> +};
> +
> +/*
> + * Interface with the PCI driver
> + */
> +
> +/*
> + * Part of the PCI config space of the vGPU device is used for vGPU
> + * configuration data. Reading/writing of the PCI config space is forwarded
> + * to the host.
> + */
> +
> +/* vGPU VM bus channel instance ID */
> +#define DXGK_VMBUS_CHANNEL_ID_OFFSET 192

No tab?  Why not?

> +/* DXGK_VMBUS_INTERFACE_VERSION (u32) */

What is this?

> +#define DXGK_VMBUS_VERSION_OFFSET	(DXGK_VMBUS_CHANNEL_ID_OFFSET + \
> +					sizeof(guid_t))

offsetof() is your friend, please use it.

> +/* Luid of the virtual GPU on the host (struct winluid) */

What is a "luid"?

> +#define DXGK_VMBUS_VGPU_LUID_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
> +					sizeof(u32))

Again, offsetof() please.

> +/* The guest writes its capavilities to this adderss */

Odd spelling :(

> +#define DXGK_VMBUS_GUESTCAPS_OFFSET	(DXGK_VMBUS_VERSION_OFFSET + \
> +					sizeof(u32))

offsetof().

> +static int dxg_probe_vmbus(struct hv_device *hdev,
> +			   const struct hv_vmbus_device_id *dev_id)
> +{
> +	int ret = 0;
> +	struct winluid luid;
> +	struct dxgvgpuchannel *vgpuch;
> +
> +	mutex_lock(&dxgglobal->device_mutex);

Lock the device's mutex, not the global.

> +
> +	if (uuid_le_cmp(hdev->dev_type, id_table[0].guid) == 0) {
> +		/* This is a new virtual GPU channel */
> +		guid_to_luid(&hdev->channel->offermsg.offer.if_instance, &luid);
> +		pr_debug("vGPU channel: %pUb",
> +			    &hdev->channel->offermsg.offer.if_instance);
> +		vgpuch = vzalloc(sizeof(struct dxgvgpuchannel));
> +		if (vgpuch == NULL) {
> +			ret = -ENOMEM;
> +			goto error;
> +		}
> +		vgpuch->adapter_luid = luid;
> +		vgpuch->hdev = hdev;
> +		list_add_tail(&vgpuch->vgpu_ch_list_entry,
> +			      &dxgglobal->vgpu_ch_list_head);
> +	} else if (uuid_le_cmp(hdev->dev_type, id_table[1].guid) == 0) {
> +		/* This is the global Dxgkgnl channel */
> +		pr_debug("Global channel: %pUb",
> +			    &hdev->channel->offermsg.offer.if_instance);
> +		if (dxgglobal->hdev) {
> +			/* This device should appear only once */
> +			pr_err("global channel already present\n");
> +			ret = -EBADE;
> +			goto error;
> +		}
> +		dxgglobal->hdev = hdev;
> +	} else {
> +		/* Unknown device type */
> +		pr_err("probe: unknown device type\n");

dev_err().

> +		ret = -EBADE;

-ENODEV


> +		goto error;

No need for this goto.

> +	}
> +
> +error:
> +
> +	mutex_unlock(&dxgglobal->device_mutex);
> +
> +	if (ret)
> +		pr_debug("err: %s %d", __func__, ret);

dev_dbg()

Also, pr_debug() and dev_dbg() already provide you with the function
name, you never need to add it again.

> +	return ret;
> +}
> +
> +static int dxg_remove_vmbus(struct hv_device *hdev)
> +{
> +	int ret = 0;
> +	struct dxgvgpuchannel *vgpu_channel;
> +
> +	mutex_lock(&dxgglobal->device_mutex);
> +
> +	if (uuid_le_cmp(hdev->dev_type, id_table[0].guid) == 0) {
> +		pr_debug("Remove virtual GPU channel\n");
> +		list_for_each_entry(vgpu_channel,
> +				    &dxgglobal->vgpu_ch_list_head,
> +				    vgpu_ch_list_entry) {
> +			if (vgpu_channel->hdev == hdev) {
> +				list_del(&vgpu_channel->vgpu_ch_list_entry);
> +				vfree(vgpu_channel);
> +				break;
> +			}
> +		}
> +	} else if (uuid_le_cmp(hdev->dev_type, id_table[1].guid) == 0) {
> +		pr_debug("Remove global channel device\n");
> +		dxgglobal_destroy_global_channel();
> +	} else {
> +		/* Unknown device type */
> +		pr_err("remove: unknown device type\n");
> +		ret = -EBADE;
> +	}
> +
> +	mutex_unlock(&dxgglobal->device_mutex);
> +	if (ret)
> +		pr_debug("err: %s %d", __func__, ret);
> +	return ret;
> +}
> +
> +MODULE_DEVICE_TABLE(vmbus, id_table);
> +
> +static struct hv_driver dxg_drv = {
> +	.name = KBUILD_MODNAME,
> +	.id_table = id_table,
> +	.probe = dxg_probe_vmbus,
> +	.remove = dxg_remove_vmbus,
> +	.driver = {
> +		   .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		    },

Odd indentation :(

> +};
> +
> +/*
> + * Interface with Linux kernel

What was the code above this???

> + */
> +
> +static int dxgglobal_create(void)
> +{
> +	int ret = 0;

No need for this variable.

> +
> +	dxgglobal = vzalloc(sizeof(struct dxgglobal));
> +	if (!dxgglobal)
> +		return -ENOMEM;
> +
> +	INIT_LIST_HEAD(&dxgglobal->plisthead);
> +	mutex_init(&dxgglobal->plistmutex);
> +	mutex_init(&dxgglobal->device_mutex);
> +
> +	INIT_LIST_HEAD(&dxgglobal->vgpu_ch_list_head);
> +
> +	init_rwsem(&dxgglobal->channel_lock);
> +
> +	pr_debug("dxgglobal_init end\n");
> +	return ret;
> +}
> +
> +static void dxgglobal_destroy(void)
> +{
> +	if (dxgglobal) {
> +		if (dxgglobal->vmbus_registered)
> +			vmbus_driver_unregister(&dxg_drv);
> +
> +		dxgglobal_destroy_global_channel();
> +
> +		if (dxgglobal->pci_registered)
> +			pci_unregister_driver(&dxg_pci_drv);
> +
> +		vfree(dxgglobal);
> +		dxgglobal = NULL;
> +	}
> +}
> +
> +/*
> + * Driver entry points

????  Entry into what?

> + */
> +
> +static int __init dxg_drv_init(void)
> +{
> +	int ret;
> +
> +

Too many blank lines.

> +	ret = dxgglobal_create();
> +	if (ret) {
> +		pr_err("dxgglobal_init failed");
> +		return -ENOMEM;
> +	}
> +
> +	ret = vmbus_driver_register(&dxg_drv);
> +	if (ret) {
> +		pr_err("vmbus_driver_register failed: %d", ret);
> +		return ret;
> +	}
> +	dxgglobal->vmbus_registered = true;
> +
> +	pr_info("%s  Version: %x", __func__, DXGKRNL_VERSION);

When drivers work, they are quiet.

Also, in-kernel modules do not have a version, they are part of the
kernel tree, and that's the "version" they have.  You can drop the
individual version here, it makes no sense anymore.

> +#ifndef _D3DKMTHK_H
> +#define _D3DKMTHK_H
> +
> +/* Matches Windows LUID definition */

That does not describe much :(

> +struct winluid {
> +	__u32 a;
> +	__u32 b;

No kernel doc documentation?  What is "a"?  "b"?

I'll stop here in the patch series.

Again, have your coworkers review this first and have them put their
names as part of the signed-off-by chain so that we know they agree with
the submission.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation
  2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
@ 2022-03-01 20:46     ` Greg KH
  2022-03-01 20:46     ` Greg KH
  2022-03-01 23:47     ` Wei Liu
  2 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-01 20:46 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Tue, Mar 01, 2022 at 11:45:52AM -0800, Iouri Tarassov wrote:
> --- a/drivers/hv/dxgkrnl/dxgkrnl.h
> +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> @@ -27,9 +27,11 @@
>  #include <linux/pci.h>
>  #include <linux/hyperv.h>
>  
> +struct dxgprocess;

No need to pre-define this.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation
  2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
  2022-03-01 20:46     ` Greg KH
@ 2022-03-01 20:46     ` Greg KH
  2022-03-01 23:47     ` Wei Liu
  2 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-01 20:46 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Tue, Mar 01, 2022 at 11:45:52AM -0800, Iouri Tarassov wrote:
> +#define LX_IO_MAX 0x45

What does that mean???


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
                     ` (28 preceding siblings ...)
  2022-03-01 19:46   ` [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host Iouri Tarassov
@ 2022-03-01 21:37   ` Wei Liu
  29 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-01 21:37 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:48AM -0800, Iouri Tarassov wrote:
> Add VM bus channel guids, which are used by hyper-v virtual

"VMBus"

> compute device driver.

"Hyper-V"

> 
> Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>

This patch is trivially correct so with the names fixed.

Acked-by: Wei Liu <wei.liu@kernel.org>

> ---
>  include/linux/hyperv.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index b823311eac79..9d21055b003d 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1414,6 +1414,22 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size);
>  	.guid = GUID_INIT(0xda0a7802, 0xe377, 0x4aac, 0x8e, 0x77, \
>  			  0x05, 0x58, 0xeb, 0x10, 0x73, 0xf8)
>  
> +/*
> + * GPU paravirtualization global DXGK channel
> + * {DDE9CBC0-5060-4436-9448-EA1254A5D177}
> + */
> +#define HV_GPUP_DXGK_GLOBAL_GUID \
> +	.guid = GUID_INIT(0xdde9cbc0, 0x5060, 0x4436, 0x94, 0x48, \
> +			  0xea, 0x12, 0x54, 0xa5, 0xd1, 0x77)
> +
> +/*
> + * GPU paravirtualization per virtual GPU DXGK channel
> + * {6E382D18-3336-4F4B-ACC4-2B7703D4DF4A}
> + */
> +#define HV_GPUP_DXGK_VGPU_GUID \
> +	.guid = GUID_INIT(0x6e382d18, 0x3336, 0x4f4b, 0xac, 0xc4, \
> +			  0x2b, 0x77, 0x3, 0xd4, 0xdf, 0x4a)
> +
>  /*
>   * Synthetic FC GUID
>   * {2f9bcc4a-0069-4af3-b76b-6fd0be528cda}
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
  2022-03-01 20:45     ` Greg KH
@ 2022-03-01 22:06     ` Wei Liu
  2022-03-03  2:01       ` Iouri Tarassov
  1 sibling, 1 reply; 72+ messages in thread
From: Wei Liu @ 2022-03-01 22:06 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

I will skip things that are pointed out by Greg.

On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> - Create skeleton and add basic functionality for the
> hyper-v compute device driver (dxgkrnl).
> 
> - Register for PCI and VM bus driver notifications and
> handle initialization of VM bus channels.
> 
> - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> 
> - Create a MAINTAINERS entry
> 
> A VM bus channel is a communication interface between the hyper-v guest
> and the host. The are two type of VM bus channels, used in the driver:
>   - the global channel
>   - per virtual compute device channel
> 

Same comment regarding the spelling of VMBus and Hyper-V. Please fix
other instances in code and comments.

> A PCI device is created for each virtual compute device, projected
> by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> arrival of such devices. The PCI config space of the virtual compute
> device has luid of the corresponding virtual compute device VM
> bus channel. This is how the compute device adapter objects are
> linked to VM bus channels.
> 
> VM bus interface version is exchanged by reading/writing the PCI config
> space of the virtual compute device.
> 
> The IO space is used to handle CPU accessible compute device
> allocations. Hyper-v allocates IO space for the global VM bus channel.
> 
> Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
> ---
[...]
> +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> +{
> +	*luid = *(struct winluid *)&guid->b[0];
> +}

This should be moved to the header where luid is defined -- presumably
this is useful for other things in the future too.

Also, please provide a comment on why this conversion is okay.

[...]
> + * A helper function to read PCI config space.
> + */
> +static int dxg_pci_read_dwords(struct pci_dev *dev, int offset, int size,
> +			       void *val)
> +{
> +	int off = offset;
> +	int ret;
> +	int i;
> +

I think you should check for alignment here? size has to be a round
number of sizeof(int) otherwise you risk reading cross boundary and
causes unintended side effect?

> +	for (i = 0; i < size / sizeof(int); i++) {
> +		ret = pci_read_config_dword(dev, off, &((int *)val)[i]);
> +		if (ret) {
> +			pr_err("Failed to read PCI config: %d", off);
> +			return ret;
> +		}
> +		off += sizeof(int);
> +	}
> +	return 0;
> +}
> +
> +static int dxg_pci_probe_device(struct pci_dev *dev,
> +				const struct pci_device_id *id)
> +{
> +	int ret;
> +	guid_t guid;
> +	u32 vmbus_interface_ver = DXGK_VMBUS_INTERFACE_VERSION;
> +	struct winluid vgpu_luid = {};
> +	struct dxgk_vmbus_guestcaps guest_caps = {.wsl2 = 1};
> +
> +	mutex_lock(&dxgglobal->device_mutex);
> +
> +	if (dxgglobal->vmbus_ver == 0)  {
> +		/* Report capabilities to the host */
> +
> +		ret = pci_write_config_dword(dev, DXGK_VMBUS_GUESTCAPS_OFFSET,
> +					guest_caps.guest_caps);
> +		if (ret)
> +			goto cleanup;
> +
> +		/* Negotiate the VM bus version */
> +
[...]
> +	pr_debug("Adapter channel: %pUb\n", &guid);
> +	pr_debug("Vmbus interface version: %d\n",
> +		dxgglobal->vmbus_ver);

No need to wrap the line here.

> +	pr_debug("Host vGPU luid: %x-%x\n",
> +		vgpu_luid.b, vgpu_luid.a);

Ditto.

> +
> +cleanup:
> +
> +	mutex_unlock(&dxgglobal->device_mutex);
> +
> +	if (ret)
> +		pr_debug("err: %s %d", __func__, ret);
> +	return ret;
> +}
> +
[...]
> +static void __exit dxg_drv_exit(void)
> +{
> +	dxgglobal_destroy();
> +}
> +
> +module_init(dxg_drv_init);
> +module_exit(dxg_drv_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Microsoft Dxgkrnl virtual GPU Driver");

Should be "virtual compute device driver" here? Please be consistent.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 20:45     ` Greg KH
@ 2022-03-01 22:23       ` Wei Liu
  2022-03-01 22:47         ` Iouri Tarassov
  2022-03-02  7:53         ` Greg KH
  2022-03-02 22:59       ` Iouri Tarassov
  2022-03-02 23:27       ` Iouri Tarassov
  2 siblings, 2 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-01 22:23 UTC (permalink / raw)
  To: Greg KH
  Cc: Iouri Tarassov, kys, haiyangz, sthemmin, wei.liu, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Tue, Mar 01, 2022 at 09:45:41PM +0100, Greg KH wrote:
> On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > - Create skeleton and add basic functionality for the
> > hyper-v compute device driver (dxgkrnl).
> > 
> > - Register for PCI and VM bus driver notifications and
> > handle initialization of VM bus channels.
> > 
> > - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> > 
> > - Create a MAINTAINERS entry
> > 
> > A VM bus channel is a communication interface between the hyper-v guest
> > and the host. The are two type of VM bus channels, used in the driver:
> >   - the global channel
> >   - per virtual compute device channel
> > 
> > A PCI device is created for each virtual compute device, projected
> > by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> > id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> > arrival of such devices. The PCI config space of the virtual compute
> > device has luid of the corresponding virtual compute device VM
> > bus channel. This is how the compute device adapter objects are
> > linked to VM bus channels.
> > 
> > VM bus interface version is exchanged by reading/writing the PCI config
> > space of the virtual compute device.
> > 
> > The IO space is used to handle CPU accessible compute device
> > allocations. Hyper-v allocates IO space for the global VM bus channel.
> > 
> > Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
> 
> Please work with internal developers to get reviews from them first,
> before requiring the kernel community to point out basic issues.  It
> will save you a lot of time and stop us from feeling like we are having
> our time wasted.
> 
> Some simple examples below that your coworkers should have caught:
> 
> > --- /dev/null
> > +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> > @@ -0,0 +1,119 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +/*
> > + * Copyright (c) 2019, Microsoft Corporation.
> 
> It is now 2022 :)
> 
> > +void init_ioctls(void);
> 
> That is a horrible global function name you just added to the kernel's
> namespace for a single driver :(
> 
> > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > +
> > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > +{
> > +	*luid = *(struct winluid *)&guid->b[0];
> 
> Why is the cast needed?  Shouldn't you use real types in your
> structures?
> 
> > +/*
> > + * The interface version is used to ensure that the host and the guest use the
> > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > + * incremented each time the earlier versions of the interface are no longer
> > + * compatible with the current version.
> > + */
> > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
> 
> Where do these numbers come from, the hypervisor specification?
> 
> > +/*
> > + * Pointer to the global device data. By design
> > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > + * is created.
> > + */
> > +struct dxgglobal *dxgglobal;
> 
> No, make this per-device, NEVER have a single device for your driver.
> The Linux driver model makes it harder to do it this way than to do it
> correctly.  Do it correctly please and have no global structures like
> this.
> 

This may not be as big an issue as you thought. The device discovery is
still done via the normal VMBus probing routine. For all intents and
purposes the dxgglobal structure can be broken down into per device
fields and a global structure which contains the protocol versioning
information -- my understanding is there will always be a global
structure to hold information related to the backend, regardless of how
many devices there are.

I definitely think splitting is doable, but I also understand why Iouri
does not want to do it _now_ given there is no such a model for multiple
devices yet, so anything we put into the per-device structure could be
incomplete and it requires further changing when such a model arrives
later.

Iouri, please correct me if I have the wrong mental model here.

All in all, I hope this is not going to be a deal breaker for the
acceptance of this driver.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 22:23       ` Wei Liu
@ 2022-03-01 22:47         ` Iouri Tarassov
  2022-03-02  7:54           ` Greg KH
  2022-03-02  7:53         ` Greg KH
  1 sibling, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-01 22:47 UTC (permalink / raw)
  To: Wei Liu, Greg KH
  Cc: kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo, spronovo

On 3/1/2022 2:23 PM, Wei Liu wrote:
> On Tue, Mar 01, 2022 at 09:45:41PM +0100, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > > - Create skeleton and add basic functionality for the
> > > hyper-v compute device driver (dxgkrnl).
> > > 
> > > - Register for PCI and VM bus driver notifications and
> > > handle initialization of VM bus channels.
> > > 
> > > - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> > > 
> > > - Create a MAINTAINERS entry
> > > 
> > > A VM bus channel is a communication interface between the hyper-v guest
> > > and the host. The are two type of VM bus channels, used in the driver:
> > >   - the global channel
> > >   - per virtual compute device channel
> > > 
> > > A PCI device is created for each virtual compute device, projected
> > > by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> > > id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> > > arrival of such devices. The PCI config space of the virtual compute
> > > device has luid of the corresponding virtual compute device VM
> > > bus channel. This is how the compute device adapter objects are
> > > linked to VM bus channels.
> > > 
> > > VM bus interface version is exchanged by reading/writing the PCI config
> > > space of the virtual compute device.
> > > 
> > > The IO space is used to handle CPU accessible compute device
> > > allocations. Hyper-v allocates IO space for the global VM bus channel.
> > > 
> > > Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
> > 
> > Please work with internal developers to get reviews from them first,
> > before requiring the kernel community to point out basic issues.  It
> > will save you a lot of time and stop us from feeling like we are having
> > our time wasted.
> > 
> > Some simple examples below that your coworkers should have caught:
> > 
> > > --- /dev/null
> > > +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> > > @@ -0,0 +1,119 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +
> > > +/*
> > > + * Copyright (c) 2019, Microsoft Corporation.
> > 
> > It is now 2022 :)
> > 
> > > +void init_ioctls(void);
> > 
> > That is a horrible global function name you just added to the kernel's
> > namespace for a single driver :(
> > 
> > > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > > +
> > > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > > +{
> > > +	*luid = *(struct winluid *)&guid->b[0];
> > 
> > Why is the cast needed?  Shouldn't you use real types in your
> > structures?
> > 
> > > +/*
> > > + * The interface version is used to ensure that the host and the guest use the
> > > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > > + * incremented each time the earlier versions of the interface are no longer
> > > + * compatible with the current version.
> > > + */
> > > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
> > 
> > Where do these numbers come from, the hypervisor specification?
> > 
> > > +/*
> > > + * Pointer to the global device data. By design
> > > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > > + * is created.
> > > + */
> > > +struct dxgglobal *dxgglobal;
> > 
> > No, make this per-device, NEVER have a single device for your driver.
> > The Linux driver model makes it harder to do it this way than to do it
> > correctly.  Do it correctly please and have no global structures like
> > this.
> > 
>
> This may not be as big an issue as you thought. The device discovery is
> still done via the normal VMBus probing routine. For all intents and
> purposes the dxgglobal structure can be broken down into per device
> fields and a global structure which contains the protocol versioning
> information -- my understanding is there will always be a global
> structure to hold information related to the backend, regardless of how
> many devices there are.
>
> I definitely think splitting is doable, but I also understand why Iouri
> does not want to do it _now_ given there is no such a model for multiple
> devices yet, so anything we put into the per-device structure could be
> incomplete and it requires further changing when such a model arrives
> later.
>
> Iouri, please correct me if I have the wrong mental model here.
>
> All in all, I hope this is not going to be a deal breaker for the
> acceptance of this driver.
>
> Thanks,
> Wei.

I agree with Wei that there always be global driver data.

The driver reflects what the host offers and also it must provide the same
interface to user mode as the host driver does. This is because we want the
user mode clients to use the same device interface as if they are working on
the host directly.

By design a single global VMBus channel is offered by the host and a single
/dev/dxg device is created. The /dev/dxg device provides interface to enumerate
virtual compute devices via an ioctl.

If we are to change this model, we would need to make changes to user mode
clients, which is a big re-design change, affecting many hardware vendors.

Thanks
Iouri
  


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels.
  2022-03-01 19:45   ` [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels Iouri Tarassov
@ 2022-03-01 22:57     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-01 22:57 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

The subject line is too long. Also, no period at the end please.

You can write it as:

drivers: hv: dxgkrnl: Add VMBus message support and initialize channels

On Tue, Mar 01, 2022 at 11:45:50AM -0800, Iouri Tarassov wrote:
[...]
> +
> +struct dxgvmbusmsgres {
> +/* Points to the allocated buffer */
> +	struct dxgvmb_ext_header	*hdr;
> +/* Points to dxgkvmb_command_vm_to_host or dxgkvmb_command_vgpu_to_host */
> +	void				*msg;
> +/* The vm bus channel, used to pass the message to the host */
> +	struct dxgvmbuschannel		*channel;
> +/* Message size in bytes including the header, the payload and the result */
> +	u32				size;
> +/* Result buffer size in bytes */
> +	u32				res_size;
> +/* Points to the result within the allocated buffer */
> +	void				*res;

Please align the comments with their fields.

> +};
> +
[...]
> +static void process_inband_packet(struct dxgvmbuschannel *channel,
> +				  struct vmpacket_descriptor *desc)
> +{
> +	u32 packet_length = hv_pkt_datalen(desc);
> +	struct dxgkvmb_command_host_to_vm *packet;
> +
> +	if (packet_length < sizeof(struct dxgkvmb_command_host_to_vm)) {
> +		pr_err("Invalid global packet");
> +	} else {
> +		packet = hv_pkt_data(desc);
> +		pr_debug("global packet %d",
> +				packet->command_type);

Unnecessary line wrap.

> +		switch (packet->command_type) {
> +		case DXGK_VMBCOMMAND_SIGNALGUESTEVENT:
> +		case DXGK_VMBCOMMAND_SIGNALGUESTEVENTPASSIVE:
> +			break;
> +		case DXGK_VMBCOMMAND_SENDWNFNOTIFICATION:
> +			break;
> +		default:
> +			pr_err("unexpected host message %d",
> +					packet->command_type);
> +		}
> +	}
> +}
> +
[...]
> +
> +/* Receive callback for messages from the host */
> +void dxgvmbuschannel_receive(void *ctx)
> +{
> +	struct dxgvmbuschannel *channel = ctx;
> +	struct vmpacket_descriptor *desc;
> +	u32 packet_length = 0;
> +
> +	foreach_vmbus_pkt(desc, channel->channel) {
> +		packet_length = hv_pkt_datalen(desc);
> +		pr_debug("next packet (id, size, type): %llu %d %d",
> +			desc->trans_id, packet_length, desc->type);
> +		if (desc->type == VM_PKT_COMP) {
> +			process_completion_packet(channel, desc);
> +		} else {
> +			if (desc->type != VM_PKT_DATA_INBAND)
> +				pr_err("unexpected packet type");

This can potentially flood the guest if the backend is misbehaving.
We've seen flooding before so would definitely not want more of it.
Please consider using the ratelimit version pr_err.

The same comment goes for all other pr calls in repeatedly called paths.
I can see the value of having precise output from the pr_debug a few
lines above though.

> +			else
> +				process_inband_packet(channel, desc);
> +		}
> +	}
> +}
> +
[...]
> +int dxgvmb_send_set_iospace_region(u64 start, u64 len,
> +	struct vmbus_gpadl *shared_mem_gpadl)
> +{
> +	int ret;
> +	struct dxgkvmb_command_setiospaceregion *command;
> +	struct dxgvmbusmsg msg;
> +
> +	ret = init_message(&msg, NULL, sizeof(*command));
> +	if (ret)
> +		return ret;
> +	command = (void *)msg.msg;
> +
> +	ret = dxgglobal_acquire_channel_lock();
> +	if (ret < 0)
> +		goto cleanup;
> +
> +	command_vm_to_host_init1(&command->hdr,
> +				 DXGK_VMBCOMMAND_SETIOSPACEREGION);
> +	command->start = start;
> +	command->length = len;
> +	if (command->shared_page_gpadl)
> +		command->shared_page_gpadl = shared_mem_gpadl->gpadl_handle;

shared_mem_gpadl should be checked to be non-null. There is at least one
call site passes 0 to it.

> +	ret = dxgvmb_send_sync_msg_ntstatus(&dxgglobal->channel, msg.hdr,
> +					    msg.size);
> +	if (ret < 0)
> +		pr_err("send_set_iospace_region failed %x", ret);
> +
> +	dxgglobal_release_channel_lock();
> +cleanup:
> +	free_message(&msg, NULL);
> +	if (ret)
> +		pr_debug("err: %s %d", __func__, ret);
> +	return ret;
> +}
> +
[...]
> +
> +
> +#define NT_SUCCESS(status)				(status.v >= 0)
> +
> +#ifndef DEBUG
> +
> +#define DXGKRNL_ASSERT(exp)
> +
> +#else
> +
> +#define DXGKRNL_ASSERT(exp)	\
> +do {				\
> +	if (!(exp)) {		\
> +		dump_stack();	\
> +		BUG_ON(true);	\
> +	}			\
> +} while (0)


You can just use BUG_ON(exp), right? BUG_ON calls panic, which already
dumps the stack when CONFIG_DEBUG_VERBOSE is set.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object
  2022-03-01 19:45   ` [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object Iouri Tarassov
@ 2022-03-01 23:25     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-01 23:25 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:51AM -0800, Iouri Tarassov wrote:
[...]
> +int dxgadapter_set_vmbus(struct dxgadapter *adapter, struct hv_device *hdev)
> +{
> +	int ret;
> +
> +	guid_to_luid(&hdev->channel->offermsg.offer.if_instance,
> +		     &adapter->luid);
> +	pr_debug("%s: %x:%x %p %pUb\n",
> +		    __func__, adapter->luid.b, adapter->luid.a, hdev->channel,
> +		    &hdev->channel->offermsg.offer.if_instance);
> +
> +	ret = dxgvmbuschannel_init(&adapter->channel, hdev);
> +	if (ret)
> +		goto cleanup;
> +
> +	adapter->channel.adapter = adapter;
> +	adapter->hv_dev = hdev;
> +
> +	ret = dxgvmb_send_open_adapter(adapter);
> +	if (ret < 0) {

if (ret) is simpler?

Please be consistent regarding how you check for errors.

> +		pr_err("dxgvmb_send_open_adapter failed: %d\n", ret);
> +		goto cleanup;
> +	}
> +
> +	ret = dxgvmb_send_get_internal_adapter_info(adapter);
> +	if (ret < 0)
> +		pr_err("get_internal_adapter_info failed: %d", ret);
> +
> +cleanup:
> +	if (ret)
> +		pr_debug("err: %s %d", __func__, ret);
> +	return ret;
> +}
> +
> +void dxgadapter_start(struct dxgadapter *adapter)
> +{
> +	struct dxgvgpuchannel *ch = NULL;
> +	struct dxgvgpuchannel *entry;
> +	int ret;
> +
> +	pr_debug("%s %x-%x",
> +		__func__, adapter->luid.a, adapter->luid.b);
> +
> +	/* Find the corresponding vGPU vm bus channel */
> +	list_for_each_entry(entry, &dxgglobal->vgpu_ch_list_head,
> +			    vgpu_ch_list_entry) {

The mutex is not acquired in this function.

I have not checked if this function is used elsewhere. But this is a
non-static function, it should have at least a comment on the locking
requirement?

If it is not needed elsewhere, please make it static or named it
dxgadapter_start_locked.


[...]
> +}
> +
> +void dxgadapter_stop(struct dxgadapter *adapter)

Same comment applies to this function as well.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation
  2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
  2022-03-01 20:46     ` Greg KH
  2022-03-01 20:46     ` Greg KH
@ 2022-03-01 23:47     ` Wei Liu
  2 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-01 23:47 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:52AM -0800, Iouri Tarassov wrote:
> - Implement opening of the device (/dev/dxg) file object and creation of
> dxgprocess objects.
[...]
>  static int dxgk_open(struct inode *n, struct file *f)
>  {
> -	return 0;
> +	int ret = 0;
> +	struct dxgprocess *process;
> +
> +	pr_debug("%s %p %d %d",
> +		     __func__, f, current->pid, current->tgid);
> +
> +
> +	/* Find/create a dxgprocess structure for this process */
> +	process = dxgglobal_get_current_process();
> +
> +	if (process) {
> +		f->private_data = process;
> +	} else {
> +		pr_debug("cannot create dxgprocess for open\n");
> +		ret = -EBADF;
> +	}
> +
> +	pr_debug("%s end %x", __func__, ret);

I would normally remove pr_deubg's like this when submitting. It doesn't
provide much information.

> +	return ret;
>  }
>  
[...]
>  
> +int dxgvmb_send_create_process(struct dxgprocess *process)
> +{
> +	int ret;
> +	struct dxgkvmb_command_createprocess *command;
> +	struct dxgkvmb_command_createprocess_return result = { 0 };
> +	struct dxgvmbusmsg msg;
> +	char s[WIN_MAX_PATH];
> +	int i;
> +
> +	ret = init_message(&msg, NULL, process, sizeof(*command));
> +	if (ret)
> +		return ret;
> +	command = (void *)msg.msg;
> +
> +	ret = dxgglobal_acquire_channel_lock();
> +	if (ret < 0)
> +		goto cleanup;
> +
> +	command_vm_to_host_init1(&command->hdr, DXGK_VMBCOMMAND_CREATEPROCESS);
> +	command->process = process;
> +	command->process_id = process->process->pid;
> +	command->linux_process = 1;
> +	s[0] = 0;
> +	__get_task_comm(s, WIN_MAX_PATH, process->process);
> +	for (i = 0; i < WIN_MAX_PATH; i++) {
> +		command->process_name[i] = s[i];
> +		if (s[i] == 0)
> +			break;
> +	}

What's wrong with doing

  __get_task_comm(command->process_name, WIN_MAX_PATH, process->process);

here?

That saves you many bytes on stack.

[...]
> +static char *errorstr(int ret)
> +{
> +	return ret < 0 ? "err" : "";
> +}
> +

This is not used in this patch.

[...]
> diff --git a/drivers/hv/dxgkrnl/misc.h b/drivers/hv/dxgkrnl/misc.h
> index 1ff0c0e28332..433b59d3eb23 100644
> --- a/drivers/hv/dxgkrnl/misc.h
> +++ b/drivers/hv/dxgkrnl/misc.h
> @@ -31,7 +31,6 @@ extern const struct d3dkmthandle zerohandle;
>   * table_lock
>   * core_lock
>   * device_lock
> - * process->process_mutex

Why is this deleted?

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 22:23       ` Wei Liu
  2022-03-01 22:47         ` Iouri Tarassov
@ 2022-03-02  7:53         ` Greg KH
  2022-03-02 11:53           ` Wei Liu
  2022-03-03  1:09           ` Iouri Tarassov
  1 sibling, 2 replies; 72+ messages in thread
From: Greg KH @ 2022-03-02  7:53 UTC (permalink / raw)
  To: Wei Liu
  Cc: Iouri Tarassov, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > +struct dxgglobal *dxgglobal;
> > 
> > No, make this per-device, NEVER have a single device for your driver.
> > The Linux driver model makes it harder to do it this way than to do it
> > correctly.  Do it correctly please and have no global structures like
> > this.
> > 
> 
> This may not be as big an issue as you thought. The device discovery is
> still done via the normal VMBus probing routine. For all intents and
> purposes the dxgglobal structure can be broken down into per device
> fields and a global structure which contains the protocol versioning
> information -- my understanding is there will always be a global
> structure to hold information related to the backend, regardless of how
> many devices there are.

Then that is wrong and needs to be fixed.  Drivers should almost never
have any global data, that is not how Linux drivers work.  What happens
when you get a second device in your system for this?  Major rework
would have to happen and the code will break.  Handle that all now as it
takes less work to make this per-device than it does to have a global
variable.

> I definitely think splitting is doable, but I also understand why Iouri
> does not want to do it _now_ given there is no such a model for multiple
> devices yet, so anything we put into the per-device structure could be
> incomplete and it requires further changing when such a model arrives
> later.
> 
> Iouri, please correct me if I have the wrong mental model here.
> 
> All in all, I hope this is not going to be a deal breaker for the
> acceptance of this driver.

For my reviews, yes it will be.

Again, it should be easier to keep things in a per-device state than
not as the proper lifetime rules and the like are automatically handled
for you.  If you have global data, you have to manage that all on your
own and it is _MUCH_ harder to review that you got it correct.

Please fix, it will make your life easier as well.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 22:47         ` Iouri Tarassov
@ 2022-03-02  7:54           ` Greg KH
  0 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-02  7:54 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Tue, Mar 01, 2022 at 02:47:28PM -0800, Iouri Tarassov wrote:
> On 3/1/2022 2:23 PM, Wei Liu wrote:
> > On Tue, Mar 01, 2022 at 09:45:41PM +0100, Greg KH wrote:
> > > On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > > > - Create skeleton and add basic functionality for the
> > > > hyper-v compute device driver (dxgkrnl).
> > > > > > - Register for PCI and VM bus driver notifications and
> > > > handle initialization of VM bus channels.
> > > > > > - Connect the dxgkrnl module to the drivers/hv/ Makefile and
> > Kconfig
> > > > > > - Create a MAINTAINERS entry
> > > > > > A VM bus channel is a communication interface between the
> > hyper-v guest
> > > > and the host. The are two type of VM bus channels, used in the driver:
> > > >   - the global channel
> > > >   - per virtual compute device channel
> > > > > > A PCI device is created for each virtual compute device,
> > projected
> > > > by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> > > > id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> > > > arrival of such devices. The PCI config space of the virtual compute
> > > > device has luid of the corresponding virtual compute device VM
> > > > bus channel. This is how the compute device adapter objects are
> > > > linked to VM bus channels.
> > > > > > VM bus interface version is exchanged by reading/writing the PCI
> > config
> > > > space of the virtual compute device.
> > > > > > The IO space is used to handle CPU accessible compute device
> > > > allocations. Hyper-v allocates IO space for the global VM bus channel.
> > > > > > Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
> > > > Please work with internal developers to get reviews from them first,
> > > before requiring the kernel community to point out basic issues.  It
> > > will save you a lot of time and stop us from feeling like we are having
> > > our time wasted.
> > > > Some simple examples below that your coworkers should have caught:
> > > > > --- /dev/null
> > > > +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> > > > @@ -0,0 +1,119 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +
> > > > +/*
> > > > + * Copyright (c) 2019, Microsoft Corporation.
> > > > It is now 2022 :)
> > > > > +void init_ioctls(void);
> > > > That is a horrible global function name you just added to the
> > kernel's
> > > namespace for a single driver :(
> > > > > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1,
> > unsigned long p2);
> > > > +
> > > > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > > > +{
> > > > +	*luid = *(struct winluid *)&guid->b[0];
> > > > Why is the cast needed?  Shouldn't you use real types in your
> > > structures?
> > > > > +/*
> > > > + * The interface version is used to ensure that the host and the guest use the
> > > > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > > > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > > > + * incremented each time the earlier versions of the interface are no longer
> > > > + * compatible with the current version.
> > > > + */
> > > > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > > > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > > > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
> > > > Where do these numbers come from, the hypervisor specification?
> > > > > +/*
> > > > + * Pointer to the global device data. By design
> > > > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > > > + * is created.
> > > > + */
> > > > +struct dxgglobal *dxgglobal;
> > > > No, make this per-device, NEVER have a single device for your
> > driver.
> > > The Linux driver model makes it harder to do it this way than to do it
> > > correctly.  Do it correctly please and have no global structures like
> > > this.
> > >
> > 
> > This may not be as big an issue as you thought. The device discovery is
> > still done via the normal VMBus probing routine. For all intents and
> > purposes the dxgglobal structure can be broken down into per device
> > fields and a global structure which contains the protocol versioning
> > information -- my understanding is there will always be a global
> > structure to hold information related to the backend, regardless of how
> > many devices there are.
> > 
> > I definitely think splitting is doable, but I also understand why Iouri
> > does not want to do it _now_ given there is no such a model for multiple
> > devices yet, so anything we put into the per-device structure could be
> > incomplete and it requires further changing when such a model arrives
> > later.
> > 
> > Iouri, please correct me if I have the wrong mental model here.
> > 
> > All in all, I hope this is not going to be a deal breaker for the
> > acceptance of this driver.
> > 
> > Thanks,
> > Wei.
> 
> I agree with Wei that there always be global driver data.

Why?

> The driver reflects what the host offers and also it must provide the same
> interface to user mode as the host driver does. This is because we want the
> user mode clients to use the same device interface as if they are working on
> the host directly.

That's fine, put that state in the device interface.

> By design a single global VMBus channel is offered by the host and a single
> /dev/dxg device is created. The /dev/dxg device provides interface to enumerate
> virtual compute devices via an ioctl.

You have device data for each vmbus channel given to the driver, use
that.

> If we are to change this model, we would need to make changes to user mode
> clients, which is a big re-design change, affecting many hardware vendors.

This should not be visible to userspace at all.  If so, you are really
doing something odd.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 00/30] Driver for Hyper-v virtual compute device
  2022-03-01 19:45 [PATCH v3 00/30] Driver for Hyper-v virtual compute device Iouri Tarassov
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
@ 2022-03-02  8:51 ` Christoph Hellwig
  2022-03-02 14:53 ` Wei Liu
  2 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2022-03-02  8:51 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

Hi Iouri,

this still does not address what native API this matches.

As lonk as this is a just a shim to call functionality not natively
availabe on Linux: NAK.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02  7:53         ` Greg KH
@ 2022-03-02 11:53           ` Wei Liu
  2022-03-02 18:49             ` Iouri Tarassov
  2022-03-03  1:09           ` Iouri Tarassov
  1 sibling, 1 reply; 72+ messages in thread
From: Wei Liu @ 2022-03-02 11:53 UTC (permalink / raw)
  To: Greg KH
  Cc: Wei Liu, Iouri Tarassov, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > +struct dxgglobal *dxgglobal;
> > > 
> > > No, make this per-device, NEVER have a single device for your driver.
> > > The Linux driver model makes it harder to do it this way than to do it
> > > correctly.  Do it correctly please and have no global structures like
> > > this.
> > > 
> > 
> > This may not be as big an issue as you thought. The device discovery is
> > still done via the normal VMBus probing routine. For all intents and
> > purposes the dxgglobal structure can be broken down into per device
> > fields and a global structure which contains the protocol versioning
> > information -- my understanding is there will always be a global
> > structure to hold information related to the backend, regardless of how
> > many devices there are.
> 
> Then that is wrong and needs to be fixed.  Drivers should almost never
> have any global data, that is not how Linux drivers work.  What happens
> when you get a second device in your system for this?  Major rework
> would have to happen and the code will break.  Handle that all now as it
> takes less work to make this per-device than it does to have a global
> variable.
> 

It is perhaps easier to draw parallel from an existing driver. I feel
like we're talking past each other.

Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
units will be added to the list. I this the current code is following
this model. dxgglobal fulfills the role of a list.

Setting aside the question of whether it makes sense to keep a copy of
the per-VM state in each device instance, I can see the code be changed
to:

    struct mutex device_mutex; /* split out from dxgglobal */
    static LIST_HEAD(dxglist);
    
    /* Rename struct dxgglobal to struct dxgstate */
    struct dxgstate {
       struct list_head dxglist; /* link for dxglist */
       /* ... original fields sans device_mutex */
    }

    /*
     * Provide a bunch of helpers manipulate the list. Called in probe /
     * remove etc.
     */
    struct dxgstate *find_dxgstate(...);
    void remove_dxgstate(...);
    int add_dxgstate(...);

This model is well understood and used in tree. It is just that it
doesn't provide much value in doing this now since the list will only
contain one element. I hope that you're not saying we cannot even use a
per-module pointer to quickly get the data structure we want to use,
right?

Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
into the device object? I think that can be done too.

The code can be changed as:

    /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
    struct dxgstate { ... };

    static int dxg_probe_vmbus(...) {

        /* probe successfully */

	struct dxgstate *state = kmalloc(...);
	/* Fill in dxgstate with information from backend */

	/* hdev->dev is the device object from the core driver framework */
	dev_set_drvdata(&hdev->dev, state);
    }

    static int dxg_remove_vmbus(...) {
        /* Normal stuff here ...*/

	struct dxgstate *state = dev_get_drvdata(...);
	dev_set_drvdata(..., NULL);
	kfree(state);
    }

    /* In all other functions */
    void do_things(...) {
        struct dxgstate *state = dev_get_drvdata(...);

	/* Use state in place of where dxgglobal was needed */

    }

Iouri, notice this doesn't change anything regarding how userspace is
designed. This is about how kernel organises its data.

I hope what I wrote above can bring our understanding closer.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects
  2022-03-01 19:45   ` [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects Iouri Tarassov
@ 2022-03-02 12:43     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 12:43 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:53AM -0800, Iouri Tarassov wrote:
[...]
> +void dxgadapter_remove_process(struct dxgprocess_adapter *process_info)
> +{
> +	pr_debug("%s %p %p", __func__,
> +		    process_info->adapter, process_info);

This is not very useful in a production system.

Keep in mind that this can flood dmesg when the debug level is cranked
up. The user is just perhaps trying to get information for other parts
of the kernel.

I would like to ask you to reduce the amount of pr_debug's in code, only
leave the most critical ones in place. There is ftrace and ebfp in the
kernel to help with observation at runtime.

> +	list_del(&process_info->adapter_process_list_entry);
> +	process_info->adapter_process_list_entry.next = NULL;
> +	process_info->adapter_process_list_entry.prev = NULL;

There is no need to explicitly set next and prev pointers to NULL. They
have been set to poison values by list_del. Please remove this kind of
code from all other places.

> +}
> +
>  int dxgadapter_acquire_lock_exclusive(struct dxgadapter *adapter)
>  {
>  	down_write(&adapter->core_lock);
> @@ -170,3 +198,52 @@ void dxgadapter_release_lock_shared(struct dxgadapter *adapter)
>  {
>  	up_read(&adapter->core_lock);
>  }
> +
[...]
> +struct dxgprocess_adapter *dxgprocess_get_adapter_info(struct dxgprocess
> +						       *process,
> +						       struct dxgadapter
> +						       *adapter)

Please document the locking requirement or renamed this function. I see
the list is manipulated with no lock held.

> +{
> +	struct dxgprocess_adapter *entry;
> +
> +	list_for_each_entry(entry, &process->process_adapter_list_head,
> +			    process_adapter_list_entry) {
> +		if (adapter == entry->adapter) {
> +			pr_debug("Found process info %p", entry);
> +			return entry;
> +		}
> +	}
> +	return NULL;
> +}

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects
  2022-03-01 19:45   ` [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects Iouri Tarassov
@ 2022-03-02 12:59     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 12:59 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:55AM -0800, Iouri Tarassov wrote:
[...]
> +static int
> +dxgk_create_context_virtual(struct dxgprocess *process, void *__user inargs)
> +{
> +	struct d3dkmt_createcontextvirtual args;
> +	int ret;
> +	struct dxgadapter *adapter = NULL;
> +	struct dxgdevice *device = NULL;
> +	struct dxgcontext *context = NULL;
> +	struct d3dkmthandle host_context_handle = {};
> +	bool device_lock_acquired = false;
> +
> +	pr_debug("ioctl: %s", __func__);
> +
> +	ret = copy_from_user(&args, inargs, sizeof(args));
> +	if (ret) {
> +		pr_err("%s failed to copy input args", __func__);
> +		ret = -EINVAL;

This should be -EFAULT. This goes for other copy_from_user calls too. You can also
drop the pr_err above.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects
  2022-03-01 19:45   ` [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects Iouri Tarassov
@ 2022-03-02 13:25     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 13:25 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:57AM -0800, Iouri Tarassov wrote:
[...]
> +void dxgadapter_remove_syncobj(struct dxgsyncobject *object)
> +{
> +	down_write(&object->adapter->shared_resource_list_lock);
> +	if (object->syncobj_list_entry.next) {
> +		list_del(&object->syncobj_list_entry);
> +		object->syncobj_list_entry.next = NULL;
> +	}

Just use list_del here.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects
  2022-03-01 19:45   ` [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects Iouri Tarassov
@ 2022-03-02 14:13     ` Wei Liu
  2022-03-02 14:15     ` Wei Liu
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 14:13 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:59AM -0800, Iouri Tarassov wrote:
[...]
>  
> +static int dxgsharedresource_seal(struct dxgsharedresource *shared_resource)
> +{
> +	int ret = 0;
> +	int i = 0;
> +	u8 *private_data;
> +	u32 data_size;
> +	struct dxgresource *resource;
> +	struct dxgallocation *alloc;
> +
> +	pr_debug("Sealing resource: %p", shared_resource);
> +
> +	down_write(&shared_resource->adapter->shared_resource_list_lock);
> +	if (shared_resource->sealed) {
> +		pr_debug("Resource already sealed");
> +		goto cleanup;
> +	}
> +	shared_resource->sealed = 1;
> +	if (!list_empty(&shared_resource->resource_list_head)) {
[...]
> +		}
> +cleanup1:

Please just name it unlock ... 

> +		mutex_unlock(&resource->resource_mutex);
> +	}
> +cleanup:

... and rename this to done because you are not cleaning up anything.

> +	up_write(&shared_resource->adapter->shared_resource_list_lock);
> +	return ret;
> +}
> +

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects
  2022-03-01 19:45   ` [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects Iouri Tarassov
  2022-03-02 14:13     ` Wei Liu
@ 2022-03-02 14:15     ` Wei Liu
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 14:15 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:45:59AM -0800, Iouri Tarassov wrote:
[...]
> +void dxgadapter_remove_shared_resource(struct dxgadapter *adapter,
> +				       struct dxgsharedresource *object)
> +{
> +	down_write(&adapter->shared_resource_list_lock);
> +	if (object->shared_resource_list_entry.next) {
> +		list_del(&object->shared_resource_list_entry);
> +		object->shared_resource_list_entry.next = NULL;
> +	}

Again, just list_del please. I will skip pointing out this in
later patches.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations
  2022-03-01 19:46   ` [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations Iouri Tarassov
@ 2022-03-02 14:25     ` Wei Liu
  2022-03-02 18:13       ` Iouri Tarassov
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2022-03-02 14:25 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:46:13AM -0800, Iouri Tarassov wrote:
> Implement ioctls to offer and reclaim compute device allocations:
>   - LX_DXOFFERALLOCATIONS,
>   - LX_DXRECLAIMALLOCATIONS2
> 
> When a user mode driver (UMD) does not need to access an allocation,

What is a "user mode driver" in this context? Is that something that
runs inside the guest?

> it can "offer" it by issuing the LX_DXOFFERALLOCATIONS ioctl.  This
> means that the allocation is not in use and its local device memory
> could be evicted. The freed space could be given to another allocation.
> When the allocation is again needed, the UMD can attempt to"reclaim"
> the allocation by issuing the LX_DXRECLAIMALLOCATIONS2 ioctl. If the
> allocation is still not evicted, the reclaim operation succeeds and no
> other action is required. If the reclaim operation fails, the caller
> must restore the content of the allocation before it can be used by
> the device.
> 
> Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host
  2022-03-01 19:46   ` [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host Iouri Tarassov
@ 2022-03-02 14:29     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 14:29 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Tue, Mar 01, 2022 at 11:46:17AM -0800, Iouri Tarassov wrote:
[...]
> @@ -1393,6 +1396,7 @@ int create_existing_sysmem(struct dxgdevice *device,
>  	pr_debug("   Alloc size: %lld", alloc_size);
>  
>  	dxgalloc->cpu_address = (void *)sysmem;
> +

Well this is just adding a blank line. Please avoid changes like this...

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 00/30] Driver for Hyper-v virtual compute device
  2022-03-01 19:45 [PATCH v3 00/30] Driver for Hyper-v virtual compute device Iouri Tarassov
  2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
  2022-03-02  8:51 ` [PATCH v3 00/30] Driver for Hyper-v virtual compute device Christoph Hellwig
@ 2022-03-02 14:53 ` Wei Liu
  2 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 14:53 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

Hi Iouri

I've gone through the whole series. A large portion of the series is
about graphics stuff which is out of my wheelhouse.

I have two general comments here.

There seems to be an excessive amount pr_debug / pr_err in code. Please
try to reduce their number a bit.

For memory allocation you always use vzalloc, however large or small the
allocation may be. Why not use kzalloc for smaller allocations?

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations
  2022-03-02 14:25     ` Wei Liu
@ 2022-03-02 18:13       ` Iouri Tarassov
  2022-03-02 18:23         ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-02 18:13 UTC (permalink / raw)
  To: Wei Liu
  Cc: kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo,
	spronovo, gregkh


On 3/2/2022 6:25 AM, Wei Liu wrote:
> On Tue, Mar 01, 2022 at 11:46:13AM -0800, Iouri Tarassov wrote:
> > Implement ioctls to offer and reclaim compute device allocations:
> >   - LX_DXOFFERALLOCATIONS,
> >   - LX_DXRECLAIMALLOCATIONS2
> > 
> > When a user mode driver (UMD) does not need to access an allocation,
>
> What is a "user mode driver" in this context? Is that something that
> runs inside the guest?

Hi Wei,

The user mode driver runs inside the guest. This driver is written by
hardware vendors.
For example, the NVIDIA's Cuda runtime is considered a user mode driver.
The driver
provides a specific API to applications (like the Cuda API).

The cover letter explains the design of the virtual compute device
paravirtualization
model and describes all components, which are involved. I feel that I do
not need to
include explanation to every patch.

Thanks
Iouri


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations
  2022-03-02 18:13       ` Iouri Tarassov
@ 2022-03-02 18:23         ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 18:23 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo, gregkh

On Wed, Mar 02, 2022 at 10:13:34AM -0800, Iouri Tarassov wrote:
> 
> On 3/2/2022 6:25 AM, Wei Liu wrote:
> > On Tue, Mar 01, 2022 at 11:46:13AM -0800, Iouri Tarassov wrote:
> > > Implement ioctls to offer and reclaim compute device allocations:
> > >   - LX_DXOFFERALLOCATIONS,
> > >   - LX_DXRECLAIMALLOCATIONS2
> > > 
> > > When a user mode driver (UMD) does not need to access an allocation,
> >
> > What is a "user mode driver" in this context? Is that something that
> > runs inside the guest?
> 
> Hi Wei,
> 
> The user mode driver runs inside the guest. This driver is written by
> hardware vendors.
> For example, the NVIDIA's Cuda runtime is considered a user mode driver.
> The driver
> provides a specific API to applications (like the Cuda API).
> 
> The cover letter explains the design of the virtual compute device
> paravirtualization
> model and describes all components, which are involved. I feel that I do
> not need to
> include explanation to every patch.

It's fine. I was just asking a question -- I didn't have all the terms
in my head by the time I got to this patch. There is no need to add the 
explanation to every patch.

Thanks,
Wei.

> 
> Thanks
> Iouri
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 11:53           ` Wei Liu
@ 2022-03-02 18:49             ` Iouri Tarassov
  2022-03-02 20:20               ` Greg KH
  2022-03-02 20:34               ` Wei Liu
  0 siblings, 2 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-02 18:49 UTC (permalink / raw)
  To: Wei Liu, Greg KH
  Cc: kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo, spronovo

On 3/2/2022 3:53 AM, Wei Liu wrote:
> On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > +struct dxgglobal *dxgglobal;
> > > > 
> > > > No, make this per-device, NEVER have a single device for your driver.
> > > > The Linux driver model makes it harder to do it this way than to do it
> > > > correctly.  Do it correctly please and have no global structures like
> > > > this.
> > > > 
> > > 
> > > This may not be as big an issue as you thought. The device discovery is
> > > still done via the normal VMBus probing routine. For all intents and
> > > purposes the dxgglobal structure can be broken down into per device
> > > fields and a global structure which contains the protocol versioning
> > > information -- my understanding is there will always be a global
> > > structure to hold information related to the backend, regardless of how
> > > many devices there are.
> > 
> > Then that is wrong and needs to be fixed.  Drivers should almost never
> > have any global data, that is not how Linux drivers work.  What happens
> > when you get a second device in your system for this?  Major rework
> > would have to happen and the code will break.  Handle that all now as it
> > takes less work to make this per-device than it does to have a global
> > variable.
> > 
>
> It is perhaps easier to draw parallel from an existing driver. I feel
> like we're talking past each other.
>
> Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> units will be added to the list. I this the current code is following
> this model. dxgglobal fulfills the role of a list.
>
> Setting aside the question of whether it makes sense to keep a copy of
> the per-VM state in each device instance, I can see the code be changed
> to:
>
>     struct mutex device_mutex; /* split out from dxgglobal */
>     static LIST_HEAD(dxglist);
>     
>     /* Rename struct dxgglobal to struct dxgstate */
>     struct dxgstate {
>        struct list_head dxglist; /* link for dxglist */
>        /* ... original fields sans device_mutex */
>     }
>
>     /*
>      * Provide a bunch of helpers manipulate the list. Called in probe /
>      * remove etc.
>      */
>     struct dxgstate *find_dxgstate(...);
>     void remove_dxgstate(...);
>     int add_dxgstate(...);
>
> This model is well understood and used in tree. It is just that it
> doesn't provide much value in doing this now since the list will only
> contain one element. I hope that you're not saying we cannot even use a
> per-module pointer to quickly get the data structure we want to use,
> right?
>
> Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> into the device object? I think that can be done too.
>
> The code can be changed as:
>
>     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
>     struct dxgstate { ... };
>
>     static int dxg_probe_vmbus(...) {
>
>         /* probe successfully */
>
> 	struct dxgstate *state = kmalloc(...);
> 	/* Fill in dxgstate with information from backend */
>
> 	/* hdev->dev is the device object from the core driver framework */
> 	dev_set_drvdata(&hdev->dev, state);
>     }
>
>     static int dxg_remove_vmbus(...) {
>         /* Normal stuff here ...*/
>
> 	struct dxgstate *state = dev_get_drvdata(...);
> 	dev_set_drvdata(..., NULL);
> 	kfree(state);
>     }
>
>     /* In all other functions */
>     void do_things(...) {
>         struct dxgstate *state = dev_get_drvdata(...);
>
> 	/* Use state in place of where dxgglobal was needed */
>
>     }
>
> Iouri, notice this doesn't change anything regarding how userspace is
> designed. This is about how kernel organises its data.
>
> I hope what I wrote above can bring our understanding closer.
>
> Thanks,
> Wei.


I can certainly remove dxgglobal and keep the  pointer to the global
state in the device object.

This will require passing of the global pointer to all functions, which
need to access it.


Maybe my understanding of the Greg's suggestion was not correct. I
thought the suggestion was

to have multiple /dev/dxgN devices (one per virtual compute device).
This would change how the user mode

clients enumerate and communicate with compute devices.


Thanks

Iouri


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 18:49             ` Iouri Tarassov
@ 2022-03-02 20:20               ` Greg KH
  2022-03-02 22:27                 ` Iouri Tarassov
  2022-03-02 20:34               ` Wei Liu
  1 sibling, 1 reply; 72+ messages in thread
From: Greg KH @ 2022-03-02 20:20 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> On 3/2/2022 3:53 AM, Wei Liu wrote:
> > On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > +struct dxgglobal *dxgglobal;
> > > > > 
> > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > correctly.  Do it correctly please and have no global structures like
> > > > > this.
> > > > > 
> > > > 
> > > > This may not be as big an issue as you thought. The device discovery is
> > > > still done via the normal VMBus probing routine. For all intents and
> > > > purposes the dxgglobal structure can be broken down into per device
> > > > fields and a global structure which contains the protocol versioning
> > > > information -- my understanding is there will always be a global
> > > > structure to hold information related to the backend, regardless of how
> > > > many devices there are.
> > > 
> > > Then that is wrong and needs to be fixed.  Drivers should almost never
> > > have any global data, that is not how Linux drivers work.  What happens
> > > when you get a second device in your system for this?  Major rework
> > > would have to happen and the code will break.  Handle that all now as it
> > > takes less work to make this per-device than it does to have a global
> > > variable.
> > > 
> >
> > It is perhaps easier to draw parallel from an existing driver. I feel
> > like we're talking past each other.
> >
> > Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> > like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> > units will be added to the list. I this the current code is following
> > this model. dxgglobal fulfills the role of a list.
> >
> > Setting aside the question of whether it makes sense to keep a copy of
> > the per-VM state in each device instance, I can see the code be changed
> > to:
> >
> >     struct mutex device_mutex; /* split out from dxgglobal */
> >     static LIST_HEAD(dxglist);
> >     
> >     /* Rename struct dxgglobal to struct dxgstate */
> >     struct dxgstate {
> >        struct list_head dxglist; /* link for dxglist */
> >        /* ... original fields sans device_mutex */
> >     }
> >
> >     /*
> >      * Provide a bunch of helpers manipulate the list. Called in probe /
> >      * remove etc.
> >      */
> >     struct dxgstate *find_dxgstate(...);
> >     void remove_dxgstate(...);
> >     int add_dxgstate(...);
> >
> > This model is well understood and used in tree. It is just that it
> > doesn't provide much value in doing this now since the list will only
> > contain one element. I hope that you're not saying we cannot even use a
> > per-module pointer to quickly get the data structure we want to use,
> > right?
> >
> > Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> > into the device object? I think that can be done too.
> >
> > The code can be changed as:
> >
> >     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> >     struct dxgstate { ... };
> >
> >     static int dxg_probe_vmbus(...) {
> >
> >         /* probe successfully */
> >
> > 	struct dxgstate *state = kmalloc(...);
> > 	/* Fill in dxgstate with information from backend */
> >
> > 	/* hdev->dev is the device object from the core driver framework */
> > 	dev_set_drvdata(&hdev->dev, state);
> >     }
> >
> >     static int dxg_remove_vmbus(...) {
> >         /* Normal stuff here ...*/
> >
> > 	struct dxgstate *state = dev_get_drvdata(...);
> > 	dev_set_drvdata(..., NULL);
> > 	kfree(state);
> >     }
> >
> >     /* In all other functions */
> >     void do_things(...) {
> >         struct dxgstate *state = dev_get_drvdata(...);
> >
> > 	/* Use state in place of where dxgglobal was needed */
> >
> >     }
> >
> > Iouri, notice this doesn't change anything regarding how userspace is
> > designed. This is about how kernel organises its data.
> >
> > I hope what I wrote above can bring our understanding closer.
> >
> > Thanks,
> > Wei.
> 
> 
> I can certainly remove dxgglobal and keep the  pointer to the global
> state in the device object.
> 
> This will require passing of the global pointer to all functions, which
> need to access it.
> 
> 
> Maybe my understanding of the Greg's suggestion was not correct. I
> thought the suggestion was
> 
> to have multiple /dev/dxgN devices (one per virtual compute device).

You have one device per HV device, as the bus already provides you.
That's all you really need, right?  Who would be opening the same device
node multiple times?

> This would change how the user mode
> clients enumerate and communicate with compute devices.

What does userspace have to do here?  It should just open the device
node that is present when needed.  How will there be multiple userspace
clients for a single HV device?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 18:49             ` Iouri Tarassov
  2022-03-02 20:20               ` Greg KH
@ 2022-03-02 20:34               ` Wei Liu
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2022-03-02 20:34 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, Greg KH, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> On 3/2/2022 3:53 AM, Wei Liu wrote:
> > On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > +struct dxgglobal *dxgglobal;
> > > > > 
> > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > correctly.  Do it correctly please and have no global structures like
> > > > > this.
> > > > > 
> > > > 
> > > > This may not be as big an issue as you thought. The device discovery is
> > > > still done via the normal VMBus probing routine. For all intents and
> > > > purposes the dxgglobal structure can be broken down into per device
> > > > fields and a global structure which contains the protocol versioning
> > > > information -- my understanding is there will always be a global
> > > > structure to hold information related to the backend, regardless of how
> > > > many devices there are.
> > > 
> > > Then that is wrong and needs to be fixed.  Drivers should almost never
> > > have any global data, that is not how Linux drivers work.  What happens
> > > when you get a second device in your system for this?  Major rework
> > > would have to happen and the code will break.  Handle that all now as it
> > > takes less work to make this per-device than it does to have a global
> > > variable.
> > > 
> >
> > It is perhaps easier to draw parallel from an existing driver. I feel
> > like we're talking past each other.
> >
> > Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> > like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> > units will be added to the list. I this the current code is following
> > this model. dxgglobal fulfills the role of a list.
> >
> > Setting aside the question of whether it makes sense to keep a copy of
> > the per-VM state in each device instance, I can see the code be changed
> > to:
> >
> >     struct mutex device_mutex; /* split out from dxgglobal */
> >     static LIST_HEAD(dxglist);
> >     
> >     /* Rename struct dxgglobal to struct dxgstate */
> >     struct dxgstate {
> >        struct list_head dxglist; /* link for dxglist */
> >        /* ... original fields sans device_mutex */
> >     }
> >
> >     /*
> >      * Provide a bunch of helpers manipulate the list. Called in probe /
> >      * remove etc.
> >      */
> >     struct dxgstate *find_dxgstate(...);
> >     void remove_dxgstate(...);
> >     int add_dxgstate(...);
> >
> > This model is well understood and used in tree. It is just that it
> > doesn't provide much value in doing this now since the list will only
> > contain one element. I hope that you're not saying we cannot even use a
> > per-module pointer to quickly get the data structure we want to use,
> > right?
> >
> > Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> > into the device object? I think that can be done too.
> >
> > The code can be changed as:
> >
> >     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> >     struct dxgstate { ... };
> >
> >     static int dxg_probe_vmbus(...) {
> >
> >         /* probe successfully */
> >
> > 	struct dxgstate *state = kmalloc(...);
> > 	/* Fill in dxgstate with information from backend */
> >
> > 	/* hdev->dev is the device object from the core driver framework */
> > 	dev_set_drvdata(&hdev->dev, state);
> >     }
> >
> >     static int dxg_remove_vmbus(...) {
> >         /* Normal stuff here ...*/
> >
> > 	struct dxgstate *state = dev_get_drvdata(...);
> > 	dev_set_drvdata(..., NULL);
> > 	kfree(state);
> >     }
> >
> >     /* In all other functions */
> >     void do_things(...) {
> >         struct dxgstate *state = dev_get_drvdata(...);
> >
> > 	/* Use state in place of where dxgglobal was needed */
> >
> >     }
> >
> > Iouri, notice this doesn't change anything regarding how userspace is
> > designed. This is about how kernel organises its data.
> >
> > I hope what I wrote above can bring our understanding closer.
> >
> > Thanks,
> > Wei.
> 
> 
> I can certainly remove dxgglobal and keep the  pointer to the global
> state in the device object.
> 

No, no more global pointer needed. You just call dev_drv_setdata in the
place that you assign to the global pointer.

> This will require passing of the global pointer to all functions, which
> need to access it.
> 

And in the place you need the global pointer, call dev_drv_getdata.

> 
> Maybe my understanding of the Greg's suggestion was not correct. I
> thought the suggestion was
> 
> to have multiple /dev/dxgN devices (one per virtual compute device).
> This would change how the user mode
> 

No. You still have only one /dev/dxg here.

Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 20:20               ` Greg KH
@ 2022-03-02 22:27                 ` Iouri Tarassov
  2022-03-03 13:16                   ` Greg KH
  0 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-02 22:27 UTC (permalink / raw)
  To: Greg KH
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo


On 3/2/2022 12:20 PM, Greg KH wrote:
> On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> > On 3/2/2022 3:53 AM, Wei Liu wrote:
> > > On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > > +struct dxgglobal *dxgglobal;
> > > > > > 
> > > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > > correctly.  Do it correctly please and have no global structures like
> > > > > > this.
> > > > > > 
> > > > > 
> > > > > This may not be as big an issue as you thought. The device discovery is
> > > > > still done via the normal VMBus probing routine. For all intents and
> > > > > purposes the dxgglobal structure can be broken down into per device
> > > > > fields and a global structure which contains the protocol versioning
> > > > > information -- my understanding is there will always be a global
> > > > > structure to hold information related to the backend, regardless of how
> > > > > many devices there are.
> > > > 
> > > > Then that is wrong and needs to be fixed.  Drivers should almost never
> > > > have any global data, that is not how Linux drivers work.  What happens
> > > > when you get a second device in your system for this?  Major rework
> > > > would have to happen and the code will break.  Handle that all now as it
> > > > takes less work to make this per-device than it does to have a global
> > > > variable.
> > > > 
> > >
> > > It is perhaps easier to draw parallel from an existing driver. I feel
> > > like we're talking past each other.
> > >
> > > Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> > > like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> > > units will be added to the list. I this the current code is following
> > > this model. dxgglobal fulfills the role of a list.
> > >
> > > Setting aside the question of whether it makes sense to keep a copy of
> > > the per-VM state in each device instance, I can see the code be changed
> > > to:
> > >
> > >     struct mutex device_mutex; /* split out from dxgglobal */
> > >     static LIST_HEAD(dxglist);
> > >     
> > >     /* Rename struct dxgglobal to struct dxgstate */
> > >     struct dxgstate {
> > >        struct list_head dxglist; /* link for dxglist */
> > >        /* ... original fields sans device_mutex */
> > >     }
> > >
> > >     /*
> > >      * Provide a bunch of helpers manipulate the list. Called in probe /
> > >      * remove etc.
> > >      */
> > >     struct dxgstate *find_dxgstate(...);
> > >     void remove_dxgstate(...);
> > >     int add_dxgstate(...);
> > >
> > > This model is well understood and used in tree. It is just that it
> > > doesn't provide much value in doing this now since the list will only
> > > contain one element. I hope that you're not saying we cannot even use a
> > > per-module pointer to quickly get the data structure we want to use,
> > > right?
> > >
> > > Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> > > into the device object? I think that can be done too.
> > >
> > > The code can be changed as:
> > >
> > >     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> > >     struct dxgstate { ... };
> > >
> > >     static int dxg_probe_vmbus(...) {
> > >
> > >         /* probe successfully */
> > >
> > > 	struct dxgstate *state = kmalloc(...);
> > > 	/* Fill in dxgstate with information from backend */
> > >
> > > 	/* hdev->dev is the device object from the core driver framework */
> > > 	dev_set_drvdata(&hdev->dev, state);
> > >     }
> > >
> > >     static int dxg_remove_vmbus(...) {
> > >         /* Normal stuff here ...*/
> > >
> > > 	struct dxgstate *state = dev_get_drvdata(...);
> > > 	dev_set_drvdata(..., NULL);
> > > 	kfree(state);
> > >     }
> > >
> > >     /* In all other functions */
> > >     void do_things(...) {
> > >         struct dxgstate *state = dev_get_drvdata(...);
> > >
> > > 	/* Use state in place of where dxgglobal was needed */
> > >
> > >     }
> > >
> > > Iouri, notice this doesn't change anything regarding how userspace is
> > > designed. This is about how kernel organises its data.
> > >
> > > I hope what I wrote above can bring our understanding closer.
> > >
> > > Thanks,
> > > Wei.
> > 
> > 
> > I can certainly remove dxgglobal and keep the  pointer to the global
> > state in the device object.
> > 
> > This will require passing of the global pointer to all functions, which
> > need to access it.
> > 
> > 
> > Maybe my understanding of the Greg's suggestion was not correct. I
> > thought the suggestion was
> > 
> > to have multiple /dev/dxgN devices (one per virtual compute device).
>
> You have one device per HV device, as the bus already provides you.
> That's all you really need, right?  Who would be opening the same device
> node multiple times?
> > This would change how the user mode
> > clients enumerate and communicate with compute devices.
>
> What does userspace have to do here?  It should just open the device
> node that is present when needed.  How will there be multiple userspace
> clients for a single HV device?


Dxgkrnl creates a single user mode visible device node /dev/dxg. It has
nothing to do with a specific hardware compute device on the host. Its
purpose is to provide services (IOCTLs) to enumerate and manage virtual
compute devices, which represent hardware devices on the host. The VMBus
devices are not used directly by user mode clients in the current design.

Virtual compute devices are shared between processes. There could be a
Cuda application, Gimp and a Direct3D12 application working at the same
time.This is what I mean by saying that there are multiple user mode
clients who use the /dev/dxg driver interface. Each of this applications
will open the /dev/dxg device node and enumerate/use virtual compute
devices.

If we change the way how the virtual compute devices are visible to user
mode, the Cuda runtime, Direct3D runtime would need to be changed.

I think we agreed that I will keep the global driver state in the device
object as Wei suggested and remove global variables. There still will be
a single /dev/dxg device node. Correct?


Thanks

Iouri


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 20:45     ` Greg KH
  2022-03-01 22:23       ` Wei Liu
@ 2022-03-02 22:59       ` Iouri Tarassov
  2022-03-03 13:29         ` Greg KH
  2022-03-02 23:27       ` Iouri Tarassov
  2 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-02 22:59 UTC (permalink / raw)
  To: Greg KH
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

Hi Greg,

Thanks a lot for reviewing the patches.

I appreciate the very detailed comments. I my reply I omitted the
comments, which I agree with and will address in subsequent patches.

On 3/1/2022 12:45 PM, Greg KH wrote:

>
> > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > +
> > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > +{
> > +	*luid = *(struct winluid *)&guid->b[0];
>
> Why is the cast needed?  Shouldn't you use real types in your
> structures?


The VMBus channel ID, which is GUID, is present in the PCI config space
of the compute device. The convention is that the lower part of the GUID
is LUID (local unique identifier). This function just converts GUID to
LUID. I am not sure what is the ask here.


> > +/*
> > + * The interface version is used to ensure that the host and the guest use the
> > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > + * incremented each time the earlier versions of the interface are no longer
> > + * compatible with the current version.
> > + */
> > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
>
> Where do these numbers come from, the hypervisor specification?


The definitions are for the VMBus interface between the virtual compute
device in the guest and the Windows host. The interface version is
updated when the VMBus protocol is changed. These are specific to the
virtual compute device, not to the hypervisor.


> > +/*
> > + * Pointer to the global device data. By design
> > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > + * is created.
> > + */
> > +struct dxgglobal *dxgglobal;
>
> No, make this per-device, NEVER have a single device for your driver.
> The Linux driver model makes it harder to do it this way than to do it
> correctly.  Do it correctly please and have no global structures like
> this.


This will be addressed in the next version of patches.


> > +#define DXGKRNL_VERSION			0x2216
>
> What does this mean?


This is the driver implementation version. There are many Microsoft
shipped Linux kernels, which include the driver. The version allows to
quickly determine, which level of functionality or bug fixes is
implemented in the current driver.


> > +
> > +/*
> > + * Interface with the PCI driver
> > + */
> > +
> > +/*
> > + * Part of the PCI config space of the vGPU device is used for vGPU
> > + * configuration data. Reading/writing of the PCI config space is forwarded
> > + * to the host.
> > + */
>
> > +/* DXGK_VMBUS_INTERFACE_VERSION (u32) */
>
> What is this?


This comment explains that the value of DXGK_VMBUS_INTERFACE_VERSION is
located at the offset DXGK_VMBUS_CHANNEL_ID_OFFSET in the PCI config space.


> > +#define DXGK_VMBUS_VERSION_OFFSET	(DXGK_VMBUS_CHANNEL_ID_OFFSET + \
> > +					sizeof(guid_t))
>
> offsetof() is your friend, please use it.


There is no structure definition for the PCI config space of the device.
Therefore, offsets are computed by using data size.


> > +/* Luid of the virtual GPU on the host (struct winluid) */
>
> What is a "luid"?


LUID is "Locally Unique Identifier". It is used in Windows and its value
is guaranteed to be unique until the computer reboots. This is similar
to GUID, but is it not globally unique. Dxgkrnl on the host uses LUIDs
to identify compute adapter objects.
I added a comment to the definition in the header.


> > +	ret = vmbus_driver_register(&dxg_drv);
> > +	if (ret) {
> > +		pr_err("vmbus_driver_register failed: %d", ret);
> > +		return ret;
> > +	}
> > +	dxgglobal->vmbus_registered = true;
> > +
> > +	pr_info("%s  Version: %x", __func__, DXGKRNL_VERSION);
>
> When drivers work, they are quiet.
>
> Also, in-kernel modules do not have a version, they are part of the
> kernel tree, and that's the "version" they have.  You can drop the
> individual version here, it makes no sense anymore.


Microsoft develops many Linux kernels when the dxgkrnl driver is
included (EFLOW, Android, WSL). The kernels are built at different times
and might include a different version of the driver. Printing the
version allows to quickly determine what functionality and bug fixes are
included in the driver.

I see that other in-tree drivers print some information during
init_module (like nfblock.c).

This is done for convenience. So instead of tracking multiple kernel
versions, we can track the dxgkrnl driver version. I am ok to remove it
if it really hurts something. It is only a single line per driver load.


Thanks

Iouri


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 20:45     ` Greg KH
  2022-03-01 22:23       ` Wei Liu
  2022-03-02 22:59       ` Iouri Tarassov
@ 2022-03-02 23:27       ` Iouri Tarassov
  2022-03-03 13:10         ` Greg KH
  2 siblings, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-02 23:27 UTC (permalink / raw)
  To: Greg KH
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo


On 3/1/2022 12:45 PM, Greg KH wrote:
> On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > - Create skeleton and add basic functionality for the
> > hyper-v compute device driver (dxgkrnl).
> > 
> > +
> > +#undef pr_fmt
> > +#define pr_fmt(fmt)	"dxgk: " fmt
>
> Use the dev_*() print functions, you are a driver, and you always have a
> pointer to a struct device.  There's no need to ever call pr_*().
>

There is no struct device during module initialization until the
/dev/dxg device is created. Is it ok to use pr_* functions in this case?
Should dev_*(NULL,...) be used? I see other drivers use the pr_*
functions in this case (mips.c as an example).


Thanks

Iouri

> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02  7:53         ` Greg KH
  2022-03-02 11:53           ` Wei Liu
@ 2022-03-03  1:09           ` Iouri Tarassov
  2022-03-03 13:22             ` Greg KH
  1 sibling, 1 reply; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-03  1:09 UTC (permalink / raw)
  To: Greg KH, Wei Liu
  Cc: kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo, spronovo


On 3/1/2022 11:53 PM, Greg KH wrote:
> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > +struct dxgglobal *dxgglobal;
> > > 
> > > No, make this per-device, NEVER have a single device for your driver.
> > > The Linux driver model makes it harder to do it this way than to do it
> > > correctly.  Do it correctly please and have no global structures like
> > > this.
> > > 
> > 
> > This may not be as big an issue as you thought. The device discovery is
> > still done via the normal VMBus probing routine. For all intents and
> > purposes the dxgglobal structure can be broken down into per device
> > fields and a global structure which contains the protocol versioning
> > information -- my understanding is there will always be a global
> > structure to hold information related to the backend, regardless of how
> > many devices there are.
>
> Then that is wrong and needs to be fixed.  Drivers should almost never
> have any global data, that is not how Linux drivers work.  What happens
> when you get a second device in your system for this?  Major rework
> would have to happen and the code will break.  Handle that all now as it
> takes less work to make this per-device than it does to have a global
> variable.
>
> > I definitely think splitting is doable, but I also understand why Iouri
> > does not want to do it _now_ given there is no such a model for multiple
> > devices yet, so anything we put into the per-device structure could be
> > incomplete and it requires further changing when such a model arrives
> > later.
> > 
> > Iouri, please correct me if I have the wrong mental model here.
> > 
> > All in all, I hope this is not going to be a deal breaker for the
> > acceptance of this driver.
>
> For my reviews, yes it will be.
>
> Again, it should be easier to keep things in a per-device state than
> not as the proper lifetime rules and the like are automatically handled
> for you.  If you have global data, you have to manage that all on your
> own and it is _MUCH_ harder to review that you got it correct.

Hi Greg,

I do not really see how the driver be written without the global data. Let's review the design.

Dxgkrnl acts as the aggregator of all virtual compute devices, projected by the host. It needs to do operations, which do not belong to a particular compute device. For example, cross device synchronization and resource sharing.

A PCI device device is created for each virtual compute device. Therefore, there should be a global list of objects and a mutex to synchronize access to the list.

A VMBus channel is offered by the host for each compute device. The list of the VMBus channels should be global.

A global VMBus channel is offered by the host. The channel does not belong to any particular compute device, so it must be global.

IO space is shared by all compute devices, so its parameters should be global.

Dxgkrnl needs to maintain a list of processes, which opened compute device objects. Dxgkrnl maintains private state for each process and when a process opens the /dev/dxg device, Dxgkrnl needs to find if the process state is already created by walking the global process list.

Now, where to keep this global state? It could be kept in the /dev/dxg private device structure. But this structure is not available when, for example, dxg_pci_probe_device() or dxg_probe_vmbus() is called.

Can there be multiple /dev/dxg devices? No. Because the /dev/dxg device represents the driver itself, not a particular compute device.

I am not sure what design model you have in mind when saying there should be no global data. Could you please explain keeping in mind the above requirements?

Thanks
Iouri


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-01 22:06     ` Wei Liu
@ 2022-03-03  2:01       ` Iouri Tarassov
  0 siblings, 0 replies; 72+ messages in thread
From: Iouri Tarassov @ 2022-03-03  2:01 UTC (permalink / raw)
  To: Wei Liu
  Cc: kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo,
	spronovo, gregkh


On 3/1/2022 2:06 PM, Wei Liu wrote:
> I will skip things that are pointed out by Greg.
>
> On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > - Create skeleton and add basic functionality for the
> > hyper-v compute device driver (dxgkrnl).
> > 
> > - Register for PCI and VM bus driver notifications and
> > handle initialization of VM bus channels.
> > 
> > - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> > 
> > - Create a MAINTAINERS entry
> > 
> > A VM bus channel is a communication interface between the hyper-v guest
> > and the host. The are two type of VM bus channels, used in the driver:
> >   - the global channel
> >   - per virtual compute device channel
> > 
>
> Same comment regarding the spelling of VMBus and Hyper-V. Please fix
> other instances in code and comments.
>
> > A PCI device is created for each virtual compute device, projected
> > by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> > id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> > arrival of such devices. The PCI config space of the virtual compute
> > device has luid of the corresponding virtual compute device VM
> > bus channel. This is how the compute device adapter objects are
> > linked to VM bus channels.
> > 
> > VM bus interface version is exchanged by reading/writing the PCI config
> > space of the virtual compute device.
> > 
> > The IO space is used to handle CPU accessible compute device
> > allocations. Hyper-v allocates IO space for the global VM bus channel.
> > 
> > Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>
> > ---
> [...]
> > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > +{
> > +	*luid = *(struct winluid *)&guid->b[0];
> > +}
>
> This should be moved to the header where luid is defined -- presumably
> this is useful for other things in the future too.
>
> Also, please provide a comment on why this conversion is okay.
>
The definition of the structure is in the public header. I do not think it makes sense to move the function there. It is a detail of the internal implementation. There is no official conversion of GUID to LUID.

The comment will be added.

Thanks
Iouri



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 23:27       ` Iouri Tarassov
@ 2022-03-03 13:10         ` Greg KH
  0 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-03 13:10 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Wed, Mar 02, 2022 at 03:27:56PM -0800, Iouri Tarassov wrote:
> 
> On 3/1/2022 12:45 PM, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > > - Create skeleton and add basic functionality for the
> > > hyper-v compute device driver (dxgkrnl).
> > > 
> > > +
> > > +#undef pr_fmt
> > > +#define pr_fmt(fmt)	"dxgk: " fmt
> >
> > Use the dev_*() print functions, you are a driver, and you always have a
> > pointer to a struct device.  There's no need to ever call pr_*().
> >
> 
> There is no struct device during module initialization until the
> /dev/dxg device is created.

Then you should not have anything to print out.

> Is it ok to use pr_* functions in this case?

Nope.

> Should dev_*(NULL,...) be used?

No, not at all, never!

> I see other drivers use the pr_* functions in this case (mips.c as an
> example).

There are lots of bad examples in the kernel tree, let's make your code
a good one.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 22:27                 ` Iouri Tarassov
@ 2022-03-03 13:16                   ` Greg KH
  2022-03-09 23:14                     ` Steve Pronovost
  0 siblings, 1 reply; 72+ messages in thread
From: Greg KH @ 2022-03-03 13:16 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Wed, Mar 02, 2022 at 02:27:05PM -0800, Iouri Tarassov wrote:
> 
> On 3/2/2022 12:20 PM, Greg KH wrote:
> > On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> > > On 3/2/2022 3:53 AM, Wei Liu wrote:
> > > > On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > > > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > > > +struct dxgglobal *dxgglobal;
> > > > > > > 
> > > > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > > > correctly.  Do it correctly please and have no global structures like
> > > > > > > this.
> > > > > > > 
> > > > > > 
> > > > > > This may not be as big an issue as you thought. The device discovery is
> > > > > > still done via the normal VMBus probing routine. For all intents and
> > > > > > purposes the dxgglobal structure can be broken down into per device
> > > > > > fields and a global structure which contains the protocol versioning
> > > > > > information -- my understanding is there will always be a global
> > > > > > structure to hold information related to the backend, regardless of how
> > > > > > many devices there are.
> > > > > 
> > > > > Then that is wrong and needs to be fixed.  Drivers should almost never
> > > > > have any global data, that is not how Linux drivers work.  What happens
> > > > > when you get a second device in your system for this?  Major rework
> > > > > would have to happen and the code will break.  Handle that all now as it
> > > > > takes less work to make this per-device than it does to have a global
> > > > > variable.
> > > > > 
> > > >
> > > > It is perhaps easier to draw parallel from an existing driver. I feel
> > > > like we're talking past each other.
> > > >
> > > > Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> > > > like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> > > > units will be added to the list. I this the current code is following
> > > > this model. dxgglobal fulfills the role of a list.
> > > >
> > > > Setting aside the question of whether it makes sense to keep a copy of
> > > > the per-VM state in each device instance, I can see the code be changed
> > > > to:
> > > >
> > > >     struct mutex device_mutex; /* split out from dxgglobal */
> > > >     static LIST_HEAD(dxglist);
> > > >     
> > > >     /* Rename struct dxgglobal to struct dxgstate */
> > > >     struct dxgstate {
> > > >        struct list_head dxglist; /* link for dxglist */
> > > >        /* ... original fields sans device_mutex */
> > > >     }
> > > >
> > > >     /*
> > > >      * Provide a bunch of helpers manipulate the list. Called in probe /
> > > >      * remove etc.
> > > >      */
> > > >     struct dxgstate *find_dxgstate(...);
> > > >     void remove_dxgstate(...);
> > > >     int add_dxgstate(...);
> > > >
> > > > This model is well understood and used in tree. It is just that it
> > > > doesn't provide much value in doing this now since the list will only
> > > > contain one element. I hope that you're not saying we cannot even use a
> > > > per-module pointer to quickly get the data structure we want to use,
> > > > right?
> > > >
> > > > Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> > > > into the device object? I think that can be done too.
> > > >
> > > > The code can be changed as:
> > > >
> > > >     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> > > >     struct dxgstate { ... };
> > > >
> > > >     static int dxg_probe_vmbus(...) {
> > > >
> > > >         /* probe successfully */
> > > >
> > > > 	struct dxgstate *state = kmalloc(...);
> > > > 	/* Fill in dxgstate with information from backend */
> > > >
> > > > 	/* hdev->dev is the device object from the core driver framework */
> > > > 	dev_set_drvdata(&hdev->dev, state);
> > > >     }
> > > >
> > > >     static int dxg_remove_vmbus(...) {
> > > >         /* Normal stuff here ...*/
> > > >
> > > > 	struct dxgstate *state = dev_get_drvdata(...);
> > > > 	dev_set_drvdata(..., NULL);
> > > > 	kfree(state);
> > > >     }
> > > >
> > > >     /* In all other functions */
> > > >     void do_things(...) {
> > > >         struct dxgstate *state = dev_get_drvdata(...);
> > > >
> > > > 	/* Use state in place of where dxgglobal was needed */
> > > >
> > > >     }
> > > >
> > > > Iouri, notice this doesn't change anything regarding how userspace is
> > > > designed. This is about how kernel organises its data.
> > > >
> > > > I hope what I wrote above can bring our understanding closer.
> > > >
> > > > Thanks,
> > > > Wei.
> > > 
> > > 
> > > I can certainly remove dxgglobal and keep the  pointer to the global
> > > state in the device object.
> > > 
> > > This will require passing of the global pointer to all functions, which
> > > need to access it.
> > > 
> > > 
> > > Maybe my understanding of the Greg's suggestion was not correct. I
> > > thought the suggestion was
> > > 
> > > to have multiple /dev/dxgN devices (one per virtual compute device).
> >
> > You have one device per HV device, as the bus already provides you.
> > That's all you really need, right?  Who would be opening the same device
> > node multiple times?
> > > This would change how the user mode
> > > clients enumerate and communicate with compute devices.
> >
> > What does userspace have to do here?  It should just open the device
> > node that is present when needed.  How will there be multiple userspace
> > clients for a single HV device?
> 
> 
> Dxgkrnl creates a single user mode visible device node /dev/dxg.

When you do that, you have a device to put all of your data on.  Use
that.

> It has
> nothing to do with a specific hardware compute device on the host. Its
> purpose is to provide services (IOCTLs) to enumerate and manage virtual
> compute devices, which represent hardware devices on the host. The VMBus
> devices are not used directly by user mode clients in the current design.

That's horrid, why not just export the virtual compute devices properly
through individual device nodes instead?

In essence, you are creating a new syscall here to manage and handle
devices for just your driver with the use of this ioctl.  That's
generally a bad idea.

> Virtual compute devices are shared between processes. There could be a
> Cuda application, Gimp and a Direct3D12 application working at the same
> time.

Why are all of those applications sharing anything?  How are they
sharing anything?  If you need to share something across processes, use
the existing inter-process ways of sharing stuff that we have today (12+
different apis and counting).  Don't create yet-another-one just for
your tiny little driver here.  That's rude to the rest of the kernel.

> This is what I mean by saying that there are multiple user mode
> clients who use the /dev/dxg driver interface. Each of this applications
> will open the /dev/dxg device node and enumerate/use virtual compute
> devices.

That seems like an odd model to follow.  How many virtual devices do you
support?  Is there a limit?  Why not just enumerate them all to start
with?  Or if there are too many, why not do it like the raw device and
have a single device node that is used to create the virtual devices you
wish to use?

> If we change the way how the virtual compute devices are visible to user
> mode, the Cuda runtime, Direct3D runtime would need to be changed.

We do not care about existing userspace code at this point in time, you
are trying to get the kernel api correct here.  Once you have done that,
then you can go fix up your userspace code to work properly.  Anything
that came before today is pointless here, right?  :)

> I think we agreed that I will keep the global driver state in the device
> object as Wei suggested and remove global variables. There still will be
> a single /dev/dxg device node. Correct?

I do not know, as I can't figure out your goals here at all, sorry.

Please go work with some experienced Linux developers in your company.
They should be the ones answering these questions for you, not me :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-03  1:09           ` Iouri Tarassov
@ 2022-03-03 13:22             ` Greg KH
  2022-03-04 16:04               ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Greg KH @ 2022-03-03 13:22 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Wed, Mar 02, 2022 at 05:09:21PM -0800, Iouri Tarassov wrote:
> 
> On 3/1/2022 11:53 PM, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > +struct dxgglobal *dxgglobal;
> > > > 
> > > > No, make this per-device, NEVER have a single device for your driver.
> > > > The Linux driver model makes it harder to do it this way than to do it
> > > > correctly.  Do it correctly please and have no global structures like
> > > > this.
> > > > 
> > > 
> > > This may not be as big an issue as you thought. The device discovery is
> > > still done via the normal VMBus probing routine. For all intents and
> > > purposes the dxgglobal structure can be broken down into per device
> > > fields and a global structure which contains the protocol versioning
> > > information -- my understanding is there will always be a global
> > > structure to hold information related to the backend, regardless of how
> > > many devices there are.
> >
> > Then that is wrong and needs to be fixed.  Drivers should almost never
> > have any global data, that is not how Linux drivers work.  What happens
> > when you get a second device in your system for this?  Major rework
> > would have to happen and the code will break.  Handle that all now as it
> > takes less work to make this per-device than it does to have a global
> > variable.
> >
> > > I definitely think splitting is doable, but I also understand why Iouri
> > > does not want to do it _now_ given there is no such a model for multiple
> > > devices yet, so anything we put into the per-device structure could be
> > > incomplete and it requires further changing when such a model arrives
> > > later.
> > > 
> > > Iouri, please correct me if I have the wrong mental model here.
> > > 
> > > All in all, I hope this is not going to be a deal breaker for the
> > > acceptance of this driver.
> >
> > For my reviews, yes it will be.
> >
> > Again, it should be easier to keep things in a per-device state than
> > not as the proper lifetime rules and the like are automatically handled
> > for you.  If you have global data, you have to manage that all on your
> > own and it is _MUCH_ harder to review that you got it correct.
> 
> Hi Greg,
> 
> I do not really see how the driver be written without the global data. Let's review the design.

I see it the other way around.  It's easier to make it without a static
structure, it is more work to keep it as you have done so here.  Do it
correctly to start with and you will not have any of these issues going
forward.

> Dxgkrnl acts as the aggregator of all virtual compute devices, projected by the host. It needs to do operations, which do not belong to a particular compute device. For example, cross device synchronization and resource sharing.

Then hang your data off of your device node structure that you created.
Why ignore that?

> A PCI device device is created for each virtual compute device. Therefore, there should be a global list of objects and a mutex to synchronize access to the list.

Woah, what?  You create a fake PCI device for each virtual device?  If
so, great, then you are now a PCI bus and create the PCI devices
properly so that the PCI core can handle and manage them and then assign
them to your driver.  You should NEVER have a global list of these
devices, as that is what the driver model should be managing.  Not you!

> A VMBus channel is offered by the host for each compute device. The list of the VMBus channels should be global.

The vmbus channels are already handled by the driver core.  Use those
devices that are given to you.  You don't need to manage them at all.

> A global VMBus channel is offered by the host. The channel does not belong to any particular compute device, so it must be global.

That channel is attached to your driver, use the device given to your
driver by the bus.  It's not "global" in any sense of the word.

And what's up with your lack of line wrapping?

> IO space is shared by all compute devices, so its parameters should be global.

Huh?  If that's the case then you have bigger problems.  Use the aux bus
for devices that share io space.  That is what it was created for, do
not ignore the functionality that Linux already provides you by trying
to go around it and writing your own code.  Use the frameworks we have
already debugged and support.  This is why your Linux driver should be
at least 1/3 smaller than drivers for other operating systems.

> Dxgkrnl needs to maintain a list of processes, which opened compute device objects. Dxgkrnl maintains private state for each process and when a process opens the /dev/dxg device, Dxgkrnl needs to find if the process state is already created by walking the global process list.

That "list" is handled by the device node structure that was opened.
It's not "global" at all.  Again, just like any other device node in
Linux, this isn't a new thing or anything special at all.

> Now, where to keep this global state? It could be kept in the /dev/dxg private device structure. But this structure is not available when, for example, dxg_pci_probe_device() or dxg_probe_vmbus() is called.

Then your design is wrong.  It's as simple as that.  Fix it.

> Can there be multiple /dev/dxg devices? No. Because the /dev/dxg device represents the driver itself, not a particular compute device.

Then fix this.  Make your compute devices store the needed information
when they are created.  Again, we have loads of examples in the kernel,
this is nothing new.

> I am not sure what design model you have in mind when saying there should be no global data. Could you please explain keeping in mind the above requirements?

Please see all of my responses above, and please use more \n characters
in the future :)

good luck!

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-02 22:59       ` Iouri Tarassov
@ 2022-03-03 13:29         ` Greg KH
  0 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-03 13:29 UTC (permalink / raw)
  To: Iouri Tarassov
  Cc: kys, haiyangz, sthemmin, wei.liu, linux-hyperv, linux-kernel,
	spronovo, spronovo

On Wed, Mar 02, 2022 at 02:59:13PM -0800, Iouri Tarassov wrote:
> Hi Greg,
> 
> Thanks a lot for reviewing the patches.
> 
> I appreciate the very detailed comments. I my reply I omitted the
> comments, which I agree with and will address in subsequent patches.
> 
> On 3/1/2022 12:45 PM, Greg KH wrote:
> 
> >
> > > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > > +
> > > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > > +{
> > > +	*luid = *(struct winluid *)&guid->b[0];
> >
> > Why is the cast needed?  Shouldn't you use real types in your
> > structures?
> 
> 
> The VMBus channel ID, which is GUID, is present in the PCI config space
> of the compute device. The convention is that the lower part of the GUID
> is LUID (local unique identifier). This function just converts GUID to
> LUID. I am not sure what is the ask here.

You need to documen the heck out of what you are doing here as it looks
very very odd.

> > > +/*
> > > + * The interface version is used to ensure that the host and the guest use the
> > > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > > + * incremented each time the earlier versions of the interface are no longer
> > > + * compatible with the current version.
> > > + */
> > > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
> >
> > Where do these numbers come from, the hypervisor specification?
> 
> 
> The definitions are for the VMBus interface between the virtual compute
> device in the guest and the Windows host. The interface version is
> updated when the VMBus protocol is changed. These are specific to the
> virtual compute device, not to the hypervisor.

That's not ok, you can not break the user/kernel interface from this
moment going forward forever.  You can't just have version numbers for
your protocol, you need to handle it correctly like all of Linux does.

> > > +#define DXGKRNL_VERSION			0x2216
> >
> > What does this mean?
> 
> 
> This is the driver implementation version. There are many Microsoft
> shipped Linux kernels, which include the driver. The version allows to
> quickly determine, which level of functionality or bug fixes is
> implemented in the current driver.

That number means nothing once the driver is in the kernel tree as your
driver is part of a larger whole and you have no idea what people have,
or have not, backported to your portion of the driver or the rest of the
kernel anymore.  Just drop this, it's not going to be used ever again,
and will always be out of date and wrong.

In-kernel drivers do NOT need a version number.  I have worked to remove
almost all of them, but have not finished that task, which is why you
see a few remaining.  Do not copy bad examples.

> > > +/*
> > > + * Part of the PCI config space of the vGPU device is used for vGPU
> > > + * configuration data. Reading/writing of the PCI config space is forwarded
> > > + * to the host.
> > > + */
> >
> > > +/* DXGK_VMBUS_INTERFACE_VERSION (u32) */
> >
> > What is this?
> 
> 
> This comment explains that the value of DXGK_VMBUS_INTERFACE_VERSION is
> located at the offset DXGK_VMBUS_CHANNEL_ID_OFFSET in the PCI config space.

Then please explain that.

> > > +#define DXGK_VMBUS_VERSION_OFFSET	(DXGK_VMBUS_CHANNEL_ID_OFFSET + \
> > > +					sizeof(guid_t))
> >
> > offsetof() is your friend, please use it.
> 
> 
> There is no structure definition for the PCI config space of the device.
> Therefore, offsets are computed by using data size.

That's odd, why not just use a structure on top of the memory location?
That way you get real structure information if things change and you
don't have to cast anything.

> > > +/* Luid of the virtual GPU on the host (struct winluid) */
> >
> > What is a "luid"?
> 
> 
> LUID is "Locally Unique Identifier". It is used in Windows and its value
> is guaranteed to be unique until the computer reboots.

What about vm lifespans?  VM cloning?  Why does the Linux kernel care
about this value?

> > > +	ret = vmbus_driver_register(&dxg_drv);
> > > +	if (ret) {
> > > +		pr_err("vmbus_driver_register failed: %d", ret);
> > > +		return ret;
> > > +	}
> > > +	dxgglobal->vmbus_registered = true;
> > > +
> > > +	pr_info("%s  Version: %x", __func__, DXGKRNL_VERSION);
> >
> > When drivers work, they are quiet.
> >
> > Also, in-kernel modules do not have a version, they are part of the
> > kernel tree, and that's the "version" they have.  You can drop the
> > individual version here, it makes no sense anymore.
> 
> 
> Microsoft develops many Linux kernels when the dxgkrnl driver is
> included (EFLOW, Android, WSL). The kernels are built at different times
> and might include a different version of the driver. Printing the
> version allows to quickly determine what functionality and bug fixes are
> included in the driver.

Again, see above why your driver version is not needed.  Please remove
this line as well.

> I see that other in-tree drivers print some information during
> init_module (like nfblock.c).

Some older ones do, yes.  Please be good and follow the proper rules and
be quiet when booting.

> This is done for convenience. So instead of tracking multiple kernel
> versions, we can track the dxgkrnl driver version. I am ok to remove it
> if it really hurts something. It is only a single line per driver load.

The driver version _IS_ the kernel version once it is merged into the
kernel tree, as you rely on the whole of the kernel for your proper
functionality.  Drivers do not need individual versions.

Again, work with the experienced Linux developers at your company for
this type of thing.  They know this already, you don't need me to find
basic problems like this in your code :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-03 13:22             ` Greg KH
@ 2022-03-04 16:04               ` Wei Liu
  2022-03-04 17:55                 ` Greg KH
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2022-03-04 16:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Iouri Tarassov, Wei Liu, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Thu, Mar 03, 2022 at 02:22:32PM +0100, Greg KH wrote:
> On Wed, Mar 02, 2022 at 05:09:21PM -0800, Iouri Tarassov wrote:
> > 
> > On 3/1/2022 11:53 PM, Greg KH wrote:
> > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > +struct dxgglobal *dxgglobal;
> > > > > 
> > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > correctly.  Do it correctly please and have no global structures like
> > > > > this.
> > > > > 
> > > > 
> > > > This may not be as big an issue as you thought. The device discovery is
> > > > still done via the normal VMBus probing routine. For all intents and
> > > > purposes the dxgglobal structure can be broken down into per device
> > > > fields and a global structure which contains the protocol versioning
> > > > information -- my understanding is there will always be a global
> > > > structure to hold information related to the backend, regardless of how
> > > > many devices there are.
> > >
> > > Then that is wrong and needs to be fixed.  Drivers should almost never
> > > have any global data, that is not how Linux drivers work.  What happens
> > > when you get a second device in your system for this?  Major rework
> > > would have to happen and the code will break.  Handle that all now as it
> > > takes less work to make this per-device than it does to have a global
> > > variable.
> > >
> > > > I definitely think splitting is doable, but I also understand why Iouri
> > > > does not want to do it _now_ given there is no such a model for multiple
> > > > devices yet, so anything we put into the per-device structure could be
> > > > incomplete and it requires further changing when such a model arrives
> > > > later.
> > > > 
> > > > Iouri, please correct me if I have the wrong mental model here.
> > > > 
> > > > All in all, I hope this is not going to be a deal breaker for the
> > > > acceptance of this driver.
> > >
> > > For my reviews, yes it will be.
> > >
> > > Again, it should be easier to keep things in a per-device state than
> > > not as the proper lifetime rules and the like are automatically handled
> > > for you.  If you have global data, you have to manage that all on your
> > > own and it is _MUCH_ harder to review that you got it correct.
> > 
> > Hi Greg,
> > 
> > I do not really see how the driver be written without the global data. Let's review the design.
> 
> I see it the other way around.  It's easier to make it without a static
> structure, it is more work to keep it as you have done so here.  Do it
> correctly to start with and you will not have any of these issues going
> forward.
> 

> > Dxgkrnl acts as the aggregator of all virtual compute devices, projected by the host. It needs to do operations, which do not belong to a particular compute device. For example, cross device synchronization and resource sharing.

Hey Iouri, please fix your text wrapping.

Greg, I have to admit I only started paying close attention to this
series a few days ago, so I don't claim I know a lot.

> 
> Then hang your data off of your device node structure that you
> created. Why ignore that?
> 
> > A PCI device device is created for each virtual compute device.
> > Therefore, there should be a global list of objects and a mutex to
> > synchronize access to the list.
> 
> Woah, what?  You create a fake PCI device for each virtual device?  If
> so, great, then you are now a PCI bus and create the PCI devices
> properly so that the PCI core can handle and manage them and then
> assign them to your driver.  You should NEVER have a global list of
> these devices, as that is what the driver model should be managing.
> Not you!
> 

No, there is no fake PCI device. The device object is still coming from
the PCI core driver. There is code to match against PCI vendor ID and
device ID, and follow the usual way of managing PCI device.

Iouri understands device specific state should be encapsulated in the
private data field in their respective device. And I believe the code
can perhaps be rewritten to better conform to Linux kernel's model.

That should address the issue ...

> > A VMBus channel is offered by the host for each compute device. The
> > list of the VMBus channels should be global.
> 
> The vmbus channels are already handled by the driver core.  Use those
> devices that are given to you.  You don't need to manage them at all.
> 

here ...

> > A global VMBus channel is offered by the host. The channel does not
> > belong to any particular compute device, so it must be global.
> 
> That channel is attached to your driver, use the device given to your
> driver by the bus.  It's not "global" in any sense of the word.
> 

here ...

> And what's up with your lack of line wrapping?
> 
> > IO space is shared by all compute devices, so its parameters should
> > be global.
> 
> Huh?  If that's the case then you have bigger problems.  Use the aux
> bus for devices that share io space.  That is what it was created for,
> do not ignore the functionality that Linux already provides you by
> trying to go around it and writing your own code.  Use the frameworks
> we have already debugged and support.  This is why your Linux driver
> should be at least 1/3 smaller than drivers for other operating
> systems.
> 

To be fair, auxiliary bus was only added in 5.11, while this series was
written long before that. Unfortunately one only has so much time to
follow Linux kernel development closely. I admit this is the first time
I hear about it. :-)

> > Dxgkrnl needs to maintain a list of processes, which opened compute
> > device objects. Dxgkrnl maintains private state for each process and
> > when a process opens the /dev/dxg device, Dxgkrnl needs to find if
> > the process state is already created by walking the global process
> > list.
> 
> That "list" is handled by the device node structure that was opened.
> It's not "global" at all.  Again, just like any other device node in
> Linux, this isn't a new thing or anything special at all.
> 

Again, the state can be associated with the `private_data` field in
struct file.

> > Now, where to keep this global state? It could be kept in the
> > /dev/dxg private device structure. But this structure is not
> > available when, for example, dxg_pci_probe_device() or
> > dxg_probe_vmbus() is called.
> 
> Then your design is wrong.  It's as simple as that.  Fix it.
> 
> > Can there be multiple /dev/dxg devices? No. Because the /dev/dxg
> > device represents the driver itself, not a particular compute
> > device.
> 
> Then fix this.  Make your compute devices store the needed information
> when they are created.  Again, we have loads of examples in the kernel,
> this is nothing new.
> 

At this point, I think Iouri and I have settled on more encapsulation is
needed. Yet there is something I don't know how to square yet. That is,
devices (either from vmbus or pci) don't form a clear hierarchy. If
there isn't a linked list or some sort to organize them it would be
difficult to cross-reference. This then goes back to what I wrote
earlier in <20220302115334.wemdkznokszlzcpe@liuwe-devbox-debian-v2>. I
hope you won't be against using that pattern -- it is used in a lot of
places in tree.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-04 16:04               ` Wei Liu
@ 2022-03-04 17:55                 ` Greg KH
  0 siblings, 0 replies; 72+ messages in thread
From: Greg KH @ 2022-03-04 17:55 UTC (permalink / raw)
  To: Wei Liu
  Cc: Iouri Tarassov, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo, spronovo

On Fri, Mar 04, 2022 at 04:04:10PM +0000, Wei Liu wrote:
> > > A PCI device device is created for each virtual compute device.
> > > Therefore, there should be a global list of objects and a mutex to
> > > synchronize access to the list.
> > 
> > Woah, what?  You create a fake PCI device for each virtual device?  If
> > so, great, then you are now a PCI bus and create the PCI devices
> > properly so that the PCI core can handle and manage them and then
> > assign them to your driver.  You should NEVER have a global list of
> > these devices, as that is what the driver model should be managing.
> > Not you!
> > 
> 
> No, there is no fake PCI device. The device object is still coming from
> the PCI core driver. There is code to match against PCI vendor ID and
> device ID, and follow the usual way of managing PCI device.

So it is a real PCI device?  Why the confusion?

> Iouri understands device specific state should be encapsulated in the
> private data field in their respective device. And I believe the code
> can perhaps be rewritten to better conform to Linux kernel's model.

It has to follow the Linux kernel model, to think otherwise is quite
odd.

> > > IO space is shared by all compute devices, so its parameters should
> > > be global.
> > 
> > Huh?  If that's the case then you have bigger problems.  Use the aux
> > bus for devices that share io space.  That is what it was created for,
> > do not ignore the functionality that Linux already provides you by
> > trying to go around it and writing your own code.  Use the frameworks
> > we have already debugged and support.  This is why your Linux driver
> > should be at least 1/3 smaller than drivers for other operating
> > systems.
> > 
> 
> To be fair, auxiliary bus was only added in 5.11, while this series was
> written long before that. Unfortunately one only has so much time to
> follow Linux kernel development closely. I admit this is the first time
> I hear about it. :-)

5.11 was over a year ago.  We do not take "old" drivers into the kernel
tree, and if this series has not been touched for over a year, um, you
all have bigger problems here and are just wasting our time :(

> > Then fix this.  Make your compute devices store the needed information
> > when they are created.  Again, we have loads of examples in the kernel,
> > this is nothing new.
> > 
> 
> At this point, I think Iouri and I have settled on more encapsulation is
> needed. Yet there is something I don't know how to square yet. That is,
> devices (either from vmbus or pci) don't form a clear hierarchy.

That is because devices don't care where they are in the larger kernel
tree of devices, they are independant.  To try to claim there is a
needed hierarchy is quite odd and will cause lots of problems when that
heirachy changes by the underlying system.

> If
> there isn't a linked list or some sort to organize them it would be
> difficult to cross-reference.

You should never be touching another device from another one.  There are
a few rare exceptions but they are rare and you should not use them at
all.  And if you do have to do so, you better get the reference counting
logic correct.

good luck!

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-03 13:16                   ` Greg KH
@ 2022-03-09 23:14                     ` Steve Pronovost
  2022-03-10 10:13                       ` Greg KH
  0 siblings, 1 reply; 72+ messages in thread
From: Steve Pronovost @ 2022-03-09 23:14 UTC (permalink / raw)
  To: Greg KH, Iouri Tarassov
  Cc: Wei Liu, kys, haiyangz, sthemmin, linux-hyperv, linux-kernel, spronovo


On 3/3/22 05:16, Greg KH wrote:
> On Wed, Mar 02, 2022 at 02:27:05PM -0800, Iouri Tarassov wrote:
>> On 3/2/2022 12:20 PM, Greg KH wrote:
>>> On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
>>>> On 3/2/2022 3:53 AM, Wei Liu wrote:
>>>>> On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
>>>>>> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
>>>>>>>>> +struct dxgglobal *dxgglobal;
>>>>>>>> No, make this per-device, NEVER have a single device for your driver.
>>>>>>>> The Linux driver model makes it harder to do it this way than to do it
>>>>>>>> correctly.  Do it correctly please and have no global structures like
>>>>>>>> this.
>>>>>>>>
>>>>>>> This may not be as big an issue as you thought. The device discovery is
>>>>>>> still done via the normal VMBus probing routine. For all intents and
>>>>>>> purposes the dxgglobal structure can be broken down into per device
>>>>>>> fields and a global structure which contains the protocol versioning
>>>>>>> information -- my understanding is there will always be a global
>>>>>>> structure to hold information related to the backend, regardless of how
>>>>>>> many devices there are.
>>>>>> Then that is wrong and needs to be fixed.  Drivers should almost never
>>>>>> have any global data, that is not how Linux drivers work.  What happens
>>>>>> when you get a second device in your system for this?  Major rework
>>>>>> would have to happen and the code will break.  Handle that all now as it
>>>>>> takes less work to make this per-device than it does to have a global
>>>>>> variable.
>>>>>>
>>>>> It is perhaps easier to draw parallel from an existing driver. I feel
>>>>> like we're talking past each other.
>>>>>
>>>>> Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
>>>>> like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
>>>>> units will be added to the list. I this the current code is following
>>>>> this model. dxgglobal fulfills the role of a list.
>>>>>
>>>>> Setting aside the question of whether it makes sense to keep a copy of
>>>>> the per-VM state in each device instance, I can see the code be changed
>>>>> to:
>>>>>
>>>>>     struct mutex device_mutex; /* split out from dxgglobal */
>>>>>     static LIST_HEAD(dxglist);
>>>>>     
>>>>>     /* Rename struct dxgglobal to struct dxgstate */
>>>>>     struct dxgstate {
>>>>>        struct list_head dxglist; /* link for dxglist */
>>>>>        /* ... original fields sans device_mutex */
>>>>>     }
>>>>>
>>>>>     /*
>>>>>      * Provide a bunch of helpers manipulate the list. Called in probe /
>>>>>      * remove etc.
>>>>>      */
>>>>>     struct dxgstate *find_dxgstate(...);
>>>>>     void remove_dxgstate(...);
>>>>>     int add_dxgstate(...);
>>>>>
>>>>> This model is well understood and used in tree. It is just that it
>>>>> doesn't provide much value in doing this now since the list will only
>>>>> contain one element. I hope that you're not saying we cannot even use a
>>>>> per-module pointer to quickly get the data structure we want to use,
>>>>> right?
>>>>>
>>>>> Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
>>>>> into the device object? I think that can be done too.
>>>>>
>>>>> The code can be changed as:
>>>>>
>>>>>     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
>>>>>     struct dxgstate { ... };
>>>>>
>>>>>     static int dxg_probe_vmbus(...) {
>>>>>
>>>>>         /* probe successfully */
>>>>>
>>>>> 	struct dxgstate *state = kmalloc(...);
>>>>> 	/* Fill in dxgstate with information from backend */
>>>>>
>>>>> 	/* hdev->dev is the device object from the core driver framework */
>>>>> 	dev_set_drvdata(&hdev->dev, state);
>>>>>     }
>>>>>
>>>>>     static int dxg_remove_vmbus(...) {
>>>>>         /* Normal stuff here ...*/
>>>>>
>>>>> 	struct dxgstate *state = dev_get_drvdata(...);
>>>>> 	dev_set_drvdata(..., NULL);
>>>>> 	kfree(state);
>>>>>     }
>>>>>
>>>>>     /* In all other functions */
>>>>>     void do_things(...) {
>>>>>         struct dxgstate *state = dev_get_drvdata(...);
>>>>>
>>>>> 	/* Use state in place of where dxgglobal was needed */
>>>>>
>>>>>     }
>>>>>
>>>>> Iouri, notice this doesn't change anything regarding how userspace is
>>>>> designed. This is about how kernel organises its data.
>>>>>
>>>>> I hope what I wrote above can bring our understanding closer.
>>>>>
>>>>> Thanks,
>>>>> Wei.
>>>>
>>>> I can certainly remove dxgglobal and keep the  pointer to the global
>>>> state in the device object.
>>>>
>>>> This will require passing of the global pointer to all functions, which
>>>> need to access it.
>>>>
>>>>
>>>> Maybe my understanding of the Greg's suggestion was not correct. I
>>>> thought the suggestion was
>>>>
>>>> to have multiple /dev/dxgN devices (one per virtual compute device).
>>> You have one device per HV device, as the bus already provides you.
>>> That's all you really need, right?  Who would be opening the same device
>>> node multiple times?
>>>> This would change how the user mode
>>>> clients enumerate and communicate with compute devices.
>>> What does userspace have to do here?  It should just open the device
>>> node that is present when needed.  How will there be multiple userspace
>>> clients for a single HV device?
>>
>> Dxgkrnl creates a single user mode visible device node /dev/dxg.
> When you do that, you have a device to put all of your data on.  Use
> that.
>
>> It has
>> nothing to do with a specific hardware compute device on the host. Its
>> purpose is to provide services (IOCTLs) to enumerate and manage virtual
>> compute devices, which represent hardware devices on the host. The VMBus
>> devices are not used directly by user mode clients in the current design.
> That's horrid, why not just export the virtual compute devices properly
> through individual device nodes instead?
>
> In essence, you are creating a new syscall here to manage and handle
> devices for just your driver with the use of this ioctl.  That's
> generally a bad idea.
>
>> Virtual compute devices are shared between processes. There could be a
>> Cuda application, Gimp and a Direct3D12 application working at the same
>> time.
> Why are all of those applications sharing anything?  How are they
> sharing anything?  If you need to share something across processes, use
> the existing inter-process ways of sharing stuff that we have today (12+
> different apis and counting).  Don't create yet-another-one just for
> your tiny little driver here.  That's rude to the rest of the kernel.
>
>> This is what I mean by saying that there are multiple user mode
>> clients who use the /dev/dxg driver interface. Each of this applications
>> will open the /dev/dxg device node and enumerate/use virtual compute
>> devices.
> That seems like an odd model to follow.  How many virtual devices do you
> support?  Is there a limit?  Why not just enumerate them all to start
> with?  Or if there are too many, why not do it like the raw device and
> have a single device node that is used to create the virtual devices you
> wish to use?
>
>> If we change the way how the virtual compute devices are visible to user
>> mode, the Cuda runtime, Direct3D runtime would need to be changed.
> We do not care about existing userspace code at this point in time, you
> are trying to get the kernel api correct here.  Once you have done that,
> then you can go fix up your userspace code to work properly.  Anything
> that came before today is pointless here, right?  :)
>
>> I think we agreed that I will keep the global driver state in the device
>> object as Wei suggested and remove global variables. There still will be
>> a single /dev/dxg device node. Correct?
> I do not know, as I can't figure out your goals here at all, sorry.
>
> Please go work with some experienced Linux developers in your company.
> They should be the ones answering these questions for you, not me :)
>
> thanks,
>
> greg k-h
Thanks Greg and folks for all the feedback. The userspace API provided  
by this driver is specifically designed to match the Windows compute  
device abstraction in order to light up a large eco-system of graphics  
and compute APIs in our various virtualized Linux environment. This is  
a pretty critical design point as this is the only way we can scale to  
such a vast eco-system of APIs and their related tools. This enables  
APIs and framework which are available natively on Linux, such as  
CUDA, OpenGL, OpenCL, OpenVINO, OneAPI/L0, TensorFlow, Pytorch, soon  
Vulkan and more in the future and allows them to share the host GPU or  
compute device when run under our hypervisor. Most of these APIs sit  
directly on top of our userspace API and changing that abstraction  
would have major implications. Our D3D/DML userspace components are  
only implementation details for a subset of those APIs.
 
We have millions of developers experiencing and enjoying various Linux  
distributions through our Windows Subsystem for Linux (WSL) and this  
gives them a way to get hardware acceleration in a large number of  
important scenarios. WSL makes it trivial for developers to try out  
Linux totally risk free on over a billion devices, without having to  
commit to a native install upfront or understand and manually manage  
VMs. We think this is a pretty valuable proposition to both the  
Windows and the Linux community and one which has definitely been  
quite popular.
 
That being said, we recognize that our virtualization approach makes  
some folks in the community uncomfortable and the resulting userspace  
API exposed by our driver will look a bit different than typical Linux  
driver. Taking into consideration feedback and reception, we decided  
to pause our upstream effort for this driver. We will keep our driver  
open source and available in our downstream kernel tree for folks who  
wants to leverage it in their private kernel.
 
Thanks, Steve

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-09 23:14                     ` Steve Pronovost
@ 2022-03-10 10:13                       ` Greg KH
  2022-03-10 18:31                         ` Steve Pronovost
  0 siblings, 1 reply; 72+ messages in thread
From: Greg KH @ 2022-03-10 10:13 UTC (permalink / raw)
  To: Steve Pronovost
  Cc: Iouri Tarassov, Wei Liu, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo

On Wed, Mar 09, 2022 at 03:14:24PM -0800, Steve Pronovost wrote:
> 
> On 3/3/22 05:16, Greg KH wrote:
> > On Wed, Mar 02, 2022 at 02:27:05PM -0800, Iouri Tarassov wrote:
> >> On 3/2/2022 12:20 PM, Greg KH wrote:
> >>> On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> >>>> On 3/2/2022 3:53 AM, Wei Liu wrote:
> >>>>> On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> >>>>>> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> >>>>>>>>> +struct dxgglobal *dxgglobal;
> >>>>>>>> No, make this per-device, NEVER have a single device for your driver.
> >>>>>>>> The Linux driver model makes it harder to do it this way than to do it
> >>>>>>>> correctly.  Do it correctly please and have no global structures like
> >>>>>>>> this.
> >>>>>>>>
> >>>>>>> This may not be as big an issue as you thought. The device discovery is
> >>>>>>> still done via the normal VMBus probing routine. For all intents and
> >>>>>>> purposes the dxgglobal structure can be broken down into per device
> >>>>>>> fields and a global structure which contains the protocol versioning
> >>>>>>> information -- my understanding is there will always be a global
> >>>>>>> structure to hold information related to the backend, regardless of how
> >>>>>>> many devices there are.
> >>>>>> Then that is wrong and needs to be fixed.  Drivers should almost never
> >>>>>> have any global data, that is not how Linux drivers work.  What happens
> >>>>>> when you get a second device in your system for this?  Major rework
> >>>>>> would have to happen and the code will break.  Handle that all now as it
> >>>>>> takes less work to make this per-device than it does to have a global
> >>>>>> variable.
> >>>>>>
> >>>>> It is perhaps easier to draw parallel from an existing driver. I feel
> >>>>> like we're talking past each other.
> >>>>>
> >>>>> Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> >>>>> like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> >>>>> units will be added to the list. I this the current code is following
> >>>>> this model. dxgglobal fulfills the role of a list.
> >>>>>
> >>>>> Setting aside the question of whether it makes sense to keep a copy of
> >>>>> the per-VM state in each device instance, I can see the code be changed
> >>>>> to:
> >>>>>
> >>>>>     struct mutex device_mutex; /* split out from dxgglobal */
> >>>>>     static LIST_HEAD(dxglist);
> >>>>>     
> >>>>>     /* Rename struct dxgglobal to struct dxgstate */
> >>>>>     struct dxgstate {
> >>>>>        struct list_head dxglist; /* link for dxglist */
> >>>>>        /* ... original fields sans device_mutex */
> >>>>>     }
> >>>>>
> >>>>>     /*
> >>>>>      * Provide a bunch of helpers manipulate the list. Called in probe /
> >>>>>      * remove etc.
> >>>>>      */
> >>>>>     struct dxgstate *find_dxgstate(...);
> >>>>>     void remove_dxgstate(...);
> >>>>>     int add_dxgstate(...);
> >>>>>
> >>>>> This model is well understood and used in tree. It is just that it
> >>>>> doesn't provide much value in doing this now since the list will only
> >>>>> contain one element. I hope that you're not saying we cannot even use a
> >>>>> per-module pointer to quickly get the data structure we want to use,
> >>>>> right?
> >>>>>
> >>>>> Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> >>>>> into the device object? I think that can be done too.
> >>>>>
> >>>>> The code can be changed as:
> >>>>>
> >>>>>     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> >>>>>     struct dxgstate { ... };
> >>>>>
> >>>>>     static int dxg_probe_vmbus(...) {
> >>>>>
> >>>>>         /* probe successfully */
> >>>>>
> >>>>> 	struct dxgstate *state = kmalloc(...);
> >>>>> 	/* Fill in dxgstate with information from backend */
> >>>>>
> >>>>> 	/* hdev->dev is the device object from the core driver framework */
> >>>>> 	dev_set_drvdata(&hdev->dev, state);
> >>>>>     }
> >>>>>
> >>>>>     static int dxg_remove_vmbus(...) {
> >>>>>         /* Normal stuff here ...*/
> >>>>>
> >>>>> 	struct dxgstate *state = dev_get_drvdata(...);
> >>>>> 	dev_set_drvdata(..., NULL);
> >>>>> 	kfree(state);
> >>>>>     }
> >>>>>
> >>>>>     /* In all other functions */
> >>>>>     void do_things(...) {
> >>>>>         struct dxgstate *state = dev_get_drvdata(...);
> >>>>>
> >>>>> 	/* Use state in place of where dxgglobal was needed */
> >>>>>
> >>>>>     }
> >>>>>
> >>>>> Iouri, notice this doesn't change anything regarding how userspace is
> >>>>> designed. This is about how kernel organises its data.
> >>>>>
> >>>>> I hope what I wrote above can bring our understanding closer.
> >>>>>
> >>>>> Thanks,
> >>>>> Wei.
> >>>>
> >>>> I can certainly remove dxgglobal and keep the  pointer to the global
> >>>> state in the device object.
> >>>>
> >>>> This will require passing of the global pointer to all functions, which
> >>>> need to access it.
> >>>>
> >>>>
> >>>> Maybe my understanding of the Greg's suggestion was not correct. I
> >>>> thought the suggestion was
> >>>>
> >>>> to have multiple /dev/dxgN devices (one per virtual compute device).
> >>> You have one device per HV device, as the bus already provides you.
> >>> That's all you really need, right?  Who would be opening the same device
> >>> node multiple times?
> >>>> This would change how the user mode
> >>>> clients enumerate and communicate with compute devices.
> >>> What does userspace have to do here?  It should just open the device
> >>> node that is present when needed.  How will there be multiple userspace
> >>> clients for a single HV device?
> >>
> >> Dxgkrnl creates a single user mode visible device node /dev/dxg.
> > When you do that, you have a device to put all of your data on.  Use
> > that.
> >
> >> It has
> >> nothing to do with a specific hardware compute device on the host. Its
> >> purpose is to provide services (IOCTLs) to enumerate and manage virtual
> >> compute devices, which represent hardware devices on the host. The VMBus
> >> devices are not used directly by user mode clients in the current design.
> > That's horrid, why not just export the virtual compute devices properly
> > through individual device nodes instead?
> >
> > In essence, you are creating a new syscall here to manage and handle
> > devices for just your driver with the use of this ioctl.  That's
> > generally a bad idea.
> >
> >> Virtual compute devices are shared between processes. There could be a
> >> Cuda application, Gimp and a Direct3D12 application working at the same
> >> time.
> > Why are all of those applications sharing anything?  How are they
> > sharing anything?  If you need to share something across processes, use
> > the existing inter-process ways of sharing stuff that we have today (12+
> > different apis and counting).  Don't create yet-another-one just for
> > your tiny little driver here.  That's rude to the rest of the kernel.
> >
> >> This is what I mean by saying that there are multiple user mode
> >> clients who use the /dev/dxg driver interface. Each of this applications
> >> will open the /dev/dxg device node and enumerate/use virtual compute
> >> devices.
> > That seems like an odd model to follow.  How many virtual devices do you
> > support?  Is there a limit?  Why not just enumerate them all to start
> > with?  Or if there are too many, why not do it like the raw device and
> > have a single device node that is used to create the virtual devices you
> > wish to use?
> >
> >> If we change the way how the virtual compute devices are visible to user
> >> mode, the Cuda runtime, Direct3D runtime would need to be changed.
> > We do not care about existing userspace code at this point in time, you
> > are trying to get the kernel api correct here.  Once you have done that,
> > then you can go fix up your userspace code to work properly.  Anything
> > that came before today is pointless here, right?  :)
> >
> >> I think we agreed that I will keep the global driver state in the device
> >> object as Wei suggested and remove global variables. There still will be
> >> a single /dev/dxg device node. Correct?
> > I do not know, as I can't figure out your goals here at all, sorry.
> >
> > Please go work with some experienced Linux developers in your company.
> > They should be the ones answering these questions for you, not me :)
> >
> > thanks,
> >
> > greg k-h
> Thanks Greg and folks for all the feedback. The userspace API provided  
> by this driver is specifically designed to match the Windows compute  
> device abstraction in order to light up a large eco-system of graphics  
> and compute APIs in our various virtualized Linux environment. This is  
> a pretty critical design point as this is the only way we can scale to  
> such a vast eco-system of APIs and their related tools. This enables  
> APIs and framework which are available natively on Linux, such as  
> CUDA, OpenGL, OpenCL, OpenVINO, OneAPI/L0, TensorFlow, Pytorch, soon  
> Vulkan and more in the future and allows them to share the host GPU or  
> compute device when run under our hypervisor. Most of these APIs sit  
> directly on top of our userspace API and changing that abstraction  
> would have major implications. Our D3D/DML userspace components are  
> only implementation details for a subset of those APIs.

This feels odd to me as I was only discussing the in-kernel apis, and
how this driver does not follow them correctly at all (global device
state, odd device creation, etc.)  That should have nothing to do with
the user/kernel api at all.  And surely CUDA and the like never have
defined a specific user/kernal api that must be followed, right?

Yes, the existing user/kernel api in this driver is a bit odd, but
really, we haven't even started to review that.  For you all to just
cut-and-run at the first sign of a detailed in-kernel code review is
strange, why are you giving up now?

> We have millions of developers experiencing and enjoying various Linux  
> distributions through our Windows Subsystem for Linux (WSL) and this  
> gives them a way to get hardware acceleration in a large number of  
> important scenarios. WSL makes it trivial for developers to try out  
> Linux totally risk free on over a billion devices, without having to  
> commit to a native install upfront or understand and manually manage  
> VMs. We think this is a pretty valuable proposition to both the  
> Windows and the Linux community and one which has definitely been  
> quite popular.

This has nothing to do with how this tiny driver uses the in-kernel apis
for creating and managing devices.  You are discussing userspace
applications here, which is a nice marketing discussion about how many
developers you have, but means nothing when it comes to how the kernel
code works.  So why mention it at all?

> That being said, we recognize that our virtualization approach makes  
> some folks in the community uncomfortable and the resulting userspace  
> API exposed by our driver will look a bit different than typical Linux  
> driver. Taking into consideration feedback and reception, we decided  
> to pause our upstream effort for this driver. We will keep our driver  
> open source and available in our downstream kernel tree for folks who  
> wants to leverage it in their private kernel.

The problems I am pointing out here are how your kernel driver is
interacting with the kernel driver model.  It has almost nothing to do
with the user/kernel api at all.

And even if it did concern the user/kernel api, you already have a
userspace library that has to talk to this driver, all you need to do is
modify that to handle the expected changes that will happen when you
actually get experienced kernel developers to review your code and apis.

I don't understand what you all were expecting here.  Did you not want
to fit into the kernel tree and take advantage of using the proper
functions and interfaces that have been created already?  That should
enable your users to be able to rely on a more stable and secure and
widespread driver for everyone to use.

Instead you are saying that you don't appreciate the time and energy we
spent on trying to help you improve your code for your users, and wish
to just ignore our help?

By doing this, you are telling all community members that they should
not waste their time in reviewing your company's code submissions as
they are not appreciated or wanted unless they are just a rubber-stamp
approval.  That's going to ensure that no one ever reviews your
contributions again in the future, which is an odd request, don't you
think?

Oh well, it's easy to add a filter to just ignore any future code
submissions from your email domain.  I'll recommend that all other
kernel developers and maintainers do the same as we all have changes to
review from companies and developers that actually want our feedback.

best of luck!

greg k-h

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
  2022-03-10 10:13                       ` Greg KH
@ 2022-03-10 18:31                         ` Steve Pronovost
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Pronovost @ 2022-03-10 18:31 UTC (permalink / raw)
  To: Greg KH
  Cc: Iouri Tarassov, Wei Liu, kys, haiyangz, sthemmin, linux-hyperv,
	linux-kernel, spronovo


On 3/10/22 02:13, Greg KH wrote:
> On Wed, Mar 09, 2022 at 03:14:24PM -0800, Steve Pronovost wrote:
>> On 3/3/22 05:16, Greg KH wrote:
>>> On Wed, Mar 02, 2022 at 02:27:05PM -0800, Iouri Tarassov wrote:
>>>> On 3/2/2022 12:20 PM, Greg KH wrote:
>>>>> On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
>>>>>> On 3/2/2022 3:53 AM, Wei Liu wrote:
>>>>>>> On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
>>>>>>>> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
>>>>>>>>>>> +struct dxgglobal *dxgglobal;
>>>>>>>>>> No, make this per-device, NEVER have a single device for your driver.
>>>>>>>>>> The Linux driver model makes it harder to do it this way than to do it
>>>>>>>>>> correctly.  Do it correctly please and have no global structures like
>>>>>>>>>> this.
>>>>>>>>>>
>>>>>>>>> This may not be as big an issue as you thought. The device discovery is
>>>>>>>>> still done via the normal VMBus probing routine. For all intents and
>>>>>>>>> purposes the dxgglobal structure can be broken down into per device
>>>>>>>>> fields and a global structure which contains the protocol versioning
>>>>>>>>> information -- my understanding is there will always be a global
>>>>>>>>> structure to hold information related to the backend, regardless of how
>>>>>>>>> many devices there are.
>>>>>>>> Then that is wrong and needs to be fixed.  Drivers should almost never
>>>>>>>> have any global data, that is not how Linux drivers work.  What happens
>>>>>>>> when you get a second device in your system for this?  Major rework
>>>>>>>> would have to happen and the code will break.  Handle that all now as it
>>>>>>>> takes less work to make this per-device than it does to have a global
>>>>>>>> variable.
>>>>>>>>
>>>>>>> It is perhaps easier to draw parallel from an existing driver. I feel
>>>>>>> like we're talking past each other.
>>>>>>>
>>>>>>> Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
>>>>>>> like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
>>>>>>> units will be added to the list. I this the current code is following
>>>>>>> this model. dxgglobal fulfills the role of a list.
>>>>>>>
>>>>>>> Setting aside the question of whether it makes sense to keep a copy of
>>>>>>> the per-VM state in each device instance, I can see the code be changed
>>>>>>> to:
>>>>>>>
>>>>>>>     struct mutex device_mutex; /* split out from dxgglobal */
>>>>>>>     static LIST_HEAD(dxglist);
>>>>>>>     
>>>>>>>     /* Rename struct dxgglobal to struct dxgstate */
>>>>>>>     struct dxgstate {
>>>>>>>        struct list_head dxglist; /* link for dxglist */
>>>>>>>        /* ... original fields sans device_mutex */
>>>>>>>     }
>>>>>>>
>>>>>>>     /*
>>>>>>>      * Provide a bunch of helpers manipulate the list. Called in probe /
>>>>>>>      * remove etc.
>>>>>>>      */
>>>>>>>     struct dxgstate *find_dxgstate(...);
>>>>>>>     void remove_dxgstate(...);
>>>>>>>     int add_dxgstate(...);
>>>>>>>
>>>>>>> This model is well understood and used in tree. It is just that it
>>>>>>> doesn't provide much value in doing this now since the list will only
>>>>>>> contain one element. I hope that you're not saying we cannot even use a
>>>>>>> per-module pointer to quickly get the data structure we want to use,
>>>>>>> right?
>>>>>>>
>>>>>>> Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
>>>>>>> into the device object? I think that can be done too.
>>>>>>>
>>>>>>> The code can be changed as:
>>>>>>>
>>>>>>>     /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
>>>>>>>     struct dxgstate { ... };
>>>>>>>
>>>>>>>     static int dxg_probe_vmbus(...) {
>>>>>>>
>>>>>>>         /* probe successfully */
>>>>>>>
>>>>>>> 	struct dxgstate *state = kmalloc(...);
>>>>>>> 	/* Fill in dxgstate with information from backend */
>>>>>>>
>>>>>>> 	/* hdev->dev is the device object from the core driver framework */
>>>>>>> 	dev_set_drvdata(&hdev->dev, state);
>>>>>>>     }
>>>>>>>
>>>>>>>     static int dxg_remove_vmbus(...) {
>>>>>>>         /* Normal stuff here ...*/
>>>>>>>
>>>>>>> 	struct dxgstate *state = dev_get_drvdata(...);
>>>>>>> 	dev_set_drvdata(..., NULL);
>>>>>>> 	kfree(state);
>>>>>>>     }
>>>>>>>
>>>>>>>     /* In all other functions */
>>>>>>>     void do_things(...) {
>>>>>>>         struct dxgstate *state = dev_get_drvdata(...);
>>>>>>>
>>>>>>> 	/* Use state in place of where dxgglobal was needed */
>>>>>>>
>>>>>>>     }
>>>>>>>
>>>>>>> Iouri, notice this doesn't change anything regarding how userspace is
>>>>>>> designed. This is about how kernel organises its data.
>>>>>>>
>>>>>>> I hope what I wrote above can bring our understanding closer.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Wei.
>>>>>> I can certainly remove dxgglobal and keep the  pointer to the global
>>>>>> state in the device object.
>>>>>>
>>>>>> This will require passing of the global pointer to all functions, which
>>>>>> need to access it.
>>>>>>
>>>>>>
>>>>>> Maybe my understanding of the Greg's suggestion was not correct. I
>>>>>> thought the suggestion was
>>>>>>
>>>>>> to have multiple /dev/dxgN devices (one per virtual compute device).
>>>>> You have one device per HV device, as the bus already provides you.
>>>>> That's all you really need, right?  Who would be opening the same device
>>>>> node multiple times?
>>>>>> This would change how the user mode
>>>>>> clients enumerate and communicate with compute devices.
>>>>> What does userspace have to do here?  It should just open the device
>>>>> node that is present when needed.  How will there be multiple userspace
>>>>> clients for a single HV device?
>>>> Dxgkrnl creates a single user mode visible device node /dev/dxg.
>>> When you do that, you have a device to put all of your data on.  Use
>>> that.
>>>
>>>> It has
>>>> nothing to do with a specific hardware compute device on the host. Its
>>>> purpose is to provide services (IOCTLs) to enumerate and manage virtual
>>>> compute devices, which represent hardware devices on the host. The VMBus
>>>> devices are not used directly by user mode clients in the current design.
>>> That's horrid, why not just export the virtual compute devices properly
>>> through individual device nodes instead?
>>>
>>> In essence, you are creating a new syscall here to manage and handle
>>> devices for just your driver with the use of this ioctl.  That's
>>> generally a bad idea.
>>>
>>>> Virtual compute devices are shared between processes. There could be a
>>>> Cuda application, Gimp and a Direct3D12 application working at the same
>>>> time.
>>> Why are all of those applications sharing anything?  How are they
>>> sharing anything?  If you need to share something across processes, use
>>> the existing inter-process ways of sharing stuff that we have today (12+
>>> different apis and counting).  Don't create yet-another-one just for
>>> your tiny little driver here.  That's rude to the rest of the kernel.
>>>
>>>> This is what I mean by saying that there are multiple user mode
>>>> clients who use the /dev/dxg driver interface. Each of this applications
>>>> will open the /dev/dxg device node and enumerate/use virtual compute
>>>> devices.
>>> That seems like an odd model to follow.  How many virtual devices do you
>>> support?  Is there a limit?  Why not just enumerate them all to start
>>> with?  Or if there are too many, why not do it like the raw device and
>>> have a single device node that is used to create the virtual devices you
>>> wish to use?
>>>
>>>> If we change the way how the virtual compute devices are visible to user
>>>> mode, the Cuda runtime, Direct3D runtime would need to be changed.
>>> We do not care about existing userspace code at this point in time, you
>>> are trying to get the kernel api correct here.  Once you have done that,
>>> then you can go fix up your userspace code to work properly.  Anything
>>> that came before today is pointless here, right?  :)
>>>
>>>> I think we agreed that I will keep the global driver state in the device
>>>> object as Wei suggested and remove global variables. There still will be
>>>> a single /dev/dxg device node. Correct?
>>> I do not know, as I can't figure out your goals here at all, sorry.
>>>
>>> Please go work with some experienced Linux developers in your company.
>>> They should be the ones answering these questions for you, not me :)
>>>
>>> thanks,
>>>
>>> greg k-h
>> Thanks Greg and folks for all the feedback. The userspace API provided  
>> by this driver is specifically designed to match the Windows compute  
>> device abstraction in order to light up a large eco-system of graphics  
>> and compute APIs in our various virtualized Linux environment. This is  
>> a pretty critical design point as this is the only way we can scale to  
>> such a vast eco-system of APIs and their related tools. This enables  
>> APIs and framework which are available natively on Linux, such as  
>> CUDA, OpenGL, OpenCL, OpenVINO, OneAPI/L0, TensorFlow, Pytorch, soon  
>> Vulkan and more in the future and allows them to share the host GPU or  
>> compute device when run under our hypervisor. Most of these APIs sit  
>> directly on top of our userspace API and changing that abstraction  
>> would have major implications. Our D3D/DML userspace components are  
>> only implementation details for a subset of those APIs.
> This feels odd to me as I was only discussing the in-kernel apis, and
> how this driver does not follow them correctly at all (global device
> state, odd device creation, etc.)  That should have nothing to do with
> the user/kernel api at all.  And surely CUDA and the like never have
> defined a specific user/kernal api that must be followed, right?
>
> Yes, the existing user/kernel api in this driver is a bit odd, but
> really, we haven't even started to review that.  For you all to just
> cut-and-run at the first sign of a detailed in-kernel code review is
> strange, why are you giving up now?
>
>> We have millions of developers experiencing and enjoying various Linux  
>> distributions through our Windows Subsystem for Linux (WSL) and this  
>> gives them a way to get hardware acceleration in a large number of  
>> important scenarios. WSL makes it trivial for developers to try out  
>> Linux totally risk free on over a billion devices, without having to  
>> commit to a native install upfront or understand and manually manage  
>> VMs. We think this is a pretty valuable proposition to both the  
>> Windows and the Linux community and one which has definitely been  
>> quite popular.
> This has nothing to do with how this tiny driver uses the in-kernel apis
> for creating and managing devices.  You are discussing userspace
> applications here, which is a nice marketing discussion about how many
> developers you have, but means nothing when it comes to how the kernel
> code works.  So why mention it at all?
>
>> That being said, we recognize that our virtualization approach makes  
>> some folks in the community uncomfortable and the resulting userspace  
>> API exposed by our driver will look a bit different than typical Linux  
>> driver. Taking into consideration feedback and reception, we decided  
>> to pause our upstream effort for this driver. We will keep our driver  
>> open source and available in our downstream kernel tree for folks who  
>> wants to leverage it in their private kernel.
> The problems I am pointing out here are how your kernel driver is
> interacting with the kernel driver model.  It has almost nothing to do
> with the user/kernel api at all.
>
> And even if it did concern the user/kernel api, you already have a
> userspace library that has to talk to this driver, all you need to do is
> modify that to handle the expected changes that will happen when you
> actually get experienced kernel developers to review your code and apis.
>
> I don't understand what you all were expecting here.  Did you not want
> to fit into the kernel tree and take advantage of using the proper
> functions and interfaces that have been created already?  That should
> enable your users to be able to rely on a more stable and secure and
> widespread driver for everyone to use.
>
> Instead you are saying that you don't appreciate the time and energy we
> spent on trying to help you improve your code for your users, and wish
> to just ignore our help?
>
> By doing this, you are telling all community members that they should
> not waste their time in reviewing your company's code submissions as
> they are not appreciated or wanted unless they are just a rubber-stamp
> approval.  That's going to ensure that no one ever reviews your
> contributions again in the future, which is an odd request, don't you
> think?
>
> Oh well, it's easy to add a filter to just ignore any future code
> submissions from your email domain.  I'll recommend that all other
> kernel developers and maintainers do the same as we all have changes to
> review from companies and developers that actually want our feedback.
>
> best of luck!
>
> greg k-h
Hey Greg,
 
Apologies, I didn’t mean to imply we don’t appreciate the feedback from  
you or the community nor were we looking for a rubber stamp approval.  
You’ve provided great feedback and we’ll continue to evolve our driver  
toward this and we will continue to review internally with more  
experienced Linux kernel developers to get our driver into a better  
shape that can be upstreamed
 
It’s clear our code isn’t ready to be included upstream and we don’t  
want to waste reviewer’s time on it. We take all of the feedback  
seriously. Last year in our rfc, when we got  the feedback that we  
really needed an open source userspace, we went to great length to  
get one. We will continue to integrate all of the feedback we have  
gotten so far.
 
I totally agree that some of the feedback has nothing to do with the  
userspace API and we’ll evolve the driver toward that direction, it’s  
just goodness and will help make our driver simpler and more reliable.  
We will go back and figure out how to address this feedback, but it  
will take some time to do it right. The pause below wasn’t meant to  
imply we’re giving up forever, but that it will take a while before  
we can attempt again with a patch set that is hopefully better aligned  
and more mature in the future.
 
My email below was meant to be a thank you for the feedback, we’ll  
pause iteration on the current patch set since we have a lot of  
internalizing to do and some things that will take quite a while to  
address and we didn’t want to just go silent. but obviously I did a  
very poor job of communicating that. Let me assure you there is no ill  
feeling on our end and we do appreciate the feedback, it may take some  
time but we will be back again :-).
 
Thanks,
Steve

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2022-03-10 18:31 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-01 19:45 [PATCH v3 00/30] Driver for Hyper-v virtual compute device Iouri Tarassov
2022-03-01 19:45 ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Iouri Tarassov
2022-03-01 19:45   ` [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading Iouri Tarassov
2022-03-01 20:45     ` Greg KH
2022-03-01 22:23       ` Wei Liu
2022-03-01 22:47         ` Iouri Tarassov
2022-03-02  7:54           ` Greg KH
2022-03-02  7:53         ` Greg KH
2022-03-02 11:53           ` Wei Liu
2022-03-02 18:49             ` Iouri Tarassov
2022-03-02 20:20               ` Greg KH
2022-03-02 22:27                 ` Iouri Tarassov
2022-03-03 13:16                   ` Greg KH
2022-03-09 23:14                     ` Steve Pronovost
2022-03-10 10:13                       ` Greg KH
2022-03-10 18:31                         ` Steve Pronovost
2022-03-02 20:34               ` Wei Liu
2022-03-03  1:09           ` Iouri Tarassov
2022-03-03 13:22             ` Greg KH
2022-03-04 16:04               ` Wei Liu
2022-03-04 17:55                 ` Greg KH
2022-03-02 22:59       ` Iouri Tarassov
2022-03-03 13:29         ` Greg KH
2022-03-02 23:27       ` Iouri Tarassov
2022-03-03 13:10         ` Greg KH
2022-03-01 22:06     ` Wei Liu
2022-03-03  2:01       ` Iouri Tarassov
2022-03-01 19:45   ` [PATCH v3 03/30] drivers: hv: dxgkrnl: Add VM bus message support, initialize VM bus channels Iouri Tarassov
2022-03-01 22:57     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 04/30] drivers: hv: dxgkrnl: Creation of dxgadapter object Iouri Tarassov
2022-03-01 23:25     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 05/30] drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation Iouri Tarassov
2022-03-01 20:46     ` Greg KH
2022-03-01 20:46     ` Greg KH
2022-03-01 23:47     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 06/30] drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects Iouri Tarassov
2022-03-02 12:43     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 07/30] drivers: hv: dxgkrnl: Creation of dxgdevice objects Iouri Tarassov
2022-03-01 19:45   ` [PATCH v3 08/30] drivers: hv: dxgkrnl: Creation of dxgcontext objects Iouri Tarassov
2022-03-02 12:59     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 09/30] drivers: hv: dxgkrnl: Creation of compute device allocations and resources Iouri Tarassov
2022-03-01 19:45   ` [PATCH v3 10/30] drivers: hv: dxgkrnl: Creation of compute device sync objects Iouri Tarassov
2022-03-02 13:25     ` Wei Liu
2022-03-01 19:45   ` [PATCH v3 11/30] drivers: hv: dxgkrnl: Operations using " Iouri Tarassov
2022-03-01 19:45   ` [PATCH v3 12/30] drivers: hv: dxgkrnl: Sharing of dxgresource objects Iouri Tarassov
2022-03-02 14:13     ` Wei Liu
2022-03-02 14:15     ` Wei Liu
2022-03-01 19:46   ` [PATCH v3 13/30] drivers: hv: dxgkrnl: Sharing of sync objects Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 14/30] drivers: hv: dxgkrnl: Creation of hardware queues. Sync object operations to hw queue Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 15/30] drivers: hv: dxgkrnl: Creation of paging queue objects Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 16/30] drivers: hv: dxgkrnl: Submit execution commands to the compute device Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 17/30] drivers: hv: dxgkrnl: Share objects with the host Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 18/30] drivers: hv: dxgkrnl: Query the dxgdevice state Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 19/30] drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 20/30] drivers: hv: dxgkrnl: Manage device allocation properties Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 21/30] drivers: hv: dxgkrnl: Flush heap transitions Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 22/30] drivers: hv: dxgkrnl: Query video memory information Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 23/30] drivers: hv: dxgkrnl: The escape ioctl Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 24/30] drivers: hv: dxgkrnl: Ioctl to put device to error state Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 25/30] drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 26/30] drivers: hv: dxgkrnl: Offer and reclaim allocations Iouri Tarassov
2022-03-02 14:25     ` Wei Liu
2022-03-02 18:13       ` Iouri Tarassov
2022-03-02 18:23         ` Wei Liu
2022-03-01 19:46   ` [PATCH v3 27/30] drivers: hv: dxgkrnl: Ioctls to manage scheduling priority Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 28/30] drivers: hv: dxgkrnl: Manage residency of allocations Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 29/30] drivers: hv: dxgkrnl: Manage compute device virtual addresses Iouri Tarassov
2022-03-01 19:46   ` [PATCH v3 30/30] drivers: hv: dxgkrnl: Add support to map guest pages by host Iouri Tarassov
2022-03-02 14:29     ` Wei Liu
2022-03-01 21:37   ` [PATCH v3 01/30] drivers: hv: dxgkrnl: Add virtual compute device VM bus channel guids Wei Liu
2022-03-02  8:51 ` [PATCH v3 00/30] Driver for Hyper-v virtual compute device Christoph Hellwig
2022-03-02 14:53 ` Wei Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).