linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 00/17] HSM driver for ACRN hypervisor
@ 2020-09-22 11:42 shuo.a.liu
  2020-09-22 11:42 ` [PATCH v4 01/17] docs: acrn: Introduce ACRN shuo.a.liu
                   ` (13 more replies)
  0 siblings, 14 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu

From: Shuo Liu <shuo.a.liu@intel.com>

ACRN is a Type 1 reference hypervisor stack, running directly on the bare-metal
hardware, and is suitable for a variety of IoT and embedded device solutions.

ACRN implements a hybrid VMM architecture, using a privileged Service VM. The
Service VM manages the system resources (CPU, memory, etc.) and I/O devices of
User VMs. Multiple User VMs are supported, with each of them running Linux,
Android OS or Windows. Both the Service VM and User VMs are guest VMs.

The figure below shows the architecture.

                Service VM                    User VM
      +----------------------------+  |  +------------------+
      |        +--------------+    |  |  |                  |
      |        |ACRN userspace|    |  |  |                  |
      |        +--------------+    |  |  |                  |
      |-----------------ioctl------|  |  |                  |   ...
      |kernel space   +----------+ |  |  |                  |
      |               |   HSM    | |  |  | Drivers          |
      |               +----------+ |  |  |                  |
      +--------------------|-------+  |  +------------------+
  +---------------------hypercall----------------------------------------+
  |                       ACRN Hypervisor                                |
  +----------------------------------------------------------------------+
  |                          Hardware                                    |
  +----------------------------------------------------------------------+

There is only one Service VM, which can run Linux as its OS.

In a typical case, the Service VM is started automatically when the ACRN
Hypervisor boots. The ACRN userspace (an application running in the Service
VM) can then start and stop User VMs by communicating with the ACRN Hypervisor
Service Module (HSM).

The ACRN Hypervisor Service Module (HSM) is a middle layer that allows the ACRN
userspace and the Service VM OS kernel to communicate with the ACRN Hypervisor
and manage different User VMs. This middle layer provides the following
functionality:
  - Issues hypercalls to the hypervisor to manage User VMs:
      * VM/vCPU management
      * Memory management
      * Device passthrough
      * Interrupt injection
  - Handles I/O requests from User VMs.
  - Exports ioctls through the HSM char device.
  - Exports function calls for other kernel modules
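
As a rough illustration of the ioctl path listed above, the sketch below shows
how a Service VM application might obtain a handle to the HSM char device. The
path /dev/acrn_hsm is the one named in patches 01 and 05 of this series. The
helper names sketch_open_hsm() and sketch_close_hsm() are hypothetical, and no
ioctl is issued because the ioctl numbers are defined in
include/uapi/linux/acrn.h, which is not shown here.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Device node exported by HSM according to this series. */
#define SKETCH_HSM_PATH "/dev/acrn_hsm"

/*
 * Hypothetical helper: open the HSM char device. The path is a parameter
 * so the helper can also be exercised outside a Service VM.
 * Returns a file descriptor on success or a negative errno value.
 */
static int sketch_open_hsm(const char *path)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -errno;
	return fd;
}

/* Hypothetical helper: release a handle obtained from sketch_open_hsm(). */
static void sketch_close_hsm(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

A caller would pass SKETCH_HSM_PATH and treat a negative return value as
"HSM is not available", for example when not running in the Service VM.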

ACRN is focused on embedded systems, so it does not support some features,
e.g.:
  - ACRN doesn't support VM migration.
  - ACRN doesn't support vCPU migration.

This patch set adds the HSM to the Linux kernel.

The basic ACRN support has already been merged upstream:
https://lore.kernel.org/lkml/1559108037-18813-3-git-send-email-yakui.zhao@intel.com/

ChangeLog:
v4:
  - Used acrn_dev.this_device directly for dev_*() (Reinette)
  - Removed the odd usage of {get|put}_device() on &acrn_dev->this_device (Greg)
  - Removed unused log code. (Greg)
  - Corrected the return error values. (Greg)
  - Mentioned that HSM relies on the hypervisor for sanity checks in acrn_dev_ioctl() comments (Greg)

v3:
  - Used {get|put}_device() helpers on &acrn_dev->this_device
  - Moved unused code from front patches to later ones.
  - Removed self-defined pr_fmt() and dev_fmt()
  - Provided comments for acrn_vm_list_lock.

v2:
  - Removed API version related code. (Dave)
  - Replaced pr_*() by dev_*(). (Greg)
  - Used -ENOTTY as the error code of unsupported ioctl. (Greg)

Shuo Liu (16):
  docs: acrn: Introduce ACRN
  x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  x86/acrn: Introduce hypercall interfaces
  virt: acrn: Introduce ACRN HSM basic driver
  virt: acrn: Introduce VM management interfaces
  virt: acrn: Introduce an ioctl to set vCPU registers state
  virt: acrn: Introduce EPT mapping management
  virt: acrn: Introduce I/O request management
  virt: acrn: Introduce PCI configuration space PIO accesses combiner
  virt: acrn: Introduce interfaces for PCI device passthrough
  virt: acrn: Introduce interrupt injection interfaces
  virt: acrn: Introduce interfaces to query C-states and P-states
    allowed by hypervisor
  virt: acrn: Introduce I/O ranges operation interfaces
  virt: acrn: Introduce ioeventfd
  virt: acrn: Introduce irqfd
  virt: acrn: Introduce an interface for Service VM to control vCPU

Yin Fengwei (1):
  x86/acrn: Introduce an API to check if a VM is privileged

 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 Documentation/virt/acrn/index.rst             |  11 +
 Documentation/virt/acrn/introduction.rst      |  40 ++
 Documentation/virt/acrn/io-request.rst        |  97 +++
 Documentation/virt/index.rst                  |   1 +
 MAINTAINERS                                   |   9 +
 arch/x86/include/asm/acrn.h                   |  74 ++
 arch/x86/kernel/cpu/acrn.c                    |  35 +-
 drivers/virt/Kconfig                          |   2 +
 drivers/virt/Makefile                         |   1 +
 drivers/virt/acrn/Kconfig                     |  15 +
 drivers/virt/acrn/Makefile                    |   3 +
 drivers/virt/acrn/acrn_drv.h                  | 229 +++++++
 drivers/virt/acrn/hsm.c                       | 437 ++++++++++++
 drivers/virt/acrn/hypercall.h                 | 254 +++++++
 drivers/virt/acrn/ioeventfd.c                 | 273 ++++++++
 drivers/virt/acrn/ioreq.c                     | 645 ++++++++++++++++++
 drivers/virt/acrn/irqfd.c                     | 235 +++++++
 drivers/virt/acrn/mm.c                        | 305 +++++++++
 drivers/virt/acrn/vm.c                        | 126 ++++
 include/uapi/linux/acrn.h                     | 486 +++++++++++++
 21 files changed, 3278 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/virt/acrn/index.rst
 create mode 100644 Documentation/virt/acrn/introduction.rst
 create mode 100644 Documentation/virt/acrn/io-request.rst
 create mode 100644 arch/x86/include/asm/acrn.h
 create mode 100644 drivers/virt/acrn/Kconfig
 create mode 100644 drivers/virt/acrn/Makefile
 create mode 100644 drivers/virt/acrn/acrn_drv.h
 create mode 100644 drivers/virt/acrn/hsm.c
 create mode 100644 drivers/virt/acrn/hypercall.h
 create mode 100644 drivers/virt/acrn/ioeventfd.c
 create mode 100644 drivers/virt/acrn/ioreq.c
 create mode 100644 drivers/virt/acrn/irqfd.c
 create mode 100644 drivers/virt/acrn/mm.c
 create mode 100644 drivers/virt/acrn/vm.c
 create mode 100644 include/uapi/linux/acrn.h


base-commit: 18445bf405cb331117bc98427b1ba6f12418ad17
-- 
2.28.0



* [PATCH v4 01/17] docs: acrn: Introduce ACRN
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
@ 2020-09-22 11:42 ` shuo.a.liu
  2020-10-09  1:48   ` Randy Dunlap
  2020-09-22 11:42 ` [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler() shuo.a.liu
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

Add documentation on the following aspects of ACRN:

  1) A brief introduction to the architecture of ACRN.
  2) I/O request handling in ACRN.

To learn more about ACRN, please go to the ACRN project website
https://projectacrn.org, or the documentation page
https://projectacrn.github.io/.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/virt/acrn/index.rst        | 11 +++
 Documentation/virt/acrn/introduction.rst | 40 ++++++++++
 Documentation/virt/acrn/io-request.rst   | 97 ++++++++++++++++++++++++
 Documentation/virt/index.rst             |  1 +
 MAINTAINERS                              |  7 ++
 5 files changed, 156 insertions(+)
 create mode 100644 Documentation/virt/acrn/index.rst
 create mode 100644 Documentation/virt/acrn/introduction.rst
 create mode 100644 Documentation/virt/acrn/io-request.rst

diff --git a/Documentation/virt/acrn/index.rst b/Documentation/virt/acrn/index.rst
new file mode 100644
index 000000000000..e3cf99033bdb
--- /dev/null
+++ b/Documentation/virt/acrn/index.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+ACRN Hypervisor
+===============
+
+.. toctree::
+   :maxdepth: 1
+
+   introduction
+   io-request
diff --git a/Documentation/virt/acrn/introduction.rst b/Documentation/virt/acrn/introduction.rst
new file mode 100644
index 000000000000..6b44924d5c0e
--- /dev/null
+++ b/Documentation/virt/acrn/introduction.rst
@@ -0,0 +1,40 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+ACRN Hypervisor Introduction
+============================
+
+The ACRN Hypervisor is a Type 1 hypervisor, running directly on the bare-metal
+hardware. It has a privileged management VM, called Service VM, to manage User
+VMs and do I/O emulation.
+
+ACRN userspace is an application running in the Service VM that emulates
+devices for a User VM based on command line configurations. ACRN Hypervisor
+Service Module (HSM) is a kernel module in the Service VM which provides
+hypervisor services to the ACRN userspace.
+
+The figure below shows the architecture.
+
+::
+
+                Service VM                    User VM
+      +----------------------------+  |  +------------------+
+      |        +--------------+    |  |  |                  |
+      |        |ACRN userspace|    |  |  |                  |
+      |        +--------------+    |  |  |                  |
+      |-----------------ioctl------|  |  |                  |   ...
+      |kernel space   +----------+ |  |  |                  |
+      |               |   HSM    | |  |  | Drivers          |
+      |               +----------+ |  |  |                  |
+      +--------------------|-------+  |  +------------------+
+  +---------------------hypercall----------------------------------------+
+  |                         ACRN Hypervisor                              |
+  +----------------------------------------------------------------------+
+  |                          Hardware                                    |
+  +----------------------------------------------------------------------+
+
+ACRN userspace allocates memory for the User VM, configures and initializes the
+devices used by the User VM, loads the virtual bootloader, initializes the
+virtual CPU state and handles I/O request accesses from the User VM. It uses
+ioctls to communicate with the HSM. HSM implements hypervisor services by
+interacting with the ACRN Hypervisor via hypercalls. HSM exports a char device
+interface (/dev/acrn_hsm) to userspace.
diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst
new file mode 100644
index 000000000000..019dc5978f7c
--- /dev/null
+++ b/Documentation/virt/acrn/io-request.rst
@@ -0,0 +1,97 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+I/O request handling
+====================
+
+An I/O request of a User VM, which is constructed by the hypervisor, is
+distributed by the ACRN Hypervisor Service Module to an I/O client
+corresponding to the address range of the I/O request. Details of I/O request
+handling are described in the following sections.
+
+1. I/O request
+--------------
+
+For each User VM, there is a shared 4-KByte memory region used for I/O request
+communication between the hypervisor and the Service VM. An I/O request is a
+256-byte buffer, described by 'struct acrn_io_request', that is filled by
+an I/O handler of the hypervisor when a trapped I/O access happens in a User
+VM. ACRN userspace in the Service VM first allocates a 4-KByte page and passes
+the GPA (Guest Physical Address) of the buffer to the hypervisor. The buffer is
+used as an array of 16 I/O request slots with each I/O request slot being 256
+bytes. This array is indexed by vCPU ID.
+
+2. I/O clients
+--------------
+
+An I/O client is responsible for handling User VM I/O requests whose accessed
+GPA falls in a certain range. Multiple I/O clients can be associated with each
+User VM. There is a special client associated with each User VM, called the
+default client, that handles all I/O requests that do not fit into the range of
+any other clients. The ACRN userspace acts as the default client for each User
+VM.
+
+The illustration below shows the relationship between the I/O request shared
+buffer, I/O requests and I/O clients.
+
+::
+
+     +------------------------------------------------------+
+     |                                       Service VM     |
+     |+--------------------------------------------------+  |
+     ||      +----------------------------------------+  |  |
+     ||      | shared page            ACRN userspace  |  |  |
+     ||      |    +-----------------+  +------------+ |  |  |
+     ||   +----+->| acrn_io_request |<-+  default   | |  |  |
+     ||   |  | |  +-----------------+  | I/O client | |  |  |
+     ||   |  | |  |       ...       |  +------------+ |  |  |
+     ||   |  | |  +-----------------+                 |  |  |
+     ||   |  +-|--------------------------------------+  |  |
+     ||---|----|-----------------------------------------|  |
+     ||   |    |                             kernel      |  |
+     ||   |    |            +----------------------+     |  |
+     ||   |    |            | +-------------+  HSM |     |  |
+     ||   |    +--------------+             |      |     |  |
+     ||   |                 | | I/O clients |      |     |  |
+     ||   |                 | |             |      |     |  |
+     ||   |                 | +-------------+      |     |  |
+     ||   |                 +----------------------+     |  |
+     |+---|----------------------------------------------+  |
+     +----|-------------------------------------------------+
+          |
+     +----|-------------------------------------------------+
+     |  +-+-----------+                                     |
+     |  | I/O handler |              ACRN Hypervisor        |
+     |  +-------------+                                     |
+     +------------------------------------------------------+
+
+3. I/O request state transition
+-------------------------------
+
+The state transitions of an ACRN I/O request are as follows.
+
+::
+
+   FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...
+
+- FREE: this I/O request slot is empty
+- PENDING: a valid I/O request is pending in this slot
+- PROCESSING: the I/O request is being processed
+- COMPLETE: the I/O request has been processed
+
+An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and
+ACRN userspace are in charge of processing the others.
+
+4. Processing flow of I/O requests
+----------------------------------
+
+a. The I/O handler of the hypervisor will fill an I/O request with PENDING
+   state when a trapped I/O access happens in a User VM.
+b. The hypervisor makes an upcall, which is a notification interrupt, to
+   the Service VM.
+c. The upcall handler schedules a tasklet to dispatch I/O requests.
+d. The tasklet looks for the PENDING I/O requests, assigns them to different
+   registered clients based on the address of the I/O accesses, updates
+   their state to PROCESSING, and notifies the corresponding clients.
+e. The notified client handles the assigned I/O requests.
+f. The HSM updates the I/O request states to COMPLETE and notifies the
+   hypervisor of the completion via hypercalls.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index de1ab81df958..c10b519507f5 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -11,6 +11,7 @@ Linux Virtualization Support
    uml/user_mode_linux
    paravirt_ops
    guest-halt-polling
+   acrn/index
 
 .. only:: html and subproject
 
diff --git a/MAINTAINERS b/MAINTAINERS
index deaafb617361..e0fea5e464b4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,13 @@ S:	Orphan
 F:	drivers/platform/x86/wmi.c
 F:	include/uapi/linux/wmi.h
 
+ACRN HYPERVISOR SERVICE MODULE
+M:	Shuo Liu <shuo.a.liu@intel.com>
+L:	acrn-dev@lists.projectacrn.org
+S:	Supported
+W:	https://projectacrn.org
+F:	Documentation/virt/acrn/
+
 AD1889 ALSA SOUND DRIVER
 L:	linux-parisc@vger.kernel.org
 S:	Maintained
-- 
2.28.0
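
The shared-page layout described in io-request.rst above (one 4-KByte page
holding 16 slots of 256 bytes, indexed by vCPU ID) can be sketched as follows.
This is a minimal user-space model, not the driver's code: all sketch_* names
are hypothetical, and the slot is an opaque byte array because the fields of
'struct acrn_io_request' are defined in include/uapi/linux/acrn.h, which is
not part of this message.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IOREQ_PAGE_SIZE	4096u	/* shared 4-KByte page */
#define SKETCH_IOREQ_SLOT_SIZE	256u	/* one I/O request slot */
#define SKETCH_IOREQ_SLOTS	(SKETCH_IOREQ_PAGE_SIZE / SKETCH_IOREQ_SLOT_SIZE)

/* Opaque stand-in for 'struct acrn_io_request'; only its size is modelled. */
struct sketch_io_request {
	uint8_t raw[SKETCH_IOREQ_SLOT_SIZE];
};

/* The shared page viewed as an array of slots indexed by vCPU ID. */
struct sketch_ioreq_page {
	struct sketch_io_request slot[SKETCH_IOREQ_SLOTS];
};

/* Return the slot for a vCPU, or NULL if the ID is out of range. */
static struct sketch_io_request *
sketch_ioreq_slot(struct sketch_ioreq_page *page, unsigned int vcpu_id)
{
	if (vcpu_id >= SKETCH_IOREQ_SLOTS)
		return NULL;
	return &page->slot[vcpu_id];
}
```

The slot count is derived from the two sizes rather than hard-coded, so the
relationship 4096 / 256 = 16 stated in the documentation stays visible.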


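The state transitions and the client selection described in sections 2 to 4
of io-request.rst above can be modelled roughly as below. This is a sketch
under stated assumptions: the sketch_* names are hypothetical, the numeric
values of the states are arbitrary, address ranges are assumed to be
inclusive, and a NULL result stands for "hand the request to the default
client".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* States from io-request.rst; the numeric values here are arbitrary. */
enum sketch_ioreq_state {
	SKETCH_FREE,
	SKETCH_PENDING,
	SKETCH_PROCESSING,
	SKETCH_COMPLETE,
};

/* FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... */
static enum sketch_ioreq_state sketch_next_state(enum sketch_ioreq_state s)
{
	switch (s) {
	case SKETCH_FREE:
		return SKETCH_PENDING;
	case SKETCH_PENDING:
		return SKETCH_PROCESSING;
	case SKETCH_PROCESSING:
		return SKETCH_COMPLETE;
	default:
		return SKETCH_FREE;
	}
}

/* COMPLETE and FREE slots are owned by the hypervisor. */
static bool sketch_owned_by_hypervisor(enum sketch_ioreq_state s)
{
	return s == SKETCH_COMPLETE || s == SKETCH_FREE;
}

/* Hypothetical I/O client covering the inclusive range [start, end]. */
struct sketch_client {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick the client whose range contains addr. NULL means no registered
 * range matched, i.e. the default client (ACRN userspace) handles it.
 */
static const struct sketch_client *
sketch_find_client(const struct sketch_client *clients, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= clients[i].start && addr <= clients[i].end)
			return &clients[i];
	}
	return NULL;
}
```
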

* [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
  2020-09-22 11:42 ` [PATCH v4 01/17] docs: acrn: Introduce ACRN shuo.a.liu
@ 2020-09-22 11:42 ` shuo.a.liu
  2020-09-27 10:49   ` Greg Kroah-Hartman
  2020-09-29 18:01   ` Borislav Petkov
  2020-09-22 11:42 ` [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged shuo.a.liu
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Yakui Zhao, Zhi Wang, Dave Hansen,
	Dan Williams, Fengwei Yin, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

The ACRN Hypervisor builds an I/O request when a trapped I/O access
happens in a User VM. Then, the ACRN Hypervisor issues an upcall by sending
a notification interrupt to the Service VM. HSM in the Service VM needs
to hook the notification interrupt to handle I/O requests.

Notification interrupts from the ACRN Hypervisor are already supported, and
the interrupt handler invokes a callback that is currently left uninitialized.

Export two APIs for HSM to set up/remove its callback.

Originally-by: Yakui Zhao <yakui.zhao@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/acrn.h |  8 ++++++++
 arch/x86/kernel/cpu/acrn.c  | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 arch/x86/include/asm/acrn.h

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
new file mode 100644
index 000000000000..ff259b69cde7
--- /dev/null
+++ b/arch/x86/include/asm/acrn.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_ACRN_H
+#define _ASM_X86_ACRN_H
+
+void acrn_setup_intr_handler(void (*handler)(void));
+void acrn_remove_intr_handler(void);
+
+#endif /* _ASM_X86_ACRN_H */
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index 0b2c03943ac6..42e88d01ccf9 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -9,7 +9,11 @@
  *
  */
 
+#define pr_fmt(fmt) "acrn: " fmt
+
 #include <linux/interrupt.h>
+
+#include <asm/acrn.h>
 #include <asm/apic.h>
 #include <asm/cpufeatures.h>
 #include <asm/desc.h>
@@ -55,6 +59,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_acrn_hv_callback)
 	set_irq_regs(old_regs);
 }
 
+void acrn_setup_intr_handler(void (*handler)(void))
+{
+	acrn_intr_handler = handler;
+}
+EXPORT_SYMBOL_GPL(acrn_setup_intr_handler);
+
+void acrn_remove_intr_handler(void)
+{
+	acrn_intr_handler = NULL;
+}
+EXPORT_SYMBOL_GPL(acrn_remove_intr_handler);
+
 const __initconst struct hypervisor_x86 x86_hyper_acrn = {
 	.name                   = "ACRN",
 	.detect                 = acrn_detect,
-- 
2.28.0
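
The callback hook added by the patch above can be modelled in user space as
follows. This is only a sketch: the sketch_* names are hypothetical, and the
NULL check in sketch_hv_callback() is an assumption about how the existing
notification interrupt path treats an unset callback, since that code is
outside the diff shown here.

```c
#include <stddef.h>

/* Callback slot; NULL means no handler is installed. */
static void (*sketch_intr_handler)(void);

/* Counterpart of acrn_setup_intr_handler(): install a handler. */
static void sketch_setup_intr_handler(void (*handler)(void))
{
	sketch_intr_handler = handler;
}

/* Counterpart of acrn_remove_intr_handler(): clear the handler. */
static void sketch_remove_intr_handler(void)
{
	sketch_intr_handler = NULL;
}

/* Stand-in for the notification interrupt path: call the handler if set. */
static void sketch_hv_callback(void)
{
	if (sketch_intr_handler)
		sketch_intr_handler();
}

/* Example handler that only counts upcalls. */
static unsigned int sketch_upcalls;

static void sketch_count_upcall(void)
{
	sketch_upcalls++;
}
```

In this model a module would call sketch_setup_intr_handler() when it loads
and sketch_remove_intr_handler() when it unloads. The sketch ignores the
question of an interrupt arriving concurrently with removal.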



* [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
  2020-09-22 11:42 ` [PATCH v4 01/17] docs: acrn: Introduce ACRN shuo.a.liu
  2020-09-22 11:42 ` [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler() shuo.a.liu
@ 2020-09-22 11:42 ` shuo.a.liu
  2020-09-30  8:09   ` Borislav Petkov
  2020-09-22 11:42 ` [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces shuo.a.liu
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yin Fengwei, Shuo Liu, Dave Hansen,
	Dan Williams, Zhi Wang, Zhenyu Wang

From: Yin Fengwei <fengwei.yin@intel.com>

The ACRN Hypervisor reports hypervisor features via CPUID leaf 0x40000001,
which is similar to KVM. A VM can check whether it is the privileged VM using
the feature bits. The Service VM is the only privileged VM by design.

Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/acrn.h |  9 +++++++++
 arch/x86/kernel/cpu/acrn.c  | 19 ++++++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
index ff259b69cde7..a2d4aea3a80d 100644
--- a/arch/x86/include/asm/acrn.h
+++ b/arch/x86/include/asm/acrn.h
@@ -2,7 +2,16 @@
 #ifndef _ASM_X86_ACRN_H
 #define _ASM_X86_ACRN_H
 
+/*
+ * This CPUID returns feature bitmaps in EAX.
+ * Guest VM uses this to detect the appropriate feature bit.
+ */
+#define	ACRN_CPUID_FEATURES		0x40000001
+/* Bit 0 indicates whether guest VM is privileged */
+#define	ACRN_FEATURE_PRIVILEGED_VM	BIT(0)
+
 void acrn_setup_intr_handler(void (*handler)(void));
 void acrn_remove_intr_handler(void);
+bool acrn_is_privileged_vm(void);
 
 #endif /* _ASM_X86_ACRN_H */
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index 42e88d01ccf9..b04fef8bd50b 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -21,9 +21,26 @@
 #include <asm/idtentry.h>
 #include <asm/irq_regs.h>
 
+static u32 acrn_cpuid_base(void)
+{
+	static u32 acrn_cpuid_base;
+
+	if (!acrn_cpuid_base && boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		acrn_cpuid_base = hypervisor_cpuid_base("ACRNACRNACRN", 0);
+
+	return acrn_cpuid_base;
+}
+
+bool acrn_is_privileged_vm(void)
+{
+	return cpuid_eax(acrn_cpuid_base() | ACRN_CPUID_FEATURES) &
+			 ACRN_FEATURE_PRIVILEGED_VM;
+}
+EXPORT_SYMBOL_GPL(acrn_is_privileged_vm);
+
 static u32 __init acrn_detect(void)
 {
-	return hypervisor_cpuid_base("ACRNACRNACRN", 0);
+	return acrn_cpuid_base();
 }
 
 static void __init acrn_init_platform(void)
-- 
2.28.0
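
The feature check in the patch above boils down to two small steps: form the
CPUID leaf from the detected base, then test bit 0 of the returned EAX. The
sketch below models both as pure functions so they can run anywhere. The
sketch_* names are hypothetical, the constants mirror ACRN_CPUID_FEATURES and
ACRN_FEATURE_PRIVILEGED_VM from the diff, and the CPUID instruction itself is
deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors ACRN_CPUID_FEATURES from the patch. */
#define SKETCH_CPUID_FEATURES		0x40000001u
/* Mirrors ACRN_FEATURE_PRIVILEGED_VM (bit 0) from the patch. */
#define SKETCH_FEATURE_PRIVILEGED_VM	(1u << 0)

/* Leaf queried by the patch: the detected base OR-ed with the feature leaf. */
static uint32_t sketch_feature_leaf(uint32_t cpuid_base)
{
	return cpuid_base | SKETCH_CPUID_FEATURES;
}

/* Decode the EAX value returned for that leaf. */
static bool sketch_is_privileged(uint32_t eax)
{
	return (eax & SKETCH_FEATURE_PRIVILEGED_VM) != 0;
}
```

Note that a base of 0 still yields leaf 0x40000001 in this model, so the
check presumably only makes sense once ACRN has been detected.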



* [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (2 preceding siblings ...)
  2020-09-22 11:42 ` [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged shuo.a.liu
@ 2020-09-22 11:42 ` shuo.a.liu
  2020-09-27 10:51   ` Greg Kroah-Hartman
  2020-09-30 10:54   ` Borislav Petkov
  2020-09-22 11:42 ` [PATCH v4 05/17] virt: acrn: Introduce ACRN HSM basic driver shuo.a.liu
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Yakui Zhao, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

The Service VM communicates with the hypervisor via conventional
hypercalls. The VMCALL instruction is used to make the hypercalls.

ACRN hypercall ABI:
  * Hypercall number is in R8 register.
  * Up to 2 parameters are in RDI and RSI registers.
  * Return value is in RAX register.

Introduce the ACRN hypercall interfaces. Because GCC has no register
constraint that names R8 directly, there are two ways to use R8 in
extended asm:
  1) use an explicit register variable as input
  2) use a supported constraint as input with an explicit MOV to R8 at
     the beginning of the asm

Both approaches produce the same number of instructions.
Asm code from 1)
  38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
  3e:   48 89 c7                mov    %rax,%rdi
  41:   0f 01 c1                vmcall
Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
the CPU is in 64-bit mode.

Asm code from 2)
  38:   48 89 c7                mov    %rax,%rdi
  3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
  42:   00 00 00
  45:   0f 01 c1                vmcall

Choose 1) for code simplicity and a slightly smaller code size.

Originally-by: Yakui Zhao <yakui.zhao@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/acrn.h | 57 +++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
index a2d4aea3a80d..23a93b87edeb 100644
--- a/arch/x86/include/asm/acrn.h
+++ b/arch/x86/include/asm/acrn.h
@@ -14,4 +14,61 @@ void acrn_setup_intr_handler(void (*handler)(void));
 void acrn_remove_intr_handler(void);
 bool acrn_is_privileged_vm(void);
 
+/*
+ * Hypercalls for ACRN
+ *
+ * - VMCALL instruction is used to implement ACRN hypercalls.
+ * - ACRN hypercall ABI:
+ *   - Hypercall number is passed in R8 register.
+ *   - Up to 2 arguments are passed in RDI, RSI.
+ *   - Return value will be placed in RAX.
+ */
+static inline long acrn_hypercall0(unsigned long hcall_id)
+{
+	register long r8 asm("r8");
+	long result;
+
+	/* Nothing can come between the r8 assignment and the asm: */
+	r8 = hcall_id;
+	asm volatile("vmcall\n\t"
+		     : "=a" (result)
+		     : "r" (r8)
+		     : );
+
+	return result;
+}
+
+static inline long acrn_hypercall1(unsigned long hcall_id,
+				   unsigned long param1)
+{
+	register long r8 asm("r8");
+	long result;
+
+	/* Nothing can come between the r8 assignment and the asm: */
+	r8 = hcall_id;
+	asm volatile("vmcall\n\t"
+		     : "=a" (result)
+		     : "r" (r8), "D" (param1)
+		     : );
+
+	return result;
+}
+
+static inline long acrn_hypercall2(unsigned long hcall_id,
+				   unsigned long param1,
+				   unsigned long param2)
+{
+	register long r8 asm("r8");
+	long result;
+
+	/* Nothing can come between the r8 assignment and the asm: */
+	r8 = hcall_id;
+	asm volatile("vmcall\n\t"
+		     : "=a" (result)
+		     : "r" (r8), "D" (param1), "S" (param2)
+		     : );
+
+	return result;
+}
+
 #endif /* _ASM_X86_ACRN_H */
-- 
2.28.0
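
Approach 1) from the commit message above, an explicit register variable used
as an asm input, can be tried in user space without VMCALL (which would fault
outside a guest). The sketch below swaps VMCALL for a plain MOV that copies R8
into RAX, so the return value shows that the input really arrived in R8. The
name sketch_read_back_r8() is hypothetical, the __asm__ spelling is used so
the code also builds in strict ISO C modes, and the non-x86-64 fallback exists
only to keep the example compilable elsewhere.

```c
#if defined(__x86_64__) && defined(__GNUC__)
static inline long sketch_read_back_r8(unsigned long value)
{
	/* Bind the local variable to the R8 register. */
	register long r8 __asm__("r8");
	long result;

	/* Nothing may come between the r8 assignment and the asm. */
	r8 = (long)value;
	__asm__ __volatile__("mov %1, %0"
			     : "=a" (result)	/* output in RAX */
			     : "r" (r8));	/* input forced into R8 */

	return result;
}
#else
/* Fallback for other targets: no inline asm, same observable result. */
static inline long sketch_read_back_r8(unsigned long value)
{
	return (long)value;
}
#endif
```

A real hypercall wrapper whose arguments point to memory would normally also
consider a "memory" clobber so the compiler does not reorder memory accesses
around the asm. Whether that is needed here is left to the series itself.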



* [PATCH v4 05/17] virt: acrn: Introduce ACRN HSM basic driver
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (3 preceding siblings ...)
  2020-09-22 11:42 ` [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces shuo.a.liu
@ 2020-09-22 11:42 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 08/17] virt: acrn: Introduce EPT mapping management shuo.a.liu
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:42 UTC
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Dave Hansen, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

The ACRN Hypervisor Service Module (HSM) is a kernel module in the Service VM
which communicates with the ACRN userspace through ioctls and talks to the ACRN
Hypervisor through hypercalls.

Add a basic HSM driver which allows Service VM userspace to communicate
with ACRN. Subsequent patches will add more ioctls, guest VM memory
mapping caching, I/O request processing, ioeventfd and irqfd to this
module. HSM exports a char device interface (/dev/acrn_hsm) to userspace.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 MAINTAINERS                  |  1 +
 drivers/virt/Kconfig         |  2 +
 drivers/virt/Makefile        |  1 +
 drivers/virt/acrn/Kconfig    | 14 ++++++
 drivers/virt/acrn/Makefile   |  3 ++
 drivers/virt/acrn/acrn_drv.h | 18 ++++++++
 drivers/virt/acrn/hsm.c      | 87 ++++++++++++++++++++++++++++++++++++
 7 files changed, 126 insertions(+)
 create mode 100644 drivers/virt/acrn/Kconfig
 create mode 100644 drivers/virt/acrn/Makefile
 create mode 100644 drivers/virt/acrn/acrn_drv.h
 create mode 100644 drivers/virt/acrn/hsm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e0fea5e464b4..3030d0e93d02 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -442,6 +442,7 @@ L:	acrn-dev@lists.projectacrn.org
 S:	Supported
 W:	https://projectacrn.org
 F:	Documentation/virt/acrn/
+F:	drivers/virt/acrn/
 
 AD1889 ALSA SOUND DRIVER
 L:	linux-parisc@vger.kernel.org
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index cbc1f25c79ab..d9484a2e9b46 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -32,4 +32,6 @@ config FSL_HV_MANAGER
 	     partition shuts down.
 
 source "drivers/virt/vboxguest/Kconfig"
+
+source "drivers/virt/acrn/Kconfig"
 endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index fd331247c27a..f0491bbf0d4d 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -5,3 +5,4 @@
 
 obj-$(CONFIG_FSL_HV_MANAGER)	+= fsl_hypervisor.o
 obj-y				+= vboxguest/
+obj-$(CONFIG_ACRN_HSM)		+= acrn/
diff --git a/drivers/virt/acrn/Kconfig b/drivers/virt/acrn/Kconfig
new file mode 100644
index 000000000000..36c80378c30c
--- /dev/null
+++ b/drivers/virt/acrn/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+config ACRN_HSM
+	tristate "ACRN Hypervisor Service Module"
+	depends on ACRN_GUEST
+	help
+	  ACRN Hypervisor Service Module (HSM) is a kernel module which
+	  communicates with ACRN userspace through ioctls and talks to
+	  the ACRN Hypervisor through hypercalls. HSM will only run in
+	  a privileged management VM, called Service VM, to manage User
+	  VMs and do I/O emulation. Not required for simply running
+	  under ACRN as a User VM.
+
+	  To compile as a module, choose M, the module will be called
+	  acrn. If unsure, say N.
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
new file mode 100644
index 000000000000..6920ed798aaf
--- /dev/null
+++ b/drivers/virt/acrn/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ACRN_HSM)	:= acrn.o
+acrn-y := hsm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
new file mode 100644
index 000000000000..29eedd696327
--- /dev/null
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ACRN_HSM_DRV_H
+#define __ACRN_HSM_DRV_H
+
+#include <linux/types.h>
+
+#define ACRN_INVALID_VMID (0xffffU)
+
+/**
+ * struct acrn_vm - Properties of ACRN User VM.
+ * @vmid:	User VM ID
+ */
+struct acrn_vm {
+	u16	vmid;
+};
+
+#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
new file mode 100644
index 000000000000..28a3052ffa55
--- /dev/null
+++ b/drivers/virt/acrn/hsm.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN Hypervisor Service Module (HSM)
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ *	Fengwei Yin <fengwei.yin@intel.com>
+ *	Yakui Zhao <yakui.zhao@intel.com>
+ */
+
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include <asm/acrn.h>
+#include <asm/hypervisor.h>
+
+#include "acrn_drv.h"
+
+/*
+ * When /dev/acrn_hsm is opened, a 'struct acrn_vm' object is created to
+ * represent a VM instance and continues to be associated with the opened file
+ * descriptor. All ioctl operations on this file descriptor will be targeted to
+ * the VM instance. Release of this file descriptor will destroy the object.
+ */
+static int acrn_dev_open(struct inode *inode, struct file *filp)
+{
+	struct acrn_vm *vm;
+
+	vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+	if (!vm)
+		return -ENOMEM;
+
+	vm->vmid = ACRN_INVALID_VMID;
+	filp->private_data = vm;
+	return 0;
+}
+
+static int acrn_dev_release(struct inode *inode, struct file *filp)
+{
+	struct acrn_vm *vm = filp->private_data;
+
+	kfree(vm);
+	return 0;
+}
+
+static const struct file_operations acrn_fops = {
+	.owner		= THIS_MODULE,
+	.open		= acrn_dev_open,
+	.release	= acrn_dev_release,
+};
+
+static struct miscdevice acrn_dev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "acrn_hsm",
+	.fops	= &acrn_fops,
+};
+
+static int __init hsm_init(void)
+{
+	int ret;
+
+	if (x86_hyper_type != X86_HYPER_ACRN)
+		return -ENODEV;
+
+	if (!acrn_is_privileged_vm())
+		return -EPERM;
+
+	ret = misc_register(&acrn_dev);
+	if (ret)
+		pr_err("Create misc dev failed!\n");
+
+	return ret;
+}
+
+static void __exit hsm_exit(void)
+{
+	misc_deregister(&acrn_dev);
+}
+module_init(hsm_init);
+module_exit(hsm_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ACRN Hypervisor Service Module (HSM)");
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v4 08/17] virt: acrn: Introduce EPT mapping management
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (4 preceding siblings ...)
  2020-09-22 11:42 ` [PATCH v4 05/17] virt: acrn: Introduce ACRN HSM basic driver shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 10/17] virt: acrn: Introduce PCI configuration space PIO accesses combiner shuo.a.liu
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

The HSM provides hypervisor services to the ACRN userspace. While
launching a User VM, ACRN userspace needs to allocate memory and request
the ACRN Hypervisor to set up the EPT mapping for the VM.

A mapping cache is introduced for accelerating the translation between
the Service VM kernel virtual address and User VM physical address.
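
A rough sketch of how such a cache could be consulted is shown below. It
is a userspace model, not code from this patch: struct mapping is a
trimmed stand-in for struct vm_memory_mapping, and gpa_to_va() is a
hypothetical lookup helper. In the kernel, a lookup like this would
presumably have to run under regions_mapping_lock.

```c
#include <stddef.h>
#include <stdint.h>

/* Trimmed, illustrative stand-in for struct vm_memory_mapping. */
struct mapping {
	void     *service_vm_va;	/* Service VM kernel VA of region start */
	uint64_t  user_vm_pa;		/* User VM GPA of region start */
	size_t    size;			/* Region size in bytes */
};

/*
 * Hypothetical lookup: translate [gpa, gpa + len) of the User VM into a
 * Service VM virtual address. Returns NULL when the range is not fully
 * covered by a single cached mapping.
 */
static inline void *gpa_to_va(const struct mapping *maps, int count,
			      uint64_t gpa, size_t len)
{
	for (int i = 0; i < count; i++) {
		const struct mapping *m = &maps[i];
		uint64_t off;

		if (gpa < m->user_vm_pa)
			continue;

		off = gpa - m->user_vm_pa;
		/* Written this way to avoid overflow in off + len. */
		if (off < m->size && len <= m->size - off)
			return (char *)m->service_vm_va + off;
	}
	return NULL;
}
```

A linear scan is adequate here because the number of slots is bounded by
ACRN_MEM_MAPPING_MAX (256).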

From the perspective of the hypervisor, the GPA ranges of a User VM fall
into two types:
   1) RAM region, which the User VM uses as system RAM.
   2) MMIO region, which the User VM recognizes as MMIO. MMIO regions are
      used for device emulation.

Generally, the mappings of User VM RAM regions are set up before the VM
starts and are released when the User VM is destroyed. The mappings of
MMIO regions may be set and unset dynamically while the User VM is running.

To achieve this, ioctls ACRN_IOCTL_SET_MEMSEG and ACRN_IOCTL_UNSET_MEMSEG
are introduced in HSM.
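
The sketch below models how a request for an MMIO mapping could be
prepared. It is an assumption-laden illustration: the constants are local
copies of the UAPI values added by this patch, struct memmap_req is a
simplified stand-in for struct acrn_vm_memmap (the union is flattened),
and region_attr() and make_mmio_req() are hypothetical helper names.
Userspace would then pass the real structure to ioctl() with
ACRN_IOCTL_SET_MEMSEG on a /dev/acrn_hsm descriptor.

```c
#include <stdint.h>
#include <string.h>

/* Local copies of the values this patch adds to include/uapi/linux/acrn.h. */
#define ACRN_MEM_ACCESS_RIGHT_MASK	0x00000007U
#define ACRN_MEM_ACCESS_READ		0x00000001U
#define ACRN_MEM_ACCESS_WRITE		0x00000002U
#define ACRN_MEM_ACCESS_EXEC		0x00000004U
#define ACRN_MEM_TYPE_MASK		0x000007C0U
#define ACRN_MEM_TYPE_WB		0x00000040U
#define ACRN_MEM_TYPE_UC		0x00000100U
#define ACRN_MEMMAP_MMIO		1

/* Simplified stand-in for struct acrn_vm_memmap. */
struct memmap_req {
	uint32_t type;
	uint32_t reserved;
	uint64_t user_vm_pa;
	uint64_t service_vm_pa;
	uint64_t len;
	uint32_t attr;
};

/* Combine cache type and access rights the way acrn_mm_region_add() does. */
static inline uint32_t region_attr(uint32_t mem_type, uint32_t access)
{
	return (mem_type & ACRN_MEM_TYPE_MASK) |
	       (access & ACRN_MEM_ACCESS_RIGHT_MASK);
}

/* Hypothetical helper: describe an MMIO region; attr carries access bits. */
static inline struct memmap_req make_mmio_req(uint64_t user_pa,
					      uint64_t service_pa,
					      uint64_t len, uint32_t access)
{
	struct memmap_req req;

	memset(&req, 0, sizeof(req));
	req.type = ACRN_MEMMAP_MMIO;
	req.user_vm_pa = user_pa;
	req.service_vm_pa = service_pa;
	req.len = len;
	req.attr = access & ACRN_MEM_ACCESS_RIGHT_MASK;
	return req;
}
```

Note that for MMIO requests the driver in this patch forces the cache
type to ACRN_MEM_TYPE_UC, so only the access-right bits of attr take
effect.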

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/virt/acrn/Makefile    |   2 +-
 drivers/virt/acrn/acrn_drv.h  |  98 ++++++++++-
 drivers/virt/acrn/hsm.c       |  15 ++
 drivers/virt/acrn/hypercall.h |  14 ++
 drivers/virt/acrn/mm.c        | 305 ++++++++++++++++++++++++++++++++++
 drivers/virt/acrn/vm.c        |   4 +
 include/uapi/linux/acrn.h     |  51 ++++++
 7 files changed, 479 insertions(+), 10 deletions(-)
 create mode 100644 drivers/virt/acrn/mm.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index cf8b4ed5e74e..38bc44b6edcd 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM)	:= acrn.o
-acrn-y := hsm.o vm.o
+acrn-y := hsm.o vm.o mm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 72d92b60d944..fe59476186e9 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -12,6 +12,71 @@
 
 extern struct miscdevice acrn_dev;
 
+#define ACRN_MEM_MAPPING_MAX	256
+
+#define ACRN_MEM_REGION_ADD	0
+#define ACRN_MEM_REGION_DEL	2
+/**
+ * struct vm_memory_region_op - Hypervisor memory operation
+ * @type:		Operation type (ACRN_MEM_REGION_*)
+ * @attr:		Memory attribute (ACRN_MEM_TYPE_* | ACRN_MEM_ACCESS_*)
+ * @user_vm_pa:		Physical address of User VM to be mapped.
+ * @service_vm_pa:	Physical address of Service VM to be mapped.
+ * @size:		Size of this region.
+ *
+ * Structure containing needed information that is provided to ACRN Hypervisor
+ * to manage the EPT mappings of a single memory region of the User VM. Several
+ * &struct vm_memory_region_op can be batched to ACRN Hypervisor, see &struct
+ * vm_memory_region_batch.
+ */
+struct vm_memory_region_op {
+	u32	type;
+	u32	attr;
+	u64	user_vm_pa;
+	u64	service_vm_pa;
+	u64	size;
+};
+
+/**
+ * struct vm_memory_region_batch - A batch of vm_memory_region_op.
+ * @vmid:		A User VM ID.
+ * @reserved:		Reserved.
+ * @regions_num:	The number of vm_memory_region_op.
+ * @reserved1:		Reserved.
+ * @regions_gpa:	Physical address of a vm_memory_region_op array.
+ *
+ * HC_VM_SET_MEMORY_REGIONS uses this structure to manage EPT mappings of
+ * multiple memory regions of a User VM. A &struct vm_memory_region_batch
+ * contains multiple &struct vm_memory_region_op for batch processing in the
+ * ACRN Hypervisor.
+ */
+struct vm_memory_region_batch {
+	u16	vmid;
+	u16	reserved[3];
+	u32	regions_num;
+	u32	reserved1;
+	u64	regions_gpa;
+};
+
+/**
+ * struct vm_memory_mapping - Memory map between a User VM and the Service VM
+ * @pages:		Pages in Service VM kernel.
+ * @npages:		Number of pages.
+ * @service_vm_va:	Virtual address in Service VM kernel.
+ * @user_vm_pa:		Physical address in User VM.
+ * @size:		Size of this memory region.
+ *
+ * HSM maintains memory mappings between a User VM GPA and the Service VM
+ * kernel VA for accelerating the User VM GPA translation.
+ */
+struct vm_memory_mapping {
+	struct page	**pages;
+	int		npages;
+	void		*service_vm_va;
+	u64		user_vm_pa;
+	size_t		size;
+};
+
 #define ACRN_INVALID_VMID (0xffffU)
 
 #define ACRN_VM_FLAG_DESTROYED		0U
@@ -19,21 +84,36 @@ extern struct list_head acrn_vm_list;
 extern rwlock_t acrn_vm_list_lock;
 /**
  * struct acrn_vm - Properties of ACRN User VM.
- * @list:	Entry within global list of all VMs
- * @vmid:	User VM ID
- * @vcpu_num:	Number of virtual CPUs in the VM
- * @flags:	Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
- *		in HSM which is different from the &acrn_vm_creation.vm_flag.
+ * @list:			Entry within global list of all VMs.
+ * @vmid:			User VM ID.
+ * @vcpu_num:			Number of virtual CPUs in the VM.
+ * @flags:			Flags (ACRN_VM_FLAG_*) of the VM. This is VM
+ *				flag management in HSM which is different
+ *				from the &acrn_vm_creation.vm_flag.
+ * @regions_mapping_lock:	Lock to protect &acrn_vm.regions_mapping and
+ *				&acrn_vm.regions_mapping_count.
+ * @regions_mapping:		Memory mappings of this VM.
+ * @regions_mapping_count:	Number of memory mapping of this VM.
  */
 struct acrn_vm {
-	struct list_head	list;
-	u16			vmid;
-	int			vcpu_num;
-	unsigned long		flags;
+	struct list_head		list;
+	u16				vmid;
+	int				vcpu_num;
+	unsigned long			flags;
+	struct mutex			regions_mapping_lock;
+	struct vm_memory_mapping	regions_mapping[ACRN_MEM_MAPPING_MAX];
+	int				regions_mapping_count;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
 			       struct acrn_vm_creation *vm_param);
 int acrn_vm_destroy(struct acrn_vm *vm);
+int acrn_mm_region_add(struct acrn_vm *vm, u64 user_gpa, u64 service_gpa,
+		       u64 size, u32 mem_type, u32 mem_access_right);
+int acrn_mm_region_del(struct acrn_vm *vm, u64 user_gpa, u64 size);
+int acrn_vm_memseg_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap);
+int acrn_vm_memseg_unmap(struct acrn_vm *vm, struct acrn_vm_memmap *memmap);
+int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap);
+void acrn_vm_all_ram_unmap(struct acrn_vm *vm);
 
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index e45f3abbc87f..ea5dfd04163b 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -48,6 +48,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 	struct acrn_vm *vm = filp->private_data;
 	struct acrn_vm_creation *vm_param;
 	struct acrn_vcpu_regs *cpu_regs;
+	struct acrn_vm_memmap memmap;
 	int ret = 0;
 
 	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -112,6 +113,20 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 				vm->vmid);
 		kfree(cpu_regs);
 		break;
+	case ACRN_IOCTL_SET_MEMSEG:
+		if (copy_from_user(&memmap, (void __user *)ioctl_param,
+				   sizeof(memmap)))
+			return -EFAULT;
+
+		ret = acrn_vm_memseg_map(vm, &memmap);
+		break;
+	case ACRN_IOCTL_UNSET_MEMSEG:
+		if (copy_from_user(&memmap, (void __user *)ioctl_param,
+				   sizeof(memmap)))
+			return -EFAULT;
+
+		ret = acrn_vm_memseg_unmap(vm, &memmap);
+		break;
 	default:
 		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
 		ret = -ENOTTY;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index f29cfae08862..a1a70a071713 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -21,6 +21,9 @@
 #define HC_RESET_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
 #define HC_SET_VCPU_REGS		_HC_ID(HC_ID, HC_ID_VM_BASE + 0x06)
 
+#define HC_ID_MEM_BASE			0x40UL
+#define HC_VM_SET_MEMORY_REGIONS	_HC_ID(HC_ID, HC_ID_MEM_BASE + 0x02)
+
 /**
  * hcall_create_vm() - Create a User VM
  * @vminfo:	Service VM GPA of info of User VM creation
@@ -88,4 +91,15 @@ static inline long hcall_set_vcpu_regs(u64 vmid, u64 regs_state)
 	return acrn_hypercall2(HC_SET_VCPU_REGS, vmid, regs_state);
 }
 
+/**
+ * hcall_set_memory_regions() - Inform the hypervisor to set up EPT mappings
+ * @regions_pa:	Service VM GPA of &struct vm_memory_region_batch
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_memory_regions(u64 regions_pa)
+{
+	return acrn_hypercall1(HC_VM_SET_MEMORY_REGIONS, regions_pa);
+}
+
 #endif /* __ACRN_HSM_HYPERCALL_H */
diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
new file mode 100644
index 000000000000..c0526863aa96
--- /dev/null
+++ b/drivers/virt/acrn/mm.c
@@ -0,0 +1,305 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN: Memory mapping management
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ *	Fei Li <lei1.li@intel.com>
+ *	Shuo Liu <shuo.a.liu@intel.com>
+ */
+
+#include <linux/io.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+
+#include "acrn_drv.h"
+
+static int modify_region(struct acrn_vm *vm, struct vm_memory_region_op *region)
+{
+	struct vm_memory_region_batch *regions;
+	int ret;
+
+	regions = kzalloc(sizeof(*regions), GFP_KERNEL);
+	if (!regions)
+		return -ENOMEM;
+
+	regions->vmid = vm->vmid;
+	regions->regions_num = 1;
+	regions->regions_gpa = virt_to_phys(region);
+
+	ret = hcall_set_memory_regions(virt_to_phys(regions));
+	if (ret < 0)
+		dev_err(acrn_dev.this_device,
+			"Failed to set memory region for VM[%u]!\n", vm->vmid);
+
+	kfree(regions);
+	return ret;
+}
+
+/**
+ * acrn_mm_region_add() - Set up the EPT mapping of a memory region.
+ * @vm:			User VM.
+ * @user_gpa:		A GPA of User VM.
+ * @service_gpa:	A GPA of Service VM.
+ * @size:		Size of the region.
+ * @mem_type:		Combination of ACRN_MEM_TYPE_*.
+ * @mem_access_right:	Combination of ACRN_MEM_ACCESS_*.
+ *
+ * Return: 0 on success, <0 on error.
+ */
+int acrn_mm_region_add(struct acrn_vm *vm, u64 user_gpa, u64 service_gpa,
+		       u64 size, u32 mem_type, u32 mem_access_right)
+{
+	struct vm_memory_region_op *region;
+	int ret = 0;
+
+	region = kzalloc(sizeof(*region), GFP_KERNEL);
+	if (!region)
+		return -ENOMEM;
+
+	region->type = ACRN_MEM_REGION_ADD;
+	region->user_vm_pa = user_gpa;
+	region->service_vm_pa = service_gpa;
+	region->size = size;
+	region->attr = ((mem_type & ACRN_MEM_TYPE_MASK) |
+			(mem_access_right & ACRN_MEM_ACCESS_RIGHT_MASK));
+	ret = modify_region(vm, region);
+
+	dev_dbg(acrn_dev.this_device,
+		"%s: user-GPA[%pK] service-GPA[%pK] size[0x%llx].\n",
+		__func__, (void *)user_gpa, (void *)service_gpa, size);
+	kfree(region);
+	return ret;
+}
+
+/**
+ * acrn_mm_region_del() - Del the EPT mapping of a memory region.
+ * @vm:		User VM.
+ * @user_gpa:	A GPA of the User VM.
+ * @size:	Size of the region.
+ *
+ * Return: 0 on success, <0 for error.
+ */
+int acrn_mm_region_del(struct acrn_vm *vm, u64 user_gpa, u64 size)
+{
+	struct vm_memory_region_op *region;
+	int ret = 0;
+
+	region = kzalloc(sizeof(*region), GFP_KERNEL);
+	if (!region)
+		return -ENOMEM;
+
+	region->type = ACRN_MEM_REGION_DEL;
+	region->user_vm_pa = user_gpa;
+	region->service_vm_pa = 0UL;
+	region->size = size;
+	region->attr = 0U;
+
+	ret = modify_region(vm, region);
+
+	dev_dbg(acrn_dev.this_device, "%s: user-GPA[%pK] size[0x%llx].\n",
+		__func__, (void *)user_gpa, size);
+	kfree(region);
+	return ret;
+}
+
+int acrn_vm_memseg_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
+{
+	int ret;
+
+	if (memmap->type == ACRN_MEMMAP_RAM)
+		return acrn_vm_ram_map(vm, memmap);
+
+	if (memmap->type != ACRN_MEMMAP_MMIO) {
+		dev_err(acrn_dev.this_device,
+			"Invalid memmap type: %u\n", memmap->type);
+		return -EINVAL;
+	}
+
+	ret = acrn_mm_region_add(vm, memmap->user_vm_pa,
+				 memmap->service_vm_pa, memmap->len,
+				 ACRN_MEM_TYPE_UC, memmap->attr);
+	if (ret < 0)
+		dev_err(acrn_dev.this_device,
+			"Add memory region failed, VM[%u]!\n", vm->vmid);
+
+	return ret;
+}
+
+int acrn_vm_memseg_unmap(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
+{
+	int ret;
+
+	if (memmap->type != ACRN_MEMMAP_MMIO) {
+		dev_err(acrn_dev.this_device,
+			"Invalid memmap type: %u\n", memmap->type);
+		return -EINVAL;
+	}
+
+	ret = acrn_mm_region_del(vm, memmap->user_vm_pa, memmap->len);
+	if (ret < 0)
+		dev_err(acrn_dev.this_device,
+			"Del memory region failed, VM[%u]!\n", vm->vmid);
+
+	return ret;
+}
+
+/**
+ * acrn_vm_ram_map() - Create a RAM EPT mapping of User VM.
+ * @vm:		The User VM pointer
+ * @memmap:	Info of the EPT mapping
+ *
+ * Return: 0 on success, <0 for error.
+ */
+int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
+{
+	struct vm_memory_region_batch *regions_info;
+	int nr_pages, i = 0, order, nr_regions = 0;
+	struct vm_memory_mapping *region_mapping;
+	struct vm_memory_region_op *vm_region;
+	struct page **pages = NULL, *page;
+	void *remap_vaddr;
+	int ret, pinned;
+	u64 user_vm_pa;
+
+	if (!vm || !memmap)
+		return -EINVAL;
+
+	/* Get the page number of the map region */
+	nr_pages = memmap->len >> PAGE_SHIFT;
+	pages = vzalloc(nr_pages * sizeof(struct page *));
+	if (!pages)
+		return -ENOMEM;
+
+	/* Lock the pages of user memory map region */
+	pinned = get_user_pages_fast(memmap->vma_base,
+				     nr_pages, FOLL_WRITE, pages);
+	if (pinned < 0) {
+		ret = pinned;
+		goto free_pages;
+	} else if (pinned != nr_pages) {
+		ret = -EFAULT;
+		goto put_pages;
+	}
+
+	/* Create a kernel map for the map region */
+	remap_vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
+	if (!remap_vaddr) {
+		ret = -ENOMEM;
+		goto put_pages;
+	}
+
+	/* Record Service VM va <-> User VM pa mapping */
+	mutex_lock(&vm->regions_mapping_lock);
+	region_mapping = &vm->regions_mapping[vm->regions_mapping_count];
+	if (vm->regions_mapping_count < ACRN_MEM_MAPPING_MAX) {
+		region_mapping->pages = pages;
+		region_mapping->npages = nr_pages;
+		region_mapping->size = memmap->len;
+		region_mapping->service_vm_va = remap_vaddr;
+		region_mapping->user_vm_pa = memmap->user_vm_pa;
+		vm->regions_mapping_count++;
+	} else {
+		dev_warn(acrn_dev.this_device,
+			"Run out of memory mapping slots!\n");
+		ret = -ENOMEM;
+		mutex_unlock(&vm->regions_mapping_lock);
+		goto unmap_no_count;
+	}
+	mutex_unlock(&vm->regions_mapping_lock);
+
+	/* Calculate count of vm_memory_region_op */
+	while (i < nr_pages) {
+		page = pages[i];
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		order = compound_order(page);
+		nr_regions++;
+		i += 1 << order;
+	}
+
+	/* Prepare the vm_memory_region_batch */
+	regions_info = kzalloc(sizeof(*regions_info) +
+			       sizeof(*vm_region) * nr_regions,
+			       GFP_KERNEL);
+	if (!regions_info) {
+		ret = -ENOMEM;
+		goto unmap_kernel_map;
+	}
+
+	/* Fill each vm_memory_region_op */
+	vm_region = (struct vm_memory_region_op *)(regions_info + 1);
+	regions_info->vmid = vm->vmid;
+	regions_info->regions_num = nr_regions;
+	regions_info->regions_gpa = virt_to_phys(vm_region);
+	user_vm_pa = memmap->user_vm_pa;
+	i = 0;
+	while (i < nr_pages) {
+		u32 region_size;
+
+		page = pages[i];
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		order = compound_order(page);
+		region_size = PAGE_SIZE << order;
+		vm_region->type = ACRN_MEM_REGION_ADD;
+		vm_region->user_vm_pa = user_vm_pa;
+		vm_region->service_vm_pa = page_to_phys(page);
+		vm_region->size = region_size;
+		vm_region->attr = (ACRN_MEM_TYPE_WB & ACRN_MEM_TYPE_MASK) |
+				  (memmap->attr & ACRN_MEM_ACCESS_RIGHT_MASK);
+
+		vm_region++;
+		user_vm_pa += region_size;
+		i += 1 << order;
+	}
+
+	/* Inform the ACRN Hypervisor to set up EPT mappings */
+	ret = hcall_set_memory_regions(virt_to_phys(regions_info));
+	if (ret < 0) {
+		dev_err(acrn_dev.this_device,
+			"Failed to set regions, VM[%u]!\n", vm->vmid);
+		goto unset_region;
+	}
+	kfree(regions_info);
+
+	dev_dbg(acrn_dev.this_device,
+		"%s: VM[%u] service-GVA[%pK] user-GPA[%pK] size[0x%llx]\n",
+		__func__, vm->vmid,
+		remap_vaddr, (void *)memmap->user_vm_pa, memmap->len);
+	return ret;
+
+unset_region:
+	kfree(regions_info);
+unmap_kernel_map:
+	mutex_lock(&vm->regions_mapping_lock);
+	vm->regions_mapping_count--;
+	mutex_unlock(&vm->regions_mapping_lock);
+unmap_no_count:
+	vunmap(remap_vaddr);
+put_pages:
+	for (i = 0; i < pinned; i++)
+		put_page(pages[i]);
+free_pages:
+	vfree(pages);
+	return ret;
+}
+
+/**
+ * acrn_vm_all_ram_unmap() - Destroy a RAM EPT mapping of User VM.
+ * @vm:	The User VM
+ */
+void acrn_vm_all_ram_unmap(struct acrn_vm *vm)
+{
+	struct vm_memory_mapping *region_mapping;
+	int i, j;
+
+	mutex_lock(&vm->regions_mapping_lock);
+	for (i = 0; i < vm->regions_mapping_count; i++) {
+		region_mapping = &vm->regions_mapping[i];
+		vunmap(region_mapping->service_vm_va);
+		for (j = 0; j < region_mapping->npages; j++)
+			put_page(region_mapping->pages[j]);
+		vfree(region_mapping->pages);
+	}
+	mutex_unlock(&vm->regions_mapping_lock);
+}
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 920ca48f4847..c088362cc3e3 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -34,6 +34,7 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
 		return NULL;
 	}
 
+	mutex_init(&vm->regions_mapping_lock);
 	vm->vmid = vm_param->vmid;
 	vm->vcpu_num = vm_param->vcpu_num;
 
@@ -65,6 +66,9 @@ int acrn_vm_destroy(struct acrn_vm *vm)
 		clear_bit(ACRN_VM_FLAG_DESTROYED, &vm->flags);
 		return ret;
 	}
+
+	acrn_vm_all_ram_unmap(vm);
+
 	dev_dbg(acrn_dev.this_device, "VM %u destroyed.\n", vm->vmid);
 	vm->vmid = ACRN_INVALID_VMID;
 	return 0;
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 1d5b82e154fb..33bbdd6d3956 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -105,6 +105,52 @@ struct acrn_vcpu_regs {
 	struct acrn_regs	vcpu_regs;
 } __attribute__((aligned(8)));
 
+#define	ACRN_MEM_ACCESS_RIGHT_MASK	0x00000007U
+#define	ACRN_MEM_ACCESS_READ		0x00000001U
+#define	ACRN_MEM_ACCESS_WRITE		0x00000002U
+#define	ACRN_MEM_ACCESS_EXEC		0x00000004U
+#define	ACRN_MEM_ACCESS_RWX		(ACRN_MEM_ACCESS_READ  | \
+					 ACRN_MEM_ACCESS_WRITE | \
+					 ACRN_MEM_ACCESS_EXEC)
+
+#define	ACRN_MEM_TYPE_MASK		0x000007C0U
+#define	ACRN_MEM_TYPE_WB		0x00000040U
+#define	ACRN_MEM_TYPE_WT		0x00000080U
+#define	ACRN_MEM_TYPE_UC		0x00000100U
+#define	ACRN_MEM_TYPE_WC		0x00000200U
+#define	ACRN_MEM_TYPE_WP		0x00000400U
+
+/* Memory mapping types */
+#define	ACRN_MEMMAP_RAM			0
+#define	ACRN_MEMMAP_MMIO		1
+
+/**
+ * struct acrn_vm_memmap - An EPT memory mapping info for a User VM.
+ * @type:		Type of the memory mapping (ACRN_MEMMAP_*).
+ *			Pass to hypervisor directly.
+ * @reserved:		Reserved.
+ * @user_vm_pa:		Physical address of User VM.
+ *			Pass to hypervisor directly.
+ * @service_vm_pa:	Physical address of Service VM.
+ *			Pass to hypervisor directly.
+ * @vma_base:		VMA address of Service VM. Pass to hypervisor directly.
+ * @len:		Length of the memory mapping.
+ *			Pass to hypervisor directly.
+ * @attr:		Attribute of the memory mapping.
+ *			Pass to hypervisor directly.
+ */
+struct acrn_vm_memmap {
+	__u32	type;
+	__u32	reserved;
+	__u64	user_vm_pa;
+	union {
+		__u64	service_vm_pa;
+		__u64	vma_base;
+	};
+	__u64	len;
+	__u32	attr;
+} __attribute__((aligned(8)));
+
 /* The ioctl type, documented in ioctl-number.rst */
 #define ACRN_IOCTL_TYPE			0xA2
 
@@ -124,4 +170,9 @@ struct acrn_vcpu_regs {
 #define ACRN_IOCTL_SET_VCPU_REGS	\
 	_IOW(ACRN_IOCTL_TYPE, 0x16, struct acrn_vcpu_regs)
 
+#define ACRN_IOCTL_SET_MEMSEG		\
+	_IOW(ACRN_IOCTL_TYPE, 0x41, struct acrn_vm_memmap)
+#define ACRN_IOCTL_UNSET_MEMSEG		\
+	_IOW(ACRN_IOCTL_TYPE, 0x42, struct acrn_vm_memmap)
+
 #endif /* _UAPI_ACRN_H */
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v4 10/17] virt: acrn: Introduce PCI configuration space PIO accesses combiner
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (5 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 08/17] virt: acrn: Introduce EPT mapping management shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 11/17] virt: acrn: Introduce interfaces for PCI device passthrough shuo.a.liu
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu

From: Shuo Liu <shuo.a.liu@intel.com>

A User VM can access its virtual PCI configuration spaces via port I/O,
which takes two steps:
 1) write the address to port 0xCF8
 2) read or write the data through port 0xCFC

To distribute a complete PCI configuration space access in one go, HSM
needs to combine the two accesses.

Combine two paired PIO I/O requests into one PCI I/O request and
continue the I/O request distribution.
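
The decoding step can be modelled in userspace as sketched below. The
masks and shifts follow handle_cf8cfc() in this patch; struct
pci_cfg_target and decode_cf8() are hypothetical names used only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONF1_ENABLE		0x80000000U
#define PCI_LOWREG_MASK		0xFCU	/* register bits [7:2] */
#define PCI_HIGHREG_MASK	0xF00U	/* extended register bits, from [27:24] */

/* Illustrative result type; not part of the patch. */
struct pci_cfg_target {
	uint32_t bus;
	uint32_t dev;
	uint32_t func;
	uint32_t reg;
};

/*
 * Hypothetical decoder.
 * @addr: value last written to port 0xCF8
 * @port: data port being accessed, 0xCFC..0xCFF
 *
 * Returns false when the enable bit is clear or the port is not a
 * configuration data port; in that case *t is left untouched.
 */
static inline bool decode_cf8(uint32_t addr, uint16_t port,
			      struct pci_cfg_target *t)
{
	if (!(addr & CONF1_ENABLE) || port < 0xcfc || port > 0xcff)
		return false;

	t->bus = (addr >> 16) & 0xff;
	t->dev = (addr >> 11) & 0x1f;
	t->func = (addr >> 8) & 0x7;
	t->reg = (addr & PCI_LOWREG_MASK) +
		 ((addr >> 16) & PCI_HIGHREG_MASK) +
		 (uint32_t)(port - 0xcfc);
	return true;
}
```

The byte offset within the data port (port - 0xCFC) is added to the
register number so that sub-dword accesses land on the right byte.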

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/virt/acrn/acrn_drv.h |  2 +
 drivers/virt/acrn/ioreq.c    | 76 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/acrn.h    | 15 +++++++
 3 files changed, 93 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index cf9143cf760d..97d2aab8b70a 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -156,6 +156,7 @@ extern rwlock_t acrn_vm_list_lock;
  * @default_client:		The default I/O request client
  * @ioreq_buf:			I/O request shared buffer
  * @ioreq_page:			The page of the I/O request shared buffer
+ * @pci_conf_addr:		Address of a PCI configuration access emulation
  */
 struct acrn_vm {
 	struct list_head		list;
@@ -170,6 +171,7 @@ struct acrn_vm {
 	struct acrn_ioreq_client	*default_client;
 	struct acrn_io_request_buffer	*ioreq_buf;
 	struct page			*ioreq_page;
+	u32				pci_conf_addr;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index 2e9fd432c147..bf194f0fbd70 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -221,6 +221,80 @@ int acrn_ioreq_client_wait(struct acrn_ioreq_client *client)
 	return 0;
 }
 
+static bool is_cfg_addr(struct acrn_io_request *req)
+{
+	return ((req->type == ACRN_IOREQ_TYPE_PORTIO) &&
+		(req->reqs.pio_request.address == 0xcf8));
+}
+
+static bool is_cfg_data(struct acrn_io_request *req)
+{
+	return ((req->type == ACRN_IOREQ_TYPE_PORTIO) &&
+		((req->reqs.pio_request.address >= 0xcfc) &&
+		 (req->reqs.pio_request.address < (0xcfc + 4))));
+}
+
+/* The low 8-bit of supported pci_reg addr.*/
+#define PCI_LOWREG_MASK  0xFC
+/* The high 4-bit of supported pci_reg addr */
+#define PCI_HIGHREG_MASK 0xF00
+/* Max number of supported functions */
+#define PCI_FUNCMAX	7
+/* Max number of supported slots */
+#define PCI_SLOTMAX	31
+/* Max number of supported buses */
+#define PCI_BUSMAX	255
+#define CONF1_ENABLE	0x80000000UL
+/*
+ * A PCI configuration space access via PIO 0xCF8 and 0xCFC normally takes
+ * two steps:
+ *   1) write the address to port 0xCF8
+ *   2) read or write the data through port 0xCFC
+ * This function combines such paired PCI configuration space I/O requests into
+ * one ACRN_IOREQ_TYPE_PCICFG type I/O request and continues the processing.
+ */
+static bool handle_cf8cfc(struct acrn_vm *vm,
+			  struct acrn_io_request *req, u16 vcpu)
+{
+	int offset, pci_cfg_addr, pci_reg;
+	bool is_handled = false;
+
+	if (is_cfg_addr(req)) {
+		WARN_ON(req->reqs.pio_request.size != 4);
+		if (req->reqs.pio_request.direction == ACRN_IOREQ_DIR_WRITE)
+			vm->pci_conf_addr = req->reqs.pio_request.value;
+		else
+			req->reqs.pio_request.value = vm->pci_conf_addr;
+		is_handled = true;
+	} else if (is_cfg_data(req)) {
+		if (!(vm->pci_conf_addr & CONF1_ENABLE)) {
+			if (req->reqs.pio_request.direction ==
+					ACRN_IOREQ_DIR_READ)
+				req->reqs.pio_request.value = 0xffffffff;
+			is_handled = true;
+		} else {
+			offset = req->reqs.pio_request.address - 0xcfc;
+
+			req->type = ACRN_IOREQ_TYPE_PCICFG;
+			pci_cfg_addr = vm->pci_conf_addr;
+			req->reqs.pci_request.bus =
+					(pci_cfg_addr >> 16) & PCI_BUSMAX;
+			req->reqs.pci_request.dev =
+					(pci_cfg_addr >> 11) & PCI_SLOTMAX;
+			req->reqs.pci_request.func =
+					(pci_cfg_addr >> 8) & PCI_FUNCMAX;
+			pci_reg = (pci_cfg_addr & PCI_LOWREG_MASK) +
+				   ((pci_cfg_addr >> 16) & PCI_HIGHREG_MASK);
+			req->reqs.pci_request.reg = pci_reg + offset;
+		}
+	}
+
+	if (is_handled)
+		ioreq_complete_request(vm, vcpu, req);
+
+	return is_handled;
+}
+
 static bool in_range(struct acrn_ioreq_range *range,
 		     struct acrn_io_request *req)
 {
@@ -381,6 +455,8 @@ static int acrn_ioreq_dispatch(struct acrn_vm *vm)
 				ioreq_complete_request(vm, i, req);
 				continue;
 			}
+			if (handle_cf8cfc(vm, req, i))
+				continue;
 
 			spin_lock_bh(&vm->ioreq_clients_lock);
 			client = find_ioreq_client(vm, req);
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 8eb687f1482c..31cf0fd73bcc 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -20,6 +20,7 @@
 
 #define ACRN_IOREQ_TYPE_PORTIO		0
 #define ACRN_IOREQ_TYPE_MMIO		1
+#define ACRN_IOREQ_TYPE_PCICFG		2
 
 #define ACRN_IOREQ_DIR_READ		0
 #define ACRN_IOREQ_DIR_WRITE		1
@@ -40,6 +41,18 @@ struct acrn_pio_request {
 	__u32	value;
 } __attribute__((aligned(8)));
 
+/* Must keep the same header fields as pio_request */
+struct acrn_pci_request {
+	__u32	direction;
+	__u32	reserved[3];
+	__u64	size;
+	__u32	value;
+	__u32	bus;
+	__u32	dev;
+	__u32	func;
+	__u32	reg;
+} __attribute__((aligned(8)));
+
 /**
  * struct acrn_io_request - 256-byte ACRN I/O request
  * @type:		Type of this request (ACRN_IOREQ_TYPE_*).
@@ -48,6 +61,7 @@ struct acrn_pio_request {
  * @reserved0:		Reserved fields.
  * @reqs:		Union of different types of request. Byte offset: 64.
  * @reqs.pio_request:	PIO request data of the I/O request.
+ * @reqs.pci_request:	PCI configuration space request data of the I/O request.
  * @reqs.mmio_request:	MMIO request data of the I/O request.
  * @reqs.data:		Raw data of the I/O request.
  * @reserved1:		Reserved fields.
@@ -107,6 +121,7 @@ struct acrn_io_request {
 	__u32	reserved0[14];
 	union {
 		struct acrn_pio_request		pio_request;
+		struct acrn_pci_request		pci_request;
 		struct acrn_mmio_request	mmio_request;
 		__u64				data[8];
 	} reqs;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v4 11/17] virt: acrn: Introduce interfaces for PCI device passthrough
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (6 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 10/17] virt: acrn: Introduce PCI configuration space PIO accesses combiner shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 12/17] virt: acrn: Introduce interrupt injection interfaces shuo.a.liu
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

PCI device passthrough enables an OS in a virtual machine to directly
access a PCI device in the host. It delivers near-native performance,
which is required in the performance-critical scenarios of ACRN.

HSM provides the following ioctls:
 - Assign - ACRN_IOCTL_ASSIGN_PCIDEV
   Pass data struct acrn_pcidev from userspace to the hypervisor, and
   inform the hypervisor to assign a PCI device to a User VM.

 - De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV
   Pass data struct acrn_pcidev from userspace to the hypervisor, and
   inform the hypervisor to de-assign a PCI device from a User VM.

 - Set an interrupt of a passthrough device - ACRN_IOCTL_SET_PTDEV_INTR
   Pass data struct acrn_ptdev_irq from userspace to the hypervisor,
   and inform the hypervisor to map an INTx interrupt of a passthrough
   device of a User VM.

 - Reset passthrough device interrupt - ACRN_IOCTL_RESET_PTDEV_INTR
   Pass data struct acrn_ptdev_irq from userspace to the hypervisor,
   and inform the hypervisor to unmap an INTx interrupt of a passthrough
   device of a User VM.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
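As an illustration of the assign/de-assign ioctls described in the commit
message above, here is a minimal userspace sketch. It is not taken from ACRN
userspace. The struct is a local mirror of struct acrn_pcidev using <stdint.h>
types, pci_bdf() assumes the conventional PCI packing (bus[15:8], device[7:3],
function[2:0]), and hsm_fd is a hypothetical file descriptor of the HSM device
on which the User VM has already been created. A real caller would also fill
intr_line, intr_pin and bar[] from the device's configuration space.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of struct acrn_pcidev from this patch (stdint types). */
#define ACRN_PCI_NUM_BARS	6
struct acrn_pcidev {
	uint32_t type;
	uint16_t virt_bdf;
	uint16_t phys_bdf;
	uint8_t  intr_line;
	uint8_t  intr_pin;
	uint32_t bar[ACRN_PCI_NUM_BARS];
	uint32_t reserved[6];
} __attribute__((aligned(8)));

#define ACRN_IOCTL_TYPE			0xA2
#define ACRN_IOCTL_ASSIGN_PCIDEV	\
	_IOW(ACRN_IOCTL_TYPE, 0x55, struct acrn_pcidev)
#define ACRN_IOCTL_DEASSIGN_PCIDEV	\
	_IOW(ACRN_IOCTL_TYPE, 0x56, struct acrn_pcidev)

/* Assumption: conventional PCI BDF packing, bus[15:8] dev[7:3] func[2:0]. */
static inline uint16_t pci_bdf(unsigned int bus, unsigned int dev,
			       unsigned int func)
{
	return (uint16_t)(((bus & 0xffu) << 8) | ((dev & 0x1fu) << 3) |
			  (func & 0x7u));
}

/* Zero the whole structure (including reserved[]) and set both BDFs. */
static inline void fill_pcidev(struct acrn_pcidev *pcidev,
			       uint16_t virt_bdf, uint16_t phys_bdf)
{
	memset(pcidev, 0, sizeof(*pcidev));
	pcidev->virt_bdf = virt_bdf;
	pcidev->phys_bdf = phys_bdf;
}

/* hsm_fd is hypothetical: an open HSM device fd that already owns a VM. */
static inline int assign_pcidev(int hsm_fd, const struct acrn_pcidev *pcidev)
{
	return ioctl(hsm_fd, ACRN_IOCTL_ASSIGN_PCIDEV, pcidev);
}

static inline int deassign_pcidev(int hsm_fd, const struct acrn_pcidev *pcidev)
{
	return ioctl(hsm_fd, ACRN_IOCTL_DEASSIGN_PCIDEV, pcidev);
}
```

A caller would typically fill one struct acrn_pcidev, pass it to
assign_pcidev() when the User VM starts, and pass the same data to
deassign_pcidev() on teardown. HSM copies the structure with memdup_user()
and hands its physical address to the hypervisor, so userspace does not need
to keep the buffer alive after the ioctl returns.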
 drivers/virt/acrn/hsm.c       | 50 +++++++++++++++++++++++++++
 drivers/virt/acrn/hypercall.h | 54 ++++++++++++++++++++++++++++++
 include/uapi/linux/acrn.h     | 63 +++++++++++++++++++++++++++++++++++
 3 files changed, 167 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index b4b5d99ff6bb..bcb5e273dcc3 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -49,7 +49,9 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 	struct acrn_vm_creation *vm_param;
 	struct acrn_vcpu_regs *cpu_regs;
 	struct acrn_ioreq_notify notify;
+	struct acrn_ptdev_irq *irq_info;
 	struct acrn_vm_memmap memmap;
+	struct acrn_pcidev *pcidev;
 	int ret = 0;
 
 	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -128,6 +130,54 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 
 		ret = acrn_vm_memseg_unmap(vm, &memmap);
 		break;
+	case ACRN_IOCTL_ASSIGN_PCIDEV:
+		pcidev = memdup_user((void __user *)ioctl_param,
+				     sizeof(struct acrn_pcidev));
+		if (IS_ERR(pcidev))
+			return PTR_ERR(pcidev);
+
+		ret = hcall_assign_pcidev(vm->vmid, virt_to_phys(pcidev));
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to assign pci device!\n");
+		kfree(pcidev);
+		break;
+	case ACRN_IOCTL_DEASSIGN_PCIDEV:
+		pcidev = memdup_user((void __user *)ioctl_param,
+				     sizeof(struct acrn_pcidev));
+		if (IS_ERR(pcidev))
+			return PTR_ERR(pcidev);
+
+		ret = hcall_deassign_pcidev(vm->vmid, virt_to_phys(pcidev));
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to deassign pci device!\n");
+		kfree(pcidev);
+		break;
+	case ACRN_IOCTL_SET_PTDEV_INTR:
+		irq_info = memdup_user((void __user *)ioctl_param,
+				       sizeof(struct acrn_ptdev_irq));
+		if (IS_ERR(irq_info))
+			return PTR_ERR(irq_info);
+
+		ret = hcall_set_ptdev_intr(vm->vmid, virt_to_phys(irq_info));
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to configure intr for ptdev!\n");
+		kfree(irq_info);
+		break;
+	case ACRN_IOCTL_RESET_PTDEV_INTR:
+		irq_info = memdup_user((void __user *)ioctl_param,
+				       sizeof(struct acrn_ptdev_irq));
+		if (IS_ERR(irq_info))
+			return PTR_ERR(irq_info);
+
+		ret = hcall_reset_ptdev_intr(vm->vmid, virt_to_phys(irq_info));
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to reset intr for ptdev!\n");
+		kfree(irq_info);
+		break;
 	case ACRN_IOCTL_CREATE_IOREQ_CLIENT:
 		if (vm->default_client)
 			return -EEXIST;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index 5eba29e3ed38..f448301832cf 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -28,6 +28,12 @@
 #define HC_ID_MEM_BASE			0x40UL
 #define HC_VM_SET_MEMORY_REGIONS	_HC_ID(HC_ID, HC_ID_MEM_BASE + 0x02)
 
+#define HC_ID_PCI_BASE			0x50UL
+#define HC_SET_PTDEV_INTR		_HC_ID(HC_ID, HC_ID_PCI_BASE + 0x03)
+#define HC_RESET_PTDEV_INTR		_HC_ID(HC_ID, HC_ID_PCI_BASE + 0x04)
+#define HC_ASSIGN_PCIDEV		_HC_ID(HC_ID, HC_ID_PCI_BASE + 0x05)
+#define HC_DEASSIGN_PCIDEV		_HC_ID(HC_ID, HC_ID_PCI_BASE + 0x06)
+
 /**
  * hcall_create_vm() - Create a User VM
  * @vminfo:	Service VM GPA of info of User VM creation
@@ -130,4 +136,52 @@ static inline long hcall_set_memory_regions(u64 regions_pa)
 	return acrn_hypercall1(HC_VM_SET_MEMORY_REGIONS, regions_pa);
 }
 
+/**
+ * hcall_assign_pcidev() - Assign a PCI device to a User VM
+ * @vmid:	User VM ID
+ * @addr:	Service VM GPA of the &struct acrn_pcidev
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_assign_pcidev(u64 vmid, u64 addr)
+{
+	return acrn_hypercall2(HC_ASSIGN_PCIDEV, vmid, addr);
+}
+
+/**
+ * hcall_deassign_pcidev() - De-assign a PCI device from a User VM
+ * @vmid:	User VM ID
+ * @addr:	Service VM GPA of the &struct acrn_pcidev
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_deassign_pcidev(u64 vmid, u64 addr)
+{
+	return acrn_hypercall2(HC_DEASSIGN_PCIDEV, vmid, addr);
+}
+
+/**
+ * hcall_set_ptdev_intr() - Configure an interrupt for an assigned PCI device.
+ * @vmid:	User VM ID
+ * @irq:	Service VM GPA of the &struct acrn_ptdev_irq
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_ptdev_intr(u64 vmid, u64 irq)
+{
+	return acrn_hypercall2(HC_SET_PTDEV_INTR, vmid, irq);
+}
+
+/**
+ * hcall_reset_ptdev_intr() - Reset an interrupt for an assigned PCI device.
+ * @vmid:	User VM ID
+ * @irq:	Service VM GPA of the &struct acrn_ptdev_irq
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_reset_ptdev_intr(u64 vmid, u64 irq)
+{
+	return acrn_hypercall2(HC_RESET_PTDEV_INTR, vmid, irq);
+}
+
 #endif /* __ACRN_HSM_HYPERCALL_H */
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 31cf0fd73bcc..893389babbcb 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -289,6 +289,60 @@ struct acrn_vm_memmap {
 	__u32	attr;
 } __attribute__((aligned(8)));
 
+/* Type of interrupt of a passthrough device */
+#define ACRN_PTDEV_IRQ_INTX	0
+#define ACRN_PTDEV_IRQ_MSI	1
+#define ACRN_PTDEV_IRQ_MSIX	2
+/**
+ * struct acrn_ptdev_irq - Interrupt data of a passthrough device.
+ * @type:		Type (ACRN_PTDEV_IRQ_*)
+ * @virt_bdf:		Virtual Bus/Device/Function
+ * @phys_bdf:		Physical Bus/Device/Function
+ * @intx:		Info of interrupt
+ * @intx.virt_pin:	Virtual IOAPIC pin
+ * @intx.phys_pin:	Physical IOAPIC pin
+ * @intx.is_pic_pin:	Is PIC pin or not
+ *
+ * This structure will be passed to hypervisor directly.
+ */
+struct acrn_ptdev_irq {
+	__u32	type;
+	__u16	virt_bdf;
+	__u16	phys_bdf;
+
+	struct {
+		__u32	virt_pin;
+		__u32	phys_pin;
+		__u32	is_pic_pin;
+	} intx;
+} __attribute__((aligned(8)));
+
+/* Type of PCI device assignment */
+#define ACRN_PTDEV_QUIRK_ASSIGN	(1U << 0)
+
+#define ACRN_PCI_NUM_BARS	6
+/**
+ * struct acrn_pcidev - Info for assigning or de-assigning a PCI device
+ * @type:	Type of the assignment
+ * @virt_bdf:	Virtual Bus/Device/Function
+ * @phys_bdf:	Physical Bus/Device/Function
+ * @intr_line:	PCI interrupt line
+ * @intr_pin:	PCI interrupt pin
+ * @bar:	PCI BARs.
+ * @reserved:	Reserved.
+ *
+ * This structure will be passed to hypervisor directly.
+ */
+struct acrn_pcidev {
+	__u32	type;
+	__u16	virt_bdf;
+	__u16	phys_bdf;
+	__u8	intr_line;
+	__u8	intr_pin;
+	__u32	bar[ACRN_PCI_NUM_BARS];
+	__u32	reserved[6];
+} __attribute__((aligned(8)));
+
 /* The ioctl type, documented in ioctl-number.rst */
 #define ACRN_IOCTL_TYPE			0xA2
 
@@ -324,4 +378,13 @@ struct acrn_vm_memmap {
 #define ACRN_IOCTL_UNSET_MEMSEG		\
 	_IOW(ACRN_IOCTL_TYPE, 0x42, struct acrn_vm_memmap)
 
+#define ACRN_IOCTL_SET_PTDEV_INTR	\
+	_IOW(ACRN_IOCTL_TYPE, 0x53, struct acrn_ptdev_irq)
+#define ACRN_IOCTL_RESET_PTDEV_INTR	\
+	_IOW(ACRN_IOCTL_TYPE, 0x54, struct acrn_ptdev_irq)
+#define ACRN_IOCTL_ASSIGN_PCIDEV	\
+	_IOW(ACRN_IOCTL_TYPE, 0x55, struct acrn_pcidev)
+#define ACRN_IOCTL_DEASSIGN_PCIDEV	\
+	_IOW(ACRN_IOCTL_TYPE, 0x56, struct acrn_pcidev)
+
 #endif /* _UAPI_ACRN_H */
-- 
2.28.0



* [PATCH v4 12/17] virt: acrn: Introduce interrupt injection interfaces
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (7 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 11/17] virt: acrn: Introduce interfaces for PCI device passthrough shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 14/17] virt: acrn: Introduce I/O ranges operation interfaces shuo.a.liu
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

ACRN userspace needs to inject virtual interrupts into a User VM when
emulating devices, so HSM needs to provide interfaces to do so.

Introduce the following interrupt injection interfaces:

ioctl ACRN_IOCTL_SET_IRQLINE:
  Pass data from userspace to the hypervisor, and inform the hypervisor
  to inject a virtual IOAPIC GSI interrupt to a User VM.

ioctl ACRN_IOCTL_INJECT_MSI:
  Pass data struct acrn_msi_entry from userspace to the hypervisor, and
  inform the hypervisor to inject a virtual MSI to a User VM.

ioctl ACRN_IOCTL_VM_INTR_MONITOR:
  Set a 4-KByte-aligned shared page for the interrupt statistics of a
  User VM.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
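The MSI layout documented for struct acrn_msi_entry in this patch (addr[19:12]
carries the destination vCPU ID, data[7:0] carries the vector) can be sketched
as below. This is only an illustration: the struct is a local mirror using
<stdint.h> types, make_msi() and the two accessors are hypothetical helpers,
and OR-ing in the x86 MSI address base 0xFEE00000 is an assumption about what
the hypervisor expects rather than something this patch states.

```c
#include <stdint.h>

/* Local mirror of struct acrn_msi_entry from this patch (stdint types). */
struct acrn_msi_entry {
	uint64_t msi_addr;
	uint64_t msi_data;
};

/* Assumption: the conventional x86 MSI address window is used as the base. */
#define MSI_ADDR_BASE	0xFEE00000ULL

/* Hypothetical helper: encode destination vCPU and vector into an entry. */
static inline struct acrn_msi_entry make_msi(uint8_t dest_vcpu, uint8_t vector)
{
	struct acrn_msi_entry msi;

	msi.msi_addr = MSI_ADDR_BASE | ((uint64_t)dest_vcpu << 12);
	msi.msi_data = vector;
	return msi;
}

/* Decode addr[19:12] back into the destination vCPU ID. */
static inline uint8_t msi_dest_vcpu(const struct acrn_msi_entry *msi)
{
	return (uint8_t)((msi->msi_addr >> 12) & 0xff);
}

/* Decode data[7:0] back into the vector. */
static inline uint8_t msi_vector(const struct acrn_msi_entry *msi)
{
	return (uint8_t)(msi->msi_data & 0xff);
}
```

An entry built this way could be passed to ACRN_IOCTL_INJECT_MSI from
userspace, or its two fields passed to acrn_msi_inject() from a kernel-side
user such as irqfd.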
 drivers/virt/acrn/acrn_drv.h  |  4 ++++
 drivers/virt/acrn/hsm.c       | 39 +++++++++++++++++++++++++++++++++
 drivers/virt/acrn/hypercall.h | 41 +++++++++++++++++++++++++++++++++++
 drivers/virt/acrn/vm.c        | 36 ++++++++++++++++++++++++++++++
 include/uapi/linux/acrn.h     | 17 +++++++++++++++
 5 files changed, 137 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 97d2aab8b70a..701c83319115 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -157,6 +157,7 @@ extern rwlock_t acrn_vm_list_lock;
  * @ioreq_buf:			I/O request shared buffer
  * @ioreq_page:			The page of the I/O request shared buffer
  * @pci_conf_addr:		Address of a PCI configuration access emulation
+ * @monitor_page:		Page of interrupt statistics of User VM
  */
 struct acrn_vm {
 	struct list_head		list;
@@ -172,6 +173,7 @@ struct acrn_vm {
 	struct acrn_io_request_buffer	*ioreq_buf;
 	struct page			*ioreq_page;
 	u32				pci_conf_addr;
+	struct page			*monitor_page;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -198,4 +200,6 @@ struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
 						   const char *name);
 void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
 
+int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index bcb5e273dcc3..5c82e0491210 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -51,7 +51,9 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 	struct acrn_ioreq_notify notify;
 	struct acrn_ptdev_irq *irq_info;
 	struct acrn_vm_memmap memmap;
+	struct acrn_msi_entry *msi;
 	struct acrn_pcidev *pcidev;
+	struct page *page;
 	int ret = 0;
 
 	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -178,6 +180,43 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 				"Failed to reset intr for ptdev!\n");
 		kfree(irq_info);
 		break;
+	case ACRN_IOCTL_SET_IRQLINE:
+		ret = hcall_set_irqline(vm->vmid, ioctl_param);
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to set interrupt line!\n");
+		break;
+	case ACRN_IOCTL_INJECT_MSI:
+		msi = memdup_user((void __user *)ioctl_param,
+				  sizeof(struct acrn_msi_entry));
+		if (IS_ERR(msi))
+			return PTR_ERR(msi);
+
+		ret = hcall_inject_msi(vm->vmid, virt_to_phys(msi));
+		if (ret < 0)
+			dev_err(acrn_dev.this_device,
+				"Failed to inject MSI!\n");
+		kfree(msi);
+		break;
+	case ACRN_IOCTL_VM_INTR_MONITOR:
+		ret = get_user_pages_fast(ioctl_param, 1, FOLL_WRITE, &page);
+		if (unlikely(ret != 1)) {
+			dev_err(acrn_dev.this_device,
+				"Failed to pin intr hdr buffer!\n");
+			return -EFAULT;
+		}
+
+		ret = hcall_vm_intr_monitor(vm->vmid, page_to_phys(page));
+		if (ret < 0) {
+			put_page(page);
+			dev_err(acrn_dev.this_device,
+				"Failed to monitor intr data!\n");
+			return ret;
+		}
+		if (vm->monitor_page)
+			put_page(vm->monitor_page);
+		vm->monitor_page = page;
+		break;
 	case ACRN_IOCTL_CREATE_IOREQ_CLIENT:
 		if (vm->default_client)
 			return -EEXIST;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index f448301832cf..a8813397a3fe 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -21,6 +21,11 @@
 #define HC_RESET_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
 #define HC_SET_VCPU_REGS		_HC_ID(HC_ID, HC_ID_VM_BASE + 0x06)
 
+#define HC_ID_IRQ_BASE			0x20UL
+#define HC_INJECT_MSI			_HC_ID(HC_ID, HC_ID_IRQ_BASE + 0x03)
+#define HC_VM_INTR_MONITOR		_HC_ID(HC_ID, HC_ID_IRQ_BASE + 0x04)
+#define HC_SET_IRQLINE			_HC_ID(HC_ID, HC_ID_IRQ_BASE + 0x05)
+
 #define HC_ID_IOREQ_BASE		0x30UL
 #define HC_SET_IOREQ_BUFFER		_HC_ID(HC_ID, HC_ID_IOREQ_BASE + 0x00)
 #define HC_NOTIFY_REQUEST_FINISH	_HC_ID(HC_ID, HC_ID_IOREQ_BASE + 0x01)
@@ -101,6 +106,42 @@ static inline long hcall_set_vcpu_regs(u64 vmid, u64 regs_state)
 	return acrn_hypercall2(HC_SET_VCPU_REGS, vmid, regs_state);
 }
 
+/**
+ * hcall_inject_msi() - Deliver an MSI interrupt to a User VM
+ * @vmid:	User VM ID
+ * @msi:	Service VM GPA of MSI message
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_inject_msi(u64 vmid, u64 msi)
+{
+	return acrn_hypercall2(HC_INJECT_MSI, vmid, msi);
+}
+
+/**
+ * hcall_vm_intr_monitor() - Set a shared page for User VM interrupt statistics
+ * @vmid:	User VM ID
+ * @addr:	Service VM GPA of the shared page
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_vm_intr_monitor(u64 vmid, u64 addr)
+{
+	return acrn_hypercall2(HC_VM_INTR_MONITOR, vmid, addr);
+}
+
+/**
+ * hcall_set_irqline() - Set or clear an interrupt line
+ * @vmid:	User VM ID
+ * @op:		Service VM GPA of interrupt line operations
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_irqline(u64 vmid, u64 op)
+{
+	return acrn_hypercall2(HC_SET_IRQLINE, vmid, op);
+}
+
 /**
  * hcall_set_ioreq_buffer() - Set up the shared buffer for I/O Requests.
  * @vmid:	User VM ID
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 4ee1a99df4b7..38304aeef181 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -68,6 +68,10 @@ int acrn_vm_destroy(struct acrn_vm *vm)
 	write_unlock_bh(&acrn_vm_list_lock);
 
 	acrn_ioreq_deinit(vm);
+	if (vm->monitor_page) {
+		put_page(vm->monitor_page);
+		vm->monitor_page = NULL;
+	}
 
 	ret = hcall_destroy_vm(vm->vmid);
 	if (ret < 0) {
@@ -83,3 +87,35 @@ int acrn_vm_destroy(struct acrn_vm *vm)
 	vm->vmid = ACRN_INVALID_VMID;
 	return 0;
 }
+
+/**
+ * acrn_msi_inject() - Inject an MSI interrupt into a User VM
+ * @vm:		User VM
+ * @msi_addr:	The MSI address
+ * @msi_data:	The MSI data
+ *
+ * Return: 0 on success, <0 on error
+ */
+int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data)
+{
+	struct acrn_msi_entry *msi;
+	int ret;
+
+	/* might be used in interrupt context, so use GFP_ATOMIC */
+	msi = kzalloc(sizeof(*msi), GFP_ATOMIC);
+	if (!msi)
+		return -ENOMEM;
+
+	/*
+	 * msi_addr: addr[19:12] with dest vcpu id
+	 * msi_data: data[7:0] with vector
+	 */
+	msi->msi_addr = msi_addr;
+	msi->msi_data = msi_data;
+	ret = hcall_inject_msi(vm->vmid, virt_to_phys(msi));
+	if (ret < 0)
+		dev_err(acrn_dev.this_device,
+			"Failed to inject MSI to VM %u!\n", vm->vmid);
+	kfree(msi);
+	return ret;
+}
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 893389babbcb..7764459e130c 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -343,6 +343,16 @@ struct acrn_pcidev {
 	__u32	reserved[6];
 } __attribute__((aligned(8)));
 
+/**
+ * struct acrn_msi_entry - Info for injecting an MSI interrupt into a VM
+ * @msi_addr:	MSI addr[19:12] with dest vCPU ID
+ * @msi_data:	MSI data[7:0] with vector
+ */
+struct acrn_msi_entry {
+	__u64	msi_addr;
+	__u64	msi_data;
+};
+
 /* The ioctl type, documented in ioctl-number.rst */
 #define ACRN_IOCTL_TYPE			0xA2
 
@@ -362,6 +372,13 @@ struct acrn_pcidev {
 #define ACRN_IOCTL_SET_VCPU_REGS	\
 	_IOW(ACRN_IOCTL_TYPE, 0x16, struct acrn_vcpu_regs)
 
+#define ACRN_IOCTL_INJECT_MSI		\
+	_IOW(ACRN_IOCTL_TYPE, 0x23, struct acrn_msi_entry)
+#define ACRN_IOCTL_VM_INTR_MONITOR	\
+	_IOW(ACRN_IOCTL_TYPE, 0x24, unsigned long)
+#define ACRN_IOCTL_SET_IRQLINE		\
+	_IOW(ACRN_IOCTL_TYPE, 0x25, __u64)
+
 #define ACRN_IOCTL_NOTIFY_REQUEST_FINISH \
 	_IOW(ACRN_IOCTL_TYPE, 0x31, struct acrn_ioreq_notify)
 #define ACRN_IOCTL_CREATE_IOREQ_CLIENT	\
-- 
2.28.0



* [PATCH v4 14/17] virt: acrn: Introduce I/O ranges operation interfaces
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (8 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 12/17] virt: acrn: Introduce interrupt injection interfaces shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 16/17] virt: acrn: Introduce irqfd shuo.a.liu
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

An I/O request of a User VM, which is constructed by the hypervisor, is
distributed by the ACRN Hypervisor Service Module to the I/O client
whose address ranges cover the I/O request.

Each I/O client maintains a list of address ranges. Introduce
acrn_ioreq_range_{add,del}() to manage these address ranges.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
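To make the range bookkeeping described in the commit message above concrete,
here is a small userspace model of an add/del range list plus a lookup. It is
a sketch, not the kernel code: it uses a plain singly linked list without
locking, the RANGE_PORTIO/RANGE_MMIO values are placeholders for the
ACRN_IOREQ_TYPE_* constants, and range_contains() assumes that a request is
matched when it lies entirely inside one range (the dispatch logic itself is
not part of this patch).

```c
#include <stdint.h>
#include <stdlib.h>

/* Placeholder type values; the real ones are ACRN_IOREQ_TYPE_*. */
enum { RANGE_PORTIO, RANGE_MMIO };

struct io_range {
	struct io_range *next;
	uint32_t type;
	uint64_t start;
	uint64_t end;		/* inclusive, as in [start, end] */
};

/* Add [start, end]; reject inverted ranges like acrn_ioreq_range_add(). */
static inline int range_add(struct io_range **head, uint32_t type,
			    uint64_t start, uint64_t end)
{
	struct io_range *r;

	if (end < start)
		return -1;
	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;
	r->type = type;
	r->start = start;
	r->end = end;
	r->next = *head;
	*head = r;
	return 0;
}

/* Remove the first range matching type, start and end exactly. */
static inline void range_del(struct io_range **head, uint32_t type,
			     uint64_t start, uint64_t end)
{
	struct io_range **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		struct io_range *r = *pp;

		if (r->type == type && r->start == start && r->end == end) {
			*pp = r->next;
			free(r);
			break;
		}
	}
}

/* Assumed matching rule: [addr, addr + len - 1] fits in one range; len >= 1. */
static inline int range_contains(const struct io_range *head, uint32_t type,
				 uint64_t addr, uint64_t len)
{
	for (; head; head = head->next)
		if (head->type == type && addr >= head->start &&
		    addr + len - 1 <= head->end)
			return 1;
	return 0;
}
```

As in the patch, deletion matches on the exact (type, start, end) triple, so a
caller has to remove a range with the same bounds it used when adding it.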
 drivers/virt/acrn/acrn_drv.h |  4 +++
 drivers/virt/acrn/ioreq.c    | 60 ++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 701c83319115..5b824fa1ee57 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -199,6 +199,10 @@ struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
 						   void *data, bool is_default,
 						   const char *name);
 void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+			 u32 type, u64 start, u64 end);
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+			  u32 type, u64 start, u64 end);
 
 int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data);
 
diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index bf194f0fbd70..acf4edfb8c74 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -101,6 +101,66 @@ int acrn_ioreq_request_default_complete(struct acrn_vm *vm, u16 vcpu)
 	return ret;
 }
 
+/**
+ * acrn_ioreq_range_add() - Add an iorange monitored by an ioreq client
+ * @client:	The ioreq client
+ * @type:	Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start:	Start address of iorange
+ * @end:	End address of iorange
+ *
+ * Return: 0 on success, <0 on error
+ */
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+			 u32 type, u64 start, u64 end)
+{
+	struct acrn_ioreq_range *range;
+
+	if (end < start) {
+		dev_err(acrn_dev.this_device,
+			"Invalid IO range [0x%llx,0x%llx]\n", start, end);
+		return -EINVAL;
+	}
+
+	range = kzalloc(sizeof(*range), GFP_KERNEL);
+	if (!range)
+		return -ENOMEM;
+
+	range->type = type;
+	range->start = start;
+	range->end = end;
+
+	write_lock_bh(&client->range_lock);
+	list_add(&range->list, &client->range_list);
+	write_unlock_bh(&client->range_lock);
+
+	return 0;
+}
+
+/**
+ * acrn_ioreq_range_del() - Delete an iorange monitored by an ioreq client
+ * @client:	The ioreq client
+ * @type:	Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start:	Start address of iorange
+ * @end:	End address of iorange
+ */
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+			  u32 type, u64 start, u64 end)
+{
+	struct acrn_ioreq_range *range;
+
+	write_lock_bh(&client->range_lock);
+	list_for_each_entry(range, &client->range_list, list) {
+		if (type == range->type &&
+		    start == range->start &&
+		    end == range->end) {
+			list_del(&range->list);
+			kfree(range);
+			break;
+		}
+	}
+	write_unlock_bh(&client->range_lock);
+}
+
 /*
  * ioreq_task() is the execution entity of handler thread of an I/O client.
  * The handler callback of the I/O client is called within the handler thread.
-- 
2.28.0



* [PATCH v4 16/17] virt: acrn: Introduce irqfd
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (9 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 14/17] virt: acrn: Introduce I/O ranges operation interfaces shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-22 11:43 ` [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU shuo.a.liu
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

irqfd is a mechanism to inject a specific interrupt to a User VM using a
decoupled eventfd mechanism.

Vhost is a kernel-level virtio server which uses eventfd for interrupt
injection. To support vhost on ACRN, irqfd is introduced in HSM.

HSM provides ioctls to associate a virtual Message Signaled Interrupt
(MSI) with an eventfd. The corresponding virtual MSI will be injected
into a User VM once the eventfd is signaled.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
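The irqfd flow described in the commit message above could be driven from
userspace roughly as follows. This is a hedged sketch: the structs are local
mirrors of struct acrn_irqfd and struct acrn_msi_entry using <stdint.h> types,
irqfd_request() and irqfd_config() are hypothetical helper names, and hsm_fd
is assumed to be an open HSM device fd on which the User VM was created.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirrors of the UAPI structures from this series (stdint types). */
struct acrn_msi_entry {
	uint64_t msi_addr;
	uint64_t msi_data;
};

#define ACRN_IRQFD_FLAG_DEASSIGN	0x01
struct acrn_irqfd {
	int32_t			fd;
	uint32_t		flags;
	struct acrn_msi_entry	msi;
};

#define ACRN_IOCTL_TYPE		0xA2
#define ACRN_IOCTL_IRQFD	\
	_IOW(ACRN_IOCTL_TYPE, 0x71, struct acrn_irqfd)

/* Hypothetical helper: build an assign (deassign == 0) or deassign request. */
static inline struct acrn_irqfd irqfd_request(int efd, uint64_t msi_addr,
					      uint64_t msi_data, int deassign)
{
	struct acrn_irqfd req;

	memset(&req, 0, sizeof(req));
	req.fd = efd;
	req.flags = deassign ? ACRN_IRQFD_FLAG_DEASSIGN : 0;
	req.msi.msi_addr = msi_addr;
	req.msi.msi_data = msi_data;
	return req;
}

/* hsm_fd is hypothetical: an open HSM device fd that already owns a VM. */
static inline int irqfd_config(int hsm_fd, const struct acrn_irqfd *req)
{
	return ioctl(hsm_fd, ACRN_IOCTL_IRQFD, req);
}
```

In a typical use, efd would come from eventfd(2). After a successful assign,
writing an 8-byte counter value to efd (as vhost does when it signals a
virtqueue interrupt) wakes hsm_irqfd_wakeup(), which injects the associated
MSI. A deassign request only needs the same fd with
ACRN_IRQFD_FLAG_DEASSIGN set.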
 drivers/virt/acrn/Makefile   |   2 +-
 drivers/virt/acrn/acrn_drv.h |  10 ++
 drivers/virt/acrn/hsm.c      |   7 ++
 drivers/virt/acrn/irqfd.c    | 235 +++++++++++++++++++++++++++++++++++
 drivers/virt/acrn/vm.c       |   3 +
 include/uapi/linux/acrn.h    |  15 +++
 6 files changed, 271 insertions(+), 1 deletion(-)
 create mode 100644 drivers/virt/acrn/irqfd.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 755b583b32ca..08ce641dcfa1 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM)	:= acrn.o
-acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o
+acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o irqfd.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index c66c620b9f10..8354d0d5881c 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -161,6 +161,9 @@ extern rwlock_t acrn_vm_list_lock;
  * @ioeventfds_lock:		Lock to protect ioeventfds list
  * @ioeventfds:			List to link all hsm_ioeventfd
  * @ioeventfd_client:		I/O client for ioeventfds of the VM
+ * @irqfds_lock:		Lock to protect irqfds list
+ * @irqfds:			List to link all hsm_irqfd
+ * @irqfd_wq:			Workqueue for irqfd async shutdown
  */
 struct acrn_vm {
 	struct list_head		list;
@@ -180,6 +183,9 @@ struct acrn_vm {
 	struct mutex			ioeventfds_lock;
 	struct list_head		ioeventfds;
 	struct acrn_ioreq_client	*ioeventfd_client;
+	struct mutex			irqfds_lock;
+	struct list_head		irqfds;
+	struct workqueue_struct		*irqfd_wq;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -216,4 +222,8 @@ int acrn_ioeventfd_init(struct acrn_vm *vm);
 int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args);
 void acrn_ioeventfd_deinit(struct acrn_vm *vm);
 
+int acrn_irqfd_init(struct acrn_vm *vm);
+int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args);
+void acrn_irqfd_deinit(struct acrn_vm *vm);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 40ec16a21308..aaf4e76d27b4 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -115,6 +115,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 	struct acrn_vm_memmap memmap;
 	struct acrn_msi_entry *msi;
 	struct acrn_pcidev *pcidev;
+	struct acrn_irqfd irqfd;
 	struct page *page;
 	u64 cstate_cmd;
 	int ret = 0;
@@ -317,6 +318,12 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 
 		ret = acrn_ioeventfd_config(vm, &ioeventfd);
 		break;
+	case ACRN_IOCTL_IRQFD:
+		if (copy_from_user(&irqfd, (void __user *)ioctl_param,
+				   sizeof(irqfd)))
+			return -EFAULT;
+		ret = acrn_irqfd_config(vm, &irqfd);
+		break;
 	default:
 		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
 		ret = -ENOTTY;
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
new file mode 100644
index 000000000000..a8766d528e29
--- /dev/null
+++ b/drivers/virt/acrn/irqfd.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN HSM irqfd: use eventfd objects to inject virtual interrupts
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ *	Shuo Liu <shuo.a.liu@intel.com>
+ *	Yakui Zhao <yakui.zhao@intel.com>
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/poll.h>
+#include <linux/slab.h>
+
+#include "acrn_drv.h"
+
+static LIST_HEAD(acrn_irqfd_clients);
+static DEFINE_MUTEX(acrn_irqfds_mutex);
+
+/**
+ * struct hsm_irqfd - Properties of HSM irqfd
+ * @vm:		Associated VM pointer
+ * @wait:	Entry of wait-queue
+ * @shutdown:	Async shutdown work
+ * @eventfd:	Associated eventfd
+ * @list:	Entry within &acrn_vm.irqfds of irqfds of a VM
+ * @pt:		Structure for select/poll on the associated eventfd
+ * @msi:	MSI data
+ */
+struct hsm_irqfd {
+	struct acrn_vm		*vm;
+	wait_queue_entry_t	wait;
+	struct work_struct	shutdown;
+	struct eventfd_ctx	*eventfd;
+	struct list_head	list;
+	poll_table		pt;
+	struct acrn_msi_entry	msi;
+};
+
+static void acrn_irqfd_inject(struct hsm_irqfd *irqfd)
+{
+	struct acrn_vm *vm = irqfd->vm;
+
+	acrn_msi_inject(vm, irqfd->msi.msi_addr,
+			irqfd->msi.msi_data);
+}
+
+static void hsm_irqfd_shutdown(struct hsm_irqfd *irqfd)
+{
+	u64 cnt;
+
+	lockdep_assert_held(&irqfd->vm->irqfds_lock);
+
+	/* remove from wait queue */
+	list_del_init(&irqfd->list);
+	eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
+	eventfd_ctx_put(irqfd->eventfd);
+	kfree(irqfd);
+}
+
+static void hsm_irqfd_shutdown_work(struct work_struct *work)
+{
+	struct hsm_irqfd *irqfd;
+	struct acrn_vm *vm;
+
+	irqfd = container_of(work, struct hsm_irqfd, shutdown);
+	vm = irqfd->vm;
+	mutex_lock(&vm->irqfds_lock);
+	if (!list_empty(&irqfd->list))
+		hsm_irqfd_shutdown(irqfd);
+	mutex_unlock(&vm->irqfds_lock);
+}
+
+/* Called with wqh->lock held and interrupts disabled */
+static int hsm_irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode,
+			    int sync, void *key)
+{
+	unsigned long poll_bits = (unsigned long)key;
+	struct hsm_irqfd *irqfd;
+	struct acrn_vm *vm;
+
+	irqfd = container_of(wait, struct hsm_irqfd, wait);
+	vm = irqfd->vm;
+	if (poll_bits & POLLIN)
+		/* An event has been signaled, inject an interrupt */
+		acrn_irqfd_inject(irqfd);
+
+	if (poll_bits & POLLHUP)
+		/* wqh->lock is held here, so defer shutdown to a worker thread */
+		queue_work(vm->irqfd_wq, &irqfd->shutdown);
+
+	return 0;
+}
+
+static void hsm_irqfd_poll_func(struct file *file, wait_queue_head_t *wqh,
+				poll_table *pt)
+{
+	struct hsm_irqfd *irqfd;
+
+	irqfd = container_of(pt, struct hsm_irqfd, pt);
+	add_wait_queue(wqh, &irqfd->wait);
+}
+
+/*
+ * Assign an eventfd to a VM and create an HSM irqfd associated with the
+ * eventfd. The properties of the HSM irqfd are built from a &struct
+ * acrn_irqfd.
+ */
+static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
+{
+	struct eventfd_ctx *eventfd = NULL;
+	struct hsm_irqfd *irqfd, *tmp;
+	unsigned int events;
+	struct fd f;
+	int ret = 0;
+
+	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+	if (!irqfd)
+		return -ENOMEM;
+
+	irqfd->vm = vm;
+	memcpy(&irqfd->msi, &args->msi, sizeof(args->msi));
+	INIT_LIST_HEAD(&irqfd->list);
+	INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work);
+
+	f = fdget(args->fd);
+	if (!f.file) {
+		ret = -EBADF;
+		goto out;
+	}
+
+	eventfd = eventfd_ctx_fileget(f.file);
+	if (IS_ERR(eventfd)) {
+		ret = PTR_ERR(eventfd);
+		goto fail;
+	}
+
+	irqfd->eventfd = eventfd;
+
+	/*
+	 * Install custom wake-up handling to be notified whenever underlying
+	 * eventfd is signaled.
+	 */
+	init_waitqueue_func_entry(&irqfd->wait, hsm_irqfd_wakeup);
+	init_poll_funcptr(&irqfd->pt, hsm_irqfd_poll_func);
+
+	mutex_lock(&vm->irqfds_lock);
+	list_for_each_entry(tmp, &vm->irqfds, list) {
+		if (irqfd->eventfd != tmp->eventfd)
+			continue;
+		ret = -EBUSY;
+		mutex_unlock(&vm->irqfds_lock);
+		goto fail;
+	}
+	list_add_tail(&irqfd->list, &vm->irqfds);
+	mutex_unlock(&vm->irqfds_lock);
+
+	/* Check the pending event in this stage */
+	events = f.file->f_op->poll(f.file, &irqfd->pt);
+
+	if (events & POLLIN)
+		acrn_irqfd_inject(irqfd);
+
+	fdput(f);
+	return 0;
+fail:
+	if (eventfd && !IS_ERR(eventfd))
+		eventfd_ctx_put(eventfd);
+
+	fdput(f);
+out:
+	kfree(irqfd);
+	return ret;
+}
+
+static int acrn_irqfd_deassign(struct acrn_vm *vm,
+			       struct acrn_irqfd *args)
+{
+	struct hsm_irqfd *irqfd, *tmp;
+	struct eventfd_ctx *eventfd;
+
+	eventfd = eventfd_ctx_fdget(args->fd);
+	if (IS_ERR(eventfd))
+		return PTR_ERR(eventfd);
+
+	mutex_lock(&vm->irqfds_lock);
+	list_for_each_entry_safe(irqfd, tmp, &vm->irqfds, list) {
+		if (irqfd->eventfd == eventfd) {
+			hsm_irqfd_shutdown(irqfd);
+			break;
+		}
+	}
+	mutex_unlock(&vm->irqfds_lock);
+	eventfd_ctx_put(eventfd);
+
+	return 0;
+}
+
+int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args)
+{
+	int ret;
+
+	if (args->flags & ACRN_IRQFD_FLAG_DEASSIGN)
+		ret = acrn_irqfd_deassign(vm, args);
+	else
+		ret = acrn_irqfd_assign(vm, args);
+
+	return ret;
+}
+
+int acrn_irqfd_init(struct acrn_vm *vm)
+{
+	INIT_LIST_HEAD(&vm->irqfds);
+	mutex_init(&vm->irqfds_lock);
+	vm->irqfd_wq = alloc_workqueue("acrn_irqfd-%u", 0, 0, vm->vmid);
+	if (!vm->irqfd_wq)
+		return -ENOMEM;
+
+	dev_dbg(acrn_dev.this_device, "VM %u irqfd init.\n", vm->vmid);
+	return 0;
+}
+
+void acrn_irqfd_deinit(struct acrn_vm *vm)
+{
+	struct hsm_irqfd *irqfd, *next;
+
+	dev_dbg(acrn_dev.this_device, "VM %u irqfd deinit.\n", vm->vmid);
+	destroy_workqueue(vm->irqfd_wq);
+	mutex_lock(&vm->irqfds_lock);
+	list_for_each_entry_safe(irqfd, next, &vm->irqfds, list)
+		hsm_irqfd_shutdown(irqfd);
+	mutex_unlock(&vm->irqfds_lock);
+}
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 3c671b03b273..7f152a74b591 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -51,6 +51,7 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
 	write_unlock_bh(&acrn_vm_list_lock);
 
 	acrn_ioeventfd_init(vm);
+	acrn_irqfd_init(vm);
 	dev_dbg(acrn_dev.this_device, "VM %u created.\n", vm->vmid);
 	return vm;
 }
@@ -69,7 +70,9 @@ int acrn_vm_destroy(struct acrn_vm *vm)
 	write_unlock_bh(&acrn_vm_list_lock);
 
 	acrn_ioeventfd_deinit(vm);
+	acrn_irqfd_deinit(vm);
 	acrn_ioreq_deinit(vm);
+
 	if (vm->monitor_page) {
 		put_page(vm->monitor_page);
 		vm->monitor_page = NULL;
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 7a99124c7d4d..75a687838a43 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -411,6 +411,19 @@ struct acrn_ioeventfd {
 	__u64	data;
 };
 
+#define ACRN_IRQFD_FLAG_DEASSIGN	0x01
+/**
+ * struct acrn_irqfd - Data to operate a &struct hsm_irqfd
+ * @fd:		The fd of eventfd associated with a hsm_irqfd
+ * @flags:	Logical-OR of ACRN_IRQFD_FLAG_*
+ * @msi:	Info of MSI associated with the irqfd
+ */
+struct acrn_irqfd {
+	__s32			fd;
+	__u32			flags;
+	struct acrn_msi_entry	msi;
+};
+
 /* The ioctl type, documented in ioctl-number.rst */
 #define ACRN_IOCTL_TYPE			0xA2
 
@@ -467,5 +480,7 @@ struct acrn_ioeventfd {
 
 #define ACRN_IOCTL_IOEVENTFD		\
 	_IOW(ACRN_IOCTL_TYPE, 0x70, struct acrn_ioeventfd)
+#define ACRN_IOCTL_IRQFD		\
+	_IOW(ACRN_IOCTL_TYPE, 0x71, struct acrn_irqfd)
 
 #endif /* _UAPI_ACRN_H */
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread
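
As an aside on the uapi hunk in the patch above: ACRN_IOCTL_IRQFD is built with _IOW(), which packs the direction, the type (0xA2), the sequence number (0x71) and the argument size into a single command value. A minimal userspace sketch of that encoding follows, assuming Linux uapi headers are installed. The demo_* names are hypothetical, and demo_msi_entry is only a stand-in, because struct acrn_msi_entry is defined elsewhere in the series and its layout is assumed here.

```c
#include <stdint.h>
#include <stdio.h>
#include <linux/ioctl.h>

/* Hypothetical stand-in: the real struct acrn_msi_entry is not shown here. */
struct demo_msi_entry {
	uint64_t msi_addr;
	uint64_t msi_data;
};

/* Mirrors struct acrn_irqfd from the patch, using the stand-in above. */
struct demo_irqfd {
	int32_t fd;
	uint32_t flags;
	struct demo_msi_entry msi;
};

#define DEMO_IOCTL_TYPE		0xA2
#define DEMO_IOCTL_IRQFD	_IOW(DEMO_IOCTL_TYPE, 0x71, struct demo_irqfd)

/* Print the fields that _IOW() packed into the command number. */
void demo_print_ioctl(unsigned long cmd)
{
	printf("type=0x%lx nr=0x%lx size=%lu write=%d\n",
	       (unsigned long)_IOC_TYPE(cmd), (unsigned long)_IOC_NR(cmd),
	       (unsigned long)_IOC_SIZE(cmd),
	       (_IOC_DIR(cmd) & _IOC_WRITE) != 0);
}
```

Calling demo_print_ioctl(DEMO_IOCTL_IRQFD) shows how a driver or a tracer can recover the type and sequence number from the raw command. The size field only matches the real ioctl if the stand-in struct has the same size as the real one.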

* [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (10 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 16/17] virt: acrn: Introduce irqfd shuo.a.liu
@ 2020-09-22 11:43 ` shuo.a.liu
  2020-09-27 10:44   ` Greg Kroah-Hartman
  2020-09-27  0:24 ` [PATCH v4 00/17] HSM driver for ACRN hypervisor Liu, Shuo A
       [not found] ` <20200922114311.38804-7-shuo.a.liu@intel.com>
  13 siblings, 1 reply; 58+ messages in thread
From: shuo.a.liu @ 2020-09-22 11:43 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Shuo Liu, Zhi Wang, Zhenyu Wang

From: Shuo Liu <shuo.a.liu@intel.com>

ACRN supports partition mode to meet real-time requirements. In
partition mode, a CPU core can be dedicated to a vCPU of a User VM. The
local APIC of the dedicated CPU core can be passed through to the User VM.
The Service VM controls the assignment of the CPU cores.

Introduce an interface for the Service VM to give up control of a CPU
core, from the hypervisor's perspective, so that the core can become a
dedicated CPU core of a User VM.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/virt/acrn/hsm.c       | 50 +++++++++++++++++++++++++++++++++++
 drivers/virt/acrn/hypercall.h | 14 ++++++++++
 2 files changed, 64 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index aaf4e76d27b4..ef5f77a38d1f 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -9,6 +9,7 @@
  *	Yakui Zhao <yakui.zhao@intel.com>
  */
 
+#include <linux/cpu.h>
 #include <linux/io.h>
 #include <linux/mm.h>
 #include <linux/module.h>
@@ -354,6 +355,47 @@ struct miscdevice acrn_dev = {
 	.fops	= &acrn_fops,
 };
 
+static ssize_t remove_cpu_store(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	u64 cpu, lapicid;
+	int ret;
+
+	if (kstrtoull(buf, 0, &cpu) < 0)
+		return -EINVAL;
+
+	if (cpu >= num_possible_cpus() || cpu == 0 || !cpu_is_hotpluggable(cpu))
+		return -EINVAL;
+
+	if (cpu_online(cpu))
+		remove_cpu(cpu);
+
+	lapicid = cpu_data(cpu).apicid;
+	dev_dbg(dev, "Try to remove cpu %lld with lapicid %lld\n", cpu, lapicid);
+	ret = hcall_sos_remove_cpu(lapicid);
+	if (ret < 0) {
+		dev_err(dev, "Failed to remove cpu %lld!\n", cpu);
+		goto fail_remove;
+	}
+
+	return count;
+
+fail_remove:
+	add_cpu(cpu);
+	return ret;
+}
+static DEVICE_ATTR_WO(remove_cpu);
+
+static struct attribute *acrn_attrs[] = {
+	&dev_attr_remove_cpu.attr,
+	NULL
+};
+
+static struct attribute_group acrn_attr_group = {
+	.attrs = acrn_attrs,
+};
+
 static int __init hsm_init(void)
 {
 	int ret;
@@ -370,13 +412,21 @@ static int __init hsm_init(void)
 		return ret;
 	}
 
+	ret = sysfs_create_group(&acrn_dev.this_device->kobj, &acrn_attr_group);
+	if (ret) {
+		dev_warn(acrn_dev.this_device, "sysfs create failed\n");
+		misc_deregister(&acrn_dev);
+		return ret;
+	}
 	acrn_ioreq_intr_setup();
+
 	return 0;
 }
 
 static void __exit hsm_exit(void)
 {
 	acrn_ioreq_intr_remove();
+	sysfs_remove_group(&acrn_dev.this_device->kobj, &acrn_attr_group);
 	misc_deregister(&acrn_dev);
 }
 module_init(hsm_init);
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index e640632366f0..0cfad05bd1a9 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -13,6 +13,9 @@
 
 #define HC_ID 0x80UL
 
+#define HC_ID_GEN_BASE			0x0UL
+#define HC_SOS_REMOVE_CPU		_HC_ID(HC_ID, HC_ID_GEN_BASE + 0x01)
+
 #define HC_ID_VM_BASE			0x10UL
 #define HC_CREATE_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x00)
 #define HC_DESTROY_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x01)
@@ -42,6 +45,17 @@
 #define HC_ID_PM_BASE			0x80UL
 #define HC_PM_GET_CPU_STATE		_HC_ID(HC_ID, HC_ID_PM_BASE + 0x00)
 
+/**
+ * hcall_sos_remove_cpu() - Remove a vCPU of Service VM
+ * @cpu: The vCPU to be removed
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_sos_remove_cpu(u64 cpu)
+{
+	return acrn_hypercall1(HC_SOS_REMOVE_CPU, cpu);
+}
+
 /**
  * hcall_create_vm() - Create a User VM
  * @vminfo:	Service VM GPA of info of User VM creation
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 00/17] HSM driver for ACRN hypervisor
  2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
                   ` (11 preceding siblings ...)
  2020-09-22 11:43 ` [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU shuo.a.liu
@ 2020-09-27  0:24 ` Liu, Shuo A
  2020-09-27  5:42   ` Greg Kroah-Hartman
       [not found] ` <20200922114311.38804-7-shuo.a.liu@intel.com>
  13 siblings, 1 reply; 58+ messages in thread
From: Liu, Shuo A @ 2020-09-27  0:24 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre

Ping...

On 9/22/2020 19:42, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> ACRN is a Type 1 reference hypervisor stack, running directly on the bare-metal
> hardware, and is suitable for a variety of IoT and embedded device solutions.
> 
> ACRN implements a hybrid VMM architecture, using a privileged Service VM. The
> Service VM manages the system resources (CPU, memory, etc.) and I/O devices of
> User VMs. Multiple User VMs are supported, with each of them running Linux,
> Android OS or Windows. Both Service VM and User VMs are guest VM.
> 
> Below figure shows the architecture.
> 
>                 Service VM                    User VM
>       +----------------------------+  |  +------------------+
>       |        +--------------+    |  |  |                  |
>       |        |ACRN userspace|    |  |  |                  |
>       |        +--------------+    |  |  |                  |
>       |-----------------ioctl------|  |  |                  |   ...
>       |kernel space   +----------+ |  |  |                  |
>       |               |   HSM    | |  |  | Drivers          |
>       |               +----------+ |  |  |                  |
>       +--------------------|-------+  |  +------------------+
>   +---------------------hypercall----------------------------------------+
>   |                       ACRN Hypervisor                                |
>   +----------------------------------------------------------------------+
>   |                          Hardware                                    |
>   +----------------------------------------------------------------------+
> 
> There is only one Service VM which could run Linux as OS.
> 
> In a typical case, the Service VM will be auto started when ACRN Hypervisor is
> booted. Then the ACRN userspace (an application running in Service VM) could be
> used to start/stop User VMs by communicating with ACRN Hypervisor Service
> Module (HSM).
> 
> ACRN Hypervisor Service Module (HSM) is a middle layer that allows the ACRN
> userspace and Service VM OS kernel to communicate with ACRN Hypervisor
> and manage different User VMs. This middle layer provides the following
> functionalities,
>   - Issues hypercalls to the hypervisor to manage User VMs:
>       * VM/vCPU management
>       * Memory management
>       * Device passthrough
>       * Interrupts injection
>   - I/O requests handling from User VMs.
>   - Exports ioctl through HSM char device.
>   - Exports function calls for other kernel modules
> 
> ACRN is focused on embedded systems, so it doesn't support some features.
> E.g.,
>   - ACRN doesn't support VM migration.
>   - ACRN doesn't support vCPU migration.
> 
> This patch set adds the HSM to the Linux kernel.
> 
> The basic ACRN support was merged to upstream already.
> https://lore.kernel.org/lkml/1559108037-18813-3-git-send-email-yakui.zhao@intel.com/
> 
> ChangeLog:
> v4:
>   - Used acrn_dev.this_device directly for dev_*() (Reinette)
>   - Removed the odd usage of {get|put}_device() on &acrn_dev->this_device (Greg)
>   - Removed unused log code. (Greg)
>   - Corrected the return error values. (Greg)
>   - Mentioned that HSM relies on the hypervisor for sanity checks in acrn_dev_ioctl() comments (Greg)
> 
> v3:
>   - Used {get|put}_device() helpers on &acrn_dev->this_device
>   - Moved unused code from front patches to later ones.
>   - Removed self-defined pr_fmt() and dev_fmt()
>   - Provided comments for acrn_vm_list_lock.
> 
> v2:
>   - Removed API version related code. (Dave)
>   - Replaced pr_*() by dev_*(). (Greg)
>   - Used -ENOTTY as the error code of unsupported ioctl. (Greg)
> 
> Shuo Liu (16):
>   docs: acrn: Introduce ACRN
>   x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
>   x86/acrn: Introduce hypercall interfaces
>   virt: acrn: Introduce ACRN HSM basic driver
>   virt: acrn: Introduce VM management interfaces
>   virt: acrn: Introduce an ioctl to set vCPU registers state
>   virt: acrn: Introduce EPT mapping management
>   virt: acrn: Introduce I/O request management
>   virt: acrn: Introduce PCI configuration space PIO accesses combiner
>   virt: acrn: Introduce interfaces for PCI device passthrough
>   virt: acrn: Introduce interrupt injection interfaces
>   virt: acrn: Introduce interfaces to query C-states and P-states
>     allowed by hypervisor
>   virt: acrn: Introduce I/O ranges operation interfaces
>   virt: acrn: Introduce ioeventfd
>   virt: acrn: Introduce irqfd
>   virt: acrn: Introduce an interface for Service VM to control vCPU
> 
> Yin Fengwei (1):
>   x86/acrn: Introduce an API to check if a VM is privileged
> 
>  .../userspace-api/ioctl/ioctl-number.rst      |   1 +
>  Documentation/virt/acrn/index.rst             |  11 +
>  Documentation/virt/acrn/introduction.rst      |  40 ++
>  Documentation/virt/acrn/io-request.rst        |  97 +++
>  Documentation/virt/index.rst                  |   1 +
>  MAINTAINERS                                   |   9 +
>  arch/x86/include/asm/acrn.h                   |  74 ++
>  arch/x86/kernel/cpu/acrn.c                    |  35 +-
>  drivers/virt/Kconfig                          |   2 +
>  drivers/virt/Makefile                         |   1 +
>  drivers/virt/acrn/Kconfig                     |  15 +
>  drivers/virt/acrn/Makefile                    |   3 +
>  drivers/virt/acrn/acrn_drv.h                  | 229 +++++++
>  drivers/virt/acrn/hsm.c                       | 437 ++++++++++++
>  drivers/virt/acrn/hypercall.h                 | 254 +++++++
>  drivers/virt/acrn/ioeventfd.c                 | 273 ++++++++
>  drivers/virt/acrn/ioreq.c                     | 645 ++++++++++++++++++
>  drivers/virt/acrn/irqfd.c                     | 235 +++++++
>  drivers/virt/acrn/mm.c                        | 305 +++++++++
>  drivers/virt/acrn/vm.c                        | 126 ++++
>  include/uapi/linux/acrn.h                     | 486 +++++++++++++
>  21 files changed, 3278 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/virt/acrn/index.rst
>  create mode 100644 Documentation/virt/acrn/introduction.rst
>  create mode 100644 Documentation/virt/acrn/io-request.rst
>  create mode 100644 arch/x86/include/asm/acrn.h
>  create mode 100644 drivers/virt/acrn/Kconfig
>  create mode 100644 drivers/virt/acrn/Makefile
>  create mode 100644 drivers/virt/acrn/acrn_drv.h
>  create mode 100644 drivers/virt/acrn/hsm.c
>  create mode 100644 drivers/virt/acrn/hypercall.h
>  create mode 100644 drivers/virt/acrn/ioeventfd.c
>  create mode 100644 drivers/virt/acrn/ioreq.c
>  create mode 100644 drivers/virt/acrn/irqfd.c
>  create mode 100644 drivers/virt/acrn/mm.c
>  create mode 100644 drivers/virt/acrn/vm.c
>  create mode 100644 include/uapi/linux/acrn.h
> 
> 
> base-commit: 18445bf405cb331117bc98427b1ba6f12418ad17
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 00/17] HSM driver for ACRN hypervisor
  2020-09-27  0:24 ` [PATCH v4 00/17] HSM driver for ACRN hypervisor Liu, Shuo A
@ 2020-09-27  5:42   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27  5:42 UTC (permalink / raw)
  To: Liu, Shuo A
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre

On Sun, Sep 27, 2020 at 08:24:39AM +0800, Liu, Shuo A wrote:
> Ping...

It's been less than a week since you sent this.  Please relax and if you
really need reviews, get them from within Intel, where you can impose a
deadline on those developers.  Otherwise, your patch is in good company:

	$ mdfrm -c ~/mail/todo/
	993 messages in /home/gregkh/mail/todo/

And will be handled when I get to it.

thanks,

greg "Intel still owes me lots of liquor" k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU
  2020-09-22 11:43 ` [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU shuo.a.liu
@ 2020-09-27 10:44   ` Greg Kroah-Hartman
  2020-09-28  4:10     ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:44 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:43:11PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> ACRN supports partition mode to meet real-time requirements. In
> partition mode, a CPU core can be dedicated to a vCPU of a User VM. The
> local APIC of the dedicated CPU core can be passed through to the User VM.
> The Service VM controls the assignment of the CPU cores.
> 
> Introduce an interface for the Service VM to give up control of a CPU
> core, from the hypervisor's perspective, so that the core can become a
> dedicated CPU core of a User VM.
> 
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  drivers/virt/acrn/hsm.c       | 50 +++++++++++++++++++++++++++++++++++
>  drivers/virt/acrn/hypercall.h | 14 ++++++++++
>  2 files changed, 64 insertions(+)
> 
> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
> index aaf4e76d27b4..ef5f77a38d1f 100644
> --- a/drivers/virt/acrn/hsm.c
> +++ b/drivers/virt/acrn/hsm.c
> @@ -9,6 +9,7 @@
>   *	Yakui Zhao <yakui.zhao@intel.com>
>   */
>  
> +#include <linux/cpu.h>
>  #include <linux/io.h>
>  #include <linux/mm.h>
>  #include <linux/module.h>
> @@ -354,6 +355,47 @@ struct miscdevice acrn_dev = {
>  	.fops	= &acrn_fops,
>  };
>  
> +static ssize_t remove_cpu_store(struct device *dev,
> +				struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	u64 cpu, lapicid;
> +	int ret;
> +
> +	if (kstrtoull(buf, 0, &cpu) < 0)
> +		return -EINVAL;
> +
> +	if (cpu >= num_possible_cpus() || cpu == 0 || !cpu_is_hotpluggable(cpu))
> +		return -EINVAL;
> +
> +	if (cpu_online(cpu))
> +		remove_cpu(cpu);
> +
> +	lapicid = cpu_data(cpu).apicid;
> +	dev_dbg(dev, "Try to remove cpu %lld with lapicid %lld\n", cpu, lapicid);
> +	ret = hcall_sos_remove_cpu(lapicid);
> +	if (ret < 0) {
> +		dev_err(dev, "Failed to remove cpu %lld!\n", cpu);
> +		goto fail_remove;
> +	}
> +
> +	return count;
> +
> +fail_remove:
> +	add_cpu(cpu);
> +	return ret;
> +}
> +static DEVICE_ATTR_WO(remove_cpu);
> +
> +static struct attribute *acrn_attrs[] = {
> +	&dev_attr_remove_cpu.attr,
> +	NULL
> +};
> +
> +static struct attribute_group acrn_attr_group = {
> +	.attrs = acrn_attrs,
> +};

You create a sysfs attribute without any Documentation/ABI/ update as
well?  That's not good.

And why are you trying to emulate CPU hotplug here and not using the
existing CPU hotplug mechanism?

> +
>  static int __init hsm_init(void)
>  {
>  	int ret;
> @@ -370,13 +412,21 @@ static int __init hsm_init(void)
>  		return ret;
>  	}
>  
> +	ret = sysfs_create_group(&acrn_dev.this_device->kobj, &acrn_attr_group);
> +	if (ret) {
> +		dev_warn(acrn_dev.this_device, "sysfs create failed\n");
> +		misc_deregister(&acrn_dev);
> +		return ret;
> +	}

You just raced with userspace and lost.  If you want to add attribute
files to a device, use the default attribute group list, and it will be
managed properly for you by the driver core.

Huge hint, if a driver ever has to touch a kobject, or call sysfs_*,
then it is probably doing something wrong.

greg k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread
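
For reference, the existing CPU hotplug mechanism mentioned in the review above is exposed to userspace as /sys/devices/system/cpu/cpuN/online. The following is a minimal userspace sketch of taking a CPU offline or online through that file. The demo_* helper names are hypothetical, and the sketch only shows the stock interface; how ACRN would then hand the core over to the hypervisor is not covered.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Build the sysfs path of the "online" control file for one CPU. */
int demo_cpu_online_path(unsigned int cpu, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);

	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Write "1" (online) or "0" (offline) to that file; needs root. */
int demo_cpu_set_online(unsigned int cpu, int online)
{
	char path[64];
	ssize_t written;
	int fd, ret;

	ret = demo_cpu_online_path(cpu, path, sizeof(path));
	if (ret)
		return ret;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, online ? "1" : "0", 1);
	if (written < 0)
		ret = -errno;
	else if (written != 1)
		ret = -EIO;

	close(fd);
	return ret;
}
```

A caller would use demo_cpu_set_online(3, 0) to offline CPU 3 and demo_cpu_set_online(3, 1) to bring it back. Both return 0 on success or a negative errno value.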

* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
       [not found] ` <20200922114311.38804-7-shuo.a.liu@intel.com>
@ 2020-09-27 10:45   ` Greg Kroah-Hartman
  2020-09-28  3:43     ` Shuo A Liu
  2020-09-27 10:47   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:45 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:43:00PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> The VM management interfaces expose several VM operations to ACRN
> userspace via ioctls. For example, creating VM, starting VM, destroying
> VM and so on.
> 
> The ACRN Hypervisor needs to exchange data with the ACRN userspace
> during the VM operations. HSM provides VM operation ioctls to the ACRN
> userspace and communicates with the ACRN Hypervisor for VM operations
> via hypercalls.
> 
> HSM maintains a list of User VM. Each User VM will be bound to an
> existing file descriptor of /dev/acrn_hsm. The User VM will be
> destroyed when the file descriptor is closed.
> 
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>  MAINTAINERS                                   |  1 +
>  drivers/virt/acrn/Makefile                    |  2 +-
>  drivers/virt/acrn/acrn_drv.h                  | 23 +++++-
>  drivers/virt/acrn/hsm.c                       | 73 ++++++++++++++++-
>  drivers/virt/acrn/hypercall.h                 | 78 +++++++++++++++++++
>  drivers/virt/acrn/vm.c                        | 71 +++++++++++++++++
>  include/uapi/linux/acrn.h                     | 56 +++++++++++++
>  8 files changed, 301 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/virt/acrn/hypercall.h
>  create mode 100644 drivers/virt/acrn/vm.c
>  create mode 100644 include/uapi/linux/acrn.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 2a198838fca9..ac60efedb104 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -319,6 +319,7 @@ Code  Seq#    Include File                                           Comments
>  0xA0  all    linux/sdp/sdp.h                                         Industrial Device Project
>                                                                       <mailto:kenji@bitgate.com>
>  0xA1  0      linux/vtpm_proxy.h                                      TPM Emulator Proxy Driver
> +0xA2  all    uapi/linux/acrn.h                                       ACRN hypervisor
>  0xA3  80-8F                                                          Port ACL  in development:
>                                                                       <mailto:tlewis@mindspring.com>
>  0xA3  90-9F  linux/dtlk.h
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3030d0e93d02..d4c1ef303c2d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -443,6 +443,7 @@ S:	Supported
>  W:	https://projectacrn.org
>  F:	Documentation/virt/acrn/
>  F:	drivers/virt/acrn/
> +F:	include/uapi/linux/acrn.h
>  
>  AD1889 ALSA SOUND DRIVER
>  L:	linux-parisc@vger.kernel.org
> diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
> index 6920ed798aaf..cf8b4ed5e74e 100644
> --- a/drivers/virt/acrn/Makefile
> +++ b/drivers/virt/acrn/Makefile
> @@ -1,3 +1,3 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_ACRN_HSM)	:= acrn.o
> -acrn-y := hsm.o
> +acrn-y := hsm.o vm.o
> diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
> index 29eedd696327..72d92b60d944 100644
> --- a/drivers/virt/acrn/acrn_drv.h
> +++ b/drivers/virt/acrn/acrn_drv.h
> @@ -3,16 +3,37 @@
>  #ifndef __ACRN_HSM_DRV_H
>  #define __ACRN_HSM_DRV_H
>  
> +#include <linux/acrn.h>
> +#include <linux/dev_printk.h>
> +#include <linux/miscdevice.h>
>  #include <linux/types.h>
>  
> +#include "hypercall.h"
> +
> +extern struct miscdevice acrn_dev;

Who else needs to get to this structure in your driver?

> +
>  #define ACRN_INVALID_VMID (0xffffU)
>  
> +#define ACRN_VM_FLAG_DESTROYED		0U
> +extern struct list_head acrn_vm_list;
> +extern rwlock_t acrn_vm_list_lock;
>  /**
>   * struct acrn_vm - Properties of ACRN User VM.
> + * @list:	Entry within global list of all VMs
>   * @vmid:	User VM ID
> + * @vcpu_num:	Number of virtual CPUs in the VM
> + * @flags:	Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
> + *		in HSM which is different from the &acrn_vm_creation.vm_flag.
>   */
>  struct acrn_vm {
> -	u16	vmid;
> +	struct list_head	list;
> +	u16			vmid;
> +	int			vcpu_num;
> +	unsigned long		flags;
>  };
>  
> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
> +			       struct acrn_vm_creation *vm_param);
> +int acrn_vm_destroy(struct acrn_vm *vm);
> +
>  #endif /* __ACRN_HSM_DRV_H */
> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
> index 28a3052ffa55..f3e6467b8723 100644
> --- a/drivers/virt/acrn/hsm.c
> +++ b/drivers/virt/acrn/hsm.c
> @@ -9,7 +9,6 @@
>   *	Yakui Zhao <yakui.zhao@intel.com>
>   */
>  
> -#include <linux/miscdevice.h>
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/slab.h>
> @@ -38,10 +37,79 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>  
> +/*
> + * HSM relies on hypercall layer of the ACRN hypervisor to do the
> + * sanity check against the input parameters.
> + */
> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
> +			   unsigned long ioctl_param)
> +{
> +	struct acrn_vm *vm = filp->private_data;
> +	struct acrn_vm_creation *vm_param;
> +	int ret = 0;
> +
> +	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
> +		dev_dbg(acrn_dev.this_device,
> +			"ioctl 0x%x: Invalid VM state!\n", cmd);
> +		return -EINVAL;
> +	}
> +
> +	switch (cmd) {
> +	case ACRN_IOCTL_CREATE_VM:
> +		vm_param = memdup_user((void __user *)ioctl_param,
> +				       sizeof(struct acrn_vm_creation));
> +		if (IS_ERR(vm_param))
> +			return PTR_ERR(vm_param);
> +
> +		vm = acrn_vm_create(vm, vm_param);
> +		if (!vm) {
> +			ret = -EINVAL;
> +			kfree(vm_param);
> +			break;
> +		}
> +
> +		if (copy_to_user((void __user *)ioctl_param, vm_param,
> +				 sizeof(struct acrn_vm_creation))) {
> +			acrn_vm_destroy(vm);
> +			ret = -EFAULT;
> +		}
> +
> +		kfree(vm_param);
> +		break;
> +	case ACRN_IOCTL_START_VM:
> +		ret = hcall_start_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to start VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_PAUSE_VM:
> +		ret = hcall_pause_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to pause VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_RESET_VM:
> +		ret = hcall_reset_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to restart VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_DESTROY_VM:
> +		ret = acrn_vm_destroy(vm);
> +		break;
> +	default:
> +		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);

Do not let userspace spam kernel logs with invalid stuff, that's a sure
way to cause a DoS.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread
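
The point made in the review above, that an unrecognized ioctl should fail quietly instead of logging, can be illustrated with a small userspace model of the dispatch. This is only a sketch: the command values and the function name are hypothetical, and the -ENOTTY return value follows the v2 changelog entry in the cover letter.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum demo_cmd {
	DEMO_CMD_START = 1,
	DEMO_CMD_PAUSE = 2,
};

/*
 * Model of an ioctl dispatcher: known commands are handled, anything
 * else returns -ENOTTY without printing, so a caller looping on bogus
 * commands cannot flood the log.
 */
long demo_dispatch(unsigned int cmd)
{
	switch (cmd) {
	case DEMO_CMD_START:
	case DEMO_CMD_PAUSE:
		return 0;	/* a real handler would issue the hypercall */
	default:
		return -ENOTTY;
	}
}
```

The error code alone is enough for userspace to detect an unsupported command, so no message is needed on that path.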

* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
       [not found] ` <20200922114311.38804-7-shuo.a.liu@intel.com>
  2020-09-27 10:45   ` [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces Greg Kroah-Hartman
@ 2020-09-27 10:47   ` Greg Kroah-Hartman
  2020-09-28  3:50     ` Shuo A Liu
  1 sibling, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:47 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:43:00PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> The VM management interfaces expose several VM operations to ACRN
> userspace via ioctls. For example, creating VM, starting VM, destroying
> VM and so on.
> 
> The ACRN Hypervisor needs to exchange data with the ACRN userspace
> during the VM operations. HSM provides VM operation ioctls to the ACRN
> userspace and communicates with the ACRN Hypervisor for VM operations
> via hypercalls.
> 
> HSM maintains a list of User VM. Each User VM will be bound to an
> existing file descriptor of /dev/acrn_hsm. The User VM will be
> destroyed when the file descriptor is closed.
> 
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>  MAINTAINERS                                   |  1 +
>  drivers/virt/acrn/Makefile                    |  2 +-
>  drivers/virt/acrn/acrn_drv.h                  | 23 +++++-
>  drivers/virt/acrn/hsm.c                       | 73 ++++++++++++++++-
>  drivers/virt/acrn/hypercall.h                 | 78 +++++++++++++++++++
>  drivers/virt/acrn/vm.c                        | 71 +++++++++++++++++
>  include/uapi/linux/acrn.h                     | 56 +++++++++++++
>  8 files changed, 301 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/virt/acrn/hypercall.h
>  create mode 100644 drivers/virt/acrn/vm.c
>  create mode 100644 include/uapi/linux/acrn.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 2a198838fca9..ac60efedb104 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -319,6 +319,7 @@ Code  Seq#    Include File                                           Comments
>  0xA0  all    linux/sdp/sdp.h                                         Industrial Device Project
>                                                                       <mailto:kenji@bitgate.com>
>  0xA1  0      linux/vtpm_proxy.h                                      TPM Emulator Proxy Driver
> +0xA2  all    uapi/linux/acrn.h                                       ACRN hypervisor
>  0xA3  80-8F                                                          Port ACL  in development:
>                                                                       <mailto:tlewis@mindspring.com>
>  0xA3  90-9F  linux/dtlk.h
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3030d0e93d02..d4c1ef303c2d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -443,6 +443,7 @@ S:	Supported
>  W:	https://projectacrn.org
>  F:	Documentation/virt/acrn/
>  F:	drivers/virt/acrn/
> +F:	include/uapi/linux/acrn.h
>  
>  AD1889 ALSA SOUND DRIVER
>  L:	linux-parisc@vger.kernel.org
> diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
> index 6920ed798aaf..cf8b4ed5e74e 100644
> --- a/drivers/virt/acrn/Makefile
> +++ b/drivers/virt/acrn/Makefile
> @@ -1,3 +1,3 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_ACRN_HSM)	:= acrn.o
> -acrn-y := hsm.o
> +acrn-y := hsm.o vm.o
> diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
> index 29eedd696327..72d92b60d944 100644
> --- a/drivers/virt/acrn/acrn_drv.h
> +++ b/drivers/virt/acrn/acrn_drv.h
> @@ -3,16 +3,37 @@
>  #ifndef __ACRN_HSM_DRV_H
>  #define __ACRN_HSM_DRV_H
>  
> +#include <linux/acrn.h>
> +#include <linux/dev_printk.h>
> +#include <linux/miscdevice.h>
>  #include <linux/types.h>
>  
> +#include "hypercall.h"
> +
> +extern struct miscdevice acrn_dev;
> +
>  #define ACRN_INVALID_VMID (0xffffU)
>  
> +#define ACRN_VM_FLAG_DESTROYED		0U
> +extern struct list_head acrn_vm_list;
> +extern rwlock_t acrn_vm_list_lock;
>  /**
>   * struct acrn_vm - Properties of ACRN User VM.
> + * @list:	Entry within global list of all VMs
>   * @vmid:	User VM ID
> + * @vcpu_num:	Number of virtual CPUs in the VM
> + * @flags:	Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
> + *		in HSM which is different from the &acrn_vm_creation.vm_flag.
>   */
>  struct acrn_vm {
> -	u16	vmid;
> +	struct list_head	list;
> +	u16			vmid;
> +	int			vcpu_num;
> +	unsigned long		flags;
>  };
>  
> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
> +			       struct acrn_vm_creation *vm_param);
> +int acrn_vm_destroy(struct acrn_vm *vm);
> +
>  #endif /* __ACRN_HSM_DRV_H */
> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
> index 28a3052ffa55..f3e6467b8723 100644
> --- a/drivers/virt/acrn/hsm.c
> +++ b/drivers/virt/acrn/hsm.c
> @@ -9,7 +9,6 @@
>   *	Yakui Zhao <yakui.zhao@intel.com>
>   */
>  
> -#include <linux/miscdevice.h>
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/slab.h>
> @@ -38,10 +37,79 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>  
> +/*
> + * HSM relies on hypercall layer of the ACRN hypervisor to do the
> + * sanity check against the input parameters.
> + */
> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
> +			   unsigned long ioctl_param)
> +{
> +	struct acrn_vm *vm = filp->private_data;
> +	struct acrn_vm_creation *vm_param;
> +	int ret = 0;
> +
> +	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
> +		dev_dbg(acrn_dev.this_device,
> +			"ioctl 0x%x: Invalid VM state!\n", cmd);
> +		return -EINVAL;
> +	}
> +
> +	switch (cmd) {
> +	case ACRN_IOCTL_CREATE_VM:
> +		vm_param = memdup_user((void __user *)ioctl_param,
> +				       sizeof(struct acrn_vm_creation));
> +		if (IS_ERR(vm_param))
> +			return PTR_ERR(vm_param);
> +
> +		vm = acrn_vm_create(vm, vm_param);
> +		if (!vm) {
> +			ret = -EINVAL;
> +			kfree(vm_param);
> +			break;
> +		}
> +
> +		if (copy_to_user((void __user *)ioctl_param, vm_param,
> +				 sizeof(struct acrn_vm_creation))) {
> +			acrn_vm_destroy(vm);
> +			ret = -EFAULT;
> +		}
> +
> +		kfree(vm_param);
> +		break;
> +	case ACRN_IOCTL_START_VM:
> +		ret = hcall_start_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to start VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_PAUSE_VM:
> +		ret = hcall_pause_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to pause VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_RESET_VM:
> +		ret = hcall_reset_vm(vm->vmid);
> +		if (ret < 0)
> +			dev_err(acrn_dev.this_device,
> +				"Failed to restart VM %u!\n", vm->vmid);
> +		break;
> +	case ACRN_IOCTL_DESTROY_VM:
> +		ret = acrn_vm_destroy(vm);
> +		break;
> +	default:
> +		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
> +		ret = -ENOTTY;
> +	}
> +
> +	return ret;
> +}
> +
>  static int acrn_dev_release(struct inode *inode, struct file *filp)
>  {
>  	struct acrn_vm *vm = filp->private_data;
>  
> +	acrn_vm_destroy(vm);
>  	kfree(vm);
>  	return 0;
>  }
> @@ -50,9 +118,10 @@ static const struct file_operations acrn_fops = {
>  	.owner		= THIS_MODULE,
>  	.open		= acrn_dev_open,
>  	.release	= acrn_dev_release,
> +	.unlocked_ioctl = acrn_dev_ioctl,
>  };
>  
> -static struct miscdevice acrn_dev = {
> +struct miscdevice acrn_dev = {
>  	.minor	= MISC_DYNAMIC_MINOR,
>  	.name	= "acrn_hsm",
>  	.fops	= &acrn_fops,
> diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
> new file mode 100644
> index 000000000000..426b66cadb1f
> --- /dev/null
> +++ b/drivers/virt/acrn/hypercall.h
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ACRN HSM: hypercalls of ACRN Hypervisor
> + */
> +#ifndef __ACRN_HSM_HYPERCALL_H
> +#define __ACRN_HSM_HYPERCALL_H
> +#include <asm/acrn.h>
> +
> +/*
> + * Hypercall IDs of the ACRN Hypervisor
> + */
> +#define _HC_ID(x, y) (((x) << 24) | (y))
> +
> +#define HC_ID 0x80UL
> +
> +#define HC_ID_VM_BASE			0x10UL
> +#define HC_CREATE_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x00)
> +#define HC_DESTROY_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x01)
> +#define HC_START_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x02)
> +#define HC_PAUSE_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x03)
> +#define HC_RESET_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
> +
> +/**
> + * hcall_create_vm() - Create a User VM
> + * @vminfo:	Service VM GPA of info of User VM creation
> + *
> + * Return: 0 on success, <0 on failure
> + */
> +static inline long hcall_create_vm(u64 vminfo)
> +{
> +	return acrn_hypercall1(HC_CREATE_VM, vminfo);
> +}
> +
> +/**
> + * hcall_start_vm() - Start a User VM
> + * @vmid:	User VM ID
> + *
> + * Return: 0 on success, <0 on failure
> + */
> +static inline long hcall_start_vm(u64 vmid)
> +{
> +	return acrn_hypercall1(HC_START_VM, vmid);
> +}
> +
> +/**
> + * hcall_pause_vm() - Pause a User VM
> + * @vmid:	User VM ID
> + *
> + * Return: 0 on success, <0 on failure
> + */
> +static inline long hcall_pause_vm(u64 vmid)
> +{
> +	return acrn_hypercall1(HC_PAUSE_VM, vmid);
> +}
> +
> +/**
> + * hcall_destroy_vm() - Destroy a User VM
> + * @vmid:	User VM ID
> + *
> + * Return: 0 on success, <0 on failure
> + */
> +static inline long hcall_destroy_vm(u64 vmid)
> +{
> +	return acrn_hypercall1(HC_DESTROY_VM, vmid);
> +}
> +
> +/**
> + * hcall_reset_vm() - Reset a User VM
> + * @vmid:	User VM ID
> + *
> + * Return: 0 on success, <0 on failure
> + */
> +static inline long hcall_reset_vm(u64 vmid)
> +{
> +	return acrn_hypercall1(HC_RESET_VM, vmid);
> +}
> +
> +#endif /* __ACRN_HSM_HYPERCALL_H */
> diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
> new file mode 100644
> index 000000000000..920ca48f4847
> --- /dev/null
> +++ b/drivers/virt/acrn/vm.c
> @@ -0,0 +1,71 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ACRN_HSM: Virtual Machine management
> + *
> + * Copyright (C) 2020 Intel Corporation. All rights reserved.
> + *
> + * Authors:
> + *	Jason Chen CJ <jason.cj.chen@intel.com>
> + *	Yakui Zhao <yakui.zhao@intel.com>
> + */
> +#include <linux/io.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +
> +#include "acrn_drv.h"
> +
> +/* List of VMs */
> +LIST_HEAD(acrn_vm_list);
> +/*
> + * acrn_vm_list is read in a tasklet which dispatches I/O requests and is
> + * written in the VM creation ioctl. Use the rwlock mechanism to protect it.
> + */
> +DEFINE_RWLOCK(acrn_vm_list_lock);
> +
> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
> +			       struct acrn_vm_creation *vm_param)
> +{
> +	int ret;
> +
> +	ret = hcall_create_vm(virt_to_phys(vm_param));
> +	if (ret < 0 || vm_param->vmid == ACRN_INVALID_VMID) {
> +		dev_err(acrn_dev.this_device,
> +			"Failed to create VM! Error: %d\n", ret);
> +		return NULL;
> +	}
> +
> +	vm->vmid = vm_param->vmid;
> +	vm->vcpu_num = vm_param->vcpu_num;
> +
> +	write_lock_bh(&acrn_vm_list_lock);
> +	list_add(&vm->list, &acrn_vm_list);
> +	write_unlock_bh(&acrn_vm_list_lock);

Why are the _bh() variants being used here?

You are only accessing this list from userspace context in this patch.

Heck, you aren't even reading from the list, only writing to it...

greg k-h

* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-22 11:42 ` [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler() shuo.a.liu
@ 2020-09-27 10:49   ` Greg Kroah-Hartman
  2020-09-28  3:28     ` Shuo A Liu
  2020-09-29 18:01   ` Borislav Petkov
  1 sibling, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:49 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams, Fengwei Yin,
	Zhenyu Wang

On Tue, Sep 22, 2020 at 07:42:56PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> The ACRN Hypervisor builds an I/O request when a trapped I/O access
> happens in User VM. Then, ACRN Hypervisor issues an upcall by sending
> a notification interrupt to the Service VM. HSM in the Service VM needs
> to hook the notification interrupt to handle I/O requests.
> 
> Notification interrupts from the ACRN Hypervisor are already supported,
> and they invoke a callback that is currently uninitialized.
> 
> Export two APIs for HSM to setup/remove its callback.
> 
> Originally-by: Yakui Zhao <yakui.zhao@intel.com>
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Fengwei Yin <fengwei.yin@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  arch/x86/include/asm/acrn.h |  8 ++++++++
>  arch/x86/kernel/cpu/acrn.c  | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
>  create mode 100644 arch/x86/include/asm/acrn.h
> 
> diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
> new file mode 100644
> index 000000000000..ff259b69cde7
> --- /dev/null
> +++ b/arch/x86/include/asm/acrn.h
> @@ -0,0 +1,8 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_ACRN_H
> +#define _ASM_X86_ACRN_H
> +
> +void acrn_setup_intr_handler(void (*handler)(void));
> +void acrn_remove_intr_handler(void);
> +
> +#endif /* _ASM_X86_ACRN_H */
> diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
> index 0b2c03943ac6..42e88d01ccf9 100644
> --- a/arch/x86/kernel/cpu/acrn.c
> +++ b/arch/x86/kernel/cpu/acrn.c
> @@ -9,7 +9,11 @@
>   *
>   */
>  
> +#define pr_fmt(fmt) "acrn: " fmt

Why is this needed, if you are not adding pr_* calls in this patch?

thanks,

greg k-h

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-22 11:42 ` [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces shuo.a.liu
@ 2020-09-27 10:51   ` Greg Kroah-Hartman
  2020-09-27 10:53     ` Greg Kroah-Hartman
  2020-09-27 15:38     ` Dave Hansen
  2020-09-30 10:54   ` Borislav Petkov
  1 sibling, 2 replies; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:51 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Dave Hansen, Dan Williams, Fengwei Yin, Zhi Wang,
	Zhenyu Wang

On Tue, Sep 22, 2020 at 07:42:58PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> The Service VM communicates with the hypervisor via conventional
> hypercalls. VMCALL instruction is used to make the hypercalls.
> 
> ACRN hypercall ABI:
>   * Hypercall number is in R8 register.
>   * Up to 2 parameters are in RDI and RSI registers.
>   * Return value is in RAX register.
> 
> Introduce the ACRN hypercall interfaces. Because GCC doesn't support the
> R8 register as a direct register constraint, there are two ways to use
> R8 in extended asm:
>   1) use an explicit register variable as input
>   2) use a supported constraint as input with an explicit MOV to R8 at
>      the beginning of the asm
> 
> Both ways produce the same number of instructions.
> Asm code from 1)
>   38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
>   3e:   48 89 c7                mov    %rax,%rdi
>   41:   0f 01 c1                vmcall
> Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
> the CPU is in 64-bit mode.
> 
> Asm code from 2)
>   38:   48 89 c7                mov    %rax,%rdi
>   3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
>   42:   00 00 00
>   45:   0f 01 c1                vmcall
> 
> Choose 1) for code simplicity and a little bit of code size
> optimization.
> 
> Originally-by: Yakui Zhao <yakui.zhao@intel.com>
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Fengwei Yin <fengwei.yin@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  arch/x86/include/asm/acrn.h | 57 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
> 
> diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
> index a2d4aea3a80d..23a93b87edeb 100644
> --- a/arch/x86/include/asm/acrn.h
> +++ b/arch/x86/include/asm/acrn.h
> @@ -14,4 +14,61 @@ void acrn_setup_intr_handler(void (*handler)(void));
>  void acrn_remove_intr_handler(void);
>  bool acrn_is_privileged_vm(void);
>  
> +/*
> + * Hypercalls for ACRN
> + *
> + * - VMCALL instruction is used to implement ACRN hypercalls.
> + * - ACRN hypercall ABI:
> + *   - Hypercall number is passed in R8 register.
> + *   - Up to 2 arguments are passed in RDI, RSI.
> + *   - Return value will be placed in RAX.
> + */
> +static inline long acrn_hypercall0(unsigned long hcall_id)
> +{
> +	register long r8 asm("r8");
> +	long result;
> +
> +	/* Nothing can come between the r8 assignment and the asm: */
> +	r8 = hcall_id;
> +	asm volatile("vmcall\n\t"
> +		     : "=a" (result)
> +		     : "r" (r8)
> +		     : );

What keeps an interrupt from happening between the r8 assignment and the
asm: ?

Is this something that most hypercalls need to handle?  I don't see
other ones needing this type of thing, is it just because of how these
are defined?

confused,

greg k-h

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-27 10:51   ` Greg Kroah-Hartman
@ 2020-09-27 10:53     ` Greg Kroah-Hartman
  2020-09-28  3:38       ` Shuo A Liu
  2020-09-27 15:38     ` Dave Hansen
  1 sibling, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-27 10:53 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Dave Hansen, Dan Williams, Fengwei Yin, Zhi Wang,
	Zhenyu Wang

On Sun, Sep 27, 2020 at 12:51:52PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Sep 22, 2020 at 07:42:58PM +0800, shuo.a.liu@intel.com wrote:
> > From: Shuo Liu <shuo.a.liu@intel.com>
> > 
> > The Service VM communicates with the hypervisor via conventional
> > hypercalls. VMCALL instruction is used to make the hypercalls.
> > 
> > ACRN hypercall ABI:
> >   * Hypercall number is in R8 register.
> >   * Up to 2 parameters are in RDI and RSI registers.
> >   * Return value is in RAX register.
> > 
> > Introduce the ACRN hypercall interfaces. Because GCC doesn't support the
> > R8 register as a direct register constraint, there are two ways to use
> > R8 in extended asm:
> >   1) use an explicit register variable as input
> >   2) use a supported constraint as input with an explicit MOV to R8 at
> >      the beginning of the asm
> > 
> > Both ways produce the same number of instructions.
> > Asm code from 1)
> >   38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
> >   3e:   48 89 c7                mov    %rax,%rdi
> >   41:   0f 01 c1                vmcall
> > Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
> > the CPU is in 64-bit mode.
> > 
> > Asm code from 2)
> >   38:   48 89 c7                mov    %rax,%rdi
> >   3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
> >   42:   00 00 00
> >   45:   0f 01 c1                vmcall
> > 
> > Choose 1) for code simplicity and a little bit of code size
> > optimization.
> > 
> > Originally-by: Yakui Zhao <yakui.zhao@intel.com>
> > Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> > Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> > Cc: Dave Hansen <dave.hansen@intel.com>
> > Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Fengwei Yin <fengwei.yin@intel.com>
> > Cc: Zhi Wang <zhi.a.wang@intel.com>
> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Cc: Yu Wang <yu1.wang@intel.com>
> > Cc: Reinette Chatre <reinette.chatre@intel.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > ---
> >  arch/x86/include/asm/acrn.h | 57 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 57 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
> > index a2d4aea3a80d..23a93b87edeb 100644
> > --- a/arch/x86/include/asm/acrn.h
> > +++ b/arch/x86/include/asm/acrn.h
> > @@ -14,4 +14,61 @@ void acrn_setup_intr_handler(void (*handler)(void));
> >  void acrn_remove_intr_handler(void);
> >  bool acrn_is_privileged_vm(void);
> >  
> > +/*
> > + * Hypercalls for ACRN
> > + *
> > + * - VMCALL instruction is used to implement ACRN hypercalls.
> > + * - ACRN hypercall ABI:
> > + *   - Hypercall number is passed in R8 register.
> > + *   - Up to 2 arguments are passed in RDI, RSI.
> > + *   - Return value will be placed in RAX.
> > + */
> > +static inline long acrn_hypercall0(unsigned long hcall_id)
> > +{
> > +	register long r8 asm("r8");
> > +	long result;
> > +
> > +	/* Nothing can come between the r8 assignment and the asm: */
> > +	r8 = hcall_id;
> > +	asm volatile("vmcall\n\t"
> > +		     : "=a" (result)
> > +		     : "r" (r8)
> > +		     : );
> 
> What keeps an interrupt from happening between the r8 assignment and the
> asm: ?
> 
> Is this something that most hypercalls need to handle?  I don't see
> other ones needing this type of thing, is it just because of how these
> are defined?

Ah, the changelog above explains this.  You should put that in the code
itself, as a comment, otherwise we will not know this at all in 5
years, when gcc is changed to allow r8 access :)

thanks,

greg k-h

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-27 10:51   ` Greg Kroah-Hartman
  2020-09-27 10:53     ` Greg Kroah-Hartman
@ 2020-09-27 15:38     ` Dave Hansen
  2020-09-30 11:16       ` Peter Zijlstra
  1 sibling, 1 reply; 58+ messages in thread
From: Dave Hansen @ 2020-09-27 15:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman, shuo.a.liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On 9/27/20 3:51 AM, Greg Kroah-Hartman wrote:
>> +static inline long acrn_hypercall0(unsigned long hcall_id)
>> +{
>> +	register long r8 asm("r8");
>> +	long result;
>> +
>> +	/* Nothing can come between the r8 assignment and the asm: */
>> +	r8 = hcall_id;
>> +	asm volatile("vmcall\n\t"
>> +		     : "=a" (result)
>> +		     : "r" (r8)
>> +		     : );
> What keeps an interrupt from happening between the r8 assignment and the
> asm: ?

It's probably better phrased something like: "No other C code can come
between this r8 assignment and the inline asm".  An interrupt would
actually be fine in there because interrupts save and restore all
register state, including r8.

The problem (mentioned in the changelog) is that gcc does not let you
place data directly into r8.  But it does allow you to declare a
register variable that is bound to r8.  There might be a problem if a
function call came in between and clobbered the register, thus the
"nothing can come between" comment.

The comment is really intended to scare away anyone from adding printk()'s.

More information about these register variables is here:

> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables
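
To make the point above concrete, here is a minimal user-space sketch of
the local register variable pattern. It is not code from the patch:
demo_pass_in_r8() is a hypothetical name, and a plain MOV stands in for
VMCALL, which would fault outside a guest kernel. It assumes GCC or a
compatible compiler on x86-64; other targets take a fallback path that
only keeps the example compilable.

```c
#include <stdint.h>

/*
 * Hypothetical helper for illustration only, not part of the patch.
 * Binds a local variable to r8 and passes it to extended asm as an
 * input operand. A harmless MOV replaces VMCALL so this can run in
 * user space.
 */
static inline uint64_t demo_pass_in_r8(uint64_t hcall_id)
{
#if defined(__x86_64__) && defined(__GNUC__)
	register uint64_t r8 __asm__("r8");
	uint64_t result;

	/* No other C code can come between this r8 assignment and the asm. */
	r8 = hcall_id;
	__asm__ __volatile__("mov %%r8, %0"
			     : "=r" (result)
			     : "r" (r8));
	return result;
#else
	/* Assumed fallback: no r8 binding to demonstrate on this target. */
	return hcall_id;
#endif
}
```

The asm body reads %r8 by name on purpose: the value is only there
because the register variable is listed as an input operand, which is the
one use for which gcc documents the binding. Putting a function call
between the assignment and the asm would make the contents of r8 at the
asm no longer something to rely on.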

Any better ideas for comments would be greatly appreciated.  It has 4 or
5 copies so I wanted it to be succinct.

* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-27 10:49   ` Greg Kroah-Hartman
@ 2020-09-28  3:28     ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  3:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams, Fengwei Yin,
	Zhenyu Wang

On Sun 27.Sep'20 at 12:49:43 +0200, Greg Kroah-Hartman wrote:
>On Tue, Sep 22, 2020 at 07:42:56PM +0800, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> The ACRN Hypervisor builds an I/O request when a trapped I/O access
>> happens in User VM. Then, ACRN Hypervisor issues an upcall by sending
>> a notification interrupt to the Service VM. HSM in the Service VM needs
>> to hook the notification interrupt to handle I/O requests.
>>
>> Notification interrupts from the ACRN Hypervisor are already supported,
>> and they invoke a callback that is currently uninitialized.
>>
>> Export two APIs for HSM to setup/remove its callback.
>>
>> Originally-by: Yakui Zhao <yakui.zhao@intel.com>
>> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Fengwei Yin <fengwei.yin@intel.com>
>> Cc: Zhi Wang <zhi.a.wang@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: Yu Wang <yu1.wang@intel.com>
>> Cc: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>>  arch/x86/include/asm/acrn.h |  8 ++++++++
>>  arch/x86/kernel/cpu/acrn.c  | 16 ++++++++++++++++
>>  2 files changed, 24 insertions(+)
>>  create mode 100644 arch/x86/include/asm/acrn.h
>>
>> diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
>> new file mode 100644
>> index 000000000000..ff259b69cde7
>> --- /dev/null
>> +++ b/arch/x86/include/asm/acrn.h
>> @@ -0,0 +1,8 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_X86_ACRN_H
>> +#define _ASM_X86_ACRN_H
>> +
>> +void acrn_setup_intr_handler(void (*handler)(void));
>> +void acrn_remove_intr_handler(void);
>> +
>> +#endif /* _ASM_X86_ACRN_H */
>> diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
>> index 0b2c03943ac6..42e88d01ccf9 100644
>> --- a/arch/x86/kernel/cpu/acrn.c
>> +++ b/arch/x86/kernel/cpu/acrn.c
>> @@ -9,7 +9,11 @@
>>   *
>>   */
>>
>> +#define pr_fmt(fmt) "acrn: " fmt
>
>Why is this needed, if you are not adding pr_* calls in this patch?

True. I will remove it. Thanks.

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-27 10:53     ` Greg Kroah-Hartman
@ 2020-09-28  3:38       ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  3:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Dave Hansen, Dan Williams, Fengwei Yin, Zhi Wang,
	Zhenyu Wang

On Sun 27.Sep'20 at 12:53:14 +0200, Greg Kroah-Hartman wrote:
>On Sun, Sep 27, 2020 at 12:51:52PM +0200, Greg Kroah-Hartman wrote:
>> On Tue, Sep 22, 2020 at 07:42:58PM +0800, shuo.a.liu@intel.com wrote:
>> > From: Shuo Liu <shuo.a.liu@intel.com>
>> >
>> > The Service VM communicates with the hypervisor via conventional
>> > hypercalls. VMCALL instruction is used to make the hypercalls.
>> >
>> > ACRN hypercall ABI:
>> >   * Hypercall number is in R8 register.
>> >   * Up to 2 parameters are in RDI and RSI registers.
>> >   * Return value is in RAX register.
>> >
>> > Introduce the ACRN hypercall interfaces. Because GCC doesn't support the
>> > R8 register as a direct register constraint, there are two ways to use
>> > R8 in extended asm:
>> >   1) use an explicit register variable as input
>> >   2) use a supported constraint as input with an explicit MOV to R8 at
>> >      the beginning of the asm
>> >
>> > Both ways produce the same number of instructions.
>> > Asm code from 1)
>> >   38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
>> >   3e:   48 89 c7                mov    %rax,%rdi
>> >   41:   0f 01 c1                vmcall
>> > Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
>> > the CPU is in 64-bit mode.
>> >
>> > Asm code from 2)
>> >   38:   48 89 c7                mov    %rax,%rdi
>> >   3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
>> >   42:   00 00 00
>> >   45:   0f 01 c1                vmcall
>> >
>> > Choose 1) for code simplicity and a little bit of code size
>> > optimization.
>> >
>> > Originally-by: Yakui Zhao <yakui.zhao@intel.com>
>> > Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> > Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> > Cc: Dave Hansen <dave.hansen@intel.com>
>> > Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>> > Cc: Dan Williams <dan.j.williams@intel.com>
>> > Cc: Fengwei Yin <fengwei.yin@intel.com>
>> > Cc: Zhi Wang <zhi.a.wang@intel.com>
>> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> > Cc: Yu Wang <yu1.wang@intel.com>
>> > Cc: Reinette Chatre <reinette.chatre@intel.com>
>> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> > ---
>> >  arch/x86/include/asm/acrn.h | 57 +++++++++++++++++++++++++++++++++++++
>> >  1 file changed, 57 insertions(+)
>> >
>> > diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
>> > index a2d4aea3a80d..23a93b87edeb 100644
>> > --- a/arch/x86/include/asm/acrn.h
>> > +++ b/arch/x86/include/asm/acrn.h
>> > @@ -14,4 +14,61 @@ void acrn_setup_intr_handler(void (*handler)(void));
>> >  void acrn_remove_intr_handler(void);
>> >  bool acrn_is_privileged_vm(void);
>> >
>> > +/*
>> > + * Hypercalls for ACRN
>> > + *
>> > + * - VMCALL instruction is used to implement ACRN hypercalls.
>> > + * - ACRN hypercall ABI:
>> > + *   - Hypercall number is passed in R8 register.
>> > + *   - Up to 2 arguments are passed in RDI, RSI.
>> > + *   - Return value will be placed in RAX.
>> > + */
>> > +static inline long acrn_hypercall0(unsigned long hcall_id)
>> > +{
>> > +	register long r8 asm("r8");
>> > +	long result;
>> > +
>> > +	/* Nothing can come between the r8 assignment and the asm: */
>> > +	r8 = hcall_id;
>> > +	asm volatile("vmcall\n\t"
>> > +		     : "=a" (result)
>> > +		     : "r" (r8)
>> > +		     : );
>>
>> What keeps an interrupt from happening between the r8 assignment and the
>> asm: ?

Dave gave a good explanation in another email. I will apply his better
comment that "No other C code can come between this r8 assignment and the
inline asm".

>>
>> Is this something that most hypercalls need to handle?  I don't see
>> other ones needing this type of thing, is it just because of how these
>> are defined?
>
>Ah, the changelog above explains this.  You should put that in the code
>itself, as a comment, otherwise we will not know this at all in 5
>years, when gcc is changed to allow r8 access :)

OK. I will copy that into code as well.

Thanks
shuo

* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-27 10:45   ` [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces Greg Kroah-Hartman
@ 2020-09-28  3:43     ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  3:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

Hi Greg,

On Sun 27.Sep'20 at 12:45:38 +0200, Greg Kroah-Hartman wrote:
>On Tue, Sep 22, 2020 at 07:43:00PM +0800, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> The VM management interfaces expose several VM operations to ACRN
>> userspace via ioctls. For example, creating VM, starting VM, destroying
>> VM and so on.
>>
>> The ACRN Hypervisor needs to exchange data with the ACRN userspace
>> during the VM operations. HSM provides VM operation ioctls to the ACRN
>> userspace and communicates with the ACRN Hypervisor for VM operations
>> via hypercalls.
>>
>> HSM maintains a list of User VM. Each User VM will be bound to an
>> existing file descriptor of /dev/acrn_hsm. The User VM will be
>> destroyed when the file descriptor is closed.
>>
>> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Zhi Wang <zhi.a.wang@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: Yu Wang <yu1.wang@intel.com>
>> Cc: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>>  .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>>  MAINTAINERS                                   |  1 +
>>  drivers/virt/acrn/Makefile                    |  2 +-
>>  drivers/virt/acrn/acrn_drv.h                  | 23 +++++-
>>  drivers/virt/acrn/hsm.c                       | 73 ++++++++++++++++-
>>  drivers/virt/acrn/hypercall.h                 | 78 +++++++++++++++++++
>>  drivers/virt/acrn/vm.c                        | 71 +++++++++++++++++
>>  include/uapi/linux/acrn.h                     | 56 +++++++++++++
>>  8 files changed, 301 insertions(+), 4 deletions(-)
>>  create mode 100644 drivers/virt/acrn/hypercall.h
>>  create mode 100644 drivers/virt/acrn/vm.c
>>  create mode 100644 include/uapi/linux/acrn.h
>>
>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> index 2a198838fca9..ac60efedb104 100644
>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> @@ -319,6 +319,7 @@ Code  Seq#    Include File                                           Comments
>>  0xA0  all    linux/sdp/sdp.h                                         Industrial Device Project
>>                                                                       <mailto:kenji@bitgate.com>
>>  0xA1  0      linux/vtpm_proxy.h                                      TPM Emulator Proxy Driver
>> +0xA2  all    uapi/linux/acrn.h                                       ACRN hypervisor
>>  0xA3  80-8F                                                          Port ACL  in development:
>>                                                                       <mailto:tlewis@mindspring.com>
>>  0xA3  90-9F  linux/dtlk.h
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 3030d0e93d02..d4c1ef303c2d 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -443,6 +443,7 @@ S:	Supported
>>  W:	https://projectacrn.org
>>  F:	Documentation/virt/acrn/
>>  F:	drivers/virt/acrn/
>> +F:	include/uapi/linux/acrn.h
>>
>>  AD1889 ALSA SOUND DRIVER
>>  L:	linux-parisc@vger.kernel.org
>> diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
>> index 6920ed798aaf..cf8b4ed5e74e 100644
>> --- a/drivers/virt/acrn/Makefile
>> +++ b/drivers/virt/acrn/Makefile
>> @@ -1,3 +1,3 @@
>>  # SPDX-License-Identifier: GPL-2.0
>>  obj-$(CONFIG_ACRN_HSM)	:= acrn.o
>> -acrn-y := hsm.o
>> +acrn-y := hsm.o vm.o
>> diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
>> index 29eedd696327..72d92b60d944 100644
>> --- a/drivers/virt/acrn/acrn_drv.h
>> +++ b/drivers/virt/acrn/acrn_drv.h
>> @@ -3,16 +3,37 @@
>>  #ifndef __ACRN_HSM_DRV_H
>>  #define __ACRN_HSM_DRV_H
>>
>> +#include <linux/acrn.h>
>> +#include <linux/dev_printk.h>
>> +#include <linux/miscdevice.h>
>>  #include <linux/types.h>
>>
>> +#include "hypercall.h"
>> +
>> +extern struct miscdevice acrn_dev;
>
>Who else needs to get to this structure in your driver?

Other files of the driver need to use it for dev_*() log APIs.

>
>> +
>>  #define ACRN_INVALID_VMID (0xffffU)
>>
>> +#define ACRN_VM_FLAG_DESTROYED		0U
>> +extern struct list_head acrn_vm_list;
>> +extern rwlock_t acrn_vm_list_lock;
>>  /**
>>   * struct acrn_vm - Properties of ACRN User VM.
>> + * @list:	Entry within global list of all VMs
>>   * @vmid:	User VM ID
>> + * @vcpu_num:	Number of virtual CPUs in the VM
>> + * @flags:	Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
>> + *		in HSM which is different from the &acrn_vm_creation.vm_flag.
>>   */
>>  struct acrn_vm {
>> -	u16	vmid;
>> +	struct list_head	list;
>> +	u16			vmid;
>> +	int			vcpu_num;
>> +	unsigned long		flags;
>>  };
>>
>> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
>> +			       struct acrn_vm_creation *vm_param);
>> +int acrn_vm_destroy(struct acrn_vm *vm);
>> +
>>  #endif /* __ACRN_HSM_DRV_H */
>> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
>> index 28a3052ffa55..f3e6467b8723 100644
>> --- a/drivers/virt/acrn/hsm.c
>> +++ b/drivers/virt/acrn/hsm.c
>> @@ -9,7 +9,6 @@
>>   *	Yakui Zhao <yakui.zhao@intel.com>
>>   */
>>
>> -#include <linux/miscdevice.h>
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>>  #include <linux/slab.h>
>> @@ -38,10 +37,79 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
>>  	return 0;
>>  }
>>
>> +/*
>> + * HSM relies on hypercall layer of the ACRN hypervisor to do the
>> + * sanity check against the input parameters.
>> + */
>> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
>> +			   unsigned long ioctl_param)
>> +{
>> +	struct acrn_vm *vm = filp->private_data;
>> +	struct acrn_vm_creation *vm_param;
>> +	int ret = 0;
>> +
>> +	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
>> +		dev_dbg(acrn_dev.this_device,
>> +			"ioctl 0x%x: Invalid VM state!\n", cmd);
>> +		return -EINVAL;
>> +	}
>> +
>> +	switch (cmd) {
>> +	case ACRN_IOCTL_CREATE_VM:
>> +		vm_param = memdup_user((void __user *)ioctl_param,
>> +				       sizeof(struct acrn_vm_creation));
>> +		if (IS_ERR(vm_param))
>> +			return PTR_ERR(vm_param);
>> +
>> +		vm = acrn_vm_create(vm, vm_param);
>> +		if (!vm) {
>> +			ret = -EINVAL;
>> +			kfree(vm_param);
>> +			break;
>> +		}
>> +
>> +		if (copy_to_user((void __user *)ioctl_param, vm_param,
>> +				 sizeof(struct acrn_vm_creation))) {
>> +			acrn_vm_destroy(vm);
>> +			ret = -EFAULT;
>> +		}
>> +
>> +		kfree(vm_param);
>> +		break;
>> +	case ACRN_IOCTL_START_VM:
>> +		ret = hcall_start_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to start VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_PAUSE_VM:
>> +		ret = hcall_pause_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to pause VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_RESET_VM:
>> +		ret = hcall_reset_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to restart VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_DESTROY_VM:
>> +		ret = acrn_vm_destroy(vm);
>> +		break;
>> +	default:
>> +		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
>
>Do not let userspace spam kernel logs with invalid stuff, that's a sure
>way to cause a DoS.

OK, got it. I will change this to dev_dbg().
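
The point about log spam can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: model_ioctl(),
the MODEL_CMD_* values and the two counters are hypothetical names made
up for illustration, not the driver's code. The unknown-command path
reports -ENOTTY to the caller and touches only the debug counter, so a
caller looping on bad commands adds nothing to the log that is visible
by default.

```c
#include <errno.h>

/* Hypothetical command numbers for this model; not the ACRN ioctl values. */
enum { MODEL_CMD_START = 1, MODEL_CMD_PAUSE = 2 };

/* Messages that would be visible at the default log level. */
static unsigned long model_default_log_lines;
/* Debug-only messages, which are disabled unless explicitly enabled. */
static unsigned long model_debug_lines;

static long model_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case MODEL_CMD_START:
	case MODEL_CMD_PAUSE:
		return 0;
	default:
		/*
		 * Debug level only: repeated bad commands from userspace
		 * never reach the default-visible log.
		 */
		model_debug_lines++;
		return -ENOTTY;
	}
}
```

The error code is still returned on every bad call, so a well-behaved
caller loses no information.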

Thanks
shuo


* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-27 10:47   ` Greg Kroah-Hartman
@ 2020-09-28  3:50     ` Shuo A Liu
  2020-09-28  5:25       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  3:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

Hi Greg,

On Sun 27.Sep'20 at 12:47:02 +0200, Greg Kroah-Hartman wrote:
>On Tue, Sep 22, 2020 at 07:43:00PM +0800, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> The VM management interfaces expose several VM operations to ACRN
>> userspace via ioctls. For example, creating VM, starting VM, destroying
>> VM and so on.
>>
>> The ACRN Hypervisor needs to exchange data with the ACRN userspace
>> during the VM operations. HSM provides VM operation ioctls to the ACRN
>> userspace and communicates with the ACRN Hypervisor for VM operations
>> via hypercalls.
>>
>> HSM maintains a list of User VM. Each User VM will be bound to an
>> existing file descriptor of /dev/acrn_hsm. The User VM will be
>> destroyed when the file descriptor is closed.
>>
>> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Zhi Wang <zhi.a.wang@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: Yu Wang <yu1.wang@intel.com>
>> Cc: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>>  .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>>  MAINTAINERS                                   |  1 +
>>  drivers/virt/acrn/Makefile                    |  2 +-
>>  drivers/virt/acrn/acrn_drv.h                  | 23 +++++-
>>  drivers/virt/acrn/hsm.c                       | 73 ++++++++++++++++-
>>  drivers/virt/acrn/hypercall.h                 | 78 +++++++++++++++++++
>>  drivers/virt/acrn/vm.c                        | 71 +++++++++++++++++
>>  include/uapi/linux/acrn.h                     | 56 +++++++++++++
>>  8 files changed, 301 insertions(+), 4 deletions(-)
>>  create mode 100644 drivers/virt/acrn/hypercall.h
>>  create mode 100644 drivers/virt/acrn/vm.c
>>  create mode 100644 include/uapi/linux/acrn.h
>>
>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> index 2a198838fca9..ac60efedb104 100644
>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> @@ -319,6 +319,7 @@ Code  Seq#    Include File                                           Comments
>>  0xA0  all    linux/sdp/sdp.h                                         Industrial Device Project
>>                                                                       <mailto:kenji@bitgate.com>
>>  0xA1  0      linux/vtpm_proxy.h                                      TPM Emulator Proxy Driver
>> +0xA2  all    uapi/linux/acrn.h                                       ACRN hypervisor
>>  0xA3  80-8F                                                          Port ACL  in development:
>>                                                                       <mailto:tlewis@mindspring.com>
>>  0xA3  90-9F  linux/dtlk.h
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 3030d0e93d02..d4c1ef303c2d 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -443,6 +443,7 @@ S:	Supported
>>  W:	https://projectacrn.org
>>  F:	Documentation/virt/acrn/
>>  F:	drivers/virt/acrn/
>> +F:	include/uapi/linux/acrn.h
>>
>>  AD1889 ALSA SOUND DRIVER
>>  L:	linux-parisc@vger.kernel.org
>> diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
>> index 6920ed798aaf..cf8b4ed5e74e 100644
>> --- a/drivers/virt/acrn/Makefile
>> +++ b/drivers/virt/acrn/Makefile
>> @@ -1,3 +1,3 @@
>>  # SPDX-License-Identifier: GPL-2.0
>>  obj-$(CONFIG_ACRN_HSM)	:= acrn.o
>> -acrn-y := hsm.o
>> +acrn-y := hsm.o vm.o
>> diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
>> index 29eedd696327..72d92b60d944 100644
>> --- a/drivers/virt/acrn/acrn_drv.h
>> +++ b/drivers/virt/acrn/acrn_drv.h
>> @@ -3,16 +3,37 @@
>>  #ifndef __ACRN_HSM_DRV_H
>>  #define __ACRN_HSM_DRV_H
>>
>> +#include <linux/acrn.h>
>> +#include <linux/dev_printk.h>
>> +#include <linux/miscdevice.h>
>>  #include <linux/types.h>
>>
>> +#include "hypercall.h"
>> +
>> +extern struct miscdevice acrn_dev;
>> +
>>  #define ACRN_INVALID_VMID (0xffffU)
>>
>> +#define ACRN_VM_FLAG_DESTROYED		0U
>> +extern struct list_head acrn_vm_list;
>> +extern rwlock_t acrn_vm_list_lock;
>>  /**
>>   * struct acrn_vm - Properties of ACRN User VM.
>> + * @list:	Entry within global list of all VMs
>>   * @vmid:	User VM ID
>> + * @vcpu_num:	Number of virtual CPUs in the VM
>> + * @flags:	Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
>> + *		in HSM which is different from the &acrn_vm_creation.vm_flag.
>>   */
>>  struct acrn_vm {
>> -	u16	vmid;
>> +	struct list_head	list;
>> +	u16			vmid;
>> +	int			vcpu_num;
>> +	unsigned long		flags;
>>  };
>>
>> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
>> +			       struct acrn_vm_creation *vm_param);
>> +int acrn_vm_destroy(struct acrn_vm *vm);
>> +
>>  #endif /* __ACRN_HSM_DRV_H */
>> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
>> index 28a3052ffa55..f3e6467b8723 100644
>> --- a/drivers/virt/acrn/hsm.c
>> +++ b/drivers/virt/acrn/hsm.c
>> @@ -9,7 +9,6 @@
>>   *	Yakui Zhao <yakui.zhao@intel.com>
>>   */
>>
>> -#include <linux/miscdevice.h>
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>>  #include <linux/slab.h>
>> @@ -38,10 +37,79 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
>>  	return 0;
>>  }
>>
>> +/*
>> + * HSM relies on hypercall layer of the ACRN hypervisor to do the
>> + * sanity check against the input parameters.
>> + */
>> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
>> +			   unsigned long ioctl_param)
>> +{
>> +	struct acrn_vm *vm = filp->private_data;
>> +	struct acrn_vm_creation *vm_param;
>> +	int ret = 0;
>> +
>> +	if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
>> +		dev_dbg(acrn_dev.this_device,
>> +			"ioctl 0x%x: Invalid VM state!\n", cmd);
>> +		return -EINVAL;
>> +	}
>> +
>> +	switch (cmd) {
>> +	case ACRN_IOCTL_CREATE_VM:
>> +		vm_param = memdup_user((void __user *)ioctl_param,
>> +				       sizeof(struct acrn_vm_creation));
>> +		if (IS_ERR(vm_param))
>> +			return PTR_ERR(vm_param);
>> +
>> +		vm = acrn_vm_create(vm, vm_param);
>> +		if (!vm) {
>> +			ret = -EINVAL;
>> +			kfree(vm_param);
>> +			break;
>> +		}
>> +
>> +		if (copy_to_user((void __user *)ioctl_param, vm_param,
>> +				 sizeof(struct acrn_vm_creation))) {
>> +			acrn_vm_destroy(vm);
>> +			ret = -EFAULT;
>> +		}
>> +
>> +		kfree(vm_param);
>> +		break;
>> +	case ACRN_IOCTL_START_VM:
>> +		ret = hcall_start_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to start VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_PAUSE_VM:
>> +		ret = hcall_pause_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to pause VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_RESET_VM:
>> +		ret = hcall_reset_vm(vm->vmid);
>> +		if (ret < 0)
>> +			dev_err(acrn_dev.this_device,
>> +				"Failed to restart VM %u!\n", vm->vmid);
>> +		break;
>> +	case ACRN_IOCTL_DESTROY_VM:
>> +		ret = acrn_vm_destroy(vm);
>> +		break;
>> +	default:
>> +		dev_warn(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
>> +		ret = -ENOTTY;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>>  static int acrn_dev_release(struct inode *inode, struct file *filp)
>>  {
>>  	struct acrn_vm *vm = filp->private_data;
>>
>> +	acrn_vm_destroy(vm);
>>  	kfree(vm);
>>  	return 0;
>>  }
>> @@ -50,9 +118,10 @@ static const struct file_operations acrn_fops = {
>>  	.owner		= THIS_MODULE,
>>  	.open		= acrn_dev_open,
>>  	.release	= acrn_dev_release,
>> +	.unlocked_ioctl = acrn_dev_ioctl,
>>  };
>>
>> -static struct miscdevice acrn_dev = {
>> +struct miscdevice acrn_dev = {
>>  	.minor	= MISC_DYNAMIC_MINOR,
>>  	.name	= "acrn_hsm",
>>  	.fops	= &acrn_fops,
>> diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
>> new file mode 100644
>> index 000000000000..426b66cadb1f
>> --- /dev/null
>> +++ b/drivers/virt/acrn/hypercall.h
>> @@ -0,0 +1,78 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * ACRN HSM: hypercalls of ACRN Hypervisor
>> + */
>> +#ifndef __ACRN_HSM_HYPERCALL_H
>> +#define __ACRN_HSM_HYPERCALL_H
>> +#include <asm/acrn.h>
>> +
>> +/*
>> + * Hypercall IDs of the ACRN Hypervisor
>> + */
>> +#define _HC_ID(x, y) (((x) << 24) | (y))
>> +
>> +#define HC_ID 0x80UL
>> +
>> +#define HC_ID_VM_BASE			0x10UL
>> +#define HC_CREATE_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x00)
>> +#define HC_DESTROY_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x01)
>> +#define HC_START_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x02)
>> +#define HC_PAUSE_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x03)
>> +#define HC_RESET_VM			_HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
>> +
>> +/**
>> + * hcall_create_vm() - Create a User VM
>> + * @vminfo:	Service VM GPA of info of User VM creation
>> + *
>> + * Return: 0 on success, <0 on failure
>> + */
>> +static inline long hcall_create_vm(u64 vminfo)
>> +{
>> +	return acrn_hypercall1(HC_CREATE_VM, vminfo);
>> +}
>> +
>> +/**
>> + * hcall_start_vm() - Start a User VM
>> + * @vmid:	User VM ID
>> + *
>> + * Return: 0 on success, <0 on failure
>> + */
>> +static inline long hcall_start_vm(u64 vmid)
>> +{
>> +	return acrn_hypercall1(HC_START_VM, vmid);
>> +}
>> +
>> +/**
>> + * hcall_pause_vm() - Pause a User VM
>> + * @vmid:	User VM ID
>> + *
>> + * Return: 0 on success, <0 on failure
>> + */
>> +static inline long hcall_pause_vm(u64 vmid)
>> +{
>> +	return acrn_hypercall1(HC_PAUSE_VM, vmid);
>> +}
>> +
>> +/**
>> + * hcall_destroy_vm() - Destroy a User VM
>> + * @vmid:	User VM ID
>> + *
>> + * Return: 0 on success, <0 on failure
>> + */
>> +static inline long hcall_destroy_vm(u64 vmid)
>> +{
>> +	return acrn_hypercall1(HC_DESTROY_VM, vmid);
>> +}
>> +
>> +/**
>> + * hcall_reset_vm() - Reset a User VM
>> + * @vmid:	User VM ID
>> + *
>> + * Return: 0 on success, <0 on failure
>> + */
>> +static inline long hcall_reset_vm(u64 vmid)
>> +{
>> +	return acrn_hypercall1(HC_RESET_VM, vmid);
>> +}
>> +
>> +#endif /* __ACRN_HSM_HYPERCALL_H */
>> diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
>> new file mode 100644
>> index 000000000000..920ca48f4847
>> --- /dev/null
>> +++ b/drivers/virt/acrn/vm.c
>> @@ -0,0 +1,71 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * ACRN_HSM: Virtual Machine management
>> + *
>> + * Copyright (C) 2020 Intel Corporation. All rights reserved.
>> + *
>> + * Authors:
>> + *	Jason Chen CJ <jason.cj.chen@intel.com>
>> + *	Yakui Zhao <yakui.zhao@intel.com>
>> + */
>> +#include <linux/io.h>
>> +#include <linux/mm.h>
>> +#include <linux/slab.h>
>> +
>> +#include "acrn_drv.h"
>> +
>> +/* List of VMs */
>> +LIST_HEAD(acrn_vm_list);
>> +/*
>> + * acrn_vm_list is read in a tasklet which dispatch I/O requests and is wrote
>> + * in VM creation ioctl. Use the rwlock mechanism to protect it.
>> + */
>> +DEFINE_RWLOCK(acrn_vm_list_lock);
>> +
>> +struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
>> +			       struct acrn_vm_creation *vm_param)
>> +{
>> +	int ret;
>> +
>> +	ret = hcall_create_vm(virt_to_phys(vm_param));
>> +	if (ret < 0 || vm_param->vmid == ACRN_INVALID_VMID) {
>> +		dev_err(acrn_dev.this_device,
>> +			"Failed to create VM! Error: %d\n", ret);
>> +		return NULL;
>> +	}
>> +
>> +	vm->vmid = vm_param->vmid;
>> +	vm->vcpu_num = vm_param->vcpu_num;
>> +
>> +	write_lock_bh(&acrn_vm_list_lock);
>> +	list_add(&vm->list, &acrn_vm_list);
>> +	write_unlock_bh(&acrn_vm_list_lock);
>
>Why are the _bh() variants being used here?
>
>You are only accessing this list from userspace context in this patch.
>
>Heck, you aren't even reading from the list, only writing to it...

acrn_vm_list is read in a tasklet which dispatches I/O requests and is
written in the VM creation ioctl, so an rwlock is used to protect it.
The read side is introduced in later patches of this series, so I kept
the final lock type at the point where the lock is introduced.

Thanks
shuo


* Re: [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU
  2020-09-27 10:44   ` Greg Kroah-Hartman
@ 2020-09-28  4:10     ` Shuo A Liu
  2020-09-28  5:23       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  4:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

Hi Greg,

On Sun 27.Sep'20 at 12:44:14 +0200, Greg Kroah-Hartman wrote:
>On Tue, Sep 22, 2020 at 07:43:11PM +0800, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> ACRN supports partition mode to achieve real-time requirements. In
>> partition mode, a CPU core can be dedicated to a vCPU of User VM. The
>> local APIC of the dedicated CPU core can be passthrough to the User VM.
>> The Service VM controls the assignment of the CPU cores.
>>
>> Introduce an interface for the Service VM to remove the control of CPU
>> core from hypervisor perspective so that the CPU core can be a dedicated
>> CPU core of User VM.
>>
>> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Zhi Wang <zhi.a.wang@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: Yu Wang <yu1.wang@intel.com>
>> Cc: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>>  drivers/virt/acrn/hsm.c       | 50 +++++++++++++++++++++++++++++++++++
>>  drivers/virt/acrn/hypercall.h | 14 ++++++++++
>>  2 files changed, 64 insertions(+)
>>
>> diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
>> index aaf4e76d27b4..ef5f77a38d1f 100644
>> --- a/drivers/virt/acrn/hsm.c
>> +++ b/drivers/virt/acrn/hsm.c
>> @@ -9,6 +9,7 @@
>>   *	Yakui Zhao <yakui.zhao@intel.com>
>>   */
>>
>> +#include <linux/cpu.h>
>>  #include <linux/io.h>
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>> @@ -354,6 +355,47 @@ struct miscdevice acrn_dev = {
>>  	.fops	= &acrn_fops,
>>  };
>>
>> +static ssize_t remove_cpu_store(struct device *dev,
>> +				struct device_attribute *attr,
>> +				const char *buf, size_t count)
>> +{
>> +	u64 cpu, lapicid;
>> +	int ret;
>> +
>> +	if (kstrtoull(buf, 0, &cpu) < 0)
>> +		return -EINVAL;
>> +
>> +	if (cpu >= num_possible_cpus() || cpu == 0 || !cpu_is_hotpluggable(cpu))
>> +		return -EINVAL;
>> +
>> +	if (cpu_online(cpu))
>> +		remove_cpu(cpu);
>> +
>> +	lapicid = cpu_data(cpu).apicid;
>> +	dev_dbg(dev, "Try to remove cpu %lld with lapicid %lld\n", cpu, lapicid);
>> +	ret = hcall_sos_remove_cpu(lapicid);
>> +	if (ret < 0) {
>> +		dev_err(dev, "Failed to remove cpu %lld!\n", cpu);
>> +		goto fail_remove;
>> +	}
>> +
>> +	return count;
>> +
>> +fail_remove:
>> +	add_cpu(cpu);
>> +	return ret;
>> +}
>> +static DEVICE_ATTR_WO(remove_cpu);
>> +
>> +static struct attribute *acrn_attrs[] = {
>> +	&dev_attr_remove_cpu.attr,
>> +	NULL
>> +};
>> +
>> +static struct attribute_group acrn_attr_group = {
>> +	.attrs = acrn_attrs,
>> +};
>
>You create a sysfs attribute without any Documentation/ABI/ update as
>well?  That's not good.

Sorry, I will add it under Documentation/ABI/testing.

>
>And why are you trying to emulate CPU hotplug here and not using the
>existing CPU hotplug mechanism?

The interface introduced here includes:
  1) Hotplug of the Service VM virtual CPU
  2) A hypercall to the hypervisor to remove one virtual CPU from the
    Service VM
Step 1) just does the CPU hotplug with the kernel API remove_cpu(), and
can be undone (through the CPU online interface) if only 1) has been done.
Once 2) is done, the physical CPU is removed from the Service VM's CPU
pool. The ACRN hypervisor supports passing a physical CPU through to a
VM. The precondition is that the physical CPU is not occupied by any
other VM. This interface is intended to satisfy that precondition.


>
>> +
>>  static int __init hsm_init(void)
>>  {
>>  	int ret;
>> @@ -370,13 +412,21 @@ static int __init hsm_init(void)
>>  		return ret;
>>  	}
>>
>> +	ret = sysfs_create_group(&acrn_dev.this_device->kobj, &acrn_attr_group);
>> +	if (ret) {
>> +		dev_warn(acrn_dev.this_device, "sysfs create failed\n");
>> +		misc_deregister(&acrn_dev);
>> +		return ret;
>> +	}
>
>You just raced with userspace and lost.  If you want to add attribute
>files to a device, use the default attribute group list, and it will be
>managed properly for you by the driver core.
>
>Huge hint, if a driver every has to touch a kobject, or call sysfs_*,
>then it is probably doing something wrong.

Do you mean using .groups of struct miscdevice directly?

If yes, let me follow drivers/char/hw_random/s390-trng.c to do this.
BTW, few drivers use .groups directly. :)

Thanks
shuo


* Re: [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU
  2020-09-28  4:10     ` Shuo A Liu
@ 2020-09-28  5:23       ` Greg Kroah-Hartman
  2020-09-28  6:33         ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-28  5:23 UTC (permalink / raw)
  To: Shuo A Liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon, Sep 28, 2020 at 12:10:07PM +0800, Shuo A Liu wrote:
> > You just raced with userspace and lost.  If you want to add attribute
> > files to a device, use the default attribute group list, and it will be
> > managed properly for you by the driver core.
> > 
> > Huge hint, if a driver every has to touch a kobject, or call sysfs_*,
> > then it is probably doing something wrong.
> 
> Do you mean use .groups of struct miscdevice directly ?
> 
> If yes, let me follow drivers/char/hw_random/s390-trng.c to do this.
> BTW, few driver use the .groups directly. :)

Drivers should almost never be messing with individual sysfs files.  And
this ability to use .groups is a "new" one; conversions of existing code
that does not use it are always welcome.
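
For reference, a minimal sketch of what the .groups approach could look
like for this driver. It is a kernel fragment, not a buildable unit: it
assumes remove_cpu_store() and acrn_fops are defined earlier in hsm.c as
in the patch, and the group name "acrn" is an assumption chosen so that
ATTRIBUTE_GROUPS() picks up the existing acrn_attrs[] array.

```c
#include <linux/device.h>
#include <linux/miscdevice.h>
#include <linux/sysfs.h>

/* remove_cpu_store() and acrn_fops are assumed to be defined above. */
static DEVICE_ATTR_WO(remove_cpu);

static struct attribute *acrn_attrs[] = {
	&dev_attr_remove_cpu.attr,
	NULL
};
/* Expands to acrn_group and a NULL-terminated acrn_groups[] array. */
ATTRIBUTE_GROUPS(acrn);

struct miscdevice acrn_dev = {
	.minor	= MISC_DYNAMIC_MINOR,
	.name	= "acrn_hsm",
	.fops	= &acrn_fops,
	/* Created and removed by the driver core together with the device. */
	.groups	= acrn_groups,
};
```

With this in place, hsm_init() would no longer need the
sysfs_create_group() call or its error path, since the attribute exists
by the time userspace is told about the device.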

thanks,

greg k-h


* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-28  3:50     ` Shuo A Liu
@ 2020-09-28  5:25       ` Greg Kroah-Hartman
  2020-09-28  6:29         ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-28  5:25 UTC (permalink / raw)
  To: Shuo A Liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon, Sep 28, 2020 at 11:50:30AM +0800, Shuo A Liu wrote:
> > > +	write_lock_bh(&acrn_vm_list_lock);
> > > +	list_add(&vm->list, &acrn_vm_list);
> > > +	write_unlock_bh(&acrn_vm_list_lock);
> > 
> > Why are the _bh() variants being used here?
> > 
> > You are only accessing this list from userspace context in this patch.
> > 
> > Heck, you aren't even reading from the list, only writing to it...
> 
> acrn_vm_list is read in a tasklet which dispatch I/O requests and is wrote
> in VM creation ioctl. Use the rwlock mechanism to protect it.
> The reading operation is introduced in the following patches of this
> series. So i keep the lock type at the moment of introduction.

Ok, but think about someone trying to review this code.  Does this lock
actually make sense here?  No, it does not.  How am I supposed to know
to look at future patches to determine that it changes location and
usage to require this?

That's just not fair, would you want to review something like this?

And a HUGE meta-comment, again, why am I the only one reviewing this
stuff?  Why do you have a ton of Intel people on the Cc: yet it is, once
again, my job to do this?

If you all are wanting to burn me out, you are doing a good job...

greg k-h


* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-28  5:25       ` Greg Kroah-Hartman
@ 2020-09-28  6:29         ` Shuo A Liu
  2020-09-28 12:26           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  6:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon 28.Sep'20 at  7:25:16 +0200, Greg Kroah-Hartman wrote:
>On Mon, Sep 28, 2020 at 11:50:30AM +0800, Shuo A Liu wrote:
>> > > +	write_lock_bh(&acrn_vm_list_lock);
>> > > +	list_add(&vm->list, &acrn_vm_list);
>> > > +	write_unlock_bh(&acrn_vm_list_lock);
>> >
>> > Why are the _bh() variants being used here?
>> >
>> > You are only accessing this list from userspace context in this patch.
>> >
>> > Heck, you aren't even reading from the list, only writing to it...
>>
>> acrn_vm_list is read in a tasklet which dispatch I/O requests and is wrote
>> in VM creation ioctl. Use the rwlock mechanism to protect it.
>> The reading operation is introduced in the following patches of this
>> series. So i keep the lock type at the moment of introduction.
>
>Ok, but think about someone trying to review this code.  Does this lock
>actually make sense here?  No, it does not.  How am I supposed to know
>to look at future patches to determine that it changes location and
>usage to require this?

OK. May I ask how such a code submission should be handled? Which of
the following do you prefer?
  1) Use a mutex here, then change it to an rwlock in a later patch
     of this series.
  2) Add more comments about the lock to the changelog. (There is
     already a comment next to acrn_vm_list_lock.)

>
>That's just not fair, would you want to review something like this?
>
>And a HUGE meta-comment, again, why am I the only one reviewing this
>stuff?  Why do you have a ton of Intel people on the Cc: yet it is, once
>again, my job to do this?

The patchset was reviewed on Intel's internal mailing list for several
rounds and got its Reviewed-by: tags before being sent out. That's why I
Cc'ed many Intel people as well.

This patchset is all about a common driver for ACRN hypervisor
support. I put the code in drivers/virt/ and found that you are one of
the maintainers of the vboxguest driver, which is in the same
subdirectory. I thought you would be the right person to Cc when I
submitted this series.

Certainly, any comments are welcome, and I really appreciate your review
and help. I have little experience submitting a new driver to the
community; my apologies if things went wrong.

Thanks
shuo


* Re: [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU
  2020-09-28  5:23       ` Greg Kroah-Hartman
@ 2020-09-28  6:33         ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-28  6:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon 28.Sep'20 at  7:23:05 +0200, Greg Kroah-Hartman wrote:
>On Mon, Sep 28, 2020 at 12:10:07PM +0800, Shuo A Liu wrote:
>> > You just raced with userspace and lost.  If you want to add attribute
>> > files to a device, use the default attribute group list, and it will be
>> > managed properly for you by the driver core.
>> >
>> > Huge hint, if a driver every has to touch a kobject, or call sysfs_*,
>> > then it is probably doing something wrong.
>>
>> Do you mean use .groups of struct miscdevice directly ?
>>
>> If yes, let me follow drivers/char/hw_random/s390-trng.c to do this.
>> BTW, few driver use the .groups directly. :)
>
>Drivers should almost never be messing with individual sysfs files.  And
>this ability to use .groups is a "new" one, conversions of existing code
>that do not use them is always welcome.

OK. Thanks for the explanation. I will follow the 'new' way :)

Thanks
shuo


* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-28  6:29         ` Shuo A Liu
@ 2020-09-28 12:26           ` Greg Kroah-Hartman
  2020-09-30  2:49             ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Greg Kroah-Hartman @ 2020-09-28 12:26 UTC (permalink / raw)
  To: Shuo A Liu
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon, Sep 28, 2020 at 02:29:34PM +0800, Shuo A Liu wrote:
> On Mon 28.Sep'20 at  7:25:16 +0200, Greg Kroah-Hartman wrote:
> > On Mon, Sep 28, 2020 at 11:50:30AM +0800, Shuo A Liu wrote:
> > > > > +	write_lock_bh(&acrn_vm_list_lock);
> > > > > +	list_add(&vm->list, &acrn_vm_list);
> > > > > +	write_unlock_bh(&acrn_vm_list_lock);
> > > >
> > > > Why are the _bh() variants being used here?
> > > >
> > > > You are only accessing this list from userspace context in this patch.
> > > >
> > > > Heck, you aren't even reading from the list, only writing to it...
> > > 
> > > acrn_vm_list is read in a tasklet which dispatch I/O requests and is wrote
> > > in VM creation ioctl. Use the rwlock mechanism to protect it.
> > > The reading operation is introduced in the following patches of this
> > > series. So i keep the lock type at the moment of introduction.
> > 
> > Ok, but think about someone trying to review this code.  Does this lock
> > actually make sense here?  No, it does not.  How am I supposed to know
> > to look at future patches to determine that it changes location and
> > usage to require this?
> 
> OK. May i know how to handle such kind of code submission? Or which way
> following do you prefer?
>  1) Use a mutex lock here, then change it to rwlock in a later patch
>     of this series.

Wouldn't this make more sense if you had to read these one after
another?

>  2) Add more comments in changelog about the lock. (Now, there is
>     comment around the acrn_vm_list_lock)

It's hard to verify a comment's statement without digging through other
patches in the series, right?  You want the reviewer to just trust you?
:)

Again, what would _YOU_ want to see if you had to review this?

> > That's just not fair, would you want to review something like this?
> > 
> > And a HUGE meta-comment, again, why am I the only one reviewing this
> > stuff?  Why do you have a ton of Intel people on the Cc: yet it is, once
> > again, my job to do this?
> 
> The patchset has been reviewed in Intel's internal mailist several
> rounds and got Reviewed-by: before send out. That's why i Cced many
> Intel people as well.

Then why didn't any of those intel people on the cc: actually review it
after you have sent it out?  Why is it only me?  Do I need to wait
longer for them to get to this?  I'll gladly do so next time...

> This patchset is all about a common driver for the ACRN hypervisor
> support. I put the code in drivers/virt/ and found you are one of the
> maintainer of vboxguest driver which is in the same subdirectory. I
> thought you should be the right person to be Cced when i submitted this
> series.

I am, I'm not complaining about that.  I'm complaining that it seems to
be _only_ me reviewing this here, and not any of the people you are cc:ing
from intel.  Most of those people should be giving you this same type of
review comments and not forcing an external person to do so, right?

> Certainly, any comments are welcome. And really appreciate your review
> and help. I have little experience to submit a new driver to the
> community, my apologies if thing goes wrong.

You didn't do anything wrong, I'm arguing about the larger meta-issue I
have right now with Intel and the lack of reviews that seems to happen
from other Intel people on their co-workers' patches.

Anyway, you are doing fine, it's an iterative process, hopefully you can
also review other people's patches in this area that are being posted as
well.

thanks,

greg k-h


* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-22 11:42 ` [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler() shuo.a.liu
  2020-09-27 10:49   ` Greg Kroah-Hartman
@ 2020-09-29 18:01   ` Borislav Petkov
  2020-09-29 20:07     ` Thomas Gleixner
  1 sibling, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2020-09-29 18:01 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:42:56PM +0800, shuo.a.liu@intel.com wrote:
> +void acrn_setup_intr_handler(void (*handler)(void))
> +{
> +	acrn_intr_handler = handler;
> +}
> +EXPORT_SYMBOL_GPL(acrn_setup_intr_handler);
> +
> +void acrn_remove_intr_handler(void)
> +{
> +	acrn_intr_handler = NULL;
> +}
> +EXPORT_SYMBOL_GPL(acrn_remove_intr_handler);

I don't like this one bit.

Also, what stops the module from doing acrn_remove_intr_handler()
while it gets a HYPERVISOR_CALLBACK_VECTOR interrupt and the handler
disappearing from under it?

IOW, this should be an atomic notifier instead.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-29 18:01   ` Borislav Petkov
@ 2020-09-29 20:07     ` Thomas Gleixner
  2020-09-29 20:26       ` Borislav Petkov
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2020-09-29 20:07 UTC (permalink / raw)
  To: Borislav Petkov, shuo.a.liu
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Ingo Molnar, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams, Fengwei Yin,
	Zhenyu Wang

On Tue, Sep 29 2020 at 20:01, Borislav Petkov wrote:
> On Tue, Sep 22, 2020 at 07:42:56PM +0800, shuo.a.liu@intel.com wrote:
>> +void acrn_setup_intr_handler(void (*handler)(void))
>> +{
>> +	acrn_intr_handler = handler;
>> +}
>> +EXPORT_SYMBOL_GPL(acrn_setup_intr_handler);
>> +
>> +void acrn_remove_intr_handler(void)
>> +{
>> +	acrn_intr_handler = NULL;
>> +}
>> +EXPORT_SYMBOL_GPL(acrn_remove_intr_handler);
>
> I don't like this one bit.
>
> Also, what stops the module from doing acrn_remove_intr_handler()
> while it gets a HYPERVISOR_CALLBACK_VECTOR interrupt and the handler
> disappearing from under it?
>
> IOW, this should be an atomic notifier instead.

That does not prevent that either and notifiers suck. The pointer is
fine and if something removes the handler before all of the muck is
shutdown then the author can keep the pieces and mop up the remains.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-29 20:07     ` Thomas Gleixner
@ 2020-09-29 20:26       ` Borislav Petkov
  2020-09-30  3:02         ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2020-09-29 20:26 UTC (permalink / raw)
  To: Thomas Gleixner, shuo.a.liu
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Ingo Molnar, Sean Christopherson, Yu Wang, Reinette Chatre,
	Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams, Fengwei Yin,
	Zhenyu Wang

On Tue, Sep 29, 2020 at 10:07:17PM +0200, Thomas Gleixner wrote:
> That does not prevent that either and notifiers suck.

Bah, atomic notifiers run functions which cannot block, not what is
needed here, right.

> The pointer is fine and if something removes the handler before all of
> the muck is shutdown then the author can keep the pieces and mop up
> the remains.

Uhu, so what makes sure that the module is not removed while an IRQ is
happening?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces
  2020-09-28 12:26           ` Greg Kroah-Hartman
@ 2020-09-30  2:49             ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-30  2:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, x86, H . Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sean Christopherson, Yu Wang, Reinette Chatre,
	Zhi Wang, Zhenyu Wang

On Mon 28.Sep'20 at 14:26:02 +0200, Greg Kroah-Hartman wrote:
>On Mon, Sep 28, 2020 at 02:29:34PM +0800, Shuo A Liu wrote:
>> On Mon 28.Sep'20 at  7:25:16 +0200, Greg Kroah-Hartman wrote:
>> > On Mon, Sep 28, 2020 at 11:50:30AM +0800, Shuo A Liu wrote:
>> > > > > +	write_lock_bh(&acrn_vm_list_lock);
>> > > > > +	list_add(&vm->list, &acrn_vm_list);
>> > > > > +	write_unlock_bh(&acrn_vm_list_lock);
>> > > >
>> > > > Why are the _bh() variants being used here?
>> > > >
>> > > > You are only accessing this list from userspace context in this patch.
>> > > >
>> > > > Heck, you aren't even reading from the list, only writing to it...
>> > >
>> > > acrn_vm_list is read in a tasklet which dispatches I/O requests and is
>> > > written in the VM creation ioctl. The rwlock protects it.
>> > > The read side is introduced in later patches of this series, so I
>> > > kept the final lock type at the point of introduction.
>> >
>> > Ok, but think about someone trying to review this code.  Does this lock
>> > actually make sense here?  No, it does not.  How am I supposed to know
>> > to look at future patches to determine that it changes location and
>> > usage to require this?
>>
>> OK. May I know how to handle this kind of code submission? Which of the
>> following ways do you prefer?
>>  1) Use a mutex lock here, then change it to rwlock in a later patch
>>     of this series.
>
>Wouldn't this make more sense if you had to read these one after
>another?

OK. I will switch to a mutex first to make the patch more readable.

>
>>  2) Add more comments in the changelog about the lock. (Currently there
>>     is a comment near acrn_vm_list_lock.)
>
>It's hard to verify a comment's statement without digging through other
>patches in the series, right?  You want the reviewer to just trust you?
>:)
>
>Again, what would _YOU_ want to see if you had to review this?
>
>> > That's just not fair, would you want to review something like this?
>> >
>> > And a HUGE meta-comment, again, why am I the only one reviewing this
>> > stuff?  Why do you have a ton of Intel people on the Cc: yet it is, once
>> > again, my job to do this?
>>
>> The patchset went through several rounds of review on Intel's internal
>> mailing list and got Reviewed-by: tags before being sent out. That's why
>> I Cced many Intel people as well.
>
>Then why didn't any of those intel people on the cc: actually review it
>after you have sent it out?  Why is it only me?  Do I need to wait
>longer for them to get to this?  I'll gladly do so next time...
>
>> This patchset is all about a common driver for ACRN hypervisor
>> support. I put the code in drivers/virt/ and found that you are one of
>> the maintainers of the vboxguest driver, which is in the same
>> subdirectory. I thought you would be the right person to Cc when I
>> submitted this series.
>
>I am, I'm not complaining about that.  I'm complaining that it seems to
>be _only_ me reviewing this here, and not any of the people you are cc:ing
>from intel.  Most of those people should be giving you this same type of
>review comments and not forcing an external person to do so, right?
>
>> Certainly, any comments are welcome, and I really appreciate your review
>> and help. I have little experience submitting a new driver to the
>> community; my apologies if things go wrong.
>
>You didn't do anything wrong, I'm arguing about the larger meta-issue I
>have right now with Intel and the lack of reviews that seems to happen
>from other Intel people on their co-workers' patches.
>
>Anyway, you are doing fine, it's an iterative process, hopefully you can
>also review other people's patches in this area that are being posted as
>well.

Sorry, I have no answer to some of your questions above. :(
However, I will try my best to help review other people's patches in
this area.

Thanks
shuo

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  2020-09-29 20:26       ` Borislav Petkov
@ 2020-09-30  3:02         ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-09-30  3:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, x86, Greg Kroah-Hartman,
	H . Peter Anvin, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yakui Zhao, Zhi Wang, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhenyu Wang

Hi Boris,

On Tue 29.Sep'20 at 22:26:54 +0200, Borislav Petkov wrote:
>On Tue, Sep 29, 2020 at 10:07:17PM +0200, Thomas Gleixner wrote:
>> That does not prevent that either and notifiers suck.
>
>Bah, atomic notifiers run functions which cannot block, not what is
>needed here, right.
>
>> The pointer is fine and if something removes the handler before all of
>> the muck is shutdown then the author can keep the pieces and mop up
>> the remains.
>
>Uhu, so what makes sure that the module is not removed while an IRQ is
>happening?

The precondition for removing the module is that no User VM is alive
(every open of the dev file holds a reference on the module). The
interrupt can only occur with active User VMs. So if a notification
interrupt is happening, the module cannot be removed, because a User VM
is still alive.
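
The precondition above can be modelled in a few lines of single-threaded
userspace C. This is only a sketch of the argument, not the HSM code: the
names dev_open(), dev_release(), module_try_unload(), callback_vector()
and hsm_handler() are made up for illustration, and a plain counter stands
in for the module reference count.

```c
#include <stdbool.h>
#include <stddef.h>

static void (*intr_handler)(void);	/* models acrn_intr_handler */
static int module_refs;			/* models the module refcount */
static int handled;			/* counts delivered notifications */

/* Hypothetical handler the module would register. */
static void hsm_handler(void)
{
	handled++;
}

/* Every open of the dev file pins the module; release drops the pin. */
static void dev_open(void)
{
	module_refs++;
}

static void dev_release(void)
{
	module_refs--;
}

/* Unload is refused while any opener (and thus any User VM) exists. */
static bool module_try_unload(void)
{
	if (module_refs > 0)
		return false;
	intr_handler = NULL;
	return true;
}

/* Models the callback vector: read the pointer once, then call it. */
static void callback_vector(void)
{
	void (*handler)(void) = intr_handler;

	if (handler)
		handler();
}
```

The model says nothing about concurrency; it only shows that the handler
pointer is cleared on a path that cannot be reached while a reference is
held.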

Thanks
shuo

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged
  2020-09-22 11:42 ` [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged shuo.a.liu
@ 2020-09-30  8:09   ` Borislav Petkov
  2020-10-12  8:40     ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2020-09-30  8:09 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yin Fengwei, Dave Hansen, Dan Williams,
	Zhi Wang, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:42:57PM +0800, shuo.a.liu@intel.com wrote:
> +static u32 acrn_cpuid_base(void)
> +{
> +	static u32 acrn_cpuid_base;
> +
> +	if (!acrn_cpuid_base && boot_cpu_has(X86_FEATURE_HYPERVISOR))
> +		acrn_cpuid_base = hypervisor_cpuid_base("ACRNACRNACRN", 0);
> +
> +	return acrn_cpuid_base;
> +}
> +
> +bool acrn_is_privileged_vm(void)
> +{
> +	return cpuid_eax(acrn_cpuid_base() | ACRN_CPUID_FEATURES) &

What's that dance and acrn_cpuid_base static thing needed for? Why not
simply:

	cpuid_eax(ACRN_CPUID_FEATURES) & ...

?

> +			 ACRN_FEATURE_PRIVILEGED_VM;
> +}
> +EXPORT_SYMBOL_GPL(acrn_is_privileged_vm);

Also, if you're going to need more of those bit checkers acrn_is_<something>
which look at ACRN_CPUID_FEATURES, just stash CPUID_0x40000001_EAX locally and
use a

	acrn_has(ACRN_FEATURE_PRIVILEGED_VM)

which does the bit testing.
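
A minimal userspace sketch of that idea, under stated assumptions:
cpuid_eax() below is a stub returning a made-up value rather than the
kernel helper, the bit position of ACRN_FEATURE_PRIVILEGED_VM is assumed,
and acrn_init_features() is a hypothetical init hook that is not in the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACRN_CPUID_FEATURES		0x40000001u	/* leaf named above */
#define ACRN_FEATURE_PRIVILEGED_VM	(1u << 0)	/* assumed bit position */

/* Stub standing in for the kernel's cpuid_eax(); the value is made up. */
static int cpuid_calls;

static uint32_t cpuid_eax(uint32_t leaf)
{
	cpuid_calls++;
	return leaf == ACRN_CPUID_FEATURES ? ACRN_FEATURE_PRIVILEGED_VM : 0;
}

/* EAX of the features leaf, read once and stashed locally. */
static uint32_t acrn_features;

/* Hypothetical init hook: run once after the hypervisor is detected. */
static void acrn_init_features(void)
{
	acrn_features = cpuid_eax(ACRN_CPUID_FEATURES);
}

/* One generic bit test instead of one acrn_is_<something>() per feature. */
static bool acrn_has(uint32_t feature)
{
	return acrn_features & feature;
}
```

With this shape, acrn_is_privileged_vm() would reduce to
acrn_has(ACRN_FEATURE_PRIVILEGED_VM) and CPUID is executed only once.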

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-22 11:42 ` [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces shuo.a.liu
  2020-09-27 10:51   ` Greg Kroah-Hartman
@ 2020-09-30 10:54   ` Borislav Petkov
  2020-10-12  8:49     ` Shuo A Liu
  1 sibling, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2020-09-30 10:54 UTC (permalink / raw)
  To: shuo.a.liu
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yakui Zhao, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhi Wang, Zhenyu Wang

On Tue, Sep 22, 2020 at 07:42:58PM +0800, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> The Service VM communicates with the hypervisor via conventional
> hypercalls. VMCALL instruction is used to make the hypercalls.
> 
> ACRN hypercall ABI:
>   * Hypercall number is in R8 register.
>   * Up to 2 parameters are in RDI and RSI registers.
>   * Return value is in RAX register.

I'm assuming this is already cast in stone in the HV and it cannot be
changed?

> Introduce the ACRN hypercall interfaces. Because GCC doesn't support R8
> as a direct register constraint, there are two ways to use R8 in
> extended asm:
>   1) use an explicit register variable as input
>   2) use a supported constraint as input with an explicit MOV to R8 at
>      the beginning of the asm
> 
> Both ways produce the same number of instructions.
> Asm code from 1)
>   38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
>   3e:   48 89 c7                mov    %rax,%rdi
>   41:   0f 01 c1                vmcall
> Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
> the CPU is in 64-bit mode.
> 
> Asm code from 2)
>   38:   48 89 c7                mov    %rax,%rdi
>   3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
>   42:   00 00 00
>   45:   0f 01 c1                vmcall
> 
> Choose 1) for code simplicity and a little bit of code size
> optimization.

What?

How much "optimization" is this actually? A couple of bytes?

And all that for this

	/* Nothing can come between the r8 assignment and the asm: */

restriction?

If it is only a couple of bytes, just do the explicit MOV to %r8 and
f'get about it.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-27 15:38     ` Dave Hansen
@ 2020-09-30 11:16       ` Peter Zijlstra
  2020-09-30 16:10         ` Segher Boessenkool
  0 siblings, 1 reply; 58+ messages in thread
From: Peter Zijlstra @ 2020-09-30 11:16 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Greg Kroah-Hartman, shuo.a.liu, linux-kernel, x86,
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang, segher,
	ndesaulniers

On Sun, Sep 27, 2020 at 08:38:03AM -0700, Dave Hansen wrote:
> On 9/27/20 3:51 AM, Greg Kroah-Hartman wrote:
> >> +static inline long acrn_hypercall0(unsigned long hcall_id)
> >> +{
> >> +	register long r8 asm("r8");
> >> +	long result;
> >> +
> >> +	/* Nothing can come between the r8 assignment and the asm: */
> >> +	r8 = hcall_id;
> >> +	asm volatile("vmcall\n\t"
> >> +		     : "=a" (result)
> >> +		     : "r" (r8)
> >> +		     : );
> > What keeps an interrupt from happening between the r8 assignment and the
> > asm: ?
> 
> It's probably better phrased something like: "No other C code can come
> between this r8 assignment and the inline asm".  An interrupt would
> actually be fine in there because interrupts save and restore all
> register state, including r8.
> 
> The problem (mentioned in the changelog) is that gcc does not let you
> place data directly into r8.  But, it does allow you to declare a
> register variable that you can assign to use r8.  There might be a
> problem if a function call came in between and clobbered the register,
> thus the "nothing can come between" comment.
> 
> The comment is really intended to scare away anyone from adding printk()'s.
> 
> More information about these register variables is here:
> 
> > https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables
> 
> Any better ideas for comments would be greatly appreciated.  It has 4 or
> 5 copies so I wanted it to be succinct.

This is disgusting.. Segher, does this actually work? Nick, does clang
also support this?


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 11:16       ` Peter Zijlstra
@ 2020-09-30 16:10         ` Segher Boessenkool
  2020-09-30 17:13           ` Peter Zijlstra
  0 siblings, 1 reply; 58+ messages in thread
From: Segher Boessenkool @ 2020-09-30 16:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Hansen, Greg Kroah-Hartman, shuo.a.liu, linux-kernel, x86,
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang, ndesaulniers

Hi!

On Wed, Sep 30, 2020 at 01:16:12PM +0200, Peter Zijlstra wrote:
> On Sun, Sep 27, 2020 at 08:38:03AM -0700, Dave Hansen wrote:
> > On 9/27/20 3:51 AM, Greg Kroah-Hartman wrote:
> > >> +static inline long acrn_hypercall0(unsigned long hcall_id)
> > >> +{
> > >> +	register long r8 asm("r8");
> > >> +	long result;
> > >> +
> > >> +	/* Nothing can come between the r8 assignment and the asm: */
> > >> +	r8 = hcall_id;
> > >> +	asm volatile("vmcall\n\t"
> > >> +		     : "=a" (result)
> > >> +		     : "r" (r8)
> > >> +		     : );
> > > What keeps an interrupt from happening between the r8 assignment and the
> > > asm: ?
> > 
> > It's probably better phrased something like: "No other C code can come
> > between this r8 assignment and the inline asm".  An interrupt would
> > actually be fine in there because interrupts save and restore all
> > register state, including r8.
> > 
> > The problem (mentioned in the changelog) is that gcc does not let you
> > place data directly into r8.  But, it does allow you to declare a
> > register variable that you can assign to use r8.  There might be a
> > problem if a function call came in between and clobbered the register,
> > thus the "nothing can come between" comment.
> > 
> > The comment is really intended to scare away anyone from adding printk()'s.
> > 
> > More information about these register variables is here:
> > 
> > > https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables
> > 
> > Any better ideas for comments would be greatly appreciated.  It has 4 or
> > 5 copies so I wanted it to be succinct.
> 
> This is disgusting.. Segher, does this actually work? Nick, does clang
> also support this?

The C variable "r8" is just a variable like any other; it can live in
memory, or in any register, and different in all places, too.  It can be
moved around too; where "the assignment to it" happens is a
philosophical question more than anything (the assignment there can be
optimised away completely, for example; it is just a C variable, there
is no magic).

Since this variable is a local register asm, on entry to the asm the
compiler guarantees that the value lives in the assigned register (the
"r8" hardware register in this case).  This all works completely fine.
This is the only guaranteed behaviour for local register asm (well,
together with analogous behaviour for outputs).

If you want to *always* have it live in the hardware reg "r8", you have
to use a global register asm, and almost certainly do that for all
translation units, and use -ffixed-r8 as well.  This of course is
extremely costly.
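
The guarantee for local register asm described above can be checked with
a small userspace sketch. It is not the hypercall itself: VMCALL cannot
run in an ordinary process, so the asm body only copies %r8 into the
output. The function name r8_at_asm_entry() is made up, and the __asm__
spelling is used only so the sketch also builds in strict ISO C modes.

```c
/*
 * Userspace demo of a local register variable. The asm reads %r8 and
 * returns it, which shows what the asm saw in that register on entry.
 */
static inline unsigned long r8_at_asm_entry(unsigned long hcall_id)
{
#if defined(__x86_64__)
	register unsigned long r8 __asm__("r8") = hcall_id;
	unsigned long seen;

	/* Using r8 as an input operand forces its value into %r8 here. */
	__asm__ __volatile__("mov %%r8, %0"
			     : "=a" (seen)
			     : "r" (r8));
	return seen;
#else
	return hcall_id;	/* fallback so the sketch builds elsewhere */
#endif
}
```

Where the C-level assignment to r8 ends up is irrelevant; what matters is
that r8 is listed as an operand of the asm.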


Segher

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 16:10         ` Segher Boessenkool
@ 2020-09-30 17:13           ` Peter Zijlstra
  2020-09-30 19:14             ` Nick Desaulniers
  0 siblings, 1 reply; 58+ messages in thread
From: Peter Zijlstra @ 2020-09-30 17:13 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Dave Hansen, Greg Kroah-Hartman, shuo.a.liu, linux-kernel, x86,
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang, ndesaulniers

On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:

> Since this variable is a local register asm, on entry to the asm the
> compiler guarantees that the value lives in the assigned register (the
> "r8" hardware register in this case).  This all works completely fine.
> This is the only guaranteed behaviour for local register asm (well,
> together with analogous behaviour for outputs).

Right, that's what they're trying to achieve. The hypervisor calling
convention needs that variable in %r8 (which is somewhat unfortunate).

AFAIK this is the first such use in the kernel, but at least the gcc-4.9
(our oldest supported version) claims to support this.

So now we need to know if clang will actually do this too..

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 17:13           ` Peter Zijlstra
@ 2020-09-30 19:14             ` Nick Desaulniers
  2020-09-30 19:42               ` Peter Zijlstra
                                 ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Nick Desaulniers @ 2020-09-30 19:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Segher Boessenkool, Dave Hansen, Greg Kroah-Hartman, shuo.a.liu,
	LKML, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
>
> > Since this variable is a local register asm, on entry to the asm the
> > compiler guarantees that the value lives in the assigned register (the
> > "r8" hardware register in this case).  This all works completely fine.
> > This is the only guaranteed behaviour for local register asm (well,
> > together with analogous behaviour for outputs).
>
> Right, that's what they're trying to achieve. The hypervisor calling
> convention needs that variable in %r8 (which is somewhat unfortunate).
>
> AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> (our oldest supported version) claims to support this.
>
> So now we need to know if clang will actually do this too..

Does clang support register local storage? Let's use godbolt.org to find out:
https://godbolt.org/z/YM45W5
Looks like yes. You can even check different GCC versions via the
dropdown in the top right.

The -ffixed-* flags are less well supported in Clang; they need to be
reimplemented on a per-backend basis. aarch64 is relatively well
supported, but other arches not so much IME.

Do we need register local storage here?

static inline long bar(unsigned long hcall_id)
{
  long result;
  asm volatile("movl %1, %%r8d\n\t"
  "vmcall\n\t"
    : "=a" (result)
    : "ir" (hcall_id)
    : );
  return result;
}
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:14             ` Nick Desaulniers
@ 2020-09-30 19:42               ` Peter Zijlstra
  2020-09-30 23:58                 ` Segher Boessenkool
  2020-09-30 19:59               ` Arvind Sankar
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 58+ messages in thread
From: Peter Zijlstra @ 2020-09-30 19:42 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Segher Boessenkool, Dave Hansen, Greg Kroah-Hartman, shuo.a.liu,
	LKML, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> >
> > > Since this variable is a local register asm, on entry to the asm the
> > > compiler guarantees that the value lives in the assigned register (the
> > > "r8" hardware register in this case).  This all works completely fine.
> > > This is the only guaranteed behaviour for local register asm (well,
> > > together with analogous behaviour for outputs).
> >
> > Right, that's what they're trying to achieve. The hypervisor calling
> > convention needs that variable in %r8 (which is somewhat unfortunate).
> >
> > AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> > (our oldest supported version) claims to support this.
> >
> > So now we need to know if clang will actually do this too..
> 
> Does clang support register local storage? Let's use godbolt.org to find out:
> https://godbolt.org/z/YM45W5
> Looks like yes. You can even check different GCC versions via the
> dropdown in the top right.

That only tells me it compiles it, not if that (IMO) weird construct is
actually guaranteed to work as expected.

I'd almost dive into the GCC archives to read the back-story to this
'feature', it just seems too weird to me. Ah well, that's for another day.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:14             ` Nick Desaulniers
  2020-09-30 19:42               ` Peter Zijlstra
@ 2020-09-30 19:59               ` Arvind Sankar
  2020-09-30 20:01                 ` Arvind Sankar
  2020-10-01  0:08                 ` Segher Boessenkool
  2020-09-30 23:25               ` Segher Boessenkool
  2020-10-12  8:44               ` Shuo A Liu
  3 siblings, 2 replies; 58+ messages in thread
From: Arvind Sankar @ 2020-09-30 19:59 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Segher Boessenkool, Dave Hansen,
	Greg Kroah-Hartman, shuo.a.liu, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> >
> > > Since this variable is a local register asm, on entry to the asm the
> > > compiler guarantees that the value lives in the assigned register (the
> > > "r8" hardware register in this case).  This all works completely fine.
> > > This is the only guaranteed behaviour for local register asm (well,
> > > together with analogous behaviour for outputs).

How strict is the guarantee? This is an inline function -- could the
compiler decide to reorder some other code in between the r8 assignment
and the asm statement when it gets inlined?

> >
> > Right, that's what they're trying to achieve. The hypervisor calling
> > convention needs that variable in %r8 (which is somewhat unfortunate).
> >
> > AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> > (our oldest supported version) claims to support this.
> >
> > So now we need to know if clang will actually do this too..
> 
> Does clang support register local storage? Let's use godbolt.org to find out:
> https://godbolt.org/z/YM45W5
> Looks like yes. You can even check different GCC versions via the
> dropdown in the top right.
> 
> The -ffixed-* flags are less well supported in Clang; they need to be
> reimplemented on a per-backend basis. aarch64 is relatively well
> supported, but other arches not so much IME.
> 
> Do we need register local storage here?
> 
> static inline long bar(unsigned long hcall_id)
> {
>   long result;
>   asm volatile("movl %1, %%r8d\n\t"
>   "vmcall\n\t"
>     : "=a" (result)
>     : "ir" (hcall_id)
>     : );
>   return result;
> }

This seems more robust, though you probably need an r8 clobber in there?
Is hcall_id actually just 32 bits or can it be >=2^32?
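
For illustration, a userspace sketch of the explicit-MOV variant with an
"r8" clobber and a full 64-bit move, so an hcall_id >= 2^32 would not be
truncated. This is an assumption-laden stand-in, not the patch: the name
mov_to_r8() is made up, the second instruction replaces VMCALL so the
code can run in a normal process, and the "memory" clobber is a
conservative choice for a call that may touch guest memory.

```c
/*
 * Explicit MOV into %r8 inside the asm. The "r8" clobber tells the
 * compiler the register is overwritten, so it will neither keep a live
 * value there nor allocate operand %1 to it.
 */
static inline unsigned long mov_to_r8(unsigned long hcall_id)
{
#if defined(__x86_64__)
	unsigned long seen;

	__asm__ __volatile__("mov %1, %%r8\n\t"	/* full 64-bit move */
			     "mov %%r8, %0"	/* stand-in for vmcall */
			     : "=a" (seen)
			     : "r" (hcall_id)
			     : "r8", "memory");
	return seen;
#else
	return hcall_id;	/* fallback so the sketch builds elsewhere */
#endif
}
```

The input uses a plain "r" constraint rather than "ir", which avoids
having to pick an instruction form that accepts every possible immediate.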

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:59               ` Arvind Sankar
@ 2020-09-30 20:01                 ` Arvind Sankar
  2020-10-01  0:08                 ` Segher Boessenkool
  1 sibling, 0 replies; 58+ messages in thread
From: Arvind Sankar @ 2020-09-30 20:01 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Nick Desaulniers, Peter Zijlstra, Segher Boessenkool,
	Dave Hansen, Greg Kroah-Hartman, shuo.a.liu, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 03:59:15PM -0400, Arvind Sankar wrote:
> On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> > On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> > >
> > > > Since this variable is a local register asm, on entry to the asm the
> > > > compiler guarantees that the value lives in the assigned register (the
> > > > "r8" hardware register in this case).  This all works completely fine.
> > > > This is the only guaranteed behaviour for local register asm (well,
> > > > together with analogous behaviour for outputs).
> 
> How strict is the guarantee? This is an inline function -- could the
> compiler decide to reorder some other code in between the r8 assignment
> and the asm statement when it gets inlined?
> 
> > >
> > > Right, that's what they're trying to achieve. The hypervisor calling
> > > convention needs that variable in %r8 (which is somewhat unfortunate).
> > >
> > > AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> > > (our oldest supported version) claims to support this.
> > >
> > > So now we need to know if clang will actually do this too..
> > 
> > Does clang support register local storage? Let's use godbolt.org to find out:
> > https://godbolt.org/z/YM45W5
> > Looks like yes. You can even check different GCC versions via the
> > dropdown in the top right.
> > 
> > The -ffixed-* flags are less well supported in Clang; they need to be
> > reimplemented on a per-backend basis. aarch64 is relatively well
> > supported, but other arches not so much IME.
> > 
> > Do we need register local storage here?
> > 
> > static inline long bar(unsigned long hcall_id)
> > {
> >   long result;
> >   asm volatile("movl %1, %%r8d\n\t"
> >   "vmcall\n\t"
> >     : "=a" (result)
> >     : "ir" (hcall_id)
> >     : );
> >   return result;
> > }
> 
> This seems more robust, though you probably need an r8 clobber in there?
> Is hcall_id actually just 32 bits or can it be >=2^32?

Also, I think you need memory clobbers for all of these in either case, no?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:14             ` Nick Desaulniers
  2020-09-30 19:42               ` Peter Zijlstra
  2020-09-30 19:59               ` Arvind Sankar
@ 2020-09-30 23:25               ` Segher Boessenkool
  2020-09-30 23:38                 ` Arvind Sankar
  2020-10-12  8:44               ` Shuo A Liu
  3 siblings, 1 reply; 58+ messages in thread
From: Segher Boessenkool @ 2020-09-30 23:25 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Dave Hansen, Greg Kroah-Hartman, shuo.a.liu,
	LKML, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> Do we need register local storage here?

Depends what you want.  It looks like you do:

> static inline long bar(unsigned long hcall_id)
> {
>   long result;
>   asm volatile("movl %1, %%r8d\n\t"
>   "vmcall\n\t"
>     : "=a" (result)
>     : "ir" (hcall_id)
>     : );
>   return result;
> }

"result" as output from the asm is in %rax, and the compiler will
shuffle that to wherever it needs it as the function return value.  That
part will work fine.

But how you are accessing %r8d is not correct, that needs to be a local
register asm (or r8 be made a fixed reg, probably not what you want ;-) )


Segher

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 23:25               ` Segher Boessenkool
@ 2020-09-30 23:38                 ` Arvind Sankar
  2020-10-01  0:11                   ` Segher Boessenkool
  0 siblings, 1 reply; 58+ messages in thread
From: Arvind Sankar @ 2020-09-30 23:38 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Nick Desaulniers, Peter Zijlstra, Dave Hansen,
	Greg Kroah-Hartman, shuo.a.liu, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 06:25:59PM -0500, Segher Boessenkool wrote:
> On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> > Do we need register local storage here?
> 
> Depends what you want.  It looks like you do:
> 
> > static inline long bar(unsigned long hcall_id)
> > {
> >   long result;
> >   asm volatile("movl %1, %%r8d\n\t"
> >   "vmcall\n\t"
> >     : "=a" (result)
> >     : "ir" (hcall_id)
> >     : );
> >   return result;
> > }
> 
> "result" as output from the asm is in %rax, and the compiler will
> shuffle that to wherever it needs it as the function return value.  That
> part will work fine.
> 
> But how you are accessing %r8d is not correct, that needs to be a local
> register asm (or r8 be made a fixed reg, probably not what you want ;-) )
> 

Doesn't it just need an "r8" clobber to allow using r8d?

> 
> Segher


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:42               ` Peter Zijlstra
@ 2020-09-30 23:58                 ` Segher Boessenkool
  0 siblings, 0 replies; 58+ messages in thread
From: Segher Boessenkool @ 2020-09-30 23:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Dave Hansen, Greg Kroah-Hartman, shuo.a.liu,
	LKML, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 09:42:40PM +0200, Peter Zijlstra wrote:
> > Looks like yes. You can even check different GCC versions via the
> > dropdown in the top right.
> 
> That only tells me it compiles it, not if that (IMO) weird construct is
> actually guaranteed to work as expected.
> 
> I'd almost dive into the GCC archives to read the back-story to this
> 'feature', it just seems too weird to me. Ah well, that's for another day.

It was documented in 1996 (<https://gcc.gnu.org/g:c1f7febfcb10>), the
feature is older than that though.

In 2004 (in <https://gcc.gnu.org/g:805c33df1366>) the documentation for
this was made more explicit (it has been rewritten since, but it still
says the same thing).


Segher


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:59               ` Arvind Sankar
  2020-09-30 20:01                 ` Arvind Sankar
@ 2020-10-01  0:08                 ` Segher Boessenkool
  1 sibling, 0 replies; 58+ messages in thread
From: Segher Boessenkool @ 2020-10-01  0:08 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Nick Desaulniers, Peter Zijlstra, Dave Hansen,
	Greg Kroah-Hartman, shuo.a.liu, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 03:59:15PM -0400, Arvind Sankar wrote:
> On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> > On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> > >
> > > > Since this variable is a local register asm, on entry to the asm the
> > > > compiler guarantees that the value lives in the assigned register (the
> > > > "r8" hardware register in this case).  This all works completely fine.
> > > > This is the only guaranteed behaviour for local register asm (well,
> > > > together with analogous behaviour for outputs).
> 
> How strict is the guarantee? This is an inline function -- could the
> compiler decide to reorder some other code in between the r8 assignment
> and the asm statement when it gets inlined?

Nope.  It will be in r8 on entry to the asm.  A guarantee is a
guarantee; it is not a "yeah maybe, we'll see".
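
For readers following along, here is a minimal sketch of the local
register variable form being discussed. It is an illustration under
assumptions, not the patch's code: the function name and the HCALL_INSN
macro are hypothetical, and HCALL_INSN stands in for "vmcall" (which
cannot run in an ordinary user process) by copying r8 to rax so the
pinned value can be observed.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for "vmcall": copy r8 into rax so the value the
 * compiler placed in r8 becomes visible as the asm output.
 */
#define HCALL_INSN "movq %%r8, %%rax"

static inline long hcall_via_local_reg(unsigned long hcall_id)
{
#if defined(__x86_64__)
	long result;
	/* Local register variable: holds hcall_id in r8 on entry to the asm. */
	register unsigned long id __asm__("r8") = hcall_id;

	__asm__ volatile(HCALL_INSN
			 : "=a" (result)	/* return value comes back in rax */
			 : "r" (id)		/* input operand pinned to r8 */
			 : "memory");
	return result;
#else
	/* Fallback so the sketch still builds on other architectures. */
	return (long)hcall_id;
#endif
}
```

The template never names the operand; it relies only on the guarantee
that the variable is in r8 when the asm statement is entered.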

> > Do we need register local storage here?
> > 
> > static inline long bar(unsigned long hcall_id)
> > {
> >   long result;
> >   asm volatile("movl %1, %%r8d\n\t"
> >   "vmcall\n\t"
> >     : "=a" (result)
> >     : "ir" (hcall_id)
> >     : );
> >   return result;
> > }
> 
> This seems more robust, though you probably need an r8 clobber in there?

Oh, x86 has the operand order inverted, so this should work in fact.


Segher


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 23:38                 ` Arvind Sankar
@ 2020-10-01  0:11                   ` Segher Boessenkool
  0 siblings, 0 replies; 58+ messages in thread
From: Segher Boessenkool @ 2020-10-01  0:11 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Nick Desaulniers, Peter Zijlstra, Dave Hansen,
	Greg Kroah-Hartman, shuo.a.liu, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed, Sep 30, 2020 at 07:38:15PM -0400, Arvind Sankar wrote:
> On Wed, Sep 30, 2020 at 06:25:59PM -0500, Segher Boessenkool wrote:
> > On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> > > Do we need register local storage here?
> > 
> > Depends what you want.  It looks like you do:
> > 
> > > static inline long bar(unsigned long hcall_id)
> > > {
> > >   long result;
> > >   asm volatile("movl %1, %%r8d\n\t"
> > >   "vmcall\n\t"
> > >     : "=a" (result)
> > >     : "ir" (hcall_id)
> > >     : );
> > >   return result;
> > > }
> > 
> > "result" as output from the asm is in %rax, and the compiler will
> > shuffle that to wherever it needs it as the function return value.  That
> > part will work fine.
> > 
> > But how you are accessing %r8d is not correct, that needs to be a local
> > register asm (or r8 be made a fixed reg, probably not what you want ;-) )
> 
> Doesn't it just need an "r8" clobber to allow using r8d?

Yes, x86 asm is hard to read, what can I say :-)  Sorry about that.


Segher


* Re: [PATCH v4 01/17] docs: acrn: Introduce ACRN
  2020-09-22 11:42 ` [PATCH v4 01/17] docs: acrn: Introduce ACRN shuo.a.liu
@ 2020-10-09  1:48   ` Randy Dunlap
  2020-10-12  8:50     ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Randy Dunlap @ 2020-10-09  1:48 UTC (permalink / raw)
  To: shuo.a.liu, linux-kernel, x86
  Cc: Greg Kroah-Hartman, H . Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Yu Wang,
	Reinette Chatre, Zhi Wang, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhenyu Wang

On 9/22/20 4:42 AM, shuo.a.liu@intel.com wrote:
> From: Shuo Liu <shuo.a.liu@intel.com>
> 
> Add documentation on the following aspects of ACRN:
> 
>   1) A brief introduction on the architecture of ACRN.
>   2) I/O request handling in ACRN.
> 
> To learn more about ACRN, please go to the ACRN project website
> https://projectacrn.org, or the documentation page
> https://projectacrn.github.io/.
> 
> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Fengwei Yin <fengwei.yin@intel.com>
> Cc: Zhi Wang <zhi.a.wang@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Yu Wang <yu1.wang@intel.com>
> Cc: Reinette Chatre <reinette.chatre@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  Documentation/virt/acrn/index.rst        | 11 +++
>  Documentation/virt/acrn/introduction.rst | 40 ++++++++++
>  Documentation/virt/acrn/io-request.rst   | 97 ++++++++++++++++++++++++
>  Documentation/virt/index.rst             |  1 +
>  MAINTAINERS                              |  7 ++
>  5 files changed, 156 insertions(+)
>  create mode 100644 Documentation/virt/acrn/index.rst
>  create mode 100644 Documentation/virt/acrn/introduction.rst
>  create mode 100644 Documentation/virt/acrn/io-request.rst
> 

> diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst
> new file mode 100644
> index 000000000000..019dc5978f7c
> --- /dev/null
> +++ b/Documentation/virt/acrn/io-request.rst
> @@ -0,0 +1,97 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +I/O request handling
> +====================
> +
> +An I/O request of a User VM, which is constructed by the hypervisor, is
> +distributed by the ACRN Hypervisor Service Module to an I/O client
> +corresponding to the address range of the I/O request. Details of I/O request
> +handling are described in the following sections.
> +
> +1. I/O request
> +--------------
> +

...

> +
> +2. I/O clients
> +--------------
> +

...

> +
> +3. I/O request state transition
> +-------------------------------
> +
> +The state transitions of a ACRN I/O request are as follows.

                         of an ACRN

> +
> +::
> +
> +   FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...
> +
> +- FREE: this I/O request slot is empty
> +- PENDING: a valid I/O request is pending in this slot
> +- PROCESSING: the I/O request is being processed
> +- COMPLETE: the I/O request has been processed
> +
> +An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and
> +ACRN userspace are in charge of processing the others.
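
The state cycle and ownership rule quoted above can be sketched in C as
follows. This is only an illustration: the enum and function names are
hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the four states in the quoted document. */
enum ioreq_state {
	IOREQ_FREE,
	IOREQ_PENDING,
	IOREQ_PROCESSING,
	IOREQ_COMPLETE,
};

/* FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... */
static enum ioreq_state ioreq_next_state(enum ioreq_state s)
{
	switch (s) {
	case IOREQ_FREE:
		return IOREQ_PENDING;
	case IOREQ_PENDING:
		return IOREQ_PROCESSING;
	case IOREQ_PROCESSING:
		return IOREQ_COMPLETE;
	case IOREQ_COMPLETE:
		return IOREQ_FREE;
	}
	return IOREQ_FREE;
}

/* COMPLETE and FREE slots belong to the hypervisor; the rest to HSM/userspace. */
static bool ioreq_owned_by_hypervisor(enum ioreq_state s)
{
	return s == IOREQ_COMPLETE || s == IOREQ_FREE;
}
```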
> +
> +4. Processing flow of I/O requests
> +----------------------------------
> +

...



thanks.
-- 
~Randy



* Re: [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged
  2020-09-30  8:09   ` Borislav Petkov
@ 2020-10-12  8:40     ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-10-12  8:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yin Fengwei, Dave Hansen, Dan Williams,
	Zhi Wang, Zhenyu Wang

Hi Boris,

On Wed 30.Sep'20 at 10:09:59 +0200, Borislav Petkov wrote:
>On Tue, Sep 22, 2020 at 07:42:57PM +0800, shuo.a.liu@intel.com wrote:
>> +static u32 acrn_cpuid_base(void)
>> +{
>> +	static u32 acrn_cpuid_base;
>> +
>> +	if (!acrn_cpuid_base && boot_cpu_has(X86_FEATURE_HYPERVISOR))
>> +		acrn_cpuid_base = hypervisor_cpuid_base("ACRNACRNACRN", 0);
>> +
>> +	return acrn_cpuid_base;
>> +}
>> +
>> +bool acrn_is_privileged_vm(void)
>> +{
>> +	return cpuid_eax(acrn_cpuid_base() | ACRN_CPUID_FEATURES) &
>
>What's that dance and acrn_cpuid_base static thing needed for? Why not
>simply:
>
>	cpuid_eax(ACRN_CPUID_FEATURES) & ...
>
>?

hypervisor_cpuid_base() searches the reserved hypervisor CPUID region and
returns the base leaf that matches the given signature. The base might
vary, so I look it up here.
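
To illustrate why the base can vary, here is a rough user-space sketch
of a signature scan over the hypervisor CPUID region. It is not the
kernel helper itself: the function names, the struct, and the injected
reader are hypothetical, and the scanned range (0x40000000 up to
0x40010000 in steps of 0x100) reflects my reading of the kernel helper
and should be treated as an assumption.

```c
#include <stdint.h>
#include <string.h>

struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical hook: returns the four CPUID registers for a leaf. */
typedef struct cpuid_regs (*cpuid_reader)(uint32_t leaf);

/* Scan candidate base leaves for a 12-byte signature in ebx:ecx:edx. */
static uint32_t find_hv_cpuid_base(cpuid_reader read_leaf, const char *sig)
{
	uint32_t base;

	for (base = 0x40000000u; base < 0x40010000u; base += 0x100u) {
		struct cpuid_regs r = read_leaf(base);
		uint32_t words[3] = { r.ebx, r.ecx, r.edx };

		if (memcmp(sig, words, 12) == 0)
			return base;
	}
	return 0;	/* signature not found */
}

/* Fake reader for the demo: the ACRN signature sits at 0x40000100. */
static struct cpuid_regs fake_read_leaf(uint32_t leaf)
{
	struct cpuid_regs r = { 0, 0, 0, 0 };

	if (leaf == 0x40000100u) {
		memcpy(&r.ebx, "ACRN", 4);
		memcpy(&r.ecx, "ACRN", 4);
		memcpy(&r.edx, "ACRN", 4);
	}
	return r;
}
```

With the fake reader, the scan returns 0x40000100 rather than
0x40000000, which is the situation the cached base lookup guards
against.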

>
>> +			 ACRN_FEATURE_PRIVILEGED_VM;
>> +}
>> +EXPORT_SYMBOL_GPL(acrn_is_privileged_vm);
>
>Also, if you're going to need more of those bit checkers acrn_is_<something>
>which look at ACRN_CPUID_FEATURES, just stash CPUID_0x40000001_EAX locally and
>use a
>
>	acrn_has(ACRN_FEATURE_PRIVILEGED_VM)
>
>which does the bit testing.

Thanks. Currently, there is only one feature bit. I will introduce the
helper you suggested once there are more feature bits to test.
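
A minimal sketch of what such a bit-testing helper could look like,
assuming the feature word is read once and cached. All names here are
hypothetical, the bit position is an assumption, and the CPUID read is
replaced by a stub so the sketch runs outside a guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define SKETCH_FEATURE_PRIVILEGED_VM (1u << 0)

static int feature_reads;	/* counts stub reads, demo only */

/* Stub standing in for reading EAX of the ACRN features CPUID leaf. */
static uint32_t read_feature_leaf_eax(void)
{
	feature_reads++;
	return SKETCH_FEATURE_PRIVILEGED_VM;
}

/* Test a feature bit against a feature word that is read only once. */
static bool sketch_acrn_has(uint32_t feature)
{
	static uint32_t cached_eax;
	static bool cached;

	if (!cached) {
		cached_eax = read_feature_leaf_eax();
		cached = true;
	}
	return (cached_eax & feature) != 0;
}
```

Each additional acrn_is_<something> style check then becomes a
one-line call with a different bit.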

Thanks
shuo


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 19:14             ` Nick Desaulniers
                                 ` (2 preceding siblings ...)
  2020-09-30 23:25               ` Segher Boessenkool
@ 2020-10-12  8:44               ` Shuo A Liu
  2020-10-12 16:49                 ` Arvind Sankar
  3 siblings, 1 reply; 58+ messages in thread
From: Shuo A Liu @ 2020-10-12  8:44 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Segher Boessenkool, Dave Hansen,
	Greg Kroah-Hartman, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed 30.Sep'20 at 12:14:03 -0700, Nick Desaulniers wrote:
>On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
>>
>> > Since this variable is a local register asm, on entry to the asm the
>> > compiler guarantees that the value lives in the assigned register (the
>> > "r8" hardware register in this case).  This all works completely fine.
>> > This is the only guaranteed behaviour for local register asm (well,
>> > together with analogous behaviour for outputs).
>>
>> Right, that's what they're trying to achieve. The hypervisor calling
>> convention needs that variable in %r8 (which is somewhat unfortunate).
>>
>> AFAIK this is the first such use in the kernel, but at least the gcc-4.9
>> (our oldest supported version) claims to support this.
>>
>> So now we need to know if clang will actually do this too..
>
>Does clang support register local storage? Let's use godbolt.org to find out:
>https://godbolt.org/z/YM45W5
>Looks like yes. You can even check different GCC versions via the
>dropdown in the top right.
>
>The -ffixed-* flags are less well supported in Clang; they need to be
>reimplemented on a per-backend basis. aarch64 is relatively well
>supported, but other arches not so much IME.
>
>Do we need register local storage here?
>
>static inline long bar(unsigned long hcall_id)
>{
>  long result;
>  asm volatile("movl %1, %%r8d\n\t"
>  "vmcall\n\t"
>    : "=a" (result)
>    : "ir" (hcall_id)
>    : );
>  return result;
>}

Yeah, this approach is also mentioned in the changelog. I will switch to
it, following your preference, with the additional "r8" clobber that
Arvind mentioned.
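
For reference, a minimal sketch of the explicit-MOV form with the "r8"
clobber. It rests on assumptions: the function name and HCALL_INSN are
hypothetical, HCALL_INSN stands in for "vmcall" so the code can run in
user space, and the ID is typed as 32 bits so that "movl" assembles for
both register and immediate operands (the thread leaves open whether
IDs can exceed 32 bits).

```c
#include <stdint.h>

/* Hypothetical stand-in for "vmcall": expose r8 through rax. */
#define HCALL_INSN "movq %%r8, %%rax"

static inline long hcall_via_explicit_mov(uint32_t hcall_id)
{
#if defined(__x86_64__)
	long result;

	__asm__ volatile("movl %1, %%r8d\n\t"	/* load the ID into r8 by hand */
			 HCALL_INSN "\n\t"
			 : "=a" (result)
			 : "ir" (hcall_id)
			 : "r8");	/* tell the compiler r8 is overwritten */
	return result;
#else
	/* Fallback so the sketch still builds on other architectures. */
	return (long)hcall_id;
#endif
}
```

Without the "r8" clobber the compiler would be free to keep a live
value in r8 across the asm statement, and the MOV would silently
destroy it.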

Thanks
shuo


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-09-30 10:54   ` Borislav Petkov
@ 2020-10-12  8:49     ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-10-12  8:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Sean Christopherson, Yu Wang,
	Reinette Chatre, Yakui Zhao, Dave Hansen, Dan Williams,
	Fengwei Yin, Zhi Wang, Zhenyu Wang

On Wed 30.Sep'20 at 12:54:08 +0200, Borislav Petkov wrote:
>On Tue, Sep 22, 2020 at 07:42:58PM +0800, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> The Service VM communicates with the hypervisor via conventional
>> hypercalls. VMCALL instruction is used to make the hypercalls.
>>
>> ACRN hypercall ABI:
>>   * Hypercall number is in R8 register.
>>   * Up to 2 parameters are in RDI and RSI registers.
>>   * Return value is in RAX register.
>
>I'm assuming this is already cast in stone in the HV and it cannot be
>changed?

Yes, it is.

>
>> Introduce the ACRN hypercall interfaces. Because GCC doesn't support the R8
>> register as a direct register constraint, there are two ways to use R8 in
>> extended asm:
>>   1) use an explicit register variable as input
>>   2) use a supported constraint as input with an explicit MOV to R8 at the
>>      beginning of the asm
>>
>> The number of instructions is the same for both ways.
>> Asm code from 1)
>>   38:   41 b8 00 00 00 80       mov    $0x80000000,%r8d
>>   3e:   48 89 c7                mov    %rax,%rdi
>>   41:   0f 01 c1                vmcall
>> Here, writes to the lower dword (%r8d) clear the upper dword of %r8 when
>> the CPU is in 64-bit mode.
>>
>> Asm code from 2)
>>   38:   48 89 c7                mov    %rax,%rdi
>>   3b:   49 b8 00 00 00 80 00    movabs $0x80000000,%r8
>>   42:   00 00 00
>>   45:   0f 01 c1                vmcall
>>
>> Choose 1) for code simplicity and a little bit of code size
>> optimization.
>
>What?
>
>How much "optimization" is this actually? A couple of bytes?
>
>And all that for this
>
>	/* Nothing can come between the r8 assignment and the asm: */
>
>restriction?
>
>If it is only a couple of bytes, just do the explicit MOV to %r8 and
>f'get about it.

Yes, just a couple of bytes. The number of instructions is the same.
Sure, I can change to approach 2):
  2) use a supported constraint as input with an explicit MOV to R8
     at the beginning of the asm
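
The changelog's remark that a write to %r8d clears the upper dword of
%r8 is what lets a 32-bit MOV carry the ID instead of a 10-byte movabs.
A small sketch that makes the zero-extension observable; the function
name is hypothetical and nothing here issues a hypercall.

```c
#include <stdint.h>

static inline uint64_t write_low_dword_of_r8(uint32_t low)
{
#if defined(__x86_64__)
	uint64_t full;

	__asm__ volatile("movq $-1, %%r8\n\t"	/* r8 = all ones */
			 "movl %1, %%r8d\n\t"	/* 32-bit write zero-extends */
			 "movq %%r8, %0"	/* read back the full register */
			 : "=r" (full)
			 : "r" (low)
			 : "r8");
	return full;
#else
	/* Fallback so the sketch still builds on other architectures. */
	return (uint64_t)low;
#endif
}
```

The upper 32 bits come back as zero even though r8 was all ones before
the 32-bit write.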

Thanks
shuo


* Re: [PATCH v4 01/17] docs: acrn: Introduce ACRN
  2020-10-09  1:48   ` Randy Dunlap
@ 2020-10-12  8:50     ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-10-12  8:50 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-kernel, x86, Greg Kroah-Hartman, H . Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Zhi Wang,
	Dave Hansen, Dan Williams, Fengwei Yin, Zhenyu Wang

On Thu  8.Oct'20 at 18:48:52 -0700, Randy Dunlap wrote:
>On 9/22/20 4:42 AM, shuo.a.liu@intel.com wrote:
>> From: Shuo Liu <shuo.a.liu@intel.com>
>>
>> Add documentation on the following aspects of ACRN:
>>
>>   1) A brief introduction on the architecture of ACRN.
>>   2) I/O request handling in ACRN.
>>
>> To learn more about ACRN, please go to the ACRN project website
>> https://projectacrn.org, or the documentation page
>> https://projectacrn.github.io/.
>>
>> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
>> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Fengwei Yin <fengwei.yin@intel.com>
>> Cc: Zhi Wang <zhi.a.wang@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: Yu Wang <yu1.wang@intel.com>
>> Cc: Reinette Chatre <reinette.chatre@intel.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>>  Documentation/virt/acrn/index.rst        | 11 +++
>>  Documentation/virt/acrn/introduction.rst | 40 ++++++++++
>>  Documentation/virt/acrn/io-request.rst   | 97 ++++++++++++++++++++++++
>>  Documentation/virt/index.rst             |  1 +
>>  MAINTAINERS                              |  7 ++
>>  5 files changed, 156 insertions(+)
>>  create mode 100644 Documentation/virt/acrn/index.rst
>>  create mode 100644 Documentation/virt/acrn/introduction.rst
>>  create mode 100644 Documentation/virt/acrn/io-request.rst
>>
>
>> diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst
>> new file mode 100644
>> index 000000000000..019dc5978f7c
>> --- /dev/null
>> +++ b/Documentation/virt/acrn/io-request.rst
>> @@ -0,0 +1,97 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +I/O request handling
>> +====================
>> +
>> +An I/O request of a User VM, which is constructed by the hypervisor, is
>> +distributed by the ACRN Hypervisor Service Module to an I/O client
>> +corresponding to the address range of the I/O request. Details of I/O request
>> +handling are described in the following sections.
>> +
>> +1. I/O request
>> +--------------
>> +
>
>...
>
>> +
>> +2. I/O clients
>> +--------------
>> +
>
>...
>
>> +
>> +3. I/O request state transition
>> +-------------------------------
>> +
>> +The state transitions of a ACRN I/O request are as follows.
>
>                         of an ACRN

OK. Thanks for the review.

Thanks
shuo


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-10-12  8:44               ` Shuo A Liu
@ 2020-10-12 16:49                 ` Arvind Sankar
  2020-10-13  2:44                   ` Shuo A Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Arvind Sankar @ 2020-10-12 16:49 UTC (permalink / raw)
  To: Shuo A Liu
  Cc: Nick Desaulniers, Peter Zijlstra, Segher Boessenkool,
	Dave Hansen, Greg Kroah-Hartman, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Mon, Oct 12, 2020 at 04:44:31PM +0800, Shuo A Liu wrote:
> On Wed 30.Sep'20 at 12:14:03 -0700, Nick Desaulniers wrote:
> >On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> >>
> >> > Since this variable is a local register asm, on entry to the asm the
> >> > compiler guarantees that the value lives in the assigned register (the
> >> > "r8" hardware register in this case).  This all works completely fine.
> >> > This is the only guaranteed behaviour for local register asm (well,
> >> > together with analogous behaviour for outputs).
> >>
> >> Right, that's what they're trying to achieve. The hypervisor calling
> >> convention needs that variable in %r8 (which is somewhat unfortunate).
> >>
> >> AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> >> (our oldest supported version) claims to support this.
> >>
> >> So now we need to know if clang will actually do this too..
> >
> >Does clang support register local storage? Let's use godbolt.org to find out:
> >https://godbolt.org/z/YM45W5
> >Looks like yes. You can even check different GCC versions via the
> >dropdown in the top right.
> >
> >The -ffixed-* flags are less well supported in Clang; they need to be
> >reimplemented on a per-backend basis. aarch64 is relatively well
> >supported, but other arches not so much IME.
> >
> >Do we need register local storage here?
> >
> >static inline long bar(unsigned long hcall_id)
> >{
> >  long result;
> >  asm volatile("movl %1, %%r8d\n\t"
> >  "vmcall\n\t"
> >    : "=a" (result)
> >    : "ir" (hcall_id)
> >    : );
> >  return result;
> >}
> 
> Yeah, this approach is also mentioned in the changelog. I will switch to
> it, following your preference, with the additional "r8" clobber that
> Arvind mentioned.
> 
> Thanks
> shuo

Btw, I noticed that arch/x86/xen/hypercall.h uses register-local
variables already for its hypercalls for quite some time, so this
wouldn't be unprecedented. [0]

Do these calls also need a memory clobber? The KVM/xen hypercall functions
all have one.

Thanks.

[0] e74359028d548 ("xen64: fix calls into hypercall page")


* Re: [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces
  2020-10-12 16:49                 ` Arvind Sankar
@ 2020-10-13  2:44                   ` Shuo A Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Shuo A Liu @ 2020-10-13  2:44 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Nick Desaulniers, Peter Zijlstra, Segher Boessenkool,
	Dave Hansen, Greg Kroah-Hartman, LKML,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H . Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sean Christopherson, Yu Wang, Reinette Chatre, Yakui Zhao,
	Dan Williams, Fengwei Yin, Zhi Wang, Zhenyu Wang

On Mon 12.Oct'20 at 12:49:16 -0400, Arvind Sankar wrote:
>On Mon, Oct 12, 2020 at 04:44:31PM +0800, Shuo A Liu wrote:
>> On Wed 30.Sep'20 at 12:14:03 -0700, Nick Desaulniers wrote:
>> >On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>> >>
>> >> On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
>> >>
>> >> > Since this variable is a local register asm, on entry to the asm the
>> >> > compiler guarantees that the value lives in the assigned register (the
>> >> > "r8" hardware register in this case).  This all works completely fine.
>> >> > This is the only guaranteed behaviour for local register asm (well,
>> >> > together with analogous behaviour for outputs).
>> >>
>> >> Right, that's what they're trying to achieve. The hypervisor calling
>> >> convention needs that variable in %r8 (which is somewhat unfortunate).
>> >>
>> >> AFAIK this is the first such use in the kernel, but at least the gcc-4.9
>> >> (our oldest supported version) claims to support this.
>> >>
>> >> So now we need to know if clang will actually do this too..
>> >
>> >Does clang support register local storage? Let's use godbolt.org to find out:
>> >https://godbolt.org/z/YM45W5
>> >Looks like yes. You can even check different GCC versions via the
>> >dropdown in the top right.
>> >
>> >The -ffixed-* flags are less well supported in Clang; they need to be
>> >reimplemented on a per-backend basis. aarch64 is relatively well
>> >supported, but other arches not so much IME.
>> >
>> >Do we need register local storage here?
>> >
>> >static inline long bar(unsigned long hcall_id)
>> >{
>> >  long result;
>> >  asm volatile("movl %1, %%r8d\n\t"
>> >  "vmcall\n\t"
>> >    : "=a" (result)
>> >    : "ir" (hcall_id)
>> >    : );
>> >  return result;
>> >}
>>
>> Yeah, this approach is also mentioned in the changelog. I will switch to
>> it, following your preference, with the additional "r8" clobber that
>> Arvind mentioned.
>>
>> Thanks
>> shuo
>
>Btw, I noticed that arch/x86/xen/hypercall.h uses register-local
>variables already for its hypercalls for quite some time, so this
>wouldn't be unprecedented. [0]
>
>Do these calls also need a memory clobber? The KVM/xen hypercall functions
>all have one.

Yes, it's needed. I will add it. Thanks.
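
A minimal sketch of why the "memory" clobber matters when a hypercall
parameter is a pointer. The struct and function names are hypothetical,
and the asm body is a stand-in for "vmcall": it loads from the address
in RDI the way a hypervisor might read a parameter block that is not
listed as an asm operand.

```c
#include <stdint.h>

/* Hypothetical parameter block passed by address in RDI. */
struct hcall_param {
	uint64_t value;
};

static inline long hcall_reading_memory(struct hcall_param *param)
{
#if defined(__x86_64__)
	long result;

	/* Stand-in for "vmcall": read the first 8 bytes behind RDI. */
	__asm__ volatile("movq (%%rdi), %%rax"
			 : "=a" (result)
			 : "D" (param)
			 : "memory");	/* pending stores must be written first */
	return result;
#else
	/* Fallback so the sketch still builds on other architectures. */
	return (long)param->value;
#endif
}

static inline long fill_and_call(uint64_t v)
{
	struct hcall_param p;

	p.value = v;	/* this store must reach memory before the asm runs */
	return hcall_reading_memory(&p);
}
```

The pointer alone tells the compiler nothing about the pointed-to
bytes being read; the clobber is what forces the store to p.value to
be emitted before the asm statement.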

>
>Thanks.
>
>[0] e74359028d548 ("xen64: fix calls into hypercall page")


Thread overview: 58+ messages
2020-09-22 11:42 [PATCH v4 00/17] HSM driver for ACRN hypervisor shuo.a.liu
2020-09-22 11:42 ` [PATCH v4 01/17] docs: acrn: Introduce ACRN shuo.a.liu
2020-10-09  1:48   ` Randy Dunlap
2020-10-12  8:50     ` Shuo A Liu
2020-09-22 11:42 ` [PATCH v4 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler() shuo.a.liu
2020-09-27 10:49   ` Greg Kroah-Hartman
2020-09-28  3:28     ` Shuo A Liu
2020-09-29 18:01   ` Borislav Petkov
2020-09-29 20:07     ` Thomas Gleixner
2020-09-29 20:26       ` Borislav Petkov
2020-09-30  3:02         ` Shuo A Liu
2020-09-22 11:42 ` [PATCH v4 03/17] x86/acrn: Introduce an API to check if a VM is privileged shuo.a.liu
2020-09-30  8:09   ` Borislav Petkov
2020-10-12  8:40     ` Shuo A Liu
2020-09-22 11:42 ` [PATCH v4 04/17] x86/acrn: Introduce hypercall interfaces shuo.a.liu
2020-09-27 10:51   ` Greg Kroah-Hartman
2020-09-27 10:53     ` Greg Kroah-Hartman
2020-09-28  3:38       ` Shuo A Liu
2020-09-27 15:38     ` Dave Hansen
2020-09-30 11:16       ` Peter Zijlstra
2020-09-30 16:10         ` Segher Boessenkool
2020-09-30 17:13           ` Peter Zijlstra
2020-09-30 19:14             ` Nick Desaulniers
2020-09-30 19:42               ` Peter Zijlstra
2020-09-30 23:58                 ` Segher Boessenkool
2020-09-30 19:59               ` Arvind Sankar
2020-09-30 20:01                 ` Arvind Sankar
2020-10-01  0:08                 ` Segher Boessenkool
2020-09-30 23:25               ` Segher Boessenkool
2020-09-30 23:38                 ` Arvind Sankar
2020-10-01  0:11                   ` Segher Boessenkool
2020-10-12  8:44               ` Shuo A Liu
2020-10-12 16:49                 ` Arvind Sankar
2020-10-13  2:44                   ` Shuo A Liu
2020-09-30 10:54   ` Borislav Petkov
2020-10-12  8:49     ` Shuo A Liu
2020-09-22 11:42 ` [PATCH v4 05/17] virt: acrn: Introduce ACRN HSM basic driver shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 08/17] virt: acrn: Introduce EPT mapping management shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 10/17] virt: acrn: Introduce PCI configuration space PIO accesses combiner shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 11/17] virt: acrn: Introduce interfaces for PCI device passthrough shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 12/17] virt: acrn: Introduce interrupt injection interfaces shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 14/17] virt: acrn: Introduce I/O ranges operation interfaces shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 16/17] virt: acrn: Introduce irqfd shuo.a.liu
2020-09-22 11:43 ` [PATCH v4 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU shuo.a.liu
2020-09-27 10:44   ` Greg Kroah-Hartman
2020-09-28  4:10     ` Shuo A Liu
2020-09-28  5:23       ` Greg Kroah-Hartman
2020-09-28  6:33         ` Shuo A Liu
2020-09-27  0:24 ` [PATCH v4 00/17] HSM driver for ACRN hypervisor Liu, Shuo A
2020-09-27  5:42   ` Greg Kroah-Hartman
     [not found] ` <20200922114311.38804-7-shuo.a.liu@intel.com>
2020-09-27 10:45   ` [PATCH v4 06/17] virt: acrn: Introduce VM management interfaces Greg Kroah-Hartman
2020-09-28  3:43     ` Shuo A Liu
2020-09-27 10:47   ` Greg Kroah-Hartman
2020-09-28  3:50     ` Shuo A Liu
2020-09-28  5:25       ` Greg Kroah-Hartman
2020-09-28  6:29         ` Shuo A Liu
2020-09-28 12:26           ` Greg Kroah-Hartman
2020-09-30  2:49             ` Shuo A Liu
