* [PATCH 00/17] Add new board: Xen guest for ARM64
@ 2020-07-01 16:29 Anastasiia Lukianenko
  2020-07-01 16:29 ` [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies Anastasiia Lukianenko
                   ` (17 more replies)
  0 siblings, 18 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

This work introduces Xen [1] guest ARM64 board support in U-Boot with
para-virtualized (PV) [2] block and serial drivers: xenguest_arm64.

This board is to be run as a virtual Xen guest with U-Boot as its
primary bootloader. The rationale behind introducing this board is a
better and simpler decoupling of the guest from the initial
privileged domain (Domain-0) which starts the guest's virtual machine.
There are cross dependencies between the guest OS and Domain-0: for
example, Domain-0 needs the guest's kernel, and may need its device
tree, in order to boot it. These dependencies get in the way whenever
the kernel or the guest OS needs to be updated. Having a unified
bootloader in Domain-0 resolves this:
1. U-Boot boot scripts, which are stored on the guest's virtual disk,
are guest specific, so any change in the guest's configuration can be
handled by the guest itself.
2. The guest OS kernel can be updated whenever the OS needs it,
without any help from Domain-0.
3. Using the Device Tree Overlay mechanism, the base device tree
provided by Xen can be customized at the bootloader stage, inside the
guest itself.

Xen support for U-Boot was implemented by introducing a new Xen guest
ARM64 board and porting essential drivers from MiniOS [3] as well as
some of the work previously done by NXP [4]:
1. PV block device frontend driver with XenStore based device
enumeration, new UCLASS_PVBLOCK;
2. PV serial console device frontend driver;
3. Xen hypervisor support with a minimal set of the essential headers
adapted from the Linux kernel;
4. grant table support;
5. event channel support, using polling instead of IRQs;
6. xenbus support;
7. dynamic RAM size as defined in the device tree instead of
statically defined values;
8. position-independent pre-relocation code, as start addresses cannot
be defined statically at compile time: they are chosen by Xen at
run-time;
9. new defconfig introduced: xenguest_arm64_defconfig.
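
To illustrate item 7, here is a minimal, standalone sketch of how a
memory node's "reg" property can be decoded into base/size pairs. It is
not the code from this series (the board code uses the fdt/fdtdec
helpers); struct mem_region, read_be64() and parse_memory_reg() are
hypothetical names, and the sketch assumes #address-cells = 2 and
#size-cells = 2, i.e. 16 bytes per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; not U-Boot's struct mm_region. */
struct mem_region {
	uint64_t base;
	uint64_t size;
};

/* Read one big-endian 64-bit value (two 32-bit device tree cells). */
static uint64_t read_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Decode a "reg" property made of <base size> pairs, assuming
 * #address-cells = 2 and #size-cells = 2 (16 bytes per entry).
 * Returns the number of regions stored in out[]; a trailing
 * partial entry, if any, is ignored.
 */
static size_t parse_memory_reg(const uint8_t *reg, size_t len,
			       struct mem_region *out, size_t max)
{
	size_t n = 0;

	assert(reg != NULL && out != NULL);

	while (len >= 16 && n < max) {
		out[n].base = read_be64(reg);
		out[n].size = read_be64(reg + 8);
		reg += 16;
		len -= 16;
		n++;
	}
	return n;
}
```

With such a helper the RAM layout follows whatever Xen places in the
device tree for a given guest, so no base address or size has to be
fixed in the board configuration.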

Please note that the para-virtualized serial driver depends on Xen
functionality which only becomes available late during boot, so some
of the early printouts, including the U-Boot banner and memory size,
are not shown.

All the above was tested with the block driver related commands
(info/part/read/write); FAT and ext4 operations work properly, and the
Linux kernel can be started.

Thank you in advance,
Anastasiia Lukianenko,
Oleksandr Andrushchenko


[1] - https://xenproject.org/
[2] - https://wiki.xenproject.org/wiki/Paravirtualization_(PV)
[3] - https://wiki.xenproject.org/wiki/Mini-OS
[4] - https://source.codeaurora.org/external/imx/uboot-imx/tree/?h=imx_v2018.03_4.14.98_2.0.0_ga

Anastasiia Lukianenko (5):
  xen: pvblock: Add initial support for para-virtualized block driver
  xen: pvblock: Enumerate virtual block devices
  xen: pvblock: Read XenStore configuration and initialize
  xen: pvblock: Implement front-back protocol and do IO
  xen: pvblock: Print found devices indices

Andrii Anisov (2):
  board: Introduce xenguest_arm64 board
  lib: sscanf: add sscanf implementation

Oleksandr Andrushchenko (8):
  armv8: Fix SMCC and ARM_PSCI_FW dependencies
  xen: Add essential and required interface headers
  xen: Port Xen hypervizor related code from mini-os
  xen: Port Xen event channel driver from mini-os
  linux/compat.h: Add wait_event_timeout macro
  xen: Port Xen bus driver from mini-os
  xen: Port Xen grant table driver from mini-os
  board: xen: De-initialize before jumping to Linux

Peng Fan (2):
  Kconfig: Introduce CONFIG_XEN
  serial: serial_xen: Add Xen PV serial driver

 Kconfig                                   |   7 +
 arch/arm/Kconfig                          |  10 +-
 arch/arm/cpu/armv8/Kconfig                |   2 +
 arch/arm/cpu/armv8/Makefile               |   1 +
 arch/arm/cpu/armv8/xen/Makefile           |   6 +
 arch/arm/cpu/armv8/xen/hypercall.S        |  78 ++
 arch/arm/cpu/armv8/xen/lowlevel_init.S    |  34 +
 arch/arm/include/asm/xen.h                |   8 +
 arch/arm/include/asm/xen/hypercall.h      |  45 ++
 arch/arm/include/asm/xen/system.h         |  96 +++
 board/xen/xenguest_arm64/Kconfig          |  12 +
 board/xen/xenguest_arm64/Makefile         |   5 +
 board/xen/xenguest_arm64/xenguest_arm64.c | 203 +++++
 cmd/Kconfig                               |   7 +
 cmd/Makefile                              |   1 +
 cmd/pvblock.c                             |  31 +
 common/board_r.c                          |  25 +
 configs/xenguest_arm64_defconfig          |  60 ++
 disk/part.c                               |   4 +
 drivers/Kconfig                           |   2 +
 drivers/Makefile                          |   1 +
 drivers/block/blk-uclass.c                |   2 +
 drivers/serial/Kconfig                    |   7 +
 drivers/serial/Makefile                   |   1 +
 drivers/serial/serial_xen.c               | 175 +++++
 drivers/xen/Kconfig                       |  10 +
 drivers/xen/Makefile                      |  10 +
 drivers/xen/events.c                      | 181 +++++
 drivers/xen/gnttab.c                      | 258 +++++++
 drivers/xen/hypervisor.c                  | 289 +++++++
 drivers/xen/pvblock.c                     | 808 ++++++++++++++++++++
 drivers/xen/xenbus.c                      | 547 ++++++++++++++
 include/blk.h                             |   1 +
 include/configs/xenguest_arm64.h          |  53 ++
 include/dm/uclass-id.h                    |   1 +
 include/linux/compat.h                    |  45 ++
 include/pvblock.h                         |  12 +
 include/vsprintf.h                        |   8 +
 include/xen.h                             |  12 +
 include/xen/arm/interface.h               |  88 +++
 include/xen/events.h                      |  47 ++
 include/xen/gnttab.h                      |  25 +
 include/xen/hvm.h                         |  30 +
 include/xen/interface/event_channel.h     | 281 +++++++
 include/xen/interface/grant_table.h       | 582 ++++++++++++++
 include/xen/interface/hvm/hvm_op.h        |  69 ++
 include/xen/interface/hvm/params.h        | 127 ++++
 include/xen/interface/io/blkif.h          | 726 ++++++++++++++++++
 include/xen/interface/io/console.h        |  56 ++
 include/xen/interface/io/protocols.h      |  42 +
 include/xen/interface/io/ring.h           | 479 ++++++++++++
 include/xen/interface/io/xenbus.h         |  81 ++
 include/xen/interface/io/xs_wire.h        | 151 ++++
 include/xen/interface/memory.h            | 332 ++++++++
 include/xen/interface/sched.h             | 188 +++++
 include/xen/interface/xen.h               | 225 ++++++
 include/xen/xenbus.h                      |  86 +++
 lib/Kconfig                               |   4 +
 lib/Makefile                              |   1 +
 lib/sscanf.c                              | 883 ++++++++++++++++++++++
 60 files changed, 7560 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/cpu/armv8/xen/Makefile
 create mode 100644 arch/arm/cpu/armv8/xen/hypercall.S
 create mode 100644 arch/arm/cpu/armv8/xen/lowlevel_init.S
 create mode 100644 arch/arm/include/asm/xen.h
 create mode 100644 arch/arm/include/asm/xen/hypercall.h
 create mode 100644 arch/arm/include/asm/xen/system.h
 create mode 100644 board/xen/xenguest_arm64/Kconfig
 create mode 100644 board/xen/xenguest_arm64/Makefile
 create mode 100644 board/xen/xenguest_arm64/xenguest_arm64.c
 create mode 100644 cmd/pvblock.c
 create mode 100644 configs/xenguest_arm64_defconfig
 create mode 100644 drivers/serial/serial_xen.c
 create mode 100644 drivers/xen/Kconfig
 create mode 100644 drivers/xen/Makefile
 create mode 100644 drivers/xen/events.c
 create mode 100644 drivers/xen/gnttab.c
 create mode 100644 drivers/xen/hypervisor.c
 create mode 100644 drivers/xen/pvblock.c
 create mode 100644 drivers/xen/xenbus.c
 create mode 100644 include/configs/xenguest_arm64.h
 create mode 100644 include/pvblock.h
 create mode 100644 include/xen.h
 create mode 100644 include/xen/arm/interface.h
 create mode 100644 include/xen/events.h
 create mode 100644 include/xen/gnttab.h
 create mode 100644 include/xen/hvm.h
 create mode 100644 include/xen/interface/event_channel.h
 create mode 100644 include/xen/interface/grant_table.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h
 create mode 100644 include/xen/interface/io/blkif.h
 create mode 100644 include/xen/interface/io/console.h
 create mode 100644 include/xen/interface/io/protocols.h
 create mode 100644 include/xen/interface/io/ring.h
 create mode 100644 include/xen/interface/io/xenbus.h
 create mode 100644 include/xen/interface/io/xs_wire.h
 create mode 100644 include/xen/interface/memory.h
 create mode 100644 include/xen/interface/sched.h
 create mode 100644 include/xen/interface/xen.h
 create mode 100644 include/xen/xenbus.h
 create mode 100644 lib/sscanf.c

-- 
2.17.1


* [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  1:14   ` Peng Fan
  2020-07-01 16:29 ` [PATCH 02/17] Kconfig: Introduce CONFIG_XEN Anastasiia Lukianenko
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Currently ARM_SMCCC, when enabled, selects ARM_PSCI_FW, which is not
correct as there are cases where PSCI can function without firmware at
all. ARM_PSCI_FW itself is built on the driver model, so it cannot be
enabled if DM is off.
Fix this by making the PSCI reset functionality select ARM_PSCI_FW,
and only if DM is enabled.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Suggested-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
 arch/arm/Kconfig           | 1 -
 arch/arm/cpu/armv8/Kconfig | 2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 54d65f8488..e9ad716aaa 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -387,7 +387,6 @@ config SYS_ARCH_TIMER
 config ARM_SMCCC
 	bool "Support for ARM SMC Calling Convention (SMCCC)"
 	depends on CPU_V7A || ARM64
-	select ARM_PSCI_FW
 	help
 	  Say Y here if you want to enable ARM SMC Calling Convention.
 	  This should be enabled if U-Boot needs to communicate with system
diff --git a/arch/arm/cpu/armv8/Kconfig b/arch/arm/cpu/armv8/Kconfig
index 3655990772..c8727f4175 100644
--- a/arch/arm/cpu/armv8/Kconfig
+++ b/arch/arm/cpu/armv8/Kconfig
@@ -103,6 +103,8 @@ config PSCI_RESET
 	bool "Use PSCI for reset and shutdown"
 	default y
 	select ARM_SMCCC if OF_CONTROL
+	select ARM_PSCI_FW if DM
+
 	depends on !ARCH_EXYNOS7 && !ARCH_BCM283X && \
 		   !TARGET_LS2080A_SIMU && !TARGET_LS2080AQDS && \
 		   !TARGET_LS2080ARDB && !TARGET_LS2080A_EMU && \
-- 
2.17.1


* [PATCH 02/17] Kconfig: Introduce CONFIG_XEN
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
  2020-07-01 16:29 ` [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 03/17] board: Introduce xenguest_arm64 board Anastasiia Lukianenko
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Peng Fan <peng.fan@nxp.com>

Introduce CONFIG_XEN so that U-Boot can be used as a bootloader
for a virtual machine.

Without a bootloader we can successfully boot Android on Xen, but
we need a bootloader to support A/B, dm-verity, etc.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 Kconfig | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Kconfig b/Kconfig
index 8f3fba085a..67f773d3a6 100644
--- a/Kconfig
+++ b/Kconfig
@@ -69,6 +69,13 @@ config CC_COVERAGE
 	  Enabling this option will pass "--coverage" to gcc to compile
 	  and link code instrumented for coverage analysis.
 
+config XEN
+	bool "Select U-Boot be run as a bootloader for XEN Virtual Machine"
+	default n
+	help
+	  Enabling this option will make U-Boot be run as a bootloader
+	  for XEN Virtual Machine.
+
 config DISTRO_DEFAULTS
 	bool "Select defaults suitable for booting general purpose Linux distributions"
 	select AUTO_COMPLETE
-- 
2.17.1


* [PATCH 03/17] board: Introduce xenguest_arm64 board
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
  2020-07-01 16:29 ` [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies Anastasiia Lukianenko
  2020-07-01 16:29 ` [PATCH 02/17] Kconfig: Introduce CONFIG_XEN Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  1:28   ` Peng Fan
  2020-07-01 16:29 ` [PATCH 04/17] xen: Add essential and required interface headers Anastasiia Lukianenko
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Andrii Anisov <andrii_anisov@epam.com>

Introduce a minimal Xen guest board running as a virtual
machine under Xen Project's hypervisor [1], [2].

Part of the code is ported from Xen mini-os and also uses
work initially done by different authors from NXP: please see
relevant files for their copyrights.

[1] https://xenbits.xen.org
[2] https://wiki.xenproject.org/

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 arch/arm/Kconfig                          |   7 +
 arch/arm/cpu/armv8/Makefile               |   1 +
 arch/arm/cpu/armv8/xen/Makefile           |   6 +
 arch/arm/cpu/armv8/xen/hypercall.S        |  78 +++++++++++
 arch/arm/cpu/armv8/xen/lowlevel_init.S    |  34 +++++
 arch/arm/include/asm/xen.h                |   8 ++
 arch/arm/include/asm/xen/hypercall.h      |  45 +++++++
 board/xen/xenguest_arm64/Kconfig          |  12 ++
 board/xen/xenguest_arm64/Makefile         |   5 +
 board/xen/xenguest_arm64/xenguest_arm64.c | 153 ++++++++++++++++++++++
 configs/xenguest_arm64_defconfig          |  56 ++++++++
 include/configs/xenguest_arm64.h          |  45 +++++++
 12 files changed, 450 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/xen/Makefile
 create mode 100644 arch/arm/cpu/armv8/xen/hypercall.S
 create mode 100644 arch/arm/cpu/armv8/xen/lowlevel_init.S
 create mode 100644 arch/arm/include/asm/xen.h
 create mode 100644 arch/arm/include/asm/xen/hypercall.h
 create mode 100644 board/xen/xenguest_arm64/Kconfig
 create mode 100644 board/xen/xenguest_arm64/Makefile
 create mode 100644 board/xen/xenguest_arm64/xenguest_arm64.c
 create mode 100644 configs/xenguest_arm64_defconfig
 create mode 100644 include/configs/xenguest_arm64.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e9ad716aaa..c469863967 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1717,6 +1717,12 @@ config TARGET_PRESIDIO_ASIC
 	bool "Support Cortina Presidio ASIC Platform"
 	select ARM64
 
+config TARGET_XENGUEST_ARM64
+	bool "Xen guest ARM64"
+	select ARM64
+	select XEN
+	select OF_CONTROL
+	select LINUX_KERNEL_IMAGE_HEADER
 endchoice
 
 config ARCH_SUPPORT_TFABOOT
@@ -1920,6 +1926,7 @@ source "board/xilinx/Kconfig"
 source "board/xilinx/zynq/Kconfig"
 source "board/xilinx/zynqmp/Kconfig"
 source "board/phytium/durian/Kconfig"
+source "board/xen/xenguest_arm64/Kconfig"
 
 source "arch/arm/Kconfig.debug"
 
diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index 2e48df0eb9..dd6c354d19 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -39,3 +39,4 @@ obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
 obj-$(CONFIG_ARMV8_PSCI) += psci.o
 obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
+obj-$(CONFIG_XEN) += xen/
diff --git a/arch/arm/cpu/armv8/xen/Makefile b/arch/arm/cpu/armv8/xen/Makefile
new file mode 100644
index 0000000000..e3b4ae2bd4
--- /dev/null
+++ b/arch/arm/cpu/armv8/xen/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0+
+#
+# (C) 2018 NXP
+# (C) 2020 EPAM Systems Inc.
+
+obj-y += lowlevel_init.o hypercall.o
diff --git a/arch/arm/cpu/armv8/xen/hypercall.S b/arch/arm/cpu/armv8/xen/hypercall.S
new file mode 100644
index 0000000000..9596e336b5
--- /dev/null
+++ b/arch/arm/cpu/armv8/xen/hypercall.S
@@ -0,0 +1,78 @@
+/******************************************************************************
+ * hypercall.S
+ *
+ * Xen hypercall wrappers
+ *
+ * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/*
+ * The Xen hypercall calling convention is very similar to the procedure
+ * call standard for the ARM 64-bit architecture: the first parameter is
+ * passed in x0, the second in x1, the third in x2, the fourth in x3 and
+ * the fifth in x4.
+ *
+ * The hypercall number is passed in x16.
+ *
+ * The return value is in x0.
+ *
+ * The hvc ISS is required to be 0xEA1, that is the Xen specific ARM
+ * hypercall tag.
+ *
+ * Parameter structs passed to hypercalls are laid out according to
+ * the ARM 64-bit EABI standard.
+ */
+
+#include <xen/interface/xen.h>
+
+#define XEN_HYPERCALL_TAG	0xEA1
+
+#define HYPERCALL_SIMPLE(hypercall)		\
+.globl HYPERVISOR_##hypercall;                  \
+.align 4,0x90;                                  \
+HYPERVISOR_##hypercall:				\
+	mov x16, #__HYPERVISOR_##hypercall;	\
+	hvc XEN_HYPERCALL_TAG;			\
+	ret;					\
+
+#define HYPERCALL0 HYPERCALL_SIMPLE
+#define HYPERCALL1 HYPERCALL_SIMPLE
+#define HYPERCALL2 HYPERCALL_SIMPLE
+#define HYPERCALL3 HYPERCALL_SIMPLE
+#define HYPERCALL4 HYPERCALL_SIMPLE
+#define HYPERCALL5 HYPERCALL_SIMPLE
+
+                .text
+
+HYPERCALL2(xen_version);
+HYPERCALL3(console_io);
+HYPERCALL3(grant_table_op);
+HYPERCALL2(sched_op);
+HYPERCALL2(event_channel_op);
+HYPERCALL2(hvm_op);
+HYPERCALL2(memory_op);
+
diff --git a/arch/arm/cpu/armv8/xen/lowlevel_init.S b/arch/arm/cpu/armv8/xen/lowlevel_init.S
new file mode 100644
index 0000000000..25ed438e20
--- /dev/null
+++ b/arch/arm/cpu/armv8/xen/lowlevel_init.S
@@ -0,0 +1,34 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0+
+ *
+ * (C) 2017 NXP
+ * (C) 2020 EPAM Systems Inc.
+ */
+
+#include <config.h>
+
+.align 8
+.global rom_pointer
+rom_pointer:
+	.space 32
+
+/*
+ * Routine: save_boot_params (called after reset from start.S)
+ */
+
+.global save_boot_params
+save_boot_params:
+	/* The firmware provided ATAG/FDT address can be found in r2/x0 */
+	adr	x1, rom_pointer
+	stp	x0, x2, [x1], #16
+	stp	x3, x4, [x1], #16
+
+	/* Returns */
+	b	save_boot_params_ret
+
+.global restore_boot_params
+restore_boot_params:
+	adr	x1, rom_pointer
+	ldp	x0, x2, [x1], #16
+	ldp	x3, x4, [x1], #16
+	ret
diff --git a/arch/arm/include/asm/xen.h b/arch/arm/include/asm/xen.h
new file mode 100644
index 0000000000..fb7f03e19c
--- /dev/null
+++ b/arch/arm/include/asm/xen.h
@@ -0,0 +1,8 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0+
+ *
+ * (C) 2020 EPAM Systems Inc.
+ */
+
+extern unsigned long rom_pointer[];
+
diff --git a/arch/arm/include/asm/xen/hypercall.h b/arch/arm/include/asm/xen/hypercall.h
new file mode 100644
index 0000000000..26644ce886
--- /dev/null
+++ b/arch/arm/include/asm/xen/hypercall.h
@@ -0,0 +1,45 @@
+/******************************************************************************
+ * hypercall.h
+ *
+ * Linux-specific hypervisor handling.
+ *
+ * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef _ASM_ARM_XEN_HYPERCALL_H
+#define _ASM_ARM_XEN_HYPERCALL_H
+
+#include <xen/interface/xen.h>
+
+int HYPERVISOR_xen_version(int cmd, void *arg);
+int HYPERVISOR_console_io(int cmd, int count, char *str);
+int HYPERVISOR_grant_table_op(unsigned int cmd, void *uop, unsigned int count);
+int HYPERVISOR_sched_op(int cmd, void *arg);
+int HYPERVISOR_event_channel_op(int cmd, void *arg);
+unsigned long HYPERVISOR_hvm_op(int op, void *arg);
+int HYPERVISOR_memory_op(unsigned int cmd, void *arg);
+#endif /* _ASM_ARM_XEN_HYPERCALL_H */
diff --git a/board/xen/xenguest_arm64/Kconfig b/board/xen/xenguest_arm64/Kconfig
new file mode 100644
index 0000000000..cc131ed5b9
--- /dev/null
+++ b/board/xen/xenguest_arm64/Kconfig
@@ -0,0 +1,12 @@
+if TARGET_XENGUEST_ARM64
+
+config SYS_BOARD
+	default "xenguest_arm64"
+
+config SYS_VENDOR
+	default "xen"
+
+config SYS_CONFIG_NAME
+	default "xenguest_arm64"
+
+endif
diff --git a/board/xen/xenguest_arm64/Makefile b/board/xen/xenguest_arm64/Makefile
new file mode 100644
index 0000000000..1cf87a728f
--- /dev/null
+++ b/board/xen/xenguest_arm64/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier:	GPL-2.0+
+#
+# (C) Copyright 2020 EPAM Systems Inc.
+
+obj-y	:= xenguest_arm64.o
diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
new file mode 100644
index 0000000000..9e099f388f
--- /dev/null
+++ b/board/xen/xenguest_arm64/xenguest_arm64.c
@@ -0,0 +1,153 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0+
+ *
+ * (C) 2013
+ * David Feng <fenghua@phytium.com.cn>
+ * Sharma Bhupesh <bhupesh.sharma@freescale.com>
+ *
+ * (C) 2020 EPAM Systems Inc
+ */
+
+#include <common.h>
+#include <cpu_func.h>
+#include <dm.h>
+#include <errno.h>
+#include <malloc.h>
+
+#include <asm/io.h>
+#include <asm/armv8/mmu.h>
+#include <asm/xen.h>
+#include <asm/xen/hypercall.h>
+
+#include <linux/compiler.h>
+
+DECLARE_GLOBAL_DATA_PTR;
+
+int board_init(void)
+{
+	return 0;
+}
+
+/*
+ * Use fdt provided by Xen: according to
+ * https://www.kernel.org/doc/Documentation/arm64/booting.txt
+ * x0 is the physical address of the device tree blob (dtb) in system RAM.
+ * This is stored in rom_pointer during low level init.
+ */
+void *board_fdt_blob_setup(void)
+{
+	if (fdt_magic(rom_pointer[0]) != FDT_MAGIC)
+		return NULL;
+	return (void *)rom_pointer[0];
+}
+
+#define MAX_MEM_MAP_REGIONS 5
+static struct mm_region xen_mem_map[MAX_MEM_MAP_REGIONS];
+struct mm_region *mem_map = xen_mem_map;
+
+static int get_next_memory_node(const void *blob, int mem)
+{
+	do {
+		mem = fdt_node_offset_by_prop_value(blob, mem,
+						    "device_type", "memory", 7);
+	} while (!fdtdec_get_is_enabled(blob, mem));
+
+	return mem;
+}
+
+static int setup_mem_map(void)
+{
+	int i, ret, mem, reg = 0;
+	struct fdt_resource res;
+	const void *blob = gd->fdt_blob;
+
+	mem = get_next_memory_node(blob, -1);
+	if (mem < 0) {
+		printf("%s: Missing /memory node\n", __func__);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
+		ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
+		if (ret == -FDT_ERR_NOTFOUND) {
+			reg = 0;
+			mem = get_next_memory_node(blob, mem);
+			if (mem == -FDT_ERR_NOTFOUND)
+				break;
+
+			ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
+			if (ret == -FDT_ERR_NOTFOUND)
+				break;
+		}
+		if (ret != 0) {
+			printf("No reg property for memory node\n");
+			return -EINVAL;
+		}
+
+		xen_mem_map[i].virt = (phys_addr_t)res.start;
+		xen_mem_map[i].phys = (phys_addr_t)res.start;
+		xen_mem_map[i].size = (phys_size_t)(res.end - res.start + 1);
+		xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+					PTE_BLOCK_INNER_SHARE);
+	}
+	return 0;
+}
+
+void enable_caches(void)
+{
+	/* Re-setup the memory map as BSS gets cleared after relocation. */
+	setup_mem_map();
+	icache_enable();
+	dcache_enable();
+}
+
+/* Read memory settings from the Xen provided device tree. */
+int dram_init(void)
+{
+	int ret;
+
+	ret = fdtdec_setup_mem_size_base();
+	if (ret < 0)
+		return ret;
+	/* Setup memory map, so MMU page table size can be estimated. */
+	return setup_mem_map();
+}
+
+int dram_init_banksize(void)
+{
+	return fdtdec_setup_memory_banksize();
+}
+
+/*
+ * Board specific reset that is system reset.
+ */
+void reset_cpu(ulong addr)
+{
+}
+
+int ft_system_setup(void *blob, bd_t *bd)
+{
+	return 0;
+}
+
+int ft_board_setup(void *blob, bd_t *bd)
+{
+	return 0;
+}
+
+int board_early_init_f(void)
+{
+	return 0;
+}
+
+int print_cpuinfo(void)
+{
+	printf("Xen virtual CPU\n");
+	return 0;
+}
+
+__weak struct serial_device *default_serial_console(void)
+{
+	return NULL;
+}
+
diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
new file mode 100644
index 0000000000..2a8caf8647
--- /dev/null
+++ b/configs/xenguest_arm64_defconfig
@@ -0,0 +1,56 @@
+CONFIG_ARM=y
+CONFIG_POSITION_INDEPENDENT=y
+CONFIG_SYS_TEXT_BASE=0x40080000
+CONFIG_SYS_MALLOC_F_LEN=0x2000
+CONFIG_IDENT_STRING=" xenguest"
+CONFIG_TARGET_XENGUEST_ARM64=y
+CONFIG_BOOTDELAY=10
+
+CONFIG_SYS_PROMPT="xenguest# "
+
+CONFIG_CMD_NET=n
+CONFIG_CMD_BDI=n
+CONFIG_CMD_BOOTD=n
+CONFIG_CMD_BOOTEFI=n
+CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
+CONFIG_CMD_ELF=n
+CONFIG_CMD_GO=n
+CONFIG_CMD_RUN=n
+CONFIG_CMD_IMI=n
+CONFIG_CMD_IMLS=n
+CONFIG_CMD_XIMG=n
+CONFIG_CMD_EXPORTENV=n
+CONFIG_CMD_IMPORTENV=n
+CONFIG_CMD_EDITENV=n
+CONFIG_CMD_ENV_EXISTS=n
+CONFIG_CMD_MEMORY=y
+CONFIG_CMD_CRC32=n
+CONFIG_CMD_DM=n
+CONFIG_CMD_LOADB=n
+CONFIG_CMD_LOADS=n
+CONFIG_CMD_FLASH=n
+CONFIG_CMD_GPT=n
+CONFIG_CMD_FPGA=n
+CONFIG_CMD_ECHO=n
+CONFIG_CMD_ITEST=n
+CONFIG_CMD_SOURCE=n
+CONFIG_CMD_SETEXPR=n
+CONFIG_CMD_MISC=n
+CONFIG_CMD_UNZIP=n
+CONFIG_CMD_LZMADEC=n
+CONFIG_CMD_SAVEENV=n
+CONFIG_CMD_UMS=n
+
+#CONFIG_USB=n
+# CONFIG_ISO_PARTITION is not set
+
+#CONFIG_EFI_PARTITION=y
+# CONFIG_EFI_LOADER is not set
+
+# CONFIG_DM is not set
+# CONFIG_MMC is not set
+# CONFIG_DM_SERIAL is not set
+# CONFIG_REQUIRE_SERIAL_CONSOLE is not set
+
+CONFIG_OF_BOARD=y
+CONFIG_OF_LIBFDT=y
diff --git a/include/configs/xenguest_arm64.h b/include/configs/xenguest_arm64.h
new file mode 100644
index 0000000000..467dabf1e5
--- /dev/null
+++ b/include/configs/xenguest_arm64.h
@@ -0,0 +1,45 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0+
+ *
+ * (C) Copyright 2020 EPAM Systems Inc.
+ */
+#ifndef __XENGUEST_ARM64_H
+#define __XENGUEST_ARM64_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+#endif
+
+#define CONFIG_BOARD_EARLY_INIT_F
+
+#define CONFIG_EXTRA_ENV_SETTINGS
+
+#undef CONFIG_NR_DRAM_BANKS
+#undef CONFIG_SYS_SDRAM_BASE
+
+#define CONFIG_NR_DRAM_BANKS          1
+
+/*
+ * This can be any arbitrary address as we are using PIE, but
+ * please note, that CONFIG_SYS_TEXT_BASE must match the below.
+ */
+#define CONFIG_SYS_LOAD_ADDR                    0x40000000
+#define CONFIG_LNX_KRNL_IMG_TEXT_OFFSET_BASE    CONFIG_SYS_LOAD_ADDR
+
+/* Size of malloc() pool */
+#define CONFIG_SYS_MALLOC_LEN         (32 * 1024 * 1024)
+
+/* Monitor Command Prompt */
+#define CONFIG_SYS_PROMPT_HUSH_PS2    "> "
+#define CONFIG_SYS_CBSIZE             1024
+#define CONFIG_SYS_MAXARGS            64
+#define CONFIG_SYS_BARGSIZE           CONFIG_SYS_CBSIZE
+#define CONFIG_SYS_PBSIZE             (CONFIG_SYS_CBSIZE + \
+				      sizeof(CONFIG_SYS_PROMPT) + 16)
+
+#define CONFIG_OF_SYSTEM_SETUP
+
+#define CONFIG_CMDLINE_TAG            1
+#define CONFIG_INITRD_TAG             1
+
+#endif /* __XENGUEST_ARM64_H */
-- 
2.17.1


* [PATCH 04/17] xen: Add essential and required interface headers
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (2 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 03/17] board: Introduce xenguest_arm64 board Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  1:30   ` Peng Fan
  2020-07-01 16:29 ` [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os Anastasiia Lukianenko
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Add only the essential Xen interface headers, taken from the stable
Linux kernel branch stable/linux-5.7.y at commit
66dfe45221605e11f38a0bf5eb2ee808cea7cfe7.

These are better suited for U-Boot than the original headers
from Xen as they are stripped-down versions of the same.

At the same time, use the public protocol headers from Xen
RELEASE-4.13.1, at commit 6278553325a9f76d37811923221b21db3882e017,
as those have more comments in them.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 include/xen/arm/interface.h           |  88 ++++
 include/xen/interface/event_channel.h | 281 ++++++++++
 include/xen/interface/grant_table.h   | 582 +++++++++++++++++++++
 include/xen/interface/hvm/hvm_op.h    |  69 +++
 include/xen/interface/hvm/params.h    | 127 +++++
 include/xen/interface/io/blkif.h      | 726 ++++++++++++++++++++++++++
 include/xen/interface/io/console.h    |  56 ++
 include/xen/interface/io/protocols.h  |  42 ++
 include/xen/interface/io/ring.h       | 479 +++++++++++++++++
 include/xen/interface/io/xenbus.h     |  81 +++
 include/xen/interface/io/xs_wire.h    | 151 ++++++
 include/xen/interface/memory.h        | 332 ++++++++++++
 include/xen/interface/sched.h         | 188 +++++++
 include/xen/interface/xen.h           | 225 ++++++++
 14 files changed, 3427 insertions(+)
 create mode 100644 include/xen/arm/interface.h
 create mode 100644 include/xen/interface/event_channel.h
 create mode 100644 include/xen/interface/grant_table.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h
 create mode 100644 include/xen/interface/io/blkif.h
 create mode 100644 include/xen/interface/io/console.h
 create mode 100644 include/xen/interface/io/protocols.h
 create mode 100644 include/xen/interface/io/ring.h
 create mode 100644 include/xen/interface/io/xenbus.h
 create mode 100644 include/xen/interface/io/xs_wire.h
 create mode 100644 include/xen/interface/memory.h
 create mode 100644 include/xen/interface/sched.h
 create mode 100644 include/xen/interface/xen.h

diff --git a/include/xen/arm/interface.h b/include/xen/arm/interface.h
new file mode 100644
index 0000000000..79d5ae8563
--- /dev/null
+++ b/include/xen/arm/interface.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/******************************************************************************
+ * Guest OS interface to ARM Xen.
+ *
+ * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
+ */
+
+#ifndef _ASM_ARM_XEN_INTERFACE_H
+#define _ASM_ARM_XEN_INTERFACE_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+#endif
+
+#define uint64_aligned_t u64 __attribute__((aligned(8)))
+
+#define __DEFINE_GUEST_HANDLE(name, type) \
+	typedef struct { union { type *p; uint64_aligned_t q; }; }  \
+		__guest_handle_ ## name
+
+#define DEFINE_GUEST_HANDLE_STRUCT(name) \
+	__DEFINE_GUEST_HANDLE(name, struct name)
+#define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name)
+#define GUEST_HANDLE(name)        __guest_handle_ ## name
+
+#define set_xen_guest_handle(hnd, val)			\
+	do {						\
+		if (sizeof(hnd) == 8)			\
+			*(u64 *)&(hnd) = 0;	\
+		(hnd).p = val;				\
+	} while (0)
+
+#define __HYPERVISOR_platform_op_raw __HYPERVISOR_platform_op
+
+#ifndef __ASSEMBLY__
+/* Explicitly size integers that represent pfns in the interface with
+ * Xen so that we can have one ABI that works for 32 and 64 bit guests.
+ * Note that this means that the xen_pfn_t type may be capable of
+ * representing pfn's which the guest cannot represent in its own pfn
+ * type. However since pfn space is controlled by the guest this is
+ * fine since it simply wouldn't be able to create any such pfns in
+ * the first place.
+ */
+typedef u64 xen_pfn_t;
+#define PRI_xen_pfn "llx"
+typedef u64 xen_ulong_t;
+#define PRI_xen_ulong "llx"
+typedef s64 xen_long_t;
+#define PRI_xen_long "llx"
+/* Guest handles for primitive C types. */
+__DEFINE_GUEST_HANDLE(uchar, unsigned char);
+__DEFINE_GUEST_HANDLE(uint,  unsigned int);
+DEFINE_GUEST_HANDLE(char);
+DEFINE_GUEST_HANDLE(int);
+DEFINE_GUEST_HANDLE(void);
+DEFINE_GUEST_HANDLE(u64);
+DEFINE_GUEST_HANDLE(u32);
+DEFINE_GUEST_HANDLE(xen_pfn_t);
+DEFINE_GUEST_HANDLE(xen_ulong_t);
+
+/* Maximum number of virtual CPUs in multi-processor guests. */
+#define MAX_VIRT_CPUS 1
+
+struct arch_vcpu_info { };
+struct arch_shared_info { };
+
+/* TODO: Move pvclock definitions some place arch independent */
+struct pvclock_vcpu_time_info {
+	u32   version;
+	u32   pad0;
+	u64   tsc_timestamp;
+	u64   system_time;
+	u32   tsc_to_system_mul;
+	s8    tsc_shift;
+	u8    flags;
+	u8    pad[2];
+} __attribute__((__packed__)); /* 32 bytes */
+
+/* It is OK to have a 16-byte struct with no padding because it is packed */
+struct pvclock_wall_clock {
+	u32   version;
+	u32   sec;
+	u32   nsec;
+	u32   sec_hi;
+} __attribute__((__packed__));
+#endif
+
+#endif /* _ASM_ARM_XEN_INTERFACE_H */
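
Note on the guest handle macros in interface.h: a guest handle is a
union of a pointer and a 64-bit integer, so it occupies 8 bytes in a
hypercall argument whatever the guest's pointer size. The following is
a minimal standalone sketch of that idea, not part of the patch. The
u64 typedef stands in for <linux/types.h>, and demo_set_guest_handle(),
struct demo_args and demo_fill_args() are hypothetical names. Unlike
set_xen_guest_handle(), the sketch clears the handle through the q
member instead of a pointer cast.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64; /* stand-in for the <linux/types.h> typedef */

#define uint64_aligned_t u64 __attribute__((aligned(8)))

/* Same shape as __DEFINE_GUEST_HANDLE() in interface.h. */
#define __DEFINE_GUEST_HANDLE(name, type) \
	typedef struct { union { type *p; uint64_aligned_t q; }; } \
		__guest_handle_ ## name
#define GUEST_HANDLE(name) __guest_handle_ ## name

/*
 * Hypothetical variant of set_xen_guest_handle(): clear all 64 bits
 * through the q member, then store the pointer.
 */
#define demo_set_guest_handle(hnd, val)	\
	do {				\
		(hnd).q = 0;		\
		(hnd).p = (val);	\
	} while (0)

__DEFINE_GUEST_HANDLE(u64, u64);

/* The handle is 8 bytes regardless of the guest's pointer size. */
static_assert(sizeof(GUEST_HANDLE(u64)) == 8,
	      "guest handle must be 64 bits wide");

/* Hypothetical argument block that passes a frame list to Xen. */
struct demo_args {
	uint32_t nr_frames;
	GUEST_HANDLE(u64) frame_list;
};

/* Fill the argument block; frames must stay valid during the call. */
static void demo_fill_args(struct demo_args *args, u64 *frames, uint32_t nr)
{
	args->nr_frames = nr;
	demo_set_guest_handle(args->frame_list, frames);
}
```

Zeroing the whole handle first matters on 32-bit guests, where the
pointer covers only half of the 8 bytes that Xen reads.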
diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h
new file mode 100644
index 0000000000..8174999c2f
--- /dev/null
+++ b/include/xen/interface/event_channel.h
@@ -0,0 +1,281 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/******************************************************************************
+ * event_channel.h
+ *
+ * Event channels between domains.
+ *
+ * Copyright (c) 2003-2004, K A Fraser.
+ */
+
+#ifndef __XEN_PUBLIC_EVENT_CHANNEL_H__
+#define __XEN_PUBLIC_EVENT_CHANNEL_H__
+
+#include <xen/interface/xen.h>
+
+typedef u32 evtchn_port_t;
+DEFINE_GUEST_HANDLE(evtchn_port_t);
+
+/*
+ * EVTCHNOP_alloc_unbound: Allocate a port in domain <dom> and mark as
+ * accepting interdomain bindings from domain <remote_dom>. A fresh port
+ * is allocated in <dom> and returned as <port>.
+ * NOTES:
+ *  1. If the caller is unprivileged then <dom> must be DOMID_SELF.
+ *  2. <remote_dom> may be DOMID_SELF, allowing loopback connections.
+ */
+#define EVTCHNOP_alloc_unbound	  6
+struct evtchn_alloc_unbound {
+	/* IN parameters */
+	domid_t dom, remote_dom;
+	/* OUT parameters */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_bind_interdomain: Construct an interdomain event channel between
+ * the calling domain and <remote_dom>. <remote_dom,remote_port> must identify
+ * a port that is unbound and marked as accepting bindings from the calling
+ * domain. A fresh port is allocated in the calling domain and returned as
+ * <local_port>.
+ * NOTES:
+ *  1. <remote_dom> may be DOMID_SELF, allowing loopback connections.
+ */
+#define EVTCHNOP_bind_interdomain 0
+struct evtchn_bind_interdomain {
+	/* IN parameters. */
+	domid_t remote_dom;
+	evtchn_port_t remote_port;
+	/* OUT parameters. */
+	evtchn_port_t local_port;
+};
+
+/*
+ * EVTCHNOP_bind_virq: Bind a local event channel to VIRQ <virq> on specified
+ * vcpu.
+ * NOTES:
+ *  1. A virtual IRQ may be bound to at most one event channel per vcpu.
+ *  2. The allocated event channel is bound to the specified vcpu. The binding
+ *     may not be changed.
+ */
+#define EVTCHNOP_bind_virq	  1
+struct evtchn_bind_virq {
+	/* IN parameters. */
+	u32 virq;
+	u32 vcpu;
+	/* OUT parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_bind_pirq: Bind a local event channel to PIRQ <pirq>.
+ * NOTES:
+ *  1. A physical IRQ may be bound to at most one event channel per domain.
+ *  2. Only a sufficiently-privileged domain may bind to a physical IRQ.
+ */
+#define EVTCHNOP_bind_pirq	  2
+struct evtchn_bind_pirq {
+	/* IN parameters. */
+	u32 pirq;
+#define BIND_PIRQ__WILL_SHARE 1
+	u32 flags; /* BIND_PIRQ__* */
+	/* OUT parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_bind_ipi: Bind a local event channel to receive events.
+ * NOTES:
+ *  1. The allocated event channel is bound to the specified vcpu. The binding
+ *     may not be changed.
+ */
+#define EVTCHNOP_bind_ipi	  7
+struct evtchn_bind_ipi {
+	u32 vcpu;
+	/* OUT parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_close: Close a local event channel <port>. If the channel is
+ * interdomain then the remote end is placed in the unbound state
+ * (EVTCHNSTAT_unbound), awaiting a new connection.
+ */
+#define EVTCHNOP_close		  3
+struct evtchn_close {
+	/* IN parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_send: Send an event to the remote end of the channel whose local
+ * endpoint is <port>.
+ */
+#define EVTCHNOP_send		  4
+struct evtchn_send {
+	/* IN parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_status: Get the current status of the communication channel which
+ * has an endpoint at <dom, port>.
+ * NOTES:
+ *  1. <dom> may be specified as DOMID_SELF.
+ *  2. Only a sufficiently-privileged domain may obtain the status of an event
+ *     channel for which <dom> is not DOMID_SELF.
+ */
+#define EVTCHNOP_status		  5
+struct evtchn_status {
+	/* IN parameters */
+	domid_t  dom;
+	evtchn_port_t port;
+	/* OUT parameters */
+#define EVTCHNSTAT_closed	0  /* Channel is not in use.		     */
+#define EVTCHNSTAT_unbound	1  /* Channel is waiting interdom connection.*/
+#define EVTCHNSTAT_interdomain	2  /* Channel is connected to remote domain. */
+#define EVTCHNSTAT_pirq		3  /* Channel is bound to a phys IRQ line.   */
+#define EVTCHNSTAT_virq		4  /* Channel is bound to a virtual IRQ line */
+#define EVTCHNSTAT_ipi		5  /* Channel is bound to a virtual IPI line */
+	u32 status;
+	u32 vcpu;		   /* VCPU to which this channel is bound.   */
+	union {
+		struct {
+			domid_t dom;
+		} unbound; /* EVTCHNSTAT_unbound */
+		struct {
+			domid_t dom;
+			evtchn_port_t port;
+		} interdomain; /* EVTCHNSTAT_interdomain */
+		u32 pirq;	    /* EVTCHNSTAT_pirq	      */
+		u32 virq;	    /* EVTCHNSTAT_virq	      */
+	} u;
+};
+
+/*
+ * EVTCHNOP_bind_vcpu: Specify which vcpu a channel should notify when an
+ * event is pending.
+ * NOTES:
+ *  1. IPI- and VIRQ-bound channels always notify the vcpu that initialised
+ *     the binding. This binding cannot be changed.
+ *  2. All other channels notify vcpu0 by default. This default is set when
+ *     the channel is allocated (a port that is freed and subsequently reused
+ *     has its binding reset to vcpu0).
+ */
+#define EVTCHNOP_bind_vcpu	  8
+struct evtchn_bind_vcpu {
+	/* IN parameters. */
+	evtchn_port_t port;
+	u32 vcpu;
+};
+
+/*
+ * EVTCHNOP_unmask: Unmask the specified local event-channel port and deliver
+ * a notification to the appropriate VCPU if an event is pending.
+ */
+#define EVTCHNOP_unmask		  9
+struct evtchn_unmask {
+	/* IN parameters. */
+	evtchn_port_t port;
+};
+
+/*
+ * EVTCHNOP_reset: Close all event channels associated with specified domain.
+ * NOTES:
+ *  1. <dom> may be specified as DOMID_SELF.
+ *  2. Only a sufficiently-privileged domain may specify other than DOMID_SELF.
+ */
+#define EVTCHNOP_reset		 10
+struct evtchn_reset {
+	/* IN parameters. */
+	domid_t dom;
+};
+
+typedef struct evtchn_reset evtchn_reset_t;
+
+/*
+ * EVTCHNOP_init_control: initialize the control block for the FIFO ABI.
+ */
+#define EVTCHNOP_init_control    11
+struct evtchn_init_control {
+	/* IN parameters. */
+	u64 control_gfn;
+	u32 offset;
+	u32 vcpu;
+	/* OUT parameters. */
+	u8 link_bits;
+	u8 _pad[7];
+};
+
+/*
+ * EVTCHNOP_expand_array: add an additional page to the event array.
+ */
+#define EVTCHNOP_expand_array    12
+struct evtchn_expand_array {
+	/* IN parameters. */
+	u64 array_gfn;
+};
+
+/*
+ * EVTCHNOP_set_priority: set the priority for an event channel.
+ */
+#define EVTCHNOP_set_priority    13
+struct evtchn_set_priority {
+	/* IN parameters. */
+	evtchn_port_t port;
+	u32 priority;
+};
+
+struct evtchn_op {
+	u32 cmd; /* EVTCHNOP_* */
+	union {
+		struct evtchn_alloc_unbound    alloc_unbound;
+		struct evtchn_bind_interdomain bind_interdomain;
+		struct evtchn_bind_virq	       bind_virq;
+		struct evtchn_bind_pirq	       bind_pirq;
+		struct evtchn_bind_ipi	       bind_ipi;
+		struct evtchn_close	       close;
+		struct evtchn_send	       send;
+		struct evtchn_status	       status;
+		struct evtchn_bind_vcpu	       bind_vcpu;
+		struct evtchn_unmask	       unmask;
+	} u;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(evtchn_op);
+
+/*
+ * 2-level ABI
+ */
+
+#define EVTCHN_2L_NR_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64)
+
+/*
+ * FIFO ABI
+ */
+
+/* Events may have priorities from 0 (highest) to 15 (lowest). */
+#define EVTCHN_FIFO_PRIORITY_MAX     0
+#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
+#define EVTCHN_FIFO_PRIORITY_MIN     15
+
+#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
+
+typedef u32 event_word_t;
+
+#define EVTCHN_FIFO_PENDING 31
+#define EVTCHN_FIFO_MASKED  30
+#define EVTCHN_FIFO_LINKED  29
+#define EVTCHN_FIFO_BUSY    28
+
+#define EVTCHN_FIFO_LINK_BITS 17
+#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)
+
+#define EVTCHN_FIFO_NR_CHANNELS (1 << EVTCHN_FIFO_LINK_BITS)
+
+struct evtchn_fifo_control_block {
+	u32     ready;
+	u32     _rsvd;
+	event_word_t head[EVTCHN_FIFO_MAX_QUEUES];
+};
+
+#endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */
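
Note on the FIFO ABI constants in event_channel.h: each event_word_t
carries four status bits in its top bits and a link field in its low
EVTCHN_FIFO_LINK_BITS bits. The sketch below shows one way to unpack
such a word. It is illustrative only and not part of the patch: the
constants are copied from the header, while struct event_word_fields
and decode_event_word() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t event_word_t;

/* Bit positions and link width, copied from event_channel.h. */
#define EVTCHN_FIFO_PENDING 31
#define EVTCHN_FIFO_MASKED  30
#define EVTCHN_FIFO_LINKED  29
#define EVTCHN_FIFO_BUSY    28

#define EVTCHN_FIFO_LINK_BITS 17
#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)
#define EVTCHN_FIFO_NR_CHANNELS (1 << EVTCHN_FIFO_LINK_BITS)

/* A 17-bit link field can address 2^17 ports. */
static_assert(EVTCHN_FIFO_NR_CHANNELS == 131072, "2^17 channels");

/* Hypothetical decoded view of one event word. */
struct event_word_fields {
	bool pending;
	bool masked;
	bool linked;
	bool busy;
	uint32_t link; /* next port in the queue, valid when linked */
};

/* Split an event word into its status bits and link field. */
static struct event_word_fields decode_event_word(event_word_t w)
{
	struct event_word_fields f;

	f.pending = (w >> EVTCHN_FIFO_PENDING) & 1u;
	f.masked  = (w >> EVTCHN_FIFO_MASKED) & 1u;
	f.linked  = (w >> EVTCHN_FIFO_LINKED) & 1u;
	f.busy    = (w >> EVTCHN_FIFO_BUSY) & 1u;
	f.link    = w & (uint32_t)EVTCHN_FIFO_LINK_MASK;
	return f;
}
```

A real consumer would read the word from the shared event array with
an atomic load; the plain value parameter keeps the sketch short.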
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
new file mode 100644
index 0000000000..197a0d0d58
--- /dev/null
+++ b/include/xen/interface/grant_table.h
@@ -0,0 +1,582 @@
+/******************************************************************************
+ * grant_table.h
+ *
+ * Interface for granting foreign access to page frames, and receiving
+ * page-ownership transfers.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2004, K A Fraser
+ */
+
+#ifndef __XEN_PUBLIC_GRANT_TABLE_H__
+#define __XEN_PUBLIC_GRANT_TABLE_H__
+
+#include <xen/interface/xen.h>
+
+/***********************************
+ * GRANT TABLE REPRESENTATION
+ */
+
+/* Some rough guidelines on accessing and updating grant-table entries
+ * in a concurrency-safe manner. For more information, Linux contains a
+ * reference implementation for guest OSes (arch/xen/kernel/grant_table.c).
+ *
+ * NB. WMB is a no-op on current-generation x86 processors. However, a
+ *     compiler barrier will still be required.
+ *
+ * Introducing a valid entry into the grant table:
+ *  1. Write ent->domid.
+ *  2. Write ent->frame:
+ *      GTF_permit_access:   Frame to which access is permitted.
+ *      GTF_accept_transfer: Pseudo-phys frame slot being filled by new
+ *                           frame, or zero if none.
+ *  3. Write memory barrier (WMB).
+ *  4. Write ent->flags, inc. valid type.
+ *
+ * Invalidating an unused GTF_permit_access entry:
+ *  1. flags = ent->flags.
+ *  2. Observe that !(flags & (GTF_reading|GTF_writing)).
+ *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
+ *  NB. No need for WMB as reuse of entry is control-dependent on success of
+ *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
+ *
+ * Invalidating an in-use GTF_permit_access entry:
+ *  This cannot be done directly. Request assistance from the domain controller
+ *  which can set a timeout on the use of a grant entry and take necessary
+ *  action. (NB. This is not yet implemented!).
+ *
+ * Invalidating an unused GTF_accept_transfer entry:
+ *  1. flags = ent->flags.
+ *  2. Observe that !(flags & GTF_transfer_committed). [*]
+ *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
+ *  NB. No need for WMB as reuse of entry is control-dependent on success of
+ *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
+ *  [*] If GTF_transfer_committed is set then the grant entry is 'committed'.
+ *      The guest must /not/ modify the grant entry until the address of the
+ *      transferred frame is written. It is safe for the guest to spin waiting
+ *      for this to occur (detect by observing GTF_transfer_completed in
+ *      ent->flags).
+ *
+ * Invalidating a committed GTF_accept_transfer entry:
+ *  1. Wait for (ent->flags & GTF_transfer_completed).
+ *
+ * Changing a GTF_permit_access from writable to read-only:
+ *  Use SMP-safe CMPXCHG to set GTF_readonly, while checking !GTF_writing.
+ *
+ * Changing a GTF_permit_access from read-only to writable:
+ *  Use SMP-safe bit-setting instruction.
+ */
+
+/*
+ * Reference to a grant entry in a specified domain's grant table.
+ */
+typedef u32 grant_ref_t;
+
+/*
+ * A grant table comprises a packed array of grant entries in one or more
+ * page frames shared between Xen and a guest.
+ * [XEN]: This field is written by Xen and read by the sharing guest.
+ * [GST]: This field is written by the guest and read by Xen.
+ */
+
+/*
+ * Version 1 of the grant table entry structure is maintained purely
+ * for backwards compatibility.  New guests should use version 2.
+ */
+struct grant_entry_v1 {
+	/* GTF_xxx: various type and flag information.  [XEN,GST] */
+	u16 flags;
+	/* The domain being granted foreign privileges. [GST] */
+	domid_t  domid;
+	/*
+	 * GTF_permit_access: Frame that @domid is allowed to map and access. [GST]
+	 * GTF_accept_transfer: Frame whose ownership is transferred by @domid. [XEN]
+	 */
+	u32 frame;
+};
+
+/*
+ * Type of grant entry.
+ *  GTF_invalid: This grant entry grants no privileges.
+ *  GTF_permit_access: Allow @domid to map/access @frame.
+ *  GTF_accept_transfer: Allow @domid to transfer ownership of one page frame
+ *                       to this guest. Xen writes the page number to @frame.
+ *  GTF_transitive: Allow @domid to transitively access a subrange of
+ *                  @trans_grant in @trans_domid.  No mappings are allowed.
+ */
+#define GTF_invalid         (0U << 0)
+#define GTF_permit_access   (1U << 0)
+#define GTF_accept_transfer (2U << 0)
+#define GTF_transitive      (3U << 0)
+#define GTF_type_mask       (3U << 0)
+
+/*
+ * Subflags for GTF_permit_access.
+ *  GTF_readonly: Restrict @domid to read-only mappings and accesses. [GST]
+ *  GTF_reading: Grant entry is currently mapped for reading by @domid. [XEN]
+ *  GTF_writing: Grant entry is currently mapped for writing by @domid. [XEN]
+ *  GTF_sub_page: Grant access to only a subrange of the page.  @domid
+ *                will only be allowed to copy from the grant, and not
+ *                map it. [GST]
+ */
+#define _GTF_readonly       (2)
+#define GTF_readonly        (1U << _GTF_readonly)
+#define _GTF_reading        (3)
+#define GTF_reading         (1U << _GTF_reading)
+#define _GTF_writing        (4)
+#define GTF_writing         (1U << _GTF_writing)
+#define _GTF_sub_page       (8)
+#define GTF_sub_page        (1U << _GTF_sub_page)
+
+/*
+ * Subflags for GTF_accept_transfer:
+ *  GTF_transfer_committed: Xen sets this flag to indicate that it is committed
+ *      to transferring ownership of a page frame. When a guest sees this flag
+ *      it must /not/ modify the grant entry until GTF_transfer_completed is
+ *      set by Xen.
+ *  GTF_transfer_completed: It is safe for the guest to spin-wait on this flag
+ *      after reading GTF_transfer_committed. Xen will always write the frame
+ *      address, followed by ORing this flag, in a timely manner.
+ */
+#define _GTF_transfer_committed (2)
+#define GTF_transfer_committed  (1U << _GTF_transfer_committed)
+#define _GTF_transfer_completed (3)
+#define GTF_transfer_completed  (1U << _GTF_transfer_completed)
+
+/*
+ * Version 2 grant table entries.  These fulfil the same role as
+ * version 1 entries, but can represent more complicated operations.
+ * Any given domain will have either a version 1 or a version 2 table,
+ * and every entry in the table will be the same version.
+ *
+ * The interface by which domains use grant references does not depend
+ * on the grant table version in use by the other domain.
+ */
+
+/*
+ * Version 1 and version 2 grant entries share a common prefix.  The
+ * fields of the prefix are documented as part of struct
+ * grant_entry_v1.
+ */
+struct grant_entry_header {
+	u16 flags;
+	domid_t  domid;
+};
+
+/*
+ * Version 2 of the grant entry structure. It is a union because three
+ * different types are supported: full_page, sub_page and transitive.
+ */
+union grant_entry_v2 {
+	struct grant_entry_header hdr;
+
+	/*
+	 * This member is used for V1-style full page grants, where either:
+	 *
+	 * -- hdr.type is GTF_accept_transfer, or
+	 * -- hdr.type is GTF_permit_access and GTF_sub_page is not set.
+	 *
+	 * In that case, the frame field has the same semantics as the
+	 * field of the same name in the V1 entry structure.
+	 */
+	struct {
+		struct grant_entry_header hdr;
+		u32 pad0;
+		u64 frame;
+	} full_page;
+
+	/*
+	 * If the grant type is GTF_permit_access and GTF_sub_page is set,
+	 * @domid is allowed to access bytes [@page_off,@page_off+@length)
+	 * in frame @frame.
+	 */
+	struct {
+		struct grant_entry_header hdr;
+		u16 page_off;
+		u16 length;
+		u64 frame;
+	} sub_page;
+
+	/*
+	 * If the grant is GTF_transitive, @domid is allowed to use the
+	 * grant @gref in domain @trans_domid, as if it was the local
+	 * domain.  Obviously, the transitive access must be compatible
+	 * with the original grant.
+	 */
+	struct {
+		struct grant_entry_header hdr;
+		domid_t trans_domid;
+		u16 pad0;
+		grant_ref_t gref;
+	} transitive;
+
+	u32 __spacer[4]; /* Pad to a power of two */
+};
+
+typedef u16 grant_status_t;
+
+/***********************************
+ * GRANT TABLE QUERIES AND USES
+ */
+
+/*
+ * Handle to track a mapping created via a grant reference.
+ */
+typedef u32 grant_handle_t;
+
+/*
+ * GNTTABOP_map_grant_ref: Map the grant entry (<dom>,<ref>) for access
+ * by devices and/or host CPUs. If successful, <handle> is a tracking number
+ * that must be presented later to destroy the mapping(s). On error, <handle>
+ * is a negative status code.
+ * NOTES:
+ *  1. If GNTMAP_device_map is specified then <dev_bus_addr> is the address
+ *     via which I/O devices may access the granted frame.
+ *  2. If GNTMAP_host_map is specified then a mapping will be added at
+ *     either a host virtual address in the current address space, or at
+ *     a PTE at the specified machine address.  The type of mapping to
+ *     perform is selected through the GNTMAP_contains_pte flag, and the
+ *     address is specified in <host_addr>.
+ *  3. Mappings should only be destroyed via GNTTABOP_unmap_grant_ref. If a
+ *     host mapping is destroyed by other means then it is *NOT* guaranteed
+ *     to be accounted to the correct grant reference!
+ */
+#define GNTTABOP_map_grant_ref        0
+struct gnttab_map_grant_ref {
+	/* IN parameters. */
+	u64 host_addr;
+	u32 flags;               /* GNTMAP_* */
+	grant_ref_t ref;
+	domid_t  dom;
+	/* OUT parameters. */
+	s16  status;              /* GNTST_* */
+	grant_handle_t handle;
+	u64 dev_bus_addr;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_map_grant_ref);
+
+/*
+ * GNTTABOP_unmap_grant_ref: Destroy one or more grant-reference mappings
+ * tracked by <handle>. If <host_addr> or <dev_bus_addr> is zero, that
+ * field is ignored. If non-zero, they must refer to a device/host mapping
+ * that is tracked by <handle>
+ * NOTES:
+ *  1. The call may fail in an undefined manner if either mapping is not
+ *     tracked by <handle>.
+ *  2. After executing a batch of unmaps, it is guaranteed that no stale
+ *     mappings will remain in the device or host TLBs.
+ */
+#define GNTTABOP_unmap_grant_ref      1
+struct gnttab_unmap_grant_ref {
+	/* IN parameters. */
+	u64 host_addr;
+	u64 dev_bus_addr;
+	grant_handle_t handle;
+	/* OUT parameters. */
+	s16  status;              /* GNTST_* */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_grant_ref);
+
+/*
+ * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
+ * <nr_frames> pages. The frame addresses are written to the <frame_list>.
+ * Only <nr_frames> addresses are written, even if the table is larger.
+ * NOTES:
+ *  1. <dom> may be specified as DOMID_SELF.
+ *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
+ *  3. Xen may not support more than a single grant-table page per domain.
+ */
+#define GNTTABOP_setup_table          2
+struct gnttab_setup_table {
+	/* IN parameters. */
+	domid_t  dom;
+	u32 nr_frames;
+	/* OUT parameters. */
+	s16  status;              /* GNTST_* */
+
+	GUEST_HANDLE(xen_pfn_t) frame_list;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_setup_table);
+
+/*
+ * GNTTABOP_dump_table: Dump the contents of the grant table to the
+ * xen console. Debugging use only.
+ */
+#define GNTTABOP_dump_table           3
+struct gnttab_dump_table {
+	/* IN parameters. */
+	domid_t dom;
+	/* OUT parameters. */
+	s16 status;               /* GNTST_* */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_dump_table);
+
+/*
+ * GNTTABOP_transfer: Transfer <frame> to a foreign domain. The
+ * foreign domain has previously registered its interest in the transfer via
+ * <domid, ref>.
+ *
+ * Note that, even if the transfer fails, the specified page no longer belongs
+ * to the calling domain *unless* the error is GNTST_bad_page.
+ */
+#define GNTTABOP_transfer                4
+struct gnttab_transfer {
+	/* IN parameters. */
+	xen_pfn_t mfn;
+	domid_t       domid;
+	grant_ref_t   ref;
+	/* OUT parameters. */
+	s16       status;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_transfer);
+
+/*
+ * GNTTABOP_copy: Hypervisor-based copy.
+ * Source and destination can be either MFNs or, for foreign domains,
+ * grant references. The foreign domain has to grant read/write access
+ * in its grant table.
+ *
+ * The flags specify what type source and destination are (either MFN
+ * or grant reference).
+ *
+ * Note that this can also be used to copy data between two domains
+ * via a third party if the source and destination domains had previously
+ * granted appropriate access to their pages to the third party.
+ *
+ * source_offset specifies an offset in the source frame, dest_offset
+ * the offset in the target frame and len specifies the number of
+ * bytes to be copied.
+ */
+
+#define _GNTCOPY_source_gref      (0)
+#define GNTCOPY_source_gref       (1 << _GNTCOPY_source_gref)
+#define _GNTCOPY_dest_gref        (1)
+#define GNTCOPY_dest_gref         (1 << _GNTCOPY_dest_gref)
+
+#define GNTTABOP_copy                 5
+struct gnttab_copy {
+	/* IN parameters. */
+	struct {
+		union {
+			grant_ref_t ref;
+			xen_pfn_t   gmfn;
+		} u;
+		domid_t  domid;
+		u16 offset;
+	} source, dest;
+	u16      len;
+	u16      flags;          /* GNTCOPY_* */
+	/* OUT parameters. */
+	s16       status;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_copy);
+
+/*
+ * GNTTABOP_query_size: Query the current and maximum sizes of the shared
+ * grant table.
+ * NOTES:
+ *  1. <dom> may be specified as DOMID_SELF.
+ *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
+ */
+#define GNTTABOP_query_size           6
+struct gnttab_query_size {
+	/* IN parameters. */
+	domid_t  dom;
+	/* OUT parameters. */
+	u32 nr_frames;
+	u32 max_nr_frames;
+	s16  status;              /* GNTST_* */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_query_size);
+
+/*
+ * GNTTABOP_unmap_and_replace: Destroy one or more grant-reference mappings
+ * tracked by <handle> but atomically replace the page table entry with one
+ * pointing to the machine address under <new_addr>.  <new_addr> will be
+ * redirected to the null entry.
+ * NOTES:
+ *  1. The call may fail in an undefined manner if either mapping is not
+ *     tracked by <handle>.
+ *  2. After executing a batch of unmaps, it is guaranteed that no stale
+ *     mappings will remain in the device or host TLBs.
+ */
+#define GNTTABOP_unmap_and_replace    7
+struct gnttab_unmap_and_replace {
+	/* IN parameters. */
+	u64 host_addr;
+	u64 new_addr;
+	grant_handle_t handle;
+	/* OUT parameters. */
+	s16  status;              /* GNTST_* */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_and_replace);
+
+/*
+ * GNTTABOP_set_version: Request a particular version of the grant
+ * table shared table structure.  This operation can only be performed
+ * once in any given domain.  It must be performed before any grants
+ * are activated; otherwise, the domain will be stuck with version 1.
+ * The only defined versions are 1 and 2.
+ */
+#define GNTTABOP_set_version          8
+struct gnttab_set_version {
+	/* IN parameters */
+	u32 version;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_set_version);
+
+/*
+ * GNTTABOP_get_status_frames: Get the list of frames used to store grant
+ * status for <dom>. In grant format version 2, the status is separated
+ * from the other shared grant fields to allow more efficient synchronization
+ * using barriers instead of atomic cmpxchg operations.
+ * <nr_frames> specifies the size of vector <frame_list>.
+ * The frame addresses are returned in the <frame_list>.
+ * Only <nr_frames> addresses are returned, even if the table is larger.
+ * NOTES:
+ *  1. <dom> may be specified as DOMID_SELF.
+ *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
+ */
+#define GNTTABOP_get_status_frames     9
+struct gnttab_get_status_frames {
+	/* IN parameters. */
+	u32 nr_frames;
+	domid_t  dom;
+	/* OUT parameters. */
+	s16  status;              /* GNTST_* */
+
+	GUEST_HANDLE(u64) frame_list;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_status_frames);
+
+/*
+ * GNTTABOP_get_version: Get the grant table version which is in
+ * effect for domain <dom>.
+ */
+#define GNTTABOP_get_version          10
+struct gnttab_get_version {
+	/* IN parameters */
+	domid_t dom;
+	u16 pad;
+	/* OUT parameters */
+	u32 version;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);
+
+/*
+ * Issue one or more cache maintenance operations on a portion of a
+ * page granted to the calling domain by a foreign domain.
+ */
+#define GNTTABOP_cache_flush          12
+struct gnttab_cache_flush {
+	union {
+		u64 dev_bus_addr;
+		grant_ref_t ref;
+	} a;
+	u16 offset;   /* offset from start of grant */
+	u16 length;   /* size within the grant */
+#define GNTTAB_CACHE_CLEAN          (1 << 0)
+#define GNTTAB_CACHE_INVAL          (1 << 1)
+#define GNTTAB_CACHE_SOURCE_GREF    (1U << 31)
+	u32 op;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush);
+
+/*
+ * Bitfield values for update_pin_status.flags.
+ */
+/* Map the grant entry for access by I/O devices. */
+#define _GNTMAP_device_map      (0)
+#define GNTMAP_device_map       (1 << _GNTMAP_device_map)
+/* Map the grant entry for access by host CPUs. */
+#define _GNTMAP_host_map        (1)
+#define GNTMAP_host_map         (1 << _GNTMAP_host_map)
+/* Accesses to the granted frame will be restricted to read-only access. */
+#define _GNTMAP_readonly        (2)
+#define GNTMAP_readonly         (1 << _GNTMAP_readonly)
+/*
+ * GNTMAP_host_map subflag:
+ *  0 => The host mapping is usable only by the guest OS.
+ *  1 => The host mapping is usable by guest OS + current application.
+ */
+#define _GNTMAP_application_map (3)
+#define GNTMAP_application_map  (1 << _GNTMAP_application_map)
+
+/*
+ * GNTMAP_contains_pte subflag:
+ *  0 => This map request contains a host virtual address.
+ *  1 => This map request contains the machine address of the PTE to update.
+ */
+#define _GNTMAP_contains_pte    (4)
+#define GNTMAP_contains_pte     (1 << _GNTMAP_contains_pte)
+
+/*
+ * Bits to be placed in guest kernel available PTE bits (architecture
+ * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set).
+ */
+#define _GNTMAP_guest_avail0    (16)
+#define GNTMAP_guest_avail_mask ((u32)~0 << _GNTMAP_guest_avail0)
+
+/*
+ * Values for error status returns. All errors are -ve.
+ */
+#define GNTST_okay             (0)  /* Normal return.                        */
+#define GNTST_general_error    (-1) /* General undefined error.              */
+#define GNTST_bad_domain       (-2) /* Unrecognised domain id.               */
+#define GNTST_bad_gntref       (-3) /* Unrecognised or inappropriate gntref. */
+#define GNTST_bad_handle       (-4) /* Unrecognised or inappropriate handle. */
+#define GNTST_bad_virt_addr    (-5) /* Inappropriate virtual address to map. */
+#define GNTST_bad_dev_addr     (-6) /* Inappropriate device address to unmap.*/
+#define GNTST_no_device_space  (-7) /* Out of space in I/O MMU.              */
+#define GNTST_permission_denied (-8) /* Not enough privilege for operation.  */
+#define GNTST_bad_page         (-9) /* Specified page was invalid for op.    */
+#define GNTST_bad_copy_arg    (-10) /* copy arguments cross page boundary.   */
+#define GNTST_address_too_big (-11) /* transfer page address too large.      */
+#define GNTST_eagain          (-12) /* Operation not done; try again.        */
+
+#define GNTTABOP_error_msgs {                   \
+	"okay",                                     \
+	"undefined error",                          \
+	"unrecognised domain id",                   \
+	"invalid grant reference",                  \
+	"invalid mapping handle",                   \
+	"invalid virtual address",                  \
+	"invalid device address",                   \
+	"no spare translation slot in the I/O MMU", \
+	"permission denied",                        \
+	"bad page",                                 \
+	"copy arguments cross page boundary",       \
+	"page address size too large",              \
+	"operation not done; try again"             \
+}
+
+#endif /* __XEN_PUBLIC_GRANT_TABLE_H__ */
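Note on the GNTST_* codes and GNTTABOP_error_msgs above: the message table is
ordered so that entry i describes status -i. A minimal sketch of a lookup
helper follows. The name gnttab_strerror() and the "unknown status" fallback
are assumptions made for illustration and are not part of this patch; the
macro is copied from the header so that the sketch builds on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from grant_table.h above so this sketch is self-contained. */
#define GNTTABOP_error_msgs {                   \
	"okay",                                     \
	"undefined error",                          \
	"unrecognised domain id",                   \
	"invalid grant reference",                  \
	"invalid mapping handle",                   \
	"invalid virtual address",                  \
	"invalid device address",                   \
	"no spare translation slot in the I/O MMU", \
	"permission denied",                        \
	"bad page",                                 \
	"copy arguments cross page boundary",       \
	"page address size too large",              \
	"operation not done; try again"             \
}

/*
 * Hypothetical helper (not in the patch): translate a GNTST_* status
 * into a human-readable string.
 */
static const char *gnttab_strerror(int16_t status)
{
	static const char *const msgs[] = GNTTABOP_error_msgs;
	const int count = (int)(sizeof(msgs) / sizeof(msgs[0]));

	/* Valid codes are zero or negative; the table index is -status. */
	if (status > 0 || -status >= count)
		return "unknown status";

	return msgs[(size_t)(-status)];
}
```

A caller would typically print gnttab_strerror(op.status) after a grant-table
hypercall returns with a non-zero status field.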
diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
new file mode 100644
index 0000000000..1c53cad729
--- /dev/null
+++ b/include/xen/interface/hvm/hvm_op.h
@@ -0,0 +1,69 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
+#define __XEN_PUBLIC_HVM_HVM_OP_H__
+
+/* Get/set subcommands: the second argument of the hypercall is a
+ * pointer to a xen_hvm_param struct.
+ */
+#define HVMOP_set_param           0
+#define HVMOP_get_param           1
+struct xen_hvm_param {
+	domid_t  domid;    /* IN */
+	u32 index;    /* IN */
+	u64 value;    /* IN/OUT */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
+
+/* Hint from PV drivers for pagetable destruction. */
+#define HVMOP_pagetable_dying       9
+struct xen_hvm_pagetable_dying {
+	/* Domain with a pagetable about to be destroyed. */
+	domid_t  domid;
+	/* guest physical address of the toplevel pagetable dying */
+	aligned_u64 gpa;
+};
+
+typedef struct xen_hvm_pagetable_dying xen_hvm_pagetable_dying_t;
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_pagetable_dying_t);
+
+enum hvmmem_type_t {
+	HVMMEM_ram_rw,             /* Normal read/write guest RAM */
+	HVMMEM_ram_ro,             /* Read-only; writes are discarded */
+	HVMMEM_mmio_dm,            /* Reads and writes go to the device model */
+};
+
+#define HVMOP_get_mem_type    15
+/* Return hvmmem_type_t for the specified pfn. */
+struct xen_hvm_get_mem_type {
+	/* Domain to be queried. */
+	domid_t domid;
+	/* OUT variable. */
+	u16 mem_type;
+	u16 pad[2]; /* align next field on 8-byte boundary */
+	/* IN variable. */
+	u64 pfn;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_get_mem_type);
+
+#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
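Note on struct xen_hvm_get_mem_type above: the pad[2] member exists so that
pfn starts at byte offset 8 regardless of the compiler's default alignment
for 64-bit integers. A small sketch that checks this layout is shown below.
It assumes domid_t is a 16-bit unsigned integer, as in Xen's public headers,
and uses the <stdint.h> types in place of U-Boot's u16/u64. The helper name
mem_type_pfn_offset() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: domid_t is 16 bits wide, as in Xen's public headers. */
typedef uint16_t domid_t;

/* Same field order as in hvm_op.h above, with <stdint.h> types. */
struct xen_hvm_get_mem_type {
	domid_t domid;      /* bytes 0-1 */
	uint16_t mem_type;  /* bytes 2-3 */
	uint16_t pad[2];    /* bytes 4-7: explicit padding */
	uint64_t pfn;       /* expected at byte 8 */
};

/* Hypothetical helper: report where pfn lands inside the structure. */
static size_t mem_type_pfn_offset(void)
{
	return offsetof(struct xen_hvm_get_mem_type, pfn);
}
```

Because the padding is explicit, the hypervisor and the guest agree on the
layout even when they are built with different compilers or ABIs.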
diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
new file mode 100644
index 0000000000..4d61fc58d9
--- /dev/null
+++ b/include/xen/interface/hvm/params.h
@@ -0,0 +1,127 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
+#define __XEN_PUBLIC_HVM_PARAMS_H__
+
+#include <xen/interface/hvm/hvm_op.h>
+
+/*
+ * Parameter space for HVMOP_{set,get}_param.
+ */
+
+#define HVM_PARAM_CALLBACK_IRQ 0
+/*
+ * How should CPU0 event-channel notifications be delivered?
+ *
+ * If val == 0 then CPU0 event-channel notifications are not delivered.
+ * If val != 0, val[63:56] encodes the type, as follows:
+ */
+
+#define HVM_PARAM_CALLBACK_TYPE_GSI      0
+/*
+ * val[55:0] is a delivery GSI.  GSI 0 cannot be used, as it aliases val == 0,
+ * and disables all notifications.
+ */
+
+#define HVM_PARAM_CALLBACK_TYPE_PCI_INTX 1
+/*
+ * val[55:0] is a delivery PCI INTx line:
+ * Domain = val[47:32], Bus = val[31:16], DevFn = val[15:8], IntX = val[1:0]
+ */
+
+#if defined(__i386__) || defined(__x86_64__)
+#define HVM_PARAM_CALLBACK_TYPE_VECTOR   2
+/*
+ * val[7:0] is a vector number.  Check for XENFEAT_hvm_callback_vector to know
+ * if this delivery method is available.
+ */
+#elif defined(__arm__) || defined(__aarch64__)
+#define HVM_PARAM_CALLBACK_TYPE_PPI      2
+/*
+ * val[55:16] needs to be zero.
+ * val[15:8] is interrupt flag of the PPI used by event-channel:
+ *  bit 8: the PPI is edge(1) or level(0) triggered
+ *  bit 9: the PPI is active low(1) or high(0)
+ * val[7:0] is a PPI number used by event-channel.
+ * This is only used by ARM/ARM64 and masking/eoi the interrupt associated to
+ * the notification is handled by the interrupt controller.
+ */
+#endif
+
+#define HVM_PARAM_STORE_PFN    1
+#define HVM_PARAM_STORE_EVTCHN 2
+
+#define HVM_PARAM_PAE_ENABLED  4
+
+#define HVM_PARAM_IOREQ_PFN    5
+
+#define HVM_PARAM_BUFIOREQ_PFN 6
+
+/*
+ * Set mode for virtual timers (currently x86 only):
+ *  delay_for_missed_ticks (default):
+ *   Do not advance a vcpu's time beyond the correct delivery time for
+ *   interrupts that have been missed due to preemption. Deliver missed
+ *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
+ *   time stepwise for each one.
+ *  no_delay_for_missed_ticks:
+ *   As above, missed interrupts are delivered, but guest time always tracks
+ *   wallclock (i.e., real) time while doing so.
+ *  no_missed_ticks_pending:
+ *   No missed interrupts are held pending. Instead, to ensure ticks are
+ *   delivered at some non-zero rate, if we detect missed ticks then the
+ *   internal tick alarm is not disabled if the VCPU is preempted during the
+ *   next tick period.
+ *  one_missed_tick_pending:
+ *   Missed interrupts are collapsed together and delivered as one 'late tick'.
+ *   Guest time always tracks wallclock (i.e., real) time.
+ */
+#define HVM_PARAM_TIMER_MODE   10
+#define HVMPTM_delay_for_missed_ticks    0
+#define HVMPTM_no_delay_for_missed_ticks 1
+#define HVMPTM_no_missed_ticks_pending   2
+#define HVMPTM_one_missed_tick_pending   3
+
+/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
+#define HVM_PARAM_HPET_ENABLED 11
+
+/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
+#define HVM_PARAM_IDENT_PT     12
+
+/* Device Model domain, defaults to 0. */
+#define HVM_PARAM_DM_DOMAIN    13
+
+/* ACPI S state: currently support S0 and S3 on x86. */
+#define HVM_PARAM_ACPI_S_STATE 14
+
+/* TSS used on Intel when CR0.PE=0. */
+#define HVM_PARAM_VM86_TSS     15
+
+/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
+#define HVM_PARAM_VPT_ALIGN    16
+
+/* Console debug shared memory ring and event channel */
+#define HVM_PARAM_CONSOLE_PFN    17
+#define HVM_PARAM_CONSOLE_EVTCHN 18
+
+#define HVM_NR_PARAMS          19
+
+#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
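Note on HVM_PARAM_CALLBACK_IRQ above: the comments describe a 64-bit value
whose top byte selects the delivery type. For the ARM PPI type this can be
sketched as below. The function name hvm_callback_ppi() and its parameters
are hypothetical and only illustrate the bit layout given in the comments;
the patch itself does not add such a helper.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from params.h above (ARM/ARM64 branch). */
#define HVM_PARAM_CALLBACK_TYPE_PPI 2

/*
 * Hypothetical helper: build the HVM_PARAM_CALLBACK_IRQ value for a PPI.
 *  val[63:56] = delivery type
 *  val[55:16] = must be zero
 *  val[9]     = active low (1) or active high (0)
 *  val[8]     = edge (1) or level (0) triggered
 *  val[7:0]   = PPI number
 */
static uint64_t hvm_callback_ppi(uint8_t ppi, int edge, int active_low)
{
	uint64_t val = (uint64_t)HVM_PARAM_CALLBACK_TYPE_PPI << 56;

	if (edge)
		val |= UINT64_C(1) << 8;
	if (active_low)
		val |= UINT64_C(1) << 9;

	return val | ppi;
}
```

For instance, a level-triggered, active-low PPI 31 would be encoded as
0x020000000000021F under this layout.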
diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h
new file mode 100644
index 0000000000..7d74c99226
--- /dev/null
+++ b/include/xen/interface/io/blkif.h
@@ -0,0 +1,726 @@
+/******************************************************************************
+ * blkif.h
+ *
+ * Unified block-device I/O interface for Xen guest OSes.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2003-2004, Keir Fraser
+ * Copyright (c) 2012, Spectra Logic Corporation
+ */
+
+#ifndef __XEN_PUBLIC_IO_BLKIF_H__
+#define __XEN_PUBLIC_IO_BLKIF_H__
+
+#include "ring.h"
+#include "../grant_table.h"
+
+/*
+ * Front->back notifications: When enqueuing a new request, sending a
+ * notification can be made conditional on req_event (i.e., the generic
+ * hold-off mechanism provided by the ring macros). Backends must set
+ * req_event appropriately (e.g., using RING_FINAL_CHECK_FOR_REQUESTS()).
+ *
+ * Back->front notifications: When enqueuing a new response, sending a
+ * notification can be made conditional on rsp_event (i.e., the generic
+ * hold-off mechanism provided by the ring macros). Frontends must set
+ * rsp_event appropriately (e.g., using RING_FINAL_CHECK_FOR_RESPONSES()).
+ */
+
+#ifndef blkif_vdev_t
+#define blkif_vdev_t   u16
+#endif
+#define blkif_sector_t u64
+
+/*
+ * Feature and Parameter Negotiation
+ * =================================
+ * The two halves of a Xen block driver utilize nodes within the XenStore to
+ * communicate capabilities and to negotiate operating parameters.  This
+ * section enumerates these nodes which reside in the respective front and
+ * backend portions of the XenStore, following the XenBus convention.
+ *
+ * All data in the XenStore is stored as strings.  Nodes specifying numeric
+ * values are encoded in decimal.  Integer value ranges listed below are
+ * expressed as fixed sized integer types capable of storing the conversion
+ * of a properly formatted node string, without loss of information.
+ *
+ * Any specified default value is in effect if the corresponding XenBus node
+ * is not present in the XenStore.
+ *
+ * XenStore nodes in sections marked "PRIVATE" are solely for use by the
+ * driver side whose XenBus tree contains them.
+ *
+ * XenStore nodes marked "DEPRECATED" in their notes section should only be
+ * used to provide interoperability with legacy implementations.
+ *
+ * See the XenBus state transition diagram below for details on when XenBus
+ * nodes must be published and when they can be queried.
+ *
+ *****************************************************************************
+ *                            Backend XenBus Nodes
+ *****************************************************************************
+ *
+ *------------------ Backend Device Identification (PRIVATE) ------------------
+ *
+ * mode
+ *      Values:         "r" (read only), "w" (writable)
+ *
+ *      The read or write access permissions to the backing store to be
+ *      granted to the frontend.
+ *
+ * params
+ *      Values:         string
+ *
+ *      A free formatted string providing sufficient information for the
+ *      hotplug script to attach the device and provide a suitable
+ *      handler (i.e., a block device) for blkback to use.
+ *
+ * physical-device
+ *      Values:         "MAJOR:MINOR"
+ *      Notes: 11
+ *
+ *      MAJOR and MINOR are the major number and minor number of the
+ *      backing device respectively.
+ *
+ * physical-device-path
+ *      Values:         path string
+ *
+ *      A string that contains the absolute path to the disk image. On
+ *      NetBSD and Linux this is always a block device, while on FreeBSD
+ *      it can be either a block device or a regular file.
+ *
+ * type
+ *      Values:         "file", "phy", "tap"
+ *
+ *      The type of the backing device/object.
+ *
+ *
+ * direct-io-safe
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      The underlying storage is not affected by the direct IO memory
+ *      lifetime bug.  See:
+ *        http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
+ *
+ *      Therefore this option gives the backend permission to use
+ *      O_DIRECT, notwithstanding that bug.
+ *
+ *      That is, if this option is enabled, use of O_DIRECT is safe,
+ *      in circumstances where we would normally have avoided it as a
+ *      workaround for that bug.  This option is not relevant for all
+ *      backends, and even not necessarily supported for those for
+ *      which it is relevant.  A backend which knows that it is not
+ *      affected by the bug can ignore this option.
+ *
+ *      This option doesn't require a backend to use O_DIRECT, so it
+ *      should not be used to try to control the caching behaviour.
+ *
+ *--------------------------------- Features ---------------------------------
+ *
+ * feature-barrier
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      A value of "1" indicates that the backend can process requests
+ *      containing the BLKIF_OP_WRITE_BARRIER request opcode.  Requests
+ *      of this type may still be returned at any time with the
+ *      BLKIF_RSP_EOPNOTSUPP result code.
+ *
+ * feature-flush-cache
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      A value of "1" indicates that the backend can process requests
+ *      containing the BLKIF_OP_FLUSH_DISKCACHE request opcode.  Requests
+ *      of this type may still be returned at any time with the
+ *      BLKIF_RSP_EOPNOTSUPP result code.
+ *
+ * feature-discard
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      A value of "1" indicates that the backend can process requests
+ *      containing the BLKIF_OP_DISCARD request opcode.  Requests
+ *      of this type may still be returned at any time with the
+ *      BLKIF_RSP_EOPNOTSUPP result code.
+ *
+ * feature-persistent
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *      Notes: 7
+ *
+ *      A value of "1" indicates that the backend can keep the grants used
+ *      by the frontend driver mapped, so the same set of grants should be
+ *      used in all transactions. The maximum number of grants the backend
+ *      can map persistently depends on the implementation, but ideally it
+ *      should be RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST. Using this
+ *      feature the backend doesn't need to unmap each grant, preventing
+ *      costly TLB flushes. The backend driver should only map grants
+ *      persistently if the frontend supports it. If a backend driver chooses
+ *      to use the persistent protocol when the frontend doesn't support it,
+ *      it will probably hit the maximum number of persistently mapped grants
+ *      (due to the fact that the frontend won't be reusing the same grants),
+ *      and fall back to non-persistent mode. Backend implementations may
+ *      shrink or expand the number of persistently mapped grants without
+ *      notifying the frontend depending on memory constraints (this might
+ *      cause a performance degradation).
+ *
+ *      If a backend driver wants to limit the maximum number of persistently
+ *      mapped grants to a value less than RING_SIZE *
+ *      BLKIF_MAX_SEGMENTS_PER_REQUEST a LRU strategy should be used to
+ *      discard the grants that are less commonly used. Using a LRU in the
+ *      backend driver paired with a LIFO queue in the frontend will
+ *      allow us to have better performance in this scenario.
+ *
+ *----------------------- Request Transport Parameters ------------------------
+ *
+ * max-ring-page-order
+ *      Values:         <uint32_t>
+ *      Default Value:  0
+ *      Notes:          1, 3
+ *
+ *      The maximum supported size of the request ring buffer in units of
+ *      log2(machine pages) (e.g., 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
+ *      etc.).
+ *
+ * max-ring-pages
+ *      Values:         <uint32_t>
+ *      Default Value:  1
+ *      Notes:          DEPRECATED, 2, 3
+ *
+ *      The maximum supported size of the request ring buffer in units of
+ *      machine pages.  The value must be a power of 2.
+ *
+ *------------------------- Backend Device Properties -------------------------
+ *
+ * discard-enable
+ *      Values:         0/1 (boolean)
+ *      Default Value:  1
+ *
+ *      This optional property, set by the toolstack, instructs the backend
+ *      to offer (or not to offer) discard to the frontend. If the property
+ *      is missing the backend should offer discard if the backing storage
+ *      actually supports it.
+ *
+ * discard-alignment
+ *      Values:         <uint32_t>
+ *      Default Value:  0
+ *      Notes:          4, 5
+ *
+ *      The offset, in bytes from the beginning of the virtual block device,
+ *      to the first, addressable, discard extent on the underlying device.
+ *
+ * discard-granularity
+ *      Values:         <uint32_t>
+ *      Default Value:  <"sector-size">
+ *      Notes:          4
+ *
+ *      The size, in bytes, of the individually addressable discard extents
+ *      of the underlying device.
+ *
+ * discard-secure
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *      Notes:          10
+ *
+ *      A value of "1" indicates that the backend can process BLKIF_OP_DISCARD
+ *      requests with the BLKIF_DISCARD_SECURE flag set.
+ *
+ * info
+ *      Values:         <uint32_t> (bitmap)
+ *
+ *      A collection of bit flags describing attributes of the backing
+ *      device.  The VDISK_* macros define the meaning of each bit
+ *      location.
+ *
+ * sector-size
+ *      Values:         <uint32_t>
+ *
+ *      The logical block size, in bytes, of the underlying storage. This
+ *      must be a power of two with a minimum value of 512.
+ *
+ *      NOTE: Because of implementation bugs in some frontends this must be
+ *            set to 512, unless the frontend advertises a non-zero value
+ *            in its "feature-large-sector-size" xenbus node. (See below).
+ *
+ * physical-sector-size
+ *      Values:         <uint32_t>
+ *      Default Value:  <"sector-size">
+ *
+ *      The physical block size, in bytes, of the backend storage. This
+ *      must be an integer multiple of "sector-size".
+ *
+ * sectors
+ *      Values:         <u64>
+ *
+ *      The size of the backend device, expressed in units of "sector-size".
+ *      The product of "sector-size" and "sectors" must also be an integer
+ *      multiple of "physical-sector-size", if that node is present.
+ *
+ *****************************************************************************
+ *                            Frontend XenBus Nodes
+ *****************************************************************************
+ *
+ *----------------------- Request Transport Parameters -----------------------
+ *
+ * event-channel
+ *      Values:         <uint32_t>
+ *
+ *      The identifier of the Xen event channel used to signal activity
+ *      in the ring buffer.
+ *
+ * ring-ref
+ *      Values:         <uint32_t>
+ *      Notes:          6
+ *
+ *      The Xen grant reference granting permission for the backend to map
+ *      the sole page in a single page sized ring buffer.
+ *
+ * ring-ref%u
+ *      Values:         <uint32_t>
+ *      Notes:          6
+ *
+ *      For a frontend providing a multi-page ring, a "number of ring pages"
+ *      sized list of nodes, each containing a Xen grant reference granting
+ *      permission for the backend to map the page of the ring located
+ *      at page index "%u".  Page indexes are zero based.
+ *
+ * protocol
+ *      Values:         string (XEN_IO_PROTO_ABI_*)
+ *      Default Value:  XEN_IO_PROTO_ABI_NATIVE
+ *
+ *      The machine ABI rules governing the format of all ring request and
+ *      response structures.
+ *
+ * ring-page-order
+ *      Values:         <uint32_t>
+ *      Default Value:  0
+ *      Maximum Value:  MAX(ffs(max-ring-pages) - 1, max-ring-page-order)
+ *      Notes:          1, 3
+ *
+ *      The size of the frontend allocated request ring buffer in units
+ *      of log2(machine pages) (e.g., 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
+ *      etc.).
+ *
+ * num-ring-pages
+ *      Values:         <uint32_t>
+ *      Default Value:  1
+ *      Maximum Value:  MAX(max-ring-pages,(0x1 << max-ring-page-order))
+ *      Notes:          DEPRECATED, 2, 3
+ *
+ *      The size of the frontend allocated request ring buffer in units of
+ *      machine pages.  The value must be a power of 2.
+ *
+ *--------------------------------- Features ---------------------------------
+ *
+ * feature-persistent
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *      Notes: 7, 8, 9
+ *
+ *      A value of "1" indicates that the frontend will reuse the same grants
+ *      for all transactions, allowing the backend to map them with write
+ *      access (even when it should be read-only). If the frontend hits the
+ *      maximum number of allowed persistently mapped grants, it can fall back
+ *      to non-persistent mode. This will cause a performance degradation,
+ *      since the backend driver will still try to map those grants
+ *      persistently. Since the persistent grants protocol is compatible with
+ *      the previous protocol, a frontend driver can choose to work in
+ *      persistent mode even when the backend doesn't support it.
+ *
+ *      It is recommended that the frontend driver stores the persistently
+ *      mapped grants in a LIFO queue, so a subset of all persistently mapped
+ *      grants gets used commonly. This is done in case the backend driver
+ *      decides to limit the maximum number of persistently mapped grants
+ *      to a value less than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
+ *
+ * feature-large-sector-size
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      A value of "1" indicates that the frontend will correctly supply and
+ *      interpret all sector-based quantities in terms of the "sector-size"
+ *      value supplied in the backend info, whatever that may be set to.
+ *      If this node is not present or its value is "0" then it is assumed
+ *      that the frontend requires that the logical block size is 512 as it
+ *      is hardcoded (which is the case in some frontend implementations).
+ *
+ *------------------------- Virtual Device Properties -------------------------
+ *
+ * device-type
+ *      Values:         "disk", "cdrom", "floppy", etc.
+ *
+ * virtual-device
+ *      Values:         <uint32_t>
+ *
+ *      A value indicating the physical device to virtualize within the
+ *      frontend's domain.  (e.g. "The first ATA disk", "The third SCSI
+ *      disk", etc.)
+ *
+ *      See docs/misc/vbd-interface.txt for details on the format of this
+ *      value.
+ *
+ * Notes
+ * -----
+ * (1) Multi-page ring buffer scheme first developed in the Citrix XenServer
+ *     PV drivers.
+ * (2) Multi-page ring buffer scheme first used in some RedHat distributions
+ *     including a distribution deployed on certain nodes of the Amazon
+ *     EC2 cluster.
+ * (3) Support for multi-page ring buffers was implemented independently,
+ *     in slightly different forms, by both Citrix and RedHat/Amazon.
+ *     For full interoperability, block front and backends should publish
+ *     identical ring parameters, adjusted for unit differences, to the
+ *     XenStore nodes used in both schemes.
+ * (4) Devices that support discard functionality may internally allocate space
+ *     (discardable extents) in units that are larger than the exported logical
+ *     block size. If the backing device has such discardable extents the
+ *     backend should provide both discard-granularity and discard-alignment.
+ *     Providing just one of the two may be considered an error by the frontend.
+ *     Backends supporting discard should include discard-granularity and
+ *     discard-alignment even if they support discarding individual sectors.
+ *     Frontends should assume discard-alignment == 0 and discard-granularity
+ *     == sector size if these keys are missing.
+ * (5) The discard-alignment parameter allows a physical device to be
+ *     partitioned into virtual devices that do not necessarily begin or
+ *     end on a discardable extent boundary.
+ * (6) When there is only a single page allocated to the request ring,
+ *     'ring-ref' is used to communicate the grant reference for this
+ *     page to the backend.  When using a multi-page ring, the 'ring-ref'
+ *     node is not created.  Instead 'ring-ref0' - 'ring-refN' are used.
+ * (7) When using persistent grants, data has to be copied from/to the page
+ *     where the grant is currently mapped. The overhead of this copy,
+ *     however, does not outweigh the speed improvement of not having to
+ *     unmap the grants.
+ * (8) The frontend driver has to allow the backend driver to map all grants
+ *     with write access, even when they should be mapped read-only, since
+ *     further requests may reuse these grants and require write permissions.
+ * (9) The Linux implementation doesn't have a limit on the maximum number of
+ *     grants that can be persistently mapped in the frontend driver, but
+ *     due to the frontend driver implementation it should never be bigger
+ *     than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
+ *(10) The discard-secure property may be present and will be set to 1 if the
+ *     backing device supports secure discard.
+ *(11) Only used by Linux and NetBSD.
+ */
+
+/*
+ * Multiple hardware queues/rings:
+ * If supported, the backend will write the key "multi-queue-max-queues" to
+ * the directory for that vbd, and set its value to the maximum supported
+ * number of queues.
+ * Frontends that are aware of this feature and wish to use it can write the
+ * key "multi-queue-num-queues" with the number they wish to use, which must be
+ * greater than zero, and no more than the value reported by the backend in
+ * "multi-queue-max-queues".
+ *
+ * For frontends requesting just one queue, the usual event-channel and
+ * ring-ref keys are written as before, simplifying the backend processing
+ * to avoid distinguishing between a frontend that doesn't understand the
+ * multi-queue feature, and one that does, but requested only one queue.
+ *
+ * Frontends requesting two or more queues must not write the toplevel
+ * event-channel and ring-ref keys, instead writing those keys under sub-keys
+ * having the name "queue-N" where N is the integer ID of the queue/ring for
+ * which those keys belong. Queues are indexed from zero.
+ * For example, a frontend with two queues must write the following set of
+ * queue-related keys:
+ *
+ * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
+ * /local/domain/1/device/vbd/0/queue-0 = ""
+ * /local/domain/1/device/vbd/0/queue-0/ring-ref = "<ring-ref#0>"
+ * /local/domain/1/device/vbd/0/queue-0/event-channel = "<evtchn#0>"
+ * /local/domain/1/device/vbd/0/queue-1 = ""
+ * /local/domain/1/device/vbd/0/queue-1/ring-ref = "<ring-ref#1>"
+ * /local/domain/1/device/vbd/0/queue-1/event-channel = "<evtchn#1>"
+ *
+ * It is also possible to use multiple queues/rings together with
+ * the multi-page ring buffer feature.
+ * For example, a frontend that requests two queues/rings, each with a
+ * two-page ring buffer, must write the following set of related keys:
+ *
+ * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
+ * /local/domain/1/device/vbd/0/ring-page-order = "1"
+ * /local/domain/1/device/vbd/0/queue-0 = ""
+ * /local/domain/1/device/vbd/0/queue-0/ring-ref0 = "<ring-ref#0>"
+ * /local/domain/1/device/vbd/0/queue-0/ring-ref1 = "<ring-ref#1>"
+ * /local/domain/1/device/vbd/0/queue-0/event-channel = "<evtchn#0>"
+ * /local/domain/1/device/vbd/0/queue-1 = ""
+ * /local/domain/1/device/vbd/0/queue-1/ring-ref0 = "<ring-ref#2>"
+ * /local/domain/1/device/vbd/0/queue-1/ring-ref1 = "<ring-ref#3>"
+ * /local/domain/1/device/vbd/0/queue-1/event-channel = "<evtchn#1>"
+ *
+ */
+
+/*
+ * STATE DIAGRAMS
+ *
+ *****************************************************************************
+ *                                   Startup                                 *
+ *****************************************************************************
+ *
+ * Tool stack creates front and back nodes with state XenbusStateInitialising.
+ *
+ * Front                                Back
+ * =================================    =====================================
+ * XenbusStateInitialising              XenbusStateInitialising
+ *  o Query virtual device               o Query backend device identification
+ *    properties.                          data.
+ *  o Setup OS device instance.          o Open and validate backend device.
+ *                                       o Publish backend features and
+ *                                         transport parameters.
+ *                                                      |
+ *                                                      |
+ *                                                      V
+ *                                      XenbusStateInitWait
+ *
+ * o Query backend features and
+ *   transport parameters.
+ * o Allocate and initialize the
+ *   request ring.
+ * o Publish transport parameters
+ *   that will be in effect during
+ *   this connection.
+ *              |
+ *              |
+ *              V
+ * XenbusStateInitialised
+ *
+ *                                       o Query frontend transport parameters.
+ *                                       o Connect to the request ring and
+ *                                         event channel.
+ *                                       o Publish backend device properties.
+ *                                                      |
+ *                                                      |
+ *                                                      V
+ *                                      XenbusStateConnected
+ *
+ *  o Query backend device properties.
+ *  o Finalize OS virtual device
+ *    instance.
+ *              |
+ *              |
+ *              V
+ * XenbusStateConnected
+ *
+ * Note: Drivers that do not support any optional features, or the negotiation
+ *       of transport parameters, can skip certain states in the state machine:
+ *
+ *       o A frontend may transition to XenbusStateInitialised without
+ *         waiting for the backend to enter XenbusStateInitWait.  In this
+ *         case, default transport parameters are in effect and any
+ *         transport parameters published by the frontend must contain
+ *         their default values.
+ *
+ *       o A backend may transition to XenbusStateInitialised, bypassing
+ *         XenbusStateInitWait, without waiting for the frontend to first
+ *         enter the XenbusStateInitialised state.  In this case, default
+ *         transport parameters are in effect and any transport parameters
+ *         published by the backend must contain their default values.
+ *
+ *       Drivers that support optional features and/or transport parameter
+ *       negotiation must tolerate these additional state transition paths.
+ *       In general this means performing the work of any skipped state
+ *       transition, if it has not already been performed, in addition to the
+ *       work associated with entry into the current state.
+ */
+
+/*
+ * REQUEST CODES.
+ */
+#define BLKIF_OP_READ              0
+#define BLKIF_OP_WRITE             1
+/*
+ * All writes issued prior to a request with the BLKIF_OP_WRITE_BARRIER
+ * operation code ("barrier request") must be completed prior to the
+ * execution of the barrier request.  All writes issued after the barrier
+ * request must not execute until after the completion of the barrier request.
+ *
+ * Optional.  See "feature-barrier" XenBus node documentation above.
+ */
+#define BLKIF_OP_WRITE_BARRIER     2
+/*
+ * Commit any uncommitted contents of the backing device's volatile cache
+ * to stable storage.
+ *
+ * Optional.  See "feature-flush-cache" XenBus node documentation above.
+ */
+#define BLKIF_OP_FLUSH_DISKCACHE   3
+/*
+ * Used in SLES sources for device specific command packet
+ * contained within the request. Reserved for that purpose.
+ */
+#define BLKIF_OP_RESERVED_1        4
+/*
+ * Indicate to the backend device that a region of storage is no longer in
+ * use, and may be discarded at any time without impact to the client.  If
+ * the BLKIF_DISCARD_SECURE flag is set on the request, all copies of the
+ * discarded region on the device must be rendered unrecoverable before the
+ * command returns.
+ *
+ * This operation is analogous to performing a trim (ATA) or unmap (SCSI)
+ * command on a native device.
+ *
+ * More information about trim/unmap operations can be found at:
+ * http://t13.org/Documents/UploadedDocuments/docs2008/
+ *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
+ * http://www.seagate.com/staticfiles/support/disc/manuals/
+ *     Interface%20manuals/100293068c.pdf
+ *
+ * Optional.  See "feature-discard", "discard-alignment",
+ * "discard-granularity", and "discard-secure" in the XenBus node
+ * documentation above.
+ */
+#define BLKIF_OP_DISCARD           5
+
+/*
+ * Recognized if "feature-max-indirect-segments" is present in the backend
+ * xenbus info. The "feature-max-indirect-segments" node contains the maximum
+ * number of segments allowed by the backend per request. If the node is
+ * present, the frontend might use blkif_request_indirect structs in order to
+ * issue requests with more than BLKIF_MAX_SEGMENTS_PER_REQUEST (11). The
+ * maximum number of indirect segments is fixed by the backend, but the
+ * frontend can issue requests with any number of indirect segments as long as
+ * it's less than the number provided by the backend. The indirect_grefs field
+ * in blkif_request_indirect should be filled by the frontend with the
+ * grant references of the pages that are holding the indirect segments.
+ * These pages are filled with an array of blkif_request_segment that hold the
+ * information about the segments. The number of indirect pages to use is
+ * determined by the number of segments an indirect request contains. Every
+ * indirect page can contain a maximum of
+ * (PAGE_SIZE / sizeof(struct blkif_request_segment)) segments, so to
+ * calculate the number of indirect pages to use we have to do
+ * ceil(indirect_segments / (PAGE_SIZE / sizeof(struct blkif_request_segment))).
+ *
+ * If a backend does not recognize BLKIF_OP_INDIRECT, it should *not*
+ * create the "feature-max-indirect-segments" node!
+ */
+#define BLKIF_OP_INDIRECT          6
+
+/*
+ * Maximum scatter/gather segments per request.
+ * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
+ * NB. This could be 12 if the ring indexes weren't stored in the same page.
+ */
+#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
+
+/*
+ * Maximum number of indirect pages to use per request.
+ */
+#define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
+
+/*
+ * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
+ * 'sector_number' in blkif_request, blkif_request_discard and
+ * blkif_request_indirect are sector-based quantities. See the description
+ * of the "feature-large-sector-size" frontend xenbus node above for
+ * more information.
+ */
+struct blkif_request_segment {
+	grant_ref_t gref;        /* reference to I/O buffer frame        */
+	/* @first_sect: first sector in frame to transfer (inclusive).   */
+	/* @last_sect: last sector in frame to transfer (inclusive).     */
+	u8     first_sect, last_sect;
+};
+
+/*
+ * Starting ring element for any I/O request.
+ */
+struct blkif_request {
+	u8        operation;    /* BLKIF_OP_???                         */
+	u8        nr_segments;  /* number of segments                   */
+	blkif_vdev_t   handle;       /* only for read/write requests         */
+	u64       id;           /* private guest value, echoed in resp  */
+	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
+	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+
+typedef struct blkif_request blkif_request_t;
+
+/*
+ * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD
+ * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
+ */
+struct blkif_request_discard {
+	u8        operation;    /* BLKIF_OP_DISCARD                     */
+	u8        flag;         /* BLKIF_DISCARD_SECURE or zero         */
+#define BLKIF_DISCARD_SECURE (1 << 0)  /* ignored if discard-secure=0      */
+	blkif_vdev_t   handle;       /* same as for read/write requests      */
+	u64       id;           /* private guest value, echoed in resp  */
+	blkif_sector_t sector_number;/* start sector idx on disk             */
+	u64       nr_sectors;   /* number of contiguous sectors to discard*/
+};
+
+typedef struct blkif_request_discard blkif_request_discard_t;
+
+struct blkif_request_indirect {
+	u8        operation;    /* BLKIF_OP_INDIRECT                    */
+	u8        indirect_op;  /* BLKIF_OP_{READ/WRITE}                */
+	u16       nr_segments;  /* number of segments                   */
+	u64       id;           /* private guest value, echoed in resp  */
+	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
+	blkif_vdev_t   handle;       /* same as for read/write requests      */
+	grant_ref_t    indirect_grefs[BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST];
+#ifdef __i386__
+	u64       pad;          /* Make it 64 byte aligned on i386      */
+#endif
+};
+
+typedef struct blkif_request_indirect blkif_request_indirect_t;
+
+struct blkif_response {
+	u64        id;              /* copied from request */
+	u8         operation;       /* copied from request */
+	s16         status;          /* BLKIF_RSP_???       */
+};
+
+typedef struct blkif_response blkif_response_t;
+
+/*
+ * STATUS RETURN CODES.
+ */
+ /* Operation not supported (only happens on barrier writes). */
+#define BLKIF_RSP_EOPNOTSUPP  -2
+ /* Operation failed for some unspecified reason (-EIO). */
+#define BLKIF_RSP_ERROR       -1
+ /* Operation completed successfully. */
+#define BLKIF_RSP_OKAY         0
+
+/*
+ * Generate blkif ring structures and types.
+ */
+DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response);
+
+#define VDISK_CDROM        0x1
+#define VDISK_REMOVABLE    0x2
+#define VDISK_READONLY     0x4
+
+#endif /* __XEN_PUBLIC_IO_BLKIF_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
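
A note on the BLKIF_OP_INDIRECT comment in blkif.h above: the page-count
formula it gives can be checked with a small standalone sketch. This is only
an illustration under stated assumptions: 4 KiB pages, grant_ref_t taken to be
a 32-bit integer, and a local mirror of struct blkif_request_segment. All
sketch_* and SKETCH_* names are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB Xen page */

/* Local mirror of struct blkif_request_segment (assumes 32-bit grant_ref_t). */
struct sketch_segment {
	uint32_t gref;
	uint8_t first_sect, last_sect;
};

/* The arithmetic below relies on the padded size being 8 bytes. */
static_assert(sizeof(struct sketch_segment) == 8,
	      "unexpected segment descriptor size");

/* Segment descriptors that fit in one indirect page. */
#define SKETCH_SEGS_PER_PAGE \
	(SKETCH_PAGE_SIZE / sizeof(struct sketch_segment))

/* ceil(nr_segments / SKETCH_SEGS_PER_PAGE), as described in the comment. */
static inline unsigned int sketch_indirect_pages(unsigned int nr_segments)
{
	return (nr_segments + SKETCH_SEGS_PER_PAGE - 1) / SKETCH_SEGS_PER_PAGE;
}
```

Under these assumptions one indirect page holds 512 segment descriptors, so
BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST (8) would cap a request at 4096 segments,
subject to the limit the backend advertises.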
diff --git a/include/xen/interface/io/console.h b/include/xen/interface/io/console.h
new file mode 100644
index 0000000000..3489fc7a60
--- /dev/null
+++ b/include/xen/interface/io/console.h
@@ -0,0 +1,56 @@
+/******************************************************************************
+ * console.h
+ *
+ * Console I/O interface for Xen guest OSes.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2005, Keir Fraser
+ */
+
+#ifndef __XEN_PUBLIC_IO_CONSOLE_H__
+#define __XEN_PUBLIC_IO_CONSOLE_H__
+
+typedef u32 XENCONS_RING_IDX;
+
+#define MASK_XENCONS_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))
+
+struct xencons_interface {
+	char in[1024];
+	char out[2048];
+	XENCONS_RING_IDX in_cons, in_prod;
+	XENCONS_RING_IDX out_cons, out_prod;
+};
+
+#ifdef XEN_WANT_FLEX_CONSOLE_RING
+#include "ring.h"
+DEFINE_XEN_FLEX_RING(xencons);
+#endif
+
+#endif /* __XEN_PUBLIC_IO_CONSOLE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
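
For readers new to console.h above: MASK_XENCONS_IDX relies on the ring arrays
having power-of-two sizes, so free-running producer/consumer indexes can be
masked instead of wrapped by hand. The sketch below copies the definitions
from the header so it stands alone and shows one possible way a frontend
might queue output bytes. sketch_cons_write() is a hypothetical helper, not
the driver from this series, and the memory barriers and event-channel
notification a real driver needs are only marked by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t XENCONS_RING_IDX;

#define MASK_XENCONS_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))

struct xencons_interface {
	char in[1024];
	char out[2048];
	XENCONS_RING_IDX in_cons, in_prod;
	XENCONS_RING_IDX out_cons, out_prod;
};

#define SKETCH_OUT_SIZE sizeof(((struct xencons_interface *)0)->out)

/* Masking with (size - 1) only works for power-of-two ring sizes. */
static_assert((SKETCH_OUT_SIZE & (SKETCH_OUT_SIZE - 1)) == 0,
	      "out ring size must be a power of two");

/* Hypothetical helper: queue up to len bytes, return how many were queued. */
static size_t sketch_cons_write(struct xencons_interface *intf,
				const char *data, size_t len)
{
	XENCONS_RING_IDX cons = intf->out_cons;
	XENCONS_RING_IDX prod = intf->out_prod;
	size_t sent = 0;

	/* A real driver needs a memory barrier after reading the indexes. */
	while (sent < len && (prod - cons) < sizeof(intf->out))
		intf->out[MASK_XENCONS_IDX(prod++, intf->out)] = data[sent++];

	/* A real driver needs a write barrier here, then notifies the backend. */
	intf->out_prod = prod;
	return sent;
}
```

The indexes are never reduced modulo the ring size when stored; only the array
access is masked. That is why (prod - cons) gives the number of queued bytes
directly, even after the 32-bit counters wrap.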
diff --git a/include/xen/interface/io/protocols.h b/include/xen/interface/io/protocols.h
new file mode 100644
index 0000000000..52b4de0f81
--- /dev/null
+++ b/include/xen/interface/io/protocols.h
@@ -0,0 +1,42 @@
+/******************************************************************************
+ * protocols.h
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2008, Keir Fraser
+ */
+
+#ifndef __XEN_PROTOCOLS_H__
+#define __XEN_PROTOCOLS_H__
+
+#define XEN_IO_PROTO_ABI_X86_32     "x86_32-abi"
+#define XEN_IO_PROTO_ABI_X86_64     "x86_64-abi"
+#define XEN_IO_PROTO_ABI_ARM        "arm-abi"
+
+#if defined(__i386__)
+# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_32
+#elif defined(__x86_64__)
+# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_64
+#elif defined(__arm__) || defined(__aarch64__)
+# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_ARM
+#else
+# error arch fixup needed here
+#endif
+
+#endif
diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
new file mode 100644
index 0000000000..4e02678e3c
--- /dev/null
+++ b/include/xen/interface/io/ring.h
@@ -0,0 +1,479 @@
+/******************************************************************************
+ * ring.h
+ *
+ * Shared producer-consumer ring macros.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Tim Deegan and Andrew Warfield November 2004.
+ */
+
+#ifndef __XEN_PUBLIC_IO_RING_H__
+#define __XEN_PUBLIC_IO_RING_H__
+
+/*
+ * When #include'ing this header, you need to provide the following
+ * declaration upfront:
+ * - standard integer types (u8, u16, etc)
+ * They are provided by stdint.h of the standard headers.
+ *
+ * In addition, if you intend to use the FLEX macros, you also need to
+ * provide the following, before invoking the FLEX macros:
+ * - size_t
+ * - memcpy
+ * - grant_ref_t
+ * These declarations are provided by string.h of the standard headers,
+ * and grant_table.h from the Xen public headers.
+ */
+
+#include <xen/interface/grant_table.h>
+
+typedef unsigned int RING_IDX;
+
+/* Round a 32-bit unsigned constant down to the nearest power of two. */
+#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2                  : ((_x) & 0x1))
+#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2    : __RD2(_x))
+#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4    : __RD4(_x))
+#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8    : __RD8(_x))
+#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))
+
+/*
+ * Calculate size of a shared ring, given the total available space for the
+ * ring and indexes (_sz), and the name tag of the request/response structure.
+ * A ring contains as many entries as will fit, rounded down to the nearest
+ * power of two (so we can mask with (size-1) to loop around).
+ */
+#define __CONST_RING_SIZE(_s, _sz) \
+	(__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \
+		sizeof(((struct _s##_sring *)0)->ring[0])))
+/*
+ * The same for passing in an actual pointer instead of a name tag.
+ */
+#define __RING_SIZE(_s, _sz) \
+	(__RD32(((_sz) - (long)(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
+
+/*
+ * Macros to make the correct C datatypes for a new kind of ring.
+ *
+ * To make a new ring datatype, you need to have two message structures,
+ * let's say request_t, and response_t already defined.
+ *
+ * In a header where you want the ring datatype declared, you then do:
+ *
+ *     DEFINE_RING_TYPES(mytag, request_t, response_t);
+ *
+ * These expand out to give you a set of types, as you can see below.
+ * The most important of these are:
+ *
+ *     mytag_sring_t      - The shared ring.
+ *     mytag_front_ring_t - The 'front' half of the ring.
+ *     mytag_back_ring_t  - The 'back' half of the ring.
+ *
+ * To initialize a ring in your code you need to know the location and size
+ * of the shared memory area (PAGE_SIZE, for instance). To initialise
+ * the front half:
+ *
+ *     mytag_front_ring_t front_ring;
+ *     SHARED_RING_INIT((mytag_sring_t *)shared_page);
+ *     FRONT_RING_INIT(&front_ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
+ *
+ * Initializing the back follows similarly (note that only the front
+ * initializes the shared ring):
+ *
+ *     mytag_back_ring_t back_ring;
+ *     BACK_RING_INIT(&back_ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
+ */
+
+#define DEFINE_RING_TYPES(__name, __req_t, __rsp_t)                               \
+										  \
+/* Shared ring entry */                                                           \
+union __name##_sring_entry {                                                      \
+	__req_t req;                                                              \
+	__rsp_t rsp;                                                              \
+};                                                                                \
+										  \
+/* Shared ring page */                                                            \
+struct __name##_sring {                                                           \
+	RING_IDX req_prod, req_event;                                             \
+	RING_IDX rsp_prod, rsp_event;                                             \
+	union {                                                                   \
+		struct {                                                          \
+			u8 smartpoll_active;                                      \
+		} netif;                                                          \
+		struct {                                                          \
+			u8 msg;                                                   \
+		} tapif_user;                                                     \
+		u8 pvt_pad[4];                                                    \
+	} pvt;                                                                    \
+	u8 __pad[44];                                                             \
+	union __name##_sring_entry ring[1]; /* variable-length */                 \
+};                                                                                \
+										  \
+/* "Front" end's private variables */                                             \
+struct __name##_front_ring {                                                      \
+	RING_IDX req_prod_pvt;                                                    \
+	RING_IDX rsp_cons;                                                        \
+	unsigned int nr_ents;                                                     \
+	struct __name##_sring *sring;                                             \
+};                                                                                \
+										  \
+/* "Back" end's private variables */                                              \
+struct __name##_back_ring {                                                       \
+	RING_IDX rsp_prod_pvt;                                                    \
+	RING_IDX req_cons;                                                        \
+	unsigned int nr_ents;                                                     \
+	struct __name##_sring *sring;                                             \
+};                                                                                \
+										  \
+/* Syntactic sugar */                                                             \
+typedef struct __name##_sring __name##_sring_t;                                   \
+typedef struct __name##_front_ring __name##_front_ring_t;                         \
+typedef struct __name##_back_ring __name##_back_ring_t
+
+/*
+ * Macros for manipulating rings.
+ *
+ * FRONT_RING_whatever works on the "front end" of a ring: here
+ * requests are pushed on to the ring and responses taken off it.
+ *
+ * BACK_RING_whatever works on the "back end" of a ring: here
+ * requests are taken off the ring and responses put on.
+ *
+ * N.B. these macros do NO INTERLOCKS OR FLOW CONTROL.
+ * This is OK in 1-for-1 request-response situations where the
+ * requestor (front end) never has more than RING_SIZE()-1
+ * outstanding requests.
+ */
+
+/* Initialising empty rings */
+#define SHARED_RING_INIT(_s) do {                                                 \
+	(_s)->req_prod  = (_s)->rsp_prod  = 0;                                    \
+	(_s)->req_event = (_s)->rsp_event = 1;                                    \
+	(void)memset((_s)->pvt.pvt_pad, 0, sizeof((_s)->pvt.pvt_pad));            \
+	(void)memset((_s)->__pad, 0, sizeof((_s)->__pad));                        \
+} while (0)
+
+#define FRONT_RING_INIT(_r, _s, __size) do {                                      \
+	(_r)->req_prod_pvt = 0;                                                   \
+	(_r)->rsp_cons = 0;                                                       \
+	(_r)->nr_ents = __RING_SIZE(_s, __size);                                  \
+	(_r)->sring = (_s);                                                       \
+} while (0)
+
+#define BACK_RING_INIT(_r, _s, __size) do {                                       \
+	(_r)->rsp_prod_pvt = 0;                                                   \
+	(_r)->req_cons = 0;                                                       \
+	(_r)->nr_ents = __RING_SIZE(_s, __size);                                  \
+	(_r)->sring = (_s);                                                       \
+} while (0)
+
+/* How big is this ring? */
+#define RING_SIZE(_r)                                                             \
+	((_r)->nr_ents)
+
+/* Number of free requests (for use on front side only). */
+#define RING_FREE_REQUESTS(_r)                                                    \
+	(RING_SIZE(_r) - ((_r)->req_prod_pvt - (_r)->rsp_cons))
+
+/* Test if there is an empty slot available on the front ring.
+ * (This is only meaningful from the front.)
+ */
+#define RING_FULL(_r)                                                             \
+	(RING_FREE_REQUESTS(_r) == 0)
+
+/* Test if there are outstanding messages to be processed on a ring. */
+#define RING_HAS_UNCONSUMED_RESPONSES(_r)                                         \
+	((_r)->sring->rsp_prod - (_r)->rsp_cons)
+
+#ifdef __GNUC__
+#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                                       \
+	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;                \
+	unsigned int rsp = RING_SIZE(_r) -                                        \
+		((_r)->req_cons - (_r)->rsp_prod_pvt);                            \
+	req < rsp ? req : rsp;                                                    \
+})
+#else
+/* Same as above, but without the nice GCC ({ ... }) syntax. */
+#define RING_HAS_UNCONSUMED_REQUESTS(_r)                                          \
+	((((_r)->sring->req_prod - (_r)->req_cons) <                              \
+	  (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?              \
+	 ((_r)->sring->req_prod - (_r)->req_cons) :                               \
+	 (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
+#endif
+
+/* Direct access to individual ring elements, by index. */
+#define RING_GET_REQUEST(_r, _idx)                                                \
+	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
+
+/*
+ * Get a local copy of a request.
+ *
+ * Use this in preference to RING_GET_REQUEST() so all processing is
+ * done on a local copy that cannot be modified by the other end.
+ *
+ * Note that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 may cause this
+ * to be ineffective where _req is a struct which consists of only bitfields.
+ */
+#define RING_COPY_REQUEST(_r, _idx, _req) do {				          \
+	/* Use volatile to force the copy into _req. */			          \
+	*(_req) = *(volatile typeof(_req))RING_GET_REQUEST(_r, _idx);	          \
+} while (0)
+
+#define RING_GET_RESPONSE(_r, _idx)                                               \
+	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp))
+
+/* Loop termination condition: Would the specified index overflow the ring? */
+#define RING_REQUEST_CONS_OVERFLOW(_r, _cons)                                     \
+	(((_cons) - (_r)->rsp_prod_pvt) >= RING_SIZE(_r))
+
+/* Ill-behaved frontend determination: Can there be this many requests? */
+#define RING_REQUEST_PROD_OVERFLOW(_r, _prod)                                     \
+	(((_prod) - (_r)->rsp_prod_pvt) > RING_SIZE(_r))
+
+#define RING_PUSH_REQUESTS(_r) do {                                               \
+	xen_wmb(); /* back sees requests /before/ updated producer index */       \
+	(_r)->sring->req_prod = (_r)->req_prod_pvt;                               \
+} while (0)
+
+#define RING_PUSH_RESPONSES(_r) do {                                              \
+	xen_wmb(); /* front sees resps /before/ updated producer index */         \
+	(_r)->sring->rsp_prod = (_r)->rsp_prod_pvt;                               \
+} while (0)
+
+/*
+ * Notification hold-off (req_event and rsp_event):
+ *
+ * When queueing requests or responses on a shared ring, it may not always be
+ * necessary to notify the remote end. For example, if requests are in flight
+ * in a backend, the front may be able to queue further requests without
+ * notifying the back (if the back checks for new requests when it queues
+ * responses).
+ *
+ * When enqueuing requests or responses:
+ *
+ *  Use RING_PUSH_{REQUESTS,RESPONSES}_AND_CHECK_NOTIFY(). The second argument
+ *  is a boolean return value. True indicates that the receiver requires an
+ *  asynchronous notification.
+ *
+ * After dequeuing requests or responses (before sleeping the connection):
+ *
+ *  Use RING_FINAL_CHECK_FOR_REQUESTS() or RING_FINAL_CHECK_FOR_RESPONSES().
+ *  The second argument is a boolean return value. True indicates that there
+ *  are pending messages on the ring (i.e., the connection should not be put
+ *  to sleep).
+ *
+ *  These macros will set the req_event/rsp_event field to trigger a
+ *  notification on the very next message that is enqueued. If you want to
+ *  create batches of work (i.e., only receive a notification after several
+ *  messages have been enqueued) then you will need to create a customised
+ *  version of the FINAL_CHECK macro in your own code, which sets the event
+ *  field appropriately.
+ */
+
+#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do {                     \
+	RING_IDX __old = (_r)->sring->req_prod;                                   \
+	RING_IDX __new = (_r)->req_prod_pvt;                                      \
+	xen_wmb(); /* back sees requests /before/ updated producer index */       \
+	(_r)->sring->req_prod = __new;                                            \
+	xen_mb(); /* back sees new requests /before/ we check req_event */        \
+	(_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) <                 \
+				 (RING_IDX)(__new - __old));                      \
+} while (0)
+
+#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do {                    \
+	RING_IDX __old = (_r)->sring->rsp_prod;                                   \
+	RING_IDX __new = (_r)->rsp_prod_pvt;                                      \
+	xen_wmb(); /* front sees resps /before/ updated producer index */         \
+	(_r)->sring->rsp_prod = __new;                                            \
+	xen_mb(); /* front sees new resps /before/ we check rsp_event */          \
+	(_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) <                 \
+				 (RING_IDX)(__new - __old));                      \
+} while (0)
+
+#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do {                       \
+	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);                         \
+	if (_work_to_do)							  \
+		break;                                                            \
+	(_r)->sring->req_event = (_r)->req_cons + 1;                              \
+	xen_mb();                                                                 \
+	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);                         \
+} while (0)
+
+#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do {                      \
+	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r);                        \
+	if (_work_to_do)							  \
+		break;                                                            \
+	(_r)->sring->rsp_event = (_r)->rsp_cons + 1;                              \
+	xen_mb();                                                                 \
+	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r);                        \
+} while (0)
+
+/*
+ * DEFINE_XEN_FLEX_RING_AND_INTF defines two monodirectional rings and
+ * functions to check if there is data on the ring, and to read and
+ * write to them.
+ *
+ * DEFINE_XEN_FLEX_RING is similar to DEFINE_XEN_FLEX_RING_AND_INTF, but
+ * does not define the indexes page. As different protocols can have
+ * extensions to the basic format, this macro allows them to define their
+ * own struct.
+ *
+ * XEN_FLEX_RING_SIZE
+ *   Convenience macro to calculate the size of one of the two rings
+ *   from the overall order.
+ *
+ * $NAME_mask
+ *   Function to apply the size mask to an index, to reduce the index
+ *   within the range [0, size - 1].
+ *
+ * $NAME_read_packet
+ *   Function to read data from the ring. The amount of data to read is
+ *   specified by the "size" argument.
+ *
+ * $NAME_write_packet
+ *   Function to write data to the ring. The amount of data to write is
+ *   specified by the "size" argument.
+ *
+ * $NAME_get_ring_ptr
+ *   Convenience function that returns a pointer to read/write to the
+ *   ring at the right location.
+ *
+ * $NAME_data_intf
+ *   Indexes page, shared between frontend and backend. It also
+ *   contains the array of grant refs.
+ *
+ * $NAME_queued
+ *   Function to calculate how many bytes are currently on the ring,
+ *   ready to be read. It can also be used to calculate how much free
+ *   space is currently on the ring (XEN_FLEX_RING_SIZE() -
+ *   $NAME_queued()).
+ */
+
+#ifndef XEN_PAGE_SHIFT
+/* The PAGE_SIZE for ring protocols and hypercall interfaces is always
+ * 4K, regardless of the architecture, and page granularity chosen by
+ * operating systems.
+ */
+#define XEN_PAGE_SHIFT 12
+#endif
+#define XEN_FLEX_RING_SIZE(order)                                                 \
+	(1UL << ((order) + XEN_PAGE_SHIFT - 1))
+
+#define DEFINE_XEN_FLEX_RING(name)                                                \
+static inline RING_IDX name##_mask(RING_IDX idx, RING_IDX ring_size)              \
+{                                                                                 \
+	return idx & (ring_size - 1);                                             \
+}                                                                                 \
+										  \
+static inline unsigned char *name##_get_ring_ptr(unsigned char *buf,              \
+						 RING_IDX idx,                    \
+						 RING_IDX ring_size)              \
+{                                                                                 \
+	return buf + name##_mask(idx, ring_size);                                 \
+}                                                                                 \
+										  \
+static inline void name##_read_packet(void *opaque,                               \
+				      const unsigned char *buf,                   \
+				      size_t size,                                \
+				      RING_IDX masked_prod,                       \
+				      RING_IDX *masked_cons,                      \
+				      RING_IDX ring_size)                         \
+{                                                                                 \
+	if (*masked_cons < masked_prod ||                                         \
+		size <= ring_size - *masked_cons) {                               \
+		memcpy(opaque, buf + *masked_cons, size);                         \
+	} else {                                                                  \
+		memcpy(opaque, buf + *masked_cons, ring_size - *masked_cons);     \
+		memcpy((unsigned char *)opaque + ring_size - *masked_cons, buf,   \
+			   size - (ring_size - *masked_cons));                    \
+	}                                                                         \
+	*masked_cons = name##_mask(*masked_cons + size, ring_size);               \
+}                                                                                 \
+										  \
+static inline void name##_write_packet(unsigned char *buf,                        \
+				       const void *opaque,                        \
+				       size_t size,                               \
+				       RING_IDX *masked_prod,                     \
+				       RING_IDX masked_cons,                      \
+				       RING_IDX ring_size)                        \
+{                                                                                 \
+	if (*masked_prod < masked_cons ||                                         \
+		size <= ring_size - *masked_prod) {                               \
+		memcpy(buf + *masked_prod, opaque, size);                         \
+	} else {                                                                  \
+		memcpy(buf + *masked_prod, opaque, ring_size - *masked_prod);     \
+		memcpy(buf, (unsigned char *)opaque + (ring_size - *masked_prod), \
+		       size - (ring_size - *masked_prod));                        \
+	}                                                                         \
+	*masked_prod = name##_mask(*masked_prod + size, ring_size);               \
+}                                                                                 \
+										  \
+static inline RING_IDX name##_queued(RING_IDX prod,                               \
+				     RING_IDX cons,                               \
+				     RING_IDX ring_size)                          \
+{                                                                                 \
+	RING_IDX size;                                                            \
+										  \
+	if (prod == cons)                                                         \
+		return 0;                                                         \
+										  \
+	prod = name##_mask(prod, ring_size);                                      \
+	cons = name##_mask(cons, ring_size);                                      \
+										  \
+	if (prod == cons)                                                         \
+		return ring_size;                                                 \
+										  \
+	if (prod > cons)                                                          \
+		size = prod - cons;                                               \
+	else                                                                      \
+		size = ring_size - (cons - prod);                                 \
+	return size;                                                              \
+}                                                                                 \
+										  \
+struct name##_data {                                                              \
+	unsigned char *in; /* half of the allocation */                           \
+	unsigned char *out; /* half of the allocation */                          \
+}
+
+#define DEFINE_XEN_FLEX_RING_AND_INTF(name)                                       \
+struct name##_data_intf {                                                         \
+	RING_IDX in_cons, in_prod;                                                \
+										  \
+	u8 pad1[56];                                                              \
+										  \
+	RING_IDX out_cons, out_prod;                                              \
+										  \
+	u8 pad2[56];                                                              \
+										  \
+	RING_IDX ring_order;                                                      \
+	grant_ref_t ref[];                                                        \
+};                                                                                \
+DEFINE_XEN_FLEX_RING(name)
+
+#endif /* __XEN_PUBLIC_IO_RING_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 8
+ * indent-tabs-mode: nil
+ * End:
+ */
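
Illustration only, not part of the patch: a minimal sketch of how the helpers
generated by DEFINE_XEN_FLEX_RING behave when a packet wraps past the end of
the ring. The demo_* functions are hand-written stand-ins for what
DEFINE_XEN_FLEX_RING(demo) would expand to, RING_IDX is assumed to be a 32-bit
unsigned type, and demo_wrapping_roundtrip() is a hypothetical usage example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t RING_IDX; /* assumption: 32-bit unsigned ring index */

#define XEN_PAGE_SHIFT 12
/* One ring is half of the 2^order pages shared between both directions. */
#define XEN_FLEX_RING_SIZE(order) (1UL << ((order) + XEN_PAGE_SHIFT - 1))

/* Stand-ins for the functions DEFINE_XEN_FLEX_RING(demo) would generate. */
static RING_IDX demo_mask(RING_IDX idx, RING_IDX ring_size)
{
	assert((ring_size & (ring_size - 1)) == 0); /* size must be a power of two */
	return idx & (ring_size - 1);
}

static void demo_write_packet(unsigned char *buf, const void *opaque,
			      size_t size, RING_IDX *masked_prod,
			      RING_IDX masked_cons, RING_IDX ring_size)
{
	if (*masked_prod < masked_cons || size <= ring_size - *masked_prod) {
		memcpy(buf + *masked_prod, opaque, size);
	} else {
		size_t first = ring_size - *masked_prod; /* bytes up to ring end */

		memcpy(buf + *masked_prod, opaque, first);
		memcpy(buf, (const unsigned char *)opaque + first, size - first);
	}
	*masked_prod = demo_mask((RING_IDX)(*masked_prod + size), ring_size);
}

static void demo_read_packet(void *opaque, const unsigned char *buf,
			     size_t size, RING_IDX masked_prod,
			     RING_IDX *masked_cons, RING_IDX ring_size)
{
	if (*masked_cons < masked_prod || size <= ring_size - *masked_cons) {
		memcpy(opaque, buf + *masked_cons, size);
	} else {
		size_t first = ring_size - *masked_cons; /* bytes up to ring end */

		memcpy(opaque, buf + *masked_cons, first);
		memcpy((unsigned char *)opaque + first, buf, size - first);
	}
	*masked_cons = demo_mask((RING_IDX)(*masked_cons + size), ring_size);
}

static RING_IDX demo_queued(RING_IDX prod, RING_IDX cons, RING_IDX ring_size)
{
	if (prod == cons)
		return 0;
	prod = demo_mask(prod, ring_size);
	cons = demo_mask(cons, ring_size);
	if (prod == cons)
		return ring_size; /* indexes differ by a full ring: it is full */
	return prod > cons ? prod - cons : ring_size - (cons - prod);
}

/*
 * Hypothetical usage example: push 7 bytes through a 16-byte ring, starting
 * 4 bytes before its end so that both copies have to wrap. Returns the
 * number of bytes that were queued after the write, or -1 on a mismatch.
 */
static int demo_wrapping_roundtrip(void)
{
	unsigned char ring[16];
	char out[8] = { 0 };
	RING_IDX prod = 12, cons = 12;
	RING_IDX queued;

	demo_write_packet(ring, "abcdefg", 7, &prod, cons, sizeof(ring));
	queued = demo_queued(prod, cons, sizeof(ring));
	demo_read_packet(out, ring, 7, prod, &cons, sizeof(ring));
	if (memcmp(out, "abcdefg", 7) != 0 || prod != cons)
		return -1;
	return (int)queued;
}
```

A real frontend would additionally need memory barriers around the index
updates in the shared indexes page; the sketch leaves them out.
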
diff --git a/include/xen/interface/io/xenbus.h b/include/xen/interface/io/xenbus.h
new file mode 100644
index 0000000000..f452748b03
--- /dev/null
+++ b/include/xen/interface/io/xenbus.h
@@ -0,0 +1,81 @@
+/*****************************************************************************
+ * xenbus.h
+ *
+ * Xenbus protocol details.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2005 XenSource Ltd.
+ */
+
+#ifndef _XEN_PUBLIC_IO_XENBUS_H
+#define _XEN_PUBLIC_IO_XENBUS_H
+
+/*
+ * The state of either end of the Xenbus, i.e. the current communication
+ * status of initialisation across the bus.  States here imply nothing about
+ * the state of the connection between the driver and the kernel's device
+ * layers.
+ */
+enum xenbus_state {
+	XenbusStateUnknown       = 0,
+
+	XenbusStateInitialising  = 1,
+
+	/*
+	 * InitWait: Finished early initialisation but waiting for information
+	 * from the peer or hotplug scripts.
+	 */
+	XenbusStateInitWait      = 2,
+
+	/*
+	 * Initialised: Waiting for a connection from the peer.
+	 */
+	XenbusStateInitialised   = 3,
+
+	XenbusStateConnected     = 4,
+
+	/*
+	 * Closing: The device is being closed due to an error or an unplug event.
+	 */
+	XenbusStateClosing       = 5,
+
+	XenbusStateClosed        = 6,
+
+	/*
+	 * Reconfiguring: The device is being reconfigured.
+	 */
+	XenbusStateReconfiguring = 7,
+
+	XenbusStateReconfigured  = 8
+};
+
+typedef enum xenbus_state XenbusState;
+
+#endif /* _XEN_PUBLIC_IO_XENBUS_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
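
Illustration only, not part of the patch: when debugging a frontend/backend
handshake it can help to print the xenbus_state values by name. The sketch
below copies the enum from the header above; xenbus_state_name() is a
hypothetical helper and does not exist in this series.

```c
#include <stddef.h>
#include <string.h>

/* Copied from xenbus.h above so that the sketch is self-contained. */
enum xenbus_state {
	XenbusStateUnknown       = 0,
	XenbusStateInitialising  = 1,
	XenbusStateInitWait      = 2,
	XenbusStateInitialised   = 3,
	XenbusStateConnected     = 4,
	XenbusStateClosing       = 5,
	XenbusStateClosed        = 6,
	XenbusStateReconfiguring = 7,
	XenbusStateReconfigured  = 8
};

/* Hypothetical helper: map a state to a printable name. */
static const char *xenbus_state_name(enum xenbus_state state)
{
	static const char *const names[] = {
		[XenbusStateUnknown]       = "Unknown",
		[XenbusStateInitialising]  = "Initialising",
		[XenbusStateInitWait]      = "InitWait",
		[XenbusStateInitialised]   = "Initialised",
		[XenbusStateConnected]     = "Connected",
		[XenbusStateClosing]       = "Closing",
		[XenbusStateClosed]        = "Closed",
		[XenbusStateReconfiguring] = "Reconfiguring",
		[XenbusStateReconfigured]  = "Reconfigured",
	};

	/* Values outside the enum can arrive from an untrusted peer. */
	if ((size_t)state >= sizeof(names) / sizeof(names[0]))
		return "INVALID";
	return names[state];
}
```
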
diff --git a/include/xen/interface/io/xs_wire.h b/include/xen/interface/io/xs_wire.h
new file mode 100644
index 0000000000..87987334bf
--- /dev/null
+++ b/include/xen/interface/io/xs_wire.h
@@ -0,0 +1,151 @@
+/*
+ * Details of the "wire" protocol between Xen Store Daemon and client
+ * library or guest kernel.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2005 Rusty Russell IBM Corporation
+ */
+
+#ifndef _XS_WIRE_H
+#define _XS_WIRE_H
+
+enum xsd_sockmsg_type {
+	XS_CONTROL,
+#define XS_DEBUG XS_CONTROL
+	XS_DIRECTORY,
+	XS_READ,
+	XS_GET_PERMS,
+	XS_WATCH,
+	XS_UNWATCH,
+	XS_TRANSACTION_START,
+	XS_TRANSACTION_END,
+	XS_INTRODUCE,
+	XS_RELEASE,
+	XS_GET_DOMAIN_PATH,
+	XS_WRITE,
+	XS_MKDIR,
+	XS_RM,
+	XS_SET_PERMS,
+	XS_WATCH_EVENT,
+	XS_ERROR,
+	XS_IS_DOMAIN_INTRODUCED,
+	XS_RESUME,
+	XS_SET_TARGET,
+	/* XS_RESTRICT has been removed */
+	XS_RESET_WATCHES = XS_SET_TARGET + 2,
+	XS_DIRECTORY_PART,
+
+	XS_TYPE_COUNT,      /* Number of valid types. */
+
+	XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
+};
+
+#define XS_WRITE_NONE "NONE"
+#define XS_WRITE_CREATE "CREATE"
+#define XS_WRITE_CREATE_EXCL "CREATE|EXCL"
+
+/* We hand errors as strings, for portability. */
+struct xsd_errors {
+	int errnum;
+	const char *errstring;
+};
+
+#ifdef EINVAL
+#define XSD_ERROR(x) { x, #x }
+/* LINTED: static unused */
+static struct xsd_errors xsd_errors[]
+#if defined(__GNUC__)
+__attribute__((unused))
+#endif
+	= {
+	XSD_ERROR(EINVAL),
+	XSD_ERROR(EACCES),
+	XSD_ERROR(EEXIST),
+	XSD_ERROR(EISDIR),
+	XSD_ERROR(ENOENT),
+	XSD_ERROR(ENOMEM),
+	XSD_ERROR(ENOSPC),
+	XSD_ERROR(EIO),
+	XSD_ERROR(ENOTEMPTY),
+	XSD_ERROR(ENOSYS),
+	XSD_ERROR(EROFS),
+	XSD_ERROR(EBUSY),
+	XSD_ERROR(EAGAIN),
+	XSD_ERROR(EISCONN),
+	XSD_ERROR(E2BIG)
+};
+#endif
+
+struct xsd_sockmsg {
+	u32 type;  /* XS_??? */
+	u32 req_id;/* Request identifier, echoed in daemon's response.  */
+	u32 tx_id; /* Transaction id (0 if not related to a transaction). */
+	u32 len;   /* Length of data following this. */
+
+	/* Generally followed by nul-terminated string(s). */
+};
+
+enum xs_watch_type {
+	XS_WATCH_PATH = 0,
+	XS_WATCH_TOKEN
+};
+
+/*
+ * `incontents 150 xenstore_struct XenStore wire protocol.
+ *
+ * Inter-domain shared memory communications.
+ */
+#define XENSTORE_RING_SIZE 1024
+typedef u32 XENSTORE_RING_IDX;
+#define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE - 1))
+struct xenstore_domain_interface {
+	char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */
+	char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
+	XENSTORE_RING_IDX req_cons, req_prod;
+	XENSTORE_RING_IDX rsp_cons, rsp_prod;
+	u32 server_features; /* Bitmap of features supported by the server */
+	u32 connection;
+};
+
+/* Violating this is very bad.  See docs/misc/xenstore.txt. */
+#define XENSTORE_PAYLOAD_MAX 4096
+
+/* Violating these just gets you an error back */
+#define XENSTORE_ABS_PATH_MAX 3072
+#define XENSTORE_REL_PATH_MAX 2048
+
+/* The ability to reconnect a ring */
+#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
+
+/* Valid values for the connection field */
+#define XENSTORE_CONNECTED 0 /* the steady-state */
+#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
+
+#endif /* _XS_WIRE_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 8
+ * indent-tabs-mode: nil
+ * End:
+ */
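
Illustration only, not part of the patch: a minimal sketch of how a request
is laid out on the XenStore request ring, assuming free-running producer and
consumer indexes that are masked with MASK_XENSTORE_IDX on every access.
xs_ring_put(), xs_queue_read() and xs_example() are hypothetical names; the
structure and constants are copied from the header above, with u32 spelled as
uint32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XENSTORE_RING_SIZE 1024
typedef uint32_t XENSTORE_RING_IDX;
#define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE - 1))

/* First entries of enum xsd_sockmsg_type, enough for this sketch. */
enum { XS_CONTROL, XS_DIRECTORY, XS_READ };

struct xsd_sockmsg {
	uint32_t type;   /* XS_??? */
	uint32_t req_id; /* Request identifier, echoed in daemon's response. */
	uint32_t tx_id;  /* Transaction id (0 if not related to a transaction). */
	uint32_t len;    /* Length of data following this. */
};

/* Hypothetical helper: copy len bytes into the ring, wrapping as needed. */
static int xs_ring_put(char *req, XENSTORE_RING_IDX cons,
		       XENSTORE_RING_IDX *prod, const void *data, size_t len)
{
	const char *src = data;
	size_t i;

	assert(*prod - cons <= XENSTORE_RING_SIZE); /* index sanity check */
	if (len > XENSTORE_RING_SIZE - (*prod - cons))
		return -1; /* not enough free space */

	for (i = 0; i < len; i++)
		req[MASK_XENSTORE_IDX(*prod + (XENSTORE_RING_IDX)i)] = src[i];
	*prod += (XENSTORE_RING_IDX)len;
	return 0;
}

/* Hypothetical helper: queue an XS_READ of 'path' outside any transaction. */
static int xs_queue_read(char *req, XENSTORE_RING_IDX cons,
			 XENSTORE_RING_IDX *prod, uint32_t req_id,
			 const char *path)
{
	struct xsd_sockmsg hdr = {
		.type = XS_READ,
		.req_id = req_id,
		.tx_id = 0,
		.len = (uint32_t)strlen(path) + 1, /* payload includes the NUL */
	};

	/* Refuse early so that header and payload are queued as one unit. */
	if (sizeof(hdr) + hdr.len > XENSTORE_RING_SIZE - (*prod - cons))
		return -1;
	xs_ring_put(req, cons, prod, &hdr, sizeof(hdr));
	xs_ring_put(req, cons, prod, path, hdr.len);
	return 0;
}

/*
 * Usage example: queue a read of "name" four bytes before the end of the
 * ring, so the header wraps. Returns the number of bytes queued, or 0 on
 * failure.
 */
static XENSTORE_RING_IDX xs_example(void)
{
	static char req[XENSTORE_RING_SIZE];
	XENSTORE_RING_IDX cons = 1020, prod = 1020;

	if (xs_queue_read(req, cons, &prod, 1, "name") != 0)
		return 0;
	/* The payload starts right after the 16-byte header. */
	if (req[MASK_XENSTORE_IDX(cons + sizeof(struct xsd_sockmsg))] != 'n')
		return 0;
	return prod - cons;
}
```

On a real shared ring the producer would also need a write barrier before
publishing req_prod and an event-channel notification afterwards; both are
omitted here.
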
diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
new file mode 100644
index 0000000000..19959da8b4
--- /dev/null
+++ b/include/xen/interface/memory.h
@@ -0,0 +1,332 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/******************************************************************************
+ * memory.h
+ *
+ * Memory reservation and information.
+ *
+ * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
+ */
+
+#ifndef __XEN_PUBLIC_MEMORY_H__
+#define __XEN_PUBLIC_MEMORY_H__
+
+/*
+ * Increase or decrease the specified domain's memory reservation. Returns a
+ * -ve errcode on failure, or the # extents successfully allocated or freed.
+ * arg == addr of struct xen_memory_reservation.
+ */
+#define XENMEM_increase_reservation 0
+#define XENMEM_decrease_reservation 1
+#define XENMEM_populate_physmap     6
+struct xen_memory_reservation {
+	/*
+	 * XENMEM_increase_reservation:
+	 *   OUT: MFN (*not* GMFN) bases of extents that were allocated
+	 * XENMEM_decrease_reservation:
+	 *   IN:  GMFN bases of extents to free
+	 * XENMEM_populate_physmap:
+	 *   IN:  GPFN bases of extents to populate with memory
+	 *   OUT: GMFN bases of extents that were allocated
+	 *   (NB. This command also updates the mach_to_phys translation table)
+	 */
+	GUEST_HANDLE(xen_pfn_t)extent_start;
+
+	/* Number of extents, and size/alignment of each (2^extent_order pages). */
+	xen_ulong_t  nr_extents;
+	unsigned int   extent_order;
+
+	/*
+	 * Maximum # bits addressable by the user of the allocated region (e.g.,
+	 * I/O devices often have a 32-bit limitation even in 64-bit systems). If
+	 * zero then the user has no addressing restriction.
+	 * This field is not used by XENMEM_decrease_reservation.
+	 */
+	unsigned int   address_bits;
+
+	/*
+	 * Domain whose reservation is being changed.
+	 * Unprivileged domains can specify only DOMID_SELF.
+	 */
+	domid_t        domid;
+
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_memory_reservation);
+
+/*
+ * An atomic exchange of memory pages. If return code is zero then
+ * @out.extent_list provides GMFNs of the newly-allocated memory.
+ * Returns zero on complete success, otherwise a negative error code.
+ * On complete success then always @nr_exchanged == @in.nr_extents.
+ * On partial success @nr_exchanged indicates how much work was done.
+ */
+#define XENMEM_exchange             11
+struct xen_memory_exchange {
+	/*
+	 * [IN] Details of memory extents to be exchanged (GMFN bases).
+	 * Note that @in.address_bits is ignored and unused.
+	 */
+	struct xen_memory_reservation in;
+
+	/*
+	 * [IN/OUT] Details of new memory extents.
+	 * We require that:
+	 *  1. @in.domid == @out.domid
+	 *  2. @in.nr_extents  << @in.extent_order ==
+	 *     @out.nr_extents << @out.extent_order
+	 *  3. @in.extent_start and @out.extent_start lists must not overlap
+	 *  4. @out.extent_start lists GPFN bases to be populated
+	 *  5. @out.extent_start is overwritten with allocated GMFN bases
+	 */
+	struct xen_memory_reservation out;
+
+	/*
+	 * [OUT] Number of input extents that were successfully exchanged:
+	 *  1. The first @nr_exchanged input extents were successfully
+	 *     deallocated.
+	 *  2. The corresponding first entries in the output extent list correctly
+	 *     indicate the GMFNs that were successfully exchanged.
+	 *  3. All other input and output extents are untouched.
+	 *  4. If not all input extents are exchanged then the return code of this
+	 *     command will be non-zero.
+	 *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
+	 */
+	xen_ulong_t nr_exchanged;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_memory_exchange);
+/*
+ * Returns the maximum machine frame number of mapped RAM in this system.
+ * This command always succeeds (it never returns an error code).
+ * arg == NULL.
+ */
+#define XENMEM_maximum_ram_page     2
+
+/*
+ * Returns the current or maximum memory reservation, in pages, of the
+ * specified domain (may be DOMID_SELF). Returns -ve errcode on failure.
+ * arg == addr of domid_t.
+ */
+#define XENMEM_current_reservation  3
+#define XENMEM_maximum_reservation  4
+
+/*
+ * Returns a list of MFN bases of 2MB extents comprising the machine_to_phys
+ * mapping table. Architectures which do not have a m2p table do not implement
+ * this command.
+ * arg == addr of xen_machphys_mfn_list_t.
+ */
+#define XENMEM_machphys_mfn_list    5
+struct xen_machphys_mfn_list {
+	/*
+	 * Size of the 'extent_start' array. Fewer entries will be filled if the
+	 * machphys table is smaller than max_extents * 2MB.
+	 */
+	unsigned int max_extents;
+
+	/*
+	 * Pointer to buffer to fill with list of extent starts. If there are
+	 * any large discontiguities in the machine address space, 2MB gaps in
+	 * the machphys table will be represented by an MFN base of zero.
+	 */
+	GUEST_HANDLE(xen_pfn_t)extent_start;
+
+	/*
+	 * Number of extents written to the above array. This will be smaller
+	 * than 'max_extents' if the machphys table is smaller than max_extents * 2MB.
+	 */
+	unsigned int nr_extents;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mfn_list);
+
+/*
+ * Returns the location in virtual address space of the machine_to_phys
+ * mapping table. Architectures which do not have a m2p table, or which do not
+ * map it by default into guest address space, do not implement this command.
+ * arg == addr of xen_machphys_mapping_t.
+ */
+#define XENMEM_machphys_mapping     12
+struct xen_machphys_mapping {
+	xen_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
+	xen_ulong_t max_mfn;        /* Maximum MFN that can be looked up. */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mapping);
+
+#define XENMAPSPACE_shared_info  0 /* shared info page */
+#define XENMAPSPACE_grant_table  1 /* grant table page */
+#define XENMAPSPACE_gmfn         2 /* GMFN */
+#define XENMAPSPACE_gmfn_range   3 /* GMFN range, XENMEM_add_to_physmap only. */
+#define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
+				    * XENMEM_add_to_physmap_range only.
+				    */
+#define XENMAPSPACE_dev_mmio     5 /* device mmio region */
+
+/*
+ * Sets the GPFN at which a particular page appears in the specified guest's
+ * pseudophysical address space.
+ * arg == addr of xen_add_to_physmap_t.
+ */
+#define XENMEM_add_to_physmap      7
+struct xen_add_to_physmap {
+	/* Which domain to change the mapping for. */
+	domid_t domid;
+
+	/* Number of pages to go through for gmfn_range */
+	u16    size;
+
+	/* Source mapping space. */
+	unsigned int space;
+
+	/* Index into source mapping space. */
+	xen_ulong_t idx;
+
+	/* GPFN where the source mapping page should appear. */
+	xen_pfn_t gpfn;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap);
+
+/*** REMOVED ***/
+/*#define XENMEM_translate_gpfn_list  8*/
+
+#define XENMEM_add_to_physmap_range 23
+struct xen_add_to_physmap_range {
+	/* IN */
+	/* Which domain to change the mapping for. */
+	domid_t domid;
+	u16 space; /* => enum phys_map_space */
+
+	/* Number of pages to go through */
+	u16 size;
+	domid_t foreign_domid; /* IFF gmfn_foreign */
+
+	/* Indexes into space being mapped. */
+	GUEST_HANDLE(xen_ulong_t)idxs;
+
+	/* GPFN in domid where the source mapping page should appear. */
+	GUEST_HANDLE(xen_pfn_t)gpfns;
+
+	/* OUT */
+
+	/* Per index error code. */
+	GUEST_HANDLE(int)errs;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap_range);
+
+/*
+ * Returns the pseudo-physical memory map as it was when the domain
+ * was started (specified by XENMEM_set_memory_map).
+ * arg == addr of struct xen_memory_map.
+ */
+#define XENMEM_memory_map           9
+struct xen_memory_map {
+	/*
+	 * On call the number of entries which can be stored in buffer. On
+	 * return the number of entries which have been stored in
+	 * buffer.
+	 */
+	unsigned int nr_entries;
+
+	/*
+	 * Entries in the buffer are in the same format as returned by the
+	 * BIOS INT 0x15 EAX=0xE820 call.
+	 */
+	GUEST_HANDLE(void)buffer;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_memory_map);
+
+/*
+ * Returns the real physical memory map. Passes the same structure as
+ * XENMEM_memory_map.
+ * arg == addr of struct xen_memory_map.
+ */
+#define XENMEM_machine_memory_map   10
+
+/*
+ * Unmaps the page appearing at a particular GPFN from the specified guest's
+ * pseudophysical address space.
+ * arg == addr of xen_remove_from_physmap_t.
+ */
+#define XENMEM_remove_from_physmap      15
+struct xen_remove_from_physmap {
+	/* Which domain to change the mapping for. */
+	domid_t domid;
+
+	/* GPFN of the current mapping of the page. */
+	xen_pfn_t gpfn;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap);
+
+/*
+ * Get the pages for a particular guest resource, so that they can be
+ * mapped directly by a tools domain.
+ */
+#define XENMEM_acquire_resource 28
+struct xen_mem_acquire_resource {
+	/* IN - The domain whose resource is to be mapped */
+	domid_t domid;
+	/* IN - the type of resource */
+	u16 type;
+
+#define XENMEM_resource_ioreq_server 0
+#define XENMEM_resource_grant_table 1
+
+	/*
+	 * IN - a type-specific resource identifier, which must be zero
+	 *      unless stated otherwise.
+	 *
+	 * type == XENMEM_resource_ioreq_server -> id == ioreq server id
+	 * type == XENMEM_resource_grant_table -> id defined below
+	 */
+	u32 id;
+
+#define XENMEM_resource_grant_table_id_shared 0
+#define XENMEM_resource_grant_table_id_status 1
+
+	/* IN/OUT - As an IN parameter number of frames of the resource
+	 *          to be mapped. However, if the specified value is 0 and
+	 *          frame_list is NULL then this field will be set to the
+	 *          maximum value supported by the implementation on return.
+	 */
+	u32 nr_frames;
+	/*
+	 * OUT - Must be zero on entry. On return this may contain a bitwise
+	 *       OR of the following values.
+	 */
+	u32 flags;
+
+	/* The resource pages have been assigned to the calling domain */
+#define _XENMEM_rsrc_acq_caller_owned 0
+#define XENMEM_rsrc_acq_caller_owned (1u << _XENMEM_rsrc_acq_caller_owned)
+
+	/*
+	 * IN - the index of the initial frame to be mapped. This parameter
+	 *      is ignored if nr_frames is 0.
+	 */
+	u64 frame;
+
+#define XENMEM_resource_ioreq_server_frame_bufioreq 0
+#define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))
+
+	/*
+	 * IN/OUT - If the tools domain is PV then, upon return, frame_list
+	 *          will be populated with the MFNs of the resource.
+	 *          If the tools domain is HVM then it is expected that, on
+	 *          entry, frame_list will be populated with a list of GFNs
+	 *          that will be mapped to the MFNs of the resource.
+	 *          If -EIO is returned then the frame_list has only been
+	 *          partially mapped and it is up to the caller to unmap all
+	 *          the GFNs.
+	 *          This parameter may be NULL if nr_frames is 0.
+	 */
+	GUEST_HANDLE(xen_pfn_t)frame_list;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_mem_acquire_resource);
+
+#endif /* __XEN_PUBLIC_MEMORY_H__ */
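
Illustration only, not part of the patch: a minimal sketch of how a guest
might fill struct xen_add_to_physmap to ask Xen to place the shared info page
at a guest physical address of its choosing. The typedefs are assumptions for
an ARM64 guest (16-bit domid_t, 64-bit xen_pfn_t and xen_ulong_t),
prepare_shared_info_map() is a hypothetical helper, and the hypercall itself
is left out.

```c
#include <stdint.h>

/* Assumed ARM64 definitions; the real ones live in the Xen interface headers. */
typedef uint16_t domid_t;
typedef uint64_t xen_pfn_t;
typedef uint64_t xen_ulong_t;

#define DOMID_SELF              0x7FF0U
#define XEN_PAGE_SHIFT          12
#define XENMAPSPACE_shared_info 0
#define XENMEM_add_to_physmap   7

/* Copied from memory.h above, with u16 spelled as uint16_t. */
struct xen_add_to_physmap {
	domid_t domid;      /* Which domain to change the mapping for. */
	uint16_t size;      /* Number of pages to go through for gmfn_range */
	unsigned int space; /* Source mapping space. */
	xen_ulong_t idx;    /* Index into source mapping space. */
	xen_pfn_t gpfn;     /* GPFN where the source mapping page should appear. */
};

/*
 * Hypothetical helper: describe "map the shared info page at guest_phys".
 * The result would be passed by address as the argument of the
 * XENMEM_add_to_physmap memory_op hypercall.
 */
static struct xen_add_to_physmap prepare_shared_info_map(uint64_t guest_phys)
{
	struct xen_add_to_physmap xatp = {
		.domid = DOMID_SELF,              /* unprivileged guests map for themselves */
		.size = 0,                        /* only used for gmfn_range */
		.space = XENMAPSPACE_shared_info,
		.idx = 0,
		.gpfn = guest_phys >> XEN_PAGE_SHIFT, /* always 4K frames */
	};

	return xatp;
}
```
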
diff --git a/include/xen/interface/sched.h b/include/xen/interface/sched.h
new file mode 100644
index 0000000000..0f12dcf267
--- /dev/null
+++ b/include/xen/interface/sched.h
@@ -0,0 +1,188 @@
+/******************************************************************************
+ * sched.h
+ *
+ * Scheduler state interactions
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
+ */
+
+#ifndef __XEN_PUBLIC_SCHED_H__
+#define __XEN_PUBLIC_SCHED_H__
+
+#include <xen/interface/event_channel.h>
+
+/*
+ * Guest Scheduler Operations
+ *
+ * The SCHEDOP interface provides mechanisms for a guest to interact
+ * with the scheduler, including yield, blocking and shutting itself
+ * down.
+ */
+
+/*
+ * The prototype for this hypercall is:
+ * long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
+ *
+ * @cmd == SCHEDOP_??? (scheduler operation).
+ * @arg == Operation-specific extra argument(s), as described below.
+ * ...  == Additional Operation-specific extra arguments, described below.
+ *
+ * Versions of Xen prior to 3.0.2 provided only the following legacy version
+ * of this hypercall, supporting only the commands yield, block and shutdown:
+ *  long sched_op(int cmd, unsigned long arg)
+ * @cmd == SCHEDOP_??? (scheduler operation).
+ * @arg == 0               (SCHEDOP_yield and SCHEDOP_block)
+ *      == SHUTDOWN_* code (SCHEDOP_shutdown)
+ *
+ * This legacy version is available to new guests as:
+ * long HYPERVISOR_sched_op_compat(enum sched_op cmd, unsigned long arg)
+ */
+
+/*
+ * Voluntarily yield the CPU.
+ * @arg == NULL.
+ */
+#define SCHEDOP_yield       0
+
+/*
+ * Block execution of this VCPU until an event is received for processing.
+ * If called with event upcalls masked, this operation will atomically
+ * reenable event delivery and check for pending events before blocking the
+ * VCPU. This avoids a "wakeup waiting" race.
+ * @arg == NULL.
+ */
+#define SCHEDOP_block       1
+
+/*
+ * Halt execution of this domain (all VCPUs) and notify the system controller.
+ * @arg == pointer to sched_shutdown structure.
+ *
+ * If the sched_shutdown_t reason is SHUTDOWN_suspend then
+ * x86 PV guests must also set RDX (EDX for 32-bit guests) to the MFN
+ * of the guest's start info page.  RDX/EDX is the third hypercall
+ * argument.
+ *
+ * In addition, when the reason is SHUTDOWN_suspend this hypercall
+ * returns 1 if suspend was cancelled or the domain was merely
+ * checkpointed, and 0 if it is resuming in a new domain.
+ */
+#define SCHEDOP_shutdown    2
+
+/*
+ * Poll a set of event-channel ports. Return when one or more are pending. An
+ * optional timeout may be specified.
+ * @arg == pointer to sched_poll structure.
+ */
+#define SCHEDOP_poll        3
+
+/*
+ * Declare a shutdown for another domain. The main use of this function is
+ * in interpreting shutdown requests and reasons for fully-virtualized
+ * domains.  A para-virtualized domain may use SCHEDOP_shutdown directly.
+ * @arg == pointer to sched_remote_shutdown structure.
+ */
+#define SCHEDOP_remote_shutdown        4
+
+/*
+ * Latch a shutdown code, so that when the domain later shuts down it
+ * reports this code to the control tools.
+ * @arg == sched_shutdown, as for SCHEDOP_shutdown.
+ */
+#define SCHEDOP_shutdown_code 5
+
+/*
+ * Setup, poke and destroy a domain watchdog timer.
+ * @arg == pointer to sched_watchdog structure.
+ * With id == 0, setup a domain watchdog timer to cause domain shutdown
+ *               after timeout, returns watchdog id.
+ * With id != 0 and timeout == 0, destroy domain watchdog timer.
+ * With id != 0 and timeout != 0, poke watchdog timer and set new timeout.
+ */
+#define SCHEDOP_watchdog    6
+
+/*
+ * Override the current vcpu affinity by pinning it to one physical cpu or
+ * undo this override restoring the previous affinity.
+ * @arg == pointer to sched_pin_override structure.
+ *
+ * A negative pcpu value will undo a previous pin override and restore the
+ * previous cpu affinity.
+ * This call is allowed for the hardware domain only and requires the cpu
+ * to be part of the domain's cpupool.
+ */
+#define SCHEDOP_pin_override 7
+
+struct sched_shutdown {
+	unsigned int reason; /* SHUTDOWN_* => shutdown reason */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(sched_shutdown);
+
+struct sched_poll {
+	GUEST_HANDLE(evtchn_port_t)ports;
+	unsigned int nr_ports;
+	u64 timeout;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(sched_poll);
+
+struct sched_remote_shutdown {
+	domid_t domain_id;         /* Remote domain ID */
+	unsigned int reason;       /* SHUTDOWN_* => shutdown reason */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(sched_remote_shutdown);
+
+struct sched_watchdog {
+	u32 id;                /* watchdog ID */
+	u32 timeout;           /* timeout */
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(sched_watchdog);
+
+struct sched_pin_override {
+	s32 pcpu;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(sched_pin_override);
+
+/*
+ * Reason codes for SCHEDOP_shutdown. These may be interpreted by control
+ * software to determine the appropriate action. For the most part, Xen does
+ * not care about the shutdown code.
+ */
+#define SHUTDOWN_poweroff   0  /* Domain exited normally. Clean up and kill. */
+#define SHUTDOWN_reboot     1  /* Clean up, kill, and then restart.          */
+#define SHUTDOWN_suspend    2  /* Clean up, save suspend info, kill.         */
+#define SHUTDOWN_crash      3  /* Tell controller we've crashed.             */
+#define SHUTDOWN_watchdog   4  /* Restart because watchdog time expired.     */
+
+/*
+ * Domain asked to perform 'soft reset' for it. The expected behavior is to
+ * reset internal Xen state for the domain returning it to the point where it
+ * was created but leaving the domain's memory contents and vCPU contexts
+ * intact. This will allow the domain to start over and set up all Xen specific
+ * interfaces again.
+ */
+#define SHUTDOWN_soft_reset 5
+#define SHUTDOWN_MAX        5  /* Maximum valid shutdown reason.             */
+
+#endif /* __XEN_PUBLIC_SCHED_H__ */
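
Illustration only, not part of the patch: SCHEDOP_watchdog multiplexes three
operations over the id and timeout fields of struct sched_watchdog. The
sketch below spells out that decoding as described in the header comment;
enum watchdog_request and classify_watchdog() are hypothetical names.

```c
#include <stdint.h>

/* Copied from sched.h above, with u32 spelled as uint32_t. */
struct sched_watchdog {
	uint32_t id;      /* watchdog ID */
	uint32_t timeout; /* timeout */
};

/* Hypothetical names for the three requests the header comment describes. */
enum watchdog_request {
	WATCHDOG_SETUP,   /* id == 0: create a timer, returns the watchdog id */
	WATCHDOG_DESTROY, /* id != 0 and timeout == 0 */
	WATCHDOG_POKE     /* id != 0 and timeout != 0: re-arm with new timeout */
};

static enum watchdog_request classify_watchdog(const struct sched_watchdog *w)
{
	if (w->id == 0)
		return WATCHDOG_SETUP;
	return w->timeout == 0 ? WATCHDOG_DESTROY : WATCHDOG_POKE;
}
```
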
diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
new file mode 100644
index 0000000000..964daaedfb
--- /dev/null
+++ b/include/xen/interface/xen.h
@@ -0,0 +1,225 @@
+/******************************************************************************
+ * xen.h
+ *
+ * Guest OS interface to Xen.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2004, K A Fraser
+ */
+
+#ifndef __XEN_PUBLIC_XEN_H__
+#define __XEN_PUBLIC_XEN_H__
+
+#include <xen/arm/interface.h>
+
+/*
+ * XEN "SYSTEM CALLS" (a.k.a. HYPERCALLS).
+ */
+
+/*
+ * x86_32: EAX = vector; EBX, ECX, EDX, ESI, EDI = args 1, 2, 3, 4, 5.
+ *         EAX = return value
+ *         (argument registers may be clobbered on return)
+ * x86_64: RAX = vector; RDI, RSI, RDX, R10, R8, R9 = args 1, 2, 3, 4, 5, 6.
+ *         RAX = return value
+ *         (argument registers not clobbered on return; RCX, R11 are)
+ */
+#define __HYPERVISOR_set_trap_table        0
+#define __HYPERVISOR_mmu_update            1
+#define __HYPERVISOR_set_gdt               2
+#define __HYPERVISOR_stack_switch          3
+#define __HYPERVISOR_set_callbacks         4
+#define __HYPERVISOR_fpu_taskswitch        5
+#define __HYPERVISOR_sched_op_compat       6
+#define __HYPERVISOR_platform_op           7
+#define __HYPERVISOR_set_debugreg          8
+#define __HYPERVISOR_get_debugreg          9
+#define __HYPERVISOR_update_descriptor    10
+#define __HYPERVISOR_memory_op            12
+#define __HYPERVISOR_multicall            13
+#define __HYPERVISOR_update_va_mapping    14
+#define __HYPERVISOR_set_timer_op         15
+#define __HYPERVISOR_event_channel_op_compat 16
+#define __HYPERVISOR_xen_version          17
+#define __HYPERVISOR_console_io           18
+#define __HYPERVISOR_physdev_op_compat    19
+#define __HYPERVISOR_grant_table_op       20
+#define __HYPERVISOR_vm_assist            21
+#define __HYPERVISOR_update_va_mapping_otherdomain 22
+#define __HYPERVISOR_iret                 23 /* x86 only */
+#define __HYPERVISOR_vcpu_op              24
+#define __HYPERVISOR_set_segment_base     25 /* x86/64 only */
+#define __HYPERVISOR_mmuext_op            26
+#define __HYPERVISOR_xsm_op               27
+#define __HYPERVISOR_nmi_op               28
+#define __HYPERVISOR_sched_op             29
+#define __HYPERVISOR_callback_op          30
+#define __HYPERVISOR_xenoprof_op          31
+#define __HYPERVISOR_event_channel_op     32
+#define __HYPERVISOR_physdev_op           33
+#define __HYPERVISOR_hvm_op               34
+#define __HYPERVISOR_sysctl               35
+#define __HYPERVISOR_domctl               36
+#define __HYPERVISOR_kexec_op             37
+#define __HYPERVISOR_tmem_op              38
+#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
+#define __HYPERVISOR_xenpmu_op            40
+#define __HYPERVISOR_dm_op                41
+
+/* Architecture-specific hypercall definitions. */
+#define __HYPERVISOR_arch_0               48
+#define __HYPERVISOR_arch_1               49
+#define __HYPERVISOR_arch_2               50
+#define __HYPERVISOR_arch_3               51
+#define __HYPERVISOR_arch_4               52
+#define __HYPERVISOR_arch_5               53
+#define __HYPERVISOR_arch_6               54
+#define __HYPERVISOR_arch_7               55
+
+#ifndef __ASSEMBLY__
+
+typedef u16 domid_t;
+
+/* Domain ids >= DOMID_FIRST_RESERVED cannot be used for ordinary domains. */
+#define DOMID_FIRST_RESERVED (0x7FF0U)
+
+/* DOMID_SELF is used in certain contexts to refer to oneself. */
+#define DOMID_SELF (0x7FF0U)
+
+/*
+ * DOMID_IO is used to restrict page-table updates to mapping I/O memory.
+ * Although no Foreign Domain need be specified to map I/O pages, DOMID_IO
+ * is useful to ensure that no mappings to the OS's own heap are accidentally
+ * installed. (e.g., in Linux this could cause havoc as reference counts
+ * aren't adjusted on the I/O-mapping code path).
+ * This only makes sense in MMUEXT_SET_FOREIGNDOM, but in that context can
+ * be specified by any calling domain.
+ */
+#define DOMID_IO   (0x7FF1U)
+
+/*
+ * DOMID_XEN is used to allow privileged domains to map restricted parts of
+ * Xen's heap space (e.g., the machine_to_phys table).
+ * This only makes sense in MMUEXT_SET_FOREIGNDOM, and is only permitted if
+ * the caller is privileged.
+ */
+#define DOMID_XEN  (0x7FF2U)
+
+/* DOMID_COW is used as the owner of sharable pages */
+#define DOMID_COW  (0x7FF3U)
+
+/* DOMID_INVALID is used to identify pages with unknown owner. */
+#define DOMID_INVALID (0x7FF4U)
+
+/* Idle domain. */
+#define DOMID_IDLE (0x7FFFU)
+
+struct vcpu_info {
+	/*
+	 * 'evtchn_upcall_pending' is written non-zero by Xen to indicate
+	 * a pending notification for a particular VCPU. It is then cleared
+	 * by the guest OS /before/ checking for pending work, thus avoiding
+	 * a set-and-check race. Note that the mask is only accessed by Xen
+	 * on the CPU that is currently hosting the VCPU. This means that the
+	 * pending and mask flags can be updated by the guest without special
+	 * synchronisation (i.e., no need for the x86 LOCK prefix).
+	 * This may seem suboptimal because if the pending flag is set by
+	 * a different CPU then an IPI may be scheduled even when the mask
+	 * is set. However, note:
+	 *  1. The task of 'interrupt holdoff' is covered by the per-event-
+	 *     channel mask bits. A 'noisy' event that is continually being
+	 *     triggered can be masked at source at this very precise
+	 *     granularity.
+	 *  2. The main purpose of the per-VCPU mask is therefore to restrict
+	 *     reentrant execution: whether for concurrency control, or to
+	 *     prevent unbounded stack usage. Whatever the purpose, we expect
+	 *     that the mask will be asserted only for short periods at a time,
+	 *     and so the likelihood of a 'spurious' IPI is suitably small.
+	 * The mask is read before making an event upcall to the guest: a
+	 * non-zero mask therefore guarantees that the VCPU will not receive
+	 * an upcall activation. The mask is cleared when the VCPU requests
+	 * to block: this avoids wakeup-waiting races.
+	 */
+	u8 evtchn_upcall_pending;
+	u8 evtchn_upcall_mask;
+	xen_ulong_t evtchn_pending_sel;
+	struct arch_vcpu_info arch;
+	struct pvclock_vcpu_time_info time;
+}; /* 64 bytes (x86) */
+
+/*
+ * Xen/kernel shared data -- pointer provided in start_info.
+ * NB. We expect that this struct is smaller than a page.
+ */
+struct shared_info {
+	struct vcpu_info vcpu_info[MAX_VIRT_CPUS];
+
+	/*
+	 * A domain can create "event channels" on which it can send and receive
+	 * asynchronous event notifications. There are three classes of event that
+	 * are delivered by this mechanism:
+	 *  1. Bi-directional inter- and intra-domain connections. Domains must
+	 *     arrange out-of-band to set up a connection (usually by allocating
+	 *     an unbound 'listener' port and advertising that via a storage service
+	 *     such as xenstore).
+	 *  2. Physical interrupts. A domain with suitable hardware-access
+	 *     privileges can bind an event-channel port to a physical interrupt
+	 *     source.
+	 *  3. Virtual interrupts ('events'). A domain can bind an event-channel
+	 *     port to a virtual interrupt source, such as the virtual-timer
+	 *     device or the emergency console.
+	 *
+	 * Event channels are addressed by a "port index". Each channel is
+	 * associated with two bits of information:
+	 *  1. PENDING -- notifies the domain that there is a pending notification
+	 *     to be processed. This bit is cleared by the guest.
+	 *  2. MASK -- if this bit is clear then a 0->1 transition of PENDING
+	 *     will cause an asynchronous upcall to be scheduled. This bit is only
+	 *     updated by the guest. It is read-only within Xen. If a channel
+	 *     becomes pending while the channel is masked then the 'edge' is lost
+	 *     (i.e., when the channel is unmasked, the guest must manually handle
+	 *     pending notifications as no upcall will be scheduled by Xen).
+	 *
+	 * To expedite scanning of pending notifications, any 0->1 pending
+	 * transition on an unmasked channel causes a corresponding bit in a
+	 * per-vcpu selector word to be set. Each bit in the selector covers a
+	 * 'C long' in the PENDING bitfield array.
+	 */
+	xen_ulong_t evtchn_pending[sizeof(xen_ulong_t) * 8];
+	xen_ulong_t evtchn_mask[sizeof(xen_ulong_t) * 8];
+
+	/*
+	 * Wallclock time: updated only by control software. Guests should base
+	 * their gettimeofday() syscall on this wallclock-base value.
+	 */
+	struct pvclock_wall_clock wc;
+
+	struct arch_shared_info arch;
+
+};
+
+#else /* __ASSEMBLY__ */
+
+/* In assembly code we cannot use C numeric constant suffixes. */
+#define mk_unsigned_long(x) x
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __XEN_PUBLIC_XEN_H__ */
-- 
2.17.1


* [PATCH 05/17] xen: Port Xen hypervisor related code from mini-os
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (3 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 04/17] xen: Add essential and required interface headers Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-01 17:46   ` Julien Grall
  2020-07-01 16:29 ` [PATCH 06/17] xen: Port Xen event channel driver " Anastasiia Lukianenko
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Port hypervisor related code from mini-os. Update essential
arch code to support required bit operations, memory barriers etc.

Copyright for the ported bits belongs to at least the following authors;
please see the related files for details:

Copyright (c) 2002-2003, K A Fraser
Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
Copyright (c) 2014, Karim Allah Ahmed <karim.allah.ahmed@gmail.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
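Note on the event scan in do_hypervisor_callback(): the loop walks a
two-level bitmap. Each set bit in the per-vCPU selector word names one
word of evtchn_pending[], and only bits that are pending and not masked
are turned into port numbers. The inner loop re-reads the pending word on
every pass, so as far as I can tell it only terminates once the pending
bit is cleared, which patch 06 does through do_event() and clear_evtchn().
Below is a minimal standalone sketch of that scan, not the driver code
itself. struct evt_state, lowest_bit() and scan_events() are hypothetical
names, the 64-bit word size is an assumption, and __builtin_ctzll() is a
GCC/Clang builtin standing in for __ffs().

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64
#define NR_WORDS      64	/* 64 words x 64 bits = 4096 ports */

/* Hypothetical, simplified stand-in for the event part of shared_info. */
struct evt_state {
	uint64_t pending_sel;		/* one bit per word of pending[] */
	uint64_t pending[NR_WORDS];
	uint64_t mask[NR_WORDS];
};

/* Index of the lowest set bit; the word must be non-zero. */
static unsigned int lowest_bit(uint64_t word)
{
	return (unsigned int)__builtin_ctzll(word);
}

/*
 * Collect every pending, unmasked port into out[] (at most max entries)
 * and clear its pending bit, as an event handler would.
 * Returns the number of ports written to out[].
 */
static size_t scan_events(struct evt_state *s, unsigned int *out, size_t max)
{
	size_t n = 0;
	uint64_t l1 = s->pending_sel;

	s->pending_sel = 0;		/* take and clear the selector */
	while (l1 != 0) {
		unsigned int l1i = lowest_bit(l1);
		uint64_t l2;

		l1 &= ~(1ULL << l1i);
		/* Re-read the word each pass: only unmasked bits count. */
		while ((l2 = s->pending[l1i] & ~s->mask[l1i]) != 0 && n < max) {
			unsigned int l2i = lowest_bit(l2);

			s->pending[l1i] &= ~(1ULL << l2i);
			out[n++] = l1i * BITS_PER_WORD + l2i;
		}
	}
	return n;
}
```

A masked port keeps its pending bit set after the scan, which matches the
"edge is lost" remark in the shared_info comment: the guest has to pick it
up itself after unmasking.
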
 arch/arm/include/asm/xen/system.h |  96 +++++++++++
 common/board_r.c                  |  11 ++
 drivers/Makefile                  |   1 +
 drivers/xen/Makefile              |   5 +
 drivers/xen/hypervisor.c          | 277 ++++++++++++++++++++++++++++++
 include/xen.h                     |  11 ++
 include/xen/hvm.h                 |  30 ++++
 7 files changed, 431 insertions(+)
 create mode 100644 arch/arm/include/asm/xen/system.h
 create mode 100644 drivers/xen/Makefile
 create mode 100644 drivers/xen/hypervisor.c
 create mode 100644 include/xen.h
 create mode 100644 include/xen/hvm.h
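
Note on the synch_* bit helpers in asm/xen/system.h: they address the
bitmap one byte at a time (nr >> 3 selects the byte, nr & 7 the bit) and
use the GCC __atomic builtins with sequentially consistent ordering. The
sketch below shows the same technique in standalone form; the demo_*
names are hypothetical and not part of the patch. Because the bitmap is
indexed by byte, bit numbers only line up with bit numbers inside an
unsigned long on a little-endian target, which I assume is the case for
the AArch64 guest here.

```c
#include <stdint.h>

/* Atomically set bit nr in the bitmap at base; return its previous value. */
static int demo_test_and_set_bit(int nr, volatile void *base)
{
	volatile uint8_t *byte = (volatile uint8_t *)base + (nr >> 3);
	uint8_t bit = (uint8_t)(1u << (nr & 7));
	uint8_t orig = __atomic_fetch_or(byte, bit, __ATOMIC_SEQ_CST);

	return (orig & bit) != 0;
}

/* Atomically clear bit nr in the bitmap at base; return its previous value. */
static int demo_test_and_clear_bit(int nr, volatile void *base)
{
	volatile uint8_t *byte = (volatile uint8_t *)base + (nr >> 3);
	uint8_t bit = (uint8_t)(1u << (nr & 7));
	uint8_t orig = __atomic_fetch_and(byte, (uint8_t)~bit, __ATOMIC_SEQ_CST);

	return (orig & bit) != 0;
}
```

For example, setting bit 10 in a zeroed byte array changes byte 1 to 0x04,
and a second call on the same bit reports that it was already set.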

diff --git a/arch/arm/include/asm/xen/system.h b/arch/arm/include/asm/xen/system.h
new file mode 100644
index 0000000000..81ab90160e
--- /dev/null
+++ b/arch/arm/include/asm/xen/system.h
@@ -0,0 +1,96 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * (C) 2014 Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
+ * (C) 2020, EPAM Systems Inc.
+ */
+#ifndef _ASM_ARM_XEN_SYSTEM_H
+#define _ASM_ARM_XEN_SYSTEM_H
+
+#include <compiler.h>
+#include <asm/bitops.h>
+
+/* If *ptr == old, atomically store new there and return new.
+ * Otherwise leave *ptr unchanged and return old, i.e. the expected value
+ * that was passed in, not the current contents of *ptr.
+ */
+#define synch_cmpxchg(ptr, old, new) \
+({ __typeof__(*ptr) stored = old; \
+	__atomic_compare_exchange_n(ptr, &stored, new, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? new : old; \
+})
+
+/* As test_and_clear_bit, but using __ATOMIC_SEQ_CST */
+static inline int synch_test_and_clear_bit(int nr, volatile void *addr)
+{
+	u8 *byte = ((u8 *)addr) + (nr >> 3);
+	u8 bit = 1 << (nr & 7);
+	u8 orig;
+
+	orig = __atomic_fetch_and(byte, ~bit, __ATOMIC_SEQ_CST);
+
+	return (orig & bit) != 0;
+}
+
+/* As test_and_set_bit, but using __ATOMIC_SEQ_CST */
+static inline int synch_test_and_set_bit(int nr, volatile void *base)
+{
+	u8 *byte = ((u8 *)base) + (nr >> 3);
+	u8 bit = 1 << (nr & 7);
+	u8 orig;
+
+	orig = __atomic_fetch_or(byte, bit, __ATOMIC_SEQ_CST);
+
+	return (orig & bit) != 0;
+}
+
+/* As set_bit, but using __ATOMIC_SEQ_CST */
+static inline void synch_set_bit(int nr, volatile void *addr)
+{
+	synch_test_and_set_bit(nr, addr);
+}
+
+/* As clear_bit, but using __ATOMIC_SEQ_CST */
+static inline void synch_clear_bit(int nr, volatile void *addr)
+{
+	synch_test_and_clear_bit(nr, addr);
+}
+
+/* As test_bit, but with a following memory barrier. */
+//static inline int synch_test_bit(int nr, volatile void *addr)
+static inline int synch_test_bit(int nr, const void *addr)
+{
+	int result;
+
+	result = test_bit(nr, addr);
+	barrier();
+	return result;
+}
+
+/* Atomically store v to *ptr and return the previous contents of *ptr. */
+#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
+
+#define mb()		dsb()
+#define rmb()		dsb()
+#define wmb()		dsb()
+#define __iormb()	dmb()
+#define __iowmb()	dmb()
+#define xen_mb()	mb()
+#define xen_rmb()	rmb()
+#define xen_wmb()	wmb()
+
+#define smp_processor_id()	0
+
+#define to_phys(x)		((unsigned long)(x))
+#define to_virt(x)		((void *)(x))
+
+#define PFN_UP(x)		(unsigned long)(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
+#define PFN_DOWN(x)		(unsigned long)((x) >> PAGE_SHIFT)
+#define PFN_PHYS(x)		((unsigned long)(x) << PAGE_SHIFT)
+#define PHYS_PFN(x)		(unsigned long)((x) >> PAGE_SHIFT)
+
+#define virt_to_pfn(_virt)	(PFN_DOWN(to_phys(_virt)))
+#define virt_to_mfn(_virt)	(PFN_DOWN(to_phys(_virt)))
+#define mfn_to_virt(_mfn)	(to_virt(PFN_PHYS(_mfn)))
+#define pfn_to_virt(_pfn)	(to_virt(PFN_PHYS(_pfn)))
+
+#endif
diff --git a/common/board_r.c b/common/board_r.c
index fa57fa9b69..fd36edb4e5 100644
--- a/common/board_r.c
+++ b/common/board_r.c
@@ -56,6 +56,7 @@
 #include <timer.h>
 #include <trace.h>
 #include <watchdog.h>
+#include <xen.h>
 #ifdef CONFIG_ADDR_MAP
 #include <asm/mmu.h>
 #endif
@@ -462,6 +463,13 @@ static int initr_mmc(void)
 }
 #endif
 
+#ifdef CONFIG_XEN
+static int initr_xen(void)
+{
+	xen_init();
+	return 0;
+}
+#endif
 /*
  * Tell if it's OK to load the environment early in boot.
  *
@@ -769,6 +777,9 @@ static init_fnc_t init_sequence_r[] = {
 #endif
 #ifdef CONFIG_MMC
 	initr_mmc,
+#endif
+#ifdef CONFIG_XEN
+	initr_xen,
 #endif
 	initr_env,
 #ifdef CONFIG_SYS_BOOTPARAMS_LEN
diff --git a/drivers/Makefile b/drivers/Makefile
index 94e8c5da17..0dd8891e76 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -28,6 +28,7 @@ obj-$(CONFIG_$(SPL_)REMOTEPROC) += remoteproc/
 obj-$(CONFIG_$(SPL_TPL_)TPM) += tpm/
 obj-$(CONFIG_$(SPL_TPL_)ACPI_PMC) += power/acpi_pmc/
 obj-$(CONFIG_$(SPL_)BOARD) += board/
+obj-$(CONFIG_XEN) += xen/
 
 ifndef CONFIG_TPL_BUILD
 ifdef CONFIG_SPL_BUILD
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
new file mode 100644
index 0000000000..1211bf2386
--- /dev/null
+++ b/drivers/xen/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier:	GPL-2.0+
+#
+# (C) Copyright 2020 EPAM Systems Inc.
+
+obj-y += hypervisor.o
diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
new file mode 100644
index 0000000000..5883285142
--- /dev/null
+++ b/drivers/xen/hypervisor.c
@@ -0,0 +1,277 @@
+/******************************************************************************
+ * hypervisor.c
+ *
+ * Communication to/from hypervisor.
+ *
+ * Copyright (c) 2002-2003, K A Fraser
+ * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
+ * Copyright (c) 2020, EPAM Systems Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include <common.h>
+#include <cpu_func.h>
+#include <log.h>
+#include <memalign.h>
+
+#include <asm/io.h>
+#include <asm/armv8/mmu.h>
+#include <asm/xen/system.h>
+
+#include <linux/bug.h>
+
+#include <xen/hvm.h>
+#include <xen/interface/memory.h>
+
+#define active_evtchns(cpu, sh, idx)	\
+	((sh)->evtchn_pending[idx] &	\
+	 ~(sh)->evtchn_mask[idx])
+
+int in_callback;
+
+/*
+ * Shared page for communicating with the hypervisor.
+ * Event flags go here, for example.
+ */
+struct shared_info *HYPERVISOR_shared_info;
+
+#ifndef CONFIG_PARAVIRT
+static const char *param_name(int op)
+{
+#define PARAM(x)[HVM_PARAM_##x] = #x
+	static const char *const names[] = {
+		PARAM(CALLBACK_IRQ),
+		PARAM(STORE_PFN),
+		PARAM(STORE_EVTCHN),
+		PARAM(PAE_ENABLED),
+		PARAM(IOREQ_PFN),
+		PARAM(TIMER_MODE),
+		PARAM(HPET_ENABLED),
+		PARAM(IDENT_PT),
+		PARAM(ACPI_S_STATE),
+		PARAM(VM86_TSS),
+		PARAM(VPT_ALIGN),
+		PARAM(CONSOLE_PFN),
+		PARAM(CONSOLE_EVTCHN),
+	};
+#undef PARAM
+
+	if (op >= ARRAY_SIZE(names))
+		return "unknown";
+
+	if (!names[op])
+		return "reserved";
+
+	return names[op];
+}
+
+int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value)
+{
+	struct xen_hvm_param xhv;
+	int ret;
+
+	xhv.domid = DOMID_SELF;
+	xhv.index = idx;
+	invalidate_dcache_range((unsigned long)&xhv,
+				(unsigned long)&xhv + sizeof(xhv));
+
+	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
+	if (ret < 0) {
+		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
+			   param_name(idx), idx, ret);
+		BUG();
+	}
+	invalidate_dcache_range((unsigned long)&xhv,
+				(unsigned long)&xhv + sizeof(xhv));
+
+	*value = xhv.value;
+	return ret;
+}
+
+int hvm_get_parameter(int idx, uint64_t *value)
+{
+	struct xen_hvm_param xhv;
+	int ret;
+
+	xhv.domid = DOMID_SELF;
+	xhv.index = idx;
+	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
+	if (ret < 0) {
+		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
+			   param_name(idx), idx, ret);
+		BUG();
+	}
+
+	*value = xhv.value;
+	return ret;
+}
+
+int hvm_set_parameter(int idx, uint64_t value)
+{
+	struct xen_hvm_param xhv;
+	int ret;
+
+	xhv.domid = DOMID_SELF;
+	xhv.index = idx;
+	xhv.value = value;
+	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &xhv);
+
+	if (ret < 0) {
+		pr_err("Cannot set hvm parameter %s (%d): %d!\n",
+			   param_name(idx), idx, ret);
+		BUG();
+	}
+
+	return ret;
+}
+
+struct shared_info *map_shared_info(void *p)
+{
+	struct xen_add_to_physmap xatp;
+
+	HYPERVISOR_shared_info = (struct shared_info *)memalign(PAGE_SIZE,
+								PAGE_SIZE);
+	if (HYPERVISOR_shared_info == NULL)
+		BUG();
+
+	xatp.domid = DOMID_SELF;
+	xatp.idx = 0;
+	xatp.space = XENMAPSPACE_shared_info;
+	xatp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
+	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp) != 0)
+		BUG();
+
+	return HYPERVISOR_shared_info;
+}
+
+void unmap_shared_info(void)
+{
+	struct xen_remove_from_physmap xrtp;
+
+	xrtp.domid = DOMID_SELF;
+	xrtp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
+	if (HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp) != 0)
+		BUG();
+}
+#endif
+
+void do_hypervisor_callback(struct pt_regs *regs)
+{
+	unsigned long l1, l2, l1i, l2i;
+	unsigned int port;
+	int cpu = 0;
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
+
+	in_callback = 1;
+
+	vcpu_info->evtchn_upcall_pending = 0;
+	/* NB x86. No need for a barrier here -- XCHG is a barrier on x86. */
+#if !defined(__i386__) && !defined(__x86_64__)
+	/* Clear master flag /before/ clearing selector flag. */
+	wmb();
+#endif
+	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
+
+	while (l1 != 0) {
+		l1i = __ffs(l1);
+		l1 &= ~(1UL << l1i);
+
+		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
+			l2i = __ffs(l2);
+			l2 &= ~(1UL << l2i);
+
+			port = (l1i * (sizeof(unsigned long) * 8)) + l2i;
+			/* TODO: handle new event: do_event(port, regs); */
+			/* Suppress -Wunused-but-set-variable */
+			(void)(port);
+		}
+	}
+
+	in_callback = 0;
+}
+
+void force_evtchn_callback(void)
+{
+#ifdef XEN_HAVE_PV_UPCALL_MASK
+	int save;
+#endif
+	struct vcpu_info *vcpu;
+
+	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
+#ifdef XEN_HAVE_PV_UPCALL_MASK
+	save = vcpu->evtchn_upcall_mask;
+#endif
+
+	while (vcpu->evtchn_upcall_pending) {
+#ifdef XEN_HAVE_PV_UPCALL_MASK
+		vcpu->evtchn_upcall_mask = 1;
+#endif
+		barrier();
+		do_hypervisor_callback(NULL);
+		barrier();
+#ifdef XEN_HAVE_PV_UPCALL_MASK
+		vcpu->evtchn_upcall_mask = save;
+		barrier();
+#endif
+	}
+}
+
+void mask_evtchn(uint32_t port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	synch_set_bit(port, &s->evtchn_mask[0]);
+}
+
+void unmask_evtchn(uint32_t port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
+
+	synch_clear_bit(port, &s->evtchn_mask[0]);
+
+	/*
+	 * The following is basically the equivalent of 'hw_resend_irq'. Just like
+	 * a real IO-APIC we 'lose the interrupt edge' if the channel is masked.
+	 */
+	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
+	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
+				    &vcpu_info->evtchn_pending_sel)) {
+		vcpu_info->evtchn_upcall_pending = 1;
+#ifdef XEN_HAVE_PV_UPCALL_MASK
+		if (!vcpu_info->evtchn_upcall_mask)
+#endif
+			force_evtchn_callback();
+	}
+}
+
+void clear_evtchn(uint32_t port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+
+	synch_clear_bit(port, &s->evtchn_pending[0]);
+}
+
+void xen_init(void)
+{
+	debug("%s\n", __func__);
+
+	map_shared_info(NULL);
+}
+
diff --git a/include/xen.h b/include/xen.h
new file mode 100644
index 0000000000..1d6f74cc92
--- /dev/null
+++ b/include/xen.h
@@ -0,0 +1,11 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * (C) 2020, EPAM Systems Inc.
+ */
+#ifndef __XEN_H__
+#define __XEN_H__
+
+void xen_init(void);
+
+#endif /* __XEN_H__ */
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
new file mode 100644
index 0000000000..89de9625ca
--- /dev/null
+++ b/include/xen/hvm.h
@@ -0,0 +1,30 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Simple wrappers around HVM functions
+ *
+ * Copyright (c) 2002-2003, K A Fraser
+ * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
+ * Copyright (c) 2020, EPAM Systems Inc.
+ */
+#ifndef XEN_HVM_H__
+#define XEN_HVM_H__
+
+#include <asm/xen/hypercall.h>
+#include <xen/interface/hvm/params.h>
+#include <xen/interface/xen.h>
+
+extern struct shared_info *HYPERVISOR_shared_info;
+
+int hvm_get_parameter(int idx, uint64_t *value);
+int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value);
+int hvm_set_parameter(int idx, uint64_t value);
+
+struct shared_info *map_shared_info(void *p);
+void unmap_shared_info(void);
+void do_hypervisor_callback(struct pt_regs *regs);
+void mask_evtchn(uint32_t port);
+void unmask_evtchn(uint32_t port);
+void clear_evtchn(uint32_t port);
+
+#endif /* XEN_HVM_H__ */
-- 
2.17.1


* [PATCH 06/17] xen: Port Xen event channel driver from mini-os
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (4 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 05/17] xen: Port Xen hypervisor related code from mini-os Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver Anastasiia Lukianenko
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Port the Xen event channel driver from mini-os. Make the updates
required to run on U-Boot and strip functionality not needed by U-Boot.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
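Note on the handler table in events.c: every port has exactly one
handler slot, all slots start out pointing at a default handler, and
do_event() bumps a per-port counter before calling whatever is bound.
The standalone sketch below illustrates that design only. All demo_*
names are hypothetical, the table is shrunk to 16 entries, and the range
check in demo_bind() is my addition; bind_evtchn() in this patch indexes
ev_actions[] without one.

```c
#include <stddef.h>

#define DEMO_NR_EVS 16

typedef void (*demo_handler_t)(unsigned int port, void *data);

/* One slot per port; chaining or sharing of handlers is not supported. */
struct demo_action {
	demo_handler_t handler;
	void *data;
	unsigned int count;
};

static struct demo_action demo_actions[DEMO_NR_EVS];

/* Default handler: ignores the event. */
static void demo_default_handler(unsigned int port, void *data)
{
	(void)port;
	(void)data;
}

/* Point every slot at the default handler and reset the counters. */
static void demo_init_events(void)
{
	for (size_t i = 0; i < DEMO_NR_EVS; i++) {
		demo_actions[i].handler = demo_default_handler;
		demo_actions[i].data = NULL;
		demo_actions[i].count = 0;
	}
}

/* Bind a handler to a port. Returns 0, or -1 if the port is out of range. */
static int demo_bind(unsigned int port, demo_handler_t handler, void *data)
{
	if (port >= DEMO_NR_EVS)
		return -1;
	demo_actions[port].data = data;
	demo_actions[port].handler = handler ? handler : demo_default_handler;
	return 0;
}

/* Dispatch one event. Returns 0, or -1 if the port is out of range. */
static int demo_do_event(unsigned int port)
{
	if (port >= DEMO_NR_EVS)
		return -1;
	demo_actions[port].count++;
	demo_actions[port].handler(port, demo_actions[port].data);
	return 0;
}

/* Example handler: adds the port number to an int accumulator. */
static void demo_sum_handler(unsigned int port, void *data)
{
	*(int *)data += (int)port;
}
```

Events on ports with no bound handler still increment the counter and
fall through to the default handler, which is the behaviour init_events()
sets up for all NR_EVS ports.
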
 drivers/xen/Makefile     |   1 +
 drivers/xen/events.c     | 177 +++++++++++++++++++++++++++++++++++++++
 drivers/xen/hypervisor.c |   6 +-
 include/xen/events.h     |  47 +++++++++++
 4 files changed, 228 insertions(+), 3 deletions(-)
 create mode 100644 drivers/xen/events.c
 create mode 100644 include/xen/events.h

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 1211bf2386..0ad35edefb 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -3,3 +3,4 @@
 # (C) Copyright 2020 EPAM Systems Inc.
 
 obj-y += hypervisor.o
+obj-y += events.o
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
new file mode 100644
index 0000000000..eddc6b6e29
--- /dev/null
+++ b/drivers/xen/events.c
@@ -0,0 +1,177 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ ****************************************************************************
+ * (C) 2003 - Rolf Neugebauer - Intel Research Cambridge
+ * (C) 2005 - Grzegorz Milos - Intel Research Cambridge
+ * (C) 2020 - EPAM Systems Inc.
+ ****************************************************************************
+ *
+ *		File: events.c
+ *	  Author: Rolf Neugebauer (neugebar at dcs.gla.ac.uk)
+ *	 Changes: Grzegorz Milos (gm281 at cam.ac.uk)
+ *
+ *		Date: Jul 2003, changes Jun 2005
+ *
+ * Environment: Xen Minimal OS
+ * Description: Deals with events received on event channels
+ *
+ ****************************************************************************
+ */
+#include <common.h>
+#include <log.h>
+
+#include <asm/io.h>
+#include <asm/xen/system.h>
+
+#include <xen/events.h>
+#include <xen/hvm.h>
+
+#define NR_EVS 1024
+
+/* This represents an event handler. Chaining or sharing is not allowed */
+typedef struct _ev_action_t {
+	evtchn_handler_t handler;
+	void *data;
+	u32 count;
+} ev_action_t;
+
+static ev_action_t ev_actions[NR_EVS];
+void default_handler(evtchn_port_t port, struct pt_regs *regs, void *data);
+
+static unsigned long bound_ports[NR_EVS / (8 * sizeof(unsigned long))];
+
+void unbind_all_ports(void)
+{
+	int i;
+	int cpu = 0;
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
+
+	for (i = 0; i < NR_EVS; i++) {
+		if (test_and_clear_bit(i, bound_ports)) {
+			printf("port %d still bound!\n", i);
+			unbind_evtchn(i);
+		}
+	}
+	vcpu_info->evtchn_upcall_pending = 0;
+	vcpu_info->evtchn_pending_sel = 0;
+}
+
+/*
+ * Demux events to different handlers.
+ */
+int do_event(evtchn_port_t port, struct pt_regs *regs)
+{
+	ev_action_t  *action;
+
+	clear_evtchn(port);
+
+	if (port >= NR_EVS) {
+		printf("WARN: do_event(): Port number too large: %d\n", port);
+		return 1;
+	}
+
+	action = &ev_actions[port];
+	action->count++;
+
+	/* call the handler */
+	action->handler(port, regs, action->data);
+
+	return 1;
+}
+
+evtchn_port_t bind_evtchn(evtchn_port_t port, evtchn_handler_t handler,
+			  void *data)
+{
+	if (ev_actions[port].handler != default_handler)
+		printf("WARN: Handler for port %d already registered, replacing\n",
+		       port);
+
+	ev_actions[port].data = data;
+	wmb();
+	ev_actions[port].handler = handler;
+	synch_set_bit(port, bound_ports);
+
+	return port;
+}
+
+void unbind_evtchn(evtchn_port_t port)
+{
+	struct evtchn_close close;
+	int rc;
+
+	if (ev_actions[port].handler == default_handler)
+		printf("WARN: No handler for port %d when unbinding\n", port);
+	mask_evtchn(port);
+	clear_evtchn(port);
+
+	ev_actions[port].handler = default_handler;
+	wmb();
+	ev_actions[port].data = NULL;
+	synch_clear_bit(port, bound_ports);
+
+	close.port = port;
+	rc = HYPERVISOR_event_channel_op(EVTCHNOP_close, &close);
+	if (rc)
+		printf("WARN: close_port %d failed rc=%d. ignored\n", port, rc);
+}
+
+void default_handler(evtchn_port_t port, struct pt_regs *regs, void *ignore)
+{
+	debug("[Port %d] - event received\n", port);
+}
+
+/* Create a port available to the pal for exchanging notifications.
+ * Returns the result of the hypervisor call.
+ */
+
+/* Unfortunate confusion of terminology: the port is unbound as far
+ * as Xen is concerned, but we automatically bind a handler to it
+ * from inside U-Boot.
+ */
+int evtchn_alloc_unbound(domid_t pal, evtchn_handler_t handler,
+			 void *data, evtchn_port_t *port)
+{
+	int rc;
+
+	struct evtchn_alloc_unbound op;
+
+	op.dom = DOMID_SELF;
+	op.remote_dom = pal;
+	rc = HYPERVISOR_event_channel_op(EVTCHNOP_alloc_unbound, &op);
+	if (rc) {
+		printf("ERROR: alloc_unbound failed with rc=%d\n", rc);
+		return rc;
+	}
+	if (!handler)
+		handler = default_handler;
+	*port = bind_evtchn(op.port, handler, data);
+	return rc;
+}
+
+void eventchn_poll(void)
+{
+	do_hypervisor_callback(NULL);
+}
+
+/*
+ * Initially all events are without a handler and disabled
+ */
+void init_events(void)
+{
+	int i;
+
+	debug("%s\n", __func__);
+	/* initialize event handler */
+	for (i = 0; i < NR_EVS; i++) {
+		ev_actions[i].handler = default_handler;
+		mask_evtchn(i);
+	}
+}
+
+void fini_events(void)
+{
+	debug("%s\n", __func__);
+	/* Dealloc all events */
+	unbind_all_ports();
+}
+
diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
index 5883285142..975e552242 100644
--- a/drivers/xen/hypervisor.c
+++ b/drivers/xen/hypervisor.c
@@ -37,6 +37,7 @@
 #include <linux/bug.h>
 
 #include <xen/hvm.h>
+#include <xen/events.h>
 #include <xen/interface/memory.h>
 
 #define active_evtchns(cpu, sh, idx)	\
@@ -198,9 +199,7 @@ void do_hypervisor_callback(struct pt_regs *regs)
 			l2 &= ~(1UL << l2i);
 
 			port = (l1i * (sizeof(unsigned long) * 8)) + l2i;
-			/* TODO: handle new event: do_event(port, regs); */
-			/* Suppress -Wunused-but-set-variable */
-			(void)(port);
+			do_event(port, regs);
 		}
 	}
 
@@ -273,5 +272,6 @@ void xen_init(void)
 	debug("%s\n", __func__);
 
 	map_shared_info(NULL);
+	init_events();
 }
 
diff --git a/include/xen/events.h b/include/xen/events.h
new file mode 100644
index 0000000000..63abdf426b
--- /dev/null
+++ b/include/xen/events.h
@@ -0,0 +1,47 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ ****************************************************************************
+ * (C) 2003 - Rolf Neugebauer - Intel Research Cambridge
+ * (C) 2005 - Grzegorz Milos - Intel Research Cambridge
+ * (C) 2020 - EPAM Systems Inc.
+ ****************************************************************************
+ *
+ *        File: events.h
+ *      Author: Rolf Neugebauer (neugebar at dcs.gla.ac.uk)
+ *     Changes: Grzegorz Milos (gm281@cam.ac.uk)
+ *
+ *        Date: Jul 2003, changes Jun 2005
+ *
+ * Environment: Xen Minimal OS
+ * Description: Deals with events on the event channels
+ *
+ ****************************************************************************
+ */
+
+#ifndef _EVENTS_H_
+#define _EVENTS_H_
+
+#include <asm/xen/hypercall.h>
+#include <xen/interface/event_channel.h>
+
+typedef void (*evtchn_handler_t)(evtchn_port_t, struct pt_regs *, void *);
+
+void init_events(void);
+void fini_events(void);
+
+int do_event(evtchn_port_t port, struct pt_regs *regs);
+void unbind_evtchn(evtchn_port_t port);
+void unbind_all_ports(void);
+int evtchn_alloc_unbound(domid_t pal, evtchn_handler_t handler,
+			 void *data, evtchn_port_t *port);
+
+static inline int notify_remote_via_evtchn(evtchn_port_t port)
+{
+	struct evtchn_send op;
+
+	op.port = port;
+	return HYPERVISOR_event_channel_op(EVTCHNOP_send, &op);
+}
+
+void eventchn_poll(void);
+
+#endif /* _EVENTS_H_ */
-- 
2.17.1


* [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (5 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 06/17] xen: Port Xen event channel driver " Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro Anastasiia Lukianenko
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Peng Fan <peng.fan@nxp.com>

Add support for the Xen para-virtualized serial driver. This
driver provides a full serial console for the virtual machine.

Please note that, because the driver is initialized late, neither
the banner nor the memory size is visible.
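
For readers unfamiliar with the PV console protocol, the ring handling
in the driver below can be sketched in isolation. This is only an
illustration under stated assumptions: struct demo_ring, RING_SIZE and
the demo_ring_* helpers are hypothetical stand-ins for one direction of
struct xencons_interface, and the memory barriers and event channel
notification that the real driver needs are reduced to comments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring size; must be a power of two so masking works. */
#define RING_SIZE 16u
#define RING_MASK(idx) ((idx) & (RING_SIZE - 1))

/* Hypothetical stand-in for one direction of struct xencons_interface. */
struct demo_ring {
	char buf[RING_SIZE];
	uint32_t cons;	/* advanced by the reader */
	uint32_t prod;	/* advanced by the writer */
};

/* Queue as many bytes as fit and return how many were queued. */
static size_t demo_ring_write(struct demo_ring *r, const char *data,
			      size_t len)
{
	uint32_t cons = r->cons;
	uint32_t prod = r->prod;
	size_t sent = 0;

	/* Indices run freely; their unsigned difference is the fill level. */
	while (sent < len && (uint32_t)(prod - cons) < RING_SIZE)
		r->buf[RING_MASK(prod++)] = data[sent++];

	/* A real driver issues a barrier here, before publishing prod. */
	r->prod = prod;
	return sent;
}

/* Read one byte, or return -1 if the ring is empty. */
static int demo_ring_read(struct demo_ring *r)
{
	uint32_t cons = r->cons;
	char c;

	if (cons == r->prod)
		return -1;

	c = r->buf[RING_MASK(cons++)];
	/* A real driver issues a barrier here, before publishing cons. */
	r->cons = cons;
	return (unsigned char)c;
}
```

A short write (return value smaller than len) means the ring is full;
the caller is expected to retry with the remaining bytes once the other
end has consumed some data.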

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 arch/arm/Kconfig                          |   1 +
 board/xen/xenguest_arm64/xenguest_arm64.c |  31 +++-
 configs/xenguest_arm64_defconfig          |   4 +-
 drivers/serial/Kconfig                    |   7 +
 drivers/serial/Makefile                   |   1 +
 drivers/serial/serial_xen.c               | 175 ++++++++++++++++++++++
 drivers/xen/events.c                      |   4 +
 7 files changed, 214 insertions(+), 9 deletions(-)
 create mode 100644 drivers/serial/serial_xen.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c469863967..d4de1139aa 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1723,6 +1723,7 @@ config TARGET_XENGUEST_ARM64
 	select XEN
 	select OF_CONTROL
 	select LINUX_KERNEL_IMAGE_HEADER
+	select XEN_SERIAL
 endchoice
 
 config ARCH_SUPPORT_TFABOOT
diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
index 9e099f388f..fd10a002e9 100644
--- a/board/xen/xenguest_arm64/xenguest_arm64.c
+++ b/board/xen/xenguest_arm64/xenguest_arm64.c
@@ -18,9 +18,12 @@
 #include <asm/armv8/mmu.h>
 #include <asm/xen.h>
 #include <asm/xen/hypercall.h>
+#include <asm/xen/system.h>
 
 #include <linux/compiler.h>
 
+#include <xen/hvm.h>
+
 DECLARE_GLOBAL_DATA_PTR;
 
 int board_init(void)
@@ -57,9 +60,28 @@ static int get_next_memory_node(const void *blob, int mem)
 
 static int setup_mem_map(void)
 {
-	int i, ret, mem, reg = 0;
+	int i = 0, ret, mem, reg = 0;
 	struct fdt_resource res;
 	const void *blob = gd->fdt_blob;
+	u64 gfn;
+
+	/*
+	 * Add "magic" region which is used by Xen to provide some essentials
+	 * for the guest: we need console.
+	 */
+	ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_CONSOLE_PFN, &gfn);
+	if (ret < 0) {
+		printf("%s: Can't get HVM_PARAM_CONSOLE_PFN, ret %d\n",
+		       __func__, ret);
+		return -EINVAL;
+	}
+
+	xen_mem_map[i].virt = PFN_PHYS(gfn);
+	xen_mem_map[i].phys = PFN_PHYS(gfn);
+	xen_mem_map[i].size = PAGE_SIZE;
+	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+				PTE_BLOCK_INNER_SHARE);
+	i++;
 
 	mem = get_next_memory_node(blob, -1);
 	if (mem < 0) {
@@ -67,7 +89,7 @@ static int setup_mem_map(void)
 		return -EINVAL;
 	}
 
-	for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
+	for (; i < MAX_MEM_MAP_REGIONS; i++) {
 		ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
 		if (ret == -FDT_ERR_NOTFOUND) {
 			reg = 0;
@@ -146,8 +168,3 @@ int print_cpuinfo(void)
 	return 0;
 }
 
-__weak struct serial_device *default_serial_console(void)
-{
-	return NULL;
-}
-
diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
index 2a8caf8647..45559a161b 100644
--- a/configs/xenguest_arm64_defconfig
+++ b/configs/xenguest_arm64_defconfig
@@ -47,9 +47,9 @@ CONFIG_CMD_UMS=n
 #CONFIG_EFI_PARTITION=y
 # CONFIG_EFI_LOADER is not set
 
-# CONFIG_DM is not set
+CONFIG_DM=y
 # CONFIG_MMC is not set
-# CONFIG_DM_SERIAL is not set
+CONFIG_DM_SERIAL=y
 # CONFIG_REQUIRE_SERIAL_CONSOLE is not set
 
 CONFIG_OF_BOARD=y
diff --git a/drivers/serial/Kconfig b/drivers/serial/Kconfig
index 17d0e73623..33c989a66d 100644
--- a/drivers/serial/Kconfig
+++ b/drivers/serial/Kconfig
@@ -821,6 +821,13 @@ config MPC8XX_CONS
 	depends on MPC8xx
 	default y
 
+config XEN_SERIAL
+	bool "XEN serial support"
+	depends on XEN
+	help
+	  If built without DM support, then requires Xen
+	  to be built with CONFIG_VERBOSE_DEBUG.
+
 choice
 	prompt "Console port"
 	default 8xx_CONS_SMC1
diff --git a/drivers/serial/Makefile b/drivers/serial/Makefile
index e4a92bbbb7..25f7f8d342 100644
--- a/drivers/serial/Makefile
+++ b/drivers/serial/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_OWL_SERIAL) += serial_owl.o
 obj-$(CONFIG_OMAP_SERIAL) += serial_omap.o
 obj-$(CONFIG_MTK_SERIAL) += serial_mtk.o
 obj-$(CONFIG_SIFIVE_SERIAL) += serial_sifive.o
+obj-$(CONFIG_XEN_SERIAL) += serial_xen.o
 
 ifndef CONFIG_SPL_BUILD
 obj-$(CONFIG_USB_TTY) += usbtty.o
diff --git a/drivers/serial/serial_xen.c b/drivers/serial/serial_xen.c
new file mode 100644
index 0000000000..dcd4b2df79
--- /dev/null
+++ b/drivers/serial/serial_xen.c
@@ -0,0 +1,175 @@
+/*
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * (C) 2018 NXP
+ * (C) 2020 EPAM Systems Inc.
+ */
+#include <common.h>
+#include <cpu_func.h>
+#include <dm.h>
+#include <serial.h>
+#include <watchdog.h>
+
+#include <linux/bug.h>
+
+#include <xen/hvm.h>
+#include <xen/events.h>
+
+#include <xen/interface/sched.h>
+#include <xen/interface/hvm/hvm_op.h>
+#include <xen/interface/hvm/params.h>
+#include <xen/interface/io/console.h>
+#include <xen/interface/io/ring.h>
+
+DECLARE_GLOBAL_DATA_PTR;
+
+u32 console_evtchn;
+
+struct xen_uart_priv {
+	struct xencons_interface *intf;
+	u32 evtchn;
+	int vtermno;
+	struct hvc_struct *hvc;
+};
+
+int xen_serial_setbrg(struct udevice *dev, int baudrate)
+{
+	return 0;
+}
+
+static int xen_serial_probe(struct udevice *dev)
+{
+	struct xen_uart_priv *priv = dev_get_priv(dev);
+	u64 v = 0;
+	unsigned long gfn;
+	int r;
+
+	r = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+	if (r < 0 || v == 0)
+		return r;
+
+	priv->evtchn = v;
+	console_evtchn = v;
+
+	r = hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v);
+	if (r < 0 || v == 0)
+		return -ENODEV;
+
+	gfn = v;
+
+	priv->intf = (struct xencons_interface *)(gfn << XEN_PAGE_SHIFT);
+	if (!priv->intf)
+		return -EINVAL;
+	return 0;
+}
+
+static int xen_serial_pending(struct udevice *dev, bool input)
+{
+	struct xen_uart_priv *priv = dev_get_priv(dev);
+	struct xencons_interface *intf = priv->intf;
+
+	if (!input || intf->in_cons == intf->in_prod)
+		return 0;
+	return 1;
+}
+
+static int xen_serial_getc(struct udevice *dev)
+{
+	struct xen_uart_priv *priv = dev_get_priv(dev);
+	struct xencons_interface *intf = priv->intf;
+	XENCONS_RING_IDX cons;
+	char c;
+
+	while (intf->in_cons == intf->in_prod) {
+		mb(); /* wait */
+	}
+
+	cons = intf->in_cons;
+	mb();			/* get pointers before reading ring */
+
+	c = intf->in[MASK_XENCONS_IDX(cons++, intf->in)];
+
+	mb();			/* read ring before consuming */
+	intf->in_cons = cons;
+
+	notify_remote_via_evtchn(priv->evtchn);
+	return c;
+}
+
+static int __write_console(struct udevice *dev, const char *data, int len)
+{
+	struct xen_uart_priv *priv = dev_get_priv(dev);
+	struct xencons_interface *intf = priv->intf;
+	XENCONS_RING_IDX cons, prod;
+	int sent = 0;
+
+	cons = intf->out_cons;
+	prod = intf->out_prod;
+	mb(); /* Update pointer */
+
+	WARN_ON((prod - cons) > sizeof(intf->out));
+
+	while ((sent < len) && ((prod - cons) < sizeof(intf->out)))
+		intf->out[MASK_XENCONS_IDX(prod++, intf->out)] = data[sent++];
+
+	mb(); /* Update data before pointer */
+	intf->out_prod = prod;
+
+	if (sent)
+		notify_remote_via_evtchn(priv->evtchn);
+	return sent;
+}
+
+static int write_console(struct udevice *dev, const char *data, int len)
+{
+	/*
+	 * Make sure the whole buffer is emitted, polling if
+	 * necessary.  We don't ever want to rely on the hvc daemon
+	 * because the most interesting console output is when the
+	 * kernel is crippled.
+	 */
+	while (len) {
+		int sent = __write_console(dev, data, len);
+
+		data += sent;
+		len -= sent;
+
+		if (unlikely(len))
+			HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
+	}
+	return 0;
+}
+
+static int xen_serial_putc(struct udevice *dev, const char ch)
+{
+	write_console(dev, &ch, 1);
+	return 0;
+}
+
+static const struct dm_serial_ops xen_serial_ops = {
+	.putc = xen_serial_putc,
+	.getc = xen_serial_getc,
+	.pending = xen_serial_pending,
+};
+
+#if CONFIG_IS_ENABLED(OF_CONTROL)
+static const struct udevice_id xen_serial_ids[] = {
+	{ .compatible = "xen,xen" },
+	{ }
+};
+#endif
+
+U_BOOT_DRIVER(serial_xen) = {
+	.name			= "serial_xen",
+	.id			= UCLASS_SERIAL,
+#if CONFIG_IS_ENABLED(OF_CONTROL)
+	.of_match		= xen_serial_ids,
+#endif
+	.priv_auto_alloc_size	= sizeof(struct xen_uart_priv),
+	.probe			= xen_serial_probe,
+	.ops			= &xen_serial_ops,
+#if !CONFIG_IS_ENABLED(OF_CONTROL)
+	.flags			= DM_FLAG_PRE_RELOC,
+#endif
+};
+
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index eddc6b6e29..a1b36a2196 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -25,6 +25,8 @@
 #include <xen/events.h>
 #include <xen/hvm.h>
 
+extern u32 console_evtchn;
+
 #define NR_EVS 1024
 
 /* this represents a event handler. Chaining or sharing is not allowed */
@@ -47,6 +49,8 @@ void unbind_all_ports(void)
 	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
 
 	for (i = 0; i < NR_EVS; i++) {
+		if (i == console_evtchn)
+			continue;
 		if (test_and_clear_bit(i, bound_ports)) {
 			printf("port %d still bound!\n", i);
 			unbind_evtchn(i);
-- 
2.17.1


* [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (6 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  4:08   ` Heinrich Schuchardt
  2020-07-01 16:29 ` [PATCH 09/17] lib: sscanf: add sscanf implementation Anastasiia Lukianenko
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Add wait_event_timeout - sleep until a condition becomes true or a
timeout elapses.

This is a stripped-down version of the Linux kernel macro with the
following U-Boot specific modifications:
- no wait queues supported
- use u-boot timer to detect timeouts
- check for Ctrl-C pressed during wait
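
The polling approach listed above can be sketched as a plain function.
This is a minimal illustration, not the macro added by this patch:
demo_get_timer() is a hypothetical fake clock standing in for U-Boot's
get_timer(), demo_wait_timeout() and demo_countdown() are made-up
names, and the event channel polling and Ctrl-C check are left out.

```c
#include <stdbool.h>

/* Hypothetical fake clock: every call advances time by one tick. */
static unsigned long demo_now_ms;

/* Stand-in for get_timer(): return the time elapsed since base. */
static unsigned long demo_get_timer(unsigned long base)
{
	return ++demo_now_ms - base;
}

typedef bool (*demo_cond_fn)(void *ctx);

/*
 * Poll cond until it returns true or timeout_ms ticks have elapsed.
 * Return 1 if the condition became true, 0 on timeout.
 */
static int demo_wait_timeout(demo_cond_fn cond, void *ctx,
			     unsigned long timeout_ms)
{
	unsigned long start;

	/* Fast path: do not start the timer if already satisfied. */
	if (cond(ctx))
		return 1;

	start = demo_get_timer(0);
	for (;;) {
		if (cond(ctx))
			return 1;
		if (demo_get_timer(start) > timeout_ms)
			return 0;
	}
}

/* Example condition: true once the counter has been polled down to 0. */
static bool demo_countdown(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

Because there are no wait queues, the condition is simply re-evaluated
on every loop iteration, so it should be cheap and free of side effects
in real use (the countdown above has a side effect only to make the
example deterministic).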

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 include/linux/compat.h | 45 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 712eeaef4e..5375b7d3b8 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -1,12 +1,20 @@
 #ifndef _LINUX_COMPAT_H_
 #define _LINUX_COMPAT_H_
 
+#include <console.h>
 #include <log.h>
 #include <malloc.h>
+
+#include <asm/processor.h>
+
 #include <linux/types.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
 
+#ifdef CONFIG_XEN
+#include <xen/events.h>
+#endif
+
 struct unused {};
 typedef struct unused unused_t;
 
@@ -122,6 +130,43 @@ static inline void kmem_cache_destroy(struct kmem_cache *cachep)
 #define add_wait_queue(...)	do { } while (0)
 #define remove_wait_queue(...)	do { } while (0)
 
+#ifndef CONFIG_XEN
+#define eventchn_poll()
+#endif
+
+#define __wait_event_timeout(condition, timeout, ret)		\
+({								\
+	ulong __ret = ret; /* explicit shadow */		\
+	ulong start = get_timer(0);				\
+	for (;;) {						\
+		eventchn_poll();				\
+		if (condition) {				\
+			__ret = 1;				\
+			break;					\
+		}						\
+		if ((get_timer(start) > timeout) || ctrlc()) {	\
+			__ret = 0;				\
+			break;					\
+		}						\
+		cpu_relax();					\
+	}							\
+	__ret;							\
+})
+
+/*
+ * 0 if the @condition evaluated to %false after the @timeout elapsed,
+ * 1 if the @condition evaluated to %true
+ */
+#define wait_event_timeout(wq_head, condition, timeout)			\
+({									\
+	ulong __ret;							\
+	if (condition)							\
+		__ret = 1;						\
+	else								\
+		__ret = __wait_event_timeout(condition, timeout, __ret);\
+	__ret;								\
+})
+
 #define KERNEL_VERSION(a,b,c)	(((a) << 16) + ((b) << 8) + (c))
 
 /* This is also defined in ARMv8's mmu.h */
-- 
2.17.1


* [PATCH 09/17] lib: sscanf: add sscanf implementation
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (7 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  4:04   ` Heinrich Schuchardt
  2020-07-01 16:29 ` [PATCH 10/17] xen: Port Xen bus driver from mini-os Anastasiia Lukianenko
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Andrii Anisov <andrii_anisov@epam.com>

Port the sscanf implementation from mini-os and introduce a new
Kconfig option to enable it: CONFIG_SSCANF. It is disabled by default.
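
As an illustration of the kind of parsing sscanf enables, here is a
small sketch. It is built against the hosted C library's sscanf rather
than lib/sscanf.c, on the assumption that the port behaves the same for
the %u, %d and %[...] conversions used here. The "device/vbd/<id>" path
layout and the demo_* helper names are hypothetical.

```c
#include <stdio.h>

/* Parse a hypothetical "device/vbd/<id>" path; return 0 on success. */
static int demo_parse_vbd_id(const char *path, unsigned int *id)
{
	/* sscanf returns the number of fields it assigned. */
	return sscanf(path, "device/vbd/%u", id) == 1 ? 0 : -1;
}

/*
 * Parse "<key>=<value>" where key is lowercase letters only.
 * key must have room for 16 bytes (15 characters plus the NUL).
 */
static int demo_parse_key_value(const char *line, char *key, int *value)
{
	/* The field width bounds the scanset so key cannot overflow. */
	return sscanf(line, "%15[a-z]=%d", key, value) == 2 ? 0 : -1;
}
```

Checking the return value against the expected number of assignments,
as both helpers do, is what distinguishes a full match from a partial
one.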

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 include/vsprintf.h |   8 +
 lib/Kconfig        |   4 +
 lib/Makefile       |   1 +
 lib/sscanf.c       | 883 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 896 insertions(+)
 create mode 100644 lib/sscanf.c

diff --git a/include/vsprintf.h b/include/vsprintf.h
index d9fb68add0..ca2640dd43 100644
--- a/include/vsprintf.h
+++ b/include/vsprintf.h
@@ -234,4 +234,12 @@ char *strmhz(char *buf, unsigned long hz);
  */
 void str_to_upper(const char *in, char *out, size_t len);
 
+/**
+ * sscanf - Unformat a buffer into a list of arguments
+ * @buf:	input buffer
+ * @fmt:	formatting of buffer
+ * @...:	resulting arguments
+ */
+int sscanf(const char * buf, const char * fmt, ...);
+
 #endif
diff --git a/lib/Kconfig b/lib/Kconfig
index af5c38afd9..3dfc6dd0c5 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -67,6 +67,10 @@ config SPL_SPRINTF
 config TPL_SPRINTF
 	bool
 
+config SSCANF
+	bool
+	default n
+
 config STRTO
 	bool
 	default y
diff --git a/lib/Makefile b/lib/Makefile
index dc5761966c..65409df15e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -122,6 +122,7 @@ else
 # Main U-Boot always uses the full printf support
 obj-y += vsprintf.o strto.o
 obj-$(CONFIG_OID_REGISTRY) += oid_registry.o
+obj-$(CONFIG_SSCANF) += sscanf.o
 endif
 
 obj-y += date.o
diff --git a/lib/sscanf.c b/lib/sscanf.c
new file mode 100644
index 0000000000..2123fa4653
--- /dev/null
+++ b/lib/sscanf.c
@@ -0,0 +1,883 @@
+/*
+ ****************************************************************************
+ *
+ *        File: printf.c
+ *      Author: Juergen Gross <jgross@suse.com>
+ *
+ *        Date: Jun 2016
+ *
+ * Environment: Xen Minimal OS
+ * Description: Library functions for printing
+ *              (FreeBSD port)
+ *
+ ****************************************************************************
+ */
+
+/*-
+ * Copyright (c) 1990, 1993
+ *	The Regents of the University of California.  All rights reserved.
+ *
+ * This code is derived from software contributed to Berkeley by
+ * Chris Torek.
+ *
+ * Copyright (c) 2011 The FreeBSD Foundation
+ * All rights reserved.
+ * Portions of this software were developed by David Chisnall
+ * under sponsorship from the FreeBSD Foundation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#if !defined HAVE_LIBC
+
+#include <os.h>
+#include <linux/kernel.h>
+#include <linux/ctype.h>
+#include <vsprintf.h>
+#include <linux/string.h>
+
+#define __DECONST(type, var)    ((type)(uintptr_t)(const void *)(var))
+
+/*
+ * Convert a string to an unsigned long integer.
+ *
+ * Ignores `locale' stuff.  Assumes that the upper and lower case
+ * alphabets and digits are each contiguous.
+ */
+unsigned long
+strtoul(const char *nptr, char **endptr, int base)
+{
+	const char *s = nptr;
+	unsigned long acc;
+	unsigned char c;
+	unsigned long cutoff;
+	int neg = 0, any, cutlim;
+
+	/*
+	 * See strtol for comments as to the logic used.
+	 */
+	do {
+		c = *s++;
+	} while (isspace(c));
+	if (c == '-') {
+		neg = 1;
+		c = *s++;
+	} else if (c == '+') {
+		c = *s++;
+	}
+	if ((base == 0 || base == 16) &&
+		c == '0' && (*s == 'x' || *s == 'X')) {
+		c = s[1];
+		s += 2;
+		base = 16;
+	}
+	if (base == 0)
+		base = c == '0' ? 8 : 10;
+	cutoff = (unsigned long)ULONG_MAX / (unsigned long)base;
+	cutlim = (unsigned long)ULONG_MAX % (unsigned long)base;
+	for (acc = 0, any = 0;; c = *s++) {
+		if (!isascii(c))
+			break;
+		if (isdigit(c))
+			c -= '0';
+		else if (isalpha(c))
+			c -= isupper(c) ? 'A' - 10 : 'a' - 10;
+		else
+			break;
+		if (c >= base)
+			break;
+		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
+			any = -1;
+		} else {
+			any = 1;
+			acc *= base;
+			acc += c;
+		}
+	}
+	if (any < 0)
+		acc = ULONG_MAX;
+	else if (neg)
+		acc = -acc;
+	if (endptr != 0)
+		*endptr = __DECONST(char *, any ? s - 1 : nptr);
+	return acc;
+}
+
+/*
+ * Convert a string to a quad integer.
+ *
+ * Ignores `locale' stuff.  Assumes that the upper and lower case
+ * alphabets and digits are each contiguous.
+ */
+s64
+strtoq(const char *nptr, char **endptr, int base)
+{
+	const char *s;
+	u64 acc;
+	unsigned char c;
+	u64 qbase, cutoff;
+	int neg, any, cutlim;
+
+	/*
+	 * Skip white space and pick up leading +/- sign if any.
+	 * If base is 0, allow 0x for hex and 0 for octal, else
+	 * assume decimal; if base is already 16, allow 0x.
+	 */
+	s = nptr;
+	do {
+		c = *s++;
+	} while (isspace(c));
+	if (c == '-') {
+		neg = 1;
+		c = *s++;
+	} else {
+		neg = 0;
+		if (c == '+')
+			c = *s++;
+	}
+	if ((base == 0 || base == 16) &&
+	    c == '0' && (*s == 'x' || *s == 'X')) {
+		c = s[1];
+		s += 2;
+		base = 16;
+	}
+	if (base == 0)
+		base = c == '0' ? 8 : 10;
+
+	/*
+	 * Compute the cutoff value between legal numbers and illegal
+	 * numbers.  That is the largest legal value, divided by the
+	 * base.  An input number that is greater than this value, if
+	 * followed by a legal input character, is too big.  One that
+	 * is equal to this value may be valid or not; the limit
+	 * between valid and invalid numbers is then based on the last
+	 * digit.  For instance, if the range for quads is
+	 * [-9223372036854775808..9223372036854775807] and the input base
+	 * is 10, cutoff will be set to 922337203685477580 and cutlim to
+	 * either 7 (neg==0) or 8 (neg==1), meaning that if we have
+	 * accumulated a value > 922337203685477580, or equal but the
+	 * next digit is > 7 (or 8), the number is too big, and we will
+	 * return a range error.
+	 *
+	 * Set any if any `digits' consumed; make it negative to indicate
+	 * overflow.
+	 */
+	qbase = (unsigned int)base;
+	cutoff = neg ? (u64)-(LLONG_MIN + LLONG_MAX) + LLONG_MAX : LLONG_MAX;
+	cutlim = cutoff % qbase;
+	cutoff /= qbase;
+	for (acc = 0, any = 0;; c = *s++) {
+		if (!isascii(c))
+			break;
+		if (isdigit(c))
+			c -= '0';
+		else if (isalpha(c))
+			c -= isupper(c) ? 'A' - 10 : 'a' - 10;
+		else
+			break;
+		if (c >= base)
+			break;
+		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
+			any = -1;
+		} else {
+			any = 1;
+			acc *= qbase;
+			acc += c;
+		}
+	}
+	if (any < 0)
+		acc = neg ? LLONG_MIN : LLONG_MAX;
+	else if (neg)
+		acc = -acc;
+	if (endptr != 0)
+		*endptr = __DECONST(char *, any ? s - 1 : nptr);
+	return acc;
+}
+
+/*
+ * Convert a string to an unsigned quad integer.
+ *
+ * Ignores `locale' stuff.  Assumes that the upper and lower case
+ * alphabets and digits are each contiguous.
+ */
+u64
+strtouq(const char *nptr, char **endptr, int base)
+{
+	const char *s = nptr;
+	u64 acc;
+	unsigned char c;
+	u64 qbase, cutoff;
+	int neg, any, cutlim;
+
+	/*
+	 * See strtoq for comments as to the logic used.
+	 */
+	do {
+		c = *s++;
+	} while (isspace(c));
+	if (c == '-') {
+		neg = 1;
+		c = *s++;
+	} else {
+		neg = 0;
+		if (c == '+')
+			c = *s++;
+	}
+	if ((base == 0 || base == 16) &&
+		c == '0' && (*s == 'x' || *s == 'X')) {
+		c = s[1];
+		s += 2;
+		base = 16;
+	}
+	if (base == 0)
+		base = c == '0' ? 8 : 10;
+	qbase = (unsigned int)base;
+	cutoff = (u64)ULLONG_MAX / qbase;
+	cutlim = (u64)ULLONG_MAX % qbase;
+	for (acc = 0, any = 0;; c = *s++) {
+		if (!isascii(c))
+			break;
+		if (isdigit(c))
+			c -= '0';
+		else if (isalpha(c))
+			c -= isupper(c) ? 'A' - 10 : 'a' - 10;
+		else
+			break;
+		if (c >= base)
+			break;
+		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
+			any = -1;
+		} else {
+			any = 1;
+			acc *= qbase;
+			acc += c;
+		}
+	}
+	if (any < 0)
+		acc = ULLONG_MAX;
+	else if (neg)
+		acc = -acc;
+	if (endptr != 0)
+		*endptr = __DECONST(char *, any ? s - 1 : nptr);
+	return acc;
+}
+
+/*
+ * Fill in the given table from the scanset at the given format
+ * (just after `[').  Return a pointer to the character past the
+ * closing `]'.  The table has a 1 wherever characters should be
+ * considered part of the scanset.
+ */
+static const u_char *
+__sccl(char *tab, const u_char *fmt)
+{
+	int c, n, v;
+
+	/* first `clear' the whole table */
+	c = *fmt++;             /* first char hat => negated scanset */
+	if (c == '^') {
+		v = 1;          /* default => accept */
+		c = *fmt++;     /* get new first char */
+	} else {
+		v = 0;          /* default => reject */
+	}
+
+	/* XXX: Will not work if sizeof(tab*) > sizeof(char) */
+	for (n = 0; n < 256; n++)
+		tab[n] = v;        /* memset(tab, v, 256) */
+
+	if (c == 0)
+		return (fmt - 1);/* format ended before closing ] */
+
+	/*
+	 * Now set the entries corresponding to the actual scanset
+	 * to the opposite of the above.
+	 *
+	 * The first character may be ']' (or '-') without being special;
+	 * the last character may be '-'.
+	 */
+	v = 1 - v;
+	for (;;) {
+		tab[c] = v;             /* take character c */
+doswitch:
+		n = *fmt++;             /* and examine the next */
+		switch (n) {
+		case 0:                 /* format ended too soon */
+			return (fmt - 1);
+
+		case '-':
+			/*
+			 * A scanset of the form
+			 *      [01+-]
+			 * is defined as `the digit 0, the digit 1,
+			 * the character +, the character -', but
+			 * the effect of a scanset such as
+			 *      [a-zA-Z0-9]
+			 * is implementation defined.  The V7 Unix
+			 * scanf treats `a-z' as `the letters a through
+			 * z', but treats `a-a' as `the letter a, the
+			 * character -, and the letter a'.
+			 *
+			 * For compatibility, the `-' is not considered
+			 * to define a range if the character following
+			 * it is either a close bracket (required by ANSI)
+			 * or is not numerically greater than the character
+			 * we just stored in the table (c).
+			 */
+			n = *fmt;
+			if (n == ']' || n < c) {
+				c = '-';
+				break;  /* resume the for(;;) */
+			}
+			fmt++;
+			/* fill in the range */
+			do {
+				tab[++c] = v;
+			} while (c < n);
+			c = n;
+			/*
+			 * Alas, the V7 Unix scanf also treats formats
+			 * such as [a-c-e] as `the letters a through e'.
+			 * This too is permitted by the standard....
+			 */
+			goto doswitch;
+			break;
+
+		case ']':               /* end of scanset */
+			return (fmt);
+
+		default:                /* just another character */
+			c = n;
+			break;
+		}
+	}
+	/* NOTREACHED */
+}
+
+/**
+ * vsscanf - Unformat a buffer into a list of arguments
+ * @buf:	input buffer
+ * @fmt:	format of buffer
+ * @args:	arguments
+ */
+#define BUF             32      /* Maximum length of numeric string. */
+
+/*
+ * Flags used during conversion.
+ */
+#define LONG            0x01    /* l: long or double */
+#define SHORT           0x04    /* h: short */
+#define SUPPRESS        0x08    /* suppress assignment */
+#define POINTER         0x10    /* weird %p pointer (`fake hex') */
+#define NOSKIP          0x20    /* do not skip blanks */
+#define QUAD            0x400
+#define SHORTSHORT      0x4000  /** hh: char */
+
+/*
+ * The following are used in numeric conversions only:
+ * SIGNOK, NDIGITS, DPTOK, and EXPOK are for floating point;
+ * SIGNOK, NDIGITS, PFXOK, and NZDIGITS are for integral.
+ */
+#define SIGNOK          0x40    /* +/- is (still) legal */
+#define NDIGITS         0x80    /* no digits detected */
+
+#define DPTOK           0x100   /* (float) decimal point is still legal */
+#define EXPOK           0x200   /* (float) exponent (e+3, etc) still legal */
+
+#define PFXOK           0x100   /* 0x prefix is (still) legal */
+#define NZDIGITS        0x200   /* no zero digits detected */
+
+/*
+ * Conversion types.
+ */
+#define CT_CHAR         0       /* %c conversion */
+#define CT_CCL          1       /* %[...] conversion */
+#define CT_STRING       2       /* %s conversion */
+#define CT_INT          3       /* integer, i.e., strtoq or strtouq */
+typedef u64 (*ccfntype)(const char *, char **, int);
+
+int
+vsscanf(const char *inp, char const *fmt0, va_list ap)
+{
+	int inr;
+	const u_char *fmt = (const u_char *)fmt0;
+	int c;                  /* character from format, or conversion */
+	size_t width;           /* field width, or 0 */
+	char *p;                /* points into all kinds of strings */
+	int n;                  /* handy integer */
+	int flags;              /* flags as defined above */
+	char *p0;               /* saves original value of p when necessary */
+	int nassigned;          /* number of fields assigned */
+	int nconversions;       /* number of conversions */
+	int nread;              /* number of characters consumed from fp */
+	int base;               /* base argument to strtoq/strtouq */
+	ccfntype ccfn;          /* conversion function (strtoq/strtouq) */
+	char ccltab[256];       /* character class table for %[...] */
+	char buf[BUF];          /* buffer for numeric conversions */
+
+	/* `basefix' is used to avoid `if' tests in the integer scanner */
+	static short basefix[17] = { 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
+				     12, 13, 14, 15, 16 };
+
+	inr = strlen(inp);
+
+	nassigned = 0;
+	nconversions = 0;
+	nread = 0;
+	base = 0;               /* XXX just to keep gcc happy */
+	ccfn = NULL;            /* XXX just to keep gcc happy */
+	for (;;) {
+		c = *fmt++;
+		if (c == 0)
+			return (nassigned);
+		if (isspace(c)) {
+			while (inr > 0 && isspace(*inp))
+				nread++, inr--, inp++;
+			continue;
+		}
+		if (c != '%')
+			goto literal;
+		width = 0;
+		flags = 0;
+		/*
+		 * switch on the format.  continue if done;
+		 * break once format type is derived.
+		 */
+again:          c = *fmt++;
+		switch (c) {
+		case '%':
+literal:
+			if (inr <= 0)
+				goto input_failure;
+			if (*inp != c)
+				goto match_failure;
+			inr--, inp++;
+			nread++;
+			continue;
+
+		case '*':
+			flags |= SUPPRESS;
+			goto again;
+		case 'l':
+			if (flags & LONG) {
+				flags &= ~LONG;
+				flags |= QUAD;
+			} else {
+				flags |= LONG;
+			}
+			goto again;
+		case 'q':
+			flags |= QUAD;
+			goto again;
+		case 'h':
+			if (flags & SHORT) {
+				flags &= ~SHORT;
+				flags |= SHORTSHORT;
+			} else {
+				flags |= SHORT;
+			}
+			goto again;
+
+		case '0': case '1': case '2': case '3': case '4':
+		case '5': case '6': case '7': case '8': case '9':
+			width = width * 10 + c - '0';
+			goto again;
+
+		/*
+		 * Conversions.
+		 *
+		 */
+		case 'd':
+			c = CT_INT;
+			ccfn = (ccfntype)strtoq;
+			base = 10;
+			break;
+
+		case 'i':
+			c = CT_INT;
+			ccfn = (ccfntype)strtoq;
+			base = 0;
+			break;
+
+		case 'o':
+			c = CT_INT;
+			ccfn = strtouq;
+			base = 8;
+			break;
+
+		case 'u':
+			c = CT_INT;
+			ccfn = strtouq;
+			base = 10;
+			break;
+
+		case 'x':
+			flags |= PFXOK; /* enable 0x prefixing */
+			c = CT_INT;
+			ccfn = strtouq;
+			base = 16;
+			break;
+
+		case 's':
+			c = CT_STRING;
+			break;
+
+		case '[':
+			fmt = __sccl(ccltab, fmt);
+			flags |= NOSKIP;
+			c = CT_CCL;
+			break;
+
+		case 'c':
+			flags |= NOSKIP;
+			c = CT_CHAR;
+			break;
+
+		case 'p':       /* pointer format is like hex */
+			flags |= POINTER | PFXOK;
+			c = CT_INT;
+			ccfn = strtouq;
+			base = 16;
+			break;
+
+		case 'n':
+			nconversions++;
+			if (flags & SUPPRESS)   /* ??? */
+				continue;
+			if (flags & SHORTSHORT)
+				*va_arg(ap, char *) = nread;
+			else if (flags & SHORT)
+				*va_arg(ap, short *) = nread;
+			else if (flags & LONG)
+				*va_arg(ap, long *) = nread;
+			else if (flags & QUAD)
+				*va_arg(ap, s64 *) = nread;
+			else
+				*va_arg(ap, int *) = nread;
+			continue;
+		}
+
+		/*
+		 * We have a conversion that requires input.
+		 */
+		if (inr <= 0)
+			goto input_failure;
+
+		/*
+		 * Consume leading white space, except for formats
+		 * that suppress this.
+		 */
+		if ((flags & NOSKIP) == 0) {
+			while (isspace(*inp)) {
+				nread++;
+				if (--inr > 0)
+					inp++;
+				else
+					goto input_failure;
+			}
+			/*
+			 * Note that there is at least one character in
+			 * the buffer, so conversions that do not set NOSKIP
+			 * can no longer result in an input failure.
+			 */
+		}
+
+		/*
+		 * Do the conversion.
+		 */
+		switch (c) {
+		case CT_CHAR:
+			/* scan arbitrary characters (sets NOSKIP) */
+			if (width == 0)
+				width = 1;
+			if (flags & SUPPRESS) {
+				size_t sum = 0;
+
+				if ((n = inr) < width) {
+					sum += n;
+					width -= n;
+					inp += n;
+					if (sum == 0)
+						goto input_failure;
+				} else {
+					sum += width;
+					inr -= width;
+					inp += width;
+				}
+				nread += sum;
+			} else {
+				memcpy(va_arg(ap, char *), inp, width);
+				inr -= width;
+				inp += width;
+				nread += width;
+				nassigned++;
+			}
+			nconversions++;
+			break;
+
+		case CT_CCL:
+			/* scan a (nonempty) character class (sets NOSKIP) */
+			if (width == 0)
+				width = (size_t)~0;     /* `infinity' */
+			/* take only those things in the class */
+			if (flags & SUPPRESS) {
+				n = 0;
+				while (ccltab[(unsigned char)*inp]) {
+					n++, inr--, inp++;
+					if (--width == 0)
+						break;
+					if (inr <= 0) {
+						if (n == 0)
+							goto input_failure;
+						break;
+					}
+				}
+				if (n == 0)
+					goto match_failure;
+			} else {
+				p = va_arg(ap, char *);
+				p0 = p;
+				while (ccltab[(unsigned char)*inp]) {
+					inr--;
+					*p++ = *inp++;
+					if (--width == 0)
+						break;
+					if (inr <= 0) {
+						if (p == p0)
+							goto input_failure;
+						break;
+					}
+				}
+				n = p - p0;
+				if (n == 0)
+					goto match_failure;
+				*p = 0;
+				nassigned++;
+			}
+			nread += n;
+			nconversions++;
+			break;
+
+		case CT_STRING:
+			/* like CCL, but zero-length string OK, & no NOSKIP */
+			if (width == 0)
+				width = (size_t)~0;
+			if (flags & SUPPRESS) {
+				n = 0;
+				while (!isspace(*inp)) {
+					n++, inr--, inp++;
+					if (--width == 0)
+						break;
+					if (inr <= 0)
+						break;
+				}
+				nread += n;
+			} else {
+				p = va_arg(ap, char *);
+				p0 = p;
+				while (!isspace(*inp)) {
+					inr--;
+					*p++ = *inp++;
+					if (--width == 0)
+						break;
+					if (inr <= 0)
+						break;
+				}
+				*p = 0;
+				nread += p - p0;
+				nassigned++;
+			}
+			nconversions++;
+			continue;
+
+		case CT_INT:
+			/* scan an integer as if by strtoq/strtouq */
+#ifdef hardway
+			if (width == 0 || width > sizeof(buf) - 1)
+				width = sizeof(buf) - 1;
+#else
+			/* size_t is unsigned, hence this optimisation */
+			if (--width > sizeof(buf) - 2)
+				width = sizeof(buf) - 2;
+			width++;
+#endif
+			flags |= SIGNOK | NDIGITS | NZDIGITS;
+			for (p = buf; width; width--) {
+				c = *inp;
+				/*
+				 * Switch on the character; `goto ok'
+				 * if we accept it as a part of number.
+				 */
+				switch (c) {
+				/*
+				 * The digit 0 is always legal, but is
+				 * special.  For %i conversions, if no
+				 * digits (zero or nonzero) have been
+				 * scanned (only signs), we will have
+				 * base==0.  In that case, we should set
+				 * it to 8 and enable 0x prefixing.
+				 * Also, if we have not scanned zero digits
+				 * before this, do not turn off prefixing
+				 * (someone else will turn it off if we
+				 * have scanned any nonzero digits).
+				 */
+				case '0':
+					if (base == 0) {
+						base = 8;
+						flags |= PFXOK;
+					}
+					if (flags & NZDIGITS)
+						flags &= ~(SIGNOK | NZDIGITS | NDIGITS);
+					else
+						flags &= ~(SIGNOK | PFXOK | NDIGITS);
+					goto ok;
+
+				/* 1 through 7 always legal */
+				case '1': case '2': case '3':
+				case '4': case '5': case '6': case '7':
+					base = basefix[base];
+					flags &= ~(SIGNOK | PFXOK | NDIGITS);
+					goto ok;
+
+				/* digits 8 and 9 ok iff decimal or hex */
+				case '8': case '9':
+					base = basefix[base];
+					if (base <= 8)
+						break;  /* not legal here */
+					flags &= ~(SIGNOK | PFXOK | NDIGITS);
+					goto ok;
+
+				/* letters ok iff hex */
+				case 'A': case 'B': case 'C':
+				case 'D': case 'E': case 'F':
+				case 'a': case 'b': case 'c':
+				case 'd': case 'e': case 'f':
+					/* no need to fix base here */
+					if (base <= 10)
+						break;  /* not legal here */
+					flags &= ~(SIGNOK | PFXOK | NDIGITS);
+					goto ok;
+
+				/* sign ok only as first character */
+				case '+': case '-':
+					if (flags & SIGNOK) {
+						flags &= ~SIGNOK;
+						goto ok;
+					}
+					break;
+
+				/* x ok iff flag still set & 2nd char */
+				case 'x': case 'X':
+					if (flags & PFXOK && p == buf + 1) {
+						base = 16;      /* if %i */
+						flags &= ~PFXOK;
+						goto ok;
+					}
+					break;
+				}
+
+				/*
+				 * If we got here, c is not a legal character
+				 * for a number.  Stop accumulating digits.
+				 */
+				break;
+ok:
+				/*
+				 * c is legal: store it and look at the next.
+				 */
+				*p++ = c;
+				if (--inr > 0)
+					inp++;
+				else
+					break;          /* end of input */
+			}
+			/*
+			 * If we had only a sign, it is no good; push
+			 * back the sign.  If the number ends in `x',
+			 * it was [sign] '0' 'x', so push back the x
+			 * and treat it as [sign] '0'.
+			 */
+			if (flags & NDIGITS) {
+				if (p > buf) {
+					inp--;
+					inr++;
+				}
+				goto match_failure;
+			}
+			c = ((u_char *)p)[-1];
+			if (c == 'x' || c == 'X') {
+				--p;
+				inp--;
+				inr++;
+			}
+			if ((flags & SUPPRESS) == 0) {
+				u64 res;
+
+				*p = 0;
+				res = (*ccfn)(buf, (char **)NULL, base);
+				if (flags & POINTER)
+					*va_arg(ap, void **) =
+					(void *)(uintptr_t)res;
+				else if (flags & SHORTSHORT)
+					*va_arg(ap, char *) = res;
+				else if (flags & SHORT)
+					*va_arg(ap, short *) = res;
+				else if (flags & LONG)
+					*va_arg(ap, long *) = res;
+				else if (flags & QUAD)
+					*va_arg(ap, s64 *) = res;
+				else
+					*va_arg(ap, int *) = res;
+				nassigned++;
+			}
+			nread += p - buf;
+			nconversions++;
+			break;
+		}
+	}
+input_failure:
+		return (nconversions != 0 ? nassigned : -1);
+match_failure:
+		return (nassigned);
+}
+
+/**
+ * sscanf - Unformat a buffer into a list of arguments
+ * @buf:	input buffer
+ * @fmt:	formatting of buffer
+ * @...:	resulting arguments
+ */
+int sscanf(const char *buf, const char *fmt, ...)
+{
+	va_list args;
+	int i;
+
+	va_start(args, fmt);
+	i = vsscanf(buf, fmt, args);
+	va_end(args);
+	return i;
+}
+
+#endif
-- 
2.17.1
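
A minimal sketch of how a caller might rely on the return value of sscanf(),
in the way xenbus_read_uuid() does later in this series. It assumes a hosted
C library whose sscanf() stands in for the implementation added by this
patch; parse_uuid() is a hypothetical helper name, not part of the series.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse a canonical 36-character UUID string into
 * 16 bytes. Returns 1 on success and 0 on failure. sscanf() returns the
 * number of assigned items, so anything other than 16 means the input
 * did not match the expected layout.
 */
static int parse_uuid(const char *s, unsigned char uuid[16])
{
	/* 16 bytes as two hex digits each, plus 4 hyphens */
	if (strlen(s) != 2 * 16 + 4)
		return 0;

	return sscanf(s,
		      "%2hhx%2hhx%2hhx%2hhx-"
		      "%2hhx%2hhx-"
		      "%2hhx%2hhx-"
		      "%2hhx%2hhx-"
		      "%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx",
		      uuid, uuid + 1, uuid + 2, uuid + 3,
		      uuid + 4, uuid + 5, uuid + 6, uuid + 7,
		      uuid + 8, uuid + 9, uuid + 10, uuid + 11,
		      uuid + 12, uuid + 13, uuid + 14, uuid + 15) == 16;
}
```

The length check comes first because a field width such as %2hhx only caps
how many characters a conversion consumes; on its own it would not reject
trailing characters after the last byte.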

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 10/17] xen: Port Xen bus driver from mini-os
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (8 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 09/17] lib: sscanf: add sscanf implementation Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  4:43   ` Heinrich Schuchardt
  2020-07-01 16:29 ` [PATCH 11/17] xen: Port Xen grant table " Anastasiia Lukianenko
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Make required updates to run on u-boot and strip test code.

Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 arch/arm/Kconfig                          |   1 +
 board/xen/xenguest_arm64/xenguest_arm64.c |  16 +-
 drivers/xen/Makefile                      |   1 +
 drivers/xen/hypervisor.c                  |   2 +
 drivers/xen/xenbus.c                      | 547 ++++++++++++++++++++++
 include/xen/xenbus.h                      |  86 ++++
 6 files changed, 652 insertions(+), 1 deletion(-)
 create mode 100644 drivers/xen/xenbus.c
 create mode 100644 include/xen/xenbus.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index d4de1139aa..bcd9ab5c9d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1724,6 +1724,7 @@ config TARGET_XENGUEST_ARM64
 	select OF_CONTROL
 	select LINUX_KERNEL_IMAGE_HEADER
 	select XEN_SERIAL
+	select SSCANF
 endchoice
 
 config ARCH_SUPPORT_TFABOOT
diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
index fd10a002e9..e8621f7174 100644
--- a/board/xen/xenguest_arm64/xenguest_arm64.c
+++ b/board/xen/xenguest_arm64/xenguest_arm64.c
@@ -67,7 +67,7 @@ static int setup_mem_map(void)
 
 	/*
 	 * Add "magic" region which is used by Xen to provide some essentials
-	 * for the guest: we need console.
+	 * for the guest: we need console and xenstore.
 	 */
 	ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_CONSOLE_PFN, &gfn);
 	if (ret < 0) {
@@ -83,6 +83,20 @@ static int setup_mem_map(void)
 				PTE_BLOCK_INNER_SHARE);
 	i++;
 
+	ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_STORE_PFN, &gfn);
+	if (ret < 0) {
+		printf("%s: Can't get HVM_PARAM_STORE_PFN, ret %d\n",
+		       __func__, ret);
+		return -EINVAL;
+	}
+
+	xen_mem_map[i].virt = PFN_PHYS(gfn);
+	xen_mem_map[i].phys = PFN_PHYS(gfn);
+	xen_mem_map[i].size = PAGE_SIZE;
+	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+				PTE_BLOCK_INNER_SHARE);
+	i++;
+
 	mem = get_next_memory_node(blob, -1);
 	if (mem < 0) {
 		printf("%s: Missing /memory node\n", __func__);
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 0ad35edefb..9d0f604aaa 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -4,3 +4,4 @@
 
 obj-y += hypervisor.o
 obj-y += events.o
+obj-y += xenbus.o
diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
index 975e552242..d7fbacb08e 100644
--- a/drivers/xen/hypervisor.c
+++ b/drivers/xen/hypervisor.c
@@ -38,6 +38,7 @@
 
 #include <xen/hvm.h>
 #include <xen/events.h>
+#include <xen/xenbus.h>
 #include <xen/interface/memory.h>
 
 #define active_evtchns(cpu, sh, idx)	\
@@ -273,5 +274,6 @@ void xen_init(void)
 
 	map_shared_info(NULL);
 	init_events();
+	init_xenbus();
 }
 
diff --git a/drivers/xen/xenbus.c b/drivers/xen/xenbus.c
new file mode 100644
index 0000000000..64eb28e843
--- /dev/null
+++ b/drivers/xen/xenbus.c
@@ -0,0 +1,547 @@
+/*
+ ****************************************************************************
+ * (C) 2006 - Cambridge University
+ * (C) 2020 - EPAM Systems Inc.
+ ****************************************************************************
+ *
+ *		File: xenbus.c
+ *	  Author: Steven Smith (sos22 at cam.ac.uk)
+ *	 Changes: Grzegorz Milos (gm281 at cam.ac.uk)
+ *	 Changes: John D. Ramsdell
+ *
+ *		Date: Jun 2006, changes Aug 2005
+ *
+ * Environment: Xen Minimal OS
+ * Description: Minimal implementation of xenbus
+ *
+ ****************************************************************************
+ **/
+
+#include <common.h>
+#include <log.h>
+
+#include <asm/armv8/mmu.h>
+#include <asm/io.h>
+#include <asm/xen/system.h>
+
+#include <linux/bug.h>
+#include <linux/compat.h>
+
+#include <xen/events.h>
+#include <xen/hvm.h>
+#include <xen/xenbus.h>
+
+#include <xen/interface/io/xs_wire.h>
+
+#define map_frame_virt(v)	((v) << PAGE_SHIFT)
+
+#define SCNd16			"hd"
+
+/* Wait for reply time out, ms */
+#define WAIT_XENBUS_TO_MS	5000
+/* Polling time out, ms */
+#define WAIT_XENBUS_POLL_TO_MS	1
+
+static struct xenstore_domain_interface *xenstore_buf;
+
+static char *errmsg(struct xsd_sockmsg *rep);
+
+u32 xenbus_evtchn;
+
+struct write_req {
+	const void *data;
+	unsigned int len;
+};
+
+static void memcpy_from_ring(const void *r, void *d, int off, int len)
+{
+	int c1, c2;
+	const char *ring = r;
+	char *dest = d;
+
+	c1 = min(len, XENSTORE_RING_SIZE - off);
+	c2 = len - c1;
+	memcpy(dest, ring + off, c1);
+	memcpy(dest + c1, ring, c2);
+}
+
+static bool xenbus_get_reply(struct xsd_sockmsg **req_reply)
+{
+	struct xsd_sockmsg msg;
+	unsigned int prod = xenstore_buf->rsp_prod;
+
+again:
+	if (!wait_event_timeout(NULL, prod != xenstore_buf->rsp_prod,
+				WAIT_XENBUS_TO_MS)) {
+		printk("%s: wait_event timeout\n", __func__);
+		return false;
+	}
+
+	prod = xenstore_buf->rsp_prod;
+	if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg))
+		goto again;
+
+	rmb();
+	memcpy_from_ring(xenstore_buf->rsp, &msg,
+			 MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
+			 sizeof(msg));
+
+	if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg) + msg.len)
+		goto again;
+
+	/* We neither support nor expect any Xen bus watches. */
+	BUG_ON(msg.type == XS_WATCH_EVENT);
+
+	*req_reply = malloc(sizeof(msg) + msg.len);
+	memcpy_from_ring(xenstore_buf->rsp, *req_reply,
+			 MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
+			 msg.len + sizeof(msg));
+	mb();
+	xenstore_buf->rsp_cons += msg.len + sizeof(msg);
+
+	wmb();
+	notify_remote_via_evtchn(xenbus_evtchn);
+	return true;
+}
+
+char *xenbus_switch_state(xenbus_transaction_t xbt, const char *path,
+			  XenbusState state)
+{
+	char *current_state;
+	char *msg = NULL;
+	char *msg2 = NULL;
+	char value[2];
+	XenbusState rs;
+	int xbt_flag = 0;
+	int retry = 0;
+
+	do {
+		if (xbt == XBT_NIL) {
+			msg = xenbus_transaction_start(&xbt);
+			if (msg)
+				goto exit;
+			xbt_flag = 1;
+		}
+
+		msg = xenbus_read(xbt, path, &current_state);
+		if (msg)
+			goto exit;
+
+		rs = (XenbusState)(current_state[0] - '0');
+		free(current_state);
+		if (rs == state) {
+			msg = NULL;
+			goto exit;
+		}
+
+		snprintf(value, 2, "%d", state);
+		msg = xenbus_write(xbt, path, value);
+
+exit:
+		if (xbt_flag) {
+			msg2 = xenbus_transaction_end(xbt, 0, &retry);
+			xbt = XBT_NIL;
+		}
+		if (msg == NULL && msg2 != NULL)
+			msg = msg2;
+		else
+			free(msg2);
+	} while (retry);
+
+	return msg;
+}
+
+char *xenbus_wait_for_state_change(const char *path, XenbusState *state)
+{
+	for (;;) {
+		char *res, *msg;
+		XenbusState rs;
+
+		msg = xenbus_read(XBT_NIL, path, &res);
+		if (msg)
+			return msg;
+
+		rs = (XenbusState)(res[0] - 48);
+		free(res);
+
+		if (rs == *state) {
+			wait_event_timeout(NULL, false, WAIT_XENBUS_POLL_TO_MS);
+		} else {
+			*state = rs;
+			break;
+		}
+	}
+	return NULL;
+}
+
+/* Send data to xenbus.  This can block.  All of the requests are seen
+ * by xenbus as if sent atomically.  The header is added
+ * automatically, using type %type, req_id %req_id, and trans_id
+ * %trans_id.
+ */
+static void xb_write(int type, int req_id, xenbus_transaction_t trans_id,
+		     const struct write_req *req, int nr_reqs)
+{
+	XENSTORE_RING_IDX prod;
+	int r;
+	int len = 0;
+	const struct write_req *cur_req;
+	int req_off;
+	int total_off;
+	int this_chunk;
+	struct xsd_sockmsg m = {
+		.type = type,
+		.req_id = req_id,
+		.tx_id = trans_id
+	};
+	struct write_req header_req = {
+		&m,
+		sizeof(m)
+	};
+
+	for (r = 0; r < nr_reqs; r++)
+		len += req[r].len;
+	m.len = len;
+	len += sizeof(m);
+
+	cur_req = &header_req;
+
+	BUG_ON(len > XENSTORE_RING_SIZE);
+	prod = xenstore_buf->req_prod;
+	/* We are running synchronously, so it is a bug if we do not
+	 * have enough room to send a message: please note that a message
+	 * can occupy multiple slots in the ring buffer.
+	 */
+	BUG_ON(prod + len - xenstore_buf->req_cons > XENSTORE_RING_SIZE);
+
+	total_off = 0;
+	req_off = 0;
+	while (total_off < len) {
+		this_chunk = min(cur_req->len - req_off,
+				 XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod));
+		memcpy((char *)xenstore_buf->req + MASK_XENSTORE_IDX(prod),
+		       (char *)cur_req->data + req_off, this_chunk);
+		prod += this_chunk;
+		req_off += this_chunk;
+		total_off += this_chunk;
+		if (req_off == cur_req->len) {
+			req_off = 0;
+			if (cur_req == &header_req)
+				cur_req = req;
+			else
+				cur_req++;
+		}
+	}
+
+	BUG_ON(req_off != 0);
+	BUG_ON(total_off != len);
+	BUG_ON(prod > xenstore_buf->req_cons + XENSTORE_RING_SIZE);
+
+	/* Remote must see entire message before updating indexes */
+	wmb();
+
+	xenstore_buf->req_prod += len;
+
+	/* Send evtchn to notify remote */
+	notify_remote_via_evtchn(xenbus_evtchn);
+}
+
+/* Send a message to xenbus, in the same fashion as xb_write, and
+ * block waiting for a reply.  The reply is malloced and should be
+ * freed by the caller.
+ */
+struct xsd_sockmsg *xenbus_msg_reply(int type,
+				     xenbus_transaction_t trans,
+				     struct write_req *io,
+				     int nr_reqs)
+{
+	struct xsd_sockmsg *rep;
+
+	/* The request ID, which the daemon echoes in its response, is unused. */
+	xb_write(type, 0, trans, io, nr_reqs);
+	/* Now wait for the message to arrive. */
+	if (!xenbus_get_reply(&rep))
+		return NULL;
+	return rep;
+}
+
+static char *errmsg(struct xsd_sockmsg *rep)
+{
+	char *res;
+
+	if (!rep) {
+		char msg[] = "No reply";
+		size_t len = strlen(msg) + 1;
+
+		return memcpy(malloc(len), msg, len);
+	}
+	if (rep->type != XS_ERROR)
+		return NULL;
+	res = malloc(rep->len + 1);
+	memcpy(res, rep + 1, rep->len);
+	res[rep->len] = 0;
+	free(rep);
+	return res;
+}
+
+/* List the contents of a directory.  Returns a malloc()ed array of
+ * pointers to malloc()ed strings.  The array is NULL terminated.  May
+ * block.
+ */
+char *xenbus_ls(xenbus_transaction_t xbt, const char *pre, char ***contents)
+{
+	struct xsd_sockmsg *reply, *repmsg;
+	struct write_req req[] = { { pre, strlen(pre) + 1 } };
+	int nr_elems, x, i;
+	char **res, *msg;
+
+	repmsg = xenbus_msg_reply(XS_DIRECTORY, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(repmsg);
+	if (msg) {
+		*contents = NULL;
+		return msg;
+	}
+	reply = repmsg + 1;
+	for (x = nr_elems = 0; x < repmsg->len; x++)
+		nr_elems += (((char *)reply)[x] == 0);
+	res = malloc(sizeof(res[0]) * (nr_elems + 1));
+	for (x = i = 0; i < nr_elems; i++) {
+		int l = strlen((char *)reply + x);
+
+		res[i] = malloc(l + 1);
+		memcpy(res[i], (char *)reply + x, l + 1);
+		x += l + 1;
+	}
+	res[i] = NULL;
+	free(repmsg);
+	*contents = res;
+	return NULL;
+}
+
+char *xenbus_read(xenbus_transaction_t xbt, const char *path, char **value)
+{
+	struct write_req req[] = { {path, strlen(path) + 1} };
+	struct xsd_sockmsg *rep;
+	char *res, *msg;
+
+	rep = xenbus_msg_reply(XS_READ, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(rep);
+	if (msg) {
+		*value = NULL;
+		return msg;
+	}
+	res = malloc(rep->len + 1);
+	memcpy(res, rep + 1, rep->len);
+	res[rep->len] = 0;
+	free(rep);
+	*value = res;
+	return NULL;
+}
+
+char *xenbus_write(xenbus_transaction_t xbt, const char *path,
+		   const char *value)
+{
+	struct write_req req[] = {
+		{path, strlen(path) + 1},
+		{value, strlen(value)},
+	};
+	struct xsd_sockmsg *rep;
+	char *msg;
+
+	rep = xenbus_msg_reply(XS_WRITE, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(rep);
+	if (msg)
+		return msg;
+	free(rep);
+	return NULL;
+}
+
+char *xenbus_rm(xenbus_transaction_t xbt, const char *path)
+{
+	struct write_req req[] = { {path, strlen(path) + 1} };
+	struct xsd_sockmsg *rep;
+	char *msg;
+
+	rep = xenbus_msg_reply(XS_RM, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(rep);
+	if (msg)
+		return msg;
+	free(rep);
+	return NULL;
+}
+
+char *xenbus_get_perms(xenbus_transaction_t xbt, const char *path, char **value)
+{
+	struct write_req req[] = { {path, strlen(path) + 1} };
+	struct xsd_sockmsg *rep;
+	char *res, *msg;
+
+	rep = xenbus_msg_reply(XS_GET_PERMS, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(rep);
+	if (msg) {
+		*value = NULL;
+		return msg;
+	}
+	res = malloc(rep->len + 1);
+	memcpy(res, rep + 1, rep->len);
+	res[rep->len] = 0;
+	free(rep);
+	*value = res;
+	return NULL;
+}
+
+#define PERM_MAX_SIZE 32
+char *xenbus_set_perms(xenbus_transaction_t xbt, const char *path,
+		       domid_t dom, char perm)
+{
+	char value[PERM_MAX_SIZE];
+	struct write_req req[] = {
+		{path, strlen(path) + 1},
+		{value, 0},
+	};
+	struct xsd_sockmsg *rep;
+	char *msg;
+
+	snprintf(value, PERM_MAX_SIZE, "%c%hu", perm, dom);
+	req[1].len = strlen(value) + 1;
+	rep = xenbus_msg_reply(XS_SET_PERMS, xbt, req, ARRAY_SIZE(req));
+	msg = errmsg(rep);
+	if (msg)
+		return msg;
+	free(rep);
+	return NULL;
+}
+
+char *xenbus_transaction_start(xenbus_transaction_t *xbt)
+{
+	/* Xenstored becomes angry if you send a length 0 message, so just
+	 * shove a nul terminator on the end
+	 */
+	struct write_req req = { "", 1};
+	struct xsd_sockmsg *rep;
+	char *err;
+
+	rep = xenbus_msg_reply(XS_TRANSACTION_START, 0, &req, 1);
+	err = errmsg(rep);
+	if (err)
+		return err;
+	sscanf((char *)(rep + 1), "%lu", xbt);
+	free(rep);
+	return NULL;
+}
+
+char *xenbus_transaction_end(xenbus_transaction_t t, int abort, int *retry)
+{
+	struct xsd_sockmsg *rep;
+	struct write_req req;
+	char *err;
+
+	*retry = 0;
+
+	req.data = abort ? "F" : "T";
+	req.len = 2;
+	rep = xenbus_msg_reply(XS_TRANSACTION_END, t, &req, 1);
+	err = errmsg(rep);
+	if (err) {
+		if (!strcmp(err, "EAGAIN")) {
+			*retry = 1;
+			free(err);
+			return NULL;
+		} else {
+			return err;
+		}
+	}
+	free(rep);
+	return NULL;
+}
+
+int xenbus_read_integer(const char *path)
+{
+	char *res, *buf;
+	int t;
+
+	res = xenbus_read(XBT_NIL, path, &buf);
+	if (res) {
+		printk("Failed to read %s.\n", path);
+		free(res);
+		return -1;
+	}
+	sscanf(buf, "%d", &t);
+	free(buf);
+	return t;
+}
+
+int xenbus_read_uuid(const char *path, unsigned char uuid[16]) {
+	char *res, *buf;
+
+	res = xenbus_read(XBT_NIL, path, &buf);
+	if (res) {
+		printk("Failed to read %s.\n", path);
+		free(res);
+		return 0;
+	}
+	if (strlen(buf) != ((2 * 16) + 4) /* 16 hex bytes and 4 hyphens */
+	    || sscanf(buf,
+		      "%2hhx%2hhx%2hhx%2hhx-"
+		      "%2hhx%2hhx-"
+		      "%2hhx%2hhx-"
+		      "%2hhx%2hhx-"
+		      "%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx",
+		      uuid, uuid + 1, uuid + 2, uuid + 3,
+		      uuid + 4, uuid + 5, uuid + 6, uuid + 7,
+		      uuid + 8, uuid + 9, uuid + 10, uuid + 11,
+		      uuid + 12, uuid + 13, uuid + 14, uuid + 15) != 16) {
+		printk("Xenbus path %s value %s is not a uuid!\n", path, buf);
+		free(buf);
+		return 0;
+	}
+	free(buf);
+	return 1;
+}
+
+char *xenbus_printf(xenbus_transaction_t xbt,
+		    const char *node, const char *path,
+		    const char *fmt, ...)
+{
+#define BUFFER_SIZE 256
+	char fullpath[BUFFER_SIZE];
+	char val[BUFFER_SIZE];
+	va_list args;
+
+	BUG_ON(strlen(node) + strlen(path) + 1 >= BUFFER_SIZE);
+	sprintf(fullpath, "%s/%s", node, path);
+	va_start(args, fmt);
+	vsprintf(val, fmt, args);
+	va_end(args);
+	return xenbus_write(xbt, fullpath, val);
+}
+
+domid_t xenbus_get_self_id(void)
+{
+	char *dom_id;
+	domid_t ret;
+
+	BUG_ON(xenbus_read(XBT_NIL, "domid", &dom_id));
+	sscanf(dom_id, "%"SCNd16, &ret);
+
+	return ret;
+}
+
+void init_xenbus(void)
+{
+	u64 v;
+
+	debug("%s\n", __func__);
+	if (hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v))
+		BUG();
+	xenbus_evtchn = v;
+
+	if (hvm_get_parameter(HVM_PARAM_STORE_PFN, &v))
+		BUG();
+	xenstore_buf = (struct xenstore_domain_interface *)map_frame_virt(v);
+}
+
+void fini_xenbus(void)
+{
+	debug("%s\n", __func__);
+}
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
new file mode 100644
index 0000000000..e2e3ef9292
--- /dev/null
+++ b/include/xen/xenbus.h
@@ -0,0 +1,86 @@
+#ifndef XENBUS_H__
+#define XENBUS_H__
+
+#include <xen/interface/xen.h>
+#include <xen/interface/io/xenbus.h>
+
+typedef unsigned long xenbus_transaction_t;
+#define XBT_NIL ((xenbus_transaction_t)0)
+
+extern u32 xenbus_evtchn;
+
+/* Initialize the XenBus system. */
+void init_xenbus(void);
+/* Finalize the XenBus system. */
+void fini_xenbus(void);
+
+/* Read the value associated with a path.  Returns a malloc'd error
+ * string on failure and sets *value to NULL.  On success, *value is
+ * set to a malloc'd copy of the value.
+ */
+char *xenbus_read(xenbus_transaction_t xbt, const char *path, char **value);
+
+char *xenbus_wait_for_state_change(const char *path, XenbusState *state);
+char *xenbus_switch_state(xenbus_transaction_t xbt, const char *path,
+			  XenbusState state);
+
+/* Associates a value with a path.  Returns a malloc'd error string on
+ * failure.
+ */
+char *xenbus_write(xenbus_transaction_t xbt, const char *path,
+		   const char *value);
+
+/* Removes the value associated with a path.  Returns a malloc'd error
+ * string on failure.
+ */
+char *xenbus_rm(xenbus_transaction_t xbt, const char *path);
+
+/* List the contents of a directory.  Returns a malloc'd error string
+ * on failure and sets *contents to NULL.  On success, *contents is
+ * set to a malloc'd array of pointers to malloc'd strings.  The array
+ * is NULL terminated.  May block.
+ */
+char *xenbus_ls(xenbus_transaction_t xbt, const char *prefix, char ***contents);
+
+/* Reads permissions associated with a path.  Returns a malloc'd error
+ * string on failure and sets *value to NULL.  On success, *value is
+ * set to a malloc'd copy of the value.
+ */
+char *xenbus_get_perms(xenbus_transaction_t xbt, const char *path, char **value);
+
+/* Sets the permissions associated with a path.  Returns a malloc'd
+ * error string on failure.
+ */
+char *xenbus_set_perms(xenbus_transaction_t xbt, const char *path, domid_t dom,
+		       char perm);
+
+/* Start a xenbus transaction.  Returns the transaction in xbt on
+ * success or a malloc'd error string otherwise.
+ */
+char *xenbus_transaction_start(xenbus_transaction_t *xbt);
+
+/* End a xenbus transaction.  Returns a malloc'd error string if it
+ * fails.  abort says whether the transaction should be aborted.
+ * Returns 1 in *retry iff the transaction should be retried.
+ */
+char *xenbus_transaction_end(xenbus_transaction_t, int abort,
+			     int *retry);
+
+/* Read path and parse it as an integer.  Returns -1 on error. */
+int xenbus_read_integer(const char *path);
+
+/* Read path and parse it as 16 byte uuid. Returns 1 if
+ * read and parsing were successful, 0 if not
+ */
+int xenbus_read_uuid(const char *path, unsigned char uuid[16]);
+
+/* Contraction of snprintf and xenbus_write(path/node). */
+char *xenbus_printf(xenbus_transaction_t xbt,
+		    const char *node, const char *path,
+		    const char *fmt, ...)
+	__attribute__((__format__(printf, 4, 5)));
+
+/* Utility function to figure out our domain id */
+domid_t xenbus_get_self_id(void);
+
+#endif /* XENBUS_H__ */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 11/17] xen: Port Xen grant table driver from mini-os
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (9 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 10/17] xen: Port Xen bus driver from mini-os Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-01 16:59   ` Julien Grall
  2020-07-01 16:29 ` [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver Anastasiia Lukianenko
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Make required updates to run on u-boot.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 board/xen/xenguest_arm64/xenguest_arm64.c |  13 ++
 drivers/xen/Makefile                      |   1 +
 drivers/xen/gnttab.c                      | 258 ++++++++++++++++++++++
 drivers/xen/hypervisor.c                  |   2 +
 include/xen/gnttab.h                      |  25 +++
 5 files changed, 299 insertions(+)
 create mode 100644 drivers/xen/gnttab.c
 create mode 100644 include/xen/gnttab.h

diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
index e8621f7174..b4e1650f99 100644
--- a/board/xen/xenguest_arm64/xenguest_arm64.c
+++ b/board/xen/xenguest_arm64/xenguest_arm64.c
@@ -22,6 +22,7 @@
 
 #include <linux/compiler.h>
 
+#include <xen/gnttab.h>
 #include <xen/hvm.h>
 
 DECLARE_GLOBAL_DATA_PTR;
@@ -64,6 +65,8 @@ static int setup_mem_map(void)
 	struct fdt_resource res;
 	const void *blob = gd->fdt_blob;
 	u64 gfn;
+	phys_addr_t gnttab_base;
+	phys_size_t gnttab_sz;
 
 	/*
 	 * Add "magic" region which is used by Xen to provide some essentials
@@ -97,6 +100,16 @@ static int setup_mem_map(void)
 				PTE_BLOCK_INNER_SHARE);
 	i++;
 
+	/* Get Xen's suggested physical page assignments for the grant table. */
+	get_gnttab_base(&gnttab_base, &gnttab_sz);
+
+	xen_mem_map[i].virt = gnttab_base;
+	xen_mem_map[i].phys = gnttab_base;
+	xen_mem_map[i].size = gnttab_sz;
+	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+				PTE_BLOCK_INNER_SHARE);
+	i++;
+
 	mem = get_next_memory_node(blob, -1);
 	if (mem < 0) {
 		printf("%s: Missing /memory node\n", __func__);
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 9d0f604aaa..243b13277a 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -5,3 +5,4 @@
 obj-y += hypervisor.o
 obj-y += events.o
 obj-y += xenbus.o
+obj-y += gnttab.o
diff --git a/drivers/xen/gnttab.c b/drivers/xen/gnttab.c
new file mode 100644
index 0000000000..b18102e329
--- /dev/null
+++ b/drivers/xen/gnttab.c
@@ -0,0 +1,258 @@
+/*
+ ****************************************************************************
+ * (C) 2006 - Cambridge University
+ * (C) 2020 - EPAM Systems Inc.
+ ****************************************************************************
+ *
+ *		File: gnttab.c
+ *	  Author: Steven Smith (sos22 at cam.ac.uk)
+ *	 Changes: Grzegorz Milos (gm281@cam.ac.uk)
+ *
+ *		Date: July 2006
+ *
+ * Environment: Xen Minimal OS
+ * Description: Simple grant tables implementation. About as stupid as it's
+ *  possible to be and still work.
+ *
+ ****************************************************************************
+ */
+#include <common.h>
+#include <linux/compiler.h>
+#include <log.h>
+#include <malloc.h>
+
+#include <asm/armv8/mmu.h>
+#include <asm/io.h>
+#include <asm/xen/system.h>
+
+#include <linux/bug.h>
+
+#include <xen/gnttab.h>
+#include <xen/hvm.h>
+
+#include <xen/interface/memory.h>
+
+DECLARE_GLOBAL_DATA_PTR;
+
+#define NR_RESERVED_ENTRIES 8
+
+/* NR_GRANT_FRAMES must be less than or equal to that configured in Xen */
+#define NR_GRANT_FRAMES 1
+#define NR_GRANT_ENTRIES (NR_GRANT_FRAMES * PAGE_SIZE / sizeof(struct grant_entry_v1))
+
+static struct grant_entry_v1 *gnttab_table;
+static grant_ref_t gnttab_list[NR_GRANT_ENTRIES];
+
+static void put_free_entry(grant_ref_t ref)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	gnttab_list[ref] = gnttab_list[0];
+	gnttab_list[0]  = ref;
+	local_irq_restore(flags);
+}
+
+static grant_ref_t get_free_entry(void)
+{
+	unsigned int ref;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	ref = gnttab_list[0];
+	BUG_ON(ref < NR_RESERVED_ENTRIES || ref >= NR_GRANT_ENTRIES);
+	gnttab_list[0] = gnttab_list[ref];
+	local_irq_restore(flags);
+	return ref;
+}
+
+grant_ref_t gnttab_grant_access(domid_t domid, unsigned long frame, int readonly)
+{
+	grant_ref_t ref;
+
+	ref = get_free_entry();
+	gnttab_table[ref].frame = frame;
+	gnttab_table[ref].domid = domid;
+	wmb();
+	readonly *= GTF_readonly;
+	gnttab_table[ref].flags = GTF_permit_access | readonly;
+
+	return ref;
+}
+
+grant_ref_t gnttab_grant_transfer(domid_t domid, unsigned long pfn)
+{
+	grant_ref_t ref;
+
+	ref = get_free_entry();
+	gnttab_table[ref].frame = pfn;
+	gnttab_table[ref].domid = domid;
+	wmb();
+	gnttab_table[ref].flags = GTF_accept_transfer;
+
+	return ref;
+}
+
+int gnttab_end_access(grant_ref_t ref)
+{
+	u16 flags, nflags;
+
+	BUG_ON(ref >= NR_GRANT_ENTRIES || ref < NR_RESERVED_ENTRIES);
+
+	nflags = gnttab_table[ref].flags;
+	do {
+		if ((flags = nflags) & (GTF_reading | GTF_writing)) {
+			printf("WARNING: g.e. still in use! (%x)\n", flags);
+			return 0;
+		}
+	} while ((nflags = synch_cmpxchg(&gnttab_table[ref].flags, flags, 0)) !=
+		 flags);
+
+	put_free_entry(ref);
+	return 1;
+}
+
+unsigned long gnttab_end_transfer(grant_ref_t ref)
+{
+	unsigned long frame;
+	u16 flags;
+
+	BUG_ON(ref >= NR_GRANT_ENTRIES || ref < NR_RESERVED_ENTRIES);
+
+	while (!((flags = gnttab_table[ref].flags) & GTF_transfer_committed)) {
+		if (synch_cmpxchg(&gnttab_table[ref].flags, flags, 0) == flags) {
+			printf("Release unused transfer grant.\n");
+			put_free_entry(ref);
+			return 0;
+		}
+	}
+
+	/* If a transfer is in progress then wait until it is completed. */
+	while (!(flags & GTF_transfer_completed))
+		flags = gnttab_table[ref].flags;
+
+	/* Read the frame number /after/ reading completion status. */
+	rmb();
+	frame = gnttab_table[ref].frame;
+
+	put_free_entry(ref);
+
+	return frame;
+}
+
+grant_ref_t gnttab_alloc_and_grant(void **map)
+{
+	unsigned long mfn;
+	grant_ref_t gref;
+
+	*map = (void *)memalign(PAGE_SIZE, PAGE_SIZE);
+	mfn = virt_to_mfn(*map);
+	gref = gnttab_grant_access(0, mfn, 0);
+	return gref;
+}
+
+static const char * const gnttabop_error_msgs[] = GNTTABOP_error_msgs;
+
+const char *gnttabop_error(int16_t status)
+{
+	status = -status;
+	if (status < 0 || status >= ARRAY_SIZE(gnttabop_error_msgs))
+		return "bad status";
+	else
+		return gnttabop_error_msgs[status];
+}
+
+/* Get Xen's suggested physical page assignments for the grant table. */
+void get_gnttab_base(phys_addr_t *gnttab_base, phys_size_t *gnttab_sz)
+{
+	const void *blob = gd->fdt_blob;
+	struct fdt_resource res;
+	int mem;
+
+	mem = fdt_node_offset_by_compatible(blob, -1, "xen,xen");
+	if (mem < 0) {
+		printf("No xen,xen compatible found\n");
+		BUG();
+	}
+
+	mem = fdt_get_resource(blob, mem, "reg", 0, &res);
+	if (mem == -FDT_ERR_NOTFOUND) {
+		printf("No grant table base in the device tree\n");
+		BUG();
+	}
+
+	*gnttab_base = (phys_addr_t)res.start;
+	if (gnttab_sz)
+		*gnttab_sz = (phys_size_t)(res.end - res.start + 1);
+
+	debug("FDT suggests grant table base at %llx\n",
+	      *gnttab_base);
+}
+
+void init_gnttab(void)
+{
+	struct xen_add_to_physmap xatp;
+	struct gnttab_setup_table setup;
+	xen_pfn_t frames[NR_GRANT_FRAMES];
+	int i, rc;
+
+	debug("%s\n", __func__);
+
+	for (i = NR_RESERVED_ENTRIES; i < NR_GRANT_ENTRIES; i++)
+		put_free_entry(i);
+
+	get_gnttab_base((phys_addr_t *)&gnttab_table, NULL);
+
+	for (i = 0; i < NR_GRANT_FRAMES; i++) {
+		xatp.domid = DOMID_SELF;
+		xatp.size = 0;
+		xatp.space = XENMAPSPACE_grant_table;
+		xatp.idx = i;
+		xatp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
+		rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
+		if (rc)
+			printf("XENMEM_add_to_physmap failed; status = %d\n",
+			       rc);
+		BUG_ON(rc != 0);
+	}
+
+	setup.dom = DOMID_SELF;
+	setup.nr_frames = NR_GRANT_FRAMES;
+	set_xen_guest_handle(setup.frame_list, frames);
+	rc = HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
+	if (rc || setup.status) {
+		printf("GNTTABOP_setup_table failed; status = %s\n",
+		       gnttabop_error(setup.status));
+		BUG();
+	}
+}
+
+void fini_gnttab(void)
+{
+	struct xen_remove_from_physmap xrtp;
+	struct gnttab_setup_table setup;
+	int i, rc;
+
+	debug("%s\n", __func__);
+
+	for (i = 0; i < NR_GRANT_FRAMES; i++) {
+		xrtp.domid = DOMID_SELF;
+		xrtp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
+		rc = HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp);
+		if (rc)
+			printf("XENMEM_remove_from_physmap failed; status = %d\n",
+			       rc);
+		BUG_ON(rc != 0);
+	}
+
+	setup.dom = DOMID_SELF;
+	setup.nr_frames = 0;
+
+	HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
+	if (setup.status) {
+		printf("GNTTABOP_setup_table failed; status = %s\n",
+		       gnttabop_error(setup.status));
+		BUG();
+	}
+}
+
diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
index d7fbacb08e..f3c2504d72 100644
--- a/drivers/xen/hypervisor.c
+++ b/drivers/xen/hypervisor.c
@@ -38,6 +38,7 @@
 
 #include <xen/hvm.h>
 #include <xen/events.h>
+#include <xen/gnttab.h>
 #include <xen/xenbus.h>
 #include <xen/interface/memory.h>
 
@@ -275,5 +276,6 @@ void xen_init(void)
 	map_shared_info(NULL);
 	init_events();
 	init_xenbus();
+	init_gnttab();
 }
 
diff --git a/include/xen/gnttab.h b/include/xen/gnttab.h
new file mode 100644
index 0000000000..7e0f6db83e
--- /dev/null
+++ b/include/xen/gnttab.h
@@ -0,0 +1,25 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * (C) 2006, Steven Smith <sos22@cam.ac.uk>
+ * (C) 2006, Grzegorz Milos <gm281@cam.ac.uk>
+ * (C) 2020, EPAM Systems Inc.
+ */
+#ifndef __GNTTAB_H__
+#define __GNTTAB_H__
+
+#include <xen/interface/grant_table.h>
+
+void init_gnttab(void);
+void fini_gnttab(void);
+
+grant_ref_t gnttab_alloc_and_grant(void **map);
+grant_ref_t gnttab_grant_access(domid_t domid, unsigned long frame,
+				int readonly);
+grant_ref_t gnttab_grant_transfer(domid_t domid, unsigned long pfn);
+int gnttab_end_access(grant_ref_t ref);
+const char *gnttabop_error(int16_t status);
+
+void get_gnttab_base(phys_addr_t *gnttab_base, phys_size_t *gnttab_sz);
+
+#endif /* !__GNTTAB_H__ */
-- 
2.17.1

* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (10 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 11/17] xen: Port Xen grant table " Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-02  4:17   ` Heinrich Schuchardt
  2020-07-02  4:29   ` Heinrich Schuchardt
  2020-07-01 16:29 ` [PATCH 13/17] xen: pvblock: Enumerate virtual block devices Anastasiia Lukianenko
                   ` (5 subsequent siblings)
  17 siblings, 2 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

Add the initial infrastructure for the Xen para-virtualized block
device: compile-time configuration and a skeleton for the future
driver implementation.
Add a new class, UCLASS_PVBLOCK, which is going to be the parent of
the virtual block devices.
Add a new interface type, IF_TYPE_PVBLOCK.

Implement basic driver setup by reading XenStore configuration.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
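Note: the new "pvblock" interface name is what lets commands such as
"ext4load pvblock 0 ..." resolve to IF_TYPE_PVBLOCK. The following is a
minimal, standalone sketch of that name-to-type lookup, not the actual
blk-uclass.c code: the enum is a shortened, hypothetical copy and
IF_TYPE_UNKNOWN is assumed to be the "not found" value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened copy of the interface-type enum. */
enum if_type {
	IF_TYPE_UNKNOWN = 0,
	IF_TYPE_NVME,
	IF_TYPE_EFI,
	IF_TYPE_PVBLOCK,
	IF_TYPE_VIRTIO,

	IF_TYPE_COUNT,			/* Number of interface types */
};

/* Designated initializers keep each name tied to its enum value. */
static const char *const if_typename_str[IF_TYPE_COUNT] = {
	[IF_TYPE_NVME]		= "nvme",
	[IF_TYPE_EFI]		= "efi",
	[IF_TYPE_VIRTIO]	= "virtio",
	[IF_TYPE_PVBLOCK]	= "pvblock",
};

/* Return the interface type for a name, or IF_TYPE_UNKNOWN if none. */
enum if_type if_typename_to_iftype(const char *if_typename)
{
	int i;

	assert(if_typename != NULL);

	for (i = 0; i < IF_TYPE_COUNT; i++) {
		/* Slots without a name (e.g. IF_TYPE_UNKNOWN) are NULL. */
		if (if_typename_str[i] &&
		    !strcmp(if_typename, if_typename_str[i]))
			return (enum if_type)i;
	}

	return IF_TYPE_UNKNOWN;
}
```

Because the table is indexed by the enum, inserting IF_TYPE_PVBLOCK
before IF_TYPE_VIRTIO shifts the numeric values but leaves the lookup
itself unaffected.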
 cmd/Kconfig                      |   7 ++
 cmd/Makefile                     |   1 +
 cmd/pvblock.c                    |  31 ++++++++
 common/board_r.c                 |  14 ++++
 configs/xenguest_arm64_defconfig |   4 +
 disk/part.c                      |   4 +
 drivers/Kconfig                  |   2 +
 drivers/block/blk-uclass.c       |   2 +
 drivers/xen/Kconfig              |  10 +++
 drivers/xen/Makefile             |   2 +
 drivers/xen/pvblock.c            | 121 +++++++++++++++++++++++++++++++
 include/blk.h                    |   1 +
 include/configs/xenguest_arm64.h |   8 ++
 include/dm/uclass-id.h           |   1 +
 include/pvblock.h                |  12 +++
 15 files changed, 220 insertions(+)
 create mode 100644 cmd/pvblock.c
 create mode 100644 drivers/xen/Kconfig
 create mode 100644 drivers/xen/pvblock.c
 create mode 100644 include/pvblock.h

diff --git a/cmd/Kconfig b/cmd/Kconfig
index 192b3b262f..f28576947b 100644
--- a/cmd/Kconfig
+++ b/cmd/Kconfig
@@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
 	help
 	  USB mass storage support
 
+config CMD_PVBLOCK
+	bool "Xen para-virtualized block device"
+	depends on XEN
+	select PVBLOCK
+	help
+	  Xen para-virtualized block device support
+
 config CMD_VIRTIO
 	bool "virtio"
 	depends on VIRTIO
diff --git a/cmd/Makefile b/cmd/Makefile
index 974ad48b0a..117284a28c 100644
--- a/cmd/Makefile
+++ b/cmd/Makefile
@@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
 obj-$(CONFIG_CMD_GPT) += gpt.o
 obj-$(CONFIG_CMD_ETHSW) += ethsw.o
 obj-$(CONFIG_CMD_AXI) += axi.o
+obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
 
 # Power
 obj-$(CONFIG_CMD_PMIC) += pmic.o
diff --git a/cmd/pvblock.c b/cmd/pvblock.c
new file mode 100644
index 0000000000..7dbb243a74
--- /dev/null
+++ b/cmd/pvblock.c
@@ -0,0 +1,31 @@
+/*
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * (C) Copyright 2020 EPAM Systems Inc.
+ *
+ * XEN para-virtualized block device support
+ */
+
+#include <blk.h>
+#include <common.h>
+#include <command.h>
+
+/* Current I/O Device	*/
+static int pvblock_curr_device;
+
+int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char *const argv[])
+{
+	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
+			      &pvblock_curr_device);
+}
+
+U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
+	   "Xen para-virtualized block device",
+	   "info  - show available block devices\n"
+	   "pvblock device [dev] - show or set current device\n"
+	   "pvblock part [dev] - print partition table of one or all devices\n"
+	   "pvblock read  addr blk# cnt\n"
+	   "pvblock write addr blk# cnt - read/write `cnt'"
+	   " blocks starting at block `blk#'\n"
+	   "    to/from memory address `addr'");
+
diff --git a/common/board_r.c b/common/board_r.c
index fd36edb4e5..40cd0e5d3c 100644
--- a/common/board_r.c
+++ b/common/board_r.c
@@ -49,6 +49,7 @@
 #include <nand.h>
 #include <of_live.h>
 #include <onenand_uboot.h>
+#include <pvblock.h>
 #include <scsi.h>
 #include <serial.h>
 #include <status_led.h>
@@ -470,6 +471,16 @@ static int initr_xen(void)
 	return 0;
 }
 #endif
+
+#ifdef CONFIG_PVBLOCK
+static int initr_pvblock(void)
+{
+	puts("PVBLOCK: ");
+	pvblock_init();
+	return 0;
+}
+#endif
+
 /*
  * Tell if it's OK to load the environment early in boot.
  *
@@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
 #endif
 #ifdef CONFIG_XEN
 	initr_xen,
+#endif
+#ifdef CONFIG_PVBLOCK
+	initr_pvblock,
 #endif
 	initr_env,
 #ifdef CONFIG_SYS_BOOTPARAMS_LEN
diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
index 45559a161b..46473c251d 100644
--- a/configs/xenguest_arm64_defconfig
+++ b/configs/xenguest_arm64_defconfig
@@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
 CONFIG_CMD_BOOTEFI=n
 CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
 CONFIG_CMD_ELF=n
+CONFIG_CMD_EXT4=y
+CONFIG_CMD_FAT=y
 CONFIG_CMD_GO=n
 CONFIG_CMD_RUN=n
 CONFIG_CMD_IMI=n
@@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
 CONFIG_CMD_SAVEENV=n
 CONFIG_CMD_UMS=n
 
+CONFIG_CMD_PVBLOCK=y
+
 #CONFIG_USB=n
 # CONFIG_ISO_PARTITION is not set
 
diff --git a/disk/part.c b/disk/part.c
index f6a31025dc..b69fd345f3 100644
--- a/disk/part.c
+++ b/disk/part.c
@@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
 	case IF_TYPE_MMC:
 	case IF_TYPE_USB:
 	case IF_TYPE_NVME:
+	case IF_TYPE_PVBLOCK:
 		printf ("Vendor: %s Rev: %s Prod: %s\n",
 			dev_desc->vendor,
 			dev_desc->revision,
@@ -288,6 +289,9 @@ static void print_part_header(const char *type, struct blk_desc *dev_desc)
 	case IF_TYPE_NVME:
 		puts ("NVMe");
 		break;
+	case IF_TYPE_PVBLOCK:
+		puts("PV BLOCK");
+		break;
 	case IF_TYPE_VIRTIO:
 		puts("VirtIO");
 		break;
diff --git a/drivers/Kconfig b/drivers/Kconfig
index e34a22708c..65076aab03 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
 
 source "drivers/watchdog/Kconfig"
 
+source "drivers/xen/Kconfig"
+
 config PHYS_TO_BUS
 	bool "Custom physical to bus address mapping"
 	help
diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
index b19375cbc8..6cfabbca24 100644
--- a/drivers/block/blk-uclass.c
+++ b/drivers/block/blk-uclass.c
@@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT] = {
 	[IF_TYPE_NVME]		= "nvme",
 	[IF_TYPE_EFI]		= "efi",
 	[IF_TYPE_VIRTIO]	= "virtio",
+	[IF_TYPE_PVBLOCK]	= "pvblock",
 };
 
 static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
@@ -43,6 +44,7 @@ static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
 	[IF_TYPE_NVME]		= UCLASS_NVME,
 	[IF_TYPE_EFI]		= UCLASS_EFI,
 	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
+	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
 };
 
 static enum if_type if_typename_to_iftype(const char *if_typename)
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
new file mode 100644
index 0000000000..6ad2a93668
--- /dev/null
+++ b/drivers/xen/Kconfig
@@ -0,0 +1,10 @@
+config PVBLOCK
+	bool "Xen para-virtualized block device"
+	depends on DM
+	select BLK
+	select HAVE_BLOCK_DEVICE
+	help
+	  This driver implements the front-end of the Xen virtual
+	  block device driver. It communicates with a back-end driver
+	  in another domain which drives the actual block device.
+
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 243b13277a..87157df69b 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -6,3 +6,5 @@ obj-y += hypervisor.o
 obj-y += events.o
 obj-y += xenbus.o
 obj-y += gnttab.o
+
+obj-$(CONFIG_PVBLOCK) += pvblock.o
diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
new file mode 100644
index 0000000000..057add9753
--- /dev/null
+++ b/drivers/xen/pvblock.c
@@ -0,0 +1,121 @@
+/*
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * (C) Copyright 2020 EPAM Systems Inc.
+ */
+#include <blk.h>
+#include <common.h>
+#include <dm.h>
+#include <dm/device-internal.h>
+
+#define DRV_NAME	"pvblock"
+#define DRV_NAME_BLK	"pvblock_blk"
+
+struct blkfront_dev {
+	char dummy;
+};
+
+static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
+{
+	return 0;
+}
+
+static void shutdown_blkfront(struct blkfront_dev *dev)
+{
+}
+
+ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
+		       void *buffer)
+{
+	return 0;
+}
+
+ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
+			const void *buffer)
+{
+	return 0;
+}
+
+static int pvblock_blk_bind(struct udevice *udev)
+{
+	return 0;
+}
+
+static int pvblock_blk_probe(struct udevice *udev)
+{
+	struct blkfront_dev *blk_dev = dev_get_priv(udev);
+	int ret;
+
+	ret = init_blkfront(0, blk_dev);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+static int pvblock_blk_remove(struct udevice *udev)
+{
+	struct blkfront_dev *blk_dev = dev_get_priv(udev);
+
+	shutdown_blkfront(blk_dev);
+	return 0;
+}
+
+static const struct blk_ops pvblock_blk_ops = {
+	.read	= pvblock_blk_read,
+	.write	= pvblock_blk_write,
+};
+
+U_BOOT_DRIVER(pvblock_blk) = {
+	.name			= DRV_NAME_BLK,
+	.id			= UCLASS_BLK,
+	.ops			= &pvblock_blk_ops,
+	.bind			= pvblock_blk_bind,
+	.probe			= pvblock_blk_probe,
+	.remove			= pvblock_blk_remove,
+	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
+	.flags			= DM_FLAG_OS_PREPARE,
+};
+
+/*******************************************************************************
+ * Para-virtual block device class
+ *******************************************************************************/
+
+void pvblock_init(void)
+{
+	struct driver_info info;
+	struct udevice *udev;
+	struct uclass *uc;
+	int ret;
+
+	/*
+	 * At this point Xen drivers have already initialized,
+	 * so we can instantiate the class driver and enumerate
+	 * virtual block devices.
+	 */
+	info.name = DRV_NAME;
+	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
+	if (ret < 0)
+		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
+
+	/* Bootstrap virtual block devices class driver */
+	ret = uclass_get(UCLASS_PVBLOCK, &uc);
+	if (ret)
+		return;
+	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
+}
+
+static int pvblock_probe(struct udevice *udev)
+{
+	return 0;
+}
+
+U_BOOT_DRIVER(pvblock_drv) = {
+	.name		= DRV_NAME,
+	.id		= UCLASS_PVBLOCK,
+	.probe		= pvblock_probe,
+};
+
+UCLASS_DRIVER(pvblock) = {
+	.name		= DRV_NAME,
+	.id		= UCLASS_PVBLOCK,
+};
diff --git a/include/blk.h b/include/blk.h
index abcd4bedbb..9ee10fb80e 100644
--- a/include/blk.h
+++ b/include/blk.h
@@ -33,6 +33,7 @@ enum if_type {
 	IF_TYPE_HOST,
 	IF_TYPE_NVME,
 	IF_TYPE_EFI,
+	IF_TYPE_PVBLOCK,
 	IF_TYPE_VIRTIO,
 
 	IF_TYPE_COUNT,			/* Number of interface types */
diff --git a/include/configs/xenguest_arm64.h b/include/configs/xenguest_arm64.h
index 467dabf1e5..2c0d3d64fb 100644
--- a/include/configs/xenguest_arm64.h
+++ b/include/configs/xenguest_arm64.h
@@ -42,4 +42,12 @@
 #define CONFIG_CMDLINE_TAG            1
 #define CONFIG_INITRD_TAG             1
 
+#define CONFIG_CMD_RUN
+
+#undef CONFIG_EXTRA_ENV_SETTINGS
+#define CONFIG_EXTRA_ENV_SETTINGS	\
+	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
+	"pvblockboot=run loadimage;" \
+		"booti 0x90000000 - 0x88000000;\0"
+
 #endif /* __XENGUEST_ARM64_H */
diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
index 7837d459f1..4bf7501204 100644
--- a/include/dm/uclass-id.h
+++ b/include/dm/uclass-id.h
@@ -121,6 +121,7 @@ enum uclass_id {
 	UCLASS_W1,		/* Dallas 1-Wire bus */
 	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
 	UCLASS_WDT,		/* Watchdog Timer driver */
+	UCLASS_PVBLOCK,		/* Xen virtual block device */
 
 	UCLASS_COUNT,
 	UCLASS_INVALID = -1,
diff --git a/include/pvblock.h b/include/pvblock.h
new file mode 100644
index 0000000000..e3bb8ff9a7
--- /dev/null
+++ b/include/pvblock.h
@@ -0,0 +1,12 @@
+/*
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * (C) 2020 EPAM Systems Inc.
+ */
+
+#ifndef _PVBLOCK_H
+#define _PVBLOCK_H
+
+void pvblock_init(void);
+
+#endif /* _PVBLOCK_H */
-- 
2.17.1

* [PATCH 13/17] xen: pvblock: Enumerate virtual block devices
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (11 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize Anastasiia Lukianenko
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

Enumerate Xen virtual block devices found in XenStore and
instantiate pvblock devices.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
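Note: the enumeration below parses each "device/vbd" entry with sscanf()
and does not check the result. A possible stricter variant is sketched
here as standalone C; it is only an illustration under assumptions:
parse_vbd_devid(), enumerate_vbd() and record_vbd() are hypothetical
names, and a plain "void *" stands in for "struct udevice *".

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the callback type used by the driver (parent is opaque). */
typedef int (*enum_vbd_callback)(void *parent, unsigned int devid);

/*
 * Parse a directory entry such as "51712" into a device ID.
 * Returns 0 on success, -EINVAL if the entry is not a plain decimal number.
 */
int parse_vbd_devid(const char *name, unsigned int *devid)
{
	char *end;
	unsigned long val;

	if (!name || *name < '0' || *name > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(name, &end, 10);
	if (errno || *end != '\0' || val > UINT_MAX)
		return -EINVAL;

	*devid = (unsigned int)val;
	return 0;
}

/*
 * Walk a NULL-terminated list of entries and call clb for each valid ID.
 * Returns the number of devices reported, or the first callback error.
 */
int enumerate_vbd(char *const *dirs, void *parent, enum_vbd_callback clb)
{
	int i, ret, count = 0;

	assert(dirs && clb);

	for (i = 0; dirs[i]; i++) {
		unsigned int devid;

		if (parse_vbd_devid(dirs[i], &devid) < 0)
			continue;	/* skip malformed entries */

		ret = clb(parent, devid);
		if (ret < 0)
			return ret;
		count++;
	}

	return count;
}

/* Example callback: remember the last ID seen via the parent pointer. */
int record_vbd(void *parent, unsigned int devid)
{
	*(unsigned int *)parent = devid;
	return 0;
}
```

Unlike the driver code, this sketch leaves ownership of the strings
with the caller, so it frees nothing.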
 drivers/xen/pvblock.c | 112 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 110 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
index 057add9753..6ce0ae97c3 100644
--- a/drivers/xen/pvblock.c
+++ b/drivers/xen/pvblock.c
@@ -7,6 +7,10 @@
 #include <common.h>
 #include <dm.h>
 #include <dm/device-internal.h>
+#include <malloc.h>
+#include <part.h>
+
+#include <xen/xenbus.h>
 
 #define DRV_NAME	"pvblock"
 #define DRV_NAME_BLK	"pvblock_blk"
@@ -15,6 +19,10 @@ struct blkfront_dev {
 	char dummy;
 };
 
+struct blkfront_platdata {
+	unsigned int devid;
+};
+
 static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
 {
 	return 0;
@@ -38,15 +46,40 @@ ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
 
 static int pvblock_blk_bind(struct udevice *udev)
 {
+	struct blk_desc *desc = dev_get_uclass_platdata(udev);
+	int devnum;
+
+	desc->if_type = IF_TYPE_PVBLOCK;
+	/*
+	 * Initialize the devnum to -ENODEV. This is to make sure that
+	 * blk_next_free_devnum() works as expected, since the default
+	 * value 0 is a valid devnum.
+	 */
+	desc->devnum = -ENODEV;
+	devnum = blk_next_free_devnum(IF_TYPE_PVBLOCK);
+	if (devnum < 0)
+		return devnum;
+	desc->devnum = devnum;
+	desc->part_type = PART_TYPE_UNKNOWN;
+	desc->bdev = udev;
+
+	strncpy(desc->vendor, "Xen", sizeof(desc->vendor));
+	strncpy(desc->revision, "1", sizeof(desc->revision));
+	strncpy(desc->product, "Virtual disk", sizeof(desc->product));
+
 	return 0;
 }
 
 static int pvblock_blk_probe(struct udevice *udev)
 {
 	struct blkfront_dev *blk_dev = dev_get_priv(udev);
-	int ret;
+	struct blkfront_platdata *platdata = dev_get_platdata(udev);
+	int ret, devid;
 
-	ret = init_blkfront(0, blk_dev);
+	devid = platdata->devid;
+	free(platdata);
+
+	ret = init_blkfront(devid, blk_dev);
 	if (ret < 0)
 		return ret;
 	return 0;
@@ -80,6 +113,68 @@ U_BOOT_DRIVER(pvblock_blk) = {
  * Para-virtual block device class
  *******************************************************************************/
 
+typedef int (*enum_vbd_callback)(struct udevice *parent, unsigned int devid);
+
+static int on_new_vbd(struct udevice *parent, unsigned int devid)
+{
+	struct driver_info info;
+	struct udevice *udev;
+	struct blkfront_platdata *platdata;
+	int ret;
+
+	debug("New " DRV_NAME_BLK ", device ID %d\n", devid);
+
+	platdata = malloc(sizeof(struct blkfront_platdata));
+	if (!platdata) {
+		printf("Failed to allocate platform data\n");
+		return -ENOMEM;
+	}
+
+	platdata->devid = devid;
+
+	info.name = DRV_NAME_BLK;
+	info.platdata = platdata;
+
+	ret = device_bind_by_name(parent, false, &info, &udev);
+	if (ret < 0) {
+		printf("Failed to bind " DRV_NAME_BLK " to device with ID %d, ret: %d\n",
+		       devid, ret);
+		free(platdata);
+	}
+	return ret;
+}
+
+static int xenbus_enumerate_vbd(struct udevice *udev, enum_vbd_callback clb)
+{
+	char **dirs, *msg;
+	int i, ret;
+
+	msg = xenbus_ls(XBT_NIL, "device/vbd", &dirs);
+	if (msg) {
+		printf("Failed to read device/vbd directory: %s\n", msg);
+		free(msg);
+		return -ENODEV;
+	}
+
+	for (i = 0; dirs[i]; i++) {
+		int devid;
+
+		sscanf(dirs[i], "%d", &devid);
+		ret = clb(udev, devid);
+		if (ret < 0)
+			goto fail;
+
+		free(dirs[i]);
+	}
+	ret = 0;
+
+fail:
+	for (; dirs[i]; i++)
+		free(dirs[i]);
+	free(dirs);
+	return ret;
+}
+
 void pvblock_init(void)
 {
 	struct driver_info info;
@@ -106,6 +201,19 @@ void pvblock_init(void)
 
 static int pvblock_probe(struct udevice *udev)
 {
+	struct uclass *uc;
+	int ret;
+
+	if (xenbus_enumerate_vbd(udev, on_new_vbd) < 0)
+		return -ENODEV;
+
+	ret = uclass_get(UCLASS_BLK, &uc);
+	if (ret)
+		return ret;
+	uclass_foreach_dev_probe(UCLASS_BLK, udev) {
+		if (_ret)
+			return _ret;
+	}
 	return 0;
 }
 
-- 
2.17.1

* [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (12 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 13/17] xen: pvblock: Enumerate virtual block devices Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO Anastasiia Lukianenko
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

Read the essential virtual block device configuration from XenStore,
then initialize the front ring and the event channel.
Update the block device descriptor with the actual block size and
sector count.

The XenStore handling code is taken from mini-os.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
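Note on the FIXME in init_blkfront(): reading "sectors" as an int caps
the count at INT_MAX, which with 512-byte sectors is roughly
2^31 * 2^9 = 2^40 bytes, i.e. about 1 TiB. A standalone sketch of a
64-bit parse follows, together with the log2 computation that
desc->log2blksz needs. parse_u64() and blksz_log2() are hypothetical
helpers for illustration, not existing xenbus or U-Boot functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal string (as stored in XenStore) into a 64-bit value.
 * Returns 0 on success, -EINVAL on malformed or out-of-range input.
 */
int parse_u64(const char *str, uint64_t *out)
{
	char *end;
	unsigned long long val;

	assert(out != NULL);

	if (!str || *str < '0' || *str > '9')
		return -EINVAL;

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;

	*out = (uint64_t)val;
	return 0;
}

/* Return log2 of a power-of-two block size, or -EINVAL otherwise. */
int blksz_log2(unsigned int blksz)
{
	int shift = 0;

	/* Reject zero and anything with more than one bit set. */
	if (!blksz || (blksz & (blksz - 1)))
		return -EINVAL;

	while (blksz > 1) {
		blksz >>= 1;
		shift++;
	}

	return shift;
}
```

With such a parse, a backend reporting 4294967296 sectors of 512 bytes
(2 TiB) would no longer be truncated.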
 drivers/xen/pvblock.c | 272 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 271 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
index 6ce0ae97c3..9ed18be633 100644
--- a/drivers/xen/pvblock.c
+++ b/drivers/xen/pvblock.c
@@ -1,6 +1,7 @@
 /*
  * SPDX-License-Identifier:	GPL-2.0+
  *
+ * (C) 2007-2008 Samuel Thibault.
  * (C) Copyright 2020 EPAM Systems Inc.
  */
 #include <blk.h>
@@ -10,26 +11,289 @@
 #include <malloc.h>
 #include <part.h>
 
+#include <asm/armv8/mmu.h>
+#include <asm/io.h>
+#include <asm/xen/system.h>
+
+#include <linux/compat.h>
+
+#include <xen/events.h>
+#include <xen/gnttab.h>
+#include <xen/hvm.h>
 #include <xen/xenbus.h>
 
+#include <xen/interface/io/ring.h>
+#include <xen/interface/io/blkif.h>
+#include <xen/interface/io/protocols.h>
+
 #define DRV_NAME	"pvblock"
 #define DRV_NAME_BLK	"pvblock_blk"
 
+#define O_RDONLY	00
+#define O_RDWR		02
+
+struct blkfront_info {
+	u64 sectors;
+	unsigned int sector_size;
+	int mode;
+	int info;
+	int barrier;
+	int flush;
+};
+
 struct blkfront_dev {
-	char dummy;
+	domid_t dom;
+
+	struct blkif_front_ring ring;
+	grant_ref_t ring_ref;
+	evtchn_port_t evtchn;
+	blkif_vdev_t handle;
+
+	char *nodename;
+	char *backend;
+	struct blkfront_info info;
+	unsigned int devid;
 };
 
 struct blkfront_platdata {
 	unsigned int devid;
 };
 
+static void free_blkfront(struct blkfront_dev *dev)
+{
+	mask_evtchn(dev->evtchn);
+	free(dev->backend);
+
+	gnttab_end_access(dev->ring_ref);
+	free(dev->ring.sring);
+
+	unbind_evtchn(dev->evtchn);
+
+	free(dev->nodename);
+	free(dev);
+}
+
+static void blkfront_handler(evtchn_port_t port, struct pt_regs *regs,
+			     void *data)
+{
+	printf("%s [Port %d] - event received\n", __func__, port);
+}
+
 static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
 {
+	xenbus_transaction_t xbt;
+	char *err = NULL;
+	char *message = NULL;
+	struct blkif_sring *s;
+	int retry = 0;
+	char *msg = NULL;
+	char *c;
+	char nodename[32];
+	char path[ARRAY_SIZE(nodename) + strlen("/backend-id") + 1];
+
+	sprintf(nodename, "device/vbd/%d", devid);
+
+	memset(dev, 0, sizeof(*dev));
+	dev->nodename = strdup(nodename);
+	dev->devid = devid;
+
+	snprintf(path, sizeof(path), "%s/backend-id", nodename);
+	dev->dom = xenbus_read_integer(path);
+	evtchn_alloc_unbound(dev->dom, blkfront_handler, dev, &dev->evtchn);
+
+	s = (struct blkif_sring *)memalign(PAGE_SIZE, PAGE_SIZE);
+	if (!s) {
+		printf("Failed to allocate shared ring\n");
+		goto error;
+	}
+
+	SHARED_RING_INIT(s);
+	FRONT_RING_INIT(&dev->ring, s, PAGE_SIZE);
+
+	dev->ring_ref = gnttab_grant_access(dev->dom, virt_to_pfn(s), 0);
+
+again:
+	err = xenbus_transaction_start(&xbt);
+	if (err) {
+		printf("starting transaction\n");
+		free(err);
+	}
+
+	err = xenbus_printf(xbt, nodename, "ring-ref", "%u", dev->ring_ref);
+	if (err) {
+		message = "writing ring-ref";
+		goto abort_transaction;
+	}
+	err = xenbus_printf(xbt, nodename, "event-channel", "%u", dev->evtchn);
+	if (err) {
+		message = "writing event-channel";
+		goto abort_transaction;
+	}
+	err = xenbus_printf(xbt, nodename, "protocol", "%s",
+			    XEN_IO_PROTO_ABI_NATIVE);
+	if (err) {
+		message = "writing protocol";
+		goto abort_transaction;
+	}
+
+	snprintf(path, sizeof(path), "%s/state", nodename);
+	err = xenbus_switch_state(xbt, path, XenbusStateConnected);
+	if (err) {
+		message = "switching state";
+		goto abort_transaction;
+	}
+
+	err = xenbus_transaction_end(xbt, 0, &retry);
+	free(err);
+	if (retry) {
+		printf("completing transaction\n");
+		goto again;
+	}
+
+	goto done;
+
+abort_transaction:
+	free(err);
+	err = xenbus_transaction_end(xbt, 1, &retry);
+	printf("Abort transaction %s\n", message);
+	goto error;
+
+done:
+	snprintf(path, sizeof(path), "%s/backend", nodename);
+	msg = xenbus_read(XBT_NIL, path, &dev->backend);
+	if (msg) {
+		printf("Error %s when reading the backend path %s\n",
+		       msg, path);
+		goto error;
+	}
+
+	dev->handle = strtoul(strrchr(nodename, '/') + 1, NULL, 0);
+
+	{
+		XenbusState state;
+		char path[strlen(dev->backend) +
+			strlen("/feature-flush-cache") + 1];
+
+		snprintf(path, sizeof(path), "%s/mode", dev->backend);
+		msg = xenbus_read(XBT_NIL, path, &c);
+		if (msg) {
+			printf("Error %s when reading the mode\n", msg);
+			goto error;
+		}
+		if (*c == 'w')
+			dev->info.mode = O_RDWR;
+		else
+			dev->info.mode = O_RDONLY;
+		free(c);
+
+		snprintf(path, sizeof(path), "%s/state", dev->backend);
+
+		msg = NULL;
+		state = xenbus_read_integer(path);
+		while (msg == NULL && state < XenbusStateConnected)
+			msg = xenbus_wait_for_state_change(path, &state);
+		if (msg != NULL || state != XenbusStateConnected) {
+			printf("backend not available, state=%d\n", state);
+			goto error;
+		}
+
+		snprintf(path, sizeof(path), "%s/info", dev->backend);
+		dev->info.info = xenbus_read_integer(path);
+
+		snprintf(path, sizeof(path), "%s/sectors", dev->backend);
+		/*
+		 * FIXME: xenbus_read_integer() returns an int, so the
+		 * disk size is limited to 1TB (512-byte sectors) for now
+		 */
+		dev->info.sectors = xenbus_read_integer(path);
+
+		snprintf(path, sizeof(path), "%s/sector-size", dev->backend);
+		dev->info.sector_size = xenbus_read_integer(path);
+
+		snprintf(path, sizeof(path), "%s/feature-barrier",
+			 dev->backend);
+		dev->info.barrier = xenbus_read_integer(path);
+
+		snprintf(path, sizeof(path), "%s/feature-flush-cache",
+			 dev->backend);
+		dev->info.flush = xenbus_read_integer(path);
+	}
+	unmask_evtchn(dev->evtchn);
+
+	debug("%llu sectors of %u bytes\n",
+	      dev->info.sectors, dev->info.sector_size);
+
 	return 0;
+
+error:
+	free(msg);
+	free(err);
+	free_blkfront(dev);
+	return -ENODEV;
 }
 
 static void shutdown_blkfront(struct blkfront_dev *dev)
 {
+	char *err = NULL, *err2;
+	XenbusState state;
+
+	char path[strlen(dev->backend) + strlen("/state") + 1];
+	char nodename[strlen(dev->nodename) + strlen("/event-channel") + 1];
+
+	debug("Close " DRV_NAME ", device ID %d\n", dev->devid);
+
+	snprintf(path, sizeof(path), "%s/state", dev->backend);
+	snprintf(nodename, sizeof(nodename), "%s/state", dev->nodename);
+
+	if ((err = xenbus_switch_state(XBT_NIL, nodename,
+				       XenbusStateClosing)) != NULL) {
+		printf("%s: error changing state to %d: %s\n", __func__,
+		       XenbusStateClosing, err);
+		goto close;
+	}
+
+	state = xenbus_read_integer(path);
+	while (err == NULL && state < XenbusStateClosing)
+		err = xenbus_wait_for_state_change(path, &state);
+	free(err);
+
+	if ((err = xenbus_switch_state(XBT_NIL, nodename,
+				       XenbusStateClosed)) != NULL) {
+		printf("%s: error changing state to %d: %s\n", __func__,
+		       XenbusStateClosed, err);
+		goto close;
+	}
+
+	state = xenbus_read_integer(path);
+	while (state < XenbusStateClosed) {
+		err = xenbus_wait_for_state_change(path, &state);
+		free(err);
+	}
+
+	if ((err = xenbus_switch_state(XBT_NIL, nodename,
+				       XenbusStateInitialising)) != NULL) {
+		printf("%s: error changing state to %d: %s\n", __func__,
+		       XenbusStateInitialising, err);
+		goto close;
+	}
+
+	state = xenbus_read_integer(path);
+	while (err == NULL &&
+	       (state < XenbusStateInitWait || state >= XenbusStateClosed))
+		err = xenbus_wait_for_state_change(path, &state);
+
+close:
+	free(err);
+
+	snprintf(nodename, sizeof(nodename), "%s/ring-ref", dev->nodename);
+	err2 = xenbus_rm(XBT_NIL, nodename);
+	free(err2);
+	snprintf(nodename, sizeof(nodename), "%s/event-channel", dev->nodename);
+	err2 = xenbus_rm(XBT_NIL, nodename);
+	free(err2);
+
+	if (!err)
+		free_blkfront(dev);
 }
 
 ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
@@ -74,6 +338,7 @@ static int pvblock_blk_probe(struct udevice *udev)
 {
 	struct blkfront_dev *blk_dev = dev_get_priv(udev);
 	struct blkfront_platdata *platdata = dev_get_platdata(udev);
+	struct blk_desc *desc = dev_get_uclass_platdata(udev);
 	int ret, devid;
 
 	devid = platdata->devid;
@@ -82,6 +347,11 @@ static int pvblock_blk_probe(struct udevice *udev)
 	ret = init_blkfront(devid, blk_dev);
 	if (ret < 0)
 		return ret;
+
+	desc->blksz = blk_dev->info.sector_size;
+	desc->lba = blk_dev->info.sectors;
+	desc->log2blksz = LOG2(blk_dev->info.sector_size);
+
 	return 0;
 }
 
-- 
2.17.1

* [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (13 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 16/17] xen: pvblock: Print found devices indices Anastasiia Lukianenko
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

Implement the Xen para-virtual frontend-to-backend communication
and perform actual disk reads and writes.

This is based on the mini-os implementation of the para-virtual block
frontend driver.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
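Note: init_blkfront() now allocates a sector-sized, sector-aligned
bounce buffer. The standalone sketch below illustrates the general
bounce-buffer technique and is not the code from this patch: it assumes
that sector-aligned caller buffers are used directly and unaligned ones
are staged through the bounce buffer. SECTOR_SIZE, fake_read_sector(),
sector_align_up() and read_sectors() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE	512u	/* assumed sector size for the sketch */

/* Hypothetical backend: fill one sector with its sector number. */
static void fake_read_sector(uint64_t sector, uint8_t *aligned_dst)
{
	memset(aligned_dst, (int)(sector & 0xff), SECTOR_SIZE);
}

static int is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p & (align - 1)) == 0;
}

/* Round a pointer up to the next sector boundary. */
uint8_t *sector_align_up(uint8_t *p)
{
	uintptr_t a = (uintptr_t)p;

	return (uint8_t *)((a + SECTOR_SIZE - 1) &
			   ~(uintptr_t)(SECTOR_SIZE - 1));
}

/*
 * Read count sectors starting at start into dst.
 * Aligned destinations are filled directly; unaligned ones are staged
 * through the bounce buffer and then copied out.
 * Returns how many sectors went through the bounce buffer.
 */
unsigned int read_sectors(uint64_t start, unsigned int count,
			  uint8_t *dst, uint8_t *bounce)
{
	unsigned int i, bounced = 0;

	assert(dst && bounce && is_aligned(bounce, SECTOR_SIZE));

	for (i = 0; i < count; i++, dst += SECTOR_SIZE) {
		if (is_aligned(dst, SECTOR_SIZE)) {
			fake_read_sector(start + i, dst);
		} else {
			fake_read_sector(start + i, bounce);
			memcpy(dst, bounce, SECTOR_SIZE);
			bounced++;
		}
	}

	return bounced;
}
```

The extra copy only costs anything on the unaligned path, so callers
that already pass sector-aligned buffers are unaffected.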
 drivers/xen/events.c  |   2 +-
 drivers/xen/pvblock.c | 311 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 301 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index a1b36a2196..192b290c02 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -104,7 +104,7 @@ void unbind_evtchn(evtchn_port_t port)
 	int rc;
 
 	if (ev_actions[port].handler == default_handler)
-		printf("WARN: No handler for port %d when unbinding\n", port);
+		debug("Default handler for port %d when unbinding\n", port);
 	mask_evtchn(port);
 	clear_evtchn(port);
 
diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
index 9ed18be633..a23afc2cb2 100644
--- a/drivers/xen/pvblock.c
+++ b/drivers/xen/pvblock.c
@@ -15,6 +15,7 @@
 #include <asm/io.h>
 #include <asm/xen/system.h>
 
+#include <linux/bug.h>
 #include <linux/compat.h>
 
 #include <xen/events.h>
@@ -31,6 +32,7 @@
 
 #define O_RDONLY	00
 #define O_RDWR		02
+#define WAIT_RING_TO_MS	10
 
 struct blkfront_info {
 	u64 sectors;
@@ -53,12 +55,30 @@ struct blkfront_dev {
 	char *backend;
 	struct blkfront_info info;
 	unsigned int devid;
+	u8 *bounce_buffer;
 };
 
 struct blkfront_platdata {
 	unsigned int devid;
 };
 
+struct blkfront_aiocb {
+	struct blkfront_dev *aio_dev;
+	u8 *aio_buf;
+	size_t aio_nbytes;
+	off_t aio_offset;
+	size_t total_bytes;
+	u8 is_write;
+	void *data;
+
+	grant_ref_t gref[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+	int n;
+
+	void (*aio_cb)(struct blkfront_aiocb *aiocb, int ret);
+};
+
+static void blkfront_sync(struct blkfront_dev *dev);
+
 static void free_blkfront(struct blkfront_dev *dev)
 {
 	mask_evtchn(dev->evtchn);
@@ -69,16 +89,11 @@ static void free_blkfront(struct blkfront_dev *dev)
 
 	unbind_evtchn(dev->evtchn);
 
+	free(dev->bounce_buffer);
 	free(dev->nodename);
 	free(dev);
 }
 
-static void blkfront_handler(evtchn_port_t port, struct pt_regs *regs,
-			     void *data)
-{
-	printf("%s [Port %d] - event received\n", __func__, port);
-}
-
 static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
 {
 	xenbus_transaction_t xbt;
@@ -99,7 +114,7 @@ static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
 
 	snprintf(path, sizeof(path), "%s/backend-id", nodename);
 	dev->dom = xenbus_read_integer(path);
-	evtchn_alloc_unbound(dev->dom, blkfront_handler, dev, &dev->evtchn);
+	evtchn_alloc_unbound(dev->dom, NULL, dev, &dev->evtchn);
 
 	s = (struct blkif_sring *)memalign(PAGE_SIZE, PAGE_SIZE);
 	if (!s) {
@@ -220,8 +235,16 @@ done:
 	}
 	unmask_evtchn(dev->evtchn);
 
-	debug("%llu sectors of %u bytes\n",
-	      dev->info.sectors, dev->info.sector_size);
+	dev->bounce_buffer = memalign(dev->info.sector_size,
+				      dev->info.sector_size);
+	if (!dev->bounce_buffer) {
+		printf("Failed to allocate bouncing buffer\n");
+		goto error;
+	}
+
+	debug("%llu sectors of %u bytes, bounce buffer at %p\n",
+	      dev->info.sectors, dev->info.sector_size,
+	      dev->bounce_buffer);
 
 	return 0;
 
@@ -242,6 +265,8 @@ static void shutdown_blkfront(struct blkfront_dev *dev)
 
 	debug("Close " DRV_NAME ", device ID %d\n", dev->devid);
 
+	blkfront_sync(dev);
+
 	snprintf(path, sizeof(path), "%s/state", dev->backend);
 	snprintf(nodename, sizeof(nodename), "%s/state", dev->nodename);
 
@@ -296,16 +321,280 @@ close:
 		free_blkfront(dev);
 }
 
+static int blkfront_aio_poll(struct blkfront_dev *dev)
+{
+	RING_IDX rp, cons;
+	struct blkif_response *rsp;
+	int more;
+	int nr_consumed;
+
+moretodo:
+	rp = dev->ring.sring->rsp_prod;
+	rmb(); /* Ensure we see queued responses up to 'rp'. */
+	cons = dev->ring.rsp_cons;
+
+	nr_consumed = 0;
+	while ((cons != rp)) {
+		struct blkfront_aiocb *aiocbp;
+		int status;
+
+		rsp = RING_GET_RESPONSE(&dev->ring, cons);
+		nr_consumed++;
+
+		aiocbp = (void *)(uintptr_t)rsp->id;
+		status = rsp->status;
+
+		switch (rsp->operation) {
+		case BLKIF_OP_READ:
+		case BLKIF_OP_WRITE:
+		{
+			int j;
+
+			if (status != BLKIF_RSP_OKAY)
+				printf("%s error %d on %s at offset %llu, num bytes %llu\n",
+				       rsp->operation == BLKIF_OP_READ ?
+				       "read" : "write",
+				       status, aiocbp->aio_dev->nodename,
+				       (unsigned long long)aiocbp->aio_offset,
+				       (unsigned long long)aiocbp->aio_nbytes);
+
+			for (j = 0; j < aiocbp->n; j++)
+				gnttab_end_access(aiocbp->gref[j]);
+
+			break;
+		}
+
+		case BLKIF_OP_WRITE_BARRIER:
+			if (status != BLKIF_RSP_OKAY)
+				printf("write barrier error %d\n", status);
+			break;
+		case BLKIF_OP_FLUSH_DISKCACHE:
+			if (status != BLKIF_RSP_OKAY)
+				printf("flush error %d\n", status);
+			break;
+
+		default:
+			printf("unrecognized block operation %d response (status %d)\n",
+			       rsp->operation, status);
+			break;
+		}
+
+		dev->ring.rsp_cons = ++cons;
+		/* Nota: callback frees aiocbp itself */
+		if (aiocbp && aiocbp->aio_cb)
+			aiocbp->aio_cb(aiocbp, status ? -EIO : 0);
+		if (dev->ring.rsp_cons != cons)
+			/* We reentered, we must not continue here */
+			break;
+	}
+
+	RING_FINAL_CHECK_FOR_RESPONSES(&dev->ring, more);
+	if (more)
+		goto moretodo;
+
+	return nr_consumed;
+}
+
+static void blkfront_wait_slot(struct blkfront_dev *dev)
+{
+	/* Wait for a slot */
+	if (RING_FULL(&dev->ring)) {
+		while (true) {
+			blkfront_aio_poll(dev);
+			if (!RING_FULL(&dev->ring))
+				break;
+			wait_event_timeout(NULL, !RING_FULL(&dev->ring),
+					   WAIT_RING_TO_MS);
+		}
+	}
+}
+
+/* Issue an aio */
+static void blkfront_aio(struct blkfront_aiocb *aiocbp, int write)
+{
+	struct blkfront_dev *dev = aiocbp->aio_dev;
+	struct blkif_request *req;
+	RING_IDX i;
+	int notify;
+	int n, j;
+	uintptr_t start, end;
+
+	/* Can't io at non-sector-aligned location */
+	BUG_ON(aiocbp->aio_offset & (dev->info.sector_size - 1));
+	/* Can't io non-sector-sized amounts */
+	BUG_ON(aiocbp->aio_nbytes & (dev->info.sector_size - 1));
+	/* Can't io non-sector-aligned buffer */
+	BUG_ON(((uintptr_t)aiocbp->aio_buf & (dev->info.sector_size - 1)));
+
+	start = (uintptr_t)aiocbp->aio_buf & PAGE_MASK;
+	end = ((uintptr_t)aiocbp->aio_buf + aiocbp->aio_nbytes +
+	       PAGE_SIZE - 1) & PAGE_MASK;
+	n = (end - start) / PAGE_SIZE;
+	aiocbp->n = n;
+
+	BUG_ON(n > BLKIF_MAX_SEGMENTS_PER_REQUEST);
+
+	blkfront_wait_slot(dev);
+	i = dev->ring.req_prod_pvt;
+	req = RING_GET_REQUEST(&dev->ring, i);
+
+	req->operation = write ? BLKIF_OP_WRITE : BLKIF_OP_READ;
+	req->nr_segments = n;
+	req->handle = dev->handle;
+	req->id = (uintptr_t)aiocbp;
+	req->sector_number = aiocbp->aio_offset / dev->info.sector_size;
+
+	for (j = 0; j < n; j++) {
+		req->seg[j].first_sect = 0;
+		req->seg[j].last_sect = PAGE_SIZE / dev->info.sector_size - 1;
+	}
+	req->seg[0].first_sect = ((uintptr_t)aiocbp->aio_buf & ~PAGE_MASK) /
+		dev->info.sector_size;
+	req->seg[n - 1].last_sect = (((uintptr_t)aiocbp->aio_buf +
+		aiocbp->aio_nbytes - 1) & ~PAGE_MASK) / dev->info.sector_size;
+	for (j = 0; j < n; j++) {
+		uintptr_t data = start + j * PAGE_SIZE;
+
+		if (!write) {
+			/* Trigger CoW if needed */
+			*(char *)(data + (req->seg[j].first_sect *
+					  dev->info.sector_size)) = 0;
+			barrier();
+		}
+		req->seg[j].gref = gnttab_grant_access(dev->dom,
+						       virt_to_pfn((void *)data),
+						       write);
+		aiocbp->gref[j] = req->seg[j].gref;
+	}
+
+	dev->ring.req_prod_pvt = i + 1;
+
+	wmb();
+	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&dev->ring, notify);
+
+	if (notify)
+		notify_remote_via_evtchn(dev->evtchn);
+}
+
+static void blkfront_aio_cb(struct blkfront_aiocb *aiocbp, int ret)
+{
+	aiocbp->data = (void *)1;
+	aiocbp->aio_cb = NULL;
+}
+
+static void blkfront_io(struct blkfront_aiocb *aiocbp, int write)
+{
+	aiocbp->aio_cb = blkfront_aio_cb;
+	blkfront_aio(aiocbp, write);
+	aiocbp->data = NULL;
+
+	while (true) {
+		blkfront_aio_poll(aiocbp->aio_dev);
+		if (aiocbp->data)
+			break;
+		cpu_relax();
+	}
+}
+
+static void blkfront_push_operation(struct blkfront_dev *dev, u8 op,
+				    uint64_t id)
+{
+	struct blkif_request *req;
+	int notify, i;
+
+	blkfront_wait_slot(dev);
+	i = dev->ring.req_prod_pvt;
+	req = RING_GET_REQUEST(&dev->ring, i);
+	req->operation = op;
+	req->nr_segments = 0;
+	req->handle = dev->handle;
+	req->id = id;
+	req->sector_number = 0;
+	dev->ring.req_prod_pvt = i + 1;
+	wmb();
+	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&dev->ring, notify);
+	if (notify)
+		notify_remote_via_evtchn(dev->evtchn);
+}
+
+static void blkfront_sync(struct blkfront_dev *dev)
+{
+	if (dev->info.mode == O_RDWR) {
+		if (dev->info.barrier == 1)
+			blkfront_push_operation(dev,
+						BLKIF_OP_WRITE_BARRIER, 0);
+
+		if (dev->info.flush == 1)
+			blkfront_push_operation(dev,
+						BLKIF_OP_FLUSH_DISKCACHE, 0);
+	}
+
+	while (true) {
+		blkfront_aio_poll(dev);
+		if (RING_FREE_REQUESTS(&dev->ring) == RING_SIZE(&dev->ring))
+			break;
+		cpu_relax();
+	}
+}
+
+static ulong pvblock_iop(struct udevice *udev, lbaint_t blknr,
+			 lbaint_t blkcnt, void *buffer, int write)
+{
+	struct blkfront_dev *blk_dev = dev_get_priv(udev);
+	struct blk_desc *desc = dev_get_uclass_platdata(udev);
+	struct blkfront_aiocb aiocb;
+	lbaint_t blocks_todo;
+	bool unaligned;
+
+	if (blkcnt == 0)
+		return 0;
+
+	if ((blknr + blkcnt) > desc->lba) {
+		printf(DRV_NAME ": block number 0x" LBAF " exceeds max(0x" LBAF ")\n",
+		       blknr + blkcnt, desc->lba);
+		return 0;
+	}
+
+	unaligned = (uintptr_t)buffer & (blk_dev->info.sector_size - 1);
+
+	aiocb.aio_dev = blk_dev;
+	aiocb.aio_offset = blknr * desc->blksz;
+	aiocb.aio_cb = NULL;
+	aiocb.data = NULL;
+	blocks_todo = blkcnt;
+	do {
+		aiocb.aio_buf = unaligned ? blk_dev->bounce_buffer : buffer;
+
+		if (write && unaligned)
+			memcpy(blk_dev->bounce_buffer, buffer, desc->blksz);
+
+		aiocb.aio_nbytes = unaligned ? desc->blksz :
+			min((size_t)(BLKIF_MAX_SEGMENTS_PER_REQUEST * PAGE_SIZE),
+			    (size_t)(blocks_todo * desc->blksz));
+
+		blkfront_io(&aiocb, write);
+
+		if (!write && unaligned)
+			memcpy(buffer, blk_dev->bounce_buffer, desc->blksz);
+
+		aiocb.aio_offset += aiocb.aio_nbytes;
+		buffer += aiocb.aio_nbytes;
+		blocks_todo -= aiocb.aio_nbytes / desc->blksz;
+	} while (blocks_todo > 0);
+
+	return blkcnt;
+}
+
 ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
 		       void *buffer)
 {
-	return 0;
+	return pvblock_iop(udev, blknr, blkcnt, buffer, 0);
 }
 
 ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
 			const void *buffer)
 {
-	return 0;
+	return pvblock_iop(udev, blknr, blkcnt, (void *)buffer, 1);
 }
 
 static int pvblock_blk_bind(struct udevice *udev)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread
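
Note on the segment arithmetic in blkfront_aio() above: the request
covers every page touched by the buffer, and only the first and last
segments are trimmed to the sectors actually in use. Below is a minimal
standalone sketch of that calculation. The ex_* names, the 4 KiB page
size and the -1 return (used here in place of BUG_ON) are assumptions
made for illustration only; 11 is the value of
BLKIF_MAX_SEGMENTS_PER_REQUEST in blkif.h.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u                          /* assumed page size */
#define EX_PAGE_MASK (~(uintptr_t)(EX_PAGE_SIZE - 1))
#define EX_MAX_SEGS  11   /* BLKIF_MAX_SEGMENTS_PER_REQUEST in blkif.h */

/* First and last sector used within one page-sized segment. */
struct ex_seg {
	unsigned int first_sect;
	unsigned int last_sect;
};

/*
 * Split a sector-aligned buffer into per-page segments.
 * Returns the number of segments, 0 for an empty request, or -1 when
 * the buffer touches more than EX_MAX_SEGS pages.
 */
static int ex_split_segments(uintptr_t buf, size_t nbytes,
			     unsigned int sector_size, struct ex_seg *segs)
{
	uintptr_t start, end;
	int n, j;

	if (nbytes == 0)
		return 0;

	/* Round the buffer outwards to page boundaries and count pages. */
	start = buf & EX_PAGE_MASK;
	end = (buf + nbytes + EX_PAGE_SIZE - 1) & EX_PAGE_MASK;
	n = (int)((end - start) / EX_PAGE_SIZE);
	if (n > EX_MAX_SEGS)
		return -1;

	/* By default every segment spans a whole page ... */
	for (j = 0; j < n; j++) {
		segs[j].first_sect = 0;
		segs[j].last_sect = EX_PAGE_SIZE / sector_size - 1;
	}
	/* ... then the first and last ones are trimmed to the buffer. */
	segs[0].first_sect = (unsigned int)((buf & ~EX_PAGE_MASK) / sector_size);
	segs[n - 1].last_sect =
		(unsigned int)(((buf + nbytes - 1) & ~EX_PAGE_MASK) / sector_size);

	return n;
}
```

With 512-byte sectors, a page-aligned buffer of 11 pages yields 11
segments, whereas the same length starting 512 bytes into a page
touches 12 pages and is rejected by this sketch. In the patch that
situation would correspond to the BUG_ON(n > BLKIF_MAX_SEGMENTS_PER_REQUEST)
check, so large transfers may be safest with page-aligned buffers.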

* [PATCH 16/17] xen: pvblock: Print found devices indices
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (14 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:29 ` [PATCH 17/17] board: xen: De-initialize before jumping to Linux Anastasiia Lukianenko
  2020-07-01 16:51 ` [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 drivers/xen/pvblock.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
index a23afc2cb2..8b102b181d 100644
--- a/drivers/xen/pvblock.c
+++ b/drivers/xen/pvblock.c
@@ -734,6 +734,24 @@ fail:
 	return ret;
 }
 
+static void print_pvblock_devices(void)
+{
+	struct udevice *udev;
+	bool first = true;
+	const char *class_name;
+
+	class_name = uclass_get_name(UCLASS_PVBLOCK);
+	for (blk_first_device(IF_TYPE_PVBLOCK, &udev); udev;
+	     blk_next_device(&udev), first = false) {
+		struct blk_desc *desc = dev_get_uclass_platdata(udev);
+
+		if (!first)
+			puts(", ");
+		printf("%s: %d", class_name, desc->devnum);
+	}
+	printf("\n");
+}
+
 void pvblock_init(void)
 {
 	struct driver_info info;
@@ -756,6 +774,8 @@ void pvblock_init(void)
 	if (ret)
 		return;
 	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
+
+	print_pvblock_devices();
 }
 
 static int pvblock_probe(struct udevice *udev)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread
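
For reference, the loop in print_pvblock_devices() above emits one
"<class name>: <devnum>" entry per device, separated by ", " and
terminated by a newline. A small standalone sketch of the same
formatting, written into a buffer so it can be checked without a
console, could look like this. ex_format_devices() and its parameters
are hypothetical, and the "pvblock" class name used in the usage note
is an assumption about what uclass_get_name() returns.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "name: N, name: M\n" into out.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the buffer is too small.
 */
static int ex_format_devices(char *out, size_t len, const char *class_name,
			     const int *devnums, size_t count)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < count; i++) {
		/* Every entry except the first is preceded by ", ". */
		int w = snprintf(out + used, len - used, "%s%s: %d",
				 i ? ", " : "", class_name, devnums[i]);

		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
	}

	/* Room is needed for the newline and the terminating NUL. */
	if (used + 2 > len)
		return -1;
	out[used++] = '\n';
	out[used] = '\0';

	return (int)used;
}
```

Called with the name "pvblock" and device numbers 0 and 1, this
produces "pvblock: 0, pvblock: 1" followed by a newline; with no
devices it produces just the newline, as the loop in the patch does.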

* [PATCH 17/17] board: xen: De-initialize before jumping to Linux
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (15 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 16/17] xen: pvblock: Print found devices indices Anastasiia Lukianenko
@ 2020-07-01 16:29 ` Anastasiia Lukianenko
  2020-07-03  3:50   ` Simon Glass
  2020-07-01 16:51 ` [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
  17 siblings, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:29 UTC (permalink / raw)
  To: u-boot

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Free resources used by the Xen board before jumping to the Linux kernel.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
---
 board/xen/xenguest_arm64/xenguest_arm64.c | 6 ++++++
 drivers/xen/hypervisor.c                  | 8 ++++++++
 include/xen.h                             | 1 +
 3 files changed, 15 insertions(+)

diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
index b4e1650f99..76a18bea8b 100644
--- a/board/xen/xenguest_arm64/xenguest_arm64.c
+++ b/board/xen/xenguest_arm64/xenguest_arm64.c
@@ -13,6 +13,7 @@
 #include <dm.h>
 #include <errno.h>
 #include <malloc.h>
+#include <xen.h>
 
 #include <asm/io.h>
 #include <asm/armv8/mmu.h>
@@ -195,3 +196,8 @@ int print_cpuinfo(void)
 	return 0;
 }
 
+void board_cleanup_before_linux(void)
+{
+	xen_fini();
+}
+
diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
index f3c2504d72..8d7d320839 100644
--- a/drivers/xen/hypervisor.c
+++ b/drivers/xen/hypervisor.c
@@ -279,3 +279,11 @@ void xen_init(void)
 	init_gnttab();
 }
 
+void xen_fini(void)
+{
+	debug("%s\n", __func__);
+
+	fini_gnttab();
+	fini_xenbus();
+	fini_events();
+}
diff --git a/include/xen.h b/include/xen.h
index 1d6f74cc92..327d7e132b 100644
--- a/include/xen.h
+++ b/include/xen.h
@@ -7,5 +7,6 @@
 #define __XEN_H__
 
 void xen_init(void);
+void xen_fini(void);
 
 #endif /* __XEN_H__ */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 00/17] Add new board: Xen guest for ARM64
  2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
                   ` (16 preceding siblings ...)
  2020-07-01 16:29 ` [PATCH 17/17] board: xen: De-initialize before jumping to Linux Anastasiia Lukianenko
@ 2020-07-01 16:51 ` Anastasiia Lukianenko
  17 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-01 16:51 UTC (permalink / raw)
  To: u-boot

Sorry for top posting, but please note that this series depends on the following patch being upstreamed:
http://u-boot.10912.n7.nabble.com/PATCH-v2-arm64-issue-ISB-after-updating-system-registers-td417392.html#none
Xen guests with less than 129MB of RAM will also need:
http://u-boot.10912.n7.nabble.com/PATCH-common-board-f-Respect-original-FDT-size-while-relocating-td416963.html#none

Regards,
Anastasiia

From: Anastasiia Lukianenko <vicooodin@gmail.com>
Sent: 01 July 2020 19:29
To: u-boot at lists.denx.de <u-boot@lists.denx.de>; sjg at chromium.org <sjg@chromium.org>; ye.li at nxp.com <ye.li@nxp.com>; bmeng.cn at gmail.com <bmeng.cn@gmail.com>; xypron.glpk at gmx.de <xypron.glpk@gmx.de>
Cc: julien at xen.org <julien@xen.org>; sstabellini at kernel.org <sstabellini@kernel.org>; peng.fan at nxp.com <peng.fan@nxp.com>; roman at zededa.com <roman@zededa.com>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Anastasiia Lukianenko <Anastasiia_Lukianenko@epam.com>
Subject: [PATCH 00/17] Add new board: Xen guest for ARM64 
From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>

This work introduces Xen [1] guest ARM64 board support in U-Boot with
para-virtualized (PV) [2] block and serial drivers: xenguest_arm64.

This board is to be run as a virtual Xen guest with U-Boot as its
primary bootloader. The rationale behind introducing this board is a
better and simpler decoupling of the guest from the initial
privileged domain which starts a guest's virtual machine: there are
cross dependencies between the guest OS and the initial privileged
domain (Domain-0); for example, Domain-0 needs the guest's kernel and
may need its device tree to boot it. These dependencies interfere if
the kernel or guest OS needs to be updated, thus having a unified
bootloader in Domain-0 allows resolving this:
1. U-Boot boot scripts, which are stored on the guest's virtual disk,
are guest specific, so any change in the guest's configuration can be
handled by the guest itself.
2. The guest OS kernel can be updated whenever the OS needs it,
without any help from Domain-0.
3. Using the Device Tree Overlay mechanism it is possible to customize
the device tree entries at the bootloader stage inside the guest
itself, so the base device tree provided by Xen can be customized.

Xen support for U-boot was implemented by introducing a new Xen guest
ARM64 board and porting essential drivers from MiniOS [3] as well as
some of the work previously done by NXP [4]:
1. PV block device front driver with XenStore based device
enumeration, new UCLASS_PVBLOCK;
2. PV serial console device front driver;
3. Xen hypervisor support with minimal set of the essential headers
adapted from Linux kernel;
4. grant table support;
5. event channel support, using polling instead of IRQs;
6. xenbus support;
7. dynamic RAM size as defined in the device tree instead of
statically defined values;
8. position-independent pre-relocation code is used as we cannot
statically define any start addresses at compile time which is up to
Xen to choose at run-time;
9. new defconfig introduced: xenguest_arm64_defconfig.

Please note that the para-virtualized serial driver depends on Xen
functionality that only becomes available late during startup, so not
all printouts are shown at the very start, including the U-Boot
banner, memory size etc.

All of the above was tested with the block driver related commands
(info/part/read/write); FAT and ext4 operations work properly and the
Linux kernel can start.

Thank you in advance,
Anastasiia Lukianenko,
Oleksandr Andrushchenko


[1] - https://xenproject.org/
[2] - https://wiki.xenproject.org/wiki/Paravirtualization_(PV)
[3] - https://wiki.xenproject.org/wiki/Mini-OS
[4] - https://source.codeaurora.org/external/imx/uboot-imx/tree/?h=imx_v2018.03_4.14.98_2.0.0_ga

Anastasiia Lukianenko (5):
  xen: pvblock: Add initial support for para-virtualized block driver
  xen: pvblock: Enumerate virtual block devices
  xen: pvblock: Read XenStore configuration and initialize
  xen: pvblock: Implement front-back protocol and do IO
  xen: pvblock: Print found devices indices

Andrii Anisov (2):
  board: Introduce xenguest_arm64 board
  lib: sscanf: add sscanf implementation

Oleksandr Andrushchenko (8):
  armv8: Fix SMCC and ARM_PSCI_FW dependencies
  xen: Add essential and required interface headers
  xen: Port Xen hypervizor related code from mini-os
  xen: Port Xen event channel driver from mini-os
  linux/compat.h: Add wait_event_timeout macro
  xen: Port Xen bus driver from mini-os
  xen: Port Xen grant table driver from mini-os
  board: xen: De-initialize before jumping to Linux

Peng Fan (2):
  Kconfig: Introduce CONFIG_XEN
  serial: serial_xen: Add Xen PV serial driver

 Kconfig                                   |   7 +
 arch/arm/Kconfig                          |  10 +-
 arch/arm/cpu/armv8/Kconfig                |   2 +
 arch/arm/cpu/armv8/Makefile               |   1 +
 arch/arm/cpu/armv8/xen/Makefile           |   6 +
 arch/arm/cpu/armv8/xen/hypercall.S        |  78 ++
 arch/arm/cpu/armv8/xen/lowlevel_init.S    |  34 +
 arch/arm/include/asm/xen.h                |   8 +
 arch/arm/include/asm/xen/hypercall.h      |  45 ++
 arch/arm/include/asm/xen/system.h         |  96 +++
 board/xen/xenguest_arm64/Kconfig          |  12 +
 board/xen/xenguest_arm64/Makefile         |   5 +
 board/xen/xenguest_arm64/xenguest_arm64.c | 203 +++++
 cmd/Kconfig                               |   7 +
 cmd/Makefile                              |   1 +
 cmd/pvblock.c                             |  31 +
 common/board_r.c                          |  25 +
 configs/xenguest_arm64_defconfig          |  60 ++
 disk/part.c                               |   4 +
 drivers/Kconfig                           |   2 +
 drivers/Makefile                          |   1 +
 drivers/block/blk-uclass.c                |   2 +
 drivers/serial/Kconfig                    |   7 +
 drivers/serial/Makefile                   |   1 +
 drivers/serial/serial_xen.c               | 175 +++++
 drivers/xen/Kconfig                       |  10 +
 drivers/xen/Makefile                      |  10 +
 drivers/xen/events.c                      | 181 +++++
 drivers/xen/gnttab.c                      | 258 +++++++
 drivers/xen/hypervisor.c                  | 289 +++++++
 drivers/xen/pvblock.c                     | 808 ++++++++++++++++++++
 drivers/xen/xenbus.c                      | 547 ++++++++++++++
 include/blk.h                             |   1 +
 include/configs/xenguest_arm64.h          |  53 ++
 include/dm/uclass-id.h                    |   1 +
 include/linux/compat.h                    |  45 ++
 include/pvblock.h                         |  12 +
 include/vsprintf.h                        |   8 +
 include/xen.h                             |  12 +
 include/xen/arm/interface.h               |  88 +++
 include/xen/events.h                      |  47 ++
 include/xen/gnttab.h                      |  25 +
 include/xen/hvm.h                         |  30 +
 include/xen/interface/event_channel.h     | 281 +++++++
 include/xen/interface/grant_table.h       | 582 ++++++++++++++
 include/xen/interface/hvm/hvm_op.h        |  69 ++
 include/xen/interface/hvm/params.h        | 127 ++++
 include/xen/interface/io/blkif.h          | 726 ++++++++++++++++++
 include/xen/interface/io/console.h        |  56 ++
 include/xen/interface/io/protocols.h      |  42 +
 include/xen/interface/io/ring.h           | 479 ++++++++++++
 include/xen/interface/io/xenbus.h         |  81 ++
 include/xen/interface/io/xs_wire.h        | 151 ++++
 include/xen/interface/memory.h            | 332 ++++++++
 include/xen/interface/sched.h             | 188 +++++
 include/xen/interface/xen.h               | 225 ++++++
 include/xen/xenbus.h                      |  86 +++
 lib/Kconfig                               |   4 +
 lib/Makefile                              |   1 +
 lib/sscanf.c                              | 883 ++++++++++++++++++++++
 60 files changed, 7560 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/cpu/armv8/xen/Makefile
 create mode 100644 arch/arm/cpu/armv8/xen/hypercall.S
 create mode 100644 arch/arm/cpu/armv8/xen/lowlevel_init.S
 create mode 100644 arch/arm/include/asm/xen.h
 create mode 100644 arch/arm/include/asm/xen/hypercall.h
 create mode 100644 arch/arm/include/asm/xen/system.h
 create mode 100644 board/xen/xenguest_arm64/Kconfig
 create mode 100644 board/xen/xenguest_arm64/Makefile
 create mode 100644 board/xen/xenguest_arm64/xenguest_arm64.c
 create mode 100644 cmd/pvblock.c
 create mode 100644 configs/xenguest_arm64_defconfig
 create mode 100644 drivers/serial/serial_xen.c
 create mode 100644 drivers/xen/Kconfig
 create mode 100644 drivers/xen/Makefile
 create mode 100644 drivers/xen/events.c
 create mode 100644 drivers/xen/gnttab.c
 create mode 100644 drivers/xen/hypervisor.c
 create mode 100644 drivers/xen/pvblock.c
 create mode 100644 drivers/xen/xenbus.c
 create mode 100644 include/configs/xenguest_arm64.h
 create mode 100644 include/pvblock.h
 create mode 100644 include/xen.h
 create mode 100644 include/xen/arm/interface.h
 create mode 100644 include/xen/events.h
 create mode 100644 include/xen/gnttab.h
 create mode 100644 include/xen/hvm.h
 create mode 100644 include/xen/interface/event_channel.h
 create mode 100644 include/xen/interface/grant_table.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h
 create mode 100644 include/xen/interface/io/blkif.h
 create mode 100644 include/xen/interface/io/console.h
 create mode 100644 include/xen/interface/io/protocols.h
 create mode 100644 include/xen/interface/io/ring.h
 create mode 100644 include/xen/interface/io/xenbus.h
 create mode 100644 include/xen/interface/io/xs_wire.h
 create mode 100644 include/xen/interface/memory.h
 create mode 100644 include/xen/interface/sched.h
 create mode 100644 include/xen/interface/xen.h
 create mode 100644 include/xen/xenbus.h
 create mode 100644 lib/sscanf.c

-- 
2.17.1

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 11/17] xen: Port Xen grant table driver from mini-os
  2020-07-01 16:29 ` [PATCH 11/17] xen: Port Xen grant table " Anastasiia Lukianenko
@ 2020-07-01 16:59   ` Julien Grall
  2020-07-03 13:09     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Julien Grall @ 2020-07-01 16:59 UTC (permalink / raw)
  To: u-boot



On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Make required updates to run on u-boot.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>   board/xen/xenguest_arm64/xenguest_arm64.c |  13 ++
>   drivers/xen/Makefile                      |   1 +
>   drivers/xen/gnttab.c                      | 258 ++++++++++++++++++++++
>   drivers/xen/hypervisor.c                  |   2 +
>   include/xen/gnttab.h                      |  25 +++
>   5 files changed, 299 insertions(+)
>   create mode 100644 drivers/xen/gnttab.c
>   create mode 100644 include/xen/gnttab.h
> 
> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
> index e8621f7174..b4e1650f99 100644
> --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> @@ -22,6 +22,7 @@
>   
>   #include <linux/compiler.h>
>   
> +#include <xen/gnttab.h>
>   #include <xen/hvm.h>
>   
>   DECLARE_GLOBAL_DATA_PTR;
> @@ -64,6 +65,8 @@ static int setup_mem_map(void)
>   	struct fdt_resource res;
>   	const void *blob = gd->fdt_blob;
>   	u64 gfn;
> +	phys_addr_t gnttab_base;
> +	phys_size_t gnttab_sz;
>   
>   	/*
>   	 * Add "magic" region which is used by Xen to provide some essentials
> @@ -97,6 +100,16 @@ static int setup_mem_map(void)
>   				PTE_BLOCK_INNER_SHARE);
>   	i++;
>   
> +	/* Get Xen's suggested physical page assignments for the grant table. */
> +	get_gnttab_base(&gnttab_base, &gnttab_sz);
> +
> +	xen_mem_map[i].virt = gnttab_base;
> +	xen_mem_map[i].phys = gnttab_base;
> +	xen_mem_map[i].size = gnttab_sz;
> +	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> +				PTE_BLOCK_INNER_SHARE);
> +	i++;
> +
>   	mem = get_next_memory_node(blob, -1);
>   	if (mem < 0) {
>   		printf("%s: Missing /memory node\n", __func__);
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 9d0f604aaa..243b13277a 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -5,3 +5,4 @@
>   obj-y += hypervisor.o
>   obj-y += events.o
>   obj-y += xenbus.o
> +obj-y += gnttab.o
> diff --git a/drivers/xen/gnttab.c b/drivers/xen/gnttab.c
> new file mode 100644
> index 0000000000..b18102e329
> --- /dev/null
> +++ b/drivers/xen/gnttab.c
> @@ -0,0 +1,258 @@
> +/*
> + ****************************************************************************
> + * (C) 2006 - Cambridge University
> + * (C) 2020 - EPAM Systems Inc.
> + ****************************************************************************
> + *
> + *		File: gnttab.c
> + *	  Author: Steven Smith (sos22 at cam.ac.uk)
> + *	 Changes: Grzegorz Milos (gm281 at cam.ac.uk)
> + *
> + *		Date: July 2006
> + *
> + * Environment: Xen Minimal OS
> + * Description: Simple grant tables implementation. About as stupid as it's
> + *  possible to be and still work.
> + *
> + ****************************************************************************
> + */
> +#include <common.h>
> +#include <linux/compiler.h>
> +#include <log.h>
> +#include <malloc.h>
> +
> +#include <asm/armv8/mmu.h>
> +#include <asm/io.h>
> +#include <asm/xen/system.h>
> +
> +#include <linux/bug.h>
> +
> +#include <xen/gnttab.h>
> +#include <xen/hvm.h>
> +
> +#include <xen/interface/memory.h>
> +
> +DECLARE_GLOBAL_DATA_PTR;
> +
> +#define NR_RESERVED_ENTRIES 8
> +
> +/* NR_GRANT_FRAMES must be less than or equal to that configured in Xen */
> +#define NR_GRANT_FRAMES 1
> +#define NR_GRANT_ENTRIES (NR_GRANT_FRAMES * PAGE_SIZE / sizeof(struct grant_entry_v1))
> +
> +static struct grant_entry_v1 *gnttab_table;
> +static grant_ref_t gnttab_list[NR_GRANT_ENTRIES];
> +
> +static void put_free_entry(grant_ref_t ref)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	gnttab_list[ref] = gnttab_list[0];
> +	gnttab_list[0]  = ref;
> +	local_irq_restore(flags);
> +}
> +
> +static grant_ref_t get_free_entry(void)
> +{
> +	unsigned int ref;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	ref = gnttab_list[0];
> +	BUG_ON(ref < NR_RESERVED_ENTRIES || ref >= NR_GRANT_ENTRIES);
> +	gnttab_list[0] = gnttab_list[ref];
> +	local_irq_restore(flags);
> +	return ref;
> +}
> +
> +grant_ref_t gnttab_grant_access(domid_t domid, unsigned long frame, int readonly)
> +{
> +	grant_ref_t ref;
> +
> +	ref = get_free_entry();
> +	gnttab_table[ref].frame = frame;
> +	gnttab_table[ref].domid = domid;
> +	wmb();
> +	readonly *= GTF_readonly;
> +	gnttab_table[ref].flags = GTF_permit_access | readonly;
> +
> +	return ref;
> +}
> +
> +grant_ref_t gnttab_grant_transfer(domid_t domid, unsigned long pfn)

It is not possible to transfer grants on Arm, so I would suggest
removing the code related to it.

[...]

> +unsigned long gnttab_end_transfer(grant_ref_t ref)

Likewise.

> +{
> +	unsigned long frame;
> +	u16 flags;
> +
> +	BUG_ON(ref >= NR_GRANT_ENTRIES || ref < NR_RESERVED_ENTRIES);
> +
> +	while (!((flags = gnttab_table[ref].flags) & GTF_transfer_committed)) {
> +		if (synch_cmpxchg(&gnttab_table[ref].flags, flags, 0) == flags) {
> +			printf("Release unused transfer grant.\n");
> +			put_free_entry(ref);
> +			return 0;
> +		}
> +	}
> +
> +	/* If a transfer is in progress then wait until it is completed. */
> +	while (!(flags & GTF_transfer_completed))
> +		flags = gnttab_table[ref].flags;
> +
> +	/* Read the frame number /after/ reading completion status. */
> +	rmb();
> +	frame = gnttab_table[ref].frame;
> +
> +	put_free_entry(ref);
> +
> +	return frame;
> +}
> +
> +grant_ref_t gnttab_alloc_and_grant(void **map)
> +{
> +	unsigned long mfn;
> +	grant_ref_t gref;
> +
> +	*map = (void *)memalign(PAGE_SIZE, PAGE_SIZE);
> +	mfn = virt_to_mfn(*map);
> +	gref = gnttab_grant_access(0, mfn, 0);
> +	return gref;
> +}
> +
> +static const char * const gnttabop_error_msgs[] = GNTTABOP_error_msgs;
> +
> +const char *gnttabop_error(int16_t status)
> +{
> +	status = -status;
> +	if (status < 0 || status >= ARRAY_SIZE(gnttabop_error_msgs))
> +		return "bad status";
> +	else
> +		return gnttabop_error_msgs[status];
> +}
> +
> +/* Get Xen's suggested physical page assignments for the grant table. */
> +void get_gnttab_base(phys_addr_t *gnttab_base, phys_size_t *gnttab_sz)
> +{
> +	const void *blob = gd->fdt_blob;
> +	struct fdt_resource res;
> +	int mem;
> +
> +	mem = fdt_node_offset_by_compatible(blob, -1, "xen,xen");
> +	if (mem < 0) {
> +		printf("No xen,xen compatible found\n");
> +		BUG();
> +	}
> +
> +	mem = fdt_get_resource(blob, mem, "reg", 0, &res);
> +	if (mem == -FDT_ERR_NOTFOUND) {
> +		printf("No grant table base in the device tree\n");
> +		BUG();
> +	}
> +
> +	*gnttab_base = (phys_addr_t)res.start;
> +	if (gnttab_sz)
> +		*gnttab_sz = (phys_size_t)(res.end - res.start + 1);
> +
> +	debug("FDT suggests grant table base at %llx\n",
> +	      *gnttab_base);
> +}
> +
> +void init_gnttab(void)
> +{
> +	struct xen_add_to_physmap xatp;
> +	struct gnttab_setup_table setup;
> +	xen_pfn_t frames[NR_GRANT_FRAMES];
> +	int i, rc;
> +
> +	debug("%s\n", __func__);
> +
> +	for (i = NR_RESERVED_ENTRIES; i < NR_GRANT_ENTRIES; i++)
> +		put_free_entry(i);
> +
> +	get_gnttab_base((phys_addr_t *)&gnttab_table, NULL);
> +
> +	for (i = 0; i < NR_GRANT_FRAMES; i++) {
> +		xatp.domid = DOMID_SELF;
> +		xatp.size = 0;
> +		xatp.space = XENMAPSPACE_grant_table;
> +		xatp.idx = i;
> +		xatp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
> +		rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
> +		if (rc)
> +			printf("XENMEM_add_to_physmap failed; status = %d\n",
> +			       rc);
> +		BUG_ON(rc != 0);
> +	}
> +
> +	setup.dom = DOMID_SELF;
> +	setup.nr_frames = NR_GRANT_FRAMES;
> +	set_xen_guest_handle(setup.frame_list, frames);
> +	rc = HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
> +	if (rc || setup.status) {
> +		printf("GNTTABOP_setup_table failed; status = %s\n",
> +		       gnttabop_error(setup.status));
> +		BUG();
> +	}

The GNTTABOP_setup_table hypercall is not needed on Arm.

> +}
> +
> +void fini_gnttab(void)
> +{
> +	struct xen_remove_from_physmap xrtp;
> +	struct gnttab_setup_table setup;
> +	int i, rc;
> +
> +	debug("%s\n", __func__);
> +
> +	for (i = 0; i < NR_GRANT_FRAMES; i++) {
> +		xrtp.domid = DOMID_SELF;
> +		xrtp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
> +		rc = HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp);
> +		if (rc)
> +			printf("XENMEM_remove_from_physmap failed; status = %d\n",
> +			       rc);
> +		BUG_ON(rc != 0);
> +	}
> +
> +	setup.dom = DOMID_SELF;
> +	setup.nr_frames = 0;
> +
> +	HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
> +	if (setup.status) {
> +		printf("GNTTABOP_setup_table failed; status = %s\n",
> +		       gnttabop_error(setup.status));
> +		BUG();
> +	}

The hypercall doesn't do any clean-up in Xen. So why are you calling 
this from fini_gnttab()?

Cheers,

-- 
Julien Grall

* [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os
  2020-07-01 16:29 ` [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os Anastasiia Lukianenko
@ 2020-07-01 17:46   ` Julien Grall
  2020-07-03 12:21     ` Anastasiia Lukianenko
  2020-07-16 13:16     ` Anastasiia Lukianenko
  0 siblings, 2 replies; 57+ messages in thread
From: Julien Grall @ 2020-07-01 17:46 UTC (permalink / raw)
  To: u-boot

Title: s/hypervizor/hypervisor/

On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Port hypervizor related code from mini-os. Update essential

Ditto.

But I would be quite cautious about importing code from mini-OS in order
to support Arm. The Arm port has always been broken and, from a look
below, it needs to be refined for Arm.

> arch code to support required bit operations, memory barriers etc.
> 
> Copyright for the bits ported belong to at least the following authors,
> please see related files for details:
> 
> Copyright (c) 2002-2003, K A Fraser
> Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
> Copyright (c) 2014, Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>   arch/arm/include/asm/xen/system.h |  96 +++++++++++
>   common/board_r.c                  |  11 ++
>   drivers/Makefile                  |   1 +
>   drivers/xen/Makefile              |   5 +
>   drivers/xen/hypervisor.c          | 277 ++++++++++++++++++++++++++++++
>   include/xen.h                     |  11 ++
>   include/xen/hvm.h                 |  30 ++++
>   7 files changed, 431 insertions(+)
>   create mode 100644 arch/arm/include/asm/xen/system.h
>   create mode 100644 drivers/xen/Makefile
>   create mode 100644 drivers/xen/hypervisor.c
>   create mode 100644 include/xen.h
>   create mode 100644 include/xen/hvm.h
> 
> diff --git a/arch/arm/include/asm/xen/system.h b/arch/arm/include/asm/xen/system.h
> new file mode 100644
> index 0000000000..81ab90160e
> --- /dev/null
> +++ b/arch/arm/include/asm/xen/system.h
> @@ -0,0 +1,96 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0
> + *
> + * (C) 2014 Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> + * (C) 2020, EPAM Systems Inc.
> + */
> +#ifndef _ASM_ARM_XEN_SYSTEM_H
> +#define _ASM_ARM_XEN_SYSTEM_H
> +
> +#include <compiler.h>
> +#include <asm/bitops.h>
> +
> +/* If *ptr == old, then store new there (and return new).
> + * Otherwise, return the old value.
> + * Atomic.
> + */
> +#define synch_cmpxchg(ptr, old, new) \
> +({ __typeof__(*ptr) stored = old; \
> +	__atomic_compare_exchange_n(ptr, &stored, new, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? new : old; \
> +})
> +
> +/* As test_and_clear_bit, but using __ATOMIC_SEQ_CST */
> +static inline int synch_test_and_clear_bit(int nr, volatile void *addr)
> +{
> +	u8 *byte = ((u8 *)addr) + (nr >> 3);
> +	u8 bit = 1 << (nr & 7);
> +	u8 orig;
> +
> +	orig = __atomic_fetch_and(byte, ~bit, __ATOMIC_SEQ_CST);
> +
> +	return (orig & bit) != 0;
> +}
> +
> +/* As test_and_set_bit, but using __ATOMIC_SEQ_CST */
> +static inline int synch_test_and_set_bit(int nr, volatile void *base)
> +{
> +	u8 *byte = ((u8 *)base) + (nr >> 3);
> +	u8 bit = 1 << (nr & 7);
> +	u8 orig;
> +
> +	orig = __atomic_fetch_or(byte, bit, __ATOMIC_SEQ_CST);
> +
> +	return (orig & bit) != 0;
> +}
> +
> +/* As set_bit, but using __ATOMIC_SEQ_CST */
> +static inline void synch_set_bit(int nr, volatile void *addr)
> +{
> +	synch_test_and_set_bit(nr, addr);
> +}
> +
> +/* As clear_bit, but using __ATOMIC_SEQ_CST */
> +static inline void synch_clear_bit(int nr, volatile void *addr)
> +{
> +	synch_test_and_clear_bit(nr, addr);
> +}
> +
> +/* As test_bit, but with a following memory barrier. */
> +//static inline int synch_test_bit(int nr, volatile void *addr)
> +static inline int synch_test_bit(int nr, const void *addr)
> +{
> +	int result;
> +
> +	result = test_bit(nr, addr);
> +	barrier();
> +	return result;
> +}

I can understand why we implement the synch_* helpers as, AFAICT, the
generic helpers are not SMP safe. However...

> +
> +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> +
> +#define mb()		dsb()
> +#define rmb()		dsb()
> +#define wmb()		dsb()
> +#define __iormb()	dmb()
> +#define __iowmb()	dmb()

Why do you need to re-implement the barriers?

> +#define xen_mb()	mb()
> +#define xen_rmb()	rmb()
> +#define xen_wmb()	wmb()
> +
> +#define smp_processor_id()	0
Shouldn't this be common?

> +
> +#define to_phys(x)		((unsigned long)(x))
> +#define to_virt(x)		((void *)(x))
> +
> +#define PFN_UP(x)		(unsigned long)(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
> +#define PFN_DOWN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> +#define PFN_PHYS(x)		((unsigned long)(x) << PAGE_SHIFT)
> +#define PHYS_PFN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> +
> +#define virt_to_pfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> +#define virt_to_mfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> +#define mfn_to_virt(_mfn)	(to_virt(PFN_PHYS(_mfn)))
> +#define pfn_to_virt(_pfn)	(to_virt(PFN_PHYS(_pfn)))

There are already generic phys <-> virt helpers (see 
include/asm-generic/io.h). So why do you need to create a new version?

> +
> +#endif
> diff --git a/common/board_r.c b/common/board_r.c
> index fa57fa9b69..fd36edb4e5 100644
> --- a/common/board_r.c
> +++ b/common/board_r.c
> @@ -56,6 +56,7 @@
>   #include <timer.h>
>   #include <trace.h>
>   #include <watchdog.h>
> +#include <xen.h>

Do we want to include it for other boards?

>   #ifdef CONFIG_ADDR_MAP
>   #include <asm/mmu.h>
>   #endif
> @@ -462,6 +463,13 @@ static int initr_mmc(void)
>   }
>   #endif
>   
> +#ifdef CONFIG_XEN
> +static int initr_xen(void)
> +{
> +	xen_init();
> +	return 0;
> +}
> +#endif
>   /*
>    * Tell if it's OK to load the environment early in boot.
>    *
> @@ -769,6 +777,9 @@ static init_fnc_t init_sequence_r[] = {
>   #endif
>   #ifdef CONFIG_MMC
>   	initr_mmc,
> +#endif
> +#ifdef CONFIG_XEN
> +	initr_xen,
>   #endif
>   	initr_env,
>   #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 94e8c5da17..0dd8891e76 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -28,6 +28,7 @@ obj-$(CONFIG_$(SPL_)REMOTEPROC) += remoteproc/
>   obj-$(CONFIG_$(SPL_TPL_)TPM) += tpm/
>   obj-$(CONFIG_$(SPL_TPL_)ACPI_PMC) += power/acpi_pmc/
>   obj-$(CONFIG_$(SPL_)BOARD) += board/
> +obj-$(CONFIG_XEN) += xen/
>   
>   ifndef CONFIG_TPL_BUILD
>   ifdef CONFIG_SPL_BUILD
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> new file mode 100644
> index 0000000000..1211bf2386
> --- /dev/null
> +++ b/drivers/xen/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier:	GPL-2.0+
> +#
> +# (C) Copyright 2020 EPAM Systems Inc.
> +
> +obj-y += hypervisor.o
> diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> new file mode 100644
> index 0000000000..5883285142
> --- /dev/null
> +++ b/drivers/xen/hypervisor.c
> @@ -0,0 +1,277 @@
> +/******************************************************************************
> + * hypervisor.c
> + *
> + * Communication to/from hypervisor.
> + *
> + * Copyright (c) 2002-2003, K A Fraser
> + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
> + * Copyright (c) 2020, EPAM Systems Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +#include <common.h>
> +#include <cpu_func.h>
> +#include <log.h>
> +#include <memalign.h>
> +
> +#include <asm/io.h>
> +#include <asm/armv8/mmu.h>
> +#include <asm/xen/system.h>
> +
> +#include <linux/bug.h>
> +
> +#include <xen/hvm.h>
> +#include <xen/interface/memory.h>
> +
> +#define active_evtchns(cpu, sh, idx)	\
> +	((sh)->evtchn_pending[idx] &	\
> +	 ~(sh)->evtchn_mask[idx])
> +
> +int in_callback;
> +
> +/*
> + * Shared page for communicating with the hypervisor.
> + * Events flags go here, for example.
> + */
> +struct shared_info *HYPERVISOR_shared_info;
> +
> +#ifndef CONFIG_PARAVIRT

Is there any plan to support this on x86?

> +static const char *param_name(int op)
> +{
> +#define PARAM(x)[HVM_PARAM_##x] = #x
> +	static const char *const names[] = {
> +		PARAM(CALLBACK_IRQ),
> +		PARAM(STORE_PFN),
> +		PARAM(STORE_EVTCHN),
> +		PARAM(PAE_ENABLED),
> +		PARAM(IOREQ_PFN),
> +		PARAM(TIMER_MODE),
> +		PARAM(HPET_ENABLED),
> +		PARAM(IDENT_PT),
> +		PARAM(ACPI_S_STATE),
> +		PARAM(VM86_TSS),
> +		PARAM(VPT_ALIGN),
> +		PARAM(CONSOLE_PFN),
> +		PARAM(CONSOLE_EVTCHN),

Most of those parameters are never going to be used on Arm. So could 
this list be trimmed?
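
One way to read the suggestion above is to keep only the entries an Arm
guest actually queries. The following is a minimal sketch, assuming the
guest only needs the xenstore and console parameters (that selection is
an assumption, not something the series states). The HVM_PARAM_* values
are local stand-ins for the definitions in xen/interface/hvm/params.h,
and ARRAY_SIZE is redefined only to keep the sketch self-contained.

```c
#include <stddef.h>

/*
 * Stand-in values for illustration only; the real definitions live in
 * xen/interface/hvm/params.h.
 */
#define HVM_PARAM_STORE_PFN      1
#define HVM_PARAM_STORE_EVTCHN   2
#define HVM_PARAM_CONSOLE_PFN    17
#define HVM_PARAM_CONSOLE_EVTCHN 18

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Map an HVM parameter index to a printable name. */
static const char *param_name(int op)
{
#define PARAM(x) [HVM_PARAM_##x] = #x
	/* Designated initializers leave every unlisted slot NULL. */
	static const char *const names[] = {
		PARAM(STORE_PFN),
		PARAM(STORE_EVTCHN),
		PARAM(CONSOLE_PFN),
		PARAM(CONSOLE_EVTCHN),
	};
#undef PARAM

	/* Reject negative indices as well as ones past the table. */
	if (op < 0 || (size_t)op >= ARRAY_SIZE(names))
		return "unknown";

	if (!names[op])
		return "reserved";

	return names[op];
}
```

Indices that fall inside the table but were not listed come back as
"reserved", so the error messages stay meaningful after the trimming.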

> +	};
> +#undef PARAM
> +
> +	if (op >= ARRAY_SIZE(names))
> +		return "unknown";
> +
> +	if (!names[op])
> +		return "reserved";
> +
> +	return names[op];
> +}
> +
> +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value)

I would recommend adding some comments explaining when this function is 
meant to be used and what it does with regard to the cache.

> +{
> +	struct xen_hvm_param xhv;
> +	int ret;

I don't think there is a guarantee that your cache is going to be clean 
when writing xhv. So you likely want to add an invalidate_dcache_range() 
before writing it.

> +
> +	xhv.domid = DOMID_SELF;
> +	xhv.index = idx;
> +	invalidate_dcache_range((unsigned long)&xhv,
> +				(unsigned long)&xhv + sizeof(xhv));
> +
> +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> +	if (ret < 0) {
> +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> +			   param_name(idx), idx, ret);
> +		BUG();
> +	}
> +	invalidate_dcache_range((unsigned long)&xhv,
> +				(unsigned long)&xhv + sizeof(xhv));
> +
> +	*value = xhv.value;
> +	return ret;
> +}
> +
> +int hvm_get_parameter(int idx, uint64_t *value)
> +{
> +	struct xen_hvm_param xhv;
> +	int ret;
> +
> +	xhv.domid = DOMID_SELF;
> +	xhv.index = idx;
> +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> +	if (ret < 0) {
> +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> +			   param_name(idx), idx, ret);
> +		BUG();
> +	}
> +
> +	*value = xhv.value;
> +	return ret;
> +}
> +
> +int hvm_set_parameter(int idx, uint64_t value)
> +{
> +	struct xen_hvm_param xhv;
> +	int ret;
> +
> +	xhv.domid = DOMID_SELF;
> +	xhv.index = idx;
> +	xhv.value = value;
> +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &xhv);
> +
> +	if (ret < 0) {
> +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> +			   param_name(idx), idx, ret);
> +		BUG();
> +	}
> +
> +	return ret;
> +}
> +
> +struct shared_info *map_shared_info(void *p)
> +{
> +	struct xen_add_to_physmap xatp;
> +
> +	HYPERVISOR_shared_info = (struct shared_info *)memalign(PAGE_SIZE,
> +								PAGE_SIZE);
> +	if (HYPERVISOR_shared_info == NULL)
> +		BUG();
> +
> +	xatp.domid = DOMID_SELF;
> +	xatp.idx = 0;
> +	xatp.space = XENMAPSPACE_shared_info;
> +	xatp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp) != 0)
> +		BUG();
> +
> +	return HYPERVISOR_shared_info;
> +}
> +
> +void unmap_shared_info(void)
> +{
> +	struct xen_remove_from_physmap xrtp;
> +
> +	xrtp.domid = DOMID_SELF;
> +	xrtp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> +	if (HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp) != 0)
> +		BUG();
> +}
> +#endif
> +
> +void do_hypervisor_callback(struct pt_regs *regs)
> +{
> +	unsigned long l1, l2, l1i, l2i;
> +	unsigned int port;
> +	int cpu = 0;
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
> +
> +	in_callback = 1;
> +
> +	vcpu_info->evtchn_upcall_pending = 0;
> +	/* NB x86. No need for a barrier here -- XCHG is a barrier on x86. */
> +#if !defined(__i386__) && !defined(__x86_64__)
> +	/* Clear master flag /before/ clearing selector flag. */
> +	wmb();
> +#endif
> +	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
> +
> +	while (l1 != 0) {
> +		l1i = __ffs(l1);
> +		l1 &= ~(1UL << l1i);
> +
> +		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
> +			l2i = __ffs(l2);
> +			l2 &= ~(1UL << l2i);
> +
> +			port = (l1i * (sizeof(unsigned long) * 8)) + l2i;
> +			/* TODO: handle new event: do_event(port, regs); */
> +			/* Suppress -Wunused-but-set-variable */
> +			(void)(port);
> +		}
> +	}

You likely want a memory barrier here as otherwise in_callback could be 
written/seen before the loop ends.
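
A minimal sketch of the ordering in question, using C11 atomics as a
stand-in for whatever barrier macro U-Boot would use here (that
substitution is an assumption). scan_pending() and handled_ports are
hypothetical names for illustration; only in_callback mirrors the patch.

```c
#include <stdatomic.h>

static atomic_int in_callback;       /* mirrors the flag in the patch */
static unsigned long handled_ports;  /* hypothetical record of handled bits */

/* Walk a pending bitmap, then clear the flag with release ordering. */
static void scan_pending(unsigned long pending)
{
	atomic_store_explicit(&in_callback, 1, memory_order_relaxed);

	while (pending != 0) {
		unsigned long bit = pending & -pending; /* lowest set bit */

		pending &= ~bit;
		handled_ports |= bit;
	}

	/*
	 * Release store: every access made in the loop above is ordered
	 * before the flag is seen as cleared by another observer.
	 */
	atomic_store_explicit(&in_callback, 0, memory_order_release);
}
```

An explicit barrier followed by a plain store would give the same
ordering; the release store just keeps the two together.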

> +
> +	in_callback = 0;
> +}
> +
> +void force_evtchn_callback(void)
> +{
> +#ifdef XEN_HAVE_PV_UPCALL_MASK
> +	int save;
> +#endif
> +	struct vcpu_info *vcpu;
> +
> +	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];

On Arm, this is only valid for vCPU0. For all the other vCPUs, you will 
want to register a vCPU shared info.

> +#ifdef XEN_HAVE_PV_UPCALL_MASK
> +	save = vcpu->evtchn_upcall_mask;
> +#endif
> +
> +	while (vcpu->evtchn_upcall_pending) {
> +#ifdef XEN_HAVE_PV_UPCALL_MASK
> +		vcpu->evtchn_upcall_mask = 1;
> +#endif
> +		barrier();

What are you trying to prevent with this barrier? In particular why 
would the compiler be an issue but not the processor?
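
To make the distinction concrete, here is a small sketch using C11
fences. The macro names, struct vcpu_flags and mask_and_check() are
hypothetical; U-Boot's own barrier()/mb() macros are not reproduced.

```c
#include <stdatomic.h>

/*
 * Compiler-only barrier: the compiler may not move memory accesses
 * across it, but no barrier instruction is emitted.
 */
#define compiler_barrier()	atomic_signal_fence(memory_order_seq_cst)

/*
 * Full barrier: additionally orders the accesses as observed by other
 * CPUs or by the hypervisor.
 */
#define cpu_barrier()		atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical cut-down view of the per-vCPU event flags. */
struct vcpu_flags {
	volatile unsigned char upcall_pending;
	volatile unsigned char upcall_mask;
};

/* Mask upcalls, then sample the pending flag. */
static int mask_and_check(struct vcpu_flags *v)
{
	v->upcall_mask = 1;
	/*
	 * With compiler_barrier() the store above could still become
	 * visible after the load below on a weakly ordered CPU.
	 */
	cpu_barrier();
	return v->upcall_pending;
}
```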

> +		do_hypervisor_callback(NULL);
> +		barrier();
> +#ifdef XEN_HAVE_PV_UPCALL_MASK
> +		vcpu->evtchn_upcall_mask = save;
> +		barrier();

Same here.

> +#endif
> +	};
> +}
> +
> +void mask_evtchn(uint32_t port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	synch_set_bit(port, &s->evtchn_mask[0]);
> +}
> +
> +void unmask_evtchn(uint32_t port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
> +
> +	synch_clear_bit(port, &s->evtchn_mask[0]);
> +
> +	/*
> +	 * The following is basically the equivalent of 'hw_resend_irq'. Just like
> +	 * a real IO-APIC we 'lose the interrupt edge' if the channel is masked.
> +	 */
This comment seems to be out of context now; you might want to update it.

> +	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
> +	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
> +				    &vcpu_info->evtchn_pending_sel)) {
> +		vcpu_info->evtchn_upcall_pending = 1;
> +#ifdef XEN_HAVE_PV_UPCALL_MASK
> +		if (!vcpu_info->evtchn_upcall_mask)
> +#endif
> +			force_evtchn_callback();
> +	}
> +}
> +
> +void clear_evtchn(uint32_t port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +
> +	synch_clear_bit(port, &s->evtchn_pending[0]);
> +}
> +
> +void xen_init(void)
> +{
> +	debug("%s\n", __func__);

Is this a left-over?

> +
> +	map_shared_info(NULL);
> +}
> +
> diff --git a/include/xen.h b/include/xen.h
> new file mode 100644
> index 0000000000..1d6f74cc92
> --- /dev/null
> +++ b/include/xen.h
> @@ -0,0 +1,11 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0
> + *
> + * (C) 2020, EPAM Systems Inc.
> + */
> +#ifndef __XEN_H__
> +#define __XEN_H__
> +
> +void xen_init(void);
> +
> +#endif /* __XEN_H__ */
> diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> new file mode 100644
> index 0000000000..89de9625ca
> --- /dev/null
> +++ b/include/xen/hvm.h
> @@ -0,0 +1,30 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0
> + *
> + * Simple wrappers around HVM functions
> + *
> + * Copyright (c) 2002-2003, K A Fraser
> + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
> + * Copyright (c) 2020, EPAM Systems Inc.
> + */
> +#ifndef XEN_HVM_H__
> +#define XEN_HVM_H__
> +
> +#include <asm/xen/hypercall.h>
> +#include <xen/interface/hvm/params.h>
> +#include <xen/interface/xen.h>
> +
> +extern struct shared_info *HYPERVISOR_shared_info;
> +
> +int hvm_get_parameter(int idx, uint64_t *value);
> +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value);
> +int hvm_set_parameter(int idx, uint64_t value);
> +
> +struct shared_info *map_shared_info(void *p);
> +void unmap_shared_info(void);
> +void do_hypervisor_callback(struct pt_regs *regs);
> +void mask_evtchn(uint32_t port);
> +void unmask_evtchn(uint32_t port);
> +void clear_evtchn(uint32_t port);
> +
> +#endif /* XEN_HVM_H__ */

Cheers,


-- 
Julien Grall

* [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies
  2020-07-01 16:29 ` [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies Anastasiia Lukianenko
@ 2020-07-02  1:14   ` Peng Fan
  2020-07-03  9:57     ` Nastya Vicodin
  0 siblings, 1 reply; 57+ messages in thread
From: Peng Fan @ 2020-07-02  1:14 UTC (permalink / raw)
  To: u-boot

> Subject: [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Currently SMCC selects ARM_PSCI_FW if enabled which is not correct as
> there are cases that PSCI can function without firmware at all.
> ARM_PSCI_FW itself is built with driver model approach, so it cannot be
> enabled if DM is off.
> Fix this by making PSCI reset functionality depend on ARM_PSCI_FW and only
> in case if DM is enabled.

I think this might break others, see drivers/firmware/psci.c

Regards,
Peng.

> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> Suggested-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> ---
>  arch/arm/Kconfig           | 1 -
>  arch/arm/cpu/armv8/Kconfig | 2 ++
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 54d65f8488..e9ad716aaa 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -387,7 +387,6 @@ config SYS_ARCH_TIMER
>  config ARM_SMCCC
>  	bool "Support for ARM SMC Calling Convention (SMCCC)"
>  	depends on CPU_V7A || ARM64
> -	select ARM_PSCI_FW
>  	help
>  	  Say Y here if you want to enable ARM SMC Calling Convention.
>  	  This should be enabled if U-Boot needs to communicate with system
> diff --git a/arch/arm/cpu/armv8/Kconfig b/arch/arm/cpu/armv8/Kconfig
> index 3655990772..c8727f4175 100644
> --- a/arch/arm/cpu/armv8/Kconfig
> +++ b/arch/arm/cpu/armv8/Kconfig
> @@ -103,6 +103,8 @@ config PSCI_RESET
>  	bool "Use PSCI for reset and shutdown"
>  	default y
>  	select ARM_SMCCC if OF_CONTROL
> +	select ARM_PSCI_FW if DM
> +
>  	depends on !ARCH_EXYNOS7 && !ARCH_BCM283X && \
>  		   !TARGET_LS2080A_SIMU && !TARGET_LS2080AQDS && \
>  		   !TARGET_LS2080ARDB && !TARGET_LS2080A_EMU && \
> --
> 2.17.1

* [PATCH 03/17] board: Introduce xenguest_arm64 board
  2020-07-01 16:29 ` [PATCH 03/17] board: Introduce xenguest_arm64 board Anastasiia Lukianenko
@ 2020-07-02  1:28   ` Peng Fan
  2020-07-02  7:18     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 57+ messages in thread
From: Peng Fan @ 2020-07-02  1:28 UTC (permalink / raw)
  To: u-boot

> Subject: [PATCH 03/17] board: Introduce xenguest_arm64 board
> 
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> Introduce a minimal Xen guest board running as a virtual machine under Xen
> Project's hypervisor [1], [2].
> 
> Part of the code is ported from Xen mini-os and also uses work initially done
> by different authors from NXP: please see relevant files for their copyrights.

This patch needs to be the last one in the series, otherwise it might
break git bisect.

> 
> [1] https://xenbits.xen.org/
> [2] https://wiki.xenproject.org/
> 
> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  arch/arm/Kconfig                          |   7 +
>  arch/arm/cpu/armv8/Makefile               |   1 +
>  arch/arm/cpu/armv8/xen/Makefile           |   6 +
>  arch/arm/cpu/armv8/xen/hypercall.S        |  78 +++++++++++
>  arch/arm/cpu/armv8/xen/lowlevel_init.S    |  34 +++++
>  arch/arm/include/asm/xen.h                |   8 ++
>  arch/arm/include/asm/xen/hypercall.h      |  45 +++++++
>  board/xen/xenguest_arm64/Kconfig          |  12 ++
>  board/xen/xenguest_arm64/Makefile         |   5 +
>  board/xen/xenguest_arm64/xenguest_arm64.c | 153 ++++++++++++++++++++++
>  configs/xenguest_arm64_defconfig          |  56 ++++++++
>  include/configs/xenguest_arm64.h          |  45 +++++++
>  12 files changed, 450 insertions(+)
>  create mode 100644 arch/arm/cpu/armv8/xen/Makefile
>  create mode 100644 arch/arm/cpu/armv8/xen/hypercall.S
>  create mode 100644 arch/arm/cpu/armv8/xen/lowlevel_init.S
>  create mode 100644 arch/arm/include/asm/xen.h
>  create mode 100644 arch/arm/include/asm/xen/hypercall.h
>  create mode 100644 board/xen/xenguest_arm64/Kconfig
>  create mode 100644 board/xen/xenguest_arm64/Makefile
>  create mode 100644 board/xen/xenguest_arm64/xenguest_arm64.c
>  create mode 100644 configs/xenguest_arm64_defconfig
>  create mode 100644 include/configs/xenguest_arm64.h
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index e9ad716aaa..c469863967 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1717,6 +1717,12 @@ config TARGET_PRESIDIO_ASIC
>  	bool "Support Cortina Presidio ASIC Platform"
>  	select ARM64
> 
> +config TARGET_XENGUEST_ARM64
> +	bool "Xen guest ARM64"
> +	select ARM64
> +	select XEN
> +	select OF_CONTROL
> +	select LINUX_KERNEL_IMAGE_HEADER
>  endchoice
> 
>  config ARCH_SUPPORT_TFABOOT
> @@ -1920,6 +1926,7 @@ source "board/xilinx/Kconfig"
>  source "board/xilinx/zynq/Kconfig"
>  source "board/xilinx/zynqmp/Kconfig"
>  source "board/phytium/durian/Kconfig"
> +source "board/xen/xenguest_arm64/Kconfig"
> 
>  source "arch/arm/Kconfig.debug"
> 
> diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
> index 2e48df0eb9..dd6c354d19 100644
> --- a/arch/arm/cpu/armv8/Makefile
> +++ b/arch/arm/cpu/armv8/Makefile
> @@ -39,3 +39,4 @@ obj-$(CONFIG_S32V234) += s32v234/
>  obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
>  obj-$(CONFIG_ARMV8_PSCI) += psci.o
>  obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
> +obj-$(CONFIG_XEN) += xen/
> diff --git a/arch/arm/cpu/armv8/xen/Makefile b/arch/arm/cpu/armv8/xen/Makefile
> new file mode 100644
> index 0000000000..e3b4ae2bd4
> --- /dev/null
> +++ b/arch/arm/cpu/armv8/xen/Makefile
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0+
> +#
> +# (C) 2018 NXP
> +# (C) 2020 EPAM Systems Inc.
> +
> +obj-y += lowlevel_init.o hypercall.o
> diff --git a/arch/arm/cpu/armv8/xen/hypercall.S b/arch/arm/cpu/armv8/xen/hypercall.S
> new file mode 100644
> index 0000000000..9596e336b5
> --- /dev/null
> +++ b/arch/arm/cpu/armv8/xen/hypercall.S
> @@ -0,0 +1,78 @@
> +/******************************************************************************
> + * hypercall.S
> + *
> + * Xen hypercall wrappers
> + *
> + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version 2
> + * as published by the Free Software Foundation; or, when distributed
> + * separately from the Linux kernel or incorporated into other
> + * software packages, subject to the following license:
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this source file (the "Software"), to deal in the Software without
> + * restriction, including without limitation the rights to use, copy, modify,
> + * merge, publish, distribute, sublicense, and/or sell copies of the Software,
> + * and to permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +/*
> + * The Xen hypercall calling convention is very similar to the procedure
> + * call standard for the ARM 64-bit architecture: the first parameter is
> + * passed in x0, the second in x1, the third in x2, the fourth in x3 and
> + * the fifth in x4.
> + *
> + * The hypercall number is passed in x16.
> + *
> + * The return value is in x0.
> + *
> + * The hvc ISS is required to be 0xEA1, that is the Xen specific ARM
> + * hypercall tag.
> + *
> + * Parameter structs passed to hypercalls are laid out according to
> + * the ARM 64-bit EABI standard.
> + */
> +
> +#include <xen/interface/xen.h>
> +
> +#define XEN_HYPERCALL_TAG	0xEA1
> +
> +#define HYPERCALL_SIMPLE(hypercall)		\
> +.globl HYPERVISOR_##hypercall;                  \
> +.align 4,0x90;                                  \
> +HYPERVISOR_##hypercall:				\
> +	mov x16, #__HYPERVISOR_##hypercall;	\
> +	hvc XEN_HYPERCALL_TAG;			\
> +	ret;					\
> +
> +#define HYPERCALL0 HYPERCALL_SIMPLE
> +#define HYPERCALL1 HYPERCALL_SIMPLE
> +#define HYPERCALL2 HYPERCALL_SIMPLE
> +#define HYPERCALL3 HYPERCALL_SIMPLE
> +#define HYPERCALL4 HYPERCALL_SIMPLE
> +#define HYPERCALL5 HYPERCALL_SIMPLE
> +
> +                .text
> +
> +HYPERCALL2(xen_version);
> +HYPERCALL3(console_io);
> +HYPERCALL3(grant_table_op);
> +HYPERCALL2(sched_op);
> +HYPERCALL2(event_channel_op);
> +HYPERCALL2(hvm_op);
> +HYPERCALL2(memory_op);
> +
> diff --git a/arch/arm/cpu/armv8/xen/lowlevel_init.S b/arch/arm/cpu/armv8/xen/lowlevel_init.S
> new file mode 100644
> index 0000000000..25ed438e20
> --- /dev/null
> +++ b/arch/arm/cpu/armv8/xen/lowlevel_init.S
> @@ -0,0 +1,34 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0+
> + *
> + * (C) 2017 NXP
> + * (C) 2020 EPAM Systems Inc.
> + */
> +
> +#include <config.h>
> +
> +.align 8
> +.global rom_pointer
> +rom_pointer:
> +	.space 32
> +
> +/*
> + * Routine: save_boot_params (called after reset from start.S)
> + */
> +
> +.global save_boot_params
> +save_boot_params:
> +	/* The firmware provided ATAG/FDT address can be found in r2/x0 */
> +	adr	x1, rom_pointer
> +	stp	x0, x2, [x1], #16
> +	stp	x3, x4, [x1], #16
> +
> +	/* Returns */
> +	b	save_boot_params_ret
> +
> +.global restore_boot_params
> +restore_boot_params:
> +	adr	x1, rom_pointer
> +	ldp	x0, x2, [x1], #16
> +	ldp	x3, x4, [x1], #16
> +	ret
> diff --git a/arch/arm/include/asm/xen.h b/arch/arm/include/asm/xen.h
> new file mode 100644
> index 0000000000..fb7f03e19c
> --- /dev/null
> +++ b/arch/arm/include/asm/xen.h
> @@ -0,0 +1,8 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0+
> + *
> + * (C) 2020 EPAM Systems Inc.
> + */
> +
> +extern unsigned long rom_pointer[];
> +
> diff --git a/arch/arm/include/asm/xen/hypercall.h b/arch/arm/include/asm/xen/hypercall.h
> new file mode 100644
> index 0000000000..26644ce886
> --- /dev/null
> +++ b/arch/arm/include/asm/xen/hypercall.h
> @@ -0,0 +1,45 @@
> +/******************************************************************************
> + * hypercall.h
> + *
> + * Linux-specific hypervisor handling.
> + *
> + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version 2
> + * as published by the Free Software Foundation; or, when distributed
> + * separately from the Linux kernel or incorporated into other
> + * software packages, subject to the following license:
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this source file (the "Software"), to deal in the Software without
> + * restriction, including without limitation the rights to use, copy, modify,
> + * merge, publish, distribute, sublicense, and/or sell copies of the Software,
> + * and to permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#ifndef _ASM_ARM_XEN_HYPERCALL_H
> +#define _ASM_ARM_XEN_HYPERCALL_H
> +
> +#include <xen/interface/xen.h>
> +
> +int HYPERVISOR_xen_version(int cmd, void *arg);
> +int HYPERVISOR_console_io(int cmd, int count, char *str);
> +int HYPERVISOR_grant_table_op(unsigned int cmd, void *uop,
> +			      unsigned int count);
> +int HYPERVISOR_sched_op(int cmd, void *arg);
> +int HYPERVISOR_event_channel_op(int cmd, void *arg);
> +unsigned long HYPERVISOR_hvm_op(int op, void *arg);
> +int HYPERVISOR_memory_op(unsigned int cmd, void *arg);
> +
> +#endif /* _ASM_ARM_XEN_HYPERCALL_H */
> diff --git a/board/xen/xenguest_arm64/Kconfig
> b/board/xen/xenguest_arm64/Kconfig
> new file mode 100644
> index 0000000000..cc131ed5b9
> --- /dev/null
> +++ b/board/xen/xenguest_arm64/Kconfig
> @@ -0,0 +1,12 @@
> +if TARGET_XENGUEST_ARM64
> +
> +config SYS_BOARD
> +	default "xenguest_arm64"
> +
> +config SYS_VENDOR
> +	default "xen"
> +
> +config SYS_CONFIG_NAME
> +	default "xenguest_arm64"
> +
> +endif
> diff --git a/board/xen/xenguest_arm64/Makefile
> b/board/xen/xenguest_arm64/Makefile
> new file mode 100644
> index 0000000000..1cf87a728f
> --- /dev/null
> +++ b/board/xen/xenguest_arm64/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier:	GPL-2.0+
> +#
> +# (C) Copyright 2020 EPAM Systems Inc.
> +
> +obj-y	:= xenguest_arm64.o
> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c
> b/board/xen/xenguest_arm64/xenguest_arm64.c
> new file mode 100644
> index 0000000000..9e099f388f
> --- /dev/null
> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> @@ -0,0 +1,153 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0+
> + *
> + * (C) 2013
> + * David Feng <fenghua@phytium.com.cn>
> + * Sharma Bhupesh <bhupesh.sharma@freescale.com>
> + *
> + * (C) 2020 EPAM Systems Inc
> + */
> +
> +#include <common.h>
> +#include <cpu_func.h>
> +#include <dm.h>
> +#include <errno.h>
> +#include <malloc.h>
> +
> +#include <asm/io.h>
> +#include <asm/armv8/mmu.h>
> +#include <asm/xen.h>
> +#include <asm/xen/hypercall.h>
> +
> +#include <linux/compiler.h>
> +
> +DECLARE_GLOBAL_DATA_PTR;
> +
> +int board_init(void)
> +{
> +	return 0;
> +}
> +
> +/*
> + * Use fdt provided by Xen: according to
> + * https://www.kernel.org/doc/Documentation/arm64/booting.txt
> + * x0 is the physical address of the device tree blob (dtb) in system RAM.
> + * This is stored in rom_pointer during low level init.
> + */
> +void *board_fdt_blob_setup(void)
> +{
> +	if (fdt_magic(rom_pointer[0]) != FDT_MAGIC)
> +		return NULL;
> +	return (void *)rom_pointer[0];
> +}
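
The check above relies on every flattened device tree blob starting with a big-endian magic word. A minimal standalone sketch of that test follows, assuming a plain byte buffer instead of U-Boot's rom_pointer; blob_has_fdt_magic() is a hypothetical helper and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic number stored at offset 0 of a flattened device tree blob. */
#define FDT_MAGIC 0xd00dfeedu

/*
 * Hypothetical helper (not in the patch): returns 1 when the buffer
 * starts with the FDT magic, 0 otherwise.
 */
static int blob_has_fdt_magic(const void *blob, size_t len)
{
	const uint8_t *p = blob;
	uint32_t magic;

	assert(blob != NULL);

	if (len < 4)
		return 0;

	/* The header is big-endian regardless of the CPU's endianness. */
	magic = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		((uint32_t)p[2] << 8) | (uint32_t)p[3];

	return magic == FDT_MAGIC;
}
```

Assembling the word byte by byte avoids any dependency on host byte order, which is the same job fdt_magic() does through its endianness conversion.
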
> +
> +#define MAX_MEM_MAP_REGIONS 5
> +static struct mm_region xen_mem_map[MAX_MEM_MAP_REGIONS];
> +struct mm_region *mem_map = xen_mem_map;
> +
> +static int get_next_memory_node(const void *blob, int mem)
> +{
> +	do {
> +		mem = fdt_node_offset_by_prop_value(blob, mem,
> +						    "device_type", "memory", 7);
> +	} while (!fdtdec_get_is_enabled(blob, mem));
> +
> +	return mem;
> +}
> +
> +static int setup_mem_map(void)
> +{
> +	int i, ret, mem, reg = 0;
> +	struct fdt_resource res;
> +	const void *blob = gd->fdt_blob;
> +
> +	mem = get_next_memory_node(blob, -1);
> +	if (mem < 0) {
> +		printf("%s: Missing /memory node\n", __func__);
> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
> +		ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
> +		if (ret == -FDT_ERR_NOTFOUND) {
> +			reg = 0;
> +			mem = get_next_memory_node(blob, mem);
> +			if (mem == -FDT_ERR_NOTFOUND)
> +				break;
> +
> +			ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
> +			if (ret == -FDT_ERR_NOTFOUND)
> +				break;
> +		}
> +		if (ret != 0) {
> +			printf("No reg property for memory node\n");
> +			return -EINVAL;
> +		}
> +
> +		xen_mem_map[i].virt = (phys_addr_t)res.start;
> +		xen_mem_map[i].phys = (phys_addr_t)res.start;
> +		xen_mem_map[i].size = (phys_size_t)(res.end - res.start + 1);
> +		xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> +					PTE_BLOCK_INNER_SHARE);
> +	}
> +	return 0;
> +}
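
The size computation in setup_mem_map() treats the resource range as inclusive, hence the "+ 1". A small sketch of that arithmetic, assuming a stand-in structure (resource_sketch) rather than U-Boot's struct fdt_resource:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct fdt_resource; [start, end] is assumed inclusive. */
struct resource_sketch {
	uint64_t start;
	uint64_t end;
};

/* Size of an inclusive range, as stored in xen_mem_map[i].size. */
static uint64_t resource_size_sketch(const struct resource_sketch *res)
{
	assert(res->end >= res->start);
	return res->end - res->start + 1;
}
```

For example, a bank spanning 0x40000000 to 0x7fffffff comes out as 0x40000000 bytes (1 GiB).
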
> +
> +void enable_caches(void)
> +{
> +	/* Re-setup the memory map as BSS gets cleared after relocation. */
> +	setup_mem_map();
> +	icache_enable();
> +	dcache_enable();
> +}
> +
> +/* Read memory settings from the Xen provided device tree. */
> +int dram_init(void)
> +{
> +	int ret;
> +
> +	ret = fdtdec_setup_mem_size_base();
> +	if (ret < 0)
> +		return ret;
> +	/* Setup memory map, so MMU page table size can be estimated. */
> +	return setup_mem_map();
> +}
> +
> +int dram_init_banksize(void)
> +{
> +	return fdtdec_setup_memory_banksize();
> +}
> +
> +/*
> + * Board specific reset that is system reset.
> + */
> +void reset_cpu(ulong addr)
> +{
> +}
> +
> +int ft_system_setup(void *blob, bd_t *bd)
> +{
> +	return 0;
> +}
> +
> +int ft_board_setup(void *blob, bd_t *bd)
> +{
> +	return 0;
> +}
> +
> +int board_early_init_f(void)
> +{
> +	return 0;
> +}

Drop the three functions above if they are not needed.

> +
> +int print_cpuinfo(void)
> +{
> +	printf("Xen virtual CPU\n");
> +	return 0;
> +}
> +
> +__weak struct serial_device *default_serial_console(void)
> +{
> +	return NULL;
> +}
> +
> diff --git a/configs/xenguest_arm64_defconfig
> b/configs/xenguest_arm64_defconfig
> new file mode 100644
> index 0000000000..2a8caf8647
> --- /dev/null
> +++ b/configs/xenguest_arm64_defconfig
> @@ -0,0 +1,56 @@
> +CONFIG_ARM=y
> +CONFIG_POSITION_INDEPENDENT=y
> +CONFIG_SYS_TEXT_BASE=0x40080000
> +CONFIG_SYS_MALLOC_F_LEN=0x2000
> +CONFIG_IDENT_STRING=" xenguest"
> +CONFIG_TARGET_XENGUEST_ARM64=y
> +CONFIG_BOOTDELAY=10

10s?

Regards,
Peng.
> +
> +CONFIG_SYS_PROMPT="xenguest# "
> +
> +CONFIG_CMD_NET=n
> +CONFIG_CMD_BDI=n
> +CONFIG_CMD_BOOTD=n
> +CONFIG_CMD_BOOTEFI=n
> +CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
> +CONFIG_CMD_ELF=n
> +CONFIG_CMD_GO=n
> +CONFIG_CMD_RUN=n
> +CONFIG_CMD_IMI=n
> +CONFIG_CMD_IMLS=n
> +CONFIG_CMD_XIMG=n
> +CONFIG_CMD_EXPORTENV=n
> +CONFIG_CMD_IMPORTENV=n
> +CONFIG_CMD_EDITENV=n
> +CONFIG_CMD_ENV_EXISTS=n
> +CONFIG_CMD_MEMORY=y
> +CONFIG_CMD_CRC32=n
> +CONFIG_CMD_DM=n
> +CONFIG_CMD_LOADB=n
> +CONFIG_CMD_LOADS=n
> +CONFIG_CMD_FLASH=n
> +CONFIG_CMD_GPT=n
> +CONFIG_CMD_FPGA=n
> +CONFIG_CMD_ECHO=n
> +CONFIG_CMD_ITEST=n
> +CONFIG_CMD_SOURCE=n
> +CONFIG_CMD_SETEXPR=n
> +CONFIG_CMD_MISC=n
> +CONFIG_CMD_UNZIP=n
> +CONFIG_CMD_LZMADEC=n
> +CONFIG_CMD_SAVEENV=n
> +CONFIG_CMD_UMS=n
> +
> +#CONFIG_USB=n
> +# CONFIG_ISO_PARTITION is not set
> +
> +#CONFIG_EFI_PARTITION=y
> +# CONFIG_EFI_LOADER is not set
> +
> +# CONFIG_DM is not set
> +# CONFIG_MMC is not set
> +# CONFIG_DM_SERIAL is not set
> +# CONFIG_REQUIRE_SERIAL_CONSOLE is not set
> +
> +CONFIG_OF_BOARD=y
> +CONFIG_OF_LIBFDT=y
> diff --git a/include/configs/xenguest_arm64.h
> b/include/configs/xenguest_arm64.h
> new file mode 100644
> index 0000000000..467dabf1e5
> --- /dev/null
> +++ b/include/configs/xenguest_arm64.h
> @@ -0,0 +1,45 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0+
> + *
> + * (C) Copyright 2020 EPAM Systems Inc.
> + */
> +#ifndef __XENGUEST_ARM64_H
> +#define __XENGUEST_ARM64_H
> +
> +#ifndef __ASSEMBLY__
> +#include <linux/types.h>
> +#endif
> +
> +#define CONFIG_BOARD_EARLY_INIT_F
> +
> +#define CONFIG_EXTRA_ENV_SETTINGS
> +
> +#undef CONFIG_NR_DRAM_BANKS
> +#undef CONFIG_SYS_SDRAM_BASE
> +
> +#define CONFIG_NR_DRAM_BANKS          1
> +
> +/*
> + * This can be any arbitrary address as we are using PIE, but
> + * please note, that CONFIG_SYS_TEXT_BASE must match the below.
> + */
> +#define CONFIG_SYS_LOAD_ADDR                    0x40000000
> +#define CONFIG_LNX_KRNL_IMG_TEXT_OFFSET_BASE    CONFIG_SYS_LOAD_ADDR
> +
> +/* Size of malloc() pool */
> +#define CONFIG_SYS_MALLOC_LEN         (32 * 1024 * 1024)
> +
> +/* Monitor Command Prompt */
> +#define CONFIG_SYS_PROMPT_HUSH_PS2    "> "
> +#define CONFIG_SYS_CBSIZE             1024
> +#define CONFIG_SYS_MAXARGS            64
> +#define CONFIG_SYS_BARGSIZE           CONFIG_SYS_CBSIZE
> +#define CONFIG_SYS_PBSIZE             (CONFIG_SYS_CBSIZE + \
> +				      sizeof(CONFIG_SYS_PROMPT) + 16)
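
As a worked example of the print buffer size above: with the prompt from the defconfig, sizeof() counts the terminating NUL, so the result is 1024 + 11 + 16. The sketch below only copies the quoted values; print_buffer_size() is a hypothetical wrapper added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the quoted defconfig and board header. */
#define CONFIG_SYS_PROMPT	"xenguest# "
#define CONFIG_SYS_CBSIZE	1024
#define CONFIG_SYS_PBSIZE	(CONFIG_SYS_CBSIZE + \
				 sizeof(CONFIG_SYS_PROMPT) + 16)

/* sizeof() on a string literal includes the NUL: 10 characters + 1. */
static_assert(sizeof(CONFIG_SYS_PROMPT) == 11, "prompt plus NUL");
static_assert(CONFIG_SYS_PBSIZE == 1051, "1024 + 11 + 16");

/* Hypothetical wrapper so the value can also be inspected at run time. */
static size_t print_buffer_size(void)
{
	return CONFIG_SYS_PBSIZE;
}
```
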
> +
> +#define CONFIG_OF_SYSTEM_SETUP
> +
> +#define CONFIG_CMDLINE_TAG            1
> +#define CONFIG_INITRD_TAG             1
> +
> +#endif /* __XENGUEST_ARM64_H */
> --
> 2.17.1


* [PATCH 04/17] xen: Add essential and required interface headers
  2020-07-01 16:29 ` [PATCH 04/17] xen: Add essential and required interface headers Anastasiia Lukianenko
@ 2020-07-02  1:30   ` Peng Fan
  2020-07-03 12:46     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Peng Fan @ 2020-07-02  1:30 UTC
  To: u-boot

> Subject: [PATCH 04/17] xen: Add essential and required interface headers
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Add essential and required Xen interface headers only taken from
> the stable Linux kernel stable/linux-5.7.y at commit
> 66dfe45221605e11f38a0bf5eb2ee808cea7cfe7.

Please use the format: commit <12+ characters of the hash> ("commit subject")

> 
> These are better suited for U-boot than the original headers
> from Xen as they are the stripped versions of the same.
> 
> At the same time use public protocols from Xen RELEASE-4.13.1, at
> commit 6278553325a9f76d37811923221b21db3882e017

Please use the format: commit <12+ characters of the hash> ("commit subject")

Then:

Acked-by: Peng Fan <peng.fan@nxp.com>

> as those have more comments in them.
> 
> Signed-off-by: Oleksandr Andrushchenko
> <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  include/xen/arm/interface.h           |  88 ++++
>  include/xen/interface/event_channel.h | 281 ++++++++++
>  include/xen/interface/grant_table.h   | 582 +++++++++++++++++++++
>  include/xen/interface/hvm/hvm_op.h    |  69 +++
>  include/xen/interface/hvm/params.h    | 127 +++++
>  include/xen/interface/io/blkif.h      | 726 ++++++++++++++++++++++++++
>  include/xen/interface/io/console.h    |  56 ++
>  include/xen/interface/io/protocols.h  |  42 ++
>  include/xen/interface/io/ring.h       | 479 +++++++++++++++++
>  include/xen/interface/io/xenbus.h     |  81 +++
>  include/xen/interface/io/xs_wire.h    | 151 ++++++
>  include/xen/interface/memory.h        | 332 ++++++++++++
>  include/xen/interface/sched.h         | 188 +++++++
>  include/xen/interface/xen.h           | 225 ++++++++
>  14 files changed, 3427 insertions(+)
>  create mode 100644 include/xen/arm/interface.h
>  create mode 100644 include/xen/interface/event_channel.h
>  create mode 100644 include/xen/interface/grant_table.h
>  create mode 100644 include/xen/interface/hvm/hvm_op.h
>  create mode 100644 include/xen/interface/hvm/params.h
>  create mode 100644 include/xen/interface/io/blkif.h
>  create mode 100644 include/xen/interface/io/console.h
>  create mode 100644 include/xen/interface/io/protocols.h
>  create mode 100644 include/xen/interface/io/ring.h
>  create mode 100644 include/xen/interface/io/xenbus.h
>  create mode 100644 include/xen/interface/io/xs_wire.h
>  create mode 100644 include/xen/interface/memory.h
>  create mode 100644 include/xen/interface/sched.h
>  create mode 100644 include/xen/interface/xen.h
> 
> diff --git a/include/xen/arm/interface.h b/include/xen/arm/interface.h
> new file mode 100644
> index 0000000000..79d5ae8563
> --- /dev/null
> +++ b/include/xen/arm/interface.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/******************************************************************************
> + * Guest OS interface to ARM Xen.
> + *
> + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
> + */
> +
> +#ifndef _ASM_ARM_XEN_INTERFACE_H
> +#define _ASM_ARM_XEN_INTERFACE_H
> +
> +#ifndef __ASSEMBLY__
> +#include <linux/types.h>
> +#endif
> +
> +#define uint64_aligned_t u64 __attribute__((aligned(8)))
> +
> +#define __DEFINE_GUEST_HANDLE(name, type) \
> +	typedef struct { union { type *p; uint64_aligned_t q; }; }  \
> +		__guest_handle_ ## name
> +
> +#define DEFINE_GUEST_HANDLE_STRUCT(name) \
> +	__DEFINE_GUEST_HANDLE(name, struct name)
> +#define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name)
> +#define GUEST_HANDLE(name)        __guest_handle_ ## name
> +
> +#define set_xen_guest_handle(hnd, val)			\
> +	do {						\
> +		if (sizeof(hnd) == 8)			\
> +			*(u64 *)&(hnd) = 0;	\
> +		(hnd).p = val;				\
> +	} while (0)
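
The guest handle is a union of a pointer and an 8-byte-aligned u64, so it is always 8 bytes wide; set_xen_guest_handle() clears the whole 64-bit slot before storing the pointer so that a 32-bit guest never leaves stale upper bits. The sketch below shows the same idea in standalone form. It is a variation, not the patch's macro: it clears through the union member q instead of a pointer cast, and DEFINE_HANDLE_SKETCH, set_handle_sketch and handle_raw_value are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Same layout idea as the quoted header: pointer overlaid on an aligned u64. */
#define DEFINE_HANDLE_SKETCH(name, type)				\
	typedef struct {						\
		union {							\
			type *p;					\
			u64 q __attribute__((aligned(8)));		\
		};							\
	} handle_sketch_ ## name

/* Clear all 64 bits first, then store the (possibly 32-bit) pointer. */
#define set_handle_sketch(hnd, val)		\
	do {					\
		(hnd).q = 0;			\
		(hnd).p = (val);		\
	} while (0)

DEFINE_HANDLE_SKETCH(char, char);

/* The handle is 8 bytes on both 32-bit and 64-bit builds. */
static_assert(sizeof(handle_sketch_char) == 8, "handle must be 64 bits wide");

/* Returns the raw 64-bit value the hypervisor would see for this handle. */
static u64 handle_raw_value(char *buf)
{
	handle_sketch_char h;

	set_handle_sketch(h, buf);
	return h.q;
}
```

On a 64-bit or little-endian 32-bit build the raw value equals the pointer value zero-extended to 64 bits.
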
> +
> +#define __HYPERVISOR_platform_op_raw __HYPERVISOR_platform_op
> +
> +#ifndef __ASSEMBLY__
> +/* Explicitly size integers that represent pfns in the interface with
> + * Xen so that we can have one ABI that works for 32 and 64 bit guests.
> + * Note that this means that the xen_pfn_t type may be capable of
> + * representing pfn's which the guest cannot represent in its own pfn
> + * type. However since pfn space is controlled by the guest this is
> + * fine since it simply wouldn't be able to create any such pfns in
> + * the first place.
> + */
> +typedef u64 xen_pfn_t;
> +#define PRI_xen_pfn "llx"
> +typedef u64 xen_ulong_t;
> +#define PRI_xen_ulong "llx"
> +typedef s64 xen_long_t;
> +#define PRI_xen_long "llx"
> +/* Guest handles for primitive C types. */
> +__DEFINE_GUEST_HANDLE(uchar, unsigned char);
> +__DEFINE_GUEST_HANDLE(uint,  unsigned int);
> +DEFINE_GUEST_HANDLE(char);
> +DEFINE_GUEST_HANDLE(int);
> +DEFINE_GUEST_HANDLE(void);
> +DEFINE_GUEST_HANDLE(u64);
> +DEFINE_GUEST_HANDLE(u32);
> +DEFINE_GUEST_HANDLE(xen_pfn_t);
> +DEFINE_GUEST_HANDLE(xen_ulong_t);
> +
> +/* Maximum number of virtual CPUs in multi-processor guests. */
> +#define MAX_VIRT_CPUS 1
> +
> +struct arch_vcpu_info { };
> +struct arch_shared_info { };
> +
> +/* TODO: Move pvclock definitions some place arch independent */
> +struct pvclock_vcpu_time_info {
> +	u32   version;
> +	u32   pad0;
> +	u64   tsc_timestamp;
> +	u64   system_time;
> +	u32   tsc_to_system_mul;
> +	s8    tsc_shift;
> +	u8    flags;
> +	u8    pad[2];
> +} __attribute__((__packed__)); /* 32 bytes */
> +
> +/* It is OK to have a 16-byte struct with no padding because it is packed */
> +struct pvclock_wall_clock {
> +	u32   version;
> +	u32   sec;
> +	u32   nsec;
> +	u32   sec_hi;
> +} __attribute__((__packed__));
> +#endif
> +
> +#endif /* _ASM_ARM_XEN_INTERFACE_H */
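
The sizes of the two packed pvclock structures can be verified at compile time. The sketch below copies the field layout from the header above, using <stdint.h> types in place of the kernel-style typedefs; the two size helpers are hypothetical and exist only so the values can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pvclock_vcpu_time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;
	uint64_t system_time;
	uint32_t tsc_to_system_mul;
	int8_t   tsc_shift;
	uint8_t  flags;
	uint8_t  pad[2];
} __attribute__((__packed__));

struct pvclock_wall_clock {
	uint32_t version;
	uint32_t sec;
	uint32_t nsec;
	uint32_t sec_hi;
} __attribute__((__packed__));

/* 4 + 4 + 8 + 8 + 4 + 1 + 1 + 2 = 32 bytes. */
static_assert(sizeof(struct pvclock_vcpu_time_info) == 32, "vcpu time info");
/* Four u32 fields = 16 bytes. */
static_assert(sizeof(struct pvclock_wall_clock) == 16, "wall clock");

static size_t vcpu_time_info_size(void)
{
	return sizeof(struct pvclock_vcpu_time_info);
}

static size_t wall_clock_size(void)
{
	return sizeof(struct pvclock_wall_clock);
}
```
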
> diff --git a/include/xen/interface/event_channel.h
> b/include/xen/interface/event_channel.h
> new file mode 100644
> index 0000000000..8174999c2f
> --- /dev/null
> +++ b/include/xen/interface/event_channel.h
> @@ -0,0 +1,281 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/******************************************************************************
> + * event_channel.h
> + *
> + * Event channels between domains.
> + *
> + * Copyright (c) 2003-2004, K A Fraser.
> + */
> +
> +#ifndef __XEN_PUBLIC_EVENT_CHANNEL_H__
> +#define __XEN_PUBLIC_EVENT_CHANNEL_H__
> +
> +#include <xen/interface/xen.h>
> +
> +typedef u32 evtchn_port_t;
> +DEFINE_GUEST_HANDLE(evtchn_port_t);
> +
> +/*
> + * EVTCHNOP_alloc_unbound: Allocate a port in domain <dom> and mark as
> + * accepting interdomain bindings from domain <remote_dom>. A fresh port
> + * is allocated in <dom> and returned as <port>.
> + * NOTES:
> + *  1. If the caller is unprivileged then <dom> must be DOMID_SELF.
> + *  2. <remote_dom> may be DOMID_SELF, allowing loopback connections.
> + */
> +#define EVTCHNOP_alloc_unbound	  6
> +struct evtchn_alloc_unbound {
> +	/* IN parameters */
> +	domid_t dom, remote_dom;
> +	/* OUT parameters */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_bind_interdomain: Construct an interdomain event channel between
> + * the calling domain and <remote_dom>. <remote_dom,remote_port> must identify
> + * a port that is unbound and marked as accepting bindings from the calling
> + * domain. A fresh port is allocated in the calling domain and returned as
> + * <local_port>.
> + * NOTES:
> + *  1. <remote_dom> may be DOMID_SELF, allowing loopback connections.
> + */
> +#define EVTCHNOP_bind_interdomain 0
> +struct evtchn_bind_interdomain {
> +	/* IN parameters. */
> +	domid_t remote_dom;
> +	evtchn_port_t remote_port;
> +	/* OUT parameters. */
> +	evtchn_port_t local_port;
> +};
> +
> +/*
> + * EVTCHNOP_bind_virq: Bind a local event channel to VIRQ <irq> on specified
> + * vcpu.
> + * NOTES:
> + *  1. A virtual IRQ may be bound to at most one event channel per vcpu.
> + *  2. The allocated event channel is bound to the specified vcpu. The binding
> + *     may not be changed.
> + */
> +#define EVTCHNOP_bind_virq	  1
> +struct evtchn_bind_virq {
> +	/* IN parameters. */
> +	u32 virq;
> +	u32 vcpu;
> +	/* OUT parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_bind_pirq: Bind a local event channel to PIRQ <irq>.
> + * NOTES:
> + *  1. A physical IRQ may be bound to at most one event channel per domain.
> + *  2. Only a sufficiently-privileged domain may bind to a physical IRQ.
> + */
> +#define EVTCHNOP_bind_pirq	  2
> +struct evtchn_bind_pirq {
> +	/* IN parameters. */
> +	u32 pirq;
> +#define BIND_PIRQ__WILL_SHARE 1
> +	u32 flags; /* BIND_PIRQ__* */
> +	/* OUT parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_bind_ipi: Bind a local event channel to receive events.
> + * NOTES:
> + *  1. The allocated event channel is bound to the specified vcpu. The binding
> + *     may not be changed.
> + */
> +#define EVTCHNOP_bind_ipi	  7
> +struct evtchn_bind_ipi {
> +	u32 vcpu;
> +	/* OUT parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_close: Close a local event channel <port>. If the channel is
> + * interdomain then the remote end is placed in the unbound state
> + * (EVTCHNSTAT_unbound), awaiting a new connection.
> + */
> +#define EVTCHNOP_close		  3
> +struct evtchn_close {
> +	/* IN parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_send: Send an event to the remote end of the channel whose local
> + * endpoint is <port>.
> + */
> +#define EVTCHNOP_send		  4
> +struct evtchn_send {
> +	/* IN parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_status: Get the current status of the communication channel which
> + * has an endpoint at <dom, port>.
> + * NOTES:
> + *  1. <dom> may be specified as DOMID_SELF.
> + *  2. Only a sufficiently-privileged domain may obtain the status of an event
> + *     channel for which <dom> is not DOMID_SELF.
> + */
> +#define EVTCHNOP_status		  5
> +struct evtchn_status {
> +	/* IN parameters */
> +	domid_t  dom;
> +	evtchn_port_t port;
> +	/* OUT parameters */
> +#define EVTCHNSTAT_closed	0  /* Channel is not in use.		     */
> +#define EVTCHNSTAT_unbound	1  /* Channel is waiting interdom connection.*/
> +#define EVTCHNSTAT_interdomain	2  /* Channel is connected to remote domain. */
> +#define EVTCHNSTAT_pirq		3  /* Channel is bound to a phys IRQ line.   */
> +#define EVTCHNSTAT_virq		4  /* Channel is bound to a virtual IRQ line */
> +#define EVTCHNSTAT_ipi		5  /* Channel is bound to a virtual IPI line */
> +	u32 status;
> +	u32 vcpu;		   /* VCPU to which this channel is bound.   */
> +	union {
> +		struct {
> +			domid_t dom;
> +		} unbound; /* EVTCHNSTAT_unbound */
> +		struct {
> +			domid_t dom;
> +			evtchn_port_t port;
> +		} interdomain; /* EVTCHNSTAT_interdomain */
> +		u32 pirq;	    /* EVTCHNSTAT_pirq	      */
> +		u32 virq;	    /* EVTCHNSTAT_virq	      */
> +	} u;
> +};
> +
> +/*
> + * EVTCHNOP_bind_vcpu: Specify which vcpu a channel should notify when an
> + * event is pending.
> + * NOTES:
> + *  1. IPI- and VIRQ-bound channels always notify the vcpu that initialised
> + *     the binding. This binding cannot be changed.
> + *  2. All other channels notify vcpu0 by default. This default is set when
> + *     the channel is allocated (a port that is freed and subsequently reused
> + *     has its binding reset to vcpu0).
> + */
> +#define EVTCHNOP_bind_vcpu	  8
> +struct evtchn_bind_vcpu {
> +	/* IN parameters. */
> +	evtchn_port_t port;
> +	u32 vcpu;
> +};
> +
> +/*
> + * EVTCHNOP_unmask: Unmask the specified local event-channel port and deliver
> + * a notification to the appropriate VCPU if an event is pending.
> + */
> +#define EVTCHNOP_unmask		  9
> +struct evtchn_unmask {
> +	/* IN parameters. */
> +	evtchn_port_t port;
> +};
> +
> +/*
> + * EVTCHNOP_reset: Close all event channels associated with specified domain.
> + * NOTES:
> + *  1. <dom> may be specified as DOMID_SELF.
> + *  2. Only a sufficiently-privileged domain may specify other than DOMID_SELF.
> + */
> +#define EVTCHNOP_reset		 10
> +struct evtchn_reset {
> +	/* IN parameters. */
> +	domid_t dom;
> +};
> +
> +typedef struct evtchn_reset evtchn_reset_t;
> +
> +/*
> + * EVTCHNOP_init_control: initialize the control block for the FIFO ABI.
> + */
> +#define EVTCHNOP_init_control    11
> +struct evtchn_init_control {
> +	/* IN parameters. */
> +	u64 control_gfn;
> +	u32 offset;
> +	u32 vcpu;
> +	/* OUT parameters. */
> +	u8 link_bits;
> +	u8 _pad[7];
> +};
> +
> +/*
> + * EVTCHNOP_expand_array: add an additional page to the event array.
> + */
> +#define EVTCHNOP_expand_array    12
> +struct evtchn_expand_array {
> +	/* IN parameters. */
> +	u64 array_gfn;
> +};
> +
> +/*
> + * EVTCHNOP_set_priority: set the priority for an event channel.
> + */
> +#define EVTCHNOP_set_priority    13
> +struct evtchn_set_priority {
> +	/* IN parameters. */
> +	evtchn_port_t port;
> +	u32 priority;
> +};
> +
> +struct evtchn_op {
> +	u32 cmd; /* EVTCHNOP_* */
> +	union {
> +		struct evtchn_alloc_unbound    alloc_unbound;
> +		struct evtchn_bind_interdomain bind_interdomain;
> +		struct evtchn_bind_virq	       bind_virq;
> +		struct evtchn_bind_pirq	       bind_pirq;
> +		struct evtchn_bind_ipi	       bind_ipi;
> +		struct evtchn_close	       close;
> +		struct evtchn_send	       send;
> +		struct evtchn_status	       status;
> +		struct evtchn_bind_vcpu	       bind_vcpu;
> +		struct evtchn_unmask	       unmask;
> +	} u;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(evtchn_op);
> +
> +/*
> + * 2-level ABI
> + */
> +
> +#define EVTCHN_2L_NR_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64)
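
As a worked example of the 2-level limit: with xen_ulong_t typedef'd to a 64-bit type, as in the ARM interface header, the expression evaluates to 8 * 8 * 64 = 4096 channels. A minimal sketch; evtchn_2l_nr_channels() is a hypothetical wrapper for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit on ARM, matching the typedef in xen/arm/interface.h. */
typedef uint64_t xen_ulong_t;

#define EVTCHN_2L_NR_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64)

static_assert(EVTCHN_2L_NR_CHANNELS == 4096, "8 * 8 * 64");

/* Hypothetical wrapper returning the channel limit at run time. */
static size_t evtchn_2l_nr_channels(void)
{
	return EVTCHN_2L_NR_CHANNELS;
}
```
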
> +
> +/*
> + * FIFO ABI
> + */
> +
> +/* Events may have priorities from 0 (highest) to 15 (lowest). */
> +#define EVTCHN_FIFO_PRIORITY_MAX     0
> +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
> +#define EVTCHN_FIFO_PRIORITY_MIN     15
> +
> +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
> +
> +typedef u32 event_word_t;
> +
> +#define EVTCHN_FIFO_PENDING 31
> +#define EVTCHN_FIFO_MASKED  30
> +#define EVTCHN_FIFO_LINKED  29
> +#define EVTCHN_FIFO_BUSY    28
> +
> +#define EVTCHN_FIFO_LINK_BITS 17
> +#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)
> +
> +#define EVTCHN_FIFO_NR_CHANNELS (1 << EVTCHN_FIFO_LINK_BITS)
> +
> +struct evtchn_fifo_control_block {
> +	u32     ready;
> +	u32     _rsvd;
> +	event_word_t head[EVTCHN_FIFO_MAX_QUEUES];
> +};
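
The FIFO constants above describe bit positions inside a 32-bit event word: four state bits at the top and a 17-bit link field at the bottom. A short decoding sketch follows; the event_word_* helpers are hypothetical names, only the constants come from the header.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t event_word_t;

#define EVTCHN_FIFO_PENDING 31
#define EVTCHN_FIFO_MASKED  30
#define EVTCHN_FIFO_LINKED  29
#define EVTCHN_FIFO_BUSY    28

#define EVTCHN_FIFO_LINK_BITS 17
#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)

/* The link field addresses 1 << 17 = 131072 channels. */
static_assert(EVTCHN_FIFO_LINK_MASK == 0x1ffff, "17-bit link field");

/* Test a single state bit of an event word. */
static int event_word_test(event_word_t w, unsigned int bit)
{
	assert(bit < 32);
	return (w >> bit) & 1u;
}

/* Extract the port number of the next event in the queue. */
static uint32_t event_word_link(event_word_t w)
{
	return w & EVTCHN_FIFO_LINK_MASK;
}
```
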
> +
> +#endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */
> diff --git a/include/xen/interface/grant_table.h
> b/include/xen/interface/grant_table.h
> new file mode 100644
> index 0000000000..197a0d0d58
> --- /dev/null
> +++ b/include/xen/interface/grant_table.h
> @@ -0,0 +1,582 @@
> +/******************************************************************************
> + * grant_table.h
> + *
> + * Interface for granting foreign access to page frames, and receiving
> + * page-ownership transfers.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2004, K A Fraser
> + */
> +
> +#ifndef __XEN_PUBLIC_GRANT_TABLE_H__
> +#define __XEN_PUBLIC_GRANT_TABLE_H__
> +
> +#include <xen/interface/xen.h>
> +
> +/***********************************
> + * GRANT TABLE REPRESENTATION
> + */
> +
> +/* Some rough guidelines on accessing and updating grant-table entries
> + * in a concurrency-safe manner. For more information, Linux contains a
> + * reference implementation for guest OSes (arch/xen/kernel/grant_table.c).
> + *
> + * NB. WMB is a no-op on current-generation x86 processors. However, a
> + *     compiler barrier will still be required.
> + *
> + * Introducing a valid entry into the grant table:
> + *  1. Write ent->domid.
> + *  2. Write ent->frame:
> + *      GTF_permit_access:   Frame to which access is permitted.
> + *      GTF_accept_transfer: Pseudo-phys frame slot being filled by new
> + *                           frame, or zero if none.
> + *  3. Write memory barrier (WMB).
> + *  4. Write ent->flags, inc. valid type.
> + *
> + * Invalidating an unused GTF_permit_access entry:
> + *  1. flags = ent->flags.
> + *  2. Observe that !(flags & (GTF_reading|GTF_writing)).
> + *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
> + *  NB. No need for WMB as reuse of entry is control-dependent on success of
> + *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
> + *
> + * Invalidating an in-use GTF_permit_access entry:
> + *  This cannot be done directly. Request assistance from the domain controller
> + *  which can set a timeout on the use of a grant entry and take necessary
> + *  action. (NB. This is not yet implemented!).
> + *
> + * Invalidating an unused GTF_accept_transfer entry:
> + *  1. flags = ent->flags.
> + *  2. Observe that !(flags & GTF_transfer_committed). [*]
> + *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
> + *  NB. No need for WMB as reuse of entry is control-dependent on success of
> + *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
> + *  [*] If GTF_transfer_committed is set then the grant entry is 'committed'.
> + *      The guest must /not/ modify the grant entry until the address of the
> + *      transferred frame is written. It is safe for the guest to spin waiting
> + *      for this to occur (detect by observing GTF_transfer_completed in
> + *      ent->flags).
> + *
> + * Invalidating a committed GTF_accept_transfer entry:
> + *  1. Wait for (ent->flags & GTF_transfer_completed).
> + *
> + * Changing a GTF_permit_access from writable to read-only:
> + *  Use SMP-safe CMPXCHG to set GTF_readonly, while checking !GTF_writing.
> + *
> + * Changing a GTF_permit_access from read-only to writable:
> + *  Use SMP-safe bit-setting instruction.
> + */
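
The guidelines above for introducing a valid entry and for invalidating an unused GTF_permit_access entry can be sketched as follows. This is only an illustration under stated assumptions: C11 atomics stand in for the WMB and the SMP-safe CMPXCHG, struct grant_entry_sketch is a local stand-in for the v1 entry, and grant_publish() and grant_revoke_unused() are hypothetical names. Real code operates on memory shared with Xen and uses the platform's own barrier primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

#define GTF_permit_access (1U << 0)
#define GTF_readonly      (1U << 2)
#define GTF_reading       (1U << 3)
#define GTF_writing       (1U << 4)

/* Local stand-in for struct grant_entry_v1 with an atomic flags field. */
struct grant_entry_sketch {
	_Atomic uint16_t flags;
	domid_t domid;
	uint32_t frame;
};

/* Introduce a valid entry: domid, frame, barrier, then flags. */
static void grant_publish(struct grant_entry_sketch *ent, domid_t domid,
			  uint32_t frame, uint16_t flags)
{
	assert(ent != NULL);

	ent->domid = domid;	/* step 1 */
	ent->frame = frame;	/* step 2 */
	/* Steps 3 and 4: the release store orders the writes above first. */
	atomic_store_explicit(&ent->flags, flags, memory_order_release);
}

/* Invalidate an unused entry; returns 1 on success, 0 if still in use. */
static int grant_revoke_unused(struct grant_entry_sketch *ent)
{
	uint16_t flags;

	assert(ent != NULL);

	flags = atomic_load_explicit(&ent->flags, memory_order_acquire);
	if (flags & (GTF_reading | GTF_writing))
		return 0;	/* currently mapped by the remote domain */

	/* Fails if the flags changed between the load and the exchange. */
	return atomic_compare_exchange_strong(&ent->flags, &flags, 0);
}
```

A caller that gets 0 back from grant_revoke_unused() must not reuse the entry, since the remote domain may still hold a mapping.
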
> +
> +/*
> + * Reference to a grant entry in a specified domain's grant table.
> + */
> +typedef u32 grant_ref_t;
> +
> +/*
> + * A grant table comprises a packed array of grant entries in one or more
> + * page frames shared between Xen and a guest.
> + * [XEN]: This field is written by Xen and read by the sharing guest.
> + * [GST]: This field is written by the guest and read by Xen.
> + */
> +
> +/*
> + * Version 1 of the grant table entry structure is maintained purely
> + * for backwards compatibility.  New guests should use version 2.
> + */
> +struct grant_entry_v1 {
> +	/* GTF_xxx: various type and flag information.  [XEN,GST] */
> +	u16 flags;
> +	/* The domain being granted foreign privileges. [GST] */
> +	domid_t  domid;
> +	/*
> +	 * GTF_permit_access: Frame that @domid is allowed to map and access. [GST]
> +	 * GTF_accept_transfer: Frame whose ownership transferred by @domid. [XEN]
> +	 */
> +	u32 frame;
> +};
> +
> +/*
> + * Type of grant entry.
> + *  GTF_invalid: This grant entry grants no privileges.
> + *  GTF_permit_access: Allow @domid to map/access @frame.
> + *  GTF_accept_transfer: Allow @domid to transfer ownership of one page frame
> + *                       to this guest. Xen writes the page number to @frame.
> + *  GTF_transitive: Allow @domid to transitively access a subrange of
> + *                  @trans_grant in @trans_domid.  No mappings are allowed.
> + */
> +#define GTF_invalid         (0U << 0)
> +#define GTF_permit_access   (1U << 0)
> +#define GTF_accept_transfer (2U << 0)
> +#define GTF_transitive      (3U << 0)
> +#define GTF_type_mask       (3U << 0)
> +
> +/*
> + * Subflags for GTF_permit_access.
> + *  GTF_readonly: Restrict @domid to read-only mappings and accesses. [GST]
> + *  GTF_reading: Grant entry is currently mapped for reading by @domid. [XEN]
> + *  GTF_writing: Grant entry is currently mapped for writing by @domid. [XEN]
> + *  GTF_sub_page: Grant access to only a subrange of the page.  @domid
> + *                will only be allowed to copy from the grant, and not
> + *                map it. [GST]
> + */
> +#define _GTF_readonly       (2)
> +#define GTF_readonly        (1U << _GTF_readonly)
> +#define _GTF_reading        (3)
> +#define GTF_reading         (1U << _GTF_reading)
> +#define _GTF_writing        (4)
> +#define GTF_writing         (1U << _GTF_writing)
> +#define _GTF_sub_page       (8)
> +#define GTF_sub_page        (1U << _GTF_sub_page)
> +
> +/*
> + * Subflags for GTF_accept_transfer:
> + *  GTF_transfer_committed: Xen sets this flag to indicate that it is committed
> + *      to transferring ownership of a page frame. When a guest sees this flag
> + *      it must /not/ modify the grant entry until GTF_transfer_completed is
> + *      set by Xen.
> + *  GTF_transfer_completed: It is safe for the guest to spin-wait on this flag
> + *      after reading GTF_transfer_committed. Xen will always write the frame
> + *      address, followed by ORing this flag, in a timely manner.
> + */
> +#define _GTF_transfer_committed (2)
> +#define GTF_transfer_committed  (1U << _GTF_transfer_committed)
> +#define _GTF_transfer_completed (3)
> +#define GTF_transfer_completed  (1U << _GTF_transfer_completed)
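
Note that the subflags for GTF_permit_access and GTF_accept_transfer reuse the same bit positions (bits 2 and 3), so a flags word has to be interpreted according to its type field first. A small sketch of that decoding; gtf_type() and gtf_is_readonly_grant() are hypothetical helpers, only the constants come from the header.

```c
#include <assert.h>
#include <stdint.h>

#define GTF_permit_access   (1U << 0)
#define GTF_accept_transfer (2U << 0)
#define GTF_type_mask       (3U << 0)

#define GTF_readonly            (1U << 2)
#define GTF_reading             (1U << 3)
#define GTF_transfer_committed  (1U << 2)
#define GTF_transfer_completed  (1U << 3)

/* The two subflag sets overlap bit for bit. */
static_assert(GTF_readonly == GTF_transfer_committed, "bit 2 is shared");
static_assert(GTF_reading == GTF_transfer_completed, "bit 3 is shared");

/* Extract the entry type from a flags word. */
static unsigned int gtf_type(uint16_t flags)
{
	return flags & GTF_type_mask;
}

/* Bit 2 means "read-only" only when the entry is a permit_access grant. */
static int gtf_is_readonly_grant(uint16_t flags)
{
	return gtf_type(flags) == GTF_permit_access &&
	       (flags & GTF_readonly) != 0;
}
```
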
> +
> +/*
> + * Version 2 grant table entries.  These fulfil the same role as
> + * version 1 entries, but can represent more complicated operations.
> + * Any given domain will have either a version 1 or a version 2 table,
> + * and every entry in the table will be the same version.
> + *
> + * The interface by which domains use grant references does not depend
> + * on the grant table version in use by the other domain.
> + */
> +
> +/*
> + * Version 1 and version 2 grant entries share a common prefix.  The
> + * fields of the prefix are documented as part of struct
> + * grant_entry_v1.
> + */
> +struct grant_entry_header {
> +	u16 flags;
> +	domid_t  domid;
> +};
> +
> +/*
> + * Version 2 of the grant entry structure, here is a union because three
> + * different types are supported: full_page, sub_page and transitive.
> + */
> +union grant_entry_v2 {
> +	struct grant_entry_header hdr;
> +
> +	/*
> +	 * This member is used for V1-style full page grants, where either:
> +	 *
> +	 * -- hdr.type is GTF_accept_transfer, or
> +	 * -- hdr.type is GTF_permit_access and GTF_sub_page is not set.
> +	 *
> +	 * In that case, the frame field has the same semantics as the
> +	 * field of the same name in the V1 entry structure.
> +	 */
> +	struct {
> +	struct grant_entry_header hdr;
> +	u32 pad0;
> +	u64 frame;
> +	} full_page;
> +
> +	/*
> +	 * If the grant type is GTF_permit_access and GTF_sub_page is set,
> +	 * @domid is allowed to access bytes [@page_off, @page_off+@length)
> +	 * in frame @frame.
> +	 */
> +	struct {
> +		struct grant_entry_header hdr;
> +		u16 page_off;
> +		u16 length;
> +		u64 frame;
> +	} sub_page;
> +
> +	/*
> +	 * If the grant is GTF_transitive, @domid is allowed to use the
> +	 * grant @gref in domain @trans_domid, as if it was the local
> +	 * domain.  Obviously, the transitive access must be compatible
> +	 * with the original grant.
> +	 */
> +	struct {
> +		struct grant_entry_header hdr;
> +		domid_t trans_domid;
> +		u16 pad0;
> +		grant_ref_t gref;
> +	} transitive;
> +
> +	u32 __spacer[4]; /* Pad to a power of two */
> +};
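
A quick way to sanity-check the layout above is a standalone sketch like the one below. It is not part of the patch: the stdint stand-ins for u16/u32/u64, domid_t and grant_ref_t are assumptions made only so the snippet compiles on its own, and gnttab_v2_entries_per_page() is a hypothetical helper. It shows that every variant fits the 16-byte slot reserved by __spacer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the typedefs used by the patch. */
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;
typedef u16 domid_t;
typedef u32 grant_ref_t;

struct grant_entry_header {
	u16 flags;
	domid_t domid;
};

union grant_entry_v2 {
	struct grant_entry_header hdr;
	struct {
		struct grant_entry_header hdr;
		u32 pad0;
		u64 frame;
	} full_page;
	struct {
		struct grant_entry_header hdr;
		u16 page_off;
		u16 length;
		u64 frame;
	} sub_page;
	struct {
		struct grant_entry_header hdr;
		domid_t trans_domid;
		u16 pad0;
		grant_ref_t gref;
	} transitive;
	u32 __spacer[4]; /* Pad to a power of two */
};

/* Every variant must fit the 16-byte slot reserved by __spacer. */
static_assert(sizeof(union grant_entry_v2) == 16,
	      "v2 grant entry must be 16 bytes");
static_assert(offsetof(union grant_entry_v2, full_page.frame) == 8,
	      "frame must sit at offset 8");

/* Hypothetical helper: number of v2 entries in one page of page_size bytes. */
static inline size_t gnttab_v2_entries_per_page(size_t page_size)
{
	return page_size / sizeof(union grant_entry_v2);
}
```

With 4 KiB pages this gives 256 entries per grant-table frame.
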
> +
> +typedef u16 grant_status_t;
> +
> +/***********************************
> + * GRANT TABLE QUERIES AND USES
> + */
> +
> +/*
> + * Handle to track a mapping created via a grant reference.
> + */
> +typedef u32 grant_handle_t;
> +
> +/*
> + * GNTTABOP_map_grant_ref: Map the grant entry (<dom>,<ref>) for access
> + * by devices and/or host CPUs. If successful, <handle> is a tracking number
> + * that must be presented later to destroy the mapping(s). On error, <handle>
> + * is a negative status code.
> + * NOTES:
> + *  1. If GNTMAP_device_map is specified then <dev_bus_addr> is the address
> + *     via which I/O devices may access the granted frame.
> + *  2. If GNTMAP_host_map is specified then a mapping will be added at
> + *     either a host virtual address in the current address space, or at
> + *     a PTE at the specified machine address.  The type of mapping to
> + *     perform is selected through the GNTMAP_contains_pte flag, and the
> + *     address is specified in <host_addr>.
> + *  3. Mappings should only be destroyed via GNTTABOP_unmap_grant_ref. If a
> + *     host mapping is destroyed by other means then it is *NOT* guaranteed
> + *     to be accounted to the correct grant reference!
> + */
> +#define GNTTABOP_map_grant_ref        0
> +struct gnttab_map_grant_ref {
> +	/* IN parameters. */
> +	u64 host_addr;
> +	u32 flags;               /* GNTMAP_* */
> +	grant_ref_t ref;
> +	domid_t  dom;
> +	/* OUT parameters. */
> +	s16  status;              /* GNTST_* */
> +	grant_handle_t handle;
> +	u64 dev_bus_addr;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_map_grant_ref);
> +
> +/*
> + * GNTTABOP_unmap_grant_ref: Destroy one or more grant-reference mappings
> + * tracked by <handle>. If <host_addr> or <dev_bus_addr> is zero, that
> + * field is ignored. If non-zero, they must refer to a device/host mapping
> + * that is tracked by <handle>.
> + * NOTES:
> + *  1. The call may fail in an undefined manner if either mapping is not
> + *     tracked by <handle>.
> + *  2. After executing a batch of unmaps, it is guaranteed that no stale
> + *     mappings will remain in the device or host TLBs.
> + */
> +#define GNTTABOP_unmap_grant_ref      1
> +struct gnttab_unmap_grant_ref {
> +	/* IN parameters. */
> +	u64 host_addr;
> +	u64 dev_bus_addr;
> +	grant_handle_t handle;
> +	/* OUT parameters. */
> +	s16  status;              /* GNTST_* */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_grant_ref);
> +
> +/*
> + * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
> + * <nr_frames> pages. The frame addresses are written to the <frame_list>.
> + * Only <nr_frames> addresses are written, even if the table is larger.
> + * NOTES:
> + *  1. <dom> may be specified as DOMID_SELF.
> + *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
> + *  3. Xen may not support more than a single grant-table page per domain.
> + */
> +#define GNTTABOP_setup_table          2
> +struct gnttab_setup_table {
> +	/* IN parameters. */
> +	domid_t  dom;
> +	u32 nr_frames;
> +	/* OUT parameters. */
> +	s16  status;              /* GNTST_* */
> +
> +	GUEST_HANDLE(xen_pfn_t) frame_list;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_setup_table);
> +
> +/*
> + * GNTTABOP_dump_table: Dump the contents of the grant table to the
> + * xen console. Debugging use only.
> + */
> +#define GNTTABOP_dump_table           3
> +struct gnttab_dump_table {
> +	/* IN parameters. */
> +	domid_t dom;
> +	/* OUT parameters. */
> +	s16 status;               /* GNTST_* */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_dump_table);
> +
> +/*
> + * GNTTABOP_transfer: Transfer <frame> to a foreign domain. The
> + * foreign domain has previously registered its interest in the transfer via
> + * <domid, ref>.
> + *
> + * Note that, even if the transfer fails, the specified page no longer belongs
> + * to the calling domain *unless* the error is GNTST_bad_page.
> + */
> +#define GNTTABOP_transfer                4
> +struct gnttab_transfer {
> +	/* IN parameters. */
> +	xen_pfn_t mfn;
> +	domid_t       domid;
> +	grant_ref_t   ref;
> +	/* OUT parameters. */
> +	s16       status;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_transfer);
> +
> +/*
> + * GNTTABOP_copy: Hypervisor-based copy.
> + * Source and destination can be either MFNs or, for foreign domains,
> + * grant references. The foreign domain has to grant read/write access
> + * in its grant table.
> + *
> + * The flags specify what type the source and destination are (either MFN
> + * or grant reference).
> + *
> + * Note that this can also be used to copy data between two domains
> + * via a third party if the source and destination domains have previously
> + * granted appropriate access to their pages to the third party.
> + *
> + * source_offset specifies an offset in the source frame, dest_offset
> + * the offset in the target frame and  len specifies the number of
> + * bytes to be copied.
> + */
> +
> +#define _GNTCOPY_source_gref      (0)
> +#define GNTCOPY_source_gref       (1 << _GNTCOPY_source_gref)
> +#define _GNTCOPY_dest_gref        (1)
> +#define GNTCOPY_dest_gref         (1 << _GNTCOPY_dest_gref)
> +
> +#define GNTTABOP_copy                 5
> +struct gnttab_copy {
> +	/* IN parameters. */
> +	struct {
> +		union {
> +			grant_ref_t ref;
> +			xen_pfn_t   gmfn;
> +		} u;
> +		domid_t  domid;
> +		u16 offset;
> +	} source, dest;
> +	u16      len;
> +	u16      flags;          /* GNTCOPY_* */
> +	/* OUT parameters. */
> +	s16       status;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_copy);
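
Since GNTST_bad_copy_arg is returned when the copy arguments cross a page boundary, a caller can check offset + len up front. Below is a minimal sketch of that check and of composing the GNTCOPY_* flags. It is not code from the patch: the helper names are hypothetical and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define _GNTCOPY_source_gref      (0)
#define GNTCOPY_source_gref       (1 << _GNTCOPY_source_gref)
#define _GNTCOPY_dest_gref        (1)
#define GNTCOPY_dest_gref         (1 << _GNTCOPY_dest_gref)

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Both flags must fit the u16 'flags' field of struct gnttab_copy. */
static_assert((GNTCOPY_source_gref | GNTCOPY_dest_gref) <= UINT16_MAX,
	      "GNTCOPY_* flags must fit in 16 bits");

/* True if neither the source nor the destination range leaves its page. */
static bool gntcopy_args_fit_page(uint16_t src_off, uint16_t dst_off,
				  uint16_t len)
{
	return (uint32_t)src_off + len <= SKETCH_PAGE_SIZE &&
	       (uint32_t)dst_off + len <= SKETCH_PAGE_SIZE;
}

/* Build the flags word: set a bit for each side named by a grant reference. */
static uint16_t gntcopy_flags(bool src_is_gref, bool dst_is_gref)
{
	uint16_t flags = 0;

	if (src_is_gref)
		flags |= GNTCOPY_source_gref;
	if (dst_is_gref)
		flags |= GNTCOPY_dest_gref;
	return flags;
}
```

A copy from a foreign grant reference into a local frame would use gntcopy_flags(true, false).
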
> +
> +/*
> + * GNTTABOP_query_size: Query the current and maximum sizes of the shared
> + * grant table.
> + * NOTES:
> + *  1. <dom> may be specified as DOMID_SELF.
> + *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
> + */
> +#define GNTTABOP_query_size           6
> +struct gnttab_query_size {
> +	/* IN parameters. */
> +	domid_t  dom;
> +	/* OUT parameters. */
> +	u32 nr_frames;
> +	u32 max_nr_frames;
> +	s16  status;              /* GNTST_* */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_query_size);
> +
> +/*
> + * GNTTABOP_unmap_and_replace: Destroy one or more grant-reference mappings
> + * tracked by <handle> but atomically replace the page table entry with one
> + * pointing to the machine address under <new_addr>.  <new_addr> will be
> + * redirected to the null entry.
> + * NOTES:
> + *  1. The call may fail in an undefined manner if either mapping is not
> + *     tracked by <handle>.
> + *  2. After executing a batch of unmaps, it is guaranteed that no stale
> + *     mappings will remain in the device or host TLBs.
> + */
> +#define GNTTABOP_unmap_and_replace    7
> +struct gnttab_unmap_and_replace {
> +	/* IN parameters. */
> +	u64 host_addr;
> +	u64 new_addr;
> +	grant_handle_t handle;
> +	/* OUT parameters. */
> +	s16  status;              /* GNTST_* */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_and_replace);
> +
> +/*
> + * GNTTABOP_set_version: Request a particular version of the grant
> + * table shared table structure.  This operation can only be performed
> + * once in any given domain.  It must be performed before any grants
> + * are activated; otherwise, the domain will be stuck with version 1.
> + * The only defined versions are 1 and 2.
> + */
> +#define GNTTABOP_set_version          8
> +struct gnttab_set_version {
> +	/* IN parameters */
> +	u32 version;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_set_version);
> +
> +/*
> + * GNTTABOP_get_status_frames: Get the list of frames used to store grant
> + * status for <dom>. In grant format version 2, the status is separated
> + * from the other shared grant fields to allow more efficient synchronization
> + * using barriers instead of atomic cmpxchg operations.
> + * <nr_frames> specifies the size of vector <frame_list>.
> + * The frame addresses are returned in the <frame_list>.
> + * Only <nr_frames> addresses are returned, even if the table is larger.
> + * NOTES:
> + *  1. <dom> may be specified as DOMID_SELF.
> + *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
> + */
> +#define GNTTABOP_get_status_frames     9
> +struct gnttab_get_status_frames {
> +	/* IN parameters. */
> +	u32 nr_frames;
> +	domid_t  dom;
> +	/* OUT parameters. */
> +	s16  status;              /* GNTST_* */
> +
> +	GUEST_HANDLE(u64) frame_list;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_status_frames);
> +
> +/*
> + * GNTTABOP_get_version: Get the grant table version which is in
> + * effect for domain <dom>.
> + */
> +#define GNTTABOP_get_version          10
> +struct gnttab_get_version {
> +	/* IN parameters */
> +	domid_t dom;
> +	u16 pad;
> +	/* OUT parameters */
> +	u32 version;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);
> +
> +/*
> + * Issue one or more cache maintenance operations on a portion of a
> + * page granted to the calling domain by a foreign domain.
> + */
> +#define GNTTABOP_cache_flush          12
> +struct gnttab_cache_flush {
> +	union {
> +		u64 dev_bus_addr;
> +		grant_ref_t ref;
> +	} a;
> +	u16 offset;   /* offset from start of grant */
> +	u16 length;   /* size within the grant */
> +#define GNTTAB_CACHE_CLEAN          (1 << 0)
> +#define GNTTAB_CACHE_INVAL          (1 << 1)
> +#define GNTTAB_CACHE_SOURCE_GREF    (1 << 31)
> +	u32 op;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush);
> +
> +/*
> + * Bitfield values for update_pin_status.flags.
> + */
> + /* Map the grant entry for access by I/O devices. */
> +#define _GNTMAP_device_map      (0)
> +#define GNTMAP_device_map       (1 << _GNTMAP_device_map)
> +/* Map the grant entry for access by host CPUs. */
> +#define _GNTMAP_host_map        (1)
> +#define GNTMAP_host_map         (1 << _GNTMAP_host_map)
> +/* Accesses to the granted frame will be restricted to read-only access. */
> +#define _GNTMAP_readonly        (2)
> +#define GNTMAP_readonly         (1 << _GNTMAP_readonly)
> +/*
> + * GNTMAP_host_map subflag:
> + *  0 => The host mapping is usable only by the guest OS.
> + *  1 => The host mapping is usable by guest OS + current application.
> + */
> +#define _GNTMAP_application_map (3)
> +#define GNTMAP_application_map  (1 << _GNTMAP_application_map)
> +
> +/*
> + * GNTMAP_contains_pte subflag:
> + *  0 => This map request contains a host virtual address.
> + *  1 => This map request contains the machine address of the PTE to update.
> + */
> +#define _GNTMAP_contains_pte    (4)
> +#define GNTMAP_contains_pte     (1 << _GNTMAP_contains_pte)
> +
> +/*
> + * Bits to be placed in guest kernel available PTE bits (architecture
> + * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set).
> + */
> +#define _GNTMAP_guest_avail0    (16)
> +#define GNTMAP_guest_avail_mask ((u32)~0 << _GNTMAP_guest_avail0)
> +
> +/*
> + * Values for error status returns. All errors are -ve.
> + */
> +#define GNTST_okay             (0)  /* Normal return.                        */
> +#define GNTST_general_error    (-1) /* General undefined error.              */
> +#define GNTST_bad_domain       (-2) /* Unrecognised domain id.               */
> +#define GNTST_bad_gntref       (-3) /* Unrecognised or inappropriate gntref. */
> +#define GNTST_bad_handle       (-4) /* Unrecognised or inappropriate handle. */
> +#define GNTST_bad_virt_addr    (-5) /* Inappropriate virtual address to map. */
> +#define GNTST_bad_dev_addr     (-6) /* Inappropriate device address to unmap.*/
> +#define GNTST_no_device_space  (-7) /* Out of space in I/O MMU.              */
> +#define GNTST_permission_denied (-8) /* Not enough privilege for operation.  */
> +#define GNTST_bad_page         (-9) /* Specified page was invalid for op.    */
> +#define GNTST_bad_copy_arg    (-10) /* copy arguments cross page boundary.   */
> +#define GNTST_address_too_big (-11) /* transfer page address too large.      */
> +#define GNTST_eagain          (-12) /* Operation not done; try again.        */
> +
> +#define GNTTABOP_error_msgs {                   \
> +	"okay",                                     \
> +	"undefined error",                          \
> +	"unrecognised domain id",                   \
> +	"invalid grant reference",                  \
> +	"invalid mapping handle",                   \
> +	"invalid virtual address",                  \
> +	"invalid device address",                   \
> +	"no spare translation slot in the I/O MMU", \
> +	"permission denied",                        \
> +	"bad page",                                 \
> +	"copy arguments cross page boundary",       \
> +	"page address size too large",              \
> +	"operation not done; try again"             \
> +}
> +
> +#endif /* __XEN_PUBLIC_GRANT_TABLE_H__ */
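
The GNTTABOP_error_msgs table is ordered so that the message for a status code sits at index -status. A minimal sketch of a lookup built on that, with a bounds check, follows. gnttab_status_str() is a hypothetical helper that the patch does not define, and the macro is repeated here only so the snippet stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GNTTABOP_error_msgs {                   \
	"okay",                                     \
	"undefined error",                          \
	"unrecognised domain id",                   \
	"invalid grant reference",                  \
	"invalid mapping handle",                   \
	"invalid virtual address",                  \
	"invalid device address",                   \
	"no spare translation slot in the I/O MMU", \
	"permission denied",                        \
	"bad page",                                 \
	"copy arguments cross page boundary",       \
	"page address size too large",              \
	"operation not done; try again"             \
}

static const char *const gnttab_msgs[] = GNTTABOP_error_msgs;

/* One message per status code, GNTST_okay (0) down to GNTST_eagain (-12). */
static_assert(sizeof(gnttab_msgs) / sizeof(gnttab_msgs[0]) == 13,
	      "message table must cover statuses 0..-12");

/* Hypothetical helper: map a GNTST_* status to its message string. */
static const char *gnttab_status_str(int16_t status)
{
	size_t n = sizeof(gnttab_msgs) / sizeof(gnttab_msgs[0]);

	/* Valid statuses are zero or negative; the index is the negated value. */
	if (status > 0 || (size_t)(-(int)status) >= n)
		return "unknown status";
	return gnttab_msgs[-(int)status];
}
```

Out-of-range values fall back to a fixed string rather than indexing past the table.
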
> diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
> new file mode 100644
> index 0000000000..1c53cad729
> --- /dev/null
> +++ b/include/xen/interface/hvm/hvm_op.h
> @@ -0,0 +1,69 @@
> +/*
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
> +#define __XEN_PUBLIC_HVM_HVM_OP_H__
> +
> +/* Get/set subcommands: the second argument of the hypercall is a
> + * pointer to a xen_hvm_param struct.
> + */
> +#define HVMOP_set_param           0
> +#define HVMOP_get_param           1
> +struct xen_hvm_param {
> +	domid_t  domid;    /* IN */
> +	u32 index;    /* IN */
> +	u64 value;    /* IN/OUT */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
> +
> +/* Hint from PV drivers for pagetable destruction. */
> +#define HVMOP_pagetable_dying       9
> +struct xen_hvm_pagetable_dying {
> +	/* Domain with a pagetable about to be destroyed. */
> +	domid_t  domid;
> +	/* guest physical address of the toplevel pagetable dying */
> +	aligned_u64 gpa;
> +};
> +
> +typedef struct xen_hvm_pagetable_dying xen_hvm_pagetable_dying_t;
> +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_pagetable_dying_t);
> +
> +enum hvmmem_type_t {
> +	HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> +	HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> +	HVMMEM_mmio_dm,            /* Reads and writes go to the device model */
> +};
> +
> +#define HVMOP_get_mem_type    15
> +/* Return hvmmem_type_t for the specified pfn. */
> +struct xen_hvm_get_mem_type {
> +	/* Domain to be queried. */
> +	domid_t domid;
> +	/* OUT variable. */
> +	u16 mem_type;
> +	u16 pad[2]; /* align next field on 8-byte boundary */
> +	/* IN variable. */
> +	u64 pfn;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_get_mem_type);
> +
> +#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
> new file mode 100644
> index 0000000000..4d61fc58d9
> --- /dev/null
> +++ b/include/xen/interface/hvm/params.h
> @@ -0,0 +1,127 @@
> +/*
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
> +#define __XEN_PUBLIC_HVM_PARAMS_H__
> +
> +#include <xen/interface/hvm/hvm_op.h>
> +
> +/*
> + * Parameter space for HVMOP_{set,get}_param.
> + */
> +
> +#define HVM_PARAM_CALLBACK_IRQ 0
> +/*
> + * How should CPU0 event-channel notifications be delivered?
> + *
> + * If val == 0 then CPU0 event-channel notifications are not delivered.
> + * If val != 0, val[63:56] encodes the type, as follows:
> + */
> +
> +#define HVM_PARAM_CALLBACK_TYPE_GSI      0
> +/*
> + * val[55:0] is a delivery GSI.  GSI 0 cannot be used, as it aliases val == 0,
> + * and disables all notifications.
> + */
> +
> +#define HVM_PARAM_CALLBACK_TYPE_PCI_INTX 1
> +/*
> + * val[55:0] is a delivery PCI INTx line:
> + * Domain = val[47:32], Bus = val[31:16], DevFn = val[15:8], IntX = val[1:0]
> + */
> +
> +#if defined(__i386__) || defined(__x86_64__)
> +#define HVM_PARAM_CALLBACK_TYPE_VECTOR   2
> +/*
> + * val[7:0] is a vector number.  Check for XENFEAT_hvm_callback_vector to know
> + * if this delivery method is available.
> + */
> +#elif defined(__arm__) || defined(__aarch64__)
> +#define HVM_PARAM_CALLBACK_TYPE_PPI      2
> +/*
> + * val[55:16] needs to be zero.
> + * val[15:8] is interrupt flag of the PPI used by event-channel:
> + *  bit 8: the PPI is edge(1) or level(0) triggered
> + *  bit 9: the PPI is active low(1) or high(0)
> + * val[7:0] is a PPI number used by event-channel.
> + * This is only used by ARM/ARM64 and masking/eoi the interrupt associated to
> + * the notification is handled by the interrupt controller.
> + */
> +#endif
> +
> +#define HVM_PARAM_STORE_PFN    1
> +#define HVM_PARAM_STORE_EVTCHN 2
> +
> +#define HVM_PARAM_PAE_ENABLED  4
> +
> +#define HVM_PARAM_IOREQ_PFN    5
> +
> +#define HVM_PARAM_BUFIOREQ_PFN 6
> +
> +/*
> + * Set mode for virtual timers (currently x86 only):
> + *  delay_for_missed_ticks (default):
> + *   Do not advance a vcpu's time beyond the correct delivery time for
> + *   interrupts that have been missed due to preemption. Deliver missed
> + *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
> + *   time stepwise for each one.
> + *  no_delay_for_missed_ticks:
> + *   As above, missed interrupts are delivered, but guest time always tracks
> + *   wallclock (i.e., real) time while doing so.
> + *  no_missed_ticks_pending:
> + *   No missed interrupts are held pending. Instead, to ensure ticks are
> + *   delivered at some non-zero rate, if we detect missed ticks then the
> + *   internal tick alarm is not disabled if the VCPU is preempted during the
> + *   next tick period.
> + *  one_missed_tick_pending:
> + *   Missed interrupts are collapsed together and delivered as one 'late tick'.
> + *   Guest time always tracks wallclock (i.e., real) time.
> + */
> +#define HVM_PARAM_TIMER_MODE   10
> +#define HVMPTM_delay_for_missed_ticks    0
> +#define HVMPTM_no_delay_for_missed_ticks 1
> +#define HVMPTM_no_missed_ticks_pending   2
> +#define HVMPTM_one_missed_tick_pending   3
> +
> +/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
> +#define HVM_PARAM_HPET_ENABLED 11
> +
> +/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
> +#define HVM_PARAM_IDENT_PT     12
> +
> +/* Device Model domain, defaults to 0. */
> +#define HVM_PARAM_DM_DOMAIN    13
> +
> +/* ACPI S state: currently support S0 and S3 on x86. */
> +#define HVM_PARAM_ACPI_S_STATE 14
> +
> +/* TSS used on Intel when CR0.PE=0. */
> +#define HVM_PARAM_VM86_TSS     15
> +
> +/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
> +#define HVM_PARAM_VPT_ALIGN    16
> +
> +/* Console debug shared memory ring and event channel */
> +#define HVM_PARAM_CONSOLE_PFN    17
> +#define HVM_PARAM_CONSOLE_EVTCHN 18
> +
> +#define HVM_NR_PARAMS          19
> +
> +#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
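
For the ARM case, the HVM_PARAM_CALLBACK_IRQ value described above can be assembled as in the sketch below. This is only an illustration of the documented bit layout: hvm_callback_ppi_val() is a hypothetical helper, not something this patch or Xen provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HVM_PARAM_CALLBACK_TYPE_PPI      2

/* Hypothetical helper: encode a PPI callback value for HVM_PARAM_CALLBACK_IRQ. */
static uint64_t hvm_callback_ppi_val(unsigned int ppi, bool edge,
				     bool active_low)
{
	uint64_t val;

	assert(ppi <= 0xff);	/* val[7:0] holds the PPI number */

	val  = (uint64_t)HVM_PARAM_CALLBACK_TYPE_PPI << 56; /* val[63:56]: type */
	val |= (uint64_t)edge << 8;       /* bit 8: edge(1) or level(0) triggered */
	val |= (uint64_t)active_low << 9; /* bit 9: active low(1) or high(0) */
	val |= ppi;                       /* val[7:0]: PPI number */

	return val;                       /* val[55:16] is left as zero */
}
```

For example, a level-triggered, active-high PPI 31 encodes to 0x020000000000001f.
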
> diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h
> new file mode 100644
> index 0000000000..7d74c99226
> --- /dev/null
> +++ b/include/xen/interface/io/blkif.h
> @@ -0,0 +1,726 @@
> +/******************************************************************************
> + * blkif.h
> + *
> + * Unified block-device I/O interface for Xen guest OSes.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2003-2004, Keir Fraser
> + * Copyright (c) 2012, Spectra Logic Corporation
> + */
> +
> +#ifndef __XEN_PUBLIC_IO_BLKIF_H__
> +#define __XEN_PUBLIC_IO_BLKIF_H__
> +
> +#include "ring.h"
> +#include "../grant_table.h"
> +
> +/*
> + * Front->back notifications: When enqueuing a new request, sending a
> + * notification can be made conditional on req_event (i.e., the generic
> + * hold-off mechanism provided by the ring macros). Backends must set
> + * req_event appropriately (e.g., using RING_FINAL_CHECK_FOR_REQUESTS()).
> + *
> + * Back->front notifications: When enqueuing a new response, sending a
> + * notification can be made conditional on rsp_event (i.e., the generic
> + * hold-off mechanism provided by the ring macros). Frontends must set
> + * rsp_event appropriately (e.g., using RING_FINAL_CHECK_FOR_RESPONSES()).
> + */
> +
> +#ifndef blkif_vdev_t
> +#define blkif_vdev_t   u16
> +#endif
> +#define blkif_sector_t u64
> +
> +/*
> + * Feature and Parameter Negotiation
> + * =================================
> + * The two halves of a Xen block driver utilize nodes within the XenStore to
> + * communicate capabilities and to negotiate operating parameters.  This
> + * section enumerates these nodes which reside in the respective front and
> + * backend portions of the XenStore, following the XenBus convention.
> + *
> + * All data in the XenStore is stored as strings.  Nodes specifying numeric
> + * values are encoded in decimal.  Integer value ranges listed below are
> + * expressed as fixed sized integer types capable of storing the conversion
> + * of a properly formatted node string, without loss of information.
> + *
> + * Any specified default value is in effect if the corresponding XenBus node
> + * is not present in the XenStore.
> + *
> + * XenStore nodes in sections marked "PRIVATE" are solely for use by the
> + * driver side whose XenBus tree contains them.
> + *
> + * XenStore nodes marked "DEPRECATED" in their notes section should only be
> + * used to provide interoperability with legacy implementations.
> + *
> + * See the XenBus state transition diagram below for details on when XenBus
> + * nodes must be published and when they can be queried.
> + *
> + *****************************************************************************
> + *                            Backend XenBus Nodes
> + *****************************************************************************
> + *
> + *------------------ Backend Device Identification (PRIVATE) ------------------
> + *
> + * mode
> + *      Values:         "r" (read only), "w" (writable)
> + *
> + *      The read or write access permissions to the backing store to be
> + *      granted to the frontend.
> + *
> + * params
> + *      Values:         string
> + *
> + *      A free formatted string providing sufficient information for the
> + *      hotplug script to attach the device and provide a suitable
> + *      handler (i.e., a block device) for blkback to use.
> + *
> + * physical-device
> + *      Values:         "MAJOR:MINOR"
> + *      Notes: 11
> + *
> + *      MAJOR and MINOR are the major number and minor number of the
> + *      backing device respectively.
> + *
> + * physical-device-path
> + *      Values:         path string
> + *
> + *      A string that contains the absolute path to the disk image. On
> + *      NetBSD and Linux this is always a block device, while on FreeBSD
> + *      it can be either a block device or a regular file.
> + *
> + * type
> + *      Values:         "file", "phy", "tap"
> + *
> + *      The type of the backing device/object.
> + *
> + *
> + * direct-io-safe
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      The underlying storage is not affected by the direct IO memory
> + *      lifetime bug.  See:
> + *
> + *      http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
> + *
> + *      Therefore this option gives the backend permission to use
> + *      O_DIRECT, notwithstanding that bug.
> + *
> + *      That is, if this option is enabled, use of O_DIRECT is safe,
> + *      in circumstances where we would normally have avoided it as a
> + *      workaround for that bug.  This option is not relevant for all
> + *      backends, and even not necessarily supported for those for
> + *      which it is relevant.  A backend which knows that it is not
> + *      affected by the bug can ignore this option.
> + *
> + *      This option doesn't require a backend to use O_DIRECT, so it
> + *      should not be used to try to control the caching behaviour.
> + *
> + *--------------------------------- Features ---------------------------------
> + *
> + * feature-barrier
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      A value of "1" indicates that the backend can process requests
> + *      containing the BLKIF_OP_WRITE_BARRIER request opcode.  Requests
> + *      of this type may still be returned at any time with the
> + *      BLKIF_RSP_EOPNOTSUPP result code.
> + *
> + * feature-flush-cache
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      A value of "1" indicates that the backend can process requests
> + *      containing the BLKIF_OP_FLUSH_DISKCACHE request opcode.  Requests
> + *      of this type may still be returned at any time with the
> + *      BLKIF_RSP_EOPNOTSUPP result code.
> + *
> + * feature-discard
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      A value of "1" indicates that the backend can process requests
> + *      containing the BLKIF_OP_DISCARD request opcode.  Requests
> + *      of this type may still be returned at any time with the
> + *      BLKIF_RSP_EOPNOTSUPP result code.
> + *
> + * feature-persistent
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *      Notes: 7
> + *
> + *      A value of "1" indicates that the backend can keep the grants used
> + *      by the frontend driver mapped, so the same set of grants should be
> + *      used in all transactions. The maximum number of grants the backend
> + *      can map persistently depends on the implementation, but ideally it
> + *      should be RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST. Using this
> + *      feature the backend doesn't need to unmap each grant, preventing
> + *      costly TLB flushes. The backend driver should only map grants
> + *      persistently if the frontend supports it. If a backend driver chooses
> + *      to use the persistent protocol when the frontend doesn't support it,
> + *      it will probably hit the maximum number of persistently mapped grants
> + *      (due to the fact that the frontend won't be reusing the same grants),
> + *      and fall back to non-persistent mode. Backend implementations may
> + *      shrink or expand the number of persistently mapped grants without
> + *      notifying the frontend depending on memory constraints (this might
> + *      cause a performance degradation).
> + *
> + *      If a backend driver wants to limit the maximum number of persistently
> + *      mapped grants to a value less than RING_SIZE *
> + *      BLKIF_MAX_SEGMENTS_PER_REQUEST an LRU strategy should be used to
> + *      discard the grants that are less commonly used. Using an LRU in the
> + *      backend driver paired with a LIFO queue in the frontend will
> + *      allow us to have better performance in this scenario.
> + *
> + *----------------------- Request Transport Parameters ------------------------
> + *
> + * max-ring-page-order
> + *      Values:         <uint32_t>
> + *      Default Value:  0
> + *      Notes:          1, 3
> + *
> + *      The maximum supported size of the request ring buffer in units of
> + *      log2(machine pages) (e.g. 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
> + *      etc.).
> + *
> + * max-ring-pages
> + *      Values:         <uint32_t>
> + *      Default Value:  1
> + *      Notes:          DEPRECATED, 2, 3
> + *
> + *      The maximum supported size of the request ring buffer in units of
> + *      machine pages.  The value must be a power of 2.
> + *
> + *------------------------- Backend Device Properties -------------------------
> + *
> + * discard-enable
> + *      Values:         0/1 (boolean)
> + *      Default Value:  1
> + *
> + *      This optional property, set by the toolstack, instructs the backend
> + *      to offer (or not to offer) discard to the frontend. If the property
> + *      is missing the backend should offer discard if the backing storage
> + *      actually supports it.
> + *
> + * discard-alignment
> + *      Values:         <uint32_t>
> + *      Default Value:  0
> + *      Notes:          4, 5
> + *
> + *      The offset, in bytes from the beginning of the virtual block device,
> + *      to the first, addressable, discard extent on the underlying device.
> + *
> + * discard-granularity
> + *      Values:         <uint32_t>
> + *      Default Value:  <"sector-size">
> + *      Notes:          4
> + *
> + *      The size, in bytes, of the individually addressable discard extents
> + *      of the underlying device.
> + *
> + * discard-secure
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *      Notes:          10
> + *
> + *      A value of "1" indicates that the backend can process BLKIF_OP_DISCARD
> + *      requests with the BLKIF_DISCARD_SECURE flag set.
> + *
> + * info
> + *      Values:         <uint32_t> (bitmap)
> + *
> + *      A collection of bit flags describing attributes of the backing
> + *      device.  The VDISK_* macros define the meaning of each bit
> + *      location.
> + *
> + * sector-size
> + *      Values:         <uint32_t>
> + *
> + *      The logical block size, in bytes, of the underlying storage. This
> + *      must be a power of two with a minimum value of 512.
> + *
> + *      NOTE: Because of implementation bugs in some frontends this must be
> + *            set to 512, unless the frontend advertises a non-zero value
> + *            in its "feature-large-sector-size" xenbus node. (See below).
> + *
> + * physical-sector-size
> + *      Values:         <uint32_t>
> + *      Default Value:  <"sector-size">
> + *
> + *      The physical block size, in bytes, of the backend storage. This
> + *      must be an integer multiple of "sector-size".
> + *
> + * sectors
> + *      Values:         <u64>
> + *
> + *      The size of the backend device, expressed in units of "sector-size".
> + *      The product of "sector-size" and "sectors" must also be an integer
> + *      multiple of "physical-sector-size", if that node is present.
> + *
> + *****************************************************************************
> + *                            Frontend XenBus Nodes
> + *****************************************************************************
> + *
> + *----------------------- Request Transport Parameters -----------------------
> + *
> + * event-channel
> + *      Values:         <uint32_t>
> + *
> + *      The identifier of the Xen event channel used to signal activity
> + *      in the ring buffer.
> + *
> + * ring-ref
> + *      Values:         <uint32_t>
> + *      Notes:          6
> + *
> + *      The Xen grant reference granting permission for the backend to map
> + *      the sole page in a single page sized ring buffer.
> + *
> + * ring-ref%u
> + *      Values:         <uint32_t>
> + *      Notes:          6
> + *
> + *      For a frontend providing a multi-page ring, a "number of ring pages"
> + *      sized list of nodes, each containing a Xen grant reference granting
> + *      permission for the backend to map the page of the ring located
> + *      at page index "%u".  Page indexes are zero based.
> + *
> + * protocol
> + *      Values:         string (XEN_IO_PROTO_ABI_*)
> + *      Default Value:  XEN_IO_PROTO_ABI_NATIVE
> + *
> + *      The machine ABI rules governing the format of all ring request and
> + *      response structures.
> + *
> + * ring-page-order
> + *      Values:         <uint32_t>
> + *      Default Value:  0
> + *      Maximum Value:  MAX(ffs(max-ring-pages) - 1, max-ring-page-order)
> + *      Notes:          1, 3
> + *
> + *      The size of the frontend allocated request ring buffer in units
> + *      of log2(machine pages) (e.g. 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
> + *      etc.).
> + *
> + * num-ring-pages
> + *      Values:         <uint32_t>
> + *      Default Value:  1
> + *      Maximum Value:  MAX(max-ring-pages,(0x1 << max-ring-page-order))
> + *      Notes:          DEPRECATED, 2, 3
> + *
> + *      The size of the frontend allocated request ring buffer in units of
> + *      machine pages.  The value must be a power of 2.
> + *
> + *--------------------------------- Features ---------------------------------
> + *
> + * feature-persistent
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *      Notes: 7, 8, 9
> + *
> + *      A value of "1" indicates that the frontend will reuse the same grants
> + *      for all transactions, allowing the backend to map them with write
> + *      access (even when it should be read-only). If the frontend hits the
> + *      maximum number of allowed persistently mapped grants, it can fall back
> + *      to non persistent mode. This will cause a performance degradation,
> + *      since the backend driver will still try to map those grants
> + *      persistently. Since the persistent grants protocol is compatible with
> + *      the previous protocol, a frontend driver can choose to work in
> + *      persistent mode even when the backend doesn't support it.
> + *
> + *      It is recommended that the frontend driver stores the persistently
> + *      mapped grants in a LIFO queue, so a subset of all persistently mapped
> + *      grants gets used commonly. This is done in case the backend driver
> + *      decides to limit the maximum number of persistently mapped grants
> + *      to a value less than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
> + *
> + * feature-large-sector-size
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      A value of "1" indicates that the frontend will correctly supply and
> + *      interpret all sector-based quantities in terms of the "sector-size"
> + *      value supplied in the backend info, whatever that may be set to.
> + *      If this node is not present or its value is "0" then it is assumed
> + *      that the frontend requires that the logical block size is 512 as it
> + *      is hardcoded (which is the case in some frontend implementations).
> + *
> + *------------------------- Virtual Device Properties -------------------------
> + *
> + * device-type
> + *      Values:         "disk", "cdrom", "floppy", etc.
> + *
> + * virtual-device
> + *      Values:         <uint32_t>
> + *
> + *      A value indicating the physical device to virtualize within the
> + *      frontend's domain.  (e.g. "The first ATA disk", "The third SCSI
> + *      disk", etc.)
> + *
> + *      See docs/misc/vbd-interface.txt for details on the format of this
> + *      value.
> + *
> + * Notes
> + * -----
> + * (1) Multi-page ring buffer scheme first developed in the Citrix XenServer
> + *     PV drivers.
> + * (2) Multi-page ring buffer scheme first used in some RedHat distributions
> + *     including a distribution deployed on certain nodes of the Amazon
> + *     EC2 cluster.
> + * (3) Support for multi-page ring buffers was implemented independently,
> + *     in slightly different forms, by both Citrix and RedHat/Amazon.
> + *     For full interoperability, block front and backends should publish
> + *     identical ring parameters, adjusted for unit differences, to the
> + *     XenStore nodes used in both schemes.
> + * (4) Devices that support discard functionality may internally allocate space
> + *     (discardable extents) in units that are larger than the exported logical
> + *     block size. If the backing device has such discardable extents the
> + *     backend should provide both discard-granularity and discard-alignment.
> + *     Providing just one of the two may be considered an error by the frontend.
> + *     Backends supporting discard should include discard-granularity and
> + *     discard-alignment even if it supports discarding individual sectors.
> + *     Frontends should assume discard-alignment == 0 and discard-granularity
> + *     == sector size if these keys are missing.
> + * (5) The discard-alignment parameter allows a physical device to be
> + *     partitioned into virtual devices that do not necessarily begin or
> + *     end on a discardable extent boundary.
> + * (6) When there is only a single page allocated to the request ring,
> + *     'ring-ref' is used to communicate the grant reference for this
> + *     page to the backend.  When using a multi-page ring, the 'ring-ref'
> + *     node is not created.  Instead 'ring-ref0' - 'ring-refN' are used.
> + * (7) When using persistent grants data has to be copied from/to the page
> + *     where the grant is currently mapped. The overhead of doing this copy
> + *     however doesn't outweigh the speed improvement of not having to unmap
> + *     the grants.
> + * (8) The frontend driver has to allow the backend driver to map all grants
> + *     with write access, even when they should be mapped read-only, since
> + *     further requests may reuse these grants and require write permissions.
> + * (9) Linux implementation doesn't have a limit on the maximum number of
> + *     grants that can be persistently mapped in the frontend driver, but
> + *     due to the frontend driver implementation it should never be bigger
> + *     than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
> + *(10) The discard-secure property may be present and will be set to 1 if the
> + *     backing device supports secure discard.
> + *(11) Only used by Linux and NetBSD.
> + */
> +
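
A note on the two ring-size schemes quoted above: the "Maximum Value" formula
for ring-page-order can be sketched in C as follows. This is only an
illustration under stated assumptions, not code from this patch: the ex_*
names are hypothetical, the two inputs are assumed to have already been read
from the backend's "max-ring-pages" and "max-ring-page-order" nodes, and
ex_ffs() is a local stand-in with the same contract as POSIX ffs().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper with the same contract as POSIX ffs(): 1-based
 * position of the lowest set bit, or 0 when no bit is set. */
static int ex_ffs(uint32_t v)
{
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/* MAX(ffs(max-ring-pages) - 1, max-ring-page-order), as in the quoted text. */
static uint32_t ex_max_ring_page_order(uint32_t max_ring_pages,
				       uint32_t max_ring_page_order)
{
	uint32_t from_pages;

	/* max-ring-pages must be a power of 2 (and therefore non-zero). */
	assert(max_ring_pages && !(max_ring_pages & (max_ring_pages - 1)));
	from_pages = (uint32_t)ex_ffs(max_ring_pages) - 1;
	return from_pages > max_ring_page_order ? from_pages
						: max_ring_page_order;
}

/* Convert a ring-page-order value to a page count (order 2 means 4 pages). */
static uint32_t ex_ring_pages(uint32_t order)
{
	return UINT32_C(1) << order;
}
```

With the documented defaults (max-ring-pages = 1, max-ring-page-order = 0)
this yields order 0, i.e. a single-page ring.
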
> +/*
> + * Multiple hardware queues/rings:
> + * If supported, the backend will write the key "multi-queue-max-queues" to
> + * the directory for that vbd, and set its value to the maximum supported
> + * number of queues.
> + * Frontends that are aware of this feature and wish to use it can write the
> + * key "multi-queue-num-queues" with the number they wish to use, which must be
> + * greater than zero, and no more than the value reported by the backend in
> + * "multi-queue-max-queues".
> + *
> + * For frontends requesting just one queue, the usual event-channel and
> + * ring-ref keys are written as before, simplifying the backend processing
> + * to avoid distinguishing between a frontend that doesn't understand the
> + * multi-queue feature, and one that does, but requested only one queue.
> + *
> + * Frontends requesting two or more queues must not write the toplevel
> + * event-channel and ring-ref keys, instead writing those keys under sub-keys
> + * having the name "queue-N" where N is the integer ID of the queue/ring for
> + * which those keys belong. Queues are indexed from zero.
> + * For example, a frontend with two queues must write the following set of
> + * queue-related keys:
> + *
> + * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
> + * /local/domain/1/device/vbd/0/queue-0 = ""
> + * /local/domain/1/device/vbd/0/queue-0/ring-ref = "<ring-ref#0>"
> + * /local/domain/1/device/vbd/0/queue-0/event-channel = "<evtchn#0>"
> + * /local/domain/1/device/vbd/0/queue-1 = ""
> + * /local/domain/1/device/vbd/0/queue-1/ring-ref = "<ring-ref#1>"
> + * /local/domain/1/device/vbd/0/queue-1/event-channel = "<evtchn#1>"
> + *
> + * It is also possible to use multiple queues/rings together with
> + * the multi-page ring buffer feature.
> + * For example, a frontend that requests two queues/rings, each with a ring
> + * buffer of two pages, must write the following set of related keys:
> + *
> + * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
> + * /local/domain/1/device/vbd/0/ring-page-order = "1"
> + * /local/domain/1/device/vbd/0/queue-0 = ""
> + * /local/domain/1/device/vbd/0/queue-0/ring-ref0 = "<ring-ref#0>"
> + * /local/domain/1/device/vbd/0/queue-0/ring-ref1 = "<ring-ref#1>"
> + * /local/domain/1/device/vbd/0/queue-0/event-channel = "<evtchn#0>"
> + * /local/domain/1/device/vbd/0/queue-1 = ""
> + * /local/domain/1/device/vbd/0/queue-1/ring-ref0 = "<ring-ref#2>"
> + * /local/domain/1/device/vbd/0/queue-1/ring-ref1 = "<ring-ref#3>"
> + * /local/domain/1/device/vbd/0/queue-1/event-channel = "<evtchn#1>"
> + *
> + */
> +
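
The key naming rules quoted above (toplevel keys for one queue, "queue-N"
sub-keys for several, "ring-ref" versus "ring-ref%u" depending on the ring
size) could be captured in a small formatter like the one below. This is a
hedged sketch: ex_ring_ref_key() is a hypothetical name, it only builds the
key relative to the device directory, and writing it to XenStore is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the XenStore key, relative to the vbd
 * directory, that holds the grant reference for ring page 'page' of
 * queue 'queue'. Returns what snprintf() returns.
 */
static int ex_ring_ref_key(char *buf, size_t len, unsigned int num_queues,
			   unsigned int queue, unsigned int ring_pages,
			   unsigned int page)
{
	char prefix[32] = "";

	assert(buf && len);

	/* Two or more queues: keys live under "queue-N/". */
	if (num_queues > 1)
		snprintf(prefix, sizeof(prefix), "queue-%u/", queue);

	/* Multi-page ring: "ring-ref0".."ring-refN" instead of "ring-ref". */
	if (ring_pages > 1)
		return snprintf(buf, len, "%sring-ref%u", prefix, page);

	return snprintf(buf, len, "%sring-ref", prefix);
}
```

For the second example in the quoted comment (two queues, two pages each),
queue 1 / page 1 gives "queue-1/ring-ref1".
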
> +/*
> + * STATE DIAGRAMS
> + *
> + *****************************************************************************
> + *                                   Startup                                 *
> + *****************************************************************************
> + *
> + * Tool stack creates front and back nodes with state XenbusStateInitialising.
> + *
> + * Front                                Back
> + * =================================    =====================================
> + * XenbusStateInitialising              XenbusStateInitialising
> + *  o Query virtual device               o Query backend device identification
> + *    properties.                          data.
> + *  o Setup OS device instance.          o Open and validate backend device.
> + *                                       o Publish backend features and
> + *                                         transport parameters.
> + *                                                      |
> + *                                                      |
> + *                                                      V
> + *                                      XenbusStateInitWait
> + *
> + * o Query backend features and
> + *   transport parameters.
> + * o Allocate and initialize the
> + *   request ring.
> + * o Publish transport parameters
> + *   that will be in effect during
> + *   this connection.
> + *              |
> + *              |
> + *              V
> + * XenbusStateInitialised
> + *
> + *                                       o Query frontend transport parameters.
> + *                                       o Connect to the request ring and
> + *                                         event channel.
> + *                                       o Publish backend device properties.
> + *                                                      |
> + *                                                      |
> + *                                                      V
> + *                                      XenbusStateConnected
> + *
> + *  o Query backend device properties.
> + *  o Finalize OS virtual device
> + *    instance.
> + *              |
> + *              |
> + *              V
> + * XenbusStateConnected
> + *
> + * Note: Drivers that do not support any optional features, or the negotiation
> + *       of transport parameters, can skip certain states in the state machine:
> + *
> + *       o A frontend may transition to XenbusStateInitialised without
> + *         waiting for the backend to enter XenbusStateInitWait.  In this
> + *         case, default transport parameters are in effect and any
> + *         transport parameters published by the frontend must contain
> + *         their default values.
> + *
> + *       o A backend may transition to XenbusStateInitialised, bypassing
> + *         XenbusStateInitWait, without waiting for the frontend to first
> + *         enter the XenbusStateInitialised state.  In this case, default
> + *         transport parameters are in effect and any transport parameters
> + *         published by the backend must contain their default values.
> + *
> + *       Drivers that support optional features and/or transport parameter
> + *       negotiation must tolerate these additional state transition paths.
> + *       In general this means performing the work of any skipped state
> + *       transition, if it has not already been performed, in addition to the
> + *       work associated with entry into the current state.
> + */
> +
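
To make the frontend side of the startup diagram quoted above concrete, here
is a minimal sketch of a "next state" function. It is an assumption-laden
illustration, not the driver in this series: the ex_* enum is a local
stand-in for the XenbusState values, and the actual work done on each
transition (querying nodes, allocating the ring, publishing parameters) is
omitted. It also tolerates a backend that has skipped ahead, as the note in
the quoted comment requires.

```c
/* Local stand-ins for the XenbusState values used in the diagram. */
enum ex_state {
	EX_INITIALISING = 1,
	EX_INIT_WAIT,
	EX_INITIALISED,
	EX_CONNECTED,
};

/*
 * Hypothetical helper: given the frontend's current state and the state
 * observed on the backend, return the state the frontend should move to.
 */
static enum ex_state ex_front_next(enum ex_state front, enum ex_state back)
{
	/* Backend has published its features: set up the ring and advance. */
	if (front == EX_INITIALISING && back >= EX_INIT_WAIT)
		return EX_INITIALISED;

	/* Backend is connected to the ring: finalize the device. */
	if (front == EX_INITIALISED && back == EX_CONNECTED)
		return EX_CONNECTED;

	/* Otherwise keep waiting in the current state. */
	return front;
}
```

Calling it repeatedly as the backend state changes walks the frontend through
Initialising, Initialised and Connected in the order shown in the diagram.
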
> +/*
> + * REQUEST CODES.
> + */
> +#define BLKIF_OP_READ              0
> +#define BLKIF_OP_WRITE             1
> +/*
> + * All writes issued prior to a request with the BLKIF_OP_WRITE_BARRIER
> + * operation code ("barrier request") must be completed prior to the
> + * execution of the barrier request.  All writes issued after the barrier
> + * request must not execute until after the completion of the barrier request.
> + *
> + * Optional.  See "feature-barrier" XenBus node documentation above.
> + */
> +#define BLKIF_OP_WRITE_BARRIER     2
> +/*
> + * Commit any uncommitted contents of the backing device's volatile cache
> + * to stable storage.
> + *
> + * Optional.  See "feature-flush-cache" XenBus node documentation above.
> + */
> +#define BLKIF_OP_FLUSH_DISKCACHE   3
> +/*
> + * Used in SLES sources for device specific command packet
> + * contained within the request. Reserved for that purpose.
> + */
> +#define BLKIF_OP_RESERVED_1        4
> +/*
> + * Indicate to the backend device that a region of storage is no longer in
> + * use, and may be discarded at any time without impact to the client.  If
> + * the BLKIF_DISCARD_SECURE flag is set on the request, all copies of the
> + * discarded region on the device must be rendered unrecoverable before the
> + * command returns.
> + *
> + * This operation is analogous to performing a trim (ATA) or unmap (SCSI)
> + * command on a native device.
> + *
> + * More information about trim/unmap operations can be found at:
> + * http://t13.org/Documents/UploadedDocuments/docs2008/
> + *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
> + * http://www.seagate.com/staticfiles/support/disc/manuals/
> + *     Interface%20manuals/100293068c.pdf
> + *
> + * Optional.  See "feature-discard", "discard-alignment",
> + * "discard-granularity", and "discard-secure" in the XenBus node
> + * documentation above.
> + */
> +#define BLKIF_OP_DISCARD           5
> +
> +/*
> + * Recognized if "feature-max-indirect-segments" is present in the backend
> + * xenbus info. The "feature-max-indirect-segments" node contains the maximum
> + * number of segments allowed by the backend per request. If the node is
> + * present, the frontend might use blkif_request_indirect structs in order to
> + * issue requests with more than BLKIF_MAX_SEGMENTS_PER_REQUEST (11). The
> + * maximum number of indirect segments is fixed by the backend, but the
> + * frontend can issue requests with any number of indirect segments as long as
> + * it's less than the number provided by the backend. The indirect_grefs field
> + * in blkif_request_indirect should be filled by the frontend with the
> + * grant references of the pages that are holding the indirect segments.
> + * These pages are filled with an array of blkif_request_segment that hold the
> + * information about the segments. The number of indirect pages to use is
> + * determined by the number of segments an indirect request contains. Every
> + * indirect page can contain a maximum of
> + * (PAGE_SIZE / sizeof(struct blkif_request_segment)) segments, so to
> + * calculate the number of indirect pages to use we have to do
> + * ceil(indirect_segments / (PAGE_SIZE / sizeof(struct blkif_request_segment))).
> + *
> + * If a backend does not recognize BLKIF_OP_INDIRECT, it should *not*
> + * create the "feature-max-indirect-segments" node!
> + */
> +#define BLKIF_OP_INDIRECT          6
> +
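
The ceil() expression in the comment quoted above can be written with integer
arithmetic only. The sketch below assumes 4 KiB pages and uses a local
ex_segment struct with the same members as blkif_request_segment (8 bytes on
common ABIs, checked at compile time); all ex_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Local stand-in with the same members as struct blkif_request_segment. */
struct ex_segment {
	uint32_t gref;
	uint8_t first_sect, last_sect;
};

_Static_assert(sizeof(struct ex_segment) == 8,
	       "sketch assumes an 8-byte segment descriptor");

#define EX_SEGS_PER_INDIRECT_PAGE (EX_PAGE_SIZE / sizeof(struct ex_segment))

/* ceil(nr_segments / EX_SEGS_PER_INDIRECT_PAGE) without floating point. */
static unsigned int ex_indirect_pages(unsigned int nr_segments)
{
	return (unsigned int)((nr_segments + EX_SEGS_PER_INDIRECT_PAGE - 1) /
			      EX_SEGS_PER_INDIRECT_PAGE);
}
```

Under these assumptions one indirect page holds 512 segment descriptors, so
the 8 pages allowed by BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST cover up to 4096
segments.
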
> +/*
> + * Maximum scatter/gather segments per request.
> + * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
> + * NB. This could be 12 if the ring indexes weren't stored in the same page.
> + */
> +#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
> +
> +/*
> + * Maximum number of indirect pages to use per request.
> + */
> +#define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
> +
> +/*
> + * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
> + * 'sector_number' in blkif_request, blkif_request_discard and
> + * blkif_request_indirect are sector-based quantities. See the description
> + * of the "feature-large-sector-size" frontend xenbus node above for
> + * more information.
> + */
> +struct blkif_request_segment {
> +	grant_ref_t gref;        /* reference to I/O buffer frame        */
> +	/* @first_sect: first sector in frame to transfer (inclusive).   */
> +	/* @last_sect: last sector in frame to transfer (inclusive).     */
> +	u8     first_sect, last_sect;
> +};
> +
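
Since first_sect and last_sect are both inclusive, the amount of data a
segment describes is easy to get off by one. A minimal sketch, assuming the
512-byte sector unit that applies when "feature-large-sector-size" is not
negotiated; ex_segment_bytes() is a hypothetical helper, not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define EX_SECTOR_SIZE 512u	/* assumption: 512-byte sector units */

/* Number of bytes covered by sectors first_sect..last_sect, inclusive. */
static uint32_t ex_segment_bytes(uint8_t first_sect, uint8_t last_sect)
{
	assert(first_sect <= last_sect);
	return ((uint32_t)last_sect - first_sect + 1) * EX_SECTOR_SIZE;
}
```

A segment covering a whole 4 KiB frame therefore uses first_sect = 0 and
last_sect = 7.
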
> +/*
> + * Starting ring element for any I/O request.
> + */
> +struct blkif_request {
> +	u8        operation;    /* BLKIF_OP_???                         */
> +	u8        nr_segments;  /* number of segments                   */
> +	blkif_vdev_t   handle;       /* only for read/write requests         */
> +	u64       id;           /* private guest value, echoed in resp  */
> +	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> +	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +};
> +
> +typedef struct blkif_request blkif_request_t;
> +
> +/*
> + * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD
> + * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
> + */
> +struct blkif_request_discard {
> +	u8        operation;    /* BLKIF_OP_DISCARD                     */
> +	u8        flag;         /* BLKIF_DISCARD_SECURE or zero         */
> +#define BLKIF_DISCARD_SECURE (1 << 0)  /* ignored if discard-secure=0 */
> +	blkif_vdev_t   handle;       /* same as for read/write requests      */
> +	u64       id;           /* private guest value, echoed in resp  */
> +	blkif_sector_t sector_number;/* start sector idx on disk             */
> +	u64       nr_sectors;   /* number of contiguous sectors to discard*/
> +};
> +
> +typedef struct blkif_request_discard blkif_request_discard_t;
> +
> +struct blkif_request_indirect {
> +	u8        operation;    /* BLKIF_OP_INDIRECT                    */
> +	u8        indirect_op;  /* BLKIF_OP_{READ/WRITE}                */
> +	u16       nr_segments;  /* number of segments                   */
> +	u64       id;           /* private guest value, echoed in resp  */
> +	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> +	blkif_vdev_t   handle;       /* same as for read/write requests      */
> +	grant_ref_t indirect_grefs[BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST];
> +#ifdef __i386__
> +	u64       pad;          /* Make it 64 byte aligned on i386      */
> +#endif
> +};
> +
> +typedef struct blkif_request_indirect blkif_request_indirect_t;
> +
> +struct blkif_response {
> +	u64        id;              /* copied from request */
> +	u8         operation;       /* copied from request */
> +	s16         status;          /* BLKIF_RSP_???       */
> +};
> +
> +typedef struct blkif_response blkif_response_t;
> +
> +/*
> + * STATUS RETURN CODES.
> + */
> + /* Operation not supported (only happens on barrier writes). */
> +#define BLKIF_RSP_EOPNOTSUPP  -2
> + /* Operation failed for some unspecified reason (-EIO). */
> +#define BLKIF_RSP_ERROR       -1
> + /* Operation completed successfully. */
> +#define BLKIF_RSP_OKAY         0
> +
> +/*
> + * Generate blkif ring structures and types.
> + */
> +DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response);
> +
> +#define VDISK_CDROM        0x1
> +#define VDISK_REMOVABLE    0x2
> +#define VDISK_READONLY     0x4
> +
> +#endif /* __XEN_PUBLIC_IO_BLKIF_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/include/xen/interface/io/console.h b/include/xen/interface/io/console.h
> new file mode 100644
> index 0000000000..3489fc7a60
> --- /dev/null
> +++ b/include/xen/interface/io/console.h
> @@ -0,0 +1,56 @@
> +/******************************************************************************
> + * console.h
> + *
> + * Console I/O interface for Xen guest OSes.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2005, Keir Fraser
> + */
> +
> +#ifndef __XEN_PUBLIC_IO_CONSOLE_H__
> +#define __XEN_PUBLIC_IO_CONSOLE_H__
> +
> +typedef u32 XENCONS_RING_IDX;
> +
> +#define MASK_XENCONS_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))
> +
> +struct xencons_interface {
> +	char in[1024];
> +	char out[2048];
> +	XENCONS_RING_IDX in_cons, in_prod;
> +	XENCONS_RING_IDX out_cons, out_prod;
> +};
> +
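
For anyone reading the console ring layout quoted above for the first time,
this is roughly how a producer would queue bytes into the "out" ring using the
free-running indexes and the size-based mask. It is a sketch under
assumptions: the ex_* types mirror the quoted definitions locally, and the
memory barriers and event-channel notification a real driver needs are
deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t EX_RING_IDX;

/* Same idea as MASK_XENCONS_IDX: ring sizes are powers of two. */
#define EX_MASK_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))

/* Local mirror of struct xencons_interface. */
struct ex_cons_interface {
	char in[1024];
	char out[2048];
	EX_RING_IDX in_cons, in_prod;
	EX_RING_IDX out_cons, out_prod;
};

/*
 * Hypothetical helper: queue up to 'len' bytes into the output ring and
 * return how many were accepted. Barriers and notification are omitted.
 */
static size_t ex_cons_write(struct ex_cons_interface *intf,
			    const char *data, size_t len)
{
	EX_RING_IDX cons = intf->out_cons;
	EX_RING_IDX prod = intf->out_prod;
	size_t sent = 0;

	/* (prod - cons) is the number of bytes not yet consumed. */
	while (sent < len && (prod - cons) < sizeof(intf->out))
		intf->out[EX_MASK_IDX(prod++, intf->out)] = data[sent++];

	intf->out_prod = prod;
	return sent;
}
```

The indexes are never wrapped explicitly; only the array subscript is masked,
which is why the unsigned difference prod - cons stays valid across overflow.
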
> +#ifdef XEN_WANT_FLEX_CONSOLE_RING
> +#include "ring.h"
> +DEFINE_XEN_FLEX_RING(xencons);
> +#endif
> +
> +#endif /* __XEN_PUBLIC_IO_CONSOLE_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/include/xen/interface/io/protocols.h b/include/xen/interface/io/protocols.h
> new file mode 100644
> index 0000000000..52b4de0f81
> --- /dev/null
> +++ b/include/xen/interface/io/protocols.h
> @@ -0,0 +1,42 @@
> +/******************************************************************************
> + * protocols.h
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2008, Keir Fraser
> + */
> +
> +#ifndef __XEN_PROTOCOLS_H__
> +#define __XEN_PROTOCOLS_H__
> +
> +#define XEN_IO_PROTO_ABI_X86_32     "x86_32-abi"
> +#define XEN_IO_PROTO_ABI_X86_64     "x86_64-abi"
> +#define XEN_IO_PROTO_ABI_ARM        "arm-abi"
> +
> +#if defined(__i386__)
> +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_32
> +#elif defined(__x86_64__)
> +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_64
> +#elif defined(__arm__) || defined(__aarch64__)
> +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_ARM
> +#else
> +# error arch fixup needed here
> +#endif
> +
> +#endif
> diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
> new file mode 100644
> index 0000000000..4e02678e3c
> --- /dev/null
> +++ b/include/xen/interface/io/ring.h
> @@ -0,0 +1,479 @@
> +/******************************************************************************
> + * ring.h
> + *
> + * Shared producer-consumer ring macros.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Tim Deegan and Andrew Warfield November 2004.
> + */
> +
> +#ifndef __XEN_PUBLIC_IO_RING_H__
> +#define __XEN_PUBLIC_IO_RING_H__
> +
> +/*
> + * When #include'ing this header, you need to provide the following
> + * declaration upfront:
> + * - standard integer types (u8, u16, etc)
> + * They are provided by stdint.h of the standard headers.
> + *
> + * In addition, if you intend to use the FLEX macros, you also need to
> + * provide the following, before invoking the FLEX macros:
> + * - size_t
> + * - memcpy
> + * - grant_ref_t
> + * These declarations are provided by string.h of the standard headers,
> + * and grant_table.h from the Xen public headers.
> + */
> +
> +#include <xen/interface/grant_table.h>
> +
> +typedef unsigned int RING_IDX;
> +
> +/* Round a 32-bit unsigned constant down to the nearest power of two. */
> +#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2                  : ((_x) & 0x1))
> +#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2    : __RD2(_x))
> +#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4    : __RD4(_x))
> +#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8    : __RD8(_x))
> +#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))
> +
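
The __RD32() chain quoted above rounds a constant down to a power of two at
compile time. For readers who want to sanity-check ring sizes by hand, a
run-time equivalent might look like the sketch below; ex_round_down_pow2() is
a hypothetical name and this loop is not how the macros themselves work.

```c
#include <stdint.h>

/*
 * Round x down to the nearest power of two; returns 0 for x == 0, which is
 * also what the __RD32() macro chain evaluates to for a zero argument.
 */
static uint32_t ex_round_down_pow2(uint32_t x)
{
	uint32_t p = 1;

	if (!x)
		return 0;
	/* Comparing against x / 2 avoids overflowing p for large x. */
	while (p <= x / 2)
		p <<= 1;
	return p;
}
```

As a worked example, and assuming a 4096-byte page, the 64-byte ring header
implied by the sring layout and a 112-byte ring entry (the entry size is an
assumption about the ABI), (4096 - 64) / 112 = 36 entries fit, which rounds
down to a ring of 32 entries.
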
> +/*
> + * Calculate size of a shared ring, given the total available space for the
> + * ring and indexes (_sz), and the name tag of the request/response structure.
> + * A ring contains as many entries as will fit, rounded down to the nearest
> + * power of two (so we can mask with (size-1) to loop around).
> + */
> +#define __CONST_RING_SIZE(_s, _sz) \
> +	(__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \
> +		sizeof(((struct _s##_sring *)0)->ring[0])))
> +/*
> + * The same for passing in an actual pointer instead of a name tag.
> + */
> +#define __RING_SIZE(_s, _sz) \
> +	(__RD32(((_sz) - (long)(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
> +
> +/*
> + * Macros to make the correct C datatypes for a new kind of ring.
> + *
> + * To make a new ring datatype, you need to have two message structures,
> + * let's say request_t, and response_t already defined.
> + *
> + * In a header where you want the ring datatype declared, you then do:
> + *
> + *     DEFINE_RING_TYPES(mytag, request_t, response_t);
> + *
> + * These expand out to give you a set of types, as you can see below.
> + * The most important of these are:
> + *
> + *     mytag_sring_t      - The shared ring.
> + *     mytag_front_ring_t - The 'front' half of the ring.
> + *     mytag_back_ring_t  - The 'back' half of the ring.
> + *
> + * To initialize a ring in your code you need to know the location and size
> + * of the shared memory area (PAGE_SIZE, for instance). To initialise
> + * the front half:
> + *
> + *     mytag_front_ring_t front_ring;
> + *     SHARED_RING_INIT((mytag_sring_t *)shared_page);
> + *     FRONT_RING_INIT(&front_ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
> + *
> + * Initializing the back follows similarly (note that only the front
> + * initializes the shared ring):
> + *
> + *     mytag_back_ring_t back_ring;
> + *     BACK_RING_INIT(&back_ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
> + */
> +
> +#define DEFINE_RING_TYPES(__name, __req_t, __rsp_t) \
> +	\
> +/* Shared ring entry */ \
> +union __name##_sring_entry { \
> +	__req_t req; \
> +	__rsp_t rsp; \
> +}; \
> +	\
> +/* Shared ring page */ \
> +struct __name##_sring { \
> +	RING_IDX req_prod, req_event; \
> +	RING_IDX rsp_prod, rsp_event; \
> +	union { \
> +		struct { \
> +			u8 smartpoll_active; \
> +		} netif; \
> +		struct { \
> +			u8 msg; \
> +		} tapif_user; \
> +		u8 pvt_pad[4]; \
> +	} pvt; \
> +	u8 __pad[44]; \
> +	union __name##_sring_entry ring[1]; /* variable-length */ \
> +}; \
> +	\
> +/* "Front" end's private variables */ \
> +struct __name##_front_ring { \
> +	RING_IDX req_prod_pvt; \
> +	RING_IDX rsp_cons; \
> +	unsigned int nr_ents; \
> +	struct __name##_sring *sring; \
> +}; \
> +	\
> +/* "Back" end's private variables */ \
> +struct __name##_back_ring { \
> +	RING_IDX rsp_prod_pvt; \
> +	RING_IDX req_cons; \
> +	unsigned int nr_ents; \
> +	struct __name##_sring *sring; \
> +}; \
> +	\
> +/* Syntactic sugar */ \
> +typedef struct __name##_sring __name##_sring_t; \
> +typedef struct __name##_front_ring __name##_front_ring_t; \
> +typedef struct __name##_back_ring __name##_back_ring_t
> +
> +/*
> + * Macros for manipulating rings.
> + *
> + * FRONT_RING_whatever works on the "front end" of a ring: here
> + * requests are pushed on to the ring and responses taken off it.
> + *
> + * BACK_RING_whatever works on the "back end" of a ring: here
> + * requests are taken off the ring and responses put on.
> + *
> + * N.B. these macros do NO INTERLOCKS OR FLOW CONTROL.
> + * This is OK in 1-for-1 request-response situations where the
> + * requestor (front end) never has more than RING_SIZE()-1
> + * outstanding requests.
> + */
> +
> +/* Initialising empty rings */
> +#define SHARED_RING_INIT(_s) do { \
> +	(_s)->req_prod  = (_s)->rsp_prod  = 0; \
> +	(_s)->req_event = (_s)->rsp_event = 1; \
> +	(void)memset((_s)->pvt.pvt_pad, 0, sizeof((_s)->pvt.pvt_pad)); \
> +	(void)memset((_s)->__pad, 0, sizeof((_s)->__pad)); \
> +} while (0)
> +
> +#define FRONT_RING_INIT(_r, _s, __size) do { \
> +	(_r)->req_prod_pvt = 0; \
> +	(_r)->rsp_cons = 0; \
> +	(_r)->nr_ents = __RING_SIZE(_s, __size); \
> +	(_r)->sring = (_s); \
> +} while (0)
> +
> +#define BACK_RING_INIT(_r, _s, __size) do { \
> +	(_r)->rsp_prod_pvt = 0; \
> +	(_r)->req_cons = 0; \
> +	(_r)->nr_ents = __RING_SIZE(_s, __size); \
> +	(_r)->sring = (_s); \
> +} while (0)
> +
> +/* How big is this ring? */
> +#define RING_SIZE(_r) \
> +	((_r)->nr_ents)
> +
> +/* Number of free requests (for use on front side only). */
> +#define RING_FREE_REQUESTS(_r) \
> +	(RING_SIZE(_r) - ((_r)->req_prod_pvt - (_r)->rsp_cons))
> +
> +/* Test if there is an empty slot available on the front ring.
> + * (This is only meaningful from the front.)
> + */
> +#define RING_FULL(_r) \
> +	(RING_SIZE(_r) - ((_r)->req_prod_pvt - (_r)->rsp_cons) == 0)
> +
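
A note on why the unmasked subtraction in RING_FREE_REQUESTS() is
safe: the indexes are free-running unsigned counters and are only
masked when a slot is accessed, so the difference stays correct even
when the counter wraps past 2^32. A minimal sketch, assuming RING_IDX
is a 32-bit unsigned type; ring_idx, ring_free_requests() and
ring_slot() are names made up for the illustration.

```c
#include <stdint.h>

typedef uint32_t ring_idx;	/* stands in for RING_IDX */

/* Free request slots, computed the way RING_FREE_REQUESTS() does. */
static ring_idx ring_free_requests(ring_idx size, ring_idx req_prod_pvt,
				   ring_idx rsp_cons)
{
	/* Unsigned subtraction wraps modulo 2^32, so this survives overflow. */
	return size - (req_prod_pvt - rsp_cons);
}

/* Slot touched for a free-running index, as in RING_GET_REQUEST(). */
static ring_idx ring_slot(ring_idx size, ring_idx idx)
{
	/* Only valid because size is a power of two. */
	return idx & (size - 1);
}
```

For example, with 32 entries, a producer at 2 and a consumer at
0xfffffffe are 4 apart, leaving 28 free slots.
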
> +/* Test if there are outstanding messages to be processed on a ring. */
> +#define RING_HAS_UNCONSUMED_RESPONSES(_r) \
> +	((_r)->sring->rsp_prod - (_r)->rsp_cons)
> +
> +#ifdef __GNUC__
> +#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \
> +	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \
> +	unsigned int rsp = RING_SIZE(_r) - \
> +		((_r)->req_cons - (_r)->rsp_prod_pvt); \
> +	req < rsp ? req : rsp; \
> +})
> +#else
> +/* Same as above, but without the nice GCC ({ ... }) syntax. */
> +#define RING_HAS_UNCONSUMED_REQUESTS(_r) \
> +	((((_r)->sring->req_prod - (_r)->req_cons) < \
> +	  (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \
> +	 ((_r)->sring->req_prod - (_r)->req_cons) : \
> +	 (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
> +#endif
> +
> +/* Direct access to individual ring elements, by index. */
> +#define RING_GET_REQUEST(_r, _idx) \
> +	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
> +
> +/*
> + * Get a local copy of a request.
> + *
> + * Use this in preference to RING_GET_REQUEST() so all processing is
> + * done on a local copy that cannot be modified by the other end.
> + *
> + * Note that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 may cause this
> + * to be ineffective where _req is a struct which consists of only bitfields.
> + */
> +#define RING_COPY_REQUEST(_r, _idx, _req) do { \
> +	/* Use volatile to force the copy into _req. */ \
> +	*(_req) = *(volatile typeof(_req))RING_GET_REQUEST(_r, _idx); \
> +} while (0)
> +
> +#define RING_GET_RESPONSE(_r, _idx) \
> +	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp))
> +
> +/* Loop termination condition: Would the specified index overflow the ring? */
> +#define RING_REQUEST_CONS_OVERFLOW(_r, _cons) \
> +	(((_cons) - (_r)->rsp_prod_pvt) >= RING_SIZE(_r))
> +
> +/* Ill-behaved frontend determination: Can there be this many requests? */
> +#define RING_REQUEST_PROD_OVERFLOW(_r, _prod) \
> +	(((_prod) - (_r)->rsp_prod_pvt) > RING_SIZE(_r))
> +
> +#define RING_PUSH_REQUESTS(_r) do { \
> +	xen_wmb(); /* back sees requests /before/ updated producer index */ \
> +	(_r)->sring->req_prod = (_r)->req_prod_pvt; \
> +} while (0)
> +
> +#define RING_PUSH_RESPONSES(_r) do { \
> +	xen_wmb(); /* front sees resps /before/ updated producer index */ \
> +	(_r)->sring->rsp_prod = (_r)->rsp_prod_pvt; \
> +} while (0)
> +
> +/*
> + * Notification hold-off (req_event and rsp_event):
> + *
> + * When queueing requests or responses on a shared ring, it may not always be
> + * necessary to notify the remote end. For example, if requests are in flight
> + * in a backend, the front may be able to queue further requests without
> + * notifying the back (if the back checks for new requests when it queues
> + * responses).
> + *
> + * When enqueuing requests or responses:
> + *
> + *  Use RING_PUSH_{REQUESTS,RESPONSES}_AND_CHECK_NOTIFY(). The second argument
> + *  is a boolean return value. True indicates that the receiver requires an
> + *  asynchronous notification.
> + *
> + * After dequeuing requests or responses (before sleeping the connection):
> + *
> + *  Use RING_FINAL_CHECK_FOR_REQUESTS() or RING_FINAL_CHECK_FOR_RESPONSES().
> + *  The second argument is a boolean return value. True indicates that there
> + *  are pending messages on the ring (i.e., the connection should not be put
> + *  to sleep).
> + *
> + *  These macros will set the req_event/rsp_event field to trigger a
> + *  notification on the very next message that is enqueued. If you want to
> + *  create batches of work (i.e., only receive a notification after several
> + *  messages have been enqueued) then you will need to create a customised
> + *  version of the FINAL_CHECK macro in your own code, which sets the event
> + *  field appropriately.
> + */
> +
> +#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do { \
> +	RING_IDX __old = (_r)->sring->req_prod; \
> +	RING_IDX __new = (_r)->req_prod_pvt; \
> +	xen_wmb(); /* back sees requests /before/ updated producer index */ \
> +	(_r)->sring->req_prod = __new; \
> +	xen_mb(); /* back sees new requests /before/ we check req_event */ \
> +	(_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) < \
> +				 (RING_IDX)(__new - __old)); \
> +} while (0)
> +
> +#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do { \
> +	RING_IDX __old = (_r)->sring->rsp_prod; \
> +	RING_IDX __new = (_r)->rsp_prod_pvt; \
> +	xen_wmb(); /* front sees resps /before/ updated producer index */ \
> +	(_r)->sring->rsp_prod = __new; \
> +	xen_mb(); /* front sees new resps /before/ we check rsp_event */ \
> +	(_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) < \
> +				 (RING_IDX)(__new - __old)); \
> +} while (0)
> +
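
The notify test in the two macros above reads more easily once pulled
out: a notification is needed only when the peer's event index falls
in the half-open window (old, new] of producer values just published.
A minimal sketch of that comparison on its own, with barriers and the
shared ring left out; ring_needs_notify() is a hypothetical name and
uint32_t stands in for RING_IDX.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when 'event' lies in (old_prod, new_prod], using the same
 * wrap-safe unsigned comparison as the *_AND_CHECK_NOTIFY macros.
 */
static bool ring_needs_notify(uint32_t old_prod, uint32_t new_prod,
			      uint32_t event)
{
	return (uint32_t)(new_prod - event) < (uint32_t)(new_prod - old_prod);
}
```

So pushing indexes 6..8 on top of an old producer of 5 notifies a peer
waiting on event 6, 7 or 8, but not one waiting on 9 or one that was
already satisfied at 5.
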
> +#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do { \
> +	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \
> +	if (_work_to_do) \
> +		break; \
> +	(_r)->sring->req_event = (_r)->req_cons + 1; \
> +	xen_mb(); \
> +	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \
> +} while (0)
> +
> +#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do { \
> +	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
> +	if (_work_to_do) \
> +		break; \
> +	(_r)->sring->rsp_event = (_r)->rsp_cons + 1; \
> +	xen_mb(); \
> +	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
> +} while (0)
> +
> +/*
> + * DEFINE_XEN_FLEX_RING_AND_INTF defines two monodirectional rings and
> + * functions to check if there is data on the ring, and to read and
> + * write to them.
> + *
> + * DEFINE_XEN_FLEX_RING is similar to DEFINE_XEN_FLEX_RING_AND_INTF, but
> + * does not define the indexes page. As different protocols can have
> + * extensions to the basic format, this macro allows them to define their
> + * own struct.
> + *
> + * XEN_FLEX_RING_SIZE
> + *   Convenience macro to calculate the size of one of the two rings
> + *   from the overall order.
> + *
> + * $NAME_mask
> + *   Function to apply the size mask to an index, to reduce the index
> + *   within the range [0, size - 1].
> + *
> + * $NAME_read_packet
> + *   Function to read data from the ring. The amount of data to read is
> + *   specified by the "size" argument.
> + *
> + * $NAME_write_packet
> + *   Function to write data to the ring. The amount of data to write is
> + *   specified by the "size" argument.
> + *
> + * $NAME_get_ring_ptr
> + *   Convenience function that returns a pointer to read/write to the
> + *   ring at the right location.
> + *
> + * $NAME_data_intf
> + *   Indexes page, shared between frontend and backend. It also
> + *   contains the array of grant refs.
> + *
> + * $NAME_queued
> + *   Function to calculate how many bytes are currently on the ring,
> + *   ready to be read. It can also be used to calculate how much free
> + *   space is currently on the ring (XEN_FLEX_RING_SIZE() -
> + *   $NAME_queued()).
> + */
> +
> +#ifndef XEN_PAGE_SHIFT
> +/* The PAGE_SIZE for ring protocols and hypercall interfaces is always
> + * 4K, regardless of the architecture, and page granularity chosen by
> + * operating systems.
> + */
> +#define XEN_PAGE_SHIFT 12
> +#endif
> +#define XEN_FLEX_RING_SIZE(order) \
> +	(1UL << ((order) + XEN_PAGE_SHIFT - 1))
> +
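
A quick worked example of XEN_FLEX_RING_SIZE(): an allocation of
2^order pages of 4 KiB is split into two halves (the "in" and "out"
buffers declared below), so each ring gets 2^(order + 12 - 1) bytes.
The sketch restates that as a function; flex_ring_size() and
SKETCH_XEN_PAGE_SHIFT are names invented for the illustration.

```c
#define SKETCH_XEN_PAGE_SHIFT 12	/* same value as XEN_PAGE_SHIFT */

/* Size in bytes of one of the two rings for a 2^order page allocation. */
static unsigned long flex_ring_size(unsigned int order)
{
	return 1UL << (order + SKETCH_XEN_PAGE_SHIFT - 1);
}
```

Order 0 (one page) yields two 2048-byte rings; order 2 (four pages)
yields two 8192-byte rings.
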
> +#define DEFINE_XEN_FLEX_RING(name) \
> +static inline RING_IDX name##_mask(RING_IDX idx, RING_IDX ring_size) \
> +{ \
> +	return idx & (ring_size - 1); \
> +} \
> +	\
> +static inline unsigned char *name##_get_ring_ptr(unsigned char *buf, \
> +						 RING_IDX idx, \
> +						 RING_IDX ring_size) \
> +{ \
> +	return buf + name##_mask(idx, ring_size); \
> +} \
> +	\
> +static inline void name##_read_packet(void *opaque, \
> +				      const unsigned char *buf, \
> +				      size_t size, \
> +				      RING_IDX masked_prod, \
> +				      RING_IDX *masked_cons, \
> +				      RING_IDX ring_size) \
> +{ \
> +	if (*masked_cons < masked_prod || \
> +		size <= ring_size - *masked_cons) { \
> +		memcpy(opaque, buf + *masked_cons, size); \
> +	} else { \
> +		memcpy(opaque, buf + *masked_cons, ring_size - *masked_cons); \
> +		memcpy((unsigned char *)opaque + ring_size - *masked_cons, buf, \
> +			   size - (ring_size - *masked_cons)); \
> +	} \
> +	*masked_cons = name##_mask(*masked_cons + size, ring_size); \
> +} \
> +	\
> +static inline void name##_write_packet(unsigned char *buf, \
> +				       const void *opaque, \
> +				       size_t size, \
> +				       RING_IDX *masked_prod, \
> +				       RING_IDX masked_cons, \
> +				       RING_IDX ring_size) \
> +{ \
> +	if (*masked_prod < masked_cons || \
> +		size <= ring_size - *masked_prod) { \
> +		memcpy(buf + *masked_prod, opaque, size); \
> +	} else { \
> +		memcpy(buf + *masked_prod, opaque, ring_size - *masked_prod); \
> +		memcpy(buf, (unsigned char *)opaque + (ring_size - *masked_prod), \
> +		       size - (ring_size - *masked_prod)); \
> +	} \
> +	*masked_prod = name##_mask(*masked_prod + size, ring_size); \
> +} \
> +	\
> +static inline RING_IDX name##_queued(RING_IDX prod, \
> +				     RING_IDX cons, \
> +				     RING_IDX ring_size) \
> +{ \
> +	RING_IDX size; \
> +	\
> +	if (prod == cons) \
> +		return 0; \
> +	\
> +	prod = name##_mask(prod, ring_size); \
> +	cons = name##_mask(cons, ring_size); \
> +	\
> +	if (prod == cons) \
> +		return ring_size; \
> +	\
> +	if (prod > cons) \
> +		size = prod - cons; \
> +	else \
> +		size = ring_size - (cons - prod); \
> +	return size; \
> +} \
> +	\
> +struct name##_data { \
> +	unsigned char *in; /* half of the allocation */ \
> +	unsigned char *out; /* half of the allocation */ \
> +}
> +
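
To make the wrap-around branch of the read/write helpers concrete,
here is a standalone sketch of the same logic as plain functions. The
names (flex_mask, flex_write, flex_read, flex_queued) and the use of
uint32_t for RING_IDX are assumptions for the illustration; like the
macros, it performs no bounds checking, so the caller must make sure
the data fits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t ring_idx;	/* stands in for RING_IDX */

static ring_idx flex_mask(ring_idx idx, ring_idx ring_size)
{
	return idx & (ring_size - 1);
}

/* Copy 'size' bytes into the ring, splitting the copy at the ring end. */
static void flex_write(unsigned char *buf, const void *src, size_t size,
		       ring_idx *masked_prod, ring_idx masked_cons,
		       ring_idx ring_size)
{
	if (*masked_prod < masked_cons || size <= ring_size - *masked_prod) {
		memcpy(buf + *masked_prod, src, size);
	} else {
		size_t first = ring_size - *masked_prod;

		memcpy(buf + *masked_prod, src, first);
		memcpy(buf, (const unsigned char *)src + first, size - first);
	}
	*masked_prod = flex_mask(*masked_prod + (ring_idx)size, ring_size);
}

/* Copy 'size' bytes out of the ring, splitting the copy at the ring end. */
static void flex_read(void *dst, const unsigned char *buf, size_t size,
		      ring_idx masked_prod, ring_idx *masked_cons,
		      ring_idx ring_size)
{
	if (*masked_cons < masked_prod || size <= ring_size - *masked_cons) {
		memcpy(dst, buf + *masked_cons, size);
	} else {
		size_t first = ring_size - *masked_cons;

		memcpy(dst, buf + *masked_cons, first);
		memcpy((unsigned char *)dst + first, buf, size - first);
	}
	*masked_cons = flex_mask(*masked_cons + (ring_idx)size, ring_size);
}

/* Bytes ready to be read; equal unmasked indexes mean an empty ring. */
static ring_idx flex_queued(ring_idx prod, ring_idx cons, ring_idx ring_size)
{
	if (prod == cons)
		return 0;
	prod = flex_mask(prod, ring_size);
	cons = flex_mask(cons, ring_size);
	if (prod == cons)
		return ring_size;
	return prod > cons ? prod - cons : ring_size - (cons - prod);
}
```

On an 8-byte ring, writing 6 bytes, reading 4, then writing 4 more
places two bytes at offsets 6..7 and two at offsets 0..1, and the
following 6-byte read is reassembled across the wrap.
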
> +#define DEFINE_XEN_FLEX_RING_AND_INTF(name) \
> +struct name##_data_intf { \
> +	RING_IDX in_cons, in_prod; \
> +	\
> +	u8 pad1[56]; \
> +	\
> +	RING_IDX out_cons, out_prod; \
> +	\
> +	u8 pad2[56]; \
> +	\
> +	RING_IDX ring_order; \
> +	grant_ref_t ref[]; \
> +}; \
> +DEFINE_XEN_FLEX_RING(name)
> +
> +#endif /* __XEN_PUBLIC_IO_RING_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 8
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/include/xen/interface/io/xenbus.h b/include/xen/interface/io/xenbus.h
> new file mode 100644
> index 0000000000..f452748b03
> --- /dev/null
> +++ b/include/xen/interface/io/xenbus.h
> @@ -0,0 +1,81 @@
> +/*****************************************************************************
> + * xenbus.h
> + *
> + * Xenbus protocol details.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (C) 2005 XenSource Ltd.
> + */
> +
> +#ifndef _XEN_PUBLIC_IO_XENBUS_H
> +#define _XEN_PUBLIC_IO_XENBUS_H
> +
> +/*
> + * The state of either end of the Xenbus, i.e. the current communication
> + * status of initialisation across the bus.  States here imply nothing about
> + * the state of the connection between the driver and the kernel's device
> + * layers.
> + */
> +enum xenbus_state {
> +	XenbusStateUnknown       = 0,
> +
> +	XenbusStateInitialising  = 1,
> +
> +	/*
> +	 * InitWait: Finished early initialisation but waiting for information
> +	 * from the peer or hotplug scripts.
> +	 */
> +	XenbusStateInitWait      = 2,
> +
> +	/*
> +	 * Initialised: Waiting for a connection from the peer.
> +	 */
> +	XenbusStateInitialised   = 3,
> +
> +	XenbusStateConnected     = 4,
> +
> +	/*
> +	 * Closing: The device is being closed due to an error or an unplug event.
> +	 */
> +	XenbusStateClosing       = 5,
> +
> +	XenbusStateClosed        = 6,
> +
> +	/*
> +	 * Reconfiguring: The device is being reconfigured.
> +	 */
> +	XenbusStateReconfiguring = 7,
> +
> +	XenbusStateReconfigured  = 8
> +};
> +
> +typedef enum xenbus_state XenbusState;
> +
> +#endif /* _XEN_PUBLIC_IO_XENBUS_H */
> +
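
Since the states are plain integers on the wire, a small name lookup
is handy when tracing a frontend/backend handshake. The sketch below
repeats the enum so that it compiles on its own; xenbus_state_name()
is a hypothetical helper, not something this patch provides.

```c
#include <stddef.h>

/* Copied from the header above so the sketch stands alone. */
enum xenbus_state {
	XenbusStateUnknown       = 0,
	XenbusStateInitialising  = 1,
	XenbusStateInitWait      = 2,
	XenbusStateInitialised   = 3,
	XenbusStateConnected     = 4,
	XenbusStateClosing       = 5,
	XenbusStateClosed        = 6,
	XenbusStateReconfiguring = 7,
	XenbusStateReconfigured  = 8
};

/* Map a state value to a printable name; out-of-range gives "INVALID". */
static const char *xenbus_state_name(enum xenbus_state state)
{
	static const char *const names[] = {
		[XenbusStateUnknown]       = "Unknown",
		[XenbusStateInitialising]  = "Initialising",
		[XenbusStateInitWait]      = "InitWait",
		[XenbusStateInitialised]   = "Initialised",
		[XenbusStateConnected]     = "Connected",
		[XenbusStateClosing]       = "Closing",
		[XenbusStateClosed]        = "Closed",
		[XenbusStateReconfiguring] = "Reconfiguring",
		[XenbusStateReconfigured]  = "Reconfigured",
	};

	if ((unsigned int)state >= sizeof(names) / sizeof(names[0]))
		return "INVALID";
	return names[state];
}
```
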
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/include/xen/interface/io/xs_wire.h b/include/xen/interface/io/xs_wire.h
> new file mode 100644
> index 0000000000..87987334bf
> --- /dev/null
> +++ b/include/xen/interface/io/xs_wire.h
> @@ -0,0 +1,151 @@
> +/*
> + * Details of the "wire" protocol between Xen Store Daemon and client
> + * library or guest kernel.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (C) 2005 Rusty Russell IBM Corporation
> + */
> +
> +#ifndef _XS_WIRE_H
> +#define _XS_WIRE_H
> +
> +enum xsd_sockmsg_type {
> +	XS_CONTROL,
> +#define XS_DEBUG XS_CONTROL
> +	XS_DIRECTORY,
> +	XS_READ,
> +	XS_GET_PERMS,
> +	XS_WATCH,
> +	XS_UNWATCH,
> +	XS_TRANSACTION_START,
> +	XS_TRANSACTION_END,
> +	XS_INTRODUCE,
> +	XS_RELEASE,
> +	XS_GET_DOMAIN_PATH,
> +	XS_WRITE,
> +	XS_MKDIR,
> +	XS_RM,
> +	XS_SET_PERMS,
> +	XS_WATCH_EVENT,
> +	XS_ERROR,
> +	XS_IS_DOMAIN_INTRODUCED,
> +	XS_RESUME,
> +	XS_SET_TARGET,
> +	/* XS_RESTRICT has been removed */
> +	XS_RESET_WATCHES = XS_SET_TARGET + 2,
> +	XS_DIRECTORY_PART,
> +
> +	XS_TYPE_COUNT,      /* Number of valid types. */
> +
> +	XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
> +};
> +
> +#define XS_WRITE_NONE "NONE"
> +#define XS_WRITE_CREATE "CREATE"
> +#define XS_WRITE_CREATE_EXCL "CREATE|EXCL"
> +
> +/* We hand errors as strings, for portability. */
> +struct xsd_errors {
> +	int errnum;
> +	const char *errstring;
> +};
> +
> +#ifdef EINVAL
> +#define XSD_ERROR(x) { x, #x }
> +/* LINTED: static unused */
> +static struct xsd_errors xsd_errors[]
> +#if defined(__GNUC__)
> +__attribute__((unused))
> +#endif
> +	= {
> +	XSD_ERROR(EINVAL),
> +	XSD_ERROR(EACCES),
> +	XSD_ERROR(EEXIST),
> +	XSD_ERROR(EISDIR),
> +	XSD_ERROR(ENOENT),
> +	XSD_ERROR(ENOMEM),
> +	XSD_ERROR(ENOSPC),
> +	XSD_ERROR(EIO),
> +	XSD_ERROR(ENOTEMPTY),
> +	XSD_ERROR(ENOSYS),
> +	XSD_ERROR(EROFS),
> +	XSD_ERROR(EBUSY),
> +	XSD_ERROR(EAGAIN),
> +	XSD_ERROR(EISCONN),
> +	XSD_ERROR(E2BIG)
> +};
> +#endif
> +
> +struct xsd_sockmsg {
> +	u32 type;  /* XS_??? */
> +	u32 req_id;/* Request identifier, echoed in daemon's response.  */
> +	u32 tx_id; /* Transaction id (0 if not related to a transaction). */
> +	u32 len;   /* Length of data following this. */
> +
> +	/* Generally followed by nul-terminated string(s). */
> +};
> +
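
For illustration, this is roughly how a request is framed with the
header above: the fixed 16-byte struct, then the payload, with "len"
counting the payload only. The sketch builds an XS_READ for a path
including its terminating nul. xs_build_read(), struct xs_hdr and the
SKETCH_* constants are made-up names; the value 2 is simply XS_READ's
position in the enum, and 4096 mirrors XENSTORE_PAYLOAD_MAX from the
same header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_XS_READ     2	/* position of XS_READ in xsd_sockmsg_type */
#define SKETCH_PAYLOAD_MAX 4096	/* XENSTORE_PAYLOAD_MAX */

/* Same layout as struct xsd_sockmsg, with fixed-width types spelled out. */
struct xs_hdr {
	uint32_t type;
	uint32_t req_id;
	uint32_t tx_id;
	uint32_t len;
};

/*
 * Serialise an XS_READ request for 'path' into 'out'.
 * Returns the number of bytes produced, or 0 if it does not fit.
 */
static size_t xs_build_read(void *out, size_t cap, uint32_t req_id,
			    const char *path)
{
	struct xs_hdr hdr;
	size_t len = strlen(path) + 1;	/* payload includes the nul */

	if (len > SKETCH_PAYLOAD_MAX || sizeof(hdr) + len > cap)
		return 0;

	hdr.type = SKETCH_XS_READ;
	hdr.req_id = req_id;
	hdr.tx_id = 0;			/* not part of a transaction */
	hdr.len = (uint32_t)len;

	memcpy(out, &hdr, sizeof(hdr));
	memcpy((unsigned char *)out + sizeof(hdr), path, len);
	return sizeof(hdr) + len;
}
```
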
> +enum xs_watch_type {
> +	XS_WATCH_PATH = 0,
> +	XS_WATCH_TOKEN
> +};
> +
> +/*
> + * `incontents 150 xenstore_struct XenStore wire protocol.
> + *
> + * Inter-domain shared memory communications.
> + */
> +#define XENSTORE_RING_SIZE 1024
> +typedef u32 XENSTORE_RING_IDX;
> +#define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE - 1))
> +struct xenstore_domain_interface {
> +	char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */
> +	char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
> +	XENSTORE_RING_IDX req_cons, req_prod;
> +	XENSTORE_RING_IDX rsp_cons, rsp_prod;
> +	u32 server_features; /* Bitmap of features supported by the server */
> +	u32 connection;
> +};
> +
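
The req/rsp rings in this interface are plain byte rings with
free-running 32-bit indexes, masked only on access. A single-threaded
sketch of the producer side follows; it deliberately leaves out the
memory barriers and the event-channel kick a real driver needs, and
xs_ring_write(), struct xs_req_ring, RING_SZ and MASK_IDX are names
assumed for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SZ 1024u				/* XENSTORE_RING_SIZE */
#define MASK_IDX(idx) ((idx) & (RING_SZ - 1))	/* MASK_XENSTORE_IDX */

/* Just the request half of struct xenstore_domain_interface. */
struct xs_req_ring {
	char req[RING_SZ];
	uint32_t req_cons, req_prod;
};

/* Copy as much of 'data' as currently fits; returns bytes written. */
static size_t xs_ring_write(struct xs_req_ring *r, const void *data,
			    size_t len)
{
	const char *src = data;
	uint32_t prod = r->req_prod;
	size_t space = RING_SZ - (prod - r->req_cons);
	size_t i;

	if (len > space)
		len = space;
	for (i = 0; i < len; i++)
		r->req[MASK_IDX(prod++)] = src[i];

	/* A real driver needs a write barrier here before publishing. */
	r->req_prod = prod;
	return len;
}
```
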
> +/* Violating this is very bad.  See docs/misc/xenstore.txt. */
> +#define XENSTORE_PAYLOAD_MAX 4096
> +
> +/* Violating these just gets you an error back */
> +#define XENSTORE_ABS_PATH_MAX 3072
> +#define XENSTORE_REL_PATH_MAX 2048
> +
> +/* The ability to reconnect a ring */
> +#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
> +
> +/* Valid values for the connection field */
> +#define XENSTORE_CONNECTED 0 /* the steady-state */
> +#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
> +
> +#endif /* _XS_WIRE_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 8
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
> new file mode 100644
> index 0000000000..19959da8b4
> --- /dev/null
> +++ b/include/xen/interface/memory.h
> @@ -0,0 +1,332 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/******************************************************************************
> + * memory.h
> + *
> + * Memory reservation and information.
> + *
> + * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
> + */
> +
> +#ifndef __XEN_PUBLIC_MEMORY_H__
> +#define __XEN_PUBLIC_MEMORY_H__
> +
> +/*
> + * Increase or decrease the specified domain's memory reservation. Returns a
> + * -ve errcode on failure, or the # extents successfully allocated or freed.
> + * arg == addr of struct xen_memory_reservation.
> + */
> +#define XENMEM_increase_reservation 0
> +#define XENMEM_decrease_reservation 1
> +#define XENMEM_populate_physmap     6
> +struct xen_memory_reservation {
> +	/*
> +	 * XENMEM_increase_reservation:
> +	 *   OUT: MFN (*not* GMFN) bases of extents that were allocated
> +	 * XENMEM_decrease_reservation:
> +	 *   IN:  GMFN bases of extents to free
> +	 * XENMEM_populate_physmap:
> +	 *   IN:  GPFN bases of extents to populate with memory
> +	 *   OUT: GMFN bases of extents that were allocated
> +	 *   (NB. This command also updates the mach_to_phys translation table)
> +	 */
> +	GUEST_HANDLE(xen_pfn_t)extent_start;
> +
> +	/* Number of extents, and size/alignment of each (2^extent_order pages). */
> +	xen_ulong_t  nr_extents;
> +	unsigned int   extent_order;
> +
> +	/*
> +	 * Maximum # bits addressable by the user of the allocated region (e.g.,
> +	 * I/O devices often have a 32-bit limitation even in 64-bit systems). If
> +	 * zero then the user has no addressing restriction.
> +	 * This field is not used by XENMEM_decrease_reservation.
> +	 */
> +	unsigned int   address_bits;
> +
> +	/*
> +	 * Domain whose reservation is being changed.
> +	 * Unprivileged domains can specify only DOMID_SELF.
> +	 */
> +	domid_t        domid;
> +
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_reservation);
> +
> +/*
> + * An atomic exchange of memory pages. If return code is zero then
> + * @out.extent_list provides GMFNs of the newly-allocated memory.
> + * Returns zero on complete success, otherwise a negative error code.
> + * On complete success then always @nr_exchanged == @in.nr_extents.
> + * On partial success @nr_exchanged indicates how much work was done.
> + */
> +#define XENMEM_exchange             11
> +struct xen_memory_exchange {
> +	/*
> +	 * [IN] Details of memory extents to be exchanged (GMFN bases).
> +	 * Note that @in.address_bits is ignored and unused.
> +	 */
> +	struct xen_memory_reservation in;
> +
> +	/*
> +	 * [IN/OUT] Details of new memory extents.
> +	 * We require that:
> +	 *  1. @in.domid == @out.domid
> +	 *  2. @in.nr_extents  << @in.extent_order ==
> +	 *     @out.nr_extents << @out.extent_order
> +	 *  3. @in.extent_start and @out.extent_start lists must not overlap
> +	 *  4. @out.extent_start lists GPFN bases to be populated
> +	 *  5. @out.extent_start is overwritten with allocated GMFN bases
> +	 */
> +	struct xen_memory_reservation out;
> +
> +	/*
> +	 * [OUT] Number of input extents that were successfully exchanged:
> +	 *  1. The first @nr_exchanged input extents were successfully
> +	 *     deallocated.
> +	 *  2. The corresponding first entries in the output extent list correctly
> +	 *     indicate the GMFNs that were successfully exchanged.
> +	 *  3. All other input and output extents are untouched.
> +	 *  4. If not all input extents are exchanged then the return code of this
> +	 *     command will be non-zero.
> +	 *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
> +	 */
> +	xen_ulong_t nr_exchanged;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_exchange);
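
Requirement 2 in the comment above just says both sides must cover the
same number of pages. A small sketch of that check, with a
hypothetical helper name and plain integer arguments instead of the
guest-handle structures:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when in.nr_extents << in.extent_order equals
 * out.nr_extents << out.extent_order, i.e. both lists span the same
 * number of pages.
 */
static bool exchange_sizes_match(uint64_t in_nr, unsigned int in_order,
				 uint64_t out_nr, unsigned int out_order)
{
	return (in_nr << in_order) == (out_nr << out_order);
}
```

For instance, 512 order-0 extents can be exchanged for a single
order-9 extent, since both describe 512 pages.
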
> +/*
> + * Returns the maximum machine frame number of mapped RAM in this system.
> + * This command always succeeds (it never returns an error code).
> + * arg == NULL.
> + */
> +#define XENMEM_maximum_ram_page     2
> +
> +/*
> + * Returns the current or maximum memory reservation, in pages, of the
> + * specified domain (may be DOMID_SELF). Returns -ve errcode on failure.
> + * arg == addr of domid_t.
> + */
> +#define XENMEM_current_reservation  3
> +#define XENMEM_maximum_reservation  4
> +
> +/*
> + * Returns a list of MFN bases of 2MB extents comprising the machine_to_phys
> + * mapping table. Architectures which do not have a m2p table do not implement
> + * this command.
> + * arg == addr of xen_machphys_mfn_list_t.
> + */
> +#define XENMEM_machphys_mfn_list    5
> +struct xen_machphys_mfn_list {
> +	/*
> +	 * Size of the 'extent_start' array. Fewer entries will be filled if the
> +	 * machphys table is smaller than max_extents * 2MB.
> +	 */
> +	unsigned int max_extents;
> +
> +	/*
> +	 * Pointer to buffer to fill with list of extent starts. If there are
> +	 * any large discontiguities in the machine address space, 2MB gaps in
> +	 * the machphys table will be represented by an MFN base of zero.
> +	 */
> +	GUEST_HANDLE(xen_pfn_t)extent_start;
> +
> +	/*
> +	 * Number of extents written to the above array. This will be smaller
> +	 * than 'max_extents' if the machphys table is smaller than
> +	 * max_extents * 2MB.
> +	 */
> +	unsigned int nr_extents;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mfn_list);
> +
> +/*
> + * Returns the location in virtual address space of the machine_to_phys
> + * mapping table. Architectures which do not have a m2p table, or which do not
> + * map it by default into guest address space, do not implement this command.
> + * arg == addr of xen_machphys_mapping_t.
> + */
> +#define XENMEM_machphys_mapping     12
> +struct xen_machphys_mapping {
> +	xen_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
> +	xen_ulong_t max_mfn;        /* Maximum MFN that can be looked up. */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mapping_t);
> +
> +#define XENMAPSPACE_shared_info  0 /* shared info page */
> +#define XENMAPSPACE_grant_table  1 /* grant table page */
> +#define XENMAPSPACE_gmfn         2 /* GMFN */
> +#define XENMAPSPACE_gmfn_range   3 /* GMFN range, XENMEM_add_to_physmap only. */
> +#define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
> +				    * XENMEM_add_to_physmap_range only.
> +				    */
> +#define XENMAPSPACE_dev_mmio     5 /* device mmio region */
> +
> +/*
> + * Sets the GPFN at which a particular page appears in the specified guest's
> + * pseudophysical address space.
> + * arg == addr of xen_add_to_physmap_t.
> + */
> +#define XENMEM_add_to_physmap      7
> +struct xen_add_to_physmap {
> +	/* Which domain to change the mapping for. */
> +	domid_t domid;
> +
> +	/* Number of pages to go through for gmfn_range */
> +	u16    size;
> +
> +	/* Source mapping space. */
> +	unsigned int space;
> +
> +	/* Index into source mapping space. */
> +	xen_ulong_t idx;
> +
> +	/* GPFN where the source mapping page should appear. */
> +	xen_pfn_t gpfn;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap);
> +
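
As a usage illustration for this structure: a guest that wants its
shared info page at a given frame fills it in roughly as below and
then hands it to the memory_op hypercall with XENMEM_add_to_physmap.
The sketch only prepares the argument and issues no hypercall. The
struct is a local stand-in that assumes domid_t is 16 bits and that
xen_ulong_t and xen_pfn_t are 64 bits (as on ARM64); 0x7FF0 is the
DOMID_SELF value from Xen's public xen.h; prepare_shared_info_map()
is a made-up name.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_DOMID_SELF           0x7FF0u	/* DOMID_SELF */
#define SKETCH_MAPSPACE_SHARED_INFO 0		/* XENMAPSPACE_shared_info */

/* Local stand-in for struct xen_add_to_physmap (ARM64 type widths assumed). */
struct sketch_add_to_physmap {
	uint16_t domid;
	uint16_t size;
	unsigned int space;
	uint64_t idx;
	uint64_t gpfn;
};

/* Prepare a request to map the shared info page at guest frame 'gpfn'. */
static void prepare_shared_info_map(struct sketch_add_to_physmap *xatp,
				    uint64_t gpfn)
{
	memset(xatp, 0, sizeof(*xatp));
	xatp->domid = SKETCH_DOMID_SELF;	/* operate on the calling domain */
	xatp->space = SKETCH_MAPSPACE_SHARED_INFO;
	xatp->idx = 0;				/* the shared info page is index 0 */
	xatp->gpfn = gpfn;
}
```
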
> +/*** REMOVED ***/
> +/*#define XENMEM_translate_gpfn_list  8*/
> +
> +#define XENMEM_add_to_physmap_range 23
> +struct xen_add_to_physmap_range {
> +	/* IN */
> +	/* Which domain to change the mapping for. */
> +	domid_t domid;
> +	u16 space; /* => enum phys_map_space */
> +
> +	/* Number of pages to go through */
> +	u16 size;
> +	domid_t foreign_domid; /* IFF gmfn_foreign */
> +
> +	/* Indexes into space being mapped. */
> +	GUEST_HANDLE(xen_ulong_t)idxs;
> +
> +	/* GPFN in domid where the source mapping page should appear. */
> +	GUEST_HANDLE(xen_pfn_t)gpfns;
> +
> +	/* OUT */
> +
> +	/* Per index error code. */
> +	GUEST_HANDLE(int)errs;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap_range);
> +
> +/*
> + * Returns the pseudo-physical memory map as it was when the domain
> + * was started (specified by XENMEM_set_memory_map).
> + * arg == addr of struct xen_memory_map.
> + */
> +#define XENMEM_memory_map           9
> +struct xen_memory_map {
> +	/*
> +	 * On call the number of entries which can be stored in buffer. On
> +	 * return the number of entries which have been stored in
> +	 * buffer.
> +	 */
> +	unsigned int nr_entries;
> +
> +	/*
> +	 * Entries in the buffer are in the same format as returned by the
> +	 * BIOS INT 0x15 EAX=0xE820 call.
> +	 */
> +	GUEST_HANDLE(void)buffer;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_map);
> +
> +/*
> + * Returns the real physical memory map. Passes the same structure as
> + * XENMEM_memory_map.
> + * arg == addr of struct xen_memory_map.
> + */
> +#define XENMEM_machine_memory_map   10
> +
> +/*
> + * Unmaps the page appearing at a particular GPFN from the specified guest's
> + * pseudophysical address space.
> + * arg == addr of xen_remove_from_physmap_t.
> + */
> +#define XENMEM_remove_from_physmap      15
> +struct xen_remove_from_physmap {
> +	/* Which domain to change the mapping for. */
> +	domid_t domid;
> +
> +	/* GPFN of the current mapping of the page. */
> +	xen_pfn_t gpfn;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap);
> +
> +/*
> + * Get the pages for a particular guest resource, so that they can be
> + * mapped directly by a tools domain.
> + */
> +#define XENMEM_acquire_resource 28
> +struct xen_mem_acquire_resource {
> +	/* IN - The domain whose resource is to be mapped */
> +	domid_t domid;
> +	/* IN - the type of resource */
> +	u16 type;
> +
> +#define XENMEM_resource_ioreq_server 0
> +#define XENMEM_resource_grant_table 1
> +
> +	/*
> +	 * IN - a type-specific resource identifier, which must be zero
> +	 *      unless stated otherwise.
> +	 *
> +	 * type == XENMEM_resource_ioreq_server -> id == ioreq server id
> +	 * type == XENMEM_resource_grant_table -> id defined below
> +	 */
> +	u32 id;
> +
> +#define XENMEM_resource_grant_table_id_shared 0
> +#define XENMEM_resource_grant_table_id_status 1
> +
> +	/* IN/OUT - As an IN parameter number of frames of the resource
> +	 *          to be mapped. However, if the specified value is 0 and
> +	 *          frame_list is NULL then this field will be set to the
> +	 *          maximum value supported by the implementation on return.
> +	 */
> +	u32 nr_frames;
> +	/*
> +	 * OUT - Must be zero on entry. On return this may contain a bitwise
> +	 *       OR of the following values.
> +	 */
> +	u32 flags;
> +
> +	/* The resource pages have been assigned to the calling domain */
> +#define _XENMEM_rsrc_acq_caller_owned 0
> +#define XENMEM_rsrc_acq_caller_owned (1u << _XENMEM_rsrc_acq_caller_owned)
> +
> +	/*
> +	 * IN - the index of the initial frame to be mapped. This parameter
> +	 *      is ignored if nr_frames is 0.
> +	 */
> +	u64 frame;
> +
> +#define XENMEM_resource_ioreq_server_frame_bufioreq 0
> +#define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))
> +
> +	/*
> +	 * IN/OUT - If the tools domain is PV then, upon return, frame_list
> +	 *          will be populated with the MFNs of the resource.
> +	 *          If the tools domain is HVM then it is expected that, on
> +	 *          entry, frame_list will be populated with a list of GFNs
> +	 *          that will be mapped to the MFNs of the resource.
> +	 *          If -EIO is returned then the frame_list has only been
> +	 *          partially mapped and it is up to the caller to unmap all
> +	 *          the GFNs.
> +	 *          This parameter may be NULL if nr_frames is 0.
> +	 */
> +	GUEST_HANDLE(xen_pfn_t)frame_list;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(xen_mem_acquire_resource);
> +
> +#endif /* __XEN_PUBLIC_MEMORY_H__ */
> diff --git a/include/xen/interface/sched.h b/include/xen/interface/sched.h
> new file mode 100644
> index 0000000000..0f12dcf267
> --- /dev/null
> +++ b/include/xen/interface/sched.h
> @@ -0,0 +1,188 @@
> +/******************************************************************************
> + * sched.h
> + *
> + * Scheduler state interactions
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
> + */
> +
> +#ifndef __XEN_PUBLIC_SCHED_H__
> +#define __XEN_PUBLIC_SCHED_H__
> +
> +#include <xen/interface/event_channel.h>
> +
> +/*
> + * Guest Scheduler Operations
> + *
> + * The SCHEDOP interface provides mechanisms for a guest to interact
> + * with the scheduler, including yield, blocking and shutting itself
> + * down.
> + */
> +
> +/*
> + * The prototype for this hypercall is:
> + * long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
> + *
> + * @cmd == SCHEDOP_??? (scheduler operation).
> + * @arg == Operation-specific extra argument(s), as described below.
> + * ...  == Additional Operation-specific extra arguments, described below.
> + *
> + * Versions of Xen prior to 3.0.2 provided only the following legacy version
> + * of this hypercall, supporting only the commands yield, block and shutdown:
> + *  long sched_op(int cmd, unsigned long arg)
> + * @cmd == SCHEDOP_??? (scheduler operation).
> + * @arg == 0               (SCHEDOP_yield and SCHEDOP_block)
> + *      == SHUTDOWN_* code (SCHEDOP_shutdown)
> + *
> + * This legacy version is available to new guests as:
> + * long HYPERVISOR_sched_op_compat(enum sched_op cmd, unsigned long arg)
> + */
> +
> +/*
> + * Voluntarily yield the CPU.
> + * @arg == NULL.
> + */
> +#define SCHEDOP_yield       0
> +
> +/*
> + * Block execution of this VCPU until an event is received for processing.
> + * If called with event upcalls masked, this operation will atomically
> + * reenable event delivery and check for pending events before blocking the
> + * VCPU. This avoids a "wakeup waiting" race.
> + * @arg == NULL.
> + */
> +#define SCHEDOP_block       1
> +
> +/*
> + * Halt execution of this domain (all VCPUs) and notify the system controller.
> + * @arg == pointer to sched_shutdown structure.
> + *
> + * If the sched_shutdown_t reason is SHUTDOWN_suspend then
> + * x86 PV guests must also set RDX (EDX for 32-bit guests) to the MFN
> + * of the guest's start info page.  RDX/EDX is the third hypercall
> + * argument.
> + *
> + * In addition, when the reason is SHUTDOWN_suspend this hypercall
> + * returns 1 if suspend was cancelled or the domain was merely
> + * checkpointed, and 0 if it is resuming in a new domain.
> + */
> +#define SCHEDOP_shutdown    2
> +
> +/*
> + * Poll a set of event-channel ports. Return when one or more are pending. An
> + * optional timeout may be specified.
> + * @arg == pointer to sched_poll structure.
> + */
> +#define SCHEDOP_poll        3
> +
> +/*
> + * Declare a shutdown for another domain. The main use of this function is
> + * in interpreting shutdown requests and reasons for fully-virtualized
> + * domains.  A para-virtualized domain may use SCHEDOP_shutdown directly.
> + * @arg == pointer to sched_remote_shutdown structure.
> + */
> +#define SCHEDOP_remote_shutdown        4
> +
> +/*
> + * Latch a shutdown code, so that when the domain later shuts down it
> + * reports this code to the control tools.
> + * @arg == sched_shutdown, as for SCHEDOP_shutdown.
> + */
> +#define SCHEDOP_shutdown_code 5
> +
> +/*
> + * Setup, poke and destroy a domain watchdog timer.
> + * @arg == pointer to sched_watchdog structure.
> + * With id == 0, setup a domain watchdog timer to cause domain shutdown
> + *               after timeout, returns watchdog id.
> + * With id != 0 and timeout == 0, destroy domain watchdog timer.
> + * With id != 0 and timeout != 0, poke watchdog timer and set new timeout.
> + */
> +#define SCHEDOP_watchdog    6
> +
> +/*
> + * Override the current vcpu affinity by pinning it to one physical cpu or
> + * undo this override restoring the previous affinity.
> + * @arg == pointer to sched_pin_override structure.
> + *
> + * A negative pcpu value will undo a previous pin override and restore the
> + * previous cpu affinity.
> + * This call is allowed for the hardware domain only and requires the cpu
> + * to be part of the domain's cpupool.
> + */
> +#define SCHEDOP_pin_override 7
> +
> +struct sched_shutdown {
> +	unsigned int reason; /* SHUTDOWN_* => shutdown reason */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(sched_shutdown);
> +
> +struct sched_poll {
> +	GUEST_HANDLE(evtchn_port_t)ports;
> +	unsigned int nr_ports;
> +	u64 timeout;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(sched_poll);
> +
> +struct sched_remote_shutdown {
> +	domid_t domain_id;         /* Remote domain ID */
> +	unsigned int reason;       /* SHUTDOWN_* => shutdown reason */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(sched_remote_shutdown);
> +
> +struct sched_watchdog {
> +	u32 id;                /* watchdog ID */
> +	u32 timeout;           /* timeout */
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(sched_watchdog);
> +
> +struct sched_pin_override {
> +	s32 pcpu;
> +};
> +
> +DEFINE_GUEST_HANDLE_STRUCT(sched_pin_override);
> +
> +/*
> + * Reason codes for SCHEDOP_shutdown. These may be interpreted by control
> + * software to determine the appropriate action. For the most part, Xen does
> + * not care about the shutdown code.
> + */
> +#define SHUTDOWN_poweroff   0  /* Domain exited normally. Clean up and kill. */
> +#define SHUTDOWN_reboot     1  /* Clean up, kill, and then restart.          */
> +#define SHUTDOWN_suspend    2  /* Clean up, save suspend info, kill.         */
> +#define SHUTDOWN_crash      3  /* Tell controller we've crashed.             */
> +#define SHUTDOWN_watchdog   4  /* Restart because watchdog time expired.     */
> +
> +/*
> + * Domain asked to perform 'soft reset' for it. The expected behavior is to
> + * reset internal Xen state for the domain returning it to the point where it
> + * was created but leaving the domain's memory contents and vCPU contexts
> + * intact. This will allow the domain to start over and set up all Xen specific
> + * interfaces again.
> + */
> +#define SHUTDOWN_soft_reset 5
> +#define SHUTDOWN_MAX        5  /* Maximum valid shutdown reason.             */
> +
> +#endif /* __XEN_PUBLIC_SCHED_H__ */
> diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
> new file mode 100644
> index 0000000000..964daaedfb
> --- /dev/null
> +++ b/include/xen/interface/xen.h
> @@ -0,0 +1,225 @@
> +/******************************************************************************
> + * xen.h
> + *
> + * Guest OS interface to Xen.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Copyright (c) 2004, K A Fraser
> + */
> +
> +#ifndef __XEN_PUBLIC_XEN_H__
> +#define __XEN_PUBLIC_XEN_H__
> +
> +#include <xen/arm/interface.h>
> +
> +/*
> + * XEN "SYSTEM CALLS" (a.k.a. HYPERCALLS).
> + */
> +
> +/*
> + * x86_32: EAX = vector; EBX, ECX, EDX, ESI, EDI = args 1, 2, 3, 4, 5.
> + *         EAX = return value
> + *         (argument registers may be clobbered on return)
> + * x86_64: RAX = vector; RDI, RSI, RDX, R10, R8, R9 = args 1, 2, 3, 4, 5, 6.
> + *         RAX = return value
> + *         (argument registers not clobbered on return; RCX, R11 are)
> + */
> +#define __HYPERVISOR_set_trap_table        0
> +#define __HYPERVISOR_mmu_update            1
> +#define __HYPERVISOR_set_gdt               2
> +#define __HYPERVISOR_stack_switch          3
> +#define __HYPERVISOR_set_callbacks         4
> +#define __HYPERVISOR_fpu_taskswitch        5
> +#define __HYPERVISOR_sched_op_compat       6
> +#define __HYPERVISOR_platform_op           7
> +#define __HYPERVISOR_set_debugreg          8
> +#define __HYPERVISOR_get_debugreg          9
> +#define __HYPERVISOR_update_descriptor    10
> +#define __HYPERVISOR_memory_op            12
> +#define __HYPERVISOR_multicall            13
> +#define __HYPERVISOR_update_va_mapping    14
> +#define __HYPERVISOR_set_timer_op         15
> +#define __HYPERVISOR_event_channel_op_compat 16
> +#define __HYPERVISOR_xen_version          17
> +#define __HYPERVISOR_console_io           18
> +#define __HYPERVISOR_physdev_op_compat    19
> +#define __HYPERVISOR_grant_table_op       20
> +#define __HYPERVISOR_vm_assist            21
> +#define __HYPERVISOR_update_va_mapping_otherdomain 22
> +#define __HYPERVISOR_iret                 23 /* x86 only */
> +#define __HYPERVISOR_vcpu_op              24
> +#define __HYPERVISOR_set_segment_base     25 /* x86/64 only */
> +#define __HYPERVISOR_mmuext_op            26
> +#define __HYPERVISOR_xsm_op               27
> +#define __HYPERVISOR_nmi_op               28
> +#define __HYPERVISOR_sched_op             29
> +#define __HYPERVISOR_callback_op          30
> +#define __HYPERVISOR_xenoprof_op          31
> +#define __HYPERVISOR_event_channel_op     32
> +#define __HYPERVISOR_physdev_op           33
> +#define __HYPERVISOR_hvm_op               34
> +#define __HYPERVISOR_sysctl               35
> +#define __HYPERVISOR_domctl               36
> +#define __HYPERVISOR_kexec_op             37
> +#define __HYPERVISOR_tmem_op              38
> +#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
> +#define __HYPERVISOR_xenpmu_op            40
> +#define __HYPERVISOR_dm_op                41
> +
> +/* Architecture-specific hypercall definitions. */
> +#define __HYPERVISOR_arch_0               48
> +#define __HYPERVISOR_arch_1               49
> +#define __HYPERVISOR_arch_2               50
> +#define __HYPERVISOR_arch_3               51
> +#define __HYPERVISOR_arch_4               52
> +#define __HYPERVISOR_arch_5               53
> +#define __HYPERVISOR_arch_6               54
> +#define __HYPERVISOR_arch_7               55
> +
> +#ifndef __ASSEMBLY__
> +
> +typedef u16 domid_t;
> +
> +/* Domain ids >= DOMID_FIRST_RESERVED cannot be used for ordinary domains. */
> +#define DOMID_FIRST_RESERVED (0x7FF0U)
> +
> +/* DOMID_SELF is used in certain contexts to refer to oneself. */
> +#define DOMID_SELF (0x7FF0U)
> +
> +/*
> + * DOMID_IO is used to restrict page-table updates to mapping I/O memory.
> + * Although no Foreign Domain need be specified to map I/O pages, DOMID_IO
> + * is useful to ensure that no mappings to the OS's own heap are accidentally
> + * installed. (e.g., in Linux this could cause havoc as reference counts
> + * aren't adjusted on the I/O-mapping code path).
> + * This only makes sense in MMUEXT_SET_FOREIGNDOM, but in that context can
> + * be specified by any calling domain.
> + */
> +#define DOMID_IO   (0x7FF1U)
> +
> +/*
> + * DOMID_XEN is used to allow privileged domains to map restricted parts of
> + * Xen's heap space (e.g., the machine_to_phys table).
> + * This only makes sense in MMUEXT_SET_FOREIGNDOM, and is only permitted if
> + * the caller is privileged.
> + */
> +#define DOMID_XEN  (0x7FF2U)
> +
> +/* DOMID_COW is used as the owner of sharable pages */
> +#define DOMID_COW  (0x7FF3U)
> +
> +/* DOMID_INVALID is used to identify pages with unknown owner. */
> +#define DOMID_INVALID (0x7FF4U)
> +
> +/* Idle domain. */
> +#define DOMID_IDLE (0x7FFFU)
> +
> +struct vcpu_info {
> +	/*
> +	 * 'evtchn_upcall_pending' is written non-zero by Xen to indicate
> +	 * a pending notification for a particular VCPU. It is then cleared
> +	 * by the guest OS /before/ checking for pending work, thus avoiding
> +	 * a set-and-check race. Note that the mask is only accessed by Xen
> +	 * on the CPU that is currently hosting the VCPU. This means that the
> +	 * pending and mask flags can be updated by the guest without special
> +	 * synchronisation (i.e., no need for the x86 LOCK prefix).
> +	 * This may seem suboptimal because if the pending flag is set by
> +	 * a different CPU then an IPI may be scheduled even when the mask
> +	 * is set. However, note:
> +	 *  1. The task of 'interrupt holdoff' is covered by the per-event-
> +	 *     channel mask bits. A 'noisy' event that is continually being
> +	 *     triggered can be masked at source at this very precise
> +	 *     granularity.
> +	 *  2. The main purpose of the per-VCPU mask is therefore to restrict
> +	 *     reentrant execution: whether for concurrency control, or to
> +	 *     prevent unbounded stack usage. Whatever the purpose, we expect
> +	 *     that the mask will be asserted only for short periods at a time,
> +	 *     and so the likelihood of a 'spurious' IPI is suitably small.
> +	 * The mask is read before making an event upcall to the guest: a
> +	 * non-zero mask therefore guarantees that the VCPU will not receive
> +	 * an upcall activation. The mask is cleared when the VCPU requests
> +	 * to block: this avoids wakeup-waiting races.
> +	 */
> +	u8 evtchn_upcall_pending;
> +	u8 evtchn_upcall_mask;
> +	xen_ulong_t evtchn_pending_sel;
> +	struct arch_vcpu_info arch;
> +	struct pvclock_vcpu_time_info time;
> +}; /* 64 bytes (x86) */
> +
> +/*
> + * Xen/kernel shared data -- pointer provided in start_info.
> + * NB. We expect that this struct is smaller than a page.
> + */
> +struct shared_info {
> +	struct vcpu_info vcpu_info[MAX_VIRT_CPUS];
> +
> +	/*
> +	 * A domain can create "event channels" on which it can send and receive
> +	 * asynchronous event notifications. There are three classes of event that
> +	 * are delivered by this mechanism:
> +	 *  1. Bi-directional inter- and intra-domain connections. Domains must
> +	 *     arrange out-of-band to set up a connection (usually by allocating
> +	 *     an unbound 'listener' port and advertising that via a storage service
> +	 *     such as xenstore).
> +	 *  2. Physical interrupts. A domain with suitable hardware-access
> +	 *     privileges can bind an event-channel port to a physical interrupt
> +	 *     source.
> +	 *  3. Virtual interrupts ('events'). A domain can bind an event-channel
> +	 *     port to a virtual interrupt source, such as the virtual-timer
> +	 *     device or the emergency console.
> +	 *
> +	 * Event channels are addressed by a "port index". Each channel is
> +	 * associated with two bits of information:
> +	 *  1. PENDING -- notifies the domain that there is a pending notification
> +	 *     to be processed. This bit is cleared by the guest.
> +	 *  2. MASK -- if this bit is clear then a 0->1 transition of PENDING
> +	 *     will cause an asynchronous upcall to be scheduled. This bit is only
> +	 *     updated by the guest. It is read-only within Xen. If a channel
> +	 *     becomes pending while the channel is masked then the 'edge' is lost
> +	 *     (i.e., when the channel is unmasked, the guest must manually handle
> +	 *     pending notifications as no upcall will be scheduled by Xen).
> +	 *
> +	 * To expedite scanning of pending notifications, any 0->1 pending
> +	 * transition on an unmasked channel causes a corresponding bit in a
> +	 * per-vcpu selector word to be set. Each bit in the selector covers a
> +	 * 'C long' in the PENDING bitfield array.
> +	 */
> +	xen_ulong_t evtchn_pending[sizeof(xen_ulong_t) * 8];
> +	xen_ulong_t evtchn_mask[sizeof(xen_ulong_t) * 8];
> +
> +	/*
> +	 * Wallclock time: updated only by control software. Guests should base
> +	 * their gettimeofday() syscall on this wallclock-base value.
> +	 */
> +	struct pvclock_wall_clock wc;
> +
> +	struct arch_shared_info arch;
> +
> +};
> +
> +#else /* __ASSEMBLY__ */
> +
> +/* In assembly code we cannot use C numeric constant suffixes. */
> +#define mk_unsigned_long(x) x
> +
> +#endif /* !__ASSEMBLY__ */
> +
> +#endif /* __XEN_PUBLIC_XEN_H__ */
> --
> 2.17.1
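
The two-level lookup described in the shared_info comment quoted above (a
per-VCPU selector word in which each bit covers one word of the PENDING
array) can be sketched roughly as follows. This is a simplified,
single-threaded illustration only: demo_shared, demo_scan and demo_record
are hypothetical names that do not appear in the patch, and a real guest
would need atomic operations where this sketch uses plain loads and stores.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in one word of the bitmaps (hypothetical helper macro). */
#define DEMO_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Cut-down stand-in for the two bitmaps in struct shared_info. */
struct demo_shared {
	unsigned long pending[sizeof(unsigned long) * 8];
	unsigned long mask[sizeof(unsigned long) * 8];
};

/* Records the ports seen by the scan, for demonstration purposes. */
static unsigned int demo_ports[8];
static unsigned int demo_count;

static void demo_record(unsigned int port)
{
	if (demo_count < 8)
		demo_ports[demo_count++] = port;
}

/*
 * Walk the selector word, then only those words of the PENDING array
 * that the selector flags. Masked ports are left pending and skipped.
 */
static void demo_scan(unsigned long *sel, struct demo_shared *sh,
		      void (*handle)(unsigned int port))
{
	unsigned long l1 = *sel;
	unsigned int i, j;

	*sel = 0;	/* a real guest would use an atomic exchange here */
	for (i = 0; i < DEMO_BITS; i++) {
		unsigned long l2;

		if (!(l1 & (1UL << i)))
			continue;
		l2 = sh->pending[i] & ~sh->mask[i];
		sh->pending[i] &= ~l2;	/* PENDING is cleared by the guest */
		for (j = 0; j < DEMO_BITS; j++)
			if (l2 & (1UL << j))
				handle(i * DEMO_BITS + j);
	}
}
```

The port number is simply the word index times the word size plus the bit
index, which is why the selector can narrow the search to a single word.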

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 09/17] lib: sscanf: add sscanf implementation
  2020-07-01 16:29 ` [PATCH 09/17] lib: sscanf: add sscanf implementation Anastasiia Lukianenko
@ 2020-07-02  4:04   ` Heinrich Schuchardt
  0 siblings, 0 replies; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  4:04 UTC (permalink / raw)
  To: u-boot

On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
>
> Port sscanf implementation from mini-os and introduce new
> Kconfig option to enable it: CONFIG_SSCANF. Disable by default.
>
> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  include/vsprintf.h |   8 +
>  lib/Kconfig        |   4 +
>  lib/Makefile       |   1 +
>  lib/sscanf.c       | 883 +++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 896 insertions(+)
>  create mode 100644 lib/sscanf.c
>
> diff --git a/include/vsprintf.h b/include/vsprintf.h
> index d9fb68add0..ca2640dd43 100644
> --- a/include/vsprintf.h
> +++ b/include/vsprintf.h
> @@ -234,4 +234,12 @@ char *strmhz(char *buf, unsigned long hz);
>   */
>  void str_to_upper(const char *in, char *out, size_t len);
>
> +/**
> + * sscanf - Unformat a buffer into a list of arguments
> + * @buf:	input buffer
> + * @fmt:	formatting of buffer
> + * @...:	resulting arguments
> + */
> +int sscanf(const char * buf, const char * fmt, ...);
> +
>  #endif
> diff --git a/lib/Kconfig b/lib/Kconfig
> index af5c38afd9..3dfc6dd0c5 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -67,6 +67,10 @@ config SPL_SPRINTF
>  config TPL_SPRINTF
>  	bool
>
> +config SSCANF
> +	bool
> +	default n
> +
>  config STRTO
>  	bool
>  	default y
> diff --git a/lib/Makefile b/lib/Makefile
> index dc5761966c..65409df15e 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -122,6 +122,7 @@ else
>  # Main U-Boot always uses the full printf support
>  obj-y += vsprintf.o strto.o
>  obj-$(CONFIG_OID_REGISTRY) += oid_registry.o
> +obj-$(CONFIG_SSCANF) += sscanf.o
>  endif
>
>  obj-y += date.o
> diff --git a/lib/sscanf.c b/lib/sscanf.c
> new file mode 100644
> index 0000000000..2123fa4653
> --- /dev/null
> +++ b/lib/sscanf.c
> @@ -0,0 +1,883 @@

Please, add an SPDX header here.


> +/*
> + ****************************************************************************
> + *
> + *        File: printf.c
> + *      Author: Juergen Gross <jgross@suse.com>
> + *
> + *        Date: Jun 2016
> + *
> + * Environment: Xen Minimal OS
> + * Description: Library functions for printing
> + *              (FreeBSD port)
> + *
> + ****************************************************************************

Please, remove this incorrect information


> + */
> +
> +/*-
> + * Copyright (c) 1990, 1993
> + *	The Regents of the University of California.  All rights reserved.
> + *
> + * This code is derived from software contributed to Berkeley by
> + * Chris Torek.
> + *
> + * Copyright (c) 2011 The FreeBSD Foundation
> + * All rights reserved.
> + * Portions of this software were developed by David Chisnall
> + * under sponsorship from the FreeBSD Foundation.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of the University nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */

We don't need to repeat the whole BSD license. We keep it in
Licenses/bsd-3-clause.txt.
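
As a rough sketch of what the shortened header could look like: the
copyright lines are copied from the quoted patch, and the SPDX identifier
BSD-3-Clause is an assumption based on the three-clause text quoted above
and the license file just mentioned.

```c
// SPDX-License-Identifier: BSD-3-Clause
/*
 * Copyright (c) 1990, 1993
 *	The Regents of the University of California.  All rights reserved.
 *
 * This code is derived from software contributed to Berkeley by
 * Chris Torek.
 *
 * Copyright (c) 2011 The FreeBSD Foundation
 * All rights reserved.
 * Portions of this software were developed by David Chisnall
 * under sponsorship from the FreeBSD Foundation.
 */
```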

> +
> +#if !defined HAVE_LIBC
> +
> +#include <os.h>
> +#include <linux/kernel.h>
> +#include <linux/ctype.h>
> +#include <vsprintf.h>
> +#include <linux/string.h>
> +
> +#define __DECONST(type, var)    ((type)(uintptr_t)(const void *)(var))
> +
> +/*
> + * Convert a string to an unsigned long integer.
> + *
> + * Ignores `locale' stuff.  Assumes that the upper and lower case
> + * alphabets and digits are each contiguous.
> + */

Please, use Sphinx style function descriptions.

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation

Integrate the documentation into the HTML documentation generated by

    make htmldocs

> +unsigned long
> +strtoul(const char *nptr, char **endptr, int base)

We already have simple_strtoul() and strict_strtoul(). Why should we now
add a third implementation?

For new library functions, please provide a test in /tests/lib.
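
To illustrate the kind of behaviour such a test could pin down, here is a
small table-driven sketch. It is an assumption-laden example, not a U-Boot
unit test: it calls the host C library's strtoul() as a stand-in, and
strtoul_case, demo_cases and demo_run_cases are hypothetical names. The
quoted implementation appears to follow the same rules (0x and 0 prefixes,
endptr left at the start when no digits are consumed, saturation on
overflow).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* One expected conversion: input, base, value, characters consumed. */
struct strtoul_case {
	const char *in;
	int base;
	unsigned long want;
	int consumed;
};

static const struct strtoul_case demo_cases[] = {
	{ "0x1f", 0, 0x1f, 4 },		/* base 0: 0x prefix selects hex */
	{ "0x1f", 16, 0x1f, 4 },	/* base 16: 0x prefix is accepted */
	{ "017", 0, 15, 3 },		/* base 0: leading 0 selects octal */
	{ "  42abc", 10, 42, 4 },	/* leading blanks skipped, stops at 'a' */
	{ "zz", 10, 0, 0 },		/* no digits: endptr stays at the start */
	{ "99999999999999999999999999", 10, ULONG_MAX, 26 }, /* overflow */
};

/* Returns the number of cases whose result or end position differs. */
static int demo_run_cases(void)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < sizeof(demo_cases) / sizeof(demo_cases[0]); i++) {
		const struct strtoul_case *c = &demo_cases[i];
		char *end;
		unsigned long got = strtoul(c->in, &end, c->base);

		if (got != c->want || end - c->in != (ptrdiff_t)c->consumed)
			failures++;
	}
	return failures;
}
```

The same table could be fed to simple_strtoul() to see where the existing
helpers already behave identically.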

> +{
> +	const char *s = nptr;
> +	unsigned long acc;
> +	unsigned char c;
> +	unsigned long cutoff;
> +	int neg = 0, any, cutlim;
> +
> +	/*
> +	 * See strtol for comments as to the logic used.
> +	 */
> +	do {
> +		c = *s++;
> +	} while (isspace(c));
> +	if (c == '-') {
> +		neg = 1;
> +		c = *s++;
> +	} else if (c == '+') {
> +		c = *s++;
> +	}
> +	if ((base == 0 || base == 16) &&
> +		c == '0' && (*s == 'x' || *s == 'X')) {
> +		c = s[1];
> +		s += 2;
> +		base = 16;
> +	}
> +	if (base == 0)
> +		base = c == '0' ? 8 : 10;
> +	cutoff = (unsigned long)ULONG_MAX / (unsigned long)base;
> +	cutlim = (unsigned long)ULONG_MAX % (unsigned long)base;
> +	for (acc = 0, any = 0;; c = *s++) {
> +		if (!isascii(c))
> +			break;
> +		if (isdigit(c))
> +			c -= '0';
> +		else if (isalpha(c))
> +			c -= isupper(c) ? 'A' - 10 : 'a' - 10;
> +		else
> +			break;
> +		if (c >= base)
> +			break;
> +		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
> +			any = -1;
> +		} else {
> +			any = 1;
> +			acc *= base;
> +			acc += c;
> +		}
> +	}
> +	if (any < 0)
> +		acc = ULONG_MAX;
> +	else if (neg)
> +		acc = -acc;
> +	if (endptr != 0)
> +		*endptr = __DECONST(char *, any ? s - 1 : nptr);
> +	return acc;
> +}
> +
> +/*
> + * Convert a string to a quad integer.
> + *
> + * Ignores `locale' stuff.  Assumes that the upper and lower case
> + * alphabets and digits are each contiguous.
> + */
> +s64
> +strtoq(const char *nptr, char **endptr, int base)

The function is not in any include. So should it be static?

Do you have a use case for this function?

> +{
> +	const char *s;
> +	u64 acc;
> +	unsigned char c;
> +	u64 qbase, cutoff;
> +	int neg, any, cutlim;
> +
> +	/*
> +	 * Skip white space and pick up leading +/- sign if any.
> +	 * If base is 0, allow 0x for hex and 0 for octal, else
> +	 * assume decimal; if base is already 16, allow 0x.
> +	 */
> +	s = nptr;
> +	do {
> +		c = *s++;
> +	} while (isspace(c));
> +	if (c == '-') {
> +		neg = 1;
> +		c = *s++;
> +	} else {
> +		neg = 0;
> +		if (c == '+')
> +			c = *s++;
> +	}
> +	if ((base == 0 || base == 16) &&
> +	    c == '0' && (*s == 'x' || *s == 'X')) {
> +		c = s[1];
> +		s += 2;
> +		base = 16;
> +	}
> +	if (base == 0)
> +		base = c == '0' ? 8 : 10;
> +
> +	/*
> +	 * Compute the cutoff value between legal numbers and illegal
> +	 * numbers.  That is the largest legal value, divided by the
> +	 * base.  An input number that is greater than this value, if
> +	 * followed by a legal input character, is too big.  One that
> +	 * is equal to this value may be valid or not; the limit
> +	 * between valid and invalid numbers is then based on the last
> +	 * digit.  For instance, if the range for quads is
> +	 * [-9223372036854775808..9223372036854775807] and the input base
> +	 * is 10, cutoff will be set to 922337203685477580 and cutlim to
> +	 * either 7 (neg==0) or 8 (neg==1), meaning that if we have
> +	 * accumulated a value > 922337203685477580, or equal but the
> +	 * next digit is > 7 (or 8), the number is too big, and we will
> +	 * return a range error.
> +	 *
> +	 * Set any if any `digits' consumed; make it negative to indicate
> +	 * overflow.
> +	 */
> +	qbase = (unsigned int)base;
> +	cutoff = neg ? (u64)-(LLONG_MIN + LLONG_MAX) + LLONG_MAX : LLONG_MAX;
> +	cutlim = cutoff % qbase;
> +	cutoff /= qbase;
> +	for (acc = 0, any = 0;; c = *s++) {
> +		if (!isascii(c))
> +			break;
> +		if (isdigit(c))
> +			c -= '0';
> +		else if (isalpha(c))
> +			c -= isupper(c) ? 'A' - 10 : 'a' - 10;
> +		else
> +			break;
> +		if (c >= base)
> +			break;
> +		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
> +			any = -1;
> +		} else {
> +			any = 1;
> +			acc *= qbase;
> +			acc += c;
> +		}
> +	}
> +	if (any < 0)
> +		acc = neg ? LLONG_MIN : LLONG_MAX;
> +	else if (neg)
> +		acc = -acc;
> +	if (endptr != 0)
> +		*endptr = __DECONST(char *, any ? s - 1 : nptr);
> +	return acc;
> +}
> +
> +/*
> + * Convert a string to an unsigned quad integer.
> + *
> + * Ignores `locale' stuff.  Assumes that the upper and lower case
> + * alphabets and digits are each contiguous.
> + */
> +u64
> +strtouq(const char *nptr, char **endptr, int base)
> +{
> +	const char *s = nptr;
> +	u64 acc;
> +	unsigned char c;
> +	u64 qbase, cutoff;
> +	int neg, any, cutlim;
> +
> +	/*
> +	 * See strtoq for comments as to the logic used.
> +	 */
> +	do {
> +		c = *s++;
> +	} while (isspace(c));
> +	if (c == '-') {
> +		neg = 1;
> +		c = *s++;
> +	} else {
> +		neg = 0;
> +		if (c == '+')
> +			c = *s++;
> +	}
> +	if ((base == 0 || base == 16) &&
> +		c == '0' && (*s == 'x' || *s == 'X')) {
> +		c = s[1];
> +		s += 2;
> +		base = 16;
> +	}
> +	if (base == 0)
> +		base = c == '0' ? 8 : 10;
> +	qbase = (unsigned int)base;
> +	cutoff = (u64)ULLONG_MAX / qbase;
> +	cutlim = (u64)ULLONG_MAX % qbase;
> +	for (acc = 0, any = 0;; c = *s++) {
> +		if (!isascii(c))
> +			break;
> +		if (isdigit(c))
> +			c -= '0';
> +		else if (isalpha(c))
> +			c -= isupper(c) ? 'A' - 10 : 'a' - 10;

This is just repeating code that we already have in strtoq().
Reduce the code size.
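
One possible way to share the loop, sketched under the assumption that
only the overflow limit differs between the variants. demo_digit_val and
demo_accumulate are hypothetical names; sign handling, prefix detection
and the endptr logic would stay in the callers.

```c
#include <assert.h>
#include <stdint.h>

/* Value of digit c in the given base, or -1 if c is not a valid digit. */
static int demo_digit_val(unsigned char c, int base)
{
	int v;

	if (c >= '0' && c <= '9')
		v = c - '0';
	else if (c >= 'a' && c <= 'z')
		v = c - 'a' + 10;
	else if (c >= 'A' && c <= 'Z')
		v = c - 'A' + 10;
	else
		return -1;
	return v < base ? v : -1;
}

/*
 * Accumulate digits from *sp until a non-digit is found, refusing to
 * exceed limit. Advances *sp past all digits consumed.
 * Returns 1 if digits were consumed, 0 if none, -1 on overflow.
 */
static int demo_accumulate(const char **sp, int base, uint64_t limit,
			   uint64_t *out)
{
	const char *s = *sp;
	uint64_t acc = 0;
	uint64_t cutoff = limit / (uint64_t)base;
	int cutlim = (int)(limit % (uint64_t)base);
	int any = 0;
	int d;

	while ((d = demo_digit_val((unsigned char)*s, base)) >= 0) {
		if (any < 0 || acc > cutoff || (acc == cutoff && d > cutlim)) {
			any = -1;	/* keep consuming digits, as the original does */
		} else {
			any = 1;
			acc = acc * (uint64_t)base + (uint64_t)d;
		}
		s++;
	}
	*sp = s;
	*out = acc;
	return any;
}
```

With such a helper, the unsigned variant would pass the maximum unsigned
value as limit, while the signed variant would pass the magnitude of the
most negative or most positive value depending on the sign.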

> +		else
> +			break;
> +		if (c >= base)
> +			break;
> +		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
> +			any = -1;
> +		} else {
> +			any = 1;
> +			acc *= qbase;
> +			acc += c;
> +		}
> +	}
> +	if (any < 0)
> +		acc = ULLONG_MAX;
> +	else if (neg)
> +		acc = -acc;
> +	if (endptr != 0)
> +		*endptr = __DECONST(char *, any ? s - 1 : nptr);
> +	return acc;
> +}
> +
> +/*
> + * Fill in the given table from the scanset at the given format
> + * (just after `[').  Return a pointer to the character past the
> + * closing `]'.  The table has a 1 wherever characters should be
> + * considered part of the scanset.
> + */
> +static const u_char *
> +__sccl(char *tab, const u_char *fmt)
> +{
> +	int c, n, v;
> +
> +	/* first `clear' the whole table */
> +	c = *fmt++;             /* first char hat => negated scanset */
> +	if (c == '^') {
> +		v = 1;          /* default => accept */
> +		c = *fmt++;     /* get new first char */
> +	} else {
> +		v = 0;          /* default => reject */
> +	}
> +
> +	/* XXX: Will not work if sizeof(tab*) > sizeof(char) */
> +	for (n = 0; n < 256; n++)
> +		tab[n] = v;        /* memset(tab, v, 256) */
> +
> +	if (c == 0)
> +		return (fmt - 1);/* format ended before closing ] */
> +
> +	/*
> +	 * Now set the entries corresponding to the actual scanset
> +	 * to the opposite of the above.
> +	 *
> +	 * The first character may be ']' (or '-') without being special;
> +	 * the last character may be '-'.
> +	 */
> +	v = 1 - v;
> +	for (;;) {
> +		tab[c] = v;             /* take character c */
> +doswitch:
> +		n = *fmt++;             /* and examine the next */
> +		switch (n) {
> +		case 0:                 /* format ended too soon */
> +			return (fmt - 1);
> +
> +		case '-':
> +			/*
> +			 * A scanset of the form
> +			 *      [01+-]
> +			 * is defined as `the digit 0, the digit 1,
> +			 * the character +, the character -', but
> +			 * the effect of a scanset such as
> +			 *      [a-zA-Z0-9]
> +			 * is implementation defined.  The V7 Unix
> +			 * scanf treats `a-z' as `the letters a through
> +			 * z', but treats `a-a' as `the letter a, the
> +			 * character -, and the letter a'.
> +			 *
> +			 * For compatibility, the `-' is not considerd
> +			 * to define a range if the character following
> +			 * it is either a close bracket (required by ANSI)
> +			 * or is not numerically greater than the character
> +			 * we just stored in the table (c).
> +			 */
> +			n = *fmt;
> +			if (n == ']' || n < c) {
> +				c = '-';
> +				break;  /* resume the for(;;) */
> +			}
> +			fmt++;
> +			/* fill in the range */
> +			do {
> +				tab[++c] = v;
> +			} while (c < n);
> +			c = n;
> +			/*
> +			 * Alas, the V7 Unix scanf also treats formats
> +			 * such as [a-c-e] as `the letters a through e'.
> +			 * This too is permitted by the standard....
> +			 */
> +			goto doswitch;
> +			break;
> +
> +		case ']':               /* end of scanset */
> +			return (fmt);
> +
> +		default:                /* just another character */
> +			c = n;
> +			break;
> +		}
> +	}
> +	/* NOTREACHED */
> +}
> +
> +/**
> + * vsscanf - Unformat a buffer into a list of arguments
> + * @buf:	input buffer
> + * @fmt:	format of buffer
> + * @args:	arguments
> + */
> +#define BUF             32      /* Maximum length of numeric string. */
> +
> +/*
> + * Flags used during conversion.
> + */
> +#define LONG            0x01    /* l: long or double */
> +#define SHORT           0x04    /* h: short */
> +#define SUPPRESS        0x08    /* suppress assignment */
> +#define POINTER         0x10    /* weird %p pointer (`fake hex') */
> +#define NOSKIP          0x20    /* do not skip blanks */
> +#define QUAD            0x400
> +#define SHORTSHORT      0x4000  /** hh: char */
> +
> +/*
> + * The following are used in numeric conversions only:
> + * SIGNOK, NDIGITS, DPTOK, and EXPOK are for floating point;
> + * SIGNOK, NDIGITS, PFXOK, and NZDIGITS are for integral.
> + */
> +#define SIGNOK          0x40    /* +/- is (still) legal */
> +#define NDIGITS         0x80    /* no digits detected */
> +
> +#define DPTOK           0x100   /* (float) decimal point is still legal */
> +#define EXPOK           0x200   /* (float) exponent (e+3, etc) still legal */
> +
> +#define PFXOK           0x100   /* 0x prefix is (still) legal */
> +#define NZDIGITS        0x200   /* no zero digits detected */
> +
> +/*
> + * Conversion types.
> + */
> +#define CT_CHAR         0       /* %c conversion */
> +#define CT_CCL          1       /* %[...] conversion */
> +#define CT_STRING       2       /* %s conversion */
> +#define CT_INT          3       /* integer, i.e., strtoq or strtouq */
> +typedef u64 (*ccfntype)(const char *, char **, int);
> +
> +int
> +vsscanf(const char *inp, char const *fmt0, va_list ap)

Please, provide a test in test/lib.
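
To illustrate the kind of coverage meant here, the following host-side
sketch checks a few representative conversions. It runs against the C
library's sscanf() as a stand-in, and check_sscanf_behaviour() is a
hypothetical name; a real test under test/lib would presumably express
the same expectations with the U-Boot unit test macros and call the new
implementation instead.

```c
#include <stdio.h>
#include <string.h>

/* Return 0 if every check passes, else the number of the failing check */
int check_sscanf_behaviour(void)
{
	int a = 0, b = 0, n = 0;
	unsigned int x = 0;
	char word[16] = "";
	char cls[16] = "";

	/* Two decimal conversions separated by white space */
	if (sscanf("12 -34", "%d %d", &a, &b) != 2 || a != 12 || b != -34)
		return 1;
	/* %x accepts an optional 0x prefix */
	if (sscanf("0x1f", "%x", &x) != 1 || x != 0x1f)
		return 2;
	/* %i derives the base from the prefix, so a leading 0 means octal */
	if (sscanf("010", "%i", &a) != 1 || a != 8)
		return 3;
	/* %s with a field width stops after that many characters */
	if (sscanf("abcdef", "%3s", word) != 1 || strcmp(word, "abc"))
		return 4;
	/* A scanset stops at the first character outside the set; %n is not counted */
	if (sscanf("abc123", "%[a-c]%n", cls, &n) != 1 ||
	    strcmp(cls, "abc") || n != 3)
		return 5;
	/* A match failure returns the number of assignments made so far */
	if (sscanf("7 x", "%d %d", &a, &b) != 1 || a != 7)
		return 6;
	/* An input failure before any conversion returns EOF */
	if (sscanf("", "%d", &a) != EOF)
		return 7;
	return 0;
}
```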

Best regards

Heinrich

> +{
> +	int inr;
> +	const u_char *fmt = (const u_char *)fmt0;
> +	int c;                  /* character from format, or conversion */
> +	size_t width;           /* field width, or 0 */
> +	char *p;                /* points into all kinds of strings */
> +	int n;                  /* handy integer */
> +	int flags;              /* flags as defined above */
> +	char *p0;               /* saves original value of p when necessary */
> +	int nassigned;          /* number of fields assigned */
> +	int nconversions;       /* number of conversions */
> +	int nread;              /* number of characters consumed from fp */
> +	int base;               /* base argument to strtoq/strtouq */
> +	ccfntype ccfn;          /* conversion function (strtoq/strtouq) */
> +	char ccltab[256];       /* character class table for %[...] */
> +	char buf[BUF];          /* buffer for numeric conversions */
> +
> +	/* `basefix' is used to avoid `if' tests in the integer scanner */
> +	static short basefix[17] = { 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
> +				     12, 13, 14, 15, 16 };
> +
> +	inr = strlen(inp);
> +
> +	nassigned = 0;
> +	nconversions = 0;
> +	nread = 0;
> +	base = 0;               /* XXX just to keep gcc happy */
> +	ccfn = NULL;            /* XXX just to keep gcc happy */
> +	for (;;) {
> +		c = *fmt++;
> +		if (c == 0)
> +			return (nassigned);
> +		if (isspace(c)) {
> +			while (inr > 0 && isspace(*inp))
> +				nread++, inr--, inp++;
> +			continue;
> +		}
> +		if (c != '%')
> +			goto literal;
> +		width = 0;
> +		flags = 0;
> +		/*
> +		 * switch on the format.  continue if done;
> +		 * break once format type is derived.
> +		 */
> +again:          c = *fmt++;
> +		switch (c) {
> +		case '%':
> +literal:
> +			if (inr <= 0)
> +				goto input_failure;
> +			if (*inp != c)
> +				goto match_failure;
> +			inr--, inp++;
> +			nread++;
> +			continue;
> +
> +		case '*':
> +			flags |= SUPPRESS;
> +			goto again;
> +		case 'l':
> +			if (flags & LONG) {
> +				flags &= ~LONG;
> +				flags |= QUAD;
> +			} else {
> +				flags |= LONG;
> +			}
> +			goto again;
> +		case 'q':
> +			flags |= QUAD;
> +			goto again;
> +		case 'h':
> +			if (flags & SHORT) {
> +				flags &= ~SHORT;
> +				flags |= SHORTSHORT;
> +			} else {
> +				flags |= SHORT;
> +			}
> +			goto again;
> +
> +		case '0': case '1': case '2': case '3': case '4':
> +		case '5': case '6': case '7': case '8': case '9':
> +			width = width * 10 + c - '0';
> +			goto again;
> +
> +		/*
> +		 * Conversions.
> +		 *
> +		 */
> +		case 'd':
> +			c = CT_INT;
> +			ccfn = (ccfntype)strtoq;
> +			base = 10;
> +			break;
> +
> +		case 'i':
> +			c = CT_INT;
> +			ccfn = (ccfntype)strtoq;
> +			base = 0;
> +			break;
> +
> +		case 'o':
> +			c = CT_INT;
> +			ccfn = strtouq;
> +			base = 8;
> +			break;
> +
> +		case 'u':
> +			c = CT_INT;
> +			ccfn = strtouq;
> +			base = 10;
> +			break;
> +
> +		case 'x':
> +			flags |= PFXOK; /* enable 0x prefixing */
> +			c = CT_INT;
> +			ccfn = strtouq;
> +			base = 16;
> +			break;
> +
> +		case 's':
> +			c = CT_STRING;
> +			break;
> +
> +		case '[':
> +			fmt = __sccl(ccltab, fmt);
> +			flags |= NOSKIP;
> +			c = CT_CCL;
> +			break;
> +
> +		case 'c':
> +			flags |= NOSKIP;
> +			c = CT_CHAR;
> +			break;
> +
> +		case 'p':       /* pointer format is like hex */
> +			flags |= POINTER | PFXOK;
> +			c = CT_INT;
> +			ccfn = strtouq;
> +			base = 16;
> +			break;
> +
> +		case 'n':
> +			nconversions++;
> +			if (flags & SUPPRESS)   /* ??? */
> +				continue;
> +			if (flags & SHORTSHORT)
> +				*va_arg(ap, char *) = nread;
> +			else if (flags & SHORT)
> +				*va_arg(ap, short *) = nread;
> +			else if (flags & LONG)
> +				*va_arg(ap, long *) = nread;
> +			else if (flags & QUAD)
> +				*va_arg(ap, s64 *) = nread;
> +			else
> +				*va_arg(ap, int *) = nread;
> +			continue;
> +		}
> +
> +		/*
> +		 * We have a conversion that requires input.
> +		 */
> +		if (inr <= 0)
> +			goto input_failure;
> +
> +		/*
> +		 * Consume leading white space, except for formats
> +		 * that suppress this.
> +		 */
> +		if ((flags & NOSKIP) == 0) {
> +			while (isspace(*inp)) {
> +				nread++;
> +				if (--inr > 0)
> +					inp++;
> +				else
> +					goto input_failure;
> +			}
> +			/*
> +			 * Note that there is at least one character in
> +			 * the buffer, so conversions that do not set NOSKIP
> +			 * can no longer result in an input failure.
> +			 */
> +		}
> +
> +		/*
> +		 * Do the conversion.
> +		 */
> +		switch (c) {
> +		case CT_CHAR:
> +			/* scan arbitrary characters (sets NOSKIP) */
> +			if (width == 0)
> +				width = 1;
> +			if (flags & SUPPRESS) {
> +				size_t sum = 0;
> +
> +				if ((n = inr) < width) {
> +					sum += n;
> +					width -= n;
> +					inp += n;
> +					if (sum == 0)
> +						goto input_failure;
> +				} else {
> +					sum += width;
> +					inr -= width;
> +					inp += width;
> +				}
> +				nread += sum;
> +			} else {
> +				memcpy(va_arg(ap, char *), inp, width);
> +				inr -= width;
> +				inp += width;
> +				nread += width;
> +				nassigned++;
> +			}
> +			nconversions++;
> +			break;
> +
> +		case CT_CCL:
> +			/* scan a (nonempty) character class (sets NOSKIP) */
> +			if (width == 0)
> +				width = (size_t)~0;     /* `infinity' */
> +			/* take only those things in the class */
> +			if (flags & SUPPRESS) {
> +				n = 0;
> +				while (ccltab[(unsigned char)*inp]) {
> +					n++, inr--, inp++;
> +					if (--width == 0)
> +						break;
> +					if (inr <= 0) {
> +						if (n == 0)
> +							goto input_failure;
> +						break;
> +					}
> +				}
> +				if (n == 0)
> +					goto match_failure;
> +			} else {
> +				p = va_arg(ap, char *);
> +				p0 = p;
> +				while (ccltab[(unsigned char)*inp]) {
> +					inr--;
> +					*p++ = *inp++;
> +					if (--width == 0)
> +						break;
> +					if (inr <= 0) {
> +						if (p == p0)
> +							goto input_failure;
> +						break;
> +					}
> +				}
> +				n = p - p0;
> +				if (n == 0)
> +					goto match_failure;
> +				*p = 0;
> +				nassigned++;
> +			}
> +			nread += n;
> +			nconversions++;
> +			break;
> +
> +		case CT_STRING:
> +			/* like CCL, but zero-length string OK, & no NOSKIP */
> +			if (width == 0)
> +				width = (size_t)~0;
> +			if (flags & SUPPRESS) {
> +				n = 0;
> +				while (!isspace(*inp)) {
> +					n++, inr--, inp++;
> +					if (--width == 0)
> +						break;
> +					if (inr <= 0)
> +						break;
> +				}
> +				nread += n;
> +			} else {
> +				p = va_arg(ap, char *);
> +				p0 = p;
> +				while (!isspace(*inp)) {
> +					inr--;
> +					*p++ = *inp++;
> +					if (--width == 0)
> +						break;
> +					if (inr <= 0)
> +						break;
> +				}
> +				*p = 0;
> +				nread += p - p0;
> +				nassigned++;
> +			}
> +			nconversions++;
> +			continue;
> +
> +		case CT_INT:
> +			/* scan an integer as if by strtoq/strtouq */
> +#ifdef hardway
> +			if (width == 0 || width > sizeof(buf) - 1)
> +				width = sizeof(buf) - 1;
> +#else
> +			/* size_t is unsigned, hence this optimisation */
> +			if (--width > sizeof(buf) - 2)
> +				width = sizeof(buf) - 2;
> +			width++;
> +#endif
> +			flags |= SIGNOK | NDIGITS | NZDIGITS;
> +			for (p = buf; width; width--) {
> +				c = *inp;
> +				/*
> +				 * Switch on the character; `goto ok'
> +				 * if we accept it as a part of number.
> +				 */
> +				switch (c) {
> +				/*
> +				 * The digit 0 is always legal, but is
> +				 * special.  For %i conversions, if no
> +				 * digits (zero or nonzero) have been
> +				 * scanned (only signs), we will have
> +				 * base==0.  In that case, we should set
> +				 * it to 8 and enable 0x prefixing.
> +				 * Also, if we have not scanned zero digits
> +				 * before this, do not turn off prefixing
> +				 * (someone else will turn it off if we
> +				 * have scanned any nonzero digits).
> +				 */
> +				case '0':
> +					if (base == 0) {
> +						base = 8;
> +						flags |= PFXOK;
> +					}
> +					if (flags & NZDIGITS)
> +						flags &= ~(SIGNOK | NZDIGITS | NDIGITS);
> +					else
> +						flags &= ~(SIGNOK | PFXOK | NDIGITS);
> +					goto ok;
> +
> +				/* 1 through 7 always legal */
> +				case '1': case '2': case '3':
> +				case '4': case '5': case '6': case '7':
> +					base = basefix[base];
> +					flags &= ~(SIGNOK | PFXOK | NDIGITS);
> +					goto ok;
> +
> +				/* digits 8 and 9 ok iff decimal or hex */
> +				case '8': case '9':
> +					base = basefix[base];
> +					if (base <= 8)
> +						break;  /* not legal here */
> +					flags &= ~(SIGNOK | PFXOK | NDIGITS);
> +					goto ok;
> +
> +				/* letters ok iff hex */
> +				case 'A': case 'B': case 'C':
> +				case 'D': case 'E': case 'F':
> +				case 'a': case 'b': case 'c':
> +				case 'd': case 'e': case 'f':
> +					/* no need to fix base here */
> +					if (base <= 10)
> +						break;  /* not legal here */
> +					flags &= ~(SIGNOK | PFXOK | NDIGITS);
> +					goto ok;
> +
> +				/* sign ok only as first character */
> +				case '+': case '-':
> +					if (flags & SIGNOK) {
> +						flags &= ~SIGNOK;
> +						goto ok;
> +						}
> +					break;
> +
> +				/* x ok iff flag still set & 2nd char */
> +				case 'x': case 'X':
> +					if (flags & PFXOK && p == buf + 1) {
> +						base = 16;      /* if %i */
> +						flags &= ~PFXOK;
> +						goto ok;
> +					}
> +					break;
> +				}
> +
> +				/*
> +				 * If we got here, c is not a legal character
> +				 * for a number.  Stop accumulating digits.
> +				 */
> +				break;
> +ok:
> +				/*
> +				 * c is legal: store it and look at the next.
> +				 */
> +				*p++ = c;
> +				if (--inr > 0)
> +					inp++;
> +				else
> +					break;          /* end of input */
> +			}
> +			/*
> +			 * If we had only a sign, it is no good; push
> +			 * back the sign.  If the number ends in `x',
> +			 * it was [sign] '' 'x', so push back the x
> +			 * and treat it as [sign] ''.
> +			 */
> +			if (flags & NDIGITS) {
> +				if (p > buf) {
> +					inp--;
> +					inr++;
> +				}
> +				goto match_failure;
> +			}
> +			c = ((u_char *)p)[-1];
> +			if (c == 'x' || c == 'X') {
> +				--p;
> +				inp--;
> +				inr++;
> +			}
> +			if ((flags & SUPPRESS) == 0) {
> +				u64 res;
> +
> +				*p = 0;
> +				res = (*ccfn)(buf, (char **)NULL, base);
> +				if (flags & POINTER)
> +					*va_arg(ap, void **) =
> +					(void *)(uintptr_t)res;
> +				else if (flags & SHORTSHORT)
> +					*va_arg(ap, char *) = res;
> +				else if (flags & SHORT)
> +					*va_arg(ap, short *) = res;
> +				else if (flags & LONG)
> +					*va_arg(ap, long *) = res;
> +				else if (flags & QUAD)
> +					*va_arg(ap, s64 *) = res;
> +				else
> +					*va_arg(ap, int *) = res;
> +				nassigned++;
> +			}
> +			nread += p - buf;
> +			nconversions++;
> +			break;
> +		}
> +	}
> +input_failure:
> +		return (nconversions != 0 ? nassigned : -1);
> +match_failure:
> +		return (nassigned);
> +}
> +
> +/**
> + * sscanf - Unformat a buffer into a list of arguments
> + * @buf:	input buffer
> + * @fmt:	formatting of buffer
> + * @...:	resulting arguments
> + */
> +int sscanf(const char *buf, const char *fmt, ...)
> +{
> +	va_list args;
> +	int i;
> +
> +	va_start(args, fmt);
> +	i = vsscanf(buf, fmt, args);
> +	va_end(args);
> +	return i;
> +}
> +
> +#endif
>


* [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro
  2020-07-01 16:29 ` [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro Anastasiia Lukianenko
@ 2020-07-02  4:08   ` Heinrich Schuchardt
  2020-07-03 13:02     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  4:08 UTC (permalink / raw)
  To: u-boot

On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Add  wait_event_timeout - sleep until a condition gets true or a
> timeout elapses.
>
> This is a stripped version of the same from Linux kernel with the
> following u-boot specific modifications:
> - no wait queues supported
> - use u-boot timer to detect timeouts
> - check for Ctrl-C pressed during wait
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  include/linux/compat.h | 45 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
>
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index 712eeaef4e..5375b7d3b8 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -1,12 +1,20 @@
>  #ifndef _LINUX_COMPAT_H_
>  #define _LINUX_COMPAT_H_
>
> +#include <console.h>
>  #include <log.h>
>  #include <malloc.h>
> +
> +#include <asm/processor.h>
> +
>  #include <linux/types.h>
>  #include <linux/err.h>
>  #include <linux/kernel.h>
>
> +#ifdef CONFIG_XEN
> +#include <xen/events.h>
> +#endif
> +
>  struct unused {};
>  typedef struct unused unused_t;
>
> @@ -122,6 +130,43 @@ static inline void kmem_cache_destroy(struct kmem_cache *cachep)
>  #define add_wait_queue(...)	do { } while (0)
>  #define remove_wait_queue(...)	do { } while (0)
>
> +#ifndef CONFIG_XEN
> +#define eventchn_poll()
> +#endif
> +
> +#define __wait_event_timeout(condition, timeout, ret)		\
> +({								\
> +	ulong __ret = ret; /* explicit shadow */		\
> +	ulong start = get_timer(0);				\
> +	for (;;) {						\
> +		eventchn_poll();				\
> +		if (condition) {				\
> +			__ret = 1;				\
> +			break;					\
> +	}							\
> +	if ((get_timer(start) > timeout) || ctrlc()) {		\
> +		__ret = 0;					\
> +		break;						\
> +	}							\
> +	cpu_relax();						\
> +	}							\
> +	__ret;							\
> +})
> +
> +/*
> + * 0 if the @condition evaluated to %false after the @timeout elapsed,
> + * 1 if the @condition evaluated to %true
> + */

Please, document all arguments. Use Sphinx style as in

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation.
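
As an illustration of the requested style, here is a sketch of a
kernel-doc comment for the macro. To keep the example compilable outside
U-Boot, get_timer(), ctrlc(), cpu_relax() and eventchn_poll() are
replaced by hypothetical host-side stand-ins, the two macros are folded
into one, and the result is initialized to 0 instead of being passed in
uninitialized. The wording of the descriptions, including the timeout
unit, is an assumption and should be checked against the real helpers.
Statement expressions are a GNU C extension, as in the patch.

```c
/* Hypothetical stand-ins for the U-Boot helpers used by the macro */
static unsigned long fake_ticks;

static unsigned long get_timer(unsigned long base)
{
	/* The fake clock advances by one tick on every call */
	return ++fake_ticks - base;
}

static int ctrlc(void)
{
	return 0;	/* pretend Ctrl-C is never pressed */
}

#define cpu_relax()	do { } while (0)
#define eventchn_poll()	do { } while (0)

/**
 * wait_event_timeout() - poll until a condition is true or a timeout elapses
 * @wq_head:	wait queue head; unused, kept for Linux API compatibility
 * @condition:	C expression that is re-evaluated on every loop iteration
 * @timeout:	timeout in the units returned by get_timer()
 *
 * Return: 1 if @condition evaluated to true, 0 if @timeout elapsed or
 *	   Ctrl-C was pressed first
 */
#define wait_event_timeout(wq_head, condition, timeout)			\
({									\
	unsigned long __ret = 0;					\
	unsigned long __start = get_timer(0);				\
	for (;;) {							\
		eventchn_poll();					\
		if (condition) {					\
			__ret = 1;					\
			break;						\
		}							\
		if (get_timer(__start) > (timeout) || ctrlc())		\
			break;						\
		cpu_relax();						\
	}								\
	__ret;								\
})

static int counter;

/* Small usage example: wait until the counter reaches @target */
int wait_until_count(int target, unsigned long timeout)
{
	counter = 0;
	return wait_event_timeout(0, ++counter >= target, timeout);
}
```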

Best regards

Heinrich.

> +#define wait_event_timeout(wq_head, condition, timeout)			\
> +({									\
> +	ulong __ret;							\
> +	if (condition)							\
> +		__ret = 1;						\
> +	else								\
> +		__ret = __wait_event_timeout(condition, timeout, __ret);\
> +	__ret;								\
> +})
> +
>  #define KERNEL_VERSION(a,b,c)	(((a) << 16) + ((b) << 8) + (c))
>
>  /* This is also defined in ARMv8's mmu.h */
>


* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-01 16:29 ` [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver Anastasiia Lukianenko
@ 2020-07-02  4:17   ` Heinrich Schuchardt
  2020-07-03 13:25     ` Anastasiia Lukianenko
  2020-07-02  4:29   ` Heinrich Schuchardt
  1 sibling, 1 reply; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  4:17 UTC (permalink / raw)
  To: u-boot

On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Add initial infrastructure for Xen para-virtualized block device.
> This includes compile-time configuration and the skeleton for
> the future driver implementation.
> Add new class UCLASS_PVBLOCK which is going to be a parent for
> virtual block devices.
> Add new interface type IF_TYPE_PVBLOCK.
>
> Implement basic driver setup by reading XenStore configuration.

Please add documentation for the board in doc/board/.

>
> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  cmd/Kconfig                      |   7 ++
>  cmd/Makefile                     |   1 +
>  cmd/pvblock.c                    |  31 ++++++++
>  common/board_r.c                 |  14 ++++
>  configs/xenguest_arm64_defconfig |   4 +
>  disk/part.c                      |   4 +
>  drivers/Kconfig                  |   2 +
>  drivers/block/blk-uclass.c       |   2 +
>  drivers/xen/Kconfig              |  10 +++
>  drivers/xen/Makefile             |   2 +
>  drivers/xen/pvblock.c            | 121 +++++++++++++++++++++++++++++++
>  include/blk.h                    |   1 +
>  include/configs/xenguest_arm64.h |   8 ++
>  include/dm/uclass-id.h           |   1 +
>  include/pvblock.h                |  12 +++
>  15 files changed, 220 insertions(+)
>  create mode 100644 cmd/pvblock.c
>  create mode 100644 drivers/xen/Kconfig
>  create mode 100644 drivers/xen/pvblock.c
>  create mode 100644 include/pvblock.h
>
> diff --git a/cmd/Kconfig b/cmd/Kconfig
> index 192b3b262f..f28576947b 100644
> --- a/cmd/Kconfig
> +++ b/cmd/Kconfig
> @@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
>  	help
>  	  USB mass storage support
>
> +config CMD_PVBLOCK
> +	bool "Xen para-virtualized block device"
> +	depends on XEN
> +	select PVBLOCK
> +	help
> +	  Xen para-virtualized block device support
> +
>  config CMD_VIRTIO
>  	bool "virtio"
>  	depends on VIRTIO
> diff --git a/cmd/Makefile b/cmd/Makefile
> index 974ad48b0a..117284a28c 100644
> --- a/cmd/Makefile
> +++ b/cmd/Makefile
> @@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
>  obj-$(CONFIG_CMD_GPT) += gpt.o
>  obj-$(CONFIG_CMD_ETHSW) += ethsw.o
>  obj-$(CONFIG_CMD_AXI) += axi.o
> +obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
>
>  # Power
>  obj-$(CONFIG_CMD_PMIC) += pmic.o
> diff --git a/cmd/pvblock.c b/cmd/pvblock.c
> new file mode 100644
> index 0000000000..7dbb243a74
> --- /dev/null
> +++ b/cmd/pvblock.c
> @@ -0,0 +1,31 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+

The SPDX identifier should be on the first line and formatted as described in

https://www.kernel.org/doc/html/latest/process/license-rules.html#license-identifier-syntax
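
For illustration, and assuming the kernel license rules linked above are
applied unchanged here, the top of the two kinds of files touched by this
patch would look roughly as follows. C sources use a // comment for the
identifier, headers use a /* */ comment; the remaining header text is
copied from the patch.

```c
/* --- top of a C source file such as cmd/pvblock.c --- */
// SPDX-License-Identifier: GPL-2.0+
/*
 * (C) Copyright 2020 EPAM Systems Inc.
 *
 * XEN para-virtualized block device support
 */

/* --- top of a header such as include/pvblock.h --- */
/* SPDX-License-Identifier: GPL-2.0+ */
/*
 * (C) 2020 EPAM Systems Inc.
 */
```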

> + *
> + * (C) Copyright 2020 EPAM Systems Inc.
> + *
> + * XEN para-virtualized block device support
> + */
> +
> +#include <blk.h>
> +#include <common.h>
> +#include <command.h>
> +
> +/* Current I/O Device	*/
> +static int pvblock_curr_device;
> +
> +int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char *const argv[])
> +{
> +	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
> +			      &pvblock_curr_device);
> +}
> +
> +U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
> +	   "Xen para-virtualized block device",
> +	   "info  - show available block devices\n"
> +	   "pvblock device [dev] - show or set current device\n"
> +	   "pvblock part [dev] - print partition table of one or all devices\n"
> +	   "pvblock read  addr blk# cnt\n"
> +	   "pvblock write addr blk# cnt - read/write `cnt'"
> +	   " blocks starting at block `blk#'\n"
> +	   "    to/from memory address `addr'");
> +
> diff --git a/common/board_r.c b/common/board_r.c
> index fd36edb4e5..40cd0e5d3c 100644
> --- a/common/board_r.c
> +++ b/common/board_r.c
> @@ -49,6 +49,7 @@
>  #include <nand.h>
>  #include <of_live.h>
>  #include <onenand_uboot.h>
> +#include <pvblock.h>
>  #include <scsi.h>
>  #include <serial.h>
>  #include <status_led.h>
> @@ -470,6 +471,16 @@ static int initr_xen(void)
>  	return 0;
>  }
>  #endif
> +
> +#ifdef CONFIG_PVBLOCK
> +static int initr_pvblock(void)
> +{
> +	puts("PVBLOCK: ");
> +	pvblock_init();
> +	return 0;
> +}
> +#endif
> +
>  /*
>   * Tell if it's OK to load the environment early in boot.
>   *
> @@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
>  #endif
>  #ifdef CONFIG_XEN
>  	initr_xen,
> +#endif
> +#ifdef CONFIG_PVBLOCK
> +	initr_pvblock,
>  #endif
>  	initr_env,
>  #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
> index 45559a161b..46473c251d 100644
> --- a/configs/xenguest_arm64_defconfig
> +++ b/configs/xenguest_arm64_defconfig
> @@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
>  CONFIG_CMD_BOOTEFI=n
>  CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
>  CONFIG_CMD_ELF=n
> +CONFIG_CMD_EXT4=y
> +CONFIG_CMD_FAT=y
>  CONFIG_CMD_GO=n
>  CONFIG_CMD_RUN=n
>  CONFIG_CMD_IMI=n
> @@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
>  CONFIG_CMD_SAVEENV=n
>  CONFIG_CMD_UMS=n
>
> +CONFIG_CMD_PVBLOCK=y
> +
>  #CONFIG_USB=n
>  # CONFIG_ISO_PARTITION is not set
>
> diff --git a/disk/part.c b/disk/part.c
> index f6a31025dc..b69fd345f3 100644
> --- a/disk/part.c
> +++ b/disk/part.c
> @@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
>  	case IF_TYPE_MMC:
>  	case IF_TYPE_USB:
>  	case IF_TYPE_NVME:
> +	case IF_TYPE_PVBLOCK:
>  		printf ("Vendor: %s Rev: %s Prod: %s\n",
>  			dev_desc->vendor,
>  			dev_desc->revision,
> @@ -288,6 +289,9 @@ static void print_part_header(const char *type, struct blk_desc *dev_desc)
>  	case IF_TYPE_NVME:
>  		puts ("NVMe");
>  		break;
> +	case IF_TYPE_PVBLOCK:
> +		puts("PV BLOCK");
> +		break;
>  	case IF_TYPE_VIRTIO:
>  		puts("VirtIO");
>  		break;
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index e34a22708c..65076aab03 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
>
>  source "drivers/watchdog/Kconfig"
>
> +source "drivers/xen/Kconfig"
> +
>  config PHYS_TO_BUS
>  	bool "Custom physical to bus address mapping"
>  	help
> diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> index b19375cbc8..6cfabbca24 100644
> --- a/drivers/block/blk-uclass.c
> +++ b/drivers/block/blk-uclass.c
> @@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT] = {
>  	[IF_TYPE_NVME]		= "nvme",
>  	[IF_TYPE_EFI]		= "efi",
>  	[IF_TYPE_VIRTIO]	= "virtio",
> +	[IF_TYPE_PVBLOCK]	= "pvblock",
>  };
>
>  static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> @@ -43,6 +44,7 @@ static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
>  	[IF_TYPE_NVME]		= UCLASS_NVME,
>  	[IF_TYPE_EFI]		= UCLASS_EFI,
>  	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
> +	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
>  };
>
>  static enum if_type if_typename_to_iftype(const char *if_typename)
> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> new file mode 100644
> index 0000000000..6ad2a93668
> --- /dev/null
> +++ b/drivers/xen/Kconfig
> @@ -0,0 +1,10 @@
> +config PVBLOCK
> +	bool "Xen para-virtualized block device"
> +	depends on DM
> +	select BLK
> +	select HAVE_BLOCK_DEVICE
> +	help
> +	  This driver implements the front-end of the Xen virtual
> +	  block device driver. It communicates with a back-end driver
> +	  in another domain which drives the actual block device.
> +
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 243b13277a..87157df69b 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -6,3 +6,5 @@ obj-y += hypervisor.o
>  obj-y += events.o
>  obj-y += xenbus.o
>  obj-y += gnttab.o
> +
> +obj-$(CONFIG_PVBLOCK) += pvblock.o
> diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> new file mode 100644
> index 0000000000..057add9753
> --- /dev/null
> +++ b/drivers/xen/pvblock.c
> @@ -0,0 +1,121 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+
> + *
> + * (C) Copyright 2020 EPAM Systems Inc.
> + */
> +#include <blk.h>
> +#include <common.h>
> +#include <dm.h>
> +#include <dm/device-internal.h>
> +
> +#define DRV_NAME	"pvblock"
> +#define DRV_NAME_BLK	"pvblock_blk"
> +
> +struct blkfront_dev {
> +	char dummy;
> +};
> +
> +static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
> +{
> +	return 0;
> +}
> +
> +static void shutdown_blkfront(struct blkfront_dev *dev)
> +{
> +}
> +
> +ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> +		       void *buffer)
> +{
> +	return 0;
> +}
> +
> +ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> +			const void *buffer)
> +{
> +	return 0;
> +}
> +
> +static int pvblock_blk_bind(struct udevice *udev)
> +{
> +	return 0;
> +}
> +
> +static int pvblock_blk_probe(struct udevice *udev)
> +{
> +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> +	int ret;
> +
> +	ret = init_blkfront(0, blk_dev);
> +	if (ret < 0)
> +		return ret;
> +	return 0;
> +}
> +
> +static int pvblock_blk_remove(struct udevice *udev)
> +{
> +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> +
> +	shutdown_blkfront(blk_dev);
> +	return 0;
> +}
> +
> +static const struct blk_ops pvblock_blk_ops = {
> +	.read	= pvblock_blk_read,
> +	.write	= pvblock_blk_write,
> +};
> +
> +U_BOOT_DRIVER(pvblock_blk) = {
> +	.name			= DRV_NAME_BLK,
> +	.id			= UCLASS_BLK,
> +	.ops			= &pvblock_blk_ops,
> +	.bind			= pvblock_blk_bind,
> +	.probe			= pvblock_blk_probe,
> +	.remove			= pvblock_blk_remove,
> +	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
> +	.flags			= DM_FLAG_OS_PREPARE,
> +};
> +
> +/*******************************************************************************
> + * Para-virtual block device class
> + *******************************************************************************/
> +
> +void pvblock_init(void)
> +{
> +	struct driver_info info;
> +	struct udevice *udev;
> +	struct uclass *uc;
> +	int ret;
> +
> +	/*
> +	 * At this point Xen drivers have already initialized,
> +	 * so we can instantiate the class driver and enumerate
> +	 * virtual block devices.
> +	 */
> +	info.name = DRV_NAME;
> +	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
> +	if (ret < 0)
> +		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
> +
> +	/* Bootstrap virtual block devices class driver */
> +	ret = uclass_get(UCLASS_PVBLOCK, &uc);
> +	if (ret)
> +		return;
> +	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
> +}
> +
> +static int pvblock_probe(struct udevice *udev)
> +{
> +	return 0;
> +}
> +
> +U_BOOT_DRIVER(pvblock_drv) = {
> +	.name		= DRV_NAME,
> +	.id		= UCLASS_PVBLOCK,
> +	.probe		= pvblock_probe,
> +};
> +
> +UCLASS_DRIVER(pvblock) = {
> +	.name		= DRV_NAME,
> +	.id		= UCLASS_PVBLOCK,
> +};
> diff --git a/include/blk.h b/include/blk.h
> index abcd4bedbb..9ee10fb80e 100644
> --- a/include/blk.h
> +++ b/include/blk.h
> @@ -33,6 +33,7 @@ enum if_type {
>  	IF_TYPE_HOST,
>  	IF_TYPE_NVME,
>  	IF_TYPE_EFI,
> +	IF_TYPE_PVBLOCK,
>  	IF_TYPE_VIRTIO,
>
>  	IF_TYPE_COUNT,			/* Number of interface types */
> diff --git a/include/configs/xenguest_arm64.h b/include/configs/xenguest_arm64.h
> index 467dabf1e5..2c0d3d64fb 100644
> --- a/include/configs/xenguest_arm64.h
> +++ b/include/configs/xenguest_arm64.h
> @@ -42,4 +42,12 @@
>  #define CONFIG_CMDLINE_TAG            1
>  #define CONFIG_INITRD_TAG             1
>
> +#define CONFIG_CMD_RUN
> +
> +#undef CONFIG_EXTRA_ENV_SETTINGS
> +#define CONFIG_EXTRA_ENV_SETTINGS	\
> +	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
> +	"pvblockboot=run loadimage;" \
> +		"booti 0x90000000 - 0x88000000;\0"
> +
>  #endif /* __XENGUEST_ARM64_H */
> diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
> index 7837d459f1..4bf7501204 100644
> --- a/include/dm/uclass-id.h
> +++ b/include/dm/uclass-id.h
> @@ -121,6 +121,7 @@ enum uclass_id {
>  	UCLASS_W1,		/* Dallas 1-Wire bus */
>  	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
>  	UCLASS_WDT,		/* Watchdog Timer driver */
> +	UCLASS_PVBLOCK,		/* Xen virtual block device */
>
>  	UCLASS_COUNT,
>  	UCLASS_INVALID = -1,
> diff --git a/include/pvblock.h b/include/pvblock.h
> new file mode 100644
> index 0000000000..e3bb8ff9a7
> --- /dev/null
> +++ b/include/pvblock.h
> @@ -0,0 +1,12 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+

See above: the SPDX identifier should be on the first line.

Please run scripts/checkpatch.pl before submitting patches.

Best regards

Heinrich

> + *
> + * (C) 2020 EPAM Systems Inc.
> + */
> +
> +#ifndef _PVBLOCK_H
> +#define _PVBLOCK_H
> +
> +void pvblock_init(void);
> +
> +#endif /* _PVBLOCK_H */
>


* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-01 16:29 ` [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver Anastasiia Lukianenko
  2020-07-02  4:17   ` Heinrich Schuchardt
@ 2020-07-02  4:29   ` Heinrich Schuchardt
  2020-07-02  5:30     ` Peng Fan
  2020-07-03 14:14     ` Anastasiia Lukianenko
  1 sibling, 2 replies; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  4:29 UTC (permalink / raw)
  To: u-boot

On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Add initial infrastructure for Xen para-virtualized block device.
> This includes compile-time configuration and the skeleton for
> the future driver implementation.
> Add new class UCLASS_PVBLOCK which is going to be a parent for
> virtual block devices.

We already have virtual block devices: virtio_blk, efi_blk.

They work fine using the existing UCLASS_BLK. Why do we need a new uclass?

> Add new interface type IF_TYPE_PVBLOCK.
>
> Implement basic driver setup by reading XenStore configuration.
>
> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  cmd/Kconfig                      |   7 ++
>  cmd/Makefile                     |   1 +
>  cmd/pvblock.c                    |  31 ++++++++
>  common/board_r.c                 |  14 ++++
>  configs/xenguest_arm64_defconfig |   4 +
>  disk/part.c                      |   4 +
>  drivers/Kconfig                  |   2 +
>  drivers/block/blk-uclass.c       |   2 +
>  drivers/xen/Kconfig              |  10 +++
>  drivers/xen/Makefile             |   2 +
>  drivers/xen/pvblock.c            | 121 +++++++++++++++++++++++++++++++
>  include/blk.h                    |   1 +
>  include/configs/xenguest_arm64.h |   8 ++
>  include/dm/uclass-id.h           |   1 +
>  include/pvblock.h                |  12 +++
>  15 files changed, 220 insertions(+)
>  create mode 100644 cmd/pvblock.c
>  create mode 100644 drivers/xen/Kconfig
>  create mode 100644 drivers/xen/pvblock.c
>  create mode 100644 include/pvblock.h
>
> diff --git a/cmd/Kconfig b/cmd/Kconfig
> index 192b3b262f..f28576947b 100644
> --- a/cmd/Kconfig
> +++ b/cmd/Kconfig
> @@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
>  	help
>  	  USB mass storage support
>
> +config CMD_PVBLOCK
> +	bool "Xen para-virtualized block device"
> +	depends on XEN
> +	select PVBLOCK
> +	help
> +	  Xen para-virtualized block device support
> +
>  config CMD_VIRTIO
>  	bool "virtio"
>  	depends on VIRTIO
> diff --git a/cmd/Makefile b/cmd/Makefile
> index 974ad48b0a..117284a28c 100644
> --- a/cmd/Makefile
> +++ b/cmd/Makefile
> @@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
>  obj-$(CONFIG_CMD_GPT) += gpt.o
>  obj-$(CONFIG_CMD_ETHSW) += ethsw.o
>  obj-$(CONFIG_CMD_AXI) += axi.o
> +obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
>
>  # Power
>  obj-$(CONFIG_CMD_PMIC) += pmic.o
> diff --git a/cmd/pvblock.c b/cmd/pvblock.c
> new file mode 100644
> index 0000000000..7dbb243a74
> --- /dev/null
> +++ b/cmd/pvblock.c
> @@ -0,0 +1,31 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+

Please correct the formatting: the SPDX tag belongs on the first line of
the file, in a comment of its own.

https://www.kernel.org/doc/html/latest/process/license-rules.html#license-identifier-syntax
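
As a sketch of what the linked license rules ask for: the SPDX tag goes on
the first line of the file in its own comment (`//` for C sources,
`/* ... */` for headers), and the copyright block follows separately. The
function below is a hypothetical placeholder so that the fragment compiles
on its own; it is not part of the driver.

```c
// SPDX-License-Identifier: GPL-2.0+
/*
 * (C) Copyright 2020 EPAM Systems Inc.
 *
 * XEN para-virtualized block device support
 */

/* Hypothetical placeholder so this fragment builds standalone. */
int pvblock_example_devnum(void)
{
	return 0;
}
```

scripts/checkpatch.pl should flag the old layout, so running it on the
patch is a quick way to confirm the header is accepted.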

> + *
> + * (C) Copyright 2020 EPAM Systems Inc.
> + *
> + * XEN para-virtualized block device support
> + */
> +
> +#include <blk.h>
> +#include <common.h>
> +#include <command.h>
> +
> +/* Current I/O Device	*/
> +static int pvblock_curr_device;
> +
> +int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char *const argv[])
> +{
> +	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
> +			      &pvblock_curr_device);
> +}
> +
> +U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
> +	   "Xen para-virtualized block device",
> +	   "info  - show available block devices\n"
> +	   "pvblock device [dev] - show or set current device\n"
> +	   "pvblock part [dev] - print partition table of one or all devices\n"
> +	   "pvblock read  addr blk# cnt\n"
> +	   "pvblock write addr blk# cnt - read/write `cnt'"
> +	   " blocks starting at block `blk#'\n"
> +	   "    to/from memory address `addr'");
> +
> diff --git a/common/board_r.c b/common/board_r.c
> index fd36edb4e5..40cd0e5d3c 100644
> --- a/common/board_r.c
> +++ b/common/board_r.c
> @@ -49,6 +49,7 @@
>  #include <nand.h>
>  #include <of_live.h>
>  #include <onenand_uboot.h>
> +#include <pvblock.h>
>  #include <scsi.h>
>  #include <serial.h>
>  #include <status_led.h>
> @@ -470,6 +471,16 @@ static int initr_xen(void)
>  	return 0;
>  }
>  #endif
> +
> +#ifdef CONFIG_PVBLOCK
> +static int initr_pvblock(void)
> +{
> +	puts("PVBLOCK: ");
> +	pvblock_init();
> +	return 0;
> +}
> +#endif
> +
>  /*
>   * Tell if it's OK to load the environment early in boot.
>   *
> @@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
>  #endif
>  #ifdef CONFIG_XEN
>  	initr_xen,
> +#endif
> +#ifdef CONFIG_PVBLOCK
> +	initr_pvblock,
>  #endif
>  	initr_env,
>  #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
> index 45559a161b..46473c251d 100644
> --- a/configs/xenguest_arm64_defconfig
> +++ b/configs/xenguest_arm64_defconfig
> @@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
>  CONFIG_CMD_BOOTEFI=n
>  CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
>  CONFIG_CMD_ELF=n
> +CONFIG_CMD_EXT4=y
> +CONFIG_CMD_FAT=y
>  CONFIG_CMD_GO=n
>  CONFIG_CMD_RUN=n
>  CONFIG_CMD_IMI=n
> @@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
>  CONFIG_CMD_SAVEENV=n
>  CONFIG_CMD_UMS=n
>
> +CONFIG_CMD_PVBLOCK=y
> +
>  #CONFIG_USB=n
>  # CONFIG_ISO_PARTITION is not set
>
> diff --git a/disk/part.c b/disk/part.c
> index f6a31025dc..b69fd345f3 100644
> --- a/disk/part.c
> +++ b/disk/part.c
> @@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
>  	case IF_TYPE_MMC:
>  	case IF_TYPE_USB:
>  	case IF_TYPE_NVME:
> +	case IF_TYPE_PVBLOCK:
>  		printf ("Vendor: %s Rev: %s Prod: %s\n",
>  			dev_desc->vendor,
>  			dev_desc->revision,
> @@ -288,6 +289,9 @@ static void print_part_header(const char *type, struct blk_desc *dev_desc)
>  	case IF_TYPE_NVME:
>  		puts ("NVMe");
>  		break;
> +	case IF_TYPE_PVBLOCK:
> +		puts("PV BLOCK");
> +		break;
>  	case IF_TYPE_VIRTIO:
>  		puts("VirtIO");
>  		break;
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index e34a22708c..65076aab03 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
>
>  source "drivers/watchdog/Kconfig"
>
> +source "drivers/xen/Kconfig"
> +
>  config PHYS_TO_BUS
>  	bool "Custom physical to bus address mapping"
>  	help
> diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> index b19375cbc8..6cfabbca24 100644
> --- a/drivers/block/blk-uclass.c
> +++ b/drivers/block/blk-uclass.c
> @@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT] = {
>  	[IF_TYPE_NVME]		= "nvme",
>  	[IF_TYPE_EFI]		= "efi",
>  	[IF_TYPE_VIRTIO]	= "virtio",
> +	[IF_TYPE_PVBLOCK]	= "pvblock",
>  };
>
>  static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> @@ -43,6 +44,7 @@ static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
>  	[IF_TYPE_NVME]		= UCLASS_NVME,
>  	[IF_TYPE_EFI]		= UCLASS_EFI,
>  	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
> +	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
>  };
>
>  static enum if_type if_typename_to_iftype(const char *if_typename)
> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> new file mode 100644
> index 0000000000..6ad2a93668
> --- /dev/null
> +++ b/drivers/xen/Kconfig
> @@ -0,0 +1,10 @@
> +config PVBLOCK
> +	bool "Xen para-virtualized block device"
> +	depends on DM
> +	select BLK
> +	select HAVE_BLOCK_DEVICE
> +	help
> +	  This driver implements the front-end of the Xen virtual
> +	  block device driver. It communicates with a back-end driver
> +	  in another domain which drives the actual block device.
> +
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 243b13277a..87157df69b 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -6,3 +6,5 @@ obj-y += hypervisor.o
>  obj-y += events.o
>  obj-y += xenbus.o
>  obj-y += gnttab.o
> +
> +obj-$(CONFIG_PVBLOCK) += pvblock.o
> diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> new file mode 100644
> index 0000000000..057add9753
> --- /dev/null
> +++ b/drivers/xen/pvblock.c
> @@ -0,0 +1,121 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+

See above.

> + *
> + * (C) Copyright 2020 EPAM Systems Inc.
> + */
> +#include <blk.h>
> +#include <common.h>
> +#include <dm.h>
> +#include <dm/device-internal.h>
> +
> +#define DRV_NAME	"pvblock"
> +#define DRV_NAME_BLK	"pvblock_blk"
> +
> +struct blkfront_dev {
> +	char dummy;
> +};
> +
> +static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
> +{
> +	return 0;
> +}
> +
> +static void shutdown_blkfront(struct blkfront_dev *dev)
> +{
> +}
> +
> +ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> +		       void *buffer)
> +{
> +	return 0;
> +}
> +
> +ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> +			const void *buffer)
> +{
> +	return 0;
> +}
> +
> +static int pvblock_blk_bind(struct udevice *udev)
> +{
> +	return 0;
> +}
> +
> +static int pvblock_blk_probe(struct udevice *udev)
> +{
> +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> +	int ret;
> +
> +	ret = init_blkfront(0, blk_dev);
> +	if (ret < 0)
> +		return ret;
> +	return 0;
> +}
> +
> +static int pvblock_blk_remove(struct udevice *udev)
> +{
> +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> +
> +	shutdown_blkfront(blk_dev);
> +	return 0;
> +}
> +
> +static const struct blk_ops pvblock_blk_ops = {
> +	.read	= pvblock_blk_read,
> +	.write	= pvblock_blk_write,
> +};
> +
> +U_BOOT_DRIVER(pvblock_blk) = {
> +	.name			= DRV_NAME_BLK,
> +	.id			= UCLASS_BLK,
> +	.ops			= &pvblock_blk_ops,
> +	.bind			= pvblock_blk_bind,
> +	.probe			= pvblock_blk_probe,
> +	.remove			= pvblock_blk_remove,
> +	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
> +	.flags			= DM_FLAG_OS_PREPARE,
> +};
> +
> +/*******************************************************************************
> + * Para-virtual block device class
> + *******************************************************************************/
> +
> +void pvblock_init(void)
> +{
> +	struct driver_info info;
> +	struct udevice *udev;
> +	struct uclass *uc;
> +	int ret;
> +
> +	/*
> +	 * At this point Xen drivers have already initialized,
> +	 * so we can instantiate the class driver and enumerate
> +	 * virtual block devices.
> +	 */
> +	info.name = DRV_NAME;
> +	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
> +	if (ret < 0)
> +		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
> +
> +	/* Bootstrap virtual block devices class driver */
> +	ret = uclass_get(UCLASS_PVBLOCK, &uc);
> +	if (ret)
> +		return;
> +	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
> +}
> +
> +static int pvblock_probe(struct udevice *udev)
> +{
> +	return 0;
> +}
> +
> +U_BOOT_DRIVER(pvblock_drv) = {
> +	.name		= DRV_NAME,
> +	.id		= UCLASS_PVBLOCK,
> +	.probe		= pvblock_probe,
> +};
> +
> +UCLASS_DRIVER(pvblock) = {
> +	.name		= DRV_NAME,
> +	.id		= UCLASS_PVBLOCK,
> +};
> diff --git a/include/blk.h b/include/blk.h
> index abcd4bedbb..9ee10fb80e 100644
> --- a/include/blk.h
> +++ b/include/blk.h
> @@ -33,6 +33,7 @@ enum if_type {
>  	IF_TYPE_HOST,
>  	IF_TYPE_NVME,
>  	IF_TYPE_EFI,
> +	IF_TYPE_PVBLOCK,
>  	IF_TYPE_VIRTIO,
>
>  	IF_TYPE_COUNT,			/* Number of interface types */
> diff --git a/include/configs/xenguest_arm64.h b/include/configs/xenguest_arm64.h
> index 467dabf1e5..2c0d3d64fb 100644
> --- a/include/configs/xenguest_arm64.h
> +++ b/include/configs/xenguest_arm64.h
> @@ -42,4 +42,12 @@
>  #define CONFIG_CMDLINE_TAG            1
>  #define CONFIG_INITRD_TAG             1
>
> +#define CONFIG_CMD_RUN
> +
> +#undef CONFIG_EXTRA_ENV_SETTINGS
> +#define CONFIG_EXTRA_ENV_SETTINGS	\
> +	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
> +	"pvblockboot=run loadimage;" \
> +		"booti 0x90000000 - 0x88000000;\0"
> +
>  #endif /* __XENGUEST_ARM64_H */
> diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
> index 7837d459f1..4bf7501204 100644
> --- a/include/dm/uclass-id.h
> +++ b/include/dm/uclass-id.h
> @@ -121,6 +121,7 @@ enum uclass_id {
>  	UCLASS_W1,		/* Dallas 1-Wire bus */
>  	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
>  	UCLASS_WDT,		/* Watchdog Timer driver */
> +	UCLASS_PVBLOCK,		/* Xen virtual block device */
>
>  	UCLASS_COUNT,
>  	UCLASS_INVALID = -1,
> diff --git a/include/pvblock.h b/include/pvblock.h
> new file mode 100644
> index 0000000000..e3bb8ff9a7
> --- /dev/null
> +++ b/include/pvblock.h
> @@ -0,0 +1,12 @@
> +/*
> + * SPDX-License-Identifier:	GPL-2.0+

See above.

scripts/checkpatch.pl is your friend.

> + *
> + * (C) 2020 EPAM Systems Inc.
> + */
> +
> +#ifndef _PVBLOCK_H
> +#define _PVBLOCK_H
> +

Please document your functions as described in

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation

Include the documentation in the HTML documentation generated by

    make htmldocs.
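
A minimal sketch of such a kernel-doc comment for this header, assuming
only the behaviour visible in this patch (bind the "pvblock" driver, then
probe the UCLASS_PVBLOCK devices). The wording is illustrative and not the
authors' text.

```c
#ifndef _PVBLOCK_H
#define _PVBLOCK_H

/**
 * pvblock_init() - set up Xen para-virtualized block devices
 *
 * Binds the "pvblock" class driver to the driver-model root and probes
 * the devices of UCLASS_PVBLOCK. It is expected to be called after the
 * Xen drivers have been initialized. A bind failure is reported on the
 * console; no status is returned to the caller.
 */
void pvblock_init(void);

#endif /* _PVBLOCK_H */
```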

Best regards

Heinrich

> +void pvblock_init(void);
> +
> +#endif /* _PVBLOCK_H */
>


* [PATCH 10/17] xen: Port Xen bus driver from mini-os
  2020-07-01 16:29 ` [PATCH 10/17] xen: Port Xen bus driver from mini-os Anastasiia Lukianenko
@ 2020-07-02  4:43   ` Heinrich Schuchardt
  0 siblings, 0 replies; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  4:43 UTC (permalink / raw)
  To: u-boot

On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Make required updates to run on u-boot and strip test code.
>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  arch/arm/Kconfig                          |   1 +
>  board/xen/xenguest_arm64/xenguest_arm64.c |  16 +-
>  drivers/xen/Makefile                      |   1 +
>  drivers/xen/hypervisor.c                  |   2 +
>  drivers/xen/xenbus.c                      | 547 ++++++++++++++++++++++
>  include/xen/xenbus.h                      |  86 ++++
>  6 files changed, 652 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/xen/xenbus.c
>  create mode 100644 include/xen/xenbus.h
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index d4de1139aa..bcd9ab5c9d 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1724,6 +1724,7 @@ config TARGET_XENGUEST_ARM64
>  	select OF_CONTROL
>  	select LINUX_KERNEL_IMAGE_HEADER
>  	select XEN_SERIAL
> +	select SSCANF
>  endchoice
>
>  config ARCH_SUPPORT_TFABOOT
> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
> index fd10a002e9..e8621f7174 100644
> --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> @@ -67,7 +67,7 @@ static int setup_mem_map(void)
>
>  	/*
>  	 * Add "magic" region which is used by Xen to provide some essentials
> -	 * for the guest: we need console.
> +	 * for the guest: we need console and xenstore.
>  	 */
>  	ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_CONSOLE_PFN, &gfn);
>  	if (ret < 0) {
> @@ -83,6 +83,20 @@ static int setup_mem_map(void)
>  				PTE_BLOCK_INNER_SHARE);
>  	i++;
>
> +	ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_STORE_PFN, &gfn);
> +	if (ret < 0) {
> +		printf("%s: Can't get HVM_PARAM_STORE_PFN, ret %d\n",
> +		       __func__, ret);
> +		return -EINVAL;
> +	}
> +
> +	xen_mem_map[i].virt = PFN_PHYS(gfn);
> +	xen_mem_map[i].phys = PFN_PHYS(gfn);
> +	xen_mem_map[i].size = PAGE_SIZE;
> +	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> +				PTE_BLOCK_INNER_SHARE);
> +	i++;
> +
>  	mem = get_next_memory_node(blob, -1);
>  	if (mem < 0) {
>  		printf("%s: Missing /memory node\n", __func__);
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 0ad35edefb..9d0f604aaa 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -4,3 +4,4 @@
>
>  obj-y += hypervisor.o
>  obj-y += events.o
> +obj-y += xenbus.o
> diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> index 975e552242..d7fbacb08e 100644
> --- a/drivers/xen/hypervisor.c
> +++ b/drivers/xen/hypervisor.c
> @@ -38,6 +38,7 @@
>
>  #include <xen/hvm.h>
>  #include <xen/events.h>
> +#include <xen/xenbus.h>
>  #include <xen/interface/memory.h>
>
>  #define active_evtchns(cpu, sh, idx)	\
> @@ -273,5 +274,6 @@ void xen_init(void)
>
>  	map_shared_info(NULL);
>  	init_events();
> +	init_xenbus();
>  }
>
> diff --git a/drivers/xen/xenbus.c b/drivers/xen/xenbus.c
> new file mode 100644
> index 0000000000..64eb28e843
> --- /dev/null
> +++ b/drivers/xen/xenbus.c
> @@ -0,0 +1,547 @@
> +/*

Add an SPDX header, please.

> + ****************************************************************************
> + * (C) 2006 - Cambridge University
> + * (C) 2020 - EPAM Systems Inc.
> + ****************************************************************************
> + *
> + *		File: xenbus.c
> + *	  Author: Steven Smith (sos22 at cam.ac.uk)
> + *	 Changes: Grzegorz Milos (gm281 at cam.ac.uk)
> + *	 Changes: John D. Ramsdell
> + *
> + *		Date: Jun 2006, chages Aug 2005

%s/chages/changes/ ?
Does time run in reverse in Cambridge?

> + *
> + * Environment: Xen Minimal OS

This is U-Boot.

Better provide a link to the original source.

> + * Description: Minimal implementation of xenbus
> + *
> + ****************************************************************************

Can we get rid of this non-U-Boot-style formatting?

> + **/
> +
> +#include <common.h>
> +#include <log.h>
> +
> +#include <asm/armv8/mmu.h>
> +#include <asm/io.h>
> +#include <asm/xen/system.h>
> +
> +#include <linux/bug.h>
> +#include <linux/compat.h>
> +
> +#include <xen/events.h>
> +#include <xen/hvm.h>
> +#include <xen/xenbus.h>
> +
> +#include <xen/interface/io/xs_wire.h>
> +
> +#define map_frame_virt(v)	(v << PAGE_SHIFT)
> +
> +#define SCNd16			"d"
> +
> +/* Wait for reply time out, ms */
> +#define WAIT_XENBUS_TO_MS	5000
> +/* Polling time out, ms */
> +#define WAIT_XENBUS_POLL_TO_MS	1
> +
> +static struct xenstore_domain_interface *xenstore_buf;
> +
> +static char *errmsg(struct xsd_sockmsg *rep);
> +
> +u32 xenbus_evtchn;
> +
> +struct write_req {
> +	const void *data;
> +	unsigned int len;
> +};
> +
> +static void memcpy_from_ring(const void *r, void *d, int off, int len)
> +{
> +	int c1, c2;
> +	const char *ring = r;
> +	char *dest = d;
> +
> +	c1 = min(len, XENSTORE_RING_SIZE - off);
> +	c2 = len - c1;
> +	memcpy(dest, ring + off, c1);
> +	memcpy(dest + c1, ring, c2);
> +}
> +
> +static bool xenbus_get_reply(struct xsd_sockmsg **req_reply)
> +{
> +	struct xsd_sockmsg msg;
> +	unsigned int prod = xenstore_buf->rsp_prod;
> +
> +again:
> +	if (!wait_event_timeout(NULL, prod != xenstore_buf->rsp_prod,
> +				WAIT_XENBUS_TO_MS)) {
> +		printk("%s: wait_event timeout\n", __func__);
> +		return false;
> +	}
> +
> +	prod = xenstore_buf->rsp_prod;
> +	if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg))
> +		goto again;
> +
> +	rmb();
> +	memcpy_from_ring(xenstore_buf->rsp, &msg,
> +			 MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
> +			 sizeof(msg));
> +
> +	if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg) + msg.len)
> +		goto again;
> +
> +	/* We do not support and expect any Xen bus wathes. */
> +	BUG_ON(msg.type == XS_WATCH_EVENT);
> +
> +	*req_reply = malloc(sizeof(msg) + msg.len);
> +	memcpy_from_ring(xenstore_buf->rsp, *req_reply,
> +			 MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
> +			 msg.len + sizeof(msg));
> +	mb();
> +	xenstore_buf->rsp_cons += msg.len + sizeof(msg);
> +
> +	wmb();
> +	notify_remote_via_evtchn(xenbus_evtchn);
> +	return true;
> +}
> +

Document functions, please.

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
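
To illustrate, here is a sketch of how memcpy_from_ring() from this patch
could be documented. The copy logic mirrors the quoted code; the 8-byte
`EXAMPLE_RING_SIZE` is a hypothetical stand-in for XENSTORE_RING_SIZE so
that the fragment is self-contained and easy to check by hand.

```c
#include <string.h>

/* Hypothetical ring size for this sketch; the driver uses XENSTORE_RING_SIZE. */
#define EXAMPLE_RING_SIZE 8

/**
 * memcpy_from_ring() - copy data out of a circular buffer
 * @r:   start of the ring buffer
 * @d:   destination buffer, at least @len bytes long
 * @off: ring offset to start reading at, 0 <= @off < EXAMPLE_RING_SIZE
 * @len: number of bytes to copy, at most EXAMPLE_RING_SIZE
 *
 * Copies from @off up to the end of the ring first, then wraps around
 * and copies the remainder from the start of the ring.
 */
static void memcpy_from_ring(const void *r, void *d, int off, int len)
{
	const char *ring = r;
	char *dest = d;
	int tail = EXAMPLE_RING_SIZE - off;
	int c1 = len < tail ? len : tail;	/* bytes before the wrap */
	int c2 = len - c1;			/* bytes after the wrap */

	memcpy(dest, ring + off, c1);
	memcpy(dest + c1, ring, c2);
}
```

Reading 4 bytes at offset 6 from a ring holding "abcdefgh" takes 2 bytes
from the tail and 2 from the head.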

Best regards

Heinrich

> +char *xenbus_switch_state(xenbus_transaction_t xbt, const char *path,
> +			  XenbusState state)
> +{
> +	char *current_state;
> +	char *msg = NULL;
> +	char *msg2 = NULL;
> +	char value[2];
> +	XenbusState rs;
> +	int xbt_flag = 0;
> +	int retry = 0;
> +
> +	do {
> +		if (xbt == XBT_NIL) {
> +			msg = xenbus_transaction_start(&xbt);
> +			if (msg)
> +				goto exit;
> +			xbt_flag = 1;
> +		}
> +
> +		msg = xenbus_read(xbt, path, &current_state);
> +		if (msg)
> +			goto exit;
> +
> +		rs = (XenbusState)(current_state[0] - '0');
> +		free(current_state);
> +		if (rs == state) {
> +			msg = NULL;
> +			goto exit;
> +		}
> +
> +		snprintf(value, 2, "%d", state);
> +		msg = xenbus_write(xbt, path, value);
> +
> +exit:
> +		if (xbt_flag) {
> +			msg2 = xenbus_transaction_end(xbt, 0, &retry);
> +			xbt = XBT_NIL;
> +		}
> +		if (msg == NULL && msg2 != NULL)
> +			msg = msg2;
> +		else
> +			free(msg2);
> +	} while (retry);
> +
> +	return msg;
> +}
> +
> +char *xenbus_wait_for_state_change(const char *path, XenbusState *state)
> +{
> +	for (;;) {
> +		char *res, *msg;
> +		XenbusState rs;
> +
> +		msg = xenbus_read(XBT_NIL, path, &res);
> +		if (msg)
> +			return msg;
> +
> +		rs = (XenbusState)(res[0] - 48);
> +		free(res);
> +
> +		if (rs == *state) {
> +			wait_event_timeout(NULL, false, WAIT_XENBUS_POLL_TO_MS);
> +		} else {
> +			*state = rs;
> +			break;
> +		}
> +	}
> +	return NULL;
> +}
> +
> +/* Send data to xenbus.  This can block.  All of the requests are seen
> + * by xenbus as if sent atomically.  The header is added
> + * automatically, using type %type, req_id %req_id, and trans_id
> + * %trans_id.
> + */
> +static void xb_write(int type, int req_id, xenbus_transaction_t trans_id,
> +		     const struct write_req *req, int nr_reqs)
> +{
> +	XENSTORE_RING_IDX prod;
> +	int r;
> +	int len = 0;
> +	const struct write_req *cur_req;
> +	int req_off;
> +	int total_off;
> +	int this_chunk;
> +	struct xsd_sockmsg m = {
> +		.type = type,
> +		.req_id = req_id,
> +		.tx_id = trans_id
> +	};
> +	struct write_req header_req = {
> +		&m,
> +		sizeof(m)
> +	};
> +
> +	for (r = 0; r < nr_reqs; r++)
> +		len += req[r].len;
> +	m.len = len;
> +	len += sizeof(m);
> +
> +	cur_req = &header_req;
> +
> +	BUG_ON(len > XENSTORE_RING_SIZE);
> +	prod = xenstore_buf->req_prod;
> +	/* We are running synchronously, so it is a bug if we do not
> +	 * have enough room to send a message: please note that a message
> +	 * can occupy multiple slots in the ring buffer.
> +	 */
> +	BUG_ON(prod + len - xenstore_buf->req_cons > XENSTORE_RING_SIZE);
> +
> +	total_off = 0;
> +	req_off = 0;
> +	while (total_off < len) {
> +		this_chunk = min(cur_req->len - req_off,
> +				 XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod));
> +		memcpy((char *)xenstore_buf->req + MASK_XENSTORE_IDX(prod),
> +		       (char *)cur_req->data + req_off, this_chunk);
> +		prod += this_chunk;
> +		req_off += this_chunk;
> +		total_off += this_chunk;
> +		if (req_off == cur_req->len) {
> +			req_off = 0;
> +			if (cur_req == &header_req)
> +				cur_req = req;
> +			else
> +				cur_req++;
> +		}
> +	}
> +
> +	BUG_ON(req_off != 0);
> +	BUG_ON(total_off != len);
> +	BUG_ON(prod > xenstore_buf->req_cons + XENSTORE_RING_SIZE);
> +
> +	/* Remote must see entire message before updating indexes */
> +	wmb();
> +
> +	xenstore_buf->req_prod += len;
> +
> +	/* Send evtchn to notify remote */
> +	notify_remote_via_evtchn(xenbus_evtchn);
> +}
> +
> +/* Send a message to xenbus, in the same fashion as xb_write, and
> + * block waiting for a reply.  The reply is malloced and should be
> + * freed by the caller.
> + */
> +struct xsd_sockmsg *xenbus_msg_reply(int type,
> +				     xenbus_transaction_t trans,
> +				     struct write_req *io,
> +				     int nr_reqs)
> +{
> +	struct xsd_sockmsg *rep;
> +
> +	/* We do not use request identifier which is echoed in daemon's response. */
> +	xb_write(type, 0, trans, io, nr_reqs);
> +	/* Now wait for the message to arrive. */
> +	if (!xenbus_get_reply(&rep))
> +		return NULL;
> +	return rep;
> +}
> +
> +static char *errmsg(struct xsd_sockmsg *rep)
> +{
> +	char *res;
> +
> +	if (!rep) {
> +		char msg[] = "No reply";
> +		size_t len = strlen(msg) + 1;
> +
> +		return memcpy(malloc(len), msg, len);
> +	}
> +	if (rep->type != XS_ERROR)
> +		return NULL;
> +	res = malloc(rep->len + 1);
> +	memcpy(res, rep + 1, rep->len);
> +	res[rep->len] = 0;
> +	free(rep);
> +	return res;
> +}
> +
> +/* List the contents of a directory.  Returns a malloc()ed array of
> + * pointers to malloc()ed strings.  The array is NULL terminated.  May
> + * block.
> + */
> +char *xenbus_ls(xenbus_transaction_t xbt, const char *pre, char ***contents)
> +{
> +	struct xsd_sockmsg *reply, *repmsg;
> +	struct write_req req[] = { { pre, strlen(pre) + 1 } };
> +	int nr_elems, x, i;
> +	char **res, *msg;
> +
> +	repmsg = xenbus_msg_reply(XS_DIRECTORY, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(repmsg);
> +	if (msg) {
> +		*contents = NULL;
> +		return msg;
> +	}
> +	reply = repmsg + 1;
> +	for (x = nr_elems = 0; x < repmsg->len; x++)
> +		nr_elems += (((char *)reply)[x] == 0);
> +	res = malloc(sizeof(res[0]) * (nr_elems + 1));
> +	for (x = i = 0; i < nr_elems; i++) {
> +		int l = strlen((char *)reply + x);
> +
> +		res[i] = malloc(l + 1);
> +		memcpy(res[i], (char *)reply + x, l + 1);
> +		x += l + 1;
> +	}
> +	res[i] = NULL;
> +	free(repmsg);
> +	*contents = res;
> +	return NULL;
> +}
> +
> +char *xenbus_read(xenbus_transaction_t xbt, const char *path, char **value)
> +{
> +	struct write_req req[] = { {path, strlen(path) + 1} };
> +	struct xsd_sockmsg *rep;
> +	char *res, *msg;
> +
> +	rep = xenbus_msg_reply(XS_READ, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(rep);
> +	if (msg) {
> +		*value = NULL;
> +		return msg;
> +	}
> +	res = malloc(rep->len + 1);
> +	memcpy(res, rep + 1, rep->len);
> +	res[rep->len] = 0;
> +	free(rep);
> +	*value = res;
> +	return NULL;
> +}
> +
> +char *xenbus_write(xenbus_transaction_t xbt, const char *path,
> +				   const char *value)
> +{
> +	struct write_req req[] = {
> +		{path, strlen(path) + 1},
> +		{value, strlen(value)},
> +	};
> +	struct xsd_sockmsg *rep;
> +	char *msg;
> +
> +	rep = xenbus_msg_reply(XS_WRITE, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(rep);
> +	if (msg)
> +		return msg;
> +	free(rep);
> +	return NULL;
> +}
> +
> +char *xenbus_rm(xenbus_transaction_t xbt, const char *path)
> +{
> +	struct write_req req[] = { {path, strlen(path) + 1} };
> +	struct xsd_sockmsg *rep;
> +	char *msg;
> +
> +	rep = xenbus_msg_reply(XS_RM, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(rep);
> +	if (msg)
> +		return msg;
> +	free(rep);
> +	return NULL;
> +}
> +
> +char *xenbus_get_perms(xenbus_transaction_t xbt, const char *path, char **value)
> +{
> +	struct write_req req[] = { {path, strlen(path) + 1} };
> +	struct xsd_sockmsg *rep;
> +	char *res, *msg;
> +
> +	rep = xenbus_msg_reply(XS_GET_PERMS, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(rep);
> +	if (msg) {
> +		*value = NULL;
> +		return msg;
> +	}
> +	res = malloc(rep->len + 1);
> +	memcpy(res, rep + 1, rep->len);
> +	res[rep->len] = 0;
> +	free(rep);
> +	*value = res;
> +	return NULL;
> +}
> +
> +#define PERM_MAX_SIZE 32
> +char *xenbus_set_perms(xenbus_transaction_t xbt, const char *path,
> +		       domid_t dom, char perm)
> +{
> +	char value[PERM_MAX_SIZE];
> +	struct write_req req[] = {
> +		{path, strlen(path) + 1},
> +		{value, 0},
> +	};
> +	struct xsd_sockmsg *rep;
> +	char *msg;
> +
> +	snprintf(value, PERM_MAX_SIZE, "%c%hu", perm, dom);
> +	req[1].len = strlen(value) + 1;
> +	rep = xenbus_msg_reply(XS_SET_PERMS, xbt, req, ARRAY_SIZE(req));
> +	msg = errmsg(rep);
> +	if (msg)
> +		return msg;
> +	free(rep);
> +	return NULL;
> +}
> +
> +char *xenbus_transaction_start(xenbus_transaction_t *xbt)
> +{
> +	/* Xenstored becomes angry if you send a length 0 message, so just
> +	 * shove a nul terminator on the end
> +	 */
> +	struct write_req req = { "", 1};
> +	struct xsd_sockmsg *rep;
> +	char *err;
> +
> +	rep = xenbus_msg_reply(XS_TRANSACTION_START, 0, &req, 1);
> +	err = errmsg(rep);
> +	if (err)
> +		return err;
> +	sscanf((char *)(rep + 1), "%lu", xbt);
> +	free(rep);
> +	return NULL;
> +}
> +
> +char *xenbus_transaction_end(xenbus_transaction_t t, int abort, int *retry)
> +{
> +	struct xsd_sockmsg *rep;
> +	struct write_req req;
> +	char *err;
> +
> +	*retry = 0;
> +
> +	req.data = abort ? "F" : "T";
> +	req.len = 2;
> +	rep = xenbus_msg_reply(XS_TRANSACTION_END, t, &req, 1);
> +	err = errmsg(rep);
> +	if (err) {
> +		if (!strcmp(err, "EAGAIN")) {
> +			*retry = 1;
> +			free(err);
> +			return NULL;
> +		} else {
> +			return err;
> +		}
> +	}
> +	free(rep);
> +	return NULL;
> +}
> +
> +int xenbus_read_integer(const char *path)
> +{
> +	char *res, *buf;
> +	int t;
> +
> +	res = xenbus_read(XBT_NIL, path, &buf);
> +	if (res) {
> +		printk("Failed to read %s.\n", path);
> +		free(res);
> +		return -1;
> +	}
> +	sscanf(buf, "%d", &t);
> +	free(buf);
> +	return t;
> +}
> +
> +int xenbus_read_uuid(const char *path, unsigned char uuid[16]) {
> +	char *res, *buf;
> +
> +	res = xenbus_read(XBT_NIL, path, &buf);
> +	if (res) {
> +		printk("Failed to read %s.\n", path);
> +		free(res);
> +		return 0;
> +	}
> +	if (strlen(buf) != ((2 * 16) + 4) /* 16 hex bytes and 4 hyphens */
> +	    || sscanf(buf,
> +		      "%2hhx%2hhx%2hhx%2hhx-"
> +		      "%2hhx%2hhx-"
> +		      "%2hhx%2hhx-"
> +		      "%2hhx%2hhx-"
> +		      "%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx",
> +		      uuid, uuid + 1, uuid + 2, uuid + 3,
> +		      uuid + 4, uuid + 5, uuid + 6, uuid + 7,
> +		      uuid + 8, uuid + 9, uuid + 10, uuid + 11,
> +		      uuid + 12, uuid + 13, uuid + 14, uuid + 15) != 16) {
> +		printk("Xenbus path %s value %s is not a uuid!\n", path, buf);
> +		free(buf);
> +		return 0;
> +	}
> +	free(buf);
> +	return 1;
> +}
> +
> +char *xenbus_printf(xenbus_transaction_t xbt,
> +		    const char *node, const char *path,
> +		    const char *fmt, ...)
> +{
> +#define BUFFER_SIZE 256
> +	char fullpath[BUFFER_SIZE];
> +	char val[BUFFER_SIZE];
> +	va_list args;
> +
> +	BUG_ON(strlen(node) + strlen(path) + 1 >= BUFFER_SIZE);
> +	sprintf(fullpath, "%s/%s", node, path);
> +	va_start(args, fmt);
> +	vsprintf(val, fmt, args);
> +	va_end(args);
> +	return xenbus_write(xbt, fullpath, val);
> +}
> +
> +domid_t xenbus_get_self_id(void)
> +{
> +	char *dom_id;
> +	domid_t ret;
> +
> +	BUG_ON(xenbus_read(XBT_NIL, "domid", &dom_id));
> +	sscanf(dom_id, "%"SCNd16, &ret);
> +
> +	return ret;
> +}
> +
> +void init_xenbus(void)
> +{
> +	u64 v;
> +
> +	debug("%s\n", __func__);
> +	if (hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v))
> +		BUG();
> +	xenbus_evtchn = v;
> +
> +	if (hvm_get_parameter(HVM_PARAM_STORE_PFN, &v))
> +		BUG();
> +	xenstore_buf = (struct xenstore_domain_interface *)map_frame_virt(v);
> +}
> +
> +void fini_xenbus(void)
> +{
> +	debug("%s\n", __func__);
> +}
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> new file mode 100644
> index 0000000000..e2e3ef9292
> --- /dev/null
> +++ b/include/xen/xenbus.h
> @@ -0,0 +1,86 @@
> +#ifndef XENBUS_H__
> +#define XENBUS_H__
> +
> +#include <xen/interface/xen.h>
> +#include <xen/interface/io/xenbus.h>
> +
> +typedef unsigned long xenbus_transaction_t;
> +#define XBT_NIL ((xenbus_transaction_t)0)
> +
> +extern u32 xenbus_evtchn;
> +
> +/* Initialize the XenBus system. */
> +void init_xenbus(void);
> +/* Finalize the XenBus system. */
> +void fini_xenbus(void);
> +
> +/* Read the value associated with a path.  Returns a malloc'd error
> + * string on failure and sets *value to NULL.  On success, *value is
> + * set to a malloc'd copy of the value.
> + */
> +char *xenbus_read(xenbus_transaction_t xbt, const char *path, char **value);
> +
> +char *xenbus_wait_for_state_change(const char *path, XenbusState *state);
> +char *xenbus_switch_state(xenbus_transaction_t xbt, const char *path,
> +			  XenbusState state);
> +
> +/* Associates a value with a path.  Returns a malloc'd error string on
> + * failure.
> + */
> +char *xenbus_write(xenbus_transaction_t xbt, const char *path,
> +		   const char *value);
> +
> +/* Removes the value associated with a path.  Returns a malloc'd error
> + * string on failure.
> + */
> +char *xenbus_rm(xenbus_transaction_t xbt, const char *path);
> +
> +/* List the contents of a directory.  Returns a malloc'd error string
> + * on failure and sets *contents to NULL.  On success, *contents is
> + * set to a malloc'd array of pointers to malloc'd strings.  The array
> + * is NULL terminated.  May block.
> + */
> +char *xenbus_ls(xenbus_transaction_t xbt, const char *prefix, char ***contents);
> +
> +/* Reads permissions associated with a path.  Returns a malloc'd error
> + * string on failure and sets *value to NULL.  On success, *value is
> + * set to a malloc'd copy of the value.
> + */
> +char *xenbus_get_perms(xenbus_transaction_t xbt, const char *path, char **value);
> +
> +/* Sets the permissions associated with a path.  Returns a malloc'd
> + * error string on failure.
> + */
> +char *xenbus_set_perms(xenbus_transaction_t xbt, const char *path, domid_t dom,
> +		       char perm);
> +
> +/* Start a xenbus transaction.  Returns the transaction in xbt on
> + * success or a malloc'd error string otherwise.
> + */
> +char *xenbus_transaction_start(xenbus_transaction_t *xbt);
> +
> +/* End a xenbus transaction.  Returns a malloc'd error string if it
> + * fails.  abort says whether the transaction should be aborted.
> + * Returns 1 in *retry iff the transaction should be retried.
> + */
> +char *xenbus_transaction_end(xenbus_transaction_t, int abort,
> +			     int *retry);
> +
> +/* Read path and parse it as an integer.  Returns -1 on error. */
> +int xenbus_read_integer(const char *path);
> +
> +/* Read path and parse it as 16 byte uuid. Returns 1 if
> + * read and parsing were successful, 0 if not
> + */
> +int xenbus_read_uuid(const char *path, unsigned char uuid[16]);
> +
> +/* Contraction of snprintf and xenbus_write(path/node). */
> +char *xenbus_printf(xenbus_transaction_t xbt,
> +		    const char *node, const char *path,
> +		    const char *fmt, ...)
> +	__attribute__((__format__(printf, 4, 5)));
> +
> +/* Utility function to figure out our domain id */
> +domid_t xenbus_get_self_id(void);
> +
> +#endif /* XENBUS_H__ */
>
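The header above documents a convention that is easy to get wrong: most calls return a malloc'd error string (NULL on success) and hand results back through an out parameter, and the caller owns both. A small sketch of a caller that follows this contract, in the spirit of xenbus_read_integer(). Here mock_xenbus_read() and dup_str() are stand-ins I made up so the example runs without a hypervisor; only the calling convention is taken from the header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long xenbus_transaction_t;
#define XBT_NIL ((xenbus_transaction_t)0)

/* Hypothetical helper: malloc'd copy of a string (NULL on allocation failure). */
static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Stand-in for xenbus_read(): knows a single node, "domid".
 * Follows the documented contract: NULL on success with *value set,
 * malloc'd error string on failure with *value set to NULL.
 */
static char *mock_xenbus_read(xenbus_transaction_t xbt, const char *path,
			      char **value)
{
	(void)xbt;
	assert(path && value);
	if (strcmp(path, "domid") == 0) {
		*value = dup_str("1");
		return NULL;
	}
	*value = NULL;
	return dup_str("ENOENT");
}

/* Read a node and parse it as an integer; returns -1 on any error. */
static int read_integer(const char *path)
{
	char *val = NULL;
	char *err = mock_xenbus_read(XBT_NIL, path, &val);
	int n = -1;

	if (err || !val) {
		free(err);	/* the caller owns the error string */
		free(val);
		return -1;
	}
	if (sscanf(val, "%d", &n) != 1)
		n = -1;
	free(val);		/* the caller owns the value too */
	return n;
}
```

Note that passing the return value straight to BUG_ON(), as xenbus_get_self_id() does, never frees the error string; that is harmless if BUG_ON() does not return, but worth keeping in mind if the check is ever softened.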


* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-02  4:29   ` Heinrich Schuchardt
@ 2020-07-02  5:30     ` Peng Fan
  2020-07-03 14:14     ` Anastasiia Lukianenko
  1 sibling, 0 replies; 57+ messages in thread
From: Peng Fan @ 2020-07-02  5:30 UTC (permalink / raw)
  To: u-boot

> Subject: Re: [PATCH 12/17] xen: pvblock: Add initial support for
> para-virtualized block driver
> 
> On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> > From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> >
> > Add initial infrastructure for Xen para-virtualized block device.
> > This includes compile-time configuration and the skeleton for the
> > future driver implementation.
> > Add new class UCLASS_PVBLOCK which is going to be a parent for virtual
> > block devices.
> 
> We already have virtual block devices: virtio_blk, efi_blk.

Xen has its own para-virtualized block protocol; it is neither virtio nor EFI.

Regards,
Peng.

> 
> They work fine using the existing UCLASS_BLK. Why do we need a new uclass?
> 
> > Add new interface type IF_TYPE_PVBLOCK.
> >
> > Implement basic driver setup by reading XenStore configuration.
> >
> > Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > Signed-off-by: Oleksandr Andrushchenko
> > <oleksandr_andrushchenko@epam.com>
> > ---
> >  cmd/Kconfig                      |   7 ++
> >  cmd/Makefile                     |   1 +
> >  cmd/pvblock.c                    |  31 ++++++++
> >  common/board_r.c                 |  14 ++++
> >  configs/xenguest_arm64_defconfig |   4 +
> >  disk/part.c                      |   4 +
> >  drivers/Kconfig                  |   2 +
> >  drivers/block/blk-uclass.c       |   2 +
> >  drivers/xen/Kconfig              |  10 +++
> >  drivers/xen/Makefile             |   2 +
> >  drivers/xen/pvblock.c            | 121 +++++++++++++++++++++++++++++++
> >  include/blk.h                    |   1 +
> >  include/configs/xenguest_arm64.h |   8 ++
> >  include/dm/uclass-id.h           |   1 +
> >  include/pvblock.h                |  12 +++
> >  15 files changed, 220 insertions(+)
> >  create mode 100644 cmd/pvblock.c
> >  create mode 100644 drivers/xen/Kconfig
> >  create mode 100644 drivers/xen/pvblock.c
> >  create mode 100644 include/pvblock.h
> >
> > diff --git a/cmd/Kconfig b/cmd/Kconfig
> > index 192b3b262f..f28576947b 100644
> > --- a/cmd/Kconfig
> > +++ b/cmd/Kconfig
> > @@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
> >  	help
> >  	  USB mass storage support
> >
> > +config CMD_PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on XEN
> > +	select PVBLOCK
> > +	help
> > +	  Xen para-virtualized block device support
> > +
> >  config CMD_VIRTIO
> >  	bool "virtio"
> >  	depends on VIRTIO
> > diff --git a/cmd/Makefile b/cmd/Makefile
> > index 974ad48b0a..117284a28c 100644
> > --- a/cmd/Makefile
> > +++ b/cmd/Makefile
> > @@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
> >  obj-$(CONFIG_CMD_GPT) += gpt.o
> >  obj-$(CONFIG_CMD_ETHSW) += ethsw.o
> >  obj-$(CONFIG_CMD_AXI) += axi.o
> > +obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
> >
> >  # Power
> >  obj-$(CONFIG_CMD_PMIC) += pmic.o
> > diff --git a/cmd/pvblock.c b/cmd/pvblock.c
> > new file mode 100644
> > index 0000000000..7dbb243a74
> > --- /dev/null
> > +++ b/cmd/pvblock.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> Please, correct the formatting.
> 
> https://www.kernel.org/doc/html/latest/process/license-rules.html#license-identifier-syntax
> 
> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + *
> > + * XEN para-virtualized block device support
> > + */
> > +
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <command.h>
> > +
> > +/* Current I/O Device	*/
> > +static int pvblock_curr_device;
> > +
> > +int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char *const argv[])
> > +{
> > +	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
> > +			      &pvblock_curr_device);
> > +}
> > +
> > +U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
> > +	   "Xen para-virtualized block device",
> > +	   "info  - show available block devices\n"
> > +	   "pvblock device [dev] - show or set current device\n"
> > +	   "pvblock part [dev] - print partition table of one or all devices\n"
> > +	   "pvblock read  addr blk# cnt\n"
> > +	   "pvblock write addr blk# cnt - read/write `cnt'"
> > +	   " blocks starting at block `blk#'\n"
> > +	   "    to/from memory address `addr'");
> > +
> > diff --git a/common/board_r.c b/common/board_r.c
> > index fd36edb4e5..40cd0e5d3c 100644
> > --- a/common/board_r.c
> > +++ b/common/board_r.c
> > @@ -49,6 +49,7 @@
> >  #include <nand.h>
> >  #include <of_live.h>
> >  #include <onenand_uboot.h>
> > +#include <pvblock.h>
> >  #include <scsi.h>
> >  #include <serial.h>
> >  #include <status_led.h>
> > @@ -470,6 +471,16 @@ static int initr_xen(void)
> >  	return 0;
> >  }
> >  #endif
> > +
> > +#ifdef CONFIG_PVBLOCK
> > +static int initr_pvblock(void)
> > +{
> > +	puts("PVBLOCK: ");
> > +	pvblock_init();
> > +	return 0;
> > +}
> > +#endif
> > +
> >  /*
> >   * Tell if it's OK to load the environment early in boot.
> >   *
> > @@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
> >  #endif
> >  #ifdef CONFIG_XEN
> >  	initr_xen,
> > +#endif
> > +#ifdef CONFIG_PVBLOCK
> > +	initr_pvblock,
> >  #endif
> >  	initr_env,
> >  #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> > diff --git a/configs/xenguest_arm64_defconfig
> > b/configs/xenguest_arm64_defconfig
> > index 45559a161b..46473c251d 100644
> > --- a/configs/xenguest_arm64_defconfig
> > +++ b/configs/xenguest_arm64_defconfig
> > @@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
> >  CONFIG_CMD_BOOTEFI=n
> >  CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
> >  CONFIG_CMD_ELF=n
> > +CONFIG_CMD_EXT4=y
> > +CONFIG_CMD_FAT=y
> >  CONFIG_CMD_GO=n
> >  CONFIG_CMD_RUN=n
> >  CONFIG_CMD_IMI=n
> > @@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
> >  CONFIG_CMD_SAVEENV=n
> >  CONFIG_CMD_UMS=n
> >
> > +CONFIG_CMD_PVBLOCK=y
> > +
> >  #CONFIG_USB=n
> >  # CONFIG_ISO_PARTITION is not set
> >
> > diff --git a/disk/part.c b/disk/part.c
> > index f6a31025dc..b69fd345f3 100644
> > --- a/disk/part.c
> > +++ b/disk/part.c
> > @@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
> >  	case IF_TYPE_MMC:
> >  	case IF_TYPE_USB:
> >  	case IF_TYPE_NVME:
> > +	case IF_TYPE_PVBLOCK:
> >  		printf ("Vendor: %s Rev: %s Prod: %s\n",
> >  			dev_desc->vendor,
> >  			dev_desc->revision,
> > @@ -288,6 +289,9 @@ static void print_part_header(const char *type, struct blk_desc *dev_desc)
> >  	case IF_TYPE_NVME:
> >  		puts ("NVMe");
> >  		break;
> > +	case IF_TYPE_PVBLOCK:
> > +		puts("PV BLOCK");
> > +		break;
> >  	case IF_TYPE_VIRTIO:
> >  		puts("VirtIO");
> >  		break;
> > diff --git a/drivers/Kconfig b/drivers/Kconfig
> > index e34a22708c..65076aab03 100644
> > --- a/drivers/Kconfig
> > +++ b/drivers/Kconfig
> > @@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
> >
> >  source "drivers/watchdog/Kconfig"
> >
> > +source "drivers/xen/Kconfig"
> > +
> >  config PHYS_TO_BUS
> >  	bool "Custom physical to bus address mapping"
> >  	help
> > diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> > index b19375cbc8..6cfabbca24 100644
> > --- a/drivers/block/blk-uclass.c
> > +++ b/drivers/block/blk-uclass.c
> > @@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT] = {
> >  	[IF_TYPE_NVME]		= "nvme",
> >  	[IF_TYPE_EFI]		= "efi",
> >  	[IF_TYPE_VIRTIO]	= "virtio",
> > +	[IF_TYPE_PVBLOCK]	= "pvblock",
> >  };
> >
> >  static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> > @@ -43,6 +44,7 @@ static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> >  	[IF_TYPE_NVME]		= UCLASS_NVME,
> >  	[IF_TYPE_EFI]		= UCLASS_EFI,
> >  	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
> > +	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
> >  };
> >
> >  static enum if_type if_typename_to_iftype(const char *if_typename)
> > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> > new file mode 100644
> > index 0000000000..6ad2a93668
> > --- /dev/null
> > +++ b/drivers/xen/Kconfig
> > @@ -0,0 +1,10 @@
> > +config PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on DM
> > +	select BLK
> > +	select HAVE_BLOCK_DEVICE
> > +	help
> > +	  This driver implements the front-end of the Xen virtual
> > +	  block device driver. It communicates with a back-end driver
> > +	  in another domain which drives the actual block device.
> > +
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 243b13277a..87157df69b 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -6,3 +6,5 @@ obj-y += hypervisor.o
> >  obj-y += events.o
> >  obj-y += xenbus.o
> >  obj-y += gnttab.o
> > +
> > +obj-$(CONFIG_PVBLOCK) += pvblock.o
> > diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> > new file mode 100644
> > index 0000000000..057add9753
> > --- /dev/null
> > +++ b/drivers/xen/pvblock.c
> > @@ -0,0 +1,121 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> See above.
> 
> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + */
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <dm.h>
> > +#include <dm/device-internal.h>
> > +
> > +#define DRV_NAME	"pvblock"
> > +#define DRV_NAME_BLK	"pvblock_blk"
> > +
> > +struct blkfront_dev {
> > +	char dummy;
> > +};
> > +
> > +static int init_blkfront(unsigned int devid, struct blkfront_dev *dev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static void shutdown_blkfront(struct blkfront_dev *dev)
> > +{
> > +}
> > +
> > +ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> > +		       void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr, lbaint_t blkcnt,
> > +			const void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_bind(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_probe(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +	int ret;
> > +
> > +	ret = init_blkfront(0, blk_dev);
> > +	if (ret < 0)
> > +		return ret;
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_remove(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +
> > +	shutdown_blkfront(blk_dev);
> > +	return 0;
> > +}
> > +
> > +static const struct blk_ops pvblock_blk_ops = {
> > +	.read	= pvblock_blk_read,
> > +	.write	= pvblock_blk_write,
> > +};
> > +
> > +U_BOOT_DRIVER(pvblock_blk) = {
> > +	.name			= DRV_NAME_BLK,
> > +	.id			= UCLASS_BLK,
> > +	.ops			= &pvblock_blk_ops,
> > +	.bind			= pvblock_blk_bind,
> > +	.probe			= pvblock_blk_probe,
> > +	.remove			= pvblock_blk_remove,
> > +	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
> > +	.flags			= DM_FLAG_OS_PREPARE,
> > +};
> > +
> > +/*******************************************************************************
> > + * Para-virtual block device class
> > + *******************************************************************************/
> > +void pvblock_init(void)
> > +{
> > +	struct driver_info info;
> > +	struct udevice *udev;
> > +	struct uclass *uc;
> > +	int ret;
> > +
> > +	/*
> > +	 * At this point Xen drivers have already initialized,
> > +	 * so we can instantiate the class driver and enumerate
> > +	 * virtual block devices.
> > +	 */
> > +	info.name = DRV_NAME;
> > +	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
> > +	if (ret < 0)
> > +		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
> > +
> > +	/* Bootstrap virtual block devices class driver */
> > +	ret = uclass_get(UCLASS_PVBLOCK, &uc);
> > +	if (ret)
> > +		return;
> > +	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
> > +}
> > +
> > +static int pvblock_probe(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +U_BOOT_DRIVER(pvblock_drv) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +	.probe		= pvblock_probe,
> > +};
> > +
> > +UCLASS_DRIVER(pvblock) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +};
> > diff --git a/include/blk.h b/include/blk.h
> > index abcd4bedbb..9ee10fb80e 100644
> > --- a/include/blk.h
> > +++ b/include/blk.h
> > @@ -33,6 +33,7 @@ enum if_type {
> >  	IF_TYPE_HOST,
> >  	IF_TYPE_NVME,
> >  	IF_TYPE_EFI,
> > +	IF_TYPE_PVBLOCK,
> >  	IF_TYPE_VIRTIO,
> >
> >  	IF_TYPE_COUNT,			/* Number of interface types */
> > diff --git a/include/configs/xenguest_arm64.h
> > b/include/configs/xenguest_arm64.h
> > index 467dabf1e5..2c0d3d64fb 100644
> > --- a/include/configs/xenguest_arm64.h
> > +++ b/include/configs/xenguest_arm64.h
> > @@ -42,4 +42,12 @@
> >  #define CONFIG_CMDLINE_TAG            1
> >  #define CONFIG_INITRD_TAG             1
> >
> > +#define CONFIG_CMD_RUN
> > +
> > +#undef CONFIG_EXTRA_ENV_SETTINGS
> > +#define CONFIG_EXTRA_ENV_SETTINGS	\
> > +	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
> > +	"pvblockboot=run loadimage;" \
> > +		"booti 0x90000000 - 0x88000000;\0"
> > +
> >  #endif /* __XENGUEST_ARM64_H */
> > diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
> > index 7837d459f1..4bf7501204 100644
> > --- a/include/dm/uclass-id.h
> > +++ b/include/dm/uclass-id.h
> > @@ -121,6 +121,7 @@ enum uclass_id {
> >  	UCLASS_W1,		/* Dallas 1-Wire bus */
> >  	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
> >  	UCLASS_WDT,		/* Watchdog Timer driver */
> > +	UCLASS_PVBLOCK,		/* Xen virtual block device */
> >
> >  	UCLASS_COUNT,
> >  	UCLASS_INVALID = -1,
> > diff --git a/include/pvblock.h b/include/pvblock.h
> > new file mode 100644
> > index 0000000000..e3bb8ff9a7
> > --- /dev/null
> > +++ b/include/pvblock.h
> > @@ -0,0 +1,12 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> See above.
> 
> scripts/checkpatch.pl is your friend.
> 
> > + *
> > + * (C) 2020 EPAM Systems Inc.
> > + */
> > +
> > +#ifndef _PVBLOCK_H
> > +#define _PVBLOCK_H
> > +
> 
> Document your functions as described in
> 
> https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
> 
> Include the documentation in the HTML documentation generated by
> 
>     make htmldocs.
> 
> Best regards
> 
> Heinrich
> 
> > +void pvblock_init(void);
> > +
> > +#endif /* _PVBLOCK_H */
> >
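For reference, the two review comments above (SPDX line format and kernel-doc) might look like this when applied to include/pvblock.h. This is only a sketch: the kernel-doc wording is my guess at what the function does, and the function body plus the pvblock_init_calls counter are placeholders added so the fragment compiles on its own. In the patch the real definition lives in drivers/xen/pvblock.c.

```c
/* SPDX-License-Identifier: GPL-2.0+ */
/*
 * (C) 2020 EPAM Systems Inc.
 */

#ifndef _PVBLOCK_H
#define _PVBLOCK_H

/**
 * pvblock_init() - Initialize para-virtualized block devices
 *
 * Binds the PV block class driver and probes the devices found
 * under it. Expected to run after the Xen drivers are initialized.
 */
void pvblock_init(void);

#endif /* _PVBLOCK_H */

/* Placeholder definition (not from the patch) so the sketch is self-contained. */
static int pvblock_init_calls;

void pvblock_init(void)
{
	pvblock_init_calls++;
}
```

The kernel license rules put the identifier on the first line of the file, as a `/* ... */` comment in headers and as a `//` comment in .c sources.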


* [PATCH 03/17] board: Introduce xenguest_arm64 board
  2020-07-02  1:28   ` Peng Fan
@ 2020-07-02  7:18     ` Oleksandr Andrushchenko
  2020-07-02  7:26       ` Heinrich Schuchardt
  0 siblings, 1 reply; 57+ messages in thread
From: Oleksandr Andrushchenko @ 2020-07-02  7:18 UTC (permalink / raw)
  To: u-boot

On 7/2/20 4:28 AM, Peng Fan wrote:
>> Subject: [PATCH 03/17] board: Introduce xenguest_arm64 board
>>
>> From: Andrii Anisov <andrii_anisov@epam.com>
>>
>> Introduce a minimal Xen guest board running as a virtual machine under Xen
>> Project's hypervisor [1], [2].
>>
>> Part of the code is ported from Xen mini-os and also uses work initially done
>> by different authors from NXP: please see relevant files for their copyrights.
> This patch needs to be the last one in the series, otherwise it might break git bisect.

Not sure I understand why. This patch is a self-contained piece of work
which introduces a new board. What's wrong with this? Why would it break?

>
>> [1] https://xenbits.xen.org/
>> [2] https://wiki.xenproject.org/
>>
>> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
>> Signed-off-by: Oleksandr Andrushchenko
>> <oleksandr_andrushchenko@epam.com>
>> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>> ---
>>   arch/arm/Kconfig                          |   7 +
>>   arch/arm/cpu/armv8/Makefile               |   1 +
>>   arch/arm/cpu/armv8/xen/Makefile           |   6 +
>>   arch/arm/cpu/armv8/xen/hypercall.S        |  78 +++++++++++
>>   arch/arm/cpu/armv8/xen/lowlevel_init.S    |  34 +++++
>>   arch/arm/include/asm/xen.h                |   8 ++
>>   arch/arm/include/asm/xen/hypercall.h      |  45 +++++++
>>   board/xen/xenguest_arm64/Kconfig          |  12 ++
>>   board/xen/xenguest_arm64/Makefile         |   5 +
>>   board/xen/xenguest_arm64/xenguest_arm64.c | 153 ++++++++++++++++++++++
>>   configs/xenguest_arm64_defconfig          |  56 ++++++++
>>   include/configs/xenguest_arm64.h          |  45 +++++++
>>   12 files changed, 450 insertions(+)
>>   create mode 100644 arch/arm/cpu/armv8/xen/Makefile
>>   create mode 100644 arch/arm/cpu/armv8/xen/hypercall.S
>>   create mode 100644 arch/arm/cpu/armv8/xen/lowlevel_init.S
>>   create mode 100644 arch/arm/include/asm/xen.h
>>   create mode 100644 arch/arm/include/asm/xen/hypercall.h
>>   create mode 100644 board/xen/xenguest_arm64/Kconfig
>>   create mode 100644 board/xen/xenguest_arm64/Makefile
>>   create mode 100644 board/xen/xenguest_arm64/xenguest_arm64.c
>>   create mode 100644 configs/xenguest_arm64_defconfig
>>   create mode 100644 include/configs/xenguest_arm64.h
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index e9ad716aaa..c469863967 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -1717,6 +1717,12 @@ config TARGET_PRESIDIO_ASIC
>>   	bool "Support Cortina Presidio ASIC Platform"
>>   	select ARM64
>>
>> +config TARGET_XENGUEST_ARM64
>> +	bool "Xen guest ARM64"
>> +	select ARM64
>> +	select XEN
>> +	select OF_CONTROL
>> +	select LINUX_KERNEL_IMAGE_HEADER
>>   endchoice
>>
>>   config ARCH_SUPPORT_TFABOOT
>> @@ -1920,6 +1926,7 @@ source "board/xilinx/Kconfig"
>>   source "board/xilinx/zynq/Kconfig"
>>   source "board/xilinx/zynqmp/Kconfig"
>>   source "board/phytium/durian/Kconfig"
>> +source "board/xen/xenguest_arm64/Kconfig"
>>
>>   source "arch/arm/Kconfig.debug"
>>
>> diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
>> index 2e48df0eb9..dd6c354d19 100644
>> --- a/arch/arm/cpu/armv8/Makefile
>> +++ b/arch/arm/cpu/armv8/Makefile
>> @@ -39,3 +39,4 @@ obj-$(CONFIG_S32V234) += s32v234/
>>   obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
>>   obj-$(CONFIG_ARMV8_PSCI) += psci.o
>>   obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
>> +obj-$(CONFIG_XEN) += xen/
>> diff --git a/arch/arm/cpu/armv8/xen/Makefile
>> b/arch/arm/cpu/armv8/xen/Makefile
>> new file mode 100644
>> index 0000000000..e3b4ae2bd4
>> --- /dev/null
>> +++ b/arch/arm/cpu/armv8/xen/Makefile
>> @@ -0,0 +1,6 @@
>> +# SPDX-License-Identifier: GPL-2.0+
>> +#
>> +# (C) 2018 NXP
>> +# (C) 2020 EPAM Systems Inc.
>> +
>> +obj-y += lowlevel_init.o hypercall.o
>> diff --git a/arch/arm/cpu/armv8/xen/hypercall.S
>> b/arch/arm/cpu/armv8/xen/hypercall.S
>> new file mode 100644
>> index 0000000000..9596e336b5
>> --- /dev/null
>> +++ b/arch/arm/cpu/armv8/xen/hypercall.S
>> @@ -0,0 +1,78 @@
>> +/******************************************************************************
>> + * hypercall.S
>> + *
>> + * Xen hypercall wrappers
>> + *
>> + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License version 2
>> + * as published by the Free Software Foundation; or, when distributed
>> + * separately from the Linux kernel or incorporated into other
>> + * software packages, subject to the following license:
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this source file (the "Software"), to deal in the Software without
>> + * restriction, including without limitation the rights to use, copy, modify,
>> + * merge, publish, distribute, sublicense, and/or sell copies of the Software,
>> + * and to permit persons to whom the Software is furnished to do so, subject to
>> + * the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +/*
>> + * The Xen hypercall calling convention is very similar to the procedure
>> + * call standard for the ARM 64-bit architecture: the first parameter is
>> + * passed in x0, the second in x1, the third in x2, the fourth in x3 and
>> + * the fifth in x4.
>> + *
>> + * The hypercall number is passed in x16.
>> + *
>> + * The return value is in x0.
>> + *
>> + * The hvc ISS is required to be 0xEA1, that is the Xen specific ARM
>> + * hypercall tag.
>> + *
>> + * Parameter structs passed to hypercalls are laid out according to
>> + * the ARM 64-bit EABI standard.
>> + */
>> +
>> +#include <xen/interface/xen.h>
>> +
>> +#define XEN_HYPERCALL_TAG	0xEA1
>> +
>> +#define HYPERCALL_SIMPLE(hypercall)		\
>> +.globl HYPERVISOR_##hypercall;                  \
>> +.align 4,0x90;                                  \
>> +HYPERVISOR_##hypercall:				\
>> +	mov x16, #__HYPERVISOR_##hypercall;	\
>> +	hvc XEN_HYPERCALL_TAG;			\
>> +	ret;					\
>> +
>> +#define HYPERCALL0 HYPERCALL_SIMPLE
>> +#define HYPERCALL1 HYPERCALL_SIMPLE
>> +#define HYPERCALL2 HYPERCALL_SIMPLE
>> +#define HYPERCALL3 HYPERCALL_SIMPLE
>> +#define HYPERCALL4 HYPERCALL_SIMPLE
>> +#define HYPERCALL5 HYPERCALL_SIMPLE
>> +
>> +                .text
>> +
>> +HYPERCALL2(xen_version);
>> +HYPERCALL3(console_io);
>> +HYPERCALL3(grant_table_op);
>> +HYPERCALL2(sched_op);
>> +HYPERCALL2(event_channel_op);
>> +HYPERCALL2(hvm_op);
>> +HYPERCALL2(memory_op);
>> +
>> diff --git a/arch/arm/cpu/armv8/xen/lowlevel_init.S
>> b/arch/arm/cpu/armv8/xen/lowlevel_init.S
>> new file mode 100644
>> index 0000000000..25ed438e20
>> --- /dev/null
>> +++ b/arch/arm/cpu/armv8/xen/lowlevel_init.S
>> @@ -0,0 +1,34 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0+
>> + *
>> + * (C) 2017 NXP
>> + * (C) 2020 EPAM Systems Inc.
>> + */
>> +
>> +#include <config.h>
>> +
>> +.align 8
>> +.global rom_pointer
>> +rom_pointer:
>> +	.space 32
>> +
>> +/*
>> + * Routine: save_boot_params (called after reset from start.S)
>> + */
>> +
>> +.global save_boot_params
>> +save_boot_params:
>> +	/* The firmware provided ATAG/FDT address can be found in r2/x0 */
>> +	adr	x1, rom_pointer
>> +	stp	x0, x2, [x1], #16
>> +	stp	x3, x4, [x1], #16
>> +
>> +	/* Returns */
>> +	b	save_boot_params_ret
>> +
>> +.global restore_boot_params
>> +restore_boot_params:
>> +	adr	x1, rom_pointer
>> +	ldp	x0, x2, [x1], #16
>> +	ldp	x3, x4, [x1], #16
>> +	ret
>> diff --git a/arch/arm/include/asm/xen.h b/arch/arm/include/asm/xen.h
>> new file mode 100644
>> index 0000000000..fb7f03e19c
>> --- /dev/null
>> +++ b/arch/arm/include/asm/xen.h
>> @@ -0,0 +1,8 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0+
>> + *
>> + * (C) 2020 EPAM Systems Inc.
>> + */
>> +
>> +extern unsigned long rom_pointer[];
>> +
>> diff --git a/arch/arm/include/asm/xen/hypercall.h
>> b/arch/arm/include/asm/xen/hypercall.h
>> new file mode 100644
>> index 0000000000..26644ce886
>> --- /dev/null
>> +++ b/arch/arm/include/asm/xen/hypercall.h
>> @@ -0,0 +1,45 @@
>> +/******************************************************************************
>> + * hypercall.h
>> + *
>> + * Linux-specific hypervisor handling.
>> + *
>> + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License version 2
>> + * as published by the Free Software Foundation; or, when distributed
>> + * separately from the Linux kernel or incorporated into other
>> + * software packages, subject to the following license:
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this source file (the "Software"), to deal in the Software without
>> + * restriction, including without limitation the rights to use, copy, modify,
>> + * merge, publish, distribute, sublicense, and/or sell copies of the Software,
>> + * and to permit persons to whom the Software is furnished to do so, subject to
>> + * the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#ifndef _ASM_ARM_XEN_HYPERCALL_H
>> +#define _ASM_ARM_XEN_HYPERCALL_H
>> +
>> +#include <xen/interface/xen.h>
>> +
>> +int HYPERVISOR_xen_version(int cmd, void *arg);
>> +int HYPERVISOR_console_io(int cmd, int count, char *str);
>> +int HYPERVISOR_grant_table_op(unsigned int cmd, void *uop, unsigned int count);
>> +int HYPERVISOR_sched_op(int cmd, void *arg);
>> +int HYPERVISOR_event_channel_op(int cmd, void *arg);
>> +unsigned long HYPERVISOR_hvm_op(int op, void *arg);
>> +int HYPERVISOR_memory_op(unsigned int cmd, void *arg);
>> +#endif /* _ASM_ARM_XEN_HYPERCALL_H */
>> diff --git a/board/xen/xenguest_arm64/Kconfig
>> b/board/xen/xenguest_arm64/Kconfig
>> new file mode 100644
>> index 0000000000..cc131ed5b9
>> --- /dev/null
>> +++ b/board/xen/xenguest_arm64/Kconfig
>> @@ -0,0 +1,12 @@
>> +if TARGET_XENGUEST_ARM64
>> +
>> +config SYS_BOARD
>> +	default "xenguest_arm64"
>> +
>> +config SYS_VENDOR
>> +	default "xen"
>> +
>> +config SYS_CONFIG_NAME
>> +	default "xenguest_arm64"
>> +
>> +endif
>> diff --git a/board/xen/xenguest_arm64/Makefile
>> b/board/xen/xenguest_arm64/Makefile
>> new file mode 100644
>> index 0000000000..1cf87a728f
>> --- /dev/null
>> +++ b/board/xen/xenguest_arm64/Makefile
>> @@ -0,0 +1,5 @@
>> +# SPDX-License-Identifier:	GPL-2.0+
>> +#
>> +# (C) Copyright 2020 EPAM Systems Inc.
>> +
>> +obj-y	:= xenguest_arm64.o
>> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c
>> b/board/xen/xenguest_arm64/xenguest_arm64.c
>> new file mode 100644
>> index 0000000000..9e099f388f
>> --- /dev/null
>> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
>> @@ -0,0 +1,153 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0+
>> + *
>> + * (C) 2013
>> + * David Feng <fenghua@phytium.com.cn>
>> + * Sharma Bhupesh <bhupesh.sharma@freescale.com>
>> + *
>> + * (C) 2020 EPAM Systems Inc
>> + */
>> +
>> +#include <common.h>
>> +#include <cpu_func.h>
>> +#include <dm.h>
>> +#include <errno.h>
>> +#include <malloc.h>
>> +
>> +#include <asm/io.h>
>> +#include <asm/armv8/mmu.h>
>> +#include <asm/xen.h>
>> +#include <asm/xen/hypercall.h>
>> +
>> +#include <linux/compiler.h>
>> +
>> +DECLARE_GLOBAL_DATA_PTR;
>> +
>> +int board_init(void)
>> +{
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Use fdt provided by Xen: according to
>> + *
>> + * https://www.kernel.org/doc/Documentation/arm64/booting.txt
>> + * x0 is the physical address of the device tree blob (dtb) in system RAM.
>> + * This is stored in rom_pointer during low level init.
>> + */
>> +void *board_fdt_blob_setup(void)
>> +{
>> +	if (fdt_magic(rom_pointer[0]) != FDT_MAGIC)
>> +		return NULL;
>> +	return (void *)rom_pointer[0];
>> +}
>> +
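To spell out what the magic check in board_fdt_blob_setup() above guards against: a flattened device tree starts with the big-endian word 0xd00dfeed, so a stale or zeroed x0 is rejected before anything dereferences it further. A standalone sketch of the same test, with no libfdt dependency. blob_magic() and checked_fdt() are names invented for this example, and the explicit length argument is my addition; the patch itself relies on the fdt_magic() macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedU	/* first word of a flattened device tree */

/* Read the big-endian 32-bit magic from the start of a blob. */
static uint32_t blob_magic(const unsigned char *p)
{
	assert(p);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: return the blob if it looks like a device tree,
 * NULL otherwise. len is the number of readable bytes at blob.
 */
static const void *checked_fdt(const void *blob, size_t len)
{
	if (!blob || len < 4)
		return NULL;
	return blob_magic(blob) == FDT_MAGIC ? blob : NULL;
}
```

Reading the bytes one by one keeps the check independent of host endianness, which is what fdt32_to_cpu() achieves inside libfdt.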
>> +#define MAX_MEM_MAP_REGIONS 5
>> +static struct mm_region xen_mem_map[MAX_MEM_MAP_REGIONS];
>> +struct mm_region *mem_map = xen_mem_map;
>> +
>> +static int get_next_memory_node(const void *blob, int mem)
>> +{
>> +	do {
>> +		mem = fdt_node_offset_by_prop_value(blob, mem,
>> +						    "device_type", "memory", 7);
>> +	} while (!fdtdec_get_is_enabled(blob, mem));
>> +
>> +	return mem;
>> +}
>> +
>> +static int setup_mem_map(void)
>> +{
>> +	int i, ret, mem, reg = 0;
>> +	struct fdt_resource res;
>> +	const void *blob = gd->fdt_blob;
>> +
>> +	mem = get_next_memory_node(blob, -1);
>> +	if (mem < 0) {
>> +		printf("%s: Missing /memory node\n", __func__);
>> +		return -EINVAL;
>> +	}
>> +
>> +	for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
>> +		ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
>> +		if (ret == -FDT_ERR_NOTFOUND) {
>> +			reg = 0;
>> +			mem = get_next_memory_node(blob, mem);
>> +			if (mem == -FDT_ERR_NOTFOUND)
>> +				break;
>> +
>> +			ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
>> +			if (ret == -FDT_ERR_NOTFOUND)
>> +				break;
>> +		}
>> +		if (ret != 0) {
>> +			printf("No reg property for memory node\n");
>> +			return -EINVAL;
>> +		}
>> +
>> +		xen_mem_map[i].virt = (phys_addr_t)res.start;
>> +		xen_mem_map[i].phys = (phys_addr_t)res.start;
>> +		xen_mem_map[i].size = (phys_size_t)(res.end - res.start + 1);
>> +		xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
>> +					PTE_BLOCK_INNER_SHARE);
>> +	}
>> +	return 0;
>> +}
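
For readers following the loop above, here is a minimal sketch of the identity mapping it builds, assuming each resource read from the device tree has an inclusive end address (hence the + 1 in the size). The struct and function names are hypothetical stand-ins, not U-Boot's struct fdt_resource or struct mm_region, and the memory attributes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource read from a "reg" property. */
struct resource_sketch {
	uint64_t start;	/* first byte of the range */
	uint64_t end;	/* last byte of the range, inclusive */
};

/* Hypothetical stand-in for one MMU memory map entry. */
struct region_sketch {
	uint64_t virt;
	uint64_t phys;
	uint64_t size;
};

#define MAX_REGIONS_SKETCH 5

/*
 * Fill map[] with identity mappings (virt == phys) for at most
 * MAX_REGIONS_SKETCH resources. Returns the number of entries written.
 */
size_t fill_identity_map(struct region_sketch *map,
			 const struct resource_sketch *res, size_t count)
{
	size_t i;

	assert(map && res);

	for (i = 0; i < count && i < MAX_REGIONS_SKETCH; i++) {
		map[i].virt = res[i].start;
		map[i].phys = res[i].start;
		/* "end" is inclusive, so the size needs the + 1 */
		map[i].size = res[i].end - res[i].start + 1;
	}

	return i;
}
```

With a bank spanning 0x40000000 to 0x7fffffff this yields a single entry of size 0x40000000 whose virtual and physical addresses are both 0x40000000.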
>> +
>> +void enable_caches(void)
>> +{
>> +	/* Re-setup the memory map as BSS gets cleared after relocation. */
>> +	setup_mem_map();
>> +	icache_enable();
>> +	dcache_enable();
>> +}
>> +
>> +/* Read memory settings from the Xen provided device tree. */
>> +int dram_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = fdtdec_setup_mem_size_base();
>> +	if (ret < 0)
>> +		return ret;
>> +	/* Setup memory map, so MMU page table size can be estimated. */
>> +	return setup_mem_map();
>> +}
>> +
>> +int dram_init_banksize(void)
>> +{
>> +	return fdtdec_setup_memory_banksize();
>> +}
>> +
>> +/*
>> + * Board specific reset that is system reset.
>> + */
>> +void reset_cpu(ulong addr)
>> +{
>> +}
>> +
>> +int ft_system_setup(void *blob, bd_t *bd)
>> +{
>> +	return 0;
>> +}
>> +
>> +int ft_board_setup(void *blob, bd_t *bd)
>> +{
>> +	return 0;
>> +}
>> +
>> +int board_early_init_f(void)
>> +{
>> +	return 0;
>> +}
> Drop the upper three functions if not needed.
>
>> +
>> +int print_cpuinfo(void)
>> +{
>> +	printf("Xen virtual CPU\n");
>> +	return 0;
>> +}
>> +
>> +__weak struct serial_device *default_serial_console(void)
>> +{
>> +	return NULL;
>> +}
>> +
>> diff --git a/configs/xenguest_arm64_defconfig
>> b/configs/xenguest_arm64_defconfig
>> new file mode 100644
>> index 0000000000..2a8caf8647
>> --- /dev/null
>> +++ b/configs/xenguest_arm64_defconfig
>> @@ -0,0 +1,56 @@
>> +CONFIG_ARM=y
>> +CONFIG_POSITION_INDEPENDENT=y
>> +CONFIG_SYS_TEXT_BASE=0x40080000
>> +CONFIG_SYS_MALLOC_F_LEN=0x2000
>> +CONFIG_IDENT_STRING=" xenguest"
>> +CONFIG_TARGET_XENGUEST_ARM64=y
>> +CONFIG_BOOTDELAY=10
> 10s?
>
> Regards,
> Peng.
>> +
>> +CONFIG_SYS_PROMPT="xenguest# "
>> +
>> +CONFIG_CMD_NET=n
>> +CONFIG_CMD_BDI=n
>> +CONFIG_CMD_BOOTD=n
>> +CONFIG_CMD_BOOTEFI=n
>> +CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
>> +CONFIG_CMD_ELF=n
>> +CONFIG_CMD_GO=n
>> +CONFIG_CMD_RUN=n
>> +CONFIG_CMD_IMI=n
>> +CONFIG_CMD_IMLS=n
>> +CONFIG_CMD_XIMG=n
>> +CONFIG_CMD_EXPORTENV=n
>> +CONFIG_CMD_IMPORTENV=n
>> +CONFIG_CMD_EDITENV=n
>> +CONFIG_CMD_ENV_EXISTS=n
>> +CONFIG_CMD_MEMORY=y
>> +CONFIG_CMD_CRC32=n
>> +CONFIG_CMD_DM=n
>> +CONFIG_CMD_LOADB=n
>> +CONFIG_CMD_LOADS=n
>> +CONFIG_CMD_FLASH=n
>> +CONFIG_CMD_GPT=n
>> +CONFIG_CMD_FPGA=n
>> +CONFIG_CMD_ECHO=n
>> +CONFIG_CMD_ITEST=n
>> +CONFIG_CMD_SOURCE=n
>> +CONFIG_CMD_SETEXPR=n
>> +CONFIG_CMD_MISC=n
>> +CONFIG_CMD_UNZIP=n
>> +CONFIG_CMD_LZMADEC=n
>> +CONFIG_CMD_SAVEENV=n
>> +CONFIG_CMD_UMS=n
>> +
>> +#CONFIG_USB=n
>> +# CONFIG_ISO_PARTITION is not set
>> +
>> +#CONFIG_EFI_PARTITION=y
>> +# CONFIG_EFI_LOADER is not set
>> +
>> +# CONFIG_DM is not set
>> +# CONFIG_MMC is not set
>> +# CONFIG_DM_SERIAL is not set
>> +# CONFIG_REQUIRE_SERIAL_CONSOLE is not set
>> +
>> +CONFIG_OF_BOARD=y
>> +CONFIG_OF_LIBFDT=y
>> diff --git a/include/configs/xenguest_arm64.h
>> b/include/configs/xenguest_arm64.h
>> new file mode 100644
>> index 0000000000..467dabf1e5
>> --- /dev/null
>> +++ b/include/configs/xenguest_arm64.h
>> @@ -0,0 +1,45 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0+
>> + *
>> + * (C) Copyright 2020 EPAM Systems Inc.
>> + */
>> +#ifndef __XENGUEST_ARM64_H
>> +#define __XENGUEST_ARM64_H
>> +
>> +#ifndef __ASSEMBLY__
>> +#include <linux/types.h>
>> +#endif
>> +
>> +#define CONFIG_BOARD_EARLY_INIT_F
>> +
>> +#define CONFIG_EXTRA_ENV_SETTINGS
>> +
>> +#undef CONFIG_NR_DRAM_BANKS
>> +#undef CONFIG_SYS_SDRAM_BASE
>> +
>> +#define CONFIG_NR_DRAM_BANKS          1
>> +
>> +/*
>> + * This can be any arbitrary address as we are using PIE, but
>> + * please note, that CONFIG_SYS_TEXT_BASE must match the below.
>> + */
>> +#define CONFIG_SYS_LOAD_ADDR                    0x40000000
>> +#define CONFIG_LNX_KRNL_IMG_TEXT_OFFSET_BASE    CONFIG_SYS_LOAD_ADDR
>> +
>> +/* Size of malloc() pool */
>> +#define CONFIG_SYS_MALLOC_LEN         (32 * 1024 * 1024)
>> +
>> +/* Monitor Command Prompt */
>> +#define CONFIG_SYS_PROMPT_HUSH_PS2    "> "
>> +#define CONFIG_SYS_CBSIZE             1024
>> +#define CONFIG_SYS_MAXARGS            64
>> +#define CONFIG_SYS_BARGSIZE           CONFIG_SYS_CBSIZE
>> +#define CONFIG_SYS_PBSIZE             (CONFIG_SYS_CBSIZE + \
>> +				      sizeof(CONFIG_SYS_PROMPT) + 16)
>> +
>> +#define CONFIG_OF_SYSTEM_SETUP
>> +
>> +#define CONFIG_CMDLINE_TAG            1
>> +#define CONFIG_INITRD_TAG             1
>> +
>> +#endif /* __XENGUEST_ARM64_H */
>> --
>> 2.17.1

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 03/17] board: Introduce xenguest_arm64 board
  2020-07-02  7:18     ` Oleksandr Andrushchenko
@ 2020-07-02  7:26       ` Heinrich Schuchardt
  2020-07-02  7:57         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 57+ messages in thread
From: Heinrich Schuchardt @ 2020-07-02  7:26 UTC (permalink / raw)
  To: u-boot

On 02.07.20 09:18, Oleksandr Andrushchenko wrote:
> On 7/2/20 4:28 AM, Peng Fan wrote:
>>> Subject: [PATCH 03/17] board: Introduce xenguest_arm64 board
>>>
>>> From: Andrii Anisov <andrii_anisov@epam.com>
>>>
>>> Introduce a minimal Xen guest board running as a virtual machine under Xen
>>> Project's hypervisor [1], [2].
>>>
>>> Part of the code is ported from Xen mini-os and also uses work initially done
>>> by different authors from NXP: please see relevant files for their copyrights.
>> This patch needs to come last in the series; otherwise it might break git bisect.
>
> Not sure I understand why. This patch is a self-contained piece of work
> which introduces a new board. What's wrong with this? Why would it break?
>

We run automated tests in GitLab CI and Travis CI. Once there is
a defconfig file, we try to build it.

If the build fails with only patches 1-3 merged, we have a problem.

Best regards

Heinrich

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 03/17] board: Introduce xenguest_arm64 board
  2020-07-02  7:26       ` Heinrich Schuchardt
@ 2020-07-02  7:57         ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 57+ messages in thread
From: Oleksandr Andrushchenko @ 2020-07-02  7:57 UTC (permalink / raw)
  To: u-boot


On 7/2/20 10:26 AM, Heinrich Schuchardt wrote:
> On 02.07.20 09:18, Oleksandr Andrushchenko wrote:
>> On 7/2/20 4:28 AM, Peng Fan wrote:
>>>> Subject: [PATCH 03/17] board: Introduce xenguest_arm64 board
>>>>
>>>> From: Andrii Anisov <andrii_anisov@epam.com>
>>>>
>>>> Introduce a minimal Xen guest board running as a virtual machine under Xen
>>>> Project's hypervisor [1], [2].
>>>>
>>>> Part of the code is ported from Xen mini-os and also uses work initially done
>>>> by different authors from NXP: please see relevant files for their copyrights.
>>> This patch needs to come last in the series; otherwise it might break git bisect.
>> Not sure I understand why. This patch is a self-contained piece of work
>> which introduces a new board. What's wrong with this? Why would it break?
>>
> We run automated tests in GitLab CI and Travis CI. Once there is
> a defconfig file, we try to build it.
>
> If the build fails with only patches 1-3 merged, we have a problem.

You are both absolutely right: we need to swap two patches so that they are
in the following order:

board: Introduce xenguest_arm64 board
xen: Add essential and required interface headers
Kconfig: Introduce CONFIG_XEN

i.e. we add the headers first and then the board.

Thank you for pointing this out,

Oleksandr

>
> Best regards
>
> Heinrich

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 02/17] Kconfig: Introduce CONFIG_XEN
  2020-07-01 16:29 ` [PATCH 02/17] Kconfig: Introduce CONFIG_XEN Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-03 12:42     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

Hi Anastasiia,

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Peng Fan <peng.fan@nxp.com>
>
> Introduce CONFIG_XEN so that U-Boot can be used as a bootloader
> for a virtual machine.
>
> Without a bootloader, we can successfully boot Android on Xen, but
> we need a bootloader to support A/B, dm-verity, etc.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  Kconfig | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/Kconfig b/Kconfig
> index 8f3fba085a..67f773d3a6 100644
> --- a/Kconfig
> +++ b/Kconfig
> @@ -69,6 +69,13 @@ config CC_COVERAGE
>           Enabling this option will pass "--coverage" to gcc to compile
>           and link code instrumented for coverage analysis.
>
> +config XEN
> +       bool "Select U-Boot be run as a bootloader for XEN Virtual Machine"
> +       default n

Not needed

> +       help
> +         Enabling this option will make U-Boot be run as a bootloader
> +         for XEN Virtual Machine.

Can you please add a few more details? What is XEN? Is there a URL? Also, what
does this actually do? Add some features to talk to XEN? It really
needs more info and perhaps a pointer to some docs.

> +
>  config DISTRO_DEFAULTS
>         bool "Select defaults suitable for booting general purpose Linux distributions"
>         select AUTO_COMPLETE
> --
> 2.17.1
>

Regards,
Simon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 06/17] xen: Port Xen event channel driver from mini-os
  2020-07-01 16:29 ` [PATCH 06/17] xen: Port Xen event channel driver " Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-03 12:34     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

Hi,

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Make the updates required to run on U-Boot. Strip functionality
> not needed by U-Boot.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  drivers/xen/Makefile     |   1 +
>  drivers/xen/events.c     | 177 +++++++++++++++++++++++++++++++++++++++
>  drivers/xen/hypervisor.c |   6 +-
>  include/xen/events.h     |  47 +++++++++++
>  4 files changed, 228 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/xen/events.c
>  create mode 100644 include/xen/events.h
>
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 1211bf2386..0ad35edefb 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -3,3 +3,4 @@
>  # (C) Copyright 2020 EPAM Systems Inc.
>
>  obj-y += hypervisor.o
> +obj-y += events.o
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> new file mode 100644
> index 0000000000..eddc6b6e29
> --- /dev/null
> +++ b/drivers/xen/events.c
> @@ -0,0 +1,177 @@
> +/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-

SPDX is needed on files

> + ****************************************************************************
> + * (C) 2003 - Rolf Neugebauer - Intel Research Cambridge
> + * (C) 2005 - Grzegorz Milos - Intel Research Cambridge
> + * (C) 2020 - EPAM Systems Inc.
> + ****************************************************************************
> + *
> + *             File: events.c
> + *       Author: Rolf Neugebauer (neugebar at dcs.gla.ac.uk)
> + *      Changes: Grzegorz Milos (gm281 at cam.ac.uk)
> + *
> + *             Date: Jul 2003, changes Jun 2005
> + *
> + * Environment: Xen Minimal OS
> + * Description: Deals with events received on event channels
> + *
> + ****************************************************************************

Can you drop these stars and use the normal U-Boot format?

> + */
> +#include <common.h>
> +#include <log.h>
> +
> +#include <asm/io.h>
> +#include <asm/xen/system.h>
> +
> +#include <xen/events.h>
> +#include <xen/hvm.h>
> +
> +#define NR_EVS 1024
> +
> +/* this represents a event handler. Chaining or sharing is not allowed */
> +typedef struct _ev_action_t {

Please don't use typedefs.

Also there should be comments on functions, particularly those in the
header file.

Are you trying to keep the source similar to an upstream version?
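
To illustrate the two requests above (no typedefs, comments on functions), here is a minimal sketch of how the handler table could look. The quoted hunk stops before the struct body, so the handler, data and count fields are an assumption, and every name ending in _sketch is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_EVS_SKETCH 8	/* the patch uses NR_EVS 1024; kept small here */

/**
 * struct ev_action_sketch - one event-channel handler slot (assumed layout)
 * @handler: callback run when the port fires, NULL while the port is unbound
 * @data: opaque pointer handed back to @handler
 * @count: number of events dispatched on this port
 */
struct ev_action_sketch {
	void (*handler)(unsigned int port, void *data);
	void *data;
	uint32_t count;
};

static struct ev_action_sketch ev_actions_sketch[NR_EVS_SKETCH];

/**
 * bind_port_sketch() - attach a handler to an event-channel port
 * @port: port number, must be below NR_EVS_SKETCH
 * @handler: callback to run on each event, must not be NULL
 * @data: opaque pointer passed to @handler
 *
 * Return: 0 on success, -1 if @port is out of range or already bound
 */
int bind_port_sketch(unsigned int port,
		     void (*handler)(unsigned int port, void *data),
		     void *data)
{
	assert(handler);

	if (port >= NR_EVS_SKETCH || ev_actions_sketch[port].handler)
		return -1;

	ev_actions_sketch[port].handler = handler;
	ev_actions_sketch[port].data = data;
	ev_actions_sketch[port].count = 0;

	return 0;
}

/**
 * do_event_sketch() - dispatch one event to the handler bound to @port
 * @port: port number the event arrived on
 *
 * Return: 0 if a handler ran, -1 if @port is unbound or out of range
 */
int do_event_sketch(unsigned int port)
{
	struct ev_action_sketch *action;

	if (port >= NR_EVS_SKETCH)
		return -1;

	action = &ev_actions_sketch[port];
	if (!action->handler)
		return -1;

	action->count++;
	action->handler(port, action->data);

	return 0;
}

/* Example handler: bumps the unsigned int counter that @data points to. */
void count_events_sketch(unsigned int port, void *data)
{
	(void)port;
	(*(unsigned int *)data)++;
}
```

Binding count_events_sketch to a port and calling do_event_sketch() twice on that port leaves the caller's counter at 2; a second bind on the same port is refused because chaining is not allowed.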

Regards,
Simon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver
  2020-07-01 16:29 ` [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-03 12:59     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

Hi Anastasiia,

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Peng Fan <peng.fan@nxp.com>
>
> Add support for Xen para-virtualized serial driver. This
> driver fully supports serial console for the virtual machine.
>
> Please note that, as the driver is initialized late, neither the banner
> nor the memory size is visible.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  arch/arm/Kconfig                          |   1 +
>  board/xen/xenguest_arm64/xenguest_arm64.c |  31 +++-
>  configs/xenguest_arm64_defconfig          |   4 +-
>  drivers/serial/Kconfig                    |   7 +
>  drivers/serial/Makefile                   |   1 +
>  drivers/serial/serial_xen.c               | 175 ++++++++++++++++++++++
>  drivers/xen/events.c                      |   4 +
>  7 files changed, 214 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/serial/serial_xen.c

Reviewed-by: Simon Glass <sjg@chromium.org>

nits below

>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index c469863967..d4de1139aa 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1723,6 +1723,7 @@ config TARGET_XENGUEST_ARM64
>         select XEN
>         select OF_CONTROL
>         select LINUX_KERNEL_IMAGE_HEADER
> +       select XEN_SERIAL
>  endchoice
>
>  config ARCH_SUPPORT_TFABOOT
> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
> index 9e099f388f..fd10a002e9 100644
> --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> @@ -18,9 +18,12 @@
>  #include <asm/armv8/mmu.h>
>  #include <asm/xen.h>
>  #include <asm/xen/hypercall.h>
> +#include <asm/xen/system.h>
>
>  #include <linux/compiler.h>
>
> +#include <xen/hvm.h>
> +
>  DECLARE_GLOBAL_DATA_PTR;
>
>  int board_init(void)
> @@ -57,9 +60,28 @@ static int get_next_memory_node(const void *blob, int mem)
>
>  static int setup_mem_map(void)
>  {
> -       int i, ret, mem, reg = 0;
> +       int i = 0, ret, mem, reg = 0;
>         struct fdt_resource res;
>         const void *blob = gd->fdt_blob;
> +       u64 gfn;
> +
> +       /*
> +        * Add "magic" region which is used by Xen to provide some essentials
> +        * for the guest: we need console.
> +        */
> +       ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_CONSOLE_PFN, &gfn);
> +       if (ret < 0) {
> +               printf("%s: Can't get HVM_PARAM_CONSOLE_PFN, ret %d\n",
> +                      __func__, ret);
> +               return -EINVAL;
> +       }
> +
> +       xen_mem_map[i].virt = PFN_PHYS(gfn);
> +       xen_mem_map[i].phys = PFN_PHYS(gfn);
> +       xen_mem_map[i].size = PAGE_SIZE;
> +       xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> +                               PTE_BLOCK_INNER_SHARE);
> +       i++;
>
>         mem = get_next_memory_node(blob, -1);
>         if (mem < 0) {
> @@ -67,7 +89,7 @@ static int setup_mem_map(void)
>                 return -EINVAL;
>         }
>
> -       for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
> +       for (; i < MAX_MEM_MAP_REGIONS; i++) {
>                 ret = fdt_get_resource(blob, mem, "reg", reg++, &res);
>                 if (ret == -FDT_ERR_NOTFOUND) {
>                         reg = 0;
> @@ -146,8 +168,3 @@ int print_cpuinfo(void)
>         return 0;
>  }
>
> -__weak struct serial_device *default_serial_console(void)
> -{
> -       return NULL;
> -}
> -
> diff --git a/configs/xenguest_arm64_defconfig b/configs/xenguest_arm64_defconfig
> index 2a8caf8647..45559a161b 100644
> --- a/configs/xenguest_arm64_defconfig
> +++ b/configs/xenguest_arm64_defconfig
> @@ -47,9 +47,9 @@ CONFIG_CMD_UMS=n
>  #CONFIG_EFI_PARTITION=y
>  # CONFIG_EFI_LOADER is not set
>
> -# CONFIG_DM is not set
> +CONFIG_DM=y
>  # CONFIG_MMC is not set
> -# CONFIG_DM_SERIAL is not set
> +CONFIG_DM_SERIAL=y
>  # CONFIG_REQUIRE_SERIAL_CONSOLE is not set
>
>  CONFIG_OF_BOARD=y
> diff --git a/drivers/serial/Kconfig b/drivers/serial/Kconfig
> index 17d0e73623..33c989a66d 100644
> --- a/drivers/serial/Kconfig
> +++ b/drivers/serial/Kconfig
> @@ -821,6 +821,13 @@ config MPC8XX_CONS
>         depends on MPC8xx
>         default y
>
> +config XEN_SERIAL
> +       bool "XEN serial support"
> +       depends on XEN
> +       help
> +         If built without DM support, then requires Xen
> +         to be built with CONFIG_VERBOSE_DEBUG.

Yes, but what does it do? Also, it should probably not support non-DM at this point.

> +
>  choice
>         prompt "Console port"
>         default 8xx_CONS_SMC1
> diff --git a/drivers/serial/Makefile b/drivers/serial/Makefile
> index e4a92bbbb7..25f7f8d342 100644
> --- a/drivers/serial/Makefile
> +++ b/drivers/serial/Makefile
> @@ -70,6 +70,7 @@ obj-$(CONFIG_OWL_SERIAL) += serial_owl.o
>  obj-$(CONFIG_OMAP_SERIAL) += serial_omap.o
>  obj-$(CONFIG_MTK_SERIAL) += serial_mtk.o
>  obj-$(CONFIG_SIFIVE_SERIAL) += serial_sifive.o
> +obj-$(CONFIG_XEN_SERIAL) += serial_xen.o
>
>  ifndef CONFIG_SPL_BUILD
>  obj-$(CONFIG_USB_TTY) += usbtty.o
> diff --git a/drivers/serial/serial_xen.c b/drivers/serial/serial_xen.c
> new file mode 100644
> index 0000000000..dcd4b2df79
> --- /dev/null
> +++ b/drivers/serial/serial_xen.c
> @@ -0,0 +1,175 @@
> +/*
> + * SPDX-License-Identifier:    GPL-2.0+
> + *
> + * (C) 2018 NXP
> + * (C) 2020 EPAM Systems Inc.
> + */
> +#include <common.h>
> +#include <cpu_func.h>
> +#include <dm.h>
> +#include <serial.h>
> +#include <watchdog.h>
> +
> +#include <linux/bug.h>
> +
> +#include <xen/hvm.h>
> +#include <xen/events.h>
> +
> +#include <xen/interface/sched.h>
> +#include <xen/interface/hvm/hvm_op.h>
> +#include <xen/interface/hvm/params.h>
> +#include <xen/interface/io/console.h>
> +#include <xen/interface/io/ring.h>
> +
> +DECLARE_GLOBAL_DATA_PTR;
> +
> +u32 console_evtchn;
> +
> +struct xen_uart_priv {
> +       struct xencons_interface *intf;
> +       u32 evtchn;
> +       int vtermno;
> +       struct hvc_struct *hvc;

comment for struct

> +};
> +
> +int xen_serial_setbrg(struct udevice *dev, int baudrate)
> +{
> +       return 0;
> +}
> +
> +static int xen_serial_probe(struct udevice *dev)
> +{
> +       struct xen_uart_priv *priv = dev_get_priv(dev);
> +       u64 v = 0;
> +       unsigned long gfn;
> +       int r;
> +
> +       r = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);

Can you use ret and val instead of single-char var names? It is OK for
loops, but not here.

> +       if (r < 0 || v == 0)
> +               return r;
> +
> +       priv->evtchn = v;
> +       console_evtchn = v;
> +
> +       r = hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v);
> +       if (r < 0 || v == 0)
> +               return -ENODEV;

return r if non-zero

return -EINVAL perhaps or -ENXIO if !v

-ENODEV means there is no device and is reserved for driver model.
Clearly in this case there is a device.
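
A minimal sketch of the return-code handling suggested here, assuming -ENXIO is the value picked for a zero page frame number. get_param_sketch() is a hypothetical stand-in for hvm_get_parameter() that reports canned results so the paths can be exercised outside of Xen; it is not the real hypercall wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Canned result for the hypothetical parameter lookup below. */
struct hvm_param_sketch {
	int status;	/* 0 on success, negative errno on failure */
	uint64_t value;	/* parameter value reported on success */
};

/* Hypothetical stand-in for hvm_get_parameter(). */
int get_param_sketch(const struct hvm_param_sketch *param, uint64_t *val)
{
	assert(param && val);

	if (param->status < 0)
		return param->status;

	*val = param->value;

	return 0;
}

/*
 * Look up the console page frame number.
 *
 * Return: 0 on success, the lookup's own error code if it failed, or
 * -ENXIO if the lookup succeeded but reported frame 0.
 */
int console_pfn_sketch(const struct hvm_param_sketch *param, uint64_t *gfn)
{
	uint64_t val = 0;
	int ret;

	ret = get_param_sketch(param, &val);
	if (ret < 0)
		return ret;	/* pass the real error through */
	if (!val)
		return -ENXIO;	/* device exists, but no console page */

	*gfn = val;

	return 0;
}
```

The point of the split is that a failing lookup keeps its own error code, while -ENODEV stays reserved for driver model.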

> +
> +       gfn = v;
> +
> +       priv->intf = (struct xencons_interface *)(gfn << XEN_PAGE_SHIFT);
> +       if (!priv->intf)

Don't you already check for !v above?

> +               return -EINVAL;

Blank line

> +       return 0;
> +}
> +
> +static int xen_serial_pending(struct udevice *dev, bool input)
> +{
> +       struct xen_uart_priv *priv = dev_get_priv(dev);
> +       struct xencons_interface *intf = priv->intf;
> +
> +       if (!input || intf->in_cons == intf->in_prod)
> +               return 0;

blank line before final return. Please fix globally

> +       return 1;
> +}
> +
> +static int xen_serial_getc(struct udevice *dev)
> +{
> +       struct xen_uart_priv *priv = dev_get_priv(dev);
> +       struct xencons_interface *intf = priv->intf;
> +       XENCONS_RING_IDX cons;
> +       char c;
> +
> +       while (intf->in_cons == intf->in_prod) {
> +               mb(); /* wait */
> +       }

Drop {}. Has this been through patman?

> +
> +       cons = intf->in_cons;
> +       mb();                   /* get pointers before reading ring */
> +
> +       c = intf->in[MASK_XENCONS_IDX(cons++, intf->in)];
> +
> +       mb();                   /* read ring before consuming */
> +       intf->in_cons = cons;
> +
> +       notify_remote_via_evtchn(priv->evtchn);
> +       return c;
> +}
> +
> +static int __write_console(struct udevice *dev, const char *data, int len)
> +{
> +       struct xen_uart_priv *priv = dev_get_priv(dev);
> +       struct xencons_interface *intf = priv->intf;
> +       XENCONS_RING_IDX cons, prod;
> +       int sent = 0;
> +
> +       cons = intf->out_cons;
> +       prod = intf->out_prod;
> +       mb(); /* Update pointer */
> +
> +       WARN_ON((prod - cons) > sizeof(intf->out));
> +
> +       while ((sent < len) && ((prod - cons) < sizeof(intf->out)))
> +               intf->out[MASK_XENCONS_IDX(prod++, intf->out)] = data[sent++];
> +
> +       mb(); /* Update data before pointer */
> +       intf->out_prod = prod;
> +
> +       if (sent)
> +               notify_remote_via_evtchn(priv->evtchn);
> +       return sent;
> +}
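
For readers less familiar with the ring arithmetic in xen_serial_getc() and __write_console() above, here is a single-threaded sketch of free-running producer/consumer indices. It assumes a power-of-two ring size and deliberately leaves out the memory barriers and event-channel notification that the real shared ring needs; all names ending in _sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE_SKETCH 8	/* must be a power of two */
#define MASK_IDX_SKETCH(idx) ((idx) & (RING_SIZE_SKETCH - 1))

struct ring_sketch {
	char buf[RING_SIZE_SKETCH];
	uint32_t cons;	/* free-running consumer index */
	uint32_t prod;	/* free-running producer index */
};

/* Queue as much of data[] as fits; returns the number of bytes queued. */
size_t ring_write_sketch(struct ring_sketch *ring, const char *data,
			 size_t len)
{
	uint32_t cons = ring->cons;
	uint32_t prod = ring->prod;
	size_t sent = 0;

	/* prod - cons is the fill level, even after the indices wrap */
	assert(prod - cons <= RING_SIZE_SKETCH);

	while (sent < len && prod - cons < RING_SIZE_SKETCH)
		ring->buf[MASK_IDX_SKETCH(prod++)] = data[sent++];

	ring->prod = prod;

	return sent;
}

/* Pop one byte; returns -1 if the ring is empty. */
int ring_read_sketch(struct ring_sketch *ring)
{
	char c;

	if (ring->cons == ring->prod)
		return -1;

	c = ring->buf[MASK_IDX_SKETCH(ring->cons)];
	ring->cons++;

	return (unsigned char)c;
}
```

Because the indices are never reduced modulo the ring size, "empty" (prod == cons) and "full" (prod - cons == size) stay distinguishable, and unsigned subtraction keeps the fill level correct when the 32-bit counters wrap.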
> +
> +static int write_console(struct udevice *dev, const char *data, int len)
> +{
> +       /*
> +        * Make sure the whole buffer is emitted, polling if
> +        * necessary.  We don't ever want to rely on the hvc daemon
> +        * because the most interesting console output is when the
> +        * kernel is crippled.
> +        */
> +       while (len) {
> +               int sent = __write_console(dev, data, len);
> +
> +               data += sent;
> +               len -= sent;
> +
> +               if (unlikely(len))
> +                       HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
> +       }
> +       return 0;
> +}
> +
> +static int xen_serial_putc(struct udevice *dev, const char ch)
> +{
> +       write_console(dev, &ch, 1);
> +       return 0;
> +}
> +
> +static const struct dm_serial_ops xen_serial_ops = {
> +       .putc = xen_serial_putc,
> +       .getc = xen_serial_getc,
> +       .pending = xen_serial_pending,
> +};
> +
> +#if CONFIG_IS_ENABLED(OF_CONTROL)
> +static const struct udevice_id xen_serial_ids[] = {
> +       { .compatible = "xen,xen" },
> +       { }
> +};
> +#endif
> +
> +U_BOOT_DRIVER(serial_xen) = {
> +       .name                   = "serial_xen",
> +       .id                     = UCLASS_SERIAL,
> +#if CONFIG_IS_ENABLED(OF_CONTROL)

of_match_ptr() - but I think you can drop this

> +       .of_match               = xen_serial_ids,
> +#endif
> +       .priv_auto_alloc_size   = sizeof(struct xen_uart_priv),
> +       .probe                  = xen_serial_probe,
> +       .ops                    = &xen_serial_ops,
> +#if !CONFIG_IS_ENABLED(OF_CONTROL)

and this?

> +       .flags                  = DM_FLAG_PRE_RELOC,
> +#endif
> +};
> +
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index eddc6b6e29..a1b36a2196 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -25,6 +25,8 @@
>  #include <xen/events.h>
>  #include <xen/hvm.h>
>
> +extern u32 console_evtchn;
> +
>  #define NR_EVS 1024
>
>  /* this represents a event handler. Chaining or sharing is not allowed */
> @@ -47,6 +49,8 @@ void unbind_all_ports(void)
>         struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
>
>         for (i = 0; i < NR_EVS; i++) {
> +               if (i == console_evtchn)
> +                       continue;
>                 if (test_and_clear_bit(i, bound_ports)) {
>                         printf("port %d still bound!\n", i);
>                         unbind_evtchn(i);
> --
> 2.17.1
>

Regards,
Simon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 13/17] xen: pvblock: Enumerate virtual block devices
  2020-07-01 16:29 ` [PATCH 13/17] xen: pvblock: Enumerate virtual block devices Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Enumerate Xen virtual block devices found in XenStore and
> instantiate pvblock devices.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  drivers/xen/pvblock.c | 112 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 110 insertions(+), 2 deletions(-)

Reviewed-by: Simon Glass <sjg@chromium.org>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize
  2020-07-01 16:29 ` [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-06  9:08     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

Hi Anastasiia,

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Read essential virtual block device configuration data from XenStore,
> initialize front ring and event channel.
> Update block device description with actual block size.
>
> Use code for XenStore from mini-os.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  drivers/xen/pvblock.c | 272 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 271 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> index 6ce0ae97c3..9ed18be633 100644
> --- a/drivers/xen/pvblock.c
> +++ b/drivers/xen/pvblock.c
> @@ -1,6 +1,7 @@
>  /*
>   * SPDX-License-Identifier:    GPL-2.0+
>   *
> + * (C) 2007-2008 Samuel Thibault.
>   * (C) Copyright 2020 EPAM Systems Inc.
>   */
>  #include <blk.h>
> @@ -10,26 +11,289 @@
>  #include <malloc.h>
>  #include <part.h>
>
> +#include <asm/armv8/mmu.h>
> +#include <asm/io.h>
> +#include <asm/xen/system.h>
> +
> +#include <linux/compat.h>
> +
> +#include <xen/events.h>
> +#include <xen/gnttab.h>
> +#include <xen/hvm.h>
>  #include <xen/xenbus.h>
>
> +#include <xen/interface/io/ring.h>
> +#include <xen/interface/io/blkif.h>
> +#include <xen/interface/io/protocols.h>
> +
>  #define DRV_NAME       "pvblock"
>  #define DRV_NAME_BLK   "pvblock_blk"
>
> +#define O_RDONLY       00
> +#define O_RDWR         02
> +
> +struct blkfront_info {
> +       u64 sectors;
> +       unsigned int sector_size;
> +       int mode;
> +       int info;
> +       int barrier;
> +       int flush;
> +};
> +
>  struct blkfront_dev {
> -       char dummy;
> +       domid_t dom;
> +
> +       struct blkif_front_ring ring;
> +       grant_ref_t ring_ref;
> +       evtchn_port_t evtchn;
> +       blkif_vdev_t handle;
> +
> +       char *nodename;
> +       char *backend;
> +       struct blkfront_info info;
> +       unsigned int devid;

How about some comments?

Regards,
Simon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO
  2020-07-01 16:29 ` [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-06  9:10     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Implement Xen para-virtual frontend to backend communication
> and actually read/write disk data.
>
> This is based on mini-os implementation of the para-virtual block
> frontend driver.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  drivers/xen/events.c  |   2 +-
>  drivers/xen/pvblock.c | 311 ++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 301 insertions(+), 12 deletions(-)

Please can you comment structs and non-trivial functions?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 16/17] xen: pvblock: Print found devices indices
  2020-07-01 16:29 ` [PATCH 16/17] xen: pvblock: Print found devices indices Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  drivers/xen/pvblock.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>

Reviewed-by: Simon Glass <sjg@chromium.org>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 17/17] board: xen: De-initialize before jumping to Linux
  2020-07-01 16:29 ` [PATCH 17/17] board: xen: De-initialize before jumping to Linux Anastasiia Lukianenko
@ 2020-07-03  3:50   ` Simon Glass
  2020-07-06  9:13     ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Glass @ 2020-07-03  3:50 UTC (permalink / raw)
  To: u-boot

On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
>
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Free resources used by Xen board before jumping to Linux kernel.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> ---
>  board/xen/xenguest_arm64/xenguest_arm64.c | 6 ++++++
>  drivers/xen/hypervisor.c                  | 8 ++++++++
>  include/xen.h                             | 1 +
>  3 files changed, 15 insertions(+)

Reviewed-by: Simon Glass <sjg@chromium.org>



>
> diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c b/board/xen/xenguest_arm64/xenguest_arm64.c
> index b4e1650f99..76a18bea8b 100644
> --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> @@ -13,6 +13,7 @@
>  #include <dm.h>
>  #include <errno.h>
>  #include <malloc.h>
> +#include <xen.h>
>
>  #include <asm/io.h>
>  #include <asm/armv8/mmu.h>
> @@ -195,3 +196,8 @@ int print_cpuinfo(void)
>         return 0;
>  }
>
> +void board_cleanup_before_linux(void)
> +{
> +       xen_fini();
> +}
> +
> diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> index f3c2504d72..8d7d320839 100644
> --- a/drivers/xen/hypervisor.c
> +++ b/drivers/xen/hypervisor.c
> @@ -279,3 +279,11 @@ void xen_init(void)
>         init_gnttab();
>  }
>
> +void xen_fini(void)
> +{
> +       debug("%s\n", __func__);
> +
> +       fini_gnttab();
> +       fini_xenbus();
> +       fini_events();
> +}
> diff --git a/include/xen.h b/include/xen.h
> index 1d6f74cc92..327d7e132b 100644
> --- a/include/xen.h
> +++ b/include/xen.h
> @@ -7,5 +7,6 @@
>  #define __XEN_H__
>
>  void xen_init(void);
> +void xen_fini(void);

Comment? What does this do?


>
>  #endif /* __XEN_H__ */
> --
> 2.17.1
>

- Simon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies
  2020-07-02  1:14   ` Peng Fan
@ 2020-07-03  9:57     ` Nastya Vicodin
  0 siblings, 0 replies; 57+ messages in thread
From: Nastya Vicodin @ 2020-07-03  9:57 UTC (permalink / raw)
  To: u-boot

On Thu, Jul 2, 2020 at 4:14 AM Peng Fan <peng.fan@nxp.com> wrote:
>
> > Subject: [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies
> >
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >
> > Currently SMCC selects ARM_PSCI_FW if enabled which is not correct as
> > there are cases that PSCI can function without firmware at all.
> > ARM_PSCI_FW itself is built with driver model approach, so it cannot be
> > enabled if DM is off.
> > Fix this by making PSCI reset functionality depend on ARM_PSCI_FW and only
> > in case if DM is enabled.
>
> I think this might break others, see drivers/firmware/psci.c
>
> Regards,
> Peng.

Well, the only reason we had this patch was a problem with board support
when CONFIG_DM is off: we tried to add early console support without the
driver model.

But it seems this is not needed at all now, so this patch can be dropped
without causing any harm. I'll also enable CONFIG_DM from the very start, as
all the drivers we are adding will use it anyway.

But, IMO, CONFIG_ARM_PSCI_FW support is still broken in that SMCCC can
function without PSCI_FW, yet PSCI_FW is made a hard requirement. Moreover,
it requires DM because of the PSCI driver, which detects whether we are about
to use SMC or HVC.

Regards,
Anastasiia

>
>
> >
> > Signed-off-by: Oleksandr Andrushchenko
> > <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > Suggested-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > ---
> >  arch/arm/Kconfig           | 1 -
> >  arch/arm/cpu/armv8/Kconfig | 2 ++
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 54d65f8488..e9ad716aaa 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -387,7 +387,6 @@ config SYS_ARCH_TIMER
> >  config ARM_SMCCC
> >       bool "Support for ARM SMC Calling Convention (SMCCC)"
> >       depends on CPU_V7A || ARM64
> > -     select ARM_PSCI_FW
> >       help
> >         Say Y here if you want to enable ARM SMC Calling Convention.
> >         This should be enabled if U-Boot needs to communicate with system
> > diff --git a/arch/arm/cpu/armv8/Kconfig b/arch/arm/cpu/armv8/Kconfig
> > index 3655990772..c8727f4175 100644
> > --- a/arch/arm/cpu/armv8/Kconfig
> > +++ b/arch/arm/cpu/armv8/Kconfig
> > @@ -103,6 +103,8 @@ config PSCI_RESET
> >       bool "Use PSCI for reset and shutdown"
> >       default y
> >       select ARM_SMCCC if OF_CONTROL
> > +     select ARM_PSCI_FW if DM
> > +
> >       depends on !ARCH_EXYNOS7 && !ARCH_BCM283X && \
> >                  !TARGET_LS2080A_SIMU && !TARGET_LS2080AQDS && \
> >                  !TARGET_LS2080ARDB && !TARGET_LS2080A_EMU && \
> > --
> > 2.17.1

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os
  2020-07-01 17:46   ` Julien Grall
@ 2020-07-03 12:21     ` Anastasiia Lukianenko
  2020-07-03 13:38       ` Julien Grall
  2020-07-16 13:16     ` Anastasiia Lukianenko
  1 sibling, 1 reply; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 12:21 UTC (permalink / raw)
  To: u-boot

Hi Julien,

On Wed, 2020-07-01 at 18:46 +0100, Julien Grall wrote:
> Title: s/hypervizor/hypervisor/

Thank you for pointing this out :) I will fix it in the next version.

> 
> On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Port hypervizor related code from mini-os. Update essential
> 
> Ditto.
> 
> But I would be quite cautious to import code from mini-OS in order
> to 
> support Arm. The port has always been broken and from a look below
> needs 
> to be refined for Arm.

We based this on the Mini-OS code from [1] by Huang Shijie and
Volodymyr Babchuk, which targets ARM64, so we hope this part is OK.

[1] https://github.com/zyzii/mini-os.git

> 
> > arch code to support required bit operations, memory barriers etc.
> > 
> > Copyright for the bits ported belong to at least the following
> > authors,
> > please see related files for details:
> > 
> > Copyright (c) 2002-2003, K A Fraser
> > Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research
> > Cambridge
> > Copyright (c) 2014, Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> > 
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >   arch/arm/include/asm/xen/system.h |  96 +++++++++++
> >   common/board_r.c                  |  11 ++
> >   drivers/Makefile                  |   1 +
> >   drivers/xen/Makefile              |   5 +
> >   drivers/xen/hypervisor.c          | 277
> > ++++++++++++++++++++++++++++++
> >   include/xen.h                     |  11 ++
> >   include/xen/hvm.h                 |  30 ++++
> >   7 files changed, 431 insertions(+)
> >   create mode 100644 arch/arm/include/asm/xen/system.h
> >   create mode 100644 drivers/xen/Makefile
> >   create mode 100644 drivers/xen/hypervisor.c
> >   create mode 100644 include/xen.h
> >   create mode 100644 include/xen/hvm.h
> > 
> > diff --git a/arch/arm/include/asm/xen/system.h
> > b/arch/arm/include/asm/xen/system.h
> > new file mode 100644
> > index 0000000000..81ab90160e
> > --- /dev/null
> > +++ b/arch/arm/include/asm/xen/system.h
> > @@ -0,0 +1,96 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * (C) 2014 Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> > + * (C) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef _ASM_ARM_XEN_SYSTEM_H
> > +#define _ASM_ARM_XEN_SYSTEM_H
> > +
> > +#include <compiler.h>
> > +#include <asm/bitops.h>
> > +
> > +/* If *ptr == old, then store new there (and return new).
> > + * Otherwise, return the old value.
> > + * Atomic.
> > + */
> > +#define synch_cmpxchg(ptr, old, new) \
> > +({ __typeof__(*ptr) stored = old; \
> > +	__atomic_compare_exchange_n(ptr, &stored, new, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? new : old; \
> > +})
> > +
> > +/* As test_and_clear_bit, but using __ATOMIC_SEQ_CST */
> > +static inline int synch_test_and_clear_bit(int nr, volatile void
> > *addr)
> > +{
> > +	u8 *byte = ((u8 *)addr) + (nr >> 3);
> > +	u8 bit = 1 << (nr & 7);
> > +	u8 orig;
> > +
> > +	orig = __atomic_fetch_and(byte, ~bit, __ATOMIC_SEQ_CST);
> > +
> > +	return (orig & bit) != 0;
> > +}
> > +
> > +/* As test_and_set_bit, but using __ATOMIC_SEQ_CST */
> > +static inline int synch_test_and_set_bit(int nr, volatile void
> > *base)
> > +{
> > +	u8 *byte = ((u8 *)base) + (nr >> 3);
> > +	u8 bit = 1 << (nr & 7);
> > +	u8 orig;
> > +
> > +	orig = __atomic_fetch_or(byte, bit, __ATOMIC_SEQ_CST);
> > +
> > +	return (orig & bit) != 0;
> > +}
> > +
> > +/* As set_bit, but using __ATOMIC_SEQ_CST */
> > +static inline void synch_set_bit(int nr, volatile void *addr)
> > +{
> > +	synch_test_and_set_bit(nr, addr);
> > +}
> > +
> > +/* As clear_bit, but using __ATOMIC_SEQ_CST */
> > +static inline void synch_clear_bit(int nr, volatile void *addr)
> > +{
> > +	synch_test_and_clear_bit(nr, addr);
> > +}
> > +
> > +/* As test_bit, but with a following memory barrier. */
> > +//static inline int synch_test_bit(int nr, volatile void *addr)
> > +static inline int synch_test_bit(int nr, const void *addr)
> > +{
> > +	int result;
> > +
> > +	result = test_bit(nr, addr);
> > +	barrier();
> > +	return result;
> > +}
> 
> I can understand why we implement sync_* helpers as AFAICT the
> generic 
> helpers are not SMP safe. However...
> 
> > +
> > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v,
> > __ATOMIC_SEQ_CST)
> > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v,
> > __ATOMIC_SEQ_CST)
> > +
> > +#define mb()		dsb()
> > +#define rmb()		dsb()
> > +#define wmb()		dsb()
> > +#define __iormb()	dmb()
> > +#define __iowmb()	dmb()
> 
> Why do you need to re-implement the barriers?

Indeed, we do not need to do this.
I will fix it in the next version.

> 
> > +#define xen_mb()	mb()
> > +#define xen_rmb()	rmb()
> > +#define xen_wmb()	wmb()
> > +
> > +#define smp_processor_id()	0
> 
> Shouldn't this be common?

Currently it is only used by Xen and we are not sure if
any other entity will use it, but we can put that into
arch/arm/include/asm/io.h

> 
> > +
> > +#define to_phys(x)		((unsigned long)(x))
> > +#define to_virt(x)		((void *)(x))
> > +
> > +#define PFN_UP(x)		(unsigned long)(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
> > +#define PFN_DOWN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> > +#define PFN_PHYS(x)		((unsigned long)(x) << PAGE_SHIFT)
> > +#define PHYS_PFN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> > +
> > +#define virt_to_pfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> > +#define virt_to_mfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> > +#define mfn_to_virt(_mfn)	(to_virt(PFN_PHYS(_mfn)))
> > +#define pfn_to_virt(_pfn)	(to_virt(PFN_PHYS(_pfn)))
> 
> There is already generic phys <-> virt helpers (see 
> include/asm-generic/io.h). So why do you need to create a new
> version?

Indeed, we do not need to do this.
I will fix it in the next version.
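
For reference, the page-frame arithmetic behind the quoted PFN_* macros can be
sketched standalone as below. This is only an illustration: the demo_ names
and the 4 KiB page size (a shift of 12) are assumptions made for the example,
not values taken from the patch or from the generic helpers.

```c
#include <assert.h>
#include <limits.h>

/* Assumed for illustration only: 4 KiB pages. */
#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)

/* Number of page frames needed to cover x bytes (rounds up). */
static unsigned long demo_pfn_up(unsigned long x)
{
	/* The macro form would silently wrap here; make that explicit. */
	assert(x <= ULONG_MAX - (DEMO_PAGE_SIZE - 1));
	return (x + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* Number of the page frame that contains byte address x (rounds down). */
static unsigned long demo_pfn_down(unsigned long x)
{
	return x >> DEMO_PAGE_SHIFT;
}

/* Byte address at which page frame pfn starts. */
static unsigned long demo_pfn_phys(unsigned long pfn)
{
	return pfn << DEMO_PAGE_SHIFT;
}
```

With an identity mapping, as the quoted to_phys()/to_virt() macros assume,
virt_to_pfn() reduces to the round-down case applied to the pointer value.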

> 
> > +
> > +#endif
> > diff --git a/common/board_r.c b/common/board_r.c
> > index fa57fa9b69..fd36edb4e5 100644
> > --- a/common/board_r.c
> > +++ b/common/board_r.c
> > @@ -56,6 +56,7 @@
> >   #include <timer.h>
> >   #include <trace.h>
> >   #include <watchdog.h>
> > +#include <xen.h>
> 
> Do we want to include it for other boards?

For now, we do not have a plan and resources to support
anything other than what we need. Therefore only ARM64.

> 
> >   #ifdef CONFIG_ADDR_MAP
> >   #include <asm/mmu.h>
> >   #endif
> > @@ -462,6 +463,13 @@ static int initr_mmc(void)
> >   }
> >   #endif
> >   
> > +#ifdef CONFIG_XEN
> > +static int initr_xen(void)
> > +{
> > +	xen_init();
> > +	return 0;
> > +}
> > +#endif
> >   /*
> >    * Tell if it's OK to load the environment early in boot.
> >    *
> > @@ -769,6 +777,9 @@ static init_fnc_t init_sequence_r[] = {
> >   #endif
> >   #ifdef CONFIG_MMC
> >   	initr_mmc,
> > +#endif
> > +#ifdef CONFIG_XEN
> > +	initr_xen,
> >   #endif
> >   	initr_env,
> >   #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> > diff --git a/drivers/Makefile b/drivers/Makefile
> > index 94e8c5da17..0dd8891e76 100644
> > --- a/drivers/Makefile
> > +++ b/drivers/Makefile
> > @@ -28,6 +28,7 @@ obj-$(CONFIG_$(SPL_)REMOTEPROC) += remoteproc/
> >   obj-$(CONFIG_$(SPL_TPL_)TPM) += tpm/
> >   obj-$(CONFIG_$(SPL_TPL_)ACPI_PMC) += power/acpi_pmc/
> >   obj-$(CONFIG_$(SPL_)BOARD) += board/
> > +obj-$(CONFIG_XEN) += xen/
> >   
> >   ifndef CONFIG_TPL_BUILD
> >   ifdef CONFIG_SPL_BUILD
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > new file mode 100644
> > index 0000000000..1211bf2386
> > --- /dev/null
> > +++ b/drivers/xen/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier:	GPL-2.0+
> > +#
> > +# (C) Copyright 2020 EPAM Systems Inc.
> > +
> > +obj-y += hypervisor.o
> > diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> > new file mode 100644
> > index 0000000000..5883285142
> > --- /dev/null
> > +++ b/drivers/xen/hypervisor.c
> > @@ -0,0 +1,277 @@
> > +/*****************************************************************
> > *************
> > + * hypervisor.c
> > + *
> > + * Communication to/from hypervisor.
> > + *
> > + * Copyright (c) 2002-2003, K A Fraser
> > + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel
> > Research Cambridge
> > + * Copyright (c) 2020, EPAM Systems Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
> > OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +#include <common.h>
> > +#include <cpu_func.h>
> > +#include <log.h>
> > +#include <memalign.h>
> > +
> > +#include <asm/io.h>
> > +#include <asm/armv8/mmu.h>
> > +#include <asm/xen/system.h>
> > +
> > +#include <linux/bug.h>
> > +
> > +#include <xen/hvm.h>
> > +#include <xen/interface/memory.h>
> > +
> > +#define active_evtchns(cpu, sh, idx)	\
> > +	((sh)->evtchn_pending[idx] &	\
> > +	 ~(sh)->evtchn_mask[idx])
> > +
> > +int in_callback;
> > +
> > +/*
> > + * Shared page for communicating with the hypervisor.
> > + * Events flags go here, for example.
> > + */
> > +struct shared_info *HYPERVISOR_shared_info;
> > +
> > +#ifndef CONFIG_PARAVIRT
> 
> Is there any plan to support this on x86?

For now, we do not have a plan and resources to support
anything other than what we need. Therefore only ARM64.

> 
> > +static const char *param_name(int op)
> > +{
> > +#define PARAM(x)[HVM_PARAM_##x] = #x
> > +	static const char *const names[] = {
> > +		PARAM(CALLBACK_IRQ),
> > +		PARAM(STORE_PFN),
> > +		PARAM(STORE_EVTCHN),
> > +		PARAM(PAE_ENABLED),
> > +		PARAM(IOREQ_PFN),
> > +		PARAM(TIMER_MODE),
> > +		PARAM(HPET_ENABLED),
> > +		PARAM(IDENT_PT),
> > +		PARAM(ACPI_S_STATE),
> > +		PARAM(VM86_TSS),
> > +		PARAM(VPT_ALIGN),
> > +		PARAM(CONSOLE_PFN),
> > +		PARAM(CONSOLE_EVTCHN),
> 
> Most of those parameters are never going to be used on Arm. So could 
> this be clobberred?
> 
> > +	};
> > +#undef PARAM
> > +
> > +	if (op >= ARRAY_SIZE(names))
> > +		return "unknown";
> > +
> > +	if (!names[op])
> > +		return "reserved";
> > +
> > +	return names[op];
> > +}
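
On the question of trimming the table: a minimal sketch of what a reduced
lookup might look like, assuming only the XenStore and console parameters are
of interest on Arm. The demo_ names and the numeric indices are illustrative
assumptions; real code would keep using the HVM_PARAM_* constants from the
interface headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative indices only, standing in for the HVM_PARAM_* constants. */
enum {
	DEMO_PARAM_STORE_PFN = 1,
	DEMO_PARAM_STORE_EVTCHN = 2,
	DEMO_PARAM_CONSOLE_PFN = 17,
	DEMO_PARAM_CONSOLE_EVTCHN = 18,
};

#define DEMO_ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

static const char *demo_param_name(int op)
{
#define DEMO_PARAM(x)	[DEMO_PARAM_##x] = #x
	static const char *const names[] = {
		DEMO_PARAM(STORE_PFN),
		DEMO_PARAM(STORE_EVTCHN),
		DEMO_PARAM(CONSOLE_PFN),
		DEMO_PARAM(CONSOLE_EVTCHN),
	};
#undef DEMO_PARAM

	/* Reject negative and too-large indices explicitly. */
	if (op < 0 || (size_t)op >= DEMO_ARRAY_SIZE(names))
		return "unknown";

	/* Slots skipped by the designated initialisers are NULL. */
	if (!names[op])
		return "reserved";

	return names[op];
}
```

The explicit `op < 0` check is a small deviation from the quoted code, which
relies on the signed-to-unsigned conversion in the comparison to catch
negative values.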
> > +
> > +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value)
> 
> I would recommend to add some comments explaining when this function
> is 
> meant to be used and what it is doing in regards of the cache.

Thank you for the recommendation. I will add comments about this function
in the next version.

> 
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> 
> I don't think there is a guarantee that your cache is going to be
> clean 
> when writing xhv. So you likely want to add a
> invalidate_dcache_range() 
> before writing it.

Thank you for the advice.
Ah, so we need something like:

...
invalidate_dcache_range((unsigned long)&xhv,
			(unsigned long)&xhv + sizeof(xhv));
xhv.domid = DOMID_SELF;
xhv.index = idx;
invalidate_dcache_range((unsigned long)&xhv,
			(unsigned long)&xhv + sizeof(xhv));
...

> 
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	invalidate_dcache_range((unsigned long)&xhv,
> > +				(unsigned long)&xhv + sizeof(xhv));
> > +
> > +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> > +	if (ret < 0) {
> > +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +	invalidate_dcache_range((unsigned long)&xhv,
> > +				(unsigned long)&xhv + sizeof(xhv));
> > +
> > +	*value = xhv.value;
> > +	return ret;
> > +}
> > +
> > +int hvm_get_parameter(int idx, uint64_t *value)
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> > +	if (ret < 0) {
> > +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +
> > +	*value = xhv.value;
> > +	return ret;
> > +}
> > +
> > +int hvm_set_parameter(int idx, uint64_t value)
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	xhv.value = value;
> > +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &xhv);
> > +
> > +	if (ret < 0) {
> > +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +struct shared_info *map_shared_info(void *p)
> > +{
> > +	struct xen_add_to_physmap xatp;
> > +
> > +	HYPERVISOR_shared_info = (struct shared_info *)memalign(PAGE_SIZE,
> > +								PAGE_SIZE);
> > +	if (HYPERVISOR_shared_info == NULL)
> > +		BUG();
> > +
> > +	xatp.domid = DOMID_SELF;
> > +	xatp.idx = 0;
> > +	xatp.space = XENMAPSPACE_shared_info;
> > +	xatp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> > +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp) != 0)
> > +		BUG();
> > +
> > +	return HYPERVISOR_shared_info;
> > +}
> > +
> > +void unmap_shared_info(void)
> > +{
> > +	struct xen_remove_from_physmap xrtp;
> > +
> > +	xrtp.domid = DOMID_SELF;
> > +	xrtp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> > +	if (HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp) !=
> > 0)
> > +		BUG();
> > +}
> > +#endif
> > +
> > +void do_hypervisor_callback(struct pt_regs *regs)
> > +{
> > +	unsigned long l1, l2, l1i, l2i;
> > +	unsigned int port;
> > +	int cpu = 0;
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
> > +
> > +	in_callback = 1;
> > +
> > +	vcpu_info->evtchn_upcall_pending = 0;
> > +	/* NB x86. No need for a barrier here -- XCHG is a barrier on
> > x86. */
> > +#if !defined(__i386__) && !defined(__x86_64__)
> > +	/* Clear master flag /before/ clearing selector flag. */
> > +	wmb();
> > +#endif
> > +	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
> > +
> > +	while (l1 != 0) {
> > +		l1i = __ffs(l1);
> > +		l1 &= ~(1UL << l1i);
> > +
> > +		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
> > +			l2i = __ffs(l2);
> > +			l2 &= ~(1UL << l2i);
> > +
> > +			port = (l1i * (sizeof(unsigned long) * 8)) +
> > l2i;
> > +			/* TODO: handle new event: do_event(port,
> > regs); */
> > +			/* Suppress -Wunused-but-set-variable */
> > +			(void)(port);
> > +		}
> > +	}
> 
> You likely want a memory barrier here as otherwise in_callback could
> be 
> written/seen before the loop end.
> 

We are not running in a multi-threaded environment, so in_callback should
probably be fine as is? Or it can be removed completely, as there are
currently no users of it.
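
For context, the two-level scan in the quoted do_hypervisor_callback() can be
sketched standalone as follows. This is a single-threaded illustration under
stated assumptions: struct demo_evtchn_state and the demo_ helpers are
hypothetical stand-ins for the shared_info fields, and plain loads and stores
replace the xchg() and barriers of the real code. Note that the inner loop
re-reads the pending word on every pass, so whatever handles an event has to
clear its pending bit; the sketch does that explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * 8)
#define DEMO_NR_WORDS		4

/* Hypothetical stand-in for the selector, pending and mask words. */
struct demo_evtchn_state {
	unsigned long pending_sel;	/* one bit per pending[] word */
	unsigned long pending[DEMO_NR_WORDS];
	unsigned long mask[DEMO_NR_WORDS];
};

/* Index of the lowest set bit; the caller guarantees word != 0. */
static unsigned int demo_ffs(unsigned long word)
{
	unsigned int i = 0;

	assert(word != 0);
	while (!(word & 1UL)) {
		word >>= 1;
		i++;
	}
	return i;
}

/*
 * Store every pending, unmasked port in ports[] (at most max entries)
 * and return the number of ports found.
 */
static size_t demo_scan_events(struct demo_evtchn_state *s,
			       unsigned int *ports, size_t max)
{
	unsigned long l1 = s->pending_sel;
	size_t n = 0;

	s->pending_sel = 0;
	while (l1 != 0) {
		unsigned int l1i = demo_ffs(l1);
		unsigned long l2;

		l1 &= ~(1UL << l1i);
		if (l1i >= DEMO_NR_WORDS)
			continue;

		while ((l2 = s->pending[l1i] & ~s->mask[l1i]) != 0) {
			unsigned int l2i = demo_ffs(l2);

			/* Acknowledge the event so the inner loop terminates. */
			s->pending[l1i] &= ~(1UL << l2i);
			if (n < max)
				ports[n] = (unsigned int)(l1i * DEMO_BITS_PER_LONG) + l2i;
			n++;
		}
	}
	return n;
}
```

Masked ports stay pending and are simply skipped, which is why unmask_evtchn()
has to re-check the pending bit afterwards.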

> > +
> > +	in_callback = 0;
> > +}
> > +
> > +void force_evtchn_callback(void)
> > +{
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +	int save;
> > +#endif
> > +	struct vcpu_info *vcpu;
> > +
> > +	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
> 
> On Arm, this is only valid for vCPU0. For all the other vCPUs, you
> will 
> want to register a vCPU shared info.
> 

According to Mini-OS this is also expected for x86 [1], as both ARM and
x86 define smp_processor_id() as 0. Do you expect any issue with
that?

[1] 
http://xenbits.xenproject.org/gitweb/?p=mini-os.git;a=blob;f=include/x86/os.h;h=a73b63e5e4e0f4b7fa7ca944739f2c3b8a956833;hb=HEAD#l10

> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +	save = vcpu->evtchn_upcall_mask;
> > +#endif
> > +
> > +	while (vcpu->evtchn_upcall_pending) {
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		vcpu->evtchn_upcall_mask = 1;
> > +#endif
> > +		barrier();
> 
> What are you trying to prevent with this barrier? In particular why 
> would the compiler be an issue but not the processor?

This is the original code from Mini-OS, and it seems that the barriers
are leftovers from some old code. We do not define
XEN_HAVE_PV_UPCALL_MASK, so this function can be trimmed considerably,
with the barriers removed completely.
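
To illustrate the distinction behind the question, here is a minimal sketch of
a compiler-only barrier next to a full fence. It assumes GCC or Clang (for the
inline asm and the __atomic builtins); the demo_ names are hypothetical and
not part of the patch.

```c
#include <assert.h>

/*
 * Compiler-only barrier: prevents the compiler from moving memory accesses
 * across this point, but emits no instruction, so the CPU may still reorder.
 */
#define demo_compiler_barrier()	__asm__ __volatile__("" ::: "memory")

/* Full fence: also orders the accesses as observed by other CPUs. */
#define demo_full_barrier()	__atomic_thread_fence(__ATOMIC_SEQ_CST)

static int demo_payload;
static int demo_ready;

/* Publish a value: the payload must be visible before the flag is. */
static void demo_publish(int value)
{
	demo_payload = value;
	demo_full_barrier();
	__atomic_store_n(&demo_ready, 1, __ATOMIC_RELAXED);
}

/* Read the value back; returns -1 if nothing has been published yet. */
static int demo_consume(void)
{
	if (!__atomic_load_n(&demo_ready, __ATOMIC_RELAXED))
		return -1;
	demo_full_barrier();
	return demo_payload;
}
```

If the other side of the shared data is another vCPU or the hypervisor,
swapping demo_full_barrier() for demo_compiler_barrier() would leave the
hardware free to reorder the two stores.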

> 
> > +		do_hypervisor_callback(NULL);
> > +		barrier();
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		vcpu->evtchn_upcall_mask = save;
> > +		barrier();
> 
> Same here.

Same as above.

> 
> > +#endif
> > +	};
> > +}
> > +
> > +void mask_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	synch_set_bit(port, &s->evtchn_mask[0]);
> > +}
> > +
> > +void unmask_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
> > +
> > +	synch_clear_bit(port, &s->evtchn_mask[0]);
> > +
> > +	/*
> > +	 * The following is basically the equivalent of
> > 'hw_resend_irq'. Just like
> > +	 * a real IO-APIC we 'lose the interrupt edge' if the channel
> > is masked.
> > +	 */
> 
> This seems to be out-of-context now, you might want to update it.

I am not sure I understand this correctly.
Could you please clarify what you mean by "update"?

> 
> > +	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
> > +	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
> > +				    &vcpu_info->evtchn_pending_sel)) {
> > +		vcpu_info->evtchn_upcall_pending = 1;
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		if (!vcpu_info->evtchn_upcall_mask)
> > +#endif
> > +			force_evtchn_callback();
> > +	}
> > +}
> > +
> > +void clear_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +
> > +	synch_clear_bit(port, &s->evtchn_pending[0]);
> > +}
> > +
> > +void xen_init(void)
> > +{
> > +	debug("%s\n", __func__);
> 
> Is this a left-over?

I think this is a useful message for debugging purposes.
But we do not mind removing it if it seems superfluous.

> 
> > +
> > +	map_shared_info(NULL);
> > +}
> > +
> > diff --git a/include/xen.h b/include/xen.h
> > new file mode 100644
> > index 0000000000..1d6f74cc92
> > --- /dev/null
> > +++ b/include/xen.h
> > @@ -0,0 +1,11 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * (C) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef __XEN_H__
> > +#define __XEN_H__
> > +
> > +void xen_init(void);
> > +
> > +#endif /* __XEN_H__ */
> > diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> > new file mode 100644
> > index 0000000000..89de9625ca
> > --- /dev/null
> > +++ b/include/xen/hvm.h
> > @@ -0,0 +1,30 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * Simple wrappers around HVM functions
> > + *
> > + * Copyright (c) 2002-2003, K A Fraser
> > + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel
> > Research Cambridge
> > + * Copyright (c) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef XEN_HVM_H__
> > +#define XEN_HVM_H__
> > +
> > +#include <asm/xen/hypercall.h>
> > +#include <xen/interface/hvm/params.h>
> > +#include <xen/interface/xen.h>
> > +
> > +extern struct shared_info *HYPERVISOR_shared_info;
> > +
> > +int hvm_get_parameter(int idx, uint64_t *value);
> > +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value);
> > +int hvm_set_parameter(int idx, uint64_t value);
> > +
> > +struct shared_info *map_shared_info(void *p);
> > +void unmap_shared_info(void);
> > +void do_hypervisor_callback(struct pt_regs *regs);
> > +void mask_evtchn(uint32_t port);
> > +void unmask_evtchn(uint32_t port);
> > +void clear_evtchn(uint32_t port);
> > +
> > +#endif /* XEN_HVM_H__ */
> 
> Cheers,
> 
> 

Regards,
Anastasiia

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 06/17] xen: Port Xen event channel driver from mini-os
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-03 12:34     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 12:34 UTC (permalink / raw)
  To: u-boot

Hello Simon,

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> Hi,
> 
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <
> vicooodin at gmail.com> wrote:
> > 
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Make required updates to run on u-boot. Strip functionality
> > not needed by U-boot.
> > 
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >  drivers/xen/Makefile     |   1 +
> >  drivers/xen/events.c     | 177
> > +++++++++++++++++++++++++++++++++++++++
> >  drivers/xen/hypervisor.c |   6 +-
> >  include/xen/events.h     |  47 +++++++++++
> >  4 files changed, 228 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/xen/events.c
> >  create mode 100644 include/xen/events.h
> > 
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 1211bf2386..0ad35edefb 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -3,3 +3,4 @@
> >  # (C) Copyright 2020 EPAM Systems Inc.
> > 
> >  obj-y += hypervisor.o
> > +obj-y += events.o
> > diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> > new file mode 100644
> > index 0000000000..eddc6b6e29
> > --- /dev/null
> > +++ b/drivers/xen/events.c
> > @@ -0,0 +1,177 @@
> > +/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
> 
> SPDX is needed on files

Ok, will add.

> 
> > +
> > *******************************************************************
> > *********
> > + * (C) 2003 - Rolf Neugebauer - Intel Research Cambridge
> > + * (C) 2005 - Grzegorz Milos - Intel Research Cambridge
> > + * (C) 2020 - EPAM Systems Inc.
> > +
> > *******************************************************************
> > *********
> > + *
> > + *             File: events.c
> > + *       Author: Rolf Neugebauer (neugebar at dcs.gla.ac.uk)
> > + *      Changes: Grzegorz Milos (gm281 at cam.ac.uk)
> > + *
> > + *             Date: Jul 2003, changes Jun 2005
> > + *
> > + * Environment: Xen Minimal OS
> > + * Description: Deals with events received on event channels
> > + *
> > +
> > *******************************************************************
> > *********
> 
> Can you drop these stars and use the normal U-Boot format?

Ok, will update all the files.

> 
> > + */
> > +#include <common.h>
> > +#include <log.h>
> > +
> > +#include <asm/io.h>
> > +#include <asm/xen/system.h>
> > +
> > +#include <xen/events.h>
> > +#include <xen/hvm.h>
> > +
> > +#define NR_EVS 1024
> > +
> > +/* this represents a event handler. Chaining or sharing is not
> > allowed */
> > +typedef struct _ev_action_t {
> 
> Please don't use typedefs.

Ok.

> 
> Also there should be comments on functions, particularly those in the
> header file.

Ok, will add comments in next version.

> 
> Are you trying to keep the source similar to an upstream version?

It seems that after this review we are stepping away from Mini-OS
anyway (like removing x86 code etc), so it is ok not to keep the source
close to the original code.

> 
> Regards,
> SImon

Regards,
Anastasiia

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 02/17] Kconfig: Introduce CONFIG_XEN
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-03 12:42     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 12:42 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> Hi Anastasiia,
> 
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <
> vicooodin at gmail.com> wrote:
> > 
> > From: Peng Fan <peng.fan@nxp.com>
> > 
> > Introduce CONFIG_XEN to make U-Boot could be used as bootloader
> > for a virtual machine.
> > 
> > Without bootloader, we could successfully boot up android on XEN,
> > but
> > we need need bootloader to support A/B, dm verify and etc.
> > 
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >  Kconfig | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/Kconfig b/Kconfig
> > index 8f3fba085a..67f773d3a6 100644
> > --- a/Kconfig
> > +++ b/Kconfig
> > @@ -69,6 +69,13 @@ config CC_COVERAGE
> >           Enabling this option will pass "--coverage" to gcc to
> > compile
> >           and link code instrumented for coverage analysis.
> > 
> > +config XEN
> > +       bool "Select U-Boot be run as a bootloader for XEN Virtual
> > Machine"
> > +       default n
> 
> Not needed

Ok, will remove.

> 
> > +       help
> > +         Enabling this option will make U-Boot be run as a
> > bootloader
> > +         for XEN Virtual Machine.
> 
> Can you please add a few more details. What is XEN? URL? Also what
> does this actually do? Add some features to talk to XEN? It really
> needs more info and perhaps a pointer to some docs.

I have a number of links and explanations in the cover letter,
so I'll paste some of that here in the next version.

> 
> > +
> >  config DISTRO_DEFAULTS
> >         bool "Select defaults suitable for booting general purpose
> > Linux distributions"
> >         select AUTO_COMPLETE
> > --
> > 2.17.1
> > 
> 
> Regards,
> Simon

Regards,
Anastasiia

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 04/17] xen: Add essential and required interface headers
  2020-07-02  1:30   ` Peng Fan
@ 2020-07-03 12:46     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 12:46 UTC (permalink / raw)
  To: u-boot

Hi Peng,

On Thu, 2020-07-02 at 01:30 +0000, Peng Fan wrote:
> > Subject: [PATCH 04/17] xen: Add essential and required interface
> > headers
> > 
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Add essential and required Xen interface headers only taken from
> > the stable Linux kernel stable/linux-5.7.y at commit
> > 66dfe45221605e11f38a0bf5eb2ee808cea7cfe7.
> 
> Please use commit <12+> ("commit header")

Ok, will fix it in the next version.

> 
> > 
> > These are better suited for U-boot than the original headers
> > from Xen as they are the stripped versions of the same.
> > 
> > At the same time use public protocols from Xen RELEASE-4.13.1, at
> > commit 6278553325a9f76d37811923221b21db3882e017
> 
> Please use commit <12+> ("commit header")

Ok, will fix it in the next version.

> 
> Then:
> 
> Acked-by: Peng Fan <peng.fan@nxp.com>

Regards,
Anastasiia

> 
> > as those have more comments in them.
> > 
> > Signed-off-by: Oleksandr Andrushchenko
> > <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >  include/xen/arm/interface.h           |  88 ++++
> >  include/xen/interface/event_channel.h | 281 ++++++++++
> >  include/xen/interface/grant_table.h   | 582 +++++++++++++++++++++
> >  include/xen/interface/hvm/hvm_op.h    |  69 +++
> >  include/xen/interface/hvm/params.h    | 127 +++++
> >  include/xen/interface/io/blkif.h      | 726
> > ++++++++++++++++++++++++++
> >  include/xen/interface/io/console.h    |  56 ++
> >  include/xen/interface/io/protocols.h  |  42 ++
> >  include/xen/interface/io/ring.h       | 479 +++++++++++++++++
> >  include/xen/interface/io/xenbus.h     |  81 +++
> >  include/xen/interface/io/xs_wire.h    | 151 ++++++
> >  include/xen/interface/memory.h        | 332 ++++++++++++
> >  include/xen/interface/sched.h         | 188 +++++++
> >  include/xen/interface/xen.h           | 225 ++++++++
> >  14 files changed, 3427 insertions(+)
> >  create mode 100644 include/xen/arm/interface.h
> >  create mode 100644 include/xen/interface/event_channel.h
> >  create mode 100644 include/xen/interface/grant_table.h
> >  create mode 100644 include/xen/interface/hvm/hvm_op.h
> >  create mode 100644 include/xen/interface/hvm/params.h
> >  create mode 100644 include/xen/interface/io/blkif.h
> >  create mode 100644 include/xen/interface/io/console.h
> >  create mode 100644 include/xen/interface/io/protocols.h
> >  create mode 100644 include/xen/interface/io/ring.h
> >  create mode 100644 include/xen/interface/io/xenbus.h
> >  create mode 100644 include/xen/interface/io/xs_wire.h
> >  create mode 100644 include/xen/interface/memory.h
> >  create mode 100644 include/xen/interface/sched.h
> >  create mode 100644 include/xen/interface/xen.h
> > 
> > diff --git a/include/xen/arm/interface.h
> > b/include/xen/arm/interface.h
> > new file mode 100644
> > index 0000000000..79d5ae8563
> > --- /dev/null
> > +++ b/include/xen/arm/interface.h
> > @@ -0,0 +1,88 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/******************************************************************************
> > + * Guest OS interface to ARM Xen.
> > + *
> > + * Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Citrix, 2012
> > + */
> > +
> > +#ifndef _ASM_ARM_XEN_INTERFACE_H
> > +#define _ASM_ARM_XEN_INTERFACE_H
> > +
> > +#ifndef __ASSEMBLY__
> > +#include <linux/types.h>
> > +#endif
> > +
> > +#define uint64_aligned_t u64 __attribute__((aligned(8)))
> > +
> > +#define __DEFINE_GUEST_HANDLE(name, type) \
> > +	typedef struct { union { type *p; uint64_aligned_t q; }; }  \
> > +		__guest_handle_ ## name
> > +
> > +#define DEFINE_GUEST_HANDLE_STRUCT(name) \
> > +	__DEFINE_GUEST_HANDLE(name, struct name)
> > +#define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name)
> > +#define GUEST_HANDLE(name)        __guest_handle_ ## name
> > +
> > +#define set_xen_guest_handle(hnd, val)			\
> > +	do {						\
> > +		if (sizeof(hnd) == 8)			\
> > +			*(u64 *)&(hnd) = 0;	\
> > +		(hnd).p = val;				\
> > +	} while (0)
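
For reference, a minimal stand-alone sketch of how the guest handle macros quoted above behave. It assumes a hosted C11 compiler with GNU extensions; `u64` is typedef'd locally in place of <linux/types.h>, and `handle_raw()` is a hypothetical helper that is not part of the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;	/* local stand-in for <linux/types.h> */

#define uint64_aligned_t u64 __attribute__((aligned(8)))

#define __DEFINE_GUEST_HANDLE(name, type) \
	typedef struct { union { type *p; uint64_aligned_t q; }; } \
		__guest_handle_ ## name
#define GUEST_HANDLE(name) __guest_handle_ ## name

#define set_xen_guest_handle(hnd, val)		\
	do {					\
		if (sizeof(hnd) == 8)		\
			*(u64 *)&(hnd) = 0;	\
		(hnd).p = val;			\
	} while (0)

__DEFINE_GUEST_HANDLE(uint, unsigned int);

/* The handle is 8 bytes wide regardless of the native pointer size. */
static_assert(sizeof(GUEST_HANDLE(uint)) == 8, "handle must be 64 bits");

/* Hypothetical helper: the raw 64-bit value stored in the handle. */
static u64 handle_raw(const GUEST_HANDLE(uint) *h)
{
	u64 raw;

	memcpy(&raw, h, sizeof(raw));
	return raw;
}
```

The macro clears the whole 64-bit slot before storing the pointer, so on a 32-bit guest the unused bytes of the handle are zero instead of stale stack data.
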
> > +
> > +#define __HYPERVISOR_platform_op_raw __HYPERVISOR_platform_op
> > +
> > +#ifndef __ASSEMBLY__
> > +/* Explicitly size integers that represent pfns in the interface with
> > + * Xen so that we can have one ABI that works for 32 and 64 bit guests.
> > + * Note that this means that the xen_pfn_t type may be capable of
> > + * representing pfn's which the guest cannot represent in its own pfn
> > + * type. However since pfn space is controlled by the guest this is
> > + * fine since it simply wouldn't be able to create any such pfns in
> > + * the first place.
> > + */
> > +typedef u64 xen_pfn_t;
> > +#define PRI_xen_pfn "llx"
> > +typedef u64 xen_ulong_t;
> > +#define PRI_xen_ulong "llx"
> > +typedef s64 xen_long_t;
> > +#define PRI_xen_long "llx"
> > +/* Guest handles for primitive C types. */
> > +__DEFINE_GUEST_HANDLE(uchar, unsigned char);
> > +__DEFINE_GUEST_HANDLE(uint,  unsigned int);
> > +DEFINE_GUEST_HANDLE(char);
> > +DEFINE_GUEST_HANDLE(int);
> > +DEFINE_GUEST_HANDLE(void);
> > +DEFINE_GUEST_HANDLE(u64);
> > +DEFINE_GUEST_HANDLE(u32);
> > +DEFINE_GUEST_HANDLE(xen_pfn_t);
> > +DEFINE_GUEST_HANDLE(xen_ulong_t);
> > +
> > +/* Maximum number of virtual CPUs in multi-processor guests. */
> > +#define MAX_VIRT_CPUS 1
> > +
> > +struct arch_vcpu_info { };
> > +struct arch_shared_info { };
> > +
> > +/* TODO: Move pvclock definitions some place arch independent */
> > +struct pvclock_vcpu_time_info {
> > +	u32   version;
> > +	u32   pad0;
> > +	u64   tsc_timestamp;
> > +	u64   system_time;
> > +	u32   tsc_to_system_mul;
> > +	s8    tsc_shift;
> > +	u8    flags;
> > +	u8    pad[2];
> > +} __attribute__((__packed__)); /* 32 bytes */
> > +
> > +/* It is OK to have a 12 bytes struct with no padding because it is packed */
> > +struct pvclock_wall_clock {
> > +	u32   version;
> > +	u32   sec;
> > +	u32   nsec;
> > +	u32   sec_hi;
> > +} __attribute__((__packed__));
> > +#endif
> > +
> > +#endif /* _ASM_ARM_XEN_INTERFACE_H */
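
As a quick cross-check of the two packed pvclock structures above, here is a small stand-alone sketch with local typedefs instead of <linux/types.h>. Note that the wall-clock struct as quoted has four u32 fields, so it occupies 16 bytes. `wall_clock_seconds()` is a hypothetical helper, and treating `sec_hi` as the upper 32 bits of the seconds counter is an assumption, not something the patch states:

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef int8_t   s8;
typedef uint32_t u32;
typedef uint64_t u64;

struct pvclock_vcpu_time_info {
	u32   version;
	u32   pad0;
	u64   tsc_timestamp;
	u64   system_time;
	u32   tsc_to_system_mul;
	s8    tsc_shift;
	u8    flags;
	u8    pad[2];
} __attribute__((__packed__));

struct pvclock_wall_clock {
	u32   version;
	u32   sec;
	u32   nsec;
	u32   sec_hi;
} __attribute__((__packed__));

/* 4 + 4 + 8 + 8 + 4 + 1 + 1 + 2 = 32 bytes */
static_assert(sizeof(struct pvclock_vcpu_time_info) == 32, "unexpected size");
/* 4 * 4 = 16 bytes */
static_assert(sizeof(struct pvclock_wall_clock) == 16, "unexpected size");

/* Hypothetical helper; assumes sec_hi holds bits 63..32 of the seconds. */
static u64 wall_clock_seconds(const struct pvclock_wall_clock *wc)
{
	return ((u64)wc->sec_hi << 32) | wc->sec;
}
```
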
> > diff --git a/include/xen/interface/event_channel.h
> > b/include/xen/interface/event_channel.h
> > new file mode 100644
> > index 0000000000..8174999c2f
> > --- /dev/null
> > +++ b/include/xen/interface/event_channel.h
> > @@ -0,0 +1,281 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/******************************************************************************
> > + * event_channel.h
> > + *
> > + * Event channels between domains.
> > + *
> > + * Copyright (c) 2003-2004, K A Fraser.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_EVENT_CHANNEL_H__
> > +#define __XEN_PUBLIC_EVENT_CHANNEL_H__
> > +
> > +#include <xen/interface/xen.h>
> > +
> > +typedef u32 evtchn_port_t;
> > +DEFINE_GUEST_HANDLE(evtchn_port_t);
> > +
> > +/*
> > + * EVTCHNOP_alloc_unbound: Allocate a port in domain <dom> and
> > mark as
> > + * accepting interdomain bindings from domain <remote_dom>. A
> > fresh port
> > + * is allocated in <dom> and returned as <port>.
> > + * NOTES:
> > + *  1. If the caller is unprivileged then <dom> must be
> > DOMID_SELF.
> > + *  2. <rdom> may be DOMID_SELF, allowing loopback connections.
> > + */
> > +#define EVTCHNOP_alloc_unbound	  6
> > +struct evtchn_alloc_unbound {
> > +	/* IN parameters */
> > +	domid_t dom, remote_dom;
> > +	/* OUT parameters */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_bind_interdomain: Construct an interdomain event channel between
> > + * the calling domain and <remote_dom>. <remote_dom,remote_port> must identify
> > + * a port that is unbound and marked as accepting bindings from the calling
> > + * domain. A fresh port is allocated in the calling domain and returned as
> > + * <local_port>.
> > + * NOTES:
> > + *  2. <remote_dom> may be DOMID_SELF, allowing loopback connections.
> > + */
> > +#define EVTCHNOP_bind_interdomain 0
> > +struct evtchn_bind_interdomain {
> > +	/* IN parameters. */
> > +	domid_t remote_dom;
> > +	evtchn_port_t remote_port;
> > +	/* OUT parameters. */
> > +	evtchn_port_t local_port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_bind_virq: Bind a local event channel to VIRQ <irq> on specified
> > + * vcpu.
> > + * NOTES:
> > + *  1. A virtual IRQ may be bound to at most one event channel per vcpu.
> > + *  2. The allocated event channel is bound to the specified vcpu. The binding
> > + *     may not be changed.
> > + */
> > +#define EVTCHNOP_bind_virq	  1
> > +struct evtchn_bind_virq {
> > +	/* IN parameters. */
> > +	u32 virq;
> > +	u32 vcpu;
> > +	/* OUT parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_bind_pirq: Bind a local event channel to PIRQ <irq>.
> > + * NOTES:
> > + *  1. A physical IRQ may be bound to at most one event channel per domain.
> > + *  2. Only a sufficiently-privileged domain may bind to a physical IRQ.
> > + */
> > +#define EVTCHNOP_bind_pirq	  2
> > +struct evtchn_bind_pirq {
> > +	/* IN parameters. */
> > +	u32 pirq;
> > +#define BIND_PIRQ__WILL_SHARE 1
> > +	u32 flags; /* BIND_PIRQ__* */
> > +	/* OUT parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_bind_ipi: Bind a local event channel to receive events.
> > + * NOTES:
> > + *  1. The allocated event channel is bound to the specified vcpu. The binding
> > + *     may not be changed.
> > + */
> > +#define EVTCHNOP_bind_ipi	  7
> > +struct evtchn_bind_ipi {
> > +	u32 vcpu;
> > +	/* OUT parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_close: Close a local event channel <port>. If the
> > channel is
> > + * interdomain then the remote end is placed in the unbound state
> > + * (EVTCHNSTAT_unbound), awaiting a new connection.
> > + */
> > +#define EVTCHNOP_close		  3
> > +struct evtchn_close {
> > +	/* IN parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_send: Send an event to the remote end of the channel whose local
> > + * endpoint is <port>.
> > + */
> > +#define EVTCHNOP_send		  4
> > +struct evtchn_send {
> > +	/* IN parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_status: Get the current status of the communication channel which
> > + * has an endpoint at <dom, port>.
> > + * NOTES:
> > + *  1. <dom> may be specified as DOMID_SELF.
> > + *  2. Only a sufficiently-privileged domain may obtain the status of an event
> > + *     channel for which <dom> is not DOMID_SELF.
> > + */
> > +#define EVTCHNOP_status		  5
> > +struct evtchn_status {
> > +	/* IN parameters */
> > +	domid_t  dom;
> > +	evtchn_port_t port;
> > +	/* OUT parameters */
> > +#define EVTCHNSTAT_closed	0  /* Channel is not in use.		     */
> > +#define EVTCHNSTAT_unbound	1  /* Channel is waiting interdom connection.*/
> > +#define EVTCHNSTAT_interdomain	2  /* Channel is connected to remote domain. */
> > +#define EVTCHNSTAT_pirq		3  /* Channel is bound to a phys IRQ line.   */
> > +#define EVTCHNSTAT_virq		4  /* Channel is bound to a virtual IRQ line */
> > +#define EVTCHNSTAT_ipi		5  /* Channel is bound to a virtual IPI line */
> > +	u32 status;
> > +	u32 vcpu;		   /* VCPU to which this channel is bound.   */
> > +	union {
> > +		struct {
> > +			domid_t dom;
> > +		} unbound; /* EVTCHNSTAT_unbound */
> > +		struct {
> > +			domid_t dom;
> > +			evtchn_port_t port;
> > +		} interdomain; /* EVTCHNSTAT_interdomain */
> > +		u32 pirq;	    /* EVTCHNSTAT_pirq	      */
> > +		u32 virq;	    /* EVTCHNSTAT_virq	      */
> > +	} u;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_bind_vcpu: Specify which vcpu a channel should notify when an
> > + * event is pending.
> > + * NOTES:
> > + *  1. IPI- and VIRQ-bound channels always notify the vcpu that initialised
> > + *     the binding. This binding cannot be changed.
> > + *  2. All other channels notify vcpu0 by default. This default is set when
> > + *     the channel is allocated (a port that is freed and subsequently reused
> > + *     has its binding reset to vcpu0).
> > + */
> > +#define EVTCHNOP_bind_vcpu	  8
> > +struct evtchn_bind_vcpu {
> > +	/* IN parameters. */
> > +	evtchn_port_t port;
> > +	u32 vcpu;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_unmask: Unmask the specified local event-channel port and deliver
> > + * a notification to the appropriate VCPU if an event is pending.
> > + */
> > +#define EVTCHNOP_unmask		  9
> > +struct evtchn_unmask {
> > +	/* IN parameters. */
> > +	evtchn_port_t port;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_reset: Close all event channels associated with specified domain.
> > + * NOTES:
> > + *  1. <dom> may be specified as DOMID_SELF.
> > + *  2. Only a sufficiently-privileged domain may specify other than DOMID_SELF.
> > + */
> > +#define EVTCHNOP_reset		 10
> > +struct evtchn_reset {
> > +	/* IN parameters. */
> > +	domid_t dom;
> > +};
> > +
> > +typedef struct evtchn_reset evtchn_reset_t;
> > +
> > +/*
> > + * EVTCHNOP_init_control: initialize the control block for the
> > FIFO ABI.
> > + */
> > +#define EVTCHNOP_init_control    11
> > +struct evtchn_init_control {
> > +	/* IN parameters. */
> > +	u64 control_gfn;
> > +	u32 offset;
> > +	u32 vcpu;
> > +	/* OUT parameters. */
> > +	u8 link_bits;
> > +	u8 _pad[7];
> > +};
> > +
> > +/*
> > + * EVTCHNOP_expand_array: add an additional page to the event
> > array.
> > + */
> > +#define EVTCHNOP_expand_array    12
> > +struct evtchn_expand_array {
> > +	/* IN parameters. */
> > +	u64 array_gfn;
> > +};
> > +
> > +/*
> > + * EVTCHNOP_set_priority: set the priority for an event channel.
> > + */
> > +#define EVTCHNOP_set_priority    13
> > +struct evtchn_set_priority {
> > +	/* IN parameters. */
> > +	evtchn_port_t port;
> > +	u32 priority;
> > +};
> > +
> > +struct evtchn_op {
> > +	u32 cmd; /* EVTCHNOP_* */
> > +	union {
> > +		struct evtchn_alloc_unbound    alloc_unbound;
> > +		struct evtchn_bind_interdomain bind_interdomain;
> > +		struct evtchn_bind_virq	       bind_virq;
> > +		struct evtchn_bind_pirq	       bind_pirq;
> > +		struct evtchn_bind_ipi	       bind_ipi;
> > +		struct evtchn_close	       close;
> > +		struct evtchn_send	       send;
> > +		struct evtchn_status	       status;
> > +		struct evtchn_bind_vcpu	       bind_vcpu;
> > +		struct evtchn_unmask	       unmask;
> > +	} u;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(evtchn_op);
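
To illustrate how the command code and the union fit together, here is a reduced stand-alone sketch that fills a request for EVTCHNOP_send. Only one union member is reproduced, `make_send_op()` is a hypothetical helper, and how the request is actually passed to the hypervisor is deliberately left out:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;
typedef u32 evtchn_port_t;

#define EVTCHNOP_send	4

struct evtchn_send {
	evtchn_port_t port;	/* IN: local port to signal */
};

/* Reduced copy of struct evtchn_op with a single union member. */
struct evtchn_op {
	u32 cmd;		/* EVTCHNOP_* */
	union {
		struct evtchn_send send;
	} u;
};

/* Hypothetical helper: prepare a request that signals @port. */
static struct evtchn_op make_send_op(evtchn_port_t port)
{
	struct evtchn_op op;

	memset(&op, 0, sizeof(op));	/* no stale bytes in unused fields */
	op.cmd = EVTCHNOP_send;
	op.u.send.port = port;
	return op;
}
```
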
> > +
> > +/*
> > + * 2-level ABI
> > + */
> > +
> > +#define EVTCHN_2L_NR_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64)
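
The arithmetic behind this define, as a small stand-alone sketch: with the 64-bit `xen_ulong_t` from the ARM interface header, the expression is the square of the number of bits in one word, 8 * 8 * 64 = 4096. `evtchn_2l_port_valid()` is a hypothetical helper for illustration only:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t xen_ulong_t;	/* as typedef'd for ARM in the patch */

#define EVTCHN_2L_NR_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64)

/* (8 * 8 bits) squared == 8 * 8 * 64 == 4096 */
static_assert(EVTCHN_2L_NR_CHANNELS == 4096, "unexpected channel count");

/* Hypothetical helper: can @port be represented in the 2-level ABI? */
static int evtchn_2l_port_valid(uint32_t port)
{
	return port < EVTCHN_2L_NR_CHANNELS;
}
```
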
> > +
> > +/*
> > + * FIFO ABI
> > + */
> > +
> > +/* Events may have priorities from 0 (highest) to 15 (lowest). */
> > +#define EVTCHN_FIFO_PRIORITY_MAX     0
> > +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
> > +#define EVTCHN_FIFO_PRIORITY_MIN     15
> > +
> > +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
> > +
> > +typedef u32 event_word_t;
> > +
> > +#define EVTCHN_FIFO_PENDING 31
> > +#define EVTCHN_FIFO_MASKED  30
> > +#define EVTCHN_FIFO_LINKED  29
> > +#define EVTCHN_FIFO_BUSY    28
> > +
> > +#define EVTCHN_FIFO_LINK_BITS 17
> > +#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)
> > +
> > +#define EVTCHN_FIFO_NR_CHANNELS (1 << EVTCHN_FIFO_LINK_BITS)
> > +
> > +struct evtchn_fifo_control_block {
> > +	u32     ready;
> > +	u32     _rsvd;
> > +	event_word_t head[EVTCHN_FIFO_MAX_QUEUES];
> > +};
> > +
> > +#endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */
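
A minimal sketch of decoding a FIFO event word with the constants above. It assumes that the EVTCHN_FIFO_* flag values are bit positions within `event_word_t` and that the low EVTCHN_FIFO_LINK_BITS bits hold the link to the next port. The header itself does not spell this out, and the three helpers are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t event_word_t;

#define EVTCHN_FIFO_PENDING 31
#define EVTCHN_FIFO_MASKED  30

#define EVTCHN_FIFO_LINK_BITS 17
#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)

/* Hypothetical helper: is the PENDING bit set in @w? */
static int evtchn_fifo_is_pending(event_word_t w)
{
	return (w >> EVTCHN_FIFO_PENDING) & 1;
}

/* Hypothetical helper: is the MASKED bit set in @w? */
static int evtchn_fifo_is_masked(event_word_t w)
{
	return (w >> EVTCHN_FIFO_MASKED) & 1;
}

/* Hypothetical helper: extract the link field (assumed next port number). */
static uint32_t evtchn_fifo_link(event_word_t w)
{
	return w & EVTCHN_FIFO_LINK_MASK;
}
```
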
> > diff --git a/include/xen/interface/grant_table.h
> > b/include/xen/interface/grant_table.h
> > new file mode 100644
> > index 0000000000..197a0d0d58
> > --- /dev/null
> > +++ b/include/xen/interface/grant_table.h
> > @@ -0,0 +1,582 @@
> > +/******************************************************************************
> > + * grant_table.h
> > + *
> > + * Interface for granting foreign access to page frames, and
> > receiving
> > + * page-ownership transfers.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2004, K A Fraser
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_GRANT_TABLE_H__
> > +#define __XEN_PUBLIC_GRANT_TABLE_H__
> > +
> > +#include <xen/interface/xen.h>
> > +
> > +/***********************************
> > + * GRANT TABLE REPRESENTATION
> > + */
> > +
> > +/* Some rough guidelines on accessing and updating grant-table entries
> > + * in a concurrency-safe manner. For more information, Linux contains a
> > + * reference implementation for guest OSes (arch/xen/kernel/grant_table.c).
> > + *
> > + * NB. WMB is a no-op on current-generation x86 processors. However, a
> > + *     compiler barrier will still be required.
> > + *
> > + * Introducing a valid entry into the grant table:
> > + *  1. Write ent->domid.
> > + *  2. Write ent->frame:
> > + *      GTF_permit_access:   Frame to which access is permitted.
> > + *      GTF_accept_transfer: Pseudo-phys frame slot being filled by new
> > + *                           frame, or zero if none.
> > + *  3. Write memory barrier (WMB).
> > + *  4. Write ent->flags, inc. valid type.
> > + *
> > + * Invalidating an unused GTF_permit_access entry:
> > + *  1. flags = ent->flags.
> > + *  2. Observe that !(flags & (GTF_reading|GTF_writing)).
> > + *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
> > + *  NB. No need for WMB as reuse of entry is control-dependent on success of
> > + *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
> > + *
> > + * Invalidating an in-use GTF_permit_access entry:
> > + *  This cannot be done directly. Request assistance from the domain controller
> > + *  which can set a timeout on the use of a grant entry and take necessary
> > + *  action. (NB. This is not yet implemented!).
> > + *
> > + * Invalidating an unused GTF_accept_transfer entry:
> > + *  1. flags = ent->flags.
> > + *  2. Observe that !(flags & GTF_transfer_committed). [*]
> > + *  3. Check result of SMP-safe CMPXCHG(&ent->flags, flags, 0).
> > + *  NB. No need for WMB as reuse of entry is control-dependent on success of
> > + *      step 3, and all architectures guarantee ordering of ctrl-dep writes.
> > + *  [*] If GTF_transfer_committed is set then the grant entry is 'committed'.
> > + *      The guest must /not/ modify the grant entry until the address of the
> > + *      transferred frame is written. It is safe for the guest to spin waiting
> > + *      for this to occur (detect by observing GTF_transfer_completed in
> > + *      ent->flags).
> > + *
> > + * Invalidating a committed GTF_accept_transfer entry:
> > + *  1. Wait for (ent->flags & GTF_transfer_completed).
> > + *
> > + * Changing a GTF_permit_access from writable to read-only:
> > + *  Use SMP-safe CMPXCHG to set GTF_readonly, while checking !GTF_writing.
> > + *
> > + * Changing a GTF_permit_access from read-only to writable:
> > + *  Use SMP-safe bit-setting instruction.
> > + */
> > +
> > +/*
> > + * Reference to a grant entry in a specified domain's grant table.
> > + */
> > +typedef u32 grant_ref_t;
> > +
> > +/*
> > + * A grant table comprises a packed array of grant entries in one
> > or more
> > + * page frames shared between Xen and a guest.
> > + * [XEN]: This field is written by Xen and read by the sharing
> > guest.
> > + * [GST]: This field is written by the guest and read by Xen.
> > + */
> > +
> > +/*
> > + * Version 1 of the grant table entry structure is maintained
> > purely
> > + * for backwards compatibility.  New guests should use version 2.
> > + */
> > +struct grant_entry_v1 {
> > +	/* GTF_xxx: various type and flag information.  [XEN,GST] */
> > +	u16 flags;
> > +	/* The domain being granted foreign privileges. [GST] */
> > +	domid_t  domid;
> > +	/*
> > +	 * GTF_permit_access: Frame that @domid is allowed to map and
> > access. [GST]
> > +	 * GTF_accept_transfer: Frame whose ownership transferred by
> > @domid. [XEN]
> > +	 */
> > +	u32 frame;
> > +};
> > +
> > +/*
> > + * Type of grant entry.
> > + *  GTF_invalid: This grant entry grants no privileges.
> > + *  GTF_permit_access: Allow @domid to map/access @frame.
> > + *  GTF_accept_transfer: Allow @domid to transfer ownership of one page frame
> > + *                       to this guest. Xen writes the page number to @frame.
> > + *  GTF_transitive: Allow @domid to transitively access a subrange of
> > + *                  @trans_grant in @trans_domid.  No mappings are allowed.
> > + */
> > +#define GTF_invalid         (0U << 0)
> > +#define GTF_permit_access   (1U << 0)
> > +#define GTF_accept_transfer (2U << 0)
> > +#define GTF_transitive      (3U << 0)
> > +#define GTF_type_mask       (3U << 0)
> > +
> > +/*
> > + * Subflags for GTF_permit_access.
> > + *  GTF_readonly: Restrict @domid to read-only mappings and accesses. [GST]
> > + *  GTF_reading: Grant entry is currently mapped for reading by @domid. [XEN]
> > + *  GTF_writing: Grant entry is currently mapped for writing by @domid. [XEN]
> > + *  GTF_sub_page: Grant access to only a subrange of the page.  @domid
> > + *                will only be allowed to copy from the grant, and not
> > + *                map it. [GST]
> > + */
> > +#define _GTF_readonly       (2)
> > +#define GTF_readonly        (1U << _GTF_readonly)
> > +#define _GTF_reading        (3)
> > +#define GTF_reading         (1U << _GTF_reading)
> > +#define _GTF_writing        (4)
> > +#define GTF_writing         (1U << _GTF_writing)
> > +#define _GTF_sub_page       (8)
> > +#define GTF_sub_page        (1U << _GTF_sub_page)
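
The "introducing a valid entry" steps from the guidelines comment earlier in this header can be sketched for a version 1 entry roughly as follows. This is a stand-alone illustration, not code from the patch: `grant_access_v1()` is a hypothetical helper, `domid_t` is assumed to be a 16-bit type, and the GCC/Clang release fence is only a stand-in for the platform's WMB:

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;
typedef uint32_t u32;
typedef u16 domid_t;	/* assumption: 16-bit domain ID */

#define GTF_permit_access   (1U << 0)
#define _GTF_readonly       (2)
#define GTF_readonly        (1U << _GTF_readonly)

struct grant_entry_v1 {
	u16 flags;
	domid_t domid;
	u32 frame;
};

/* Hypothetical helper: publish a grant of @frame to @domid. */
static void grant_access_v1(volatile struct grant_entry_v1 *ent,
			    domid_t domid, u32 frame, int readonly)
{
	ent->domid = domid;				/* step 1 */
	ent->frame = frame;				/* step 2 */
	__atomic_thread_fence(__ATOMIC_RELEASE);	/* step 3: WMB stand-in */
	/* step 4: flags go last, so the entry becomes valid only now */
	ent->flags = GTF_permit_access | (readonly ? GTF_readonly : 0);
}
```

Writing the flags last matters because the entry counts as valid as soon as a type other than GTF_invalid is visible in `flags`.
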
> > +
> > +/*
> > + * Subflags for GTF_accept_transfer:
> > + *  GTF_transfer_committed: Xen sets this flag to indicate that it is committed
> > + *      to transferring ownership of a page frame. When a guest sees this flag
> > + *      it must /not/ modify the grant entry until GTF_transfer_completed is
> > + *      set by Xen.
> > + *  GTF_transfer_completed: It is safe for the guest to spin-wait on this flag
> > + *      after reading GTF_transfer_committed. Xen will always write the frame
> > + *      address, followed by ORing this flag, in a timely manner.
> > + */
> > +#define _GTF_transfer_committed (2)
> > +#define GTF_transfer_committed  (1U << _GTF_transfer_committed)
> > +#define _GTF_transfer_completed (3)
> > +#define GTF_transfer_completed  (1U << _GTF_transfer_completed)
> > +
> > +/*
> > + * Version 2 grant table entries.  These fulfil the same role as
> > + * version 1 entries, but can represent more complicated
> > operations.
> > + * Any given domain will have either a version 1 or a version 2
> > table,
> > + * and every entry in the table will be the same version.
> > + *
> > + * The interface by which domains use grant references does not
> > depend
> > + * on the grant table version in use by the other domain.
> > + */
> > +
> > +/*
> > + * Version 1 and version 2 grant entries share a common
> > prefix.  The
> > + * fields of the prefix are documented as part of struct
> > + * grant_entry_v1.
> > + */
> > +struct grant_entry_header {
> > +	u16 flags;
> > +	domid_t  domid;
> > +};
> > +
> > +/*
> > + * Version 2 of the grant entry structure, here is a union because three
> > + * different types are supported: full_page, sub_page and transitive.
> > + */
> > +union grant_entry_v2 {
> > +	struct grant_entry_header hdr;
> > +
> > +	/*
> > +	 * This member is used for V1-style full page grants, where
> > either:
> > +	 *
> > +	 * -- hdr.type is GTF_accept_transfer, or
> > +	 * -- hdr.type is GTF_permit_access and GTF_sub_page is not
> > set.
> > +	 *
> > +	 * In that case, the frame field has the same semantics as the
> > +	 * field of the same name in the V1 entry structure.
> > +	 */
> > +	struct {
> > +	struct grant_entry_header hdr;
> > +	u32 pad0;
> > +	u64 frame;
> > +	} full_page;
> > +
> > +	/*
> > +	 * If the grant type is GTF_grant_access and GTF_sub_page is set,
> > +	 * @domid is allowed to access bytes [@page_off,@page_off+@length)
> > +	 * in frame @frame.
> > +	 */
> > +	struct {
> > +	struct grant_entry_header hdr;
> > +	u16 page_off;
> > +	u16 length;
> > +	u64 frame;
> > +	} sub_page;
> > +
> > +	/*
> > +	 * If the grant is GTF_transitive, @domid is allowed to use the
> > +	 * grant @gref in domain @trans_domid, as if it was the local
> > +	 * domain.  Obviously, the transitive access must be compatible
> > +	 * with the original grant.
> > +	 */
> > +	struct {
> > +	struct grant_entry_header hdr;
> > +	domid_t trans_domid;
> > +	u16 pad0;
> > +	grant_ref_t gref;
> > +	} transitive;
> > +
> > +	u32 __spacer[4]; /* Pad to a power of two */
> > +};
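
A stand-alone layout check for the version 2 entry, with local typedefs in place of the U-Boot ones. `domid_t` is assumed to be 16 bits wide and `grant_v2_entries_per_page()` is a hypothetical helper. The `__spacer[4]` member pads the union to 16 bytes, which is also the size of the largest variant:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;
typedef u16 domid_t;	/* assumption: 16-bit domain ID */
typedef u32 grant_ref_t;

struct grant_entry_header {
	u16 flags;
	domid_t domid;
};

union grant_entry_v2 {
	struct grant_entry_header hdr;
	struct {
		struct grant_entry_header hdr;
		u32 pad0;
		u64 frame;
	} full_page;
	struct {
		struct grant_entry_header hdr;
		u16 page_off;
		u16 length;
		u64 frame;
	} sub_page;
	struct {
		struct grant_entry_header hdr;
		domid_t trans_domid;
		u16 pad0;
		grant_ref_t gref;
	} transitive;
	u32 __spacer[4];	/* pad to a power of two */
};

static_assert(sizeof(union grant_entry_v2) == 16, "entry must be 16 bytes");
static_assert(offsetof(union grant_entry_v2, full_page.frame) == 8,
	      "frame must follow the 8-byte prefix");

/* Hypothetical helper: number of v2 entries that fit into one page. */
static size_t grant_v2_entries_per_page(size_t page_size)
{
	return page_size / sizeof(union grant_entry_v2);
}
```
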
> > +
> > +typedef u16 grant_status_t;
> > +
> > +/***********************************
> > + * GRANT TABLE QUERIES AND USES
> > + */
> > +
> > +/*
> > + * Handle to track a mapping created via a grant reference.
> > + */
> > +typedef u32 grant_handle_t;
> > +
> > +/*
> > + * GNTTABOP_map_grant_ref: Map the grant entry (<dom>,<ref>) for access
> > + * by devices and/or host CPUs. If successful, <handle> is a tracking number
> > + * that must be presented later to destroy the mapping(s). On error, <handle>
> > + * is a negative status code.
> > + * NOTES:
> > + *  1. If GNTMAP_device_map is specified then <dev_bus_addr> is the address
> > + *     via which I/O devices may access the granted frame.
> > + *  2. If GNTMAP_host_map is specified then a mapping will be added at
> > + *     either a host virtual address in the current address space, or at
> > + *     a PTE at the specified machine address.  The type of mapping to
> > + *     perform is selected through the GNTMAP_contains_pte flag, and the
> > + *     address is specified in <host_addr>.
> > + *  3. Mappings should only be destroyed via GNTTABOP_unmap_grant_ref. If a
> > + *     host mapping is destroyed by other means then it is *NOT* guaranteed
> > + *     to be accounted to the correct grant reference!
> > + */
> > +#define GNTTABOP_map_grant_ref        0
> > +struct gnttab_map_grant_ref {
> > +	/* IN parameters. */
> > +	u64 host_addr;
> > +	u32 flags;               /* GNTMAP_* */
> > +	grant_ref_t ref;
> > +	domid_t  dom;
> > +	/* OUT parameters. */
> > +	s16  status;              /* GNTST_* */
> > +	grant_handle_t handle;
> > +	u64 dev_bus_addr;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_map_grant_ref);
> > +
> > +/*
> > + * GNTTABOP_unmap_grant_ref: Destroy one or more grant-reference
> > mappings
> > + * tracked by <handle>. If <host_addr> or <dev_bus_addr> is zero,
> > that
> > + * field is ignored. If non-zero, they must refer to a device/host
> > mapping
> > + * that is tracked by <handle>
> > + * NOTES:
> > + *  1. The call may fail in an undefined manner if either mapping
> > is not
> > + *     tracked by <handle>.
> > + *  3. After executing a batch of unmaps, it is guaranteed that no
> > stale
> > + *     mappings will remain in the device or host TLBs.
> > + */
> > +#define GNTTABOP_unmap_grant_ref      1
> > +struct gnttab_unmap_grant_ref {
> > +	/* IN parameters. */
> > +	u64 host_addr;
> > +	u64 dev_bus_addr;
> > +	grant_handle_t handle;
> > +	/* OUT parameters. */
> > +	s16  status;              /* GNTST_* */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_grant_ref);
> > +
> > +/*
> > + * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
> > + * <nr_frames> pages. The frame addresses are written to the <frame_list>.
> > + * Only <nr_frames> addresses are written, even if the table is larger.
> > + * NOTES:
> > + *  1. <dom> may be specified as DOMID_SELF.
> > + *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
> > + *  3. Xen may not support more than a single grant-table page per domain.
> > + */
> > +#define GNTTABOP_setup_table          2
> > +struct gnttab_setup_table {
> > +	/* IN parameters. */
> > +	domid_t  dom;
> > +	u32 nr_frames;
> > +	/* OUT parameters. */
> > +	s16  status;              /* GNTST_* */
> > +
> > +	GUEST_HANDLE(xen_pfn_t)frame_list;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_setup_table);
> > +
> > +/*
> > + * GNTTABOP_dump_table: Dump the contents of the grant table to
> > the
> > + * xen console. Debugging use only.
> > + */
> > +#define GNTTABOP_dump_table           3
> > +struct gnttab_dump_table {
> > +	/* IN parameters. */
> > +	domid_t dom;
> > +	/* OUT parameters. */
> > +	s16 status;               /* GNTST_* */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_dump_table);
> > +
> > +/*
> > + * GNTTABOP_transfer_grant_ref: Transfer <frame> to a foreign
> > domain. The
> > + * foreign domain has previously registered its interest in the
> > transfer via
> > + * <domid, ref>.
> > + *
> > + * Note that, even if the transfer fails, the specified page no
> > longer belongs
> > + * to the calling domain *unless* the error is GNTST_bad_page.
> > + */
> > +#define GNTTABOP_transfer                4
> > +struct gnttab_transfer {
> > +	/* IN parameters. */
> > +	xen_pfn_t mfn;
> > +	domid_t       domid;
> > +	grant_ref_t   ref;
> > +	/* OUT parameters. */
> > +	s16       status;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_transfer);
> > +
> > +/*
> > + * GNTTABOP_copy: Hypervisor based copy
> > + * source and destinations can be eithers MFNs or, for foreign
> > domains,
> > + * grant references. the foreign domain has to grant read/write
> > access
> > + * in its grant table.
> > + *
> > + * The flags specify what type source and destinations are (either
> > MFN
> > + * or grant reference).
> > + *
> > + * Note that this can also be used to copy data between two
> > domains
> > + * via a third party if the source and destination domains had
> > previously
> > + * grant appropriate access to their pages to the third party.
> > + *
> > + * source_offset specifies an offset in the source frame,
> > dest_offset
> > + * the offset in the target frame and  len specifies the number of
> > + * bytes to be copied.
> > + */
> > +
> > +#define _GNTCOPY_source_gref      (0)
> > +#define GNTCOPY_source_gref       (1 << _GNTCOPY_source_gref)
> > +#define _GNTCOPY_dest_gref        (1)
> > +#define GNTCOPY_dest_gref         (1 << _GNTCOPY_dest_gref)
> > +
> > +#define GNTTABOP_copy                 5
> > +struct gnttab_copy {
> > +	/* IN parameters. */
> > +	struct {
> > +		union {
> > +			grant_ref_t ref;
> > +			xen_pfn_t   gmfn;
> > +		} u;
> > +		domid_t  domid;
> > +		u16 offset;
> > +	} source, dest;
> > +	u16      len;
> > +	u16      flags;          /* GNTCOPY_* */
> > +	/* OUT parameters. */
> > +	s16       status;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_copy);
> > +
> > +/*
> > + * GNTTABOP_query_size: Query the current and maximum sizes of the
> > shared
> > + * grant table.
> > + * NOTES:
> > + *  1. <dom> may be specified as DOMID_SELF.
> > + *  2. Only a sufficiently-privileged domain may specify <dom> !=
> > DOMID_SELF.
> > + */
> > +#define GNTTABOP_query_size           6
> > +struct gnttab_query_size {
> > +	/* IN parameters. */
> > +	domid_t  dom;
> > +	/* OUT parameters. */
> > +	u32 nr_frames;
> > +	u32 max_nr_frames;
> > +	s16  status;              /* GNTST_* */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_query_size);
> > +
> > +/*
> > + * GNTTABOP_unmap_and_replace: Destroy one or more grant-reference mappings
> > + * tracked by <handle> but atomically replace the page table entry with one
> > + * pointing to the machine address under <new_addr>.  <new_addr> will be
> > + * redirected to the null entry.
> > + * NOTES:
> > + *  1. The call may fail in an undefined manner if either mapping is not
> > + *     tracked by <handle>.
> > + *  2. After executing a batch of unmaps, it is guaranteed that no stale
> > + *     mappings will remain in the device or host TLBs.
> > + */
> > +#define GNTTABOP_unmap_and_replace    7
> > +struct gnttab_unmap_and_replace {
> > +	/* IN parameters. */
> > +	u64 host_addr;
> > +	u64 new_addr;
> > +	grant_handle_t handle;
> > +	/* OUT parameters. */
> > +	s16  status;              /* GNTST_* */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_and_replace);
> > +
> > +/*
> > + * GNTTABOP_set_version: Request a particular version of the grant
> > + * table shared table structure.  This operation can only be performed
> > + * once in any given domain.  It must be performed before any grants
> > + * are activated; otherwise, the domain will be stuck with version 1.
> > + * The only defined versions are 1 and 2.
> > + */
> > +#define GNTTABOP_set_version          8
> > +struct gnttab_set_version {
> > +	/* IN parameters */
> > +	u32 version;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_set_version);
> > +
> > +/*
> > + * GNTTABOP_get_status_frames: Get the list of frames used to store grant
> > + * status for <dom>. In grant format version 2, the status is separated
> > + * from the other shared grant fields to allow more efficient synchronization
> > + * using barriers instead of atomic cmpexch operations.
> > + * <nr_frames> specifies the size of the vector <frame_list>.
> > + * The frame addresses are returned in the <frame_list>.
> > + * Only <nr_frames> addresses are returned, even if the table is larger.
> > + * NOTES:
> > + *  1. <dom> may be specified as DOMID_SELF.
> > + *  2. Only a sufficiently-privileged domain may specify <dom> != DOMID_SELF.
> > + */
> > +#define GNTTABOP_get_status_frames     9
> > +struct gnttab_get_status_frames {
> > +	/* IN parameters. */
> > +	u32 nr_frames;
> > +	domid_t  dom;
> > +	/* OUT parameters. */
> > +	s16  status;              /* GNTST_* */
> > +
> > +	GUEST_HANDLE(u64)frame_list;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_status_frames);
> > +
> > +/*
> > + * GNTTABOP_get_version: Get the grant table version which is in
> > + * effect for domain <dom>.
> > + */
> > +#define GNTTABOP_get_version          10
> > +struct gnttab_get_version {
> > +	/* IN parameters */
> > +	domid_t dom;
> > +	u16 pad;
> > +	/* OUT parameters */
> > +	u32 version;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);
> > +
> > +/*
> > + * Issue one or more cache maintenance operations on a portion of a
> > + * page granted to the calling domain by a foreign domain.
> > + */
> > +#define GNTTABOP_cache_flush          12
> > +struct gnttab_cache_flush {
> > +	union {
> > +		u64 dev_bus_addr;
> > +		grant_ref_t ref;
> > +	} a;
> > +	u16 offset;   /* offset from start of grant */
> > +	u16 length;   /* size within the grant */
> > +#define GNTTAB_CACHE_CLEAN          (1 << 0)
> > +#define GNTTAB_CACHE_INVAL          (1 << 1)
> > +#define GNTTAB_CACHE_SOURCE_GREF    (1 << 31)
> > +	u32 op;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush);
> > +
> > +/*
> > + * Bitfield values for update_pin_status.flags.
> > + */
> > + /* Map the grant entry for access by I/O devices. */
> > +#define _GNTMAP_device_map      (0)
> > +#define GNTMAP_device_map       (1 << _GNTMAP_device_map)
> > +/* Map the grant entry for access by host CPUs. */
> > +#define _GNTMAP_host_map        (1)
> > +#define GNTMAP_host_map         (1 << _GNTMAP_host_map)
> > +/* Accesses to the granted frame will be restricted to read-only access. */
> > +#define _GNTMAP_readonly        (2)
> > +#define GNTMAP_readonly         (1 << _GNTMAP_readonly)
> > +/*
> > + * GNTMAP_host_map subflag:
> > + *  0 => The host mapping is usable only by the guest OS.
> > + *  1 => The host mapping is usable by guest OS + current application.
> > + */
> > +#define _GNTMAP_application_map (3)
> > +#define GNTMAP_application_map  (1 << _GNTMAP_application_map)
> > +
> > +/*
> > + * GNTMAP_contains_pte subflag:
> > + *  0 => This map request contains a host virtual address.
> > + *  1 => This map request contains the machine address of the PTE to update.
> > + */
> > +#define _GNTMAP_contains_pte    (4)
> > +#define GNTMAP_contains_pte     (1 << _GNTMAP_contains_pte)
> > +
> > +/*
> > + * Bits to be placed in guest kernel available PTE bits (architecture
> > + * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set).
> > + */
> > +#define _GNTMAP_guest_avail0    (16)
> > +#define GNTMAP_guest_avail_mask ((u32)~0 << _GNTMAP_guest_avail0)
> > +
> > +/*
> > + * Values for error status returns. All errors are -ve.
> > + */
> > +#define GNTST_okay             (0)  /* Normal return.                        */
> > +#define GNTST_general_error    (-1) /* General undefined error.              */
> > +#define GNTST_bad_domain       (-2) /* Unrecognised domain id.               */
> > +#define GNTST_bad_gntref       (-3) /* Unrecognised or inappropriate gntref. */
> > +#define GNTST_bad_handle       (-4) /* Unrecognised or inappropriate handle. */
> > +#define GNTST_bad_virt_addr    (-5) /* Inappropriate virtual address to map. */
> > +#define GNTST_bad_dev_addr     (-6) /* Inappropriate device address to unmap.*/
> > +#define GNTST_no_device_space  (-7) /* Out of space in I/O MMU.              */
> > +#define GNTST_permission_denied (-8) /* Not enough privilege for operation.  */
> > +#define GNTST_bad_page         (-9) /* Specified page was invalid for op.    */
> > +#define GNTST_bad_copy_arg    (-10) /* copy arguments cross page boundary.   */
> > +#define GNTST_address_too_big (-11) /* transfer page address too large.      */
> > +#define GNTST_eagain          (-12) /* Operation not done; try again.        */
> > +
> > +#define GNTTABOP_error_msgs {                   \
> > +	"okay",                                     \
> > +	"undefined error",                          \
> > +	"unrecognised domain id",                   \
> > +	"invalid grant reference",                  \
> > +	"invalid mapping handle",                   \
> > +	"invalid virtual address",                  \
> > +	"invalid device address",                   \
> > +	"no spare translation slot in the I/O MMU", \
> > +	"permission denied",                        \
> > +	"bad page",                                 \
> > +	"copy arguments cross page boundary",       \
> > +	"page address size too large",              \
> > +	"operation not done; try again"             \
> > +}
> > +
> > +#endif /* __XEN_PUBLIC_GRANT_TABLE_H__ */
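
For illustration only (this is not part of the patch): the GNTST_* codes
above are zero or negative, and GNTTABOP_error_msgs is ordered so that the
message for a given status sits at index -status. A minimal sketch of such
a lookup follows. The helper name gntst_to_string() and the local copy of
the table are assumptions made for the example, not something this series
defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the message table, in the same order as the header. */
static const char *const gnttabop_error_msgs[] = {
	"okay",
	"undefined error",
	"unrecognised domain id",
	"invalid grant reference",
	"invalid mapping handle",
	"invalid virtual address",
	"invalid device address",
	"no spare translation slot in the I/O MMU",
	"permission denied",
	"bad page",
	"copy arguments cross page boundary",
	"page address size too large",
	"operation not done; try again",
};

/* Hypothetical helper: map a GNTST_* status to a human-readable string. */
static const char *gntst_to_string(int16_t status)
{
	size_t count = sizeof(gnttabop_error_msgs) /
		       sizeof(gnttabop_error_msgs[0]);
	int idx = -(int)status;	/* GNTST_* values are <= 0 */

	/* Reject positive statuses and codes beyond the table. */
	if (idx < 0 || (size_t)idx >= count)
		return "unknown status";

	return gnttabop_error_msgs[idx];
}
```

A caller would pass the status field of one of the gnttab_* structures; an
out-of-range value yields a fixed fallback string instead of an
out-of-bounds read.
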
> > diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
> > new file mode 100644
> > index 0000000000..1c53cad729
> > --- /dev/null
> > +++ b/include/xen/interface/hvm/hvm_op.h
> > @@ -0,0 +1,69 @@
> > +/*
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
> > +#define __XEN_PUBLIC_HVM_HVM_OP_H__
> > +
> > +/* Get/set subcommands: the second argument of the hypercall is a
> > + * pointer to a xen_hvm_param struct.
> > + */
> > +#define HVMOP_set_param           0
> > +#define HVMOP_get_param           1
> > +struct xen_hvm_param {
> > +	domid_t  domid;    /* IN */
> > +	u32 index;    /* IN */
> > +	u64 value;    /* IN/OUT */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
> > +
> > +/* Hint from PV drivers for pagetable destruction. */
> > +#define HVMOP_pagetable_dying       9
> > +struct xen_hvm_pagetable_dying {
> > +	/* Domain with a pagetable about to be destroyed. */
> > +	domid_t  domid;
> > +	/* guest physical address of the toplevel pagetable dying */
> > +	aligned_u64 gpa;
> > +};
> > +
> > +typedef struct xen_hvm_pagetable_dying xen_hvm_pagetable_dying_t;
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_pagetable_dying_t);
> > +
> > +enum hvmmem_type_t {
> > +	HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> > +	HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> > +	HVMMEM_mmio_dm,            /* Reads and writes go to the device model */
> > +};
> > +
> > +#define HVMOP_get_mem_type    15
> > +/* Return hvmmem_type_t for the specified pfn. */
> > +struct xen_hvm_get_mem_type {
> > +	/* Domain to be queried. */
> > +	domid_t domid;
> > +	/* OUT variable. */
> > +	u16 mem_type;
> > +	u16 pad[2]; /* align next field on 8-byte boundary */
> > +	/* IN variable. */
> > +	u64 pfn;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_get_mem_type);
> > +
> > +#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> > diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
> > new file mode 100644
> > index 0000000000..4d61fc58d9
> > --- /dev/null
> > +++ b/include/xen/interface/hvm/params.h
> > @@ -0,0 +1,127 @@
> > +/*
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
> > +#define __XEN_PUBLIC_HVM_PARAMS_H__
> > +
> > +#include <xen/interface/hvm/hvm_op.h>
> > +
> > +/*
> > + * Parameter space for HVMOP_{set,get}_param.
> > + */
> > +
> > +#define HVM_PARAM_CALLBACK_IRQ 0
> > +/*
> > + * How should CPU0 event-channel notifications be delivered?
> > + *
> > + * If val == 0 then CPU0 event-channel notifications are not delivered.
> > + * If val != 0, val[63:56] encodes the type, as follows:
> > + */
> > +
> > +#define HVM_PARAM_CALLBACK_TYPE_GSI      0
> > +/*
> > + * val[55:0] is a delivery GSI.  GSI 0 cannot be used, as it aliases val == 0,
> > + * and disables all notifications.
> > + */
> > +
> > +#define HVM_PARAM_CALLBACK_TYPE_PCI_INTX 1
> > +/*
> > + * val[55:0] is a delivery PCI INTx line:
> > + * Domain = val[47:32], Bus = val[31:16] DevFn = val[15:8], IntX = val[1:0]
> > + */
> > +
> > +#if defined(__i386__) || defined(__x86_64__)
> > +#define HVM_PARAM_CALLBACK_TYPE_VECTOR   2
> > +/*
> > + * val[7:0] is a vector number.  Check for XENFEAT_hvm_callback_vector to know
> > + * if this delivery method is available.
> > + */
> > +#elif defined(__arm__) || defined(__aarch64__)
> > +#define HVM_PARAM_CALLBACK_TYPE_PPI      2
> > +/*
> > + * val[55:16] needs to be zero.
> > + * val[15:8] is interrupt flag of the PPI used by event-channel:
> > + *  bit 8: the PPI is edge(1) or level(0) triggered
> > + *  bit 9: the PPI is active low(1) or high(0)
> > + * val[7:0] is a PPI number used by event-channel.
> > + * This is only used by ARM/ARM64 and masking/eoi the interrupt associated to
> > + * the notification is handled by the interrupt controller.
> > + */
> > +#endif
> > +
> > +#define HVM_PARAM_STORE_PFN    1
> > +#define HVM_PARAM_STORE_EVTCHN 2
> > +
> > +#define HVM_PARAM_PAE_ENABLED  4
> > +
> > +#define HVM_PARAM_IOREQ_PFN    5
> > +
> > +#define HVM_PARAM_BUFIOREQ_PFN 6
> > +
> > +/*
> > + * Set mode for virtual timers (currently x86 only):
> > + *  delay_for_missed_ticks (default):
> > + *   Do not advance a vcpu's time beyond the correct delivery time for
> > + *   interrupts that have been missed due to preemption. Deliver missed
> > + *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
> > + *   time stepwise for each one.
> > + *  no_delay_for_missed_ticks:
> > + *   As above, missed interrupts are delivered, but guest time always tracks
> > + *   wallclock (i.e., real) time while doing so.
> > + *  no_missed_ticks_pending:
> > + *   No missed interrupts are held pending. Instead, to ensure ticks are
> > + *   delivered at some non-zero rate, if we detect missed ticks then the
> > + *   internal tick alarm is not disabled if the VCPU is preempted during the
> > + *   next tick period.
> > + *  one_missed_tick_pending:
> > + *   Missed interrupts are collapsed together and delivered as one 'late tick'.
> > + *   Guest time always tracks wallclock (i.e., real) time.
> > + */
> > +#define HVM_PARAM_TIMER_MODE   10
> > +#define HVMPTM_delay_for_missed_ticks    0
> > +#define HVMPTM_no_delay_for_missed_ticks 1
> > +#define HVMPTM_no_missed_ticks_pending   2
> > +#define HVMPTM_one_missed_tick_pending   3
> > +
> > +/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
> > +#define HVM_PARAM_HPET_ENABLED 11
> > +
> > +/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
> > +#define HVM_PARAM_IDENT_PT     12
> > +
> > +/* Device Model domain, defaults to 0. */
> > +#define HVM_PARAM_DM_DOMAIN    13
> > +
> > +/* ACPI S state: currently support S0 and S3 on x86. */
> > +#define HVM_PARAM_ACPI_S_STATE 14
> > +
> > +/* TSS used on Intel when CR0.PE=0. */
> > +#define HVM_PARAM_VM86_TSS     15
> > +
> > +/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
> > +#define HVM_PARAM_VPT_ALIGN    16
> > +
> > +/* Console debug shared memory ring and event channel */
> > +#define HVM_PARAM_CONSOLE_PFN    17
> > +#define HVM_PARAM_CONSOLE_EVTCHN 18
> > +
> > +#define HVM_NR_PARAMS          19
> > +
> > +#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
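
For illustration only (this is not part of the patch): the comments above
describe how an HVM_PARAM_CALLBACK_IRQ value is laid out, with the delivery
type in val[63:56] and, for the PPI type, the PPI number in val[7:0] plus
trigger flags in bits 8 and 9. A minimal sketch of that bit layout follows.
The helper names callback_ppi_encode(), callback_type() and callback_ppi()
are hypothetical and exist only for this example.

```c
#include <stdint.h>

#define HVM_PARAM_CALLBACK_TYPE_PPI 2	/* same value as in the header */

/* Hypothetical helper: build a PPI-type HVM_PARAM_CALLBACK_IRQ value. */
static uint64_t callback_ppi_encode(uint8_t ppi, int edge, int active_low)
{
	/* val[63:56] carries the delivery type. */
	uint64_t val = (uint64_t)HVM_PARAM_CALLBACK_TYPE_PPI << 56;

	val |= (uint64_t)(edge ? 1 : 0) << 8;		/* bit 8: edge(1) or level(0) */
	val |= (uint64_t)(active_low ? 1 : 0) << 9;	/* bit 9: low(1) or high(0) */
	val |= ppi;					/* val[7:0]: PPI number */

	return val;				/* val[55:16] stays zero */
}

/* Hypothetical helper: extract the delivery type from val[63:56]. */
static unsigned int callback_type(uint64_t val)
{
	return (unsigned int)(val >> 56);
}

/* Hypothetical helper: extract the PPI number from val[7:0]. */
static unsigned int callback_ppi(uint64_t val)
{
	return (unsigned int)(val & 0xff);
}
```

Because the type field is non-zero for a PPI, an encoded value can never
collide with val == 0, which the header reserves for "notifications are not
delivered".
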
> > diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h
> > new file mode 100644
> > index 0000000000..7d74c99226
> > --- /dev/null
> > +++ b/include/xen/interface/io/blkif.h
> > @@ -0,0 +1,726 @@
> > +/******************************************************************************
> > + * blkif.h
> > + *
> > + * Unified block-device I/O interface for Xen guest OSes.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2003-2004, Keir Fraser
> > + * Copyright (c) 2012, Spectra Logic Corporation
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_IO_BLKIF_H__
> > +#define __XEN_PUBLIC_IO_BLKIF_H__
> > +
> > +#include "ring.h"
> > +#include "../grant_table.h"
> > +
> > +/*
> > + * Front->back notifications: When enqueuing a new request, sending a
> > + * notification can be made conditional on req_event (i.e., the generic
> > + * hold-off mechanism provided by the ring macros). Backends must set
> > + * req_event appropriately (e.g., using RING_FINAL_CHECK_FOR_REQUESTS()).
> > + *
> > + * Back->front notifications: When enqueuing a new response, sending a
> > + * notification can be made conditional on rsp_event (i.e., the generic
> > + * hold-off mechanism provided by the ring macros). Frontends must set
> > + * rsp_event appropriately (e.g., using RING_FINAL_CHECK_FOR_RESPONSES()).
> > + */
> > +
> > +#ifndef blkif_vdev_t
> > +#define blkif_vdev_t   u16
> > +#endif
> > +#define blkif_sector_t u64
> > +
> > +/*
> > + * Feature and Parameter Negotiation
> > + * =================================
> > + * The two halves of a Xen block driver utilize nodes within the XenStore to
> > + * communicate capabilities and to negotiate operating parameters.  This
> > + * section enumerates these nodes which reside in the respective front and
> > + * backend portions of the XenStore, following the XenBus convention.
> > + *
> > + * All data in the XenStore is stored as strings.  Nodes specifying numeric
> > + * values are encoded in decimal.  Integer value ranges listed below are
> > + * expressed as fixed sized integer types capable of storing the conversion
> > + * of a properly formatted node string, without loss of information.
> > + *
> > + * Any specified default value is in effect if the corresponding XenBus node
> > + * is not present in the XenStore.
> > + *
> > + * XenStore nodes in sections marked "PRIVATE" are solely for use by the
> > + * driver side whose XenBus tree contains them.
> > + *
> > + * XenStore nodes marked "DEPRECATED" in their notes section should only be
> > + * used to provide interoperability with legacy implementations.
> > + *
> > + * See the XenBus state transition diagram below for details on when XenBus
> > + * nodes must be published and when they can be queried.
> > + *
> > + *****************************************************************************
> > + *                            Backend XenBus Nodes
> > + *****************************************************************************
> > + *
> > + *------------------ Backend Device Identification (PRIVATE) ------------------
> > + *
> > + * mode
> > + *      Values:         "r" (read only), "w" (writable)
> > + *
> > + *      The read or write access permissions to the backing store to be
> > + *      granted to the frontend.
> > + *
> > + * params
> > + *      Values:         string
> > + *
> > + *      A free formatted string providing sufficient information for the
> > + *      hotplug script to attach the device and provide a suitable
> > + *      handler (ie: a block device) for blkback to use.
> > + *
> > + * physical-device
> > + *      Values:         "MAJOR:MINOR"
> > + *      Notes: 11
> > + *
> > + *      MAJOR and MINOR are the major number and minor number of the
> > + *      backing device respectively.
> > + *
> > + * physical-device-path
> > + *      Values:         path string
> > + *
> > + *      A string that contains the absolute path to the disk image. On
> > + *      NetBSD and Linux this is always a block device, while on FreeBSD
> > + *      it can be either a block device or a regular file.
> > + *
> > + * type
> > + *      Values:         "file", "phy", "tap"
> > + *
> > + *      The type of the backing device/object.
> > + *
> > + *
> > + * direct-io-safe
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *
> > + *      The underlying storage is not affected by the direct IO memory
> > + *      lifetime bug.  See:
> > + *        http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
> > + *
> > + *      Therefore this option gives the backend permission to use
> > + *      O_DIRECT, notwithstanding that bug.
> > + *
> > + *      That is, if this option is enabled, use of O_DIRECT is safe,
> > + *      in circumstances where we would normally have avoided it as a
> > + *      workaround for that bug.  This option is not relevant for all
> > + *      backends, and even not necessarily supported for those for
> > + *      which it is relevant.  A backend which knows that it is not
> > + *      affected by the bug can ignore this option.
> > + *
> > + *      This option doesn't require a backend to use O_DIRECT, so it
> > + *      should not be used to try to control the caching behaviour.
> > + *
> > + *--------------------------------- Features ---------------------------------
> > + *
> > + * feature-barrier
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *
> > + *      A value of "1" indicates that the backend can process requests
> > + *      containing the BLKIF_OP_WRITE_BARRIER request opcode.  Requests
> > + *      of this type may still be returned at any time with the
> > + *      BLKIF_RSP_EOPNOTSUPP result code.
> > + *
> > + * feature-flush-cache
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *
> > + *      A value of "1" indicates that the backend can process requests
> > + *      containing the BLKIF_OP_FLUSH_DISKCACHE request opcode.  Requests
> > + *      of this type may still be returned at any time with the
> > + *      BLKIF_RSP_EOPNOTSUPP result code.
> > + *
> > + * feature-discard
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *
> > + *      A value of "1" indicates that the backend can process requests
> > + *      containing the BLKIF_OP_DISCARD request opcode.  Requests
> > + *      of this type may still be returned at any time with the
> > + *      BLKIF_RSP_EOPNOTSUPP result code.
> > + *
> > + * feature-persistent
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *      Notes: 7
> > + *
> > + *      A value of "1" indicates that the backend can keep the grants used
> > + *      by the frontend driver mapped, so the same set of grants should be
> > + *      used in all transactions. The maximum number of grants the backend
> > + *      can map persistently depends on the implementation, but ideally it
> > + *      should be RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST. Using this
> > + *      feature the backend doesn't need to unmap each grant, preventing
> > + *      costly TLB flushes. The backend driver should only map grants
> > + *      persistently if the frontend supports it. If a backend driver chooses
> > + *      to use the persistent protocol when the frontend doesn't support it,
> > + *      it will probably hit the maximum number of persistently mapped grants
> > + *      (due to the fact that the frontend won't be reusing the same grants),
> > + *      and fall back to non-persistent mode. Backend implementations may
> > + *      shrink or expand the number of persistently mapped grants without
> > + *      notifying the frontend depending on memory constraints (this might
> > + *      cause a performance degradation).
> > + *
> > + *      If a backend driver wants to limit the maximum number of persistently
> > + *      mapped grants to a value less than RING_SIZE *
> > + *      BLKIF_MAX_SEGMENTS_PER_REQUEST a LRU strategy should be used to
> > + *      discard the grants that are less commonly used. Using a LRU in the
> > + *      backend driver paired with a LIFO queue in the frontend will
> > + *      allow us to have better performance in this scenario.
> > + *
> > + *----------------------- Request Transport Parameters ------------------------
> > + *
> > + * max-ring-page-order
> > + *      Values:         <uint32_t>
> > + *      Default Value:  0
> > + *      Notes:          1, 3
> > + *
> > + *      The maximum supported size of the request ring buffer in units of
> > + *      lb(machine pages). (e.g. 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
> > + *      etc.).
> > + *
> > + * max-ring-pages
> > + *      Values:         <uint32_t>
> > + *      Default Value:  1
> > + *      Notes:          DEPRECATED, 2, 3
> > + *
> > + *      The maximum supported size of the request ring buffer in units of
> > + *      machine pages.  The value must be a power of 2.
> > + *
> > + *------------------------- Backend Device Properties -------------------------
> > + *
> > + * discard-enable
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  1
> > + *
> > + *      This optional property, set by the toolstack, instructs the backend
> > + *      to offer (or not to offer) discard to the frontend. If the property
> > + *      is missing the backend should offer discard if the backing storage
> > + *      actually supports it.
> > + *
> > + * discard-alignment
> > + *      Values:         <uint32_t>
> > + *      Default Value:  0
> > + *      Notes:          4, 5
> > + *
> > + *      The offset, in bytes from the beginning of the virtual block device,
> > + *      to the first, addressable, discard extent on the underlying device.
> > + *
> > + * discard-granularity
> > + *      Values:         <uint32_t>
> > + *      Default Value:  <"sector-size">
> > + *      Notes:          4
> > + *
> > + *      The size, in bytes, of the individually addressable discard extents
> > + *      of the underlying device.
> > + *
> > + * discard-secure
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *      Notes:          10
> > + *
> > + *      A value of "1" indicates that the backend can process BLKIF_OP_DISCARD
> > + *      requests with the BLKIF_DISCARD_SECURE flag set.
> > + *
> > + * info
> > + *      Values:         <uint32_t> (bitmap)
> > + *
> > + *      A collection of bit flags describing attributes of the backing
> > + *      device.  The VDISK_* macros define the meaning of each bit
> > + *      location.
> > + *
> > + * sector-size
> > + *      Values:         <uint32_t>
> > + *
> > + *      The logical block size, in bytes, of the underlying storage. This
> > + *      must be a power of two with a minimum value of 512.
> > + *
> > + *      NOTE: Because of implementation bugs in some frontends this must be
> > + *            set to 512, unless the frontend advertizes a non-zero value
> > + *            in its "feature-large-sector-size" xenbus node. (See below).
> > + *
> > + * physical-sector-size
> > + *      Values:         <uint32_t>
> > + *      Default Value:  <"sector-size">
> > + *
> > + *      The physical block size, in bytes, of the backend storage. This
> > + *      must be an integer multiple of "sector-size".
> > + *
> > + * sectors
> > + *      Values:         <u64>
> > + *
> > + *      The size of the backend device, expressed in units of "sector-size".
> > + *      The product of "sector-size" and "sectors" must also be an integer
> > + *      multiple of "physical-sector-size", if that node is present.
> > + *
> > + *****************************************************************************
> > + *                            Frontend XenBus Nodes
> > + *****************************************************************************
> > + *
> > + *----------------------- Request Transport Parameters -----------------------
> > + *
> > + * event-channel
> > + *      Values:         <uint32_t>
> > + *
> > + *      The identifier of the Xen event channel used to signal activity
> > + *      in the ring buffer.
> > + *
> > + * ring-ref
> > + *      Values:         <uint32_t>
> > + *      Notes:          6
> > + *
> > + *      The Xen grant reference granting permission for the backend to map
> > + *      the sole page in a single page sized ring buffer.
> > + *
> > + * ring-ref%u
> > + *      Values:         <uint32_t>
> > + *      Notes:          6
> > + *
> > + *      For a frontend providing a multi-page ring, a "number of ring pages"
> > + *      sized list of nodes, each containing a Xen grant reference granting
> > + *      permission for the backend to map the page of the ring located
> > + *      at page index "%u".  Page indexes are zero based.
> > + *
> > + * protocol
> > + *      Values:         string (XEN_IO_PROTO_ABI_*)
> > + *      Default Value:  XEN_IO_PROTO_ABI_NATIVE
> > + *
> > + *      The machine ABI rules governing the format of all ring request and
> > + *      response structures.
> > + *
> > + * ring-page-order
> > + *      Values:         <uint32_t>
> > + *      Default Value:  0
> > + *      Maximum Value:  MAX(ffs(max-ring-pages) - 1, max-ring-page-order)
> > + *      Notes:          1, 3
> > + *
> > + *      The size of the frontend allocated request ring buffer in units
> > + *      of lb(machine pages). (e.g. 0 == 1 page, 1 == 2 pages, 2 == 4 pages,
> > + *      etc.).
> > + *
> > + * num-ring-pages
> > + *      Values:         <uint32_t>
> > + *      Default Value:  1
> > + *      Maximum Value:  MAX(max-ring-pages,(0x1 << max-ring-page-order))
> > + *      Notes:          DEPRECATED, 2, 3
> > + *
> > + *      The size of the frontend allocated request ring buffer in units of
> > + *      machine pages.  The value must be a power of 2.
> > + *
> > + *--------------------------------- Features ---------------------------------
> > + *
> > + * feature-persistent
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *      Notes: 7, 8, 9
> > + *
> > + *      A value of "1" indicates that the frontend will reuse the same grants
> > + *      for all transactions, allowing the backend to map them with write
> > + *      access (even when it should be read-only). If the frontend hits the
> > + *      maximum number of allowed persistently mapped grants, it can fall back
> > + *      to non-persistent mode. This will cause a performance degradation,
> > + *      since the backend driver will still try to map those grants
> > + *      persistently. Since the persistent grants protocol is compatible with
> > + *      the previous protocol, a frontend driver can choose to work in
> > + *      persistent mode even when the backend doesn't support it.
> > + *
> > + *      It is recommended that the frontend driver stores the persistently
> > + *      mapped grants in a LIFO queue, so a subset of all persistently mapped
> > + *      grants gets used commonly. This is done in case the backend driver
> > + *      decides to limit the maximum number of persistently mapped grants
> > + *      to a value less than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
> > + *
> > + * feature-large-sector-size
> > + *      Values:         0/1 (boolean)
> > + *      Default Value:  0
> > + *
> > + *      A value of "1" indicates that the frontend will correctly
> > supply and
> > + *      interpret all sector-based quantities in terms of the
> > "sector-size"
> > + *      value supplied in the backend info, whatever that may be
> > set to.
> > + *      If this node is not present or its value is "0" then it is
> > assumed
> > + *      that the frontend requires that the logical block size is
> > 512 as it
> > + *      is hardcoded (which is the case in some frontend
> > implementations).
> > + *
> > + *------------------------- Virtual Device Properties ------------
> > -------------
> > + *
> > + * device-type
> > + *      Values:         "disk", "cdrom", "floppy", etc.
> > + *
> > + * virtual-device
> > + *      Values:         <uint32_t>
> > + *
> > + *      A value indicating the physical device to virtualize
> > within the
> > + *      frontend's domain.  (e.g. "The first ATA disk", "The third
> > SCSI
> > + *      disk", etc.)
> > + *
> > + *      See docs/misc/vbd-interface.txt for details on the format
> > of this
> > + *      value.
> > + *
> > + * Notes
> > + * -----
> > + * (1) Multi-page ring buffer scheme first developed in the Citrix
> > XenServer
> > + *     PV drivers.
> > + * (2) Multi-page ring buffer scheme first used in some RedHat
> > distributions
> > + *     including a distribution deployed on certain nodes of the
> > Amazon
> > + *     EC2 cluster.
> > + * (3) Support for multi-page ring buffers was implemented
> > independently,
> > + *     in slightly different forms, by both Citrix and
> > RedHat/Amazon.
> > + *     For full interoperability, block front and backends should
> > publish
> > + *     identical ring parameters, adjusted for unit differences,
> > to the
> > + *     XenStore nodes used in both schemes.
> > + * (4) Devices that support discard functionality may internally
> > allocate space
> > + *     (discardable extents) in units that are larger than the
> > exported
> > logical
> > + *     block size. If the backing device has such discardable
> > extents the
> > + *     backend should provide both discard-granularity and
> > discard-alignment.
> > + *     Providing just one of the two may be considered an error by
> > the
> > frontend.
> > + *     Backends supporting discard should include discard-granularity and
> > + *     discard-alignment even if they support discarding individual sectors.
> > + *     Frontends should assume discard-alignment == 0 and
> > discard-granularity
> > + *     == sector size if these keys are missing.
> > + * (5) The discard-alignment parameter allows a physical device to
> > be
> > + *     partitioned into virtual devices that do not necessarily
> > begin or
> > + *     end on a discardable extent boundary.
> > + * (6) When there is only a single page allocated to the request
> > ring,
> > + *     'ring-ref' is used to communicate the grant reference for
> > this
> > + *     page to the backend.  When using a multi-page ring, the
> > 'ring-ref'
> > + *     node is not created.  Instead 'ring-ref0' - 'ring-refN' are
> > used.
> > + * (7) When using persistent grants, data has to be copied from/to the page
> > + *     where the grant is currently mapped. The overhead of doing this copy,
> > + *     however, does not outweigh the speed improvement of not having to unmap
> > + *     the grants.
> > + * (8) The frontend driver has to allow the backend driver to map
> > all grants
> > + *     with write access, even when they should be mapped read-
> > only,
> > since
> > + *     further requests may reuse these grants and require write
> > permissions.
> > + * (9) Linux implementation doesn't have a limit on the maximum number of
> > + *     grants that can be persistently mapped in the frontend driver, but
> > + *     due to the frontend driver implementation it should never be bigger
> > + *     than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
> > + *(10) The discard-secure property may be present and will be set
> > to 1 if the
> > + *     backing device supports secure discard.
> > + *(11) Only used by Linux and NetBSD.
> > + */
> > +
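Note (3) above asks front and back ends to publish identical ring parameters "adjusted for unit differences". A minimal sketch of that conversion between ring-page-order (log2 of the page count) and the deprecated num-ring-pages (a plain page count), for reference. The helper names are hypothetical and not part of this patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: page count described by a ring-page-order value. */
static inline uint32_t sketch_pages_from_order(uint32_t order)
{
	assert(order < 32);		/* shift must stay within 32 bits */
	return UINT32_C(1) << order;
}

/*
 * Hypothetical helper: ring-page-order equivalent of a num-ring-pages
 * value. Returns -1 when the page count is zero or not a power of two,
 * which the num-ring-pages description does not allow.
 */
static inline int sketch_order_from_pages(uint32_t pages)
{
	int order = 0;

	if (pages == 0 || (pages & (pages - 1)) != 0)
		return -1;
	while ((pages >>= 1) != 0)
		order++;
	return order;
}
```

A frontend supporting both schemes would write ring-page-order = sketch_order_from_pages(n) next to num-ring-pages = n.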
> > +/*
> > + * Multiple hardware queues/rings:
> > + * If supported, the backend will write the key "multi-queue-max-
> > queues" to
> > + * the directory for that vbd, and set its value to the maximum
> > supported
> > + * number of queues.
> > + * Frontends that are aware of this feature and wish to use it can
> > write the
> > + * key "multi-queue-num-queues" with the number they wish to use,
> > which
> > must be
> > + * greater than zero, and no more than the value reported by the
> > backend in
> > + * "multi-queue-max-queues".
> > + *
> > + * For frontends requesting just one queue, the usual event-
> > channel and
> > + * ring-ref keys are written as before, simplifying the backend
> > processing
> > + * to avoid distinguishing between a frontend that doesn't
> > understand the
> > + * multi-queue feature, and one that does, but requested only one
> > queue.
> > + *
> > + * Frontends requesting two or more queues must not write the
> > toplevel
> > + * event-channel and ring-ref keys, instead writing those keys
> > under
> > sub-keys
> > + * having the name "queue-N" where N is the integer ID of the
> > queue/ring
> > for
> > + * which those keys belong. Queues are indexed from zero.
> > + * For example, a frontend with two queues must write the
> > following set of
> > + * queue-related keys:
> > + *
> > + * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
> > + * /local/domain/1/device/vbd/0/queue-0 = ""
> > + * /local/domain/1/device/vbd/0/queue-0/ring-ref = "<ring-ref#0>"
> > + * /local/domain/1/device/vbd/0/queue-0/event-channel =
> > "<evtchn#0>"
> > + * /local/domain/1/device/vbd/0/queue-1 = ""
> > + * /local/domain/1/device/vbd/0/queue-1/ring-ref = "<ring-ref#1>"
> > + * /local/domain/1/device/vbd/0/queue-1/event-channel =
> > "<evtchn#1>"
> > + *
> > + * It is also possible to use multiple queues/rings together with
> > + * the multi-page ring buffer feature.
> > + * For example, a frontend that requests two queues/rings, each with a
> > + * two-page ring buffer, must write the following set of related keys:
> > + *
> > + * /local/domain/1/device/vbd/0/multi-queue-num-queues = "2"
> > + * /local/domain/1/device/vbd/0/ring-page-order = "1"
> > + * /local/domain/1/device/vbd/0/queue-0 = ""
> > + * /local/domain/1/device/vbd/0/queue-0/ring-ref0 = "<ring-ref#0>"
> > + * /local/domain/1/device/vbd/0/queue-0/ring-ref1 = "<ring-ref#1>"
> > + * /local/domain/1/device/vbd/0/queue-0/event-channel =
> > "<evtchn#0>"
> > + * /local/domain/1/device/vbd/0/queue-1 = ""
> > + * /local/domain/1/device/vbd/0/queue-1/ring-ref0 = "<ring-ref#2>"
> > + * /local/domain/1/device/vbd/0/queue-1/ring-ref1 = "<ring-ref#3>"
> > + * /local/domain/1/device/vbd/0/queue-1/event-channel =
> > "<evtchn#1>"
> > + *
> > + */
> > +
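The key layout above (toplevel keys for a single queue, "queue-N/" sub-keys for two or more, and a numeric suffix only for multi-page rings, per Note 6) can be sketched as a small path builder. This is an illustration only: the function name and signature are hypothetical, and it does not talk to XenStore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the XenStore key that carries the grant
 * reference for page 'page' of queue 'queue'.
 *   - one queue:   keys live directly under 'base'
 *   - >= 2 queues: keys live under "<base>/queue-<N>/"
 *   - one page:    the key is "ring-ref"
 *   - >= 2 pages:  the key is "ring-ref<M>"
 * Returns the snprintf() result (length that was or would be written).
 */
static int sketch_ring_ref_key(char *buf, size_t len, const char *base,
			       unsigned int nr_queues, unsigned int queue,
			       unsigned int nr_pages, unsigned int page)
{
	char prefix[32] = "";
	char suffix[16] = "";

	assert(queue < nr_queues && page < nr_pages);
	if (nr_queues > 1)
		snprintf(prefix, sizeof(prefix), "queue-%u/", queue);
	if (nr_pages > 1)
		snprintf(suffix, sizeof(suffix), "%u", page);
	return snprintf(buf, len, "%s/%sring-ref%s", base, prefix, suffix);
}
```

For the second example in the comment, queue 1 / page 0 yields ".../vbd/0/queue-1/ring-ref0".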
> > +/*
> > + * STATE DIAGRAMS
> > + *
> > + *****************************************************************************
> > + *                                   Startup                                 *
> > + *****************************************************************************
> > + *
> > + * Tool stack creates front and back nodes with state XenbusStateInitialising.
> > + *
> > + * Front                                Back
> > + * =================================    =====================================
> > + * XenbusStateInitialising              XenbusStateInitialising
> > + *  o Query virtual device               o Query backend device identification
> > + *    properties.                          data.
> > + *  o Setup OS device instance.          o Open and validate backend device.
> > + *                                       o Publish backend features and
> > + *                                         transport parameters.
> > + *                                                      |
> > + *                                                      |
> > + *                                                      V
> > + *                                      XenbusStateInitWait
> > + *
> > + * o Query backend features and
> > + *   transport parameters.
> > + * o Allocate and initialize the
> > + *   request ring.
> > + * o Publish transport parameters
> > + *   that will be in effect during
> > + *   this connection.
> > + *              |
> > + *              |
> > + *              V
> > + * XenbusStateInitialised
> > + *
> > + *                                       o Query frontend transport parameters.
> > + *                                       o Connect to the request ring and
> > + *                                         event channel.
> > + *                                       o Publish backend device properties.
> > + *                                                      |
> > + *                                                      |
> > + *                                                      V
> > + *                                      XenbusStateConnected
> > + *
> > + *  o Query backend device properties.
> > + *  o Finalize OS virtual device
> > + *    instance.
> > + *              |
> > + *              |
> > + *              V
> > + * XenbusStateConnected
> > + *
> > + * Note: Drivers that do not support any optional features, or the
> > negotiation
> > + *       of transport parameters, can skip certain states in the
> > state
> > machine:
> > + *
> > + *       o A frontend may transition to XenbusStateInitialised
> > without
> > + *         waiting for the backend to enter
> > XenbusStateInitWait.  In this
> > + *         case, default transport parameters are in effect and
> > any
> > + *         transport parameters published by the frontend must
> > contain
> > + *         their default values.
> > + *
> > + *       o A backend may transition to XenbusStateInitialised,
> > bypassing
> > + *         XenbusStateInitWait, without waiting for the frontend
> > to first
> > + *         enter the XenbusStateInitialised state.  In this case,
> > default
> > + *         transport parameters are in effect and any transport
> > parameters
> > + *         published by the backend must contain their default
> > values.
> > + *
> > + *       Drivers that support optional features and/or transport
> > parameter
> > + *       negotiation must tolerate these additional state
> > transition paths.
> > + *       In general this means performing the work of any skipped
> > state
> > + *       transition, if it has not already been performed, in
> > addition to the
> > + *       work associated with entry into the current state.
> > + */
> > +
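The default frontend path through the startup diagram can be sketched as a tiny transition function. The enum is a local stand-in for the XenbusState* names (it is not the xenbus enum itself), and the optional shortcuts described in the note at the end of the comment are deliberately not modelled:

```c
#include <assert.h>

/* Local stand-ins for the XenbusState* values named in the diagram. */
enum sketch_state {
	ST_INITIALISING,
	ST_INIT_WAIT,
	ST_INITIALISED,
	ST_CONNECTED,
};

/*
 * Next frontend state on the default startup path:
 *   Initialising --(backend reaches InitWait)---> Initialised
 *   Initialised  --(backend reaches Connected)--> Connected
 * Any other combination leaves the frontend where it is.
 */
static enum sketch_state sketch_front_next(enum sketch_state front,
					   enum sketch_state back)
{
	assert(front <= ST_CONNECTED && back <= ST_CONNECTED);
	if (front == ST_INITIALISING && back == ST_INIT_WAIT)
		return ST_INITIALISED;
	if (front == ST_INITIALISED && back == ST_CONNECTED)
		return ST_CONNECTED;
	return front;
}
```

A real driver would do the work listed under each state (query parameters, allocate the ring, publish keys) before reporting the new state.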
> > +/*
> > + * REQUEST CODES.
> > + */
> > +#define BLKIF_OP_READ              0
> > +#define BLKIF_OP_WRITE             1
> > +/*
> > + * All writes issued prior to a request with the
> > BLKIF_OP_WRITE_BARRIER
> > + * operation code ("barrier request") must be completed prior to
> > the
> > + * execution of the barrier request.  All writes issued after the
> > barrier
> > + * request must not execute until after the completion of the
> > barrier request.
> > + *
> > + * Optional.  See "feature-barrier" XenBus node documentation
> > above.
> > + */
> > +#define BLKIF_OP_WRITE_BARRIER     2
> > +/*
> > + * Commit any uncommitted contents of the backing device's
> > volatile cache
> > + * to stable storage.
> > + *
> > + * Optional.  See "feature-flush-cache" XenBus node documentation
> > above.
> > + */
> > +#define BLKIF_OP_FLUSH_DISKCACHE   3
> > +/*
> > + * Used in SLES sources for device specific command packet
> > + * contained within the request. Reserved for that purpose.
> > + */
> > +#define BLKIF_OP_RESERVED_1        4
> > +/*
> > + * Indicate to the backend device that a region of storage is no
> > longer in
> > + * use, and may be discarded at any time without impact to the
> > client.  If
> > + * the BLKIF_DISCARD_SECURE flag is set on the request, all copies
> > of the
> > + * discarded region on the device must be rendered unrecoverable
> > before
> > the
> > + * command returns.
> > + *
> > + * This operation is analogous to performing a trim (ATA) or unmap (SCSI)
> > + * command on a native device.
> > + *
> > + * More information about trim/unmap operations can be found at:
> > + * http://t13.org/Documents/UploadedDocuments/docs2008/
> > + *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
> > + * http://www.seagate.com/staticfiles/support/disc/manuals/
> > + *     Interface%20manuals/100293068c.pdf
> > + *
> > + * Optional.  See "feature-discard", "discard-alignment",
> > + * "discard-granularity", and "discard-secure" in the XenBus node
> > + * documentation above.
> > + */
> > +#define BLKIF_OP_DISCARD           5
> > +
> > +/*
> > + * Recognized if "feature-max-indirect-segments" is present in the backend
> > + * xenbus info. The "feature-max-indirect-segments" node contains the maximum
> > + * number of segments allowed by the backend per request. If the
> > node is
> > + * present, the frontend might use blkif_request_indirect structs
> > in order to
> > + * issue requests with more than BLKIF_MAX_SEGMENTS_PER_REQUEST
> > (11). The
> > + * maximum number of indirect segments is fixed by the backend,
> > but the
> > + * frontend can issue requests with any number of indirect
> > segments as long
> > as
> > + * it's less than the number provided by the backend. The
> > indirect_grefs field
> > + * in blkif_request_indirect should be filled by the frontend with
> > the
> > + * grant references of the pages that are holding the indirect
> > segments.
> > + * These pages are filled with an array of blkif_request_segment
> > that hold
> > the
> > + * information about the segments. The number of indirect pages to
> > use is
> > + * determined by the number of segments an indirect request
> > contains.
> > Every
> > + * indirect page can contain a maximum of
> > + * (PAGE_SIZE / sizeof(struct blkif_request_segment)) segments, so
> > to
> > + * calculate the number of indirect pages to use we have to do
> > + * ceil(indirect_segments / (PAGE_SIZE / sizeof(struct
> > blkif_request_segment))).
> > + *
> > + * If a backend does not recognize BLKIF_OP_INDIRECT, it should
> > *not*
> > + * create the "feature-max-indirect-segments" node!
> > + */
> > +#define BLKIF_OP_INDIRECT          6
> > +
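The ceil() expression in the comment above can be written with integer arithmetic. A minimal sketch, assuming a 4096-byte page and a 32-bit grant_ref_t. The struct below only mirrors the layout of blkif_request_segment for sizing purposes, and every sketch_/SKETCH_ name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Same field layout as blkif_request_segment, used only for sizeof(). */
struct sketch_segment {
	uint32_t gref;		/* assumption: grant_ref_t is 32 bits */
	uint8_t first_sect, last_sect;
};

/* Segments that fit in one indirect page. */
#define SKETCH_SEGS_PER_PAGE \
	(SKETCH_PAGE_SIZE / (unsigned int)sizeof(struct sketch_segment))

/* ceil(nr_segments / SKETCH_SEGS_PER_PAGE) without floating point. */
static unsigned int sketch_indirect_pages(unsigned int nr_segments)
{
	assert(nr_segments <= UINT32_MAX - SKETCH_SEGS_PER_PAGE);
	return (nr_segments + SKETCH_SEGS_PER_PAGE - 1) / SKETCH_SEGS_PER_PAGE;
}
```

The frontend would then fill that many entries of indirect_grefs[], and the result must not exceed BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST.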
> > +/*
> > + * Maximum scatter/gather segments per request.
> > + * This is carefully chosen so that sizeof(blkif_ring_t) <=
> > PAGE_SIZE.
> > + * NB. This could be 12 if the ring indexes weren't stored in the
> > same page.
> > + */
> > +#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
> > +
> > +/*
> > + * Maximum number of indirect pages to use per request.
> > + */
> > +#define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
> > +
> > +/*
> > + * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as
> > well as
> > + * 'sector_number' in blkif_request, blkif_request_discard and
> > + * blkif_request_indirect are sector-based quantities. See the
> > description
> > + * of the "feature-large-sector-size" frontend xenbus node above
> > for
> > + * more information.
> > + */
> > +struct blkif_request_segment {
> > +	grant_ref_t gref;        /* reference to I/O buffer frame        */
> > +	/* @first_sect: first sector in frame to transfer (inclusive).   */
> > +	/* @last_sect: last sector in frame to transfer (inclusive).     */
> > +	u8     first_sect, last_sect;
> > +};
> > +
> > +/*
> > + * Starting ring element for any I/O request.
> > + */
> > +struct blkif_request {
> > +	u8        operation;    /* BLKIF_OP_???                         */
> > +	u8        nr_segments;  /* number of segments                   */
> > +	blkif_vdev_t   handle;       /* only for read/write requests         */
> > +	u64       id;           /* private guest value, echoed in resp  */
> > +	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> > +	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> > +};
> > +
> > +typedef struct blkif_request blkif_request_t;
> > +
> > +/*
> > + * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD
> > + * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
> > + */
> > +struct blkif_request_discard {
> > +	u8        operation;    /* BLKIF_OP_DISCARD                     */
> > +	u8        flag;         /* BLKIF_DISCARD_SECURE or zero         */
> > +#define BLKIF_DISCARD_SECURE (1 << 0)  /* ignored if discard-secure=0 */
> > +	blkif_vdev_t   handle;       /* same as for read/write requests      */
> > +	u64       id;           /* private guest value, echoed in resp  */
> > +	blkif_sector_t sector_number;/* start sector idx on disk             */
> > +	u64       nr_sectors;   /* number of contiguous sectors to discard*/
> > +};
> > +
> > +typedef struct blkif_request_discard blkif_request_discard_t;
> > +
> > +struct blkif_request_indirect {
> > +	u8        operation;    /* BLKIF_OP_INDIRECT                    */
> > +	u8        indirect_op;  /* BLKIF_OP_{READ/WRITE}                */
> > +	u16       nr_segments;  /* number of segments                   */
> > +	u64       id;           /* private guest value, echoed in resp  */
> > +	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> > +	blkif_vdev_t   handle;       /* same as for read/write requests      */
> > +	grant_ref_t indirect_grefs[BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST];
> > +#ifdef __i386__
> > +	u64       pad;          /* Make it 64 byte aligned on i386      */
> > +#endif
> > +};
> > +
> > +typedef struct blkif_request_indirect blkif_request_indirect_t;
> > +
> > +struct blkif_response {
> > +	u64        id;              /* copied from request */
> > +	u8         operation;       /* copied from request */
> > +	s16         status;          /* BLKIF_RSP_???       */
> > +};
> > +
> > +typedef struct blkif_response blkif_response_t;
> > +
> > +/*
> > + * STATUS RETURN CODES.
> > + */
> > + /* Operation not supported (only happens on barrier writes). */
> > +#define BLKIF_RSP_EOPNOTSUPP  -2
> > + /* Operation failed for some unspecified reason (-EIO). */
> > +#define BLKIF_RSP_ERROR       -1
> > + /* Operation completed successfully. */
> > +#define BLKIF_RSP_OKAY         0
> > +
> > +/*
> > + * Generate blkif ring structures and types.
> > + */
> > +DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response);
> > +
> > +#define VDISK_CDROM        0x1
> > +#define VDISK_REMOVABLE    0x2
> > +#define VDISK_READONLY     0x4
> > +
> > +#endif /* __XEN_PUBLIC_IO_BLKIF_H__ */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/include/xen/interface/io/console.h
> > b/include/xen/interface/io/console.h
> > new file mode 100644
> > index 0000000000..3489fc7a60
> > --- /dev/null
> > +++ b/include/xen/interface/io/console.h
> > @@ -0,0 +1,56 @@
> > +/************************************************************
> > ******************
> > + * console.h
> > + *
> > + * Console I/O interface for Xen guest OSes.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2005, Keir Fraser
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_IO_CONSOLE_H__
> > +#define __XEN_PUBLIC_IO_CONSOLE_H__
> > +
> > +typedef u32 XENCONS_RING_IDX;
> > +
> > +#define MASK_XENCONS_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))
> > +
> > +struct xencons_interface {
> > +	char in[1024];
> > +	char out[2048];
> > +	XENCONS_RING_IDX in_cons, in_prod;
> > +	XENCONS_RING_IDX out_cons, out_prod;
> > +};
> > +
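For reference, MASK_XENCONS_IDX relies on the buffer sizes being powers of two, so the free-running producer/consumer indices can be masked instead of reduced modulo the size. A minimal sketch of queueing bytes on the "out" ring under that scheme. The struct is a local copy of the layout above with uint32_t indices, all sketch_/SKETCH_ names are hypothetical, and the memory barriers and event-channel notification a real driver needs are omitted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MASK_IDX(idx, ring) ((idx) & (sizeof(ring) - 1))

/* Local copy of the xencons_interface layout, for illustration only. */
struct sketch_xencons {
	char in[1024];
	char out[2048];
	uint32_t in_cons, in_prod;
	uint32_t out_cons, out_prod;
};

/*
 * Copy up to 'len' bytes into the out ring and publish the new producer
 * index. Returns the number of bytes actually queued, which is less than
 * 'len' when the ring fills up.
 */
static size_t sketch_console_write(struct sketch_xencons *intf,
				   const char *data, size_t len)
{
	uint32_t cons = intf->out_cons;
	uint32_t prod = intf->out_prod;
	size_t sent = 0;

	/* Indices are free-running; their difference is the fill level. */
	assert(prod - cons <= sizeof(intf->out));
	while (sent < len && (prod - cons) < sizeof(intf->out))
		intf->out[SKETCH_MASK_IDX(prod++, intf->out)] = data[sent++];
	intf->out_prod = prod;
	return sent;
}
```

Because the subtraction is done on unsigned 32-bit values, the fill-level check keeps working after the indices wrap past 0xffffffff.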
> > +#ifdef XEN_WANT_FLEX_CONSOLE_RING
> > +#include "ring.h"
> > +DEFINE_XEN_FLEX_RING(xencons);
> > +#endif
> > +
> > +#endif /* __XEN_PUBLIC_IO_CONSOLE_H__ */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/include/xen/interface/io/protocols.h
> > b/include/xen/interface/io/protocols.h
> > new file mode 100644
> > index 0000000000..52b4de0f81
> > --- /dev/null
> > +++ b/include/xen/interface/io/protocols.h
> > @@ -0,0 +1,42 @@
> > +/************************************************************
> > ******************
> > + * protocols.h
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2008, Keir Fraser
> > + */
> > +
> > +#ifndef __XEN_PROTOCOLS_H__
> > +#define __XEN_PROTOCOLS_H__
> > +
> > +#define XEN_IO_PROTO_ABI_X86_32     "x86_32-abi"
> > +#define XEN_IO_PROTO_ABI_X86_64     "x86_64-abi"
> > +#define XEN_IO_PROTO_ABI_ARM        "arm-abi"
> > +
> > +#if defined(__i386__)
> > +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_32
> > +#elif defined(__x86_64__)
> > +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_X86_64
> > +#elif defined(__arm__) || defined(__aarch64__)
> > +# define XEN_IO_PROTO_ABI_NATIVE XEN_IO_PROTO_ABI_ARM
> > +#else
> > +# error arch fixup needed here
> > +#endif
> > +
> > +#endif
> > diff --git a/include/xen/interface/io/ring.h
> > b/include/xen/interface/io/ring.h
> > new file mode 100644
> > index 0000000000..4e02678e3c
> > --- /dev/null
> > +++ b/include/xen/interface/io/ring.h
> > @@ -0,0 +1,479 @@
> > +/************************************************************
> > ******************
> > + * ring.h
> > + *
> > + * Shared producer-consumer ring macros.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Tim Deegan and Andrew Warfield November 2004.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_IO_RING_H__
> > +#define __XEN_PUBLIC_IO_RING_H__
> > +
> > +/*
> > + * When #include'ing this header, you need to provide the following
> > + * declaration upfront:
> > + * - standard integer types (u8, u16, etc)
> > + * They are provided by stdint.h of the standard headers.
> > + *
> > + * In addition, if you intend to use the FLEX macros, you also
> > need to
> > + * provide the following, before invoking the FLEX macros:
> > + * - size_t
> > + * - memcpy
> > + * - grant_ref_t
> > + * These declarations are provided by string.h of the standard
> > headers,
> > + * and grant_table.h from the Xen public headers.
> > + */
> > +
> > +#include <xen/interface/grant_table.h>
> > +
> > +typedef unsigned int RING_IDX;
> > +
> > +/* Round a 32-bit unsigned constant down to the nearest power of two. */
> > +#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2                  : ((_x) & 0x1))
> > +#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2    : __RD2(_x))
> > +#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4    : __RD4(_x))
> > +#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8    : __RD8(_x))
> > +#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))
> > +
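The __RD* chain computes "round down to a power of two" as a constant expression. For checking values by hand, a runtime equivalent is handy. The sketch below is a different technique (bit smearing), not the macro's own method, and the function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round x down to the nearest power of two (0 stays 0), matching what
 * __RD32() yields for the same input.
 */
static inline uint32_t sketch_round_down_pow2(uint32_t x)
{
	/* Smear the highest set bit into every lower position... */
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	/* ...then drop everything below that highest bit. */
	return x - (x >> 1);
}
```

As a worked example, assuming a 4096-byte page and a 112-byte ring entry (an assumption about the 64-bit blkif layout), the 64-byte sring header leaves (4096 - 64) / 112 = 36 entries, which rounds down to a ring of 32.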
> > +/*
> > + * Calculate size of a shared ring, given the total available
> > space for the
> > + * ring and indexes (_sz), and the name tag of the
> > request/response
> > structure.
> > + * A ring contains as many entries as will fit, rounded down to
> > the nearest
> > + * power of two (so we can mask with (size-1) to loop around).
> > + */
> > +#define __CONST_RING_SIZE(_s, _sz) \
> > +	(__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \
> > +		sizeof(((struct _s##_sring *)0)->ring[0])))
> > +/*
> > + * The same for passing in an actual pointer instead of a name
> > tag.
> > + */
> > +#define __RING_SIZE(_s, _sz) \
> > +	(__RD32(((_sz) - (long)(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
> > +
> > +/*
> > + * Macros to make the correct C datatypes for a new kind of ring.
> > + *
> > + * To make a new ring datatype, you need to have two message
> > structures,
> > + * let's say request_t, and response_t already defined.
> > + *
> > + * In a header where you want the ring datatype declared, you then
> > do:
> > + *
> > + *     DEFINE_RING_TYPES(mytag, request_t, response_t);
> > + *
> > + * These expand out to give you a set of types, as you can see
> > below.
> > + * The most important of these are:
> > + *
> > + *     mytag_sring_t      - The shared ring.
> > + *     mytag_front_ring_t - The 'front' half of the ring.
> > + *     mytag_back_ring_t  - The 'back' half of the ring.
> > + *
> > + * To initialize a ring in your code you need to know the location
> > and size
> > + * of the shared memory area (PAGE_SIZE, for instance). To
> > initialise
> > + * the front half:
> > + *
> > + *     mytag_front_ring_t front_ring;
> > + *     SHARED_RING_INIT((mytag_sring_t *)shared_page);
> > + *     FRONT_RING_INIT(&front_ring, (mytag_sring_t *)shared_page,
> > PAGE_SIZE);
> > + *
> > + * Initializing the back follows similarly (note that only the
> > front
> > + * initializes the shared ring):
> > + *
> > + *     mytag_back_ring_t back_ring;
> > + *     BACK_RING_INIT(&back_ring, (mytag_sring_t *)shared_page,
> > PAGE_SIZE);
> > + */
> > +
> > +#define DEFINE_RING_TYPES(__name, __req_t, __rsp_t) \
> > + \
> > +/* Shared ring entry */ \
> > +union __name##_sring_entry { \
> > +	__req_t req; \
> > +	__rsp_t rsp; \
> > +}; \
> > + \
> > +/* Shared ring page */ \
> > +struct __name##_sring { \
> > +	RING_IDX req_prod, req_event; \
> > +	RING_IDX rsp_prod, rsp_event; \
> > +	union { \
> > +		struct { \
> > +			u8 smartpoll_active; \
> > +		} netif; \
> > +		struct { \
> > +			u8 msg; \
> > +		} tapif_user; \
> > +		u8 pvt_pad[4]; \
> > +	} pvt; \
> > +	u8 __pad[44]; \
> > +	union __name##_sring_entry ring[1]; /* variable-length */ \
> > +}; \
> > + \
> > +/* "Front" end's private variables */ \
> > +struct __name##_front_ring { \
> > +	RING_IDX req_prod_pvt; \
> > +	RING_IDX rsp_cons; \
> > +	unsigned int nr_ents; \
> > +	struct __name##_sring *sring; \
> > +}; \
> > + \
> > +/* "Back" end's private variables */ \
> > +struct __name##_back_ring { \
> > +	RING_IDX rsp_prod_pvt; \
> > +	RING_IDX req_cons; \
> > +	unsigned int nr_ents; \
> > +	struct __name##_sring *sring; \
> > +}; \
> > + \
> > +/* Syntactic sugar */ \
> > +typedef struct __name##_sring __name##_sring_t; \
> > +typedef struct __name##_front_ring __name##_front_ring_t; \
> > +typedef struct __name##_back_ring __name##_back_ring_t
> > +
> > +/*
> > + * Macros for manipulating rings.
> > + *
> > + * FRONT_RING_whatever works on the "front end" of a ring: here
> > + * requests are pushed on to the ring and responses taken off it.
> > + *
> > + * BACK_RING_whatever works on the "back end" of a ring: here
> > + * requests are taken off the ring and responses put on.
> > + *
> > + * N.B. these macros do NO INTERLOCKS OR FLOW CONTROL.
> > + * This is OK in 1-for-1 request-response situations where the
> > + * requestor (front end) never has more than RING_SIZE()-1
> > + * outstanding requests.
> > + */
> > +
> > +/* Initialising empty rings */
> > +#define SHARED_RING_INIT(_s) do { \
> > +	(_s)->req_prod  = (_s)->rsp_prod  = 0; \
> > +	(_s)->req_event = (_s)->rsp_event = 1; \
> > +	(void)memset((_s)->pvt.pvt_pad, 0, sizeof((_s)->pvt.pvt_pad)); \
> > +	(void)memset((_s)->__pad, 0, sizeof((_s)->__pad)); \
> > +} while (0)
> > +
> > +#define FRONT_RING_INIT(_r, _s, __size) do { \
> > +	(_r)->req_prod_pvt = 0; \
> > +	(_r)->rsp_cons = 0; \
> > +	(_r)->nr_ents = __RING_SIZE(_s, __size); \
> > +	(_r)->sring = (_s); \
> > +} while (0)
> > +
> > +#define BACK_RING_INIT(_r, _s, __size) do { \
> > +	(_r)->rsp_prod_pvt = 0; \
> > +	(_r)->req_cons = 0; \
> > +	(_r)->nr_ents = __RING_SIZE(_s, __size); \
> > +	(_r)->sring = (_s); \
> > +} while (0)
> > +
> > +/* How big is this ring? */
> > +#define RING_SIZE(_r)
> > \
> > +	((_r)->nr_ents)
> > +
> > +/* Number of free requests (for use on front side only). */
> > +#define RING_FREE_REQUESTS(_r)
> > \
> > +	(RING_SIZE(_r) - ((_r)->req_prod_pvt - (_r)->rsp_cons))
> > +
> > +/* Test if there is an empty slot available on the front ring.
> > + * (This is only meaningful from the front. )
> > + */
> > +#define RING_FULL(_r)
> > \
> > +	(RING_FREE_REQUESTS(_r) == 0)
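
A note on the index arithmetic above: req_prod_pvt and rsp_cons are free-running unsigned counters, so the subtraction in RING_FREE_REQUESTS() stays correct when the counters wrap, and a slot is selected by masking with RING_SIZE() - 1, which only works if the ring size is a power of two. Below is a minimal standalone sketch of that behaviour, not code from the patch; demo_ring, demo_free_requests and demo_slot are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t RING_IDX;

/* Hypothetical stand-in for the front ring's private indices. */
struct demo_ring {
	RING_IDX req_prod_pvt; /* next request slot the front will fill */
	RING_IDX rsp_cons;     /* next response the front will consume */
	unsigned int nr_ents;  /* ring size, must be a power of two */
};

/* Mirrors RING_FREE_REQUESTS(): unsigned subtraction is wrap-safe. */
static unsigned int demo_free_requests(const struct demo_ring *r)
{
	return r->nr_ents - (r->req_prod_pvt - r->rsp_cons);
}

/* Mirrors the masking done by RING_GET_REQUEST(). */
static unsigned int demo_slot(const struct demo_ring *r, RING_IDX idx)
{
	/* Masking is only a valid modulo for power-of-two sizes. */
	assert((r->nr_ents & (r->nr_ents - 1)) == 0);
	return idx & (r->nr_ents - 1);
}
```

With rsp_cons just below the 32-bit limit and req_prod_pvt already wrapped, the difference still gives the number of requests in flight, which is why the macros need no special wrap handling.
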
> > +
> > +/* Test if there are outstanding messages to be processed on a
> > ring. */
> > +#define RING_HAS_UNCONSUMED_RESPONSES(_r)
> > \
> > +	((_r)->sring->rsp_prod - (_r)->rsp_cons)
> > +
> > +#ifdef __GNUC__
> > +#define RING_HAS_UNCONSUMED_REQUESTS(_r)
> > ({                                       \
> > +	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;
> > \
> > +	unsigned int rsp = RING_SIZE(_r) -
> > \
> > +		((_r)->req_cons - (_r)->rsp_prod_pvt);
> > \
> > +	req < rsp ? req : rsp;
> > \
> > +})
> > +#else
> > +/* Same as above, but without the nice GCC ({ ... }) syntax. */
> > +#define RING_HAS_UNCONSUMED_REQUESTS(_r)
> > \
> > +	((((_r)->sring->req_prod - (_r)->req_cons) <
> > \
> > +	  (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?
> > \
> > +	 ((_r)->sring->req_prod - (_r)->req_cons) :
> > \
> > +	 (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
> > +#endif
> > +
> > +/* Direct access to individual ring elements, by index. */
> > +#define RING_GET_REQUEST(_r, _idx)
> > \
> > +	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
> > +
> > +/*
> > + * Get a local copy of a request.
> > + *
> > + * Use this in preference to RING_GET_REQUEST() so all processing
> > is
> > + * done on a local copy that cannot be modified by the other end.
> > + *
> > + * Note that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 may cause this
> > + * to be ineffective where _req is a struct which consists of only bitfields.
> > + */
> > +#define RING_COPY_REQUEST(_r, _idx, _req) do {
> > \
> > +	/* Use volatile to force the copy into _req. */			
> >           \
> > +	*(_req) = *(volatile typeof(_req))RING_GET_REQUEST(_r, _idx);
> > \
> > +} while (0)
> > +
> > +#define RING_GET_RESPONSE(_r, _idx)
> > \
> > +	(&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp))
> > +
> > +/* Loop termination condition: Would the specified index overflow
> > the ring?
> > */
> > +#define RING_REQUEST_CONS_OVERFLOW(_r, _cons)
> > \
> > +	(((_cons) - (_r)->rsp_prod_pvt) >= RING_SIZE(_r))
> > +
> > +/* Ill-behaved frontend determination: Can there be this many
> > requests? */
> > +#define RING_REQUEST_PROD_OVERFLOW(_r, _prod)
> > \
> > +	(((_prod) - (_r)->rsp_prod_pvt) > RING_SIZE(_r))
> > +
> > +#define RING_PUSH_REQUESTS(_r) do
> > {                                               \
> > +	xen_wmb(); /* back sees requests /before/ updated producer
> > index */
> > \
> > +	(_r)->sring->req_prod = (_r)->req_prod_pvt;
> > \
> > +} while (0)
> > +
> > +#define RING_PUSH_RESPONSES(_r) do
> > {                                              \
> > +	xen_wmb(); /* front sees resps /before/ updated producer index
> > */
> > \
> > +	(_r)->sring->rsp_prod = (_r)->rsp_prod_pvt;
> > \
> > +} while (0)
> > +
> > +/*
> > + * Notification hold-off (req_event and rsp_event):
> > + *
> > + * When queueing requests or responses on a shared ring, it may
> > not always
> > be
> > + * necessary to notify the remote end. For example, if requests
> > are in flight
> > + * in a backend, the front may be able to queue further requests
> > without
> > + * notifying the back (if the back checks for new requests when it
> > queues
> > + * responses).
> > + *
> > + * When enqueuing requests or responses:
> > + *
> > + *  Use RING_PUSH_{REQUESTS,RESPONSES}_AND_CHECK_NOTIFY(). The
> > second argument
> > + *  is a boolean return value. True indicates that the receiver
> > requires an
> > + *  asynchronous notification.
> > + *
> > + * After dequeuing requests or responses (before sleeping the
> > connection):
> > + *
> > + *  Use RING_FINAL_CHECK_FOR_REQUESTS() or
> > RING_FINAL_CHECK_FOR_RESPONSES().
> > + *  The second argument is a boolean return value. True indicates
> > that there
> > + *  are pending messages on the ring (i.e., the connection should
> > not be put
> > + *  to sleep).
> > + *
> > + *  These macros will set the req_event/rsp_event field to trigger
> > a
> > + *  notification on the very next message that is enqueued. If you
> > want to
> > + *  create batches of work (i.e., only receive a notification
> > after several
> > + *  messages have been enqueued) then you will need to create a
> > customised
> > + *  version of the FINAL_CHECK macro in your own code, which sets
> > the
> > event
> > + *  field appropriately.
> > + */
> > +
> > +#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do
> > {                     \
> > +	RING_IDX __old = (_r)->sring->req_prod;
> > \
> > +	RING_IDX __new = (_r)->req_prod_pvt;
> > \
> > +	xen_wmb(); /* back sees requests /before/ updated producer
> > index */
> > \
> > +	(_r)->sring->req_prod = __new;
> > \
> > +	xen_mb(); /* back sees new requests /before/ we check req_event
> > */
> > \
> > +	(_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) <
> > \
> > +				 (RING_IDX)(__new -
> > __old));                      \
> > +} while (0)
> > +
> > +#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do
> > {                    \
> > +	RING_IDX __old = (_r)->sring->rsp_prod;
> > \
> > +	RING_IDX __new = (_r)->rsp_prod_pvt;
> > \
> > +	xen_wmb(); /* front sees resps /before/ updated producer index
> > */
> > \
> > +	(_r)->sring->rsp_prod = __new;
> > \
> > +	xen_mb(); /* front sees new resps /before/ we check rsp_event
> > */
> > \
> > +	(_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) <
> > \
> > +				 (RING_IDX)(__new -
> > __old));                      \
> > +} while (0)
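
For anyone reading the notify test in the two macros above: the comparison is true exactly when the event index published by the other end lies in the window (old producer, new producer], evaluated with wrap-safe unsigned arithmetic. A small standalone sketch of just that comparison follows; demo_need_notify is a hypothetical helper, and the memory barriers the real macros issue are deliberately left out.

```c
#include <stdint.h>

typedef uint32_t RING_IDX;

/*
 * Same comparison as the _AND_CHECK_NOTIFY macros: notify only if the
 * event index lies in the half-open window (old_prod, new_prod].
 * Barriers (xen_wmb/xen_mb) are omitted in this sketch.
 */
static int demo_need_notify(RING_IDX old_prod, RING_IDX new_prod,
			    RING_IDX event)
{
	return (RING_IDX)(new_prod - event) < (RING_IDX)(new_prod - old_prod);
}
```

An event index equal to the old producer means the peer was already notified for that slot, so no new notification is sent; an index beyond the new producer means the peer asked to be woken only later.
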
> > +
> > +#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do
> > {                       \
> > +	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);
> > \
> > +	if (_work_to_do)							
> >   \
> > +		break;
> > \
> > +	(_r)->sring->req_event = (_r)->req_cons + 1;
> > \
> > +	xen_mb();
> > \
> > +	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);
> > \
> > +} while (0)
> > +
> > +#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do
> > {                      \
> > +	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r);
> > \
> > +	if (_work_to_do)							
> >   \
> > +		break;
> > \
> > +	(_r)->sring->rsp_event = (_r)->rsp_cons + 1;
> > \
> > +	xen_mb();
> > \
> > +	(_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r);
> > \
> > +} while (0)
> > +
> > +/*
> > + * DEFINE_XEN_FLEX_RING_AND_INTF defines two monodirectional rings
> > and
> > + * functions to check if there is data on the ring, and to read
> > and
> > + * write to them.
> > + *
> > + * DEFINE_XEN_FLEX_RING is similar to
> > DEFINE_XEN_FLEX_RING_AND_INTF, but
> > + * does not define the indexes page. As different protocols can
> > have
> > + * extensions to the basic format, this macro allows them to define
> > their
> > + * own struct.
> > + *
> > + * XEN_FLEX_RING_SIZE
> > + *   Convenience macro to calculate the size of one of the two
> > rings
> > + *   from the overall order.
> > + *
> > + * $NAME_mask
> > + *   Function to apply the size mask to an index, to reduce the
> > index
> > + *   within the range [0-size].
> > + *
> > + * $NAME_read_packet
> > + *   Function to read data from the ring. The amount of data to
> > read is
> > + *   specified by the "size" argument.
> > + *
> > + * $NAME_write_packet
> > + *   Function to write data to the ring. The amount of data to
> > write is
> > + *   specified by the "size" argument.
> > + *
> > + * $NAME_get_ring_ptr
> > + *   Convenience function that returns a pointer to read/write to
> > the
> > + *   ring at the right location.
> > + *
> > + * $NAME_data_intf
> > + *   Indexes page, shared between frontend and backend. It also
> > + *   contains the array of grant refs.
> > + *
> > + * $NAME_queued
> > + *   Function to calculate how many bytes are currently on the
> > ring,
> > + *   ready to be read. It can also be used to calculate how much
> > free
> > + *   space is currently on the ring (XEN_FLEX_RING_SIZE() -
> > + *   $NAME_queued()).
> > + */
> > +
> > +#ifndef XEN_PAGE_SHIFT
> > +/* The PAGE_SIZE for ring protocols and hypercall interfaces is
> > always
> > + * 4K, regardless of the architecture, and page granularity chosen
> > by
> > + * operating systems.
> > + */
> > +#define XEN_PAGE_SHIFT 12
> > +#endif
> > +#define XEN_FLEX_RING_SIZE(order)
> > \
> > +	(1UL << ((order) + XEN_PAGE_SHIFT - 1))
> > +
> > +#define DEFINE_XEN_FLEX_RING(name)
> > \
> > +static inline RING_IDX name##_mask(RING_IDX idx, RING_IDX
> > ring_size)
> > \
> > +{
> >                      \
> > +	return idx & (ring_size - 1);
> > \
> > +}
> > \
> > +									
> > 	  \
> > +static inline unsigned char *name##_get_ring_ptr(unsigned char
> > *buf,
> > \
> > +						 RING_IDX
> > idx,                    \
> > +						 RING_IDX
> > ring_size)              \
> > +{
> >                      \
> > +	return buf + name##_mask(idx, ring_size);
> > \
> > +}
> > \
> > +									
> > 	  \
> > +static inline void name##_read_packet(void *opaque,
> > \
> > +				      const unsigned char
> > *buf,                   \
> > +				      size_t size,
> > \
> > +				      RING_IDX masked_prod,
> > \
> > +				      RING_IDX *masked_cons,
> > \
> > +				      RING_IDX ring_size)
> > \
> > +{
> >                      \
> > +	if (*masked_cons < masked_prod ||
> > \
> > +		size <= ring_size - *masked_cons)
> > {                               \
> > +		memcpy(opaque, buf + *masked_cons, size);
> > \
> > +	} else
> > {
> >      \
> > +		memcpy(opaque, buf + *masked_cons, ring_size -
> > *masked_cons);
> > \
> > +		memcpy((unsigned char *)opaque + ring_size -
> > *masked_cons, buf,
> > \
> > +			   size - (ring_size - *masked_cons));
> > \
> > +	}
> > \
> > +	*masked_cons = name##_mask(*masked_cons + size, ring_size);
> > \
> > +}
> > \
> > +									
> > 	  \
> > +static inline void name##_write_packet(unsigned char *buf,
> > \
> > +				       const void *opaque,
> > \
> > +				       size_t size,
> > \
> > +				       RING_IDX *masked_prod,
> > \
> > +				       RING_IDX masked_cons,
> > \
> > +				       RING_IDX ring_size)
> > \
> > +{
> >                      \
> > +	if (*masked_prod < masked_cons ||
> > \
> > +		size <= ring_size - *masked_prod)
> > {                               \
> > +		memcpy(buf + *masked_prod, opaque, size);
> > \
> > +	} else
> > {
> >      \
> > +		memcpy(buf + *masked_prod, opaque, ring_size -
> > *masked_prod);
> > \
> > +		memcpy(buf, (unsigned char *)opaque + (ring_size -
> > *masked_prod),
> > \
> > +		       size - (ring_size - *masked_prod));
> > \
> > +	}
> > \
> > +	*masked_prod = name##_mask(*masked_prod + size, ring_size);
> > \
> > +}
> > \
> > +									
> > 	  \
> > +static inline RING_IDX name##_queued(RING_IDX prod,
> > \
> > +				     RING_IDX cons,
> > \
> > +				     RING_IDX ring_size)
> > \
> > +{
> >                      \
> > +	RING_IDX size;
> > \
> > +									
> > 	  \
> > +	if (prod == cons)
> > \
> > +		return 0;
> > \
> > +									
> > 	  \
> > +	prod = name##_mask(prod, ring_size);
> > \
> > +	cons = name##_mask(cons, ring_size);
> > \
> > +									
> > 	  \
> > +	if (prod == cons)
> > \
> > +		return ring_size;
> > \
> > +									
> > 	  \
> > +	if (prod > cons)
> > \
> > +		size = prod - cons;
> > \
> > +	else
> > \
> > +		size = ring_size - (cons - prod);
> > \
> > +	return size;
> > \
> > +}
> > \
> > +									
> > 	  \
> > +struct name##_data
> > {
> >  \
> > +	unsigned char *in; /* half of the allocation */
> > \
> > +	unsigned char *out; /* half of the allocation */
> > \
> > +}
> > +
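
To illustrate what the generated $NAME_write_packet and $NAME_read_packet helpers do at the end of the buffer, here is a standalone sketch with the macro expanded by hand for a hypothetical name "demo". It is a simplification, not the patch's code: the masked_cons/masked_prod argument of the originals is dropped, on the assumption that the caller has already checked that enough space or data is available (for example via $NAME_queued).

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t RING_IDX;

static RING_IDX demo_mask(RING_IDX idx, RING_IDX ring_size)
{
	return idx & (ring_size - 1);
}

/* Copy size bytes into the ring, splitting the copy at the end of buf. */
static void demo_write_packet(unsigned char *buf, const void *data,
			      size_t size, RING_IDX *masked_prod,
			      RING_IDX ring_size)
{
	size_t tail = ring_size - *masked_prod;

	if (size <= tail) {
		memcpy(buf + *masked_prod, data, size);
	} else {
		memcpy(buf + *masked_prod, data, tail);
		memcpy(buf, (const unsigned char *)data + tail, size - tail);
	}
	*masked_prod = demo_mask((RING_IDX)(*masked_prod + size), ring_size);
}

/* Copy size bytes out of the ring, joining the two halves if wrapped. */
static void demo_read_packet(void *out, const unsigned char *buf,
			     size_t size, RING_IDX *masked_cons,
			     RING_IDX ring_size)
{
	size_t tail = ring_size - *masked_cons;

	if (size <= tail) {
		memcpy(out, buf + *masked_cons, size);
	} else {
		memcpy(out, buf + *masked_cons, tail);
		memcpy((unsigned char *)out + tail, buf, size - tail);
	}
	*masked_cons = demo_mask((RING_IDX)(*masked_cons + size), ring_size);
}
```

Writing four bytes at masked index 6 of an 8-byte ring places two bytes at the end of the buffer and two at the start, and the matching read reassembles them in order.
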
> > +#define DEFINE_XEN_FLEX_RING_AND_INTF(name)
> > \
> > +struct name##_data_intf
> > {                                                         \
> > +	RING_IDX in_cons, in_prod;
> > \
> > +									
> > 	  \
> > +	u8 pad1[56];
> > \
> > +									
> > 	  \
> > +	RING_IDX out_cons, out_prod;
> > \
> > +									
> > 	  \
> > +	u8 pad2[56];
> > \
> > +									
> > 	  \
> > +	RING_IDX ring_order;
> > \
> > +	grant_ref_t ref[];
> > \
> > +};
> > \
> > +DEFINE_XEN_FLEX_RING(name)
> > +
> > +#endif /* __XEN_PUBLIC_IO_RING_H__ */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 8
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/include/xen/interface/io/xenbus.h
> > b/include/xen/interface/io/xenbus.h
> > new file mode 100644
> > index 0000000000..f452748b03
> > --- /dev/null
> > +++ b/include/xen/interface/io/xenbus.h
> > @@ -0,0 +1,81 @@
> > +/************************************************************
> > *****************
> > + * xenbus.h
> > + *
> > + * Xenbus protocol details.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (C) 2005 XenSource Ltd.
> > + */
> > +
> > +#ifndef _XEN_PUBLIC_IO_XENBUS_H
> > +#define _XEN_PUBLIC_IO_XENBUS_H
> > +
> > +/*
> > + * The state of either end of the Xenbus, i.e. the current
> > communication
> > + * status of initialisation across the bus.  States here imply
> > nothing about
> > + * the state of the connection between the driver and the kernel's
> > device
> > + * layers.
> > + */
> > +enum xenbus_state {
> > +	XenbusStateUnknown       = 0,
> > +
> > +	XenbusStateInitialising  = 1,
> > +
> > +	/*
> > +	 * InitWait: Finished early initialisation but waiting for
> > information
> > +	 * from the peer or hotplug scripts.
> > +	 */
> > +	XenbusStateInitWait      = 2,
> > +
> > +	/*
> > +	 * Initialised: Waiting for a connection from the peer.
> > +	 */
> > +	XenbusStateInitialised   = 3,
> > +
> > +	XenbusStateConnected     = 4,
> > +
> > +	/*
> > +	 * Closing: The device is being closed due to an error or an
> > unplug event.
> > +	 */
> > +	XenbusStateClosing       = 5,
> > +
> > +	XenbusStateClosed        = 6,
> > +
> > +	/*
> > +	 * Reconfiguring: The device is being reconfigured.
> > +	 */
> > +	XenbusStateReconfiguring = 7,
> > +
> > +	XenbusStateReconfigured  = 8
> > +};
> > +
> > +typedef enum xenbus_state XenbusState;
> > +
> > +#endif /* _XEN_PUBLIC_IO_XENBUS_H */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/include/xen/interface/io/xs_wire.h
> > b/include/xen/interface/io/xs_wire.h
> > new file mode 100644
> > index 0000000000..87987334bf
> > --- /dev/null
> > +++ b/include/xen/interface/io/xs_wire.h
> > @@ -0,0 +1,151 @@
> > +/*
> > + * Details of the "wire" protocol between Xen Store Daemon and
> > client
> > + * library or guest kernel.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (C) 2005 Rusty Russell IBM Corporation
> > + */
> > +
> > +#ifndef _XS_WIRE_H
> > +#define _XS_WIRE_H
> > +
> > +enum xsd_sockmsg_type {
> > +	XS_CONTROL,
> > +#define XS_DEBUG XS_CONTROL
> > +	XS_DIRECTORY,
> > +	XS_READ,
> > +	XS_GET_PERMS,
> > +	XS_WATCH,
> > +	XS_UNWATCH,
> > +	XS_TRANSACTION_START,
> > +	XS_TRANSACTION_END,
> > +	XS_INTRODUCE,
> > +	XS_RELEASE,
> > +	XS_GET_DOMAIN_PATH,
> > +	XS_WRITE,
> > +	XS_MKDIR,
> > +	XS_RM,
> > +	XS_SET_PERMS,
> > +	XS_WATCH_EVENT,
> > +	XS_ERROR,
> > +	XS_IS_DOMAIN_INTRODUCED,
> > +	XS_RESUME,
> > +	XS_SET_TARGET,
> > +	/* XS_RESTRICT has been removed */
> > +	XS_RESET_WATCHES = XS_SET_TARGET + 2,
> > +	XS_DIRECTORY_PART,
> > +
> > +	XS_TYPE_COUNT,      /* Number of valid types. */
> > +
> > +	XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
> > +};
> > +
> > +#define XS_WRITE_NONE "NONE"
> > +#define XS_WRITE_CREATE "CREATE"
> > +#define XS_WRITE_CREATE_EXCL "CREATE|EXCL"
> > +
> > +/* We hand errors as strings, for portability. */
> > +struct xsd_errors {
> > +	int errnum;
> > +	const char *errstring;
> > +};
> > +
> > +#ifdef EINVAL
> > +#define XSD_ERROR(x) { x, #x }
> > +/* LINTED: static unused */
> > +static struct xsd_errors xsd_errors[]
> > +#if defined(__GNUC__)
> > +__attribute__((unused))
> > +#endif
> > +	= {
> > +	XSD_ERROR(EINVAL),
> > +	XSD_ERROR(EACCES),
> > +	XSD_ERROR(EEXIST),
> > +	XSD_ERROR(EISDIR),
> > +	XSD_ERROR(ENOENT),
> > +	XSD_ERROR(ENOMEM),
> > +	XSD_ERROR(ENOSPC),
> > +	XSD_ERROR(EIO),
> > +	XSD_ERROR(ENOTEMPTY),
> > +	XSD_ERROR(ENOSYS),
> > +	XSD_ERROR(EROFS),
> > +	XSD_ERROR(EBUSY),
> > +	XSD_ERROR(EAGAIN),
> > +	XSD_ERROR(EISCONN),
> > +	XSD_ERROR(E2BIG)
> > +};
> > +#endif
> > +
> > +struct xsd_sockmsg {
> > +	u32 type;  /* XS_??? */
> > +	u32 req_id;/* Request identifier, echoed in daemon's
> > response.  */
> > +	u32 tx_id; /* Transaction id (0 if not related to a
> > transaction). */
> > +	u32 len;   /* Length of data following this. */
> > +
> > +	/* Generally followed by nul-terminated string(s). */
> > +};
> > +
> > +enum xs_watch_type {
> > +	XS_WATCH_PATH = 0,
> > +	XS_WATCH_TOKEN
> > +};
> > +
> > +/*
> > + * `incontents 150 xenstore_struct XenStore wire protocol.
> > + *
> > + * Inter-domain shared memory communications.
> > + */
> > +#define XENSTORE_RING_SIZE 1024
> > +typedef u32 XENSTORE_RING_IDX;
> > +#define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE - 1))
> > +struct xenstore_domain_interface {
> > +	char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon.
> > */
> > +	char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch
> > events. */
> > +	XENSTORE_RING_IDX req_cons, req_prod;
> > +	XENSTORE_RING_IDX rsp_cons, rsp_prod;
> > +	u32 server_features; /* Bitmap of features supported by the
> > server */
> > +	u32 connection;
> > +};
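
As a reading aid for the interface page above, here is a rough sketch of how a client could queue bytes on the request ring using MASK_XENSTORE_IDX and the free-running req_prod/req_cons indices. It is an illustration under assumptions, not the driver from this series: demo_req_ring and demo_xs_put are hypothetical, only the request half of the page is modelled, and the memory barriers and event-channel notification a real implementation needs are omitted.

```c
#include <stdint.h>
#include <string.h>

#define XENSTORE_RING_SIZE 1024
#define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE - 1))

/* Same layout as the header struct quoted above. */
struct xsd_sockmsg {
	uint32_t type, req_id, tx_id, len;
};

/* Hypothetical local copy of the request half of the interface page. */
struct demo_req_ring {
	char req[XENSTORE_RING_SIZE];
	uint32_t req_cons, req_prod;
};

/*
 * Queue len bytes one at a time; returns 0 on success, -1 if the ring
 * lacks space. A real driver needs a write barrier before publishing
 * req_prod and an event-channel kick afterwards; both are omitted here.
 */
static int demo_xs_put(struct demo_req_ring *r, const void *data,
		       uint32_t len)
{
	const char *p = data;
	uint32_t prod = r->req_prod;
	uint32_t i;

	if (len > XENSTORE_RING_SIZE - (prod - r->req_cons))
		return -1;
	for (i = 0; i < len; i++)
		r->req[MASK_XENSTORE_IDX(prod++)] = p[i];
	r->req_prod = prod;
	return 0;
}
```

A request is then a struct xsd_sockmsg header followed by len bytes of payload, each pushed through the same helper, so a header that straddles the end of req[] needs no special case.
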
> > +
> > +/* Violating this is very bad.  See docs/misc/xenstore.txt. */
> > +#define XENSTORE_PAYLOAD_MAX 4096
> > +
> > +/* Violating these just gets you an error back */
> > +#define XENSTORE_ABS_PATH_MAX 3072
> > +#define XENSTORE_REL_PATH_MAX 2048
> > +
> > +/* The ability to reconnect a ring */
> > +#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
> > +
> > +/* Valid values for the connection field */
> > +#define XENSTORE_CONNECTED 0 /* the steady-state */
> > +#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
> > +
> > +#endif /* _XS_WIRE_H */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 8
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/include/xen/interface/memory.h
> > b/include/xen/interface/memory.h
> > new file mode 100644
> > index 0000000000..19959da8b4
> > --- /dev/null
> > +++ b/include/xen/interface/memory.h
> > @@ -0,0 +1,332 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/************************************************************
> > ******************
> > + * memory.h
> > + *
> > + * Memory reservation and information.
> > + *
> > + * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_MEMORY_H__
> > +#define __XEN_PUBLIC_MEMORY_H__
> > +
> > +/*
> > + * Increase or decrease the specified domain's memory reservation.
> > Returns
> > a
> > + * -ve errcode on failure, or the # extents successfully allocated
> > or freed.
> > + * arg == addr of struct xen_memory_reservation.
> > + */
> > +#define XENMEM_increase_reservation 0
> > +#define XENMEM_decrease_reservation 1
> > +#define XENMEM_populate_physmap     6
> > +struct xen_memory_reservation {
> > +	/*
> > +	 * XENMEM_increase_reservation:
> > +	 *   OUT: MFN (*not* GMFN) bases of extents that were allocated
> > +	 * XENMEM_decrease_reservation:
> > +	 *   IN:  GMFN bases of extents to free
> > +	 * XENMEM_populate_physmap:
> > +	 *   IN:  GPFN bases of extents to populate with memory
> > +	 *   OUT: GMFN bases of extents that were allocated
> > +	 *   (NB. This command also updates the mach_to_phys
> > translation
> > table)
> > +	 */
> > +	GUEST_HANDLE(xen_pfn_t)extent_start;
> > +
> > +	/* Number of extents, and size/alignment of each
> > (2^extent_order
> > pages). */
> > +	xen_ulong_t  nr_extents;
> > +	unsigned int   extent_order;
> > +
> > +	/*
> > +	 * Maximum # bits addressable by the user of the allocated
> > region (e.g.,
> > +	 * I/O devices often have a 32-bit limitation even in 64-bit
> > systems). If
> > +	 * zero then the user has no addressing restriction.
> > +	 * This field is not used by XENMEM_decrease_reservation.
> > +	 */
> > +	unsigned int   address_bits;
> > +
> > +	/*
> > +	 * Domain whose reservation is being changed.
> > +	 * Unprivileged domains can specify only DOMID_SELF.
> > +	 */
> > +	domid_t        domid;
> > +
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_reservation);
> > +
> > +/*
> > + * An atomic exchange of memory pages. If return code is zero then
> > + * @out.extent_list provides GMFNs of the newly-allocated memory.
> > + * Returns zero on complete success, otherwise a negative error
> > code.
> > + * On complete success then always @nr_exchanged ==
> > @in.nr_extents.
> > + * On partial success @nr_exchanged indicates how much work was
> > done.
> > + */
> > +#define XENMEM_exchange             11
> > +struct xen_memory_exchange {
> > +	/*
> > +	 * [IN] Details of memory extents to be exchanged (GMFN bases).
> > +	 * Note that @in.address_bits is ignored and unused.
> > +	 */
> > +	struct xen_memory_reservation in;
> > +
> > +	/*
> > +	 * [IN/OUT] Details of new memory extents.
> > +	 * We require that:
> > +	 *  1. @in.domid == @out.domid
> > +	 *  2. @in.nr_extents  << @in.extent_order ==
> > +	 *     @out.nr_extents << @out.extent_order
> > +	 *  3. @in.extent_start and @out.extent_start lists must not
> > overlap
> > +	 *  4. @out.extent_start lists GPFN bases to be populated
> > +	 *  5. @out.extent_start is overwritten with allocated GMFN
> > bases
> > +	 */
> > +	struct xen_memory_reservation out;
> > +
> > +	/*
> > +	 * [OUT] Number of input extents that were successfully
> > exchanged:
> > +	 *  1. The first @nr_exchanged input extents were successfully
> > +	 *     deallocated.
> > +	 *  2. The corresponding first entries in the output extent
> > list correctly
> > +	 *     indicate the GMFNs that were successfully exchanged.
> > +	 *  3. All other input and output extents are untouched.
> > +	 *  4. If not all input extents are exchanged then the return
> > code of this
> > +	 *     command will be non-zero.
> > +	 *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
> > +	 */
> > +	xen_ulong_t nr_exchanged;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_exchange);
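
Requirement 2 in the comment above says both sides of the exchange must cover the same number of pages. A tiny sketch of a caller-side sanity check follows; demo_exchange_sizes_match is a hypothetical helper, and plain integers stand in for the nr_extents and extent_order fields of the two xen_memory_reservation structs.

```c
#include <stdint.h>

/*
 * Requirement 2 of XENMEM_exchange: in.nr_extents << in.extent_order
 * must equal out.nr_extents << out.extent_order.
 */
static int demo_exchange_sizes_match(uint64_t in_nr, unsigned int in_order,
				     uint64_t out_nr, unsigned int out_order)
{
	return (in_nr << in_order) == (out_nr << out_order);
}
```

For example, trading 512 order-0 extents for a single order-9 extent satisfies the requirement, while asking for a single order-8 extent in return does not.
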
> > +/*
> > + * Returns the maximum machine frame number of mapped RAM in this
> > system.
> > + * This command always succeeds (it never returns an error code).
> > + * arg == NULL.
> > + */
> > +#define XENMEM_maximum_ram_page     2
> > +
> > +/*
> > + * Returns the current or maximum memory reservation, in pages, of
> > the
> > + * specified domain (may be DOMID_SELF). Returns -ve errcode on
> > failure.
> > + * arg == addr of domid_t.
> > + */
> > +#define XENMEM_current_reservation  3
> > +#define XENMEM_maximum_reservation  4
> > +
> > +/*
> > + * Returns a list of MFN bases of 2MB extents comprising the
> > machine_to_phys
> > + * mapping table. Architectures which do not have a m2p table do
> > not
> > implement
> > + * this command.
> > + * arg == addr of xen_machphys_mfn_list_t.
> > + */
> > +#define XENMEM_machphys_mfn_list    5
> > +struct xen_machphys_mfn_list {
> > +	/*
> > +	 * Size of the 'extent_start' array. Fewer entries will be
> > filled if the
> > +	 * machphys table is smaller than max_extents * 2MB.
> > +	 */
> > +	unsigned int max_extents;
> > +
> > +	/*
> > +	 * Pointer to buffer to fill with list of extent starts. If
> > there are
> > +	 * any large discontiguities in the machine address space, 2MB
> > gaps in
> > +	 * the machphys table will be represented by an MFN base of
> > zero.
> > +	 */
> > +	GUEST_HANDLE(xen_pfn_t)extent_start;
> > +
> > +	/*
> > +	 * Number of extents written to the above array. This will be
> > smaller
> > +	 * than 'max_extents' if the machphys table is smaller than
> > max_extents *
> > 2MB.
> > +	 */
> > +	unsigned int nr_extents;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mfn_list);
> > +
> > +/*
> > + * Returns the location in virtual address space of the
> > machine_to_phys
> > + * mapping table. Architectures which do not have a m2p table, or
> > which do
> > not
> > + * map it by default into guest address space, do not implement
> > this
> > command.
> > + * arg == addr of xen_machphys_mapping_t.
> > + */
> > +#define XENMEM_machphys_mapping     12
> > +struct xen_machphys_mapping {
> > +	xen_ulong_t v_start, v_end; /* Start and end virtual
> > addresses.   */
> > +	xen_ulong_t max_mfn;        /* Maximum MFN that can be looked
> > up.
> > */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mapping_t);
> > +
> > +#define XENMAPSPACE_shared_info  0 /* shared info page */
> > +#define XENMAPSPACE_grant_table  1 /* grant table page */
> > +#define XENMAPSPACE_gmfn         2 /* GMFN */
> > +#define XENMAPSPACE_gmfn_range   3 /* GMFN range,
> > XENMEM_add_to_physmap only. */
> > +#define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
> > +				    * XENMEM_add_to_physmap_range only.
> > +				    */
> > +#define XENMAPSPACE_dev_mmio     5 /* device mmio region */
> > +
> > +/*
> > + * Sets the GPFN at which a particular page appears in the
> > specified guest's
> > + * pseudophysical address space.
> > + * arg == addr of xen_add_to_physmap_t.
> > + */
> > +#define XENMEM_add_to_physmap      7
> > +struct xen_add_to_physmap {
> > +	/* Which domain to change the mapping for. */
> > +	domid_t domid;
> > +
> > +	/* Number of pages to go through for gmfn_range */
> > +	u16    size;
> > +
> > +	/* Source mapping space. */
> > +	unsigned int space;
> > +
> > +	/* Index into source mapping space. */
> > +	xen_ulong_t idx;
> > +
> > +	/* GPFN where the source mapping page should appear. */
> > +	xen_pfn_t gpfn;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap);
> > +
> > +/*** REMOVED ***/
> > +/*#define XENMEM_translate_gpfn_list  8*/
> > +
> > +#define XENMEM_add_to_physmap_range 23
> > +struct xen_add_to_physmap_range {
> > +	/* IN */
> > +	/* Which domain to change the mapping for. */
> > +	domid_t domid;
> > +	u16 space; /* => enum phys_map_space */
> > +
> > +	/* Number of pages to go through */
> > +	u16 size;
> > +	domid_t foreign_domid; /* IFF gmfn_foreign */
> > +
> > +	/* Indexes into space being mapped. */
> > +	GUEST_HANDLE(xen_ulong_t)idxs;
> > +
> > +	/* GPFN in domid where the source mapping page should appear.
> > */
> > +	GUEST_HANDLE(xen_pfn_t)gpfns;
> > +
> > +	/* OUT */
> > +
> > +	/* Per index error code. */
> > +	GUEST_HANDLE(int)errs;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap_range);
> > +
> > +/*
> > + * Returns the pseudo-physical memory map as it was when the
> > domain
> > + * was started (specified by XENMEM_set_memory_map).
> > + * arg == addr of struct xen_memory_map.
> > + */
> > +#define XENMEM_memory_map           9
> > +struct xen_memory_map {
> > +	/*
> > +	 * On call the number of entries which can be stored in buffer.
> > On
> > +	 * return the number of entries which have been stored in
> > +	 * buffer.
> > +	 */
> > +	unsigned int nr_entries;
> > +
> > +	/*
> > +	 * Entries in the buffer are in the same format as returned by
> > the
> > +	 * BIOS INT 0x15 EAX=0xE820 call.
> > +	 */
> > +	GUEST_HANDLE(void)buffer;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_memory_map);
> > +
> > +/*
> > + * Returns the real physical memory map. Passes the same structure
> > as
> > + * XENMEM_memory_map.
> > + * arg == addr of struct xen_memory_map.
> > + */
> > +#define XENMEM_machine_memory_map   10
> > +
> > +/*
> > + * Unmaps the page appearing at a particular GPFN from the
> > specified
> > guest's
> > + * pseudophysical address space.
> > + * arg == addr of xen_remove_from_physmap_t.
> > + */
> > +#define XENMEM_remove_from_physmap      15
> > +struct xen_remove_from_physmap {
> > +	/* Which domain to change the mapping for. */
> > +	domid_t domid;
> > +
> > +	/* GPFN of the current mapping of the page. */
> > +	xen_pfn_t gpfn;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap);
> > +
> > +/*
> > + * Get the pages for a particular guest resource, so that they can
> > be
> > + * mapped directly by a tools domain.
> > + */
> > +#define XENMEM_acquire_resource 28
> > +struct xen_mem_acquire_resource {
> > +	/* IN - The domain whose resource is to be mapped */
> > +	domid_t domid;
> > +	/* IN - the type of resource */
> > +	u16 type;
> > +
> > +#define XENMEM_resource_ioreq_server 0
> > +#define XENMEM_resource_grant_table 1
> > +
> > +	/*
> > +	 * IN - a type-specific resource identifier, which must be zero
> > +	 *      unless stated otherwise.
> > +	 *
> > +	 * type == XENMEM_resource_ioreq_server -> id == ioreq server
> > id
> > +	 * type == XENMEM_resource_grant_table -> id defined below
> > +	 */
> > +	u32 id;
> > +
> > +#define XENMEM_resource_grant_table_id_shared 0
> > +#define XENMEM_resource_grant_table_id_status 1
> > +
> > +	/* IN/OUT - As an IN parameter number of frames of the resource
> > +	 *          to be mapped. However, if the specified value is 0
> > and
> > +	 *          frame_list is NULL then this field will be set to
> > the
> > +	 *          maximum value supported by the implementation on
> > return.
> > +	 */
> > +	u32 nr_frames;
> > +	/*
> > +	 * OUT - Must be zero on entry. On return this may contain a
> > bitwise
> > +	 *       OR of the following values.
> > +	 */
> > +	u32 flags;
> > +
> > +	/* The resource pages have been assigned to the calling domain
> > */
> > +#define _XENMEM_rsrc_acq_caller_owned 0
> > +#define XENMEM_rsrc_acq_caller_owned (1u << _XENMEM_rsrc_acq_caller_owned)
> > +
> > +	/*
> > +	 * IN - the index of the initial frame to be mapped. This
> > parameter
> > +	 *      is ignored if nr_frames is 0.
> > +	 */
> > +	u64 frame;
> > +
> > +#define XENMEM_resource_ioreq_server_frame_bufioreq 0
> > +#define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))
> > +
> > +	/*
> > +	 * IN/OUT - If the tools domain is PV then, upon return,
> > frame_list
> > +	 *          will be populated with the MFNs of the resource.
> > +	 *          If the tools domain is HVM then it is expected
> > that, on
> > +	 *          entry, frame_list will be populated with a list of
> > GFNs
> > +	 *          that will be mapped to the MFNs of the resource.
> > +	 *          If -EIO is returned then the frame_list has only
> > been
> > +	 *          partially mapped and it is up to the caller to
> > unmap all
> > +	 *          the GFNs.
> > +	 *          This parameter may be NULL if nr_frames is 0.
> > +	 */
> > +	GUEST_HANDLE(xen_pfn_t)frame_list;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(xen_mem_acquire_resource);
> > +
> > +#endif /* __XEN_PUBLIC_MEMORY_H__ */
> > diff --git a/include/xen/interface/sched.h
> > b/include/xen/interface/sched.h
> > new file mode 100644
> > index 0000000000..0f12dcf267
> > --- /dev/null
> > +++ b/include/xen/interface/sched.h
> > @@ -0,0 +1,188 @@
> > +/************************************************************
> > ******************
> > + * sched.h
> > + *
> > + * Scheduler state interactions
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_SCHED_H__
> > +#define __XEN_PUBLIC_SCHED_H__
> > +
> > +#include <xen/interface/event_channel.h>
> > +
> > +/*
> > + * Guest Scheduler Operations
> > + *
> > + * The SCHEDOP interface provides mechanisms for a guest to
> > interact
> > + * with the scheduler, including yield, blocking and shutting
> > itself
> > + * down.
> > + */
> > +
> > +/*
> > + * The prototype for this hypercall is:
> > + * long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
> > + *
> > + * @cmd == SCHEDOP_??? (scheduler operation).
> > + * @arg == Operation-specific extra argument(s), as described
> > below.
> > + * ...  == Additional Operation-specific extra arguments,
> > described below.
> > + *
> > + * Versions of Xen prior to 3.0.2 provided only the following
> > legacy version
> > + * of this hypercall, supporting only the commands yield, block
> > and
> > shutdown:
> > + *  long sched_op(int cmd, unsigned long arg)
> > + * @cmd == SCHEDOP_??? (scheduler operation).
> > + * @arg == 0               (SCHEDOP_yield and SCHEDOP_block)
> > + *      == SHUTDOWN_* code (SCHEDOP_shutdown)
> > + *
> > + * This legacy version is available to new guests as:
> > + * long HYPERVISOR_sched_op_compat(enum sched_op cmd, unsigned
> > long
> > arg)
> > + */
> > +
> > +/*
> > + * Voluntarily yield the CPU.
> > + * @arg == NULL.
> > + */
> > +#define SCHEDOP_yield       0
> > +
> > +/*
> > + * Block execution of this VCPU until an event is received for
> > processing.
> > + * If called with event upcalls masked, this operation will
> > atomically
> > + * reenable event delivery and check for pending events before
> > blocking the
> > + * VCPU. This avoids a "wakeup waiting" race.
> > + * @arg == NULL.
> > + */
> > +#define SCHEDOP_block       1
> > +
> > +/*
> > + * Halt execution of this domain (all VCPUs) and notify the system
> > controller.
> > + * @arg == pointer to sched_shutdown structure.
> > + *
> > + * If the sched_shutdown_t reason is SHUTDOWN_suspend then
> > + * x86 PV guests must also set RDX (EDX for 32-bit guests) to the
> > MFN
> > + * of the guest's start info page.  RDX/EDX is the third hypercall
> > + * argument.
> > + *
> > + * In addition, when reason is SHUTDOWN_suspend this hypercall
> > + * returns 1 if suspend was cancelled or the domain was merely
> > + * checkpointed, and 0 if it is resuming in a new domain.
> > + */
> > +#define SCHEDOP_shutdown    2
> > +
> > +/*
> > + * Poll a set of event-channel ports. Return when one or more are
> > pending.
> > An
> > + * optional timeout may be specified.
> > + * @arg == pointer to sched_poll structure.
> > + */
> > +#define SCHEDOP_poll        3
> > +
> > +/*
> > + * Declare a shutdown for another domain. The main use of this
> > function is
> > + * in interpreting shutdown requests and reasons for fully-
> > virtualized
> > + * domains.  A para-virtualized domain may use SCHEDOP_shutdown
> > directly.
> > + * @arg == pointer to sched_remote_shutdown structure.
> > + */
> > +#define SCHEDOP_remote_shutdown        4
> > +
> > +/*
> > + * Latch a shutdown code, so that when the domain later shuts down
> > it
> > + * reports this code to the control tools.
> > + * @arg == sched_shutdown, as for SCHEDOP_shutdown.
> > + */
> > +#define SCHEDOP_shutdown_code 5
> > +
> > +/*
> > + * Setup, poke and destroy a domain watchdog timer.
> > + * @arg == pointer to sched_watchdog structure.
> > + * With id == 0, setup a domain watchdog timer to cause domain
> > shutdown
> > + *               after timeout, returns watchdog id.
> > + * With id != 0 and timeout == 0, destroy domain watchdog timer.
> > + * With id != 0 and timeout != 0, poke watchdog timer and set new
> > timeout.
> > + */
> > +#define SCHEDOP_watchdog    6
> > +
> > +/*
> > + * Override the current vcpu affinity by pinning it to one
> > physical cpu or
> > + * undo this override restoring the previous affinity.
> > + * @arg == pointer to sched_pin_override structure.
> > + *
> > + * A negative pcpu value will undo a previous pin override and
> > restore the
> > + * previous cpu affinity.
> > + * This call is allowed for the hardware domain only and requires
> > the cpu
> > + * to be part of the domain's cpupool.
> > + */
> > +#define SCHEDOP_pin_override 7
> > +
> > +struct sched_shutdown {
> > +	unsigned int reason; /* SHUTDOWN_* => shutdown reason */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(sched_shutdown);
> > +
> > +struct sched_poll {
> > +	GUEST_HANDLE(evtchn_port_t)ports;
> > +	unsigned int nr_ports;
> > +	u64 timeout;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(sched_poll);
> > +
> > +struct sched_remote_shutdown {
> > +	domid_t domain_id;         /* Remote domain ID */
> > +	unsigned int reason;       /* SHUTDOWN_* => shutdown reason */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(sched_remote_shutdown);
> > +
> > +struct sched_watchdog {
> > +	u32 id;                /* watchdog ID */
> > +	u32 timeout;           /* timeout */
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(sched_watchdog);
> > +
> > +struct sched_pin_override {
> > +	s32 pcpu;
> > +};
> > +
> > +DEFINE_GUEST_HANDLE_STRUCT(sched_pin_override);
> > +
> > +/*
> > + * Reason codes for SCHEDOP_shutdown. These may be interpreted by
> > control
> > + * software to determine the appropriate action. For the most
> > part, Xen does
> > + * not care about the shutdown code.
> > + */
> > +#define SHUTDOWN_poweroff   0  /* Domain exited normally. Clean up
> > and kill. */
> > +#define SHUTDOWN_reboot     1  /* Clean up, kill, and then
> > restart.
> > */
> > +#define SHUTDOWN_suspend    2  /* Clean up, save suspend info,
> > kill.
> > */
> > +#define SHUTDOWN_crash      3  /* Tell controller we've crashed.
> > */
> > +#define SHUTDOWN_watchdog   4  /* Restart because watchdog time
> > expired.     */
> > +
> > +/*
> > + * Domain asked to perform 'soft reset' for it. The expected
> > behavior is to
> > + * reset internal Xen state for the domain returning it to the
> > point where it
> > + * was created but leaving the domain's memory contents and vCPU
> > contexts
> > + * intact. This will allow the domain to start over and set up all
> > Xen specific
> > + * interfaces again.
> > + */
> > +#define SHUTDOWN_soft_reset 5
> > +#define SHUTDOWN_MAX        5  /* Maximum valid shutdown reason.
> > */
> > +
> > +#endif /* __XEN_PUBLIC_SCHED_H__ */
> > diff --git a/include/xen/interface/xen.h
> > b/include/xen/interface/xen.h
> > new file mode 100644
> > index 0000000000..964daaedfb
> > --- /dev/null
> > +++ b/include/xen/interface/xen.h
> > @@ -0,0 +1,225 @@
> > +/************************************************************
> > ******************
> > + * xen.h
> > + *
> > + * Guest OS interface to Xen.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > copy
> > + * of this software and associated documentation files (the
> > "Software"), to
> > + * deal in the Software without restriction, including without
> > limitation the
> > + * rights to use, copy, modify, merge, publish, distribute,
> > sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the
> > Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
> > EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright (c) 2004, K A Fraser
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_XEN_H__
> > +#define __XEN_PUBLIC_XEN_H__
> > +
> > +#include <xen/arm/interface.h>
> > +
> > +/*
> > + * XEN "SYSTEM CALLS" (a.k.a. HYPERCALLS).
> > + */
> > +
> > +/*
> > + * x86_32: EAX = vector; EBX, ECX, EDX, ESI, EDI = args 1, 2, 3,
> > 4, 5.
> > + *         EAX = return value
> > + *         (argument registers may be clobbered on return)
> > + * x86_64: RAX = vector; RDI, RSI, RDX, R10, R8, R9 = args 1, 2,
> > 3, 4, 5, 6.
> > + *         RAX = return value
> > + *         (argument registers not clobbered on return; RCX, R11
> > are)
> > + */
> > +#define __HYPERVISOR_set_trap_table        0
> > +#define __HYPERVISOR_mmu_update            1
> > +#define __HYPERVISOR_set_gdt               2
> > +#define __HYPERVISOR_stack_switch          3
> > +#define __HYPERVISOR_set_callbacks         4
> > +#define __HYPERVISOR_fpu_taskswitch        5
> > +#define __HYPERVISOR_sched_op_compat       6
> > +#define __HYPERVISOR_platform_op           7
> > +#define __HYPERVISOR_set_debugreg          8
> > +#define __HYPERVISOR_get_debugreg          9
> > +#define __HYPERVISOR_update_descriptor    10
> > +#define __HYPERVISOR_memory_op            12
> > +#define __HYPERVISOR_multicall            13
> > +#define __HYPERVISOR_update_va_mapping    14
> > +#define __HYPERVISOR_set_timer_op         15
> > +#define __HYPERVISOR_event_channel_op_compat 16
> > +#define __HYPERVISOR_xen_version          17
> > +#define __HYPERVISOR_console_io           18
> > +#define __HYPERVISOR_physdev_op_compat    19
> > +#define __HYPERVISOR_grant_table_op       20
> > +#define __HYPERVISOR_vm_assist            21
> > +#define __HYPERVISOR_update_va_mapping_otherdomain 22
> > +#define __HYPERVISOR_iret                 23 /* x86 only */
> > +#define __HYPERVISOR_vcpu_op              24
> > +#define __HYPERVISOR_set_segment_base     25 /* x86/64 only */
> > +#define __HYPERVISOR_mmuext_op            26
> > +#define __HYPERVISOR_xsm_op               27
> > +#define __HYPERVISOR_nmi_op               28
> > +#define __HYPERVISOR_sched_op             29
> > +#define __HYPERVISOR_callback_op          30
> > +#define __HYPERVISOR_xenoprof_op          31
> > +#define __HYPERVISOR_event_channel_op     32
> > +#define __HYPERVISOR_physdev_op           33
> > +#define __HYPERVISOR_hvm_op               34
> > +#define __HYPERVISOR_sysctl               35
> > +#define __HYPERVISOR_domctl               36
> > +#define __HYPERVISOR_kexec_op             37
> > +#define __HYPERVISOR_tmem_op              38
> > +#define __HYPERVISOR_xc_reserved_op       39 /* reserved for
> > XenClient */
> > +#define __HYPERVISOR_xenpmu_op            40
> > +#define __HYPERVISOR_dm_op                41
> > +
> > +/* Architecture-specific hypercall definitions. */
> > +#define __HYPERVISOR_arch_0               48
> > +#define __HYPERVISOR_arch_1               49
> > +#define __HYPERVISOR_arch_2               50
> > +#define __HYPERVISOR_arch_3               51
> > +#define __HYPERVISOR_arch_4               52
> > +#define __HYPERVISOR_arch_5               53
> > +#define __HYPERVISOR_arch_6               54
> > +#define __HYPERVISOR_arch_7               55
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +typedef u16 domid_t;
> > +
> > +/* Domain ids >= DOMID_FIRST_RESERVED cannot be used for ordinary
> > domains. */
> > +#define DOMID_FIRST_RESERVED (0x7FF0U)
> > +
> > +/* DOMID_SELF is used in certain contexts to refer to oneself. */
> > +#define DOMID_SELF (0x7FF0U)
> > +
> > +/*
> > + * DOMID_IO is used to restrict page-table updates to mapping I/O
> > memory.
> > + * Although no Foreign Domain need be specified to map I/O pages,
> > DOMID_IO
> > + * is useful to ensure that no mappings to the OS's own heap are
> > accidentally
> > + * installed. (e.g., in Linux this could cause havoc as reference
> > counts
> > + * aren't adjusted on the I/O-mapping code path).
> > + * This only makes sense in MMUEXT_SET_FOREIGNDOM, but in that
> > context can
> > + * be specified by any calling domain.
> > + */
> > +#define DOMID_IO   (0x7FF1U)
> > +
> > +/*
> > + * DOMID_XEN is used to allow privileged domains to map restricted
> > parts of
> > + * Xen's heap space (e.g., the machine_to_phys table).
> > + * This only makes sense in MMUEXT_SET_FOREIGNDOM, and is only
> > permitted if
> > + * the caller is privileged.
> > + */
> > +#define DOMID_XEN  (0x7FF2U)
> > +
> > +/* DOMID_COW is used as the owner of sharable pages */
> > +#define DOMID_COW  (0x7FF3U)
> > +
> > +/* DOMID_INVALID is used to identify pages with unknown owner. */
> > +#define DOMID_INVALID (0x7FF4U)
> > +
> > +/* Idle domain. */
> > +#define DOMID_IDLE (0x7FFFU)
> > +
> > +struct vcpu_info {
> > +	/*
> > +	 * 'evtchn_upcall_pending' is written non-zero by Xen to
> > indicate
> > +	 * a pending notification for a particular VCPU. It is then
> > cleared
> > +	 * by the guest OS /before/ checking for pending work, thus
> > avoiding
> > +	 * a set-and-check race. Note that the mask is only accessed by
> > Xen
> > +	 * on the CPU that is currently hosting the VCPU. This means
> > that the
> > +	 * pending and mask flags can be updated by the guest without
> > special
> > +	 * synchronisation (i.e., no need for the x86 LOCK prefix).
> > +	 * This may seem suboptimal because if the pending flag is set
> > by
> > +	 * a different CPU then an IPI may be scheduled even when the
> > mask
> > +	 * is set. However, note:
> > +	 *  1. The task of 'interrupt holdoff' is covered by the per-
> > event-
> > +	 *     channel mask bits. A 'noisy' event that is continually
> > being
> > +	 *     triggered can be masked at source at this very precise
> > +	 *     granularity.
> > +	 *  2. The main purpose of the per-VCPU mask is therefore to
> > restrict
> > +	 *     reentrant execution: whether for concurrency control, or
> > to
> > +	 *     prevent unbounded stack usage. Whatever the purpose, we
> > expect
> > +	 *     that the mask will be asserted only for short periods at
> > a time,
> > +	 *     and so the likelihood of a 'spurious' IPI is suitably
> > small.
> > +	 * The mask is read before making an event upcall to the guest:
> > a
> > +	 * non-zero mask therefore guarantees that the VCPU will not
> > receive
> > +	 * an upcall activation. The mask is cleared when the VCPU
> > requests
> > +	 * to block: this avoids wakeup-waiting races.
> > +	 */
> > +	u8 evtchn_upcall_pending;
> > +	u8 evtchn_upcall_mask;
> > +	xen_ulong_t evtchn_pending_sel;
> > +	struct arch_vcpu_info arch;
> > +	struct pvclock_vcpu_time_info time;
> > +}; /* 64 bytes (x86) */
> > +
> > +/*
> > + * Xen/kernel shared data -- pointer provided in start_info.
> > + * NB. We expect that this struct is smaller than a page.
> > + */
> > +struct shared_info {
> > +	struct vcpu_info vcpu_info[MAX_VIRT_CPUS];
> > +
> > +	/*
> > +	 * A domain can create "event channels" on which it can send
> > and
> > receive
> > +	 * asynchronous event notifications. There are three classes of
> > event
> > that
> > +	 * are delivered by this mechanism:
> > +	 *  1. Bi-directional inter- and intra-domain connections.
> > Domains must
> > +	 *     arrange out-of-band to set up a connection (usually by
> > allocating
> > +	 *     an unbound 'listener' port and advertising that via a
> > storage
> > service
> > +	 *     such as xenstore).
> > +	 *  2. Physical interrupts. A domain with suitable hardware-
> > access
> > +	 *     privileges can bind an event-channel port to a physical
> > interrupt
> > +	 *     source.
> > +	 *  3. Virtual interrupts ('events'). A domain can bind an
> > event-channel
> > +	 *     port to a virtual interrupt source, such as the virtual-
> > timer
> > +	 *     device or the emergency console.
> > +	 *
> > +	 * Event channels are addressed by a "port index". Each channel
> > is
> > +	 * associated with two bits of information:
> > +	 *  1. PENDING -- notifies the domain that there is a pending
> > notification
> > +	 *     to be processed. This bit is cleared by the guest.
> > +	 *  2. MASK -- if this bit is clear then a 0->1 transition of
> > PENDING
> > +	 *     will cause an asynchronous upcall to be scheduled. This
> > bit is
> > only
> > +	 *     updated by the guest. It is read-only within Xen. If a
> > channel
> > +	 *     becomes pending while the channel is masked then the
> > 'edge' is
> > lost
> > +	 *     (i.e., when the channel is unmasked, the guest must
> > manually
> > handle
> > +	 *     pending notifications as no upcall will be scheduled by
> > Xen).
> > +	 *
> > +	 * To expedite scanning of pending notifications, any 0->1
> > pending
> > +	 * transition on an unmasked channel causes a corresponding bit
> > in a
> > +	 * per-vcpu selector word to be set. Each bit in the selector
> > covers a
> > +	 * 'C long' in the PENDING bitfield array.
> > +	 */
> > +	xen_ulong_t evtchn_pending[sizeof(xen_ulong_t) * 8];
> > +	xen_ulong_t evtchn_mask[sizeof(xen_ulong_t) * 8];
> > +
> > +	/*
> > +	 * Wallclock time: updated only by control software. Guests
> > should base
> > +	 * their gettimeofday() syscall on this wallclock-base value.
> > +	 */
> > +	struct pvclock_wall_clock wc;
> > +
> > +	struct arch_shared_info arch;
> > +
> > +};
> > +
> > +#else /* __ASSEMBLY__ */
> > +
> > +/* In assembly code we cannot use C numeric constant suffixes. */
> > +#define mk_unsigned_long(x) x
> > +
> > +#endif /* !__ASSEMBLY__ */
> > +
> > +#endif /* __XEN_PUBLIC_XEN_H__ */
> > --
> > 2.17.1
> 
> 


* [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-03 12:59     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 12:59 UTC (permalink / raw)
  To: u-boot

Hello Simon,

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> Hi Anastasiia,
> 
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <
> vicooodin at gmail.com> wrote:
> > 
> > From: Peng Fan <peng.fan@nxp.com>
> > 
> > Add support for Xen para-virtualized serial driver. This
> > driver fully supports serial console for the virtual machine.
> > 
> > > Please note that as the driver is initialized late, no banner
> > > or memory size is visible.
> > 
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >  arch/arm/Kconfig                          |   1 +
> >  board/xen/xenguest_arm64/xenguest_arm64.c |  31 +++-
> >  configs/xenguest_arm64_defconfig          |   4 +-
> >  drivers/serial/Kconfig                    |   7 +
> >  drivers/serial/Makefile                   |   1 +
> >  drivers/serial/serial_xen.c               | 175
> > ++++++++++++++++++++++
> >  drivers/xen/events.c                      |   4 +
> >  7 files changed, 214 insertions(+), 9 deletions(-)
> >  create mode 100644 drivers/serial/serial_xen.c
> 
> Reviewed-by: Simon Glass <sjg@chromium.org>
> 
> nits below
> 
> > 
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index c469863967..d4de1139aa 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -1723,6 +1723,7 @@ config TARGET_XENGUEST_ARM64
> >         select XEN
> >         select OF_CONTROL
> >         select LINUX_KERNEL_IMAGE_HEADER
> > +       select XEN_SERIAL
> >  endchoice
> > 
> >  config ARCH_SUPPORT_TFABOOT
> > diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c
> > b/board/xen/xenguest_arm64/xenguest_arm64.c
> > index 9e099f388f..fd10a002e9 100644
> > --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> > +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> > @@ -18,9 +18,12 @@
> >  #include <asm/armv8/mmu.h>
> >  #include <asm/xen.h>
> >  #include <asm/xen/hypercall.h>
> > +#include <asm/xen/system.h>
> > 
> >  #include <linux/compiler.h>
> > 
> > +#include <xen/hvm.h>
> > +
> >  DECLARE_GLOBAL_DATA_PTR;
> > 
> >  int board_init(void)
> > @@ -57,9 +60,28 @@ static int get_next_memory_node(const void
> > *blob, int mem)
> > 
> >  static int setup_mem_map(void)
> >  {
> > -       int i, ret, mem, reg = 0;
> > +       int i = 0, ret, mem, reg = 0;
> >         struct fdt_resource res;
> >         const void *blob = gd->fdt_blob;
> > +       u64 gfn;
> > +
> > +       /*
> > +        * Add "magic" region which is used by Xen to provide some
> > essentials
> > +        * for the guest: we need console.
> > +        */
> > +       ret = hvm_get_parameter_maintain_dcache(HVM_PARAM_CONSOLE_PFN, &gfn);
> > +       if (ret < 0) {
> > +               printf("%s: Can't get HVM_PARAM_CONSOLE_PFN, ret %d\n",
> > +                      __func__, ret);
> > +               return -EINVAL;
> > +       }
> > +
> > +       xen_mem_map[i].virt = PFN_PHYS(gfn);
> > +       xen_mem_map[i].phys = PFN_PHYS(gfn);
> > +       xen_mem_map[i].size = PAGE_SIZE;
> > +       xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> > +                               PTE_BLOCK_INNER_SHARE);
> > +       i++;
> > 
> >         mem = get_next_memory_node(blob, -1);
> >         if (mem < 0) {
> > @@ -67,7 +89,7 @@ static int setup_mem_map(void)
> >                 return -EINVAL;
> >         }
> > 
> > -       for (i = 0; i < MAX_MEM_MAP_REGIONS; i++) {
> > +       for (; i < MAX_MEM_MAP_REGIONS; i++) {
> >                 ret = fdt_get_resource(blob, mem, "reg", reg++,
> > &res);
> >                 if (ret == -FDT_ERR_NOTFOUND) {
> >                         reg = 0;
> > @@ -146,8 +168,3 @@ int print_cpuinfo(void)
> >         return 0;
> >  }
> > 
> > -__weak struct serial_device *default_serial_console(void)
> > -{
> > -       return NULL;
> > -}
> > -
> > diff --git a/configs/xenguest_arm64_defconfig
> > b/configs/xenguest_arm64_defconfig
> > index 2a8caf8647..45559a161b 100644
> > --- a/configs/xenguest_arm64_defconfig
> > +++ b/configs/xenguest_arm64_defconfig
> > @@ -47,9 +47,9 @@ CONFIG_CMD_UMS=n
> >  #CONFIG_EFI_PARTITION=y
> >  # CONFIG_EFI_LOADER is not set
> > 
> > -# CONFIG_DM is not set
> > +CONFIG_DM=y
> >  # CONFIG_MMC is not set
> > -# CONFIG_DM_SERIAL is not set
> > +CONFIG_DM_SERIAL=y
> >  # CONFIG_REQUIRE_SERIAL_CONSOLE is not set
> > 
> >  CONFIG_OF_BOARD=y
> > diff --git a/drivers/serial/Kconfig b/drivers/serial/Kconfig
> > index 17d0e73623..33c989a66d 100644
> > --- a/drivers/serial/Kconfig
> > +++ b/drivers/serial/Kconfig
> > @@ -821,6 +821,13 @@ config MPC8XX_CONS
> >         depends on MPC8xx
> >         default y
> > 
> > +config XEN_SERIAL
> > +       bool "XEN serial support"
> > +       depends on XEN
> > +       help
> > +         If built without DM support, then requires Xen
> > +         to be built with CONFIG_VERBOSE_DEBUG.
> 
> Yes but what does it do? Also should probably not support non-DM at
> this point.
> 

This will be removed entirely, as only DM-based drivers will be added.

> > +
> >  choice
> >         prompt "Console port"
> >         default 8xx_CONS_SMC1
> > diff --git a/drivers/serial/Makefile b/drivers/serial/Makefile
> > index e4a92bbbb7..25f7f8d342 100644
> > --- a/drivers/serial/Makefile
> > +++ b/drivers/serial/Makefile
> > @@ -70,6 +70,7 @@ obj-$(CONFIG_OWL_SERIAL) += serial_owl.o
> >  obj-$(CONFIG_OMAP_SERIAL) += serial_omap.o
> >  obj-$(CONFIG_MTK_SERIAL) += serial_mtk.o
> >  obj-$(CONFIG_SIFIVE_SERIAL) += serial_sifive.o
> > +obj-$(CONFIG_XEN_SERIAL) += serial_xen.o
> > 
> >  ifndef CONFIG_SPL_BUILD
> >  obj-$(CONFIG_USB_TTY) += usbtty.o
> > diff --git a/drivers/serial/serial_xen.c
> > b/drivers/serial/serial_xen.c
> > new file mode 100644
> > index 0000000000..dcd4b2df79
> > --- /dev/null
> > +++ b/drivers/serial/serial_xen.c
> > @@ -0,0 +1,175 @@
> > +/*
> > + * SPDX-License-Identifier:    GPL-2.0+
> > + *
> > + * (C) 2018 NXP
> > + * (C) 2020 EPAM Systems Inc.
> > + */
> > +#include <common.h>
> > +#include <cpu_func.h>
> > +#include <dm.h>
> > +#include <serial.h>
> > +#include <watchdog.h>
> > +
> > +#include <linux/bug.h>
> > +
> > +#include <xen/hvm.h>
> > +#include <xen/events.h>
> > +
> > +#include <xen/interface/sched.h>
> > +#include <xen/interface/hvm/hvm_op.h>
> > +#include <xen/interface/hvm/params.h>
> > +#include <xen/interface/io/console.h>
> > +#include <xen/interface/io/ring.h>
> > +
> > +DECLARE_GLOBAL_DATA_PTR;
> > +
> > +u32 console_evtchn;
> > +
> > +struct xen_uart_priv {
> > +       struct xencons_interface *intf;
> > +       u32 evtchn;
> > +       int vtermno;
> > +       struct hvc_struct *hvc;
> 
> comment for struct

Ok, will add.

> 
> > +};
> > +
> > +int xen_serial_setbrg(struct udevice *dev, int baudrate)
> > +{
> > +       return 0;
> > +}
> > +
> > +static int xen_serial_probe(struct udevice *dev)
> > +{
> > +       struct xen_uart_priv *priv = dev_get_priv(dev);
> > +       u64 v = 0;
> > +       unsigned long gfn;
> > +       int r;
> > +
> > +       r = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
> 
> Can you use ret and val instead of single-char var names? It is OK
> for
> loops, but not here.

Sure.

> 
> > +       if (r < 0 || v == 0)
> > +               return r;
> > +
> > +       priv->evtchn = v;
> > +       console_evtchn = v;
> > +
> > +       r = hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v);
> > +       if (r < 0 || v == 0)
> > +               return -ENODEV;
> 
> return r if non-zero
> 
> return -EINVAL perhaps or -ENXIO if !v
> 
> -ENODEV means there is no device and is reserved for driver model.
> Clearly in this case there is a device.

Ok, makes sense.
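
As a rough illustration of the error handling suggested above (this is
not the code from the next version of the patch), the two failure cases
can be separated as follows. The helper mock_hvm_get_parameter() and the
variables fake_ret and fake_val are hypothetical stand-ins so the snippet
builds outside U-Boot; only the ret/val checks mirror the review
comments.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the hypervisor call: NOT the U-Boot
 * hvm_get_parameter(). They only let the error paths be exercised.
 */
static int fake_ret;
static uint64_t fake_val;

static int mock_hvm_get_parameter(int idx, uint64_t *value)
{
	(void)idx;
	*value = fake_val;

	return fake_ret;
}

/*
 * Fetch the console PFN. A hypercall failure is propagated unchanged;
 * a zero value is reported as -EINVAL rather than -ENODEV, since
 * -ENODEV is reserved for driver model.
 */
static int get_console_pfn(uint64_t *pfn)
{
	uint64_t val = 0;
	int ret;

	ret = mock_hvm_get_parameter(0, &val);
	if (ret < 0)
		return ret;
	if (!val)
		return -EINVAL;

	*pfn = val;

	return 0;
}
```

Splitting the check this way also avoids returning 0 (success) when the
call succeeds but the value is zero, which a combined
"if (ret < 0 || val == 0) return ret;" would do.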

> 
> > +
> > +       gfn = v;
> > +
> > +       priv->intf = (struct xencons_interface *)(gfn << XEN_PAGE_SHIFT);
> > +       if (!priv->intf)
> 
> Don't you already check for !v above?
> 
> > +               return -EINVAL;
> 
> Blank line

Ok, will fix.

> 
> > +       return 0;
> > +}
> > +
> > +static int xen_serial_pending(struct udevice *dev, bool input)
> > +{
> > +       struct xen_uart_priv *priv = dev_get_priv(dev);
> > +       struct xencons_interface *intf = priv->intf;
> > +
> > +       if (!input || intf->in_cons == intf->in_prod)
> > +               return 0;
> 
> blank line before final return. Please fix globally

Ok, will fix in the next version.
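
For readers following the ring handling in this driver, here is a
minimal sketch of the free-running index scheme that the pending, getc
and write paths rely on. The struct demo_ring, RING_SIZE and the
demo_ring_*() helpers are hypothetical names for illustration only; the
real layout is struct xencons_interface, and a ring shared with another
domain also needs the memory barriers shown in the patch, which this
single-threaded sketch omits.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u /* hypothetical size; must be a power of two */

struct demo_ring {
	char buf[RING_SIZE];
	uint32_t cons; /* free-running consumer index */
	uint32_t prod; /* free-running producer index */
};

/* Data is pending whenever the two indices differ. */
static int demo_ring_pending(const struct demo_ring *r)
{
	return r->cons != r->prod;
}

/* Copy as many bytes as fit; returns the number actually queued. */
static size_t demo_ring_put(struct demo_ring *r, const char *data, size_t len)
{
	size_t sent = 0;

	while (sent < len && (r->prod - r->cons) < RING_SIZE)
		r->buf[r->prod++ & (RING_SIZE - 1)] = data[sent++];

	return sent;
}

/* Returns the next byte, or -1 if the ring is empty. */
static int demo_ring_get(struct demo_ring *r)
{
	if (!demo_ring_pending(r))
		return -1;

	return (unsigned char)r->buf[r->cons++ & (RING_SIZE - 1)];
}
```

Because the indices are never wrapped explicitly, prod - cons gives the
fill level directly, and masking is applied only when indexing the
buffer.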

> 
> > +       return 1;
> > +}
> > +
> > +static int xen_serial_getc(struct udevice *dev)
> > +{
> > +       struct xen_uart_priv *priv = dev_get_priv(dev);
> > +       struct xencons_interface *intf = priv->intf;
> > +       XENCONS_RING_IDX cons;
> > +       char c;
> > +
> > +       while (intf->in_cons == intf->in_prod) {
> > +               mb(); /* wait */
> > +       }
> 
> Drop {}. Has this been through patman?

We used checkpatch before sending.

> 
> > +
> > +       cons = intf->in_cons;
> > +       mb();                   /* get pointers before reading ring
> > */
> > +
> > +       c = intf->in[MASK_XENCONS_IDX(cons++, intf->in)];
> > +
> > +       mb();                   /* read ring before consuming */
> > +       intf->in_cons = cons;
> > +
> > +       notify_remote_via_evtchn(priv->evtchn);
> > +       return c;
> > +}
> > +
> > +static int __write_console(struct udevice *dev, const char *data,
> > int len)
> > +{
> > +       struct xen_uart_priv *priv = dev_get_priv(dev);
> > +       struct xencons_interface *intf = priv->intf;
> > +       XENCONS_RING_IDX cons, prod;
> > +       int sent = 0;
> > +
> > +       cons = intf->out_cons;
> > +       prod = intf->out_prod;
> > +       mb(); /* Update pointer */
> > +
> > +       WARN_ON((prod - cons) > sizeof(intf->out));
> > +
> > +       while ((sent < len) && ((prod - cons) < sizeof(intf->out)))
> > +               intf->out[MASK_XENCONS_IDX(prod++, intf->out)] = data[sent++];
> > +
> > +       mb(); /* Update data before pointer */
> > +       intf->out_prod = prod;
> > +
> > +       if (sent)
> > +               notify_remote_via_evtchn(priv->evtchn);
> > +       return sent;
> > +}
> > +
> > +static int write_console(struct udevice *dev, const char *data,
> > int len)
> > +{
> > +       /*
> > +        * Make sure the whole buffer is emitted, polling if
> > +        * necessary.  We don't ever want to rely on the hvc daemon
> > +        * because the most interesting console output is when the
> > +        * kernel is crippled.
> > +        */
> > +       while (len) {
> > +               int sent = __write_console(dev, data, len);
> > +
> > +               data += sent;
> > +               len -= sent;
> > +
> > +               if (unlikely(len))
> > +                       HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
> > +       }
> > +       return 0;
> > +}
> > +
> > +static int xen_serial_putc(struct udevice *dev, const char ch)
> > +{
> > +       write_console(dev, &ch, 1);
> > +       return 0;
> > +}
> > +
> > +static const struct dm_serial_ops xen_serial_ops = {
> > +       .putc = xen_serial_putc,
> > +       .getc = xen_serial_getc,
> > +       .pending = xen_serial_pending,
> > +};
> > +
> > +#if CONFIG_IS_ENABLED(OF_CONTROL)
> > +static const struct udevice_id xen_serial_ids[] = {
> > +       { .compatible = "xen,xen" },
> > +       { }
> > +};
> > +#endif
> > +
> > +U_BOOT_DRIVER(serial_xen) = {
> > +       .name                   = "serial_xen",
> > +       .id                     = UCLASS_SERIAL,
> > +#if CONFIG_IS_ENABLED(OF_CONTROL)
> 
> of_patch_ptr() - but I think you can drop this
> 
> > +       .of_match               = xen_serial_ids,
> > +#endif
> > +       .priv_auto_alloc_size   = sizeof(struct xen_uart_priv),
> > +       .probe                  = xen_serial_probe,
> > +       .ops                    = &xen_serial_ops,
> > +#if !CONFIG_IS_ENABLED(OF_CONTROL)
> 
> and this?
> 
> > +       .flags                  = DM_FLAG_PRE_RELOC,
> > +#endif
> > +};
> > +
> > diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> > index eddc6b6e29..a1b36a2196 100644
> > --- a/drivers/xen/events.c
> > +++ b/drivers/xen/events.c
> > @@ -25,6 +25,8 @@
> >  #include <xen/events.h>
> >  #include <xen/hvm.h>
> > 
> > +extern u32 console_evtchn;
> > +
> >  #define NR_EVS 1024
> > 
> >  /* this represents a event handler. Chaining or sharing is not
> > allowed */
> > @@ -47,6 +49,8 @@ void unbind_all_ports(void)
> >         struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
> > 
> >         for (i = 0; i < NR_EVS; i++) {
> > +               if (i == console_evtchn)
> > +                       continue;
> >                 if (test_and_clear_bit(i, bound_ports)) {
> >                         printf("port %d still bound!\n", i);
> >                         unbind_evtchn(i);
> > --
> > 2.17.1
> > 
> 
> Regards,
> Simon

Regards,
Anastasiia
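
For anyone following the console ring logic quoted above, here is a
minimal, self-contained sketch of the free-running index scheme that
xen_serial_getc() and __write_console() rely on. It is an illustration
only, not the driver: struct demo_ring, RING_SIZE and the demo_* helpers
are hypothetical names, and the memory barriers and event-channel
notification that the real code needs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for one direction of struct xencons_interface. */
struct demo_ring {
	char buf[RING_SIZE];
	uint32_t cons; /* free-running consumer index */
	uint32_t prod; /* free-running producer index */
};

/* Map a free-running index to a buffer slot, like MASK_XENCONS_IDX(). */
static size_t demo_mask(uint32_t idx)
{
	return idx & (RING_SIZE - 1);
}

/* Copy up to len bytes into the ring; returns the number accepted. */
static int demo_ring_write(struct demo_ring *r, const char *data, int len)
{
	uint32_t cons = r->cons;
	uint32_t prod = r->prod;
	int sent = 0;

	/* prod - cons is the fill level, even after the indices wrap. */
	assert(prod - cons <= RING_SIZE);

	while (sent < len && prod - cons < RING_SIZE)
		r->buf[demo_mask(prod++)] = data[sent++];

	/* A real driver needs a barrier here before publishing prod. */
	r->prod = prod;

	return sent;
}

/* Returns 1 and stores one byte in *c if data is pending, else 0. */
static int demo_ring_read(struct demo_ring *r, char *c)
{
	uint32_t cons = r->cons;

	if (cons == r->prod)
		return 0;

	*c = r->buf[demo_mask(cons++)];
	r->cons = cons;

	return 1;
}
```

Because the indices are never reduced modulo the ring size, "empty" is
cons == prod and "full" is prod - cons == RING_SIZE, so no slot has to
be sacrificed to tell the two states apart.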

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro
  2020-07-02  4:08   ` Heinrich Schuchardt
@ 2020-07-03 13:02     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 13:02 UTC (permalink / raw)
  To: u-boot

Hi Heinrich,

On Thu, 2020-07-02 at 06:08 +0200, Heinrich Schuchardt wrote:
> On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Add  wait_event_timeout - sleep until a condition gets true or a
> > timeout elapses.
> > 
> > This is a stripped version of the same from Linux kernel with the
> > following u-boot specific modifications:
> > - no wait queues supported
> > - use u-boot timer to detect timeouts
> > - check for Ctrl-C pressed during wait
> > 
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >  include/linux/compat.h | 45
> > ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 45 insertions(+)
> > 
> > diff --git a/include/linux/compat.h b/include/linux/compat.h
> > index 712eeaef4e..5375b7d3b8 100644
> > --- a/include/linux/compat.h
> > +++ b/include/linux/compat.h
> > @@ -1,12 +1,20 @@
> >  #ifndef _LINUX_COMPAT_H_
> >  #define _LINUX_COMPAT_H_
> > 
> > +#include <console.h>
> >  #include <log.h>
> >  #include <malloc.h>
> > +
> > +#include <asm/processor.h>
> > +
> >  #include <linux/types.h>
> >  #include <linux/err.h>
> >  #include <linux/kernel.h>
> > 
> > +#ifdef CONFIG_XEN
> > +#include <xen/events.h>
> > +#endif
> > +
> >  struct unused {};
> >  typedef struct unused unused_t;
> > 
> > @@ -122,6 +130,43 @@ static inline void kmem_cache_destroy(struct
> > kmem_cache *cachep)
> >  #define add_wait_queue(...)	do { } while (0)
> >  #define remove_wait_queue(...)	do { } while (0)
> > 
> > +#ifndef CONFIG_XEN
> > +#define eventchn_poll()
> > +#endif
> > +
> > +#define __wait_event_timeout(condition, timeout, ret)		
> > \
> > +({								\
> > +	ulong __ret = ret; /* explicit shadow */		\
> > +	ulong start = get_timer(0);				\
> > +	for (;;) {						\
> > +		eventchn_poll();				\
> > +		if (condition) {				\
> > +			__ret = 1;				\
> > +			break;					\
> > +	}							\
> > +	if ((get_timer(start) > timeout) || ctrlc()) {		\
> > +		__ret = 0;					\
> > +		break;						\
> > +	}							\
> > +	cpu_relax();						\
> > +	}							\
> > +	__ret;							\
> > +})
> > +
> > +/*
> > + * 0 if the @condition evaluated to %false after the @timeout
> > elapsed,
> > + * 1 if the @condition evaluated to %true
> > + */
> 
> Please, document all arguments. Use Sphinx style as in
> 
> 
> https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
> 

Ok, I will fix it in the next version.
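
As a rough idea of what the requested kernel-doc comment could look
like, here is a self-contained sketch. It is not the macro from the
patch: demo_wait_event_timeout() and demo_get_timer() are hypothetical
names, the timer is a fake that advances one tick per call, and the
eventchn_poll(), ctrlc() and cpu_relax() calls are omitted. It relies on
GNU statement expressions, as the original macro does.

```c
/* Hypothetical stand-in for U-Boot's get_timer(): one tick per call. */
static unsigned long demo_ticks;

static unsigned long demo_get_timer(unsigned long base)
{
	return ++demo_ticks - base;
}

/**
 * demo_wait_event_timeout() - poll until a condition is true or time runs out
 * @condition: C expression that is re-evaluated on every iteration
 * @timeout: maximum number of timer ticks to wait
 *
 * Return: 1 if @condition evaluated to true, 0 if @timeout elapsed first.
 */
#define demo_wait_event_timeout(condition, timeout)			\
({									\
	unsigned long __start = demo_get_timer(0);			\
	unsigned long __ret = 0;					\
									\
	for (;;) {							\
		if (condition) {					\
			__ret = 1;					\
			break;						\
		}							\
		if (demo_get_timer(__start) > (timeout))		\
			break;						\
	}								\
	__ret;								\
})
```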

> Best regards
> 
> Heinrich.

Best regards,
Anastasiia

> 
> > +#define wait_event_timeout(wq_head, condition, timeout)		
> > 	\
> > +({									
> > \
> > +	ulong __ret;							
> > \
> > +	if (condition)							
> > \
> > +		__ret = 1;						\
> > +	else								
> > \
> > +		__ret = __wait_event_timeout(condition, timeout,
> > __ret);\
> > +	__ret;								
> > \
> > +})
> > +
> >  #define KERNEL_VERSION(a,b,c)	(((a) << 16) + ((b) << 8) +
> > (c))
> > 
> >  /* This is also defined in ARMv8's mmu.h */
> > 
> 
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 11/17] xen: Port Xen grant table driver from mini-os
  2020-07-01 16:59   ` Julien Grall
@ 2020-07-03 13:09     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 13:09 UTC (permalink / raw)
  To: u-boot

Hi Julien,

On Wed, 2020-07-01 at 17:59 +0100, Julien Grall wrote:
> 
> On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Make required updates to run on u-boot.
> > 
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > ---
> >   board/xen/xenguest_arm64/xenguest_arm64.c |  13 ++
> >   drivers/xen/Makefile                      |   1 +
> >   drivers/xen/gnttab.c                      | 258
> > ++++++++++++++++++++++
> >   drivers/xen/hypervisor.c                  |   2 +
> >   include/xen/gnttab.h                      |  25 +++
> >   5 files changed, 299 insertions(+)
> >   create mode 100644 drivers/xen/gnttab.c
> >   create mode 100644 include/xen/gnttab.h
> > 
> > diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c
> > b/board/xen/xenguest_arm64/xenguest_arm64.c
> > index e8621f7174..b4e1650f99 100644
> > --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> > +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> > @@ -22,6 +22,7 @@
> >   
> >   #include <linux/compiler.h>
> >   
> > +#include <xen/gnttab.h>
> >   #include <xen/hvm.h>
> >   
> >   DECLARE_GLOBAL_DATA_PTR;
> > @@ -64,6 +65,8 @@ static int setup_mem_map(void)
> >   	struct fdt_resource res;
> >   	const void *blob = gd->fdt_blob;
> >   	u64 gfn;
> > +	phys_addr_t gnttab_base;
> > +	phys_size_t gnttab_sz;
> >   
> >   	/*
> >   	 * Add "magic" region which is used by Xen to provide some
> > essentials
> > @@ -97,6 +100,16 @@ static int setup_mem_map(void)
> >   				PTE_BLOCK_INNER_SHARE);
> >   	i++;
> >   
> > +	/* Get Xen's suggested physical page assignments for the grant
> > table. */
> > +	get_gnttab_base(&gnttab_base, &gnttab_sz);
> > +
> > +	xen_mem_map[i].virt = gnttab_base;
> > +	xen_mem_map[i].phys = gnttab_base;
> > +	xen_mem_map[i].size = gnttab_sz;
> > +	xen_mem_map[i].attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> > +				PTE_BLOCK_INNER_SHARE);
> > +	i++;
> > +
> >   	mem = get_next_memory_node(blob, -1);
> >   	if (mem < 0) {
> >   		printf("%s: Missing /memory node\n", __func__);
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 9d0f604aaa..243b13277a 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -5,3 +5,4 @@
> >   obj-y += hypervisor.o
> >   obj-y += events.o
> >   obj-y += xenbus.o
> > +obj-y += gnttab.o
> > diff --git a/drivers/xen/gnttab.c b/drivers/xen/gnttab.c
> > new file mode 100644
> > index 0000000000..b18102e329
> > --- /dev/null
> > +++ b/drivers/xen/gnttab.c
> > @@ -0,0 +1,258 @@
> > +/*
> > +
> > *******************************************************************
> > *********
> > + * (C) 2006 - Cambridge University
> > + * (C) 2020 - EPAM Systems Inc.
> > +
> > *******************************************************************
> > *********
> > + *
> > + *		File: gnttab.c
> > + *	  Author: Steven Smith (sos22 at cam.ac.uk)
> > + *	 Changes: Grzegorz Milos (gm281 at cam.ac.uk)
> > + *
> > + *		Date: July 2006
> > + *
> > + * Environment: Xen Minimal OS
> > + * Description: Simple grant tables implementation. About as
> > stupid as it's
> > + *  possible to be and still work.
> > + *
> > +
> > *******************************************************************
> > *********
> > + */
> > +#include <common.h>
> > +#include <linux/compiler.h>
> > +#include <log.h>
> > +#include <malloc.h>
> > +
> > +#include <asm/armv8/mmu.h>
> > +#include <asm/io.h>
> > +#include <asm/xen/system.h>
> > +
> > +#include <linux/bug.h>
> > +
> > +#include <xen/gnttab.h>
> > +#include <xen/hvm.h>
> > +
> > +#include <xen/interface/memory.h>
> > +
> > +DECLARE_GLOBAL_DATA_PTR;
> > +
> > +#define NR_RESERVED_ENTRIES 8
> > +
> > +/* NR_GRANT_FRAMES must be less than or equal to that configured
> > in Xen */
> > +#define NR_GRANT_FRAMES 1
> > +#define NR_GRANT_ENTRIES (NR_GRANT_FRAMES * PAGE_SIZE /
> > sizeof(struct grant_entry_v1))
> > +
> > +static struct grant_entry_v1 *gnttab_table;
> > +static grant_ref_t gnttab_list[NR_GRANT_ENTRIES];
> > +
> > +static void put_free_entry(grant_ref_t ref)
> > +{
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	gnttab_list[ref] = gnttab_list[0];
> > +	gnttab_list[0]  = ref;
> > +	local_irq_restore(flags);
> > +}
> > +
> > +static grant_ref_t get_free_entry(void)
> > +{
> > +	unsigned int ref;
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	ref = gnttab_list[0];
> > +	BUG_ON(ref < NR_RESERVED_ENTRIES || ref >= NR_GRANT_ENTRIES);
> > +	gnttab_list[0] = gnttab_list[ref];
> > +	local_irq_restore(flags);
> > +	return ref;
> > +}
> > +
> > +grant_ref_t gnttab_grant_access(domid_t domid, unsigned long
> > frame, int readonly)
> > +{
> > +	grant_ref_t ref;
> > +
> > +	ref = get_free_entry();
> > +	gnttab_table[ref].frame = frame;
> > +	gnttab_table[ref].domid = domid;
> > +	wmb();
> > +	readonly *= GTF_readonly;
> > +	gnttab_table[ref].flags = GTF_permit_access | readonly;
> > +
> > +	return ref;
> > +}
> > +
> > +grant_ref_t gnttab_grant_transfer(domid_t domid, unsigned long
> > pfn)
> 
> It is not possible to transfer grant on Arm. So I would suggest to 
> remove the code related to it.
> 
> [...]

Makes sense, will remove.

> 
> > +unsigned long gnttab_end_transfer(grant_ref_t ref)
> 
> likewise.

Same as above.

> 
> > +{
> > +	unsigned long frame;
> > +	u16 flags;
> > +
> > +	BUG_ON(ref >= NR_GRANT_ENTRIES || ref < NR_RESERVED_ENTRIES);
> > +
> > +	while (!((flags = gnttab_table[ref].flags) &
> > GTF_transfer_committed)) {
> > +		if (synch_cmpxchg(&gnttab_table[ref].flags, flags, 0)
> > == flags) {
> > +			printf("Release unused transfer grant.\n");
> > +			put_free_entry(ref);
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	/* If a transfer is in progress then wait until it is
> > completed. */
> > +	while (!(flags & GTF_transfer_completed))
> > +		flags = gnttab_table[ref].flags;
> > +
> > +	/* Read the frame number /after/ reading completion status. */
> > +	rmb();
> > +	frame = gnttab_table[ref].frame;
> > +
> > +	put_free_entry(ref);
> > +
> > +	return frame;
> > +}
> > +
> > +grant_ref_t gnttab_alloc_and_grant(void **map)
> > +{
> > +	unsigned long mfn;
> > +	grant_ref_t gref;
> > +
> > +	*map = (void *)memalign(PAGE_SIZE, PAGE_SIZE);
> > +	mfn = virt_to_mfn(*map);
> > +	gref = gnttab_grant_access(0, mfn, 0);
> > +	return gref;
> > +}
> > +
> > +static const char * const gnttabop_error_msgs[] =
> > GNTTABOP_error_msgs;
> > +
> > +const char *gnttabop_error(int16_t status)
> > +{
> > +	status = -status;
> > +	if (status < 0 || status >= ARRAY_SIZE(gnttabop_error_msgs))
> > +		return "bad status";
> > +	else
> > +		return gnttabop_error_msgs[status];
> > +}
> > +
> > +/* Get Xen's suggested physical page assignments for the grant
> > table. */
> > +void get_gnttab_base(phys_addr_t *gnttab_base, phys_size_t
> > *gnttab_sz)
> > +{
> > +	const void *blob = gd->fdt_blob;
> > +	struct fdt_resource res;
> > +	int mem;
> > +
> > +	mem = fdt_node_offset_by_compatible(blob, -1, "xen,xen");
> > +	if (mem < 0) {
> > +		printf("No xen,xen compatible found\n");
> > +		BUG();
> > +	}
> > +
> > +	mem = fdt_get_resource(blob, mem, "reg", 0, &res);
> > +	if (mem == -FDT_ERR_NOTFOUND) {
> > +		printf("No grant table base in the device tree\n");
> > +		BUG();
> > +	}
> > +
> > +	*gnttab_base = (phys_addr_t)res.start;
> > +	if (gnttab_sz)
> > +		*gnttab_sz = (phys_size_t)(res.end - res.start + 1);
> > +
> > +	debug("FDT suggests grant table base at %llx\n",
> > +	      *gnttab_base);
> > +}
> > +
> > +void init_gnttab(void)
> > +{
> > +	struct xen_add_to_physmap xatp;
> > +	struct gnttab_setup_table setup;
> > +	xen_pfn_t frames[NR_GRANT_FRAMES];
> > +	int i, rc;
> > +
> > +	debug("%s\n", __func__);
> > +
> > +	for (i = NR_RESERVED_ENTRIES; i < NR_GRANT_ENTRIES; i++)
> > +		put_free_entry(i);
> > +
> > +	get_gnttab_base((phys_addr_t *)&gnttab_table, NULL);
> > +
> > +	for (i = 0; i < NR_GRANT_FRAMES; i++) {
> > +		xatp.domid = DOMID_SELF;
> > +		xatp.size = 0;
> > +		xatp.space = XENMAPSPACE_grant_table;
> > +		xatp.idx = i;
> > +		xatp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
> > +		rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap,
> > &xatp);
> > +		if (rc)
> > +			printf("XENMEM_add_to_physmap failed; status =
> > %d\n",
> > +			       rc);
> > +		BUG_ON(rc != 0);
> > +	}
> > +
> > +	setup.dom = DOMID_SELF;
> > +	setup.nr_frames = NR_GRANT_FRAMES;
> > +	set_xen_guest_handle(setup.frame_list, frames);
> > +	rc = HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup,
> > 1);
> > +	if (rc || setup.status) {
> > +		printf("GNTTABOP_setup_table failed; status = %s\n",
> > +		       gnttabop_error(setup.status));
> > +		BUG();
> > +	}
> 
> GNTTABOP_setup_table is not needed on Arm.
> 

Ok, will remove.

> > +}
> > +
> > +void fini_gnttab(void)
> > +{
> > +	struct xen_remove_from_physmap xrtp;
> > +	struct gnttab_setup_table setup;
> > +	int i, rc;
> > +
> > +	debug("%s\n", __func__);
> > +
> > +	for (i = 0; i < NR_GRANT_FRAMES; i++) {
> > +		xrtp.domid = DOMID_SELF;
> > +		xrtp.gpfn = PFN_DOWN((unsigned long)gnttab_table) + i;
> > +		rc = HYPERVISOR_memory_op(XENMEM_remove_from_physmap,
> > &xrtp);
> > +		if (rc)
> > +			printf("XENMEM_remove_from_physmap failed;
> > status = %d\n",
> > +			       rc);
> > +		BUG_ON(rc != 0);
> > +	}
> > +
> > +	setup.dom = DOMID_SELF;
> > +	setup.nr_frames = 0;
> > +
> > +	HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
> > +	if (setup.status) {
> > +		printf("GNTTABOP_setup_table failed; status = %s\n",
> > +		       gnttabop_error(setup.status));
> > +		BUG();
> > +	}
> 
> The hypercall doesn't do any clean-up in Xen. So why are you calling 
> this from fini_gnttab()?

Will remove.

> 
> Cheers,
> 

Best regards,
Anastasiia
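
For reference, the free-list bookkeeping behind put_free_entry() and
get_free_entry() in the quoted patch can be sketched in isolation as
below. This is an illustration under assumptions: the demo_* names and
DEMO_ENTRIES are hypothetical, interrupt masking is left out, and an
empty list returns 0 here, whereas the patch calls BUG_ON() instead.

```c
#include <assert.h>

#define DEMO_RESERVED 8   /* mirrors NR_RESERVED_ENTRIES in the patch */
#define DEMO_ENTRIES  512 /* hypothetical table size */

/* Slot 0 holds the head of the free list; slot n holds the next free ref. */
static unsigned int demo_list[DEMO_ENTRIES];

/* Push a reference back onto the free list. */
static void demo_put_free(unsigned int ref)
{
	assert(ref >= DEMO_RESERVED && ref < DEMO_ENTRIES);
	demo_list[ref] = demo_list[0];
	demo_list[0] = ref;
}

/* Pop a free reference, or return 0 when the list is empty. */
static unsigned int demo_get_free(void)
{
	unsigned int ref = demo_list[0];

	if (ref == 0)
		return 0;
	demo_list[0] = demo_list[ref];

	return ref;
}

/* Populate the list with every non-reserved reference. */
static void demo_init(void)
{
	unsigned int i;

	demo_list[0] = 0;
	for (i = DEMO_RESERVED; i < DEMO_ENTRIES; i++)
		demo_put_free(i);
}
```

The list behaves as a stack, so the most recently released reference is
the next one handed out. Slot 0 can double as the head pointer only
because the reserved references are never allocated.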

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-02  4:17   ` Heinrich Schuchardt
@ 2020-07-03 13:25     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 13:25 UTC (permalink / raw)
  To: u-boot

Hi Heinrich,

On Thu, 2020-07-02 at 06:17 +0200, Heinrich Schuchardt wrote:
> On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> > From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > 
> > Add initial infrastructure for Xen para-virtualized block device.
> > This includes compile-time configuration and the skeleton for
> > the future driver implementation.
> > Add new class UCLASS_PVBLOCK which is going to be a parent for
> > virtual block devices.
> > Add new interface type IF_TYPE_PVBLOCK.
> > 
> > Implement basic driver setup by reading XenStore configuration.
> 
> Please, add documentation for the board in doc/board/.

Ok, will add.

> 
> > 
> > Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <
> > anastasiia_lukianenko at epam.com>
> > Signed-off-by: Oleksandr Andrushchenko <
> > oleksandr_andrushchenko at epam.com>
> > ---
> >  cmd/Kconfig                      |   7 ++
> >  cmd/Makefile                     |   1 +
> >  cmd/pvblock.c                    |  31 ++++++++
> >  common/board_r.c                 |  14 ++++
> >  configs/xenguest_arm64_defconfig |   4 +
> >  disk/part.c                      |   4 +
> >  drivers/Kconfig                  |   2 +
> >  drivers/block/blk-uclass.c       |   2 +
> >  drivers/xen/Kconfig              |  10 +++
> >  drivers/xen/Makefile             |   2 +
> >  drivers/xen/pvblock.c            | 121
> > +++++++++++++++++++++++++++++++
> >  include/blk.h                    |   1 +
> >  include/configs/xenguest_arm64.h |   8 ++
> >  include/dm/uclass-id.h           |   1 +
> >  include/pvblock.h                |  12 +++
> >  15 files changed, 220 insertions(+)
> >  create mode 100644 cmd/pvblock.c
> >  create mode 100644 drivers/xen/Kconfig
> >  create mode 100644 drivers/xen/pvblock.c
> >  create mode 100644 include/pvblock.h
> > 
> > diff --git a/cmd/Kconfig b/cmd/Kconfig
> > index 192b3b262f..f28576947b 100644
> > --- a/cmd/Kconfig
> > +++ b/cmd/Kconfig
> > @@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
> >  	help
> >  	  USB mass storage support
> > 
> > +config CMD_PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on XEN
> > +	select PVBLOCK
> > +	help
> > +	  Xen para-virtualized block device support
> > +
> >  config CMD_VIRTIO
> >  	bool "virtio"
> >  	depends on VIRTIO
> > diff --git a/cmd/Makefile b/cmd/Makefile
> > index 974ad48b0a..117284a28c 100644
> > --- a/cmd/Makefile
> > +++ b/cmd/Makefile
> > @@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
> >  obj-$(CONFIG_CMD_GPT) += gpt.o
> >  obj-$(CONFIG_CMD_ETHSW) += ethsw.o
> >  obj-$(CONFIG_CMD_AXI) += axi.o
> > +obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
> > 
> >  # Power
> >  obj-$(CONFIG_CMD_PMIC) += pmic.o
> > diff --git a/cmd/pvblock.c b/cmd/pvblock.c
> > new file mode 100644
> > index 0000000000..7dbb243a74
> > --- /dev/null
> > +++ b/cmd/pvblock.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> SPDX should be in first line and formatted as described in
> 
> 
> https://www.kernel.org/doc/html/latest/process/license-rules.html#license-identifier-syntax
> 

Ok, will fix.

>  
> 
> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + *
> > + * XEN para-virtualized block device support
> > + */
> > +
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <command.h>
> > +
> > +/* Current I/O Device	*/
> > +static int pvblock_curr_device;
> > +
> > +int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char
> > *const argv[])
> > +{
> > +	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
> > +			      &pvblock_curr_device);
> > +}
> > +
> > +U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
> > +	   "Xen para-virtualized block device",
> > +	   "info  - show available block devices\n"
> > +	   "pvblock device [dev] - show or set current device\n"
> > +	   "pvblock part [dev] - print partition table of one or all
> > devices\n"
> > +	   "pvblock read  addr blk# cnt\n"
> > +	   "pvblock write addr blk# cnt - read/write `cnt'"
> > +	   " blocks starting at block `blk#'\n"
> > +	   "    to/from memory address `addr'");
> > +
> > diff --git a/common/board_r.c b/common/board_r.c
> > index fd36edb4e5..40cd0e5d3c 100644
> > --- a/common/board_r.c
> > +++ b/common/board_r.c
> > @@ -49,6 +49,7 @@
> >  #include <nand.h>
> >  #include <of_live.h>
> >  #include <onenand_uboot.h>
> > +#include <pvblock.h>
> >  #include <scsi.h>
> >  #include <serial.h>
> >  #include <status_led.h>
> > @@ -470,6 +471,16 @@ static int initr_xen(void)
> >  	return 0;
> >  }
> >  #endif
> > +
> > +#ifdef CONFIG_PVBLOCK
> > +static int initr_pvblock(void)
> > +{
> > +	puts("PVBLOCK: ");
> > +	pvblock_init();
> > +	return 0;
> > +}
> > +#endif
> > +
> >  /*
> >   * Tell if it's OK to load the environment early in boot.
> >   *
> > @@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
> >  #endif
> >  #ifdef CONFIG_XEN
> >  	initr_xen,
> > +#endif
> > +#ifdef CONFIG_PVBLOCK
> > +	initr_pvblock,
> >  #endif
> >  	initr_env,
> >  #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> > diff --git a/configs/xenguest_arm64_defconfig
> > b/configs/xenguest_arm64_defconfig
> > index 45559a161b..46473c251d 100644
> > --- a/configs/xenguest_arm64_defconfig
> > +++ b/configs/xenguest_arm64_defconfig
> > @@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
> >  CONFIG_CMD_BOOTEFI=n
> >  CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
> >  CONFIG_CMD_ELF=n
> > +CONFIG_CMD_EXT4=y
> > +CONFIG_CMD_FAT=y
> >  CONFIG_CMD_GO=n
> >  CONFIG_CMD_RUN=n
> >  CONFIG_CMD_IMI=n
> > @@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
> >  CONFIG_CMD_SAVEENV=n
> >  CONFIG_CMD_UMS=n
> > 
> > +CONFIG_CMD_PVBLOCK=y
> > +
> >  #CONFIG_USB=n
> >  # CONFIG_ISO_PARTITION is not set
> > 
> > diff --git a/disk/part.c b/disk/part.c
> > index f6a31025dc..b69fd345f3 100644
> > --- a/disk/part.c
> > +++ b/disk/part.c
> > @@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
> >  	case IF_TYPE_MMC:
> >  	case IF_TYPE_USB:
> >  	case IF_TYPE_NVME:
> > +	case IF_TYPE_PVBLOCK:
> >  		printf ("Vendor: %s Rev: %s Prod: %s\n",
> >  			dev_desc->vendor,
> >  			dev_desc->revision,
> > @@ -288,6 +289,9 @@ static void print_part_header(const char *type,
> > struct blk_desc *dev_desc)
> >  	case IF_TYPE_NVME:
> >  		puts ("NVMe");
> >  		break;
> > +	case IF_TYPE_PVBLOCK:
> > +		puts("PV BLOCK");
> > +		break;
> >  	case IF_TYPE_VIRTIO:
> >  		puts("VirtIO");
> >  		break;
> > diff --git a/drivers/Kconfig b/drivers/Kconfig
> > index e34a22708c..65076aab03 100644
> > --- a/drivers/Kconfig
> > +++ b/drivers/Kconfig
> > @@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
> > 
> >  source "drivers/watchdog/Kconfig"
> > 
> > +source "drivers/xen/Kconfig"
> > +
> >  config PHYS_TO_BUS
> >  	bool "Custom physical to bus address mapping"
> >  	help
> > diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-
> > uclass.c
> > index b19375cbc8..6cfabbca24 100644
> > --- a/drivers/block/blk-uclass.c
> > +++ b/drivers/block/blk-uclass.c
> > @@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT]
> > = {
> >  	[IF_TYPE_NVME]		= "nvme",
> >  	[IF_TYPE_EFI]		= "efi",
> >  	[IF_TYPE_VIRTIO]	= "virtio",
> > +	[IF_TYPE_PVBLOCK]	= "pvblock",
> >  };
> > 
> >  static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> > @@ -43,6 +44,7 @@ static enum uclass_id
> > if_type_uclass_id[IF_TYPE_COUNT] = {
> >  	[IF_TYPE_NVME]		= UCLASS_NVME,
> >  	[IF_TYPE_EFI]		= UCLASS_EFI,
> >  	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
> > +	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
> >  };
> > 
> >  static enum if_type if_typename_to_iftype(const char *if_typename)
> > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> > new file mode 100644
> > index 0000000000..6ad2a93668
> > --- /dev/null
> > +++ b/drivers/xen/Kconfig
> > @@ -0,0 +1,10 @@
> > +config PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on DM
> > +	select BLK
> > +	select HAVE_BLOCK_DEVICE
> > +	help
> > +	  This driver implements the front-end of the Xen virtual
> > +	  block device driver. It communicates with a back-end driver
> > +	  in another domain which drives the actual block device.
> > +
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 243b13277a..87157df69b 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -6,3 +6,5 @@ obj-y += hypervisor.o
> >  obj-y += events.o
> >  obj-y += xenbus.o
> >  obj-y += gnttab.o
> > +
> > +obj-$(CONFIG_PVBLOCK) += pvblock.o
> > diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> > new file mode 100644
> > index 0000000000..057add9753
> > --- /dev/null
> > +++ b/drivers/xen/pvblock.c
> > @@ -0,0 +1,121 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + */
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <dm.h>
> > +#include <dm/device-internal.h>
> > +
> > +#define DRV_NAME	"pvblock"
> > +#define DRV_NAME_BLK	"pvblock_blk"
> > +
> > +struct blkfront_dev {
> > +	char dummy;
> > +};
> > +
> > +static int init_blkfront(unsigned int devid, struct blkfront_dev
> > *dev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static void shutdown_blkfront(struct blkfront_dev *dev)
> > +{
> > +}
> > +
> > +ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr,
> > lbaint_t blkcnt,
> > +		       void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr,
> > lbaint_t blkcnt,
> > +			const void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_bind(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_probe(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +	int ret;
> > +
> > +	ret = init_blkfront(0, blk_dev);
> > +	if (ret < 0)
> > +		return ret;
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_remove(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +
> > +	shutdown_blkfront(blk_dev);
> > +	return 0;
> > +}
> > +
> > +static const struct blk_ops pvblock_blk_ops = {
> > +	.read	= pvblock_blk_read,
> > +	.write	= pvblock_blk_write,
> > +};
> > +
> > +U_BOOT_DRIVER(pvblock_blk) = {
> > +	.name			= DRV_NAME_BLK,
> > +	.id			= UCLASS_BLK,
> > +	.ops			= &pvblock_blk_ops,
> > +	.bind			= pvblock_blk_bind,
> > +	.probe			= pvblock_blk_probe,
> > +	.remove			= pvblock_blk_remove,
> > +	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
> > +	.flags			= DM_FLAG_OS_PREPARE,
> > +};
> > +
> > +/*****************************************************************
> > **************
> > + * Para-virtual block device class
> > +
> > *******************************************************************
> > ************/
> > +
> > +void pvblock_init(void)
> > +{
> > +	struct driver_info info;
> > +	struct udevice *udev;
> > +	struct uclass *uc;
> > +	int ret;
> > +
> > +	/*
> > +	 * At this point Xen drivers have already initialized,
> > +	 * so we can instantiate the class driver and enumerate
> > +	 * virtual block devices.
> > +	 */
> > +	info.name = DRV_NAME;
> > +	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
> > +	if (ret < 0)
> > +		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
> > +
> > +	/* Bootstrap virtual block devices class driver */
> > +	ret = uclass_get(UCLASS_PVBLOCK, &uc);
> > +	if (ret)
> > +		return;
> > +	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
> > +}
> > +
> > +static int pvblock_probe(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +U_BOOT_DRIVER(pvblock_drv) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +	.probe		= pvblock_probe,
> > +};
> > +
> > +UCLASS_DRIVER(pvblock) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +};
> > diff --git a/include/blk.h b/include/blk.h
> > index abcd4bedbb..9ee10fb80e 100644
> > --- a/include/blk.h
> > +++ b/include/blk.h
> > @@ -33,6 +33,7 @@ enum if_type {
> >  	IF_TYPE_HOST,
> >  	IF_TYPE_NVME,
> >  	IF_TYPE_EFI,
> > +	IF_TYPE_PVBLOCK,
> >  	IF_TYPE_VIRTIO,
> > 
> >  	IF_TYPE_COUNT,			/* Number of interface
> > types */
> > diff --git a/include/configs/xenguest_arm64.h
> > b/include/configs/xenguest_arm64.h
> > index 467dabf1e5..2c0d3d64fb 100644
> > --- a/include/configs/xenguest_arm64.h
> > +++ b/include/configs/xenguest_arm64.h
> > @@ -42,4 +42,12 @@
> >  #define CONFIG_CMDLINE_TAG            1
> >  #define CONFIG_INITRD_TAG             1
> > 
> > +#define CONFIG_CMD_RUN
> > +
> > +#undef CONFIG_EXTRA_ENV_SETTINGS
> > +#define CONFIG_EXTRA_ENV_SETTINGS	\
> > +	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
> > +	"pvblockboot=run loadimage;" \
> > +		"booti 0x90000000 - 0x88000000;\0"
> > +
> >  #endif /* __XENGUEST_ARM64_H */
> > diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
> > index 7837d459f1..4bf7501204 100644
> > --- a/include/dm/uclass-id.h
> > +++ b/include/dm/uclass-id.h
> > @@ -121,6 +121,7 @@ enum uclass_id {
> >  	UCLASS_W1,		/* Dallas 1-Wire bus */
> >  	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
> >  	UCLASS_WDT,		/* Watchdog Timer driver */
> > +	UCLASS_PVBLOCK,		/* Xen virtual block device */
> > 
> >  	UCLASS_COUNT,
> >  	UCLASS_INVALID = -1,
> > diff --git a/include/pvblock.h b/include/pvblock.h
> > new file mode 100644
> > index 0000000000..e3bb8ff9a7
> > --- /dev/null
> > +++ b/include/pvblock.h
> > @@ -0,0 +1,12 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> see above
> 
> Please, use scripts/checkpatch.pl before submitting patches.

We used it, but did not know when to stop :)

> 
> Best regards
> 
> Heinrich

Best regards,
Anastasiia
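
As background for the blk-uclass.c hunks quoted in this message: the
patch adds the new interface type to tables indexed by enum if_type, and
commands that take an interface name, such as the "ext4load pvblock 0"
line in the environment, presumably resolve it through such a table. A
minimal sketch of that pattern follows. The demo_* names and the
shortened enum are hypothetical; this is not the U-Boot code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened version of enum if_type from include/blk.h. */
enum demo_if_type {
	DEMO_IF_TYPE_UNKNOWN,
	DEMO_IF_TYPE_NVME,
	DEMO_IF_TYPE_PVBLOCK,
	DEMO_IF_TYPE_VIRTIO,

	DEMO_IF_TYPE_COUNT,
};

/* Name table indexed by interface type; unlisted entries stay NULL. */
static const char *demo_if_typename[DEMO_IF_TYPE_COUNT] = {
	[DEMO_IF_TYPE_NVME]	= "nvme",
	[DEMO_IF_TYPE_PVBLOCK]	= "pvblock",
	[DEMO_IF_TYPE_VIRTIO]	= "virtio",
};

/* Resolve a name such as "pvblock" to its enum value. */
static enum demo_if_type demo_name_to_if_type(const char *name)
{
	int i;

	for (i = 0; i < DEMO_IF_TYPE_COUNT; i++) {
		if (demo_if_typename[i] &&
		    !strcmp(name, demo_if_typename[i]))
			return i;
	}

	return DEMO_IF_TYPE_UNKNOWN;
}
```

Designated initializers keep each table entry tied to its enum value, so
inserting a new type in the middle of the enum, as the patch does with
IF_TYPE_PVBLOCK, does not shift the other entries.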

> 
> > + *
> > + * (C) 2020 EPAM Systems Inc.
> > + */
> > +
> > +#ifndef _PVBLOCK_H
> > +#define _PVBLOCK_H
> > +
> > +void pvblock_init(void);
> > +
> > +#endif /* _PVBLOCK_H */
> > 
> 
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os
  2020-07-03 12:21     ` Anastasiia Lukianenko
@ 2020-07-03 13:38       ` Julien Grall
  2020-07-08  8:55         ` Anastasiia Lukianenko
  0 siblings, 1 reply; 57+ messages in thread
From: Julien Grall @ 2020-07-03 13:38 UTC (permalink / raw)
  To: u-boot

Hi,

On 03/07/2020 13:21, Anastasiia Lukianenko wrote:
> Hi Julien,
> 
> On Wed, 2020-07-01 at 18:46 +0100, Julien Grall wrote:
>> Title: s/hypervizor/hypervisor/
> 
> Thank you for pointing that out :) I will fix it in the next version.
> 
>>
>> On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> Port hypervizor related code from mini-os. Update essential
>>
>> Ditto.
>>
>> But I would be quite cautious to import code from mini-OS in order
>> to
>> support Arm. The port has always been broken and from a look below
>> needs
>> to be refined for Arm.
> 
> We were referencing the code of Mini-OS from [1] by Huang Shijie and
> Volodymyr Babchuk which is for ARM64, so we hope this part should be
> ok.
> 
> [1] https://github.com/zyzii/mini-os.git

Well, that's not part of the official port. It would have been nice to 
at least mention that somewhere in the series.

>>> +	return result;
>>> +}
>>
>> I can understand why we implement sync_* helpers as AFAICT the
>> generic
>> helpers are not SMP safe. However...
>>
>>> +
>>> +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v,
>>> __ATOMIC_SEQ_CST)
>>> +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v,
>>> __ATOMIC_SEQ_CST)
>>> +
>>> +#define mb()		dsb()
>>> +#define rmb()		dsb()
>>> +#define wmb()		dsb()
>>> +#define __iormb()	dmb()
>>> +#define __iowmb()	dmb()
>>
>> Why do you need to re-implement the barriers?
> 
> Indeed, we do not need to do this.
> I will fix it in the next version.
> 
>>
>>> +#define xen_mb()	mb()
>>> +#define xen_rmb()	rmb()
>>> +#define xen_wmb()	wmb()
>>> +
>>> +#define smp_processor_id()	0
>>
>> Shouldn't this be common?
> 
> Currently it is only used by Xen and we are not sure if
> any other entity will use it, but we can put that into
> arch/arm/include/asm/io.h

I looked at the usage in Xen and don't really think it would help in any
way to get the code SMP ready. Will U-Boot enable Xen features on
secondary CPUs? If not, then I would recommend just dropping it.

[...]

>>
>>> +
>>> +#endif
>>> diff --git a/common/board_r.c b/common/board_r.c
>>> index fa57fa9b69..fd36edb4e5 100644
>>> --- a/common/board_r.c
>>> +++ b/common/board_r.c
>>> @@ -56,6 +56,7 @@
>>>    #include <timer.h>
>>>    #include <trace.h>
>>>    #include <watchdog.h>
>>> +#include <xen.h>
>>
>> Do we want to include it for other boards?
> 
> For now, we do not have a plan and resources to support
> anything other than what we need. Therefore only ARM64.

I think you misunderstood my comment here. The file seems to be common
but you include xen.h unconditionally. Is that really what you want to do?

>>> +/*
>>> + * Shared page for communicating with the hypervisor.
>>> + * Events flags go here, for example.
>>> + */
>>> +struct shared_info *HYPERVISOR_shared_info;
>>> +
>>> +#ifndef CONFIG_PARAVIRT
>>
>> Is there any plan to support this on x86?
> 
> For now, we do not have a plan and resources to support
> anything other
> than what we need. Therefore only ARM64.

Ok. I doubt that anyone will want to use U-Boot on PV x86. So I would
recommend dropping anything related to CONFIG_PARAVIRT.

>>> +{
>>> +	struct xen_hvm_param xhv;
>>> +	int ret;
>>
>> I don't think there is a guarantee that your cache is going to be
>> clean
>> when writing xhv. So you likely want to add a
>> invalidate_dcache_range()
>> before writing it.
> 
> Thank you for advice.
> Ah, so we need something like:
> 
> ...
> invalidate_dcache_range((unsigned long)&xhv,
> 			(unsigned long)&xhv + sizeof(xhv));
> xhv.domid = DOMID_SELF;
> xhv.index = idx;
> invalidate_dcache_range((unsigned long)&xhv,
> 			(unsigned long)&xhv + sizeof(xhv));
> ...

Right, this would indeed be safer.

[...]

>>> +void do_hypervisor_callback(struct pt_regs *regs)
>>> +{
>>> +	unsigned long l1, l2, l1i, l2i;
>>> +	unsigned int port;
>>> +	int cpu = 0;
>>> +	struct shared_info *s = HYPERVISOR_shared_info;
>>> +	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
>>> +
>>> +	in_callback = 1;
>>> +
>>> +	vcpu_info->evtchn_upcall_pending = 0;
>>> +	/* NB x86. No need for a barrier here -- XCHG is a barrier on
>>> x86. */
>>> +#if !defined(__i386__) && !defined(__x86_64__)
>>> +	/* Clear master flag /before/ clearing selector flag. */
>>> +	wmb();
>>> +#endif
>>> +	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
>>> +
>>> +	while (l1 != 0) {
>>> +		l1i = __ffs(l1);
>>> +		l1 &= ~(1UL << l1i);
>>> +
>>> +		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
>>> +			l2i = __ffs(l2);
>>> +			l2 &= ~(1UL << l2i);
>>> +
>>> +			port = (l1i * (sizeof(unsigned long) * 8)) +
>>> l2i;
>>> +			/* TODO: handle new event: do_event(port,
>>> regs); */
>>> +			/* Suppress -Wunused-but-set-variable */
>>> +			(void)(port);
>>> +		}
>>> +	}
>>
>> You likely want a memory barrier here as otherwise in_callback could
>> be
>> written/seen before the loop end.
>>
> 
> We are not running in a multi-threaded environment, so probably
> in_callback should be fine as is?

It really depends on how you plan to use in_callback. If you want to use 
it in interrupt context to know whether you are dealing with a callback, 
then you will want a compiler barrier.  But...

> Or it can be removed completely as
> there are no currently users of it.

... it would be best to remove it if you do not use it.
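
For reference, the two-level scan that the quoted do_hypervisor_callback()
performs can be sketched on its own, independent of in_callback. This is only
a host-buildable illustration under assumptions: struct evtchn_state,
scan_events() and MAX_WORDS are made-up names standing in for the relevant
parts of struct shared_info, the GCC/Clang builtin __builtin_ctzl() stands in
for __ffs(), and a plain store replaces the atomic xchg().

```c
#include <stddef.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * 8)
#define MAX_WORDS	4	/* hypothetical size, chosen for the sketch */

/* Hypothetical stand-in for the event fields of struct shared_info. */
struct evtchn_state {
	unsigned long pending_sel;		/* level 1: words with events */
	unsigned long pending[MAX_WORDS];	/* level 2: one bit per port */
	unsigned long mask[MAX_WORDS];		/* level 2: masked ports */
};

/* Index of the lowest set bit; the argument must be non-zero. */
static unsigned int lowest_bit(unsigned long word)
{
	return (unsigned int)__builtin_ctzl(word);
}

/*
 * Walk the selector word, then each selected pending word, and store
 * every pending, unmasked port in ports[]. Returns the number stored.
 */
static size_t scan_events(struct evtchn_state *s, unsigned int *ports,
			  size_t max)
{
	size_t n = 0;
	unsigned long l1 = s->pending_sel;

	s->pending_sel = 0;	/* the quoted code uses an atomic xchg() here */

	while (l1 != 0) {
		unsigned int l1i = lowest_bit(l1);
		unsigned long l2;

		l1 &= ~(1UL << l1i);
		if (l1i >= MAX_WORDS)
			continue;

		l2 = s->pending[l1i] & ~s->mask[l1i];
		while (l2 != 0) {
			unsigned int l2i = lowest_bit(l2);

			l2 &= ~(1UL << l2i);
			s->pending[l1i] &= ~(1UL << l2i);	/* consume it */
			if (n < max)
				ports[n++] = (unsigned int)
					(l1i * BITS_PER_ULONG + l2i);
		}
	}
	return n;
}
```

Masked ports stay pending in this sketch, so they would be picked up again
once they are unmasked and the selector bit is set again.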


> 
>>> +
>>> +	in_callback = 0;
>>> +}
>>> +
>>> +void force_evtchn_callback(void)
>>> +{
>>> +#ifdef XEN_HAVE_PV_UPCALL_MASK
>>> +	int save;
>>> +#endif
>>> +	struct vcpu_info *vcpu;
>>> +
>>> +	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
>>
>> On Arm, this is only valid for vCPU0. For all the other vCPUs, you
>> will
>> want to register a vCPU shared info.
>>
> 
> According to Mini-OS this is also expected for x86 [1] as both ARM and
> x86 are defining smp_processor_id as 0. Do you expect any issue with
> that?

I am not sure why you are referring to Mini-OS... We are discussing this 
code in the context of U-boot.

smp_processor_id() suggests that you want to make your code ready
for SMP support. However, on Arm, if smp_processor_id() returns a
value other than 0, it would be totally broken.

Will you ever need to run this code on a CPU other than CPU0?

> [1]
> http://xenbits.xenproject.org/gitweb/?p=mini-os.git;a=blob;f=include/x86/os.h;h=a73b63e5e4e0f4b7fa7ca944739f2c3b8a956833;hb=HEAD#l10
> 
>>> +#ifdef XEN_HAVE_PV_UPCALL_MASK
>>> +	save = vcpu->evtchn_upcall_mask;
>>> +#endif
>>> +
>>> +	while (vcpu->evtchn_upcall_pending) {
>>> +#ifdef XEN_HAVE_PV_UPCALL_MASK
>>> +		vcpu->evtchn_upcall_mask = 1;
>>> +#endif
>>> +		barrier();
>>
>> What are you trying to prevent with this barrier? In particular why
>> would the compiler be an issue but not the processor?
> 
> This is the original code from Mini-OS and it seems that the barriers
> are leftovers from some old code. We do not define
> XEN_HAVE_PV_UPCALL_MASK, so this function can be stripped a lot with
> barriers removed completely.

I don't think I agree with your analysis. vcpu->evtchn_upcall_mask can
be modified by the hypervisor, so you want to make sure that
vcpu->evtchn_upcall_mask is read *after* we finish dealing with the
first round of events. Otherwise you risk delaying the handling of
events.

This likely means a "dmb ishld" + compiler barrier after
do_hypervisor_callback(). FWIW, in Linux they use virt_rmb().

I think you don't need any barrier beforehand thanks to xchg, as the
atomic built-in should already add a barrier for you (you use
__ATOMIC_SEQ_CST). Although it is probably worth checking that this is
the case.
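
The ordering described above could be sketched roughly as follows. This is a
host-buildable illustration, not the U-Boot code: struct vcpu_state,
handle_events() and force_callback_sketch() are made-up names, and a C11
acquire fence is assumed as the stand-in for "dmb ishld" + compiler barrier
(compilers typically emit "dmb ishld" for it on AArch64, but that is worth
checking in the generated code).

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the fields of struct vcpu_info used here. */
struct vcpu_state {
	volatile unsigned char upcall_pending;	/* may be set by the hypervisor */
	unsigned long pending_sel;
};

/* Counts rounds that found work; only here to observe the sketch. */
static int handled_rounds;

/* Orders earlier loads before later ones and stops compiler reordering. */
#define read_barrier()	atomic_thread_fence(memory_order_acquire)

/* Stand-in for the callback: clear the master flag, then take the selector. */
static void handle_events(struct vcpu_state *v)
{
	unsigned long sel;

	v->upcall_pending = 0;
	/* Same built-in and memory order as the xchg() macro quoted above. */
	sel = __atomic_exchange_n(&v->pending_sel, 0UL, __ATOMIC_SEQ_CST);
	if (sel != 0)
		handled_rounds++;
}

static void force_callback_sketch(struct vcpu_state *v)
{
	while (v->upcall_pending) {
		handle_events(v);
		/* Re-read the pending flag only after this round is done. */
		read_barrier();
	}
}
```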

>>> +#endif
>>> +	};
>>> +}
>>> +
>>> +void mask_evtchn(uint32_t port)
>>> +{
>>> +	struct shared_info *s = HYPERVISOR_shared_info;
>>> +	synch_set_bit(port, &s->evtchn_mask[0]);
>>> +}
>>> +
>>> +void unmask_evtchn(uint32_t port)
>>> +{
>>> +	struct shared_info *s = HYPERVISOR_shared_info;
>>> +	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
>>> +
>>> +	synch_clear_bit(port, &s->evtchn_mask[0]);
>>> +
>>> +	/*
>>> +	 * The following is basically the equivalent of
>>> 'hw_resend_irq'. Just like
>>> +	 * a real IO-APIC we 'lose the interrupt edge' if the channel
>>> is masked.
>>> +	 */
>>
>> This seems to be out-of-context now, you might want to update it.
> 
> I am not sure I understand it right.
> Could you please clarify what do you mean under the word "update"?

Well, the comment is referring to "hw_resend_irq". I guess this is a
function, but I can't find any code for it in either Mini-OS or U-Boot.

Therefore the comment seems to be wrong and needs to be updated.

> 
>>
>>> +	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
>>> +	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
>>> +				    &vcpu_info->evtchn_pending_sel)) {
>>> +		vcpu_info->evtchn_upcall_pending = 1;
>>> +#ifdef XEN_HAVE_PV_UPCALL_MASK
>>> +		if (!vcpu_info->evtchn_upcall_mask)
>>> +#endif
>>> +			force_evtchn_callback();
>>> +	}
>>> +}
>>> +
>>> +void clear_evtchn(uint32_t port)
>>> +{
>>> +	struct shared_info *s = HYPERVISOR_shared_info;
>>> +
>>> +	synch_clear_bit(port, &s->evtchn_pending[0]);
>>> +}
>>> +
>>> +void xen_init(void)
>>> +{
>>> +	debug("%s\n", __func__);
>>
>> Is this a left-over?
> 
> I think this is a relevant comment for debug purpose.
> But we do not mind removing it, if it seems superfluous.

That's fine. I was just asking if it was still worth it.

Cheers,

-- 
Julien Grall


* [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver
  2020-07-02  4:29   ` Heinrich Schuchardt
  2020-07-02  5:30     ` Peng Fan
@ 2020-07-03 14:14     ` Anastasiia Lukianenko
  1 sibling, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-03 14:14 UTC (permalink / raw)
  To: u-boot

Hi Heinrich,

On Thu, 2020-07-02 at 06:29 +0200, Heinrich Schuchardt wrote:
> On 7/1/20 6:29 PM, Anastasiia Lukianenko wrote:
> > From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > 
> > Add initial infrastructure for Xen para-virtualized block device.
> > This includes compile-time configuration and the skeleton for
> > the future driver implementation.
> > Add new class UCLASS_PVBLOCK which is going to be a parent for
> > virtual block devices.
> 
> We already have virtual block devices: virtio_blk, efi_blk.
> 
> They work fine using the existing UCLASS_BLK. Why do we need a new
> uclass?
> 

During the implementation we had a discussion with Simon Glass on that
[1]. PVBLOCK could be just a UCLASS_BLK driver with the parent being
gd->dm_root, but in this case most of the file system/block device
commands fail to work, as they expect the pvblock driver to be a
parent device and the pvblock blk device to be its child, in line
with the U-Boot driver architecture.

When we bind a new driver [2], we need to specify its parent. There is
no UCLASS_BLK device available at that moment, so only a UCLASS_ROOT
device can be used, but parent and child interface types
must be the same [3].

So we would then need the interface to be defined as [IF_TYPE_PVBLOCK] =
UCLASS_ROOT, which doesn't seem right. Therefore, we created a new
UCLASS_PVBLOCK class for para-virtual block devices [4].

The driver class provides methods for accessing the "bus" and actually
implements the ops for reading and writing. But we don't have a "bus"
or a common transport as virtio has, so we are a bit different in
that respect from the drivers you mentioned.

[1] - https://www.mail-archive.com/u-boot@lists.denx.de/msg371956.html
[2] - 
https://github.com/xen-troops/u-boot/blob/master/drivers/xen/pvblock.c#L697
[3] - 
https://github.com/xen-troops/u-boot/blob/master/drivers/block/blk-uclass.c#L124
[4] - 
https://github.com/xen-troops/u-boot/blob/master/drivers/block/blk-uclass.c#L47
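
To illustrate the parent/child constraint described above, here is a small
stand-alone model. It is a simplified sketch, not the code from
blk-uclass.c: all names with the sk_/SK_ prefix are hypothetical, and the
check is reduced to "the parent of a block device must belong to the uclass
mapped to the interface type".

```c
#include <stddef.h>

/* Reduced, hypothetical copies of the enums touched by the patch. */
enum sk_uclass_id {
	SK_UCLASS_ROOT,
	SK_UCLASS_BLK,
	SK_UCLASS_VIRTIO,
	SK_UCLASS_PVBLOCK,
};

enum sk_if_type {
	SK_IF_TYPE_VIRTIO,
	SK_IF_TYPE_PVBLOCK,
	SK_IF_TYPE_COUNT,
};

/* Interface type to parent uclass, in the style of if_type_uclass_id[]. */
static const enum sk_uclass_id sk_if_type_uclass[SK_IF_TYPE_COUNT] = {
	[SK_IF_TYPE_VIRTIO]	= SK_UCLASS_VIRTIO,
	[SK_IF_TYPE_PVBLOCK]	= SK_UCLASS_PVBLOCK,
};

struct sk_device {
	enum sk_uclass_id uclass;
	const struct sk_device *parent;
};

/*
 * Returns 1 if blk is a block device whose parent belongs to the uclass
 * mapped to the given interface type, 0 otherwise.
 */
static int sk_blk_matches_if_type(const struct sk_device *blk,
				  enum sk_if_type type)
{
	if (!blk || blk->uclass != SK_UCLASS_BLK || !blk->parent)
		return 0;
	return blk->parent->uclass == sk_if_type_uclass[type];
}
```

In this model a blk device bound directly under the root device never matches
the pvblock interface type, while one bound under a UCLASS_PVBLOCK parent
does.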

> > Add new interface type IF_TYPE_PVBLOCK.
> > 
> > Implement basic driver setup by reading XenStore configuration.
> > 
> > Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > ---
> >  cmd/Kconfig                      |   7 ++
> >  cmd/Makefile                     |   1 +
> >  cmd/pvblock.c                    |  31 ++++++++
> >  common/board_r.c                 |  14 ++++
> >  configs/xenguest_arm64_defconfig |   4 +
> >  disk/part.c                      |   4 +
> >  drivers/Kconfig                  |   2 +
> >  drivers/block/blk-uclass.c       |   2 +
> >  drivers/xen/Kconfig              |  10 +++
> >  drivers/xen/Makefile             |   2 +
> >  drivers/xen/pvblock.c            | 121
> > +++++++++++++++++++++++++++++++
> >  include/blk.h                    |   1 +
> >  include/configs/xenguest_arm64.h |   8 ++
> >  include/dm/uclass-id.h           |   1 +
> >  include/pvblock.h                |  12 +++
> >  15 files changed, 220 insertions(+)
> >  create mode 100644 cmd/pvblock.c
> >  create mode 100644 drivers/xen/Kconfig
> >  create mode 100644 drivers/xen/pvblock.c
> >  create mode 100644 include/pvblock.h
> > 
> > diff --git a/cmd/Kconfig b/cmd/Kconfig
> > index 192b3b262f..f28576947b 100644
> > --- a/cmd/Kconfig
> > +++ b/cmd/Kconfig
> > @@ -1335,6 +1335,13 @@ config CMD_USB_MASS_STORAGE
> >  	help
> >  	  USB mass storage support
> > 
> > +config CMD_PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on XEN
> > +	select PVBLOCK
> > +	help
> > +	  Xen para-virtualized block device support
> > +
> >  config CMD_VIRTIO
> >  	bool "virtio"
> >  	depends on VIRTIO
> > diff --git a/cmd/Makefile b/cmd/Makefile
> > index 974ad48b0a..117284a28c 100644
> > --- a/cmd/Makefile
> > +++ b/cmd/Makefile
> > @@ -169,6 +169,7 @@ obj-$(CONFIG_CMD_DFU) += dfu.o
> >  obj-$(CONFIG_CMD_GPT) += gpt.o
> >  obj-$(CONFIG_CMD_ETHSW) += ethsw.o
> >  obj-$(CONFIG_CMD_AXI) += axi.o
> > +obj-$(CONFIG_CMD_PVBLOCK) += pvblock.o
> > 
> >  # Power
> >  obj-$(CONFIG_CMD_PMIC) += pmic.o
> > diff --git a/cmd/pvblock.c b/cmd/pvblock.c
> > new file mode 100644
> > index 0000000000..7dbb243a74
> > --- /dev/null
> > +++ b/cmd/pvblock.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> Please, correct the formatting.
> 
> https://www.kernel.org/doc/html/latest/process/license-rules.html#license-identifier-syntax
> 

Ok.

> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + *
> > + * XEN para-virtualized block device support
> > + */
> > +
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <command.h>
> > +
> > +/* Current I/O Device	*/
> > +static int pvblock_curr_device;
> > +
> > +int do_pvblock(struct cmd_tbl *cmdtp, int flag, int argc, char
> > *const argv[])
> > +{
> > +	return blk_common_cmd(argc, argv, IF_TYPE_PVBLOCK,
> > +			      &pvblock_curr_device);
> > +}
> > +
> > +U_BOOT_CMD(pvblock, 5, 1, do_pvblock,
> > +	   "Xen para-virtualized block device",
> > +	   "info  - show available block devices\n"
> > +	   "pvblock device [dev] - show or set current device\n"
> > +	   "pvblock part [dev] - print partition table of one or all
> > devices\n"
> > +	   "pvblock read  addr blk# cnt\n"
> > +	   "pvblock write addr blk# cnt - read/write `cnt'"
> > +	   " blocks starting at block `blk#'\n"
> > +	   "    to/from memory address `addr'");
> > +
> > diff --git a/common/board_r.c b/common/board_r.c
> > index fd36edb4e5..40cd0e5d3c 100644
> > --- a/common/board_r.c
> > +++ b/common/board_r.c
> > @@ -49,6 +49,7 @@
> >  #include <nand.h>
> >  #include <of_live.h>
> >  #include <onenand_uboot.h>
> > +#include <pvblock.h>
> >  #include <scsi.h>
> >  #include <serial.h>
> >  #include <status_led.h>
> > @@ -470,6 +471,16 @@ static int initr_xen(void)
> >  	return 0;
> >  }
> >  #endif
> > +
> > +#ifdef CONFIG_PVBLOCK
> > +static int initr_pvblock(void)
> > +{
> > +	puts("PVBLOCK: ");
> > +	pvblock_init();
> > +	return 0;
> > +}
> > +#endif
> > +
> >  /*
> >   * Tell if it's OK to load the environment early in boot.
> >   *
> > @@ -780,6 +791,9 @@ static init_fnc_t init_sequence_r[] = {
> >  #endif
> >  #ifdef CONFIG_XEN
> >  	initr_xen,
> > +#endif
> > +#ifdef CONFIG_PVBLOCK
> > +	initr_pvblock,
> >  #endif
> >  	initr_env,
> >  #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> > diff --git a/configs/xenguest_arm64_defconfig
> > b/configs/xenguest_arm64_defconfig
> > index 45559a161b..46473c251d 100644
> > --- a/configs/xenguest_arm64_defconfig
> > +++ b/configs/xenguest_arm64_defconfig
> > @@ -14,6 +14,8 @@ CONFIG_CMD_BOOTD=n
> >  CONFIG_CMD_BOOTEFI=n
> >  CONFIG_CMD_BOOTEFI_HELLO_COMPILE=n
> >  CONFIG_CMD_ELF=n
> > +CONFIG_CMD_EXT4=y
> > +CONFIG_CMD_FAT=y
> >  CONFIG_CMD_GO=n
> >  CONFIG_CMD_RUN=n
> >  CONFIG_CMD_IMI=n
> > @@ -41,6 +43,8 @@ CONFIG_CMD_LZMADEC=n
> >  CONFIG_CMD_SAVEENV=n
> >  CONFIG_CMD_UMS=n
> > 
> > +CONFIG_CMD_PVBLOCK=y
> > +
> >  #CONFIG_USB=n
> >  # CONFIG_ISO_PARTITION is not set
> > 
> > diff --git a/disk/part.c b/disk/part.c
> > index f6a31025dc..b69fd345f3 100644
> > --- a/disk/part.c
> > +++ b/disk/part.c
> > @@ -149,6 +149,7 @@ void dev_print (struct blk_desc *dev_desc)
> >  	case IF_TYPE_MMC:
> >  	case IF_TYPE_USB:
> >  	case IF_TYPE_NVME:
> > +	case IF_TYPE_PVBLOCK:
> >  		printf ("Vendor: %s Rev: %s Prod: %s\n",
> >  			dev_desc->vendor,
> >  			dev_desc->revision,
> > @@ -288,6 +289,9 @@ static void print_part_header(const char *type,
> > struct blk_desc *dev_desc)
> >  	case IF_TYPE_NVME:
> >  		puts ("NVMe");
> >  		break;
> > +	case IF_TYPE_PVBLOCK:
> > +		puts("PV BLOCK");
> > +		break;
> >  	case IF_TYPE_VIRTIO:
> >  		puts("VirtIO");
> >  		break;
> > diff --git a/drivers/Kconfig b/drivers/Kconfig
> > index e34a22708c..65076aab03 100644
> > --- a/drivers/Kconfig
> > +++ b/drivers/Kconfig
> > @@ -132,6 +132,8 @@ source "drivers/w1-eeprom/Kconfig"
> > 
> >  source "drivers/watchdog/Kconfig"
> > 
> > +source "drivers/xen/Kconfig"
> > +
> >  config PHYS_TO_BUS
> >  	bool "Custom physical to bus address mapping"
> >  	help
> > diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-
> > uclass.c
> > index b19375cbc8..6cfabbca24 100644
> > --- a/drivers/block/blk-uclass.c
> > +++ b/drivers/block/blk-uclass.c
> > @@ -28,6 +28,7 @@ static const char *if_typename_str[IF_TYPE_COUNT]
> > = {
> >  	[IF_TYPE_NVME]		= "nvme",
> >  	[IF_TYPE_EFI]		= "efi",
> >  	[IF_TYPE_VIRTIO]	= "virtio",
> > +	[IF_TYPE_PVBLOCK]	= "pvblock",
> >  };
> > 
> >  static enum uclass_id if_type_uclass_id[IF_TYPE_COUNT] = {
> > @@ -43,6 +44,7 @@ static enum uclass_id
> > if_type_uclass_id[IF_TYPE_COUNT] = {
> >  	[IF_TYPE_NVME]		= UCLASS_NVME,
> >  	[IF_TYPE_EFI]		= UCLASS_EFI,
> >  	[IF_TYPE_VIRTIO]	= UCLASS_VIRTIO,
> > +	[IF_TYPE_PVBLOCK]	= UCLASS_PVBLOCK,
> >  };
> > 
> >  static enum if_type if_typename_to_iftype(const char *if_typename)
> > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> > new file mode 100644
> > index 0000000000..6ad2a93668
> > --- /dev/null
> > +++ b/drivers/xen/Kconfig
> > @@ -0,0 +1,10 @@
> > +config PVBLOCK
> > +	bool "Xen para-virtualized block device"
> > +	depends on DM
> > +	select BLK
> > +	select HAVE_BLOCK_DEVICE
> > +	help
> > +	  This driver implements the front-end of the Xen virtual
> > +	  block device driver. It communicates with a back-end driver
> > +	  in another domain which drives the actual block device.
> > +
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 243b13277a..87157df69b 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -6,3 +6,5 @@ obj-y += hypervisor.o
> >  obj-y += events.o
> >  obj-y += xenbus.o
> >  obj-y += gnttab.o
> > +
> > +obj-$(CONFIG_PVBLOCK) += pvblock.o
> > diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> > new file mode 100644
> > index 0000000000..057add9753
> > --- /dev/null
> > +++ b/drivers/xen/pvblock.c
> > @@ -0,0 +1,121 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> See above.
> 

Ok, will fix.

> > + *
> > + * (C) Copyright 2020 EPAM Systems Inc.
> > + */
> > +#include <blk.h>
> > +#include <common.h>
> > +#include <dm.h>
> > +#include <dm/device-internal.h>
> > +
> > +#define DRV_NAME	"pvblock"
> > +#define DRV_NAME_BLK	"pvblock_blk"
> > +
> > +struct blkfront_dev {
> > +	char dummy;
> > +};
> > +
> > +static int init_blkfront(unsigned int devid, struct blkfront_dev
> > *dev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static void shutdown_blkfront(struct blkfront_dev *dev)
> > +{
> > +}
> > +
> > +ulong pvblock_blk_read(struct udevice *udev, lbaint_t blknr,
> > lbaint_t blkcnt,
> > +		       void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +ulong pvblock_blk_write(struct udevice *udev, lbaint_t blknr,
> > lbaint_t blkcnt,
> > +			const void *buffer)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_bind(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_probe(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +	int ret;
> > +
> > +	ret = init_blkfront(0, blk_dev);
> > +	if (ret < 0)
> > +		return ret;
> > +	return 0;
> > +}
> > +
> > +static int pvblock_blk_remove(struct udevice *udev)
> > +{
> > +	struct blkfront_dev *blk_dev = dev_get_priv(udev);
> > +
> > +	shutdown_blkfront(blk_dev);
> > +	return 0;
> > +}
> > +
> > +static const struct blk_ops pvblock_blk_ops = {
> > +	.read	= pvblock_blk_read,
> > +	.write	= pvblock_blk_write,
> > +};
> > +
> > +U_BOOT_DRIVER(pvblock_blk) = {
> > +	.name			= DRV_NAME_BLK,
> > +	.id			= UCLASS_BLK,
> > +	.ops			= &pvblock_blk_ops,
> > +	.bind			= pvblock_blk_bind,
> > +	.probe			= pvblock_blk_probe,
> > +	.remove			= pvblock_blk_remove,
> > +	.priv_auto_alloc_size	= sizeof(struct blkfront_dev),
> > +	.flags			= DM_FLAG_OS_PREPARE,
> > +};
> > +
> > +/*****************************************************************
> > **************
> > + * Para-virtual block device class
> > +
> > *******************************************************************
> > ************/
> > +
> > +void pvblock_init(void)
> > +{
> > +	struct driver_info info;
> > +	struct udevice *udev;
> > +	struct uclass *uc;
> > +	int ret;
> > +
> > +	/*
> > +	 * At this point Xen drivers have already initialized,
> > +	 * so we can instantiate the class driver and enumerate
> > +	 * virtual block devices.
> > +	 */
> > +	info.name = DRV_NAME;
> > +	ret = device_bind_by_name(gd->dm_root, false, &info, &udev);
> > +	if (ret < 0)
> > +		printf("Failed to bind " DRV_NAME ", ret: %d\n", ret);
> > +
> > +	/* Bootstrap virtual block devices class driver */
> > +	ret = uclass_get(UCLASS_PVBLOCK, &uc);
> > +	if (ret)
> > +		return;
> > +	uclass_foreach_dev_probe(UCLASS_PVBLOCK, udev);
> > +}
> > +
> > +static int pvblock_probe(struct udevice *udev)
> > +{
> > +	return 0;
> > +}
> > +
> > +U_BOOT_DRIVER(pvblock_drv) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +	.probe		= pvblock_probe,
> > +};
> > +
> > +UCLASS_DRIVER(pvblock) = {
> > +	.name		= DRV_NAME,
> > +	.id		= UCLASS_PVBLOCK,
> > +};
> > diff --git a/include/blk.h b/include/blk.h
> > index abcd4bedbb..9ee10fb80e 100644
> > --- a/include/blk.h
> > +++ b/include/blk.h
> > @@ -33,6 +33,7 @@ enum if_type {
> >  	IF_TYPE_HOST,
> >  	IF_TYPE_NVME,
> >  	IF_TYPE_EFI,
> > +	IF_TYPE_PVBLOCK,
> >  	IF_TYPE_VIRTIO,
> > 
> >  	IF_TYPE_COUNT,			/* Number of interface
> > types */
> > diff --git a/include/configs/xenguest_arm64.h
> > b/include/configs/xenguest_arm64.h
> > index 467dabf1e5..2c0d3d64fb 100644
> > --- a/include/configs/xenguest_arm64.h
> > +++ b/include/configs/xenguest_arm64.h
> > @@ -42,4 +42,12 @@
> >  #define CONFIG_CMDLINE_TAG            1
> >  #define CONFIG_INITRD_TAG             1
> > 
> > +#define CONFIG_CMD_RUN
> > +
> > +#undef CONFIG_EXTRA_ENV_SETTINGS
> > +#define CONFIG_EXTRA_ENV_SETTINGS	\
> > +	"loadimage=ext4load pvblock 0 0x90000000 /boot/Image;\0" \
> > +	"pvblockboot=run loadimage;" \
> > +		"booti 0x90000000 - 0x88000000;\0"
> > +
> >  #endif /* __XENGUEST_ARM64_H */
> > diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
> > index 7837d459f1..4bf7501204 100644
> > --- a/include/dm/uclass-id.h
> > +++ b/include/dm/uclass-id.h
> > @@ -121,6 +121,7 @@ enum uclass_id {
> >  	UCLASS_W1,		/* Dallas 1-Wire bus */
> >  	UCLASS_W1_EEPROM,	/* one-wire EEPROMs */
> >  	UCLASS_WDT,		/* Watchdog Timer driver */
> > +	UCLASS_PVBLOCK,		/* Xen virtual block device */
> > 
> >  	UCLASS_COUNT,
> >  	UCLASS_INVALID = -1,
> > diff --git a/include/pvblock.h b/include/pvblock.h
> > new file mode 100644
> > index 0000000000..e3bb8ff9a7
> > --- /dev/null
> > +++ b/include/pvblock.h
> > @@ -0,0 +1,12 @@
> > +/*
> > + * SPDX-License-Identifier:	GPL-2.0+
> 
> See above.
> 
> scripts/checkpatch.pl is your friend.
> 
> > + *
> > + * (C) 2020 EPAM Systems Inc.
> > + */
> > +
> > +#ifndef _PVBLOCK_H
> > +#define _PVBLOCK_H
> > +
> 
> Document your functions as described in
> 
> https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
> 
> Include the documentation in the HTML documentation generated by
> 
>     make htmldocs.

Ok, will add.
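
As a rough idea of what the reworked header could look like, assuming the
license-rules convention of a one-line SPDX comment at the top of a header
and a kernel-doc block for the function. The description text is only a
guess based on what pvblock_init() does in this patch and is not final
wording.

```c
/* SPDX-License-Identifier: GPL-2.0+ */
/*
 * (C) 2020 EPAM Systems Inc.
 */

#ifndef _PVBLOCK_H
#define _PVBLOCK_H

/**
 * pvblock_init() - Initialize para-virtualized block devices
 *
 * Binds the "pvblock" class driver and probes the devices of
 * UCLASS_PVBLOCK. Expects the Xen drivers to be initialized already.
 */
void pvblock_init(void);

#endif /* _PVBLOCK_H */
```

For .c files the same rules ask for the "//" comment form on the first line
instead.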

> 
> Best regards
> 
> Heinrich
> 

Best regards,
Anastasiia

> > +void pvblock_init(void);
> > +
> > +#endif /* _PVBLOCK_H */
> > 
> 
> 


* [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-06  9:08     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-06  9:08 UTC (permalink / raw)
  To: u-boot

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> Hi Anastasiia,
> 
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
> > 
> > From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > 
> > Read essential virtual block device configuration data from
> > XenStore,
> > initialize front ring and event channel.
> > Update block device description with actual block size.
> > 
> > Use code for XenStore from mini-os.
> > 
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > ---
> >  drivers/xen/pvblock.c | 272
> > +++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 271 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/pvblock.c b/drivers/xen/pvblock.c
> > index 6ce0ae97c3..9ed18be633 100644
> > --- a/drivers/xen/pvblock.c
> > +++ b/drivers/xen/pvblock.c
> > @@ -1,6 +1,7 @@
> >  /*
> >   * SPDX-License-Identifier:    GPL-2.0+
> >   *
> > + * (C) 2007-2008 Samuel Thibault.
> >   * (C) Copyright 2020 EPAM Systems Inc.
> >   */
> >  #include <blk.h>
> > @@ -10,26 +11,289 @@
> >  #include <malloc.h>
> >  #include <part.h>
> > 
> > +#include <asm/armv8/mmu.h>
> > +#include <asm/io.h>
> > +#include <asm/xen/system.h>
> > +
> > +#include <linux/compat.h>
> > +
> > +#include <xen/events.h>
> > +#include <xen/gnttab.h>
> > +#include <xen/hvm.h>
> >  #include <xen/xenbus.h>
> > 
> > +#include <xen/interface/io/ring.h>
> > +#include <xen/interface/io/blkif.h>
> > +#include <xen/interface/io/protocols.h>
> > +
> >  #define DRV_NAME       "pvblock"
> >  #define DRV_NAME_BLK   "pvblock_blk"
> > 
> > +#define O_RDONLY       00
> > +#define O_RDWR         02
> > +
> > +struct blkfront_info {
> > +       u64 sectors;
> > +       unsigned int sector_size;
> > +       int mode;
> > +       int info;
> > +       int barrier;
> > +       int flush;
> > +};
> > +
> >  struct blkfront_dev {
> > -       char dummy;
> > +       domid_t dom;
> > +
> > +       struct blkif_front_ring ring;
> > +       grant_ref_t ring_ref;
> > +       evtchn_port_t evtchn;
> > +       blkif_vdev_t handle;
> > +
> > +       char *nodename;
> > +       char *backend;
> > +       struct blkfront_info info;
> > +       unsigned int devid;
> 
> How about some comments?
> 

Ok, will add.

> Regards,
> Simon

Regards,
Anastasiia


* [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-06  9:10     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-06  9:10 UTC (permalink / raw)
  To: u-boot

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
> > 
> > From: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > 
> > Implement Xen para-virtual frontend to backend communication
> > and actually read/write disk data.
> > 
> > This is based on mini-os implementation of the para-virtual block
> > frontend driver.
> > 
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > ---
> >  drivers/xen/events.c  |   2 +-
> >  drivers/xen/pvblock.c | 311
> > ++++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 301 insertions(+), 12 deletions(-)
> 
> Please can you comment structs and non-trivial functions?

Sure.

Regards,
Anastasiia


* [PATCH 17/17] board: xen: De-initialize before jumping to Linux
  2020-07-03  3:50   ` Simon Glass
@ 2020-07-06  9:13     ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-06  9:13 UTC (permalink / raw)
  To: u-boot

On Thu, 2020-07-02 at 21:50 -0600, Simon Glass wrote:
> On Wed, 1 Jul 2020 at 10:30, Anastasiia Lukianenko <vicooodin@gmail.com> wrote:
> > 
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Free resources used by Xen board before jumping to Linux kernel.
> > 
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
> > ---
> >  board/xen/xenguest_arm64/xenguest_arm64.c | 6 ++++++
> >  drivers/xen/hypervisor.c                  | 8 ++++++++
> >  include/xen.h                             | 1 +
> >  3 files changed, 15 insertions(+)
> 
> Reviewed-by: Simon Glass <sjg@chromium.org>
> 
> 
> 
> > 
> > diff --git a/board/xen/xenguest_arm64/xenguest_arm64.c
> > b/board/xen/xenguest_arm64/xenguest_arm64.c
> > index b4e1650f99..76a18bea8b 100644
> > --- a/board/xen/xenguest_arm64/xenguest_arm64.c
> > +++ b/board/xen/xenguest_arm64/xenguest_arm64.c
> > @@ -13,6 +13,7 @@
> >  #include <dm.h>
> >  #include <errno.h>
> >  #include <malloc.h>
> > +#include <xen.h>
> > 
> >  #include <asm/io.h>
> >  #include <asm/armv8/mmu.h>
> > @@ -195,3 +196,8 @@ int print_cpuinfo(void)
> >         return 0;
> >  }
> > 
> > +void board_cleanup_before_linux(void)
> > +{
> > +       xen_fini();
> > +}
> > +
> > diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> > index f3c2504d72..8d7d320839 100644
> > --- a/drivers/xen/hypervisor.c
> > +++ b/drivers/xen/hypervisor.c
> > @@ -279,3 +279,11 @@ void xen_init(void)
> >         init_gnttab();
> >  }
> > 
> > +void xen_fini(void)
> > +{
> > +       debug("%s\n", __func__);
> > +
> > +       fini_gnttab();
> > +       fini_xenbus();
> > +       fini_events();
> > +}
> > diff --git a/include/xen.h b/include/xen.h
> > index 1d6f74cc92..327d7e132b 100644
> > --- a/include/xen.h
> > +++ b/include/xen.h
> > @@ -7,5 +7,6 @@
> >  #define __XEN_H__
> > 
> >  void xen_init(void);
> > +void xen_fini(void);
> 
> Comment? What does this do?
> 
Ok, will add.
> 
> > 
> >  #endif /* __XEN_H__ */
> > --
> > 2.17.1
> > 
> 
> - Simon

Regards,
Anastasiia


* [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os
  2020-07-03 13:38       ` Julien Grall
@ 2020-07-08  8:55         ` Anastasiia Lukianenko
  0 siblings, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-08  8:55 UTC (permalink / raw)
  To: u-boot

Hi,

On Fri, 2020-07-03 at 14:38 +0100, Julien Grall wrote:
> Hi,
> 
> On 03/07/2020 13:21, Anastasiia Lukianenko wrote:
> > Hi Julien,
> > 
> > On Wed, 2020-07-01 at 18:46 +0100, Julien Grall wrote:
> > > Title: s/hypervizor/hypervisor/
> > 
> > Thank you for pointing :) I will fix it in the next version.
> > 
> > > 
> > > On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> > > > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > > > 
> > > > Port hypervizor related code from mini-os. Update essential
> > > 
> > > Ditto.
> > > 
> > > But I would be quite cautious to import code from mini-OS in order to
> > > support Arm. The port has always been broken and from a look below
> > > needs to be refined for Arm.
> > 
> > We were referencing the code of Mini-OS from [1] by Huang Shijie and
> > Volodymyr Babchuk which is for ARM64, so we hope this part should be
> > ok.
> > 
> > [1] https://github.com/zyzii/mini-os.git
> >  
> 
> Well, that's not part of the official port. It would have been nice to
> at least mention that somewhere in the series.
> 

Sure, will mention.

> > > > +	return result;
> > > > +}
> > > 
> > > I can understand why we implement sync_* helpers as AFAICT the generic
> > > helpers are not SMP safe. However...
> > > 
> > > > +
> > > > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> > > > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> > > > +
> > > > +#define mb()		dsb()
> > > > +#define rmb()		dsb()
> > > > +#define wmb()		dsb()
> > > > +#define __iormb()	dmb()
> > > > +#define __iowmb()	dmb()
> > > 
> > > Why do you need to re-implement the barriers?
> > 
> > Indeed, we do not need to do this.
> > I will fix it in the next version.
> > 
> > > 
> > > > +#define xen_mb()	mb()
> > > > +#define xen_rmb()	rmb()
> > > > +#define xen_wmb()	wmb()
> > > > +
> > > > +#define smp_processor_id()	0
> > > 
> > > Shouldn't this be common?
> > 
> > Currently it is only used by Xen and we are not sure if
> > any other entity will use it, but we can put that into
> > arch/arm/include/asm/io.h
> 
> I looked at the usage in Xen and don't really think it would help in any
> way to get the code SMP ready. Will U-Boot enable Xen features on
> secondary CPUs? If not, then I would recommend just dropping it.
> 

Ok, will drop

> [...]
> 
> > > 
> > > > +
> > > > +#endif
> > > > diff --git a/common/board_r.c b/common/board_r.c
> > > > index fa57fa9b69..fd36edb4e5 100644
> > > > --- a/common/board_r.c
> > > > +++ b/common/board_r.c
> > > > @@ -56,6 +56,7 @@
> > > >    #include <timer.h>
> > > >    #include <trace.h>
> > > >    #include <watchdog.h>
> > > > +#include <xen.h>
> > > 
> > > Do we want to include it for other boards?
> > 
> > For now, we do not have a plan and resources to support
> > anything other than what we need. Therefore only ARM64.
> 
> I think you misunderstood my comment here. The file seems to be common
> but you include xen.h unconditionally. Is it really what you want to do?
> 
> > > > +/*
> > > > + * Shared page for communicating with the hypervisor.
> > > > + * Events flags go here, for example.
> > > > + */
> > > > +struct shared_info *HYPERVISOR_shared_info;
> > > > +
> > > > +#ifndef CONFIG_PARAVIRT
> > > 
> > > Is there any plan to support this on x86?
> > 
> > For now, we do not have a plan and resources to support
> > anything other
> > than what we need. Therefore only ARM64.
> 
> Ok. I doubt that anyone will want to use U-boot on PV x86. So I would
> recommend dropping anything related to CONFIG_PARAVIRT.
> 

Ok, will remove

> > > > +{
> > > > +	struct xen_hvm_param xhv;
> > > > +	int ret;
> > > 
> > > I don't think there is a guarantee that your cache is going to be
> > > clean when writing xhv. So you likely want to add an
> > > invalidate_dcache_range() call before writing it.
> > 
> > Thank you for the advice.
> > Ah, so we need something like:
> > 
> > ...
> > invalidate_dcache_range((unsigned long)&xhv,
> > 			(unsigned long)&xhv + sizeof(xhv));
> > xhv.domid = DOMID_SELF;
> > xhv.index = idx;
> > invalidate_dcache_range((unsigned long)&xhv,
> > 			(unsigned long)&xhv + sizeof(xhv));
> > ...
> 
> Right, this would indeed be safer.
> 
> [...]
> 
> > > > +void do_hypervisor_callback(struct pt_regs *regs)
> > > > +{
> > > > +	unsigned long l1, l2, l1i, l2i;
> > > > +	unsigned int port;
> > > > +	int cpu = 0;
> > > > +	struct shared_info *s = HYPERVISOR_shared_info;
> > > > +	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
> > > > +
> > > > +	in_callback = 1;
> > > > +
> > > > +	vcpu_info->evtchn_upcall_pending = 0;
> > > > +	/* NB x86. No need for a barrier here -- XCHG is a barrier on x86. */
> > > > +#if !defined(__i386__) && !defined(__x86_64__)
> > > > +	/* Clear master flag /before/ clearing selector flag. */
> > > > +	wmb();
> > > > +#endif
> > > > +	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
> > > > +
> > > > +	while (l1 != 0) {
> > > > +		l1i = __ffs(l1);
> > > > +		l1 &= ~(1UL << l1i);
> > > > +
> > > > +		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
> > > > +			l2i = __ffs(l2);
> > > > +			l2 &= ~(1UL << l2i);
> > > > +
> > > > +			port = (l1i * (sizeof(unsigned long) * 8)) + l2i;
> > > > +			/* TODO: handle new event: do_event(port, regs); */
> > > > +			/* Suppress -Wunused-but-set-variable */
> > > > +			(void)(port);
> > > > +		}
> > > > +	}
> > > 
> > > You likely want a memory barrier here as otherwise in_callback could be
> > > written/seen before the loop end.
> > > 
> > 
> > We are not running in a multi-threaded environment, so probably
> > in_callback should be fine as is?
> 
> It really depends on how you plan to use in_callback. If you want to use
> it in interrupt context to know whether you are dealing with a callback,
> then you will want a compiler barrier.  But...
> 
> > Or it can be removed completely as
> > there are currently no users of it.
> 
> ... it would be best to remove if you
> 

Ok, will remove.
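
For reference, the two-level pending-bitmap walk in do_hypervisor_callback() can be sketched on a host as follows. This is only an illustration under stated assumptions: struct fake_shared_info, NR_WORDS and scan_events() are hypothetical stand-ins for the real shared_info layout and handler, and the GCC/Clang builtin __builtin_ctzl() replaces __ffs(). The sketch also clears each pending bit itself so that the inner loop makes progress, a step the quoted code leaves to the still-missing do_event() handler.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS	4	/* hypothetical size, chosen for the example */

/* Hypothetical, reduced model of the pending and mask bitmaps. */
struct fake_shared_info {
	unsigned long evtchn_pending[NR_WORDS];
	unsigned long evtchn_mask[NR_WORDS];
};

/*
 * Walk the selector word 'sel' (one bit per bitmap word), then every
 * pending and unmasked bit in each selected word. Ports found are
 * stored in out[] (at most 'max'); the number stored is returned.
 */
static unsigned int scan_events(struct fake_shared_info *s, unsigned long sel,
				unsigned int *out, unsigned int max)
{
	unsigned int n = 0;

	while (sel != 0) {
		unsigned long l1i = (unsigned long)__builtin_ctzl(sel);
		unsigned long l2;

		sel &= ~(1UL << l1i);
		assert(l1i < NR_WORDS);

		while ((l2 = s->evtchn_pending[l1i] & ~s->evtchn_mask[l1i]) != 0) {
			unsigned long l2i = (unsigned long)__builtin_ctzl(l2);

			/* Acknowledge the event so the loop terminates. */
			s->evtchn_pending[l1i] &= ~(1UL << l2i);
			if (n < max)
				out[n++] = (unsigned int)(l1i * BITS_PER_LONG + l2i);
		}
	}

	return n;
}
```

The port number is the word index times the word width plus the bit index, which matches the expression used for 'port' in the patch. Masked events stay pending and are not reported.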

> 
> > 
> > > > +
> > > > +	in_callback = 0;
> > > > +}
> > > > +
> > > > +void force_evtchn_callback(void)
> > > > +{
> > > > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > > > +	int save;
> > > > +#endif
> > > > +	struct vcpu_info *vcpu;
> > > > +
> > > > +	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
> > > 
> > > On Arm, this is only valid for vCPU0. For all the other vCPUs, you
> > > will want to register a vCPU shared info.
> > > 
> > 
> > According to Mini-OS this is also expected for x86 [1] as both ARM and
> > x86 are defining smp_processor_id as 0. Do you expect any issue with
> > that?
> 
> I am not sure why you are referring to Mini-OS... We are discussing this
> code in the context of U-boot.
> 
> smp_processor_id() suggests that you want to make your code ready
> for SMP support. However, on Arm, if smp_processor_id() returned a
> value other than 0, it would be totally broken.
> 
> Will you ever need to run this code on a CPU other than CPU0?
> 
> > [1] http://xenbits.xenproject.org/gitweb/?p=mini-os.git;a=blob;f=include/x86/os.h;h=a73b63e5e4e0f4b7fa7ca944739f2c3b8a956833;hb=HEAD#l10
> >  
> > 
> > > > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > > > +	save = vcpu->evtchn_upcall_mask;
> > > > +#endif
> > > > +
> > > > +	while (vcpu->evtchn_upcall_pending) {
> > > > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > > > +		vcpu->evtchn_upcall_mask = 1;
> > > > +#endif
> > > > +		barrier();
> > > 
> > > What are you trying to prevent with this barrier? In particular why
> > > would the compiler be an issue but not the processor?
> > 
> > This is the original code from Mini-OS and it seems that the barriers
> > are leftovers from some old code. We do not define
> > XEN_HAVE_PV_UPCALL_MASK, so this function can be stripped a lot with
> > barriers removed completely.
> 
> I don't think I agree with your analysis. vcpu->evtchn_upcall_mask can
> be modified by the hypervisor, so you want to make sure that
> vcpu->evtchn_upcall_mask is read *after* we finish dealing with the
> first round of events. Otherwise you risk delaying the handling of
> events.
> 
> This likely means a "dmb ishld" + compiler barrier after
> do_hypervisor_callback(). FWIW, in Linux they use virt_rmb().
> 
> I think you don't need any barrier beforehand thanks to xchg, as the
> atomic built-in should already add a barrier for you (you use
> __ATOMIC_SEQ_CST). Although, it is probably worth checking that this is
> the case.
> 
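
A minimal sketch of the ordering suggested above, assuming a GCC or Clang toolchain. The names upcall_pending, rounds_handled, handle_round() and force_callback_sketch() are hypothetical stand-ins for the vcpu_info field and for do_hypervisor_callback(). The acquire fence is used here as a portable substitute for "dmb ishld" plus a compiler barrier; to my knowledge AArch64 compilers typically emit that instruction for it, but this should be checked in the generated code.

```c
#include <assert.h>

/* Hypothetical stand-in for a flag the hypervisor may set at any time. */
static volatile unsigned char upcall_pending;
static unsigned int rounds_handled;

/* Stand-in for do_hypervisor_callback(): consume one round of events. */
static void handle_round(void)
{
	assert(upcall_pending);	/* only called while the flag is set */
	upcall_pending = 0;
	rounds_handled++;
}

static void force_callback_sketch(void)
{
	while (upcall_pending) {
		handle_round();
		/*
		 * Order the loads made while handling this round before
		 * the next read of the flag in the loop condition. The
		 * fence also stops the compiler from moving that read.
		 */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
	}
}
```

On a single CPU with no hypervisor writing the flag, the loop simply runs once per pending round; the fence only matters when another agent can set the flag concurrently.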
> > > > +#endif
> > > > +	};
> > > > +}
> > > > +
> > > > +void mask_evtchn(uint32_t port)
> > > > +{
> > > > +	struct shared_info *s = HYPERVISOR_shared_info;
> > > > +	synch_set_bit(port, &s->evtchn_mask[0]);
> > > > +}
> > > > +
> > > > +void unmask_evtchn(uint32_t port)
> > > > +{
> > > > +	struct shared_info *s = HYPERVISOR_shared_info;
> > > > +	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
> > > > +
> > > > +	synch_clear_bit(port, &s->evtchn_mask[0]);
> > > > +
> > > > +	/*
> > > > +	 * The following is basically the equivalent of 'hw_resend_irq'. Just like
> > > > +	 * a real IO-APIC we 'lose the interrupt edge' if the channel is masked.
> > > > +	 */
> > > 
> > > This seems to be out-of-context now, you might want to update it.
> > 
> > I am not sure I understand it correctly.
> > Could you please clarify what you mean by "update"?
> 
> Well the comment is referring to "hw_resend_irq". I guess this is a
> function, but I can't find any code for it in either Mini-OS or U-boot.
> 
> Therefore the comment seems to be wrong and needs to be updated.
> 

Thank you for the clarification. Ok, will update.

> > 
> > > 
> > > > +	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
> > > > +	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
> > > > +				    &vcpu_info->evtchn_pending_sel)) {
> > > > +		vcpu_info->evtchn_upcall_pending = 1;
> > > > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > > > +		if (!vcpu_info->evtchn_upcall_mask)
> > > > +#endif
> > > > +			force_evtchn_callback();
> > > > +	}
> > > > +}
> > > > +
> > > > +void clear_evtchn(uint32_t port)
> > > > +{
> > > > +	struct shared_info *s = HYPERVISOR_shared_info;
> > > > +
> > > > +	synch_clear_bit(port, &s->evtchn_pending[0]);
> > > > +}
> > > > +
> > > > +void xen_init(void)
> > > > +{
> > > > +	debug("%s\n", __func__);
> > > 
> > > Is this a left-over?
> > 
> > I think this is a relevant comment for debug purpose.
> > But we do not mind removing it, if it seems superfluous.
> 
> That's fine. I was just asking if it was still worth it.
> 
> Cheers,
> 

Regards,
Anastasiia


* [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os
  2020-07-01 17:46   ` Julien Grall
  2020-07-03 12:21     ` Anastasiia Lukianenko
@ 2020-07-16 13:16     ` Anastasiia Lukianenko
  1 sibling, 0 replies; 57+ messages in thread
From: Anastasiia Lukianenko @ 2020-07-16 13:16 UTC (permalink / raw)
  To: u-boot

Hello Julien,

On Wed, 2020-07-01 at 18:46 +0100, Julien Grall wrote:
> Title: s/hypervizor/hypervisor/
> 
> On 01/07/2020 17:29, Anastasiia Lukianenko wrote:
> > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > 
> > Port hypervizor related code from mini-os. Update essential
> 
> Ditto.
> 
> But I would be quite cautious to import code from mini-OS in order to
> support Arm. The port has always been broken and from a look below needs
> to be refined for Arm.
> 
> > arch code to support required bit operations, memory barriers etc.
> > 
> > Copyright for the bits ported belong to at least the following
> > authors,
> > please see related files for details:
> > 
> > Copyright (c) 2002-2003, K A Fraser
> > Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research
> > Cambridge
> > Copyright (c) 2014, Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> > 
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko at epam.com>
> > Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko at epam.com>
> > ---
> >   arch/arm/include/asm/xen/system.h |  96 +++++++++++
> >   common/board_r.c                  |  11 ++
> >   drivers/Makefile                  |   1 +
> >   drivers/xen/Makefile              |   5 +
> >   drivers/xen/hypervisor.c          | 277 ++++++++++++++++++++++++++++++
> >   include/xen.h                     |  11 ++
> >   include/xen/hvm.h                 |  30 ++++
> >   7 files changed, 431 insertions(+)
> >   create mode 100644 arch/arm/include/asm/xen/system.h
> >   create mode 100644 drivers/xen/Makefile
> >   create mode 100644 drivers/xen/hypervisor.c
> >   create mode 100644 include/xen.h
> >   create mode 100644 include/xen/hvm.h
> > 
> > diff --git a/arch/arm/include/asm/xen/system.h
> > b/arch/arm/include/asm/xen/system.h
> > new file mode 100644
> > index 0000000000..81ab90160e
> > --- /dev/null
> > +++ b/arch/arm/include/asm/xen/system.h
> > @@ -0,0 +1,96 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * (C) 2014 Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
> > + * (C) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef _ASM_ARM_XEN_SYSTEM_H
> > +#define _ASM_ARM_XEN_SYSTEM_H
> > +
> > +#include <compiler.h>
> > +#include <asm/bitops.h>
> > +
> > +/* If *ptr == old, then store new there (and return new).
> > + * Otherwise, return the old value.
> > + * Atomic.
> > + */
> > +#define synch_cmpxchg(ptr, old, new) \
> > +({ __typeof__(*ptr) stored = old; \
> > +	__atomic_compare_exchange_n(ptr, &stored, new, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? new : old; \
> > +})
> > +
> > +/* As test_and_clear_bit, but using __ATOMIC_SEQ_CST */
> > +static inline int synch_test_and_clear_bit(int nr, volatile void *addr)
> > +{
> > +	u8 *byte = ((u8 *)addr) + (nr >> 3);
> > +	u8 bit = 1 << (nr & 7);
> > +	u8 orig;
> > +
> > +	orig = __atomic_fetch_and(byte, ~bit, __ATOMIC_SEQ_CST);
> > +
> > +	return (orig & bit) != 0;
> > +}
> > +
> > +/* As test_and_set_bit, but using __ATOMIC_SEQ_CST */
> > +static inline int synch_test_and_set_bit(int nr, volatile void *base)
> > +{
> > +	u8 *byte = ((u8 *)base) + (nr >> 3);
> > +	u8 bit = 1 << (nr & 7);
> > +	u8 orig;
> > +
> > +	orig = __atomic_fetch_or(byte, bit, __ATOMIC_SEQ_CST);
> > +
> > +	return (orig & bit) != 0;
> > +}
> > +
> > +/* As set_bit, but using __ATOMIC_SEQ_CST */
> > +static inline void synch_set_bit(int nr, volatile void *addr)
> > +{
> > +	synch_test_and_set_bit(nr, addr);
> > +}
> > +
> > +/* As clear_bit, but using __ATOMIC_SEQ_CST */
> > +static inline void synch_clear_bit(int nr, volatile void *addr)
> > +{
> > +	synch_test_and_clear_bit(nr, addr);
> > +}
> > +
> > +/* As test_bit, but with a following memory barrier. */
> > +//static inline int synch_test_bit(int nr, volatile void *addr)
> > +static inline int synch_test_bit(int nr, const void *addr)
> > +{
> > +	int result;
> > +
> > +	result = test_bit(nr, addr);
> > +	barrier();
> > +	return result;
> > +}
> 
> I can understand why we implement sync_* helpers as AFAICT the generic
> helpers are not SMP safe. However...
> 
> > +
> > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> > +#define xchg(ptr, v)	__atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> > +
> > +#define mb()		dsb()
> > +#define rmb()		dsb()
> > +#define wmb()		dsb()
> > +#define __iormb()	dmb()
> > +#define __iowmb()	dmb()
> 
> Why do you need to re-implement the barriers?
> 
> > +#define xen_mb()	mb()
> > +#define xen_rmb()	rmb()
> > +#define xen_wmb()	wmb()
> > +
> > +#define smp_processor_id()	0
> 
> Shouldn't this be common?
> 
> > +
> > +#define to_phys(x)		((unsigned long)(x))
> > +#define to_virt(x)		((void *)(x))
> > +
> > +#define PFN_UP(x)		(unsigned long)(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
> > +#define PFN_DOWN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> > +#define PFN_PHYS(x)		((unsigned long)(x) << PAGE_SHIFT)
> > +#define PHYS_PFN(x)		(unsigned long)((x) >> PAGE_SHIFT)
> > +
> > +#define virt_to_pfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> > +#define virt_to_mfn(_virt)	(PFN_DOWN(to_phys(_virt)))
> > +#define mfn_to_virt(_mfn)	(to_virt(PFN_PHYS(_mfn)))
> > +#define pfn_to_virt(_pfn)	(to_virt(PFN_PHYS(_pfn)))
> 
> There is already generic phys <-> virt helpers (see 
> include/asm-generic/io.h). So why do you need to create a new
> version?
> 
AFAIU, we need to use phys_to_virt and virt_to_phys functions from
include/asm-generic/io.h instead of to_phys and to_virt defines.
For the rest of the definitions, we think they should be left as we
work with frames, not addresses.
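
To illustrate the frame-versus-address distinction, a small host-side sketch of the PFN helpers follows. It assumes 4 KiB pages (PAGE_SHIFT of 12), which is an assumption made for the example and not something the thread states; frames_spanned() is a hypothetical helper added only to show the macros in use. The macros are the ones from the patch, with outer parentheses added so they stay safe inside larger expressions.

```c
#include <assert.h>

#define PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Address to frame number, rounding up or down to a page boundary. */
#define PFN_UP(x)	((unsigned long)(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT))
#define PFN_DOWN(x)	((unsigned long)((x) >> PAGE_SHIFT))
/* Frame number back to the address of the first byte of that frame. */
#define PFN_PHYS(x)	((unsigned long)(x) << PAGE_SHIFT)

/* Hypothetical helper: number of frames touched by [addr, addr + len). */
static unsigned long frames_spanned(unsigned long addr, unsigned long len)
{
	assert(len > 0);
	return PFN_UP(addr + len) - PFN_DOWN(addr);
}
```

For example, address 0x1234 lies in frame 1, and a 0x20-byte buffer starting at 0x1ff0 crosses a page boundary and therefore spans two frames.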

> > +
> > +#endif
> > diff --git a/common/board_r.c b/common/board_r.c
> > index fa57fa9b69..fd36edb4e5 100644
> > --- a/common/board_r.c
> > +++ b/common/board_r.c
> > @@ -56,6 +56,7 @@
> >   #include <timer.h>
> >   #include <trace.h>
> >   #include <watchdog.h>
> > +#include <xen.h>
> 
> Do we want to include it for other boards?
> 
> >   #ifdef CONFIG_ADDR_MAP
> >   #include <asm/mmu.h>
> >   #endif
> > @@ -462,6 +463,13 @@ static int initr_mmc(void)
> >   }
> >   #endif
> >   
> > +#ifdef CONFIG_XEN
> > +static int initr_xen(void)
> > +{
> > +	xen_init();
> > +	return 0;
> > +}
> > +#endif
> >   /*
> >    * Tell if it's OK to load the environment early in boot.
> >    *
> > @@ -769,6 +777,9 @@ static init_fnc_t init_sequence_r[] = {
> >   #endif
> >   #ifdef CONFIG_MMC
> >   	initr_mmc,
> > +#endif
> > +#ifdef CONFIG_XEN
> > +	initr_xen,
> >   #endif
> >   	initr_env,
> >   #ifdef CONFIG_SYS_BOOTPARAMS_LEN
> > diff --git a/drivers/Makefile b/drivers/Makefile
> > index 94e8c5da17..0dd8891e76 100644
> > --- a/drivers/Makefile
> > +++ b/drivers/Makefile
> > @@ -28,6 +28,7 @@ obj-$(CONFIG_$(SPL_)REMOTEPROC) += remoteproc/
> >   obj-$(CONFIG_$(SPL_TPL_)TPM) += tpm/
> >   obj-$(CONFIG_$(SPL_TPL_)ACPI_PMC) += power/acpi_pmc/
> >   obj-$(CONFIG_$(SPL_)BOARD) += board/
> > +obj-$(CONFIG_XEN) += xen/
> >   
> >   ifndef CONFIG_TPL_BUILD
> >   ifdef CONFIG_SPL_BUILD
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > new file mode 100644
> > index 0000000000..1211bf2386
> > --- /dev/null
> > +++ b/drivers/xen/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier:	GPL-2.0+
> > +#
> > +# (C) Copyright 2020 EPAM Systems Inc.
> > +
> > +obj-y += hypervisor.o
> > diff --git a/drivers/xen/hypervisor.c b/drivers/xen/hypervisor.c
> > new file mode 100644
> > index 0000000000..5883285142
> > --- /dev/null
> > +++ b/drivers/xen/hypervisor.c
> > @@ -0,0 +1,277 @@
> > +/*****************************************************************
> > *************
> > + * hypervisor.c
> > + *
> > + * Communication to/from hypervisor.
> > + *
> > + * Copyright (c) 2002-2003, K A Fraser
> > + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
> > + * Copyright (c) 2020, EPAM Systems Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +#include <common.h>
> > +#include <cpu_func.h>
> > +#include <log.h>
> > +#include <memalign.h>
> > +
> > +#include <asm/io.h>
> > +#include <asm/armv8/mmu.h>
> > +#include <asm/xen/system.h>
> > +
> > +#include <linux/bug.h>
> > +
> > +#include <xen/hvm.h>
> > +#include <xen/interface/memory.h>
> > +
> > +#define active_evtchns(cpu, sh, idx)	\
> > +	((sh)->evtchn_pending[idx] &	\
> > +	 ~(sh)->evtchn_mask[idx])
> > +
> > +int in_callback;
> > +
> > +/*
> > + * Shared page for communicating with the hypervisor.
> > + * Events flags go here, for example.
> > + */
> > +struct shared_info *HYPERVISOR_shared_info;
> > +
> > +#ifndef CONFIG_PARAVIRT
> 
> Is there any plan to support this on x86?
> 
> > +static const char *param_name(int op)
> > +{
> > +#define PARAM(x)[HVM_PARAM_##x] = #x
> > +	static const char *const names[] = {
> > +		PARAM(CALLBACK_IRQ),
> > +		PARAM(STORE_PFN),
> > +		PARAM(STORE_EVTCHN),
> > +		PARAM(PAE_ENABLED),
> > +		PARAM(IOREQ_PFN),
> > +		PARAM(TIMER_MODE),
> > +		PARAM(HPET_ENABLED),
> > +		PARAM(IDENT_PT),
> > +		PARAM(ACPI_S_STATE),
> > +		PARAM(VM86_TSS),
> > +		PARAM(VPT_ALIGN),
> > +		PARAM(CONSOLE_PFN),
> > +		PARAM(CONSOLE_EVTCHN),
> 
> Most of those parameters are never going to be used on Arm. So could 
> this be clobberred?
> 
> > +	};
> > +#undef PARAM
> > +
> > +	if (op >= ARRAY_SIZE(names))
> > +		return "unknown";
> > +
> > +	if (!names[op])
> > +		return "reserved";
> > +
> > +	return names[op];
> > +}
> > +
> > +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value)
> 
> I would recommend adding some comments explaining when this function is
> meant to be used and what it is doing with regard to the cache.
> 
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> 
> I don't think there is a guarantee that your cache is going to be clean
> when writing xhv. So you likely want to add an invalidate_dcache_range()
> call before writing it.
> 
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	invalidate_dcache_range((unsigned long)&xhv,
> > +				(unsigned long)&xhv + sizeof(xhv));
> > +
> > +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> > +	if (ret < 0) {
> > +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +	invalidate_dcache_range((unsigned long)&xhv,
> > +				(unsigned long)&xhv + sizeof(xhv));
> > +
> > +	*value = xhv.value;
> > +	return ret;
> > +}
> > +
> > +int hvm_get_parameter(int idx, uint64_t *value)
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	ret = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
> > +	if (ret < 0) {
> > +		pr_err("Cannot get hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +
> > +	*value = xhv.value;
> > +	return ret;
> > +}
> > +
> > +int hvm_set_parameter(int idx, uint64_t value)
> > +{
> > +	struct xen_hvm_param xhv;
> > +	int ret;
> > +
> > +	xhv.domid = DOMID_SELF;
> > +	xhv.index = idx;
> > +	xhv.value = value;
> > +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &xhv);
> > +
> > +	if (ret < 0) {
> > +		pr_err("Cannot set hvm parameter %s (%d): %d!\n",
> > +			   param_name(idx), idx, ret);
> > +		BUG();
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +struct shared_info *map_shared_info(void *p)
> > +{
> > +	struct xen_add_to_physmap xatp;
> > +
> > +	HYPERVISOR_shared_info = (struct shared_info *)memalign(PAGE_SIZE,
> > +								PAGE_SIZE);
> > +	if (HYPERVISOR_shared_info == NULL)
> > +		BUG();
> > +
> > +	xatp.domid = DOMID_SELF;
> > +	xatp.idx = 0;
> > +	xatp.space = XENMAPSPACE_shared_info;
> > +	xatp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> > +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp) != 0)
> > +		BUG();
> > +
> > +	return HYPERVISOR_shared_info;
> > +}
> > +
> > +void unmap_shared_info(void)
> > +{
> > +	struct xen_remove_from_physmap xrtp;
> > +
> > +	xrtp.domid = DOMID_SELF;
> > +	xrtp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
> > +	if (HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrtp) != 0)
> > +		BUG();
> > +}
> > +#endif
> > +
> > +void do_hypervisor_callback(struct pt_regs *regs)
> > +{
> > +	unsigned long l1, l2, l1i, l2i;
> > +	unsigned int port;
> > +	int cpu = 0;
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	struct vcpu_info *vcpu_info = &s->vcpu_info[cpu];
> > +
> > +	in_callback = 1;
> > +
> > +	vcpu_info->evtchn_upcall_pending = 0;
> > +	/* NB x86. No need for a barrier here -- XCHG is a barrier on x86. */
> > +#if !defined(__i386__) && !defined(__x86_64__)
> > +	/* Clear master flag /before/ clearing selector flag. */
> > +	wmb();
> > +#endif
> > +	l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
> > +
> > +	while (l1 != 0) {
> > +		l1i = __ffs(l1);
> > +		l1 &= ~(1UL << l1i);
> > +
> > +		while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
> > +			l2i = __ffs(l2);
> > +			l2 &= ~(1UL << l2i);
> > +
> > +			port = (l1i * (sizeof(unsigned long) * 8)) + l2i;
> > +			/* TODO: handle new event: do_event(port, regs); */
> > +			/* Suppress -Wunused-but-set-variable */
> > +			(void)(port);
> > +		}
> > +	}
> 
> You likely want a memory barrier here as otherwise in_callback could be
> written/seen before the loop end.
> 
> > +
> > +	in_callback = 0;
> > +}
> > +
> > +void force_evtchn_callback(void)
> > +{
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +	int save;
> > +#endif
> > +	struct vcpu_info *vcpu;
> > +
> > +	vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
> 
> On Arm, this is only valid for vCPU0. For all the other vCPUs, you will
> want to register a vCPU shared info.
> 
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +	save = vcpu->evtchn_upcall_mask;
> > +#endif
> > +
> > +	while (vcpu->evtchn_upcall_pending) {
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		vcpu->evtchn_upcall_mask = 1;
> > +#endif
> > +		barrier();
> 
> What are you trying to prevent with this barrier? In particular why 
> would the compiler be an issue but not the processor?
> 
> > +		do_hypervisor_callback(NULL);
> > +		barrier();
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		vcpu->evtchn_upcall_mask = save;
> > +		barrier();
> 
> Same here.
> 
> > +#endif
> > +	};
> > +}
> > +
> > +void mask_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	synch_set_bit(port, &s->evtchn_mask[0]);
> > +}
> > +
> > +void unmask_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +	struct vcpu_info *vcpu_info = &s->vcpu_info[smp_processor_id()];
> > +
> > +	synch_clear_bit(port, &s->evtchn_mask[0]);
> > +
> > +	/*
> > +	 * The following is basically the equivalent of 'hw_resend_irq'. Just like
> > +	 * a real IO-APIC we 'lose the interrupt edge' if the channel is masked.
> > +	 */
> 
> This seems to be out-of-context now, you might want to update it.
> 
> > +	if (synch_test_bit(port, &s->evtchn_pending[0]) &&
> > +	    !synch_test_and_set_bit(port / (sizeof(unsigned long) * 8),
> > +				    &vcpu_info->evtchn_pending_sel)) {
> > +		vcpu_info->evtchn_upcall_pending = 1;
> > +#ifdef XEN_HAVE_PV_UPCALL_MASK
> > +		if (!vcpu_info->evtchn_upcall_mask)
> > +#endif
> > +			force_evtchn_callback();
> > +	}
> > +}
> > +
> > +void clear_evtchn(uint32_t port)
> > +{
> > +	struct shared_info *s = HYPERVISOR_shared_info;
> > +
> > +	synch_clear_bit(port, &s->evtchn_pending[0]);
> > +}
> > +
> > +void xen_init(void)
> > +{
> > +	debug("%s\n", __func__);
> 
> Is this a left-over?
> 
> > +
> > +	map_shared_info(NULL);
> > +}
> > +
> > diff --git a/include/xen.h b/include/xen.h
> > new file mode 100644
> > index 0000000000..1d6f74cc92
> > --- /dev/null
> > +++ b/include/xen.h
> > @@ -0,0 +1,11 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * (C) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef __XEN_H__
> > +#define __XEN_H__
> > +
> > +void xen_init(void);
> > +
> > +#endif /* __XEN_H__ */
> > diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> > new file mode 100644
> > index 0000000000..89de9625ca
> > --- /dev/null
> > +++ b/include/xen/hvm.h
> > @@ -0,0 +1,30 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * Simple wrappers around HVM functions
> > + *
> > + * Copyright (c) 2002-2003, K A Fraser
> > + * Copyright (c) 2005, Grzegorz Milos, gm281 at cam.ac.uk,Intel Research Cambridge
> > + * Copyright (c) 2020, EPAM Systems Inc.
> > + */
> > +#ifndef XEN_HVM_H__
> > +#define XEN_HVM_H__
> > +
> > +#include <asm/xen/hypercall.h>
> > +#include <xen/interface/hvm/params.h>
> > +#include <xen/interface/xen.h>
> > +
> > +extern struct shared_info *HYPERVISOR_shared_info;
> > +
> > +int hvm_get_parameter(int idx, uint64_t *value);
> > +int hvm_get_parameter_maintain_dcache(int idx, uint64_t *value);
> > +int hvm_set_parameter(int idx, uint64_t value);
> > +
> > +struct shared_info *map_shared_info(void *p);
> > +void unmap_shared_info(void);
> > +void do_hypervisor_callback(struct pt_regs *regs);
> > +void mask_evtchn(uint32_t port);
> > +void unmask_evtchn(uint32_t port);
> > +void clear_evtchn(uint32_t port);
> > +
> > +#endif /* XEN_HVM_H__ */
> 
> Cheers,
> 
> 
Regards,
Anastasiia


Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
2020-07-01 16:29 [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 01/17] armv8: Fix SMCC and ARM_PSCI_FW dependencies Anastasiia Lukianenko
2020-07-02  1:14   ` Peng Fan
2020-07-03  9:57     ` Nastya Vicodin
2020-07-01 16:29 ` [PATCH 02/17] Kconfig: Introduce CONFIG_XEN Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-03 12:42     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 03/17] board: Introduce xenguest_arm64 board Anastasiia Lukianenko
2020-07-02  1:28   ` Peng Fan
2020-07-02  7:18     ` Oleksandr Andrushchenko
2020-07-02  7:26       ` Heinrich Schuchardt
2020-07-02  7:57         ` Oleksandr Andrushchenko
2020-07-01 16:29 ` [PATCH 04/17] xen: Add essential and required interface headers Anastasiia Lukianenko
2020-07-02  1:30   ` Peng Fan
2020-07-03 12:46     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 05/17] xen: Port Xen hypervizor related code from mini-os Anastasiia Lukianenko
2020-07-01 17:46   ` Julien Grall
2020-07-03 12:21     ` Anastasiia Lukianenko
2020-07-03 13:38       ` Julien Grall
2020-07-08  8:55         ` Anastasiia Lukianenko
2020-07-16 13:16     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 06/17] xen: Port Xen event channel driver " Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-03 12:34     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 07/17] serial: serial_xen: Add Xen PV serial driver Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-03 12:59     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 08/17] linux/compat.h: Add wait_event_timeout macro Anastasiia Lukianenko
2020-07-02  4:08   ` Heinrich Schuchardt
2020-07-03 13:02     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 09/17] lib: sscanf: add sscanf implementation Anastasiia Lukianenko
2020-07-02  4:04   ` Heinrich Schuchardt
2020-07-01 16:29 ` [PATCH 10/17] xen: Port Xen bus driver from mini-os Anastasiia Lukianenko
2020-07-02  4:43   ` Heinrich Schuchardt
2020-07-01 16:29 ` [PATCH 11/17] xen: Port Xen grant table " Anastasiia Lukianenko
2020-07-01 16:59   ` Julien Grall
2020-07-03 13:09     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 12/17] xen: pvblock: Add initial support for para-virtualized block driver Anastasiia Lukianenko
2020-07-02  4:17   ` Heinrich Schuchardt
2020-07-03 13:25     ` Anastasiia Lukianenko
2020-07-02  4:29   ` Heinrich Schuchardt
2020-07-02  5:30     ` Peng Fan
2020-07-03 14:14     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 13/17] xen: pvblock: Enumerate virtual block devices Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-01 16:29 ` [PATCH 14/17] xen: pvblock: Read XenStore configuration and initialize Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-06  9:08     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 15/17] xen: pvblock: Implement front-back protocol and do IO Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-06  9:10     ` Anastasiia Lukianenko
2020-07-01 16:29 ` [PATCH 16/17] xen: pvblock: Print found devices indices Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-01 16:29 ` [PATCH 17/17] board: xen: De-initialize before jumping to Linux Anastasiia Lukianenko
2020-07-03  3:50   ` Simon Glass
2020-07-06  9:13     ` Anastasiia Lukianenko
2020-07-01 16:51 ` [PATCH 00/17] Add new board: Xen guest for ARM64 Anastasiia Lukianenko
