* [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support
@ 2019-02-05 17:32 Eric Auger
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers Eric Auger
                   ` (18 more replies)
  0 siblings, 19 replies; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

This series aims to bump the 255GB RAM limit in machvirt and to
support device memory in general, and especially PCDIMM/NVDIMM.

In machvirt versions < 4.0, the initial RAM starts at 1GB and can
grow up to 255GB. From 256GB onwards we find IO regions such as the
additional GICv3 RDIST region, high PCIe ECAM region and high PCIe
MMIO region. The address map was 1TB large. This corresponded to
the max IPA capacity KVM was able to manage.

Since Linux 4.20, the host kernel supports a larger, dynamic IPA
range, so guest physical addresses can go beyond 1TB. The max GPA
size depends on the host kernel configuration and the physical CPUs.

In this series we use this feature and allow the RAM to grow with
no limit other than the one imposed by the host.
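
As a rough illustration of the host limit described above, the sketch
below computes how many address bits a guest map needs and compares
that with the number of IPA bits the host supports. The helper names
(gpa_bits_needed, map_fits_host) are hypothetical and are not taken
from the series; the only figure borrowed from the text is that a 1TB
map corresponds to 40 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest number of address bits able to encode
 * every byte address in [0, top), where top is the exclusive end of
 * the guest physical address map.
 */
static unsigned gpa_bits_needed(uint64_t top)
{
    uint64_t highest;
    unsigned bits = 0;

    assert(top > 0);
    highest = top - 1;          /* highest address actually used */
    while (highest) {
        bits++;
        highest >>= 1;
    }
    return bits;
}

/* True if the whole map fits in the IPA range supported by the host. */
static bool map_fits_host(uint64_t top, unsigned host_ipa_bits)
{
    return gpa_bits_needed(top) <= host_ipa_bits;
}
```

With a host limited to 40 IPA bits, map_fits_host(1ULL << 40, 40) holds
while map_fits_host(1ULL << 41, 40) does not.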

The RAM still starts at 1GB. First comes the initial ram (-m) of size
ram_size and then comes the device memory (,maxmem) of size
maxram_size - ram_size. The device memory is potentially hotpluggable
depending on the instantiated memory objects.
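
The layout described above can be sketched as follows. This is only an
illustration: the struct and function names are hypothetical, and the
sketch assumes device memory starts exactly where the initial RAM ends,
with no extra alignment gap, which the actual patches may not do.

```c
#include <assert.h>
#include <stdint.h>

#define GiB      (1ULL << 30)
#define RAM_BASE (1 * GiB)      /* initial RAM base, per the text above */

/* Hypothetical description of the two RAM areas. */
struct guest_ram_layout {
    uint64_t ram_base;
    uint64_t ram_size;          /* -m */
    uint64_t devmem_base;
    uint64_t devmem_size;       /* maxmem minus -m */
};

static struct guest_ram_layout layout_ram(uint64_t ram_size,
                                          uint64_t maxram_size)
{
    struct guest_ram_layout l;

    assert(maxram_size >= ram_size);
    l.ram_base = RAM_BASE;
    l.ram_size = ram_size;
    /* Assumption: device memory directly follows the initial RAM. */
    l.devmem_base = RAM_BASE + ram_size;
    l.devmem_size = maxram_size - ram_size;
    return l;
}
```

For example, with 4GB of initial RAM and maxmem set to 16GB, this
sketch puts device memory at 5GB with a size of 12GB.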

IO regions previously located between 256GB and 1TB are moved after
the RAM. Their offset is computed dynamically and depends on ram_size
and maxram_size. Size alignment is enforced.

If the maxmem value is less than 255GB, the legacy memory map
is still used. The change of memory map becomes effective from 4.0
onwards.
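
A minimal sketch of such a placement is shown below, assuming each high
IO region is aligned to its own (power-of-two) size and that the
regions never start below 256GB, so that small configurations keep the
legacy addresses. The region sizes in the table and all identifiers are
illustrative assumptions, not values or names quoted from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)
#define RAM_BASE            (1 * GiB)
#define LEGACY_HIGH_IO_BASE (256 * GiB)

/* Illustrative sizes for three high IO regions (assumed values). */
static const uint64_t high_io_sizes[] = {
    64 * MiB,   /* e.g. an additional redistributor region */
    256 * MiB,  /* e.g. a high ECAM window */
    512 * GiB,  /* e.g. a high MMIO window */
};
#define NUM_HIGH_IO (sizeof(high_io_sizes) / sizeof(high_io_sizes[0]))

/* Round addr up to a multiple of size; size must be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
    assert(size && (size & (size - 1)) == 0);
    return (addr + size - 1) & ~(size - 1);
}

/* Base address of high IO region idx for a given maxram_size. */
static uint64_t high_io_region_base(uint64_t maxram_size, size_t idx)
{
    /* First address after initial RAM plus device memory. */
    uint64_t base = RAM_BASE + maxram_size;
    size_t i;

    assert(idx < NUM_HIGH_IO);
    /* Small configurations keep the legacy 256GB starting point. */
    if (base < LEGACY_HIGH_IO_BASE) {
        base = LEGACY_HIGH_IO_BASE;
    }
    /* Skip over the preceding regions, each aligned to its own size. */
    for (i = 0; i < idx; i++) {
        base = align_up(base, high_io_sizes[i]) + high_io_sizes[i];
    }
    return align_up(base, high_io_sizes[idx]);
}
```

With maxram_size set to 4GB the three regions of this sketch land at
256GB, 256GB + 256MB and 512GB; with 600GB the first one moves to 601GB
and the last one to 1024GB.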

As we keep the initial RAM at the 1GB base address, we do not need to
make invasive changes in the EDK2 FW. It seems nobody is eager to do
that job at the moment.

Because device memory is placed just after the initial RAM, it is
possible to use this feature while keeping a 1TB address map.

This series reuses/rebases patches initially submitted by Shameer
in [0] and Kwangwoo in [1] for the PC-DIMM and NV-DIMM parts.

Functionally, the series is split into 3 parts:
1) bump of the initial RAM limit [1 - 10] and change in
   the memory map
2) Support of PC-DIMM [11 - 14]
3) Support of NV-DIMM [15 - 18]

Part 1 can be upstreamed before part 2, and part 2 before part 3.

Work is ongoing to turn the whole memory into device memory.
However, this move is not trivial and, to me, is independent of
the improvements brought by this series:
- if we were to use DIMMs for initial RAM, those DIMMs would use
  slots. Although they would not be part of the ones provided
  using the ",slots" option, they are ACPI-limited resources.
- the DT and ACPI descriptions need to be reworked
- NUMA integration needs special care
- a special device memory object may be required to avoid consuming
  slots and to ease the FW description.

So I preferred to separate the concerns. This new implementation
based on device memory could be a candidate for another virt
version.

Best Regards

Eric

References:

[0] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
http://patchwork.ozlabs.org/cover/914694/

[1] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html

This series can be found at:
https://github.com/eauger/qemu/tree/v3.1.0-dimm-v6

Tests:
- On HPE system, allocated a VM with 300GB initial RAM and 220 vcpus
- tested on kernel versions not supporting variable IPA range
- tested with different virt versions

History:

v5 -> v6:
- mingw compilation issue fix
- kvm_arm_get_max_vm_phys_shift always returns the number of supported
  IPA bits
- new patch "hw/arm/virt: Rename highmem IO regions" that eases the review
  of "hw/arm/virt: Split the memory map description"
- "hw/arm/virt: Move memory map initialization into machvirt_init"
  squashed into the previous patch
- change alignment of IO regions beyond the RAM so that it matches their
  size

v4 -> v5:
- change in the memory map
- see individual logs

v3 -> v4:
- rebase on David's "pc-dimm: next bunch of cleanups" and
  "pc-dimm: pre_plug "slot" and "addr" assignment"
- kvm-type option not used anymore. We directly use
  maxram_size and ram_size machine fields to compute the
  MAX IPA range. Migration is naturally handled as CLI
  options are kept between source and destination. This was
  suggested by David.
- device_memory_start and device_memory_size not stored
  anymore in vms->bootinfo
- I did not take into account two of Igor's comments: the one
  related to the refactoring of arm_load_dtb and the one
  related to the generation of the dtb after system_reset
  which would contain nodes of hotplugged devices (we do
  not support hotplug at this stage)
- check the end-user does not attempt to hotplug a device
- addition of "vl: Set machine ram_size, maxram_size and
  ram_slots earlier"

v2 -> v3:
- fix pc_q35 and pc_piix compilation error
- Kwangwoo's email address is no longer valid, so it was removed

v1 -> v2:
- kvm_get_max_vm_phys_shift moved to an arch-specific file
- addition of NVDIMM part
- single series
- rebase on David's refactoring

v1:
- was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
- was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"

Alexey Kardashevskiy (1):
  update-linux-headers.sh: Copy new headers

Eric Auger (12):
  linux-headers: Update to v5.0-rc2
  hw/arm/virt: Rename highmem IO regions
  hw/arm/virt: Split the memory map description
  hw/boards: Add a MachineState parameter to kvm_type callback
  kvm: add kvm_arm_get_max_vm_phys_shift
  vl: Set machine ram_size, maxram_size and ram_slots earlier
  hw/arm/virt: Implement kvm_type function for 4.0 machine
  hw/arm/virt: Bump the 255GB initial RAM limit
  hw/arm/virt: Add memory hotplug framework
  hw/arm/virt: Allocate device_memory
  hw/arm/boot: Expose the pmem nodes in the DT
  hw/arm/virt: Add nvdimm and nvdimm-persistence options

Kwangwoo Lee (2):
  nvdimm: use configurable ACPI IO base and size
  hw/arm/virt: Add nvdimm hot-plug infrastructure

Shameer Kolothum (3):
  hw/arm/boot: introduce fdt_add_memory_node helper
  hw/arm/boot: Expose the PC-DIMM nodes in the DT
  hw/arm/virt-acpi-build: Add PC-DIMM in SRAT

 accel/kvm/kvm-all.c                           |    2 +-
 default-configs/arm-softmmu.mak               |    4 +
 hw/acpi/nvdimm.c                              |   28 +-
 hw/arm/boot.c                                 |  120 +-
 hw/arm/virt-acpi-build.c                      |   23 +-
 hw/arm/virt.c                                 |  311 ++++-
 hw/i386/pc_piix.c                             |    8 +-
 hw/i386/pc_q35.c                              |    8 +-
 hw/ppc/mac_newworld.c                         |    3 +-
 hw/ppc/mac_oldworld.c                         |    2 +-
 hw/ppc/spapr.c                                |    2 +-
 include/hw/arm/virt.h                         |   21 +-
 include/hw/boards.h                           |    2 +-
 include/hw/mem/nvdimm.h                       |   12 +
 include/standard-headers/drm/drm_fourcc.h     |   63 +
 include/standard-headers/linux/ethtool.h      |   19 +-
 .../linux/input-event-codes.h                 |   19 +
 include/standard-headers/linux/pci_regs.h     |    1 +
 .../standard-headers/linux/virtio_balloon.h   |    8 +
 include/standard-headers/linux/virtio_blk.h   |   54 +
 .../standard-headers/linux/virtio_config.h    |    3 +
 include/standard-headers/linux/virtio_gpu.h   |   18 +
 include/standard-headers/linux/virtio_ring.h  |   52 +
 .../standard-headers/rdma/vmw_pvrdma-abi.h    |    1 +
 linux-headers/asm-arm/unistd-common.h         |    1 +
 linux-headers/asm-arm64/unistd.h              |    1 +
 linux-headers/asm-generic/unistd.h            |   10 +-
 linux-headers/asm-mips/sgidefs.h              |    8 -
 linux-headers/asm-mips/unistd.h               | 1074 +----------------
 linux-headers/asm-mips/unistd_n32.h           |  338 ++++++
 linux-headers/asm-mips/unistd_n64.h           |  334 +++++
 linux-headers/asm-mips/unistd_o32.h           |  374 ++++++
 linux-headers/asm-powerpc/unistd.h            |  389 +-----
 linux-headers/asm-powerpc/unistd_32.h         |  381 ++++++
 linux-headers/asm-powerpc/unistd_64.h         |  372 ++++++
 linux-headers/linux/kvm.h                     |   29 +
 linux-headers/linux/vfio.h                    |   92 ++
 linux-headers/linux/vhost.h                   |  113 +-
 linux-headers/linux/vhost_types.h             |  128 ++
 scripts/update-linux-headers.sh               |   11 +-
 target/arm/kvm.c                              |   10 +
 target/arm/kvm_arm.h                          |   13 +
 vl.c                                          |    6 +-
 43 files changed, 2797 insertions(+), 1671 deletions(-)
 create mode 100644 linux-headers/asm-mips/unistd_n32.h
 create mode 100644 linux-headers/asm-mips/unistd_n64.h
 create mode 100644 linux-headers/asm-mips/unistd_o32.h
 create mode 100644 linux-headers/asm-powerpc/unistd_32.h
 create mode 100644 linux-headers/asm-powerpc/unistd_64.h
 create mode 100644 linux-headers/linux/vhost_types.h

-- 
2.20.1

* [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 16:36   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 02/18] linux-headers: Update to v5.0-rc2 Eric Auger
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Alexey Kardashevskiy <aik@ozlabs.ru>

Since Linux commit ab66dcc76d "powerpc: generate uapi header and system call
table files" there are 2 new files: unistd_32.h and unistd_64.h. Their
content was moved out of unistd.h, so now we have to copy the new files
as well, just like we already do for other architectures; this does it
for MIPS as well.

Also, v5.0-rc2 moved vhost bits around in 4b86713236e4bd
"vhost: split structs into a separate header file", add those too.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 scripts/update-linux-headers.sh | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 0a964fe240..1cd8bd57be 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -121,11 +121,20 @@ for arch in $ARCHLIST; do
         cp "$tmpdir/include/asm/unistd_64.h" "$output/linux-headers/asm-x86/"
         cp_portable "$tmpdir/include/asm/kvm_para.h" "$output/include/standard-headers/asm-$arch"
     fi
+    if [ $arch = powerpc ]; then
+        cp "$tmpdir/include/asm/unistd_32.h" "$output/linux-headers/asm-powerpc/"
+        cp "$tmpdir/include/asm/unistd_64.h" "$output/linux-headers/asm-powerpc/"
+    fi
+    if [ $arch = mips ]; then
+        cp "$tmpdir/include/asm/unistd_o32.h" "$output/linux-headers/asm-mips/"
+        cp "$tmpdir/include/asm/unistd_n32.h" "$output/linux-headers/asm-mips/"
+        cp "$tmpdir/include/asm/unistd_n64.h" "$output/linux-headers/asm-mips/"
+    fi
 done
 
 rm -rf "$output/linux-headers/linux"
 mkdir -p "$output/linux-headers/linux"
-for header in kvm.h vfio.h vfio_ccw.h vhost.h \
+for header in kvm.h vfio.h vfio_ccw.h vhost.h vhost_types.h \
               psci.h psp-sev.h userfaultfd.h; do
     cp "$tmpdir/include/linux/$header" "$output/linux-headers/linux"
 done
-- 
2.20.1

* [Qemu-devel] [PATCH v6 02/18] linux-headers: Update to v5.0-rc2
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

This is the result of running the updated update script against the v5.0-rc2 kernel.

This is based on sha1: d7393226d15add056285c8fc86723d54d7e0c77d

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/standard-headers/drm/drm_fourcc.h     |   63 +
 include/standard-headers/linux/ethtool.h      |   19 +-
 .../linux/input-event-codes.h                 |   19 +
 include/standard-headers/linux/pci_regs.h     |    1 +
 .../standard-headers/linux/virtio_balloon.h   |    8 +
 include/standard-headers/linux/virtio_blk.h   |   54 +
 .../standard-headers/linux/virtio_config.h    |    3 +
 include/standard-headers/linux/virtio_gpu.h   |   18 +
 include/standard-headers/linux/virtio_ring.h  |   52 +
 .../standard-headers/rdma/vmw_pvrdma-abi.h    |    1 +
 linux-headers/asm-arm/unistd-common.h         |    1 +
 linux-headers/asm-arm64/unistd.h              |    1 +
 linux-headers/asm-generic/unistd.h            |   10 +-
 linux-headers/asm-mips/sgidefs.h              |    8 -
 linux-headers/asm-mips/unistd.h               | 1074 +----------------
 linux-headers/asm-mips/unistd_n32.h           |  338 ++++++
 linux-headers/asm-mips/unistd_n64.h           |  334 +++++
 linux-headers/asm-mips/unistd_o32.h           |  374 ++++++
 linux-headers/asm-powerpc/unistd.h            |  389 +-----
 linux-headers/asm-powerpc/unistd_32.h         |  381 ++++++
 linux-headers/asm-powerpc/unistd_64.h         |  372 ++++++
 linux-headers/linux/kvm.h                     |   29 +
 linux-headers/linux/vfio.h                    |   92 ++
 linux-headers/linux/vhost.h                   |  113 +-
 linux-headers/linux/vhost_types.h             |  128 ++
 25 files changed, 2293 insertions(+), 1589 deletions(-)
 create mode 100644 linux-headers/asm-mips/unistd_n32.h
 create mode 100644 linux-headers/asm-mips/unistd_n64.h
 create mode 100644 linux-headers/asm-mips/unistd_o32.h
 create mode 100644 linux-headers/asm-powerpc/unistd_32.h
 create mode 100644 linux-headers/asm-powerpc/unistd_64.h
 create mode 100644 linux-headers/linux/vhost_types.h

diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
index b53f8d7c8c..44490607f9 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -29,11 +29,50 @@
 extern "C" {
 #endif
 
+/**
+ * DOC: overview
+ *
+ * In the DRM subsystem, framebuffer pixel formats are described using the
+ * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the
+ * fourcc code, a Format Modifier may optionally be provided, in order to
+ * further describe the buffer's format - for example tiling or compression.
+ *
+ * Format Modifiers
+ * ----------------
+ *
+ * Format modifiers are used in conjunction with a fourcc code, forming a
+ * unique fourcc:modifier pair. This format:modifier pair must fully define the
+ * format and data layout of the buffer, and should be the only way to describe
+ * that particular buffer.
+ *
+ * Having multiple fourcc:modifier pairs which describe the same layout should
+ * be avoided, as such aliases run the risk of different drivers exposing
+ * different names for the same data format, forcing userspace to understand
+ * that they are aliases.
+ *
+ * Format modifiers may change any property of the buffer, including the number
+ * of planes and/or the required allocation size. Format modifiers are
+ * vendor-namespaced, and as such the relationship between a fourcc code and a
+ * modifier is specific to the modifer being used. For example, some modifiers
+ * may preserve meaning - such as number of planes - from the fourcc code,
+ * whereas others may not.
+ *
+ * Vendors should document their modifier usage in as much detail as
+ * possible, to ensure maximum compatibility across devices, drivers and
+ * applications.
+ *
+ * The authoritative list of format modifier codes is found in
+ * `include/uapi/drm/drm_fourcc.h`
+ */
+
 #define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
 				 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
 
 #define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */
 
+/* Reserve 0 for the invalid format specifier */
+#define DRM_FORMAT_INVALID	0
+
 /* color index */
 #define DRM_FORMAT_C8		fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
 
@@ -111,6 +150,21 @@ extern "C" {
 #define DRM_FORMAT_VYUY		fourcc_code('V', 'Y', 'U', 'Y') /* [31:0] Y1:Cb0:Y0:Cr0 8:8:8:8 little endian */
 
 #define DRM_FORMAT_AYUV		fourcc_code('A', 'Y', 'U', 'V') /* [31:0] A:Y:Cb:Cr 8:8:8:8 little endian */
+#define DRM_FORMAT_XYUV8888		fourcc_code('X', 'Y', 'U', 'V') /* [31:0] X:Y:Cb:Cr 8:8:8:8 little endian */
+
+/*
+ * packed YCbCr420 2x2 tiled formats
+ * first 64 bits will contain Y,Cb,Cr components for a 2x2 tile
+ */
+/* [63:0]   A3:A2:Y3:0:Cr0:0:Y2:0:A1:A0:Y1:0:Cb0:0:Y0:0  1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */
+#define DRM_FORMAT_Y0L0		fourcc_code('Y', '0', 'L', '0')
+/* [63:0]   X3:X2:Y3:0:Cr0:0:Y2:0:X1:X0:Y1:0:Cb0:0:Y0:0  1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */
+#define DRM_FORMAT_X0L0		fourcc_code('X', '0', 'L', '0')
+
+/* [63:0]   A3:A2:Y3:Cr0:Y2:A1:A0:Y1:Cb0:Y0  1:1:10:10:10:1:1:10:10:10 little endian */
+#define DRM_FORMAT_Y0L2		fourcc_code('Y', '0', 'L', '2')
+/* [63:0]   X3:X2:Y3:Cr0:Y2:X1:X0:Y1:Cb0:Y0  1:1:10:10:10:1:1:10:10:10 little endian */
+#define DRM_FORMAT_X0L2		fourcc_code('X', '0', 'L', '2')
 
 /*
  * 2 plane RGB + A
@@ -298,6 +352,15 @@ extern "C" {
  */
 #define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE	fourcc_mod_code(SAMSUNG, 1)
 
+/*
+ * Tiled, 16 (pixels) x 16 (lines) - sized macroblocks
+ *
+ * This is a simple tiled layout using tiles of 16x16 pixels in a row-major
+ * layout. For YCbCr formats Cb/Cr components are taken in such a way that
+ * they correspond to their 16x16 luma block.
+ */
+#define DRM_FORMAT_MOD_SAMSUNG_16_16_TILE	fourcc_mod_code(SAMSUNG, 2)
+
 /*
  * Qualcomm Compressed Format
  *
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 57ffcb5341..063c814278 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -91,10 +91,6 @@
  * %ETHTOOL_GSET to get the current values before making specific
  * changes and then applying them with %ETHTOOL_SSET.
  *
- * Drivers that implement set_settings() should validate all fields
- * other than @cmd that are not described as read-only or deprecated,
- * and must ignore all fields described as read-only.
- *
  * Deprecated fields should be ignored by both users and drivers.
  */
 struct ethtool_cmd {
@@ -886,7 +882,7 @@ struct ethtool_rx_flow_spec {
 	uint32_t		location;
 };
 
-/* How rings are layed out when accessing virtual functions or
+/* How rings are laid out when accessing virtual functions or
  * offloaded queues is device specific. To allow users to do flow
  * steering and specify these queues the ring cookie is partitioned
  * into a 32bit queue index with an 8 bit virtual function id.
@@ -895,7 +891,7 @@ struct ethtool_rx_flow_spec {
  * devices start supporting PCIe w/ARI. However at the moment I
  * do not know of any devices that support this so I do not reserve
  * space for this at this time. If a future patch consumes the next
- * byte it should be aware of this possiblity.
+ * byte it should be aware of this possibility.
  */
 #define ETHTOOL_RX_FLOW_SPEC_RING	0x00000000FFFFFFFFLL
 #define ETHTOOL_RX_FLOW_SPEC_RING_VF	0x000000FF00000000LL
@@ -1800,14 +1796,9 @@ enum ethtool_reset_flags {
  * rejected.
  *
  * Deprecated %ethtool_cmd fields transceiver, maxtxpkt and maxrxpkt
- * are not available in %ethtool_link_settings. Until all drivers are
- * converted to ignore them or to the new %ethtool_link_settings API,
- * for both queries and changes, users should always try
- * %ETHTOOL_GLINKSETTINGS first, and if it fails with -ENOTSUPP stick
- * only to %ETHTOOL_GSET and %ETHTOOL_SSET consistently. If it
- * succeeds, then users should stick to %ETHTOOL_GLINKSETTINGS and
- * %ETHTOOL_SLINKSETTINGS (which would support drivers implementing
- * either %ethtool_cmd or %ethtool_link_settings).
+ * are not available in %ethtool_link_settings. These fields will be
+ * always set to zero in %ETHTOOL_GSET reply and %ETHTOOL_SSET will
+ * fail if any of them is set to non-zero value.
  *
  * Users should assume that all fields not marked read-only are
  * writable and subject to validation by the driver.  They should use
diff --git a/include/standard-headers/linux/input-event-codes.h b/include/standard-headers/linux/input-event-codes.h
index 9e6a8ba4ce..ff2e1ebcc9 100644
--- a/include/standard-headers/linux/input-event-codes.h
+++ b/include/standard-headers/linux/input-event-codes.h
@@ -708,6 +708,16 @@
 #define REL_DIAL		0x07
 #define REL_WHEEL		0x08
 #define REL_MISC		0x09
+/*
+ * 0x0a is reserved and should not be used in input drivers.
+ * It was used by HID as REL_MISC+1 and userspace needs to detect if
+ * the next REL_* event is correct or is just REL_MISC + n.
+ * We define here REL_RESERVED so userspace can rely on it and detect
+ * the situation described above.
+ */
+#define REL_RESERVED		0x0a
+#define REL_WHEEL_HI_RES	0x0b
+#define REL_HWHEEL_HI_RES	0x0c
 #define REL_MAX			0x0f
 #define REL_CNT			(REL_MAX+1)
 
@@ -744,6 +754,15 @@
 
 #define ABS_MISC		0x28
 
+/*
+ * 0x2e is reserved and should not be used in input drivers.
+ * It was used by HID as ABS_MISC+6 and userspace needs to detect if
+ * the next ABS_* event is correct or is just ABS_MISC + n.
+ * We define here ABS_RESERVED so userspace can rely on it and detect
+ * the situation described above.
+ */
+#define ABS_RESERVED		0x2e
+
 #define ABS_MT_SLOT		0x2f	/* MT slot being modified */
 #define ABS_MT_TOUCH_MAJOR	0x30	/* Major axis of touching ellipse */
 #define ABS_MT_TOUCH_MINOR	0x31	/* Minor axis (omit if circular) */
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index ee556ccc93..e1e9888c85 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -52,6 +52,7 @@
 #define  PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
 
 #define PCI_STATUS		0x06	/* 16 bits */
+#define  PCI_STATUS_IMM_READY	0x01	/* Immediate Readiness */
 #define  PCI_STATUS_INTERRUPT	0x08	/* Interrupt status */
 #define  PCI_STATUS_CAP_LIST	0x10	/* Support Capability List */
 #define  PCI_STATUS_66MHZ	0x20	/* Support 66 MHz PCI 2.1 bus */
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 4dbb7dc6c0..9375ca2a70 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -34,15 +34,23 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT	3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+#define VIRTIO_BALLOON_CMD_ID_STOP	0
+#define VIRTIO_BALLOON_CMD_ID_DONE	1
 struct virtio_balloon_config {
 	/* Number of pages host wants Guest to give up. */
 	uint32_t num_pages;
 	/* Number of pages we've actually got in balloon. */
 	uint32_t actual;
+	/* Free page report command id, readonly by guest */
+	uint32_t free_page_report_cmd_id;
+	/* Stores PAGE_POISON if page poisoning is in use */
+	uint32_t poison_val;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
diff --git a/include/standard-headers/linux/virtio_blk.h b/include/standard-headers/linux/virtio_blk.h
index ab16ec5fd2..0229b0fbe4 100644
--- a/include/standard-headers/linux/virtio_blk.h
+++ b/include/standard-headers/linux/virtio_blk.h
@@ -38,6 +38,8 @@
 #define VIRTIO_BLK_F_BLK_SIZE	6	/* Block size of disk is available*/
 #define VIRTIO_BLK_F_TOPOLOGY	10	/* Topology information is available */
 #define VIRTIO_BLK_F_MQ		12	/* support more than one vq */
+#define VIRTIO_BLK_F_DISCARD	13	/* DISCARD is supported */
+#define VIRTIO_BLK_F_WRITE_ZEROES	14	/* WRITE ZEROES is supported */
 
 /* Legacy feature bits */
 #ifndef VIRTIO_BLK_NO_LEGACY
@@ -84,6 +86,39 @@ struct virtio_blk_config {
 
 	/* number of vqs, only available when VIRTIO_BLK_F_MQ is set */
 	uint16_t num_queues;
+
+	/* the next 3 entries are guarded by VIRTIO_BLK_F_DISCARD */
+	/*
+	 * The maximum discard sectors (in 512-byte sectors) for
+	 * one segment.
+	 */
+	uint32_t max_discard_sectors;
+	/*
+	 * The maximum number of discard segments in a
+	 * discard command.
+	 */
+	uint32_t max_discard_seg;
+	/* Discard commands must be aligned to this number of sectors. */
+	uint32_t discard_sector_alignment;
+
+	/* the next 3 entries are guarded by VIRTIO_BLK_F_WRITE_ZEROES */
+	/*
+	 * The maximum number of write zeroes sectors (in 512-byte sectors) in
+	 * one segment.
+	 */
+	uint32_t max_write_zeroes_sectors;
+	/*
+	 * The maximum number of segments in a write zeroes
+	 * command.
+	 */
+	uint32_t max_write_zeroes_seg;
+	/*
+	 * Set if a VIRTIO_BLK_T_WRITE_ZEROES request may result in the
+	 * deallocation of one or more of the sectors.
+	 */
+	uint8_t write_zeroes_may_unmap;
+
+	uint8_t unused1[3];
 } QEMU_PACKED;
 
 /*
@@ -112,6 +147,12 @@ struct virtio_blk_config {
 /* Get device ID command */
 #define VIRTIO_BLK_T_GET_ID    8
 
+/* Discard command */
+#define VIRTIO_BLK_T_DISCARD	11
+
+/* Write zeroes command */
+#define VIRTIO_BLK_T_WRITE_ZEROES	13
+
 #ifndef VIRTIO_BLK_NO_LEGACY
 /* Barrier before this op. */
 #define VIRTIO_BLK_T_BARRIER	0x80000000
@@ -131,6 +172,19 @@ struct virtio_blk_outhdr {
 	__virtio64 sector;
 };
 
+/* Unmap this range (only valid for write zeroes command) */
+#define VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP	0x00000001
+
+/* Discard/write zeroes range for each request. */
+struct virtio_blk_discard_write_zeroes {
+	/* discard/write zeroes start sector */
+	uint64_t sector;
+	/* number of discard/write zeroes sectors */
+	uint32_t num_sectors;
+	/* flags for this range */
+	uint32_t flags;
+};
+
 #ifndef VIRTIO_BLK_NO_LEGACY
 struct virtio_scsi_inhdr {
 	__virtio32 errors;
diff --git a/include/standard-headers/linux/virtio_config.h b/include/standard-headers/linux/virtio_config.h
index 0b194365a0..24e30af5ec 100644
--- a/include/standard-headers/linux/virtio_config.h
+++ b/include/standard-headers/linux/virtio_config.h
@@ -75,6 +75,9 @@
  */
 #define VIRTIO_F_IOMMU_PLATFORM		33
 
+/* This feature indicates support for the packed virtqueue layout. */
+#define VIRTIO_F_RING_PACKED		34
+
 /*
  * Does the device support Single Root I/O Virtualization?
  */
diff --git a/include/standard-headers/linux/virtio_gpu.h b/include/standard-headers/linux/virtio_gpu.h
index 52a830dcf8..27bb5111f9 100644
--- a/include/standard-headers/linux/virtio_gpu.h
+++ b/include/standard-headers/linux/virtio_gpu.h
@@ -41,6 +41,7 @@
 #include "standard-headers/linux/types.h"
 
 #define VIRTIO_GPU_F_VIRGL 0
+#define VIRTIO_GPU_F_EDID  1
 
 enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_UNDEFINED = 0,
@@ -56,6 +57,7 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING,
 	VIRTIO_GPU_CMD_GET_CAPSET_INFO,
 	VIRTIO_GPU_CMD_GET_CAPSET,
+	VIRTIO_GPU_CMD_GET_EDID,
 
 	/* 3d commands */
 	VIRTIO_GPU_CMD_CTX_CREATE = 0x0200,
@@ -76,6 +78,7 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_RESP_OK_DISPLAY_INFO,
 	VIRTIO_GPU_RESP_OK_CAPSET_INFO,
 	VIRTIO_GPU_RESP_OK_CAPSET,
+	VIRTIO_GPU_RESP_OK_EDID,
 
 	/* error responses */
 	VIRTIO_GPU_RESP_ERR_UNSPEC = 0x1200,
@@ -291,6 +294,21 @@ struct virtio_gpu_resp_capset {
 	uint8_t capset_data[];
 };
 
+/* VIRTIO_GPU_CMD_GET_EDID */
+struct virtio_gpu_cmd_get_edid {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t scanout;
+	uint32_t padding;
+};
+
+/* VIRTIO_GPU_RESP_OK_EDID */
+struct virtio_gpu_resp_edid {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t size;
+	uint32_t padding;
+	uint8_t edid[1024];
+};
+
 #define VIRTIO_GPU_EVENT_DISPLAY (1 << 0)
 
 struct virtio_gpu_config {
diff --git a/include/standard-headers/linux/virtio_ring.h b/include/standard-headers/linux/virtio_ring.h
index d26e72bc6b..e89931f634 100644
--- a/include/standard-headers/linux/virtio_ring.h
+++ b/include/standard-headers/linux/virtio_ring.h
@@ -42,6 +42,13 @@
 /* This means the buffer contains a list of buffer descriptors. */
 #define VRING_DESC_F_INDIRECT	4
 
+/*
+ * Mark a descriptor as available or used in packed ring.
+ * Notice: they are defined as shifts instead of shifted values.
+ */
+#define VRING_PACKED_DESC_F_AVAIL	7
+#define VRING_PACKED_DESC_F_USED	15
+
 /* The Host uses this in used->flags to advise the Guest: don't kick me when
  * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
  * will still kick if it's out of buffers. */
@@ -51,6 +58,23 @@
  * optimization.  */
 #define VRING_AVAIL_F_NO_INTERRUPT	1
 
+/* Enable events in packed ring. */
+#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
+/* Disable events in packed ring. */
+#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
+/*
+ * Enable events for a specific descriptor in packed ring.
+ * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
+ * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
+ */
+#define VRING_PACKED_EVENT_FLAG_DESC	0x2
+
+/*
+ * Wrap counter bit shift in event suppression structure
+ * of packed ring.
+ */
+#define VRING_PACKED_EVENT_F_WRAP_CTR	15
+
 /* We support indirect buffer descriptors */
 #define VIRTIO_RING_F_INDIRECT_DESC	28
 
@@ -169,4 +193,32 @@ static inline int vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_
 	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
 }
 
+struct vring_packed_desc_event {
+	/* Descriptor Ring Change Event Offset/Wrap Counter. */
+	uint16_t off_wrap;
+	/* Descriptor Ring Change Event Flags. */
+	uint16_t flags;
+};
+
+struct vring_packed_desc {
+	/* Buffer Address. */
+	uint64_t addr;
+	/* Buffer Length. */
+	uint32_t len;
+	/* Buffer ID. */
+	uint16_t id;
+	/* The flags depending on descriptor type. */
+	uint16_t flags;
+};
+
+struct vring_packed {
+	unsigned int num;
+
+	struct vring_packed_desc *desc;
+
+	struct vring_packed_desc_event *driver;
+
+	struct vring_packed_desc_event *device;
+};
+
 #endif /* _LINUX_VIRTIO_RING_H */
diff --git a/include/standard-headers/rdma/vmw_pvrdma-abi.h b/include/standard-headers/rdma/vmw_pvrdma-abi.h
index 6c2bc46116..336a8d596f 100644
--- a/include/standard-headers/rdma/vmw_pvrdma-abi.h
+++ b/include/standard-headers/rdma/vmw_pvrdma-abi.h
@@ -78,6 +78,7 @@ enum pvrdma_wr_opcode {
 	PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD,
 	PVRDMA_WR_BIND_MW,
 	PVRDMA_WR_REG_SIG_MR,
+	PVRDMA_WR_ERROR,
 };
 
 enum pvrdma_wc_status {
diff --git a/linux-headers/asm-arm/unistd-common.h b/linux-headers/asm-arm/unistd-common.h
index 60c2d931d0..8c84bcf10f 100644
--- a/linux-headers/asm-arm/unistd-common.h
+++ b/linux-headers/asm-arm/unistd-common.h
@@ -355,5 +355,6 @@
 #define __NR_pkey_free (__NR_SYSCALL_BASE + 396)
 #define __NR_statx (__NR_SYSCALL_BASE + 397)
 #define __NR_rseq (__NR_SYSCALL_BASE + 398)
+#define __NR_io_pgetevents (__NR_SYSCALL_BASE + 399)
 
 #endif /* _ASM_ARM_UNISTD_COMMON_H */
diff --git a/linux-headers/asm-arm64/unistd.h b/linux-headers/asm-arm64/unistd.h
index 5072cbd15c..dae1584cf0 100644
--- a/linux-headers/asm-arm64/unistd.h
+++ b/linux-headers/asm-arm64/unistd.h
@@ -16,5 +16,6 @@
  */
 
 #define __ARCH_WANT_RENAMEAT
+#define __ARCH_WANT_NEW_STAT
 
 #include <asm-generic/unistd.h>
diff --git a/linux-headers/asm-generic/unistd.h b/linux-headers/asm-generic/unistd.h
index df4bedb9b0..d90127298f 100644
--- a/linux-headers/asm-generic/unistd.h
+++ b/linux-headers/asm-generic/unistd.h
@@ -242,10 +242,12 @@ __SYSCALL(__NR_tee, sys_tee)
 /* fs/stat.c */
 #define __NR_readlinkat 78
 __SYSCALL(__NR_readlinkat, sys_readlinkat)
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
 #define __NR3264_fstatat 79
 __SC_3264(__NR3264_fstatat, sys_fstatat64, sys_newfstatat)
 #define __NR3264_fstat 80
 __SC_3264(__NR3264_fstat, sys_fstat64, sys_newfstat)
+#endif
 
 /* fs/sync.c */
 #define __NR_sync 81
@@ -736,9 +738,11 @@ __SYSCALL(__NR_statx,     sys_statx)
 __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
 #define __NR_rseq 293
 __SYSCALL(__NR_rseq, sys_rseq)
+#define __NR_kexec_file_load 294
+__SYSCALL(__NR_kexec_file_load,     sys_kexec_file_load)
 
 #undef __NR_syscalls
-#define __NR_syscalls 294
+#define __NR_syscalls 295
 
 /*
  * 32 bit systems traditionally used different
@@ -758,8 +762,10 @@ __SYSCALL(__NR_rseq, sys_rseq)
 #define __NR_ftruncate __NR3264_ftruncate
 #define __NR_lseek __NR3264_lseek
 #define __NR_sendfile __NR3264_sendfile
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
 #define __NR_newfstatat __NR3264_fstatat
 #define __NR_fstat __NR3264_fstat
+#endif
 #define __NR_mmap __NR3264_mmap
 #define __NR_fadvise64 __NR3264_fadvise64
 #ifdef __NR3264_stat
@@ -774,8 +780,10 @@ __SYSCALL(__NR_rseq, sys_rseq)
 #define __NR_ftruncate64 __NR3264_ftruncate
 #define __NR_llseek __NR3264_lseek
 #define __NR_sendfile64 __NR3264_sendfile
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
 #define __NR_fstatat64 __NR3264_fstatat
 #define __NR_fstat64 __NR3264_fstat
+#endif
 #define __NR_mmap2 __NR3264_mmap
 #define __NR_fadvise64_64 __NR3264_fadvise64
 #ifdef __NR3264_stat
diff --git a/linux-headers/asm-mips/sgidefs.h b/linux-headers/asm-mips/sgidefs.h
index 26143e3b7c..69c3de90c5 100644
--- a/linux-headers/asm-mips/sgidefs.h
+++ b/linux-headers/asm-mips/sgidefs.h
@@ -11,14 +11,6 @@
 #ifndef __ASM_SGIDEFS_H
 #define __ASM_SGIDEFS_H
 
-/*
- * Using a Linux compiler for building Linux seems logic but not to
- * everybody.
- */
-#ifndef __linux__
-#error Use a Linux compiler or give up.
-#endif
-
 /*
  * Definitions for the ISA levels
  *
diff --git a/linux-headers/asm-mips/unistd.h b/linux-headers/asm-mips/unistd.h
index d4a85ef3eb..62b86b865c 100644
--- a/linux-headers/asm-mips/unistd.h
+++ b/linux-headers/asm-mips/unistd.h
@@ -17,1085 +17,23 @@
 
 #if _MIPS_SIM == _MIPS_SIM_ABI32
 
-/*
- * Linux o32 style syscalls are in the range from 4000 to 4999.
- */
-#define __NR_Linux			4000
-#define __NR_syscall			(__NR_Linux +	0)
-#define __NR_exit			(__NR_Linux +	1)
-#define __NR_fork			(__NR_Linux +	2)
-#define __NR_read			(__NR_Linux +	3)
-#define __NR_write			(__NR_Linux +	4)
-#define __NR_open			(__NR_Linux +	5)
-#define __NR_close			(__NR_Linux +	6)
-#define __NR_waitpid			(__NR_Linux +	7)
-#define __NR_creat			(__NR_Linux +	8)
-#define __NR_link			(__NR_Linux +	9)
-#define __NR_unlink			(__NR_Linux +  10)
-#define __NR_execve			(__NR_Linux +  11)
-#define __NR_chdir			(__NR_Linux +  12)
-#define __NR_time			(__NR_Linux +  13)
-#define __NR_mknod			(__NR_Linux +  14)
-#define __NR_chmod			(__NR_Linux +  15)
-#define __NR_lchown			(__NR_Linux +  16)
-#define __NR_break			(__NR_Linux +  17)
-#define __NR_unused18			(__NR_Linux +  18)
-#define __NR_lseek			(__NR_Linux +  19)
-#define __NR_getpid			(__NR_Linux +  20)
-#define __NR_mount			(__NR_Linux +  21)
-#define __NR_umount			(__NR_Linux +  22)
-#define __NR_setuid			(__NR_Linux +  23)
-#define __NR_getuid			(__NR_Linux +  24)
-#define __NR_stime			(__NR_Linux +  25)
-#define __NR_ptrace			(__NR_Linux +  26)
-#define __NR_alarm			(__NR_Linux +  27)
-#define __NR_unused28			(__NR_Linux +  28)
-#define __NR_pause			(__NR_Linux +  29)
-#define __NR_utime			(__NR_Linux +  30)
-#define __NR_stty			(__NR_Linux +  31)
-#define __NR_gtty			(__NR_Linux +  32)
-#define __NR_access			(__NR_Linux +  33)
-#define __NR_nice			(__NR_Linux +  34)
-#define __NR_ftime			(__NR_Linux +  35)
-#define __NR_sync			(__NR_Linux +  36)
-#define __NR_kill			(__NR_Linux +  37)
-#define __NR_rename			(__NR_Linux +  38)
-#define __NR_mkdir			(__NR_Linux +  39)
-#define __NR_rmdir			(__NR_Linux +  40)
-#define __NR_dup			(__NR_Linux +  41)
-#define __NR_pipe			(__NR_Linux +  42)
-#define __NR_times			(__NR_Linux +  43)
-#define __NR_prof			(__NR_Linux +  44)
-#define __NR_brk			(__NR_Linux +  45)
-#define __NR_setgid			(__NR_Linux +  46)
-#define __NR_getgid			(__NR_Linux +  47)
-#define __NR_signal			(__NR_Linux +  48)
-#define __NR_geteuid			(__NR_Linux +  49)
-#define __NR_getegid			(__NR_Linux +  50)
-#define __NR_acct			(__NR_Linux +  51)
-#define __NR_umount2			(__NR_Linux +  52)
-#define __NR_lock			(__NR_Linux +  53)
-#define __NR_ioctl			(__NR_Linux +  54)
-#define __NR_fcntl			(__NR_Linux +  55)
-#define __NR_mpx			(__NR_Linux +  56)
-#define __NR_setpgid			(__NR_Linux +  57)
-#define __NR_ulimit			(__NR_Linux +  58)
-#define __NR_unused59			(__NR_Linux +  59)
-#define __NR_umask			(__NR_Linux +  60)
-#define __NR_chroot			(__NR_Linux +  61)
-#define __NR_ustat			(__NR_Linux +  62)
-#define __NR_dup2			(__NR_Linux +  63)
-#define __NR_getppid			(__NR_Linux +  64)
-#define __NR_getpgrp			(__NR_Linux +  65)
-#define __NR_setsid			(__NR_Linux +  66)
-#define __NR_sigaction			(__NR_Linux +  67)
-#define __NR_sgetmask			(__NR_Linux +  68)
-#define __NR_ssetmask			(__NR_Linux +  69)
-#define __NR_setreuid			(__NR_Linux +  70)
-#define __NR_setregid			(__NR_Linux +  71)
-#define __NR_sigsuspend			(__NR_Linux +  72)
-#define __NR_sigpending			(__NR_Linux +  73)
-#define __NR_sethostname		(__NR_Linux +  74)
-#define __NR_setrlimit			(__NR_Linux +  75)
-#define __NR_getrlimit			(__NR_Linux +  76)
-#define __NR_getrusage			(__NR_Linux +  77)
-#define __NR_gettimeofday		(__NR_Linux +  78)
-#define __NR_settimeofday		(__NR_Linux +  79)
-#define __NR_getgroups			(__NR_Linux +  80)
-#define __NR_setgroups			(__NR_Linux +  81)
-#define __NR_reserved82			(__NR_Linux +  82)
-#define __NR_symlink			(__NR_Linux +  83)
-#define __NR_unused84			(__NR_Linux +  84)
-#define __NR_readlink			(__NR_Linux +  85)
-#define __NR_uselib			(__NR_Linux +  86)
-#define __NR_swapon			(__NR_Linux +  87)
-#define __NR_reboot			(__NR_Linux +  88)
-#define __NR_readdir			(__NR_Linux +  89)
-#define __NR_mmap			(__NR_Linux +  90)
-#define __NR_munmap			(__NR_Linux +  91)
-#define __NR_truncate			(__NR_Linux +  92)
-#define __NR_ftruncate			(__NR_Linux +  93)
-#define __NR_fchmod			(__NR_Linux +  94)
-#define __NR_fchown			(__NR_Linux +  95)
-#define __NR_getpriority		(__NR_Linux +  96)
-#define __NR_setpriority		(__NR_Linux +  97)
-#define __NR_profil			(__NR_Linux +  98)
-#define __NR_statfs			(__NR_Linux +  99)
-#define __NR_fstatfs			(__NR_Linux + 100)
-#define __NR_ioperm			(__NR_Linux + 101)
-#define __NR_socketcall			(__NR_Linux + 102)
-#define __NR_syslog			(__NR_Linux + 103)
-#define __NR_setitimer			(__NR_Linux + 104)
-#define __NR_getitimer			(__NR_Linux + 105)
-#define __NR_stat			(__NR_Linux + 106)
-#define __NR_lstat			(__NR_Linux + 107)
-#define __NR_fstat			(__NR_Linux + 108)
-#define __NR_unused109			(__NR_Linux + 109)
-#define __NR_iopl			(__NR_Linux + 110)
-#define __NR_vhangup			(__NR_Linux + 111)
-#define __NR_idle			(__NR_Linux + 112)
-#define __NR_vm86			(__NR_Linux + 113)
-#define __NR_wait4			(__NR_Linux + 114)
-#define __NR_swapoff			(__NR_Linux + 115)
-#define __NR_sysinfo			(__NR_Linux + 116)
-#define __NR_ipc			(__NR_Linux + 117)
-#define __NR_fsync			(__NR_Linux + 118)
-#define __NR_sigreturn			(__NR_Linux + 119)
-#define __NR_clone			(__NR_Linux + 120)
-#define __NR_setdomainname		(__NR_Linux + 121)
-#define __NR_uname			(__NR_Linux + 122)
-#define __NR_modify_ldt			(__NR_Linux + 123)
-#define __NR_adjtimex			(__NR_Linux + 124)
-#define __NR_mprotect			(__NR_Linux + 125)
-#define __NR_sigprocmask		(__NR_Linux + 126)
-#define __NR_create_module		(__NR_Linux + 127)
-#define __NR_init_module		(__NR_Linux + 128)
-#define __NR_delete_module		(__NR_Linux + 129)
-#define __NR_get_kernel_syms		(__NR_Linux + 130)
-#define __NR_quotactl			(__NR_Linux + 131)
-#define __NR_getpgid			(__NR_Linux + 132)
-#define __NR_fchdir			(__NR_Linux + 133)
-#define __NR_bdflush			(__NR_Linux + 134)
-#define __NR_sysfs			(__NR_Linux + 135)
-#define __NR_personality		(__NR_Linux + 136)
-#define __NR_afs_syscall		(__NR_Linux + 137) /* Syscall for Andrew File System */
-#define __NR_setfsuid			(__NR_Linux + 138)
-#define __NR_setfsgid			(__NR_Linux + 139)
-#define __NR__llseek			(__NR_Linux + 140)
-#define __NR_getdents			(__NR_Linux + 141)
-#define __NR__newselect			(__NR_Linux + 142)
-#define __NR_flock			(__NR_Linux + 143)
-#define __NR_msync			(__NR_Linux + 144)
-#define __NR_readv			(__NR_Linux + 145)
-#define __NR_writev			(__NR_Linux + 146)
-#define __NR_cacheflush			(__NR_Linux + 147)
-#define __NR_cachectl			(__NR_Linux + 148)
-#define __NR_sysmips			(__NR_Linux + 149)
-#define __NR_unused150			(__NR_Linux + 150)
-#define __NR_getsid			(__NR_Linux + 151)
-#define __NR_fdatasync			(__NR_Linux + 152)
-#define __NR__sysctl			(__NR_Linux + 153)
-#define __NR_mlock			(__NR_Linux + 154)
-#define __NR_munlock			(__NR_Linux + 155)
-#define __NR_mlockall			(__NR_Linux + 156)
-#define __NR_munlockall			(__NR_Linux + 157)
-#define __NR_sched_setparam		(__NR_Linux + 158)
-#define __NR_sched_getparam		(__NR_Linux + 159)
-#define __NR_sched_setscheduler		(__NR_Linux + 160)
-#define __NR_sched_getscheduler		(__NR_Linux + 161)
-#define __NR_sched_yield		(__NR_Linux + 162)
-#define __NR_sched_get_priority_max	(__NR_Linux + 163)
-#define __NR_sched_get_priority_min	(__NR_Linux + 164)
-#define __NR_sched_rr_get_interval	(__NR_Linux + 165)
-#define __NR_nanosleep			(__NR_Linux + 166)
-#define __NR_mremap			(__NR_Linux + 167)
-#define __NR_accept			(__NR_Linux + 168)
-#define __NR_bind			(__NR_Linux + 169)
-#define __NR_connect			(__NR_Linux + 170)
-#define __NR_getpeername		(__NR_Linux + 171)
-#define __NR_getsockname		(__NR_Linux + 172)
-#define __NR_getsockopt			(__NR_Linux + 173)
-#define __NR_listen			(__NR_Linux + 174)
-#define __NR_recv			(__NR_Linux + 175)
-#define __NR_recvfrom			(__NR_Linux + 176)
-#define __NR_recvmsg			(__NR_Linux + 177)
-#define __NR_send			(__NR_Linux + 178)
-#define __NR_sendmsg			(__NR_Linux + 179)
-#define __NR_sendto			(__NR_Linux + 180)
-#define __NR_setsockopt			(__NR_Linux + 181)
-#define __NR_shutdown			(__NR_Linux + 182)
-#define __NR_socket			(__NR_Linux + 183)
-#define __NR_socketpair			(__NR_Linux + 184)
-#define __NR_setresuid			(__NR_Linux + 185)
-#define __NR_getresuid			(__NR_Linux + 186)
-#define __NR_query_module		(__NR_Linux + 187)
-#define __NR_poll			(__NR_Linux + 188)
-#define __NR_nfsservctl			(__NR_Linux + 189)
-#define __NR_setresgid			(__NR_Linux + 190)
-#define __NR_getresgid			(__NR_Linux + 191)
-#define __NR_prctl			(__NR_Linux + 192)
-#define __NR_rt_sigreturn		(__NR_Linux + 193)
-#define __NR_rt_sigaction		(__NR_Linux + 194)
-#define __NR_rt_sigprocmask		(__NR_Linux + 195)
-#define __NR_rt_sigpending		(__NR_Linux + 196)
-#define __NR_rt_sigtimedwait		(__NR_Linux + 197)
-#define __NR_rt_sigqueueinfo		(__NR_Linux + 198)
-#define __NR_rt_sigsuspend		(__NR_Linux + 199)
-#define __NR_pread64			(__NR_Linux + 200)
-#define __NR_pwrite64			(__NR_Linux + 201)
-#define __NR_chown			(__NR_Linux + 202)
-#define __NR_getcwd			(__NR_Linux + 203)
-#define __NR_capget			(__NR_Linux + 204)
-#define __NR_capset			(__NR_Linux + 205)
-#define __NR_sigaltstack		(__NR_Linux + 206)
-#define __NR_sendfile			(__NR_Linux + 207)
-#define __NR_getpmsg			(__NR_Linux + 208)
-#define __NR_putpmsg			(__NR_Linux + 209)
-#define __NR_mmap2			(__NR_Linux + 210)
-#define __NR_truncate64			(__NR_Linux + 211)
-#define __NR_ftruncate64		(__NR_Linux + 212)
-#define __NR_stat64			(__NR_Linux + 213)
-#define __NR_lstat64			(__NR_Linux + 214)
-#define __NR_fstat64			(__NR_Linux + 215)
-#define __NR_pivot_root			(__NR_Linux + 216)
-#define __NR_mincore			(__NR_Linux + 217)
-#define __NR_madvise			(__NR_Linux + 218)
-#define __NR_getdents64			(__NR_Linux + 219)
-#define __NR_fcntl64			(__NR_Linux + 220)
-#define __NR_reserved221		(__NR_Linux + 221)
-#define __NR_gettid			(__NR_Linux + 222)
-#define __NR_readahead			(__NR_Linux + 223)
-#define __NR_setxattr			(__NR_Linux + 224)
-#define __NR_lsetxattr			(__NR_Linux + 225)
-#define __NR_fsetxattr			(__NR_Linux + 226)
-#define __NR_getxattr			(__NR_Linux + 227)
-#define __NR_lgetxattr			(__NR_Linux + 228)
-#define __NR_fgetxattr			(__NR_Linux + 229)
-#define __NR_listxattr			(__NR_Linux + 230)
-#define __NR_llistxattr			(__NR_Linux + 231)
-#define __NR_flistxattr			(__NR_Linux + 232)
-#define __NR_removexattr		(__NR_Linux + 233)
-#define __NR_lremovexattr		(__NR_Linux + 234)
-#define __NR_fremovexattr		(__NR_Linux + 235)
-#define __NR_tkill			(__NR_Linux + 236)
-#define __NR_sendfile64			(__NR_Linux + 237)
-#define __NR_futex			(__NR_Linux + 238)
-#define __NR_sched_setaffinity		(__NR_Linux + 239)
-#define __NR_sched_getaffinity		(__NR_Linux + 240)
-#define __NR_io_setup			(__NR_Linux + 241)
-#define __NR_io_destroy			(__NR_Linux + 242)
-#define __NR_io_getevents		(__NR_Linux + 243)
-#define __NR_io_submit			(__NR_Linux + 244)
-#define __NR_io_cancel			(__NR_Linux + 245)
-#define __NR_exit_group			(__NR_Linux + 246)
-#define __NR_lookup_dcookie		(__NR_Linux + 247)
-#define __NR_epoll_create		(__NR_Linux + 248)
-#define __NR_epoll_ctl			(__NR_Linux + 249)
-#define __NR_epoll_wait			(__NR_Linux + 250)
-#define __NR_remap_file_pages		(__NR_Linux + 251)
-#define __NR_set_tid_address		(__NR_Linux + 252)
-#define __NR_restart_syscall		(__NR_Linux + 253)
-#define __NR_fadvise64			(__NR_Linux + 254)
-#define __NR_statfs64			(__NR_Linux + 255)
-#define __NR_fstatfs64			(__NR_Linux + 256)
-#define __NR_timer_create		(__NR_Linux + 257)
-#define __NR_timer_settime		(__NR_Linux + 258)
-#define __NR_timer_gettime		(__NR_Linux + 259)
-#define __NR_timer_getoverrun		(__NR_Linux + 260)
-#define __NR_timer_delete		(__NR_Linux + 261)
-#define __NR_clock_settime		(__NR_Linux + 262)
-#define __NR_clock_gettime		(__NR_Linux + 263)
-#define __NR_clock_getres		(__NR_Linux + 264)
-#define __NR_clock_nanosleep		(__NR_Linux + 265)
-#define __NR_tgkill			(__NR_Linux + 266)
-#define __NR_utimes			(__NR_Linux + 267)
-#define __NR_mbind			(__NR_Linux + 268)
-#define __NR_get_mempolicy		(__NR_Linux + 269)
-#define __NR_set_mempolicy		(__NR_Linux + 270)
-#define __NR_mq_open			(__NR_Linux + 271)
-#define __NR_mq_unlink			(__NR_Linux + 272)
-#define __NR_mq_timedsend		(__NR_Linux + 273)
-#define __NR_mq_timedreceive		(__NR_Linux + 274)
-#define __NR_mq_notify			(__NR_Linux + 275)
-#define __NR_mq_getsetattr		(__NR_Linux + 276)
-#define __NR_vserver			(__NR_Linux + 277)
-#define __NR_waitid			(__NR_Linux + 278)
-/* #define __NR_sys_setaltroot		(__NR_Linux + 279) */
-#define __NR_add_key			(__NR_Linux + 280)
-#define __NR_request_key		(__NR_Linux + 281)
-#define __NR_keyctl			(__NR_Linux + 282)
-#define __NR_set_thread_area		(__NR_Linux + 283)
-#define __NR_inotify_init		(__NR_Linux + 284)
-#define __NR_inotify_add_watch		(__NR_Linux + 285)
-#define __NR_inotify_rm_watch		(__NR_Linux + 286)
-#define __NR_migrate_pages		(__NR_Linux + 287)
-#define __NR_openat			(__NR_Linux + 288)
-#define __NR_mkdirat			(__NR_Linux + 289)
-#define __NR_mknodat			(__NR_Linux + 290)
-#define __NR_fchownat			(__NR_Linux + 291)
-#define __NR_futimesat			(__NR_Linux + 292)
-#define __NR_fstatat64			(__NR_Linux + 293)
-#define __NR_unlinkat			(__NR_Linux + 294)
-#define __NR_renameat			(__NR_Linux + 295)
-#define __NR_linkat			(__NR_Linux + 296)
-#define __NR_symlinkat			(__NR_Linux + 297)
-#define __NR_readlinkat			(__NR_Linux + 298)
-#define __NR_fchmodat			(__NR_Linux + 299)
-#define __NR_faccessat			(__NR_Linux + 300)
-#define __NR_pselect6			(__NR_Linux + 301)
-#define __NR_ppoll			(__NR_Linux + 302)
-#define __NR_unshare			(__NR_Linux + 303)
-#define __NR_splice			(__NR_Linux + 304)
-#define __NR_sync_file_range		(__NR_Linux + 305)
-#define __NR_tee			(__NR_Linux + 306)
-#define __NR_vmsplice			(__NR_Linux + 307)
-#define __NR_move_pages			(__NR_Linux + 308)
-#define __NR_set_robust_list		(__NR_Linux + 309)
-#define __NR_get_robust_list		(__NR_Linux + 310)
-#define __NR_kexec_load			(__NR_Linux + 311)
-#define __NR_getcpu			(__NR_Linux + 312)
-#define __NR_epoll_pwait		(__NR_Linux + 313)
-#define __NR_ioprio_set			(__NR_Linux + 314)
-#define __NR_ioprio_get			(__NR_Linux + 315)
-#define __NR_utimensat			(__NR_Linux + 316)
-#define __NR_signalfd			(__NR_Linux + 317)
-#define __NR_timerfd			(__NR_Linux + 318)
-#define __NR_eventfd			(__NR_Linux + 319)
-#define __NR_fallocate			(__NR_Linux + 320)
-#define __NR_timerfd_create		(__NR_Linux + 321)
-#define __NR_timerfd_gettime		(__NR_Linux + 322)
-#define __NR_timerfd_settime		(__NR_Linux + 323)
-#define __NR_signalfd4			(__NR_Linux + 324)
-#define __NR_eventfd2			(__NR_Linux + 325)
-#define __NR_epoll_create1		(__NR_Linux + 326)
-#define __NR_dup3			(__NR_Linux + 327)
-#define __NR_pipe2			(__NR_Linux + 328)
-#define __NR_inotify_init1		(__NR_Linux + 329)
-#define __NR_preadv			(__NR_Linux + 330)
-#define __NR_pwritev			(__NR_Linux + 331)
-#define __NR_rt_tgsigqueueinfo		(__NR_Linux + 332)
-#define __NR_perf_event_open		(__NR_Linux + 333)
-#define __NR_accept4			(__NR_Linux + 334)
-#define __NR_recvmmsg			(__NR_Linux + 335)
-#define __NR_fanotify_init		(__NR_Linux + 336)
-#define __NR_fanotify_mark		(__NR_Linux + 337)
-#define __NR_prlimit64			(__NR_Linux + 338)
-#define __NR_name_to_handle_at		(__NR_Linux + 339)
-#define __NR_open_by_handle_at		(__NR_Linux + 340)
-#define __NR_clock_adjtime		(__NR_Linux + 341)
-#define __NR_syncfs			(__NR_Linux + 342)
-#define __NR_sendmmsg			(__NR_Linux + 343)
-#define __NR_setns			(__NR_Linux + 344)
-#define __NR_process_vm_readv		(__NR_Linux + 345)
-#define __NR_process_vm_writev		(__NR_Linux + 346)
-#define __NR_kcmp			(__NR_Linux + 347)
-#define __NR_finit_module		(__NR_Linux + 348)
-#define __NR_sched_setattr		(__NR_Linux + 349)
-#define __NR_sched_getattr		(__NR_Linux + 350)
-#define __NR_renameat2			(__NR_Linux + 351)
-#define __NR_seccomp			(__NR_Linux + 352)
-#define __NR_getrandom			(__NR_Linux + 353)
-#define __NR_memfd_create		(__NR_Linux + 354)
-#define __NR_bpf			(__NR_Linux + 355)
-#define __NR_execveat			(__NR_Linux + 356)
-#define __NR_userfaultfd		(__NR_Linux + 357)
-#define __NR_membarrier			(__NR_Linux + 358)
-#define __NR_mlock2			(__NR_Linux + 359)
-#define __NR_copy_file_range		(__NR_Linux + 360)
-#define __NR_preadv2			(__NR_Linux + 361)
-#define __NR_pwritev2			(__NR_Linux + 362)
-#define __NR_pkey_mprotect		(__NR_Linux + 363)
-#define __NR_pkey_alloc			(__NR_Linux + 364)
-#define __NR_pkey_free			(__NR_Linux + 365)
-#define __NR_statx			(__NR_Linux + 366)
-#define __NR_rseq			(__NR_Linux + 367)
-#define __NR_io_pgetevents		(__NR_Linux + 368)
-
-
-/*
- * Offset of the last Linux o32 flavoured syscall
- */
-#define __NR_Linux_syscalls		368
+#define __NR_Linux	4000
+#include <asm/unistd_o32.h>
 
 #endif /* _MIPS_SIM == _MIPS_SIM_ABI32 */
 
-#define __NR_O32_Linux			4000
-#define __NR_O32_Linux_syscalls		368
-
 #if _MIPS_SIM == _MIPS_SIM_ABI64
 
-/*
- * Linux 64-bit syscalls are in the range from 5000 to 5999.
- */
-#define __NR_Linux			5000
-#define __NR_read			(__NR_Linux +	0)
-#define __NR_write			(__NR_Linux +	1)
-#define __NR_open			(__NR_Linux +	2)
-#define __NR_close			(__NR_Linux +	3)
-#define __NR_stat			(__NR_Linux +	4)
-#define __NR_fstat			(__NR_Linux +	5)
-#define __NR_lstat			(__NR_Linux +	6)
-#define __NR_poll			(__NR_Linux +	7)
-#define __NR_lseek			(__NR_Linux +	8)
-#define __NR_mmap			(__NR_Linux +	9)
-#define __NR_mprotect			(__NR_Linux +  10)
-#define __NR_munmap			(__NR_Linux +  11)
-#define __NR_brk			(__NR_Linux +  12)
-#define __NR_rt_sigaction		(__NR_Linux +  13)
-#define __NR_rt_sigprocmask		(__NR_Linux +  14)
-#define __NR_ioctl			(__NR_Linux +  15)
-#define __NR_pread64			(__NR_Linux +  16)
-#define __NR_pwrite64			(__NR_Linux +  17)
-#define __NR_readv			(__NR_Linux +  18)
-#define __NR_writev			(__NR_Linux +  19)
-#define __NR_access			(__NR_Linux +  20)
-#define __NR_pipe			(__NR_Linux +  21)
-#define __NR__newselect			(__NR_Linux +  22)
-#define __NR_sched_yield		(__NR_Linux +  23)
-#define __NR_mremap			(__NR_Linux +  24)
-#define __NR_msync			(__NR_Linux +  25)
-#define __NR_mincore			(__NR_Linux +  26)
-#define __NR_madvise			(__NR_Linux +  27)
-#define __NR_shmget			(__NR_Linux +  28)
-#define __NR_shmat			(__NR_Linux +  29)
-#define __NR_shmctl			(__NR_Linux +  30)
-#define __NR_dup			(__NR_Linux +  31)
-#define __NR_dup2			(__NR_Linux +  32)
-#define __NR_pause			(__NR_Linux +  33)
-#define __NR_nanosleep			(__NR_Linux +  34)
-#define __NR_getitimer			(__NR_Linux +  35)
-#define __NR_setitimer			(__NR_Linux +  36)
-#define __NR_alarm			(__NR_Linux +  37)
-#define __NR_getpid			(__NR_Linux +  38)
-#define __NR_sendfile			(__NR_Linux +  39)
-#define __NR_socket			(__NR_Linux +  40)
-#define __NR_connect			(__NR_Linux +  41)
-#define __NR_accept			(__NR_Linux +  42)
-#define __NR_sendto			(__NR_Linux +  43)
-#define __NR_recvfrom			(__NR_Linux +  44)
-#define __NR_sendmsg			(__NR_Linux +  45)
-#define __NR_recvmsg			(__NR_Linux +  46)
-#define __NR_shutdown			(__NR_Linux +  47)
-#define __NR_bind			(__NR_Linux +  48)
-#define __NR_listen			(__NR_Linux +  49)
-#define __NR_getsockname		(__NR_Linux +  50)
-#define __NR_getpeername		(__NR_Linux +  51)
-#define __NR_socketpair			(__NR_Linux +  52)
-#define __NR_setsockopt			(__NR_Linux +  53)
-#define __NR_getsockopt			(__NR_Linux +  54)
-#define __NR_clone			(__NR_Linux +  55)
-#define __NR_fork			(__NR_Linux +  56)
-#define __NR_execve			(__NR_Linux +  57)
-#define __NR_exit			(__NR_Linux +  58)
-#define __NR_wait4			(__NR_Linux +  59)
-#define __NR_kill			(__NR_Linux +  60)
-#define __NR_uname			(__NR_Linux +  61)
-#define __NR_semget			(__NR_Linux +  62)
-#define __NR_semop			(__NR_Linux +  63)
-#define __NR_semctl			(__NR_Linux +  64)
-#define __NR_shmdt			(__NR_Linux +  65)
-#define __NR_msgget			(__NR_Linux +  66)
-#define __NR_msgsnd			(__NR_Linux +  67)
-#define __NR_msgrcv			(__NR_Linux +  68)
-#define __NR_msgctl			(__NR_Linux +  69)
-#define __NR_fcntl			(__NR_Linux +  70)
-#define __NR_flock			(__NR_Linux +  71)
-#define __NR_fsync			(__NR_Linux +  72)
-#define __NR_fdatasync			(__NR_Linux +  73)
-#define __NR_truncate			(__NR_Linux +  74)
-#define __NR_ftruncate			(__NR_Linux +  75)
-#define __NR_getdents			(__NR_Linux +  76)
-#define __NR_getcwd			(__NR_Linux +  77)
-#define __NR_chdir			(__NR_Linux +  78)
-#define __NR_fchdir			(__NR_Linux +  79)
-#define __NR_rename			(__NR_Linux +  80)
-#define __NR_mkdir			(__NR_Linux +  81)
-#define __NR_rmdir			(__NR_Linux +  82)
-#define __NR_creat			(__NR_Linux +  83)
-#define __NR_link			(__NR_Linux +  84)
-#define __NR_unlink			(__NR_Linux +  85)
-#define __NR_symlink			(__NR_Linux +  86)
-#define __NR_readlink			(__NR_Linux +  87)
-#define __NR_chmod			(__NR_Linux +  88)
-#define __NR_fchmod			(__NR_Linux +  89)
-#define __NR_chown			(__NR_Linux +  90)
-#define __NR_fchown			(__NR_Linux +  91)
-#define __NR_lchown			(__NR_Linux +  92)
-#define __NR_umask			(__NR_Linux +  93)
-#define __NR_gettimeofday		(__NR_Linux +  94)
-#define __NR_getrlimit			(__NR_Linux +  95)
-#define __NR_getrusage			(__NR_Linux +  96)
-#define __NR_sysinfo			(__NR_Linux +  97)
-#define __NR_times			(__NR_Linux +  98)
-#define __NR_ptrace			(__NR_Linux +  99)
-#define __NR_getuid			(__NR_Linux + 100)
-#define __NR_syslog			(__NR_Linux + 101)
-#define __NR_getgid			(__NR_Linux + 102)
-#define __NR_setuid			(__NR_Linux + 103)
-#define __NR_setgid			(__NR_Linux + 104)
-#define __NR_geteuid			(__NR_Linux + 105)
-#define __NR_getegid			(__NR_Linux + 106)
-#define __NR_setpgid			(__NR_Linux + 107)
-#define __NR_getppid			(__NR_Linux + 108)
-#define __NR_getpgrp			(__NR_Linux + 109)
-#define __NR_setsid			(__NR_Linux + 110)
-#define __NR_setreuid			(__NR_Linux + 111)
-#define __NR_setregid			(__NR_Linux + 112)
-#define __NR_getgroups			(__NR_Linux + 113)
-#define __NR_setgroups			(__NR_Linux + 114)
-#define __NR_setresuid			(__NR_Linux + 115)
-#define __NR_getresuid			(__NR_Linux + 116)
-#define __NR_setresgid			(__NR_Linux + 117)
-#define __NR_getresgid			(__NR_Linux + 118)
-#define __NR_getpgid			(__NR_Linux + 119)
-#define __NR_setfsuid			(__NR_Linux + 120)
-#define __NR_setfsgid			(__NR_Linux + 121)
-#define __NR_getsid			(__NR_Linux + 122)
-#define __NR_capget			(__NR_Linux + 123)
-#define __NR_capset			(__NR_Linux + 124)
-#define __NR_rt_sigpending		(__NR_Linux + 125)
-#define __NR_rt_sigtimedwait		(__NR_Linux + 126)
-#define __NR_rt_sigqueueinfo		(__NR_Linux + 127)
-#define __NR_rt_sigsuspend		(__NR_Linux + 128)
-#define __NR_sigaltstack		(__NR_Linux + 129)
-#define __NR_utime			(__NR_Linux + 130)
-#define __NR_mknod			(__NR_Linux + 131)
-#define __NR_personality		(__NR_Linux + 132)
-#define __NR_ustat			(__NR_Linux + 133)
-#define __NR_statfs			(__NR_Linux + 134)
-#define __NR_fstatfs			(__NR_Linux + 135)
-#define __NR_sysfs			(__NR_Linux + 136)
-#define __NR_getpriority		(__NR_Linux + 137)
-#define __NR_setpriority		(__NR_Linux + 138)
-#define __NR_sched_setparam		(__NR_Linux + 139)
-#define __NR_sched_getparam		(__NR_Linux + 140)
-#define __NR_sched_setscheduler		(__NR_Linux + 141)
-#define __NR_sched_getscheduler		(__NR_Linux + 142)
-#define __NR_sched_get_priority_max	(__NR_Linux + 143)
-#define __NR_sched_get_priority_min	(__NR_Linux + 144)
-#define __NR_sched_rr_get_interval	(__NR_Linux + 145)
-#define __NR_mlock			(__NR_Linux + 146)
-#define __NR_munlock			(__NR_Linux + 147)
-#define __NR_mlockall			(__NR_Linux + 148)
-#define __NR_munlockall			(__NR_Linux + 149)
-#define __NR_vhangup			(__NR_Linux + 150)
-#define __NR_pivot_root			(__NR_Linux + 151)
-#define __NR__sysctl			(__NR_Linux + 152)
-#define __NR_prctl			(__NR_Linux + 153)
-#define __NR_adjtimex			(__NR_Linux + 154)
-#define __NR_setrlimit			(__NR_Linux + 155)
-#define __NR_chroot			(__NR_Linux + 156)
-#define __NR_sync			(__NR_Linux + 157)
-#define __NR_acct			(__NR_Linux + 158)
-#define __NR_settimeofday		(__NR_Linux + 159)
-#define __NR_mount			(__NR_Linux + 160)
-#define __NR_umount2			(__NR_Linux + 161)
-#define __NR_swapon			(__NR_Linux + 162)
-#define __NR_swapoff			(__NR_Linux + 163)
-#define __NR_reboot			(__NR_Linux + 164)
-#define __NR_sethostname		(__NR_Linux + 165)
-#define __NR_setdomainname		(__NR_Linux + 166)
-#define __NR_create_module		(__NR_Linux + 167)
-#define __NR_init_module		(__NR_Linux + 168)
-#define __NR_delete_module		(__NR_Linux + 169)
-#define __NR_get_kernel_syms		(__NR_Linux + 170)
-#define __NR_query_module		(__NR_Linux + 171)
-#define __NR_quotactl			(__NR_Linux + 172)
-#define __NR_nfsservctl			(__NR_Linux + 173)
-#define __NR_getpmsg			(__NR_Linux + 174)
-#define __NR_putpmsg			(__NR_Linux + 175)
-#define __NR_afs_syscall		(__NR_Linux + 176)
-#define __NR_reserved177		(__NR_Linux + 177)
-#define __NR_gettid			(__NR_Linux + 178)
-#define __NR_readahead			(__NR_Linux + 179)
-#define __NR_setxattr			(__NR_Linux + 180)
-#define __NR_lsetxattr			(__NR_Linux + 181)
-#define __NR_fsetxattr			(__NR_Linux + 182)
-#define __NR_getxattr			(__NR_Linux + 183)
-#define __NR_lgetxattr			(__NR_Linux + 184)
-#define __NR_fgetxattr			(__NR_Linux + 185)
-#define __NR_listxattr			(__NR_Linux + 186)
-#define __NR_llistxattr			(__NR_Linux + 187)
-#define __NR_flistxattr			(__NR_Linux + 188)
-#define __NR_removexattr		(__NR_Linux + 189)
-#define __NR_lremovexattr		(__NR_Linux + 190)
-#define __NR_fremovexattr		(__NR_Linux + 191)
-#define __NR_tkill			(__NR_Linux + 192)
-#define __NR_reserved193		(__NR_Linux + 193)
-#define __NR_futex			(__NR_Linux + 194)
-#define __NR_sched_setaffinity		(__NR_Linux + 195)
-#define __NR_sched_getaffinity		(__NR_Linux + 196)
-#define __NR_cacheflush			(__NR_Linux + 197)
-#define __NR_cachectl			(__NR_Linux + 198)
-#define __NR_sysmips			(__NR_Linux + 199)
-#define __NR_io_setup			(__NR_Linux + 200)
-#define __NR_io_destroy			(__NR_Linux + 201)
-#define __NR_io_getevents		(__NR_Linux + 202)
-#define __NR_io_submit			(__NR_Linux + 203)
-#define __NR_io_cancel			(__NR_Linux + 204)
-#define __NR_exit_group			(__NR_Linux + 205)
-#define __NR_lookup_dcookie		(__NR_Linux + 206)
-#define __NR_epoll_create		(__NR_Linux + 207)
-#define __NR_epoll_ctl			(__NR_Linux + 208)
-#define __NR_epoll_wait			(__NR_Linux + 209)
-#define __NR_remap_file_pages		(__NR_Linux + 210)
-#define __NR_rt_sigreturn		(__NR_Linux + 211)
-#define __NR_set_tid_address		(__NR_Linux + 212)
-#define __NR_restart_syscall		(__NR_Linux + 213)
-#define __NR_semtimedop			(__NR_Linux + 214)
-#define __NR_fadvise64			(__NR_Linux + 215)
-#define __NR_timer_create		(__NR_Linux + 216)
-#define __NR_timer_settime		(__NR_Linux + 217)
-#define __NR_timer_gettime		(__NR_Linux + 218)
-#define __NR_timer_getoverrun		(__NR_Linux + 219)
-#define __NR_timer_delete		(__NR_Linux + 220)
-#define __NR_clock_settime		(__NR_Linux + 221)
-#define __NR_clock_gettime		(__NR_Linux + 222)
-#define __NR_clock_getres		(__NR_Linux + 223)
-#define __NR_clock_nanosleep		(__NR_Linux + 224)
-#define __NR_tgkill			(__NR_Linux + 225)
-#define __NR_utimes			(__NR_Linux + 226)
-#define __NR_mbind			(__NR_Linux + 227)
-#define __NR_get_mempolicy		(__NR_Linux + 228)
-#define __NR_set_mempolicy		(__NR_Linux + 229)
-#define __NR_mq_open			(__NR_Linux + 230)
-#define __NR_mq_unlink			(__NR_Linux + 231)
-#define __NR_mq_timedsend		(__NR_Linux + 232)
-#define __NR_mq_timedreceive		(__NR_Linux + 233)
-#define __NR_mq_notify			(__NR_Linux + 234)
-#define __NR_mq_getsetattr		(__NR_Linux + 235)
-#define __NR_vserver			(__NR_Linux + 236)
-#define __NR_waitid			(__NR_Linux + 237)
-/* #define __NR_sys_setaltroot		(__NR_Linux + 238) */
-#define __NR_add_key			(__NR_Linux + 239)
-#define __NR_request_key		(__NR_Linux + 240)
-#define __NR_keyctl			(__NR_Linux + 241)
-#define __NR_set_thread_area		(__NR_Linux + 242)
-#define __NR_inotify_init		(__NR_Linux + 243)
-#define __NR_inotify_add_watch		(__NR_Linux + 244)
-#define __NR_inotify_rm_watch		(__NR_Linux + 245)
-#define __NR_migrate_pages		(__NR_Linux + 246)
-#define __NR_openat			(__NR_Linux + 247)
-#define __NR_mkdirat			(__NR_Linux + 248)
-#define __NR_mknodat			(__NR_Linux + 249)
-#define __NR_fchownat			(__NR_Linux + 250)
-#define __NR_futimesat			(__NR_Linux + 251)
-#define __NR_newfstatat			(__NR_Linux + 252)
-#define __NR_unlinkat			(__NR_Linux + 253)
-#define __NR_renameat			(__NR_Linux + 254)
-#define __NR_linkat			(__NR_Linux + 255)
-#define __NR_symlinkat			(__NR_Linux + 256)
-#define __NR_readlinkat			(__NR_Linux + 257)
-#define __NR_fchmodat			(__NR_Linux + 258)
-#define __NR_faccessat			(__NR_Linux + 259)
-#define __NR_pselect6			(__NR_Linux + 260)
-#define __NR_ppoll			(__NR_Linux + 261)
-#define __NR_unshare			(__NR_Linux + 262)
-#define __NR_splice			(__NR_Linux + 263)
-#define __NR_sync_file_range		(__NR_Linux + 264)
-#define __NR_tee			(__NR_Linux + 265)
-#define __NR_vmsplice			(__NR_Linux + 266)
-#define __NR_move_pages			(__NR_Linux + 267)
-#define __NR_set_robust_list		(__NR_Linux + 268)
-#define __NR_get_robust_list		(__NR_Linux + 269)
-#define __NR_kexec_load			(__NR_Linux + 270)
-#define __NR_getcpu			(__NR_Linux + 271)
-#define __NR_epoll_pwait		(__NR_Linux + 272)
-#define __NR_ioprio_set			(__NR_Linux + 273)
-#define __NR_ioprio_get			(__NR_Linux + 274)
-#define __NR_utimensat			(__NR_Linux + 275)
-#define __NR_signalfd			(__NR_Linux + 276)
-#define __NR_timerfd			(__NR_Linux + 277)
-#define __NR_eventfd			(__NR_Linux + 278)
-#define __NR_fallocate			(__NR_Linux + 279)
-#define __NR_timerfd_create		(__NR_Linux + 280)
-#define __NR_timerfd_gettime		(__NR_Linux + 281)
-#define __NR_timerfd_settime		(__NR_Linux + 282)
-#define __NR_signalfd4			(__NR_Linux + 283)
-#define __NR_eventfd2			(__NR_Linux + 284)
-#define __NR_epoll_create1		(__NR_Linux + 285)
-#define __NR_dup3			(__NR_Linux + 286)
-#define __NR_pipe2			(__NR_Linux + 287)
-#define __NR_inotify_init1		(__NR_Linux + 288)
-#define __NR_preadv			(__NR_Linux + 289)
-#define __NR_pwritev			(__NR_Linux + 290)
-#define __NR_rt_tgsigqueueinfo		(__NR_Linux + 291)
-#define __NR_perf_event_open		(__NR_Linux + 292)
-#define __NR_accept4			(__NR_Linux + 293)
-#define __NR_recvmmsg			(__NR_Linux + 294)
-#define __NR_fanotify_init		(__NR_Linux + 295)
-#define __NR_fanotify_mark		(__NR_Linux + 296)
-#define __NR_prlimit64			(__NR_Linux + 297)
-#define __NR_name_to_handle_at		(__NR_Linux + 298)
-#define __NR_open_by_handle_at		(__NR_Linux + 299)
-#define __NR_clock_adjtime		(__NR_Linux + 300)
-#define __NR_syncfs			(__NR_Linux + 301)
-#define __NR_sendmmsg			(__NR_Linux + 302)
-#define __NR_setns			(__NR_Linux + 303)
-#define __NR_process_vm_readv		(__NR_Linux + 304)
-#define __NR_process_vm_writev		(__NR_Linux + 305)
-#define __NR_kcmp			(__NR_Linux + 306)
-#define __NR_finit_module		(__NR_Linux + 307)
-#define __NR_getdents64			(__NR_Linux + 308)
-#define __NR_sched_setattr		(__NR_Linux + 309)
-#define __NR_sched_getattr		(__NR_Linux + 310)
-#define __NR_renameat2			(__NR_Linux + 311)
-#define __NR_seccomp			(__NR_Linux + 312)
-#define __NR_getrandom			(__NR_Linux + 313)
-#define __NR_memfd_create		(__NR_Linux + 314)
-#define __NR_bpf			(__NR_Linux + 315)
-#define __NR_execveat			(__NR_Linux + 316)
-#define __NR_userfaultfd		(__NR_Linux + 317)
-#define __NR_membarrier			(__NR_Linux + 318)
-#define __NR_mlock2			(__NR_Linux + 319)
-#define __NR_copy_file_range		(__NR_Linux + 320)
-#define __NR_preadv2			(__NR_Linux + 321)
-#define __NR_pwritev2			(__NR_Linux + 322)
-#define __NR_pkey_mprotect		(__NR_Linux + 323)
-#define __NR_pkey_alloc			(__NR_Linux + 324)
-#define __NR_pkey_free			(__NR_Linux + 325)
-#define __NR_statx			(__NR_Linux + 326)
-#define __NR_rseq			(__NR_Linux + 327)
-#define __NR_io_pgetevents		(__NR_Linux + 328)
-
-/*
- * Offset of the last Linux 64-bit flavoured syscall
- */
-#define __NR_Linux_syscalls		328
+#define __NR_Linux	5000
+#include <asm/unistd_n64.h>
 
 #endif /* _MIPS_SIM == _MIPS_SIM_ABI64 */
 
-#define __NR_64_Linux			5000
-#define __NR_64_Linux_syscalls		328
-
 #if _MIPS_SIM == _MIPS_SIM_NABI32
 
-/*
- * Linux N32 syscalls are in the range from 6000 to 6999.
- */
-#define __NR_Linux			6000
-#define __NR_read			(__NR_Linux +	0)
-#define __NR_write			(__NR_Linux +	1)
-#define __NR_open			(__NR_Linux +	2)
-#define __NR_close			(__NR_Linux +	3)
-#define __NR_stat			(__NR_Linux +	4)
-#define __NR_fstat			(__NR_Linux +	5)
-#define __NR_lstat			(__NR_Linux +	6)
-#define __NR_poll			(__NR_Linux +	7)
-#define __NR_lseek			(__NR_Linux +	8)
-#define __NR_mmap			(__NR_Linux +	9)
-#define __NR_mprotect			(__NR_Linux +  10)
-#define __NR_munmap			(__NR_Linux +  11)
-#define __NR_brk			(__NR_Linux +  12)
-#define __NR_rt_sigaction		(__NR_Linux +  13)
-#define __NR_rt_sigprocmask		(__NR_Linux +  14)
-#define __NR_ioctl			(__NR_Linux +  15)
-#define __NR_pread64			(__NR_Linux +  16)
-#define __NR_pwrite64			(__NR_Linux +  17)
-#define __NR_readv			(__NR_Linux +  18)
-#define __NR_writev			(__NR_Linux +  19)
-#define __NR_access			(__NR_Linux +  20)
-#define __NR_pipe			(__NR_Linux +  21)
-#define __NR__newselect			(__NR_Linux +  22)
-#define __NR_sched_yield		(__NR_Linux +  23)
-#define __NR_mremap			(__NR_Linux +  24)
-#define __NR_msync			(__NR_Linux +  25)
-#define __NR_mincore			(__NR_Linux +  26)
-#define __NR_madvise			(__NR_Linux +  27)
-#define __NR_shmget			(__NR_Linux +  28)
-#define __NR_shmat			(__NR_Linux +  29)
-#define __NR_shmctl			(__NR_Linux +  30)
-#define __NR_dup			(__NR_Linux +  31)
-#define __NR_dup2			(__NR_Linux +  32)
-#define __NR_pause			(__NR_Linux +  33)
-#define __NR_nanosleep			(__NR_Linux +  34)
-#define __NR_getitimer			(__NR_Linux +  35)
-#define __NR_setitimer			(__NR_Linux +  36)
-#define __NR_alarm			(__NR_Linux +  37)
-#define __NR_getpid			(__NR_Linux +  38)
-#define __NR_sendfile			(__NR_Linux +  39)
-#define __NR_socket			(__NR_Linux +  40)
-#define __NR_connect			(__NR_Linux +  41)
-#define __NR_accept			(__NR_Linux +  42)
-#define __NR_sendto			(__NR_Linux +  43)
-#define __NR_recvfrom			(__NR_Linux +  44)
-#define __NR_sendmsg			(__NR_Linux +  45)
-#define __NR_recvmsg			(__NR_Linux +  46)
-#define __NR_shutdown			(__NR_Linux +  47)
-#define __NR_bind			(__NR_Linux +  48)
-#define __NR_listen			(__NR_Linux +  49)
-#define __NR_getsockname		(__NR_Linux +  50)
-#define __NR_getpeername		(__NR_Linux +  51)
-#define __NR_socketpair			(__NR_Linux +  52)
-#define __NR_setsockopt			(__NR_Linux +  53)
-#define __NR_getsockopt			(__NR_Linux +  54)
-#define __NR_clone			(__NR_Linux +  55)
-#define __NR_fork			(__NR_Linux +  56)
-#define __NR_execve			(__NR_Linux +  57)
-#define __NR_exit			(__NR_Linux +  58)
-#define __NR_wait4			(__NR_Linux +  59)
-#define __NR_kill			(__NR_Linux +  60)
-#define __NR_uname			(__NR_Linux +  61)
-#define __NR_semget			(__NR_Linux +  62)
-#define __NR_semop			(__NR_Linux +  63)
-#define __NR_semctl			(__NR_Linux +  64)
-#define __NR_shmdt			(__NR_Linux +  65)
-#define __NR_msgget			(__NR_Linux +  66)
-#define __NR_msgsnd			(__NR_Linux +  67)
-#define __NR_msgrcv			(__NR_Linux +  68)
-#define __NR_msgctl			(__NR_Linux +  69)
-#define __NR_fcntl			(__NR_Linux +  70)
-#define __NR_flock			(__NR_Linux +  71)
-#define __NR_fsync			(__NR_Linux +  72)
-#define __NR_fdatasync			(__NR_Linux +  73)
-#define __NR_truncate			(__NR_Linux +  74)
-#define __NR_ftruncate			(__NR_Linux +  75)
-#define __NR_getdents			(__NR_Linux +  76)
-#define __NR_getcwd			(__NR_Linux +  77)
-#define __NR_chdir			(__NR_Linux +  78)
-#define __NR_fchdir			(__NR_Linux +  79)
-#define __NR_rename			(__NR_Linux +  80)
-#define __NR_mkdir			(__NR_Linux +  81)
-#define __NR_rmdir			(__NR_Linux +  82)
-#define __NR_creat			(__NR_Linux +  83)
-#define __NR_link			(__NR_Linux +  84)
-#define __NR_unlink			(__NR_Linux +  85)
-#define __NR_symlink			(__NR_Linux +  86)
-#define __NR_readlink			(__NR_Linux +  87)
-#define __NR_chmod			(__NR_Linux +  88)
-#define __NR_fchmod			(__NR_Linux +  89)
-#define __NR_chown			(__NR_Linux +  90)
-#define __NR_fchown			(__NR_Linux +  91)
-#define __NR_lchown			(__NR_Linux +  92)
-#define __NR_umask			(__NR_Linux +  93)
-#define __NR_gettimeofday		(__NR_Linux +  94)
-#define __NR_getrlimit			(__NR_Linux +  95)
-#define __NR_getrusage			(__NR_Linux +  96)
-#define __NR_sysinfo			(__NR_Linux +  97)
-#define __NR_times			(__NR_Linux +  98)
-#define __NR_ptrace			(__NR_Linux +  99)
-#define __NR_getuid			(__NR_Linux + 100)
-#define __NR_syslog			(__NR_Linux + 101)
-#define __NR_getgid			(__NR_Linux + 102)
-#define __NR_setuid			(__NR_Linux + 103)
-#define __NR_setgid			(__NR_Linux + 104)
-#define __NR_geteuid			(__NR_Linux + 105)
-#define __NR_getegid			(__NR_Linux + 106)
-#define __NR_setpgid			(__NR_Linux + 107)
-#define __NR_getppid			(__NR_Linux + 108)
-#define __NR_getpgrp			(__NR_Linux + 109)
-#define __NR_setsid			(__NR_Linux + 110)
-#define __NR_setreuid			(__NR_Linux + 111)
-#define __NR_setregid			(__NR_Linux + 112)
-#define __NR_getgroups			(__NR_Linux + 113)
-#define __NR_setgroups			(__NR_Linux + 114)
-#define __NR_setresuid			(__NR_Linux + 115)
-#define __NR_getresuid			(__NR_Linux + 116)
-#define __NR_setresgid			(__NR_Linux + 117)
-#define __NR_getresgid			(__NR_Linux + 118)
-#define __NR_getpgid			(__NR_Linux + 119)
-#define __NR_setfsuid			(__NR_Linux + 120)
-#define __NR_setfsgid			(__NR_Linux + 121)
-#define __NR_getsid			(__NR_Linux + 122)
-#define __NR_capget			(__NR_Linux + 123)
-#define __NR_capset			(__NR_Linux + 124)
-#define __NR_rt_sigpending		(__NR_Linux + 125)
-#define __NR_rt_sigtimedwait		(__NR_Linux + 126)
-#define __NR_rt_sigqueueinfo		(__NR_Linux + 127)
-#define __NR_rt_sigsuspend		(__NR_Linux + 128)
-#define __NR_sigaltstack		(__NR_Linux + 129)
-#define __NR_utime			(__NR_Linux + 130)
-#define __NR_mknod			(__NR_Linux + 131)
-#define __NR_personality		(__NR_Linux + 132)
-#define __NR_ustat			(__NR_Linux + 133)
-#define __NR_statfs			(__NR_Linux + 134)
-#define __NR_fstatfs			(__NR_Linux + 135)
-#define __NR_sysfs			(__NR_Linux + 136)
-#define __NR_getpriority		(__NR_Linux + 137)
-#define __NR_setpriority		(__NR_Linux + 138)
-#define __NR_sched_setparam		(__NR_Linux + 139)
-#define __NR_sched_getparam		(__NR_Linux + 140)
-#define __NR_sched_setscheduler		(__NR_Linux + 141)
-#define __NR_sched_getscheduler		(__NR_Linux + 142)
-#define __NR_sched_get_priority_max	(__NR_Linux + 143)
-#define __NR_sched_get_priority_min	(__NR_Linux + 144)
-#define __NR_sched_rr_get_interval	(__NR_Linux + 145)
-#define __NR_mlock			(__NR_Linux + 146)
-#define __NR_munlock			(__NR_Linux + 147)
-#define __NR_mlockall			(__NR_Linux + 148)
-#define __NR_munlockall			(__NR_Linux + 149)
-#define __NR_vhangup			(__NR_Linux + 150)
-#define __NR_pivot_root			(__NR_Linux + 151)
-#define __NR__sysctl			(__NR_Linux + 152)
-#define __NR_prctl			(__NR_Linux + 153)
-#define __NR_adjtimex			(__NR_Linux + 154)
-#define __NR_setrlimit			(__NR_Linux + 155)
-#define __NR_chroot			(__NR_Linux + 156)
-#define __NR_sync			(__NR_Linux + 157)
-#define __NR_acct			(__NR_Linux + 158)
-#define __NR_settimeofday		(__NR_Linux + 159)
-#define __NR_mount			(__NR_Linux + 160)
-#define __NR_umount2			(__NR_Linux + 161)
-#define __NR_swapon			(__NR_Linux + 162)
-#define __NR_swapoff			(__NR_Linux + 163)
-#define __NR_reboot			(__NR_Linux + 164)
-#define __NR_sethostname		(__NR_Linux + 165)
-#define __NR_setdomainname		(__NR_Linux + 166)
-#define __NR_create_module		(__NR_Linux + 167)
-#define __NR_init_module		(__NR_Linux + 168)
-#define __NR_delete_module		(__NR_Linux + 169)
-#define __NR_get_kernel_syms		(__NR_Linux + 170)
-#define __NR_query_module		(__NR_Linux + 171)
-#define __NR_quotactl			(__NR_Linux + 172)
-#define __NR_nfsservctl			(__NR_Linux + 173)
-#define __NR_getpmsg			(__NR_Linux + 174)
-#define __NR_putpmsg			(__NR_Linux + 175)
-#define __NR_afs_syscall		(__NR_Linux + 176)
-#define __NR_reserved177		(__NR_Linux + 177)
-#define __NR_gettid			(__NR_Linux + 178)
-#define __NR_readahead			(__NR_Linux + 179)
-#define __NR_setxattr			(__NR_Linux + 180)
-#define __NR_lsetxattr			(__NR_Linux + 181)
-#define __NR_fsetxattr			(__NR_Linux + 182)
-#define __NR_getxattr			(__NR_Linux + 183)
-#define __NR_lgetxattr			(__NR_Linux + 184)
-#define __NR_fgetxattr			(__NR_Linux + 185)
-#define __NR_listxattr			(__NR_Linux + 186)
-#define __NR_llistxattr			(__NR_Linux + 187)
-#define __NR_flistxattr			(__NR_Linux + 188)
-#define __NR_removexattr		(__NR_Linux + 189)
-#define __NR_lremovexattr		(__NR_Linux + 190)
-#define __NR_fremovexattr		(__NR_Linux + 191)
-#define __NR_tkill			(__NR_Linux + 192)
-#define __NR_reserved193		(__NR_Linux + 193)
-#define __NR_futex			(__NR_Linux + 194)
-#define __NR_sched_setaffinity		(__NR_Linux + 195)
-#define __NR_sched_getaffinity		(__NR_Linux + 196)
-#define __NR_cacheflush			(__NR_Linux + 197)
-#define __NR_cachectl			(__NR_Linux + 198)
-#define __NR_sysmips			(__NR_Linux + 199)
-#define __NR_io_setup			(__NR_Linux + 200)
-#define __NR_io_destroy			(__NR_Linux + 201)
-#define __NR_io_getevents		(__NR_Linux + 202)
-#define __NR_io_submit			(__NR_Linux + 203)
-#define __NR_io_cancel			(__NR_Linux + 204)
-#define __NR_exit_group			(__NR_Linux + 205)
-#define __NR_lookup_dcookie		(__NR_Linux + 206)
-#define __NR_epoll_create		(__NR_Linux + 207)
-#define __NR_epoll_ctl			(__NR_Linux + 208)
-#define __NR_epoll_wait			(__NR_Linux + 209)
-#define __NR_remap_file_pages		(__NR_Linux + 210)
-#define __NR_rt_sigreturn		(__NR_Linux + 211)
-#define __NR_fcntl64			(__NR_Linux + 212)
-#define __NR_set_tid_address		(__NR_Linux + 213)
-#define __NR_restart_syscall		(__NR_Linux + 214)
-#define __NR_semtimedop			(__NR_Linux + 215)
-#define __NR_fadvise64			(__NR_Linux + 216)
-#define __NR_statfs64			(__NR_Linux + 217)
-#define __NR_fstatfs64			(__NR_Linux + 218)
-#define __NR_sendfile64			(__NR_Linux + 219)
-#define __NR_timer_create		(__NR_Linux + 220)
-#define __NR_timer_settime		(__NR_Linux + 221)
-#define __NR_timer_gettime		(__NR_Linux + 222)
-#define __NR_timer_getoverrun		(__NR_Linux + 223)
-#define __NR_timer_delete		(__NR_Linux + 224)
-#define __NR_clock_settime		(__NR_Linux + 225)
-#define __NR_clock_gettime		(__NR_Linux + 226)
-#define __NR_clock_getres		(__NR_Linux + 227)
-#define __NR_clock_nanosleep		(__NR_Linux + 228)
-#define __NR_tgkill			(__NR_Linux + 229)
-#define __NR_utimes			(__NR_Linux + 230)
-#define __NR_mbind			(__NR_Linux + 231)
-#define __NR_get_mempolicy		(__NR_Linux + 232)
-#define __NR_set_mempolicy		(__NR_Linux + 233)
-#define __NR_mq_open			(__NR_Linux + 234)
-#define __NR_mq_unlink			(__NR_Linux + 235)
-#define __NR_mq_timedsend		(__NR_Linux + 236)
-#define __NR_mq_timedreceive		(__NR_Linux + 237)
-#define __NR_mq_notify			(__NR_Linux + 238)
-#define __NR_mq_getsetattr		(__NR_Linux + 239)
-#define __NR_vserver			(__NR_Linux + 240)
-#define __NR_waitid			(__NR_Linux + 241)
-/* #define __NR_sys_setaltroot		(__NR_Linux + 242) */
-#define __NR_add_key			(__NR_Linux + 243)
-#define __NR_request_key		(__NR_Linux + 244)
-#define __NR_keyctl			(__NR_Linux + 245)
-#define __NR_set_thread_area		(__NR_Linux + 246)
-#define __NR_inotify_init		(__NR_Linux + 247)
-#define __NR_inotify_add_watch		(__NR_Linux + 248)
-#define __NR_inotify_rm_watch		(__NR_Linux + 249)
-#define __NR_migrate_pages		(__NR_Linux + 250)
-#define __NR_openat			(__NR_Linux + 251)
-#define __NR_mkdirat			(__NR_Linux + 252)
-#define __NR_mknodat			(__NR_Linux + 253)
-#define __NR_fchownat			(__NR_Linux + 254)
-#define __NR_futimesat			(__NR_Linux + 255)
-#define __NR_newfstatat			(__NR_Linux + 256)
-#define __NR_unlinkat			(__NR_Linux + 257)
-#define __NR_renameat			(__NR_Linux + 258)
-#define __NR_linkat			(__NR_Linux + 259)
-#define __NR_symlinkat			(__NR_Linux + 260)
-#define __NR_readlinkat			(__NR_Linux + 261)
-#define __NR_fchmodat			(__NR_Linux + 262)
-#define __NR_faccessat			(__NR_Linux + 263)
-#define __NR_pselect6			(__NR_Linux + 264)
-#define __NR_ppoll			(__NR_Linux + 265)
-#define __NR_unshare			(__NR_Linux + 266)
-#define __NR_splice			(__NR_Linux + 267)
-#define __NR_sync_file_range		(__NR_Linux + 268)
-#define __NR_tee			(__NR_Linux + 269)
-#define __NR_vmsplice			(__NR_Linux + 270)
-#define __NR_move_pages			(__NR_Linux + 271)
-#define __NR_set_robust_list		(__NR_Linux + 272)
-#define __NR_get_robust_list		(__NR_Linux + 273)
-#define __NR_kexec_load			(__NR_Linux + 274)
-#define __NR_getcpu			(__NR_Linux + 275)
-#define __NR_epoll_pwait		(__NR_Linux + 276)
-#define __NR_ioprio_set			(__NR_Linux + 277)
-#define __NR_ioprio_get			(__NR_Linux + 278)
-#define __NR_utimensat			(__NR_Linux + 279)
-#define __NR_signalfd			(__NR_Linux + 280)
-#define __NR_timerfd			(__NR_Linux + 281)
-#define __NR_eventfd			(__NR_Linux + 282)
-#define __NR_fallocate			(__NR_Linux + 283)
-#define __NR_timerfd_create		(__NR_Linux + 284)
-#define __NR_timerfd_gettime		(__NR_Linux + 285)
-#define __NR_timerfd_settime		(__NR_Linux + 286)
-#define __NR_signalfd4			(__NR_Linux + 287)
-#define __NR_eventfd2			(__NR_Linux + 288)
-#define __NR_epoll_create1		(__NR_Linux + 289)
-#define __NR_dup3			(__NR_Linux + 290)
-#define __NR_pipe2			(__NR_Linux + 291)
-#define __NR_inotify_init1		(__NR_Linux + 292)
-#define __NR_preadv			(__NR_Linux + 293)
-#define __NR_pwritev			(__NR_Linux + 294)
-#define __NR_rt_tgsigqueueinfo		(__NR_Linux + 295)
-#define __NR_perf_event_open		(__NR_Linux + 296)
-#define __NR_accept4			(__NR_Linux + 297)
-#define __NR_recvmmsg			(__NR_Linux + 298)
-#define __NR_getdents64			(__NR_Linux + 299)
-#define __NR_fanotify_init		(__NR_Linux + 300)
-#define __NR_fanotify_mark		(__NR_Linux + 301)
-#define __NR_prlimit64			(__NR_Linux + 302)
-#define __NR_name_to_handle_at		(__NR_Linux + 303)
-#define __NR_open_by_handle_at		(__NR_Linux + 304)
-#define __NR_clock_adjtime		(__NR_Linux + 305)
-#define __NR_syncfs			(__NR_Linux + 306)
-#define __NR_sendmmsg			(__NR_Linux + 307)
-#define __NR_setns			(__NR_Linux + 308)
-#define __NR_process_vm_readv		(__NR_Linux + 309)
-#define __NR_process_vm_writev		(__NR_Linux + 310)
-#define __NR_kcmp			(__NR_Linux + 311)
-#define __NR_finit_module		(__NR_Linux + 312)
-#define __NR_sched_setattr		(__NR_Linux + 313)
-#define __NR_sched_getattr		(__NR_Linux + 314)
-#define __NR_renameat2			(__NR_Linux + 315)
-#define __NR_seccomp			(__NR_Linux + 316)
-#define __NR_getrandom			(__NR_Linux + 317)
-#define __NR_memfd_create		(__NR_Linux + 318)
-#define __NR_bpf			(__NR_Linux + 319)
-#define __NR_execveat			(__NR_Linux + 320)
-#define __NR_userfaultfd		(__NR_Linux + 321)
-#define __NR_membarrier			(__NR_Linux + 322)
-#define __NR_mlock2			(__NR_Linux + 323)
-#define __NR_copy_file_range		(__NR_Linux + 324)
-#define __NR_preadv2			(__NR_Linux + 325)
-#define __NR_pwritev2			(__NR_Linux + 326)
-#define __NR_pkey_mprotect		(__NR_Linux + 327)
-#define __NR_pkey_alloc			(__NR_Linux + 328)
-#define __NR_pkey_free			(__NR_Linux + 329)
-#define __NR_statx			(__NR_Linux + 330)
-#define __NR_rseq			(__NR_Linux + 331)
-#define __NR_io_pgetevents		(__NR_Linux + 332)
-
-/*
- * Offset of the last N32 flavoured syscall
- */
-#define __NR_Linux_syscalls		332
+#define __NR_Linux	6000
+#include <asm/unistd_n32.h>
 
 #endif /* _MIPS_SIM == _MIPS_SIM_NABI32 */
 
-#define __NR_N32_Linux			6000
-#define __NR_N32_Linux_syscalls		332
-
 #endif /* _ASM_UNISTD_H */
diff --git a/linux-headers/asm-mips/unistd_n32.h b/linux-headers/asm-mips/unistd_n32.h
new file mode 100644
index 0000000000..b744f4d520
--- /dev/null
+++ b/linux-headers/asm-mips/unistd_n32.h
@@ -0,0 +1,338 @@
+#ifndef _ASM_MIPS_UNISTD_N32_H
+#define _ASM_MIPS_UNISTD_N32_H
+
+#define __NR_read	(__NR_Linux + 0)
+#define __NR_write	(__NR_Linux + 1)
+#define __NR_open	(__NR_Linux + 2)
+#define __NR_close	(__NR_Linux + 3)
+#define __NR_stat	(__NR_Linux + 4)
+#define __NR_fstat	(__NR_Linux + 5)
+#define __NR_lstat	(__NR_Linux + 6)
+#define __NR_poll	(__NR_Linux + 7)
+#define __NR_lseek	(__NR_Linux + 8)
+#define __NR_mmap	(__NR_Linux + 9)
+#define __NR_mprotect	(__NR_Linux + 10)
+#define __NR_munmap	(__NR_Linux + 11)
+#define __NR_brk	(__NR_Linux + 12)
+#define __NR_rt_sigaction	(__NR_Linux + 13)
+#define __NR_rt_sigprocmask	(__NR_Linux + 14)
+#define __NR_ioctl	(__NR_Linux + 15)
+#define __NR_pread64	(__NR_Linux + 16)
+#define __NR_pwrite64	(__NR_Linux + 17)
+#define __NR_readv	(__NR_Linux + 18)
+#define __NR_writev	(__NR_Linux + 19)
+#define __NR_access	(__NR_Linux + 20)
+#define __NR_pipe	(__NR_Linux + 21)
+#define __NR__newselect	(__NR_Linux + 22)
+#define __NR_sched_yield	(__NR_Linux + 23)
+#define __NR_mremap	(__NR_Linux + 24)
+#define __NR_msync	(__NR_Linux + 25)
+#define __NR_mincore	(__NR_Linux + 26)
+#define __NR_madvise	(__NR_Linux + 27)
+#define __NR_shmget	(__NR_Linux + 28)
+#define __NR_shmat	(__NR_Linux + 29)
+#define __NR_shmctl	(__NR_Linux + 30)
+#define __NR_dup	(__NR_Linux + 31)
+#define __NR_dup2	(__NR_Linux + 32)
+#define __NR_pause	(__NR_Linux + 33)
+#define __NR_nanosleep	(__NR_Linux + 34)
+#define __NR_getitimer	(__NR_Linux + 35)
+#define __NR_setitimer	(__NR_Linux + 36)
+#define __NR_alarm	(__NR_Linux + 37)
+#define __NR_getpid	(__NR_Linux + 38)
+#define __NR_sendfile	(__NR_Linux + 39)
+#define __NR_socket	(__NR_Linux + 40)
+#define __NR_connect	(__NR_Linux + 41)
+#define __NR_accept	(__NR_Linux + 42)
+#define __NR_sendto	(__NR_Linux + 43)
+#define __NR_recvfrom	(__NR_Linux + 44)
+#define __NR_sendmsg	(__NR_Linux + 45)
+#define __NR_recvmsg	(__NR_Linux + 46)
+#define __NR_shutdown	(__NR_Linux + 47)
+#define __NR_bind	(__NR_Linux + 48)
+#define __NR_listen	(__NR_Linux + 49)
+#define __NR_getsockname	(__NR_Linux + 50)
+#define __NR_getpeername	(__NR_Linux + 51)
+#define __NR_socketpair	(__NR_Linux + 52)
+#define __NR_setsockopt	(__NR_Linux + 53)
+#define __NR_getsockopt	(__NR_Linux + 54)
+#define __NR_clone	(__NR_Linux + 55)
+#define __NR_fork	(__NR_Linux + 56)
+#define __NR_execve	(__NR_Linux + 57)
+#define __NR_exit	(__NR_Linux + 58)
+#define __NR_wait4	(__NR_Linux + 59)
+#define __NR_kill	(__NR_Linux + 60)
+#define __NR_uname	(__NR_Linux + 61)
+#define __NR_semget	(__NR_Linux + 62)
+#define __NR_semop	(__NR_Linux + 63)
+#define __NR_semctl	(__NR_Linux + 64)
+#define __NR_shmdt	(__NR_Linux + 65)
+#define __NR_msgget	(__NR_Linux + 66)
+#define __NR_msgsnd	(__NR_Linux + 67)
+#define __NR_msgrcv	(__NR_Linux + 68)
+#define __NR_msgctl	(__NR_Linux + 69)
+#define __NR_fcntl	(__NR_Linux + 70)
+#define __NR_flock	(__NR_Linux + 71)
+#define __NR_fsync	(__NR_Linux + 72)
+#define __NR_fdatasync	(__NR_Linux + 73)
+#define __NR_truncate	(__NR_Linux + 74)
+#define __NR_ftruncate	(__NR_Linux + 75)
+#define __NR_getdents	(__NR_Linux + 76)
+#define __NR_getcwd	(__NR_Linux + 77)
+#define __NR_chdir	(__NR_Linux + 78)
+#define __NR_fchdir	(__NR_Linux + 79)
+#define __NR_rename	(__NR_Linux + 80)
+#define __NR_mkdir	(__NR_Linux + 81)
+#define __NR_rmdir	(__NR_Linux + 82)
+#define __NR_creat	(__NR_Linux + 83)
+#define __NR_link	(__NR_Linux + 84)
+#define __NR_unlink	(__NR_Linux + 85)
+#define __NR_symlink	(__NR_Linux + 86)
+#define __NR_readlink	(__NR_Linux + 87)
+#define __NR_chmod	(__NR_Linux + 88)
+#define __NR_fchmod	(__NR_Linux + 89)
+#define __NR_chown	(__NR_Linux + 90)
+#define __NR_fchown	(__NR_Linux + 91)
+#define __NR_lchown	(__NR_Linux + 92)
+#define __NR_umask	(__NR_Linux + 93)
+#define __NR_gettimeofday	(__NR_Linux + 94)
+#define __NR_getrlimit	(__NR_Linux + 95)
+#define __NR_getrusage	(__NR_Linux + 96)
+#define __NR_sysinfo	(__NR_Linux + 97)
+#define __NR_times	(__NR_Linux + 98)
+#define __NR_ptrace	(__NR_Linux + 99)
+#define __NR_getuid	(__NR_Linux + 100)
+#define __NR_syslog	(__NR_Linux + 101)
+#define __NR_getgid	(__NR_Linux + 102)
+#define __NR_setuid	(__NR_Linux + 103)
+#define __NR_setgid	(__NR_Linux + 104)
+#define __NR_geteuid	(__NR_Linux + 105)
+#define __NR_getegid	(__NR_Linux + 106)
+#define __NR_setpgid	(__NR_Linux + 107)
+#define __NR_getppid	(__NR_Linux + 108)
+#define __NR_getpgrp	(__NR_Linux + 109)
+#define __NR_setsid	(__NR_Linux + 110)
+#define __NR_setreuid	(__NR_Linux + 111)
+#define __NR_setregid	(__NR_Linux + 112)
+#define __NR_getgroups	(__NR_Linux + 113)
+#define __NR_setgroups	(__NR_Linux + 114)
+#define __NR_setresuid	(__NR_Linux + 115)
+#define __NR_getresuid	(__NR_Linux + 116)
+#define __NR_setresgid	(__NR_Linux + 117)
+#define __NR_getresgid	(__NR_Linux + 118)
+#define __NR_getpgid	(__NR_Linux + 119)
+#define __NR_setfsuid	(__NR_Linux + 120)
+#define __NR_setfsgid	(__NR_Linux + 121)
+#define __NR_getsid	(__NR_Linux + 122)
+#define __NR_capget	(__NR_Linux + 123)
+#define __NR_capset	(__NR_Linux + 124)
+#define __NR_rt_sigpending	(__NR_Linux + 125)
+#define __NR_rt_sigtimedwait	(__NR_Linux + 126)
+#define __NR_rt_sigqueueinfo	(__NR_Linux + 127)
+#define __NR_rt_sigsuspend	(__NR_Linux + 128)
+#define __NR_sigaltstack	(__NR_Linux + 129)
+#define __NR_utime	(__NR_Linux + 130)
+#define __NR_mknod	(__NR_Linux + 131)
+#define __NR_personality	(__NR_Linux + 132)
+#define __NR_ustat	(__NR_Linux + 133)
+#define __NR_statfs	(__NR_Linux + 134)
+#define __NR_fstatfs	(__NR_Linux + 135)
+#define __NR_sysfs	(__NR_Linux + 136)
+#define __NR_getpriority	(__NR_Linux + 137)
+#define __NR_setpriority	(__NR_Linux + 138)
+#define __NR_sched_setparam	(__NR_Linux + 139)
+#define __NR_sched_getparam	(__NR_Linux + 140)
+#define __NR_sched_setscheduler	(__NR_Linux + 141)
+#define __NR_sched_getscheduler	(__NR_Linux + 142)
+#define __NR_sched_get_priority_max	(__NR_Linux + 143)
+#define __NR_sched_get_priority_min	(__NR_Linux + 144)
+#define __NR_sched_rr_get_interval	(__NR_Linux + 145)
+#define __NR_mlock	(__NR_Linux + 146)
+#define __NR_munlock	(__NR_Linux + 147)
+#define __NR_mlockall	(__NR_Linux + 148)
+#define __NR_munlockall	(__NR_Linux + 149)
+#define __NR_vhangup	(__NR_Linux + 150)
+#define __NR_pivot_root	(__NR_Linux + 151)
+#define __NR__sysctl	(__NR_Linux + 152)
+#define __NR_prctl	(__NR_Linux + 153)
+#define __NR_adjtimex	(__NR_Linux + 154)
+#define __NR_setrlimit	(__NR_Linux + 155)
+#define __NR_chroot	(__NR_Linux + 156)
+#define __NR_sync	(__NR_Linux + 157)
+#define __NR_acct	(__NR_Linux + 158)
+#define __NR_settimeofday	(__NR_Linux + 159)
+#define __NR_mount	(__NR_Linux + 160)
+#define __NR_umount2	(__NR_Linux + 161)
+#define __NR_swapon	(__NR_Linux + 162)
+#define __NR_swapoff	(__NR_Linux + 163)
+#define __NR_reboot	(__NR_Linux + 164)
+#define __NR_sethostname	(__NR_Linux + 165)
+#define __NR_setdomainname	(__NR_Linux + 166)
+#define __NR_create_module	(__NR_Linux + 167)
+#define __NR_init_module	(__NR_Linux + 168)
+#define __NR_delete_module	(__NR_Linux + 169)
+#define __NR_get_kernel_syms	(__NR_Linux + 170)
+#define __NR_query_module	(__NR_Linux + 171)
+#define __NR_quotactl	(__NR_Linux + 172)
+#define __NR_nfsservctl	(__NR_Linux + 173)
+#define __NR_getpmsg	(__NR_Linux + 174)
+#define __NR_putpmsg	(__NR_Linux + 175)
+#define __NR_afs_syscall	(__NR_Linux + 176)
+#define __NR_reserved177	(__NR_Linux + 177)
+#define __NR_gettid	(__NR_Linux + 178)
+#define __NR_readahead	(__NR_Linux + 179)
+#define __NR_setxattr	(__NR_Linux + 180)
+#define __NR_lsetxattr	(__NR_Linux + 181)
+#define __NR_fsetxattr	(__NR_Linux + 182)
+#define __NR_getxattr	(__NR_Linux + 183)
+#define __NR_lgetxattr	(__NR_Linux + 184)
+#define __NR_fgetxattr	(__NR_Linux + 185)
+#define __NR_listxattr	(__NR_Linux + 186)
+#define __NR_llistxattr	(__NR_Linux + 187)
+#define __NR_flistxattr	(__NR_Linux + 188)
+#define __NR_removexattr	(__NR_Linux + 189)
+#define __NR_lremovexattr	(__NR_Linux + 190)
+#define __NR_fremovexattr	(__NR_Linux + 191)
+#define __NR_tkill	(__NR_Linux + 192)
+#define __NR_reserved193	(__NR_Linux + 193)
+#define __NR_futex	(__NR_Linux + 194)
+#define __NR_sched_setaffinity	(__NR_Linux + 195)
+#define __NR_sched_getaffinity	(__NR_Linux + 196)
+#define __NR_cacheflush	(__NR_Linux + 197)
+#define __NR_cachectl	(__NR_Linux + 198)
+#define __NR_sysmips	(__NR_Linux + 199)
+#define __NR_io_setup	(__NR_Linux + 200)
+#define __NR_io_destroy	(__NR_Linux + 201)
+#define __NR_io_getevents	(__NR_Linux + 202)
+#define __NR_io_submit	(__NR_Linux + 203)
+#define __NR_io_cancel	(__NR_Linux + 204)
+#define __NR_exit_group	(__NR_Linux + 205)
+#define __NR_lookup_dcookie	(__NR_Linux + 206)
+#define __NR_epoll_create	(__NR_Linux + 207)
+#define __NR_epoll_ctl	(__NR_Linux + 208)
+#define __NR_epoll_wait	(__NR_Linux + 209)
+#define __NR_remap_file_pages	(__NR_Linux + 210)
+#define __NR_rt_sigreturn	(__NR_Linux + 211)
+#define __NR_fcntl64	(__NR_Linux + 212)
+#define __NR_set_tid_address	(__NR_Linux + 213)
+#define __NR_restart_syscall	(__NR_Linux + 214)
+#define __NR_semtimedop	(__NR_Linux + 215)
+#define __NR_fadvise64	(__NR_Linux + 216)
+#define __NR_statfs64	(__NR_Linux + 217)
+#define __NR_fstatfs64	(__NR_Linux + 218)
+#define __NR_sendfile64	(__NR_Linux + 219)
+#define __NR_timer_create	(__NR_Linux + 220)
+#define __NR_timer_settime	(__NR_Linux + 221)
+#define __NR_timer_gettime	(__NR_Linux + 222)
+#define __NR_timer_getoverrun	(__NR_Linux + 223)
+#define __NR_timer_delete	(__NR_Linux + 224)
+#define __NR_clock_settime	(__NR_Linux + 225)
+#define __NR_clock_gettime	(__NR_Linux + 226)
+#define __NR_clock_getres	(__NR_Linux + 227)
+#define __NR_clock_nanosleep	(__NR_Linux + 228)
+#define __NR_tgkill	(__NR_Linux + 229)
+#define __NR_utimes	(__NR_Linux + 230)
+#define __NR_mbind	(__NR_Linux + 231)
+#define __NR_get_mempolicy	(__NR_Linux + 232)
+#define __NR_set_mempolicy	(__NR_Linux + 233)
+#define __NR_mq_open	(__NR_Linux + 234)
+#define __NR_mq_unlink	(__NR_Linux + 235)
+#define __NR_mq_timedsend	(__NR_Linux + 236)
+#define __NR_mq_timedreceive	(__NR_Linux + 237)
+#define __NR_mq_notify	(__NR_Linux + 238)
+#define __NR_mq_getsetattr	(__NR_Linux + 239)
+#define __NR_vserver	(__NR_Linux + 240)
+#define __NR_waitid	(__NR_Linux + 241)
+#define __NR_add_key	(__NR_Linux + 243)
+#define __NR_request_key	(__NR_Linux + 244)
+#define __NR_keyctl	(__NR_Linux + 245)
+#define __NR_set_thread_area	(__NR_Linux + 246)
+#define __NR_inotify_init	(__NR_Linux + 247)
+#define __NR_inotify_add_watch	(__NR_Linux + 248)
+#define __NR_inotify_rm_watch	(__NR_Linux + 249)
+#define __NR_migrate_pages	(__NR_Linux + 250)
+#define __NR_openat	(__NR_Linux + 251)
+#define __NR_mkdirat	(__NR_Linux + 252)
+#define __NR_mknodat	(__NR_Linux + 253)
+#define __NR_fchownat	(__NR_Linux + 254)
+#define __NR_futimesat	(__NR_Linux + 255)
+#define __NR_newfstatat	(__NR_Linux + 256)
+#define __NR_unlinkat	(__NR_Linux + 257)
+#define __NR_renameat	(__NR_Linux + 258)
+#define __NR_linkat	(__NR_Linux + 259)
+#define __NR_symlinkat	(__NR_Linux + 260)
+#define __NR_readlinkat	(__NR_Linux + 261)
+#define __NR_fchmodat	(__NR_Linux + 262)
+#define __NR_faccessat	(__NR_Linux + 263)
+#define __NR_pselect6	(__NR_Linux + 264)
+#define __NR_ppoll	(__NR_Linux + 265)
+#define __NR_unshare	(__NR_Linux + 266)
+#define __NR_splice	(__NR_Linux + 267)
+#define __NR_sync_file_range	(__NR_Linux + 268)
+#define __NR_tee	(__NR_Linux + 269)
+#define __NR_vmsplice	(__NR_Linux + 270)
+#define __NR_move_pages	(__NR_Linux + 271)
+#define __NR_set_robust_list	(__NR_Linux + 272)
+#define __NR_get_robust_list	(__NR_Linux + 273)
+#define __NR_kexec_load	(__NR_Linux + 274)
+#define __NR_getcpu	(__NR_Linux + 275)
+#define __NR_epoll_pwait	(__NR_Linux + 276)
+#define __NR_ioprio_set	(__NR_Linux + 277)
+#define __NR_ioprio_get	(__NR_Linux + 278)
+#define __NR_utimensat	(__NR_Linux + 279)
+#define __NR_signalfd	(__NR_Linux + 280)
+#define __NR_timerfd	(__NR_Linux + 281)
+#define __NR_eventfd	(__NR_Linux + 282)
+#define __NR_fallocate	(__NR_Linux + 283)
+#define __NR_timerfd_create	(__NR_Linux + 284)
+#define __NR_timerfd_gettime	(__NR_Linux + 285)
+#define __NR_timerfd_settime	(__NR_Linux + 286)
+#define __NR_signalfd4	(__NR_Linux + 287)
+#define __NR_eventfd2	(__NR_Linux + 288)
+#define __NR_epoll_create1	(__NR_Linux + 289)
+#define __NR_dup3	(__NR_Linux + 290)
+#define __NR_pipe2	(__NR_Linux + 291)
+#define __NR_inotify_init1	(__NR_Linux + 292)
+#define __NR_preadv	(__NR_Linux + 293)
+#define __NR_pwritev	(__NR_Linux + 294)
+#define __NR_rt_tgsigqueueinfo	(__NR_Linux + 295)
+#define __NR_perf_event_open	(__NR_Linux + 296)
+#define __NR_accept4	(__NR_Linux + 297)
+#define __NR_recvmmsg	(__NR_Linux + 298)
+#define __NR_getdents64	(__NR_Linux + 299)
+#define __NR_fanotify_init	(__NR_Linux + 300)
+#define __NR_fanotify_mark	(__NR_Linux + 301)
+#define __NR_prlimit64	(__NR_Linux + 302)
+#define __NR_name_to_handle_at	(__NR_Linux + 303)
+#define __NR_open_by_handle_at	(__NR_Linux + 304)
+#define __NR_clock_adjtime	(__NR_Linux + 305)
+#define __NR_syncfs	(__NR_Linux + 306)
+#define __NR_sendmmsg	(__NR_Linux + 307)
+#define __NR_setns	(__NR_Linux + 308)
+#define __NR_process_vm_readv	(__NR_Linux + 309)
+#define __NR_process_vm_writev	(__NR_Linux + 310)
+#define __NR_kcmp	(__NR_Linux + 311)
+#define __NR_finit_module	(__NR_Linux + 312)
+#define __NR_sched_setattr	(__NR_Linux + 313)
+#define __NR_sched_getattr	(__NR_Linux + 314)
+#define __NR_renameat2	(__NR_Linux + 315)
+#define __NR_seccomp	(__NR_Linux + 316)
+#define __NR_getrandom	(__NR_Linux + 317)
+#define __NR_memfd_create	(__NR_Linux + 318)
+#define __NR_bpf	(__NR_Linux + 319)
+#define __NR_execveat	(__NR_Linux + 320)
+#define __NR_userfaultfd	(__NR_Linux + 321)
+#define __NR_membarrier	(__NR_Linux + 322)
+#define __NR_mlock2	(__NR_Linux + 323)
+#define __NR_copy_file_range	(__NR_Linux + 324)
+#define __NR_preadv2	(__NR_Linux + 325)
+#define __NR_pwritev2	(__NR_Linux + 326)
+#define __NR_pkey_mprotect	(__NR_Linux + 327)
+#define __NR_pkey_alloc	(__NR_Linux + 328)
+#define __NR_pkey_free	(__NR_Linux + 329)
+#define __NR_statx	(__NR_Linux + 330)
+#define __NR_rseq	(__NR_Linux + 331)
+#define __NR_io_pgetevents	(__NR_Linux + 332)
+
+
+#endif /* _ASM_MIPS_UNISTD_N32_H */
diff --git a/linux-headers/asm-mips/unistd_n64.h b/linux-headers/asm-mips/unistd_n64.h
new file mode 100644
index 0000000000..8083de1f25
--- /dev/null
+++ b/linux-headers/asm-mips/unistd_n64.h
@@ -0,0 +1,334 @@
+#ifndef _ASM_MIPS_UNISTD_N64_H
+#define _ASM_MIPS_UNISTD_N64_H
+
+#define __NR_read	(__NR_Linux + 0)
+#define __NR_write	(__NR_Linux + 1)
+#define __NR_open	(__NR_Linux + 2)
+#define __NR_close	(__NR_Linux + 3)
+#define __NR_stat	(__NR_Linux + 4)
+#define __NR_fstat	(__NR_Linux + 5)
+#define __NR_lstat	(__NR_Linux + 6)
+#define __NR_poll	(__NR_Linux + 7)
+#define __NR_lseek	(__NR_Linux + 8)
+#define __NR_mmap	(__NR_Linux + 9)
+#define __NR_mprotect	(__NR_Linux + 10)
+#define __NR_munmap	(__NR_Linux + 11)
+#define __NR_brk	(__NR_Linux + 12)
+#define __NR_rt_sigaction	(__NR_Linux + 13)
+#define __NR_rt_sigprocmask	(__NR_Linux + 14)
+#define __NR_ioctl	(__NR_Linux + 15)
+#define __NR_pread64	(__NR_Linux + 16)
+#define __NR_pwrite64	(__NR_Linux + 17)
+#define __NR_readv	(__NR_Linux + 18)
+#define __NR_writev	(__NR_Linux + 19)
+#define __NR_access	(__NR_Linux + 20)
+#define __NR_pipe	(__NR_Linux + 21)
+#define __NR__newselect	(__NR_Linux + 22)
+#define __NR_sched_yield	(__NR_Linux + 23)
+#define __NR_mremap	(__NR_Linux + 24)
+#define __NR_msync	(__NR_Linux + 25)
+#define __NR_mincore	(__NR_Linux + 26)
+#define __NR_madvise	(__NR_Linux + 27)
+#define __NR_shmget	(__NR_Linux + 28)
+#define __NR_shmat	(__NR_Linux + 29)
+#define __NR_shmctl	(__NR_Linux + 30)
+#define __NR_dup	(__NR_Linux + 31)
+#define __NR_dup2	(__NR_Linux + 32)
+#define __NR_pause	(__NR_Linux + 33)
+#define __NR_nanosleep	(__NR_Linux + 34)
+#define __NR_getitimer	(__NR_Linux + 35)
+#define __NR_setitimer	(__NR_Linux + 36)
+#define __NR_alarm	(__NR_Linux + 37)
+#define __NR_getpid	(__NR_Linux + 38)
+#define __NR_sendfile	(__NR_Linux + 39)
+#define __NR_socket	(__NR_Linux + 40)
+#define __NR_connect	(__NR_Linux + 41)
+#define __NR_accept	(__NR_Linux + 42)
+#define __NR_sendto	(__NR_Linux + 43)
+#define __NR_recvfrom	(__NR_Linux + 44)
+#define __NR_sendmsg	(__NR_Linux + 45)
+#define __NR_recvmsg	(__NR_Linux + 46)
+#define __NR_shutdown	(__NR_Linux + 47)
+#define __NR_bind	(__NR_Linux + 48)
+#define __NR_listen	(__NR_Linux + 49)
+#define __NR_getsockname	(__NR_Linux + 50)
+#define __NR_getpeername	(__NR_Linux + 51)
+#define __NR_socketpair	(__NR_Linux + 52)
+#define __NR_setsockopt	(__NR_Linux + 53)
+#define __NR_getsockopt	(__NR_Linux + 54)
+#define __NR_clone	(__NR_Linux + 55)
+#define __NR_fork	(__NR_Linux + 56)
+#define __NR_execve	(__NR_Linux + 57)
+#define __NR_exit	(__NR_Linux + 58)
+#define __NR_wait4	(__NR_Linux + 59)
+#define __NR_kill	(__NR_Linux + 60)
+#define __NR_uname	(__NR_Linux + 61)
+#define __NR_semget	(__NR_Linux + 62)
+#define __NR_semop	(__NR_Linux + 63)
+#define __NR_semctl	(__NR_Linux + 64)
+#define __NR_shmdt	(__NR_Linux + 65)
+#define __NR_msgget	(__NR_Linux + 66)
+#define __NR_msgsnd	(__NR_Linux + 67)
+#define __NR_msgrcv	(__NR_Linux + 68)
+#define __NR_msgctl	(__NR_Linux + 69)
+#define __NR_fcntl	(__NR_Linux + 70)
+#define __NR_flock	(__NR_Linux + 71)
+#define __NR_fsync	(__NR_Linux + 72)
+#define __NR_fdatasync	(__NR_Linux + 73)
+#define __NR_truncate	(__NR_Linux + 74)
+#define __NR_ftruncate	(__NR_Linux + 75)
+#define __NR_getdents	(__NR_Linux + 76)
+#define __NR_getcwd	(__NR_Linux + 77)
+#define __NR_chdir	(__NR_Linux + 78)
+#define __NR_fchdir	(__NR_Linux + 79)
+#define __NR_rename	(__NR_Linux + 80)
+#define __NR_mkdir	(__NR_Linux + 81)
+#define __NR_rmdir	(__NR_Linux + 82)
+#define __NR_creat	(__NR_Linux + 83)
+#define __NR_link	(__NR_Linux + 84)
+#define __NR_unlink	(__NR_Linux + 85)
+#define __NR_symlink	(__NR_Linux + 86)
+#define __NR_readlink	(__NR_Linux + 87)
+#define __NR_chmod	(__NR_Linux + 88)
+#define __NR_fchmod	(__NR_Linux + 89)
+#define __NR_chown	(__NR_Linux + 90)
+#define __NR_fchown	(__NR_Linux + 91)
+#define __NR_lchown	(__NR_Linux + 92)
+#define __NR_umask	(__NR_Linux + 93)
+#define __NR_gettimeofday	(__NR_Linux + 94)
+#define __NR_getrlimit	(__NR_Linux + 95)
+#define __NR_getrusage	(__NR_Linux + 96)
+#define __NR_sysinfo	(__NR_Linux + 97)
+#define __NR_times	(__NR_Linux + 98)
+#define __NR_ptrace	(__NR_Linux + 99)
+#define __NR_getuid	(__NR_Linux + 100)
+#define __NR_syslog	(__NR_Linux + 101)
+#define __NR_getgid	(__NR_Linux + 102)
+#define __NR_setuid	(__NR_Linux + 103)
+#define __NR_setgid	(__NR_Linux + 104)
+#define __NR_geteuid	(__NR_Linux + 105)
+#define __NR_getegid	(__NR_Linux + 106)
+#define __NR_setpgid	(__NR_Linux + 107)
+#define __NR_getppid	(__NR_Linux + 108)
+#define __NR_getpgrp	(__NR_Linux + 109)
+#define __NR_setsid	(__NR_Linux + 110)
+#define __NR_setreuid	(__NR_Linux + 111)
+#define __NR_setregid	(__NR_Linux + 112)
+#define __NR_getgroups	(__NR_Linux + 113)
+#define __NR_setgroups	(__NR_Linux + 114)
+#define __NR_setresuid	(__NR_Linux + 115)
+#define __NR_getresuid	(__NR_Linux + 116)
+#define __NR_setresgid	(__NR_Linux + 117)
+#define __NR_getresgid	(__NR_Linux + 118)
+#define __NR_getpgid	(__NR_Linux + 119)
+#define __NR_setfsuid	(__NR_Linux + 120)
+#define __NR_setfsgid	(__NR_Linux + 121)
+#define __NR_getsid	(__NR_Linux + 122)
+#define __NR_capget	(__NR_Linux + 123)
+#define __NR_capset	(__NR_Linux + 124)
+#define __NR_rt_sigpending	(__NR_Linux + 125)
+#define __NR_rt_sigtimedwait	(__NR_Linux + 126)
+#define __NR_rt_sigqueueinfo	(__NR_Linux + 127)
+#define __NR_rt_sigsuspend	(__NR_Linux + 128)
+#define __NR_sigaltstack	(__NR_Linux + 129)
+#define __NR_utime	(__NR_Linux + 130)
+#define __NR_mknod	(__NR_Linux + 131)
+#define __NR_personality	(__NR_Linux + 132)
+#define __NR_ustat	(__NR_Linux + 133)
+#define __NR_statfs	(__NR_Linux + 134)
+#define __NR_fstatfs	(__NR_Linux + 135)
+#define __NR_sysfs	(__NR_Linux + 136)
+#define __NR_getpriority	(__NR_Linux + 137)
+#define __NR_setpriority	(__NR_Linux + 138)
+#define __NR_sched_setparam	(__NR_Linux + 139)
+#define __NR_sched_getparam	(__NR_Linux + 140)
+#define __NR_sched_setscheduler	(__NR_Linux + 141)
+#define __NR_sched_getscheduler	(__NR_Linux + 142)
+#define __NR_sched_get_priority_max	(__NR_Linux + 143)
+#define __NR_sched_get_priority_min	(__NR_Linux + 144)
+#define __NR_sched_rr_get_interval	(__NR_Linux + 145)
+#define __NR_mlock	(__NR_Linux + 146)
+#define __NR_munlock	(__NR_Linux + 147)
+#define __NR_mlockall	(__NR_Linux + 148)
+#define __NR_munlockall	(__NR_Linux + 149)
+#define __NR_vhangup	(__NR_Linux + 150)
+#define __NR_pivot_root	(__NR_Linux + 151)
+#define __NR__sysctl	(__NR_Linux + 152)
+#define __NR_prctl	(__NR_Linux + 153)
+#define __NR_adjtimex	(__NR_Linux + 154)
+#define __NR_setrlimit	(__NR_Linux + 155)
+#define __NR_chroot	(__NR_Linux + 156)
+#define __NR_sync	(__NR_Linux + 157)
+#define __NR_acct	(__NR_Linux + 158)
+#define __NR_settimeofday	(__NR_Linux + 159)
+#define __NR_mount	(__NR_Linux + 160)
+#define __NR_umount2	(__NR_Linux + 161)
+#define __NR_swapon	(__NR_Linux + 162)
+#define __NR_swapoff	(__NR_Linux + 163)
+#define __NR_reboot	(__NR_Linux + 164)
+#define __NR_sethostname	(__NR_Linux + 165)
+#define __NR_setdomainname	(__NR_Linux + 166)
+#define __NR_create_module	(__NR_Linux + 167)
+#define __NR_init_module	(__NR_Linux + 168)
+#define __NR_delete_module	(__NR_Linux + 169)
+#define __NR_get_kernel_syms	(__NR_Linux + 170)
+#define __NR_query_module	(__NR_Linux + 171)
+#define __NR_quotactl	(__NR_Linux + 172)
+#define __NR_nfsservctl	(__NR_Linux + 173)
+#define __NR_getpmsg	(__NR_Linux + 174)
+#define __NR_putpmsg	(__NR_Linux + 175)
+#define __NR_afs_syscall	(__NR_Linux + 176)
+#define __NR_reserved177	(__NR_Linux + 177)
+#define __NR_gettid	(__NR_Linux + 178)
+#define __NR_readahead	(__NR_Linux + 179)
+#define __NR_setxattr	(__NR_Linux + 180)
+#define __NR_lsetxattr	(__NR_Linux + 181)
+#define __NR_fsetxattr	(__NR_Linux + 182)
+#define __NR_getxattr	(__NR_Linux + 183)
+#define __NR_lgetxattr	(__NR_Linux + 184)
+#define __NR_fgetxattr	(__NR_Linux + 185)
+#define __NR_listxattr	(__NR_Linux + 186)
+#define __NR_llistxattr	(__NR_Linux + 187)
+#define __NR_flistxattr	(__NR_Linux + 188)
+#define __NR_removexattr	(__NR_Linux + 189)
+#define __NR_lremovexattr	(__NR_Linux + 190)
+#define __NR_fremovexattr	(__NR_Linux + 191)
+#define __NR_tkill	(__NR_Linux + 192)
+#define __NR_reserved193	(__NR_Linux + 193)
+#define __NR_futex	(__NR_Linux + 194)
+#define __NR_sched_setaffinity	(__NR_Linux + 195)
+#define __NR_sched_getaffinity	(__NR_Linux + 196)
+#define __NR_cacheflush	(__NR_Linux + 197)
+#define __NR_cachectl	(__NR_Linux + 198)
+#define __NR_sysmips	(__NR_Linux + 199)
+#define __NR_io_setup	(__NR_Linux + 200)
+#define __NR_io_destroy	(__NR_Linux + 201)
+#define __NR_io_getevents	(__NR_Linux + 202)
+#define __NR_io_submit	(__NR_Linux + 203)
+#define __NR_io_cancel	(__NR_Linux + 204)
+#define __NR_exit_group	(__NR_Linux + 205)
+#define __NR_lookup_dcookie	(__NR_Linux + 206)
+#define __NR_epoll_create	(__NR_Linux + 207)
+#define __NR_epoll_ctl	(__NR_Linux + 208)
+#define __NR_epoll_wait	(__NR_Linux + 209)
+#define __NR_remap_file_pages	(__NR_Linux + 210)
+#define __NR_rt_sigreturn	(__NR_Linux + 211)
+#define __NR_set_tid_address	(__NR_Linux + 212)
+#define __NR_restart_syscall	(__NR_Linux + 213)
+#define __NR_semtimedop	(__NR_Linux + 214)
+#define __NR_fadvise64	(__NR_Linux + 215)
+#define __NR_timer_create	(__NR_Linux + 216)
+#define __NR_timer_settime	(__NR_Linux + 217)
+#define __NR_timer_gettime	(__NR_Linux + 218)
+#define __NR_timer_getoverrun	(__NR_Linux + 219)
+#define __NR_timer_delete	(__NR_Linux + 220)
+#define __NR_clock_settime	(__NR_Linux + 221)
+#define __NR_clock_gettime	(__NR_Linux + 222)
+#define __NR_clock_getres	(__NR_Linux + 223)
+#define __NR_clock_nanosleep	(__NR_Linux + 224)
+#define __NR_tgkill	(__NR_Linux + 225)
+#define __NR_utimes	(__NR_Linux + 226)
+#define __NR_mbind	(__NR_Linux + 227)
+#define __NR_get_mempolicy	(__NR_Linux + 228)
+#define __NR_set_mempolicy	(__NR_Linux + 229)
+#define __NR_mq_open	(__NR_Linux + 230)
+#define __NR_mq_unlink	(__NR_Linux + 231)
+#define __NR_mq_timedsend	(__NR_Linux + 232)
+#define __NR_mq_timedreceive	(__NR_Linux + 233)
+#define __NR_mq_notify	(__NR_Linux + 234)
+#define __NR_mq_getsetattr	(__NR_Linux + 235)
+#define __NR_vserver	(__NR_Linux + 236)
+#define __NR_waitid	(__NR_Linux + 237)
+#define __NR_add_key	(__NR_Linux + 239)
+#define __NR_request_key	(__NR_Linux + 240)
+#define __NR_keyctl	(__NR_Linux + 241)
+#define __NR_set_thread_area	(__NR_Linux + 242)
+#define __NR_inotify_init	(__NR_Linux + 243)
+#define __NR_inotify_add_watch	(__NR_Linux + 244)
+#define __NR_inotify_rm_watch	(__NR_Linux + 245)
+#define __NR_migrate_pages	(__NR_Linux + 246)
+#define __NR_openat	(__NR_Linux + 247)
+#define __NR_mkdirat	(__NR_Linux + 248)
+#define __NR_mknodat	(__NR_Linux + 249)
+#define __NR_fchownat	(__NR_Linux + 250)
+#define __NR_futimesat	(__NR_Linux + 251)
+#define __NR_newfstatat	(__NR_Linux + 252)
+#define __NR_unlinkat	(__NR_Linux + 253)
+#define __NR_renameat	(__NR_Linux + 254)
+#define __NR_linkat	(__NR_Linux + 255)
+#define __NR_symlinkat	(__NR_Linux + 256)
+#define __NR_readlinkat	(__NR_Linux + 257)
+#define __NR_fchmodat	(__NR_Linux + 258)
+#define __NR_faccessat	(__NR_Linux + 259)
+#define __NR_pselect6	(__NR_Linux + 260)
+#define __NR_ppoll	(__NR_Linux + 261)
+#define __NR_unshare	(__NR_Linux + 262)
+#define __NR_splice	(__NR_Linux + 263)
+#define __NR_sync_file_range	(__NR_Linux + 264)
+#define __NR_tee	(__NR_Linux + 265)
+#define __NR_vmsplice	(__NR_Linux + 266)
+#define __NR_move_pages	(__NR_Linux + 267)
+#define __NR_set_robust_list	(__NR_Linux + 268)
+#define __NR_get_robust_list	(__NR_Linux + 269)
+#define __NR_kexec_load	(__NR_Linux + 270)
+#define __NR_getcpu	(__NR_Linux + 271)
+#define __NR_epoll_pwait	(__NR_Linux + 272)
+#define __NR_ioprio_set	(__NR_Linux + 273)
+#define __NR_ioprio_get	(__NR_Linux + 274)
+#define __NR_utimensat	(__NR_Linux + 275)
+#define __NR_signalfd	(__NR_Linux + 276)
+#define __NR_timerfd	(__NR_Linux + 277)
+#define __NR_eventfd	(__NR_Linux + 278)
+#define __NR_fallocate	(__NR_Linux + 279)
+#define __NR_timerfd_create	(__NR_Linux + 280)
+#define __NR_timerfd_gettime	(__NR_Linux + 281)
+#define __NR_timerfd_settime	(__NR_Linux + 282)
+#define __NR_signalfd4	(__NR_Linux + 283)
+#define __NR_eventfd2	(__NR_Linux + 284)
+#define __NR_epoll_create1	(__NR_Linux + 285)
+#define __NR_dup3	(__NR_Linux + 286)
+#define __NR_pipe2	(__NR_Linux + 287)
+#define __NR_inotify_init1	(__NR_Linux + 288)
+#define __NR_preadv	(__NR_Linux + 289)
+#define __NR_pwritev	(__NR_Linux + 290)
+#define __NR_rt_tgsigqueueinfo	(__NR_Linux + 291)
+#define __NR_perf_event_open	(__NR_Linux + 292)
+#define __NR_accept4	(__NR_Linux + 293)
+#define __NR_recvmmsg	(__NR_Linux + 294)
+#define __NR_fanotify_init	(__NR_Linux + 295)
+#define __NR_fanotify_mark	(__NR_Linux + 296)
+#define __NR_prlimit64	(__NR_Linux + 297)
+#define __NR_name_to_handle_at	(__NR_Linux + 298)
+#define __NR_open_by_handle_at	(__NR_Linux + 299)
+#define __NR_clock_adjtime	(__NR_Linux + 300)
+#define __NR_syncfs	(__NR_Linux + 301)
+#define __NR_sendmmsg	(__NR_Linux + 302)
+#define __NR_setns	(__NR_Linux + 303)
+#define __NR_process_vm_readv	(__NR_Linux + 304)
+#define __NR_process_vm_writev	(__NR_Linux + 305)
+#define __NR_kcmp	(__NR_Linux + 306)
+#define __NR_finit_module	(__NR_Linux + 307)
+#define __NR_getdents64	(__NR_Linux + 308)
+#define __NR_sched_setattr	(__NR_Linux + 309)
+#define __NR_sched_getattr	(__NR_Linux + 310)
+#define __NR_renameat2	(__NR_Linux + 311)
+#define __NR_seccomp	(__NR_Linux + 312)
+#define __NR_getrandom	(__NR_Linux + 313)
+#define __NR_memfd_create	(__NR_Linux + 314)
+#define __NR_bpf	(__NR_Linux + 315)
+#define __NR_execveat	(__NR_Linux + 316)
+#define __NR_userfaultfd	(__NR_Linux + 317)
+#define __NR_membarrier	(__NR_Linux + 318)
+#define __NR_mlock2	(__NR_Linux + 319)
+#define __NR_copy_file_range	(__NR_Linux + 320)
+#define __NR_preadv2	(__NR_Linux + 321)
+#define __NR_pwritev2	(__NR_Linux + 322)
+#define __NR_pkey_mprotect	(__NR_Linux + 323)
+#define __NR_pkey_alloc	(__NR_Linux + 324)
+#define __NR_pkey_free	(__NR_Linux + 325)
+#define __NR_statx	(__NR_Linux + 326)
+#define __NR_rseq	(__NR_Linux + 327)
+#define __NR_io_pgetevents	(__NR_Linux + 328)
+
+
+#endif /* _ASM_MIPS_UNISTD_N64_H */
diff --git a/linux-headers/asm-mips/unistd_o32.h b/linux-headers/asm-mips/unistd_o32.h
new file mode 100644
index 0000000000..b03835b286
--- /dev/null
+++ b/linux-headers/asm-mips/unistd_o32.h
@@ -0,0 +1,374 @@
+#ifndef _ASM_MIPS_UNISTD_O32_H
+#define _ASM_MIPS_UNISTD_O32_H
+
+#define __NR_syscall	(__NR_Linux + 0)
+#define __NR_exit	(__NR_Linux + 1)
+#define __NR_fork	(__NR_Linux + 2)
+#define __NR_read	(__NR_Linux + 3)
+#define __NR_write	(__NR_Linux + 4)
+#define __NR_open	(__NR_Linux + 5)
+#define __NR_close	(__NR_Linux + 6)
+#define __NR_waitpid	(__NR_Linux + 7)
+#define __NR_creat	(__NR_Linux + 8)
+#define __NR_link	(__NR_Linux + 9)
+#define __NR_unlink	(__NR_Linux + 10)
+#define __NR_execve	(__NR_Linux + 11)
+#define __NR_chdir	(__NR_Linux + 12)
+#define __NR_time	(__NR_Linux + 13)
+#define __NR_mknod	(__NR_Linux + 14)
+#define __NR_chmod	(__NR_Linux + 15)
+#define __NR_lchown	(__NR_Linux + 16)
+#define __NR_break	(__NR_Linux + 17)
+#define __NR_unused18	(__NR_Linux + 18)
+#define __NR_lseek	(__NR_Linux + 19)
+#define __NR_getpid	(__NR_Linux + 20)
+#define __NR_mount	(__NR_Linux + 21)
+#define __NR_umount	(__NR_Linux + 22)
+#define __NR_setuid	(__NR_Linux + 23)
+#define __NR_getuid	(__NR_Linux + 24)
+#define __NR_stime	(__NR_Linux + 25)
+#define __NR_ptrace	(__NR_Linux + 26)
+#define __NR_alarm	(__NR_Linux + 27)
+#define __NR_unused28	(__NR_Linux + 28)
+#define __NR_pause	(__NR_Linux + 29)
+#define __NR_utime	(__NR_Linux + 30)
+#define __NR_stty	(__NR_Linux + 31)
+#define __NR_gtty	(__NR_Linux + 32)
+#define __NR_access	(__NR_Linux + 33)
+#define __NR_nice	(__NR_Linux + 34)
+#define __NR_ftime	(__NR_Linux + 35)
+#define __NR_sync	(__NR_Linux + 36)
+#define __NR_kill	(__NR_Linux + 37)
+#define __NR_rename	(__NR_Linux + 38)
+#define __NR_mkdir	(__NR_Linux + 39)
+#define __NR_rmdir	(__NR_Linux + 40)
+#define __NR_dup	(__NR_Linux + 41)
+#define __NR_pipe	(__NR_Linux + 42)
+#define __NR_times	(__NR_Linux + 43)
+#define __NR_prof	(__NR_Linux + 44)
+#define __NR_brk	(__NR_Linux + 45)
+#define __NR_setgid	(__NR_Linux + 46)
+#define __NR_getgid	(__NR_Linux + 47)
+#define __NR_signal	(__NR_Linux + 48)
+#define __NR_geteuid	(__NR_Linux + 49)
+#define __NR_getegid	(__NR_Linux + 50)
+#define __NR_acct	(__NR_Linux + 51)
+#define __NR_umount2	(__NR_Linux + 52)
+#define __NR_lock	(__NR_Linux + 53)
+#define __NR_ioctl	(__NR_Linux + 54)
+#define __NR_fcntl	(__NR_Linux + 55)
+#define __NR_mpx	(__NR_Linux + 56)
+#define __NR_setpgid	(__NR_Linux + 57)
+#define __NR_ulimit	(__NR_Linux + 58)
+#define __NR_unused59	(__NR_Linux + 59)
+#define __NR_umask	(__NR_Linux + 60)
+#define __NR_chroot	(__NR_Linux + 61)
+#define __NR_ustat	(__NR_Linux + 62)
+#define __NR_dup2	(__NR_Linux + 63)
+#define __NR_getppid	(__NR_Linux + 64)
+#define __NR_getpgrp	(__NR_Linux + 65)
+#define __NR_setsid	(__NR_Linux + 66)
+#define __NR_sigaction	(__NR_Linux + 67)
+#define __NR_sgetmask	(__NR_Linux + 68)
+#define __NR_ssetmask	(__NR_Linux + 69)
+#define __NR_setreuid	(__NR_Linux + 70)
+#define __NR_setregid	(__NR_Linux + 71)
+#define __NR_sigsuspend	(__NR_Linux + 72)
+#define __NR_sigpending	(__NR_Linux + 73)
+#define __NR_sethostname	(__NR_Linux + 74)
+#define __NR_setrlimit	(__NR_Linux + 75)
+#define __NR_getrlimit	(__NR_Linux + 76)
+#define __NR_getrusage	(__NR_Linux + 77)
+#define __NR_gettimeofday	(__NR_Linux + 78)
+#define __NR_settimeofday	(__NR_Linux + 79)
+#define __NR_getgroups	(__NR_Linux + 80)
+#define __NR_setgroups	(__NR_Linux + 81)
+#define __NR_reserved82	(__NR_Linux + 82)
+#define __NR_symlink	(__NR_Linux + 83)
+#define __NR_unused84	(__NR_Linux + 84)
+#define __NR_readlink	(__NR_Linux + 85)
+#define __NR_uselib	(__NR_Linux + 86)
+#define __NR_swapon	(__NR_Linux + 87)
+#define __NR_reboot	(__NR_Linux + 88)
+#define __NR_readdir	(__NR_Linux + 89)
+#define __NR_mmap	(__NR_Linux + 90)
+#define __NR_munmap	(__NR_Linux + 91)
+#define __NR_truncate	(__NR_Linux + 92)
+#define __NR_ftruncate	(__NR_Linux + 93)
+#define __NR_fchmod	(__NR_Linux + 94)
+#define __NR_fchown	(__NR_Linux + 95)
+#define __NR_getpriority	(__NR_Linux + 96)
+#define __NR_setpriority	(__NR_Linux + 97)
+#define __NR_profil	(__NR_Linux + 98)
+#define __NR_statfs	(__NR_Linux + 99)
+#define __NR_fstatfs	(__NR_Linux + 100)
+#define __NR_ioperm	(__NR_Linux + 101)
+#define __NR_socketcall	(__NR_Linux + 102)
+#define __NR_syslog	(__NR_Linux + 103)
+#define __NR_setitimer	(__NR_Linux + 104)
+#define __NR_getitimer	(__NR_Linux + 105)
+#define __NR_stat	(__NR_Linux + 106)
+#define __NR_lstat	(__NR_Linux + 107)
+#define __NR_fstat	(__NR_Linux + 108)
+#define __NR_unused109	(__NR_Linux + 109)
+#define __NR_iopl	(__NR_Linux + 110)
+#define __NR_vhangup	(__NR_Linux + 111)
+#define __NR_idle	(__NR_Linux + 112)
+#define __NR_vm86	(__NR_Linux + 113)
+#define __NR_wait4	(__NR_Linux + 114)
+#define __NR_swapoff	(__NR_Linux + 115)
+#define __NR_sysinfo	(__NR_Linux + 116)
+#define __NR_ipc	(__NR_Linux + 117)
+#define __NR_fsync	(__NR_Linux + 118)
+#define __NR_sigreturn	(__NR_Linux + 119)
+#define __NR_clone	(__NR_Linux + 120)
+#define __NR_setdomainname	(__NR_Linux + 121)
+#define __NR_uname	(__NR_Linux + 122)
+#define __NR_modify_ldt	(__NR_Linux + 123)
+#define __NR_adjtimex	(__NR_Linux + 124)
+#define __NR_mprotect	(__NR_Linux + 125)
+#define __NR_sigprocmask	(__NR_Linux + 126)
+#define __NR_create_module	(__NR_Linux + 127)
+#define __NR_init_module	(__NR_Linux + 128)
+#define __NR_delete_module	(__NR_Linux + 129)
+#define __NR_get_kernel_syms	(__NR_Linux + 130)
+#define __NR_quotactl	(__NR_Linux + 131)
+#define __NR_getpgid	(__NR_Linux + 132)
+#define __NR_fchdir	(__NR_Linux + 133)
+#define __NR_bdflush	(__NR_Linux + 134)
+#define __NR_sysfs	(__NR_Linux + 135)
+#define __NR_personality	(__NR_Linux + 136)
+#define __NR_afs_syscall	(__NR_Linux + 137)
+#define __NR_setfsuid	(__NR_Linux + 138)
+#define __NR_setfsgid	(__NR_Linux + 139)
+#define __NR__llseek	(__NR_Linux + 140)
+#define __NR_getdents	(__NR_Linux + 141)
+#define __NR__newselect	(__NR_Linux + 142)
+#define __NR_flock	(__NR_Linux + 143)
+#define __NR_msync	(__NR_Linux + 144)
+#define __NR_readv	(__NR_Linux + 145)
+#define __NR_writev	(__NR_Linux + 146)
+#define __NR_cacheflush	(__NR_Linux + 147)
+#define __NR_cachectl	(__NR_Linux + 148)
+#define __NR_sysmips	(__NR_Linux + 149)
+#define __NR_unused150	(__NR_Linux + 150)
+#define __NR_getsid	(__NR_Linux + 151)
+#define __NR_fdatasync	(__NR_Linux + 152)
+#define __NR__sysctl	(__NR_Linux + 153)
+#define __NR_mlock	(__NR_Linux + 154)
+#define __NR_munlock	(__NR_Linux + 155)
+#define __NR_mlockall	(__NR_Linux + 156)
+#define __NR_munlockall	(__NR_Linux + 157)
+#define __NR_sched_setparam	(__NR_Linux + 158)
+#define __NR_sched_getparam	(__NR_Linux + 159)
+#define __NR_sched_setscheduler	(__NR_Linux + 160)
+#define __NR_sched_getscheduler	(__NR_Linux + 161)
+#define __NR_sched_yield	(__NR_Linux + 162)
+#define __NR_sched_get_priority_max	(__NR_Linux + 163)
+#define __NR_sched_get_priority_min	(__NR_Linux + 164)
+#define __NR_sched_rr_get_interval	(__NR_Linux + 165)
+#define __NR_nanosleep	(__NR_Linux + 166)
+#define __NR_mremap	(__NR_Linux + 167)
+#define __NR_accept	(__NR_Linux + 168)
+#define __NR_bind	(__NR_Linux + 169)
+#define __NR_connect	(__NR_Linux + 170)
+#define __NR_getpeername	(__NR_Linux + 171)
+#define __NR_getsockname	(__NR_Linux + 172)
+#define __NR_getsockopt	(__NR_Linux + 173)
+#define __NR_listen	(__NR_Linux + 174)
+#define __NR_recv	(__NR_Linux + 175)
+#define __NR_recvfrom	(__NR_Linux + 176)
+#define __NR_recvmsg	(__NR_Linux + 177)
+#define __NR_send	(__NR_Linux + 178)
+#define __NR_sendmsg	(__NR_Linux + 179)
+#define __NR_sendto	(__NR_Linux + 180)
+#define __NR_setsockopt	(__NR_Linux + 181)
+#define __NR_shutdown	(__NR_Linux + 182)
+#define __NR_socket	(__NR_Linux + 183)
+#define __NR_socketpair	(__NR_Linux + 184)
+#define __NR_setresuid	(__NR_Linux + 185)
+#define __NR_getresuid	(__NR_Linux + 186)
+#define __NR_query_module	(__NR_Linux + 187)
+#define __NR_poll	(__NR_Linux + 188)
+#define __NR_nfsservctl	(__NR_Linux + 189)
+#define __NR_setresgid	(__NR_Linux + 190)
+#define __NR_getresgid	(__NR_Linux + 191)
+#define __NR_prctl	(__NR_Linux + 192)
+#define __NR_rt_sigreturn	(__NR_Linux + 193)
+#define __NR_rt_sigaction	(__NR_Linux + 194)
+#define __NR_rt_sigprocmask	(__NR_Linux + 195)
+#define __NR_rt_sigpending	(__NR_Linux + 196)
+#define __NR_rt_sigtimedwait	(__NR_Linux + 197)
+#define __NR_rt_sigqueueinfo	(__NR_Linux + 198)
+#define __NR_rt_sigsuspend	(__NR_Linux + 199)
+#define __NR_pread64	(__NR_Linux + 200)
+#define __NR_pwrite64	(__NR_Linux + 201)
+#define __NR_chown	(__NR_Linux + 202)
+#define __NR_getcwd	(__NR_Linux + 203)
+#define __NR_capget	(__NR_Linux + 204)
+#define __NR_capset	(__NR_Linux + 205)
+#define __NR_sigaltstack	(__NR_Linux + 206)
+#define __NR_sendfile	(__NR_Linux + 207)
+#define __NR_getpmsg	(__NR_Linux + 208)
+#define __NR_putpmsg	(__NR_Linux + 209)
+#define __NR_mmap2	(__NR_Linux + 210)
+#define __NR_truncate64	(__NR_Linux + 211)
+#define __NR_ftruncate64	(__NR_Linux + 212)
+#define __NR_stat64	(__NR_Linux + 213)
+#define __NR_lstat64	(__NR_Linux + 214)
+#define __NR_fstat64	(__NR_Linux + 215)
+#define __NR_pivot_root	(__NR_Linux + 216)
+#define __NR_mincore	(__NR_Linux + 217)
+#define __NR_madvise	(__NR_Linux + 218)
+#define __NR_getdents64	(__NR_Linux + 219)
+#define __NR_fcntl64	(__NR_Linux + 220)
+#define __NR_reserved221	(__NR_Linux + 221)
+#define __NR_gettid	(__NR_Linux + 222)
+#define __NR_readahead	(__NR_Linux + 223)
+#define __NR_setxattr	(__NR_Linux + 224)
+#define __NR_lsetxattr	(__NR_Linux + 225)
+#define __NR_fsetxattr	(__NR_Linux + 226)
+#define __NR_getxattr	(__NR_Linux + 227)
+#define __NR_lgetxattr	(__NR_Linux + 228)
+#define __NR_fgetxattr	(__NR_Linux + 229)
+#define __NR_listxattr	(__NR_Linux + 230)
+#define __NR_llistxattr	(__NR_Linux + 231)
+#define __NR_flistxattr	(__NR_Linux + 232)
+#define __NR_removexattr	(__NR_Linux + 233)
+#define __NR_lremovexattr	(__NR_Linux + 234)
+#define __NR_fremovexattr	(__NR_Linux + 235)
+#define __NR_tkill	(__NR_Linux + 236)
+#define __NR_sendfile64	(__NR_Linux + 237)
+#define __NR_futex	(__NR_Linux + 238)
+#define __NR_sched_setaffinity	(__NR_Linux + 239)
+#define __NR_sched_getaffinity	(__NR_Linux + 240)
+#define __NR_io_setup	(__NR_Linux + 241)
+#define __NR_io_destroy	(__NR_Linux + 242)
+#define __NR_io_getevents	(__NR_Linux + 243)
+#define __NR_io_submit	(__NR_Linux + 244)
+#define __NR_io_cancel	(__NR_Linux + 245)
+#define __NR_exit_group	(__NR_Linux + 246)
+#define __NR_lookup_dcookie	(__NR_Linux + 247)
+#define __NR_epoll_create	(__NR_Linux + 248)
+#define __NR_epoll_ctl	(__NR_Linux + 249)
+#define __NR_epoll_wait	(__NR_Linux + 250)
+#define __NR_remap_file_pages	(__NR_Linux + 251)
+#define __NR_set_tid_address	(__NR_Linux + 252)
+#define __NR_restart_syscall	(__NR_Linux + 253)
+#define __NR_fadvise64	(__NR_Linux + 254)
+#define __NR_statfs64	(__NR_Linux + 255)
+#define __NR_fstatfs64	(__NR_Linux + 256)
+#define __NR_timer_create	(__NR_Linux + 257)
+#define __NR_timer_settime	(__NR_Linux + 258)
+#define __NR_timer_gettime	(__NR_Linux + 259)
+#define __NR_timer_getoverrun	(__NR_Linux + 260)
+#define __NR_timer_delete	(__NR_Linux + 261)
+#define __NR_clock_settime	(__NR_Linux + 262)
+#define __NR_clock_gettime	(__NR_Linux + 263)
+#define __NR_clock_getres	(__NR_Linux + 264)
+#define __NR_clock_nanosleep	(__NR_Linux + 265)
+#define __NR_tgkill	(__NR_Linux + 266)
+#define __NR_utimes	(__NR_Linux + 267)
+#define __NR_mbind	(__NR_Linux + 268)
+#define __NR_get_mempolicy	(__NR_Linux + 269)
+#define __NR_set_mempolicy	(__NR_Linux + 270)
+#define __NR_mq_open	(__NR_Linux + 271)
+#define __NR_mq_unlink	(__NR_Linux + 272)
+#define __NR_mq_timedsend	(__NR_Linux + 273)
+#define __NR_mq_timedreceive	(__NR_Linux + 274)
+#define __NR_mq_notify	(__NR_Linux + 275)
+#define __NR_mq_getsetattr	(__NR_Linux + 276)
+#define __NR_vserver	(__NR_Linux + 277)
+#define __NR_waitid	(__NR_Linux + 278)
+#define __NR_add_key	(__NR_Linux + 280)
+#define __NR_request_key	(__NR_Linux + 281)
+#define __NR_keyctl	(__NR_Linux + 282)
+#define __NR_set_thread_area	(__NR_Linux + 283)
+#define __NR_inotify_init	(__NR_Linux + 284)
+#define __NR_inotify_add_watch	(__NR_Linux + 285)
+#define __NR_inotify_rm_watch	(__NR_Linux + 286)
+#define __NR_migrate_pages	(__NR_Linux + 287)
+#define __NR_openat	(__NR_Linux + 288)
+#define __NR_mkdirat	(__NR_Linux + 289)
+#define __NR_mknodat	(__NR_Linux + 290)
+#define __NR_fchownat	(__NR_Linux + 291)
+#define __NR_futimesat	(__NR_Linux + 292)
+#define __NR_fstatat64	(__NR_Linux + 293)
+#define __NR_unlinkat	(__NR_Linux + 294)
+#define __NR_renameat	(__NR_Linux + 295)
+#define __NR_linkat	(__NR_Linux + 296)
+#define __NR_symlinkat	(__NR_Linux + 297)
+#define __NR_readlinkat	(__NR_Linux + 298)
+#define __NR_fchmodat	(__NR_Linux + 299)
+#define __NR_faccessat	(__NR_Linux + 300)
+#define __NR_pselect6	(__NR_Linux + 301)
+#define __NR_ppoll	(__NR_Linux + 302)
+#define __NR_unshare	(__NR_Linux + 303)
+#define __NR_splice	(__NR_Linux + 304)
+#define __NR_sync_file_range	(__NR_Linux + 305)
+#define __NR_tee	(__NR_Linux + 306)
+#define __NR_vmsplice	(__NR_Linux + 307)
+#define __NR_move_pages	(__NR_Linux + 308)
+#define __NR_set_robust_list	(__NR_Linux + 309)
+#define __NR_get_robust_list	(__NR_Linux + 310)
+#define __NR_kexec_load	(__NR_Linux + 311)
+#define __NR_getcpu	(__NR_Linux + 312)
+#define __NR_epoll_pwait	(__NR_Linux + 313)
+#define __NR_ioprio_set	(__NR_Linux + 314)
+#define __NR_ioprio_get	(__NR_Linux + 315)
+#define __NR_utimensat	(__NR_Linux + 316)
+#define __NR_signalfd	(__NR_Linux + 317)
+#define __NR_timerfd	(__NR_Linux + 318)
+#define __NR_eventfd	(__NR_Linux + 319)
+#define __NR_fallocate	(__NR_Linux + 320)
+#define __NR_timerfd_create	(__NR_Linux + 321)
+#define __NR_timerfd_gettime	(__NR_Linux + 322)
+#define __NR_timerfd_settime	(__NR_Linux + 323)
+#define __NR_signalfd4	(__NR_Linux + 324)
+#define __NR_eventfd2	(__NR_Linux + 325)
+#define __NR_epoll_create1	(__NR_Linux + 326)
+#define __NR_dup3	(__NR_Linux + 327)
+#define __NR_pipe2	(__NR_Linux + 328)
+#define __NR_inotify_init1	(__NR_Linux + 329)
+#define __NR_preadv	(__NR_Linux + 330)
+#define __NR_pwritev	(__NR_Linux + 331)
+#define __NR_rt_tgsigqueueinfo	(__NR_Linux + 332)
+#define __NR_perf_event_open	(__NR_Linux + 333)
+#define __NR_accept4	(__NR_Linux + 334)
+#define __NR_recvmmsg	(__NR_Linux + 335)
+#define __NR_fanotify_init	(__NR_Linux + 336)
+#define __NR_fanotify_mark	(__NR_Linux + 337)
+#define __NR_prlimit64	(__NR_Linux + 338)
+#define __NR_name_to_handle_at	(__NR_Linux + 339)
+#define __NR_open_by_handle_at	(__NR_Linux + 340)
+#define __NR_clock_adjtime	(__NR_Linux + 341)
+#define __NR_syncfs	(__NR_Linux + 342)
+#define __NR_sendmmsg	(__NR_Linux + 343)
+#define __NR_setns	(__NR_Linux + 344)
+#define __NR_process_vm_readv	(__NR_Linux + 345)
+#define __NR_process_vm_writev	(__NR_Linux + 346)
+#define __NR_kcmp	(__NR_Linux + 347)
+#define __NR_finit_module	(__NR_Linux + 348)
+#define __NR_sched_setattr	(__NR_Linux + 349)
+#define __NR_sched_getattr	(__NR_Linux + 350)
+#define __NR_renameat2	(__NR_Linux + 351)
+#define __NR_seccomp	(__NR_Linux + 352)
+#define __NR_getrandom	(__NR_Linux + 353)
+#define __NR_memfd_create	(__NR_Linux + 354)
+#define __NR_bpf	(__NR_Linux + 355)
+#define __NR_execveat	(__NR_Linux + 356)
+#define __NR_userfaultfd	(__NR_Linux + 357)
+#define __NR_membarrier	(__NR_Linux + 358)
+#define __NR_mlock2	(__NR_Linux + 359)
+#define __NR_copy_file_range	(__NR_Linux + 360)
+#define __NR_preadv2	(__NR_Linux + 361)
+#define __NR_pwritev2	(__NR_Linux + 362)
+#define __NR_pkey_mprotect	(__NR_Linux + 363)
+#define __NR_pkey_alloc	(__NR_Linux + 364)
+#define __NR_pkey_free	(__NR_Linux + 365)
+#define __NR_statx	(__NR_Linux + 366)
+#define __NR_rseq	(__NR_Linux + 367)
+#define __NR_io_pgetevents	(__NR_Linux + 368)
+
+
+#endif /* _ASM_MIPS_UNISTD_O32_H */
diff --git a/linux-headers/asm-powerpc/unistd.h b/linux-headers/asm-powerpc/unistd.h
index ec3533b1d0..2b29bd8096 100644
--- a/linux-headers/asm-powerpc/unistd.h
+++ b/linux-headers/asm-powerpc/unistd.h
@@ -10,395 +10,10 @@
 #ifndef _ASM_POWERPC_UNISTD_H_
 #define _ASM_POWERPC_UNISTD_H_
 
-
-#define __NR_restart_syscall	  0
-#define __NR_exit		  1
-#define __NR_fork		  2
-#define __NR_read		  3
-#define __NR_write		  4
-#define __NR_open		  5
-#define __NR_close		  6
-#define __NR_waitpid		  7
-#define __NR_creat		  8
-#define __NR_link		  9
-#define __NR_unlink		 10
-#define __NR_execve		 11
-#define __NR_chdir		 12
-#define __NR_time		 13
-#define __NR_mknod		 14
-#define __NR_chmod		 15
-#define __NR_lchown		 16
-#define __NR_break		 17
-#define __NR_oldstat		 18
-#define __NR_lseek		 19
-#define __NR_getpid		 20
-#define __NR_mount		 21
-#define __NR_umount		 22
-#define __NR_setuid		 23
-#define __NR_getuid		 24
-#define __NR_stime		 25
-#define __NR_ptrace		 26
-#define __NR_alarm		 27
-#define __NR_oldfstat		 28
-#define __NR_pause		 29
-#define __NR_utime		 30
-#define __NR_stty		 31
-#define __NR_gtty		 32
-#define __NR_access		 33
-#define __NR_nice		 34
-#define __NR_ftime		 35
-#define __NR_sync		 36
-#define __NR_kill		 37
-#define __NR_rename		 38
-#define __NR_mkdir		 39
-#define __NR_rmdir		 40
-#define __NR_dup		 41
-#define __NR_pipe		 42
-#define __NR_times		 43
-#define __NR_prof		 44
-#define __NR_brk		 45
-#define __NR_setgid		 46
-#define __NR_getgid		 47
-#define __NR_signal		 48
-#define __NR_geteuid		 49
-#define __NR_getegid		 50
-#define __NR_acct		 51
-#define __NR_umount2		 52
-#define __NR_lock		 53
-#define __NR_ioctl		 54
-#define __NR_fcntl		 55
-#define __NR_mpx		 56
-#define __NR_setpgid		 57
-#define __NR_ulimit		 58
-#define __NR_oldolduname	 59
-#define __NR_umask		 60
-#define __NR_chroot		 61
-#define __NR_ustat		 62
-#define __NR_dup2		 63
-#define __NR_getppid		 64
-#define __NR_getpgrp		 65
-#define __NR_setsid		 66
-#define __NR_sigaction		 67
-#define __NR_sgetmask		 68
-#define __NR_ssetmask		 69
-#define __NR_setreuid		 70
-#define __NR_setregid		 71
-#define __NR_sigsuspend		 72
-#define __NR_sigpending		 73
-#define __NR_sethostname	 74
-#define __NR_setrlimit		 75
-#define __NR_getrlimit		 76
-#define __NR_getrusage		 77
-#define __NR_gettimeofday	 78
-#define __NR_settimeofday	 79
-#define __NR_getgroups		 80
-#define __NR_setgroups		 81
-#define __NR_select		 82
-#define __NR_symlink		 83
-#define __NR_oldlstat		 84
-#define __NR_readlink		 85
-#define __NR_uselib		 86
-#define __NR_swapon		 87
-#define __NR_reboot		 88
-#define __NR_readdir		 89
-#define __NR_mmap		 90
-#define __NR_munmap		 91
-#define __NR_truncate		 92
-#define __NR_ftruncate		 93
-#define __NR_fchmod		 94
-#define __NR_fchown		 95
-#define __NR_getpriority	 96
-#define __NR_setpriority	 97
-#define __NR_profil		 98
-#define __NR_statfs		 99
-#define __NR_fstatfs		100
-#define __NR_ioperm		101
-#define __NR_socketcall		102
-#define __NR_syslog		103
-#define __NR_setitimer		104
-#define __NR_getitimer		105
-#define __NR_stat		106
-#define __NR_lstat		107
-#define __NR_fstat		108
-#define __NR_olduname		109
-#define __NR_iopl		110
-#define __NR_vhangup		111
-#define __NR_idle		112
-#define __NR_vm86		113
-#define __NR_wait4		114
-#define __NR_swapoff		115
-#define __NR_sysinfo		116
-#define __NR_ipc		117
-#define __NR_fsync		118
-#define __NR_sigreturn		119
-#define __NR_clone		120
-#define __NR_setdomainname	121
-#define __NR_uname		122
-#define __NR_modify_ldt		123
-#define __NR_adjtimex		124
-#define __NR_mprotect		125
-#define __NR_sigprocmask	126
-#define __NR_create_module	127
-#define __NR_init_module	128
-#define __NR_delete_module	129
-#define __NR_get_kernel_syms	130
-#define __NR_quotactl		131
-#define __NR_getpgid		132
-#define __NR_fchdir		133
-#define __NR_bdflush		134
-#define __NR_sysfs		135
-#define __NR_personality	136
-#define __NR_afs_syscall	137 /* Syscall for Andrew File System */
-#define __NR_setfsuid		138
-#define __NR_setfsgid		139
-#define __NR__llseek		140
-#define __NR_getdents		141
-#define __NR__newselect		142
-#define __NR_flock		143
-#define __NR_msync		144
-#define __NR_readv		145
-#define __NR_writev		146
-#define __NR_getsid		147
-#define __NR_fdatasync		148
-#define __NR__sysctl		149
-#define __NR_mlock		150
-#define __NR_munlock		151
-#define __NR_mlockall		152
-#define __NR_munlockall		153
-#define __NR_sched_setparam		154
-#define __NR_sched_getparam		155
-#define __NR_sched_setscheduler		156
-#define __NR_sched_getscheduler		157
-#define __NR_sched_yield		158
-#define __NR_sched_get_priority_max	159
-#define __NR_sched_get_priority_min	160
-#define __NR_sched_rr_get_interval	161
-#define __NR_nanosleep		162
-#define __NR_mremap		163
-#define __NR_setresuid		164
-#define __NR_getresuid		165
-#define __NR_query_module	166
-#define __NR_poll		167
-#define __NR_nfsservctl		168
-#define __NR_setresgid		169
-#define __NR_getresgid		170
-#define __NR_prctl		171
-#define __NR_rt_sigreturn	172
-#define __NR_rt_sigaction	173
-#define __NR_rt_sigprocmask	174
-#define __NR_rt_sigpending	175
-#define __NR_rt_sigtimedwait	176
-#define __NR_rt_sigqueueinfo	177
-#define __NR_rt_sigsuspend	178
-#define __NR_pread64		179
-#define __NR_pwrite64		180
-#define __NR_chown		181
-#define __NR_getcwd		182
-#define __NR_capget		183
-#define __NR_capset		184
-#define __NR_sigaltstack	185
-#define __NR_sendfile		186
-#define __NR_getpmsg		187	/* some people actually want streams */
-#define __NR_putpmsg		188	/* some people actually want streams */
-#define __NR_vfork		189
-#define __NR_ugetrlimit		190	/* SuS compliant getrlimit */
-#define __NR_readahead		191
-#ifndef __powerpc64__			/* these are 32-bit only */
-#define __NR_mmap2		192
-#define __NR_truncate64		193
-#define __NR_ftruncate64	194
-#define __NR_stat64		195
-#define __NR_lstat64		196
-#define __NR_fstat64		197
-#endif
-#define __NR_pciconfig_read	198
-#define __NR_pciconfig_write	199
-#define __NR_pciconfig_iobase	200
-#define __NR_multiplexer	201
-#define __NR_getdents64		202
-#define __NR_pivot_root		203
 #ifndef __powerpc64__
-#define __NR_fcntl64		204
-#endif
-#define __NR_madvise		205
-#define __NR_mincore		206
-#define __NR_gettid		207
-#define __NR_tkill		208
-#define __NR_setxattr		209
-#define __NR_lsetxattr		210
-#define __NR_fsetxattr		211
-#define __NR_getxattr		212
-#define __NR_lgetxattr		213
-#define __NR_fgetxattr		214
-#define __NR_listxattr		215
-#define __NR_llistxattr		216
-#define __NR_flistxattr		217
-#define __NR_removexattr	218
-#define __NR_lremovexattr	219
-#define __NR_fremovexattr	220
-#define __NR_futex		221
-#define __NR_sched_setaffinity	222
-#define __NR_sched_getaffinity	223
-/* 224 currently unused */
-#define __NR_tuxcall		225
-#ifndef __powerpc64__
-#define __NR_sendfile64		226
-#endif
-#define __NR_io_setup		227
-#define __NR_io_destroy		228
-#define __NR_io_getevents	229
-#define __NR_io_submit		230
-#define __NR_io_cancel		231
-#define __NR_set_tid_address	232
-#define __NR_fadvise64		233
-#define __NR_exit_group		234
-#define __NR_lookup_dcookie	235
-#define __NR_epoll_create	236
-#define __NR_epoll_ctl		237
-#define __NR_epoll_wait		238
-#define __NR_remap_file_pages	239
-#define __NR_timer_create	240
-#define __NR_timer_settime	241
-#define __NR_timer_gettime	242
-#define __NR_timer_getoverrun	243
-#define __NR_timer_delete	244
-#define __NR_clock_settime	245
-#define __NR_clock_gettime	246
-#define __NR_clock_getres	247
-#define __NR_clock_nanosleep	248
-#define __NR_swapcontext	249
-#define __NR_tgkill		250
-#define __NR_utimes		251
-#define __NR_statfs64		252
-#define __NR_fstatfs64		253
-#ifndef __powerpc64__
-#define __NR_fadvise64_64	254
-#endif
-#define __NR_rtas		255
-#define __NR_sys_debug_setcontext 256
-/* Number 257 is reserved for vserver */
-#define __NR_migrate_pages	258
-#define __NR_mbind		259
-#define __NR_get_mempolicy	260
-#define __NR_set_mempolicy	261
-#define __NR_mq_open		262
-#define __NR_mq_unlink		263
-#define __NR_mq_timedsend	264
-#define __NR_mq_timedreceive	265
-#define __NR_mq_notify		266
-#define __NR_mq_getsetattr	267
-#define __NR_kexec_load		268
-#define __NR_add_key		269
-#define __NR_request_key	270
-#define __NR_keyctl		271
-#define __NR_waitid		272
-#define __NR_ioprio_set		273
-#define __NR_ioprio_get		274
-#define __NR_inotify_init	275
-#define __NR_inotify_add_watch	276
-#define __NR_inotify_rm_watch	277
-#define __NR_spu_run		278
-#define __NR_spu_create		279
-#define __NR_pselect6		280
-#define __NR_ppoll		281
-#define __NR_unshare		282
-#define __NR_splice		283
-#define __NR_tee		284
-#define __NR_vmsplice		285
-#define __NR_openat		286
-#define __NR_mkdirat		287
-#define __NR_mknodat		288
-#define __NR_fchownat		289
-#define __NR_futimesat		290
-#ifdef __powerpc64__
-#define __NR_newfstatat		291
+#include <asm/unistd_32.h>
 #else
-#define __NR_fstatat64		291
+#include <asm/unistd_64.h>
 #endif
-#define __NR_unlinkat		292
-#define __NR_renameat		293
-#define __NR_linkat		294
-#define __NR_symlinkat		295
-#define __NR_readlinkat		296
-#define __NR_fchmodat		297
-#define __NR_faccessat		298
-#define __NR_get_robust_list	299
-#define __NR_set_robust_list	300
-#define __NR_move_pages		301
-#define __NR_getcpu		302
-#define __NR_epoll_pwait	303
-#define __NR_utimensat		304
-#define __NR_signalfd		305
-#define __NR_timerfd_create	306
-#define __NR_eventfd		307
-#define __NR_sync_file_range2	308
-#define __NR_fallocate		309
-#define __NR_subpage_prot	310
-#define __NR_timerfd_settime	311
-#define __NR_timerfd_gettime	312
-#define __NR_signalfd4		313
-#define __NR_eventfd2		314
-#define __NR_epoll_create1	315
-#define __NR_dup3		316
-#define __NR_pipe2		317
-#define __NR_inotify_init1	318
-#define __NR_perf_event_open	319
-#define __NR_preadv		320
-#define __NR_pwritev		321
-#define __NR_rt_tgsigqueueinfo	322
-#define __NR_fanotify_init	323
-#define __NR_fanotify_mark	324
-#define __NR_prlimit64		325
-#define __NR_socket		326
-#define __NR_bind		327
-#define __NR_connect		328
-#define __NR_listen		329
-#define __NR_accept		330
-#define __NR_getsockname	331
-#define __NR_getpeername	332
-#define __NR_socketpair		333
-#define __NR_send		334
-#define __NR_sendto		335
-#define __NR_recv		336
-#define __NR_recvfrom		337
-#define __NR_shutdown		338
-#define __NR_setsockopt		339
-#define __NR_getsockopt		340
-#define __NR_sendmsg		341
-#define __NR_recvmsg		342
-#define __NR_recvmmsg		343
-#define __NR_accept4		344
-#define __NR_name_to_handle_at	345
-#define __NR_open_by_handle_at	346
-#define __NR_clock_adjtime	347
-#define __NR_syncfs		348
-#define __NR_sendmmsg		349
-#define __NR_setns		350
-#define __NR_process_vm_readv	351
-#define __NR_process_vm_writev	352
-#define __NR_finit_module	353
-#define __NR_kcmp		354
-#define __NR_sched_setattr	355
-#define __NR_sched_getattr	356
-#define __NR_renameat2		357
-#define __NR_seccomp		358
-#define __NR_getrandom		359
-#define __NR_memfd_create	360
-#define __NR_bpf		361
-#define __NR_execveat		362
-#define __NR_switch_endian	363
-#define __NR_userfaultfd	364
-#define __NR_membarrier		365
-#define __NR_mlock2		378
-#define __NR_copy_file_range	379
-#define __NR_preadv2		380
-#define __NR_pwritev2		381
-#define __NR_kexec_file_load	382
-#define __NR_statx		383
-#define __NR_pkey_alloc		384
-#define __NR_pkey_free		385
-#define __NR_pkey_mprotect	386
-#define __NR_rseq		387
-#define __NR_io_pgetevents	388
 
 #endif /* _ASM_POWERPC_UNISTD_H_ */
diff --git a/linux-headers/asm-powerpc/unistd_32.h b/linux-headers/asm-powerpc/unistd_32.h
new file mode 100644
index 0000000000..b8403d700d
--- /dev/null
+++ b/linux-headers/asm-powerpc/unistd_32.h
@@ -0,0 +1,381 @@
+#ifndef _ASM_POWERPC_UNISTD_32_H
+#define _ASM_POWERPC_UNISTD_32_H
+
+#define __NR_restart_syscall	0
+#define __NR_exit	1
+#define __NR_fork	2
+#define __NR_read	3
+#define __NR_write	4
+#define __NR_open	5
+#define __NR_close	6
+#define __NR_waitpid	7
+#define __NR_creat	8
+#define __NR_link	9
+#define __NR_unlink	10
+#define __NR_execve	11
+#define __NR_chdir	12
+#define __NR_time	13
+#define __NR_mknod	14
+#define __NR_chmod	15
+#define __NR_lchown	16
+#define __NR_break	17
+#define __NR_oldstat	18
+#define __NR_lseek	19
+#define __NR_getpid	20
+#define __NR_mount	21
+#define __NR_umount	22
+#define __NR_setuid	23
+#define __NR_getuid	24
+#define __NR_stime	25
+#define __NR_ptrace	26
+#define __NR_alarm	27
+#define __NR_oldfstat	28
+#define __NR_pause	29
+#define __NR_utime	30
+#define __NR_stty	31
+#define __NR_gtty	32
+#define __NR_access	33
+#define __NR_nice	34
+#define __NR_ftime	35
+#define __NR_sync	36
+#define __NR_kill	37
+#define __NR_rename	38
+#define __NR_mkdir	39
+#define __NR_rmdir	40
+#define __NR_dup	41
+#define __NR_pipe	42
+#define __NR_times	43
+#define __NR_prof	44
+#define __NR_brk	45
+#define __NR_setgid	46
+#define __NR_getgid	47
+#define __NR_signal	48
+#define __NR_geteuid	49
+#define __NR_getegid	50
+#define __NR_acct	51
+#define __NR_umount2	52
+#define __NR_lock	53
+#define __NR_ioctl	54
+#define __NR_fcntl	55
+#define __NR_mpx	56
+#define __NR_setpgid	57
+#define __NR_ulimit	58
+#define __NR_oldolduname	59
+#define __NR_umask	60
+#define __NR_chroot	61
+#define __NR_ustat	62
+#define __NR_dup2	63
+#define __NR_getppid	64
+#define __NR_getpgrp	65
+#define __NR_setsid	66
+#define __NR_sigaction	67
+#define __NR_sgetmask	68
+#define __NR_ssetmask	69
+#define __NR_setreuid	70
+#define __NR_setregid	71
+#define __NR_sigsuspend	72
+#define __NR_sigpending	73
+#define __NR_sethostname	74
+#define __NR_setrlimit	75
+#define __NR_getrlimit	76
+#define __NR_getrusage	77
+#define __NR_gettimeofday	78
+#define __NR_settimeofday	79
+#define __NR_getgroups	80
+#define __NR_setgroups	81
+#define __NR_select	82
+#define __NR_symlink	83
+#define __NR_oldlstat	84
+#define __NR_readlink	85
+#define __NR_uselib	86
+#define __NR_swapon	87
+#define __NR_reboot	88
+#define __NR_readdir	89
+#define __NR_mmap	90
+#define __NR_munmap	91
+#define __NR_truncate	92
+#define __NR_ftruncate	93
+#define __NR_fchmod	94
+#define __NR_fchown	95
+#define __NR_getpriority	96
+#define __NR_setpriority	97
+#define __NR_profil	98
+#define __NR_statfs	99
+#define __NR_fstatfs	100
+#define __NR_ioperm	101
+#define __NR_socketcall	102
+#define __NR_syslog	103
+#define __NR_setitimer	104
+#define __NR_getitimer	105
+#define __NR_stat	106
+#define __NR_lstat	107
+#define __NR_fstat	108
+#define __NR_olduname	109
+#define __NR_iopl	110
+#define __NR_vhangup	111
+#define __NR_idle	112
+#define __NR_vm86	113
+#define __NR_wait4	114
+#define __NR_swapoff	115
+#define __NR_sysinfo	116
+#define __NR_ipc	117
+#define __NR_fsync	118
+#define __NR_sigreturn	119
+#define __NR_clone	120
+#define __NR_setdomainname	121
+#define __NR_uname	122
+#define __NR_modify_ldt	123
+#define __NR_adjtimex	124
+#define __NR_mprotect	125
+#define __NR_sigprocmask	126
+#define __NR_create_module	127
+#define __NR_init_module	128
+#define __NR_delete_module	129
+#define __NR_get_kernel_syms	130
+#define __NR_quotactl	131
+#define __NR_getpgid	132
+#define __NR_fchdir	133
+#define __NR_bdflush	134
+#define __NR_sysfs	135
+#define __NR_personality	136
+#define __NR_afs_syscall	137
+#define __NR_setfsuid	138
+#define __NR_setfsgid	139
+#define __NR__llseek	140
+#define __NR_getdents	141
+#define __NR__newselect	142
+#define __NR_flock	143
+#define __NR_msync	144
+#define __NR_readv	145
+#define __NR_writev	146
+#define __NR_getsid	147
+#define __NR_fdatasync	148
+#define __NR__sysctl	149
+#define __NR_mlock	150
+#define __NR_munlock	151
+#define __NR_mlockall	152
+#define __NR_munlockall	153
+#define __NR_sched_setparam	154
+#define __NR_sched_getparam	155
+#define __NR_sched_setscheduler	156
+#define __NR_sched_getscheduler	157
+#define __NR_sched_yield	158
+#define __NR_sched_get_priority_max	159
+#define __NR_sched_get_priority_min	160
+#define __NR_sched_rr_get_interval	161
+#define __NR_nanosleep	162
+#define __NR_mremap	163
+#define __NR_setresuid	164
+#define __NR_getresuid	165
+#define __NR_query_module	166
+#define __NR_poll	167
+#define __NR_nfsservctl	168
+#define __NR_setresgid	169
+#define __NR_getresgid	170
+#define __NR_prctl	171
+#define __NR_rt_sigreturn	172
+#define __NR_rt_sigaction	173
+#define __NR_rt_sigprocmask	174
+#define __NR_rt_sigpending	175
+#define __NR_rt_sigtimedwait	176
+#define __NR_rt_sigqueueinfo	177
+#define __NR_rt_sigsuspend	178
+#define __NR_pread64	179
+#define __NR_pwrite64	180
+#define __NR_chown	181
+#define __NR_getcwd	182
+#define __NR_capget	183
+#define __NR_capset	184
+#define __NR_sigaltstack	185
+#define __NR_sendfile	186
+#define __NR_getpmsg	187
+#define __NR_putpmsg	188
+#define __NR_vfork	189
+#define __NR_ugetrlimit	190
+#define __NR_readahead	191
+#define __NR_mmap2	192
+#define __NR_truncate64	193
+#define __NR_ftruncate64	194
+#define __NR_stat64	195
+#define __NR_lstat64	196
+#define __NR_fstat64	197
+#define __NR_pciconfig_read	198
+#define __NR_pciconfig_write	199
+#define __NR_pciconfig_iobase	200
+#define __NR_multiplexer	201
+#define __NR_getdents64	202
+#define __NR_pivot_root	203
+#define __NR_fcntl64	204
+#define __NR_madvise	205
+#define __NR_mincore	206
+#define __NR_gettid	207
+#define __NR_tkill	208
+#define __NR_setxattr	209
+#define __NR_lsetxattr	210
+#define __NR_fsetxattr	211
+#define __NR_getxattr	212
+#define __NR_lgetxattr	213
+#define __NR_fgetxattr	214
+#define __NR_listxattr	215
+#define __NR_llistxattr	216
+#define __NR_flistxattr	217
+#define __NR_removexattr	218
+#define __NR_lremovexattr	219
+#define __NR_fremovexattr	220
+#define __NR_futex	221
+#define __NR_sched_setaffinity	222
+#define __NR_sched_getaffinity	223
+#define __NR_tuxcall	225
+#define __NR_sendfile64	226
+#define __NR_io_setup	227
+#define __NR_io_destroy	228
+#define __NR_io_getevents	229
+#define __NR_io_submit	230
+#define __NR_io_cancel	231
+#define __NR_set_tid_address	232
+#define __NR_fadvise64	233
+#define __NR_exit_group	234
+#define __NR_lookup_dcookie	235
+#define __NR_epoll_create	236
+#define __NR_epoll_ctl	237
+#define __NR_epoll_wait	238
+#define __NR_remap_file_pages	239
+#define __NR_timer_create	240
+#define __NR_timer_settime	241
+#define __NR_timer_gettime	242
+#define __NR_timer_getoverrun	243
+#define __NR_timer_delete	244
+#define __NR_clock_settime	245
+#define __NR_clock_gettime	246
+#define __NR_clock_getres	247
+#define __NR_clock_nanosleep	248
+#define __NR_swapcontext	249
+#define __NR_tgkill	250
+#define __NR_utimes	251
+#define __NR_statfs64	252
+#define __NR_fstatfs64	253
+#define __NR_fadvise64_64	254
+#define __NR_rtas	255
+#define __NR_sys_debug_setcontext	256
+#define __NR_migrate_pages	258
+#define __NR_mbind	259
+#define __NR_get_mempolicy	260
+#define __NR_set_mempolicy	261
+#define __NR_mq_open	262
+#define __NR_mq_unlink	263
+#define __NR_mq_timedsend	264
+#define __NR_mq_timedreceive	265
+#define __NR_mq_notify	266
+#define __NR_mq_getsetattr	267
+#define __NR_kexec_load	268
+#define __NR_add_key	269
+#define __NR_request_key	270
+#define __NR_keyctl	271
+#define __NR_waitid	272
+#define __NR_ioprio_set	273
+#define __NR_ioprio_get	274
+#define __NR_inotify_init	275
+#define __NR_inotify_add_watch	276
+#define __NR_inotify_rm_watch	277
+#define __NR_spu_run	278
+#define __NR_spu_create	279
+#define __NR_pselect6	280
+#define __NR_ppoll	281
+#define __NR_unshare	282
+#define __NR_splice	283
+#define __NR_tee	284
+#define __NR_vmsplice	285
+#define __NR_openat	286
+#define __NR_mkdirat	287
+#define __NR_mknodat	288
+#define __NR_fchownat	289
+#define __NR_futimesat	290
+#define __NR_fstatat64	291
+#define __NR_unlinkat	292
+#define __NR_renameat	293
+#define __NR_linkat	294
+#define __NR_symlinkat	295
+#define __NR_readlinkat	296
+#define __NR_fchmodat	297
+#define __NR_faccessat	298
+#define __NR_get_robust_list	299
+#define __NR_set_robust_list	300
+#define __NR_move_pages	301
+#define __NR_getcpu	302
+#define __NR_epoll_pwait	303
+#define __NR_utimensat	304
+#define __NR_signalfd	305
+#define __NR_timerfd_create	306
+#define __NR_eventfd	307
+#define __NR_sync_file_range2	308
+#define __NR_fallocate	309
+#define __NR_subpage_prot	310
+#define __NR_timerfd_settime	311
+#define __NR_timerfd_gettime	312
+#define __NR_signalfd4	313
+#define __NR_eventfd2	314
+#define __NR_epoll_create1	315
+#define __NR_dup3	316
+#define __NR_pipe2	317
+#define __NR_inotify_init1	318
+#define __NR_perf_event_open	319
+#define __NR_preadv	320
+#define __NR_pwritev	321
+#define __NR_rt_tgsigqueueinfo	322
+#define __NR_fanotify_init	323
+#define __NR_fanotify_mark	324
+#define __NR_prlimit64	325
+#define __NR_socket	326
+#define __NR_bind	327
+#define __NR_connect	328
+#define __NR_listen	329
+#define __NR_accept	330
+#define __NR_getsockname	331
+#define __NR_getpeername	332
+#define __NR_socketpair	333
+#define __NR_send	334
+#define __NR_sendto	335
+#define __NR_recv	336
+#define __NR_recvfrom	337
+#define __NR_shutdown	338
+#define __NR_setsockopt	339
+#define __NR_getsockopt	340
+#define __NR_sendmsg	341
+#define __NR_recvmsg	342
+#define __NR_recvmmsg	343
+#define __NR_accept4	344
+#define __NR_name_to_handle_at	345
+#define __NR_open_by_handle_at	346
+#define __NR_clock_adjtime	347
+#define __NR_syncfs	348
+#define __NR_sendmmsg	349
+#define __NR_setns	350
+#define __NR_process_vm_readv	351
+#define __NR_process_vm_writev	352
+#define __NR_finit_module	353
+#define __NR_kcmp	354
+#define __NR_sched_setattr	355
+#define __NR_sched_getattr	356
+#define __NR_renameat2	357
+#define __NR_seccomp	358
+#define __NR_getrandom	359
+#define __NR_memfd_create	360
+#define __NR_bpf	361
+#define __NR_execveat	362
+#define __NR_switch_endian	363
+#define __NR_userfaultfd	364
+#define __NR_membarrier	365
+#define __NR_mlock2	378
+#define __NR_copy_file_range	379
+#define __NR_preadv2	380
+#define __NR_pwritev2	381
+#define __NR_kexec_file_load	382
+#define __NR_statx	383
+#define __NR_pkey_alloc	384
+#define __NR_pkey_free	385
+#define __NR_pkey_mprotect	386
+#define __NR_rseq	387
+#define __NR_io_pgetevents	388
+
+
+#endif /* _ASM_POWERPC_UNISTD_32_H */
diff --git a/linux-headers/asm-powerpc/unistd_64.h b/linux-headers/asm-powerpc/unistd_64.h
new file mode 100644
index 0000000000..f6a25fbbdd
--- /dev/null
+++ b/linux-headers/asm-powerpc/unistd_64.h
@@ -0,0 +1,372 @@
+#ifndef _ASM_POWERPC_UNISTD_64_H
+#define _ASM_POWERPC_UNISTD_64_H
+
+#define __NR_restart_syscall	0
+#define __NR_exit	1
+#define __NR_fork	2
+#define __NR_read	3
+#define __NR_write	4
+#define __NR_open	5
+#define __NR_close	6
+#define __NR_waitpid	7
+#define __NR_creat	8
+#define __NR_link	9
+#define __NR_unlink	10
+#define __NR_execve	11
+#define __NR_chdir	12
+#define __NR_time	13
+#define __NR_mknod	14
+#define __NR_chmod	15
+#define __NR_lchown	16
+#define __NR_break	17
+#define __NR_oldstat	18
+#define __NR_lseek	19
+#define __NR_getpid	20
+#define __NR_mount	21
+#define __NR_umount	22
+#define __NR_setuid	23
+#define __NR_getuid	24
+#define __NR_stime	25
+#define __NR_ptrace	26
+#define __NR_alarm	27
+#define __NR_oldfstat	28
+#define __NR_pause	29
+#define __NR_utime	30
+#define __NR_stty	31
+#define __NR_gtty	32
+#define __NR_access	33
+#define __NR_nice	34
+#define __NR_ftime	35
+#define __NR_sync	36
+#define __NR_kill	37
+#define __NR_rename	38
+#define __NR_mkdir	39
+#define __NR_rmdir	40
+#define __NR_dup	41
+#define __NR_pipe	42
+#define __NR_times	43
+#define __NR_prof	44
+#define __NR_brk	45
+#define __NR_setgid	46
+#define __NR_getgid	47
+#define __NR_signal	48
+#define __NR_geteuid	49
+#define __NR_getegid	50
+#define __NR_acct	51
+#define __NR_umount2	52
+#define __NR_lock	53
+#define __NR_ioctl	54
+#define __NR_fcntl	55
+#define __NR_mpx	56
+#define __NR_setpgid	57
+#define __NR_ulimit	58
+#define __NR_oldolduname	59
+#define __NR_umask	60
+#define __NR_chroot	61
+#define __NR_ustat	62
+#define __NR_dup2	63
+#define __NR_getppid	64
+#define __NR_getpgrp	65
+#define __NR_setsid	66
+#define __NR_sigaction	67
+#define __NR_sgetmask	68
+#define __NR_ssetmask	69
+#define __NR_setreuid	70
+#define __NR_setregid	71
+#define __NR_sigsuspend	72
+#define __NR_sigpending	73
+#define __NR_sethostname	74
+#define __NR_setrlimit	75
+#define __NR_getrlimit	76
+#define __NR_getrusage	77
+#define __NR_gettimeofday	78
+#define __NR_settimeofday	79
+#define __NR_getgroups	80
+#define __NR_setgroups	81
+#define __NR_select	82
+#define __NR_symlink	83
+#define __NR_oldlstat	84
+#define __NR_readlink	85
+#define __NR_uselib	86
+#define __NR_swapon	87
+#define __NR_reboot	88
+#define __NR_readdir	89
+#define __NR_mmap	90
+#define __NR_munmap	91
+#define __NR_truncate	92
+#define __NR_ftruncate	93
+#define __NR_fchmod	94
+#define __NR_fchown	95
+#define __NR_getpriority	96
+#define __NR_setpriority	97
+#define __NR_profil	98
+#define __NR_statfs	99
+#define __NR_fstatfs	100
+#define __NR_ioperm	101
+#define __NR_socketcall	102
+#define __NR_syslog	103
+#define __NR_setitimer	104
+#define __NR_getitimer	105
+#define __NR_stat	106
+#define __NR_lstat	107
+#define __NR_fstat	108
+#define __NR_olduname	109
+#define __NR_iopl	110
+#define __NR_vhangup	111
+#define __NR_idle	112
+#define __NR_vm86	113
+#define __NR_wait4	114
+#define __NR_swapoff	115
+#define __NR_sysinfo	116
+#define __NR_ipc	117
+#define __NR_fsync	118
+#define __NR_sigreturn	119
+#define __NR_clone	120
+#define __NR_setdomainname	121
+#define __NR_uname	122
+#define __NR_modify_ldt	123
+#define __NR_adjtimex	124
+#define __NR_mprotect	125
+#define __NR_sigprocmask	126
+#define __NR_create_module	127
+#define __NR_init_module	128
+#define __NR_delete_module	129
+#define __NR_get_kernel_syms	130
+#define __NR_quotactl	131
+#define __NR_getpgid	132
+#define __NR_fchdir	133
+#define __NR_bdflush	134
+#define __NR_sysfs	135
+#define __NR_personality	136
+#define __NR_afs_syscall	137
+#define __NR_setfsuid	138
+#define __NR_setfsgid	139
+#define __NR__llseek	140
+#define __NR_getdents	141
+#define __NR__newselect	142
+#define __NR_flock	143
+#define __NR_msync	144
+#define __NR_readv	145
+#define __NR_writev	146
+#define __NR_getsid	147
+#define __NR_fdatasync	148
+#define __NR__sysctl	149
+#define __NR_mlock	150
+#define __NR_munlock	151
+#define __NR_mlockall	152
+#define __NR_munlockall	153
+#define __NR_sched_setparam	154
+#define __NR_sched_getparam	155
+#define __NR_sched_setscheduler	156
+#define __NR_sched_getscheduler	157
+#define __NR_sched_yield	158
+#define __NR_sched_get_priority_max	159
+#define __NR_sched_get_priority_min	160
+#define __NR_sched_rr_get_interval	161
+#define __NR_nanosleep	162
+#define __NR_mremap	163
+#define __NR_setresuid	164
+#define __NR_getresuid	165
+#define __NR_query_module	166
+#define __NR_poll	167
+#define __NR_nfsservctl	168
+#define __NR_setresgid	169
+#define __NR_getresgid	170
+#define __NR_prctl	171
+#define __NR_rt_sigreturn	172
+#define __NR_rt_sigaction	173
+#define __NR_rt_sigprocmask	174
+#define __NR_rt_sigpending	175
+#define __NR_rt_sigtimedwait	176
+#define __NR_rt_sigqueueinfo	177
+#define __NR_rt_sigsuspend	178
+#define __NR_pread64	179
+#define __NR_pwrite64	180
+#define __NR_chown	181
+#define __NR_getcwd	182
+#define __NR_capget	183
+#define __NR_capset	184
+#define __NR_sigaltstack	185
+#define __NR_sendfile	186
+#define __NR_getpmsg	187
+#define __NR_putpmsg	188
+#define __NR_vfork	189
+#define __NR_ugetrlimit	190
+#define __NR_readahead	191
+#define __NR_pciconfig_read	198
+#define __NR_pciconfig_write	199
+#define __NR_pciconfig_iobase	200
+#define __NR_multiplexer	201
+#define __NR_getdents64	202
+#define __NR_pivot_root	203
+#define __NR_madvise	205
+#define __NR_mincore	206
+#define __NR_gettid	207
+#define __NR_tkill	208
+#define __NR_setxattr	209
+#define __NR_lsetxattr	210
+#define __NR_fsetxattr	211
+#define __NR_getxattr	212
+#define __NR_lgetxattr	213
+#define __NR_fgetxattr	214
+#define __NR_listxattr	215
+#define __NR_llistxattr	216
+#define __NR_flistxattr	217
+#define __NR_removexattr	218
+#define __NR_lremovexattr	219
+#define __NR_fremovexattr	220
+#define __NR_futex	221
+#define __NR_sched_setaffinity	222
+#define __NR_sched_getaffinity	223
+#define __NR_tuxcall	225
+#define __NR_io_setup	227
+#define __NR_io_destroy	228
+#define __NR_io_getevents	229
+#define __NR_io_submit	230
+#define __NR_io_cancel	231
+#define __NR_set_tid_address	232
+#define __NR_fadvise64	233
+#define __NR_exit_group	234
+#define __NR_lookup_dcookie	235
+#define __NR_epoll_create	236
+#define __NR_epoll_ctl	237
+#define __NR_epoll_wait	238
+#define __NR_remap_file_pages	239
+#define __NR_timer_create	240
+#define __NR_timer_settime	241
+#define __NR_timer_gettime	242
+#define __NR_timer_getoverrun	243
+#define __NR_timer_delete	244
+#define __NR_clock_settime	245
+#define __NR_clock_gettime	246
+#define __NR_clock_getres	247
+#define __NR_clock_nanosleep	248
+#define __NR_swapcontext	249
+#define __NR_tgkill	250
+#define __NR_utimes	251
+#define __NR_statfs64	252
+#define __NR_fstatfs64	253
+#define __NR_rtas	255
+#define __NR_sys_debug_setcontext	256
+#define __NR_migrate_pages	258
+#define __NR_mbind	259
+#define __NR_get_mempolicy	260
+#define __NR_set_mempolicy	261
+#define __NR_mq_open	262
+#define __NR_mq_unlink	263
+#define __NR_mq_timedsend	264
+#define __NR_mq_timedreceive	265
+#define __NR_mq_notify	266
+#define __NR_mq_getsetattr	267
+#define __NR_kexec_load	268
+#define __NR_add_key	269
+#define __NR_request_key	270
+#define __NR_keyctl	271
+#define __NR_waitid	272
+#define __NR_ioprio_set	273
+#define __NR_ioprio_get	274
+#define __NR_inotify_init	275
+#define __NR_inotify_add_watch	276
+#define __NR_inotify_rm_watch	277
+#define __NR_spu_run	278
+#define __NR_spu_create	279
+#define __NR_pselect6	280
+#define __NR_ppoll	281
+#define __NR_unshare	282
+#define __NR_splice	283
+#define __NR_tee	284
+#define __NR_vmsplice	285
+#define __NR_openat	286
+#define __NR_mkdirat	287
+#define __NR_mknodat	288
+#define __NR_fchownat	289
+#define __NR_futimesat	290
+#define __NR_newfstatat	291
+#define __NR_unlinkat	292
+#define __NR_renameat	293
+#define __NR_linkat	294
+#define __NR_symlinkat	295
+#define __NR_readlinkat	296
+#define __NR_fchmodat	297
+#define __NR_faccessat	298
+#define __NR_get_robust_list	299
+#define __NR_set_robust_list	300
+#define __NR_move_pages	301
+#define __NR_getcpu	302
+#define __NR_epoll_pwait	303
+#define __NR_utimensat	304
+#define __NR_signalfd	305
+#define __NR_timerfd_create	306
+#define __NR_eventfd	307
+#define __NR_sync_file_range2	308
+#define __NR_fallocate	309
+#define __NR_subpage_prot	310
+#define __NR_timerfd_settime	311
+#define __NR_timerfd_gettime	312
+#define __NR_signalfd4	313
+#define __NR_eventfd2	314
+#define __NR_epoll_create1	315
+#define __NR_dup3	316
+#define __NR_pipe2	317
+#define __NR_inotify_init1	318
+#define __NR_perf_event_open	319
+#define __NR_preadv	320
+#define __NR_pwritev	321
+#define __NR_rt_tgsigqueueinfo	322
+#define __NR_fanotify_init	323
+#define __NR_fanotify_mark	324
+#define __NR_prlimit64	325
+#define __NR_socket	326
+#define __NR_bind	327
+#define __NR_connect	328
+#define __NR_listen	329
+#define __NR_accept	330
+#define __NR_getsockname	331
+#define __NR_getpeername	332
+#define __NR_socketpair	333
+#define __NR_send	334
+#define __NR_sendto	335
+#define __NR_recv	336
+#define __NR_recvfrom	337
+#define __NR_shutdown	338
+#define __NR_setsockopt	339
+#define __NR_getsockopt	340
+#define __NR_sendmsg	341
+#define __NR_recvmsg	342
+#define __NR_recvmmsg	343
+#define __NR_accept4	344
+#define __NR_name_to_handle_at	345
+#define __NR_open_by_handle_at	346
+#define __NR_clock_adjtime	347
+#define __NR_syncfs	348
+#define __NR_sendmmsg	349
+#define __NR_setns	350
+#define __NR_process_vm_readv	351
+#define __NR_process_vm_writev	352
+#define __NR_finit_module	353
+#define __NR_kcmp	354
+#define __NR_sched_setattr	355
+#define __NR_sched_getattr	356
+#define __NR_renameat2	357
+#define __NR_seccomp	358
+#define __NR_getrandom	359
+#define __NR_memfd_create	360
+#define __NR_bpf	361
+#define __NR_execveat	362
+#define __NR_switch_endian	363
+#define __NR_userfaultfd	364
+#define __NR_membarrier	365
+#define __NR_mlock2	378
+#define __NR_copy_file_range	379
+#define __NR_preadv2	380
+#define __NR_pwritev2	381
+#define __NR_kexec_file_load	382
+#define __NR_statx	383
+#define __NR_pkey_alloc	384
+#define __NR_pkey_free	385
+#define __NR_pkey_mprotect	386
+#define __NR_rseq	387
+#define __NR_io_pgetevents	388
+
+
+#endif /* _ASM_POWERPC_UNISTD_64_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f11a7eb49c..b53ee59748 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -492,6 +492,17 @@ struct kvm_dirty_log {
 	};
 };
 
+/* for KVM_CLEAR_DIRTY_LOG */
+struct kvm_clear_dirty_log {
+	__u32 slot;
+	__u32 num_pages;
+	__u64 first_page;
+	union {
+		void *dirty_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
 /* for KVM_SET_SIGNAL_MASK */
 struct kvm_signal_mask {
 	__u32 len;
@@ -757,6 +768,15 @@ struct kvm_ppc_resize_hpt {
 
 #define KVM_S390_SIE_PAGE_OFFSET 1
 
+/*
+ * On arm64, machine type can be used to request the physical
+ * address size for the VM. Bits[7-0] are reserved for the guest
+ * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
+ * value 0 implies the default IPA size, 40bits.
+ */
+#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
+#define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
+	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
 /*
  * ioctls for /dev/kvm fds:
  */
@@ -965,6 +985,9 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_COALESCED_PIO 162
 #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
 #define KVM_CAP_EXCEPTION_PAYLOAD 164
+#define KVM_CAP_ARM_VM_IPA_SIZE 165
+#define KVM_CAP_MANUAL_DIRTY_LOG_PROTECT 166
+#define KVM_CAP_HYPERV_CPUID 167
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1411,6 +1434,12 @@ struct kvm_enc_region {
 #define KVM_GET_NESTED_STATE         _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
 #define KVM_SET_NESTED_STATE         _IOW(KVMIO,  0xbf, struct kvm_nested_state)
 
+/* Available with KVM_CAP_MANUAL_DIRTY_LOG_PROTECT */
+#define KVM_CLEAR_DIRTY_LOG          _IOWR(KVMIO, 0xc0, struct kvm_clear_dirty_log)
+
+/* Available with KVM_CAP_HYPERV_CPUID */
+#define KVM_GET_SUPPORTED_HV_CPUID _IOWR(KVMIO, 0xc1, struct kvm_cpuid2)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index ceb6453394..12a7b1dc53 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -303,6 +303,71 @@ struct vfio_region_info_cap_type {
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
 
+#define VFIO_REGION_TYPE_GFX                    (1)
+#define VFIO_REGION_SUBTYPE_GFX_EDID            (1)
+
+/**
+ * struct vfio_region_gfx_edid - EDID region layout.
+ *
+ * Set display link state and EDID blob.
+ *
+ * The EDID blob has monitor information such as brand, name, serial
+ * number, physical size, supported video modes and more.
+ *
+ * This special region allows userspace (typically qemu) set a virtual
+ * EDID for the virtual monitor, which allows a flexible display
+ * configuration.
+ *
+ * For the edid blob spec look here:
+ *    https://en.wikipedia.org/wiki/Extended_Display_Identification_Data
+ *
+ * On linux systems you can find the EDID blob in sysfs:
+ *    /sys/class/drm/${card}/${connector}/edid
+ *
+ * You can use the edid-decode ulility (comes with xorg-x11-utils) to
+ * decode the EDID blob.
+ *
+ * @edid_offset: location of the edid blob, relative to the
+ *               start of the region (readonly).
+ * @edid_max_size: max size of the edid blob (readonly).
+ * @edid_size: actual edid size (read/write).
+ * @link_state: display link state (read/write).
+ * VFIO_DEVICE_GFX_LINK_STATE_UP: Monitor is turned on.
+ * VFIO_DEVICE_GFX_LINK_STATE_DOWN: Monitor is turned off.
+ * @max_xres: max display width (0 == no limitation, readonly).
+ * @max_yres: max display height (0 == no limitation, readonly).
+ *
+ * EDID update protocol:
+ *   (1) set link-state to down.
+ *   (2) update edid blob and size.
+ *   (3) set link-state to up.
+ */
+struct vfio_region_gfx_edid {
+	__u32 edid_offset;
+	__u32 edid_max_size;
+	__u32 edid_size;
+	__u32 max_xres;
+	__u32 max_yres;
+	__u32 link_state;
+#define VFIO_DEVICE_GFX_LINK_STATE_UP    1
+#define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
+};
+
+/*
+ * 10de vendor sub-type
+ *
+ * NVIDIA GPU NVlink2 RAM is coherent RAM mapped onto the host address space.
+ */
+#define VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2_RAM	(1)
+
+/*
+ * 1014 vendor sub-type
+ *
+ * IBM NPU NVlink2 ATSD (Address Translation Shootdown) register of NPU
+ * to do TLB invalidation on a GPU.
+ */
+#define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD	(1)
+
 /*
  * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
  * which allows direct access to non-MSIX registers which happened to be within
@@ -313,6 +378,33 @@ struct vfio_region_info_cap_type {
  */
 #define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
 
+/*
+ * Capability with compressed real address (aka SSA - small system address)
+ * where GPU RAM is mapped on a system bus. Used by a GPU for DMA routing
+ * and by the userspace to associate a NVLink bridge with a GPU.
+ */
+#define VFIO_REGION_INFO_CAP_NVLINK2_SSATGT	4
+
+struct vfio_region_info_cap_nvlink2_ssatgt {
+	struct vfio_info_cap_header header;
+	__u64 tgt;
+};
+
+/*
+ * Capability with an NVLink link speed. The value is read by
+ * the NVlink2 bridge driver from the bridge's "ibm,nvlink-speed"
+ * property in the device tree. The value is fixed in the hardware
+ * and failing to provide the correct value results in the link
+ * not working with no indication from the driver why.
+ */
+#define VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD	5
+
+struct vfio_region_info_cap_nvlink2_lnkspd {
+	struct vfio_info_cap_header header;
+	__u32 link_speed;
+	__u32 __pad;
+};
+
 /**
  * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
  *				    struct vfio_irq_info)
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index c8a8fbeb81..40d028eed6 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -11,94 +11,9 @@
  * device configuration.
  */
 
+#include <linux/vhost_types.h>
 #include <linux/types.h>
-
 #include <linux/ioctl.h>
-#include <linux/virtio_config.h>
-#include <linux/virtio_ring.h>
-
-struct vhost_vring_state {
-	unsigned int index;
-	unsigned int num;
-};
-
-struct vhost_vring_file {
-	unsigned int index;
-	int fd; /* Pass -1 to unbind from file. */
-
-};
-
-struct vhost_vring_addr {
-	unsigned int index;
-	/* Option flags. */
-	unsigned int flags;
-	/* Flag values: */
-	/* Whether log address is valid. If set enables logging. */
-#define VHOST_VRING_F_LOG 0
-
-	/* Start of array of descriptors (virtually contiguous) */
-	__u64 desc_user_addr;
-	/* Used structure address. Must be 32 bit aligned */
-	__u64 used_user_addr;
-	/* Available structure address. Must be 16 bit aligned */
-	__u64 avail_user_addr;
-	/* Logging support. */
-	/* Log writes to used structure, at offset calculated from specified
-	 * address. Address must be 32 bit aligned. */
-	__u64 log_guest_addr;
-};
-
-/* no alignment requirement */
-struct vhost_iotlb_msg {
-	__u64 iova;
-	__u64 size;
-	__u64 uaddr;
-#define VHOST_ACCESS_RO      0x1
-#define VHOST_ACCESS_WO      0x2
-#define VHOST_ACCESS_RW      0x3
-	__u8 perm;
-#define VHOST_IOTLB_MISS           1
-#define VHOST_IOTLB_UPDATE         2
-#define VHOST_IOTLB_INVALIDATE     3
-#define VHOST_IOTLB_ACCESS_FAIL    4
-	__u8 type;
-};
-
-#define VHOST_IOTLB_MSG 0x1
-#define VHOST_IOTLB_MSG_V2 0x2
-
-struct vhost_msg {
-	int type;
-	union {
-		struct vhost_iotlb_msg iotlb;
-		__u8 padding[64];
-	};
-};
-
-struct vhost_msg_v2 {
-	__u32 type;
-	__u32 reserved;
-	union {
-		struct vhost_iotlb_msg iotlb;
-		__u8 padding[64];
-	};
-};
-
-struct vhost_memory_region {
-	__u64 guest_phys_addr;
-	__u64 memory_size; /* bytes */
-	__u64 userspace_addr;
-	__u64 flags_padding; /* No flags are currently specified. */
-};
-
-/* All region addresses and sizes must be 4K aligned. */
-#define VHOST_PAGE_SIZE 0x1000
-
-struct vhost_memory {
-	__u32 nregions;
-	__u32 padding;
-	struct vhost_memory_region regions[0];
-};
 
 /* ioctls */
 
@@ -186,31 +101,7 @@ struct vhost_memory {
  * device.  This can be used to stop the ring (e.g. for migration). */
 #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
 
-/* Feature bits */
-/* Log all write descriptors. Can be changed while device is active. */
-#define VHOST_F_LOG_ALL 26
-/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
-#define VHOST_NET_F_VIRTIO_NET_HDR 27
-
-/* VHOST_SCSI specific definitions */
-
-/*
- * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
- *
- * ABI Rev 0: July 2012 version starting point for v3.6-rc merge candidate +
- *            RFC-v2 vhost-scsi userspace.  Add GET_ABI_VERSION ioctl usage
- * ABI Rev 1: January 2013. Ignore vhost_tpgt filed in struct vhost_scsi_target.
- *            All the targets under vhost_wwpn can be seen and used by guset.
- */
-
-#define VHOST_SCSI_ABI_VERSION	1
-
-struct vhost_scsi_target {
-	int abi_version;
-	char vhost_wwpn[224]; /* TRANSPORT_IQN_LEN */
-	unsigned short vhost_tpgt;
-	unsigned short reserved;
-};
+/* VHOST_SCSI specific defines */
 
 #define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct vhost_scsi_target)
 #define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct vhost_scsi_target)
diff --git a/linux-headers/linux/vhost_types.h b/linux-headers/linux/vhost_types.h
new file mode 100644
index 0000000000..93c17ae58a
--- /dev/null
+++ b/linux-headers/linux/vhost_types.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _LINUX_VHOST_TYPES_H
+#define _LINUX_VHOST_TYPES_H
+/* Userspace interface for in-kernel virtio accelerators. */
+
+/* vhost is used to reduce the number of system calls involved in virtio.
+ *
+ * Existing virtio net code is used in the guest without modification.
+ *
+ * This header includes interface used by userspace hypervisor for
+ * device configuration.
+ */
+
+#include <linux/types.h>
+
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+
+struct vhost_vring_state {
+	unsigned int index;
+	unsigned int num;
+};
+
+struct vhost_vring_file {
+	unsigned int index;
+	int fd; /* Pass -1 to unbind from file. */
+
+};
+
+struct vhost_vring_addr {
+	unsigned int index;
+	/* Option flags. */
+	unsigned int flags;
+	/* Flag values: */
+	/* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+	/* Start of array of descriptors (virtually contiguous) */
+	__u64 desc_user_addr;
+	/* Used structure address. Must be 32 bit aligned */
+	__u64 used_user_addr;
+	/* Available structure address. Must be 16 bit aligned */
+	__u64 avail_user_addr;
+	/* Logging support. */
+	/* Log writes to used structure, at offset calculated from specified
+	 * address. Address must be 32 bit aligned. */
+	__u64 log_guest_addr;
+};
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	__u64 iova;
+	__u64 size;
+	__u64 uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	__u8 perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+	__u8 type;
+};
+
+#define VHOST_IOTLB_MSG 0x1
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_msg {
+	int type;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		__u8 padding[64];
+	};
+};
+
+struct vhost_msg_v2 {
+	__u32 type;
+	__u32 reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		__u8 padding[64];
+	};
+};
+
+struct vhost_memory_region {
+	__u64 guest_phys_addr;
+	__u64 memory_size; /* bytes */
+	__u64 userspace_addr;
+	__u64 flags_padding; /* No flags are currently specified. */
+};
+
+/* All region addresses and sizes must be 4K aligned. */
+#define VHOST_PAGE_SIZE 0x1000
+
+struct vhost_memory {
+	__u32 nregions;
+	__u32 padding;
+	struct vhost_memory_region regions[0];
+};
+
+/* VHOST_SCSI specific definitions */
+
+/*
+ * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
+ *
+ * ABI Rev 0: July 2012 version starting point for v3.6-rc merge candidate +
+ *            RFC-v2 vhost-scsi userspace.  Add GET_ABI_VERSION ioctl usage
+ * ABI Rev 1: January 2013. Ignore vhost_tpgt field in struct vhost_scsi_target.
+ *            All the targets under vhost_wwpn can be seen and used by guset.
+ */
+
+#define VHOST_SCSI_ABI_VERSION	1
+
+struct vhost_scsi_target {
+	int abi_version;
+	char vhost_wwpn[224]; /* TRANSPORT_IQN_LEN */
+	unsigned short vhost_tpgt;
+	unsigned short reserved;
+};
+
+/* Feature bits */
+/* Log all write descriptors. Can be changed while device is active. */
+#define VHOST_F_LOG_ALL 26
+/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
+#define VHOST_NET_F_VIRTIO_NET_HDR 27
+
+#endif
-- 
2.20.1


* [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers Eric Auger
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 02/18] linux-headers: Update to v5.0-rc2 Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 16:49   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions Eric Auger
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Introduce a helper to create a memory node.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/boot.c | 54 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 05762d0fc1..2ef367e15b 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -423,6 +423,36 @@ static void set_kernel_args_old(const struct arm_boot_info *info,
     }
 }
 
+static int fdt_add_memory_node(void *fdt, uint32_t acells, hwaddr mem_base,
+                               uint32_t scells, hwaddr mem_len,
+                               int numa_node_id)
+{
+    char *nodename = NULL;
+    int ret;
+
+    nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+    qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
+    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,
+                                       scells, mem_len);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/reg\n", nodename);
+        goto out;
+    }
+    if (numa_node_id < 0) {
+        goto out;
+    }
+
+    ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
+    }
+
+out:
+    g_free(nodename);
+    return ret;
+}
+
 static void fdt_add_psci_node(void *fdt)
 {
     uint32_t cpu_suspend_fn;
@@ -502,7 +532,6 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
     void *fdt = NULL;
     int size, rc, n = 0;
     uint32_t acells, scells;
-    char *nodename;
     unsigned int i;
     hwaddr mem_base, mem_len;
     char **node_path;
@@ -576,35 +605,20 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         mem_base = binfo->loader_start;
         for (i = 0; i < nb_numa_nodes; i++) {
             mem_len = numa_info[i].node_mem;
-            nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
-            qemu_fdt_add_subnode(fdt, nodename);
-            qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
-            rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
-                                              acells, mem_base,
-                                              scells, mem_len);
+            rc = fdt_add_memory_node(fdt, acells, mem_base,
+                                     scells, mem_len, i);
             if (rc < 0) {
-                fprintf(stderr, "couldn't set %s/reg for node %d\n", nodename,
-                        i);
                 goto fail;
             }
 
-            qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", i);
             mem_base += mem_len;
-            g_free(nodename);
         }
     } else {
-        nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
-        qemu_fdt_add_subnode(fdt, nodename);
-        qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
-
-        rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
-                                          acells, binfo->loader_start,
-                                          scells, binfo->ram_size);
+        rc = fdt_add_memory_node(fdt, acells, binfo->loader_start,
+                                 scells, binfo->ram_size, -1);
         if (rc < 0) {
-            fprintf(stderr, "couldn't set %s reg\n", nodename);
             goto fail;
         }
-        g_free(nodename);
     }
 
     rc = fdt_path_offset(fdt, "/chosen");
-- 
2.20.1


* [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (2 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 16:50   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description Eric Auger
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

In preparation for a split of the memory map into a static
part and a dynamic part floating after the RAM, let's rename the
regions located after the RAM.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v6 : creation
---
 hw/arm/virt-acpi-build.c |  8 ++++----
 hw/arm/virt.c            | 21 +++++++++++----------
 include/hw/arm/virt.h    |  8 ++++----
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 04b62c714d..829d2f0035 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -229,8 +229,8 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
                      size_pio));
 
     if (use_highmem) {
-        hwaddr base_mmio_high = memmap[VIRT_PCIE_MMIO_HIGH].base;
-        hwaddr size_mmio_high = memmap[VIRT_PCIE_MMIO_HIGH].size;
+        hwaddr base_mmio_high = memmap[VIRT_HIGH_PCIE_MMIO].base;
+        hwaddr size_mmio_high = memmap[VIRT_HIGH_PCIE_MMIO].size;
 
         aml_append(rbuf,
             aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
@@ -663,8 +663,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
             gicr = acpi_data_push(table_data, sizeof(*gicr));
             gicr->type = ACPI_APIC_GENERIC_REDISTRIBUTOR;
             gicr->length = sizeof(*gicr);
-            gicr->base_address = cpu_to_le64(memmap[VIRT_GIC_REDIST2].base);
-            gicr->range_length = cpu_to_le32(memmap[VIRT_GIC_REDIST2].size);
+            gicr->base_address = cpu_to_le64(memmap[VIRT_HIGH_GIC_REDIST2].base);
+            gicr->range_length = cpu_to_le32(memmap[VIRT_HIGH_GIC_REDIST2].size);
         }
 
         if (its_class_name() && !vmc->no_its) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 99c2b6e60d..a1955e7764 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -150,10 +150,10 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
     [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
     /* Additional 64 MB redist region (can contain up to 512 redistributors) */
-    [VIRT_GIC_REDIST2] =        { 0x4000000000ULL, 0x4000000 },
-    [VIRT_PCIE_ECAM_HIGH] =     { 0x4010000000ULL, 0x10000000 },
+    [VIRT_HIGH_GIC_REDIST2] =   { 0x4000000000ULL, 0x4000000 },
+    [VIRT_HIGH_PCIE_ECAM] =     { 0x4010000000ULL, 0x10000000 },
     /* Second PCIe window, 512GB wide at the 512GB boundary */
-    [VIRT_PCIE_MMIO_HIGH] =   { 0x8000000000ULL, 0x8000000000ULL },
+    [VIRT_HIGH_PCIE_MMIO] =     { 0x8000000000ULL, 0x8000000000ULL },
 };
 
 static const int a15irqmap[] = {
@@ -435,8 +435,8 @@ static void fdt_add_gic_node(VirtMachineState *vms)
                                          2, vms->memmap[VIRT_GIC_DIST].size,
                                          2, vms->memmap[VIRT_GIC_REDIST].base,
                                          2, vms->memmap[VIRT_GIC_REDIST].size,
-                                         2, vms->memmap[VIRT_GIC_REDIST2].base,
-                                         2, vms->memmap[VIRT_GIC_REDIST2].size);
+                                         2, vms->memmap[VIRT_HIGH_GIC_REDIST2].base,
+                                         2, vms->memmap[VIRT_HIGH_GIC_REDIST2].size);
         }
 
         if (vms->virt) {
@@ -584,7 +584,7 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
 
         if (nb_redist_regions == 2) {
             uint32_t redist1_capacity =
-                        vms->memmap[VIRT_GIC_REDIST2].size / GICV3_REDIST_SIZE;
+                    vms->memmap[VIRT_HIGH_GIC_REDIST2].size / GICV3_REDIST_SIZE;
 
             qdev_prop_set_uint32(gicdev, "redist-region-count[1]",
                 MIN(smp_cpus - redist0_count, redist1_capacity));
@@ -601,7 +601,8 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
     if (type == 3) {
         sysbus_mmio_map(gicbusdev, 1, vms->memmap[VIRT_GIC_REDIST].base);
         if (nb_redist_regions == 2) {
-            sysbus_mmio_map(gicbusdev, 2, vms->memmap[VIRT_GIC_REDIST2].base);
+            sysbus_mmio_map(gicbusdev, 2,
+                            vms->memmap[VIRT_HIGH_GIC_REDIST2].base);
         }
     } else {
         sysbus_mmio_map(gicbusdev, 1, vms->memmap[VIRT_GIC_CPU].base);
@@ -1088,8 +1089,8 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = vms->memmap[VIRT_PCIE_MMIO].size;
-    hwaddr base_mmio_high = vms->memmap[VIRT_PCIE_MMIO_HIGH].base;
-    hwaddr size_mmio_high = vms->memmap[VIRT_PCIE_MMIO_HIGH].size;
+    hwaddr base_mmio_high = vms->memmap[VIRT_HIGH_PCIE_MMIO].base;
+    hwaddr size_mmio_high = vms->memmap[VIRT_HIGH_PCIE_MMIO].size;
     hwaddr base_pio = vms->memmap[VIRT_PCIE_PIO].base;
     hwaddr size_pio = vms->memmap[VIRT_PCIE_PIO].size;
     hwaddr base_ecam, size_ecam;
@@ -1418,7 +1419,7 @@ static void machvirt_init(MachineState *machine)
      */
     if (vms->gic_version == 3) {
         virt_max_cpus = vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
-        virt_max_cpus += vms->memmap[VIRT_GIC_REDIST2].size / GICV3_REDIST_SIZE;
+        virt_max_cpus += vms->memmap[VIRT_HIGH_GIC_REDIST2].size / GICV3_REDIST_SIZE;
     } else {
         virt_max_cpus = GIC_NCPU;
     }
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 4cc57a7ef6..a27086d524 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -64,7 +64,7 @@ enum {
     VIRT_GIC_VCPU,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
-    VIRT_GIC_REDIST2,
+    VIRT_HIGH_GIC_REDIST2,
     VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
@@ -74,9 +74,9 @@ enum {
     VIRT_PCIE_MMIO,
     VIRT_PCIE_PIO,
     VIRT_PCIE_ECAM,
-    VIRT_PCIE_ECAM_HIGH,
+    VIRT_HIGH_PCIE_ECAM,
     VIRT_PLATFORM_BUS,
-    VIRT_PCIE_MMIO_HIGH,
+    VIRT_HIGH_PCIE_MMIO,
     VIRT_GPIO,
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
@@ -128,7 +128,7 @@ typedef struct {
     int psci_conduit;
 } VirtMachineState;
 
-#define VIRT_ECAM_ID(high) (high ? VIRT_PCIE_ECAM_HIGH : VIRT_PCIE_ECAM)
+#define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
 
 #define TYPE_VIRT_MACHINE   MACHINE_TYPE_NAME("virt")
 #define VIRT_MACHINE(obj) \
-- 
2.20.1


* [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (3 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:07   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

In preparation for introducing an extended memory map supporting more
RAM, let's split the memory map array into two parts:

- the former a15memmap, which contains regions below and including the RAM
- extended_memmap, only initialized with entries located after the RAM.
  Only the size of each region is initialized there since its base
  address is dynamically computed from the top of the RAM (initial RAM
  at the moment) and aligned on the region's size.

This new split will allow the RAM size to grow without changing the
description of the high regions.

The patch also moves the memory map setup into machvirt_init().
The rationale is that the memory map will soon be affected by the
kvm_type() call that happens after virt_instance_init() and
before machvirt_init().

At this point the memory map is not changed, i.e. the initial RAM can
extend up to the 256GiB boundary. Then come the high IO regions with
the same layout as before.
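
For illustration only, here is a minimal standalone sketch of the
placement rule described above. It is not the code from this patch:
struct region and round_up_pow2() are hypothetical stand-ins for
MemMapEntry and ROUND_UP(), and the rounding helper assumes every
region size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)

/* Hypothetical stand-in for QEMU's MemMapEntry. */
struct region {
    uint64_t base;
    uint64_t size;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t n, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Place each region after high_io_base, aligned on its own size. */
static void place_high_regions(struct region *r, int n, uint64_t high_io_base)
{
    uint64_t base = high_io_base;

    for (int i = 0; i < n; i++) {
        base = round_up_pow2(base, r[i].size);
        r[i].base = base;
        base += r[i].size;
    }
}
```

With high_io_base = 256 * GiB and region sizes of 64 MiB, 256 MiB and
512 GiB (in that order), the computed bases are 0x4000000000,
0x4010000000 and 0x8000000000, which are the addresses previously
hard-coded in a15memmap.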

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6
- removal of many macros in units.h
- introduce the virt_set_memmap helper
- new computation for offsets of high IO regions
- add comments
---
 hw/arm/virt.c         | 45 ++++++++++++++++++++++++++++++++++++++-----
 include/hw/arm/virt.h | 14 ++++++++++----
 2 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a1955e7764..2b15839d0b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -29,6 +29,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include "qapi/error.h"
 #include "hw/sysbus.h"
 #include "hw/arm/arm.h"
@@ -149,11 +150,20 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
     [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
     [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
+};
+
+/*
+ * Highmem IO Regions: This memory map is floating, located after the RAM.
+ * Each IO region offset will be dynamically computed, depending on the
+ * top of the RAM, so that its base get the same alignment as the size,
+ * ie. a 512GiB region will be aligned on a 512GiB boundary.
+ */
+static MemMapEntry extended_memmap[] = {
     /* Additional 64 MB redist region (can contain up to 512 redistributors) */
-    [VIRT_HIGH_GIC_REDIST2] =   { 0x4000000000ULL, 0x4000000 },
-    [VIRT_HIGH_PCIE_ECAM] =     { 0x4010000000ULL, 0x10000000 },
-    /* Second PCIe window, 512GB wide at the 512GB boundary */
-    [VIRT_HIGH_PCIE_MMIO] =     { 0x8000000000ULL, 0x8000000000ULL },
+    [VIRT_HIGH_GIC_REDIST2] =   { 0x0, 64 * MiB },
+    [VIRT_HIGH_PCIE_ECAM] =     { 0x0, 256 * MiB },
+    /* Second PCIe window */
+    [VIRT_HIGH_PCIE_MMIO] =     { 0x0, 512 * GiB },
 };
 
 static const int a15irqmap[] = {
@@ -1354,6 +1364,30 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
     return arm_cpu_mp_affinity(idx, clustersz);
 }
 
+static void virt_set_memmap(VirtMachineState *vms)
+{
+    hwaddr base;
+    int i;
+
+    vms->memmap = extended_memmap;
+
+    for (i = 0; i < ARRAY_SIZE(a15memmap); i++) {
+        vms->memmap[i] = a15memmap[i];
+    }
+
+    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
+    base = vms->high_io_base;
+
+    for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
+        hwaddr size = extended_memmap[i].size;
+
+        base = ROUND_UP(base, size);
+        vms->memmap[i].base = base;
+        vms->memmap[i].size = size;
+        base += size;
+    }
+}
+
 static void machvirt_init(MachineState *machine)
 {
     VirtMachineState *vms = VIRT_MACHINE(machine);
@@ -1368,6 +1402,8 @@ static void machvirt_init(MachineState *machine)
     bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
     bool aarch64 = true;
 
+    virt_set_memmap(vms);
+
     /* We can probe only here because during property set
      * KVM is not available yet
      */
@@ -1843,7 +1879,6 @@ static void virt_instance_init(Object *obj)
                                     "Valid values are none and smmuv3",
                                     NULL);
 
-    vms->memmap = a15memmap;
     vms->irqmap = a15irqmap;
 }
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index a27086d524..3dc7a6c5d5 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -64,7 +64,6 @@ enum {
     VIRT_GIC_VCPU,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
-    VIRT_HIGH_GIC_REDIST2,
     VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
@@ -74,12 +73,18 @@ enum {
     VIRT_PCIE_MMIO,
     VIRT_PCIE_PIO,
     VIRT_PCIE_ECAM,
-    VIRT_HIGH_PCIE_ECAM,
     VIRT_PLATFORM_BUS,
-    VIRT_HIGH_PCIE_MMIO,
     VIRT_GPIO,
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
+    VIRT_LOWMEMMAP_LAST,
+};
+
+/* indices of IO regions located after the RAM */
+enum {
+    VIRT_HIGH_GIC_REDIST2 =  VIRT_LOWMEMMAP_LAST,
+    VIRT_HIGH_PCIE_ECAM,
+    VIRT_HIGH_PCIE_MMIO,
 };
 
 typedef enum VirtIOMMUType {
@@ -116,7 +121,7 @@ typedef struct {
     int32_t gic_version;
     VirtIOMMUType iommu;
     struct arm_boot_info bootinfo;
-    const MemMapEntry *memmap;
+    MemMapEntry *memmap;
     const int *irqmap;
     int smp_cpus;
     void *fdt;
@@ -126,6 +131,7 @@ typedef struct {
     uint32_t msi_phandle;
     uint32_t iommu_phandle;
     int psci_conduit;
+    hwaddr high_io_base;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.20.1


* [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (4 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:12   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

On ARM, the kvm_type will be resolved by querying the KVMState.
Let's add the MachineState handle to the callback so that we
can retrieve the KVMState handle. In kvm_init, when the callback
is called, the kvm_state variable is not yet set.
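
As a rough sketch of what a callback with the new signature could look
like, assuming the machine state carries the requested IPA size:
StubMachineState, its ipa_shift field and stub_kvm_type() are
hypothetical names invented for this example and are not part of QEMU.
Only the two KVM_VM_TYPE_ARM_IPA_SIZE macros are taken from the updated
linux-headers/linux/kvm.h.

```c
#include <assert.h>
#include <stddef.h>

/* From linux-headers/linux/kvm.h as updated earlier in this series. */
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK 0xffULL
#define KVM_VM_TYPE_ARM_IPA_SIZE(x) ((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)

/* Hypothetical stand-in for MachineState; the field is illustrative only. */
typedef struct StubMachineState {
    int ipa_shift; /* log2 of the requested guest PA size */
} StubMachineState;

/*
 * Same shape as the new callback:
 *     int (*kvm_type)(MachineState *ms, const char *arg);
 * The machine handle is what lets the callback reach per-machine state.
 */
static int stub_kvm_type(StubMachineState *ms, const char *arg)
{
    assert(ms != NULL);
    (void)arg; /* no kvm-type option is interpreted in this sketch */

    /* Bits [7:0] of the VM type carry the PA size shift. */
    return (int)KVM_VM_TYPE_ARM_IPA_SIZE((unsigned long long)ms->ipa_shift);
}
```

The returned value is what kvm_init() stores in type after calling
mc->kvm_type(ms, kvm_type).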

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
[ppc parts]
---
 accel/kvm/kvm-all.c   | 2 +-
 hw/ppc/mac_newworld.c | 3 +--
 hw/ppc/mac_oldworld.c | 2 +-
 hw/ppc/spapr.c        | 2 +-
 include/hw/boards.h   | 2 +-
 5 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4e1de942ce..503900604c 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1590,7 +1590,7 @@ static int kvm_init(MachineState *ms)
 
     kvm_type = qemu_opt_get(qemu_get_machine_opts(), "kvm-type");
     if (mc->kvm_type) {
-        type = mc->kvm_type(kvm_type);
+        type = mc->kvm_type(ms, kvm_type);
     } else if (kvm_type) {
         ret = -EINVAL;
         fprintf(stderr, "Invalid argument kvm-type=%s\n", kvm_type);
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index f1c8400efd..3cce612ffa 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -563,8 +563,7 @@ static char *core99_fw_dev_path(FWPathProvider *p, BusState *bus,
 
     return NULL;
 }
-
-static int core99_kvm_type(const char *arg)
+static int core99_kvm_type(MachineState *ms, const char *arg)
 {
     /* Always force PR KVM */
     return 2;
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 98d531d114..f3a594d7dc 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -419,7 +419,7 @@ static char *heathrow_fw_dev_path(FWPathProvider *p, BusState *bus,
     return NULL;
 }
 
-static int heathrow_kvm_type(const char *arg)
+static int heathrow_kvm_type(MachineState *ms, const char *arg)
 {
     /* Always force PR KVM */
     return 2;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0fcdd35cbe..90cd0acb74 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2919,7 +2919,7 @@ static void spapr_machine_init(MachineState *machine)
     }
 }
 
-static int spapr_kvm_type(const char *vm_type)
+static int spapr_kvm_type(MachineState *ms, const char *vm_type)
 {
     if (!vm_type) {
         return 0;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 02f114085f..425d2c86a6 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -171,7 +171,7 @@ struct MachineClass {
     void (*init)(MachineState *state);
     void (*reset)(void);
     void (*hot_add_cpu)(const int64_t id, Error **errp);
-    int (*kvm_type)(const char *arg);
+    int (*kvm_type)(MachineState *ms, const char *arg);
 
     BlockInterfaceType block_default_type;
     int units_per_default_bus;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (5 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:15   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

Add the kvm_arm_get_max_vm_phys_shift() helper that returns the
log2 of the maximum IPA size supported by KVM. This capability
needs to be known to create the VM with a specific IPA max size
(kvm_type passed along with the KVM_CREATE_VM ioctl).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v4 -> v5:
- return 40 if the host does not support the capability

v3 -> v4:
- s/s/ms in kvm_arm_get_max_vm_phys_shift function comment
- check KVM_CAP_ARM_VM_IPA_SIZE extension

v1 -> v2:
- put this in ARM specific code
---
 target/arm/kvm.c     | 10 ++++++++++
 target/arm/kvm_arm.h | 13 +++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index e00ccf9c98..fc1dd3ec6a 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -18,6 +18,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/kvm_int.h"
 #include "kvm_arm.h"
 #include "cpu.h"
 #include "trace.h"
@@ -162,6 +163,15 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     env->features = arm_host_cpu_features.features;
 }
 
+int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
+{
+    KVMState *s = KVM_STATE(ms->accelerator);
+    int ret;
+
+    ret = kvm_check_extension(s, KVM_CAP_ARM_VM_IPA_SIZE);
+    return ret > 0 ? ret : 40;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     /* For ARM interrupt delivery is always asynchronous,
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 6393455b1d..0728bbfa6b 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -207,6 +207,14 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
  */
 void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
 
+/**
+ * kvm_arm_get_max_vm_phys_shift - Returns log2 of the max IPA size
+ * supported by KVM
+ *
+ * @ms: Machine state handle
+ */
+int kvm_arm_get_max_vm_phys_shift(MachineState *ms);
+
 /**
  * kvm_arm_sync_mpstate_to_kvm
  * @cpu: ARMCPU
@@ -239,6 +247,11 @@ static inline void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     cpu->host_cpu_probe_failed = true;
 }
 
+static inline int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
+{
+    return -ENOENT;
+}
+
 static inline int kvm_arm_vgic_probe(void)
 {
     return 0;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (6 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:16   ` Peter Maydell
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine Eric Auger
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

The machine RAM attributes will need to be analyzed during the
configure_accelerator() process. In particular, the arm64 kvm_type()
machine callback will use them to know how many IPA/GPA bits are
needed to model the whole RAM range. So let's assign those machine
state fields before calling configure_accelerator().

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4: new
---
 vl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/vl.c b/vl.c
index 9cf0fbe0b8..28f6bbebe2 100644
--- a/vl.c
+++ b/vl.c
@@ -4324,6 +4324,9 @@ int main(int argc, char **argv, char **envp)
     machine_opts = qemu_get_machine_opts();
     qemu_opt_foreach(machine_opts, machine_set_property, current_machine,
                      &error_fatal);
+    current_machine->ram_size = ram_size;
+    current_machine->maxram_size = maxram_size;
+    current_machine->ram_slots = ram_slots;
 
     configure_accelerator(current_machine, argv[0]);
 
@@ -4521,9 +4524,6 @@ int main(int argc, char **argv, char **envp)
     replay_checkpoint(CHECKPOINT_INIT);
     qdev_machine_init();
 
-    current_machine->ram_size = ram_size;
-    current_machine->maxram_size = maxram_size;
-    current_machine->ram_slots = ram_slots;
     current_machine->boot_order = boot_order;
 
     /* parse features once if machine provides default cpu_type */
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (7 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:29   ` Peter Maydell
  2019-02-18 10:07   ` Igor Mammedov
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit Eric Auger
                   ` (9 subsequent siblings)
  18 siblings, 2 replies; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

This patch implements the machine class kvm_type() callback.
It returns the max IPA shift needed to implement the whole GPA
range including the RAM and IO regions located beyond.
The returned value is passed through the KVM_CREATE_VM ioctl and
this allows KVM to set the stage2 tables dynamically.

At this stage the RAM is still limited to 255GB.

Setting all the existing highmem IO regions beyond the RAM
allows us to have a single contiguous RAM region (initial RAM and
possible hotpluggable device memory). That way we do not need
to do invasive changes in the EDK2 FW to support a dynamic
RAM base.
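
The address arithmetic described above can be sketched as standalone C.
This is only an illustration under stated assumptions: round_up(),
high_io_base(), pa_shift() and kvm_type_for() are hypothetical helper
names, the highest GPA is taken as an input rather than derived from
the high IO region sizes, and pa_shift() uses a plain loop in place of
QEMU's clz64().

```c
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round x up to a multiple of align (align must be a power of two). */
uint64_t round_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Base of the high IO region: after the initial RAM (which starts at
 * 1GiB) and the device memory, both 1GiB aligned, and never below
 * 256GiB so that small configurations keep the legacy memory map.
 */
uint64_t high_io_base(uint64_t ram_size, uint64_t maxram_size)
{
    uint64_t base = round_up(GiB + ram_size, GiB) +
                    round_up(maxram_size - ram_size, GiB);

    return base < 256 * GiB ? 256 * GiB : base;
}

/* Number of address bits needed to reach highest_gpa (64 - clz64). */
int pa_shift(uint64_t highest_gpa)
{
    int bits = 0;

    while (highest_gpa) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/* kvm_type value: 0 selects the implicit legacy 40-bit IPA space. */
int kvm_type_for(uint64_t highest_gpa)
{
    int shift = pa_shift(highest_gpa);

    return shift > 40 ? shift : 0;
}
```

For example, with -m 4G and no maxmem the high IO base stays at 256GiB,
and a map whose last byte sits just below 1TiB needs 40 bits, so the
legacy value 0 is returned.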

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- add some comments
- high IO region cannot start before 256GiB
---
 hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
 include/hw/arm/virt.h |  2 ++
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2b15839d0b..b90ffc2e5d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
 
 static void virt_set_memmap(VirtMachineState *vms)
 {
+    MachineState *ms = MACHINE(vms);
     hwaddr base;
     int i;
 
@@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
         vms->memmap[i] = a15memmap[i];
     }
 
-    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
+    /*
+     * We now compute the base of the high IO region depending on the
+     * amount of initial and device memory. The device memory start/size
+     * is aligned on 1GiB. We never put the high IO region below 256GiB
+     * so that if maxram_size is < 255GiB we keep the legacy memory map
+     */
+    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
+                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);
+    if (vms->high_io_base < 256 * GiB) {
+        vms->high_io_base = 256 * GiB;
+    }
     base = vms->high_io_base;
 
     for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
@@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
         vms->memmap[i].size = size;
         base += size;
     }
+    vms->highest_gpa = base - 1;
 }
 
 static void machvirt_init(MachineState *machine)
@@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
     bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
     bool aarch64 = true;
 
-    virt_set_memmap(vms);
+    if (!vms->extended_memmap) {
+        virt_set_memmap(vms);
+    }
 
     /* We can probe only here because during property set
      * KVM is not available yet
@@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
     return NULL;
 }
 
+/*
+ * for arm64 kvm_type [7-0] encodes the IPA size shift
+ */
+static int virt_kvm_type(MachineState *ms, const char *type_str)
+{
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
+    int max_pa_shift;
+
+    vms->extended_memmap = true;
+
+    virt_set_memmap(vms);
+
+    max_pa_shift = 64 - clz64(vms->highest_gpa);
+
+    if (max_pa_shift > max_vm_phys_shift) {
+        error_report("-m and ,maxmem option values "
+                     "require an IPA range (%d bits) larger than "
+                     "the one supported by the host (%d bits)",
+                     max_pa_shift, max_vm_phys_shift);
+        exit(1);
+    }
+    /*
+     * By default we return 0 which corresponds to an implicit legacy
+     * 40b IPA setting. Otherwise we return the actual requested IPA
+     * logsize
+     */
+    return max_pa_shift > 40 ? max_pa_shift : 0;
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+    mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
     hc->plug = virt_machine_device_plug_cb;
@@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
 {
     virt_machine_4_0_options(mc);
     compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
+
+    /* extended memory map is enabled from 4.0 onwards */
+    mc->kvm_type = NULL;
 }
 DEFINE_VIRT_MACHINE(3, 1)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 3dc7a6c5d5..c88f67a492 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -132,6 +132,8 @@ typedef struct {
     uint32_t iommu_phandle;
     int psci_conduit;
     hwaddr high_io_base;
+    hwaddr highest_gpa;
+    bool extended_memmap;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (8 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-07 15:19   ` Shameerali Kolothum Thodi
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework Eric Auger
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

Now that we have the extended memory map (high IO regions beyond the
scalable RAM) and dynamic IPA range support at KVM/ARM level,
we can bump the legacy 255GB initial RAM limit. The actual maximum
RAM size now depends on the physical CPU and host kernel.
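
Since the machine ram_size is expressed in bytes while the legacy limit
is expressed in GB, the check has to compare byte quantities. A minimal
sketch, assuming a hypothetical ram_size_rejected() helper that stands
in for the condition evaluated in machvirt_init():

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Legacy RAM limit in GB (< version 4.0) */
#define LEGACY_RAMLIMIT_GB 255
#define LEGACY_RAMLIMIT_BYTES (LEGACY_RAMLIMIT_GB * GiB)

/*
 * Returns true when the requested initial RAM cannot be modelled.
 * Only machines using the legacy memory map are subject to the limit.
 */
bool ram_size_rejected(bool extended_memmap, uint64_t ram_size_bytes)
{
    return !extended_memmap && ram_size_bytes > LEGACY_RAMLIMIT_BYTES;
}
```

With the extended memory map the function never rejects a size; the
host-dependent IPA limit is enforced elsewhere, in the kvm_type()
callback.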

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b90ffc2e5d..f01886da22 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -93,22 +93,9 @@
 
 #define PLATFORM_BUS_NUM_IRQS 64
 
-/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
- * RAM can go up to the 256GB mark, leaving 256GB of the physical
- * address space unallocated and free for future use between 256G and 512G.
- * If we need to provide more RAM to VMs in the future then we need to:
- *  * allocate a second bank of RAM starting at 2TB and working up
- *  * fix the DT and ACPI table generation code in QEMU to correctly
- *    report two split lumps of RAM to the guest
- *  * fix KVM in the host kernel to allow guests with >40 bit address spaces
- * (We don't want to fill all the way up to 512GB with RAM because
- * we might want it for non-RAM purposes later. Conversely it seems
- * reasonable to assume that anybody configuring a VM with a quarter
- * of a terabyte of RAM will be doing it on a host with more than a
- * terabyte of physical address space.)
- */
-#define RAMLIMIT_GB 255
-#define RAMLIMIT_BYTES (RAMLIMIT_GB * 1024ULL * 1024 * 1024)
+/* Legacy RAM limit in GB (< version 4.0) */
+#define LEGACY_RAMLIMIT_GB 255
+#define LEGACY_RAMLIMIT_BYTES (LEGACY_RAMLIMIT_GB * GiB)
 
 /* Addresses and sizes of our components.
  * 0..128MB is space for a flash device so we can run bootrom code such as UEFI.
@@ -149,7 +136,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_PCIE_MMIO] =          { 0x10000000, 0x2eff0000 },
     [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
     [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
-    [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
+    [VIRT_MEM] =                { 0x40000000, LEGACY_RAMLIMIT_BYTES },
 };
 
 /*
@@ -1483,8 +1470,9 @@ static void machvirt_init(MachineState *machine)
 
     vms->smp_cpus = smp_cpus;
 
-    if (machine->ram_size > vms->memmap[VIRT_MEM].size) {
-        error_report("mach-virt: cannot model more than %dGB RAM", RAMLIMIT_GB);
+    if (!vms->extended_memmap && machine->ram_size > LEGACY_RAMLIMIT_BYTES) {
+        error_report("mach-virt: cannot model more than %dGB RAM",
+                     LEGACY_RAMLIMIT_GB);
         exit(1);
     }
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (9 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit Eric Auger
@ 2019-02-05 17:32 ` Eric Auger
  2019-02-14 17:15   ` David Hildenbrand
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:32 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

This patch adds the memory hot-plug/hot-unplug infrastructure
in machvirt. It is still not enabled as no device memory is allocated.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>

---
v4 -> v5:
- change in pc_dimm_pre_plug signature
- CONFIG_MEM_HOTPLUG replaced by CONFIG_MEM_DEVICE and CONFIG_DIMM

v3 -> v4:
- check the memory device is not hotplugged

v2 -> v3:
- change in pc_dimm_plug()'s signature
- add pc_dimm_pre_plug call

v1 -> v2:
- s/virt_dimm_plug|unplug/virt_memory_plug|unplug
- s/pc_dimm_memory_plug/pc_dimm_plug
- reworded title and commit message
- added pre_plug cb
- don't handle get_memory_region failure anymore
---
 default-configs/arm-softmmu.mak |  2 ++
 hw/arm/virt.c                   | 64 ++++++++++++++++++++++++++++++++-
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index be88870799..dc4624794f 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -160,3 +160,5 @@ CONFIG_PCI_DESIGNWARE=y
 CONFIG_STRONGARM=y
 CONFIG_HIGHBANK=y
 CONFIG_MUSICPAL=y
+CONFIG_MEM_DEVICE=y
+CONFIG_DIMM=y
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f01886da22..783468ba77 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -59,6 +59,8 @@
 #include "qapi/visitor.h"
 #include "standard-headers/linux/input.h"
 #include "hw/arm/smmuv3.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/mem/nvdimm.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1763,6 +1765,49 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                                 Error **errp)
+{
+    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
+
+    if (dev->hotplugged) {
+        error_setg(errp, "memory hotplug is not supported");
+    }
+
+    if (is_nvdimm) {
+        error_setg(errp, "nvdimm is not yet supported");
+        return;
+    }
+
+    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
+}
+
+static void virt_memory_plug(HotplugHandler *hotplug_dev,
+                             DeviceState *dev, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    Error *local_err = NULL;
+
+    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
+
+    error_propagate(errp, local_err);
+}
+
+static void virt_memory_unplug(HotplugHandler *hotplug_dev,
+                               DeviceState *dev, Error **errp)
+{
+    pc_dimm_unplug(PC_DIMM(dev), MACHINE(hotplug_dev));
+    object_unparent(OBJECT(dev));
+}
+
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+                                            DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_pre_plug(hotplug_dev, dev, errp);
+    }
+}
+
 static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                         DeviceState *dev, Error **errp)
 {
@@ -1774,12 +1819,27 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                      SYS_BUS_DEVICE(dev));
         }
     }
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_plug(hotplug_dev, dev, errp);
+    }
+}
+
+static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
+                                          DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_unplug(hotplug_dev, dev, errp);
+    } else {
+        error_setg(errp, "device unplug request for unsupported device"
+                   " type: %s", object_get_typename(OBJECT(dev)));
+    }
 }
 
 static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
                                                         DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
+       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
         return HOTPLUG_HANDLER(machine);
     }
 
@@ -1843,7 +1903,9 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+    hc->pre_plug = virt_machine_device_pre_plug_cb;
     hc->plug = virt_machine_device_plug_cb;
+    hc->unplug = virt_machine_device_unplug_cb;
 }
 
 static void virt_instance_init(Object *obj)
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (10 preceding siblings ...)
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-18  8:58   ` Igor Mammedov
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

This patch adds memory nodes corresponding to PC-DIMM regions.

NV_DIMM and ACPI_NVDIMM configs are not yet set for ARM so we
don't need to care about NV-DIMM at this stage.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- get rid of @base and @len in fdt_add_hotpluggable_memory_nodes

v1 -> v2:
- added qapi_free_MemoryDeviceInfoList and simplify the loop
---
 hw/arm/boot.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 2ef367e15b..2a70e8aa82 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -19,6 +19,7 @@
 #include "sysemu/numa.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
+#include "hw/mem/memory-device.h"
 #include "elf.h"
 #include "sysemu/device_tree.h"
 #include "qemu/config-file.h"
@@ -526,6 +527,34 @@ static void fdt_add_psci_node(void *fdt)
     qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
 }
 
+static int fdt_add_hotpluggable_memory_nodes(void *fdt,
+                                             uint32_t acells, uint32_t scells) {
+    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
+    MemoryDeviceInfo *mi;
+    PCDIMMDeviceInfo *di;
+    bool is_nvdimm;
+    int ret = 0;
+
+    for (info = info_list; info != NULL; info = info->next) {
+        mi = info->value;
+        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
+        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
+
+        if (is_nvdimm) {
+            ret = -ENOENT; /* NV-DIMM not yet supported */
+        } else {
+            ret = fdt_add_memory_node(fdt, acells, di->addr,
+                                      scells, di->size, di->node);
+        }
+        if (ret < 0) {
+            goto out;
+        }
+    }
+out:
+    qapi_free_MemoryDeviceInfoList(info_list);
+    return ret;
+}
+
 int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
                  hwaddr addr_limit, AddressSpace *as)
 {
@@ -621,6 +650,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         }
     }
 
+    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
+    if (rc < 0) {
+        fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
+        goto fail;
+    }
+
     rc = fdt_path_offset(fdt, "/chosen");
     if (rc < 0) {
         qemu_fdt_add_subnode(fdt, "/chosen");
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (11 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-18  8:14   ` Igor Mammedov
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory Eric Auger
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Generate Memory Affinity Structures for PC-DIMM ranges.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- fix mingw compile issue

v4 -> v5:
- Align to x86 code and especially
  "pc: acpi: revert back to 1 SRAT entry for hotpluggable area"

v3 -> v4:
- do not use vms->bootinfo.device_memory_start/device_memory_size anymore

v1 -> v2:
- build_srat_hotpluggable_memory moved to aml-build
---
 hw/arm/virt-acpi-build.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 829d2f0035..781eafaf5e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -516,6 +516,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     int i, srat_start;
     uint64_t mem_base;
     MachineClass *mc = MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
 
     srat_start = table_data->len;
@@ -541,6 +542,14 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         }
     }
 
+    if (ms->device_memory) {
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+        build_srat_memory(numamem, ms->device_memory->base,
+                          memory_region_size(&ms->device_memory->mr),
+                          nb_numa_nodes - 1,
+                          MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+    }
+
     build_header(linker, table_data, (void *)(table_data->data + srat_start),
                  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (12 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-18  9:31   ` Igor Mammedov
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size Eric Auger
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

The device memory region is located after the initial RAM.
Its start and size are 1GB aligned.
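
The placement rule can be illustrated with a small standalone sketch.
DeviceMemoryLayout, align_up() and device_memory_layout() are
hypothetical names used only here; the logic follows the checks made in
create_device_memory(), and assumes maxram_size is never smaller than
ram_size.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

typedef struct {
    uint64_t base;  /* guest physical address of the region */
    uint64_t size;  /* maxram_size - ram_size, in bytes */
} DeviceMemoryLayout;

/* Round x up to a multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Computes where the device memory region goes. Returns false when
 * there is no device memory to allocate or when maxram_size is not a
 * multiple of 1GiB (the real code reports an error and exits then).
 */
bool device_memory_layout(uint64_t ram_size, uint64_t maxram_size,
                          DeviceMemoryLayout *out)
{
    if (maxram_size <= ram_size) {
        return false;
    }
    if (align_up(maxram_size, GiB) != maxram_size) {
        return false;
    }
    /* Initial RAM starts at 1GiB; device memory follows it, aligned. */
    out->base = align_up(GiB + ram_size, GiB);
    out->size = maxram_size - ram_size;
    return true;
}
```

With -m 4G,maxmem=8G this yields a 4GiB region based at 5GiB, right
after the initial RAM that spans 1GiB to 5GiB.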

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>

---
v4 -> v5:
- device memory set after the initial RAM

v3 -> v4:
- remove bootinfo.device_memory_start/device_memory_size
- rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
---
 hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 783468ba77..b683902991 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -61,6 +61,7 @@
 #include "hw/arm/smmuv3.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/acpi/acpi.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
     g_free(nodename);
 }
 
+static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
+{
+    MachineState *ms = MACHINE(vms);
+    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;
+    uint64_t align = GiB;
+
+    if (!device_memory_size) {
+        return;
+    }
+
+    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
+        error_report("unsupported number of memory slots: %"PRIu64,
+                     ms->ram_slots);
+        exit(EXIT_FAILURE);
+    }
+
+    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
+        error_report("maximum memory size must be aligned to multiple of 0x%"
+                     PRIx64, align);
+        exit(EXIT_FAILURE);
+    }
+
+    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
+    ms->device_memory->base = QEMU_ALIGN_UP(GiB + ms->ram_size, GiB);
+
+    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
+                       "device-memory", device_memory_size);
+    memory_region_add_subregion(sysmem, ms->device_memory->base,
+                                &ms->device_memory->mr);
+}
+
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
 {
     const VirtMachineState *board = container_of(binfo, VirtMachineState,
@@ -1569,6 +1601,10 @@ static void machvirt_init(MachineState *machine)
                                          machine->ram_size);
     memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
 
+    if (vms->extended_memmap) {
+        create_device_memory(vms, sysmem);
+    }
+
     create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
 
     create_gic(vms, pic);
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (13 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-18 10:21   ` Igor Mammedov
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Kwangwoo Lee <kwangwoo.lee@sk.com>

This patch uses a configurable IO type, base and size to create the
NPIO AML for ACPI NFIT. Since other architectures such as AArch64 do
not use port-mapped IO, the board must be able to choose the IO base
and address space so that the ACPI IO address and size are mapped
correctly.

Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- s/size/len in pc_piix.c and pc_q35.c
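
The region-space selection done in nvdimm_build_common_dsm() can be
sketched on its own. This is a minimal standalone sketch, not the QEMU
implementation: io_type, io_entry, region_space and dsm_region_space()
are hypothetical stand-ins for AcpiNVDIMMIOType, AcpiNVDIMMIOEntry,
AmlRegionSpace and the if/else added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the IO type chosen by the board. */
typedef enum {
    IO_PORT,    /* port-mapped IO, as on x86 */
    IO_MEMORY,  /* memory-mapped IO, as on AArch64 */
} io_type;

/* Hypothetical mirror of the per-board DSM IO description. */
typedef struct {
    io_type type;
    uint64_t base;
    uint64_t len;
} io_entry;

/* Hypothetical stand-ins for the two AML region spaces involved. */
typedef enum {
    SPACE_SYSTEM_IO,
    SPACE_SYSTEM_MEMORY,
} region_space;

/* Pick the operation-region address space from the board's choice. */
static region_space dsm_region_space(const io_entry *io)
{
    assert(io != NULL);
    return io->type == IO_PORT ? SPACE_SYSTEM_IO : SPACE_SYSTEM_MEMORY;
}
```

A PC board would fill the entry with IO_PORT and keep the legacy base,
whereas a board without port IO would use IO_MEMORY and a base inside
its own memory map.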
---
 hw/acpi/nvdimm.c        | 28 +++++++++++++++++++---------
 hw/i386/pc_piix.c       |  8 +++++++-
 hw/i386/pc_q35.c        |  8 +++++++-
 include/hw/mem/nvdimm.h | 12 ++++++++++++
 4 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index e53b2cb681..da68de5535 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -929,8 +929,8 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
                             FWCfgState *fw_cfg, Object *owner)
 {
     memory_region_init_io(&state->io_mr, owner, &nvdimm_dsm_ops, state,
-                          "nvdimm-acpi-io", NVDIMM_ACPI_IO_LEN);
-    memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, &state->io_mr);
+                          "nvdimm-acpi-io", state->dsm_io.len);
+    memory_region_add_subregion(io, state->dsm_io.base, &state->io_mr);
 
     state->dsm_mem = g_array_new(false, true /* clear */, 1);
     acpi_data_push(state->dsm_mem, sizeof(NvdimmDsmIn));
@@ -959,12 +959,14 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 
 #define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
 
-static void nvdimm_build_common_dsm(Aml *dev)
+static void nvdimm_build_common_dsm(Aml *dev,
+                                    AcpiNVDIMMState *acpi_nvdimm_state)
 {
     Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
     Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
     Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
     uint8_t byte_list[1];
+    AmlRegionSpace rs;
 
     method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
     uuid = aml_arg(0);
@@ -975,9 +977,16 @@ static void nvdimm_build_common_dsm(Aml *dev)
 
     aml_append(method, aml_store(aml_name(NVDIMM_ACPI_MEM_ADDR), dsm_mem));
 
+    if (acpi_nvdimm_state->dsm_io.type == NVDIMM_ACPI_IO_PORT) {
+        rs = AML_SYSTEM_IO;
+    } else {
+        rs = AML_SYSTEM_MEMORY;
+    }
+
     /* map DSM memory and IO into ACPI namespace. */
-    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, AML_SYSTEM_IO,
-               aml_int(NVDIMM_ACPI_IO_BASE), NVDIMM_ACPI_IO_LEN));
+    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, rs,
+               aml_int(acpi_nvdimm_state->dsm_io.base),
+               acpi_nvdimm_state->dsm_io.len));
     aml_append(method, aml_operation_region(NVDIMM_DSM_MEMORY,
                AML_SYSTEM_MEMORY, dsm_mem, sizeof(NvdimmDsmIn)));
 
@@ -1260,7 +1269,8 @@ static void nvdimm_build_nvdimm_devices(Aml *root_dev, uint32_t ram_slots)
 }
 
 static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
-                              BIOSLinker *linker, GArray *dsm_dma_arrea,
+                              BIOSLinker *linker,
+                              AcpiNVDIMMState *acpi_nvdimm_state,
                               uint32_t ram_slots)
 {
     Aml *ssdt, *sb_scope, *dev;
@@ -1288,7 +1298,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
      */
     aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
 
-    nvdimm_build_common_dsm(dev);
+    nvdimm_build_common_dsm(dev, acpi_nvdimm_state);
 
     /* 0 is reserved for root device. */
     nvdimm_build_device_dsm(dev, 0);
@@ -1307,7 +1317,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
                                                NVDIMM_ACPI_MEM_ADDR);
 
     bios_linker_loader_alloc(linker,
-                             NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
+                             NVDIMM_DSM_MEM_FILE, acpi_nvdimm_state->dsm_mem,
                              sizeof(NvdimmDsmIn), false /* high memory */);
     bios_linker_loader_add_pointer(linker,
         ACPI_BUILD_TABLE_FILE, mem_addr_offset, sizeof(uint32_t),
@@ -1329,7 +1339,7 @@ void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
         return;
     }
 
-    nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
+    nvdimm_build_ssdt(table_offsets, table_data, linker, state,
                       ram_slots);
 
     device_list = nvdimm_get_device_list();
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 63c84e3827..d9c81c9aa6 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -298,7 +298,13 @@ static void pc_init1(MachineState *machine,
     }
 
     if (pcms->acpi_nvdimm_state.is_enabled) {
-        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
+        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;
+        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
                                pcms->fw_cfg, OBJECT(pcms));
     }
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b7b7959934..1110a26e34 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -330,7 +330,13 @@ static void pc_q35_init(MachineState *machine)
     pc_nic_init(pcmc, isa_bus, host_bus);
 
     if (pcms->acpi_nvdimm_state.is_enabled) {
-        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
+        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;
+        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
                                pcms->fw_cfg, OBJECT(pcms));
     }
 }
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index c5c9b3c7f8..af8a5fd034 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -123,6 +123,17 @@ struct NvdimmFitBuffer {
 };
 typedef struct NvdimmFitBuffer NvdimmFitBuffer;
 
+typedef enum {
+    NVDIMM_ACPI_IO_PORT,
+    NVDIMM_ACPI_IO_MEMORY,
+} AcpiNVDIMMIOType;
+
+typedef struct AcpiNVDIMMIOEntry {
+    AcpiNVDIMMIOType type;
+    hwaddr base;
+    hwaddr len;
+} AcpiNVDIMMIOEntry;
+
 struct AcpiNVDIMMState {
     /* detect if NVDIMM support is enabled. */
     bool is_enabled;
@@ -140,6 +151,7 @@ struct AcpiNVDIMMState {
      */
     int32_t persistence;
     char    *persistence_string;
+    AcpiNVDIMMIOEntry dsm_io;
 };
 typedef struct AcpiNVDIMMState AcpiNVDIMMState;
 
-- 
2.20.1

* [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (14 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-18 10:30   ` Igor Mammedov
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 17/18] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

From: Kwangwoo Lee <kwangwoo.lee@sk.com>

Pre-plug and plug handlers are prepared for NVDIMM support.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
---
 default-configs/arm-softmmu.mak |  2 ++
 hw/arm/virt-acpi-build.c        |  6 ++++++
 hw/arm/virt.c                   | 22 ++++++++++++++++++++++
 include/hw/arm/virt.h           |  3 +++
 4 files changed, 33 insertions(+)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index dc4624794f..ddbe87ed15 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -162,3 +162,5 @@ CONFIG_HIGHBANK=y
 CONFIG_MUSICPAL=y
 CONFIG_MEM_DEVICE=y
 CONFIG_DIMM=y
+CONFIG_NVDIMM=y
+CONFIG_ACPI_NVDIMM=y
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 781eafaf5e..f086adfa82 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -784,6 +784,7 @@ static
 void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 {
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     GArray *table_offsets;
     unsigned dsdt, xsdt;
     GArray *tables_blob = tables->table_data;
@@ -824,6 +825,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
         }
     }
 
+    if (vms->acpi_nvdimm_state.is_enabled) {
+        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
+                          &vms->acpi_nvdimm_state, ms->ram_slots);
+    }
+
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
         build_iort(tables_blob, tables->linker, vms);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b683902991..0c8c2cc191 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -132,6 +132,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
     [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
+    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -1637,6 +1638,18 @@ static void machvirt_init(MachineState *machine)
 
     create_platform_bus(vms, pic);
 
+    if (vms->acpi_nvdimm_state.is_enabled) {
+        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
+        acpi_nvdimm_state->dsm_io.base =
+                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
+                               vms->fw_cfg, OBJECT(vms));
+    }
+
     vms->bootinfo.ram_size = machine->ram_size;
     vms->bootinfo.kernel_filename = machine->kernel_filename;
     vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -1822,10 +1835,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                              DeviceState *dev, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
     Error *local_err = NULL;
 
     pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
+    if (local_err) {
+        goto out;
+    }
 
+    if (is_nvdimm) {
+        nvdimm_plug(&vms->acpi_nvdimm_state);
+    }
+
+out:
     error_propagate(errp, local_err);
 }
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index c88f67a492..56d73b0e86 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -37,6 +37,7 @@
 #include "hw/arm/arm.h"
 #include "sysemu/kvm.h"
 #include "hw/intc/arm_gicv3_common.h"
+#include "hw/mem/nvdimm.h"
 
 #define NUM_GICV2M_SPIS       64
 #define NUM_VIRTIO_TRANSPORTS 32
@@ -77,6 +78,7 @@ enum {
     VIRT_GPIO,
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
+    VIRT_ACPI_IO,
     VIRT_LOWMEMMAP_LAST,
 };
 
@@ -134,6 +136,7 @@ typedef struct {
     hwaddr high_io_base;
     hwaddr highest_gpa;
     bool extended_memmap;
+    AcpiNVDIMMState acpi_nvdimm_state;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.20.1

* [Qemu-devel] [PATCH v6 17/18] hw/arm/boot: Expose the pmem nodes in the DT
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (15 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 18/18] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger
  2019-02-14 17:35 ` [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Peter Maydell
  18 siblings, 0 replies; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

When a memory slot holds an NVDIMM, expose it to the guest as a
/pmem@<base> DT node with the "pmem-region" compatible string.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
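As an illustration of what ends up in the node, here is a minimal
standalone sketch, not the QEMU code: pmem_node_name() and
value_to_cells() are hypothetical helpers. It assumes the usual
device-tree convention that a "reg" value is split into 32-bit cells,
most significant cell first, with the cell counts given by
#address-cells and #size-cells.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format "/pmem@<base in hex>" into buf; returns snprintf()'s result. */
static int pmem_node_name(char *buf, size_t len, uint64_t base)
{
    assert(buf != NULL);
    return snprintf(buf, len, "/pmem@%" PRIx64, base);
}

/*
 * Split value into ncells 32-bit cells, most significant first.
 * Returns 0 on success, -1 if ncells is not 1 or 2 or if the value
 * does not fit in a single cell.
 */
static int value_to_cells(uint64_t value, uint32_t ncells, uint32_t *cells)
{
    if (ncells == 1) {
        if (value > UINT32_MAX) {
            return -1;
        }
        cells[0] = (uint32_t)value;
        return 0;
    }
    if (ncells == 2) {
        cells[0] = (uint32_t)(value >> 32);
        cells[1] = (uint32_t)value;
        return 0;
    }
    return -1;
}
```

An NVDIMM mapped at 0x140000000 would get the node name
/pmem@140000000, and its base only fits when two address cells are
used.
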
 hw/arm/boot.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 2a70e8aa82..b1aa866f07 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -454,6 +454,36 @@ out:
     return ret;
 }
 
+static int fdt_add_pmem_node(void *fdt, uint32_t acells, hwaddr mem_base,
+                             uint32_t scells, hwaddr mem_len,
+                             int numa_node_id)
+{
+    char *nodename = NULL;
+    int ret;
+
+    nodename = g_strdup_printf("/pmem@%" PRIx64, mem_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+    qemu_fdt_setprop_string(fdt, nodename, "compatible", "pmem-region");
+    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,
+                                       scells, mem_len);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/reg\n", nodename);
+        goto out;
+    }
+    if (numa_node_id < 0) {
+        goto out;
+    }
+
+    ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
+    }
+
+out:
+    g_free(nodename);
+    return ret;
+}
+
 static void fdt_add_psci_node(void *fdt)
 {
     uint32_t cpu_suspend_fn;
@@ -541,7 +571,8 @@ static int fdt_add_hotpluggable_memory_nodes(void *fdt,
         di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
 
         if (is_nvdimm) {
-            ret = -ENOENT; /* NV-DIMM not yet supported */
+            ret = fdt_add_pmem_node(fdt, acells, di->addr,
+                                    scells, di->size, di->node);
         } else {
             ret = fdt_add_memory_node(fdt, acells, di->addr,
                                       scells, di->size, di->node);
-- 
2.20.1

* [Qemu-devel] [PATCH v6 18/18] hw/arm/virt: Add nvdimm and nvdimm-persistence options
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (16 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 17/18] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
@ 2019-02-05 17:33 ` Eric Auger
  2019-02-14 17:35 ` [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Peter Maydell
  18 siblings, 0 replies; 56+ messages in thread
From: Eric Auger @ 2019-02-05 17:33 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo, david
  Cc: dgilbert, david, drjones

The nvdimm machine option turns NVDIMM support on. The
nvdimm-persistence machine option sets the NVDIMM persistence and
accepts either cpu or mem-ctrl.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
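The string-to-value mapping applied by the nvdimm-persistence setter
can be sketched as a pure function. This is a minimal standalone
sketch, not the QEMU code: parse_persistence() is a hypothetical
helper that returns an error code instead of exiting, which is a
design choice of the sketch and not what the patch does. The values 3
and 2 are the ones used in the setter below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Map an nvdimm-persistence string to its numeric value.
 * Returns 0 and stores the value on success, -1 for an unknown string
 * (in which case *persistence is left untouched).
 */
static int parse_persistence(const char *value, int *persistence)
{
    assert(value != NULL && persistence != NULL);

    if (strcmp(value, "cpu") == 0) {
        *persistence = 3;
    } else if (strcmp(value, "mem-ctrl") == 0) {
        *persistence = 2;
    } else {
        return -1;
    }
    return 0;
}
```

Returning an error keeps the decision about how to report an
unsupported value with the caller.
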
 hw/arm/virt.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0c8c2cc191..85ce9becdb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1776,6 +1776,47 @@ static void virt_set_iommu(Object *obj, const char *value, Error **errp)
     }
 }
 
+static bool virt_get_nvdimm(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->acpi_nvdimm_state.is_enabled;
+}
+
+static void virt_set_nvdimm(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->acpi_nvdimm_state.is_enabled = value;
+}
+
+static char *virt_get_nvdimm_persistence(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return g_strdup(vms->acpi_nvdimm_state.persistence_string);
+}
+
+static void virt_set_nvdimm_persistence(Object *obj, const char *value,
+                                        Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    AcpiNVDIMMState *nvdimm_state = &vms->acpi_nvdimm_state;
+
+    if (strcmp(value, "cpu") == 0) {
+        nvdimm_state->persistence = 3;
+    } else if (strcmp(value, "mem-ctrl") == 0) {
+        nvdimm_state->persistence = 2;
+    } else {
+        error_report("-machine nvdimm-persistence=%s: unsupported option",
+                     value);
+        exit(EXIT_FAILURE);
+    }
+
+    g_free(nvdimm_state->persistence_string);
+    nvdimm_state->persistence_string = g_strdup(value);
+}
+
 static CpuInstanceProperties
 virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
@@ -1818,13 +1859,14 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                  Error **errp)
 {
     const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
 
     if (dev->hotplugged) {
         error_setg(errp, "memory hotplug is not supported");
     }
 
-    if (is_nvdimm) {
-        error_setg(errp, "nvdimm is not yet supported");
+    if (is_nvdimm && !vms->acpi_nvdimm_state.is_enabled) {
+        error_setg(errp, "nvdimm is not enabled: missing 'nvdimm' in '-M'");
         return;
     }
 
@@ -2032,6 +2074,19 @@ static void virt_instance_init(Object *obj)
                                     "Valid values are none and smmuv3",
                                     NULL);
 
+    object_property_add_bool(obj, "nvdimm",
+                             virt_get_nvdimm, virt_set_nvdimm, NULL);
+    object_property_set_description(obj, "nvdimm",
+                                         "Set on/off to enable/disable NVDIMM "
+                                         "instantiation", NULL);
+
+    object_property_add_str(obj, "nvdimm-persistence",
+                            virt_get_nvdimm_persistence,
+                            virt_set_nvdimm_persistence, NULL);
+    object_property_set_description(obj, "nvdimm-persistence",
+                                    "Set NVDIMM persistence. "
+                                    "Valid values are cpu and mem-ctrl", NULL);
+
     vms->irqmap = a15irqmap;
 }
 
-- 
2.20.1

* Re: [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit Eric Auger
@ 2019-02-07 15:19   ` Shameerali Kolothum Thodi
  2019-02-07 15:25     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-02-07 15:19 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	imammedo, david
  Cc: dgilbert, david, drjones

Hi Eric,

> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: 05 February 2019 17:33
> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> imammedo@redhat.com; david@redhat.com
> Cc: dgilbert@redhat.com; david@gibson.dropbear.id.au; drjones@redhat.com
> Subject: [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit
> 
> Now we have the extended memory map (high IO regions beyond the
> scalable RAM) and dynamic IPA range support at KVM/ARM level
> we can bump the legacy 255GB initial RAM limit. The actual maximum
> RAM size now depends on the physical CPU and host kernel.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/virt.c | 26 +++++++-------------------
>  1 file changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b90ffc2e5d..f01886da22 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -93,22 +93,9 @@
> 
>  #define PLATFORM_BUS_NUM_IRQS 64
> 
> -/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
> - * RAM can go up to the 256GB mark, leaving 256GB of the physical
> - * address space unallocated and free for future use between 256G and 512G.
> - * If we need to provide more RAM to VMs in the future then we need to:
> - *  * allocate a second bank of RAM starting at 2TB and working up
> - *  * fix the DT and ACPI table generation code in QEMU to correctly
> - *    report two split lumps of RAM to the guest
> - *  * fix KVM in the host kernel to allow guests with >40 bit address spaces
> - * (We don't want to fill all the way up to 512GB with RAM because
> - * we might want it for non-RAM purposes later. Conversely it seems
> - * reasonable to assume that anybody configuring a VM with a quarter
> - * of a terabyte of RAM will be doing it on a host with more than a
> - * terabyte of physical address space.)
> - */
> -#define RAMLIMIT_GB 255
> -#define RAMLIMIT_BYTES (RAMLIMIT_GB * 1024ULL * 1024 * 1024)
> +/* Legacy RAM limit in GB (< version 4.0) */
> +#define LEGACY_RAMLIMIT_GB 255
> +#define LEGACY_RAMLIMIT_BYTES (LEGACY_RAMLIMIT_GB * GiB)
> 
>  /* Addresses and sizes of our components.
>   * 0..128MB is space for a flash device so we can run bootrom code such as
> UEFI.
> @@ -149,7 +136,7 @@ static const MemMapEntry a15memmap[] = {
>      [VIRT_PCIE_MMIO] =          { 0x10000000, 0x2eff0000 },
>      [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
>      [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
> -    [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
> +    [VIRT_MEM] =                { 0x40000000,
> LEGACY_RAMLIMIT_BYTES },
>  };
> 
>  /*
> @@ -1483,8 +1470,9 @@ static void machvirt_init(MachineState *machine)
> 
>      vms->smp_cpus = smp_cpus;
> 
> -    if (machine->ram_size > vms->memmap[VIRT_MEM].size) {
> -        error_report("mach-virt: cannot model more than %dGB RAM",
> RAMLIMIT_GB);
> +    if (!vms->extended_memmap && machine->ram_size >
> LEGACY_RAMLIMIT_GB) {

Just hit this while testing, should this check be against LEGACY_RAMLIMIT_BYTES?

Thanks,
Shameer

> +        error_report("mach-virt: cannot model more than %dGB RAM",
> +                     LEGACY_RAMLIMIT_GB);
>          exit(1);
>      }
> 
> --
> 2.20.1
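
To make the unit mismatch concrete, here is a minimal standalone
sketch, not the QEMU code: GIB and the two helper functions are
hypothetical names, while the two LEGACY_RAMLIMIT_* macros follow the
definitions quoted above. ram_size is expressed in bytes, so comparing
it against a count of GB rejects practically every configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIB (1024ULL * 1024 * 1024)
#define LEGACY_RAMLIMIT_GB 255
#define LEGACY_RAMLIMIT_BYTES (LEGACY_RAMLIMIT_GB * GIB)

/* ram_size is in bytes, so compare it against the limit in bytes. */
static bool exceeds_legacy_limit(uint64_t ram_size)
{
    return ram_size > LEGACY_RAMLIMIT_BYTES;
}

/* The comparison as posted: bytes against a number of GB. */
static bool exceeds_legacy_limit_as_posted(uint64_t ram_size)
{
    return ram_size > LEGACY_RAMLIMIT_GB;
}
```

With the posted comparison even a 1GiB guest is reported as exceeding
the legacy limit, whereas the byte comparison only trips above 255GiB.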

* Re: [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit
  2019-02-07 15:19   ` Shameerali Kolothum Thodi
@ 2019-02-07 15:25     ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-07 15:25 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, imammedo, david
  Cc: dgilbert, david, drjones

Hi Shameer,

On 2/7/19 4:19 PM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Eric Auger [mailto:eric.auger@redhat.com]
>> Sent: 05 February 2019 17:33
>> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
>> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> imammedo@redhat.com; david@redhat.com
>> Cc: dgilbert@redhat.com; david@gibson.dropbear.id.au; drjones@redhat.com
>> Subject: [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit
>>
>> Now we have the extended memory map (high IO regions beyond the
>> scalable RAM) and dynamic IPA range support at KVM/ARM level
>> we can bump the legacy 255GB initial RAM limit. The actual maximum
>> RAM size now depends on the physical CPU and host kernel.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/arm/virt.c | 26 +++++++-------------------
>>  1 file changed, 7 insertions(+), 19 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index b90ffc2e5d..f01886da22 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -93,22 +93,9 @@
>>
>>  #define PLATFORM_BUS_NUM_IRQS 64
>>
>> -/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
>> - * RAM can go up to the 256GB mark, leaving 256GB of the physical
>> - * address space unallocated and free for future use between 256G and 512G.
>> - * If we need to provide more RAM to VMs in the future then we need to:
>> - *  * allocate a second bank of RAM starting at 2TB and working up
>> - *  * fix the DT and ACPI table generation code in QEMU to correctly
>> - *    report two split lumps of RAM to the guest
>> - *  * fix KVM in the host kernel to allow guests with >40 bit address spaces
>> - * (We don't want to fill all the way up to 512GB with RAM because
>> - * we might want it for non-RAM purposes later. Conversely it seems
>> - * reasonable to assume that anybody configuring a VM with a quarter
>> - * of a terabyte of RAM will be doing it on a host with more than a
>> - * terabyte of physical address space.)
>> - */
>> -#define RAMLIMIT_GB 255
>> -#define RAMLIMIT_BYTES (RAMLIMIT_GB * 1024ULL * 1024 * 1024)
>> +/* Legacy RAM limit in GB (< version 4.0) */
>> +#define LEGACY_RAMLIMIT_GB 255
>> +#define LEGACY_RAMLIMIT_BYTES (LEGACY_RAMLIMIT_GB * GiB)
>>
>>  /* Addresses and sizes of our components.
>>   * 0..128MB is space for a flash device so we can run bootrom code such as
>> UEFI.
>> @@ -149,7 +136,7 @@ static const MemMapEntry a15memmap[] = {
>>      [VIRT_PCIE_MMIO] =          { 0x10000000, 0x2eff0000 },
>>      [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
>>      [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
>> -    [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
>> +    [VIRT_MEM] =                { 0x40000000,
>> LEGACY_RAMLIMIT_BYTES },
>>  };
>>
>>  /*
>> @@ -1483,8 +1470,9 @@ static void machvirt_init(MachineState *machine)
>>
>>      vms->smp_cpus = smp_cpus;
>>
>> -    if (machine->ram_size > vms->memmap[VIRT_MEM].size) {
>> -        error_report("mach-virt: cannot model more than %dGB RAM",
>> RAMLIMIT_GB);
>> +    if (!vms->extended_memmap && machine->ram_size >
>> LEGACY_RAMLIMIT_GB) {
> 
> Just hit this while testing, should this check be against LEGACY_RAMLIMIT_BYTES?
Definitely, my mistake. Thank you for spotting that.

Thanks

Eric
> 
> Thanks,
> Shameer
> 
>> +        error_report("mach-virt: cannot model more than %dGB RAM",
>> +                     LEGACY_RAMLIMIT_GB);
>>          exit(1);
>>      }
>>
>> --
>> 2.20.1
> 

* Re: [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers Eric Auger
@ 2019-02-14 16:36   ` Peter Maydell
  2019-02-21  6:15     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 16:36 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> From: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Since Linux'es ab66dcc76d "powerpc: generate uapi header and system call
> table files" there are 2 new files: unistd_32.h and unistd_64.h. These
> files content is moved from unistd.h so now we have to copy new files
> as well, just like we already do for other architectures; this does it
> for MIPS as well.
>
> Also, v5.0-rc2 moved vhost bits around in 4b86713236e4bd
> "vhost: split structs into a separate header file", add those too.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

I think this fix is handled by commit a0a6ef91a4a4edde27
(now in master), yes ?

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
@ 2019-02-14 16:49   ` Peter Maydell
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 16:49 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>
> We introduce an helper to create a memory node.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/boot.c | 54 ++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 34 insertions(+), 20 deletions(-)
>
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 05762d0fc1..2ef367e15b 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -423,6 +423,36 @@ static void set_kernel_args_old(const struct arm_boot_info *info,
>      }
>  }
>
> +static int fdt_add_memory_node(void *fdt, uint32_t acells, hwaddr mem_base,
> +                               uint32_t scells, hwaddr mem_len,
> +                               int numa_node_id)
> +{
> +    char *nodename = NULL;

You set nodename immediately below, so no need for the NULL initialization here.

> +    int ret;
> +
> +    nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
> +    qemu_fdt_add_subnode(fdt, nodename);
> +    qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> +    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,
> +                                       scells, mem_len);
> +    if (ret < 0) {
> +        fprintf(stderr, "couldn't set %s/reg\n", nodename);
> +        goto out;
> +    }

I think error handling (ie whether we print messages or not) ought to be
done by the calling function, rather than here.

> +    if (numa_node_id < 0) {

What is this for? My original theory was that this was an error
case that should probably be an assert(), but we seem to use it
in one of the callers below. A brief comment at the top of the
function documenting its API would assist here. If this is
"only set the NUMA ID if it is specified" then I think writing it as
  if (numa_node_id >= 0) {
     set the id;
  }

is clearer than making it look like an error-exit check.
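
Spelled out a little further, the shape suggested above could look
like the following minimal standalone sketch. It is not QEMU code:
struct fake_node and fill_memory_node() are hypothetical stand-ins
used only to show the control flow, with no FDT calls involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory node; not a QEMU or libfdt type. */
struct fake_node {
    uint64_t base;
    uint64_t len;
    bool has_numa_id;
    uint32_t numa_id;
};

/*
 * Describe a memory node. A negative numa_node_id means "no NUMA ID
 * specified": the property is simply left out, which is not an error.
 * Returns 0 on success.
 */
static int fill_memory_node(struct fake_node *n, uint64_t base,
                            uint64_t len, int numa_node_id)
{
    assert(n != NULL);

    n->base = base;
    n->len = len;
    n->has_numa_id = false;

    /* Only set the NUMA ID if one was specified. */
    if (numa_node_id >= 0) {
        n->has_numa_id = true;
        n->numa_id = (uint32_t)numa_node_id;
    }
    return 0;
}
```

Written this way the optional property reads as optional, and the
only early exits left in the function are genuine failures.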

> +        goto out;
> +    }
> +
> +    ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
> +    if (ret < 0) {
> +        fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
> +    }
> +
> +out:
> +    g_free(nodename);
> +    return ret;
> +}
> +
>  static void fdt_add_psci_node(void *fdt)
>  {
>      uint32_t cpu_suspend_fn;
> @@ -502,7 +532,6 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>      void *fdt = NULL;
>      int size, rc, n = 0;
>      uint32_t acells, scells;
> -    char *nodename;
>      unsigned int i;
>      hwaddr mem_base, mem_len;
>      char **node_path;
> @@ -576,35 +605,20 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>          mem_base = binfo->loader_start;
>          for (i = 0; i < nb_numa_nodes; i++) {
>              mem_len = numa_info[i].node_mem;
> -            nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
> -            qemu_fdt_add_subnode(fdt, nodename);
> -            qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -            rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
> -                                              acells, mem_base,
> -                                              scells, mem_len);
> +            rc = fdt_add_memory_node(fdt, acells, mem_base,
> +                                     scells, mem_len, i);
>              if (rc < 0) {
> -                fprintf(stderr, "couldn't set %s/reg for node %d\n", nodename,
> -                        i);
>                  goto fail;
>              }
>
> -            qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", i);
>              mem_base += mem_len;
> -            g_free(nodename);
>          }
>      } else {
> -        nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
> -        qemu_fdt_add_subnode(fdt, nodename);
> -        qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -
> -        rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
> -                                          acells, binfo->loader_start,
> -                                          scells, binfo->ram_size);
> +        rc = fdt_add_memory_node(fdt, acells, binfo->loader_start,
> +                                 scells, binfo->ram_size, -1);
>          if (rc < 0) {
> -            fprintf(stderr, "couldn't set %s reg\n", nodename);
>              goto fail;
>          }
> -        g_free(nodename);
>      }
>
>      rc = fdt_path_offset(fdt, "/chosen");
> --
> 2.20.1
>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions Eric Auger
@ 2019-02-14 16:50   ` Peter Maydell
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 16:50 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> In preparation for a split of the memory map into a static
> part and a dynamic part floating after the RAM, let's rename the
> regions located after the RAM
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description Eric Auger
@ 2019-02-14 17:07   ` Peter Maydell
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:07 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> In preparation for introducing an extended memory map supporting more
> RAM, let's split the memory map array into two parts:
>
> - the former a15memmap contains regions below and including the RAM
> - extended_memmap, only initialized with entries located after the RAM.
>   Only the size of the region is initialized there since their base
>   address will be dynamically computed, depending on the top of the
>   RAM (initial RAM at the moment), with same alignment as their size.
>
> This new split will allow growing the RAM size without changing the
> description of the high regions.

This change makes it clear that "a15memmap" is badly misnamed.
I think we should change it to "base_memmap" here.

>
> The patch also moves the memory map setup into machvirt_init().
> The rationale is the memory map will be soon affected by the
> kvm_type() call that happens after virt_instance_init() and
> before machvirt_init().
>
> At that point the memory map is not changed, ie. the initial RAM can

"At this point" ?

> grow up to 256GiB. Then come the high IO regions with same layout as
> before.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v5 -> v6
> - removal of many macros in units.h
> - introduce the virt_set_memmap helper
> - new computation for offsets of high IO regions
> - add comments
> ---
>  hw/arm/virt.c         | 45 ++++++++++++++++++++++++++++++++++++++-----
>  include/hw/arm/virt.h | 14 ++++++++++----
>  2 files changed, 50 insertions(+), 9 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index a1955e7764..2b15839d0b 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -29,6 +29,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include "qapi/error.h"
>  #include "hw/sysbus.h"
>  #include "hw/arm/arm.h"
> @@ -149,11 +150,20 @@ static const MemMapEntry a15memmap[] = {
>      [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
>      [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
>      [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
> +};
> +
> +/*
> + * Highmem IO Regions: This memory map is floating, located after the RAM.
> + * Each IO region offset will be dynamically computed, depending on the
> + * top of the RAM, so that its base get the same alignment as the size,
> + * ie. a 512GiB region will be aligned on a 512GiB boundary.

I think you should say here that if there is less than 256GiB of RAM
then the floating area starts at the 256GiB mark.
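
As a sanity check on the alignment rule, here is a small sketch of the
placement as I read the comment: each region base is rounded up to the
region's own size, starting from the 256GiB mark. The function and macro
names are made up for the example and are not the ones in the patch. With
the three sizes from extended_memmap it gives back the old fixed
addresses:

```c
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024)
#define GiB (1024ULL * MiB)

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t round_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) / align * align;
}

/*
 * Place n regions one after another starting at 'start', aligning each
 * base on the region's own size. Writes the computed bases and returns
 * the first address past the last region.
 */
static uint64_t place_high_regions(uint64_t start, const uint64_t *sizes,
                                   uint64_t *bases, size_t n)
{
    uint64_t base = start;
    size_t i;

    for (i = 0; i < n; i++) {
        base = round_up(base, sizes[i]);
        bases[i] = base;
        base += sizes[i];
    }
    return base;
}
```

Starting at 256GiB with sizes 64MiB, 256MiB and 512GiB, the bases come out
as 0x4000000000, 0x4010000000 and 0x8000000000, and the area ends at 1TiB.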

> + */
> +static MemMapEntry extended_memmap[] = {
>      /* Additional 64 MB redist region (can contain up to 512 redistributors) */
> -    [VIRT_HIGH_GIC_REDIST2] =   { 0x4000000000ULL, 0x4000000 },
> -    [VIRT_HIGH_PCIE_ECAM] =     { 0x4010000000ULL, 0x10000000 },
> -    /* Second PCIe window, 512GB wide at the 512GB boundary */
> -    [VIRT_HIGH_PCIE_MMIO] =     { 0x8000000000ULL, 0x8000000000ULL },
> +    [VIRT_HIGH_GIC_REDIST2] =   { 0x0, 64 * MiB },
> +    [VIRT_HIGH_PCIE_ECAM] =     { 0x0, 256 * MiB },
> +    /* Second PCIe window */
> +    [VIRT_HIGH_PCIE_MMIO] =     { 0x0, 512 * GiB },
>  };
>
>  static const int a15irqmap[] = {
> @@ -1354,6 +1364,30 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>      return arm_cpu_mp_affinity(idx, clustersz);
>  }

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
@ 2019-02-14 17:12   ` Peter Maydell
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:12 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> On ARM, the kvm_type will be resolved by querying the KVMState.
> Let's add the MachineState handle to the callback so that we
> can retrieve the KVMState handle. In kvm_init, when the callback
> is called, the kvm_state variable is not yet set.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> [ppc parts]
> ---
>  accel/kvm/kvm-all.c   | 2 +-
>  hw/ppc/mac_newworld.c | 3 +--
>  hw/ppc/mac_oldworld.c | 2 +-
>  hw/ppc/spapr.c        | 2 +-
>  include/hw/boards.h   | 2 +-
>  5 files changed, 5 insertions(+), 6 deletions(-)
>


> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 02f114085f..425d2c86a6 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -171,7 +171,7 @@ struct MachineClass {
>      void (*init)(MachineState *state);
>      void (*reset)(void);
>      void (*hot_add_cpu)(const int64_t id, Error **errp);
> -    int (*kvm_type)(const char *arg);
> +    int (*kvm_type)(MachineState *ms, const char *arg);
>
>      BlockInterfaceType block_default_type;
>      int units_per_default_bus;
> --

Can you add a line to the struct's documentation comment for the
@kvm_type field, please ?

We're rather inconsistent about what we name the MachineState*
parameter in methods here:
 "state" x 1   (init)
 "machine" x 3 (get_hotplug_handler, cpu_index_to_instance_props,
                possible_cpu_arch_ids)
 "ms" x 1 (get_default_cpu_node_id)

It would probably be better to follow the most common option
rather than one of the rarer ones.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
@ 2019-02-14 17:15   ` Peter Maydell
  2019-02-18 18:03     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:15 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> Add the kvm_arm_get_max_vm_phys_shift() helper that returns the
> log of the maximum IPA size supported by KVM. This capability
> needs to be known to create the VM with a specific IPA max size
> (kvm_type passed along KVM_CREATE_VM ioctl).
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v4 -> v5:
> - return 40 if the host does not support the capability
>
> v3 -> v4:
> - s/s/ms in kvm_arm_get_max_vm_phys_shift function comment
> - check KVM_CAP_ARM_VM_IPA_SIZE extension
>
> v1 -> v2:
> - put this in ARM specific code
> ---
>  target/arm/kvm.c     | 10 ++++++++++
>  target/arm/kvm_arm.h | 13 +++++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index e00ccf9c98..fc1dd3ec6a 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -18,6 +18,7 @@
>  #include "qemu/error-report.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/kvm_int.h"
>  #include "kvm_arm.h"
>  #include "cpu.h"
>  #include "trace.h"
> @@ -162,6 +163,15 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
>      env->features = arm_host_cpu_features.features;
>  }
>
> +int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
> +{
> +    KVMState *s = KVM_STATE(ms->accelerator);
> +    int ret;
> +
> +    ret = kvm_check_extension(s, KVM_CAP_ARM_VM_IPA_SIZE);

Why not name the function the same as the extension name?

> +    return ret > 0 ? ret : 40;
> +}
> +
>  int kvm_arch_init(MachineState *ms, KVMState *s)
>  {
>      /* For ARM interrupt delivery is always asynchronous,
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index 6393455b1d..0728bbfa6b 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -207,6 +207,14 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
>   */
>  void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
>
> +/**
> + * kvm_arm_get_max_vm_phys_shift - Returns log2 of the max IPA size
> + * supported by KVM

This is the number of bits in the IPA address space,
right (ie 40 for a 40-bit IPA, and so on) ? If so, then
I think "Return number of bits in the IPA address space"
might be clearer.
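
To spell out what "number of bits" means here, a tiny sketch. The
function names are invented for the example, and cap_value stands in for
whatever the capability query returns (positive if supported):

```c
#include <stdint.h>

/*
 * cap_value models the result of the capability query: a positive
 * value is the number of IPA bits the host supports, anything else
 * means the capability is absent and the legacy 40-bit IPA applies.
 */
static int max_ipa_bits(int cap_value)
{
    return cap_value > 0 ? cap_value : 40;
}

/* Size in bytes of an address space with the given number of bits (< 64). */
static uint64_t ipa_space_size(int bits)
{
    return 1ULL << bits;
}
```

So the 40 returned as a fallback corresponds to the 1TiB address map the
virt board has had so far.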

> + *
> + * @ms: Machine state handle
> + */
> +int kvm_arm_get_max_vm_phys_shift(MachineState *ms);

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework Eric Auger
@ 2019-02-14 17:15   ` David Hildenbrand
  2019-02-18 18:10     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: David Hildenbrand @ 2019-02-14 17:15 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, imammedo
  Cc: dgilbert, david, drjones

On 05.02.19 18:32, Eric Auger wrote:
> This patch adds the memory hot-plug/hot-unplug infrastructure
> in machvirt. It is still not enabled as no device memory is allocated.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> 
> ---
> v4 -> v5:
> - change in pc_dimm_pre_plug signature
> - CONFIG_MEM_HOTPLUG replaced by CONFIG_MEM_DEVICE and CONFIG_DIMM
> 
> v3 -> v4:
> - check the memory device is not hotplugged
> 
> v2 -> v3:
> - change in pc_dimm_plug()'s signature
> - add pc_dimm_pre_plug call
> 
> v1 -> v2:
> - s/virt_dimm_plug|unplug/virt_memory_plug|unplug
> - s/pc_dimm_memory_plug/pc_dimm_plug
> - reworded title and commit message
> - added pre_plug cb
> - don't handle get_memory_region failure anymore
> ---
>  default-configs/arm-softmmu.mak |  2 ++
>  hw/arm/virt.c                   | 64 ++++++++++++++++++++++++++++++++-
>  2 files changed, 65 insertions(+), 1 deletion(-)
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index be88870799..dc4624794f 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -160,3 +160,5 @@ CONFIG_PCI_DESIGNWARE=y
>  CONFIG_STRONGARM=y
>  CONFIG_HIGHBANK=y
>  CONFIG_MUSICPAL=y
> +CONFIG_MEM_DEVICE=y
> +CONFIG_DIMM=y
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index f01886da22..783468ba77 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -59,6 +59,8 @@
>  #include "qapi/visitor.h"
>  #include "standard-headers/linux/input.h"
>  #include "hw/arm/smmuv3.h"
> +#include "hw/mem/pc-dimm.h"
> +#include "hw/mem/nvdimm.h"
>  
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -1763,6 +1765,49 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>      return ms->possible_cpus;
>  }
>  
> +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                                 Error **errp)
> +{
> +    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
> +
> +    if (dev->hotplugged) {
> +        error_setg(errp, "memory hotplug is not supported");
> +    }
> +
> +    if (is_nvdimm) {
> +        error_setg(errp, "nvdimm is not yet supported");
> +        return;
> +    }
> +
> +    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
> +}
> +
> +static void virt_memory_plug(HotplugHandler *hotplug_dev,
> +                             DeviceState *dev, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    Error *local_err = NULL;
> +
> +    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> +
> +    error_propagate(errp, local_err);
> +}
> +
> +static void virt_memory_unplug(HotplugHandler *hotplug_dev,
> +                               DeviceState *dev, Error **errp)
> +{
> +    pc_dimm_unplug(PC_DIMM(dev), MACHINE(hotplug_dev));
> +    object_unparent(OBJECT(dev));

Please note that this will soon change with

[PATCH RFCv2 0/9] qdev: Hotplug handler chaining + virtio-pmem

What you'll have to do then is to replace the object_unparent by a

object_property_set_bool(OBJECT(dev), false, "realized", NULL);


-- 

Thanks,

David / dhildenb


* Re: [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
@ 2019-02-14 17:16   ` Peter Maydell
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:16 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> The machine RAM attributes will need to be analyzed during the
> configure_accelerator() process. In particular the kvm_type() arm64
> machine callback will use them to know how many IPA/GPA bits are
> needed to model the whole RAM range. So let's assign those machine
> state fields before calling configure_accelerator.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v4: new
> ---
>  vl.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 9cf0fbe0b8..28f6bbebe2 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4324,6 +4324,9 @@ int main(int argc, char **argv, char **envp)
>      machine_opts = qemu_get_machine_opts();
>      qemu_opt_foreach(machine_opts, machine_set_property, current_machine,
>                       &error_fatal);
> +    current_machine->ram_size = ram_size;
> +    current_machine->maxram_size = maxram_size;
> +    current_machine->ram_slots = ram_slots;
>
>      configure_accelerator(current_machine, argv[0]);

This is still after the call to set_memory_options(), so it's OK.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine Eric Auger
@ 2019-02-14 17:29   ` Peter Maydell
  2019-02-18 21:29     ` Auger Eric
  2019-02-18 10:07   ` Igor Mammedov
  1 sibling, 1 reply; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:29 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>
> This patch implements the machine class kvm_type() callback.
> It returns the max IPA shift needed to implement the whole GPA
> range including the RAM and IO regions located beyond.
> The returned value is passed through the KVM_CREATE_VM ioctl and
> this allows KVM to set the stage2 tables dynamically.
>
> At this stage the RAM is still limited to 255GB.
>
> Setting all the existing highmem IO regions beyond the RAM
> allows to have a single contiguous RAM region (initial RAM and
> possible hotpluggable device memory). That way we do not need
> to do invasive changes in the EDK2 FW to support a dynamic
> RAM base.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v5 -> v6:
> - add some comments
> - high IO region cannot start before 256GiB
> ---
>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
>  include/hw/arm/virt.h |  2 ++
>  2 files changed, 52 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 2b15839d0b..b90ffc2e5d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>
>  static void virt_set_memmap(VirtMachineState *vms)
>  {
> +    MachineState *ms = MACHINE(vms);
>      hwaddr base;
>      int i;
>
> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i] = a15memmap[i];
>      }
>
> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
> +    /*
> +     * We now compute the base of the high IO region depending on the
> +     * amount of initial and device memory. The device memory start/size
> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
> +     */
> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);

I don't understand this expression...
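
My best reading, going by the comment just above it, is: the first term
is the end of the initial RAM (which starts at 1GiB) rounded up to 1GiB,
and the second term is the device memory size (maxram_size - ram_size)
rounded up to 1GiB. Written out as a standalone sketch, with names of my
own choosing and the 256GiB floor folded in:

```c
#include <stdint.h>

#define MiB (1024ULL * 1024)
#define GiB (1024ULL * MiB)

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t round_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) / align * align;
}

/*
 * Base of the floating high IO area for a given -m (ram_size) and
 * maxmem (maxram_size), both in bytes, with maxram_size >= ram_size.
 */
static uint64_t high_io_base(uint64_t ram_size, uint64_t maxram_size)
{
    /* Initial RAM starts at 1GiB; round its end up to 1GiB. */
    uint64_t ram_end = round_up(GiB + ram_size, GiB);
    /* Device memory follows, with its size rounded up to 1GiB. */
    uint64_t devmem = round_up(maxram_size - ram_size, GiB);
    uint64_t base = ram_end + devmem;

    /* Never go below 256GiB, so small configurations keep the legacy map. */
    return base < 256 * GiB ? 256 * GiB : base;
}
```

For example -m 4G with no maxmem stays at 256GiB, while -m 300G with
maxmem=512G gives 301GiB + 212GiB = 513GiB. If that is the intent, a
comment naming the two terms would help.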

> +    if (vms->high_io_base < 256 * GiB) {
> +        vms->high_io_base = 256 * GiB;
> +    }
>      base = vms->high_io_base;
>
>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i].size = size;
>          base += size;
>      }
> +    vms->highest_gpa = base - 1;
>  }
>
>  static void machvirt_init(MachineState *machine)
> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
>      bool aarch64 = true;
>
> -    virt_set_memmap(vms);
> +    if (!vms->extended_memmap) {
> +        virt_set_memmap(vms);
> +    }
>
>      /* We can probe only here because during property set
>       * KVM is not available yet
> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>      return NULL;
>  }
>
> +/*
> + * for arm64 kvm_type [7-0] encodes the IPA size shift
> + */
> +static int virt_kvm_type(MachineState *ms, const char *type_str)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
> +    int max_pa_shift;
> +
> +    vms->extended_memmap = true;
> +
> +    virt_set_memmap(vms);
> +
> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
> +
> +    if (max_pa_shift > max_vm_phys_shift) {
> +        error_report("-m and ,maxmem option values "
> +                     "require an IPA range (%d bits) larger than "
> +                     "the one supported by the host (%d bits)",
> +                     max_pa_shift, max_vm_phys_shift);
> +       exit(1);
> +    }

Presumably we should have some equivalent check for TCG, so
that we don't let the user create a setup which wants more
bits of physical address than the TCG CPU allows ?

> +    /*
> +     * By default we return 0 which corresponds to an implicit legacy
> +     * 40b IPA setting. Otherwise we return the actual requested IPA
> +     * logsize
> +     */
> +    return max_pa_shift > 40 ? max_pa_shift : 0;
> +}
> +
>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
> +    mc->kvm_type = virt_kvm_type;
>      assert(!mc->get_hotplug_handler);
>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>      hc->plug = virt_machine_device_plug_cb;
> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
>  {
>      virt_machine_4_0_options(mc);
>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
> +
> +    /* extended memory map is enabled from 4.0 onwards */
> +    mc->kvm_type = NULL;

When is there a difference between setting this to NULL,
and setting it to virt_kvm_type but having the memory
size be <= 256GiB ?

If there isn't any difference, why can't we just let the
pre-4.0 versions behave like the new ones? No existing
VM setup will have > 256GB of memory, so as long as there's
no behaviour change for the <=256GB case we don't need to
take special effort to ensure that the >256GB case continues
to give an error message, do we ?
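
For the return value at least, the legacy layout seems to fall out
naturally. A sketch with hypothetical helper names, using a portable bit
count instead of clz64() and leaving out both the host limit check and
the vms->extended_memmap side effect:

```c
#include <stdint.h>

/* Number of bits needed to represent addr (0 for addr == 0). */
static int pa_bits_needed(uint64_t addr)
{
    int bits = 0;

    while (addr) {
        bits++;
        addr >>= 1;
    }
    return bits;
}

/*
 * Value the callback would hand back for a given highest guest
 * physical address: 0 means "legacy 40-bit IPA", anything else is
 * the requested number of IPA bits.
 */
static int kvm_type_for(uint64_t highest_gpa)
{
    int bits = pa_bits_needed(highest_gpa);

    return bits > 40 ? bits : 0;
}
```

With the legacy map the high PCIe MMIO window ends at 1TiB, so
highest_gpa is 2^40 - 1, which needs 40 bits, and the callback returns 0.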

>  }
>  DEFINE_VIRT_MACHINE(3, 1)
>
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 3dc7a6c5d5..c88f67a492 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -132,6 +132,8 @@ typedef struct {
>      uint32_t iommu_phandle;
>      int psci_conduit;
>      hwaddr high_io_base;
> +    hwaddr highest_gpa;
> +    bool extended_memmap;
>  } VirtMachineState;
>
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> --
> 2.20.1

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support
  2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
                   ` (17 preceding siblings ...)
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 18/18] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger
@ 2019-02-14 17:35 ` Peter Maydell
  2019-02-14 18:00   ` Auger Eric
  18 siblings, 1 reply; 56+ messages in thread
From: Peter Maydell @ 2019-02-14 17:35 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
> This series aims to bump the 255GB RAM limit in machvirt and to
> support device memory in general, and especially PCDIMM/NVDIMM.

> Functionally, the series is split into 3 parts:
> 1) bump of the initial RAM limit [1 - 10] and change in
>    the memory map
> 2) Support of PC-DIMM [11 - 14]
> 3) Support of NV-DIMM [15 - 18]
>
> 1) can be upstreamed before 2 and 2 can be upstreamed before 3.

Hi Eric; sorry I haven't reviewed this series earlier. I think
that 1-10 are pretty near to ready to go in; maybe the easiest
path is to do a respin of just those with the review issues fixed?

I'm a long way from being expert in the PC-DIMM/NV-DIMM stuff, so
I'm going to be reliant on other people to review those parts.

I don't know if your series needs anything from linux-headers
which isn't already in QEMU master after the update to match
5.0rc1 -- if not you could drop the header-sync patch.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support
  2019-02-14 17:35 ` [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Peter Maydell
@ 2019-02-14 18:00   ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-14 18:00 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

Hi Peter,

On 2/14/19 6:35 PM, Peter Maydell wrote:
> On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>> This series aims to bump the 255GB RAM limit in machvirt and to
>> support device memory in general, and especially PCDIMM/NVDIMM.
> 
>> Functionally, the series is split into 3 parts:
>> 1) bump of the initial RAM limit [1 - 10] and change in
>>    the memory map
>> 2) Support of PC-DIMM [11 - 14]
>> 3) Support of NV-DIMM [15 - 18]
>>
>> 1) can be upstreamed before 2 and 2 can be upstreamed before 3.
> 
> Hi Eric; sorry I haven't reviewed this series earlier. I think
> that 1-10 are pretty near to ready to go in; maybe the easiest
> path is to do a respin of just those with the review issues fixed?

No problem. Thank you for the review.

Yes I will quickly respin the patches you reviewed.

> 
> I'm a long way from being expert in the PC-DIMM/NV-DIMM stuff, so
> I'm going to be reliant on other people to review those parts.
> 
> I don't know if your series needs anything from linux-headers
> which isn't already in QEMU master after the update to match
> 5.0rc1 -- if not you could drop the header-sync patch.
5.0-rc1 should be OK so I think I can drop the header sync.

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
@ 2019-02-18  8:14   ` Igor Mammedov
  0 siblings, 0 replies; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18  8:14 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, dgilbert, david, drjones

On Tue,  5 Feb 2019 18:33:01 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> 
> Generate Memory Affinity Structures for PC-DIMM ranges.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> 
> ---
> v5 -> v6:
> - fix mingw compile issue
> 
> v4 -> v5:
> - Align to x86 code and especially
>   "pc: acpi: revert back to 1 SRAT entry for hotpluggable area"
> 
> v3 -> v4:
> - do not use vms->bootinfo.device_memory_start/device_memory_size anymore
> 
> v1 -> v2:
> - build_srat_hotpluggable_memory moved to aml-build
> ---
>  hw/arm/virt-acpi-build.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 829d2f0035..781eafaf5e 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -516,6 +516,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>      int i, srat_start;
>      uint64_t mem_base;
>      MachineClass *mc = MACHINE_GET_CLASS(vms);
> +    MachineState *ms = MACHINE(vms);
>      const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
>  
>      srat_start = table_data->len;
> @@ -541,6 +542,14 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>          }
>      }
>  
> +    if (ms->device_memory) {
> +        numamem = acpi_data_push(table_data, sizeof *numamem);
> +        build_srat_memory(numamem, ms->device_memory->base,
> +                          memory_region_size(&ms->device_memory->mr),
> +                          nb_numa_nodes - 1,
> +                          MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
> +    }
> +
>      build_header(linker, table_data, (void *)(table_data->data + srat_start),
>                   "SRAT", table_data->len - srat_start, 3, NULL, NULL);
>  }


* Re: [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
@ 2019-02-18  8:58   ` Igor Mammedov
  2019-02-20 15:30     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18  8:58 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

On Tue,  5 Feb 2019 18:33:00 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> 
> This patch add memory nodes corresponding to PC-DIMM regions.
s/add/adds/ or s/This patch add/Add/

> 
> NV_DIMM and ACPI_NVDIMM configs are not yet set for ARM so we
> don't need to care about NV-DIMM at this stage.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v3 -> v4:
> - get rid of @base and @len in fdt_add_hotpluggable_memory_nodes
> 
> v1 -> v2:
> - added qapi_free_MemoryDeviceInfoList and simplify the loop
> ---
>  hw/arm/boot.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 2ef367e15b..2a70e8aa82 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -19,6 +19,7 @@
>  #include "sysemu/numa.h"
>  #include "hw/boards.h"
>  #include "hw/loader.h"
> +#include "hw/mem/memory-device.h"
>  #include "elf.h"
>  #include "sysemu/device_tree.h"
>  #include "qemu/config-file.h"
> @@ -526,6 +527,34 @@ static void fdt_add_psci_node(void *fdt)
>      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
>  }
>  
> +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
> +                                             uint32_t acells, uint32_t scells) {
> +    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
> +    MemoryDeviceInfo *mi;
> +    PCDIMMDeviceInfo *di;
> +    bool is_nvdimm;
> +    int ret = 0;
> +
> +    for (info = info_list; info != NULL; info = info->next) {
> +        mi = info->value;
> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
> +
> +        if (is_nvdimm) {
> +            ret = -ENOENT; /* NV-DIMM not yet supported */
> +        } else {
> +            ret = fdt_add_memory_node(fdt, acells, di->addr,
> +                                      scells, di->size, di->node);
> +        }
> +        if (ret < 0) {
> +            goto out;
> +        }
> +    }
> +out:
> +    qapi_free_MemoryDeviceInfoList(info_list);
> +    return ret;
> +}
> +
>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>                   hwaddr addr_limit, AddressSpace *as)
>  {
> @@ -621,6 +650,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>          }
>      }
>  
> +    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
> +    if (rc < 0) {
> +            fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
The error message is rather vague: a user combining NVDIMMs and PC-DIMMs on the
CLI won't have a clue that the former are not supported.
I suggest passing in error_fatal as an argument and reporting a more specific
error from fdt_add_hotpluggable_memory_nodes().
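
To illustrate the idea, here is a minimal standalone sketch. It does not use
QEMU's Error API: the mem_dev type, the errbuf/errlen parameters and the
message text are stand-ins invented for the example. The point is only that
the callee names the offending device instead of leaving the caller with a
generic failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the kinds found in the memory device list. */
enum mem_dev_kind { MEM_DEV_DIMM, MEM_DEV_NVDIMM };

struct mem_dev {
    enum mem_dev_kind kind;
    unsigned long long addr;
    unsigned long long size;
};

/*
 * Walk the devices. On the first unsupported one, write a specific
 * message into errbuf and return a negative errno value. On success
 * errbuf is set to the empty string and 0 is returned.
 */
static int add_hotpluggable_memory_nodes(const struct mem_dev *devs, int n,
                                         char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);

    for (int i = 0; i < n; i++) {
        if (devs[i].kind == MEM_DEV_NVDIMM) {
            snprintf(errbuf, errlen,
                     "NVDIMM at 0x%llx is not supported in the device tree",
                     devs[i].addr);
            return -ENOENT;
        }
        /* The real code would add the memory node to the FDT here. */
    }
    errbuf[0] = '\0';
    return 0;
}
```

In the real function the buffer would be replaced by the Error object handed
in by the caller, so the specific message reaches the user directly.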

Does this run on reboot?

> +            goto fail;
> +    }
> +
>      rc = fdt_path_offset(fdt, "/chosen");
>      if (rc < 0) {
>          qemu_fdt_add_subnode(fdt, "/chosen");

* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory Eric Auger
@ 2019-02-18  9:31   ` Igor Mammedov
  2019-02-19 15:53     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18  9:31 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

On Tue,  5 Feb 2019 18:33:02 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> The device memory region is located after the initial RAM.
> its start/size are 1GB aligned.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> 
> ---
> v4 -> v5:
> - device memory set after the initial RAM
> 
> v3 -> v4:
> - remove bootinfo.device_memory_start/device_memory_size
> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
> ---
>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 783468ba77..b683902991 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -61,6 +61,7 @@
>  #include "hw/arm/smmuv3.h"
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/mem/nvdimm.h"
> +#include "hw/acpi/acpi.h"
>  
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>      g_free(nodename);
>  }
>  
> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
> +{
> +    MachineState *ms = MACHINE(vms);
> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;
This should be sized with 1GiB of alignment headroom per slot from the start
(to avoid repeating the x86 mistakes); see the enforce_aligned_dimm usage and
the associated commit for more details.
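
As a rough standalone sketch of that sizing (helper names are made up for the
example; as far as I recall the x86 PC machine adds 1GiB per slot in the same
way when enforce_aligned_dimm is set):

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * Size of the device memory window: the hotpluggable amount plus 1GiB of
 * slack per slot, so that every DIMM can be placed on a 1GiB boundary
 * without running out of address space. Returns 0 if there is no device
 * memory at all.
 */
static uint64_t device_memory_size(uint64_t ram_size, uint64_t maxram_size,
                                   uint64_t ram_slots)
{
    uint64_t size = maxram_size - ram_size;

    if (size == 0) {
        return 0;
    }
    return align_up(size + ram_slots * GiB, GiB);
}
```

With -m 4G,slots=2,maxmem=8G this gives a 6GiB window instead of 4GiB.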

> +    uint64_t align = GiB;
> +
> +    if (!device_memory_size) {
> +        return;
> +    }
> +
> +    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
> +        error_report("unsupported number of memory slots: %"PRIu64,
> +                     ms->ram_slots);
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
> +        error_report("maximum memory size must be aligned to multiple of 0x%"
> +                     PRIx64, align);
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
> +    ms->device_memory->base = QEMU_ALIGN_UP(GiB + ms->ram_size, GiB);
                                               ^^^ where does this come from?


> +
> +    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
> +                       "device-memory", device_memory_size);
> +    memory_region_add_subregion(sysmem, ms->device_memory->base,
> +                                &ms->device_memory->mr);
> +}
> +
>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>  {
>      const VirtMachineState *board = container_of(binfo, VirtMachineState,
> @@ -1569,6 +1601,10 @@ static void machvirt_init(MachineState *machine)
>                                           machine->ram_size);
>      memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
>  
> +    if (vms->extended_memmap) {
> +        create_device_memory(vms, sysmem);
> +    }
> +
>      create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
>  
>      create_gic(vms, pic);

* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine Eric Auger
  2019-02-14 17:29   ` Peter Maydell
@ 2019-02-18 10:07   ` Igor Mammedov
  2019-02-19 15:56     ` Auger Eric
  1 sibling, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18 10:07 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

On Tue,  5 Feb 2019 18:32:57 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> This patch implements the machine class kvm_type() callback.
> It returns the max IPA shift needed to implement the whole GPA
> range including the RAM and IO regions located beyond.
> The returned value in passed though the KVM_CREATE_VM ioctl and
> this allows KVM to set the stage2 tables dynamically.
> 
> At this stage the RAM limit still is limited to 255GB.
> 
> Setting all the existing highmem IO regions beyond the RAM
> allows to have a single contiguous RAM region (initial RAM and
> possible hotpluggable device memory). That way we do not need
> to do invasive changes in the EDK2 FW to support a dynamic
> RAM base.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v5 -> v6:
> - add some comments
> - high IO region cannot start before 256GiB
> ---
>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
>  include/hw/arm/virt.h |  2 ++
>  2 files changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 2b15839d0b..b90ffc2e5d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>  
>  static void virt_set_memmap(VirtMachineState *vms)
>  {
> +    MachineState *ms = MACHINE(vms);
>      hwaddr base;
>      int i;
>  
> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i] = a15memmap[i];
>      }
>  
> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
> +    /*
> +     * We now compute the base of the high IO region depending on the
> +     * amount of initial and device memory. The device memory start/size
> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
> +     */
> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);
> +    if (vms->high_io_base < 256 * GiB) {
> +        vms->high_io_base = 256 * GiB;
> +    }
>      base = vms->high_io_base;
>  
>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i].size = size;
>          base += size;
>      }
> +    vms->highest_gpa = base - 1;
>  }
>  
>  static void machvirt_init(MachineState *machine)
> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
>      bool aarch64 = true;
>  
> -    virt_set_memmap(vms);
> +    if (!vms->extended_memmap) {
> +        virt_set_memmap(vms);
> +    }
>  
>      /* We can probe only here because during property set
>       * KVM is not available yet
> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>      return NULL;
>  }
>  
> +/*
> + * for arm64 kvm_type [7-0] encodes the IPA size shift
> + */
> +static int virt_kvm_type(MachineState *ms, const char *type_str)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
> +    int max_pa_shift;
> +
> +    vms->extended_memmap = true;
> +
> +    virt_set_memmap(vms);
> +
> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
> +
> +    if (max_pa_shift > max_vm_phys_shift) {
> +        error_report("-m and ,maxmem option values "
> +                     "require an IPA range (%d bits) larger than "
> +                     "the one supported by the host (%d bits)",
> +                     max_pa_shift, max_vm_phys_shift);
> +       exit(1);
> +    }
> +    /*
> +     * By default we return 0 which corresponds to an implicit legacy
> +     * 40b IPA setting. Otherwise we return the actual requested IPA
> +     * logsize
> +     */
> +    return max_pa_shift > 40 ? max_pa_shift : 0;
> +}
> +
>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
> +    mc->kvm_type = virt_kvm_type;
>      assert(!mc->get_hotplug_handler);
>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>      hc->plug = virt_machine_device_plug_cb;
> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
>  {
>      virt_machine_4_0_options(mc);
>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
> +
> +    /* extended memory map is enabled from 4.0 onwards */
> +    mc->kvm_type = NULL;
It's quite confusing: you have vms->extended_memmap and mc->kvm_type, and
the latter for some reason enables device memory.

To me it seems that the two are not related; device memory should work just fine
without KVM or dynamic IPA (within TCG supported limits).

I'd make extended_memmap a virt machine class member that enables pc-dimm support,
and then add the checks for the supported IPA range on top.
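
A standalone sketch of that layering, assuming made-up helper names and plain
integers instead of the machine state: the memory map is computed with no
reference to the accelerator, and the IPA range check is a separate step that
only the KVM path needs to call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

static uint64_t round_up_gib(uint64_t x)
{
    return (x + GiB - 1) & ~(GiB - 1);
}

/*
 * Base of the high IO region, following the formula in the patch:
 * after the initial RAM (which starts at 1GiB) and the device memory,
 * both 1GiB aligned, and never below 256GiB.
 */
static uint64_t high_io_base(uint64_t ram_size, uint64_t maxram_size)
{
    uint64_t base;

    assert(maxram_size >= ram_size);
    base = round_up_gib(GiB + ram_size) +
           round_up_gib(maxram_size - ram_size);
    return base < 256 * GiB ? 256 * GiB : base;
}

/* Number of address bits needed to reach highest_gpa (64 - clz64). */
static int pa_bits(uint64_t highest_gpa)
{
    int bits = 0;

    while (highest_gpa) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/* Accelerator-specific check, layered on top of the memory map. */
static bool gpa_fits(uint64_t highest_gpa, int max_ipa_bits)
{
    return pa_bits(highest_gpa) <= max_ipa_bits;
}
```

With this split, TCG would use high_io_base() alone, and only the KVM path
would additionally call gpa_fits() with whatever the host reports.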

>  }
>  DEFINE_VIRT_MACHINE(3, 1)
>  
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 3dc7a6c5d5..c88f67a492 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -132,6 +132,8 @@ typedef struct {
>      uint32_t iommu_phandle;
>      int psci_conduit;
>      hwaddr high_io_base;
> +    hwaddr highest_gpa;
> +    bool extended_memmap;
>  } VirtMachineState;
>  
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)

* Re: [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size Eric Auger
@ 2019-02-18 10:21   ` Igor Mammedov
  0 siblings, 0 replies; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18 10:21 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, dgilbert, david, drjones

On Tue,  5 Feb 2019 18:33:03 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> From: Kwangwoo Lee <kwangwoo.lee@sk.com>
> 
> This patch uses configurable IO base and size to create NPIO AML for
> ACPI NFIT. Since a different architecture like AArch64 does not use
> port-mapped IO, a configurable IO base is required to create correct
> mapping of ACPI IO address and size.
> 
> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v2 -> v3:
> - s/size/len in pc_piix.c and pc_q35.c
> ---
>  hw/acpi/nvdimm.c        | 28 +++++++++++++++++++---------
>  hw/i386/pc_piix.c       |  8 +++++++-
>  hw/i386/pc_q35.c        |  8 +++++++-
>  include/hw/mem/nvdimm.h | 12 ++++++++++++
>  4 files changed, 45 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index e53b2cb681..da68de5535 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -929,8 +929,8 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
>                              FWCfgState *fw_cfg, Object *owner)
>  {
>      memory_region_init_io(&state->io_mr, owner, &nvdimm_dsm_ops, state,
> -                          "nvdimm-acpi-io", NVDIMM_ACPI_IO_LEN);
> -    memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, &state->io_mr);
> +                          "nvdimm-acpi-io", state->dsm_io.len);
> +    memory_region_add_subregion(io, state->dsm_io.base, &state->io_mr);
>  
>      state->dsm_mem = g_array_new(false, true /* clear */, 1);
>      acpi_data_push(state->dsm_mem, sizeof(NvdimmDsmIn));
> @@ -959,12 +959,14 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
>  
>  #define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
>  
> -static void nvdimm_build_common_dsm(Aml *dev)
> +static void nvdimm_build_common_dsm(Aml *dev,
> +                                    AcpiNVDIMMState *acpi_nvdimm_state)
>  {
>      Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
>      Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
>      Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
>      uint8_t byte_list[1];
> +    AmlRegionSpace rs;
>  
>      method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
>      uuid = aml_arg(0);
> @@ -975,9 +977,16 @@ static void nvdimm_build_common_dsm(Aml *dev)
>  
>      aml_append(method, aml_store(aml_name(NVDIMM_ACPI_MEM_ADDR), dsm_mem));
>  
> +    if (acpi_nvdimm_state->dsm_io.type == NVDIMM_ACPI_IO_PORT) {
> +        rs = AML_SYSTEM_IO;
> +    } else {
> +        rs = AML_SYSTEM_MEMORY;
> +    }
> +
>      /* map DSM memory and IO into ACPI namespace. */
> -    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, AML_SYSTEM_IO,
> -               aml_int(NVDIMM_ACPI_IO_BASE), NVDIMM_ACPI_IO_LEN));
> +    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, rs,
> +               aml_int(acpi_nvdimm_state->dsm_io.base),
> +               acpi_nvdimm_state->dsm_io.len));
>      aml_append(method, aml_operation_region(NVDIMM_DSM_MEMORY,
>                 AML_SYSTEM_MEMORY, dsm_mem, sizeof(NvdimmDsmIn)));
>  
> @@ -1260,7 +1269,8 @@ static void nvdimm_build_nvdimm_devices(Aml *root_dev, uint32_t ram_slots)
>  }
>  
>  static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
> -                              BIOSLinker *linker, GArray *dsm_dma_arrea,
> +                              BIOSLinker *linker,
> +                              AcpiNVDIMMState *acpi_nvdimm_state,
>                                uint32_t ram_slots)
>  {
>      Aml *ssdt, *sb_scope, *dev;
> @@ -1288,7 +1298,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
>       */
>      aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
>  
> -    nvdimm_build_common_dsm(dev);
> +    nvdimm_build_common_dsm(dev, acpi_nvdimm_state);
>  
>      /* 0 is reserved for root device. */
>      nvdimm_build_device_dsm(dev, 0);
> @@ -1307,7 +1317,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
>                                                 NVDIMM_ACPI_MEM_ADDR);
>  
>      bios_linker_loader_alloc(linker,
> -                             NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
> +                             NVDIMM_DSM_MEM_FILE, acpi_nvdimm_state->dsm_mem,
>                               sizeof(NvdimmDsmIn), false /* high memory */);
>      bios_linker_loader_add_pointer(linker,
>          ACPI_BUILD_TABLE_FILE, mem_addr_offset, sizeof(uint32_t),
> @@ -1329,7 +1339,7 @@ void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
>          return;
>      }
>  
> -    nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
> +    nvdimm_build_ssdt(table_offsets, table_data, linker, state,
>                        ram_slots);
>  
>      device_list = nvdimm_get_device_list();
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 63c84e3827..d9c81c9aa6 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -298,7 +298,13 @@ static void pc_init1(MachineState *machine,
>      }
>  
>      if (pcms->acpi_nvdimm_state.is_enabled) {
> -        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
> +        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
> +
> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;

> +        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
The above constants should probably be moved to a target-specific header.

> +
> +        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
>                                 pcms->fw_cfg, OBJECT(pcms));
>      }
>  }
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index b7b7959934..1110a26e34 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -330,7 +330,13 @@ static void pc_q35_init(MachineState *machine)
>      pc_nic_init(pcmc, isa_bus, host_bus);
>  
>      if (pcms->acpi_nvdimm_state.is_enabled) {
> -        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
> +        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
> +
> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;
> +        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
> +
> +        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
>                                 pcms->fw_cfg, OBJECT(pcms));
>      }
>  }
> diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> index c5c9b3c7f8..af8a5fd034 100644
> --- a/include/hw/mem/nvdimm.h
> +++ b/include/hw/mem/nvdimm.h
> @@ -123,6 +123,17 @@ struct NvdimmFitBuffer {
>  };
>  typedef struct NvdimmFitBuffer NvdimmFitBuffer;
>  

> +typedef enum {
> +    NVDIMM_ACPI_IO_PORT,
> +    NVDIMM_ACPI_IO_MEMORY,
> +} AcpiNVDIMMIOType;
> +
> +typedef struct AcpiNVDIMMIOEntry {
> +    AcpiNVDIMMIOType type;
> +    hwaddr base;
> +    hwaddr len;
> +} AcpiNVDIMMIOEntry;
This one very much resembles AcpiGenericAddress;
why not reuse it?
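
For reference, a standalone sketch of how the same information maps onto the
ACPI Generic Address Structure fields. The struct below is a stand-in that
mirrors the GAS layout from the ACPI spec, not QEMU's actual definition, and
make_dsm_io() is a hypothetical helper; any base/length values passed to it
are placeholders.

```c
#include <assert.h>
#include <stdint.h>

/* Address space IDs as defined for the ACPI Generic Address Structure. */
enum {
    GAS_SPACE_SYSTEM_MEMORY = 0,
    GAS_SPACE_SYSTEM_IO     = 1,
};

/* Stand-in mirroring the GAS fields. */
struct gas {
    uint8_t  space_id;     /* replaces the IO_PORT / IO_MEMORY enum */
    uint8_t  bit_width;    /* register width in bits, i.e. len * 8 */
    uint8_t  bit_offset;
    uint8_t  access_width; /* 0 means undefined / legacy access */
    uint64_t address;      /* replaces base */
};

/* Describe the DSM IO window; len_bytes is limited by the 8-bit width. */
static struct gas make_dsm_io(int is_port_io, uint64_t base,
                              uint8_t len_bytes)
{
    struct gas g = { 0 };

    assert(len_bytes > 0 && len_bytes <= 31);
    g.space_id = is_port_io ? GAS_SPACE_SYSTEM_IO : GAS_SPACE_SYSTEM_MEMORY;
    g.bit_width = (uint8_t)(len_bytes * 8);
    g.address = base;
    return g;
}
```

The AML region space can then be chosen from space_id, so the separate type
enum is not needed.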

>  struct AcpiNVDIMMState {
>      /* detect if NVDIMM support is enabled. */
>      bool is_enabled;
> @@ -140,6 +151,7 @@ struct AcpiNVDIMMState {
>       */
>      int32_t persistence;
>      char    *persistence_string;
> +    AcpiNVDIMMIOEntry dsm_io;
>  };
>  typedef struct AcpiNVDIMMState AcpiNVDIMMState;
>  

* Re: [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
@ 2019-02-18 10:30   ` Igor Mammedov
  2019-02-20 15:21     ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-18 10:30 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, dgilbert, david, drjones

On Tue,  5 Feb 2019 18:33:04 +0100
Eric Auger <eric.auger@redhat.com> wrote:

> From: Kwangwoo Lee <kwangwoo.lee@sk.com>
> 
> Pre-plug and plug handlers are prepared for NVDIMM support.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> ---
>  default-configs/arm-softmmu.mak |  2 ++
>  hw/arm/virt-acpi-build.c        |  6 ++++++
>  hw/arm/virt.c                   | 22 ++++++++++++++++++++++
>  include/hw/arm/virt.h           |  3 +++
>  4 files changed, 33 insertions(+)
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index dc4624794f..ddbe87ed15 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -162,3 +162,5 @@ CONFIG_HIGHBANK=y
>  CONFIG_MUSICPAL=y
>  CONFIG_MEM_DEVICE=y
>  CONFIG_DIMM=y
> +CONFIG_NVDIMM=y
> +CONFIG_ACPI_NVDIMM=y
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 781eafaf5e..f086adfa82 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -784,6 +784,7 @@ static
>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>  {
>      VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
> +    MachineState *ms = MACHINE(vms);
>      GArray *table_offsets;
>      unsigned dsdt, xsdt;
>      GArray *tables_blob = tables->table_data;
> @@ -824,6 +825,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>          }
>      }
>  
> +    if (vms->acpi_nvdimm_state.is_enabled) {
> +        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
> +                          &vms->acpi_nvdimm_state, ms->ram_slots);
> +    }
> +
>      if (its_class_name() && !vmc->no_its) {
>          acpi_add_table(table_offsets, tables_blob);
>          build_iort(tables_blob, tables->linker, vms);
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b683902991..0c8c2cc191 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -132,6 +132,7 @@ static const MemMapEntry a15memmap[] = {
>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> +    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },
Where does this range come from, and is its size sufficient?

>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -1637,6 +1638,18 @@ static void machvirt_init(MachineState *machine)
>  
>      create_platform_bus(vms, pic);
>  
> +    if (vms->acpi_nvdimm_state.is_enabled) {
> +        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
> +
> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
> +        acpi_nvdimm_state->dsm_io.base =
> +                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
> +
> +        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
> +                               vms->fw_cfg, OBJECT(vms));
> +    }
> +
>      vms->bootinfo.ram_size = machine->ram_size;
>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> @@ -1822,10 +1835,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>                               DeviceState *dev, Error **errp)
>  {
>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
>      Error *local_err = NULL;
>  
>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
>  
> +    if (is_nvdimm) {
> +        nvdimm_plug(&vms->acpi_nvdimm_state);
> +    }
> +
> +out:
>      error_propagate(errp, local_err);
>  }
>  
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index c88f67a492..56d73b0e86 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -37,6 +37,7 @@
>  #include "hw/arm/arm.h"
>  #include "sysemu/kvm.h"
>  #include "hw/intc/arm_gicv3_common.h"
> +#include "hw/mem/nvdimm.h"
>  
>  #define NUM_GICV2M_SPIS       64
>  #define NUM_VIRTIO_TRANSPORTS 32
> @@ -77,6 +78,7 @@ enum {
>      VIRT_GPIO,
>      VIRT_SECURE_UART,
>      VIRT_SECURE_MEM,
> +    VIRT_ACPI_IO,
>      VIRT_LOWMEMMAP_LAST,
>  };
>  
> @@ -134,6 +136,7 @@ typedef struct {
>      hwaddr high_io_base;
>      hwaddr highest_gpa;
>      bool extended_memmap;
> +    AcpiNVDIMMState acpi_nvdimm_state;
>  } VirtMachineState;
>  
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)

* Re: [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift
  2019-02-14 17:15   ` Peter Maydell
@ 2019-02-18 18:03     ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-18 18:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

Hi Peter,
On 2/14/19 6:15 PM, Peter Maydell wrote:
> On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>>
>> Add the kvm_arm_get_max_vm_phys_shift() helper that returns the
>> log of the maximum IPA size supported by KVM. This capability
>> needs to be known to create the VM with a specific IPA max size
>> (kvm_type passed along KVM_CREATE_VM ioctl.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v4 -> v5:
>> - return 40 if the host does not support the capability
>>
>> v3 -> v4:
>> - s/s/ms in kvm_arm_get_max_vm_phys_shift function comment
>> - check KVM_CAP_ARM_VM_IPA_SIZE extension
>>
>> v1 -> v2:
>> - put this in ARM specific code
>> ---
>>  target/arm/kvm.c     | 10 ++++++++++
>>  target/arm/kvm_arm.h | 13 +++++++++++++
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
>> index e00ccf9c98..fc1dd3ec6a 100644
>> --- a/target/arm/kvm.c
>> +++ b/target/arm/kvm.c
>> @@ -18,6 +18,7 @@
>>  #include "qemu/error-report.h"
>>  #include "sysemu/sysemu.h"
>>  #include "sysemu/kvm.h"
>> +#include "sysemu/kvm_int.h"
>>  #include "kvm_arm.h"
>>  #include "cpu.h"
>>  #include "trace.h"
>> @@ -162,6 +163,15 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
>>      env->features = arm_host_cpu_features.features;
>>  }
>>
>> +int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
>> +{
>> +    KVMState *s = KVM_STATE(ms->accelerator);
>> +    int ret;
>> +
>> +    ret = kvm_check_extension(s, KVM_CAP_ARM_VM_IPA_SIZE);
> 
> Why not name the function the same as the extension name?
> 
>> +    return ret > 0 ? ret : 40;
>> +}
>> +
>>  int kvm_arch_init(MachineState *ms, KVMState *s)
>>  {
>>      /* For ARM interrupt delivery is always asynchronous,
>> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
>> index 6393455b1d..0728bbfa6b 100644
>> --- a/target/arm/kvm_arm.h
>> +++ b/target/arm/kvm_arm.h
>> @@ -207,6 +207,14 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
>>   */
>>  void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
>>
>> +/**
>> + * kvm_arm_get_max_vm_phys_shift - Returns log2 of the max IPA size
>> + * supported by KVM
> 
> This is the number of bits in the IPA address space,
> right (ie 40 for a 40-bit IPA, and so on) ? If so, then
> I think "Return number of bits in the IPA address space"
> might be clearer.

It actually returns the maximum number of bits in the IPA address space
supported by the host kernel. What about naming this function
"kvm_arm_get_max_vm_ipa_size"?

Thanks

Eric
> 
>> + *
>> + * @ms: Machine state handle
>> + */
>> +int kvm_arm_get_max_vm_phys_shift(MachineState *ms);
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework
  2019-02-14 17:15   ` David Hildenbrand
@ 2019-02-18 18:10     ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-18 18:10 UTC (permalink / raw)
  To: David Hildenbrand, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, shameerali.kolothum.thodi, imammedo
  Cc: dgilbert, david, drjones

Hi David,

On 2/14/19 6:15 PM, David Hildenbrand wrote:
> On 05.02.19 18:32, Eric Auger wrote:
>> This patch adds the the memory hot-plug/hot-unplug infrastructure
>> in machvirt. It is still not enabled as no device memory is allocated.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>
>> ---
>> v4 -> v5:
>> - change in pc_dimm_pre_plug signature
>> - CONFIG_MEM_HOTPLUG replaced by CONFIG_MEM_DEVICE and CONFIG_DIMM
>>
>> v3 -> v4:
>> - check the memory device is not hotplugged
>>
>> v2 -> v3:
>> - change in pc_dimm_plug()'s signature
>> - add pc_dimm_pre_plug call
>>
>> v1 -> v2:
>> - s/virt_dimm_plug|unplug/virt_memory_plug|unplug
>> - s/pc_dimm_memory_plug/pc_dimm_plug
>> - reworded title and commit message
>> - added pre_plug cb
>> - don't handle get_memory_region failure anymore
>> ---
>>  default-configs/arm-softmmu.mak |  2 ++
>>  hw/arm/virt.c                   | 64 ++++++++++++++++++++++++++++++++-
>>  2 files changed, 65 insertions(+), 1 deletion(-)
>>
>> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
>> index be88870799..dc4624794f 100644
>> --- a/default-configs/arm-softmmu.mak
>> +++ b/default-configs/arm-softmmu.mak
>> @@ -160,3 +160,5 @@ CONFIG_PCI_DESIGNWARE=y
>>  CONFIG_STRONGARM=y
>>  CONFIG_HIGHBANK=y
>>  CONFIG_MUSICPAL=y
>> +CONFIG_MEM_DEVICE=y
>> +CONFIG_DIMM=y
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index f01886da22..783468ba77 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -59,6 +59,8 @@
>>  #include "qapi/visitor.h"
>>  #include "standard-headers/linux/input.h"
>>  #include "hw/arm/smmuv3.h"
>> +#include "hw/mem/pc-dimm.h"
>> +#include "hw/mem/nvdimm.h"
>>  
>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>> @@ -1763,6 +1765,49 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>      return ms->possible_cpus;
>>  }
>>  
>> +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>> +                                 Error **errp)
>> +{
>> +    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
>> +
>> +    if (dev->hotplugged) {
>> +        error_setg(errp, "memory hotplug is not supported");
>> +    }
>> +
>> +    if (is_nvdimm) {
>> +        error_setg(errp, "nvdimm is not yet supported");
>> +        return;
>> +    }
>> +
>> +    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
>> +}
>> +
>> +static void virt_memory_plug(HotplugHandler *hotplug_dev,
>> +                             DeviceState *dev, Error **errp)
>> +{
>> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>> +    Error *local_err = NULL;
>> +
>> +    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
>> +
>> +    error_propagate(errp, local_err);
>> +}
>> +
>> +static void virt_memory_unplug(HotplugHandler *hotplug_dev,
>> +                               DeviceState *dev, Error **errp)
>> +{
>> +    pc_dimm_unplug(PC_DIMM(dev), MACHINE(hotplug_dev));
>> +    object_unparent(OBJECT(dev));
> 
> Please note that this will soon change with
> 
> [PATCH RFCv2 0/9] qdev: Hotplug handler chaining + virtio-pmem
> 
> What you'll have to do then is to replace the object_unparent by a
> 
> object_property_set_bool(OBJECT(dev), false, "realized", NULL);
Noted.

Thanks for the heads up!

Eric
> 
> 

* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-14 17:29   ` Peter Maydell
@ 2019-02-18 21:29     ` Auger Eric
  2019-02-19  7:49       ` Igor Mammedov
  0 siblings, 1 reply; 56+ messages in thread
From: Auger Eric @ 2019-02-18 21:29 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, QEMU Developers, qemu-arm, Shameerali Kolothum Thodi,
	Igor Mammedov, David Hildenbrand, Dr. David Alan Gilbert,
	David Gibson, Andrew Jones

Hi Peter,

On 2/14/19 6:29 PM, Peter Maydell wrote:
> On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>>
>> This patch implements the machine class kvm_type() callback.
>> It returns the max IPA shift needed to implement the whole GPA
>> range including the RAM and IO regions located beyond.
>> The returned value is passed through the KVM_CREATE_VM ioctl and
>> this allows KVM to set the stage2 tables dynamically.
>>
>> At this stage the RAM limit still is limited to 255GB.
>>
>> Setting all the existing highmem IO regions beyond the RAM
>> allows to have a single contiguous RAM region (initial RAM and
>> possible hotpluggable device memory). That way we do not need
>> to do invasive changes in the EDK2 FW to support a dynamic
>> RAM base.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v5 -> v6:
>> - add some comments
>> - high IO region cannot start before 256GiB
>> ---
>>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
>>  include/hw/arm/virt.h |  2 ++
>>  2 files changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 2b15839d0b..b90ffc2e5d 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>>
>>  static void virt_set_memmap(VirtMachineState *vms)
>>  {
>> +    MachineState *ms = MACHINE(vms);
>>      hwaddr base;
>>      int i;
>>
>> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i] = a15memmap[i];
>>      }
>>
>> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
>> +    /*
>> +     * We now compute the base of the high IO region depending on the
>> +     * amount of initial and device memory. The device memory start/size
>> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
>> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
>> +     */
>> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
>> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);
> 
> I don't understand this expression...
My intent was to align the start of the device memory on a GiB boundary,
just after the initial RAM (ram_size). And then align the floating IO
region on a GiB boundary after the device memory (of size
ms->maxram_size - ms->ram_size). What am I missing?
> 
>> +    if (vms->high_io_base < 256 * GiB) {
>> +        vms->high_io_base = 256 * GiB;
>> +    }
>>      base = vms->high_io_base;
>>
>>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
>> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i].size = size;
>>          base += size;
>>      }
>> +    vms->highest_gpa = base - 1;
>>  }
>>
>>  static void machvirt_init(MachineState *machine)
>> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
>>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
>>      bool aarch64 = true;
>>
>> -    virt_set_memmap(vms);
>> +    if (!vms->extended_memmap) {
>> +        virt_set_memmap(vms);
>> +    }
>>
>>      /* We can probe only here because during property set
>>       * KVM is not available yet
>> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>>      return NULL;
>>  }
>>
>> +/*
>> + * for arm64 kvm_type [7-0] encodes the IPA size shift
>> + */
>> +static int virt_kvm_type(MachineState *ms, const char *type_str)
>> +{
>> +    VirtMachineState *vms = VIRT_MACHINE(ms);
>> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
>> +    int max_pa_shift;
>> +
>> +    vms->extended_memmap = true;
>> +
>> +    virt_set_memmap(vms);
>> +
>> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
>> +
>> +    if (max_pa_shift > max_vm_phys_shift) {
>> +        error_report("-m and ,maxmem option values "
>> +                     "require an IPA range (%d bits) larger than "
>> +                     "the one supported by the host (%d bits)",
>> +                     max_pa_shift, max_vm_phys_shift);
>> +       exit(1);
>> +    }
> 
> Presumably we should have some equivalent check for TCG, so
> that we don't let the user create a setup which wants more
> bits of physical address than the TCG CPU allows ?
kvm_type() sets the new memory map. For TCG we should stick to the 1TB
GPA address space which should be consistent with the existing
ID_AA64MMFR0_EL1 settings (arm/internals.h implements arm_pamax(ARMCPU
*cpu) which decodes hardcoded cpu->id_aa64mmfr0).
> 
>> +    /*
>> +     * By default we return 0 which corresponds to an implicit legacy
>> +     * 40b IPA setting. Otherwise we return the actual requested IPA
>> +     * logsize
>> +     */
>> +    return max_pa_shift > 40 ? max_pa_shift : 0;
>> +}
>> +
>>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>>  {
>>      MachineClass *mc = MACHINE_CLASS(oc);
>> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
>> +    mc->kvm_type = virt_kvm_type;
>>      assert(!mc->get_hotplug_handler);
>>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>>      hc->plug = virt_machine_device_plug_cb;
>> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
>>  {
>>      virt_machine_4_0_options(mc);
>>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
>> +
>> +    /* extended memory map is enabled from 4.0 onwards */
>> +    mc->kvm_type = NULL;
> 
> When is there a difference between setting this to NULL,
> and setting it to virt_kvm_type but having the memory
> size be <= 256GiB ?
There shouldn't be any difference. When size <= 255GiB we stick to the
1TB PA address space.
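
To make the 40-bit case concrete, here is a minimal sketch of the shift
computation in virt_kvm_type(). It assumes the legacy map tops out at
1TB - 1. clz64_sketch(), ipa_shift_for() and kvm_type_for() are
hypothetical stand-ins written for this illustration; they are not the
QEMU helpers themselves.

```c
#include <stdint.h>

#define LEGACY_IPA_SHIFT 40   /* implicit legacy IPA size, per the patch comment */

/* Stand-in for clz64(): number of leading zero bits, 64 for an input of 0 */
static int clz64_sketch(uint64_t v)
{
    int n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v & (1ULL << 63))) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Number of address bits needed to reach highest_gpa */
static int ipa_shift_for(uint64_t highest_gpa)
{
    return 64 - clz64_sketch(highest_gpa);
}

/* Mirrors the patch: 0 means "legacy 40b IPA", otherwise the requested shift */
static int kvm_type_for(uint64_t highest_gpa)
{
    int shift = ipa_shift_for(highest_gpa);

    return shift > LEGACY_IPA_SHIFT ? shift : 0;
}
```

With highest_gpa at 1TB - 1 the shift is 40, so the returned type is 0,
which is the same value KVM gets when kvm_type is NULL.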
> 
> If there isn't any difference, why can't we just let the
> pre-4.0 versions behave like the new ones? No existing
> VM setup will have > 256GB of memory, so as long as there's
> no behaviour change for the <=256GB case we don't need to
> take special effort to ensure that the >256GB case continues
> to give an error message, do we ?
But don't we want to forbid any pre-4.0 machvirt from running with more
than 255GiB RAM?

Thanks

Eric
> 
>>  }
>>  DEFINE_VIRT_MACHINE(3, 1)
>>
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index 3dc7a6c5d5..c88f67a492 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -132,6 +132,8 @@ typedef struct {
>>      uint32_t iommu_phandle;
>>      int psci_conduit;
>>      hwaddr high_io_base;
>> +    hwaddr highest_gpa;
>> +    bool extended_memmap;
>>  } VirtMachineState;
>>
>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
>> --
>> 2.20.1
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-18 21:29     ` Auger Eric
@ 2019-02-19  7:49       ` Igor Mammedov
  2019-02-19  8:52         ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-19  7:49 UTC (permalink / raw)
  To: Auger Eric
  Cc: Peter Maydell, Andrew Jones, David Hildenbrand, QEMU Developers,
	Shameerali Kolothum Thodi, Dr. David Alan Gilbert, qemu-arm,
	David Gibson, Eric Auger

On Mon, 18 Feb 2019 22:29:40 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Peter,
> 
> On 2/14/19 6:29 PM, Peter Maydell wrote:
> > On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:  
> >>
> >> This patch implements the machine class kvm_type() callback.
> >> It returns the max IPA shift needed to implement the whole GPA
> >> range including the RAM and IO regions located beyond.
> >> The returned value is passed through the KVM_CREATE_VM ioctl and
> >> this allows KVM to set the stage2 tables dynamically.
> >>
> >> At this stage the RAM limit still is limited to 255GB.
> >>
> >> Setting all the existing highmem IO regions beyond the RAM
> >> allows to have a single contiguous RAM region (initial RAM and
> >> possible hotpluggable device memory). That way we do not need
> >> to do invasive changes in the EDK2 FW to support a dynamic
> >> RAM base.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >>
> >> v5 -> v6:
> >> - add some comments
> >> - high IO region cannot start before 256GiB
> >> ---
> >>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
> >>  include/hw/arm/virt.h |  2 ++
> >>  2 files changed, 52 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >> index 2b15839d0b..b90ffc2e5d 100644
> >> --- a/hw/arm/virt.c
> >> +++ b/hw/arm/virt.c
> >> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
> >>
> >>  static void virt_set_memmap(VirtMachineState *vms)
> >>  {
> >> +    MachineState *ms = MACHINE(vms);
> >>      hwaddr base;
> >>      int i;
> >>
> >> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
> >>          vms->memmap[i] = a15memmap[i];
> >>      }
> >>
> >> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
> >> +    /*
> >> +     * We now compute the base of the high IO region depending on the
> >> +     * amount of initial and device memory. The device memory start/size
> >> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
> >> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
> >> +     */
> >> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
> >> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);  
> > 
> > I don't understand this expression...  
> My intent was to align the start of the device memory on a GiB boundary,
> just after the initial RAM (ram_size). And then align the floating IO
> region on a GiB boundary after the device memory (of size
> ms->maxram_size - ms->ram_size). What do I miss?

It's not obvious what "GiB + ms->ram_size" means or where it comes from.
Maybe substitute GiB with a properly named constant/macro that is also
re-used in the memmap definition, so it is obvious that this is where the
initial RAM is mapped. Also, I'd move both ROUND_UPs into separate
expressions using reasonably named local vars, with possible overflow
checks on top of that, so one won't have to guess that it's
initial RAM end + device RAM end.
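
For illustration, the refactoring suggested above could look roughly
like the sketch below. It is only a sketch under stated assumptions:
RAMBASE, LEGACY_HIGH_IO_BASE, round_up_checked() and
compute_high_io_base() are hypothetical names, not identifiers from the
patch, and the function takes plain integers instead of MachineState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB                 (1ULL << 30)
#define RAMBASE             GiB           /* hypothetical: base of initial RAM */
#define LEGACY_HIGH_IO_BASE (256 * GiB)   /* top of the legacy initial RAM region */

/* Round x up to a multiple of align (a power of two); false on overflow */
static bool round_up_checked(uint64_t x, uint64_t align, uint64_t *out)
{
    assert(align && (align & (align - 1)) == 0);
    if (x > UINT64_MAX - (align - 1)) {
        return false;
    }
    *out = (x + align - 1) & ~(align - 1);
    return true;
}

/* Compute the high IO base as "initial RAM end + device memory end" */
static bool compute_high_io_base(uint64_t ram_size, uint64_t maxram_size,
                                 uint64_t *high_io_base)
{
    uint64_t device_memory_base, device_memory_size, device_memory_end;

    if (maxram_size < ram_size || ram_size > UINT64_MAX - RAMBASE) {
        return false;
    }
    /* Device memory starts on a GiB boundary right after the initial RAM */
    if (!round_up_checked(RAMBASE + ram_size, GiB, &device_memory_base)) {
        return false;
    }
    /* Device memory size is also rounded up to a GiB multiple */
    if (!round_up_checked(maxram_size - ram_size, GiB, &device_memory_size)) {
        return false;
    }
    if (device_memory_size > UINT64_MAX - device_memory_base) {
        return false;
    }
    device_memory_end = device_memory_base + device_memory_size;

    /* Never place the high IO region below the legacy 256GiB boundary */
    *high_io_base = device_memory_end < LEGACY_HIGH_IO_BASE ?
                    LEGACY_HIGH_IO_BASE : device_memory_end;
    return true;
}
```

Returning false on overflow leaves the choice of error reporting to the
caller, which is one option among several.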

> >   
> >> +    if (vms->high_io_base < 256 * GiB) {
> >> +        vms->high_io_base = 256 * GiB;
> >> +    }
> >>      base = vms->high_io_base;
> >>
> >>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
> >> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
> >>          vms->memmap[i].size = size;
> >>          base += size;
> >>      }
> >> +    vms->highest_gpa = base - 1;
> >>  }
> >>
> >>  static void machvirt_init(MachineState *machine)
> >> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
> >>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
> >>      bool aarch64 = true;
> >>
> >> -    virt_set_memmap(vms);
> >> +    if (!vms->extended_memmap) {
> >> +        virt_set_memmap(vms);
> >> +    }
> >>
> >>      /* We can probe only here because during property set
> >>       * KVM is not available yet
> >> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
> >>      return NULL;
> >>  }
> >>
> >> +/*
> >> + * for arm64 kvm_type [7-0] encodes the IPA size shift
> >> + */
> >> +static int virt_kvm_type(MachineState *ms, const char *type_str)
> >> +{
> >> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> >> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
> >> +    int max_pa_shift;
> >> +
> >> +    vms->extended_memmap = true;
> >> +
> >> +    virt_set_memmap(vms);
> >> +
> >> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
> >> +
> >> +    if (max_pa_shift > max_vm_phys_shift) {
> >> +        error_report("-m and ,maxmem option values "
> >> +                     "require an IPA range (%d bits) larger than "
> >> +                     "the one supported by the host (%d bits)",
> >> +                     max_pa_shift, max_vm_phys_shift);
> >> +       exit(1);
> >> +    }  
> > 
> > Presumably we should have some equivalent check for TCG, so
> > that we don't let the user create a setup which wants more
> > bits of physical address than the TCG CPU allows ?  
> kvm_type() sets the new memory map. For TCG we should stick to the 1TB
> GPA address space which should be consistent with the existing
> ID_AA64MMFR0_EL1 settings (arm/internals.h implements arm_pamax(ARMCPU
> *cpu) which decodes hardcoded cpu->id_aa64mmfr0).
> >   
> >> +    /*
> >> +     * By default we return 0 which corresponds to an implicit legacy
> >> +     * 40b IPA setting. Otherwise we return the actual requested IPA
> >> +     * logsize
> >> +     */
> >> +    return max_pa_shift > 40 ? max_pa_shift : 0;
> >> +}
> >> +
> >>  static void virt_machine_class_init(ObjectClass *oc, void *data)
> >>  {
> >>      MachineClass *mc = MACHINE_CLASS(oc);
> >> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
> >>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
> >>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
> >>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
> >> +    mc->kvm_type = virt_kvm_type;
> >>      assert(!mc->get_hotplug_handler);
> >>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> >>      hc->plug = virt_machine_device_plug_cb;
> >> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
> >>  {
> >>      virt_machine_4_0_options(mc);
> >>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
> >> +
> >> +    /* extended memory map is enabled from 4.0 onwards */
> >> +    mc->kvm_type = NULL;  
> > 
> > When is there a difference between setting this to NULL,
> > and setting it to virt_kvm_type but having the memory
> > size be <= 256GiB ?  
> There shouldn't be any difference. When size <= 255GiB we stick to the
> 1TB PA address space.
> > 
> > If there isn't any difference, why can't we just let the
> > pre-4.0 versions behave like the new ones? No existing
> > VM setup will have > 256GB of memory, so as long as there's
> > no behaviour change for the <=256GB case we don't need to
> > take special effort to ensure that the >256GB case continues
> > to give an error message, do we ?  
> But don't we want to forbid any pre-4.0 machvirt to run with more than
> 255GiB RAM?
Why would we if it doesn't break migration?

 
> Thanks
> 
> Eric
> >   
> >>  }
> >>  DEFINE_VIRT_MACHINE(3, 1)
> >>
> >> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> >> index 3dc7a6c5d5..c88f67a492 100644
> >> --- a/include/hw/arm/virt.h
> >> +++ b/include/hw/arm/virt.h
> >> @@ -132,6 +132,8 @@ typedef struct {
> >>      uint32_t iommu_phandle;
> >>      int psci_conduit;
> >>      hwaddr high_io_base;
> >> +    hwaddr highest_gpa;
> >> +    bool extended_memmap;
> >>  } VirtMachineState;
> >>
> >>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> >> --
> >> 2.20.1  
> > 
> > thanks
> > -- PMM
> >   
> 

* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-19  7:49       ` Igor Mammedov
@ 2019-02-19  8:52         ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-19  8:52 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Maydell, Andrew Jones, David Hildenbrand,
	Dr. David Alan Gilbert, Shameerali Kolothum Thodi,
	QEMU Developers, qemu-arm, Eric Auger, David Gibson

Hi Igor,

On 2/19/19 8:49 AM, Igor Mammedov wrote:
> On Mon, 18 Feb 2019 22:29:40 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Peter,
>>
>> On 2/14/19 6:29 PM, Peter Maydell wrote:
>>> On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:  
>>>>
>>>> This patch implements the machine class kvm_type() callback.
>>>> It returns the max IPA shift needed to implement the whole GPA
>>>> range including the RAM and IO regions located beyond.
>>>> The returned value is passed through the KVM_CREATE_VM ioctl and
>>>> this allows KVM to set the stage2 tables dynamically.
>>>>
>>>> At this stage the RAM limit still is limited to 255GB.
>>>>
>>>> Setting all the existing highmem IO regions beyond the RAM
>>>> allows to have a single contiguous RAM region (initial RAM and
>>>> possible hotpluggable device memory). That way we do not need
>>>> to do invasive changes in the EDK2 FW to support a dynamic
>>>> RAM base.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>
>>>> ---
>>>>
>>>> v5 -> v6:
>>>> - add some comments
>>>> - high IO region cannot start before 256GiB
>>>> ---
>>>>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
>>>>  include/hw/arm/virt.h |  2 ++
>>>>  2 files changed, 52 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index 2b15839d0b..b90ffc2e5d 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>>>>
>>>>  static void virt_set_memmap(VirtMachineState *vms)
>>>>  {
>>>> +    MachineState *ms = MACHINE(vms);
>>>>      hwaddr base;
>>>>      int i;
>>>>
>>>> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
>>>>          vms->memmap[i] = a15memmap[i];
>>>>      }
>>>>
>>>> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
>>>> +    /*
>>>> +     * We now compute the base of the high IO region depending on the
>>>> +     * amount of initial and device memory. The device memory start/size
>>>> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
>>>> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
>>>> +     */
>>>> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
>>>> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);  
>>>
>>> I don't understand this expression...  
>> My intent was to align the start of the device memory on a GiB boundary,
>> just after the initial RAM (ram_size). And then align the floating IO
>> region on a GiB boundary after the device memory (of size
>> ms->maxram_size - ms->ram_size). What do I miss?
> 
> It's not obvious what "GiB +  ms->ram_size" means and where it comes from,
I agree
> maybe substitute GiB with properly named constant/macro that's also re-used in
> memmap definition so it would be obvious that it's where initial RAM
> is mapped. Also I'd move both ROUND_UPs into separate expressions using
> reasonable named local vars and possible overflow checks on top of that,
> so one won't have to guess that it's initial RAM end + device RAM end.
Makes sense too.

Thanks

Eric
> 
>>>   
>>>> +    if (vms->high_io_base < 256 * GiB) {
>>>> +        vms->high_io_base = 256 * GiB;
>>>> +    }
>>>>      base = vms->high_io_base;
>>>>
>>>>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
>>>> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>>>>          vms->memmap[i].size = size;
>>>>          base += size;
>>>>      }
>>>> +    vms->highest_gpa = base - 1;
>>>>  }
>>>>
>>>>  static void machvirt_init(MachineState *machine)
>>>> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
>>>>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
>>>>      bool aarch64 = true;
>>>>
>>>> -    virt_set_memmap(vms);
>>>> +    if (!vms->extended_memmap) {
>>>> +        virt_set_memmap(vms);
>>>> +    }
>>>>
>>>>      /* We can probe only here because during property set
>>>>       * KVM is not available yet
>>>> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>>>>      return NULL;
>>>>  }
>>>>
>>>> +/*
>>>> + * for arm64 kvm_type [7-0] encodes the IPA size shift
>>>> + */
>>>> +static int virt_kvm_type(MachineState *ms, const char *type_str)
>>>> +{
>>>> +    VirtMachineState *vms = VIRT_MACHINE(ms);
>>>> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
>>>> +    int max_pa_shift;
>>>> +
>>>> +    vms->extended_memmap = true;
>>>> +
>>>> +    virt_set_memmap(vms);
>>>> +
>>>> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
>>>> +
>>>> +    if (max_pa_shift > max_vm_phys_shift) {
>>>> +        error_report("-m and ,maxmem option values "
>>>> +                     "require an IPA range (%d bits) larger than "
>>>> +                     "the one supported by the host (%d bits)",
>>>> +                     max_pa_shift, max_vm_phys_shift);
>>>> +       exit(1);
>>>> +    }  
>>>
>>> Presumably we should have some equivalent check for TCG, so
>>> that we don't let the user create a setup which wants more
>>> bits of physical address than the TCG CPU allows ?  
>> kvm_type() sets the new memory map. For TCG we should stick to the 1TB
>> GPA address space which should be consistent with the existing
>> ID_AA64MMFR0_EL1 settings (arm/internals.h implements arm_pamax(ARMCPU
>> *cpu) which decodes hardcoded cpu->id_aa64mmfr0).
>>>   
>>>> +    /*
>>>> +     * By default we return 0 which corresponds to an implicit legacy
>>>> +     * 40b IPA setting. Otherwise we return the actual requested IPA
>>>> +     * logsize
>>>> +     */
>>>> +    return max_pa_shift > 40 ? max_pa_shift : 0;
>>>> +}
>>>> +
>>>>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>>>>  {
>>>>      MachineClass *mc = MACHINE_CLASS(oc);
>>>> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>>>>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>>>>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>>>>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
>>>> +    mc->kvm_type = virt_kvm_type;
>>>>      assert(!mc->get_hotplug_handler);
>>>>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>>>>      hc->plug = virt_machine_device_plug_cb;
>>>> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
>>>>  {
>>>>      virt_machine_4_0_options(mc);
>>>>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
>>>> +
>>>> +    /* extended memory map is enabled from 4.0 onwards */
>>>> +    mc->kvm_type = NULL;  
>>>
>>> When is there a difference between setting this to NULL,
>>> and setting it to virt_kvm_type but having the memory
>>> size be <= 256GiB ?  
>> There shouldn't be any difference. When size <= 255GiB we stick to the
>> 1TB PA address space.
>>>
>>> If there isn't any difference, why can't we just let the
>>> pre-4.0 versions behave like the new ones? No existing
>>> VM setup will have > 256GB of memory, so as long as there's
>>> no behaviour change for the <=256GB case we don't need to
>>> take special effort to ensure that the >256GB case continues
>>> to give an error message, do we ?  
>> But don't we want to forbid any pre-4.0 machvirt to run with more than
>> 255GiB RAM?
> Why would we if it doesn't break migration?
> 
>  
>> Thanks
>>
>> Eric
>>>   
>>>>  }
>>>>  DEFINE_VIRT_MACHINE(3, 1)
>>>>
>>>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>>>> index 3dc7a6c5d5..c88f67a492 100644
>>>> --- a/include/hw/arm/virt.h
>>>> +++ b/include/hw/arm/virt.h
>>>> @@ -132,6 +132,8 @@ typedef struct {
>>>>      uint32_t iommu_phandle;
>>>>      int psci_conduit;
>>>>      hwaddr high_io_base;
>>>> +    hwaddr highest_gpa;
>>>> +    bool extended_memmap;
>>>>  } VirtMachineState;
>>>>
>>>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
>>>> --
>>>> 2.20.1  
>>>
>>> thanks
>>> -- PMM
>>>   
>>
> 
> 

* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-18  9:31   ` Igor Mammedov
@ 2019-02-19 15:53     ` Auger Eric
  2019-02-19 15:56       ` David Hildenbrand
  2019-02-21  9:36       ` Igor Mammedov
  0 siblings, 2 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-19 15:53 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

Hi Igor,

On 2/18/19 10:31 AM, Igor Mammedov wrote:
> On Tue,  5 Feb 2019 18:33:02 +0100
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> The device memory region is located after the initial RAM.
>> Its start/size are 1GB aligned.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>
>> ---
>> v4 -> v5:
>> - device memory set after the initial RAM
>>
>> v3 -> v4:
>> - remove bootinfo.device_memory_start/device_memory_size
>> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
>> ---
>>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 36 insertions(+)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 783468ba77..b683902991 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -61,6 +61,7 @@
>>  #include "hw/arm/smmuv3.h"
>>  #include "hw/mem/pc-dimm.h"
>>  #include "hw/mem/nvdimm.h"
>> +#include "hw/acpi/acpi.h"
>>  
>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>>      g_free(nodename);
>>  }
>>  
>> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
>> +{
>> +    MachineState *ms = MACHINE(vms);
>> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;
> should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
> see enforce_aligned_dimm usage and associated commit for more details
I don't understand the computation done in the pc machine. Eventually we
are likely to have more device memory than requested by the user. Why
don't we check (machine->maxram_size - machine->ram_size) >=
machine->ram_slots * GiB
instead of adding 1GiB/slot to the initial user requirements?

Also machine->maxram_size - machine->ram_size is checked to be aligned
with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
page size in accelerated mode? Is it valid to require an alignment on a
1GB boundary as I do in this patch?
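
To make the two options concrete, here is a rough sketch. The helper
names are made up for illustration, and the pc-style sizing is
paraphrased from the description above rather than copied from pc.c.
Both helpers assume maxram_size >= ram_size and a small slot count (the
patch bounds it by ACPI_MAX_RAM_SLOTS), so no overflow handling is shown.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* pc-style sizing: pad the requested device memory by 1GiB per slot */
static uint64_t device_memory_size_padded(uint64_t ram_size,
                                          uint64_t maxram_size,
                                          uint64_t ram_slots)
{
    return (maxram_size - ram_size) + ram_slots * GiB;
}

/*
 * Alternative raised above: keep the requested size and only accept
 * configurations where every slot can get at least 1GiB.
 */
static bool device_memory_fits_slots(uint64_t ram_size,
                                     uint64_t maxram_size,
                                     uint64_t ram_slots)
{
    return maxram_size - ram_size >= ram_slots * GiB;
}
```

For example, with -m 4G,maxmem=8G,slots=2 the padded variant yields a
6GiB region, while the check-only variant keeps 4GiB and accepts the
configuration.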

> 
>> +    uint64_t align = GiB;
>> +
>> +    if (!device_memory_size) {
>> +        return;
>> +    }
>> +
>> +    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
>> +        error_report("unsupported number of memory slots: %"PRIu64,
>> +                     ms->ram_slots);
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
>> +        error_report("maximum memory size must be aligned to multiple of 0x%"
>> +                     PRIx64, align);
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
>> +    ms->device_memory->base = QEMU_ALIGN_UP(GiB + ms->ram_size, GiB);
>                                                ^^^ where does this come from?
OK, I introduced a RAMBASE macro.

Thanks

Eric
> 
> 
>> +
>> +    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
>> +                       "device-memory", device_memory_size);
>> +    memory_region_add_subregion(sysmem, ms->device_memory->base,
>> +                                &ms->device_memory->mr);
>> +}
>> +
>>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>>  {
>>      const VirtMachineState *board = container_of(binfo, VirtMachineState,
>> @@ -1569,6 +1601,10 @@ static void machvirt_init(MachineState *machine)
>>                                           machine->ram_size);
>>      memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
>>  
>> +    if (vms->extended_memmap) {
>> +        create_device_memory(vms, sysmem);
>> +    }
>> +
>>      create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
>>  
>>      create_gic(vms, pic);
> 

* Re: [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine
  2019-02-18 10:07   ` Igor Mammedov
@ 2019-02-19 15:56     ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-19 15:56 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

Hi Igor,
On 2/18/19 11:07 AM, Igor Mammedov wrote:
> On Tue,  5 Feb 2019 18:32:57 +0100
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> This patch implements the machine class kvm_type() callback.
>> It returns the max IPA shift needed to implement the whole GPA
>> range including the RAM and IO regions located beyond.
>> The returned value is passed through the KVM_CREATE_VM ioctl and
>> this allows KVM to set the stage2 tables dynamically.
>>
>> At this stage the RAM limit still is limited to 255GB.
>>
>> Setting all the existing highmem IO regions beyond the RAM
>> allows to have a single contiguous RAM region (initial RAM and
>> possible hotpluggable device memory). That way we do not need
>> to do invasive changes in the EDK2 FW to support a dynamic
>> RAM base.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v5 -> v6:
>> - add some comments
>> - high IO region cannot start before 256GiB
>> ---
>>  hw/arm/virt.c         | 52 +++++++++++++++++++++++++++++++++++++++++--
>>  include/hw/arm/virt.h |  2 ++
>>  2 files changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 2b15839d0b..b90ffc2e5d 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1366,6 +1366,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>>  
>>  static void virt_set_memmap(VirtMachineState *vms)
>>  {
>> +    MachineState *ms = MACHINE(vms);
>>      hwaddr base;
>>      int i;
>>  
>> @@ -1375,7 +1376,17 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i] = a15memmap[i];
>>      }
>>  
>> -    vms->high_io_base = 256 * GiB; /* Top of the legacy initial RAM region */
>> +    /*
>> +     * We now compute the base of the high IO region depending on the
>> +     * amount of initial and device memory. The device memory start/size
>> +     * is aligned on 1GiB. We never put the high IO region below 256GiB
>> +     * so that if maxram_size is < 255GiB we keep the legacy memory map
>> +     */
>> +    vms->high_io_base = ROUND_UP(GiB + ms->ram_size, GiB) +
>> +                        ROUND_UP(ms->maxram_size - ms->ram_size, GiB);
>> +    if (vms->high_io_base < 256 * GiB) {
>> +        vms->high_io_base = 256 * GiB;
>> +    }
>>      base = vms->high_io_base;
>>  
>>      for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
>> @@ -1386,6 +1397,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i].size = size;
>>          base += size;
>>      }
>> +    vms->highest_gpa = base - 1;
>>  }
>>  
>>  static void machvirt_init(MachineState *machine)
>> @@ -1402,7 +1414,9 @@ static void machvirt_init(MachineState *machine)
>>      bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
>>      bool aarch64 = true;
>>  
>> -    virt_set_memmap(vms);
>> +    if (!vms->extended_memmap) {
>> +        virt_set_memmap(vms);
>> +    }
>>  
>>      /* We can probe only here because during property set
>>       * KVM is not available yet
>> @@ -1784,6 +1798,36 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>>      return NULL;
>>  }
>>  
>> +/*
>> + * for arm64 kvm_type [7-0] encodes the IPA size shift
>> + */
>> +static int virt_kvm_type(MachineState *ms, const char *type_str)
>> +{
>> +    VirtMachineState *vms = VIRT_MACHINE(ms);
>> +    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
>> +    int max_pa_shift;
>> +
>> +    vms->extended_memmap = true;
>> +
>> +    virt_set_memmap(vms);
>> +
>> +    max_pa_shift = 64 - clz64(vms->highest_gpa);
>> +
>> +    if (max_pa_shift > max_vm_phys_shift) {
>> +        error_report("-m and ,maxmem option values "
>> +                     "require an IPA range (%d bits) larger than "
>> +                     "the one supported by the host (%d bits)",
>> +                     max_pa_shift, max_vm_phys_shift);
>> +       exit(1);
>> +    }
>> +    /*
>> +     * By default we return 0 which corresponds to an implicit legacy
>> +     * 40b IPA setting. Otherwise we return the actual requested IPA
>> +     * logsize
>> +     */
>> +    return max_pa_shift > 40 ? max_pa_shift : 0;
>> +}
>> +
>>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>>  {
>>      MachineClass *mc = MACHINE_CLASS(oc);
>> @@ -1808,6 +1852,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
>> +    mc->kvm_type = virt_kvm_type;
>>      assert(!mc->get_hotplug_handler);
>>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>>      hc->plug = virt_machine_device_plug_cb;
>> @@ -1911,6 +1956,9 @@ static void virt_machine_3_1_options(MachineClass *mc)
>>  {
>>      virt_machine_4_0_options(mc);
>>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
>> +
>> +    /* extended memory map is enabled from 4.0 onwards */
>> +    mc->kvm_type = NULL;
> it's quite confusing, you have vms->extended_memmap and mc->kvm_type and
> the later for some reason enables device memory.
> 
> to me it seems that both are not related, device memory should work just fine
> without kvm nor dynamic IPA (within TCG supported limits).
> 
> I'd make extended_memmap virt machine class member the will enable pc-dimm support
> and then it add checks for supported IPA range on top
I agree, I did not take the TCG use case into account, and this series
does not enable device memory with TCG, which is a pity. I will
decouple the two.
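
For reference, the arithmetic in the quoted hunks can be tried out
standalone. This is only a sketch: the helper names (round_up,
high_io_base, pa_shift, kvm_type_for) are made up for illustration and
are not QEMU functions, and it ignores the sizes of the high IO regions
that virt_set_memmap() adds on top of the base before computing
highest_gpa.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round n up to the next multiple of d (d must be non-zero). */
static uint64_t round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d * d;
}

/*
 * Base of the high IO region: RAM starts at 1 GiB, the device memory
 * follows the initial RAM, both rounded up to 1 GiB. The base is never
 * placed below 256 GiB so small configurations keep the legacy map.
 */
static uint64_t high_io_base(uint64_t ram_size, uint64_t maxram_size)
{
    assert(maxram_size >= ram_size);
    uint64_t base = round_up(GiB + ram_size, GiB) +
                    round_up(maxram_size - ram_size, GiB);
    return base < 256 * GiB ? 256 * GiB : base;
}

/* Number of address bits needed to reach highest_gpa (64 - clz64()). */
static int pa_shift(uint64_t highest_gpa)
{
    int bits = 0;

    while (highest_gpa) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/* Value passed through KVM_CREATE_VM: 0 selects the legacy 40-bit IPA. */
static int kvm_type_for(int max_pa_shift)
{
    return max_pa_shift > 40 ? max_pa_shift : 0;
}
```

With -m 4G,maxmem=8G the base stays at the legacy 256 GiB, whereas
-m 512G,maxmem=1T puts it at 1025 GiB, which already needs more than
40 bits before the high IO regions are counted.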

Thanks!

Eric
> 
>>  }
>>  DEFINE_VIRT_MACHINE(3, 1)
>>  
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index 3dc7a6c5d5..c88f67a492 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -132,6 +132,8 @@ typedef struct {
>>      uint32_t iommu_phandle;
>>      int psci_conduit;
>>      hwaddr high_io_base;
>> +    hwaddr highest_gpa;
>> +    bool extended_memmap;
>>  } VirtMachineState;
>>  
>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> 


* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-19 15:53     ` Auger Eric
@ 2019-02-19 15:56       ` David Hildenbrand
  2019-02-21  9:36       ` Igor Mammedov
  1 sibling, 0 replies; 56+ messages in thread
From: David Hildenbrand @ 2019-02-19 15:56 UTC (permalink / raw)
  To: Auger Eric, Igor Mammedov
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, drjones, dgilbert, david

On 19.02.19 16:53, Auger Eric wrote:
> Hi Igor,
> 
> On 2/18/19 10:31 AM, Igor Mammedov wrote:
>> On Tue,  5 Feb 2019 18:33:02 +0100
>> Eric Auger <eric.auger@redhat.com> wrote:
>>
>>> The device memory region is located after the initial RAM.
>>> its start/size are 1GB aligned.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>
>>> ---
>>> v4 -> v5:
>>> - device memory set after the initial RAM
>>>
>>> v3 -> v4:
>>> - remove bootinfo.device_memory_start/device_memory_size
>>> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
>>> ---
>>>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 36 insertions(+)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 783468ba77..b683902991 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -61,6 +61,7 @@
>>>  #include "hw/arm/smmuv3.h"
>>>  #include "hw/mem/pc-dimm.h"
>>>  #include "hw/mem/nvdimm.h"
>>> +#include "hw/acpi/acpi.h"
>>>  
>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>>> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>>>      g_free(nodename);
>>>  }
>>>  
>>> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
>>> +{
>>> +    MachineState *ms = MACHINE(vms);
>>> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;
>> should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
>> see enforce_aligned_dimm usage and associated commit for more details
> I don't understand the computation done in pc machine. eventually we are
> likely to have more device memory than requested by the user. Why don't
> we check (machine->maxram_size - machine->ram_size) >=
> machine->ram_slots * GiB
> instead of adding 1GiB/slot to the initial user requirements?

As far as I know, this is so that each slot can potentially be aligned,
so the "memory device address space" cannot be fragmented that easily.

E.g. Linux requires a certain alignment to make full use of a DIMM.

> 
> Also machine->maxram_size - machine->ram_size is checked to be aligned
> with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
> PAGE in accelerated mode? Is it valid ro require an alignment on 1GB
> boundary as I do in this patch?

I guess the alignment check is only done because for that target,
anything having sub-page granularity cannot be used either way.

-- 

Thanks,

David / dhildenb


* Re: [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2019-02-18 10:30   ` Igor Mammedov
@ 2019-02-20 15:21     ` Auger Eric
  2019-02-21 12:16       ` Igor Mammedov
  0 siblings, 1 reply; 56+ messages in thread
From: Auger Eric @ 2019-02-20 15:21 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, dgilbert, david, drjones

Hi Igor,

On 2/18/19 11:30 AM, Igor Mammedov wrote:
> On Tue,  5 Feb 2019 18:33:04 +0100
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> From: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>
>> Pre-plug and plug handlers are prepared for NVDIMM support.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>> ---
>>  default-configs/arm-softmmu.mak |  2 ++
>>  hw/arm/virt-acpi-build.c        |  6 ++++++
>>  hw/arm/virt.c                   | 22 ++++++++++++++++++++++
>>  include/hw/arm/virt.h           |  3 +++
>>  4 files changed, 33 insertions(+)
>>
>> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
>> index dc4624794f..ddbe87ed15 100644
>> --- a/default-configs/arm-softmmu.mak
>> +++ b/default-configs/arm-softmmu.mak
>> @@ -162,3 +162,5 @@ CONFIG_HIGHBANK=y
>>  CONFIG_MUSICPAL=y
>>  CONFIG_MEM_DEVICE=y
>>  CONFIG_DIMM=y
>> +CONFIG_NVDIMM=y
>> +CONFIG_ACPI_NVDIMM=y
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 781eafaf5e..f086adfa82 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -784,6 +784,7 @@ static
>>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>  {
>>      VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>> +    MachineState *ms = MACHINE(vms);
>>      GArray *table_offsets;
>>      unsigned dsdt, xsdt;
>>      GArray *tables_blob = tables->table_data;
>> @@ -824,6 +825,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>          }
>>      }
>>  
>> +    if (vms->acpi_nvdimm_state.is_enabled) {
>> +        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
>> +                          &vms->acpi_nvdimm_state, ms->ram_slots);
>> +    }
>> +
>>      if (its_class_name() && !vmc->no_its) {
>>          acpi_add_table(table_offsets, tables_blob);
>>          build_iort(tables_blob, tables->linker, vms);
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index b683902991..0c8c2cc191 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -132,6 +132,7 @@ static const MemMapEntry a15memmap[] = {
>>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
>> +    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },
> where does this range come from and is its size sufficient?
I understand it can be anywhere in low mem and must be large enough to
contain [NVDIMM_ACPI_IO_BASE, NVDIMM_ACPI_IO_BASE + NVDIMM_ACPI_IO_LEN].
So one 64kB page should do the job?
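
A quick way to sanity-check the size is sketched below. The two macro
values are quoted from memory of include/hw/mem/nvdimm.h and should be
treated as assumptions to verify against the tree; window_fits() is a
hypothetical helper, not an existing QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (from memory of include/hw/mem/nvdimm.h, please verify). */
#define NVDIMM_ACPI_IO_BASE 0x0a18
#define NVDIMM_ACPI_IO_LEN  4

static_assert(NVDIMM_ACPI_IO_LEN > 0, "the DSM IO window must not be empty");

/*
 * Return true if the window [offset, offset + len) lies entirely inside
 * a region of region_size bytes. Written so that it cannot overflow.
 */
static bool window_fits(uint64_t region_size, uint64_t offset, uint64_t len)
{
    return offset <= region_size && len <= region_size - offset;
}
```

Under those assumed values, window_fits(0x00010000, NVDIMM_ACPI_IO_BASE,
NVDIMM_ACPI_IO_LEN) holds, so the 64kB entry would be large enough.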

Thanks

Eric
> 
>>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
>> @@ -1637,6 +1638,18 @@ static void machvirt_init(MachineState *machine)
>>  
>>      create_platform_bus(vms, pic);
>>  
>> +    if (vms->acpi_nvdimm_state.is_enabled) {
>> +        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
>> +
>> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
>> +        acpi_nvdimm_state->dsm_io.base =
>> +                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
>> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
>> +
>> +        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
>> +                               vms->fw_cfg, OBJECT(vms));
>> +    }
>> +
>>      vms->bootinfo.ram_size = machine->ram_size;
>>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
>> @@ -1822,10 +1835,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>>                               DeviceState *dev, Error **errp)
>>  {
>>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>> +    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
>>      Error *local_err = NULL;
>>  
>>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
>> +    if (local_err) {
>> +        goto out;
>> +    }
>>  
>> +    if (is_nvdimm) {
>> +        nvdimm_plug(&vms->acpi_nvdimm_state);
>> +    }
>> +
>> +out:
>>      error_propagate(errp, local_err);
>>  }
>>  
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index c88f67a492..56d73b0e86 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -37,6 +37,7 @@
>>  #include "hw/arm/arm.h"
>>  #include "sysemu/kvm.h"
>>  #include "hw/intc/arm_gicv3_common.h"
>> +#include "hw/mem/nvdimm.h"
>>  
>>  #define NUM_GICV2M_SPIS       64
>>  #define NUM_VIRTIO_TRANSPORTS 32
>> @@ -77,6 +78,7 @@ enum {
>>      VIRT_GPIO,
>>      VIRT_SECURE_UART,
>>      VIRT_SECURE_MEM,
>> +    VIRT_ACPI_IO,
>>      VIRT_LOWMEMMAP_LAST,
>>  };
>>  
>> @@ -134,6 +136,7 @@ typedef struct {
>>      hwaddr high_io_base;
>>      hwaddr highest_gpa;
>>      bool extended_memmap;
>> +    AcpiNVDIMMState acpi_nvdimm_state;
>>  } VirtMachineState;
>>  
>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> 


* Re: [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2019-02-18  8:58   ` Igor Mammedov
@ 2019-02-20 15:30     ` Auger Eric
  2019-02-21  9:27       ` Igor Mammedov
  0 siblings, 1 reply; 56+ messages in thread
From: Auger Eric @ 2019-02-20 15:30 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

Hi Igor,

On 2/18/19 9:58 AM, Igor Mammedov wrote:
> On Tue,  5 Feb 2019 18:33:00 +0100
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>
>> This patch add memory nodes corresponding to PC-DIMM regions.
> s/add/adds/ or s/This patch add/Add/
> 
>>
>> NV_DIMM and ACPI_NVDIMM configs are not yet set for ARM so we
>> don't need to care about NV-DIMM at this stage.
>>
>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v3 -> v4:
>> - git rid of @base and @len in fdt_add_hotpluggable_memory_nodes
>>
>> v1 -> v2:
>> - added qapi_free_MemoryDeviceInfoList and simplify the loop
>> ---
>>  hw/arm/boot.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
>> index 2ef367e15b..2a70e8aa82 100644
>> --- a/hw/arm/boot.c
>> +++ b/hw/arm/boot.c
>> @@ -19,6 +19,7 @@
>>  #include "sysemu/numa.h"
>>  #include "hw/boards.h"
>>  #include "hw/loader.h"
>> +#include "hw/mem/memory-device.h"
>>  #include "elf.h"
>>  #include "sysemu/device_tree.h"
>>  #include "qemu/config-file.h"
>> @@ -526,6 +527,34 @@ static void fdt_add_psci_node(void *fdt)
>>      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
>>  }
>>  
>> +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
>> +                                             uint32_t acells, uint32_t scells) {
>> +    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
>> +    MemoryDeviceInfo *mi;
>> +    PCDIMMDeviceInfo *di;
>> +    bool is_nvdimm;
>> +    int ret = 0;
>> +
>> +    for (info = info_list; info != NULL; info = info->next) {
>> +        mi = info->value;
>> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
>> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
>> +
>> +        if (is_nvdimm) {
>> +            ret = -ENOENT; /* NV-DIMM not yet supported */
>> +        } else {
>> +            ret = fdt_add_memory_node(fdt, acells, di->addr,
>> +                                      scells, di->size, di->node);
>> +        }
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +    }
>> +out:
>> +    qapi_free_MemoryDeviceInfoList(info_list);
>> +    return ret;
>> +}
>> +
>>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>>                   hwaddr addr_limit, AddressSpace *as)
>>  {
>> @@ -621,6 +650,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>>          }
>>      }
>>  
>> +    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
>> +    if (rc < 0) {
>> +            fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
> error message is rather vague, user using nvdimms + pc-dimms on CLI won't have
> a clue that the former is not supported.
> Suggest pass in error_fatal as argument and report more specific error from
> fdt_add_hotpluggable_memory_nodes()
> 
> does this run on reboot?

Yes, after a QMP system_reset I can see the DIMMs in the guest's /proc/meminfo

Thanks

Eric
> 
>> +            goto fail;
>> +    }
>> +
>>      rc = fdt_path_offset(fdt, "/chosen");
>>      if (rc < 0) {
>>          qemu_fdt_add_subnode(fdt, "/chosen");
> 


* Re: [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers
  2019-02-14 16:36   ` Peter Maydell
@ 2019-02-21  6:15     ` Alexey Kardashevskiy
  0 siblings, 0 replies; 56+ messages in thread
From: Alexey Kardashevskiy @ 2019-02-21  6:15 UTC (permalink / raw)
  To: Peter Maydell, Eric Auger
  Cc: Andrew Jones, David Hildenbrand, QEMU Developers,
	Shameerali Kolothum Thodi, Dr. David Alan Gilbert, qemu-arm,
	Igor Mammedov, David Gibson, Eric Auger



On 15/02/2019 03:36, Peter Maydell wrote:
> On Tue, 5 Feb 2019 at 17:33, Eric Auger <eric.auger@redhat.com> wrote:
>>
>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> Since Linux'es ab66dcc76d "powerpc: generate uapi header and system call
>> table files" there are 2 new files: unistd_32.h and unistd_64.h. These
>> files content is moved from unistd.h so now we have to copy new files
>> as well, just like we already do for other architectures; this does it
>> for MIPS as well.
>>
>> Also, v5.0-rc2 moved vhost bits around in 4b86713236e4bd
>> "vhost: split structs into a separate header file", add those too.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> I think this fix is handled by commit a0a6ef91a4a4edde27
> (now in master), yes ?

uff, just noticed this mail. yes, it is done by a0a6ef91a4a4edde27. Thanks,


-- 
Alexey


* Re: [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2019-02-20 15:30     ` Auger Eric
@ 2019-02-21  9:27       ` Igor Mammedov
  0 siblings, 0 replies; 56+ messages in thread
From: Igor Mammedov @ 2019-02-21  9:27 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, drjones, dgilbert, david

On Wed, 20 Feb 2019 16:30:13 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Igor,
> 
> On 2/18/19 9:58 AM, Igor Mammedov wrote:
> > On Tue,  5 Feb 2019 18:33:00 +0100
> > Eric Auger <eric.auger@redhat.com> wrote:
> >   
> >> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> >>
> >> This patch add memory nodes corresponding to PC-DIMM regions.  
> > s/add/adds/ or s/This patch add/Add/
> >   
> >>
> >> NV_DIMM and ACPI_NVDIMM configs are not yet set for ARM so we
> >> don't need to care about NV-DIMM at this stage.
> >>
> >> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >> v3 -> v4:
> >> - git rid of @base and @len in fdt_add_hotpluggable_memory_nodes
> >>
> >> v1 -> v2:
> >> - added qapi_free_MemoryDeviceInfoList and simplify the loop
> >> ---
> >>  hw/arm/boot.c | 35 +++++++++++++++++++++++++++++++++++
> >>  1 file changed, 35 insertions(+)
> >>
> >> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> >> index 2ef367e15b..2a70e8aa82 100644
> >> --- a/hw/arm/boot.c
> >> +++ b/hw/arm/boot.c
> >> @@ -19,6 +19,7 @@
> >>  #include "sysemu/numa.h"
> >>  #include "hw/boards.h"
> >>  #include "hw/loader.h"
> >> +#include "hw/mem/memory-device.h"
> >>  #include "elf.h"
> >>  #include "sysemu/device_tree.h"
> >>  #include "qemu/config-file.h"
> >> @@ -526,6 +527,34 @@ static void fdt_add_psci_node(void *fdt)
> >>      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
> >>  }
> >>  
> >> +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
> >> +                                             uint32_t acells, uint32_t scells) {
> >> +    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
> >> +    MemoryDeviceInfo *mi;
> >> +    PCDIMMDeviceInfo *di;
> >> +    bool is_nvdimm;
> >> +    int ret = 0;
> >> +
> >> +    for (info = info_list; info != NULL; info = info->next) {
> >> +        mi = info->value;
> >> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
> >> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
> >> +
> >> +        if (is_nvdimm) {
> >> +            ret = -ENOENT; /* NV-DIMM not yet supported */
> >> +        } else {
> >> +            ret = fdt_add_memory_node(fdt, acells, di->addr,
> >> +                                      scells, di->size, di->node);
> >> +        }
> >> +        if (ret < 0) {
> >> +            goto out;
> >> +        }
> >> +    }
> >> +out:
> >> +    qapi_free_MemoryDeviceInfoList(info_list);
> >> +    return ret;
> >> +}
> >> +
> >>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >>                   hwaddr addr_limit, AddressSpace *as)
> >>  {
> >> @@ -621,6 +650,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >>          }
> >>      }
> >>  
> >> +    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
> >> +    if (rc < 0) {
> >> +            fprintf(stderr, "couldn't add hotpluggable memory nodes\n");  
> > error message is rather vague, user using nvdimms + pc-dimms on CLI won't have
> > a clue that the former is not supported.
> > Suggest pass in error_fatal as argument and report more specific error from
> > fdt_add_hotpluggable_memory_nodes()
> > 
> > does this run on reboot?  
> 
> Yes after a QMP system_reset I can see the DIMMS on guest in /proc/meminfo
It seems that the DIMMs come from ACPI and not from the DTB.

My worry was that arm_load_dtb() always ends up with exit(1) on failure, but it
seems that it is not called on reboot, so it should be fine.

> 
> Thanks
> 
> Eric
> >   
> >> +            goto fail;
> >> +    }
> >> +
> >>      rc = fdt_path_offset(fdt, "/chosen");
> >>      if (rc < 0) {
> >>          qemu_fdt_add_subnode(fdt, "/chosen");  
> >   


* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-19 15:53     ` Auger Eric
  2019-02-19 15:56       ` David Hildenbrand
@ 2019-02-21  9:36       ` Igor Mammedov
  2019-02-21 12:37         ` Auger Eric
  1 sibling, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-21  9:36 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, drjones, david, qemu-devel,
	shameerali.kolothum.thodi, dgilbert, qemu-arm, david,
	eric.auger.pro

On Tue, 19 Feb 2019 16:53:22 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Igor,
> 
> On 2/18/19 10:31 AM, Igor Mammedov wrote:
> > On Tue,  5 Feb 2019 18:33:02 +0100
> > Eric Auger <eric.auger@redhat.com> wrote:
> >   
> >> The device memory region is located after the initial RAM.
> >> its start/size are 1GB aligned.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> >>
> >> ---
> >> v4 -> v5:
> >> - device memory set after the initial RAM
> >>
> >> v3 -> v4:
> >> - remove bootinfo.device_memory_start/device_memory_size
> >> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
> >> ---
> >>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 36 insertions(+)
> >>
> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >> index 783468ba77..b683902991 100644
> >> --- a/hw/arm/virt.c
> >> +++ b/hw/arm/virt.c
> >> @@ -61,6 +61,7 @@
> >>  #include "hw/arm/smmuv3.h"
> >>  #include "hw/mem/pc-dimm.h"
> >>  #include "hw/mem/nvdimm.h"
> >> +#include "hw/acpi/acpi.h"
> >>  
> >>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
> >>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> >> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
> >>      g_free(nodename);
> >>  }
> >>  
> >> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
> >> +{
> >> +    MachineState *ms = MACHINE(vms);
> >> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;  
> > should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
> > see enforce_aligned_dimm usage and associated commit for more details  
> I don't understand the computation done in pc machine. eventually we are
> likely to have more device memory than requested by the user. Why don't
> we check (machine->maxram_size - machine->ram_size) >=
> machine->ram_slots * GiB
> instead of adding 1GiB/slot to the initial user requirements?
> 
> Also machine->maxram_size - machine->ram_size is checked to be aligned
> with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
> PAGE in accelerated mode? Is it valid ro require an alignment on 1GB
> boundary as I do in this patch?
See commit 085f8e88b for an explanation.
What we are basically doing there is sizing the hotpluggable address space
so that the largest possible huge-page-aligned DIMM can be plugged in
successfully even if the address space is fragmented.
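
As a rough sketch of that sizing rule (my reading of the pc machine's
enforce_aligned_dimm handling, with a made-up helper name, not the
actual QEMU code): the region gets 1GiB of slack per slot on top of
what the user asked for, so every DIMM can be bumped up to a 1GiB
boundary in the worst case.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/*
 * Size of the hotpluggable address space: the user-requested amount
 * (maxram_size - ram_size) plus 1 GiB of alignment slack per slot.
 */
static uint64_t device_memory_region_size(uint64_t ram_size,
                                          uint64_t maxram_size,
                                          uint64_t ram_slots)
{
    assert(maxram_size >= ram_size);
    return (maxram_size - ram_size) + ram_slots * GiB;
}
```

For example, -m 4G,slots=4,maxmem=8G would reserve 8GiB of address
space for 4GiB of pluggable memory. Only address space is reserved; the
amount of memory that can actually be plugged is still bounded by
maxmem.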

> 
> >   
> >> +    uint64_t align = GiB;
> >> +
> >> +    if (!device_memory_size) {
> >> +        return;
> >> +    }
> >> +
> >> +    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
> >> +        error_report("unsupported number of memory slots: %"PRIu64,
> >> +                     ms->ram_slots);
> >> +        exit(EXIT_FAILURE);
> >> +    }
> >> +
> >> +    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
> >> +        error_report("maximum memory size must be aligned to multiple of 0x%"
> >> +                     PRIx64, align);
> >> +        exit(EXIT_FAILURE);
> >> +    }
> >> +
> >> +    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
> >> +    ms->device_memory->base = QEMU_ALIGN_UP(GiB + ms->ram_size, GiB);  
> >                                                ^^^ where does this come from?  
> OK, introduced RAMBASE macro
> 
> Thanks
> 
> Eric
> > 
> >   
> >> +
> >> +    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
> >> +                       "device-memory", device_memory_size);
> >> +    memory_region_add_subregion(sysmem, ms->device_memory->base,
> >> +                                &ms->device_memory->mr);
> >> +}
> >> +
> >>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
> >>  {
> >>      const VirtMachineState *board = container_of(binfo, VirtMachineState,
> >> @@ -1569,6 +1601,10 @@ static void machvirt_init(MachineState *machine)
> >>                                           machine->ram_size);
> >>      memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
> >>  
> >> +    if (vms->extended_memmap) {
> >> +        create_device_memory(vms, sysmem);
> >> +    }
> >> +
> >>      create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
> >>  
> >>      create_gic(vms, pic);  
> >   
> 


* Re: [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2019-02-20 15:21     ` Auger Eric
@ 2019-02-21 12:16       ` Igor Mammedov
  2019-02-21 12:34         ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: Igor Mammedov @ 2019-02-21 12:16 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, david, dgilbert, david, drjones

On Wed, 20 Feb 2019 16:21:05 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Igor,
> 
> On 2/18/19 11:30 AM, Igor Mammedov wrote:
> > On Tue,  5 Feb 2019 18:33:04 +0100
> > Eric Auger <eric.auger@redhat.com> wrote:
> >   
> >> From: Kwangwoo Lee <kwangwoo.lee@sk.com>
> >>
> >> Pre-plug and plug handlers are prepared for NVDIMM support.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> >> ---
> >>  default-configs/arm-softmmu.mak |  2 ++
> >>  hw/arm/virt-acpi-build.c        |  6 ++++++
> >>  hw/arm/virt.c                   | 22 ++++++++++++++++++++++
> >>  include/hw/arm/virt.h           |  3 +++
> >>  4 files changed, 33 insertions(+)
> >>
> >> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> >> index dc4624794f..ddbe87ed15 100644
> >> --- a/default-configs/arm-softmmu.mak
> >> +++ b/default-configs/arm-softmmu.mak
> >> @@ -162,3 +162,5 @@ CONFIG_HIGHBANK=y
> >>  CONFIG_MUSICPAL=y
> >>  CONFIG_MEM_DEVICE=y
> >>  CONFIG_DIMM=y
> >> +CONFIG_NVDIMM=y
> >> +CONFIG_ACPI_NVDIMM=y
> >> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> >> index 781eafaf5e..f086adfa82 100644
> >> --- a/hw/arm/virt-acpi-build.c
> >> +++ b/hw/arm/virt-acpi-build.c
> >> @@ -784,6 +784,7 @@ static
> >>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >>  {
> >>      VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
> >> +    MachineState *ms = MACHINE(vms);
> >>      GArray *table_offsets;
> >>      unsigned dsdt, xsdt;
> >>      GArray *tables_blob = tables->table_data;
> >> @@ -824,6 +825,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >>          }
> >>      }
> >>  
> >> +    if (vms->acpi_nvdimm_state.is_enabled) {
> >> +        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
> >> +                          &vms->acpi_nvdimm_state, ms->ram_slots);
> >> +    }
> >> +
> >>      if (its_class_name() && !vmc->no_its) {
> >>          acpi_add_table(table_offsets, tables_blob);
> >>          build_iort(tables_blob, tables->linker, vms);
> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >> index b683902991..0c8c2cc191 100644
> >> --- a/hw/arm/virt.c
> >> +++ b/hw/arm/virt.c
> >> @@ -132,6 +132,7 @@ static const MemMapEntry a15memmap[] = {
> >>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> >>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> >>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> >> +    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },  
> > where does this range come from and is its size sufficient?  
> I understand it can be anywhere in low mem and must be large enough to
> contain [NVDIMM_ACPI_IO_BASE, NDIMM_ACPI_IO_BASE + NVDIMM_ACPI_IO_LEN].
It looked to me like a generic ACPI area rather than an NVDIMM one,
so I'd suggest naming it properly and probably using NVDIMM_ACPI_IO_BASE & co
to add this entry, so that the reader won't have to wonder where these magic
numbers come from.

> So one 64kB page should do the job?
Should it be located in lowmem?
(do we care about device_memory & AVMF & ACPI & not 64bit guests?)


> Thanks
> 
> Eric
> >   
> >>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> >>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> >> @@ -1637,6 +1638,18 @@ static void machvirt_init(MachineState *machine)
> >>  
> >>      create_platform_bus(vms, pic);
> >>  
> >> +    if (vms->acpi_nvdimm_state.is_enabled) {
> >> +        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
> >> +
> >> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
> >> +        acpi_nvdimm_state->dsm_io.base =
> >> +                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
> >> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
> >> +
> >> +        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
> >> +                               vms->fw_cfg, OBJECT(vms));
> >> +    }
> >> +
> >>      vms->bootinfo.ram_size = machine->ram_size;
> >>      vms->bootinfo.kernel_filename = machine->kernel_filename;
> >>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> >> @@ -1822,10 +1835,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >>                               DeviceState *dev, Error **errp)
> >>  {
> >>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >> +    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
> >>      Error *local_err = NULL;
> >>  
> >>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> >> +    if (local_err) {
> >> +        goto out;
> >> +    }
> >>  
> >> +    if (is_nvdimm) {
> >> +        nvdimm_plug(&vms->acpi_nvdimm_state);
> >> +    }
> >> +
> >> +out:
> >>      error_propagate(errp, local_err);
> >>  }
> >>  
> >> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> >> index c88f67a492..56d73b0e86 100644
> >> --- a/include/hw/arm/virt.h
> >> +++ b/include/hw/arm/virt.h
> >> @@ -37,6 +37,7 @@
> >>  #include "hw/arm/arm.h"
> >>  #include "sysemu/kvm.h"
> >>  #include "hw/intc/arm_gicv3_common.h"
> >> +#include "hw/mem/nvdimm.h"
> >>  
> >>  #define NUM_GICV2M_SPIS       64
> >>  #define NUM_VIRTIO_TRANSPORTS 32
> >> @@ -77,6 +78,7 @@ enum {
> >>      VIRT_GPIO,
> >>      VIRT_SECURE_UART,
> >>      VIRT_SECURE_MEM,
> >> +    VIRT_ACPI_IO,
> >>      VIRT_LOWMEMMAP_LAST,
> >>  };
> >>  
> >> @@ -134,6 +136,7 @@ typedef struct {
> >>      hwaddr high_io_base;
> >>      hwaddr highest_gpa;
> >>      bool extended_memmap;
> >> +    AcpiNVDIMMState acpi_nvdimm_state;
> >>  } VirtMachineState;
> >>  
> >>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)  
> >   


* Re: [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2019-02-21 12:16       ` Igor Mammedov
@ 2019-02-21 12:34         ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-21 12:34 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell, drjones, david, qemu-devel,
	shameerali.kolothum.thodi, dgilbert, qemu-arm, david,
	eric.auger.pro

Hi Igor,

On 2/21/19 1:16 PM, Igor Mammedov wrote:
> On Wed, 20 Feb 2019 16:21:05 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Igor,
>>
>> On 2/18/19 11:30 AM, Igor Mammedov wrote:
>>> On Tue,  5 Feb 2019 18:33:04 +0100
>>> Eric Auger <eric.auger@redhat.com> wrote:
>>>   
>>>> From: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>>
>>>> Pre-plug and plug handlers are prepared for NVDIMM support.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>> ---
>>>>  default-configs/arm-softmmu.mak |  2 ++
>>>>  hw/arm/virt-acpi-build.c        |  6 ++++++
>>>>  hw/arm/virt.c                   | 22 ++++++++++++++++++++++
>>>>  include/hw/arm/virt.h           |  3 +++
>>>>  4 files changed, 33 insertions(+)
>>>>
>>>> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
>>>> index dc4624794f..ddbe87ed15 100644
>>>> --- a/default-configs/arm-softmmu.mak
>>>> +++ b/default-configs/arm-softmmu.mak
>>>> @@ -162,3 +162,5 @@ CONFIG_HIGHBANK=y
>>>>  CONFIG_MUSICPAL=y
>>>>  CONFIG_MEM_DEVICE=y
>>>>  CONFIG_DIMM=y
>>>> +CONFIG_NVDIMM=y
>>>> +CONFIG_ACPI_NVDIMM=y
>>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>>> index 781eafaf5e..f086adfa82 100644
>>>> --- a/hw/arm/virt-acpi-build.c
>>>> +++ b/hw/arm/virt-acpi-build.c
>>>> @@ -784,6 +784,7 @@ static
>>>>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>>>  {
>>>>      VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>>>> +    MachineState *ms = MACHINE(vms);
>>>>      GArray *table_offsets;
>>>>      unsigned dsdt, xsdt;
>>>>      GArray *tables_blob = tables->table_data;
>>>> @@ -824,6 +825,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>>>          }
>>>>      }
>>>>  
>>>> +    if (vms->acpi_nvdimm_state.is_enabled) {
>>>> +        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
>>>> +                          &vms->acpi_nvdimm_state, ms->ram_slots);
>>>> +    }
>>>> +
>>>>      if (its_class_name() && !vmc->no_its) {
>>>>          acpi_add_table(table_offsets, tables_blob);
>>>>          build_iort(tables_blob, tables->linker, vms);
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index b683902991..0c8c2cc191 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -132,6 +132,7 @@ static const MemMapEntry a15memmap[] = {
>>>>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>>>>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>>>>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
>>>> +    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },  
>>> where does this range come from and is its size sufficient?  
>> I understand it can be anywhere in low mem and must be large enough to
>> contain [NVDIMM_ACPI_IO_BASE, NVDIMM_ACPI_IO_BASE + NVDIMM_ACPI_IO_LEN].
> it looked to me like a generic ACPI area rather than an NVDIMM one,
> so I'd suggest naming it properly and probably using NVDIMM_ACPI_IO_BASE & co
> to add this entry, so that the reader won't have to wonder where these magic
> numbers come from.
OK. I will rename the region in v8
> 
>> So one 64kB page should do the job?
> Should it be located in lowmem?
> (do we care about device_memory & AVMF & ACPI & not 64bit guests?)
I guess not. We could easily fit this region into the existing low mem
IO, that's what I meant actually.
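
(Illustration only.) The "does it fit" question is plain arithmetic. Below is a
minimal sketch, assuming NVDIMM_ACPI_IO_BASE is 0x0a18 and NVDIMM_ACPI_IO_LEN
is 4, and using the 64kB window size from the v6 memmap entry.
range_fits_window() is a helper invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ACPI_IO_WINDOW_SIZE 0x00010000ULL /* size of VIRT_ACPI_IO in v6 */
#define NVDIMM_ACPI_IO_BASE 0x0a18ULL     /* assumed value */
#define NVDIMM_ACPI_IO_LEN  4ULL          /* assumed value */

/*
 * Return 1 if [off, off + len) lies entirely inside a window of
 * window_size bytes. Written so that off + len cannot overflow.
 */
static int range_fits_window(uint64_t window_size, uint64_t off, uint64_t len)
{
    assert(len > 0);
    return off < window_size && len <= window_size - off;
}
```

Under these assumptions the DSM IO range ends at offset 0x0a1c, far below the
64kB window size, so a much smaller region would also do.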

Thanks

Eric
> 
> 
>> Thanks
>>
>> Eric
>>>   
>>>>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>>>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>>>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
>>>> @@ -1637,6 +1638,18 @@ static void machvirt_init(MachineState *machine)
>>>>  
>>>>      create_platform_bus(vms, pic);
>>>>  
>>>> +    if (vms->acpi_nvdimm_state.is_enabled) {
>>>> +        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
>>>> +
>>>> +        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
>>>> +        acpi_nvdimm_state->dsm_io.base =
>>>> +                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
>>>> +        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
>>>> +
>>>> +        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
>>>> +                               vms->fw_cfg, OBJECT(vms));
>>>> +    }
>>>> +
>>>>      vms->bootinfo.ram_size = machine->ram_size;
>>>>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>>>>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
>>>> @@ -1822,10 +1835,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>>>>                               DeviceState *dev, Error **errp)
>>>>  {
>>>>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>>>> +    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
>>>>      Error *local_err = NULL;
>>>>  
>>>>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
>>>> +    if (local_err) {
>>>> +        goto out;
>>>> +    }
>>>>  
>>>> +    if (is_nvdimm) {
>>>> +        nvdimm_plug(&vms->acpi_nvdimm_state);
>>>> +    }
>>>> +
>>>> +out:
>>>>      error_propagate(errp, local_err);
>>>>  }
>>>>  
>>>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>>>> index c88f67a492..56d73b0e86 100644
>>>> --- a/include/hw/arm/virt.h
>>>> +++ b/include/hw/arm/virt.h
>>>> @@ -37,6 +37,7 @@
>>>>  #include "hw/arm/arm.h"
>>>>  #include "sysemu/kvm.h"
>>>>  #include "hw/intc/arm_gicv3_common.h"
>>>> +#include "hw/mem/nvdimm.h"
>>>>  
>>>>  #define NUM_GICV2M_SPIS       64
>>>>  #define NUM_VIRTIO_TRANSPORTS 32
>>>> @@ -77,6 +78,7 @@ enum {
>>>>      VIRT_GPIO,
>>>>      VIRT_SECURE_UART,
>>>>      VIRT_SECURE_MEM,
>>>> +    VIRT_ACPI_IO,
>>>>      VIRT_LOWMEMMAP_LAST,
>>>>  };
>>>>  
>>>> @@ -134,6 +136,7 @@ typedef struct {
>>>>      hwaddr high_io_base;
>>>>      hwaddr highest_gpa;
>>>>      bool extended_memmap;
>>>> +    AcpiNVDIMMState acpi_nvdimm_state;
>>>>  } VirtMachineState;
>>>>  
>>>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)  
>>>   
> 
> 


* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-21  9:36       ` Igor Mammedov
@ 2019-02-21 12:37         ` Auger Eric
  2019-02-21 12:44           ` David Hildenbrand
  0 siblings, 1 reply; 56+ messages in thread
From: Auger Eric @ 2019-02-21 12:37 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell, drjones, david, dgilbert,
	shameerali.kolothum.thodi, qemu-devel, qemu-arm, eric.auger.pro,
	david

Hi Igor,

On 2/21/19 10:36 AM, Igor Mammedov wrote:
> On Tue, 19 Feb 2019 16:53:22 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Igor,
>>
>> On 2/18/19 10:31 AM, Igor Mammedov wrote:
>>> On Tue,  5 Feb 2019 18:33:02 +0100
>>> Eric Auger <eric.auger@redhat.com> wrote:
>>>   
>>>> The device memory region is located after the initial RAM.
>>>> its start/size are 1GB aligned.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>>
>>>> ---
>>>> v4 -> v5:
>>>> - device memory set after the initial RAM
>>>>
>>>> v3 -> v4:
>>>> - remove bootinfo.device_memory_start/device_memory_size
>>>> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
>>>> ---
>>>>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 36 insertions(+)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index 783468ba77..b683902991 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -61,6 +61,7 @@
>>>>  #include "hw/arm/smmuv3.h"
>>>>  #include "hw/mem/pc-dimm.h"
>>>>  #include "hw/mem/nvdimm.h"
>>>> +#include "hw/acpi/acpi.h"
>>>>  
>>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>>>> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>>>>      g_free(nodename);
>>>>  }
>>>>  
>>>> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
>>>> +{
>>>> +    MachineState *ms = MACHINE(vms);
>>>> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;  
>>> should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
>>> see enforce_aligned_dimm usage and associated commit for more details  
>> I don't understand the computation done in pc machine. eventually we are
>> likely to have more device memory than requested by the user. Why don't
>> we check (machine->maxram_size - machine->ram_size) >=
>> machine->ram_slots * GiB
>> instead of adding 1GiB/slot to the initial user requirements?
>>
>> Also machine->maxram_size - machine->ram_size is checked to be aligned
>> with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
>> PAGE in accelerated mode? Is it valid to require an alignment on 1GB
>> boundary as I do in this patch?
> See commit 085f8e88b for explanation,
> What we are basically doing there is sizing the hotpluggable address space
> to allow the max possible huge-page-aligned DIMM to be successfully plugged in
> even if the address space is fragmented.
In v7, I also added ram_slots * GiB to (maxram_size - ram_size).
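
(Illustration only, not the v7 code.) A minimal sketch of that sizing, assuming
the device memory base is the end of the initial RAM rounded up to 1GiB and the
size is (maxram_size - ram_size) plus 1GiB per slot. The helper names are
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round v up to the next multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/*
 * Size of the hotpluggable address space: what the user asked for,
 * plus 1GiB of slack per slot so that each DIMM can be 1GiB-aligned
 * even when the space is fragmented.
 */
static uint64_t device_memory_size(uint64_t ram_size, uint64_t maxram_size,
                                   uint64_t ram_slots)
{
    if (maxram_size <= ram_size) {
        return 0; /* no device memory requested */
    }
    return maxram_size - ram_size + ram_slots * GiB;
}

/* Device memory starts right after the initial RAM, 1GiB-aligned. */
static uint64_t device_memory_base(uint64_t ram_base, uint64_t ram_size)
{
    return align_up(ram_base + ram_size, GiB);
}
```

For example, with -m 4G,maxmem=16G,slots=4 this gives a 16GiB device memory
region (12GiB requested plus 4GiB of slack) starting at 5GiB.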

Thanks

Eric
> 
>>
>>>   
>>>> +    uint64_t align = GiB;
>>>> +
>>>> +    if (!device_memory_size) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
>>>> +        error_report("unsupported number of memory slots: %"PRIu64,
>>>> +                     ms->ram_slots);
>>>> +        exit(EXIT_FAILURE);
>>>> +    }
>>>> +
>>>> +    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
>>>> +        error_report("maximum memory size must be aligned to multiple of 0x%"
>>>> +                     PRIx64, align);
>>>> +        exit(EXIT_FAILURE);
>>>> +    }
>>>> +
>>>> +    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
>>>> +    ms->device_memory->base = QEMU_ALIGN_UP(GiB + ms->ram_size, GiB);  
>>>                                                ^^^ where does this come from?  
>> OK, introduced RAMBASE macro
>>
>> Thanks
>>
>> Eric
>>>
>>>   
>>>> +
>>>> +    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
>>>> +                       "device-memory", device_memory_size);
>>>> +    memory_region_add_subregion(sysmem, ms->device_memory->base,
>>>> +                                &ms->device_memory->mr);
>>>> +}
>>>> +
>>>>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>>>>  {
>>>>      const VirtMachineState *board = container_of(binfo, VirtMachineState,
>>>> @@ -1569,6 +1601,10 @@ static void machvirt_init(MachineState *machine)
>>>>                                           machine->ram_size);
>>>>      memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
>>>>  
>>>> +    if (vms->extended_memmap) {
>>>> +        create_device_memory(vms, sysmem);
>>>> +    }
>>>> +
>>>>      create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
>>>>  
>>>>      create_gic(vms, pic);  
>>>   
>>
> 
> 


* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-21 12:37         ` Auger Eric
@ 2019-02-21 12:44           ` David Hildenbrand
  2019-02-21 13:07             ` Auger Eric
  0 siblings, 1 reply; 56+ messages in thread
From: David Hildenbrand @ 2019-02-21 12:44 UTC (permalink / raw)
  To: Auger Eric, Igor Mammedov
  Cc: peter.maydell, drjones, dgilbert, shameerali.kolothum.thodi,
	qemu-devel, qemu-arm, eric.auger.pro, david

On 21.02.19 13:37, Auger Eric wrote:
> Hi Igor,
> 
> On 2/21/19 10:36 AM, Igor Mammedov wrote:
>> On Tue, 19 Feb 2019 16:53:22 +0100
>> Auger Eric <eric.auger@redhat.com> wrote:
>>
>>> Hi Igor,
>>>
>>> On 2/18/19 10:31 AM, Igor Mammedov wrote:
>>>> On Tue,  5 Feb 2019 18:33:02 +0100
>>>> Eric Auger <eric.auger@redhat.com> wrote:
>>>>   
>>>>> The device memory region is located after the initial RAM.
>>>>> its start/size are 1GB aligned.
>>>>>
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>>>
>>>>> ---
>>>>> v4 -> v5:
>>>>> - device memory set after the initial RAM
>>>>>
>>>>> v3 -> v4:
>>>>> - remove bootinfo.device_memory_start/device_memory_size
>>>>> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
>>>>> ---
>>>>>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 36 insertions(+)
>>>>>
>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>> index 783468ba77..b683902991 100644
>>>>> --- a/hw/arm/virt.c
>>>>> +++ b/hw/arm/virt.c
>>>>> @@ -61,6 +61,7 @@
>>>>>  #include "hw/arm/smmuv3.h"
>>>>>  #include "hw/mem/pc-dimm.h"
>>>>>  #include "hw/mem/nvdimm.h"
>>>>> +#include "hw/acpi/acpi.h"
>>>>>  
>>>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>>>>> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>>>>>      g_free(nodename);
>>>>>  }
>>>>>  
>>>>> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
>>>>> +{
>>>>> +    MachineState *ms = MACHINE(vms);
>>>>> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;  
>>>> should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
>>>> see enforce_aligned_dimm usage and associated commit for more details  
>>> I don't understand the computation done in pc machine. eventually we are
>>> likely to have more device memory than requested by the user. Why don't
>>> we check (machine->maxram_size - machine->ram_size) >=
>>> machine->ram_slots * GiB
>>> instead of adding 1GiB/slot to the initial user requirements?
>>>
>>> Also machine->maxram_size - machine->ram_size is checked to be aligned
>>> with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
>>> PAGE in accelerated mode? Is it valid to require an alignment on 1GB
>>> boundary as I do in this patch?
>> See commit 085f8e88b for explanation,
>> What we are basically doing there is sizing the hotpluggable address space
>> to allow the max possible huge-page-aligned DIMM to be successfully plugged in
>> even if the address space is fragmented.
> In v7, I also added ram_slots * GiB to (maxram_size - ram_size).
> 

Depending on the way the system handles it, this might be confusing for
the end user and has to be documented somewhere.

E.g. if there are certain memory limits (say 2TB) and the user specifies
something like "maxmem=2TB,slots=20" it might be confusing if he gets an
error like "more than 2TB are not supported".
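
(Illustration only.) To make the numbers concrete, here is a minimal sketch
that computes where device memory would end. It assumes the limit applies to
the guest address space, that RAM starts at 1GiB as on virt, and that 1GiB per
slot is added to (maxram_size - ram_size). device_memory_end() is a helper
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/*
 * Exclusive end address of the device memory region:
 * 1GiB-aligned base after the initial RAM, plus the requested
 * hotpluggable size, plus 1GiB of alignment slack per slot.
 */
static uint64_t device_memory_end(uint64_t ram_base, uint64_t ram_size,
                                  uint64_t maxram_size, uint64_t slots)
{
    uint64_t base, size;

    assert(maxram_size >= ram_size);
    base = (ram_base + ram_size + GiB - 1) & ~(GiB - 1);
    size = maxram_size - ram_size + slots * GiB;
    return base + size;
}
```

With -m 4G,maxmem=2T,slots=20 the region ends at 2069GiB, i.e. 21GiB above
2TiB: 1GiB from the RAM base offset and 20GiB from the per-slot slack, even
though the user never asked for more than 2TB.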

> Thanks
> 
> Eric
-- 

Thanks,

David / dhildenb


* Re: [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory
  2019-02-21 12:44           ` David Hildenbrand
@ 2019-02-21 13:07             ` Auger Eric
  0 siblings, 0 replies; 56+ messages in thread
From: Auger Eric @ 2019-02-21 13:07 UTC (permalink / raw)
  To: David Hildenbrand, Igor Mammedov
  Cc: peter.maydell, drjones, dgilbert, shameerali.kolothum.thodi,
	qemu-devel, qemu-arm, eric.auger.pro, david

Hi David,

On 2/21/19 1:44 PM, David Hildenbrand wrote:
> On 21.02.19 13:37, Auger Eric wrote:
>> Hi Igor,
>>
>> On 2/21/19 10:36 AM, Igor Mammedov wrote:
>>> On Tue, 19 Feb 2019 16:53:22 +0100
>>> Auger Eric <eric.auger@redhat.com> wrote:
>>>
>>>> Hi Igor,
>>>>
>>>> On 2/18/19 10:31 AM, Igor Mammedov wrote:
>>>>> On Tue,  5 Feb 2019 18:33:02 +0100
>>>>> Eric Auger <eric.auger@redhat.com> wrote:
>>>>>   
>>>>>> The device memory region is located after the initial RAM.
>>>>>> its start/size are 1GB aligned.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
>>>>>>
>>>>>> ---
>>>>>> v4 -> v5:
>>>>>> - device memory set after the initial RAM
>>>>>>
>>>>>> v3 -> v4:
>>>>>> - remove bootinfo.device_memory_start/device_memory_size
>>>>>> - rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
>>>>>> ---
>>>>>>  hw/arm/virt.c | 36 ++++++++++++++++++++++++++++++++++++
>>>>>>  1 file changed, 36 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>> index 783468ba77..b683902991 100644
>>>>>> --- a/hw/arm/virt.c
>>>>>> +++ b/hw/arm/virt.c
>>>>>> @@ -61,6 +61,7 @@
>>>>>>  #include "hw/arm/smmuv3.h"
>>>>>>  #include "hw/mem/pc-dimm.h"
>>>>>>  #include "hw/mem/nvdimm.h"
>>>>>> +#include "hw/acpi/acpi.h"
>>>>>>  
>>>>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>>>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>>>>>> @@ -1260,6 +1261,37 @@ static void create_secure_ram(VirtMachineState *vms,
>>>>>>      g_free(nodename);
>>>>>>  }
>>>>>>  
>>>>>> +static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
>>>>>> +{
>>>>>> +    MachineState *ms = MACHINE(vms);
>>>>>> +    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;  
>>>>> should size it with 1Gb alignment per slot from the start (to avoid x86 mistakes),
>>>>> see enforce_aligned_dimm usage and associated commit for more details  
>>>> I don't understand the computation done in pc machine. eventually we are
>>>> likely to have more device memory than requested by the user. Why don't
>>>> we check (machine->maxram_size - machine->ram_size) >=
>>>> machine->ram_slots * GiB
>>>> instead of adding 1GiB/slot to the initial user requirements?
>>>>
>>>> Also machine->maxram_size - machine->ram_size is checked to be aligned
>>>> with TARGET_PAGE_SIZE. Is TARGET_PAGE_SIZE representative of the guest
>>>> PAGE in accelerated mode? Is it valid to require an alignment on 1GB
>>>> boundary as I do in this patch?
>>> See commit 085f8e88b for explanation,
>>> What we are basically doing there is sizing the hotpluggable address space
>>> to allow the max possible huge-page-aligned DIMM to be successfully plugged in
>>> even if the address space is fragmented.
>> In v7, I also added ram_slots * GiB to (maxram_size - ram_size).
>>
> 
> Depending on the way the system handles it, this might be confusing for
> the end user and has to be documented somewhere.
> 
> E.g. if there are certain memory limits (say 2TB) and the user specifies
> something like "maxmem=2TB,slots=20" it might be confusing if he gets an
> error like "more than 2TB are not supported".
I agree. On ARM we also intend to put the high IO regions above the RAM, so
if we overshoot the host limit, we warn the user that the memory map requires
more PA bits than the host can support, and the end user is invited to
lower maxmem/slots.
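
(Illustration only, not the series' code.) A minimal sketch of such a check,
assuming highest_gpa is the last valid guest physical address (inclusive) and
host_pa_bits is whatever limit the host reports. Both function names are
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of address bits needed to represent highest_gpa (0 for 0). */
static unsigned pa_bits_needed(uint64_t highest_gpa)
{
    unsigned bits = 0;

    while (highest_gpa) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/* Return 1 if the memory map fits within the host's PA range. */
static int memmap_fits_host(uint64_t highest_gpa, unsigned host_pa_bits)
{
    assert(host_pa_bits <= 64);
    return pa_bits_needed(highest_gpa) <= host_pa_bits;
}
```

A map whose last address is 1TiB - 1 needs 40 bits. One more byte pushes it to
41 bits, at which point a 40-bit host would have to reject it and ask for a
lower maxmem/slots.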

Thanks

Eric
> 
>> Thanks
>>
>> Eric


end of thread, other threads:[~2019-02-21 13:14 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
2019-02-05 17:32 [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Eric Auger
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 01/18] update-linux-headers.sh: Copy new headers Eric Auger
2019-02-14 16:36   ` Peter Maydell
2019-02-21  6:15     ` Alexey Kardashevskiy
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 02/18] linux-headers: Update to v5.0-rc2 Eric Auger
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 03/18] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
2019-02-14 16:49   ` Peter Maydell
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 04/18] hw/arm/virt: Rename highmem IO regions Eric Auger
2019-02-14 16:50   ` Peter Maydell
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 05/18] hw/arm/virt: Split the memory map description Eric Auger
2019-02-14 17:07   ` Peter Maydell
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 06/18] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
2019-02-14 17:12   ` Peter Maydell
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 07/18] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
2019-02-14 17:15   ` Peter Maydell
2019-02-18 18:03     ` Auger Eric
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 08/18] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
2019-02-14 17:16   ` Peter Maydell
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 09/18] hw/arm/virt: Implement kvm_type function for 4.0 machine Eric Auger
2019-02-14 17:29   ` Peter Maydell
2019-02-18 21:29     ` Auger Eric
2019-02-19  7:49       ` Igor Mammedov
2019-02-19  8:52         ` Auger Eric
2019-02-18 10:07   ` Igor Mammedov
2019-02-19 15:56     ` Auger Eric
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 10/18] hw/arm/virt: Bump the 255GB initial RAM limit Eric Auger
2019-02-07 15:19   ` Shameerali Kolothum Thodi
2019-02-07 15:25     ` Auger Eric
2019-02-05 17:32 ` [Qemu-devel] [PATCH v6 11/18] hw/arm/virt: Add memory hotplug framework Eric Auger
2019-02-14 17:15   ` David Hildenbrand
2019-02-18 18:10     ` Auger Eric
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 12/18] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
2019-02-18  8:58   ` Igor Mammedov
2019-02-20 15:30     ` Auger Eric
2019-02-21  9:27       ` Igor Mammedov
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 13/18] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
2019-02-18  8:14   ` Igor Mammedov
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 14/18] hw/arm/virt: Allocate device_memory Eric Auger
2019-02-18  9:31   ` Igor Mammedov
2019-02-19 15:53     ` Auger Eric
2019-02-19 15:56       ` David Hildenbrand
2019-02-21  9:36       ` Igor Mammedov
2019-02-21 12:37         ` Auger Eric
2019-02-21 12:44           ` David Hildenbrand
2019-02-21 13:07             ` Auger Eric
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 15/18] nvdimm: use configurable ACPI IO base and size Eric Auger
2019-02-18 10:21   ` Igor Mammedov
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 16/18] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
2019-02-18 10:30   ` Igor Mammedov
2019-02-20 15:21     ` Auger Eric
2019-02-21 12:16       ` Igor Mammedov
2019-02-21 12:34         ` Auger Eric
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 17/18] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
2019-02-05 17:33 ` [Qemu-devel] [PATCH v6 18/18] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger
2019-02-14 17:35 ` [Qemu-devel] [PATCH v6 00/18] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support Peter Maydell
2019-02-14 18:00   ` Auger Eric
