kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes
@ 2021-12-27 21:16 Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 1/5] hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam Marc Zyngier
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

Here's another stab at enabling QEMU on systems with pathologically
reduced IPA ranges such as the Apple M1 (previous version at [1]).
Eventually, we're able to run a KVM guest with more than just 3GB of
RAM on a system with a 36bit IPA space, and at most 123 vCPUs.

This series does a few things:
- decouple the enabling of the highmem PCIe region from the highmem
  attribute
- introduce a new attribute to control the enabling of the highmem
  GICv3 redistributors
- correctly cap the PA range with highmem is off
- generalise the highmem behaviour to any PA range
- disable both highmem PCIe and GICv3 RDs when they are outside of the
  PA range

This has been tested on an M1-based Mac-mini running Linux v5.16-rc6
with both KVM and TCG.

* From v2:
  - Fixed checking of the maximum memory against the IPA space
  - Fixed TCG memory map creation
  - Rebased on top of QEMU's 89f3bfa326
  - Collected Andrew's RBs, with thanks

[1] https://lore.kernel.org/r/20211003164605.3116450-1-maz@kernel.org

Marc Zyngier (5):
  hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam
  hw/arm/virt: Add a control for the the highmem redistributors
  hw/arm/virt: Honor highmem setting when computing the memory map
  hw/arm/virt: Use the PA range to compute the memory map
  hw/arm/virt: Disable highmem devices that don't fit in the PA range

 hw/arm/virt-acpi-build.c | 12 +++----
 hw/arm/virt.c            | 67 ++++++++++++++++++++++++++++++++++------
 include/hw/arm/virt.h    |  4 ++-
 3 files changed, 67 insertions(+), 16 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/5] hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam
  2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
@ 2021-12-27 21:16 ` Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 2/5] hw/arm/virt: Add a control for the the highmem redistributors Marc Zyngier
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

Currently, the highmem PCIe region is oddly keyed on the highmem
attribute instead of highmem_ecam. Move the enablement of this PCIe
region over to highmem_ecam.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 hw/arm/virt-acpi-build.c | 10 ++++------
 hw/arm/virt.c            |  4 ++--
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index d0f4867fdf..d04c107fd8 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -158,10 +158,9 @@ static void acpi_dsdt_add_virtio(Aml *scope,
 }
 
 static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
-                              uint32_t irq, bool use_highmem, bool highmem_ecam,
-                              VirtMachineState *vms)
+                              uint32_t irq, VirtMachineState *vms)
 {
-    int ecam_id = VIRT_ECAM_ID(highmem_ecam);
+    int ecam_id = VIRT_ECAM_ID(vms->highmem_ecam);
     struct GPEXConfig cfg = {
         .mmio32 = memmap[VIRT_PCIE_MMIO],
         .pio    = memmap[VIRT_PCIE_PIO],
@@ -170,7 +169,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
         .bus    = vms->bus,
     };
 
-    if (use_highmem) {
+    if (vms->highmem_ecam) {
         cfg.mmio64 = memmap[VIRT_HIGH_PCIE_MMIO];
     }
 
@@ -868,8 +867,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     acpi_dsdt_add_fw_cfg(scope, &memmap[VIRT_FW_CFG]);
     acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
                     (irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
-    acpi_dsdt_add_pci(scope, memmap, (irqmap[VIRT_PCIE] + ARM_SPI_BASE),
-                      vms->highmem, vms->highmem_ecam, vms);
+    acpi_dsdt_add_pci(scope, memmap, irqmap[VIRT_PCIE] + ARM_SPI_BASE, vms);
     if (vms->acpi_dev) {
         build_ged_aml(scope, "\\_SB."GED_DEVICE,
                       HOTPLUG_HANDLER(vms->acpi_dev),
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6bce595aba..a54dc43175 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1412,7 +1412,7 @@ static void create_pcie(VirtMachineState *vms)
                              mmio_reg, base_mmio, size_mmio);
     memory_region_add_subregion(get_system_memory(), base_mmio, mmio_alias);
 
-    if (vms->highmem) {
+    if (vms->highmem_ecam) {
         /* Map high MMIO space */
         MemoryRegion *high_mmio_alias = g_new0(MemoryRegion, 1);
 
@@ -1466,7 +1466,7 @@ static void create_pcie(VirtMachineState *vms)
     qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg",
                                  2, base_ecam, 2, size_ecam);
 
-    if (vms->highmem) {
+    if (vms->highmem_ecam) {
         qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "ranges",
                                      1, FDT_PCI_RANGE_IOPORT, 2, 0,
                                      2, base_pio, 2, size_pio,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/5] hw/arm/virt: Add a control for the the highmem redistributors
  2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 1/5] hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam Marc Zyngier
@ 2021-12-27 21:16 ` Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map Marc Zyngier
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

Just like we can control the enablement of the highmem PCIe region
using highmem_ecam, let's add a control for the highmem GICv3
redistributor region.

Similarily to highmem_ecam, these redistributors are disabled when
highmem is off.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 hw/arm/virt-acpi-build.c | 2 ++
 hw/arm/virt.c            | 3 +++
 include/hw/arm/virt.h    | 4 +++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index d04c107fd8..fcbff9d835 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -946,6 +946,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_fadt_rev5(tables_blob, tables->linker, vms, dsdt);
 
+    vms->highmem_redists &= vms->highmem;
+
     acpi_add_table(table_offsets, tables_blob);
     build_madt(tables_blob, tables->linker, vms);
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a54dc43175..8b600d82c1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2105,6 +2105,8 @@ static void machvirt_init(MachineState *machine)
 
     virt_flash_fdt(vms, sysmem, secure_sysmem ?: sysmem);
 
+    vms->highmem_redists &= vms->highmem;
+
     create_gic(vms, sysmem);
 
     virt_cpu_post_init(vms, sysmem);
@@ -2802,6 +2804,7 @@ static void virt_instance_init(Object *obj)
     vms->gic_version = VIRT_GIC_VERSION_NOSEL;
 
     vms->highmem_ecam = !vmc->no_highmem_ecam;
+    vms->highmem_redists = true;
 
     if (vmc->no_its) {
         vms->its = false;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index dc6b66ffc8..726623a176 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -143,6 +143,7 @@ struct VirtMachineState {
     bool secure;
     bool highmem;
     bool highmem_ecam;
+    bool highmem_redists;
     bool its;
     bool tcg_its;
     bool virt;
@@ -189,7 +190,8 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
 
     assert(vms->gic_version == VIRT_GIC_VERSION_3);
 
-    return MACHINE(vms)->smp.cpus > redist0_capacity ? 2 : 1;
+    return (MACHINE(vms)->smp.cpus > redist0_capacity &&
+            vms->highmem_redists) ? 2 : 1;
 }
 
 #endif /* QEMU_ARM_VIRT_H */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 1/5] hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 2/5] hw/arm/virt: Add a control for the the highmem redistributors Marc Zyngier
@ 2021-12-27 21:16 ` Marc Zyngier
  2022-01-05  9:22   ` Eric Auger
  2021-12-27 21:16 ` [PATCH v3 4/5] hw/arm/virt: Use the PA range to compute " Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 5/5] hw/arm/virt: Disable highmem devices that don't fit in the PA range Marc Zyngier
  4 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

Even when the VM is configured with highmem=off, the highest_gpa
field includes devices that are above the 4GiB limit.
Similarily, nothing seem to check that the memory is within
the limit set by the highmem=off option.

This leads to failures in virt_kvm_type() on systems that have
a crippled IPA range, as the reported IPA space is larger than
what it should be.

Instead, honor the user-specified limit to only use the devices
at the lowest end of the spectrum, and fail if we have memory
crossing the 4GiB limit.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 hw/arm/virt.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 8b600d82c1..84dd3b36fb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
         exit(EXIT_FAILURE);
     }
 
+    if (!vms->highmem &&
+        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
+        error_report("highmem=off, but memory crosses the 4GiB limit\n");
+        exit(EXIT_FAILURE);
+    }
     /*
      * We compute the base of the high IO region depending on the
      * amount of initial and device memory. The device memory start/size
@@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
         vms->memmap[i].size = size;
         base += size;
     }
-    vms->highest_gpa = base - 1;
+    vms->highest_gpa = (vms->highmem ?
+                        base :
+                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
     if (device_memory_size > 0) {
         ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
         ms->device_memory->base = device_memory_base;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/5] hw/arm/virt: Use the PA range to compute the memory map
  2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
                   ` (2 preceding siblings ...)
  2021-12-27 21:16 ` [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map Marc Zyngier
@ 2021-12-27 21:16 ` Marc Zyngier
  2021-12-27 21:16 ` [PATCH v3 5/5] hw/arm/virt: Disable highmem devices that don't fit in the PA range Marc Zyngier
  4 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

The highmem attribute is nothing but another way to express the
PA range of a VM. To support HW that has a smaller PA range then
what QEMU assumes, pass this PA range to the virt_set_memmap()
function, allowing it to correctly exclude highmem devices
if they are outside of the PA range.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 hw/arm/virt.c | 64 ++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 84dd3b36fb..212079e7a6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1660,10 +1660,10 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
     return arm_cpu_mp_affinity(idx, clustersz);
 }
 
-static void virt_set_memmap(VirtMachineState *vms)
+static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
 {
     MachineState *ms = MACHINE(vms);
-    hwaddr base, device_memory_base, device_memory_size;
+    hwaddr base, device_memory_base, device_memory_size, memtop;
     int i;
 
     vms->memmap = extended_memmap;
@@ -1678,11 +1678,9 @@ static void virt_set_memmap(VirtMachineState *vms)
         exit(EXIT_FAILURE);
     }
 
-    if (!vms->highmem &&
-        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
-        error_report("highmem=off, but memory crosses the 4GiB limit\n");
-        exit(EXIT_FAILURE);
-    }
+    if (!vms->highmem)
+	    pa_bits = 32;
+
     /*
      * We compute the base of the high IO region depending on the
      * amount of initial and device memory. The device memory start/size
@@ -1695,7 +1693,12 @@ static void virt_set_memmap(VirtMachineState *vms)
     device_memory_size = ms->maxram_size - ms->ram_size + ms->ram_slots * GiB;
 
     /* Base address of the high IO region */
-    base = device_memory_base + ROUND_UP(device_memory_size, GiB);
+    memtop = base = device_memory_base + ROUND_UP(device_memory_size, GiB);
+    if (memtop > BIT_ULL(pa_bits)) {
+	    error_report("Addressing limited to %d bits, but memory exceeds it by %llu bytes\n",
+			 pa_bits, memtop - BIT_ULL(pa_bits));
+        exit(EXIT_FAILURE);
+    }
     if (base < device_memory_base) {
         error_report("maxmem/slots too huge");
         exit(EXIT_FAILURE);
@@ -1712,9 +1715,17 @@ static void virt_set_memmap(VirtMachineState *vms)
         vms->memmap[i].size = size;
         base += size;
     }
-    vms->highest_gpa = (vms->highmem ?
-                        base :
-                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
+
+    /*
+     * If base fits within pa_bits, all good. If it doesn't, limit it
+     * to the end of RAM, which is guaranteed to fit within pa_bits.
+     */
+    if (base <= BIT_ULL(pa_bits)) {
+        vms->highest_gpa = base - 1;
+    } else {
+        vms->highest_gpa = memtop - 1;
+    }
+
     if (device_memory_size > 0) {
         ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
         ms->device_memory->base = device_memory_base;
@@ -1905,12 +1916,38 @@ static void machvirt_init(MachineState *machine)
     unsigned int smp_cpus = machine->smp.cpus;
     unsigned int max_cpus = machine->smp.max_cpus;
 
+    possible_cpus = mc->possible_cpu_arch_ids(machine);
+
     /*
      * In accelerated mode, the memory map is computed earlier in kvm_type()
      * to create a VM with the right number of IPA bits.
      */
     if (!vms->memmap) {
-        virt_set_memmap(vms);
+        Object *cpuobj;
+        ARMCPU *armcpu;
+        int pa_bits;
+
+        /*
+         * Instanciate a temporary CPU object to find out about what
+         * we are about to deal with. Once this is done, get rid of
+         * the object.
+         */
+        cpuobj = object_new(possible_cpus->cpus[0].type);
+        armcpu = ARM_CPU(cpuobj);
+
+        if (object_property_get_bool(cpuobj, "aarch64", NULL)) {
+            pa_bits = arm_pamax(armcpu);
+        } else if (arm_feature(&armcpu->env, ARM_FEATURE_LPAE)) {
+            /* v7 with LPAE */
+            pa_bits = 40;
+        } else {
+            /* Anything else */
+            pa_bits = 32;
+        }
+
+        object_unref(cpuobj);
+
+        virt_set_memmap(vms, pa_bits);
     }
 
     /* We can probe only here because during property set
@@ -1992,7 +2029,6 @@ static void machvirt_init(MachineState *machine)
 
     create_fdt(vms);
 
-    possible_cpus = mc->possible_cpu_arch_ids(machine);
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
@@ -2648,7 +2684,7 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
     max_vm_pa_size = kvm_arm_get_max_vm_ipa_size(ms, &fixed_ipa);
 
     /* we freeze the memory map to compute the highest gpa */
-    virt_set_memmap(vms);
+    virt_set_memmap(vms, max_vm_pa_size);
 
     requested_pa_size = 64 - clz64(vms->highest_gpa);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 5/5] hw/arm/virt: Disable highmem devices that don't fit in the PA range
  2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
                   ` (3 preceding siblings ...)
  2021-12-27 21:16 ` [PATCH v3 4/5] hw/arm/virt: Use the PA range to compute " Marc Zyngier
@ 2021-12-27 21:16 ` Marc Zyngier
  4 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2021-12-27 21:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andrew Jones, Eric Auger, Peter Maydell, kvmarm, kvm, kernel-team

Make sure both the highmem PCIe and GICv3 regions are disabled when
they don't fully fit in the PA range.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 hw/arm/virt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 212079e7a6..18e615070f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1723,6 +1723,9 @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
     if (base <= BIT_ULL(pa_bits)) {
         vms->highest_gpa = base - 1;
     } else {
+        /* Advertise that we have disabled the highmem devices */
+        vms->highmem_ecam = false;
+        vms->highmem_redists = false;
         vms->highest_gpa = memtop - 1;
     }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2021-12-27 21:16 ` [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map Marc Zyngier
@ 2022-01-05  9:22   ` Eric Auger
  2022-01-05  9:36     ` Eric Auger
  2022-01-06 21:26     ` Marc Zyngier
  0 siblings, 2 replies; 13+ messages in thread
From: Eric Auger @ 2022-01-05  9:22 UTC (permalink / raw)
  To: Marc Zyngier, qemu-devel
  Cc: Andrew Jones, Peter Maydell, kvmarm, kvm, kernel-team

Hi Marc,

On 12/27/21 10:16 PM, Marc Zyngier wrote:
> Even when the VM is configured with highmem=off, the highest_gpa
> field includes devices that are above the 4GiB limit.
> Similarily, nothing seem to check that the memory is within
> the limit set by the highmem=off option.
>
> This leads to failures in virt_kvm_type() on systems that have
> a crippled IPA range, as the reported IPA space is larger than
> what it should be.
>
> Instead, honor the user-specified limit to only use the devices
> at the lowest end of the spectrum, and fail if we have memory
> crossing the 4GiB limit.
>
> Reviewed-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  hw/arm/virt.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 8b600d82c1..84dd3b36fb 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
>          exit(EXIT_FAILURE);
>      }
>  
> +    if (!vms->highmem &&
> +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> +        exit(EXIT_FAILURE);

The memory is composed of initial memory and device memory.
device memory is put after the initial memory but has a 1GB alignment
On top of that you have 1G page alignment per device memory slot

so potentially the highest mem address is larger than
vms->memmap[VIRT_MEM].base + ms->maxram_size.
I would rather do the check on device_memory_base + device_memory_size
> +    }
>      /*
>       * We compute the base of the high IO region depending on the
>       * amount of initial and device memory. The device memory start/size
> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i].size = size;
>          base += size;
>      }
> -    vms->highest_gpa = base - 1;
> +    vms->highest_gpa = (vms->highmem ?
> +                        base :
> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
As per the previous comment this looks wrong to me if !highmem.

If !highmem, if RAM requirements are low we still could get benefit from
REDIST2 and HIGH ECAM which could fit within the 4GB limit. But maybe we
simply don't care? If we don't, why don't we simply skip the
extended_memmap overlay as suggested in v2? I did not get your reply sorry.

Thanks

Eric
>      if (device_memory_size > 0) {
>          ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
>          ms->device_memory->base = device_memory_base;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-05  9:22   ` Eric Auger
@ 2022-01-05  9:36     ` Eric Auger
  2022-01-06 21:26     ` Marc Zyngier
  1 sibling, 0 replies; 13+ messages in thread
From: Eric Auger @ 2022-01-05  9:36 UTC (permalink / raw)
  To: Marc Zyngier, qemu-devel
  Cc: Andrew Jones, Peter Maydell, kvmarm, kvm, kernel-team

Hi Marc,

On 1/5/22 10:22 AM, Eric Auger wrote:
> Hi Marc,
> 
> On 12/27/21 10:16 PM, Marc Zyngier wrote:
>> Even when the VM is configured with highmem=off, the highest_gpa
>> field includes devices that are above the 4GiB limit.
>> Similarily, nothing seem to check that the memory is within
>> the limit set by the highmem=off option.
>>
>> This leads to failures in virt_kvm_type() on systems that have
>> a crippled IPA range, as the reported IPA space is larger than
>> what it should be.
>>
>> Instead, honor the user-specified limit to only use the devices
>> at the lowest end of the spectrum, and fail if we have memory
>> crossing the 4GiB limit.
>>
>> Reviewed-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  hw/arm/virt.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 8b600d82c1..84dd3b36fb 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          exit(EXIT_FAILURE);
>>      }
>>  
>> +    if (!vms->highmem &&
>> +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
>> +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
>> +        exit(EXIT_FAILURE);
> 
> The memory is composed of initial memory and device memory.
> device memory is put after the initial memory but has a 1GB alignment
> On top of that you have 1G page alignment per device memory slot
> 
> so potentially the highest mem address is larger than
> vms->memmap[VIRT_MEM].base + ms->maxram_size.
> I would rather do the check on device_memory_base + device_memory_size
>> +    }
>>      /*
>>       * We compute the base of the high IO region depending on the
>>       * amount of initial and device memory. The device memory start/size
>> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i].size = size;
>>          base += size;
>>      }
>> -    vms->highest_gpa = base - 1;
>> +    vms->highest_gpa = (vms->highmem ?
>> +                        base :
>> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
> As per the previous comment this looks wrong to me if !highmem.
> 
> If !highmem, if RAM requirements are low we still could get benefit from
> REDIST2 and HIGH ECAM which could fit within the 4GB limit. But maybe we
the above assertion is wrong, sorry, as we kept the legacy mem map in
that situation (HIGH regions always put above 256 GB). So anyway we can
skip the extended memmap if !highmem.

Eric
> simply don't care? If we don't, why don't we simply skip the
> extended_memmap overlay as suggested in v2? I did not get your reply sorry.
> 
> Thanks
> 
> Eric
>>      if (device_memory_size > 0) {
>>          ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
>>          ms->device_memory->base = device_memory_base;
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-05  9:22   ` Eric Auger
  2022-01-05  9:36     ` Eric Auger
@ 2022-01-06 21:26     ` Marc Zyngier
  2022-01-07 17:15       ` Eric Auger
  1 sibling, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2022-01-06 21:26 UTC (permalink / raw)
  To: eric.auger
  Cc: qemu-devel, Andrew Jones, Peter Maydell, kvmarm, kvm, kernel-team

On Wed, 05 Jan 2022 09:22:39 +0000,
Eric Auger <eric.auger@redhat.com> wrote:
> 
> Hi Marc,
> 
> On 12/27/21 10:16 PM, Marc Zyngier wrote:
> > Even when the VM is configured with highmem=off, the highest_gpa
> > field includes devices that are above the 4GiB limit.
> > Similarily, nothing seem to check that the memory is within
> > the limit set by the highmem=off option.
> >
> > This leads to failures in virt_kvm_type() on systems that have
> > a crippled IPA range, as the reported IPA space is larger than
> > what it should be.
> >
> > Instead, honor the user-specified limit to only use the devices
> > at the lowest end of the spectrum, and fail if we have memory
> > crossing the 4GiB limit.
> >
> > Reviewed-by: Andrew Jones <drjones@redhat.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> >  hw/arm/virt.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 8b600d82c1..84dd3b36fb 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
> >          exit(EXIT_FAILURE);
> >      }
> >  
> > +    if (!vms->highmem &&
> > +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> > +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> > +        exit(EXIT_FAILURE);
> 
> The memory is composed of initial memory and device memory.
> device memory is put after the initial memory but has a 1GB alignment
> On top of that you have 1G page alignment per device memory slot
> 
> so potentially the highest mem address is larger than
> vms->memmap[VIRT_MEM].base + ms->maxram_size.
> I would rather do the check on device_memory_base + device_memory_size

Yup, that's a good point.

There is also a corner case in one of the later patches where I check
this limit against the PA using the rounded-up device_memory_size.
This could result in returning an error if the last memory slot would
still fit in the PA space, but the rounded-up quantity wouldn't. I
don't think it matters much, but I'll fix it anyway.

> > +    }
> >      /*
> >       * We compute the base of the high IO region depending on the
> >       * amount of initial and device memory. The device memory start/size
> > @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
> >          vms->memmap[i].size = size;
> >          base += size;
> >      }
> > -    vms->highest_gpa = base - 1;
> > +    vms->highest_gpa = (vms->highmem ?
> > +                        base :
> > +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
> As per the previous comment this looks wrong to me if !highmem.

Agreed.

> If !highmem, if RAM requirements are low we still could get benefit from
> REDIST2 and HIGH ECAM which could fit within the 4GB limit. But maybe we
> simply don't care?

I don't see how. These devices live at a minimum of 256GB, which
contradicts the very meaning of !highmem being a 4GB limit.

> If we don't, why don't we simply skip the extended_memmap overlay as
> suggested in v2? I did not get your reply sorry.

Because although this makes sense if you only care about a 32bit
limit, we eventually want to check against an arbitrary PA limit and
enable the individual devices that do fit in that space.

In order to do that, we need to compute the base addresses for these
extra devices. Also, computing 3 base addresses isn't going to be
massively expensive.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-06 21:26     ` Marc Zyngier
@ 2022-01-07 17:15       ` Eric Auger
  2022-01-07 18:18         ` Marc Zyngier
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Auger @ 2022-01-07 17:15 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: qemu-devel, Andrew Jones, Peter Maydell, kvmarm, kvm, kernel-team

Hi Marc,

On 1/6/22 10:26 PM, Marc Zyngier wrote:
> On Wed, 05 Jan 2022 09:22:39 +0000,
> Eric Auger <eric.auger@redhat.com> wrote:
>> Hi Marc,
>>
>> On 12/27/21 10:16 PM, Marc Zyngier wrote:
>>> Even when the VM is configured with highmem=off, the highest_gpa
>>> field includes devices that are above the 4GiB limit.
>>> Similarily, nothing seem to check that the memory is within
>>> the limit set by the highmem=off option.
>>>
>>> This leads to failures in virt_kvm_type() on systems that have
>>> a crippled IPA range, as the reported IPA space is larger than
>>> what it should be.
>>>
>>> Instead, honor the user-specified limit to only use the devices
>>> at the lowest end of the spectrum, and fail if we have memory
>>> crossing the 4GiB limit.
>>>
>>> Reviewed-by: Andrew Jones <drjones@redhat.com>
>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>>> ---
>>>  hw/arm/virt.c | 9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 8b600d82c1..84dd3b36fb 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
>>>          exit(EXIT_FAILURE);
>>>      }
>>>  
>>> +    if (!vms->highmem &&
>>> +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
>>> +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
>>> +        exit(EXIT_FAILURE);
>> The memory is composed of initial memory and device memory.
>> device memory is put after the initial memory but has a 1GB alignment
>> On top of that you have 1G page alignment per device memory slot
>>
>> so potentially the highest mem address is larger than
>> vms->memmap[VIRT_MEM].base + ms->maxram_size.
>> I would rather do the check on device_memory_base + device_memory_size
> Yup, that's a good point.
>
> There is also a corner case in one of the later patches where I check
> this limit against the PA using the rounded-up device_memory_size.
> This could result in returning an error if the last memory slot would
> still fit in the PA space, but the rounded-up quantity wouldn't. I
> don't think it matters much, but I'll fix it anyway.
>
>>> +    }
>>>      /*
>>>       * We compute the base of the high IO region depending on the
>>>       * amount of initial and device memory. The device memory start/size
>>> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
>>>          vms->memmap[i].size = size;
>>>          base += size;
>>>      }
>>> -    vms->highest_gpa = base - 1;
>>> +    vms->highest_gpa = (vms->highmem ?
>>> +                        base :
>>> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
>> As per the previous comment this looks wrong to me if !highmem.
> Agreed.
>
>> If !highmem, if RAM requirements are low we still could get benefit from
>> REDIST2 and HIGH ECAM which could fit within the 4GB limit. But maybe we
>> simply don't care?
> I don't see how. These devices live at a minimum of 256GB, which
> contradicts the very meaning of !highmem being a 4GB limit.
Yes I corrected the above statement afterwards, sorry for the noise.
>
>> If we don't, why don't we simply skip the extended_memmap overlay as
>> suggested in v2? I did not get your reply sorry.
> Because although this makes sense if you only care about a 32bit
> limit, we eventually want to check against an arbitrary PA limit and
> enable the individual devices that do fit in that space.

In my understanding that is what virt_kvm_type() was supposed to do by
testing the result of kvm_arm_get_max_vm_ipa_size and requested_pa_size
(which accounted the high regions) and exiting if they were
incompatible. But I must miss something.
>
> In order to do that, we need to compute the base addresses for these
> extra devices. Also, computing 3 base addresses isn't going to be
> massively expensive.
>
> Thanks,
>
> 	M.
>
Eric


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-07 17:15       ` Eric Auger
@ 2022-01-07 18:18         ` Marc Zyngier
  2022-01-07 18:48           ` Peter Maydell
  0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2022-01-07 18:18 UTC (permalink / raw)
  To: eric.auger
  Cc: qemu-devel, Andrew Jones, Peter Maydell, kvmarm, kvm, kernel-team

Hi Eric,

On Fri, 07 Jan 2022 17:15:19 +0000,
Eric Auger <eric.auger@redhat.com> wrote:
> 
> Hi Marc,
> 
> On 1/6/22 10:26 PM, Marc Zyngier wrote:
> > On Wed, 05 Jan 2022 09:22:39 +0000,
> > Eric Auger <eric.auger@redhat.com> wrote:
> >> Hi Marc,
> >>
> >> On 12/27/21 10:16 PM, Marc Zyngier wrote:
> >>> Even when the VM is configured with highmem=off, the highest_gpa
> >>> field includes devices that are above the 4GiB limit.
> >>> Similarily, nothing seem to check that the memory is within
> >>> the limit set by the highmem=off option.
> >>>
> >>> This leads to failures in virt_kvm_type() on systems that have
> >>> a crippled IPA range, as the reported IPA space is larger than
> >>> what it should be.
> >>>
> >>> Instead, honor the user-specified limit to only use the devices
> >>> at the lowest end of the spectrum, and fail if we have memory
> >>> crossing the 4GiB limit.
> >>>
> >>> Reviewed-by: Andrew Jones <drjones@redhat.com>
> >>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >>> ---
> >>>  hw/arm/virt.c | 9 ++++++++-
> >>>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >>> index 8b600d82c1..84dd3b36fb 100644
> >>> --- a/hw/arm/virt.c
> >>> +++ b/hw/arm/virt.c
> >>> @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
> >>>          exit(EXIT_FAILURE);
> >>>      }
> >>>  
> >>> +    if (!vms->highmem &&
> >>> +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> >>> +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> >>> +        exit(EXIT_FAILURE);
> >> The memory is composed of initial memory and device memory.
> >> device memory is put after the initial memory but has a 1GB alignment
> >> On top of that you have 1G page alignment per device memory slot
> >>
> >> so potentially the highest mem address is larger than
> >> vms->memmap[VIRT_MEM].base + ms->maxram_size.
> >> I would rather do the check on device_memory_base + device_memory_size
> > Yup, that's a good point.
> >
> > There is also a corner case in one of the later patches where I check
> > this limit against the PA using the rounded-up device_memory_size.
> > This could result in returning an error if the last memory slot would
> > still fit in the PA space, but the rounded-up quantity wouldn't. I
> > don't think it matters much, but I'll fix it anyway.
> >
> >>> +    }
> >>>      /*
> >>>       * We compute the base of the high IO region depending on the
> >>>       * amount of initial and device memory. The device memory start/size
> >>> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
> >>>          vms->memmap[i].size = size;
> >>>          base += size;
> >>>      }
> >>> -    vms->highest_gpa = base - 1;
> >>> +    vms->highest_gpa = (vms->highmem ?
> >>> +                        base :
> >>> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
> >> As per the previous comment this looks wrong to me if !highmem.
> > Agreed.
> >
> >> If !highmem, if RAM requirements are low we still could get benefit from
> >> REDIST2 and HIGH ECAM which could fit within the 4GB limit. But maybe we
> >> simply don't care?
> > I don't see how. These devices live at a minimum of 256GB, which
> > contradicts the very meaning of !highmem being a 4GB limit.
> Yes I corrected the above statement afterwards, sorry for the noise.
> >
> >> If we don't, why don't we simply skip the extended_memmap overlay as
> >> suggested in v2? I did not get your reply sorry.
> > Because although this makes sense if you only care about a 32bit
> > limit, we eventually want to check against an arbitrary PA limit and
> > enable the individual devices that do fit in that space.
> 
> In my understanding that is what virt_kvm_type() was supposed to do by
> testing the result of kvm_arm_get_max_vm_ipa_size and requested_pa_size
> (which accounted the high regions) and exiting if they were
> incompatible. But I must miss something.

This is a chicken and egg problem: you need the IPA size to compute
the memory map, and you need the memory map to compute the IPA
size. Fun, isn't it?

At the moment, virt_set_memmap() doesn't know about the IPA space,
generates a highest_gpa that may not work, and we end-up failing
because the resulting VM type is out of bound.

My solution to that is to feed the *maximum* IPA size to
virt_set_memmap(), compute the memory map there, and then use
highest_gpa to compute the actual IPA size that is used to create the
VM. By knowing the IPA limit in virt_set_memmap(), I'm able to keep it
in check and avoid generating an unusable memory map.

I've tried to make that clearer in my v4. Hopefully I succeeded.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-07 18:18         ` Marc Zyngier
@ 2022-01-07 18:48           ` Peter Maydell
  2022-01-07 19:04             ` Marc Zyngier
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Maydell @ 2022-01-07 18:48 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger, qemu-devel, Andrew Jones, kvmarm, kvm, kernel-team

On Fri, 7 Jan 2022 at 18:18, Marc Zyngier <maz@kernel.org> wrote:
> This is a chicken and egg problem: you need the IPA size to compute
> the memory map, and you need the memory map to compute the IPA
> size. Fun, isn't it?
>
> At the moment, virt_set_memmap() doesn't know about the IPA space,
> generates a highest_gpa that may not work, and we end-up failing
> because the resulting VM type is out of bound.
>
> My solution to that is to feed the *maximum* IPA size to
> virt_set_memmap(), compute the memory map there, and then use
> highest_gpa to compute the actual IPA size that is used to create the
> VM. By knowing the IPA limit in virt_set_memmap(), I'm able to keep it
> in check and avoid generating an unusable memory map.

Is there any reason not to just always create the VM with the
maximum supported IPA size, rather than trying to create it
with the smallest IPA size that will work? (ie skip the last
step of computing the IPA size to create the VM with)

-- PMM

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map
  2022-01-07 18:48           ` Peter Maydell
@ 2022-01-07 19:04             ` Marc Zyngier
  0 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2022-01-07 19:04 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger, qemu-devel, Andrew Jones, kvmarm, kvm, kernel-team

On Fri, 07 Jan 2022 18:48:16 +0000,
Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On Fri, 7 Jan 2022 at 18:18, Marc Zyngier <maz@kernel.org> wrote:
> > This is a chicken and egg problem: you need the IPA size to compute
> > the memory map, and you need the memory map to compute the IPA
> > size. Fun, isn't it?
> >
> > At the moment, virt_set_memmap() doesn't know about the IPA space,
> > generates a highest_gpa that may not work, and we end-up failing
> > because the resulting VM type is out of bound.
> >
> > My solution to that is to feed the *maximum* IPA size to
> > virt_set_memmap(), compute the memory map there, and then use
> > highest_gpa to compute the actual IPA size that is used to create the
> > VM. By knowing the IPA limit in virt_set_memmap(), I'm able to keep it
> > in check and avoid generating an unusable memory map.
> 
> Is there any reason not to just always create the VM with the
> maximum supported IPA size, rather than trying to create it
> with the smallest IPA size that will work? (ie skip the last
> step of computing the IPA size to create the VM with)

That gives KVM the opportunity to reduce the depth of the S2 page
tables. On HW that supports a large PA space, there is a real
advantage in keeping these shallow if at all possible.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-01-07 19:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-27 21:16 [PATCH v3 0/5] target/arm: Reduced-IPA space and highmem=off fixes Marc Zyngier
2021-12-27 21:16 ` [PATCH v3 1/5] hw/arm/virt: Key enablement of highmem PCIe on highmem_ecam Marc Zyngier
2021-12-27 21:16 ` [PATCH v3 2/5] hw/arm/virt: Add a control for the the highmem redistributors Marc Zyngier
2021-12-27 21:16 ` [PATCH v3 3/5] hw/arm/virt: Honor highmem setting when computing the memory map Marc Zyngier
2022-01-05  9:22   ` Eric Auger
2022-01-05  9:36     ` Eric Auger
2022-01-06 21:26     ` Marc Zyngier
2022-01-07 17:15       ` Eric Auger
2022-01-07 18:18         ` Marc Zyngier
2022-01-07 18:48           ` Peter Maydell
2022-01-07 19:04             ` Marc Zyngier
2021-12-27 21:16 ` [PATCH v3 4/5] hw/arm/virt: Use the PA range to compute " Marc Zyngier
2021-12-27 21:16 ` [PATCH v3 5/5] hw/arm/virt: Disable highmem devices that don't fit in the PA range Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).