All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
@ 2018-05-23 16:03 Eric Auger
  2018-05-23 16:03 ` [Qemu-devel] [RFC 1/2] hw/arm/virt: Add a new 256MB ECAM region Eric Auger
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Eric Auger @ 2018-05-23 16:03 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, wei, zhaoshenglong, lersek, ard.biesheuvel

Current Machvirt PCI host controller's ECAM region is 16MB large.
This limits the number of PCIe buses to 16.

PC/Q35 machines have a 256MB region allowing up to 256 buses.
This series tries to bridge the gap.

It declares a new ECAM region located beyond 256GB, of size 256MB
(just after the hypothetical new GICv3 RDIST region). The new
ECAM region is used as soon as the highmem option is set (default)
and disabled for machines older than 3.0.

Best Regards

Eric

Git: complete series available at
https://github.com/eauger/qemu/tree/v2.12.0-256MB-ECAM-RFCv1

- Tested with guest running in aarch64 and aarch32 modes (aarch64=off)
- In aarch32 mode I encountered the issue the vmalloc region may be
  reported too small for the needs (dmesg excerpt below). So I had to
  extend the vmalloc size by passing the "vmalloc=512M" option to the
  bootargs and this eventually booted fine.

[    1.399581] pl061_gpio 9030000.pl061: PL061 GPIO chip @0x0000000009030000 registered
[    1.402636] OF: PCI: host bridge /pcie@10000000 ranges:
[    1.404506] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
[    1.406606] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
[    1.408690] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
[    1.411992] vmap allocation for size 1052672 failed: use vmalloc=<size> to increase size
[    1.414895] pci-host-generic 4010000000.pcie: ECAM ioremap failed
[    1.427472] pci-host-generic: probe of 4010000000.pcie failed with error -12

- Maybe this issue deserves introducing a new highmem_ecam option?

Eric Auger (2):
  hw/arm/virt: Add a new 256MB ECAM region
  hw/arm/virt: Add virt-3.0 machine type

 hw/arm/virt-acpi-build.c | 22 ++++++++++++++--------
 hw/arm/virt.c            | 39 ++++++++++++++++++++++++++++++++-------
 include/hw/arm/virt.h    |  3 +++
 3 files changed, 49 insertions(+), 15 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [RFC 1/2] hw/arm/virt: Add a new 256MB ECAM region
  2018-05-23 16:03 [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Eric Auger
@ 2018-05-23 16:03 ` Eric Auger
  2018-05-23 16:03 ` [Qemu-devel] [RFC 2/2] hw/arm/virt: Add virt-3.0 machine type Eric Auger
  2018-05-23 17:45 ` [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Laszlo Ersek
  2 siblings, 0 replies; 17+ messages in thread
From: Eric Auger @ 2018-05-23 16:03 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, wei, zhaoshenglong, lersek, ard.biesheuvel

This patch defines a new ECAM region located after the 256GB limit.

The virt machine state is augmented with a new highmem_ecam field
which guards the usage of this new ECAM region instead of the legacy
16MB one. With the highmem ECAM region, up to 256 PCIe buses can be
used.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt-acpi-build.c | 22 ++++++++++++++--------
 hw/arm/virt.c            | 12 ++++++++----
 include/hw/arm/virt.h    |  2 ++
 3 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 92ceee9..b72f041 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -150,16 +150,17 @@ static void acpi_dsdt_add_virtio(Aml *scope,
 }
 
 static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
-                              uint32_t irq, bool use_highmem)
+                              uint32_t irq, bool use_highmem, bool highmem_ecam)
 {
+    int ecam_id = highmem_ecam ? VIRT_PCIE_ECAM_HIGH : VIRT_PCIE_ECAM;
     Aml *method, *crs, *ifctx, *UUID, *ifctx1, *elsectx, *buf;
     int i, bus_no;
     hwaddr base_mmio = memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = memmap[VIRT_PCIE_MMIO].size;
     hwaddr base_pio = memmap[VIRT_PCIE_PIO].base;
     hwaddr size_pio = memmap[VIRT_PCIE_PIO].size;
-    hwaddr base_ecam = memmap[VIRT_PCIE_ECAM].base;
-    hwaddr size_ecam = memmap[VIRT_PCIE_ECAM].size;
+    hwaddr base_ecam = memmap[ecam_id].base;
+    hwaddr size_ecam = memmap[ecam_id].size;
     int nr_pcie_buses = size_ecam / PCIE_MMCFG_SIZE_MIN;
 
     Aml *dev = aml_device("%s", "PCI0");
@@ -173,7 +174,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
     aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
 
     /* Declare the PCI Routing Table. */
-    Aml *rt_pkg = aml_package(nr_pcie_buses * PCI_NUM_PINS);
+    Aml *rt_pkg = aml_varpackage(nr_pcie_buses * PCI_NUM_PINS);
     for (bus_no = 0; bus_no < nr_pcie_buses; bus_no++) {
         for (i = 0; i < PCI_NUM_PINS; i++) {
             int gsi = (i + bus_no) % PCI_NUM_PINS;
@@ -316,10 +317,14 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
     Aml *dev_res0 = aml_device("%s", "RES0");
     aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
     crs = aml_resource_template();
-    aml_append(crs, aml_memory32_fixed(base_ecam, size_ecam, AML_READ_WRITE));
+    aml_append(crs,
+        aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
+                         AML_NON_CACHEABLE, AML_READ_WRITE, 0x0000, base_ecam,
+                         base_ecam + size_ecam - 1, 0x0000, size_ecam));
     aml_append(dev_res0, aml_name_decl("_CRS", crs));
     aml_append(dev, dev_res0);
     aml_append(scope, dev);
+
 }
 
 static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
@@ -563,16 +568,17 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
     AcpiTableMcfg *mcfg;
     const MemMapEntry *memmap = vms->memmap;
+    int ecam_id = vms->highmem_ecam ? VIRT_PCIE_ECAM_HIGH : VIRT_PCIE_ECAM;
     int len = sizeof(*mcfg) + sizeof(mcfg->allocation[0]);
     int mcfg_start = table_data->len;
 
     mcfg = acpi_data_push(table_data, len);
-    mcfg->allocation[0].address = cpu_to_le64(memmap[VIRT_PCIE_ECAM].base);
+    mcfg->allocation[0].address = cpu_to_le64(memmap[ecam_id].base);
 
     /* Only a single allocation so no need to play with segments */
     mcfg->allocation[0].pci_segment = cpu_to_le16(0);
     mcfg->allocation[0].start_bus_number = 0;
-    mcfg->allocation[0].end_bus_number = (memmap[VIRT_PCIE_ECAM].size
+    mcfg->allocation[0].end_bus_number = (memmap[ecam_id].size
                                           / PCIE_MMCFG_SIZE_MIN) - 1;
 
     build_header(linker, table_data, (void *)(table_data->data + mcfg_start),
@@ -747,7 +753,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
                     (irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
     acpi_dsdt_add_pci(scope, memmap, (irqmap[VIRT_PCIE] + ARM_SPI_BASE),
-                      vms->highmem);
+                      vms->highmem, vms->highmem_ecam);
     acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
                        (irqmap[VIRT_GPIO] + ARM_SPI_BASE));
     acpi_dsdt_add_power_button(scope);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a3a28e2..a35550b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -149,6 +149,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
     [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
     [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
+    [VIRT_PCIE_ECAM_HIGH] =     { 0x4010000000ULL, 0x10000000 },
     /* Second PCIe window, 512GB wide at the 512GB boundary */
     [VIRT_PCIE_MMIO_HIGH] =   { 0x8000000000ULL, 0x8000000000ULL },
 };
@@ -1001,10 +1002,9 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
     hwaddr size_mmio_high = vms->memmap[VIRT_PCIE_MMIO_HIGH].size;
     hwaddr base_pio = vms->memmap[VIRT_PCIE_PIO].base;
     hwaddr size_pio = vms->memmap[VIRT_PCIE_PIO].size;
-    hwaddr base_ecam = vms->memmap[VIRT_PCIE_ECAM].base;
-    hwaddr size_ecam = vms->memmap[VIRT_PCIE_ECAM].size;
+    hwaddr base_ecam, size_ecam;
     hwaddr base = base_mmio;
-    int nr_pcie_buses = size_ecam / PCIE_MMCFG_SIZE_MIN;
+    int nr_pcie_buses;
     int irq = vms->irqmap[VIRT_PCIE];
     MemoryRegion *mmio_alias;
     MemoryRegion *mmio_reg;
@@ -1012,12 +1012,16 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
     MemoryRegion *ecam_reg;
     DeviceState *dev;
     char *nodename;
-    int i;
+    int i, ecam_memmap_id;
     PCIHostState *pci;
 
     dev = qdev_create(NULL, TYPE_GPEX_HOST);
     qdev_init_nofail(dev);
 
+    ecam_memmap_id = vms->highmem ? VIRT_PCIE_ECAM_HIGH : VIRT_PCIE_ECAM;
+    base_ecam = vms->memmap[ecam_memmap_id].base;
+    size_ecam = vms->memmap[ecam_memmap_id].size;
+    nr_pcie_buses = size_ecam / PCIE_MMCFG_SIZE_MIN;
     /* Map only the first size_ecam bytes of ECAM space */
     ecam_alias = g_new0(MemoryRegion, 1);
     ecam_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 4ac7ef6..e9423a7 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -69,6 +69,7 @@ enum {
     VIRT_PCIE_MMIO,
     VIRT_PCIE_PIO,
     VIRT_PCIE_ECAM,
+    VIRT_PCIE_ECAM_HIGH,
     VIRT_PLATFORM_BUS,
     VIRT_PCIE_MMIO_HIGH,
     VIRT_GPIO,
@@ -103,6 +104,7 @@ typedef struct {
     FWCfgState *fw_cfg;
     bool secure;
     bool highmem;
+    bool highmem_ecam;
     bool its;
     bool virt;
     int32_t gic_version;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [RFC 2/2] hw/arm/virt: Add virt-3.0 machine type
  2018-05-23 16:03 [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Eric Auger
  2018-05-23 16:03 ` [Qemu-devel] [RFC 1/2] hw/arm/virt: Add a new 256MB ECAM region Eric Auger
@ 2018-05-23 16:03 ` Eric Auger
  2018-05-23 17:45 ` [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Laszlo Ersek
  2 siblings, 0 replies; 17+ messages in thread
From: Eric Auger @ 2018-05-23 16:03 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, wei, zhaoshenglong, lersek, ard.biesheuvel

Add virt-3.0 machine type.

This machine type supports highmem 256MB ECAM by default.
This feature is disabled for earlier machine types and
if highmem is off.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c         | 27 ++++++++++++++++++++++++---
 include/hw/arm/virt.h |  1 +
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a35550b..c24c306 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1520,6 +1520,7 @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
     VirtMachineState *vms = VIRT_MACHINE(obj);
 
     vms->highmem = value;
+    vms->highmem_ecam &= value;
 }
 
 static bool virt_get_its(Object *obj, Error **errp)
@@ -1697,7 +1698,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_12_instance_init(Object *obj)
+static void virt_3_0_instance_init(Object *obj)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1740,6 +1741,8 @@ static void virt_2_12_instance_init(Object *obj)
                                     "Set GIC version. "
                                     "Valid values are 2, 3 and host", NULL);
 
+    vms->highmem_ecam = vmc->no_highmem_ecam ? false : true;
+
     if (vmc->no_its) {
         vms->its = false;
     } else {
@@ -1765,10 +1768,28 @@ static void virt_2_12_instance_init(Object *obj)
     vms->irqmap = a15irqmap;
 }
 
-static void virt_machine_2_12_options(MachineClass *mc)
+static void virt_machine_3_0_options(MachineClass *mc)
 {
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 12)
+DEFINE_VIRT_MACHINE_AS_LATEST(3, 0)
+
+#define VIRT_COMPAT_2_12 \
+    HW_COMPAT_2_12
+
+static void virt_2_12_instance_init(Object *obj)
+{
+    virt_3_0_instance_init(obj);
+}
+
+static void virt_machine_2_12_options(MachineClass *mc)
+{
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
+    virt_machine_3_0_options(mc);
+    SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_12);
+    vmc->no_highmem_ecam = true;
+}
+DEFINE_VIRT_MACHINE(2, 12)
 
 #define VIRT_COMPAT_2_11 \
     HW_COMPAT_2_11
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e9423a7..10a5c71 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -95,6 +95,7 @@ typedef struct {
     bool no_pmu;
     bool claim_edge_triggered_timers;
     bool smbios_old_sys_ver;
+    bool no_highmem_ecam;
 } VirtMachineClass;
 
 typedef struct {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-23 16:03 [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Eric Auger
  2018-05-23 16:03 ` [Qemu-devel] [RFC 1/2] hw/arm/virt: Add a new 256MB ECAM region Eric Auger
  2018-05-23 16:03 ` [Qemu-devel] [RFC 2/2] hw/arm/virt: Add virt-3.0 machine type Eric Auger
@ 2018-05-23 17:45 ` Laszlo Ersek
  2018-05-23 20:40   ` Auger Eric
  2 siblings, 1 reply; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-23 17:45 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, wei, zhaoshenglong, ard.biesheuvel

Hi Eric,

On 05/23/18 18:03, Eric Auger wrote:
> Current Machvirt PCI host controller's ECAM region is 16MB large.
> This limits the number of PCIe buses to 16.
>
> PC/Q35 machines have a 256MB region allowing up to 256 buses.
> This series tries to bridge the gap.
>
> It declares a new ECAM region located beyond 256GB, of size 256MB
> (just after the hypothetical new GICv3 RDIST region). The new
> ECAM region is used as soon as the highmem option is set (default)
> and disabled for machines older than 3.0.
>
> Best Regards
>
> Eric
>
> Git: complete series available at
> https://github.com/eauger/qemu/tree/v2.12.0-256MB-ECAM-RFCv1
>
> - Tested with guest running in aarch64 and aarch32 modes (aarch64=off)
> - In aarch32 mode I encountered the issue the vmalloc region may be
>   reported too small for the needs (dmesg excerpt below). So I had to
>   extend the vmalloc size by passing the "vmalloc=512M" option to the
>   bootargs and this eventually booted fine.
>
> [    1.399581] pl061_gpio 9030000.pl061: PL061 GPIO chip @0x0000000009030000 registered
> [    1.402636] OF: PCI: host bridge /pcie@10000000 ranges:
> [    1.404506] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
> [    1.406606] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
> [    1.408690] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
> [    1.411992] vmap allocation for size 1052672 failed: use vmalloc=<size> to increase size
> [    1.414895] pci-host-generic 4010000000.pcie: ECAM ioremap failed
> [    1.427472] pci-host-generic: probe of 4010000000.pcie failed with error -12
>
> - Maybe this issue deserves introducing a new highmem_ecam option?

I refer to my earlier email here:

  http://mid.mail-archive.com/13d95529-d61e-fc30-ffd4-f1ef93edad40@redhat.com

This series flips the sole ECAM range that is exposed to the guest to a
large one that is located above 4GB. That's a problem because -- to my
understanding -- it breaks 32-bit ARM UEFI builds, unless you change the
QEMU command line.

(1) Please enable the "firmware repo" from Gerd's site:

https://www.kraxel.org/repos/

(2) Please install the "edk2.git-arm" package.

(3) Please run the 32-bit ARM UEFI firmware, with qemu-system-aarch64,
in a separate directory, as follows (note: TCG only, KVM not needed):

  cp /usr/share/edk2.git/arm/vars-template-pflash.raw vars
  FWBIN=/usr/share/edk2.git/arm/QEMU_EFI-pflash.raw

  qemu-system-aarch64 \
    -nodefaults \
    -no-user-config \
    \
    -M virt \
    -cpu cortex-a15 \
    -m 1024 \
    \
    -drive if=pflash,format=raw,file=$FWBIN,readonly \
    -drive if=pflash,format=raw,file=vars \
    \
    -device virtio-gpu-pci \
    -device qemu-xhci \
    -device usb-kbd \
    \
    -chardev stdio,signal=off,mux=on,id=char0 \
    -mon chardev=char0,mode=readline \
    -serial chardev:char0

This will boot the UEFI shell for you in a graphical window and take
input from the keyboard in that window. A virtio-gpu-pci device is used
as GPU (a PCI Express virtio device) and a USB3.0 keyboard is used as
human input device (the USB3.0 controller is also PCI Express).


I didn't test it, but I expect that this series, when applied as-is,
will break the above use case, unless highmem is explicitly disabled.

I think the first patch is OK (modulo the runaway empty line at the end
of acpi_dsdt_add_pci()), while realizing my review cannot be complete.
:)

Regarding the second patch, I do believe we need "more sophistication"
there. For example, I guess it could be possible to distinguish "-cpu
cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
only, and not on cortex-a57, according to my testing.)

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-23 17:45 ` [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Laszlo Ersek
@ 2018-05-23 20:40   ` Auger Eric
  2018-05-23 20:52     ` Laszlo Ersek
  0 siblings, 1 reply; 17+ messages in thread
From: Auger Eric @ 2018-05-23 20:40 UTC (permalink / raw)
  To: Laszlo Ersek, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell
  Cc: wei, drjones, ard.biesheuvel, zhaoshenglong

Hi Laszlo,

On 05/23/2018 07:45 PM, Laszlo Ersek wrote:
> Hi Eric,
> 
> On 05/23/18 18:03, Eric Auger wrote:
>> Current Machvirt PCI host controller's ECAM region is 16MB large.
>> This limits the number of PCIe buses to 16.
>>
>> PC/Q35 machines have a 256MB region allowing up to 256 buses.
>> This series tries to bridge the gap.
>>
>> It declares a new ECAM region located beyond 256GB, of size 256MB
>> (just after the hypothetical new GICv3 RDIST region). The new
>> ECAM region is used as soon as the highmem option is set (default)
>> and disabled for machines older than 3.0.
>>
>> Best Regards
>>
>> Eric
>>
>> Git: complete series available at
>> https://github.com/eauger/qemu/tree/v2.12.0-256MB-ECAM-RFCv1
>>
>> - Tested with guest running in aarch64 and aarch32 modes (aarch64=off)
>> - In aarch32 mode I encountered the issue the vmalloc region may be
>>   reported too small for the needs (dmesg excerpt below). So I had to
>>   extend the vmalloc size by passing the "vmalloc=512M" option to the
>>   bootargs and this eventually booted fine.
>>
>> [    1.399581] pl061_gpio 9030000.pl061: PL061 GPIO chip @0x0000000009030000 registered
>> [    1.402636] OF: PCI: host bridge /pcie@10000000 ranges:
>> [    1.404506] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
>> [    1.406606] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
>> [    1.408690] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
>> [    1.411992] vmap allocation for size 1052672 failed: use vmalloc=<size> to increase size
>> [    1.414895] pci-host-generic 4010000000.pcie: ECAM ioremap failed
>> [    1.427472] pci-host-generic: probe of 4010000000.pcie failed with error -12
>>
>> - Maybe this issue deserves introducing a new highmem_ecam option?
> 
> I refer to my earlier email here:
> 
>   http://mid.mail-archive.com/13d95529-d61e-fc30-ffd4-f1ef93edad40@redhat.com
> 
> This series flips the sole ECAM range that is exposed to the guest to a
> large one that is located above 4GB. That's a problem because -- to my
> understanding -- it breaks 32-bit ARM UEFI builds, unless you change the
> QEMU command line.

Thank you for your quick feedback. Effectively I ran the aarch32 guest
in DT boot. My bad.
> 
> (1) Please enable the "firmware repo" from Gerd's site:
> 
> https://www.kraxel.org/repos/
> 
> (2) Please install the "edk2.git-arm" package.
> 
> (3) Please run the 32-bit ARM UEFI firmware, with qemu-system-aarch64,
> in a separate directory, as follows (note: TCG only, KVM not needed):
> 
>   cp /usr/share/edk2.git/arm/vars-template-pflash.raw vars
>   FWBIN=/usr/share/edk2.git/arm/QEMU_EFI-pflash.raw
> 
>   qemu-system-aarch64 \
>     -nodefaults \
>     -no-user-config \
>     \
>     -M virt \
>     -cpu cortex-a15 \
>     -m 1024 \
>     \
>     -drive if=pflash,format=raw,file=$FWBIN,readonly \
>     -drive if=pflash,format=raw,file=vars \
>     \
>     -device virtio-gpu-pci \
>     -device qemu-xhci \
>     -device usb-kbd \
>     \
>     -chardev stdio,signal=off,mux=on,id=char0 \
>     -mon chardev=char0,mode=readline \
>     -serial chardev:char0
> 
> This will boot the UEFI shell for you in a graphical window and take
> input from the keyboard in that window. A virtio-gpu-pci device is used
> as GPU (a PCI Express virtio device) and a USB3.0 keyboard is used as
> human input device (the USB3.0 controller is also PCI Express).
> 
> 
> I didn't test it, but I expect that this series, when applied as-is,
> will break the above use case, unless highmem is explicitly disabled.

Yes as you expected, it leads to an exception whereas it works properly
without the series. Actually I misunderstood your last email and was
thinking/hoping that with aarch32 LPAE, things should work.


PCI Bus First Scanning

Data Abort Exception PC at 0x7F7E1A06  CPSR 0x40000033 nZcveaifT_svc
/home/jenkins/workspace/edk2/build/edk2-g7cd8a57599/Build/ArmVirtQemu-ARM/DEBUG_GCC5/ARM/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridgeDxe.dll
loaded at 0x7F7E0000 (PE/COFF offset) 0x1A06 (ELF or Mach-O offset) 0xA06
0x6820       LDR    r0, [r4 #0x0]
  R0 0x00000001   R1 0xFFFFFFFF   R2 0x00000000   R3 0x00000000
  R4 0x10000000   R5 0x7FAC6C28   R6 0x00000001   R7 0x00000004
  R8 0x7FAC6C28   R9 0x00000000  R10 0x00000000  R11 0x00000000
 R12 0x00000000   SP 0x7FAC6B68   LR 0x7F7E19F5   PC 0x7F7E1A06
DFSR 0x00000008 DFAR 0x10000000 IFSR 0x00000000 IFAR 0x00000000
 Precise External Abort: read from 0x10000000

ASSERT [ArmCpuDxe]
/home/jenkins/workspace/edk2/build/edk2-g7cd8a57599/ArmPkg/Library/DefaultExceptionHandlerLib/Arm/DefaultExceptionHandler.c(268):
((BOOLEAN)(0==1))

> 
> I think the first patch is OK (modulo the runaway empty line at the end
> of acpi_dsdt_add_pci()), while realizing my review cannot be complete.
> :)
> 
> Regarding the second patch, I do believe we need "more sophistication"
> there. For example, I guess it could be possible to distinguish "-cpu
> cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
> ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
> only, and not on cortex-a57, according to my testing.)

So we should detect we are in ACPI boot  + aarch32 mode to force legacy
ECAM region, right?

Thanks

Eric
> 
> Thanks,
> Laszlo
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-23 20:40   ` Auger Eric
@ 2018-05-23 20:52     ` Laszlo Ersek
  2018-05-23 20:55       ` Auger Eric
  2018-05-24  9:11       ` Peter Maydell
  0 siblings, 2 replies; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-23 20:52 UTC (permalink / raw)
  To: Auger Eric, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell
  Cc: wei, drjones, ard.biesheuvel, zhaoshenglong

On 05/23/18 22:40, Auger Eric wrote:
> On 05/23/2018 07:45 PM, Laszlo Ersek wrote:

>> Regarding the second patch, I do believe we need "more sophistication"
>> there. For example, I guess it could be possible to distinguish "-cpu
>> cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
>> ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
>> only, and not on cortex-a57, according to my testing.)
> 
> So we should detect we are in ACPI boot  + aarch32 mode to force legacy
> ECAM region, right?

Agree about the aarch32 subcondition.

However, "ACPI vs. DT" is not the right "other" subcondition here;
instead we should (minimally) check "firmware vs. no firmware". See the
"firmware_loaded" boolean field.

I also suggest waiting for feedback from others! :)

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-23 20:52     ` Laszlo Ersek
@ 2018-05-23 20:55       ` Auger Eric
  2018-05-24  9:11       ` Peter Maydell
  1 sibling, 0 replies; 17+ messages in thread
From: Auger Eric @ 2018-05-23 20:55 UTC (permalink / raw)
  To: Laszlo Ersek, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell
  Cc: wei, drjones, ard.biesheuvel, zhaoshenglong

Hi,

On 05/23/2018 10:52 PM, Laszlo Ersek wrote:
> On 05/23/18 22:40, Auger Eric wrote:
>> On 05/23/2018 07:45 PM, Laszlo Ersek wrote:
> 
>>> Regarding the second patch, I do believe we need "more sophistication"
>>> there. For example, I guess it could be possible to distinguish "-cpu
>>> cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
>>> ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
>>> only, and not on cortex-a57, according to my testing.)
>>
>> So we should detect we are in ACPI boot  + aarch32 mode to force legacy
>> ECAM region, right?
> 
> Agree about the aarch32 subcondition.
> 
> However, "ACPI vs. DT" is not the right "other" subcondition here;
> instead we should (minimally) check "firmware vs. no firmware". See the
> "firmware_loaded" boolean field.

OK
> 
> I also suggest waiting for feedback from others! :)

sure ;-)

Eric
> 
> Thanks,
> Laszlo
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-23 20:52     ` Laszlo Ersek
  2018-05-23 20:55       ` Auger Eric
@ 2018-05-24  9:11       ` Peter Maydell
  2018-05-24 12:59         ` Laszlo Ersek
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Maydell @ 2018-05-24  9:11 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Auger Eric, Eric Auger, QEMU Developers, qemu-arm, Wei Huang,
	Andrew Jones, Ard Biesheuvel, Shannon Zhao

On 23 May 2018 at 21:52, Laszlo Ersek <lersek@redhat.com> wrote:
> On 05/23/18 22:40, Auger Eric wrote:
>> On 05/23/2018 07:45 PM, Laszlo Ersek wrote:
>
>>> Regarding the second patch, I do believe we need "more sophistication"
>>> there. For example, I guess it could be possible to distinguish "-cpu
>>> cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
>>> ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
>>> only, and not on cortex-a57, according to my testing.)
>>
>> So we should detect we are in ACPI boot  + aarch32 mode to force legacy
>> ECAM region, right?
>
> Agree about the aarch32 subcondition.
>
> However, "ACPI vs. DT" is not the right "other" subcondition here;
> instead we should (minimally) check "firmware vs. no firmware". See the
> "firmware_loaded" boolean field.

Won't it also break a guest which is just Linux loaded not via
firmware which is an aarch32 kernel without LPAE support?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24  9:11       ` Peter Maydell
@ 2018-05-24 12:59         ` Laszlo Ersek
  2018-05-24 13:07           ` Peter Maydell
  0 siblings, 1 reply; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-24 12:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Auger Eric, Eric Auger, QEMU Developers, qemu-arm, Wei Huang,
	Andrew Jones, Ard Biesheuvel, Shannon Zhao

On 05/24/18 11:11, Peter Maydell wrote:
> On 23 May 2018 at 21:52, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 05/23/18 22:40, Auger Eric wrote:
>>> On 05/23/2018 07:45 PM, Laszlo Ersek wrote:
>>
>>>> Regarding the second patch, I do believe we need "more sophistication"
>>>> there. For example, I guess it could be possible to distinguish "-cpu
>>>> cortex-a15" from "-cpu cortex-a57" somehow, and stick with the low/small
>>>> ECAM in the former case. (The 32-bit firmware already runs on cortex-a15
>>>> only, and not on cortex-a57, according to my testing.)
>>>
>>> So we should detect we are in ACPI boot  + aarch32 mode to force legacy
>>> ECAM region, right?
>>
>> Agree about the aarch32 subcondition.
>>
>> However, "ACPI vs. DT" is not the right "other" subcondition here;
>> instead we should (minimally) check "firmware vs. no firmware". See the
>> "firmware_loaded" boolean field.
> 
> Won't it also break a guest which is just Linux loaded not via
> firmware which is an aarch32 kernel without LPAE support?

Does such a thing exist? (I honestly have no clue.)

If it does, then there are two options:
- don't enable the new ECAM range by default (always take an explicit
option),
- offer both ECAM ranges and let the guest pick one (I should add that I
have no idea whether exposing such *alternatives* is possible via DT and
ACPI; i.e., "pick one but not both").

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 12:59         ` Laszlo Ersek
@ 2018-05-24 13:07           ` Peter Maydell
  2018-05-24 13:10             ` Auger Eric
  2018-05-24 13:59             ` Laszlo Ersek
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Maydell @ 2018-05-24 13:07 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Auger Eric, Eric Auger, QEMU Developers, qemu-arm, Wei Huang,
	Andrew Jones, Ard Biesheuvel, Shannon Zhao

On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
> On 05/24/18 11:11, Peter Maydell wrote:
>> Won't it also break a guest which is just Linux loaded not via
>> firmware which is an aarch32 kernel without LPAE support?
>
> Does such a thing exist? (I honestly have no clue.)

Yes, it does; LPAE isn't a mandatory kernel config option.
This is why we have the machine 'highmem' option, so that
we can run on those kernels by not putting anything above
the 4G boundary. Looking back at the history on that, we
opted at the time for "default to highmem on, and if you're
running an non-lpae kernel you need to turn it off manually".
So we can handle those kernels by just not putting ECAM
above 4G if highmem is false.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 13:07           ` Peter Maydell
@ 2018-05-24 13:10             ` Auger Eric
  2018-05-24 13:59             ` Laszlo Ersek
  1 sibling, 0 replies; 17+ messages in thread
From: Auger Eric @ 2018-05-24 13:10 UTC (permalink / raw)
  To: Peter Maydell, Laszlo Ersek
  Cc: Eric Auger, QEMU Developers, qemu-arm, Wei Huang, Andrew Jones,
	Ard Biesheuvel, Shannon Zhao

Hi Peter, Laszlo,

On 05/24/2018 03:07 PM, Peter Maydell wrote:
> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 05/24/18 11:11, Peter Maydell wrote:
>>> Won't it also break a guest which is just Linux loaded not via
>>> firmware which is an aarch32 kernel without LPAE support?
>>
>> Does such a thing exist? (I honestly have no clue.)
> 
> Yes, it does; LPAE isn't a mandatory kernel config option.
> This is why we have the machine 'highmem' option, so that
> we can run on those kernels by not putting anything above
> the 4G boundary. Looking back at the history on that, we
> opted at the time for "default to highmem on, and if you're
> running an non-lpae kernel you need to turn it off manually".
> So we can handle those kernels by just not putting ECAM
> above 4G if highmem is false.

Actually that's what my series does. If highmem=off then we use the
legacy ECAM.

Thanks

Eric
> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 13:07           ` Peter Maydell
  2018-05-24 13:10             ` Auger Eric
@ 2018-05-24 13:59             ` Laszlo Ersek
  2018-05-24 14:09               ` Auger Eric
  2018-05-24 14:14               ` Ard Biesheuvel
  1 sibling, 2 replies; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-24 13:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Auger Eric, Eric Auger, QEMU Developers, qemu-arm, Wei Huang,
	Andrew Jones, Ard Biesheuvel, Shannon Zhao

On 05/24/18 15:07, Peter Maydell wrote:
> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 05/24/18 11:11, Peter Maydell wrote:
>>> Won't it also break a guest which is just Linux loaded not via
>>> firmware which is an aarch32 kernel without LPAE support?
>>
>> Does such a thing exist? (I honestly have no clue.)
> 
> Yes, it does; LPAE isn't a mandatory kernel config option.
> This is why we have the machine 'highmem' option, so that
> we can run on those kernels by not putting anything above
> the 4G boundary. Looking back at the history on that, we
> opted at the time for "default to highmem on, and if you're
> running an non-lpae kernel you need to turn it off manually".

Ah, OK, I didn't know that.

> So we can handle those kernels by just not putting ECAM
> above 4G if highmem is false.

The problem is we can have a combination of 32-bit UEFI firmware (which
certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
Previously, you wouldn't specify highmem=off, and things would just work
-- the firmware would simply ignore the >=4GB MMIO aperture, and use the
32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
could then use both low and high MMIO apertures, however (I gather?).

The difference with "high ECAM" is that it is *moved* (not *added*), so
the 32-bit firmware is left with nothing for config space access. For
booting the same combination as above, you are suddenly forced to add
highmem=off, just to keep the ECAM low -- and that, while it keeps the
firmware happy, prevents the LPAE-capable kernel from using the high
MMIO aperture.

So I think "highmem_ecam" should be computed like this:

  highmem_ecam = highmem_ecam_machtype_default &&
                 highmem &&
                 (!firmware_loaded || aarch64);

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 13:59             ` Laszlo Ersek
@ 2018-05-24 14:09               ` Auger Eric
  2018-05-24 16:58                 ` Laszlo Ersek
  2018-05-24 14:14               ` Ard Biesheuvel
  1 sibling, 1 reply; 17+ messages in thread
From: Auger Eric @ 2018-05-24 14:09 UTC (permalink / raw)
  To: Laszlo Ersek, Peter Maydell
  Cc: Wei Huang, Andrew Jones, Ard Biesheuvel, QEMU Developers,
	qemu-arm, Shannon Zhao, Eric Auger

Hi Laszlo,

On 05/24/2018 03:59 PM, Laszlo Ersek wrote:
> On 05/24/18 15:07, Peter Maydell wrote:
>> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>> Won't it also break a guest which is just Linux loaded not via
>>>> firmware which is an aarch32 kernel without LPAE support?
>>>
>>> Does such a thing exist? (I honestly have no clue.)
>>
>> Yes, it does; LPAE isn't a mandatory kernel config option.
>> This is why we have the machine 'highmem' option, so that
>> we can run on those kernels by not putting anything above
>> the 4G boundary. Looking back at the history on that, we
>> opted at the time for "default to highmem on, and if you're
>> running an non-lpae kernel you need to turn it off manually".
> 
> Ah, OK, I didn't know that.
> 
>> So we can handle those kernels by just not putting ECAM
>> above 4G if highmem is false.
> 
> The problem is we can have a combination of 32-bit UEFI firmware (which
> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.

Is it what happens with the FW you provided to me? There is no LPAE in it?

> Previously, you wouldn't specify highmem=off, and things would just work
> -- the firmware would simply ignore the >=4GB MMIO apertur  e, and use the
> 32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
> could then use both low and high MMIO apertures, however (I gather?).
>   
> The difference with "high ECAM" is that it is *moved* (not *added*), so
> the 32-bit firmware is left with nothing for config space access.
Yes it is not possible to declare several disjoint ECAM spaces for a
single segment I think, hence the move.

 For
> booting the same combination as above, you are suddenly forced to add
> highmem=off, just to keep the ECAM low -- and that, while it keeps the
> firmware happy, prevents the LPAE-capable kernel from using the high
> MMIO aperture.
> 
> So I think "highmem_ecam" should be computed like this:
> 
>   highmem_ecam = highmem_ecam_machtype_default &&
>                  highmem &&
>                  (!firmware_loaded || aarch64);

Looks sensible to me

Thanks

Eric
> 
> Thanks,
> Laszlo
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 13:59             ` Laszlo Ersek
  2018-05-24 14:09               ` Auger Eric
@ 2018-05-24 14:14               ` Ard Biesheuvel
  2018-05-24 17:20                 ` Laszlo Ersek
  1 sibling, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2018-05-24 14:14 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Peter Maydell, Auger Eric, Eric Auger, QEMU Developers, qemu-arm,
	Wei Huang, Andrew Jones, Shannon Zhao

On 24 May 2018 at 15:59, Laszlo Ersek <lersek@redhat.com> wrote:
> On 05/24/18 15:07, Peter Maydell wrote:
>> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>> Won't it also break a guest which is just Linux loaded not via
>>>> firmware which is an aarch32 kernel without LPAE support?
>>>
>>> Does such a thing exist? (I honestly have no clue.)
>>
>> Yes, it does; LPAE isn't a mandatory kernel config option.
>> This is why we have the machine 'highmem' option, so that
>> we can run on those kernels by not putting anything above
>> the 4G boundary. Looking back at the history on that, we
>> opted at the time for "default to highmem on, and if you're
>> running an non-lpae kernel you need to turn it off manually".
>
> Ah, OK, I didn't know that.
>
>> So we can handle those kernels by just not putting ECAM
>> above 4G if highmem is false.
>
> The problem is we can have a combination of 32-bit UEFI firmware (which
> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
> Previously, you wouldn't specify highmem=off, and things would just work
> -- the firmware would simply ignore the >=4GB MMIO aperture, and use the
> 32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
> could then use both low and high MMIO apertures, however (I gather?).
>
> The difference with "high ECAM" is that it is *moved* (not *added*), so
> the 32-bit firmware is left with nothing for config space access. For
> booting the same combination as above, you are suddenly forced to add
> highmem=off, just to keep the ECAM low -- and that, while it keeps the
> firmware happy, prevents the LPAE-capable kernel from using the high
> MMIO aperture.
>
> So I think "highmem_ecam" should be computed like this:
>
>   highmem_ecam = highmem_ecam_machtype_default &&
>                  highmem &&
>                  (!firmware_loaded || aarch64);
>

Given that the firmware is tightly coupled to the platform, we may
decide not to care about ECAM for UEFI itself, and invent a secondary
config space access mechanism that does not consume such a huge amount
of address space. For instance, legacy PCI uses a pair of I/O ports
for this, one to set the address and one to perform the actual read or
write, and we could easily implement something similar (such an
interface is problematic in SMP context but we don't care about that
anyway)

Just a thought - perhaps we don't care enough about 32-bit to go
through the trouble, but it would be nice if LPAE capable 32-bit
guests could make use of the expanded PCIe config space as well.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 14:09               ` Auger Eric
@ 2018-05-24 16:58                 ` Laszlo Ersek
  0 siblings, 0 replies; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-24 16:58 UTC (permalink / raw)
  To: Auger Eric, Peter Maydell
  Cc: Wei Huang, Andrew Jones, Ard Biesheuvel, QEMU Developers,
	qemu-arm, Shannon Zhao, Eric Auger

On 05/24/18 16:09, Auger Eric wrote:
> Hi Laszlo,
> 
> On 05/24/2018 03:59 PM, Laszlo Ersek wrote:
>> On 05/24/18 15:07, Peter Maydell wrote:
>>> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>>> Won't it also break a guest which is just Linux loaded not via
>>>>> firmware which is an aarch32 kernel without LPAE support?
>>>>
>>>> Does such a thing exist? (I honestly have no clue.)
>>>
>>> Yes, it does; LPAE isn't a mandatory kernel config option.
>>> This is why we have the machine 'highmem' option, so that
>>> we can run on those kernels by not putting anything above
>>> the 4G boundary. Looking back at the history on that, we
>>> opted at the time for "default to highmem on, and if you're
>>> running an non-lpae kernel you need to turn it off manually".
>>
>> Ah, OK, I didn't know that.
>>
>>> So we can handle those kernels by just not putting ECAM
>>> above 4G if highmem is false.
>>
>> The problem is we can have a combination of 32-bit UEFI firmware (which
>> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
> 
> Is it what happens with the FW you provided to me? There is no LPAE in it?

That's the case, to my knowledge.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 14:14               ` Ard Biesheuvel
@ 2018-05-24 17:20                 ` Laszlo Ersek
  2018-05-24 19:26                   ` Auger Eric
  0 siblings, 1 reply; 17+ messages in thread
From: Laszlo Ersek @ 2018-05-24 17:20 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Peter Maydell, Auger Eric, Eric Auger, QEMU Developers, qemu-arm,
	Wei Huang, Andrew Jones, Shannon Zhao

On 05/24/18 16:14, Ard Biesheuvel wrote:
> On 24 May 2018 at 15:59, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 05/24/18 15:07, Peter Maydell wrote:
>>> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>>> Won't it also break a guest which is just Linux loaded not via
>>>>> firmware which is an aarch32 kernel without LPAE support?
>>>>
>>>> Does such a thing exist? (I honestly have no clue.)
>>>
>>> Yes, it does; LPAE isn't a mandatory kernel config option.
>>> This is why we have the machine 'highmem' option, so that
>>> we can run on those kernels by not putting anything above
>>> the 4G boundary. Looking back at the history on that, we
>>> opted at the time for "default to highmem on, and if you're
>>> running an non-lpae kernel you need to turn it off manually".
>>
>> Ah, OK, I didn't know that.
>>
>>> So we can handle those kernels by just not putting ECAM
>>> above 4G if highmem is false.
>>
>> The problem is we can have a combination of 32-bit UEFI firmware (which
>> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
>> Previously, you wouldn't specify highmem=off, and things would just work
>> -- the firmware would simply ignore the >=4GB MMIO aperture, and use the
>> 32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
>> could then use both low and high MMIO apertures, however (I gather?).
>>
>> The difference with "high ECAM" is that it is *moved* (not *added*), so
>> the 32-bit firmware is left with nothing for config space access. For
>> booting the same combination as above, you are suddenly forced to add
>> highmem=off, just to keep the ECAM low -- and that, while it keeps the
>> firmware happy, prevents the LPAE-capable kernel from using the high
>> MMIO aperture.
>>
>> So I think "highmem_ecam" should be computed like this:
>>
>>   highmem_ecam = highmem_ecam_machtype_default &&
>>                  highmem &&
>>                  (!firmware_loaded || aarch64);
>>
> 
> Given that the firmware is tightly coupled to the platform, we may
> decide not to care about ECAM for UEFI itself, and invent a secondary
> config space access mechanism that does not consume such a huge amount
> of address space. For instance, legacy PCI uses a pair of I/O ports
> for this, one to set the address and one to perform the actual read or
> write, and we could easily implement something similar (such an
> interface is problematic in SMP context but we don't care about that
> anyway)
> 
> Just a thought - perhaps we don't care enough about 32-bit to go
> through the trouble, but it would be nice if LPAE capable 32-bit
> guests could make use of the expanded PCIe config space as well.

Under the above proposal, they could, they'd just have to be launched
without firmware:

  highmem_ecam_machtype_default = true;
  highmem = true;
  firmware_loaded = false;
  aarch64 = false;

  highmem_ecam = true &&
                 true &&
                 (!false || false);

I see a return to the 0xCF8/0xCFC pattern regressive; I'd rather
restrict the large/high ECAM feature to 64-bit guests (with or without
firmware), and to 32-bit LPAE kernels that are launched without firmware
(which, I think, has been the case for most of their history).

Personally I don't have a stake in 32-bit ARM, so do take my opinion
with a grain of salt. Wearing my upstream ArmVirtQemu co-maintainer hat,
my sole 32-bit interest is in keeping command lines working, *if* they
once worked. Not extending new QEMU features to 32-bit firmware is fine
with me -- in fact I would value that over seeing more quirky firmware
code just for 32-bit's sake.

Side topic: the last subcondition basically says, "IF we use firmware
THEN the VM had better be 64-bit". This is a "logical implication":
A-->B. The C language doesn't have an "implication operator", so I
rewrote it equivalently with the logical negation and logical OR
operators: A-->B is equivalent to (!A || B). (If A is true, then B must
hold; if A is false, then B doesn't matter.)

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses
  2018-05-24 17:20                 ` Laszlo Ersek
@ 2018-05-24 19:26                   ` Auger Eric
  0 siblings, 0 replies; 17+ messages in thread
From: Auger Eric @ 2018-05-24 19:26 UTC (permalink / raw)
  To: Laszlo Ersek, Ard Biesheuvel
  Cc: Wei Huang, Peter Maydell, Andrew Jones, QEMU Developers,
	qemu-arm, Shannon Zhao, Eric Auger



On 05/24/2018 07:20 PM, Laszlo Ersek wrote:
> On 05/24/18 16:14, Ard Biesheuvel wrote:
>> On 24 May 2018 at 15:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>> On 05/24/18 15:07, Peter Maydell wrote:
>>>> On 24 May 2018 at 13:59, Laszlo Ersek <lersek@redhat.com> wrote:
>>>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>>>> Won't it also break a guest which is just Linux loaded not via
>>>>>> firmware which is an aarch32 kernel without LPAE support?
>>>>>
>>>>> Does such a thing exist? (I honestly have no clue.)
>>>>
>>>> Yes, it does; LPAE isn't a mandatory kernel config option.
>>>> This is why we have the machine 'highmem' option, so that
>>>> we can run on those kernels by not putting anything above
>>>> the 4G boundary. Looking back at the history on that, we
>>>> opted at the time for "default to highmem on, and if you're
>>>> running an non-lpae kernel you need to turn it off manually".
>>>
>>> Ah, OK, I didn't know that.
>>>
>>>> So we can handle those kernels by just not putting ECAM
>>>> above 4G if highmem is false.
>>>
>>> The problem is we can have a combination of 32-bit UEFI firmware (which
>>> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
>>> Previously, you wouldn't specify highmem=off, and things would just work
>>> -- the firmware would simply ignore the >=4GB MMIO aperture, and use the
>>> 32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
>>> could then use both low and high MMIO apertures, however (I gather?).
>>>
>>> The difference with "high ECAM" is that it is *moved* (not *added*), so
>>> the 32-bit firmware is left with nothing for config space access. For
>>> booting the same combination as above, you are suddenly forced to add
>>> highmem=off, just to keep the ECAM low -- and that, while it keeps the
>>> firmware happy, prevents the LPAE-capable kernel from using the high
>>> MMIO aperture.
>>>
>>> So I think "highmem_ecam" should be computed like this:
>>>
>>>   highmem_ecam = highmem_ecam_machtype_default &&
>>>                  highmem &&
>>>                  (!firmware_loaded || aarch64);
>>>
>>
>> Given that the firmware is tightly coupled to the platform, we may
>> decide not to care about ECAM for UEFI itself, and invent a secondary
>> config space access mechanism that does not consume such a huge amount
>> of address space. For instance, legacy PCI uses a pair of I/O ports
>> for this, one to set the address and one to perform the actual read or
>> write, and we could easily implement something similar (such an
>> interface is problematic in SMP context but we don't care about that
>> anyway)
>>
>> Just a thought - perhaps we don't care enough about 32-bit to go
>> through the trouble, but it would be nice if LPAE capable 32-bit
>> guests could make use of the expanded PCIe config space as well.
> 
> Under the above proposal, they could, they'd just have to be launched
> without firmware:
> 
>   highmem_ecam_machtype_default = true;
>   highmem = true;
>   firmware_loaded = false;
>   aarch64 = false;
> 
>   highmem_ecam = true &&
>                  true &&
>                  (!false || false);

I think we mostly care about 64b guest experience improvement here. So
personally I am fine with your proposal.

Also there is this vmalloc shortage issue, hit with aarch32 guest only,
up to now (Which I reported at the end of the cover letter). This can
cause some existing guest configs (even without FW) to not boot with the
new high ECAM region whereas it booted before. I don't know if this is
acceptable.

Thanks

Eric
> 
> I see a return to the 0xCF8/0xCFC pattern regressive; I'd rather
> restrict the large/high ECAM feature to 64-bit guests (with or without
> firmware), and to 32-bit LPAE kernels that are launched without firmware
> (which, I think, has been the case for most of their history).
> 
> Personally I don't have a stake in 32-bit ARM, so do take my opinion
> with a grain of salt. Wearing my upstream ArmVirtQemu co-maintainer hat,
> my sole 32-bit interest is in keeping command lines working, *if* they
> once worked. Not extending new QEMU features to 32-bit firmware is fine
> with me -- in fact I would value that over seeing more quirky firmware
> code just for 32-bit's sake.
> 
> Side topic: the last subcondition basically says, "IF we use firmware
> THEN the VM had better be 64-bit". This is a "logical implication":
> A-->B. The C language doesn't have an "implication operator", so I
> rewrote it equivalently with the logical negation and logical OR
> operators: A-->B is equivalent to (!A || B). (If A is true, then B must
> hold; if A is false, then B doesn't matter.)
> 
> Thanks,
> Laszlo
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-05-24 19:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-23 16:03 [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Eric Auger
2018-05-23 16:03 ` [Qemu-devel] [RFC 1/2] hw/arm/virt: Add a new 256MB ECAM region Eric Auger
2018-05-23 16:03 ` [Qemu-devel] [RFC 2/2] hw/arm/virt: Add virt-3.0 machine type Eric Auger
2018-05-23 17:45 ` [Qemu-devel] [RFC 0/2] ARM virt: Support up to 256 PCIe buses Laszlo Ersek
2018-05-23 20:40   ` Auger Eric
2018-05-23 20:52     ` Laszlo Ersek
2018-05-23 20:55       ` Auger Eric
2018-05-24  9:11       ` Peter Maydell
2018-05-24 12:59         ` Laszlo Ersek
2018-05-24 13:07           ` Peter Maydell
2018-05-24 13:10             ` Auger Eric
2018-05-24 13:59             ` Laszlo Ersek
2018-05-24 14:09               ` Auger Eric
2018-05-24 16:58                 ` Laszlo Ersek
2018-05-24 14:14               ` Ard Biesheuvel
2018-05-24 17:20                 ` Laszlo Ersek
2018-05-24 19:26                   ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.