From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: Auger Eric <eric.auger@redhat.com>, Andrew Jones <drjones@redhat.com>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	Zhaoshenglong <zhaoshenglong@huawei.com>,
	Linuxarm <linuxarm@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"imammedo@redhat.com" <imammedo@redhat.com>
Subject: Re: [Qemu-devel] [RFC v2 5/6] hw/arm: ACPI SRAT changes to accommodate non-contiguous mem
Date: Fri, 1 Jun 2018 11:09:13 +0000
Message-ID: <5FC3163CFD30C246ABAA99954A238FA8386F539C@FRAEML521-MBX.china.huawei.com>
In-Reply-To: <27825c1d-07a2-f03e-477a-03e3d778ac35@redhat.com>

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: Thursday, May 31, 2018 9:16 PM
> To: Andrew Jones <drjones@redhat.com>; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>
> Cc: peter.maydell@linaro.org; Zhaoshenglong <zhaoshenglong@huawei.com>;
> Linuxarm <linuxarm@huawei.com>; qemu-devel@nongnu.org;
> alex.williamson@redhat.com; qemu-arm@nongnu.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; imammedo@redhat.com
> Subject: Re: [Qemu-devel] [RFC v2 5/6] hw/arm: ACPI SRAT changes to
> accommodate non-contiguous mem
> 
> Hi Shameer,
> 
> On 05/28/2018 07:02 PM, Andrew Jones wrote:
> > On Wed, May 16, 2018 at 04:20:25PM +0100, Shameer Kolothum wrote:
> >> This is in preparation for the next patch where initial ram is split
> >> into a non-pluggable chunk and a pc-dimm modeled mem if the valid
> >> iova regions are non-contiguous.
> >>
> >> Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> >> ---
> >>  hw/arm/virt-acpi-build.c | 24 ++++++++++++++++++++----
> >>  1 file changed, 20 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> >> index c7c6a57..8d17b40 100644
> >> --- a/hw/arm/virt-acpi-build.c
> >> +++ b/hw/arm/virt-acpi-build.c
> >> @@ -488,7 +488,7 @@ build_srat(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >>      AcpiSratProcessorGiccAffinity *core;
> >>      AcpiSratMemoryAffinity *numamem;
> >>      int i, srat_start;
> >> -    uint64_t mem_base;
> >> +    uint64_t mem_base, mem_sz, mem_len;
> >>      MachineClass *mc = MACHINE_GET_CLASS(vms);
> >>      const CPUArchIdList *cpu_list = mc-
> >possible_cpu_arch_ids(MACHINE(vms));
> >>
> >> @@ -505,12 +505,28 @@ build_srat(GArray *table_data, BIOSLinker
> *linker, VirtMachineState *vms)
> >>          core->flags = cpu_to_le32(1);
> >>      }
> >>
> >> -    mem_base = vms->memmap[VIRT_MEM].base;
> >> +    mem_base = vms->bootinfo.loader_start;
> >> +    mem_sz = vms->bootinfo.loader_start;
> >
> > mem_sz = vms->bootinfo.ram_size;
> >
> > Assuming the DT generator was correct, meaning bootinfo.ram_size will
> > be the size of the non-pluggable dimm.
> >
> >
> >>      for (i = 0; i < nb_numa_nodes; ++i) {
> >>          numamem = acpi_data_push(table_data, sizeof(*numamem));
> >> -        build_srat_memory(numamem, mem_base, numa_info[i].node_mem,
> i,
> >> +        mem_len = MIN(numa_info[i].node_mem, mem_sz);
> >> +        build_srat_memory(numamem, mem_base, mem_len, i,
> >>                            MEM_AFFINITY_ENABLED);
> >> -        mem_base += numa_info[i].node_mem;
> >> +        mem_base += mem_len;
> >> +        mem_sz -= mem_len;
> >> +        if (!mem_sz) {
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >> +    /* Create table for initial pc-dimm ram, if any */
> >> +    if (vms->bootinfo.dimm_mem) {
> >> +        numamem = acpi_data_push(table_data, sizeof(*numamem));
> >> +        build_srat_memory(numamem, vms->bootinfo.dimm_mem->base,
> >> +                          vms->bootinfo.dimm_mem->size,
> >> +                          vms->bootinfo.dimm_mem->node,
> >> +                          MEM_AFFINITY_ENABLED);
> If my understanding is correct the SRAT table is built only if
> nb_numa_nodes > 0. I don't get how the PC-DIMM region is exposed if NUMA
> nodes are not set?

Yes, SRAT is only built when nb_numa_nodes > 0. I had the same doubt about how the guest
would see the pc-dimm node on ACPI boot without NUMA nodes, but in my tests it did.

These are my qemu command-line options; the guest boot logs without and with "-numa node,nodeid=0" follow.

./qemu-system-aarch64 -machine virt,kernel_irqchip=on,gic-version=3 -cpu host \
-kernel Image \
-initrd rootfs-iperf.cpio \
-device vfio-pci,host=000a:11:10.0 \
-net none \
-m 12G \
-numa node,nodeid=0 \
-nographic -D -d -enable-kvm \
-smp 4 \
-bios QEMU_EFI.fd \
-append "console=ttyAMA0 root=/dev/vda -m 4096 rw earlycon=pl011,0x9000000 acpi=force"
  

1. Guest Boot log (without -numa node,nodeid=0 )
---------------------------------------------------------------

[    0.000000] Boot CPU: AArch64 Processor [410fd082]
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] bootconsole [pl11] enabled
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: EFI v2.60 by EDK II
[    0.000000] efi:  SMBIOS 3.0=0x78710000  ACPI 2.0=0x789b0000  MEMATTR=0x7ba44018 
[    0.000000] cma: Reserved 16 MiB at 0x000000007f000000
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000789B0000 000024 (v02 BOCHS )
[    0.000000] ACPI: XSDT 0x00000000789A0000 000054 (v01 BOCHS  BXPCFACP 00000001      01000013)
[    0.000000] ACPI: FACP 0x0000000078610000 00010C (v05 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x0000000078620000 0011F7 (v02 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 0x0000000078600000 000198 (v03 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: GTDT 0x00000000785F0000 000060 (v02 BOCHS  BXPCGTDT 00000001 BXPC 00000001)
[    0.000000] ACPI: MCFG 0x00000000785E0000 00003C (v01 BOCHS  BXPCMCFG 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR 0x00000000785D0000 000050 (v02 BOCHS  BXPCSPCR 00000001 BXPC 00000001)
[    0.000000] ACPI: IORT 0x00000000785C0000 00007C (v00 BOCHS  BXPCIORT 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR: console: pl011,mmio,0x9000000,9600
[    0.000000] ACPI: NUMA: Failed to initialise from firmware
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x00000003bfffffff]
[    0.000000] NUMA: Adding memblock [0x40000000 - 0x785bffff] on node 0
[    0.000000] NUMA: Adding memblock [0x785c0000 - 0x7862ffff] on node 0
[    0.000000] NUMA: Adding memblock [0x78630000 - 0x786fffff] on node 0
[    0.000000] NUMA: Adding memblock [0x78700000 - 0x78b63fff] on node 0
[    0.000000] NUMA: Adding memblock [0x78b64000 - 0x7be3ffff] on node 0
[    0.000000] NUMA: Adding memblock [0x7be40000 - 0x7becffff] on node 0
[    0.000000] NUMA: Adding memblock [0x7bed0000 - 0x7bedffff] on node 0
[    0.000000] NUMA: Adding memblock [0x7bee0000 - 0x7bffffff] on node 0
[    0.000000] NUMA: Adding memblock [0x7c000000 - 0x7fffffff] on node 0
[    0.000000] NUMA: Adding memblock [0x100000000 - 0x3bfffffff] on node 0
[    0.000000] NUMA: Initmem setup node 0 [mem 0x40000000-0x3bfffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x3bffef500-0x3bfff0fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x00000003bfffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x00000000785bffff]
[    0.000000]   node   0: [mem 0x00000000785c0000-0x000000007862ffff]
[    0.000000]   node   0: [mem 0x0000000078630000-0x00000000786fffff]
[    0.000000]   node   0: [mem 0x0000000078700000-0x0000000078b63fff]
[    0.000000]   node   0: [mem 0x0000000078b64000-0x000000007be3ffff]
[    0.000000]   node   0: [mem 0x000000007be40000-0x000000007becffff]
[    0.000000]   node   0: [mem 0x000000007bed0000-0x000000007bedffff]
[    0.000000]   node   0: [mem 0x000000007bee0000-0x000000007bffffff]
[    0.000000]   node   0: [mem 0x000000007c000000-0x000000007fffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x00000003bfffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000003bfffffff]
[    0.000000] psci: probing for conduit method from ACPI.


2. Guest Boot log (with -numa node,nodeid=0 )
---------------------------------------------------------------

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.11.0-rc1-g7426f0c (shameer@shameer-ubuntu) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #228 SMP PREEMPT Mon Apr 24 14:51:06 BST 2017
[    0.000000] Boot CPU: AArch64 Processor [410fd082]
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] bootconsole [pl11] enabled
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: EFI v2.60 by EDK II
[    0.000000] efi:  SMBIOS 3.0=0x78710000  ACPI 2.0=0x789b0000  MEMATTR=0x7ba44018 
[    0.000000] cma: Reserved 16 MiB at 0x000000007f000000
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000789B0000 000024 (v02 BOCHS )
[    0.000000] ACPI: XSDT 0x00000000789A0000 00005C (v01 BOCHS  BXPCFACP 00000001      01000013)
[    0.000000] ACPI: FACP 0x0000000078610000 00010C (v05 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x0000000078620000 0011F7 (v02 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 0x0000000078600000 000198 (v03 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: GTDT 0x00000000785F0000 000060 (v02 BOCHS  BXPCGTDT 00000001 BXPC 00000001)
[    0.000000] ACPI: MCFG 0x00000000785E0000 00003C (v01 BOCHS  BXPCMCFG 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR 0x00000000785D0000 000050 (v02 BOCHS  BXPCSPCR 00000001 BXPC 00000001)
[    0.000000] ACPI: SRAT 0x00000000785C0000 0000C8 (v03 BOCHS  BXPCSRAT 00000001 BXPC 00000001)
[    0.000000] ACPI: IORT 0x00000000785B0000 00007C (v00 BOCHS  BXPCIORT 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR: console: pl011,mmio,0x9000000,9600
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x0 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x1 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x2 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x3 -> Node 0
[    0.000000] NUMA: Adding memblock [0x40000000 - 0x7fffffff] on node 0
[    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x40000000-0x7fffffff]
[    0.000000] NUMA: Adding memblock [0x100000000 - 0x3bfffffff] on node 0
[    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x3bfffffff]
[    0.000000] NUMA: Initmem setup node 0 [mem 0x40000000-0x3bfffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x3bffef500-0x3bfff0fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x00000003bfffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x00000000785affff]
[    0.000000]   node   0: [mem 0x00000000785b0000-0x000000007862ffff]
[    0.000000]   node   0: [mem 0x0000000078630000-0x00000000786fffff]
[    0.000000]   node   0: [mem 0x0000000078700000-0x0000000078b63fff]
[    0.000000]   node   0: [mem 0x0000000078b64000-0x000000007be3ffff]
[    0.000000]   node   0: [mem 0x000000007be40000-0x000000007becffff]
[    0.000000]   node   0: [mem 0x000000007bed0000-0x000000007bedffff]
[    0.000000]   node   0: [mem 0x000000007bee0000-0x000000007bffffff]
[    0.000000]   node   0: [mem 0x000000007c000000-0x000000007fffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x00000003bfffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000003bfffffff]
[    0.000000] psci: probing for conduit method from ACPI.


In both cases the memblock [0x100000000 - 0x3bfffffff], which corresponds to the
pc-dimm slot, is present. My guess is that this is because the guest kernel retrieves
the UEFI parameters from the FDT when EFI boot is detected:

[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: EFI v2.60 by EDK II
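
As a rough illustration of the SRAT layout seen in log 2, the following is a
standalone sketch of the patched loop, not the QEMU code itself. The names
srat_mem, dimm_region and model_srat are made up for this example, and the
1G non-pluggable / 11G pc-dimm split is an assumption read off the memblock
ranges above. It uses ram_size for mem_sz, as Andrew suggested.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one SRAT memory affinity entry. */
struct srat_mem {
    uint64_t base;
    uint64_t len;
    int node;
};

/* Hypothetical stand-in for the initial pc-dimm region (size 0 means none). */
struct dimm_region {
    uint64_t base;
    uint64_t size;
    int node;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Model of the patched build_srat() memory loop: spread the non-pluggable
 * RAM over the NUMA nodes, capping each node at what is left of that RAM,
 * then append one entry for the initial pc-dimm region, if any.
 * 'out' must have room for nb_nodes + 1 entries.
 * Returns the number of entries written.
 */
size_t model_srat(uint64_t ram_base, uint64_t ram_size,
                  const uint64_t *node_mem, int nb_nodes,
                  const struct dimm_region *dimm,
                  struct srat_mem *out)
{
    size_t n = 0;
    uint64_t mem_base = ram_base;
    uint64_t mem_sz = ram_size;   /* size of the non-pluggable chunk */

    for (int i = 0; i < nb_nodes; ++i) {
        uint64_t mem_len = min_u64(node_mem[i], mem_sz);

        out[n++] = (struct srat_mem){ mem_base, mem_len, i };
        mem_base += mem_len;
        mem_sz -= mem_len;
        if (!mem_sz) {
            break;                /* non-pluggable RAM fully described */
        }
    }

    /* Entry for the initial pc-dimm RAM, if present. */
    if (dimm && dimm->size) {
        out[n++] = (struct srat_mem){ dimm->base, dimm->size, dimm->node };
    }
    return n;
}
```

Called with ram_base 0x40000000, ram_size 1G, a single node of 12G and a
dimm region of 11G at 0x100000000, this yields two node 0 entries ending at
0x7fffffff and 0x3bfffffff, matching the two "ACPI: SRAT: Node 0 PXM 0" lines
in log 2.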

Maybe I am missing something here, or there are other boot scenarios where this is not the case.

Please let me know your thoughts.

Thanks,
Shameer

> Thanks
> 
> Eric
> >> +
> >>      }
> >>
> >>      build_header(linker, table_data, (void *)(table_data->data + srat_start),
> >> --
> >> 2.7.4
> >>
> >>
> >>
> >
