From: Auger Eric <eric.auger@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, peter.maydell@linaro.org,
	shameerali.kolothum.thodi@huawei.com, david@redhat.com,
	wei@redhat.com, drjones@redhat.com,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	agraf@suse.de, Laszlo Ersek <lersek@redhat.com>,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [RFC v3 00/15] ARM virt: PCDIMM/NVDIMM at 2TB
Date: Thu, 4 Oct 2018 13:32:26 +0200
Message-ID: <b673ca5c-f38e-b4a5-7abd-debfa8382e22@redhat.com>
In-Reply-To: <20181004131150.3de8174a@redhat.com>

Hi Igor,

On 10/4/18 1:11 PM, Igor Mammedov wrote:
> On Wed, 3 Oct 2018 15:49:03 +0200
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi,
>>
>> On 7/3/18 9:19 AM, Eric Auger wrote:
>>> This series aims at supporting PCDIMM/NVDIMM instantiation in
>>> machvirt at 2TB guest physical address.
>>>
>>> This is achieved in 3 steps:
>>> 1) support more than 40b IPA/GPA
>>> 2) support PCDIMM instantiation
>>> 3) support NVDIMM instantiation  
>>
>> While respinning this series I have some general questions that come up
>> when thinking about extending the RAM on mach-virt:
>>
>> At the moment mach-virt offers 255GB max initial RAM starting at 1GB
>> ("-m " option).
>>
>> This series does not touch this initial RAM and only aims to add
>> device memory (usable for PCDIMM, NVDIMM, virtio-mem, virtio-pmem) in
>> the 3.1 machine, located at 2TB. The 3.0 address map top is currently at
>> 1TB (legacy aarch32 LPAE limit), so it would leave 1TB for IO or PCI. Is
>> it OK?
>>
>> - Putting device memory at 2TB means only ARMv8/aarch64 would
>> benefit from it. Is it an issue? I.e. no device memory for ARMv7 or
>> ARMv8/aarch32. Do we need to put effort into supporting more memory and
>> memory devices for those configs? There is less than 256GB free in the
>> existing 1TB mach-virt memory map anyway.
>>
>> - Is it OK to rely only on device memory to extend the existing 255 GB
>> RAM or would we need additional initial memory? Device memory usage
>> induces a more complex command line, so this puts a constraint on upper
>> layers. Is it acceptable though?
>>
>> - I revisited the series so that the max IPA size shift would get
>> automatically computed according to the top address reached by the
>> device memory, ie. 2TB + (maxram_size - ramsize). So we would not need
>> any additional kvm-type or explicit vm-phys-shift option to select the
>> correct max IPA shift (or any CPU phys-bits as suggested by Dave). This
>> also assumes we don't put anything beyond the device memory. Is it OK?
>>
>> - Igor told me he was concerned about the split-memory RAM model as it
>> caused a lot of trouble regarding compat/migration on PC machine. After
>> having studied the pc machine code I now wonder if we can compare the PC
>> compat issues with the ones we could encounter on ARM with the proposed
>> split memory model.
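
The automatic max IPA shift computation described in the list above (top address = 2TB + (maxram_size - ramsize)) can be sketched as follows. This is only an illustration, not the series' code: the function name, the constants and the choice to fall back to 40 bits when no device memory is requested are assumptions.

```python
GiB = 1 << 30
TiB = 1 << 40

DEVICE_MEMORY_BASE = 2 * TiB  # device memory base proposed in this thread
LEGACY_IPA_BITS = 40          # legacy limit quoted in the cover letter
MAX_IPA_BITS = 52             # upper bound quoted in the cover letter


def required_ipa_bits(ram_size, maxram_size):
    """Return the smallest IPA width (in bits) covering the device memory.

    Hypothetical helper: assumes nothing is mapped beyond the device
    memory region, as stated in the mail.
    """
    if maxram_size < ram_size:
        raise ValueError("maxram_size must be >= ram_size")
    device_memory_size = maxram_size - ram_size
    if device_memory_size == 0:
        # Assumption: without device memory the legacy map is enough.
        return LEGACY_IPA_BITS
    top = DEVICE_MEMORY_BASE + device_memory_size  # exclusive top address
    bits = max((top - 1).bit_length(), LEGACY_IPA_BITS)
    if bits > MAX_IPA_BITS:
        raise ValueError("device memory does not fit in a 52-bit IPA space")
    return bits


print(required_ipa_bits(4 * GiB, 8 * GiB))            # 42
print(required_ipa_bits(4 * GiB, 4 * GiB + 2 * TiB))  # 42
```

With the base at 2TB, any non-empty device memory region up to 2TB in size needs 42 bits, which matches the "at least 42b" statement in the cover letter.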
> that's not the only issue.
> 
> For example, since initial memory isn't modeled as a device
> (i.e. it's just a plain memory region), there is a bunch of numa
> code to deal with it. If initial memory were replaced by pc-dimm,
> we would drop some of it, and if we deprecated the old '-numa mem' we
> should be able to drop most of it (the newer '-numa memdev' maps
> directly into the pc-dimm model).
see my comment below.
> 
>  
>> On PC there are many knobs to tune the RAM layout
>> - max_ram_below_4g option tunes how much RAM we want below 4G
>> - gigabyte_align to force 3GB versus 3.5GB lowmem limit if ram_size >
>> max_ram_below_4g
>> - plus the usual ram_size which affects the rest of the initial ram
>> - plus the maxram_size, slots which affect the size of the device memory
>> - the device memory is just behind the initial RAM, aligned to 1GB
>>
>> Note the initial RAM and the device memory may be disjoint due to
>> misalignment of the initial ram size against 1GB
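
To illustrate the last two points of the PC description (device memory placed just behind the initial RAM, aligned to 1GB, hence a possible gap), here is a minimal sketch. It is a simplification and not the pc machine code: the helper names are hypothetical and the lowmem split knobs are ignored.

```python
MiB = 1 << 20
GiB = 1 << 30


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def device_memory_base(initial_ram_end):
    """Hypothetical helper: device memory starts at the first 1GB
    boundary at or after the end of the initial RAM."""
    return align_up(initial_ram_end, GiB)


ram_end = 4 * GiB + 512 * MiB          # initial RAM end, not 1GB aligned
base = device_memory_base(ram_end)
print(hex(base), (base - ram_end) // MiB)  # 0x140000000 512
```

When the end of the initial RAM is already 1GB aligned, the gap is zero and both regions are contiguous.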
>>
>> On ARM, we would have 3.0 virt machine supporting only initial RAM from
>> 1GB to 256 GB. 3.1 (or beyond ;-)) virt machine would support the same
>> initial RAM + device memory from 2TB to 4TB.
>>
>> With that memory split and the different machine type, I don't see any
>> major hurdle with respect to migration. Do I miss something?
> Later on, someone who needs to punch holes in the fixed initial RAM/device
> memory starts making it complex.
Support of host reserved regions is not acked yet but that's a valid
argument.
> 
>> An alternative to the split model is a floating RAM base for a
>> contiguous initial + device memory (contiguity actually depends on
>> initial RAM size alignment too). This requires significant changes in FW
>> and also potentially impacts the legacy virt address map as we need to
>> pass the RAM floating base address in some way (using an SRAM at 1GB or
>> using fw_cfg). Is it worth the effort? Also, Peter/Laszlo mentioned their
>> reluctance to move the RAM earlier
> Drew is working on it, let's see the outcome first.
> 
> We actually may try to implement a single region that uses pc-dimm for
> all memory (including initial) and still be compatible with the legacy
> layout, as long as legacy mode sticks to the current RAM limit and the
> device memory region is put at the current RAM base.
> When a flexible RAM base is available, we will move that region to
> a non-legacy layout at 2TB (or wherever).

Oh, I did not understand you also wanted to replace the initial memory by
device memory. So we would switch from a pure static initial RAM setup
to a pure dynamic device memory setup. That looks like quite a drastic
change to me. As mentioned, I am concerned about complicating the QEMU
command line, and I have asked the libvirt folks about the pain this
would induce.

Thank you for your feedback

Eric


> 
>> (https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg03172.html).
>>
>> Your feedback on those points is really welcome!
>>
>> Thanks
>>
>> Eric
>>
>>>
>>> This series reuses/rebases patches initially submitted by Shameer in [1]
>>> and Kwangwoo in [2].
>>>
>>> I put all the parts together for consistency and due to dependencies;
>>> however, as soon as the kernel dependency is resolved we can consider
>>> upstreaming them separately.
>>>
>>> Support more than 40b IPA/GPA [ patches 1 - 5 ]
>>> -----------------------------------------------
>>> was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
>>>
>>> At the moment the guest physical address space is limited to 40b
>>> due to KVM limitations. [0] lifts this limitation and allows
>>> creating a VM with up to 52b GPA address space.
>>>
>>> With this series, QEMU creates a virt VM with the max IPA range
>>> reported by the host kernel or 40b by default.
>>>
>>> This choice can be overridden by using the -machine kvm-type=<bits>
>>> option with bits within [40, 52]. If <bits> are not supported by
>>> the host, the legacy 40b value is used.
>>>
>>> Currently the EDK2 FW also hardcodes the max number of GPA bits to
>>> 40. This will need to be fixed.
>>>
>>> PCDIMM Support [ patches 6 - 11 ]
>>> ---------------------------------
>>> was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"
>>>
>>> We instantiate the device_memory at 2TB. Using it obviously requires
>>> at least 42b of IPA/GPA. While its max capacity is currently limited
>>> to 2TB, the actual size depends on the initial guest RAM size and
>>> maxmem parameter.
>>>
>>> Actual hot-plug and hot-unplug of PC-DIMM is not supported due to lack
>>> of support for those features on bare metal.
>>>
>>> NVDIMM support [ patches 12 - 15 ]
>>> ----------------------------------
>>>
>>> Once the memory hotplug framework is in place, it is fairly
>>> straightforward to add support for NVDIMM. The machine "nvdimm" option
>>> turns the capability on.
>>>
>>> Best Regards
>>>
>>> Eric
>>>
>>> References:
>>>
>>> [0] [PATCH v3 00/20] arm64: Dynamic & 52bit IPA support
>>> https://www.spinics.net/lists/kernel/msg2841735.html
>>>
>>> [1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
>>> http://patchwork.ozlabs.org/cover/914694/
>>>
>>> [2] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform
>>> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html
>>>
>>> Tests:
>>> - On Cavium Gigabyte, a 48b VM was created.
>>> - Migration tests were performed between a kernel supporting the
>>>   feature and a destination kernel not supporting it
>>> - test with ACPI: to overcome the limitation of EDK2 FW, virt
>>>   memory map was hacked to move the device memory below 1TB.
>>>
>>> This series can be found at:
>>> https://github.com/eauger/qemu/tree/v2.12.0-dimm-2tb-v3
>>>
>>> History:
>>>
>>> v2 -> v3:
>>> - fix pc_q35 and pc_piix compilation error
>>> - Kwangwoo's email address is no longer valid, so it was removed
>>>
>>> v1 -> v2:
>>> - kvm_get_max_vm_phys_shift moved in arch specific file
>>> - addition of NVDIMM part
>>> - single series
>>> - rebase on David's refactoring
>>>
>>> v1:
>>> - was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
>>> - was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"
>>>
>>> Best Regards
>>>
>>> Eric
>>>
>>>
>>> Eric Auger (9):
>>>   linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
>>>   hw/boards: Add a MachineState parameter to kvm_type callback
>>>   kvm: add kvm_arm_get_max_vm_phys_shift
>>>   hw/arm/virt: support kvm_type property
>>>   hw/arm/virt: handle max_vm_phys_shift conflicts on migration
>>>   hw/arm/virt: Allocate device_memory
>>>   acpi: move build_srat_hotpluggable_memory to generic ACPI source
>>>   hw/arm/boot: Expose the pmem nodes in the DT
>>>   hw/arm/virt: Add nvdimm and nvdimm-persistence options
>>>
>>> Kwangwoo Lee (2):
>>>   nvdimm: use configurable ACPI IO base and size
>>>   hw/arm/virt: Add nvdimm hot-plug infrastructure
>>>
>>> Shameer Kolothum (4):
>>>   hw/arm/virt: Add memory hotplug framework
>>>   hw/arm/boot: introduce fdt_add_memory_node helper
>>>   hw/arm/boot: Expose the PC-DIMM nodes in the DT
>>>   hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
>>>
>>>  accel/kvm/kvm-all.c                            |   2 +-
>>>  default-configs/arm-softmmu.mak                |   4 +
>>>  hw/acpi/aml-build.c                            |  51 ++++
>>>  hw/acpi/nvdimm.c                               |  28 ++-
>>>  hw/arm/boot.c                                  | 123 +++++++--
>>>  hw/arm/virt-acpi-build.c                       |  10 +
>>>  hw/arm/virt.c                                  | 330 ++++++++++++++++++++++---
>>>  hw/i386/acpi-build.c                           |  49 ----
>>>  hw/i386/pc_piix.c                              |   8 +-
>>>  hw/i386/pc_q35.c                               |   8 +-
>>>  hw/ppc/mac_newworld.c                          |   2 +-
>>>  hw/ppc/mac_oldworld.c                          |   2 +-
>>>  hw/ppc/spapr.c                                 |   2 +-
>>>  include/hw/acpi/aml-build.h                    |   3 +
>>>  include/hw/arm/arm.h                           |   2 +
>>>  include/hw/arm/virt.h                          |   7 +
>>>  include/hw/boards.h                            |   2 +-
>>>  include/hw/mem/nvdimm.h                        |  12 +
>>>  include/standard-headers/linux/virtio_config.h |  16 +-
>>>  linux-headers/asm-mips/unistd.h                |  18 +-
>>>  linux-headers/asm-powerpc/kvm.h                |   1 +
>>>  linux-headers/linux/kvm.h                      |  16 ++
>>>  target/arm/kvm.c                               |   9 +
>>>  target/arm/kvm_arm.h                           |  16 ++
>>>  24 files changed, 597 insertions(+), 124 deletions(-)
>>>   
>>
> 
