From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51491) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fb2wf-0001mw-9h for qemu-devel@nongnu.org; Thu, 05 Jul 2018 08:01:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fb2wY-0006Oz-UZ for qemu-devel@nongnu.org; Thu, 05 Jul 2018 08:01:09 -0400 References: <1530602398-16127-1-git-send-email-eric.auger@redhat.com> <1530602398-16127-7-git-send-email-eric.auger@redhat.com> <43c1349e-1ca6-4890-07c0-7bfa35ab914d@redhat.com> <5311fed5-7f13-a177-b967-db6e3ed028b9@redhat.com> <405e3f2b-3044-d7fc-8df4-b07a8487470f@redhat.com> From: Auger Eric Message-ID: Date: Thu, 5 Jul 2018 14:00:53 +0200 MIME-Version: 1.0 In-Reply-To: <405e3f2b-3044-d7fc-8df4-b07a8487470f@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC v3 06/15] hw/arm/virt: Allocate device_memory List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: David Hildenbrand , eric.auger.pro@gmail.com, qemu-devel@nongnu.org, qemu-arm@nongnu.org, peter.maydell@linaro.org, shameerali.kolothum.thodi@huawei.com, imammedo@redhat.com Cc: wei@redhat.com, drjones@redhat.com, david@gibson.dropbear.id.au, dgilbert@redhat.com, agraf@suse.de Hi David, On 07/05/2018 01:54 PM, David Hildenbrand wrote: > On 05.07.2018 13:42, Auger Eric wrote: >> Hi David, >> >> On 07/04/2018 02:05 PM, David Hildenbrand wrote: >>> On 03.07.2018 21:27, Auger Eric wrote: >>>> Hi David, >>>> On 07/03/2018 08:25 PM, David Hildenbrand wrote: >>>>> On 03.07.2018 09:19, Eric Auger wrote: >>>>>> We define a new hotpluggable RAM region (aka. device memory). >>>>>> Its base is 2TB GPA. This obviously requires 42b IPA support >>>>>> in KVM/ARM, FW and guest kernel. At the moment the device >>>>>> memory region is max 2TB. >>>>> >>>>> Maybe a stupid question, but why exactly does it have to start at 2TB >>>>> (and not e.g. at 1TB)? >>>> not a stupid question. See tentative answer below. >>>>> >>>>>> >>>>>> This is largely inspired of device memory initialization in >>>>>> pc machine code. >>>>>> >>>>>> Signed-off-by: Eric Auger >>>>>> Signed-off-by: Kwangwoo Lee >>>>>> --- >>>>>> hw/arm/virt.c | 104 ++++++++++++++++++++++++++++++++++++-------------- >>>>>> include/hw/arm/arm.h | 2 + >>>>>> include/hw/arm/virt.h | 1 + >>>>>> 3 files changed, 79 insertions(+), 28 deletions(-) >>>>>> >>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c >>>>>> index 5a4d0bf..6fefb78 100644 >>>>>> --- a/hw/arm/virt.c >>>>>> +++ b/hw/arm/virt.c >>>>>> @@ -59,6 +59,7 @@ >>>>>> #include "qapi/visitor.h" >>>>>> #include "standard-headers/linux/input.h" >>>>>> #include "hw/arm/smmuv3.h" >>>>>> +#include "hw/acpi/acpi.h" >>>>>> >>>>>> #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \ >>>>>> static void virt_##major##_##minor##_class_init(ObjectClass *oc, \ >>>>>> @@ -94,34 +95,25 @@ >>>>>> >>>>>> #define PLATFORM_BUS_NUM_IRQS 64 >>>>>> >>>>>> -/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means >>>>>> - * RAM can go up to the 256GB mark, leaving 256GB of the physical >>>>>> - * address space unallocated and free for future use between 256G and 512G. >>>>>> - * If we need to provide more RAM to VMs in the future then we need to: >>>>>> - * * allocate a second bank of RAM starting at 2TB and working up >>>> I acknowledge this comment was the main justification. Now if you look at >>>> >>>> Principles of ARM Memory Maps >>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0001c/DEN0001C_principles_of_arm_memory_maps.pdf >>>> chapter 2.3 you will find that when adding PA bits, you always leave >>>> space for reserved space and mapped IO. >>> >>> Thanks for the pointer! >>> >>> So ... we can fit >>> >>> a) 2GB at 2GB >>> b) 32GB at 32GB >>> c) 512GB at 512GB >>> d) 8TB at 8TB >>> e) 128TB at 128TB >>> >>> (this is a nice rule of thumb if I understand it correctly :) ) >>> >>> We should strive for device memory (maxram_size - ram_size) to fit >>> exactly into one of these slots (otherwise things get nasty). >>> >>> Depending on the ram_size, we might have simpler setups and can support >>> more configurations, no? >>> >>> E.g. ram_size <= 34GB, device_memory <= 512GB >>> -> move ram into a) and b) >>> -> move device memory into c) >> >> The issue is machvirt doesn't comply with that document. >> At the moment we have >> 0 -> 1GB MMIO >> 1GB -> 256GB RAM >> 256GB -> 512GB is theoretically reserved for IO but most is free. >> 512GB -> 1T is reserved for ECAM MMIO range. This is the top of our >> existing 40b GPA space. >> >> We don't want to change this address map due to legacy reasons. >> > > Thanks, good to know! > >> Another question! do you know if it would be possible to have >> device_memory region split into several discontinuous segments? > > It can be implemented for sure, but I would try to avoid that, as it > makes certain configurations impossible (and very end user unfriendly). > > E.g. (numbers completely made up, but it should show what I mean) > > -m 20G,maxmem=120G: > -> Try to add a DIMM with 100G -> error. > -> But we can add e.g. two DIMMs with 40G and 60G. > > This exposes internal details to the end user. And the end user has no > idea what is going on. > > So I think we should try our best to keep that area consecutive. Actually I didn't sufficiently detail the context. I would like 1) 1 segment to be exposed to the end-user through slot|maxmem stuff (what this series targets) and 2) another segment used to instantiate PC-DIMM for internal needs as replacement of part of the 1GB -> 256GB static RAM. This was the purpose of Shameer's original series [1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions http://patchwork.ozlabs.org/cover/914694/ This approach is not yet validated though. The rationale is sometimes you must have "holes" in RAM as some GPAs match reserved IOVAs for assigned devices. Thanks Eric > >> >> Thanks >> >> Eric > >