From: Auger Eric <eric.auger@redhat.com>
To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Laszlo Ersek <lersek@redhat.com>,
Igor Mammedov <imammedo@redhat.com>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>,
"xuwei (O)" <xuwei5@huawei.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Catalin Marinas <Catalin.Marinas@arm.com>,
"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
"will.deacon@arm.com" <will.deacon@arm.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Linuxarm <linuxarm@huawei.com>, linux-mm <linux-mm@kvack.org>,
"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Robin Murphy <robin.murphy@arm.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [Qemu-devel] [Question] Memory hotplug clarification for Qemu ARM/virt
Date: Fri, 10 May 2019 11:15:33 +0200 [thread overview]
Message-ID: <499f2bc5-da85-72b2-4f7b-32f2d25d842b@redhat.com> (raw)
In-Reply-To: <5FC3163CFD30C246ABAA99954A238FA83F1DDFE5@lhreml524-mbs.china.huawei.com>
Hi Shameer,
On 5/10/19 10:34 AM, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: 09 May 2019 22:48
>> To: Igor Mammedov <imammedo@redhat.com>
>> Cc: Robin Murphy <robin.murphy@arm.com>; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; will.deacon@arm.com; Catalin
>> Marinas <Catalin.Marinas@arm.com>; Anshuman Khandual
>> <anshuman.khandual@arm.com>; linux-arm-kernel@lists.infradead.org;
>> linux-mm <linux-mm@kvack.org>; qemu-devel@nongnu.org;
>> qemu-arm@nongnu.org; eric.auger@redhat.com; peter.maydell@linaro.org;
>> Linuxarm <linuxarm@huawei.com>; ard.biesheuvel@linaro.org; Jonathan
>> Cameron <jonathan.cameron@huawei.com>; xuwei (O) <xuwei5@huawei.com>
>> Subject: Re: [Question] Memory hotplug clarification for Qemu ARM/virt
>>
>> On 05/09/19 18:35, Igor Mammedov wrote:
>>> On Wed, 8 May 2019 22:26:12 +0200
>>> Laszlo Ersek <lersek@redhat.com> wrote:
>>>
>>>> On 05/08/19 14:50, Robin Murphy wrote:
>>>>> Hi Shameer,
>>>>>
>>>>> On 08/05/2019 11:15, Shameerali Kolothum Thodi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> This series here[0] attempts to add support for PCDIMM in QEMU for
>>>>>> ARM/Virt platform and has stumbled upon an issue as it is not clear(at
>>>>>> least
>>>>>> from Qemu/EDK2 point of view) how in physical world the hotpluggable
>>>>>> memory is handled by kernel.
>>>>>>
>>>>>> The proposed implementation in Qemu, builds the SRAT and DSDT parts
>>>>>> and uses GED device to trigger the hotplug. This works fine.
>>>>>>
>>>>>> But when we added the DT node corresponding to the PCDIMM(cold plug
>>>>>> scenario), we noticed that Guest kernel see this memory during early
>> boot
>>>>>> even if we are booting with ACPI. Because of this, hotpluggable memory
>>>>>> may end up in zone normal and make it non-hot-un-pluggable even if
>> Guest
>>>>>> boots with ACPI.
>>>>>>
>>>>>> Further discussions[1] revealed that, EDK2 UEFI has no means to
>>>>>> interpret the
>>>>>> ACPI content from Qemu(this is designed to do so) and uses DT info to
>>>>>> build the GetMemoryMap(). To solve this, introduced "hotpluggable"
>>>>>> property
>>>>>> to DT memory node(patches #7 & #8 from [0]) so that UEFI can
>>>>>> differentiate
>>>>>> the nodes and exclude the hotpluggable ones from GetMemoryMap().
>>>>>>
>>>>>> But then Laszlo rightly pointed out that in order to accommodate the
>>>>>> changes
>>>>>> into UEFI we need to know how exactly Linux expects/handles all the
>>>>>> hotpluggable memory scenarios. Please find the discussion here[2].
>>>>>>
>>>>>> For ease, I am just copying the relevant comment from Laszlo below,
>>>>>>
>>>>>> /******
>>>>>> "Given patches #7 and #8, as I understand them, the firmware cannot
>>>>>> distinguish
>>>>>> hotpluggable & present, from hotpluggable & absent. The firmware
>> can
>>>>>> only
>>>>>> skip both hotpluggable cases. That's fine in that the firmware will
>>>>>> hog neither
>>>>>> type -- but is that OK for the OS as well, for both ACPI boot and DT
>>>>>> boot?
>>>>>>
>>>>>> Consider in particular the "hotpluggable & present, ACPI boot" case.
>>>>>> Assuming
>>>>>> we modify the firmware to skip "hotpluggable" altogether, the UEFI
>> memmap
>>>>>> will not include the range despite it being present at boot.
>>>>>> Presumably, ACPI
>>>>>> will refer to the range somehow, however. Will that not confuse the OS?
>>>>>>
>>>>>> When Igor raised this earlier, I suggested that
>>>>>> hotpluggable-and-present should
>>>>>> be added by the firmware, but also allocated immediately, as
>>>>>> EfiBootServicesData
>>>>>> type memory. This will prevent other drivers in the firmware from
>>>>>> allocating AcpiNVS
>>>>>> or Reserved chunks from the same memory range, the UEFI memmap will
>>>>>> contain
>>>>>> the range as EfiBootServicesData, and then the OS can release that
>>>>>> allocation in
>>>>>> one go early during boot.
>>>>>>
>>>>>> But this really has to be clarified from the Linux kernel's
>>>>>> expectations. Please
>>>>>> formalize all of the following cases:
>>>>>>
>>>>>> OS boot (DT/ACPI) hotpluggable & ... GetMemoryMap() should report
>>>>>> as DT/ACPI should report as
>>>>>> ----------------- ------------------
>>>>>> ------------------------------- ------------------------
>>>>>>
>> DT present ?
>> ?
>>>>>>
>> DT absent ?
>> ?
>>>>>>
>> ACPI present ?
>> ?
>>>>>>
>> ACPI absent ?
>> ?
>>>>>>
>>>>>> Again, this table is dictated by Linux."
>>>>>>
>>>>>> ******/
>>>>>>
>>>>>> Could you please take a look at this and let us know what is expected
>>>>>> here from
>>>>>> a Linux kernel view point.
>>>>>
>>>>> For arm64, so far we've not even been considering DT-based hotplug - as
>>>>> far as I'm aware there would still be a big open question there around
>>>>> notification mechanisms and how to describe them. The DT stuff so far
>>>>> has come from the PowerPC folks, so it's probably worth seeing what
>>>>> their ideas are.
>>>>>
>>>>> ACPI-wise I've always assumed/hoped that hotplug-related things should
>>>>> be sufficiently well-specified in UEFI that "do whatever x86/IA-64 do"
>>>>> would be enough for us.
>>>>
>>>> As far as I can see in UEFI v2.8 -- and I had checked the spec before
>>>> dumping the table with the many question marks on Shameer --, all the
>>>> hot-plug language in the spec refers to USB and PCI hot-plug in the
>>>> preboot environment. There is not a single word about hot-plug at OS
>>>> runtime (regarding any device or component type), nor about memory
>>>> hot-plug (at any time).
>>>>
>>>> Looking to x86 appears valid -- so what does the Linux kernel expect on
>>>> that architecture, in the "ACPI" rows of the table?
>>>
>>> I could only answer from QEMU x86 perspective.
>>> QEMU for x86 guests currently doesn't add hot-pluggable RAM into E820
>>> because of different linux guests tend to cannibalize it, making it non
>>> unpluggable. The last culprit I recall was KASLR.
>>>
>>> So I'd refrain from reporting hotpluggable RAM in GetMemoryMap() if
>>> it's possible (it's probably hack (spec deosn't say anything about it)
>>> but it mostly works for Linux (plug/unplug) and Windows guest also
>>> fine with plug part (no unplug there)).
>>
>> I can accept this as a perfectly valid design. Which would mean, QEMU should
>> mark each hotpluggable RAM range in the DTB for the firmware with the
>> special new property, regardless of its initial ("cold") plugged-ness, and then
>> the firmware will not expose the range in the GCD memory space map, and
>> consequently in the UEFI memmap either.
>>
>> IOW, our table is, thus far:
>>
>> OS boot (DT/ACPI) hotpluggable & ... GetMemoryMap() should report as
>> DT/ACPI should report as
>> ----------------- ------------------ ------------------------------- ------------------------
>> DT present
>> ABSENT ?
>> DT absent
>> ABSENT ?
>> ACPI present ABSENT
>> PRESENT
>> ACPI absent ABSENT
>> ABSENT
>> In the firmware, I only need to care about the GetMemoryMap() column, so I
>> can work with this.
>
> Thank you all for the inputs.
>
> I assume we will still report the DT cold plug case to kernel(hotpluggable & present).
> so the table will be something like this,
>
> OS boot (DT/ACPI) hotpluggable & ... GetMemoryMap() should report as DT/ACPI should report as
> ----------------- ------------------ ------------------------------- ------------------------
> DT present ABSENT PRESENT
> DT absent ABSENT ABSENT
With DT boot, how does the OS get to know if thehotpluggable memory is
present or absent? Or maybe I misunderstand the last column.
Thanks
Eric
> ACPI present ABSENT PRESENT
> ACPI absent ABSENT ABSENT
>
>
> Can someone please file a feature request at
>> <https://bugzilla.tianocore.org/>, for the ArmVirtPkg Package, with these
>> detais?
>
> Ok. I will do that.
>
> Thanks,
> Shameer
>
>> Thanks
>> Laszlo
>>
>>>
>>> As for physical systems, there are out there ones that do report
>>> hotpluggable RAM in GetMemoryMap().
>>>
>>>> Shameer: if you (Huawei) are represented on the USWG / ASWG, I suggest
>>>> re-raising the question on those lists too; at least the "ACPI" rows of
>>>> the table.
>>>>
>>>> Thanks!
>>>> Laszlo
>>>>
>>>>>
>>>>> Robin.
>>>>>
>>>>>> (Hi Laszlo/Igor/Eric, please feel free to add/change if I have missed
>>>>>> any valid
>>>>>> points above).
>>>>>>
>>>>>> Thanks,
>>>>>> Shameer
>>>>>> [0] https://patchwork.kernel.org/cover/10890919/
>>>>>> [1] https://patchwork.kernel.org/patch/10863299/
>>>>>> [2] https://patchwork.kernel.org/patch/10890937/
>>>>>>
>>>>>>
>>>>
>>>
>
next prev parent reply other threads:[~2019-05-10 9:15 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-08 10:15 [Question] Memory hotplug clarification for Qemu ARM/virt Shameerali Kolothum Thodi
2019-05-08 12:50 ` Robin Murphy
2019-05-08 20:26 ` Laszlo Ersek
2019-05-09 16:35 ` Igor Mammedov
2019-05-09 21:48 ` Laszlo Ersek
2019-05-10 8:34 ` Shameerali Kolothum Thodi
2019-05-10 9:15 ` Auger Eric [this message]
2019-05-10 9:27 ` [Qemu-devel] " Shameerali Kolothum Thodi
2019-05-10 9:58 ` Auger Eric
2019-05-10 15:05 ` Igor Mammedov
2019-05-08 20:08 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=499f2bc5-da85-72b2-4f7b-32f2d25d842b@redhat.com \
--to=eric.auger@redhat.com \
--cc=Catalin.Marinas@arm.com \
--cc=anshuman.khandual@arm.com \
--cc=ard.biesheuvel@linaro.org \
--cc=imammedo@redhat.com \
--cc=jonathan.cameron@huawei.com \
--cc=lersek@redhat.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=linuxarm@huawei.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=robin.murphy@arm.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=will.deacon@arm.com \
--cc=xuwei5@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).