linux-mm.kvack.org archive mirror
From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: "robin.murphy@arm.com" <robin.murphy@arm.com>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	"Anshuman Khandual" <anshuman.khandual@arm.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	linux-mm <linux-mm@kvack.org>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	Laszlo Ersek <lersek@redhat.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	Linuxarm <linuxarm@huawei.com>,
	"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"xuwei (O)" <xuwei5@huawei.com>
Subject: [Question] Memory hotplug clarification for Qemu ARM/virt
Date: Wed, 8 May 2019 10:15:50 +0000
Message-ID: <5FC3163CFD30C246ABAA99954A238FA83F1B6A66@lhreml524-mbs.china.huawei.com>

Hi,

The series here[0] attempts to add PCDIMM support to QEMU for the ARM/virt
platform. It has stumbled upon an issue: it is not clear (at least from the
QEMU/EDK2 point of view) how hotpluggable memory is handled by the kernel
in the physical world.

The proposed implementation in QEMU builds the SRAT and DSDT parts
and uses a GED device to trigger the hotplug. This works fine.

But when we added the DT node corresponding to the PCDIMM (cold plug
scenario), we noticed that the guest kernel sees this memory during early
boot even if we are booting with ACPI. Because of this, hotpluggable memory
may end up in ZONE_NORMAL, which makes it impossible to hot-unplug even
though the guest boots with ACPI.
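
One way to observe this from inside the guest is to look at the memory
block attributes in sysfs. The following is a minimal Python sketch, assuming
a kernel built with memory hotplug support that exposes the "state" and
"valid_zones" attributes under /sys/devices/system/memory/memoryN. The
function names are hypothetical and only for illustration. A block that
reports "Normal" rather than "Movable" may hold unmovable kernel
allocations, so offlining it is not guaranteed to succeed.

```python
import os
from typing import Dict, Optional


def read_attr(path: str) -> Optional[str]:
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def memory_block_zones(
    root: str = "/sys/devices/system/memory",
) -> Dict[str, Dict[str, Optional[str]]]:
    """Map each memoryN block under root to its 'state' and 'valid_zones'."""
    if not os.path.isdir(root):
        return {}
    prefix = "memory"
    # Block directories are named memory<N>; skip entries such as block_size_bytes.
    names = [
        n for n in os.listdir(root)
        if n.startswith(prefix) and n[len(prefix):].isdigit()
    ]
    names.sort(key=lambda n: int(n[len(prefix):]))  # numeric order, not lexical
    blocks = {}
    for name in names:
        block_dir = os.path.join(root, name)
        blocks[name] = {
            "state": read_attr(os.path.join(block_dir, "state")),
            "valid_zones": read_attr(os.path.join(block_dir, "valid_zones")),
        }
    return blocks


if __name__ == "__main__":
    for name, attrs in memory_block_zones().items():
        print(name, attrs["state"], attrs["valid_zones"])
```

Running this in the guest after a cold-plug boot should show which zone each
block was onlined into. The exact output depends on the guest kernel and its
configuration.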

Further discussions[1] revealed that EDK2 UEFI has no means to interpret the
ACPI content from QEMU (this is by design) and uses DT info to build the
GetMemoryMap() output. To solve this, we introduced a "hotpluggable" property
on the DT memory node (patches #7 & #8 from [0]) so that UEFI can
differentiate the nodes and exclude the hotpluggable ones from GetMemoryMap().
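
To make the idea concrete, here is a minimal Python sketch of the filtering
described above. It is not EDK2 or QEMU code: the MemoryNode type and the
build_memory_map function are hypothetical names used only for illustration,
and the addresses are made up. It also shows that a single boolean
"hotpluggable" flag carries no information about whether the DIMM is
actually present at boot.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MemoryNode:
    """Hypothetical model of a DT memory node (not a real EDK2 or QEMU type)."""
    base: int
    size: int
    hotpluggable: bool = False  # models the proposed "hotpluggable" DT property


def build_memory_map(nodes: List[MemoryNode]) -> List[Tuple[int, int]]:
    """Return the (base, size) ranges the firmware would report as RAM.

    Nodes flagged hotpluggable are skipped entirely, whether or not the
    DIMM behind them is plugged in at boot.
    """
    return [(n.base, n.size) for n in nodes if not n.hotpluggable]


# Illustrative layout only: 1 GiB of boot RAM plus one cold-plugged 1 GiB DIMM.
nodes = [
    MemoryNode(base=0x4000_0000, size=0x4000_0000),
    MemoryNode(base=0x8000_0000, size=0x4000_0000, hotpluggable=True),
]
memory_map = build_memory_map(nodes)
print([(hex(base), hex(size)) for base, size in memory_map])
```

With this filter, the cold-plugged range never reaches the UEFI memory map,
so the OS has to learn about it from ACPI or DT instead.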

But then Laszlo rightly pointed out that in order to accommodate the changes
in UEFI, we need to know exactly how Linux expects and handles all the
hotpluggable memory scenarios. Please find the discussion here[2].

For convenience, I am copying the relevant comment from Laszlo below:

/******
"Given patches #7 and #8, as I understand them, the firmware cannot distinguish
 hotpluggable & present, from hotpluggable & absent. The firmware can only
 skip both hotpluggable cases. That's fine in that the firmware will hog neither
 type -- but is that OK for the OS as well, for both ACPI boot and DT boot?

Consider in particular the "hotpluggable & present, ACPI boot" case. Assuming
we modify the firmware to skip "hotpluggable" altogether, the UEFI memmap
will not include the range despite it being present at boot. Presumably, ACPI
will refer to the range somehow, however. Will that not confuse the OS?

When Igor raised this earlier, I suggested that hotpluggable-and-present should
be added by the firmware, but also allocated immediately, as EfiBootServicesData
type memory. This will prevent other drivers in the firmware from allocating AcpiNVS
or Reserved chunks from the same memory range, the UEFI memmap will contain
the range as EfiBootServicesData, and then the OS can release that allocation in
one go early during boot.

But this really has to be clarified from the Linux kernel's expectations. Please
formalize all of the following cases:

OS boot (DT/ACPI)  hotpluggable & ...  GetMemoryMap() should report as  DT/ACPI should report as
-----------------  ------------------  -------------------------------  ------------------------
DT                 present             ?                                ?
DT                 absent              ?                                ?
ACPI               present             ?                                ?
ACPI               absent              ?                                ?

Again, this table is dictated by Linux."

******/

Could you please take a look at this and let us know what is expected here
from the Linux kernel's point of view?

(Hi Laszlo/Igor/Eric, please feel free to add/change if I have missed any valid
points above).

Thanks,
Shameer

[0] https://patchwork.kernel.org/cover/10890919/
[1] https://patchwork.kernel.org/patch/10863299/
[2] https://patchwork.kernel.org/patch/10890937/



Thread overview: 11+ messages
2019-05-08 10:15 Shameerali Kolothum Thodi [this message]
2019-05-08 12:50 ` [Question] Memory hotplug clarification for Qemu ARM/virt Robin Murphy
2019-05-08 20:26   ` Laszlo Ersek
2019-05-09 16:35     ` Igor Mammedov
2019-05-09 21:48       ` Laszlo Ersek
2019-05-10  8:34         ` Shameerali Kolothum Thodi
2019-05-10  9:15           ` [Qemu-devel] " Auger Eric
2019-05-10  9:27             ` Shameerali Kolothum Thodi
2019-05-10  9:58               ` Auger Eric
2019-05-10 15:05                 ` Igor Mammedov
2019-05-08 20:08 ` Laszlo Ersek
