From: Jonathan Cameron <email@example.com>
To: Dan Williams <firstname.lastname@example.org>
Cc: Linux Kernel Mailing List <email@example.com>,
	"Rafael J. Wysocki" <firstname.lastname@example.org>,
	Len Brown <email@example.com>,
	Keith Busch <firstname.lastname@example.org>,
	Vishal L Verma <email@example.com>,
	X86 ML <firstname.lastname@example.org>,
	Linux MM <email@example.com>,
	linux-nvdimm <firstname.lastname@example.org>,
	<email@example.com>
Subject: Re: [RFC PATCH 4/5] acpi/hmat: Register special purpose memory as a device
Date: Fri, 5 Apr 2019 18:39:45 +0100
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <CAPcyv4hxBFcJKbVVgNiE4UYXZS4XY9hfE8W9mN+VrcWS9AvJLw@mail.gmail.com>

On Fri, 5 Apr 2019 09:56:22 -0700
Dan Williams <email@example.com> wrote:

> On Fri, Apr 5, 2019 at 9:24 AM Jonathan Cameron
> <firstname.lastname@example.org> wrote:
> >
> > On Fri, 5 Apr 2019 08:43:03 -0700
> > Dan Williams <email@example.com> wrote:
> >
> > > On Fri, Apr 5, 2019 at 4:19 AM Jonathan Cameron
> > > <firstname.lastname@example.org> wrote:
> > > >
> > > > On Thu, 4 Apr 2019 12:08:49 -0700
> > > > Dan Williams <email@example.com> wrote:
> > > >
> > > > > Memory that has been tagged EFI_SPECIAL_PURPOSE, and has performance
> > > > > properties described by the ACPI HMAT is expected to have an application
> > > > > specific consumer.
> > > > >
> > > > > Those consumers may want 100% of the memory capacity to be reserved from
> > > > > any usage by the kernel. By default, with this enabling, a platform
> > > > > device is created to represent this differentiated resource.
> > > > >
> > > > > A follow on change arranges for device-dax to claim these devices by
> > > > > default and provide an mmap interface for the target application.
> > > > > However, if the administrator prefers that some or all of the special
> > > > > purpose memory is made available to the core-mm the device-dax hotplug
> > > > > facility can be used to online the memory with its own numa node.
> > > > >
> > > > > Cc: "Rafael J. Wysocki" <firstname.lastname@example.org>
> > > > > Cc: Len Brown <email@example.com>
> > > > > Cc: Keith Busch <firstname.lastname@example.org>
> > > > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > Signed-off-by: Dan Williams <email@example.com>
> > > >
> > > > Hi Dan,
> > > >
> > > > Great to see you getting this discussion going so fast and in
> > > > general the approach makes sense to me.
> > > >
> > > > I'm a little confused why HMAT has anything to do with this.
> > > > SPM is defined either via the attribute in SRAT SPA entries,
> > > > EFI_MEMORY_SP or via the EFI memory map.
> > > >
> > > > Whether it is in HMAT or not isn't all that relevant.
> > > > Back in the days of the reservation hint (so before yesterday :)
> > > > it was relevant obviously but that's no longer true.
> > > >
> > > > So what am I missing?
> > >
> > > It's a good question, and an assumption I should have explicitly
> > > declared in the changelog. The problem with EFI_MEMORY_SP is the same
> > > as the problem with the EfiPersistentMemory type, it isn't precise
> > > enough on its own for the kernel to delineate 'type' or
> > > device/replaceable-unit boundaries. For example, I expect one
> > > EFI_MEMORY_SP range of a specific type may be contiguous with another
> > > range of a different type. Similar to the NFIT there is no requirement
> > > in the specification that platform firmware inject multiple range
> > > entries. Instead that precision is left to the SRAT + HMAT, or the
> > > NFIT in the case of PMEM.
> >
> > Absolutely, as long as they are all SPM, they could be anywhere in
> > the system.
> >
> > >
> > > Conversely, and thinking through this a bit more, if a memory range is
> > > "special", but the platform fails to enumerate it in HMAT I think
> > > Linux should scream loudly that the firmware is broken and leave the
> > > range alone. The "scream loudly" piece is missing in the current set,
> > > but the "leave the range alone" functionality is included.
> >
> > I am certainly keen on screaming if the various entries are inconsistent
> > but am not sure they necessarily are here.
> >
> > So there are a couple of ways we could get an SPM range defined.
> > The key thing here is that firmware should be attempting to describe
> > what it has to some degree somewhere. If not it won't get a good
> > result ;) So if there is no SRAT then you are on your own. SCREAM!
> >
> > 1. Directly in the memory map. If there is no other information then
> >    tough luck the kernel can only sensibly handle it as one device.
> >    Or not at all, which seems like a reasonable decision to me.
> >    SCREAM
> >
> > 2. In memory map + a proximity domain entry in SRAT. Given memory
> >    with different characteristics should be in different proximity
> >    domains anyway - this should be fairly precise. The slight snag
> >    here is that the fine grained nature of SRAT is actually a side
> >    effect of HMAT, so not sure how well platforms have traditionally
> >    described their more subtle differences.
> >
> > 3. In NFIT as NFIT SPA carries the memory attribute. Not sure if
> >    we should scream if this disagrees with the memory map.
> >
> > 4. In HMAT? Now this changed in ACPI 6.3 to clean up the 'messy'
> >    prior relationship between it and SRAT. Now HMAT no longer has
> >    memory address ranges as you observed. That means, to describe
> >    properties of memory, it has to use the proximity domains of
> >    SRAT. It provides lots of additional info about those domains
> >    but it is SRAT that defines them.
> >
> > So I would argue that HMAT itself doesn't tell us anything useful.
> > SRAT certainly does though so I think this should be coming from
> > SRAT (or NFIT as that also defines the required precision)
>
> I agree, yes, SRAT by itself is sufficient for this "precision"
> concern. However, do we, core Linux developers, really want to
> encourage platform vendors that they can ignore deploying HMAT data
> and get Linux to honor that sub-case for EFI_MEMORY_SP? My personal
> experience is that platform firmware will take advantage of almost any
> opportunity to minimize the data it provides to the OS. The only hard
> lever Linux has to encourage platform firmware to give complete data
> is to decline to support configurations that have incomplete data.
>

If we decide as a community that this is the way we want to go, I'm
happy to politely point it out to our firmware people (who are a more
proactive group on detailed system descriptions than many!)

If we make this a clearly stated policy, perhaps via some comments in
the code or Documentation/, that would be even better and avoid people
taking the 'but you could support my firmware' line in the future.

I'll see if I can reach out to other OS vendors as well so we can
present a unified front on this (perhaps after a few days, just in case
we have any dissenting voices here!)

Thanks,

Jonathan
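
For readers following the thread, the policy under discussion can be
sketched roughly as below. This is only an illustration and is not code
from the patch set: the struct, the enum and spm_policy() are
hypothetical names, and the rule that both SRAT and HMAT must describe a
range before it is registered is the stricter option argued for above,
assumed here for the sake of the example. The NFIT case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what firmware reported for one memory range. */
struct spm_range_info {
	bool efi_memory_sp;	/* EFI memory map tags the range EFI_MEMORY_SP */
	bool in_srat;		/* SRAT assigns the range a proximity domain */
	bool in_hmat;		/* HMAT carries data for that proximity domain */
};

/* Hypothetical outcomes; none of these names exist in the kernel. */
enum spm_action {
	SPM_NOT_SPECIAL,	/* ordinary memory, handled as usual */
	SPM_LEAVE_ALONE,	/* warn about firmware, do not touch the range */
	SPM_REGISTER_DEVICE,	/* create a platform device for the range */
};

/*
 * Assumed policy: a special purpose range is only registered when
 * firmware describes it completely. Incomplete descriptions are
 * declined rather than guessed at.
 */
static enum spm_action spm_policy(const struct spm_range_info *info)
{
	assert(info != NULL);

	if (!info->efi_memory_sp)
		return SPM_NOT_SPECIAL;

	/* No SRAT or no HMAT data: "scream" and leave the range alone. */
	if (!info->in_srat || !info->in_hmat)
		return SPM_LEAVE_ALONE;

	return SPM_REGISTER_DEVICE;
}
```

Relaxing the in_hmat check would give the "SRAT alone is sufficient"
variant, which is the point the two positions in the thread differ on.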