RE: [ACPI Code First ECN v2]: Generic Port, performace data for hotplug memory

From: Vikram Sethi <vsethi@nvidia.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"Natu, Mahesh" <mahesh.natu@intel.com>,
	"Douglas, Chet R" <chet.r.douglas@intel.com>,
	"Verma, Vishal L" <vishal.l.verma@intel.com>,
	"Widawsky, Ben" <ben.widawsky@intel.com>,
	Samer El-Haj-Mahmoud <Samer.El-Haj-Mahmoud@arm.com>,
	Thanu Rangarajan <Thanu.Rangarajan@arm.com>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
	Mark Hairgrove <mhairgrove@nvidia.com>
Subject: RE: [ACPI Code First ECN v2]: Generic Port, performace data for hotplug memory
Date: Tue, 20 Apr 2021 16:53:36 +0000	[thread overview]
Message-ID: <BL0PR12MB25323D058297AD200277FBCFBD489@BL0PR12MB2532.namprd12.prod.outlook.com> (raw)
In-Reply-To: <20210420100342.00000253@Huawei.com>

> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Sent: Tuesday, April 20, 2021 4:04 AM
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Vikram Sethi <vsethi@nvidia.com>; linux-cxl@vger.kernel.org; linux-
> acpi@vger.kernel.org; Natu, Mahesh <mahesh.natu@intel.com>; Douglas, Chet R
> <chet.r.douglas@intel.com>; Verma, Vishal L <vishal.l.verma@intel.com>;
> Widawsky, Ben <ben.widawsky@intel.com>; Samer El-Haj-Mahmoud <Samer.El-
> Haj-Mahmoud@arm.com>; Thanu Rangarajan <Thanu.Rangarajan@arm.com>
> Subject: Re: [ACPI Code First ECN v2]: Generic Port, performace data for hotplug
> memory
> 
> External email: Use caution opening links or attachments
> 
> 
> On Mon, 19 Apr 2021 22:08:39 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > On Mon, Apr 19, 2021 at 9:22 PM Vikram Sethi <vsethi@nvidia.com> wrote:
> > >
> > > > From: Dan Williams <dan.j.williams@intel.com>
> > > > On Mon, Apr 19, 2021 at 3:56 PM Vikram Sethi <vsethi@nvidia.com> wrote:
> > > > [..]
> > > > > > * Replace all instances of "Initiator" with "Initiator / Port" in "Table
> > > > > >   5.59 Flags - Generic Initiator Affinity Structure", including the
> > > > > >   table name.
> > > > >
> > > > > I wanted to discuss the implications of a CXL host bridge implementation
> that
> > > > > does not set the "Architectural Transactions" bit/flag in the aforementioned
> > > > > Flags in Table 5.59.
> > > > >
> > > > > Since the kernel would be expecting all "System RAM" to have equivalent
> > > > > Functional properties, if HDM cannot have all the same functionality, then
> in
> > > > > the absence of ISA specific ACPI tables clarifying what architectural feature
> isn't
> > > > > supported, the kernel may be forced to not online the HDM memory as
> system
> > > > > RAM. If there is more granular expression of what features are lacking in a
> ISA
> > > > > Specific table (eg Memory Tagging/Memory Protection keys not
> supported),
> > > > > the kernel could choose to not enable that feature in all of system RAM (if
> > > > > discrepancy discovered at boot), but still online the HDM as System RAM.
> > > > >
> > > > > To that end, it may be useful to clarify this to host vendors by way of an
> > > > > Implementation note (ideally in the CXL specification). Something like:
> > > > > "CXL hosts are encouraged to support all architectural features in HDM
> > > > > as they do in CPU attached memory to avoid either the memory from
> > > > > being onlined as System RAM, or the architectural feature being disabled.
> > > > > Hosts must indicate architectural parity for HDM in ACPI SRAT
> > > > > “Generic Port” flags “Architectural transactions” bit by setting it to 1.
> > > > > A port that sets this bit to 0 will need ISA specific ways/ACPI tables to
> > > > > describe which specific ISA features would not work in HDM, so an OS
> > > > > could disable that memory or that feature."
> > > > >
> > > > > Thoughts?
> > > >
> > > > The problem, as you know, is that those features are already defined
> > > > without those "ISA specific ways / ACPI tables". I think it simply
> > > > must be the case that the only agent in the system that is aware of
> > > > the intersection of capabilities between ISA and CXL (platform
> > > > firmware) must mitigate the conflict. I.e. so that the CXL
> > > > infrastructure need not worry about ISA feature capability and vice
> > > > versa. To me, this looks like a platform firmware pre-boot
> > > > configuration menu / switch that turns off CXL (declines to publish
> > > > ACPI0016 devices) if incompatible ISA feature "X" is enabled, or the
> > > > reverse turns off ISA feature "X" if CXL is enabled. In other words,
> > > > the conflict needs to be resolved before the OS boots, setting this
> > > > bit to 0 is not a viable option for mitigating the conflict because
> > > > there is no requirement for the OS to even look at this flag.
> > >
> > > Leaving it to Firmware is easier for the OS, but could be a couple
> > > of issues with that:
> > > Platform firmware may not have a way of disabling ISA feature
> > > if it is directly visible to the OS via CPU ID registers, and the
> > > registers can't be trapped to some FW and values adjusted
> > > on access
> > > Platform Firmware may not know if the OS supports a specific
> > > Feature (code may not exist or not default option etc) so it
> > > may be premature/suboptimal to disable CXL hostbridge
> > > altogether. Although I suppose a UEFI variable type knob
> > > could be adjusted in this case and take effect on reboot.
> > >
> > > Also, for some *future* ISA features where it may be possible and
> > > practical to define ISA feature support discovery per NUMA
> > > node/address range w/ ACPI (prior to userspace ABI being set),
> > > the platform would want to enable the CXL host bridge and leave
> > > selective enablement of the feature to the OS. Yes, this is messy
> > > and best avoided, but it may make sense depending on ISA
> > > feature and how messy it makes user space. I'm personally
> > > not in favor of this latter option, but I'm told this was discussed
> > > in other Coherent interconnect forums and chosen as a path
> > > forward.
> >
> > I think it's reasonable for new stuff to define _OSC or other opt-in
> > requirements to allow the OS to manage ISA vs CXL conflict policy. For
> > existing conflicts the only reliable mechanism is decline to publish
> > ACPI0016 if platform firmware can enumerate an ISA feature that it is
> > not supported on CXL. So I think the proposal here is a recommendation
> > for platform firmware implementations that they are responsible for
> > this conflict resolution unless / until other mechanisms arrive.
> 
> Agreed with one addition.  It should be possible to retrofit negotiation
> for existing features as well. Default policy should be that it's firmware's
> problem but if the OS uses _OSC to negotiate something else then it may
> be possible to be more flexible. As long as the default is safe, relaxing
> that can happen once mechanisms are defined.  The actual decision on
> whether to enable ACPI0016 can for example be pushed into AML code.
> 

Relaxations could happen only when ABI to userspace isn't already set
and in use though, no?

What about Coherent device memory in type 2 devices? That is not EFI
conventional memory but once some part has been onlined by a driver by
calling something like add_memory_driver_managed, that part should 
have all the properties of System RAM as well. Not clear that Platform
firmware would know if a driver intents to online some or all of that
memory.