From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Subject: Re: [PATCH 2/7] node: Add heterogenous memory performance
Date: Mon, 26 Nov 2018 23:00:09 -0800
Message-ID:
References: <20181114224921.12123-2-keith.busch@intel.com> <20181114224921.12123-3-keith.busch@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To: <20181114224921.12123-3-keith.busch@intel.com>
Sender: linux-kernel-owner@vger.kernel.org
To: Keith Busch
Cc: Linux Kernel Mailing List , Linux ACPI , Linux MM , Greg KH , "Rafael J. Wysocki" , Dave Hansen
List-Id: linux-acpi@vger.kernel.org

On Wed, Nov 14, 2018 at 2:53 PM Keith Busch wrote:
>
> Heterogeneous memory systems provide memory nodes with latency
> and bandwidth performance attributes that are different from other
> nodes. Create an interface for the kernel to register these attributes
> under the node that provides the memory. If the system provides this
> information, applications can query the node attributes when deciding
> which node to request memory.
>
> When multiple memory initiators exist, accessing the same memory target
> from each may not perform the same as the other. The highest performing
> initiator to a given target is considered to be a local initiator for
> that target. The kernel provides performance attributes only for the
> local initiators.
>
> The memory's compute node should be symlinked in sysfs as one of the
> node's initiators.
>
> The following example shows the new sysfs hierarchy for a node exporting
> performance attributes:
>
>   # tree /sys/devices/system/node/nodeY/initiator_access
>   /sys/devices/system/node/nodeY/initiator_access
>   |-- read_bandwidth
>   |-- read_latency
>   |-- write_bandwidth
>   `-- write_latency

With the expectation that there will be nodes that are initiator-only,
target-only, or both I think this interface should indicate that.
The 1:1 "local" designation of HMAT should not be directly encoded in
the interface; it is just a shortcut for finding at least one initiator
in the set that can realize the advertised performance. At least if the
interface can enumerate the set of initiators, then it becomes clear
whether sysfs can answer a performance enumeration question or whether
the application needs to consult an interface with specific knowledge
of a given initiator-target pairing.

It seems a precursor to these patches is arranging for offline node
devices to be created for the ACPI proximity domains that are offline
by default for reserved memory ranges.
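
For reference, here is a rough sketch of how an application might query
the attributes in the hierarchy quoted above. It is only an illustration,
not part of the patch: the helper name read_initiator_access() is
hypothetical, and it assumes each attribute file holds a single decimal
integer, which the quoted description does not specify. An attribute the
platform does not provide is reported as None rather than treated as an
error.

```python
from pathlib import Path

# Attribute file names taken from the sysfs hierarchy in the quoted patch.
ATTRS = ("read_bandwidth", "read_latency", "write_bandwidth", "write_latency")


def read_initiator_access(node_dir):
    """Return {attribute: int or None} for one node directory.

    Hypothetical helper. Assumes each file contains one decimal integer;
    a missing or unparsable file yields None for that attribute.
    """
    access_dir = Path(node_dir) / "initiator_access"
    values = {}
    for name in ATTRS:
        try:
            values[name] = int((access_dir / name).read_text().strip())
        except (OSError, ValueError):
            # Not provided by the platform, or not in the assumed format.
            values[name] = None
    return values


if __name__ == "__main__":
    import sys

    # The default node path is only an example; pass another as argv[1].
    node = sys.argv[1] if len(sys.argv) > 1 else "/sys/devices/system/node/node0"
    for name, value in read_initiator_access(node).items():
        print(f"{name}: {value if value is not None else 'not provided'}")
```

Note that a reader like this only sees one set of numbers per target
node. It cannot tell which initiators those numbers apply to, which is
the gap that enumerating the set of initiators would close.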