From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Keith Busch <keith.busch@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-mm@kvack.org, Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael Wysocki <rafael@kernel.org>, Dave Hansen <dave.hansen@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH 5/7] doc/vm: New documentation for memory cache
Date: Thu, 15 Nov 2018 13:16:43 +0000
Message-ID: <20181115131643.00003c0d@huawei.com>
In-Reply-To: <20181114224921.12123-6-keith.busch@intel.com>

On Wed, 14 Nov 2018 15:49:18 -0700
Keith Busch <keith.busch@intel.com> wrote:

> Platforms may provide system memory that contains side caches to help

If we can call them "memory-side caches" that would avoid a persistent
confusion on what they actually are.  It took me ages to get to the
bottom of why they were always drawn to the side of the memory path ;)

> spped up access. These memory caches are part of a memory node and

speed

> the cache attributes are exported by the kernel.
>
> Add new documentation providing a brief overview of system memory side
> caches and the kernel provided attributes for application optimization.

A few nits in line, but mostly looks good to me.

Thanks,

Jonathan

>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  Documentation/vm/numacache.rst | 76 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
>  create mode 100644 Documentation/vm/numacache.rst
>
> diff --git a/Documentation/vm/numacache.rst b/Documentation/vm/numacache.rst
> new file mode 100644
> index 000000000000..e79c801b7e3b
> --- /dev/null
> +++ b/Documentation/vm/numacache.rst
> @@ -0,0 +1,76 @@
> +.. _numacache:
> +
> +==========
> +NUMA Cache
> +==========
> +
> +System memory may be constructed in a hierarchy of various performing

of elements with various performance

> +characteristics in order to provide large address space of slower
> +performing memory cached by a smaller size of higher performing

cached by smaller higher performing memory.

> +memory. The system physical addresses that software is aware of see

is aware of is provided (no 'see')

> +is provided by the last memory level in the hierarchy, while higher
> +performing memory transparently provides caching to slower levels.
> +
> +The term "far memory" is used to denote the last level memory in the
> +hierarchy. Each increasing cache level provides higher performing CPU

initiator rather than CPU?

> +access, and the term "near memory" represents the highest level cache
> +provided by the system. This number is different than CPU caches where
> +the cache level (ex: L1, L2, L3) uses a CPU centric view with each level
> +being lower performing and closer to system memory. The memory cache
> +level is centric to the last level memory, so the higher numbered cache

from the last level memory?

> +level denotes memory nearer to the CPU, and further from far memory.
> +
> +The memory side caches are not directly addressable by software. When
> +software accesses a system address, the system will return it from the
> +near memory cache if it is present. If it is not present, the system
> +accesses the next level of memory until there is either a hit in that
> +cache level, or it reaches far memory.
> +
> +In order to maximize the performance out of such a setup, software may
> +wish to query the memory cache attributes. If the system provides a way
> +to query this information, for example with ACPI HMAT (Heterogeneous
> +Memory Attribute Table)[1], the kernel will append these attributes to
> +the NUMA node that provides the memory.
> +
> +When the kernel first registers a memory cache with a node, the kernel
> +will create the following directory::
> +
> +	/sys/devices/system/node/nodeX/cache/

Given we have other things with caches in a numa node, should we make
this name more specific?

> +
> +If that directory is not present, then either the memory does not have
> +a side cache, or that information is not provided to the kernel.
> +
> +The attributes for each level of cache is provided under its cache
> +level index::
> +
> +	/sys/devices/system/node/nodeX/cache/indexA/
> +	/sys/devices/system/node/nodeX/cache/indexB/
> +	/sys/devices/system/node/nodeX/cache/indexC/
> +
> +Each cache level's directory provides its attributes. For example,
> +the following is a single cache level and the attributes available for
> +software to query::
> +
> +	# tree sys/devices/system/node/node0/cache/
> +	/sys/devices/system/node/node0/cache/
> +	|-- index1
> +	|   |-- associativity
> +	|   |-- level
> +	|   |-- line_size
> +	|   |-- size
> +	|   `-- write_policy
> +
> +The cache "associativity" will be 0 if it is a direct-mapped cache, and
> +non-zero for any other indexed based, multi-way associativity.

This description is a little vague.  Right now I think we have 3 options
from HMAT:

1) no associativity (which I suppose could also be called fully
   associative?)
2) direct mapped (0 in your case)
3) Complex (who knows!)

So how do you map 1 and 3?

> +
> +The "level" is the distance from the far memory, and matches the number
> +appended to its "index" directory.
> +
> +The "line_size" is the number of bytes accessed on a cache miss.
> +
> +The "size" is the number of bytes provided by this cache level.
> +
> +The "write_policy" will be 0 for write-back, and non-zero for
> +write-through caching.

Do these not appear if the write_policy provided by acpi is "none"?

> +
> +[1] https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
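
As an aside, in case it helps when polishing the doc: a minimal, untested
sketch of how userspace might walk these attributes from a shell, assuming
the nodeX/indexN layout shown in the tree example above actually exists on
the system (node0 and the attribute file names are taken from that example):

	#!/bin/sh
	# Dump every reported memory-side cache level of node0.
	for idx in /sys/devices/system/node/node0/cache/index*; do
		# If the glob does not match, no memory-side cache was reported.
		[ -d "$idx" ] || continue
		echo "== $idx =="
		for attr in level size line_size associativity write_policy; do
			# Each attribute is a plain text file; a missing file just
			# means the platform did not provide that property.
			[ -f "$idx/$attr" ] &&
				printf '%s: %s\n' "$attr" "$(cat "$idx/$attr")"
		done
	done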