All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Keith Busch <keith.busch@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-mm@kvack.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael Wysocki <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH 5/7] doc/vm: New documentation for memory cache
Date: Thu, 15 Nov 2018 13:16:43 +0000	[thread overview]
Message-ID: <20181115131643.00003c0d@huawei.com> (raw)
In-Reply-To: <20181114224921.12123-6-keith.busch@intel.com>

On Wed, 14 Nov 2018 15:49:18 -0700
Keith Busch <keith.busch@intel.com> wrote:

> Platforms may provide system memory that contains side caches to help

If we can call them "memory-side caches" that would avoid a persistent
confusion on what they actually are.  It took me ages to get to the
bottom of why they were always drawn to the side of the memory
path ;)

> spped up access. These memory caches are part of a memory node and

speed

> the cache attributes are exported by the kernel.
> 
> Add new documentation providing a brief overview of system memory side
> caches and the kernel provided attributes for application optimization.
A few few nits in line, but mostly looks good to me.

Thanks,

Jonathan

> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  Documentation/vm/numacache.rst | 76 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
>  create mode 100644 Documentation/vm/numacache.rst
> 
> diff --git a/Documentation/vm/numacache.rst b/Documentation/vm/numacache.rst
> new file mode 100644
> index 000000000000..e79c801b7e3b
> --- /dev/null
> +++ b/Documentation/vm/numacache.rst
> @@ -0,0 +1,76 @@
> +.. _numacache:
> +
> +==========
> +NUMA Cache
> +==========
> +
> +System memory may be constructed in a hierarchy of various performing

of elements with various performance

> +characteristics in order to provide large address space of slower
> +performing memory cached by a smaller size of higher performing

cached by smaller higher performing memory.

> +memory. The system physical addresses that software is aware of see

is aware of is provided (no 'see')

> +is provided by the last memory level in the hierarchy, while higher
> +performing memory transparently provides caching to slower levels.
> +
> +The term "far memory" is used to denote the last level memory in the
> +hierarchy. Each increasing cache level provides higher performing CPU

initiator rather than CPU?

> +access, and the term "near memory" represents the highest level cache
> +provided by the system. This number is different than CPU caches where
> +the cache level (ex: L1, L2, L3) uses a CPU centric view with each level
> +being lower performing and closer to system memory. The memory cache
> +level is centric to the last level memory, so the higher numbered cache

from the last level memory?

> +level denotes memory nearer to the CPU, and further from far memory.
> +
> +The memory side caches are not directly addressable by software. When
> +software accesses a system address, the system will return it from the
> +near memory cache if it is present. If it is not present, the system
> +accesses the next level of memory until there is either a hit in that
> +cache level, or it reaches far memory.
> +
> +In order to maximize the performance out of such a setup, software may
> +wish to query the memory cache attributes. If the system provides a way
> +to query this information, for example with ACPI HMAT (Heterogeneous
> +Memory Attribute Table)[1], the kernel will append these attributes to
> +the NUMA node that provides the memory.
> +
> +When the kernel first registers a memory cache with a node, the kernel
> +will create the following directory::
> +
> +	/sys/devices/system/node/nodeX/cache/

Given we have other things with caches in a numa node, should we make
this name more specific?

> +
> +If that directory is not present, then either the memory does not have
> +a side cache, or that information is not provided to the kernel.
> +
> +The attributes for each level of cache is provided under its cache
> +level index::
> +
> +	/sys/devices/system/node/nodeX/cache/indexA/
> +	/sys/devices/system/node/nodeX/cache/indexB/
> +	/sys/devices/system/node/nodeX/cache/indexC/
> +
> +Each cache level's directory provides its attributes. For example,
> +the following is a single cache level and the attributes available for
> +software to query::
> +
> +	# tree sys/devices/system/node/node0/cache/
> +	/sys/devices/system/node/node0/cache/
> +	|-- index1
> +	|   |-- associativity
> +	|   |-- level
> +	|   |-- line_size
> +	|   |-- size
> +	|   `-- write_policy
> +
> +The cache "associativity" will be 0 if it is a direct-mapped cache, and
> +non-zero for any other indexed based, multi-way associativity.
This description is a little vague.  Right now I think we have 3 options
from HMAT,
1) no associativity (which I suppose could also be called fully associative?)
2) direct mapped (0 in your case)
3) Complex (who knows!)

So how do you map 1 and 3?

> +
> +The "level" is the distance from the far memory, and matches the number
> +appended to its "index" directory.
> +
> +The "line_size" is the number of bytes accessed on a cache miss.
> +
> +The "size" is the number of bytes provided by this cache level.
> +
> +The "write_policy" will be 0 for write-back, and non-zero for
> +write-through caching.

Do these not appear if the write_policy provided by acpi is "none".

> +
> +[1] https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf

WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Keith Busch <keith.busch@intel.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
	<linux-mm@kvack.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael Wysocki" <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	"Dan Williams" <dan.j.williams@intel.com>
Subject: Re: [PATCH 5/7] doc/vm: New documentation for memory cache
Date: Thu, 15 Nov 2018 13:16:43 +0000	[thread overview]
Message-ID: <20181115131643.00003c0d@huawei.com> (raw)
In-Reply-To: <20181114224921.12123-6-keith.busch@intel.com>

On Wed, 14 Nov 2018 15:49:18 -0700
Keith Busch <keith.busch@intel.com> wrote:

> Platforms may provide system memory that contains side caches to help

If we can call them "memory-side caches" that would avoid a persistent
confusion on what they actually are.  It took me ages to get to the
bottom of why they were always drawn to the side of the memory
path ;)

> spped up access. These memory caches are part of a memory node and

speed

> the cache attributes are exported by the kernel.
> 
> Add new documentation providing a brief overview of system memory side
> caches and the kernel provided attributes for application optimization.
A few few nits in line, but mostly looks good to me.

Thanks,

Jonathan

> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  Documentation/vm/numacache.rst | 76 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
>  create mode 100644 Documentation/vm/numacache.rst
> 
> diff --git a/Documentation/vm/numacache.rst b/Documentation/vm/numacache.rst
> new file mode 100644
> index 000000000000..e79c801b7e3b
> --- /dev/null
> +++ b/Documentation/vm/numacache.rst
> @@ -0,0 +1,76 @@
> +.. _numacache:
> +
> +==========
> +NUMA Cache
> +==========
> +
> +System memory may be constructed in a hierarchy of various performing

of elements with various performance

> +characteristics in order to provide large address space of slower
> +performing memory cached by a smaller size of higher performing

cached by smaller higher performing memory.

> +memory. The system physical addresses that software is aware of see

is aware of is provided (no 'see')

> +is provided by the last memory level in the hierarchy, while higher
> +performing memory transparently provides caching to slower levels.
> +
> +The term "far memory" is used to denote the last level memory in the
> +hierarchy. Each increasing cache level provides higher performing CPU

initiator rather than CPU?

> +access, and the term "near memory" represents the highest level cache
> +provided by the system. This number is different than CPU caches where
> +the cache level (ex: L1, L2, L3) uses a CPU centric view with each level
> +being lower performing and closer to system memory. The memory cache
> +level is centric to the last level memory, so the higher numbered cache

from the last level memory?

> +level denotes memory nearer to the CPU, and further from far memory.
> +
> +The memory side caches are not directly addressable by software. When
> +software accesses a system address, the system will return it from the
> +near memory cache if it is present. If it is not present, the system
> +accesses the next level of memory until there is either a hit in that
> +cache level, or it reaches far memory.
> +
> +In order to maximize the performance out of such a setup, software may
> +wish to query the memory cache attributes. If the system provides a way
> +to query this information, for example with ACPI HMAT (Heterogeneous
> +Memory Attribute Table)[1], the kernel will append these attributes to
> +the NUMA node that provides the memory.
> +
> +When the kernel first registers a memory cache with a node, the kernel
> +will create the following directory::
> +
> +	/sys/devices/system/node/nodeX/cache/

Given we have other things with caches in a numa node, should we make
this name more specific?

> +
> +If that directory is not present, then either the memory does not have
> +a side cache, or that information is not provided to the kernel.
> +
> +The attributes for each level of cache is provided under its cache
> +level index::
> +
> +	/sys/devices/system/node/nodeX/cache/indexA/
> +	/sys/devices/system/node/nodeX/cache/indexB/
> +	/sys/devices/system/node/nodeX/cache/indexC/
> +
> +Each cache level's directory provides its attributes. For example,
> +the following is a single cache level and the attributes available for
> +software to query::
> +
> +	# tree sys/devices/system/node/node0/cache/
> +	/sys/devices/system/node/node0/cache/
> +	|-- index1
> +	|   |-- associativity
> +	|   |-- level
> +	|   |-- line_size
> +	|   |-- size
> +	|   `-- write_policy
> +
> +The cache "associativity" will be 0 if it is a direct-mapped cache, and
> +non-zero for any other indexed based, multi-way associativity.
This description is a little vague.  Right now I think we have 3 options
from HMAT,
1) no associativity (which I suppose could also be called fully associative?)
2) direct mapped (0 in your case)
3) Complex (who knows!)

So how do you map 1 and 3?

> +
> +The "level" is the distance from the far memory, and matches the number
> +appended to its "index" directory.
> +
> +The "line_size" is the number of bytes accessed on a cache miss.
> +
> +The "size" is the number of bytes provided by this cache level.
> +
> +The "write_policy" will be 0 for write-back, and non-zero for
> +write-through caching.

Do these not appear if the write_policy provided by acpi is "none".

> +
> +[1] https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf



  parent reply	other threads:[~2018-11-15 13:16 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-14 22:49 [PATCH 1/7] node: Link memory nodes to their compute nodes Keith Busch
2018-11-14 22:49 ` [PATCH 2/7] node: Add heterogenous memory performance Keith Busch
2018-11-19  3:35   ` Anshuman Khandual
2018-11-19 15:46     ` Keith Busch
2018-11-22 13:22       ` Anshuman Khandual
2018-11-27  7:00   ` Dan Williams
2018-11-27 17:42     ` Dan Williams
2018-11-27 17:44     ` Keith Busch
2018-11-14 22:49 ` [PATCH 3/7] doc/vm: New documentation for " Keith Busch
2018-11-15 12:59   ` Jonathan Cameron
2018-11-15 12:59     ` Jonathan Cameron
2018-12-10 16:12     ` Dan Williams
2018-11-20 13:51   ` Mike Rapoport
2018-11-20 15:31     ` Keith Busch
2018-11-14 22:49 ` [PATCH 4/7] node: Add memory caching attributes Keith Busch
2018-11-15  0:40   ` Dave Hansen
2018-11-19  4:14   ` Anshuman Khandual
2018-11-19 23:06     ` Keith Busch
2018-11-22 13:29       ` Anshuman Khandual
2018-11-26 15:14         ` Keith Busch
2018-11-26 19:06   ` Greg Kroah-Hartman
2018-11-26 19:53     ` Keith Busch
2018-11-26 19:06   ` Greg Kroah-Hartman
2018-11-14 22:49 ` [PATCH 5/7] doc/vm: New documentation for memory cache Keith Busch
2018-11-15  0:41   ` Dave Hansen
2018-11-15 13:16   ` Jonathan Cameron [this message]
2018-11-15 13:16     ` Jonathan Cameron
2018-11-20 13:53   ` Mike Rapoport
2018-11-14 22:49 ` [PATCH 6/7] acpi: Create subtable parsing infrastructure Keith Busch
2018-11-19  9:58   ` Rafael J. Wysocki
2018-11-19 18:36     ` Keith Busch
2018-11-14 22:49 ` [PATCH 7/7] acpi/hmat: Parse and report heterogeneous memory Keith Busch
2018-11-15 13:57 ` [PATCH 1/7] node: Link memory nodes to their compute nodes Matthew Wilcox
2018-11-15 14:59   ` Keith Busch
2018-11-15 17:50     ` Dan Williams
2018-11-19  3:04       ` Anshuman Khandual
2018-11-15 20:36     ` Matthew Wilcox
2018-11-16 18:32       ` Keith Busch
2018-11-19  3:15         ` Anshuman Khandual
2018-11-19 15:49           ` Keith Busch
2018-12-04 15:43         ` Aneesh Kumar K.V
2018-12-04 16:54           ` Keith Busch
2018-11-16 22:55       ` Dan Williams
2018-11-19  2:52     ` Anshuman Khandual
2018-11-19  2:46 ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181115131643.00003c0d@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=keith.busch@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.