linux-doc.vger.kernel.org archive mirror
  • * Re: [PATCH v7 11/12] mm/demotion: Add documentation for memory tiering
           [not found] ` <20220622082513.467538-12-aneesh.kumar@linux.ibm.com>
           [not found]   ` <202206230554.5tVWF6UB-lkp@intel.com>
    @ 2022-06-25  4:13   ` Bagas Sanjaya
      2022-06-27  4:40     ` Aneesh Kumar K.V
      1 sibling, 1 reply; 3+ messages in thread
    From: Bagas Sanjaya @ 2022-06-25  4:13 UTC
      To: Aneesh Kumar K.V
      Cc: linux-mm, akpm, Wei Xu, Huang Ying, Yang Shi, Davidlohr Bueso,
    	Tim C Chen, Michal Hocko, Linux Kernel Mailing List,
    	Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple,
    	Dan Williams, Jagdish Gediya, linux-doc
    
    On Wed, Jun 22, 2022 at 01:55:12PM +0530, Aneesh Kumar K.V wrote:
    > From: Jagdish Gediya <jvgediya@linux.ibm.com>
    > 
    
    Hi Aneesh and Jagdish,
    
    The documentation can be improved; see below.
    
    > All N_MEMORY nodes are divided into 3 memoty tiers with tier ID value
    > MEMORY_TIER_HBM_GPU, MEMORY_TIER_DRAM and MEMORY_TIER_PMEM. By default,
    > all nodes are assigned to default memory tier.
    > 
    > Demotion path for all N_MEMORY nodes is prepared based on the tier ID value
    > of memory tiers.
    > 
    > This patch adds documention for memory tiering introduction, its sysfs
    > interfaces and how demotion is performed based on memory tiers.
    > 
    
    I think the patch message should just be:
    "Add documentation for memory tiering. It also covers its sysfs
    interfaces and how demotion is performed based on memory tiers."
    
    > +===========
    > +Memory tiers
    > +============
    > +
    > +This document describes explicit memory tiering support along with
    > +demotion based on memory tiers.
    > +
    
    This causes an htmldocs error, for which I have applied a fixup at [1].
    
    > +Memory nodes are divided into 3 types of memory tiers with tier ID
    > +value as shown based on their hardware characteristics.
    > +
    > +
    > +MEMORY_TIER_HBM_GPU
    > +MEMORY_TIER_DRAM
    > +MEMORY_TIER_PMEM
    > +
    
    Use a bullet list.
    
    > +Sysfs interfaces
    > +================
    > +
    > +Nodes belonging to specific tier can be read from,
    > +/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
    > +
    > +Where N is 0 - 2.
    
    The "where" sentence can be folded into the previous sentence.
    
    > +
    > +Example 1:
    > +For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
    > +node 2 is a PMEM node an ideal tier layout will be
    > +
    > +$ cat /sys/devices/system/memtier/memtier0/nodelist
    > +1
    > +$ cat /sys/devices/system/memtier/memtier1/nodelist
    > +0
    > +$ cat /sys/devices/system/memtier/memtier2/nodelist
    > +2
    > +
    
    The code snippets should be inside literal code blocks.
    
    > +Example 2:
    > +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
    > +nodes.
    > +
    > +$ cat /sys/devices/system/memtier/memtier0/nodelist
    > +cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
    > +directory
    > +$ cat /sys/devices/system/memtier/memtier1/nodelist
    > +0-1
    > +$ cat /sys/devices/system/memtier/memtier2/nodelist
    > +2-3
    > +
    
    Use a literal code block.
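
As an aside for anyone scripting against these files: the nodelist values
quoted above ("0-1", "2-3") look like the kernel's usual range-list format.
Here is a minimal Python sketch for expanding them, assuming that format.
`parse_nodelist` is a hypothetical helper name, not something from the patch.

```python
def parse_nodelist(text):
    """Expand a range list such as "0-1,4" into [0, 1, 4].

    Assumption: nodelist uses comma-separated entries, each either a
    single node ID or an inclusive "first-last" range.
    """
    nodes = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means the tier holds no nodes
        first, _, last = part.partition("-")
        # A single ID has no "-", so fall back to first for the upper bound.
        nodes.extend(range(int(first), int(last or first) + 1))
    return nodes


print(parse_nodelist("0-1\n"))
print(parse_nodelist("2-3\n"))
```

The two calls use the memtier1 and memtier2 values from Example 2.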
    
    > +Default memory tier can be read from,
    > +/sys/devices/system/memtier/default_tier (Read-Only)
    > +
    > +e.g.
    > +$ cat /sys/devices/system/memtier/default_tier
    > +memtier200
    > +
    > +Max memory tier ID supported can be read from,
    > +/sys/devices/system/memtier/max_tier (Read-Only)
    > +
    > +e.g.
    > +$ cat /sys/devices/system/memtier/max_tier
    > +400
    > +
    > +Individual node's memory tier can be read of set using,
    > +/sys/devices/system/node/nodeN/memtier	(Read-Write)
    > +
    > +where N = node id
    > +
    > +When this interface is written, Node is moved from the old memory tier
    > +to new memory tier and demotion targets for all N_MEMORY nodes are
    > +built again.
    > +
    > +For example 1 mentioned above,
    > +$ cat /sys/devices/system/node/node0/memtier
    > +1
    > +$ cat /sys/devices/system/node/node1/memtier
    > +0
    > +$ cat /sys/devices/system/node/node2/memtier
    > +2
    > +
    
    The suggestions above apply here, too.
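
It may also help readers to see that the per-node memtier values are simply
the per-tier nodelist view inverted. A small Python sketch of that
relationship, assuming each node sits in exactly one tier; `node_to_tier` is
a hypothetical name used only for illustration.

```python
def node_to_tier(tiers):
    """Invert {tier_id: [node, ...]} into {node: tier_id}.

    Assumption: a node belongs to exactly one memory tier, so a node
    listed twice is treated as an error.
    """
    mapping = {}
    for tier_id, nodes in tiers.items():
        for node in nodes:
            if node in mapping:
                raise ValueError(f"node {node} listed in more than one tier")
            mapping[node] = tier_id
    return mapping


# Example 1 from the quoted text: memtier0 -> node 1, memtier1 -> node 0,
# memtier2 -> node 2.
example_1 = {0: [1], 1: [0], 2: [2]}
for node, tier in sorted(node_to_tier(example_1).items()):
    print(f"node{node}/memtier = {tier}")
```

For Example 1 this gives node0 in tier 1, node1 in tier 0 and node2 in tier 2,
which is what the quoted cat output shows.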
    
    > +Enable/Disable demotion
    > +-----------------------
    > +
    > +By default demotion is disabled, it can be enabled/disabled using
    > +below sysfs interface,
    > +
    > +$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
    > +
    
    Use a literal code block.
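
Note that "0/1 or false/true" in the quoted command is a placeholder for one
value, not something to type literally. A hedged Python sketch of toggling the
knob follows. The path is the one quoted above; the helper names are
hypothetical, and writing to the real file needs root.

```python
from pathlib import Path

# Path taken from the quoted documentation.
DEMOTION_KNOB = Path("/sys/kernel/mm/numa/demotion_enabled")


def set_demotion(enabled, knob=DEMOTION_KNOB):
    """Write "1" or "0", one of the spellings the quoted text lists."""
    Path(knob).write_text("1\n" if enabled else "0\n")


def demotion_is_enabled(knob=DEMOTION_KNOB):
    """Read the knob back, accepting either spelling the text mentions."""
    return Path(knob).read_text().strip() in ("1", "true")
```

The `knob` parameter exists so the helpers can be tried against an ordinary
file before pointing them at sysfs.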
    
    > +preferred and allowed demotion nodes
    > +------------------------------------
    > +
    > +Preferred nodes for a specific N_MEMORY node are the best nodes
    > +from the next possible lower memory tier. Allowed nodes for any
    > +node are all the nodes available in all possible lower memory
    > +tiers.
    > +
    > +Example:
    > +
    > +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
    > +nodes,
    > +
    > +node distances:
    > +node   0    1    2    3
    > +   0  10   20   30   40
    > +   1  20   10   40   30
    > +   2  30   40   10   40
    > +   3  40   30   40   10
    > +
    
    Use a reST table.
    
    > +memory_tiers[0] = <empty>
    > +memory_tiers[1] = 0-1
    > +memory_tiers[2] = 2-3
    > +
    > +node_demotion[0].preferred = 2
    > +node_demotion[0].allowed   = 2, 3
    > +node_demotion[1].preferred = 3
    > +node_demotion[1].allowed   = 3, 2
    > +node_demotion[2].preferred = <empty>
    > +node_demotion[2].allowed   = <empty>
    > +node_demotion[3].preferred = <empty>
    > +node_demotion[3].allowed   = <empty>
    > +
    
    What are the values above? Node properties? Also, use a literal code block.
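
For what it's worth, here is how I read the quoted rules, as a rough Python
sketch rather than the kernel's actual algorithm. Assumptions: a larger tier
index means a lower (slower) tier, as in the example; "best" means smallest
node distance; and the allowed list is ordered by distance, which matches the
"3, 2" ordering shown for node 1. `demotion_targets` is a hypothetical name.

```python
def demotion_targets(tiers, distance):
    """Return {node: (preferred, allowed)} for every node in `tiers`.

    tiers:    {tier_index: [node, ...]}, larger index = lower tier (assumed)
    distance: distance[a][b] is the node distance from a to b
    """
    order = sorted(t for t in tiers if tiers[t])  # non-empty tiers, top first
    result = {}
    for pos, tier in enumerate(order):
        lower = order[pos + 1:]  # every non-empty tier below this one
        for node in tiers[tier]:
            # Allowed: all nodes in all lower tiers, nearest first.
            allowed = sorted(
                (n for t in lower for n in tiers[t]),
                key=lambda n: distance[node][n],
            )
            preferred = []
            if lower:
                # Preferred: nearest node(s) in the next lower tier only.
                next_tier = tiers[lower[0]]
                best = min(distance[node][n] for n in next_tier)
                preferred = [n for n in next_tier if distance[node][n] == best]
            result[node] = (preferred, allowed)
    return result


distance = [
    [10, 20, 30, 40],
    [20, 10, 40, 30],
    [30, 40, 10, 40],
    [40, 30, 40, 10],
]
tiers = {0: [], 1: [0, 1], 2: [2, 3]}
for node, (preferred, allowed) in sorted(demotion_targets(tiers, distance).items()):
    print(node, preferred, allowed)
```

With the distance table and tiers from the quoted example, this reproduces the
preferred and allowed values listed above, with empty lists for nodes 2 and 3.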
    
    In case the suggestions above are unclear, here is the diff:
    
    ---- >8 ----
    
    diff --git a/Documentation/admin-guide/mm/memory-tiering.rst b/Documentation/admin-guide/mm/memory-tiering.rst
    index 0a75e0dab1fd8e..10ec5aab6ddd53 100644
    --- a/Documentation/admin-guide/mm/memory-tiering.rst
    +++ b/Documentation/admin-guide/mm/memory-tiering.rst
    @@ -14,13 +14,13 @@ Introduction
     
     Many systems have multiple types of memory devices e.g. GPU, DRAM and
     PMEM. The memory subsystem of these systems can be called a memory
    -tiering system because the performance of the different types of
    +tiering system because the performance of each type of
     memory is different. Memory tiers are defined based on the hardware
     capabilities of memory nodes. Each memory tier is assigned a tier ID
     value that determines the memory tier position in demotion order.
     
     The memory tier assignment of each node is independent of each
    -other. Moving a node from one tier to another tier doesn't affect
    +other. Moving a node from one tier to another doesn't affect
     the tier assignment of any other node.
     
     Memory tiers are used to build the demotion targets for nodes. A node
    @@ -32,10 +32,9 @@ Memory tier rank
     Memory nodes are divided into 3 types of memory tiers with tier ID
     value as shown based on their hardware characteristics.
     
    -
    -MEMORY_TIER_HBM_GPU
    -MEMORY_TIER_DRAM
    -MEMORY_TIER_PMEM
    +  * MEMORY_TIER_HBM_GPU
    +  * MEMORY_TIER_DRAM
    +  * MEMORY_TIER_PMEM
     
     Memory tiers initialization and (re)assignments
     ===============================================
    @@ -49,68 +48,73 @@ hotplug, the memory tier with default tier ID is assigned to the memory node.
     Sysfs interfaces
     ================
     
    -Nodes belonging to specific tier can be read from,
    -/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
    +Nodes belonging to specific tier can be read from
    +/sys/devices/system/memtier/memtierN/nodelist, where N is 0 - 2 (read-only)
     
    -Where N is 0 - 2.
    +Examples:
     
    -Example 1:
    -For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
    -node 2 is a PMEM node an ideal tier layout will be
    +1. On a system where node 0 is a CPU + DRAM node, node 1 is an HBM node,
    +   and node 2 is a PMEM node, an ideal tier layout will be:
     
    -$ cat /sys/devices/system/memtier/memtier0/nodelist
    -1
    -$ cat /sys/devices/system/memtier/memtier1/nodelist
    -0
    -$ cat /sys/devices/system/memtier/memtier2/nodelist
    -2
    +   .. code-block::
     
    -Example 2:
    -For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
    -nodes.
    +      $ cat /sys/devices/system/memtier/memtier0/nodelist
    +      1
    +      $ cat /sys/devices/system/memtier/memtier1/nodelist
    +      0
    +      $ cat /sys/devices/system/memtier/memtier2/nodelist
    +      2
     
    -$ cat /sys/devices/system/memtier/memtier0/nodelist
    -cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
    -directory
    -$ cat /sys/devices/system/memtier/memtier1/nodelist
    -0-1
    -$ cat /sys/devices/system/memtier/memtier2/nodelist
    -2-3
    +2. On a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
    +   nodes:
     
    -Default memory tier can be read from,
    -/sys/devices/system/memtier/default_tier (Read-Only)
    +   .. code-block::
     
    -e.g.
    -$ cat /sys/devices/system/memtier/default_tier
    -memtier200
    +      $ cat /sys/devices/system/memtier/memtier0/nodelist
    +      cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
    +      directory
    +      $ cat /sys/devices/system/memtier/memtier1/nodelist
    +      0-1
    +      $ cat /sys/devices/system/memtier/memtier2/nodelist
    +      2-3
     
    -Max memory tier ID supported can be read from,
    -/sys/devices/system/memtier/max_tier (Read-Only)
    +Default memory tier can be read from
    +/sys/devices/system/memtier/default_tier (read-only), e.g.:
     
    -e.g.
    -$ cat /sys/devices/system/memtier/max_tier
    -400
    +.. code-block::
     
    -Individual node's memory tier can be read of set using,
    -/sys/devices/system/node/nodeN/memtier	(Read-Write)
    +   $ cat /sys/devices/system/memtier/default_tier
    +   memtier200
     
    -where N = node id
    +Max memory tier ID supported can be read from
    +/sys/devices/system/memtier/max_tier (read-only), e.g.:
     
    -When this interface is written, Node is moved from the old memory tier
    +.. code-block::
    +
    +   $ cat /sys/devices/system/memtier/max_tier
    +   400
    +
    +Individual node's memory tier can be read or set using
    +/sys/devices/system/node/nodeN/memtier (read-write), where N = node id.
    +
    +When this interface is written, node is moved from the old memory tier
     to new memory tier and demotion targets for all N_MEMORY nodes are
     built again.
     
    -For example 1 mentioned above,
    -$ cat /sys/devices/system/node/node0/memtier
    -1
    -$ cat /sys/devices/system/node/node1/memtier
    -0
    -$ cat /sys/devices/system/node/node2/memtier
    -2
    +For example 1 mentioned above:
    +
    +.. code-block::
    +
    +   $ cat /sys/devices/system/node/node0/memtier
    +   1
    +   $ cat /sys/devices/system/node/node1/memtier
    +   0
    +   $ cat /sys/devices/system/node/node2/memtier
    +   2
     
     Additional memory tiers can be created by writing a tier ID value to this file.
    -This results in a new memory tier creation and moving the specific NUMA node to
    -that memory tier.
    +This results in creating a new tier and moving the specific NUMA node to
    +that tier.
     
     Demotion
     ========
    @@ -128,19 +132,20 @@ be used.
     
     Instead of a page being discarded during reclaim, it can be moved to
     persistent memory. Allowing page migration during reclaim enables
    -these systems to migrate pages from fast(higher) tiers to slow(lower)
    -tiers when the fast(higher) tier is under pressure.
    +these systems to migrate pages from fast (higher) tiers to slow (lower)
    +tiers when the fast (higher) tier is under pressure.
     
     
     Enable/Disable demotion
     -----------------------
     
    -By default demotion is disabled, it can be enabled/disabled using
    -below sysfs interface,
    +By default demotion is disabled. It can be toggled by:
     
    -$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
    +.. code-block::
     
    -preferred and allowed demotion nodes
    +   $ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
    +
    +Preferred and allowed demotion nodes
     ------------------------------------
     
     Preferred nodes for a specific N_MEMORY node are the best nodes
    @@ -148,35 +153,40 @@ from the next possible lower memory tier. Allowed nodes for any
     node are all the nodes available in all possible lower memory
     tiers.
     
    -Example:
    +For example, on a system where Node 0 & 1 are CPU + DRAM nodes,
    +node 2 & 3 are PMEM nodes:
     
    -For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
    -nodes,
    +  * node distances
     
    -node distances:
    -node   0    1    2    3
    -   0  10   20   30   40
    -   1  20   10   40   30
    -   2  30   40   10   40
    -   3  40   30   40   10
    +    ====  ==   ==   ==   ==
    +    node   0    1    2    3
    +    ====  ==   ==   ==   ==
    +       0  10   20   30   40
    +       1  20   10   40   30
    +       2  30   40   10   40
    +       3  40   30   40   10
    +    ====  ==   ==   ==   ==
     
    -memory_tiers[0] = <empty>
    -memory_tiers[1] = 0-1
    -memory_tiers[2] = 2-3
    +  * node properties
     
    -node_demotion[0].preferred = 2
    -node_demotion[0].allowed   = 2, 3
    -node_demotion[1].preferred = 3
    -node_demotion[1].allowed   = 3, 2
    -node_demotion[2].preferred = <empty>
    -node_demotion[2].allowed   = <empty>
    -node_demotion[3].preferred = <empty>
    -node_demotion[3].allowed   = <empty>
    +    .. code-block::
    +
    +       memory_tiers[0] = <empty>
    +       memory_tiers[1] = 0-1
    +       memory_tiers[2] = 2-3
    +
    +       node_demotion[0].preferred = 2
    +       node_demotion[0].allowed   = 2, 3
    +       node_demotion[1].preferred = 3
    +       node_demotion[1].allowed   = 3, 2
    +       node_demotion[2].preferred = <empty>
    +       node_demotion[2].allowed   = <empty>
    +       node_demotion[3].preferred = <empty>
    +       node_demotion[3].allowed   = <empty>
     
     Memory allocation for demotion
     ------------------------------
     
    -If a page needs to be demoted from any node, the kernel 1st tries
    -to allocate a new page from the node's preferred node and fallbacks to
    -node's allowed targets in allocation fallback order.
    -
    +If a page needs to be demoted from any node, the kernel first tries
    +to allocate a new page from the node's preferred target node and falls back
    +to the node's allowed targets in allocation fallback order.
    
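
The allocation rule in the last hunk can be sketched in Python as below. This
is only an illustration of "preferred first, then allowed targets in fallback
order". `pick_demotion_node` and the `has_free_page` callback are hypothetical,
and how the kernel chooses among several preferred nodes is not covered.

```python
def pick_demotion_node(preferred, allowed, has_free_page):
    """Return the first candidate node that can supply a page, else None.

    preferred:     preferred demotion targets of the source node
    allowed:       allowed targets, already in allocation fallback order
    has_free_page: callable(node) -> bool, stands in for a real allocation
    """
    candidates = list(preferred) + [n for n in allowed if n not in preferred]
    for node in candidates:
        if has_free_page(node):
            return node
    return None


# Node 1 from the example: preferred = [3], allowed = [3, 2].
# Pretend node 3 is full, so the page falls back to node 2.
print(pick_demotion_node([3], [3, 2], lambda n: n != 3))
```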
    
    Thanks.
    
    [1]: https://lore.kernel.org/linux-doc/YrZ5cTFOSuWxlF2t@debian.me/
    
    -- 
    An old man doll... just what I always wanted! - Clara
    
