* [PATCH v4 0/2] Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
@ 2024-03-25 14:24 Donet Tom
From: Donet Tom @ 2024-03-25 14:24 UTC (permalink / raw)
  To: Andrew Morton, linux-mm, linux-kernel
  Cc: Aneesh Kumar, Huang Ying, Michal Hocko, Dave Hansen, Mel Gorman,
	Feng Tang, Andrea Arcangeli, Peter Zijlstra, Ingo Molnar,
	Rik van Riel, Johannes Weiner, Matthew Wilcox, Vlastimil Babka,
	Dan Williams, Hugh Dickins, Kefeng Wang, Suren Baghdasaryan,
	Donet Tom

This patchset optimizes cross-socket memory access under the
MPOL_PREFERRED_MANY policy.
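
In short (per the v2 changelog below), with MPOL_PREFERRED_MANY a NUMA
hint fault now promotes the faulting page to the executing node only
when that node is part of the policy's preferred nodemask. A minimal
sketch of the rule (hypothetical helper, not the exact upstream diff):

    /* Illustrative only: promote on a protnone (NUMA hint) fault
     * only when the node the task is running on belongs to the
     * policy's preferred nodemask. */
    static bool migrate_on_protnone(int exec_nid, nodemask_t preferred)
    {
            return node_isset(exec_nid, preferred);
    }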

To test this patchset, we ran the following test on a 3-node system:
 Node 0 - 2GB   - Tier 1
 Node 1 - 11GB  - Tier 1
 Node 6 - 10GB  - Tier 2

The following change was made to memcached to set the memory policy;
it selects node 0 and node 1 as the preferred nodes.

   #include <stdio.h>
   #include <stdlib.h>
   #include <errno.h>
   #include <numaif.h>
   #include <numa.h>

    unsigned long nodemask;
    int ret;

    nodemask = 0x03;    /* bits 0 and 1: prefer node 0 and node 1 */
    ret = set_mempolicy(MPOL_PREFERRED_MANY | MPOL_F_NUMA_BALANCING,
                        &nodemask, 10);
    /* If MPOL_F_NUMA_BALANCING isn't supported,
     * fall back to plain MPOL_PREFERRED_MANY. */
    if (ret < 0 && errno == EINVAL) {
        printf("set mem policy normal\n");
        ret = set_mempolicy(MPOL_PREFERRED_MANY, &nodemask, 10);
    }
    if (ret < 0) {
        perror("Failed to call set_mempolicy");
        exit(-1);
    }

Test Procedure:
===============
1. Make sure memory tiering and demotion are enabled.
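
   For example (illustrative; the exact knobs depend on kernel version
   and config):

   # echo 2 > /proc/sys/kernel/numa_balancing
   # echo 1 > /sys/kernel/mm/numa/demotion_enabled
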
2. Start memcached.

   # ./memcached -b 100000 -m 204800 -u root -c 1000000 -t 7
       -d -s "/tmp/memcached.sock"

3. Run memtier_benchmark to store 3200000 keys.

  # ./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
    --threads=1 --pipeline=1 --ratio=1:0 --key-pattern=S:S --key-minimum=1
    --key-maximum=3200000 -n allkeys -c 1 -R -x 1 -d 1024

4. Start a memory eater on nodes 0 and 1. This will demote all the
   memcached pages to node 6.
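
   Any allocation bound to nodes 0 and 1 will do; an illustrative
   example (not necessarily the exact command used here):

   # numactl --membind=0,1 stress-ng --vm 1 --vm-bytes 12G --vm-hang 0
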
5. Make sure all the memcached pages got demoted to the lower tier by
   reading /proc/<memcached PID>/numa_maps.

    # cat /proc/2771/numa_maps
     ---
    default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
    default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
     ---

6. Kill memory eater.
7. Read the pgpromote_success counter.
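
   The counter is per node, e.g.:

   # grep pgpromote_success /sys/devices/system/node/node*/vmstat
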
8. Start reading the keys by running memtier_benchmark.

  # ./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
   --pipeline=1 --distinct-client-seed --ratio=0:3 --key-pattern=R:R
   --key-minimum=1 --key-maximum=3200000 -n allkeys
   --threads=64 -c 1 -R -x 6

9. Read the pgpromote_success counter.

Test Results:
=============
Without Patch
-------------
1. pgpromote_success  before test
Node 0:  pgpromote_success 11
Node 1:  pgpromote_success 140974

pgpromote_success  after test
Node 0:  pgpromote_success 11
Node 1:  pgpromote_success 140974

2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
===================================================================
Type    Ops/sec    Hits/sec   Misses/sec  Avg. Latency  p50 Latency
-------------------------------------------------------------------
Sets    0.00       ---        ---         ---           ---
Gets    305792.03  305791.93  0.10        0.18949       0.16700
Waits   0.00       ---        ---         ---           ---
Totals  305792.03  305791.93  0.10        0.18949       0.16700

==========================================
Type    p99 Latency  p99.9 Latency  KB/sec
------------------------------------------
Sets    ---          ---            0.00
Gets    0.44700      1.71100        11542.69
Waits   ---          ---            ---
Totals  0.44700      1.71100        11542.69

With Patch
----------
1. pgpromote_success  before test
Node 0:  pgpromote_success 5
Node 1:  pgpromote_success 89386

pgpromote_success  after test
Node 0:  pgpromote_success 57895
Node 1:  pgpromote_success 141463

2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
===================================================================
Type    Ops/sec    Hits/sec   Misses/sec  Avg. Latency  p50 Latency
-------------------------------------------------------------------
Sets    0.00       ---        ---         ---           ---
Gets    521942.24  521942.07  0.17        0.11459       0.10300
Waits   0.00       ---        ---         ---           ---
Totals  521942.24  521942.07  0.17        0.11459       0.10300

==========================================
Type    p99 Latency  p99.9 Latency  KB/sec
------------------------------------------
Sets    ---          ---            0.00
Gets    0.23100      0.31900        19701.68
Waits   ---          ---            ---
Totals  0.23100      0.31900        19701.68


Test Result Analysis:
=====================
1. With the patch we can observe that pages are getting promoted
   (node 0's pgpromote_success went from 5 to 57895, node 1's from
   89386 to 141463).
2. The memtier_benchmark results show that the patch improves
   throughput by roughly 70%:

 Ops/sec without patch -  305792.03
 Ops/sec with patch    -  521942.24

Changes:
v4:
- Added an example to the "PATCH 2/2" commit message as per the
  discussion on v3.
v3:
- Added the "* @vmf: structure describing the fault" comment for
  mpol_misplaced() to fix the warning:
  https://lore.kernel.org/oe-kbuild-all/202403202229.WZeAnUuO-lkp@intel.com/
- https://lore.kernel.org/lkml/cover.1711002865.git.donettom@linux.ibm.com/
v2:
- Rebased on latest upstream (v6.8-rc7).
- Used numa_node_id() to get the current execution node ID; added
  lockdep_assert_held() to make sure that mpol_misplaced() is called
  with the PTL held.
- Updated the migration condition: migration now only occurs if the
  execution node is present in the policy nodemask.
- https://lore.kernel.org/lkml/cover.1709909210.git.donettom@linux.ibm.com/

v1: https://lore.kernel.org/linux-mm/9c3f7b743477560d1c5b12b8c111a584a2cc92ee.1708097962.git.donettom@linux.ibm.com/#t


Donet Tom (2):
  mm/mempolicy: Use numa_node_id() instead of cpu_to_node()
  mm/numa_balancing: Allow migrate on protnone reference with
    MPOL_PREFERRED_MANY policy

 include/linux/mempolicy.h |  5 +++--
 mm/huge_memory.c          |  2 +-
 mm/internal.h             |  2 +-
 mm/memory.c               |  8 +++++---
 mm/mempolicy.c            | 36 +++++++++++++++++++++++++++---------
 5 files changed, 37 insertions(+), 16 deletions(-)

-- 
2.39.3




* Re: [PATCH v4 0/2] Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
From: Andrew Morton @ 2024-03-25 18:24 UTC (permalink / raw)
  To: Donet Tom
  Cc: linux-mm, linux-kernel, Aneesh Kumar, Huang Ying, Michal Hocko,
	Dave Hansen, Mel Gorman, Feng Tang, Andrea Arcangeli,
	Peter Zijlstra, Ingo Molnar, Rik van Riel, Johannes Weiner,
	Matthew Wilcox, Vlastimil Babka, Dan Williams, Hugh Dickins,
	Kefeng Wang, Suren Baghdasaryan

On Mon, 25 Mar 2024 09:24:12 -0500 Donet Tom <donettom@linux.ibm.com> wrote:

> v4:
> - Added an example to the "PATCH 2/2" commit message as per the
>   discussion on v3.

Thanks, I updated the changelogs in place.



* Re: [PATCH v4 0/2] Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
From: Huang, Ying @ 2024-03-26  2:38 UTC (permalink / raw)
  To: Donet Tom
  Cc: Andrew Morton, linux-mm, linux-kernel, Aneesh Kumar,
	Michal Hocko, Dave Hansen, Mel Gorman, Feng Tang,
	Andrea Arcangeli, Peter Zijlstra, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Matthew Wilcox, Vlastimil Babka, Dan Williams,
	Hugh Dickins, Kefeng Wang, Suren Baghdasaryan

Donet Tom <donettom@linux.ibm.com> writes:

> This patchset optimizes cross-socket memory access under the
> MPOL_PREFERRED_MANY policy.
>
> [...]
>
> Donet Tom (2):
>   mm/mempolicy: Use numa_node_id() instead of cpu_to_node()
>   mm/numa_balancing: Allow migrate on protnone reference with
>     MPOL_PREFERRED_MANY policy

LGTM, Thanks!  Feel free to add

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

in a future version.

--
Best Regards,
Huang, Ying

