mm-commits.vger.kernel.org archive mirror
* + mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch added to -mm tree
@ 2020-04-18 20:01 akpm
From: akpm @ 2020-04-18 20:01 UTC
  To: anchalag, andrea.righi, hughd, kelleynnn, minchan, mm-commits,
	vpillai, ying.huang


The patch titled
     Subject: mm: swap: properly update readahead statistics in unuse_pte_range()
has been added to the -mm tree.  Its filename is
     mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrea Righi <andrea.righi@canonical.com>
Subject: mm: swap: properly update readahead statistics in unuse_pte_range()

In unuse_pte_range() we blindly swap in pages without checking whether
the swap entry is already present in the swap cache.

As a result, the hit/miss ratio used by the swap readahead heuristic is
not properly updated, which leads to suboptimal performance during
swapoff.
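
For context, here is a simplified userspace model of that feedback loop
(an illustrative sketch only, loosely modeled on swapin_nr_pages() and
the swapin_readahead_hits counter in mm/swap_state.c, not the exact
kernel logic): faults served from the swap cache count as hits and grow
the readahead window, while misses shrink it back toward a single page.

/*
 * Illustrative userspace model of the swap readahead window heuristic
 * (simplified; the real logic lives in mm/swap_state.c).  Lookups that
 * hit the swap cache bump a hit counter; on the next miss the window
 * is sized from the accumulated hits and the counter is reset.
 */
#include <stdio.h>

#define MAX_WINDOW	8	/* stand-in for the real window cap */

static unsigned int hits;	/* models swapin_readahead_hits */

/* A fault served from the swap cache counts as a readahead hit. */
static void swap_cache_hit(void)
{
	hits++;
}

/* A fault that misses the swap cache sizes (and resets) the window. */
static unsigned int next_window(unsigned int prev_window)
{
	unsigned int pages = hits + 2;
	unsigned int win;

	hits = 0;
	if (pages <= 2) {
		win = 1;		/* no recent hits: read a single page */
	} else {
		win = 4;		/* round up to a power of two */
		while (win < pages)
			win <<= 1;
	}
	if (win > MAX_WINDOW)
		win = MAX_WINDOW;
	if (win < prev_window / 2)	/* don't shrink the window too fast */
		win = prev_window / 2;
	return win;
}

int main(void)
{
	unsigned int win = 1;
	int i;

	/* Only misses (old unuse_pte_range() behaviour): window stays small. */
	for (i = 0; i < 4; i++)
		win = next_window(win);
	printf("all misses -> window = %u pages\n", win);

	/* Hits recorded before the next miss (patched behaviour): window grows. */
	for (i = 0; i < 6; i++)
		swap_cache_hit();
	win = next_window(win);
	printf("6 hits     -> window = %u pages\n", win);
	return 0;
}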

Tracing the distribution of the readahead size returned by the swap
readahead heuristic during swapoff shows that a small readahead size is
used most of the time as if we had only misses (this happens both with
cluster and vma readahead), for example:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        36948      $retval = 8
        44151      $retval = 4
        49290      $retval = 1
        527771     $retval = 2

Checking whether the swap entry is already present in the swap cache
first, instead, allows the readahead statistics to be updated properly;
the heuristic then behaves better during swapoff and selects a bigger
readahead size:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        1618       $retval = 1
        4960       $retval = 2
        41315      $retval = 4
        103521     $retval = 8

In terms of swapoff performance the result is the following:

Testing environment
===================

 - Host:
   CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
   SSD: PC401 NVMe SK hynix 512GB
   MEM: 16GB

 - Guest (kvm):
   8GB of RAM
   virtio block driver
   16GB swap file on ext4 (/swapfile)

Test case
=========
 - allocate 85% of memory (see the sketch after this list)
 - `systemctl hibernate` to force all the pages to be swapped out to the
   swap file
 - resume the system
 - measure the time that swapoff takes to complete:
   # /usr/bin/time swapoff /swapfile
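
A minimal sketch of the allocation step referenced above (a hypothetical
helper, not part of the patch): it reads MemTotal from /proc/meminfo,
allocates roughly 85% of it and touches every page so that the memory is
actually resident before hibernating.

/* alloc85.c - hypothetical helper for the "allocate 85% of memory" step */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];
	size_t total_kb = 0, bytes, off, page;
	char *buf;

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "MemTotal: %zu kB", &total_kb) == 1)
			break;
	}
	fclose(f);
	if (!total_kb)
		return 1;

	bytes = total_kb * 1024 / 100 * 85;
	page = sysconf(_SC_PAGESIZE);
	buf = malloc(bytes);
	if (!buf)
		return 1;
	for (off = 0; off < bytes; off += page)
		buf[off] = 1;			/* touch every page */

	printf("holding %zu MiB, hibernate now; Ctrl-C to release\n",
	       bytes >> 20);
	pause();				/* keep the allocation alive */
	return 0;
}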

Result (swapoff time)
=====================
                  5.6 vanilla   5.6 w/ this patch
                  -----------   -----------------
cluster-readahead      22.09s              12.19s
    vma-readahead      18.20s              15.33s

Conclusion
==========

The specific use case this patch addresses is improving swapoff
performance in cloud environments where a VM has been hibernated and
resumed, and all of its memory needs to be forced back to RAM by
disabling swap.

This change better exploits the readahead heuristic during swapoff and
thus speeds up the resume process of such VMs.

[andrea.righi@canonical.com: update changelog]
  Link: http://lkml.kernel.org/r/20200418084705.GA147642@xps-13
Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Anchal Agarwal <anchalag@amazon.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vineeth Remanan Pillai <vpillai@digitalocean.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swapfile.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/mm/swapfile.c~mm-swap-properly-update-readahead-statistics-in-unuse_pte_range
+++ a/mm/swapfile.c
@@ -1937,10 +1937,14 @@ static int unuse_pte_range(struct vm_are
 
 		pte_unmap(pte);
 		swap_map = &si->swap_map[offset];
-		vmf.vma = vma;
-		vmf.address = addr;
-		vmf.pmd = pmd;
-		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, &vmf);
+		page = lookup_swap_cache(entry, vma, addr);
+		if (!page) {
+			vmf.vma = vma;
+			vmf.address = addr;
+			vmf.pmd = pmd;
+			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
+						&vmf);
+		}
 		if (!page) {
 			if (*swap_map == 0 || *swap_map == SWAP_MAP_BAD)
 				goto try_next;
_

Patches currently in -mm which might be from andrea.righi@canonical.com are

mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch


* + mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch added to -mm tree
@ 2020-04-17  4:09 akpm
From: akpm @ 2020-04-17  4:09 UTC
  To: anchalag, andrea.righi, hughd, kelleynnn, minchan, mm-commits,
	vpillai, ying.huang


The patch titled
     Subject: mm: swap: properly update readahead statistics in unuse_pte_range()
has been added to the -mm tree.  Its filename is
     mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrea Righi <andrea.righi@canonical.com>
Subject: mm: swap: properly update readahead statistics in unuse_pte_range()

In unuse_pte_range() we blindly swap in pages without checking whether
the swap entry is already present in the swap cache.

As a result, the hit/miss ratio used by the swap readahead heuristic is
not properly updated, which leads to suboptimal performance during
swapoff.

Tracing the distribution of the readahead size returned by the swap
readahead heuristic during swapoff shows that a small readahead size is
used most of the time as if we had only misses (this happens both with
cluster and vma readahead), for example:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        36948      $retval = 8
        44151      $retval = 4
        49290      $retval = 1
        527771     $retval = 2

Checking whether the swap entry is already present in the swap cache
first, instead, allows the readahead statistics to be updated properly;
the heuristic then behaves better during swapoff and selects a bigger
readahead size:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        1618       $retval = 1
        4960       $retval = 2
        41315      $retval = 4
        103521     $retval = 8

In terms of swapoff performance the result is the following:

Testing environment
===================

 - Host:
   CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
   SSD: PC401 NVMe SK hynix 512GB
   MEM: 16GB

 - Guest (kvm):
   8GB of RAM
   virtio block driver
   16GB swap file on ext4 (/swapfile)

Test case
=========
 - allocate 85% of memory
 - `systemctl hibernate` to force all the pages to be swapped out to the
   swap file
 - resume the system
 - measure the time that swapoff takes to complete:
   # /usr/bin/time swapoff /swapfile

Result (swapoff time)
=====================
                  5.6 vanilla   5.6 w/ this patch
                  -----------   -----------------
cluster-readahead      22.09s              12.19s
    vma-readahead      18.20s              15.33s

Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Anchal Agarwal <anchalag@amazon.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vineeth Remanan Pillai <vpillai@digitalocean.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swapfile.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/mm/swapfile.c~mm-swap-properly-update-readahead-statistics-in-unuse_pte_range
+++ a/mm/swapfile.c
@@ -1937,10 +1937,14 @@ static int unuse_pte_range(struct vm_are
 
 		pte_unmap(pte);
 		swap_map = &si->swap_map[offset];
-		vmf.vma = vma;
-		vmf.address = addr;
-		vmf.pmd = pmd;
-		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, &vmf);
+		page = lookup_swap_cache(entry, vma, addr);
+		if (!page) {
+			vmf.vma = vma;
+			vmf.address = addr;
+			vmf.pmd = pmd;
+			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
+						&vmf);
+		}
 		if (!page) {
 			if (*swap_map == 0 || *swap_map == SWAP_MAP_BAD)
 				goto try_next;
_

Patches currently in -mm which might be from andrea.righi@canonical.com are

mm-swap-properly-update-readahead-statistics-in-unuse_pte_range.patch

