From: Dan Williams <dan.j.williams@gmail.com>
To: ying.huang@intel.com
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Shaohua Li <shli@kernel.org>,
hughd@google.com, Minchan Kim <minchan@kernel.org>,
Rik van Riel <riel@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
n-horiguchi@ah.jp.nec.com, zi.yan@cs.rutgers.edu,
daniel.m.jordan@oracle.com
Subject: Re: [PATCH -mm -v4 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()
Date: Sat, 7 Jul 2018 16:22:54 -0700 [thread overview]
Message-ID: <CAA9_cmc2YteXBhrLOFN0rAZ4UFDRPcXaE1OPNv06P+Fu9e+zeA@mail.gmail.com> (raw)
In-Reply-To: <20180622035151.6676-4-ying.huang@intel.com>
On Thu, Jun 21, 2018 at 8:55 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> From: Huang Ying <ying.huang@intel.com>
>
> To support to swapin the THP as a whole, we need to create PMD swap
> mapping during swapout, and maintain PMD swap mapping count. This
> patch implements the support to increase the PMD swap mapping
> count (for swapout, fork, etc.) and set SWAP_HAS_CACHE flag (for
> swapin, etc.) for a huge swap cluster in swap_duplicate() function
> family. Although it only implements a part of the design of the swap
> reference count with PMD swap mapping, the whole design is described
> as follow to make it easy to understand the patch and the whole
> picture.
>
> A huge swap cluster is used to hold the contents of a swapouted THP.
> After swapout, a PMD page mapping to the THP will become a PMD
> swap mapping to the huge swap cluster via a swap entry in PMD. While
> a PTE page mapping to a subpage of the THP will become the PTE swap
> mapping to a swap slot in the huge swap cluster via a swap entry in
> PTE.
>
> If there is no PMD swap mapping and the corresponding THP is removed
> from the page cache (reclaimed), the huge swap cluster will be split
> and become a normal swap cluster.
>
> The count (cluster_count()) of the huge swap cluster is
> SWAPFILE_CLUSTER (= HPAGE_PMD_NR) + PMD swap mapping count. Because
> all swap slots in the huge swap cluster are mapped by PTE or PMD, or
> has SWAP_HAS_CACHE bit set, the usage count of the swap cluster is
> HPAGE_PMD_NR. And the PMD swap mapping count is recorded too to make
> it easy to determine whether there are remaining PMD swap mappings.
>
> The count in swap_map[offset] is the sum of PTE and PMD swap mapping
> count. This means when we increase the PMD swap mapping count, we
> need to increase swap_map[offset] for all swap slots inside the swap
> cluster. An alternative choice is to make swap_map[offset] to record
> PTE swap map count only, given we have recorded PMD swap mapping count
> in the count of the huge swap cluster. But this need to increase
> swap_map[offset] when splitting the PMD swap mapping, that may fail
> because of memory allocation for swap count continuation. That is
> hard to dealt with. So we choose current solution.
>
> The PMD swap mapping to a huge swap cluster may be split when unmap a
> part of PMD mapping etc. That is easy because only the count of the
> huge swap cluster need to be changed. When the last PMD swap mapping
> is gone and SWAP_HAS_CACHE is unset, we will split the huge swap
> cluster (clear the huge flag). This makes it easy to reason the
> cluster state.
>
> A huge swap cluster will be split when splitting the THP in swap
> cache, or failing to allocate THP during swapin, etc. But when
> splitting the huge swap cluster, we will not try to split all PMD swap
> mappings, because we haven't enough information available for that
> sometimes. Later, when the PMD swap mapping is duplicated or swapin,
> etc, the PMD swap mapping will be split and fallback to the PTE
> operation.
>
> When a THP is added into swap cache, the SWAP_HAS_CACHE flag will be
> set in the swap_map[offset] of all swap slots inside the huge swap
> cluster backing the THP. This huge swap cluster will not be split
> unless the THP is split even if its PMD swap mapping count dropped to
> 0. Later, when the THP is removed from swap cache, the SWAP_HAS_CACHE
> flag will be cleared in the swap_map[offset] of all swap slots inside
> the huge swap cluster. And this huge swap cluster will be split if
> its PMD swap mapping count is 0.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Zi Yan <zi.yan@cs.rutgers.edu>
> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> ---
> include/linux/huge_mm.h | 5 +
> include/linux/swap.h | 9 +-
> mm/memory.c | 2 +-
> mm/rmap.c | 2 +-
> mm/swap_state.c | 2 +-
> mm/swapfile.c | 287 +++++++++++++++++++++++++++++++++---------------
> 6 files changed, 214 insertions(+), 93 deletions(-)
I'm probably missing some background, but I find the patch hard to
read. Can you disseminate some of this patch changelog into kernel-doc
commentary so it's easier to follow which helpers do what relative to
THP swap.
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index d3bbf6bea9e9..213d32e57c39 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -80,6 +80,11 @@ extern struct kobj_attribute shmem_enabled_attr;
> #define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
> #define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
>
> +static inline bool thp_swap_supported(void)
> +{
> + return IS_ENABLED(CONFIG_THP_SWAP);
> +}
> +
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> #define HPAGE_PMD_SHIFT PMD_SHIFT
> #define HPAGE_PMD_SIZE ((1UL) << HPAGE_PMD_SHIFT)
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index f73eafcaf4e9..57aa655ab27d 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -451,8 +451,8 @@ extern swp_entry_t get_swap_page_of_type(int);
> extern int get_swap_pages(int n, bool cluster, swp_entry_t swp_entries[]);
> extern int add_swap_count_continuation(swp_entry_t, gfp_t);
> extern void swap_shmem_alloc(swp_entry_t);
> -extern int swap_duplicate(swp_entry_t);
> -extern int swapcache_prepare(swp_entry_t);
> +extern int swap_duplicate(swp_entry_t *entry, bool cluster);
This patch introduces a new flag to swap_duplicate(), but then all all
usages still pass 'false' so why does this patch change the argument.
Seems this change belongs to another patch?
> +extern int swapcache_prepare(swp_entry_t entry, bool cluster);
Rather than add a cluster flag to these helpers can the swp_entry_t
carry the cluster flag directly?
next prev parent reply other threads:[~2018-07-07 23:23 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-22 3:51 [PATCH -mm -v4 00/21] mm, THP, swap: Swapout/swapin THP in one piece Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 01/21] mm, THP, swap: Enable PMD swap operations for CONFIG_THP_SWAP Huang, Ying
2018-07-07 21:11 ` Dan Williams
2018-07-09 5:40 ` Huang, Ying
2018-07-09 6:08 ` Dan Williams
2018-07-09 6:34 ` Huang, Ying
2018-07-09 15:59 ` Dave Hansen
2018-07-10 1:08 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 02/21] mm, THP, swap: Make CONFIG_THP_SWAP depends on CONFIG_SWAP Huang, Ying
2018-07-07 21:12 ` Dan Williams
2018-07-09 6:34 ` Huang, Ying
2018-07-09 16:00 ` Dave Hansen
2018-07-10 1:19 ` Huang, Ying
2018-07-10 1:59 ` Dave Hansen
2018-07-10 5:26 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate() Huang, Ying
2018-06-29 6:04 ` Matthew Wilcox
2018-07-02 5:19 ` Huang, Ying
2018-07-07 23:22 ` Dan Williams [this message]
2018-07-09 7:38 ` Huang, Ying
2018-07-09 16:51 ` Dave Hansen
2018-07-10 6:44 ` Huang, Ying
2018-07-10 13:50 ` Dave Hansen
2018-07-11 0:59 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 04/21] mm, THP, swap: Support PMD swap mapping in swapcache_free_cluster() Huang, Ying
2018-07-09 17:11 ` Dave Hansen
2018-07-10 6:53 ` Huang, Ying
2018-07-10 13:54 ` Dave Hansen
2018-07-11 1:08 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 05/21] mm, THP, swap: Support PMD swap mapping in free_swap_and_cache()/swap_free() Huang, Ying
2018-07-05 18:33 ` Daniel Jordan
2018-07-06 12:49 ` Huang, Ying
2018-07-09 17:19 ` Dave Hansen
2018-07-10 7:13 ` Huang, Ying
2018-07-10 14:07 ` Dave Hansen
2018-07-11 1:28 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 06/21] mm, THP, swap: Support PMD swap mapping when splitting huge PMD Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 07/21] mm, THP, swap: Support PMD swap mapping in split_swap_cluster() Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 08/21] mm, THP, swap: Support to read a huge swap cluster for swapin a THP Huang, Ying
2018-06-29 6:21 ` Matthew Wilcox
2018-07-02 6:02 ` Huang, Ying
2018-07-04 0:12 ` Daniel Jordan
2018-07-04 2:24 ` Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 09/21] mm, THP, swap: Swapin a THP as a whole Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 10/21] mm, THP, swap: Support to count THP swapin and its fallback Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 11/21] mm, THP, swap: Add sysfs interface to configure THP swapin Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 12/21] mm, THP, swap: Support PMD swap mapping in swapoff Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 13/21] mm, THP, swap: Support PMD swap mapping in madvise_free() Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 14/21] mm, cgroup, THP, swap: Support to move swap account for PMD swap mapping Huang, Ying
2018-07-09 17:20 ` Daniel Jordan
2018-07-10 7:49 ` Huang, Ying
2018-07-10 22:49 ` Daniel Jordan
2018-06-22 3:51 ` [PATCH -mm -v4 15/21] mm, THP, swap: Support to copy PMD swap mapping when fork() Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 16/21] mm, THP, swap: Free PMD swap mapping when zap_huge_pmd() Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 17/21] mm, THP, swap: Support PMD swap mapping for MADV_WILLNEED Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 18/21] mm, THP, swap: Support PMD swap mapping in mincore() Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 19/21] mm, THP, swap: Support PMD swap mapping in common path Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 20/21] mm, THP, swap: create PMD swap mapping when unmap the THP Huang, Ying
2018-06-22 3:51 ` [PATCH -mm -v4 21/21] mm, THP: Avoid to split THP when reclaim MADV_FREE THP Huang, Ying
2018-06-28 4:51 ` [PATCH -mm -v4 00/21] mm, THP, swap: Swapout/swapin THP in one piece Andrew Morton
2018-06-28 5:29 ` Huang, Ying
2018-06-28 5:31 ` Andrew Morton
2018-06-28 5:35 ` Huang, Ying
2018-06-28 6:18 ` Andrew Morton
2018-06-28 9:03 ` Matthew Wilcox
2018-06-29 1:17 ` Huang, Ying
2018-06-29 5:57 ` Matthew Wilcox
2018-07-02 5:19 ` Huang, Ying
2018-07-04 2:11 ` Sergey Senozhatsky
2018-07-04 2:20 ` Huang, Ying
2018-07-04 2:27 ` Sergey Senozhatsky
2018-07-04 2:59 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAA9_cmc2YteXBhrLOFN0rAZ4UFDRPcXaE1OPNv06P+Fu9e+zeA@mail.gmail.com \
--to=dan.j.williams@gmail.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=daniel.m.jordan@oracle.com \
--cc=dave.hansen@linux.intel.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=minchan@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=riel@redhat.com \
--cc=shli@kernel.org \
--cc=ying.huang@intel.com \
--cc=zi.yan@cs.rutgers.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).