From: Catalin Marinas <catalin.marinas@arm.com>
To: Barry Song <21cnbao@gmail.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	"Will Deacon" <will@kernel.org>, Linux-MM <linux-mm@kvack.org>,
	LAK <linux-arm-kernel@lists.infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	hanchuanhua <hanchuanhua@oppo.com>,
	"张诗明(Simon Zhang)" <zhangshiming@oppo.com>, 郭健 <guojian@oppo.com>,
	"Barry Song" <v-songbaohua@oppo.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	"Minchan Kim" <minchan@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Hugh Dickins" <hughd@google.com>, "Shaohua Li" <shli@kernel.org>,
	"Rik van Riel" <riel@redhat.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Steven Price" <steven.price@arm.com>
Subject: Re: [PATCH] arm64: enable THP_SWAP for arm64
Date: Tue, 24 May 2022 20:14:04 +0100	[thread overview]
Message-ID: <Yo0ufMHXPL5mJ5t6@arm.com> (raw)
In-Reply-To: <CAGsJ_4xPFkc6Kn2G5pPPk8XJ4iZV=atzan=Quq6Ljc_5vr1fnA@mail.gmail.com>

On Tue, May 24, 2022 at 10:05:35PM +1200, Barry Song wrote:
> On Tue, May 24, 2022 at 8:12 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, May 24, 2022 at 07:14:03PM +1200, Barry Song wrote:
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index d550f5acfaf3..8e3771c56fbf 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -98,6 +98,7 @@ config ARM64
> > >       select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
> > >       select ARCH_WANT_LD_ORPHAN_WARN
> > >       select ARCH_WANTS_NO_INSTR
> > > +     select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES
> >
> > I'm not opposed to this but I think it would break pages mapped with
> > PROT_MTE. We have an assumption in mte_sync_tags() that compound pages
> > are not swapped out (or in). With MTE, we store the tags in a slab
> 
> I assume you mean mte_sync_tags() requires that THP is not swapped as a whole,
> as without THP_SWP, THP is still swapped after being split. MTE doesn't stop
> THP from being swapped as a number of split pages, does it?

That's correct, split THP pages are swapped out/in just fine.

> > object (128-bytes per swapped page) and restore them when pages are
> > swapped in. At some point we may teach the core swap code about such
> > metadata but in the meantime that was the easiest way.
> 
> If my previous assumption is true, the easiest way to enable THP_SWP
> for the moment might be to always let mm fall back to the splitting
> path on MTE hardware. For now, I care about THP_SWP more as
> none of my hardware has MTE.
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 45c358538f13..d55a2a3e41a9 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -44,6 +44,8 @@
>         __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> 
> +#define arch_thp_swp_supported !system_supports_mte
> +
>  /*
>   * Outside of a few very special situations (e.g. hibernation), we always
>   * use broadcast TLB invalidation instructions, therefore a spurious page
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 2999190adc22..064b6b03df9e 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -447,4 +447,16 @@ static inline int split_folio_to_list(struct folio *folio,
>         return split_huge_page_to_list(&folio->page, list);
>  }
> 
> +/*
> + * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to
> + * limitations in the implementation like arm64 MTE can override this to
> + * false
> + */
> +#ifndef arch_thp_swp_supported
> +static inline bool arch_thp_swp_supported(void)
> +{
> +       return true;
> +}
> +#endif
> +
>  #endif /* _LINUX_HUGE_MM_H */
> diff --git a/mm/swap_slots.c b/mm/swap_slots.c
> index 2b5531840583..dde685836328 100644
> --- a/mm/swap_slots.c
> +++ b/mm/swap_slots.c
> @@ -309,7 +309,7 @@ swp_entry_t get_swap_page(struct page *page)
>         entry.val = 0;
> 
>         if (PageTransHuge(page)) {
> -               if (IS_ENABLED(CONFIG_THP_SWAP))
> +               if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported())
>                         get_swap_pages(1, &entry, HPAGE_PMD_NR);
>                 goto out;

I think this should work and with your other proposal it would be
limited to MTE pages:

#define arch_thp_swp_supported(page)	(!test_bit(PG_mte_tagged, &page->flags))

Are THP pages loaded from swap as a whole or are they split? IIRC the
splitting still happens but after the swapping out finishes. Even if
they are loaded as 4K pages, we still have the mte_save_tags() that only
understands small pages currently, so rejecting THP pages is probably
best.

-- 
Catalin


Thread overview: 24+ messages

2022-05-24  7:14 [PATCH] arm64: enable THP_SWAP for arm64 Barry Song
2022-05-24  8:12 ` Catalin Marinas
2022-05-24 10:05   ` Barry Song
2022-05-24 11:15     ` Barry Song
2022-05-26  8:13       ` Anshuman Khandual
2022-05-24 19:14     ` Catalin Marinas [this message]
2022-05-25 11:10       ` Barry Song
2022-05-25 16:54         ` Catalin Marinas
2022-05-25 17:49         ` Yang Shi
2022-05-26  9:19           ` Barry Song
2022-05-26 17:02             ` Yang Shi
2022-05-27  7:29               ` Barry Song
