* [PATCH/RFC v2 0/3] tlb: mmu_gather: use batched table free if possible
From: Nikita Yushchenko @ 2021-12-18 18:52 UTC
To: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Peter Zijlstra, Catalin Marinas, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Arnd Bergmann, Sam Ravnborg
Cc: x86, linux-kernel, linux-arch, linux-mm, kernel
In mmu_gather code, the final table free in __tlb_remove_table_free()
executes a loop, calling the arch hook __tlb_remove_table() to free each
table individually.
Several architectures use free_page_and_swap_cache() as their
__tlb_remove_table() implementation. Calling that in a loop results in an
individual put_page() call for each page being freed.
This patchset refactors the code to issue a single release_pages() call
in this case. This is expected to have better performance, especially when
memcg accounting is enabled.
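A toy userspace sketch of the difference between the looped and the
batched free, not kernel code. The struct toy_page type, the
accounting_rounds counter and all toy_* names are hypothetical; the
counter only stands in for whatever per-call bookkeeping a batched call
can amortise, and the real put_page() and release_pages() do considerably
more than this.

```c
#include <assert.h>

/* Toy stand-in for struct page: only a reference count. */
struct toy_page {
	int refcount;
};

/* Counts how often the (hypothetical) bookkeeping step ran. */
static int accounting_rounds;

/* Model of put_page(): drop one reference, do bookkeeping per page. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		accounting_rounds++;	/* one round per freed page */
}

/* Model of release_pages(): drop all references, do bookkeeping once. */
static void toy_release_pages(struct toy_page **pages, int nr)
{
	int i, freed = 0;

	for (i = 0; i < nr; i++) {
		assert(pages[i]->refcount > 0);
		if (--pages[i]->refcount == 0)
			freed++;
	}
	if (freed)
		accounting_rounds++;	/* one round for the whole batch */
}

/* Free nr tables one by one; returns the number of rounds used. */
int toy_free_looped(struct toy_page **pages, int nr)
{
	int i;

	accounting_rounds = 0;
	for (i = 0; i < nr; i++)
		toy_put_page(pages[i]);
	return accounting_rounds;
}

/* Free nr tables with one batched call; returns the number of rounds used. */
int toy_free_batched(struct toy_page **pages, int nr)
{
	accounting_rounds = 0;
	toy_release_pages(pages, nr);
	return accounting_rounds;
}
```

In this model, freeing N single-reference pages costs N bookkeeping
rounds in the looped variant and one round in the batched variant.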
Nikita Yushchenko (3):
tlb: mmu_gather: introduce CONFIG_MMU_GATHER_TABLE_FREE_COMMON
mm/swap: introduce free_pages_and_swap_cache_nolru()
tlb: mmu_gather: use batched table free if possible
arch/Kconfig | 3 +++
arch/arm/Kconfig | 1 +
arch/arm/include/asm/tlb.h | 5 -----
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/tlb.h | 5 -----
arch/x86/Kconfig | 1 +
arch/x86/include/asm/tlb.h | 14 --------------
include/asm-generic/tlb.h | 5 +++++
include/linux/swap.h | 5 ++++-
mm/mmu_gather.c | 25 ++++++++++++++++++++++---
mm/swap_state.c | 29 ++++++++++++++++++++++-------
11 files changed, 59 insertions(+), 35 deletions(-)
--
2.30.2
* [PATCH/RFC v2 1/3] tlb: mmu_gather: introduce CONFIG_MMU_GATHER_TABLE_FREE_COMMON
From: Nikita Yushchenko @ 2021-12-18 18:52 UTC
To: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Peter Zijlstra, Catalin Marinas, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Arnd Bergmann, Sam Ravnborg
Cc: x86, linux-kernel, linux-arch, linux-mm, kernel
For architectures that use free_page_and_swap_cache() as their
__tlb_remove_table(), place that common implementation into
mm/mmu_gather.c, ifdef'ed by CONFIG_MMU_GATHER_TABLE_FREE_COMMON.
Signed-off-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
---
arch/Kconfig | 3 +++
arch/arm/Kconfig | 1 +
arch/arm/include/asm/tlb.h | 5 -----
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/tlb.h | 5 -----
arch/x86/Kconfig | 1 +
arch/x86/include/asm/tlb.h | 14 --------------
include/asm-generic/tlb.h | 5 +++++
mm/mmu_gather.c | 10 ++++++++++
9 files changed, 21 insertions(+), 24 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index d3c4ab249e9c..9eba553cd86f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -415,6 +415,9 @@ config HAVE_ARCH_JUMP_LABEL_RELATIVE
config MMU_GATHER_TABLE_FREE
bool
+config MMU_GATHER_TABLE_FREE_COMMON
+ bool
+
config MMU_GATHER_RCU_TABLE_FREE
bool
select MMU_GATHER_TABLE_FREE
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c2724d986fa0..cc272e1ad12c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -110,6 +110,7 @@ config ARM
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select MMU_GATHER_RCU_TABLE_FREE if SMP && ARM_LPAE
+ select MMU_GATHER_TABLE_FREE_COMMON
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index b8cbe03ad260..9d9b21649ca0 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -29,11 +29,6 @@
#include <linux/swap.h>
#include <asm/tlbflush.h>
-static inline void __tlb_remove_table(void *_table)
-{
- free_page_and_swap_cache((struct page *)_table);
-}
-
#include <asm-generic/tlb.h>
static inline void
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c4207cf9bb17..0f99f30d99f6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -196,6 +196,7 @@ config ARM64
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUTEX_CMPXCHG if FUTEX
select MMU_GATHER_RCU_TABLE_FREE
+ select MMU_GATHER_TABLE_FREE_COMMON
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index c995d1f4594f..401826260a5c 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -11,11 +11,6 @@
#include <linux/pagemap.h>
#include <linux/swap.h>
-static inline void __tlb_remove_table(void *_table)
-{
- free_page_and_swap_cache((struct page *)_table);
-}
-
#define tlb_flush tlb_flush
static void tlb_flush(struct mmu_gather *tlb);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5c2ccb85f2ef..379d6832d3de 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -235,6 +235,7 @@ config X86
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select MMU_GATHER_RCU_TABLE_FREE if PARAVIRT
+ select MMU_GATHER_TABLE_FREE_COMMON
select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if X86_64 && (UNWINDER_FRAME_POINTER || UNWINDER_ORC) && STACK_VALIDATION
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 1bfe979bb9bc..96e3b4f922c9 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -23,18 +23,4 @@ static inline void tlb_flush(struct mmu_gather *tlb)
flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables);
}
-/*
- * While x86 architecture in general requires an IPI to perform TLB
- * shootdown, enablement code for several hypervisors overrides
- * .flush_tlb_others hook in pv_mmu_ops and implements it by issuing
- * a hypercall. To keep software pagetable walkers safe in this case we
- * switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the comment
- * below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
- * for more details.
- */
-static inline void __tlb_remove_table(void *table)
-{
- free_page_and_swap_cache(table);
-}
-
#endif /* _ASM_X86_TLB_H */
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 2c68a545ffa7..877431da21cf 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -158,6 +158,11 @@
* Useful if your architecture doesn't use IPIs for remote TLB invalidates
* and therefore doesn't naturally serialize with software page-table walkers.
*
+ * MMU_GATHER_TABLE_FREE_COMMON
+ *
+ * Provide default implementation of __tlb_remove_table() based on
+ * free_page_and_swap_cache().
+ *
* MMU_GATHER_NO_RANGE
*
* Use this if your architecture lacks an efficient flush_tlb_range().
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 1b9837419bf9..eb2f30a92462 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -93,6 +93,13 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_
#ifdef CONFIG_MMU_GATHER_TABLE_FREE
+#ifdef CONFIG_MMU_GATHER_TABLE_FREE_COMMON
+static inline void __tlb_remove_table(void *table)
+{
+ free_page_and_swap_cache((struct page *)table);
+}
+#endif
+
static void __tlb_remove_table_free(struct mmu_table_batch *batch)
{
int i;
@@ -132,6 +139,9 @@ static void __tlb_remove_table_free(struct mmu_table_batch *batch)
* pressure. To guarantee progress we fall back to single table freeing, see
* the implementation of tlb_remove_table_one().
*
+ * This is also used to keep software pagetable walkers safe when architecture
+ * natively uses IPIs for TLB flushes, but hypervisor enablement code replaced
+ * that by issuing a hypercall.
*/
static void tlb_remove_table_smp_sync(void *arg)
--
2.30.2
* [PATCH/RFC v2 2/3] mm/swap: introduce free_pages_and_swap_cache_nolru()
From: Nikita Yushchenko @ 2021-12-18 18:52 UTC
To: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Peter Zijlstra, Catalin Marinas, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Arnd Bergmann, Sam Ravnborg
Cc: x86, linux-kernel, linux-arch, linux-mm, kernel
This is a variant of free_pages_and_swap_cache() that does not call
lru_add_drain(), for better performance when the passed pages are
guaranteed not to be on the LRU.
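The shape of the change, sketched as a toy userspace model rather than
kernel code: one shared helper takes a flag, and two thin wrappers fix
that flag. All toy_* names and the two counters are hypothetical and
exist only to make the control flow observable.

```c
#include <assert.h>
#include <stdbool.h>

static int drain_calls;	/* times the modelled LRU drain ran */
static int released;	/* total number of pages released */

/* Stand-in for lru_add_drain(). */
static void toy_lru_add_drain(void)
{
	drain_calls++;
}

/* Stand-in for release_pages(); only counts the pages. */
static void toy_release(int nr)
{
	released += nr;
}

/* Shared helper: drain only when the caller asks for it. */
static void toy_free_common(int nr, bool do_lru)
{
	assert(nr >= 0);
	if (do_lru)
		toy_lru_add_drain();
	toy_release(nr);
}

/* Existing behaviour: always drain first. */
void toy_free_pages(int nr)
{
	toy_free_common(nr, true);
}

/* New variant: the caller guarantees the pages are not on the LRU. */
void toy_free_pages_nolru(int nr)
{
	toy_free_common(nr, false);
}
```

Both wrappers release the same pages; only the first pays for the drain.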
Signed-off-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
---
include/linux/swap.h | 5 ++++-
mm/swap_state.c | 29 ++++++++++++++++++++++-------
2 files changed, 26 insertions(+), 8 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index d1ea44b31f19..86a1b0a61889 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -460,6 +460,7 @@ extern void clear_shadow_from_swap_cache(int type, unsigned long begin,
extern void free_swap_cache(struct page *);
extern void free_page_and_swap_cache(struct page *);
extern void free_pages_and_swap_cache(struct page **, int);
+extern void free_pages_and_swap_cache_nolru(struct page **, int);
extern struct page *lookup_swap_cache(swp_entry_t entry,
struct vm_area_struct *vma,
unsigned long addr);
@@ -565,7 +566,9 @@ static inline struct address_space *swap_address_space(swp_entry_t entry)
#define free_page_and_swap_cache(page) \
put_page(page)
#define free_pages_and_swap_cache(pages, nr) \
- release_pages((pages), (nr));
+ release_pages((pages), (nr))
+#define free_pages_and_swap_cache_nolru(pages, nr) \
+ release_pages((pages), (nr))
static inline void free_swap_cache(struct page *page)
{
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 8d4104242100..a5d9fd258f0a 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -307,17 +307,32 @@ void free_page_and_swap_cache(struct page *page)
/*
* Passed an array of pages, drop them all from swapcache and then release
- * them. They are removed from the LRU and freed if this is their last use.
+ * them. They are optionally removed from the LRU and freed if this is their
+ * last use.
*/
-void free_pages_and_swap_cache(struct page **pages, int nr)
+static void __free_pages_and_swap_cache(struct page **pages, int nr,
+ bool do_lru)
{
- struct page **pagep = pages;
int i;
- lru_add_drain();
- for (i = 0; i < nr; i++)
- free_swap_cache(pagep[i]);
- release_pages(pagep, nr);
+ if (do_lru)
+ lru_add_drain();
+ for (i = 0; i < nr; i++) {
+ if (!do_lru)
+ VM_WARN_ON_ONCE_PAGE(PageLRU(pages[i]), pages[i]);
+ free_swap_cache(pages[i]);
+ }
+ release_pages(pages, nr);
+}
+
+void free_pages_and_swap_cache(struct page **pages, int nr)
+{
+ __free_pages_and_swap_cache(pages, nr, true);
+}
+
+void free_pages_and_swap_cache_nolru(struct page **pages, int nr)
+{
+ __free_pages_and_swap_cache(pages, nr, false);
}
static inline bool swap_use_vma_readahead(void)
--
2.30.2
* [PATCH/RFC v2 3/3] tlb: mmu_gather: use batched table free if possible
From: Nikita Yushchenko @ 2021-12-18 18:52 UTC
To: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Peter Zijlstra, Catalin Marinas, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Arnd Bergmann, Sam Ravnborg
Cc: x86, linux-kernel, linux-arch, linux-mm, kernel
When __tlb_remove_table() is implemented via
free_page_and_swap_cache(), use free_pages_and_swap_cache_nolru() for
batch table removal.
This allows a single release_pages() call instead of a loop calling
put_page(). This should perform better, especially when memcg
accounting is enabled.
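The compile-time selection can be pictured with the following toy
userspace sketch, which is not kernel code. The TOY_TABLES_ARE_PAGES
macro plays the role of the config option, and every toy_* name and
counter is hypothetical.

```c
#include <assert.h>

/* Hypothetical switch; remove this line to build the fallback path. */
#define TOY_TABLES_ARE_PAGES 1

static int single_calls;	/* times the per-table hook ran */
static int bulk_calls;		/* times the bulk path ran */
static int tables_freed;	/* total tables freed by either path */

/* Per-table hook, still needed for single-table freeing. */
void toy_remove_table(void *table)
{
	assert(table);
	single_calls++;
	tables_freed++;
}

#ifdef TOY_TABLES_ARE_PAGES
/* Tables are plain pages: hand the whole array over in one call. */
void toy_remove_tables(void **tables, int nr)
{
	assert(tables || nr == 0);
	bulk_calls++;
	tables_freed += nr;
}
#else
/* Generic fallback: call the per-table hook for every entry. */
void toy_remove_tables(void **tables, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		toy_remove_table(tables[i]);
}
#endif

/* The batch-free path no longer cares which variant was built. */
void toy_table_batch_free(void **tables, int nr)
{
	toy_remove_tables(tables, nr);
}
```

Either way the caller sees the same interface; only the number of calls
made underneath differs.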
Signed-off-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
---
mm/mmu_gather.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index eb2f30a92462..2e75d396bbad 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -98,15 +98,24 @@ static inline void __tlb_remove_table(void *table)
{
free_page_and_swap_cache((struct page *)table);
}
-#endif
-static void __tlb_remove_table_free(struct mmu_table_batch *batch)
+static inline void __tlb_remove_tables(void **tables, int nr)
+{
+ free_pages_and_swap_cache_nolru((struct page **)tables, nr);
+}
+#else
+static inline void __tlb_remove_tables(void **tables, int nr)
{
int i;
- for (i = 0; i < batch->nr; i++)
- __tlb_remove_table(batch->tables[i]);
+ for (i = 0; i < nr; i++)
+ __tlb_remove_table(tables[i]);
+}
+#endif
+static void __tlb_remove_table_free(struct mmu_table_batch *batch)
+{
+ __tlb_remove_tables(batch->tables, batch->nr);
free_page((unsigned long)batch);
}
--
2.30.2
* Re: [PATCH/RFC v2 1/3] tlb: mmu_gather: introduce CONFIG_MMU_GATHER_TABLE_FREE_COMMON
From: Peter Zijlstra @ 2021-12-21 13:21 UTC
To: Nikita Yushchenko
Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Arnd Bergmann, Sam Ravnborg, x86, linux-kernel,
linux-arch, linux-mm, kernel
On Sat, Dec 18, 2021 at 09:52:04PM +0300, Nikita Yushchenko wrote:
> For architectures that use free_page_and_swap_cache() as their
> __tlb_remove_table(), place that common implementation into
> mm/mmu_gather.c, ifdef'ed by CONFIG_MMU_GATHER_TABLE_FREE_COMMON.
>
> Signed-off-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
> ---
> arch/Kconfig | 3 +++
> arch/arm/Kconfig | 1 +
> arch/arm/include/asm/tlb.h | 5 -----
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/tlb.h | 5 -----
> arch/x86/Kconfig | 1 +
> arch/x86/include/asm/tlb.h | 14 --------------
> include/asm-generic/tlb.h | 5 +++++
> mm/mmu_gather.c | 10 ++++++++++
> 9 files changed, 21 insertions(+), 24 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index d3c4ab249e9c..9eba553cd86f 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -415,6 +415,9 @@ config HAVE_ARCH_JUMP_LABEL_RELATIVE
> config MMU_GATHER_TABLE_FREE
> bool
>
> +config MMU_GATHER_TABLE_FREE_COMMON
> + bool
I don't like that name... The point isn't that it's common, the point is
that the page-tables are backed by pages.
* Re: [PATCH/RFC v2 1/3] tlb: mmu_gather: introduce CONFIG_MMU_GATHER_TABLE_FREE_COMMON
From: Nikita Yushchenko @ 2021-12-21 15:42 UTC
To: Peter Zijlstra
Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nick Piggin,
Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Arnd Bergmann, Sam Ravnborg, x86, linux-kernel,
linux-arch, linux-mm, kernel
>> +config MMU_GATHER_TABLE_FREE_COMMON
>> + bool
>
> I don't like that name... The point isn't that it's common, the point is
> that the page-tables are backed by pages.
What is a better name?
MMU_GATHER_TABLE_FREE_PAGES?
MMU_GATHER_TABLE_FREE_PAGES_BACKED?
Nikita