linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages
@ 2022-08-25 10:10 Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 1/7] mm: use ptep_clear() in non-present cases Qi Zheng
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

Hi,

Previously, in order to free empty user PTE page table pages, I posted the
following patch sets implementing two solutions:
 - atomic refcount version:
	https://lore.kernel.org/lkml/20211110105428.32458-1-zhengqi.arch@bytedance.com/
 - percpu refcount version:
	https://lore.kernel.org/lkml/20220429133552.33768-1-zhengqi.arch@bytedance.com/

Both patch sets have the following behavior:
a. Protect page table walkers by hooking pte_offset_map{_lock}() and
   pte_unmap{_unlock}()
b. Automatically reclaim PTE page table pages in non-reclaim paths

For behavior a, there may be the following disadvantages mentioned by
David Hildenbrand:
 - It introduces a lot of complexity. It's not something easy to get in and most
   probably not easy to get out again
 - It is inconvenient to extend to other architectures. For example, for the
   continuous ptes of arm64, the pointer to the PTE entry is obtained directly
   through pte_offset_kernel() instead of pte_offset_map{_lock}()
 - It has been found that pte_unmap() is missing in some places that only
   execute on 64-bit systems, which is a disaster for pte_refcount

For behavior b, it may not be necessary to actively reclaim PTE pages, especially
when memory pressure is not high, and deferring to the reclaim path may be a
better choice.

In addition, the above two solutions only handle empty PTE pages (a PTE page
where all entries are empty), and do not deal with the zero PTE page (a PTE
page where all page table entries are mapped to the shared zero page) mentioned
by David Hildenbrand:
	"Especially the shared zeropage is nasty, because there are
	 sane use cases that can trigger it. Assume you have a VM
	 (e.g., QEMU) that inflated the balloon to return free memory
	 to the hypervisor.

	 Simply migrating that VM will populate the shared zeropage to
	 all inflated pages, because migration code ends up reading all
	 VM memory. Similarly, the guest can just read that memory as
	 well, for example, when the guest issues kdump itself."

The purpose of this RFC patch set is to continue the discussion and address the
above issues. The following is the solution to be discussed.

In order to quickly identify the above two types of PTE pages, we still
introduce a pte_refcount for each PTE page. We put the mapped and zero PTE
entry counters into the pte_refcount of the PTE page. The bitmask has the
following meaning:

 - bits 0-9 are mapped PTE entry count
 - bits 10-19 are zero PTE entry count

In this way, when the mapped PTE entry count is 0, we know that the current
PTE page is an empty PTE page, and when the zero PTE entry count is
PTRS_PER_PTE, we know that the current PTE page is a zero PTE page.
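
As a rough illustration (the real macros are added in [RFC PATCH 4/7];
pte_page_is_empty()/pte_page_is_zero() below are just illustrative names,
not helpers from this series), the checks boil down to:

	/* Sketch only; the actual encoding lives in mm/pte_ref.c. */
	#define PTE_MAPPED_MASK		0x3ffUL		/* bits 0-9  */
	#define PTE_ZERO_SHIFT		10		/* bits 10-19 */

	static inline bool pte_page_is_empty(struct page *page)
	{
		return !(page->pte_refcount & PTE_MAPPED_MASK);
	}

	static inline bool pte_page_is_zero(struct page *page)
	{
		return (page->pte_refcount >> PTE_ZERO_SHIFT) == PTRS_PER_PTE;
	}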

We only update pte_refcount when setting and clearing PTE entries, and since
both operations are protected by the pte lock, pte_refcount can be a
non-atomic variable with little performance overhead.

For page table walkers, we ensure mutual exclusion by holding the write lock
of mmap_lock when doing pmd_clear() (in the newly added path to reclaim PTE
pages).

[RFC PATCH 7/7] is an example of reclaiming the empty and zero PTE pages of a
process. But the best time to reclaim should be in the reclaim path, such as
just before waking up the oom killer, when the system cannot reclaim any more
memory. Compared with killing a process, it is more acceptable to hold the
write lock of mmap_lock to reclaim memory by releasing empty and zero PTE
pages.

My idea is to count the number of bytes (mm->reclaimable_pt_bytes, similar to
mm->pgtables_bytes) of reclaimable PTE pages (both empty and zero PTE pages)
in each mm, and maintain an rbtree with mm->reclaimable_pt_bytes as the key;
then we can pick the mm with the largest mm->reclaimable_pt_bytes to reclaim
in the reclaim path, as sketched below.
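
A minimal sketch of that idea (not part of this series; the structure and
helper below are hypothetical) could look like:

	/* Hypothetical: one node per mm, keyed by mm->reclaimable_pt_bytes. */
	struct pt_reclaim_node {
		struct rb_node node;
		struct mm_struct *mm;
		unsigned long reclaimable_pt_bytes;	/* cached key */
	};

	static struct rb_root pt_reclaim_root = RB_ROOT;

	/* Pick the mm with the most reclaimable PTE page table memory. */
	static struct mm_struct *pick_mm_to_reclaim(void)
	{
		struct rb_node *last = rb_last(&pt_reclaim_root);

		return last ? rb_entry(last, struct pt_reclaim_node, node)->mm
			    : NULL;
	}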

This series is based on v5.19.

Comments and suggestions are welcome.

Thanks,
Qi

Qi Zheng (7):
  mm: use ptep_clear() in non-present cases
  mm: introduce CONFIG_FREE_USER_PTE
  mm: add pte_to_page() helper
  mm: introduce pte_refcount for user PTE page table page
  pte_ref: add track_pte_{set, clear}() helper
  x86/mm: add x86_64 support for pte_ref
  mm: add proc interface to free user PTE page table pages

 arch/x86/Kconfig               |   1 +
 arch/x86/include/asm/pgtable.h |   4 +
 include/linux/mm.h             |   2 +
 include/linux/mm_types.h       |   1 +
 include/linux/pgtable.h        |  11 +-
 include/linux/pte_ref.h        |  41 ++++++
 kernel/sysctl.c                |  12 ++
 mm/Kconfig                     |  11 ++
 mm/Makefile                    |   2 +-
 mm/memory.c                    |   2 +-
 mm/mprotect.c                  |   2 +-
 mm/pte_ref.c                   | 234 +++++++++++++++++++++++++++++++++
 12 files changed, 319 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/pte_ref.h
 create mode 100644 mm/pte_ref.c

-- 
2.20.1



* [RFC PATCH 1/7] mm: use ptep_clear() in non-present cases
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 2/7] mm: introduce CONFIG_FREE_USER_PTE Qi Zheng
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

After commit 08d5b29eac7d ("mm: ptep_clear() page table helper"),
ptep_clear() can be used to track the clearing of PTE entries, but
it is skipped in some places since the page table check does not
care about non-present PTE entries.

Subsequent patches need to use ptep_clear() to track all clearing of
PTE entries, so this patch makes ptep_clear() be used in all cases,
including the clearing of non-present PTE entries.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pgtable.h | 2 +-
 mm/memory.c             | 2 +-
 mm/mprotect.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 3cdc16cfd867..9745684b0cdb 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -428,7 +428,7 @@ static inline void pte_clear_not_present_full(struct mm_struct *mm,
 					      pte_t *ptep,
 					      int full)
 {
-	pte_clear(mm, address, ptep);
+	ptep_clear(mm, address, ptep);
 }
 #endif
 
diff --git a/mm/memory.c b/mm/memory.c
index 1c6027adc542..207e0ee657e9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3655,7 +3655,7 @@ static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
 	 * none pte.  Otherwise it means the pte could have changed, so retry.
 	 */
 	if (is_pte_marker(*vmf->pte))
-		pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte);
+		ptep_clear(vmf->vma->vm_mm, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return 0;
 }
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ba5592655ee3..1a01bd22a4ed 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -201,7 +201,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 				 * fault will trigger without uffd trapping.
 				 */
 				if (uffd_wp_resolve) {
-					pte_clear(vma->vm_mm, addr, pte);
+					ptep_clear(vma->vm_mm, addr, pte);
 					pages++;
 				}
 				continue;
-- 
2.20.1



* [RFC PATCH 2/7] mm: introduce CONFIG_FREE_USER_PTE
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 1/7] mm: use ptep_clear() in non-present cases Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 3/7] mm: add pte_to_page() helper Qi Zheng
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

This configuration variable will be used to build the code needed to
free user PTE page table pages.

The PTE page table setting and clearing functions (such as set_pte_at())
live in architecture-specific files, and these functions will be hooked to
implement FREE_USER_PTE, so per-architecture support is needed.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/Kconfig | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 169e64192e48..d2a5a24cee2d 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1130,6 +1130,17 @@ config PTE_MARKER_UFFD_WP
 	  purposes.  It is required to enable userfaultfd write protection on
 	  file-backed memory types like shmem and hugetlbfs.
 
+config ARCH_SUPPORTS_FREE_USER_PTE
+	def_bool n
+
+config FREE_USER_PTE
+	bool "Free user PTE page table pages"
+	default y
+	depends on ARCH_SUPPORTS_FREE_USER_PTE && MMU && SMP
+	help
+	  Try to free user PTE page table pages when all of their entries are
+	  none or map the shared zero page.
+
 source "mm/damon/Kconfig"
 
 endmenu
-- 
2.20.1



* [RFC PATCH 3/7] mm: add pte_to_page() helper
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 1/7] mm: use ptep_clear() in non-present cases Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 2/7] mm: introduce CONFIG_FREE_USER_PTE Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 4/7] mm: introduce pte_refcount for user PTE page table page Qi Zheng
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

Add a pte_to_page() helper similar to pmd_to_page(), which
will be used to get the struct page of a PTE page table page.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pgtable.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 9745684b0cdb..c4a6bda6e965 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -86,6 +86,14 @@ static inline unsigned long pud_index(unsigned long address)
 #define pgd_index(a)  (((a) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
 #endif
 
+#ifdef CONFIG_FREE_USER_PTE
+static inline struct page *pte_to_page(pte_t *pte)
+{
+	unsigned long mask = ~(PTRS_PER_PTE * sizeof(pte_t) - 1);
+	return virt_to_page((void *)((unsigned long) pte & mask));
+}
+#endif
+
 #ifndef pte_offset_kernel
 static inline pte_t *pte_offset_kernel(pmd_t *pmd, unsigned long address)
 {
-- 
2.20.1



* [RFC PATCH 4/7] mm: introduce pte_refcount for user PTE page table page
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
                   ` (2 preceding siblings ...)
  2022-08-25 10:10 ` [RFC PATCH 3/7] mm: add pte_to_page() helper Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 5/7] pte_ref: add track_pte_{set, clear}() helper Qi Zheng
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

The following table shows the maximum user page table memory that
can be allocated by a single user process on a 32-bit and a
64-bit system (assuming a 4K page size).

+---------------------------+--------+---------+
|                           | 32-bit | 64-bit  |
+===========================+========+=========+
| user PTE page table pages | 3 MiB  | 512 GiB |
+---------------------------+--------+---------+
| user PMD page table pages | 3 KiB  | 1 GiB   |
+---------------------------+--------+---------+
(for 32-bit, take 3G user address space as an example;
 for 64-bit, take 48-bit address width as an example.)
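
(For the 64-bit column, the arithmetic is: one 4 KiB PTE page maps
512 * 4 KiB = 2 MiB, so 2^48 / 2 MiB = 2^27 PTE pages, which take
2^27 * 4 KiB = 512 GiB; one PMD page maps 512 * 2 MiB = 1 GiB, so
2^48 / 1 GiB = 2^18 PMD pages, which take 2^18 * 4 KiB = 1 GiB.)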

Today, 64-bit servers generally have only a few terabytes of
physical memory, and mapping this memory does not require nearly
as many PTE page tables as above, but the following scenarios can
still cause huge page table memory usage.

1. In order to pursue high performance, applications mostly use
   some high-performance user-mode memory allocators, such as
   jemalloc or tcmalloc. These memory allocators use
   madvise(MADV_DONTNEED or MADV_FREE) to release physical memory,
   but neither MADV_DONTNEED nor MADV_FREE will release page table
   memory, which may cause huge page table memory usage as follows:

		VIRT:  55t
		RES:   590g
		VmPTE: 110g

In this case, most of the page table entries are empty. We call such a
PTE page, where all entries are empty, an empty PTE page.

2. The shared zero page scenario mentioned by David Hildenbrand:

	Especially the shared zeropage is nasty, because there are
	sane use cases that can trigger it. Assume you have a VM
	(e.g., QEMU) that inflated the balloon to return free memory
	to the hypervisor.

	Simply migrating that VM will populate the shared zeropage to
	all inflated pages, because migration code ends up reading all
	VM memory. Similarly, the guest can just read that memory as
	well, for example, when the guest issues kdump itself.

In this case, most of the page table entries are mapped to the shared
zero page. We call such a PTE page, where all page table entries are
mapped to zero pages, a zero PTE page.

The page table entries for both types of PTE pages do not record
"meaningful" information, so we can try to free these PTE pages at
some point (such as when memory pressure is high) to reclaim more
memory.

To quickly identify these two types of pages, we introduce a
pte_refcount for each PTE page. We put the mapped and zero PTE entry
counters into the pte_refcount of the PTE page. The bitmask has the
following meaning:

 - bits 0-9 are mapped PTE entry count
 - bits 10-19 are zero PTE entry count

Because the mapping and unmapping of PTE entries are done under the pte
lock, no concurrent thread can modify pte_refcount, so it can be a
non-atomic variable with little performance overhead.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/mm.h       |  2 ++
 include/linux/mm_types.h |  1 +
 include/linux/pte_ref.h  | 23 +++++++++++++
 mm/Makefile              |  2 +-
 mm/pte_ref.c             | 72 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/pte_ref.h
 create mode 100644 mm/pte_ref.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7898e29bcfb5..23e2f1e75b4b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -28,6 +28,7 @@
 #include <linux/sched.h>
 #include <linux/pgtable.h>
 #include <linux/kasan.h>
+#include <linux/pte_ref.h>
 
 struct mempolicy;
 struct anon_vma;
@@ -2336,6 +2337,7 @@ static inline bool pgtable_pte_page_ctor(struct page *page)
 		return false;
 	__SetPageTable(page);
 	inc_lruvec_page_state(page, NR_PAGETABLE);
+	pte_ref_init(page);
 	return true;
 }
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c29ab4c0cd5c..da2738f87737 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -153,6 +153,7 @@ struct page {
 			union {
 				struct mm_struct *pt_mm; /* x86 pgds only */
 				atomic_t pt_frag_refcount; /* powerpc */
+				unsigned long pte_refcount; /* only for PTE page */
 			};
 #if ALLOC_SPLIT_PTLOCKS
 			spinlock_t *ptl;
diff --git a/include/linux/pte_ref.h b/include/linux/pte_ref.h
new file mode 100644
index 000000000000..db14e03e1dff
--- /dev/null
+++ b/include/linux/pte_ref.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022, ByteDance. All rights reserved.
+ *
+ * 	Author: Qi Zheng <zhengqi.arch@bytedance.com>
+ */
+
+#ifndef _LINUX_PTE_REF_H
+#define _LINUX_PTE_REF_H
+
+#ifdef CONFIG_FREE_USER_PTE
+
+void pte_ref_init(pgtable_t pte);
+
+#else /* !CONFIG_FREE_USER_PTE */
+
+static inline void pte_ref_init(pgtable_t pte)
+{
+}
+
+#endif /* CONFIG_FREE_USER_PTE */
+
+#endif /* _LINUX_PTE_REF_H */
diff --git a/mm/Makefile b/mm/Makefile
index 6f9ffa968a1a..f8fa5078a13d 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -54,7 +54,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   mm_init.o percpu.o slab_common.o \
 			   compaction.o vmacache.o \
 			   interval_tree.o list_lru.o workingset.o \
-			   debug.o gup.o mmap_lock.o $(mmu-y)
+			   debug.o gup.o mmap_lock.o $(mmu-y) pte_ref.o
 
 # Give 'page_alloc' its own module-parameter namespace
 page-alloc-y := page_alloc.o
diff --git a/mm/pte_ref.c b/mm/pte_ref.c
new file mode 100644
index 000000000000..12b27646e88c
--- /dev/null
+++ b/mm/pte_ref.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022, ByteDance. All rights reserved.
+ *
+ * 	Author: Qi Zheng <zhengqi.arch@bytedance.com>
+ */
+#include <linux/pgtable.h>
+#include <linux/pte_ref.h>
+
+#ifdef CONFIG_FREE_USER_PTE
+
+/*
+ * For a PTE page where all entries are empty, we call it empty PTE page. For a
+ * PTE page where all page table entries are mapped to zero pages, we call it
+ * zero PTE page.
+ *
+ * The page table entries for both types of PTE pages do not record "meaningful"
+ * information, so we can try to free these PTE pages at some point (such as
+ * when memory pressure is high) to reclaim more memory.
+ *
+ * We put the mapped and zero PTE entry counter into the pte_refcount of the
+ * PTE page. The bitmask has the following meaning:
+ *
+ * - bits 0-9 are mapped PTE entry count
+ * - bits 10-19 are zero PTE entry count
+ *
+ * Because the mapping and unmapping of PTE entries are under pte_lock, there is
+ * no concurrent thread to modify pte_refcount, so pte_refcount can be a
+ * non-atomic variable with little performance overhead.
+ */
+#define PTE_MAPPED_BITS		10
+#define PTE_ZERO_BITS		10
+
+#define PTE_MAPPED_SHIFT		0
+#define PTE_ZERO_SHIFT		(PTE_MAPPED_SHIFT + PTE_MAPPED_BITS)
+
+#define __PTE_REF_MASK(x)	((1UL << (x))-1)
+
+#define PTE_MAPPED_MASK	(__PTE_REF_MASK(PTE_MAPPED_BITS) << PTE_MAPPED_SHIFT)
+#define PTE_ZERO_MASK	(__PTE_REF_MASK(PTE_ZERO_BITS) << PTE_ZERO_SHIFT)
+
+#define PTE_MAPPED_OFFSET	(1UL << PTE_MAPPED_SHIFT)
+#define PTE_ZERO_OFFSET		(1UL << PTE_ZERO_SHIFT)
+
+static inline unsigned long pte_refcount(pgtable_t pte)
+{
+	return pte->pte_refcount;
+}
+
+#define pte_mapped_count(pte) \
+	((pte_refcount(pte) & PTE_MAPPED_MASK) >> PTE_MAPPED_SHIFT)
+#define pte_zero_count(pte) \
+	((pte_refcount(pte) & PTE_ZERO_MASK) >> PTE_ZERO_SHIFT)
+
+static __always_inline void pte_refcount_add(struct mm_struct *mm,
+					     pgtable_t pte, int val)
+{
+	pte->pte_refcount += val;
+}
+
+static __always_inline void pte_refcount_sub(struct mm_struct *mm,
+					     pgtable_t pte, int val)
+{
+	pte->pte_refcount -= val;
+}
+
+void pte_ref_init(pgtable_t pte)
+{
+	pte->pte_refcount = 0;
+}
+
+#endif /* CONFIG_FREE_USER_PTE */
-- 
2.20.1



* [RFC PATCH 5/7] pte_ref: add track_pte_{set, clear}() helper
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
                   ` (3 preceding siblings ...)
  2022-08-25 10:10 ` [RFC PATCH 4/7] mm: introduce pte_refcount for user PTE page table page Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 6/7] x86/mm: add x86_64 support for pte_ref Qi Zheng
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

track_pte_set() is used to track the setting of PTE page table
entries, and track_pte_clear() is used to track the clearing of
PTE page table entries; we update the pte_refcount of the PTE page
in these two functions.

In this way, the usage of the PTE page table page can be tracked by
its pte_refcount.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pte_ref.h | 13 +++++++++++++
 mm/pte_ref.c            | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/pte_ref.h b/include/linux/pte_ref.h
index db14e03e1dff..ab49c7fac120 100644
--- a/include/linux/pte_ref.h
+++ b/include/linux/pte_ref.h
@@ -12,12 +12,25 @@
 
 void pte_ref_init(pgtable_t pte);
 
+void track_pte_set(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		   pte_t pte);
+void track_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		     pte_t pte);
 #else /* !CONFIG_FREE_USER_PTE */
 
 static inline void pte_ref_init(pgtable_t pte)
 {
 }
 
+static inline void track_pte_set(struct mm_struct *mm, unsigned long addr,
+				 pte_t *ptep, pte_t pte)
+{
+}
+
+static inline void track_pte_clear(struct mm_struct *mm, unsigned long addr,
+				   pte_t *ptep, pte_t pte)
+{
+}
 #endif /* CONFIG_FREE_USER_PTE */
 
 #endif /* _LINUX_PTE_REF_H */
diff --git a/mm/pte_ref.c b/mm/pte_ref.c
index 12b27646e88c..818821d068af 100644
--- a/mm/pte_ref.c
+++ b/mm/pte_ref.c
@@ -69,4 +69,40 @@ void pte_ref_init(pgtable_t pte)
 	pte->pte_refcount = 0;
 }
 
+void track_pte_set(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		   pte_t pte)
+{
+	pgtable_t page;
+
+	if (&init_mm == mm || pte_huge(pte))
+		return;
+
+	page = pte_to_page(ptep);
+	if (pte_none(*ptep) && !pte_none(pte)) {
+		pte_refcount_add(mm, page, PTE_MAPPED_OFFSET);
+		if (is_zero_pfn(pte_pfn(pte)))
+			pte_refcount_add(mm, page, PTE_ZERO_OFFSET);
+	} else if (is_zero_pfn(pte_pfn(*ptep)) && !is_zero_pfn(pte_pfn(pte))) {
+		pte_refcount_sub(mm, page, PTE_ZERO_OFFSET);
+	}
+}
+EXPORT_SYMBOL(track_pte_set);
+
+void track_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		     pte_t pte)
+{
+	pgtable_t page;
+
+	if (&init_mm == mm || pte_huge(pte))
+		return;
+
+	page = pte_to_page(ptep);
+	if (!pte_none(pte)) {
+		pte_refcount_sub(mm, page, PTE_MAPPED_OFFSET);
+		if (is_zero_pfn(pte_pfn(pte)))
+			pte_refcount_sub(mm, page, PTE_ZERO_OFFSET);
+	}
+}
+EXPORT_SYMBOL(track_pte_clear);
+
 #endif /* CONFIG_FREE_USER_PTE */
-- 
2.20.1



* [RFC PATCH 6/7] x86/mm: add x86_64 support for pte_ref
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
                   ` (4 preceding siblings ...)
  2022-08-25 10:10 ` [RFC PATCH 5/7] pte_ref: add track_pte_{set, clear}() helper Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-25 10:10 ` [RFC PATCH 7/7] mm: add proc interface to free user PTE page table pages Qi Zheng
  2022-08-29 10:09 ` [RFC PATCH 0/7] Try to free empty and zero " David Hildenbrand
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

Add pte_ref hooks into the routines that modify user PTE page tables,
and select ARCH_SUPPORTS_FREE_USER_PTE, so that the pte_ref code
can be compiled and can work on this architecture.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 arch/x86/Kconfig               | 1 +
 arch/x86/include/asm/pgtable.h | 4 ++++
 include/linux/pgtable.h        | 1 +
 3 files changed, 6 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 52a7f91527fe..50215b05723e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -34,6 +34,7 @@ config X86_64
 	select SWIOTLB
 	select ARCH_HAS_ELFCORE_COMPAT
 	select ZONE_DMA32
+	select ARCH_SUPPORTS_FREE_USER_PTE
 
 config FORCE_DYNAMIC_FTRACE
 	def_bool y
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 44e2d6f1dbaa..cbfcfa497fb9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,7 @@
 #include <asm/coco.h>
 #include <asm-generic/pgtable_uffd.h>
 #include <linux/page_table_check.h>
+#include <linux/pte_ref.h>
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 bool __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -1005,6 +1006,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep, pte_t pte)
 {
 	page_table_check_pte_set(mm, addr, ptep, pte);
+	track_pte_set(mm, addr, ptep, pte);
 	set_pte(ptep, pte);
 }
 
@@ -1050,6 +1052,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 {
 	pte_t pte = native_ptep_get_and_clear(ptep);
 	page_table_check_pte_clear(mm, addr, pte);
+	track_pte_clear(mm, addr, ptep, pte);
 	return pte;
 }
 
@@ -1066,6 +1069,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 		 */
 		pte = native_local_ptep_get_and_clear(ptep);
 		page_table_check_pte_clear(mm, addr, pte);
+		track_pte_clear(mm, addr, ptep, pte);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index c4a6bda6e965..908636f48c95 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -276,6 +276,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 	pte_t pte = *ptep;
 	pte_clear(mm, address, ptep);
 	page_table_check_pte_clear(mm, address, pte);
+	track_pte_clear(mm, address, ptep, pte);
 	return pte;
 }
 #endif
-- 
2.20.1



* [RFC PATCH 7/7] mm: add proc interface to free user PTE page table pages
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
                   ` (5 preceding siblings ...)
  2022-08-25 10:10 ` [RFC PATCH 6/7] x86/mm: add x86_64 support for pte_ref Qi Zheng
@ 2022-08-25 10:10 ` Qi Zheng
  2022-08-29 10:09 ` [RFC PATCH 0/7] Try to free empty and zero " David Hildenbrand
  7 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-25 10:10 UTC (permalink / raw)
  To: akpm, david, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song, Qi Zheng

Add a /proc/sys/vm/free_ptes file to procfs. When a pid is written
to the file, we traverse that process's address space, then find
and free its empty PTE pages and zero PTE pages.
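
For example (a user-space sketch, not part of this patch; the file is
root-writable only since its mode is 0200), the scan can be triggered
with "echo <pid> > /proc/sys/vm/free_ptes", or from C:

	#include <stdio.h>

	/* Ask the kernel to scan and free reclaimable PTE pages of a pid. */
	int main(void)
	{
		FILE *f = fopen("/proc/sys/vm/free_ptes", "w");

		if (!f)
			return 1;
		fprintf(f, "%d\n", 1234);	/* pid of the target process */
		return fclose(f) ? 1 : 0;
	}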

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pte_ref.h |   5 ++
 kernel/sysctl.c         |  12 ++++
 mm/pte_ref.c            | 126 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 143 insertions(+)

diff --git a/include/linux/pte_ref.h b/include/linux/pte_ref.h
index ab49c7fac120..f7e244129291 100644
--- a/include/linux/pte_ref.h
+++ b/include/linux/pte_ref.h
@@ -16,6 +16,11 @@ void track_pte_set(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		   pte_t pte);
 void track_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		     pte_t pte);
+
+int free_ptes_sysctl_handler(struct ctl_table *table, int write,
+		void *buffer, size_t *length, loff_t *ppos);
+extern int sysctl_free_ptes_pid;
+
 #else /* !CONFIG_FREE_USER_PTE */
 
 static inline void pte_ref_init(pgtable_t pte)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 35d034219513..14e1a9841cb8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -64,6 +64,7 @@
 #include <linux/mount.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/pid.h>
+#include <linux/pte_ref.h>
 
 #include "../lib/kstrtox.h"
 
@@ -2153,6 +2154,17 @@ static struct ctl_table vm_table[] = {
 		.extra1		= SYSCTL_ONE,
 		.extra2		= SYSCTL_FOUR,
 	},
+#ifdef CONFIG_FREE_USER_PTE
+	{
+		.procname	= "free_ptes",
+		.data		= &sysctl_free_ptes_pid,
+		.maxlen		= sizeof(int),
+		.mode		= 0200,
+		.proc_handler	= free_ptes_sysctl_handler,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_INT_MAX,
+	},
+#endif
 #ifdef CONFIG_COMPACTION
 	{
 		.procname	= "compact_memory",
diff --git a/mm/pte_ref.c b/mm/pte_ref.c
index 818821d068af..e7080a3100a6 100644
--- a/mm/pte_ref.c
+++ b/mm/pte_ref.c
@@ -6,6 +6,14 @@
  */
 #include <linux/pgtable.h>
 #include <linux/pte_ref.h>
+#include <linux/mm.h>
+#include <linux/pagewalk.h>
+#include <linux/sched/mm.h>
+#include <linux/jump_label.h>
+#include <linux/hugetlb.h>
+#include <asm/tlbflush.h>
+
+#include "internal.h"
 
 #ifdef CONFIG_FREE_USER_PTE
 
@@ -105,4 +113,122 @@ void track_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 }
 EXPORT_SYMBOL(track_pte_clear);
 
+#ifdef CONFIG_DEBUG_VM
+static void pte_free_debug(pmd_t pmd)
+{
+	pte_t *ptep = (pte_t *)pmd_page_vaddr(pmd);
+	int i = 0;
+
+	for (i = 0; i < PTRS_PER_PTE; i++, ptep++) {
+		pte_t pte = *ptep;
+		BUG_ON(!(pte_none(pte) || is_zero_pfn(pte_pfn(pte))));
+	}
+}
+#else
+static inline void pte_free_debug(pmd_t pmd)
+{
+}
+#endif
+
+static int kfreeptd_pmd_entry(pmd_t *pmd, unsigned long addr,
+			      unsigned long next, struct mm_walk *walk)
+{
+	pmd_t pmdval;
+	pgtable_t page;
+	struct mm_struct *mm = walk->mm;
+	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	spinlock_t *ptl;
+	bool free = false;
+	unsigned long haddr = addr & PMD_MASK;
+
+	if (pmd_trans_unstable(pmd))
+		goto out;
+
+	mmap_read_unlock(mm);
+	mmap_write_lock(mm);
+
+	if (mm_find_pmd(mm, addr) != pmd)
+		goto unlock_out;
+
+	ptl = pmd_lock(mm, pmd);
+	pmdval = *pmd;
+	if (pmd_none(pmdval) || pmd_leaf(pmdval)) {
+		spin_unlock(ptl);
+		goto unlock_out;
+	}
+	page = pmd_pgtable(pmdval);
+	if (!pte_mapped_count(page) || pte_zero_count(page) == PTRS_PER_PTE) {
+		pmd_clear(pmd);
+		flush_tlb_range(&vma, haddr, haddr + PMD_SIZE);
+		free = true;
+	}
+	spin_unlock(ptl);
+
+unlock_out:
+	mmap_write_unlock(mm);
+	mmap_read_lock(mm);
+
+	if (free) {
+		pte_free_debug(pmdval);
+		mm_dec_nr_ptes(mm);
+		pgtable_pte_page_dtor(page);
+		__free_page(page);
+	}
+
+out:
+	cond_resched();
+	return 0;
+}
+
+static const struct mm_walk_ops kfreeptd_walk_ops = {
+	.pmd_entry		= kfreeptd_pmd_entry,
+};
+
+int sysctl_free_ptes_pid;
+int free_ptes_sysctl_handler(struct ctl_table *table, int write,
+		void *buffer, size_t *length, loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
+	if (ret)
+		return ret;
+	if (write) {
+		struct task_struct *task;
+		struct mm_struct *mm;
+
+		rcu_read_lock();
+		task = find_task_by_vpid(sysctl_free_ptes_pid);
+		if (!task) {
+			rcu_read_unlock();
+			return -ESRCH;
+		}
+		mm = get_task_mm(task);
+		rcu_read_unlock();
+
+		/* get_task_mm() returned NULL, so there is no mm to put */
+		if (!mm)
+			return -ESRCH;
+
+		do {
+			ret = -EBUSY;
+
+			if (mmap_read_trylock(mm)) {
+				ret = walk_page_range(mm, FIRST_USER_ADDRESS,
+						      ULONG_MAX,
+						      &kfreeptd_walk_ops, NULL);
+
+				mmap_read_unlock(mm);
+			}
+
+			cond_resched();
+		} while (ret == -EBUSY);	/* retry if mmap_read_trylock() failed */
+
+		mmput(mm);
+	}
+	return ret;
+}
+
 #endif /* CONFIG_FREE_USER_PTE */
-- 
2.20.1



* Re: [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages
  2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
                   ` (6 preceding siblings ...)
  2022-08-25 10:10 ` [RFC PATCH 7/7] mm: add proc interface to free user PTE page table pages Qi Zheng
@ 2022-08-29 10:09 ` David Hildenbrand
  2022-08-29 14:00   ` Qi Zheng
  7 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2022-08-29 10:09 UTC (permalink / raw)
  To: Qi Zheng, akpm, kirill.shutemov, mika.penttila, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song

On 25.08.22 12:10, Qi Zheng wrote:
> Hi,
> 
> Before this, in order to free empty user PTE page table pages, I posted the
> following patch sets of two solutions:
>  - atomic refcount version:
> 	https://lore.kernel.org/lkml/20211110105428.32458-1-zhengqi.arch@bytedance.com/
>  - percpu refcount version:
> 	https://lore.kernel.org/lkml/20220429133552.33768-1-zhengqi.arch@bytedance.com/
> 
> Both patch sets have the following behavior:
> a. Protect the page table walker by hooking pte_offset_map{_lock}() and
>    pte_unmap{_unlock}()
> b. Will automatically reclaim PTE page table pages in the non-reclaiming path
> 
> For behavior a, there may be the following disadvantages mentioned by
> David Hildenbrand:
>  - It introduces a lot of complexity. It's not something easy to get in and most
>    probably not easy to get out again
>  - It is inconvenient to extend to other architectures. For example, for the
>    continuous ptes of arm64, the pointer to the PTE entry is obtained directly
>    through pte_offset_kernel() instead of pte_offset_map{_lock}()
>  - It has been found that pte_unmap() is missing in some places that only
>    execute on 64-bit systems, which is a disaster for pte_refcount
> 
> For behavior b, it may not be necessary to actively reclaim PTE pages, especially
> when memory pressure is not high, and deferring to the reclaim path may be a
> better choice.
> 
> In addition, the above two solutions are only for empty PTE pages (a PTE page
> where all entries are empty), and do not deal with the zero PTE page ( a PTE
> page where all page table entries are mapped to shared zero page) mentioned by
> David Hildenbrand:
> 	"Especially the shared zeropage is nasty, because there are
> 	 sane use cases that can trigger it. Assume you have a VM
> 	 (e.g., QEMU) that inflated the balloon to return free memory
> 	 to the hypervisor.
> 
> 	 Simply migrating that VM will populate the shared zeropage to
> 	 all inflated pages, because migration code ends up reading all
> 	 VM memory. Similarly, the guest can just read that memory as
> 	 well, for example, when the guest issues kdump itself."
> 
> The purpose of this RFC patch is to continue the discussion and fix the above
> issues. The following is the solution to be discussed.

Thanks for providing an alternative! It's certainly easier to digest :)

> 
> In order to quickly identify the above two types of PTE pages, we still
> introduced a pte_refcount for each PTE page. We put the mapped and zero PTE
> entry counter into the pte_refcount of the PTE page. The bitmask has the
> following meaning:
> 
>  - bits 0-9 are mapped PTE entry count
>  - bits 10-19 are zero PTE entry count

I guess we could factor the zero PTE change out, to have an even simpler
first version. The issue is that some features (userfaultfd) don't
expect page faults when something was already mapped previously.

PTE markers as introduced by Peter might require a thought -- we don't
have anything mapped but do have additional information that we have to
maintain.

> 
> In this way, when mapped PTE entry count is 0, we can know that the current PTE
> page is an empty PTE page, and when zero PTE entry count is PTRS_PER_PTE, we can
> know that the current PTE page is a zero PTE page.
> 
> We only update the pte_refcount when setting and clearing of PTE entry, and
> since they are both protected by pte lock, pte_refcount can be a non-atomic
> variable with little performance overhead.
> 
> For page table walker, we mutually exclusive it by holding write lock of
> mmap_lock when doing pmd_clear() (in the newly added path to reclaim PTE pages).

I recall when I played with that idea that the mmap_lock is not
sufficient to rip out a page table. IIRC, we also have to hold the rmap
lock(s), to prevent RMAP walkers from still using the page table.

Especially if multiple VMAs intersect a page table, things might get
tricky, because multiple rmap locks could be involved.

We might want/need another mechanism to synchronize against page table
walkers.

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages
  2022-08-29 10:09 ` [RFC PATCH 0/7] Try to free empty and zero " David Hildenbrand
@ 2022-08-29 14:00   ` Qi Zheng
  0 siblings, 0 replies; 10+ messages in thread
From: Qi Zheng @ 2022-08-29 14:00 UTC (permalink / raw)
  To: David Hildenbrand, akpm, kirill.shutemov, jgg, tglx, willy
  Cc: linux-kernel, linux-mm, muchun.song



On 2022/8/29 18:09, David Hildenbrand wrote:
> On 25.08.22 12:10, Qi Zheng wrote:
>> Hi,
>>
>> Before this, in order to free empty user PTE page table pages, I posted the
>> following patch sets of two solutions:
>>   - atomic refcount version:
>> 	https://lore.kernel.org/lkml/20211110105428.32458-1-zhengqi.arch@bytedance.com/
>>   - percpu refcount version:
>> 	https://lore.kernel.org/lkml/20220429133552.33768-1-zhengqi.arch@bytedance.com/
>>
>> Both patch sets have the following behavior:
>> a. Protect the page table walker by hooking pte_offset_map{_lock}() and
>>     pte_unmap{_unlock}()
>> b. Will automatically reclaim PTE page table pages in the non-reclaiming path
>>
>> For behavior a, there may be the following disadvantages mentioned by
>> David Hildenbrand:
>>   - It introduces a lot of complexity. It's not something easy to get in and most
>>     probably not easy to get out again
>>   - It is inconvenient to extend to other architectures. For example, for the
>>     continuous ptes of arm64, the pointer to the PTE entry is obtained directly
>>     through pte_offset_kernel() instead of pte_offset_map{_lock}()
>>   - It has been found that pte_unmap() is missing in some places that only
>>     execute on 64-bit systems, which is a disaster for pte_refcount
>>
>> For behavior b, it may not be necessary to actively reclaim PTE pages, especially
>> when memory pressure is not high, and deferring to the reclaim path may be a
>> better choice.
>>
>> In addition, the above two solutions are only for empty PTE pages (a PTE page
>> where all entries are empty), and do not deal with the zero PTE page ( a PTE
>> page where all page table entries are mapped to shared zero page) mentioned by
>> David Hildenbrand:
>> 	"Especially the shared zeropage is nasty, because there are
>> 	 sane use cases that can trigger it. Assume you have a VM
>> 	 (e.g., QEMU) that inflated the balloon to return free memory
>> 	 to the hypervisor.
>>
>> 	 Simply migrating that VM will populate the shared zeropage to
>> 	 all inflated pages, because migration code ends up reading all
>> 	 VM memory. Similarly, the guest can just read that memory as
>> 	 well, for example, when the guest issues kdump itself."
>>
>> The purpose of this RFC patch is to continue the discussion and fix the above
>> issues. The following is the solution to be discussed.
> 
> Thanks for providing an alternative! It's certainly easier to digest :)

Hi David,

Nice to see your reply.

> 
>>
>> In order to quickly identify the above two types of PTE pages, we still
>> introduced a pte_refcount for each PTE page. We put the mapped and zero PTE
>> entry counter into the pte_refcount of the PTE page. The bitmask has the
>> following meaning:
>>
>>   - bits 0-9 are mapped PTE entry count
>>   - bits 10-19 are zero PTE entry count
> 
> I guess we could factor the zero PTE change out, to have an even simpler
> first version. The issue is that some features (userfaultfd) don't
> expect page faults when something was already mapped previously.

OK, we can deal with the empty PTE page case first.

> 
> PTE markers as introduced by Peter might require a thought -- we don't
> have anything mapped but do have additional information that we have to
> maintain.

I see that the pte marker entry is a non-present entry, not an empty
entry (pte_none()). So we have already dealt with this situation, which
is also what is done in [RFC PATCH 1/7].

> 
>>
>> In this way, when mapped PTE entry count is 0, we can know that the current PTE
>> page is an empty PTE page, and when zero PTE entry count is PTRS_PER_PTE, we can
>> know that the current PTE page is a zero PTE page.
>>
>> We only update the pte_refcount when setting and clearing of PTE entry, and
>> since they are both protected by pte lock, pte_refcount can be a non-atomic
>> variable with little performance overhead.
>>
>> For page table walker, we mutually exclusive it by holding write lock of
>> mmap_lock when doing pmd_clear() (in the newly added path to reclaim PTE pages).
> 
> I recall when I played with that idea that the mmap_lock is not
> sufficient to rip out a page table. IIRC, we also have to hold the rmap
> lock(s), to prevent RMAP walkers from still using the page table.

Oh, I forgot about this. We should also hold the rmap lock(s), like
move_normal_pmd() does.

> 
> Especially if multiple VMAs intersect a page table, things might get
> tricky, because multiple rmap locks could be involved.

Maybe we can iterate over the vma list and just process the 2M aligned
part?

> 
> We might want/need another mechanism to synchronize against page table
> walkers.

This is a tricky problem, equivalent to narrowing the protection scope
of mmap_lock. Any preliminary ideas?

Thanks,
Qi

> 

-- 
Thanks,
Qi


Thread overview: 10+ messages
2022-08-25 10:10 [RFC PATCH 0/7] Try to free empty and zero user PTE page table pages Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 1/7] mm: use ptep_clear() in non-present cases Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 2/7] mm: introduce CONFIG_FREE_USER_PTE Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 3/7] mm: add pte_to_page() helper Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 4/7] mm: introduce pte_refcount for user PTE page table page Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 5/7] pte_ref: add track_pte_{set, clear}() helper Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 6/7] x86/mm: add x86_64 support for pte_ref Qi Zheng
2022-08-25 10:10 ` [RFC PATCH 7/7] mm: add proc interface to free user PTE page table pages Qi Zheng
2022-08-29 10:09 ` [RFC PATCH 0/7] Try to free empty and zero " David Hildenbrand
2022-08-29 14:00   ` Qi Zheng
