* [PATCH 0/3] mm, thp: introduce a new sysfs interface to facilitate file THP for .text
From: Rongwei Wang @ 2021-10-09  9:26 UTC
  To: linux-mm, linux-kernel, linux-fsdevel
  Cc: akpm, willy, viro, song, william.kucharski, hughd, shy828301,
	linmiaohe, peterx

Hi, all

Recently, our team has been working on huge pages for executable
binaries and shared libraries; we refer to these huge pages as
'hugetext' below. Hugetext does improve application performance, e.g.
for mysql, as shown in [1][2], and the improvement becomes more obvious
as the text section grows. Based on [1][2], we make some improvements
to make file-backed THP more usable and easier for applications to
adopt.

In the current kernel, ref[1] introduced READ_ONLY_THP_FOR_FS, and
ref[2] added support for shared libraries on top of it. However,
hugetext is still not convenient to use. For example, we need to
explicitly madvise MADV_HUGEPAGE for .text and set
"transparent_hugepage/enabled" to always or madvise. In addition,
hugetext requires vma->vm_start to be 2M aligned relative to
vma->vm_pgoff, which is not guaranteed by the kernel or the loader.

Our design:
To address these drawbacks of file THP, we make two main improvements,
described below.
(1) Introduce a new sysfs interface "transparent_hugepage/hugetext_enabled"
to automatically (i.e., transparently) enable file THP for suitable
.text vmas. Usage:

    to disable hugetext:
    $ echo 0 > /sys/kernel/mm/transparent_hugepage/hugetext_enabled

    to enable hugetext:
    $ echo 1 > /sys/kernel/mm/transparent_hugepage/hugetext_enabled

    to enable or disable in boot options: hugetext=1 or hugetext=0
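
For scripting, the same toggle can be driven from C. The sketch below is
only illustrative: both helper names are hypothetical, and the path is
passed in by the caller. On a real system it would be the
hugetext_enabled file proposed in this series, which exists only on a
kernel with these patches applied and CONFIG_HUGETEXT enabled.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to a flag file such as
 * /sys/kernel/mm/transparent_hugepage/hugetext_enabled.
 * Returns 0 on success, -1 on failure.
 */
int hugetext_write_flag(const char *path, int enable)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(enable ? "1" : "0", f) == EOF) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer; a failed flush is a failed write. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: read the flag back.
 * Returns 0 or 1, or -1 if the file cannot be read or does not start
 * with '0' or '1'.
 */
int hugetext_read_flag(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == '0' || c == '1')
		return c - '0';
	return -1;
}
```

Writing to the real sysfs file requires root, so a caller would
normally treat a -1 return as "feature unavailable or not permitted".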

Q: Why not add a new option, e.g., "text_always", in addition to
"always", "madvise", and "never" to "transparent_hugepage/enabled"?

A: A new option to "transparent_hugepage/enabled" cannot handle the
scenario where THP is always used for .text, and madvise/never is used
for everything else (e.g., anon vmas).

The .text is usually small. In our production environment, at most 10G
out of 500G total memory is used as .text. The .text is also
performance critical. More importantly, we don't want to change the
user's default behavior too much. So we think a new, independent sysfs
interface for file THP is worthwhile.

(2) Make vm_start of .text 2M aligned relative to vm_pgoff, especially
for PIE/PIC binaries and shared libraries.

For binaries that are compiled with '--pie -fPIC' and with LOAD
alignment smaller than 2M (typically 4K or 64K), change
maximum_alignment to 2M.

For shared libraries, ld.so does not seem to honor p_align, as shown
below.
$ readelf -l /usr/lib64/libc-2.17.so
LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
               0x00000000001c2fe8 0x00000000001c2fe8  R E    200000
$ cat /proc/1/smaps
7fecc4072000-7fecc4235000 r-xp 00000000 08:03 655802  /usr/lib64/libc-2.17.so
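
The libc mapping above starts at 0x7fecc4072000 with file offset 0,
which is not on a 2M boundary. A minimal userspace sketch of the
alignment condition follows, assuming 4K base pages and 2M huge pages.
The function name is hypothetical, and the constants are redefined
locally rather than taken from kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4K base pages and 2M PMD-sized huge pages. */
#define PAGE_SHIFT	12
#define HPAGE_PMD_NR	512	/* 2M / 4K */

/*
 * True if the mapping start and the file offset (in pages) are
 * congruent modulo 2M, so that one PMD entry can map one huge page of
 * the file. Unsigned wraparound in the subtraction is harmless because
 * 2^64 is a multiple of HPAGE_PMD_NR.
 */
bool mapping_is_pmd_aligned(uint64_t vm_start, uint64_t vm_pgoff)
{
	return (((vm_start >> PAGE_SHIFT) - vm_pgoff) % HPAGE_PMD_NR) == 0;
}
```

With the values from the smaps line, mapping_is_pmd_aligned(0x7fecc4072000, 0)
is false, while the same file mapped at 0x7fecc4000000 would pass.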

Finally, why is this feature implemented in the kernel rather than in
userspace or ld.so?

Userspace methods like libhugetlbfs have various disadvantages:
 * require recompiling applications;
 * the anonymous mapping cannot be shared;
 * debugging is not convenient.

Calling madvise MADV_HUGEPAGE for .text in ld.so has been suggested on
the glibc mailing list[3], but there was no response.

Considering that this feature requires very little code and is not
difficult to implement on top of the existing file-backed THP support,
we chose to implement it in the kernel.

Thanks!

Reference:
[1] https://patchwork.kernel.org/project/linux-mm/cover/20190801184244.3169074-1-songliubraving@fb.com/
[2] https://patchwork.kernel.org/project/linux-fsdevel/patch/20210406000930.3455850-1-cfijalkovich@google.com/
[3] https://sourceware.org/pipermail/libc-alpha/2021-February/122334.html

Rongwei Wang (3):
  mm, thp: support binaries transparent use of file THP
  mm, thp: make mapping address of libraries THP align
  mm, thp: make mapping address of PIC binaries THP align

 fs/binfmt_elf.c            |  5 +++
 include/linux/huge_mm.h    | 36 +++++++++++++++++++
 include/linux/khugepaged.h |  9 +++++
 mm/Kconfig                 | 11 ++++++
 mm/huge_memory.c           | 72 ++++++++++++++++++++++++++++++++++++++
 mm/khugepaged.c            |  4 +++
 mm/memory.c                | 12 +++++++
 mm/mmap.c                  | 18 ++++++++++
 8 files changed, 167 insertions(+)

-- 
2.27.0



* [PATCH 1/3] mm, thp: support binaries transparent use of file THP
From: Rongwei Wang @ 2021-10-09  9:26 UTC
  To: linux-mm, linux-kernel, linux-fsdevel
  Cc: akpm, willy, viro, song, william.kucharski, hughd, shy828301,
	linmiaohe, peterx

The file THP for .text is not convenient to use at present.
Applications need to explicitly madvise MADV_HUGEPAGE for .text,
which is not friendly for tasks in the production environment.

This patch extends READ_ONLY_THP_FOR_FS and introduces a new sysfs
interface, hugetext_enabled, to make read-only file-backed pages
THP eligible proactively and transparently.

Compared with the original design, it no longer depends on
madvise(). And with 'hugetext_enabled' introduced, users are
no longer limited to the 'enabled' setting (e.g., always, madvise
and never).
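
The intended independence from the 'enabled' setting can be modelled in
a few lines of userspace C. This is a simplified sketch, not the kernel
code: the types and the function name are hypothetical, and conditions
such as MMF_DISABLE_THP and shmem handling are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the global THP settings. */
enum thp_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS };

struct thp_settings {
	enum thp_mode enabled;	/* transparent_hugepage/enabled */
	bool hugetext_enabled;	/* transparent_hugepage/hugetext_enabled */
};

/* Hypothetical, simplified view of one mapping. */
struct mapping {
	bool exec_file_text;	/* exec, file-backed, read-only, 2M aligned */
	bool madv_hugepage;	/* MADV_HUGEPAGE was applied */
	bool no_hugepage;	/* MADV_NOHUGEPAGE was applied */
};

bool khugepaged_may_collapse(const struct thp_settings *s,
			     const struct mapping *m)
{
	/* An explicit opt-out always wins. */
	if (m->no_hugepage)
		return false;
	/* Hugetext is decided before, and independently of, 'enabled'. */
	if (s->hugetext_enabled && m->exec_file_text)
		return true;
	if (s->enabled == THP_ALWAYS)
		return true;
	return s->enabled == THP_MADVISE && m->madv_hugepage;
}
```

In this model a .text mapping is eligible even with 'enabled' set to
never, while an anonymous mapping under the same settings is not.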

There are two methods to enable or disable this feature:
To enable hugetext:
1. echo 1 > /sys/kernel/mm/transparent_hugepage/hugetext_enabled
2. hugetext=1 in boot cmdline

To disable hugetext:
1. echo 0 > /sys/kernel/mm/transparent_hugepage/hugetext_enabled
2. hugetext=0 in boot cmdline

This feature is disabled by default.

Signed-off-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
---
 include/linux/huge_mm.h    | 24 ++++++++++++++++
 include/linux/khugepaged.h |  9 ++++++
 mm/Kconfig                 | 11 ++++++++
 mm/huge_memory.c           | 57 ++++++++++++++++++++++++++++++++++++++
 mm/khugepaged.c            |  4 +++
 mm/memory.c                | 12 ++++++++
 6 files changed, 117 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index f123e15d966e..95b718031ef3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -87,6 +87,9 @@ enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG,
 	TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG,
 	TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG,
+#ifdef CONFIG_HUGETEXT
+	TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG,
+#endif
 };
 
 struct kobject;
@@ -140,6 +143,27 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
 	return true;
 }
 
+#ifdef CONFIG_HUGETEXT
+#define hugetext_enabled()			\
+	(transparent_hugepage_flags &		\
+	 (1<<TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG))
+#else
+#define hugetext_enabled()	false
+#endif /* CONFIG_HUGETEXT */
+
+static inline bool vma_is_hugetext(struct vm_area_struct *vma,
+				   unsigned long vm_flags)
+{
+	if (!(vm_flags & VM_EXEC))
+		return false;
+
+	if (vma->vm_file && !inode_is_open_for_write(vma->vm_file->f_inode))
+		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
+				HPAGE_PMD_NR);
+
+	return false;
+}
+
 /*
  * to be used on vmas which are known to support THP.
  * Use transparent_hugepage_active otherwise
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 2fcc01891b47..ad56f75a2fda 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -26,10 +26,18 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 }
 #endif
 
+#ifdef CONFIG_HUGETEXT
+#define khugepaged_enabled()					\
+	(transparent_hugepage_flags &				\
+	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |			\
+	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG) |		\
+	  (1<<TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG)))
+#else
 #define khugepaged_enabled()					       \
 	(transparent_hugepage_flags &				       \
 	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |		       \
 	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)))
+#endif
 #define khugepaged_always()				\
 	(transparent_hugepage_flags &			\
 	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
@@ -59,6 +67,7 @@ static inline int khugepaged_enter(struct vm_area_struct *vma,
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
 		if ((khugepaged_always() ||
 		     (shmem_file(vma->vm_file) && shmem_huge_enabled(vma)) ||
+		     (hugetext_enabled() && vma_is_hugetext(vma, vm_flags)) ||
 		     (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
 		    !(vm_flags & VM_NOHUGEPAGE) &&
 		    !test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
diff --git a/mm/Kconfig b/mm/Kconfig
index d16ba9249bc5..5aa3fa86e7b1 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -868,6 +868,17 @@ config READ_ONLY_THP_FOR_FS
 	  support of file THPs will be developed in the next few release
 	  cycles.
 
+config HUGETEXT
+	bool "THP for text segments"
+	depends on READ_ONLY_THP_FOR_FS
+
+	help
+	  Allow khugepaged to put read-only file-backed pages, including
+	  shared libraries, as well as the anonymous and executable pages
+	  in THP.
+
+	  This feature builds on and extends READ_ONLY_THP_FOR_FS.
+
 config ARCH_HAS_PTE_SPECIAL
 	bool
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5e9ef0fc261e..f6fffb5c5130 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -330,6 +330,35 @@ static ssize_t hpage_pmd_size_show(struct kobject *kobj,
 static struct kobj_attribute hpage_pmd_size_attr =
 	__ATTR_RO(hpage_pmd_size);
 
+#ifdef CONFIG_HUGETEXT
+static ssize_t hugetext_enabled_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	return single_hugepage_flag_show(kobj, attr, buf,
+			TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG);
+}
+
+static ssize_t hugetext_enabled_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	ssize_t ret = count;
+
+	ret = single_hugepage_flag_store(kobj, attr, buf, count,
+			TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG);
+
+	if (ret > 0) {
+		int err = start_stop_khugepaged();
+
+		if (err)
+			ret = err;
+	}
+
+	return ret;
+}
+static struct kobj_attribute hugetext_enabled_attr =
+	__ATTR(hugetext_enabled, 0644, hugetext_enabled_show, hugetext_enabled_store);
+#endif /* CONFIG_HUGETEXT */
+
 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
 	&defrag_attr.attr,
@@ -337,6 +366,9 @@ static struct attribute *hugepage_attr[] = {
 	&hpage_pmd_size_attr.attr,
 #ifdef CONFIG_SHMEM
 	&shmem_enabled_attr.attr,
+#endif
+#ifdef CONFIG_HUGETEXT
+	&hugetext_enabled_attr.attr,
 #endif
 	NULL,
 };
@@ -491,6 +523,31 @@ static int __init setup_transparent_hugepage(char *str)
 }
 __setup("transparent_hugepage=", setup_transparent_hugepage);
 
+#ifdef CONFIG_HUGETEXT
+static int __init setup_hugetext(char *str)
+{
+	int ret = 0;
+
+	if (!str)
+		goto out;
+	if (!strcmp(str, "1")) {
+		set_bit(TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG,
+			  &transparent_hugepage_flags);
+		ret = 1;
+	} else if (!strcmp(str, "0")) {
+		clear_bit(TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG,
+			&transparent_hugepage_flags);
+		ret = 1;
+	}
+
+out:
+	if (!ret)
+		pr_warn("hugetext= cannot parse, ignored\n");
+	return ret;
+}
+__setup("hugetext=", setup_hugetext);
+#endif
+
 pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 {
 	if (likely(vma->vm_flags & VM_WRITE))
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 045cc579f724..2810bc1962b3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -451,6 +451,10 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 				HPAGE_PMD_NR);
 	}
 
+	/* Make hugetext independent of THP settings */
+	if (hugetext_enabled() && vma_is_hugetext(vma, vm_flags))
+		return true;
+
 	/* THP settings require madvise. */
 	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
 		return false;
diff --git a/mm/memory.c b/mm/memory.c
index adf9b9ef8277..b0d0889af6ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -73,6 +73,7 @@
 #include <linux/perf_event.h>
 #include <linux/ptrace.h>
 #include <linux/vmalloc.h>
+#include <linux/khugepaged.h>
 
 #include <trace/events/kmem.h>
 
@@ -4157,6 +4158,17 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret = 0;
 
+#ifdef CONFIG_HUGETEXT
+	/* Add the candidate hugetext vma into khugepaged scan list */
+	if (pmd_none(*vmf->pmd) && hugetext_enabled()
+			&& vma_is_hugetext(vma, vma->vm_flags)) {
+		unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
+
+		if (transhuge_vma_suitable(vma, haddr))
+			khugepaged_enter(vma, vma->vm_flags);
+	}
+#endif
+
 	/*
 	 * Let's call ->map_pages() first and use ->fault() as fallback
 	 * if page by the offset is not ready to be mapped (cold cache or
-- 
2.27.0



* [PATCH 2/3] mm, thp: make mapping address of libraries THP align
From: Rongwei Wang @ 2021-10-09  9:26 UTC
  To: linux-mm, linux-kernel, linux-fsdevel
  Cc: akpm, willy, viro, song, william.kucharski, hughd, shy828301,
	linmiaohe, peterx

For shared libraries, ld.so does not seem to honor p_align, as shown
below.
$ readelf -l /usr/lib64/libc-2.17.so
LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
               0x00000000001c2fe8 0x00000000001c2fe8  R E    200000
$ cat /proc/1/smaps
7fecc4072000-7fecc4235000 r-xp 00000000 08:03 655802  /usr/lib64/libc-2.17.so

This makes the mapping address allocated by 'get_unmapped_area'
2M aligned for libraries, to facilitate file THP for the .text
section as far as possible.
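
One common way to obtain such an address is to search for a free range
padded by one huge page and then round the result up so that its low
bits match the file offset. The sketch below shows only that rounding
arithmetic in userspace, assuming 2M huge pages. The function name is
hypothetical and the padded search itself is not modelled.

```c
#include <stdint.h>

#define PMD_SIZE	(2ULL << 20)	/* assumed: 2M huge pages */

/*
 * Given the start of a free range that was searched with an extra
 * PMD_SIZE of padding, return the first address in it whose offset
 * within a 2M block matches that of the file offset.
 */
uint64_t pmd_align_in_padded_range(uint64_t range_start, uint64_t file_off)
{
	return range_start + ((file_off - range_start) & (PMD_SIZE - 1));
}
```

For the libc example above, a range starting at 0x7fecc4072000 with file
offset 0 yields 0x7fecc4200000.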

Signed-off-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
---
 include/linux/huge_mm.h | 12 ++++++++++++
 mm/huge_memory.c        | 15 +++++++++++++++
 mm/mmap.c               | 18 ++++++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 95b718031ef3..ddbc0d19f90f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -147,8 +147,20 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
 #define hugetext_enabled()			\
 	(transparent_hugepage_flags &		\
 	 (1<<TRANSPARENT_HUGEPAGE_HUGETEXT_ENABLED_FLAG))
+
+extern unsigned long hugetext_get_unmapped_area(struct file *filp,
+		unsigned long addr, unsigned long len, unsigned long pgoff,
+		unsigned long flags);
 #else
 #define hugetext_enabled()	false
+
+static inline unsigned long hugetext_get_unmapped_area(struct file *filp,
+		unsigned long addr, unsigned long len, unsigned long pgoff,
+		unsigned long flags)
+{
+	BUILD_BUG();
+	return 0;
+}
 #endif /* CONFIG_HUGETEXT */
 
 static inline bool vma_is_hugetext(struct vm_area_struct *vma,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f6fffb5c5130..076a74cdc214 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -650,6 +650,21 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 
+#ifdef CONFIG_HUGETEXT
+unsigned long hugetext_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	unsigned long ret;
+	loff_t off = (loff_t)pgoff << PAGE_SHIFT;
+
+	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE);
+	if (ret)
+		return ret;
+
+	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+#endif /* CONFIG_HUGETEXT */
+
 static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 			struct page *page, gfp_t gfp)
 {
diff --git a/mm/mmap.c b/mm/mmap.c
index 88dcc5c25225..cad94a13edc2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2242,8 +2242,26 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 
 	get_area = current->mm->get_unmapped_area;
 	if (file) {
+#ifdef CONFIG_HUGETEXT
+		/*
+		 * Checked before file->f_op->get_unmapped_area.
+		 *
+		 * If hugetext is enabled, then except for MAP_FIXED, always
+		 * map files that have an executable mode bit at a
+		 * 2MB-aligned address.
+		 */
+		struct inode *inode = file_inode(file);
+
+		if (hugetext_enabled() && (inode->i_mode & 0111) &&
+				(!file->f_op->get_unmapped_area ||
+				 file->f_op->get_unmapped_area == thp_get_unmapped_area))
+			get_area = hugetext_get_unmapped_area;
+		else if (file->f_op->get_unmapped_area)
+			get_area = file->f_op->get_unmapped_area;
+#else
 		if (file->f_op->get_unmapped_area)
 			get_area = file->f_op->get_unmapped_area;
+#endif
 	} else if (flags & MAP_SHARED) {
 		/*
 		 * mmap_region() will call shmem_zero_setup() to create a file,
-- 
2.27.0



* [PATCH 3/3] mm, thp: make mapping address of PIC binaries THP align
From: Rongwei Wang @ 2021-10-09  9:26 UTC
  To: linux-mm, linux-kernel, linux-fsdevel
  Cc: akpm, willy, viro, song, william.kucharski, hughd, shy828301,
	linmiaohe, peterx

For binaries that are compiled with '--pie -fPIC' and with LOAD
alignment smaller than 2M (typically 4K or 64K), the load address
is unlikely to be 2M aligned.

This changes the maximum_alignment of such binaries to 2M to
facilitate file THP for the .text section as far as possible.
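
The adjustment amounts to rounding the load bias down to a 2M boundary
when the image is at least 2M in size. A userspace sketch of that
arithmetic follows, assuming 2M huge pages. The function name is
hypothetical, the constants are redefined locally, and the
hugetext_enabled() and interpreter checks are omitted.

```c
#include <stdint.h>

#define HPAGE_PMD_SIZE	(2ULL << 20)		/* assumed: 2M */
#define HPAGE_PMD_MASK	(~(HPAGE_PMD_SIZE - 1))

/*
 * Round load_bias down to a 2M boundary, but only for images that are
 * large enough to contain at least one huge page.
 */
uint64_t hugetext_align_load_bias(uint64_t load_bias, uint64_t total_size)
{
	if (total_size >= HPAGE_PMD_SIZE)
		load_bias &= HPAGE_PMD_MASK;
	return load_bias;
}
```

For example, a 4M image with a load bias of 0x555555554000 would be
moved down to 0x555555400000, while a 1M image is left where it is.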

Signed-off-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
---
 fs/binfmt_elf.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index a813b70f594e..78795572d877 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1136,6 +1136,11 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				retval = -EINVAL;
 				goto out_free_dentry;
 			}
+#ifdef CONFIG_HUGETEXT
+			if (hugetext_enabled() && interpreter &&
+					total_size >= HPAGE_PMD_SIZE)
+				load_bias &= HPAGE_PMD_MASK;
+#endif
 		}
 
 		error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,
-- 
2.27.0



* Re: [PATCH 0/3] mm, thp: introduce a new sysfs interface to facilitate file THP for .text
From: Christoph Hellwig @ 2021-10-11  8:06 UTC
  To: Rongwei Wang
  Cc: linux-mm, linux-kernel, linux-fsdevel, akpm, willy, viro, song,
	william.kucharski, hughd, shy828301, linmiaohe, peterx

Can we please just get proper pagecache THP (through folios) merged
instead of piling hacks over hacks here?  The whole readonly THP already
was more than painful enough due to all the hacks involved.


* Re: [PATCH 0/3] mm, thp: introduce a new sysfs interface to facilitate file THP for .text
From: Matthew Wilcox @ 2021-10-12  1:50 UTC
  To: Christoph Hellwig
  Cc: Rongwei Wang, linux-mm, linux-kernel, linux-fsdevel, akpm, viro,
	song, william.kucharski, hughd, shy828301, linmiaohe, peterx

On Mon, Oct 11, 2021 at 09:06:37AM +0100, Christoph Hellwig wrote:
> Can we please just get proper pagecache THP (through folios) merged
> instead of piling hacks over hacks here?  The whole readonly THP already
> was more than painful enough due to all the hacks involved.

This was my initial reaction too.

But read the patches.  They're nothing to do with the implementation of
THP / folios in the page cache.  They're all to make sure that mappings
are PMD aligned.

I think there's a lot to criticise in the patches (eg, a system-wide
setting is probably a bad idea.  and a lot of this stuff seems to
be fixing userspace bugs in the kernel).  But let's criticise what's
actually in the patches, because these are problems that exist regardless
of RO_THP vs folios.


* Re: [PATCH 0/3] mm, thp: introduce a new sysfs interface to facilitate file THP for .text
From: Rongwei Wang @ 2021-10-12  7:04 UTC
  To: Matthew Wilcox, Christoph Hellwig
  Cc: linux-mm, linux-kernel, linux-fsdevel, akpm, viro, song,
	william.kucharski, hughd, shy828301, linmiaohe, peterx



On 10/12/21 9:50 AM, Matthew Wilcox wrote:
> On Mon, Oct 11, 2021 at 09:06:37AM +0100, Christoph Hellwig wrote:
>> Can we please just get proper pagecache THP (through folios) merged
>> instead of piling hacks over hacks here?  The whole readonly THP already
>> was more than painful enough due to all the hacks involved.
> 
> This was my initial reaction too.
> 
> But read the patches.  They're nothing to do with the implementation of
> THP / folios in the page cache.  They're all to make sure that mappings
> are PMD aligned.
Hi, Matthew

In fact, we had thought about implementing this by handling the page
cache directly. Then we found that we only need to align the mapping
address and let khugepaged scan these 'mm_struct's, based on
READ_ONLY_THP_FOR_FS.

> 
> I think there's a lot to criticise in the patches (eg, a system-wide
> setting is probably a bad idea.  and a lot of this stuff seems to
At the beginning, we did not introduce the new sysfs interface and just
reused 'transparent_hugepage/enabled'. But some production systems
disable THP outright, especially for applications that are sensitive
to THP. So, considering these scenarios, we had to design a new sysfs
interface ('transparent_hugepage/hugetext_enabled').

If you have other ideas, we are willing to adopt them to improve these
patches.

Thanks!

> be fixing userspace bugs in the kernel).  But let's criticise what's
> actually in the patches, because these are problems that exist regardless
> of RO_THP vs folios.
>
