linux-kernel.vger.kernel.org archive mirror
* [PATCH v10 0/7] Enable THP for text section of non-shmem files
@ 2019-08-01 18:42 Song Liu
  2019-08-01 18:42 ` [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

Changes v9 => v10:
1. Update check for page->mapping in pagecache_get_page() (Johannes)
2. Refactor code in collapse_file() so it is easy to understand (Johannes)
3. Don't call try_to_release() in khugepaged_scan_file() (Johannes)
4. Rebase.

Changes v8 => v9:
1. Fix bad use of IS_ENABLED (kbuild test robot)

Changes v7 => v8:
1. Use IS_ENABLED wherever possible (Kirill A. Shutemov);
2. Improve handling of !PageUptodate case (Kirill A. Shutemov);
3. Add comment for calling lru_add_drain (Kirill A. Shutemov);
4. Add more information about DENYWRITE dynamic (Johannes Weiner).

Changes v6 => v7:
1. Avoid accessing vma without holding mmap_sem (Hillf Danton)
2. In collapse_file() use readahead API instead of gup API. This matches
   better with existing logic for shmem.
3. Add inline documentation for @nr_thps (kbuild test robot)

Changes v5 => v6:
1. Improve THP stats in 3/6 (Kirill).

Changes v4 => v5:
1. Move the logic to drop THP from pagecache to open() path (Rik).
2. Revise description of CONFIG_READ_ONLY_THP_FOR_FS.

Changes v3 => v4:
1. Put the logic to drop THP from pagecache in a separate function (Rik).
2. Move the function to drop THP from pagecache to exit_mmap().
3. Revise confusing commit log 6/6.

Changes v2 => v3:
1. Removed the limitation (cannot write to file with THP) by truncating
   whole file during sys_open (see 6/6);
2. Fixed a VM_BUG_ON_PAGE() in filemap_fault() (see 2/6);
3. Split function rename to a separate patch (Rik);
4. Updated condition in hugepage_vma_check() (Rik).

Changes v1 => v2:
1. Fixed a missing mem_cgroup_commit_charge() for non-shmem case.

This set follows up on the discussion at LSF/MM 2019. The motivation is to
put the text section of an application in THP, which reduces the iTLB miss
rate and improves performance. Both Facebook and Oracle have shown strong
interest in this feature.

To make reviews easier, this set aims for a minimal viable product. The
current version of the work does not change any file system specific
code. This comes with some limitations (discussed later).

This set enables an application to "hugify" its text section by simply
running something like:

          madvise((void *)0x600000, 0x800000, MADV_HUGEPAGE);

Before this call, /proc/<pid>/maps looks like:

    00400000-074d0000 r-xp 00000000 00:27 2006927     app

After this call, part of the text section is split out and mapped to
THP:

    00400000-00425000 r-xp 00000000 00:27 2006927     app
    00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
    00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
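
For readers who want to try this from user space, here is a minimal sketch of
how a program might pick the range to pass to madvise(). It assumes 2 MiB
PMD-sized huge pages (as on x86-64); hpage_align_up(), hpage_align_down() and
hugify_text() are hypothetical helper names, not part of this series. The
call only marks the range; the collapse itself is left to khugepaged.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* lets glibc's <sys/mman.h> expose MADV_HUGEPAGE */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages. Other architectures may differ. */
#define HPAGE_SIZE (2UL << 20)

/* Round an address up to the next huge-page boundary. */
static uintptr_t hpage_align_up(uintptr_t addr)
{
	return (addr + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/* Round an address down to the previous huge-page boundary. */
static uintptr_t hpage_align_down(uintptr_t addr)
{
	return addr & ~(HPAGE_SIZE - 1);
}

/*
 * Request THP for the huge-page-aligned interior of [start, end).
 * Returns 0 on success, -1 if the range holds no full huge page,
 * if madvise() fails, or if MADV_HUGEPAGE is unavailable at build time.
 */
static int hugify_text(uintptr_t start, uintptr_t end)
{
	uintptr_t lo, hi;

	assert(start <= end);
	lo = hpage_align_up(start);
	hi = hpage_align_down(end);
	if (lo >= hi)
		return -1;
#ifdef MADV_HUGEPAGE
	return madvise((void *)lo, hi - lo, MADV_HUGEPAGE);
#else
	return -1;
#endif
}
```

With the mapping shown above, hugify_text(0x400000, 0x74d0000) would ask for
the range 0x400000-0x7400000; the example in this letter uses a smaller
window starting at 0x600000.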

Limitations:

1. This only works for the text section (vmas with VM_DENYWRITE).
2. Original limitation #2 is removed in v3.

We gated this feature with an experimental config, READ_ONLY_THP_FOR_FS.
Once we get better support on the write path, we can remove the config and
enable it by default.

Tested cases:
1. Tested with btrfs and ext4.
2. Tested with a real-world application (memcache-like caching service).
3. Tested with "THP aware uprobe":
   https://patchwork.kernel.org/project/linux-mm/list/?series=131339

This set (plus a few uprobe patches) is also available at

   https://github.com/liu-song-6/linux/tree/uprobe-thp

Please share your comments and suggestions on this.

Thanks!

Song Liu (7):
  filemap: check compound_head(page)->mapping in filemap_fault()
  filemap: check compound_head(page)->mapping in pagecache_get_page()
  filemap: update offset check in filemap_fault()
  mm,thp: stats for file backed THP
  khugepaged: rename collapse_shmem() and khugepaged_scan_shmem()
  mm,thp: add read-only THP support for (non-shmem) FS
  mm,thp: avoid writes to file with THP in pagecache

 drivers/base/node.c    |   6 ++
 fs/inode.c             |   3 +
 fs/open.c              |   8 ++
 fs/proc/meminfo.c      |   4 +
 fs/proc/task_mmu.c     |   4 +-
 include/linux/fs.h     |  32 ++++++++
 include/linux/mmzone.h |   2 +
 mm/Kconfig             |  11 +++
 mm/filemap.c           |  11 +--
 mm/khugepaged.c        | 172 ++++++++++++++++++++++++++++-------------
 mm/rmap.c              |  12 ++-
 mm/vmstat.c            |   2 +
 12 files changed, 204 insertions(+), 63 deletions(-)

--
2.17.1


* [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault()
  2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
@ 2019-08-01 18:42 ` Song Liu
  2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

Currently, filemap_fault() avoids a race with truncate by checking
page->mapping == mapping. This does not work for compound pages. This
patch makes it check compound_head(page)->mapping instead.

Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 7161fb937e78..d0bd9e585c2f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2537,7 +2537,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		goto out_retry;
 
 	/* Did it get truncated? */
-	if (unlikely(page->mapping != mapping)) {
+	if (unlikely(compound_head(page)->mapping != mapping)) {
 		unlock_page(page);
 		put_page(page);
 		goto retry_find;
-- 
2.17.1
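
To illustrate why the old check misfires, here is a small user-space toy
model, not kernel code. The toy_* types and functions are hypothetical
stand-ins, and the model assumes that a tail page does not carry the file's
mapping in its own ->mapping field, so only the head page can be compared
against the expected mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct address_space and struct page. */
struct toy_mapping {
	int id;
};

struct toy_page {
	struct toy_mapping *mapping;	/* only meaningful on the head page */
	struct toy_page *head;		/* NULL for a head or order-0 page */
};

/* Same idea as compound_head(): a tail page resolves to its head. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* Old check: looks at the page's own ->mapping. */
static bool toy_truncated_old(struct toy_page *page, struct toy_mapping *mapping)
{
	return page->mapping != mapping;
}

/* New check: looks at the head page's ->mapping. */
static bool toy_truncated_new(struct toy_page *page, struct toy_mapping *mapping)
{
	return toy_compound_head(page)->mapping != mapping;
}
```

In this model the old check reports every tail page as truncated, which in
filemap_fault() would mean taking the retry path even though the huge page
is still in the page cache. The new check reports truncation only once the
head page has lost its mapping.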



* [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page()
  2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
  2019-08-01 18:42 ` [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
@ 2019-08-01 18:42 ` Song Liu
  2019-08-12 20:33   ` Johannes Weiner
  2019-08-01 18:42 ` [PATCH v10 3/7] filemap: update offset check in filemap_fault() Song Liu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

Similar to the previous patch, pagecache_get_page() avoids a race with
truncate by checking page->mapping == mapping. This does not work for
compound pages. This patch makes it check compound_head(page)->mapping
instead.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d0bd9e585c2f..aaee1ef96f6d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1644,7 +1644,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
 		}
 
 		/* Has the page been truncated? */
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(compound_head(page)->mapping != mapping)) {
 			unlock_page(page);
 			put_page(page);
 			goto repeat;
-- 
2.17.1



* [PATCH v10 3/7] filemap: update offset check in filemap_fault()
  2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
  2019-08-01 18:42 ` [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
  2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
@ 2019-08-01 18:42 ` Song Liu
  2019-08-01 18:42 ` [PATCH v10 4/7] mm,thp: stats for file backed THP Song Liu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

With THP, the current check of offset:

    VM_BUG_ON_PAGE(page->index != offset, page);

is no longer accurate. Update it to:

    VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);

Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index aaee1ef96f6d..97c7b7b92c20 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2542,7 +2542,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		put_page(page);
 		goto retry_find;
 	}
-	VM_BUG_ON_PAGE(page->index != offset, page);
+	VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
 
 	/*
 	 * We have a locked page in the page cache, now we need to check
-- 
2.17.1
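
The difference between the two checks can be shown with a user-space toy
model, not kernel code. It assumes that only the head page of a compound
page records the file offset in ->index and that a tail page's offset is the
head's index plus its position inside the compound page, which is the idea
behind page_to_pgoff(). The toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page; pages of one compound page are contiguous. */
struct toy_page {
	unsigned long index;	/* file offset in pages, kept on the head only */
	struct toy_page *head;	/* NULL for a head or order-0 page */
};

/* File offset of any page, head or tail, in units of base pages. */
static unsigned long toy_page_to_pgoff(const struct toy_page *page)
{
	assert(page != NULL);
	if (!page->head)
		return page->index;
	/* Tail page: head offset plus distance from the head. */
	return page->head->index + (unsigned long)(page - page->head);
}
```

For a huge page whose head sits at file offset 512 (2 MiB with 4 KiB base
pages), the fourth subpage resolves to offset 515, while its own ->index
field says nothing useful. A fault on that subpage would therefore trip the
old VM_BUG_ON_PAGE() but pass the new one.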



* [PATCH v10 4/7] mm,thp: stats for file backed THP
  2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
                   ` (2 preceding siblings ...)
  2019-08-01 18:42 ` [PATCH v10 3/7] filemap: update offset check in filemap_fault() Song Liu
@ 2019-08-01 18:42 ` Song Liu
  2019-08-01 18:42 ` [PATCH v10 5/7] khugepaged: rename collapse_shmem() and khugepaged_scan_shmem() Song Liu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

In preparation for non-shmem THP, this patch adds a few stats and exposes
them in /proc/meminfo, /sys/bus/node/devices/<node>/meminfo, and
/proc/<pid>/task/<tid>/smaps.

This patch is mostly a rewrite of Kirill A. Shutemov's earlier version:
https://lkml.kernel.org/r/20170126115819.58875-5-kirill.shutemov@linux.intel.com/

Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/base/node.c    | 6 ++++++
 fs/proc/meminfo.c      | 4 ++++
 fs/proc/task_mmu.c     | 4 +++-
 include/linux/mmzone.h | 2 ++
 mm/vmstat.c            | 2 ++
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 75b7e6f6535b..4f2714ee819b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -427,6 +427,8 @@ static ssize_t node_read_meminfo(struct device *dev,
 		       "Node %d AnonHugePages:  %8lu kB\n"
 		       "Node %d ShmemHugePages: %8lu kB\n"
 		       "Node %d ShmemPmdMapped: %8lu kB\n"
+		       "Node %d FileHugePages: %8lu kB\n"
+		       "Node %d FilePmdMapped: %8lu kB\n"
 #endif
 			,
 		       nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
@@ -452,6 +454,10 @@ static ssize_t node_read_meminfo(struct device *dev,
 		       nid, K(node_page_state(pgdat, NR_SHMEM_THPS) *
 				       HPAGE_PMD_NR),
 		       nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) *
+				       HPAGE_PMD_NR),
+		       nid, K(node_page_state(pgdat, NR_FILE_THPS) *
+				       HPAGE_PMD_NR),
+		       nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) *
 				       HPAGE_PMD_NR)
 #endif
 		       );
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 465ea0153b2a..82673470dde7 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -136,6 +136,10 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		    global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR);
 	show_val_kb(m, "ShmemPmdMapped: ",
 		    global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR);
+	show_val_kb(m, "FileHugePages: ",
+		    global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR);
+	show_val_kb(m, "FilePmdMapped: ",
+		    global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR);
 #endif
 
 #ifdef CONFIG_CMA
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 731642e0f5a0..1ea7d730774c 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -417,6 +417,7 @@ struct mem_size_stats {
 	unsigned long lazyfree;
 	unsigned long anonymous_thp;
 	unsigned long shmem_thp;
+	unsigned long file_thp;
 	unsigned long swap;
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
@@ -586,7 +587,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	else if (is_zone_device_page(page))
 		/* pass */;
 	else
-		VM_BUG_ON_PAGE(1, page);
+		mss->file_thp += HPAGE_PMD_SIZE;
 	smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
 }
 #else
@@ -803,6 +804,7 @@ static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss,
 	SEQ_PUT_DEC(" kB\nLazyFree:       ", mss->lazyfree);
 	SEQ_PUT_DEC(" kB\nAnonHugePages:  ", mss->anonymous_thp);
 	SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
+	SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp);
 	SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
 	seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
 				  mss->private_hugetlb >> 10, 7);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d77d717c620c..aa0dd8ca36c8 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -234,6 +234,8 @@ enum node_stat_item {
 	NR_SHMEM,		/* shmem pages (included tmpfs/GEM pages) */
 	NR_SHMEM_THPS,
 	NR_SHMEM_PMDMAPPED,
+	NR_FILE_THPS,
+	NR_FILE_PMDMAPPED,
 	NR_ANON_THPS,
 	NR_UNSTABLE_NFS,	/* NFS unstable pages */
 	NR_VMSCAN_WRITE,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index fd7e16ca6996..6afc892a148a 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1158,6 +1158,8 @@ const char * const vmstat_text[] = {
 	"nr_shmem",
 	"nr_shmem_hugepages",
 	"nr_shmem_pmdmapped",
+	"nr_file_hugepages",
+	"nr_file_pmdmapped",
 	"nr_anon_transparent_hugepages",
 	"nr_unstable",
 	"nr_vmscan_write",
-- 
2.17.1
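
A monitoring tool could pick up the new /proc/meminfo fields with a parser
along these lines. This is a sketch only: meminfo_kb() is a hypothetical
helper, and it matches keys at the start of a line, so it does not handle
the "Node %d" prefix used in the per-node meminfo files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up "<key>: <value> kB" in meminfo-style text.
 * Returns 0 and stores the value in *kb on success, -1 if the key is
 * missing or its value cannot be parsed.
 */
static int meminfo_kb(const char *text, const char *key, unsigned long *kb)
{
	size_t klen;
	const char *line = text;

	assert(text && key && kb);
	klen = strlen(key);

	while (line && *line) {
		/* The key must start the line and be followed by ':'. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return sscanf(line + klen + 1, "%lu", kb) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

Feeding it the contents of /proc/meminfo with the keys "FileHugePages" and
"FilePmdMapped" would return the two new counters in kB; on a kernel without
this patch both lookups simply return -1.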



* [PATCH v10 5/7] khugepaged: rename collapse_shmem() and khugepaged_scan_shmem()
  2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
                   ` (3 preceding siblings ...)
  2019-08-01 18:42 ` [PATCH v10 4/7] mm,thp: stats for file backed THP Song Liu
@ 2019-08-01 18:42 ` Song Liu
       [not found] ` <20190801184244.3169074-7-songliubraving@fb.com>
       [not found] ` <20190801184244.3169074-8-songliubraving@fb.com>
  6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

The next patch will add khugepaged support for non-shmem files. This patch
renames these two functions to reflect the new functionality:

    collapse_shmem()        =>  collapse_file()
    khugepaged_scan_shmem() =>  khugepaged_scan_file()

Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/khugepaged.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b9949014346b..9d3cc2061960 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1426,7 +1426,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 }
 
 /**
- * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
+ * collapse_file - collapse small tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
@@ -1443,10 +1443,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *    + restore gaps in the page cache;
  *    + unlock and free huge page;
  */
-static void collapse_shmem(struct mm_struct *mm,
-		struct address_space *mapping, pgoff_t start,
+static void collapse_file(struct mm_struct *mm,
+		struct file *file, pgoff_t start,
 		struct page **hpage, int node)
 {
+	struct address_space *mapping = file->f_mapping;
 	gfp_t gfp;
 	struct page *new_page;
 	struct mem_cgroup *memcg;
@@ -1702,11 +1703,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	/* TODO: tracepoints */
 }
 
-static void khugepaged_scan_shmem(struct mm_struct *mm,
-		struct address_space *mapping,
-		pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm,
+		struct file *file, pgoff_t start, struct page **hpage)
 {
 	struct page *page = NULL;
+	struct address_space *mapping = file->f_mapping;
 	XA_STATE(xas, &mapping->i_pages, start);
 	int present, swap;
 	int node = NUMA_NO_NODE;
@@ -1770,16 +1771,15 @@ static void khugepaged_scan_shmem(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 		} else {
 			node = khugepaged_find_target_node();
-			collapse_shmem(mm, mapping, start, hpage, node);
+			collapse_file(mm, file, start, hpage, node);
 		}
 	}
 
 	/* TODO: tracepoints */
 }
 #else
-static void khugepaged_scan_shmem(struct mm_struct *mm,
-		struct address_space *mapping,
-		pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm,
+		struct file *file, pgoff_t start, struct page **hpage)
 {
 	BUILD_BUG();
 }
@@ -1862,8 +1862,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 				file = get_file(vma->vm_file);
 				up_read(&mm->mmap_sem);
 				ret = 1;
-				khugepaged_scan_shmem(mm, file->f_mapping,
-						pgoff, hpage);
+				khugepaged_scan_file(mm, file, pgoff, hpage);
 				fput(file);
 			} else {
 				ret = khugepaged_scan_pmd(mm, vma,
-- 
2.17.1



* Re: [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page()
  2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
@ 2019-08-12 20:33   ` Johannes Weiner
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:33 UTC
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton

On Thu, Aug 01, 2019 at 11:42:39AM -0700, Song Liu wrote:
> Similar to previous patch, pagecache_get_page() avoids race condition
> with truncate by checking page->mapping == mapping. This does not work
> for compound pages. This patch let it check compound_head(page)->mapping
> instead.
> 
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Song Liu <songliubraving@fb.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH v10 6/7] mm,thp: add read-only THP support for (non-shmem) FS
       [not found] ` <20190801184244.3169074-7-songliubraving@fb.com>
@ 2019-08-12 20:36   ` Johannes Weiner
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:36 UTC
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton

On Thu, Aug 01, 2019 at 11:42:43AM -0700, Song Liu wrote:
> This patch is (hopefully) the first step to enable THP for non-shmem
> filesystems.
> 
> This patch enables an application to put part of its text sections to THP
> via madvise, for example:
> 
>     madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
> 
> We tried to reuse the logic for THP on tmpfs.
> 
> Currently, write is not supported for non-shmem THP. khugepaged will only
> process vma with VM_DENYWRITE. sys_mmap() ignores VM_DENYWRITE requests
> (see ksys_mmap_pgoff). The only way to create vma with VM_DENYWRITE is
> execve(). This requirement limits non-shmem THP to text sections.
> 
> The next patch will handle writes, which would only happen when all the
> vmas with VM_DENYWRITE are unmapped.
> 
> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> feature.
> 
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Rik van Riel <riel@surriel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH v10 7/7] mm,thp: avoid writes to file with THP in pagecache
       [not found] ` <20190801184244.3169074-8-songliubraving@fb.com>
@ 2019-08-12 20:38   ` Johannes Weiner
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:38 UTC
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton

On Thu, Aug 01, 2019 at 11:42:44AM -0700, Song Liu wrote:
> In previous patch, an application could put part of its text section in
> THP via madvise(). These THPs will be protected from writes when the
> application is still running (TXTBSY). However, after the application
> exits, the file is available for writes.
> 
> This patch avoids writes to file THP by dropping page cache for the file
> when the file is open for write. A new counter nr_thps is added to struct
> address_space. In do_dentry_open(), if the file is open for write and
> nr_thps is non-zero, we drop page cache for the whole file.
> 
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Reported-by: kbuild test robot <lkp@intel.com>
> Acked-by: Rik van Riel <riel@surriel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
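
The open-for-write rule quoted above can be summarised with a user-space toy
model, not the kernel implementation. The toy_* names are hypothetical, the
counter is a plain int here, and "dropping the page cache" is reduced to
resetting two counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct address_space with the new nr_thps counter. */
struct toy_address_space {
	int nr_thps;		/* huge pages currently cached for the file */
	unsigned long nrpages;	/* all cached pages, in base-page units */
};

/*
 * Model of the check described for do_dentry_open(): when the file is
 * opened for write and still has THPs cached, drop the whole page cache.
 * Returns true if the cache was dropped.
 */
static bool toy_open(struct toy_address_space *mapping, bool for_write)
{
	assert(mapping != NULL);
	if (!for_write || mapping->nr_thps == 0)
		return false;

	/* The real patch drops the page cache; the model just forgets it. */
	mapping->nrpages = 0;
	mapping->nr_thps = 0;
	return true;
}
```

Read-only opens and files without cached THPs are left alone, so the cost
of dropping the cache is paid only by the first writer after a huge page
was created.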

