Linux-mm Archive on lore.kernel.org
* [PATCH v2 0/3] Enable THP for text section of non-shmem files
@ 2019-06-14 18:22 Song Liu
  2019-06-14 18:22 ` [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault() Song Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Song Liu @ 2019-06-14 18:22 UTC (permalink / raw)
  To: linux-mm
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz, Song Liu

This set follows up on the discussion at LSF/MM 2019. The motivation is to
put the text section of an application in THP, which reduces the iTLB miss
rate and improves performance. Both Facebook and Oracle showed strong
interest in this feature.

To make review easier, this set aims for a minimal viable product. The
current version of the work does not change any file-system-specific
code. This comes with some limitations (discussed later).

This set enables an application to "hugify" its text section by simply
running something like:

          madvise(0x600000, 0x800000, MADV_HUGEPAGE);

Before this call, the /proc/<pid>/maps looks like:

    00400000-074d0000 r-xp 00000000 00:27 2006927     app

After this call, part of the text section is split out and mapped to THP:

    00400000-00425000 r-xp 00000000 00:27 2006927     app
    00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
    00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
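
As a rough illustration of how a caller might pick the range for the
madvise() call above, here is a minimal userspace sketch. It assumes 2 MiB
PMD-sized huge pages (as on x86_64); HPAGE_SIZE, hugify_range() and
hugify_text() are hypothetical names that are not part of this series.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD-sized huge pages, as on x86_64. */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/*
 * Hypothetical helper: shrink [start, end) to the largest sub-range whose
 * start and end are both huge-page aligned. Returns the length of that
 * sub-range, or 0 if no aligned huge page fits.
 */
size_t hugify_range(uintptr_t start, uintptr_t end, uintptr_t *aligned_start)
{
	uintptr_t s = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1); /* round up */
	uintptr_t e = end & ~(HPAGE_SIZE - 1);                      /* round down */

	if (e <= s)
		return 0;
	*aligned_start = s;
	return (size_t)(e - s);
}

/* Hypothetical wrapper: request THP for the aligned part of a text range. */
int hugify_text(uintptr_t start, uintptr_t end)
{
	uintptr_t s;
	size_t len = hugify_range(start, end, &s);

	if (len == 0)
		return -1;
	return madvise((void *)s, len, MADV_HUGEPAGE);
}
```

With the mapping shown above (00400000-074d0000), hugify_range() would
report the whole aligned span; an application is free to pass a smaller
aligned sub-range, as the example call does.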

Limitations:

1. This only works for the text section (a vma with VM_DENYWRITE).
2. Once the application puts its own pages in THP, the file is read-only.
   Opening the file for write (O_WRONLY or O_RDWR) will fail with -ETXTBSY.
   To modify/update the file, it must be removed first. Here is an example
   case:

    root@virt-test:~/# ./app hugify
    ^C

    root@virt-test:~/# dd if=/dev/zero of=./app bs=1k count=2
    dd: failed to open './app': Text file busy

    root@virt-test:~/# cp app.backup app
    cp: cannot create regular file 'app': Text file busy

    root@virt-test:~/# rm app
    root@virt-test:~/# cp app.backup app
    root@virt-test:~/#
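
The same check can be made programmatically. The sketch below is only an
illustration of the behavior described in limitation 2; probe_write_open()
is a hypothetical helper, not something added by this series.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open an existing file for write.
 * Returns 0 if the open succeeds, or -errno on failure. For a file that
 * has THP in the page cache under this series, the expected failure is
 * -ETXTBSY, matching the dd/cp errors shown above.
 */
int probe_write_open(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	close(fd);
	return 0;
}
```

A caller that sees -ETXTBSY would then fall back to the remove-and-recreate
sequence shown in the shell session.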

We gated this feature with an experimental config, READ_ONLY_THP_FOR_FS.
Once we get better support on the write path, we can remove the config and
enable it by default.

Tested cases:
1. Tested with btrfs and ext4.
2. Tested with a real-world application (a memcache-like caching service).
3. Tested with "THP aware uprobe":
   https://patchwork.kernel.org/project/linux-mm/list/?series=131339

Please share your comments and suggestions on this.

Thanks!

Changes v1 => v2:
1. Fixed a missing mem_cgroup_commit_charge() for non-shmem case.

Song Liu (3):
  mm: check compound_head(page)->mapping in filemap_fault()
  mm,thp: stats for file backed THP
  mm,thp: add read-only THP support for (non-shmem) FS

 fs/proc/meminfo.c      |   4 ++
 include/linux/fs.h     |   8 ++++
 include/linux/mmzone.h |   2 +
 mm/Kconfig             |  11 +++++
 mm/filemap.c           |   7 +--
 mm/khugepaged.c        | 106 +++++++++++++++++++++++++++++++++--------
 mm/rmap.c              |  12 +++--
 mm/vmstat.c            |   2 +
 8 files changed, 125 insertions(+), 27 deletions(-)

--
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault()
  2019-06-14 18:22 [PATCH v2 0/3] Enable THP for text section of non-shmem files Song Liu
@ 2019-06-14 18:22 ` Song Liu
  2019-06-17 14:59   ` Rik van Riel
  2019-06-14 18:22 ` [PATCH v2 2/3] mm,thp: stats for file backed THP Song Liu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Song Liu @ 2019-06-14 18:22 UTC (permalink / raw)
  To: linux-mm
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz, Song Liu

Currently, filemap_fault() avoids a race condition with truncate by
checking page->mapping == mapping. This does not work for compound
pages. This patch makes it check compound_head(page)->mapping instead.
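
To see why the head page has to be consulted, consider the toy userspace
model below. It is a sketch under simplifying assumptions: the structs are
stand-ins, not the kernel's struct page or struct address_space, and a tail
page is modeled as simply not carrying the file's mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel definitions. */
struct toy_mapping { int id; };

struct toy_page {
	struct toy_mapping *mapping; /* only meaningful in the head page here */
	struct toy_page *head;       /* NULL for a head or order-0 page */
};

/* Toy equivalent of compound_head(): tail pages point at their head. */
struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/* Old check: a tail page always looks truncated. */
bool looks_truncated_old(struct toy_page *page, struct toy_mapping *mapping)
{
	return page->mapping != mapping;
}

/* New check: compare the head page's mapping instead. */
bool looks_truncated_new(struct toy_page *page, struct toy_mapping *mapping)
{
	return toy_compound_head(page)->mapping != mapping;
}
```

In this model the old check reports a false "truncated" result for every
tail page, so a fault on a tail page would loop back to retry_find.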

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index df2006ba0cfa..f5b79a43946d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2517,7 +2517,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		goto out_retry;
 
 	/* Did it get truncated? */
-	if (unlikely(page->mapping != mapping)) {
+	if (unlikely(compound_head(page)->mapping != mapping)) {
 		unlock_page(page);
 		put_page(page);
 		goto retry_find;
-- 
2.17.1



* [PATCH v2 2/3] mm,thp: stats for file backed THP
  2019-06-14 18:22 [PATCH v2 0/3] Enable THP for text section of non-shmem files Song Liu
  2019-06-14 18:22 ` [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault() Song Liu
@ 2019-06-14 18:22 ` Song Liu
  2019-06-17 15:00   ` Rik van Riel
  2019-06-21 12:50   ` Kirill A. Shutemov
  2019-06-14 18:22 ` [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
  2019-06-18 21:12 ` [PATCH v2 0/3] Enable THP for text section of non-shmem files Andrew Morton
  3 siblings, 2 replies; 17+ messages in thread
From: Song Liu @ 2019-06-14 18:22 UTC (permalink / raw)
  To: linux-mm
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz, Song Liu

In preparation for non-shmem THP, this patch adds two stats and exposes
them in /proc/meminfo.
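
For reference, userspace could pick up the two new fields roughly as
follows. This is a minimal sketch: the field names FileHugePages and
FilePmdMapped come from this patch, while meminfo_value_kb() and
print_file_thp_stats() are hypothetical helpers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if "line" is "<key>:   <n> kB", store n in *kb and
 * return 1; otherwise return 0.
 */
int meminfo_value_kb(const char *line, const char *key, unsigned long *kb)
{
	size_t n = strlen(key);

	if (strncmp(line, key, n) != 0 || line[n] != ':')
		return 0;
	return sscanf(line + n + 1, "%lu", kb) == 1;
}

/*
 * Hypothetical helper: print the two file THP lines from /proc/meminfo.
 * Returns the number of matching lines, or -1 if the file cannot be opened.
 */
int print_file_thp_stats(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];
	unsigned long kb;
	int found = 0;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (meminfo_value_kb(line, "FileHugePages", &kb) ||
		    meminfo_value_kb(line, "FilePmdMapped", &kb)) {
			fputs(line, stdout);
			found++;
		}
	}
	fclose(f);
	return found;
}
```

On a kernel without this patch, print_file_thp_stats() would simply find
no matching lines.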

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 fs/proc/meminfo.c      | 4 ++++
 include/linux/mmzone.h | 2 ++
 mm/vmstat.c            | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 568d90e17c17..bac395fc11f9 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -136,6 +136,10 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		    global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR);
 	show_val_kb(m, "ShmemPmdMapped: ",
 		    global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR);
+	show_val_kb(m, "FileHugePages: ",
+		    global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR);
+	show_val_kb(m, "FilePmdMapped: ",
+		    global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR);
 #endif
 
 #ifdef CONFIG_CMA
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394cabaf4e..827f9b777938 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -234,6 +234,8 @@ enum node_stat_item {
 	NR_SHMEM,		/* shmem pages (included tmpfs/GEM pages) */
 	NR_SHMEM_THPS,
 	NR_SHMEM_PMDMAPPED,
+	NR_FILE_THPS,
+	NR_FILE_PMDMAPPED,
 	NR_ANON_THPS,
 	NR_UNSTABLE_NFS,	/* NFS unstable pages */
 	NR_VMSCAN_WRITE,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index fd7e16ca6996..6afc892a148a 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1158,6 +1158,8 @@ const char * const vmstat_text[] = {
 	"nr_shmem",
 	"nr_shmem_hugepages",
 	"nr_shmem_pmdmapped",
+	"nr_file_hugepages",
+	"nr_file_pmdmapped",
 	"nr_anon_transparent_hugepages",
 	"nr_unstable",
 	"nr_vmscan_write",
-- 
2.17.1



* [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-14 18:22 [PATCH v2 0/3] Enable THP for text section of non-shmem files Song Liu
  2019-06-14 18:22 ` [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault() Song Liu
  2019-06-14 18:22 ` [PATCH v2 2/3] mm,thp: stats for file backed THP Song Liu
@ 2019-06-14 18:22 ` Song Liu
  2019-06-17 15:42   ` Rik van Riel
  2019-06-21 12:58   ` Kirill A. Shutemov
  2019-06-18 21:12 ` [PATCH v2 0/3] Enable THP for text section of non-shmem files Andrew Morton
  3 siblings, 2 replies; 17+ messages in thread
From: Song Liu @ 2019-06-14 18:22 UTC (permalink / raw)
  To: linux-mm
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz, Song Liu

This patch is (hopefully) the first step to enable THP for non-shmem
filesystems.

This patch enables an application to put part of its text section in THP
via madvise, for example:

    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);

We tried to reuse the logic for THP on tmpfs. The following functions are
renamed to reflect the new functionality:

	collapse_shmem()	=>  collapse_file()
	khugepaged_scan_shmem()	=>  khugepaged_scan_file()

Currently, write is not supported for non-shmem THP. This is enforced by
taking a negative i_writecount. Therefore, if a file has THP pages in the
page cache, open() for write will fail. To update/modify the file, the
user needs to remove it first.
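
The i_writecount convention can be sketched in userspace with C11 atomics.
This is a toy model only: struct toy_inode and the toy_*() functions are
hypothetical stand-ins that mirror the get/deny/put/allow helpers in
include/linux/fs.h, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Toy inode: > 0 means open writers, < 0 means writes are denied. */
struct toy_inode { atomic_int i_writecount; };

/* Increment *v unless it is negative; returns 1 on success. */
static int inc_unless_negative(atomic_int *v)
{
	int old = atomic_load(v);

	while (old >= 0)
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return 1;
	return 0;
}

/* Decrement *v unless it is positive; returns 1 on success. */
static int dec_unless_positive(atomic_int *v)
{
	int old = atomic_load(v);

	while (old <= 0)
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return 1;
	return 0;
}

/* A writer takes a positive reference; fails while writes are denied. */
int toy_get_write_access(struct toy_inode *inode)
{
	return inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
}

/* A denier (here: a THP in the page cache) takes a negative reference. */
int toy_deny_write_access(struct toy_inode *inode)
{
	return dec_unless_positive(&inode->i_writecount) ? 0 : -ETXTBSY;
}

/* Drop a writer's reference. */
void toy_put_write_access(struct toy_inode *inode)
{
	atomic_fetch_sub(&inode->i_writecount, 1);
}

/* Drop a denier's reference, e.g. when the THP leaves the page cache. */
void toy_allow_write_access(struct toy_inode *inode)
{
	atomic_fetch_add(&inode->i_writecount, 1);
}
```

In this model, each huge page in the page cache holds one negative
reference, so writers are refused until the last one is dropped.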

An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
feature.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/fs.h |   8 ++++
 mm/Kconfig         |  11 +++++
 mm/filemap.c       |   5 ++-
 mm/khugepaged.c    | 106 ++++++++++++++++++++++++++++++++++++---------
 mm/rmap.c          |  12 +++--
 5 files changed, 116 insertions(+), 26 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index f7fdfe93e25d..cda996ddaee1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2871,6 +2871,10 @@ static inline int get_write_access(struct inode *inode)
 {
 	return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
 }
+static inline int __deny_write_access(struct inode *inode)
+{
+	return atomic_dec_unless_positive(&inode->i_writecount) ? 0 : -ETXTBSY;
+}
 static inline int deny_write_access(struct file *file)
 {
 	struct inode *inode = file_inode(file);
@@ -2880,6 +2884,10 @@ static inline void put_write_access(struct inode * inode)
 {
 	atomic_dec(&inode->i_writecount);
 }
+static inline void __allow_write_access(struct inode *inode)
+{
+	atomic_inc(&inode->i_writecount);
+}
 static inline void allow_write_access(struct file *file)
 {
 	if (file)
diff --git a/mm/Kconfig b/mm/Kconfig
index f0c76ba47695..546d45d9bdab 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -762,6 +762,17 @@ config GUP_BENCHMARK
 
 	  See tools/testing/selftests/vm/gup_benchmark.c
 
+config READ_ONLY_THP_FOR_FS
+	bool "Read-only THP for filesystems (EXPERIMENTAL)"
+	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
+
+	help
+	  Allow khugepaged to put read-only file-backed pages in THP.
+
+	  This is marked experimental because it makes files with THP in
+	  the page cache read-only. To overwrite the file, it needs to be
+	  truncated or removed first.
+
 config ARCH_HAS_PTE_SPECIAL
 	bool
 
diff --git a/mm/filemap.c b/mm/filemap.c
index f5b79a43946d..966f24cee711 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -203,8 +203,9 @@ static void unaccount_page_cache_page(struct address_space *mapping,
 		__mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr);
 		if (PageTransHuge(page))
 			__dec_node_page_state(page, NR_SHMEM_THPS);
-	} else {
-		VM_BUG_ON_PAGE(PageTransHuge(page), page);
+	} else if (PageTransHuge(page)) {
+		__dec_node_page_state(page, NR_FILE_THPS);
+		__allow_write_access(mapping->host);
 	}
 
 	/*
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a335f7c1fac4..1855ace48488 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -48,6 +48,7 @@ enum scan_result {
 	SCAN_CGROUP_CHARGE_FAIL,
 	SCAN_EXCEED_SWAP_PTE,
 	SCAN_TRUNCATED,
+	SCAN_PAGE_HAS_PRIVATE,
 };
 
 #define CREATE_TRACE_POINTS
@@ -404,7 +405,13 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	    (vm_flags & VM_NOHUGEPAGE) ||
 	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
 		return false;
+
+#ifdef CONFIG_READ_ONLY_THP_FOR_FS
+	if (shmem_file(vma->vm_file) ||
+	    (vma->vm_file && (vm_flags & VM_DENYWRITE))) {
+#else
 	if (shmem_file(vma->vm_file)) {
+#endif
 		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
 			return false;
 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
@@ -456,8 +463,9 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 	unsigned long hstart, hend;
 
 	/*
-	 * khugepaged does not yet work on non-shmem files or special
-	 * mappings. And file-private shmem THP is not supported.
+	 * khugepaged only supports read-only files for non-shmem files.
+	 * khugepaged does not yet work on special mappings. And
+	 * file-private shmem THP is not supported.
 	 */
 	if (!hugepage_vma_check(vma, vm_flags))
 		return 0;
@@ -1284,12 +1292,12 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 }
 
 /**
- * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
+ * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
- *    + swap in pages if necessary;
+ *    + swap/gup in pages if necessary;
  *    + fill in gaps;
  *    + keep old pages around in case rollback is required;
  *  - if replacing succeeds:
@@ -1301,10 +1309,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *    + restore gaps in the page cache;
  *    + unlock and free huge page;
  */
-static void collapse_shmem(struct mm_struct *mm,
+static void collapse_file(struct vm_area_struct *vma,
 		struct address_space *mapping, pgoff_t start,
 		struct page **hpage, int node)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	gfp_t gfp;
 	struct page *new_page;
 	struct mem_cgroup *memcg;
@@ -1312,7 +1321,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
 	int nr_none = 0, result = SCAN_SUCCEED;
+	bool is_shmem = shmem_file(vma->vm_file);
 
+#ifndef CONFIG_READ_ONLY_THP_FOR_FS
+	VM_BUG_ON(!is_shmem);
+#endif
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
 	/* Only allocate from the target node */
@@ -1344,7 +1357,8 @@ static void collapse_shmem(struct mm_struct *mm,
 	} while (1);
 
 	__SetPageLocked(new_page);
-	__SetPageSwapBacked(new_page);
+	if (is_shmem)
+		__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
 
@@ -1359,7 +1373,7 @@ static void collapse_shmem(struct mm_struct *mm,
 		struct page *page = xas_next(&xas);
 
 		VM_BUG_ON(index != xas.xa_index);
-		if (!page) {
+		if (is_shmem && !page) {
 			/*
 			 * Stop if extent has been truncated or hole-punched,
 			 * and is now completely empty.
@@ -1380,7 +1394,7 @@ static void collapse_shmem(struct mm_struct *mm,
 			continue;
 		}
 
-		if (xa_is_value(page) || !PageUptodate(page)) {
+		if (is_shmem && (xa_is_value(page) || !PageUptodate(page))) {
 			xas_unlock_irq(&xas);
 			/* swap in or instantiate fallocated page */
 			if (shmem_getpage(mapping->host, index, &page,
@@ -1388,6 +1402,24 @@ static void collapse_shmem(struct mm_struct *mm,
 				result = SCAN_FAIL;
 				goto xa_unlocked;
 			}
+		} else if (!page || xa_is_value(page)) {
+			unsigned long vaddr;
+
+			VM_BUG_ON(is_shmem);
+
+			vaddr = vma->vm_start +
+				((index - vma->vm_pgoff) << PAGE_SHIFT);
+			xas_unlock_irq(&xas);
+			if (get_user_pages_remote(NULL, mm, vaddr, 1,
+					FOLL_FORCE, &page, NULL, NULL) != 1) {
+				result = SCAN_FAIL;
+				goto xa_unlocked;
+			}
+			lru_add_drain();
+			lock_page(page);
+		} else if (!PageUptodate(page) || PageDirty(page)) {
+			result = SCAN_FAIL;
+			goto xa_locked;
 		} else if (trylock_page(page)) {
 			get_page(page);
 			xas_unlock_irq(&xas);
@@ -1422,6 +1454,12 @@ static void collapse_shmem(struct mm_struct *mm,
 			goto out_unlock;
 		}
 
+		if (page_has_private(page) &&
+		    !try_to_release_page(page, GFP_KERNEL)) {
+			result = SCAN_PAGE_HAS_PRIVATE;
+			break;
+		}
+
 		if (page_mapped(page))
 			unmap_mapping_pages(mapping, index, 1, false);
 
@@ -1459,12 +1497,20 @@ static void collapse_shmem(struct mm_struct *mm,
 		goto xa_unlocked;
 	}
 
-	__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	if (is_shmem)
+		__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	else {
+		__inc_node_page_state(new_page, NR_FILE_THPS);
+		__deny_write_access(mapping->host);
+	}
+
 	if (nr_none) {
 		struct zone *zone = page_zone(new_page);
 
 		__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
-		__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+		if (is_shmem)
+			__mod_node_page_state(zone->zone_pgdat, NR_SHMEM,
+					      nr_none);
 	}
 
 xa_locked:
@@ -1502,10 +1548,15 @@ static void collapse_shmem(struct mm_struct *mm,
 
 		SetPageUptodate(new_page);
 		page_ref_add(new_page, HPAGE_PMD_NR - 1);
-		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
+
+		if (is_shmem) {
+			set_page_dirty(new_page);
+			lru_cache_add_anon(new_page);
+		} else {
+			lru_cache_add_file(new_page);
+		}
 		count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
-		lru_cache_add_anon(new_page);
 
 		/*
 		 * Remove pte page tables, so we can re-fault the page as huge.
@@ -1520,7 +1571,9 @@ static void collapse_shmem(struct mm_struct *mm,
 		/* Something went wrong: roll back page cache changes */
 		xas_lock_irq(&xas);
 		mapping->nrpages -= nr_none;
-		shmem_uncharge(mapping->host, nr_none);
+
+		if (is_shmem)
+			shmem_uncharge(mapping->host, nr_none);
 
 		xas_set(&xas, start);
 		xas_for_each(&xas, page, end - 1) {
@@ -1560,7 +1613,7 @@ static void collapse_shmem(struct mm_struct *mm,
 	/* TODO: tracepoints */
 }
 
-static void khugepaged_scan_shmem(struct mm_struct *mm,
+static void khugepaged_scan_file(struct vm_area_struct *vma,
 		struct address_space *mapping,
 		pgoff_t start, struct page **hpage)
 {
@@ -1603,6 +1656,17 @@ static void khugepaged_scan_shmem(struct mm_struct *mm,
 			break;
 		}
 
+		if (page_has_private(page) && trylock_page(page)) {
+			int ret;
+
+			ret = try_to_release_page(page, GFP_KERNEL);
+			unlock_page(page);
+			if (!ret) {
+				result = SCAN_PAGE_HAS_PRIVATE;
+				break;
+			}
+		}
+
 		if (page_count(page) != 1 + page_mapcount(page)) {
 			result = SCAN_PAGE_COUNT;
 			break;
@@ -1628,14 +1692,14 @@ static void khugepaged_scan_shmem(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 		} else {
 			node = khugepaged_find_target_node();
-			collapse_shmem(mm, mapping, start, hpage, node);
+			collapse_file(vma, mapping, start, hpage, node);
 		}
 	}
 
 	/* TODO: tracepoints */
 }
 #else
-static void khugepaged_scan_shmem(struct mm_struct *mm,
+static void khugepaged_scan_file(struct vm_area_struct *vma,
 		struct address_space *mapping,
 		pgoff_t start, struct page **hpage)
 {
@@ -1710,17 +1774,19 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
-			if (shmem_file(vma->vm_file)) {
+			if (vma->vm_file) {
 				struct file *file;
 				pgoff_t pgoff = linear_page_index(vma,
 						khugepaged_scan.address);
-				if (!shmem_huge_enabled(vma))
+
+				if (shmem_file(vma->vm_file)
+				    && !shmem_huge_enabled(vma))
 					goto skip;
 				file = get_file(vma->vm_file);
 				up_read(&mm->mmap_sem);
 				ret = 1;
-				khugepaged_scan_shmem(mm, file->f_mapping,
-						pgoff, hpage);
+				khugepaged_scan_file(vma, file->f_mapping,
+						     pgoff, hpage);
 				fput(file);
 			} else {
 				ret = khugepaged_scan_pmd(mm, vma,
diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..87cfa2c19eda 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1192,8 +1192,10 @@ void page_add_file_rmap(struct page *page, bool compound)
 		}
 		if (!atomic_inc_and_test(compound_mapcount_ptr(page)))
 			goto out;
-		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
-		__inc_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		if (PageSwapBacked(page))
+			__inc_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		else
+			__inc_node_page_state(page, NR_FILE_PMDMAPPED);
 	} else {
 		if (PageTransCompound(page) && page_mapping(page)) {
 			VM_WARN_ON_ONCE(!PageLocked(page));
@@ -1232,8 +1234,10 @@ static void page_remove_file_rmap(struct page *page, bool compound)
 		}
 		if (!atomic_add_negative(-1, compound_mapcount_ptr(page)))
 			goto out;
-		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
-		__dec_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		if (PageSwapBacked(page))
+			__dec_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		else
+			__dec_node_page_state(page, NR_FILE_PMDMAPPED);
 	} else {
 		if (!atomic_add_negative(-1, &page->_mapcount))
 			goto out;
-- 
2.17.1



* Re: [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault()
  2019-06-14 18:22 ` [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault() Song Liu
@ 2019-06-17 14:59   ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-06-17 14:59 UTC (permalink / raw)
  To: Song Liu, linux-mm
  Cc: matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz

On Fri, 2019-06-14 at 11:22 -0700, Song Liu wrote:
> Currently, filemap_fault() avoids trace condition with truncate by
> checking page->mapping == mapping. This does not work for compound
> pages. This patch let it check compound_head(page)->mapping instead.
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>

Acked-by: Rik van Riel <riel@surriel.com>



* Re: [PATCH v2 2/3] mm,thp: stats for file backed THP
  2019-06-14 18:22 ` [PATCH v2 2/3] mm,thp: stats for file backed THP Song Liu
@ 2019-06-17 15:00   ` Rik van Riel
  2019-06-21 12:50   ` Kirill A. Shutemov
  1 sibling, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-06-17 15:00 UTC (permalink / raw)
  To: Song Liu, linux-mm
  Cc: matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz

On Fri, 2019-06-14 at 11:22 -0700, Song Liu wrote:
> In preparation for non-shmem THP, this patch adds two stats and
> exposes
> them in /proc/meminfo
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>

Acked-by: Rik van Riel <riel@surriel.com>



* Re: [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-14 18:22 ` [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
@ 2019-06-17 15:42   ` Rik van Riel
  2019-06-21 12:58   ` Kirill A. Shutemov
  1 sibling, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-06-17 15:42 UTC (permalink / raw)
  To: Song Liu, linux-mm
  Cc: matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz

On Fri, 2019-06-14 at 11:22 -0700, Song Liu wrote:
> 
> +#ifdef CONFIG_READ_ONLY_THP_FOR_FS
> +	if (shmem_file(vma->vm_file) ||
> +	    (vma->vm_file && (vm_flags & VM_DENYWRITE))) {
> +#else
>  	if (shmem_file(vma->vm_file)) {
> +#endif
>  		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
>  			return false;

Future cleanup idea: could it be nice to hide the
above behind a "vma_can_have_file_thp" function or
similar?

That inline function could also have a comment explaining
why the check is the way it is.

OTOH, I guess this series is just the first step towards
more complete functionality, and things are likely to change
again soon(ish).
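
A possible shape for such a helper, written as a userspace toy so it can be
compiled on its own. Everything here is an assumption for illustration:
struct toy_file, struct toy_vma and TOY_VM_DENYWRITE are stand-ins for the
kernel types and flag, and the config option is modeled as a plain bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the VM_DENYWRITE flag bit. */
#define TOY_VM_DENYWRITE 0x00000800UL

/* Toy stand-ins; NOT the kernel's struct file / struct vm_area_struct. */
struct toy_file { bool is_shmem; };
struct toy_vma {
	struct toy_file *vm_file;
	unsigned long vm_flags;
};

/*
 * Sketch of the suggested helper. shmem files may always use file THP.
 * Other files qualify only when the read-only THP option is enabled and
 * the vma is a text mapping (VM_DENYWRITE), because writes to such files
 * are not supported yet.
 */
bool vma_can_have_file_thp(const struct toy_vma *vma,
			   bool read_only_thp_for_fs)
{
	if (!vma->vm_file)
		return false;
	if (vma->vm_file->is_shmem)
		return true;
	return read_only_thp_for_fs &&
	       (vma->vm_flags & TOY_VM_DENYWRITE) != 0;
}
```

The real helper would take the vm_flags argument that
hugepage_vma_check() already receives and use CONFIG_READ_ONLY_THP_FOR_FS
instead of a runtime flag.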

> @@ -1628,14 +1692,14 @@ static void khugepaged_scan_shmem(struct
> mm_struct *mm,
>  			result = SCAN_EXCEED_NONE_PTE;
>  		} else {
>  			node = khugepaged_find_target_node();
> -			collapse_shmem(mm, mapping, start, hpage,
> node);
> +			collapse_file(vma, mapping, start, hpage,
> node);
>  		}
>  	}

If for some reason you end up posting a v3 of this
series, the s/_shmem/_file/ renaming could be broken
out into its own patch.

All the code looks good though.

Acked-by: Rik van Riel <riel@surriel.com>



* Re: [PATCH v2 0/3] Enable THP for text section of non-shmem files
  2019-06-14 18:22 [PATCH v2 0/3] Enable THP for text section of non-shmem files Song Liu
                   ` (2 preceding siblings ...)
  2019-06-14 18:22 ` [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
@ 2019-06-18 21:12 ` Andrew Morton
  2019-06-18 21:48   ` Song Liu
  2019-06-19  6:26   ` Song Liu
  3 siblings, 2 replies; 17+ messages in thread
From: Andrew Morton @ 2019-06-18 21:12 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, kernel-team,
	william.kucharski, chad.mynhier, mike.kravetz

On Fri, 14 Jun 2019 11:22:01 -0700 Song Liu <songliubraving@fb.com> wrote:

> This set follows up discussion at LSF/MM 2019. The motivation is to put
> text section of an application in THP, and thus reduces iTLB miss rate and
> improves performance. Both Facebook and Oracle showed strong interests to
> this feature.
> 
> To make reviews easier, this set aims a mininal valid product. Current
> version of the work does not have any changes to file system specific
> code. This comes with some limitations (discussed later).
> 
> This set enables an application to "hugify" its text section by simply
> running something like:
> 
>           madvise(0x600000, 0x80000, MADV_HUGEPAGE);
> 
> Before this call, the /proc/<pid>/maps looks like:
> 
>     00400000-074d0000 r-xp 00000000 00:27 2006927     app
> 
> After this call, part of the text section is split out and mapped to THP:
> 
>     00400000-00425000 r-xp 00000000 00:27 2006927     app
>     00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
>     00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
> 
> Limitations:
> 
> 1. This only works for text section (vma with VM_DENYWRITE).
> 2. Once the application put its own pages in THP, the file is read only.
>    open(file, O_WRITE) will fail with -ETXTBSY. To modify/update the file,
>    it must be removed first.

Removed?  Even if the original mmap/madvise has gone away?  hm.

I'm wondering if this limitation can be abused in some fashion: mmap a
file to which you have read permissions, run madvise(MADV_HUGEPAGE) and
thus prevent the file's owner from being able to modify the file?  Or
something like that.  What are the issues and protections here?




* Re: [PATCH v2 0/3] Enable THP for text section of non-shmem files
  2019-06-18 21:12 ` [PATCH v2 0/3] Enable THP for text section of non-shmem files Andrew Morton
@ 2019-06-18 21:48   ` Song Liu
  2019-06-20  1:13     ` Andrew Morton
  2019-06-19  6:26   ` Song Liu
  1 sibling, 1 reply; 17+ messages in thread
From: Song Liu @ 2019-06-18 21:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, Kernel Team,
	william.kucharski, chad.mynhier, mike.kravetz



> On Jun 18, 2019, at 2:12 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Fri, 14 Jun 2019 11:22:01 -0700 Song Liu <songliubraving@fb.com> wrote:
> 
>> This set follows up discussion at LSF/MM 2019. The motivation is to put
>> text section of an application in THP, and thus reduces iTLB miss rate and
>> improves performance. Both Facebook and Oracle showed strong interests to
>> this feature.
>> 
>> To make reviews easier, this set aims a mininal valid product. Current
>> version of the work does not have any changes to file system specific
>> code. This comes with some limitations (discussed later).
>> 
>> This set enables an application to "hugify" its text section by simply
>> running something like:
>> 
>>          madvise(0x600000, 0x80000, MADV_HUGEPAGE);
>> 
>> Before this call, the /proc/<pid>/maps looks like:
>> 
>>    00400000-074d0000 r-xp 00000000 00:27 2006927     app
>> 
>> After this call, part of the text section is split out and mapped to THP:
>> 
>>    00400000-00425000 r-xp 00000000 00:27 2006927     app
>>    00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
>>    00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
>> 
>> Limitations:
>> 
>> 1. This only works for text section (vma with VM_DENYWRITE).
>> 2. Once the application put its own pages in THP, the file is read only.
>>   open(file, O_WRITE) will fail with -ETXTBSY. To modify/update the file,
>>   it must be removed first.
> 
> Removed?  Even if the original mmap/madvise has gone away?  hm.

Yeah, it is not ideal. The THP holds a negative count on i_writecount,
so the file cannot be opened for write.

> 
> I'm wondering if this limitation can be abused in some fashion: mmap a
> file to which you have read permissions, run madvise(MADV_HUGEPAGE) and
> thus prevent the file's owner from being able to modify the file?  Or
> something like that.  What are the issues and protections here?

In this case, the owner needs to make a copy of the file, then remove
the original and put the updated copy in its place.

Ideally, we would either split the huge page on write, or fail the
write when we cannot split. However, the huge page information is only
available at page level, and on the write path, page-level information
is not available until write_begin(), so it is hard to stop writes at
an earlier stage. Therefore, in this version, we leverage i_writecount,
which is at the inode level, so it is easier to stop writes to the
file.

This is a temporary behavior, and it is gated by the config, so I guess
it is OK. It works well for our use cases. Once we have better
write support, we can remove the limitation.

If this is too weird, I am also open to suggestions. 

Thanks,
Song


* Re: [PATCH v2 0/3] Enable THP for text section of non-shmem files
  2019-06-18 21:12 ` [PATCH v2 0/3] Enable THP for text section of non-shmem files Andrew Morton
  2019-06-18 21:48   ` Song Liu
@ 2019-06-19  6:26   ` Song Liu
  1 sibling, 0 replies; 17+ messages in thread
From: Song Liu @ 2019-06-19  6:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux-MM, Matthew Wilcox, kirill.shutemov, Kernel Team,
	william.kucharski, chad.mynhier, mike.kravetz



> On Jun 18, 2019, at 2:12 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Fri, 14 Jun 2019 11:22:01 -0700 Song Liu <songliubraving@fb.com> wrote:
> 
>> This set follows up discussion at LSF/MM 2019. The motivation is to put
>> text section of an application in THP, and thus reduces iTLB miss rate and
>> improves performance. Both Facebook and Oracle showed strong interests to
>> this feature.
>> 
>> To make reviews easier, this set aims a mininal valid product. Current
>> version of the work does not have any changes to file system specific
>> code. This comes with some limitations (discussed later).
>> 
>> This set enables an application to "hugify" its text section by simply
>> running something like:
>> 
>>          madvise(0x600000, 0x80000, MADV_HUGEPAGE);
>> 
>> Before this call, the /proc/<pid>/maps looks like:
>> 
>>    00400000-074d0000 r-xp 00000000 00:27 2006927     app
>> 
>> After this call, part of the text section is split out and mapped to THP:
>> 
>>    00400000-00425000 r-xp 00000000 00:27 2006927     app
>>    00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
>>    00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
>> 
>> Limitations:
>> 
>> 1. This only works for text section (vma with VM_DENYWRITE).
>> 2. Once the application put its own pages in THP, the file is read only.
>>   open(file, O_WRITE) will fail with -ETXTBSY. To modify/update the file,
>>   it must be removed first.
> 
> Removed?  Even if the original mmap/madvise has gone away?  hm.
> 
> I'm wondering if this limitation can be abused in some fashion: mmap a
> file to which you have read permissions, run madvise(MADV_HUGEPAGE) and
> thus prevent the file's owner from being able to modify the file?  Or
> something like that.  What are the issues and protections here?
> 

I found a better solution to this limitation. Please refer to changes
in v3 (especially 6/6). 

Thanks,
Song


* Re: [PATCH v2 0/3] Enable THP for text section of non-shmem files
  2019-06-18 21:48   ` Song Liu
@ 2019-06-20  1:13     ` Andrew Morton
  2019-06-20  2:04       ` Song Liu
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2019-06-20  1:13 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, Kernel Team,
	william.kucharski, chad.mynhier, mike.kravetz

On Tue, 18 Jun 2019 21:48:16 +0000 Song Liu <songliubraving@fb.com> wrote:

> > I'm wondering if this limitation can be abused in some fashion: mmap a
> > file to which you have read permissions, run madvise(MADV_HUGEPAGE) and
> > thus prevent the file's owner from being able to modify the file?  Or
> > something like that.  What are the issues and protections here?
> 
> In this case, the owner needs to make a copy of the file, and then remove
> and update the original file.
> 
> In this version, we want to either split the huge page on writes, or fail
> the write when we cannot split. However, the huge page information is only
> available at page level, and on the write path, page level information
> is not available until write_begin(). So it is hard to stop writes at
> an earlier stage. Therefore, in this version, we leverage i_mmap_writable,
> which is at address_space level. So it is easier to stop writes to the
> file.
> 
> This is a temporary behavior. And it is gated by the config. So I guess
> it is OK. It works well for our use cases though. Once we have better
> write support, we can remove the limitation.
> 
> If this is too weird, I am also open to suggestions.

Well, it's more than weird?  This permits user A to deny service to
user B?  User A can, maliciously or accidentally, prevent user B from
modifying a file which user B has permission to modify?  Such as, umm,
/etc/hosts?



* Re: [PATCH v2 0/3] Enable THP for text section of non-shmem files
  2019-06-20  1:13     ` Andrew Morton
@ 2019-06-20  2:04       ` Song Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Song Liu @ 2019-06-20  2:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, Kernel Team,
	william.kucharski, chad.mynhier, mike.kravetz



> On Jun 19, 2019, at 6:13 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Tue, 18 Jun 2019 21:48:16 +0000 Song Liu <songliubraving@fb.com> wrote:
> 
>>> I'm wondering if this limitation can be abused in some fashion: mmap a
>>> file to which you have read permissions, run madvise(MADV_HUGEPAGE) and
>>> thus prevent the file's owner from being able to modify the file?  Or
>>> something like that.  What are the issues and protections here?
>> 
>> In this case, the owner needs to make a copy of the file, and then remove
>> and update the original file.
>> 
>> In this version, we want to either split the huge page on writes, or fail
>> the write when we cannot split. However, the huge page information is only
>> available at page level, and on the write path, page level information
>> is not available until write_begin(). So it is hard to stop writes at
>> an earlier stage. Therefore, in this version, we leverage i_mmap_writable,
>> which is at address_space level. So it is easier to stop writes to the
>> file.
>> 
>> This is a temporary behavior. And it is gated by the config. So I guess
>> it is OK. It works well for our use cases though. Once we have better
>> write support, we can remove the limitation.
>> 
>> If this is too weird, I am also open to suggestions.
> 
> Well, it's more than weird?  This permits user A to deny service to
> user B?  User A can, maliciously or accidentally, prevent user B from
> modifying a file which user B has permission to modify?  Such as, umm,
> /etc/hosts?

I have removed this behavior in v3. I think we really don't need this. 

Thanks,
Song



* Re: [PATCH v2 2/3] mm,thp: stats for file backed THP
  2019-06-14 18:22 ` [PATCH v2 2/3] mm,thp: stats for file backed THP Song Liu
  2019-06-17 15:00   ` Rik van Riel
@ 2019-06-21 12:50   ` Kirill A. Shutemov
  2019-06-21 14:09     ` Song Liu
  1 sibling, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2019-06-21 12:50 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, kernel-team,
	william.kucharski, chad.mynhier, mike.kravetz

On Fri, Jun 14, 2019 at 11:22:03AM -0700, Song Liu wrote:
> In preparation for non-shmem THP, this patch adds two stats and exposes
> them in /proc/meminfo
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>

I think you also need to cover smaps.

See my old patch for reference:

https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/commit/?h=hugeext4/v6&id=e629d1c4f9200c16bd7b4b02e8016d020c0869cb
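For reference, a minimal sketch of reading such a counter back from
meminfo-style text. The helper name `meminfo_kb` is hypothetical, and the
field name `FileHugePages` mentioned below is an assumption: it is the name
that appears in later mainline kernels, and the names used in this v2 patch
may differ.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up "key:" at the start of a line in /proc/meminfo-style text and
 * return its value in kB, or -1 if the key is absent or is not followed
 * by a number.
 */
static long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (!strncmp(line, key, klen) && line[klen] == ':') {
			long kb;

			if (sscanf(line + klen + 1, "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

In practice the text would be the contents of /proc/meminfo, queried with a
key such as "FileHugePages".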

-- 
 Kirill A. Shutemov



* Re: [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-14 18:22 ` [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
  2019-06-17 15:42   ` Rik van Riel
@ 2019-06-21 12:58   ` Kirill A. Shutemov
  2019-06-21 13:08     ` Song Liu
  1 sibling, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2019-06-21 12:58 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, matthew.wilcox, kirill.shutemov, kernel-team,
	william.kucharski, chad.mynhier, mike.kravetz

On Fri, Jun 14, 2019 at 11:22:04AM -0700, Song Liu wrote:
> This patch is (hopefully) the first step to enable THP for non-shmem
> filesystems.
> 
> This patch enables an application to put part of its text sections to THP
> via madvise, for example:
> 
>     madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
> 
> We tried to reuse the logic for THP on tmpfs. The following functions are
> renamed to reflect the new functionality:
> 
> 	collapse_shmem()	=>  collapse_file()
> 	khugepaged_scan_shmem()	=>  khugepaged_scan_file()
> 
> Currently, write is not supported for non-shmem THP. This is enforced by
> taking a negative i_writecount. Therefore, if a file has THP pages in the
> page cache, open() for write will fail. To update/modify the file, the
> user needs to remove it first.
> 
> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> feature.

Please document explicitly that the feature opens a local DoS attack: any
user with read access to a file can block writes to the file by using
MADV_HUGEPAGE for a range of the file.

As is, it must only be used with trusted userspace.

We also might want to have a mount option in addition to the Kconfig
option to enable the feature on a per-mount basis.

-- 
 Kirill A. Shutemov



* Re: [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-21 12:58   ` Kirill A. Shutemov
@ 2019-06-21 13:08     ` Song Liu
  2019-06-21 13:11       ` Kirill A. Shutemov
  0 siblings, 1 reply; 17+ messages in thread
From: Song Liu @ 2019-06-21 13:08 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Linux-MM, Matthew Wilcox, Kirill A. Shutemov, Kernel Team,
	William Kucharski, Chad Mynhier, Mike Kravetz


Hi Kirill,

> On Jun 21, 2019, at 5:58 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Fri, Jun 14, 2019 at 11:22:04AM -0700, Song Liu wrote:
>> This patch is (hopefully) the first step to enable THP for non-shmem
>> filesystems.
>> 
>> This patch enables an application to put part of its text sections to THP
>> via madvise, for example:
>> 
>>    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>> 
>> We tried to reuse the logic for THP on tmpfs. The following functions are
>> renamed to reflect the new functionality:
>> 
>> 	collapse_shmem()	=>  collapse_file()
>> 	khugepaged_scan_shmem()	=>  khugepaged_scan_file()
>> 
>> Currently, write is not supported for non-shmem THP. This is enforced by
>> taking a negative i_writecount. Therefore, if a file has THP pages in the
>> page cache, open() for write will fail. To update/modify the file, the
>> user needs to remove it first.
>> 
>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>> feature.
> 
> Please document explicitly that the feature opens a local DoS attack: any
> user with read access to a file can block writes to the file by using
> MADV_HUGEPAGE for a range of the file.
> 
> As is, it must only be used with trusted userspace.
> 
> We also might want to have a mount option in addition to the Kconfig
> option to enable the feature on a per-mount basis.

This behavior has been removed in v3 through v5.

Thanks,
Song


* Re: [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-21 13:08     ` Song Liu
@ 2019-06-21 13:11       ` Kirill A. Shutemov
  0 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2019-06-21 13:11 UTC (permalink / raw)
  To: Song Liu
  Cc: Linux-MM, Matthew Wilcox, Kirill A. Shutemov, Kernel Team,
	William Kucharski, Chad Mynhier, Mike Kravetz

On Fri, Jun 21, 2019 at 01:08:39PM +0000, Song Liu wrote:
> 
> Hi Kirill,
> 
> > On Jun 21, 2019, at 5:58 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > 
> > On Fri, Jun 14, 2019 at 11:22:04AM -0700, Song Liu wrote:
> >> This patch is (hopefully) the first step to enable THP for non-shmem
> >> filesystems.
> >> 
> >> This patch enables an application to put part of its text sections to THP
> >> via madvise, for example:
> >> 
> >>    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
> >> 
> >> We tried to reuse the logic for THP on tmpfs. The following functions are
> >> renamed to reflect the new functionality:
> >> 
> >> 	collapse_shmem()	=>  collapse_file()
> >> 	khugepaged_scan_shmem()	=>  khugepaged_scan_file()
> >> 
> >> Currently, write is not supported for non-shmem THP. This is enforced by
> >> taking a negative i_writecount. Therefore, if a file has THP pages in the
> >> page cache, open() for write will fail. To update/modify the file, the
> >> user needs to remove it first.
> >> 
> >> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> >> feature.
> > 
> > Please document explicitly that the feature opens a local DoS attack: any
> > user with read access to a file can block writes to the file by using
> > MADV_HUGEPAGE for a range of the file.
> > 
> > As is, it must only be used with trusted userspace.
> > 
> > We also might want to have a mount option in addition to the Kconfig
> > option to enable the feature on a per-mount basis.
> 
> This behavior has been removed in v3 through v5.

Yes, I've caught up with that. :P

-- 
 Kirill A. Shutemov



* Re: [PATCH v2 2/3] mm,thp: stats for file backed THP
  2019-06-21 12:50   ` Kirill A. Shutemov
@ 2019-06-21 14:09     ` Song Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Song Liu @ 2019-06-21 14:09 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Linux-MM, matthew.wilcox, kirill.shutemov, Kernel Team,
	william.kucharski, chad.mynhier, mike.kravetz



> On Jun 21, 2019, at 5:50 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Fri, Jun 14, 2019 at 11:22:03AM -0700, Song Liu wrote:
>> In preparation for non-shmem THP, this patch adds two stats and exposes
>> them in /proc/meminfo
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
> 
> I think you also need to cover smaps.
> 
> See my old patch for reference:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/commit/?h=hugeext4/v6&id=e629d1c4f9200c16bd7b4b02e8016d020c0869cb
> 
Thanks for the reference!

Adding the fix. 

Thanks,
Song



Thread overview: 17+ messages
2019-06-14 18:22 [PATCH v2 0/3] Enable THP for text section of non-shmem files Song Liu
2019-06-14 18:22 ` [PATCH v2 1/3] mm: check compound_head(page)->mapping in filemap_fault() Song Liu
2019-06-17 14:59   ` Rik van Riel
2019-06-14 18:22 ` [PATCH v2 2/3] mm,thp: stats for file backed THP Song Liu
2019-06-17 15:00   ` Rik van Riel
2019-06-21 12:50   ` Kirill A. Shutemov
2019-06-21 14:09     ` Song Liu
2019-06-14 18:22 ` [PATCH v2 3/3] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
2019-06-17 15:42   ` Rik van Riel
2019-06-21 12:58   ` Kirill A. Shutemov
2019-06-21 13:08     ` Song Liu
2019-06-21 13:11       ` Kirill A. Shutemov
2019-06-18 21:12 ` [PATCH v2 0/3] Enable THP for text section of non-shmem files Andrew Morton
2019-06-18 21:48   ` Song Liu
2019-06-20  1:13     ` Andrew Morton
2019-06-20  2:04       ` Song Liu
2019-06-19  6:26   ` Song Liu
