From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, cfijalkovich@google.com,
	willy@infradead.org, song@kernel.org,
	william.kucharski@oracle.com, hughd@google.com
Subject: [PATCH v2 1/2] mm, thp: check page mapping when truncating page cache
Date: Wed, 22 Sep 2021 15:06:44 +0800
Message-ID: <20210922070645.47345-2-rongwei.wang@linux.alibaba.com>
In-Reply-To: <20210906121200.57905-1-rongwei.wang@linux.alibaba.com>

Transparent huge pages (THP) now support read-only non-shmem files. A
file-backed THP is collapsed by khugepaged and truncated when the file
is written (for shared libraries).

However, there are races in two possible places:

1) multiple writers truncate the same page cache concurrently;
2) collapse_file rolls back while a writer truncates the page cache.

In both cases, a subpage of a file THP can be returned by find_get_entry
in truncate_inode_pages_range, which triggers the PageTail BUG_ON in
truncate_inode_page, as follows:

[40326.247034] page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff
[40326.247041] head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0
[40326.247046] flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head)
[40326.247051] raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400
[40326.247053] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[40326.247055] head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000
[40326.247057] head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000
[40326.247058] page dumped because: VM_BUG_ON_PAGE(PageTail(page))
[40326.247077] ------------[ cut here ]------------
[40326.247080] kernel BUG at mm/truncate.c:213!
[40326.280581] Internal error: Oops - BUG: 0 [#1] SMP
[40326.281077] Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ...
[40326.285130] CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ...
[40326.286202] Hardware name: ECS, BIOS 0.0.0 02/06/2015
[40326.286968] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[40326.287584] pc : truncate_inode_page+0x64/0x70
[40326.288040] lr : truncate_inode_page+0x64/0x70
[40326.288498] sp : ffff80001b60b900
[40326.288837] x29: ffff80001b60b900 x28: 00000000000007ff
[40326.289377] x27: ffff80001b60b9a0 x26: 0000000000000000
[40326.289943] x25: 000000000000000f x24: ffff80001b60b9a0
[40326.290485] x23: ffff80001b60ba18 x22: ffff0001e0999ea8
[40326.291027] x21: ffff0000c21db300 x20: ffffffffffffffff
[40326.291566] x19: fffffe001310ffc0 x18: 0000000000000020
[40326.292106] x17: 0000000000000000 x16: 0000000000000000
[40326.292655] x15: ffff0000c21db960 x14: 3030306666666620
[40326.293197] x13: 6666666666666666 x12: 3130303030303030
[40326.293746] x11: ffff8000117b69b8 x10: 00000000ffff8000
[40326.294313] x9 : ffff80001012690c x8 : 0000000000000000
[40326.294851] x7 : ffff8000114f69b8 x6 : 0000000000017ffd
[40326.295392] x5 : ffff0007fffbcbc8 x4 : ffff80001b60b5c0
[40326.295942] x3 : 0000000000000001 x2 : 0000000000000000
[40326.296497] x1 : 0000000000000000 x0 : 0000000000000000
[40326.297047] Call trace:
[40326.297304]  truncate_inode_page+0x64/0x70
[40326.297724]  truncate_inode_pages_range+0x550/0x7e4
[40326.298251]  truncate_pagecache+0x58/0x80
[40326.298662]  do_dentry_open+0x1e4/0x3c0
[40326.299052]  vfs_open+0x38/0x44
[40326.299377]  do_open+0x1f0/0x310
[40326.299709]  path_openat+0x114/0x1dc
[40326.300077]  do_filp_open+0x84/0x134
[40326.300444]  do_sys_openat2+0xbc/0x164
[40326.300825]  __arm64_sys_openat+0x74/0xc0
[40326.301236]  el0_svc_common.constprop.0+0x88/0x220
[40326.301723]  do_el0_svc+0x30/0xa0
[40326.302089]  el0_svc+0x20/0x30
[40326.302404]  el0_sync_handler+0x1a4/0x1b0
[40326.302814]  el0_sync+0x180/0x1c0
[40326.303157] Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000)
[40326.303775] ---[ end trace f70cdb42cb7c2d42 ]---
[40326.304244] Kernel panic - not syncing: Oops - BUG: Fatal exception

This patch checks the page mapping in truncate_inode_pages_range and
retries when a subpage of a file THP is found.
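
The "lock, recheck the mapping, restart" pattern can be sketched in
userspace as follows. This is only a single-threaded toy model, not the
kernel code: every toy_* name is hypothetical, and the hook merely
stands in for a second truncator running between lookup and lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct page. */
struct toy_mapping { int id; };

struct toy_page {
	struct toy_mapping *mapping;	/* NULL once truncated */
	bool locked;
};

/* Called between lookup and lock to model a concurrent truncator. */
typedef void (*toy_race_hook)(struct toy_page **slots, size_t nr, size_t index);

static void toy_lock_page(struct toy_page *page)
{
	assert(!page->locked);		/* single-threaded model: never contended */
	page->locked = true;
}

static void toy_unlock_page(struct toy_page *page)
{
	assert(page->locked);
	page->locked = false;
}

/*
 * Truncate every page in slots[0..nr). A page whose mapping no longer
 * matches after locking was truncated by someone else, so the scan is
 * restarted from the beginning. Returns the number of restarts.
 */
static unsigned int toy_truncate_range(struct toy_mapping *mapping,
				       struct toy_page **slots, size_t nr,
				       toy_race_hook hook)
{
	unsigned int restarts = 0;
	size_t i;

restart:
	for (i = 0; i < nr; i++) {
		struct toy_page *page = slots[i];	/* lookup */

		if (!page)
			continue;
		if (hook)
			hook(slots, nr, i);		/* race window */

		toy_lock_page(page);
		if (page->mapping != mapping) {
			/* Already truncated: restart to make sure all gone. */
			toy_unlock_page(page);
			restarts++;
			goto restart;
		}
		page->mapping = NULL;
		slots[i] = NULL;
		toy_unlock_page(page);
	}
	return restarts;
}

/* Models a second truncator that wins the race for the page at index 1. */
static void toy_racing_truncator(struct toy_page **slots, size_t nr, size_t index)
{
	if (index == 1 && index < nr && slots[index]) {
		slots[index]->mapping = NULL;
		slots[index] = NULL;
	}
}

/* Runs the model on three pages; returns restarts, or -1 on a bad end state. */
int toy_demo(void)
{
	struct toy_mapping m = { 1 };
	struct toy_page pages[3] = { { &m, false }, { &m, false }, { &m, false } };
	struct toy_page *slots[3] = { &pages[0], &pages[1], &pages[2] };
	unsigned int restarts = toy_truncate_range(&m, slots, 3, toy_racing_truncator);
	size_t i;

	for (i = 0; i < 3; i++)
		if (slots[i] || pages[i].mapping || pages[i].locked)
			return -1;
	return (int)restarts;
}
```

In this model the racing truncator removes one page after it has been
looked up, so the scan restarts once and then completes with every slot
empty. Without the mapping check, the stale page would be truncated a
second time.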

Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
---
 mm/filemap.c  |  7 ++++++-
 mm/truncate.c | 17 ++++++++++++++++-
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293..a3af2ec 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2093,7 +2093,6 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t start,
 		if (!xa_is_value(page)) {
 			if (page->index < start)
 				goto put;
-			VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 			if (page->index + thp_nr_pages(page) - 1 > end)
 				goto put;
 			if (!trylock_page(page))
@@ -2102,6 +2101,12 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t start,
 				goto unlock;
 			VM_BUG_ON_PAGE(!thp_contains(page, xas.xa_index),
 					page);
+			/*
+			 * We can find and get head page of file THP with
+			 * non-head index. The head page should have already
+			 * been truncated with page->mapping reset to NULL.
+			 */
+			VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 		}
 		indices[pvec->nr] = xas.xa_index;
 		if (!pagevec_add(pvec, page))
diff --git a/mm/truncate.c b/mm/truncate.c
index 714eaf1..3f47190 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -319,7 +319,8 @@ void truncate_inode_pages_range(struct address_space *mapping,
 	index = start;
 	while (index < end && find_lock_entries(mapping, index, end - 1,
 			&pvec, indices)) {
-		index = indices[pagevec_count(&pvec) - 1] + 1;
+		index = indices[pagevec_count(&pvec) - 1] +
+			thp_nr_pages(pvec.pages[pagevec_count(&pvec) - 1]);
 		truncate_exceptional_pvec_entries(mapping, &pvec, indices);
 		for (i = 0; i < pagevec_count(&pvec); i++)
 			truncate_cleanup_page(pvec.pages[i]);
@@ -392,6 +393,20 @@ void truncate_inode_pages_range(struct address_space *mapping,
 				continue;
 
 			lock_page(page);
+			/*
+			 * Already truncated? We can find and get subpage
+			 * of file THP, of which the head page is truncated.
+			 *
+			 * In addition, another race will be avoided, where
+			 * collapse_file rolls back when writer truncates the
+			 * page cache.
+			 */
+			if (page_mapping(page) != mapping) {
+				unlock_page(page);
+				/* Restart to make sure all gone */
+				index = start - 1;
+				break;
+			}
 			WARN_ON(page_to_index(page) != index);
 			wait_on_page_writeback(page);
 			truncate_inode_page(mapping, page);
-- 
1.8.3.1


