lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 10/40] lustre: llite: check truncated page in ->readpage()
Date: Sun,  9 Apr 2023 08:12:50 -0400	[thread overview]
Message-ID: <1681042400-15491-11-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1681042400-15491-1-git-send-email-jsimmons@infradead.org>

From: Qian Yingjin <qian@ddn.com>

The page end offset calculation in filemap_get_read_batch() was
off by one. This bug was introduced in commit v5.11-10234-gcbd59c48ae
("mm/filemap: use head pages in generic_file_buffered_read")

When a read is submitted with end offset 1048575, it calculates
the end page index for read of 256 where it should be 255. This
results in the readpage() call for the page with index 256 is over
stripe boundary and may not be covered by a DLM extent lock.

This happens in a corner race case: filemap_get_read_batch()
batches the page with index 256 for read, but later this page is
removed from page cache due to the lock protected it being revoked,
but has a reference count due to the batch.  This results in this
page in the read path is not covered by any DLM lock.

The solution is simple. We can check whether the page was
truncated and removed from page cache in ->readpage() by the
address_sapce pointer of the page. If it was truncated, return
AOP_TRUNCATED_PAGE to the upper caller.  This will cause the
kernel to retry to batch pages and the truncated page will not
be added as it was already removed from page cache of the file.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16412
Lustre-commit: 209afbe28b5f164bd ("LU-16412 llite: check truncated page in ->readpage()")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49433
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_support.h |  6 ++++--
 fs/lustre/llite/rw.c            | 35 +++++++++++++++++++++++++++++++++++
 fs/lustre/llite/rw26.c          |  7 +++++++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index a2930c8..4ef5c61 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -458,8 +458,8 @@
 /* was	OBD_FAIL_LLOG_CATINFO_NET			0x1309 until 2.3 */
 #define OBD_FAIL_MDS_SYNC_CAPA_SL			0x1310
 #define OBD_FAIL_SEQ_ALLOC				0x1311
-#define OBD_FAIL_PLAIN_RECORDS			    0x1319
-#define OBD_FAIL_CATALOG_FULL_CHECK		    0x131a
+#define OBD_FAIL_PLAIN_RECORDS				0x1319
+#define OBD_FAIL_CATALOG_FULL_CHECK			0x131a
 
 #define OBD_FAIL_LLITE					0x1400
 #define OBD_FAIL_LLITE_FAULT_TRUNC_RACE			0x1401
@@ -488,6 +488,8 @@
 #define OBD_FAIL_LLITE_PAGE_ALLOC			0x1418
 #define OBD_FAIL_LLITE_OPEN_DELAY			0x1419
 #define OBD_FAIL_LLITE_XATTR_PAUSE			0x1420
+#define OBD_FAIL_LLITE_PAGE_INVALIDATE_PAUSE		0x1421
+#define OBD_FAIL_LLITE_READPAGE_PAUSE			0x1422
 
 #define OBD_FAIL_FID_INDIR				0x1501
 #define OBD_FAIL_FID_INLMA				0x1502
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index 0b14ea6..dea2af1 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1865,6 +1865,41 @@ int ll_readpage(struct file *file, struct page *vmpage)
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	int result;
 
+	if (OBD_FAIL_PRECHECK(OBD_FAIL_LLITE_READPAGE_PAUSE)) {
+		unlock_page(vmpage);
+		OBD_FAIL_TIMEOUT(OBD_FAIL_LLITE_READPAGE_PAUSE, cfs_fail_val);
+		lock_page(vmpage);
+	}
+
+	/*
+	 * The @vmpage got truncated.
+	 * This is a kernel bug introduced since kernel 5.12:
+	 * comment: cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251
+	 * ("mm/filemap: use head pages in generic_file_buffered_read")
+	 *
+	 * The page end offset calculation in filemap_get_read_batch() was off
+	 * by one.  When a read is submitted with end offset 1048575, then it
+	 * calculates the end page for read of 256 where it should be 255. This
+	 * results in the readpage() for the page with index 256 is over stripe
+	 * boundary and may not covered by a DLM extent lock.
+	 *
+	 * This happens in a corner race case: filemap_get_read_batch() adds
+	 * the page with index 256 for read which is not in the current read
+	 * I/O context, and this page is being invalidated and will be removed
+	 * from page cache due to the lock protected it being revoken. This
+	 * results in this page in the read path not covered by any DLM lock.
+	 *
+	 * The solution is simple. Check whether the page was truncated in
+	 * ->readpage(). If so, just return AOP_TRUNCATED_PAGE to the upper
+	 * caller. Then the kernel will retry to batch pages, and it will not
+	 * add the truncated page into batches as it was removed from page
+	 * cache of the file.
+	 */
+	if (vmpage->mapping != inode->i_mapping) {
+		unlock_page(vmpage);
+		return AOP_TRUNCATED_PAGE;
+	}
+
 	lcc = ll_cl_find(inode);
 	if (lcc) {
 		env = lcc->lcc_env;
diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c
index cadded4..6700717 100644
--- a/fs/lustre/llite/rw26.c
+++ b/fs/lustre/llite/rw26.c
@@ -96,6 +96,13 @@ static void ll_invalidatepage(struct page *vmpage, unsigned int offset,
 		}
 		cl_env_percpu_put(env);
 	}
+
+	if (OBD_FAIL_PRECHECK(OBD_FAIL_LLITE_PAGE_INVALIDATE_PAUSE)) {
+		unlock_page(vmpage);
+		OBD_FAIL_TIMEOUT(OBD_FAIL_LLITE_PAGE_INVALIDATE_PAUSE,
+				 cfs_fail_val);
+		lock_page(vmpage);
+	}
 }
 
 static int ll_releasepage(struct page *vmpage, gfp_t gfp_mask)
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2023-04-09 12:29 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-09 12:12 [lustre-devel] [PATCH 00/40] lustre: backport OpenSFS changes from March XX, 2023 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 01/40] lustre: protocol: basic batching processing framework James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 02/40] lustre: lov: fiemap improperly handles fm_extent_count=0 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 03/40] lustre: llite: SIGBUS is possible on a race with page reclaim James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 04/40] lustre: osc: page fault in osc_release_bounce_pages() James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 05/40] lustre: readahead: add stats for read-ahead page count James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 06/40] lustre: quota: enforce project quota for root James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 07/40] lustre: ldlm: send the cancel RPC asap James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 08/40] lustre: enc: align Base64 encoding with RFC 4648 base64url James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 09/40] lustre: quota: fix insane grant quota James Simmons
2023-04-09 12:12 ` James Simmons [this message]
2023-04-09 12:12 ` [lustre-devel] [PATCH 11/40] lnet: o2iblnd: Fix key mismatch issue James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 12/40] lustre: sec: fid2path for encrypted files James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 13/40] lustre: sec: Lustre/HSM on enc file with enc key James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 14/40] lustre: llite: check read page past requested James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 15/40] lustre: llite: fix relatime support James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 16/40] lustre: ptlrpc: clarify AT error message James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 17/40] lustre: update version to 2.15.54 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 18/40] lustre: tgt: skip free inodes in OST weights James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 19/40] lustre: fileset: check fileset for operations by fid James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 20/40] lustre: clio: Remove cl_page_size() James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 21/40] lustre: fid: clean up OBIF_MAX_OID and IDIF_MAX_OID James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 22/40] lustre: llog: fix processing of a wrapped catalog James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 23/40] lustre: llite: replace lld_nfs_dentry flag with opencache handling James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 24/40] lustre: llite: match lock in corresponding namespace James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 25/40] lnet: libcfs: remove unused hash code James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 26/40] lustre: client: -o network needs add_conn processing James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 27/40] lnet: Lock primary NID logic James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 28/40] lnet: Peers added via kernel API should be permanent James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 29/40] lnet: don't delete peer created by Lustre James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 30/40] lnet: memory leak in copy_ioc_udsp_descr James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 31/40] lnet: remove crash with UDSP James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 32/40] lustre: ptlrpc: fix clang build errors James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 33/40] lustre: ldlm: remove client_import_find_conn() James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 34/40] lnet: add 'force' option to lnetctl peer del James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 35/40] lustre: ldlm: BL_AST lock cancel still can be batched James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 36/40] lnet: lnet_parse_route uses wrong loop var James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 37/40] lustre: tgt: add qos debug James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 38/40] lustre: enc: file names encryption when using secure boot James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 39/40] lustre: uapi: add DMV_IMP_INHERIT connect flag James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 40/40] lustre: llite: dir layout inheritance fixes James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1681042400-15491-11-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).