All of lore.kernel.org
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Wang Shilong <wshilong@ddn.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 09/39] lustre: llite: fix client evicition with DIO
Date: Thu, 21 Jan 2021 12:16:32 -0500	[thread overview]
Message-ID: <1611249422-556-10-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org>

From: Wang Shilong <wshilong@ddn.com>

We set lockless in file open if O_DIRECT flag is passed,
however O_DIRECT flag could be cleared by
fcntl(..., F_SETFL, ...).

Finally we comes to a case where buffer IO without lock
held properly, and hit hang:

[<ffffffffc0d421ed>] osc_extent_wait+0x21d/0x7c0 [osc]
[<ffffffffc0d44897>] osc_cache_wait_range+0x2e7/0x940 [osc]
[<ffffffffc0d4585e>] osc_cache_writeback_range+0x96e/0xff0 [osc]
[<ffffffffc0d31c45>] osc_lock_flush+0x195/0x290 [osc]
[<ffffffffc0d31d7c>] osc_lock_lockless_cancel+0x3c/0xe0 [osc]
[<ffffffffc081f488>] cl_lock_cancel+0x78/0x160 [obdclass]
[<ffffffffc0cd8079>] lov_lock_cancel+0x99/0x190 [lov]
[<ffffffffc081f488>] cl_lock_cancel+0x78/0x160 [obdclass]
[<ffffffffc081f9a2>] cl_lock_release+0x52/0x140 [obdclass]
[<ffffffffc08238a9>] cl_io_unlock+0x139/0x290 [obdclass]
[<ffffffffc08242e8>] cl_io_loop+0xb8/0x200 [obdclass]
[<ffffffffc0e1d36b>] ll_file_io_generic+0x91b/0xdf0 [lustre]
[<ffffffffc0e1dd0c>] ll_file_aio_write+0x29c/0x6e0 [lustre]
[<ffffffffc0e1e250>] ll_file_write+0x100/0x1c0 [lustre]
[<ffffffffa984aa90>] vfs_write+0xc0/0x1f0
[<ffffffffa984b8af>] SyS_write+0x7f/0xf0
[<ffffffffa9d8eede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff

Lock cancel time out in the server side and client
eviction happen.

Fix this problem by testing O_DIRECT flag to decide if
we could issue lockless IO.

Fixes: bf18998820 ("lustre: clio: turn on lockless for some kind of IO")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14072
Lustre-commit: f348437218d0b9 ("LU-14072 llite: fix client evicition with DIO")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/40389
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/cl_object.h           | 2 +-
 fs/lustre/llite/file.c                  | 9 +++------
 fs/lustre/llite/rw.c                    | 4 ++--
 fs/lustre/llite/rw26.c                  | 6 +++---
 fs/lustre/llite/vvp_io.c                | 6 +++---
 include/uapi/linux/lustre/lustre_user.h | 1 -
 6 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h
index e17385c0..d2cee34 100644
--- a/fs/lustre/include/cl_object.h
+++ b/fs/lustre/include/cl_object.h
@@ -1962,7 +1962,7 @@ struct cl_io {
 	/**
 	 * Ignore lockless and do normal locking for this io.
 	 */
-				ci_ignore_lockless:1,
+				ci_dio_lock:1,
 	/**
 	 * Set if we've tried all mirrors for this read IO, if it's not set,
 	 * the read IO will check to-be-read OSCs' status, and make fast-switch
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index f7f917b..2b0ffad 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -945,9 +945,6 @@ int ll_file_open(struct inode *inode, struct file *file)
 
 	mutex_unlock(&lli->lli_och_mutex);
 
-	/* lockless for direct IO so that it can do IO in parallel */
-	if (file->f_flags & O_DIRECT)
-		fd->fd_flags |= LL_FILE_LOCKLESS_IO;
 	fd = NULL;
 
 	/* Must do this outside lli_och_mutex lock to prevent deadlock where
@@ -1573,7 +1570,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
 	ssize_t result = 0;
 	int rc = 0;
 	unsigned int retried = 0;
-	unsigned int ignore_lockless = 0;
+	unsigned int dio_lock = 0;
 	bool is_aio = false;
 	struct cl_dio_aio *ci_aio = NULL;
 
@@ -1595,7 +1592,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
 	io = vvp_env_thread_io(env);
 	ll_io_init(io, file, iot == CIT_WRITE, args);
 	io->ci_aio = ci_aio;
-	io->ci_ignore_lockless = ignore_lockless;
+	io->ci_dio_lock = dio_lock;
 	io->ci_ndelay_tried = retried;
 
 	if (cl_io_rw_init(env, io, iot, *ppos, count) == 0) {
@@ -1675,7 +1672,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
 		       *ppos, count, result);
 		/* preserve the tried count for FLR */
 		retried = io->ci_ndelay_tried;
-		ignore_lockless = io->ci_ignore_lockless;
+		dio_lock = io->ci_dio_lock;
 		goto restart;
 	}
 
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index 54f0b9a..da4a26d 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1723,9 +1723,9 @@ int ll_readpage(struct file *file, struct page *vmpage)
 	 */
 	if (file->f_flags & O_DIRECT &&
 	    lcc && lcc->lcc_type == LCC_RW &&
-	    !io->ci_ignore_lockless) {
+	    !io->ci_dio_lock) {
 		unlock_page(vmpage);
-		io->ci_ignore_lockless = 1;
+		io->ci_dio_lock = 1;
 		io->ci_need_restart = 1;
 		return -ENOLCK;
 	}
diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c
index 1736e9a..605a326 100644
--- a/fs/lustre/llite/rw26.c
+++ b/fs/lustre/llite/rw26.c
@@ -538,12 +538,12 @@ static int ll_write_begin(struct file *file, struct address_space *mapping,
 		}
 
 		/*
-		 * Direct read can fall back to buffered read, but DIO is done
+		 * Direct write can fall back to buffered read, but DIO is done
 		 * with lockless i/o, and buffered requires LDLM locking, so
 		 * in this case we must restart without lockless.
 		 */
-		if (!io->ci_ignore_lockless) {
-			io->ci_ignore_lockless = 1;
+		if (!io->ci_dio_lock) {
+			io->ci_dio_lock = 1;
 			io->ci_need_restart = 1;
 			result = -ENOLCK;
 			goto out;
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index d6ca267..8dbe835 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -557,11 +557,11 @@ static int vvp_io_rw_lock(const struct lu_env *env, struct cl_io *io,
 	if (vio->vui_fd) {
 		/* Group lock held means no lockless any more */
 		if (vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)
-			io->ci_ignore_lockless = 1;
+			io->ci_dio_lock = 1;
 
 		if (ll_file_nolock(vio->vui_fd->fd_file) ||
-		    (vio->vui_fd->fd_flags & LL_FILE_LOCKLESS_IO &&
-		     !io->ci_ignore_lockless))
+		    (vio->vui_fd->fd_file->f_flags & O_DIRECT &&
+		     !io->ci_dio_lock))
 			ast_flags |= CEF_NEVER;
 	}
 
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index b0301e1..143b7d5 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -402,7 +402,6 @@ struct ll_ioc_lease_id {
 #define LL_FILE_GROUP_LOCKED	0x00000002
 #define LL_FILE_READAHEA	0x00000004
 #define LL_FILE_LOCKED_DIRECTIO 0x00000008 /* client-side locks with dio */
-#define LL_FILE_LOCKLESS_IO	0x00000010 /* server-side locks with cio */
 #define LL_FILE_FLOCK_WARNING	0x00000020 /* warned about disabled flock */
 
 #define LOV_USER_MAGIC_V1	0x0BD10BD0
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2021-01-21 17:17 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-21 17:16 [lustre-devel] [PATCH 00/39] lustre: update to latest OpenSFS version as of Jan 21 2021 James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 01/39] lustre: ldlm: page discard speedup James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 02/39] lustre: ptlrpc: fixes for RCU-related stalls James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 03/39] lustre: ldlm: Do not wait for lock replay sending if import dsconnected James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 04/39] lustre: ldlm: Do not hang if recovery restarted during lock replay James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 05/39] lnet: Correct handling of NETWORK_TIMEOUT status James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 06/39] lnet: Introduce constant for net ID of LNET_NID_ANY James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 07/39] lustre: ldlm: Don't re-enqueue glimpse lock on read James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 08/39] lustre: osc: prevent overflow of o_dropped James Simmons
2021-01-21 17:16 ` James Simmons [this message]
2021-01-21 17:16 ` [lustre-devel] [PATCH 10/39] lustre: Use vfree_atomic instead of vfree James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 11/39] lnet: lnd: Use NETWORK_TIMEOUT for txs on ibp_tx_queue James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 12/39] lnet: lnd: Use NETWORK_TIMEOUT for some conn failures James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 13/39] lustre: llite: allow DIO with unaligned IO count James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 14/39] lustre: osc: skip 0 row for rpc_stats James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 15/39] lustre: quota: df should return projid-specific values James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 16/39] lnet: discard the callback James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 17/39] lustre: llite: try to improve mmap performance James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 18/39] lnet: Introduce lnet_recovery_limit parameter James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 19/39] lustre: mdc: avoid easize set to 0 James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 20/39] lustre: lmv: optimize dir shard revalidate James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 21/39] lustre: ldlm: osc_object_ast_clear() is called for mdc object on eviction James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 22/39] lustre: uapi: fix compatibility for LL_IOC_MDC_GETINFO James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 23/39] lustre: llite: don't check layout info for page discard James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 24/39] lustre: update version to 2.13.57 James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 25/39] lnet: o2iblnd: retry qp creation with reduced queue depth James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 26/39] lustre: lov: fix SEEK_HOLE calcs at component end James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 27/39] lustre: lov: instantiate components layout for fallocate James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 28/39] lustre: dom: non-blocking enqueue for DOM locks James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 29/39] lustre: llite: fiemap set flags for encrypted files James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 30/39] lustre: ldlm: don't compute sumsq for pool stats James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 31/39] lustre: lov: FIEMAP support for PFL and FLR file James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 32/39] lustre: mdc: process changelogs_catalog from the oldest rec James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 33/39] lustre: ldlm: Use req_mode while lock cleanup James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 34/39] lnet: socklnd: announce deprecation of 'use_tcp_bonding' James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 35/39] lnet: o2iblnd: remove FMR-pool support James Simmons
2021-01-21 17:16 ` [lustre-devel] [PATCH 36/39] lustre: llite: return EOPNOTSUPP if fallocate is not supported James Simmons
2021-01-21 17:17 ` [lustre-devel] [PATCH 37/39] lnet: use an unbound cred in kiblnd_resolve_addr() James Simmons
2021-01-21 17:17 ` [lustre-devel] [PATCH 38/39] lustre: lov: correctly set OST obj size James Simmons
2021-01-21 17:17 ` [lustre-devel] [PATCH 39/39] lustre: cksum: add lprocfs checksum support in MDC/MDT James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1611249422-556-10-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    --cc=wshilong@ddn.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.