CEPH-Devel Archive on lore.kernel.org
* [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors
@ 2020-10-06 14:55 Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 1/5] ceph: don't WARN when removing caps due to blocklisting Jeff Layton
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

v3: add RECOVER mount_state and allow dumping pagecache when it's set
    shrink size of mount_state field
v2: fix handling of async requests in patch to queue requests

This is the third revision of this patchset and should hopefully address
comments from Zheng and Ilya.

Original cover letter:

Ilya noticed that he would get spurious EACCES errors on calls done just
after blocklisting the client on mounts with recover_session=clean. The
session would get marked as REJECTED and that caused in-flight calls to
die with EACCES. This patchset seems to smooth over the problem, but I'm
not fully convinced it's the right approach.

The potential issue I see is that the client could take cap references to
do a call on a session that has been blocklisted. We then queue the
message and reestablish the session, but we may not have been granted
the same caps by the MDS at that point.

If this is a problem, then we probably need to rework it so that we
return a distinct error code in this situation and have the upper layers
issue a completely new mds request (with new cap refs, etc.)

Obviously, that's a much more invasive approach though, so it would be
nice to avoid that if this would suffice.

Jeff Layton (5):
  ceph: don't WARN when removing caps due to blocklisting
  ceph: make fsc->mount_state an int
  ceph: don't mark mount as SHUTDOWN when recovering session
  ceph: remove timeout on allowing reconnect after blocklisting
  ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set

 fs/ceph/caps.c               |  2 +-
 fs/ceph/inode.c              |  2 +-
 fs/ceph/mds_client.c         | 25 +++++++++++++++----------
 fs/ceph/super.c              | 14 ++++++++++----
 fs/ceph/super.h              |  3 +--
 include/linux/ceph/libceph.h |  1 +
 6 files changed, 29 insertions(+), 18 deletions(-)

-- 
2.26.2



* [PATCH v3 1/5] ceph: don't WARN when removing caps due to blocklisting
  2020-10-06 14:55 [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors Jeff Layton
@ 2020-10-06 14:55 ` Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 2/5] ceph: make fsc->mount_state an int Jeff Layton
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

We expect to remove dirty caps when the client is blocklisted. Don't
throw a warning in that case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/caps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index c7e69547628e..2ee3f316afcf 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1149,7 +1149,7 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
 	/* remove from inode's cap rbtree, and clear auth cap */
 	rb_erase(&cap->ci_node, &ci->i_caps);
 	if (ci->i_auth_cap == cap) {
-		WARN_ON_ONCE(!list_empty(&ci->i_dirty_item));
+		WARN_ON_ONCE(!list_empty(&ci->i_dirty_item) && !mdsc->fsc->blocklisted);
 		ci->i_auth_cap = NULL;
 	}
 
-- 
2.26.2
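
The condition change in this patch can be modelled in user space roughly
as follows. This is a sketch, not kernel code: the struct and function
names are hypothetical stand-ins for the two facts the patched
WARN_ON_ONCE() tests.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two inputs of the patched WARN_ON_ONCE(). */
struct model_cap_removal {
	bool dirty_item_queued;   /* models !list_empty(&ci->i_dirty_item) */
	bool client_blocklisted;  /* models mdsc->fsc->blocklisted */
};

/* Returns true when the patched condition would trigger the warning. */
static bool model_should_warn(const struct model_cap_removal *r)
{
	return r->dirty_item_queued && !r->client_blocklisted;
}
```

With this model, removing an auth cap that still has dirty state warns
only while the client is not blocklisted, which matches the intent stated
in the commit message.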



* [PATCH v3 2/5] ceph: make fsc->mount_state an int
  2020-10-06 14:55 [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 1/5] ceph: don't WARN when removing caps due to blocklisting Jeff Layton
@ 2020-10-06 14:55 ` Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 3/5] ceph: don't mark mount as SHUTDOWN when recovering session Jeff Layton
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

This field is an unsigned long currently, which is a bit of a waste on
most arches since this just holds an enum. Make it (signed) int instead.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/super.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 582694899130..d0cb6a51c6a4 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -106,7 +106,7 @@ struct ceph_fs_client {
 	struct ceph_mount_options *mount_options;
 	struct ceph_client *client;
 
-	unsigned long mount_state;
+	int mount_state;
 
 	unsigned long last_auto_reconnect;
 	bool blocklisted;
-- 
2.26.2



* [PATCH v3 3/5] ceph: don't mark mount as SHUTDOWN when recovering session
  2020-10-06 14:55 [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 1/5] ceph: don't WARN when removing caps due to blocklisting Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 2/5] ceph: make fsc->mount_state an int Jeff Layton
@ 2020-10-06 14:55 ` Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 4/5] ceph: remove timeout on allowing reconnect after blocklisting Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 5/5] ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set Jeff Layton
  4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

When recovering a session (à la recover_session=clean), we want to do
all of the operations that we do on a forced umount, but changing the
mount state to SHUTDOWN can cause queued MDS requests to fail when
the session comes back.

Reserve SHUTDOWN state for forced umount, and make a new RECOVER state
for the forced reconnect situation.

Cc: "Yan, Zheng" <ukernel@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>

SQUASH: add new CEPH_MOUNT_RECOVER state

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/inode.c              |  2 +-
 fs/ceph/mds_client.c         |  2 +-
 fs/ceph/super.c              | 14 ++++++++++----
 include/linux/ceph/libceph.h |  1 +
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 526faf4778ce..02b11a4a4d39 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1888,7 +1888,7 @@ static void ceph_do_invalidate_pages(struct inode *inode)
 
 	mutex_lock(&ci->i_truncate_mutex);
 
-	if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
+	if (READ_ONCE(fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN) {
 		pr_warn_ratelimited("invalidate_pages %p %lld forced umount\n",
 				    inode, ceph_ino(inode));
 		mapping_set_error(inode->i_mapping, -EIO);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 6b408851eea1..cd46f7e40370 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1595,7 +1595,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
 		struct ceph_cap_flush *cf;
 		struct ceph_mds_client *mdsc = fsc->mdsc;
 
-		if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
+		if (READ_ONCE(fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN) {
 			if (inode->i_data.nrpages > 0)
 				invalidate = true;
 			if (ci->i_wrbuffer_ref > 0)
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 2516304379d3..2f530a111b3a 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -832,6 +832,13 @@ static void destroy_caches(void)
 	ceph_fscache_unregister();
 }
 
+static void __ceph_umount_begin(struct ceph_fs_client *fsc)
+{
+	ceph_osdc_abort_requests(&fsc->client->osdc, -EIO);
+	ceph_mdsc_force_umount(fsc->mdsc);
+	fsc->filp_gen++; // invalidate open files
+}
+
 /*
  * ceph_umount_begin - initiate forced umount.  Tear down the
  * mount, skipping steps that may hang while waiting for server(s).
@@ -844,9 +851,7 @@ static void ceph_umount_begin(struct super_block *sb)
 	if (!fsc)
 		return;
 	fsc->mount_state = CEPH_MOUNT_SHUTDOWN;
-	ceph_osdc_abort_requests(&fsc->client->osdc, -EIO);
-	ceph_mdsc_force_umount(fsc->mdsc);
-	fsc->filp_gen++; // invalidate open files
+	__ceph_umount_begin(fsc);
 }
 
 static const struct super_operations ceph_super_ops = {
@@ -1235,7 +1240,8 @@ int ceph_force_reconnect(struct super_block *sb)
 	struct ceph_fs_client *fsc = ceph_sb_to_client(sb);
 	int err = 0;
 
-	ceph_umount_begin(sb);
+	fsc->mount_state = CEPH_MOUNT_RECOVER;
+	__ceph_umount_begin(fsc);
 
 	/* Make sure all page caches get invalidated.
 	 * see remove_session_caps_cb() */
diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index c8645f0b797d..eb5a7ca13f9c 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -104,6 +104,7 @@ enum {
 	CEPH_MOUNT_UNMOUNTING,
 	CEPH_MOUNT_UNMOUNTED,
 	CEPH_MOUNT_SHUTDOWN,
+	CEPH_MOUNT_RECOVER,
 };
 
 static inline unsigned long ceph_timeout_jiffies(unsigned long timeout)
-- 
2.26.2
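
The switch from "==" to ">=" in the two checks above depends on the new
state being appended after SHUTDOWN in the enum. A small user-space
sketch of that ordering follows. The MODEL_* names are hypothetical, and
the first two enumerators are assumptions, since the hunk context only
shows the entries from UNMOUNTING onward.

```c
#include <stdbool.h>

/*
 * Stand-in for the mount-state enum. Only the last four names appear in
 * the patch context; the first two entries are assumed placeholders.
 */
enum model_mount_state {
	MODEL_MOUNT_MOUNTING,    /* assumption, not shown in the hunk */
	MODEL_MOUNT_MOUNTED,     /* assumption, not shown in the hunk */
	MODEL_MOUNT_UNMOUNTING,
	MODEL_MOUNT_UNMOUNTED,
	MODEL_MOUNT_SHUTDOWN,
	MODEL_MOUNT_RECOVER,     /* new state, appended after SHUTDOWN */
};

/* Models the patched ">= CEPH_MOUNT_SHUTDOWN" test. */
static bool model_new_check(int mount_state)
{
	return mount_state >= MODEL_MOUNT_SHUTDOWN;
}

/* Models the pre-patch "== CEPH_MOUNT_SHUTDOWN" test, for comparison. */
static bool model_old_check(int mount_state)
{
	return mount_state == MODEL_MOUNT_SHUTDOWN;
}
```

Under the old test a mount in the RECOVER state would not have its
pagecache dropped; under the new one both SHUTDOWN and RECOVER match.
The flip side is that any state later added after RECOVER would match
as well, so the enum order becomes part of the contract.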



* [PATCH v3 4/5] ceph: remove timeout on allowing reconnect after blocklisting
  2020-10-06 14:55 [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors Jeff Layton
                   ` (2 preceding siblings ...)
  2020-10-06 14:55 ` [PATCH v3 3/5] ceph: don't mark mount as SHUTDOWN when recovering session Jeff Layton
@ 2020-10-06 14:55 ` Jeff Layton
  2020-10-06 14:55 ` [PATCH v3 5/5] ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set Jeff Layton
  4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

30 minutes is a long time to wait, and this makes it difficult to test
the feature by manually blocklisting clients. Remove the timeout
infrastructure and just allow the client to reconnect at will.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 5 -----
 fs/ceph/super.h      | 1 -
 2 files changed, 6 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index cd46f7e40370..1727931248b5 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4374,12 +4374,7 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
 	if (!READ_ONCE(fsc->blocklisted))
 		return;
 
-	if (fsc->last_auto_reconnect &&
-	    time_before(jiffies, fsc->last_auto_reconnect + HZ * 60 * 30))
-		return;
-
 	pr_info("auto reconnect after blocklisted\n");
-	fsc->last_auto_reconnect = jiffies;
 	ceph_force_reconnect(fsc->sb);
 }
 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index d0cb6a51c6a4..9ced23b092f5 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -108,7 +108,6 @@ struct ceph_fs_client {
 
 	int mount_state;
 
-	unsigned long last_auto_reconnect;
 	bool blocklisted;
 
 	bool have_copy_from2;
-- 
2.26.2
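
For reference, the check removed by this patch behaved roughly like the
user-space model below. This is a sketch under stated assumptions:
MODEL_HZ is an assumed tick rate (the kernel's HZ depends on the
configuration), and model_time_before() only imitates the wraparound-safe
comparison that the kernel's time_before() provides.

```c
#include <stdbool.h>

#define MODEL_HZ 250UL  /* assumed tick rate; the real HZ is config-dependent */
#define MODEL_RECONNECT_WINDOW (MODEL_HZ * 60 * 30)  /* 30 minutes in ticks */

/* Wraparound-safe "a is before b", in the spirit of time_before(). */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Models the removed check: true means the reconnect attempt was skipped. */
static bool model_reconnect_throttled(unsigned long now, unsigned long last)
{
	return last != 0 &&
	       model_time_before(now, last + MODEL_RECONNECT_WINDOW);
}
```

In this model a second blocklisting within 30 minutes of the previous
automatic reconnect is ignored, which is what made manual testing slow.
After the patch there is no such window and every blocklisting detected
by maybe_recover_session() leads to a reconnect attempt.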



* [PATCH v3 5/5] ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set
  2020-10-06 14:55 [PATCH v3 0/5] ceph: fix spurious recover_session=clean errors Jeff Layton
                   ` (3 preceding siblings ...)
  2020-10-06 14:55 ` [PATCH v3 4/5] ceph: remove timeout on allowing reconnect after blocklisting Jeff Layton
@ 2020-10-06 14:55 ` Jeff Layton
  4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2020-10-06 14:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: idryomov, ukernel, pdonnell

Ilya noticed that the first access to a blocklisted mount would often
get back -EACCES, but then subsequent calls would be OK. The problem is
in __do_request. If the session is marked as REJECTED, a hard error is
returned instead of waiting for a new session to come into being.

When the session is REJECTED and the mount was done with
recover_session=clean, queue the request to the waiting_for_map queue,
which will be awoken after tearing down the old session. We can only
do this for sync requests though, so check for async ones first and
just let the callers redrive a sync request.

URL: https://tracker.ceph.com/issues/47385
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1727931248b5..048eb69be29e 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2818,10 +2818,6 @@ static void __do_request(struct ceph_mds_client *mdsc,
 	     ceph_session_state_name(session->s_state));
 	if (session->s_state != CEPH_MDS_SESSION_OPEN &&
 	    session->s_state != CEPH_MDS_SESSION_HUNG) {
-		if (session->s_state == CEPH_MDS_SESSION_REJECTED) {
-			err = -EACCES;
-			goto out_session;
-		}
 		/*
 		 * We cannot queue async requests since the caps and delegated
 		 * inodes are bound to the session. Just return -EJUKEBOX and
@@ -2831,6 +2827,20 @@ static void __do_request(struct ceph_mds_client *mdsc,
 			err = -EJUKEBOX;
 			goto out_session;
 		}
+
+		/*
+		 * If the session has been REJECTED, then return a hard error,
+		 * unless it's a CLEANRECOVER mount, in which case we'll queue
+		 * it to the mdsc queue.
+		 */
+		if (session->s_state == CEPH_MDS_SESSION_REJECTED) {
+			if (ceph_test_mount_opt(mdsc->fsc, CLEANRECOVER))
+				list_add(&req->r_wait, &mdsc->waiting_for_map);
+			else
+				err = -EACCES;
+			goto out_session;
+		}
+
 		if (session->s_state == CEPH_MDS_SESSION_NEW ||
 		    session->s_state == CEPH_MDS_SESSION_CLOSING) {
 			err = __open_session(mdsc, session);
-- 
2.26.2
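
The resulting order of checks in __do_request can be summarised with the
user-space model below. It is only a sketch of the decision logic in the
hunk above: the MODEL_* names, the enum values and the action type are
hypothetical, and MODEL_EJUKEBOX is an assumed stand-in because EJUKEBOX
is a kernel-internal error code that user-space <errno.h> does not
provide.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_EJUKEBOX 528  /* assumed stand-in for the kernel's EJUKEBOX */

/* Model-only session states; the values are not the kernel's. */
enum model_session_state {
	MODEL_SESSION_NEW,
	MODEL_SESSION_OPEN,
	MODEL_SESSION_HUNG,
	MODEL_SESSION_CLOSING,
	MODEL_SESSION_REJECTED,
};

enum model_action {
	MODEL_SEND,           /* session usable, request goes out */
	MODEL_FAIL,           /* *err is returned to the caller */
	MODEL_QUEUE_FOR_MAP,  /* parked on waiting_for_map until recovery */
	MODEL_OTHER,          /* remaining states, e.g. (re)open the session */
};

static enum model_action model_do_request(enum model_session_state state,
					  bool async_req, bool cleanrecover,
					  int *err)
{
	*err = 0;
	if (state == MODEL_SESSION_OPEN || state == MODEL_SESSION_HUNG)
		return MODEL_SEND;

	/* Async requests are never queued; the caller redrives them as sync. */
	if (async_req) {
		*err = -MODEL_EJUKEBOX;
		return MODEL_FAIL;
	}

	/* REJECTED is a hard error unless the mount uses recover_session=clean. */
	if (state == MODEL_SESSION_REJECTED) {
		if (cleanrecover)
			return MODEL_QUEUE_FOR_MAP;
		*err = -EACCES;
		return MODEL_FAIL;
	}
	return MODEL_OTHER;
}
```

The point of moving the REJECTED test below the async test is visible
here: an async request to a REJECTED session now gets -EJUKEBOX rather
than -EACCES, and only sync requests are ever queued.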

