* [PATCH] ceph: ensure we flush delayed caps when unmounting
From: Jeff Layton @ 2021-06-03 13:48 UTC
To: ceph-devel, idryomov

I've seen some warnings when testing recently that indicate that there
are caps still sitting on the delayed list even after we've started
unmounting.

When checking delayed caps, process the whole list if we're unmounting,
and check for delayed caps after setting the stopping flag and flushing
dirty caps.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/caps.c       | 3 ++-
 fs/ceph/mds_client.c | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index a5e93b185515..68b4c6dfe4db 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -4236,7 +4236,8 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
 		ci = list_first_entry(&mdsc->cap_delay_list,
 				      struct ceph_inode_info,
 				      i_cap_delay_list);
-		if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
+		if (!mdsc->stopping &&
+		    (ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
 		    time_before(jiffies, ci->i_hold_caps_max))
 			break;
 		list_del_init(&ci->i_cap_delay_list);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index e5af591d3bd4..916af5497829 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4691,6 +4691,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
 
 	lock_unlock_sessions(mdsc);
 	ceph_flush_dirty_caps(mdsc);
+	ceph_check_delayed_caps(mdsc);
 	wait_requests(mdsc);
 
 	/*
--
2.31.1
* Re: [PATCH] ceph: ensure we flush delayed caps when unmounting
From: Jeff Layton @ 2021-06-03 16:57 UTC
To: ceph-devel, idryomov

On Thu, 2021-06-03 at 09:48 -0400, Jeff Layton wrote:
> I've seen some warnings when testing recently that indicate that there
> are caps still sitting on the delayed list even after we've started
> unmounting.
>
> When checking delayed caps, process the whole list if we're unmounting,
> and check for delayed caps after setting the stopping flag and flushing
> dirty caps.
>
> [diff snipped]

I'm going to self-NAK this patch for now. Initially this looked good in
testing, but I think it's just papering over the real problem, which is
that ceph_async_iput() can queue a job to a workqueue after the point
where we've flushed that workqueue on umount.

I think the right approach is to look at how to ensure that calling
iput() doesn't end up taking these coarse-grained locks, so that we
don't need to queue it in so many codepaths.
--
Jeff Layton <jlayton@kernel.org>
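To make the race in the self-NAK concrete: flush_workqueue() only waits
for work that was already queued when it was called, so an iput that
gets deferred to the workqueue during unmount can arrive after the
flush. The following is a minimal kernel-style sketch of that pattern,
not ceph code; inode_wq, async_iput() and pre_umount() are hypothetical
stand-ins for the real functions:

/*
 * Illustrative sketch only -- not ceph code.  All names here are
 * hypothetical stand-ins for the ceph equivalents.
 */
#include <linux/workqueue.h>

static struct workqueue_struct *inode_wq;
static struct work_struct iput_work;

/*
 * Callers that hold coarse-grained locks (e.g. a session mutex) defer
 * the final iput() to a workqueue instead of calling it directly.
 */
static void async_iput(void)
{
	queue_work(inode_wq, &iput_work);	/* (B) */
}

static void pre_umount(void)
{
	flush_workqueue(inode_wq);		/* (A) */
	/*
	 * (A) only waits for work queued before it ran.  A racing
	 * async_iput() at (B) can still queue work after the flush,
	 * leaving inodes (and their delayed caps) unprocessed when the
	 * unmount proceeds -- the symptom the patch was masking.
	 */
}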
* Re: [PATCH] ceph: ensure we flush delayed caps when unmounting
From: Luis Henriques @ 2021-06-04 9:35 UTC
To: Jeff Layton; +Cc: ceph-devel, idryomov

On Thu, Jun 03, 2021 at 12:57:22PM -0400, Jeff Layton wrote:
> I'm going to self-NAK this patch for now. Initially this looked good in
> testing, but I think it's just papering over the real problem, which is
> that ceph_async_iput() can queue a job to a workqueue after the point
> where we've flushed that workqueue on umount.

Ah, yeah. I think I saw this a few times with generic/014 (and I believe
we chatted about it on IRC). I've been on and off trying to figure out
a way to fix it, but it's really tricky.

Cheers,
--
Luís
* Re: [PATCH] ceph: ensure we flush delayed caps when unmounting
From: Jeff Layton @ 2021-06-04 12:26 UTC
To: Luis Henriques; +Cc: ceph-devel, idryomov

On Fri, 2021-06-04 at 10:35 +0100, Luis Henriques wrote:
> Ah, yeah. I think I saw this a few times with generic/014 (and I believe
> we chatted about it on IRC). I've been on and off trying to figure out
> a way to fix it, but it's really tricky.

Yeah, that's putting it mildly. The biggest issue here is the
session->s_mutex, which is held over large swaths of the code, but it's
not fully clear what it protects. The original patch that added
ceph_async_iput() did so to avoid the session mutex that is held across
ceph_iterate_session_caps().

My current thinking is that we probably don't need to hold the session
mutex over that function in some cases, if we can guarantee that the
ceph_cap objects we're iterating over don't go away when the lock is
dropped. So I'm trying to add some refcounting to the ceph_cap
structures themselves to see if that helps.

It may turn out to be a dead end, but if we don't chip away at the edges
of the fundamental problem, we'll never get there...
--
Jeff Layton <jlayton@kernel.org>
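One plausible shape for the refcounting idea (a sketch of the direction
described above, not the eventual patch): embed a kref in each cap so an
iterator can pin the cap, drop session->s_mutex around the per-cap
callback, and retake it afterwards. The ceph_cap_sketch type and the
helper names below are hypothetical:

/*
 * Hedged sketch of the refcounting direction -- not the actual ceph
 * patch.  Type and helper names are hypothetical.
 */
#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/mutex.h>
#include <linux/slab.h>

struct ceph_cap_sketch {
	struct kref ref;
	/* ... the existing ceph_cap fields would live here ... */
};

static void cap_release(struct kref *kref)
{
	kfree(container_of(kref, struct ceph_cap_sketch, ref));
}

/*
 * Pin the cap, drop the session mutex for the callback (which may now
 * safely call iput() directly), then retake the mutex and unpin.
 */
static void visit_one_cap(struct ceph_cap_sketch *cap, struct mutex *s_mutex,
			  void (*cb)(struct ceph_cap_sketch *))
{
	kref_get(&cap->ref);		/* keep the cap alive while unlocked */
	mutex_unlock(s_mutex);
	cb(cap);			/* runs without s_mutex held */
	mutex_lock(s_mutex);
	kref_put(&cap->ref, cap_release);
}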