From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> To: Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jeff Layton <jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org> Subject: [PATCH] dcache: return -ESTALE not -EBUSY on distributed fs race Date: Tue, 10 Feb 2015 10:55:53 -0500 [thread overview] Message-ID: <20150210155553.GA11226@fieldses.org> (raw) From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> On a distributed filesystem it's possible for lookup to discover that a directory it just found is already cached elsewhere in the directory heirarchy. The dcache won't let us keep the directory in both places, so we have to move the dentry to the new location from the place we previously had it cached. If the parent has changed, then this requires all the same locks as we'd need to do a cross-directory rename. But we're already in lookup holding one parent's i_mutex, so it's too late to acquire those locks in the right order. The (unreliable) solution in __d_unalias is to trylock() the required locks and return -EBUSY if it fails. I see no particular reason for returning -EBUSY, and -ESTALE is already the result of some other lookup races on NFS. I think -ESTALE is the more helpful error return. It also allows us to take advantage of the logic Jeff Layton added in c6a9428401c0 "vfs: fix renameat to retry on ESTALE errors" and ancestors, which hopefully resolves some of these errors before they're returned to userspace. I can reproduce these cases using NFS with: ssh root@$client ' mount -olookupcache=pos '$server':'$export' /mnt/ mkdir /mnt/TO mkdir /mnt/DIR touch /mnt/DIR/test.txt while true; do strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY done ' ssh root@$server ' while true; do mv $export/DIR $export/TO/DIR mv $export/TO/DIR $export/DIR done ' It also helps to add some other concurrent use of the directory on the client (e.g., "ls /mnt/TO"). And you can replace the server-side mv's by client-side mv's that are repeatedly killed. (If the client is interrupted while waiting for the RENAME response then it's left with a dentry that has to go under one parent or the other, but it doesn't yet know which.) Acked-by: Jeff Layton <jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/dcache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index 5bc72b07fde2..b7de7f1ae38f 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2612,7 +2612,7 @@ static struct dentry *__d_unalias(struct inode *inode, struct dentry *dentry, struct dentry *alias) { struct mutex *m1 = NULL, *m2 = NULL; - struct dentry *ret = ERR_PTR(-EBUSY); + struct dentry *ret = ERR_PTR(-ESTALE); /* If alias and dentry share a parent, then no extra locks required */ if (alias->d_parent == dentry->d_parent) -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html
WARNING: multiple messages have this Message-ID (diff)
From: "J. Bruce Fields" <bfields@fieldses.org> To: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@primarydata.com> Subject: [PATCH] dcache: return -ESTALE not -EBUSY on distributed fs race Date: Tue, 10 Feb 2015 10:55:53 -0500 [thread overview] Message-ID: <20150210155553.GA11226@fieldses.org> (raw) From: "J. Bruce Fields" <bfields@redhat.com> On a distributed filesystem it's possible for lookup to discover that a directory it just found is already cached elsewhere in the directory heirarchy. The dcache won't let us keep the directory in both places, so we have to move the dentry to the new location from the place we previously had it cached. If the parent has changed, then this requires all the same locks as we'd need to do a cross-directory rename. But we're already in lookup holding one parent's i_mutex, so it's too late to acquire those locks in the right order. The (unreliable) solution in __d_unalias is to trylock() the required locks and return -EBUSY if it fails. I see no particular reason for returning -EBUSY, and -ESTALE is already the result of some other lookup races on NFS. I think -ESTALE is the more helpful error return. It also allows us to take advantage of the logic Jeff Layton added in c6a9428401c0 "vfs: fix renameat to retry on ESTALE errors" and ancestors, which hopefully resolves some of these errors before they're returned to userspace. I can reproduce these cases using NFS with: ssh root@$client ' mount -olookupcache=pos '$server':'$export' /mnt/ mkdir /mnt/TO mkdir /mnt/DIR touch /mnt/DIR/test.txt while true; do strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY done ' ssh root@$server ' while true; do mv $export/DIR $export/TO/DIR mv $export/TO/DIR $export/DIR done ' It also helps to add some other concurrent use of the directory on the client (e.g., "ls /mnt/TO"). And you can replace the server-side mv's by client-side mv's that are repeatedly killed. (If the client is interrupted while waiting for the RENAME response then it's left with a dentry that has to go under one parent or the other, but it doesn't yet know which.) Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/dcache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index 5bc72b07fde2..b7de7f1ae38f 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2612,7 +2612,7 @@ static struct dentry *__d_unalias(struct inode *inode, struct dentry *dentry, struct dentry *alias) { struct mutex *m1 = NULL, *m2 = NULL; - struct dentry *ret = ERR_PTR(-EBUSY); + struct dentry *ret = ERR_PTR(-ESTALE); /* If alias and dentry share a parent, then no extra locks required */ if (alias->d_parent == dentry->d_parent) -- 1.9.3
next reply other threads:[~2015-02-10 15:55 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2015-02-10 15:55 J. Bruce Fields [this message] 2015-02-10 15:55 ` [PATCH] dcache: return -ESTALE not -EBUSY on distributed fs race J. Bruce Fields 2015-02-18 15:03 ` J. Bruce Fields [not found] ` <20150218150343.GF4148-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2015-02-24 21:22 ` J. Bruce Fields 2015-02-24 21:22 ` J. Bruce Fields -- strict thread matches above, loose matches on Subject: below -- 2014-12-17 19:59 J. Bruce Fields [not found] ` <20141217195911.GF9617-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-12-17 20:01 ` J. Bruce Fields 2014-12-17 20:01 ` J. Bruce Fields 2014-12-18 15:50 ` J. R. Okajima 2014-12-18 15:58 ` J. Bruce Fields 2014-12-18 15:58 ` J. Bruce Fields [not found] ` <20141218155838.GD18179-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-12-18 16:27 ` J. R. Okajima 2014-12-18 16:27 ` J. R. Okajima 2014-12-18 17:32 ` Jeff Layton 2014-12-18 23:15 ` Dave Chinner 2014-12-18 23:15 ` Dave Chinner 2014-12-19 2:46 ` J. R. Okajima 2014-12-19 11:09 ` Jeff Layton 2014-12-19 11:09 ` Jeff Layton 2014-12-22 9:53 ` J. R. Okajima
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20150210155553.GA11226@fieldses.org \ --to=bfields-uc3wqj2krung9huczpvpmw@public.gmane.org \ --cc=jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org \ --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ --cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.