From: Pavel Shilovskiy <pshilov@microsoft.com>
To: David Wysochanski <dwysocha@redhat.com>,
Ronnie Sahlberg <lsahlber@redhat.com>
Cc: linux-cifs <linux-cifs@vger.kernel.org>
Subject: RE: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
Date: Fri, 25 Oct 2019 15:38:48 +0000 [thread overview]
Message-ID: <DM5PR21MB0185AD62B9AEF7945FAA3676B6650@DM5PR21MB0185.namprd21.prod.outlook.com> (raw)
In-Reply-To: <CALF+zO=bGY2KMUbjakjTcEx2CoxcQxk_bAiPEhhrtzkef4XcQQ@mail.gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
I still think the right way to fix it would be to put "reconnected" file handles into a new queue and make another thread to "relock" the file.
It seems like a stable kernel candidate but I would hold off and let it bake a little bit in the mainline before we consider doing that.
Best regards,
Pavel Shilovsky
-----Original Message-----
From: linux-cifs-owner@vger.kernel.org <linux-cifs-owner@vger.kernel.org> On Behalf Of David Wysochanski
Sent: Thursday, October 24, 2019 6:23 PM
To: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: linux-cifs <linux-cifs@vger.kernel.org>
Subject: Re: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
On Thu, Oct 24, 2019 at 7:51 PM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> From: Dave Wysochanski <dwysocha@redhat.com>
>
> There's a deadlock that is possible and can easily be seen with a test
> where multiple readers open/read/close of the same file and a
> disruption occurs causing reconnect. The deadlock is due a reader
> thread inside cifs_strict_readv calling down_read and obtaining
> lock_sem, and then after reconnect inside cifs_reopen_file calling
> down_read a second time. If in between the two down_read calls, a
> down_write comes from another process, deadlock occurs.
>
> CPU0 CPU1
> ---- ----
> cifs_strict_readv()
> down_read(&cifsi->lock_sem);
> _cifsFileInfo_put
> OR
> cifs_new_fileinfo
> down_write(&cifsi->lock_sem);
> cifs_reopen_file()
> down_read(&cifsi->lock_sem);
>
> Fix the above by changing all down_write(lock_sem) calls to
> down_write_trylock(lock_sem)/msleep() loop, which in turn makes the
> second down_read call benign since it will never block behind the
> writer while holding lock_sem.
>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
> ---
> fs/cifs/cifsglob.h | 5 +++++
> fs/cifs/cifsproto.h | 1 +
> fs/cifs/file.c | 23 +++++++++++++++--------
> fs/cifs/smb2file.c | 2 +-
> 4 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index
> 50dfd9049370..d78bfcc19156 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -1391,6 +1391,11 @@ void cifsFileInfo_put(struct cifsFileInfo
> *cifs_file); struct cifsInodeInfo {
> bool can_cache_brlcks;
> struct list_head llist; /* locks helb by this inode */
> + /*
> + * NOTE: Some code paths call down_read(lock_sem) twice, so
> + * we must always use use cifs_down_write() instead of down_write()
> + * for this semaphore to avoid deadlocks.
> + */
> struct rw_semaphore lock_sem; /* protect the fields above */
> /* BB add in lists for dirty pages i.e. write caching info for oplock */
> struct list_head openFileList; diff --git
> a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index
> e53e9f62b87b..fe597d3d5208 100644
> --- a/fs/cifs/cifsproto.h
> +++ b/fs/cifs/cifsproto.h
> @@ -170,6 +170,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile,
> struct file_lock *flock, const unsigned
> int xid); extern int cifs_push_mandatory_locks(struct cifsFileInfo
> *cfile);
>
> +extern void cifs_down_write(struct rw_semaphore *sem);
> extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid,
> struct file *file,
> struct tcon_link *tlink,
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c index
> 0e0217641de1..a80ec683ea32 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -281,6 +281,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode)
> return has_locks;
> }
>
> +void
> +cifs_down_write(struct rw_semaphore *sem) {
> + while (!down_write_trylock(sem))
> + msleep(10);
> +}
> +
> struct cifsFileInfo *
> cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
> struct tcon_link *tlink, __u32 oplock) @@ -306,7
> +313,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
> INIT_LIST_HEAD(&fdlocks->locks);
> fdlocks->cfile = cfile;
> cfile->llist = fdlocks;
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> list_add(&fdlocks->llist, &cinode->llist);
> up_write(&cinode->lock_sem);
>
> @@ -464,7 +471,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
> * Delete any outstanding lock records. We'll lose them when the file
> * is closed anyway.
> */
> - down_write(&cifsi->lock_sem);
> + cifs_down_write(&cifsi->lock_sem);
> list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
> list_del(&li->llist);
> cifs_del_lock_waiters(li); @@ -1027,7 +1034,7 @@
> static void cifs_lock_add(struct cifsFileInfo *cfile, struct
> cifsLockInfo *lock) {
> struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry));
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> list_add_tail(&lock->llist, &cfile->llist->locks);
> up_write(&cinode->lock_sem);
> }
> @@ -1049,7 +1056,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile,
> struct cifsLockInfo *lock,
>
> try_again:
> exist = false;
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
>
> exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length,
> lock->type, lock->flags,
> &conf_lock, @@ -1072,7 +1079,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
> (lock->blist.next == &lock->blist));
> if (!rc)
> goto try_again;
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> list_del_init(&lock->blist);
> }
>
> @@ -1125,7 +1132,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock)
> return rc;
>
> try_again:
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> if (!cinode->can_cache_brlcks) {
> up_write(&cinode->lock_sem);
> return rc;
> @@ -1331,7 +1338,7 @@ cifs_push_locks(struct cifsFileInfo *cfile)
> int rc = 0;
>
> /* we are going to update can_cache_brlcks here - need a write access */
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> if (!cinode->can_cache_brlcks) {
> up_write(&cinode->lock_sem);
> return rc;
> @@ -1522,7 +1529,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
> if (!buf)
> return -ENOMEM;
>
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> for (i = 0; i < 2; i++) {
> cur = buf;
> num = 0;
> diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c index
> e6a1fc72018f..8b0b512c5792 100644
> --- a/fs/cifs/smb2file.c
> +++ b/fs/cifs/smb2file.c
> @@ -145,7 +145,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile,
> struct file_lock *flock,
>
> cur = buf;
>
> - down_write(&cinode->lock_sem);
> + cifs_down_write(&cinode->lock_sem);
> list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) {
> if (flock->fl_start > li->offset ||
> (flock->fl_start + length) <
> --
> 2.13.6
>
Thanks for doing this! Looks fine to me.
I'm not loving this approach to solve this problem but I understand why it may be preferred right now.
I note that trylock with a sleep / retry loop is not often used in filesystems that I could tell (though it is used in drivers), so maybe this can be improved another day.
next prev parent reply other threads:[~2019-10-25 15:38 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-24 23:51 Updated patch for the the lock_sem deadlock Ronnie Sahlberg
2019-10-24 23:51 ` [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Ronnie Sahlberg
2019-10-25 1:23 ` David Wysochanski
2019-10-25 15:38 ` Pavel Shilovskiy [this message]
2019-10-25 15:41 ` Pavel Shilovskiy
2019-10-25 16:24 ` Updated patch for the the lock_sem deadlock Aurélien Aptel
2019-10-26 10:14 ` David Wysochanski
2019-10-28 12:05 ` Aurélien Aptel
-- strict thread matches above, loose matches on Subject: below --
2019-10-24 2:08 [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Dave Wysochanski
2019-10-24 2:57 ` Ronnie Sahlberg
2019-10-24 22:22 ` Pavel Shilovsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DM5PR21MB0185AD62B9AEF7945FAA3676B6650@DM5PR21MB0185.namprd21.prod.outlook.com \
--to=pshilov@microsoft.com \
--cc=dwysocha@redhat.com \
--cc=linux-cifs@vger.kernel.org \
--cc=lsahlber@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).