linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Updated patch for the the lock_sem deadlock
@ 2019-10-24 23:51 Ronnie Sahlberg
  2019-10-24 23:51 ` [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Ronnie Sahlberg
  2019-10-25 16:24 ` Updated patch for the the lock_sem deadlock Aurélien Aptel
  0 siblings, 2 replies; 8+ messages in thread
From: Ronnie Sahlberg @ 2019-10-24 23:51 UTC (permalink / raw)
  To: linux-cifs

Steve, Pavel, Dave

This is a small update to Dave's patch to address Pavels recommendation
that we use a helper function for the trylock/sleep loop.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
  2019-10-24 23:51 Updated patch for the the lock_sem deadlock Ronnie Sahlberg
@ 2019-10-24 23:51 ` Ronnie Sahlberg
  2019-10-25  1:23   ` David Wysochanski
  2019-10-25 16:24 ` Updated patch for the the lock_sem deadlock Aurélien Aptel
  1 sibling, 1 reply; 8+ messages in thread
From: Ronnie Sahlberg @ 2019-10-24 23:51 UTC (permalink / raw)
  To: linux-cifs; +Cc: Dave Wysochanski

From: Dave Wysochanski <dwysocha@redhat.com>

There's a deadlock that is possible and can easily be seen with
a test where multiple readers open/read/close of the same file
and a disruption occurs causing reconnect.  The deadlock is due
a reader thread inside cifs_strict_readv calling down_read and
obtaining lock_sem, and then after reconnect inside
cifs_reopen_file calling down_read a second time.  If in
between the two down_read calls, a down_write comes from
another process, deadlock occurs.

        CPU0                    CPU1
        ----                    ----
cifs_strict_readv()
 down_read(&cifsi->lock_sem);
                               _cifsFileInfo_put
                                  OR
                               cifs_new_fileinfo
                                down_write(&cifsi->lock_sem);
cifs_reopen_file()
 down_read(&cifsi->lock_sem);

Fix the above by changing all down_write(lock_sem) calls to
down_write_trylock(lock_sem)/msleep() loop, which in turn
makes the second down_read call benign since it will never
block behind the writer while holding lock_sem.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
---
 fs/cifs/cifsglob.h  |  5 +++++
 fs/cifs/cifsproto.h |  1 +
 fs/cifs/file.c      | 23 +++++++++++++++--------
 fs/cifs/smb2file.c  |  2 +-
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 50dfd9049370..d78bfcc19156 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1391,6 +1391,11 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
 struct cifsInodeInfo {
 	bool can_cache_brlcks;
 	struct list_head llist;	/* locks helb by this inode */
+	/*
+	 * NOTE: Some code paths call down_read(lock_sem) twice, so
+	 * we must always use use cifs_down_write() instead of down_write()
+	 * for this semaphore to avoid deadlocks.
+	 */
 	struct rw_semaphore lock_sem;	/* protect the fields above */
 	/* BB add in lists for dirty pages i.e. write caching info for oplock */
 	struct list_head openFileList;
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index e53e9f62b87b..fe597d3d5208 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -170,6 +170,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile,
 			     struct file_lock *flock, const unsigned int xid);
 extern int cifs_push_mandatory_locks(struct cifsFileInfo *cfile);
 
+extern void cifs_down_write(struct rw_semaphore *sem);
 extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid,
 					      struct file *file,
 					      struct tcon_link *tlink,
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 0e0217641de1..a80ec683ea32 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -281,6 +281,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode)
 	return has_locks;
 }
 
+void
+cifs_down_write(struct rw_semaphore *sem)
+{
+	while (!down_write_trylock(sem))
+		msleep(10);
+}
+
 struct cifsFileInfo *
 cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
 		  struct tcon_link *tlink, __u32 oplock)
@@ -306,7 +313,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
 	INIT_LIST_HEAD(&fdlocks->locks);
 	fdlocks->cfile = cfile;
 	cfile->llist = fdlocks;
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	list_add(&fdlocks->llist, &cinode->llist);
 	up_write(&cinode->lock_sem);
 
@@ -464,7 +471,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
 	 * Delete any outstanding lock records. We'll lose them when the file
 	 * is closed anyway.
 	 */
-	down_write(&cifsi->lock_sem);
+	cifs_down_write(&cifsi->lock_sem);
 	list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
 		list_del(&li->llist);
 		cifs_del_lock_waiters(li);
@@ -1027,7 +1034,7 @@ static void
 cifs_lock_add(struct cifsFileInfo *cfile, struct cifsLockInfo *lock)
 {
 	struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry));
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	list_add_tail(&lock->llist, &cfile->llist->locks);
 	up_write(&cinode->lock_sem);
 }
@@ -1049,7 +1056,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
 
 try_again:
 	exist = false;
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 
 	exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length,
 					lock->type, lock->flags, &conf_lock,
@@ -1072,7 +1079,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
 					(lock->blist.next == &lock->blist));
 		if (!rc)
 			goto try_again;
-		down_write(&cinode->lock_sem);
+		cifs_down_write(&cinode->lock_sem);
 		list_del_init(&lock->blist);
 	}
 
@@ -1125,7 +1132,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock)
 		return rc;
 
 try_again:
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	if (!cinode->can_cache_brlcks) {
 		up_write(&cinode->lock_sem);
 		return rc;
@@ -1331,7 +1338,7 @@ cifs_push_locks(struct cifsFileInfo *cfile)
 	int rc = 0;
 
 	/* we are going to update can_cache_brlcks here - need a write access */
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	if (!cinode->can_cache_brlcks) {
 		up_write(&cinode->lock_sem);
 		return rc;
@@ -1522,7 +1529,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
 	if (!buf)
 		return -ENOMEM;
 
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	for (i = 0; i < 2; i++) {
 		cur = buf;
 		num = 0;
diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c
index e6a1fc72018f..8b0b512c5792 100644
--- a/fs/cifs/smb2file.c
+++ b/fs/cifs/smb2file.c
@@ -145,7 +145,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
 
 	cur = buf;
 
-	down_write(&cinode->lock_sem);
+	cifs_down_write(&cinode->lock_sem);
 	list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) {
 		if (flock->fl_start > li->offset ||
 		    (flock->fl_start + length) <
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
  2019-10-24 23:51 ` [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Ronnie Sahlberg
@ 2019-10-25  1:23   ` David Wysochanski
  2019-10-25 15:38     ` Pavel Shilovskiy
  0 siblings, 1 reply; 8+ messages in thread
From: David Wysochanski @ 2019-10-25  1:23 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: linux-cifs

On Thu, Oct 24, 2019 at 7:51 PM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> From: Dave Wysochanski <dwysocha@redhat.com>
>
> There's a deadlock that is possible and can easily be seen with
> a test where multiple readers open/read/close of the same file
> and a disruption occurs causing reconnect.  The deadlock is due
> a reader thread inside cifs_strict_readv calling down_read and
> obtaining lock_sem, and then after reconnect inside
> cifs_reopen_file calling down_read a second time.  If in
> between the two down_read calls, a down_write comes from
> another process, deadlock occurs.
>
>         CPU0                    CPU1
>         ----                    ----
> cifs_strict_readv()
>  down_read(&cifsi->lock_sem);
>                                _cifsFileInfo_put
>                                   OR
>                                cifs_new_fileinfo
>                                 down_write(&cifsi->lock_sem);
> cifs_reopen_file()
>  down_read(&cifsi->lock_sem);
>
> Fix the above by changing all down_write(lock_sem) calls to
> down_write_trylock(lock_sem)/msleep() loop, which in turn
> makes the second down_read call benign since it will never
> block behind the writer while holding lock_sem.
>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
> ---
>  fs/cifs/cifsglob.h  |  5 +++++
>  fs/cifs/cifsproto.h |  1 +
>  fs/cifs/file.c      | 23 +++++++++++++++--------
>  fs/cifs/smb2file.c  |  2 +-
>  4 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> index 50dfd9049370..d78bfcc19156 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -1391,6 +1391,11 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
>  struct cifsInodeInfo {
>         bool can_cache_brlcks;
>         struct list_head llist; /* locks helb by this inode */
> +       /*
> +        * NOTE: Some code paths call down_read(lock_sem) twice, so
> +        * we must always use use cifs_down_write() instead of down_write()
> +        * for this semaphore to avoid deadlocks.
> +        */
>         struct rw_semaphore lock_sem;   /* protect the fields above */
>         /* BB add in lists for dirty pages i.e. write caching info for oplock */
>         struct list_head openFileList;
> diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
> index e53e9f62b87b..fe597d3d5208 100644
> --- a/fs/cifs/cifsproto.h
> +++ b/fs/cifs/cifsproto.h
> @@ -170,6 +170,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile,
>                              struct file_lock *flock, const unsigned int xid);
>  extern int cifs_push_mandatory_locks(struct cifsFileInfo *cfile);
>
> +extern void cifs_down_write(struct rw_semaphore *sem);
>  extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid,
>                                               struct file *file,
>                                               struct tcon_link *tlink,
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 0e0217641de1..a80ec683ea32 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -281,6 +281,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode)
>         return has_locks;
>  }
>
> +void
> +cifs_down_write(struct rw_semaphore *sem)
> +{
> +       while (!down_write_trylock(sem))
> +               msleep(10);
> +}
> +
>  struct cifsFileInfo *
>  cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>                   struct tcon_link *tlink, __u32 oplock)
> @@ -306,7 +313,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>         INIT_LIST_HEAD(&fdlocks->locks);
>         fdlocks->cfile = cfile;
>         cfile->llist = fdlocks;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add(&fdlocks->llist, &cinode->llist);
>         up_write(&cinode->lock_sem);
>
> @@ -464,7 +471,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
>          * Delete any outstanding lock records. We'll lose them when the file
>          * is closed anyway.
>          */
> -       down_write(&cifsi->lock_sem);
> +       cifs_down_write(&cifsi->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
>                 list_del(&li->llist);
>                 cifs_del_lock_waiters(li);
> @@ -1027,7 +1034,7 @@ static void
>  cifs_lock_add(struct cifsFileInfo *cfile, struct cifsLockInfo *lock)
>  {
>         struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry));
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add_tail(&lock->llist, &cfile->llist->locks);
>         up_write(&cinode->lock_sem);
>  }
> @@ -1049,7 +1056,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
>
>  try_again:
>         exist = false;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>
>         exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length,
>                                         lock->type, lock->flags, &conf_lock,
> @@ -1072,7 +1079,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
>                                         (lock->blist.next == &lock->blist));
>                 if (!rc)
>                         goto try_again;
> -               down_write(&cinode->lock_sem);
> +               cifs_down_write(&cinode->lock_sem);
>                 list_del_init(&lock->blist);
>         }
>
> @@ -1125,7 +1132,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock)
>                 return rc;
>
>  try_again:
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1331,7 +1338,7 @@ cifs_push_locks(struct cifsFileInfo *cfile)
>         int rc = 0;
>
>         /* we are going to update can_cache_brlcks here - need a write access */
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1522,7 +1529,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
>         if (!buf)
>                 return -ENOMEM;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         for (i = 0; i < 2; i++) {
>                 cur = buf;
>                 num = 0;
> diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c
> index e6a1fc72018f..8b0b512c5792 100644
> --- a/fs/cifs/smb2file.c
> +++ b/fs/cifs/smb2file.c
> @@ -145,7 +145,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
>
>         cur = buf;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) {
>                 if (flock->fl_start > li->offset ||
>                     (flock->fl_start + length) <
> --
> 2.13.6
>

Thanks for doing this!  Looks fine to me.

I'm not loving this approach to solve this problem but I understand
why it may be preferred right now.
I note that trylock with a sleep / retry loop is not often used in
filesystems that I could tell (though it is used in drivers), so maybe
this can be improved another day.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
  2019-10-25  1:23   ` David Wysochanski
@ 2019-10-25 15:38     ` Pavel Shilovskiy
  2019-10-25 15:41       ` Pavel Shilovskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Pavel Shilovskiy @ 2019-10-25 15:38 UTC (permalink / raw)
  To: David Wysochanski, Ronnie Sahlberg; +Cc: linux-cifs

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>

I still think the right way to fix it would be to put "reconnected" file handles into a new queue and make another thread to "relock" the file.

It seems like a stable kernel candidate but I would hold off and let it bake a little bit in the mainline before we consider doing that.

Best regards,
Pavel Shilovsky

-----Original Message-----
From: linux-cifs-owner@vger.kernel.org <linux-cifs-owner@vger.kernel.org> On Behalf Of David Wysochanski
Sent: Thursday, October 24, 2019 6:23 PM
To: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: linux-cifs <linux-cifs@vger.kernel.org>
Subject: Re: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs

On Thu, Oct 24, 2019 at 7:51 PM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> From: Dave Wysochanski <dwysocha@redhat.com>
>
> There's a deadlock that is possible and can easily be seen with a test 
> where multiple readers open/read/close of the same file and a 
> disruption occurs causing reconnect.  The deadlock is due a reader 
> thread inside cifs_strict_readv calling down_read and obtaining 
> lock_sem, and then after reconnect inside cifs_reopen_file calling 
> down_read a second time.  If in between the two down_read calls, a 
> down_write comes from another process, deadlock occurs.
>
>         CPU0                    CPU1
>         ----                    ----
> cifs_strict_readv()
>  down_read(&cifsi->lock_sem);
>                                _cifsFileInfo_put
>                                   OR
>                                cifs_new_fileinfo
>                                 down_write(&cifsi->lock_sem);
> cifs_reopen_file()
>  down_read(&cifsi->lock_sem);
>
> Fix the above by changing all down_write(lock_sem) calls to
> down_write_trylock(lock_sem)/msleep() loop, which in turn makes the 
> second down_read call benign since it will never block behind the 
> writer while holding lock_sem.
>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
> ---
>  fs/cifs/cifsglob.h  |  5 +++++
>  fs/cifs/cifsproto.h |  1 +
>  fs/cifs/file.c      | 23 +++++++++++++++--------
>  fs/cifs/smb2file.c  |  2 +-
>  4 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index 
> 50dfd9049370..d78bfcc19156 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -1391,6 +1391,11 @@ void cifsFileInfo_put(struct cifsFileInfo 
> *cifs_file);  struct cifsInodeInfo {
>         bool can_cache_brlcks;
>         struct list_head llist; /* locks helb by this inode */
> +       /*
> +        * NOTE: Some code paths call down_read(lock_sem) twice, so
> +        * we must always use use cifs_down_write() instead of down_write()
> +        * for this semaphore to avoid deadlocks.
> +        */
>         struct rw_semaphore lock_sem;   /* protect the fields above */
>         /* BB add in lists for dirty pages i.e. write caching info for oplock */
>         struct list_head openFileList; diff --git 
> a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index 
> e53e9f62b87b..fe597d3d5208 100644
> --- a/fs/cifs/cifsproto.h
> +++ b/fs/cifs/cifsproto.h
> @@ -170,6 +170,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile,
>                              struct file_lock *flock, const unsigned 
> int xid);  extern int cifs_push_mandatory_locks(struct cifsFileInfo 
> *cfile);
>
> +extern void cifs_down_write(struct rw_semaphore *sem);
>  extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid,
>                                               struct file *file,
>                                               struct tcon_link *tlink, 
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 
> 0e0217641de1..a80ec683ea32 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -281,6 +281,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode)
>         return has_locks;
>  }
>
> +void
> +cifs_down_write(struct rw_semaphore *sem) {
> +       while (!down_write_trylock(sem))
> +               msleep(10);
> +}
> +
>  struct cifsFileInfo *
>  cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>                   struct tcon_link *tlink, __u32 oplock) @@ -306,7 
> +313,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>         INIT_LIST_HEAD(&fdlocks->locks);
>         fdlocks->cfile = cfile;
>         cfile->llist = fdlocks;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add(&fdlocks->llist, &cinode->llist);
>         up_write(&cinode->lock_sem);
>
> @@ -464,7 +471,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
>          * Delete any outstanding lock records. We'll lose them when the file
>          * is closed anyway.
>          */
> -       down_write(&cifsi->lock_sem);
> +       cifs_down_write(&cifsi->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
>                 list_del(&li->llist);
>                 cifs_del_lock_waiters(li); @@ -1027,7 +1034,7 @@ 
> static void  cifs_lock_add(struct cifsFileInfo *cfile, struct 
> cifsLockInfo *lock)  {
>         struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry));
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add_tail(&lock->llist, &cfile->llist->locks);
>         up_write(&cinode->lock_sem);
>  }
> @@ -1049,7 +1056,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, 
> struct cifsLockInfo *lock,
>
>  try_again:
>         exist = false;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>
>         exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length,
>                                         lock->type, lock->flags, 
> &conf_lock, @@ -1072,7 +1079,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
>                                         (lock->blist.next == &lock->blist));
>                 if (!rc)
>                         goto try_again;
> -               down_write(&cinode->lock_sem);
> +               cifs_down_write(&cinode->lock_sem);
>                 list_del_init(&lock->blist);
>         }
>
> @@ -1125,7 +1132,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock)
>                 return rc;
>
>  try_again:
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1331,7 +1338,7 @@ cifs_push_locks(struct cifsFileInfo *cfile)
>         int rc = 0;
>
>         /* we are going to update can_cache_brlcks here - need a write access */
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1522,7 +1529,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
>         if (!buf)
>                 return -ENOMEM;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         for (i = 0; i < 2; i++) {
>                 cur = buf;
>                 num = 0;
> diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c index 
> e6a1fc72018f..8b0b512c5792 100644
> --- a/fs/cifs/smb2file.c
> +++ b/fs/cifs/smb2file.c
> @@ -145,7 +145,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile, 
> struct file_lock *flock,
>
>         cur = buf;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) {
>                 if (flock->fl_start > li->offset ||
>                     (flock->fl_start + length) <
> --
> 2.13.6
>

Thanks for doing this!  Looks fine to me.

I'm not loving this approach to solve this problem but I understand why it may be preferred right now.
I note that trylock with a sleep / retry loop is not often used in filesystems that I could tell (though it is used in drivers), so maybe this can be improved another day.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
  2019-10-25 15:38     ` Pavel Shilovskiy
@ 2019-10-25 15:41       ` Pavel Shilovskiy
  0 siblings, 0 replies; 8+ messages in thread
From: Pavel Shilovskiy @ 2019-10-25 15:41 UTC (permalink / raw)
  To: Pavel Shilovskiy, David Wysochanski, Ronnie Sahlberg; +Cc: linux-cifs

...continuing my email:

"I still think the right way to fix it would be to put "reconnected" file handles into a new queue and make another thread to "relock" the file."
But this approach requires much more work and it will involve more regression risks. So, I think the current patch is good for now.

-----Original Message-----
From: linux-cifs-owner@vger.kernel.org <linux-cifs-owner@vger.kernel.org> On Behalf Of Pavel Shilovskiy
Sent: Friday, October 25, 2019 8:39 AM
To: David Wysochanski <dwysocha@redhat.com>; Ronnie Sahlberg <lsahlber@redhat.com>
Cc: linux-cifs <linux-cifs@vger.kernel.org>
Subject: RE: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>

I still think the right way to fix it would be to put "reconnected" file handles into a new queue and make another thread to "relock" the file.

It seems like a stable kernel candidate but I would hold off and let it bake a little bit in the mainline before we consider doing that.

Best regards,
Pavel Shilovsky

-----Original Message-----
From: linux-cifs-owner@vger.kernel.org <linux-cifs-owner@vger.kernel.org> On Behalf Of David Wysochanski
Sent: Thursday, October 24, 2019 6:23 PM
To: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: linux-cifs <linux-cifs@vger.kernel.org>
Subject: Re: [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs

On Thu, Oct 24, 2019 at 7:51 PM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> From: Dave Wysochanski <dwysocha@redhat.com>
>
> There's a deadlock that is possible and can easily be seen with a test 
> where multiple readers open/read/close of the same file and a 
> disruption occurs causing reconnect.  The deadlock is due a reader 
> thread inside cifs_strict_readv calling down_read and obtaining 
> lock_sem, and then after reconnect inside cifs_reopen_file calling 
> down_read a second time.  If in between the two down_read calls, a 
> down_write comes from another process, deadlock occurs.
>
>         CPU0                    CPU1
>         ----                    ----
> cifs_strict_readv()
>  down_read(&cifsi->lock_sem);
>                                _cifsFileInfo_put
>                                   OR
>                                cifs_new_fileinfo
>                                 down_write(&cifsi->lock_sem);
> cifs_reopen_file()
>  down_read(&cifsi->lock_sem);
>
> Fix the above by changing all down_write(lock_sem) calls to
> down_write_trylock(lock_sem)/msleep() loop, which in turn makes the 
> second down_read call benign since it will never block behind the 
> writer while holding lock_sem.
>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
> ---
>  fs/cifs/cifsglob.h  |  5 +++++
>  fs/cifs/cifsproto.h |  1 +
>  fs/cifs/file.c      | 23 +++++++++++++++--------
>  fs/cifs/smb2file.c  |  2 +-
>  4 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index
> 50dfd9049370..d78bfcc19156 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -1391,6 +1391,11 @@ void cifsFileInfo_put(struct cifsFileInfo 
> *cifs_file);  struct cifsInodeInfo {
>         bool can_cache_brlcks;
>         struct list_head llist; /* locks helb by this inode */
> +       /*
> +        * NOTE: Some code paths call down_read(lock_sem) twice, so
> +        * we must always use use cifs_down_write() instead of down_write()
> +        * for this semaphore to avoid deadlocks.
> +        */
>         struct rw_semaphore lock_sem;   /* protect the fields above */
>         /* BB add in lists for dirty pages i.e. write caching info for oplock */
>         struct list_head openFileList; diff --git 
> a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index
> e53e9f62b87b..fe597d3d5208 100644
> --- a/fs/cifs/cifsproto.h
> +++ b/fs/cifs/cifsproto.h
> @@ -170,6 +170,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile,
>                              struct file_lock *flock, const unsigned 
> int xid);  extern int cifs_push_mandatory_locks(struct cifsFileInfo 
> *cfile);
>
> +extern void cifs_down_write(struct rw_semaphore *sem);
>  extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid,
>                                               struct file *file,
>                                               struct tcon_link *tlink, 
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c index
> 0e0217641de1..a80ec683ea32 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -281,6 +281,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode)
>         return has_locks;
>  }
>
> +void
> +cifs_down_write(struct rw_semaphore *sem) {
> +       while (!down_write_trylock(sem))
> +               msleep(10);
> +}
> +
>  struct cifsFileInfo *
>  cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>                   struct tcon_link *tlink, __u32 oplock) @@ -306,7
> +313,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
>         INIT_LIST_HEAD(&fdlocks->locks);
>         fdlocks->cfile = cfile;
>         cfile->llist = fdlocks;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add(&fdlocks->llist, &cinode->llist);
>         up_write(&cinode->lock_sem);
>
> @@ -464,7 +471,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
>          * Delete any outstanding lock records. We'll lose them when the file
>          * is closed anyway.
>          */
> -       down_write(&cifsi->lock_sem);
> +       cifs_down_write(&cifsi->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
>                 list_del(&li->llist);
>                 cifs_del_lock_waiters(li); @@ -1027,7 +1034,7 @@ 
> static void  cifs_lock_add(struct cifsFileInfo *cfile, struct 
> cifsLockInfo *lock)  {
>         struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry));
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_add_tail(&lock->llist, &cfile->llist->locks);
>         up_write(&cinode->lock_sem);
>  }
> @@ -1049,7 +1056,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, 
> struct cifsLockInfo *lock,
>
>  try_again:
>         exist = false;
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>
>         exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length,
>                                         lock->type, lock->flags, 
> &conf_lock, @@ -1072,7 +1079,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
>                                         (lock->blist.next == &lock->blist));
>                 if (!rc)
>                         goto try_again;
> -               down_write(&cinode->lock_sem);
> +               cifs_down_write(&cinode->lock_sem);
>                 list_del_init(&lock->blist);
>         }
>
> @@ -1125,7 +1132,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock)
>                 return rc;
>
>  try_again:
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1331,7 +1338,7 @@ cifs_push_locks(struct cifsFileInfo *cfile)
>         int rc = 0;
>
>         /* we are going to update can_cache_brlcks here - need a write access */
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         if (!cinode->can_cache_brlcks) {
>                 up_write(&cinode->lock_sem);
>                 return rc;
> @@ -1522,7 +1529,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
>         if (!buf)
>                 return -ENOMEM;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         for (i = 0; i < 2; i++) {
>                 cur = buf;
>                 num = 0;
> diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c index
> e6a1fc72018f..8b0b512c5792 100644
> --- a/fs/cifs/smb2file.c
> +++ b/fs/cifs/smb2file.c
> @@ -145,7 +145,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile, 
> struct file_lock *flock,
>
>         cur = buf;
>
> -       down_write(&cinode->lock_sem);
> +       cifs_down_write(&cinode->lock_sem);
>         list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) {
>                 if (flock->fl_start > li->offset ||
>                     (flock->fl_start + length) <
> --
> 2.13.6
>

Thanks for doing this!  Looks fine to me.

I'm not loving this approach to solve this problem but I understand why it may be preferred right now.
I note that trylock with a sleep / retry loop is not often used in filesystems that I could tell (though it is used in drivers), so maybe this can be improved another day.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Updated patch for the the lock_sem deadlock
  2019-10-24 23:51 Updated patch for the the lock_sem deadlock Ronnie Sahlberg
  2019-10-24 23:51 ` [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Ronnie Sahlberg
@ 2019-10-25 16:24 ` Aurélien Aptel
  2019-10-26 10:14   ` David Wysochanski
  1 sibling, 1 reply; 8+ messages in thread
From: Aurélien Aptel @ 2019-10-25 16:24 UTC (permalink / raw)
  To: Ronnie Sahlberg, linux-cifs

Ronnie Sahlberg <lsahlber@redhat.com> writes:
> This is a small update to Dave's patch to address Pavels recommendation
> that we use a helper function for the trylock/sleep loop.

Disclamer: I have not read all the emails regarding this patch but it
is not obvious to me how replacing

    lock()

by

    while (trylock())
        sleep()

is fixing things, but I'm sure I'm missing something :(

Cheers,
-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Updated patch for the the lock_sem deadlock
  2019-10-25 16:24 ` Updated patch for the the lock_sem deadlock Aurélien Aptel
@ 2019-10-26 10:14   ` David Wysochanski
  2019-10-28 12:05     ` Aurélien Aptel
  0 siblings, 1 reply; 8+ messages in thread
From: David Wysochanski @ 2019-10-26 10:14 UTC (permalink / raw)
  To: Aurélien Aptel; +Cc: Ronnie Sahlberg, linux-cifs

On Fri, Oct 25, 2019 at 12:24 PM Aurélien Aptel <aaptel@suse.com> wrote:
>
> Ronnie Sahlberg <lsahlber@redhat.com> writes:
> > This is a small update to Dave's patch to address Pavels recommendation
> > that we use a helper function for the trylock/sleep loop.
>
> Disclamer: I have not read all the emails regarding this patch but it
> is not obvious to me how replacing
>
>     lock()
>
> by
>
>     while (trylock())
>         sleep()
>
> is fixing things, but I'm sure I'm missing something :(
>

Let me try to explain better.

The deadlock occurs because of how rw_semaphores work in Linux.

The deadlock occurred because we had:
1. thread1: down_read() and obtained the semaphore
2. thread2: down_write() blocked due to thread1
3. thread1: down_read (a second time), blocked due to thread2

Note that it is normally benign for a single thread to call down_read
twice.  However, in this case, another thread called down_write in
between the two calls.  Once one thread calls down_write, any callers
of down_read will block, that is the rw_semaphore implementation in
Linux.  If this was not the case, we could have callers of down_read
continually streaming in and starving out callers of down_write.

The patch removes thread2 from blocking, so #3 will never occur, hence
removing the deadlock.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Updated patch for the the lock_sem deadlock
  2019-10-26 10:14   ` David Wysochanski
@ 2019-10-28 12:05     ` Aurélien Aptel
  0 siblings, 0 replies; 8+ messages in thread
From: Aurélien Aptel @ 2019-10-28 12:05 UTC (permalink / raw)
  To: David Wysochanski; +Cc: Ronnie Sahlberg, linux-cifs

David Wysochanski <dwysocha@redhat.com> writes:
> The patch removes thread2 from blocking, so #3 will never occur, hence
> removing the deadlock.

Thas was a great explanation, thanks!

-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-10-28 12:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-24 23:51 Updated patch for the the lock_sem deadlock Ronnie Sahlberg
2019-10-24 23:51 ` [PATCH] cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs Ronnie Sahlberg
2019-10-25  1:23   ` David Wysochanski
2019-10-25 15:38     ` Pavel Shilovskiy
2019-10-25 15:41       ` Pavel Shilovskiy
2019-10-25 16:24 ` Updated patch for the the lock_sem deadlock Aurélien Aptel
2019-10-26 10:14   ` David Wysochanski
2019-10-28 12:05     ` Aurélien Aptel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).