All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] CIFS: Fix race condition on RFC1002_NEGATIVE_SESSION_RESPONSE
@ 2015-03-17 17:13 Federico Sauter
       [not found] ` <550860D2.5040207-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Federico Sauter @ 2015-03-17 17:13 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 3300 bytes --]

Greetings,


I have been running into an issue (kernel v3.10.40) when connecting to 
an NT 3.51 Workstation host with a configuration missing the NetBIOS 
name ('servern' option.) Under some conditions, the mount helper would 
hang forever in an uninterruptible sleep state. The mount helper comes 
from busybox. Under some other conditions the mount program would exit 
with an error and not hang.

I managed to create a setup where I could reliably reproduce both cases 
(the "good case" where the program exits, and the "bad case" where the 
program hangs forever.)

The problem seems to be a race condition between the demux thread and 
the "main"(?) thread over the srv_mutex. Here is the summary of the 
functions calls that lead to this problem:

* demux thread:
cifs_demultiplex_thread()
   is_smb_response()
     [connect.c:626 -- case RFC1002_NEGATIVE_SESSION_RESPONSE]
       cifs_reconnect()
         [connect.c:380]
         do {
           mutex_lock(&server->srv_mutex);
           generic_ip_connect(server);
           // on error -> msleep(3000);
           mutex_unlock(&server->srv_mutex);
         } while (server->tcpStatus == CifsNeedReconnect);

* "main" thread:
cifs_negotiate()
   CIFSSMBNegotiate()
     SendReceive()
       [transport.c:821 - thread hangs forever]
       mutex_lock(&ses->server->srv_mutex);

Another interesting piece of information is that in the good case at
cifs_reconnect(), generic_ip_connect() returns -EINTR, whereas in the 
bad case it returns -ECONNREFUSED. In the bad case it all leads to 
generic_ip_connect() being called over and over again with the same 
result, but never exiting the loop (thus: hanging.)

The following patch works around the issue by not re-attempting the SMB 
negotiation:

---8<----------------------------------------------------------------------
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index 4885a40..863a2da 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -415,10 +415,12 @@ cifs_negotiate(const unsigned int xid, struct 
cifs_ses *ses)
         int rc;
         rc = CIFSSMBNegotiate(xid, ses);
         if (rc == -EAGAIN) {
+#if 0
                 /* retry only once on 1st time connection */
                 set_credits(ses->server, 1);
                 rc = CIFSSMBNegotiate(xid, ses);
                 if (rc == -EAGAIN)
+#endif
                         rc = -EHOSTDOWN;
         }
         return rc;
---------------------------------------------------------------------->8---

I was able, however, to identify a (hopefully) better solution for the 
issue (see the attached patch.)

I would really appreciate your feedback on the attached patch. Please 
let me know if the solution seems acceptable as well as 
side-effects-free. We use CIFS to connect to older Windows systems and 
we have been experiencing similar issues for a while now (which I hope 
to solve with this patch.)

Thanks a lot in advance! :-)


Kind regards,


Federico Sauter
Senior Firmware Programmer
-- 
Innominate Security Technologies AG
Rudower Chaussee 13 | 12489 Berlin | Germany
tel: +49 30 921028-210 | fax: +49 30 921028-020
www.innominate.com | www.twitter.com/mGuardcom

Register Court: AG Charlottenburg, HR B 81603
Management Board: Dirk Seewald | Chairman of the Supervisory Board: 
Christoph Leifer

[-- Attachment #2: 0001-CIFS-Fix-race-condition-on-RFC1002_NEGATIVE_SESSION_.patch --]
[-- Type: text/x-patch, Size: 1664 bytes --]

>From 54e3c95a4646c8666c6f08766250fd056b06e7f5 Mon Sep 17 00:00:00 2001
From: Federico Sauter <fsauter-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
Date: Tue, 17 Mar 2015 17:45:28 +0100
Subject: [PATCH] CIFS: Fix race condition on
 RFC1002_NEGATIVE_SESSION_RESPONSE

This patch fixes a race condition that occurs when connecting
to a NT 3.51 host without specifying a NetBIOS name.
In that case a RFC1002_NEGATIVE_SESSION_RESPONSE is received
and the SMB negotiation is reattempted, but under some conditions
it leads SendReceive() to hang forever while waiting for srv_mutex.
This, in turn, sets the calling process to an uninterruptible sleep
state and makes it unkillable.

The solution is to unlock the srv_mutex acquired in the demux
thread *before* going to sleep (after the reconnect error) and
before reattempting the connection.
---
 fs/cifs/connect.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index d05a300..a45e7fc 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -381,6 +381,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
 		rc = generic_ip_connect(server);
 		if (rc) {
 			cifs_dbg(FYI, "reconnect error %d\n", rc);
+			mutex_unlock(&server->srv_mutex);
 			msleep(3000);
 		} else {
 			atomic_inc(&tcpSesReconnectCount);
@@ -388,8 +389,8 @@ cifs_reconnect(struct TCP_Server_Info *server)
 			if (server->tcpStatus != CifsExiting)
 				server->tcpStatus = CifsNeedNegotiate;
 			spin_unlock(&GlobalMid_Lock);
+			mutex_unlock(&server->srv_mutex);
 		}
-		mutex_unlock(&server->srv_mutex);
 	} while (server->tcpStatus == CifsNeedReconnect);
 
 	return rc;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] CIFS: Fix race condition on RFC1002_NEGATIVE_SESSION_RESPONSE
       [not found] ` <550860D2.5040207-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
@ 2015-04-23 12:35   ` Federico Sauter
  2015-05-20 18:26   ` Steve French
  1 sibling, 0 replies; 3+ messages in thread
From: Federico Sauter @ 2015-04-23 12:35 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Any feedback on whether this patch makes sense or not? ;-)

---8<----------------------------------------------------------------------
 >From 54e3c95a4646c8666c6f08766250fd056b06e7f5 Mon Sep 17 00:00:00 2001
From: Federico Sauter <fsauter-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
Date: Tue, 17 Mar 2015 17:45:28 +0100
Subject: [PATCH] CIFS: Fix race condition on
  RFC1002_NEGATIVE_SESSION_RESPONSE

This patch fixes a race condition that occurs when connecting
to a NT 3.51 host without specifying a NetBIOS name.
In that case a RFC1002_NEGATIVE_SESSION_RESPONSE is received
and the SMB negotiation is reattempted, but under some conditions
it leads SendReceive() to hang forever while waiting for srv_mutex.
This, in turn, sets the calling process to an uninterruptible sleep
state and makes it unkillable.

The solution is to unlock the srv_mutex acquired in the demux
thread *before* going to sleep (after the reconnect error) and
before reattempting the connection.
---
  fs/cifs/connect.c |    3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index d05a300..a45e7fc 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -381,6 +381,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
  		rc = generic_ip_connect(server);
  		if (rc) {
  			cifs_dbg(FYI, "reconnect error %d\n", rc);
+			mutex_unlock(&server->srv_mutex);
  			msleep(3000);
  		} else {
  			atomic_inc(&tcpSesReconnectCount);
@@ -388,8 +389,8 @@ cifs_reconnect(struct TCP_Server_Info *server)
  			if (server->tcpStatus != CifsExiting)
  				server->tcpStatus = CifsNeedNegotiate;
  			spin_unlock(&GlobalMid_Lock);
+			mutex_unlock(&server->srv_mutex);
  		}
-		mutex_unlock(&server->srv_mutex);
  	} while (server->tcpStatus == CifsNeedReconnect);

  	return rc;
-- 
1.7.10.4
---------------------------------------------------------------------->8---


On 03/17/2015 06:13 PM, Federico Sauter wrote:
> Greetings,
>
>
> I have been running into an issue (kernel v3.10.40) when connecting to
> an NT 3.51 Workstation host with a configuration missing the NetBIOS
> name ('servern' option.) Under some conditions, the mount helper would
> hang forever in an uninterruptible sleep state. The mount helper comes
> from busybox. Under some other conditions the mount program would exit
> with an error and not hang.
>
> I managed to create a setup where I could reliably reproduce both cases
> (the "good case" where the program exits, and the "bad case" where the
> program hangs forever.)
>
> The problem seems to be a race condition between the demux thread and
> the "main"(?) thread over the srv_mutex. Here is the summary of the
> functions calls that lead to this problem:
>
> * demux thread:
> cifs_demultiplex_thread()
>    is_smb_response()
>      [connect.c:626 -- case RFC1002_NEGATIVE_SESSION_RESPONSE]
>        cifs_reconnect()
>          [connect.c:380]
>          do {
>            mutex_lock(&server->srv_mutex);
>            generic_ip_connect(server);
>            // on error -> msleep(3000);
>            mutex_unlock(&server->srv_mutex);
>          } while (server->tcpStatus == CifsNeedReconnect);
>
> * "main" thread:
> cifs_negotiate()
>    CIFSSMBNegotiate()
>      SendReceive()
>        [transport.c:821 - thread hangs forever]
>        mutex_lock(&ses->server->srv_mutex);
>
> Another interesting piece of information is that in the good case at
> cifs_reconnect(), generic_ip_connect() returns -EINTR, whereas in the
> bad case it returns -ECONNREFUSED. In the bad case it all leads to
> generic_ip_connect() being called over and over again with the same
> result, but never exiting the loop (thus: hanging.)
>
> The following patch works around the issue by not re-attempting the SMB
> negotiation:
>
> ---8<----------------------------------------------------------------------
> diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
> index 4885a40..863a2da 100644
> --- a/fs/cifs/smb1ops.c
> +++ b/fs/cifs/smb1ops.c
> @@ -415,10 +415,12 @@ cifs_negotiate(const unsigned int xid, struct
> cifs_ses *ses)
>          int rc;
>          rc = CIFSSMBNegotiate(xid, ses);
>          if (rc == -EAGAIN) {
> +#if 0
>                  /* retry only once on 1st time connection */
>                  set_credits(ses->server, 1);
>                  rc = CIFSSMBNegotiate(xid, ses);
>                  if (rc == -EAGAIN)
> +#endif
>                          rc = -EHOSTDOWN;
>          }
>          return rc;
> ---------------------------------------------------------------------->8---
>
> I was able, however, to identify a (hopefully) better solution for the
> issue (see the attached patch.)
>
> I would really appreciate your feedback on the attached patch. Please
> let me know if the solution seems acceptable as well as
> side-effects-free. We use CIFS to connect to older Windows systems and
> we have been experiencing similar issues for a while now (which I hope
> to solve with this patch.)
>
> Thanks a lot in advance! :-)
>
>
> Kind regards,
>
>
> Federico Sauter
> Senior Firmware Programmer

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] CIFS: Fix race condition on RFC1002_NEGATIVE_SESSION_RESPONSE
       [not found] ` <550860D2.5040207-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
  2015-04-23 12:35   ` Federico Sauter
@ 2015-05-20 18:26   ` Steve French
  1 sibling, 0 replies; 3+ messages in thread
From: Steve French @ 2015-05-20 18:26 UTC (permalink / raw)
  To: Federico Sauter; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

merged into cifs-2.6.git for-next

On Tue, Mar 17, 2015 at 12:13 PM, Federico Sauter
<fsauter-LVkJPw3T+odGBRGhe+f61g@public.gmane.org> wrote:
> Greetings,
>
>
> I have been running into an issue (kernel v3.10.40) when connecting to an NT
> 3.51 Workstation host with a configuration missing the NetBIOS name
> ('servern' option.) Under some conditions, the mount helper would hang
> forever in an uninterruptible sleep state. The mount helper comes from
> busybox. Under some other conditions the mount program would exit with an
> error and not hang.
>
> I managed to create a setup where I could reliably reproduce both cases (the
> "good case" where the program exits, and the "bad case" where the program
> hangs forever.)
>
> The problem seems to be a race condition between the demux thread and the
> "main"(?) thread over the srv_mutex. Here is the summary of the functions
> calls that lead to this problem:
>
> * demux thread:
> cifs_demultiplex_thread()
>   is_smb_response()
>     [connect.c:626 -- case RFC1002_NEGATIVE_SESSION_RESPONSE]
>       cifs_reconnect()
>         [connect.c:380]
>         do {
>           mutex_lock(&server->srv_mutex);
>           generic_ip_connect(server);
>           // on error -> msleep(3000);
>           mutex_unlock(&server->srv_mutex);
>         } while (server->tcpStatus == CifsNeedReconnect);
>
> * "main" thread:
> cifs_negotiate()
>   CIFSSMBNegotiate()
>     SendReceive()
>       [transport.c:821 - thread hangs forever]
>       mutex_lock(&ses->server->srv_mutex);
>
> Another interesting piece of information is that in the good case at
> cifs_reconnect(), generic_ip_connect() returns -EINTR, whereas in the bad
> case it returns -ECONNREFUSED. In the bad case it all leads to
> generic_ip_connect() being called over and over again with the same result,
> but never exiting the loop (thus: hanging.)
>
> The following patch works around the issue by not re-attempting the SMB
> negotiation:
>
> ---8<----------------------------------------------------------------------
> diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
> index 4885a40..863a2da 100644
> --- a/fs/cifs/smb1ops.c
> +++ b/fs/cifs/smb1ops.c
> @@ -415,10 +415,12 @@ cifs_negotiate(const unsigned int xid, struct cifs_ses
> *ses)
>         int rc;
>         rc = CIFSSMBNegotiate(xid, ses);
>         if (rc == -EAGAIN) {
> +#if 0
>                 /* retry only once on 1st time connection */
>                 set_credits(ses->server, 1);
>                 rc = CIFSSMBNegotiate(xid, ses);
>                 if (rc == -EAGAIN)
> +#endif
>                         rc = -EHOSTDOWN;
>         }
>         return rc;
> ---------------------------------------------------------------------->8---
>
> I was able, however, to identify a (hopefully) better solution for the issue
> (see the attached patch.)
>
> I would really appreciate your feedback on the attached patch. Please let me
> know if the solution seems acceptable as well as side-effects-free. We use
> CIFS to connect to older Windows systems and we have been experiencing
> similar issues for a while now (which I hope to solve with this patch.)
>
> Thanks a lot in advance! :-)
>
>
> Kind regards,
>
>
> Federico Sauter
> Senior Firmware Programmer
> --
> Innominate Security Technologies AG
> Rudower Chaussee 13 | 12489 Berlin | Germany
> tel: +49 30 921028-210 | fax: +49 30 921028-020
> www.innominate.com | www.twitter.com/mGuardcom
>
> Register Court: AG Charlottenburg, HR B 81603
> Management Board: Dirk Seewald | Chairman of the Supervisory Board:
> Christoph Leifer



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-05-20 18:26 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-17 17:13 [PATCH] CIFS: Fix race condition on RFC1002_NEGATIVE_SESSION_RESPONSE Federico Sauter
     [not found] ` <550860D2.5040207-LVkJPw3T+odGBRGhe+f61g@public.gmane.org>
2015-04-23 12:35   ` Federico Sauter
2015-05-20 18:26   ` Steve French

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.