* [PATCH 2/2] sunrpc: svc_sock_names should hold ref to socket being closed.
  2010-11-15  0:27 [PATCH 0/2] sunrpc: fix two server-side race problems NeilBrown
  2010-11-15  0:27 ` [PATCH 1/2] sunrpc: remove xpt_pool NeilBrown
@ 2010-11-15  0:27 ` NeilBrown
  2010-11-16  6:07 ` [PATCH 0/2] sunrpc: fix two server-side race problems Neil Brown
  2010-11-16 15:33 ` J. Bruce Fields
  3 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2010-11-15  0:27 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

Currently svc_sock_names calls svc_close_xprt on a svc_sock to
which it does not own a reference.
As soon as svc_close_xprt sets XPT_CLOSE, the socket could be
freed by a separate thread (though this is a very unlikely race).

It is safer to hold a reference while calling svc_close_xprt.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/svcsock.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 07919e1..52bd113 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -324,19 +324,21 @@ int svc_sock_names(struct svc_serv *serv, char *buf, const size_t buflen,
 			len = onelen;
 			break;
 		}
-		if (toclose && strcmp(toclose, buf + len) == 0)
+		if (toclose && strcmp(toclose, buf + len) == 0) {
 			closesk = svsk;
-		else
+			svc_xprt_get(&closesk->sk_xprt);
+		} else
 			len += onelen;
 	}
 	spin_unlock_bh(&serv->sv_lock);
 
-	if (closesk)
+	if (closesk) {
 		/* Should unregister with portmap, but you cannot
 		 * unregister just one protocol...
 		 */
 		svc_close_xprt(&closesk->sk_xprt);
-	else if (toclose)
+		svc_xprt_put(&closesk->sk_xprt);
+	} else if (toclose)
 		return -ENOENT;
 	return len;
 }
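
In isolation, the pattern the patch establishes looks roughly like this
(a sketch only -- in the real code above the reference is taken inside
the list walk, under sv_lock, and dropped after the unlock):

	/* Pin the socket before asking for it to be closed, so a
	 * concurrent closer cannot free it under us; then drop the pin. */
	svc_xprt_get(&closesk->sk_xprt);
	svc_close_xprt(&closesk->sk_xprt);	/* sets XPT_CLOSE; teardown may run any time after this */
	svc_xprt_put(&closesk->sk_xprt);	/* the last put frees the xprt */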




* [PATCH 1/2] sunrpc: remove xpt_pool
  2010-11-15  0:27 [PATCH 0/2] sunrpc: fix two server-side race problems NeilBrown
@ 2010-11-15  0:27 ` NeilBrown
  2010-11-15  0:27 ` [PATCH 2/2] sunrpc: svc_sock_names should hold ref to socket being closed NeilBrown
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2010-11-15  0:27 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The xpt_pool field is only used for reporting BUGs.
And it isn't used correctly.

In particular, when it is cleared in svc_xprt_received before
XPT_BUSY is cleared, there is no guarantee that the compiler or
the CPU will not re-order the two assignments, so that xpt_pool
is only set to NULL after XPT_BUSY is cleared.

If a different CPU were running svc_xprt_enqueue at that moment,
it might see XPT_BUSY clear but xpt_pool still non-NULL, and
so BUG.

This could be fixed by calling
  smp_mb__before_clear_bit()
before the clear_bit.  However, as xpt_pool isn't really used,
it seems safest to simply remove xpt_pool.

Another alternative would be to change the clear_bit to
clear_bit_unlock, and the test_and_set_bit to test_and_set_bit_lock.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/svc_xprt.h |    1 -
 net/sunrpc/svc_xprt.c           |    6 ------
 2 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index bbdb680..79baf68 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -63,7 +63,6 @@ struct svc_xprt {
 #define XPT_LISTENER	11		/* listening endpoint */
 #define XPT_CACHE_AUTH	12		/* cache auth info */
 
-	struct svc_pool		*xpt_pool;	/* current pool iff queued */
 	struct svc_serv		*xpt_server;	/* service for transport */
 	atomic_t    	    	xpt_reserved;	/* space on outq that is rsvd */
 	struct mutex		xpt_mutex;	/* to serialize sending data */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index c82fe73..ec9d8ef 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -342,8 +342,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 		dprintk("svc: transport %p busy, not enqueued\n", xprt);
 		goto out_unlock;
 	}
-	BUG_ON(xprt->xpt_pool != NULL);
-	xprt->xpt_pool = pool;
 
 	/* Handle pending connection */
 	if (test_bit(XPT_CONN, &xprt->xpt_flags))
@@ -358,7 +356,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 		/* Don't enqueue while not enough space for reply */
 		dprintk("svc: no write space, transport %p  not enqueued\n",
 			xprt);
-		xprt->xpt_pool = NULL;
 		clear_bit(XPT_BUSY, &xprt->xpt_flags);
 		goto out_unlock;
 	}
@@ -380,13 +377,11 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
 		pool->sp_stats.threads_woken++;
-		BUG_ON(xprt->xpt_pool != pool);
 		wake_up(&rqstp->rq_wait);
 	} else {
 		dprintk("svc: transport %p put into queue\n", xprt);
 		list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
 		pool->sp_stats.sockets_queued++;
-		BUG_ON(xprt->xpt_pool != pool);
 	}
 
 out_unlock:
@@ -425,7 +420,6 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
 void svc_xprt_received(struct svc_xprt *xprt)
 {
 	BUG_ON(!test_bit(XPT_BUSY, &xprt->xpt_flags));
-	xprt->xpt_pool = NULL;
 	clear_bit(XPT_BUSY, &xprt->xpt_flags);
 	svc_xprt_enqueue(xprt);
 }
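
For reference, the two alternatives mentioned in the changelog would have
looked something like the sketches below (neither is applied here; the
helpers named are the ordinary bitop primitives of that era):

	/* Alternative 1: keep xpt_pool, but order the NULL store in
	 * svc_xprt_received() before the clearing of XPT_BUSY. */
	xprt->xpt_pool = NULL;
	smp_mb__before_clear_bit();
	clear_bit(XPT_BUSY, &xprt->xpt_flags);

	/* Alternative 2: give XPT_BUSY acquire/release (lock) semantics,
	 * so stores made while it was set are visible before it clears. */
	if (test_and_set_bit_lock(XPT_BUSY, &xprt->xpt_flags))
		goto out_unlock;		/* transport busy, not enqueued */
	/* ... enqueue work as before ... */
	clear_bit_unlock(XPT_BUSY, &xprt->xpt_flags);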




* [PATCH 0/2] sunrpc: fix two server-side race problems.
@ 2010-11-15  0:27 NeilBrown
  2010-11-15  0:27 ` [PATCH 1/2] sunrpc: remove xpt_pool NeilBrown
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: NeilBrown @ 2010-11-15  0:27 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs


Hi Bruce,
 I found these while trying to diagnose why a customer hit the
 	BUG_ON(xprt->xpt_pool != NULL);
 in svc_xprt_enqueue (in a 2.6.27-based kernel).  I don't think either
 of these actually explains the problem, but they seem to be bugs
 nonetheless.

NeilBrown


---

NeilBrown (2):
      sunrpc: svc_sock_names should hold ref to socket being closed.
      sunrpc: remove xpt_pool


 include/linux/sunrpc/svc_xprt.h |    1 -
 net/sunrpc/svc_xprt.c           |    6 ------
 net/sunrpc/svcsock.c            |   10 ++++++----
 3 files changed, 6 insertions(+), 11 deletions(-)

-- 
Signature



* Re: [PATCH 0/2] sunrpc: fix two server-side race problems.
  2010-11-15  0:27 [PATCH 0/2] sunrpc: fix two server-side race problems NeilBrown
  2010-11-15  0:27 ` [PATCH 1/2] sunrpc: remove xpt_pool NeilBrown
  2010-11-15  0:27 ` [PATCH 2/2] sunrpc: svc_sock_names should hold ref to socket being closed NeilBrown
@ 2010-11-16  6:07 ` Neil Brown
  2010-11-16 15:04   ` J. Bruce Fields
  2010-11-16 15:33 ` J. Bruce Fields
  3 siblings, 1 reply; 7+ messages in thread
From: Neil Brown @ 2010-11-16  6:07 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, Tom Tucker

On Mon, 15 Nov 2010 11:27:01 +1100
NeilBrown <neilb@suse.de> wrote:

> 
> Hi Bruce,
>  I found these while trying to diagnose why a customer hit the
>  	BUG_ON(xprt->xpt_pool != NULL);
>  in svc_xprt_enqueue (in a 2.6.27 based kernel).  I don't think either
>  of these actually explain the problem, but they seem to be bugs
>  none-the-less.
> 

And I think I have now found the bug that caused the BUG_ON.
The xprt has been freed and reused for something else, so xpt_pool and other
things are garbage.

If you could review and apply this patch I would appreciate it.

While exploring I noticed something that I thought was odd.
In svc_rdma_transport.c, handle_connect_req calls rdma_create_xprt which will
return an xprt with XPT_BUSY set.
But I cannot see that XPT_BUSY is ever cleared.
There is a comment saying that svc_xprt_received() cannot be called for some
reason, but as far as I can see the reason is invalid, and svc_xprt_received
is exactly what should be called to clear XPT_BUSY.
But I don't really know this code, so I might be missing something important.

Thanks,
NeilBrown

From 0edb33a19f415a783d89839efbdf7c572a797043 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Tue, 16 Nov 2010 16:55:19 +1100
Subject: [PATCH] sunrpc: close another server-side race.

When an xprt is created, it has a refcount of 1, and XPT_BUSY
is set.
The refcount is *not* owned by the thread that created the xprt
(as is clear from the fact that creators never put the reference).
Rather, it is owned by the absence of XPT_DEAD.  Once XPT_DEAD is set
(and XPT_BUSY is clear), that initial reference is dropped and the xprt
can be freed.

So when a creator clears XPT_BUSY it is dropping its only reference and
so must not touch the xprt again.

However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
reference on a new xprt), calls svc_xprt_received.  This clears
XPT_BUSY and then calls svc_xprt_enqueue - the latter without owning a reference.
This is dangerous and has been seen to leave svc_xprt_enqueue working
with an xprt containing garbage.

So we need to hold an extra counted reference over the svc_xprt_enqueue
call in svc_xprt_received.

For safety, any time we clear XPT_BUSY and then use the xprt again, we
first get a reference, and then put it again afterwards.

Note that svc_close_all does not need this extra protection as there are
no threads running, and the final free can only be called asynchronously
from such a thread.

Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index ec9d8ef..fa249ca 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -213,6 +213,7 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
 	spin_lock(&svc_xprt_class_lock);
 	list_for_each_entry(xcl, &svc_xprt_class_list, xcl_list) {
 		struct svc_xprt *newxprt;
+		unsigned short newport;
 
 		if (strcmp(xprt_name, xcl->xcl_name))
 			continue;
@@ -231,8 +232,9 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
 		spin_lock_bh(&serv->sv_lock);
 		list_add(&newxprt->xpt_list, &serv->sv_permsocks);
 		spin_unlock_bh(&serv->sv_lock);
+		newport = svc_xprt_local_port(newxprt);
 		clear_bit(XPT_BUSY, &newxprt->xpt_flags);
-		return svc_xprt_local_port(newxprt);
+		return newport;
 	}
  err:
 	spin_unlock(&svc_xprt_class_lock);
@@ -420,8 +422,13 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
 void svc_xprt_received(struct svc_xprt *xprt)
 {
 	BUG_ON(!test_bit(XPT_BUSY, &xprt->xpt_flags));
+	/* As soon as we clear busy, the xprt could be closed and
+	 * 'put', so we need a reference to call svc_xprt_enqueue with
+	 */
+	svc_xprt_get(xprt);
 	clear_bit(XPT_BUSY, &xprt->xpt_flags);
 	svc_xprt_enqueue(xprt);
+	svc_xprt_put(xprt);
 }
 EXPORT_SYMBOL_GPL(svc_xprt_received);
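
To make the window concrete, the interleaving this closes is roughly the
following (a comment-only sketch; "A" is the nfsd thread that accepted the
new socket, "B" is any other path that handles and then closes it):

	/*
	 * A: svc_xprt_received(newxprt)
	 * A:   clear_bit(XPT_BUSY)		<- A no longer holds any reference
	 * B: svc_xprt_enqueue(newxprt); a thread handles it and closes it
	 * B: svc_delete_xprt() sets XPT_DEAD and drops the initial
	 *    reference -> xprt freed
	 * A:   svc_xprt_enqueue(newxprt)	<- dereferences freed memory
	 *
	 * Taking a counted reference before clearing XPT_BUSY, and dropping
	 * it after the enqueue, keeps the xprt alive for A's last step.
	 */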
 


* Re: [PATCH 0/2] sunrpc: fix two server-side race problems.
  2010-11-16  6:07 ` [PATCH 0/2] sunrpc: fix two server-side race problems Neil Brown
@ 2010-11-16 15:04   ` J. Bruce Fields
  2010-11-16 22:24     ` Neil Brown
  0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2010-11-16 15:04 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs, Tom Tucker

On Tue, Nov 16, 2010 at 05:07:20PM +1100, Neil Brown wrote:
> On Mon, 15 Nov 2010 11:27:01 +1100
> NeilBrown <neilb@suse.de> wrote:
> 
> > 
> > Hi Bruce,
> >  I found these while trying to diagnose why a customer hit the
> >  	BUG_ON(xprt->xpt_pool != NULL);
> >  in svc_xprt_enqueue (in a 2.6.27 based kernel).  I don't think either
> >  of these actually explain the problem, but they seem to be bugs
> >  none-the-less.
> > 
> 
> And I think I have now found the bug that caused the BUG_ON.

Ah-hah!  Cool bug.  (By the way, what made you think it's what this
customer hit?)

> The xprt has been freed and reused for something else so xpt_pool and other
> things are garbage.
> 
> If you could review and apply this patch I would appreciate it.

Certainly looks sensible to me.

Do you think that anything we did recently made this more likely?  Or
have we never seen this before because the window for this race is so
small?

(I'm just wondering whether to aim for 2.6.38, or for 2.6.37 and stable.
Though I'm sort of inclined to the latter if it's a crash that someone's
really hit, even if it's not a regression.)

--b.

> 
> While exploring I noticed something that I thought was odd.
> In svc_rdma_transport.c, handle_connect_req calls rdma_create_xprt which will
> return an xprt with XPT_BUSY set.
> But I cannot see that XPT_BUSY is ever cleared.
> There is a comment saying that svc_xprt_received() cannot be called for some
> reason, but as far as I can see the reason is invalid, and svc_xprt_received
> is exactly what should be called to clear XPT_BUSY.
> But I don't really know this code so might be missing something important.
> 
> Thanks,
> NeilBrown
> 
> From 0edb33a19f415a783d89839efbdf7c572a797043 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Tue, 16 Nov 2010 16:55:19 +1100
> Subject: [PATCH] sunrpc: close another server-side race.
> 
> When an xprt is created, it has a refcount of 1, and XPT_BUSY
> is set.
> The refcount is *not* owned by the thread that created the xprt
> (as is clear from the fact that creators never put the reference).
> Rather, it is owned by the absence of XPT_DEAD.  Once XPT_DEAD is set,
> (And XPT_BUSY is clear) that initial reference is dropped and the xprt
> can be freed.
> 
> So when a creator clears XPT_BUSY it is dropping its only reference and
> so must not touch the xprt again.
> 
> However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
> reference on a new xprt), calls svc_xprt_received.  This clears
> XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
> This is dangerous and has been seen to leave svc_xprt_enqueue working
> with an xprt containing garbage.
> 
> So we need to hold an extra counted reference over that call to
> svc_xprt_received.
> 
> For safety, any time we clear XPT_BUSY and then use the xprt again, we
> first get a reference, and then put it again afterwards.
> 
> Note that svc_close_all does not need this extra protection as there are
> no threads running, and the final free can only be called asynchronously
> from such a thread.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index ec9d8ef..fa249ca 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -213,6 +213,7 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
>  	spin_lock(&svc_xprt_class_lock);
>  	list_for_each_entry(xcl, &svc_xprt_class_list, xcl_list) {
>  		struct svc_xprt *newxprt;
> +		unsigned short newport;
>  
>  		if (strcmp(xprt_name, xcl->xcl_name))
>  			continue;
> @@ -231,8 +232,9 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
>  		spin_lock_bh(&serv->sv_lock);
>  		list_add(&newxprt->xpt_list, &serv->sv_permsocks);
>  		spin_unlock_bh(&serv->sv_lock);
> +		newport = svc_xprt_local_port(newxprt);
>  		clear_bit(XPT_BUSY, &newxprt->xpt_flags);
> -		return svc_xprt_local_port(newxprt);
> +		return newport;
>  	}
>   err:
>  	spin_unlock(&svc_xprt_class_lock);
> @@ -420,8 +422,13 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
>  void svc_xprt_received(struct svc_xprt *xprt)
>  {
>  	BUG_ON(!test_bit(XPT_BUSY, &xprt->xpt_flags));
> +	/* As soon as we clear busy, the xprt could be closed and
> +	 * 'put', so we need a reference to call svc_xprt_enqueue with
> +	 */
> +	svc_xprt_get(xprt);
>  	clear_bit(XPT_BUSY, &xprt->xpt_flags);
>  	svc_xprt_enqueue(xprt);
> +	svc_xprt_put(xprt);
>  }
>  EXPORT_SYMBOL_GPL(svc_xprt_received);
>  


* Re: [PATCH 0/2] sunrpc: fix two server-side race problems.
  2010-11-15  0:27 [PATCH 0/2] sunrpc: fix two server-side race problems NeilBrown
                   ` (2 preceding siblings ...)
  2010-11-16  6:07 ` [PATCH 0/2] sunrpc: fix two server-side race problems Neil Brown
@ 2010-11-16 15:33 ` J. Bruce Fields
  3 siblings, 0 replies; 7+ messages in thread
From: J. Bruce Fields @ 2010-11-16 15:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Mon, Nov 15, 2010 at 11:27:01AM +1100, NeilBrown wrote:
> 
> Hi Bruce,
>  I found these while trying to diagnose why a customer hit the
>  	BUG_ON(xprt->xpt_pool != NULL);
>  in svc_xprt_enqueue (in a 2.6.27 based kernel).  I don't think either
>  of these actually explain the problem, but they seem to be bugs
>  none-the-less.

Yes.  Inclined to queue these up for the next merge window, though, as
neither seems urgent.

--b.


* Re: [PATCH 0/2] sunrpc: fix two server-side race problems.
  2010-11-16 15:04   ` J. Bruce Fields
@ 2010-11-16 22:24     ` Neil Brown
  0 siblings, 0 replies; 7+ messages in thread
From: Neil Brown @ 2010-11-16 22:24 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, Tom Tucker

On Tue, 16 Nov 2010 10:04:50 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Nov 16, 2010 at 05:07:20PM +1100, Neil Brown wrote:
> > On Mon, 15 Nov 2010 11:27:01 +1100
> > NeilBrown <neilb@suse.de> wrote:
> > 
> > > 
> > > Hi Bruce,
> > >  I found these while trying to diagnose why a customer hit the
> > >  	BUG_ON(xprt->xpt_pool != NULL);
> > >  in svc_xprt_enqueue (in a 2.6.27 based kernel).  I don't think either
> > >  of these actually explain the problem, but they seem to be bugs
> > >  none-the-less.
> > > 
> > 
> > And I think I have now found the bug that caused the BUG_ON.
> 
> Ah-hah!  Cool bug.  (By the way, what made you think it's what this
> customer hit?)

The first two I found while just looking for ways that BUG_ON could have
occurred, and both turned out to be actually impossible in the customer's
config.  One required a different arch and the other required manipulating
open sockets through the nfsdfs interface, which wasn't happening.

This one I found by looking more closely at the crash dump.  The BUG_ON was
in svc_xprt_enqueue called from svc_xprt_received called from svc_recv.  And
it was the svc_xprt_received that is called on a newly accepted socket.  Given
this was a new socket, there are going to be limited options for racing with
anything.  And given that I found something that *could* happen in that
limited scenario, I figure it probably did happen.


> 
> > The xprt has been freed and reused for something else so xpt_pool and other
> > things are garbage.
> > 
> > If you could review and apply this patch I would appreciate it.
> 
> Certainly looks sensible to me.
> 
> Do you think that anything we did recently made this more likely?  Or
> have we never seen this before because the window for this race is so
> small?

No, I don't think we changed anything.  This customer seems to be beating on
nfsd very hard - big server and lots of clients - trying to make sure it is
stable.  So they have probably pushed it harder and longer than anyone else.
We need more people doing that!!!


> 
> (I'm just wondering whether to aim for 2.6.38, or for 2.6.37 and stable.
> Though I'm sort of inclined to the latter if it's a crash that someone's
> really hit, even if it's not a regression.)

I would go for 2.6.37 and -stable for purely selfish reasons.
If you put a "Cc: stable@kernel.org" tag on it, then it will automatically
get pulled into the stable trees (with suitable review of course) and then it
will be automatically pulled into the SLES kernel (and probably RHEL too) so
it will get into our update stream without me having to do anything :-)


But I agree there is no rush with the other two.
The one which removes the BUG_ON may not be strictly necessary.  It requires
the CPU to do out-of-order writes, and it was suggested to me that because
the two memory locations are likely to be in the same cache line,
out-of-order writes aren't actually possible.
The other requires explicit closing of connections via the nfsdfs which
hardly ever happens at all, so the chance of a race with it is almost
non-existent.

Thanks,
NeilBrown



> 
> --b.
> 
> > 
> > While exploring I noticed something that I thought was odd.
> > In svc_rdma_transport.c, handle_connect_req calls rdma_create_xprt which will
> > return an xprt with XPT_BUSY set.
> > But I cannot see that XPT_BUSY is ever cleared.
> > There is a comment saying that svc_xprt_received() cannot be called for some
> > reason, but as far as I can see the reason is invalid, and svc_xprt_received
> > is exactly what should be called to clear XPT_BUSY.
> > But I don't really know this code so might be missing something important.
> > 
> > Thanks,
> > NeilBrown
> > 
> > From 0edb33a19f415a783d89839efbdf7c572a797043 Mon Sep 17 00:00:00 2001
> > From: NeilBrown <neilb@suse.de>
> > Date: Tue, 16 Nov 2010 16:55:19 +1100
> > Subject: [PATCH] sunrpc: close another server-side race.
> > 
> > When an xprt is created, it has a refcount of 1, and XPT_BUSY
> > is set.
> > The refcount is *not* owned by the thread that created the xprt
> > (as is clear from the fact that creators never put the reference).
> > Rather, it is owned by the absence of XPT_DEAD.  Once XPT_DEAD is set,
> > (And XPT_BUSY is clear) that initial reference is dropped and the xprt
> > can be freed.
> > 
> > So when a creator clears XPT_BUSY it is dropping its only reference and
> > so must not touch the xprt again.
> > 
> > However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
> > reference on a new xprt), calls svc_xprt_received.  This clears
> > XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
> > This is dangerous and has been seen to leave svc_xprt_enqueue working
> > with an xprt containing garbage.
> > 
> > So we need to hold an extra counted reference over that call to
> > svc_xprt_received.
> > 
> > For safety, any time we clear XPT_BUSY and then use the xprt again, we
> > first get a reference, and then put it again afterwards.
> > 
> > Note that svc_close_all does not need this extra protection as there are
> > no threads running, and the final free can only be called asynchronously
> > from such a thread.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > 
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index ec9d8ef..fa249ca 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -213,6 +213,7 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
> >  	spin_lock(&svc_xprt_class_lock);
> >  	list_for_each_entry(xcl, &svc_xprt_class_list, xcl_list) {
> >  		struct svc_xprt *newxprt;
> > +		unsigned short newport;
> >  
> >  		if (strcmp(xprt_name, xcl->xcl_name))
> >  			continue;
> > @@ -231,8 +232,9 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
> >  		spin_lock_bh(&serv->sv_lock);
> >  		list_add(&newxprt->xpt_list, &serv->sv_permsocks);
> >  		spin_unlock_bh(&serv->sv_lock);
> > +		newport = svc_xprt_local_port(newxprt);
> >  		clear_bit(XPT_BUSY, &newxprt->xpt_flags);
> > -		return svc_xprt_local_port(newxprt);
> > +		return newport;
> >  	}
> >   err:
> >  	spin_unlock(&svc_xprt_class_lock);
> > @@ -420,8 +422,13 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
> >  void svc_xprt_received(struct svc_xprt *xprt)
> >  {
> >  	BUG_ON(!test_bit(XPT_BUSY, &xprt->xpt_flags));
> > +	/* As soon as we clear busy, the xprt could be closed and
> > +	 * 'put', so we need a reference to call svc_xprt_enqueue with
> > +	 */
> > +	svc_xprt_get(xprt);
> >  	clear_bit(XPT_BUSY, &xprt->xpt_flags);
> >  	svc_xprt_enqueue(xprt);
> > +	svc_xprt_put(xprt);
> >  }
> >  EXPORT_SYMBOL_GPL(svc_xprt_received);
> >  



