All of lore.kernel.org
 help / color / mirror / Atom feed
* [Resend PATCH] RDS: fix race condition when sending a message on unbound socket
@ 2015-11-24 22:13 Santosh Shilimkar
  2015-11-24 22:20 ` David Miller
  2015-11-25 12:21 ` David Laight
  0 siblings, 2 replies; 5+ messages in thread
From: Santosh Shilimkar @ 2015-11-24 22:13 UTC (permalink / raw)
  To: netdev, davem; +Cc: linux-kernel, sasha.levin, ben, Quentin Casasnovas, stable

From: Quentin Casasnovas <quentin.casasnovas@oracle.com>

Sasha's found a NULL pointer dereference in the RDS connection code when
sending a message to an apparently unbound socket.  The problem is caused
by the code checking if the socket is bound in rds_sendmsg(), which checks
the rs_bound_addr field without taking a lock on the socket.  This opens a
race where rs_bound_addr is temporarily set but where the transport is not
in rds_bind(), leading to a NULL pointer dereference when trying to
dereference 'trans' in __rds_conn_create().

Vegard wrote a reproducer for this issue, so kindly ask him to share if
you're interested.

I cannot reproduce the NULL pointer dereference using Vegard's reproducer
with this patch, whereas I could without.

Complete earlier incomplete fix to CVE-2015-6937:

  74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection")

Cc: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org

Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
---
Dave,

Resending refreshed version from earlier patch which was missed out
for some reason.
	https://lkml.org/lkml/2015/10/16/530

It applies cleanly against linus master as well as net-next. I
have also pushed it on my tree below.
	git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git 4.4/net-next/rds-fix


 net/rds/connection.c | 6 ------
 net/rds/send.c       | 4 +++-
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index d456403..e3b118c 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -186,12 +186,6 @@ static struct rds_connection *__rds_conn_create(struct net *net,
 		}
 	}
 
-	if (trans == NULL) {
-		kmem_cache_free(rds_conn_slab, conn);
-		conn = ERR_PTR(-ENODEV);
-		goto out;
-	}
-
 	conn->c_trans = trans;
 
 	ret = trans->conn_alloc(conn, gfp);
diff --git a/net/rds/send.c b/net/rds/send.c
index 827155c..c9cdb35 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1013,11 +1013,13 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
 		release_sock(sk);
 	}
 
-	/* racing with another thread binding seems ok here */
+	lock_sock(sk);
 	if (daddr == 0 || rs->rs_bound_addr == 0) {
+		release_sock(sk);
 		ret = -ENOTCONN; /* XXX not a great errno */
 		goto out;
 	}
+	release_sock(sk);
 
 	if (payload_len > rds_sk_sndbuf(rs)) {
 		ret = -EMSGSIZE;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Resend PATCH] RDS: fix race condition when sending a message on unbound socket
  2015-11-24 22:13 [Resend PATCH] RDS: fix race condition when sending a message on unbound socket Santosh Shilimkar
@ 2015-11-24 22:20 ` David Miller
  2015-11-25 12:21 ` David Laight
  1 sibling, 0 replies; 5+ messages in thread
From: David Miller @ 2015-11-24 22:20 UTC (permalink / raw)
  To: santosh.shilimkar
  Cc: netdev, linux-kernel, sasha.levin, ben, quentin.casasnovas, stable

From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Date: Tue, 24 Nov 2015 17:13:21 -0500

> From: Quentin Casasnovas <quentin.casasnovas@oracle.com>
> 
> Sasha's found a NULL pointer dereference in the RDS connection code when
> sending a message to an apparently unbound socket.  The problem is caused
> by the code checking if the socket is bound in rds_sendmsg(), which checks
> the rs_bound_addr field without taking a lock on the socket.  This opens a
> race where rs_bound_addr is temporarily set but where the transport is not
> in rds_bind(), leading to a NULL pointer dereference when trying to
> dereference 'trans' in __rds_conn_create().
> 
> Vegard wrote a reproducer for this issue, so kindly ask him to share if
> you're interested.
> 
> I cannot reproduce the NULL pointer dereference using Vegard's reproducer
> with this patch, whereas I could without.
> 
> Complete earlier incomplete fix to CVE-2015-6937:
> 
>   74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection")
> 
> Cc: David S. Miller <davem@davemloft.net>
> Cc: stable@vger.kernel.org
> 
> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
> Reviewed-by: Sasha Levin <sasha.levin@oracle.com>
> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [Resend PATCH] RDS: fix race condition when sending a message on unbound socket
  2015-11-24 22:13 [Resend PATCH] RDS: fix race condition when sending a message on unbound socket Santosh Shilimkar
  2015-11-24 22:20 ` David Miller
@ 2015-11-25 12:21 ` David Laight
  2015-11-25 12:52   ` Quentin Casasnovas
  1 sibling, 1 reply; 5+ messages in thread
From: David Laight @ 2015-11-25 12:21 UTC (permalink / raw)
  To: 'Santosh Shilimkar', netdev, davem
  Cc: linux-kernel, sasha.levin, ben, Quentin Casasnovas, stable

From: Santosh Shilimkar
> Sent: 24 November 2015 22:13
...
> Sasha's found a NULL pointer dereference in the RDS connection code when
> sending a message to an apparently unbound socket.  The problem is caused
> by the code checking if the socket is bound in rds_sendmsg(), which checks
> the rs_bound_addr field without taking a lock on the socket.  This opens a
> race where rs_bound_addr is temporarily set but where the transport is not
> in rds_bind(), leading to a NULL pointer dereference when trying to
> dereference 'trans' in __rds_conn_create().
> 
> Vegard wrote a reproducer for this issue, so kindly ask him to share if
> you're interested.
...
> diff --git a/net/rds/send.c b/net/rds/send.c
> index 827155c..c9cdb35 100644
> --- a/net/rds/send.c
> +++ b/net/rds/send.c
> @@ -1013,11 +1013,13 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
>  		release_sock(sk);

This is falling though into an unconditional lock_sock().
No need to unlock and relock immediately.

>  	}
> 
> -	/* racing with another thread binding seems ok here */
> +	lock_sock(sk);
>  	if (daddr == 0 || rs->rs_bound_addr == 0) {
> +		release_sock(sk);
>  		ret = -ENOTCONN; /* XXX not a great errno */
>  		goto out;
>  	}
> +	release_sock(sk);
> 

On the face of it the above looks somewhat dubious.
Locks usually tie together two action (eg a test and use of a value),
In this case you only have a test inside the lock.
That either means that the state can change after you release the lock
(ie rs->rs_bound_addr = 0 is executed somewhere), or you don't
really need the lock.

	David



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Resend PATCH] RDS: fix race condition when sending a message on unbound socket
  2015-11-25 12:21 ` David Laight
@ 2015-11-25 12:52   ` Quentin Casasnovas
  2015-11-25 23:54     ` santosh shilimkar
  0 siblings, 1 reply; 5+ messages in thread
From: Quentin Casasnovas @ 2015-11-25 12:52 UTC (permalink / raw)
  To: David Laight
  Cc: 'Santosh Shilimkar',
	netdev, davem, linux-kernel, sasha.levin, ben,
	Quentin Casasnovas, stable

On Wed, Nov 25, 2015 at 12:21:45PM +0000, David Laight wrote:
> From: Santosh Shilimkar
> > Sent: 24 November 2015 22:13
> ...
> > Sasha's found a NULL pointer dereference in the RDS connection code when
> > sending a message to an apparently unbound socket.  The problem is caused
> > by the code checking if the socket is bound in rds_sendmsg(), which checks
> > the rs_bound_addr field without taking a lock on the socket.  This opens a
> > race where rs_bound_addr is temporarily set but where the transport is not
> > in rds_bind(), leading to a NULL pointer dereference when trying to
> > dereference 'trans' in __rds_conn_create().
> > 
> > Vegard wrote a reproducer for this issue, so kindly ask him to share if
> > you're interested.
> ...
> > diff --git a/net/rds/send.c b/net/rds/send.c
> > index 827155c..c9cdb35 100644
> > --- a/net/rds/send.c
> > +++ b/net/rds/send.c
> > @@ -1013,11 +1013,13 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
> >  		release_sock(sk);
> 
> This is falling though into an unconditional lock_sock().
> No need to unlock and relock immediately.
> 
> >  	}
> > 
> > -	/* racing with another thread binding seems ok here */
> > +	lock_sock(sk);
> >  	if (daddr == 0 || rs->rs_bound_addr == 0) {
> > +		release_sock(sk);
> >  		ret = -ENOTCONN; /* XXX not a great errno */
> >  		goto out;
> >  	}
> > +	release_sock(sk);
> > 
> 
> On the face of it the above looks somewhat dubious.
> Locks usually tie together two action (eg a test and use of a value),
> In this case you only have a test inside the lock.
> That either means that the state can change after you release the lock
> (ie rs->rs_bound_addr = 0 is executed somewhere), or you don't
> really need the lock.
>

If you look at rds_bind(), you'll see that it does something like the
following:

    lock_sock(sk);
    ...
1:  rds_add_bound();  # This sets rs->rs_bound_addr
    ...
    if (!trans) {
            ...
2:          rds_remove_bound(rs);  # This unsets rs->rs_bound_addr
  ...
  release_sock(sk);

So any code checking rs_bound_addr without taking that lock could
potentially think the socket is bound, when in fact rds_bind() has failed.
This can happen if checking rs_bound_addr happens exactly between [1] and
[2] above.  So the usage of the lock in this particular case is to get a
consistent view of the sk.

The only other case where rs_bound_addr is cleared is on socket release, so
I didn't _think_ there was a problem here but maybe you can see another
race?

Thanks,
Quentin


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Resend PATCH] RDS: fix race condition when sending a message on unbound socket
  2015-11-25 12:52   ` Quentin Casasnovas
@ 2015-11-25 23:54     ` santosh shilimkar
  0 siblings, 0 replies; 5+ messages in thread
From: santosh shilimkar @ 2015-11-25 23:54 UTC (permalink / raw)
  To: Quentin Casasnovas, David Laight
  Cc: netdev, davem, linux-kernel, sasha.levin, ben, stable

On 11/25/2015 4:52 AM, Quentin Casasnovas wrote:
> On Wed, Nov 25, 2015 at 12:21:45PM +0000, David Laight wrote:
>> From: Santosh Shilimkar
>>> Sent: 24 November 2015 22:13
>> ...
>>> Sasha's found a NULL pointer dereference in the RDS connection code when
>>> sending a message to an apparently unbound socket.  The problem is caused
>>> by the code checking if the socket is bound in rds_sendmsg(), which checks
>>> the rs_bound_addr field without taking a lock on the socket.  This opens a
>>> race where rs_bound_addr is temporarily set but where the transport is not
>>> in rds_bind(), leading to a NULL pointer dereference when trying to
>>> dereference 'trans' in __rds_conn_create().
>>>
>>> Vegard wrote a reproducer for this issue, so kindly ask him to share if
>>> you're interested.
>> ...
>>> diff --git a/net/rds/send.c b/net/rds/send.c
>>> index 827155c..c9cdb35 100644
>>> --- a/net/rds/send.c
>>> +++ b/net/rds/send.c
>>> @@ -1013,11 +1013,13 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
>>>   		release_sock(sk);
>>
>> This is falling though into an unconditional lock_sock().
>> No need to unlock and relock immediately.
>>
>>>   	}
>>>
>>> -	/* racing with another thread binding seems ok here */
>>> +	lock_sock(sk);
>>>   	if (daddr == 0 || rs->rs_bound_addr == 0) {
>>> +		release_sock(sk);
>>>   		ret = -ENOTCONN; /* XXX not a great errno */
>>>   		goto out;
>>>   	}
>>> +	release_sock(sk);
>>>
>>
>> On the face of it the above looks somewhat dubious.
>> Locks usually tie together two action (eg a test and use of a value),
>> In this case you only have a test inside the lock.
>> That either means that the state can change after you release the lock
>> (ie rs->rs_bound_addr = 0 is executed somewhere), or you don't
>> really need the lock.
>>
If you see this patch in isolation, what you said looks very obvious
but as indicated by Quentin below, the update is protected by
lock and check wasn't which lead to the issue on bind() failures.

>
> If you look at rds_bind(), you'll see that it does something like the
> following:
>
>      lock_sock(sk);
>      ...
> 1:  rds_add_bound();  # This sets rs->rs_bound_addr
>      ...
>      if (!trans) {
>              ...
> 2:          rds_remove_bound(rs);  # This unsets rs->rs_bound_addr
>    ...
>    release_sock(sk);
>
> So any code checking rs_bound_addr without taking that lock could
> potentially think the socket is bound, when in fact rds_bind() has failed.
> This can happen if checking rs_bound_addr happens exactly between [1] and
> [2] above.  So the usage of the lock in this particular case is to get a
> consistent view of the sk.
>
> The only other case where rs_bound_addr is cleared is on socket release, so
> I didn't _think_ there was a problem here but maybe you can see another
> race?
>
I will be curious as well to know.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-11-25 23:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-24 22:13 [Resend PATCH] RDS: fix race condition when sending a message on unbound socket Santosh Shilimkar
2015-11-24 22:20 ` David Miller
2015-11-25 12:21 ` David Laight
2015-11-25 12:52   ` Quentin Casasnovas
2015-11-25 23:54     ` santosh shilimkar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.