* [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
@ 2017-03-04 16:57 Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
Dmitry Vyukov reported some syszkaller panics during netns deletion.
While I have not been able to reproduce those exact panics, my attempts
to do so uncovered a few other problems, which are fixed patch 2 and
patch 3 of this patch series. In addition, as mentioned in,
https://www.spinics.net/lists/netdev/msg422997.html
code-inspection points that the rds_connection needs to take an explicit
refcnt on the struct net so that it is held down until all cleanup is
completed for netns removal, and this is fixed by patch1.
The following scripts were run concurrently to uncover/test patch{2, 3}
while simultaneously running rds-ping to 12.0.0.18 from another system:
# cat del.rds
while [ 1 ]; do
modprobe rds_tcp
modprobe -r rds-tcp
done
# cat del.netns
while [ 1 ]; do
ip netns delete blue
ip netns add blue
ip link add link eth1 address a:b:c:d:e:f blue0 type macvlan
ip link set blue0 netns blue
ip netns exec blue ip addr add 12.0.0.18/24 dev blue0
ip netns exec blue ifconfig blue0 up
sleep 3;
done
Sowmini Varadhan (3):
rds: tcp: Take explicit refcounts on struct net
rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid
races
rds: tcp: Sequence teardown of listen and acceptor sockets to avoid
races
net/rds/connection.c | 1 +
net/rds/rds.h | 6 +++---
net/rds/tcp.c | 38 +++++++++++++++++++++-----------------
net/rds/tcp.h | 2 +-
net/rds/tcp_listen.c | 9 +++++++--
5 files changed, 33 insertions(+), 23 deletions(-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races Sowmini Varadhan
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
It is incorrect for the rds_connection to piggyback on the
sock_net() refcount for the netns because this gives rise to
a chicken-and-egg problem during rds_conn_destroy. Instead explicitly
take a ref on the net, and hold the netns down till the connection
tear-down is complete.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/connection.c | 1 +
net/rds/rds.h | 6 +++---
net/rds/tcp.c | 4 ++--
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/rds/connection.c b/net/rds/connection.c
index 0e04dcc..1fa75ab 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -429,6 +429,7 @@ void rds_conn_destroy(struct rds_connection *conn)
*/
rds_cong_remove_conn(conn);
+ put_net(conn->c_net);
kmem_cache_free(rds_conn_slab, conn);
spin_lock_irqsave(&rds_conn_lock, flags);
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 6f523dd..5dd2565 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -147,7 +147,7 @@ struct rds_connection {
/* Protocol version */
unsigned int c_version;
- possible_net_t c_net;
+ struct net *c_net;
struct list_head c_map_item;
unsigned long c_map_queued;
@@ -162,13 +162,13 @@ struct rds_connection {
static inline
struct net *rds_conn_net(struct rds_connection *conn)
{
- return read_pnet(&conn->c_net);
+ return conn->c_net;
}
static inline
void rds_conn_net_set(struct rds_connection *conn, struct net *net)
{
- write_pnet(&conn->c_net, net);
+ conn->c_net = get_net(net);
}
#define RDS_FLAG_CONG_BITMAP 0x01
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index a973d3b..65c8e3b 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -529,7 +529,7 @@ static void rds_tcp_kill_sock(struct net *net)
flush_work(&rtn->rds_tcp_accept_w);
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
- struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
+ struct net *c_net = tc->t_cpath->cp_conn->c_net;
if (net != c_net || !tc->t_sock)
continue;
@@ -584,7 +584,7 @@ static void rds_tcp_sysctl_reset(struct net *net)
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
- struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
+ struct net *c_net = tc->t_cpath->cp_conn->c_net;
if (net != c_net || !tc->t_sock)
continue;
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets " Sowmini Varadhan
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
Order of initialization in rds_tcp_init needs to be done so
that resources are set up and destroyed in the correct synchronization
sequence with both the data path, as well as netns create/destroy
path. Specifically,
- we must call register_pernet_subsys and get the rds_tcp_netid
before calling register_netdevice_notifier, otherwise we risk
the sequence
1. register_netdevice_notifier sets up netdev notifier callback
2. rds_tcp_dev_event -> rds_tcp_kill_sock uses netid 0, and finds
the wrong rtn, resulting in a panic with string that is of the form:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000d
IP: rds_tcp_kill_sock+0x3a/0x1d0 [rds_tcp]
:
- the rds_tcp_incoming_slab kmem_cache must be initialized before the
datapath starts up. The latter can happen any time after the
pernet_subsys registration of rds_tcp_net_ops, whose -> init
function sets up the listen socket. If the rds_tcp_incoming_slab has
not been set up at that time, a panic of the form below may be
encountered
BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
IP: kmem_cache_alloc+0x90/0x1c0
:
rds_tcp_data_recv+0x1e7/0x370 [rds_tcp]
tcp_read_sock+0x96/0x1c0
rds_tcp_recv_path+0x65/0x80 [rds_tcp]
:
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 19 +++++++++----------
1 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 65c8e3b..fbf807a 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -638,19 +638,19 @@ static int rds_tcp_init(void)
goto out;
}
- ret = register_netdevice_notifier(&rds_tcp_dev_notifier);
- if (ret) {
- pr_warn("could not register rds_tcp_dev_notifier\n");
+ ret = rds_tcp_recv_init();
+ if (ret)
goto out_slab;
- }
ret = register_pernet_subsys(&rds_tcp_net_ops);
if (ret)
- goto out_notifier;
+ goto out_recv;
- ret = rds_tcp_recv_init();
- if (ret)
+ ret = register_netdevice_notifier(&rds_tcp_dev_notifier);
+ if (ret) {
+ pr_warn("could not register rds_tcp_dev_notifier\n");
goto out_pernet;
+ }
rds_trans_register(&rds_tcp_transport);
@@ -660,9 +660,8 @@ static int rds_tcp_init(void)
out_pernet:
unregister_pernet_subsys(&rds_tcp_net_ops);
-out_notifier:
- if (unregister_netdevice_notifier(&rds_tcp_dev_notifier))
- pr_warn("could not unregister rds_tcp_dev_notifier\n");
+out_recv:
+ rds_tcp_recv_exit();
out_slab:
kmem_cache_destroy(rds_tcp_conn_slab);
out:
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets to avoid races
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
2017-03-07 22:10 ` David Miller
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
Commit a93d01f5777e ("RDS: TCP: avoid bad page reference in
rds_tcp_listen_data_ready") added the function
rds_tcp_listen_sock_def_readable() to handle the case when a
partially set-up acceptor socket drops into rds_tcp_listen_data_ready().
However, if the listen socket (rtn->rds_tcp_listen_sock) is itself going
through a tear-down via rds_tcp_listen_stop(), the (*ready)() will be
null and we would hit a panic of the form
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: (null)
:
? rds_tcp_listen_data_ready+0x59/0xb0 [rds_tcp]
tcp_data_queue+0x39d/0x5b0
tcp_rcv_established+0x2e5/0x660
tcp_v4_do_rcv+0x122/0x220
tcp_v4_rcv+0x8b7/0x980
:
In the above case, it is not fatal to encounter a NULL value for
ready- we should just drop the packet and let the flush of the
acceptor thread finish gracefully.
In general, the tear-down sequence for listen() and accept() socket
that is ensured by this commit is:
rtn->rds_tcp_listen_sock = NULL; /* prevent any new accepts */
In rds_tcp_listen_stop():
serialize with, and prevent, further callbacks using lock_sock()
flush rds_wq
flush acceptor workq
sock_release(listen socket)
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 15 ++++++++++-----
net/rds/tcp.h | 2 +-
net/rds/tcp_listen.c | 9 +++++++--
3 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index fbf807a..2256900 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -484,9 +484,10 @@ static void __net_exit rds_tcp_exit_net(struct net *net)
* we do need to clean up the listen socket here.
*/
if (rtn->rds_tcp_listen_sock) {
- rds_tcp_listen_stop(rtn->rds_tcp_listen_sock);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
+
rtn->rds_tcp_listen_sock = NULL;
- flush_work(&rtn->rds_tcp_accept_w);
+ rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
}
}
@@ -523,10 +524,10 @@ static void rds_tcp_kill_sock(struct net *net)
struct rds_tcp_connection *tc, *_tc;
LIST_HEAD(tmp_list);
struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
- rds_tcp_listen_stop(rtn->rds_tcp_listen_sock);
rtn->rds_tcp_listen_sock = NULL;
- flush_work(&rtn->rds_tcp_accept_w);
+ rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
struct net *c_net = tc->t_cpath->cp_conn->c_net;
@@ -546,8 +547,12 @@ static void rds_tcp_kill_sock(struct net *net)
void *rds_tcp_listen_sock_def_readable(struct net *net)
{
struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
+
+ if (!lsock)
+ return NULL;
- return rtn->rds_tcp_listen_sock->sk->sk_user_data;
+ return lsock->sk->sk_user_data;
}
static int rds_tcp_dev_event(struct notifier_block *this,
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 9a1cc89..56ea662 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -66,7 +66,7 @@ void rds_tcp_restore_callbacks(struct socket *sock,
/* tcp_listen.c */
struct socket *rds_tcp_listen_init(struct net *);
-void rds_tcp_listen_stop(struct socket *);
+void rds_tcp_listen_stop(struct socket *sock, struct work_struct *acceptor);
void rds_tcp_listen_data_ready(struct sock *sk);
int rds_tcp_accept_one(struct socket *sock);
int rds_tcp_keepalive(struct socket *sock);
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 67d0929..2c69a50 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -223,6 +223,9 @@ void rds_tcp_listen_data_ready(struct sock *sk)
* before it has been accepted and the accepter has set up their
* data_ready.. we only want to queue listen work for our listening
* socket
+ *
+ * (*ready)() may be null if we are racing with netns delete, and
+ * the listen socket is being torn down.
*/
if (sk->sk_state == TCP_LISTEN)
rds_tcp_accept_work(sk);
@@ -231,7 +234,8 @@ void rds_tcp_listen_data_ready(struct sock *sk)
out:
read_unlock_bh(&sk->sk_callback_lock);
- ready(sk);
+ if (ready)
+ ready(sk);
}
struct socket *rds_tcp_listen_init(struct net *net)
@@ -271,7 +275,7 @@ struct socket *rds_tcp_listen_init(struct net *net)
return NULL;
}
-void rds_tcp_listen_stop(struct socket *sock)
+void rds_tcp_listen_stop(struct socket *sock, struct work_struct *acceptor)
{
struct sock *sk;
@@ -292,5 +296,6 @@ void rds_tcp_listen_stop(struct socket *sock)
/* wait for accepts to stop and close the socket */
flush_workqueue(rds_wq);
+ flush_work(acceptor);
sock_release(sock);
}
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
` (2 preceding siblings ...)
2017-03-04 16:57 ` [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets " Sowmini Varadhan
@ 2017-03-07 1:04 ` santosh.shilimkar
2017-03-07 8:28 ` Dmitry Vyukov
2017-03-07 22:10 ` David Miller
4 siblings, 1 reply; 8+ messages in thread
From: santosh.shilimkar @ 2017-03-07 1:04 UTC (permalink / raw)
To: Sowmini Varadhan, dvyukov, davem, netdev; +Cc: syzkaller, rds-devel
On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>
> While I have not been able to reproduce those exact panics, my attempts
> to do so uncovered a few other problems, which are fixed patch 2 and
> patch 3 of this patch series. In addition, as mentioned in,
> https://www.spinics.net/lists/netdev/msg422997.html
> code-inspection points that the rds_connection needs to take an explicit
> refcnt on the struct net so that it is held down until all cleanup is
> completed for netns removal, and this is fixed by patch1.
>
Hopefully Dmitry can try the series and see if it fixes the issue(s).
The fixes looks good to me.
FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
@ 2017-03-07 8:28 ` Dmitry Vyukov
2017-03-07 16:29 ` Santosh Shilimkar
0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Vyukov @ 2017-03-07 8:28 UTC (permalink / raw)
To: santosh.shilimkar
Cc: Sowmini Varadhan, David Miller, netdev, syzkaller, rds-devel
On Tue, Mar 7, 2017 at 2:04 AM, santosh.shilimkar@oracle.com
<santosh.shilimkar@oracle.com> wrote:
> On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
>>
>> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>>
>> While I have not been able to reproduce those exact panics, my attempts
>> to do so uncovered a few other problems, which are fixed patch 2 and
>> patch 3 of this patch series. In addition, as mentioned in,
>> https://www.spinics.net/lists/netdev/msg422997.html
>> code-inspection points that the rds_connection needs to take an explicit
>> refcnt on the struct net so that it is held down until all cleanup is
>> completed for netns removal, and this is fixed by patch1.
>>
> Hopefully Dmitry can try the series and see if it fixes the issue(s).
> The fixes looks good to me.
>
> FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
I've applied the patches for testing. I've seen the reported crashes
only few times, so it won't provide a good testing. But at least it
can detect any regressions.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-07 8:28 ` Dmitry Vyukov
@ 2017-03-07 16:29 ` Santosh Shilimkar
0 siblings, 0 replies; 8+ messages in thread
From: Santosh Shilimkar @ 2017-03-07 16:29 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Sowmini Varadhan, David Miller, netdev, syzkaller, rds-devel
On 3/7/2017 12:28 AM, Dmitry Vyukov wrote:
> On Tue, Mar 7, 2017 at 2:04 AM, santosh.shilimkar@oracle.com
> <santosh.shilimkar@oracle.com> wrote:
>> On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
>>>
>>> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>>>
>>> While I have not been able to reproduce those exact panics, my attempts
>>> to do so uncovered a few other problems, which are fixed patch 2 and
>>> patch 3 of this patch series. In addition, as mentioned in,
>>> https://www.spinics.net/lists/netdev/msg422997.html
>>> code-inspection points that the rds_connection needs to take an explicit
>>> refcnt on the struct net so that it is held down until all cleanup is
>>> completed for netns removal, and this is fixed by patch1.
>>>
>> Hopefully Dmitry can try the series and see if it fixes the issue(s).
>> The fixes looks good to me.
>>
>> FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
>
>
> I've applied the patches for testing. I've seen the reported crashes
> only few times, so it won't provide a good testing. But at least it
> can detect any regressions.
>
Thanks Dmitry !!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
` (3 preceding siblings ...)
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
@ 2017-03-07 22:10 ` David Miller
4 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2017-03-07 22:10 UTC (permalink / raw)
To: sowmini.varadhan; +Cc: dvyukov, santosh.shilimkar, netdev, syzkaller, rds-devel
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Sat, 4 Mar 2017 08:57:32 -0800
> Dmitry Vyukov reported some syszkaller panics during netns deletion.
...
Series applied, thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-03-08 7:05 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets " Sowmini Varadhan
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
2017-03-07 8:28 ` Dmitry Vyukov
2017-03-07 16:29 ` Santosh Shilimkar
2017-03-07 22:10 ` David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.