netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
@ 2022-11-03 17:24 Kuniyuki Iwashima
  2022-11-06 19:18 ` Joanne Koong
  2022-11-10  9:19 ` Paolo Abeni
  0 siblings, 2 replies; 7+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-03 17:24 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Hideaki YOSHIFUJI, David Ahern
  Cc: Arnaldo Carvalho de Melo, Joanne Koong, Martin KaFai Lau,
	Ziyang Xuan (William),
	Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, dccp

When connect() is called on a socket bound to the wildcard address,
we change the socket's saddr to a local address.  If the socket
fails to connect() to the destination, we have to reset the saddr.

However, when an error occurs after inet_hash6?_connect() in
(dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
the socket bound to the address.

From the user's point of view, whether saddr is reset or not varies
with errno.  Let's fix this inconsistent behaviour.

Note that with this patch, the repro [0] will trigger the WARN_ON()
in inet_csk_get_port() again, but this patch is not buggy and rather
fixes a bug papering over the bhash2's bug [1] for which we need
another fix.

For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
by this sequence:

  s1 = socket()
  s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
  s1.bind(('127.0.0.1', 10000))
  s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
  # or s1.connect(('127.0.0.1', 10000))

  s2 = socket()
  s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
  s2.bind(('0.0.0.0', 10000))
  s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL

  s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);

[0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
[1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/

Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/dccp/ipv4.c     | 2 ++
 net/dccp/ipv6.c     | 2 ++
 net/ipv4/tcp_ipv4.c | 2 ++
 net/ipv6/tcp_ipv6.c | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 713b7b8dad7e..40640c26680e 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	 * This unhashes the socket and releases the local port, if necessary.
 	 */
 	dccp_set_state(sk, DCCP_CLOSED);
+	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
+		inet_reset_saddr(sk);
 	ip_rt_put(rt);
 	sk->sk_route_caps = 0;
 	inet->inet_dport = 0;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index e57b43006074..626166cb6d7e 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 late_failure:
 	dccp_set_state(sk, DCCP_CLOSED);
+	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
+		inet_reset_saddr(sk);
 	__sk_dst_reset(sk);
 failure:
 	inet->inet_dport = 0;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 87d440f47a70..6a3a732b584d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	 * if necessary.
 	 */
 	tcp_set_state(sk, TCP_CLOSE);
+	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
+		inet_reset_saddr(sk);
 	ip_rt_put(rt);
 	sk->sk_route_caps = 0;
 	inet->inet_dport = 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2a3f9296df1e..81b396e5cf79 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 late_failure:
 	tcp_set_state(sk, TCP_CLOSE);
+	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
+		inet_reset_saddr(sk);
 failure:
 	inet->inet_dport = 0;
 	sk->sk_route_caps = 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-03 17:24 [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect() Kuniyuki Iwashima
@ 2022-11-06 19:18 ` Joanne Koong
  2022-11-07  1:14   ` Kuniyuki Iwashima
  2022-11-10  9:19 ` Paolo Abeni
  1 sibling, 1 reply; 7+ messages in thread
From: Joanne Koong @ 2022-11-06 19:18 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Hideaki YOSHIFUJI, David Ahern, Arnaldo Carvalho de Melo,
	Martin KaFai Lau, Ziyang Xuan (William),
	Kuniyuki Iwashima, netdev, dccp

On Thu, Nov 3, 2022 at 10:24 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> When connect() is called on a socket bound to the wildcard address,
> we change the socket's saddr to a local address.  If the socket
> fails to connect() to the destination, we have to reset the saddr.
>
> However, when an error occurs after inet_hash6?_connect() in
> (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> the socket bound to the address.
>
> From the user's point of view, whether saddr is reset or not varies
> with errno.  Let's fix this inconsistent behaviour.
>
> Note that with this patch, the repro [0] will trigger the WARN_ON()
> in inet_csk_get_port() again, but this patch is not buggy and rather
> fixes a bug papering over the bhash2's bug [1] for which we need
> another fix.
>
> For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> by this sequence:
>
>   s1 = socket()
>   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
>   s1.bind(('127.0.0.1', 10000))
>   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
>   # or s1.connect(('127.0.0.1', 10000))
>
>   s2 = socket()
>   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
>   s2.bind(('0.0.0.0', 10000))
>   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
>
>   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
>
> [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
>
> Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  net/dccp/ipv4.c     | 2 ++
>  net/dccp/ipv6.c     | 2 ++
>  net/ipv4/tcp_ipv4.c | 2 ++
>  net/ipv6/tcp_ipv6.c | 2 ++
>  4 files changed, 8 insertions(+)
>
> diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> index 713b7b8dad7e..40640c26680e 100644
> --- a/net/dccp/ipv4.c
> +++ b/net/dccp/ipv4.c
> @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
>          * This unhashes the socket and releases the local port, if necessary.
>          */
>         dccp_set_state(sk, DCCP_CLOSED);
> +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> +               inet_reset_saddr(sk);
>         ip_rt_put(rt);
>         sk->sk_route_caps = 0;
>         inet->inet_dport = 0;
> diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> index e57b43006074..626166cb6d7e 100644
> --- a/net/dccp/ipv6.c
> +++ b/net/dccp/ipv6.c
> @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
>
>  late_failure:
>         dccp_set_state(sk, DCCP_CLOSED);
> +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> +               inet_reset_saddr(sk);
>         __sk_dst_reset(sk);
>  failure:
>         inet->inet_dport = 0;
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 87d440f47a70..6a3a732b584d 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
>          * if necessary.
>          */
>         tcp_set_state(sk, TCP_CLOSE);
> +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> +               inet_reset_saddr(sk);
>         ip_rt_put(rt);
>         sk->sk_route_caps = 0;
>         inet->inet_dport = 0;
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 2a3f9296df1e..81b396e5cf79 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
>
>  late_failure:
>         tcp_set_state(sk, TCP_CLOSE);
> +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> +               inet_reset_saddr(sk);
>  failure:
>         inet->inet_dport = 0;
>         sk->sk_route_caps = 0;
> --
> 2.30.2
>

inet_reset_saddr() sets both inet_saddr and inet_rcv_saddr to 0, but I
think there are some edge cases where when dccp/tcp_v4/6_connect() is
called, inet_saddr is 0 but inet_rcv_saddr is not, which means we'd
need to reset inet_rcv_saddr to its original value. The example case
I'm looking at is  __inet_bind() where if the request is to bind to a
multicast address,

    inet->inet_rcv_saddr = inet->inet_saddr = addr->sin_addr.s_addr;
    if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
        inet->inet_saddr = 0;  /* Use device */

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-06 19:18 ` Joanne Koong
@ 2022-11-07  1:14   ` Kuniyuki Iwashima
  2022-11-10 18:59     ` Joanne Koong
  0 siblings, 1 reply; 7+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-07  1:14 UTC (permalink / raw)
  To: joannelkoong
  Cc: acme, davem, dccp, dsahern, edumazet, kuba, kuni1840, kuniyu,
	martin.lau, netdev, pabeni, william.xuanziyang, yoshfuji

From:   Joanne Koong <joannelkoong@gmail.com>
Date:   Sun, 6 Nov 2022 11:18:44 -0800
> On Thu, Nov 3, 2022 at 10:24 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > When connect() is called on a socket bound to the wildcard address,
> > we change the socket's saddr to a local address.  If the socket
> > fails to connect() to the destination, we have to reset the saddr.
> >
> > However, when an error occurs after inet_hash6?_connect() in
> > (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> > the socket bound to the address.
> >
> > From the user's point of view, whether saddr is reset or not varies
> > with errno.  Let's fix this inconsistent behaviour.
> >
> > Note that with this patch, the repro [0] will trigger the WARN_ON()
> > in inet_csk_get_port() again, but this patch is not buggy and rather
> > fixes a bug papering over the bhash2's bug [1] for which we need
> > another fix.
> >
> > For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> > by this sequence:
> >
> >   s1 = socket()
> >   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> >   s1.bind(('127.0.0.1', 10000))
> >   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
> >   # or s1.connect(('127.0.0.1', 10000))
> >
> >   s2 = socket()
> >   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> >   s2.bind(('0.0.0.0', 10000))
> >   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
> >
> >   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
> >
> > [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> > [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
> >
> > Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> > Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  net/dccp/ipv4.c     | 2 ++
> >  net/dccp/ipv6.c     | 2 ++
> >  net/ipv4/tcp_ipv4.c | 2 ++
> >  net/ipv6/tcp_ipv6.c | 2 ++
> >  4 files changed, 8 insertions(+)
> >
> > diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> > index 713b7b8dad7e..40640c26680e 100644
> > --- a/net/dccp/ipv4.c
> > +++ b/net/dccp/ipv4.c
> > @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> >          * This unhashes the socket and releases the local port, if necessary.
> >          */
> >         dccp_set_state(sk, DCCP_CLOSED);
> > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > +               inet_reset_saddr(sk);
> >         ip_rt_put(rt);
> >         sk->sk_route_caps = 0;
> >         inet->inet_dport = 0;
> > diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> > index e57b43006074..626166cb6d7e 100644
> > --- a/net/dccp/ipv6.c
> > +++ b/net/dccp/ipv6.c
> > @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> >
> >  late_failure:
> >         dccp_set_state(sk, DCCP_CLOSED);
> > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > +               inet_reset_saddr(sk);
> >         __sk_dst_reset(sk);
> >  failure:
> >         inet->inet_dport = 0;
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 87d440f47a70..6a3a732b584d 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> >          * if necessary.
> >          */
> >         tcp_set_state(sk, TCP_CLOSE);
> > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > +               inet_reset_saddr(sk);
> >         ip_rt_put(rt);
> >         sk->sk_route_caps = 0;
> >         inet->inet_dport = 0;
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 2a3f9296df1e..81b396e5cf79 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> >
> >  late_failure:
> >         tcp_set_state(sk, TCP_CLOSE);
> > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > +               inet_reset_saddr(sk);
> >  failure:
> >         inet->inet_dport = 0;
> >         sk->sk_route_caps = 0;
> > --
> > 2.30.2
> >
> 
> inet_reset_saddr() sets both inet_saddr and inet_rcv_saddr to 0, but I
> think there are some edge cases where when dccp/tcp_v4/6_connect() is
> called, inet_saddr is 0 but inet_rcv_saddr is not, which means we'd
> need to reset inet_rcv_saddr to its original value. The example case
> I'm looking at is  __inet_bind() where if the request is to bind to a
> multicast address,
> 
>     inet->inet_rcv_saddr = inet->inet_saddr = addr->sin_addr.s_addr;
>     if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
>         inet->inet_saddr = 0;  /* Use device */

Thanks for reviewing.

We have to take care of these two error paths.

  * (dccp|tcp)_v[46]_coonnect()
  * __inet_stream_connect()

In __inet_stream_connect(), we call ->disconnect(), which already has the
same logic in this patch.

In your edge case, once (dccp|tcp)_v[46]_coonnect() succeeds, both of
inet->inet_saddr and sk_rcv_saddr are non-zero.  If connect() fails after
that, in ->disconnect(), we cannot know if we should restore sk_rcv_saddr
only.  Also, we don't have the previous address there.

For these reasons, we reset both addresses only if the sk was bound to
INADDR_ANY, which we can detect by the SOCK_BINDADDR_LOCK flag.

As you mentinoed, we can restore sk_rcv_saddr for the edge case in
(dccp|tcp)_v[46]_coonnect() but cannot in __inet_stream_connect().

If we do so, we need another flag for the case and another member to save
the old multicast/broadcast address. (+ where we need rehash for bhash2)

What do you think ?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-03 17:24 [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect() Kuniyuki Iwashima
  2022-11-06 19:18 ` Joanne Koong
@ 2022-11-10  9:19 ` Paolo Abeni
  1 sibling, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2022-11-10  9:19 UTC (permalink / raw)
  To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Hideaki YOSHIFUJI, David Ahern
  Cc: Arnaldo Carvalho de Melo, Joanne Koong, Martin KaFai Lau,
	Ziyang Xuan (William),
	Kuniyuki Iwashima, netdev, dccp

On Thu, 2022-11-03 at 10:24 -0700, Kuniyuki Iwashima wrote:
> When connect() is called on a socket bound to the wildcard address,
> we change the socket's saddr to a local address.  If the socket
> fails to connect() to the destination, we have to reset the saddr.
> 
> However, when an error occurs after inet_hash6?_connect() in
> (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> the socket bound to the address.
> 
> From the user's point of view, whether saddr is reset or not varies
> with errno.  Let's fix this inconsistent behaviour.
> 
> Note that with this patch, the repro [0] will trigger the WARN_ON()
> in inet_csk_get_port() again, but this patch is not buggy and rather
> fixes a bug papering over the bhash2's bug [1] for which we need
> another fix.
> 
> For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> by this sequence:
> 
>   s1 = socket()
>   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
>   s1.bind(('127.0.0.1', 10000))
>   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
>   # or s1.connect(('127.0.0.1', 10000))
> 
>   s2 = socket()
>   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
>   s2.bind(('0.0.0.0', 10000))
>   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
> 
>   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
> 
> [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
> 
> Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>

As there is discussion on the correct approach here, I'm dropping this
patch from pw.

Eventually we can revive it later as needed.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-07  1:14   ` Kuniyuki Iwashima
@ 2022-11-10 18:59     ` Joanne Koong
  2022-11-10 20:34       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 7+ messages in thread
From: Joanne Koong @ 2022-11-10 18:59 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: acme, davem, dccp, dsahern, edumazet, kuba, kuni1840, martin.lau,
	netdev, pabeni, william.xuanziyang, yoshfuji

 hOn Sun, Nov 6, 2022 at 5:15 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Joanne Koong <joannelkoong@gmail.com>
> Date:   Sun, 6 Nov 2022 11:18:44 -0800
> > On Thu, Nov 3, 2022 at 10:24 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > When connect() is called on a socket bound to the wildcard address,
> > > we change the socket's saddr to a local address.  If the socket
> > > fails to connect() to the destination, we have to reset the saddr.
> > >
> > > However, when an error occurs after inet_hash6?_connect() in
> > > (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> > > the socket bound to the address.
> > >
> > > From the user's point of view, whether saddr is reset or not varies
> > > with errno.  Let's fix this inconsistent behaviour.
> > >
> > > Note that with this patch, the repro [0] will trigger the WARN_ON()
> > > in inet_csk_get_port() again, but this patch is not buggy and rather
> > > fixes a bug papering over the bhash2's bug [1] for which we need
> > > another fix.
> > >
> > > For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> > > by this sequence:
> > >
> > >   s1 = socket()
> > >   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > >   s1.bind(('127.0.0.1', 10000))
> > >   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
> > >   # or s1.connect(('127.0.0.1', 10000))
> > >
> > >   s2 = socket()
> > >   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > >   s2.bind(('0.0.0.0', 10000))
> > >   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
> > >
> > >   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
> > >
> > > [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> > > [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
> > >
> > > Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> > > Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > >  net/dccp/ipv4.c     | 2 ++
> > >  net/dccp/ipv6.c     | 2 ++
> > >  net/ipv4/tcp_ipv4.c | 2 ++
> > >  net/ipv6/tcp_ipv6.c | 2 ++
> > >  4 files changed, 8 insertions(+)
> > >
> > > diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> > > index 713b7b8dad7e..40640c26680e 100644
> > > --- a/net/dccp/ipv4.c
> > > +++ b/net/dccp/ipv4.c
> > > @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > >          * This unhashes the socket and releases the local port, if necessary.
> > >          */
> > >         dccp_set_state(sk, DCCP_CLOSED);
> > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > +               inet_reset_saddr(sk);
> > >         ip_rt_put(rt);
> > >         sk->sk_route_caps = 0;
> > >         inet->inet_dport = 0;
> > > diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> > > index e57b43006074..626166cb6d7e 100644
> > > --- a/net/dccp/ipv6.c
> > > +++ b/net/dccp/ipv6.c
> > > @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > >
> > >  late_failure:
> > >         dccp_set_state(sk, DCCP_CLOSED);
> > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > +               inet_reset_saddr(sk);
> > >         __sk_dst_reset(sk);
> > >  failure:
> > >         inet->inet_dport = 0;
> > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > index 87d440f47a70..6a3a732b584d 100644
> > > --- a/net/ipv4/tcp_ipv4.c
> > > +++ b/net/ipv4/tcp_ipv4.c
> > > @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > >          * if necessary.
> > >          */
> > >         tcp_set_state(sk, TCP_CLOSE);
> > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > +               inet_reset_saddr(sk);
> > >         ip_rt_put(rt);
> > >         sk->sk_route_caps = 0;
> > >         inet->inet_dport = 0;
> > > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > > index 2a3f9296df1e..81b396e5cf79 100644
> > > --- a/net/ipv6/tcp_ipv6.c
> > > +++ b/net/ipv6/tcp_ipv6.c
> > > @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > >
> > >  late_failure:
> > >         tcp_set_state(sk, TCP_CLOSE);
> > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > +               inet_reset_saddr(sk);
> > >  failure:
> > >         inet->inet_dport = 0;
> > >         sk->sk_route_caps = 0;
> > > --
> > > 2.30.2
> > >
> >
> > inet_reset_saddr() sets both inet_saddr and inet_rcv_saddr to 0, but I
> > think there are some edge cases where when dccp/tcp_v4/6_connect() is
> > called, inet_saddr is 0 but inet_rcv_saddr is not, which means we'd
> > need to reset inet_rcv_saddr to its original value. The example case
> > I'm looking at is  __inet_bind() where if the request is to bind to a
> > multicast address,
> >
> >     inet->inet_rcv_saddr = inet->inet_saddr = addr->sin_addr.s_addr;
> >     if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
> >         inet->inet_saddr = 0;  /* Use device */
>
> Thanks for reviewing.
>
> We have to take care of these two error paths.
>
>   * (dccp|tcp)_v[46]_coonnect()
>   * __inet_stream_connect()
>
> In __inet_stream_connect(), we call ->disconnect(), which already has the
> same logic in this patch.
>
> In your edge case, once (dccp|tcp)_v[46]_coonnect() succeeds, both of
> inet->inet_saddr and sk_rcv_saddr are non-zero.  If connect() fails after
> that, in ->disconnect(), we cannot know if we should restore sk_rcv_saddr
> only.  Also, we don't have the previous address there.
>
> For these reasons, we reset both addresses only if the sk was bound to
> INADDR_ANY, which we can detect by the SOCK_BINDADDR_LOCK flag.
>
> As you mentinoed, we can restore sk_rcv_saddr for the edge case in
> (dccp|tcp)_v[46]_coonnect() but cannot in __inet_stream_connect().
>
> If we do so, we need another flag for the case and another member to save
> the old multicast/broadcast address. (+ where we need rehash for bhash2)
>
> What do you think ?

Sorry for the late reply Kuniyuki, I didn't see your email in my inbox
until Paolo resurfaced it.

I think this error case (eg multicast address) will be very rare +
have minimal side-effects, and isn't worth accounting for in
__inet_stream_connect(). In (dccp|tcp)_v[46]_coonnect(), it seems
simple to address because we already track the previous sk address in
the "prev_sk_rcv_saddr" variable.

ALso, as a side-note, I think we'll need an additional patch to fix an
existing bug that's related to resetting saddrs, where we'll need to
first check whether there's a bind conflict in the bhash/bhash2 table
before we can reset its saddr back to 0
(https://lore.kernel.org/netdev/CAJnrk1Z1UFmJ2_-7G6sdNHYy0jfjbJjWiCmAzqtLN9dkJ_g+vA@mail.gmail.com/
has some more info). I don't think that issue blocks this patch
though.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-10 18:59     ` Joanne Koong
@ 2022-11-10 20:34       ` Kuniyuki Iwashima
  2022-11-10 21:42         ` Joanne Koong
  0 siblings, 1 reply; 7+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-10 20:34 UTC (permalink / raw)
  To: joannelkoong
  Cc: acme, davem, dccp, dsahern, edumazet, kuba, kuni1840, kuniyu,
	martin.lau, netdev, pabeni, william.xuanziyang, yoshfuji

From:   Joanne Koong <joannelkoong@gmail.com>
Date:   Thu, 10 Nov 2022 10:59:43 -0800
>  hOn Sun, Nov 6, 2022 at 5:15 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From:   Joanne Koong <joannelkoong@gmail.com>
> > Date:   Sun, 6 Nov 2022 11:18:44 -0800
> > > On Thu, Nov 3, 2022 at 10:24 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > When connect() is called on a socket bound to the wildcard address,
> > > > we change the socket's saddr to a local address.  If the socket
> > > > fails to connect() to the destination, we have to reset the saddr.
> > > >
> > > > However, when an error occurs after inet_hash6?_connect() in
> > > > (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> > > > the socket bound to the address.
> > > >
> > > > From the user's point of view, whether saddr is reset or not varies
> > > > with errno.  Let's fix this inconsistent behaviour.
> > > >
> > > > Note that with this patch, the repro [0] will trigger the WARN_ON()
> > > > in inet_csk_get_port() again, but this patch is not buggy and rather
> > > > fixes a bug papering over the bhash2's bug [1] for which we need
> > > > another fix.
> > > >
> > > > For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> > > > by this sequence:
> > > >
> > > >   s1 = socket()
> > > >   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > > >   s1.bind(('127.0.0.1', 10000))
> > > >   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
> > > >   # or s1.connect(('127.0.0.1', 10000))
> > > >
> > > >   s2 = socket()
> > > >   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > > >   s2.bind(('0.0.0.0', 10000))
> > > >   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
> > > >
> > > >   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
> > > >
> > > > [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> > > > [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
> > > >
> > > > Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> > > > Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > ---
> > > >  net/dccp/ipv4.c     | 2 ++
> > > >  net/dccp/ipv6.c     | 2 ++
> > > >  net/ipv4/tcp_ipv4.c | 2 ++
> > > >  net/ipv6/tcp_ipv6.c | 2 ++
> > > >  4 files changed, 8 insertions(+)
> > > >
> > > > diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> > > > index 713b7b8dad7e..40640c26680e 100644
> > > > --- a/net/dccp/ipv4.c
> > > > +++ b/net/dccp/ipv4.c
> > > > @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > > >          * This unhashes the socket and releases the local port, if necessary.
> > > >          */
> > > >         dccp_set_state(sk, DCCP_CLOSED);
> > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > +               inet_reset_saddr(sk);
> > > >         ip_rt_put(rt);
> > > >         sk->sk_route_caps = 0;
> > > >         inet->inet_dport = 0;
> > > > diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> > > > index e57b43006074..626166cb6d7e 100644
> > > > --- a/net/dccp/ipv6.c
> > > > +++ b/net/dccp/ipv6.c
> > > > @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > > >
> > > >  late_failure:
> > > >         dccp_set_state(sk, DCCP_CLOSED);
> > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > +               inet_reset_saddr(sk);
> > > >         __sk_dst_reset(sk);
> > > >  failure:
> > > >         inet->inet_dport = 0;
> > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > index 87d440f47a70..6a3a732b584d 100644
> > > > --- a/net/ipv4/tcp_ipv4.c
> > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > > >          * if necessary.
> > > >          */
> > > >         tcp_set_state(sk, TCP_CLOSE);
> > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > +               inet_reset_saddr(sk);
> > > >         ip_rt_put(rt);
> > > >         sk->sk_route_caps = 0;
> > > >         inet->inet_dport = 0;
> > > > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > > > index 2a3f9296df1e..81b396e5cf79 100644
> > > > --- a/net/ipv6/tcp_ipv6.c
> > > > +++ b/net/ipv6/tcp_ipv6.c
> > > > @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > > >
> > > >  late_failure:
> > > >         tcp_set_state(sk, TCP_CLOSE);
> > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > +               inet_reset_saddr(sk);
> > > >  failure:
> > > >         inet->inet_dport = 0;
> > > >         sk->sk_route_caps = 0;
> > > > --
> > > > 2.30.2
> > > >
> > >
> > > inet_reset_saddr() sets both inet_saddr and inet_rcv_saddr to 0, but I
> > > think there are some edge cases where when dccp/tcp_v4/6_connect() is
> > > called, inet_saddr is 0 but inet_rcv_saddr is not, which means we'd
> > > need to reset inet_rcv_saddr to its original value. The example case
> > > I'm looking at is  __inet_bind() where if the request is to bind to a
> > > multicast address,
> > >
> > >     inet->inet_rcv_saddr = inet->inet_saddr = addr->sin_addr.s_addr;
> > >     if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
> > >         inet->inet_saddr = 0;  /* Use device */
> >
> > Thanks for reviewing.
> >
> > We have to take care of these two error paths.
> >
> >   * (dccp|tcp)_v[46]_coonnect()
> >   * __inet_stream_connect()
> >
> > In __inet_stream_connect(), we call ->disconnect(), which already has the
> > same logic in this patch.
> >
> > In your edge case, once (dccp|tcp)_v[46]_coonnect() succeeds, both of
> > inet->inet_saddr and sk_rcv_saddr are non-zero.  If connect() fails after
> > that, in ->disconnect(), we cannot know if we should restore sk_rcv_saddr
> > only.  Also, we don't have the previous address there.
> >
> > For these reasons, we reset both addresses only if the sk was bound to
> > INADDR_ANY, which we can detect by the SOCK_BINDADDR_LOCK flag.
> >
> > As you mentinoed, we can restore sk_rcv_saddr for the edge case in
> > (dccp|tcp)_v[46]_coonnect() but cannot in __inet_stream_connect().
> >
> > If we do so, we need another flag for the case and another member to save
> > the old multicast/broadcast address. (+ where we need rehash for bhash2)
> >
> > What do you think ?
> 
> Sorry for the late reply Kuniyuki, I didn't see your email in my inbox
> until Paolo resurfaced it.
> 
> I think this error case (eg multicast address) will be very rare +
> have minimal side-effects, and isn't worth accounting for in
> __inet_stream_connect().

Agreed.


> In (dccp|tcp)_v[46]_coonnect(), it seems
> simple to address because we already track the previous sk address in
> the "prev_sk_rcv_saddr" variable.
> 
> ALso, as a side-note, I think we'll need an additional patch to fix an
> existing bug that's related to resetting saddrs, where we'll need to
> first check whether there's a bind conflict in the bhash/bhash2 table
> before we can reset its saddr back to 0
> (https://lore.kernel.org/netdev/CAJnrk1Z1UFmJ2_-7G6sdNHYy0jfjbJjWiCmAzqtLN9dkJ_g+vA@mail.gmail.com/
> has some more info). I don't think that issue blocks this patch
> though.

For better issue overview, I'll repost this patch with another patch
to fix bhash2 bucket when resetting saddr.  After that, let's fix
another issue for checking conflict before resetting sk_rcv_saddr.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
  2022-11-10 20:34       ` Kuniyuki Iwashima
@ 2022-11-10 21:42         ` Joanne Koong
  0 siblings, 0 replies; 7+ messages in thread
From: Joanne Koong @ 2022-11-10 21:42 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: acme, davem, dccp, dsahern, edumazet, kuba, kuni1840, martin.lau,
	netdev, pabeni, william.xuanziyang, yoshfuji

On Thu, Nov 10, 2022 at 12:34 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Joanne Koong <joannelkoong@gmail.com>
> Date:   Thu, 10 Nov 2022 10:59:43 -0800
> >  hOn Sun, Nov 6, 2022 at 5:15 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > From:   Joanne Koong <joannelkoong@gmail.com>
> > > Date:   Sun, 6 Nov 2022 11:18:44 -0800
> > > > On Thu, Nov 3, 2022 at 10:24 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > > >
> > > > > When connect() is called on a socket bound to the wildcard address,
> > > > > we change the socket's saddr to a local address.  If the socket
> > > > > fails to connect() to the destination, we have to reset the saddr.
> > > > >
> > > > > However, when an error occurs after inet_hash6?_connect() in
> > > > > (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
> > > > > the socket bound to the address.
> > > > >
> > > > > From the user's point of view, whether saddr is reset or not varies
> > > > > with errno.  Let's fix this inconsistent behaviour.
> > > > >
> > > > > Note that with this patch, the repro [0] will trigger the WARN_ON()
> > > > > in inet_csk_get_port() again, but this patch is not buggy and rather
> > > > > fixes a bug papering over the bhash2's bug [1] for which we need
> > > > > another fix.
> > > > >
> > > > > For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
> > > > > by this sequence:
> > > > >
> > > > >   s1 = socket()
> > > > >   s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > > > >   s1.bind(('127.0.0.1', 10000))
> > > > >   s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000)))
> > > > >   # or s1.connect(('127.0.0.1', 10000))
> > > > >
> > > > >   s2 = socket()
> > > > >   s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
> > > > >   s2.bind(('0.0.0.0', 10000))
> > > > >   s2.connect(('127.0.0.1', 10000))  # -EADDRNOTAVAIL
> > > > >
> > > > >   s2.listen(32)  # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
> > > > >
> > > > > [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09
> > > > > [1]: https://lore.kernel.org/netdev/20221029001249.86337-1-kuniyu@amazon.com/
> > > > >
> > > > > Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
> > > > > Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
> > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > > ---
> > > > >  net/dccp/ipv4.c     | 2 ++
> > > > >  net/dccp/ipv6.c     | 2 ++
> > > > >  net/ipv4/tcp_ipv4.c | 2 ++
> > > > >  net/ipv6/tcp_ipv6.c | 2 ++
> > > > >  4 files changed, 8 insertions(+)
> > > > >
> > > > > diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> > > > > index 713b7b8dad7e..40640c26680e 100644
> > > > > --- a/net/dccp/ipv4.c
> > > > > +++ b/net/dccp/ipv4.c
> > > > > @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > > > >          * This unhashes the socket and releases the local port, if necessary.
> > > > >          */
> > > > >         dccp_set_state(sk, DCCP_CLOSED);
> > > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > > +               inet_reset_saddr(sk);
> > > > >         ip_rt_put(rt);
> > > > >         sk->sk_route_caps = 0;
> > > > >         inet->inet_dport = 0;
> > > > > diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> > > > > index e57b43006074..626166cb6d7e 100644
> > > > > --- a/net/dccp/ipv6.c
> > > > > +++ b/net/dccp/ipv6.c
> > > > > @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > > > >
> > > > >  late_failure:
> > > > >         dccp_set_state(sk, DCCP_CLOSED);
> > > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > > +               inet_reset_saddr(sk);
> > > > >         __sk_dst_reset(sk);
> > > > >  failure:
> > > > >         inet->inet_dport = 0;
> > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > > index 87d440f47a70..6a3a732b584d 100644
> > > > > --- a/net/ipv4/tcp_ipv4.c
> > > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > > @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> > > > >          * if necessary.
> > > > >          */
> > > > >         tcp_set_state(sk, TCP_CLOSE);
> > > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > > +               inet_reset_saddr(sk);
> > > > >         ip_rt_put(rt);
> > > > >         sk->sk_route_caps = 0;
> > > > >         inet->inet_dport = 0;
> > > > > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > > > > index 2a3f9296df1e..81b396e5cf79 100644
> > > > > --- a/net/ipv6/tcp_ipv6.c
> > > > > +++ b/net/ipv6/tcp_ipv6.c
> > > > > @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > > > >
> > > > >  late_failure:
> > > > >         tcp_set_state(sk, TCP_CLOSE);
> > > > > +       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> > > > > +               inet_reset_saddr(sk);
> > > > >  failure:
> > > > >         inet->inet_dport = 0;
> > > > >         sk->sk_route_caps = 0;
> > > > > --
> > > > > 2.30.2
> > > > >
> > > >
> > > > inet_reset_saddr() sets both inet_saddr and inet_rcv_saddr to 0, but I
> > > > think there are some edge cases where when dccp/tcp_v4/6_connect() is
> > > > called, inet_saddr is 0 but inet_rcv_saddr is not, which means we'd
> > > > need to reset inet_rcv_saddr to its original value. The example case
> > > > I'm looking at is  __inet_bind() where if the request is to bind to a
> > > > multicast address,
> > > >
> > > >     inet->inet_rcv_saddr = inet->inet_saddr = addr->sin_addr.s_addr;
> > > >     if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
> > > >         inet->inet_saddr = 0;  /* Use device */
> > >
> > > Thanks for reviewing.
> > >
> > > We have to take care of these two error paths.
> > >
> > >   * (dccp|tcp)_v[46]_coonnect()
> > >   * __inet_stream_connect()
> > >
> > > In __inet_stream_connect(), we call ->disconnect(), which already has the
> > > same logic in this patch.
> > >
> > > In your edge case, once (dccp|tcp)_v[46]_coonnect() succeeds, both of
> > > inet->inet_saddr and sk_rcv_saddr are non-zero.  If connect() fails after
> > > that, in ->disconnect(), we cannot know if we should restore sk_rcv_saddr
> > > only.  Also, we don't have the previous address there.
> > >
> > > For these reasons, we reset both addresses only if the sk was bound to
> > > INADDR_ANY, which we can detect by the SOCK_BINDADDR_LOCK flag.
> > >
> > > As you mentinoed, we can restore sk_rcv_saddr for the edge case in
> > > (dccp|tcp)_v[46]_coonnect() but cannot in __inet_stream_connect().
> > >
> > > If we do so, we need another flag for the case and another member to save
> > > the old multicast/broadcast address. (+ where we need rehash for bhash2)
> > >
> > > What do you think ?
> >
> > Sorry for the late reply Kuniyuki, I didn't see your email in my inbox
> > until Paolo resurfaced it.
> >
> > I think this error case (eg multicast address) will be very rare +
> > have minimal side-effects, and isn't worth accounting for in
> > __inet_stream_connect().
>
> Agreed.
>
>
> > In (dccp|tcp)_v[46]_coonnect(), it seems
> > simple to address because we already track the previous sk address in
> > the "prev_sk_rcv_saddr" variable.
> >
> > ALso, as a side-note, I think we'll need an additional patch to fix an
> > existing bug that's related to resetting saddrs, where we'll need to
> > first check whether there's a bind conflict in the bhash/bhash2 table
> > before we can reset its saddr back to 0
> > (https://lore.kernel.org/netdev/CAJnrk1Z1UFmJ2_-7G6sdNHYy0jfjbJjWiCmAzqtLN9dkJ_g+vA@mail.gmail.com/
> > has some more info). I don't think that issue blocks this patch
> > though.
>
> For better issue overview, I'll repost this patch with another patch
> to fix bhash2 bucket when resetting saddr.  After that, let's fix
> another issue for checking conflict before resetting sk_rcv_saddr.

Sounds great!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-11-10 21:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-03 17:24 [PATCH v1 net] dccp/tcp: Reset saddr on failure after inet6?_hash_connect() Kuniyuki Iwashima
2022-11-06 19:18 ` Joanne Koong
2022-11-07  1:14   ` Kuniyuki Iwashima
2022-11-10 18:59     ` Joanne Koong
2022-11-10 20:34       ` Kuniyuki Iwashima
2022-11-10 21:42         ` Joanne Koong
2022-11-10  9:19 ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).