* EINVAL when using connect() for udp sockets
@ 2017-03-09 3:10 Daurnimator
2017-03-23 2:22 ` Daurnimator
0 siblings, 1 reply; 17+ messages in thread
From: Daurnimator @ 2017-03-09 3:10 UTC (permalink / raw)
To: netdev; +Cc: William Ahern, santitm99
When debugging https://github.com/daurnimator/lua-http/issues/73 which
uses https://github.com/wahern/dns we ran into an issue where modern
linux kernels return EINVAL if you try and re-use a udp socket.
The issue seems to occur if you go from a local destination ip to a
non-local one.
>From connect(2) man page:
> connectionless protocol sockets may use connect() multiple times to change their association
(the following content is also available at
https://gist.github.com/daurnimator/6765345776e87a3830ed101d1d983ee1)
$ cat connect-bug-stage2.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <errno.h>
int main() {
int fd = socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP);
if (fd == -1)
exit(1);
if (bind(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(57997),
.sin_addr=inet_addr("0.0.0.0")}, 16))
exit(2);
if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("127.0.0.2")}, 16))
exit(3);
if (-1 == sendto(fd, "test", 4, 0, NULL, 0))
exit(4);
char buf[200];
if (-1 != recvfrom(fd, buf, 200, 0, 0, 0) && errno != ECONNREFUSED)
exit(5);
/* okay, try next server... */
if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("8.8.8.8")}, 16))
exit(6);
exit(0);
}
$ gcc connect-bug-stage2.c
$ strace ./a.out
execve("./a.out", ["./a.out"], [/* 56 vars */]) = 0
brk(NULL) = 0x860000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=215350, ...}) = 0
mmap(NULL, 215350, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc978791000
close(3) = 0
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\3\2\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1951744, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7fc97878f000
mmap(NULL, 3791152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7fc978206000
mprotect(0x7fc97839b000, 2093056, PROT_NONE) = 0
mmap(0x7fc97859a000, 24576, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x194000) = 0x7fc97859a000
mmap(0x7fc9785a0000, 14640, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc9785a0000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7fc978790400) = 0
mprotect(0x7fc97859a000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ) = 0
mprotect(0x7fc9787c6000, 4096, PROT_READ) = 0
munmap(0x7fc978791000, 215350) = 0
socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(57997),
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("127.0.0.2")}, 16) = 0
sendto(3, "test", 4, 0, NULL, 0) = 4
recvfrom(3, 0x7ffcc2c9a3a0, 200, 0, NULL, NULL) = -1 ECONNREFUSED
(Connection refused)
connect(3, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("8.8.8.8")}, 16) = -1 EINVAL (Invalid argument)
exit_group(6) = ?
+++ exited with 6 +++
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-09 3:10 EINVAL when using connect() for udp sockets Daurnimator
@ 2017-03-23 2:22 ` Daurnimator
2017-03-23 5:18 ` Eric Dumazet
0 siblings, 1 reply; 17+ messages in thread
From: Daurnimator @ 2017-03-23 2:22 UTC (permalink / raw)
To: netdev; +Cc: William Ahern, Santi T.
On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
> When debugging https://github.com/daurnimator/lua-http/issues/73 which
> uses https://github.com/wahern/dns we ran into an issue where modern
> linux kernels return EINVAL if you try and re-use a udp socket.
> The issue seems to occur if you go from a local destination ip to a
> non-local one.
Did anyone get a chance to look into this issue?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-23 2:22 ` Daurnimator
@ 2017-03-23 5:18 ` Eric Dumazet
2017-03-24 22:34 ` Cong Wang
0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2017-03-23 5:18 UTC (permalink / raw)
To: Daurnimator; +Cc: netdev, William Ahern, Santi T.
On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
> On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
> > uses https://github.com/wahern/dns we ran into an issue where modern
> > linux kernels return EINVAL if you try and re-use a udp socket.
> > The issue seems to occur if you go from a local destination ip to a
> > non-local one.
>
> Did anyone get a chance to look into this issue?
I believe man page is not complete.
A disconnect is needed before another connect()
eg :
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <errno.h>
int main() {
int fd = socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP);
if (fd == -1)
exit(1);
if (bind(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(57997),
.sin_addr=inet_addr("0.0.0.0")}, 16))
exit(2);
if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("127.0.0.2")}, 16))
exit(3);
if (-1 == sendto(fd, "test", 4, 0, NULL, 0))
exit(4);
char buf[200];
if (-1 != recvfrom(fd, buf, 200, 0, 0, 0) && errno != ECONNREFUSED)
exit(5);
/* okay, try next server... */
/* disconnect */
if (connect(fd, (struct sockaddr*)&(struct sockaddr_in){.sin_family=AF_UNSPEC}, 16))
exit(6);
if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("8.8.8.8")}, 16))
exit(7);
exit(0);
}
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-23 5:18 ` Eric Dumazet
@ 2017-03-24 22:34 ` Cong Wang
2017-03-24 23:19 ` Eric Dumazet
0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2017-03-24 22:34 UTC (permalink / raw)
To: Eric Dumazet
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
mtk.manpages
(Cc'ing Michael Kerrisk)
On Wed, Mar 22, 2017 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
>> On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
>> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
>> > uses https://github.com/wahern/dns we ran into an issue where modern
>> > linux kernels return EINVAL if you try and re-use a udp socket.
>> > The issue seems to occur if you go from a local destination ip to a
>> > non-local one.
>>
>> Did anyone get a chance to look into this issue?
>
> I believe man page is not complete.
>
> A disconnect is needed before another connect()
Is it? Making connect() reentrant is reasonable for connection-less
protocol like UDP, but I don't dig POSIX for the details. If so we need
something like below...
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
sockaddr *uaddr, int addr_len
sk_dst_reset(sk);
oif = sk->sk_bound_dev_if;
- saddr = inet->inet_saddr;
+ saddr = inet->inet_saddr = 0;
if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
if (!oif)
oif = inet->mc_index;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-24 22:34 ` Cong Wang
@ 2017-03-24 23:19 ` Eric Dumazet
2017-03-28 4:22 ` Cong Wang
0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2017-03-24 23:19 UTC (permalink / raw)
To: Cong Wang
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
mtk.manpages
On Fri, 2017-03-24 at 15:34 -0700, Cong Wang wrote:
> (Cc'ing Michael Kerrisk)
>
> On Wed, Mar 22, 2017 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
> >> On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
> >> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
> >> > uses https://github.com/wahern/dns we ran into an issue where modern
> >> > linux kernels return EINVAL if you try and re-use a udp socket.
> >> > The issue seems to occur if you go from a local destination ip to a
> >> > non-local one.
> >>
> >> Did anyone get a chance to look into this issue?
> >
> > I believe man page is not complete.
> >
> > A disconnect is needed before another connect()
>
> Is it? Making connect() reentrant is reasonable for connection-less
> protocol like UDP, but I don't dig POSIX for the details. If so we need
> something like below...
>
> --- a/net/ipv4/datagram.c
> +++ b/net/ipv4/datagram.c
> @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
> sockaddr *uaddr, int addr_len
> sk_dst_reset(sk);
>
> oif = sk->sk_bound_dev_if;
> - saddr = inet->inet_saddr;
> + saddr = inet->inet_saddr = 0;
> if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
> if (!oif)
> oif = inet->mc_index;
Wont this break bind() ?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-24 23:19 ` Eric Dumazet
@ 2017-03-28 4:22 ` Cong Wang
2017-03-28 23:11 ` Eric Dumazet
0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2017-03-28 4:22 UTC (permalink / raw)
To: Eric Dumazet
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Fri, Mar 24, 2017 at 4:19 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-03-24 at 15:34 -0700, Cong Wang wrote:
>> (Cc'ing Michael Kerrisk)
>>
>> On Wed, Mar 22, 2017 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
>> >> On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
>> >> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
>> >> > uses https://github.com/wahern/dns we ran into an issue where modern
>> >> > linux kernels return EINVAL if you try and re-use a udp socket.
>> >> > The issue seems to occur if you go from a local destination ip to a
>> >> > non-local one.
>> >>
>> >> Did anyone get a chance to look into this issue?
>> >
>> > I believe man page is not complete.
>> >
>> > A disconnect is needed before another connect()
>>
>> Is it? Making connect() reentrant is reasonable for connection-less
>> protocol like UDP, but I don't dig POSIX for the details. If so we need
>> something like below...
>>
>> --- a/net/ipv4/datagram.c
>> +++ b/net/ipv4/datagram.c
>> @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
>> sockaddr *uaddr, int addr_len
>> sk_dst_reset(sk);
>>
>> oif = sk->sk_bound_dev_if;
>> - saddr = inet->inet_saddr;
>> + saddr = inet->inet_saddr = 0;
>> if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
>> if (!oif)
>> oif = inet->mc_index;
>
> Wont this break bind() ?
>
Right. We need to distinguish bind() and connect(), something
like below?
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -26,7 +26,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
sockaddr *uaddr, int addr_len
struct sockaddr_in *usin = (struct sockaddr_in *) uaddr;
struct flowi4 *fl4;
struct rtable *rt;
- __be32 saddr;
+ __be32 saddr = 0;
int oif;
int err;
@@ -40,7 +40,8 @@ int __ip4_datagram_connect(struct sock *sk, struct
sockaddr *uaddr, int addr_len
sk_dst_reset(sk);
oif = sk->sk_bound_dev_if;
- saddr = inet->inet_saddr;
+ if (sk->sk_userlocks & SOCK_BINDADDR_LOCK)
+ saddr = inet->inet_saddr;
if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
if (!oif)
oif = inet->mc_index;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-28 4:22 ` Cong Wang
@ 2017-03-28 23:11 ` Eric Dumazet
2017-03-28 23:44 ` Daurnimator
2017-03-29 0:52 ` Eric Dumazet
0 siblings, 2 replies; 17+ messages in thread
From: Eric Dumazet @ 2017-03-28 23:11 UTC (permalink / raw)
To: Cong Wang
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Mon, 2017-03-27 at 21:22 -0700, Cong Wang wrote:
> On Fri, Mar 24, 2017 at 4:19 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Fri, 2017-03-24 at 15:34 -0700, Cong Wang wrote:
> >> (Cc'ing Michael Kerrisk)
> >>
> >> On Wed, Mar 22, 2017 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> > On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
> >> >> On 9 March 2017 at 14:10, Daurnimator <quae@daurnimator.com> wrote:
> >> >> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
> >> >> > uses https://github.com/wahern/dns we ran into an issue where modern
> >> >> > linux kernels return EINVAL if you try and re-use a udp socket.
> >> >> > The issue seems to occur if you go from a local destination ip to a
> >> >> > non-local one.
> >> >>
> >> >> Did anyone get a chance to look into this issue?
> >> >
> >> > I believe man page is not complete.
> >> >
> >> > A disconnect is needed before another connect()
> >>
> >> Is it? Making connect() reentrant is reasonable for connection-less
> >> protocol like UDP, but I don't dig POSIX for the details. If so we need
> >> something like below...
> >>
> >> --- a/net/ipv4/datagram.c
> >> +++ b/net/ipv4/datagram.c
> >> @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
> >> sockaddr *uaddr, int addr_len
> >> sk_dst_reset(sk);
> >>
> >> oif = sk->sk_bound_dev_if;
> >> - saddr = inet->inet_saddr;
> >> + saddr = inet->inet_saddr = 0;
> >> if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
> >> if (!oif)
> >> oif = inet->mc_index;
> >
> > Wont this break bind() ?
> >
>
> Right. We need to distinguish bind() and connect(), something
> like below?
>
> --- a/net/ipv4/datagram.c
> +++ b/net/ipv4/datagram.c
> @@ -26,7 +26,7 @@ int __ip4_datagram_connect(struct sock *sk, struct
> sockaddr *uaddr, int addr_len
> struct sockaddr_in *usin = (struct sockaddr_in *) uaddr;
> struct flowi4 *fl4;
> struct rtable *rt;
> - __be32 saddr;
> + __be32 saddr = 0;
> int oif;
> int err;
>
> @@ -40,7 +40,8 @@ int __ip4_datagram_connect(struct sock *sk, struct
> sockaddr *uaddr, int addr_len
> sk_dst_reset(sk);
>
> oif = sk->sk_bound_dev_if;
> - saddr = inet->inet_saddr;
> + if (sk->sk_userlocks & SOCK_BINDADDR_LOCK)
> + saddr = inet->inet_saddr;
> if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
> if (!oif)
> oif = inet->mc_index;
Yes, this looks better.
Although you probably need to change a bit later this part :
if (!inet->inet_saddr)
inet->inet_saddr = fl4->saddr; /* Update source address */
Should we also fix IPv6 or is this bug only about IPv4 ?
Thanks !
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-28 23:11 ` Eric Dumazet
@ 2017-03-28 23:44 ` Daurnimator
2017-03-28 23:45 ` Daurnimator
2017-03-29 0:52 ` Eric Dumazet
1 sibling, 1 reply; 17+ messages in thread
From: Daurnimator @ 2017-03-28 23:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On 29 March 2017 at 10:11, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Should we also fix IPv6 or is this bug only about IPv4 ?
In IPv6 the second connect() returns ENETUNREACH (rather than failing
yet returning 0 as it does in IPv4).
This should probably incorrect behaviour and should the operation
should succeed.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-28 23:44 ` Daurnimator
@ 2017-03-28 23:45 ` Daurnimator
0 siblings, 0 replies; 17+ messages in thread
From: Daurnimator @ 2017-03-28 23:45 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On 29 March 2017 at 10:44, Daurnimator <quae@daurnimator.com> wrote:
> This should probably incorrect behaviour and should the operation
> should succeed.
Uh, not sure why there are so many "should"s in that message.
This is probably incorrect behaviour and the operation should succeed.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-28 23:11 ` Eric Dumazet
2017-03-28 23:44 ` Daurnimator
@ 2017-03-29 0:52 ` Eric Dumazet
2017-03-30 23:36 ` Cong Wang
1 sibling, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2017-03-29 0:52 UTC (permalink / raw)
To: Cong Wang
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Tue, 2017-03-28 at 16:11 -0700, Eric Dumazet wrote:
> Yes, this looks better.
>
> Although you probably need to change a bit later this part :
>
> if (!inet->inet_saddr)
> inet->inet_saddr = fl4->saddr; /* Update source address */
>
I came up with the following tested patch for IPv4
diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index f915abff1350a86af8d5bb89725b751c061b0fb5..1454b6191e0d38ffae0ae260578858285bc5f77b 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
sk_dst_reset(sk);
oif = sk->sk_bound_dev_if;
- saddr = inet->inet_saddr;
+ saddr = (sk->sk_userlocks & SOCK_BINDADDR_LOCK) ? inet->inet_saddr : 0;
if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
if (!oif)
oif = inet->mc_index;
@@ -64,9 +64,8 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
err = -EACCES;
goto out;
}
- if (!inet->inet_saddr)
+ if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
inet->inet_saddr = fl4->saddr; /* Update source address */
- if (!inet->inet_rcv_saddr) {
inet->inet_rcv_saddr = fl4->saddr;
if (sk->sk_prot->rehash)
sk->sk_prot->rehash(sk);
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-29 0:52 ` Eric Dumazet
@ 2017-03-30 23:36 ` Cong Wang
2017-03-31 0:02 ` Eric Dumazet
0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2017-03-30 23:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Tue, Mar 28, 2017 at 5:52 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-03-28 at 16:11 -0700, Eric Dumazet wrote:
>
>> Yes, this looks better.
>>
>> Although you probably need to change a bit later this part :
>>
>> if (!inet->inet_saddr)
>> inet->inet_saddr = fl4->saddr; /* Update source address */
>>
>
> I came up with the following tested patch for IPv4
>
> diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> index f915abff1350a86af8d5bb89725b751c061b0fb5..1454b6191e0d38ffae0ae260578858285bc5f77b 100644
> --- a/net/ipv4/datagram.c
> +++ b/net/ipv4/datagram.c
> @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> sk_dst_reset(sk);
>
> oif = sk->sk_bound_dev_if;
> - saddr = inet->inet_saddr;
> + saddr = (sk->sk_userlocks & SOCK_BINDADDR_LOCK) ? inet->inet_saddr : 0;
> if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
> if (!oif)
> oif = inet->mc_index;
> @@ -64,9 +64,8 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> err = -EACCES;
> goto out;
> }
> - if (!inet->inet_saddr)
> + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
> inet->inet_saddr = fl4->saddr; /* Update source address */
> - if (!inet->inet_rcv_saddr) {
> inet->inet_rcv_saddr = fl4->saddr;
> if (sk->sk_prot->rehash)
> sk->sk_prot->rehash(sk);
Why do we need this here? If you mean bind() INADDR_ANY is bound,
then it is totally a different problem?
BTW, I am still not sure about what POSIX says about the connect()
behavior, I can only find this [1]:
"
If the initiating socket is not connection-mode, then connect() shall set the
socket's peer address, and no connection is made. For SOCK_DGRAM
sockets, the peer address identifies where all datagrams are sent on
subsequent send() functions, and limits the remote sender for subsequent
recv() functions.
"
It doesn't say anything about source address. But the man page [2] says:
"
When
connect(2) is called on an unbound socket, the socket is
automatically bound to a random free port or to a usable shared port
with the local address set to INADDR_ANY.
"
Seems the last part is inaccurate, kernel actually picks a source address
from route instead of just using INADDR_ANY for connect(2).
So, for me, I think the following behaviors make sense for UDP:
1) When a bind() is called before connect()'s, aka:
bind();
connect(addr1); // should not change source addr
connect(addr2); // should fail is the source addr can not reach peer addr
2) No bind() before connect()'s, aka:
connect(addr1); // Free to bind a source addr
connect(addr2); // Free to bind a new source addr and change peer addr
Thoughts?
1. http://pubs.opengroup.org/onlinepubs/9699919799/functions/connect.html
2. http://man7.org/linux/man-pages/man7/ip.7.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-30 23:36 ` Cong Wang
@ 2017-03-31 0:02 ` Eric Dumazet
2017-03-31 16:52 ` Cong Wang
0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2017-03-31 0:02 UTC (permalink / raw)
To: Cong Wang
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Thu, 2017-03-30 at 16:36 -0700, Cong Wang wrote:
> On Tue, Mar 28, 2017 at 5:52 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2017-03-28 at 16:11 -0700, Eric Dumazet wrote:
> >
> >> Yes, this looks better.
> >>
> >> Although you probably need to change a bit later this part :
> >>
> >> if (!inet->inet_saddr)
> >> inet->inet_saddr = fl4->saddr; /* Update source address */
> >>
> >
> > I came up with the following tested patch for IPv4
> >
> > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> > index f915abff1350a86af8d5bb89725b751c061b0fb5..1454b6191e0d38ffae0ae260578858285bc5f77b 100644
> > --- a/net/ipv4/datagram.c
> > +++ b/net/ipv4/datagram.c
> > @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> > sk_dst_reset(sk);
> >
> > oif = sk->sk_bound_dev_if;
> > - saddr = inet->inet_saddr;
> > + saddr = (sk->sk_userlocks & SOCK_BINDADDR_LOCK) ? inet->inet_saddr : 0;
> > if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
> > if (!oif)
> > oif = inet->mc_index;
> > @@ -64,9 +64,8 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> > err = -EACCES;
> > goto out;
> > }
> > - if (!inet->inet_saddr)
> > + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
> > inet->inet_saddr = fl4->saddr; /* Update source address */
> > - if (!inet->inet_rcv_saddr) {
> > inet->inet_rcv_saddr = fl4->saddr;
> > if (sk->sk_prot->rehash)
> > sk->sk_prot->rehash(sk);
>
> Why do we need this here? If you mean bind() INADDR_ANY is bound,
> then it is totally a different problem?
Proper delivery of RX packets will need to find the socket, and this
needs the 2-tuple (source address, source port) info for UDP.
So after a connect(), we need to rehash
>
> BTW, I am still not sure about what POSIX says about the connect()
> behavior, I can only find this [1]:
>
> "
> If the initiating socket is not connection-mode, then connect() shall set the
> socket's peer address, and no connection is made. For SOCK_DGRAM
> sockets, the peer address identifies where all datagrams are sent on
> subsequent send() functions, and limits the remote sender for subsequent
> recv() functions.
> "
>
> It doesn't say anything about source address. But the man page [2] says:
>
> "
> When
> connect(2) is called on an unbound socket, the socket is
> automatically bound to a random free port or to a usable shared port
> with the local address set to INADDR_ANY.
> "
>
> Seems the last part is inaccurate, kernel actually picks a source address
> from route instead of just using INADDR_ANY for connect(2).
>
> So, for me, I think the following behaviors make sense for UDP:
>
> 1) When a bind() is called before connect()'s, aka:
>
> bind();
> connect(addr1); // should not change source addr
It depends. bind() can be only allocating the source port.
If bind(INADDR_ANY) was used, then we need to determine source addr at
connect() time.
Point of connect() is that future send() wont have to guess the 4-tuple
infos. But also that incoming packets will find this precise socket
thanks to a higher score.
And tools like "ss -aun" should display the 4-tuple after a successful
connect()
> connect(addr2); // should fail is the source addr can not reach peer addr
>
> 2) No bind() before connect()'s, aka:
>
> connect(addr1); // Free to bind a source addr
> connect(addr2); // Free to bind a new source addr and change peer addr
Exactly. My patch does this.
>
> Thoughts?
>
> 1. http://pubs.opengroup.org/onlinepubs/9699919799/functions/connect.html
> 2. http://man7.org/linux/man-pages/man7/ip.7.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-31 0:02 ` Eric Dumazet
@ 2017-03-31 16:52 ` Cong Wang
2017-03-31 17:00 ` Cong Wang
2017-04-28 2:55 ` Daurnimator
0 siblings, 2 replies; 17+ messages in thread
From: Cong Wang @ 2017-03-31 16:52 UTC (permalink / raw)
To: Eric Dumazet
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Thu, Mar 30, 2017 at 5:02 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-03-30 at 16:36 -0700, Cong Wang wrote:
>> On Tue, Mar 28, 2017 at 5:52 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Tue, 2017-03-28 at 16:11 -0700, Eric Dumazet wrote:
>> >
>> >> Yes, this looks better.
>> >>
>> >> Although you probably need to change a bit later this part :
>> >>
>> >> if (!inet->inet_saddr)
>> >> inet->inet_saddr = fl4->saddr; /* Update source address */
>> >>
>> >
>> > I came up with the following tested patch for IPv4
>> >
>> > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
>> > index f915abff1350a86af8d5bb89725b751c061b0fb5..1454b6191e0d38ffae0ae260578858285bc5f77b 100644
>> > --- a/net/ipv4/datagram.c
>> > +++ b/net/ipv4/datagram.c
>> > @@ -40,7 +40,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
>> > sk_dst_reset(sk);
>> >
>> > oif = sk->sk_bound_dev_if;
>> > - saddr = inet->inet_saddr;
>> > + saddr = (sk->sk_userlocks & SOCK_BINDADDR_LOCK) ? inet->inet_saddr : 0;
>> > if (ipv4_is_multicast(usin->sin_addr.s_addr)) {
>> > if (!oif)
>> > oif = inet->mc_index;
>> > @@ -64,9 +64,8 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
>> > err = -EACCES;
>> > goto out;
>> > }
>> > - if (!inet->inet_saddr)
>> > + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
>> > inet->inet_saddr = fl4->saddr; /* Update source address */
>> > - if (!inet->inet_rcv_saddr) {
>> > inet->inet_rcv_saddr = fl4->saddr;
>> > if (sk->sk_prot->rehash)
>> > sk->sk_prot->rehash(sk);
>>
>> Why do we need this here? If you mean bind() INADDR_ANY is bound,
>> then it is totally a different problem?
>
>
> Proper delivery of RX packets will need to find the socket, and this
> needs the 2-tuple (source address, source port) info for UDP.
>
> So after a connect(), we need to rehash
Oh, I didn't notice remove the if (!inet->inet_rcv_saddr) check...
[...]
>> 1) When a bind() is called before connect()'s, aka:
>>
>> bind();
>> connect(addr1); // should not change source addr
>
> It depends. bind() can be only allocating the source port.
>
> If bind(INADDR_ANY) was used, then we need to determine source addr at
> connect() time.
Yes, bind() only sets the flag for non-zero address:
if (inet->inet_rcv_saddr)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
Please submit your patch formally and with a man page patch too.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-31 16:52 ` Cong Wang
@ 2017-03-31 17:00 ` Cong Wang
2017-04-28 2:55 ` Daurnimator
1 sibling, 0 replies; 17+ messages in thread
From: Cong Wang @ 2017-03-31 17:00 UTC (permalink / raw)
To: Eric Dumazet
Cc: Daurnimator, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Fri, Mar 31, 2017 at 9:52 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> Please submit your patch formally and with a man page patch too.
BTW, and probably IPv6 too.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-03-31 16:52 ` Cong Wang
2017-03-31 17:00 ` Cong Wang
@ 2017-04-28 2:55 ` Daurnimator
2017-04-28 4:48 ` Eric Dumazet
1 sibling, 1 reply; 17+ messages in thread
From: Daurnimator @ 2017-04-28 2:55 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On 1 April 2017 at 03:52, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> Please submit your patch formally and with a man page patch too.
Did a patch get submitted? I had a look but couldn't see it.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-04-28 2:55 ` Daurnimator
@ 2017-04-28 4:48 ` Eric Dumazet
2017-09-04 13:23 ` Daurnimator
0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2017-04-28 4:48 UTC (permalink / raw)
To: Daurnimator
Cc: Cong Wang, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On Fri, 2017-04-28 at 12:55 +1000, Daurnimator wrote:
> On 1 April 2017 at 03:52, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > Please submit your patch formally and with a man page patch too.
>
> Did a patch get submitted? I had a look but couldn't see it.
Not yet.
I will probably wait for next cycle (linux-4.13), I have been busy on
other work lately.
Thanks.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: EINVAL when using connect() for udp sockets
2017-04-28 4:48 ` Eric Dumazet
@ 2017-09-04 13:23 ` Daurnimator
0 siblings, 0 replies; 17+ messages in thread
From: Daurnimator @ 2017-09-04 13:23 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Linux Kernel Network Developers, William Ahern,
Santi T.,
Michael Kerrisk-manpages
On 28 April 2017 at 14:48, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-04-28 at 12:55 +1000, Daurnimator wrote:
>> On 1 April 2017 at 03:52, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > Please submit your patch formally and with a man page patch too.
>>
>> Did a patch get submitted? I had a look but couldn't see it.
>
> Not yet.
>
> I will probably wait for next cycle (linux-4.13), I have been busy on
> other work lately.
Will you have time to work on this soon?
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2017-09-04 13:23 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-09 3:10 EINVAL when using connect() for udp sockets Daurnimator
2017-03-23 2:22 ` Daurnimator
2017-03-23 5:18 ` Eric Dumazet
2017-03-24 22:34 ` Cong Wang
2017-03-24 23:19 ` Eric Dumazet
2017-03-28 4:22 ` Cong Wang
2017-03-28 23:11 ` Eric Dumazet
2017-03-28 23:44 ` Daurnimator
2017-03-28 23:45 ` Daurnimator
2017-03-29 0:52 ` Eric Dumazet
2017-03-30 23:36 ` Cong Wang
2017-03-31 0:02 ` Eric Dumazet
2017-03-31 16:52 ` Cong Wang
2017-03-31 17:00 ` Cong Wang
2017-04-28 2:55 ` Daurnimator
2017-04-28 4:48 ` Eric Dumazet
2017-09-04 13:23 ` Daurnimator
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.