All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0
@ 2021-06-18 11:04 Toke Høiland-Jørgensen
  2021-06-18 11:04 ` [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses Toke Høiland-Jørgensen
  2021-06-18 19:20 ` [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-06-18 11:04 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Toke Høiland-Jørgensen, Hideaki YOSHIFUJI, David Ahern,
	netdev, Juliusz Chroboczek

When constructing ICMP response messages, the kernel will try to pick a
suitable source address for the outgoing packet. However, if no IPv4
addresses are configured on the system at all, this will fail and we end up
producing an ICMP message with a source address of 0.0.0.0. This can happen
on a box routing IPv4 traffic via v6 nexthops, for instance.

Since 0.0.0.0 is not generally routable on the internet, there's a good
chance that such ICMP messages will never make it back to the sender of the
original packet that the ICMP message was sent in response to. This, in
turn, can create connectivity and PMTUd problems for senders. Fortunately,
RFC7600 reserves a dummy address to be used as a source for ICMP
messages (192.0.0.8/32), so let's teach the kernel to substitute that
address as a last resort if the regular source address selection procedure
fails.

Below is a quick example reproducing this issue with network namespaces:

ip netns add ns0
ip l add type veth peer netns ns0
ip l set dev veth0 up
ip a add 10.0.0.1/24 dev veth0
ip a add fc00:dead:cafe:42::1/64 dev veth0
ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2
ip -n ns0 l set dev veth0 up
ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0
ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1
ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0
ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1
tcpdump -tpni veth0 -c 2 icmp &
ping -w 1 10.1.0.1 > /dev/null
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64
IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
2 packets captured
2 packets received by filter
0 packets dropped by kernel

With this patch the above capture changes to:
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64
IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Juliusz Chroboczek <jch@irif.fr>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/uapi/linux/in.h | 3 +++
 net/ipv4/icmp.c         | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 7d6687618d80..d1b327036ae4 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -289,6 +289,9 @@ struct sockaddr_in {
 /* Address indicating an error return. */
 #define	INADDR_NONE		((unsigned long int) 0xffffffff)
 
+/* Dummy address for src of ICMP replies if no real address is set (RFC7600). */
+#define	INADDR_DUMMY		((unsigned long int) 0xc0000008)
+
 /* Network number for local host loopback. */
 #define	IN_LOOPBACKNET		127
 
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 7b6931a4d775..752e392083e6 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -759,6 +759,13 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
 		icmp_param.data_len = room;
 	icmp_param.head_len = sizeof(struct icmphdr);
 
+	/* if we don't have a source address at this point, fall back to the
+	 * dummy address instead of sending out a packet with a source address
+	 * of 0.0.0.0
+	 */
+	if (!fl4.saddr)
+		fl4.saddr = htonl(INADDR_DUMMY);
+
 	icmp_push_reply(&icmp_param, &fl4, &ipc, &rt);
 ende:
 	ip_rt_put(rt);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses
  2021-06-18 11:04 [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 Toke Høiland-Jørgensen
@ 2021-06-18 11:04 ` Toke Høiland-Jørgensen
  2021-06-18 13:44   ` David Ahern
  2021-06-18 19:20 ` [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-06-18 11:04 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Toke Høiland-Jørgensen, Hideaki YOSHIFUJI, David Ahern,
	netdev, Juliusz Chroboczek

This adds a new icmp.sh selftest for testing that the kernel will respond
correctly with an ICMP unreachable message with the dummy (192.0.0.8)
source address when there are no IPv4 addresses configured to use as source
addresses.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 tools/testing/selftests/net/icmp.sh | 74 +++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)
 create mode 100755 tools/testing/selftests/net/icmp.sh

diff --git a/tools/testing/selftests/net/icmp.sh b/tools/testing/selftests/net/icmp.sh
new file mode 100755
index 000000000000..e4b04cd1644a
--- /dev/null
+++ b/tools/testing/selftests/net/icmp.sh
@@ -0,0 +1,74 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Test for checking ICMP response with dummy address instead of 0.0.0.0.
+# Sets up two namespaces like:
+# +----------------------+                          +--------------------+
+# | ns1                  |    v4-via-v6 routes:     | ns2                |
+# |                      |                  '       |                    |
+# |             +--------+   -> 172.16.1.0/24 ->    +--------+           |
+# |             | veth0  +--------------------------+  veth0 |           |
+# |             +--------+   <- 172.16.0.0/24 <-    +--------+           |
+# |           172.16.0.1 |                          | 2001:db8:1::2/64   |
+# |     2001:db8:1::2/64 |                          |                    |
+# +----------------------+                          +--------------------+
+#
+# And then tries to ping 172.16.1.1 from ns1. This results in a "net
+# unreachable" message being sent from ns2, but there is no IPv4 address set in
+# that address space, so the kernel should substitute the dummy address
+# 192.0.0.8 defined in RFC7600.
+
+NS1=ns1
+NS2=ns2
+H1_IP=172.16.0.1/32
+H1_IP6=2001:db8:1::1
+RT1=172.16.1.0/24
+PINGADDR=172.16.1.1
+RT2=172.16.0.0/24
+H2_IP6=2001:db8:1::2
+
+TMPFILE=$(mktemp)
+
+cleanup()
+{
+    rm -f "$TMPFILE"
+    ip netns del $NS1
+    ip netns del $NS2
+}
+
+trap cleanup EXIT
+
+# Namespaces
+ip netns add $NS1
+ip netns add $NS2
+
+# Connectivity
+ip -netns $NS1 link add veth0 type veth peer name veth0 netns $NS2
+ip -netns $NS1 link set dev veth0 up
+ip -netns $NS2 link set dev veth0 up
+ip -netns $NS1 addr add $H1_IP dev veth0
+ip -netns $NS1 addr add $H1_IP6/64 dev veth0 nodad
+ip -netns $NS2 addr add $H2_IP6/64 dev veth0 nodad
+ip -netns $NS1 route add $RT1 via inet6 $H2_IP6
+ip -netns $NS2 route add $RT2 via inet6 $H1_IP6
+
+# Make sure ns2 will respond with ICMP unreachable
+ip netns exec $NS2 sysctl -qw net.ipv4.icmp_ratelimit=0 net.ipv4.ip_forward=1
+
+# Run the test - a ping runs in the background, and we capture ICMP responses
+# with tcpdump; -c 1 means it should exit on the first ping, but add a timeout
+# in case something goes wrong
+ip netns exec $NS1 ping -w 3 -i 0.5 $PINGADDR >/dev/null &
+ip netns exec $NS1 timeout 10 tcpdump -tpni veth0 -c 1 'icmp and icmp[icmptype] != icmp-echo' > $TMPFILE 2>/dev/null
+
+# Parse response and check for dummy address
+# tcpdump output looks like:
+# IP 192.0.0.8 > 172.16.0.1: ICMP net 172.16.1.1 unreachable, length 92
+RESP_IP=$(awk '{print $2}' < $TMPFILE)
+if [[ "$RESP_IP" != "192.0.0.8" ]]; then
+    echo "FAIL - got ICMP response from $RESP_IP, should be 192.0.0.8"
+    exit 1
+else
+    echo "OK"
+    exit 0
+fi
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses
  2021-06-18 11:04 ` [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses Toke Høiland-Jørgensen
@ 2021-06-18 13:44   ` David Ahern
  0 siblings, 0 replies; 4+ messages in thread
From: David Ahern @ 2021-06-18 13:44 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, David S. Miller, Jakub Kicinski
  Cc: Hideaki YOSHIFUJI, David Ahern, netdev, Juliusz Chroboczek

On 6/18/21 5:04 AM, Toke Høiland-Jørgensen wrote:
> This adds a new icmp.sh selftest for testing that the kernel will respond
> correctly with an ICMP unreachable message with the dummy (192.0.0.8)
> source address when there are no IPv4 addresses configured to use as source
> addresses.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  tools/testing/selftests/net/icmp.sh | 74 +++++++++++++++++++++++++++++
>  1 file changed, 74 insertions(+)
>  create mode 100755 tools/testing/selftests/net/icmp.sh
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

Thanks, Toke.



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0
  2021-06-18 11:04 [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 Toke Høiland-Jørgensen
  2021-06-18 11:04 ` [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses Toke Høiland-Jørgensen
@ 2021-06-18 19:20 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-06-18 19:20 UTC (permalink / raw)
  To: =?utf-8?b?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2VuIDx0b2tlQHJlZGhhdC5jb20+?=
  Cc: davem, kuba, yoshfuji, dsahern, netdev, jch

Hello:

This series was applied to netdev/net.git (refs/heads/master):

On Fri, 18 Jun 2021 13:04:35 +0200 you wrote:
> When constructing ICMP response messages, the kernel will try to pick a
> suitable source address for the outgoing packet. However, if no IPv4
> addresses are configured on the system at all, this will fail and we end up
> producing an ICMP message with a source address of 0.0.0.0. This can happen
> on a box routing IPv4 traffic via v6 nexthops, for instance.
> 
> Since 0.0.0.0 is not generally routable on the internet, there's a good
> chance that such ICMP messages will never make it back to the sender of the
> original packet that the ICMP message was sent in response to. This, in
> turn, can create connectivity and PMTUd problems for senders. Fortunately,
> RFC7600 reserves a dummy address to be used as a source for ICMP
> messages (192.0.0.8/32), so let's teach the kernel to substitute that
> address as a last resort if the regular source address selection procedure
> fails.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0
    https://git.kernel.org/netdev/net/c/321827477360
  - [net,v2,2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses
    https://git.kernel.org/netdev/net/c/7e9838b7915e

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-06-18 19:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-18 11:04 [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 Toke Høiland-Jørgensen
2021-06-18 11:04 ` [PATCH net v2 2/2] selftests/net: Add icmp.sh for testing ICMP dummy address responses Toke Høiland-Jørgensen
2021-06-18 13:44   ` David Ahern
2021-06-18 19:20 ` [PATCH net v2 1/2] icmp: don't send out ICMP messages with a source address of 0.0.0.0 patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.