From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jan Tluka <jtluka@redhat.com>,
	Jakub Sitnicki <jkbs@redhat.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.4 18/75] ipv6: Skip XFRM lookup if dst_entry in socket cache is valid
Date: Wed, 22 Jun 2016 15:40:40 -0700
Message-ID: <20160622223500.961867227@linuxfoundation.org>
In-Reply-To: <20160622223500.055133765@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jakub Sitnicki <jkbs@redhat.com>

[ Upstream commit 00bc0ef5880dc7b82f9c320dead4afaad48e47be ]

At present we perform an xfrm_lookup() for each UDPv6 message we
send. The lookup involves querying the flow cache (flow_cache_lookup)
and, in case of a cache miss, creating an XFRM bundle.

If we miss the flow cache, we can end up creating a new bundle and
deriving the path MTU (xfrm_init_pmtu) from an already transformed
dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
to xfrm_lookup(). This can happen only if we're caching the dst_entry
in the socket, that is, when we're using a connected UDP socket.

To put it another way, the path MTU shrinks each time we miss the flow
cache, which later on leads to incorrectly fragmented payload. It can
be observed with ESPv6 in transport mode:

  1) Set up a transformation and lower the MTU to trigger fragmentation
    # ip xfrm policy add dir out src ::1 dst ::1 \
      tmpl src ::1 dst ::1 proto esp spi 1
    # ip xfrm state add src ::1 dst ::1 \
      proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
    # ip link set dev lo mtu 1500

  2) Monitor the packet flow and set up a UDP sink
    # tcpdump -ni lo -ttt &
    # socat udp6-listen:12345,fork /dev/null &

  3) Send a datagram that needs fragmentation with a connected socket
    # perl -e 'print "@" x 1470' | socat - udp6:[::1]:12345
    2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
    00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
    00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
    00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
    (^ ICMPv6 Parameter Problem)
    00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136

  4) Compare it to a non-connected socket
    # perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
    00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
    00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)

What happens in step (3) is:

  1) when connecting the socket in __ip6_datagram_connect(), we
     perform an XFRM lookup, miss the flow cache, create an XFRM
     bundle, and cache the destination,

  2) afterwards, when sending the datagram, we perform an XFRM lookup
     again, miss the flow cache (due to a mismatch of flowi6_iif and
     flowi6_oif, which is an issue of its own), and recreate an XFRM
     bundle based on the cached (and already transformed) destination.
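
The two steps above can be illustrated with a toy model. This is only a
sketch, not kernel code: struct toy_dst, toy_create_bundle() and the
TOY_XFRM_OVERHEAD value are hypothetical names and numbers chosen for
illustration. It shows why building a bundle on top of an already
transformed destination subtracts the transform overhead a second time.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a dst_entry: an MTU plus the dst it was built on. */
struct toy_dst {
	unsigned int mtu;
	const struct toy_dst *child;	/* NULL for a plain route */
};

/*
 * Hypothetical per-transform overhead in bytes. The real ESP overhead
 * depends on the configured algorithms; this value is an assumption.
 */
#define TOY_XFRM_OVERHEAD 36u

/*
 * Model of bundle creation: wrap 'orig' and derive the new MTU from
 * whatever MTU 'orig' reports, whether or not it is already a bundle.
 */
static struct toy_dst toy_create_bundle(const struct toy_dst *orig)
{
	struct toy_dst bundle;

	assert(orig->mtu > TOY_XFRM_OVERHEAD);
	bundle.mtu = orig->mtu - TOY_XFRM_OVERHEAD;
	bundle.child = orig;
	return bundle;
}
```

Calling toy_create_bundle() on a plain route models step (1). Calling it
again on the result models step (2): the second bundle reports a smaller
MTU than the first, although only one transformation is configured.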

To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
altogether whenever we already have a destination entry cached in the
socket. This prevents the path MTU shrinkage and brings us on par with
UDPv4.
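
As a rough sketch of the resulting control flow, again using
hypothetical names (struct sketch_sock, sketch_full_lookup(),
sketch_sk_dst_lookup()) rather than the kernel's own types, the lookup
consults the socket cache first and falls back to the full route plus
XFRM lookup only when nothing valid is cached:

```c
#include <stddef.h>

/* Toy destination; only the MTU matters for this sketch. */
struct sketch_dst {
	unsigned int mtu;
};

/* Toy socket: a cached destination and a counter of full lookups. */
struct sketch_sock {
	struct sketch_dst *cached;	/* models sk->sk_dst_cache */
	unsigned int xfrm_lookups;	/* how often the slow path ran */
};

/* Models the full route + XFRM lookup (the slow path). */
static struct sketch_dst *sketch_full_lookup(struct sketch_sock *sk,
					     struct sketch_dst *route)
{
	sk->xfrm_lookups++;
	return route;
}

/* Models the patched lookup: reuse the cached entry when there is one. */
static struct sketch_dst *sketch_sk_dst_lookup(struct sketch_sock *sk,
					       struct sketch_dst *route)
{
	struct sketch_dst *dst = sk->cached;

	if (!dst)
		dst = sketch_full_lookup(sk, route);
	return dst;
}
```

In this model, storing the destination in the cache is left to the
caller, much as it happens at connect time in the description above.
Once an entry is cached, repeated lookups return it without increasing
xfrm_lookups.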

The fix also benefits connected PINGv6 sockets, another user of
ip6_sk_dst_lookup_flow(), which also suffer from messages being
transformed twice.

Joint work with Hannes Frederic Sowa.

Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_output.c |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1072,17 +1072,12 @@ struct dst_entry *ip6_sk_dst_lookup_flow
 					 const struct in6_addr *final_dst)
 {
 	struct dst_entry *dst = sk_dst_check(sk, inet6_sk(sk)->dst_cookie);
-	int err;
 
 	dst = ip6_sk_dst_check(sk, dst, fl6);
+	if (!dst)
+		dst = ip6_dst_lookup_flow(sk, fl6, final_dst);
 
-	err = ip6_dst_lookup_tail(sock_net(sk), sk, &dst, fl6);
-	if (err)
-		return ERR_PTR(err);
-	if (final_dst)
-		fl6->daddr = *final_dst;
-
-	return xfrm_lookup_route(sock_net(sk), dst, flowi6_to_flowi(fl6), sk, 0);
+	return dst;
 }
 EXPORT_SYMBOL_GPL(ip6_sk_dst_lookup_flow);
 
