From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Dan Streetman <ddstreet@canonical.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.4 74/74] net: tcp: close sock if net namespace is exiting
Date: Mon, 29 Jan 2018 13:57:19 +0100 [thread overview]
Message-ID: <20180129123850.868584521@linuxfoundation.org> (raw)
In-Reply-To: <20180129123847.507563674@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Streetman <ddstreet@ieee.org>
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/net_namespace.h | 10 ++++++++++
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_timer.c | 15 +++++++++++++++
3 files changed, 28 insertions(+)
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -209,6 +209,11 @@ int net_eq(const struct net *net1, const
return net1 == net2;
}
+static inline int check_net(const struct net *net)
+{
+ return atomic_read(&net->count) != 0;
+}
+
void net_drop_ns(void *);
#else
@@ -232,6 +237,11 @@ int net_eq(const struct net *net1, const
{
return 1;
}
+
+static inline int check_net(const struct net *net)
+{
+ return 1;
+}
#define net_drop_ns NULL
#endif
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2176,6 +2176,9 @@ adjudge_to_death:
tcp_send_active_reset(sk, GFP_ATOMIC);
NET_INC_STATS_BH(sock_net(sk),
LINUX_MIB_TCPABORTONMEMORY);
+ } else if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_set_state(sk, TCP_CLOSE);
}
}
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -46,11 +46,19 @@ static void tcp_write_err(struct sock *s
* to prevent DoS attacks. It is called when a retransmission timeout
* or zero probe timeout occurs on orphaned socket.
*
+ * Also close if our net namespace is exiting; in that case there is no
+ * hope of ever communicating again since all netns interfaces are already
+ * down (or about to be down), and we need to release our dst references,
+ * which have been moved to the netns loopback interface, so the namespace
+ * can finish exiting. This condition is only possible if we are a kernel
+ * socket, as those do not hold references to the namespace.
+ *
* Criteria is still not confirmed experimentally and may change.
* We kill the socket, if:
* 1. If number of orphaned sockets exceeds an administratively configured
* limit.
* 2. If we have strong memory pressure.
+ * 3. If our net namespace is exiting.
*/
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{
@@ -79,6 +87,13 @@ static int tcp_out_of_resources(struct s
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
return 1;
}
+
+ if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_done(sk);
+ return 1;
+ }
+
return 0;
}
next prev parent reply other threads:[~2018-01-29 12:57 UTC|newest]
Thread overview: 93+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-29 12:56 [PATCH 4.4 00/74] 4.4.114-stable review Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 01/74] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Greg Kroah-Hartman
2018-01-29 12:56 ` Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 02/74] usbip: prevent vhci_hcd driver from leaking a socket pointer address Greg Kroah-Hartman
2018-02-03 8:30 ` Eric Biggers
2018-02-05 14:58 ` Shuah Khan
2018-01-29 12:56 ` [PATCH 4.4 03/74] usbip: Fix implicit fallthrough warning Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 04/74] usbip: Fix potential format overflow in userspace tools Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 05/74] x86/microcode/intel: Fix BDW late-loading revision check Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 06/74] x86/cpu/intel: Introduce macros for Intel family numbers Greg Kroah-Hartman
2018-01-29 12:56 ` [4.4,06/74] " Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 06/74] " Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 07/74] x86/retpoline: Fill RSB on context switch for affected CPUs Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 08/74] sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 09/74] can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 10/74] can: af_can: canfd_rcv(): " Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 11/74] PM / sleep: declare __tracedata symbols as char[] rather than char Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 12/74] time: Avoid undefined behaviour in ktime_add_safe() Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 13/74] timers: Plug locking race vs. timer migration Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 14/74] Prevent timer value 0 for MWAITX Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 15/74] drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 16/74] drivers: base: cacheinfo: fix boot error message when acpi is enabled Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 17/74] PCI: layerscape: Add "fsl,ls2085a-pcie" compatible ID Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 18/74] PCI: layerscape: Fix MSG TLP drop setting Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 19/74] mmc: sdhci-of-esdhc: add/remove some quirks according to vendor version Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 20/74] fs/select: add vmalloc fallback for select(2) Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 21/74] mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 22/74] hwpoison, memcg: forcibly uncharge LRU pages Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 23/74] cma: fix calculation of aligned offset Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 24/74] mm, page_alloc: fix potential false positive in __zone_watermark_ok Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 25/74] ipc: msg, make msgrcv work with LONG_MIN Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 26/74] x86/ioapic: Fix incorrect pointers in ioapic_setup_resources() Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 27/74] ACPI / processor: Avoid reserving IO regions too early Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 28/74] ACPI / scan: Prefer devices without _HID/_CID for _ADR matching Greg Kroah-Hartman
2018-02-01 8:46 ` Jiri Slaby
2018-02-01 8:57 ` Jiri Slaby
2018-02-01 10:29 ` Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 29/74] ACPICA: Namespace: fix operand cache leak Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 30/74] netfilter: x_tables: speed up jump target validation Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 31/74] netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 32/74] netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 33/74] netfilter: nf_ct_expect: remove the redundant slash when policy name is empty Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 34/74] netfilter: nfnetlink_queue: reject verdict request from different portid Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 35/74] netfilter: restart search if moved to other chain Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 36/74] netfilter: nf_conntrack_sip: extend request line validation Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 37/74] netfilter: use fwmark_reflect in nf_send_reset Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 38/74] netfilter: fix IS_ERR_VALUE usage Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 39/74] netfilter: nfnetlink_cthelper: Add missing permission checks Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 40/74] netfilter: xt_osf: " Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 41/74] ext2: Dont clear SGID when inheriting ACLs Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 42/74] reiserfs: fix race in prealloc discard Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 43/74] reiserfs: dont preallocate blocks for extended attributes Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 44/74] reiserfs: Dont clear SGID when inheriting ACLs Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 45/74] fs/fcntl: f_setown, avoid undefined behaviour Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 46/74] scsi: libiscsi: fix shifting of DID_REQUEUE host byte Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 47/74] Revert "module: Add retpoline tag to VERMAGIC" Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 48/74] Input: trackpoint - force 3 buttons if 0 button is reported Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 49/74] usb: usbip: Fix possible deadlocks reported by lockdep Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 50/74] usbip: fix stub_rx: get_pipe() to validate endpoint number Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 51/74] usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 52/74] usbip: prevent leaking socket pointer address in messages Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 53/74] um: link vmlinux with -no-pie Greg Kroah-Hartman
2018-01-29 12:56 ` [PATCH 4.4 54/74] vsyscall: Fix permissions for emulate mode with KAISER/PTI Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 55/74] eventpoll.h: add missing epoll event masks Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 56/74] x86/microcode/intel: Extend BDW late-loading further with LLC size check Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 57/74] hrtimer: Reset hrtimer cpu base proper on CPU hotplug Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 58/74] dccp: dont restart ccid2_hc_tx_rto_expire() if sk in closed state Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 59/74] ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 60/74] ipv6: fix udpv6 sendmsg crash caused by too small MTU Greg Kroah-Hartman
2018-02-19 19:46 ` Ben Hutchings
2018-02-19 19:52 ` Ben Hutchings
2018-02-19 20:06 ` Eric Dumazet
2018-01-29 12:57 ` [PATCH 4.4 61/74] ipv6: ip6_make_skb() needs to clear cork.base.dst Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 62/74] lan78xx: Fix failure in USB Full Speed Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 63/74] net: igmp: fix source address check for IGMPv3 reports Greg Kroah-Hartman
2018-01-30 13:22 ` Florian Wolters
2018-01-30 13:33 ` Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 64/74] tcp: __tcp_hdrlen() helper Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 65/74] net: qdisc_pkt_len_init() should be more robust Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 66/74] pppoe: take ->needed_headroom of lower device into account on xmit Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 67/74] r8169: fix memory corruption on retrieval of hardware statistics Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 68/74] sctp: do not allow the v4 socket to bind a v4mapped v6 address Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 69/74] sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 70/74] vmxnet3: repair memory leak Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 71/74] net: Allow neigh contructor functions ability to modify the primary_key Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 72/74] ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY Greg Kroah-Hartman
2018-01-29 12:57 ` [PATCH 4.4 73/74] flow_dissector: properly cap thoff field Greg Kroah-Hartman
2018-01-29 12:57 ` Greg Kroah-Hartman [this message]
2018-01-29 21:30 ` [PATCH 4.4 00/74] 4.4.114-stable review Nathan Chancellor
2018-01-30 7:37 ` Greg Kroah-Hartman
2018-01-29 23:57 ` Shuah Khan
2018-01-30 10:05 ` Naresh Kamboju
2018-01-30 14:20 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180129123850.868584521@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=ddstreet@canonical.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.