linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Shakeel Butt <shakeelb@google.com>,
	Roman Gushchin <guro@fb.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.9 29/90] cgroup: memcg: net: do not associate sock with unrelated cgroup
Date: Thu, 19 Mar 2020 13:59:51 +0100	[thread overview]
Message-ID: <20200319123937.605932836@linuxfoundation.org> (raw)
In-Reply-To: <20200319123928.635114118@linuxfoundation.org>

From: Shakeel Butt <shakeelb@google.com>

[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ]

We are testing network memory accounting in our setup and noticed
inconsistent network memory usage and often unrelated cgroups network
usage correlates with testing workload. On further inspection, it
seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
irq context specially for cgroup v1.

mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context
and kind of assumes that this can only happen from sk_clone_lock()
and the source sock object has already associated cgroup. However in
cgroup v1, where network memory accounting is opt-in, the source sock
can be unassociated with any cgroup and the new cloned sock can get
associated with unrelated interrupted cgroup.

Cgroup v2 can also suffer if the source sock object was created by
process in the root cgroup or if sk_alloc() is called in irq context.
The fix is to just do nothing in interrupt.

WARNING: Please note that about half of the TCP sockets are allocated
from the IRQ context, so, memory used by such sockets will not be
accouted by the memcg.

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:

CPU: 70 PID: 12720 Comm: ssh Tainted:  5.6.0-smp-DEV #1
Hardware name: ...
Call Trace:
 <IRQ>
 dump_stack+0x57/0x75
 mem_cgroup_sk_alloc+0xe9/0xf0
 sk_clone_lock+0x2a7/0x420
 inet_csk_clone_lock+0x1b/0x110
 tcp_create_openreq_child+0x23/0x3b0
 tcp_v6_syn_recv_sock+0x88/0x730
 tcp_check_req+0x429/0x560
 tcp_v6_rcv+0x72d/0xa40
 ip6_protocol_deliver_rcu+0xc9/0x400
 ip6_input+0x44/0xd0
 ? ip6_protocol_deliver_rcu+0x400/0x400
 ip6_rcv_finish+0x71/0x80
 ipv6_rcv+0x5b/0xe0
 ? ip6_sublist_rcv+0x2e0/0x2e0
 process_backlog+0x108/0x1e0
 net_rx_action+0x26b/0x460
 __do_softirq+0x104/0x2a6
 do_softirq_own_stack+0x2a/0x40
 </IRQ>
 do_softirq.part.19+0x40/0x50
 __local_bh_enable_ip+0x51/0x60
 ip6_finish_output2+0x23d/0x520
 ? ip6table_mangle_hook+0x55/0x160
 __ip6_finish_output+0xa1/0x100
 ip6_finish_output+0x30/0xd0
 ip6_output+0x73/0x120
 ? __ip6_finish_output+0x100/0x100
 ip6_xmit+0x2e3/0x600
 ? ipv6_anycast_cleanup+0x50/0x50
 ? inet6_csk_route_socket+0x136/0x1e0
 ? skb_free_head+0x1e/0x30
 inet6_csk_xmit+0x95/0xf0
 __tcp_transmit_skb+0x5b4/0xb20
 __tcp_send_ack.part.60+0xa3/0x110
 tcp_send_ack+0x1d/0x20
 tcp_rcv_state_process+0xe64/0xe80
 ? tcp_v6_connect+0x5d1/0x5f0
 tcp_v6_do_rcv+0x1b1/0x3f0
 ? tcp_v6_do_rcv+0x1b1/0x3f0
 __release_sock+0x7f/0xd0
 release_sock+0x30/0xa0
 __inet_stream_connect+0x1c3/0x3b0
 ? prepare_to_wait+0xb0/0xb0
 inet_stream_connect+0x3b/0x60
 __sys_connect+0x101/0x120
 ? __sys_getsockopt+0x11b/0x140
 __x64_sys_connect+0x1a/0x20
 do_syscall_64+0x51/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking")
Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/cgroup.c |    4 ++++
 mm/memcontrol.c |    4 ++++
 2 files changed, 8 insertions(+)

--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6335,6 +6335,10 @@ void cgroup_sk_alloc(struct sock_cgroup_
 		return;
 	}
 
+	/* Don't associate the sock with unrelated interrupted task's cgroup. */
+	if (in_interrupt())
+		return;
+
 	rcu_read_lock();
 
 	while (true) {
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5726,6 +5726,10 @@ void mem_cgroup_sk_alloc(struct sock *sk
 		return;
 	}
 
+	/* Do not associate the sock with unrelated interrupted task's memcg. */
+	if (in_interrupt())
+		return;
+
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(current);
 	if (memcg == root_mem_cgroup)



  parent reply	other threads:[~2020-03-19 13:36 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-19 12:59 [PATCH 4.9 00/90] 4.9.217-rc1 review Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 01/90] NFS: Remove superfluous kmap in nfs_readdir_xdr_to_array Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 02/90] phy: Revert toggling reset changes Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 03/90] net: phy: Avoid multiple suspends Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 04/90] cgroup, netclassid: periodically release file_lock on classid updating Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 05/90] gre: fix uninit-value in __iptunnel_pull_header Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 06/90] ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 07/90] net: macsec: update SCI upon MAC address change Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 08/90] net: nfc: fix bounds checking bugs on "pipe" Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 09/90] r8152: check disconnect status after long sleep Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 10/90] bnxt_en: reinitialize IRQs when MTU is modified Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 11/90] fib: add missing attribute validation for tun_id Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 12/90] nl802154: add missing attribute validation Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 13/90] nl802154: add missing attribute validation for dev_type Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 14/90] macsec: add missing attribute validation for port Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 15/90] net: fq: add missing attribute validation for orphan mask Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 16/90] team: add missing attribute validation for port ifindex Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 17/90] team: add missing attribute validation for array index Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 18/90] nfc: add missing attribute validation for SE API Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 19/90] nfc: add missing attribute validation for vendor subcommand Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 20/90] ipvlan: add cond_resched_rcu() while processing muticast backlog Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 21/90] ipvlan: do not add hardware address of master to its unicast filter list Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 22/90] ipvlan: egress mcast packets are not exceptional Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 23/90] ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast() Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 24/90] ipvlan: dont deref eth hdr before checking its set Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 25/90] macvlan: add cond_resched() during multicast processing Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 26/90] net: fec: validate the new settings in fec_enet_set_coalesce() Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 27/90] slip: make slhc_compress() more robust against malicious packets Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 28/90] bonding/alb: make sure arp header is pulled before accessing it Greg Kroah-Hartman
2020-03-19 12:59 ` Greg Kroah-Hartman [this message]
2020-03-19 12:59 ` [PATCH 4.9 30/90] net: phy: fix MDIO bus PM PHY resuming Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 31/90] virtio-blk: fix hw_queue stopped on arbitrary error Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 32/90] iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 33/90] workqueue: dont use wq_select_unbound_cpu() for bound works Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 34/90] drm/amd/display: remove duplicated assignment to grph_obj_type Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 35/90] cifs_atomic_open(): fix double-put on late allocation failure Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 36/90] gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache Greg Kroah-Hartman
2020-03-19 12:59 ` [PATCH 4.9 37/90] KVM: x86: clear stale x86_emulate_ctxt->intercept value Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 38/90] ARC: define __ALIGN_STR and __ALIGN symbols for ARC Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 39/90] efi: Fix a race and a buffer overflow while reading efivars via sysfs Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 40/90] iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 41/90] iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 42/90] nl80211: add missing attribute validation for critical protocol indication Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 43/90] nl80211: add missing attribute validation for beacon report scanning Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 44/90] nl80211: add missing attribute validation for channel switch Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 45/90] netfilter: cthelper: add missing attribute validation for cthelper Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 46/90] mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 47/90] iommu/vt-d: Fix the wrong printing in RHSA parsing Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 48/90] iommu/vt-d: Ignore devices with out-of-spec domain number Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 49/90] ipv6: restrict IPV6_ADDRFORM operation Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 50/90] efi: Add a sanity check to efivar_store_raw() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 51/90] batman-adv: Fix double free during fragment merge error Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 52/90] batman-adv: Fix transmission of final, 16th fragment Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 53/90] batman-adv: Initialize gw sel_class via batadv_algo Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 54/90] batman-adv: Fix rx packet/bytes stats on local ARP reply Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 55/90] batman-adv: Use default throughput value on cfg80211 error Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 56/90] batman-adv: Accept only filled wifi station info Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 57/90] batman-adv: fix TT sync flag inconsistencies Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 58/90] batman-adv: Avoid spurious warnings from bat_v neigh_cmp implementation Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 59/90] batman-adv: Always initialize fragment header priority Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 60/90] batman-adv: Fix check of retrieved orig_gw in batadv_v_gw_is_eligible Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 61/90] batman-adv: Fix lock for ogm cnt access in batadv_iv_ogm_calc_tq Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 62/90] batman-adv: Fix internal interface indices types Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 63/90] batman-adv: Avoid race in TT TVLV allocator helper Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 64/90] batman-adv: Fix TT sync flags for intermediate TT responses Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 65/90] batman-adv: prevent TT request storms by not sending inconsistent TT TLVLs Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 66/90] batman-adv: Fix debugfs path for renamed hardif Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 67/90] batman-adv: Fix debugfs path for renamed softif Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 68/90] batman-adv: Avoid storing non-TT-sync flags on singular entries too Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 69/90] batman-adv: Fix multicast TT issues with bogus ROAM flags Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 70/90] batman-adv: Prevent duplicated gateway_node entry Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 71/90] batman-adv: Fix duplicated OGMs on NETDEV_UP Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 72/90] batman-adv: Avoid free/alloc race when handling OGM2 buffer Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 73/90] batman-adv: Avoid free/alloc race when handling OGM buffer Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 74/90] batman-adv: Dont schedule OGM for disabled interface Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 75/90] batman-adv: update data pointers after skb_cow() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 76/90] batman-adv: Avoid probe ELP information leak Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 77/90] batman-adv: Use explicit tvlv padding for ELP packets Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 78/90] perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 79/90] ACPI: watchdog: Allow disabling WDAT at boot Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 80/90] HID: apple: Add support for recent firmware on Magic Keyboards Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 81/90] HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 82/90] cfg80211: check reg_rule for NULL in handle_channel_custom() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 83/90] net: ks8851-ml: Fix IRQ handling and locking Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 84/90] mac80211: rx: avoid RCU list traversal under mutex Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 85/90] signal: avoid double atomic counter increments for user accounting Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 86/90] jbd2: fix data races at struct journal_head Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 87/90] ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 88/90] ARM: 8958/1: rename missed uaccess .fixup section Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 89/90] mm: slub: add missing TID bump in kmem_cache_alloc_bulk() Greg Kroah-Hartman
2020-03-19 13:00 ` [PATCH 4.9 90/90] ipv4: ensure rcu_read_lock() in cipso_v4_error() Greg Kroah-Hartman
2020-03-19 18:01 ` [PATCH 4.9 00/90] 4.9.217-rc1 review Naresh Kamboju
2020-03-19 23:35 ` Guenter Roeck
2020-03-21  0:48 ` shuah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200319123937.605932836@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=guro@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shakeelb@google.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).