From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Shakeel Butt <shakeelb@google.com>,
	Roman Gushchin <guro@fb.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.14 15/99] cgroup: memcg: net: do not associate sock with unrelated cgroup
Date: Thu, 19 Mar 2020 14:02:53 +0100
Message-ID: <20200319123946.295542354@linuxfoundation.org>
In-Reply-To: <20200319123941.630731708@linuxfoundation.org>

From: Shakeel Butt <shakeelb@google.com>

[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ]

We are testing network memory accounting in our setup and noticed
inconsistent network memory usage: often the network usage of unrelated
cgroups correlates with the testing workload. On further inspection, it
seems that mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
irq context, especially for cgroup v1.

mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context,
and they assume that this can only happen from sk_clone_lock() and that
the source sock object is already associated with a cgroup. However, in
cgroup v1, where network memory accounting is opt-in, the source sock
may not be associated with any cgroup, and the newly cloned sock can then
get associated with the unrelated cgroup of the interrupted task.

Cgroup v2 can also suffer if the source sock object was created by a
process in the root cgroup or if sk_alloc() is called in irq context.
The fix is to just do nothing in interrupt context.

WARNING: Please note that about half of the TCP sockets are allocated
from the IRQ context, so memory used by such sockets will not be
accounted by the memcg.
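
To make the failure mode concrete, here is a minimal userspace sketch of
the decision described above. It is not kernel code: the model_* types,
variables, and functions are hypothetical stand-ins for struct sock, the
interrupted task's cgroup, and in_interrupt(), and the "old" variant is
only a simplified reading of the pre-patch behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. Every name below is hypothetical and stands in
 * for a kernel concept; none of this is the kernel's actual API.
 */
struct model_cgroup { const char *name; };
struct model_sock   { const struct model_cgroup *memcg; };

/* Stand-in for the cgroup of "current", i.e. whichever task is running. */
static const struct model_cgroup *model_current_cgroup;
/* Stand-in for in_interrupt(): true while servicing (soft)irq work. */
static bool model_in_interrupt;

/*
 * Simplified pre-patch behaviour: a sock that did not inherit a cgroup
 * from its parent is charged to whichever task happens to be running,
 * even if that task was merely interrupted.
 */
static void model_sk_alloc_old(struct model_sock *sk)
{
	assert(sk != NULL);
	if (sk->memcg)		/* inherited on clone: keep it */
		return;
	sk->memcg = model_current_cgroup;
}

/*
 * Simplified post-patch behaviour: in interrupt context the running
 * task is unrelated to the sock, so leave the sock unassociated.
 */
static void model_sk_alloc_new(struct model_sock *sk)
{
	assert(sk != NULL);
	if (sk->memcg)
		return;
	if (model_in_interrupt)
		return;
	sk->memcg = model_current_cgroup;
}
```

With model_in_interrupt set and model_current_cgroup pointing at some
unrelated cgroup, the old variant tags a fresh sock with that cgroup,
while the new variant leaves memcg NULL. The latter is the unaccounted
case that the WARNING above refers to.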

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:

CPU: 70 PID: 12720 Comm: ssh Tainted:  5.6.0-smp-DEV #1
Hardware name: ...
Call Trace:
 <IRQ>
 dump_stack+0x57/0x75
 mem_cgroup_sk_alloc+0xe9/0xf0
 sk_clone_lock+0x2a7/0x420
 inet_csk_clone_lock+0x1b/0x110
 tcp_create_openreq_child+0x23/0x3b0
 tcp_v6_syn_recv_sock+0x88/0x730
 tcp_check_req+0x429/0x560
 tcp_v6_rcv+0x72d/0xa40
 ip6_protocol_deliver_rcu+0xc9/0x400
 ip6_input+0x44/0xd0
 ? ip6_protocol_deliver_rcu+0x400/0x400
 ip6_rcv_finish+0x71/0x80
 ipv6_rcv+0x5b/0xe0
 ? ip6_sublist_rcv+0x2e0/0x2e0
 process_backlog+0x108/0x1e0
 net_rx_action+0x26b/0x460
 __do_softirq+0x104/0x2a6
 do_softirq_own_stack+0x2a/0x40
 </IRQ>
 do_softirq.part.19+0x40/0x50
 __local_bh_enable_ip+0x51/0x60
 ip6_finish_output2+0x23d/0x520
 ? ip6table_mangle_hook+0x55/0x160
 __ip6_finish_output+0xa1/0x100
 ip6_finish_output+0x30/0xd0
 ip6_output+0x73/0x120
 ? __ip6_finish_output+0x100/0x100
 ip6_xmit+0x2e3/0x600
 ? ipv6_anycast_cleanup+0x50/0x50
 ? inet6_csk_route_socket+0x136/0x1e0
 ? skb_free_head+0x1e/0x30
 inet6_csk_xmit+0x95/0xf0
 __tcp_transmit_skb+0x5b4/0xb20
 __tcp_send_ack.part.60+0xa3/0x110
 tcp_send_ack+0x1d/0x20
 tcp_rcv_state_process+0xe64/0xe80
 ? tcp_v6_connect+0x5d1/0x5f0
 tcp_v6_do_rcv+0x1b1/0x3f0
 ? tcp_v6_do_rcv+0x1b1/0x3f0
 __release_sock+0x7f/0xd0
 release_sock+0x30/0xa0
 __inet_stream_connect+0x1c3/0x3b0
 ? prepare_to_wait+0xb0/0xb0
 inet_stream_connect+0x3b/0x60
 __sys_connect+0x101/0x120
 ? __sys_getsockopt+0x11b/0x140
 __x64_sys_connect+0x1a/0x20
 do_syscall_64+0x51/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking")
Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/cgroup/cgroup.c |    4 ++++
 mm/memcontrol.c        |    4 ++++
 2 files changed, 8 insertions(+)

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5799,6 +5799,10 @@ void cgroup_sk_alloc(struct sock_cgroup_
 		return;
 	}
 
+	/* Don't associate the sock with unrelated interrupted task's cgroup. */
+	if (in_interrupt())
+		return;
+
 	rcu_read_lock();
 
 	while (true) {
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5881,6 +5881,10 @@ void mem_cgroup_sk_alloc(struct sock *sk
 		return;
 	}
 
+	/* Do not associate the sock with unrelated interrupted task's memcg. */
+	if (in_interrupt())
+		return;
+
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(current);
 	if (memcg == root_mem_cgroup)



