linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vasily Averin <vvs@virtuozzo.com>
To: cgroups@vger.kernel.org, Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 02/16] memcg: enable accounting for IP address and routing-related objects
Date: Wed, 28 Apr 2021 09:51:47 +0300	[thread overview]
Message-ID: <270cd5ae-02ab-925b-a679-9b4e7a6ff76d@virtuozzo.com> (raw)
In-Reply-To: <8664122a-99d3-7199-869a-781b21b7e712@virtuozzo.com>

Netadmin inside container can use 'ip a a' and 'ip r a'
to assign a large number of ipv4/ipv6 addresses and routing entries
and force kernel to allocate megabytes of unaccounted memory
for long-lived per-netdevice related kernel objects:
'struct in_ifaddr', 'struct inet6_ifaddr', 'struct fib6_node',
'struct rt6_info', 'struct fib_rules' and ip_fib caches.

These objects can be manually removed, though usually they lives
in memory till destroy of its net namespace.

It makes sense to account for them to restrict the host's memory
consumption from inside the memcg-limited container.

One of such objects is the 'struct fib6_node' mostly allocated in
net/ipv6/route.c::__ip6_ins_rt() inside the lock_bh()/unlock_bh() section:

 write_lock_bh(&table->tb6_lock);
 err = fib6_add(&table->tb6_root, rt, info, mxc);
 write_unlock_bh(&table->tb6_lock);

In this case it is not enough to simply add SLAB_ACCOUNT to corresponding
kmem cache. The proper memory cgroup still cannot be found due to the
incorrect 'in_interrupt()' check used in memcg_kmem_bypass().

Obsoleted in_interrupt() does not describe real execution context properly.
From include/linux/preempt.h:

 The following macros are deprecated and should not be used in new code:
 in_interrupt()	- We're in NMI,IRQ,SoftIRQ context or have BH disabled

To verify the current execution context new macro should be used instead:
 in_task()	- We're in task context

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
 mm/memcontrol.c      | 2 +-
 net/core/fib_rules.c | 4 ++--
 net/ipv4/devinet.c   | 2 +-
 net/ipv4/fib_trie.c  | 4 ++--
 net/ipv6/addrconf.c  | 2 +-
 net/ipv6/ip6_fib.c   | 4 ++--
 net/ipv6/route.c     | 2 +-
 7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e064ac0d..15108ad 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1076,7 +1076,7 @@ static __always_inline bool memcg_kmem_bypass(void)
 		return false;
 
 	/* Memcg to charge can't be determined. */
-	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
+	if (!in_task() || !current->mm || (current->flags & PF_KTHREAD))
 		return true;
 
 	return false;
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index cd80ffe..65d8b1d 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -57,7 +57,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
 {
 	struct fib_rule *r;
 
-	r = kzalloc(ops->rule_size, GFP_KERNEL);
+	r = kzalloc(ops->rule_size, GFP_KERNEL_ACCOUNT);
 	if (r == NULL)
 		return -ENOMEM;
 
@@ -541,7 +541,7 @@ static int fib_nl2rule(struct sk_buff *skb, struct nlmsghdr *nlh,
 			goto errout;
 	}
 
-	nlrule = kzalloc(ops->rule_size, GFP_KERNEL);
+	nlrule = kzalloc(ops->rule_size, GFP_KERNEL_ACCOUNT);
 	if (!nlrule) {
 		err = -ENOMEM;
 		goto errout;
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 2e35f68da..9b90413 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -215,7 +215,7 @@ static void devinet_sysctl_unregister(struct in_device *idev)
 
 static struct in_ifaddr *inet_alloc_ifa(void)
 {
-	return kzalloc(sizeof(struct in_ifaddr), GFP_KERNEL);
+	return kzalloc(sizeof(struct in_ifaddr), GFP_KERNEL_ACCOUNT);
 }
 
 static void inet_rcu_free_ifa(struct rcu_head *head)
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 25cf387..8060524 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -2380,11 +2380,11 @@ void __init fib_trie_init(void)
 {
 	fn_alias_kmem = kmem_cache_create("ip_fib_alias",
 					  sizeof(struct fib_alias),
-					  0, SLAB_PANIC, NULL);
+					  0, SLAB_PANIC | SLAB_ACCOUNT, NULL);
 
 	trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
 					   LEAF_SIZE,
-					   0, SLAB_PANIC, NULL);
+					   0, SLAB_PANIC | SLAB_ACCOUNT, NULL);
 }
 
 struct fib_table *fib_trie_table(u32 id, struct fib_table *alias)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a9e53f5..d56a15a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1080,7 +1080,7 @@ static int ipv6_add_addr_hash(struct net_device *dev, struct inet6_ifaddr *ifa)
 			goto out;
 	}
 
-	ifa = kzalloc(sizeof(*ifa), gfp_flags);
+	ifa = kzalloc(sizeof(*ifa), gfp_flags | __GFP_ACCOUNT);
 	if (!ifa) {
 		err = -ENOBUFS;
 		goto out;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 679699e..0982b7c 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -2444,8 +2444,8 @@ int __init fib6_init(void)
 	int ret = -ENOMEM;
 
 	fib6_node_kmem = kmem_cache_create("fib6_nodes",
-					   sizeof(struct fib6_node),
-					   0, SLAB_HWCACHE_ALIGN,
+					   sizeof(struct fib6_node), 0,
+					   SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT,
 					   NULL);
 	if (!fib6_node_kmem)
 		goto out;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 373d480..5dc5c68 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -6510,7 +6510,7 @@ int __init ip6_route_init(void)
 	ret = -ENOMEM;
 	ip6_dst_ops_template.kmem_cachep =
 		kmem_cache_create("ip6_dst_cache", sizeof(struct rt6_info), 0,
-				  SLAB_HWCACHE_ALIGN, NULL);
+				  SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL);
 	if (!ip6_dst_ops_template.kmem_cachep)
 		goto out;
 
-- 
1.8.3.1


  parent reply	other threads:[~2021-04-28  6:51 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <8664122a-99d3-7199-869a-781b21b7e712@virtuozzo.com>
2021-04-28  6:51 ` [PATCH v4 00/16] memcg accounting from OpenVZ Vasily Averin
2021-07-15 17:11   ` Shakeel Butt
2021-07-16  4:11     ` Vasily Averin
2021-07-16 12:55       ` Shakeel Butt
2021-07-19 10:44         ` [PATCH v5 " Vasily Averin
2021-07-26 18:59           ` [PATCH v6 00/16] memcg accounting from Vasily Averin
2021-07-26 21:59             ` David Miller
2021-07-27  4:44               ` [PATCH v6 00/16] memcg accounting from OpenVZ Vasily Averin
2021-07-27  5:33                 ` [PATCH v7 00/10] " Vasily Averin
     [not found]                 ` <cover.1627362057.git.vvs@virtuozzo.com>
2021-07-27  5:33                   ` [PATCH v7 01/10] memcg: enable accounting for mnt_cache entries Vasily Averin
2021-07-27  6:44                     ` Shakeel Butt
2021-07-27  7:21                     ` Christian Brauner
2021-07-27  5:33                   ` [PATCH v7 02/10] memcg: enable accounting for pollfd and select bits arrays Vasily Averin
2021-07-27 21:39                     ` Shakeel Butt
2021-07-27  5:33                   ` [PATCH v7 03/10] memcg: enable accounting for file lock caches Vasily Averin
2021-07-27 21:41                     ` Shakeel Butt
2021-07-27  5:33                   ` [PATCH v7 04/10] memcg: enable accounting for fasync_cache Vasily Averin
2021-07-27 21:50                     ` Shakeel Butt
2021-07-27  5:33                   ` [PATCH v7 05/10] memcg: enable accounting for new namesapces and struct nsproxy Vasily Averin
2021-07-27 21:51                     ` Shakeel Butt
2021-07-27  5:33                   ` [PATCH v7 06/10] memcg: enable accounting of ipc resources Vasily Averin
2021-07-27 22:33                     ` Shakeel Butt
2021-07-27  5:34                   ` [PATCH v7 07/10] memcg: enable accounting for signals Vasily Averin
2021-07-27  5:34                   ` [PATCH v7 08/10] memcg: enable accounting for posix_timers_cache slab Vasily Averin
2021-07-27 22:33                     ` Shakeel Butt
2021-07-27  5:34                   ` [PATCH v7 09/10] memcg: enable accounting for tty-related objects Vasily Averin
2021-07-27  6:09                     ` Greg Kroah-Hartman
2021-07-27  6:54                     ` Jiri Slaby
2021-07-27  8:02                       ` Vasily Averin
2021-07-27  9:26                         ` [PATCH TTY] memcg: drop GFP_KERNEL_ACCOUNT use in tty_save_termios() Vasily Averin
2021-07-27  9:32                           ` Greg Kroah-Hartman
2022-02-28  9:13                           ` [PATCH v2] memcg: enable accounting for tty-related objects Vasily Averin
2021-07-27  9:30                         ` [PATCH v7 09/10] " Greg Kroah-Hartman
2021-07-27  5:34                   ` [PATCH v7 10/10] memcg: enable accounting for ldt_struct objects Vasily Averin
2021-07-27 22:36                     ` Shakeel Butt
     [not found]           ` <cover.1627321321.git.vvs@virtuozzo.com>
2021-07-26 18:59             ` [PATCH v6 01/16] memcg: enable accounting for net_device and Tx/Rx queues Vasily Averin
2021-07-26 19:00             ` [PATCH v6 02/16] memcg: enable accounting for IP address and routing-related objects Vasily Averin
2021-07-26 19:00             ` [PATCH v6 03/16] memcg: enable accounting for inet_bin_bucket cache Vasily Averin
2021-07-26 19:00             ` [PATCH v6 04/16] memcg: enable accounting for VLAN group array Vasily Averin
2021-07-26 19:00             ` [PATCH v6 05/16] memcg: ipv6/sit: account and don't WARN on ip_tunnel_prl structs allocation Vasily Averin
2021-07-26 19:00             ` [PATCH v6 06/16] memcg: enable accounting for scm_fp_list objects Vasily Averin
2021-07-26 19:00             ` [PATCH v6 07/16] memcg: enable accounting for mnt_cache entries Vasily Averin
2021-07-26 19:00             ` [PATCH v6 08/16] memcg: enable accounting for pollfd and select bits arrays Vasily Averin
2021-07-26 19:01             ` [PATCH v6 09/16] memcg: enable accounting for file lock caches Vasily Averin
2021-07-26 19:01             ` [PATCH v6 10/16] memcg: enable accounting for fasync_cache Vasily Averin
2021-07-26 19:01             ` [PATCH v6 11/16] memcg: enable accounting for new namesapces and struct nsproxy Vasily Averin
2021-07-26 19:58               ` Kirill Tkhai
2021-07-26 19:01             ` [PATCH v6 12/16] memcg: enable accounting of ipc resources Vasily Averin
2021-07-26 19:01             ` [PATCH v6 13/16] memcg: enable accounting for signals Vasily Averin
2021-07-26 19:01             ` [PATCH v6 14/16] memcg: enable accounting for posix_timers_cache slab Vasily Averin
2021-07-26 19:01             ` [PATCH v6 15/16] memcg: enable accounting for tty-related objects Vasily Averin
2021-07-26 19:01             ` [PATCH v6 16/16] memcg: enable accounting for ldt_struct objects Vasily Averin
     [not found]         ` <cover.1626688654.git.vvs@virtuozzo.com>
2021-07-19 10:44           ` [PATCH v5 01/16] memcg: enable accounting for net_device and Tx/Rx queues Vasily Averin
2021-07-19 10:44           ` [PATCH v5 02/16] memcg: enable accounting for IP address and routing-related objects Vasily Averin
2021-07-19 14:00             ` Dmitry Safonov
2021-07-19 14:22               ` Shakeel Butt
2021-07-19 14:24                 ` Dmitry Safonov
2021-07-20 19:26             ` Shakeel Butt
2021-07-26 10:23               ` Vasily Averin
2021-07-26 13:48                 ` Shakeel Butt
2021-07-26 16:53                   ` [PATCH] memcg: replace in_interrupt() by !in_task() in active_memcg() Vasily Averin
2021-07-26 16:57                     ` Shakeel Butt
2021-07-19 10:44           ` [PATCH v5 03/16] memcg: enable accounting for inet_bin_bucket cache Vasily Averin
2021-07-19 10:44           ` [PATCH v5 04/16] memcg: enable accounting for VLAN group array Vasily Averin
2021-07-19 10:44           ` [PATCH v5 05/16] memcg: ipv6/sit: account and don't WARN on ip_tunnel_prl structs allocation Vasily Averin
2021-07-19 10:44           ` [PATCH v5 06/16] memcg: enable accounting for scm_fp_list objects Vasily Averin
2021-07-19 10:45           ` [PATCH v5 07/16] memcg: enable accounting for mnt_cache entries Vasily Averin
2021-07-19 10:45           ` [PATCH v5 08/16] memcg: enable accounting for pollfd and select bits arrays Vasily Averin
2021-07-19 10:45           ` [PATCH v5 09/16] memcg: enable accounting for file lock caches Vasily Averin
2021-07-19 10:45           ` [PATCH v5 10/16] memcg: enable accounting for fasync_cache Vasily Averin
2021-07-19 10:45           ` [PATCH v5 11/16] memcg: enable accounting for new namesapces and struct nsproxy Vasily Averin
2021-07-19 10:45           ` [PATCH v5 12/16] memcg: enable accounting of ipc resources Vasily Averin
2021-07-19 10:45           ` [PATCH v5 13/16] memcg: enable accounting for signals Vasily Averin
2021-07-19 17:32             ` Eric W. Biederman
2021-07-20  8:35               ` Vasily Averin
2021-07-20 14:37                 ` Shakeel Butt
2021-07-20 16:42                 ` Eric W. Biederman
2021-07-20 19:15             ` Shakeel Butt
2021-07-19 10:45           ` [PATCH v5 14/16] memcg: enable accounting for posix_timers_cache slab Vasily Averin
2021-07-19 10:45           ` [PATCH v5 15/16] memcg: enable accounting for tty-related objects Vasily Averin
2021-07-19 10:46           ` [PATCH v5 16/16] memcg: enable accounting for ldt_struct objects Vasily Averin
2021-04-28  6:51 ` [PATCH v4 01/16] memcg: enable accounting for net_device and Tx/Rx queues Vasily Averin
2021-04-28  6:51 ` Vasily Averin [this message]
2021-04-28  6:51 ` [PATCH v4 03/16] memcg: enable accounting for inet_bin_bucket cache Vasily Averin
2021-04-28  6:52 ` [PATCH v4 04/16] memcg: enable accounting for VLAN group array Vasily Averin
2021-04-28  6:52 ` [PATCH v4 05/16] memcg: ipv6/sit: account and don't WARN on ip_tunnel_prl structs allocation Vasily Averin
2021-04-28  6:52 ` [PATCH v4 06/16] memcg: enable accounting for scm_fp_list objects Vasily Averin
2021-04-28  6:52 ` [PATCH v4 07/16] memcg: enable accounting for new namesapces and struct nsproxy Vasily Averin
2021-05-07 13:45   ` Serge E. Hallyn
2021-05-07 15:03   ` Christian Brauner
2021-04-28  6:52 ` [PATCH v4 08/16] memcg: enable accounting of ipc resources Vasily Averin
2021-04-28  6:53 ` [PATCH v4 09/16] memcg: enable accounting for mnt_cache entries Vasily Averin
2021-04-28  6:53 ` [PATCH v4 10/16] memcg: enable accounting for pollfd and select bits arrays Vasily Averin
2021-04-28  6:53 ` [PATCH v4 11/16] memcg: enable accounting for signals Vasily Averin
2021-04-28  6:53 ` [PATCH v4 12/16] memcg: enable accounting for posix_timers_cache slab Vasily Averin
2021-05-07 15:48   ` Thomas Gleixner
2021-04-28  6:53 ` [PATCH v4 13/16] memcg: enable accounting for file lock caches Vasily Averin
2021-04-28  6:54 ` [PATCH v4 14/16] memcg: enable accounting for fasync_cache Vasily Averin
2021-04-28  6:54 ` [PATCH v4 15/16] memcg: enable accounting for tty-related objects Vasily Averin
2021-04-28  7:38   ` Greg Kroah-Hartman
2021-04-28  6:54 ` [PATCH v4 16/16] memcg: enable accounting for ldt_struct objects Vasily Averin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=270cd5ae-02ab-925b-a679-9b4e7a6ff76d@virtuozzo.com \
    --to=vvs@virtuozzo.com \
    --cc=cgroups@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shakeelb@google.com \
    --cc=vdavydov.dev@gmail.com \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).