* [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets
2016-09-14 19:48 [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Johannes Weiner
@ 2016-09-14 19:48 ` Johannes Weiner
2016-09-19 12:03 ` Michal Hocko
2016-09-19 15:43 ` Vladimir Davydov
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Johannes Weiner @ 2016-09-14 19:48 UTC (permalink / raw)
To: Andrew Morton, Tejun Heo, David S. Miller
Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
From: Johannes Weiner <jweiner@fb.com>
When a socket is cloned, the associated sock_cgroup_data is duplicated
but not its reference on the cgroup. As a result, the cgroup reference
count will underflow when both sockets are destroyed later on.
Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
Cc: <stable@vger.kernel.org> # 4.5+
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
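Illustration only, not part of the patch: a minimal userspace sketch of
the imbalance described in the changelog above. Every toy_* name is a
hypothetical stand-in (for struct cgroup, struct sock_cgroup_data,
cgroup_get(), cgroup_put(), cgroup_sk_alloc() and cgroup_sk_free()), and
the plain int counter only models the real reference count. It assumes
the clone copies the whole structure, as the changelog states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cgroup and struct sock_cgroup_data. */
struct toy_cgroup { int refcnt; };
struct toy_skcd { struct toy_cgroup *cgrp; };

static void toy_cgroup_get(struct toy_cgroup *c) { c->refcnt++; }
static void toy_cgroup_put(struct toy_cgroup *c) { c->refcnt--; }

/* Models cgroup_sk_alloc() with this patch applied: a skcd that is
 * already filled in can only come from the socket clone path. */
static void toy_sk_alloc(struct toy_skcd *skcd, struct toy_cgroup *cur)
{
	if (skcd->cgrp) {		/* socket clone path */
		toy_cgroup_get(skcd->cgrp);
		return;
	}
	toy_cgroup_get(cur);
	skcd->cgrp = cur;
}

/* Models cgroup_sk_free(): drop the reference the socket holds. */
static void toy_sk_free(struct toy_skcd *skcd)
{
	assert(skcd->cgrp != NULL);
	toy_cgroup_put(skcd->cgrp);
}

/* Allocate a socket, clone it, free both; return the reference count
 * relative to the baseline. Zero means balanced, negative means the
 * count dropped below what the cgroup itself holds. */
int toy_clone_balance(int take_ref_on_clone)
{
	struct toy_cgroup cg = { .refcnt = 1 };	/* baseline reference */
	struct toy_skcd parent = { NULL };
	struct toy_skcd child;

	toy_sk_alloc(&parent, &cg);
	child = parent;			/* the clone copies the data ... */
	if (take_ref_on_clone)
		toy_sk_alloc(&child, &cg);	/* ... and now the reference too */

	toy_sk_free(&parent);
	toy_sk_free(&child);
	return cg.refcnt - 1;
}
```

In this model toy_clone_balance(0) corresponds to the behaviour before
the patch (two puts against one get) and toy_clone_balance(1) to the
behaviour with the extra cgroup_sk_alloc() call in sk_clone_lock().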
kernel/cgroup.c | 6 ++++++
net/core/sock.c | 5 ++++-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 0c4db7908264..b0d727d26fc7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6297,6 +6297,12 @@ void cgroup_sk_alloc(struct sock_cgroup_data *skcd)
if (cgroup_sk_alloc_disabled)
return;
+ /* Socket clone path */
+ if (skcd->val) {
+ cgroup_get(sock_cgroup_ptr(skcd));
+ return;
+ }
+
rcu_read_lock();
while (true) {
diff --git a/net/core/sock.c b/net/core/sock.c
index 51a730485649..038e660ef844 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1340,7 +1340,6 @@ static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority,
if (!try_module_get(prot->owner))
goto out_free_sec;
sk_tx_queue_clear(sk);
- cgroup_sk_alloc(&sk->sk_cgrp_data);
}
return sk;
@@ -1400,6 +1399,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
sock_net_set(sk, net);
atomic_set(&sk->sk_wmem_alloc, 1);
+ cgroup_sk_alloc(&sk->sk_cgrp_data);
sock_update_classid(&sk->sk_cgrp_data);
sock_update_netprioidx(&sk->sk_cgrp_data);
}
@@ -1544,6 +1544,9 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
newsk->sk_priority = 0;
newsk->sk_incoming_cpu = raw_smp_processor_id();
atomic64_set(&newsk->sk_cookie, 0);
+
+ cgroup_sk_alloc(&newsk->sk_cgrp_data);
+
/*
* Before updating sk_refcnt, we must commit prior changes to memory
* (Documentation/RCU/rculist_nulls.txt for details)
--
2.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets
2016-09-14 19:48 ` [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets Johannes Weiner
@ 2016-09-19 12:03 ` Michal Hocko
2016-09-19 15:43 ` Vladimir Davydov
1 sibling, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2016-09-19 12:03 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Tejun Heo, David S. Miller, linux-mm, cgroups,
netdev, linux-kernel, kernel-team, Vladimir Davydov
[Fixup Vladimir's email]
Unfortunately, I am not familiar enough with this code path to give my ack.
On Wed 14-09-16 15:48:45, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@fb.com>
>
> When a socket is cloned, the associated sock_cgroup_data is duplicated
> but not its reference on the cgroup. As a result, the cgroup reference
> count will underflow when both sockets are destroyed later on.
>
> Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
> Cc: <stable@vger.kernel.org> # 4.5+
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> kernel/cgroup.c | 6 ++++++
> net/core/sock.c | 5 ++++-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 0c4db7908264..b0d727d26fc7 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -6297,6 +6297,12 @@ void cgroup_sk_alloc(struct sock_cgroup_data *skcd)
> if (cgroup_sk_alloc_disabled)
> return;
>
> + /* Socket clone path */
> + if (skcd->val) {
> + cgroup_get(sock_cgroup_ptr(skcd));
> + return;
> + }
> +
> rcu_read_lock();
>
> while (true) {
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 51a730485649..038e660ef844 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1340,7 +1340,6 @@ static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority,
> if (!try_module_get(prot->owner))
> goto out_free_sec;
> sk_tx_queue_clear(sk);
> - cgroup_sk_alloc(&sk->sk_cgrp_data);
> }
>
> return sk;
> @@ -1400,6 +1399,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
> sock_net_set(sk, net);
> atomic_set(&sk->sk_wmem_alloc, 1);
>
> + cgroup_sk_alloc(&sk->sk_cgrp_data);
> sock_update_classid(&sk->sk_cgrp_data);
> sock_update_netprioidx(&sk->sk_cgrp_data);
> }
> @@ -1544,6 +1544,9 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> newsk->sk_priority = 0;
> newsk->sk_incoming_cpu = raw_smp_processor_id();
> atomic64_set(&newsk->sk_cookie, 0);
> +
> + cgroup_sk_alloc(&newsk->sk_cgrp_data);
> +
> /*
> * Before updating sk_refcnt, we must commit prior changes to memory
> * (Documentation/RCU/rculist_nulls.txt for details)
> --
> 2.9.3
--
Michal Hocko
SUSE Labs
* Re: [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets
2016-09-14 19:48 ` [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets Johannes Weiner
2016-09-19 12:03 ` Michal Hocko
@ 2016-09-19 15:43 ` Vladimir Davydov
1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2016-09-19 15:43 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Tejun Heo, David S. Miller, Michal Hocko,
linux-mm, cgroups, netdev, linux-kernel, kernel-team
On Wed, Sep 14, 2016 at 03:48:45PM -0400, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@fb.com>
>
> When a socket is cloned, the associated sock_cgroup_data is duplicated
> but not its reference on the cgroup. As a result, the cgroup reference
> count will underflow when both sockets are destroyed later on.
>
> Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
> Cc: <stable@vger.kernel.org> # 4.5+
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
* [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Johannes Weiner
2016-09-14 19:48 ` [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets Johannes Weiner
@ 2016-09-14 19:48 ` Johannes Weiner
2016-09-14 20:45 ` Tejun Heo
` (4 more replies)
2016-09-19 12:01 ` [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Michal Hocko
2016-09-19 15:35 ` Vladimir Davydov
3 siblings, 5 replies; 14+ messages in thread
From: Johannes Weiner @ 2016-09-14 19:48 UTC (permalink / raw)
To: Andrew Morton, Tejun Heo, David S. Miller
Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
The cgroup core and the memory controller need to track socket
ownership for different purposes, but the tracking sites being
entirely different is kind of ugly.
Be a better citizen and rename the memory controller callbacks to
match the cgroup core callbacks, then move them to the same place.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
include/linux/memcontrol.h | 4 ++--
mm/memcontrol.c | 19 +++++++++++--------
net/core/sock.c | 6 +++---
net/ipv4/tcp.c | 2 --
net/ipv4/tcp_ipv4.c | 3 ---
5 files changed, 16 insertions(+), 18 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0710143723bc..ca11b3e6dd65 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,8 +773,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
#endif /* CONFIG_CGROUP_WRITEBACK */
struct sock;
-void sock_update_memcg(struct sock *sk);
-void sock_release_memcg(struct sock *sk);
+void mem_cgroup_sk_alloc(struct sock *sk);
+void mem_cgroup_sk_free(struct sock *sk);
bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
#ifdef CONFIG_MEMCG
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 60bb830abc34..2caf1ee86e78 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2939,7 +2939,7 @@ static int memcg_update_tcp_limit(struct mem_cgroup *memcg, unsigned long limit)
/*
* The active flag needs to be written after the static_key
* update. This is what guarantees that the socket activation
- * function is the last one to run. See sock_update_memcg() for
+ * function is the last one to run. See mem_cgroup_sk_alloc() for
* details, and note that we don't mark any socket as belonging
* to this memcg until that flag is up.
*
@@ -2948,7 +2948,7 @@ static int memcg_update_tcp_limit(struct mem_cgroup *memcg, unsigned long limit)
* as accounted, but the accounting functions are not patched in
* yet, we'll lose accounting.
*
- * We never race with the readers in sock_update_memcg(),
+ * We never race with the readers in mem_cgroup_sk_alloc(),
* because when this value change, the code to process it is not
* patched in yet.
*/
@@ -5651,11 +5651,15 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
EXPORT_SYMBOL(memcg_sockets_enabled_key);
-void sock_update_memcg(struct sock *sk)
+void mem_cgroup_sk_alloc(struct sock *sk)
{
struct mem_cgroup *memcg;
- /* Socket cloning can throw us here with sk_cgrp already
+ if (!mem_cgroup_sockets_enabled)
+ return;
+
+ /*
+ * Socket cloning can throw us here with sk_memcg already
* filled. It won't however, necessarily happen from
* process context. So the test for root memcg given
* the current task's memcg won't help us in this case.
@@ -5680,12 +5684,11 @@ void sock_update_memcg(struct sock *sk)
out:
rcu_read_unlock();
}
-EXPORT_SYMBOL(sock_update_memcg);
-void sock_release_memcg(struct sock *sk)
+void mem_cgroup_sk_free(struct sock *sk)
{
- WARN_ON(!sk->sk_memcg);
- css_put(&sk->sk_memcg->css);
+ if (sk->sk_memcg)
+ css_put(&sk->sk_memcg->css);
}
/**
diff --git a/net/core/sock.c b/net/core/sock.c
index 038e660ef844..c73e28fc9c2a 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1363,6 +1363,7 @@ static void sk_prot_free(struct proto *prot, struct sock *sk)
slab = prot->slab;
cgroup_sk_free(&sk->sk_cgrp_data);
+ mem_cgroup_sk_free(sk);
security_sk_free(sk);
if (slab != NULL)
kmem_cache_free(slab, sk);
@@ -1399,6 +1400,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
sock_net_set(sk, net);
atomic_set(&sk->sk_wmem_alloc, 1);
+ mem_cgroup_sk_alloc(sk);
cgroup_sk_alloc(&sk->sk_cgrp_data);
sock_update_classid(&sk->sk_cgrp_data);
sock_update_netprioidx(&sk->sk_cgrp_data);
@@ -1545,6 +1547,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
newsk->sk_incoming_cpu = raw_smp_processor_id();
atomic64_set(&newsk->sk_cookie, 0);
+ mem_cgroup_sk_alloc(newsk);
cgroup_sk_alloc(&newsk->sk_cgrp_data);
/*
@@ -1569,9 +1572,6 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
sk_set_socket(newsk, NULL);
newsk->sk_wq = NULL;
- if (mem_cgroup_sockets_enabled && sk->sk_memcg)
- sock_update_memcg(newsk);
-
if (newsk->sk_prot->sockets_allocated)
sk_sockets_allocated_inc(newsk);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a13fcb369f52..fc76ef51a5f4 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -421,8 +421,6 @@ void tcp_init_sock(struct sock *sk)
sk->sk_rcvbuf = sysctl_tcp_rmem[1];
local_bh_disable();
- if (mem_cgroup_sockets_enabled)
- sock_update_memcg(sk);
sk_sockets_allocated_inc(sk);
local_bh_enable();
}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 04b989328558..b8fc74a66299 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1872,9 +1872,6 @@ void tcp_v4_destroy_sock(struct sock *sk)
local_bh_disable();
sk_sockets_allocated_dec(sk);
local_bh_enable();
-
- if (mem_cgroup_sockets_enabled && sk->sk_memcg)
- sock_release_memcg(sk);
}
EXPORT_SYMBOL(tcp_v4_destroy_sock);
--
2.9.3
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
@ 2016-09-14 20:45 ` Tejun Heo
2016-09-14 21:12 ` kbuild test robot
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2016-09-14 20:45 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, David S. Miller, Michal Hocko, Vladimir Davydov,
linux-mm, cgroups, netdev, linux-kernel, kernel-team
On Wed, Sep 14, 2016 at 03:48:46PM -0400, Johannes Weiner wrote:
> The cgroup core and the memory controller need to track socket
> ownership for different purposes, but the tracking sites being
> entirely different is kind of ugly.
>
> Be a better citizen and rename the memory controller callbacks to
> match the cgroup core callbacks, then move them to the same place.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
For 1-3,
Acked-by: Tejun Heo <tj@kernel.org>
Thanks.
--
tejun
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
2016-09-14 20:45 ` Tejun Heo
@ 2016-09-14 21:12 ` kbuild test robot
2016-09-14 21:12 ` kbuild test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2016-09-14 21:12 UTC (permalink / raw)
To: Johannes Weiner
Cc: kbuild-all, Andrew Morton, Tejun Heo, David S. Miller,
Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]
Hi Johannes,
[auto build test ERROR on net/master]
[also build test ERROR on v4.8-rc6 next-20160914]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Johannes-Weiner/mm-memcontrol-make-per-cpu-charge-cache-IRQ-safe-for-socket-accounting/20160915-035634
config: parisc-c3000_defconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=parisc
All errors (new ones prefixed by >>):
net/built-in.o: In function `sk_alloc':
>> (.text.sk_alloc+0x88): undefined reference to `mem_cgroup_sk_alloc'
net/built-in.o: In function `__sk_destruct':
>> net/core/sock.o:(.text.__sk_destruct+0xbc): undefined reference to `mem_cgroup_sk_free'
net/built-in.o: In function `sk_clone_lock':
>> (.text.sk_clone_lock+0x164): undefined reference to `mem_cgroup_sk_alloc'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 14256 bytes --]
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
2016-09-14 20:45 ` Tejun Heo
2016-09-14 21:12 ` kbuild test robot
@ 2016-09-14 21:12 ` kbuild test robot
2016-09-15 5:34 ` kbuild test robot
2016-09-19 12:04 ` Michal Hocko
4 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2016-09-14 21:12 UTC (permalink / raw)
To: Johannes Weiner
Cc: kbuild-all, Andrew Morton, Tejun Heo, David S. Miller,
Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
[-- Attachment #1: Type: text/plain, Size: 1539 bytes --]
Hi Johannes,
[auto build test ERROR on net/master]
[also build test ERROR on v4.8-rc6 next-20160914]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Johannes-Weiner/mm-memcontrol-make-per-cpu-charge-cache-IRQ-safe-for-socket-accounting/20160915-035634
config: parisc-b180_defconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=parisc
All errors (new ones prefixed by >>):
net/built-in.o: In function `sk_alloc':
(.text.sk_alloc+0xb4): undefined reference to `mem_cgroup_sk_alloc'
net/built-in.o: In function `__sk_destruct':
>> net/core/.tmp_sock.o:(.text.__sk_destruct+0xdc): undefined reference to `mem_cgroup_sk_free'
net/built-in.o: In function `sk_clone_lock':
(.text.sk_clone_lock+0x184): undefined reference to `mem_cgroup_sk_alloc'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 12799 bytes --]
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
` (2 preceding siblings ...)
2016-09-14 21:12 ` kbuild test robot
@ 2016-09-15 5:34 ` kbuild test robot
2016-09-14 22:17 ` Andrew Morton
2016-09-19 12:04 ` Michal Hocko
4 siblings, 1 reply; 14+ messages in thread
From: kbuild test robot @ 2016-09-15 5:34 UTC (permalink / raw)
To: Johannes Weiner
Cc: kbuild-all, Andrew Morton, Tejun Heo, David S. Miller,
Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
[-- Attachment #1: Type: text/plain, Size: 1465 bytes --]
Hi Johannes,
[auto build test ERROR on net/master]
[also build test ERROR on v4.8-rc6 next-20160914]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Johannes-Weiner/mm-memcontrol-make-per-cpu-charge-cache-IRQ-safe-for-socket-accounting/20160915-035634
config: m68k-sun3_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=m68k
All errors (new ones prefixed by >>):
net/built-in.o: In function `sk_alloc':
>> (.text+0x4076): undefined reference to `mem_cgroup_sk_alloc'
net/built-in.o: In function `__sk_destruct':
>> sock.c:(.text+0x457e): undefined reference to `mem_cgroup_sk_free'
net/built-in.o: In function `sk_clone_lock':
(.text+0x4f1c): undefined reference to `mem_cgroup_sk_alloc'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 11447 bytes --]
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-15 5:34 ` kbuild test robot
@ 2016-09-14 22:17 ` Andrew Morton
2016-09-15 14:27 ` Johannes Weiner
0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2016-09-14 22:17 UTC (permalink / raw)
To: kbuild test robot
Cc: Johannes Weiner, kbuild-all, Tejun Heo, David S. Miller,
Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
On Thu, 15 Sep 2016 13:34:24 +0800 kbuild test robot <lkp@intel.com> wrote:
> Hi Johannes,
>
> [auto build test ERROR on net/master]
> [also build test ERROR on v4.8-rc6 next-20160914]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> [Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
> [Check https://git-scm.com/docs/git-format-patch for more information]
>
> url: https://github.com/0day-ci/linux/commits/Johannes-Weiner/mm-memcontrol-make-per-cpu-charge-cache-IRQ-safe-for-socket-accounting/20160915-035634
> config: m68k-sun3_defconfig (attached as .config)
> compiler: m68k-linux-gcc (GCC) 4.9.0
> reproduce:
> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=m68k
>
> All errors (new ones prefixed by >>):
>
> net/built-in.o: In function `sk_alloc':
> >> (.text+0x4076): undefined reference to `mem_cgroup_sk_alloc'
> net/built-in.o: In function `__sk_destruct':
> >> sock.c:(.text+0x457e): undefined reference to `mem_cgroup_sk_free'
> net/built-in.o: In function `sk_clone_lock':
> (.text+0x4f1c): undefined reference to `mem_cgroup_sk_alloc'
This?
--- a/mm/memcontrol.c~mm-memcontrol-consolidate-cgroup-socket-tracking-fix
+++ a/mm/memcontrol.c
@@ -5655,9 +5655,6 @@ void mem_cgroup_sk_alloc(struct sock *sk
{
struct mem_cgroup *memcg;
- if (!mem_cgroup_sockets_enabled)
- return;
-
/*
* Socket cloning can throw us here with sk_memcg already
* filled. It won't however, necessarily happen from
--- a/net/core/sock.c~mm-memcontrol-consolidate-cgroup-socket-tracking-fix
+++ a/net/core/sock.c
@@ -1385,7 +1385,8 @@ static void sk_prot_free(struct proto *p
slab = prot->slab;
cgroup_sk_free(&sk->sk_cgrp_data);
- mem_cgroup_sk_free(sk);
+ if (mem_cgroup_sockets_enabled)
+ mem_cgroup_sk_free(sk);
security_sk_free(sk);
if (slab != NULL)
kmem_cache_free(slab, sk);
@@ -1422,7 +1423,8 @@ struct sock *sk_alloc(struct net *net, i
sock_net_set(sk, net);
atomic_set(&sk->sk_wmem_alloc, 1);
- mem_cgroup_sk_alloc(sk);
+ if (mem_cgroup_sockets_enabled)
+ mem_cgroup_sk_alloc(sk);
cgroup_sk_alloc(&sk->sk_cgrp_data);
sock_update_classid(&sk->sk_cgrp_data);
sock_update_netprioidx(&sk->sk_cgrp_data);
@@ -1569,7 +1571,8 @@ struct sock *sk_clone_lock(const struct
newsk->sk_incoming_cpu = raw_smp_processor_id();
atomic64_set(&newsk->sk_cookie, 0);
- mem_cgroup_sk_alloc(newsk);
+ if (mem_cgroup_sockets_enabled)
+ mem_cgroup_sk_alloc(newsk);
cgroup_sk_alloc(&newsk->sk_cgrp_data);
/*
_
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 22:17 ` Andrew Morton
@ 2016-09-15 14:27 ` Johannes Weiner
0 siblings, 0 replies; 14+ messages in thread
From: Johannes Weiner @ 2016-09-15 14:27 UTC (permalink / raw)
To: Andrew Morton
Cc: kbuild test robot, kbuild-all, Tejun Heo, David S. Miller,
Michal Hocko, Vladimir Davydov, linux-mm, cgroups, netdev,
linux-kernel, kernel-team
On Wed, Sep 14, 2016 at 03:17:14PM -0700, Andrew Morton wrote:
> On Thu, 15 Sep 2016 13:34:24 +0800 kbuild test robot <lkp@intel.com> wrote:
>
> > Hi Johannes,
> >
> > [auto build test ERROR on net/master]
> > [also build test ERROR on v4.8-rc6 next-20160914]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > [Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
> > [Check https://git-scm.com/docs/git-format-patch for more information]
> >
> > url: https://github.com/0day-ci/linux/commits/Johannes-Weiner/mm-memcontrol-make-per-cpu-charge-cache-IRQ-safe-for-socket-accounting/20160915-035634
> > config: m68k-sun3_defconfig (attached as .config)
> > compiler: m68k-linux-gcc (GCC) 4.9.0
> > reproduce:
> > wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # save the attached .config to linux build tree
> > make.cross ARCH=m68k
> >
> > All errors (new ones prefixed by >>):
> >
> > net/built-in.o: In function `sk_alloc':
> > >> (.text+0x4076): undefined reference to `mem_cgroup_sk_alloc'
> > net/built-in.o: In function `__sk_destruct':
> > >> sock.c:(.text+0x457e): undefined reference to `mem_cgroup_sk_free'
> > net/built-in.o: In function `sk_clone_lock':
> > (.text+0x4f1c): undefined reference to `mem_cgroup_sk_alloc'
>
> This?
Thanks for fixing it up, Andrew.
I think it'd be nicer to declare the dummy functions for !CONFIG_MEMCG;
it also doesn't look like a hotpath that would necessitate the jump
label in that place. Dave, any preference either way?
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
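Illustration only, not part of the patch: a minimal userspace sketch of
the dummy-function pattern proposed above. TOY_CONFIG_MEMCG and all
toy_* names are hypothetical stand-ins for CONFIG_MEMCG,
mem_cgroup_sk_alloc() and mem_cgroup_sk_free(). With the option left
undefined, the callers compile and link against empty inline stubs, so
no out-of-line symbol is needed and the call sites need no guard.

```c
#include <assert.h>

/* Hypothetical stand-in for struct sock; the counter is only here so
 * the effect of the hooks can be observed. */
struct toy_sock { int memcg_refs; };

#ifdef TOY_CONFIG_MEMCG
/* Real implementations would live in a separate object file. */
void toy_memcg_sk_alloc(struct toy_sock *sk);
void toy_memcg_sk_free(struct toy_sock *sk);
#else
/* Feature compiled out: empty stubs keep every caller unconditional. */
static inline void toy_memcg_sk_alloc(struct toy_sock *sk) { (void)sk; }
static inline void toy_memcg_sk_free(struct toy_sock *sk) { (void)sk; }
#endif

/* Models the sk_alloc()/sk_prot_free() call sites, which call the
 * hooks without any #ifdef or runtime check of their own. */
int toy_sock_lifecycle(void)
{
	struct toy_sock sk = { 0 };

	toy_memcg_sk_alloc(&sk);
	toy_memcg_sk_free(&sk);
	return sk.memcg_refs;
}
```

The trade-off against guarding each call with
mem_cgroup_sockets_enabled is that the stubs keep the call sites
unconditional, while the guard relies on the compiler discarding a call
behind a constant-false condition.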
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index ca11b3e6dd65..61d20c17f3b7 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,13 +773,13 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
#endif /* CONFIG_CGROUP_WRITEBACK */
struct sock;
-void mem_cgroup_sk_alloc(struct sock *sk);
-void mem_cgroup_sk_free(struct sock *sk);
bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
#ifdef CONFIG_MEMCG
extern struct static_key_false memcg_sockets_enabled_key;
#define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
+void mem_cgroup_sk_alloc(struct sock *sk);
+void mem_cgroup_sk_free(struct sock *sk);
static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
{
if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
@@ -792,6 +792,8 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
}
#else
#define mem_cgroup_sockets_enabled 0
+static inline void mem_cgroup_sk_alloc(struct sock *sk) { }
+static inline void mem_cgroup_sk_free(struct sock *sk) { }
static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
{
return false;
* Re: [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
` (3 preceding siblings ...)
2016-09-15 5:34 ` kbuild test robot
@ 2016-09-19 12:04 ` Michal Hocko
4 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2016-09-19 12:04 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Tejun Heo, David S. Miller, linux-mm, cgroups,
netdev, linux-kernel, kernel-team, Vladimir Davydov
[Fixup Vladimir's email]
Same here: I do not feel familiar enough with the code to give my ack,
but Vladimir might be in a better position.
On Wed 14-09-16 15:48:46, Johannes Weiner wrote:
> The cgroup core and the memory controller need to track socket
> ownership for different purposes, but the tracking sites being
> entirely different is kind of ugly.
>
> Be a better citizen and rename the memory controller callbacks to
> match the cgroup core callbacks, then move them to the same place.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> include/linux/memcontrol.h | 4 ++--
> mm/memcontrol.c | 19 +++++++++++--------
> net/core/sock.c | 6 +++---
> net/ipv4/tcp.c | 2 --
> net/ipv4/tcp_ipv4.c | 3 ---
> 5 files changed, 16 insertions(+), 18 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0710143723bc..ca11b3e6dd65 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -773,8 +773,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
> #endif /* CONFIG_CGROUP_WRITEBACK */
>
> struct sock;
> -void sock_update_memcg(struct sock *sk);
> -void sock_release_memcg(struct sock *sk);
> +void mem_cgroup_sk_alloc(struct sock *sk);
> +void mem_cgroup_sk_free(struct sock *sk);
> bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
> void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
> #ifdef CONFIG_MEMCG
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 60bb830abc34..2caf1ee86e78 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2939,7 +2939,7 @@ static int memcg_update_tcp_limit(struct mem_cgroup *memcg, unsigned long limit)
> /*
> * The active flag needs to be written after the static_key
> * update. This is what guarantees that the socket activation
> - * function is the last one to run. See sock_update_memcg() for
> + * function is the last one to run. See mem_cgroup_sk_alloc() for
> * details, and note that we don't mark any socket as belonging
> * to this memcg until that flag is up.
> *
> @@ -2948,7 +2948,7 @@ static int memcg_update_tcp_limit(struct mem_cgroup *memcg, unsigned long limit)
> * as accounted, but the accounting functions are not patched in
> * yet, we'll lose accounting.
> *
> - * We never race with the readers in sock_update_memcg(),
> + * We never race with the readers in mem_cgroup_sk_alloc(),
> * because when this value change, the code to process it is not
> * patched in yet.
> */
> @@ -5651,11 +5651,15 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
> DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
> EXPORT_SYMBOL(memcg_sockets_enabled_key);
>
> -void sock_update_memcg(struct sock *sk)
> +void mem_cgroup_sk_alloc(struct sock *sk)
> {
> struct mem_cgroup *memcg;
>
> - /* Socket cloning can throw us here with sk_cgrp already
> + if (!mem_cgroup_sockets_enabled)
> + return;
> +
> + /*
> + * Socket cloning can throw us here with sk_memcg already
> * filled. It won't however, necessarily happen from
> * process context. So the test for root memcg given
> * the current task's memcg won't help us in this case.
> @@ -5680,12 +5684,11 @@ void sock_update_memcg(struct sock *sk)
> out:
> rcu_read_unlock();
> }
> -EXPORT_SYMBOL(sock_update_memcg);
>
> -void sock_release_memcg(struct sock *sk)
> +void mem_cgroup_sk_free(struct sock *sk)
> {
> - WARN_ON(!sk->sk_memcg);
> - css_put(&sk->sk_memcg->css);
> + if (sk->sk_memcg)
> + css_put(&sk->sk_memcg->css);
> }
>
> /**
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 038e660ef844..c73e28fc9c2a 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1363,6 +1363,7 @@ static void sk_prot_free(struct proto *prot, struct sock *sk)
> slab = prot->slab;
>
> cgroup_sk_free(&sk->sk_cgrp_data);
> + mem_cgroup_sk_free(sk);
> security_sk_free(sk);
> if (slab != NULL)
> kmem_cache_free(slab, sk);
> @@ -1399,6 +1400,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
> sock_net_set(sk, net);
> atomic_set(&sk->sk_wmem_alloc, 1);
>
> + mem_cgroup_sk_alloc(sk);
> cgroup_sk_alloc(&sk->sk_cgrp_data);
> sock_update_classid(&sk->sk_cgrp_data);
> sock_update_netprioidx(&sk->sk_cgrp_data);
> @@ -1545,6 +1547,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> newsk->sk_incoming_cpu = raw_smp_processor_id();
> atomic64_set(&newsk->sk_cookie, 0);
>
> + mem_cgroup_sk_alloc(newsk);
> cgroup_sk_alloc(&newsk->sk_cgrp_data);
>
> /*
> @@ -1569,9 +1572,6 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> sk_set_socket(newsk, NULL);
> newsk->sk_wq = NULL;
>
> - if (mem_cgroup_sockets_enabled && sk->sk_memcg)
> - sock_update_memcg(newsk);
> -
> if (newsk->sk_prot->sockets_allocated)
> sk_sockets_allocated_inc(newsk);
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index a13fcb369f52..fc76ef51a5f4 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -421,8 +421,6 @@ void tcp_init_sock(struct sock *sk)
> sk->sk_rcvbuf = sysctl_tcp_rmem[1];
>
> local_bh_disable();
> - if (mem_cgroup_sockets_enabled)
> - sock_update_memcg(sk);
> sk_sockets_allocated_inc(sk);
> local_bh_enable();
> }
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 04b989328558..b8fc74a66299 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1872,9 +1872,6 @@ void tcp_v4_destroy_sock(struct sock *sk)
> local_bh_disable();
> sk_sockets_allocated_dec(sk);
> local_bh_enable();
> -
> - if (mem_cgroup_sockets_enabled && sk->sk_memcg)
> - sock_release_memcg(sk);
> }
> EXPORT_SYMBOL(tcp_v4_destroy_sock);
>
> --
> 2.9.3
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting
2016-09-14 19:48 [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Johannes Weiner
2016-09-14 19:48 ` [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets Johannes Weiner
2016-09-14 19:48 ` [PATCH 3/3] mm: memcontrol: consolidate cgroup socket tracking Johannes Weiner
@ 2016-09-19 12:01 ` Michal Hocko
2016-09-19 15:35 ` Vladimir Davydov
3 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2016-09-19 12:01 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Tejun Heo, David S. Miller, linux-mm, cgroups,
netdev, linux-kernel, kernel-team, Vladimir Davydov
[Fixup Vladimir's email]
On Wed 14-09-16 15:48:44, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@fb.com>
>
> During cgroup2 rollout into production, we started encountering css
> refcount underflows and css access crashes in the memory controller.
> Splitting the heavily shared css reference counter into logical users
> narrowed the imbalance down to the cgroup2 socket memory accounting.
>
> The problem turns out to be the per-cpu charge cache. Cgroup1 had a
> separate socket counter, but the new cgroup2 socket accounting goes
> through the common charge path that uses a shared per-cpu cache for
> all memory that is being tracked. Those caches are safe against
> scheduling preemption, but not against interrupts - such as the newly
> added packet receive path. When cache draining is interrupted by
> network RX taking pages out of the cache, the resuming drain operation
> will put references of in-use pages, thus causing the imbalance.
>
> Disable IRQs during all per-cpu charge cache operations.
>
> Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller")
> Cc: <stable@vger.kernel.org> # 4.5+
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/memcontrol.c | 31 ++++++++++++++++++++++---------
> 1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7a8d6624758a..60bb830abc34 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1710,17 +1710,22 @@ static DEFINE_MUTEX(percpu_charge_mutex);
> static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> {
> struct memcg_stock_pcp *stock;
> + unsigned long flags;
> bool ret = false;
>
> if (nr_pages > CHARGE_BATCH)
> return ret;
>
> - stock = &get_cpu_var(memcg_stock);
> + local_irq_save(flags);
> +
> + stock = this_cpu_ptr(&memcg_stock);
> if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
> stock->nr_pages -= nr_pages;
> ret = true;
> }
> - put_cpu_var(memcg_stock);
> +
> + local_irq_restore(flags);
> +
> return ret;
> }
>
> @@ -1741,15 +1746,18 @@ static void drain_stock(struct memcg_stock_pcp *stock)
> stock->cached = NULL;
> }
>
> -/*
> - * This must be called under preempt disabled or must be called by
> - * a thread which is pinned to local cpu.
> - */
> static void drain_local_stock(struct work_struct *dummy)
> {
> - struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
> + struct memcg_stock_pcp *stock;
> + unsigned long flags;
> +
> + local_irq_save(flags);
> +
> + stock = this_cpu_ptr(&memcg_stock);
> drain_stock(stock);
> clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
> +
> + local_irq_restore(flags);
> }
>
> /*
> @@ -1758,14 +1766,19 @@ static void drain_local_stock(struct work_struct *dummy)
> */
> static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> {
> - struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
> + struct memcg_stock_pcp *stock;
> + unsigned long flags;
> +
> + local_irq_save(flags);
>
> + stock = this_cpu_ptr(&memcg_stock);
> if (stock->cached != memcg) { /* reset if necessary */
> drain_stock(stock);
> stock->cached = memcg;
> }
> stock->nr_pages += nr_pages;
> - put_cpu_var(memcg_stock);
> +
> + local_irq_restore(flags);
> }
>
> /*
> --
> 2.9.3
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting
2016-09-14 19:48 [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Johannes Weiner
` (2 preceding siblings ...)
2016-09-19 12:01 ` [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting Michal Hocko
@ 2016-09-19 15:35 ` Vladimir Davydov
3 siblings, 0 replies; 14+ messages in thread
From: Vladimir Davydov @ 2016-09-19 15:35 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Tejun Heo, David S. Miller, Michal Hocko,
linux-mm, cgroups, netdev, linux-kernel, kernel-team
On Wed, Sep 14, 2016 at 03:48:44PM -0400, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@fb.com>
>
> During cgroup2 rollout into production, we started encountering css
> refcount underflows and css access crashes in the memory controller.
> Splitting the heavily shared css reference counter into logical users
> narrowed the imbalance down to the cgroup2 socket memory accounting.
>
> The problem turns out to be the per-cpu charge cache. Cgroup1 had a
> separate socket counter, but the new cgroup2 socket accounting goes
> through the common charge path that uses a shared per-cpu cache for
> all memory that is being tracked. Those caches are safe against
> scheduling preemption, but not against interrupts - such as the newly
> added packet receive path. When cache draining is interrupted by
> network RX taking pages out of the cache, the resuming drain operation
> will put references of in-use pages, thus causing the imbalance.
>
> Disable IRQs during all per-cpu charge cache operations.
>
> Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller")
> Cc: <stable@vger.kernel.org> # 4.5+
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
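
For readers following the changelog above, the race can be sketched with a
small userspace model. This is not kernel code: the names stock_pages, refs,
in_use, rx_interrupt() and run_drain() are invented for illustration, and the
irqs_off flag only stands in for local_irq_save()/local_irq_restore(). It
assumes a single CPU, a single cgroup, and one reference held per charged page.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's charge cache; all names here are hypothetical. */
static unsigned int stock_pages; /* pre-charged pages sitting in the cache */
static long refs;                /* references held, one per charged page */
static unsigned int in_use;      /* pages handed out of the cache to users */
static bool irqs_off;            /* stands in for local_irq_save() */
static bool irq_pending;         /* interrupt deferred while irqs_off is set */

/* Simulated RX interrupt: takes one page out of the cache if it can. */
static void rx_interrupt(void)
{
	if (irqs_off) {
		irq_pending = true;
		return;
	}
	if (stock_pages > 0) {
		stock_pages--;
		in_use++;
	}
}

/* Drain the cache, with an interrupt arriving in the middle of the drain. */
static void drain(bool irq_safe)
{
	unsigned int snapshot;

	irqs_off = irq_safe;
	snapshot = stock_pages;  /* drain decides how many refs to put */
	rx_interrupt();          /* RX fires before the cache is emptied */
	refs -= (long)snapshot;  /* puts refs for pages that may now be in use */
	stock_pages = 0;
	assert(stock_pages == 0);
	irqs_off = false;

	if (irq_pending) {       /* deferred interrupt runs after the drain */
		irq_pending = false;
		rx_interrupt();
	}
}

/* Returns refs minus in-use pages; a negative value is the imbalance. */
long run_drain(bool irq_safe)
{
	stock_pages = 4;
	refs = 4;
	in_use = 0;
	irqs_off = false;
	irq_pending = false;
	drain(irq_safe);
	return refs - (long)in_use;
}
```

In this model run_drain(false) returns -1: the drain puts a reference for a
page the interrupt already took. run_drain(true) returns 0, because the
interrupt is held off until the drain has finished and then finds the cache
empty.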
^ permalink raw reply [flat|nested] 14+ messages in thread