* [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
@ 2017-08-29 10:01 ` Roman Gushchin
  0 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-08-29 10:01 UTC (permalink / raw)
  To: linux-mm
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	cgroups, kernel-team, linux-kernel

We've observed a quite noticeable performance overhead on some hosts
with significant network traffic when socket memory accounting
is enabled.

Perf top shows that the socket memory uncharging path is hot:
  2.13%  [kernel]                [k] page_counter_cancel
  1.14%  [kernel]                [k] __sk_mem_reduce_allocated
  1.14%  [kernel]                [k] _raw_spin_lock
  0.87%  [kernel]                [k] _raw_spin_lock_irqsave
  0.84%  [kernel]                [k] tcp_ack
  0.84%  [kernel]                [k] ixgbe_poll
  0.83%  < workload >
  0.82%  [kernel]                [k] enqueue_entity
  0.68%  [kernel]                [k] __fget
  0.68%  [kernel]                [k] tcp_delack_timer_handler
  0.67%  [kernel]                [k] __schedule
  0.60%  < workload >
  0.59%  [kernel]                [k] __inet6_lookup_established
  0.55%  [kernel]                [k] __switch_to
  0.55%  [kernel]                [k] menu_select
  0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned

To address this issue, the existing per-cpu stock infrastructure
can be used.

refill_stock() can be called from mem_cgroup_uncharge_skmem()
to move the charge to a per-cpu stock instead of calling the atomic
page_counter_uncharge().

To prevent uncontrolled growth of the per-cpu stocks,
refill_stock() will explicitly drain the cached charge
if the cached value exceeds CHARGE_BATCH.
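
For context, the per-cpu stock machinery this change builds on looks
roughly like the following (a simplified sketch of mm/memcontrol.c
from this era; locking and some details are paraphrased):

  struct memcg_stock_pcp {
          struct mem_cgroup *cached;   /* memcg owning the cached charge */
          unsigned int nr_pages;       /* pre-charged pages cached on this cpu */
          struct work_struct work;
          unsigned long flags;
  };
  static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);

  /* Return the cached charge to the owning memcg's page counters. */
  static void drain_stock(struct memcg_stock_pcp *stock)
  {
          struct mem_cgroup *old = stock->cached;

          if (stock->nr_pages) {
                  page_counter_uncharge(&old->memory, stock->nr_pages);
                  if (do_memsw_account())
                          page_counter_uncharge(&old->memsw, stock->nr_pages);
                  css_put_many(&old->css, stock->nr_pages);
                  stock->nr_pages = 0;
          }
          stock->cached = NULL;
  }

Uncharging into the stock therefore just parks the pages (and the css
references taken at charge time) on the local CPU until drain_stock()
runs.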

This allows us to significantly reduce the overhead:
  1.21%  [kernel]                [k] _raw_spin_lock
  1.01%  [kernel]                [k] ixgbe_poll
  0.92%  [kernel]                [k] _raw_spin_lock_irqsave
  0.90%  [kernel]                [k] enqueue_entity
  0.86%  [kernel]                [k] tcp_ack
  0.85%  < workload >
  0.74%  perf-11120.map          [.] 0x000000000061bf24
  0.73%  [kernel]                [k] __schedule
  0.67%  [kernel]                [k] __fget
  0.63%  [kernel]                [k] __inet6_lookup_established
  0.62%  [kernel]                [k] menu_select
  0.59%  < workload >
  0.59%  [kernel]                [k] __switch_to
  0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: cgroups@vger.kernel.org
Cc: kernel-team@fb.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b9cf3cf4a3d0..a69d23082abf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 	}
 	stock->nr_pages += nr_pages;
 
+	if (stock->nr_pages > CHARGE_BATCH)
+		drain_stock(stock);
+
 	local_irq_restore(flags);
 }
 
@@ -5886,8 +5889,7 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 
 	this_cpu_sub(memcg->stat->count[MEMCG_SOCK], nr_pages);
 
-	page_counter_uncharge(&memcg->memory, nr_pages);
-	css_put_many(&memcg->css, nr_pages);
+	refill_stock(memcg, nr_pages);
 }
 
 static int __init cgroup_memory(char *s)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-29 10:01 ` Roman Gushchin
@ 2017-08-29 19:26   ` Johannes Weiner
  -1 siblings, 0 replies; 31+ messages in thread
From: Johannes Weiner @ 2017-08-29 19:26 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Michal Hocko, Vladimir Davydov, cgroups, kernel-team,
	linux-kernel

On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> We've noticed a quite sensible performance overhead on some hosts
> with significant network traffic when socket memory accounting
> is enabled.
> 
> Perf top shows that socket memory uncharging path is hot:
>   2.13%  [kernel]                [k] page_counter_cancel
>   1.14%  [kernel]                [k] __sk_mem_reduce_allocated
>   1.14%  [kernel]                [k] _raw_spin_lock
>   0.87%  [kernel]                [k] _raw_spin_lock_irqsave
>   0.84%  [kernel]                [k] tcp_ack
>   0.84%  [kernel]                [k] ixgbe_poll
>   0.83%  < workload >
>   0.82%  [kernel]                [k] enqueue_entity
>   0.68%  [kernel]                [k] __fget
>   0.68%  [kernel]                [k] tcp_delack_timer_handler
>   0.67%  [kernel]                [k] __schedule
>   0.60%  < workload >
>   0.59%  [kernel]                [k] __inet6_lookup_established
>   0.55%  [kernel]                [k] __switch_to
>   0.55%  [kernel]                [k] menu_select
>   0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned
> 
> To address this issue, the existing per-cpu stock infrastructure
> can be used.
> 
> refill_stock() can be called from mem_cgroup_uncharge_skmem()
> to move charge to a per-cpu stock instead of calling atomic
> page_counter_uncharge().
> 
> To prevent the uncontrolled growth of per-cpu stocks,
> refill_stock() will explicitly drain the cached charge,
> if the cached value exceeds CHARGE_BATCH.
> 
> This allows significantly optimize the load:
>   1.21%  [kernel]                [k] _raw_spin_lock
>   1.01%  [kernel]                [k] ixgbe_poll
>   0.92%  [kernel]                [k] _raw_spin_lock_irqsave
>   0.90%  [kernel]                [k] enqueue_entity
>   0.86%  [kernel]                [k] tcp_ack
>   0.85%  < workload >
>   0.74%  perf-11120.map          [.] 0x000000000061bf24
>   0.73%  [kernel]                [k] __schedule
>   0.67%  [kernel]                [k] __fget
>   0.63%  [kernel]                [k] __inet6_lookup_established
>   0.62%  [kernel]                [k] menu_select
>   0.59%  < workload >
>   0.59%  [kernel]                [k] __switch_to
>   0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: cgroups@vger.kernel.org
> Cc: kernel-team@fb.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Neat!

As far as other types of pages go: page cache and anon are already
batched pretty well, but I think kmem might benefit from this
too. Have you considered using the stock in memcg_kmem_uncharge()?
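
(Purely to illustrate the suggestion -- a hypothetical sketch, not part
of this patch or of the current tree -- the kmem uncharge path could be
converted the same way, relying on refill_stock()/drain_stock() to keep
handling the memory/memsw counters and the css references:)

  /* Hypothetical sketch only; PageKmemcg handling etc. omitted. */
  void memcg_kmem_uncharge(struct page *page, int order)
  {
          struct mem_cgroup *memcg = page->mem_cgroup;
          unsigned int nr_pages = 1 << order;

          if (!memcg)
                  return;

          if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
                  page_counter_uncharge(&memcg->kmem, nr_pages);

          /* was: page_counter_uncharge(&memcg->memory, ...) + css_put_many() */
          refill_stock(memcg, nr_pages);

          page->mem_cgroup = NULL;
  }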

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-29 19:26   ` Johannes Weiner
  (?)
@ 2017-08-30 10:55     ` Roman Gushchin
  -1 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-08-30 10:55 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: linux-mm, Michal Hocko, Vladimir Davydov, cgroups, kernel-team,
	linux-kernel

On Tue, Aug 29, 2017 at 03:26:21PM -0400, Johannes Weiner wrote:
> On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> > We've noticed a quite sensible performance overhead on some hosts
> > with significant network traffic when socket memory accounting
> > is enabled.
> > 
> > Perf top shows that socket memory uncharging path is hot:
> >   2.13%  [kernel]                [k] page_counter_cancel
> >   1.14%  [kernel]                [k] __sk_mem_reduce_allocated
> >   1.14%  [kernel]                [k] _raw_spin_lock
> >   0.87%  [kernel]                [k] _raw_spin_lock_irqsave
> >   0.84%  [kernel]                [k] tcp_ack
> >   0.84%  [kernel]                [k] ixgbe_poll
> >   0.83%  < workload >
> >   0.82%  [kernel]                [k] enqueue_entity
> >   0.68%  [kernel]                [k] __fget
> >   0.68%  [kernel]                [k] tcp_delack_timer_handler
> >   0.67%  [kernel]                [k] __schedule
> >   0.60%  < workload >
> >   0.59%  [kernel]                [k] __inet6_lookup_established
> >   0.55%  [kernel]                [k] __switch_to
> >   0.55%  [kernel]                [k] menu_select
> >   0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned
> > 
> > To address this issue, the existing per-cpu stock infrastructure
> > can be used.
> > 
> > refill_stock() can be called from mem_cgroup_uncharge_skmem()
> > to move charge to a per-cpu stock instead of calling atomic
> > page_counter_uncharge().
> > 
> > To prevent the uncontrolled growth of per-cpu stocks,
> > refill_stock() will explicitly drain the cached charge,
> > if the cached value exceeds CHARGE_BATCH.
> > 
> > This allows significantly optimize the load:
> >   1.21%  [kernel]                [k] _raw_spin_lock
> >   1.01%  [kernel]                [k] ixgbe_poll
> >   0.92%  [kernel]                [k] _raw_spin_lock_irqsave
> >   0.90%  [kernel]                [k] enqueue_entity
> >   0.86%  [kernel]                [k] tcp_ack
> >   0.85%  < workload >
> >   0.74%  perf-11120.map          [.] 0x000000000061bf24
> >   0.73%  [kernel]                [k] __schedule
> >   0.67%  [kernel]                [k] __fget
> >   0.63%  [kernel]                [k] __inet6_lookup_established
> >   0.62%  [kernel]                [k] menu_select
> >   0.59%  < workload >
> >   0.59%  [kernel]                [k] __switch_to
> >   0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> > Cc: cgroups@vger.kernel.org
> > Cc: kernel-team@fb.com
> > Cc: linux-mm@kvack.org
> > Cc: linux-kernel@vger.kernel.org
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Neat!
> 
> As far as other types of pages go: page cache and anon are already
> batched pretty well, but I think kmem might benefit from this
> too. Have you considered using the stock in memcg_kmem_uncharge()?

Good idea!
I'll try to find an appropriate test case and check whether it really
brings any benefit. If so, I'll prepare a patch.

Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-29 19:26   ` Johannes Weiner
  (?)
@ 2017-08-30 10:57     ` Roman Gushchin
  -1 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-08-30 10:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, linux-mm, Michal Hocko, Vladimir Davydov,
	cgroups, kernel-team, linux-kernel

On Tue, Aug 29, 2017 at 03:26:21PM -0400, Johannes Weiner wrote:
> On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> > We've noticed a quite sensible performance overhead on some hosts
> > with significant network traffic when socket memory accounting
> > is enabled.
> > 
> > Perf top shows that socket memory uncharging path is hot:
> >   2.13%  [kernel]                [k] page_counter_cancel
> >   1.14%  [kernel]                [k] __sk_mem_reduce_allocated
> >   1.14%  [kernel]                [k] _raw_spin_lock
> >   0.87%  [kernel]                [k] _raw_spin_lock_irqsave
> >   0.84%  [kernel]                [k] tcp_ack
> >   0.84%  [kernel]                [k] ixgbe_poll
> >   0.83%  < workload >
> >   0.82%  [kernel]                [k] enqueue_entity
> >   0.68%  [kernel]                [k] __fget
> >   0.68%  [kernel]                [k] tcp_delack_timer_handler
> >   0.67%  [kernel]                [k] __schedule
> >   0.60%  < workload >
> >   0.59%  [kernel]                [k] __inet6_lookup_established
> >   0.55%  [kernel]                [k] __switch_to
> >   0.55%  [kernel]                [k] menu_select
> >   0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned
> > 
> > To address this issue, the existing per-cpu stock infrastructure
> > can be used.
> > 
> > refill_stock() can be called from mem_cgroup_uncharge_skmem()
> > to move charge to a per-cpu stock instead of calling atomic
> > page_counter_uncharge().
> > 
> > To prevent the uncontrolled growth of per-cpu stocks,
> > refill_stock() will explicitly drain the cached charge,
> > if the cached value exceeds CHARGE_BATCH.
> > 
> > This allows significantly optimize the load:
> >   1.21%  [kernel]                [k] _raw_spin_lock
> >   1.01%  [kernel]                [k] ixgbe_poll
> >   0.92%  [kernel]                [k] _raw_spin_lock_irqsave
> >   0.90%  [kernel]                [k] enqueue_entity
> >   0.86%  [kernel]                [k] tcp_ack
> >   0.85%  < workload >
> >   0.74%  perf-11120.map          [.] 0x000000000061bf24
> >   0.73%  [kernel]                [k] __schedule
> >   0.67%  [kernel]                [k] __fget
> >   0.63%  [kernel]                [k] __inet6_lookup_established
> >   0.62%  [kernel]                [k] menu_select
> >   0.59%  < workload >
> >   0.59%  [kernel]                [k] __switch_to
> >   0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> > Cc: cgroups@vger.kernel.org
> > Cc: kernel-team@fb.com
> > Cc: linux-mm@kvack.org
> > Cc: linux-kernel@vger.kernel.org
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Neat!

Hi Andrew!

Can you please pick up this patch?

Thank you!

Roman

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-29 10:01 ` Roman Gushchin
  (?)
@ 2017-08-30 12:36   ` Michal Hocko
  -1 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2017-08-30 12:36 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
[...]
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b9cf3cf4a3d0..a69d23082abf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	}
>  	stock->nr_pages += nr_pages;
>  
> +	if (stock->nr_pages > CHARGE_BATCH)
> +		drain_stock(stock);
> +
>  	local_irq_restore(flags);
>  }

Why do we need this? In other words, why cannot we rely on draining we
already do?

>  
> @@ -5886,8 +5889,7 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>  
>  	this_cpu_sub(memcg->stat->count[MEMCG_SOCK], nr_pages);
>  
> -	page_counter_uncharge(&memcg->memory, nr_pages);
> -	css_put_many(&memcg->css, nr_pages);
> +	refill_stock(memcg, nr_pages);
>  }

This makes sense to me.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-30 12:36   ` Michal Hocko
@ 2017-08-30 12:44     ` Roman Gushchin
  -1 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-08-30 12:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> [...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index b9cf3cf4a3d0..a69d23082abf 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  	}
> >  	stock->nr_pages += nr_pages;
> >  
> > +	if (stock->nr_pages > CHARGE_BATCH)
> > +		drain_stock(stock);
> > +
> >  	local_irq_restore(flags);
> >  }
> 
> Why do we need this? In other words, why cannot we rely on draining we
> already do?

The existing draining depends on memory pressure, so to keep
the accounting (which we expose to the user) reasonably accurate
even without memory pressure, we need to limit the size of the per-cpu stocks.

Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-30 12:44     ` Roman Gushchin
@ 2017-08-30 12:55       ` Michal Hocko
  -1 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2017-08-30 12:55 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > [...]
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > >  	}
> > >  	stock->nr_pages += nr_pages;
> > >  
> > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > +		drain_stock(stock);
> > > +
> > >  	local_irq_restore(flags);
> > >  }
> > 
> > Why do we need this? In other words, why cannot we rely on draining we
> > already do?
> 
> The existing draining depends on memory pressure, so to keep
> the accounting (which we expose to a user) reasonable accurate
> even without memory pressure, we need to limit the size of per-cpu stocks.

Why don't we need this for regular page charges? Or maybe we do, but that
sounds like a separate and unrelated fix to me.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-30 12:55       ` Michal Hocko
@ 2017-08-30 12:57         ` Roman Gushchin
  -1 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-08-30 12:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Wed, Aug 30, 2017 at 02:55:43PM +0200, Michal Hocko wrote:
> On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> > On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > > [...]
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > > >  	}
> > > >  	stock->nr_pages += nr_pages;
> > > >  
> > > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > > +		drain_stock(stock);
> > > > +
> > > >  	local_irq_restore(flags);
> > > >  }
> > > 
> > > Why do we need this? In other words, why cannot we rely on draining we
> > > already do?
> > 
> > The existing draining depends on memory pressure, so to keep
> > the accounting (which we expose to a user) reasonable accurate
> > even without memory pressure, we need to limit the size of per-cpu stocks.
> 
> Why don't we need this for regular page charges? Or maybe we do but that
> sounds like a seprate and an unrealted fix to me.

Because we never refill more than CHARGE_BATCH.
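
(Roughly, the regular charge path looks like this -- a simplified
sketch of the existing code, not part of this patch -- so a single
charge can add at most CHARGE_BATCH - nr_pages to a stock:)

  static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
  {
          struct memcg_stock_pcp *stock;
          unsigned long flags;
          bool ret = false;

          if (nr_pages > CHARGE_BATCH)
                  return ret;

          local_irq_save(flags);

          stock = this_cpu_ptr(&memcg_stock);
          if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
                  stock->nr_pages -= nr_pages;
                  ret = true;
          }

          local_irq_restore(flags);
          return ret;
  }

  /* ...and try_charge() only refills the remainder of a single batch: */
          unsigned int batch = max(CHARGE_BATCH, nr_pages);

          if (consume_stock(memcg, nr_pages))
                  return 0;
          /* charge "batch" pages to the page counters, then: */
          if (batch > nr_pages)
                  refill_stock(memcg, batch - nr_pages);

The skmem uncharge path, in contrast, refills the stock without ever
consuming from it, so without the explicit drain above CHARGE_BATCH
nothing would bound its size.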

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-30 12:57         ` Roman Gushchin
@ 2017-08-30 14:23           ` Michal Hocko
  -1 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2017-08-30 14:23 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Wed 30-08-17 13:57:29, Roman Gushchin wrote:
> On Wed, Aug 30, 2017 at 02:55:43PM +0200, Michal Hocko wrote:
> > On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> > > On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > > > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > > > [...]
> > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > > > --- a/mm/memcontrol.c
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > > > >  	}
> > > > >  	stock->nr_pages += nr_pages;
> > > > >  
> > > > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > > > +		drain_stock(stock);
> > > > > +
> > > > >  	local_irq_restore(flags);
> > > > >  }
> > > > 
> > > > Why do we need this? In other words, why cannot we rely on draining we
> > > > already do?
> > > 
> > > The existing draining depends on memory pressure, so to keep
> > > the accounting (which we expose to a user) reasonable accurate
> > > even without memory pressure, we need to limit the size of per-cpu stocks.
> > 
> > Why don't we need this for regular page charges? Or maybe we do but that
> > sounds like a seprate and an unrealted fix to me.
> 
> Because we never refill more than CHARGE_BATCH.

You are right that a single process will not, but try_charge() is a
preemptible context, so multiple processes might pass consume_stock()
and then charge N*CHARGE_BATCH in total. But I agree that this is quite
unlikely, so a separate patch is probably not worth it.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-29 10:01 ` Roman Gushchin
@ 2017-08-30 14:24   ` Michal Hocko
  -1 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2017-08-30 14:24 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, cgroups,
	kernel-team, linux-kernel

On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> We've noticed a quite sensible performance overhead on some hosts
> with significant network traffic when socket memory accounting
> is enabled.
> 
> Perf top shows that socket memory uncharging path is hot:
>   2.13%  [kernel]                [k] page_counter_cancel
>   1.14%  [kernel]                [k] __sk_mem_reduce_allocated
>   1.14%  [kernel]                [k] _raw_spin_lock
>   0.87%  [kernel]                [k] _raw_spin_lock_irqsave
>   0.84%  [kernel]                [k] tcp_ack
>   0.84%  [kernel]                [k] ixgbe_poll
>   0.83%  < workload >
>   0.82%  [kernel]                [k] enqueue_entity
>   0.68%  [kernel]                [k] __fget
>   0.68%  [kernel]                [k] tcp_delack_timer_handler
>   0.67%  [kernel]                [k] __schedule
>   0.60%  < workload >
>   0.59%  [kernel]                [k] __inet6_lookup_established
>   0.55%  [kernel]                [k] __switch_to
>   0.55%  [kernel]                [k] menu_select
>   0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned
> 
> To address this issue, the existing per-cpu stock infrastructure
> can be used.
> 
> refill_stock() can be called from mem_cgroup_uncharge_skmem()
> to move charge to a per-cpu stock instead of calling atomic
> page_counter_uncharge().
> 
> To prevent the uncontrolled growth of per-cpu stocks,
> refill_stock() will explicitly drain the cached charge,
> if the cached value exceeds CHARGE_BATCH.
> 
> This allows significantly optimize the load:
>   1.21%  [kernel]                [k] _raw_spin_lock
>   1.01%  [kernel]                [k] ixgbe_poll
>   0.92%  [kernel]                [k] _raw_spin_lock_irqsave
>   0.90%  [kernel]                [k] enqueue_entity
>   0.86%  [kernel]                [k] tcp_ack
>   0.85%  < workload >
>   0.74%  perf-11120.map          [.] 0x000000000061bf24
>   0.73%  [kernel]                [k] __schedule
>   0.67%  [kernel]                [k] __fget
>   0.63%  [kernel]                [k] __inet6_lookup_established
>   0.62%  [kernel]                [k] menu_select
>   0.59%  < workload >
>   0.59%  [kernel]                [k] __switch_to
>   0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: cgroups@vger.kernel.org
> Cc: kernel-team@fb.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b9cf3cf4a3d0..a69d23082abf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	}
>  	stock->nr_pages += nr_pages;
>  
> +	if (stock->nr_pages > CHARGE_BATCH)
> +		drain_stock(stock);
> +
>  	local_irq_restore(flags);
>  }
>  
> @@ -5886,8 +5889,7 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>  
>  	this_cpu_sub(memcg->stat->count[MEMCG_SOCK], nr_pages);
>  
> -	page_counter_uncharge(&memcg->memory, nr_pages);
> -	css_put_many(&memcg->css, nr_pages);
> +	refill_stock(memcg, nr_pages);
>  }
>  
>  static int __init cgroup_memory(char *s)
> -- 
> 2.13.5

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread
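
For readers who want the whole function in one place, here is a sketch of
refill_stock() with the patch applied. The two added lines are taken from
the hunk quoted above; the surrounding lines are paraphrased from
mm/memcontrol.c of this era and may differ slightly from the exact source.

	static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
	{
		struct memcg_stock_pcp *stock;
		unsigned long flags;

		local_irq_save(flags);

		stock = this_cpu_ptr(&memcg_stock);
		if (stock->cached != memcg) {	/* cached charge belongs to another memcg */
			drain_stock(stock);	/* flush it back to the page counters */
			stock->cached = memcg;
		}
		stock->nr_pages += nr_pages;

		/* Added by this patch: cap the cached charge at CHARGE_BATCH. */
		if (stock->nr_pages > CHARGE_BATCH)
			drain_stock(stock);

		local_irq_restore(flags);
	}

With this cap, mem_cgroup_uncharge_skmem() can hand pages back through the
per-cpu stock while the globally visible counters stay within CHARGE_BATCH
pages per CPU of the true value.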

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-08-30 10:55     ` Roman Gushchin
@ 2017-09-07 18:44       ` Shakeel Butt
  -1 siblings, 0 replies; 31+ messages in thread
From: Shakeel Butt @ 2017-09-07 18:44 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Johannes Weiner, Linux MM, Michal Hocko, Vladimir Davydov,
	cgroups, kernel-team, LKML

>> As far as other types of pages go: page cache and anon are already
>> batched pretty well, but I think kmem might benefit from this
>> too. Have you considered using the stock in memcg_kmem_uncharge()?
>
> Good idea!
> I'll try to find an appropriate testcase and check if it really
> brings any benefits. If so, I'll master a patch.
>

Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?

^ permalink raw reply	[flat|nested] 31+ messages in thread
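
For context, the change being asked about would mirror the skmem hunk in
the patch: a hypothetical sketch, not part of this patch, of what the hot
tail of memcg_kmem_uncharge() could look like if it reused the per-cpu
stock. Only the two replaced lines come from the patch; everything else
here is an assumption about how such a follow-up might be shaped.

	/* before: uncharge the atomic counter and drop the css references directly */
	page_counter_uncharge(&memcg->memory, nr_pages);
	css_put_many(&memcg->css, nr_pages);

	/* after: park the charge (and its css references) in the per-cpu stock */
	refill_stock(memcg, nr_pages);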

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-09-07 18:44       ` Shakeel Butt
  (?)
@ 2017-09-07 18:47         ` Roman Gushchin
  -1 siblings, 0 replies; 31+ messages in thread
From: Roman Gushchin @ 2017-09-07 18:47 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Johannes Weiner, Linux MM, Michal Hocko, Vladimir Davydov,
	cgroups, kernel-team, LKML

On Thu, Sep 07, 2017 at 11:44:12AM -0700, Shakeel Butt wrote:
> >> As far as other types of pages go: page cache and anon are already
> >> batched pretty well, but I think kmem might benefit from this
> >> too. Have you considered using the stock in memcg_kmem_uncharge()?
> >
> > Good idea!
> > I'll try to find an appropriate testcase and check if it really
> > brings any benefits. If so, I'll master a patch.
> >
> 
> Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?

Hi Shakeel!

Not yet, I'll try it asap.
Do you have an example of when it's costly?

Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
  2017-09-07 18:47         ` Roman Gushchin
@ 2017-09-07 19:43           ` Shakeel Butt
  -1 siblings, 0 replies; 31+ messages in thread
From: Shakeel Butt @ 2017-09-07 19:43 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Johannes Weiner, Linux MM, Michal Hocko, Vladimir Davydov,
	cgroups, kernel-team, LKML

On Thu, Sep 7, 2017 at 11:47 AM, Roman Gushchin <guro@fb.com> wrote:
> On Thu, Sep 07, 2017 at 11:44:12AM -0700, Shakeel Butt wrote:
>> >> As far as other types of pages go: page cache and anon are already
>> >> batched pretty well, but I think kmem might benefit from this
>> >> too. Have you considered using the stock in memcg_kmem_uncharge()?
>> >
>> > Good idea!
>> > I'll try to find an appropriate testcase and check if it really
>> > brings any benefits. If so, I'll master a patch.
>> >
>>
>> Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?
>
> Hi Shakeel!
>
> Not yet, I'll try it asap.
> Do you have an example of when it's costly?
>

Nah, I was also curious about such examples.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-09-07 19:43 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-29 10:01 [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging Roman Gushchin
2017-08-29 19:26 ` Johannes Weiner
2017-08-30 10:55   ` Roman Gushchin
2017-09-07 18:44     ` Shakeel Butt
2017-09-07 18:47       ` Roman Gushchin
2017-09-07 19:43         ` Shakeel Butt
2017-08-30 10:57   ` Roman Gushchin
2017-08-30 12:36 ` Michal Hocko
2017-08-30 12:44   ` Roman Gushchin
2017-08-30 12:55     ` Michal Hocko
2017-08-30 12:57       ` Roman Gushchin
2017-08-30 14:23         ` Michal Hocko
2017-08-30 14:24 ` Michal Hocko
