* [PATCH v5 0/6] mm: improve performance of accounted kernel memory allocations
@ 2023-10-19 22:53 Roman Gushchin
From: Roman Gushchin @ 2023-10-19 22:53 UTC
  To: Andrew Morton
  Cc: linux-kernel, cgroups, Johannes Weiner, Michal Hocko,
	Shakeel Butt, Muchun Song, Dennis Zhou, David Rientjes,
	Vlastimil Babka, Naresh Kamboju, Roman Gushchin

This patchset improves the performance of accounted kernel memory allocations
by ~30% as measured by a micro-benchmark [1]. The benchmark is very
straightforward: 1M kmalloc() allocations of 64 bytes each.

Below are the results with kernel memory accounting disabled, with the original
code, and with this patchset applied. Each value is the minimum time over 100
runs, in microseconds (lower is better).

|             | Kmem disabled | Original | Patched |  Delta |
|-------------+---------------+----------+---------+--------|
| User cgroup |         29764 |    84548 |   59078 | -30.0% |
| Root cgroup |         29742 |    48342 |   31501 | -34.8% |

As we can see, the patchset removes the majority of the overhead when there is
no actual accounting (a task belongs to the root memory cgroup) and almost
halves the accounting overhead otherwise.

The main idea is to get rid of unnecessary memcg to objcg conversions and to
switch to scope-based protection of objcgs, which eliminates extra operations
on objcg reference counters under an RCU read lock. More details are provided
in the individual commit descriptions.

v5:
	- fixed another refcnt bug spotted by Vlastimil
	- small refactoring of current_obj_cgroup()
	- added a patch for get_obj_cgroup() refactoring
v4:
	- fixed a bug spotted by Vlastimil
	- cosmetic changes, per Vlastimil
v3:
	- fixed a bug spotted by Shakeel
	- added some comments, per Shakeel
v2:
	- fixed a bug discovered by Naresh Kamboju
	- code changes asked by Johannes (added comments, open-coded bit ops)
	- merged in a couple of small fixes
v1:
	- made the objcg update fully lockless
	- fixed !CONFIG_MMU build issues
rfc:
	https://lwn.net/Articles/945722/

--
[1]:

static int memory_alloc_test(struct seq_file *m, void *v)
{
       unsigned long i, j;
       void **ptrs;
       ktime_t start, end;
       s64 delta, min_delta = LLONG_MAX;

       ptrs = kvmalloc(sizeof(void *) * 1000000, GFP_KERNEL);
       if (!ptrs)
               return -ENOMEM;

       for (j = 0; j < 100; j++) {
               start = ktime_get();
               for (i = 0; i < 1000000; i++)
                       ptrs[i] = kmalloc(64, GFP_KERNEL_ACCOUNT);
               end = ktime_get();

               delta = ktime_us_delta(end, start);
               if (delta < min_delta)
                       min_delta = delta;

               for (i = 0; i < 1000000; i++)
                       kfree(ptrs[i]);
       }

       kvfree(ptrs);
       seq_printf(m, "%lld us\n", min_delta);

       return 0;
}

--

Signed-off-by: Roman Gushchin (Cruise) <roman.gushchin@linux.dev>


Roman Gushchin (6):
  mm: kmem: optimize get_obj_cgroup_from_current()
  mm: kmem: add direct objcg pointer to task_struct
  mm: kmem: make memcg keep a reference to the original objcg
  mm: kmem: scoped objcg protection
  percpu: scoped objcg protection
  mm: kmem: reimplement get_obj_cgroup_from_current()

 include/linux/memcontrol.h |  28 +++++-
 include/linux/sched.h      |   4 +
 include/linux/sched/mm.h   |   4 +
 mm/memcontrol.c            | 187 +++++++++++++++++++++++++++++++------
 mm/percpu.c                |   8 +-
 mm/slab.h                  |  15 +--
 6 files changed, 204 insertions(+), 42 deletions(-)

-- 
2.42.0


Thread overview: 10+ messages
2023-10-19 22:53 [PATCH v5 0/6] mm: improve performance of accounted kernel memory allocations Roman Gushchin
2023-10-19 22:53 ` [PATCH v5 1/6] mm: kmem: optimize get_obj_cgroup_from_current() Roman Gushchin
2023-10-19 22:53 ` [PATCH v5 2/6] mm: kmem: add direct objcg pointer to task_struct Roman Gushchin
2023-10-20  5:33   ` Vlastimil Babka
2023-10-19 22:53 ` [PATCH v5 3/6] mm: kmem: make memcg keep a reference to the original objcg Roman Gushchin
2023-10-19 22:53 ` [PATCH v5 4/6] mm: kmem: scoped objcg protection Roman Gushchin
2023-10-19 22:53 ` [PATCH v5 5/6] percpu: " Roman Gushchin
2023-10-19 22:53 ` [PATCH v5 6/6] mm: kmem: reimplement get_obj_cgroup_from_current() Roman Gushchin
2023-10-20  5:41   ` Vlastimil Babka
2023-10-20  6:31   ` Shakeel Butt
