* Re: [PATCH v1 1/5] percpu: return number of released bytes from pcpu_free_area()
       [not found] ` <20200528232508.1132382-2-guro@fb.com>
@ 2020-06-05 19:44   ` Dennis Zhou
  0 siblings, 0 replies; 7+ messages in thread
From: Dennis Zhou @ 2020-06-05 19:44 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner,
	Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel

On Thu, May 28, 2020 at 04:25:04PM -0700, Roman Gushchin wrote:
> To implement accounting of percpu memory we need the information
> about the size of freed object. Return it from pcpu_free_area().
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---
>  mm/percpu.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 696367b18222..aa36b78d45a6 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -1211,11 +1211,14 @@ static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
>   *
>   * This function determines the size of an allocation to free using
>   * the boundary bitmap and clears the allocation map.
> + *
> + * RETURNS:
> + * Number of freed bytes.
>   */
> -static void pcpu_free_area(struct pcpu_chunk *chunk, int off)
> +static int pcpu_free_area(struct pcpu_chunk *chunk, int off)
>  {
>  	struct pcpu_block_md *chunk_md = &chunk->chunk_md;
> -	int bit_off, bits, end, oslot;
> +	int bit_off, bits, end, oslot, freed;
>  
>  	lockdep_assert_held(&pcpu_lock);
>  	pcpu_stats_area_dealloc(chunk);
> @@ -1230,8 +1233,10 @@ static void pcpu_free_area(struct pcpu_chunk *chunk, int off)
>  	bits = end - bit_off;
>  	bitmap_clear(chunk->alloc_map, bit_off, bits);
>  
> +	freed = bits * PCPU_MIN_ALLOC_SIZE;
> +
>  	/* update metadata */
> -	chunk->free_bytes += bits * PCPU_MIN_ALLOC_SIZE;
> +	chunk->free_bytes += freed;
>  
>  	/* update first free bit */
>  	chunk_md->first_free = min(chunk_md->first_free, bit_off);
> @@ -1239,6 +1244,8 @@ static void pcpu_free_area(struct pcpu_chunk *chunk, int off)
>  	pcpu_block_update_hint_free(chunk, bit_off, bits);
>  
>  	pcpu_chunk_relocate(chunk, oslot);
> +
> +	return freed;
>  }
>  
>  static void pcpu_init_md_block(struct pcpu_block_md *block, int nr_bits)
> -- 
> 2.25.4
> 

Sorry for the delay.

Acked-by: Dennis Zhou <dennis@kernel.org>

What's the status of the depending patches? It might be easiest to have
Andrew pick these up once the depending patch series is settled.

Thanks,
Dennis

^ permalink raw reply	[flat|nested] 7+ messages in thread
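A minimal userspace sketch of the mechanism the patch builds on: the size of a
freed area is recovered from the boundary bitmap, so pcpu_free_area() can hand
it back to the caller in bytes. This is an illustration only, not the kernel
code; the 4-byte minimum allocation size and the toy map sizes are assumptions
made for the demo.

#include <stdio.h>
#include <stdbool.h>

#define NR_BITS        64
#define MIN_ALLOC_SIZE  4	/* bytes covered by one allocation-map bit (assumed) */

static bool alloc_map[NR_BITS];		/* set for every allocated unit */
static bool bound_map[NR_BITS + 1];	/* set at the start and end of each area */

/* mark an area of @bits units starting at @bit_off as allocated */
static void demo_alloc(int bit_off, int bits)
{
	for (int i = bit_off; i < bit_off + bits; i++)
		alloc_map[i] = true;
	bound_map[bit_off] = true;
	bound_map[bit_off + bits] = true;
}

/* free the area starting at @bit_off and return its size in bytes */
static int demo_free(int bit_off)
{
	int end = bit_off + 1;

	/* walk to the next boundary: that is where this area ends */
	while (end < NR_BITS && !bound_map[end])
		end++;

	int bits = end - bit_off;

	for (int i = bit_off; i < end; i++)
		alloc_map[i] = false;

	return bits * MIN_ALLOC_SIZE;	/* the value the patch now returns */
}

int main(void)
{
	demo_alloc(0, 8);	/* a 32-byte area */
	demo_alloc(8, 3);	/* a 12-byte area */

	printf("freed %d bytes\n", demo_free(0));	/* 32 */
	printf("freed %d bytes\n", demo_free(8));	/* 12 */
	return 0;
}

Later in the series, free_percpu() uses the returned size to know how many
bytes to uncharge from the owning memory cgroup.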
* Re: [PATCH v1 2/5] mm: memcg/percpu: account percpu memory to memory cgroups [not found] ` <20200528232508.1132382-3-guro@fb.com> @ 2020-06-05 19:49 ` Dennis Zhou 2020-06-05 22:44 ` Roman Gushchin 0 siblings, 1 reply; 7+ messages in thread From: Dennis Zhou @ 2020-06-05 19:49 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Thu, May 28, 2020 at 04:25:05PM -0700, Roman Gushchin wrote: > Percpu memory is becoming more and more widely used by various > subsystems, and the total amount of memory controlled by the percpu > allocator can make a good part of the total memory. > > As an example, bpf maps can consume a lot of percpu memory, > and they are created by a user. Also, some cgroup internals > (e.g. memory controller statistics) can be quite large. > On a machine with many CPUs and big number of cgroups they > can consume hundreds of megabytes. > > So the lack of memcg accounting is creating a breach in the memory > isolation. Similar to the slab memory, percpu memory should be > accounted by default. > > To implement the perpcu accounting it's possible to take the slab > memory accounting as a model to follow. Let's introduce two types of > percpu chunks: root and memcg. What makes memcg chunks different is > an additional space allocated to store memcg membership information. > If __GFP_ACCOUNT is passed on allocation, a memcg chunk should be be > used. If it's possible to charge the corresponding size to the target > memory cgroup, allocation is performed, and the memcg ownership data > is recorded. System-wide allocations are performed using root chunks, > so there is no additional memory overhead. > > To implement a fast reparenting of percpu memory on memcg removal, > we don't store mem_cgroup pointers directly: instead we use obj_cgroup > API, introduced for slab accounting. > > Signed-off-by: Roman Gushchin <guro@fb.com> > --- > mm/percpu-internal.h | 57 ++++++++++++- > mm/percpu-km.c | 5 +- > mm/percpu-stats.c | 36 +++++---- > mm/percpu-vm.c | 5 +- > mm/percpu.c | 186 ++++++++++++++++++++++++++++++++++++++----- > 5 files changed, 248 insertions(+), 41 deletions(-) > > diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h > index 0468ba500bd4..0cf36337eb47 100644 > --- a/mm/percpu-internal.h > +++ b/mm/percpu-internal.h > @@ -5,6 +5,27 @@ > #include <linux/types.h> > #include <linux/percpu.h> > > +/* > + * There are two chunk types: root and memcg-aware. > + * Chunks of each type have separate slots list. > + * > + * Memcg-aware chunks have an attached vector of obj_cgroup > + * pointers, which is used to store memcg membership data > + * of a percpu object. Obj_cgroups are ref-counted pointers > + * to a memory cgroup with an ability to switch dynamically > + * to the parent memory cgroup. This allows to reclaim a deleted > + * memory cgroup without reclaiming of all outstanding objects, > + * which do hold a reference at it. > + */ nit: do you mind reflowing this to 80 characters and doing 2 spaces after each period to keep the formatting uniform. > +enum pcpu_chunk_type { > + PCPU_CHUNK_ROOT, > +#ifdef CONFIG_MEMCG_KMEM > + PCPU_CHUNK_MEMCG, > +#endif > + PCPU_NR_CHUNK_TYPES, > + PCPU_FAIL_ALLOC = PCPU_NR_CHUNK_TYPES > +}; > + > /* > * pcpu_block_md is the metadata block struct. > * Each chunk's bitmap is split into a number of full blocks. 
> @@ -54,6 +75,9 @@ struct pcpu_chunk { > int end_offset; /* additional area required to > have the region end page > aligned */ > +#ifdef CONFIG_MEMCG_KMEM > + struct obj_cgroup **obj_cgroups; /* vector of object cgroups */ > +#endif > > int nr_pages; /* # of pages served by this chunk */ > int nr_populated; /* # of populated pages */ > @@ -63,7 +87,7 @@ struct pcpu_chunk { > > extern spinlock_t pcpu_lock; > > -extern struct list_head *pcpu_slot; > +extern struct list_head *pcpu_chunk_lists; > extern int pcpu_nr_slots; > extern int pcpu_nr_empty_pop_pages; > > @@ -106,6 +130,37 @@ static inline int pcpu_chunk_map_bits(struct pcpu_chunk *chunk) > return pcpu_nr_pages_to_map_bits(chunk->nr_pages); > } > > +#ifdef CONFIG_MEMCG_KMEM > +static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk) > +{ > + if (chunk->obj_cgroups) > + return PCPU_CHUNK_MEMCG; > + return PCPU_CHUNK_ROOT; > +} > + > +static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type) > +{ > + return chunk_type == PCPU_CHUNK_MEMCG; > +} > + > +#else > +static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk) > +{ > + return PCPU_CHUNK_ROOT; > +} > + > +static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type) > +{ > + return false; > +} > +#endif > + > +static struct list_head *pcpu_chunk_list(enum pcpu_chunk_type chunk_type) > +{ > + return &pcpu_chunk_lists[pcpu_nr_slots * > + pcpu_is_memcg_chunk(chunk_type)]; > +} > + > #ifdef CONFIG_PERCPU_STATS > > #include <linux/spinlock.h> > diff --git a/mm/percpu-km.c b/mm/percpu-km.c > index 20d2b69a13b0..35c9941077ee 100644 > --- a/mm/percpu-km.c > +++ b/mm/percpu-km.c > @@ -44,7 +44,8 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > /* nada */ > } > > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > + gfp_t gfp) > { > const int nr_pages = pcpu_group_sizes[0] >> PAGE_SHIFT; > struct pcpu_chunk *chunk; > @@ -52,7 +53,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > unsigned long flags; > int i; > > - chunk = pcpu_alloc_chunk(gfp); > + chunk = pcpu_alloc_chunk(type, gfp); > if (!chunk) > return NULL; > > diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c > index 32558063c3f9..c8400a2adbc2 100644 > --- a/mm/percpu-stats.c > +++ b/mm/percpu-stats.c > @@ -34,11 +34,15 @@ static int find_max_nr_alloc(void) > { > struct pcpu_chunk *chunk; > int slot, max_nr_alloc; > + enum pcpu_chunk_type type; > > max_nr_alloc = 0; > - for (slot = 0; slot < pcpu_nr_slots; slot++) > - list_for_each_entry(chunk, &pcpu_slot[slot], list) > - max_nr_alloc = max(max_nr_alloc, chunk->nr_alloc); > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > + for (slot = 0; slot < pcpu_nr_slots; slot++) > + list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot], > + list) > + max_nr_alloc = max(max_nr_alloc, > + chunk->nr_alloc); > > return max_nr_alloc; > } > @@ -129,6 +133,9 @@ static void chunk_map_stats(struct seq_file *m, struct pcpu_chunk *chunk, > P("cur_min_alloc", cur_min_alloc); > P("cur_med_alloc", cur_med_alloc); > P("cur_max_alloc", cur_max_alloc); > +#ifdef CONFIG_MEMCG_KMEM > + P("memcg_aware", pcpu_is_memcg_chunk(pcpu_chunk_type(chunk))); > +#endif > seq_putc(m, '\n'); > } > > @@ -137,6 +144,7 @@ static int percpu_stats_show(struct seq_file *m, void *v) > struct pcpu_chunk *chunk; > int slot, max_nr_alloc; > int *buffer; > + enum pcpu_chunk_type type; > > alloc_buffer: > spin_lock_irq(&pcpu_lock); > @@ -202,18 +210,18 @@ static int 
percpu_stats_show(struct seq_file *m, void *v) > chunk_map_stats(m, pcpu_reserved_chunk, buffer); > } > > - for (slot = 0; slot < pcpu_nr_slots; slot++) { > - list_for_each_entry(chunk, &pcpu_slot[slot], list) { > - if (chunk == pcpu_first_chunk) { > - seq_puts(m, "Chunk: <- First Chunk\n"); > - chunk_map_stats(m, chunk, buffer); > - > - > - } else { > - seq_puts(m, "Chunk:\n"); > - chunk_map_stats(m, chunk, buffer); > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) { > + for (slot = 0; slot < pcpu_nr_slots; slot++) { > + list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot], > + list) { > + if (chunk == pcpu_first_chunk) { > + seq_puts(m, "Chunk: <- First Chunk\n"); > + chunk_map_stats(m, chunk, buffer); > + } else { > + seq_puts(m, "Chunk:\n"); > + chunk_map_stats(m, chunk, buffer); > + } > } > - > } > } > > diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c > index a2b395acef89..e46f7a6917f9 100644 > --- a/mm/percpu-vm.c > +++ b/mm/percpu-vm.c > @@ -328,12 +328,13 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > pcpu_free_pages(chunk, pages, page_start, page_end); > } > > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > + gfp_t gfp) > { > struct pcpu_chunk *chunk; > struct vm_struct **vms; > > - chunk = pcpu_alloc_chunk(gfp); > + chunk = pcpu_alloc_chunk(type, gfp); > if (!chunk) > return NULL; > > diff --git a/mm/percpu.c b/mm/percpu.c > index aa36b78d45a6..85f5755c9114 100644 > --- a/mm/percpu.c > +++ b/mm/percpu.c > @@ -37,9 +37,14 @@ > * takes care of normal allocations. > * > * The allocator organizes chunks into lists according to free size and > - * tries to allocate from the fullest chunk first. Each chunk is managed > - * by a bitmap with metadata blocks. The allocation map is updated on > - * every allocation and free to reflect the current state while the boundary > + * memcg-awareness. To make a percpu allocation memcg-aware the __GFP_ACCOUNT > + * flag should be passed. All memcg-aware allocations are sharing one set > + * of chunks and all unaccounted allocations and allocations performed > + * by processes belonging to the root memory cgroup are using the second set. > + * > + * The allocator tries to allocate from the fullest chunk first. Each chunk > + * is managed by a bitmap with metadata blocks. The allocation map is updated > + * on every allocation and free to reflect the current state while the boundary > * map is only updated on allocation. Each metadata block contains > * information to help mitigate the need to iterate over large portions > * of the bitmap. 
The reverse mapping from page to chunk is stored in > @@ -81,6 +86,7 @@ > #include <linux/kmemleak.h> > #include <linux/sched.h> > #include <linux/sched/mm.h> > +#include <linux/memcontrol.h> > > #include <asm/cacheflush.h> > #include <asm/sections.h> > @@ -160,7 +166,7 @@ struct pcpu_chunk *pcpu_reserved_chunk __ro_after_init; > DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */ > static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map ext */ > > -struct list_head *pcpu_slot __ro_after_init; /* chunk list slots */ > +struct list_head *pcpu_chunk_lists __ro_after_init; /* chunk list slots */ > > /* chunks which need their map areas extended, protected by pcpu_lock */ > static LIST_HEAD(pcpu_map_extend_chunks); > @@ -500,6 +506,9 @@ static void __pcpu_chunk_move(struct pcpu_chunk *chunk, int slot, > bool move_front) > { > if (chunk != pcpu_reserved_chunk) { > + struct list_head *pcpu_slot; > + > + pcpu_slot = pcpu_chunk_list(pcpu_chunk_type(chunk)); > if (move_front) > list_move(&chunk->list, &pcpu_slot[slot]); > else > @@ -1341,6 +1350,10 @@ static struct pcpu_chunk * __init pcpu_alloc_first_chunk(unsigned long tmp_addr, > panic("%s: Failed to allocate %zu bytes\n", __func__, > alloc_size); > > +#ifdef CONFIG_MEMCG_KMEM > + /* first chunk isn't memcg-aware */ > + chunk->obj_cgroups = NULL; > +#endif > pcpu_init_md_blocks(chunk); > > /* manage populated page bitmap */ > @@ -1380,7 +1393,7 @@ static struct pcpu_chunk * __init pcpu_alloc_first_chunk(unsigned long tmp_addr, > return chunk; > } > > -static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > +static struct pcpu_chunk *pcpu_alloc_chunk(enum pcpu_chunk_type type, gfp_t gfp) > { > struct pcpu_chunk *chunk; > int region_bits; > @@ -1408,6 +1421,16 @@ static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > if (!chunk->md_blocks) > goto md_blocks_fail; > > +#ifdef CONFIG_MEMCG_KMEM > + if (pcpu_is_memcg_chunk(type)) { > + chunk->obj_cgroups = > + pcpu_mem_zalloc(pcpu_chunk_map_bits(chunk) * > + sizeof(struct obj_cgroup *), gfp); > + if (!chunk->obj_cgroups) > + goto objcg_fail; > + } > +#endif > + > pcpu_init_md_blocks(chunk); > > /* init metadata */ > @@ -1415,6 +1438,8 @@ static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > > return chunk; > > +objcg_fail: > + pcpu_mem_free(chunk->md_blocks); > md_blocks_fail: > pcpu_mem_free(chunk->bound_map); > bound_map_fail: > @@ -1429,6 +1454,9 @@ static void pcpu_free_chunk(struct pcpu_chunk *chunk) > { > if (!chunk) > return; > +#ifdef CONFIG_MEMCG_KMEM > + pcpu_mem_free(chunk->obj_cgroups); > +#endif > pcpu_mem_free(chunk->md_blocks); > pcpu_mem_free(chunk->bound_map); > pcpu_mem_free(chunk->alloc_map); > @@ -1505,7 +1533,8 @@ static int pcpu_populate_chunk(struct pcpu_chunk *chunk, > int page_start, int page_end, gfp_t gfp); > static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > int page_start, int page_end); > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp); > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > + gfp_t gfp); > static void pcpu_destroy_chunk(struct pcpu_chunk *chunk); > static struct page *pcpu_addr_to_page(void *addr); > static int __init pcpu_verify_alloc_info(const struct pcpu_alloc_info *ai); > @@ -1547,6 +1576,77 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr) > return pcpu_get_page_chunk(pcpu_addr_to_page(addr)); > } > > +#ifdef CONFIG_MEMCG_KMEM > +static enum pcpu_chunk_type pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp, > + struct obj_cgroup **objcgp) > +{ > + struct 
obj_cgroup *objcg; > + > + if (!memcg_kmem_enabled() || !(gfp & __GFP_ACCOUNT) || > + memcg_kmem_bypass()) > + return PCPU_CHUNK_ROOT; > + > + objcg = get_obj_cgroup_from_current(); > + if (!objcg) > + return PCPU_CHUNK_ROOT; > + > + if (obj_cgroup_charge(objcg, gfp, size * num_possible_cpus())) { > + obj_cgroup_put(objcg); > + return PCPU_FAIL_ALLOC; > + } > + > + *objcgp = objcg; > + return PCPU_CHUNK_MEMCG; > +} > + > +static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg, > + struct pcpu_chunk *chunk, int off, > + size_t size) > +{ > + if (!objcg) > + return; > + > + if (chunk) { > + chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = objcg; > + } else { > + obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > + obj_cgroup_put(objcg); > + } > +} > + > +static void pcpu_memcg_free_hook(struct pcpu_chunk *chunk, int off, size_t size) > +{ > + struct obj_cgroup *objcg; > + > + if (!pcpu_is_memcg_chunk(pcpu_chunk_type(chunk))) > + return; > + > + objcg = chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT]; > + chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = NULL; > + > + obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > + > + obj_cgroup_put(objcg); > +} > + > +#else /* CONFIG_MEMCG_KMEM */ > +static enum pcpu_chunk_type pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp, > + struct mem_cgroup **memcgp) > +{ > + return PCPU_CHUNK_ROOT; > +} > + > +static void pcpu_memcg_post_alloc_hook(struct mem_cgroup *memcg, > + struct pcpu_chunk *chunk, int off, > + size_t size) > +{ > +} > + > +static void pcpu_memcg_free_hook(struct pcpu_chunk *chunk, int off, size_t size) > +{ > +} > +#endif /* CONFIG_MEMCG_KMEM */ > + > /** > * pcpu_alloc - the percpu allocator > * @size: size of area to allocate in bytes > @@ -1568,6 +1668,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > gfp_t pcpu_gfp; > bool is_atomic; > bool do_warn; > + enum pcpu_chunk_type type; > + struct list_head *pcpu_slot; > + struct obj_cgroup *objcg = NULL; > static int warn_limit = 10; > struct pcpu_chunk *chunk, *next; > const char *err; > @@ -1602,16 +1705,23 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > return NULL; > } > > + type = pcpu_memcg_pre_alloc_hook(size, gfp, &objcg); > + if (unlikely(type == PCPU_FAIL_ALLOC)) > + return NULL; > + pcpu_slot = pcpu_chunk_list(type); > + > if (!is_atomic) { > /* > * pcpu_balance_workfn() allocates memory under this mutex, > * and it may wait for memory reclaim. Allow current task > * to become OOM victim, in case of memory pressure. > */ > - if (gfp & __GFP_NOFAIL) > + if (gfp & __GFP_NOFAIL) { > mutex_lock(&pcpu_alloc_mutex); > - else if (mutex_lock_killable(&pcpu_alloc_mutex)) > + } else if (mutex_lock_killable(&pcpu_alloc_mutex)) { > + pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size); > return NULL; > + } > } > > spin_lock_irqsave(&pcpu_lock, flags); > @@ -1637,7 +1747,8 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > restart: > /* search through normal chunks */ > for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) { > - list_for_each_entry_safe(chunk, next, &pcpu_slot[slot], list) { > + list_for_each_entry_safe(chunk, next, &pcpu_slot[slot], > + list) { nit: this line change doesn't do anything. Can you please remove it. 
> off = pcpu_find_block_fit(chunk, bits, bit_align, > is_atomic); > if (off < 0) { > @@ -1666,7 +1777,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > } > > if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) { > - chunk = pcpu_create_chunk(pcpu_gfp); > + chunk = pcpu_create_chunk(type, pcpu_gfp); > if (!chunk) { > err = "failed to allocate new chunk"; > goto fail; > @@ -1723,6 +1834,8 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > trace_percpu_alloc_percpu(reserved, is_atomic, size, align, > chunk->base_addr, off, ptr); > > + pcpu_memcg_post_alloc_hook(objcg, chunk, off, size); > + > return ptr; > > fail_unlock: > @@ -1744,6 +1857,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > } else { > mutex_unlock(&pcpu_alloc_mutex); > } > + > + pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size); > + > return NULL; > } > > @@ -1803,8 +1919,8 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) > } > > /** > - * pcpu_balance_workfn - manage the amount of free chunks and populated pages > - * @work: unused > + * __pcpu_balance_workfn - manage the amount of free chunks and populated pages > + * @type: chunk type > * > * Reclaim all fully free chunks except for the first one. This is also > * responsible for maintaining the pool of empty populated pages. However, > @@ -1813,11 +1929,12 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) > * allocation causes the failure as it is possible that requests can be > * serviced from already backed regions. > */ > -static void pcpu_balance_workfn(struct work_struct *work) > +static void __pcpu_balance_workfn(enum pcpu_chunk_type type) > { > /* gfp flags passed to underlying allocators */ > const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; > LIST_HEAD(to_free); > + struct list_head *pcpu_slot = pcpu_chunk_list(type); > struct list_head *free_head = &pcpu_slot[pcpu_nr_slots - 1]; > struct pcpu_chunk *chunk, *next; > int slot, nr_to_pop, ret; > @@ -1915,7 +2032,7 @@ static void pcpu_balance_workfn(struct work_struct *work) > > if (nr_to_pop) { > /* ran out of chunks to populate, create a new one and retry */ > - chunk = pcpu_create_chunk(gfp); > + chunk = pcpu_create_chunk(type, gfp); > if (chunk) { > spin_lock_irq(&pcpu_lock); > pcpu_chunk_relocate(chunk, -1); > @@ -1927,6 +2044,20 @@ static void pcpu_balance_workfn(struct work_struct *work) > mutex_unlock(&pcpu_alloc_mutex); > } > > +/** > + * pcpu_balance_workfn - manage the amount of free chunks and populated pages > + * @work: unused > + * > + * Call __pcpu_balance_workfn() for each chunk type. 
> + */ > +static void pcpu_balance_workfn(struct work_struct *work) > +{ > + enum pcpu_chunk_type type; > + > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > + __pcpu_balance_workfn(type); > +} > + > /** > * free_percpu - free percpu area > * @ptr: pointer to area to free > @@ -1941,8 +2072,9 @@ void free_percpu(void __percpu *ptr) > void *addr; > struct pcpu_chunk *chunk; > unsigned long flags; > - int off; > + int size, off; > bool need_balance = false; > + struct list_head *pcpu_slot; > > if (!ptr) > return; > @@ -1956,7 +2088,11 @@ void free_percpu(void __percpu *ptr) > chunk = pcpu_chunk_addr_search(addr); > off = addr - chunk->base_addr; > > - pcpu_free_area(chunk, off); > + size = pcpu_free_area(chunk, off); > + > + pcpu_slot = pcpu_chunk_list(pcpu_chunk_type(chunk)); > + > + pcpu_memcg_free_hook(chunk, off, size); > > /* if there are more than one fully free chunks, wake up grim reaper */ > if (chunk->free_bytes == pcpu_unit_size) { > @@ -2267,6 +2403,7 @@ void __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, > int map_size; > unsigned long tmp_addr; > size_t alloc_size; > + enum pcpu_chunk_type type; > > #define PCPU_SETUP_BUG_ON(cond) do { \ > if (unlikely(cond)) { \ > @@ -2384,13 +2521,18 @@ void __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, > * empty chunks. > */ > pcpu_nr_slots = __pcpu_size_to_slot(pcpu_unit_size) + 2; > - pcpu_slot = memblock_alloc(pcpu_nr_slots * sizeof(pcpu_slot[0]), > - SMP_CACHE_BYTES); > - if (!pcpu_slot) > + pcpu_chunk_lists = memblock_alloc(pcpu_nr_slots * > + sizeof(pcpu_chunk_lists[0]) * > + PCPU_NR_CHUNK_TYPES, > + SMP_CACHE_BYTES); > + if (!pcpu_chunk_lists) > panic("%s: Failed to allocate %zu bytes\n", __func__, > - pcpu_nr_slots * sizeof(pcpu_slot[0])); > - for (i = 0; i < pcpu_nr_slots; i++) > - INIT_LIST_HEAD(&pcpu_slot[i]); > + pcpu_nr_slots * sizeof(pcpu_chunk_lists[0]) * > + PCPU_NR_CHUNK_TYPES); > + > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > + for (i = 0; i < pcpu_nr_slots; i++) > + INIT_LIST_HEAD(&pcpu_chunk_list(type)[i]); > > /* > * The end of the static region needs to be aligned with the > -- > 2.25.4 > There were just 2 minor nits. Do you mind resending with them fixed as I'm not sure I'll be carrying these patches or not. Acked-by: Dennis Zhou <dennis@kernel.org> Thanks, Dennis ^ permalink raw reply [flat|nested] 7+ messages in thread
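A small userspace illustration of the two indexing tricks this patch relies on:
memcg-aware chunks live in a second run of slots appended to the same
pcpu_chunk_lists array, and each object's obj_cgroup pointer sits at the slot
derived from its byte offset. The pcpu_nr_slots value is arbitrary and
PCPU_MIN_ALLOC_SHIFT of 2 is an assumption; this is a sketch of the arithmetic
only, not the kernel code.

#include <stdio.h>

#define PCPU_MIN_ALLOC_SHIFT 2	/* assumed: one obj_cgroups[] slot per 4-byte unit */

enum pcpu_chunk_type { PCPU_CHUNK_ROOT, PCPU_CHUNK_MEMCG, PCPU_NR_CHUNK_TYPES };

static int pcpu_nr_slots = 15;	/* arbitrary example value */

/* first slot index for a chunk type, mirroring pcpu_chunk_list() */
static int first_slot_index(enum pcpu_chunk_type type)
{
	return pcpu_nr_slots * (type == PCPU_CHUNK_MEMCG);
}

/* obj_cgroups[] index for an object at byte offset @off, as in the memcg hooks */
static int objcg_index(int off)
{
	return off >> PCPU_MIN_ALLOC_SHIFT;
}

int main(void)
{
	printf("root slots:  [%d..%d)\n", first_slot_index(PCPU_CHUNK_ROOT),
	       first_slot_index(PCPU_CHUNK_ROOT) + pcpu_nr_slots);
	printf("memcg slots: [%d..%d)\n", first_slot_index(PCPU_CHUNK_MEMCG),
	       first_slot_index(PCPU_CHUNK_MEMCG) + pcpu_nr_slots);
	printf("object at offset 256 uses obj_cgroups[%d]\n", objcg_index(256));
	return 0;
}

This layout is why only one extra allocation (the obj_cgroups vector) is needed
per memcg-aware chunk, while root chunks keep zero additional overhead.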
* Re: [PATCH v1 2/5] mm: memcg/percpu: account percpu memory to memory cgroups 2020-06-05 19:49 ` [PATCH v1 2/5] mm: memcg/percpu: account percpu memory to memory cgroups Dennis Zhou @ 2020-06-05 22:44 ` Roman Gushchin 0 siblings, 0 replies; 7+ messages in thread From: Roman Gushchin @ 2020-06-05 22:44 UTC (permalink / raw) To: Dennis Zhou Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Fri, Jun 05, 2020 at 07:49:53PM +0000, Dennis Zhou wrote: > On Thu, May 28, 2020 at 04:25:05PM -0700, Roman Gushchin wrote: > > Percpu memory is becoming more and more widely used by various > > subsystems, and the total amount of memory controlled by the percpu > > allocator can make a good part of the total memory. > > > > As an example, bpf maps can consume a lot of percpu memory, > > and they are created by a user. Also, some cgroup internals > > (e.g. memory controller statistics) can be quite large. > > On a machine with many CPUs and big number of cgroups they > > can consume hundreds of megabytes. > > > > So the lack of memcg accounting is creating a breach in the memory > > isolation. Similar to the slab memory, percpu memory should be > > accounted by default. > > > > To implement the perpcu accounting it's possible to take the slab > > memory accounting as a model to follow. Let's introduce two types of > > percpu chunks: root and memcg. What makes memcg chunks different is > > an additional space allocated to store memcg membership information. > > If __GFP_ACCOUNT is passed on allocation, a memcg chunk should be be > > used. If it's possible to charge the corresponding size to the target > > memory cgroup, allocation is performed, and the memcg ownership data > > is recorded. System-wide allocations are performed using root chunks, > > so there is no additional memory overhead. > > > > To implement a fast reparenting of percpu memory on memcg removal, > > we don't store mem_cgroup pointers directly: instead we use obj_cgroup > > API, introduced for slab accounting. > > > > Signed-off-by: Roman Gushchin <guro@fb.com> > > --- > > mm/percpu-internal.h | 57 ++++++++++++- > > mm/percpu-km.c | 5 +- > > mm/percpu-stats.c | 36 +++++---- > > mm/percpu-vm.c | 5 +- > > mm/percpu.c | 186 ++++++++++++++++++++++++++++++++++++++----- > > 5 files changed, 248 insertions(+), 41 deletions(-) > > > > diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h > > index 0468ba500bd4..0cf36337eb47 100644 > > --- a/mm/percpu-internal.h > > +++ b/mm/percpu-internal.h > > @@ -5,6 +5,27 @@ > > #include <linux/types.h> > > #include <linux/percpu.h> > > > > +/* > > + * There are two chunk types: root and memcg-aware. > > + * Chunks of each type have separate slots list. > > + * > > + * Memcg-aware chunks have an attached vector of obj_cgroup > > + * pointers, which is used to store memcg membership data > > + * of a percpu object. Obj_cgroups are ref-counted pointers > > + * to a memory cgroup with an ability to switch dynamically > > + * to the parent memory cgroup. This allows to reclaim a deleted > > + * memory cgroup without reclaiming of all outstanding objects, > > + * which do hold a reference at it. > > + */ > > nit: do you mind reflowing this to 80 characters and doing 2 spaces > after each period to keep the formatting uniform. 
> > > +enum pcpu_chunk_type { > > + PCPU_CHUNK_ROOT, > > +#ifdef CONFIG_MEMCG_KMEM > > + PCPU_CHUNK_MEMCG, > > +#endif > > + PCPU_NR_CHUNK_TYPES, > > + PCPU_FAIL_ALLOC = PCPU_NR_CHUNK_TYPES > > +}; > > + > > /* > > * pcpu_block_md is the metadata block struct. > > * Each chunk's bitmap is split into a number of full blocks. > > @@ -54,6 +75,9 @@ struct pcpu_chunk { > > int end_offset; /* additional area required to > > have the region end page > > aligned */ > > +#ifdef CONFIG_MEMCG_KMEM > > + struct obj_cgroup **obj_cgroups; /* vector of object cgroups */ > > +#endif > > > > int nr_pages; /* # of pages served by this chunk */ > > int nr_populated; /* # of populated pages */ > > @@ -63,7 +87,7 @@ struct pcpu_chunk { > > > > extern spinlock_t pcpu_lock; > > > > -extern struct list_head *pcpu_slot; > > +extern struct list_head *pcpu_chunk_lists; > > extern int pcpu_nr_slots; > > extern int pcpu_nr_empty_pop_pages; > > > > @@ -106,6 +130,37 @@ static inline int pcpu_chunk_map_bits(struct pcpu_chunk *chunk) > > return pcpu_nr_pages_to_map_bits(chunk->nr_pages); > > } > > > > +#ifdef CONFIG_MEMCG_KMEM > > +static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk) > > +{ > > + if (chunk->obj_cgroups) > > + return PCPU_CHUNK_MEMCG; > > + return PCPU_CHUNK_ROOT; > > +} > > + > > +static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type) > > +{ > > + return chunk_type == PCPU_CHUNK_MEMCG; > > +} > > + > > +#else > > +static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk) > > +{ > > + return PCPU_CHUNK_ROOT; > > +} > > + > > +static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type) > > +{ > > + return false; > > +} > > +#endif > > + > > +static struct list_head *pcpu_chunk_list(enum pcpu_chunk_type chunk_type) > > +{ > > + return &pcpu_chunk_lists[pcpu_nr_slots * > > + pcpu_is_memcg_chunk(chunk_type)]; > > +} > > + > > #ifdef CONFIG_PERCPU_STATS > > > > #include <linux/spinlock.h> > > diff --git a/mm/percpu-km.c b/mm/percpu-km.c > > index 20d2b69a13b0..35c9941077ee 100644 > > --- a/mm/percpu-km.c > > +++ b/mm/percpu-km.c > > @@ -44,7 +44,8 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > > /* nada */ > > } > > > > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > > + gfp_t gfp) > > { > > const int nr_pages = pcpu_group_sizes[0] >> PAGE_SHIFT; > > struct pcpu_chunk *chunk; > > @@ -52,7 +53,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > > unsigned long flags; > > int i; > > > > - chunk = pcpu_alloc_chunk(gfp); > > + chunk = pcpu_alloc_chunk(type, gfp); > > if (!chunk) > > return NULL; > > > > diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c > > index 32558063c3f9..c8400a2adbc2 100644 > > --- a/mm/percpu-stats.c > > +++ b/mm/percpu-stats.c > > @@ -34,11 +34,15 @@ static int find_max_nr_alloc(void) > > { > > struct pcpu_chunk *chunk; > > int slot, max_nr_alloc; > > + enum pcpu_chunk_type type; > > > > max_nr_alloc = 0; > > - for (slot = 0; slot < pcpu_nr_slots; slot++) > > - list_for_each_entry(chunk, &pcpu_slot[slot], list) > > - max_nr_alloc = max(max_nr_alloc, chunk->nr_alloc); > > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > > + for (slot = 0; slot < pcpu_nr_slots; slot++) > > + list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot], > > + list) > > + max_nr_alloc = max(max_nr_alloc, > > + chunk->nr_alloc); > > > > return max_nr_alloc; > > } > > @@ -129,6 +133,9 @@ static void chunk_map_stats(struct seq_file *m, struct pcpu_chunk 
*chunk, > > P("cur_min_alloc", cur_min_alloc); > > P("cur_med_alloc", cur_med_alloc); > > P("cur_max_alloc", cur_max_alloc); > > +#ifdef CONFIG_MEMCG_KMEM > > + P("memcg_aware", pcpu_is_memcg_chunk(pcpu_chunk_type(chunk))); > > +#endif > > seq_putc(m, '\n'); > > } > > > > @@ -137,6 +144,7 @@ static int percpu_stats_show(struct seq_file *m, void *v) > > struct pcpu_chunk *chunk; > > int slot, max_nr_alloc; > > int *buffer; > > + enum pcpu_chunk_type type; > > > > alloc_buffer: > > spin_lock_irq(&pcpu_lock); > > @@ -202,18 +210,18 @@ static int percpu_stats_show(struct seq_file *m, void *v) > > chunk_map_stats(m, pcpu_reserved_chunk, buffer); > > } > > > > - for (slot = 0; slot < pcpu_nr_slots; slot++) { > > - list_for_each_entry(chunk, &pcpu_slot[slot], list) { > > - if (chunk == pcpu_first_chunk) { > > - seq_puts(m, "Chunk: <- First Chunk\n"); > > - chunk_map_stats(m, chunk, buffer); > > - > > - > > - } else { > > - seq_puts(m, "Chunk:\n"); > > - chunk_map_stats(m, chunk, buffer); > > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) { > > + for (slot = 0; slot < pcpu_nr_slots; slot++) { > > + list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot], > > + list) { > > + if (chunk == pcpu_first_chunk) { > > + seq_puts(m, "Chunk: <- First Chunk\n"); > > + chunk_map_stats(m, chunk, buffer); > > + } else { > > + seq_puts(m, "Chunk:\n"); > > + chunk_map_stats(m, chunk, buffer); > > + } > > } > > - > > } > > } > > > > diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c > > index a2b395acef89..e46f7a6917f9 100644 > > --- a/mm/percpu-vm.c > > +++ b/mm/percpu-vm.c > > @@ -328,12 +328,13 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > > pcpu_free_pages(chunk, pages, page_start, page_end); > > } > > > > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) > > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > > + gfp_t gfp) > > { > > struct pcpu_chunk *chunk; > > struct vm_struct **vms; > > > > - chunk = pcpu_alloc_chunk(gfp); > > + chunk = pcpu_alloc_chunk(type, gfp); > > if (!chunk) > > return NULL; > > > > diff --git a/mm/percpu.c b/mm/percpu.c > > index aa36b78d45a6..85f5755c9114 100644 > > --- a/mm/percpu.c > > +++ b/mm/percpu.c > > @@ -37,9 +37,14 @@ > > * takes care of normal allocations. > > * > > * The allocator organizes chunks into lists according to free size and > > - * tries to allocate from the fullest chunk first. Each chunk is managed > > - * by a bitmap with metadata blocks. The allocation map is updated on > > - * every allocation and free to reflect the current state while the boundary > > + * memcg-awareness. To make a percpu allocation memcg-aware the __GFP_ACCOUNT > > + * flag should be passed. All memcg-aware allocations are sharing one set > > + * of chunks and all unaccounted allocations and allocations performed > > + * by processes belonging to the root memory cgroup are using the second set. > > + * > > + * The allocator tries to allocate from the fullest chunk first. Each chunk > > + * is managed by a bitmap with metadata blocks. The allocation map is updated > > + * on every allocation and free to reflect the current state while the boundary > > * map is only updated on allocation. Each metadata block contains > > * information to help mitigate the need to iterate over large portions > > * of the bitmap. 
The reverse mapping from page to chunk is stored in > > @@ -81,6 +86,7 @@ > > #include <linux/kmemleak.h> > > #include <linux/sched.h> > > #include <linux/sched/mm.h> > > +#include <linux/memcontrol.h> > > > > #include <asm/cacheflush.h> > > #include <asm/sections.h> > > @@ -160,7 +166,7 @@ struct pcpu_chunk *pcpu_reserved_chunk __ro_after_init; > > DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */ > > static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map ext */ > > > > -struct list_head *pcpu_slot __ro_after_init; /* chunk list slots */ > > +struct list_head *pcpu_chunk_lists __ro_after_init; /* chunk list slots */ > > > > /* chunks which need their map areas extended, protected by pcpu_lock */ > > static LIST_HEAD(pcpu_map_extend_chunks); > > @@ -500,6 +506,9 @@ static void __pcpu_chunk_move(struct pcpu_chunk *chunk, int slot, > > bool move_front) > > { > > if (chunk != pcpu_reserved_chunk) { > > + struct list_head *pcpu_slot; > > + > > + pcpu_slot = pcpu_chunk_list(pcpu_chunk_type(chunk)); > > if (move_front) > > list_move(&chunk->list, &pcpu_slot[slot]); > > else > > @@ -1341,6 +1350,10 @@ static struct pcpu_chunk * __init pcpu_alloc_first_chunk(unsigned long tmp_addr, > > panic("%s: Failed to allocate %zu bytes\n", __func__, > > alloc_size); > > > > +#ifdef CONFIG_MEMCG_KMEM > > + /* first chunk isn't memcg-aware */ > > + chunk->obj_cgroups = NULL; > > +#endif > > pcpu_init_md_blocks(chunk); > > > > /* manage populated page bitmap */ > > @@ -1380,7 +1393,7 @@ static struct pcpu_chunk * __init pcpu_alloc_first_chunk(unsigned long tmp_addr, > > return chunk; > > } > > > > -static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > > +static struct pcpu_chunk *pcpu_alloc_chunk(enum pcpu_chunk_type type, gfp_t gfp) > > { > > struct pcpu_chunk *chunk; > > int region_bits; > > @@ -1408,6 +1421,16 @@ static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > > if (!chunk->md_blocks) > > goto md_blocks_fail; > > > > +#ifdef CONFIG_MEMCG_KMEM > > + if (pcpu_is_memcg_chunk(type)) { > > + chunk->obj_cgroups = > > + pcpu_mem_zalloc(pcpu_chunk_map_bits(chunk) * > > + sizeof(struct obj_cgroup *), gfp); > > + if (!chunk->obj_cgroups) > > + goto objcg_fail; > > + } > > +#endif > > + > > pcpu_init_md_blocks(chunk); > > > > /* init metadata */ > > @@ -1415,6 +1438,8 @@ static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) > > > > return chunk; > > > > +objcg_fail: > > + pcpu_mem_free(chunk->md_blocks); > > md_blocks_fail: > > pcpu_mem_free(chunk->bound_map); > > bound_map_fail: > > @@ -1429,6 +1454,9 @@ static void pcpu_free_chunk(struct pcpu_chunk *chunk) > > { > > if (!chunk) > > return; > > +#ifdef CONFIG_MEMCG_KMEM > > + pcpu_mem_free(chunk->obj_cgroups); > > +#endif > > pcpu_mem_free(chunk->md_blocks); > > pcpu_mem_free(chunk->bound_map); > > pcpu_mem_free(chunk->alloc_map); > > @@ -1505,7 +1533,8 @@ static int pcpu_populate_chunk(struct pcpu_chunk *chunk, > > int page_start, int page_end, gfp_t gfp); > > static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, > > int page_start, int page_end); > > -static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp); > > +static struct pcpu_chunk *pcpu_create_chunk(enum pcpu_chunk_type type, > > + gfp_t gfp); > > static void pcpu_destroy_chunk(struct pcpu_chunk *chunk); > > static struct page *pcpu_addr_to_page(void *addr); > > static int __init pcpu_verify_alloc_info(const struct pcpu_alloc_info *ai); > > @@ -1547,6 +1576,77 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr) > > return 
pcpu_get_page_chunk(pcpu_addr_to_page(addr)); > > } > > > > +#ifdef CONFIG_MEMCG_KMEM > > +static enum pcpu_chunk_type pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp, > > + struct obj_cgroup **objcgp) > > +{ > > + struct obj_cgroup *objcg; > > + > > + if (!memcg_kmem_enabled() || !(gfp & __GFP_ACCOUNT) || > > + memcg_kmem_bypass()) > > + return PCPU_CHUNK_ROOT; > > + > > + objcg = get_obj_cgroup_from_current(); > > + if (!objcg) > > + return PCPU_CHUNK_ROOT; > > + > > + if (obj_cgroup_charge(objcg, gfp, size * num_possible_cpus())) { > > + obj_cgroup_put(objcg); > > + return PCPU_FAIL_ALLOC; > > + } > > + > > + *objcgp = objcg; > > + return PCPU_CHUNK_MEMCG; > > +} > > + > > +static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg, > > + struct pcpu_chunk *chunk, int off, > > + size_t size) > > +{ > > + if (!objcg) > > + return; > > + > > + if (chunk) { > > + chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = objcg; > > + } else { > > + obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > > + obj_cgroup_put(objcg); > > + } > > +} > > + > > +static void pcpu_memcg_free_hook(struct pcpu_chunk *chunk, int off, size_t size) > > +{ > > + struct obj_cgroup *objcg; > > + > > + if (!pcpu_is_memcg_chunk(pcpu_chunk_type(chunk))) > > + return; > > + > > + objcg = chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT]; > > + chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = NULL; > > + > > + obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > > + > > + obj_cgroup_put(objcg); > > +} > > + > > +#else /* CONFIG_MEMCG_KMEM */ > > +static enum pcpu_chunk_type pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp, > > + struct mem_cgroup **memcgp) > > +{ > > + return PCPU_CHUNK_ROOT; > > +} > > + > > +static void pcpu_memcg_post_alloc_hook(struct mem_cgroup *memcg, > > + struct pcpu_chunk *chunk, int off, > > + size_t size) > > +{ > > +} > > + > > +static void pcpu_memcg_free_hook(struct pcpu_chunk *chunk, int off, size_t size) > > +{ > > +} > > +#endif /* CONFIG_MEMCG_KMEM */ > > + > > /** > > * pcpu_alloc - the percpu allocator > > * @size: size of area to allocate in bytes > > @@ -1568,6 +1668,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > gfp_t pcpu_gfp; > > bool is_atomic; > > bool do_warn; > > + enum pcpu_chunk_type type; > > + struct list_head *pcpu_slot; > > + struct obj_cgroup *objcg = NULL; > > static int warn_limit = 10; > > struct pcpu_chunk *chunk, *next; > > const char *err; > > @@ -1602,16 +1705,23 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > return NULL; > > } > > > > + type = pcpu_memcg_pre_alloc_hook(size, gfp, &objcg); > > + if (unlikely(type == PCPU_FAIL_ALLOC)) > > + return NULL; > > + pcpu_slot = pcpu_chunk_list(type); > > + > > if (!is_atomic) { > > /* > > * pcpu_balance_workfn() allocates memory under this mutex, > > * and it may wait for memory reclaim. Allow current task > > * to become OOM victim, in case of memory pressure. 
> > */ > > - if (gfp & __GFP_NOFAIL) > > + if (gfp & __GFP_NOFAIL) { > > mutex_lock(&pcpu_alloc_mutex); > > - else if (mutex_lock_killable(&pcpu_alloc_mutex)) > > + } else if (mutex_lock_killable(&pcpu_alloc_mutex)) { > > + pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size); > > return NULL; > > + } > > } > > > > spin_lock_irqsave(&pcpu_lock, flags); > > @@ -1637,7 +1747,8 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > restart: > > /* search through normal chunks */ > > for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) { > > - list_for_each_entry_safe(chunk, next, &pcpu_slot[slot], list) { > > + list_for_each_entry_safe(chunk, next, &pcpu_slot[slot], > > + list) { > > nit: this line change doesn't do anything. Can you please remove it. > > > off = pcpu_find_block_fit(chunk, bits, bit_align, > > is_atomic); > > if (off < 0) { > > @@ -1666,7 +1777,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > } > > > > if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) { > > - chunk = pcpu_create_chunk(pcpu_gfp); > > + chunk = pcpu_create_chunk(type, pcpu_gfp); > > if (!chunk) { > > err = "failed to allocate new chunk"; > > goto fail; > > @@ -1723,6 +1834,8 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > trace_percpu_alloc_percpu(reserved, is_atomic, size, align, > > chunk->base_addr, off, ptr); > > > > + pcpu_memcg_post_alloc_hook(objcg, chunk, off, size); > > + > > return ptr; > > > > fail_unlock: > > @@ -1744,6 +1857,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, > > } else { > > mutex_unlock(&pcpu_alloc_mutex); > > } > > + > > + pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size); > > + > > return NULL; > > } > > > > @@ -1803,8 +1919,8 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) > > } > > > > /** > > - * pcpu_balance_workfn - manage the amount of free chunks and populated pages > > - * @work: unused > > + * __pcpu_balance_workfn - manage the amount of free chunks and populated pages > > + * @type: chunk type > > * > > * Reclaim all fully free chunks except for the first one. This is also > > * responsible for maintaining the pool of empty populated pages. However, > > @@ -1813,11 +1929,12 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) > > * allocation causes the failure as it is possible that requests can be > > * serviced from already backed regions. 
> > */ > > -static void pcpu_balance_workfn(struct work_struct *work) > > +static void __pcpu_balance_workfn(enum pcpu_chunk_type type) > > { > > /* gfp flags passed to underlying allocators */ > > const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; > > LIST_HEAD(to_free); > > + struct list_head *pcpu_slot = pcpu_chunk_list(type); > > struct list_head *free_head = &pcpu_slot[pcpu_nr_slots - 1]; > > struct pcpu_chunk *chunk, *next; > > int slot, nr_to_pop, ret; > > @@ -1915,7 +2032,7 @@ static void pcpu_balance_workfn(struct work_struct *work) > > > > if (nr_to_pop) { > > /* ran out of chunks to populate, create a new one and retry */ > > - chunk = pcpu_create_chunk(gfp); > > + chunk = pcpu_create_chunk(type, gfp); > > if (chunk) { > > spin_lock_irq(&pcpu_lock); > > pcpu_chunk_relocate(chunk, -1); > > @@ -1927,6 +2044,20 @@ static void pcpu_balance_workfn(struct work_struct *work) > > mutex_unlock(&pcpu_alloc_mutex); > > } > > > > +/** > > + * pcpu_balance_workfn - manage the amount of free chunks and populated pages > > + * @work: unused > > + * > > + * Call __pcpu_balance_workfn() for each chunk type. > > + */ > > +static void pcpu_balance_workfn(struct work_struct *work) > > +{ > > + enum pcpu_chunk_type type; > > + > > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > > + __pcpu_balance_workfn(type); > > +} > > + > > /** > > * free_percpu - free percpu area > > * @ptr: pointer to area to free > > @@ -1941,8 +2072,9 @@ void free_percpu(void __percpu *ptr) > > void *addr; > > struct pcpu_chunk *chunk; > > unsigned long flags; > > - int off; > > + int size, off; > > bool need_balance = false; > > + struct list_head *pcpu_slot; > > > > if (!ptr) > > return; > > @@ -1956,7 +2088,11 @@ void free_percpu(void __percpu *ptr) > > chunk = pcpu_chunk_addr_search(addr); > > off = addr - chunk->base_addr; > > > > - pcpu_free_area(chunk, off); > > + size = pcpu_free_area(chunk, off); > > + > > + pcpu_slot = pcpu_chunk_list(pcpu_chunk_type(chunk)); > > + > > + pcpu_memcg_free_hook(chunk, off, size); > > > > /* if there are more than one fully free chunks, wake up grim reaper */ > > if (chunk->free_bytes == pcpu_unit_size) { > > @@ -2267,6 +2403,7 @@ void __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, > > int map_size; > > unsigned long tmp_addr; > > size_t alloc_size; > > + enum pcpu_chunk_type type; > > > > #define PCPU_SETUP_BUG_ON(cond) do { \ > > if (unlikely(cond)) { \ > > @@ -2384,13 +2521,18 @@ void __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, > > * empty chunks. > > */ > > pcpu_nr_slots = __pcpu_size_to_slot(pcpu_unit_size) + 2; > > - pcpu_slot = memblock_alloc(pcpu_nr_slots * sizeof(pcpu_slot[0]), > > - SMP_CACHE_BYTES); > > - if (!pcpu_slot) > > + pcpu_chunk_lists = memblock_alloc(pcpu_nr_slots * > > + sizeof(pcpu_chunk_lists[0]) * > > + PCPU_NR_CHUNK_TYPES, > > + SMP_CACHE_BYTES); > > + if (!pcpu_chunk_lists) > > panic("%s: Failed to allocate %zu bytes\n", __func__, > > - pcpu_nr_slots * sizeof(pcpu_slot[0])); > > - for (i = 0; i < pcpu_nr_slots; i++) > > - INIT_LIST_HEAD(&pcpu_slot[i]); > > + pcpu_nr_slots * sizeof(pcpu_chunk_lists[0]) * > > + PCPU_NR_CHUNK_TYPES); > > + > > + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) > > + for (i = 0; i < pcpu_nr_slots; i++) > > + INIT_LIST_HEAD(&pcpu_chunk_list(type)[i]); > > > > /* > > * The end of the static region needs to be aligned with the > > -- > > 2.25.4 > > > > There were just 2 minor nits. Do you mind resending with them fixed as > I'm not sure I'll be carrying these patches or not. 
Sure, will send v2 based on the slab controller v6 early next week.

> 
> Acked-by: Dennis Zhou <dennis@kernel.org>

Thank you!

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH v1 3/5] mm: memcg/percpu: per-memcg percpu memory statistics [not found] ` <20200528232508.1132382-4-guro@fb.com> @ 2020-06-05 19:53 ` Dennis Zhou 0 siblings, 0 replies; 7+ messages in thread From: Dennis Zhou @ 2020-06-05 19:53 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Thu, May 28, 2020 at 04:25:06PM -0700, Roman Gushchin wrote: > Percpu memory can represent a noticeable chunk of the total > memory consumption, especially on big machines with many CPUs. > Let's track percpu memory usage for each memcg and display > it in memory.stat. > > A percpu allocation is usually scattered over multiple pages > (and nodes), and can be significantly smaller than a page. > So let's add a byte-sized counter on the memcg level: > MEMCG_PERCPU_B. Byte-sized vmstat infra created for slabs > can be perfectly reused for percpu case. > > Signed-off-by: Roman Gushchin <guro@fb.com> > --- > Documentation/admin-guide/cgroup-v2.rst | 4 ++++ > include/linux/memcontrol.h | 8 ++++++++ > mm/memcontrol.c | 4 +++- > mm/percpu.c | 10 ++++++++++ > 4 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst > index fed4e1d2a343..aa8cb6dadadc 100644 > --- a/Documentation/admin-guide/cgroup-v2.rst > +++ b/Documentation/admin-guide/cgroup-v2.rst > @@ -1276,6 +1276,10 @@ PAGE_SIZE multiple when read back. > Amount of memory used for storing in-kernel data > structures. > > + percpu > + Amount of memory used for storing per-cpu kernel > + data structures. > + > sock > Amount of memory used in network transmission buffers > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > index 7a84d9164449..f62a95d472f7 100644 > --- a/include/linux/memcontrol.h > +++ b/include/linux/memcontrol.h > @@ -32,11 +32,19 @@ struct kmem_cache; > enum memcg_stat_item { > MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS, > MEMCG_SOCK, > + MEMCG_PERCPU_B, > /* XXX: why are these zone and not node counters? 
*/ > MEMCG_KERNEL_STACK_KB, > MEMCG_NR_STAT, > }; > > +static __always_inline bool memcg_stat_item_in_bytes(enum memcg_stat_item item) > +{ > + if (item == MEMCG_PERCPU_B) > + return true; > + return vmstat_item_in_bytes(item); > +} > + > enum memcg_memory_event { > MEMCG_LOW, > MEMCG_HIGH, > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 7bc3fd196210..5007d1585a4a 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -783,7 +783,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val) > if (mem_cgroup_disabled()) > return; > > - if (vmstat_item_in_bytes(idx)) > + if (memcg_stat_item_in_bytes(idx)) > threshold <<= PAGE_SHIFT; > > x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]); > @@ -1490,6 +1490,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg) > seq_buf_printf(&s, "slab %llu\n", > (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + > memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B))); > + seq_buf_printf(&s, "percpu %llu\n", > + (u64)memcg_page_state(memcg, MEMCG_PERCPU_B)); > seq_buf_printf(&s, "sock %llu\n", > (u64)memcg_page_state(memcg, MEMCG_SOCK) * > PAGE_SIZE); > diff --git a/mm/percpu.c b/mm/percpu.c > index 85f5755c9114..b4b3e9c8a6d1 100644 > --- a/mm/percpu.c > +++ b/mm/percpu.c > @@ -1608,6 +1608,11 @@ static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg, > > if (chunk) { > chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = objcg; > + > + rcu_read_lock(); > + mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_PERCPU_B, > + size * num_possible_cpus()); > + rcu_read_unlock(); > } else { > obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > obj_cgroup_put(objcg); > @@ -1626,6 +1631,11 @@ static void pcpu_memcg_free_hook(struct pcpu_chunk *chunk, int off, size_t size) > > obj_cgroup_uncharge(objcg, size * num_possible_cpus()); > > + rcu_read_lock(); > + mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_PERCPU_B, > + -(size * num_possible_cpus())); > + rcu_read_unlock(); > + > obj_cgroup_put(objcg); > } > > -- > 2.25.4 > Acked-by: Dennis Zhou <dennis@kernel.org> Thanks, Dennis ^ permalink raw reply [flat|nested] 7+ messages in thread
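A small userspace sketch of how the new counter can be read back; the cgroup
path here is only an example and must be adjusted to the hierarchy being
inspected. It mirrors what cg_read_key_long() does in the selftest added later
in this series.

#include <stdio.h>

int main(void)
{
	/* example path: the memory.stat file of the cgroup of interest */
	const char *path = "/sys/fs/cgroup/memory.stat";
	char line[256];
	long long bytes;
	FILE *f = fopen(path, "r");

	if (!f) {
		perror("fopen");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		/* MEMCG_PERCPU_B is byte-sized, so no PAGE_SIZE scaling here */
		if (sscanf(line, "percpu %lld", &bytes) == 1)
			printf("percpu: %lld bytes\n", bytes);
	}

	fclose(f);
	return 0;
}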
* Re: [PATCH v1 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup [not found] ` <20200528232508.1132382-5-guro@fb.com> @ 2020-06-05 19:54 ` Dennis Zhou 0 siblings, 0 replies; 7+ messages in thread From: Dennis Zhou @ 2020-06-05 19:54 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Thu, May 28, 2020 at 04:25:07PM -0700, Roman Gushchin wrote: > Memory cgroups are using large chunks of percpu memory to store > vmstat data. Yet this memory is not accounted at all, so in the > case when there are many (dying) cgroups, it's not exactly clear > where all the memory is. > > Because the size of memory cgroup internal structures can > dramatically exceed the size of object or page which is pinning > it in the memory, it's not a good idea to simple ignore it. > It actually breaks the isolation between cgroups. > > Let's account the consumed percpu memory to the parent cgroup. > > Signed-off-by: Roman Gushchin <guro@fb.com> > --- > mm/memcontrol.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 5007d1585a4a..0dd0d05a011c 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -5020,13 +5020,15 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node) > if (!pn) > return 1; > > - pn->lruvec_stat_local = alloc_percpu(struct lruvec_stat); > + pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat, > + GFP_KERNEL_ACCOUNT); > if (!pn->lruvec_stat_local) { > kfree(pn); > return 1; > } > > - pn->lruvec_stat_cpu = alloc_percpu(struct lruvec_stat); > + pn->lruvec_stat_cpu = alloc_percpu_gfp(struct lruvec_stat, > + GFP_KERNEL_ACCOUNT); > if (!pn->lruvec_stat_cpu) { > free_percpu(pn->lruvec_stat_local); > kfree(pn); > @@ -5100,11 +5102,13 @@ static struct mem_cgroup *mem_cgroup_alloc(void) > goto fail; > } > > - memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu); > + memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu, > + GFP_KERNEL_ACCOUNT); > if (!memcg->vmstats_local) > goto fail; > > - memcg->vmstats_percpu = alloc_percpu(struct memcg_vmstats_percpu); > + memcg->vmstats_percpu = alloc_percpu_gfp(struct memcg_vmstats_percpu, > + GFP_KERNEL_ACCOUNT); > if (!memcg->vmstats_percpu) > goto fail; > > @@ -5153,7 +5157,9 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) > struct mem_cgroup *memcg; > long error = -ENOMEM; > > + memalloc_use_memcg(parent); > memcg = mem_cgroup_alloc(); > + memalloc_unuse_memcg(); > if (IS_ERR(memcg)) > return ERR_CAST(memcg); > > -- > 2.25.4 > Acked-by: Dennis Zhou <dennis@kernel.org> Thanks, Dennis ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 5/5] kselftests: cgroup: add perpcu memory accounting test [not found] ` <20200528232508.1132382-6-guro@fb.com> @ 2020-06-05 20:07 ` Dennis Zhou 2020-06-05 22:47 ` Roman Gushchin 0 siblings, 1 reply; 7+ messages in thread From: Dennis Zhou @ 2020-06-05 20:07 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Thu, May 28, 2020 at 04:25:08PM -0700, Roman Gushchin wrote: > Add a simple test to check the percpu memory accounting. > The test creates a cgroup tree with 1000 child cgroups > and checks values of memory.current and memory.stat::percpu. > > Signed-off-by: Roman Gushchin <guro@fb.com> > --- > tools/testing/selftests/cgroup/test_kmem.c | 59 ++++++++++++++++++++++ > 1 file changed, 59 insertions(+) > > diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c > index 5224dae216e5..a0d4f1a3137d 100644 > --- a/tools/testing/selftests/cgroup/test_kmem.c > +++ b/tools/testing/selftests/cgroup/test_kmem.c > @@ -331,6 +331,64 @@ static int test_kmem_dead_cgroups(const char *root) > return ret; > } > > +/* > + * This test creates a sub-tree with 1000 memory cgroups. > + * Then it checks that the memory.current on the parent level > + * is greater than 0 and approximates matches the percpu value > + * from memory.stat. > + */ > +static int test_percpu_basic(const char *root) > +{ > + int ret = KSFT_FAIL; > + char *parent, *child; > + long current, percpu; > + int i; > + > + parent = cg_name(root, "percpu_basic_test"); > + if (!parent) > + goto cleanup; > + > + if (cg_create(parent)) > + goto cleanup; > + > + if (cg_write(parent, "cgroup.subtree_control", "+memory")) > + goto cleanup; > + > + for (i = 0; i < 1000; i++) { > + child = cg_name_indexed(parent, "child", i); > + if (!child) > + return -1; > + > + if (cg_create(child)) > + goto cleanup_children; > + > + free(child); > + } > + > + current = cg_read_long(parent, "memory.current"); > + percpu = cg_read_key_long(parent, "memory.stat", "percpu "); > + > + if (current > 0 && percpu > 0 && abs(current - percpu) < > + 4096 * 32 * get_nprocs()) So this is checking that we've allocated less than 32 pages per cpu over 1000 child cgroups that's not percpu memory? Is there a more definitive measurement or at least a comment we can leave saying why this limit was chosen. > + ret = KSFT_PASS; > + else > + printf("memory.current %ld\npercpu %ld\n", > + current, percpu); > + > +cleanup_children: > + for (i = 0; i < 1000; i++) { > + child = cg_name_indexed(parent, "child", i); > + cg_destroy(child); > + free(child); > + } > + > +cleanup: > + cg_destroy(parent); > + free(parent); > + > + return ret; > +} > + > #define T(x) { x, #x } > struct kmem_test { > int (*fn)(const char *root); > @@ -341,6 +399,7 @@ struct kmem_test { > T(test_kmem_proc_kpagecgroup), > T(test_kmem_kernel_stacks), > T(test_kmem_dead_cgroups), > + T(test_percpu_basic), > }; > #undef T > > -- > 2.25.4 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 5/5] kselftests: cgroup: add perpcu memory accounting test 2020-06-05 20:07 ` [PATCH v1 5/5] kselftests: cgroup: add perpcu memory accounting test Dennis Zhou @ 2020-06-05 22:47 ` Roman Gushchin 0 siblings, 0 replies; 7+ messages in thread From: Roman Gushchin @ 2020-06-05 22:47 UTC (permalink / raw) To: Dennis Zhou Cc: Andrew Morton, Tejun Heo, Christoph Lameter, Johannes Weiner, Michal Hocko, Shakeel Butt, linux-mm, kernel-team, linux-kernel On Fri, Jun 05, 2020 at 08:07:51PM +0000, Dennis Zhou wrote: > On Thu, May 28, 2020 at 04:25:08PM -0700, Roman Gushchin wrote: > > Add a simple test to check the percpu memory accounting. > > The test creates a cgroup tree with 1000 child cgroups > > and checks values of memory.current and memory.stat::percpu. > > > > Signed-off-by: Roman Gushchin <guro@fb.com> > > --- > > tools/testing/selftests/cgroup/test_kmem.c | 59 ++++++++++++++++++++++ > > 1 file changed, 59 insertions(+) > > > > diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c > > index 5224dae216e5..a0d4f1a3137d 100644 > > --- a/tools/testing/selftests/cgroup/test_kmem.c > > +++ b/tools/testing/selftests/cgroup/test_kmem.c > > @@ -331,6 +331,64 @@ static int test_kmem_dead_cgroups(const char *root) > > return ret; > > } > > > > +/* > > + * This test creates a sub-tree with 1000 memory cgroups. > > + * Then it checks that the memory.current on the parent level > > + * is greater than 0 and approximates matches the percpu value > > + * from memory.stat. > > + */ > > +static int test_percpu_basic(const char *root) > > +{ > > + int ret = KSFT_FAIL; > > + char *parent, *child; > > + long current, percpu; > > + int i; > > + > > + parent = cg_name(root, "percpu_basic_test"); > > + if (!parent) > > + goto cleanup; > > + > > + if (cg_create(parent)) > > + goto cleanup; > > + > > + if (cg_write(parent, "cgroup.subtree_control", "+memory")) > > + goto cleanup; > > + > > + for (i = 0; i < 1000; i++) { > > + child = cg_name_indexed(parent, "child", i); > > + if (!child) > > + return -1; > > + > > + if (cg_create(child)) > > + goto cleanup_children; > > + > > + free(child); > > + } > > + > > + current = cg_read_long(parent, "memory.current"); > > + percpu = cg_read_key_long(parent, "memory.stat", "percpu "); > > + > > + if (current > 0 && percpu > 0 && abs(current - percpu) < > > + 4096 * 32 * get_nprocs()) > > So this is checking that we've allocated less than 32 pages per cpu over > 1000 child cgroups that's not percpu memory? Is there a more definitive > measurement or at least a comment we can leave saying why this limit was > chosen. It simple means that "current" should be approximately equal to "percpu" statistics. Both charging and vmstat paths are using percpu batching, and the batch size is 32 pages. I'll add a comment to make it more obvious. Thanks! 
> > > + ret = KSFT_PASS; > > + else > > + printf("memory.current %ld\npercpu %ld\n", > > + current, percpu); > > + > > +cleanup_children: > > + for (i = 0; i < 1000; i++) { > > + child = cg_name_indexed(parent, "child", i); > > + cg_destroy(child); > > + free(child); > > + } > > + > > +cleanup: > > + cg_destroy(parent); > > + free(parent); > > + > > + return ret; > > +} > > + > > #define T(x) { x, #x } > > struct kmem_test { > > int (*fn)(const char *root); > > @@ -341,6 +399,7 @@ struct kmem_test { > > T(test_kmem_proc_kpagecgroup), > > T(test_kmem_kernel_stacks), > > T(test_kmem_dead_cgroups), > > + T(test_percpu_basic), > > }; > > #undef T > > > > -- > > 2.25.4 > > > > ^ permalink raw reply [flat|nested] 7+ messages in thread
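To make the chosen tolerance concrete, here is a tiny userspace sketch of the
slack the test allows, using the same 4096 * 32 * get_nprocs() expression. The
32-page figure is the per-cpu batch size cited in the reply above, and 4096 is
assumed to be the page size on the test machine.

#include <stdio.h>
#include <sys/sysinfo.h>

#define PAGE_SIZE_ASSUMED	4096	/* page size assumed by the test's literal */
#define CHARGE_BATCH_PAGES	32	/* per-cpu batching cited in the reply above */

int main(void)
{
	long slack = (long)PAGE_SIZE_ASSUMED * CHARGE_BATCH_PAGES * get_nprocs();

	/* the test allows memory.current and memory.stat::percpu to differ by this much */
	printf("allowed |memory.current - percpu|: %ld bytes\n", slack);
	return 0;
}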