From: Mateusz Guzik <mjguzik@gmail.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, shakeelb@google.com,
	linux-mm@kvack.org, jack@suse.cz
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
Date: Tue, 22 Aug 2023 00:29:49 +0200	[thread overview]
Message-ID: <CAGudoHHJECp2-DfSr5hudooAdV6mivvSO+4mC9kwUrWnSiob5g@mail.gmail.com> (raw)
In-Reply-To: <20230821213951.bx3yyqh7omdvpyae@f>

On 8/21/23, Mateusz Guzik <mjguzik@gmail.com> wrote:
> On Mon, Aug 21, 2023 at 02:07:28PM -0700, Dennis Zhou wrote:
>> On Mon, Aug 21, 2023 at 10:28:27PM +0200, Mateusz Guzik wrote:
>> > With this out of the way I'll be looking at some form of caching to
>> > eliminate these allocs as a problem.
>> >
>>
>> I'm not against caching, this is just my first thought. Caching will
>> have an impact on the backing pages of percpu. All it takes is 1
>> allocation on a page for the current allocator to pin n pages of memory.
>> A few years ago percpu depopulation was implemented, which limits the
>> amount of resident backing pages.
>>
>
> I'm painfully aware.
>
>> Maybe the right thing to do is preallocate pools of common-sized
>> allocations so they can be recycled, so that we don't have to think
>> too hard about the fragmentation that can occur if we populate these
>> pools over time?
>>
>
> This is what I was going to suggest :)
>
> FreeBSD has a per-cpu allocator which pretends to be the same as the
> slab allocator, except it hands out per-cpu bufs. So far it has size
> classes of 4, 8, 16, 32 and 64 bytes, and you can malloc as if in any
> of those sizes.
>
> Scales perfectly fine of course since it caches objs per-CPU, but there
> is some waste and I have 0 idea how it compares to what Linux is doing
> on that front.
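
To illustrate, here is a minimal C sketch of such a size-class front
end. All names are made up and this is neither the FreeBSD nor the
Linux code, just the shape of the idea (lock initialization at boot is
omitted):

#include <linux/kernel.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>

#define PCPU_NR_CLASSES 5

static const size_t pcpu_class_size[PCPU_NR_CLASSES] = { 4, 8, 16, 32, 64 };

struct pcpu_buf_cache {
	spinlock_t lock;		/* one global lock per size class */
	int nr_free;
	void __percpu *free[32];	/* recycled per-cpu allocations */
};

static struct pcpu_buf_cache pcpu_cache[PCPU_NR_CLASSES];

/* Map a request to the smallest size class that fits, -1 if too big. */
static int pcpu_size_to_class(size_t size)
{
	int i;

	for (i = 0; i < PCPU_NR_CLASSES; i++)
		if (size <= pcpu_class_size[i])
			return i;
	return -1;
}

static void __percpu *pcpu_buf_alloc(size_t size)
{
	int class = pcpu_size_to_class(size);
	struct pcpu_buf_cache *c;
	void __percpu *p = NULL;

	if (class < 0)	/* too big to cache, fall back */
		return __alloc_percpu(size, __alignof__(unsigned long));

	c = &pcpu_cache[class];
	spin_lock(&c->lock);
	if (c->nr_free > 0)
		p = c->free[--c->nr_free];
	spin_unlock(&c->lock);

	return p ? p : __alloc_percpu(pcpu_class_size[class],
				      __alignof__(unsigned long));
}

static void pcpu_buf_free(void __percpu *p, size_t size)
{
	int class = pcpu_size_to_class(size);
	struct pcpu_buf_cache *c;

	if (class < 0) {
		free_percpu(p);
		return;
	}

	c = &pcpu_cache[class];
	spin_lock(&c->lock);
	if (c->nr_free < ARRAY_SIZE(c->free)) {
		c->free[c->nr_free++] = p;
		p = NULL;	/* cached, nothing left to free */
	}
	spin_unlock(&c->lock);
	free_percpu(p);		/* nop when p was cached (NULL) */
}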
>
> I stress though that even if you were to carve out certain sizes, a
> global lock to handle ops will still kill scalability.
>
> Perhaps granularity finer than global, but coarser than per-CPU, would
> be a sweet spot for scalability vs memory waste.
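
Sharding the cache sketched above would be the obvious middle ground,
something like this (shard count pulled out of a hat):

#define PCPU_CACHE_SHARDS 16

static struct pcpu_buf_cache pcpu_cache[PCPU_NR_CLASSES][PCPU_CACHE_SHARDS];

static struct pcpu_buf_cache *pcpu_cache_shard(int class)
{
	/* hash on the CPU id; per-NUMA-node shards would also work */
	return &pcpu_cache[class][raw_smp_processor_id() % PCPU_CACHE_SHARDS];
}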
>
> That said...
>
>> Also as you've pointed out, it wasn't just the percpu allocation being
>> the bottleneck, but percpu_counter's global lock too for hotplug
>> support. I'm hazarding a guess that most percpu use cases have
>> additional locking requirements too, such as percpu_counter.
>>
>
> True Fix(tm) is a longer story.
>
> Maybe let's sort out this patchset first, whichever way. :)
>

So I found the discussion around the original patch, including a perf
regression report:

https://lore.kernel.org/linux-mm/20230608111408.s2minsenlcjow7q3@quack3/

The reporter suggests dodging the problem by only allocating per-cpu
counters once the process goes multithreaded. Given that there are
still plenty of forever-single-threaded procs out there, I think that
does sound like a great plan regardless of what happens with this
patchset.

Almost all access already goes through dedicated routines, so this
should be an afternoon of churn to sort out, unless I missed a
showstopper. (Maybe there is no good place to stuff a flag or some
other indicator of the state of the counters?)
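
For the record, a rough sketch of the accessor side, with entirely
hypothetical names; the plain -> percpu handover would need more care
against concurrent updaters than shown here:

#include <linux/atomic.h>
#include <linux/gfp.h>
#include <linux/percpu_counter.h>

struct lazy_rss_counter {
	atomic_long_t plain;		/* used while single-threaded */
	struct percpu_counter pcpu;	/* valid once 'active' is set */
	bool active;
};

static inline void lazy_rss_add(struct lazy_rss_counter *c, long v)
{
	if (smp_load_acquire(&c->active))
		percpu_counter_add(&c->pcpu, v);
	else
		atomic_long_add(v, &c->plain);
}

/* Upgrade on the first CLONE_VM fork; still single-threaded here. */
static int lazy_rss_upgrade(struct lazy_rss_counter *c)
{
	int err = percpu_counter_init(&c->pcpu,
				      atomic_long_read(&c->plain),
				      GFP_KERNEL);
	if (err)
		return err;
	smp_store_release(&c->active, true);
	return 0;
}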

That said I'll look into it some time this or next week.

>> Thanks,
>> Dennis
>>
>> > Thoughts?
>> >
>> > Mateusz Guzik (2):
>> >   pcpcntr: add group allocation/free
>> >   fork: group allocation of per-cpu counters for mm struct
>> >
>> >  include/linux/percpu_counter.h | 19 ++++++++---
>> >  kernel/fork.c                  | 13 ++------
>> >  lib/percpu_counter.c           | 61 ++++++++++++++++++++++++----------
>> >  3 files changed, 60 insertions(+), 33 deletions(-)
>> >
>> > --
>> > 2.39.2
>> >
>


-- 
Mateusz Guzik <mjguzik gmail.com>
