From: Kent Overstreet <kent.overstreet@linux.dev>
To: Song Liu <song@kernel.org>
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
mcgrof@kernel.org, peterz@infradead.org, tglx@linutronix.de,
x86@kernel.org, rppt@kernel.org
Subject: Re: [PATCH 1/3] module: Introduce module_alloc_type
Date: Fri, 26 May 2023 23:19:48 -0400 [thread overview]
Message-ID: <ZHF21CSxyjQy9C7F@moria.home.lan> (raw)
In-Reply-To: <CAPhsuW5CTANR_B57w5n3HvdDjvHXOCSGTDc=a4s+JR0pKCAOcw@mail.gmail.com>
On Fri, May 26, 2023 at 05:03:18PM -0700, Song Liu wrote:
> On Fri, May 26, 2023 at 4:39 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> [...]
>
> >
> > But it should be an internal implementation detail, I don't think we
> > want the external interface tied to vmalloc -
> >
> > > These two APIs allow the core code work with all archs. They won't break
> > > sub-page allocations. (not all archs will have sub-page allocations)
> >
> > So yes, text_poke() doesn't work on all architectures, and we need a
> > fallback.
> >
> > But if the fallback needs to go the unprotect/protect route, I think we
> > need to put that in the helper, and not expose it. Part of what people
> > are wanting is to limit or eliminate pages that are RWX, so we
> > definitely shouldn't be creating new interfaces to flipping page
> > permissions: that should be pushed down to as low a level as possible.
> >
> > E.g., with my jitalloc_update(), a more complete version would look like
> >
> > void jitalloc_update(void *dst, void *src, size_t len)
> > {
> > if (text_poke_available) {
> > text_poke(...);
> > } else {
> > unprotect();
> > memcpy();
> > protect();
> > }
> > }
>
> I think this doesn't work for sub page allocation?
Perhaps I elided too much - it does if you add a single lock. You can't
do that if it's not a common helper.
> At the end of all this, we will have modules running from huge pages, data
> and text. It will give significant performance benefit when some key driver
> cannot be compiled into the kernel.
Yeah, I've seen the numbers for the perf impact of running as a module
due to the extra TLB overhead - but Mike's recent data was showing that
this doesn't matter nearly as much as data as it does for text.
> > Also - module_memory_fill_type(), module_memory_invalidate_type() - I
> > know these are for BPF, but could you explain why we need them in the
> > external interface here? Could they perhaps be small helpers in the bpf
> > code that use something like jitalloc_update()?
>
> These are used by all users, not just BPF. 1/3 uses them in
> kernel/module/main.c. I didn't use them in 3/3 as it is arch code, but I can
> use them instead of text_poke_* (and that is probably better code style).
> As I am writing the email, I think I should use it in ftrace.c (2/3) to avoid
> the __weak function.
Ok. Could we make it clearer why they're needed and what they're for?
I know bpf fills in invalid instructions initially; I didn't read
through enough code to find the "why", and let's document that to save
other people the same effort.
And do they really need to be callbacks in mod_alloc_params...?
Do we need the other things in mod_alloc_params/vmalloc_params?
- your granularity field says it's for specifying PAGE/PMD size: we
definitely do not want that. We've had way too much code along the
lines of "implement hugepages for x", we need to stop doing that.
It's an internal mm/ detail.
- vmalloc_params: we need gfp_t and pgprot_t, but those should be
normal arguments. start/end/vm_flags? They seem like historical
module baggage, can we get rid of them?
> > > OTOH, jit_alloc (as-is) solves a subset of the problem (only the text).
> > > We can probably use it (with some updates) instead of the vmap_area
> > > based allocator. But I guess we need to align the direction first.
> >
> > Let's see if we can get tglx to chime in again, since he was pretty
> > vocal in the last discussion.
> >
> > It's also good practice to try to summarize all the relevant "whys" from
> > the discussion that went into a feature and put that in at least the
> > commit message - or even better, header file comments.
> >
> > Also, organization: the module code is a huge mess, let's _please_ do
> > split this out and think about organization a bit more, not add to the
> > mess :)
>
> I don't really think module code is very messy at the moment. If
> necessary, we can put module_alloc_type related code in a
> separate file.
Hey, it's been organized since I last looked at it :) But I tihink this
belongs in mm/, not module code.
next prev parent reply other threads:[~2023-05-27 3:20 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-26 5:15 [PATCH 0/3] Type aware module allocator Song Liu
2023-05-26 5:15 ` [PATCH 1/3] module: Introduce module_alloc_type Song Liu
2023-05-26 6:07 ` Randy Dunlap
2023-05-26 22:29 ` Kent Overstreet
2023-05-26 23:09 ` Song Liu
2023-05-26 23:39 ` Kent Overstreet
2023-05-27 0:03 ` Song Liu
2023-05-27 3:19 ` Kent Overstreet [this message]
2023-05-27 6:00 ` Song Liu
2023-05-26 5:15 ` [PATCH 2/3] ftrace: Add swap_func to ftrace_process_locs() Song Liu
2023-05-26 5:15 ` [PATCH 3/3] x86/module: Use module_alloc_type Song Liu
2023-05-27 7:04 ` [PATCH 0/3] Type aware module allocator Kent Overstreet
2023-05-28 5:58 ` Song Liu
2023-05-29 10:45 ` Mike Rapoport
2023-05-30 22:37 ` Song Liu
2023-05-31 13:51 ` Mike Rapoport
2023-05-31 17:03 ` Song Liu
2023-05-29 18:25 ` Kent Overstreet
2023-05-30 22:48 ` Song Liu
2023-06-01 17:53 ` Kent Overstreet
2023-06-01 18:21 ` Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZHF21CSxyjQy9C7F@moria.home.lan \
--to=kent.overstreet@linux.dev \
--cc=bpf@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=peterz@infradead.org \
--cc=rppt@kernel.org \
--cc=song@kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).