linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Song Liu <song@kernel.org>
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	mcgrof@kernel.org, peterz@infradead.org, tglx@linutronix.de,
	x86@kernel.org, rppt@kernel.org
Subject: Re: [PATCH 1/3] module: Introduce module_alloc_type
Date: Fri, 26 May 2023 23:19:48 -0400	[thread overview]
Message-ID: <ZHF21CSxyjQy9C7F@moria.home.lan> (raw)
In-Reply-To: <CAPhsuW5CTANR_B57w5n3HvdDjvHXOCSGTDc=a4s+JR0pKCAOcw@mail.gmail.com>

On Fri, May 26, 2023 at 05:03:18PM -0700, Song Liu wrote:
> On Fri, May 26, 2023 at 4:39 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> [...]
> 
> >
> > But it should be an internal implementation detail, I don't think we
> > want the external interface tied to vmalloc -
> >
> > > These two APIs allow the core code work with all archs. They won't break
> > > sub-page allocations. (not all archs will have sub-page allocations)
> >
> > So yes, text_poke() doesn't work on all architectures, and we need a
> > fallback.
> >
> > But if the fallback needs to go the unprotect/protect route, I think we
> > need to put that in the helper, and not expose it. Part of what people
> > are wanting is to limit or eliminate pages that are RWX, so we
> > definitely shouldn't be creating new interfaces to flipping page
> > permissions: that should be pushed down to as low a level as possible.
> >
> > E.g., with my jitalloc_update(), a more complete version would look like
> >
> > void jitalloc_update(void *dst, void *src, size_t len)
> > {
> >         if (text_poke_available) {
> >                 text_poke(...);
> >         } else {
> >                 unprotect();
> >                 memcpy();
> >                 protect();
> >         }
> > }
> 
> I think this doesn't work for sub page allocation?

Perhaps I elided too much - it does if you add a single lock. You can't
do that if it's not a common helper.

> At the end of all this, we will have modules running from huge pages, data
> and text. It will give significant performance benefit when some key driver
> cannot be compiled into the kernel.

Yeah, I've seen the numbers for the perf impact of running as a module
due to the extra TLB overhead - but Mike's recent data was showing that
this doesn't matter nearly as much as data as it does for text.

> > Also - module_memory_fill_type(), module_memory_invalidate_type() - I
> > know these are for BPF, but could you explain why we need them in the
> > external interface here? Could they perhaps be small helpers in the bpf
> > code that use something like jitalloc_update()?
> 
> These are used by all users, not just BPF. 1/3 uses them in
> kernel/module/main.c. I didn't use them in 3/3 as it is arch code, but I can
> use them instead of text_poke_* (and that is probably better code style).
> As I am writing the email, I think I should use it in ftrace.c (2/3) to avoid
> the __weak function.

Ok. Could we make it clearer why they're needed and what they're for?

I know bpf fills in invalid instructions initially; I didn't read
through enough code to find the "why", and let's document that to save
other people the same effort.

And do they really need to be callbacks in mod_alloc_params...?

Do we need the other things in mod_alloc_params/vmalloc_params?
 - your granularity field says it's for specifying PAGE/PMD size: we
   definitely do not want that. We've had way too much code along the
   lines of "implement hugepages for x", we need to stop doing that.
   It's an internal mm/ detail.

 - vmalloc_params: we need gfp_t and pgprot_t, but those should be
   normal arguments. start/end/vm_flags? They seem like historical
   module baggage, can we get rid of them?

> > > OTOH, jit_alloc (as-is) solves a subset of the problem (only the text).
> > > We can probably use it (with some updates) instead of the vmap_area
> > > based allocator. But I guess we need to align the direction first.
> >
> > Let's see if we can get tglx to chime in again, since he was pretty
> > vocal in the last discussion.
> >
> > It's also good practice to try to summarize all the relevant "whys" from
> > the discussion that went into a feature and put that in at least the
> > commit message - or even better, header file comments.
> >
> > Also, organization: the module code is a huge mess, let's _please_ do
> > split this out and think about organization a bit more, not add to the
> > mess :)
> 
> I don't really think module code is very messy at the moment. If
> necessary, we can put module_alloc_type related code in a
> separate file.

Hey, it's been organized since I last looked at it :) But I tihink this
belongs in mm/, not module code.

  reply	other threads:[~2023-05-27  3:20 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-26  5:15 [PATCH 0/3] Type aware module allocator Song Liu
2023-05-26  5:15 ` [PATCH 1/3] module: Introduce module_alloc_type Song Liu
2023-05-26  6:07   ` Randy Dunlap
2023-05-26 22:29   ` Kent Overstreet
2023-05-26 23:09     ` Song Liu
2023-05-26 23:39       ` Kent Overstreet
2023-05-27  0:03         ` Song Liu
2023-05-27  3:19           ` Kent Overstreet [this message]
2023-05-27  6:00             ` Song Liu
2023-05-26  5:15 ` [PATCH 2/3] ftrace: Add swap_func to ftrace_process_locs() Song Liu
2023-05-26  5:15 ` [PATCH 3/3] x86/module: Use module_alloc_type Song Liu
2023-05-27  7:04 ` [PATCH 0/3] Type aware module allocator Kent Overstreet
2023-05-28  5:58   ` Song Liu
2023-05-29 10:45     ` Mike Rapoport
2023-05-30 22:37       ` Song Liu
2023-05-31 13:51         ` Mike Rapoport
2023-05-31 17:03           ` Song Liu
2023-05-29 18:25     ` Kent Overstreet
2023-05-30 22:48       ` Song Liu
2023-06-01 17:53         ` Kent Overstreet
2023-06-01 18:21         ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZHF21CSxyjQy9C7F@moria.home.lan \
    --to=kent.overstreet@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).