All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Song Liu <song@kernel.org>
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	mcgrof@kernel.org, peterz@infradead.org, tglx@linutronix.de,
	x86@kernel.org, rppt@kernel.org
Subject: Re: [PATCH 1/3] module: Introduce module_alloc_type
Date: Fri, 26 May 2023 23:19:48 -0400	[thread overview]
Message-ID: <ZHF21CSxyjQy9C7F@moria.home.lan> (raw)
In-Reply-To: <CAPhsuW5CTANR_B57w5n3HvdDjvHXOCSGTDc=a4s+JR0pKCAOcw@mail.gmail.com>

On Fri, May 26, 2023 at 05:03:18PM -0700, Song Liu wrote:
> On Fri, May 26, 2023 at 4:39 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> [...]
> 
> >
> > But it should be an internal implementation detail, I don't think we
> > want the external interface tied to vmalloc -
> >
> > > These two APIs allow the core code work with all archs. They won't break
> > > sub-page allocations. (not all archs will have sub-page allocations)
> >
> > So yes, text_poke() doesn't work on all architectures, and we need a
> > fallback.
> >
> > But if the fallback needs to go the unprotect/protect route, I think we
> > need to put that in the helper, and not expose it. Part of what people
> > are wanting is to limit or eliminate pages that are RWX, so we
> > definitely shouldn't be creating new interfaces to flipping page
> > permissions: that should be pushed down to as low a level as possible.
> >
> > E.g., with my jitalloc_update(), a more complete version would look like
> >
> > void jitalloc_update(void *dst, void *src, size_t len)
> > {
> >         if (text_poke_available) {
> >                 text_poke(...);
> >         } else {
> >                 unprotect();
> >                 memcpy();
> >                 protect();
> >         }
> > }
> 
> I think this doesn't work for sub page allocation?

Perhaps I elided too much - it does if you add a single lock. You can't
do that if it's not a common helper.

> At the end of all this, we will have modules running from huge pages, data
> and text. It will give significant performance benefit when some key driver
> cannot be compiled into the kernel.

Yeah, I've seen the numbers for the perf impact of running as a module
due to the extra TLB overhead - but Mike's recent data was showing that
this doesn't matter nearly as much as data as it does for text.

> > Also - module_memory_fill_type(), module_memory_invalidate_type() - I
> > know these are for BPF, but could you explain why we need them in the
> > external interface here? Could they perhaps be small helpers in the bpf
> > code that use something like jitalloc_update()?
> 
> These are used by all users, not just BPF. 1/3 uses them in
> kernel/module/main.c. I didn't use them in 3/3 as it is arch code, but I can
> use them instead of text_poke_* (and that is probably better code style).
> As I am writing the email, I think I should use it in ftrace.c (2/3) to avoid
> the __weak function.

Ok. Could we make it clearer why they're needed and what they're for?

I know bpf fills in invalid instructions initially; I didn't read
through enough code to find the "why", and let's document that to save
other people the same effort.

And do they really need to be callbacks in mod_alloc_params...?

Do we need the other things in mod_alloc_params/vmalloc_params?
 - your granularity field says it's for specifying PAGE/PMD size: we
   definitely do not want that. We've had way too much code along the
   lines of "implement hugepages for x", we need to stop doing that.
   It's an internal mm/ detail.

 - vmalloc_params: we need gfp_t and pgprot_t, but those should be
   normal arguments. start/end/vm_flags? They seem like historical
   module baggage, can we get rid of them?

> > > OTOH, jit_alloc (as-is) solves a subset of the problem (only the text).
> > > We can probably use it (with some updates) instead of the vmap_area
> > > based allocator. But I guess we need to align the direction first.
> >
> > Let's see if we can get tglx to chime in again, since he was pretty
> > vocal in the last discussion.
> >
> > It's also good practice to try to summarize all the relevant "whys" from
> > the discussion that went into a feature and put that in at least the
> > commit message - or even better, header file comments.
> >
> > Also, organization: the module code is a huge mess, let's _please_ do
> > split this out and think about organization a bit more, not add to the
> > mess :)
> 
> I don't really think module code is very messy at the moment. If
> necessary, we can put module_alloc_type related code in a
> separate file.

Hey, it's been organized since I last looked at it :) But I tihink this
belongs in mm/, not module code.

  reply	other threads:[~2023-05-27  3:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-26  5:15 [PATCH 0/3] Type aware module allocator Song Liu
2023-05-26  5:15 ` [PATCH 1/3] module: Introduce module_alloc_type Song Liu
2023-05-26  6:07   ` Randy Dunlap
2023-05-26 22:29   ` Kent Overstreet
2023-05-26 23:09     ` Song Liu
2023-05-26 23:39       ` Kent Overstreet
2023-05-27  0:03         ` Song Liu
2023-05-27  3:19           ` Kent Overstreet [this message]
2023-05-27  6:00             ` Song Liu
2023-05-26  5:15 ` [PATCH 2/3] ftrace: Add swap_func to ftrace_process_locs() Song Liu
2023-05-26  5:15 ` [PATCH 3/3] x86/module: Use module_alloc_type Song Liu
2023-05-27  7:04 ` [PATCH 0/3] Type aware module allocator Kent Overstreet
2023-05-28  5:58   ` Song Liu
2023-05-29 10:45     ` Mike Rapoport
2023-05-30 22:37       ` Song Liu
2023-05-31 13:51         ` Mike Rapoport
2023-05-31 17:03           ` Song Liu
2023-05-31 19:54             ` Mike Rapoport
2023-05-29 18:25     ` Kent Overstreet
2023-05-30 22:48       ` Song Liu
2023-06-01 17:53         ` Kent Overstreet
2023-06-01 18:21         ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZHF21CSxyjQy9C7F@moria.home.lan \
    --to=kent.overstreet@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.