linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Song Liu <song@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	"David S. Miller" <davem@davemloft.net>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>, Helge Deller <deller@gmx.de>,
	Huacai Chen <chenhuacai@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Russell King <linux@armlinux.org.uk>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-mips@vger.kernel.org, linux-mm@kvack.org,
	linux-modules@vger.kernel.org, linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
	netdev@vger.kernel.org, sparclinux@vger.kernel.org,
	x86@kernel.org
Subject: Re: [PATCH 00/13] mm: jit/text allocator
Date: Tue, 13 Jun 2023 00:34:11 +0300	[thread overview]
Message-ID: <20230612213411.GP52412@kernel.org> (raw)
In-Reply-To: <CAPhsuW5YYa6nQhO2=zor75XkdKpFysZD42DgDRkKZvQT6aMqcA@mail.gmail.com>

On Fri, Jun 09, 2023 at 10:02:16AM -0700, Song Liu wrote:
> On Thu, Jun 8, 2023 at 11:41 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Tue, Jun 06, 2023 at 11:21:59AM -0700, Song Liu wrote:
> > > On Mon, Jun 5, 2023 at 3:09 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > [...]
> > >
> > > > > > > Can you give more detail on what parameters you need? If the only extra
> > > > > > > parameter is just "does this allocation need to live close to kernel
> > > > > > > text", that's not that big of a deal.
> > > > > >
> > > > > > My thinking was that we at least need the start + end for each caller. That
> > > > > > might be it, tbh.
> > > > >
> > > > > Do you mean that modules will have something like
> > > > >
> > > > >       jit_text_alloc(size, MODULES_START, MODULES_END);
> > > > >
> > > > > and kprobes will have
> > > > >
> > > > >       jit_text_alloc(size, KPROBES_START, KPROBES_END);
> > > > > ?
> > > >
> > > > Yes.
> > >
> > > How about we start with two APIs:
> > >      jit_text_alloc(size);
> > >      jit_text_alloc_range(size, start, end);
> > >
> > > AFAICT, arm64 is the only arch that requires the latter API. And TBH, I am
> > > not quite convinced it is needed.
> >
> > Right now arm64 and riscv override bpf and kprobes allocations to use the
> > entire vmalloc address space, but having the ability to allocate generated
> > code outside of modules area may be useful for other architectures.
> >
> > Still the start + end for the callers feels backwards to me because the
> > callers do not define the ranges, but rather the architectures, so we still
> > need a way for architectures to define how they want allocate memory for
> > the generated code.
> 
> Yeah, this makes sense.
> 
> >
> > > > > It sill can be achieved with a single jit_alloc_arch_params(), just by
> > > > > adding enum jit_type parameter to jit_text_alloc().
> > > >
> > > > That feels backwards to me; it centralizes a bunch of information about
> > > > distinct users to be able to shove that into a static array, when the callsites
> > > > can pass that information.
> > >
> > > I think we only two type of users: module and everything else (ftrace, kprobe,
> > > bpf stuff). The key differences are:
> > >
> > >   1. module uses text and data; while everything else only uses text.
> > >   2. module code is generated by the compiler, and thus has stronger
> > >   requirements in address ranges; everything else are generated via some
> > >   JIT or manual written assembly, so they are more flexible with address
> > >   ranges (in JIT, we can avoid using instructions that requires a specific
> > >   address range).
> > >
> > > The next question is, can we have the two types of users share the same
> > > address ranges? If not, we can reserve the preferred range for modules,
> > > and let everything else use the other range. I don't see reasons to further
> > > separate users in the "everything else" group.
> >
> > I agree that we can define only two types: modules and everything else and
> > let the architectures define if they need different ranges for these two
> > types, or want the same range for everything.
> >
> > With only two types we can have two API calls for alloc, and a single
> > structure that defines the ranges etc from the architecture side rather
> > than spread all over.
> >
> > Like something along these lines:
> >
> >         struct execmem_range {
> >                 unsigned long   start;
> >                 unsigned long   end;
> >                 unsigned long   fallback_start;
> >                 unsigned long   fallback_end;
> >                 pgprot_t        pgprot;
> >                 unsigned int    alignment;
> >         };
> >
> >         struct execmem_modules_range {
> >                 enum execmem_module_flags flags;
> >                 struct execmem_range text;
> >                 struct execmem_range data;
> >         };
> >
> >         struct execmem_jit_range {
> >                 struct execmem_range text;
> >         };
> >
> >         struct execmem_params {
> >                 struct execmem_modules_range    modules;
> >                 struct execmem_jit_range        jit;
> >         };
> >
> >         struct execmem_params *execmem_arch_params(void);
> >
> >         void *execmem_text_alloc(size_t size);
> >         void *execmem_data_alloc(size_t size);
> >         void execmem_free(void *ptr);
> 
> With the jit variation, maybe we can just call these
> module_[text|data]_alloc()?

I was thinking about "execmem_*_alloc()" for allocations that must be close to kernel
image, like modules, ftrace on x86 and s390 and maybe something else in the
future.

And jit_text_alloc() for allocations that can reside anywhere.

I tried to find a different name for 'struct execmem_modules_range' but
couldn't think of anything better than 'struct execmem_close_to_kernel', so
I've left modules in the name.
 
> btw: Depending on the implementation of the allocator, we may also
> need separate free()s for text and data.
> 
> >
> >         void *jit_text_alloc(size_t size);
> >         void jit_free(void *ptr);
> >

Let's just add jit_free() for completeness even if it will be the same as
execmem_free() for now.
 
> [...]
> 
> How should we move ahead from here?
> 
> AFAICT, all these changes can be easily extended and refactored
> in the future, so we don't have to make it perfect the first time.
> OTOH, having the interface committed (either this set or my
> module_alloc_type version) can unblock works in the binpack
> allocator and the users side. Therefore, I think we can move
> relatively fast here?

Once the interface and architecture abstraction is ready we can work on the
allocator and the users. We also need to update text_poking/alternatives on
architectures that would allocate executable memory as ROX. I did some
quick tests and with these patches 'modprobe xfs' takes tens time more than
before.
 
> Thanks,
> Song

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2023-06-12 21:35 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-01 10:12 [PATCH 00/13] mm: jit/text allocator Mike Rapoport
2023-06-01 10:12 ` [PATCH 01/13] nios2: define virtual address space for modules Mike Rapoport
2023-06-13 22:16   ` Dinh Nguyen
2023-06-01 10:12 ` [PATCH 02/13] mm: introduce jit_text_alloc() and use it instead of module_alloc() Mike Rapoport
2023-06-01 10:12 ` [PATCH 03/13] mm/jitalloc, arch: convert simple overrides of module_alloc to jitalloc Mike Rapoport
2023-06-01 10:12 ` [PATCH 04/13] mm/jitalloc, arch: convert remaining " Mike Rapoport
2023-06-01 22:35   ` Song Liu
2023-06-01 10:12 ` [PATCH 05/13] module, jitalloc: drop module_alloc Mike Rapoport
2023-06-01 10:12 ` [PATCH 06/13] mm/jitalloc: introduce jit_data_alloc() Mike Rapoport
2023-06-01 10:12 ` [PATCH 07/13] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES Mike Rapoport
2023-06-01 10:12 ` [PATCH 08/13] arch: make jitalloc setup available regardless of CONFIG_MODULES Mike Rapoport
2023-06-01 10:12 ` [PATCH 09/13] kprobes: remove dependcy on CONFIG_MODULES Mike Rapoport
2023-06-01 10:12 ` [PATCH 10/13] modules, jitalloc: prepare to allocate executable memory as ROX Mike Rapoport
2023-06-01 10:12 ` [PATCH 11/13] ftrace: Add swap_func to ftrace_process_locs() Mike Rapoport
2023-06-01 10:12 ` [PATCH 12/13] x86/jitalloc: prepare to allocate exectuatble memory as ROX Mike Rapoport
2023-06-01 10:30   ` Peter Zijlstra
2023-06-01 11:07     ` Mike Rapoport
2023-06-02  0:02       ` Song Liu
2023-06-01 17:52     ` Kent Overstreet
2023-06-01 16:54   ` Edgecombe, Rick P
2023-06-01 18:00     ` Kent Overstreet
2023-06-01 18:13       ` Edgecombe, Rick P
2023-06-01 18:38         ` Kent Overstreet
2023-06-01 20:50           ` Edgecombe, Rick P
2023-06-01 23:54             ` Nadav Amit
2023-06-05  2:52               ` Steven Rostedt
2023-06-05  8:11                 ` Mike Rapoport
2023-06-05 16:10                   ` Edgecombe, Rick P
2023-06-05 20:42                     ` Mike Rapoport
2023-06-05 21:01                       ` Edgecombe, Rick P
2023-06-05 21:11                     ` Nadav Amit
2023-06-04 21:47             ` Kent Overstreet
2023-06-01 22:49   ` Song Liu
2023-06-01 10:12 ` [PATCH 13/13] x86/jitalloc: make memory allocated for code ROX Mike Rapoport
2023-06-01 16:12 ` [PATCH 00/13] mm: jit/text allocator Mark Rutland
2023-06-01 18:14   ` Kent Overstreet
2023-06-02  9:35     ` Mark Rutland
2023-06-02 18:20       ` Song Liu
2023-06-03 21:11         ` Puranjay Mohan
2023-06-04 18:02         ` Kent Overstreet
2023-06-04 21:22           ` Song Liu
2023-06-04 21:40             ` Kent Overstreet
2023-06-05  4:05               ` Song Liu
2023-06-05  9:20       ` Mike Rapoport
2023-06-05 10:09         ` Mark Rutland
2023-06-06 10:16           ` Mike Rapoport
2023-06-06 18:21           ` Song Liu
2023-06-08 18:41             ` Mike Rapoport
2023-06-09 17:02               ` Song Liu
2023-06-12 21:34                 ` Mike Rapoport [this message]
2023-06-13 18:56               ` Kent Overstreet
2023-06-13 21:09                 ` Mike Rapoport
2023-07-20  8:53           ` Mike Rapoport
2023-06-05 21:13         ` Kent Overstreet
2023-06-02  0:36 ` Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230612213411.GP52412@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bpf@vger.kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=chenhuacai@kernel.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=davem@davemloft.net \
    --cc=deller@gmx.de \
    --cc=dinguyen@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=loongarch@lists.linux.dev \
    --cc=mark.rutland@arm.com \
    --cc=mcgrof@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=naveen.n.rao@linux.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=palmer@dabbelt.com \
    --cc=rostedt@goodmis.org \
    --cc=song@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tsbogend@alpha.franken.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).