All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Adam Manzanares <a.manzanares@samsung.com>,
	linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org,
	pmladek@suse.com, petr.pavlu@suse.com, prarit@redhat.com,
	christophe.leroy@csgroup.eu, song@kernel.org,
	torvalds@linux-foundation.org
Subject: Re: [RFC 00/12] module: avoid userspace pressure on unwanted allocations
Date: Tue, 21 Mar 2023 09:52:34 -0700	[thread overview]
Message-ID: <ZBng0qnm/ADtSTBQ@bombadil.infradead.org> (raw)
In-Reply-To: <eaa75ce0-7064-7919-0e72-6bb4ccc5d0d6@redhat.com>

On Tue, Mar 21, 2023 at 04:11:27PM +0100, David Hildenbrand wrote:
> On 20.03.23 22:23, Luis Chamberlain wrote:
> > On Mon, Mar 20, 2023 at 10:15:23PM +0100, David Hildenbrand wrote:
> > > On 20.03.23 22:09, Luis Chamberlain wrote:
> > > > On Mon, Mar 20, 2023 at 08:40:07PM +0100, David Hildenbrand wrote:
> > > > > On 20.03.23 10:38, David Hildenbrand wrote:
> > > > > > On 18.03.23 01:11, Luis Chamberlain wrote:
> > > > > > > On Thu, Mar 16, 2023 at 04:56:56PM -0700, Luis Chamberlain wrote:
> > > > > > > > On Thu, Mar 16, 2023 at 04:55:31PM -0700, Luis Chamberlain wrote:
> > > > > > > > > On Wed, Mar 15, 2023 at 05:41:53PM +0100, David Hildenbrand wrote:
> > > > > > > > > > I expect to have a machine (with a crazy number of CPUs/devices) available
> > > > > > > > > > in a couple of days (1-2), so no need to rush.
> > > > > > > > > > 
> > > > > > > > > > The original machine I was able to reproduce with is blocked for a little
> > > > > > > > > > bit longer; so I hope the alternative I looked up will similarly trigger the
> > > > > > > > > > issue easily.
> > > > > > > > > 
> > > > > > > > > OK give this a spin:
> > > > > > > > > 
> > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230316-module-alloc-opts
> > > > > > > 
> > > > > > > Today I am up to here:
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230317-module-alloc-opts
> > > > > > > 
> > > > > > > The last patch really would have no justification yet at all unless it
> > > > > > > does help your case.
> > > > > > 
> > > > > > Still waiting on the system (the replacement system I was able to grab
> > > > > > broke ...).
> > > > > > 
> > > > > > I'll let you know once I succeeded in reproducing + testing your fixes.
> > > > > 
> > > > > Okay, I have a system where I can reproduce.
> > > > > 
> > > > > Should I give
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230319-module-alloc-opts
> > > > > 
> > > > > from yesterday a churn?
> > > > 
> > > > Yes please give that a run.
> > > 
> > > Reproduced with v6.3.0-rc1 (on 1st try)
> > 
> > By reproduced, you mean it fails to boot?
> 
> It boots but we get vmap allocation warnings, because the ~440 CPUs manage
> to completely exhaust the module vmap area due to KASAN.

Thanks, can you post a trace?

> > > Not able to reproduce with 20230319-module-alloc-opts so far (2 tries).
> > 
> > Oh wow, so to clarify, it boots OK?
> 
> It boots and I don't get the vmap allocation warnings.

Wonderful!

Now to zero if all commits are required, and measure small value-add 
for each of them. That is the hard part and thanks for any time you
can help dedicate towards that.

There's really only 3 functional changes we need to measure:

In the 20230319-module-alloc-opts-adjust branch they are left towards
the end, in history, the top being the most recent commit:

600f1f769d06 module: add a sanity check prior to allowing kernel module auto-loading
680f2c1fff2d module: use list_add_tail_rcu() when adding module
6023a6c7d98c module: avoid allocation if module is already present and ready

My guess is the majority of the fix comes from the last commit, so
commit 6023a6c7d98c ("module: avoid allocation if module is already
present and ready").

The rcu one probably doesn't help much but I think it makes sense while
we're at it.

The last one is the one I'm less convinced makes sense. But *if* it does
help, it means such finit_module() situations / stresses are much larger
than we ever anticipated.

> > > > Please collect systemd-analyze given lack of any other tool to evaluate
> > > > any deltas. Can't think of anything else to gather other than seeing if
> > > > it booted.
> > > 
> > > Issue is that some services (kdump, tuned) seem to take sometimes ages on
> > > that system to start for some reason,
> > 
> > How about disabling that?
> 
> It seems to be random services. On my debug kernel with KASAN everything is
> just super slow. I'll try to measure on a !debug kernel.

Got it thanks!

  Luis

  reply	other threads:[~2023-03-21 16:52 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-11  5:17 [RFC 00/12] module: avoid userspace pressure on unwanted allocations Luis Chamberlain
2023-03-11  5:17 ` [RFC 01/12] module: use goto errors on check_modinfo() and layout_and_allocate() Luis Chamberlain
2023-03-11  5:17 ` [RFC 02/12] module: move get_modinfo() helpers all above Luis Chamberlain
2023-03-11  5:17 ` [RFC 03/12] module: rename next_string() to module_next_tag_pair() Luis Chamberlain
2023-03-11  5:17 ` [RFC 04/12] module: add a for_each_modinfo_entry() Luis Chamberlain
2023-03-11  5:17 ` [RFC 05/12] module: add debugging alias parsing support Luis Chamberlain
2023-03-11  5:17 ` [RFC 06/12] module: move early sanity checks into a helper Luis Chamberlain
2023-03-11  5:17 ` [RFC 07/12] module: move check_modinfo() early to early_mod_check() Luis Chamberlain
2023-03-11  5:17 ` [RFC 08/12] module: move finished_loading() Luis Chamberlain
2023-03-11  5:17 ` [RFC 09/12] module: extract patient module check into helper Luis Chamberlain
2023-03-11  5:17 ` [RFC 10/12] module: avoid allocation if module is already present and ready Luis Chamberlain
2023-03-11  5:17 ` [RFC 11/12] module: use list_add_tail_rcu() when adding module Luis Chamberlain
2023-03-11  5:17 ` [RFC 12/12] module: use aliases to find module on find_module_all() Luis Chamberlain
2023-03-11 13:12   ` kernel test robot
2023-03-11 17:06   ` kernel test robot
2023-03-15 14:43   ` Petr Pavlu
2023-03-15 16:12     ` Luis Chamberlain
2023-03-15 12:24 ` [RFC 00/12] module: avoid userspace pressure on unwanted allocations David Hildenbrand
2023-03-15 16:10   ` Luis Chamberlain
2023-03-15 16:41     ` David Hildenbrand
2023-03-16 23:55       ` Luis Chamberlain
2023-03-16 23:56         ` Luis Chamberlain
2023-03-18  0:11           ` Luis Chamberlain
2023-03-20  9:38             ` David Hildenbrand
2023-03-20 19:40               ` David Hildenbrand
2023-03-20 21:09                 ` Luis Chamberlain
2023-03-20 21:15                   ` David Hildenbrand
2023-03-20 21:23                     ` Luis Chamberlain
2023-03-20 21:27                       ` Luis Chamberlain
2023-03-21 19:32                         ` David Hildenbrand
2023-03-24  9:27                           ` David Hildenbrand
2023-03-24 17:54                             ` Luis Chamberlain
2023-03-24 19:11                               ` Linus Torvalds
2023-03-24 19:59                                 ` Luis Chamberlain
2023-03-24 20:28                                   ` Linus Torvalds
2023-03-24 21:14                                     ` Luis Chamberlain
2023-03-24 23:27                                       ` Luis Chamberlain
2023-03-24 23:41                                         ` Linus Torvalds
2023-03-28  3:44                               ` David Hildenbrand
2023-03-28  6:16                                 ` Luis Chamberlain
2023-03-28 21:02                                   ` David Hildenbrand
2023-03-29  5:31                                     ` Luis Chamberlain
2023-03-30  4:42                                       ` David Hildenbrand
2023-03-21 15:11                       ` David Hildenbrand
2023-03-21 16:52                         ` Luis Chamberlain [this message]
2023-03-21 17:01                           ` David Hildenbrand
2023-03-20  9:37           ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBng0qnm/ADtSTBQ@bombadil.infradead.org \
    --to=mcgrof@kernel.org \
    --cc=a.manzanares@samsung.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=petr.pavlu@suse.com \
    --cc=pmladek@suse.com \
    --cc=prarit@redhat.com \
    --cc=song@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.