linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org,
	akpm@linux-foundation.org, x86@kernel.org, hch@lst.de,
	rick.p.edgecombe@intel.com, aaron.lu@intel.com, rppt@kernel.org,
	mcgrof@kernel.org
Subject: Re: [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs
Date: Wed, 30 Nov 2022 00:56:37 +0100	[thread overview]
Message-ID: <871qpluxfu.ffs@tglx> (raw)
In-Reply-To: <CAPhsuW7BoJbRi7Tqck=cW1n0xdESOkwqU=PMAdL9LvCun47Y+w@mail.gmail.com>

Song,

On Tue, Nov 29 2022 at 09:26, Song Liu wrote:
> On Tue, Nov 29, 2022 at 2:23 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Modules are the obvious starting point. Once that is solved pretty much
>> everything else falls into place including BPF.
>>
>> Without modules support this whole exercise is pointless and not going
>> anywhere near x86.
>
> I am not sure I fully understand your point here. Do you mean
>
> 1) There is something wrong with this solution, that makes it not suitable
> for modules;
>    or
> 2) The solution is in the right direction and it will very likely work
> for modules.
> But we haven't finished module support. ?

As I'm obviously unable to express myself coherently, let me try again:

 A solution which solves the BPF problem, but does not solve the
 underlying problem of module_alloc() is not acceptable.

Is that clear enough?

> If it is 1), I would like to understand what are the issues that make it not
> suitable for modules. If it is 2), I think a solid, mostly like working small
> step toward the right direction is the better way as it makes code reviews
> a lot easier and has much lower risks. Does this make sense?

No. Because all you are interested in is to get your BPF itch scratched
instead of actually sitting down and solving the underlying problem and
thereby creating a benefit for everyone.

You are not making anything easier. You are violating the basic
engineering principle of "Fix the root cause, not the symptom".

By doing that you are actually creating more problems than you
solve. Why?

  Clearly your "solution" does not cover the full requirements of the
  module space because you solely focus on executable memory allocations
  which somehow magically go into the module address space.

  Can you coherently explain how this results in a consistent solution
  for the rest of the module requirements?

  Can you coherently explain why this wont create problems down the road
  for anyone who actually would be willing to solve the root cause?

No, you can't answer any of these questions simply because you never
explored the problem space sufficiently.

I'm not the first one to point this out. Quite some people in the
various threads regarding this issue have been pointing that out to you
before. They even provided you hints on how this can be solved properly
once and forever and for everyones benefits.

> I would also highlight that part of the benefit of this work comes from
> reducing direct map fragmentations. While BPF programs consume less
> memory, they are more dynamic and can cause more direct map
> fragmentations. bpf_prog_pack in upstream kernel already covers this
> part, but this set is a better solution than bpf_prog_pack.
>
> Finally, I would like to point out that 5/6 and 6/6 of (v5) the set let BPF
> programs share a 2MB page with static kernel text. Therefore, even
> for systems without many BPF programs, we should already see some
> reduction in iTLB misses.

Can you please stop this marketing nonsense? As I pointed out to you in
the very mail which your are replying to, the influence of BPF on the
system I picked randomly out of the pool is pretty close to ZERO.

Ergo, the reduction of iTLB misses is going to be equally close to
ZERO. What is the benefit you are trying to sell me?

I'm happy to run perf on this machine and provide the numbers which put
your 'we should already see some reduction' handwaving into perspective.

But the above is just a distraction. The real point is:

You can highlight and point out the benefits of your BPF specific
solution as much as you want, it does not make the fact that you are
"fixing" the symptom instead of the root cause magically go away.

Again for the record:

  The iTLB pressure problem, which affects modules, kprobes, tracing and
  BPF, is caused by the way how module_alloc() is implemented.

That's the root cause and this needs to be solved for _ALL_ of the users
of this infrastructure and not worked around by adding something which
makes BPF shiny and handwaves about that it solves the underlying
problem.

Thanks,

        tglx





  reply	other threads:[~2022-11-29 23:56 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-07 22:39 [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs Song Liu
2022-11-07 22:39 ` [PATCH bpf-next v2 1/5] vmalloc: introduce execmem_alloc, execmem_free, and execmem_fill Song Liu
2022-11-07 22:39 ` [PATCH bpf-next v2 2/5] x86/alternative: support execmem_alloc() and execmem_free() Song Liu
2022-11-07 22:39 ` [PATCH bpf-next v2 3/5] bpf: use execmem_alloc for bpf program and bpf dispatcher Song Liu
2022-11-07 22:39 ` [PATCH bpf-next v2 4/5] vmalloc: introduce register_text_tail_vm() Song Liu
2022-11-07 22:39 ` [PATCH bpf-next v2 5/5] x86: use register_text_tail_vm Song Liu
2022-11-08 19:04   ` Edgecombe, Rick P
2022-11-08 22:15     ` Song Liu
2022-11-15 17:28       ` Edgecombe, Rick P
2022-11-07 22:55 ` [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs Luis Chamberlain
2022-11-07 23:13   ` Song Liu
2022-11-07 23:39     ` Luis Chamberlain
2022-11-08  0:13       ` Edgecombe, Rick P
2022-11-08  2:45         ` Luis Chamberlain
2022-11-08 18:20         ` Song Liu
2022-11-08 18:12       ` Song Liu
2022-11-08 11:27 ` Mike Rapoport
2022-11-08 12:38   ` Aaron Lu
2022-11-09  6:55     ` Christoph Hellwig
2022-11-09 11:05       ` Peter Zijlstra
2022-11-08 16:51   ` Edgecombe, Rick P
2022-11-08 18:50     ` Song Liu
2022-11-09 11:17     ` Mike Rapoport
2022-11-09 17:04       ` Edgecombe, Rick P
2022-11-09 17:53         ` Song Liu
2022-11-13 10:34         ` Mike Rapoport
2022-11-14 20:30           ` Song Liu
2022-11-15 21:18             ` Luis Chamberlain
2022-11-15 21:39               ` Edgecombe, Rick P
2022-11-16 22:34                 ` Luis Chamberlain
2022-11-17  8:50             ` Mike Rapoport
2022-11-17 18:36               ` Song Liu
2022-11-20 10:41                 ` Mike Rapoport
2022-11-21 14:52                   ` Song Liu
2022-11-30  9:39                     ` Mike Rapoport
2022-11-09 17:43       ` Song Liu
2022-11-09 21:23         ` Christophe Leroy
2022-11-10  1:50           ` Song Liu
2022-11-13 10:42         ` Mike Rapoport
2022-11-14 20:45           ` Song Liu
2022-11-15 20:51             ` Luis Chamberlain
2022-11-20 10:44             ` Mike Rapoport
2022-11-08 18:41   ` Song Liu
2022-11-08 19:43     ` Christophe Leroy
2022-11-08 21:40       ` Song Liu
2022-11-13  9:58     ` Mike Rapoport
2022-11-14 20:13       ` Song Liu
2022-11-08 11:44 ` Christophe Leroy
2022-11-08 18:47   ` Song Liu
2022-11-08 19:32     ` Christophe Leroy
2022-11-08 11:48 ` Mike Rapoport
2022-11-15  1:30 ` Song Liu
2022-11-15 17:34   ` Edgecombe, Rick P
2022-11-15 21:54     ` Song Liu
2022-11-15 22:14       ` Edgecombe, Rick P
2022-11-15 22:32         ` Song Liu
2022-11-16  1:20         ` Song Liu
2022-11-16 21:22           ` Edgecombe, Rick P
2022-11-16 22:03             ` Song Liu
2022-11-15 21:09   ` Luis Chamberlain
2022-11-15 21:32     ` Luis Chamberlain
2022-11-15 22:48     ` Song Liu
2022-11-16 22:33       ` Luis Chamberlain
2022-11-16 22:47         ` Edgecombe, Rick P
2022-11-16 23:53           ` Luis Chamberlain
2022-11-17  1:17             ` Song Liu
2022-11-17  9:37         ` Mike Rapoport
2022-11-29 10:23   ` Thomas Gleixner
2022-11-29 17:26     ` Song Liu
2022-11-29 23:56       ` Thomas Gleixner [this message]
2022-11-30 16:18         ` Song Liu
2022-12-01  9:08           ` Thomas Gleixner
2022-12-01 19:31             ` Song Liu
2022-12-02  1:38               ` Thomas Gleixner
2022-12-02  8:38                 ` Song Liu
2022-12-02  9:22                   ` Thomas Gleixner
2022-12-06 20:25                     ` Song Liu
2022-12-07 15:36                       ` Thomas Gleixner
2022-12-07 16:53                         ` Christophe Leroy
2022-12-07 19:29                           ` Song Liu
2022-12-07 21:04                           ` Thomas Gleixner
2022-12-07 21:48                             ` Christophe Leroy
2022-12-07 19:26                         ` Song Liu
2022-12-07 20:57                           ` Thomas Gleixner
2022-12-07 23:17                             ` Song Liu
2022-12-02 10:46                 ` Christophe Leroy
2022-12-02 17:43                   ` Thomas Gleixner
2022-12-01 20:23             ` Mike Rapoport
2022-12-01 22:34               ` Thomas Gleixner
2022-12-03 14:46                 ` Mike Rapoport
2022-12-03 20:58                   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871qpluxfu.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=aaron.lu@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bpf@vger.kernel.org \
    --cc=hch@lst.de \
    --cc=linux-mm@kvack.org \
    --cc=mcgrof@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).