linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Song Liu <songliubraving@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Luis Chamberlain <mcgrof@kernel.org>, Song Liu <song@kernel.org>,
	bpf <bpf@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	open list <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
Date: Mon, 18 Apr 2022 13:06:36 +0300	[thread overview]
Message-ID: <Yl04LO/PfB3GocvU@kernel.org> (raw)
In-Reply-To: <B995F7EB-2019-4290-9C09-AE19C5BA3A70@fb.com>

Hi,

On Sat, Apr 16, 2022 at 10:26:08PM +0000, Song Liu wrote:
> > On Apr 16, 2022, at 1:30 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > 
> > Maybe I am missing something, but I really don't think this is ready
> > for prime-time. We should effectively disable it all, and have people
> > think through it a lot more.
> 
> This has been discussed on lwn.net: https://lwn.net/Articles/883454/. 
> AFAICT, the biggest concern is whether reserving minimal 2MB for BPF
> programs is a good trade-off for memory usage. This is again my fault
> not to state the motivation clearly: the primary gain comes from less 
> page table fragmentation and thus better iTLB efficiency. 

Reserving 2MB pages for BPF programs will indeed reduce the fragmentation,
but OTOH it will reduce memory utilization. If for large systems this may
not be an issue, on smaller machines trading off memory for iTLB
performance may be not that obvious.
 
> Other folks (in recent thread on this topic and offline in other 
> discussions) also showed strong interests in using similar technical 
> for text of kernel modules. So I would really like to learn your 
> opinion on this. There are many details we can optimize, but I guess 
> the general mechanism has to be something like:
>  - allocate a huge page, make it safe, and set it as executable;
>  - as users (BPF, kernel module, etc.) request memory for text, give
>    a chunk of the huge page to the user. 
>  - use some mechanism to update the chunk of memory safely. 

There are use-cases that require 4K pages with non-default permissions in
the direct map and the pages not necessarily should be executable. There
were several suggestions to implement caches of 4K pages backed by 2M
pages.

I believe that "allocate huge page and split it to basic pages to hand out
to users" concept should be implemented at page allocator level and I
posted and RFC for this a while ago:

https://lore.kernel.org/all/20220127085608.306306-1-rppt@kernel.org/

-- 
Sincerely yours,
Mike.

  reply	other threads:[~2022-04-18 10:06 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20220415164413.2727220-1-song@kernel.org>
     [not found] ` <20220415164413.2727220-2-song@kernel.org>
2022-04-15 17:43   ` [PATCH v4 bpf 1/4] vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP Rik van Riel
     [not found] ` <20220415164413.2727220-3-song@kernel.org>
2022-04-15 17:43   ` [PATCH v4 bpf 2/4] page_alloc: use vmalloc_huge for large system hash Rik van Riel
2022-04-25  7:07     ` Geert Uytterhoeven
2022-04-25  8:17       ` Linus Torvalds
2022-04-25  8:24         ` Geert Uytterhoeven
     [not found] ` <20220415164413.2727220-4-song@kernel.org>
2022-04-15 18:06   ` [PATCH v4 bpf 3/4] module: introduce module_alloc_huge Rik van Riel
2022-06-16 16:10   ` Dave Hansen
2022-04-15 19:05 ` [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP Luis Chamberlain
2022-04-16  1:34   ` Song Liu
2022-04-16  1:42     ` Luis Chamberlain
2022-04-16  1:43       ` Luis Chamberlain
2022-04-16  5:08   ` Christoph Hellwig
2022-04-16 19:55     ` Song Liu
2022-04-16 20:30       ` Linus Torvalds
2022-04-16 22:26         ` Song Liu
2022-04-18 10:06           ` Mike Rapoport [this message]
2022-04-19  0:44             ` Luis Chamberlain
2022-04-19  1:56               ` Edgecombe, Rick P
2022-04-19  5:36                 ` Song Liu
2022-04-19 18:42                   ` Mike Rapoport
2022-04-19 19:20                     ` Linus Torvalds
2022-04-20  2:03                       ` Alexei Starovoitov
2022-04-20  2:18                         ` Linus Torvalds
2022-04-20 14:42                           ` Song Liu
2022-04-20 18:28                             ` Luis Chamberlain
2022-04-21  3:25                       ` Nicholas Piggin
2022-04-21  5:48                         ` Linus Torvalds
2022-04-21  6:02                           ` Linus Torvalds
2022-04-21  9:07                             ` Nicholas Piggin
2022-04-21  8:57                           ` Nicholas Piggin
2022-04-21 15:44                             ` Linus Torvalds
2022-04-21 23:30                               ` Nicholas Piggin
2022-04-22  0:49                                 ` Linus Torvalds
2022-04-22  1:51                                   ` Nicholas Piggin
2022-04-22  2:31                                     ` Linus Torvalds
2022-04-22  2:57                                       ` Nicholas Piggin
2022-04-21 15:47                             ` Edgecombe, Rick P
2022-04-21 16:15                               ` Linus Torvalds
2022-04-22  0:12                                 ` Nicholas Piggin
2022-04-22  2:29                                   ` Edgecombe, Rick P
2022-04-22  2:47                                     ` Linus Torvalds
2022-04-22 16:54                                       ` Edgecombe, Rick P
2022-04-22  3:08                                     ` Nicholas Piggin
2022-04-22  4:31                                       ` Nicholas Piggin
2022-04-22 17:10                                         ` Edgecombe, Rick P
2022-04-22 20:22                                           ` Edgecombe, Rick P
2022-04-22  3:33                                     ` Nicholas Piggin
2022-04-21  9:47                           ` Nicholas Piggin
2022-04-19 21:24                 ` Luis Chamberlain
2022-04-19 23:58                   ` Edgecombe, Rick P
2022-04-20  7:58                   ` Petr Mladek
2022-04-19 18:20               ` Mike Rapoport
2022-04-24 17:43       ` Linus Torvalds
2022-04-25  6:48         ` Song Liu
2022-04-21  3:19     ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yl04LO/PfB3GocvU@kernel.org \
    --to=rppt@kernel.org \
    --cc=Kernel-team@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hch@infradead.org \
    --cc=imbrenda@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcgrof@kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=song@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).