linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Christoph Hellwig <hch@infradead.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Song Liu <song@kernel.org>, bpf <bpf@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	open list <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
Date: Sat, 16 Apr 2022 19:55:00 +0000	[thread overview]
Message-ID: <4AD023F9-FBCE-4C7C-A049-9292491408AA@fb.com> (raw)
In-Reply-To: <YlpPW9SdCbZnLVog@infradead.org>



> On Apr 15, 2022, at 10:08 PM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Fri, Apr 15, 2022 at 12:05:42PM -0700, Luis Chamberlain wrote:
>> Looks good except for that I think this should just wait for v5.19. The
>> fixes are so large I can't see why this needs to be rushed in other than
>> the first assumptions of the optimizations had some flaws addressed here.
> 
> Patches 1 and 2 are bug fixes for regressions caused by using huge page
> backed vmalloc by default.  So I think we do need it for 5.18.  The
> other two do look like candidates for 5.19, though.

Thanks Luis and Christoph for your kind inputs on the set. 

Here are my analysis after thinking about it overnight. 

We can discuss the users of vmalloc in 4 categories: module_alloc, BPF 
programs, alloc_large_system_hash, and others; and there are two archs
involved here: x86_64 and powerpc. 

With whole set, the behavior is like:

              |           x86_64            |       powerpc
--------------------------------------------+----------------------
module_alloc  |                        use small pages 
--------------------------------------------+----------------------
BPF programs  |      use 2MB pages          |  use small changes
--------------------------------------------+----------------------
large hash    |           use huge pages when size > PMD_SIZE
--------------------------------------------+----------------------
other-vmalloc |                      use small pages 


Patch 1/4 fixes the behavior of module_alloc and other-vmalloc. 
Without 1/4, both these users may get huge pages for size > PMD_SIZE 
allocations, which may be troublesome([3] for example). 

Patch 3/4 and 4/4, together with 1/1, allows BPF programs use 2MB 
pages. This is the same behavior as before 5.18-rc1, which has been 
tested in bpf-next and linux-next. Therefore, I don't think we need
to hold them until 5.19. 

Patch 2/4 enables huge pages for large hash. Large hash has been 
using huge pages on powerpc since 5.15. But this is new for x86_64. 
If we ship 2/4, this is a performance improvement for x86_64, but
it is less tested on x86_64 (didn't go through linux-next). If we 
ship 1/4 but not 2/4 with 5.18, we will see a small performance 
regression for powerpc. 

Based on this analysis, I think we should either 
  1) ship the whole set with 5.18; or
  2) ship 1/4, 3/4, and 4/4 with 5.18, and 2/4 with 5.19. 

With option 1), we enables huge pages for large hash on x86_64 
without going through linux-next. With option 2), we take a small
performance regression with 5.18 on powerpc. 

Of course, we can ship a hybrid solution by gating 2/4 for powerpc 
only in 5.18, and enabling it for x86_64 in 5.19. 

Does this make sense? Please let me know you comments and 
suggestions on this. 

Thanks,
Song

[3] https://lore.kernel.org/lkml/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/

  reply	other threads:[~2022-04-16 19:55 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20220415164413.2727220-1-song@kernel.org>
     [not found] ` <20220415164413.2727220-2-song@kernel.org>
2022-04-15 17:43   ` [PATCH v4 bpf 1/4] vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP Rik van Riel
     [not found] ` <20220415164413.2727220-3-song@kernel.org>
2022-04-15 17:43   ` [PATCH v4 bpf 2/4] page_alloc: use vmalloc_huge for large system hash Rik van Riel
2022-04-25  7:07     ` Geert Uytterhoeven
2022-04-25  8:17       ` Linus Torvalds
2022-04-25  8:24         ` Geert Uytterhoeven
     [not found] ` <20220415164413.2727220-4-song@kernel.org>
2022-04-15 18:06   ` [PATCH v4 bpf 3/4] module: introduce module_alloc_huge Rik van Riel
2022-06-16 16:10   ` Dave Hansen
2022-04-15 19:05 ` [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP Luis Chamberlain
2022-04-16  1:34   ` Song Liu
2022-04-16  1:42     ` Luis Chamberlain
2022-04-16  1:43       ` Luis Chamberlain
2022-04-16  5:08   ` Christoph Hellwig
2022-04-16 19:55     ` Song Liu [this message]
2022-04-16 20:30       ` Linus Torvalds
2022-04-16 22:26         ` Song Liu
2022-04-18 10:06           ` Mike Rapoport
2022-04-19  0:44             ` Luis Chamberlain
2022-04-19  1:56               ` Edgecombe, Rick P
2022-04-19  5:36                 ` Song Liu
2022-04-19 18:42                   ` Mike Rapoport
2022-04-19 19:20                     ` Linus Torvalds
2022-04-20  2:03                       ` Alexei Starovoitov
2022-04-20  2:18                         ` Linus Torvalds
2022-04-20 14:42                           ` Song Liu
2022-04-20 18:28                             ` Luis Chamberlain
2022-04-21  3:25                       ` Nicholas Piggin
2022-04-21  5:48                         ` Linus Torvalds
2022-04-21  6:02                           ` Linus Torvalds
2022-04-21  9:07                             ` Nicholas Piggin
2022-04-21  8:57                           ` Nicholas Piggin
2022-04-21 15:44                             ` Linus Torvalds
2022-04-21 23:30                               ` Nicholas Piggin
2022-04-22  0:49                                 ` Linus Torvalds
2022-04-22  1:51                                   ` Nicholas Piggin
2022-04-22  2:31                                     ` Linus Torvalds
2022-04-22  2:57                                       ` Nicholas Piggin
2022-04-21 15:47                             ` Edgecombe, Rick P
2022-04-21 16:15                               ` Linus Torvalds
2022-04-22  0:12                                 ` Nicholas Piggin
2022-04-22  2:29                                   ` Edgecombe, Rick P
2022-04-22  2:47                                     ` Linus Torvalds
2022-04-22 16:54                                       ` Edgecombe, Rick P
2022-04-22  3:08                                     ` Nicholas Piggin
2022-04-22  4:31                                       ` Nicholas Piggin
2022-04-22 17:10                                         ` Edgecombe, Rick P
2022-04-22 20:22                                           ` Edgecombe, Rick P
2022-04-22  3:33                                     ` Nicholas Piggin
2022-04-21  9:47                           ` Nicholas Piggin
2022-04-19 21:24                 ` Luis Chamberlain
2022-04-19 23:58                   ` Edgecombe, Rick P
2022-04-20  7:58                   ` Petr Mladek
2022-04-19 18:20               ` Mike Rapoport
2022-04-24 17:43       ` Linus Torvalds
2022-04-25  6:48         ` Song Liu
2022-04-21  3:19     ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AD023F9-FBCE-4C7C-A049-9292491408AA@fb.com \
    --to=songliubraving@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hch@infradead.org \
    --cc=imbrenda@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcgrof@kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=song@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).