All of lore.kernel.org
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Song Liu <song@kernel.org>, bpf <bpf@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	lkml <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"rick.p.edgecombe@intel.com" <rick.p.edgecombe@intel.com>,
	"linux-modules@vger.kernel.org" <linux-modules@vger.kernel.org>
Subject: Re: [PATCH v6 bpf-next 0/5] bpf_prog_pack followup
Date: Sat, 9 Jul 2022 01:14:23 +0000	[thread overview]
Message-ID: <6214B9C9-557B-4DC0-BFDE-77EAC425E577@fb.com> (raw)
In-Reply-To: <YsiupnNJ8WANZiIc@bombadil.infradead.org>



> On Jul 8, 2022, at 3:24 PM, Luis Chamberlain <mcgrof@kernel.org> wrote:
> 
> On Fri, Jul 08, 2022 at 07:58:44PM +0000, Song Liu wrote:
>> 
>> 
>>> On Jul 8, 2022, at 8:58 AM, Luis Chamberlain <mcgrof@kernel.org> wrote:
>>> 
>>> On Fri, Jul 08, 2022 at 01:36:25AM +0000, Song Liu wrote:
>>>> 
>>>> 
>>>>> On Jul 7, 2022, at 5:53 PM, Luis Chamberlain <mcgrof@kernel.org> wrote:
>>>>> 
>>>>> On Thu, Jul 07, 2022 at 11:52:58PM +0000, Song Liu wrote:
>>>>>>> On Jul 7, 2022, at 3:59 PM, Luis Chamberlain <mcgrof@kernel.org> wrote:
>>>>>>> 
>>>>>>> On Thu, Jul 07, 2022 at 03:35:41PM -0700, Song Liu wrote:
>>>>>>>> This set is the second half of v4 [1].
>>>>>>>> 
>>>>>>>> Changes v5 => v6:
>>>>>>>> 1. Rebase and extend CC list.
>>>>>>> 
>>>>>>> Why post a new iteration so soon without completing the discussion we
>>>>>>> had? It seems like we were at least going somewhere. If it's just
>>>>>>> to include mm as I requested, sure, that's fine, but this does not
>>>>>>> provide context as to what we last were talking about.
>>>>>> 
>>>>>> Sorry for sending v6 too soon. The primary reason was to extend the CC
>>>>>> list and add it back to patchwork (v5 somehow got archived). 
>>>>>> 
>>>>>> Also, I think vmalloc_exec_ work would be a separate project, while this 
>>>>>> set is the followup work of bpf_prog_pack. Does this make sense? 
>>>>>> 
>>>>>> Btw, vmalloc_exec_ work could be a good topic for LPC. It will be much
>>>>>> more efficient to discuss this in person. 
>>>>> 
>>>>> What we need is input from mm / arch folks. What is not done here is
>>>>> what that stuff we're talking about is and so mm folks can't guess. My
>>>>> preference is to address that.
>>>>> 
>>>>> I don't think in person discussion is needed if the only folks
>>>>> discussing this topic so far is just you and me.
>>>> 
>>>> How about we start a thread with mm / arch folks for the vmalloc_exec_*
>>>> topic? I will summarize previous discussions and include pointers to 
>>>> these discussions. If necessary, we can continue the discussion at LPC.
>>> 
>>> This sounds like a nice thread to use as this is why we are talking
>>> about that topic.
>>> 
>>>> OTOH, I guess the outcome of that discussion should not change this set? 
>>> 
>>> If the above is done right then actually I think it would show similar
>>> considerations for a respective free for module_alloc_huge().
>>> 
>>>> If we have concern about module_alloc_huge(), maybe we can have bpf code 
>>>> call vmalloc directly (until we have vmalloc_exec_)? 
>>> 
>>> You'd need to then still open code in a similar way the same things
>>> which we are trying to reach consensus on.
>> 
>>>> What do you think about this plan?
>>> 
>>> I think we should strive to not be lazy and sloppy, and prevent growth
>>> of sloppy code. So long as we do that I think this is all reasoanble.
>> 
>> Let me try to understand your concerns here. Say if we want module code
>> to be a temporary home for module_alloc_huge before we move it to mm
>> code. Would you think it is ready to ship if we:
> 
> Please CC Christoph and linux-modules@vger.kernel.org on future patches
> and dicussions aroudn this, and all others now CC'd.

Sometimes, vger drops my patch because the CC list is too long. That's 
the reason I often trim the CC list. I will try to keep folks in this
thread CC'ed. 

> 
>> 1) Rename module_alloc_huge as module_alloc_text_huge();
> 
> module_alloc_text_huge() is too long, but I've suggested names before
> which are short and generic, and also suggested that if modules are
> not the only users this needs to go outside of modules and so
> vmalloc_text_huge() or whatever.
> 
> To do this right it begs the question why we don't do that for the
> existing module_alloc(), as the users of this code is well outside of
> modules now. Last time a similar generic name was used all the special
> arch stuff was left to be done by the module code still, but still
> non-modules were still using that allocator. From my perspective the
> right thing to do is to deal with all the arch stuff as well in the
> generic handler, and have the module code *and* the other users which
> use module_alloc() to use that new caller as well.

The key difference between module_alloc() and the new API is that the 
API will return RO+X memory, and the user need text-poke like API to
modify this buffer. Archs that do not support text-poke will not be
able to use the new API. Does this sound like a reasonable design?

> 
>> 2) Add module_free_text_huge();
> 
> Right, we have special handling for how we free this special code for regular
> module_alloc() and so similar considerations would be needed here for
> the huge stuff.
> 
>> 3) Move set_memory_* and fill_ill_insn logic into module_alloc_text_huge()
>> and module_free_text_huge(). 
> 
> Yes, that's a bit hairy now, and so a saner and consistent way to do
> this would be best.

Thanks for these information. I will try to go this direction. 

> 
>> Are these on the right direction? Did I miss anything important?
> 
> I've also hinted before that another way to help here is to have draw
> up a simple lib/test_vmalloc_text.c or something like that which would
> enable a selftest to ensure correctness of this code on different archs
> and maybe even let you do performance analysis using perf [0]. You have
> good reasons to move to the huge allocator and the performance metrics
> are an abstract load, however perf measurements can also give you real
> raw data which you can reproduce and enable others to do similar
> comparisons later.
> 
> The last thing I'd ask is just ensure you Cc folks who have already been in
> these discussions.
> 
> [0] https://lkml.kernel.org/r/Yog+d+oR5TtPp2cs@bombadil.infradead.org

Let me see how we can test it. 

Thanks,
Song


  reply	other threads:[~2022-07-09  1:14 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-07 22:35 [PATCH v6 bpf-next 0/5] bpf_prog_pack followup Song Liu
2022-07-07 22:35 ` [PATCH v6 bpf-next 1/5] module: introduce module_alloc_huge Song Liu
2022-07-07 22:35 ` [PATCH v6 bpf-next 2/5] bpf: use module_alloc_huge for bpf_prog_pack Song Liu
2022-07-07 22:35 ` [PATCH v6 bpf-next 3/5] vmalloc: WARN for set_vm_flush_reset_perms() on huge pages Song Liu
2022-07-07 22:35 ` [PATCH v6 bpf-next 4/5] vmalloc: introduce huge_vmalloc_supported Song Liu
2022-07-07 22:35 ` [PATCH v6 bpf-next 5/5] bpf: simplify select_bpf_prog_pack_size Song Liu
2022-07-07 22:59 ` [PATCH v6 bpf-next 0/5] bpf_prog_pack followup Luis Chamberlain
2022-07-07 23:52   ` Song Liu
2022-07-08  0:53     ` Luis Chamberlain
2022-07-08  1:36       ` Song Liu
2022-07-08 15:58         ` Luis Chamberlain
2022-07-08 19:58           ` Song Liu
2022-07-08 22:24             ` Luis Chamberlain
2022-07-09  1:14               ` Song Liu [this message]
2022-07-12  4:18                 ` Luis Chamberlain
2022-07-12  4:24                   ` Luis Chamberlain
2022-07-12  5:49                   ` Song Liu
2022-07-12 19:04                     ` Luis Chamberlain
2022-07-12 23:12                       ` Song Liu
2022-07-12 23:42                         ` Luis Chamberlain
2022-07-13  1:00                           ` Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6214B9C9-557B-4DC0-BFDE-77EAC425E577@fb.com \
    --to=songliubraving@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave@stgolabs.net \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=song@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.