From: Song Liu <songliubraving@fb.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>, bpf <bpf@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-modules@vger.kernel.org" <linux-modules@vger.kernel.org>,
	"mcgrof@kernel.org" <mcgrof@kernel.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"mhiramat@kernel.org" <mhiramat@kernel.org>,
	"naveen.n.rao@linux.ibm.com" <naveen.n.rao@linux.ibm.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"anil.s.keshavamurthy@intel.com" <anil.s.keshavamurthy@intel.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"rick.p.edgecombe@intel.com" <rick.p.edgecombe@intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH bpf-next 1/3] mm/vmalloc: introduce vmalloc_exec which allocates RO+X memory
Date: Fri, 5 Aug 2022 05:29:51 +0000
Message-ID: <14D6DBA0-0572-44FB-A566-464B1FF541E0@fb.com>
In-Reply-To: <Ys6cWUMHO8XwyYgr@hirez.programming.kicks-ass.net>

Hi Peter,

> On Jul 13, 2022, at 3:20 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 

[...]

> 
> So how about instead we separate them? Then much of the problem goes
> away, you don't need to track these 2M chunks at all.
> 
> Start by adding VM_TOPDOWN_VMAP, which instead of returning the lowest
> (leftmost) vmap_area that fits, picks the highest (rightmost).
> 
> Then add module_alloc_data() that uses VM_TOPDOWN_VMAP and make
> ARCH_WANTS_MODULE_DATA_IN_VMALLOC use that instead of vmalloc (with a
> weak function doing the vmalloc).
> 
> This gets you bottom of module range is RO+X only, top is shattered
> between different !X types.
> 
> Then track the boundary between X and !X and ensure module_alloc_data()
> and module_alloc() never cross over and stay strictly separated.
> 
> Then change all module_alloc() users to expect RO+X memory, instead of
> RW.
> 
> Then make sure any extension of the X range is 2M aligned.
> 
> And presto, *everybody* always uses 2M TLB for text, modules, bpf,
> ftrace, the lot and nobody is tracking chunks.
> 
> Maybe migration can be eased by instead providing module_alloc_text()
> and ARCH_WANTS_MODULE_ALLOC_TEXT.
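
A minimal userspace sketch of the layout proposed above, assuming text is
handed out bottom-up, data top-down, and the X boundary only ever moves in
2M steps. Everything here (struct mod_range, model_alloc_text(),
model_alloc_data(), the MODEL_* macros) is hypothetical and only models the
address arithmetic; it is not kernel code and does not touch page tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE ((uintptr_t)4096)
#define MODEL_SZ_2M     (((uintptr_t)2) << 20)
#define MODEL_ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical model of the module address range. */
struct mod_range {
	uintptr_t start;     /* lowest address, 2M aligned */
	uintptr_t end;       /* one past the highest address */
	uintptr_t text_next; /* next free address for text (grows up) */
	uintptr_t x_end;     /* end of the X region, always 2M aligned */
	uintptr_t data_low;  /* lowest address given to data (grows down) */
};

static void mod_range_init(struct mod_range *r, uintptr_t start, uintptr_t end)
{
	assert(start % MODEL_SZ_2M == 0 && start < end);
	r->start = start;
	r->end = end;
	r->text_next = start;
	r->x_end = start;
	r->data_low = end;
}

/* Bottom-up, page-granular text allocation; returns 0 on failure. */
static uintptr_t model_alloc_text(struct mod_range *r, size_t size)
{
	uintptr_t len = MODEL_ALIGN_UP((uintptr_t)size, MODEL_PAGE_SIZE);
	uintptr_t addr = r->text_next;
	/* Any extension of the X range is rounded up to 2M. */
	uintptr_t new_x_end = MODEL_ALIGN_UP(addr + len, MODEL_SZ_2M);

	if (len == 0 || new_x_end > r->data_low)
		return 0; /* would cross into the !X side */
	r->text_next = addr + len;
	if (new_x_end > r->x_end)
		r->x_end = new_x_end;
	return addr;
}

/* Top-down, page-granular data allocation; returns 0 on failure. */
static uintptr_t model_alloc_data(struct mod_range *r, size_t size)
{
	uintptr_t len = MODEL_ALIGN_UP((uintptr_t)size, MODEL_PAGE_SIZE);

	if (len == 0 || len > r->data_low - r->x_end)
		return 0; /* would cross into the X side */
	r->data_low -= len;
	return r->data_low;
}
```

The only state shared by the two allocators is the pair x_end/data_low,
which is the "boundary between X and !X" in the description above; no
per-2M chunk bookkeeping is needed in this model.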

I finally got some time to look into the code. A few questions:

1. AFAICT, the vmap_area tree only works with PAGE_SIZE aligned addresses.
   For the sharing to be more efficient, I think we need a smaller
   granularity. Will this work? If so, shall we pick something like
   64 bytes, or go all the way to 1 byte?

2. I think we will need multiple vmap_areas sharing the same vm_struct.
   Do we need to add a refcount to vm_struct?
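
To make the two questions concrete, here is a small userspace sketch.
The first helper shows how much space a single allocation wastes at a
given granularity; the second part models several areas sharing one
refcounted backing object. All names (model_wasted_bytes(),
struct model_backing, struct model_area) are hypothetical stand-ins, not
the kernel's vmap_area or vm_struct, and the plain counter is not
thread-safe (kernel code would more likely use refcount_t).

```c
#include <stddef.h>

/* Bytes lost to rounding when 'size' is rounded up to 'gran'. */
static size_t model_wasted_bytes(size_t size, size_t gran)
{
	size_t rounded = (size + gran - 1) / gran * gran;

	return rounded - size;
}

/* Stand-in for a shared backing object such as a vm_struct. */
struct model_backing {
	unsigned int refcount;
};

/* Stand-in for one sub-range carved out of the backing object. */
struct model_area {
	size_t offset;
	size_t size;
	struct model_backing *backing;
};

static void model_area_attach(struct model_area *a, struct model_backing *b,
			      size_t offset, size_t size)
{
	a->offset = offset;
	a->size = size;
	a->backing = b;
	b->refcount++;
}

/* Returns 1 when the last user is gone and the backing may be freed. */
static int model_area_detach(struct model_area *a)
{
	struct model_backing *b = a->backing;

	a->backing = NULL;
	return --b->refcount == 0;
}
```

For a 200-byte program, this gives 3896 wasted bytes at 4096-byte
granularity, 56 at 64 bytes, and 0 at 1 byte.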

Thanks,
Song


