From: Alan Maguire <alan.maguire@oracle.com>
To: ardb@kernel.org, catalin.marinas@arm.com, will@kernel.org,
	daniel@iogearbox.net, ast@kernel.org
Cc: zlim.lnx@gmail.com, andrii@kernel.org, kafai@fb.com,
	songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
	kpsingh@kernel.org, andreyknvl@gmail.com,
	vincenzo.frascino@arm.com, mark.rutland@arm.com,
	samitolvanen@google.com, joey.gouly@arm.com, maz@kernel.org,
	daizhiyuan@phytium.com.cn, jthierry@redhat.com,
	tiantao6@hisilicon.com, pcc@google.com,
	akpm@linux-foundation.org, rppt@kernel.org,
	Jisheng.Zhang@synaptics.com, liu.hailong6@zte.com.cn,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, Russell King <russell.king@oracle.com>
Subject: [PATCH bpf-next 1/2] arm64/bpf: remove 128MB limit for BPF JIT programs
Date: Wed,  3 Nov 2021 09:49:28 +0000
Message-ID: <1635932969-13149-2-git-send-email-alan.maguire@oracle.com>
In-Reply-To: <1635932969-13149-1-git-send-email-alan.maguire@oracle.com>

From: Russell King <russell.king@oracle.com>

Commit 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module
memory") restricts BPF JIT program allocation to a 128MB region to ensure
that BPF programs remain within branching range of each other.  However,
this restriction should not apply to the aarch64 JIT, since
BPF_JMP | BPF_CALL is implemented as a 64-bit move into a register
followed by a BLR instruction, which can call any address without a
proximity limitation.  Removing the contiguous JIT region requires
explicitly searching the BPF exception tables first in fixup_exception(),
since they are formatted differently from the rest of the exception
tables.  Previously, the fact that JIT memory was contiguous was used to
determine whether the exception context (the program counter) was inside
a BPF program.
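
To illustrate the branch-range argument above, here is a minimal userspace
sketch (not kernel code).  It assumes the architectural A64 BL encoding, a
signed 26-bit immediate scaled by 4, which gives a reach of +/-128MB, and
shows how a 64-bit target splits into the four 16-bit chunks that a
MOVZ/MOVK-style sequence would load before a BLR.  The helper names
bl_reachable() and imm16_chunk() are hypothetical and do not appear in the
JIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A64 BL carries a signed 26-bit word offset: byte range [-2^27, 2^27 - 4]. */
#define BL_RANGE	(INT64_C(1) << 27)

/* Hypothetical helper: can a BL located at 'from' reach 'to' directly? */
static bool bl_reachable(uint64_t from, uint64_t to)
{
	/* Two's-complement difference between target and branch site. */
	int64_t off = (int64_t)(to - from);

	if (off & 3)	/* A64 instructions are 4-byte aligned */
		return false;

	return off >= -BL_RANGE && off < BL_RANGE;
}

/* Hypothetical helper: 16-bit chunk 'idx' (0..3) of a 64-bit address. */
static uint16_t imm16_chunk(uint64_t addr, unsigned int idx)
{
	assert(idx < 4);
	return (uint16_t)(addr >> (16 * idx));
}
```

A target that fails bl_reachable() can still be called through a register,
which is why the proximity limit is unnecessary for BPF_CALL.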

The practical reason to relax this restriction is that 128MB of JIT memory
can be exhausted quickly, especially where PAGE_SIZE is 64KB, since at
least one page is needed per program.  Where seccomp filters are applied to
multiple VMs at VM launch (such filters are written as classic BPF but are
converted to eBPF), this can severely limit the number of VMs that can be
launched.  With BPF JIT always-on supported, turning off the JIT isn't
always an option either.
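
As a rough illustration of how quickly the region fills, the arithmetic can
be sketched as follows.  This assumes each JITed program consumes exactly
one page, which is the best case; the helper name max_jit_programs() is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K	(UINT64_C(4) << 10)
#define SZ_64K	(UINT64_C(64) << 10)
#define SZ_128M	(UINT64_C(128) << 20)

/*
 * Hypothetical helper: upper bound on the number of JITed programs that
 * fit in a region, assuming one page per program.
 */
static uint64_t max_jit_programs(uint64_t region_size, uint64_t page_size)
{
	assert(page_size != 0);
	return region_size / page_size;
}
```

With a 128MB region this gives 2048 programs for 64KB pages, versus 32768
for 4KB pages.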

Fixes: 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module memory")
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <russell.king@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
---
 arch/arm64/include/asm/extable.h |  9 ---------
 arch/arm64/include/asm/memory.h  |  5 +----
 arch/arm64/kernel/traps.c        |  2 +-
 arch/arm64/mm/extable.c          | 13 +++++++++----
 arch/arm64/mm/ptdump.c           |  2 --
 arch/arm64/net/bpf_jit_comp.c    | 10 ++++++----
 6 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
index b15eb4a..840a35e 100644
--- a/arch/arm64/include/asm/extable.h
+++ b/arch/arm64/include/asm/extable.h
@@ -22,15 +22,6 @@ struct exception_table_entry
 
 #define ARCH_HAS_RELATIVE_EXTABLE
 
-static inline bool in_bpf_jit(struct pt_regs *regs)
-{
-	if (!IS_ENABLED(CONFIG_BPF_JIT))
-		return false;
-
-	return regs->pc >= BPF_JIT_REGION_START &&
-	       regs->pc < BPF_JIT_REGION_END;
-}
-
 #ifdef CONFIG_BPF_JIT
 int arm64_bpf_fixup_exception(const struct exception_table_entry *ex,
 			      struct pt_regs *regs);
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index f1745a8..0588632 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -44,11 +44,8 @@
 #define _PAGE_OFFSET(va)	(-(UL(1) << (va)))
 #define PAGE_OFFSET		(_PAGE_OFFSET(VA_BITS))
 #define KIMAGE_VADDR		(MODULES_END)
-#define BPF_JIT_REGION_START	(_PAGE_END(VA_BITS_MIN))
-#define BPF_JIT_REGION_SIZE	(SZ_128M)
-#define BPF_JIT_REGION_END	(BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
-#define MODULES_VADDR		(BPF_JIT_REGION_END)
+#define MODULES_VADDR		(_PAGE_END(VA_BITS_MIN))
 #define MODULES_VSIZE		(SZ_128M)
 #define VMEMMAP_START		(-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
 #define VMEMMAP_END		(VMEMMAP_START + VMEMMAP_SIZE)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index b03e383..fe0cd05 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -988,7 +988,7 @@ static int bug_handler(struct pt_regs *regs, unsigned int esr)
 static int reserved_fault_handler(struct pt_regs *regs, unsigned int esr)
 {
 	pr_err("%s generated an invalid instruction at %pS!\n",
-		in_bpf_jit(regs) ? "BPF JIT" : "Kernel text patching",
+		"Kernel text patching",
 		(void *)instruction_pointer(regs));
 
 	/* We cannot handle this */
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index aa00601..60a8b6a 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -9,14 +9,19 @@
 int fixup_exception(struct pt_regs *regs)
 {
 	const struct exception_table_entry *fixup;
+	unsigned long addr;
 
-	fixup = search_exception_tables(instruction_pointer(regs));
-	if (!fixup)
-		return 0;
+	addr = instruction_pointer(regs);
 
-	if (in_bpf_jit(regs))
+	/* Search the BPF tables first, these are formatted differently */
+	fixup = search_bpf_extables(addr);
+	if (fixup)
 		return arm64_bpf_fixup_exception(fixup, regs);
 
+	fixup = search_exception_tables(addr);
+	if (!fixup)
+		return 0;
+
 	regs->pc = (unsigned long)&fixup->fixup + fixup->fixup;
 	return 1;
 }
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c40353..9bc4066 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -41,8 +41,6 @@ enum address_markers_idx {
 	{ 0 /* KASAN_SHADOW_START */,	"Kasan shadow start" },
 	{ KASAN_SHADOW_END,		"Kasan shadow end" },
 #endif
-	{ BPF_JIT_REGION_START,		"BPF start" },
-	{ BPF_JIT_REGION_END,		"BPF end" },
 	{ MODULES_VADDR,		"Modules start" },
 	{ MODULES_END,			"Modules end" },
 	{ VMALLOC_START,		"vmalloc() area" },
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 41c23f4..465c44d 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1136,12 +1136,14 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	return prog;
 }
 
+u64 bpf_jit_alloc_exec_limit(void)
+{
+	return VMALLOC_END - VMALLOC_START;
+}
+
 void *bpf_jit_alloc_exec(unsigned long size)
 {
-	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
-				    BPF_JIT_REGION_END, GFP_KERNEL,
-				    PAGE_KERNEL, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	return vmalloc(size);
 }
 
 void bpf_jit_free_exec(void *addr)
-- 
1.8.3.1

