* [PATCH stable 4.9 0/6] BPF stable patches
@ 2018-03-08 15:17 Daniel Borkmann
  2018-03-08 15:17 ` [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm Daniel Borkmann
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable

All for 4.9 backported and tested.

Thanks!

Daniel Borkmann (5):
  bpf: fix wrong exposure of map_flags into fdinfo for lpm
  bpf: fix mlock precharge on arraymaps
  bpf, x64: implement retpoline for tail call
  bpf, arm64: fix out of bounds access in tail call
  bpf, ppc64: fix out of bounds access in tail call

Eric Dumazet (1):
  bpf: add schedule points in percpu arrays management

 arch/arm64/net/bpf_jit_comp.c        |  5 +++--
 arch/powerpc/net/bpf_jit_comp64.c    |  1 +
 arch/x86/include/asm/nospec-branch.h | 37 ++++++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c          |  9 +++++----
 kernel/bpf/arraymap.c                | 35 ++++++++++++++++++++++------------
 kernel/bpf/stackmap.c                |  1 +
 6 files changed, 70 insertions(+), 18 deletions(-)

-- 
2.9.5


* [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf: fix wrong exposure of map_flags into fdinfo for lpm" has been added to the 4.9-stable tree gregkh
  2018-03-08 15:17 ` [PATCH stable 4.9 2/6] bpf: fix mlock precharge on arraymaps Daniel Borkmann
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable, David S . Miller

[ upstream commit a316338cb71a3260201490e615f2f6d5c0d8fb2c ]

trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. The latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and, if not, bails out, which is currently the case for lpm
since it always exposes 0 as flags.

Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure we do
not forget to copy them over at a later point in time when we add
actual flags for them to use.
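
For illustration, a minimal sketch of such a flags cross-check from
userspace (a hypothetical helper, not the actual iproute2 code),
assuming the map_flags line the kernel writes into fdinfo:

  #include <stdio.h>

  /* Hypothetical helper, not taken from iproute2: parse the
   * map_flags line the kernel prints into /proc/<pid>/fdinfo/<fd>
   * for a BPF map fd. With this fix, a pinned lpm map reports
   * BPF_F_NO_PREALLOC here instead of always 0. */
  int map_flags_from_fdinfo(int fd, unsigned int *flags)
  {
          char path[64], line[256];
          FILE *fp;

          snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
          fp = fopen(path, "r");
          if (!fp)
                  return -1;
          while (fgets(line, sizeof(line), fp)) {
                  if (sscanf(line, "map_flags:\t%x", flags) == 1) {
                          fclose(fp);
                          return 0;
                  }
          }
          fclose(fp);
          return -1;
  }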

Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 kernel/bpf/arraymap.c | 1 +
 kernel/bpf/stackmap.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 9a1e6ed..3be357a 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -107,6 +107,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	array->map.key_size = attr->key_size;
 	array->map.value_size = attr->value_size;
 	array->map.max_entries = attr->max_entries;
+	array->map.map_flags = attr->map_flags;
 	array->elem_size = elem_size;
 
 	if (!percpu)
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index be85191..a2a232d 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -88,6 +88,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 	smap->map.key_size = attr->key_size;
 	smap->map.value_size = value_size;
 	smap->map.max_entries = attr->max_entries;
+	smap->map.map_flags = attr->map_flags;
 	smap->n_buckets = n_buckets;
 	smap->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
 
-- 
2.9.5


* [PATCH stable 4.9 2/6] bpf: fix mlock precharge on arraymaps
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
  2018-03-08 15:17 ` [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf: fix mlock precharge on arraymaps" has been added to the 4.9-stable tree gregkh
  2018-03-08 15:17 ` [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call Daniel Borkmann
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable, Dennis Zhou

[ upstream commit 9c2d63b843a5c8a8d0559cc067b5398aa5ec3ffc ]

syzkaller recently triggered an OOM during percpu map allocation;
while there is work in progress by Dennis Zhou to add __GFP_NORETRY
semantics for the percpu allocator under pressure, there also seems
to be a missing bpf_map_precharge_memlock() check in array map
allocation.

Given that today the actual bpf_map_charge_memlock() happens after
find_and_alloc_map() in the syscall path, bpf_map_precharge_memlock()
is there to bail out early, before we go and do the map setup work,
when we would hit the limits anyway. Therefore, add this for the
array map as well.
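
As a rough worked example of the precharged cost (hypothetical
numbers, following the logic in the hunks below) for a
BPF_MAP_TYPE_PERCPU_ARRAY with value_size = 8, max_entries = 4096
and 4 possible CPUs:

  elem_size = round_up(8, 8)                       /* = 8 */
  cost  = array_size                               /* struct + pptrs[] */
        + 4096 * 8 * num_possible_cpus()           /* percpu storage */
  pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT

bpf_map_precharge_memlock(pages) then checks this page count against
RLIMIT_MEMLOCK up front, so we no longer perform the potentially huge
percpu allocations only to fail the memlock charge afterwards.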

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Fixes: a10423b87a7e ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 kernel/bpf/arraymap.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 3be357a..075ddb8 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -48,8 +48,9 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
 	u32 elem_size, index_mask, max_entries;
 	bool unpriv = !capable(CAP_SYS_ADMIN);
+	u64 cost, array_size, mask64;
 	struct bpf_array *array;
-	u64 array_size, mask64;
+	int ret;
 
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
@@ -92,8 +93,19 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 		array_size += (u64) max_entries * elem_size;
 
 	/* make sure there is no u32 overflow later in round_up() */
-	if (array_size >= U32_MAX - PAGE_SIZE)
+	cost = array_size;
+	if (cost >= U32_MAX - PAGE_SIZE)
 		return ERR_PTR(-ENOMEM);
+	if (percpu) {
+		cost += (u64)attr->max_entries * elem_size * num_possible_cpus();
+		if (cost >= U32_MAX - PAGE_SIZE)
+			return ERR_PTR(-ENOMEM);
+	}
+	cost = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
+
+	ret = bpf_map_precharge_memlock(cost);
+	if (ret < 0)
+		return ERR_PTR(ret);
 
 	/* allocate all map elements and zero-initialize them */
 	array = bpf_map_area_alloc(array_size);
@@ -108,20 +120,15 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	array->map.value_size = attr->value_size;
 	array->map.max_entries = attr->max_entries;
 	array->map.map_flags = attr->map_flags;
+	array->map.pages = cost;
 	array->elem_size = elem_size;
 
-	if (!percpu)
-		goto out;
-
-	array_size += (u64) attr->max_entries * elem_size * num_possible_cpus();
-
-	if (array_size >= U32_MAX - PAGE_SIZE ||
-	    elem_size > PCPU_MIN_UNIT_SIZE || bpf_array_alloc_percpu(array)) {
+	if (percpu &&
+	    (elem_size > PCPU_MIN_UNIT_SIZE ||
+	     bpf_array_alloc_percpu(array))) {
 		bpf_map_area_free(array);
 		return ERR_PTR(-ENOMEM);
 	}
-out:
-	array->map.pages = round_up(array_size, PAGE_SIZE) >> PAGE_SHIFT;
 
 	return &array->map;
 }
-- 
2.9.5


* [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
  2018-03-08 15:17 ` [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm Daniel Borkmann
  2018-03-08 15:17 ` [PATCH stable 4.9 2/6] bpf: fix mlock precharge on arraymaps Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.9-stable tree gregkh
  2018-03-10  0:05   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.4-stable tree gregkh
  2018-03-08 15:17 ` [PATCH stable 4.9 4/6] bpf, arm64: fix out of bounds access in tail call Daniel Borkmann
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable

[ upstream commit a493a87f38cfa48caaa95c9347be2d914c6fdf29 ]

Implement a retpoline [0] for the BPF tail call JIT'ing, which converts
the indirect jump via jmp %rax that is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter also aligns with what gcc / clang emit (e.g. [1]).

JIT dump after patch:

  # bpftool p d x i 1
   0: (18) r2 = map[id:1]
   2: (b7) r3 = 0
   3: (85) call bpf_tail_call#12
   4: (b7) r0 = 2
   5: (95) exit

With CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000072  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000072  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000072  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	callq  0x000000000000006d  |+
  66:	pause                      |
  68:	lfence                     |
  6b:	jmp    0x0000000000000066  |
  6d:	mov    %rax,(%rsp)         |
  71:	retq                       |
  72:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  + retpoline for indirect jump

Without CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000063  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000063  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000063  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	jmpq   *%rax               |-
  63:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  - plain indirect jump as before

  [0] https://support.google.com/faqs/answer/7625886
  [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b
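
As a sanity check, the bytes emitted by RETPOLINE_RAX_BPF_JIT() in
the hunk below add up to the 17 in RETPOLINE_RAX_BPF_JIT_SIZE:

  callq do_rop       E8 <rel32>     5 bytes
  pause              F3 90          2 bytes
  lfence             0F AE E8       3 bytes
  jmp spec_trap      EB F9          2 bytes
  mov %rax,(%rsp)    48 89 04 24    4 bytes
  retq               C3             1 byte
                                   --------
                                   17 bytes

Without CONFIG_RETPOLINE, jmp *%rax is just FF E0 (2 bytes), which is
why e.g. the previously hard-coded OFFSET1 of 43 matches the new
41 + RETPOLINE_RAX_BPF_JIT_SIZE.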

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/x86/include/asm/nospec-branch.h | 37 ++++++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c          |  9 +++++----
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 76b0585..81a1be3 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -177,4 +177,41 @@ static inline void indirect_branch_prediction_barrier(void)
 }
 
 #endif /* __ASSEMBLY__ */
+
+/*
+ * Below is used in the eBPF JIT compiler and emits the byte sequence
+ * for the following assembly:
+ *
+ * With retpolines configured:
+ *
+ *    callq do_rop
+ *  spec_trap:
+ *    pause
+ *    lfence
+ *    jmp spec_trap
+ *  do_rop:
+ *    mov %rax,(%rsp)
+ *    retq
+ *
+ * Without retpolines configured:
+ *
+ *    jmp *%rax
+ */
+#ifdef CONFIG_RETPOLINE
+# define RETPOLINE_RAX_BPF_JIT_SIZE	17
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
+	/* spec_trap: */					\
+	EMIT2(0xF3, 0x90);       /* pause */			\
+	EMIT3(0x0F, 0xAE, 0xE8); /* lfence */			\
+	EMIT2(0xEB, 0xF9);       /* jmp spec_trap */		\
+	/* do_rop: */						\
+	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
+	EMIT1(0xC3);             /* retq */
+#else
+# define RETPOLINE_RAX_BPF_JIT_SIZE	2
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
+#endif
+
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 7840331..1f7ed2e 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -12,6 +12,7 @@
 #include <linux/filter.h>
 #include <linux/if_vlan.h>
 #include <asm/cacheflush.h>
+#include <asm/nospec-branch.h>
 #include <linux/bpf.h>
 
 int bpf_jit_enable __read_mostly;
@@ -281,7 +282,7 @@ static void emit_bpf_tail_call(u8 **pprog)
 	EMIT2(0x89, 0xD2);                        /* mov edx, edx */
 	EMIT3(0x39, 0x56,                         /* cmp dword ptr [rsi + 16], edx */
 	      offsetof(struct bpf_array, map.max_entries));
-#define OFFSET1 43 /* number of bytes to jump */
+#define OFFSET1 (41 + RETPOLINE_RAX_BPF_JIT_SIZE) /* number of bytes to jump */
 	EMIT2(X86_JBE, OFFSET1);                  /* jbe out */
 	label1 = cnt;
 
@@ -290,7 +291,7 @@ static void emit_bpf_tail_call(u8 **pprog)
 	 */
 	EMIT2_off32(0x8B, 0x85, -STACKSIZE + 36); /* mov eax, dword ptr [rbp - 516] */
 	EMIT3(0x83, 0xF8, MAX_TAIL_CALL_CNT);     /* cmp eax, MAX_TAIL_CALL_CNT */
-#define OFFSET2 32
+#define OFFSET2 (30 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JA, OFFSET2);                   /* ja out */
 	label2 = cnt;
 	EMIT3(0x83, 0xC0, 0x01);                  /* add eax, 1 */
@@ -304,7 +305,7 @@ static void emit_bpf_tail_call(u8 **pprog)
 	 *   goto out;
 	 */
 	EMIT3(0x48, 0x85, 0xC0);		  /* test rax,rax */
-#define OFFSET3 10
+#define OFFSET3 (8 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JE, OFFSET3);                   /* je out */
 	label3 = cnt;
 
@@ -317,7 +318,7 @@ static void emit_bpf_tail_call(u8 **pprog)
 	 * rdi == ctx (1st arg)
 	 * rax == prog->bpf_func + prologue_size
 	 */
-	EMIT2(0xFF, 0xE0);                        /* jmp rax */
+	RETPOLINE_RAX_BPF_JIT();
 
 	/* out: */
 	BUILD_BUG_ON(cnt - label1 != OFFSET1);
-- 
2.9.5


* [PATCH stable 4.9 4/6] bpf, arm64: fix out of bounds access in tail call
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
                   ` (2 preceding siblings ...)
  2018-03-08 15:17 ` [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf, arm64: fix out of bounds access in tail call" has been added to the 4.9-stable tree gregkh
  2018-03-08 15:17 ` [PATCH stable 4.9 5/6] bpf: add schedule points in percpu arrays management Daniel Borkmann
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable

[ upstream commit 16338a9b3ac30740d49f5dfed81bac0ffa53b9c7 ]

I recently noticed a crash on arm64 when feeding a bogus index
into the BPF tail call helper. The crash would not occur when the
interpreter is used, but only in the case of the JIT. Output looks
as follows:

  [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
  [...]
  [  347.043065] [fffb850e96492510] address between user and kernel address ranges
  [  347.050205] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  347.190829] x13: 0000000000000000 x12: 0000000000000000
  [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
  [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
  [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
  [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
  [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
  [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
  [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
  [  347.235221] Call trace:
  [  347.237656]  0xffff000002f3a4fc
  [  347.240784]  bpf_test_run+0x78/0xf8
  [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
  [  347.248694]  SyS_bpf+0x77c/0x1110
  [  347.251999]  el0_svc_naked+0x30/0x34
  [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
  [...]

In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as the index;
here, it had the value 0xffff808fd7cf0500, which sits in x2.

While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:

  # bpftool p d j i 988
  [...]
  38:   ldr     w10, [x1,x10]
  3c:   cmp     w2, w10
  40:   b.ge    0x000000000000007c              <-- signed cmp
  44:   mov     x10, #0x20                      // #32
  48:   cmp     x26, x10
  4c:   b.gt    0x000000000000007c
  50:   add     x26, x26, #0x1
  54:   mov     x10, #0x110                     // #272
  58:   add     x10, x1, x10
  5c:   lsl     x11, x2, #3
  60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
  64:   cbz     x11, 0x000000000000007c
  [...]

Meaning, the tests passed because commit ddb55992b04d ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned ones, which as a result let the test wrongly
pass.

Change both this and the tail call count test to unsigned, and cap
the index as a u32. We did the latter as well in 90caccdd8cc0
("bpf: fix bpf_tail_call() x64 JIT"), and it is needed here in
addition. Tested on HiSilicon Hi1616.
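
A minimal C sketch of why the signed b.ge wrongly passes here (the
values are taken from the crash above; this is an illustration, not
kernel code):

  #include <stdio.h>

  int main(void)
  {
          unsigned int max_entries = 1;
          /* low 32 bits of the pointer 0xffff808fd7cf0500 fed as index */
          unsigned int index = 0xd7cf0500u;

          /* old signed check (b.ge): (int)index is negative, so the
           * bounds test wrongly passes */
          printf("signed rejects:   %d\n", (int)index >= (int)max_entries);

          /* new unsigned check (b.cs): the bogus index is rejected */
          printf("unsigned rejects: %d\n", index >= max_entries);
          return 0;
  }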

Result after patch:

  # bpftool p d j i 268
  [...]
  38:	ldr	w10, [x1,x10]
  3c:	add	w2, w2, #0x0
  40:	cmp	w2, w10
  44:	b.cs	0x0000000000000080
  48:	mov	x10, #0x20                  	// #32
  4c:	cmp	x26, x10
  50:	b.hi	0x0000000000000080
  54:	add	x26, x26, #0x1
  58:	mov	x10, #0x110                 	// #272
  5c:	add	x10, x1, x10
  60:	lsl	x11, x2, #3
  64:	ldr	x11, [x10,x11]
  68:	cbz	x11, 0x0000000000000080
  [...]

Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/arm64/net/bpf_jit_comp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index d8199e1..b47a26f 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -234,8 +234,9 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	off = offsetof(struct bpf_array, map.max_entries);
 	emit_a64_mov_i64(tmp, off, ctx);
 	emit(A64_LDR32(tmp, r2, tmp), ctx);
+	emit(A64_MOV(0, r3, r3), ctx);
 	emit(A64_CMP(0, r3, tmp), ctx);
-	emit(A64_B_(A64_COND_GE, jmp_offset), ctx);
+	emit(A64_B_(A64_COND_CS, jmp_offset), ctx);
 
 	/* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
 	 *     goto out;
@@ -243,7 +244,7 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	 */
 	emit_a64_mov_i64(tmp, MAX_TAIL_CALL_CNT, ctx);
 	emit(A64_CMP(1, tcc, tmp), ctx);
-	emit(A64_B_(A64_COND_GT, jmp_offset), ctx);
+	emit(A64_B_(A64_COND_HI, jmp_offset), ctx);
 	emit(A64_ADD_I(1, tcc, tcc, 1), ctx);
 
 	/* prog = array->ptrs[index];
-- 
2.9.5


* [PATCH stable 4.9 5/6] bpf: add schedule points in percpu arrays management
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
                   ` (3 preceding siblings ...)
  2018-03-08 15:17 ` [PATCH stable 4.9 4/6] bpf, arm64: fix out of bounds access in tail call Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf: add schedule points in percpu arrays management" has been added to the 4.9-stable tree gregkh
  2018-03-08 15:17 ` [PATCH stable 4.9 6/6] bpf, ppc64: fix out of bounds access in tail call Daniel Borkmann
  2018-03-09 22:22 ` [PATCH stable 4.9 0/6] BPF stable patches Greg KH
  6 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable, Eric Dumazet

From: Eric Dumazet <edumazet@google.com>

[ upstream commit 32fff239de37ef226d5b66329dd133f64d63b22d ]

syzbot managed to trigger RCU-detected stalls in
bpf_array_free_percpu().

It takes time to allocate a huge percpu map, but even more time to free
it.

Since we run in process context, use cond_resched() to yield the cpu
if needed.
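
The resulting pattern, with the reasoning spelled out (an
illustrative sketch; the actual change is in the hunks below):

  for (i = 0; i < array->map.max_entries; i++) {
          free_percpu(array->pptrs[i]);
          cond_resched(); /* may sleep: we run in process context and
                           * hold no locks here, so yielding is safe */
  }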

Fixes: a10423b87a7e ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 kernel/bpf/arraymap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 075ddb8..a38119e 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -20,8 +20,10 @@ static void bpf_array_free_percpu(struct bpf_array *array)
 {
 	int i;
 
-	for (i = 0; i < array->map.max_entries; i++)
+	for (i = 0; i < array->map.max_entries; i++) {
 		free_percpu(array->pptrs[i]);
+		cond_resched();
+	}
 }
 
 static int bpf_array_alloc_percpu(struct bpf_array *array)
@@ -37,6 +39,7 @@ static int bpf_array_alloc_percpu(struct bpf_array *array)
 			return -ENOMEM;
 		}
 		array->pptrs[i] = ptr;
+		cond_resched();
 	}
 
 	return 0;
-- 
2.9.5


* [PATCH stable 4.9 6/6] bpf, ppc64: fix out of bounds access in tail call
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
                   ` (4 preceding siblings ...)
  2018-03-08 15:17 ` [PATCH stable 4.9 5/6] bpf: add schedule points in percpu arrays management Daniel Borkmann
@ 2018-03-08 15:17 ` Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf, ppc64: fix out of bounds access in tail call" has been added to the 4.9-stable tree gregkh
  2018-03-09 22:22 ` [PATCH stable 4.9 0/6] BPF stable patches Greg KH
  6 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-08 15:17 UTC (permalink / raw)
  To: gregkh; +Cc: ast, daniel, stable

[ upstream commit d269176e766c71c998cb75b4ea8cbc321cc0019d ]

While working on 16338a9b3ac3 ("bpf, arm64: fix out of bounds access in
tail call") I noticed that the ppc64 JIT is partially affected as well.
While the bounds checking is correctly performed as an unsigned
comparison, the register with the index value, however, is never
truncated into 32 bit space, so e.g. an index value of 0x100000000ULL
with a map of 1 element would pass the PPC_CMPLW() check, whereas we
later on continue with the full 64 bit register value. Therefore,
truncate the value to 32 bit initially, as we do in the interpreter
and other JITs, in order to fix the access.
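
In C terms, the emitted PPC_RLWINM(b2p_index, b2p_index, 0, 0, 31)
(rotate by 0, keep only the low word) amounts to this sketch:

  u64 index = 0x100000000ULL;     /* example value from above */

  index &= 0xffffffffULL;         /* zero-extend the low 32 bits */

  /* index is now 0: the PPC_CMPLW() bounds check and the later
   * array->ptrs[] load both see the same truncated u32 value */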

Fixes: ce0761419fae ("powerpc/bpf: Implement support for tail calls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/powerpc/net/bpf_jit_comp64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 0fe98a5..be9d968 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -245,6 +245,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
 	 *   goto out;
 	 */
 	PPC_LWZ(b2p[TMP_REG_1], b2p_bpf_array, offsetof(struct bpf_array, map.max_entries));
+	PPC_RLWINM(b2p_index, b2p_index, 0, 0, 31);
 	PPC_CMPLW(b2p_index, b2p[TMP_REG_1]);
 	PPC_BCC(COND_GE, out);
 
-- 
2.9.5


* Patch "bpf: add schedule points in percpu arrays management" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 5/6] bpf: add schedule points in percpu arrays management Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  0 siblings, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, edumazet, gregkh, syzkaller; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf: add schedule points in percpu arrays management

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-add-schedule-points-in-percpu-arrays-management.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:36 +0100
Subject: bpf: add schedule points in percpu arrays management
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>
Message-ID: <2f5704af2bdf05e8eae92917e2aeaec49d5477c9.1520521792.git.daniel@iogearbox.net>

From: Eric Dumazet <edumazet@google.com>

[ upstream commit 32fff239de37ef226d5b66329dd133f64d63b22d ]

syzbot managed to trigger RCU-detected stalls in
bpf_array_free_percpu().

It takes time to allocate a huge percpu map, but even more time to free
it.

Since we run in process context, use cond_resched() to yield the cpu
if needed.

Fixes: a10423b87a7e ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/arraymap.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -20,8 +20,10 @@ static void bpf_array_free_percpu(struct
 {
 	int i;
 
-	for (i = 0; i < array->map.max_entries; i++)
+	for (i = 0; i < array->map.max_entries; i++) {
 		free_percpu(array->pptrs[i]);
+		cond_resched();
+	}
 }
 
 static int bpf_array_alloc_percpu(struct bpf_array *array)
@@ -37,6 +39,7 @@ static int bpf_array_alloc_percpu(struct
 			return -ENOMEM;
 		}
 		array->pptrs[i] = ptr;
+		cond_resched();
 	}
 
 	return 0;


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Patch "bpf, arm64: fix out of bounds access in tail call" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 4/6] bpf, arm64: fix out of bounds access in tail call Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  0 siblings, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, ast, gregkh; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf, arm64: fix out of bounds access in tail call

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:35 +0100
Subject: bpf, arm64: fix out of bounds access in tail call
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org
Message-ID: <3e884789ba211b116935a7c05044b861aba2a30e.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>

[ upstream commit 16338a9b3ac30740d49f5dfed81bac0ffa53b9c7 ]

I recently noticed a crash on arm64 when feeding a bogus index
into the BPF tail call helper. The crash would not occur when the
interpreter is used, but only in the case of the JIT. Output looks
as follows:

  [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
  [...]
  [  347.043065] [fffb850e96492510] address between user and kernel address ranges
  [  347.050205] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  347.190829] x13: 0000000000000000 x12: 0000000000000000
  [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
  [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
  [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
  [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
  [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
  [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
  [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
  [  347.235221] Call trace:
  [  347.237656]  0xffff000002f3a4fc
  [  347.240784]  bpf_test_run+0x78/0xf8
  [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
  [  347.248694]  SyS_bpf+0x77c/0x1110
  [  347.251999]  el0_svc_naked+0x30/0x34
  [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
  [...]

In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as the index;
here, it had the value 0xffff808fd7cf0500, which sits in x2.

While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:

  # bpftool p d j i 988
  [...]
  38:   ldr     w10, [x1,x10]
  3c:   cmp     w2, w10
  40:   b.ge    0x000000000000007c              <-- signed cmp
  44:   mov     x10, #0x20                      // #32
  48:   cmp     x26, x10
  4c:   b.gt    0x000000000000007c
  50:   add     x26, x26, #0x1
  54:   mov     x10, #0x110                     // #272
  58:   add     x10, x1, x10
  5c:   lsl     x11, x2, #3
  60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
  64:   cbz     x11, 0x000000000000007c
  [...]

Meaning, the tests passed because commit ddb55992b04d ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned ones, which as a result let the test wrongly
pass.

Change both this and the tail call count test to unsigned, and cap
the index as a u32. We did the latter as well in 90caccdd8cc0
("bpf: fix bpf_tail_call() x64 JIT"), and it is needed here in
addition. Tested on HiSilicon Hi1616.

Result after patch:

  # bpftool p d j i 268
  [...]
  38:	ldr	w10, [x1,x10]
  3c:	add	w2, w2, #0x0
  40:	cmp	w2, w10
  44:	b.cs	0x0000000000000080
  48:	mov	x10, #0x20                  	// #32
  4c:	cmp	x26, x10
  50:	b.hi	0x0000000000000080
  54:	add	x26, x26, #0x1
  58:	mov	x10, #0x110                 	// #272
  5c:	add	x10, x1, x10
  60:	lsl	x11, x2, #3
  64:	ldr	x11, [x10,x11]
  68:	cbz	x11, 0x0000000000000080
  [...]

Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/net/bpf_jit_comp.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -234,8 +234,9 @@ static int emit_bpf_tail_call(struct jit
 	off = offsetof(struct bpf_array, map.max_entries);
 	emit_a64_mov_i64(tmp, off, ctx);
 	emit(A64_LDR32(tmp, r2, tmp), ctx);
+	emit(A64_MOV(0, r3, r3), ctx);
 	emit(A64_CMP(0, r3, tmp), ctx);
-	emit(A64_B_(A64_COND_GE, jmp_offset), ctx);
+	emit(A64_B_(A64_COND_CS, jmp_offset), ctx);
 
 	/* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
 	 *     goto out;
@@ -243,7 +244,7 @@ static int emit_bpf_tail_call(struct jit
 	 */
 	emit_a64_mov_i64(tmp, MAX_TAIL_CALL_CNT, ctx);
 	emit(A64_CMP(1, tcc, tmp), ctx);
-	emit(A64_B_(A64_COND_GT, jmp_offset), ctx);
+	emit(A64_B_(A64_COND_HI, jmp_offset), ctx);
 	emit(A64_ADD_I(1, tcc, tcc, 1), ctx);
 
 	/* prog = array->ptrs[index];


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Patch "bpf: fix mlock precharge on arraymaps" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 2/6] bpf: fix mlock precharge on arraymaps Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  0 siblings, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, ast, dennisszhou, gregkh; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf: fix mlock precharge on arraymaps

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-fix-mlock-precharge-on-arraymaps.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:33 +0100
Subject: bpf: fix mlock precharge on arraymaps
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org, Dennis Zhou <dennisszhou@gmail.com>
Message-ID: <56632230ccbd31f06f33af4c6ac9fcbc88506825.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>

[ upstream commit 9c2d63b843a5c8a8d0559cc067b5398aa5ec3ffc ]

syzkaller recently triggered an OOM during percpu map allocation;
while there is work in progress by Dennis Zhou to add __GFP_NORETRY
semantics for the percpu allocator under pressure, there also seems
to be a missing bpf_map_precharge_memlock() check in array map
allocation.

Given that today the actual bpf_map_charge_memlock() happens after
find_and_alloc_map() in the syscall path, bpf_map_precharge_memlock()
is there to bail out early, before we go and do the map setup work,
when we would hit the limits anyway. Therefore, add this for the
array map as well.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Fixes: a10423b87a7e ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/arraymap.c |   29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -48,8 +48,9 @@ static struct bpf_map *array_map_alloc(u
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
 	u32 elem_size, index_mask, max_entries;
 	bool unpriv = !capable(CAP_SYS_ADMIN);
+	u64 cost, array_size, mask64;
 	struct bpf_array *array;
-	u64 array_size, mask64;
+	int ret;
 
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
@@ -92,8 +93,19 @@ static struct bpf_map *array_map_alloc(u
 		array_size += (u64) max_entries * elem_size;
 
 	/* make sure there is no u32 overflow later in round_up() */
-	if (array_size >= U32_MAX - PAGE_SIZE)
+	cost = array_size;
+	if (cost >= U32_MAX - PAGE_SIZE)
 		return ERR_PTR(-ENOMEM);
+	if (percpu) {
+		cost += (u64)attr->max_entries * elem_size * num_possible_cpus();
+		if (cost >= U32_MAX - PAGE_SIZE)
+			return ERR_PTR(-ENOMEM);
+	}
+	cost = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
+
+	ret = bpf_map_precharge_memlock(cost);
+	if (ret < 0)
+		return ERR_PTR(ret);
 
 	/* allocate all map elements and zero-initialize them */
 	array = bpf_map_area_alloc(array_size);
@@ -108,20 +120,15 @@ static struct bpf_map *array_map_alloc(u
 	array->map.value_size = attr->value_size;
 	array->map.max_entries = attr->max_entries;
 	array->map.map_flags = attr->map_flags;
+	array->map.pages = cost;
 	array->elem_size = elem_size;
 
-	if (!percpu)
-		goto out;
-
-	array_size += (u64) attr->max_entries * elem_size * num_possible_cpus();
-
-	if (array_size >= U32_MAX - PAGE_SIZE ||
-	    elem_size > PCPU_MIN_UNIT_SIZE || bpf_array_alloc_percpu(array)) {
+	if (percpu &&
+	    (elem_size > PCPU_MIN_UNIT_SIZE ||
+	     bpf_array_alloc_percpu(array))) {
 		bpf_map_area_free(array);
 		return ERR_PTR(-ENOMEM);
 	}
-out:
-	array->map.pages = round_up(array_size, PAGE_SIZE) >> PAGE_SHIFT;
 
 	return &array->map;
 }


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Patch "bpf: fix wrong exposure of map_flags into fdinfo for lpm" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  0 siblings, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, ast, davem, gregkh, jarno; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf: fix wrong exposure of map_flags into fdinfo for lpm

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:32 +0100
Subject: bpf: fix wrong exposure of map_flags into fdinfo for lpm
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org, "David S . Miller" <davem@davemloft.net>
Message-ID: <d0c41d2614afbe10501cc3d96d998952694bab7f.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>

[ upstream commit a316338cb71a3260201490e615f2f6d5c0d8fb2c ]

trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. The latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and, if not, bails out, which is currently the case for lpm
since it always exposes 0 as flags.

Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure we do
not forget to copy them over at a later point in time when we add
actual flags for them to use.

Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/arraymap.c |    1 +
 kernel/bpf/stackmap.c |    1 +
 2 files changed, 2 insertions(+)

--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -107,6 +107,7 @@ static struct bpf_map *array_map_alloc(u
 	array->map.key_size = attr->key_size;
 	array->map.value_size = attr->value_size;
 	array->map.max_entries = attr->max_entries;
+	array->map.map_flags = attr->map_flags;
 	array->elem_size = elem_size;
 
 	if (!percpu)
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -88,6 +88,7 @@ static struct bpf_map *stack_map_alloc(u
 	smap->map.key_size = attr->key_size;
 	smap->map.value_size = value_size;
 	smap->map.max_entries = attr->max_entries;
+	smap->map.map_flags = attr->map_flags;
 	smap->n_buckets = n_buckets;
 	smap->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
 


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Patch "bpf, ppc64: fix out of bounds access in tail call" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 6/6] bpf, ppc64: fix out of bounds access in tail call Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  0 siblings, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, ast, gregkh, naveen.n.rao; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf, ppc64: fix out of bounds access in tail call

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:37 +0100
Subject: bpf, ppc64: fix out of bounds access in tail call
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org
Message-ID: <08bc9e5902de0e9e9b26194ba4ea219f053b7206.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>

[ upstream commit d269176e766c71c998cb75b4ea8cbc321cc0019d ]

While working on 16338a9b3ac3 ("bpf, arm64: fix out of bounds access in
tail call") I noticed that the ppc64 JIT is partially affected as well.
While the bounds checking is correctly performed as an unsigned
comparison, the register with the index value, however, is never
truncated into 32 bit space, so e.g. an index value of 0x100000000ULL
with a map of 1 element would pass the PPC_CMPLW() check, whereas we
later on continue with the full 64 bit register value. Therefore,
truncate the value to 32 bit initially, as we do in the interpreter
and other JITs, in order to fix the access.

Fixes: ce0761419fae ("powerpc/bpf: Implement support for tail calls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/net/bpf_jit_comp64.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -245,6 +245,7 @@ static void bpf_jit_emit_tail_call(u32 *
 	 *   goto out;
 	 */
 	PPC_LWZ(b2p[TMP_REG_1], b2p_bpf_array, offsetof(struct bpf_array, map.max_entries));
+	PPC_RLWINM(b2p_index, b2p_index, 0, 0, 31);
 	PPC_CMPLW(b2p_index, b2p[TMP_REG_1]);
 	PPC_BCC(COND_GE, out);
 


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.9-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call Daniel Borkmann
@ 2018-03-09 22:21   ` gregkh
  2018-03-10  0:05   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.4-stable tree gregkh
  1 sibling, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-09 22:21 UTC (permalink / raw)
  To: daniel, ast, gregkh; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf, x64: implement retpoline for tail call

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-x64-implement-retpoline-for-tail-call.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 14:20:51 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:34 +0100
Subject: bpf, x64: implement retpoline for tail call
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org
Message-ID: <cfd8963c4c57f676177fb2d3a516a4b63cdccde2.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>

[ upstream commit a493a87f38cfa48caaa95c9347be2d914c6fdf29 ]

Implement a retpoline [0] for the BPF tail call JIT'ing, which converts
the indirect jump via jmp %rax that is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter also aligns with what gcc / clang emit (e.g. [1]).

JIT dump after patch:

  # bpftool p d x i 1
   0: (18) r2 = map[id:1]
   2: (b7) r3 = 0
   3: (85) call bpf_tail_call#12
   4: (b7) r0 = 2
   5: (95) exit

With CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000072  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000072  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000072  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	callq  0x000000000000006d  |+
  66:	pause                      |
  68:	lfence                     |
  6b:	jmp    0x0000000000000066  |
  6d:	mov    %rax,(%rsp)         |
  71:	retq                       |
  72:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  + retpoline for indirect jump

Without CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000063  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000063  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000063  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	jmpq   *%rax               |-
  63:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  - plain indirect jump as before

  [0] https://support.google.com/faqs/answer/7625886
  [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |   37 +++++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c          |    9 ++++----
 2 files changed, 42 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -177,4 +177,41 @@ static inline void indirect_branch_predi
 }
 
 #endif /* __ASSEMBLY__ */
+
+/*
+ * Below is used in the eBPF JIT compiler and emits the byte sequence
+ * for the following assembly:
+ *
+ * With retpolines configured:
+ *
+ *    callq do_rop
+ *  spec_trap:
+ *    pause
+ *    lfence
+ *    jmp spec_trap
+ *  do_rop:
+ *    mov %rax,(%rsp)
+ *    retq
+ *
+ * Without retpolines configured:
+ *
+ *    jmp *%rax
+ */
+#ifdef CONFIG_RETPOLINE
+# define RETPOLINE_RAX_BPF_JIT_SIZE	17
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
+	/* spec_trap: */					\
+	EMIT2(0xF3, 0x90);       /* pause */			\
+	EMIT3(0x0F, 0xAE, 0xE8); /* lfence */			\
+	EMIT2(0xEB, 0xF9);       /* jmp spec_trap */		\
+	/* do_rop: */						\
+	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
+	EMIT1(0xC3);             /* retq */
+#else
+# define RETPOLINE_RAX_BPF_JIT_SIZE	2
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
+#endif
+
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -12,6 +12,7 @@
 #include <linux/filter.h>
 #include <linux/if_vlan.h>
 #include <asm/cacheflush.h>
+#include <asm/nospec-branch.h>
 #include <linux/bpf.h>
 
 int bpf_jit_enable __read_mostly;
@@ -281,7 +282,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	EMIT2(0x89, 0xD2);                        /* mov edx, edx */
 	EMIT3(0x39, 0x56,                         /* cmp dword ptr [rsi + 16], edx */
 	      offsetof(struct bpf_array, map.max_entries));
-#define OFFSET1 43 /* number of bytes to jump */
+#define OFFSET1 (41 + RETPOLINE_RAX_BPF_JIT_SIZE) /* number of bytes to jump */
 	EMIT2(X86_JBE, OFFSET1);                  /* jbe out */
 	label1 = cnt;
 
@@ -290,7 +291,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 */
 	EMIT2_off32(0x8B, 0x85, -STACKSIZE + 36); /* mov eax, dword ptr [rbp - 516] */
 	EMIT3(0x83, 0xF8, MAX_TAIL_CALL_CNT);     /* cmp eax, MAX_TAIL_CALL_CNT */
-#define OFFSET2 32
+#define OFFSET2 (30 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JA, OFFSET2);                   /* ja out */
 	label2 = cnt;
 	EMIT3(0x83, 0xC0, 0x01);                  /* add eax, 1 */
@@ -304,7 +305,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 *   goto out;
 	 */
 	EMIT3(0x48, 0x85, 0xC0);		  /* test rax,rax */
-#define OFFSET3 10
+#define OFFSET3 (8 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JE, OFFSET3);                   /* je out */
 	label3 = cnt;
 
@@ -317,7 +318,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 * rdi == ctx (1st arg)
 	 * rax == prog->bpf_func + prologue_size
 	 */
-	EMIT2(0xFF, 0xE0);                        /* jmp rax */
+	RETPOLINE_RAX_BPF_JIT();
 
 	/* out: */
 	BUILD_BUG_ON(cnt - label1 != OFFSET1);


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.9/bpf-fix-mlock-precharge-on-arraymaps.patch
queue-4.9/bpf-x64-implement-retpoline-for-tail-call.patch
queue-4.9/bpf-arm64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-fix-wrong-exposure-of-map_flags-into-fdinfo-for-lpm.patch
queue-4.9/bpf-ppc64-fix-out-of-bounds-access-in-tail-call.patch
queue-4.9/bpf-add-schedule-points-in-percpu-arrays-management.patch


* Re: [PATCH stable 4.9 0/6] BPF stable patches
  2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
                   ` (5 preceding siblings ...)
  2018-03-08 15:17 ` [PATCH stable 4.9 6/6] bpf, ppc64: fix out of bounds access in tail call Daniel Borkmann
@ 2018-03-09 22:22 ` Greg KH
  2018-03-09 22:36   ` Daniel Borkmann
  6 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2018-03-09 22:22 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: ast, stable

On Thu, Mar 08, 2018 at 04:17:31PM +0100, Daniel Borkmann wrote:
> All for 4.9 backported and tested.

All now applied.  Should I work to backport these to the 4.4.y tree?

thanks,

greg k-h


* Re: [PATCH stable 4.9 0/6] BPF stable patches
  2018-03-09 22:22 ` [PATCH stable 4.9 0/6] BPF stable patches Greg KH
@ 2018-03-09 22:36   ` Daniel Borkmann
  2018-03-10  0:03     ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2018-03-09 22:36 UTC (permalink / raw)
  To: Greg KH; +Cc: ast, stable

On 03/09/2018 11:22 PM, Greg KH wrote:
> On Thu, Mar 08, 2018 at 04:17:31PM +0100, Daniel Borkmann wrote:
>> All for 4.9 backported and tested.
> 
> All now applied.  Should I work to backport these to the 4.4.y tree?

Yeah, would be great as I don't have a way to test them on 4.4.

Thanks Greg!


* Re: [PATCH stable 4.9 0/6] BPF stable patches
  2018-03-09 22:36   ` Daniel Borkmann
@ 2018-03-10  0:03     ` Greg KH
  0 siblings, 0 replies; 17+ messages in thread
From: Greg KH @ 2018-03-10  0:03 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: ast, stable

On Fri, Mar 09, 2018 at 11:36:10PM +0100, Daniel Borkmann wrote:
> On 03/09/2018 11:22 PM, Greg KH wrote:
> > On Thu, Mar 08, 2018 at 04:17:31PM +0100, Daniel Borkmann wrote:
> >> All for 4.9 backported and tested.
> > 
> > All now applied.  Should I work to backport these to the 4.4.y tree?
> 
> Yeah, would be great as I don't have a way to test them on 4.4.

Looks like only the one patch that Ben wanted to have applied was
relevant, so I've added that one now.

thanks,

greg k-h


* Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.4-stable tree
  2018-03-08 15:17 ` [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call Daniel Borkmann
  2018-03-09 22:21   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.9-stable tree gregkh
@ 2018-03-10  0:05   ` gregkh
  1 sibling, 0 replies; 17+ messages in thread
From: gregkh @ 2018-03-10  0:05 UTC (permalink / raw)
  To: daniel, ast, gregkh; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    bpf, x64: implement retpoline for tail call

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-x64-implement-retpoline-for-tail-call.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Fri Mar  9 16:00:31 PST 2018
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  8 Mar 2018 16:17:34 +0100
Subject: bpf, x64: implement retpoline for tail call
To: gregkh@linuxfoundation.org
Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org
Message-ID: <cfd8963c4c57f676177fb2d3a516a4b63cdccde2.1520521792.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>


[ upstream commit a493a87f38cfa48caaa95c9347be2d914c6fdf29 ]

Implement a retpoline [0] for the BPF tail call JIT'ing that converts
the indirect jump via jmp %rax, which is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter also aligns with what gcc / clang emit (e.g. [1]).
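
As a standalone illustration (a sketch for this note, not code taken
from the patch itself), the 17 bytes emitted under CONFIG_RETPOLINE can
be written out with the reasoning per instruction; the byte values come
from the EMIT*() calls in the hunk below:

  /* Illustrative only: byte-for-byte what RETPOLINE_RAX_BPF_JIT()
   * emits with CONFIG_RETPOLINE, annotated with why it is safe.
   */
  #include <assert.h>

  static const unsigned char retpoline_rax[] = {
          0xE8, 0x07, 0x00, 0x00, 0x00, /* callq do_rop: pushes the address
                                         * of spec_trap, so the return stack
                                         * buffer (RSB) predicts that the
                                         * retq below returns there */
          /* spec_trap: any speculative path is trapped in this loop */
          0xF3, 0x90,                   /* pause */
          0x0F, 0xAE, 0xE8,             /* lfence */
          0xEB, 0xF9,                   /* jmp spec_trap (rel8 = -7) */
          /* do_rop: the architectural path continues here */
          0x48, 0x89, 0x04, 0x24,       /* mov %rax,(%rsp): overwrite the
                                         * return address with the real
                                         * jump target */
          0xC3,                         /* retq: architecturally jumps to
                                         * *%rax, speculatively spins in
                                         * the pause/lfence loop */
  };

  int main(void)
  {
          /* must equal RETPOLINE_RAX_BPF_JIT_SIZE (17) */
          assert(sizeof(retpoline_rax) == 17);
          return 0;
  }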

JIT dump after patch:

  # bpftool p d x i 1
   0: (18) r2 = map[id:1]
   2: (b7) r3 = 0
   3: (85) call bpf_tail_call#12
   4: (b7) r0 = 2
   5: (95) exit

With CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000072  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000072  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000072  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	callq  0x000000000000006d  |+
  66:	pause                      |
  68:	lfence                     |
  6b:	jmp    0x0000000000000066  |
  6d:	mov    %rax,(%rsp)         |
  71:	retq                       |
  72:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  + retpoline for indirect jump

Without CONFIG_RETPOLINE:

  # bpftool p d j i 1
  [...]
  33:	cmp    %edx,0x24(%rsi)
  36:	jbe    0x0000000000000063  |*
  38:	mov    0x24(%rbp),%eax
  3e:	cmp    $0x20,%eax
  41:	ja     0x0000000000000063  |
  43:	add    $0x1,%eax
  46:	mov    %eax,0x24(%rbp)
  4c:	mov    0x90(%rsi,%rdx,8),%rax
  54:	test   %rax,%rax
  57:	je     0x0000000000000063  |
  59:	mov    0x28(%rax),%rax
  5d:	add    $0x25,%rax
  61:	jmpq   *%rax               |-
  63:	mov    $0x2,%eax
  [...]

  * relative fall-through jumps in error case
  - plain indirect jump as before

  [0] https://support.google.com/faqs/answer/7625886
  [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |   37 +++++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c          |    9 ++++----
 2 files changed, 42 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -195,4 +195,41 @@ static inline void vmexit_fill_RSB(void)
 }
 
 #endif /* __ASSEMBLY__ */
+
+/*
+ * Below is used in the eBPF JIT compiler and emits the byte sequence
+ * for the following assembly:
+ *
+ * With retpolines configured:
+ *
+ *    callq do_rop
+ *  spec_trap:
+ *    pause
+ *    lfence
+ *    jmp spec_trap
+ *  do_rop:
+ *    mov %rax,(%rsp)
+ *    retq
+ *
+ * Without retpolines configured:
+ *
+ *    jmp *%rax
+ */
+#ifdef CONFIG_RETPOLINE
+# define RETPOLINE_RAX_BPF_JIT_SIZE	17
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
+	/* spec_trap: */					\
+	EMIT2(0xF3, 0x90);       /* pause */			\
+	EMIT3(0x0F, 0xAE, 0xE8); /* lfence */			\
+	EMIT2(0xEB, 0xF9);       /* jmp spec_trap */		\
+	/* do_rop: */						\
+	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
+	EMIT1(0xC3);             /* retq */
+#else
+# define RETPOLINE_RAX_BPF_JIT_SIZE	2
+# define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
+#endif
+
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -12,6 +12,7 @@
 #include <linux/filter.h>
 #include <linux/if_vlan.h>
 #include <asm/cacheflush.h>
+#include <asm/nospec-branch.h>
 #include <linux/bpf.h>
 
 int bpf_jit_enable __read_mostly;
@@ -269,7 +270,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	EMIT2(0x89, 0xD2);                        /* mov edx, edx */
 	EMIT3(0x39, 0x56,                         /* cmp dword ptr [rsi + 16], edx */
 	      offsetof(struct bpf_array, map.max_entries));
-#define OFFSET1 43 /* number of bytes to jump */
+#define OFFSET1 (41 + RETPOLINE_RAX_BPF_JIT_SIZE) /* number of bytes to jump */
 	EMIT2(X86_JBE, OFFSET1);                  /* jbe out */
 	label1 = cnt;
 
@@ -278,7 +279,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 */
 	EMIT2_off32(0x8B, 0x85, -STACKSIZE + 36); /* mov eax, dword ptr [rbp - 516] */
 	EMIT3(0x83, 0xF8, MAX_TAIL_CALL_CNT);     /* cmp eax, MAX_TAIL_CALL_CNT */
-#define OFFSET2 32
+#define OFFSET2 (30 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JA, OFFSET2);                   /* ja out */
 	label2 = cnt;
 	EMIT3(0x83, 0xC0, 0x01);                  /* add eax, 1 */
@@ -292,7 +293,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 *   goto out;
 	 */
 	EMIT3(0x48, 0x85, 0xC0);		  /* test rax,rax */
-#define OFFSET3 10
+#define OFFSET3 (8 + RETPOLINE_RAX_BPF_JIT_SIZE)
 	EMIT2(X86_JE, OFFSET3);                   /* je out */
 	label3 = cnt;
 
@@ -305,7 +306,7 @@ static void emit_bpf_tail_call(u8 **ppro
 	 * rdi == ctx (1st arg)
 	 * rax == prog->bpf_func + prologue_size
 	 */
-	EMIT2(0xFF, 0xE0);                        /* jmp rax */
+	RETPOLINE_RAX_BPF_JIT();
 
 	/* out: */
 	BUILD_BUG_ON(cnt - label1 != OFFSET1);
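
The new OFFSET values can be cross-checked against the JIT dumps in the
commit message: the jbe at 0x36 is two bytes long, so execution falls
through to 0x38 and the jump target is 0x38 + OFFSET1. A minimal check
of that arithmetic (illustrative only; the constants are taken from the
dumps and the hunk above):

  #include <assert.h>

  int main(void)
  {
          /* With CONFIG_RETPOLINE: OFFSET1 = 41 + 17, jbe lands on 0x72 */
          assert(0x38 + 41 + 17 == 0x72);
          /* Without: OFFSET1 = 41 + 2 = 43 (the old constant), on 0x63 */
          assert(0x38 + 41 + 2 == 0x63);
          return 0;
  }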


Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.4/bpf-x64-implement-retpoline-for-tail-call.patch

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-03-10  0:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-08 15:17 [PATCH stable 4.9 0/6] BPF stable patches Daniel Borkmann
2018-03-08 15:17 ` [PATCH stable 4.9 1/6] bpf: fix wrong exposure of map_flags into fdinfo for lpm Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf: fix wrong exposure of map_flags into fdinfo for lpm" has been added to the 4.9-stable tree gregkh
2018-03-08 15:17 ` [PATCH stable 4.9 2/6] bpf: fix mlock precharge on arraymaps Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf: fix mlock precharge on arraymaps" has been added to the 4.9-stable tree gregkh
2018-03-08 15:17 ` [PATCH stable 4.9 3/6] bpf, x64: implement retpoline for tail call Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.9-stable tree gregkh
2018-03-10  0:05   ` Patch "bpf, x64: implement retpoline for tail call" has been added to the 4.4-stable tree gregkh
2018-03-08 15:17 ` [PATCH stable 4.9 4/6] bpf, arm64: fix out of bounds access in tail call Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf, arm64: fix out of bounds access in tail call" has been added to the 4.9-stable tree gregkh
2018-03-08 15:17 ` [PATCH stable 4.9 5/6] bpf: add schedule points in percpu arrays management Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf: add schedule points in percpu arrays management" has been added to the 4.9-stable tree gregkh
2018-03-08 15:17 ` [PATCH stable 4.9 6/6] bpf, ppc64: fix out of bounds access in tail call Daniel Borkmann
2018-03-09 22:21   ` Patch "bpf, ppc64: fix out of bounds access in tail call" has been added to the 4.9-stable tree gregkh
2018-03-09 22:22 ` [PATCH stable 4.9 0/6] BPF stable patches Greg KH
2018-03-09 22:36   ` Daniel Borkmann
2018-03-10  0:03     ` Greg KH
