From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751899AbcGRK52 (ORCPT ); Mon, 18 Jul 2016 06:57:28 -0400 Received: from mail-it0-f54.google.com ([209.85.214.54]:36859 "EHLO mail-it0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751700AbcGRK5T (ORCPT ); Mon, 18 Jul 2016 06:57:19 -0400 Date: Mon, 18 Jul 2016 03:57:17 -0700 (PDT) From: Sargun Dhillon X-X-Sender: sargun@ircssh.c.rugged-nimbus-611.internal To: Alexei Starovoitov cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: Re: [PATCH 1/1] tracing, bpf: Implement function bpf_probe_write In-Reply-To: <20160718041105.GA36253@ast-mbp.thefacebook.com> Message-ID: References: <20160713170849.GA76615@ast-mbp.thefacebook.com> <20160715054025.GA99435@ast-mbp> <20160716023035.GA19373@ast-mbp.thefacebook.com> <20160718041105.GA36253@ast-mbp.thefacebook.com> User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, 17 Jul 2016, Alexei Starovoitov wrote: > On Sun, Jul 17, 2016 at 03:19:13AM -0700, Sargun Dhillon wrote: >> >> +static u64 bpf_copy_to_user(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5) >> +{ >> + void *to = (void *) (long) r1; >> + void *from = (void *) (long) r2; >> + int size = (int) r3; >> + >> + /* check if we're in a user context */ >> + if (unlikely(in_interrupt())) >> + return -EINVAL; >> + if (unlikely(!current->pid)) >> + return -EINVAL; >> + >> + return copy_to_user(to, from, size); >> +} > > thanks for the patch, unfortunately it's not that straightforward. > copy_to_user might fault. Try enabling CONFIG_DEBUG_ATOMIC_SLEEP and > you'll see the splat since bpf programs are protected by rcu. > Also 'current' can be null and I'm not sure what current->pid does. > So the writing to user memory either has to be verified to avoid > sleeping and faults or we need to use something like task_work_add > mechanism. Ideas are certainly welcome. > > >>From casual inspection, I can't find where current can be null when in_interrupt() is false. Although, we can check before dereferencing it. When not in a user context, the pid of the task struct returns 0. As far as preventing sleep, would the following alteration do? Or do we actually need something more sophisticated? diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index be89c148..45878f3 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -86,14 +86,19 @@ static u64 bpf_copy_to_user(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5) void *to = (void *) (long) r1; void *from = (void *) (long) r2; int size = (int) r3; + struct task_struct *task = current; /* check if we're in a user context */ if (unlikely(in_interrupt())) return -EINVAL; - if (unlikely(!current->pid)) + if (unlikely(!task || !task->pid)) return -EINVAL; - return copy_to_user(to, from, size); + /* Is this a user address, or a kernel address? */ + if (!access_ok(VERIFY_WRITE, to, size)) + return -EINVAL; + + return probe_kernel_write(to, from, size); } static const struct bpf_func_proto bpf_copy_to_user_proto = { probe_kernel_write doesn't block, and this will disallow BPF programs to write to kernel memory. This turns off the pagefault handler under the hood, unblocking us.