From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF5FEC3A5A6 for ; Thu, 29 Aug 2019 15:43:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C572A2166E for ; Thu, 29 Aug 2019 15:43:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1567093418; bh=Bf7qPlGUF5ncgGF7xPwWfoZ96AH0GQTONCGSJJO0ILM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:List-ID:From; b=sfc544Ns7HAd9UOuXo8UH+3tQILar/7eWPDwu5M+AkdYMg7qb2/JSI+xEd3eNrczv Uu5dr71xkQ4oC7Oyrdp+axMObWcF6uZdRr2F6blajDsfIZzun5JnnTgmwkJ7Wr6Dzw yJ5Ye4xvtKua2POhBu70GIbhhRjrLUnH+97fkSQo= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727207AbfH2Pni (ORCPT ); Thu, 29 Aug 2019 11:43:38 -0400 Received: from mail.kernel.org ([198.145.29.99]:50928 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727118AbfH2Pni (ORCPT ); Thu, 29 Aug 2019 11:43:38 -0400 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B1A672166E for ; Thu, 29 Aug 2019 15:43:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1567093417; bh=Bf7qPlGUF5ncgGF7xPwWfoZ96AH0GQTONCGSJJO0ILM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=dB1NAEDT11NVk8mh8gwwVkr7ucDrdCe+aft7g9cEI3COyzU7wKZAlCjzp74TBAyVK slA4H/afXPvKvukQqpuRXEykZpQsK5ndXudmb0fvQPB7NuoH0SA9y3dU8126pMeASm gHCc04SuW8Pn0dpoarGsWK8P8jzEHrDS3EbJgMpo= Received: by mail-wm1-f50.google.com with SMTP id y135so2359934wmc.1 for ; Thu, 29 Aug 2019 08:43:36 -0700 (PDT) X-Gm-Message-State: APjAAAVj2fQYmjgtVrwyRss3/scDZuV3nlN7sS9Op1jHM7CLixZSH7dD 07w1GRNmNn8PtgV4tlQSy0OHwZWKrr4WWKDfAK9JcQ== X-Google-Smtp-Source: APXvYqzRkyS1nWJSFtXBAeKXiKlI+EhmvqeWKaGcIIDOMF6gTOsTy80m0M5IDFc3+xAboKNjq3vs7k/iFGpPWRUG3HM= X-Received: by 2002:a05:600c:24cf:: with SMTP id 15mr12397339wmu.76.1567093415251; Thu, 29 Aug 2019 08:43:35 -0700 (PDT) MIME-Version: 1.0 References: <20190827205213.456318-1-ast@kernel.org> <20190828071421.GK2332@hirez.programming.kicks-ass.net> <20190828220826.nlkpp632rsomocve@ast-mbp.dhcp.thefacebook.com> <20190829093434.36540972@gandalf.local.home> In-Reply-To: <20190829093434.36540972@gandalf.local.home> From: Andy Lutomirski Date: Thu, 29 Aug 2019 08:43:23 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF To: Steven Rostedt Cc: Alexei Starovoitov , Peter Zijlstra , Andy Lutomirski , Alexei Starovoitov , Kees Cook , LSM List , James Morris , Jann Horn , Masami Hiramatsu , "David S. Miller" , Daniel Borkmann , Network Development , bpf , kernel-team , Linux API Content-Type: text/plain; charset="UTF-8" Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: On Thu, Aug 29, 2019 at 6:34 AM Steven Rostedt wrote: > > On Wed, 28 Aug 2019 15:08:28 -0700 > Alexei Starovoitov wrote: > > > On Wed, Aug 28, 2019 at 09:14:21AM +0200, Peter Zijlstra wrote: > > > On Tue, Aug 27, 2019 at 04:01:08PM -0700, Andy Lutomirski wrote: > > > > > > > > Tracing: > > > > > > > > > > CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1) > > > > > are necessary to: > > > > > > That's not tracing, that's perf. > > > > > > re: your first comment above. > > I'm not sure what difference you see in words 'tracing' and 'perf'. > > I really hope we don't partition the overall tracing category > > into CAP_PERF and CAP_FTRACE only because these pieces are maintained > > by different people. > > I think Peter meant: It's not tracing, it's profiling. > > And there is a bit of separation between the two, although there is an > overlap. > > Yes, perf can do tracing but it's designed more for profiling. As I see it, there are a couple of reasons to split something into multiple capabilities. If they allow users to do well-defined things that have materially different risks from the perspective of the person granting the capabilities, then they can usefully be different. Similarly, if one carries a risk of accidental use that another does not, they should usefully be different. An example of the first is that CAP_NET_BIND_SERVICE has very different powers from CAP_NET_ADMIN, whereas CAP_SYS_ADMIN and CAP_PTRACE are really quite similar from a security perspective. An example of the latter is that CAP_DAC_OVERRIDE changes overall open() semantics and CAP_SYS_ADMIN does not, at least not outside of /proc. Things having different development histories and different maintainers doesn't seem like a good reason to split the capabilities IMO. > > > On one side perf_event_open() isn't really doing tracing (as step by > > step ftracing of function sequences), but perf_event_open() opens > > an event and the sequence of events (may include IP) becomes a trace. > > imo CAP_TRACING is the best name to descibe the privileged space > > of operations possible via perf_event_open, ftrace, kprobe, stack traces, etc. > > I have no issue with what you suggest. I guess it comes down to how > fine grain people want to go. Do we want it to be all or nothing? > Should CAP_TRACING allow for write access to tracefs? Or should we go > with needing both CAP_TRACING and permissions in that directory > (like changing the group ownership of the files at every boot). > > Perhaps we should have a CAP_TRACING_RO, that gives read access to > tracefs (and write if the users have permissions). And have CAP_TRACING > to allow full write access as well (allowing for users to add kprobe > events and enabling tracers like the function tracer). I can imagine splitting it into three capabilities: CAP_TRACE_KERNEL: learn which kernel functions are called when. This would allow perf profiling, for example, but not sampling of kernel regs. CAP_TRACE_READ_KERNEL_DATA: allow the tracing, profiling, etc features that can read the kernel's data. So you get function arguments via kprobe, kernel regs, and APIs that expose probe_kernel_read() CAP_TRACE_USER: trace unrelated user processes I'm not sure the code is written in a way that makes splitting CAP_TRACE_KERNEL and CAP_TRACE_READ_KERNEL_DATA, and I'm not sure that CAP_TRACE_KERNEL is all that useful except for plain perf record without CAP_TRACE_READ_KERNEL_DATA. What do you all think? I suppose it could also be: CAP_PROFILE_KERNEL: Use perf with events that aren't kprobes or tracepoints. Does not grant the ability to sample regs or the kernel stack directly. CAP_TRACE_KERNEL: Use all of perf, ftrace, kprobe, etc. CAP_TRACE_USER: Use all of perf with scope limited to user mode and uprobes. > As the above seems to favor the idea of CAP_TRACING allowing write > access to tracefs, should we have a CAP_TRACING_RO for just read access > and limited perf abilities? How about making a separate cap for limited perf capabilities along the lines of the above? For what it's worth, it should be straightforward using full tracing to read out the kernel's random number pool, for example, but it would be difficult or impossible to do that using just perf record -e cycles.