From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 21 Feb 2019 12:36:15 -0800
From: Alexei Starovoitov
To: Kees Cook
Cc: Jann Horn, Daniel Borkmann, Andy Lutomirski, Alexei Starovoitov,
	Network Development
Subject: Re: [PATCH bpf-next v2] bpf, seccomp: fix false positive preemption splat for cbpf->ebpf progs
Message-ID: <20190221203613.q6k757fi3wxtoj5y@ast-mbp.dhcp.thefacebook.com>
References: <20190220230135.9748-1-daniel@iogearbox.net>
	<20190220235952.uzrsjypoqkha7ya6@ast-mbp.dhcp.thefacebook.com>
	<20190221192916.2mcd4fmxbdj2j2u3@ast-mbp.dhcp.thefacebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: NeoMutt/20180223
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

On Thu, Feb 21, 2019 at 11:53:06AM -0800, Kees Cook wrote:
> On Thu, Feb 21,
> 2019 at 11:29 AM Alexei Starovoitov wrote:
> >
> > On Thu, Feb 21, 2019 at 01:56:53PM +0100, Jann Horn wrote:
> > > On Thu, Feb 21, 2019 at 9:53 AM Daniel Borkmann wrote:
> > > > On 02/21/2019 06:31 AM, Kees Cook wrote:
> > > > > On Wed, Feb 20, 2019 at 8:03 PM Alexei Starovoitov wrote:
> > > > >>
> > > > >> On Wed, Feb 20, 2019 at 3:59 PM Alexei Starovoitov wrote:
> > > > >>>
> > > > >>> On Thu, Feb 21, 2019 at 12:01:35AM +0100, Daniel Borkmann wrote:
> > > > >>>> In 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> > > > >>>> a check was added for BPF_PROG_RUN() that for every invocation preemption is
> > > > >>>> disabled to not break eBPF assumptions (e.g. per-cpu map). Of course this does
> > > > >>>> not count for seccomp because only cBPF -> eBPF is loaded here and it does
> > > > >>>> not make use of any functionality that would require this assertion. Fix this
> > > > >>>> false positive by adding and using SECCOMP_RUN() variant that does not have
> > > > >>>> the cant_sleep(); check.
> > > > >>>>
> > > > >>>> Fixes: 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> > > > >>>> Reported-by: syzbot+8bf19ee2aa580de7a2a7@syzkaller.appspotmail.com
> > > > >>>> Signed-off-by: Daniel Borkmann
> > > > >>>> Acked-by: Kees Cook
> > > > >>>
> > > > >>> Applied, Thanks
> > > > >>
> > > > >> Actually I think it's a wrong approach to go long term.
> > > > >> I'm thinking to revert it.
> > > > >> I think it's better to disable preemption for duration of
> > > > >> seccomp cbpf prog.
> > > > >> It's short and there is really no reason for it to be preemptible.
> > > > >> When seccomp switches to ebpf we'll have this weird inconsistency.
> > > > >> Let's just disable preemption for seccomp as well.
> > > > >
> > > > > A lot of changes will be needed for seccomp ebpf -- not the least of
> > > > > which is convincing me there is a use-case. ;)
> > > > >
> > > > > But the main issue is that I'm not a huge fan of dropping two
> > > > > barriers() across syscall entry. That seems pretty heavy-duty for
> > > > > something that is literally not needed right now.
> > > >
> > > > Yeah, I think it's okay to add once actually technically needed. Last
> > > > time I looked, if I recall correctly, at least Chrome installs some
> > > > heavy duty seccomp programs that go close to prog limit.
> > >
> > > Half of that is probably because that seccomp BPF code is so
> > > inefficient, though.
> > >
> > > This snippet shows that those programs constantly recheck the high
> > > halves of arguments:
> > >
> > > Some of the generated code is pointless because all reachable code
> > > from that point on has the same outcome (the last "ret ALLOW" in the
> > > following sample is unreachable because they've already checked that
> > > the high bit of the low half is set, so the low half can't be 3):
> >
> > and with ebpf these optimizations will be available for free
> > because llvm will remove unnecessary loads and simplify branches.
> > There is no technical reason not to use ebpf in seccomp.
> >
> > When we discussed preemption of classic vs extended in socket filters
> > context we agreed to make it a requirement that preemption must be
> > disabled though it's not strictly necessary. RX side of socket filters
> > was already non-preempt while TX was preemptible.
> > We must not make an exception of this rule for seccomp.
> > Hence I've reverted this commit.
> >
> > Here is the actual fix for seccomp:
> > From: Alexei Starovoitov
> > Date: Thu, 21 Feb 2019 10:40:14 -0800
> > Subject: [PATCH] seccomp, bpf: disable preemption before calling into bpf prog
> >
> > All BPF programs must be called with preemption disabled.
> >
> > Signed-off-by: Alexei Starovoitov
> > ---
> >  kernel/seccomp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index e815781ed751..a43c601ac252 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -267,6 +267,7 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
> >  	 * All filters in the list are evaluated and the lowest BPF return
> >  	 * value always takes priority (ignoring the DATA).
> >  	 */
> > +	preempt_disable();
> >  	for (; f; f = f->prev) {
> >  		u32 cur_ret = BPF_PROG_RUN(f->prog, sd);
> >
> > @@ -275,6 +276,7 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
> >  			*match = f;
> >  		}
> >  	}
> > +	preempt_enable();
> >  	return ret;
> >  }
> >  #endif /* CONFIG_SECCOMP_FILTER */
> > --
> >
> > Doing per-cpu increment of cache hot data is practically free and it makes seccomp
> > play by the rules.
>
> Other accesses should dominate the run time, yes. I'm still not a big
> fan of unconditionally adding this, but I won't NAK. :P

Thank you.
I also would like to touch on your comment:
"A lot of changes will be needed for seccomp ebpf"
There were two attempts to add it in the past and the patches were small and
straightforward. If I recall correctly, both times you nacked them because
performance gains and ease of use arguments were not convincing enough, right?
Are you still not convinced?
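
For anyone reading along without the kernel tree at hand, here is a rough
user-space model of what the patch quoted above does. It is only a sketch,
not kernel code: every sketch_* name, the struct layout and the action values
are hypothetical, and a plain counter stands in for the preempt count. The
only things taken from the diff are the shape of the loop (all filters run,
the lowest return value wins) and the disable/enable pair around the whole
walk. The assert() inside each filter plays the role that the cant_sleep()
check plays in BPF_PROG_RUN().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action values, ordered so that a lower value is more
 * restrictive. These are NOT the kernel's SECCOMP_RET_* constants. */
enum { SKETCH_KILL = 0, SKETCH_ERRNO = 1, SKETCH_ALLOW = 2 };

/* Hypothetical stand-in for struct seccomp_data: just a syscall number. */
struct sketch_data {
	int nr;
};

/* Hypothetical stand-in for a seccomp filter: a callback plus a link to
 * the previously installed filter. */
struct sketch_filter {
	const struct sketch_filter *prev;
	uint32_t (*run)(const struct sketch_data *sd);
};

/* A plain counter that models the preempt count in this sketch. */
static int sketch_preempt_depth;

static void sketch_preempt_disable(void) { sketch_preempt_depth++; }
static void sketch_preempt_enable(void)  { sketch_preempt_depth--; }

/* Each filter asserts that it runs with "preemption" disabled. */
static uint32_t sketch_allow_all(const struct sketch_data *sd)
{
	(void)sd;
	assert(sketch_preempt_depth > 0);
	return SKETCH_ALLOW;
}

static uint32_t sketch_deny_nr1(const struct sketch_data *sd)
{
	assert(sketch_preempt_depth > 0);
	return sd->nr == 1 ? SKETCH_ERRNO : SKETCH_ALLOW;
}

/* Walk the whole filter list inside one disable/enable pair and keep the
 * most restrictive (lowest) result, recording which filter produced it. */
static uint32_t sketch_run_filters(const struct sketch_filter *f,
				   const struct sketch_data *sd,
				   const struct sketch_filter **match)
{
	uint32_t ret = SKETCH_ALLOW;

	sketch_preempt_disable();
	for (; f; f = f->prev) {
		uint32_t cur_ret = f->run(sd);

		if (cur_ret < ret) {
			ret = cur_ret;
			*match = f;
		}
	}
	sketch_preempt_enable();
	return ret;
}
```

The pair is taken once around the loop rather than once per filter, which
matches the placement in the diff: the cost is one increment and one
decrement per syscall regardless of how many filters are stacked.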