From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.5 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_PASS, USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F23A0C43381 for ; Thu, 14 Feb 2019 17:35:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B8E3421928 for ; Thu, 14 Feb 2019 17:35:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=fomichev-me.20150623.gappssmtp.com header.i=@fomichev-me.20150623.gappssmtp.com header.b="eFbH4FS9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437036AbfBNRfp (ORCPT ); Thu, 14 Feb 2019 12:35:45 -0500 Received: from mail-pf1-f195.google.com ([209.85.210.195]:44216 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729595AbfBNRfp (ORCPT ); Thu, 14 Feb 2019 12:35:45 -0500 Received: by mail-pf1-f195.google.com with SMTP id u6so3417653pfh.11 for ; Thu, 14 Feb 2019 09:35:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fomichev-me.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=A7nTX7TNYIGSWbBqbnBnMBh3jbLzWsSILlJ1d02bHt0=; b=eFbH4FS90l7pPznXsQ5aWizQCCc6F5jaNMy4Ay+NDSkaXeDHPL0nyjXOxAR5pQKp0E UoAQs8pSlK+W8ExV5FamqXwpkBwAArk1cwuAt3AgrH2rixVW6FG8381JEkU0U+L7lylk R9t/DLK2FzuD5C8fKHPzWkv5AsSlifV+rEpUvbWZjjD8mkfPGYlBPyvMtm2+ynus1/+O UUQ3VjW8FmQq66s3CTwQaq6rsmQxwhu06kcJ5xpOtC5dO57XmHpt/d9OIjMxeO34zaJG cW81T0VInhrGBpshyApkppn0xELDTv+DRQQuWuS/zdf6v1RrLjTAO3agiAM1i7aTFlIZ nwag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=A7nTX7TNYIGSWbBqbnBnMBh3jbLzWsSILlJ1d02bHt0=; b=KRGWMr+UIaZ/iV7m2/Y+5sAZTYR++K/3lhMMakJQWEeBGJFsT2tjJ0D+Jst993rg+R 67ZoPyVOdUvYJcHEmQIcM0H1Zic8cgMDFWTvQ5KH+iUF7m9x4JhyEu2oiGsUF2OOPi5y Nm+khAJZMDRKJ2gSv+7VEeNEuD2bMk+cn9tP1ANDj06ZFHrJKOBWDsMyBAUSbn+/9KZy 2ddP8+K/oKN9IUsDmKKkTRZedhZy/iUi/8LOMEGHFkpNUJgRA70FUbqkNTegfJCNFqi/ THYLu3iJslZSEZGE5QJXBYK0kRPGbmHUmlSVE/wcx2AnylmB0FOHSk3bohRKNLZSBKMq XrfQ== X-Gm-Message-State: AHQUAuY4QnuJ6YTW3Rjt/XTUxUPpEBxVlTaYwW5bBzHWxrMHYBC+Wu91 CxPPjOcZo2rw3BfzH9dtyKHCRQ== X-Google-Smtp-Source: AHgI3IYJV+oWAeRvKjWaqSoin5mDH4B/g9l55lRJOmH8Uji5uGqq1ILnYfc/LF8HTN8Akw+itjGozA== X-Received: by 2002:a63:374e:: with SMTP id g14mr991696pgn.59.1550165743994; Thu, 14 Feb 2019 09:35:43 -0800 (PST) Received: from localhost ([2601:646:8f00:18d9:d0fa:7a4b:764f:de48]) by smtp.gmail.com with ESMTPSA id z1sm5948496pfi.155.2019.02.14.09.35.43 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 14 Feb 2019 09:35:43 -0800 (PST) Date: Thu, 14 Feb 2019 09:35:42 -0800 From: Stanislav Fomichev To: Alexei Starovoitov Cc: Willem de Bruijn , Stanislav Fomichev , Network Development , David Miller , Alexei Starovoitov , Daniel Borkmann , simon.horman@netronome.com, Willem de Bruijn Subject: Re: [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen Message-ID: <20190214173542.GA20651@mini-arch> References: <20190206004714.pz44evow5uwgvt4x@ast-mbp.dhcp.thefacebook.com> <20190206005931.GF10769@mini-arch> <20190206031215.qldeh7pfgqr3frg3@ast-mbp> <20190206035619.GG10769@mini-arch> <20190206041112.h3ccahimp2uerdfk@ast-mbp> <20190206054946.GH10769@mini-arch> <20190212170232.GB10595@mini-arch> <20190214043931.hdpddeom57rnnlhm@ast-mbp> <20190214055725.GC10595@mini-arch> <20190214063848.7zr5ph3ijp3wmjgx@ast-mbp.dhcp.thefacebook.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190214063848.7zr5ph3ijp3wmjgx@ast-mbp.dhcp.thefacebook.com> User-Agent: Mutt/1.11.3 (2019-02-01) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On 02/13, Alexei Starovoitov wrote: > On Wed, Feb 13, 2019 at 09:57:25PM -0800, Stanislav Fomichev wrote: > > > > > That 'stuck with __sk_buff' is what bothers me. > > I might have use the wrong word here. I don't think there is another > > option to be honest. Using __sk_buff makes flow dissector programs work > > with fragmented packets; > > good point. indeed real skb is essential. > > > > It's an indication that api wasn't thought through if first thing > > > it needs is this fake skb hack. > > > If bpf_flow.c is a realistic example of such flow dissector prog > > > it means that real skb fields are accessed. > > > In particular skb->vlan_proto, skb->protocol. > > I do manually set skb->protocol to eth->h_proto in my proposal. This is later > > correctly handled by bpf_flow.c: parse_eth_proto() is called on skb->protocol > > and we correctly handle bpf_htons(ETH_P_8021Q) there. So existing > > bpf_flow.c works as expected. > ... > > The goal of this patch series was to essentially make this skb/no-skb > > context transparent to the bpf_flow.c (i.e. no changes from the user > > flow programs). Adding another flow dissector for eth_get_headlen case > > also seems as a no go. > > The problem with this thinking is assumption that bpf_flow.c is the only program. I agree, it's a bad assumption, but it is sort of a reference implementation, I don't expect other users to do something wildly different. Hopefully :-) > Since ctx of flow_dissector prog type is 'struct __sk_buff' > all fields should be valid or the verifier has to reject access > to fields that were not set. > You cannot "manually set skb->protocol to eth->h_proto" in fake skb > and ignore the rest. Ugh, I did expect that we only allow a minimal set of __sk_buff fields to be allowed from the flow dissector program type, but that's not the case. We explicitly prohibit access only to family/ips/ports/tc_classid/tstamp/wire_len, everything else is readable :-/ Any idea why? Stuff like ingress_ifindex/ifindex/hash/mark/queue_mapping, does flow dissector programs really need to know that? For the most part, using zero-initialized fake skb looks fine, except: * infindex, where we do skb->dev->ifndex (skb->dev is NULL) * gso_segs, where we do skb_shinfo(skb)->gso_segs (we are missing shinfo) So there is indeed a couple of problems. How do you feel about tightening down the access to sk_buff fields from the flow dissector program type? That is an API change, but I don't see why existing users should use those fields. Let's allow access only to len/data/data_end, protocol, vlan_{present,tci,proto}, cb, flow_keys, that should be enough to dissect the packet (I also looked at C-based implementation, it doesn't use anything besides that). We can always rollback if somebody complains about.