From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B384EC433ED for ; Mon, 19 Apr 2021 21:43:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7B08F613B2 for ; Mon, 19 Apr 2021 21:43:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231378AbhDSVnr (ORCPT ); Mon, 19 Apr 2021 17:43:47 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:31159 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229714AbhDSVnr (ORCPT ); Mon, 19 Apr 2021 17:43:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1618868597; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=NCUSF0NODUQGqVEalI8EqwtjFOpvGAIM6Wv/bT2GNjA=; b=V0K4Qf1JRQAO36/1mqeFjFnpV8hV9+DD2uFv8a1hXowZOy+sCgz0jPrQ57DQkg84FxFqj/ 0Z4Ka/SCa7FvcWmJqeSxyUIuvB07u7ZAKNZa8IvhNjV1l8TUKn/dyUTRi6AfxyGeQQKTE3 xR6I8F8LNeTzw7SVvx5pwer2e3/J7pQ= Received: from mail-ej1-f72.google.com (mail-ej1-f72.google.com [209.85.218.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-146-AMMo3BjJMQW-MZmMnRHv7w-1; Mon, 19 Apr 2021 17:43:15 -0400 X-MC-Unique: AMMo3BjJMQW-MZmMnRHv7w-1 Received: by mail-ej1-f72.google.com with SMTP id bx15-20020a170906a1cfb029037415131f28so4172578ejb.18 for ; Mon, 19 Apr 2021 14:43:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version; bh=NCUSF0NODUQGqVEalI8EqwtjFOpvGAIM6Wv/bT2GNjA=; b=eMFXsbnGnfokep2OvOGTnkkPur+KtZOs+xhbnqgFQE9LroXfShMhG1Otch2NTbOqcW huMnwCPy0iWa++ElSk6t+QkIylj2kyVhYX+MHeVRrDZTnFL6CUKMC2RSpfjMqIFHJfLb jalEElfKLIFEKB6BnyNAx77sI/pan9VPhshRRKB3hLjuaEnM+RMSycxbtPxGU2Y01UJx 9+/7lqDYFY0IT6WMJzlKH1xsoStrR/uXfV4TXSlllplRHcaAliHEFJTu5Xbmf8xTXpTW TECLA94YFmH+EPxRsglrwbsZObIDuMPGuN7d21mDcK8WVeOxYlFy/HcKvn4PSFeGUc6V qOqA== X-Gm-Message-State: AOAM5318t/np9jb/XHCXFpgXYmW4Q6rQ+lPe6Q2mgsg04LlDpOyCXuEg 9bRb+WiwmjXwVoYKHYh/SiEV0iECsKYpBRCvLPblMxRVUA3IydQVXnIvNnezWir7XTJkZNw60jw NWO8oqpCHZY9M X-Received: by 2002:a17:906:747:: with SMTP id z7mr24640877ejb.192.1618868594142; Mon, 19 Apr 2021 14:43:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxiEFjfwoNlDAxzRphGFpVnyEOhSnRuyXBQvwLogx9tcq03XTiPwaXPzthBdCAUpJKUmtRbKA== X-Received: by 2002:a17:906:747:: with SMTP id z7mr24640852ejb.192.1618868593816; Mon, 19 Apr 2021 14:43:13 -0700 (PDT) Received: from alrua-x1.borgediget.toke.dk ([2a0c:4d80:42:443::2]) by smtp.gmail.com with ESMTPSA id x7sm13586562eds.67.2021.04.19.14.43.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Apr 2021 14:43:13 -0700 (PDT) Received: by alrua-x1.borgediget.toke.dk (Postfix, from userid 1000) id 8EADE180092; Mon, 19 Apr 2021 23:43:12 +0200 (CEST) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= To: Daniel Borkmann , Kumar Kartikeya Dwivedi , bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , "David S. Miller" , Jakub Kicinski , Jesper Dangaard Brouer , Jesper Dangaard Brouer , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: Re: [PATCH bpf-next v2 3/4] libbpf: add low level TC-BPF API In-Reply-To: <6e8b744c-e012-c76b-b55f-7ddc8b7483db@iogearbox.net> References: <20210419121811.117400-1-memxor@gmail.com> <20210419121811.117400-4-memxor@gmail.com> <6e8b744c-e012-c76b-b55f-7ddc8b7483db@iogearbox.net> X-Clacks-Overhead: GNU Terry Pratchett Date: Mon, 19 Apr 2021 23:43:12 +0200 Message-ID: <874kg2gdcf.fsf@toke.dk> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Daniel Borkmann writes: > On 4/19/21 2:18 PM, Kumar Kartikeya Dwivedi wrote: >> This adds functions that wrap the netlink API used for adding, >> manipulating, and removing traffic control filters. These functions >> operate directly on the loaded prog's fd, and return a handle to the >> filter using an out parameter named id. >> >> The basic featureset is covered to allow for attaching, manipulation of >> properties, and removal of filters. Some additional features like >> TCA_BPF_POLICE and TCA_RATE for tc_cls have been omitted. These can >> added on top later by extending the bpf_tc_cls_opts struct. >> >> Support for binding actions directly to a classifier by passing them in >> during filter creation has also been omitted for now. These actions have >> an auto clean up property because their lifetime is bound to the filter >> they are attached to. This can be added later, but was omitted for now >> as direct action mode is a better alternative to it, which is enabled by >> default. >> >> An API summary: >> >> bpf_tc_act_{attach, change, replace} may be used to attach, change, and > > typo on bpf_tc_act_{...} ? > ^^^ >> replace SCHED_CLS bpf classifier. The protocol field can be set as 0, in >> which case it is subsitituted as ETH_P_ALL by default. > > Do you have an actual user that needs anything other than ETH_P_ALL? Why is it > even needed? Why not stick to just ETH_P_ALL? > >> The behavior of the three functions is as follows: >> >> attach = create filter if it does not exist, fail otherwise >> change = change properties of the classifier of existing filter >> replace = create filter, and replace any existing filter > > This touches on tc oddities quite a bit. Why do we need to expose them? Can't we > simplify/abstract this e.g. i) create or update instance, ii) delete instance, > iii) get instance ? What concrete use case do you have that you need those three > above? > >> bpf_tc_cls_detach may be used to detach existing SCHED_CLS >> filter. The bpf_tc_cls_attach_id object filled in during attach, >> change, or replace must be passed in to the detach functions for them to >> remove the filter and its attached classififer correctly. >> >> bpf_tc_cls_get_info is a helper that can be used to obtain attributes >> for the filter and classififer. The opts structure may be used to >> choose the granularity of search, such that info for a specific filter >> corresponding to the same loaded bpf program can be obtained. By >> default, the first match is returned to the user. >> >> Examples: >> >> struct bpf_tc_cls_attach_id id = {}; >> struct bpf_object *obj; >> struct bpf_program *p; >> int fd, r; >> >> obj = bpf_object_open("foo.o"); >> if (IS_ERR_OR_NULL(obj)) >> return PTR_ERR(obj); >> >> p = bpf_object__find_program_by_title(obj, "classifier"); >> if (IS_ERR_OR_NULL(p)) >> return PTR_ERR(p); >> >> if (bpf_object__load(obj) < 0) >> return -1; >> >> fd = bpf_program__fd(p); >> >> r = bpf_tc_cls_attach(fd, if_nametoindex("lo"), >> BPF_TC_CLSACT_INGRESS, >> NULL, &id); >> if (r < 0) >> return r; >> >> ... which is roughly equivalent to (after clsact qdisc setup): >> # tc filter add dev lo ingress bpf obj foo.o sec classifier da >> >> ... as direct action mode is always enabled. >> >> If a user wishes to modify existing options on an attached classifier, >> bpf_tc_cls_change API may be used. >> >> Only parameters class_id can be modified, the rest are filled in to >> identify the correct filter. protocol can be left out if it was not >> chosen explicitly (defaulting to ETH_P_ALL). >> >> Example: >> >> /* Optional parameters necessary to select the right filter */ >> DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts, >> .handle = id.handle, >> .priority = id.priority, >> .chain_index = id.chain_index) > > Why do we need chain_index as part of the basic API? > >> opts.class_id = TC_H_MAKE(1UL << 16, 12); >> r = bpf_tc_cls_change(fd, if_nametoindex("lo"), >> BPF_TC_CLSACT_INGRESS, >> &opts, &id); > > Also, I'm not sure whether the prefix should even be named bpf_tc_cls_*() tbh, > yes, despite being "low level" api. I think in the context of bpf we should stop > regarding this as 'classifier' and 'action' objects since it's really just a > single entity and not separate ones. It's weird enough to explain this concept > to new users and if a libbpf based api could cleanly abstract it, I would be all > for it. I don't think we need to map 1:1 the old tc legacy even in the low level > api, tbh, as it feels backwards. I think the 'handle' & 'priority' bits are okay, > but I would remove the others. Hmm, I'm OK with dropping the TC oddities (including the cls_ in the name), but I think we should be documenting it so that users that do come from TC will not be completely lost :) -Toke