From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jamal Hadi Salim Subject: Re: [patch net-next v4 00/10] net: sched: allow qdiscs to share filter block instances Date: Thu, 4 Jan 2018 07:41:54 -0500 Message-ID: References: <780a80d0-9384-ae34-4cab-3070b004b64e@gmail.com> <20171225102346.GB1885@nanopsycho> <20180102194944.GG2051@nanopsycho.orion> <20180103094025.GA2067@nanopsycho.orion> <20180103172209.GD2067@nanopsycho.orion> <20180103155152.7e94a295@cakuba.netronome.com> <20180104065702.GH2067@nanopsycho.orion> <20180103230658.595eac7d@cakuba.netronome.com> <20180104101257.GA2213@nanopsycho> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: David Ahern , netdev@vger.kernel.org, davem@davemloft.net, xiyou.wangcong@gmail.com, mlxsw@mellanox.com, andrew@lunn.ch, vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com, michael.chan@broadcom.com, ganeshgr@chelsio.com, saeedm@mellanox.com, matanb@mellanox.com, leonro@mellanox.com, idosch@mellanox.com, simon.horman@netronome.com, pieter.jansenvanvuuren@netronome.com, john.hurley@netronome.com, alexander.h.duyck@intel.com, ogerlitz@mellanox.com, john.fastabend@gmail.com, daniel@iogearbox.net To: Jiri Pirko , Jakub Kicinski Return-path: Received: from mail-io0-f182.google.com ([209.85.223.182]:40508 "EHLO mail-io0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751827AbeADMl6 (ORCPT ); Thu, 4 Jan 2018 07:41:58 -0500 Received: by mail-io0-f182.google.com with SMTP id v30so2084512iov.7 for ; Thu, 04 Jan 2018 04:41:58 -0800 (PST) In-Reply-To: <20180104101257.GA2213@nanopsycho> Content-Language: en-GB Sender: netdev-owner@vger.kernel.org List-ID: My comments below - I do have a more _basic_ issue on the egress side that IMO noone has brought up that needs to be addressed. All examples so far have been focusing on ingress. Since I am late in this discussion, let me state my understanding of the thread and assumptions so there is no cross-talk. we are going to need to be able to say: tc qdisc add block 1 tc filter add block 1 protocol ip priority 10 flower ... From my scanning of the thread, I dont think I see disagreement. Jiri, did i misread what you are saying? New approach syntax should still be legit for backward compat, so this should work: "tc filter add dev ens7 parent ffff: protocol ip priority 10 flower .." Blocks or not we still need to have old scripts work. Which also means one should also be able to say: "tc filter ls dev ens7 ..." with no suprises. From my reading of the thread - there is no disagreement that "tc qdisc add dev ens7 ingress block 1" should be able to "bind" to block1 if it exists or create it if it doesnt and then "bind" to it ("autobind"). And that a simple "tc qdisc add dev ens7 ingress" should either not be able to create a block (or we can say creates block id 0 - owned by the netdev - for consistency). To David A: If someone created the qdisc using new approach and then later tried to add/list rules using the old syntax, surely that should work, no? I have seen setups which separate creation of qdiscs using one script and addition of filters via another script (very common actually). Having said that: I see the value of less is more and being able to use one(block) interface to add/del/list filters if we add the blocks feature; but backward compat(meaning crazy scripting) needs to be respected. I dont see how we get away from that. Maybe what we need is another knob to control this new functionality? Either a kernel config option or an extra parameter when creating the qdisc or a system wide boolean per netdev (which of course contradicts the "less is more" principle). If this knob was set then you reject addition of filters via a port/qdisc which is sharing. I agree with Jiri - if you consciously choose to share there should be no suprises with what filters get applied. I think this email is too long and my egress concern will be lost if i typed it here; so i will send another email. cheers, jamal On 18-01-04 05:12 AM, Jiri Pirko wrote: > Thu, Jan 04, 2018 at 08:06:58AM CET, kubakici@wp.pl wrote: >> On Thu, 4 Jan 2018 07:57:02 +0100, Jiri Pirko wrote: >>> Thu, Jan 04, 2018 at 12:51:52AM CET, kubakici@wp.pl wrote: >>>> On Wed, 3 Jan 2018 18:22:09 +0100, Jiri Pirko wrote: >>>>> However I don't agree about breaking the existing filter add and show >>>>> and also imposibility to make not-shared block shared in the runtime >>>>> before defining it first. >>>> >>>> FWIW I would agree with David that allowing add on a shared block >>>> modify filters on another interface can break existing users. (No >>>> opinion on dump and lifetime). >>> >>> I don't think that David is saying that, but why do you think it would >>> break existing users? >> >> Perhaps I worded is too strongly as "breaking existing users", but it >> certainly introduces surprising side effects. David put it into words >> very well: >> >> On Tue, 2 Jan 2018 19:07:36 -0700, David Ahern wrote: >>> The disagreement is in how they should be managed. I think my last >>> response concisely captures my concerns -- the principle of least surprise. >>> >>> So with the initial commands above, all is fine. Then someone is >>> debugging a problem or wants to add another filter to ens8, so they run: >>> >>> $ tc filter add dev ens8 ingress protocol ip pref 25 flower dst_ip >>> 192.168.1.0/16 action drop >>> >>> Then traffic flows through ens7 break and some other user is struggling >>> to understand what just happened. That the new filter magically appears >>> on ens7 when the user operated on ens8 is a surprise. Nothing about that >>> last command acknowledges that it is changing a shared resource. > > It is not that the new filter magically appears on ens7. No magic. ens8 > and ens7 share the same block. User set the sharing up originally, > he can see it in "tc qdisc show". So if he does this, he cannot be > surprised. Also, he can list the filters if he does "tc filter show" > to see what is there per-qdisc. > > If the user expects something else, he should go back and read the docs > and learn to understand how tc works. >