Subject: Re: Question about flow table offload in mlx5e
From: wenxu
To: Paul Blakey
Cc: pablo@netfilter.org, netdev@vger.kernel.org, Mark Bloch
Date: Fri, 22 Nov 2019 14:12:10 +0800

Hi Paul,

Here is an update.

ifconfig mlx_p0 172.168.152.75/24 up
ip l add dev tun1 type gretap external
tc qdisc add dev tun1 ingress
tc qdisc add dev mlx_pf0vf0 ingress

tc filter add dev mlx_pf0vf0 pref 2 ingress protocol ip flower skip_sw ip_proto tcp dst_ip 10.0.1.241 src_ip 10.0.0.75 src_port 5002 dst_port 5001 tcp_flags 0/0x5 action tunnel_key set dst_ip 172.168.152.241 src_ip 0 id 1000 nocsum pipe action mirred egress redirect dev tun1

tc filter add dev tun1 pref 2 ingress protocol ip flower ip_proto tcp src_ip 10.0.1.241 dst_ip 10.0.0.75 src_port 5001 dst_port 5002 tcp_flags 0/0x5 enc_key_id 1000 enc_src_ip 172.168.152.241 action tunnel_key unset pipe action mirred egress redirect dev mlx_pf0vf0

If you run this script on the host and then run "iperf -c 10.0.1.241 -i 2 -B 10.0.0.75:5002 -t 1000" in the virtual machine, the TCP SYN packet is not offloaded.

But if you run the script without the last filter, as follows, the TCP SYN packet is offloaded.

ifconfig mlx_p0 172.168.152.75/24 up
ip l add dev tun1 type gretap external
tc qdisc add dev tun1 ingress
tc qdisc add dev mlx_pf0vf0 ingress

tc filter add dev mlx_pf0vf0 pref 2 ingress protocol ip flower skip_sw ip_proto tcp dst_ip 10.0.1.241 src_ip 10.0.0.75 src_port 5002 dst_port 5001 tcp_flags 0/0x5 action tunnel_key set dst_ip 172.168.152.241 src_ip 0 id 1000 nocsum pipe action mirred egress redirect dev tun1

On 11/21/2019 9:05 PM, Paul Blakey wrote:
> I see, I will test that, and how about normal FWD rules?
>
> Paul.
>
>
>> -----Original Message-----
>> From: wenxu
>> Sent: Thursday, November 21, 2019 2:35 PM
>> To: Paul Blakey
>> Cc: pablo@netfilter.org; netdev@vger.kernel.org; Mark Bloch
>>
>> Subject: Re: Question about flow table offload in mlx5e
>>
>>
>> On 2019/11/21 19:39, Paul Blakey wrote:
>>> They are good fixes, exactly what we had when we tested this, thanks.
>>>
>>> Regarding encap, I don't know what changes you did, how does the encap rule look? Is it a FWD to vxlan device? If not it should be, as our driver expects that.
>> It is a FWD to a gretap device.
>>> I tried it on my setup via tc, by changing the callback of tc (mlx5e_rep_setup_tc_cb) to that of ft (mlx5e_rep_setup_ft_cb),
>>> and testing a vxlan encap rule:
>>> sudo tc qdisc add dev ens1f0_0 ingress
>>> sudo ifconfig ens1f0 7.7.7.7/24 up
>>> sudo ip link add name vxlan0 type vxlan dev ens1f0 remote 7.7.7.8 dstport 4789 external
>>> sudo ifconfig vxlan0 up
>>> sudo tc filter add dev ens1f0_0 ingress prio 1 chain 0 protocol ip flower dst_mac aa:bb:cc:dd:ee:ff ip_proto udp skip_sw action tunnel_key set src_ip 0.0.0.0 dst_ip 7.7.7.8 id 1234 dst_port 4789 pipe action mirred egress redirect dev vxlan
>>> then tc show:
>>> filter protocol ip pref 1 flower chain 0 handle 0x1 dst_mac aa:bb:cc:dd:ee:ff ip_proto udp skip_sw in_hw in_hw_count 1
>>> tunnel_key set src_ip 0.0.0.0 dst_ip 7.7.7.8 key_id 1234 dst_port 4789 csum pipe
>>> Stats: used 119 sec 0 pkt
>>> mirred (Egress Redirect to device vxlan0)
>>> Stats: used 119 sec 0 pkt
>> Can you send a packet that matches this offloaded flow to check that it is really offloaded?
>>
>> In the flowtable offload with my patches, both TC_SETUP_BLOCK and TC_SETUP_FT can offload the rule successfully.
>>
>> But in the TC_SETUP_FT case the packet is not really offloaded.
>>
>>
>> I will test like you did.
>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: wenxu
>>>> Sent: Thursday, November 21, 2019 10:29 AM
>>>> To: Paul Blakey
>>>> Cc: pablo@netfilter.org; netdev@vger.kernel.org; Mark Bloch
>>>>
>>>> Subject: Re: Question about flow table offload in mlx5e
>>>>
>>>>
>>>> On 11/21/2019 3:42 PM, Paul Blakey wrote:
>>>>> Hi,
>>>>>
>>>>> The original design was the block setup to use TC_SETUP_FT type, and the tc event type to be case TC_SETUP_CLSFLOWER.
>>>>> We will post a patch to change that. I would advise to wait till we fix that 😊
>>>>> I'm not sure how you get to this function mlx5e_rep_setup_ft_cb() if the nf_flow_table_offload ndo_setup_tc event was TC_SETUP_BLOCK, and not TC_SETUP_FT.
>>>>
>>>>
>>>> Yes, I changed TC_SETUP_BLOCK to TC_SETUP_FT in nf_flow_table_offload_setup.
>>>>
>>>> I provided two fix patches:
>>>>
>>>> http://patchwork.ozlabs.org/patch/1197818/
>>>>
>>>> http://patchwork.ozlabs.org/patch/1197876/
>>>>
>>>> So this change made by me is not correct currently?
>>>>
>>>>> In our driver en_rep.c we have:
>>>>>>         switch (type) {
>>>>>>         case TC_SETUP_BLOCK:
>>>>>>                 return flow_block_cb_setup_simple(type_data,
>>>>>>                                                   &mlx5e_rep_block_tc_cb_list,
>>>>>>                                                   mlx5e_rep_setup_tc_cb,
>>>>>>                                                   priv, priv, true);
>>>>>>         case TC_SETUP_FT:
>>>>>>                 return flow_block_cb_setup_simple(type_data,
>>>>>>                                                   &mlx5e_rep_block_ft_cb_list,
>>>>>>                                                   mlx5e_rep_setup_ft_cb,
>>>>>>                                                   priv, priv, true);
>>>>>>         default:
>>>>>>                 return -EOPNOTSUPP;
>>>>>>         }
>>>>> In nf_flow_table_offload.c:
>>>>>>         bo.binder_type  = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>>>>>>         bo.extack       = &extack;
>>>>>>         INIT_LIST_HEAD(&bo.cb_list);
>>>>>>         err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
>>>>>>         if (err < 0)
>>>>>>                 return err;
>>>>>>         return nf_flow_table_block_setup(flowtable, &bo, cmd);
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(nf_flow_table_offload_setup);
>>>>>
>>>>>
>>>>> So unless you changed that as well, you should have gotten to mlx5e_rep_setup_tc_cb and not mlx5e_rep_setup_ft_cb.
>>>>> Regarding the encap action, there should be no difference on which chain the rule is on.
>>>>
>>>>
>>>> But the same encap rule is really offloaded when set up through TC_SETUP_BLOCK, while with TC_SETUP_FT it is not.
>>>>
>>>> So is the problem with TC_SETUP_FT in mlx5e_rep_setup_ft_cb?
>>>>
>>>>>> -----Original Message-----
>>>>>> From: wenxu
>>>>>> Sent: Thursday, November 21, 2019 9:30 AM
>>>>>> To: Paul Blakey
>>>>>> Cc: pablo@netfilter.org; netdev@vger.kernel.org; Mark Bloch
>>>>>>
>>>>>> Subject: Question about flow table offload in mlx5e
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> The flow table offload in mlx5e is based on TC_SETUP_FT.
>>>>>>
>>>>>>
>>>>>> It is almost the same as TC_SETUP_BLOCK.
>>>>>>
>>>>>> It just sets the MLX5_TC_FLAG(FT_OFFLOAD) flag and changes
>>>>>> cls_flower.common.chain_index to FDB_FT_CHAIN,
>>>>>>
>>>>>> in lines 1380 and 1392 of the following code:
>>>>>>
>>>>>> 1368 static int mlx5e_rep_setup_ft_cb(enum tc_setup_type type, void *type_data,
>>>>>> 1369                                  void *cb_priv)
>>>>>> 1370 {
>>>>>> 1371         struct flow_cls_offload *f = type_data;
>>>>>> 1372         struct flow_cls_offload cls_flower;
>>>>>> 1373         struct mlx5e_priv *priv = cb_priv;
>>>>>> 1374         struct mlx5_eswitch *esw;
>>>>>> 1375         unsigned long flags;
>>>>>> 1376         int err;
>>>>>> 1377
>>>>>> 1378         flags = MLX5_TC_FLAG(INGRESS) |
>>>>>> 1379                 MLX5_TC_FLAG(ESW_OFFLOAD) |
>>>>>> 1380                 MLX5_TC_FLAG(FT_OFFLOAD);
>>>>>> 1381         esw = priv->mdev->priv.eswitch;
>>>>>> 1382
>>>>>> 1383         switch (type) {
>>>>>> 1384         case TC_SETUP_CLSFLOWER:
>>>>>> 1385                 if (!mlx5_eswitch_prios_supported(esw) || f->common.chain_index)
>>>>>> 1386                         return -EOPNOTSUPP;
>>>>>> 1387
>>>>>> 1388                 /* Re-use tc offload path by moving the ft flow to the
>>>>>> 1389                  * reserved ft chain.
>>>>>> 1390                  */
>>>>>> 1391                 memcpy(&cls_flower, f, sizeof(*f));
>>>>>> 1392                 cls_flower.common.chain_index = FDB_FT_CHAIN;
>>>>>> 1393                 err = mlx5e_rep_setup_tc_cls_flower(priv, &cls_flower, flags);
>>>>>> 1394                 memcpy(&f->stats, &cls_flower.stats, sizeof(f->stats));
>>>>>>
>>>>>>
>>>>>> I want to add tunnel offload support in the flow table, so I added some patches in
>>>>>> nf_flow_table_offload.
>>>>>>
>>>>>> I also added indr setup support in the mlx driver, and now flow table
>>>>>> offload works with decap.
>>>>>>
>>>>>>
>>>>>> But I met a problem with encap. The encap rule can be added to
>>>>>> hardware successfully, but it is not actually offloaded.
>>>>>>
>>>>>> But I think the rule I added is correct. If I comment out line 1392, the rule
>>>>>> can also be added successfully and is offloaded.
>>>>>>
>>>>>> So is there some limit on the encap operation for FT_OFFLOAD in
>>>>>> FDB_FT_CHAIN?
>>>>>>
>>>>>>
>>>>>> BR
>>>>>>
>>>>>> wenxu
>>>>>>
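
A side note on the "tcp_flags 0/0x5" match used in the filters at the top of this message, since the report is about the SYN packet. flower takes tcp_flags as value/mask, so a packet matches when (packet flags & mask) == value. With mask 0x5 (FIN and RST bits) and value 0, SYN and SYN+ACK packets should match the rule, while anything carrying FIN or RST should not. The small script below only models that comparison; the function name flags_match and the variable names are made up for illustration and are not part of tc.

```shell
#!/bin/sh
# Illustrative model of a value/mask match on TCP flags:
# a packet matches when (packet_flags & mask) == value.
# flags_match and the TCP_* names are hypothetical helpers, not tc options.

# TCP header flag bits.
TCP_FIN=0x01
TCP_SYN=0x02
TCP_RST=0x04
TCP_ACK=0x10

# Usage: flags_match PKT_FLAGS VALUE MASK
# Prints "match" or "no-match".
flags_match() {
    if [ $(( $1 & $3 )) -eq $(( $2 )) ]; then
        echo match
    else
        echo no-match
    fi
}

# The value/mask pair from the filters above: 0/0x5.
flags_match "$TCP_SYN" 0 0x5                    # SYN      -> match
flags_match $(( $TCP_SYN | $TCP_ACK )) 0 0x5    # SYN+ACK  -> match
flags_match $(( $TCP_FIN | $TCP_ACK )) 0 0x5    # FIN+ACK  -> no-match
flags_match "$TCP_RST" 0 0x5                    # RST      -> no-match
```

So on the flag bits alone the SYN packet is expected to hit the encap rule in both scripts; the flag match by itself does not explain the difference in offload behaviour.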