From mboxrd@z Thu Jan 1 00:00:00 1970
From: AMG Zollner Robert
Subject: Re: [bug] cxgb4: vrf stopped working with cxgb4 card
Date: Mon, 4 Jun 2018 23:14:47 +0300
Message-ID: <93057e04-d4ff-e16f-02ac-132501a8e08f@cloudmedia.eu>
References: <8073c78c-3243-d7f3-55c3-2cc1a2153366@cumulusnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: netdev@vger.kernel.org
To: David Ahern, ganeshgr@chelsio.com
Return-path:
Received: from web01.accessmedia.ro ([86.107.100.4]:46952 "EHLO
	web01.accessmedia.ro" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750998AbeFDUOu (ORCPT );
	Mon, 4 Jun 2018 16:14:50 -0400
In-Reply-To: <8073c78c-3243-d7f3-55c3-2cc1a2153366@cumulusnetworks.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

Yes, I was enslaving while the interface was up.

I just tested some of the builds that were not working earlier, and they
work if I keep the interface down when enslaving, as you suggested.

Is this the expected behavior?

Thank you,
Zollner Robert

On 04.06.2018 21:17, David Ahern wrote:
> On 6/4/18 8:03 AM, AMG Zollner Robert wrote:
>> I have noticed that vrf is not working with kernel v4.15.0 but was
>> working with v4.13.0 when using the cxgb4 Chelsio driver (T520-cr).
>>
>> Setup:
>> Two metal servers with a T520-cr card each, directly connected without a
>> switch in between.
>>
>>        SVR1  only ipfwd                 SVR2     with vrf
>> .----------------------------.         .----------------------------------.
>> |                            |         |                                  |
>> |    192.168.8.1 [  ens2f4]--|---------|--[ens1f4] 192.168.8.2            |
>> |    192.168.9.1 [ens2f4d1]--|---------|--         192.168.9.2 VRF=10     |
>> `----------------------------'         `----------------------------------'
>>
>> When vrf is not working there are no error messages (dmesg or iproute
>> commands). tcpdump on the interface (SVR2.ens1f4d1) enslaved in vrf 10
>> shows packets (arp req/reply) coming in and going out, but outgoing
>> packets (arp reply) do not reach the other server, SVR1.ens2f4d1.
>>
>>
>> Bisect:
>> Found this commit to be the problem after doing a git bisect between
>> v4.13..v4.15:
>>
>> commit ba581f77df23c8ee70b372966e69cf10bc5453d8
>> Author: Ganesh Goudar
>> Date:   Sat Sep 23 16:07:28 2017 +0530
>>
>>     cxgb4: do DCB state reset in couple of places
>>
>>     reset the driver's DCB state in couple of places
>>     where it was missing.
>>
>>
>> A bisect step was considered good when:
>> - successful ping from SVR1 to SVR2.ens1f4d1 vrf interface
>> - successful ping from SVR2 global to SVR2 vrf interface through SVR1 (l3
>> forwarding) (this check was redundant, both tests fail or pass
>> simultaneously)
>>
>> The problem is still present on recent kernels as well; checked v4.16.0
>> and v4.17-rc7.
>>
>> Disabling DCB support for the card fixes the problem (compiling the
>> kernel with "CONFIG_CHELSIO_T4_DCB=n").
>>
> Are you doing the VRF enslave while it is up?
>
> If so, does it work ok if you change the sequence:
>
> ip li set ens1f4d1 down
> ip li set ens1f4d1 master
> ip li set ens1f4d1 up
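
For reference, the down / enslave / up ordering discussed above could be
wrapped in a small shell helper along these lines. This is only a sketch:
the function name enslave_while_down, the DRY_RUN switch, and the VRF
device name "vrf-10" are assumptions made for illustration and do not
come from the thread. Only the interface name and the order of the three
steps are taken from it.

```shell
#!/bin/sh
# Sketch: enslave an interface to a master device while the link is down.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them
# and needs root.

enslave_while_down() {
    ifname=$1   # interface to enslave, e.g. ens1f4d1
    master=$2   # master (VRF) device; the name is hypothetical here
    for cmd in \
        "ip link set dev $ifname down" \
        "ip link set dev $ifname master $master" \
        "ip link set dev $ifname up"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Unquoted on purpose so the string splits into command words.
            $cmd || return 1
        fi
    done
}
```

Calling `enslave_while_down ens1f4d1 vrf-10` with the default DRY_RUN
prints the three commands in order, so the sequence can be checked before
it is run for real.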