linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Wang <wangzhiqiang.bj@bytedance.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Samuel Mendoza-Jonas" <sam@mendozajonas.com>,
	"Joel Stanley" <joel@jms.id.au>,
	"Lotus Xu" <xuxiaohan@bytedance.com>, 郁雷 <yulei.sh@bytedance.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Gavin Shan" <gwshan@linux.vnet.ibm.com>,
	"open list:NETWORKING [GENERAL]" <netdev@vger.kernel.org>,
	"open list" <linux-kernel@vger.kernel.org>,
	"Bo Chen" <chenbo.gil@bytedance.com>
Subject: Re: [External] Re: [PATCH] net/ncsi: Use real net-device for response handler
Date: Wed, 23 Dec 2020 13:29:26 +0800	[thread overview]
Message-ID: <CAH0XSJvAKqYTeL+=7biS2jgwUL4mgY-WqZhwDb4Mx46Z0PSpWw@mail.gmail.com> (raw)
In-Reply-To: <20201222182458.4651c564@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Wed, Dec 23, 2020 at 10:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 22 Dec 2020 10:38:21 -0800 Samuel Mendoza-Jonas wrote:
> > On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> > > On Sun, 20 Dec 2020 at 12:40, John Wang wrote:
> > > > When aggregating ncsi interfaces and dedicated interfaces to bond
> > > > interfaces, the ncsi response handler will use the wrong net device
> > > > to
> > > > find ncsi_dev, so that the ncsi interface will not work properly.
> > > > Here, we use the net device registered to packet_type to fix it.
> > > >
> > > > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > > > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
>
> This sounds like exactly the case for which orig_dev was introduced.
> I think you should use the orig_dev argument, rather than pt->dev.

will send a v2

>
> Can you test if that works?

Yes,  it works.

>
> > > Can you show me how to reproduce this?

On g220a, eth1 is the dedicated interface, eth0 is the ncsi interface

kernel cfg:
CONFIG_BONDING=y

cat /etc/systemd/network/00-bmc-bond1.netdev
[NetDev]
Name=bond1
Description=Bond eth0 and eth1
Kind=bond

[Bond]
Mode=active-backup

cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth0
[Network]
Bond=bond1

cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth1
[Network]
Bond=bond1
PrimarySlave=true

ip addr
....
6: bond1: <BROADCAST,MULTICAST,UP,LOWER_UP400> mtu 1500 qdisc noqueue qlen 1000
    link/ether b4:05:5d:8f:6a:ad brd ff:ff:ff:ff:ff:ff
    inet 169.254.11.178/16 brd 169.254.255.255 scope link bond1
       valid_lft forever preferred_lft forever
    inet 192.168.1.108/24 brd 192.168.1.255 scope global bond1
       valid_lft forever preferred_lft forever
    inet 10.2.16.118/24 brd 10.2.16.255 scope global bond1
       valid_lft forever preferred_lft forever
    inet6 fe80::b605:5dff:fe8f:6aad/64 scope link
...


Without this patch:
After bmc boots:
echo eth0 > /sys/class/net/bond1/bonding/active_slave
admin@g220a:~#
admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[  105.964357] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.096 ms
64 bytes from 10.2.16.1: seq=1 ttl=255 time=2.143 ms
64 bytes from 10.2.16.1: seq=2 ttl=255 time=2.111 ms
[  112.642734] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
64 bytes from 10.2.16.1: seq=3 ttl=255 time=2.039 ms
64 bytes from 10.2.16.1: seq=4 ttl=255 time=2.037 ms
[  117.842814] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0
[  134.482746] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
[  139.682820] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0

with this patch:
After bmc boots:

admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[58332.123754] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.279 ms
...
...
64 bytes from 10.2.16.1: seq=N ttl=255 time=2.037 ms



> > >
> > > I don't know the ncsi or net code well enough to know if this is the
> > > correct fix. If you are confident it is correct then I have no
> > > objections.
> >
> > This looks like it is probably right; pt->dev will be the original
> > device from ncsi_register_dev(), if a response comes in to
> > ncsi_rcv_rsp() associated with a different device then the driver will
> > fail to find the correct ncsi_dev_priv. An example of the broken case
> > would be good to see though.
>
> From the description sounds like the case is whenever the ncsi
> interface is in a bond, the netdev from the second argument is
> the bond not the interface from which the frame came. It should
> be possible to repro even with only one interface on the system,
> create a bond or a team and add the ncsi interface to it.
>
> Does that make sense? I'm likely missing the subtleties here.

:)  I guess so.

      reply	other threads:[~2020-12-23  5:30 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-20 12:39 [PATCH] net/ncsi: Use real net-device for response handler John Wang
2020-12-22  6:13 ` Joel Stanley
2020-12-22 18:38   ` Samuel Mendoza-Jonas
2020-12-23  2:24     ` Jakub Kicinski
2020-12-23  5:29       ` John Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAH0XSJvAKqYTeL+=7biS2jgwUL4mgY-WqZhwDb4Mx46Z0PSpWw@mail.gmail.com' \
    --to=wangzhiqiang.bj@bytedance.com \
    --cc=chenbo.gil@bytedance.com \
    --cc=davem@davemloft.net \
    --cc=gwshan@linux.vnet.ibm.com \
    --cc=joel@jms.id.au \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sam@mendozajonas.com \
    --cc=xuxiaohan@bytedance.com \
    --cc=yulei.sh@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).