* [PATCH] net/ncsi: Use real net-device for response handler
@ 2020-12-20 12:39 John Wang
2020-12-22 6:13 ` Joel Stanley
0 siblings, 1 reply; 5+ messages in thread
From: John Wang @ 2020-12-20 12:39 UTC (permalink / raw)
To: xuxiaohan, yulei.sh, joel
Cc: Samuel Mendoza-Jonas, David S. Miller, Jakub Kicinski,
Gavin Shan, open list:NETWORKING [GENERAL],
open list
When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the net device registered to packet_type to fix it.
Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
---
net/ncsi/ncsi-rsp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index a94bb59793f0..60ae32682904 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
int payload, i, ret;
/* Find the NCSI device */
- nd = ncsi_find_dev(dev);
+ nd = ncsi_find_dev(pt->dev);
ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
if (!ndp)
return -ENODEV;
--
2.25.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] net/ncsi: Use real net-device for response handler
2020-12-20 12:39 [PATCH] net/ncsi: Use real net-device for response handler John Wang
@ 2020-12-22 6:13 ` Joel Stanley
2020-12-22 18:38 ` Samuel Mendoza-Jonas
0 siblings, 1 reply; 5+ messages in thread
From: Joel Stanley @ 2020-12-22 6:13 UTC (permalink / raw)
To: John Wang
Cc: xuxiaohan, 郁雷,
Samuel Mendoza-Jonas, David S. Miller, Jakub Kicinski,
Gavin Shan, open list:NETWORKING [GENERAL],
open list
On Sun, 20 Dec 2020 at 12:40, John Wang <wangzhiqiang.bj@bytedance.com> wrote:
>
> When aggregating ncsi interfaces and dedicated interfaces to bond
> interfaces, the ncsi response handler will use the wrong net device to
> find ncsi_dev, so that the ncsi interface will not work properly.
> Here, we use the net device registered to packet_type to fix it.
>
> Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
Can you show me how to reproduce this?
I don't know the ncsi or net code well enough to know if this is the
correct fix. If you are confident it is correct then I have no
objections.
Cheers,
Joel
> ---
> net/ncsi/ncsi-rsp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> index a94bb59793f0..60ae32682904 100644
> --- a/net/ncsi/ncsi-rsp.c
> +++ b/net/ncsi/ncsi-rsp.c
> @@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
> int payload, i, ret;
>
> /* Find the NCSI device */
> - nd = ncsi_find_dev(dev);
> + nd = ncsi_find_dev(pt->dev);
> ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
> if (!ndp)
> return -ENODEV;
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net/ncsi: Use real net-device for response handler
2020-12-22 6:13 ` Joel Stanley
@ 2020-12-22 18:38 ` Samuel Mendoza-Jonas
2020-12-23 2:24 ` Jakub Kicinski
0 siblings, 1 reply; 5+ messages in thread
From: Samuel Mendoza-Jonas @ 2020-12-22 18:38 UTC (permalink / raw)
To: Joel Stanley, John Wang
Cc: xuxiaohan, 郁雷,
David S. Miller, Jakub Kicinski, Gavin Shan,
open list:NETWORKING [GENERAL],
open list
On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> On Sun, 20 Dec 2020 at 12:40, John Wang <
> wangzhiqiang.bj@bytedance.com> wrote:
> >
> > When aggregating ncsi interfaces and dedicated interfaces to bond
> > interfaces, the ncsi response handler will use the wrong net device
> > to
> > find ncsi_dev, so that the ncsi interface will not work properly.
> > Here, we use the net device registered to packet_type to fix it.
> >
> > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
>
> Can you show me how to reproduce this?
>
> I don't know the ncsi or net code well enough to know if this is the
> correct fix. If you are confident it is correct then I have no
> objections.
This looks like it is probably right; pt->dev will be the original
device from ncsi_register_dev(), if a response comes in to
ncsi_rcv_rsp() associated with a different device then the driver will
fail to find the correct ncsi_dev_priv. An example of the broken case
would be good to see though.
Cheers,
Sam
>
> Cheers,
>
> Joel
>
> > ---
> > net/ncsi/ncsi-rsp.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > index a94bb59793f0..60ae32682904 100644
> > --- a/net/ncsi/ncsi-rsp.c
> > +++ b/net/ncsi/ncsi-rsp.c
> > @@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct
> > net_device *dev,
> > int payload, i, ret;
> >
> > /* Find the NCSI device */
> > - nd = ncsi_find_dev(dev);
> > + nd = ncsi_find_dev(pt->dev);
> > ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
> > if (!ndp)
> > return -ENODEV;
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net/ncsi: Use real net-device for response handler
2020-12-22 18:38 ` Samuel Mendoza-Jonas
@ 2020-12-23 2:24 ` Jakub Kicinski
2020-12-23 5:29 ` [External] " John Wang
0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-12-23 2:24 UTC (permalink / raw)
To: Samuel Mendoza-Jonas
Cc: Joel Stanley, John Wang, xuxiaohan, 郁雷,
David S. Miller, Gavin Shan, open list:NETWORKING [GENERAL],
open list
On Tue, 22 Dec 2020 10:38:21 -0800 Samuel Mendoza-Jonas wrote:
> On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> > On Sun, 20 Dec 2020 at 12:40, John Wang wrote:
> > > When aggregating ncsi interfaces and dedicated interfaces to bond
> > > interfaces, the ncsi response handler will use the wrong net device
> > > to
> > > find ncsi_dev, so that the ncsi interface will not work properly.
> > > Here, we use the net device registered to packet_type to fix it.
> > >
> > > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
This sounds like exactly the case for which orig_dev was introduced.
I think you should use the orig_dev argument, rather than pt->dev.
Can you test if that works?
> > Can you show me how to reproduce this?
> >
> > I don't know the ncsi or net code well enough to know if this is the
> > correct fix. If you are confident it is correct then I have no
> > objections.
>
> This looks like it is probably right; pt->dev will be the original
> device from ncsi_register_dev(), if a response comes in to
> ncsi_rcv_rsp() associated with a different device then the driver will
> fail to find the correct ncsi_dev_priv. An example of the broken case
> would be good to see though.
From the description sounds like the case is whenever the ncsi
interface is in a bond, the netdev from the second argument is
the bond not the interface from which the frame came. It should
be possible to repro even with only one interface on the system,
create a bond or a team and add the ncsi interface to it.
Does that make sense? I'm likely missing the subtleties here.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [External] Re: [PATCH] net/ncsi: Use real net-device for response handler
2020-12-23 2:24 ` Jakub Kicinski
@ 2020-12-23 5:29 ` John Wang
0 siblings, 0 replies; 5+ messages in thread
From: John Wang @ 2020-12-23 5:29 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Samuel Mendoza-Jonas, Joel Stanley, Lotus Xu, 郁雷,
David S. Miller, Gavin Shan, open list:NETWORKING [GENERAL],
open list, Bo Chen
On Wed, Dec 23, 2020 at 10:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 22 Dec 2020 10:38:21 -0800 Samuel Mendoza-Jonas wrote:
> > On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> > > On Sun, 20 Dec 2020 at 12:40, John Wang wrote:
> > > > When aggregating ncsi interfaces and dedicated interfaces to bond
> > > > interfaces, the ncsi response handler will use the wrong net device
> > > > to
> > > > find ncsi_dev, so that the ncsi interface will not work properly.
> > > > Here, we use the net device registered to packet_type to fix it.
> > > >
> > > > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > > > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
>
> This sounds like exactly the case for which orig_dev was introduced.
> I think you should use the orig_dev argument, rather than pt->dev.
will send a v2
>
> Can you test if that works?
Yes, it works.
>
> > > Can you show me how to reproduce this?
On g220a, eth1 is the dedicated interface, eth0 is the ncsi interface
kernel cfg:
CONFIG_BONDING=y
cat /etc/systemd/network/00-bmc-bond1.netdev
[NetDev]
Name=bond1
Description=Bond eth0 and eth1
Kind=bond
[Bond]
Mode=active-backup
cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth0
[Network]
Bond=bond1
cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth1
[Network]
Bond=bond1
PrimarySlave=true
ip addr
....
6: bond1: <BROADCAST,MULTICAST,UP,LOWER_UP400> mtu 1500 qdisc noqueue qlen 1000
link/ether b4:05:5d:8f:6a:ad brd ff:ff:ff:ff:ff:ff
inet 169.254.11.178/16 brd 169.254.255.255 scope link bond1
valid_lft forever preferred_lft forever
inet 192.168.1.108/24 brd 192.168.1.255 scope global bond1
valid_lft forever preferred_lft forever
inet 10.2.16.118/24 brd 10.2.16.255 scope global bond1
valid_lft forever preferred_lft forever
inet6 fe80::b605:5dff:fe8f:6aad/64 scope link
...
Without this patch:
After bmc boots:
echo eth0 > /sys/class/net/bond1/bonding/active_slave
admin@g220a:~#
admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[ 105.964357] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.096 ms
64 bytes from 10.2.16.1: seq=1 ttl=255 time=2.143 ms
64 bytes from 10.2.16.1: seq=2 ttl=255 time=2.111 ms
[ 112.642734] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
64 bytes from 10.2.16.1: seq=3 ttl=255 time=2.039 ms
64 bytes from 10.2.16.1: seq=4 ttl=255 time=2.037 ms
[ 117.842814] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0
[ 134.482746] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
[ 139.682820] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0
with this patch:
After bmc boots:
admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[58332.123754] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.279 ms
...
...
64 bytes from 10.2.16.1: seq=N ttl=255 time=2.037 ms
> > >
> > > I don't know the ncsi or net code well enough to know if this is the
> > > correct fix. If you are confident it is correct then I have no
> > > objections.
> >
> > This looks like it is probably right; pt->dev will be the original
> > device from ncsi_register_dev(), if a response comes in to
> > ncsi_rcv_rsp() associated with a different device then the driver will
> > fail to find the correct ncsi_dev_priv. An example of the broken case
> > would be good to see though.
>
> From the description sounds like the case is whenever the ncsi
> interface is in a bond, the netdev from the second argument is
> the bond not the interface from which the frame came. It should
> be possible to repro even with only one interface on the system,
> create a bond or a team and add the ncsi interface to it.
>
> Does that make sense? I'm likely missing the subtleties here.
:) I guess so.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-12-23 5:30 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-20 12:39 [PATCH] net/ncsi: Use real net-device for response handler John Wang
2020-12-22 6:13 ` Joel Stanley
2020-12-22 18:38 ` Samuel Mendoza-Jonas
2020-12-23 2:24 ` Jakub Kicinski
2020-12-23 5:29 ` [External] " John Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).