linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net/ncsi: Use real net-device for response handler
@ 2020-12-20 12:39 John Wang
  2020-12-22  6:13 ` Joel Stanley
  0 siblings, 1 reply; 5+ messages in thread
From: John Wang @ 2020-12-20 12:39 UTC (permalink / raw)
  To: xuxiaohan, yulei.sh, joel
  Cc: Samuel Mendoza-Jonas, David S. Miller, Jakub Kicinski,
	Gavin Shan, open list:NETWORKING [GENERAL],
	open list

When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the net device registered to packet_type to fix it.

Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
---
 net/ncsi/ncsi-rsp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index a94bb59793f0..60ae32682904 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
 	int payload, i, ret;
 
 	/* Find the NCSI device */
-	nd = ncsi_find_dev(dev);
+	nd = ncsi_find_dev(pt->dev);
 	ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
 	if (!ndp)
 		return -ENODEV;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/ncsi: Use real net-device for response handler
  2020-12-20 12:39 [PATCH] net/ncsi: Use real net-device for response handler John Wang
@ 2020-12-22  6:13 ` Joel Stanley
  2020-12-22 18:38   ` Samuel Mendoza-Jonas
  0 siblings, 1 reply; 5+ messages in thread
From: Joel Stanley @ 2020-12-22  6:13 UTC (permalink / raw)
  To: John Wang
  Cc: xuxiaohan, 郁雷,
	Samuel Mendoza-Jonas, David S. Miller, Jakub Kicinski,
	Gavin Shan, open list:NETWORKING [GENERAL],
	open list

On Sun, 20 Dec 2020 at 12:40, John Wang <wangzhiqiang.bj@bytedance.com> wrote:
>
> When aggregating ncsi interfaces and dedicated interfaces to bond
> interfaces, the ncsi response handler will use the wrong net device to
> find ncsi_dev, so that the ncsi interface will not work properly.
> Here, we use the net device registered to packet_type to fix it.
>
> Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>

Can you show me how to reproduce this?

I don't know the ncsi or net code well enough to know if this is the
correct fix. If you are confident it is correct then I have no
objections.

Cheers,

Joel

> ---
>  net/ncsi/ncsi-rsp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> index a94bb59793f0..60ae32682904 100644
> --- a/net/ncsi/ncsi-rsp.c
> +++ b/net/ncsi/ncsi-rsp.c
> @@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
>         int payload, i, ret;
>
>         /* Find the NCSI device */
> -       nd = ncsi_find_dev(dev);
> +       nd = ncsi_find_dev(pt->dev);
>         ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
>         if (!ndp)
>                 return -ENODEV;
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/ncsi: Use real net-device for response handler
  2020-12-22  6:13 ` Joel Stanley
@ 2020-12-22 18:38   ` Samuel Mendoza-Jonas
  2020-12-23  2:24     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Samuel Mendoza-Jonas @ 2020-12-22 18:38 UTC (permalink / raw)
  To: Joel Stanley, John Wang
  Cc: xuxiaohan, 郁雷,
	David S. Miller, Jakub Kicinski, Gavin Shan,
	open list:NETWORKING [GENERAL],
	open list

On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> On Sun, 20 Dec 2020 at 12:40, John Wang <
> wangzhiqiang.bj@bytedance.com> wrote:
> > 
> > When aggregating ncsi interfaces and dedicated interfaces to bond
> > interfaces, the ncsi response handler will use the wrong net device
> > to
> > find ncsi_dev, so that the ncsi interface will not work properly.
> > Here, we use the net device registered to packet_type to fix it.
> > 
> > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
> 
> Can you show me how to reproduce this?
> 
> I don't know the ncsi or net code well enough to know if this is the
> correct fix. If you are confident it is correct then I have no
> objections.

This looks like it is probably right; pt->dev will be the original
device from ncsi_register_dev(), if a response comes in to
ncsi_rcv_rsp() associated with a different device then the driver will
fail to find the correct ncsi_dev_priv. An example of the broken case
would be good to see though.

Cheers,
Sam

> 
> Cheers,
> 
> Joel
> 
> > ---
> >  net/ncsi/ncsi-rsp.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > index a94bb59793f0..60ae32682904 100644
> > --- a/net/ncsi/ncsi-rsp.c
> > +++ b/net/ncsi/ncsi-rsp.c
> > @@ -1120,7 +1120,7 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct
> > net_device *dev,
> >         int payload, i, ret;
> > 
> >         /* Find the NCSI device */
> > -       nd = ncsi_find_dev(dev);
> > +       nd = ncsi_find_dev(pt->dev);
> >         ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL;
> >         if (!ndp)
> >                 return -ENODEV;
> > --
> > 2.25.1
> > 



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/ncsi: Use real net-device for response handler
  2020-12-22 18:38   ` Samuel Mendoza-Jonas
@ 2020-12-23  2:24     ` Jakub Kicinski
  2020-12-23  5:29       ` [External] " John Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-12-23  2:24 UTC (permalink / raw)
  To: Samuel Mendoza-Jonas
  Cc: Joel Stanley, John Wang, xuxiaohan, 郁雷,
	David S. Miller, Gavin Shan, open list:NETWORKING [GENERAL],
	open list

On Tue, 22 Dec 2020 10:38:21 -0800 Samuel Mendoza-Jonas wrote:
> On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> > On Sun, 20 Dec 2020 at 12:40, John Wang wrote:
> > > When aggregating ncsi interfaces and dedicated interfaces to bond
> > > interfaces, the ncsi response handler will use the wrong net device
> > > to
> > > find ncsi_dev, so that the ncsi interface will not work properly.
> > > Here, we use the net device registered to packet_type to fix it.
> > > 
> > > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>  

This sounds like exactly the case for which orig_dev was introduced.
I think you should use the orig_dev argument, rather than pt->dev.

Can you test if that works?

> > Can you show me how to reproduce this?
> > 
> > I don't know the ncsi or net code well enough to know if this is the
> > correct fix. If you are confident it is correct then I have no
> > objections.  
> 
> This looks like it is probably right; pt->dev will be the original
> device from ncsi_register_dev(), if a response comes in to
> ncsi_rcv_rsp() associated with a different device then the driver will
> fail to find the correct ncsi_dev_priv. An example of the broken case
> would be good to see though.

From the description sounds like the case is whenever the ncsi
interface is in a bond, the netdev from the second argument is 
the bond not the interface from which the frame came. It should 
be possible to repro even with only one interface on the system,
create a bond or a team and add the ncsi interface to it.

Does that make sense? I'm likely missing the subtleties here.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [External] Re: [PATCH] net/ncsi: Use real net-device for response handler
  2020-12-23  2:24     ` Jakub Kicinski
@ 2020-12-23  5:29       ` John Wang
  0 siblings, 0 replies; 5+ messages in thread
From: John Wang @ 2020-12-23  5:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Samuel Mendoza-Jonas, Joel Stanley, Lotus Xu, 郁雷,
	David S. Miller, Gavin Shan, open list:NETWORKING [GENERAL],
	open list, Bo Chen

On Wed, Dec 23, 2020 at 10:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 22 Dec 2020 10:38:21 -0800 Samuel Mendoza-Jonas wrote:
> > On Tue, 2020-12-22 at 06:13 +0000, Joel Stanley wrote:
> > > On Sun, 20 Dec 2020 at 12:40, John Wang wrote:
> > > > When aggregating ncsi interfaces and dedicated interfaces to bond
> > > > interfaces, the ncsi response handler will use the wrong net device
> > > > to
> > > > find ncsi_dev, so that the ncsi interface will not work properly.
> > > > Here, we use the net device registered to packet_type to fix it.
> > > >
> > > > Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
> > > > Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
>
> This sounds like exactly the case for which orig_dev was introduced.
> I think you should use the orig_dev argument, rather than pt->dev.

will send a v2

>
> Can you test if that works?

Yes,  it works.

>
> > > Can you show me how to reproduce this?

On g220a, eth1 is the dedicated interface, eth0 is the ncsi interface

kernel cfg:
CONFIG_BONDING=y

cat /etc/systemd/network/00-bmc-bond1.netdev
[NetDev]
Name=bond1
Description=Bond eth0 and eth1
Kind=bond

[Bond]
Mode=active-backup

cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth0
[Network]
Bond=bond1

cat /etc/systemd/network/00-bmc-eth0.network
[Match]
Name=eth1
[Network]
Bond=bond1
PrimarySlave=true

ip addr
....
6: bond1: <BROADCAST,MULTICAST,UP,LOWER_UP400> mtu 1500 qdisc noqueue qlen 1000
    link/ether b4:05:5d:8f:6a:ad brd ff:ff:ff:ff:ff:ff
    inet 169.254.11.178/16 brd 169.254.255.255 scope link bond1
       valid_lft forever preferred_lft forever
    inet 192.168.1.108/24 brd 192.168.1.255 scope global bond1
       valid_lft forever preferred_lft forever
    inet 10.2.16.118/24 brd 10.2.16.255 scope global bond1
       valid_lft forever preferred_lft forever
    inet6 fe80::b605:5dff:fe8f:6aad/64 scope link
...


Without this patch:
After bmc boots:
echo eth0 > /sys/class/net/bond1/bonding/active_slave
admin@g220a:~#
admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[  105.964357] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.096 ms
64 bytes from 10.2.16.1: seq=1 ttl=255 time=2.143 ms
64 bytes from 10.2.16.1: seq=2 ttl=255 time=2.111 ms
[  112.642734] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
64 bytes from 10.2.16.1: seq=3 ttl=255 time=2.039 ms
64 bytes from 10.2.16.1: seq=4 ttl=255 time=2.037 ms
[  117.842814] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0
[  134.482746] ftgmac100 1e660000.ethernet eth0: NCSI Channel 0 timed out!
[  139.682820] ftgmac100 1e660000.ethernet eth0: NCSI: No channel with
link found, configuring channel 0

with this patch:
After bmc boots:

admin@g220a:~# echo eth0 > /sys/class/net/bond1/bonding/active_slave
[58332.123754] bond1: (slave eth0): making interface the new active one
admin@g220a:~# ping 10.2.16.1
PING 10.2.16.1 (10.2.16.1): 56 data bytes
64 bytes from 10.2.16.1: seq=0 ttl=255 time=7.279 ms
...
...
64 bytes from 10.2.16.1: seq=N ttl=255 time=2.037 ms



> > >
> > > I don't know the ncsi or net code well enough to know if this is the
> > > correct fix. If you are confident it is correct then I have no
> > > objections.
> >
> > This looks like it is probably right; pt->dev will be the original
> > device from ncsi_register_dev(), if a response comes in to
> > ncsi_rcv_rsp() associated with a different device then the driver will
> > fail to find the correct ncsi_dev_priv. An example of the broken case
> > would be good to see though.
>
> From the description sounds like the case is whenever the ncsi
> interface is in a bond, the netdev from the second argument is
> the bond not the interface from which the frame came. It should
> be possible to repro even with only one interface on the system,
> create a bond or a team and add the ncsi interface to it.
>
> Does that make sense? I'm likely missing the subtleties here.

:)  I guess so.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-12-23  5:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-20 12:39 [PATCH] net/ncsi: Use real net-device for response handler John Wang
2020-12-22  6:13 ` Joel Stanley
2020-12-22 18:38   ` Samuel Mendoza-Jonas
2020-12-23  2:24     ` Jakub Kicinski
2020-12-23  5:29       ` [External] " John Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).