From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jiri Pirko Subject: Re: [RFC PATCH net-next v6 2/4] net: Introduce generic bypass module Date: Wed, 18 Apr 2018 22:32:06 +0200 Message-ID: <20180418203206.GC1922@nanopsycho> References: <1523386790-12396-1-git-send-email-sridhar.samudrala@intel.com> <1523386790-12396-3-git-send-email-sridhar.samudrala@intel.com> <20180411155127.GQ2028@nanopsycho> <6a8c1ff5-153a-e40a-91b3-48532b8d3a38@intel.com> <20180418092515.GB1989@nanopsycho> <20180418191315.GA1922@nanopsycho> <20180418222309-mutt-send-email-mst@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Cc: "Samudrala, Sridhar" , stephen@networkplumber.org, davem@davemloft.net, netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, virtio-dev@lists.oasis-open.org, jesse.brandeburg@intel.com, alexander.h.duyck@intel.com, kubakici@wp.pl, jasowang@redhat.com, loseweigh@gmail.com To: "Michael S. Tsirkin" Return-path: Received: from mail-wr0-f194.google.com ([209.85.128.194]:36404 "EHLO mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751256AbeDRUcJ (ORCPT ); Wed, 18 Apr 2018 16:32:09 -0400 Received: by mail-wr0-f194.google.com with SMTP id q13-v6so8213502wre.3 for ; Wed, 18 Apr 2018 13:32:08 -0700 (PDT) Content-Disposition: inline In-Reply-To: <20180418222309-mutt-send-email-mst@kernel.org> Sender: netdev-owner@vger.kernel.org List-ID: Wed, Apr 18, 2018 at 09:46:04PM CEST, mst@redhat.com wrote: >On Wed, Apr 18, 2018 at 09:13:15PM +0200, Jiri Pirko wrote: >> Wed, Apr 18, 2018 at 08:43:15PM CEST, sridhar.samudrala@intel.com wrote: >> >On 4/18/2018 2:25 AM, Jiri Pirko wrote: >> >> Wed, Apr 11, 2018 at 09:13:52PM CEST, sridhar.samudrala@intel.com wrote: >> >> > On 4/11/2018 8:51 AM, Jiri Pirko wrote: >> >> > > Tue, Apr 10, 2018 at 08:59:48PM CEST, sridhar.samudrala@intel.com wrote: >> >> > > > This provides a generic interface for paravirtual drivers to listen >> >> > > > for netdev register/unregister/link change events from pci ethernet >> >> > > > devices with the same MAC and takeover their datapath. The notifier and >> >> > > > event handling code is based on the existing netvsc implementation. >> >> > > > >> >> > > > It exposes 2 sets of interfaces to the paravirtual drivers. >> >> > > > 1. existing netvsc driver that uses 2 netdev model. In this model, no >> >> > > > master netdev is created. The paravirtual driver registers each bypass >> >> > > > instance along with a set of ops to manage the slave events. >> >> > > > bypass_master_register() >> >> > > > bypass_master_unregister() >> >> > > > 2. new virtio_net based solution that uses 3 netdev model. In this model, >> >> > > > the bypass module provides interfaces to create/destroy additional master >> >> > > > netdev and all the slave events are managed internally. >> >> > > > bypass_master_create() >> >> > > > bypass_master_destroy() >> >> > > > >> >> > > > Signed-off-by: Sridhar Samudrala >> >> > > > --- >> >> > > > include/linux/netdevice.h | 14 + >> >> > > > include/net/bypass.h | 96 ++++++ >> >> > > > net/Kconfig | 18 + >> >> > > > net/core/Makefile | 1 + >> >> > > > net/core/bypass.c | 844 ++++++++++++++++++++++++++++++++++++++++++++++ >> >> > > > 5 files changed, 973 insertions(+) >> >> > > > create mode 100644 include/net/bypass.h >> >> > > > create mode 100644 net/core/bypass.c >> >> > > > >> >> > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> >> > > > index cf44503ea81a..587293728f70 100644 >> >> > > > --- a/include/linux/netdevice.h >> >> > > > +++ b/include/linux/netdevice.h >> >> > > > @@ -1430,6 +1430,8 @@ enum netdev_priv_flags { >> >> > > > IFF_PHONY_HEADROOM = 1<<24, >> >> > > > IFF_MACSEC = 1<<25, >> >> > > > IFF_NO_RX_HANDLER = 1<<26, >> >> > > > + IFF_BYPASS = 1 << 27, >> >> > > > + IFF_BYPASS_SLAVE = 1 << 28, >> >> > > I wonder, why you don't follow the existing coding style... Also, please >> >> > > add these to into the comment above. >> >> > To avoid checkpatch warnings. If it is OK to ignore these warnings, I can switch back >> >> > to the existing coding style to be consistent. >> >> Please do. >> >> >> >> >> >> > > >> >> > > > }; >> >> > > > >> >> > > > #define IFF_802_1Q_VLAN IFF_802_1Q_VLAN >> >> > > > @@ -1458,6 +1460,8 @@ enum netdev_priv_flags { >> >> > > > #define IFF_RXFH_CONFIGURED IFF_RXFH_CONFIGURED >> >> > > > #define IFF_MACSEC IFF_MACSEC >> >> > > > #define IFF_NO_RX_HANDLER IFF_NO_RX_HANDLER >> >> > > > +#define IFF_BYPASS IFF_BYPASS >> >> > > > +#define IFF_BYPASS_SLAVE IFF_BYPASS_SLAVE >> >> > > > >> >> > > > /** >> >> > > > * struct net_device - The DEVICE structure. >> >> > > > @@ -4308,6 +4312,16 @@ static inline bool netif_is_rxfh_configured(const struct net_device *dev) >> >> > > > return dev->priv_flags & IFF_RXFH_CONFIGURED; >> >> > > > } >> >> > > > >> >> > > > +static inline bool netif_is_bypass_master(const struct net_device *dev) >> >> > > > +{ >> >> > > > + return dev->priv_flags & IFF_BYPASS; >> >> > > > +} >> >> > > > + >> >> > > > +static inline bool netif_is_bypass_slave(const struct net_device *dev) >> >> > > > +{ >> >> > > > + return dev->priv_flags & IFF_BYPASS_SLAVE; >> >> > > > +} >> >> > > > + >> >> > > > /* This device needs to keep skb dst for qdisc enqueue or ndo_start_xmit() */ >> >> > > > static inline void netif_keep_dst(struct net_device *dev) >> >> > > > { >> >> > > > diff --git a/include/net/bypass.h b/include/net/bypass.h >> >> > > > new file mode 100644 >> >> > > > index 000000000000..86b02cb894cf >> >> > > > --- /dev/null >> >> > > > +++ b/include/net/bypass.h >> >> > > > @@ -0,0 +1,96 @@ >> >> > > > +// SPDX-License-Identifier: GPL-2.0 >> >> > > > +/* Copyright (c) 2018, Intel Corporation. */ >> >> > > > + >> >> > > > +#ifndef _NET_BYPASS_H >> >> > > > +#define _NET_BYPASS_H >> >> > > > + >> >> > > > +#include >> >> > > > + >> >> > > > +struct bypass_ops { >> >> > > > + int (*slave_pre_register)(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev); >> >> > > > + int (*slave_join)(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev); >> >> > > > + int (*slave_pre_unregister)(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev); >> >> > > > + int (*slave_release)(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev); >> >> > > > + int (*slave_link_change)(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev); >> >> > > > + rx_handler_result_t (*handle_frame)(struct sk_buff **pskb); >> >> > > > +}; >> >> > > > + >> >> > > > +struct bypass_master { >> >> > > > + struct list_head list; >> >> > > > + struct net_device __rcu *bypass_netdev; >> >> > > > + struct bypass_ops __rcu *ops; >> >> > > > +}; >> >> > > > + >> >> > > > +/* bypass state */ >> >> > > > +struct bypass_info { >> >> > > > + /* passthru netdev with same MAC */ >> >> > > > + struct net_device __rcu *active_netdev; >> >> > > You still use "active"/"backup" names which is highly misleading as >> >> > > it has completely different meaning that in bond for example. >> >> > > I noted that in my previous review already. Please change it. >> >> > I guess the issue is with only the 'active'  name. 'backup' should be fine as it also >> >> > matches with the BACKUP feature bit we are adding to virtio_net. >> >> I think that "backup" is also misleading. Both "active" and "backup" >> >> mean a *state* of slaves. This should be named differently. >> >> >> >> >> >> >> >> > With regards to alternate names for 'active', you suggested 'stolen', but i >> >> > am not too happy with it. >> >> > netvsc uses vf_netdev, are you OK with this? Or another option is 'passthru' >> >> No. The netdev could be any netdevice. It does not have to be a "VF". >> >> I think "stolen" is quite appropriate since it describes the modus >> >> operandi. The bypass master steals some netdevice according to some >> >> match. >> >> >> >> But I don't insist on "stolen". Just sounds right. >> > >> >We are adding VIRTIO_NET_F_BACKUP as a new feature bit to enable this feature, So i think >> >'backup' name is consistent. >> >> It perhaps makes sense from the view of virtio device. However, as I >> described couple of times, for master/slave device the name "backup" is >> highly misleading. > >virtio is the backup. You are supposed to use another >(typically passthrough) device, if that fails use virtio. >It does seem appropriate to me. If you like, we can >change that to "standby". Active I don't like either. "main"? Sounds much better, yes. > >In fact would failover be better than bypass? Also, much better. > > >> >> > >> >The intent is to restrict the 'active' netdev to be a VF. If there is a way to check that >> >a PCI device is a VF in the guest kernel, we could restrict 'active' netdev to be a VF. >> > >> >Will look for any suggestions in the next day or two. If i don't get any, i will go >> >with 'stolen' >> > >> > >> > >> > >> >> + >> >> +static struct net_device *bypass_master_get_bymac(u8 *mac, >> >> + struct bypass_ops **ops) >> >> +{ >> >> + struct bypass_master *bypass_master; >> >> + struct net_device *bypass_netdev; >> >> + >> >> + spin_lock(&bypass_lock); >> >> + list_for_each_entry(bypass_master, &bypass_master_list, list) { >> >> > > As I wrote the last time, you don't need this list, spinlock. >> >> > > You can do just something like: >> >> > > for_each_net(net) { >> >> > > for_each_netdev(net, dev) { >> >> > > if (netif_is_bypass_master(dev)) { >> >> > This function returns the upper netdev as well as the ops associated >> >> > with that netdev. >> >> > bypass_master_list is a list of 'struct bypass_master' that associates >> >> Well, can't you have it in netdev priv? >> > >> >We cannot do this for 2-netdev model as there is no bypass_netdev created. >> >> Howcome? You have no master? I don't understand.. >> >> >> >> > >> >> >> >> >> >> > 'bypass_netdev' with 'bypass_ops' and gets added via bypass_master_register(). >> >> > We need 'ops' only to support the 2 netdev model of netvsc. ops will be >> >> > NULL for 3-netdev model. >> >> I see :( >> >> >> >> >> >> > >> >> > > >> >> > > >> >> > > >> >> > > > + bypass_netdev = rcu_dereference(bypass_master->bypass_netdev); >> >> > > > + if (ether_addr_equal(bypass_netdev->perm_addr, mac)) { >> >> > > > + *ops = rcu_dereference(bypass_master->ops); >> >> > > I don't see how rcu_dereference is ok here. >> >> > > 1) I don't see rcu_read_lock taken >> >> > > 2) Looks like bypass_master->ops has the same value across the whole >> >> > > existence. >> >> > We hold rtnl_lock(), i think i need to change this to rtnl_dereference. >> >> > Yes. ops doesn't change. >> >> If it does not change, you can just access it directly. >> >> >> >> >> >> > > >> >> > > > + spin_unlock(&bypass_lock); >> >> > > > + return bypass_netdev; >> >> > > > + } >> >> > > > + } >> >> > > > + spin_unlock(&bypass_lock); >> >> > > > + return NULL; >> >> > > > +} >> >> > > > + >> >> > > > +static int bypass_slave_register(struct net_device *slave_netdev) >> >> > > > +{ >> >> > > > + struct net_device *bypass_netdev; >> >> > > > + struct bypass_ops *bypass_ops; >> >> > > > + int ret, orig_mtu; >> >> > > > + >> >> > > > + ASSERT_RTNL(); >> >> > > > + >> >> > > > + bypass_netdev = bypass_master_get_bymac(slave_netdev->perm_addr, >> >> > > > + &bypass_ops); >> >> > > For master, could you use word "master" in the variables so it is clear? >> >> > > Also, "dev" is fine instead of "netdev". >> >> > > Something like "bpmaster_dev" >> >> > bypass_master is of  type struct bypass_master,  bypass_netdev is of type struct net_device. >> >> I was trying to point out, that "bypass_netdev" represents a "master" >> >> netdev, yet it does not say master. That is why I suggested >> >> "bpmaster_dev" >> >> >> >> >> >> > I can change all _netdev suffixes to _dev to make the names shorter. >> >> ok. >> >> >> >> >> >> > >> >> > > >> >> > > > + if (!bypass_netdev) >> >> > > > + goto done; >> >> > > > + >> >> > > > + ret = bypass_slave_pre_register(slave_netdev, bypass_netdev, >> >> > > > + bypass_ops); >> >> > > > + if (ret != 0) >> >> > > Just "if (ret)" will do. You have this on more places. >> >> > OK. >> >> > >> >> > >> >> > > >> >> > > > + goto done; >> >> > > > + >> >> > > > + ret = netdev_rx_handler_register(slave_netdev, >> >> > > > + bypass_ops ? bypass_ops->handle_frame : >> >> > > > + bypass_handle_frame, bypass_netdev); >> >> > > > + if (ret != 0) { >> >> > > > + netdev_err(slave_netdev, "can not register bypass rx handler (err = %d)\n", >> >> > > > + ret); >> >> > > > + goto done; >> >> > > > + } >> >> > > > + >> >> > > > + ret = netdev_upper_dev_link(slave_netdev, bypass_netdev, NULL); >> >> > > > + if (ret != 0) { >> >> > > > + netdev_err(slave_netdev, "can not set master device %s (err = %d)\n", >> >> > > > + bypass_netdev->name, ret); >> >> > > > + goto upper_link_failed; >> >> > > > + } >> >> > > > + >> >> > > > + slave_netdev->priv_flags |= IFF_BYPASS_SLAVE; >> >> > > > + >> >> > > > + if (netif_running(bypass_netdev)) { >> >> > > > + ret = dev_open(slave_netdev); >> >> > > > + if (ret && (ret != -EBUSY)) { >> >> > > > + netdev_err(bypass_netdev, "Opening slave %s failed ret:%d\n", >> >> > > > + slave_netdev->name, ret); >> >> > > > + goto err_interface_up; >> >> > > > + } >> >> > > > + } >> >> > > > + >> >> > > > + /* Align MTU of slave with master */ >> >> > > > + orig_mtu = slave_netdev->mtu; >> >> > > > + ret = dev_set_mtu(slave_netdev, bypass_netdev->mtu); >> >> > > > + if (ret != 0) { >> >> > > > + netdev_err(bypass_netdev, "unable to change mtu of %s to %u register failed\n", >> >> > > > + slave_netdev->name, bypass_netdev->mtu); >> >> > > > + goto err_set_mtu; >> >> > > > + } >> >> > > > + >> >> > > > + ret = bypass_slave_join(slave_netdev, bypass_netdev, bypass_ops); >> >> > > > + if (ret != 0) >> >> > > > + goto err_join; >> >> > > > + >> >> > > > + call_netdevice_notifiers(NETDEV_JOIN, slave_netdev); >> >> > > > + >> >> > > > + netdev_info(bypass_netdev, "bypass slave:%s registered\n", >> >> > > > + slave_netdev->name); >> >> > > > + >> >> > > > + goto done; >> >> > > > + >> >> > > > +err_join: >> >> > > > + dev_set_mtu(slave_netdev, orig_mtu); >> >> > > > +err_set_mtu: >> >> > > > + dev_close(slave_netdev); >> >> > > > +err_interface_up: >> >> > > > + netdev_upper_dev_unlink(slave_netdev, bypass_netdev); >> >> > > > + slave_netdev->priv_flags &= ~IFF_BYPASS_SLAVE; >> >> > > > +upper_link_failed: >> >> > > > + netdev_rx_handler_unregister(slave_netdev); >> >> > > > +done: >> >> > > > + return NOTIFY_DONE; >> >> > > > +} >> >> > > > + >> >> > > > +static int bypass_slave_pre_unregister(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev, >> >> > > > + struct bypass_ops *bypass_ops) >> >> > > > +{ >> >> > > > + struct net_device *backup_netdev, *active_netdev; >> >> > > > + struct bypass_info *bi; >> >> > > > + >> >> > > > + if (bypass_ops) { >> >> > > > + if (!bypass_ops->slave_pre_unregister) >> >> > > > + return -EINVAL; >> >> > > > + >> >> > > > + return bypass_ops->slave_pre_unregister(slave_netdev, >> >> > > > + bypass_netdev); >> >> > > > + } >> >> > > > + >> >> > > > + bi = netdev_priv(bypass_netdev); >> >> > > > + active_netdev = rtnl_dereference(bi->active_netdev); >> >> > > > + backup_netdev = rtnl_dereference(bi->backup_netdev); >> >> > > > + >> >> > > > + if (slave_netdev != active_netdev && slave_netdev != backup_netdev) >> >> > > > + return -EINVAL; >> >> > > > + >> >> > > > + return 0; >> >> > > > +} >> >> > > > + >> >> > > > +static int bypass_slave_release(struct net_device *slave_netdev, >> >> > > > + struct net_device *bypass_netdev, >> >> > > > + struct bypass_ops *bypass_ops) >> >> > > > +{ >> >> > > > + struct net_device *backup_netdev, *active_netdev; >> >> > > > + struct bypass_info *bi; >> >> > > > + >> >> > > > + if (bypass_ops) { >> >> > > > + if (!bypass_ops->slave_release) >> >> > > > + return -EINVAL; >> >> > > I think it would be good to make the API to the driver more strict and >> >> > > have a separate set of ops for "active" and "backup" netdevices. >> >> > > That should stop people thinking about extending this to more slaves in >> >> > > the future. >> >> > We have checks in slave_pre_register() that allows only 1 'backup' and 1 >> >> > 'active' slave. >> >> I'm very well aware of that. I just thought that explicit ops for the >> >> two slaves would make this more clear. >> >> >> >> >> >> > >> >> > > >> >> > > >> >> > > > + >> >> > > > + return bypass_ops->slave_release(slave_netdev, bypass_netdev); >> >> > > > + } >> >> > > > + >> >> > > > + bi = netdev_priv(bypass_netdev); >> >> > > > + active_netdev = rtnl_dereference(bi->active_netdev); >> >> > > > + backup_netdev = rtnl_dereference(bi->backup_netdev); >> >> > > > + >> >> > > > + if (slave_netdev == backup_netdev) { >> >> > > > + RCU_INIT_POINTER(bi->backup_netdev, NULL); >> >> > > > + } else { >> >> > > > + RCU_INIT_POINTER(bi->active_netdev, NULL); >> >> > > > + if (backup_netdev) { >> >> > > > + bypass_netdev->min_mtu = backup_netdev->min_mtu; >> >> > > > + bypass_netdev->max_mtu = backup_netdev->max_mtu; >> >> > > > + } >> >> > > > + } >> >> > > > + >> >> > > > + dev_put(slave_netdev); >> >> > > > + >> >> > > > + netdev_info(bypass_netdev, "bypass slave:%s released\n", >> >> > > > + slave_netdev->name); >> >> > > > + >> >> > > > + return 0; >> >> > > > +} >> >> > > > + >> >> > > > +int bypass_slave_unregister(struct net_device *slave_netdev) >> >> > > > +{ >> >> > > > + struct net_device *bypass_netdev; >> >> > > > + struct bypass_ops *bypass_ops; >> >> > > > + int ret; >> >> > > > + >> >> > > > + if (!netif_is_bypass_slave(slave_netdev)) >> >> > > > + goto done; >> >> > > > + >> >> > > > + ASSERT_RTNL(); >> >> > > > + >> >> > > > + bypass_netdev = bypass_master_get_bymac(slave_netdev->perm_addr, >> >> > > > + &bypass_ops); >> >> > > > + if (!bypass_netdev) >> >> > > > + goto done; >> >> > > > + >> >> > > > + ret = bypass_slave_pre_unregister(slave_netdev, bypass_netdev, >> >> > > > + bypass_ops); >> >> > > > + if (ret != 0) >> >> > > > + goto done; >> >> > > > + >> >> > > > + netdev_rx_handler_unregister(slave_netdev); >> >> > > > + netdev_upper_dev_unlink(slave_netdev, bypass_netdev); >> >> > > > + slave_netdev->priv_flags &= ~IFF_BYPASS_SLAVE; >> >> > > > + >> >> > > > + bypass_slave_release(slave_netdev, bypass_netdev, bypass_ops); >> >> > > > + >> >> > > > + netdev_info(bypass_netdev, "bypass slave:%s unregistered\n", >> >> > > > + slave_netdev->name); >> >> > > > + >> >> > > > +done: >> >> > > > + return NOTIFY_DONE; >> >> > > > +} >> >> > > > +EXPORT_SYMBOL_GPL(bypass_slave_unregister); >> >> > > > + >> >> > > > +static bool bypass_xmit_ready(struct net_device *dev) >> >> > > > +{ >> >> > > > + return netif_running(dev) && netif_carrier_ok(dev); >> >> > > > +} >> >> > > > + >> >> > > > +static int bypass_slave_link_change(struct net_device *slave_netdev) >> >> > > > +{ >> >> > > > + struct net_device *bypass_netdev, *active_netdev, *backup_netdev; >> >> > > > + struct bypass_ops *bypass_ops; >> >> > > > + struct bypass_info *bi; >> >> > > > + >> >> > > > + if (!netif_is_bypass_slave(slave_netdev)) >> >> > > > + goto done; >> >> > > > + >> >> > > > + ASSERT_RTNL(); >> >> > > > + >> >> > > > + bypass_netdev = bypass_master_get_bymac(slave_netdev->perm_addr, >> >> > > > + &bypass_ops); >> >> > > > + if (!bypass_netdev) >> >> > > > + goto done; >> >> > > > + >> >> > > > + if (bypass_ops) { >> >> > > > + if (!bypass_ops->slave_link_change) >> >> > > > + goto done; >> >> > > > + >> >> > > > + return bypass_ops->slave_link_change(slave_netdev, >> >> > > > + bypass_netdev); >> >> > > > + } >> >> > > > + >> >> > > > + if (!netif_running(bypass_netdev)) >> >> > > > + return 0; >> >> > > > + >> >> > > > + bi = netdev_priv(bypass_netdev); >> >> > > > + >> >> > > > + active_netdev = rtnl_dereference(bi->active_netdev); >> >> > > > + backup_netdev = rtnl_dereference(bi->backup_netdev); >> >> > > > + >> >> > > > + if (slave_netdev != active_netdev && slave_netdev != backup_netdev) >> >> > > > + goto done; >> >> > > You don't need this check. "if (!netif_is_bypass_slave(slave_netdev))" >> >> > > above is enough. >> >> > I think we need this check to not allow events from a slave that is not >> >> > attached to this master but has the same MAC. >> >> Why do we need such events? Seems wrong to me. >> > >> >We want to avoid events from a netdev that is mis-configured with the same MAC as >> >a bypass setup. >> > >> >> Consider: >> >> >> >> bp1 bp2 >> >> a1 b1 a2 b2 >> >> >> >> >> >> a1 and a2 have the same mac and bp1 and bp2 have the same mac. >> > >> >We should not have 2 bypass configs with the same MAC. >> >I need to add a check in the bypass_master_register() to prevent this. >> >> Mac can change, you would have to check in change as well. Feels odd >> thought. >> >> >> > >> >The above check is to avoid cases where we have >> >bp1(a1, b1) with mac1 >> >and a2 is mis-configured with mac1, we want to avoid using a2 link events to update bp1. >> > >> >> Now bypass_master_get_bymac() will return always bp1 or bp2 - depending on >> >> the order of creation. >> >> Let's say it will return bp1. Then when we have event for a2, the >> >> bypass_ops->slave_link_change is called with (a2, bp1). That is wrong. >> >> >> >> >> >> You cannot use bypass_master_get_bymac() here. >> >> >> >> >> >> >> >> > > >> >> > > > + >> >> > > > + if ((active_netdev && bypass_xmit_ready(active_netdev)) || >> >> > > > + (backup_netdev && bypass_xmit_ready(backup_netdev))) { >> >> > > > + netif_carrier_on(bypass_netdev); >> >> > > > + netif_tx_wake_all_queues(bypass_netdev); >> >> > > > + } else { >> >> > > > + netif_carrier_off(bypass_netdev); >> >> > > > + netif_tx_stop_all_queues(bypass_netdev); >> >> > > > + } >> >> > > > + >> >> > > > +done: >> >> > > > + return NOTIFY_DONE; >> >> > > > +} >> >> > > > + >> >> > > > +static bool bypass_validate_event_dev(struct net_device *dev) >> >> > > > +{ >> >> > > > + /* Skip parent events */ >> >> > > > + if (netif_is_bypass_master(dev)) >> >> > > > + return false; >> >> > > > + >> >> > > > + /* Avoid non-Ethernet type devices */ >> >> > > > + if (dev->type != ARPHRD_ETHER) >> >> > > > + return false; >> >> > > > + >> >> > > > + /* Avoid Vlan dev with same MAC registering as VF */ >> >> > > > + if (is_vlan_dev(dev)) >> >> > > > + return false; >> >> > > > + >> >> > > > + /* Avoid Bonding master dev with same MAC registering as slave dev */ >> >> > > > + if ((dev->priv_flags & IFF_BONDING) && (dev->flags & IFF_MASTER)) >> >> > > Yeah, this is certainly incorrect. One thing is, you should be using the >> >> > > helpers netif_is_bond_master(). >> >> > > But what about the rest? macsec, macvlan, team, bridge, ovs and others? >> >> > > >> >> > > You need to do it not by blacklisting, but with whitelisting. You need >> >> > > to whitelist VF devices. My port flavours patchset might help with this. >> >> > May be i can use netdev_has_lower_dev() helper to make sure that the slave >> >> I don't see such function in the code. >> > >> >It is netdev_has_any_lower_dev(). I need to export it. >> >> Come on, you cannot use that. That would allow bonding without slaves, >> but the slaves could be added later on. >> >> What exactly you are trying to achieve by this? >> >> >> > >> >> >> >> >> >> > device is not an upper dev. >> >> > Can you point to your port flavours patchset? Is it upstream? >> >> I sent rfc couple of weeks ago: >> >> [patch net-next RFC 00/12] devlink: introduce port flavours and common phys_port_name generation >> > >> > >> >