All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
@ 2021-06-22  3:09 zhudi
  2021-06-22 17:40 ` patchwork-bot+netdevbpf
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: zhudi @ 2021-06-22  3:09 UTC (permalink / raw)
  To: j.vosburgh, vfalico, kuba, davem; +Cc: netdev, zhudi21, rose.chen

From: Di Zhu <zhudi21@huawei.com>

The following steps will definitely cause the kernel to crash:
	ip link add vrf1 type vrf table 1
	modprobe bonding.ko max_bonds=1
	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
	rmmod bonding

The root cause is that: When the VRF is added to the slave device,
it will fail, and some cleaning work will be done. because VRF device
has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING flag.
Then, when we unload the bonding module, unregister_netdevice_notifier()
will treat the VRF device as a bond master device and treat netdev_priv()
as struct bonding{} which actually is struct net_vrf{}.

By analyzing the processing logic of bond_enslave(), it seems that
it is not allowed to add the slave device with the IFF_MASTER flag, so
we need to add a code check for this situation.

Signed-off-by: Di Zhu <zhudi21@huawei.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c5a646d06102..16840c9bc00d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1601,6 +1601,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 	int link_reporting;
 	int res = 0, i;
 
+	if (slave_dev->flags & IFF_MASTER) {
+		netdev_err(bond_dev,
+			   "Error: Device with IFF_MASTER cannot be enslaved\n");
+		return -EPERM;
+	}
+
 	if (!bond->params.use_carrier &&
 	    slave_dev->ethtool_ops->get_link == NULL &&
 	    slave_ops->ndo_do_ioctl == NULL) {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
  2021-06-22  3:09 [PATCH] bonding: avoid adding slave device with IFF_MASTER flag zhudi
@ 2021-06-22 17:40 ` patchwork-bot+netdevbpf
  2021-06-22 17:53 ` Eric Dumazet
  2021-06-22 18:16 ` Jay Vosburgh
  2 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-06-22 17:40 UTC (permalink / raw)
  To: zhudi; +Cc: j.vosburgh, vfalico, kuba, davem, netdev, rose.chen

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Tue, 22 Jun 2021 11:09:29 +0800 you wrote:
> From: Di Zhu <zhudi21@huawei.com>
> 
> The following steps will definitely cause the kernel to crash:
> 	ip link add vrf1 type vrf table 1
> 	modprobe bonding.ko max_bonds=1
> 	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
> 	rmmod bonding
> 
> [...]

Here is the summary with links:
  - bonding: avoid adding slave device with IFF_MASTER flag
    https://git.kernel.org/netdev/net/c/3c9ef511b9fa

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
  2021-06-22  3:09 [PATCH] bonding: avoid adding slave device with IFF_MASTER flag zhudi
  2021-06-22 17:40 ` patchwork-bot+netdevbpf
@ 2021-06-22 17:53 ` Eric Dumazet
  2021-06-22 18:16 ` Jay Vosburgh
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-06-22 17:53 UTC (permalink / raw)
  To: zhudi, j.vosburgh, vfalico, kuba, davem; +Cc: netdev, rose.chen



On 6/22/21 5:09 AM, zhudi wrote:
> From: Di Zhu <zhudi21@huawei.com>
> 
> The following steps will definitely cause the kernel to crash:
> 	ip link add vrf1 type vrf table 1
> 	modprobe bonding.ko max_bonds=1
> 	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
> 	rmmod bonding
> 
> The root cause is that: When the VRF is added to the slave device,
> it will fail, and some cleaning work will be done. because VRF device
> has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING flag.
> Then, when we unload the bonding module, unregister_netdevice_notifier()
> will treat the VRF device as a bond master device and treat netdev_priv()
> as struct bonding{} which actually is struct net_vrf{}.
> 
> By analyzing the processing logic of bond_enslave(), it seems that
> it is not allowed to add the slave device with the IFF_MASTER flag, so
> we need to add a code check for this situation.
> 
> Signed-off-by: Di Zhu <zhudi21@huawei.com>
> ---
>  drivers/net/bonding/bond_main.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index c5a646d06102..16840c9bc00d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1601,6 +1601,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
>  	int link_reporting;
>  	int res = 0, i;
>  
> +	if (slave_dev->flags & IFF_MASTER) {

Missing NL_SET_ERR_MSG( ?

> +		netdev_err(bond_dev,
> +			   "Error: Device with IFF_MASTER cannot be enslaved\n");
> +		return -EPERM;
> +	}
> +
>  	if (!bond->params.use_carrier &&
>  	    slave_dev->ethtool_ops->get_link == NULL &&
>  	    slave_ops->ndo_do_ioctl == NULL) {
> 

This is strange, can you tell us why we keep the following lines after your patch ?

	if (bond_dev == slave_dev) {
		NL_SET_ERR_MSG(extack, "Cannot enslave bond to itself.");
		netdev_err(bond_dev, "cannot enslave bond to itself.\n");
		return -EPERM;
	}

I was under the impression that a stack of bonding devices was allowed.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
  2021-06-22  3:09 [PATCH] bonding: avoid adding slave device with IFF_MASTER flag zhudi
  2021-06-22 17:40 ` patchwork-bot+netdevbpf
  2021-06-22 17:53 ` Eric Dumazet
@ 2021-06-22 18:16 ` Jay Vosburgh
  2021-06-22 18:52   ` Eric Dumazet
  2 siblings, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2021-06-22 18:16 UTC (permalink / raw)
  To: zhudi; +Cc: vfalico, kuba, davem, netdev, rose.chen

zhudi <zhudi21@huawei.com> wrote:

>From: Di Zhu <zhudi21@huawei.com>
>
>The following steps will definitely cause the kernel to crash:
>	ip link add vrf1 type vrf table 1
>	modprobe bonding.ko max_bonds=1
>	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
>	rmmod bonding
>
>The root cause is that: When the VRF is added to the slave device,
>it will fail, and some cleaning work will be done. because VRF device
>has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING flag.
>Then, when we unload the bonding module, unregister_netdevice_notifier()
>will treat the VRF device as a bond master device and treat netdev_priv()
>as struct bonding{} which actually is struct net_vrf{}.
>
>By analyzing the processing logic of bond_enslave(), it seems that
>it is not allowed to add the slave device with the IFF_MASTER flag, so
>we need to add a code check for this situation.

	I don't believe the statement just above is correct; nesting
bonds has historically been permitted, even if it is of questionable
value these days.  I've not tested nesting in a while, but last I recall
it did function.

	Leaving aside the question of whether it's really useful to nest
bonds or not, my concern with disabling this is that it will break
existing configurations that currently work fine.

	However, it should be possible to use netif_is_bonding_master
(which tests dev->flags & IFF_MASTER and dev->priv_flags & IFF_BONDING)
to exclude IFF_MASTER devices that are not bonds (which seem to be vrf
and eql), e.g.,

	if ((slave_dev->flags & IFF_MASTER) &&
		!netif_is_bond_master(slave_dev))

	Or we can just go with this patch and see if anything breaks.

	-J

>Signed-off-by: Di Zhu <zhudi21@huawei.com>
>---
> drivers/net/bonding/bond_main.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index c5a646d06102..16840c9bc00d 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1601,6 +1601,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> 	int link_reporting;
> 	int res = 0, i;
> 
>+	if (slave_dev->flags & IFF_MASTER) {
>+		netdev_err(bond_dev,
>+			   "Error: Device with IFF_MASTER cannot be enslaved\n");
>+		return -EPERM;
>+	}
>+
> 	if (!bond->params.use_carrier &&
> 	    slave_dev->ethtool_ops->get_link == NULL &&
> 	    slave_ops->ndo_do_ioctl == NULL) {
>-- 
>2.23.0
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
  2021-06-22 18:16 ` Jay Vosburgh
@ 2021-06-22 18:52   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-06-22 18:52 UTC (permalink / raw)
  To: Jay Vosburgh, zhudi; +Cc: vfalico, kuba, davem, netdev, rose.chen



On 6/22/21 8:16 PM, Jay Vosburgh wrote:
> zhudi <zhudi21@huawei.com> wrote:
> 
>> From: Di Zhu <zhudi21@huawei.com>
>>
>> The following steps will definitely cause the kernel to crash:
>> 	ip link add vrf1 type vrf table 1
>> 	modprobe bonding.ko max_bonds=1
>> 	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
>> 	rmmod bonding
>>
>> The root cause is that: When the VRF is added to the slave device,
>> it will fail, and some cleaning work will be done. because VRF device
>> has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING flag.
>> Then, when we unload the bonding module, unregister_netdevice_notifier()
>> will treat the VRF device as a bond master device and treat netdev_priv()
>> as struct bonding{} which actually is struct net_vrf{}.
>>
>> By analyzing the processing logic of bond_enslave(), it seems that
>> it is not allowed to add the slave device with the IFF_MASTER flag, so
>> we need to add a code check for this situation.
> 
> 	I don't believe the statement just above is correct; nesting
> bonds has historically been permitted, even if it is of questionable
> value these days.  I've not tested nesting in a while, but last I recall
> it did function.
> 
> 	Leaving aside the question of whether it's really useful to nest
> bonds or not, my concern with disabling this is that it will break
> existing configurations that currently work fine.
> 
> 	However, it should be possible to use netif_is_bonding_master
> (which tests dev->flags & IFF_MASTER and dev->priv_flags & IFF_BONDING)
> to exclude IFF_MASTER devices that are not bonds (which seem to be vrf
> and eql), e.g.,
> 
> 	if ((slave_dev->flags & IFF_MASTER) &&
> 		!netif_is_bond_master(slave_dev))
> 
> 	Or we can just go with this patch and see if anything breaks.
> 

syzbot for sure will stop finding stack overflows and other issues like that :)

I know that some people used nested bonding devices in order to implement complex qdisc setups.
(eg HTB on the first level, netem on the second level).


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
@ 2021-06-23  2:11 zhudi (J)
  0 siblings, 0 replies; 7+ messages in thread
From: zhudi (J) @ 2021-06-23  2:11 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: vfalico, kuba, davem, netdev, Chenxiang (EulerOS)

> 
> >From: Di Zhu <zhudi21@huawei.com>
> >
> >The following steps will definitely cause the kernel to crash:
> >	ip link add vrf1 type vrf table 1
> >	modprobe bonding.ko max_bonds=1
> >	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
> >	rmmod bonding
> >
> >The root cause is that: When the VRF is added to the slave device,
> >it will fail, and some cleaning work will be done. because VRF device
> >has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING flag.
> >Then, when we unload the bonding module,
> unregister_netdevice_notifier()
> >will treat the VRF device as a bond master device and treat netdev_priv()
> >as struct bonding{} which actually is struct net_vrf{}.
> >
> >By analyzing the processing logic of bond_enslave(), it seems that
> >it is not allowed to add the slave device with the IFF_MASTER flag, so
> >we need to add a code check for this situation.
> 
> 	I don't believe the statement just above is correct; nesting
> bonds has historically been permitted, even if it is of questionable
> value these days.  I've not tested nesting in a while, but last I recall
> it did function.
> 
> 	Leaving aside the question of whether it's really useful to nest
> bonds or not, my concern with disabling this is that it will break
> existing configurations that currently work fine.
> 
> 	However, it should be possible to use netif_is_bonding_master
> (which tests dev->flags & IFF_MASTER and dev->priv_flags & IFF_BONDING)
> to exclude IFF_MASTER devices that are not bonds (which seem to be vrf
> and eql), e.g.,
> 
> 	if ((slave_dev->flags & IFF_MASTER) &&
> 		!netif_is_bond_master(slave_dev))
> 	
> 	Or we can just go with this patch and see if anything breaks.
> 
> 	-J

	Thank you for your advice, as Eric dumazet described: since there is a usage scenario
about nesting bonding, we should not break it.

> 
> >Signed-off-by: Di Zhu <zhudi21@huawei.com>
> >---
> > drivers/net/bonding/bond_main.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> >diff --git a/drivers/net/bonding/bond_main.c
> b/drivers/net/bonding/bond_main.c
> >index c5a646d06102..16840c9bc00d 100644
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
> >@@ -1601,6 +1601,12 @@ int bond_enslave(struct net_device *bond_dev,
> struct net_device *slave_dev,
> > 	int link_reporting;
> > 	int res = 0, i;
> >
> >+	if (slave_dev->flags & IFF_MASTER) {
> >+		netdev_err(bond_dev,
> >+			   "Error: Device with IFF_MASTER cannot be
> enslaved\n");
> >+		return -EPERM;
> >+	}
> >+
> > 	if (!bond->params.use_carrier &&
> > 	    slave_dev->ethtool_ops->get_link == NULL &&
> > 	    slave_ops->ndo_do_ioctl == NULL) {
> >--
> >2.23.0
> >
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: avoid adding slave device with IFF_MASTER flag
@ 2021-06-23  2:02 zhudi (J)
  0 siblings, 0 replies; 7+ messages in thread
From: zhudi (J) @ 2021-06-23  2:02 UTC (permalink / raw)
  To: Eric Dumazet, Jay Vosburgh
  Cc: vfalico, kuba, davem, netdev, Chenxiang (EulerOS)

> 
> On 6/22/21 8:16 PM, Jay Vosburgh wrote:
> > zhudi <zhudi21@huawei.com> wrote:
> >
> >> From: Di Zhu <zhudi21@huawei.com>
> >>
> >> The following steps will definitely cause the kernel to crash:
> >> 	ip link add vrf1 type vrf table 1
> >> 	modprobe bonding.ko max_bonds=1
> >> 	echo "+vrf1" >/sys/class/net/bond0/bonding/slaves
> >> 	rmmod bonding
> >>
> >> The root cause is that: When the VRF is added to the slave device, it
> >> will fail, and some cleaning work will be done. because VRF device
> >> has IFF_MASTER flag, cleanup process  will not clear the IFF_BONDING
> flag.
> >> Then, when we unload the bonding module,
> >> unregister_netdevice_notifier() will treat the VRF device as a bond
> >> master device and treat netdev_priv() as struct bonding{} which actually is
> struct net_vrf{}.
> >>
> >> By analyzing the processing logic of bond_enslave(), it seems that it
> >> is not allowed to add the slave device with the IFF_MASTER flag, so
> >> we need to add a code check for this situation.
> >
> > 	I don't believe the statement just above is correct; nesting bonds
> > has historically been permitted, even if it is of questionable value
> > these days.  I've not tested nesting in a while, but last I recall it
> > did function.
> >
> > 	Leaving aside the question of whether it's really useful to nest
> > bonds or not, my concern with disabling this is that it will break
> > existing configurations that currently work fine.
> >
> > 	However, it should be possible to use netif_is_bonding_master
> (which
> > tests dev->flags & IFF_MASTER and dev->priv_flags & IFF_BONDING) to
> > exclude IFF_MASTER devices that are not bonds (which seem to be vrf
> > and eql), e.g.,
> >
> > 	if ((slave_dev->flags & IFF_MASTER) &&
> > 		!netif_is_bond_master(slave_dev))
> >
> > 	Or we can just go with this patch and see if anything breaks.
> >
> 
> syzbot for sure will stop finding stack overflows and other issues like that :)
> 
> I know that some people used nested bonding devices in order to implement
> complex qdisc setups.
> (eg HTB on the first level, netem on the second level).

If there is such a usage scenario,  the following code proposed by Jay Vosburgh is better:
	if ((slave_dev->flags & IFF_MASTER) && 
			!netif_is_bond_master(slave_dev))

Thank you for your advice, I will send another patch to fix it.




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-06-23  2:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-22  3:09 [PATCH] bonding: avoid adding slave device with IFF_MASTER flag zhudi
2021-06-22 17:40 ` patchwork-bot+netdevbpf
2021-06-22 17:53 ` Eric Dumazet
2021-06-22 18:16 ` Jay Vosburgh
2021-06-22 18:52   ` Eric Dumazet
2021-06-23  2:02 zhudi (J)
2021-06-23  2:11 zhudi (J)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.