* [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
@ 2023-04-28  7:36 Hangbin Liu
  2023-04-28 16:06 ` Jay Vosburgh
  0 siblings, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-04-28  7:36 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

Hi Jay,

A user reported a bonding issue: when an active-backup bond sits on top of an
802.3ad bond interface and the 802.3ad bond's speed/duplex changes
dynamically, the upper bonding interface's speed/duplex is not updated at
the same time.

This does not seem easy to fix, since we update the speed/duplex only
when there is a failover (except in 802.3ad mode) or a slave netdev change.
But the lower bonding interface doesn't trigger a netdev change when its speed
changes, because ethtool gets the bonding speed via
bond_ethtool_get_link_ksettings(), which does not affect the bonding
interface itself.

Here is a reproducer:

```
#!/bin/bash
s_ns="s"
c_ns="c"

ip netns del ${c_ns} &> /dev/null
ip netns del ${s_ns} &> /dev/null
sleep 1
ip netns add ${c_ns}
ip netns add ${s_ns}

ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond1 type bond mode active-backup miimon 100

for i in $(seq 0 2); do
        ip -n ${c_ns} link add eth${i} type veth peer name eth${i} netns ${s_ns}
        [ $i -eq 2 ] && break
        ip -n ${c_ns} link set eth${i} master bond0
        ip -n ${s_ns} link set eth${i} master bond0
done

ip -n ${c_ns} link set eth2 up
ip -n ${c_ns} link set bond0 up

ip -n ${s_ns} link set bond0 master bond1
ip -n ${s_ns} link set bond1 up

sleep 5

ip netns exec ${s_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond1 | grep Speed
```

When you run the reproducer directly, you will see:
# ./bond_topo_lacp.sh
        Speed: 20000Mb/s
        Speed: 10000Mb/s
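
The mismatch above can also be checked without ethtool. The following is a minimal sketch, assuming the kernel exposes each interface's speed in /sys/class/net/<dev>/speed; the helper names and the SYSFS_NET override are hypothetical and exist only for illustration:

```shell
#!/bin/bash
# Hypothetical helpers: compare the speed reported by an upper bond with
# the speed of the lower device it is stacked on.
# SYSFS_NET is an assumed override so the helpers can be pointed at a
# test directory; it defaults to the real sysfs location.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the speed (in Mb/s) of a device, or -1 if it cannot be read.
dev_speed() {
        local speed
        speed=$(cat "${SYSFS_NET}/$1/speed" 2>/dev/null) || speed=-1
        echo "${speed:--1}"
}

# Print both speeds; return 0 when they are equal, 1 otherwise.
speeds_match() {
        local upper lower
        upper=$(dev_speed "$1")
        lower=$(dev_speed "$2")
        echo "$1: ${upper}Mb/s, $2: ${lower}Mb/s"
        [ "$upper" = "$lower" ]
}
```

Run from a shell inside the ${s_ns} namespace, `speeds_match bond1 bond0` would print both values and return non-zero for the output shown above.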

So do you have any thoughts about how to fix it?

Thanks
Hangbin


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-04-28  7:36 [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad Hangbin Liu
@ 2023-04-28 16:06 ` Jay Vosburgh
  2023-05-08  9:26   ` Hangbin Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2023-04-28 16:06 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: netdev

Hangbin Liu <liuhangbin@gmail.com> wrote:

>A user reported a bonding issue: when an active-backup bond sits on top of an
>802.3ad bond interface and the 802.3ad bond's speed/duplex changes
>dynamically, the upper bonding interface's speed/duplex is not updated at
>the same time.
>
>This does not seem easy to fix, since we update the speed/duplex only
>when there is a failover (except in 802.3ad mode) or a slave netdev change.
>But the lower bonding interface doesn't trigger a netdev change when its speed
>changes, because ethtool gets the bonding speed via
>bond_ethtool_get_link_ksettings(), which does not affect the bonding
>interface itself.

	Well, this gets back into the intermittent discussion on whether
or not being able to nest bonds is useful or not, and thus whether it
should be allowed or not.  It's at best a niche use case (I don't recall
the example configurations ever being anything other than 802.3ad under
active-backup), and was broken for a number of years without much
uproar.

	In this particular case, nesting two LACP (802.3ad) bonds inside
an active-backup bond provides no functional benefit as far as I'm aware
(maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
between multiple aggregators.  The "ad_select" option provides a few
choices on the criteria for choosing the active aggregator.

	Is there a reason the user in your case doesn't use 802.3ad mode
directly?

>Here is a reproducer:
>
>```
>#!/bin/bash
>s_ns="s"
>c_ns="c"
>
>ip netns del ${c_ns} &> /dev/null
>ip netns del ${s_ns} &> /dev/null
>sleep 1
>ip netns add ${c_ns}
>ip netns add ${s_ns}
>
>ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100
>ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100
>ip -n ${s_ns} link add bond1 type bond mode active-backup miimon 100
>
>for i in $(seq 0 2); do
>        ip -n ${c_ns} link add eth${i} type veth peer name eth${i} netns ${s_ns}
>        [ $i -eq 2 ] && break
>        ip -n ${c_ns} link set eth${i} master bond0
>        ip -n ${s_ns} link set eth${i} master bond0
>done
>
>ip -n ${c_ns} link set eth2 up
>ip -n ${c_ns} link set bond0 up
>
>ip -n ${s_ns} link set bond0 master bond1
>ip -n ${s_ns} link set bond1 up
>
>sleep 5
>
>ip netns exec ${s_ns} ethtool bond0 | grep Speed
>ip netns exec ${s_ns} ethtool bond1 | grep Speed
>```
>
>When you run the reproducer directly, you will see:
># ./bond_topo_lacp.sh
>        Speed: 20000Mb/s
>        Speed: 10000Mb/s
>
>So do you have any thoughts about how to fix it?

	Maybe it's time to disable nesting of bonds, update the
documentation to note that it's disabled and that 802.3ad mode is smart
enough to do multiple aggregators, and then see if anyone has some other
use case and complains.

	In the past, I've been against doing this, but only because it
might break existing configurations.  If nested configurations are going
to misbehave and require complicated shenanigans to fix, then perhaps
it's time to push users into a configuration that works without the
nesting.

	The only thing I can think of that active-backup over 802.3ad
gets is the gratuitous ARP / NS on failover.  If that's the key feature
for nesting, then I'd rather add the grat ARP to 802.3ad aggregator
selection and disable nesting.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-04-28 16:06 ` Jay Vosburgh
@ 2023-05-08  9:26   ` Hangbin Liu
  2023-05-08 18:32     ` Jay Vosburgh
  0 siblings, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-05-08  9:26 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Fri, Apr 28, 2023 at 09:06:40AM -0700, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> 
> >A user reported a bonding issue: when an active-backup bond sits on top of an
> >802.3ad bond interface and the 802.3ad bond's speed/duplex changes
> >dynamically, the upper bonding interface's speed/duplex is not updated at
> >the same time.
> >
> >This does not seem easy to fix, since we update the speed/duplex only
> >when there is a failover (except in 802.3ad mode) or a slave netdev change.
> >But the lower bonding interface doesn't trigger a netdev change when its speed
> >changes, because ethtool gets the bonding speed via
> >bond_ethtool_get_link_ksettings(), which does not affect the bonding
> >interface itself.
> 
> 	Well, this gets back into the intermittent discussion on whether
> or not being able to nest bonds is useful or not, and thus whether it
> should be allowed or not.  It's at best a niche use case (I don't recall
> the example configurations ever being anything other than 802.3ad under
> active-backup), and was broken for a number of years without much
> uproar.
> 
> 	In this particular case, nesting two LACP (802.3ad) bonds inside
> an active-backup bond provides no functional benefit as far as I'm aware
> (maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
> between multiple aggregators.  The "ad_select" option provides a few
> choices on the criteria for choosing the active aggregator.
> 
> 	Is there a reason the user in your case doesn't use 802.3ad mode
> directly?

Hi Jay,

I'm just back from holiday and re-read your reply. The user doesn't add 2 LACP
bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
an active-backup bond. This seems reasonable, e.g. the LACP bond on one switch
and the normal NIC on another switch.

What do you think?

Thanks
Hangbin


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-08  9:26   ` Hangbin Liu
@ 2023-05-08 18:32     ` Jay Vosburgh
  2023-05-09  3:16       ` Hangbin Liu
  2023-05-10  7:50       ` Hangbin Liu
  0 siblings, 2 replies; 11+ messages in thread
From: Jay Vosburgh @ 2023-05-08 18:32 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: netdev

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Fri, Apr 28, 2023 at 09:06:40AM -0700, Jay Vosburgh wrote:
>> Hangbin Liu <liuhangbin@gmail.com> wrote:
>> 
>> >A user reported a bonding issue: when an active-backup bond sits on top of an
>> >802.3ad bond interface and the 802.3ad bond's speed/duplex changes
>> >dynamically, the upper bonding interface's speed/duplex is not updated at
>> >the same time.
>> >
>> >This does not seem easy to fix, since we update the speed/duplex only
>> >when there is a failover (except in 802.3ad mode) or a slave netdev change.
>> >But the lower bonding interface doesn't trigger a netdev change when its speed
>> >changes, because ethtool gets the bonding speed via
>> >bond_ethtool_get_link_ksettings(), which does not affect the bonding
>> >interface itself.
>> 
>> 	Well, this gets back into the intermittent discussion on whether
>> or not being able to nest bonds is useful or not, and thus whether it
>> should be allowed or not.  It's at best a niche use case (I don't recall
>> the example configurations ever being anything other than 802.3ad under
>> active-backup), and was broken for a number of years without much
>> uproar.
>> 
>> 	In this particular case, nesting two LACP (802.3ad) bonds inside
>> an active-backup bond provides no functional benefit as far as I'm aware
>> (maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
>> between multiple aggregators.  The "ad_select" option provides a few
>> choices on the criteria for choosing the active aggregator.
>> 
>> 	Is there a reason the user in your case doesn't use 802.3ad mode
>> directly?
>
>Hi Jay,
>
>I'm just back from holiday and re-read your reply. The user doesn't add 2 LACP
>bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
>an active-backup bond. This seems reasonable, e.g. the LACP bond on one switch
>and the normal NIC on another switch.
>
>What do you think?

	That case should work fine without the active-backup.  LACP has
a concept of an "individual" port, which (in this context) would be the
"normal NIC," presuming that that means its link peer isn't running
LACP.

	If all of the ports (N that are LACP to a single switch, plus 1
that's the non-LACP "normal NIC") were attached to a single bond, it
would create one aggregator with the LACP enabled ports, and then a
separate aggregator for the individual port that's not.  The aggregator
selection logic prefers the LACP enabled aggregator over the individual
port aggregator.  The precise criteria are in the commentary within
ad_agg_selection_test().
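
	The single-bond layout described above can be sketched roughly as follows. This is an illustration only, assuming iproute2; the interface names, the helper names, and the DRY_RUN switch are hypothetical:

```shell
#!/bin/bash
# Sketch: put every port (the LACP ports plus the non-LACP "normal NIC")
# into one 802.3ad bond instead of nesting bonds.
# DRY_RUN is a hypothetical switch: when set, commands are printed
# instead of executed, so the sequence can be reviewed without root.
run() {
        if [ -n "${DRY_RUN:-}" ]; then
                echo "$*"
        else
                "$@"
        fi
}

# usage: single_lacp_bond <bond> <port>...
single_lacp_bond() {
        local bond=$1 port
        shift
        run ip link add "$bond" type bond mode 802.3ad miimon 100
        for port in "$@"; do
                # Taking the port down first is a common precaution; it
                # may not be strictly required on current kernels.
                run ip link set "$port" down
                run ip link set "$port" master "$bond"
        done
        run ip link set "$bond" up
}
```

	For example, `DRY_RUN=1 single_lacp_bond bond0 eth0 eth1 eth2` prints the command sequence without touching the system. No per-port option marks eth2 as the individual port; that is decided by whether LACP negotiation completes on it.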

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-08 18:32     ` Jay Vosburgh
@ 2023-05-09  3:16       ` Hangbin Liu
  2023-05-10  7:50       ` Hangbin Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Hangbin Liu @ 2023-05-09  3:16 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> >Hi Jay,
> >
> >I'm just back from holiday and re-read your reply. The user doesn't add 2 LACP
> >bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
> >an active-backup bond. This seems reasonable, e.g. the LACP bond on one switch
> >and the normal NIC on another switch.
> >
> >What do you think?
> 
> 	That case should work fine without the active-backup.  LACP has
> a concept of an "individual" port, which (in this context) would be the
> "normal NIC," presuming that that means its link peer isn't running
> LACP.
> 
> 	If all of the ports (N that are LACP to a single switch, plus 1
> that's the non-LACP "normal NIC") were attached to a single bond, it
> would create one aggregator with the LACP enabled ports, and then a
> separate aggregator for the individual port that's not.  The aggregator
> selection logic prefers the LACP enabled aggregator over the individual
> port aggregator.  The precise criteria are in the commentary within
> ad_agg_selection_test().
> 

Thanks for your explanation. I didn't know this before. Now I have learned.

Regards
Hangbin


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-08 18:32     ` Jay Vosburgh
  2023-05-09  3:16       ` Hangbin Liu
@ 2023-05-10  7:50       ` Hangbin Liu
  2023-05-10 16:57         ` Andrew J. Schorr
  1 sibling, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-05-10  7:50 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, Andrew Schorr

On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> >Hi Jay,
> >
> >I'm just back from holiday and re-read your reply. The user doesn't add 2 LACP
> >bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
> >an active-backup bond. This seems reasonable, e.g. the LACP bond on one switch
> >and the normal NIC on another switch.
> >
> >What do you think?
> 
> 	That case should work fine without the active-backup.  LACP has
> a concept of an "individual" port, which (in this context) would be the
> "normal NIC," presuming that that means its link peer isn't running
> LACP.
> 
> 	If all of the ports (N that are LACP to a single switch, plus 1
> that's the non-LACP "normal NIC") were attached to a single bond, it
> would create one aggregator with the LACP enabled ports, and then a
> separate aggregator for the individual port that's not.  The aggregator
> selection logic prefers the LACP enabled aggregator over the individual
> port aggregator.  The precise criteria are in the commentary within
> ad_agg_selection_test().
> 

cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
use arp_ip_target to ensure that the target network is reachable...

Hangbin


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-10  7:50       ` Hangbin Liu
@ 2023-05-10 16:57         ` Andrew J. Schorr
  2023-05-10 17:14           ` Andrew J. Schorr
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-10 16:57 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: Jay Vosburgh, netdev

Hi Hangbin & Jay,

On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
> On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> > 	That case should work fine without the active-backup.  LACP has
> > a concept of an "individual" port, which (in this context) would be the
> > "normal NIC," presuming that that means its link peer isn't running
> > LACP.
> > 
> > 	If all of the ports (N that are LACP to a single switch, plus 1
> > that's the non-LACP "normal NIC") were attached to a single bond, it
> > would create one aggregator with the LACP enabled ports, and then a
> > separate aggregator for the individual port that's not.  The aggregator
> > selection logic prefers the LACP enabled aggregator over the individual
> > port aggregator.  The precise criteria are in the commentary within
> > ad_agg_selection_test().
> > 
> 
> cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
> use arp_ip_target to ensure that the target network is reachable...

That's correct. I prefer the ARP monitoring to ensure that the needed
connectivity is actually there instead of relying on MII monitoring.

I also confess that I was unaware of the possibility of using an individual
port inside an 802.3ad bond without having to stick that individual port into a
port-channel group with LACP enabled. I want to avoid enabling LACP on that
link because I'd like to be able to PXE boot over it, not to mention the switch
configuration hassle.  Is that individual port configuration without LACP
detected automatically by the kernel, or do I need to configure something to do
that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
but it appears to depend on whether duplex is enabled. At that point, I got
lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
be a property of the bond master, not the slaves. Is there any documentation of
how this configuration works?

But in any case, I still prefer active-backup on top of 802.3ad so that I can
have the ARP monitoring.
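
For reference, a rough sketch of that nested setup, assuming iproute2 and an already configured 802.3ad bond. The device names, the 1000 ms interval, the target address (taken from a documentation range), and the DRY_RUN switch are all hypothetical placeholders:

```shell
#!/bin/bash
# Sketch: active-backup bond with ARP monitoring on top of an existing
# 802.3ad bond plus one plain NIC.
# DRY_RUN is a hypothetical switch: when set, commands are printed
# instead of executed.
run() {
        if [ -n "${DRY_RUN:-}" ]; then
                echo "$*"
        else
                "$@"
        fi
}

# usage: nested_arp_bond <upper> <lacp_bond> <nic> <arp_target_ip>
nested_arp_bond() {
        local upper=$1 lower=$2 nic=$3 target=$4
        # The upper bond uses the ARP monitor (arp_interval and
        # arp_ip_target) rather than miimon.
        run ip link add "$upper" type bond mode active-backup \
                arp_interval 1000 arp_ip_target "$target"
        run ip link set "$lower" master "$upper"
        run ip link set "$nic" master "$upper"
        run ip link set "$upper" up
}
```

Calling `DRY_RUN=1 nested_arp_bond bond1 bond0 eth2 192.0.2.1` prints the four commands for review.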

If it's too much trouble to get the top-level bond to report duplex/speed
correctly when the underlying bond speed changes, then I think it would
be an improvement to set duplex/speed to N/A (or -1) for a bond of
bonds configuration instead of potentially having incorrect information.
I imagine such a fix might be much easier than updating dynamically
when the lower-level 802.3ad bond changes speed.

Best regards,
Andy


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-10 16:57         ` Andrew J. Schorr
@ 2023-05-10 17:14           ` Andrew J. Schorr
  2023-05-12  1:38             ` Jay Vosburgh
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-10 17:14 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: Jay Vosburgh, netdev

Sorry -- resending from a different email address to fix a problem
with gmail rejecting it.

On Wed, May 10, 2023 at 12:57:38PM -0400, Andrew J. Schorr wrote:
> Hi Hangbin & Jay,
> 
> On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
> > On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> > > 	That case should work fine without the active-backup.  LACP has
> > > a concept of an "individual" port, which (in this context) would be the
> > > "normal NIC," presuming that that means its link peer isn't running
> > > LACP.
> > > 
> > > 	If all of the ports (N that are LACP to a single switch, plus 1
> > > that's the non-LACP "normal NIC") were attached to a single bond, it
> > > would create one aggregator with the LACP enabled ports, and then a
> > > separate aggregator for the individual port that's not.  The aggregator
> > > selection logic prefers the LACP enabled aggregator over the individual
> > > port aggregator.  The precise criteria are in the commentary within
> > > ad_agg_selection_test().
> > > 
> > 
> > cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
> > use arp_ip_target to ensure that the target network is reachable...
> 
> That's correct. I prefer the ARP monitoring to ensure that the needed
> connectivity is actually there instead of relying on MII monitoring.
> 
> I also confess that I was unaware of the possibility of using an individual
> port inside an 802.3ad bond without having to stick that individual port into a
> port-channel group with LACP enabled. I want to avoid enabling LACP on that
> link because I'd like to be able to PXE boot over it, not to mention the switch
> configuration hassle.  Is that individual port configuration without LACP
> detected automatically by the kernel, or do I need to configure something to do
> that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
> but it appears to depend on whether duplex is enabled. At that point, I got
> lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
> be a property of the bond master, not the slaves. Is there any documentation of
> how this configuration works?
> 
> But in any case, I still prefer active-backup on top of 802.3ad so that I can
> have the ARP monitoring.
> 
> If it's too much trouble to get the top-level bond to report duplex/speed
> correctly when the underlying bond speed changes, then I think it would
> be an improvement to set duplex/speed to N/A (or -1) for a bond of
> bonds configuration instead of potentially having incorrect information.
> I imagine such a fix might be much easier than updating dynamically
> when the lower-level 802.3ad bond changes speed.
> 
> Best regards,
> Andy


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-10 17:14           ` Andrew J. Schorr
@ 2023-05-12  1:38             ` Jay Vosburgh
  2023-05-12 14:44               ` Andrew J. Schorr
  0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2023-05-12  1:38 UTC (permalink / raw)
  To: Andrew J. Schorr; +Cc: Hangbin Liu, netdev

Andrew J. Schorr <aschorr@telemetry-investments.com> wrote:

>Sorry -- resending from a different email address to fix a problem
>with gmail rejecting it.
>
>On Wed, May 10, 2023 at 12:57:38PM -0400, Andrew J. Schorr wrote:
>> Hi Hangbin & Jay,
>> 
>> On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
>> > On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
>> > > 	That case should work fine without the active-backup.  LACP has
>> > > a concept of an "individual" port, which (in this context) would be the
>> > > "normal NIC," presuming that that means its link peer isn't running
>> > > LACP.
>> > > 
>> > > 	If all of the ports (N that are LACP to a single switch, plus 1
>> > > that's the non-LACP "normal NIC") were attached to a single bond, it
>> > > would create one aggregator with the LACP enabled ports, and then a
>> > > separate aggregator for the individual port that's not.  The aggregator
>> > > selection logic prefers the LACP enabled aggregator over the individual
>> > > port aggregator.  The precise criteria are in the commentary within
>> > > ad_agg_selection_test().
>> > > 
>> > 
>> > cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
>> > use arp_ip_target to ensure that the target network is reachable...
>> 
>> That's correct. I prefer the ARP monitoring to ensure that the needed
>> connectivity is actually there instead of relying on MII monitoring.
>> 
>> I also confess that I was unaware of the possibility of using an individual
>> port inside an 802.3ad bond without having to stick that individual port into a
>> port-channel group with LACP enabled. I want to avoid enabling LACP on that
>> link because I'd like to be able to PXE boot over it, not to mention the switch
>> configuration hassle.  Is that individual port configuration without LACP
>> detected automatically by the kernel, or do I need to configure something to do
>> that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
>> but it appears to depend on whether duplex is enabled. At that point, I got
>> lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
>> be a property of the bond master, not the slaves. Is there any documentation of
>> how this configuration works?

	The individual port behavior is part of the LACP standard (IEEE
802.1AX, recent editions call this "Solitary"), and is done
automatically by the kernel.  One of the reasons for it is to permit
exactly the situation you mention: to enable PXE or "fallback"
communication to work even if LACP negotiation fails or is not
configured or implemented at one end.  This is called out explicitly in
802.1AX, 6.1.1.j.

	The duplex test is only part of the "individual" logic; it comes
up because LACP negotiation requires the peers to be point-to-point
links, i.e., full duplex (IEEE 802.1AX-2014, 6.4.8).  That's the norm
for most everything now, but historically a port in half duplex could be
on a multiple access topology, e.g., 802.3 CSMA/CD 10BASE2 on a coax
cable, which is incompatible with LACP aggregation.  This situation
doesn't come up a lot these days.

	The important part of the "individual" logic is whether or not
the port successfully completes LACP negotiation with a link partner.
If not, the port is an individual port, which acts essentially like an
aggregator with just one port in it.  This is separate from
"is_individual" in the bonding code, and happens in
ad_port_selection_logic(), after the comment "check if current
aggregator suits us".  "is_individual" is one element of this test, the
remaining tests compare the various keys and whether the partner MAC
address has been populated.
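
	One way to see which aggregator each port ended up in is to read the bond's /proc file. The sketch below assumes the "Slave Interface:" and "Aggregator ID:" lines that /proc/net/bonding/<bond> prints in 802.3ad mode; the function name is hypothetical:

```shell
#!/bin/bash
# Sketch: list "<slave> <aggregator id>" pairs for an 802.3ad bond.
# usage: slave_aggregators [file]   (defaults to bond0's /proc entry)
slave_aggregators() {
        awk '
                /^Slave Interface:/ { slave = $3 }
                # Only use "Aggregator ID" lines that follow a slave header;
                # the earlier one belongs to the active aggregator summary.
                /Aggregator ID:/ && slave != "" { print slave, $NF; slave = "" }
        ' "${1:-/proc/net/bonding/bond0}"
}
```

	A port that did not complete LACP negotiation would be expected to appear alone under its own aggregator ID.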

	As far as documentation goes, the bonding docs[0] describe some
of the parameters, but don't describe the specifics of bonding's
ability to manage multiple aggregators; I should write that up, since
this comes up periodically.  The IEEE standard (to which the bonding
implementation conforms) describes how the whole system works, but
doesn't really have a simple overview.

[0] https://www.kernel.org/doc/Documentation/networking/bonding.rst

>> But in any case, I still prefer active-backup on top of 802.3ad so that I can
>> have the ARP monitoring.
>> 
>> If it's too much trouble to get the top-level bond to report duplex/speed
>> correctly when the underlying bond speed changes, then I think it would
>> be an improvement to set duplex/speed to N/A (or -1) for a bond of
>> bonds configuration instead of potentially having incorrect information.
>> I imagine such a fix might be much easier than updating dynamically
>> when the lower-level 802.3ad bond changes speed.

	I'll have to give this some thought.  The best long term
solution would be to decouple the link monitoring stuff from the mode,
and thus allow ARP and MII in a wider variety of modes.  I've prototyped
that out in the past, along with changing the MII monitor to respond to
carrier state changes in real time instead of polling, and it's fairly
complicated.

	In any event, this does sound like a valid use case for nesting
the bonds, so simply disabling that facility seems to be off the table.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-12  1:38             ` Jay Vosburgh
@ 2023-05-12 14:44               ` Andrew J. Schorr
  2023-05-16 15:11                 ` Andrew J. Schorr
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-12 14:44 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Hangbin Liu, netdev

Hi Jay,

On Thu, May 11, 2023 at 06:38:48PM -0700, Jay Vosburgh wrote:
> 	The individual port behavior is part of the LACP standard (IEEE
> 802.1AX, recent editions call this "Solitary"), and is done
> automatically by the kernel.  One of the reasons for it is to permit
> exactly the situation you mention: to enable PXE or "fallback"
> communication to work even if LACP negotiation fails or is not
> configured or implemented at one end.  This is called out explicitly in
> 802.1AX, 6.1.1.j.
> 
> 	The duplex test is only part of the "individual" logic; it comes
> up because LACP negotiation requires the peers to be point-to-point
> links, i.e., full duplex (IEEE 802.1AX-2014, 6.4.8).  That's the norm
> for most everything now, but historically a port in half duplex could be
> on a multiple access topology, e.g., 802.3 CSMA/CD 10BASE2 on a coax
> cable, which is incompatible with LACP aggregation.  This situation
> doesn't come up a lot these days.
> 
> 	The important part of the "individual" logic is whether or not
> the port successfully completes LACP negotiation with a link partner.
> If not, the port is an individual port, which acts essentially like an
> aggregator with just one port in it.  This is separate from
> "is_individual" in the bonding code, and happens in
> ad_port_selection_logic(), after the comment "check if current
> aggregator suits us".  "is_individual" is one element of this test, the
> remaining tests compare the various keys and whether the partner MAC
> address has been populated.

OK. So it sounds like this should just work automatically with no
configuration required to identify which slaves are running in individual
mode. Thanks for clarifying.

> 	As far as documentation goes, the bonding docs[0] describe some
> of the parameters, but don't describe the specifics of bonding's
> ability to manage multiple aggregators; I should write that up, since
> this comes up periodically.  The IEEE standard (to which the bonding
> implementation conforms) describes how the whole system works, but
> doesn't really have a simple overview.
> 
> [0] https://www.kernel.org/doc/Documentation/networking/bonding.rst

I noticed the parameters related to this and did do some google searching to
learn about having multiple aggregators, but as you say, it would be
helpful to have a few more clues about how this works in the Bonding Howto,
as well as a mention of this individual port capability.

> 	I'll have to give this some thought.  The best long term
> solution would be to decouple the link monitoring stuff from the mode,
> and thus allow ARP and MII in a wider variety of modes.  I've prototyped
> that out in the past, along with changing the MII monitor to respond to
> carrier state changes in real time instead of polling, and it's fairly
> complicated.
> 
> 	In any event, this does sound like a valid use case for nesting
> the bonds, so simply disabling that facility seems to be off the table.

OK, great. Then I'll stick with this config for now, even though NetworkManager
has some brain damage in this area, since it tries to bring up both bonds
before the MAC addresses have gotten sorted out, which can leave everything
with a random MAC address. I've managed to kludge a solution to this by setting
ONBOOT=no for the active-backup bond, which convinces NetworkManager to start
it a bit later and somehow fixes the race condition.

Regards,
Andy


* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
  2023-05-12 14:44               ` Andrew J. Schorr
@ 2023-05-16 15:11                 ` Andrew J. Schorr
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-16 15:11 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Hangbin Liu, netdev

Hi,

On Fri, May 12, 2023 at 10:44:01AM -0400, Andrew J. Schorr wrote:
> OK. So it sounds like this should just work automatically with no
> configuration required to identify which slaves are running in individual
> mode. Thanks for clarifying.

Just to follow up on this -- for test purposes, I booted the system with the 
802.3ad bond containing both the 20 Gbps port-channel and the individual
1 Gbps port on the other switch, and it worked as expected. The only
drawback to this configuration is the lack of ARP monitoring, so I will
stick with the active-backup bond on top of the 802.3ad bond.

Regards,
Andy
