All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] bonded interfaces drop bpdu (stp) frames
@ 2018-07-11 22:23 Michal Soltys
  2018-07-11 23:22 ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Soltys @ 2018-07-11 22:23 UTC (permalink / raw)
  To: netdev

Hi,

As weird as that sounds, this is what I observed today after bumping
kernel version. I have a setup where 2 bonds are attached to linux
bridge and physically are connected to two switches doing MSTP (and
linux bridge is just passing them).

Initially I suspected some changes related to bridge code - but quick
peek at the code showed nothing suspicious - and the part of it that
explicitly passes stp frames if stp is not enabled has seen little
changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
regular non-bonded interfaces are attached everything works fine.

Just to be sure I detached the bond (802.3ad mode) and checked it with
simple tcpdump (ether proto \\stp) - and indeed no hello packets were
there (with them being present just fine on active enslaved interface,
or on the bond device in earlier kernels).

If time permits I'll bisect tommorow to pinpoint the commit, but from
quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
debian) and 4.17.3 (tested on archlinux) are failing.

Unless this is already a known issue (or you have any suggestions what
could be responsible).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-11 22:23 [BUG] bonded interfaces drop bpdu (stp) frames Michal Soltys
@ 2018-07-11 23:22 ` Mahesh Bandewar (महेश बंडेवार)
  2018-07-12 14:51   ` Jay Vosburgh
  0 siblings, 1 reply; 11+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-07-11 23:22 UTC (permalink / raw)
  To: Michal Soltys; +Cc: linux-netdev

On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>
> Hi,
>
> As weird as that sounds, this is what I observed today after bumping
> kernel version. I have a setup where 2 bonds are attached to linux
> bridge and physically are connected to two switches doing MSTP (and
> linux bridge is just passing them).
>
> Initially I suspected some changes related to bridge code - but quick
> peek at the code showed nothing suspicious - and the part of it that
> explicitly passes stp frames if stp is not enabled has seen little
> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
> regular non-bonded interfaces are attached everything works fine.
>
> Just to be sure I detached the bond (802.3ad mode) and checked it with
> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
> there (with them being present just fine on active enslaved interface,
> or on the bond device in earlier kernels).
>
> If time permits I'll bisect tommorow to pinpoint the commit, but from
> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
> debian) and 4.17.3 (tested on archlinux) are failing.
>
> Unless this is already a known issue (or you have any suggestions what
> could be responsible).
>
I believe these are link-local-multicast messages and sometime back a
change went into to not pass those frames to the bonding master. This
could be the side effect of that.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-11 23:22 ` Mahesh Bandewar (महेश बंडेवार)
@ 2018-07-12 14:51   ` Jay Vosburgh
  2018-07-12 16:37     ` Michal Soltys
  0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2018-07-12 14:51 UTC (permalink / raw)
  To: =?UTF-8?B?TWFoZXNoIEJhbmRld2FyICjgpK7gpLngpYfgpLYg4KSs4KSC4KSh4KWH4KS14KS+4KSwKQ==?=
  Cc: Michal Soltys, linux-netdev

Mahesh Bandewar (महेश बंडेवार) wrote:

>On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>
>> Hi,
>>
>> As weird as that sounds, this is what I observed today after bumping
>> kernel version. I have a setup where 2 bonds are attached to linux
>> bridge and physically are connected to two switches doing MSTP (and
>> linux bridge is just passing them).
>>
>> Initially I suspected some changes related to bridge code - but quick
>> peek at the code showed nothing suspicious - and the part of it that
>> explicitly passes stp frames if stp is not enabled has seen little
>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>> regular non-bonded interfaces are attached everything works fine.
>>
>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>> there (with them being present just fine on active enslaved interface,
>> or on the bond device in earlier kernels).
>>
>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>> debian) and 4.17.3 (tested on archlinux) are failing.
>>
>> Unless this is already a known issue (or you have any suggestions what
>> could be responsible).
>>
>I believe these are link-local-multicast messages and sometime back a
>change went into to not pass those frames to the bonding master. This
>could be the side effect of that.

	Mahesh, I suspect you're thinking of:

commit b89f04c61efe3b7756434d693b9203cc0cce002e
Author: Chonggang Li <chonggangli@google.com>
Date:   Sun Apr 16 12:02:18 2017 -0700

    bonding: deliver link-local packets with skb->dev set to link that packets arrived on

	Michal, are you able to revert this patch and test?

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 14:51   ` Jay Vosburgh
@ 2018-07-12 16:37     ` Michal Soltys
  2018-07-12 18:03       ` Jay Vosburgh
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Soltys @ 2018-07-12 16:37 UTC (permalink / raw)
  To: Jay Vosburgh,
	Mahesh Bandewar (महेश
	बंडेवार)
  Cc: linux-netdev

On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
> Mahesh Bandewar (महेश बंडेवार) wrote:
> 
>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>
>>> Hi,
>>>
>>> As weird as that sounds, this is what I observed today after bumping
>>> kernel version. I have a setup where 2 bonds are attached to linux
>>> bridge and physically are connected to two switches doing MSTP (and
>>> linux bridge is just passing them).
>>>
>>> Initially I suspected some changes related to bridge code - but quick
>>> peek at the code showed nothing suspicious - and the part of it that
>>> explicitly passes stp frames if stp is not enabled has seen little
>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>> regular non-bonded interfaces are attached everything works fine.
>>>
>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>> there (with them being present just fine on active enslaved interface,
>>> or on the bond device in earlier kernels).
>>>
>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>
>>> Unless this is already a known issue (or you have any suggestions what
>>> could be responsible).
>>>
>> I believe these are link-local-multicast messages and sometime back a
>> change went into to not pass those frames to the bonding master. This
>> could be the side effect of that.
> 
> 	Mahesh, I suspect you're thinking of:
> 
> commit b89f04c61efe3b7756434d693b9203cc0cce002e
> Author: Chonggang Li <chonggangli@google.com>
> Date:   Sun Apr 16 12:02:18 2017 -0700
> 
>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
> 
> 	Michal, are you able to revert this patch and test?
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
> 


Just tested - yes, reverting that patch solves the issues.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 16:37     ` Michal Soltys
@ 2018-07-12 18:03       ` Jay Vosburgh
  2018-07-12 21:26         ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2018-07-12 18:03 UTC (permalink / raw)
  To: Michal Soltys, Chonggang Li
  Cc: =?UTF-8?B?TWFoZXNoIEJhbmRld2FyICjgpK7gpLngpYfgpLYg4KSs4KSC4KSh4KWH4KS14KS+?=
	=?UTF-8?B?4KSwKQ==?=,
	linux-netdev

Michal Soltys <soltys@ziu.info> wrote:

>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>
>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>
>>>> Hi,
>>>>
>>>> As weird as that sounds, this is what I observed today after bumping
>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>> bridge and physically are connected to two switches doing MSTP (and
>>>> linux bridge is just passing them).
>>>>
>>>> Initially I suspected some changes related to bridge code - but quick
>>>> peek at the code showed nothing suspicious - and the part of it that
>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>> regular non-bonded interfaces are attached everything works fine.
>>>>
>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>> there (with them being present just fine on active enslaved interface,
>>>> or on the bond device in earlier kernels).
>>>>
>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>
>>>> Unless this is already a known issue (or you have any suggestions what
>>>> could be responsible).
>>>>
>>> I believe these are link-local-multicast messages and sometime back a
>>> change went into to not pass those frames to the bonding master. This
>>> could be the side effect of that.
>>
>> 	Mahesh, I suspect you're thinking of:
>>
>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>> Author: Chonggang Li <chonggangli@google.com>
>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>
>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>
>> 	Michal, are you able to revert this patch and test?
>>
>> 	-J
>>
>> ---
>> 	-Jay Vosburgh, jay.vosburgh@canonical.com
>>
>
>
>Just tested - yes, reverting that patch solves the issues.

	Chonggang,

	Reading the changelog in your commit referenced above, I'm not
entirely sure what actual problem it is fixing.  Could you elaborate?

	As the patch appears to cause a regression, it needs to be
either fixed or reverted.

	Mahesh, you signed-off on it as well, perhaps you also have some
context?

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 18:03       ` Jay Vosburgh
@ 2018-07-12 21:26         ` Mahesh Bandewar (महेश बंडेवार)
  2018-07-12 22:03           ` Jay Vosburgh
  2018-07-12 22:14           ` Michal Soltys
  0 siblings, 2 replies; 11+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-07-12 21:26 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Michal Soltys, Chonggang Li, linux-netdev

On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
<jay.vosburgh@canonical.com> wrote:
> Michal Soltys <soltys@ziu.info> wrote:
>
>>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>
>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>> linux bridge is just passing them).
>>>>>
>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>
>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>> there (with them being present just fine on active enslaved interface,
>>>>> or on the bond device in earlier kernels).
>>>>>
>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>
>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>> could be responsible).
>>>>>
>>>> I believe these are link-local-multicast messages and sometime back a
>>>> change went into to not pass those frames to the bonding master. This
>>>> could be the side effect of that.
>>>
>>>      Mahesh, I suspect you're thinking of:
>>>
>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>> Author: Chonggang Li <chonggangli@google.com>
>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>
>>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>
>>>      Michal, are you able to revert this patch and test?
>>>
>>>      -J
>>>
>>> ---
>>>      -Jay Vosburgh, jay.vosburgh@canonical.com
>>>
>>
>>
>>Just tested - yes, reverting that patch solves the issues.
>
>         Chonggang,
>
>         Reading the changelog in your commit referenced above, I'm not
> entirely sure what actual problem it is fixing.  Could you elaborate?
>
>         As the patch appears to cause a regression, it needs to be
> either fixed or reverted.
>
>         Mahesh, you signed-off on it as well, perhaps you also have some
> context?
>

I think the original idea behind it was to pass the LLDPDUs to the
stack on the interface that they came on since this is considered to
be link-local traffic and passing to bond-master would loose it's
"linklocal-ness". This is true for LLDP and if you change the skb->dev
of the packet, then you don't know which slave link it came on in
(from LLDP consumer's perspective).

I don't know much about STP but trunking two links and aggregating
this link info through bond-master seems wrong. Just like LLDP, you
are losing info specific to a link and the decision derived from that
info could be wrong.

Having said that, we determine "linklocal-ness" by looking at L2 and
bondmaster shares this with lts slaves. So it does seem fair to pass
those frames to the bonding-master but at the same time link-local
traffic is supposed to be limited to the physical link (LLDP/STP/LACP
etc). Your thoughts?


>         -J
>
> ---
>         -Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 21:26         ` Mahesh Bandewar (महेश बंडेवार)
@ 2018-07-12 22:03           ` Jay Vosburgh
  2018-07-12 23:14             ` Michal Soltys
  2018-07-12 22:14           ` Michal Soltys
  1 sibling, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2018-07-12 22:03 UTC (permalink / raw)
  To: =?UTF-8?B?TWFoZXNoIEJhbmRld2FyICjgpK7gpLngpYfgpLYg4KSs4KSC4KSh4KWH4KS14KS+4KSwKQ==?=
  Cc: Michal Soltys, Chonggang Li, linux-netdev

Mahesh Bandewar (महेश बंडेवार) wrote:

>On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
><jay.vosburgh@canonical.com> wrote:
>> Michal Soltys <soltys@ziu.info> wrote:
>>
>>>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>
>>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>>> linux bridge is just passing them).
>>>>>>
>>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>>
>>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>>> there (with them being present just fine on active enslaved interface,
>>>>>> or on the bond device in earlier kernels).
>>>>>>
>>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>>
>>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>>> could be responsible).
>>>>>>
>>>>> I believe these are link-local-multicast messages and sometime back a
>>>>> change went into to not pass those frames to the bonding master. This
>>>>> could be the side effect of that.
>>>>
>>>>      Mahesh, I suspect you're thinking of:
>>>>
>>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>>> Author: Chonggang Li <chonggangli@google.com>
>>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>>
>>>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>>
>>>>      Michal, are you able to revert this patch and test?
>>>>
>>>>      -J
>>>>
>>>> ---
>>>>      -Jay Vosburgh, jay.vosburgh@canonical.com
>>>>
>>>
>>>
>>>Just tested - yes, reverting that patch solves the issues.
>>
>>         Chonggang,
>>
>>         Reading the changelog in your commit referenced above, I'm not
>> entirely sure what actual problem it is fixing.  Could you elaborate?
>>
>>         As the patch appears to cause a regression, it needs to be
>> either fixed or reverted.
>>
>>         Mahesh, you signed-off on it as well, perhaps you also have some
>> context?
>>
>
>I think the original idea behind it was to pass the LLDPDUs to the
>stack on the interface that they came on since this is considered to
>be link-local traffic and passing to bond-master would loose it's
>"linklocal-ness". This is true for LLDP and if you change the skb->dev
>of the packet, then you don't know which slave link it came on in
>(from LLDP consumer's perspective).
>
>I don't know much about STP but trunking two links and aggregating
>this link info through bond-master seems wrong. Just like LLDP, you
>are losing info specific to a link and the decision derived from that
>info could be wrong.
>
>Having said that, we determine "linklocal-ness" by looking at L2 and
>bondmaster shares this with lts slaves. So it does seem fair to pass
>those frames to the bonding-master but at the same time link-local
>traffic is supposed to be limited to the physical link (LLDP/STP/LACP
>etc). Your thoughts?

	I agree the whole thing sounds kind of weird, but I'm curious as
to what Michal's actual use case is; he presumably has some practical
use for this, since he noticed that the behavior changed.

	Michal, you mentioned MSTP and using 802.3ad (LACP) mode; how
does that combination work rationally given that the bond might send and
receive traffic across multiple slaves?  Or does the switch side bundle
the ports together into a single logical interface for MSTP purposes?
On the TX side, I think the bond will likely balance all STP frames to
just one slave.

	As for a resolution, presuming that Michal has some reasonable
use case, I'm thinking along the lines of reverting the new (leave frame
attached to slave) behavior for the general case and adding a special
case for LLDP and friends to get the new behavior.  I'd like to avoid
adding any new options to bonding.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 21:26         ` Mahesh Bandewar (महेश बंडेवार)
  2018-07-12 22:03           ` Jay Vosburgh
@ 2018-07-12 22:14           ` Michal Soltys
  1 sibling, 0 replies; 11+ messages in thread
From: Michal Soltys @ 2018-07-12 22:14 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार),
	Jay Vosburgh
  Cc: Chonggang Li, linux-netdev

On 2018-07-12 23:26, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
> <jay.vosburgh@canonical.com> wrote:
>> Michal Soltys <soltys@ziu.info> wrote:
>>
>>>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>
>>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>>> linux bridge is just passing them).
>>>>>>
>>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>>
>>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>>> there (with them being present just fine on active enslaved interface,
>>>>>> or on the bond device in earlier kernels).
>>>>>>
>>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>>
>>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>>> could be responsible).
>>>>>>
>>>>> I believe these are link-local-multicast messages and sometime back a
>>>>> change went into to not pass those frames to the bonding master. This
>>>>> could be the side effect of that.
>>>>
>>>>      Mahesh, I suspect you're thinking of:
>>>>
>>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>>> Author: Chonggang Li <chonggangli@google.com>
>>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>>
>>>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>>
>>>>      Michal, are you able to revert this patch and test?
>>>>
>>>>      -J
>>>>
>>>> ---
>>>>      -Jay Vosburgh, jay.vosburgh@canonical.com
>>>>
>>>
>>>
>>>Just tested - yes, reverting that patch solves the issues.
>>
>>         Chonggang,
>>
>>         Reading the changelog in your commit referenced above, I'm not
>> entirely sure what actual problem it is fixing.  Could you elaborate?
>>
>>         As the patch appears to cause a regression, it needs to be
>> either fixed or reverted.
>>
>>         Mahesh, you signed-off on it as well, perhaps you also have some
>> context?
>>
> 
> I think the original idea behind it was to pass the LLDPDUs to the
> stack on the interface that they came on since this is considered to
> be link-local traffic and passing to bond-master would loose it's
> "linklocal-ness". This is true for LLDP and if you change the skb->dev
> of the packet, then you don't know which slave link it came on in
> (from LLDP consumer's perspective).
> 
> I don't know much about STP but trunking two links and aggregating
> this link info through bond-master seems wrong. Just like LLDP, you
> are losing info specific to a link and the decision derived from that
> info could be wrong.
> 
> Having said that, we determine "linklocal-ness" by looking at L2 and
> bondmaster shares this with lts slaves. So it does seem fair to pass
> those frames to the bonding-master but at the same time link-local
> traffic is supposed to be limited to the physical link (LLDP/STP/LACP
> etc). Your thoughts?
> 

But, isn't bond de-facto considered the "physical link" ? Not directly
of course, but say an LLDP daemon would likely be more interested in
getting LLDP data from a bond device (or a bridge device, if the bond is
attached to one), than from its enslaved interfaces (and enslaved
interfaces can be changed, not mentioning potentially complex setup
itself, even if usually it's just lacp&go ).

ITOW, blocking link-local multicasts on bond level (among those - bpdu,
pae, lldp) is a bit like if the interface itself hid LACP before bond code.

A few other examples:

- putting bonds in a bridge is pretty normal thing - and whether the
bridge interpretes the spanning tree data itself (via in-kernel classic
stp or userspace daemon for e.g. rstp) or passes the trafic, it must see
the BPDU frames. Otherwise it becomes blind to the whole spanning tree
protocol - and implicitly other switches around - real or virtual ones.
It's literally instant loop disaster. br_input.c specifically takes care
to pass those frames if the bridge has stp turned off

- "group_fwd_mask" (again in bridge context) has been added to bridge
code - and recently as a per-port knob as well - to specifically allow
the control of what kind of "link-local" stuff is passed or not. LLDP
and 802.1X PAE were, afaik, the main reasons for that sysfs variable.
The per-port setting is even more relaxed (iirc, only pause frames are
not passable)

- LLDP daemon example - as above

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 22:03           ` Jay Vosburgh
@ 2018-07-12 23:14             ` Michal Soltys
  2018-07-13  0:15               ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Soltys @ 2018-07-12 23:14 UTC (permalink / raw)
  To: Jay Vosburgh,
	Mahesh Bandewar (महेश
	बंडेवार)
  Cc: Chonggang Li, linux-netdev

On 2018-07-13 00:03, Jay Vosburgh wrote:
> Mahesh Bandewar (महेश बंडेवार) wrote:
> 
>>On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
>><jay.vosburgh@canonical.com> wrote:
>>> Michal Soltys <soltys@ziu.info> wrote:
>>>
>>>>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>>
>>>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>>>> linux bridge is just passing them).
>>>>>>>
>>>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>>>
>>>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>>>> there (with them being present just fine on active enslaved interface,
>>>>>>> or on the bond device in earlier kernels).
>>>>>>>
>>>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>>>
>>>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>>>> could be responsible).
>>>>>>>
>>>>>> I believe these are link-local-multicast messages and sometime back a
>>>>>> change went into to not pass those frames to the bonding master. This
>>>>>> could be the side effect of that.
>>>>>
>>>>>      Mahesh, I suspect you're thinking of:
>>>>>
>>>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>>>> Author: Chonggang Li <chonggangli@google.com>
>>>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>>>
>>>>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>>>
>>>>>      Michal, are you able to revert this patch and test?
>>>>>
>>>>>      -J
>>>>>
>>>>> ---
>>>>>      -Jay Vosburgh, jay.vosburgh@canonical.com
>>>>>
>>>>
>>>>
>>>>Just tested - yes, reverting that patch solves the issues.
>>>
>>>         Chonggang,
>>>
>>>         Reading the changelog in your commit referenced above, I'm not
>>> entirely sure what actual problem it is fixing.  Could you elaborate?
>>>
>>>         As the patch appears to cause a regression, it needs to be
>>> either fixed or reverted.
>>>
>>>         Mahesh, you signed-off on it as well, perhaps you also have some
>>> context?
>>>
>>
>>I think the original idea behind it was to pass the LLDPDUs to the
>>stack on the interface that they came on since this is considered to
>>be link-local traffic and passing to bond-master would loose it's
>>"linklocal-ness". This is true for LLDP and if you change the skb->dev
>>of the packet, then you don't know which slave link it came on in
>>(from LLDP consumer's perspective).
>>
>>I don't know much about STP but trunking two links and aggregating
>>this link info through bond-master seems wrong. Just like LLDP, you
>>are losing info specific to a link and the decision derived from that
>>info could be wrong.
>>
>>Having said that, we determine "linklocal-ness" by looking at L2 and
>>bondmaster shares this with lts slaves. So it does seem fair to pass
>>those frames to the bonding-master but at the same time link-local
>>traffic is supposed to be limited to the physical link (LLDP/STP/LACP
>>etc). Your thoughts?
> 
> 	I agree the whole thing sounds kind of weird, but I'm curious as
> to what Michal's actual use case is; he presumably has some practical
> use for this, since he noticed that the behavior changed.
> 

The whole "link-local" term is a bit I don't know - at this point it
feels like too many things were thrown into single bag and it got
somewhat confusing (bpdu, lldp, pause frames, lacp, pae, qinq mulitcast
that afaik has its own address) - I added some examples in another reply
I did at the same time as you were typing this one =)

> 	Michal, you mentioned MSTP and using 802.3ad (LACP) mode; how
> does that combination work rationally given that the bond might send and
> receive traffic across multiple slaves?  Or does the switch side bundle
> the ports together into a single logical interface for MSTP purposes?
> On the TX side, I think the bond will likely balance all STP frames to
> just one slave.
> 

The basic concept - two "main" switches with "important" machines
connected to those. One switch dies and everything keeps working. With
no unused ports and so on.

In more details:

Originally I was trying MSTP daemon (on "important" machines) which
seems quite well and completely coded, but cannot really work correctly
- as afaik you can't put port (in linux bridge conext) in different
forwarding/blocking/etc. state per-region - itow per group of vlans (or
mstpd didn't know how to do that, or it wasn't implemented - I didn't
look too deep back then, though my interest resurfaced in recent days).

So that option was out of the question. But any switch, real or not,
/must/ pass bpdu frames if it doesn't interpret them. So instead of
having active mstp participant, we have passive linux bridge that passes
the frames and the two real switches around that care of mstp, treating
the linux as a shared segment. The costs/priorities/etc. on the real
swtiches are set so one bond handles two regions, and the other bond
handles other two regions. If any of the real switches dies or is taken
down for e.g. firmware update - the bond going to the other switch
handles all four regions (failover is of course not as fast as with
active rstp/mstp participation, but works quite well none the less -
around 10s after some tuning).

We could have used RSTP for that purpose as well - but that being all or
nothing in context of per-port blocking/forwarding, would leave half of
the ports unused - and we wanted to avoid that (that's why MSTP was
created after all).

Instead of using 2 bonds (2 interfaces each) we could just use 4
interfaces directly, one per region. But two of those regions see very
little traffic, so we put more and less active regions in pairs.

> 	As for a resolution, presuming that Michal has some reasonable
> use case, I'm thinking along the lines of reverting the new (leave frame
> attached to slave) behavior for the general case and adding a special
> case for LLDP and friends to get the new behavior.  I'd like to avoid
> adding any new options to bonding.
> 

My use case aside, this will cause issues for anyone attaching bond
(instead of direct interface or veth) to a bridge and doing something
more complex with it - whether related to stp or to selectively passing
e.g. lldp using group_fwd_mask sysfs. Or having LLDP daemon (e.g.
systemd-resolvd to not look far away) told to do LLDP on bond device
(even most basic active-backup case) and remaining blind. Or anything
else that expects to see/pass those multicasts on/via bonded device
(which is just a convenient way to create virtual interface out of real
interfaces after all - ITOW shouldn't probably make any calls in this
regard).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-12 23:14             ` Michal Soltys
@ 2018-07-13  0:15               ` Mahesh Bandewar (महेश बंडेवार)
  2018-07-14 15:48                 ` Michal Soltys
  0 siblings, 1 reply; 11+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-07-13  0:15 UTC (permalink / raw)
  To: Michal Soltys; +Cc: Jay Vosburgh, Chonggang Li, linux-netdev

On Thu, Jul 12, 2018 at 4:14 PM, Michal Soltys <soltys@ziu.info> wrote:
> On 2018-07-13 00:03, Jay Vosburgh wrote:
>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>
>>>On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
>>><jay.vosburgh@canonical.com> wrote:
>>>> Michal Soltys <soltys@ziu.info> wrote:
>>>>
>>>>>On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>>>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>>>
>>>>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>>>>> linux bridge is just passing them).
>>>>>>>>
>>>>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>>>>
>>>>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>>>>> there (with them being present just fine on active enslaved interface,
>>>>>>>> or on the bond device in earlier kernels).
>>>>>>>>
>>>>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>>>>
>>>>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>>>>> could be responsible).
>>>>>>>>
>>>>>>> I believe these are link-local-multicast messages and sometime back a
>>>>>>> change went into to not pass those frames to the bonding master. This
>>>>>>> could be the side effect of that.
>>>>>>
>>>>>>      Mahesh, I suspect you're thinking of:
>>>>>>
>>>>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>>>>> Author: Chonggang Li <chonggangli@google.com>
>>>>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>>>>
>>>>>>      bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>>>>
>>>>>>      Michal, are you able to revert this patch and test?
>>>>>>
>>>>>>      -J
>>>>>>
>>>>>> ---
>>>>>>      -Jay Vosburgh, jay.vosburgh@canonical.com
>>>>>>
>>>>>
>>>>>
>>>>>Just tested - yes, reverting that patch solves the issues.
>>>>
>>>>         Chonggang,
>>>>
>>>>         Reading the changelog in your commit referenced above, I'm not
>>>> entirely sure what actual problem it is fixing.  Could you elaborate?
>>>>
>>>>         As the patch appears to cause a regression, it needs to be
>>>> either fixed or reverted.
>>>>
>>>>         Mahesh, you signed-off on it as well, perhaps you also have some
>>>> context?
>>>>
>>>
>>>I think the original idea behind it was to pass the LLDPDUs to the
>>>stack on the interface that they came on since this is considered to
>>>be link-local traffic and passing to bond-master would loose it's
>>>"linklocal-ness". This is true for LLDP and if you change the skb->dev
>>>of the packet, then you don't know which slave link it came on in
>>>(from LLDP consumer's perspective).
>>>
>>>I don't know much about STP but trunking two links and aggregating
>>>this link info through bond-master seems wrong. Just like LLDP, you
>>>are losing info specific to a link and the decision derived from that
>>>info could be wrong.
>>>
>>>Having said that, we determine "linklocal-ness" by looking at L2 and
>>>bondmaster shares this with lts slaves. So it does seem fair to pass
>>>those frames to the bonding-master but at the same time link-local
>>>traffic is supposed to be limited to the physical link (LLDP/STP/LACP
>>>etc). Your thoughts?
>>
>>       I agree the whole thing sounds kind of weird, but I'm curious as
>> to what Michal's actual use case is; he presumably has some practical
>> use for this, since he noticed that the behavior changed.
>>
>
> The whole "link-local" term is a bit I don't know - at this point it
> feels like too many things were thrown into single bag and it got
> somewhat confusing (bpdu, lldp, pause frames, lacp, pae, qinq mulitcast
> that afaik has its own address) - I added some examples in another reply
> I did at the same time as you were typing this one =)
>
>>       Michal, you mentioned MSTP and using 802.3ad (LACP) mode; how
>> does that combination work rationally given that the bond might send and
>> receive traffic across multiple slaves?  Or does the switch side bundle
>> the ports together into a single logical interface for MSTP purposes?
>> On the TX side, I think the bond will likely balance all STP frames to
>> just one slave.
>>
>
> The basic concept - two "main" switches with "important" machines
> connected to those. One switch dies and everything keeps working. With
> no unused ports and so on.
>
> In more details:
>
> Originally I was trying MSTP daemon (on "important" machines) which
> seems quite well and completely coded, but cannot really work correctly
> - as afaik you can't put port (in linux bridge conext) in different
> forwarding/blocking/etc. state per-region - itow per group of vlans (or
> mstpd didn't know how to do that, or it wasn't implemented - I didn't
> look too deep back then, though my interest resurfaced in recent days).
>
> So that option was out of the question. But any switch, real or not,
> /must/ pass bpdu frames if it doesn't interpret them.

I think it's contrary! These (STP) frames are link-local-multicast
frames i.e. their dest-mac is multicast-mac (unlike the one that you
would see on an interface). So a *real* switch would never pass those
frames (because there wont be an entry in the CAM table) and would
have to consume those frames. Just imagine a userspace LACP-daemon
receiving LACPDUs on bonding master, it wouldn't make right decisions
about slaves and hence the health of the aggregator would be
questionable. Also imagine LACP frames being forwards by *real*
switch, things wont work as intended.


> So instead of
> having active mstp participant, we have passive linux bridge that passes
> the frames and the two real switches around that care of mstp, treating
> the linux as a shared segment. The costs/priorities/etc. on the real
> swtiches are set so one bond handles two regions, and the other bond
> handles other two regions. If any of the real switches dies or is taken
> down for e.g. firmware update - the bond going to the other switch
> handles all four regions (failover is of course not as fast as with
> active rstp/mstp participation, but works quite well none the less -
> around 10s after some tuning).
>
> We could have used RSTP for that purpose as well - but that being all or
> nothing in context of per-port blocking/forwarding, would leave half of
> the ports unused - and we wanted to avoid that (that's why MSTP was
> created after all).
>
> Instead of using 2 bonds (2 interfaces each) we could just use 4
> interfaces directly, one per region. But two of those regions see very
> little traffic, so we put more and less active regions in pairs.
>
>>       As for a resolution, presuming that Michal has some reasonable
>> use case, I'm thinking along the lines of reverting the new (leave frame
>> attached to slave) behavior for the general case and adding a special
>> case for LLDP and friends to get the new behavior.  I'd like to avoid
>> adding any new options to bonding.
>>
>
> My use case aside, this will cause issues for anyone attaching bond
> (instead of direct interface or veth) to a bridge and doing something
> more complex with it - whether related to stp or to selectively passing
> e.g. lldp using group_fwd_mask sysfs. Or having LLDP daemon (e.g.
> systemd-resolvd to not look far away) told to do LLDP on bond device
> (even most basic active-backup case) and remaining blind. Or anything
> else that expects to see/pass those multicasts on/via bonded device
> (which is just a convenient way to create virtual interface out of real
> interfaces after all - ITOW shouldn't probably make any calls in this
> regard).

A real switch cannot forward link-local-multicast and if it does, then
I would consider it as broken. Having said that we are taking about
the bonding-master and as I mentioned earlier, it seems fair to pass
those frames to bonding master because we share L2 with slaves but we
should pass the frames to the stack as they are (current behavior!) to
maintain the "linklocal-ness" so both master and the current-slave get
a copy (it's multicast after all).

Thanks,
--mahesh..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] bonded interfaces drop bpdu (stp) frames
  2018-07-13  0:15               ` Mahesh Bandewar (महेश बंडेवार)
@ 2018-07-14 15:48                 ` Michal Soltys
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Soltys @ 2018-07-14 15:48 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: Jay Vosburgh, Chonggang Li, linux-netdev, David Miller

On 07/13/2018 02:15 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Jul 12, 2018 at 4:14 PM, Michal Soltys <soltys@ziu.info> wrote:
>> On 2018-07-13 00:03, Jay Vosburgh wrote:
>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>
>>>> On Thu, Jul 12, 2018 at 11:03 AM, Jay Vosburgh
>>>> <jay.vosburgh@canonical.com> wrote:
>>>>> Michal Soltys <soltys@ziu.info> wrote:
>>>>>
>>>>>> On 07/12/2018 04:51 PM, Jay Vosburgh wrote:
>>>>>>> Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>>>>
>>>>>>>> On Wed, Jul 11, 2018 at 3:23 PM, Michal Soltys <soltys@ziu.info> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> As weird as that sounds, this is what I observed today after bumping
>>>>>>>>> kernel version. I have a setup where 2 bonds are attached to linux
>>>>>>>>> bridge and physically are connected to two switches doing MSTP (and
>>>>>>>>> linux bridge is just passing them).
>>>>>>>>>
>>>>>>>>> Initially I suspected some changes related to bridge code - but quick
>>>>>>>>> peek at the code showed nothing suspicious - and the part of it that
>>>>>>>>> explicitly passes stp frames if stp is not enabled has seen little
>>>>>>>>> changes (e.g. per-port group_fwd_mask added recently). Furthermore - if
>>>>>>>>> regular non-bonded interfaces are attached everything works fine.
>>>>>>>>>
>>>>>>>>> Just to be sure I detached the bond (802.3ad mode) and checked it with
>>>>>>>>> simple tcpdump (ether proto \\stp) - and indeed no hello packets were
>>>>>>>>> there (with them being present just fine on active enslaved interface,
>>>>>>>>> or on the bond device in earlier kernels).
>>>>>>>>>
>>>>>>>>> If time permits I'll bisect tommorow to pinpoint the commit, but from
>>>>>>>>> quick todays test - 4.9.x is working fine, while 4.16.16 (tested on
>>>>>>>>> debian) and 4.17.3 (tested on archlinux) are failing.
>>>>>>>>>
>>>>>>>>> Unless this is already a known issue (or you have any suggestions what
>>>>>>>>> could be responsible).
>>>>>>>>>
>>>>>>>> I believe these are link-local-multicast messages and sometime back a
>>>>>>>> change went into to not pass those frames to the bonding master. This
>>>>>>>> could be the side effect of that.
>>>>>>>
>>>>>>>       Mahesh, I suspect you're thinking of:
>>>>>>>
>>>>>>> commit b89f04c61efe3b7756434d693b9203cc0cce002e
>>>>>>> Author: Chonggang Li <chonggangli@google.com>
>>>>>>> Date:   Sun Apr 16 12:02:18 2017 -0700
>>>>>>>
>>>>>>>       bonding: deliver link-local packets with skb->dev set to link that packets arrived on
>>>>>>>
>>>>>>>       Michal, are you able to revert this patch and test?
>>>>>>>
>>>>>>>       -J
>>>>>>>
>>>>>>> ---
>>>>>>>       -Jay Vosburgh, jay.vosburgh@canonical.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Just tested - yes, reverting that patch solves the issues.
>>>>>
>>>>>          Chonggang,
>>>>>
>>>>>          Reading the changelog in your commit referenced above, I'm not
>>>>> entirely sure what actual problem it is fixing.  Could you elaborate?
>>>>>
>>>>>          As the patch appears to cause a regression, it needs to be
>>>>> either fixed or reverted.
>>>>>
>>>>>          Mahesh, you signed-off on it as well, perhaps you also have some
>>>>> context?
>>>>>
>>>>
>>>> I think the original idea behind it was to pass the LLDPDUs to the
>>>> stack on the interface that they came on since this is considered to
>>>> be link-local traffic and passing to bond-master would loose it's
>>>> "linklocal-ness". This is true for LLDP and if you change the skb->dev
>>>> of the packet, then you don't know which slave link it came on in
>>>> (from LLDP consumer's perspective).
>>>>
>>>> I don't know much about STP but trunking two links and aggregating
>>>> this link info through bond-master seems wrong. Just like LLDP, you
>>>> are losing info specific to a link and the decision derived from that
>>>> info could be wrong.
>>>>
>>>> Having said that, we determine "linklocal-ness" by looking at L2 and
>>>> bondmaster shares this with lts slaves. So it does seem fair to pass
>>>> those frames to the bonding-master but at the same time link-local
>>>> traffic is supposed to be limited to the physical link (LLDP/STP/LACP
>>>> etc). Your thoughts?
>>>
>>>        I agree the whole thing sounds kind of weird, but I'm curious as
>>> to what Michal's actual use case is; he presumably has some practical
>>> use for this, since he noticed that the behavior changed.
>>>
>>
>> The whole "link-local" term is a bit I don't know - at this point it
>> feels like too many things were thrown into single bag and it got
>> somewhat confusing (bpdu, lldp, pause frames, lacp, pae, qinq mulitcast
>> that afaik has its own address) - I added some examples in another reply
>> I did at the same time as you were typing this one =)
>>
>>>        Michal, you mentioned MSTP and using 802.3ad (LACP) mode; how
>>> does that combination work rationally given that the bond might send and
>>> receive traffic across multiple slaves?  Or does the switch side bundle
>>> the ports together into a single logical interface for MSTP purposes?
>>> On the TX side, I think the bond will likely balance all STP frames to
>>> just one slave.
>>>
>>
>> The basic concept - two "main" switches with "important" machines
>> connected to those. One switch dies and everything keeps working. With
>> no unused ports and so on.
>>
>> In more details:
>>
>> Originally I was trying MSTP daemon (on "important" machines) which
>> seems quite well and completely coded, but cannot really work correctly
>> - as afaik you can't put port (in linux bridge conext) in different
>> forwarding/blocking/etc. state per-region - itow per group of vlans (or
>> mstpd didn't know how to do that, or it wasn't implemented - I didn't
>> look too deep back then, though my interest resurfaced in recent days).
>>
>> So that option was out of the question. But any switch, real or not,
>> /must/ pass bpdu frames if it doesn't interpret them.
> 
> I think it's contrary! These (STP) frames are link-local-multicast
> frames i.e. their dest-mac is multicast-mac (unlike the one that you
> would see on an interface). So a *real* switch would never pass those
> frames (because there wont be an entry in the CAM table) and would
> have to consume those frames. Just imagine a userspace LACP-daemon
> receiving LACPDUs on bonding master, it wouldn't make right decisions
> about slaves and hence the health of the aggregator would be
> questionable. Also imagine LACP frames being forwards by *real*
> switch, things wont work as intended.
> 
> 
>> So instead of
>> having active mstp participant, we have passive linux bridge that passes
>> the frames and the two real switches around that care of mstp, treating
>> the linux as a shared segment. The costs/priorities/etc. on the real
>> swtiches are set so one bond handles two regions, and the other bond
>> handles other two regions. If any of the real switches dies or is taken
>> down for e.g. firmware update - the bond going to the other switch
>> handles all four regions (failover is of course not as fast as with
>> active rstp/mstp participation, but works quite well none the less -
>> around 10s after some tuning).
>>
>> We could have used RSTP for that purpose as well - but that being all or
>> nothing in context of per-port blocking/forwarding, would leave half of
>> the ports unused - and we wanted to avoid that (that's why MSTP was
>> created after all).
>>
>> Instead of using 2 bonds (2 interfaces each) we could just use 4
>> interfaces directly, one per region. But two of those regions see very
>> little traffic, so we put more and less active regions in pairs.
>>
>>>        As for a resolution, presuming that Michal has some reasonable
>>> use case, I'm thinking along the lines of reverting the new (leave frame
>>> attached to slave) behavior for the general case and adding a special
>>> case for LLDP and friends to get the new behavior.  I'd like to avoid
>>> adding any new options to bonding.
>>>
>>
>> My use case aside, this will cause issues for anyone attaching bond
>> (instead of direct interface or veth) to a bridge and doing something
>> more complex with it - whether related to stp or to selectively passing
>> e.g. lldp using group_fwd_mask sysfs. Or having LLDP daemon (e.g.
>> systemd-resolvd to not look far away) told to do LLDP on bond device
>> (even most basic active-backup case) and remaining blind. Or anything
>> else that expects to see/pass those multicasts on/via bonded device
>> (which is just a convenient way to create virtual interface out of real
>> interfaces after all - ITOW shouldn't probably make any calls in this
>> regard).

For the record - having peeked at LLDP's 802.1AB-2016 - it also states that 
LLDP agent must be able to operate both on any/all physical interfaces as well 
as on bonds (section 6.8 LLDP and Link Aggregation).

> 
> A real switch cannot forward link-local-multicast and if it does, then
> I would consider it as broken.

Well if we dig into details, even the standard (802.1q-2014) makes distinction 
between non-tpmr bridge, nearest bridge and nearest customer bridge - 
governing which of those multicasts should be relayed in "bigger" 
customer/provider cases. My point being, it's not just simple "don't forward" 
(section 8.6.3, among others).

As far as cheap / unmanaged real ones go, they generally pass the important 
stuff they don't understand (at least all the ones I had under my hands did 
so). R/MSTP expects "shared segments" to pass BPDUs as well. Bridge code in 
linux follows that (and allows more with aforementioned knobs).

But we digress.

> Having said that we are taking about
> the bonding-master and as I mentioned earlier, it seems fair to pass
> those frames to bonding master because we share L2 with slaves but we
> should pass the frames to the stack as they are (current behavior!) to
> maintain the "linklocal-ness" so both master and the current-slave get
> a copy (it's multicast after all).
> 

To sum up - meaning the patch will be reverted or do you plan to adjust it in 
some way ? If former, I can prep the reverting patch, as I'm - well - most 
interested in getting this regression sorted out.

Michal


> Thanks,
> --mahesh..
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-07-14 16:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-11 22:23 [BUG] bonded interfaces drop bpdu (stp) frames Michal Soltys
2018-07-11 23:22 ` Mahesh Bandewar (महेश बंडेवार)
2018-07-12 14:51   ` Jay Vosburgh
2018-07-12 16:37     ` Michal Soltys
2018-07-12 18:03       ` Jay Vosburgh
2018-07-12 21:26         ` Mahesh Bandewar (महेश बंडेवार)
2018-07-12 22:03           ` Jay Vosburgh
2018-07-12 23:14             ` Michal Soltys
2018-07-13  0:15               ` Mahesh Bandewar (महेश बंडेवार)
2018-07-14 15:48                 ` Michal Soltys
2018-07-12 22:14           ` Michal Soltys

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.