* Re: bridge: HSR support
       [not found] ` <20111011112821.28cd3e51@nehalam.linuxnetplumber.net>
@ 2011-10-11 23:51   ` Arvid Brodin
  2011-10-12 13:28     ` David Lamparter
  2011-10-24 14:17     ` Arvid Brodin
  0 siblings, 2 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-10-11 23:51 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Lennert Buytenhek

Stephen Hemminger wrote:
> On Tue, 11 Oct 2011 20:25:08 +0200
> Arvid Brodin <arvid.brodin@enea.com> wrote:
> 
>> Hi,
>>
>> I want to add support for HSR ("High-availability Seamless Redundancy",
>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>> ports and are connected in a ring. All new Ethernet packets are sent on both
>> ports (or passed through if the current unit is not the originating unit). The
>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>
>> This gives instant, reconfiguration-free failover.
>>
>> I'd like your input on how to design the user interface. To me it seems natural
>> to use bridge-utils, which of course today supports STP.
>>
>> One solution is to simply add an "hsr" command:
>>
>> # brctl hsr <bridge> on|off
>>
>> But HSR is mutually exclusive to other modes, and I think that STP and standard
>> bridge mode are mutually exclusive, too? Perhaps it would be better (more user-
>> friendly) to 
>>
>> # brctl type <bridge> standard|stp|hsr
>>
>> ?
>>
>> 'brctl stp <bridge> on|off' would have to be kept for compatibility, but could
>> be a simple wrapper for 'brctl type <bridge> stp|standard'
>>
>> What do you think about this?
>>
>>
> 
> Why is it a bridge thing and not a standalone or bonding (or the new team
> device feature? Wouldn't users want to use it without all the stuff
> related to bridging. The fact that it doesn't work with STP is a big
> red flag that it doesn't belong in the bridge.

Good question! I'm new to the more advanced networking possibilities in Linux, so 
I really don't know where HSR fits best.

HSR is a layer-2-only protocol, with the host acting as a bridge for packets not
destined for itself. It also sends all originating Ethernet packets on both ports,
adding an HSR sequence tag to each packet (using a dedicated EtherType of 0x88FB).
As described above, the HSR units are connected in a ring, in which only HSR units
are allowed.

Having looked now at bonding, it seems to act on several network layers, doing 
multiple things mainly centered around 802.3ad (link aggregation). I'm not sure
how HSR would fit there.

If I understand correctly, team device is an emerging userspace implementation of
the bonding driver?

I guess my take was that HSR seems like a special bridging mode, much like STP.


> Please discuss this on netdev mailing list, others may have different
> opinions.

Done! :)


-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-11 23:51   ` bridge: HSR support Arvid Brodin
@ 2011-10-12 13:28     ` David Lamparter
  2011-10-12 14:24       ` Arvid Brodin
  2011-10-24 14:17     ` Arvid Brodin
  1 sibling, 1 reply; 15+ messages in thread
From: David Lamparter @ 2011-10-12 13:28 UTC (permalink / raw)
  To: Arvid Brodin; +Cc: netdev, Stephen Hemminger, Lennert Buytenhek

On Wed, Oct 12, 2011 at 01:51:22AM +0200, Arvid Brodin wrote:
> Stephen Hemminger wrote:
> >> I want to add support for HSR ("High-availability Seamless Redundancy",
> >> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
> >> ports and are connected in a ring. All new Ethernet packets are sent on both
> >> ports (or passed through if the current unit is not the originating unit).


> >> The same packet is never passed twice.

How does the bridge decide whether a packet is arriving the second time?
Is the ring pre-resolved to stop things or does this happen per-packet?


-David


* Re: bridge: HSR support
  2011-10-12 13:28     ` David Lamparter
@ 2011-10-12 14:24       ` Arvid Brodin
  0 siblings, 0 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-10-12 14:24 UTC (permalink / raw)
  To: David Lamparter; +Cc: netdev, Stephen Hemminger, Lennert Buytenhek


David Lamparter wrote:
> On Wed, Oct 12, 2011 at 01:51:22AM +0200, Arvid Brodin wrote:
>> Stephen Hemminger wrote:
>>>> I want to add support for HSR ("High-availability Seamless Redundancy",
>>>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>>>> ports and are connected in a ring. All new Ethernet packets are sent on both
>>>> ports (or passed through if the current unit is not the originating unit).
> 
> 
>>>> The same packet is never passed twice.
> 
> How does the bridge decide whether a packet is arriving the second time?
> Is the ring pre-resolved to stop things or does this happen per-packet?

It happens per-packet. I believe a combination of source MAC address and the HSR
sequence tag is used to identify duplicates.
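
The per-packet check described above can be sketched in user-space C. This is only an illustration of matching on (source MAC, sequence number); the table size, the ring-buffer eviction and all names (hsr_seen_table, hsr_seen_before) are assumptions made for the example, not something taken from the standard or from any existing code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HSR_SEEN_SLOTS 64u	/* hypothetical table size */

struct hsr_seen_entry {
	uint8_t  src_mac[6];
	uint16_t seq;
	bool     used;
};

struct hsr_seen_table {
	struct hsr_seen_entry slot[HSR_SEEN_SLOTS];
	unsigned int next;	/* next slot to overwrite (oldest entry) */
};

/*
 * Returns true if a frame with this (src_mac, seq) pair has already been
 * recorded. Otherwise records the pair and returns false.
 */
static bool hsr_seen_before(struct hsr_seen_table *t,
			    const uint8_t src_mac[6], uint16_t seq)
{
	for (unsigned int i = 0; i < HSR_SEEN_SLOTS; i++) {
		const struct hsr_seen_entry *e = &t->slot[i];

		if (e->used && e->seq == seq &&
		    memcmp(e->src_mac, src_mac, 6) == 0)
			return true;
	}

	/* Not seen yet: remember it, evicting the oldest entry if needed. */
	struct hsr_seen_entry *e = &t->slot[t->next];

	memcpy(e->src_mac, src_mac, 6);
	e->seq = seq;
	e->used = true;
	t->next = (t->next + 1) % HSR_SEEN_SLOTS;
	return false;
}
```

A zero-initialised hsr_seen_table is a valid empty table. A real implementation would probably also need to age entries out, since a 16-bit sequence number wraps around.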

-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-11 23:51   ` bridge: HSR support Arvid Brodin
  2011-10-12 13:28     ` David Lamparter
@ 2011-10-24 14:17     ` Arvid Brodin
  2011-10-28 15:34       ` Arvid Brodin
  2012-01-06 18:11       ` Arvid Brodin
  1 sibling, 2 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-10-24 14:17 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger wrote:
> On Tue, 11 Oct 2011 20:25:08 +0200
> Arvid Brodin <arvid.brodin@enea.com> wrote:
> 
>> Hi,
>>
>> I want to add support for HSR ("High-availability Seamless Redundancy",
>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>> ports and are connected in a ring. All new Ethernet packets are sent on both
>> ports (or passed through if the current unit is not the originating unit). The
>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>
>> This gives instant, reconfiguration-free failover.
>>
>> I'd like your input on how to design the user interface. To me it seems natural
>> to use bridge-utils, which of course today supports STP.
>>
>> One solution is to simply add an "hsr" command:
>>
>> # brctl hsr <bridge> on|off
>>
>> But HSR is mutually exclusive to other modes, and I think that STP and standard
>> bridge mode are mutually exclusive, too? Perhaps it would be better (more user-
>> friendly) to 
>>
>> # brctl type <bridge> standard|stp|hsr
>>
>> ?
>>
>> 'brctl stp <bridge> on|off' would have to be kept for compatibility, but could
>> be a simple wrapper for 'brctl type <bridge> stp|standard'
>>
>> What do you think about this?
>>
> 
> Why is it a bridge thing and not a standalone or bonding (or the new team
> device feature? Wouldn't users want to use it without all the stuff
> related to bridging. The fact that it doesn't work with STP is a big
> red flag that it doesn't belong in the bridge.

Ok, having read up some more on this it looks like STP is a standardised part of
bridging, so I guess you're right. 


I need to do two things:

1) Bind two network interfaces into one (say, eth0 & eth1 => hsr0). Frames sent on
   hsr0 should get an HSR tag (including the correct EtherType) and go out on both
   eth0 and eth1.

2) Ingress frames on eth0 & eth1, with EtherType 0x88fb, should be captured and 
   handled specially (either received on hsr0 or forwarded to the other bound 
   physical interface).

Any ideas on the best way to implement this -- what's the nicest place to "hook
into" for this?
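
To make point 1 concrete, here is a rough user-space sketch of inserting a tag between the MAC addresses and the original EtherType. The EtherType 0x88FB is the value used in this thread; the 6-byte tag layout (EtherType, 4-bit path plus 12-bit size, 16-bit sequence number), the meaning given to the size field and the function name hsr_tag_frame are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6u
#define ETH_HDR_LEN	14u
#define HSR_ETHERTYPE	0x88FBu	/* value quoted in this thread */
#define HSR_TAG_LEN	6u	/* assumed tag size */

/*
 * Assumed tagged layout:
 *   dst(6) src(6) 0x88FB(2) path/size(2) seq(2) orig_ethertype(2) payload
 *
 * Copies the untagged frame "in" into "out" with the tag inserted.
 * Returns the tagged length, or 0 if the input is too short or the
 * output buffer is too small.
 */
static size_t hsr_tag_frame(const uint8_t *in, size_t in_len,
			    uint8_t *out, size_t out_cap,
			    uint8_t path, uint16_t seq)
{
	if (in_len < ETH_HDR_LEN || out_cap < in_len + HSR_TAG_LEN)
		return 0;

	/* Size field: assumed here to be the original payload length. */
	uint16_t size = (uint16_t)((in_len - ETH_HDR_LEN) & 0x0FFFu);
	uint16_t path_size = (uint16_t)(((path & 0x0Fu) << 12) | size);

	memcpy(out, in, 2 * ETH_ALEN);		/* dst + src MAC */
	out[12] = (uint8_t)(HSR_ETHERTYPE >> 8);
	out[13] = (uint8_t)(HSR_ETHERTYPE & 0xFFu);
	out[14] = (uint8_t)(path_size >> 8);
	out[15] = (uint8_t)(path_size & 0xFFu);
	out[16] = (uint8_t)(seq >> 8);
	out[17] = (uint8_t)(seq & 0xFFu);
	/* Original EtherType and payload follow the tag unchanged. */
	memcpy(out + 18, in + 2 * ETH_ALEN, in_len - 2 * ETH_ALEN);

	return in_len + HSR_TAG_LEN;
}
```

The same tagged buffer would then be transmitted once on eth0 and once on eth1, with the sequence number incremented per frame sent on hsr0.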


-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-24 14:17     ` Arvid Brodin
@ 2011-10-28 15:34       ` Arvid Brodin
  2011-10-28 15:54         ` Stephen Hemminger
  2011-11-21 16:52         ` Arvid Brodin
  2012-01-06 18:11       ` Arvid Brodin
  1 sibling, 2 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-10-28 15:34 UTC (permalink / raw)
  To: netdev; +Cc: Arvid Brodin

Arvid Brodin wrote:
> Stephen Hemminger wrote:
>> On Tue, 11 Oct 2011 20:25:08 +0200
>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>
>>> Hi,
>>>
>>> I want to add support for HSR ("High-availability Seamless Redundancy",
>>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>>> ports and are connected in a ring. All new Ethernet packets are sent on both
>>> ports (or passed through if the current unit is not the originating unit). The
>>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>>
>>> This gives instant, reconfiguration-free failover.
>>>
*snip*
> 
> I need to do two things:
> 
> 1) Bind two network interfaces into one (say, eth0 & eth1 => hsr0). Frames sent on
>    hsr0 should get an HSR tag (including the correct EtherType) and go out on both
>    eth0 and eth1.
> 
> 2) Ingress frames on eth0 & eth1, with EtherType 0x88fb, should be captured and 
>    handled specially (either received on hsr0 or forwarded to the other bound 
>    physical interface).
> 
> Any ideas on the best way to implement this -- what's the nicest place to "hook
> into" for this?
> 

Ok, so after a lot of reading and looking through code I have this idea of a
standalone solution:

1) Add ioctls to create (and remove) "hsr" netdevs, each of which encapsulates
   two physical Ethernet interfaces (somewhat like the bridge code does, but
   with precisely 2 interfaces slaved).

2) Use dev_add_pack() to register protocol ("EtherType") 0x88FB. The device
   that a frame comes in on is checked for being a slave to an hsr netdev,
   and the frame is handled accordingly.

It would be great to get some input on the sanity of this solution before I get
too much time invested in it!


Thanks,
-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-28 15:34       ` Arvid Brodin
@ 2011-10-28 15:54         ` Stephen Hemminger
  2011-10-28 16:36           ` Arvid Brodin
  2011-12-06 23:23           ` Arvid Brodin
  2011-11-21 16:52         ` Arvid Brodin
  1 sibling, 2 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-10-28 15:54 UTC (permalink / raw)
  To: Arvid Brodin; +Cc: netdev

On Fri, 28 Oct 2011 17:34:18 +0200
Arvid Brodin <arvid.brodin@enea.com> wrote:

> Ok, so after a lot of reading and looking through code I have this idea of a
> standalone solution:
> 
> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
>    physical Ethernet interfaces each (somewhat like the bridge code does, but
>    with precisely 2 interfaces slaved).

Please use the newer netlink interface and the master attribute for this
rather than inventing yet another ioctl.


* Re: bridge: HSR support
  2011-10-28 15:54         ` Stephen Hemminger
@ 2011-10-28 16:36           ` Arvid Brodin
  2011-12-06 23:23           ` Arvid Brodin
  1 sibling, 0 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-10-28 16:36 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger wrote:
> On Fri, 28 Oct 2011 17:34:18 +0200
> Arvid Brodin <arvid.brodin@enea.com> wrote:
> 
>> Ok, so after a lot of reading and looking through code I have this idea of a
>> standalone solution:
>>
>> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
>>    physical Ethernet interfaces each (somewhat like the bridge code does, but
>>    with precisely 2 interfaces slaved).
> 
> Please use the newer netlink interface and the master attribute for this
> rather than inventing yet another ioctl.

Ok, will do! Thanks!

-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-28 15:34       ` Arvid Brodin
  2011-10-28 15:54         ` Stephen Hemminger
@ 2011-11-21 16:52         ` Arvid Brodin
  1 sibling, 0 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-11-21 16:52 UTC (permalink / raw)
  To: netdev

Arvid Brodin wrote:
> Arvid Brodin wrote:
>> Stephen Hemminger wrote:
>>> On Tue, 11 Oct 2011 20:25:08 +0200
>>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to add support for HSR ("High-availability Seamless Redundancy",
>>>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>>>> ports and are connected in a ring. All new Ethernet packets are sent on both
>>>> ports (or passed through if the current unit is not the originating unit). The
>>>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>>>
>>>> This gives instant, reconfiguration-free failover.
>>>>
> *snip*
>> I need to do two things:
>>
>> 1) Bind two network interfaces into one (say, eth0 & eth1 => hsr0). Frames sent on
>>    hsr0 should get an HSR tag (including the correct EtherType) and go out on both
>>    eth0 and eth1.
>>
>> 2) Ingress frames on eth0 & eth1, with EtherType 0x88fb, should be captured and 
>>    handled specially (either received on hsr0 or forwarded to the other bound 
>>    physical interface).
>>
>> Any ideas on the best way to implement this -- what's the nicest place to "hook
>> into" for this?
>>

I need some help with the code for creating virtual hsr devices.

To reiterate, an hsr device acts as a kind of master for two physical Ethernet
devices. Any frame sent on the hsrX device should be forwarded to and sent from
both physical devices. Frames coming in on one of the physical devices should
be bridged to the other physical device, or received on the hsr device if the
host is the intended destination.
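
The receive-side behaviour described above boils down to a three-way decision per frame, which might be sketched like this in user-space C. The enum, the function name hsr_classify and the choice to look only at the destination MAC are assumptions for the example. Multicast and broadcast frames (which presumably need to be both received and forwarded) and frames that have travelled all the way around the ring are deliberately not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HSR_ETHERTYPE	0x88FBu	/* value quoted in this thread */
#define ETH_HDR_LEN	14u

enum hsr_rx_action {
	HSR_RX_NOT_HSR,	/* not an HSR frame: leave it to the physical device */
	HSR_RX_DELIVER,	/* addressed to this host: receive on the hsr device */
	HSR_RX_FORWARD,	/* addressed elsewhere: send out the other port */
};

/* Classify a frame that arrived on one of the two physical devices. */
static enum hsr_rx_action hsr_classify(const uint8_t *frame, size_t len,
				       const uint8_t own_mac[6])
{
	if (len < ETH_HDR_LEN)
		return HSR_RX_NOT_HSR;

	/* EtherType sits in bytes 12..13, in network byte order. */
	uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

	if (ethertype != HSR_ETHERTYPE)
		return HSR_RX_NOT_HSR;

	/* Destination MAC is the first six bytes of the frame. */
	if (memcmp(frame, own_mac, 6) == 0)
		return HSR_RX_DELIVER;

	return HSR_RX_FORWARD;
}
```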

Questions:

1) net_device features. Should I just logically AND the features fields of the
   physical interfaces to get a correct value for the hsr device?


2) net_device priv_flags / flags:

   I'm guessing I need to clear IFF_XMIT_DST_RELEASE on all involved interfaces
   to be able to send outgoing frames on multiple interfaces?

   Can I set IFF_DONT_BRIDGE on the physical interfaces to prevent them being
   used in a bridge?

   Can I call netdev_set_master() on the physical devices to stop them from
   being used in e.g. a bond?


3) Stats: is there a reason not to use the net_device->stats field for
   statistics on the hsr device? (I see many drivers keep their own
   net_device_stats data and implement ndo_get_stats64() instead.)

-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-28 15:54         ` Stephen Hemminger
  2011-10-28 16:36           ` Arvid Brodin
@ 2011-12-06 23:23           ` Arvid Brodin
  2011-12-06 23:27             ` Stephen Hemminger
  1 sibling, 1 reply; 15+ messages in thread
From: Arvid Brodin @ 2011-12-06 23:23 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger wrote:
> On Fri, 28 Oct 2011 17:34:18 +0200
> Arvid Brodin <arvid.brodin@enea.com> wrote:
> 
>> Ok, so after a lot of reading and looking through code I have this idea of a
>> standalone solution:
>>
>> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
>>    physical Ethernet interfaces each (somewhat like the bridge code does, but
>>    with precisely 2 interfaces slaved).
> 
> Please use the newer netlink interface and the master attribute for this
> rather than inventing yet another ioctl.


Is the rtnl interface documented anywhere (the usage of the different IFLA_
flags etc.)? Specifically: how do I use the IFLA_MASTER flag (what's the
meaning of the 32-bit data it wants, and how is it used by the kernel)? I 
haven't been very successful figuring this out by looking at the kernel code.


Also, how do I best tell the kernel which my slave devices are when creating
the hsr device? Should I create my own IFLA_HSR_UNSPEC, etc, or can I use some
of the generic flags?


Oh, and the kernel (struct rtnl_link_ops).newlink method has two (struct
nlattr *[]) params: tb and data. What are their roles?


-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-12-06 23:23           ` Arvid Brodin
@ 2011-12-06 23:27             ` Stephen Hemminger
  2011-12-07 18:30               ` Arvid Brodin
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2011-12-06 23:27 UTC (permalink / raw)
  To: Arvid Brodin; +Cc: netdev

On Wed, 7 Dec 2011 00:23:21 +0100
Arvid Brodin <arvid.brodin@enea.com> wrote:

> Stephen Hemminger wrote:
> > On Fri, 28 Oct 2011 17:34:18 +0200
> > Arvid Brodin <arvid.brodin@enea.com> wrote:
> > 
> >> Ok, so after a lot of reading and looking through code I have this idea of a
> >> standalone solution:
> >>
> >> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
> >>    physical Ethernet interfaces each (somewhat like the bridge code does, but
> >>    with precisely 2 interfaces slaved).
> > 
> > Please use the newer netlink interface and the master attribute for this
> > rather than inventing yet another ioctl.
> 
> 
> Is the rtnl interface documented anywhere (the usage of the different IFLA_
> flags etc.)? Specifically: how do I use the IFLA_MASTER flag (what's the
> meaning of the 32-bit data it wants, and how is it used by the kernel)? I 
> haven't been very successful figuring this out by looking at the kernel code.


Look at bridging or bonding. The u32 is the ifindex of the master device (i.e.
where packets are being forwarded to).

> 
> Also, how do I best tell the kernel which my slave devices are when creating
> the hsr device? Should I create my own IFLA_HSR_UNSPEC, etc, or can I use some
> of the generic flags?

Look at macvlan, vlan, or bridging. There this is done by processing a newlink
message.


> Oh, and the kernel (struct rtnl_link_ops).newlink method has two (struct
> nlattr *[]) params: tb and data. What are their roles?
> 
> 


* Re: bridge: HSR support
  2011-12-06 23:27             ` Stephen Hemminger
@ 2011-12-07 18:30               ` Arvid Brodin
  2011-12-07 19:59                 ` Jay Vosburgh
  0 siblings, 1 reply; 15+ messages in thread
From: Arvid Brodin @ 2011-12-07 18:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger wrote:
> On Wed, 7 Dec 2011 00:23:21 +0100
> Arvid Brodin <arvid.brodin@enea.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> On Fri, 28 Oct 2011 17:34:18 +0200
>>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>>
>>>> Ok, so after a lot of reading and looking through code I have this idea of a
>>>> standalone solution:
>>>>
>>>> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
>>>>    physical Ethernet interfaces each (somewhat like the bridge code does, but
>>>>    with precisely 2 interfaces slaved).
>>> Please use the newer netlink interface and the master attribute for this
>>> rather than inventing yet another ioctl.
>>
>> Is the rtnl interface documented anywhere (the usage of the different IFLA_
>> flags etc.)? Specifically: how do I use the IFLA_MASTER flag (what's the
>> meaning of the 32-bit data it wants, and how is it used by the kernel)? I 
>> haven't been very successful figuring this out by looking at the kernel code.
> 
> 
> Look at bridging or bonding.

Believe me, I have! :) Turns out IFLA_MASTER is actually handled in net/core/rtnetlink.c.
Although the message is sent to the slave device, the functions called are the
master device's ndo_{add,del}_slave(). Looking at the bridging and bonding
implementations br_add_slave() and bond_enslave() and the functions they call,
it all boils down mostly to 

* netdev_set_master() which assigns slave_dev->master = master_dev.
* bonding also sets IFF_SLAVE on the slave.
* netdev_rx_handler_register(slave_dev, ). "This handler will then be called
  from __netif_receive_skb."

So far so good, but:

* I don't know the meaning of the IFF_SLAVE flag. It's referenced all over the place
  (core, vlan, bonding, ipv6, eql). Do I need to/want to set this?

* I don't know the effects of setting dev->master. Do I need/want this?

* I don't want to forward all ingress frames on the slave devices to my master
  device; I only want the ones with protocol 0x88FB to be forwarded (other
  frames should be received by the slaves as normal). I think I already have this
  covered by registering a protocol handler (using dev_add_pack(packet_type)).
  So perhaps calling netdev_rx_handler_register() is not necessary in my case?

* As far as I can see, neither bridging nor bonding is handled by the ip program
  (iproute2 suite)? I.e. no examples of binding more than one interface to a
  virtual interface when it comes to which messages to send, etc. VLAN uses
  IFLA_IFNAME (name of the vlan link), IFLA_LINK (physical link behind the vlan
  link), and some IFLA_VLAN-specific messages.

  What I want to do is to atomically (from a user space perspective) create the
  HSR bonding, i.e.:

  	# ip link add name hsr0 type hsr slave1 slave2

  I have written a hsr "plugin" to iproute2 that accepts these parameters, I'm
  just not sure how to tell the kernel about them. Perhaps then I should define
  my own IFLA_HSR_UNSPEC, IFLA_HSR_SLAVE1, IFLA_HSR_SLAVE2 messages?
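
  For illustration, the attribute set floated above could look roughly like the
  following. The IFLA_HSR_* names are the ones proposed in this mail; carrying
  each slave as a u32 ifindex is an assumption (by analogy with IFLA_LINK and
  IFLA_MASTER), and put_u32_attr() and hsr_pack_slaves() are made-up user-space
  helpers that only mimic the netlink attribute layout (u16 length, u16 type,
  payload padded to 4 bytes, host byte order). They are not libnetlink calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Attribute types as proposed in this thread (hypothetical at this point). */
enum {
	IFLA_HSR_UNSPEC,
	IFLA_HSR_SLAVE1,
	IFLA_HSR_SLAVE2,
};

#define ATTR_HDRLEN	4u
#define ATTR_ALIGN(len)	(((len) + 3u) & ~3u)

/*
 * Append one u32 attribute at offset "off" in "buf".
 * Returns the new offset, or 0 if the attribute does not fit.
 */
static size_t put_u32_attr(uint8_t *buf, size_t off, size_t cap,
			   uint16_t type, uint32_t value)
{
	uint16_t len = (uint16_t)(ATTR_HDRLEN + sizeof(value));

	if (off + ATTR_ALIGN(len) > cap)
		return 0;

	memcpy(buf + off, &len, sizeof(len));		/* attribute length */
	memcpy(buf + off + 2, &type, sizeof(type));	/* attribute type */
	memcpy(buf + off + ATTR_HDRLEN, &value, sizeof(value));
	return off + ATTR_ALIGN(len);
}

/* Pack both slave ifindexes; returns total bytes used, or 0 on overflow. */
static size_t hsr_pack_slaves(uint8_t *buf, size_t cap,
			      uint32_t ifindex1, uint32_t ifindex2)
{
	size_t off = put_u32_attr(buf, 0, cap, IFLA_HSR_SLAVE1, ifindex1);

	if (off == 0)
		return 0;
	return put_u32_attr(buf, off, cap, IFLA_HSR_SLAVE2, ifindex2);
}
```

  On the kernel side, type-specific attributes like these would be expected to
  show up in the "data" array passed to newlink rather than in "tb".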


>> Also, how do I best tell the kernel which my slave devices are when creating
>> the hsr device? Should I create my own IFLA_HSR_UNSPEC, etc, or can I use some
>> of the generic flags?
> 
> Look at macvlan, vlan, or bridging. There this is done by processing a newlink
> message.

macvlan and vlan both use IFLA_LINK to tell the kernel about their single
underlying "real" device. None of these use more than one underlying device.
Bridging does not implement newlink at all (uses ioctls, I think).


>> Oh, and the kernel (struct rtnl_link_ops).newlink method has two (struct
>> nlattr *[]) params: tb and data. What are their roles?
>>
 
Heh, I just realised that the difference is that "tb" contains generic (IFLA_)
message data and "data" contains specific (e.g. IFLA_VLAN_) message data. 


-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-12-07 18:30               ` Arvid Brodin
@ 2011-12-07 19:59                 ` Jay Vosburgh
  2011-12-08 14:45                   ` Arvid Brodin
  0 siblings, 1 reply; 15+ messages in thread
From: Jay Vosburgh @ 2011-12-07 19:59 UTC (permalink / raw)
  To: Arvid Brodin; +Cc: Stephen Hemminger, netdev

Arvid Brodin <arvid.brodin@enea.com> wrote:

>Stephen Hemminger wrote:
>> On Wed, 7 Dec 2011 00:23:21 +0100
>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>> 
>>> Stephen Hemminger wrote:
>>>> On Fri, 28 Oct 2011 17:34:18 +0200
>>>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>>>
>>>>> Ok, so after a lot of reading and looking through code I have this idea of a
>>>>> standalone solution:
>>>>>
>>>>> 1) Add ioctls to create (and remove) "hsr" netdevs which encapsulates two
>>>>>    physical Ethernet interfaces each (somewhat like the bridge code does, but
>>>>>    with precisely 2 interfaces slaved).
>>>> Please use the newer netlink interface and the master attribute for this
>>>> rather than inventing yet another ioctl.
>>>
>>> Is the rtnl interface documented anywhere (the usage of the different IFLA_
>>> flags etc.)? Specifically: how do I use the IFLA_MASTER flag (what's the
>>> meaning of the 32-bit data it wants, and how is it used by the kernel)? I 
>>> haven't been very successful figuring this out by looking at the kernel code.
>> 
>> 
>> Look at bridging or bonding.
>
>Believe me, I have! :) Turns out IFLA_MASTER is actually handled in net/core/rtnetlink.c.
>Although the message is sent to the slave device, the functions called are the
>master device's ndo_{add,del}_slave(). Looking at the bridging and bonding
>implementations br_add_slave() and bond_enslave() and the functions they call,
>it all boils down mostly to 
>
>* netdev_set_master() which assigns slave_dev->master = master_dev.
>* bonding also set IFF_SLAVE on the slave.
>* netdev_rx_handler_register(slave_dev, ). "This handler will then be called
>  from __netif_receive_skb."
>
>So far so good, but:
>
>* I don't know the meaning of the IFF_SLAVE flag. It's referenced all over the place
>  (core, vlan, bonding, ipv6, eql). Do I need to/want to set this?

	Only if you actually need to for some reason.  There are a few
tests that make actual use of IFF_SLAVE, e.g., IPv6 won't run addrconf
on an interface with IFF_SLAVE set (which prevents bonding slaves from
having an IPv6 address distinct from the master).  Netpoll also treats
interfaces with IFF_SLAVE in a special way.  Bonding uses it internally
for various tests.

>* I don't know the effects of setting dev->master. Do I need/want this?

	Maybe.  One effect of netdev_set_master is that a reference is
acquired on the master, on behalf of the slave, so the master cannot
simply vanish until the slave releases that reference.  This predates
the notifier facility, and careful use of notifiers (handling
NETDEV_UNREGISTER) can get around the need for dev->master, but, e.g.,
vlan still acquires a reference to the real_dev without using
dev->master.

	It used to be that dev->master was used in netif_receive_skb for
packet handling purposes (for bonding, mostly; bridge and I think
macvlan had separate hooks).  That special sauce is now done by the
rx_handler, so there's really no requirement to use dev->master if you
have no need.

>* I don't want to forward all ingress frames on the slave devices to my master
>  device; I only want the ones with protocol 0x88FB to be forwarded (other
>  frames should be received by the slaves as normal). I think I already have this
>  covered by registering a protocol handler (using dev_add_pack(packet_type)).
>  So perhaps calling netdev_rx_handler_register() is not necessary in my case?

	You may want to use the rx_handler, and have it set skb->dev
appropriately for the frames that should forward to the master, and
leave skb->dev alone for the ones that should stick with the slave.
Both of those need the appropriate return from the rx_handler, which is
documented in netdevice.h.

	I'm not sure that you need a dev_add_pack at all; bonding
doesn't use one anymore, since everything it needs can now be done via
rx_handler.  The dev_add_pack approach may work, but rx_handler is
probably more efficient.
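
	As a rough user-space model of that idea (not kernel code): the
result names below mirror the rx_handler_result_t values in netdevice.h,
but are redefined locally so the sketch compiles on its own. struct
mock_dev, struct mock_skb and hsr_rx_handler_model() are stand-ins invented
for the example; a real handler takes a struct sk_buff ** and would have
to deal with much more than the EtherType.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the kernel's result names, for illustration only. */
enum rx_handler_result {
	RX_HANDLER_CONSUMED,	/* skb was taken over by the handler */
	RX_HANDLER_ANOTHER,	/* skb->dev changed: run the receive path again */
	RX_HANDLER_EXACT,	/* deliver only to exact-match protocol handlers */
	RX_HANDLER_PASS,	/* carry on as if no handler had been called */
};

/* Minimal stand-ins for net_device and sk_buff. */
struct mock_dev {
	const char *name;
	struct mock_dev *hsr_master;	/* NULL if not enslaved */
};

struct mock_skb {
	struct mock_dev *dev;
	uint16_t protocol;	/* EtherType, host byte order in this model */
};

/*
 * Steer HSR frames (EtherType 0x88FB, as used in this thread) to the
 * master device; leave every other frame with the slave.
 */
static enum rx_handler_result hsr_rx_handler_model(struct mock_skb *skb)
{
	if (skb->protocol != 0x88FB || skb->dev->hsr_master == NULL)
		return RX_HANDLER_PASS;

	skb->dev = skb->dev->hsr_master;
	return RX_HANDLER_ANOTHER;
}
```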

>* As far as I can see, neither bridging nor bonding is handled by the ip program
>  (iproute2 suite)? I.e. no examples of binding more than one interface to a
>  virtual interface when it comes to which messages to send, etc. VLAN uses
>  IFLA_IFNAME (name of the vlan link), IFLA_LINK (physical link behind the vlan
>  link), and some IFLA_VLAN-specific messages.

	In current versions of iproute, something like "ip link set
device eth0 master bond0" would add a slave to a bond.  You are correct,
though, that ip does not permit changing the bonding options, and I
don't believe it will create new master devices, either, so the bonding
support is limited.

	-J

>  What I want to do is to atomically (from a user space perspective) create the
>  HSR bonding, i.e.:
>
>  	# ip link add name hsr0 type hsr slave1 slave2
>
>  I have written a hsr "plugin" to iproute2 that accepts these parameters, I'm
>  just not sure how to tell the kernel about them. Perhaps then I should define
>  my own IFLA_HSR_UNSPEC, IFLA_HSR_SLAVE1, IFLA_HSR_SLAVE2 messages?
>
>>> Also, how do I best tell the kernel which my slave devices are when creating
>>> the hsr device? Should I create my own IFLA_HSR_UNSPEC, etc, or can I use some
>>> of the generic flags?
>> 
>> Look at macvlan, vlan, or bridging. There this is done by processing a newlink
>> message.
>
>macvlan and vlan both use IFLA_LINK to tell the kernel about their single
>underlying "real" device. None of these use more than one underlying device.
>Bridging does not implement newlink at all (uses ioctls, I think).
>
>
>>> Oh, and the kernel (struct rtnl_link_ops).newlink method has two (struct
>>> nlattr *[]) params: tb and data. What are their roles?
>>>
>
>Heh, I just realised that the difference is that "tb" contains generic (IFLA_)
>message data and "data" contains specific (e.g. IFLA_VLAN_) message data. 

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


* Re: bridge: HSR support
  2011-12-07 19:59                 ` Jay Vosburgh
@ 2011-12-08 14:45                   ` Arvid Brodin
  0 siblings, 0 replies; 15+ messages in thread
From: Arvid Brodin @ 2011-12-08 14:45 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Stephen Hemminger, netdev

Jay Vosburgh wrote:
> Arvid Brodin <arvid.brodin@enea.com> wrote:
>> * I don't know the meaning of the IFF_SLAVE flag. It's referenced all over the place
>>  (core, vlan, bonding, ipv6, eql). Do I need to/want to set this?
> 
> 	Only if you actually need to for some reason.  There are a few
> tests that make actual use of IFF_SLAVE, e.g., IPv6 won't run addrconf
> on an interface with IFF_SLAVE set (which prevents bonding slaves from
> having an IPv6 address distinct from the master).  Netpoll also treats
> interfaces with IFF_SLAVE in a special way.  Bonding uses it internally
> for various tests.
> 
>> * I don't know the effects of setting dev->master. Do I need/want this?
> 
> 	Maybe.  One effect of netdev_set_master is that a reference is
> acquired on the master, on behalf of the slave, so the master cannot
> simply vanish until the slave releases that reference.  This predates
> the notifier facility, and careful use of notifiers (handling
> NETDEV_UNREGISTER) can get around the need for dev->master, but, e.g.,
> vlan still acquires a reference to the real_dev without using
> dev->master.
> 
> 	It used to be that dev->master was used in netif_receive_skb for
> packet handling purposes (for bonding, mostly; bridge and I think
> macvlan had separate hooks).  That special sauce is now done by the
> rx_handler, so there's really no requirement to use dev->master if you
> have no need.
> 
>> * I don't want to forward all ingress frames on the slave devices to my master
>>  device; I only want the ones with protocol 0x88FB to be forwarded (other
>>  frames should be received by the slaves as normal). I think I already have this
>>  covered by registering a protocol handler (using dev_add_pack(packet_type)).
>>  So perhaps calling netdev_rx_handler_register() is not necessary in my case?
> 
> 	You may want to use the rx_handler, and have it set skb->dev
> appropriately for the frames that should forward to the master, and
> leave skb->dev alone for the ones that should stick with the slave.
> Both of those need the appropriate return from the rx_handler, which is
> documented in netdevice.h.
> 
> 	I'm not sure that you need a dev_add_pack at all; bonding
> doesn't use one anymore, since everything it needs can now be done via
> rx_handler.  The dev_add_pack approach may work, but rx_handler is
> probably more efficient.
> 
>> * As far as I can see, neither bridging nor bonding is handled by the ip program
>>  (iproute2 suite)? I.e. no examples of binding more than one interface to a
>>  virtual interface when it comes to which messages to send, etc. VLAN uses
>>  IFLA_IFNAME (name of the vlan link), IFLA_LINK (physical link behind the vlan
>>  link), and some IFLA_VLAN-specific messages.
> 
> 	In current versions of iproute, something like "ip link set
> device eth0 master bond0" would add a slave to a bond.  You are correct,
> though, that ip does not permit changing the bonding options, and I
> don't believe it will create new master devices, either, so the bonding
> support is limited.
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 

Lots of great info there, many thanks!
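
The rx_handler suggestion above comes down to a per-frame decision: HSR frames are steered to the master, and everything else stays with the slave. Below is a minimal user-space sketch of that decision only, not kernel code. The enum, its values and classify_frame() are hypothetical names made up for illustration; the real rx_handler return codes are the ones documented in netdevice.h. The sketch also assumes an untagged Ethernet header (no VLAN tag in front of the EtherType).

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN   14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_HSR 0x88FB  /* EtherType quoted in this thread */

/* Hypothetical verdicts, standing in for the real rx_handler results. */
enum rx_verdict {
	RX_KEEP_ON_SLAVE,   /* leave the frame with the slave that received it */
	RX_STEER_TO_MASTER  /* hand the frame to the master device (e.g. hsr0) */
};

/*
 * Decide what to do with one received frame. Assumes an untagged
 * Ethernet header, so the EtherType sits at byte offsets 12 and 13
 * in network (big-endian) byte order.
 */
static enum rx_verdict classify_frame(const uint8_t *frame, size_t len)
{
	uint16_t ethertype;

	if (len < ETH_HDR_LEN)
		return RX_KEEP_ON_SLAVE;  /* too short to carry an EtherType */

	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

	return ethertype == ETHERTYPE_HSR ? RX_STEER_TO_MASTER
					  : RX_KEEP_ON_SLAVE;
}
```

In a real handler the "steer to master" branch would be where skb->dev is changed, as described in the quoted reply.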

-- 
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support
  2011-10-24 14:17     ` Arvid Brodin
  2011-10-28 15:34       ` Arvid Brodin
@ 2012-01-06 18:11       ` Arvid Brodin
  2012-01-12 18:02         ` bridge: HSR support - possible recursive locking? Arvid Brodin
  1 sibling, 1 reply; 15+ messages in thread
From: Arvid Brodin @ 2012-01-06 18:11 UTC (permalink / raw)
  To: netdev; +Cc: Arvid Brodin

Arvid Brodin wrote:
>> On Tue, 11 Oct 2011 20:25:08 +0200
>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>
>>> Hi,
>>>
>>> I want to add support for HSR ("High-availability Seamless Redundancy",
>>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>>> ports and are connected in a ring. All new Ethernet packets are sent on both
>>> ports (or passed through if the current unit is not the originating unit). The
>>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>>
>>> This gives instant, reconfiguration-free failover.
>>>
*snip*
> 
> I need to do two things:
> 
> 1) Bind two network interfaces into one (say, eth0 & eth1 => hsr0). Frames sent on
>    hsr0 should get an HSR tag (including the correct EtherType) and go out on both
>    eth0 and eth1.
> 
> 2) Ingress frames on eth0 & eth1, with EtherType 0x88fb, should be captured and 
>    handled specially (either received on hsr0 or forwarded to the other bound 
>    physical interface).
> 

I'm slowly getting there! :)

But what is net_device->header_ops->rebuild supposed to do?

Thanks,
Arvid Brodin
Enea Services Stockholm AB


* Re: bridge: HSR support - possible recursive locking?
  2012-01-06 18:11       ` Arvid Brodin
@ 2012-01-12 18:02         ` Arvid Brodin
  0 siblings, 0 replies; 15+ messages in thread
From: Arvid Brodin @ 2012-01-12 18:02 UTC (permalink / raw)
  To: netdev; +Cc: arbr

Arvid Brodin wrote:
> Arvid Brodin wrote:
>>> On Tue, 11 Oct 2011 20:25:08 +0200
>>> Arvid Brodin <arvid.brodin@enea.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to add support for HSR ("High-availability Seamless Redundancy",
>>>> IEC-62439-3) to the bridge code. With HSR, all connected units have two network
>>>> ports and are connected in a ring. All new Ethernet packets are sent on both
>>>> ports (or passed through if the current unit is not the originating unit). The
>>>> same packet is never passed twice. Non-HSR units are not allowed in the ring.
>>>>
>>>> This gives instant, reconfiguration-free failover.
>>>>
> *snip*
>> I need to do two things:
>>
>> 1) Bind two network interfaces into one (say, eth0 & eth1 => hsr0). Frames sent on
>>    hsr0 should get an HSR tag (including the correct EtherType) and go out on both
>>    eth0 and eth1.
>>
>> 2) Ingress frames on eth0 & eth1, with EtherType 0x88fb, should be captured and 
>>    handled specially (either received on hsr0 or forwarded to the other bound 
>>    physical interface).
>>
> 
> I'm slowly getting there! :)
> 
> But what is net_device->header_ops->rebuild supposed to do?
> 

I get a "possible recursive locking detected" warning when I send cloned packets, and I
can't figure out why. Here's the stack dump and some debug printouts:


hsr_dev_xmit:286: sent on first slave

=============================================
[ INFO: possible recursive locking detected ]
2.6.37 #43
---------------------------------------------
swapper/0 is trying to acquire lock:
 (_xmit_ETHER#2){+.-...}, at: [<901b9aae>] sch_direct_xmit+0x24/0x152

but task is already holding lock:
 (_xmit_ETHER#2){+.-...}, at: [<901afc4a>] dev_queue_xmit+0x2ce/0x37c

other info that might help us debug this:
4 locks held by swapper/0:
 #0:  (&n->timer){+.-...}, at: [<9002b2b4>] run_timer_softirq+0x98/0x184
 #1:  (rcu_read_lock_bh){.+....}, at: [<901af97c>] dev_queue_xmit+0x0/0x37c
 #2:  (_xmit_ETHER#2){+.-...}, at: [<901afc4a>] dev_queue_xmit+0x2ce/0x37c
 #3:  (rcu_read_lock_bh){.+....}, at: [<901af97c>] dev_queue_xmit+0x0/0x37c

stack backtrace:
Call trace:
 [<9001c264>] dump_stack+0x18/0x20
 [<9003fdbc>] validate_chain+0x40c/0x9ac
 [<90040968>] __lock_acquire+0x60c/0x670
 [<90041cda>] lock_acquire+0x3a/0x48
 [<90216c5c>] _raw_spin_lock+0x20/0x44
 [<901b9aae>] sch_direct_xmit+0x24/0x152
 [<901afb44>] dev_queue_xmit+0x1c8/0x37c
 [<90213090>] nf_hook_xmit+0x8/0xc
 [<902130a2>] slave_xmit+0xe/0x10
 [<902131d6>] hsr_dev_xmit+0xa6/0xcc
 [<901af8c2>] dev_hard_start_xmit+0x382/0x43c
 [<901afc64>] dev_queue_xmit+0x2e8/0x37c
 [<901dc8a0>] arp_xmit+0x8/0xc
 [<901dcf86>] arp_send+0x2a/0x2c
 [<901dd978>] arp_solicit+0x110/0x130
 [<901b54a4>] neigh_timer_handler+0x1c2/0x206
 [<9002b31e>] run_timer_softirq+0x102/0x184
 [<90027eb8>] __do_softirq+0x64/0xe0
 [<9002804a>] do_softirq+0x26/0x48
 [<90028146>] irq_exit+0x2e/0x64
 [<90019bae>] do_IRQ+0x46/0x5c
 [<90018424>] irq_level0+0x18/0x60
 [<902136ae>] rest_init+0x72/0x90
 [<9000063c>] start_kernel+0x21c/0x258
 [<00000000>] 0x0

hsr_dev_xmit:289: sent on second slave

The code looks like this (from my hsr_dev_xmit() function):

	...
	skb2 = skb_clone(skb, GFP_ATOMIC);
	slave_xmit(skb, hsr_priv->slave_data[0].dev);
	printk(KERN_INFO "%s:%d: sent on first slave\n", __func__, __LINE__);
	if (skb2)
		slave_xmit(skb2, hsr_priv->slave_data[1].dev);
	printk(KERN_INFO "%s:%d: sent on second slave\n", __func__, __LINE__);
	...

and slave_xmit looks like this:

int nf_hook_xmit(struct sk_buff *skb)
{
	dev_queue_xmit(skb);
	return 0;
}

static int slave_xmit(struct sk_buff *skb, struct net_device *dev)
{
	int res;

	skb->dev = dev;
	skb->priority = 1; // FIXME: what does this mean?

	res = NF_HOOK(NFPROTO_BRIDGE, NF_BR_POST_ROUTING, skb, NULL, skb->dev, nf_hook_xmit);
//	res = dev_queue_xmit(skb);
	/* Buffer is consumed on errors too, so nothing to do here, really... */

	return res;
}

I believe I'm doing exactly the same thing as the bridging code (but of course I
can't be). So what is it that I'm doing wrong???
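
One plausible reading of the report above: lockdep reasons about lock classes rather than individual lock instances, and both locks in the trace are printed with the same class name, _xmit_ETHER#2. The backtrace shows hsr_dev_xmit() being called from dev_hard_start_xmit() inside dev_queue_xmit(), so the transmit lock of the hsr device is presumably still held when slave_xmit() queues the frame on a slave and sch_direct_xmit() takes that slave's transmit lock. These are two different locks, but they share a class, which is enough to trigger the warning. The toy user-space model below illustrates only that idea. It is not the kernel's lockdep, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Toy stand-in for a lock: only its class matters to the checker. */
struct toy_lock {
	int class_id;
};

/* Toy stand-in for a task: the stack of locks it currently holds. */
struct toy_task {
	const struct toy_lock *held[MAX_HELD];
	size_t nheld;
};

/*
 * Record the acquisition and report whether it would be flagged.
 * It is flagged when the task already holds a lock of the same
 * class, even if that is a different lock instance.
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	bool flagged = false;
	size_t i;

	for (i = 0; i < t->nheld; i++)
		if (t->held[i]->class_id == l->class_id)
			flagged = true;

	if (t->nheld < MAX_HELD)
		t->held[t->nheld++] = l;

	return flagged;
}

/* Drop the most recently acquired lock. */
static void toy_release(struct toy_task *t)
{
	if (t->nheld > 0)
		t->nheld--;
}
```

In this model, taking a lock for "hsr0" and then one for "eth0" is flagged when both carry the same class id, and passes when the virtual device's lock has a class of its own. As far as I know, the in-tree stacked drivers of that era avoid the warning along similar lines: either the virtual device does not take a transmit lock at all (NETIF_F_LLTX), or its queue locks are given a separate class with lockdep_set_class(). That is worth verifying against the bridge and bonding sources rather than taking on trust.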


-- 
Arvid Brodin
Enea Services Stockholm AB


end of thread, other threads:[~2012-01-12 18:02 UTC | newest]

Thread overview: 15+ messages
     [not found] <4E948A04.8060400@enea.com>
     [not found] ` <20111011112821.28cd3e51@nehalam.linuxnetplumber.net>
2011-10-11 23:51   ` bridge: HSR support Arvid Brodin
2011-10-12 13:28     ` David Lamparter
2011-10-12 14:24       ` Arvid Brodin
2011-10-24 14:17     ` Arvid Brodin
2011-10-28 15:34       ` Arvid Brodin
2011-10-28 15:54         ` Stephen Hemminger
2011-10-28 16:36           ` Arvid Brodin
2011-12-06 23:23           ` Arvid Brodin
2011-12-06 23:27             ` Stephen Hemminger
2011-12-07 18:30               ` Arvid Brodin
2011-12-07 19:59                 ` Jay Vosburgh
2011-12-08 14:45                   ` Arvid Brodin
2011-11-21 16:52         ` Arvid Brodin
2012-01-06 18:11       ` Arvid Brodin
2012-01-12 18:02         ` bridge: HSR support - possible recursive locking? Arvid Brodin
