* PRP with VLAN support - or how to contribute to a Linux network driver
@ 2023-11-06 11:01 Heiko Gerstung
  2023-11-08 15:17 ` Andrew Lunn
  0 siblings, 1 reply; 6+ messages in thread
From: Heiko Gerstung @ 2023-11-06 11:01 UTC (permalink / raw)
  To: netdev

Hi All,

We are looking for a way to use the hsr/prp driver in our products and found out that it does not currently support VLANs. As far as I can see, the hsr driver is marked as “orphan”, i.e. there is no active maintainer for it.

I would like to discuss if it makes sense to remove the PRP functionality from the HSR driver (which is based on the bridge kernel module AFAICS) and instead implement PRP as a separate module (based on the Bonding driver, which would make more sense for PRP). We have a working implementation for such a module for 4.14 and would only need help in porting it to newer kernels. We would volunteer to maintain that kernel module (or sponsor someone who could).  

Hoping for advice on what the next steps could be. Happy to discuss this off-list, as it may not be of interest to most people.

Thank you
  Heiko

* Re: PRP with VLAN support - or how to contribute to a Linux network driver
  2023-11-06 11:01 PRP with VLAN support - or how to contribute to a Linux network driver Heiko Gerstung
@ 2023-11-08 15:17 ` Andrew Lunn
  2023-11-09  8:08   ` Heiko Gerstung
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Lunn @ 2023-11-08 15:17 UTC (permalink / raw)
  To: Heiko Gerstung; +Cc: netdev

> I would like to discuss if it makes sense to remove the PRP
> functionality from the HSR driver (which is based on the bridge
> kernel module AFAICS) and instead implement PRP as a separate module
> (based on the Bonding driver, which would make more sense for PRP).

Seems like nobody replied. I don't know PRP or HSR, so I can only make
general remarks.

The general policy is that we don't rip something out and replace it
with new code. We try to improve what already exists to meet the
demands. This is partially because of backwards compatibility. There
could be users using the code as is. You cannot break that. Can you
step by step modify the current code to make use of bonding, and in
the process show you don't break the current use cases? You also need
to consider offloading to hardware. The bridge code has infrastructure
to offload. Does the bond driver? I've no idea about that.

> Hoping for advice on what the next steps could be. Happy to discuss
> this off-list, as it may not be of interest to most people.

You probably want to get together with others who are interested in
PRP and HSR. linutronix, ti, microchip, etc.

	Andrew

* Re: PRP with VLAN support - or how to contribute to a Linux network driver
  2023-11-08 15:17 ` Andrew Lunn
@ 2023-11-09  8:08   ` Heiko Gerstung
  2023-11-09 12:20     ` Kristian Myrland Overskeid
  0 siblings, 1 reply; 6+ messages in thread
From: Heiko Gerstung @ 2023-11-09  8:08 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev

On 08.11.23, 16:17, "Andrew Lunn" <andrew@lunn.ch> wrote:

>> I would like to discuss if it makes sense to remove the PRP
>> functionality from the HSR driver (which is based on the bridge
>> kernel module AFAICS) and instead implement PRP as a separate module
>> (based on the Bonding driver, which would make more sense for PRP).


> Seems like nobody replied. I don't know PRP or HSR, so I can only make
> general remarks.

Thank you for responding!

> The general policy is that we don't rip something out and replace it
> with new code. We try to improve what already exists to meet the
> demands. This is partially because of backwards compatibility. There
> could be users using the code as is. You cannot break that. Can you
> step by step modify the current code to make use of bonding, and in
> the process show you don't break the current use cases? 

Understood. I am not sure whether we can gradually change the hsr driver to use a more bonding-like approach for PRP, and I believe this might not be required as long as we can get VLAN support into it.

> You also need to consider offloading to hardware. The bridge code has infrastructure
> to offload. Does the bond driver? I've no idea about that.

I do not know this either, but I would expect that bonding by its nature does not require offloading support (I do not see any potential for efficiency or performance improvements there, unlike with HSR or PRP).

>> Hoping for advice on what the next steps could be. Happy to discuss
>> this off-list, as it may not be of interest to most people.

> You probably want to get together with others who are interested in
> PRP and HSR. linutronix, ti, microchip, etc.

Yes, I would love to do that, and my hope was that I would find them here. I am not familiar with the "orphaned" status for a kernel module, but I would have expected that one of the mentioned parties interested in PRP/HSR would have adopted the module.

> Andrew

Again, thanks a lot for your comments and remarks, very useful.

Heiko

* Re: PRP with VLAN support - or how to contribute to a Linux network driver
  2023-11-09  8:08   ` Heiko Gerstung
@ 2023-11-09 12:20     ` Kristian Myrland Overskeid
  2023-11-10  8:24       ` Heiko Gerstung
  0 siblings, 1 reply; 6+ messages in thread
From: Kristian Myrland Overskeid @ 2023-11-09 12:20 UTC (permalink / raw)
  To: Heiko Gerstung; +Cc: Andrew Lunn, netdev

If you simply remove the line "dev->features |=
NETIF_F_VLAN_CHALLENGED;" in hsr_device.c, the hsr module handles
VLAN frames without any further modifications. Unless you need to send
VLAN-tagged supervision frames, I'm pretty sure the current
implementation works just as well with VLANs as without.

However, in my opinion, the discard algorithm
(hsr_register_frame_out() in hsr_framereg.c) is not made for switched
networks. The problem with the current implementation is that it does
not account for frames arriving in a different order than they were
sent by a host. It simply checks whether the sequence number of an
arriving frame is higher than the previous one. If the network uses
some sort of prioritization, it must be expected that frames will
arrive out of order once the network load is high enough for the
switches to start prioritizing.
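
To illustrate the failure mode described above, here is a minimal
Python sketch. It is a simplified model of a "higher than the last
sequence number" check, not the kernel code itself. The 16-bit
sequence space and the names seq_after, LastSeqFilter and run are
assumptions made for this example.

```python
SEQ_MOD = 1 << 16  # assumption: 16-bit sequence counter


def seq_after(a, b):
    """Return True if a is later than b in modulo-2^16 arithmetic."""
    diff = (a - b) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2


class LastSeqFilter:
    """Accept a frame only if its sequence number is newer than the last accepted one."""

    def __init__(self):
        self.last = None

    def accept(self, seq):
        if self.last is not None and not seq_after(seq, self.last):
            return False  # treated as a duplicate (or as too old)
        self.last = seq
        return True


def run(filter_, arrivals):
    """Return the sequence numbers that pass the filter, in arrival order."""
    return [seq for seq in arrivals if filter_.accept(seq)]


# The sender emits 1, 2, 3, 4 on both LANs. Frame 3 has a higher priority
# and overtakes frame 2 on both LANs, so both copies of 2 arrive after 3.
arrivals = [1, 1, 3, 3, 2, 2, 4, 4]
delivered = run(LastSeqFilter(), arrivals)
print(delivered)  # frame 2 never reaches the upper layers
```

In this model both copies of frame 2 are discarded, although neither
of them is a duplicate of a frame that was already delivered.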

My solution was to add a linked list to the node struct, with one
entry for each registered VLAN ID. Each entry contains the VLAN ID,
the last sequence number and the time. On reception of a VLAN frame
destined for HSR_PT_MASTER, it retrieves the "node_seq_out" and
"node_time_out" based on the VLAN.
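
A rough Python sketch of the per-VLAN idea follows. It is not the
actual patch: a dict keyed by VLAN ID stands in for the linked list,
and the names PerVlanFilter and seq_after are hypothetical. It also
assumes that frames within one VLAN keep their order, which is the
trunk-port situation described in this mail.

```python
import time

SEQ_MOD = 1 << 16  # assumption: 16-bit sequence counter


def seq_after(a, b):
    """Return True if a is later than b in modulo-2^16 arithmetic."""
    diff = (a - b) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2


class PerVlanFilter:
    """Track the last accepted sequence number and its time per VLAN ID."""

    def __init__(self):
        self.state = {}  # vlan_id -> (last_seq, last_time)

    def accept(self, vlan_id, seq, now=None):
        now = time.monotonic() if now is None else now
        entry = self.state.get(vlan_id)
        if entry is not None and not seq_after(seq, entry[0]):
            return False  # duplicate within this VLAN
        # The timestamp is stored for aging decisions; this sketch does not use it.
        self.state[vlan_id] = (seq, now)
        return True


# (vlan_id, seq) pairs: VLAN 10 is prioritized, so seq 3 overtakes seq 2.
arrivals = [(20, 1), (20, 1), (10, 3), (10, 3),
            (20, 2), (20, 2), (20, 4), (20, 4)]
vlan_filter = PerVlanFilter()
delivered_vlan = [(v, s) for v, s in arrivals if vlan_filter.accept(v, s)]
print(delivered_vlan)
```

Because the state is looked up per VLAN, the late frame with sequence
number 2 is compared only against earlier frames of VLAN 20 and is
delivered.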

This works fine for me because all the prp nodes are connected to
trunk ports and the switches are prioritizing frames based on the vlan
tag.

If a prp node is connected to an access port, but the network is using
vlan priority, all sequence numbers and timestamps with the
corresponding vlan id must be kept in a hashed list. The list must be
regularly checked to remove elements before new frames with a wrapped
around sequence number can arrive.
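
The hashed-list approach could look roughly like the Python sketch
below. It is only an illustration under stated assumptions: the class
name SeenSeqTable, the max_age parameter and its value of 0.4 seconds
are made up for the example, and a dict plays the role of the hashed
list. In practice one such table would be needed per source node.

```python
class SeenSeqTable:
    """Remember every recently seen sequence number and drop repeats."""

    def __init__(self, max_age):
        self.max_age = max_age  # seconds an entry is kept (assumed value)
        self.seen = {}          # seq -> arrival time of the first copy

    def expire(self, now):
        """Forget entries older than max_age so wrapped numbers are accepted again."""
        cutoff = now - self.max_age
        self.seen = {s: t for s, t in self.seen.items() if t > cutoff}

    def accept(self, seq, now):
        self.expire(now)
        if seq in self.seen:
            return False  # second copy of a frame that was already delivered
        self.seen[seq] = now
        return True


table = SeenSeqTable(max_age=0.4)
# (arrival time in seconds, seq): 3 overtakes 2, and much later the
# sequence counter has wrapped around and 1 is used again.
events = [(0.00, 1), (0.01, 1), (0.02, 3), (0.03, 3),
          (0.04, 2), (0.05, 2), (1.00, 1)]
delivered_table = [seq for now, seq in events if table.accept(seq, now)]
print(delivered_table)
```

The aging step is what the regular list check is for: without it,
the reused sequence number 1 at the end would be discarded as a
duplicate.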

ZHAW School of Engineering has made a prp program for both linux user
and kernel space with such a discard algorithm. The program does not
compile without some modifications, but the discard algorithm works
fine. The program is open source and can be found at
https://github.com/ZHAW-InES-Team/sw_stack_prp1.



* Re: PRP with VLAN support - or how to contribute to a Linux network driver
  2023-11-09 12:20     ` Kristian Myrland Overskeid
@ 2023-11-10  8:24       ` Heiko Gerstung
  2023-11-12 20:52         ` Kristian Myrland Overskeid
  0 siblings, 1 reply; 6+ messages in thread
From: Heiko Gerstung @ 2023-11-10  8:24 UTC (permalink / raw)
  To: Kristian Myrland Overskeid; +Cc: Andrew Lunn, netdev

On 09.11.23, 13:20, "Kristian Myrland Overskeid" <koverskeid@gmail.com> wrote:

Hi Kristian,

> If you simply remove the line "dev->features |=
> NETIF_F_VLAN_CHALLENGED;" in hsr_device.c, the hsr module handles
> VLAN frames without any further modifications. Unless you need to send
> VLAN-tagged supervision frames, I'm pretty sure the current
> implementation works just as well with VLANs as without.

Thanks a lot for your response. We tried removing the NETIF_F_VLAN_CHALLENGED flag and it did not work for us. We could set up a VLAN interface on top of the PRP interface, but traffic did not get through. I will retest this to make sure we did not overlook something.

> However, in my opinion, the discard algorithm
> (hsr_register_frame_out() in hsr_framereg.c) is not made for switched
> networks. The problem with the current implementation is that it does
> not account for frames arriving in a different order than they were
> sent by a host. It simply checks whether the sequence number of an
> arriving frame is higher than the previous one. If the network uses
> some sort of prioritization, it must be expected that frames will
> arrive out of order once the network load is high enough for the
> switches to start prioritizing.
>
> My solution was to add a linked list to the node struct, one for each
> registered vlan id. It contains the vlan id, last sequence number and
> time. On reception of a vlan frame to the HSR_PT_MASTER, it retrieves
> the "node_seq_out" and "node_time_out" based on the vlan.

I agree that it would be necessary to handle frames arriving in a mixed up order.

> This works fine for me because all the prp nodes are connected to
> trunk ports and the switches are prioritizing frames based on the vlan
> tag.

> If a prp node is connected to an access port, but the network is using
> vlan priority, all sequence numbers and timestamps with the
> corresponding vlan id must be kept in a hashed list. The list must be
> regularly checked to remove elements before new frames with a wrapped
> around sequence number can arrive.

If I understand correctly, this would make the discard process more robust, because in the access port scenario the frames can arrive in an even more mixed-up order. Or do you mean that the access port removes the VLAN tag and sends the frames untagged to the node?

> ZHAW School of Engineering has made a prp program for both linux user
> and kernel space with such a discard algorithm. The program does not
> compile without some modifications, but the discard algorithm works
> fine. The program is open source and can be found at
> https://github.com/ZHAW-InES-Team/sw_stack_prp1.


I will reach out to ZHAW and check whether they would be willing to integrate their more robust discard mechanism into the hsr module. The GitHub repo has a note saying it moved to github.zhaw.ch, which I cannot access as it requires credentials.

Thanks again, 

Heiko

* Re: PRP with VLAN support - or how to contribute to a Linux network driver
  2023-11-10  8:24       ` Heiko Gerstung
@ 2023-11-12 20:52         ` Kristian Myrland Overskeid
  0 siblings, 0 replies; 6+ messages in thread
From: Kristian Myrland Overskeid @ 2023-11-12 20:52 UTC (permalink / raw)
  To: Heiko Gerstung; +Cc: Andrew Lunn, netdev

Hi Heiko,

> Thanks a lot for your response. We tried removing the NETIF_F_VLAN_CHALLENGED flag and it did not work for us. We could set up a VLAN interface on top of the PRP interface, but traffic did not get through. I will retest this to make sure we did not overlook something.

It worked for me on Ubuntu 22.04.3 LTS, but I haven't tried it on
other distros. You can use tcpdump to check whether the VLAN frames
reach the PRP interface. If not, it's probably a VLAN configuration
issue.

One thing you should be aware of is that unless you're testing on the
vanilla kernel, you should compare the source code of the hsr module
with the vanilla kernel's. For example, the hsr module on Ubuntu is far
behind the vanilla kernel and I needed to add changes manually to get
rid of some bugs (not related to VLAN, though). If you are using a
distro with an even more outdated hsr module, this could be the reason
why your tests are failing with the NETIF_F_VLAN_CHALLENGED flag
removed.

> If I understand correctly, this would make the discard process more robust, because in the access port scenario the frames can arrive in an even more mixed-up order. Or do you mean that the access port removes the VLAN tag and sends the frames untagged to the node?

I see that I could have explained myself better here. I meant that the
access port removes the VLAN tag and sends the frames untagged to
the node. In this case you cannot distinguish between the different
VLANs, which means that you have to keep track of all sequence numbers
that should be dropped, so that legitimate frames arriving out of
order are not dropped. I wrote that the VLAN ID must be stored, but
this is not necessary, since the source nodes don't consider VLAN IDs
when setting the sequence number for outgoing frames.

Kristian