* [RFC][bonding] Improve VLAN support on top of bonding
@ 2003-07-15 13:55 Shmulik Hen
  2003-07-15 17:24 ` Ben Greear
  0 siblings, 1 reply; 10+ messages in thread
From: Shmulik Hen @ 2003-07-15 13:55 UTC (permalink / raw)
  To: bond-devel, linux-net, linux-netdev, David S. Miller, Ben Greear,
	Jeff Garzik, Jay Vosburgh
  Cc: Amir Noam, Noam Marom, Shmulik Hen, Tsippy Mendelson

Hi All,

	Currently, when using the 8021q VLAN module on top of bonding,
everything seems to work OK, but according to our analysis some things
will not work. For example, any self-generated packets sent by
bonding itself (e.g. arp-mon, TLB learning packets, ALB arp replies, etc.)
do not carry the VLAN id tag, and thus will not go through the
switch. Also, in order to configure a VLAN interface, the underlying
interface must first be configured to IP address 0.0.0.0. Since arp-mon
uses the bond's IP address, this might cause further problems. Other issues
that we have not yet investigated, like what happens if bonding needs to
parse a tagged packet for layer2/layer3 data, might create more problems.

	We have come up with some possible solutions we would like to get
comments on. First of all, our main guideline was not to duplicate code
segments that are in the VLAN module and put them in bonding. Further, we
figured bonding should not need to know how the VLAN module handles
hardware acceleration. On the other hand, bonding does need to know what
VLAN tags are being used so it may send packets successfully through all
the switch ports, so some kind of policy needs to be defined.

So here is what we've come up with so far.

1. Configuration
   Need to decide between:
   a. Block VLAN add/del operations when bond has no slaves.
   b. Block enslave/release of slaves when bond has no VLAN tags (needs a
      module parameter).
   c. Remove limitation of IP 0.0.0.0.

2. Indication
   Need to decide between:
   a. Add notification mechanism in VLAN module that bonding may register
      to listen to, and thus keep track of VLAN tags added/removed.
   b. Register to listen to net device register/unregister notifications
      to monitor creation/destruction of VLAN devices. Requires support
      for figuring out if a net device is a VLAN device, and also two vlan
      calls like get_realdev() and get_vlan_id() exported.
   c. Parse every packet going through bonding to collect VLAN tags.

3. Monitoring
   In order for bonding to be able to generate tagged packets on its own,
   two major changes need to be made. One is to split the vlan_start_xmit
   function into insert_tag() and vlan_xmit(), so bonding may choose the
   required tag on its own and let 8021q do the transmit. A second change
   is to split arp_send() into arp_create() and arp_send(), so bonding may
   pass all the usual parameters for arp creation, get a complete arp
   packet and then pass it to 8021q for tag insertion on transmission.
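
To make the tag-insertion half of item 3 concrete, here is a minimal
user-space sketch of what an insert_tag() helper could do to a pre-built
frame such as an ARP probe. It is not the 8021q code: the signature, the
headroom parameter and the constants below are assumptions made for
illustration only. It moves the two MAC addresses forward into spare
headroom and writes the 4-byte 802.1Q tag into the gap, leaving the
original EtherType and payload where they are.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6        /* bytes in a MAC address */
#define ETH_HLEN    14       /* untagged Ethernet header length */
#define VLAN_HLEN   4        /* bytes added by an 802.1Q tag */
#define TPID_8021Q  0x8100   /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper (not an existing kernel function).
 * frame    - start of an untagged Ethernet frame
 * headroom - free bytes available immediately before frame
 * len      - current frame length in bytes
 * Returns the new start of the frame (VLAN_HLEN bytes earlier),
 * or NULL if the arguments are unusable. The new length is
 * len + VLAN_HLEN.
 */
uint8_t *insert_tag(uint8_t *frame, size_t headroom, size_t len,
                    uint16_t vlan_id, uint8_t prio)
{
    uint8_t *start;
    uint16_t tci;

    if (headroom < VLAN_HLEN || len < ETH_HLEN ||
        vlan_id > 0x0FFF || prio > 7)
        return NULL;

    /* Shift only the destination and source MACs into the headroom. */
    start = frame - VLAN_HLEN;
    memmove(start, frame, 2 * ETH_ALEN);

    /* TCI: 3-bit priority, 1-bit DEI (left 0), 12-bit VLAN id. */
    tci = (uint16_t)((prio << 13) | vlan_id);

    /* Write TPID and TCI in network byte order into the gap. */
    start[12] = (uint8_t)(TPID_8021Q >> 8);
    start[13] = (uint8_t)(TPID_8021Q & 0xFF);
    start[14] = (uint8_t)(tci >> 8);
    start[15] = (uint8_t)(tci & 0xFF);

    return start;
}
```

With a helper of this shape, bonding would pick the VLAN id itself and
hand the tagged frame on for transmission; only 12 bytes are moved no
matter how large the packet is.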


Hardware acceleration
=====================
	When coming to analyze what is required for adding support for
VLAN hardware acceleration on top of bonding, other issues come to mind.
Since add/del operations are defined and handshakes are performed between
the VLAN module and the device driver, tracking VLAN tags is simpler and
commands should just be propagated to the slaves. Enslaving/releasing
slaves should also be simple and just require adding/removing existing
VLAN tags from them. The problem is how to handle configuration issues.

  1. Since adding the first VLAN tag requires some additional handshake,
     can bonding support that operation on a bond that already has slaves
     and is running?
  2. What about removing the last tag from a bond?
  3. Should the bond device declare itself as "VLAN challenged" before
     registering and remove that limitation only once it has slaves?
  4. Should the bond declare itself as fully hardware acceleration capable
     to benefit from "strong" slaves while performing regular VLAN
     inserting/stripping for "weak" slaves?
  5. How can bonding generate untagged packets and send them via
     hardware acceleration capable slaves (e.g. 802.3ad LACPDU) ?
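
One possible shape of the policy behind questions 4 and 5, as a sketch
only. The struct slave capability flag and the enum names are
hypothetical; nothing here is existing bonding or 8021q code. The idea
is that the bond decides per packet and per slave whether the tag is
left to the hardware, inserted in software, or omitted altogether.

```c
#include <stdbool.h>

/* How a given packet should be tagged on a given slave. */
enum tag_path {
    TAG_NONE,   /* send untagged, e.g. an 802.3ad LACPDU */
    TAG_HW,     /* "strong" slave: let the NIC insert the tag */
    TAG_SW      /* "weak" slave: insert the tag in software */
};

/* Hypothetical per-slave capability record. */
struct slave {
    bool hw_vlan_tx;   /* true if the driver can insert tags itself */
};

/*
 * Decide the tagging path for one packet on one slave.
 * wants_tag is false for frames that must leave untagged.
 */
enum tag_path choose_tag_path(const struct slave *s, bool wants_tag)
{
    if (!wants_tag)
        return TAG_NONE;
    return s->hw_vlan_tx ? TAG_HW : TAG_SW;
}
```

Whether a hardware-accelerated slave can actually be told to send an
untagged frame is exactly what question 5 leaves open; the sketch only
shows where that decision would sit.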


-- 
| Shmulik Hen                             |
| Israel Design Center (Jerusalem)        |
| LAN Access Division                     |
| Intel Communications Group, Intel corp. |




* Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 13:55 [RFC][bonding] Improve VLAN support on top of bonding Shmulik Hen
@ 2003-07-15 17:24 ` Ben Greear
  2003-07-15 17:55   ` [Bonding-devel] " Dan Hollis
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2003-07-15 17:24 UTC (permalink / raw)
  To: Shmulik Hen
  Cc: bond-devel, linux-net, linux-netdev, David S. Miller,
	Jeff Garzik, Jay Vosburgh, Amir Noam, Noam Marom,
	Tsippy Mendelson

Shmulik Hen wrote:
> Hi All,
> 
> 	Currently, when using 8021q VLAN module to work on top of bonding,
> everything seems to work OK, but according to our analysis some things
> will not work. For example, any self-generated packets sent by
> bonding itself (e.g. arp-mon, TLB learning packets, ALB arp replies, etc.)
> do not carry the VLAN id tag, and thus will not go through the
> switch. Also, in order to configure a VLAN interface, the underlying
> interface must first be configured to IP address 0.0.0.0. Since arp-mon
> uses the bond's IP address, this might cause further problems. Other issues
> that we have not yet investigated, like what happens if bonding needs to
> parse a tagged packet for layer2/layer3 data, might create more problems.
> 
> 	We have come up with some possible solution we would like to get
> comments on. First of all, our main guideline was not to duplicate code
> segments that are in the VLAN module and put them in bonding. Further, we
> figured bonding should not need to know about how the VLAN module handles
> hardware acceleration. On the other hand, bonding does need to know what
> VLAN tags are being used so it may send packets successfully through all
> the switch ports, so some kind of policy needs to be defined.
> 
> So here is what we've come up with so far.
> 
> 1. Configuration
>    Need to decide between:
>    a. Block VLAN add/del operations when bond has no slaves.
>    b. Block enslave/release of slaves when bond has no VLAN tags (needs a
>       module parameter).
>    c. Remove limitation of IP 0.0.0.0.
> 
> 2. Indication
>    Need to decide between:
>    a. Add notification mechanism in VLAN module that bonding may register
>       to listen to, and thus keep track of VLAN tags added/removed.
>    b. Register to listen to net device register/unregister notifications
>       to monitor creation/destruction of VLAN devices. Requires support
>       for figuring out if a net device is a VLAN device, and also two vlan
>       calls like get_realdev() and get_vlan_id() exported.

b) sounds good to me.  There are flags that can let you know if it's a vlan
device or not.

if.h:#define IFF_802_1Q_VLAN 0x1             /* 802.1Q VLAN device.          */
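
A rough user-space model of option b), assuming the flag lives in a
per-device private-flags word. struct fake_dev, the event enum and the
table are made up for illustration; real_dev and vlan_id stand in for
what the proposed get_realdev() and get_vlan_id() calls would return.

```c
#include <stdbool.h>
#include <stdint.h>

#define IFF_802_1Q_VLAN 0x1    /* value quoted from if.h above */
#define N_VLAN_IDS      4096   /* 12-bit VLAN id space */

/* Stand-ins for device register/unregister notifications. */
enum dev_event { DEV_REGISTER, DEV_UNREGISTER };

/* Hypothetical, heavily reduced model of a net device. */
struct fake_dev {
    unsigned short priv_flags;
    const struct fake_dev *real_dev;  /* device the VLAN sits on */
    uint16_t vlan_id;
};

/* VLAN ids currently configured on top of one bond. */
struct bond_vlan_table {
    const struct fake_dev *bond_dev;
    uint8_t used[N_VLAN_IDS / 8];     /* one bit per VLAN id */
};

static bool is_vlan_dev(const struct fake_dev *d)
{
    return (d->priv_flags & IFF_802_1Q_VLAN) != 0;
}

/* Notification callback: track VLAN devices created on our bond. */
void bond_dev_event(struct bond_vlan_table *t, enum dev_event ev,
                    const struct fake_dev *d)
{
    uint8_t bit;

    if (!is_vlan_dev(d) || d->real_dev != t->bond_dev ||
        d->vlan_id >= N_VLAN_IDS)
        return;                       /* not a VLAN on this bond */

    bit = (uint8_t)(1u << (d->vlan_id % 8));
    if (ev == DEV_REGISTER)
        t->used[d->vlan_id / 8] |= bit;
    else
        t->used[d->vlan_id / 8] &= (uint8_t)~bit;
}

bool bond_has_vlan(const struct bond_vlan_table *t, uint16_t id)
{
    return id < N_VLAN_IDS && (t->used[id / 8] >> (id % 8)) & 1u;
}
```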

>    c. Parse every packet going through bonding to collect VLAN tags.
> 
> 3. Monitoring
>    In order for bonding to be able to generate tagged packets on its own,
>    two major changes need to be made. One is to split the vlan_start_xmit
>    function into insert_tag() and vlan_xmit(), so bonding may choose the
>    required tag on its own and let 8021q do the transmit. A second change
>    is to split arp_send() into arp_create() and arp_send(), so bonding may
>    pass all the usual parameters for arp creation, get a complete arp
>    packet and then pass it to 8021q for tag insertion on transmission.
> 
> 
> Hardware acceleration
> =====================
> 	When coming to analyze what is required for adding support for
> VLAN hardware acceleration on top of bonding, other issues come to mind.
> Since add/del operations are defined and handshakes are performed between
> the VLAN module and the device driver, tracking VLAN tags is simpler and
> commands should just be propagated to the slaves. Enslaving/releasing
> slaves should also be simple and just require adding/removing existing
> VLAN tags from them. The problem is how to handle configuration issues.

I'd consider ignoring the HW accel unless you can prove it actually helps
performance to a noticeable degree.  I have never seen results of any benchmarking
related to this...

> 
>   1. Since adding the first VLAN tag requires some additional handshake,
>      can bonding support that operation on a bond that already has slaves
>      and is running?
>   2. What about removing the last tag from a bond?
>   3. Should the bond device declare itself as "VLAN challenged" before
>      registering and remove that limitation only once it has slaves?
>   4. Should the bond declare itself as fully hardware acceleration capable
>      to benefit from "strong" slaves while performing regular VLAN
>      inserting/stripping for "weak" slaves?
>   5. How can bonding generate untagged packets and send them via
>      hardware acceleration capable slaves (e.g. 802.3ad LACPDU) ?
> 
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 17:24 ` Ben Greear
@ 2003-07-15 17:55   ` Dan Hollis
  2003-07-15 18:13     ` Ben Greear
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Hollis @ 2003-07-15 17:55 UTC (permalink / raw)
  To: Ben Greear
  Cc: Shmulik Hen, bond-devel, linux-net, linux-netdev,
	David S. Miller, Jeff Garzik, Jay Vosburgh, Amir Noam,
	Noam Marom, Tsippy Mendelson

On Tue, 15 Jul 2003, Ben Greear wrote:
> I'd consider ignoring the HW accel unless you can prove it actually helps
> performance to a noticeable degree.  I have never seen results of any benchmarking
> related to this...

For gigabit ethernet, it makes a *H*U*G*E* difference.

-Dan
-- 
[-] Omae no subete no kichi wa ore no mono da. [-]



* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 17:55   ` [Bonding-devel] " Dan Hollis
@ 2003-07-15 18:13     ` Ben Greear
  2003-07-15 18:16       ` Dan Hollis
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2003-07-15 18:13 UTC (permalink / raw)
  To: Dan Hollis
  Cc: Shmulik Hen, bond-devel, linux-net, linux-netdev,
	David S. Miller, Jeff Garzik, Jay Vosburgh, Amir Noam,
	Noam Marom, Tsippy Mendelson

Dan Hollis wrote:
> On Tue, 15 Jul 2003, Ben Greear wrote:
> 
>>I'd consider ignoring the HW accel unless you can prove it actually helps
>>performance to a noticeable degree.  I have never seen results of any benchmarking
>>related to this...
> 
> 
> For gigabit ethernet, it makes a *H*U*G*E* difference.

I'm curious to see numbers.  The VLAN code only inserts
a small shim header, and at most shifts the first part of the packet
when handed a pre-built packet.

Maybe the hw-accel turns on tcp checksumming or something too??

> 
> -Dan


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 18:13     ` Ben Greear
@ 2003-07-15 18:16       ` Dan Hollis
  2003-07-15 18:36         ` Ralph Doncaster
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Hollis @ 2003-07-15 18:16 UTC (permalink / raw)
  To: Ben Greear
  Cc: Shmulik Hen, bond-devel, linux-net, linux-netdev,
	David S. Miller, Jeff Garzik, Jay Vosburgh, Amir Noam,
	Noam Marom, Tsippy Mendelson

On Tue, 15 Jul 2003, Ben Greear wrote:
> Dan Hollis wrote:
> > On Tue, 15 Jul 2003, Ben Greear wrote:
> >>I'd consider ignoring the HW accel unless you can prove it actually helps
> >>performance to a noticeable degree.  I have never seen results of any benchmarking
> >>related to this...
> > For gigabit ethernet, it makes a *H*U*G*E* difference.
> I'm curious to see numbers.  The VLAN code only inserts
> a small shim header, and at most shifts the first part of the packet
> when handed a pre-built packet.
> Maybe the hw-accel turns on tcp checksumming or something too??

That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates

-Dan
-- 
[-] Omae no subete no kichi wa ore no mono da. [-]


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 18:16       ` Dan Hollis
@ 2003-07-15 18:36         ` Ralph Doncaster
  2003-07-15 19:20           ` Dan Hollis
  2003-07-15 22:30           ` Jeff Garzik
  0 siblings, 2 replies; 10+ messages in thread
From: Ralph Doncaster @ 2003-07-15 18:36 UTC (permalink / raw)
  To: Dan Hollis; +Cc: linux-netdev

On Tue, 15 Jul 2003, Dan Hollis wrote:

> That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates

This still doesn't make any sense.  The copy from user-space to kernel
space does the checksum as far as I recall (unless you use the
router-not-host kernel build option).
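
For reference, the idea of folding the checksum into the copy can be
sketched in plain C as below. This illustrates the technique only and
is not the kernel's routine; the function name is made up. It computes
the 16-bit ones'-complement Internet checksum (RFC 1071) over the data
while copying it, so the bytes are touched once instead of twice.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst and return the Internet checksum
 * of the copied data. Illustrative name, not a kernel function.
 */
uint16_t copy_and_csum(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint64_t sum = 0;   /* wide accumulator, folded at the end */
    size_t i;

    /* Copy and sum one big-endian 16-bit word at a time. */
    for (i = 0; i + 1 < len; i += 2) {
        dst[i] = src[i];
        dst[i + 1] = src[i + 1];
        sum += ((uint32_t)src[i] << 8) | src[i + 1];
    }

    /* An odd trailing byte is padded with a zero low byte. */
    if (i < len) {
        dst[i] = src[i];
        sum += (uint32_t)src[i] << 8;
    }

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

When the data is never copied by the CPU, there is no such pass to
piggyback on, and the checksum needs either its own pass over the data
or the NIC.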

-Ralph


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 18:36         ` Ralph Doncaster
@ 2003-07-15 19:20           ` Dan Hollis
  2003-07-15 22:30           ` Jeff Garzik
  1 sibling, 0 replies; 10+ messages in thread
From: Dan Hollis @ 2003-07-15 19:20 UTC (permalink / raw)
  To: ralph+d; +Cc: linux-netdev

On Tue, 15 Jul 2003, Ralph Doncaster wrote:
> On Tue, 15 Jul 2003, Dan Hollis wrote:
> > That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates
> This still doesn't make any sense.  The copy from user-space to kernel
> space does the checksum as far as I recall (unless you use the
> router-not-host kernel build option).

except that 2.5.x has zerocopy and I believe NFS supports it now as well

fwiw I believe sendfile() implementation was motivated a lot by hw csum 
support...

-Dan
-- 
[-] Omae no subete no kichi wa ore no mono da. [-]


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 18:36         ` Ralph Doncaster
  2003-07-15 19:20           ` Dan Hollis
@ 2003-07-15 22:30           ` Jeff Garzik
  2003-07-16 12:33             ` Ralph Doncaster
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Garzik @ 2003-07-15 22:30 UTC (permalink / raw)
  To: ralph+d; +Cc: Dan Hollis, linux-netdev

Ralph Doncaster wrote:
> On Tue, 15 Jul 2003, Dan Hollis wrote:
> 
> 
>>That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates
> 
> 
> This still doesn't make any sense.  The copy from user-space to kernel
> space does the checksum as far as I recall (unless you use the
> router-not-host kernel build option).


Not for the zero-copy case.

	Jeff


* Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding
  2003-07-15 22:30           ` Jeff Garzik
@ 2003-07-16 12:33             ` Ralph Doncaster
  0 siblings, 0 replies; 10+ messages in thread
From: Ralph Doncaster @ 2003-07-16 12:33 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Dan Hollis, linux-netdev

On Tue, 15 Jul 2003, Jeff Garzik wrote:

> Ralph Doncaster wrote:
> > On Tue, 15 Jul 2003, Dan Hollis wrote:
> >
> >
> >>That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates
> >
> >
> > This still doesn't make any sense.  The copy from user-space to kernel
> > space does the checksum as far as I recall (unless you use the
> > router-not-host kernel build option).
>
>
> Not for the zero-copy case.

How common is this?  As far as I can tell, Apache 1.3 doesn't use sendfile
(you need 2.0 for that).  And even if 1.3 is using EnableMMAP with a large
write, you're limited to the size of SO_SNDBUF (or maybe only a single
page?).

This is not to say hw csum is a bad thing.  I think the linux IP stack
should support it.  When I was looking at the 2.4.19 code I noticed the
3c59x driver code supported hw csum, but I couldn't find anything in the
IP stack that used the csum flags set by the driver...

-Ralph


* RE: [RFC][bonding] Improve VLAN support on top of bonding
@ 2003-07-15 17:49 Eble, Dan
  0 siblings, 0 replies; 10+ messages in thread
From: Eble, Dan @ 2003-07-15 17:49 UTC (permalink / raw)
  To: 'Shmulik Hen', 'Stephen Hemminger'
  Cc: bond-devel, linux-net, linux-netdev, David S. Miller, Ben Greear,
	Jeff Garzik, Jay Vosburgh, Amir Noam, Noam Marom,
	Tsippy Mendelson

My US$0.02: Bonding and bridging have some things in common, at least as far
as having to deal with diverse hardware.  It would be nice to have a
[un]tagging interface that is useful to both drivers with as little code
duplication as is reasonably possible.


