* RFC - should network devices trim frames > soft mtu
@ 2011-08-31 22:18 Stephen Hemminger
  2011-08-31 22:26 ` Ben Greear
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Stephen Hemminger @ 2011-08-31 22:18 UTC
  To: David Miller, Michael Chan; +Cc: netdev

I noticed the following in the bnx2 driver.


static int
bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
{
...
		skb->protocol = eth_type_trans(skb, bp->dev);

		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
			(ntohs(skb->protocol) != 0x8100)) {

			dev_kfree_skb(skb);
			goto next_rx;

		}

This means that for non-VLAN tagged frames, the device drops received
packets if the length is greater than the MTU.  I don't see that in
other devices. What is the correct method? IMHO the bnx2 driver is
wrong here and if the policy is desired it should be enforced at
the next level (netif_receive_skb).  Hardcoding a protocol value is
kind of a giveaway that something is fishy.
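
For illustration, the length test quoted above can be written as a small
stand-alone predicate with named constants in place of the bare 0x8100.
This is only a user-space sketch: rx_exceeds_soft_mtu() is a hypothetical
helper, not a kernel function, and it assumes the frame length includes
the Ethernet header but not the FCS, and that tagged frames are allowed
exactly one extra 802.1Q tag. The constants mirror the values of the
kernel macros of the same names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* Ethernet header: dst MAC + src MAC + EtherType */
#define VLAN_HLEN    4       /* one 802.1Q tag */
#define ETH_P_8021Q  0x8100  /* 802.1Q EtherType, host byte order */

/*
 * Hypothetical helper (assumption, not an existing kernel API).
 * frame_len: frame length including the Ethernet header, without FCS.
 * mtu:       the device's soft MTU (payload bytes).
 * proto:     outer EtherType in host byte order.
 * Returns true when the frame is longer than the soft MTU allows.
 */
static bool rx_exceeds_soft_mtu(size_t frame_len, unsigned int mtu,
				uint16_t proto)
{
	size_t limit = (size_t)mtu + ETH_HLEN;

	/* A tagged frame carries 4 extra bytes that are not payload. */
	if (proto == ETH_P_8021Q)
		limit += VLAN_HLEN;

	return frame_len > limit;
}
```

With an MTU of 1500, this treats an untagged frame of 1515 bytes as
oversize and a tagged frame of 1518 bytes as acceptable. Where such a
test should live (driver or netif_receive_skb) is the open question of
this thread; the sketch takes no position on that.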

* Re: RFC - should network devices trim frames > soft mtu
  2011-08-31 22:18 RFC - should network devices trim frames > soft mtu Stephen Hemminger
@ 2011-08-31 22:26 ` Ben Greear
  2011-08-31 22:27 ` Michael Chan
  2011-08-31 22:45 ` Ben Hutchings
  2 siblings, 0 replies; 5+ messages in thread
From: Ben Greear @ 2011-08-31 22:26 UTC
  To: Stephen Hemminger; +Cc: David Miller, Michael Chan, netdev

On 08/31/2011 03:18 PM, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
>
>
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
>
> 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
> 			(ntohs(skb->protocol) != 0x8100)) {
>
> 			dev_kfree_skb(skb);
> 			goto next_rx;
>
> 		}
>
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.

Maybe that lets them use some kind of offload?

Either way, seems the pkt should be allowed to come up the
stack if the NIC can receive it and it's not otherwise funky.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: RFC - should network devices trim frames > soft mtu
  2011-08-31 22:18 RFC - should network devices trim frames > soft mtu Stephen Hemminger
  2011-08-31 22:26 ` Ben Greear
@ 2011-08-31 22:27 ` Michael Chan
  2011-09-01  0:10   ` David Lamparter
  2011-08-31 22:45 ` Ben Hutchings
  2 siblings, 1 reply; 5+ messages in thread
From: Michael Chan @ 2011-08-31 22:27 UTC
  To: Stephen Hemminger; +Cc: David Miller, netdev


On Wed, 2011-08-31 at 15:18 -0700, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
> 
> 
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
> 
> 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
> 			(ntohs(skb->protocol) != 0x8100)) {
> 
> 			dev_kfree_skb(skb);
> 			goto next_rx;
> 
> 		}
> 
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.
> 

I guess the reasoning is that we program the RX MTU in our chip to
automatically discard packets bigger than the RX MTU and count them as
over-size packets.  We add 4 bytes to the RX MTU to account for the VLAN
tag which may be stripped or not stripped by the chip depending on
settings.  The extra 4 bytes in the RX MTU setting will allow over-size
packets by up to 4 bytes to get through.

I agree we should move this to the next level.
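
The 4-byte window described above can be made concrete with a little
arithmetic. The following is a rough user-space model, not bnx2 code:
hw_rx_limit() and untagged_slips_through() are hypothetical names, and
the model assumes all lengths include the Ethernet header and exclude
the FCS.

```c
#include <stdbool.h>
#include <stddef.h>

#define ETH_HLEN  14  /* Ethernet header length */
#define VLAN_HLEN 4   /* one 802.1Q tag */

/*
 * Assumed model of the chip's size filter: the programmed limit is the
 * soft MTU plus the Ethernet header, padded by one VLAN tag.
 */
static size_t hw_rx_limit(unsigned int mtu)
{
	return (size_t)mtu + ETH_HLEN + VLAN_HLEN;
}

/*
 * True when an untagged frame passes the padded hardware limit even
 * though it is longer than mtu + ETH_HLEN. These are the frames the
 * software check in the quoted driver code drops.
 */
static bool untagged_slips_through(size_t frame_len, unsigned int mtu)
{
	return frame_len <= hw_rx_limit(mtu) &&
	       frame_len > (size_t)mtu + ETH_HLEN;
}
```

With an MTU of 1500 the model gives a hardware limit of 1518 bytes, so
untagged frames of 1515 to 1518 bytes reach the driver while anything
longer is already discarded by the chip.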

* Re: RFC - should network devices trim frames > soft mtu
  2011-08-31 22:18 RFC - should network devices trim frames > soft mtu Stephen Hemminger
  2011-08-31 22:26 ` Ben Greear
  2011-08-31 22:27 ` Michael Chan
@ 2011-08-31 22:45 ` Ben Hutchings
  2 siblings, 0 replies; 5+ messages in thread
From: Ben Hutchings @ 2011-08-31 22:45 UTC
  To: Stephen Hemminger; +Cc: David Miller, Michael Chan, netdev

On Wed, 2011-08-31 at 15:18 -0700, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
> 
> 
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
> 
> 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
> 			(ntohs(skb->protocol) != 0x8100)) {
> 
> 			dev_kfree_skb(skb);
> 			goto next_rx;
> 
> 		}
> 
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.

According to netdevices.txt:

"MTU is symmetrical and applies both to receive and transmit.  ...
The device may either: drop, truncate, or pass up oversize packets, but
dropping oversize packets is preferred."

I believe UNH interop tests expect that MRU = MTU and oversize packets
are dropped.  However, I seem to recall that David has said more
recently that it's preferable to always use the maximum possible MRU if
DMA scatter is supported (so that this doesn't require page allocations
of order > 0).

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
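
As an illustration of the three behaviours netdevices.txt permits (drop,
truncate, pass up), here is a minimal user-space sketch. The enum and
rx_deliver_len() are hypothetical names invented for this example; no
driver is claimed to be structured this way.

```c
#include <stddef.h>

/* The three options quoted from netdevices.txt (names are assumptions). */
enum oversize_policy {
	OVERSIZE_DROP,      /* discard the frame (the preferred option) */
	OVERSIZE_TRUNCATE,  /* deliver only the first max_len bytes */
	OVERSIZE_PASS       /* deliver the frame unchanged */
};

/*
 * Returns the number of bytes handed up the stack for a received frame,
 * or 0 if the frame is dropped. max_len is the largest frame length the
 * receive side is configured to accept.
 */
static size_t rx_deliver_len(size_t frame_len, size_t max_len,
			     enum oversize_policy policy)
{
	if (frame_len <= max_len)
		return frame_len;

	switch (policy) {
	case OVERSIZE_TRUNCATE:
		return max_len;
	case OVERSIZE_PASS:
		return frame_len;
	case OVERSIZE_DROP:
	default:
		return 0;
	}
}
```

Frames within the limit are delivered unchanged under every policy; the
policies differ only for oversize frames.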

* Re: RFC - should network devices trim frames > soft mtu
  2011-08-31 22:27 ` Michael Chan
@ 2011-09-01  0:10   ` David Lamparter
  0 siblings, 0 replies; 5+ messages in thread
From: David Lamparter @ 2011-09-01  0:10 UTC
  To: Michael Chan; +Cc: Stephen Hemminger, David Miller, netdev

On Wed, Aug 31, 2011 at 03:27:31PM -0700, Michael Chan wrote:
> > This means that for non-VLAN tagged frames, the device drops received
> > packets if the length is greater than the MTU.  I don't see that in
> > other devices. What is the correct method? IMHO the bnx2 driver is
> > wrong here and if the policy is desired it should be enforced at
> > the next level (netif_receive_skb).  Hardcoding a protocol value is
> > kind of a giveaway that something is fishy.
> > 
> 
> I guess the reasoning is that we program the RX MTU in our chip to
> automatically discard packets bigger than the RX MTU and count them as
> over-size packets.  We add 4 bytes to the RX MTU to account for the VLAN
> tag which may be stripped or not stripped by the chip depending on
> settings.  The extra 4 bytes in the RX MTU setting will allow over-size
> packets by up to 4 bytes to get through.
> 
> I agree we should move this to the next level.

802.3ac allows both unconditionally raising the MTU to 1522 as well as
checking the protocol and only accepting 802.1Q frames at 1522 while
restricting everything else to 1518.

802.3as raises the bar to 2000 bytes, but explicitly states that the
actual payload - without encapsulation headers from 802.1Q, 1ad, 1ah,
MPLS & co. - should keep the 1500 byte limit.

I think the sensible approach would be to move the MTU check as close
as possible to the border between ethernet and the upper layer
protocols, i.e. the driver shouldn't check this at all and try to tx/rx
as much as the hardware supports. This is needed for QinQ, 802.1ah & co.


-David
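
The sizes mentioned above (1518, 1522, 2000) are frame sizes on the wire,
counting the 14-byte Ethernet header and the 4-byte FCS, rather than MTU
values in the Linux sense. A short sketch of the arithmetic, assuming a
1500-byte payload; wire_frame_len() is a hypothetical helper written for
this example:

```c
#include <stddef.h>

#define ETH_HLEN     14    /* dst MAC + src MAC + EtherType */
#define ETH_FCS_LEN  4     /* frame check sequence */
#define VLAN_HLEN    4     /* one 802.1Q (or 802.1ad) tag */
#define ETH_DATA_LEN 1500  /* classic maximum payload */

/*
 * Frame length on the wire for a given payload size and a given number
 * of encapsulation bytes (VLAN tags, MPLS labels, ...) sitting between
 * the Ethernet header and the payload.
 */
static size_t wire_frame_len(size_t payload, size_t encap_overhead)
{
	return ETH_HLEN + encap_overhead + payload + ETH_FCS_LEN;
}
```

For a 1500-byte payload this gives 1518 bytes untagged, 1522 with one
802.1Q tag and 1526 with two stacked tags (QinQ). Against the 2000-byte
ceiling that leaves 482 bytes of encapsulation overhead beyond the basic
1518-byte frame.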
