linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
@ 2010-11-29 19:17 Michael Leun
  2010-11-30  0:19 ` Ben Greear
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-11-29 19:17 UTC (permalink / raw)
  To: linux-kernel, netdev

Hi,

only kernel message I was able to capture on serial were:

UG: unable to handle kernel paging request at 01cc921c
IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
*pdpt = 0000000036a2a001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs

Then machine dead.

In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
tags with tcpdump),

To reproduce:

ip link set eth0 up
vconfig add eth0 2
ip link set eth0 promisc on

Machines with these ethernet cards I've seen to be affected:

Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
Ethernet (rev a3)

Ethernet controller: Broadcom Corporation NetXtreme BCM5752M Gigabit
Ethernet PCI Express (rev 02)

Machine with this card is NOT affected (no vlan hw accel?):

Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast
Ethernet PCI Express (rev 02)


-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-29 19:17 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3 Michael Leun
@ 2010-11-30  0:19 ` Ben Greear
  2010-11-30  3:10   ` Jesse Gross
  2010-11-30  8:59   ` Michael Leun
  0 siblings, 2 replies; 53+ messages in thread
From: Ben Greear @ 2010-11-30  0:19 UTC (permalink / raw)
  To: Michael Leun; +Cc: linux-kernel, netdev

On 11/29/2010 11:17 AM, Michael Leun wrote:
> Hi,
>
> only kernel message I was able to capture on serial were:
>
> UG: unable to handle kernel paging request at 01cc921c
> IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
> *pdpt = 0000000036a2a001 *pde = 0000000000000000
> Oops: 0002 [#1] SMP
> last sysfs
>
> Then machine dead.
>
> In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
> tags with tcpdump),

Try this patch:

http://permalink.gmane.org/gmane.linux.network/176566

It looks like this hasn't made it into stable yet?

Thanks,
Ben

>
> To reproduce:
>
> ip link set eth0 up
> vconfig add eth0 2
> ip link set eth0 promisc on
>
> Machines with these ethernet cards I've seen to be affected:
>
> Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
> Ethernet (rev a3)
>
> Ethernet controller: Broadcom Corporation NetXtreme BCM5752M Gigabit
> Ethernet PCI Express (rev 02)
>
> Machine with this card is NOT affected (no vlan hw accel?):
>
> Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast
> Ethernet PCI Express (rev 02)
>
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  0:19 ` Ben Greear
@ 2010-11-30  3:10   ` Jesse Gross
  2010-11-30  3:26     ` David Miller
  2010-11-30  8:59   ` Michael Leun
  1 sibling, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2010-11-30  3:10 UTC (permalink / raw)
  To: Ben Greear, David Miller; +Cc: Michael Leun, linux-kernel, netdev

On Mon, Nov 29, 2010 at 4:19 PM, Ben Greear <greearb@candelatech.com> wrote:
> On 11/29/2010 11:17 AM, Michael Leun wrote:
>>
>> Hi,
>>
>> only kernel message I was able to capture on serial were:
>>
>> UG: unable to handle kernel paging request at 01cc921c
>> IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
>> *pdpt = 0000000036a2a001 *pde = 0000000000000000
>> Oops: 0002 [#1] SMP
>> last sysfs
>>
>> Then machine dead.
>>
>> In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
>> tags with tcpdump),
>
> Try this patch:
>
> http://permalink.gmane.org/gmane.linux.network/176566
>
> It looks like this hasn't made it into stable yet?

It didn't make it into 2.6.36.1.

David, I know that you already acked this patch but it sounds like
Greg would prefer to get it from you directly:
https://lkml.org/lkml/2010/11/19/685

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  3:10   ` Jesse Gross
@ 2010-11-30  3:26     ` David Miller
  0 siblings, 0 replies; 53+ messages in thread
From: David Miller @ 2010-11-30  3:26 UTC (permalink / raw)
  To: jesse; +Cc: greearb, lkml20101129, linux-kernel, netdev

From: Jesse Gross <jesse@nicira.com>
Date: Mon, 29 Nov 2010 19:10:06 -0800

> On Mon, Nov 29, 2010 at 4:19 PM, Ben Greear <greearb@candelatech.com> wrote:
>> On 11/29/2010 11:17 AM, Michael Leun wrote:
>>>
>>> Hi,
>>>
>>> only kernel message I was able to capture on serial were:
>>>
>>> UG: unable to handle kernel paging request at 01cc921c
>>> IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
>>> *pdpt = 0000000036a2a001 *pde = 0000000000000000
>>> Oops: 0002 [#1] SMP
>>> last sysfs
>>>
>>> Then machine dead.
>>>
>>> In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
>>> tags with tcpdump),
>>
>> Try this patch:
>>
>> http://permalink.gmane.org/gmane.linux.network/176566
>>
>> It looks like this hasn't made it into stable yet?
> 
> It didn't make it into 2.6.36.1.
> 
> David, I know that you already acked this patch but it sounds like
> Greg would prefer to get it from you directly:
> https://lkml.org/lkml/2010/11/19/685

I haven't made any -stable submissions in a week or two, and that is
on purpose.  I want all of those things to cook for a while, and I intend
to send my huge queue to Greg over the next day or two.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  0:19 ` Ben Greear
  2010-11-30  3:10   ` Jesse Gross
@ 2010-11-30  8:59   ` Michael Leun
  2010-11-30  9:20     ` Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-11-30  8:59 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-kernel, netdev

On Mon, 29 Nov 2010 16:19:06 -0800
Ben Greear <greearb@candelatech.com> wrote:

> On 11/29/2010 11:17 AM, Michael Leun wrote:
> > UG: unable to handle kernel paging request at 01cc921c
> > IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
> > *pdpt = 0000000036a2a001 *pde = 0000000000000000
> > Oops: 0002 [#1] SMP
> > last sysfs
> >
> > Then machine dead.
> >
> > In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
> > tags with tcpdump),
> 
> Try this patch:
> 
> http://permalink.gmane.org/gmane.linux.network/176566
> 
> It looks like this hasn't made it into stable yet?

> > To reproduce:
> >
> > ip link set eth0 up
> > vconfig add eth0 2
> > ip link set eth0 promisc on

It makes it better - it does not crash anymore on this commands - but
if you add an "tcpdump -i eth0 -n" at the end it does. So,
unfortunately no real solution.

I guess, "dropping packet no one is interested in" (as noted in the
patch) does not work very well if tcpdump is actually interested?

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  8:59   ` Michael Leun
@ 2010-11-30  9:20     ` Eric Dumazet
  2010-11-30 22:27       ` Jesse Gross
  2010-12-01 10:17       ` Michael Leun
  0 siblings, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2010-11-30  9:20 UTC (permalink / raw)
  To: Michael Leun; +Cc: Ben Greear, linux-kernel, netdev

Le mardi 30 novembre 2010 à 09:59 +0100, Michael Leun a écrit :
> On Mon, 29 Nov 2010 16:19:06 -0800
> Ben Greear <greearb@candelatech.com> wrote:
> 
> > On 11/29/2010 11:17 AM, Michael Leun wrote:
> > > UG: unable to handle kernel paging request at 01cc921c
> > > IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
> > > *pdpt = 0000000036a2a001 *pde = 0000000000000000
> > > Oops: 0002 [#1] SMP
> > > last sysfs
> > >
> > > Then machine dead.
> > >
> > > In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
> > > tags with tcpdump),
> > 
> > Try this patch:
> > 
> > http://permalink.gmane.org/gmane.linux.network/176566
> > 
> > It looks like this hasn't made it into stable yet?
> 
> > > To reproduce:
> > >
> > > ip link set eth0 up
> > > vconfig add eth0 2
> > > ip link set eth0 promisc on
> 
> It makes it better - it does not crash anymore on this commands - but
> if you add an "tcpdump -i eth0 -n" at the end it does. So,
> unfortunately no real solution.
> 
> I guess, "dropping packet no one is interested in" (as noted in the
> patch) does not work very well if tcpdump is actually interested?
> 

Could you try with following patch instead, for net/core/dev.c 

(and keep the net/8021q/vlan_core.c part)

--- net/core/dev.c.orig
+++ net/core/dev.c
@@ -2891,6 +2891,9 @@
 ncls:
 #endif
 
+	if (unlikely(vlan_tx_tag_present(skb)))
+		goto bypass;
+
 	/* Handle special case of bridge or macvlan */
 	rx_handler = rcu_dereference(skb->dev->rx_handler);
 	if (rx_handler) {
@@ -2927,6 +2930,7 @@
 		}
 	}
 
+bypass:
 	if (pt_prev) {
 		ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
 	} else {



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  9:20     ` Eric Dumazet
@ 2010-11-30 22:27       ` Jesse Gross
  2010-12-01 10:17       ` Michael Leun
  1 sibling, 0 replies; 53+ messages in thread
From: Jesse Gross @ 2010-11-30 22:27 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Michael Leun, Ben Greear, linux-kernel, netdev

On Tue, Nov 30, 2010 at 1:20 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 30 novembre 2010 à 09:59 +0100, Michael Leun a écrit :
>> On Mon, 29 Nov 2010 16:19:06 -0800
>> Ben Greear <greearb@candelatech.com> wrote:
>>
>> > On 11/29/2010 11:17 AM, Michael Leun wrote:
>> > > UG: unable to handle kernel paging request at 01cc921c
>> > > IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
>> > > *pdpt = 0000000036a2a001 *pde = 0000000000000000
>> > > Oops: 0002 [#1] SMP
>> > > last sysfs
>> > >
>> > > Then machine dead.
>> > >
>> > > In 2.6.35.x this did not happen (but vlans broken - cannot see vlan
>> > > tags with tcpdump),
>> >
>> > Try this patch:
>> >
>> > http://permalink.gmane.org/gmane.linux.network/176566
>> >
>> > It looks like this hasn't made it into stable yet?
>>
>> > > To reproduce:
>> > >
>> > > ip link set eth0 up
>> > > vconfig add eth0 2
>> > > ip link set eth0 promisc on
>>
>> It makes it better - it does not crash anymore on this commands - but
>> if you add an "tcpdump -i eth0 -n" at the end it does. So,
>> unfortunately no real solution.
>>
>> I guess, "dropping packet no one is interested in" (as noted in the
>> patch) does not work very well if tcpdump is actually interested?
>>
>
> Could you try with following patch instead, for net/core/dev.c
>
> (and keep the net/8021q/vlan_core.c part)
>
> --- net/core/dev.c.orig
> +++ net/core/dev.c
> @@ -2891,6 +2891,9 @@
>  ncls:
>  #endif
>
> +       if (unlikely(vlan_tx_tag_present(skb)))
> +               goto bypass;
> +
>        /* Handle special case of bridge or macvlan */
>        rx_handler = rcu_dereference(skb->dev->rx_handler);
>        if (rx_handler) {
> @@ -2927,6 +2930,7 @@
>                }
>        }
>
> +bypass:
>        if (pt_prev) {
>                ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
>        } else {

Hmm, it looks like I forgot an else between pt_prev->func() and
kfree_skb() in my patch.  Yours is cleaner, so I like it better
anyways (assuming it fixes the problem).

Thanks.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3
  2010-11-30  9:20     ` Eric Dumazet
  2010-11-30 22:27       ` Jesse Gross
@ 2010-12-01 10:17       ` Michael Leun
  2010-12-01 10:55         ` [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-01 10:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ben Greear, linux-kernel, netdev

On Tue, 30 Nov 2010 10:20:09 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le mardi 30 novembre 2010 à 09:59 +0100, Michael Leun a écrit :
> > On Mon, 29 Nov 2010 16:19:06 -0800
> > Ben Greear <greearb@candelatech.com> wrote:
> > 
> > > On 11/29/2010 11:17 AM, Michael Leun wrote:
> > > > UG: unable to handle kernel paging request at 01cc921c
> > > > IP: [<c034bfce>] vlan_hwaccel_do_receive+0x59/0xd0
> > > > *pdpt = 0000000036a2a001 *pde = 0000000000000000
> > > > Oops: 0002 [#1] SMP
> > > > last sysfs
> > > >
> > > > Then machine dead.
> > > >
> > > > In 2.6.35.x this did not happen (but vlans broken - cannot see
> > > > vlan tags with tcpdump),
> > > 
> > > Try this patch:
> > > 
> > > http://permalink.gmane.org/gmane.linux.network/176566
> > > 
> > > It looks like this hasn't made it into stable yet?
> > 
> > > > To reproduce:
> > > >
> > > > ip link set eth0 up
> > > > vconfig add eth0 2
> > > > ip link set eth0 promisc on
> > 
> > It makes it better - it does not crash anymore on this commands -
> > but if you add an "tcpdump -i eth0 -n" at the end it does. So,
> > unfortunately no real solution.
> > 
> > I guess, "dropping packet no one is interested in" (as noted in the
> > patch) does not work very well if tcpdump is actually interested?
> > 
> 
> Could you try with following patch instead, for net/core/dev.c 
> 
> (and keep the net/8021q/vlan_core.c part)
> 
> --- net/core/dev.c.orig
> +++ net/core/dev.c
> @@ -2891,6 +2891,9 @@
>  ncls:
>  #endif
>  
> +	if (unlikely(vlan_tx_tag_present(skb)))
> +		goto bypass;
> +
>  	/* Handle special case of bridge or macvlan */
>  	rx_handler = rcu_dereference(skb->dev->rx_handler);
>  	if (rx_handler) {
> @@ -2927,6 +2930,7 @@
>  		}
>  	}
>  
> +bypass:
>  	if (pt_prev) {
>  		ret = pt_prev->func(skb, skb->dev, pt_prev,
> orig_dev); } else {
> 
> 

Yup, from what I've tested this works (and tcpdump sees broadcast
packets even for vlans not configured at the moment including vlan tag
- yipee!).

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-01 10:17       ` Michael Leun
@ 2010-12-01 10:55         ` Eric Dumazet
  2010-12-05  2:07           ` Michael Leun
  2010-12-08 16:47           ` David Miller
  0 siblings, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2010-12-01 10:55 UTC (permalink / raw)
  To: Michael Leun, David Miller
  Cc: Ben Greear, linux-kernel, netdev, Jesse Gross, stable

Le mercredi 01 décembre 2010 à 11:17 +0100, Michael Leun a écrit :

> Yup, from what I've tested this works (and tcpdump sees broadcast
> packets even for vlans not configured at the moment including vlan tag
> - yipee!).
> 

Thanks Michael !

Here is the revised patch again then for stable team, via David Miller
agreement.


[PATCH v2 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used.

Normally hardware accelerated vlan packets are quickly dropped if
there is no corresponding vlan device configured.  The one exception
is promiscuous mode, where we allow all of these packets through so
they can be picked up by tcpdump.  However, this behavior causes a
crash if we actually try to receive these packets.  This fixes that
crash by ignoring packets with vids not corresponding to a configured
device in the vlan hwaccel routines and then dropping them before they
get to consumers in the network stack.

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Michael Leun <lkml20101129@newton.leun.net>
---
v2: survives to tcpdump :)

 net/core/dev.c        |   10 ++++++++++
 net/8021q/vlan_core.c |    3 +++
 2 files changed, 13 insertions(+)

--- linux-2.6.36/net/core/dev.c.orig
+++ linux-2.6.36/net/core/dev.c
@@ -2891,6 +2891,15 @@
 ncls:
 #endif
 
+	/* If we got this far with a hardware accelerated VLAN tag, it means
+	 * that we were put in promiscuous mode but nobody is interested in
+	 * this vid. Drop the packet now to prevent it from getting propagated
+	 * to other parts of the stack that won't know how to deal with packets
+	 * tagged in this manner.
+	 */
+	if (unlikely(vlan_tx_tag_present(skb)))
+		goto bypass;
+
 	/* Handle special case of bridge or macvlan */
 	rx_handler = rcu_dereference(skb->dev->rx_handler);
 	if (rx_handler) {
@@ -2927,6 +2936,7 @@
 		}
 	}
 
+bypass:
 	if (pt_prev) {
 		ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
 	} else {
--- linux-2.6.36/net/8021q/vlan_core.c.orig
+++ linux-2.6.36/net/8021q/vlan_core.c
@@ -43,6 +43,9 @@
 	struct net_device *dev = skb->dev;
 	struct vlan_rx_stats     *rx_stats;
 
+	if (unlikely(!is_vlan_dev(dev)))
+		return 0;
+
 	skb->dev = vlan_dev_info(dev)->real_dev;
 	netif_nit_deliver(skb);
 



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-01 10:55         ` [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used Eric Dumazet
@ 2010-12-05  2:07           ` Michael Leun
  2010-12-05  8:03             ` Eric Dumazet
  2010-12-08 16:47           ` David Miller
  1 sibling, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-05  2:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Ben Greear, linux-kernel, netdev, Jesse Gross, stable

On Wed, 01 Dec 2010 11:55:14 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le mercredi 01 décembre 2010 à 11:17 +0100, Michael Leun a écrit :
> 
> > Yup, from what I've tested this works (and tcpdump sees broadcast
> > packets even for vlans not configured at the moment including vlan
> > tag
> > - yipee!).
> >

This was tested on machine with
root@tp_z61m:~/src/tcpdump-4.1.1# lspci|grep Eth
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752M Gigabit Ethernet PCI Express (rev 02)

There it works as I said.

But on

hpdl320g5:/home/ml # lspci | grep Eth
03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev a3)
03:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev a3)

the good message is that it also does not crash, but with tcpdump I see
vlan tags when no vlan devices configured on the respective eth, if so
I do not see tags anymore vlan tags on the trunk interface.

hpdl320g5:/home/ml # ifconfig -a
[...]
eth1      Link encap:Ethernet  Hardware Adresse xx:xx:xx:xx:xx:xx  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 Sendewarteschlangenlänge:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:17 
[...]

hpdl320g5:/home/ml # tcpdump -i eth1 -n -e
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
[...]
02:45:57.597640 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
02:45:57.622654 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
[...]

hpdl320g5:/home/ml # vconfig add eth1 2
WARNING:  Could not open /proc/net/vlan/config.  Maybe you need to load the 8021q module, or maybe you are not using PROCFS??
Added VLAN with VID == 2 to IF -:eth1:-
hpdl320g5:/home/ml # cat /proc/net/vlan/config 
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
eth1.2         | 2  | eth1
hpdl320g5:/home/ml # tcpdump -i eth1 -n -e
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
[...]
02:50:18.095959 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
02:50:18.120989 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
[...]

The same packages we saw double vlan tagged a few minutes ago now appear single vlan tagged (and of course, other packets that were shown single vlan tagged before are now shown without vlan tag / eth type 0x0800).


On the other machine:

root@tp_z61m:~# cat /proc/net/vlan/config 
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
eth0.741       | 741  | eth0
eth0.2         | 2  | eth0
root@tp_z61m:~/src/tcpdump-4.1.1# ./tcpdump -i eth0 -n -e not port 22
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
03:06:51.859016 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 42
03:06:51.883956 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 42



[...]
> Here is the revised patch again then for stable team, via David Miller
> agreement.
[...]

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-05  2:07           ` Michael Leun
@ 2010-12-05  8:03             ` Eric Dumazet
  2010-12-05  9:55               ` Michael Leun
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2010-12-05  8:03 UTC (permalink / raw)
  To: Michael Leun
  Cc: David Miller, Ben Greear, linux-kernel, netdev, Jesse Gross, stable

Le dimanche 05 décembre 2010 à 03:07 +0100, Michael Leun a écrit :
> On Wed, 01 Dec 2010 11:55:14 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > Le mercredi 01 décembre 2010 à 11:17 +0100, Michael Leun a écrit :
> > 
> > > Yup, from what I've tested this works (and tcpdump sees broadcast
> > > packets even for vlans not configured at the moment including vlan
> > > tag
> > > - yipee!).
> > >
> 
> This was tested on machine with
> root@tp_z61m:~/src/tcpdump-4.1.1# lspci|grep Eth
> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752M Gigabit Ethernet PCI Express (rev 02)
> 
> There it works as I said.
> 
> But on
> 
> hpdl320g5:/home/ml # lspci | grep Eth
> 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev a3)
> 03:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev a3)
> 
> the good message is that it also does not crash, but with tcpdump I see
> vlan tags when no vlan devices configured on the respective eth, if so
> I do not see tags anymore vlan tags on the trunk interface.
> 

For all these very specific needs, you'll have to try 2.6.37 I am
afraid. Jesse did huge changes to exactly make this working, we wont
backport this to 2.6.36, but only avoid crashes.


> hpdl320g5:/home/ml # ifconfig -a
> [...]
> eth1      Link encap:Ethernet  Hardware Adresse xx:xx:xx:xx:xx:xx  
>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 Sendewarteschlangenlänge:1000 
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>           Interrupt:17 
> [...]
> 
> hpdl320g5:/home/ml # tcpdump -i eth1 -n -e
> tcpdump: WARNING: eth1: no IPv4 address assigned
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
> [...]
> 02:45:57.597640 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
> 02:45:57.622654 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
> [...]
> 
> hpdl320g5:/home/ml # vconfig add eth1 2
> WARNING:  Could not open /proc/net/vlan/config.  Maybe you need to load the 8021q module, or maybe you are not using PROCFS??
> Added VLAN with VID == 2 to IF -:eth1:-
> hpdl320g5:/home/ml # cat /proc/net/vlan/config 
> VLAN Dev name    | VLAN ID
> Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
> eth1.2         | 2  | eth1
> hpdl320g5:/home/ml # tcpdump -i eth1 -n -e
> tcpdump: WARNING: eth1: no IPv4 address assigned
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
> [...]
> 02:50:18.095959 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
> 02:50:18.120989 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 99, p 3, ethertype ARP, arp who-has 10.0.0.1 tell 10.0.0.2
> [...]
> 
> The same packages we saw double vlan tagged a few minutes ago now appear single vlan tagged (and of course, other packets that were shown single vlan tagged before are now shown without vlan tag / eth type 0x0800).
> 
> 
> On the other machine:
> 
> root@tp_z61m:~# cat /proc/net/vlan/config 
> VLAN Dev name    | VLAN ID
> Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
> eth0.741       | 741  | eth0
> eth0.2         | 2  | eth0
> root@tp_z61m:~/src/tcpdump-4.1.1# ./tcpdump -i eth0 -n -e not port 22
> tcpdump: WARNING: eth0: no IPv4 address assigned
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
> 03:06:51.859016 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 42
> 03:06:51.883956 xx:xx:xx:xx:xx:xx > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 1505, p 0, ethertype 802.1Q, vlan 99, p 3, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 42
> 
> 
> 
> [...]
> > Here is the revised patch again then for stable team, via David Miller
> > agreement.
> [...]
> 



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-05  8:03             ` Eric Dumazet
@ 2010-12-05  9:55               ` Michael Leun
       [not found]                 ` <20101205114404.7c0cddc2@xenia.leun.net>
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-05  9:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Ben Greear, linux-kernel, netdev, Jesse Gross, stable

On Sun, 05 Dec 2010 09:03:53 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> > But on
> > 
> > hpdl320g5:/home/ml # lspci | grep Eth
> > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5714
> > Gigabit Ethernet (rev a3) 03:04.1 Ethernet controller: Broadcom
> > Corporation NetXtreme BCM5714 Gigabit Ethernet (rev a3)
> > 
> > the good message is that it also does not crash, but with tcpdump I
> > see vlan tags when no vlan devices configured on the respective
> > eth, if so I do not see tags anymore vlan tags on the trunk
> > interface.
> > 
> 
> For all these very specific needs, you'll have to try 2.6.37 I am
> afraid. Jesse did huge changes to exactly make this working, we wont
> backport this to 2.6.36, but only avoid crashes.

OK, I'm perfectly fine with that, of course, actually nice to hear that
the issue already is addressed.

Likely I'll give some rc an shot on this machine (maybe over christmas),
but it is an production machine (acutally testing other devices is the
"product" produced on this machine), so unfortunately I'm not that free
in when and what I can do (but the possibility to, for example, bridge
the trunk interface would make testing easier, that justifies
something...).

Thank you all very much for your work.


-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
       [not found]                   ` <AANLkTikrDTCDxsyOG4m0XcrOY=3pTRwWqnPGsio9cBFj@mail.gmail.com>
@ 2010-12-06 19:34                     ` Michael Leun
  2010-12-06 20:04                       ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-06 19:34 UTC (permalink / raw)
  To: Jesse Gross; +Cc: Eric Dumazet, David Miller, Ben Greear, linux-kernel, netdev

On Mon, 6 Dec 2010 10:14:55 -0800
Jesse Gross <jesse@nicira.com> wrote:

> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
> <lkml20101129@newton.leun.net> wrote:
> > Hi Jesse,
> >
> > On Sun, 5 Dec 2010 10:55:28 +0100
> > Michael Leun <lkml20101129@newton.leun.net> wrote:
> >
> >> On Sun, 05 Dec 2010 09:03:53 +0100
> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >>
> >> > > But on
> >> > >
> >> > > hpdl320g5:/home/ml # lspci | grep Eth
> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet controller:
> >> > > Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev
> >> > > a3)
> >> > >
> >> > > the good message is that it also does not crash, but with
> >> > > tcpdump I see vlan tags when no vlan devices configured on the
> >> > > respective eth, if so I do not see tags anymore vlan tags on
> >> > > the trunk interface.
> >> > >
> >> >
> >> > For all these very specific needs, you'll have to try 2.6.37 I am
> >> > afraid. Jesse did huge changes to exactly make this working, we
> >> > wont backport this to 2.6.36, but only avoid crashes.
> >>
> >> OK, I'm perfectly fine with that, of course, actually nice to hear
> >> that the issue already is addressed.
> >>
> >> Likely I'll give some rc an shot on this machine (maybe over
> >> christmas), but it is an production machine (acutally testing other
> >> devices is the "product" produced on this machine), so
> >> unfortunately I'm not that free in when and what I can do (but the
> >> possibility to, for example, bridge the trunk interface would make
> >> testing easier, that justifies something...).
> >>
> >> Thank you all very much for your work.
> >
> > Are these changes already in 2.6.37-rc4? Or, if not are they
> > somewhere publically available already?
> >
> > I looked into various changelogs but have some difficulties to
> > identify them...
> >
> > Maybe I have some time next days to give them an try...
> 
> Yes, all of the existing vlan changes are in 2.6.37-rc4.  There were a
> number of patches but the main one was
> 3701e51382a026cba10c60b03efabe534fba4ca4

Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
configured) does not work on HP DL320G5 (for exact description and
examples please see my mail a few days ago).

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-06 19:34                     ` Michael Leun
@ 2010-12-06 20:04                       ` Jesse Gross
  2010-12-06 21:27                         ` Michael Leun
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2010-12-06 20:04 UTC (permalink / raw)
  To: Michael Leun; +Cc: Eric Dumazet, David Miller, Ben Greear, linux-kernel, netdev

On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
<lkml20101129@newton.leun.net> wrote:
> On Mon, 6 Dec 2010 10:14:55 -0800
> Jesse Gross <jesse@nicira.com> wrote:
>
>> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
>> <lkml20101129@newton.leun.net> wrote:
>> > Hi Jesse,
>> >
>> > On Sun, 5 Dec 2010 10:55:28 +0100
>> > Michael Leun <lkml20101129@newton.leun.net> wrote:
>> >
>> >> On Sun, 05 Dec 2010 09:03:53 +0100
>> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >>
>> >> > > But on
>> >> > >
>> >> > > hpdl320g5:/home/ml # lspci | grep Eth
>> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
>> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet controller:
>> >> > > Broadcom Corporation NetXtreme BCM5714 Gigabit Ethernet (rev
>> >> > > a3)
>> >> > >
>> >> > > the good message is that it also does not crash, but with
>> >> > > tcpdump I see vlan tags when no vlan devices configured on the
>> >> > > respective eth, if so I do not see tags anymore vlan tags on
>> >> > > the trunk interface.
>> >> > >
>> >> >
>> >> > For all these very specific needs, you'll have to try 2.6.37 I am
>> >> > afraid. Jesse did huge changes to exactly make this working, we
>> >> > wont backport this to 2.6.36, but only avoid crashes.
>> >>
>> >> OK, I'm perfectly fine with that, of course, actually nice to hear
>> >> that the issue already is addressed.
>> >>
>> >> Likely I'll give some rc an shot on this machine (maybe over
>> >> christmas), but it is an production machine (acutally testing other
>> >> devices is the "product" produced on this machine), so
>> >> unfortunately I'm not that free in when and what I can do (but the
>> >> possibility to, for example, bridge the trunk interface would make
>> >> testing easier, that justifies something...).
>> >>
>> >> Thank you all very much for your work.
>> >
>> > Are these changes already in 2.6.37-rc4? Or, if not are they
>> > somewhere publically available already?
>> >
>> > I looked into various changelogs but have some difficulties to
>> > identify them...
>> >
>> > Maybe I have some time next days to give them an try...
>>
>> Yes, all of the existing vlan changes are in 2.6.37-rc4.  There were a
>> number of patches but the main one was
>> 3701e51382a026cba10c60b03efabe534fba4ca4
>
> Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
> configured) does not work on HP DL320G5 (for exact description and
> examples please see my mail a few days ago).

What driver are you using?  Is it tg3?

The vlan changes that I made unfortunately require updating drivers to
get the full benefit.  I've been busy lately so tg3 hasn't yet been
updated.

I know that tg3 does some things differently depending on whether a
vlan group is configured, so that would likely be the cause of what
you are seeing.  I'd have to look at it in more detail to be sure
though.

You said that everything works on the other Broadcom NIC that you
tested?  Maybe it uses bnx2 instead?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-06 20:04                       ` Jesse Gross
@ 2010-12-06 21:27                         ` Michael Leun
  2010-12-13  0:11                           ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-06 21:27 UTC (permalink / raw)
  To: Jesse Gross; +Cc: Eric Dumazet, David Miller, Ben Greear, linux-kernel, netdev

On Mon, 6 Dec 2010 12:04:48 -0800
Jesse Gross <jesse@nicira.com> wrote:

> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
> <lkml20101129@newton.leun.net> wrote:
> > On Mon, 6 Dec 2010 10:14:55 -0800
> > Jesse Gross <jesse@nicira.com> wrote:
> >
> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
> >> <lkml20101129@newton.leun.net> wrote:
> >> > Hi Jesse,
> >> >
> >> > On Sun, 5 Dec 2010 10:55:28 +0100
> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
> >> >
> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> >>
> >> >> > > But on
> >> >> > >
> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
> >> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet
> >> >> > > controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
> >> >> > > Ethernet (rev a3)
> >> >> > >
> >> >> > > the good message is that it also does not crash, but with
> >> >> > > tcpdump I see vlan tags when no vlan devices configured on
> >> >> > > the respective eth, if so I do not see tags anymore vlan
> >> >> > > tags on the trunk interface.
> >> >> > >
> >> >> >
> >> >> > For all these very specific needs, you'll have to try 2.6.37
> >> >> > I am afraid. Jesse did huge changes to exactly make this
> >> >> > working, we wont backport this to 2.6.36, but only avoid
> >> >> > crashes.
> >> >>
> >> >> OK, I'm perfectly fine with that, of course, actually nice to
> >> >> hear that the issue already is addressed.
> >> >>
> >> >> Likely I'll give some rc an shot on this machine (maybe over
> >> >> christmas), but it is an production machine (acutally testing
> >> >> other devices is the "product" produced on this machine), so
> >> >> unfortunately I'm not that free in when and what I can do (but
> >> >> the possibility to, for example, bridge the trunk interface
> >> >> would make testing easier, that justifies something...).
> >> >>
> >> >> Thank you all very much for your work.
> >> >
> >> > Are these changes already in 2.6.37-rc4? Or, if not are they
> >> > somewhere publically available already?
> >> >
> >> > I looked into various changelogs but have some difficulties to
> >> > identify them...
> >> >
> >> > Maybe I have some time next days to give them an try...
> >>
> >> Yes, all of the existing vlan changes are in 2.6.37-rc4.  There
> >> were a number of patches but the main one was
> >> 3701e51382a026cba10c60b03efabe534fba4ca4
> >
> > Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
> > configured) does not work on HP DL320G5 (for exact description and
> > examples please see my mail a few days ago).
> 
> What driver are you using?  Is it tg3?
> 
> The vlan changes that I made unfortunately require updating drivers to
> get the full benefit.  I've been busy lately so tg3 hasn't yet been
> updated.
> 
> I know that tg3 does some things differently depending on whether a
> vlan group is configured, so that would likely be the cause of what
> you are seeing.  I'd have to look at it in more detail to be sure
> though.
> 
> You said that everything works on the other Broadcom NIC that you
> tested?  Maybe it uses bnx2 instead?
> 

Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu (but
this should not matter, I think).

If I can do anything to support your investigations / work (most
likely testing / providing information) please let me know.

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-01 10:55         ` [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used Eric Dumazet
  2010-12-05  2:07           ` Michael Leun
@ 2010-12-08 16:47           ` David Miller
  2010-12-08 23:06             ` [stable] " Greg KH
  2010-12-08 23:16             ` Greg KH
  1 sibling, 2 replies; 53+ messages in thread
From: David Miller @ 2010-12-08 16:47 UTC (permalink / raw)
  To: eric.dumazet; +Cc: lkml20101129, greearb, linux-kernel, netdev, jesse, stable

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Dec 2010 11:55:14 +0100


Greg/-stable, please integrate this patch from Eric into 2.6.36 if you
haven't already done so.

Thanks!

> [PATCH v2 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used.
> 
> Normally hardware accelerated vlan packets are quickly dropped if
> there is no corresponding vlan device configured.  The one exception
> is promiscuous mode, where we allow all of these packets through so
> they can be picked up by tcpdump.  However, this behavior causes a
> crash if we actually try to receive these packets.  This fixes that
> crash by ignoring packets with vids not corresponding to a configured
> device in the vlan hwaccel routines and then dropping them before they
> get to consumers in the network stack.
> 
> Reported-by: Ben Greear <greearb@candelatech.com>
> Signed-off-by: Jesse Gross <jesse@nicira.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Tested-by: Michael Leun <lkml20101129@newton.leun.net>
> ---
> v2: survives to tcpdump :)
> 
>  net/core/dev.c        |   10 ++++++++++
>  net/8021q/vlan_core.c |    3 +++
>  2 files changed, 13 insertions(+)
> 
> --- linux-2.6.36/net/core/dev.c.orig
> +++ linux-2.6.36/net/core/dev.c
> @@ -2891,6 +2891,15 @@
>  ncls:
>  #endif
>  
> +	/* If we got this far with a hardware accelerated VLAN tag, it means
> +	 * that we were put in promiscuous mode but nobody is interested in
> +	 * this vid. Drop the packet now to prevent it from getting propagated
> +	 * to other parts of the stack that won't know how to deal with packets
> +	 * tagged in this manner.
> +	 */
> +	if (unlikely(vlan_tx_tag_present(skb)))
> +		goto bypass;
> +
>  	/* Handle special case of bridge or macvlan */
>  	rx_handler = rcu_dereference(skb->dev->rx_handler);
>  	if (rx_handler) {
> @@ -2927,6 +2936,7 @@
>  		}
>  	}
>  
> +bypass:
>  	if (pt_prev) {
>  		ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
>  	} else {
> --- linux-2.6.36/net/8021q/vlan_core.c.orig
> +++ linux-2.6.36/net/8021q/vlan_core.c
> @@ -43,6 +43,9 @@
>  	struct net_device *dev = skb->dev;
>  	struct vlan_rx_stats     *rx_stats;
>  
> +	if (unlikely(!is_vlan_dev(dev)))
> +		return 0;
> +
>  	skb->dev = vlan_dev_info(dev)->real_dev;
>  	netif_nit_deliver(skb);
>  
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [stable] [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-08 16:47           ` David Miller
@ 2010-12-08 23:06             ` Greg KH
  2010-12-08 23:16             ` Greg KH
  1 sibling, 0 replies; 53+ messages in thread
From: Greg KH @ 2010-12-08 23:06 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, lkml20101129, netdev, jesse, linux-kernel, greearb, stable

On Wed, Dec 08, 2010 at 08:47:31AM -0800, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 01 Dec 2010 11:55:14 +0100
> 
> 
> Greg/-stable, please integrate this patch from Eric into 2.6.36 if you
> haven't already done so.

It looks like I took a different, older version, I'll update it to this
one now.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [stable] [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-08 16:47           ` David Miller
  2010-12-08 23:06             ` [stable] " Greg KH
@ 2010-12-08 23:16             ` Greg KH
  2010-12-09  1:25               ` Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Greg KH @ 2010-12-08 23:16 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, lkml20101129, netdev, jesse, linux-kernel, greearb, stable

On Wed, Dec 08, 2010 at 08:47:31AM -0800, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 01 Dec 2010 11:55:14 +0100
> 
> 
> Greg/-stable, please integrate this patch from Eric into 2.6.36 if you
> haven't already done so.

I've updated the version in the .36-stable queue with this one, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [stable] [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-08 23:16             ` Greg KH
@ 2010-12-09  1:25               ` Eric Dumazet
  2010-12-09 20:13                 ` Greg KH
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2010-12-09  1:25 UTC (permalink / raw)
  To: Greg KH
  Cc: David Miller, lkml20101129, netdev, jesse, linux-kernel, greearb, stable

Le mercredi 08 décembre 2010 à 15:16 -0800, Greg KH a écrit :
> On Wed, Dec 08, 2010 at 08:47:31AM -0800, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 01 Dec 2010 11:55:14 +0100
> > 
> > 
> > Greg/-stable, please integrate this patch from Eric into 2.6.36 if you
> > haven't already done so.
> 
> I've updated the version in the .36-stable queue with this one, thanks.
> 

Thanks.

By the way, all credits given to Jesse, I only made a change to his
patch, so I left him as the author.

Ah I see I forgot the "From: Jesse Gross <jesse@nicira.com>" header when
I sent it, my bad, sorry !




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [stable] [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-09  1:25               ` Eric Dumazet
@ 2010-12-09 20:13                 ` Greg KH
  0 siblings, 0 replies; 53+ messages in thread
From: Greg KH @ 2010-12-09 20:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, lkml20101129, netdev, jesse, linux-kernel, greearb, stable

On Thu, Dec 09, 2010 at 02:25:33AM +0100, Eric Dumazet wrote:
> Le mercredi 08 décembre 2010 à 15:16 -0800, Greg KH a écrit :
> > On Wed, Dec 08, 2010 at 08:47:31AM -0800, David Miller wrote:
> > > From: Eric Dumazet <eric.dumazet@gmail.com>
> > > Date: Wed, 01 Dec 2010 11:55:14 +0100
> > > 
> > > 
> > > Greg/-stable, please integrate this patch from Eric into 2.6.36 if you
> > > haven't already done so.
> > 
> > I've updated the version in the .36-stable queue with this one, thanks.
> > 
> 
> Thanks.
> 
> By the way, all credits given to Jesse, I only made a change to his
> patch, so I left him as the author.
> 
> Ah I see I forgot the "From: Jesse Gross <jesse@nicira.com>" header when
> I sent it, my bad, sorry !

No problem, I have that in the patch header already :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-06 21:27                         ` Michael Leun
@ 2010-12-13  0:11                           ` Jesse Gross
  2010-12-13 22:45                             ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2010-12-13  0:11 UTC (permalink / raw)
  To: Michael Leun, Matt Carlson, Michael Chan
  Cc: Eric Dumazet, David Miller, Ben Greear, linux-kernel, netdev

On Mon, Dec 6, 2010 at 1:27 PM, Michael Leun
<lkml20101129@newton.leun.net> wrote:
> On Mon, 6 Dec 2010 12:04:48 -0800
> Jesse Gross <jesse@nicira.com> wrote:
>
>> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
>> <lkml20101129@newton.leun.net> wrote:
>> > On Mon, 6 Dec 2010 10:14:55 -0800
>> > Jesse Gross <jesse@nicira.com> wrote:
>> >
>> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
>> >> <lkml20101129@newton.leun.net> wrote:
>> >> > Hi Jesse,
>> >> >
>> >> > On Sun, 5 Dec 2010 10:55:28 +0100
>> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
>> >> >
>> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
>> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >> >>
>> >> >> > > But on
>> >> >> > >
>> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
>> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
>> >> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet
>> >> >> > > controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
>> >> >> > > Ethernet (rev a3)
>> >> >> > >
>> >> >> > > the good message is that it also does not crash, but with
>> >> >> > > tcpdump I see vlan tags when no vlan devices configured on
>> >> >> > > the respective eth, if so I do not see tags anymore vlan
>> >> >> > > tags on the trunk interface.
>> >> >> > >
>> >> >> >
>> >> >> > For all these very specific needs, you'll have to try 2.6.37
>> >> >> > I am afraid. Jesse did huge changes to exactly make this
>> >> >> > working, we wont backport this to 2.6.36, but only avoid
>> >> >> > crashes.
>> >> >>
>> >> >> OK, I'm perfectly fine with that, of course, actually nice to
>> >> >> hear that the issue already is addressed.
>> >> >>
>> >> >> Likely I'll give some rc an shot on this machine (maybe over
>> >> >> christmas), but it is an production machine (acutally testing
>> >> >> other devices is the "product" produced on this machine), so
>> >> >> unfortunately I'm not that free in when and what I can do (but
>> >> >> the possibility to, for example, bridge the trunk interface
>> >> >> would make testing easier, that justifies something...).
>> >> >>
>> >> >> Thank you all very much for your work.
>> >> >
>> >> > Are these changes already in 2.6.37-rc4? Or, if not are they
>> >> > somewhere publically available already?
>> >> >
>> >> > I looked into various changelogs but have some difficulties to
>> >> > identify them...
>> >> >
>> >> > Maybe I have some time next days to give them an try...
>> >>
>> >> Yes, all of the existing vlan changes are in 2.6.37-rc4.  There
>> >> were a number of patches but the main one was
>> >> 3701e51382a026cba10c60b03efabe534fba4ca4
>> >
>> > Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
>> > configured) does not work on HP DL320G5 (for exact description and
>> > examples please see my mail a few days ago).
>>
>> What driver are you using?  Is it tg3?
>>
>> The vlan changes that I made unfortunately require updating drivers to
>> get the full benefit.  I've been busy lately so tg3 hasn't yet been
>> updated.
>>
>> I know that tg3 does some things differently depending on whether a
>> vlan group is configured, so that would likely be the cause of what
>> you are seeing.  I'd have to look at it in more detail to be sure
>> though.
>>
>> You said that everything works on the other Broadcom NIC that you
>> tested?  Maybe it uses bnx2 instead?
>>
>
> Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu (but
> this should not matter, I think).
>
> If I can do anything to support your investigations / work (most
> likely testing / providing information) please let me know.

Unfortunately, I probably won't have time to look at this in the near
future.  Given that the test works on one NIC but not another that
strongly suggests that it is a driver problem, even if both NICs use
the same driver.  I see tg3 can do different things with vlans
depending on the model and what features are enabled.  I also ran a
quick test on some of my machines and I didn't experience this issue.
They are running net-next with ixgbe.

One of the main goals of my general vlan changes was to remove as much
logic as possible from the drivers and put it in the networking core,
so we should in theory see consistent behavior.  However, in 2.6.36
and earlier, each driver knows about what vlan devices are configured
and does different things with that information.

Given all of that, the most logical step to me is simply to convert
tg3 to use the new vlan infrastructure.  It should be done regardless
and it will probably solve this problem.  Maybe you can convince the
Broadcom guys to do that?  It would be a lot faster for them to do it
than me.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-13  0:11                           ` Jesse Gross
@ 2010-12-13 22:45                             ` Matt Carlson
  2010-12-14  4:07                               ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2010-12-13 22:45 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Michael Leun, Matthew Carlson, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Sun, Dec 12, 2010 at 04:11:13PM -0800, Jesse Gross wrote:
> On Mon, Dec 6, 2010 at 1:27 PM, Michael Leun
> <lkml20101129@newton.leun.net> wrote:
> > On Mon, 6 Dec 2010 12:04:48 -0800
> > Jesse Gross <jesse@nicira.com> wrote:
> >
> >> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
> >> <lkml20101129@newton.leun.net> wrote:
> >> > On Mon, 6 Dec 2010 10:14:55 -0800
> >> > Jesse Gross <jesse@nicira.com> wrote:
> >> >
> >> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> > Hi Jesse,
> >> >> >
> >> >> > On Sun, 5 Dec 2010 10:55:28 +0100
> >> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
> >> >> >
> >> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
> >> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> >> >>
> >> >> >> > > But on
> >> >> >> > >
> >> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
> >> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
> >> >> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet
> >> >> >> > > controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
> >> >> >> > > Ethernet (rev a3)
> >> >> >> > >
> >> >> >> > > the good message is that it also does not crash, but with
> >> >> >> > > tcpdump I see vlan tags when no vlan devices configured on
> >> >> >> > > the respective eth, if so I do not see tags anymore vlan
> >> >> >> > > tags on the trunk interface.
> >> >> >> > >
> >> >> >> >
> >> >> >> > For all these very specific needs, you'll have to try 2.6.37
> >> >> >> > I am afraid. Jesse did huge changes to exactly make this
> >> >> >> > working, we wont backport this to 2.6.36, but only avoid
> >> >> >> > crashes.
> >> >> >>
> >> >> >> OK, I'm perfectly fine with that, of course, actually nice to
> >> >> >> hear that the issue already is addressed.
> >> >> >>
> >> >> >> Likely I'll give some rc an shot on this machine (maybe over
> >> >> >> christmas), but it is an production machine (acutally testing
> >> >> >> other devices is the "product" produced on this machine), so
> >> >> >> unfortunately I'm not that free in when and what I can do (but
> >> >> >> the possibility to, for example, bridge the trunk interface
> >> >> >> would make testing easier, that justifies something...).
> >> >> >>
> >> >> >> Thank you all very much for your work.
> >> >> >
> >> >> > Are these changes already in 2.6.37-rc4? Or, if not are they
> >> >> > somewhere publically available already?
> >> >> >
> >> >> > I looked into various changelogs but have some difficulties to
> >> >> > identify them...
> >> >> >
> >> >> > Maybe I have some time next days to give them an try...
> >> >>
> >> >> Yes, all of the existing vlan changes are in 2.6.37-rc4. ?There
> >> >> were a number of patches but the main one was
> >> >> 3701e51382a026cba10c60b03efabe534fba4ca4
> >> >
> >> > Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
> >> > configured) does not work on HP DL320G5 (for exact description and
> >> > examples please see my mail a few days ago).
> >>
> >> What driver are you using? ?Is it tg3?
> >>
> >> The vlan changes that I made unfortunately require updating drivers to
> >> get the full benefit. ?I've been busy lately so tg3 hasn't yet been
> >> updated.
> >>
> >> I know that tg3 does some things differently depending on whether a
> >> vlan group is configured, so that would likely be the cause of what
> >> you are seeing. ?I'd have to look at it in more detail to be sure
> >> though.
> >>
> >> You said that everything works on the other Broadcom NIC that you
> >> tested? ?Maybe it uses bnx2 instead?
> >>
> >
> > Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu (but
> > this should not matter, I think).
> >
> > If I can do anything to support your investigations / work (most
> > likely testing / providing information) please let me know.
> 
> Unfortunately, I probably won't have time to look at this in the near
> future.  Given that the test works on one NIC but not another that
> strongly suggests that it is a driver problem, even if both NICs use
> the same driver.  I see tg3 can do different things with vlans
> depending on the model and what features are enabled.  I also ran a
> quick test on some of my machines and I didn't experience this issue.
> They are running net-next with ixgbe.
> 
> One of the main goals of my general vlan changes was to remove as much
> logic as possible from the drivers and put it in the networking core,
> so we should in theory see consistent behavior.  However, in 2.6.36
> and earlier, each driver knows about what vlan devices are configured
> and does different things with that information.
> 
> Given all of that, the most logical step to me is simply to convert
> tg3 to use the new vlan infrastructure.  It should be done regardless
> and it will probably solve this problem.  Maybe you can convince the
> Broadcom guys to do that?  It would be a lot faster for them to do it
> than me.

Below is the patch that converts the tg3 driver over to the new API.  I
don't see how it could fix the problem though.  Maybe the presence of
NETIF_F_HW_VLAN_TX changes things.

Compile tested only.

----
Subject: [PATCH] tg3: Use new VLAN code

This patch pivots the tg3 driver to the new VLAN infrastructure.

---
 drivers/net/tg3.c |   42 ++++++++----------------------------------
 1 files changed, 8 insertions(+), 34 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 5faa87d..2a3f83d 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -134,9 +134,6 @@
 				 TG3_TX_RING_SIZE)
 #define NEXT_TX(N)		(((N) + 1) & (TG3_TX_RING_SIZE - 1))
 
-#define TG3_RX_DMA_ALIGN		16
-#define TG3_RX_HEADROOM			ALIGN(VLAN_HLEN, TG3_RX_DMA_ALIGN)
-
 #define TG3_DMA_BYTE_ENAB		64
 
 #define TG3_RX_STD_DMA_SZ		1536
@@ -4725,8 +4722,6 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 		struct sk_buff *skb;
 		dma_addr_t dma_addr;
 		u32 opaque_key, desc_idx, *post_ptr;
-		bool hw_vlan __maybe_unused = false;
-		u16 vtag __maybe_unused = 0;
 
 		desc_idx = desc->opaque & RXD_OPAQUE_INDEX_MASK;
 		opaque_key = desc->opaque & RXD_OPAQUE_RING_MASK;
@@ -4785,12 +4780,12 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 			tg3_recycle_rx(tnapi, tpr, opaque_key,
 				       desc_idx, *post_ptr);
 
-			copy_skb = netdev_alloc_skb(tp->dev, len + VLAN_HLEN +
+			copy_skb = netdev_alloc_skb(tp->dev, len +
 						    TG3_RAW_IP_ALIGN);
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
-			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN + VLAN_HLEN);
+			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN);
 			skb_put(copy_skb, len);
 			pci_dma_sync_single_for_cpu(tp->pdev, dma_addr, len, PCI_DMA_FROMDEVICE);
 			skb_copy_from_linear_data(skb, copy_skb->data, len);
@@ -4817,30 +4812,11 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 		}
 
 		if (desc->type_flags & RXD_FLAG_VLAN &&
-		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG)) {
-			vtag = desc->err_vlan & RXD_VLAN_MASK;
-#if TG3_VLAN_TAG_USED
-			if (tp->vlgrp)
-				hw_vlan = true;
-			else
-#endif
-			{
-				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
-						    __skb_push(skb, VLAN_HLEN);
-
-				memmove(ve, skb->data + VLAN_HLEN,
-					ETH_ALEN * 2);
-				ve->h_vlan_proto = htons(ETH_P_8021Q);
-				ve->h_vlan_TCI = htons(vtag);
-			}
-		}
+		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG))
+			__vlan_hwaccel_put_tag(skb,
+					       desc->err_vlan & RXD_VLAN_MASK);
 
-#if TG3_VLAN_TAG_USED
-		if (hw_vlan)
-			vlan_gro_receive(&tnapi->napi, tp->vlgrp, vtag, skb);
-		else
-#endif
-			napi_gro_receive(&tnapi->napi, skb);
+		napi_gro_receive(&tnapi->napi, skb);
 
 		received++;
 		budget--;
@@ -13866,11 +13842,11 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	else
 		tp->tg3_flags &= ~TG3_FLAG_POLL_SERDES;
 
-	tp->rx_offset = NET_IP_ALIGN + TG3_RX_HEADROOM;
+	tp->rx_offset = NET_IP_ALIGN;
 	tp->rx_copy_thresh = TG3_RX_COPY_THRESHOLD;
 	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701 &&
 	    (tp->tg3_flags & TG3_FLAG_PCIX_MODE) != 0) {
-		tp->rx_offset -= NET_IP_ALIGN;
+		tp->rx_offset = 0;
 #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 		tp->rx_copy_thresh = ~(u16)0;
 #endif
@@ -14705,9 +14681,7 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
 
 	SET_NETDEV_DEV(dev, &pdev->dev);
 
-#if TG3_VLAN_TAG_USED
 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
-#endif
 
 	tp = netdev_priv(dev);
 	tp->pdev = pdev;
-- 
1.7.2.2



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-13 22:45                             ` Matt Carlson
@ 2010-12-14  4:07                               ` Jesse Gross
  2010-12-14 19:15                                 ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2010-12-14  4:07 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Mon, Dec 13, 2010 at 2:45 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Sun, Dec 12, 2010 at 04:11:13PM -0800, Jesse Gross wrote:
>> On Mon, Dec 6, 2010 at 1:27 PM, Michael Leun
>> <lkml20101129@newton.leun.net> wrote:
>> > On Mon, 6 Dec 2010 12:04:48 -0800
>> > Jesse Gross <jesse@nicira.com> wrote:
>> >
>> >> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
>> >> <lkml20101129@newton.leun.net> wrote:
>> >> > On Mon, 6 Dec 2010 10:14:55 -0800
>> >> > Jesse Gross <jesse@nicira.com> wrote:
>> >> >
>> >> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
>> >> >> <lkml20101129@newton.leun.net> wrote:
>> >> >> > Hi Jesse,
>> >> >> >
>> >> >> > On Sun, 5 Dec 2010 10:55:28 +0100
>> >> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
>> >> >> >
>> >> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
>> >> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >> >> >>
>> >> >> >> > > But on
>> >> >> >> > >
>> >> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
>> >> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
>> >> >> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet
>> >> >> >> > > controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
>> >> >> >> > > Ethernet (rev a3)
>> >> >> >> > >
>> >> >> >> > > the good message is that it also does not crash, but with
>> >> >> >> > > tcpdump I see vlan tags when no vlan devices configured on
>> >> >> >> > > the respective eth, if so I do not see tags anymore vlan
>> >> >> >> > > tags on the trunk interface.
>> >> >> >> > >
>> >> >> >> >
>> >> >> >> > For all these very specific needs, you'll have to try 2.6.37
>> >> >> >> > I am afraid. Jesse did huge changes to exactly make this
>> >> >> >> > working, we wont backport this to 2.6.36, but only avoid
>> >> >> >> > crashes.
>> >> >> >>
>> >> >> >> OK, I'm perfectly fine with that, of course, actually nice to
>> >> >> >> hear that the issue already is addressed.
>> >> >> >>
>> >> >> >> Likely I'll give some rc an shot on this machine (maybe over
>> >> >> >> christmas), but it is an production machine (acutally testing
>> >> >> >> other devices is the "product" produced on this machine), so
>> >> >> >> unfortunately I'm not that free in when and what I can do (but
>> >> >> >> the possibility to, for example, bridge the trunk interface
>> >> >> >> would make testing easier, that justifies something...).
>> >> >> >>
>> >> >> >> Thank you all very much for your work.
>> >> >> >
>> >> >> > Are these changes already in 2.6.37-rc4? Or, if not are they
>> >> >> > somewhere publically available already?
>> >> >> >
>> >> >> > I looked into various changelogs but have some difficulties to
>> >> >> > identify them...
>> >> >> >
>> >> >> > Maybe I have some time next days to give them an try...
>> >> >>
>> >> >> Yes, all of the existing vlan changes are in 2.6.37-rc4. ?There
>> >> >> were a number of patches but the main one was
>> >> >> 3701e51382a026cba10c60b03efabe534fba4ca4
>> >> >
>> >> > Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
>> >> > configured) does not work on HP DL320G5 (for exact description and
>> >> > examples please see my mail a few days ago).
>> >>
>> >> What driver are you using? ?Is it tg3?
>> >>
>> >> The vlan changes that I made unfortunately require updating drivers to
>> >> get the full benefit. ?I've been busy lately so tg3 hasn't yet been
>> >> updated.
>> >>
>> >> I know that tg3 does some things differently depending on whether a
>> >> vlan group is configured, so that would likely be the cause of what
>> >> you are seeing. ?I'd have to look at it in more detail to be sure
>> >> though.
>> >>
>> >> You said that everything works on the other Broadcom NIC that you
>> >> tested? ?Maybe it uses bnx2 instead?
>> >>
>> >
>> > Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu (but
>> > this should not matter, I think).
>> >
>> > If I can do anything to support your investigations / work (most
>> > likely testing / providing information) please let me know.
>>
>> Unfortunately, I probably won't have time to look at this in the near
>> future.  Given that the test works on one NIC but not another that
>> strongly suggests that it is a driver problem, even if both NICs use
>> the same driver.  I see tg3 can do different things with vlans
>> depending on the model and what features are enabled.  I also ran a
>> quick test on some of my machines and I didn't experience this issue.
>> They are running net-next with ixgbe.
>>
>> One of the main goals of my general vlan changes was to remove as much
>> logic as possible from the drivers and put it in the networking core,
>> so we should in theory see consistent behavior.  However, in 2.6.36
>> and earlier, each driver knows about what vlan devices are configured
>> and does different things with that information.
>>
>> Given all of that, the most logical step to me is simply to convert
>> tg3 to use the new vlan infrastructure.  It should be done regardless
>> and it will probably solve this problem.  Maybe you can convince the
>> Broadcom guys to do that?  It would be a lot faster for them to do it
>> than me.
>
> Below is the patch that converts the tg3 driver over to the new API.  I
> don't see how it could fix the problem though.  Maybe the presence of
> NETIF_F_HW_VLAN_TX changes things.

Thanks Matt.

There's actually a little bit more that needs to be done for
conversion.  All references to the vlan group should be gone since
that logic has been moved to the networking core.
tg3_vlan_rx_register() completely disappears and all other code
contained in TG3_VLAN_TAG_USED is unconditionally active.  Ideally,
there would be an Ethtool set_flags function so that the vlan
offloading features could be enabled/disabled for situations like this
to help with debugging.

The reason why I think that this might help is that the problem
manifests when a vlan group is configured, even if that vlan isn't
used.  Since this removes all logic about vlan groups from the driver,
it should avoid any problems in that area.  It's possible that the
actual issue is somewhere else but then it should be easier to find
since we can separate out the different components.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-14  4:07                               ` Jesse Gross
@ 2010-12-14 19:15                                 ` Matt Carlson
  2010-12-14 21:46                                   ` Jesse Gross
                                                     ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Matt Carlson @ 2010-12-14 19:15 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Mon, Dec 13, 2010 at 08:07:20PM -0800, Jesse Gross wrote:
> On Mon, Dec 13, 2010 at 2:45 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Sun, Dec 12, 2010 at 04:11:13PM -0800, Jesse Gross wrote:
> >> On Mon, Dec 6, 2010 at 1:27 PM, Michael Leun
> >> <lkml20101129@newton.leun.net> wrote:
> >> > On Mon, 6 Dec 2010 12:04:48 -0800
> >> > Jesse Gross <jesse@nicira.com> wrote:
> >> >
> >> >> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> > On Mon, 6 Dec 2010 10:14:55 -0800
> >> >> > Jesse Gross <jesse@nicira.com> wrote:
> >> >> >
> >> >> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
> >> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> >> > Hi Jesse,
> >> >> >> >
> >> >> >> > On Sun, 5 Dec 2010 10:55:28 +0100
> >> >> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
> >> >> >> >
> >> >> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
> >> >> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> >> >> >>
> >> >> >> >> > > But on
> >> >> >> >> > >
> >> >> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
> >> >> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation NetXtreme
> >> >> >> >> > > BCM5714 Gigabit Ethernet (rev a3) 03:04.1 Ethernet
> >> >> >> >> > > controller: Broadcom Corporation NetXtreme BCM5714 Gigabit
> >> >> >> >> > > Ethernet (rev a3)
> >> >> >> >> > >
> >> >> >> >> > > the good message is that it also does not crash, but with
> >> >> >> >> > > tcpdump I see vlan tags when no vlan devices configured on
> >> >> >> >> > > the respective eth, if so I do not see tags anymore vlan
> >> >> >> >> > > tags on the trunk interface.
> >> >> >> >> > >
> >> >> >> >> >
> >> >> >> >> > For all these very specific needs, you'll have to try 2.6.37
> >> >> >> >> > I am afraid. Jesse did huge changes to exactly make this
> >> >> >> >> > working, we wont backport this to 2.6.36, but only avoid
> >> >> >> >> > crashes.
> >> >> >> >>
> >> >> >> >> OK, I'm perfectly fine with that, of course, actually nice to
> >> >> >> >> hear that the issue already is addressed.
> >> >> >> >>
> >> >> >> >> Likely I'll give some rc an shot on this machine (maybe over
> >> >> >> >> christmas), but it is an production machine (acutally testing
> >> >> >> >> other devices is the "product" produced on this machine), so
> >> >> >> >> unfortunately I'm not that free in when and what I can do (but
> >> >> >> >> the possibility to, for example, bridge the trunk interface
> >> >> >> >> would make testing easier, that justifies something...).
> >> >> >> >>
> >> >> >> >> Thank you all very much for your work.
> >> >> >> >
> >> >> >> > Are these changes already in 2.6.37-rc4? Or, if not are they
> >> >> >> > somewhere publically available already?
> >> >> >> >
> >> >> >> > I looked into various changelogs but have some difficulties to
> >> >> >> > identify them...
> >> >> >> >
> >> >> >> > Maybe I have some time next days to give them an try...
> >> >> >>
> >> >> >> Yes, all of the existing vlan changes are in 2.6.37-rc4. ?There
> >> >> >> were a number of patches but the main one was
> >> >> >> 3701e51382a026cba10c60b03efabe534fba4ca4
> >> >> >
> >> >> > Then, I'm afraid, this (seeing vlan tags even if vlan interfaces are
> >> >> > configured) does not work on HP DL320G5 (for exact description and
> >> >> > examples please see my mail a few days ago).
> >> >>
> >> >> What driver are you using? ?Is it tg3?
> >> >>
> >> >> The vlan changes that I made unfortunately require updating drivers to
> >> >> get the full benefit. ?I've been busy lately so tg3 hasn't yet been
> >> >> updated.
> >> >>
> >> >> I know that tg3 does some things differently depending on whether a
> >> >> vlan group is configured, so that would likely be the cause of what
> >> >> you are seeing. ?I'd have to look at it in more detail to be sure
> >> >> though.
> >> >>
> >> >> You said that everything works on the other Broadcom NIC that you
> >> >> tested? ?Maybe it uses bnx2 instead?
> >> >>
> >> >
> >> > Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu (but
> >> > this should not matter, I think).
> >> >
> >> > If I can do anything to support your investigations / work (most
> >> > likely testing / providing information) please let me know.
> >>
> >> Unfortunately, I probably won't have time to look at this in the near
> >> future. ?Given that the test works on one NIC but not another that
> >> strongly suggests that it is a driver problem, even if both NICs use
> >> the same driver. ?I see tg3 can do different things with vlans
> >> depending on the model and what features are enabled. ?I also ran a
> >> quick test on some of my machines and I didn't experience this issue.
> >> They are running net-next with ixgbe.
> >>
> >> One of the main goals of my general vlan changes was to remove as much
> >> logic as possible from the drivers and put it in the networking core,
> >> so we should in theory see consistent behavior. ?However, in 2.6.36
> >> and earlier, each driver knows about what vlan devices are configured
> >> and does different things with that information.
> >>
> >> Given all of that, the most logical step to me is simply to convert
> >> tg3 to use the new vlan infrastructure. ?It should be done regardless
> >> and it will probably solve this problem. ?Maybe you can convince the
> >> Broadcom guys to do that? ?It would be a lot faster for them to do it
> >> than me.
> >
> > Below is the patch that converts the tg3 driver over to the new API. ?I
> > don't see how it could fix the problem though. ?Maybe the presence of
> > NETIF_F_HW_VLAN_TX changes things.
> 
> Thanks Matt.
> 
> There's actually a little bit more that needs to be done for
> conversion.  All references to the vlan group should be gone since
> that logic has been moved to the networking core.
> tg3_vlan_rx_register() completely disappears and all other code
> contained in TG3_VLAN_TAG_USED is unconditionally active.  Ideally,
> there would be an Ethtool set_flags function so that the vlan
> offloading features could be enabled/disabled for situations like this
> to help with debugging.
> 
> The reason why I think that this might help is that the problem
> manifests when a vlan group is configured, even if that vlan isn't
> used.  Since this removes all logic about vlan groups from the driver,
> it should avoid any problems in that area.  It's possible that the
> actual issue is somewhere else but then it should be easier to find
> since we can separate out the different components.

Thanks for the comments Jesse.  Below is an updated patch.

Michael, I'm wondering if the difference in behavior can be explained by
the presence or absence of management firmware.  Can you look at the
driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
this, and the below patch doesn't fix the problem, let me know.  I have
another test I'd like you to run.

----

[PATCH] tg3: Use new VLAN code

This patch pivots the tg3 driver to the new VLAN infrastructure.
All references to vlgrp have been removed and all VLAN code is
unconditionally active.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
 drivers/net/tg3.c |   95 +++++------------------------------------------------
 drivers/net/tg3.h |    3 --
 2 files changed, 9 insertions(+), 89 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 5faa87d..3682205 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -60,12 +60,6 @@
 #define BAR_0	0
 #define BAR_2	2
 
-#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
-#define TG3_VLAN_TAG_USED 1
-#else
-#define TG3_VLAN_TAG_USED 0
-#endif
-
 #include "tg3.h"
 
 #define DRV_MODULE_NAME		"tg3"
@@ -134,9 +128,6 @@
 				 TG3_TX_RING_SIZE)
 #define NEXT_TX(N)		(((N) + 1) & (TG3_TX_RING_SIZE - 1))
 
-#define TG3_RX_DMA_ALIGN		16
-#define TG3_RX_HEADROOM			ALIGN(VLAN_HLEN, TG3_RX_DMA_ALIGN)
-
 #define TG3_DMA_BYTE_ENAB		64
 
 #define TG3_RX_STD_DMA_SZ		1536
@@ -4725,8 +4716,6 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 		struct sk_buff *skb;
 		dma_addr_t dma_addr;
 		u32 opaque_key, desc_idx, *post_ptr;
-		bool hw_vlan __maybe_unused = false;
-		u16 vtag __maybe_unused = 0;
 
 		desc_idx = desc->opaque & RXD_OPAQUE_INDEX_MASK;
 		opaque_key = desc->opaque & RXD_OPAQUE_RING_MASK;
@@ -4785,12 +4774,12 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 			tg3_recycle_rx(tnapi, tpr, opaque_key,
 				       desc_idx, *post_ptr);
 
-			copy_skb = netdev_alloc_skb(tp->dev, len + VLAN_HLEN +
+			copy_skb = netdev_alloc_skb(tp->dev, len +
 						    TG3_RAW_IP_ALIGN);
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
-			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN + VLAN_HLEN);
+			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN);
 			skb_put(copy_skb, len);
 			pci_dma_sync_single_for_cpu(tp->pdev, dma_addr, len, PCI_DMA_FROMDEVICE);
 			skb_copy_from_linear_data(skb, copy_skb->data, len);
@@ -4817,30 +4806,11 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 		}
 
 		if (desc->type_flags & RXD_FLAG_VLAN &&
-		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG)) {
-			vtag = desc->err_vlan & RXD_VLAN_MASK;
-#if TG3_VLAN_TAG_USED
-			if (tp->vlgrp)
-				hw_vlan = true;
-			else
-#endif
-			{
-				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
-						    __skb_push(skb, VLAN_HLEN);
-
-				memmove(ve, skb->data + VLAN_HLEN,
-					ETH_ALEN * 2);
-				ve->h_vlan_proto = htons(ETH_P_8021Q);
-				ve->h_vlan_TCI = htons(vtag);
-			}
-		}
+		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG))
+			__vlan_hwaccel_put_tag(skb,
+					       desc->err_vlan & RXD_VLAN_MASK);
 
-#if TG3_VLAN_TAG_USED
-		if (hw_vlan)
-			vlan_gro_receive(&tnapi->napi, tp->vlgrp, vtag, skb);
-		else
-#endif
-			napi_gro_receive(&tnapi->napi, skb);
+		napi_gro_receive(&tnapi->napi, skb);
 
 		received++;
 		budget--;
@@ -5743,11 +5713,9 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb,
 		base_flags |= TXD_FLAG_TCPUDP_CSUM;
 	}
 
-#if TG3_VLAN_TAG_USED
 	if (vlan_tx_tag_present(skb))
 		base_flags |= (TXD_FLAG_VLAN |
 			       (vlan_tx_tag_get(skb) << 16));
-#endif
 
 	len = skb_headlen(skb);
 
@@ -5989,11 +5957,10 @@ static netdev_tx_t tg3_start_xmit_dma_bug(struct sk_buff *skb,
 			}
 		}
 	}
-#if TG3_VLAN_TAG_USED
+
 	if (vlan_tx_tag_present(skb))
 		base_flags |= (TXD_FLAG_VLAN |
 			       (vlan_tx_tag_get(skb) << 16));
-#endif
 
 	if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
 	    !mss && skb->len > VLAN_ETH_FRAME_LEN)
@@ -9538,17 +9505,8 @@ static void __tg3_set_rx_mode(struct net_device *dev)
 	/* When ASF is in use, we always keep the RX_MODE_KEEP_VLAN_TAG
 	 * flag clear.
 	 */
-#if TG3_VLAN_TAG_USED
-	if (!tp->vlgrp &&
-	    !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
-		rx_mode |= RX_MODE_KEEP_VLAN_TAG;
-#else
-	/* By definition, VLAN is disabled always in this
-	 * case.
-	 */
 	if (!(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
 		rx_mode |= RX_MODE_KEEP_VLAN_TAG;
-#endif
 
 	if (dev->flags & IFF_PROMISC) {
 		/* Promiscuous mode. */
@@ -11233,31 +11191,6 @@ static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	return -EOPNOTSUPP;
 }
 
-#if TG3_VLAN_TAG_USED
-static void tg3_vlan_rx_register(struct net_device *dev, struct vlan_group *grp)
-{
-	struct tg3 *tp = netdev_priv(dev);
-
-	if (!netif_running(dev)) {
-		tp->vlgrp = grp;
-		return;
-	}
-
-	tg3_netif_stop(tp);
-
-	tg3_full_lock(tp, 0);
-
-	tp->vlgrp = grp;
-
-	/* Update RX_MODE_KEEP_VLAN_TAG bit in RX_MODE register. */
-	__tg3_set_rx_mode(dev);
-
-	tg3_netif_start(tp);
-
-	tg3_full_unlock(tp);
-}
-#endif
-
 static int tg3_get_coalesce(struct net_device *dev, struct ethtool_coalesce *ec)
 {
 	struct tg3 *tp = netdev_priv(dev);
@@ -13069,9 +13002,7 @@ static struct pci_dev * __devinit tg3_find_peer(struct tg3 *);
 
 static void inline vlan_features_add(struct net_device *dev, unsigned long flags)
 {
-#if TG3_VLAN_TAG_USED
 	dev->vlan_features |= flags;
-#endif
 }
 
 static inline u32 tg3_rx_ret_ring_size(struct tg3 *tp)
@@ -13866,11 +13797,11 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	else
 		tp->tg3_flags &= ~TG3_FLAG_POLL_SERDES;
 
-	tp->rx_offset = NET_IP_ALIGN + TG3_RX_HEADROOM;
+	tp->rx_offset = NET_IP_ALIGN;
 	tp->rx_copy_thresh = TG3_RX_COPY_THRESHOLD;
 	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701 &&
 	    (tp->tg3_flags & TG3_FLAG_PCIX_MODE) != 0) {
-		tp->rx_offset -= NET_IP_ALIGN;
+		tp->rx_offset = 0;
 #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 		tp->rx_copy_thresh = ~(u16)0;
 #endif
@@ -14634,9 +14565,6 @@ static const struct net_device_ops tg3_netdev_ops = {
 	.ndo_do_ioctl		= tg3_ioctl,
 	.ndo_tx_timeout		= tg3_tx_timeout,
 	.ndo_change_mtu		= tg3_change_mtu,
-#if TG3_VLAN_TAG_USED
-	.ndo_vlan_rx_register	= tg3_vlan_rx_register,
-#endif
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= tg3_poll_controller,
 #endif
@@ -14653,9 +14581,6 @@ static const struct net_device_ops tg3_netdev_ops_dma_bug = {
 	.ndo_do_ioctl		= tg3_ioctl,
 	.ndo_tx_timeout		= tg3_tx_timeout,
 	.ndo_change_mtu		= tg3_change_mtu,
-#if TG3_VLAN_TAG_USED
-	.ndo_vlan_rx_register	= tg3_vlan_rx_register,
-#endif
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= tg3_poll_controller,
 #endif
@@ -14705,9 +14630,7 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
 
 	SET_NETDEV_DEV(dev, &pdev->dev);
 
-#if TG3_VLAN_TAG_USED
 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
-#endif
 
 	tp = netdev_priv(dev);
 	tp->pdev = pdev;
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index d62c8d9..f528243 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2808,9 +2808,6 @@ struct tg3 {
 	u32				rx_std_max_post;
 	u32				rx_offset;
 	u32				rx_pkt_map_sz;
-#if TG3_VLAN_TAG_USED
-	struct vlan_group		*vlgrp;
-#endif
 
 
 	/* begin "everything else" cacheline(s) section */
-- 
1.7.2.2



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-14 19:15                                 ` Matt Carlson
@ 2010-12-14 21:46                                   ` Jesse Gross
  2010-12-15  0:24                                   ` Michael Leun
  2011-01-01 17:03                                   ` Eric Dumazet
  2 siblings, 0 replies; 53+ messages in thread
From: Jesse Gross @ 2010-12-14 21:46 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Tue, Dec 14, 2010 at 11:15 AM, Matt Carlson <mcarlson@broadcom.com> wrote:
> @@ -9538,17 +9505,8 @@ static void __tg3_set_rx_mode(struct net_device *dev)
>        /* When ASF is in use, we always keep the RX_MODE_KEEP_VLAN_TAG
>         * flag clear.
>         */
> -#if TG3_VLAN_TAG_USED
> -       if (!tp->vlgrp &&
> -           !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
> -               rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> -#else
> -       /* By definition, VLAN is disabled always in this
> -        * case.
> -        */
>        if (!(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
>                rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> -#endif

Just one comment:

I don't think this does quite the right thing: it will always disable
vlan stripping unless ASF is in use.  However, it's now OK to always
use vlan stripping, so we might as well take advantage of it.  Since
without the set_flags Ethtool op there is no way to change this
setting, we should be able to just drop this code block completely
(and the check for  RX_MODE_KEEP_VLAN_TAG on receive).

In addition, this should also remove any differences between ASF
enabled/disabled firmware (at least with respect to vlans) since it
will no longer be a factor.

Thanks.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-14 19:15                                 ` Matt Carlson
  2010-12-14 21:46                                   ` Jesse Gross
@ 2010-12-15  0:24                                   ` Michael Leun
  2010-12-15  1:34                                     ` Matt Carlson
  2011-01-01 17:03                                   ` Eric Dumazet
  2 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-15  0:24 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Tue, 14 Dec 2010 11:15:00 -0800
"Matt Carlson" <mcarlson@broadcom.com> wrote:

> On Mon, Dec 13, 2010 at 08:07:20PM -0800, Jesse Gross wrote:
> > On Mon, Dec 13, 2010 at 2:45 PM, Matt Carlson
> > <mcarlson@broadcom.com> wrote:
> > > On Sun, Dec 12, 2010 at 04:11:13PM -0800, Jesse Gross wrote:
> > >> On Mon, Dec 6, 2010 at 1:27 PM, Michael Leun
> > >> <lkml20101129@newton.leun.net> wrote:
> > >> > On Mon, 6 Dec 2010 12:04:48 -0800
> > >> > Jesse Gross <jesse@nicira.com> wrote:
> > >> >
> > >> >> On Mon, Dec 6, 2010 at 11:34 AM, Michael Leun
> > >> >> <lkml20101129@newton.leun.net> wrote:
> > >> >> > On Mon, 6 Dec 2010 10:14:55 -0800
> > >> >> > Jesse Gross <jesse@nicira.com> wrote:
> > >> >> >
> > >> >> >> On Sun, Dec 5, 2010 at 2:44 AM, Michael Leun
> > >> >> >> <lkml20101129@newton.leun.net> wrote:
> > >> >> >> > Hi Jesse,
> > >> >> >> >
> > >> >> >> > On Sun, 5 Dec 2010 10:55:28 +0100
> > >> >> >> > Michael Leun <lkml20101129@newton.leun.net> wrote:
> > >> >> >> >
> > >> >> >> >> On Sun, 05 Dec 2010 09:03:53 +0100
> > >> >> >> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >> >> >> >>
> > >> >> >> >> > > But on
> > >> >> >> >> > >
> > >> >> >> >> > > hpdl320g5:/home/ml # lspci | grep Eth
> > >> >> >> >> > > 03:04.0 Ethernet controller: Broadcom Corporation
> > >> >> >> >> > > NetXtreme BCM5714 Gigabit Ethernet (rev a3) 03:04.1
> > >> >> >> >> > > Ethernet controller: Broadcom Corporation NetXtreme
> > >> >> >> >> > > BCM5714 Gigabit Ethernet (rev a3)
> > >> >> >> >> > >
> > >> >> >> >> > > the good message is that it also does not crash,
> > >> >> >> >> > > but with tcpdump I see vlan tags when no vlan
> > >> >> >> >> > > devices configured on the respective eth, if so I
> > >> >> >> >> > > do not see tags anymore vlan tags on the trunk
> > >> >> >> >> > > interface.
> > >> >> >> >> > >
> > >> >> >> >> >
> > >> >> >> >> > For all these very specific needs, you'll have to try
> > >> >> >> >> > 2.6.37 I am afraid. Jesse did huge changes to exactly
> > >> >> >> >> > make this working, we wont backport this to 2.6.36,
> > >> >> >> >> > but only avoid crashes.
> > >> >> >> >>
> > >> >> >> >> OK, I'm perfectly fine with that, of course, actually
> > >> >> >> >> nice to hear that the issue already is addressed.
> > >> >> >> >>
> > >> >> >> >> Likely I'll give some rc an shot on this machine (maybe
> > >> >> >> >> over christmas), but it is an production machine
> > >> >> >> >> (acutally testing other devices is the "product"
> > >> >> >> >> produced on this machine), so unfortunately I'm not
> > >> >> >> >> that free in when and what I can do (but the
> > >> >> >> >> possibility to, for example, bridge the trunk interface
> > >> >> >> >> would make testing easier, that justifies something...).
> > >> >> >> >>
> > >> >> >> >> Thank you all very much for your work.
> > >> >> >> >
> > >> >> >> > Are these changes already in 2.6.37-rc4? Or, if not are
> > >> >> >> > they somewhere publically available already?
> > >> >> >> >
> > >> >> >> > I looked into various changelogs but have some
> > >> >> >> > difficulties to identify them...
> > >> >> >> >
> > >> >> >> > Maybe I have some time next days to give them an try...
> > >> >> >>
> > >> >> >> Yes, all of the existing vlan changes are in
> > >> >> >> 2.6.37-rc4. ?There were a number of patches but the main
> > >> >> >> one was 3701e51382a026cba10c60b03efabe534fba4ca4
> > >> >> >
> > >> >> > Then, I'm afraid, this (seeing vlan tags even if vlan
> > >> >> > interfaces are configured) does not work on HP DL320G5 (for
> > >> >> > exact description and examples please see my mail a few
> > >> >> > days ago).
> > >> >>
> > >> >> What driver are you using? ?Is it tg3?
> > >> >>
> > >> >> The vlan changes that I made unfortunately require updating
> > >> >> drivers to get the full benefit. ?I've been busy lately so
> > >> >> tg3 hasn't yet been updated.
> > >> >>
> > >> >> I know that tg3 does some things differently depending on
> > >> >> whether a vlan group is configured, so that would likely be
> > >> >> the cause of what you are seeing. ?I'd have to look at it in
> > >> >> more detail to be sure though.
> > >> >>
> > >> >> You said that everything works on the other Broadcom NIC that
> > >> >> you tested? ?Maybe it uses bnx2 instead?
> > >> >>
> > >> >
> > >> > Both machines use tg3 / 2.6.36.1 - one is opensuse, one ubuntu
> > >> > (but this should not matter, I think).
> > >> >
> > >> > If I can do anything to support your investigations / work
> > >> > (most likely testing / providing information) please let me
> > >> > know.
> > >>
> > >> Unfortunately, I probably won't have time to look at this in the
> > >> near future. ?Given that the test works on one NIC but not
> > >> another that strongly suggests that it is a driver problem, even
> > >> if both NICs use the same driver. ?I see tg3 can do different
> > >> things with vlans depending on the model and what features are
> > >> enabled. ?I also ran a quick test on some of my machines and I
> > >> didn't experience this issue. They are running net-next with
> > >> ixgbe.
> > >>
> > >> One of the main goals of my general vlan changes was to remove
> > >> as much logic as possible from the drivers and put it in the
> > >> networking core, so we should in theory see consistent
> > >> behavior. ?However, in 2.6.36 and earlier, each driver knows
> > >> about what vlan devices are configured and does different things
> > >> with that information.
> > >>
> > >> Given all of that, the most logical step to me is simply to
> > >> convert tg3 to use the new vlan infrastructure. ?It should be
> > >> done regardless and it will probably solve this problem. ?Maybe
> > >> you can convince the Broadcom guys to do that? ?It would be a
> > >> lot faster for them to do it than me.
> > >
> > > Below is the patch that converts the tg3 driver over to the new
> > > API. ?I don't see how it could fix the problem though. ?Maybe the
> > > presence of NETIF_F_HW_VLAN_TX changes things.
> > 
> > Thanks Matt.
> > 
> > There's actually a little bit more that needs to be done for
> > conversion.  All references to the vlan group should be gone since
> > that logic has been moved to the networking core.
> > tg3_vlan_rx_register() completely disappears and all other code
> > contained in TG3_VLAN_TAG_USED is unconditionally active.  Ideally,
> > there would be an Ethtool set_flags function so that the vlan
> > offloading features could be enabled/disabled for situations like
> > this to help with debugging.
> > 
> > The reason why I think that this might help is that the problem
> > manifests when a vlan group is configured, even if that vlan isn't
> > used.  Since this removes all logic about vlan groups from the
> > driver, it should avoid any problems in that area.  It's possible
> > that the actual issue is somewhere else but then it should be
> > easier to find since we can separate out the different components.
> 
> Thanks for the comments Jesse.  Below is an updated patch.
> 
> Michael, I'm wondering if the difference in behavior can be explained
> by the presence or absence of management firmware.  Can you look at
> the driver sign-on messages in your syslogs for ASF[]?  I'm half
> expecting the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".
> If you see this, and the below patch doesn't fix the problem, let me
> know.  I have another test I'd like you to run.

Do I understand this correct? "Management firmware" or ASF is some
feature, vendor decides to built into network card (firmware) or not?

If so, would'nt one expect two oneboard network cards in one server
to look alike?

HP Proliant DL320G5

<6>tg3.c:v3.113 (August 2, 2010)
<6>tg3 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
<6>tg3 0000:03:04.0: eth0: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
<6>tg3 0000:03:04.0: eth0: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
<6>tg3 0000:03:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
<6>tg3 0000:03:04.0: eth0: dma_rwctrl[76148000] dma_mask[64-bit]
<6>tg3 0000:03:04.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
<6>tg3 0000:03:04.1: eth1: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
<6>tg3 0000:03:04.1: eth1: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
<6>tg3 0000:03:04.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
<6>tg3 0000:03:04.1: eth1: dma_rwctrl[76148000] dma_mask[64-bit]

Lenovo ThinkPad z61m

[    2.679130] tg3.c:v3.113 (August 2, 2010)
[    2.679176] tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    2.679188] tg3 0000:02:00.0: setting latency timer to 64
[    2.728572] tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95752m) rev 6002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
[    2.728577] tg3 0000:02:00.0: eth0: attached PHY is 5752 (10/100/1000Base-T Ethernet) (WireSpeed[1])
[    2.728581] tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    2.728585] tg3 0000:02:00.0: eth0: dma_rwctrl[76180000]
dma_mask[64-bit]


> ----
> 
> [PATCH] tg3: Use new VLAN code

Unfortunately had'nt time to try much now, but with 2.6.37-rc5 / your
patch on the DL320, single user mode (nothing configured on eth) just
after ifconfig eth0/eth1 up I see NO vlan tags on eth0 but I see vlan
tags on eth1, so there clearly is a difference.

I should have checked if I still see vlan tags on eth1 if I configure
some vlan there - if helpful maybe I can do this (have to look, when I
can effort another downtime).

I wonder, if the difference in that both onboard cards is really there
or if there is some malfunction in detecion?

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-15  0:24                                   ` Michael Leun
@ 2010-12-15  1:34                                     ` Matt Carlson
  2010-12-15  7:16                                       ` Michael Leun
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2010-12-15  1:34 UTC (permalink / raw)
  To: Michael Leun
  Cc: Matthew Carlson, Jesse Gross, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Tue, Dec 14, 2010 at 04:24:30PM -0800, Michael Leun wrote:
> On Tue, 14 Dec 2010 11:15:00 -0800
> "Matt Carlson" <mcarlson@broadcom.com> wrote:
> > Michael, I'm wondering if the difference in behavior can be explained
> > by the presence or absence of management firmware.  Can you look at
> > the driver sign-on messages in your syslogs for ASF[]?  I'm half
> > expecting the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".
> > If you see this, and the below patch doesn't fix the problem, let me
> > know.  I have another test I'd like you to run.
> 
> Do I understand this correct? "Management firmware" or ASF is some
> feature, vendor decides to built into network card (firmware) or not?

Right.

> If so, would'nt one expect two oneboard network cards in one server
> to look alike?

Mostly, yes.  Except for.....

> HP Proliant DL320G5
> 
> <6>tg3.c:v3.113 (August 2, 2010)
> <6>tg3 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> <6>tg3 0000:03:04.0: eth0: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
> <6>tg3 0000:03:04.0: eth0: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
> <6>tg3 0000:03:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
                                                      This =>  ^^^^^^
> <6>tg3 0000:03:04.0: eth0: dma_rwctrl[76148000] dma_mask[64-bit]
> <6>tg3 0000:03:04.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
> <6>tg3 0000:03:04.1: eth1: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
> <6>tg3 0000:03:04.1: eth1: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
> <6>tg3 0000:03:04.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
                                                  And this =>  ^^^^^^
> <6>tg3 0000:03:04.1: eth1: dma_rwctrl[76148000] dma_mask[64-bit]

So management firmware is turned off on the second port.

> Lenovo ThinkPad z61m
> 
> [    2.679130] tg3.c:v3.113 (August 2, 2010)
> [    2.679176] tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> [    2.679188] tg3 0000:02:00.0: setting latency timer to 64
> [    2.728572] tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95752m) rev 6002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> [    2.728577] tg3 0000:02:00.0: eth0: attached PHY is 5752 (10/100/1000Base-T Ethernet) (WireSpeed[1])
> [    2.728581] tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
                                                                           ^^^^^^
And it isn't present on the 5752.

> [    2.728585] tg3 0000:02:00.0: eth0: dma_rwctrl[76180000]
> dma_mask[64-bit]
> 
> 
> > ----
> > 
> > [PATCH] tg3: Use new VLAN code
> 
> Unfortunately had'nt time to try much now, but with 2.6.37-rc5 / your
> patch on the DL320, single user mode (nothing configured on eth) just
> after ifconfig eth0/eth1 up I see NO vlan tags on eth0 but I see vlan
> tags on eth1, so there clearly is a difference.
> 
> I should have checked if I still see vlan tags on eth1 if I configure
> some vlan there - if helpful maybe I can do this (have to look, when I
> can effort another downtime).

This would be helpful, just to solidify our findings.

> I wonder, if the difference in that both onboard cards is really there
> or if there is some malfunction in detecion?

Please run the above test first, but afterwards, can you apply the below
patch on top of your current sources.  I suspect eth1 will begin to act
like eth0.

This patch is just a test.

[PATCH] tg3: Always strip VLAN tags

This patch configures the hardware to always strip VLAN tags from
incoming packets.
---
 drivers/net/tg3.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 3682205..964293f 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -9505,8 +9505,10 @@ static void __tg3_set_rx_mode(struct net_device *dev)
 	/* When ASF is in use, we always keep the RX_MODE_KEEP_VLAN_TAG
 	 * flag clear.
 	 */
+#if 0
 	if (!(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
 		rx_mode |= RX_MODE_KEEP_VLAN_TAG;
+#endif
 
 	if (dev->flags & IFF_PROMISC) {
 		/* Promiscuous mode. */
-- 
1.7.2.2



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-15  1:34                                     ` Matt Carlson
@ 2010-12-15  7:16                                       ` Michael Leun
  2010-12-19  3:38                                         ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Leun @ 2010-12-15  7:16 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Tue, 14 Dec 2010 17:34:31 -0800
"Matt Carlson" <mcarlson@broadcom.com> wrote:

> On Tue, Dec 14, 2010 at 04:24:30PM -0800, Michael Leun wrote:
> > On Tue, 14 Dec 2010 11:15:00 -0800
> > "Matt Carlson" <mcarlson@broadcom.com> wrote:
> > > Michael, I'm wondering if the difference in behavior can be
> > > explained by the presence or absence of management firmware.  Can
> > > you look at the driver sign-on messages in your syslogs for
> > > ASF[]?  I'm half expecting the 5752 to show "ASF[0]" and the 5714
> > > to show "ASF[1]". If you see this, and the below patch doesn't
> > > fix the problem, let me know.  I have another test I'd like you
> > > to run.
> > 
> > Do I understand this correct? "Management firmware" or ASF is some
> > feature, vendor decides to built into network card (firmware) or
> > not?
> 
> Right.
> 
> > If so, would'nt one expect two oneboard network cards in one server
> > to look alike?
> 
> Mostly, yes.  Except for.....
> 
> > HP Proliant DL320G5
> > 
> > <6>tg3.c:v3.113 (August 2, 2010)
> > <6>tg3 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> > <6>tg3 0000:03:04.0: eth0: Tigon3 [partno(N/A) rev 9003]
> > (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx <6>tg3
> > 0000:03:04.0: eth0: attached PHY is 5714 (10/100/1000Base-T
> > Ethernet) (WireSpeed[1]) <6>tg3 0000:03:04.0: eth0: RXcsums[1]
> > LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
>                                                       This =>  ^^^^^^
> > <6>tg3 0000:03:04.0: eth0: dma_rwctrl[76148000] dma_mask[64-bit]
> > <6>tg3 0000:03:04.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
> > <6>tg3 0000:03:04.1: eth1: Tigon3 [partno(N/A) rev 9003]
> > (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx <6>tg3
> > 0000:03:04.1: eth1: attached PHY is 5714 (10/100/1000Base-T
> > Ethernet) (WireSpeed[1]) <6>tg3 0000:03:04.1: eth1: RXcsums[1]
> > LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
>                                                   And this =>  ^^^^^^
> > <6>tg3 0000:03:04.1: eth1: dma_rwctrl[76148000] dma_mask[64-bit]
> 
> So management firmware is turned off on the second port.
> 
> > Lenovo ThinkPad z61m
> > 
> > [    2.679130] tg3.c:v3.113 (August 2, 2010)
> > [    2.679176] tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low)
> > -> IRQ 16 [    2.679188] tg3 0000:02:00.0: setting latency timer to
> > 64 [    2.728572] tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95752m)
> > rev 6002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > [    2.728577] tg3 0000:02:00.0: eth0: attached PHY is 5752
> > (10/100/1000Base-T Ethernet) (WireSpeed[1]) [    2.728581] tg3
> > 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0]
> > TSOcap[1]
>                                                                            ^^^^^^
> And it isn't present on the 5752.
> 
> > [    2.728585] tg3 0000:02:00.0: eth0: dma_rwctrl[76180000]
> > dma_mask[64-bit]
> > 
> > 
> > > ----
> > > 
> > > [PATCH] tg3: Use new VLAN code
> > 
> > Unfortunately had'nt time to try much now, but with 2.6.37-rc5 /
> > your patch on the DL320, single user mode (nothing configured on
> > eth) just after ifconfig eth0/eth1 up I see NO vlan tags on eth0
> > but I see vlan tags on eth1, so there clearly is a difference.
> > 
> > I should have checked if I still see vlan tags on eth1 if I
> > configure some vlan there - if helpful maybe I can do this (have to
> > look, when I can effort another downtime).
> 
> This would be helpful, just to solidify our findings.
> 
> > I wonder, if the difference in that both onboard cards is really
> > there or if there is some malfunction in detecion?
> 
> Please run the above test first, but afterwards, can you apply the
> below patch on top of your current sources.  I suspect eth1 will
> begin to act like eth0.
> 
> This patch is just a test.
> 
> [PATCH] tg3: Always strip VLAN tags
> 
> This patch configures the hardware to always strip VLAN tags from
> incoming packets.

OK - all tests done on that DL320G5:

For completeness, 2.6.37-rc5 unpatched:

eth0, no vlan configured: totally broken - see double tagged vlans
without tag, single or untagged packets missing at all

eth0, vlan configured: see packets without vlan tag (see double tagged
packets with one vlan tag)

eth1 same as originally reported:
without vlan configured see vlan tags (single and double tagged as
expected)
with vlan configured: see packets without vlan tag (see double tagged
packets with one vlan tag)


2.6.37-rc5, your tg3 use new vlan-code patch:

eth0, no vlan configured:  see packets without vlan tag (see double
tagged packets with one vlan tag)
eth1, no vlan configured: see vlan tags (single and double tagged as
expected)


eth0, vlan configured: as without vlan
eth1, vlan configured: as without vlan

2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop

eth1 no vlan configured: see packets without vlan tag (see double tagged
packets with one vlan tag)
eth1 with vlan: the same

Hope I did not miss a test I should have done - hope, I did not confuse
anything. If something is not what you would expect you might ask, I've
a script from that session and can look (but cannot post that, sorry).




-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-15  7:16                                       ` Michael Leun
@ 2010-12-19  3:38                                         ` Jesse Gross
  2011-01-07  3:24                                           ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2010-12-19  3:38 UTC (permalink / raw)
  To: Michael Leun
  Cc: Matt Carlson, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
<lkml20101129@newton.leun.net> wrote:
> OK - all tests done on that DL320G5:
>
> For completeness, 2.6.37-rc5 unpatched:
>
> eth0, no vlan configured: totally broken - see double tagged vlans
> without tag, single or untagged packets missing at all

Random behavior?  This one is somewhat hard to explain - maybe there
are some other factors.  eth0 has ASF on, so it always strips tags.  I
would expect it to behave like the vlan configured case.

>
> eth0, vlan configured: see packets without vlan tag (see double tagged
> packets with one vlan tag)

Both ASF and vlan group configured cause tag stripping to be enabled.
Missing tag.

>
> eth1 same as originally reported:
> without vlan configured see vlan tags (single and double tagged as
> expected)

No ASF and no vlan group means tag stripping is disabled.  Have tag.

> with vlan configured: see packets without vlan tag (see double tagged
> packets with one vlan tag)

Configuring vlan group causes stripping to be enabled.  Missing tag.

>
>
> 2.6.37-rc5, your tg3 use new vlan-code patch:
>
> eth0, no vlan configured:  see packets without vlan tag (see double
> tagged packets with one vlan tag)

ASF enables tag stripping.  Missing tag.

> eth1, no vlan configured: see vlan tags (single and double tagged as
> expected)

No ASF, no vlan group means no stripping.  Have tag.

>
>
> eth0, vlan configured: as without vlan

ASF enables stripping.  Missing tag.

> eth1, vlan configured: as without vlan

With this patch vlan stripping is only enabled when ASF is on, so no
stripping.  Have tag.

>
> 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>
> eth1 no vlan configured: see packets without vlan tag (see double tagged
> packets with one vlan tag)

With the second patch, vlan stripping is always enabled.  Missing tag.

> eth1 with vlan: the same

Stripping still always enabled.  Missing tag.

The bottom line is whenever vlan stripping is enabled we're missing
the outer tag.  It might be worth adding some debugging in the area
before napi_gro_receive/vlan_gro_receive (depending on version).  My
guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
vlan packets on this NIC.

You said that everything works on the 5752?  Matt, is it possible that
the 5714 either has a problem with vlan stripping or a different way
of reporting it?

Also, why does ASF require vlan stripping?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-14 19:15                                 ` Matt Carlson
  2010-12-14 21:46                                   ` Jesse Gross
  2010-12-15  0:24                                   ` Michael Leun
@ 2011-01-01 17:03                                   ` Eric Dumazet
  2011-01-02  0:27                                     ` Jesse Gross
  2 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-01 17:03 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :

> Thanks for the comments Jesse.  Below is an updated patch.
> 
> Michael, I'm wondering if the difference in behavior can be explained by
> the presence or absence of management firmware.  Can you look at the
> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
> this, and the below patch doesn't fix the problem, let me know.  I have
> another test I'd like you to run.
> 
> ----
> 
> [PATCH] tg3: Use new VLAN code
> 
> This patch pivots the tg3 driver to the new VLAN infrastructure.
> All references to vlgrp have been removed and all VLAN code is
> unconditionally active.
> 
> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
> ---
>  drivers/net/tg3.c |   95 +++++------------------------------------------------
>  drivers/net/tg3.h |    3 --
>  2 files changed, 9 insertions(+), 89 deletions(-)
> 
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index 5faa87d..3682205 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -60,12 +60,6 @@
>  #define BAR_0	0
>  #define BAR_2	2
>  
> -#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> -#define TG3_VLAN_TAG_USED 1
> -#else
> -#define TG3_VLAN_TAG_USED 0
> -#endif
> -
>  #include "tg3.h"
>  
>  #define DRV_MODULE_NAME		"tg3"
> @@ -134,9 +128,6 @@
>  				 TG3_TX_RING_SIZE)
>  #define NEXT_TX(N)		(((N) + 1) & (TG3_TX_RING_SIZE - 1))
>  
> -#define TG3_RX_DMA_ALIGN		16
> -#define TG3_RX_HEADROOM			ALIGN(VLAN_HLEN, TG3_RX_DMA_ALIGN)
> -
>  #define TG3_DMA_BYTE_ENAB		64
>  
>  #define TG3_RX_STD_DMA_SZ		1536
> @@ -4725,8 +4716,6 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
>  		struct sk_buff *skb;
>  		dma_addr_t dma_addr;
>  		u32 opaque_key, desc_idx, *post_ptr;
> -		bool hw_vlan __maybe_unused = false;
> -		u16 vtag __maybe_unused = 0;
>  
>  		desc_idx = desc->opaque & RXD_OPAQUE_INDEX_MASK;
>  		opaque_key = desc->opaque & RXD_OPAQUE_RING_MASK;
> @@ -4785,12 +4774,12 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
>  			tg3_recycle_rx(tnapi, tpr, opaque_key,
>  				       desc_idx, *post_ptr);
>  
> -			copy_skb = netdev_alloc_skb(tp->dev, len + VLAN_HLEN +
> +			copy_skb = netdev_alloc_skb(tp->dev, len +
>  						    TG3_RAW_IP_ALIGN);
>  			if (copy_skb == NULL)
>  				goto drop_it_no_recycle;
>  
> -			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN + VLAN_HLEN);
> +			skb_reserve(copy_skb, TG3_RAW_IP_ALIGN);
>  			skb_put(copy_skb, len);
>  			pci_dma_sync_single_for_cpu(tp->pdev, dma_addr, len, PCI_DMA_FROMDEVICE);
>  			skb_copy_from_linear_data(skb, copy_skb->data, len);
> @@ -4817,30 +4806,11 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
>  		}
>  
>  		if (desc->type_flags & RXD_FLAG_VLAN &&
> -		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG)) {
> -			vtag = desc->err_vlan & RXD_VLAN_MASK;
> -#if TG3_VLAN_TAG_USED
> -			if (tp->vlgrp)
> -				hw_vlan = true;
> -			else
> -#endif
> -			{
> -				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
> -						    __skb_push(skb, VLAN_HLEN);
> -
> -				memmove(ve, skb->data + VLAN_HLEN,
> -					ETH_ALEN * 2);
> -				ve->h_vlan_proto = htons(ETH_P_8021Q);
> -				ve->h_vlan_TCI = htons(vtag);
> -			}
> -		}
> +		    !(tp->rx_mode & RX_MODE_KEEP_VLAN_TAG))
> +			__vlan_hwaccel_put_tag(skb,
> +					       desc->err_vlan & RXD_VLAN_MASK);
>  
> -#if TG3_VLAN_TAG_USED
> -		if (hw_vlan)
> -			vlan_gro_receive(&tnapi->napi, tp->vlgrp, vtag, skb);
> -		else
> -#endif
> -			napi_gro_receive(&tnapi->napi, skb);
> +		napi_gro_receive(&tnapi->napi, skb);
>  
>  		received++;
>  		budget--;
> @@ -5743,11 +5713,9 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb,
>  		base_flags |= TXD_FLAG_TCPUDP_CSUM;
>  	}
>  
> -#if TG3_VLAN_TAG_USED
>  	if (vlan_tx_tag_present(skb))
>  		base_flags |= (TXD_FLAG_VLAN |
>  			       (vlan_tx_tag_get(skb) << 16));
> -#endif
>  
>  	len = skb_headlen(skb);
>  
> @@ -5989,11 +5957,10 @@ static netdev_tx_t tg3_start_xmit_dma_bug(struct sk_buff *skb,
>  			}
>  		}
>  	}
> -#if TG3_VLAN_TAG_USED
> +
>  	if (vlan_tx_tag_present(skb))
>  		base_flags |= (TXD_FLAG_VLAN |
>  			       (vlan_tx_tag_get(skb) << 16));
> -#endif
>  
>  	if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
>  	    !mss && skb->len > VLAN_ETH_FRAME_LEN)
> @@ -9538,17 +9505,8 @@ static void __tg3_set_rx_mode(struct net_device *dev)
>  	/* When ASF is in use, we always keep the RX_MODE_KEEP_VLAN_TAG
>  	 * flag clear.
>  	 */
> -#if TG3_VLAN_TAG_USED
> -	if (!tp->vlgrp &&
> -	    !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
> -		rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> -#else
> -	/* By definition, VLAN is disabled always in this
> -	 * case.
> -	 */
>  	if (!(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
>  		rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> -#endif
>  
>  	if (dev->flags & IFF_PROMISC) {
>  		/* Promiscuous mode. */
> @@ -11233,31 +11191,6 @@ static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>  	return -EOPNOTSUPP;
>  }
>  
> -#if TG3_VLAN_TAG_USED
> -static void tg3_vlan_rx_register(struct net_device *dev, struct vlan_group *grp)
> -{
> -	struct tg3 *tp = netdev_priv(dev);
> -
> -	if (!netif_running(dev)) {
> -		tp->vlgrp = grp;
> -		return;
> -	}
> -
> -	tg3_netif_stop(tp);
> -
> -	tg3_full_lock(tp, 0);
> -
> -	tp->vlgrp = grp;
> -
> -	/* Update RX_MODE_KEEP_VLAN_TAG bit in RX_MODE register. */
> -	__tg3_set_rx_mode(dev);
> -
> -	tg3_netif_start(tp);
> -
> -	tg3_full_unlock(tp);
> -}
> -#endif
> -
>  static int tg3_get_coalesce(struct net_device *dev, struct ethtool_coalesce *ec)
>  {
>  	struct tg3 *tp = netdev_priv(dev);
> @@ -13069,9 +13002,7 @@ static struct pci_dev * __devinit tg3_find_peer(struct tg3 *);
>  
>  static void inline vlan_features_add(struct net_device *dev, unsigned long flags)
>  {
> -#if TG3_VLAN_TAG_USED
>  	dev->vlan_features |= flags;
> -#endif
>  }
>  
>  static inline u32 tg3_rx_ret_ring_size(struct tg3 *tp)
> @@ -13866,11 +13797,11 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
>  	else
>  		tp->tg3_flags &= ~TG3_FLAG_POLL_SERDES;
>  
> -	tp->rx_offset = NET_IP_ALIGN + TG3_RX_HEADROOM;
> +	tp->rx_offset = NET_IP_ALIGN;
>  	tp->rx_copy_thresh = TG3_RX_COPY_THRESHOLD;
>  	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701 &&
>  	    (tp->tg3_flags & TG3_FLAG_PCIX_MODE) != 0) {
> -		tp->rx_offset -= NET_IP_ALIGN;
> +		tp->rx_offset = 0;
>  #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>  		tp->rx_copy_thresh = ~(u16)0;
>  #endif
> @@ -14634,9 +14565,6 @@ static const struct net_device_ops tg3_netdev_ops = {
>  	.ndo_do_ioctl		= tg3_ioctl,
>  	.ndo_tx_timeout		= tg3_tx_timeout,
>  	.ndo_change_mtu		= tg3_change_mtu,
> -#if TG3_VLAN_TAG_USED
> -	.ndo_vlan_rx_register	= tg3_vlan_rx_register,
> -#endif
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  	.ndo_poll_controller	= tg3_poll_controller,
>  #endif
> @@ -14653,9 +14581,6 @@ static const struct net_device_ops tg3_netdev_ops_dma_bug = {
>  	.ndo_do_ioctl		= tg3_ioctl,
>  	.ndo_tx_timeout		= tg3_tx_timeout,
>  	.ndo_change_mtu		= tg3_change_mtu,
> -#if TG3_VLAN_TAG_USED
> -	.ndo_vlan_rx_register	= tg3_vlan_rx_register,
> -#endif
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  	.ndo_poll_controller	= tg3_poll_controller,
>  #endif
> @@ -14705,9 +14630,7 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
>  
>  	SET_NETDEV_DEV(dev, &pdev->dev);
>  
> -#if TG3_VLAN_TAG_USED
>  	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
> -#endif
>  
>  	tp = netdev_priv(dev);
>  	tp->pdev = pdev;
> diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
> index d62c8d9..f528243 100644
> --- a/drivers/net/tg3.h
> +++ b/drivers/net/tg3.h
> @@ -2808,9 +2808,6 @@ struct tg3 {
>  	u32				rx_std_max_post;
>  	u32				rx_offset;
>  	u32				rx_pkt_map_sz;
> -#if TG3_VLAN_TAG_USED
> -	struct vlan_group		*vlgrp;
> -#endif
>  
> 
>  	/* begin "everything else" cacheline(s) section */


Hi Matt.

Any news on this patch ?

Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
bonding.

(bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)

ip link add link bond0 vlan.103 type vlan id 103
ip addr add 192.168.20.110/24 dev vlan.103
ip link set vlan.103 up


If active slave is eth1 (bnx2), everything works, but if active slave is
eth2 (tg3), incoming tagged frames (on vlan 103) are lost.




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-01 17:03                                   ` Eric Dumazet
@ 2011-01-02  0:27                                     ` Jesse Gross
  2011-01-02 16:05                                       ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-02  0:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

On Sat, Jan 1, 2011 at 12:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :
>
>> Thanks for the comments Jesse.  Below is an updated patch.
>>
>> Michael, I'm wondering if the difference in behavior can be explained by
>> the presence or absence of management firmware.  Can you look at the
>> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
>> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
>> this, and the below patch doesn't fix the problem, let me know.  I have
>> another test I'd like you to run.
>>
>> ----
>>
>> [PATCH] tg3: Use new VLAN code
>>
>> This patch pivots the tg3 driver to the new VLAN infrastructure.
>> All references to vlgrp have been removed and all VLAN code is
>> unconditionally active.
>>
>> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>

[...]

> Hi Matt.
>
> Any news on this patch ?
>
> Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
> bonding.
>
> (bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)
>
> ip link add link bond0 vlan.103 type vlan id 103
> ip addr add 192.168.20.110/24 dev vlan.103
> ip link set vlan.103 up
>
>
> If active slave is eth1 (bnx2), everything works, but if active slave is
> eth2 (tg3), incoming tagged frames (on vlan 103) are lost.

This patch isn't quite right - it always disables vlan stripping
unless management firmware is in use, so it's not really a correct
fix.

You said that this used to work correctly on this NIC?  Does it work
without a bond, just a vlan on the tg3 device?  It sounds like Michael
has a problem with vlan stripping on one of his NICs but if it works
with just a vlan or on older kernels, it's probably not the same
thing.

If it works on bnx2, it would seem to be a driver problem but it would
be good to confirm that the tag in skb->vlan_tci is not being
delievered to the networking core in this case.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-02  0:27                                     ` Jesse Gross
@ 2011-01-02 16:05                                       ` Eric Dumazet
  2011-01-06 21:01                                         ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-02 16:05 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le samedi 01 janvier 2011 à 19:27 -0500, Jesse Gross a écrit :
> On Sat, Jan 1, 2011 at 12:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :
> >
> >> Thanks for the comments Jesse.  Below is an updated patch.
> >>
> >> Michael, I'm wondering if the difference in behavior can be explained by
> >> the presence or absence of management firmware.  Can you look at the
> >> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
> >> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
> >> this, and the below patch doesn't fix the problem, let me know.  I have
> >> another test I'd like you to run.
> >>
> >> ----
> >>
> >> [PATCH] tg3: Use new VLAN code
> >>
> >> This patch pivots the tg3 driver to the new VLAN infrastructure.
> >> All references to vlgrp have been removed and all VLAN code is
> >> unconditionally active.
> >>
> >> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
> 
> [...]
> 
> > Hi Matt.
> >
> > Any news on this patch ?
> >
> > Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
> > bonding.
> >
> > (bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)
> >
> > ip link add link bond0 vlan.103 type vlan id 103
> > ip addr add 192.168.20.110/24 dev vlan.103
> > ip link set vlan.103 up
> >
> >
> > If active slave is eth1 (bnx2), everything works, but if active slave is
> > eth2 (tg3), incoming tagged frames (on vlan 103) are lost.
> 
> This patch isn't quite right - it always disables vlan stripping
> unless management firmware is in use, so it's not really a correct
> fix.
> 
> You said that this used to work correctly on this NIC?  Does it work
> without a bond, just a vlan on the tg3 device?  It sounds like Michael
> has a problem with vlan stripping on one of his NICs but if it works
> with just a vlan or on older kernels, it's probably not the same
> thing.
> 

1) current linux-2.6 works OK for me (and previous versions as well, I
am using this vlan/bonding setup since 3 years or so on one of my dev
machine)

Only net-next-2.6 has the problem.

If I remove bonding of the equation, I still have the problem, and can
see the 'dropped' counter increasing while I send packets to eth2 (tg3)

$ ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:94 errors:0 dropped:38686 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:8332 (8.1 Kb)  TX bytes:1392 (1.3 Kb)
          Interrupt:19 
$ ifconfig vlan.103
vlan.103  Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50  
          inet addr:192.168.20.110  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:846 (846.0 b)





> If it works on bnx2, it would seem to be a driver problem but it would
> be good to confirm that the tag in skb->vlan_tci is not being
> delievered to the networking core in this case.

Hmm, where do you want me to check this ?



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-02 16:05                                       ` Eric Dumazet
@ 2011-01-06 21:01                                         ` Jesse Gross
  2011-01-06 23:34                                           ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-06 21:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

On Sun, Jan 2, 2011 at 11:05 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le samedi 01 janvier 2011 à 19:27 -0500, Jesse Gross a écrit :
>> On Sat, Jan 1, 2011 at 12:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :
>> >
>> >> Thanks for the comments Jesse.  Below is an updated patch.
>> >>
>> >> Michael, I'm wondering if the difference in behavior can be explained by
>> >> the presence or absence of management firmware.  Can you look at the
>> >> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
>> >> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
>> >> this, and the below patch doesn't fix the problem, let me know.  I have
>> >> another test I'd like you to run.
>> >>
>> >> ----
>> >>
>> >> [PATCH] tg3: Use new VLAN code
>> >>
>> >> This patch pivots the tg3 driver to the new VLAN infrastructure.
>> >> All references to vlgrp have been removed and all VLAN code is
>> >> unconditionally active.
>> >>
>> >> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
>>
>> [...]
>>
>> > Hi Matt.
>> >
>> > Any news on this patch ?
>> >
>> > Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
>> > bonding.
>> >
>> > (bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)
>> >
>> > ip link add link bond0 vlan.103 type vlan id 103
>> > ip addr add 192.168.20.110/24 dev vlan.103
>> > ip link set vlan.103 up
>> >
>> >
>> > If active slave is eth1 (bnx2), everything works, but if active slave is
>> > eth2 (tg3), incoming tagged frames (on vlan 103) are lost.
>>
>> This patch isn't quite right - it always disables vlan stripping
>> unless management firmware is in use, so it's not really a correct
>> fix.
>>
>> You said that this used to work correctly on this NIC?  Does it work
>> without a bond, just a vlan on the tg3 device?  It sounds like Michael
>> has a problem with vlan stripping on one of his NICs but if it works
>> with just a vlan or on older kernels, it's probably not the same
>> thing.
>>
>
> 1) current linux-2.6 works OK for me (and previous versions as well, I
> am using this vlan/bonding setup since 3 years or so on one of my dev
> machine)
>
> Only net-next-2.6 has the problem.
>
> If I remove bonding of the equation, I still have the problem, and can
> see the 'dropped' counter increasing while I send packets to eth2 (tg3)
>
> $ ifconfig eth2
> eth2      Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:94 errors:0 dropped:38686 overruns:0 frame:0
>          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:8332 (8.1 Kb)  TX bytes:1392 (1.3 Kb)
>          Interrupt:19
> $ ifconfig vlan.103
> vlan.103  Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50
>          inet addr:192.168.20.110  Bcast:0.0.0.0  Mask:255.255.255.0
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:0 (0.0 b)  TX bytes:846 (846.0 b)

Hmm, I thought that it might be some interaction with a corner case in
the networking core but now it seems less likely.  There weren't too
many vlan changes between the working and non-working states.  Plus,
since the rx counter isn't increasing, the packets probably aren't
making it anywhere.

I see that tg3 increases the drop counter in one place, which also
happens to be checking for vlan errors (at tg3.c:4753).  That seems
suspicious - maybe the NIC is only partially configured for vlan
offloading.  If we can confirm that is where the drop counter is being
incremented and what the error code is maybe it would shed some light.

If it's a driver issue I don't have much insight - maybe Matt or
bisect can help.

>> If it works on bnx2, it would seem to be a driver problem but it would
>> be good to confirm that the tag in skb->vlan_tci is not being
>> delievered to the networking core in this case.
>
> Hmm, where do you want me to check this ?

I was thinking right before vlan_gro_receive() at tg3.c:4837.  If my
theory above is right then this obviously isn't relevant since it
won't be hit at all.  Otherwise it would be good to know exactly what
the driver is producing.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-06 21:01                                         ` Jesse Gross
@ 2011-01-06 23:34                                           ` Eric Dumazet
  2011-01-07  1:20                                             ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-06 23:34 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le jeudi 06 janvier 2011 à 16:01 -0500, Jesse Gross a écrit :

> Hmm, I thought that it might be some interaction with a corner case in
> the networking core but now it seems less likely.  There weren't too
> many vlan changes between the working and non-working states.  Plus,
> since the rx counter isn't increasing, the packets probably aren't
> making it anywhere.
> 
> I see that tg3 increases the drop counter in one place, which also
> happens to be checking for vlan errors (at tg3.c:4753).  That seems
> suspicious - maybe the NIC is only partially configured for vlan
> offloading.  If we can confirm that is where the drop counter is being
> incremented and what the error code is maybe it would shed some light.
> 

Hmm... I am pretty sure the drop counter is the dev rx_dropped (core
network handled, not tg3 one) incremented at the end of
__netif_receive_skb() : We found no suitable handler for packets.

atomic_long_inc(&skb->dev->rx_dropped);

But thats a guess, I'll have to check

> If it's a driver issue I don't have much insight - maybe Matt or
> bisect can help.
> 
> >> If it works on bnx2, it would seem to be a driver problem but it would
> >> be good to confirm that the tag in skb->vlan_tci is not being
> >> delievered to the networking core in this case.
> >
> > Hmm, where do you want me to check this ?
> 
> I was thinking right before vlan_gro_receive() at tg3.c:4837.  If my
> theory above is right then this obviously isn't relevant since it
> won't be hit at all.  Otherwise it would be good to know exactly what
> the driver is producing.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-06 23:34                                           ` Eric Dumazet
@ 2011-01-07  1:20                                             ` Eric Dumazet
  2011-01-07  2:29                                               ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-07  1:20 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le vendredi 07 janvier 2011 à 00:34 +0100, Eric Dumazet a écrit :
> Le jeudi 06 janvier 2011 à 16:01 -0500, Jesse Gross a écrit :
> 
> > Hmm, I thought that it might be some interaction with a corner case in
> > the networking core but now it seems less likely.  There weren't too
> > many vlan changes between the working and non-working states.  Plus,
> > since the rx counter isn't increasing, the packets probably aren't
> > making it anywhere.
> > 
> > I see that tg3 increases the drop counter in one place, which also
> > happens to be checking for vlan errors (at tg3.c:4753).  That seems
> > suspicious - maybe the NIC is only partially configured for vlan
> > offloading.  If we can confirm that is where the drop counter is being
> > incremented and what the error code is maybe it would shed some light.
> > 
> 
> Hmm... I am pretty sure the drop counter is the dev rx_dropped (core
> network handled, not tg3 one) incremented at the end of
> __netif_receive_skb() : We found no suitable handler for packets.
> 
> atomic_long_inc(&skb->dev->rx_dropped);
> 
> But thats a guess, I'll have to check
> 

wrong guess. Its really the tg3 which drops frames

increasing rx_missed_errors  (get_stat64(&hw_stats->rx_discards)

ip -s -s link show dev eth2
5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
master bond0 state UP qlen 1000
    link/ether 00:1e:0b:92:78:50 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    11627      167      0       0       0       2      
    RX errors: length  crc     frame   fifo    missed
               0        0       0       0       2713   
    TX: bytes  packets  errors  dropped carrier collsns 
    2274       31       0       0       0       0      
    TX errors: aborted fifo    window  heartbeat
               0        0       0       0      



It would be nice Broadcom guys could help a bit ?




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  1:20                                             ` Eric Dumazet
@ 2011-01-07  2:29                                               ` Matt Carlson
  2011-01-07  2:41                                                 ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-07  2:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesse Gross, Matthew Carlson, Michael Leun, Michael Chan,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 06, 2011 at 05:20:01PM -0800, Eric Dumazet wrote:
> Le vendredi 07 janvier 2011 ?? 00:34 +0100, Eric Dumazet a ??crit :
> > Le jeudi 06 janvier 2011 ?? 16:01 -0500, Jesse Gross a ??crit :
> > 
> > > Hmm, I thought that it might be some interaction with a corner case in
> > > the networking core but now it seems less likely.  There weren't too
> > > many vlan changes between the working and non-working states.  Plus,
> > > since the rx counter isn't increasing, the packets probably aren't
> > > making it anywhere.
> > > 
> > > I see that tg3 increases the drop counter in one place, which also
> > > happens to be checking for vlan errors (at tg3.c:4753).  That seems
> > > suspicious - maybe the NIC is only partially configured for vlan
> > > offloading.  If we can confirm that is where the drop counter is being
> > > incremented and what the error code is maybe it would shed some light.
> > > 
> > 
> > Hmm... I am pretty sure the drop counter is the dev rx_dropped (core
> > network handled, not tg3 one) incremented at the end of
> > __netif_receive_skb() : We found no suitable handler for packets.
> > 
> > atomic_long_inc(&skb->dev->rx_dropped);
> > 
> > But thats a guess, I'll have to check
> > 
> 
> wrong guess. Its really the tg3 which drops frames
> 
> increasing rx_missed_errors  (get_stat64(&hw_stats->rx_discards)
> 
> ip -s -s link show dev eth2
> 5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
> master bond0 state UP qlen 1000
>     link/ether 00:1e:0b:92:78:50 brd ff:ff:ff:ff:ff:ff
>     RX: bytes  packets  errors  dropped overrun mcast   
>     11627      167      0       0       0       2      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       2713   
>     TX: bytes  packets  errors  dropped carrier collsns 
>     2274       31       0       0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> 
> 
> 
> It would be nice Broadcom guys could help a bit ?

Hi Eric.  Sorry for the delay.  I was under the impression that your
problems were software related and that you just needed a revised
version of these VLAN patches I was sending to Michael.  Is this not
true?

Having a hardware stat increment suggests this is a new problem.
Maybe I missed it, but I didn't see what hardware you are working
with and whether or not management firmware was enabled.  Could you tell
me that info?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  2:29                                               ` Matt Carlson
@ 2011-01-07  2:41                                                 ` Eric Dumazet
  2011-01-07  2:43                                                   ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-07  2:41 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le jeudi 06 janvier 2011 à 18:29 -0800, Matt Carlson a écrit :

> Hi Eric.  Sorry for the delay.  I was under the impression that your
> problems were software related and that you just needed a revised
> version of these VLAN patches I was sending to Michael.  Is this not
> true?
> 
> Having a hardware stat increment suggests this is a new problem.
> Maybe I missed it, but I didn't see what hardware you are working
> with and whether or not management firmware was enabled.  Could you tell
> me that info?
> 

Hi Matt

I started a bisection, because I couldnt sleep tonight anyway :(

14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
Gigabit Ethernet (rev a3)
	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
	Capabilities: [40] PCI-X non-bridge device
	Capabilities: [48] Power Management version 2
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
	Kernel driver in use: tg3
	Kernel modules: tg3




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  2:41                                                 ` Eric Dumazet
@ 2011-01-07  2:43                                                   ` Eric Dumazet
  2011-01-07  2:59                                                     ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-07  2:43 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le vendredi 07 janvier 2011 à 03:41 +0100, Eric Dumazet a écrit :
> Le jeudi 06 janvier 2011 à 18:29 -0800, Matt Carlson a écrit :
> 
> > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > problems were software related and that you just needed a revised
> > version of these VLAN patches I was sending to Michael.  Is this not
> > true?
> > 
> > Having a hardware stat increment suggests this is a new problem.
> > Maybe I missed it, but I didn't see what hardware you are working
> > with and whether or not management firmware was enabled.  Could you tell
> > me that info?
> > 
> 
> Hi Matt
> 
> I started a bisection, because I couldnt sleep tonight anyway :(
> 
> 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> Gigabit Ethernet (rev a3)
> 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> 	Capabilities: [40] PCI-X non-bridge device
> 	Capabilities: [48] Power Management version 2
> 	Capabilities: [50] Vital Product Data
> 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> 	Kernel driver in use: tg3
> 	Kernel modules: tg3
> 
> 

$ ethtool -i eth2
driver: tg3
version: 3.115
firmware-version: 5715s-v3.28
bus-info: 0000:14:04.0
$ dmesg | grep ASF
[    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
ASF[0] TSOcap[1]
[    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
ASF[0] TSOcap[1]




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  2:43                                                   ` Eric Dumazet
@ 2011-01-07  2:59                                                     ` Matt Carlson
  2011-01-07  3:04                                                       ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-07  2:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Matthew Carlson, Jesse Gross, Michael Leun, Michael Chan,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 06, 2011 at 06:43:22PM -0800, Eric Dumazet wrote:
> Le vendredi 07 janvier 2011 ?? 03:41 +0100, Eric Dumazet a ??crit :
> > Le jeudi 06 janvier 2011 ?? 18:29 -0800, Matt Carlson a ??crit :
> > 
> > > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > > problems were software related and that you just needed a revised
> > > version of these VLAN patches I was sending to Michael.  Is this not
> > > true?
> > > 
> > > Having a hardware stat increment suggests this is a new problem.
> > > Maybe I missed it, but I didn't see what hardware you are working
> > > with and whether or not management firmware was enabled.  Could you tell
> > > me that info?
> > > 
> > 
> > Hi Matt
> > 
> > I started a bisection, because I couldnt sleep tonight anyway :(
> > 
> > 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> > Gigabit Ethernet (rev a3)
> > 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> > 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> > 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> > 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> > 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> > 	Capabilities: [40] PCI-X non-bridge device
> > 	Capabilities: [48] Power Management version 2
> > 	Capabilities: [50] Vital Product Data
> > 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > 	Kernel driver in use: tg3
> > 	Kernel modules: tg3
> > 
> > 
> 
> $ ethtool -i eth2
> driver: tg3
> version: 3.115
> firmware-version: 5715s-v3.28
> bus-info: 0000:14:04.0
> $ dmesg | grep ASF
> [    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
> ASF[0] TSOcap[1]
> [    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
> ASF[0] TSOcap[1]

Thanks.  So management firmware is disabled.  This should be
straightforward case.

I'm wondering if I'm misunderstanding something though.  You said earlier
that VLAN tagging doesn't work unless you applied my patch.  Is this no
longer true?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  2:59                                                     ` Matt Carlson
@ 2011-01-07  3:04                                                       ` Eric Dumazet
  2011-01-07  3:41                                                         ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-01-07  3:04 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le jeudi 06 janvier 2011 à 18:59 -0800, Matt Carlson a écrit :
> On Thu, Jan 06, 2011 at 06:43:22PM -0800, Eric Dumazet wrote:
> > Le vendredi 07 janvier 2011 ?? 03:41 +0100, Eric Dumazet a ??crit :
> > > Le jeudi 06 janvier 2011 ?? 18:29 -0800, Matt Carlson a ??crit :
> > > 
> > > > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > > > problems were software related and that you just needed a revised
> > > > version of these VLAN patches I was sending to Michael.  Is this not
> > > > true?
> > > > 
> > > > Having a hardware stat increment suggests this is a new problem.
> > > > Maybe I missed it, but I didn't see what hardware you are working
> > > > with and whether or not management firmware was enabled.  Could you tell
> > > > me that info?
> > > > 
> > > 
> > > Hi Matt
> > > 
> > > I started a bisection, because I couldnt sleep tonight anyway :(
> > > 
> > > 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> > > Gigabit Ethernet (rev a3)
> > > 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> > > 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> > > 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> > > 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> > > 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> > > 	Capabilities: [40] PCI-X non-bridge device
> > > 	Capabilities: [48] Power Management version 2
> > > 	Capabilities: [50] Vital Product Data
> > > 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > > 	Kernel driver in use: tg3
> > > 	Kernel modules: tg3
> > > 
> > > 
> > 
> > $ ethtool -i eth2
> > driver: tg3
> > version: 3.115
> > firmware-version: 5715s-v3.28
> > bus-info: 0000:14:04.0
> > $ dmesg | grep ASF
> > [    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
> > ASF[0] TSOcap[1]
> > [    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
> > ASF[0] TSOcap[1]
> 
> Thanks.  So management firmware is disabled.  This should be
> straightforward case.
> 
> I'm wondering if I'm misunderstanding something though.  You said earlier
> that VLAN tagging doesn't work unless you applied my patch.  Is this no
> longer true?
> 

I dont apply your patch because Jesse said it was not a good patch ;)

Maybe I missed something and it must be applied ? Problem is : current
Linus tree now includes net-next-2.6 and vlan doesnt work. You should
resubmit it perhaps ?




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2010-12-19  3:38                                         ` Jesse Gross
@ 2011-01-07  3:24                                           ` Matt Carlson
  2011-01-07  4:36                                             ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-07  3:24 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Michael Leun, Matthew Carlson, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> <lkml20101129@newton.leun.net> wrote:
> > OK - all tests done on that DL320G5:
> >
> > For completeness, 2.6.37-rc5 unpatched:
> >
> > eth0, no vlan configured: totally broken - see double tagged vlans
> > without tag, single or untagged packets missing at all
> 
> Random behavior?  This one is somewhat hard to explain - maybe there
> are some other factors.  eth0 has ASF on, so it always strips tags.  I
> would expect it to behave like the vlan configured case.
> 
> >
> > eth0, vlan configured: see packets without vlan tag (see double tagged
> > packets with one vlan tag)
> 
> Both ASF and vlan group configured cause tag stripping to be enabled.
> Missing tag.
> 
> >
> > eth1 same as originally reported:
> > without vlan configured see vlan tags (single and double tagged as
> > expected)
> 
> No ASF and no vlan group means tag stripping is disabled.  Have tag.
> 
> > with vlan configured: see packets without vlan tag (see double tagged
> > packets with one vlan tag)
> 
> Configuring vlan group causes stripping to be enabled.  Missing tag.
> 
> >
> >
> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >
> > eth0, no vlan configured: ?see packets without vlan tag (see double
> > tagged packets with one vlan tag)
> 
> ASF enables tag stripping.  Missing tag.
> 
> > eth1, no vlan configured: see vlan tags (single and double tagged as
> > expected)
> 
> No ASF, no vlan group means no stripping.  Have tag.
> 
> >
> >
> > eth0, vlan configured: as without vlan
> 
> ASF enables stripping.  Missing tag.
> 
> > eth1, vlan configured: as without vlan
> 
> With this patch vlan stripping is only enabled when ASF is on, so no
> stripping.  Have tag.
> 
> >
> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >
> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> > packets with one vlan tag)
> 
> With the second patch, vlan stripping is always enabled.  Missing tag.
> 
> > eth1 with vlan: the same
> 
> Stripping still always enabled.  Missing tag.
> 
> The bottom line is whenever vlan stripping is enabled we're missing
> the outer tag.  It might be worth adding some debugging in the area
> before napi_gro_receive/vlan_gro_receive (depending on version).  My
> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> vlan packets on this NIC.
> 
> You said that everything works on the 5752?  Matt, is it possible that
> the 5714 either has a problem with vlan stripping or a different way
> of reporting it?

I don't think this is a 5714 specific issue.  I think the problem is
rooted in the fact that the VLAN tag stripping is enabled.

Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.

The patch here is using __vlan_hwaccel_put_tag(), which informs the
stack a VLAN tag is present.  If this is indeed a reporting problem, I'm
not sure what else the driver should be doing.

> Also, why does ASF require vlan stripping?

This is a firmware limitation.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  3:04                                                       ` Eric Dumazet
@ 2011-01-07  3:41                                                         ` Matt Carlson
  2011-01-07  3:54                                                           ` Eric Dumazet
  2011-01-07  4:38                                                           ` Jesse Gross
  0 siblings, 2 replies; 53+ messages in thread
From: Matt Carlson @ 2011-01-07  3:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Matthew Carlson, Jesse Gross, Michael Leun, Michael Chan,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 06, 2011 at 07:04:46PM -0800, Eric Dumazet wrote:
> Le jeudi 06 janvier 2011 ?? 18:59 -0800, Matt Carlson a ??crit :
> > On Thu, Jan 06, 2011 at 06:43:22PM -0800, Eric Dumazet wrote:
> > > Le vendredi 07 janvier 2011 ?? 03:41 +0100, Eric Dumazet a ??crit :
> > > > Le jeudi 06 janvier 2011 ?? 18:29 -0800, Matt Carlson a ??crit :
> > > > 
> > > > > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > > > > problems were software related and that you just needed a revised
> > > > > version of these VLAN patches I was sending to Michael.  Is this not
> > > > > true?
> > > > > 
> > > > > Having a hardware stat increment suggests this is a new problem.
> > > > > Maybe I missed it, but I didn't see what hardware you are working
> > > > > with and whether or not management firmware was enabled.  Could you tell
> > > > > me that info?
> > > > > 
> > > > 
> > > > Hi Matt
> > > > 
> > > > I started a bisection, because I couldnt sleep tonight anyway :(
> > > > 
> > > > 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> > > > Gigabit Ethernet (rev a3)
> > > > 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> > > > 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> > > > 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> > > > 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> > > > 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> > > > 	Capabilities: [40] PCI-X non-bridge device
> > > > 	Capabilities: [48] Power Management version 2
> > > > 	Capabilities: [50] Vital Product Data
> > > > 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > > > 	Kernel driver in use: tg3
> > > > 	Kernel modules: tg3
> > > > 
> > > > 
> > > 
> > > $ ethtool -i eth2
> > > driver: tg3
> > > version: 3.115
> > > firmware-version: 5715s-v3.28
> > > bus-info: 0000:14:04.0
> > > $ dmesg | grep ASF
> > > [    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
> > > ASF[0] TSOcap[1]
> > > [    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
> > > ASF[0] TSOcap[1]
> > 
> > Thanks.  So management firmware is disabled.  This should be
> > straightforward case.
> > 
> > I'm wondering if I'm misunderstanding something though.  You said earlier
> > that VLAN tagging doesn't work unless you applied my patch.  Is this no
> > longer true?
> > 
> 
> I dont apply your patch because Jesse said it was not a good patch ;)

Oh.

> Maybe I missed something and it must be applied ? Problem is : current
> Linus tree now includes net-next-2.6 and vlan doesnt work. You should
> resubmit it perhaps ?

Yes, something needs to be submitted.  I want to make sure we aren't
chasing the same problem though.  If the patch(es) fix your problem,
then I can concentrate on finalizing the patch.

I can combine my last patch (the one that always enabled VLAN tag
stripping) and the previous patch (that implements all your comments so
far) into one patch, but that still leaves the behavior Michael noted
unaddressed.

Michael, did you ever find out whether or not RXD_FLAG_VLAN was being
set?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  3:41                                                         ` Matt Carlson
@ 2011-01-07  3:54                                                           ` Eric Dumazet
  2011-01-07  4:38                                                           ` Jesse Gross
  1 sibling, 0 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-01-07  3:54 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

Le jeudi 06 janvier 2011 à 19:41 -0800, Matt Carlson a écrit :
> On Thu, Jan 06, 2011 at 07:04:46PM -0800, Eric Dumazet wrote:
> > Le jeudi 06 janvier 2011 ?? 18:59 -0800, Matt Carlson a ??crit :
> > > On Thu, Jan 06, 2011 at 06:43:22PM -0800, Eric Dumazet wrote:
> > > > Le vendredi 07 janvier 2011 ?? 03:41 +0100, Eric Dumazet a ??crit :
> > > > > Le jeudi 06 janvier 2011 ?? 18:29 -0800, Matt Carlson a ??crit :
> > > > > 
> > > > > > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > > > > > problems were software related and that you just needed a revised
> > > > > > version of these VLAN patches I was sending to Michael.  Is this not
> > > > > > true?
> > > > > > 
> > > > > > Having a hardware stat increment suggests this is a new problem.
> > > > > > Maybe I missed it, but I didn't see what hardware you are working
> > > > > > with and whether or not management firmware was enabled.  Could you tell
> > > > > > me that info?
> > > > > > 
> > > > > 
> > > > > Hi Matt
> > > > > 
> > > > > I started a bisection, because I couldnt sleep tonight anyway :(
> > > > > 
> > > > > 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> > > > > Gigabit Ethernet (rev a3)
> > > > > 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> > > > > 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> > > > > 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> > > > > 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> > > > > 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> > > > > 	Capabilities: [40] PCI-X non-bridge device
> > > > > 	Capabilities: [48] Power Management version 2
> > > > > 	Capabilities: [50] Vital Product Data
> > > > > 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > > > > 	Kernel driver in use: tg3
> > > > > 	Kernel modules: tg3
> > > > > 
> > > > > 
> > > > 
> > > > $ ethtool -i eth2
> > > > driver: tg3
> > > > version: 3.115
> > > > firmware-version: 5715s-v3.28
> > > > bus-info: 0000:14:04.0
> > > > $ dmesg | grep ASF
> > > > [    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
> > > > ASF[0] TSOcap[1]
> > > > [    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
> > > > ASF[0] TSOcap[1]
> > > 
> > > Thanks.  So management firmware is disabled.  This should be
> > > straightforward case.
> > > 
> > > I'm wondering if I'm misunderstanding something though.  You said earlier
> > > that VLAN tagging doesn't work unless you applied my patch.  Is this no
> > > longer true?
> > > 
> > 
> > I dont apply your patch because Jesse said it was not a good patch ;)
> 
> Oh.
> 
> > Maybe I missed something and it must be applied ? Problem is : current
> > Linus tree now includes net-next-2.6 and vlan doesnt work. You should
> > resubmit it perhaps ?
> 
> Yes, something needs to be submitted.  I want to make sure we aren't
> chasing the same problem though.  If the patch(es) fix your problem,
> then I can concentrate on finalizing the patch.
> 

I believe it did, I can test your next patch ;)

> I can combine my last patch (the one that always enabled VLAN tag
> stripping) and the previous patch (that implements all your comments so
> far) into one patch, but that still leaves the behavior Michael noted
> unaddressed.
> 
> Michael, did you ever find out whether or not RXD_FLAG_VLAN was being
> set?
> 

Here is the bisect log , just in case :

d2394e6bb1aa636f3bd142cb6f7845a4332514b5 is first bad commit
commit d2394e6bb1aa636f3bd142cb6f7845a4332514b5
Author: Matt Carlson <mcarlson@broadcom.com>
Date:   Wed Nov 24 08:31:47 2010 +0000

    tg3: Always turn on APE features in mac_mode reg
    
    The APE needs certain bits in the mac_mode register to be enabled for
    traffic to flow correctly.  This patch changes the code to always enable
    these bits in the presence of the APE.
    
    Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
    Reviewed-by: Michael Chan <mchan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 086382ce3ecd909222faf53ca23c48d1200dd60c 0a325ac6e7aa87a6737610292ab3025156a4ed80 M	drivers



$ git bisect log
git bisect start
# bad: [3c0cb7c31c206aaedb967e44b98442bbeb17a6c4] Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
git bisect bad 3c0cb7c31c206aaedb967e44b98442bbeb17a6c4
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
# bad: [63e35cd9bd4c8ae085c8b9a70554595b529c4100] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
git bisect bad 63e35cd9bd4c8ae085c8b9a70554595b529c4100
# bad: [67d5288049f46f816181f63eaa8f1371877ad8ea] vxge: update driver version
git bisect bad 67d5288049f46f816181f63eaa8f1371877ad8ea
# good: [b382b191ea9e9ccefc437433d23befe91f4a8925] ipv6: AF_INET6 link address family
git bisect good b382b191ea9e9ccefc437433d23befe91f4a8925
# good: [bedbbb959d2c1d1dbb4c2215f5b7074b1da3030a] ath: Add a driver_info bitmask field
git bisect good bedbbb959d2c1d1dbb4c2215f5b7074b1da3030a
# bad: [cf7afbfeb8ceb0187348d0a1a0db61305e25f05f] rtnl: make link af-specific updates atomic
git bisect bad cf7afbfeb8ceb0187348d0a1a0db61305e25f05f
# good: [22674a24b44ac53f244ef6edadd02021a270df5a] Net: dns_resolver: Makefile: Remove deprecated kbuild goal definitions
git bisect good 22674a24b44ac53f244ef6edadd02021a270df5a
# bad: [cf79003d598b1f82a4caa0564107283b4f560e14] tg3: Fix 5719 internal FIFO overflow problem
git bisect bad cf79003d598b1f82a4caa0564107283b4f560e14
# good: [094f2faaa2c4973e50979158f655a1d31a97ba98] Net: rds: Makefile: Remove deprecated items
git bisect good 094f2faaa2c4973e50979158f655a1d31a97ba98
# good: [04f6d70f6e64900a5d70a5fc199dd9d5fa787738] SELinux: Only return netlink error when we know the return is fatal
git bisect good 04f6d70f6e64900a5d70a5fc199dd9d5fa787738
# good: [5093eedc8bdfd7d906836a44a248f66a99e27d22] tg3: Apply 10Mbps fix to all 57765 revisions
git bisect good 5093eedc8bdfd7d906836a44a248f66a99e27d22
# bad: [d2394e6bb1aa636f3bd142cb6f7845a4332514b5] tg3: Always turn on APE features in mac_mode reg
git bisect bad d2394e6bb1aa636f3bd142cb6f7845a4332514b5
# good: [b75cc0e4c1caac63941d96a73b2214e8007b934b] tg3: Assign correct tx margin for 5719
git bisect good b75cc0e4c1caac63941d96a73b2214e8007b934b



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  3:24                                           ` Matt Carlson
@ 2011-01-07  4:36                                             ` Jesse Gross
  2011-01-13  1:21                                               ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-07  4:36 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
>> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
>> <lkml20101129@newton.leun.net> wrote:
>> > OK - all tests done on that DL320G5:
>> >
>> > For completeness, 2.6.37-rc5 unpatched:
>> >
>> > eth0, no vlan configured: totally broken - see double tagged vlans
>> > without tag, single or untagged packets missing at all
>>
>> Random behavior?  This one is somewhat hard to explain - maybe there
>> are some other factors.  eth0 has ASF on, so it always strips tags.  I
>> would expect it to behave like the vlan configured case.
>>
>> >
>> > eth0, vlan configured: see packets without vlan tag (see double tagged
>> > packets with one vlan tag)
>>
>> Both ASF and vlan group configured cause tag stripping to be enabled.
>> Missing tag.
>>
>> >
>> > eth1 same as originally reported:
>> > without vlan configured see vlan tags (single and double tagged as
>> > expected)
>>
>> No ASF and no vlan group means tag stripping is disabled.  Have tag.
>>
>> > with vlan configured: see packets without vlan tag (see double tagged
>> > packets with one vlan tag)
>>
>> Configuring vlan group causes stripping to be enabled.  Missing tag.
>>
>> >
>> >
>> > 2.6.37-rc5, your tg3 use new vlan-code patch:
>> >
>> > eth0, no vlan configured: ?see packets without vlan tag (see double
>> > tagged packets with one vlan tag)
>>
>> ASF enables tag stripping.  Missing tag.
>>
>> > eth1, no vlan configured: see vlan tags (single and double tagged as
>> > expected)
>>
>> No ASF, no vlan group means no stripping.  Have tag.
>>
>> >
>> >
>> > eth0, vlan configured: as without vlan
>>
>> ASF enables stripping.  Missing tag.
>>
>> > eth1, vlan configured: as without vlan
>>
>> With this patch vlan stripping is only enabled when ASF is on, so no
>> stripping.  Have tag.
>>
>> >
>> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>> >
>> > eth1 no vlan configured: see packets without vlan tag (see double tagged
>> > packets with one vlan tag)
>>
>> With the second patch, vlan stripping is always enabled.  Missing tag.
>>
>> > eth1 with vlan: the same
>>
>> Stripping still always enabled.  Missing tag.
>>
>> The bottom line is whenever vlan stripping is enabled we're missing
>> the outer tag.  It might be worth adding some debugging in the area
>> before napi_gro_receive/vlan_gro_receive (depending on version).  My
>> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
>> vlan packets on this NIC.
>>
>> You said that everything works on the 5752?  Matt, is it possible that
>> the 5714 either has a problem with vlan stripping or a different way
>> of reporting it?
>
> I don't think this is a 5714 specific issue.  I think the problem is
> rooted in the fact that the VLAN tag stripping is enabled.

It's definitely related to vlan stripping being enabled.  Other cards
using tg3 seem to work fine with stripping though, which is why I
thought it might be specific to the 5714.

>
> Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
>
> The patch here is using __vlan_hwaccel_put_tag(), which informs the
> stack a VLAN tag is present.  If this is indeed a reporting problem, I'm
> not sure what else the driver should be doing.

The code to hand off the tag to the stack looks OK to me.  Michael was
seeing this on older versions of the kernel as well with this NIC,
which predates both this patch and the larger vlan changes so it
doesn't seem like a problem with passing the tag to the network stack.
 It's hard to know exactly what is going on though without seeing what
the hardware is reporting.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  3:41                                                         ` Matt Carlson
  2011-01-07  3:54                                                           ` Eric Dumazet
@ 2011-01-07  4:38                                                           ` Jesse Gross
  1 sibling, 0 replies; 53+ messages in thread
From: Jesse Gross @ 2011-01-07  4:38 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Eric Dumazet, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel, netdev

On Thu, Jan 6, 2011 at 10:41 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Thu, Jan 06, 2011 at 07:04:46PM -0800, Eric Dumazet wrote:
>> Le jeudi 06 janvier 2011 ?? 18:59 -0800, Matt Carlson a ??crit :
>> > On Thu, Jan 06, 2011 at 06:43:22PM -0800, Eric Dumazet wrote:
>> > > Le vendredi 07 janvier 2011 ?? 03:41 +0100, Eric Dumazet a ??crit :
>> > > > Le jeudi 06 janvier 2011 ?? 18:29 -0800, Matt Carlson a ??crit :
>> > > >
>> > > > > Hi Eric.  Sorry for the delay.  I was under the impression that your
>> > > > > problems were software related and that you just needed a revised
>> > > > > version of these VLAN patches I was sending to Michael.  Is this not
>> > > > > true?
>> > > > >
>> > > > > Having a hardware stat increment suggests this is a new problem.
>> > > > > Maybe I missed it, but I didn't see what hardware you are working
>> > > > > with and whether or not management firmware was enabled.  Could you tell
>> > > > > me that info?
>> > > > >
>> > > >
>> > > > Hi Matt
>> > > >
>> > > > I started a bisection, because I couldnt sleep tonight anyway :(
>> > > >
>> > > > 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
>> > > > Gigabit Ethernet (rev a3)
>> > > >         Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
>> > > >         Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
>> > > >         Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
>> > > >         Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
>> > > >         [virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
>> > > >         Capabilities: [40] PCI-X non-bridge device
>> > > >         Capabilities: [48] Power Management version 2
>> > > >         Capabilities: [50] Vital Product Data
>> > > >         Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
>> > > >         Kernel driver in use: tg3
>> > > >         Kernel modules: tg3
>> > > >
>> > > >
>> > >
>> > > $ ethtool -i eth2
>> > > driver: tg3
>> > > version: 3.115
>> > > firmware-version: 5715s-v3.28
>> > > bus-info: 0000:14:04.0
>> > > $ dmesg | grep ASF
>> > > [    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
>> > > ASF[0] TSOcap[1]
>> > > [    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
>> > > ASF[0] TSOcap[1]
>> >
>> > Thanks.  So management firmware is disabled.  This should be
>> > straightforward case.
>> >
>> > I'm wondering if I'm misunderstanding something though.  You said earlier
>> > that VLAN tagging doesn't work unless you applied my patch.  Is this no
>> > longer true?
>> >
>>
>> I dont apply your patch because Jesse said it was not a good patch ;)
>
> Oh.
>
>> Maybe I missed something and it must be applied ? Problem is : current
>> Linus tree now includes net-next-2.6 and vlan doesnt work. You should
>> resubmit it perhaps ?
>
> Yes, something needs to be submitted.  I want to make sure we aren't
> chasing the same problem though.  If the patch(es) fix your problem,
> then I can concentrate on finalizing the patch.
>
> I can combine my last patch (the one that always enabled VLAN tag
> stripping) and the previous patch (that implements all your comments so
> far) into one patch, but that still leaves the behavior Michael noted
> unaddressed.

Just to clarify, I think there are three separate things going on here:

* The patch, which independent of the separately reported issues, is
good because it moves tg3 to the new vlan model.  However, I don't
think we should always disable vlan stripping as is done because it is
probably useful in the majority of cases.  Maybe in some situations it
needs to be disabled but those are independent and should affect both
the patched and unpatched versions.
* Eric's issue.  It sounds like the commit that bisect turned up has
some interaction with stripping.  The patch fixes this because it
always disables stripping but that doesn't seem like the right
solution because previous versions worked with stripping enabled.
* Michael's issue.  Not clear what the cause is but disabling
stripping fixes it.  It has different symptoms from Eric's though
(missing tags vs missing packets).  The patch changes behavior a
little bit because it changes when stripping is enabled but doesn't
fix the underlying cause.

So each needs to be tracked down separately.  Unfortunately, the fixed
patch will no longer solve Eric's issue...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-07  4:36                                             ` Jesse Gross
@ 2011-01-13  1:21                                               ` Matt Carlson
  2011-01-13 15:06                                                 ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-13  1:21 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> >> <lkml20101129@newton.leun.net> wrote:
> >> > OK - all tests done on that DL320G5:
> >> >
> >> > For completeness, 2.6.37-rc5 unpatched:
> >> >
> >> > eth0, no vlan configured: totally broken - see double tagged vlans
> >> > without tag, single or untagged packets missing at all
> >>
> >> Random behavior? ?This one is somewhat hard to explain - maybe there
> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
> >> would expect it to behave like the vlan configured case.
> >>
> >> >
> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
> >> > packets with one vlan tag)
> >>
> >> Both ASF and vlan group configured cause tag stripping to be enabled.
> >> Missing tag.
> >>
> >> >
> >> > eth1 same as originally reported:
> >> > without vlan configured see vlan tags (single and double tagged as
> >> > expected)
> >>
> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
> >>
> >> > with vlan configured: see packets without vlan tag (see double tagged
> >> > packets with one vlan tag)
> >>
> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
> >>
> >> >
> >> >
> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >> >
> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
> >> > tagged packets with one vlan tag)
> >>
> >> ASF enables tag stripping. ?Missing tag.
> >>
> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
> >> > expected)
> >>
> >> No ASF, no vlan group means no stripping. ?Have tag.
> >>
> >> >
> >> >
> >> > eth0, vlan configured: as without vlan
> >>
> >> ASF enables stripping. ?Missing tag.
> >>
> >> > eth1, vlan configured: as without vlan
> >>
> >> With this patch vlan stripping is only enabled when ASF is on, so no
> >> stripping. ?Have tag.
> >>
> >> >
> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >> >
> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> >> > packets with one vlan tag)
> >>
> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
> >>
> >> > eth1 with vlan: the same
> >>
> >> Stripping still always enabled. ?Missing tag.
> >>
> >> The bottom line is whenever vlan stripping is enabled we're missing
> >> the outer tag. ?It might be worth adding some debugging in the area
> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> >> vlan packets on this NIC.
> >>
> >> You said that everything works on the 5752? ?Matt, is it possible that
> >> the 5714 either has a problem with vlan stripping or a different way
> >> of reporting it?
> >
> > I don't think this is a 5714 specific issue. ?I think the problem is
> > rooted in the fact that the VLAN tag stripping is enabled.
> 
> It's definitely related to vlan stripping being enabled.  Other cards
> using tg3 seem to work fine with stripping though, which is why I
> thought it might be specific to the 5714.

I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
today.  It does the right thing in both cases (2nd tg3 patch ommited /
applied).  The tag is always visible in the packet stream as seen from
tcpdump.

> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
> >
> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
> > not sure what else the driver should be doing.
> 
> The code to hand off the tag to the stack looks OK to me.  Michael was
> seeing this on older versions of the kernel as well with this NIC,
> which predates both this patch and the larger vlan changes so it
> doesn't seem like a problem with passing the tag to the network stack.
>  It's hard to know exactly what is going on though without seeing what
> the hardware is reporting.

When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
when receiving a packet.  The driver skips the __vlan_hwaccel_put_tag()
call.

When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
__vlan_hwaccel_put_tag() is called to reinject the packet.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-13  1:21                                               ` Matt Carlson
@ 2011-01-13 15:06                                                 ` Jesse Gross
  2011-01-13 20:50                                                   ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-13 15:06 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
>> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
>> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
>> >> <lkml20101129@newton.leun.net> wrote:
>> >> > OK - all tests done on that DL320G5:
>> >> >
>> >> > For completeness, 2.6.37-rc5 unpatched:
>> >> >
>> >> > eth0, no vlan configured: totally broken - see double tagged vlans
>> >> > without tag, single or untagged packets missing at all
>> >>
>> >> Random behavior? ?This one is somewhat hard to explain - maybe there
>> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
>> >> would expect it to behave like the vlan configured case.
>> >>
>> >> >
>> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
>> >> > packets with one vlan tag)
>> >>
>> >> Both ASF and vlan group configured cause tag stripping to be enabled.
>> >> Missing tag.
>> >>
>> >> >
>> >> > eth1 same as originally reported:
>> >> > without vlan configured see vlan tags (single and double tagged as
>> >> > expected)
>> >>
>> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
>> >>
>> >> > with vlan configured: see packets without vlan tag (see double tagged
>> >> > packets with one vlan tag)
>> >>
>> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
>> >>
>> >> >
>> >> >
>> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
>> >> >
>> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
>> >> > tagged packets with one vlan tag)
>> >>
>> >> ASF enables tag stripping. ?Missing tag.
>> >>
>> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
>> >> > expected)
>> >>
>> >> No ASF, no vlan group means no stripping. ?Have tag.
>> >>
>> >> >
>> >> >
>> >> > eth0, vlan configured: as without vlan
>> >>
>> >> ASF enables stripping. ?Missing tag.
>> >>
>> >> > eth1, vlan configured: as without vlan
>> >>
>> >> With this patch vlan stripping is only enabled when ASF is on, so no
>> >> stripping. ?Have tag.
>> >>
>> >> >
>> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>> >> >
>> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
>> >> > packets with one vlan tag)
>> >>
>> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
>> >>
>> >> > eth1 with vlan: the same
>> >>
>> >> Stripping still always enabled. ?Missing tag.
>> >>
>> >> The bottom line is whenever vlan stripping is enabled we're missing
>> >> the outer tag. ?It might be worth adding some debugging in the area
>> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
>> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
>> >> vlan packets on this NIC.
>> >>
>> >> You said that everything works on the 5752? ?Matt, is it possible that
>> >> the 5714 either has a problem with vlan stripping or a different way
>> >> of reporting it?
>> >
>> > I don't think this is a 5714 specific issue. ?I think the problem is
>> > rooted in the fact that the VLAN tag stripping is enabled.
>>
>> It's definitely related to vlan stripping being enabled.  Other cards
>> using tg3 seem to work fine with stripping though, which is why I
>> thought it might be specific to the 5714.
>
> I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
> today.  It does the right thing in both cases (2nd tg3 patch ommited /
> applied).  The tag is always visible in the packet stream as seen from
> tcpdump.
>
>> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
>> >
>> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
>> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
>> > not sure what else the driver should be doing.
>>
>> The code to hand off the tag to the stack looks OK to me.  Michael was
>> seeing this on older versions of the kernel as well with this NIC,
>> which predates both this patch and the larger vlan changes so it
>> doesn't seem like a problem with passing the tag to the network stack.
>>  It's hard to know exactly what is going on though without seeing what
>> the hardware is reporting.
>
> When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
> when receiving a packet.  The driver skips the __vlan_hwaccel_put_tag()
> call.
>
> When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
> __vlan_hwaccel_put_tag() is called to reinject the packet.

OK, thanks for testing it out.  I'm not sure that there's anything
more we can do without hearing from Michael.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-13 15:06                                                 ` Jesse Gross
@ 2011-01-13 20:50                                                   ` Matt Carlson
  2011-01-13 21:58                                                     ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-13 20:50 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> > OK - all tests done on that DL320G5:
> >> >> >
> >> >> > For completeness, 2.6.37-rc5 unpatched:
> >> >> >
> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
> >> >> > without tag, single or untagged packets missing at all
> >> >>
> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
> >> >> would expect it to behave like the vlan configured case.
> >> >>
> >> >> >
> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
> >> >> > packets with one vlan tag)
> >> >>
> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
> >> >> Missing tag.
> >> >>
> >> >> >
> >> >> > eth1 same as originally reported:
> >> >> > without vlan configured see vlan tags (single and double tagged as
> >> >> > expected)
> >> >>
> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
> >> >>
> >> >> > with vlan configured: see packets without vlan tag (see double tagged
> >> >> > packets with one vlan tag)
> >> >>
> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
> >> >>
> >> >> >
> >> >> >
> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >> >> >
> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
> >> >> > tagged packets with one vlan tag)
> >> >>
> >> >> ASF enables tag stripping. ?Missing tag.
> >> >>
> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
> >> >> > expected)
> >> >>
> >> >> No ASF, no vlan group means no stripping. ?Have tag.
> >> >>
> >> >> >
> >> >> >
> >> >> > eth0, vlan configured: as without vlan
> >> >>
> >> >> ASF enables stripping. ?Missing tag.
> >> >>
> >> >> > eth1, vlan configured: as without vlan
> >> >>
> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
> >> >> stripping. ?Have tag.
> >> >>
> >> >> >
> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >> >> >
> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> >> >> > packets with one vlan tag)
> >> >>
> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
> >> >>
> >> >> > eth1 with vlan: the same
> >> >>
> >> >> Stripping still always enabled. ?Missing tag.
> >> >>
> >> >> The bottom line is whenever vlan stripping is enabled we're missing
> >> >> the outer tag. ?It might be worth adding some debugging in the area
> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> >> >> vlan packets on this NIC.
> >> >>
> >> >> You said that everything works on the 5752? ?Matt, is it possible that
> >> >> the 5714 either has a problem with vlan stripping or a different way
> >> >> of reporting it?
> >> >
> >> > I don't think this is a 5714 specific issue. ?I think the problem is
> >> > rooted in the fact that the VLAN tag stripping is enabled.
> >>
> >> It's definitely related to vlan stripping being enabled. ?Other cards
> >> using tg3 seem to work fine with stripping though, which is why I
> >> thought it might be specific to the 5714.
> >
> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
> > applied). ?The tag is always visible in the packet stream as seen from
> > tcpdump.
> >
> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
> >> >
> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
> >> > not sure what else the driver should be doing.
> >>
> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
> >> seeing this on older versions of the kernel as well with this NIC,
> >> which predates both this patch and the larger vlan changes so it
> >> doesn't seem like a problem with passing the tag to the network stack.
> >> ?It's hard to know exactly what is going on though without seeing what
> >> the hardware is reporting.
> >
> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
> > call.
> >
> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
> > __vlan_hwaccel_put_tag() is called to reinject the packet.
> 
> OK, thanks for testing it out.  I'm not sure that there's anything
> more we can do without hearing from Michael.

In the meantime, I think what we have should go upstream.  Just to be
absolutely clear though, your position is that VLAN tags should always
be stripped?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-13 20:50                                                   ` Matt Carlson
@ 2011-01-13 21:58                                                     ` Jesse Gross
  2011-01-14  1:15                                                       ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-13 21:58 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
>> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
>> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
>> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
>> >> >> <lkml20101129@newton.leun.net> wrote:
>> >> >> > OK - all tests done on that DL320G5:
>> >> >> >
>> >> >> > For completeness, 2.6.37-rc5 unpatched:
>> >> >> >
>> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
>> >> >> > without tag, single or untagged packets missing at all
>> >> >>
>> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
>> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
>> >> >> would expect it to behave like the vlan configured case.
>> >> >>
>> >> >> >
>> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
>> >> >> > packets with one vlan tag)
>> >> >>
>> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
>> >> >> Missing tag.
>> >> >>
>> >> >> >
>> >> >> > eth1 same as originally reported:
>> >> >> > without vlan configured see vlan tags (single and double tagged as
>> >> >> > expected)
>> >> >>
>> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
>> >> >>
>> >> >> > with vlan configured: see packets without vlan tag (see double tagged
>> >> >> > packets with one vlan tag)
>> >> >>
>> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
>> >> >> >
>> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
>> >> >> > tagged packets with one vlan tag)
>> >> >>
>> >> >> ASF enables tag stripping. ?Missing tag.
>> >> >>
>> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
>> >> >> > expected)
>> >> >>
>> >> >> No ASF, no vlan group means no stripping. ?Have tag.
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > eth0, vlan configured: as without vlan
>> >> >>
>> >> >> ASF enables stripping. ?Missing tag.
>> >> >>
>> >> >> > eth1, vlan configured: as without vlan
>> >> >>
>> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
>> >> >> stripping. ?Have tag.
>> >> >>
>> >> >> >
>> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>> >> >> >
>> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
>> >> >> > packets with one vlan tag)
>> >> >>
>> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
>> >> >>
>> >> >> > eth1 with vlan: the same
>> >> >>
>> >> >> Stripping still always enabled. ?Missing tag.
>> >> >>
>> >> >> The bottom line is whenever vlan stripping is enabled we're missing
>> >> >> the outer tag. ?It might be worth adding some debugging in the area
>> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
>> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
>> >> >> vlan packets on this NIC.
>> >> >>
>> >> >> You said that everything works on the 5752? ?Matt, is it possible that
>> >> >> the 5714 either has a problem with vlan stripping or a different way
>> >> >> of reporting it?
>> >> >
>> >> > I don't think this is a 5714 specific issue. ?I think the problem is
>> >> > rooted in the fact that the VLAN tag stripping is enabled.
>> >>
>> >> It's definitely related to vlan stripping being enabled. ?Other cards
>> >> using tg3 seem to work fine with stripping though, which is why I
>> >> thought it might be specific to the 5714.
>> >
>> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
>> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
>> > applied). ?The tag is always visible in the packet stream as seen from
>> > tcpdump.
>> >
>> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
>> >> >
>> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
>> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
>> >> > not sure what else the driver should be doing.
>> >>
>> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
>> >> seeing this on older versions of the kernel as well with this NIC,
>> >> which predates both this patch and the larger vlan changes so it
>> >> doesn't seem like a problem with passing the tag to the network stack.
>> >> ?It's hard to know exactly what is going on though without seeing what
>> >> the hardware is reporting.
>> >
>> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
>> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
>> > call.
>> >
>> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
>> > __vlan_hwaccel_put_tag() is called to reinject the packet.
>>
>> OK, thanks for testing it out.  I'm not sure that there's anything
>> more we can do without hearing from Michael.
>
> In the meantime, I think what we have should go upstream.  Just to be
> absolutely clear though, your position is that VLAN tags should always
> be stripped?

That's what the other converted drivers do by default (though most of
them also provide an ethtool set_flags() method to change this).  It's
generally the most efficient and is now safe to do in all cases.  It's
also the consistent with what was happening before, since stripping
was enabled when a vlan device was configured.  So, yes, normally I
think stripping should be enabled.

I assumed that disabling stripping in most situations was just an
oversight.  Was there a reason why you feel it is better not to use
it?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-13 21:58                                                     ` Jesse Gross
@ 2011-01-14  1:15                                                       ` Matt Carlson
  2011-01-14 17:49                                                         ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-14  1:15 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Thu, Jan 13, 2011 at 01:58:40PM -0800, Jesse Gross wrote:
> On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
> >> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
> >> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> >> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> >> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> >> > OK - all tests done on that DL320G5:
> >> >> >> >
> >> >> >> > For completeness, 2.6.37-rc5 unpatched:
> >> >> >> >
> >> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
> >> >> >> > without tag, single or untagged packets missing at all
> >> >> >>
> >> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
> >> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
> >> >> >> would expect it to behave like the vlan configured case.
> >> >> >>
> >> >> >> >
> >> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
> >> >> >> > packets with one vlan tag)
> >> >> >>
> >> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
> >> >> >> Missing tag.
> >> >> >>
> >> >> >> >
> >> >> >> > eth1 same as originally reported:
> >> >> >> > without vlan configured see vlan tags (single and double tagged as
> >> >> >> > expected)
> >> >> >>
> >> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
> >> >> >>
> >> >> >> > with vlan configured: see packets without vlan tag (see double tagged
> >> >> >> > packets with one vlan tag)
> >> >> >>
> >> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
> >> >> >>
> >> >> >> >
> >> >> >> >
> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >> >> >> >
> >> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
> >> >> >> > tagged packets with one vlan tag)
> >> >> >>
> >> >> >> ASF enables tag stripping. ?Missing tag.
> >> >> >>
> >> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
> >> >> >> > expected)
> >> >> >>
> >> >> >> No ASF, no vlan group means no stripping. ?Have tag.
> >> >> >>
> >> >> >> >
> >> >> >> >
> >> >> >> > eth0, vlan configured: as without vlan
> >> >> >>
> >> >> >> ASF enables stripping. ?Missing tag.
> >> >> >>
> >> >> >> > eth1, vlan configured: as without vlan
> >> >> >>
> >> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
> >> >> >> stripping. ?Have tag.
> >> >> >>
> >> >> >> >
> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >> >> >> >
> >> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> >> >> >> > packets with one vlan tag)
> >> >> >>
> >> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
> >> >> >>
> >> >> >> > eth1 with vlan: the same
> >> >> >>
> >> >> >> Stripping still always enabled. ?Missing tag.
> >> >> >>
> >> >> >> The bottom line is whenever vlan stripping is enabled we're missing
> >> >> >> the outer tag. ?It might be worth adding some debugging in the area
> >> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
> >> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> >> >> >> vlan packets on this NIC.
> >> >> >>
> >> >> >> You said that everything works on the 5752? ?Matt, is it possible that
> >> >> >> the 5714 either has a problem with vlan stripping or a different way
> >> >> >> of reporting it?
> >> >> >
> >> >> > I don't think this is a 5714 specific issue. ?I think the problem is
> >> >> > rooted in the fact that the VLAN tag stripping is enabled.
> >> >>
> >> >> It's definitely related to vlan stripping being enabled. ?Other cards
> >> >> using tg3 seem to work fine with stripping though, which is why I
> >> >> thought it might be specific to the 5714.
> >> >
> >> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
> >> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
> >> > applied). ?The tag is always visible in the packet stream as seen from
> >> > tcpdump.
> >> >
> >> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
> >> >> >
> >> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
> >> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
> >> >> > not sure what else the driver should be doing.
> >> >>
> >> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
> >> >> seeing this on older versions of the kernel as well with this NIC,
> >> >> which predates both this patch and the larger vlan changes so it
> >> >> doesn't seem like a problem with passing the tag to the network stack.
> >> >> ?It's hard to know exactly what is going on though without seeing what
> >> >> the hardware is reporting.
> >> >
> >> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
> >> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
> >> > call.
> >> >
> >> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
> >> > __vlan_hwaccel_put_tag() is called to reinject the packet.
> >>
> >> OK, thanks for testing it out. ?I'm not sure that there's anything
> >> more we can do without hearing from Michael.
> >
> > In the meantime, I think what we have should go upstream. ?Just to be
> > absolutely clear though, your position is that VLAN tags should always
> > be stripped?
> 
> That's what the other converted drivers do by default (though most of
> them also provide an ethtool set_flags() method to change this).  It's
> generally the most efficient and is now safe to do in all cases.  It's
> also the consistent with what was happening before, since stripping
> was enabled when a vlan device was configured.  So, yes, normally I
> think stripping should be enabled.
> 
> I assumed that disabling stripping in most situations was just an
> oversight.  Was there a reason why you feel it is better not to use
> it?

Actually, the tg3 driver was trying to disable VLAN tag stripping
when possible.  I believe this was primarily to support the raw packet
interface.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-14  1:15                                                       ` Matt Carlson
@ 2011-01-14 17:49                                                         ` Jesse Gross
  2011-01-14 18:38                                                           ` Matt Carlson
  0 siblings, 1 reply; 53+ messages in thread
From: Jesse Gross @ 2011-01-14 17:49 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Thu, Jan 13, 2011 at 8:15 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Thu, Jan 13, 2011 at 01:58:40PM -0800, Jesse Gross wrote:
>> On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> > On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
>> >> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
>> >> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
>> >> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
>> >> >> >> <lkml20101129@newton.leun.net> wrote:
>> >> >> >> > OK - all tests done on that DL320G5:
>> >> >> >> >
>> >> >> >> > For completeness, 2.6.37-rc5 unpatched:
>> >> >> >> >
>> >> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
>> >> >> >> > without tag, single or untagged packets missing at all
>> >> >> >>
>> >> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
>> >> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
>> >> >> >> would expect it to behave like the vlan configured case.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
>> >> >> >> Missing tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > eth1 same as originally reported:
>> >> >> >> > without vlan configured see vlan tags (single and double tagged as
>> >> >> >> > expected)
>> >> >> >>
>> >> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
>> >> >> >>
>> >> >> >> > with vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
>> >> >> >> >
>> >> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
>> >> >> >> > tagged packets with one vlan tag)
>> >> >> >>
>> >> >> >> ASF enables tag stripping. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
>> >> >> >> > expected)
>> >> >> >>
>> >> >> >> No ASF, no vlan group means no stripping. ?Have tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > eth0, vlan configured: as without vlan
>> >> >> >>
>> >> >> >> ASF enables stripping. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1, vlan configured: as without vlan
>> >> >> >>
>> >> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
>> >> >> >> stripping. ?Have tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>> >> >> >> >
>> >> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1 with vlan: the same
>> >> >> >>
>> >> >> >> Stripping still always enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> The bottom line is whenever vlan stripping is enabled we're missing
>> >> >> >> the outer tag. ?It might be worth adding some debugging in the area
>> >> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
>> >> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
>> >> >> >> vlan packets on this NIC.
>> >> >> >>
>> >> >> >> You said that everything works on the 5752? ?Matt, is it possible that
>> >> >> >> the 5714 either has a problem with vlan stripping or a different way
>> >> >> >> of reporting it?
>> >> >> >
>> >> >> > I don't think this is a 5714 specific issue. ?I think the problem is
>> >> >> > rooted in the fact that the VLAN tag stripping is enabled.
>> >> >>
>> >> >> It's definitely related to vlan stripping being enabled. ?Other cards
>> >> >> using tg3 seem to work fine with stripping though, which is why I
>> >> >> thought it might be specific to the 5714.
>> >> >
>> >> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
>> >> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
>> >> > applied). ?The tag is always visible in the packet stream as seen from
>> >> > tcpdump.
>> >> >
>> >> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
>> >> >> >
>> >> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
>> >> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
>> >> >> > not sure what else the driver should be doing.
>> >> >>
>> >> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
>> >> >> seeing this on older versions of the kernel as well with this NIC,
>> >> >> which predates both this patch and the larger vlan changes so it
>> >> >> doesn't seem like a problem with passing the tag to the network stack.
>> >> >> ?It's hard to know exactly what is going on though without seeing what
>> >> >> the hardware is reporting.
>> >> >
>> >> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
>> >> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
>> >> > call.
>> >> >
>> >> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
>> >> > __vlan_hwaccel_put_tag() is called to reinject the packet.
>> >>
>> >> OK, thanks for testing it out. ?I'm not sure that there's anything
>> >> more we can do without hearing from Michael.
>> >
>> > In the meantime, I think what we have should go upstream. ?Just to be
>> > absolutely clear though, your position is that VLAN tags should always
>> > be stripped?
>>
>> That's what the other converted drivers do by default (though most of
>> them also provide an ethtool set_flags() method to change this).  It's
>> generally the most efficient and is now safe to do in all cases.  It's
>> also the consistent with what was happening before, since stripping
>> was enabled when a vlan device was configured.  So, yes, normally I
>> think stripping should be enabled.
>>
>> I assumed that disabling stripping in most situations was just an
>> oversight.  Was there a reason why you feel it is better not to use
>> it?
>
> Actually, the tg3 driver was trying to disable VLAN tag stripping
> when possible.  I believe this was primarily to support the raw packet
> interface.

Hmm, is this really true?

        if (!tp->vlgrp &&
            !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
                rx_mode |= RX_MODE_KEEP_VLAN_TAG;

So if a vlan device is registered or ASF is enabled we will use
stripping.  That seems like it is using stripping in the common case
when vlans are used.

Before 2.6.37 it was only possible to deliver stripped tags if a vlan
group was configured.  This means that if ASF was enabled and forced
stripping but no group was configured we would lose the tag.  In this
situation tg3 manually reinserts the tags so raw packet capture will
see them, as you say.  However, in the current tree this limitation no
longer exists and it is always possible to deliver stripped tags to
the networking core, which should do the right thing in all
situations.

So I believe the only reason to disable tag stripping is for debugging
or some other special purpose situation.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-14 17:49                                                         ` Jesse Gross
@ 2011-01-14 18:38                                                           ` Matt Carlson
  2011-01-19 16:15                                                             ` Jesse Gross
  0 siblings, 1 reply; 53+ messages in thread
From: Matt Carlson @ 2011-01-14 18:38 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel, netdev

On Fri, Jan 14, 2011 at 09:49:47AM -0800, Jesse Gross wrote:
> On Thu, Jan 13, 2011 at 8:15 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Thu, Jan 13, 2011 at 01:58:40PM -0800, Jesse Gross wrote:
> >> On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> > On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
> >> >> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> >> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
> >> >> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> >> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> >> >> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> >> >> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> >> >> > OK - all tests done on that DL320G5:
> >> >> >> >> >
> >> >> >> >> > For completeness, 2.6.37-rc5 unpatched:
> >> >> >> >> >
> >> >> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
> >> >> >> >> > without tag, single or untagged packets missing at all
> >> >> >> >>
> >> >> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
> >> >> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
> >> >> >> >> would expect it to behave like the vlan configured case.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
> >> >> >> >> Missing tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > eth1 same as originally reported:
> >> >> >> >> > without vlan configured see vlan tags (single and double tagged as
> >> >> >> >> > expected)
> >> >> >> >>
> >> >> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
> >> >> >> >>
> >> >> >> >> > with vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >> >> >> >> >
> >> >> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
> >> >> >> >> > tagged packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> ASF enables tag stripping. ?Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
> >> >> >> >> > expected)
> >> >> >> >>
> >> >> >> >> No ASF, no vlan group means no stripping. ?Have tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > eth0, vlan configured: as without vlan
> >> >> >> >>
> >> >> >> >> ASF enables stripping. ?Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1, vlan configured: as without vlan
> >> >> >> >>
> >> >> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
> >> >> >> >> stripping. ?Have tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >> >> >> >> >
> >> >> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1 with vlan: the same
> >> >> >> >>
> >> >> >> >> Stripping still always enabled. ?Missing tag.
> >> >> >> >>
> >> >> >> >> The bottom line is whenever vlan stripping is enabled we're missing
> >> >> >> >> the outer tag. ?It might be worth adding some debugging in the area
> >> >> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
> >> >> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> >> >> >> >> vlan packets on this NIC.
> >> >> >> >>
> >> >> >> >> You said that everything works on the 5752? ?Matt, is it possible that
> >> >> >> >> the 5714 either has a problem with vlan stripping or a different way
> >> >> >> >> of reporting it?
> >> >> >> >
> >> >> >> > I don't think this is a 5714 specific issue. ?I think the problem is
> >> >> >> > rooted in the fact that the VLAN tag stripping is enabled.
> >> >> >>
> >> >> >> It's definitely related to vlan stripping being enabled. ?Other cards
> >> >> >> using tg3 seem to work fine with stripping though, which is why I
> >> >> >> thought it might be specific to the 5714.
> >> >> >
> >> >> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
> >> >> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
> >> >> > applied). ?The tag is always visible in the packet stream as seen from
> >> >> > tcpdump.
> >> >> >
> >> >> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
> >> >> >> >
> >> >> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
> >> >> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
> >> >> >> > not sure what else the driver should be doing.
> >> >> >>
> >> >> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
> >> >> >> seeing this on older versions of the kernel as well with this NIC,
> >> >> >> which predates both this patch and the larger vlan changes so it
> >> >> >> doesn't seem like a problem with passing the tag to the network stack.
> >> >> >> ?It's hard to know exactly what is going on though without seeing what
> >> >> >> the hardware is reporting.
> >> >> >
> >> >> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
> >> >> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
> >> >> > call.
> >> >> >
> >> >> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
> >> >> > __vlan_hwaccel_put_tag() is called to reinject the packet.
> >> >>
> >> >> OK, thanks for testing it out. ?I'm not sure that there's anything
> >> >> more we can do without hearing from Michael.
> >> >
> >> > In the meantime, I think what we have should go upstream. ?Just to be
> >> > absolutely clear though, your position is that VLAN tags should always
> >> > be stripped?
> >>
> >> That's what the other converted drivers do by default (though most of
> >> them also provide an ethtool set_flags() method to change this). ?It's
> >> generally the most efficient and is now safe to do in all cases. ?It's
> >> also the consistent with what was happening before, since stripping
> >> was enabled when a vlan device was configured. ?So, yes, normally I
> >> think stripping should be enabled.
> >>
> >> I assumed that disabling stripping in most situations was just an
> >> oversight. ?Was there a reason why you feel it is better not to use
> >> it?
> >
> > Actually, the tg3 driver was trying to disable VLAN tag stripping
> > when possible. ?I believe this was primarily to support the raw packet
> > interface.
> 
> Hmm, is this really true?
> 
>         if (!tp->vlgrp &&
>             !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
>                 rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> 
> So if a vlan device is registered or ASF is enabled we will use
> stripping.  That seems like it is using stripping in the common case
> when vlans are used.

Right.

> Before 2.6.37 it was only possible to deliver stripped tags if a vlan
> group was configured.  This means that if ASF was enabled and forced
> stripping but no group was configured we would lose the tag.  In this
> situation tg3 manually reinserts the tags so raw packet capture will
> see them, as you say.

Right.  VLAN tagged frames can still arrive if CONFIG_VLAN_8021Q or
CONFIG_VLAN_8021Q_MODULE is not set.  That's why the driver was trying
to keep them inline.  To eliminate the unnecessary overhead of
reinjecting the VLAN tag.

> However, in the current tree this limitation no
> longer exists and it is always possible to deliver stripped tags to
> the networking core, which should do the right thing in all
> situations.

Yes.  Even though the stack is capable of reinjecting the VLAN tag,
doesn't it make sense to avoid that if we knew they would never be
needed out-of-packet?

> So I believe the only reason to disable tag stripping is for debugging
> or some other special purpose situation.

Nope.  I think we covered it all.  Thanks for the info.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
  2011-01-14 18:38                                                           ` Matt Carlson
@ 2011-01-19 16:15                                                             ` Jesse Gross
  0 siblings, 0 replies; 53+ messages in thread
From: Jesse Gross @ 2011-01-19 16:15 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel, netdev

On Fri, Jan 14, 2011 at 10:38 AM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> > In the meantime, I think what we have should go upstream. ?Just to be
>> >> > absolutely clear though, your position is that VLAN tags should always
>> >> > be stripped?
>> >>
>> >> That's what the other converted drivers do by default (though most of
>> >> them also provide an ethtool set_flags() method to change this). ?It's
>> >> generally the most efficient and is now safe to do in all cases. ?It's
>> >> also the consistent with what was happening before, since stripping
>> >> was enabled when a vlan device was configured. ?So, yes, normally I
>> >> think stripping should be enabled.
>> >>
>> >> I assumed that disabling stripping in most situations was just an
>> >> oversight. ?Was there a reason why you feel it is better not to use
>> >> it?
>> >
>> > Actually, the tg3 driver was trying to disable VLAN tag stripping
>> > when possible. ?I believe this was primarily to support the raw packet
>> > interface.
>>
>> Hmm, is this really true?
>>
>>         if (!tp->vlgrp &&
>>             !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
>>                 rx_mode |= RX_MODE_KEEP_VLAN_TAG;
>>
>> So if a vlan device is registered or ASF is enabled we will use
>> stripping.  That seems like it is using stripping in the common case
>> when vlans are used.
>
> Right.
>
>> Before 2.6.37 it was only possible to deliver stripped tags if a vlan
>> group was configured.  This means that if ASF was enabled and forced
>> stripping but no group was configured we would lose the tag.  In this
>> situation tg3 manually reinserts the tags so raw packet capture will
>> see them, as you say.
>
> Right.  VLAN tagged frames can still arrive if CONFIG_VLAN_8021Q or
> CONFIG_VLAN_8021Q_MODULE is not set.  That's why the driver was trying
> to keep them inline.  To eliminate the unnecessary overhead of
> reinjecting the VLAN tag.

OK, I understand now why you are saying that the driver was trying to
keep stripping disabled.  I was thinking about having vlan groups
configured as the common case for receiving vlans and therefore the
case to optimize.

>
>> However, in the current tree this limitation no
>> longer exists and it is always possible to deliver stripped tags to
>> the networking core, which should do the right thing in all
>> situations.
>
> Yes.  Even though the stack is capable of reinjecting the VLAN tag,
> doesn't it make sense to avoid that if we knew they would never be
> needed out-of-packet?

Maybe in theory but a relatively small optimization to the raw packet
interface doesn't seem terribly important to me.  After all of the
drivers have been converted to the new vlan interfaces I'll be
removing the ndo_vlan_rx_register() function from net_device_ops, so
the driver won't actually know how the vlan tag will be used.  The
impetus for this is that there were quite a few bugs and inconsistent
behavior in drivers around stripping logic.  Therefore, by moving all
of this into the networking core we can reduce code and get consistent
behavior, which is more important than a corner case optimization.

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2011-01-19 16:15 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-29 19:17 2.6.35 -> 2.6.36 panic when vlan and promisc with tg3 Michael Leun
2010-11-30  0:19 ` Ben Greear
2010-11-30  3:10   ` Jesse Gross
2010-11-30  3:26     ` David Miller
2010-11-30  8:59   ` Michael Leun
2010-11-30  9:20     ` Eric Dumazet
2010-11-30 22:27       ` Jesse Gross
2010-12-01 10:17       ` Michael Leun
2010-12-01 10:55         ` [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used Eric Dumazet
2010-12-05  2:07           ` Michael Leun
2010-12-05  8:03             ` Eric Dumazet
2010-12-05  9:55               ` Michael Leun
     [not found]                 ` <20101205114404.7c0cddc2@xenia.leun.net>
     [not found]                   ` <AANLkTikrDTCDxsyOG4m0XcrOY=3pTRwWqnPGsio9cBFj@mail.gmail.com>
2010-12-06 19:34                     ` Michael Leun
2010-12-06 20:04                       ` Jesse Gross
2010-12-06 21:27                         ` Michael Leun
2010-12-13  0:11                           ` Jesse Gross
2010-12-13 22:45                             ` Matt Carlson
2010-12-14  4:07                               ` Jesse Gross
2010-12-14 19:15                                 ` Matt Carlson
2010-12-14 21:46                                   ` Jesse Gross
2010-12-15  0:24                                   ` Michael Leun
2010-12-15  1:34                                     ` Matt Carlson
2010-12-15  7:16                                       ` Michael Leun
2010-12-19  3:38                                         ` Jesse Gross
2011-01-07  3:24                                           ` Matt Carlson
2011-01-07  4:36                                             ` Jesse Gross
2011-01-13  1:21                                               ` Matt Carlson
2011-01-13 15:06                                                 ` Jesse Gross
2011-01-13 20:50                                                   ` Matt Carlson
2011-01-13 21:58                                                     ` Jesse Gross
2011-01-14  1:15                                                       ` Matt Carlson
2011-01-14 17:49                                                         ` Jesse Gross
2011-01-14 18:38                                                           ` Matt Carlson
2011-01-19 16:15                                                             ` Jesse Gross
2011-01-01 17:03                                   ` Eric Dumazet
2011-01-02  0:27                                     ` Jesse Gross
2011-01-02 16:05                                       ` Eric Dumazet
2011-01-06 21:01                                         ` Jesse Gross
2011-01-06 23:34                                           ` Eric Dumazet
2011-01-07  1:20                                             ` Eric Dumazet
2011-01-07  2:29                                               ` Matt Carlson
2011-01-07  2:41                                                 ` Eric Dumazet
2011-01-07  2:43                                                   ` Eric Dumazet
2011-01-07  2:59                                                     ` Matt Carlson
2011-01-07  3:04                                                       ` Eric Dumazet
2011-01-07  3:41                                                         ` Matt Carlson
2011-01-07  3:54                                                           ` Eric Dumazet
2011-01-07  4:38                                                           ` Jesse Gross
2010-12-08 16:47           ` David Miller
2010-12-08 23:06             ` [stable] " Greg KH
2010-12-08 23:16             ` Greg KH
2010-12-09  1:25               ` Eric Dumazet
2010-12-09 20:13                 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).